Modelling of voice quality correlates

Voice Source Analysis

A number of models, with varying numbers of parameters, are used to model glottal flow. They can be roughly classified into the following categories: parametric non-interactive glottal flow models (which assume linear independence between the glottal source and the vocal tract), interactive parametric and mechanical models (based to a varying degree on the physics of the voice source, mainly the vocal folds), and three-dimensional physiological and numerical glottal models (based on the physiological properties of the glottis). The most commonly used models are KLGLOTT88 [10], a four-parameter model; R++ [11], a five-parameter model; and the LF model [12], also a five-parameter model. In this work the Liljencrants-Fant (LF) model is used to model the voice source: it is more 'powerful' than the KLGLOTT88 model, because its am parameter governs the symmetry of the open phase, and it is more thoroughly researched than the R++ model. It has been shown that this model can adequately represent a wide range of natural variations.

The LF glottal flow model (LF-GFM), see eqs. 1 and 2, is a function of time that takes only positive or zero values and is periodic with period T0. Over one fundamental period, the glottal flow is bell-shaped. The maximum value of the glottal flow is Av. Throughout the closed phase of the cycle no air flows through the glottis, and hence both the glottal flow and its derivative are zero. The glottal flow rises during the abduction part of the open phase and decreases during the adduction. The open quotient, OQ, is the duration of the open phase as a proportion of the pitch period, and ranges from 0 to 1. It also defines the instant of glottal closure as Te = OQ·T0. The proportion of the opening phase, relative to the open phase, is designated by the asymmetry coefficient am. The speed quotient, on the other hand, expresses the ratio of the opening phase to the closing phase. The glottal opening phase is inherently longer than the glottal closing phase, and thus a restriction is required on the range of values am can take. It has been shown that suitable range boundaries are 0.5 ≤ am ≤ 0.9, although in natural speech am is most likely to lie between 0.6 and 0.7. The fifth parameter, the spectral tilt or return phase coefficient, affects the abruptness of closure.
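The timing parameters described above can be turned into instants within one cycle. The sketch below is illustrative only: the function name lf_timing and the symbol Tp (the instant of maximum flow) are assumptions introduced here, and it assumes that the closure instant is Te = OQ·T0 and that am is the ratio Tp/Te.

```python
def lf_timing(T0, OQ, am):
    """Map (T0, OQ, am) to timing instants of one glottal cycle.

    Hypothetical helper: assumes Te = OQ * T0 and am = Tp / Te.
    """
    if not (0.0 < OQ <= 1.0):
        raise ValueError("OQ must lie in (0, 1]")
    if not (0.5 <= am <= 0.9):
        raise ValueError("am is restricted to the range [0.5, 0.9]")
    Te = OQ * T0   # glottal closure instant
    Tp = am * Te   # end of the opening phase (instant of maximum flow)
    return {
        "Te": Te,
        "Tp": Tp,
        "opening": Tp,             # duration of the opening phase
        "closing": Te - Tp,        # duration of the closing phase
        "after_closure": T0 - Te,  # remainder of the period after Te
    }


if __name__ == "__main__":
    # 100 Hz voice with illustrative parameter values
    print(lf_timing(T0=0.01, OQ=0.6, am=0.65))
```

With am above 0.5 the opening duration always exceeds the closing duration, which is the restriction on am discussed above.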


The Ta parameter is the effective duration of the return phase. It is the interval between Te and the instant at which the tangent to the second (return-phase) section of the glottal pulse derivative, drawn at Te, reaches zero.
Its value should ensure that the GFM is a continuous function of time, and differentiable except in some cases at the glottal closure instant: an abrupt closure causes a discontinuity in the glottal flow derivative at Te. Ta is inversely proportional to the frequency at which the spectrum of dU(t)/dt gains an extra 6 dB/oct of negative slope. As such, it has a large effect on the spectral tilt of the glottal source and on the perceived voice quality. In creaky voice the glottal closure is much more abrupt than in modal voice. Hence Ta is smaller, giving rise to more energy in the higher-frequency region and a smaller spectral tilt. Note the following parameter relationships
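As a worked illustration of the inverse relationship between Ta and the spectral tilt, the sketch below assumes the relation Fa = 1/(2π·Ta) commonly quoted for the LF model, where Fa is the frequency above which the source spectrum gains the extra 6 dB/oct of slope. The function name and the Ta values are assumptions chosen for illustration, not measurements.

```python
import math


def return_phase_cutoff_hz(Ta):
    """Cutoff frequency Fa (Hz) for a return phase of duration Ta (s).

    Assumes Fa = 1 / (2 * pi * Ta).
    """
    if Ta <= 0.0:
        raise ValueError("Ta must be positive")
    return 1.0 / (2.0 * math.pi * Ta)


if __name__ == "__main__":
    # Illustrative values only: a shorter Ta (more abrupt closure)
    # moves the cutoff upwards, leaving more high-frequency energy.
    for Ta in (0.0006, 0.0003, 0.0001):
        print(f"Ta = {Ta * 1e3:.1f} ms -> Fa = {return_phase_cutoff_hz(Ta):.0f} Hz")
```

Under this assumption, Ta = 0.3 ms gives a cutoff of about 530 Hz, and halving Ta doubles the cutoff frequency.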


The following constraints are placed on the LF model in order to ensure continuity of the model and to reflect the fact that no air flows while the glottis is closed.
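A minimal sketch of how such constraints can be solved numerically is given below. It assumes the standard published form of the LF derivative, E(t) = E0·exp(alpha·t)·sin(wg·t) for 0 ≤ t ≤ Te and E(t) = -(Ee/(eps·Ta))·[exp(-eps·(t - Te)) - exp(-eps·(Tc - Te))] for Te < t ≤ Tc, with wg = π/Tp. The symbols Tp, Tc, Ee, E0, eps and wg, the function names, and the choice of fixed-point iteration and bisection are all assumptions made for this illustration. Continuity at Te fixes eps and E0, and alpha is adjusted so that the derivative integrates to zero over the period, that is, the flow returns to zero at closure.

```python
import math


def solve_epsilon(Ta, Te, Tc, iterations=100):
    """Continuity at Te: eps * Ta = 1 - exp(-eps * (Tc - Te))."""
    eps = 1.0 / Ta  # value reached when the exponential term is negligible
    for _ in range(iterations):
        eps = (1.0 - math.exp(-eps * (Tc - Te))) / Ta
    return eps


def net_flow(alpha, Tp, Te, Ta, Tc, Ee, eps):
    """Integral of the LF derivative over one period (target: zero)."""
    wg = math.pi / Tp
    s, c = math.sin(wg * Te), math.cos(wg * Te)
    # Open phase, with E0 eliminated using E(Te) = -Ee.
    open_area = -Ee * (alpha * s - wg * c + wg * math.exp(-alpha * Te)) / (
        (alpha ** 2 + wg ** 2) * s)
    d = Tc - Te
    decay = math.exp(-eps * d)
    return_area = -(Ee / (eps * Ta)) * ((1.0 - decay) / eps - d * decay)
    return open_area + return_area


def solve_alpha(Tp, Te, Ta, Tc, Ee, eps, iterations=200):
    """Bisection on alpha; assumes net flow > 0 at lo and < 0 at hi."""
    lo, hi = -100.0 / Te, 100.0 / Te
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if net_flow(mid, Tp, Te, Ta, Tc, Ee, eps) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def make_lf_pulse(Tp, Te, Ta, Tc, Ee=1.0):
    """Return a function giving the LF flow derivative at time t."""
    if not (Tp < Te < 2.0 * Tp and 0.0 < Ta < Tc - Te):
        raise ValueError("need Tp < Te < 2*Tp and 0 < Ta < Tc - Te")
    eps = solve_epsilon(Ta, Te, Tc)
    alpha = solve_alpha(Tp, Te, Ta, Tc, Ee, eps)
    wg = math.pi / Tp
    E0 = -Ee / (math.exp(alpha * Te) * math.sin(wg * Te))

    def pulse(t):
        if 0.0 <= t <= Te:   # open phase
            return E0 * math.exp(alpha * t) * math.sin(wg * t)
        if Te < t <= Tc:     # return phase
            return -(Ee / (eps * Ta)) * (
                math.exp(-eps * (t - Te)) - math.exp(-eps * (Tc - Te)))
        return 0.0           # closed phase

    return pulse


if __name__ == "__main__":
    pulse = make_lf_pulse(Tp=0.004, Te=0.006, Ta=0.0003, Tc=0.01)
    print(pulse(0.004), pulse(0.006))
```

Here Ee is the magnitude of the negative peak of the flow derivative at Te, not the peak flow Av; relating the two would require integrating the open phase up to Tp.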

Figure 7 shows the software used to generate the glottal flow pulse and its derivative based on the LF model. The graph in the top left corner, entitled "glottal model", displays one pitch period of the glottal flow pulse (red line) and of the glottal pulse derivative (blue line). The graph opposite shows the frequency response of the glottal flow derivative as a logarithmic distribution of power across the spectrum. The lower graph displays a series of pulses, to which pink or red noise can be added. The alpha parameter is adjusted automatically to satisfy the condition of zero glottal airflow during the closed phase. The software is used to analyse the behaviour of the glottal airflow in the time and frequency domains.

Figure 7, LF - Glottal flow modelling software

Any comments, questions or suggestions are very welcome and should be directed to emir.turajlic@brunel.ac.uk.