Speech transmission index

Speech Transmission Index (STI) is a measure of speech transmission quality. The absolute measurement of speech intelligibility is a complex science. The STI measures some physical characteristics of a transmission channel (a room, electro-acoustic equipment, a telephone line, etc.) and expresses the ability of the channel to carry across the characteristics of a speech signal. STI is a well-established objective predictor of how the characteristics of the transmission channel affect speech intelligibility.

The influence[1] that a transmission channel has on speech intelligibility is dependent on:

History

The STI was introduced by Tammo Houtgast and Herman Steeneken in 1971,[2] and was accepted by the Acoustical Society of America in 1980.[3] Steeneken and Houtgast decided to develop the Speech Transmission Index because they had been tasked with carrying out a very lengthy series of dull speech intelligibility measurements for the Netherlands Armed Forces. Instead, they spent the time developing a much quicker objective method (which was actually the predecessor to the STI).[4]

Houtgast and Steeneken developed the Speech Transmission Index while working at The Netherlands Organisation of Applied Scientific Research TNO. Their team at TNO kept supporting and developing the STI, improving the model and developing hardware and software for measuring the STI, until 2010. In that year, the TNO research group responsible for the STI spun out of TNO and continued its work as a privately owned company named Embedded Acoustics. Embedded Acoustics now continues to support development of the STI, with Herman Steeneken (now formally retired from TNO) still acting as a senior consultant.

In the early years (until approximately 1985), the use of the STI was largely limited to a relatively small international community of speech researchers. The introduction of RASTI ("Room Acoustics STI") made the STI method available to a larger population of engineers and consultants, especially when Brüel & Kjær introduced their RASTI measuring device (which was based on the earlier RASTI system developed by Steeneken and Houtgast at TNO). RASTI was designed to be much faster than the original ("full") STI, taking less than 30 seconds instead of 15 minutes per measuring point. However, RASTI was only intended (as the name says) for pure room acoustics, not electro-acoustics. Application of RASTI to transmission chains featuring electro-acoustic components (such as loudspeakers and microphones) nevertheless became fairly common, and led to complaints about inaccurate results. The use of RASTI was even specified by some application standards (such as CAA specification 15 for aircraft cabin PA systems) for applications featuring electro-acoustics, simply because it was the only feasible method at the time. The inadequacies of RASTI were sometimes simply accepted for lack of a better alternative. TNO did produce and sell instruments for measuring full STI and various other STI derivatives, but these devices were relatively expensive, large and heavy.

Around the year 2000, the need for an alternative to RASTI that could also be applied safely to Public Address (PA) systems had become fully apparent. At TNO, Jan Verhave and Herman Steeneken started work on a new STI method that would later become known as STIPA (STI for Public Address systems). The first device to include STIPA measurements available for sale to the general public was made by Gold-Line. At this time, STIPA measuring instruments are available from various manufacturers.

RASTI was standardized internationally in 1988, in IEC-60268-16. Since then, IEC-60268-16 has been revised three times, with the latest revision (rev. 4) appearing in 2011. Each revision included updates of the STI methodology that had become accepted in the STI research community over time, such as the inclusion of redundancy between adjacent octave bands (rev. 2), level-dependent auditory masking (rev. 3) and various methods for applying the STI to specific populations such as non-natives and the hearing impaired (rev. 4). An IEC maintenance team is currently working on rev. 5.

RASTI was declared obsolete by the IEC in June 2011, with the appearance of rev. 4 of IEC-60268-16. At that time, this simplified STI derivative was still stipulated as a standard method in some industries. STIPA is now seen as the successor to RASTI for almost every application.

Scale

STI is a numeric measure of communication channel characteristics whose value varies from 0 (bad) to 1 (excellent).[5] On this scale, an STI of at least 0.5 is desirable for most applications.

Barnett (1995,[6] 1999[7]) proposed the use of a reference scale, the Common Intelligibility Scale (CIS), based on a mathematical relation with STI (CIS = 1 + log10(STI)).
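The relation between STI and CIS can be sketched in a few lines of Python. This is a minimal illustration, assuming that the logarithm in the relation is base 10; the function names `sti_to_cis` and `cis_to_sti` are hypothetical and chosen only for this example.

```python
import math


def sti_to_cis(sti):
    """Convert an STI value to CIS using CIS = 1 + log10(STI).

    Assumption: the logarithm is base 10. STI must be greater than 0,
    because the logarithm of 0 is undefined.
    """
    if not 0.0 < sti <= 1.0:
        raise ValueError("STI must lie in the interval (0, 1]")
    return 1.0 + math.log10(sti)


def cis_to_sti(cis):
    """Invert the relation above: STI = 10 ** (CIS - 1)."""
    return 10.0 ** (cis - 1.0)


if __name__ == "__main__":
    for sti in (0.3, 0.5, 0.75, 1.0):
        print(f"STI {sti:.2f} -> CIS {sti_to_cis(sti):.3f}")
```

Under this assumption an STI of 1 maps to a CIS of 1, and an STI of 0.1 maps to a CIS of 0.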

[Figure: STI and CIS scales. Speech intelligibility may be expressed as a single number; the two most commonly used scales are STI and CIS.]

STI predicts the likelihood of syllables, words and sentences being comprehended. As an example, for native speakers, this likelihood is given by:

STI value     Quality (IEC 60268-16)   Syllable intelligibility (%)   Word intelligibility (%)   Sentence intelligibility (%)
0 - 0.3       bad                      0 - 34                         0 - 67                     0 - 89
0.3 - 0.45    poor                     34 - 48                        67 - 78                    89 - 92
0.45 - 0.6    fair                     48 - 67                        78 - 87                    92 - 95
0.6 - 0.75    good                     67 - 90                        87 - 94                    95 - 96
0.75 - 1      excellent                90 - 96                        94 - 96                    96 - 100
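
The quality ratings in the table above can be looked up programmatically. The following is a minimal Python sketch; the function name `sti_quality` is hypothetical, and because the table lists each boundary value (for example 0.3) in two adjacent rows, the sketch assumes that a boundary value belongs to the higher band.

```python
import bisect

# Upper STI limits of the "bad", "poor", "fair" and "good" rows of the table.
_UPPER_LIMITS = [0.3, 0.45, 0.6, 0.75]
_LABELS = ["bad", "poor", "fair", "good", "excellent"]


def sti_quality(sti):
    """Return the quality label for an STI value between 0 and 1.

    Assumption: a value exactly on a boundary (e.g. 0.3) is assigned
    to the higher band, since the table does not say which row applies.
    """
    if not 0.0 <= sti <= 1.0:
        raise ValueError("STI must lie between 0 and 1")
    # bisect_right counts how many upper limits are <= sti,
    # which is the index of the band the value falls into.
    return _LABELS[bisect.bisect_right(_UPPER_LIMITS, sti)]


if __name__ == "__main__":
    for value in (0.2, 0.3, 0.5, 0.7, 0.9):
        print(value, sti_quality(value))
```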

If non-native speakers, people with speech disorders or hard-of-hearing people are involved, other probabilities hold.

STI prediction is independent of the language spoken. This is not surprising, because what is measured is the ability of the channel to transport the physical patterns of speech.

Another method is defined for computing a physical measure that is highly correlated with the intelligibility of speech as evaluated by speech perception tests given a group of talkers and listeners. This measure is called the Speech Intelligibility Index, or SII.[8]

Nominal qualification bands for STI

The IEC 60268-16 ed4 2011 Standard defines a qualification scale in order to provide flexibility for different applications. The values of this alpha-scale run from "U" to "A+".[9]

[Figure: Nominal qualification bands for STI.]
[Figure: Examples of STI qualification bands and typical applications.]

Standards

STI has gained international acceptance as the quantifier of channel influence on speech intelligibility. The International Electrotechnical Commission standard Objective rating of speech intelligibility by speech transmission index,[9] prepared by Technical Committee TC 100, is the international standard that defines it.

Furthermore, the following standards include, among their requirements, testing of the STI and achievement of a minimum speech transmission index:

STIPA

STIPA (Speech Transmission Index for Public Address Systems) is a version of the STI using a simplified method and test signal. Within the STIPA signal, each octave band is modulated simultaneously with two modulation frequencies. The modulation frequencies are spread among the octave bands in a balanced way, making it possible to obtain a reliable STI measurement based on a sparsely sampled Modulation Transfer Function matrix. Although initially designed for Public Address systems (and similar installations, such as Voice Evacuation Systems and Mass Notification Systems), STIPA can also be used for a variety of other applications. The only situation in which STIPA is currently considered inferior to full STI is in the presence of strong echoes.

A single STIPA measurement generally takes between 15 and 25 seconds, combining the speed of RASTI with (nearly) the wide scope of applicability and reliability of full STI.

Since STIPA has become widely available, and given the fact that RASTI has several disadvantages and no benefits over STIPA, RASTI is now considered obsolete.

"STIPA Test Signal"
A sample of a STIPA test signal from NTi Audio (2011)


Although the STIPA test signal does not resemble speech to the human ear, in terms of frequency content as well as intensity fluctuations it is a signal with speech-like characteristics.

Speech can be described as noise that is intensity-modulated by low-frequency signals. The STIPA signal contains such intensity modulations at 14 different modulation frequencies, spread across 7 octave bands. At the receiving end of the communication system, the depth of modulation of the received signal is measured and compared with that of the test signal in each of a number of frequency bands. Reductions in the modulation depth are associated with loss of intelligibility.
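
The comparison of modulation depths described above can be illustrated with a small Python sketch. It is not a STIPA implementation: it works on an idealized intensity envelope with a single modulation frequency, and the function names, the envelope sample rate and the steady noise level are all assumptions made for this example. Adding steady noise to the received intensity reduces the modulation depth, which is the effect the measurement looks for.

```python
import math


def make_envelope(depth, f_mod, fs, seconds, mean=1.0):
    """Intensity envelope mean * (1 + depth * cos(2*pi*f_mod*t))."""
    n = int(round(fs * seconds))
    return [mean * (1.0 + depth * math.cos(2.0 * math.pi * f_mod * k / fs))
            for k in range(n)]


def modulation_depth(intensity, fs, f_mod):
    """Estimate the modulation depth of an intensity envelope at f_mod.

    Uses the Fourier component at f_mod, normalised by the mean intensity.
    The estimate is exact only if the envelope spans a whole number of
    modulation periods.
    """
    re = sum(x * math.cos(2.0 * math.pi * f_mod * k / fs)
             for k, x in enumerate(intensity))
    im = sum(x * math.sin(2.0 * math.pi * f_mod * k / fs)
             for k, x in enumerate(intensity))
    return 2.0 * math.hypot(re, im) / sum(intensity)


# Hypothetical example values: 2 Hz modulation, 100 Hz envelope sample rate.
FS = 100
F_MOD = 2.0

sent = make_envelope(depth=1.0, f_mod=F_MOD, fs=FS, seconds=5.0)

# Assumed channel: steady noise whose intensity equals the mean signal
# intensity (a signal-to-noise ratio of 0 dB) is added to the envelope.
NOISE_INTENSITY = 1.0
received = [x + NOISE_INTENSITY for x in sent]

m_sent = modulation_depth(sent, FS, F_MOD)
m_received = modulation_depth(received, FS, F_MOD)
modulation_transfer = m_received / m_sent

if __name__ == "__main__":
    print(f"sent depth:          {m_sent:.3f}")
    print(f"received depth:      {m_received:.3f}")
    print(f"modulation transfer: {modulation_transfer:.3f}")
```

In this idealized case the noise halves the modulation depth. A real measurement would repeat the comparison for each modulation frequency in each octave band.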

Indirect method

An alternative impulse response method, also known as the "indirect method," assumes that the channel is linear and requires stricter synchronization of the sound source to the measurement instrument. The main benefit of the indirect method over the direct method (based on modulated test signals) is that the full MTF matrix is measured, covering all relevant modulation frequencies in all octave bands. In very large spaces (such as cathedrals), where echoes are likely to occur, the indirect method is usually preferred over the direct method (e.g. using modulated STIPA signals). In general, the indirect method is often the best option when studying speech intelligibility based on "pure room acoustics," when no electro-acoustic components are present within the transmission path.

However, the requirement that the channel must be linear implies that the indirect method cannot be used reliably in many real-life applications: whenever the transmission chain features components that might exhibit non-linear behaviour (such as loudspeakers), indirect measurements may yield incorrect results. Also, depending on the type of impulse response measurement that is used, the influence of background noise present during measurements may not be dealt with correctly. This means that the indirect method should only be used with great care when measuring Public Address systems and Voice Evacuation systems. IEC-60268-16 rev. 4 does not disallow the indirect method for such applications, but issues the following words of warning: "Critical analysis is therefore required of how the impulse response is obtained and potentially influenced by non-linearities in the transmission system, particularly as in practice, system components can be operated at the limits of their performance range." In practice, verification of the validity of the linearity assumption is often too complex for everyday use, making the (direct) STIPA method the preferred method whenever loudspeakers are involved.

Although many measuring tools based on the indirect method offer STIPA as well as "full STI" options, the sparse Modulation Transfer Function matrix inherent to STIPA offers no advantages when using the indirect method. Impulse response based STIPA measurements must not be confused with direct STIPA measurements, as the validity of the result still depends on whether or not the channel is linear.
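
As a rough illustration of how a modulation transfer value can be derived from an impulse response, the Python sketch below applies the commonly cited relation in which the modulation transfer at a modulation frequency F is the magnitude of the Fourier transform of the squared impulse response at F, divided by the total energy of the response. The text above does not spell out this relation, so it is an assumption of the example, as are the function names and the synthetic, purely exponential decay used as input. Noise, octave-band filtering and the linearity caveats discussed above are all ignored.

```python
import cmath
import math

# Natural-log factor for a 60 dB energy decay: energy falls by 10**6.
DECAY_60_DB = math.log(1e6)


def exponential_decay(rt60, fs, seconds):
    """Synthetic impulse response envelope whose energy decays by 60 dB
    in rt60 seconds (an idealized stand-in for a measured response)."""
    n = int(round(fs * seconds))
    return [math.exp(-0.5 * DECAY_60_DB * k / (fs * rt60)) for k in range(n)]


def mtf_from_impulse_response(h, fs, f_mod):
    """Modulation transfer at f_mod from an impulse response h.

    Assumed relation: |sum(h^2 * exp(-j*2*pi*f_mod*t))| / sum(h^2).
    Valid only for a linear channel, and noise is not accounted for.
    """
    energy = [x * x for x in h]
    total = sum(energy)
    if total == 0.0:
        raise ValueError("impulse response has no energy")
    component = sum(e * cmath.exp(-2j * math.pi * f_mod * k / fs)
                    for k, e in enumerate(energy))
    return abs(component) / total


if __name__ == "__main__":
    fs = 4000          # hypothetical sample rate in Hz
    rt60 = 1.0         # hypothetical reverberation time in seconds
    h = exponential_decay(rt60, fs, seconds=3.0)
    for f_mod in (0.63, 2.0, 12.5):
        m = mtf_from_impulse_response(h, fs, f_mod)
        print(f"F = {f_mod:5.2f} Hz  m = {m:.3f}")
```

For an ideal exponential decay, the result should be close to 1 / sqrt(1 + (2*pi*F*T/13.8)^2), where T is the reverberation time, which gives a simple check on the calculation.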

List of manufacturers of STI measuring instruments

STI measuring instruments are (and have been) made by various manufacturers. Below is a list of brands under which STI measuring instruments have been sold, in alphabetical order.

The market for STI measuring solutions is still developing, so the above list is subject to change as manufacturers enter or leave the market. The list does not include software producers that produce STI-capable acoustic measuring and simulation software. Mobile apps for STIPA measurements (such as the ones sold by Studio Six Digital and Embedded Acoustics) are also excluded from the list.

References

Jacob, K., McManus, S., Verhave, J.A., and Steeneken, H. (2002). "Development of an Accurate, Handheld, Simple-to-use Meter for the Prediction of Speech Intelligibility". Past, Present, and Future of the Speech Transmission Index, International Symposium on STI.

This article is issued from Wikipedia - version of the 11/18/2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.