In recent years, we have noticed that vocoder selection for critical voice communication systems is sometimes made without full consideration of the technical aspects and potential consequences of the choice.

It is noteworthy that even in high-end communication devices, free codecs are occasionally used, which can impact key performance characteristics and, in some cases, lead to a system that does not meet the expected quality and reliability standards.

We hope the information provided below will be useful not only for those new to the field but also for experienced professionals, helping them better understand the critical characteristics of vocoders and make well-informed decisions when selecting the right vocoder for their voice communication systems.

To illustrate these points, we present a direct comparison of three vocoders at the same bit rate of 1200 bps: the open-source Codec2 (v1.2), the MELPe vocoder, and our TWELP vocoder. We also evaluate TWELP at 600 bps, operating at only half the rate.

Listening examples and objective evaluations clearly show that while Codec2 may be attractive as a free solution, it falls significantly short in both speech quality and intelligibility. MELPe provides stronger performance, while TWELP consistently delivers excellent results: at 1200 bps it outperforms both alternatives, and even at 600 bps it significantly surpasses Codec2 at 1200 bps, while only slightly trailing MELPe at 1200 bps.

Below you can listen to audio samples and review objective test results using PESQ, STOI, and ESTOI metrics, providing both a subjective and quantitative perspective on the differences between these vocoders.

Essential Information. As professionals in the field of voice communication know, any communication system is primarily defined by three key characteristics:

  • Speech Quality: Refers to how natural and clear the speech sounds. Higher quality means less distortion, fewer artifacts, and a more pleasant listening experience.
    It directly impacts the recognition of the speaker and the conveyance of their emotions, which is especially critical in high-stakes situations.

  • Speech Intelligibility: Measures how easily speech can be understood, even in noisy conditions. While degradation in quality may occur, high intelligibility ensures that key information remains recognizable.
    A drop in intelligibility to 90% (one word in ten missed) may not be critical, as the brain can often infer lost information from context.
    However, when intelligibility falls to 70% or lower (three or more words in ten missed), communication becomes significantly more difficult and can lead to irreversible consequences in critical situations.

  • Latency: Refers to the time delay between speech input and output. Lower latency is crucial for real-time communication to prevent unnatural pauses and delays.
    Noticeable time delays not only complicate communication but can also result in severe consequences when rapid and accurate voice transmission is vital.

Therefore, vocoders should primarily be compared based on these essential parameters. 

All measurement results presented below can be independently verified using the samples and utilities available via the links at the bottom of this webpage. 

You will find not only all the necessary utilities in the archive but also a complete test environment in the form of batch (BAT) command files. You only need to run the corresponding command file to test the speech quality and intelligibility of samples processed by a specific vocoder. 
You can also explore the information inside the command files to see how the process works.

Technology Features. The TWELP vocoders are based on a new speech coding technology called "Tri-Wave Excited Linear Prediction" (TWELP), developed by experts at DSPINI.

TWELP technology represents a new class of vocoders that differs from other LPC-based vocoders in the following:

  • advanced, reliable method of pitch estimation
  • pitch-synchronous analysis
  • advanced tri-wave model of excitation
  • newest quantization schemes
  • pitch-synchronous synthesis

Thanks to its unique features, TWELP technology offers significantly better speech quality and intelligibility compared to other well-known technologies, including AMBE+2, MELPe, ACELP, and others, at bit rates ranging from 300 bps to 4800 bps and beyond.
Unlike many other low-bit-rate (LBR) vocoders, such as MELPe, TWELP also provides superior quality for non-speech signals like sirens, background music, and more.

In contrast, CODEC2 is based on the older and simpler SHC (Sinusoidal Harmonic Coding) technology, which was widely used over 30 years ago.

Speech Quality. As is well known, obtaining an objective assessment of the speech quality of low-bitrate vocoders by ear is extremely difficult and labor-intensive.

It is commonly believed that the ITU-T P.862 (PESQ) utility, specifically designed for objective speech quality evaluation, is not suitable for assessing low-bitrate vocoders. Our many years of experience show that this is not entirely accurate.
Admittedly, this assessment is not perfectly precise. However, it is very useful because it helps identify quality differences when comparing vocoders.

Recently, we have started using speech intelligibility measurements based on the STOI (Short-Time Objective Intelligibility) and ESTOI (Extended Short-Time Objective Intelligibility) methods and have found a clear correlation between PESQ quality scores and intelligibility scores in the STOI/ESTOI metrics.
Of course, there are some differences, especially when comparing vocoders with very low bitrates. However, it is safe to say that all these evaluation methods are important and, by complementing each other, provide a sufficiently accurate and objective picture when comparing vocoder quality.

The TWELP 1200 bps, MELPe 1200 bps, TWELP 600 bps vocoders and CODEC2 1200 bps vocoder were tested using the ITU-T P.50 speech database for 20 different languages.
We have updated the speech database by minimizing inter-speech pauses to eliminate their impact on the evaluation results.
Therefore, the numbers obtained from the quality measurements using this updated speech database differ from those previously obtained with the original speech database, where speech pauses were not removed.
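The exact procedure used to minimize pauses is not described here. As an illustration only, the idea can be sketched with a simple frame-energy gate that caps the length of any silent stretch. The function name, the 8000 Hz sample rate, the 20 ms analysis frame, the RMS threshold, and the 200 ms maximum pause are all assumptions chosen for this sketch.

```python
import math


def shorten_pauses(samples, rate=8000, frame_ms=20, threshold=0.01, max_pause_ms=200):
    """Keep speech frames; keep at most max_pause_ms of each silent stretch.

    A frame counts as silent when its RMS level is below `threshold`
    (samples are assumed to be floats in the range -1.0..1.0).
    """
    frame_len = rate * frame_ms // 1000
    max_silent_frames = max_pause_ms // frame_ms
    output = []
    silent_run = 0
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(v * v for v in frame) / len(frame))
        if rms < threshold:
            silent_run += 1
            if silent_run > max_silent_frames:
                continue  # drop the excess part of a long pause
        else:
            silent_run = 0
        output.extend(frame)
    return output


# Stand-in signal: 1 s of tone, 2 s of silence, 1 s of tone
tone = [0.5 * math.sin(2 * math.pi * 200 * n / 8000) for n in range(8000)]
signal = tone + [0.0] * 16000 + tone
trimmed = shorten_pauses(signal)
print(len(signal), len(trimmed))
```

A real database would call for a more careful voice activity detector, since a fixed threshold can clip quiet speech onsets.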
The ITU-T P.862 utility was used to estimate speech quality in PESQ terms:

 
Language    TWELP 1200 bps   MELPe 1200 bps   TWELP 600 bps   CODEC2 1200 bps
American    2.917            2.711            2.506           2.492
Arabic      2.875            2.722            2.522           2.496
British     2.793            2.689            2.540           2.290
Chinese     2.899            2.607            2.506           2.486
Danish      2.910            2.720            2.615           2.519
Dutch       2.767            2.644            2.487           2.271
Finnish     2.699            2.630            2.499           2.202
French      2.969            2.818            2.606           2.560
German      2.954            2.793            2.616           2.471
Greek       2.889            2.698            2.571           2.448
Hindi       2.951            2.847            2.642           2.488
Hungarian   2.944            2.793            2.583           2.664
Italian     3.092            2.983            2.770           2.718
Japanese    3.040            2.898            2.659           2.526
Norwegian   2.920            2.745            2.562           2.574
Polish      2.926            2.758            2.621           2.423
Portuguese  2.965            2.894            2.694           2.659
Russian     2.827            2.689            2.506           2.477
Spanish     2.944            2.816            2.627           2.713
Swedish     3.013            2.907            2.688           2.576
Average     2.915            2.768            2.591           2.503

On average, TWELP 1200, MELPe 1200, and TWELP 600 outperform CODEC2 1200 by 0.412, 0.265, and 0.088 PESQ, respectively.
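For readers who want to check the arithmetic, the averages and margins can be recomputed from the per-language scores. The short sketch below copies the values from the PESQ table above; the variable and function names are illustrative and are not part of the downloadable test environment.

```python
# Per-language PESQ scores from the table, in column order:
# TWELP 1200 bps, MELPe 1200 bps, TWELP 600 bps, CODEC2 1200 bps.
COLUMNS = ("TWELP 1200", "MELPe 1200", "TWELP 600", "CODEC2 1200")
PESQ = {
    "American": (2.917, 2.711, 2.506, 2.492),
    "Arabic": (2.875, 2.722, 2.522, 2.496),
    "British": (2.793, 2.689, 2.540, 2.290),
    "Chinese": (2.899, 2.607, 2.506, 2.486),
    "Danish": (2.910, 2.720, 2.615, 2.519),
    "Dutch": (2.767, 2.644, 2.487, 2.271),
    "Finnish": (2.699, 2.630, 2.499, 2.202),
    "French": (2.969, 2.818, 2.606, 2.560),
    "German": (2.954, 2.793, 2.616, 2.471),
    "Greek": (2.889, 2.698, 2.571, 2.448),
    "Hindi": (2.951, 2.847, 2.642, 2.488),
    "Hungarian": (2.944, 2.793, 2.583, 2.664),
    "Italian": (3.092, 2.983, 2.770, 2.718),
    "Japanese": (3.040, 2.898, 2.659, 2.526),
    "Norwegian": (2.920, 2.745, 2.562, 2.574),
    "Polish": (2.926, 2.758, 2.621, 2.423),
    "Portuguese": (2.965, 2.894, 2.694, 2.659),
    "Russian": (2.827, 2.689, 2.506, 2.477),
    "Spanish": (2.944, 2.816, 2.627, 2.713),
    "Swedish": (3.013, 2.907, 2.688, 2.576),
}


def column_averages(table):
    """Mean score of each vocoder across all languages."""
    count = len(table)
    return [sum(row[i] for row in table.values()) / count
            for i in range(len(COLUMNS))]


def margins_over_last(averages):
    """Difference between each average and the last column (CODEC2 1200)."""
    reference = averages[-1]
    return [avg - reference for avg in averages[:-1]]


averages = column_averages(PESQ)
for name, avg in zip(COLUMNS, averages):
    print(f"{name}: {avg:.3f}")
for name, margin in zip(COLUMNS, margins_over_last(averages)):
    print(f"{name} over CODEC2 1200: {margin:.3f}")
```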

The diagram and table above illustrate the significant differences in speech quality between the TWELP, MELPe, and Codec2 vocoders.
As is well known, at such low bit rates even a 0.1 PESQ difference is considered substantial. In this case, we observe a difference of 0.265 PESQ between MELPe and Codec2.
The gap between TWELP and Codec2 is even more striking — 0.412 PESQ, which clearly demonstrates the superior performance of TWELP.
It is also noteworthy that TWELP not only outperforms Codec2 by a wide margin, but also provides a considerable advantage over MELPe at the same 1200 bps rate.
Moreover, even at 600 bps — operating at only half the bit rate — TWELP still delivers higher speech quality than Codec2 at 1200 bps.

Speech Intelligibility. The five-point speech intelligibility scale is typically represented in the following terms:

Excellent – 96-100% intelligibility
Good – 86-95% intelligibility
Fair – 70-85% intelligibility
Poor – 50-69% intelligibility
Bad – <50% intelligibility

This scale is used, for example, in speech intelligibility assessment studies such as the Modified Rhyme Test (MRT) or the Speech Intelligibility Index (SII).
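The scale above can be expressed as a small lookup, which is convenient when classifying measured scores. This is a minimal sketch: the function name is hypothetical, and because the listed bands are given in whole percentages, assigning fractional scores that fall between two bands (for example 85.5%) to the lower band is an assumption of this sketch.

```python
def intelligibility_category(percent):
    """Map an intelligibility score in percent to the five-point scale.

    Assumption: a score between two listed bands (e.g. 85.5) is assigned
    to the lower band, because only lower bounds are checked here.
    """
    if not 0.0 <= percent <= 100.0:
        raise ValueError("percent must be between 0 and 100")
    if percent >= 96:
        return "Excellent"
    if percent >= 86:
        return "Good"
    if percent >= 70:
        return "Fair"
    if percent >= 50:
        return "Poor"
    return "Bad"


for score in (98.0, 90.0, 75.0, 60.0, 30.0):
    print(score, intelligibility_category(score))
```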

We use the STOI (Short-Time Objective Intelligibility) and ESTOI (Extended Short-Time Objective Intelligibility) metrics to evaluate speech intelligibility.
These metrics have proven their objectivity over recent years and have become so popular that even the latest version of MATLAB includes one of these evaluators.
Although we use both metrics, we believe that the ESTOI metric provides a more objective result when assessing parametric vocoders, which are highly nonlinear devices that significantly distort the spectral composition of the signal. 

Here is the comparison of speech intelligibility, using the same updated ITU-T P.50 speech database for 20 different languages and the above-mentioned STOI and ESTOI metrics. The first table shows the STOI results:

 
Language    TWELP 1200 bps   MELPe 1200 bps   TWELP 600 bps   CODEC2 1200 bps
American    88.30            85.20            84.53           83.81
Arabic      87.45            84.76            83.07           82.45
British     85.72            80.97            82.02           80.44
Chinese     87.73            83.74            83.40           84.30
Danish      88.83            84.86            85.13           83.64
Dutch       86.82            82.66            83.37           80.98
Finnish     84.05            80.12            79.64           78.81
French      88.28            84.28            84.48           84.52
German      87.97            85.01            84.37           82.87
Greek       87.30            84.03            83.48           82.83
Hindi       87.60            84.38            83.97           81.50
Hungarian   87.64            84.72            84.27           84.72
Italian     87.68            85.00            83.15           82.67
Japanese    87.75            86.01            84.14           83.51
Norwegian   88.65            85.12            84.74           85.20
Polish      88.05            83.93            84.24           83.33
Portuguese  87.56            85.05            84.01           84.46
Russian     86.56            82.55            82.71           82.88
Spanish     86.96            83.57            82.46           82.86
Swedish     86.25            83.47            82.31           82.75
Average     87.36            83.97            83.48           82.93

The superiority of MELPe 1200 over CODEC2 1200 averages 1.04 percentage points.
The superiority of TWELP 1200 over CODEC2 1200 averages 4.43 percentage points.
The superiority of TWELP 600 over CODEC2 1200 averages 0.55 percentage points.

The diagram and table above also illustrate the significant differences in speech intelligibility between the TWELP, MELPe, and Codec2 vocoders.
Measurements indicate that these vocoders belong to fundamentally different intelligibility categories.
  • TWELP at 1200 bps falls into the “Good” category (86–95%).
  • MELPe at 1200 bps and Codec2 at 1200 bps both fall within the “Fair” category (70–85%), providing intelligibility comparable to the TWELP vocoder at 600 bps, despite TWELP operating at only half the bitrate.

Considering that a low-bitrate vocoder is a nonlinear device that significantly distorts the spectrum of the original speech signal, the ESTOI metric provides more accurate assessments of speech intelligibility after vocoding:
 
Language    TWELP 1200 bps   MELPe 1200 bps   TWELP 600 bps   CODEC2 1200 bps
American    80.77            78.01            74.48           70.03
Arabic      80.87            78.27            74.28           68.81
British     78.09            75.02            72.80           66.17
Chinese     82.20            78.08            75.65           74.39
Danish      82.04            77.70            76.22           70.40
Dutch       80.03            76.97            74.36           67.16
Finnish     76.62            73.56            70.44           66.56
French      81.58            77.91            75.68           71.83
German      80.52            76.58            74.59           67.96
Greek       81.49            78.51            76.04           71.74
Hindi       79.12            76.53            72.77           64.28
Hungarian   80.21            76.09            73.59           69.50
Italian     80.46            77.88            73.28           70.05
Japanese    82.28            80.21            76.09           71.84
Norwegian   81.97            78.86            76.13           73.08
Polish      81.43            77.75            75.53           69.52
Portuguese  80.80            78.04            75.07           71.41
Russian     79.61            76.02            73.07           68.73
Spanish     80.84            77.57            74.07           71.94
Swedish     78.07            75.91            71.91           69.17
Average     80.45            77.27            74.30           69.73

The superiority of MELPe 1200 over CODEC2 1200 averages 7.54 percentage points.
The superiority of TWELP 1200 over CODEC2 1200 averages 10.72 percentage points.
The superiority of TWELP 600 over CODEC2 1200 averages 4.57 percentage points.

The diagram and table above also illustrate the significant differences in speech intelligibility between the TWELP, MELPe, and Codec2 vocoders.
Measurements indicate that these vocoders belong to fundamentally different intelligibility categories.

  • TWELP and MELPe at 1200 bps, as well as TWELP at 600 bps, fall into the “Fair” category (70–85%).
  • Codec2 at 1200 bps lies on the boundary between the “Poor” and “Fair” categories, delivering intelligibility significantly lower than that of the TWELP vocoder at 600 bps, despite TWELP operating at only half the bitrate.

At 1200 bps, TWELP outperforms MELPe and Codec2 by 3.18 and 10.72 percentage points, respectively.

Speech Samples (WAV-files). We used the preference method when comparing speech quality and intelligibility by ear.

First, we compared the CODEC2 1200 bps vocoder with the MELPe 1200 bps vocoder.
All independent listeners preferred MELPe, noting its significantly higher speech intelligibility and substantially fewer artifacts (unwanted distortions) in the synthesized speech.

Next, we compared the CODEC2 1200 bps vocoder with the TWELP 1200 bps vocoder.
All independent listeners clearly preferred TWELP, highlighting its superior quality, clarity, naturalness, excellent speech intelligibility, and the almost complete absence of audible artifacts.

We also conducted comparative tests between the CODEC2 1200 bps vocoder and the TWELP 600 bps vocoder, which operates at half the bit rate.
Although the listeners’ opinions were divided in this case, the majority still favored TWELP.
Many were surprised to learn that the vocoders operate at bit rates differing by a factor of two.

You can listen to short samples of the original speech as well as the processed speech from these vocoders in any of the 20 languages using the links in the table below.

Additionally, you can download the complete set of ITU-T P.50 samples (an updated version without pauses) as ZIP files for all languages at once using the links in the "Downloads" section at the bottom of the page.
For the best listening experience, we recommend using high-quality headphones or premium audio equipment to hear the nuances and differences in the vocoder sound more clearly.

Language    Source speech   CODEC2 1200 bps   TWELP 600 bps   MELPe 1200 bps   TWELP 1200 bps
American
Arabic
British
Chinese
Danish
Dutch
Finnish
French
German
Greek
Hindi
Hungarian
Italian
Japanese
Norwegian
Polish
Portuguese
Russian
Spanish
Swedish

Superiority In Quality Of The Non-speech Signals. In contrast to other LBR vocoders (MELPe, AMBE+2, etc.), TWELP vocoders reproduce non-speech signals, such as police, ambulance, and fire sirens, with high quality. Combined with natural, human-sounding voice reproduction, this makes TWELP vocoders well suited for replacing analog radio with digital radio, and for other applications where high-quality transmission of non-speech signals matters as much as high-quality transmission of speech.

Source signal   CODEC2 1200 bps   TWELP 600 bps   MELPe 1200 bps   TWELP 1200 bps

High Robustness To Acoustic Noise. In real-world voice communication applications, the speech signal at the vocoder's input is typically distorted to some extent by external noise.
In many cases, such as military operations and similar scenarios, the signal-to-noise ratio (SNR) can be extremely low.
High speech intelligibility under such conditions is a critically important factor, often directly affecting people’s safety and even their lives.

We used the ITU-T P.50 speech database as a basis for generating samples where the original speech was mixed with "pink" noise at various SNR levels, ranging from 40 dB down to 0 dB.
We then processed all these noisy speech samples using the TWELP 1200 bps, MELPe 1200 bps, TWELP 600 bps and CODEC2 1200 bps vocoders.
Next, we measured speech intelligibility for all these samples using the ESTOI metric, as it is the most suitable for objectively evaluating speech signals processed by a parametric vocoder, especially in noisy environments.
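Mixing at a target SNR comes down to scaling the noise so that the ratio of speech power to noise power matches the requested value. The sketch below shows only that scaling step, under stated assumptions: the helper names are hypothetical, power is measured over the whole file rather than over active speech only, and a white Gaussian noise stand-in is used because generating pink noise is outside the scope of this example.

```python
import math
import random


def mean_power(samples):
    """Average power: the mean of the squared sample values."""
    return sum(v * v for v in samples) / len(samples)


def mix_at_snr(speech, noise, target_snr_db):
    """Scale `noise` to the requested SNR and add it to `speech`.

    Returns (mixed, scaled_noise). Uses the definition
    SNR_dB = 10 * log10(P_speech / P_noise).
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    wanted_noise_power = mean_power(speech) / (10 ** (target_snr_db / 10))
    gain = math.sqrt(wanted_noise_power / mean_power(noise))
    scaled = [gain * n for n in noise]
    mixed = [s + n for s, n in zip(speech, scaled)]
    return mixed, scaled


def measured_snr_db(speech, noise):
    """SNR in dB between a speech signal and a noise signal."""
    return 10 * math.log10(mean_power(speech) / mean_power(noise))


# Stand-in signals: a 200 Hz tone as "speech" and white Gaussian noise
rng = random.Random(0)
speech = [0.5 * math.sin(2 * math.pi * 200 * n / 8000) for n in range(8000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(8000)]
for snr in (40, 20, 10, 0):
    _, scaled = mix_at_snr(speech, noise, snr)
    print(snr, round(measured_snr_db(speech, scaled), 6))
```

For real speech files, measuring the speech level over active speech only would give more meaningful SNR values, since long pauses lower the whole-file average power.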

At high SNR levels, adding noise has little effect on intelligibility. However, when the SNR drops below 30 dB, speech intelligibility after all of the vocoders begins to decline.
Since the CODEC2 vocoder inherently provides significantly lower speech quality and intelligibility compared to the MELPe and TWELP vocoders, speech intelligibility at very low SNR levels after CODEC2 becomes very low, while after MELPe and TWELP, it remains at a satisfactory level.

Below, we present a diagram and a table showing intelligibility values for SNR = 10 dB.
This diagram and table clearly illustrate the advantages of the MELPe and TWELP vocoders over the CODEC2 vocoder in acoustic noise conditions.
The TWELP 1200 bps vocoder provides much better speech intelligibility than both the MELPe and CODEC2 vocoders.

 
 
Language    TWELP 1200 bps   MELPe 1200 bps   TWELP 600 bps   CODEC2 1200 bps
American    69.98            66.92            63.51           60.66
Arabic      71.11            69.34            63.47           63.05
British     66.98            64.32            61.32           57.43
Chinese     73.61            69.84            66.61           66.25
Danish      70.18            65.41            63.88           60.72
Dutch       67.25            64.64            61.77           57.37
Finnish     65.97            62.45            59.25           58.36
French      71.57            68.53            65.31           63.03
German      68.20            65.00            62.24           59.90
Greek       71.77            68.12            65.56           63.10
Hindi       67.23            64.62            60.82           55.60
Hungarian   71.27            68.20            63.85           61.91
Italian     68.67            66.67            61.39           60.10
Japanese    73.08            71.71            66.21           63.85
Norwegian   73.25            69.64            67.13           65.48
Polish      70.51            67.27            64.72           60.10
Portuguese  70.31            67.93            63.59           63.15
Russian     68.18            65.57            62.02           59.41
Spanish     72.64            68.39            64.80           64.04
Swedish     64.40            62.05            57.39           57.39
Average     69.81            65.57            63.24           61.05

The superiority of MELPe 1200 over CODEC2 1200 averages 4.52 percentage points.
The superiority of TWELP 1200 over CODEC2 1200 averages 8.76 percentage points.
The superiority of TWELP 600 over CODEC2 1200 averages 2.19 percentage points.

Under strong acoustic noise, the relative ranking between the vocoders remains the same.

  • The TWELP 1200 bps vocoder, while pushed down to the boundary between the “Fair” and “Poor” categories, continues to lead by a clear margin.
  • The MELPe 1200 bps and TWELP 600 bps vocoders both drop fully into the “Poor” category.
  • The Codec2 1200 bps vocoder, which was already in the “Poor” category, falls even lower within it.

Note that the TWELP 1200 bps vocoder provides higher intelligibility for heavily noise-corrupted speech than the CODEC2 1200 bps vocoder does for completely clean speech.

Below, you can listen to short samples of noisy English speech at SNR = 10 dB processed by these vocoders.

CODEC2 1200 bps   TWELP 600 bps   MELPe 1200 bps   TWELP 1200 bps   Source signal (SNR = 10 dB)

Note: 
All the above evaluations were conducted for all vocoders without activating noise reduction systems in order to assess the vocoders themselves exclusively.

Latency. Here is a comparison of the latency added by these vocoders to the communication system.

  • TWELP 1200 bps vocoder operates with a frame size of 40 ms (320 samples) and has 20 ms look-ahead time in the analysis, providing a total algorithmic delay of 60 ms.
  • MELPe 1200 bps vocoder operates with a frame size of 67.5 ms (540 samples) and provides a total algorithmic delay of 103.75 ms.
  • CODEC2 1200 bps vocoder operates with a frame size of 40 ms (320 samples) and has 20 ms look-ahead time, providing a total algorithmic delay of 60 ms.

Note: 
We did not analyze the CODEC2 vocoder's code to determine its algorithmic delay.
Instead, we estimated the look-ahead time based on the sample delay in the output files relative to the input samples, which typically corresponds to the actual value of this time.
The frame size value was taken from the vocoder's documentation.

You can see that both the TWELP 1200 and Codec2 1200 vocoders have the same low latency of 60 ms, which is acceptable for both full-duplex and half-duplex communication (given proper integration and minimal computational overhead).
In contrast, the MELPe 1200 bps vocoder has a latency of 103.75 ms, which exceeds the well-known 100 ms threshold and causes difficulties for normal voice communication in such systems.
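The delay figures above follow from simple arithmetic: the algorithmic delay is the frame duration plus the look-ahead, and the frame size in samples is the frame duration multiplied by the sample rate. The sketch below reproduces these numbers. The 8000 Hz sample rate is an assumption inferred from "40 ms (320 samples)", the names are illustrative, and for MELPe only the total delay is quoted in the text, so it is entered directly rather than derived.

```python
SAMPLE_RATE_HZ = 8000        # assumption: implied by 40 ms = 320 samples
DELAY_THRESHOLD_MS = 100.0   # threshold referred to in the text


def frame_samples(frame_ms):
    """Number of samples in one frame at the assumed sample rate."""
    return round(frame_ms * SAMPLE_RATE_HZ / 1000)


def algorithmic_delay_ms(frame_ms, lookahead_ms):
    """Algorithmic delay: one full frame plus the analysis look-ahead."""
    return frame_ms + lookahead_ms


VOCODERS = {
    "TWELP 1200": {"frame_ms": 40.0, "delay_ms": algorithmic_delay_ms(40.0, 20.0)},
    # Total delay as quoted in the text; the look-ahead is not stated there.
    "MELPe 1200": {"frame_ms": 67.5, "delay_ms": 103.75},
    # Look-ahead estimated from output files, as explained in the note above.
    "CODEC2 1200": {"frame_ms": 40.0, "delay_ms": algorithmic_delay_ms(40.0, 20.0)},
}

for name, info in VOCODERS.items():
    verdict = "above" if info["delay_ms"] > DELAY_THRESHOLD_MS else "within"
    print(f"{name}: {frame_samples(info['frame_ms'])} samples per frame, "
          f"{info['delay_ms']} ms delay, {verdict} the 100 ms threshold")
```

These figures cover the algorithmic delay only. Processing time, buffering, and channel delay in a real system add to them.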

 Guarantee And Support.  DSPINI guarantees the quality of the product and the conformance of all its technical characteristics to the requirements of the current specifications. Testing and other quality-control methods are used to support this guarantee.

Software Integrity and Security. DSPINI guarantees the ABSOLUTE integrity of its software, free from any undocumented features, undeclared capabilities, or hidden functions. Our customers can be assured that none of our software/code contains any secret features or functionalities concealed from the user. If necessary, we are ready to provide the source code of our software products for appropriate certification.
Moreover, our software is available in source code form—you simply need to purchase the appropriate license to use it.

Any Platforms.  DSPINI can perform a highly optimized port of the vocoder to any other DSP, RISC, or general-purpose platform within a short time: 1-2 months.

Licensing Terms. DSPINI is the exclusive owner of the rights to the TWELP vocoder software, so a license can be obtained only from DSPINI.

Customization.  DSPINI can customize any vocoder to specific requirements: a different bit rate, frame size, level of robustness to channel errors, etc. Please contact us for details.

Prospects.  DSPINI is continuously improving and developing a set of new vocoders with bit rates ranging from 300 bps up to 9600 bps, based on SPR and TWELP technologies.

Related Software.  Any vocoder can be used effectively in a bundle with other DSPINI products:

  • Linear and acoustic echo cancellers,
  • Speech Enhancers / Noise cancellers,
  • Wired or radiomodems for any types of channels and bitrates,
  • Other DSP products.

Downloads:

Note: 
This testbench includes:
- The ITU-T P.50 speech database (updated by removing pauses and adding noisy speech samples)
- The ITU-T P.862 utility
- The STOI/ESTOI utility
- The Audio File Time Aligner utility
- Audio samples processed by CODEC2, MELPe and TWELP vocoders at different bitrates
- A testing environment with command (BAT) files

Conclusion. The open-source CODEC2 vocoder is a notable achievement for a free codec operating at low bit rates, and if minimizing costs is the primary concern for your application, it may be worth considering.

However, due to its older technology, CODEC2 generally provides lower speech quality and intelligibility compared to modern solutions, including the TWELP vocoder, which delivers superior performance even at a much lower bit rate.
In addition, CODEC2 struggles with non-speech signals such as sirens and alarms.

When it comes to mission-critical communications — military, governmental, or emergency services — speech clarity is not merely important but absolutely vital: misunderstandings or repeated transmissions can cost precious time and, in extreme cases, lives.
For this reason, we strongly recommend carefully evaluating all factors relevant to your specific application and, where appropriate, consulting with voice communication experts before making a final decision on which codec to adopt.

We would be happy to provide more detailed information about the TWELP vocoder, assist with your decision-making, and discuss whether it may be a good fit for your needs. Please feel free to contact us via email.

Additionally, if you would like to see a comparison of our vocoders with any other standard or open-source options, we are happy to conduct the necessary tests and share the results on our website.

For practical guidance, please see Choosing the Right Codec.