(version 5.0) offers speech quality and intelligibility that were previously achievable only at double or higher bitrates. In our search for a worthy competitor and given the growing demand for free codecs in recent years, we decided to compare our vocoder with CODEC2, which operates at twice the bitrate, to highlight the fundamental difference between a free and a commercial solution.

For Digital HF Radio and other markets.

TWELP Technology Features. The vocoder is based on the newest speech coding technology, "Tri-Wave Excited Linear Prediction" (TWELP), developed by DSPINI experts.

TWELP technology is a new class of vocoders that differs from other LPC-based vocoders in the following ways:

  • an advanced, reliable method of pitch estimation
  • pitch-synchronous analysis
  • an advanced tri-wave model of excitation
  • the newest quantization schemes
  • pitch-synchronous synthesis

Thanks to these features, TWELP technology provides significantly better speech quality than other well-known technologies (including AMBE+2, MELPe, and ACELP) at equivalent bit rates from 300 bps to 4800 bps and beyond.

Additionally, unlike other low-bitrate vocoders (such as MELPe), TWELP delivers much higher quality for non-speech signals, including sirens, background music, and similar audio.

Speech Quality. This is a comparison with the CODEC2 vocoder, which operates at 3200 bps and 1600 bps.
The TWELP 1600 bps and CODEC2 3200 bps and 1600 bps vocoders were tested using the ITU-T P.50 speech database in 20 different languages.

Note: 
We have updated the speech database by minimizing inter-speech pauses to eliminate their impact on the evaluation results. Therefore, the numbers obtained from the quality measurements using this updated speech database differ from those previously obtained with the original speech database, where speech pauses were not removed.

The ITU-T P.862 tool was used to evaluate speech quality in terms of PESQ scores:
[Chart: Speech Quality Comparison, TWELP 1600 bps vs CODEC2 3200 bps and 1600 bps. PESQ score (axis range 2.2 to 3.3) per language for TWELP 1600 bps, CODEC2 3200 bps, and CODEC2 1600 bps.]
The diagram demonstrates a substantial difference in speech quality after processing with the TWELP 1600 bps vocoder compared to CODEC2, even when the latter operates at twice the bitrate (3200 bps). Exact numbers are shown in the table below.
Language  TWELP 1600 bps  CODEC2 3200 bps  CODEC2 1600 bps
American 3.086 2.758 2.550
Arabic 3.040 2.823 2.581
British 2.946 2.497 2.351
Chinese 3.028 2.912 2.658
Danish 3.094 2.794 2.575
Dutch 2.905 2.506 2.333
Finnish 2.886 2.418 2.244
French 3.110 2.849 2.649
German 3.070 2.680 2.529
Greek 3.070 2.768 2.527
Hindi 3.063 2.756 2.538
Hungarian 3.083 2.993 2.747
Italian 3.220 3.029 2.776
Japanese 3.035 2.777 2.607
Norwegian 3.067 2.943 2.710
Polish 3.027 2.668 2.485
Portuguese 3.128 2.911 2.660
Russian 2.977 2.724 2.551
Spanish 3.060 3.096 2.775
Swedish 3.105 2.777 2.590
Average 3.050 2.784 2.572

The average difference between TWELP 1600 bps and CODEC2 3200 bps is 0.266 PESQ, while the average difference between TWELP 1600 bps and CODEC2 1600 bps is 0.478 PESQ.
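The averages and gaps quoted above can be re-derived from the table. The following is a minimal sketch that only redoes the arithmetic on the published PESQ values; the names `PESQ_SCORES` and `column_mean` are illustrative and not part of any DSPINI tool.

```python
# PESQ scores copied from the table above:
# (language, TWELP 1600 bps, CODEC2 3200 bps, CODEC2 1600 bps)
PESQ_SCORES = [
    ("American", 3.086, 2.758, 2.550),
    ("Arabic", 3.040, 2.823, 2.581),
    ("British", 2.946, 2.497, 2.351),
    ("Chinese", 3.028, 2.912, 2.658),
    ("Danish", 3.094, 2.794, 2.575),
    ("Dutch", 2.905, 2.506, 2.333),
    ("Finnish", 2.886, 2.418, 2.244),
    ("French", 3.110, 2.849, 2.649),
    ("German", 3.070, 2.680, 2.529),
    ("Greek", 3.070, 2.768, 2.527),
    ("Hindi", 3.063, 2.756, 2.538),
    ("Hungarian", 3.083, 2.993, 2.747),
    ("Italian", 3.220, 3.029, 2.776),
    ("Japanese", 3.035, 2.777, 2.607),
    ("Norwegian", 3.067, 2.943, 2.710),
    ("Polish", 3.027, 2.668, 2.485),
    ("Portuguese", 3.128, 2.911, 2.660),
    ("Russian", 2.977, 2.724, 2.551),
    ("Spanish", 3.060, 3.096, 2.775),
    ("Swedish", 3.105, 2.777, 2.590),
]


def column_mean(rows, column):
    """Return the arithmetic mean of one score column (index 1, 2 or 3)."""
    return sum(row[column] for row in rows) / len(rows)


twelp_1600 = column_mean(PESQ_SCORES, 1)
codec2_3200 = column_mean(PESQ_SCORES, 2)
codec2_1600 = column_mean(PESQ_SCORES, 3)

print(f"TWELP 1600 bps average:  {twelp_1600:.3f}")
print(f"CODEC2 3200 bps average: {codec2_3200:.3f}")
print(f"CODEC2 1600 bps average: {codec2_1600:.3f}")
print(f"Gap to CODEC2 3200 bps:  {twelp_1600 - codec2_3200:.3f}")
print(f"Gap to CODEC2 1600 bps:  {twelp_1600 - codec2_1600:.3f}")
```

The same helper works for the STOI and ESTOI tables if their rows are entered in the same layout.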

Speech Intelligibility. Here is a comparison with the CODEC2 vocoder, which operates at twice the bitrate (3200 bps) and at the same bitrate (1600 bps). The TWELP 1600 bps vocoder and the CODEC2 3200 bps and 1600 bps vocoders were tested using the ITU-T P.50 speech database, covering 20 different languages.
STOI (Short-Time Objective Intelligibility) and ESTOI (Extended Short-Time Objective Intelligibility) metrics were used to assess speech intelligibility: 
[Chart: Speech Intelligibility Comparison (STOI metric), TWELP 1600 bps vs CODEC2 3200 bps and 1600 bps. STOI score in % (axis range 78 to 91) per language for TWELP 1600 bps, CODEC2 3200 bps, and CODEC2 1600 bps.]
The diagram demonstrates a significant advantage of the TWELP 1600 bps vocoder over both the CODEC2 1600 bps and 3200 bps vocoders, despite the latter operating at double the bitrate. Exact values are provided in the table below:
Language  TWELP 1600 bps  CODEC2 3200 bps  CODEC2 1600 bps
American 90.61 87.89 84.28
Arabic 89.72 87.31 83.54
British 88.66 84.77 80.45
Chinese 89.66 88.90 85.31
Danish 90.81 87.96 83.65
Dutch 89.54 85.27 81.35
Finnish 86.83 84.70 78.75
French 90.12 88.75 85.10
German 90.20 87.04 82.90
Greek 90.06 88.01 83.02
Hindi 90.09 86.19 82.32
Hungarian 89.62 88.74 85.71
Italian 89.71 87.67 83.25
Japanese 89.94 88.03 84.33
Norwegian 90.82 89.90 86.09
Polish 90.32 87.82 83.63
Portuguese 89.61 89.07 85.28
Russian 89.08 87.01 83.37
Spanish 89.25 88.31 83.22
Swedish 88.55 87.66 83.10
Average 89.66 87.55 83.43

The average difference between the TWELP 1600 bps and the CODEC2 3200 bps vocoders is 2.11 percentage points, while the average difference between the TWELP 1600 bps and the CODEC2 1600 bps vocoders is 6.23 percentage points.

Considering that a low-bitrate vocoder is a nonlinear device that significantly distorts the spectrum of the original speech signal, the ESTOI metric provides more accurate assessments of speech intelligibility after vocoding:
[Chart: Speech Intelligibility Comparison (ESTOI metric), TWELP 1600 bps vs CODEC2 3200 bps and 1600 bps. ESTOI score in % (axis range 62 to 86) per language for TWELP 1600 bps, CODEC2 3200 bps, and CODEC2 1600 bps.]
The diagram demonstrates a significant advantage of the TWELP 1600 bps vocoder over both the CODEC2 1600 bps and 3200 bps vocoders, despite the latter operating at double the bitrate. Exact numbers are shown in the table below.
Language  TWELP 1600 bps  CODEC2 3200 bps  CODEC2 1600 bps
American 82.27 76.98 69.29
Arabic 84.23 76.53 69.49
British 82.30 73.62 64.38
Chinese 84.99 81.45 74.65
Danish 84.81 77.55 69.16
Dutch 83.36 74.75 66.38
Finnish 80.18 75.85 65.07
French 84.40 79.25 71.46
German 83.33 75.19 66.35
Greek 85.05 79.58 70.85
Hindi 83.08 72.58 63.79
Hungarian 83.15 77.81 70.62
Italian 83.73 78.09 70.01
Japanese 85.03 79.01 72.43
Norwegian 85.26 80.47 73.49
Polish 84.68 77.04 68.50
Portuguese 83.82 79.05 71.33
Russian 82.97 76.04 68.74
Spanish 83.70 80.41 71.28
Swedish 80.90 77.17 68.09
Average 83.66 77.42 69.27

The average difference between the TWELP 1600 bps and the CODEC2 3200 bps vocoders is 6.24 percentage points, while the average difference between the TWELP 1600 bps and the CODEC2 1600 bps vocoders is 14.39 percentage points.

You can download the P.862 and STOI/ESTOI utilities, along with all speech samples, by using the links in the 'Downloads' section at the bottom of the page, and then check all the numbers presented above.


Speech Samples (WAV-files). 
A few independent experts compared the TWELP 1600 bps vocoder with the CODEC2 3200 bps and 1600 bps vocoders using the preference method. All listeners preferred the TWELP vocoder over CODEC2 at both bitrates, noting that its speech was clearer, more natural, and less synthetic.

You can listen to short samples of the source speech, as well as the speech processed by the CODEC2 3200 bps and 1600 bps vocoders and the TWELP 1600 bps vocoder, using the links in the table below.

For the best listening experience, we recommend using high-quality headphones or premium audio equipment to hear the nuances and differences in the vocoder sound more clearly.

You can also download the complete set of P.50 samples as zip files for all languages simultaneously by using the links in the 'Downloads' section at the bottom of the page.

Language  Source speech  CODEC2 3200 bps  CODEC2 1600 bps  TWELP 1600 bps
American
Arabic
British
Chinese
Danish
Dutch
Finnish
French
German
Greek
Hindi
Hungarian
Italian
Japanese
Norwegian
Polish
Portuguese
Russian
Spanish
Swedish

Superiority In Quality Of Non-speech Signals. In contrast to other LBR vocoders (MELPe, AMBE+2, etc.), TWELP vocoders reproduce non-speech signals, such as police, ambulance, and fire sirens, with high quality. Combined with natural, human-sounding voice quality, this makes TWELP vocoders well suited to replacing analog radio with digital radio, and to other applications where high-quality transmission of non-speech signals matters as much as high-quality transmission of speech.

Source signal  CODEC2 3200 bps  CODEC2 1600 bps  TWELP 1600 bps

High Robustness To Acoustic Noise. In contrast to other LBR vocoders, TWELP vocoders are highly robust to acoustic noise due to a reliable pitch estimation method and other features of TWELP technology.
[Chart: Speech Intelligibility Comparison (ESTOI metric), TWELP 1600 bps vs CODEC2 3200 bps and 1600 bps in Acoustic Noise. ESTOI score in % (axis range 52 to 78) per language for TWELP 1600 bps, CODEC2 3200 bps, and CODEC2 1600 bps.]
The diagram demonstrates a significant advantage of the TWELP 1600 bps vocoder over both the CODEC2 1600 bps and 3200 bps vocoders, despite the latter operating at double the bitrate. Exact numbers are shown in the table below.
Language  TWELP 1600 bps  CODEC2 3200 bps  CODEC2 1600 bps
American 71.93 66.85 60.47
Arabic 73.60 69.67 63.37
British 69.80 64.23 56.25
Chinese 76.05 72.12 66.55
Danish 71.55 67.72 60.34
Dutch 69.24 63.56 56.34
Finnish 68.27 66.63 57.52
French 73.30 70.15 62.95
German 70.58 66.59 58.97
Greek 73.89 70.33 63.21
Hindi 69.62 62.65 54.79
Hungarian 73.91 69.80 62.71
Italian 70.99 67.33 60.84
Japanese 75.61 70.44 64.33
Norwegian 75.42 71.75 65.99
Polish 72.90 67.06 60.17
Portuguese 72.55 69.92 63.25
Russian 71.20 66.10 59.67
Spanish 74.20 71.54 64.17
Swedish 65.92 64.85 57.36
Average 72.03 67.96 60.96

The average difference between the TWELP 1600 bps and the CODEC2 3200 bps vocoders is 4.07 percentage points, while the average difference between the TWELP 1600 bps and the CODEC2 1600 bps vocoders is 11.07 percentage points.

Additionally, the TWELP vocoder features an NCSE (Noise Cancellation Speech Enhancement) preprocessor, which removes noise from the input speech signal and enhances speech quality.
Below, you can listen to a short fragment of heavily noisy English speech after passing through CODEC2 and TWELP vocoders, with NCSE disabled and enabled, respectively.
NCSE  Input speech (SNR = 10 dB)  CODEC2 3200 bps  CODEC2 1600 bps  TWELP 1600 bps
Disabled
Enabled    
The NCSE integrated into the TWELP vocoder is described in more detail on the webpage for our standalone product, 'NCSE-AGC Preprocessor'.
High Robustness To The Channel Errors. 
The TWELP technology offers highly efficient speech compression by eliminating redundancy while preserving excellent quality and intelligibility. To enhance robustness against transmission errors, we provide specialized versions called TWELP Robust.
These vocoders are based on an effective Joint Source-Channel Coding approach. Each vocoder is equipped with a custom-designed FEC, tailored to its specific characteristics and operational conditions. 
TWELP Robust vocoders provide high speech quality in both noisy and noiseless channels. The FEC can operate on either "soft decisions" or "hard decisions" from a modem; the "soft decisions" mode is considerably more robust than the "hard decisions" mode.
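The general difference between the two decision modes can be illustrated with a toy example. The sketch below is not DSPINI's FEC: it assumes a hypothetical 3x repetition code and a BPSK-style mapping (bit 0 sent as +1.0, bit 1 sent as -1.0), chosen only because they make the effect easy to see.

```python
def hard_decision_decode(received):
    """Majority vote over per-symbol hard decisions (only the sign is kept)."""
    ones = sum(1 for sample in received if sample < 0)  # negative sample -> bit 1
    return 1 if ones > len(received) / 2 else 0


def soft_decision_decode(received):
    """Sum the raw samples so that reliable symbols outweigh unreliable ones."""
    return 1 if sum(received) < 0 else 0


# Bit 1 was sent three times as -1.0. Noise pushed two copies just above
# zero, while the third copy arrived strongly negative.
received = [0.2, 0.3, -1.5]

print("hard decision:", hard_decision_decode(received))  # 0 (two weak votes win)
print("soft decision:", soft_decision_decode(received))  # 1 (the confident sample wins)
```

The hard-decision decoder discards how close each sample was to the decision threshold, so two barely-wrong samples outvote one confident sample. The soft-decision decoder keeps that reliability information.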
For all users of our non-robust vocoder versions, we offer the following recommendations.
 
The diagram below illustrates the sensitivity of bits at the output of the vocoder to communication channel errors.
Essentially, the diagram shows the percentage by which speech quality is reduced when a specific bit is corrupted. The first bits in order cause catastrophic distortions, while the later bits have significantly less impact on quality.
[Chart: Bit Sensitivity to Errors, TWELP 1600 bps vocoder. X axis: bit number (0 to 65); Y axis: sensitivity, % (0 to 100).]
We strongly recommend using FEC (Forward Error Correction) with unequal protection of the bits, in strict accordance with their sensitivity to errors, together with 'Soft Decisions' decoding. This provides the highest robustness of the vocoder against channel errors.
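One simple way to turn a per-bit sensitivity curve into an unequal protection plan is to split the bits into classes by threshold. The sketch below is only an illustration of that idea: the sensitivity values, the thresholds, and the class names are hypothetical and are not taken from the TWELP bit sensitivity diagram.

```python
def assign_protection_classes(sensitivities, strong_threshold=50.0, weak_threshold=10.0):
    """Map each bit's sensitivity (in percent) to a protection class.

    Bits at or above strong_threshold get the strongest code, bits at or
    above weak_threshold get a lighter code, and the rest are sent unprotected.
    """
    classes = []
    for sensitivity in sensitivities:
        if sensitivity >= strong_threshold:
            classes.append("strong")
        elif sensitivity >= weak_threshold:
            classes.append("medium")
        else:
            classes.append("none")
    return classes


# Hypothetical sensitivities for the first eight bits of a frame, in percent.
example_sensitivities = [95.0, 80.0, 55.0, 30.0, 12.0, 6.0, 2.0, 0.5]
classes = assign_protection_classes(example_sensitivities)

for bit_number, (sensitivity, protection) in enumerate(zip(example_sensitivities, classes)):
    print(f"bit {bit_number}: sensitivity {sensitivity:5.1f}% -> {protection}")
```

In a real design the thresholds would be chosen together with the code rates so that the protected frame fits the channel bit budget.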

Additional Functionalities. The following additional functionalities are developed by DSPINI and integrated into TWELP vocoders:

  • Noise Cancellation Speech Enhancement (NCSE),
  • Automatic Gain Control (AGC),
  • Voice Activity Detector (VAD),
  • Discontinuous Transmission (DTX),
  • Tone Detection/Generation (single tones and dual tones). The tones are transmitted using the vocoder's own facilities.

Note: 
The Tone Detector/Generator functionalities are not integrated into the code by default but can be added free of charge upon request.

Each functionality has unique features, performance and characteristics, providing significant superiority over any well-known implementations on the market.

Technical Characteristics And Resource Requirements:

Technical characteristics
Bit rate: 1600 bps
Algorithm: TWELP
Frame size: 40 ms
Algorithmic delay (including frame size): 60 ms
Sampling rate: 8 kHz
Signal format: linear 16-bit PCM
Bit stream format: 64
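The figures in the table above are mutually consistent, which a few lines of arithmetic can show. In this sketch, reading the bit stream format value of 64 as "64 bits per frame" is an assumption; it matches 1600 bps at a 40 ms frame size, but the source does not state the unit.

```python
BIT_RATE_BPS = 1600       # vocoder bit rate from the table
FRAME_MS = 40             # frame size from the table
SAMPLE_RATE_HZ = 8000     # 8 kHz sampling rate from the table
BITS_PER_SAMPLE = 16      # linear 16-bit PCM from the table

# Encoded bits produced for every frame (assumed meaning of the "64" entry).
bits_per_frame = BIT_RATE_BPS * FRAME_MS // 1000

# PCM input consumed for every frame.
samples_per_frame = SAMPLE_RATE_HZ * FRAME_MS // 1000
pcm_bytes_per_frame = samples_per_frame * BITS_PER_SAMPLE // 8

# Ratio between the raw PCM bit rate and the vocoder bit rate.
pcm_bit_rate_bps = SAMPLE_RATE_HZ * BITS_PER_SAMPLE
compression_ratio = pcm_bit_rate_bps / BIT_RATE_BPS

print(f"bits per encoded frame: {bits_per_frame}")
print(f"PCM samples per frame:  {samples_per_frame}")
print(f"PCM bytes per frame:    {pcm_bytes_per_frame}")
print(f"compression ratio:      {compression_ratio:.0f}:1")
```

Each 40 ms frame therefore turns 640 bytes of PCM input into 8 bytes of encoded data.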
Additional functionalities
AGC (Automatic Gain Control): control range 0 ... +42 dB
NCSE (Noise Canceller - Speech Enhancer): SNR increasing 20 dB; speech quality improvement > 0.1 PESQ
Tone Detector (Single/Dual tones detection): in accordance with international standards
Tone Generator (Single/Dual tones generation): special generator that keeps the signal continuous (phase and amplitude of the signal of the previous frame)
DTX (Discontinuous Transmission): reduces bit rate down to 110 bps in pauses between active speech regions
VAD (Voice Activity Detection): high reliability even with pink noise at an SNR < 0 dB
CNG (Comfort Noise Generation): type of noise "white"; Level - 60 dB
The NCSE and AGC integrated into the TWELP vocoder are described in more detail on the webpage for our standalone product, 'NCSE-AGC Preprocessor'.
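The effect of DTX on the long-term channel load can be estimated from the two rates listed above: 1600 bps during active speech and 110 bps in pauses. The sketch below is a simple weighted average; the voice activity factor of 0.4 is an illustrative assumption, not a DSPINI figure.

```python
ACTIVE_RATE_BPS = 1600   # vocoder bit rate during active speech
PAUSE_RATE_BPS = 110     # DTX bit rate in pauses, from the table above


def average_bit_rate(voice_activity):
    """Return the mean bit rate for a given fraction of time with active speech."""
    if not 0.0 <= voice_activity <= 1.0:
        raise ValueError("voice_activity must be between 0 and 1")
    return voice_activity * ACTIVE_RATE_BPS + (1.0 - voice_activity) * PAUSE_RATE_BPS


# Hypothetical conversation in which one side talks 40% of the time.
print(f"average bit rate: {average_bit_rate(0.4):.0f} bps")
```

With these assumed numbers, the average rate is well under half of the nominal 1600 bps.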

Resources for ARM Cortex-M4 platform
Module  MIPS* peak  Program memory (KBytes)  Data memory (KBytes): Constants  Channel  Heap  Stack
Voice Encoder 105 36 185 4.6 4.0 1.0
NCSE 6.2
AGC 0.2
Voice Decoder 10.3
Voice Encoder + Voice Decoder 115.3
Total 121.7

Resources for TI's C64 DSP platform
Module  MIPS* peak  Program memory (KBytes)  Data memory (KBytes): Constants  Channel  Heap  Stack
Voice Encoder 38.3 68 185 4.6 4.0 1.0
NCSE 2.8
AGC 0.2
Voice Decoder 4.4
Voice Encoder + Voice Decoder 42.7
Total 45.7

Resources (estimated) for TI's C55 DSP platform
Module  MIPS* peak  Program memory (KBytes)  Data memory (KBytes): Constants  Channel  Heap  Stack
Voice Encoder 64.0 22 185 4.6 4.0 1.0
NCSE 6.5
AGC 0.2
Voice Decoder 10.0
Voice Encoder + Voice Decoder 74.0
Total 80.7

* DSPINI continues to optimize the TWELP algorithm and code in order to minimize the computational complexity of the vocoder.

Software Integrity and Security. DSPINI guarantees the ABSOLUTE integrity of its software, free from any undocumented features, undeclared capabilities, or hidden functions. Our customers can be assured that none of our software/code contains any secret features or functionalities concealed from the user. If necessary, we are ready to provide the source code of our software products for appropriate certification.
Moreover, our software is available in source code form; you simply need to purchase the appropriate license to use it.

Guarantee And Support.  DSPINI guarantees the quality of the product and the conformance of all its technical characteristics to the current specifications. Testing and other quality control methods are used to back this guarantee.

Any Platforms.  DSPINI can port this vocoder software to any other DSP, RISC, or general-purpose platform in a short time: 1-2 months.

Licensing Terms.  To use the vocoder, the customer needs to obtain a license from DSPINI only.

Customization.  The vocoder can be customized to specific requirements: a different bit rate, frame size, level of robustness to channel errors, etc. Please contact us for details.

Prospects.  DSPINI is continuously improving and developing a set of new vocoders, with bit rates ranging from 300 bps up to 9600 bps, based on TWELP technology.

Related Software.  This vocoder can be used effectively in a bundle with other DSPINI products:

  • Linear and acoustic echo cancellers,
  • Multichannel noise cancellers (including two-microphone adaptive array),
  • Wired or radio modems for any types of channels and bitrates,
  • Other products.

Downloads: