DSP Innovations - TWELP 1200/2400 bps Scalable Vocoder


Details: Written by Sergey; Published: 2024 June 12; Last Updated: 2024 June 17

WARNING!
This page is still under construction. The information on it is not up-to-date.

The vocoder provides the highest quality of digital voice communication and lossless interoperability between networks with different bandwidths through two-layer encoding: 700 bps (basic layer) + 900 bps (extension layer) = 1600 bps total bit rate.

It's ideal for Digital Voice in HF radio at 700 and 1600 bps, as well as in VHF/UHF radio, supporting both or just the 1600 bps bit rate. This ensures the highest voice quality in each network and maintains lossless interoperability between them. Transcoding is not needed, which saves significant computational resources in gateways and also means no degradation in speech quality, even when users interoperate at different bit rates.

The TWELP vocoder halves the bit rate compared with current standard solutions while maintaining high speech quality at an SNR that is 3-7 dB lower. It also makes it possible to double the number of channels in VHF/UHF radio without compromising voice communication quality.

It is intended for Digital Voice in HF and VHF/UHF radio, Digital Mobile Radio (DMR), and other markets.

TWELP Technology Features.
The vocoder is based on a new speech coding technology called "Tri-Wave Excited Linear Prediction" (TWELP), developed by DSPINI experts.

TWELP technology is a new class of vocoders that differs from other LPC-based vocoders in:

an advanced, reliable method of pitch estimation
pitch-synchronous analysis
an advanced tri-wave excitation model
new quantization schemes
pitch-synchronous synthesis

Thanks to these features, TWELP technology provides much better speech quality than well-known technologies such as AMBE+2, MELPe, and ACELP at the same bit rate, in the range from 300 bps to 9600 bps and beyond. Moreover, in contrast to other low-bit-rate (LBR) vocoders such as MELPe, TWELP provides much better quality for non-speech signals such as sirens and background music.

Superiority In Speech Quality.
Here is a comparative analysis of the TWELP 700/1600 scalable vocoder and the standard MELPe vocoder, which operates at 1200 and 2400 bps.
TWELP at 700 bps bit rate is compared with MELPe at 1200 bps, and TWELP at 1600 bps is compared with MELPe at 2400 bps.
The ITU-T P.50 speech database for 20 different languages was utilized, and the ITU-T P.862 utility was employed to estimate speech quality in PESQ terms.

The diagrams demonstrate that the TWELP vocoder reduces the bit rate by almost a factor of two compared with the MELPe vocoder (from 1200 to 700 bps and from 2400 to 1600 bps) while maintaining the same level of speech quality (subjectively even better; listen to the samples below).
Exact numbers are shown in the tables below.

Language	MELPe 1200 bps	TWELPs 700 bps
American	2.876	2.838
Arabic	2.809	2.747
British	2.826	2.783
Chinese	2.710	2.762
Danish	2.797	2.790
Dutch	2.646	2.634
Finnish	2.641	2.649
French	2.897	2.843
German	2.803	2.759
Greek	2.753	2.744
Hindi	2.875	2.812
Hungarian	2.831	2.811
Italian	2.989	2.956
Japanese	2.983	2.929
Norwegian	2.800	2.764
Polish	2.792	2.798
Portuguese	2.900	2.858
Russian	2.735	2.697
Spanish	2.851	2.797
Swedish	2.958	2.886
Average	2.824	2.793
The difference in quality is on average just 0.031 PESQ
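The averages and the 0.031 PESQ gap can be re-derived from the table itself. The following is a minimal sketch that uses only the scores listed above; the variable and function names are illustrative and are not part of any DSPINI tool.

```python
# PESQ scores copied from the table above:
# language -> (MELPe 1200 bps, TWELP 700 bps)
scores = {
    "American": (2.876, 2.838), "Arabic": (2.809, 2.747),
    "British": (2.826, 2.783), "Chinese": (2.710, 2.762),
    "Danish": (2.797, 2.790), "Dutch": (2.646, 2.634),
    "Finnish": (2.641, 2.649), "French": (2.897, 2.843),
    "German": (2.803, 2.759), "Greek": (2.753, 2.744),
    "Hindi": (2.875, 2.812), "Hungarian": (2.831, 2.811),
    "Italian": (2.989, 2.956), "Japanese": (2.983, 2.929),
    "Norwegian": (2.800, 2.764), "Polish": (2.792, 2.798),
    "Portuguese": (2.900, 2.858), "Russian": (2.735, 2.697),
    "Spanish": (2.851, 2.797), "Swedish": (2.958, 2.886),
}


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


melpe_avg = mean(m for m, _ in scores.values())
twelp_avg = mean(t for _, t in scores.values())
difference = melpe_avg - twelp_avg

# Languages where the 700 bps TWELP score is at least as high as MELPe 1200 bps.
twelp_ahead = [lang for lang, (m, t) in scores.items() if t >= m]

print(f"MELPe 1200 bps average: {melpe_avg:.3f}")
print(f"TWELP 700 bps average:  {twelp_avg:.3f}")
print(f"Average difference:     {difference:.3f} PESQ")
print("TWELP ahead for:", ", ".join(twelp_ahead))
```

The same calculation applies unchanged to the 2400/1600 bps table if its scores are substituted.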

Language	MELPe 2400 bps	TWELPs 1600 bps
American	3.077	3.121
Arabic	3.053	2.989
British	3.019	3.022
Chinese	2.970	3.007
Danish	3.022	3.035
Dutch	2.830	2.855
Finnish	2.791	2.888
French	3.106	3.103
German	2.998	3.005
Greek	3.004	3.011
Hindi	3.089	3.049
Hungarian	3.086	3.029
Italian	3.226	3.201
Japanese	3.188	3.148
Norwegian	3.032	3.000
Polish	3.029	3.020
Portuguese	3.146	3.125
Russian	2.952	2.933
Spanish	3.048	3.052
Swedish	3.147	3.105
Average	3.041	3.035
The difference in quality is on average just 0.006 PESQ

Speech Samples (WAV-files).
Several independent experts compared the TWELP 1200/2400 scalable vocoder with MELPe and AMR vocoders operating at twice the bit rate, using a preference method.
Although the objective PESQ estimates of speech quality are approximately the same, the majority of experts preferred TWELP over the standard vocoders, especially at the lower bit rate, noting a much more natural, human-sounding voice in the TWELP vocoder.
You can listen to short samples of the source speech as well as the speech processed by each pair of vocoders for any of the 20 languages in the table below.
Additionally, you can download a full set of P.50 samples as zip-files for all languages simultaneously in the Downloads section below.

Language	Source speech	MELPe 2400 bps	TWELPs 1200 bps
(Audio samples for each of the 20 languages listed in the tables above.)

Language	Source speech	AMR 4750 bps	TWELPs 2400 bps
(Audio samples for each of the 20 languages listed in the tables above.)

Superiority In Quality Of The Non-speech Signals.
In contrast to other LBR vocoders (MELPe, AMBE+2, etc.), TWELP vocoders reproduce non-speech signals, such as police, ambulance, and fire sirens, with high quality. Together with the natural, human-sounding voice quality, this makes TWELP vocoders well suited to replacing analog radio with digital radio, and to other applications where high-quality transmission of non-speech signals matters alongside high-quality transmission of speech.

Source type	Source signal	MELPe 2400	TWELPs 2400
Siren only
With voice

High Robustness To Acoustic Noise.
In contrast to other LBR vocoders, TWELP vocoders are highly robust to acoustic noise thanks to a reliable method of pitch estimation and other features of TWELP technology.

Moreover, the vocoder includes built-in Noise Cancellation for Speech Enhancement (NCSE) functionality that improves speech quality in a noisy acoustic environment.

NCSE Mode	Source signal	MELPe 2400	TWELPs 2400
Disabled
Enabled

High Robustness To The Channel Errors.
The diagram shows how sensitive each bit of the vocoder output is to channel errors.

Bit number	TWELP 700/1600 Scalable Vocoder
1	100.000
2	98.937
3	68.716
4	68.570
5	63.331
6	63.050
7	62.468
8	62.231
9	62.130
10	62.073
11	61.889
12	61.303
13	59.539
14	57.209
15	51.169
16	50.837
17	47.387
18	47.217
19	45.793
20	42.809
21	42.121
22	40.045
23	36.597
24	36.374
25	34.345
26	30.903
27	28.537
28	28.025
29	27.951
30	27.267
31	24.755
32	22.765
33	22.550
34	22.296
35	16.582
36	16.487
37	16.393
38	16.196
39	15.471
40	15.311
41	12.408
42	12.271
43	12.047
44	9.488
45	8.884
46	8.880
47	8.845
48	7.648
49	7.069
50	5.975
51	5.114
52	2.178
53	1.793
54	0.998
55	0.400
56	5.626
57	5.071
58	4.856
59	4.674
60	8.432
61	8.417
62	8.038
63	8.034
64	7.265
65	7.062
66	6.983
67	6.848
68	6.654
69	6.342
70	6.164
71	5.828
72	5.518
73	4.436
74	4.170
75	3.580
76	3.515
77	3.488
78	2.615
79	2.505
80	2.478
81	2.466
82	1.865
83	1.829
84	1.722
85	1.429
86	1.184
87	1.082
88	1.026
89	0.711
90	0.651
91	0.635
92	0.400
93	0.308
94	0.281
95	0.211
96	0.000

The first part (56 bits) represents the basic layer, while the second part (72 bits) represents the extension layer of the scalable vocoder.
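Because the basic layer occupies the first part of each frame, a gateway can in principle move a frame to the lower-rate network by dropping the extension bits instead of transcoding. The sketch below illustrates that idea only; it assumes the two layers are stored contiguously with the basic layer first, and `to_basic_layer` is a hypothetical helper, not a DSPINI API.

```python
BASIC_BITS = 56       # basic layer size, per the text above
EXTENSION_BITS = 72   # extension layer size, per the text above
FRAME_BITS = BASIC_BITS + EXTENSION_BITS


def to_basic_layer(frame_bits):
    """Return only the basic-layer bits of a full two-layer frame.

    Assumes the basic layer comes first and the extension layer follows it.
    """
    if len(frame_bits) != FRAME_BITS:
        raise ValueError(f"expected {FRAME_BITS} bits, got {len(frame_bits)}")
    return frame_bits[:BASIC_BITS]


# Dummy frame content, for illustration only.
full_frame = [i % 2 for i in range(FRAME_BITS)]
basic_frame = to_basic_layer(full_frame)
print(len(full_frame), "bits in,", len(basic_frame), "bits out")
```

No speech decoding or re-encoding is involved, which is why such a gateway would add no extra degradation in this model.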

We strongly recommend using FEC (Forward Error Correction) with unequal protection of the bits, in strict accordance with their sensitivity to errors, together with soft-decision decoding. This provides the highest robustness of the vocoder against channel errors.
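One way to plan unequal protection is to group bits into classes by their sensitivity and give the most sensitive class the strongest code. The following is a minimal sketch of the grouping step only; the thresholds and the number of classes are hypothetical choices for illustration, not DSPINI recommendations, and the actual FEC codes are out of scope.

```python
def assign_protection_classes(sensitivities, thresholds):
    """Map each bit to a protection class (0 = strongest protection).

    `thresholds` must be in descending order. A bit is placed in the first
    class whose threshold its sensitivity reaches; bits below every
    threshold fall into the last (least protected) class.
    """
    classes = []
    for value in sensitivities:
        assigned = len(thresholds)
        for index, threshold in enumerate(thresholds):
            if value >= threshold:
                assigned = index
                break
        classes.append(assigned)
    return classes


# A few values from the sensitivity table above (bits 1-3, 20, 30, 40, 94-96).
sample = [100.000, 98.937, 68.716, 42.809, 27.267, 15.311, 0.281, 0.211, 0.000]

# Hypothetical class boundaries, chosen only for this example.
thresholds = [60.0, 10.0]

classes = assign_protection_classes(sample, thresholds)
for value, cls in zip(sample, classes):
    print(f"sensitivity {value:7.3f} -> class {cls}")
```

In a real design the class sizes would also be constrained by the channel bit budget and the code rates available.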

Additional Functionalities.
The following additional functionalities are developed by DSPINI and integrated into TWELP vocoders:

Automatic Gain Control (AGC),
Noise Cancellation for Speech Enhancement (NCSE),
Voice Activity Detector (VAD),
Tone Detection/Generation (single tones and dual tones). The tones are transmitted by means of the vocoder itself.

Each functionality has unique features, performance and characteristics, providing a significant superiority over any well-known implementations on the market.

Technical Characteristics And Resource Requirements:

Technical characteristics
Bit Rate (bps)	Algorithm	Frame size (ms)	Algorithmic delay (including frame size) (ms)	Sampling rate (kHz)	Signal format	Bit stream format
1200/2400	TWELP	80	100	8	Linear 16-bit PCM	56/128
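The frame parameters in this table can be related to each other with simple arithmetic. The sketch below assumes that the "Bit stream format" values 56/128 are bits per frame for the basic layer and for both layers together; under that reading they correspond to the 700 and 1600 bps layer rates quoted at the top of the page. All names are illustrative.

```python
FRAME_MS = 80                # frame size from the table
ALGORITHMIC_DELAY_MS = 100   # includes the frame size, per the table
SAMPLE_RATE_HZ = 8000        # 8 kHz sampling
BITS_PER_FRAME = (56, 128)   # assumed meaning of "Bit stream format"


def bit_rate_bps(bits_per_frame, frame_ms):
    """Bit rate implied by a fixed number of bits sent once per frame."""
    return bits_per_frame * 1000 / frame_ms


samples_per_frame = SAMPLE_RATE_HZ * FRAME_MS // 1000
pcm_bytes_per_frame = samples_per_frame * 2          # linear 16-bit PCM
delay_beyond_frame_ms = ALGORITHMIC_DELAY_MS - FRAME_MS
rates = [bit_rate_bps(bits, FRAME_MS) for bits in BITS_PER_FRAME]

print("samples per frame:", samples_per_frame)
print("PCM bytes per frame:", pcm_bytes_per_frame)
print("delay beyond one frame (ms):", delay_beyond_frame_ms)
print("implied bit rates (bps):", rates)
```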

Additional functionalities
Name	Functionality	Characteristic	Value
AGC	Automatic Gain Control	Control range	0 ... +20 dB
NCSE	Noise Canceller - Speech Enhancer	SNR increase	> 6 dB
NCSE	Noise Canceller - Speech Enhancer	Speech quality improvement	> 0.1 PESQ
Tone Detector	Single/Dual tones detection	In accordance with the international standards
Tone Generator	Single/Dual tones generation	Special generator that preserves signal continuity (phase and amplitude of the signal from the previous frame)
VAD	Voice Activity Detection	Reliable detection of speech in background noise
CNG	Comfort Noise Generation	Type of noise	"white"
CNG	Comfort Noise Generation	Level	-60 dB
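To give a feel for the CNG figures, the sketch below generates white noise at -60 dB in the vocoder's signal format (linear 16-bit PCM). It assumes the level is measured as RMS relative to 16-bit full scale, which the table does not state, and it uses Gaussian samples as one possible kind of white noise. It is not DSPINI's generator.

```python
import math
import random

FULL_SCALE = 32768  # reference level for 16-bit PCM (assumed dB reference)


def comfort_noise(num_samples, level_db=-60.0, seed=None):
    """Generate white Gaussian noise as 16-bit PCM sample values."""
    rng = random.Random(seed)
    target_rms = FULL_SCALE * 10 ** (level_db / 20)
    samples = []
    for _ in range(num_samples):
        value = round(rng.gauss(0.0, target_rms))
        samples.append(max(-32768, min(32767, value)))  # clamp to int16 range
    return samples


def rms_db(samples):
    """RMS level of the samples in dB relative to full scale."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms / FULL_SCALE)


frame = comfort_noise(640, seed=1)  # one 80 ms frame at 8 kHz
print(f"measured level: {rms_db(frame):.1f} dB re full scale")
```

At this level the noise has an RMS amplitude of roughly 33 PCM units, so a short frame will measure close to, but not exactly at, the target.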

Resources for ARM Cortex-M4 platform
Module	MIPS* peak	Program memory (KBytes)	Data memory: constants (KBytes)	Data memory: channel (KBytes)	Data memory: heap (KBytes)	Data memory: stack (KBytes)
Encoder	145	52	968	5.0	4.0	1.0
NCSE	6
AGC	0.6
Decoder	19
Encoder + Decoder	164
Total	171

Resources for TI's C64 DSP platform
Module	MIPS* peak	Program memory (KBytes)	Data memory: constants (KBytes)	Data memory: channel (KBytes)	Data memory: heap (KBytes)	Data memory: stack (KBytes)
Encoder	52	98	968	5.0	4.0	1.0
NCSE	2.7
AGC	0.3
Decoder	5.5
Encoder + Decoder	57.5
Total	60.5

Resources (estimated) for TI's C55 DSP platform
Module	MIPS* peak	Program memory (KBytes)	Data memory: constants (KBytes)	Data memory: channel (KBytes)	Data memory: heap (KBytes)	Data memory: stack (KBytes)
Encoder	88	32	968	5.0	4.0	1.0
NCSE	7
AGC	0.4
Decoder	13
Encoder + Decoder	101
Total	109

* DSPINI continues to optimize the TWELP algorithm and its code in order to minimize the computational complexity of the vocoder.
For use cases where resource consumption is critical, we can reduce both MIPS and memory at the cost of a minor decrease in speech quality.

Vulnerability / Security.
DSPINI guarantees that the software is ABSOLUTELY free of any undocumented features, undeclared capabilities, etc. All our customers can be assured that none of our software or code contains any secret functions or features hidden from the user. We are ready to provide the source code of our software products for appropriate certification if needed.

Guarantee And Support.
DSPINI guarantees the quality of the product and the conformance of all its technical characteristics to the requirements of the current specifications. Testing and other quality control methods are used to support this guarantee.

Any Platforms.
DSPINI can port this vocoder software to any other DSP, RISC, or general-purpose platform in a short time: 1-2 months.

Licensing Terms.
To use the vocoder software, a customer should obtain a license from DSPINI only.

Customization.
The vocoder can be customized to specific requirements, such as a different bit rate, frame size, or level of robustness to channel errors. Please contact us for details.

Prospects.
DSPINI is continuously improving and developing a set of new vocoders, with bit rates ranging from 300 bps up to 9600 bps, based on TWELP technology.

Related Software.
This vocoder can be used effectively in a bundle with other DSPINI products:

Linear and acoustic echo cancellers,
Multichannel noise cancellers (including two-microphone adaptive array),
Wired or radio modems for any type of channel and bit rate,
Other products.

Downloads:

Datasheet (pdf)
ITU-T P.50 source speech samples (zip)
MELPe 2400 bps speech samples (zip)
TWELP-1200/2400 1200 bps speech samples (zip)
AMR 4750 bps speech samples (zip)
TWELP-1200/2400 2400 bps speech samples (zip)
PC-evaluation package (zip), on request
User's Guide document (pdf), on request
