PICOLA and TDHS


by Mikio IKEDA
Since 2003
Last update, 30 Nov 2006

I have never seen a description of PICOLA on the WWW. I wonder if this page is the ONLY description of PICOLA in the whole world :-)

Any feedback (including on my English) is welcome.

News: The archive has been updated (30 Nov 2006).

The archive is here. (30 Nov 2006: bug fix) This software is free!

1. What is PICOLA?

PICOLA is a method for time scale modification of speech (or other audio signals). The name is an abbreviation of Pointer Interval Controlled OverLap and Add, which also describes the behavior of the algorithm. The algorithm was developed in 1985 at Nagoya University by Naotaka Morita, who has been working at NTT.

The PICOLA algorithm is similar to that of TDHS (Time Domain Harmonic Scaling). Both PICOLA and TDHS modify the time scale without modifying the frequency scale (or modify the frequency scale without modifying the time scale). At certain compansion rates, the two algorithms can be identical. The difference is that TDHS is a time domain interpretation of frequency scaling and requires long-term buffering for higher compansion rates, whereas PICOLA is a local time scale modification and requires only short-term buffering. PICOLA and TDHS were developed independently; that is, PICOLA was developed without any knowledge of TDHS. In addition, PICOLA enables more flexible and lower-distortion time scaling than TDHS. There is no significant quality difference between TDHS and PICOLA when the compansion rate is 1/3, 1/2, 2/3, 3/2, 2 or 3. Both TDHS and PICOLA can compand audio signals at any compansion rate n/m, but TDHS degrades audio signals significantly for larger n and m.

2. The algorithm of PICOLA

PICOLA uses the periodicity of speech signals (or other audio signals). In other words, it is this periodicity that makes time scale modification by PICOLA possible. In the following sections, Tp denotes the pitch period.

2.1 Time Scale Expansion of the PICOLA

The PICOLA algorithm is depicted in the following figures (Figures 1 and 2). These figures show everything that PICOLA does.

In the expansion case, PICOLA inserts a wavelet of length Tp at every interval of length L. Therefore, a wavelet of length L is expanded into one of length Tp+L. To avoid discontinuity, the inserted Tp-length wavelet is windowed and overlapped (see Figure 1). The expansion rate is r = (Tp+L)/L. Equivalently, L is determined from the pitch period Tp and the expansion rate r (> 1) by L = Tp/(r-1).

The expansion procedure is therefore as follows:

  1. Assume the current processing point is A (see Figure 1).
  2. Extract the pitch period Tp (using the correlation method, the AMDF method, etc.).
  3. Determine L = Tp/(r-1).
  4. Overlap and add the Tp-length waveform just before the processing point A and the one just after it to generate a Tp-length wavelet (see Figure 1).
  5. Copy the overlapped Tp-length wavelet into the output buffer.
  6. Copy an L-length wavelet from the input buffer into the output buffer.
  7. Move to the next processing point B and repeat the procedure.
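
As a rough illustration, the seven steps above could be sketched in Python as follows. This is only a sketch under stated assumptions, not the code in the archive: the function names `amdf_period` and `picola_expand`, the lag bounds `min_lag` and `max_lag`, the linear cross-fade window, and the choice to pass the first `max_lag` samples through unchanged (so that a waveform before point A always exists) are all hypothetical.

```python
import numpy as np


def amdf_period(x, pos, min_lag, max_lag):
    """Estimate the pitch period Tp (in samples) at pos with a plain AMDF search."""
    frame = x[pos:pos + max_lag]
    best_lag, best_score = min_lag, np.inf
    for lag in range(min_lag, max_lag + 1):
        shifted = x[pos + lag:pos + lag + max_lag]
        score = np.mean(np.abs(frame - shifted))  # average magnitude difference
        if score < best_score:
            best_lag, best_score = lag, score
    return best_lag


def picola_expand(x, rate, min_lag, max_lag):
    """Stretch x to roughly rate (> 1) times its length."""
    if rate <= 1.0:
        raise ValueError("rate must be greater than 1 for expansion")
    x = np.asarray(x, dtype=float)
    out = [x[:max_lag]]      # assumption: pass the head through unchanged
    pos = max_lag            # step 1: processing point A
    while pos + 2 * max_lag <= len(x):
        tp = amdf_period(x, pos, min_lag, max_lag)    # step 2
        length = max(1, round(tp / (rate - 1.0)))     # step 3: L = Tp/(r-1)
        former = x[pos - tp:pos]                      # waveform before A
        latter = x[pos:pos + tp]                      # waveform after A
        fade_in = np.arange(tp) / tp                  # linear ramp from 0 towards 1
        # Step 4: start like the latter waveform, end like the former one,
        # so the inserted wavelet joins smoothly on both sides.
        inserted = latter * (1.0 - fade_in) + former * fade_in
        out.append(inserted)                          # step 5
        end = min(pos + length, len(x))
        out.append(x[pos:end])                        # step 6
        pos = end                                     # step 7: next point B
    out.append(x[pos:])      # copy whatever is left unchanged
    return np.concatenate(out)
```

For example, `picola_expand(x, 1.5, 40, 160)` would search for pitch periods between 40 and 160 samples. The AMDF search here is deliberately naive; it may pick a multiple of the true period if the lag range is wide enough to contain one.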


Figure 1: Time scale expansion procedure of the PICOLA

2.2 Time Scale Compression of the PICOLA

In the compression case, PICOLA removes a wavelet of length Tp from every interval of length Tp+L. Therefore, a wavelet of length Tp+L is compressed into one of length L. To avoid discontinuity, the Tp-length waveforms on either side of the removed part are windowed and overlapped (see Figure 2). The compression rate is r = L/(Tp+L). Equivalently, L is determined from the pitch period Tp and the compression rate r (< 1) by L = Tp r/(1-r).

The compression procedure is therefore as follows:

  1. Assume the current processing point is A (see Figure 2).
  2. Extract the pitch period Tp (using the correlation method, the AMDF method, etc.).
  3. Determine L = Tp r/(1-r).
  4. Shape (window) the last Tp-length waveform in the output buffer.
  5. Shape (window) the next Tp-length waveform in the input buffer.
  6. Overlap and add these waveforms in the output buffer.
  7. Copy an L-length wavelet from the input buffer into the output buffer.
  8. Move to the next processing point C and repeat the procedure.
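
The eight steps above might be sketched in Python like this. Again this is a hedged illustration rather than the archive's implementation: `picola_compress` is a hypothetical name, `period_at(x, pos)` is a hypothetical caller-supplied pitch estimator returning Tp in samples, the window is assumed to be a linear cross-fade, and the output is primed with the first Tp samples so that step 4 has a waveform to shape.

```python
import numpy as np


def picola_compress(x, rate, period_at):
    """Shorten x to roughly rate (0 < rate < 1) times its length."""
    if not 0.0 < rate < 1.0:
        raise ValueError("rate must satisfy 0 < rate < 1 for compression")
    x = np.asarray(x, dtype=float)
    tp = period_at(x, 0)
    if tp > len(x):
        return x.copy()              # too short to process
    out = np.empty(len(x))           # the output never exceeds the input length
    out[:tp] = x[:tp]                # assumption: prime the output buffer
    n_out = pos = tp                 # step 1: pos is processing point A
    while True:
        tp = period_at(x, pos)                              # step 2
        length = max(1, round(tp * rate / (1.0 - rate)))    # step 3: L = Tp r/(1-r)
        if tp > n_out or pos + tp + length > len(x):
            break
        fade_in = np.arange(tp) / tp                        # linear ramp from 0 towards 1
        tail = out[n_out - tp:n_out] * (1.0 - fade_in)      # step 4: fade out
        head = x[pos:pos + tp] * fade_in                    # step 5: fade in
        out[n_out - tp:n_out] = tail + head                 # step 6: overlap and add
        pos += tp                                           # these Tp samples are removed
        out[n_out:n_out + length] = x[pos:pos + length]     # step 7
        n_out += length
        pos += length                                       # step 8: next point C
    rest = x[pos:]                   # copy whatever is left unchanged
    out[n_out:n_out + len(rest)] = rest
    return out[:n_out + len(rest)]
```

In each pass the input pointer advances by Tp+L while the output grows by only L, which gives the rate r = L/(Tp+L) stated above.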


Figure 2: Time scale compression procedure of the PICOLA.

The name PICOLA reflects these operations: overlapping and adding, and controlling the interval over which buffer samples are copied.

3. Improvements of PICOLA algorithm

Some improvements may be required for realistic applications. The following problems are already resolved in the program.

Precise compansion rate compensation
The compansion rate cannot reach the ideal value exactly, because L is rounded to an integer. Compansion rate compensation is therefore required, especially for speech with a short pitch period (fundamental frequency of 200 Hz or above), such as female or children's voices. This problem is resolved by a well-known error feedback procedure.
Buffer size limitation
As the compansion rate r gets closer to 1, L becomes larger. For example, when Tp = 10 ms and r = 1.05, L becomes 200 ms! If all 200 ms of speech had to be stored in the buffer, that would result in a 200 ms delay, and such a large delay is not acceptable for many applications.
However, this estimate is not correct, because buffering is required only for a length of Tp (although the implementation is somewhat harder). Overlapping is needed only for a Tp-length signal, so only Tp of buffering is required. Note that this analysis applies to time scale modification. If you apply PICOLA to frequency scale modification, the delay becomes L or Tp+L at the original sampling frequency.
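
The page does not spell out the error feedback procedure mentioned under "Precise compansion rate compensation", so the following Python sketch is only one plausible reading of it: the fractional part lost when L is rounded is carried over and added to the next L. The function name `interval_lengths` and its interface are hypothetical.

```python
def interval_lengths(periods, rate):
    """Yield an integer L for each pitch period Tp, feeding the rounding error forward."""
    if rate <= 0.0 or rate == 1.0:
        raise ValueError("rate must be positive and different from 1")
    carry = 0.0                                  # rounding error not yet compensated
    for tp in periods:
        if rate > 1.0:
            ideal = tp / (rate - 1.0)            # expansion: L = Tp/(r-1)
        else:
            ideal = tp * rate / (1.0 - rate)     # compression: L = Tp r/(1-r)
        target = ideal + carry
        length = max(1, round(target))
        carry = target - length                  # error fed back into the next interval
        yield length
```

With Tp = 40 samples and r = 1.3, the ideal L is about 133.33. Plain rounding would always give 133, while the sketch alternates between 133 and 134 so that the accumulated L stays within one sample of the ideal total.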

4. The differences between PICOLA and TDHS

The main difference between PICOLA and TDHS is that TDHS requires large buffering and PICOLA does not. PICOLA can compand audio signals at any rate of the form L/(Tp+L) or (Tp+L)/L, and it modifies the audio signal over only a Tp length (once every L or Tp+L interval). TDHS can compand signals at a rate n/m for any integers n and m (assuming that n and m have no common divisor), but it modifies a signal of length n Tp when n is larger than m (or of length m Tp when m is larger than n). Therefore PICOLA requires only Tp of buffering, whereas TDHS requires n Tp or m Tp. Thus TDHS is applicable only for small n and m (3/1, 3/2, 2/1, 1/2, 2/3, 1/3).
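
To make the buffering comparison concrete, here is a small Python sketch that applies the statements in the paragraph above to a given rate. The function name `buffer_lengths` and the `max_denominator` parameter are hypothetical, and the rate is approximated as a fraction n/m in lowest terms purely for illustration.

```python
from fractions import Fraction


def buffer_lengths(rate, tp, max_denominator=100):
    """Return (PICOLA, TDHS) buffer lengths in the same unit as tp."""
    ratio = Fraction(rate).limit_denominator(max_denominator)  # rate as n/m, lowest terms
    n, m = ratio.numerator, ratio.denominator
    picola = tp                  # PICOLA: only Tp of buffering
    tdhs = max(n, m) * tp        # TDHS: n Tp if n > m, otherwise m Tp
    return picola, tdhs
```

For Tp = 10 ms and r = 1.05 (that is, 21/20), `buffer_lengths(1.05, 10)` gives 10 ms for PICOLA and 210 ms for TDHS, while for r = 1/2 it gives 10 ms and 20 ms.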

The following figures (Figures 3 and 4) show the compansion procedure of TDHS. For both expansion and compression, TDHS overlaps and adds signals of length n Tp and replaces the original signal of length m Tp with the resulting signal of length n Tp.

Consequently, the TDHS and PICOLA algorithms are identical when the compression rate is 1/2.


Figure 3: Time scale expansion procedure of TDHS (1:2)


Figure 4: Time scale compression procedure of TDHS (3:2)

