Note: This readme is a shortened version of the original
SoundTouch readme, providing documentation for the time stretching feature.
SoundTouch WWW page: www.surina.net/soundtouch
SoundTouch audio processing library v1.5.0
SoundTouch library Copyright (c) Olli Parviainen 2002-2009
3. About implementation & Usage tips
3.3. About algorithms
Time-stretching means changing the duration of the audio stream
without affecting its pitch. SoundTouch uses WSOLA-like
time-stretching routines that operate in the time domain. Compared
to sample rate transposing, time-stretching is a much heavier
operation and also requires a longer processing "window" of sound
samples, which increases the input/output latency of the algorithm.
Typical i/o latency for the SoundTouch time-stretch algorithm is
around 100 ms.
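To relate the latency figure above to buffer sizes, a duration in milliseconds can be converted to a sample count. The following is a minimal sketch; msToSamples is a hypothetical helper written for this readme, not a SoundTouch function, and the sample rates used with it are only examples.

```cpp
#include <cassert>

// Hypothetical helper (not part of SoundTouch): converts a duration in
// milliseconds to a whole number of samples per channel, rounding down.
int msToSamples(int ms, int sampleRate)
{
    assert(ms >= 0 && sampleRate > 0);
    // Widen before multiplying so large inputs do not overflow int.
    return static_cast<int>((static_cast<long long>(ms) * sampleRate) / 1000);
}
```

With this helper, a latency of about 100 ms at a 44100 Hz sample rate corresponds to 4410 samples per channel.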
3.4. Tuning the algorithm parameters
The time-stretch algorithm has a few parameters that can be tuned
to optimize sound quality for a particular application. The current
default parameters have been chosen by iterative if-then analysis
(read: "trial and error") to obtain the best subjective sound
quality in pop/rock music processing, but in applications that
process other kinds of sound the default parameter set may give
sub-optimal results.
The time-stretch algorithm default
parameter values are set by the following #defines in file "TDStretch.h":
#define DEFAULT_SEQUENCE_MS AUTOMATIC
#define DEFAULT_SEEKWINDOW_MS AUTOMATIC
#define DEFAULT_OVERLAP_MS 8
These parameters affect the time-stretch algorithm as follows:
- DEFAULT_SEQUENCE_MS: This is the default length of a single
processing sequence in milliseconds, which determines how the
original sound is chopped up in the time-stretch algorithm. Larger
values mean fewer sequences are used in processing. In principle a
larger value sounds better when slowing down the tempo but worse
when increasing it, and vice versa.
By default, this setting is calculated automatically according to
the tempo value.
- DEFAULT_SEEKWINDOW_MS: The default length in milliseconds of the
seeking window, which the algorithm uses to seek the best possible
overlapping location. This determines how wide a sample "window"
the algorithm can search to find an optimal mixing location when
the sound sequences are linked back together.
The bigger this window is, the higher the chance of finding a
better mixing position, but large values may cause a "drifting"
sound artifact because neighboring sequences can be chosen at more
uneven intervals. If there's a disturbing artifact that sounds as
if a constant frequency was drifting around, try reducing this
setting.
By default, this setting is calculated automatically according to
the tempo value.
- DEFAULT_OVERLAP_MS: Overlap length in milliseconds. When the
sound sequences are mixed back together to form a continuous sound
stream again, this parameter defines how much the ends of
consecutive sequences overlap each other.
This parameter shouldn't be very critical. If you reduce the
DEFAULT_SEQUENCE_MS setting by a large amount, you might wish to
try a smaller value for this one as well.
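The roles of the seek window and the overlap can be illustrated with a small sketch. It is not the SoundTouch implementation: findBestOverlapOffset and crossfade are hypothetical names, the correlation is a plain unnormalized dot product on mono float samples, and the cross-fade is assumed to be linear. The first function tries every offset inside the seek window and keeps the one where the incoming audio best matches the tail of the previous sequence. The second mixes the two overlapping pieces together.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch: returns the offset in [0, seekLen) at which `input`
// correlates best with `ref`, the tail of the previous sequence.
// A wider seek window (larger seekLen) means more candidates to test.
std::size_t findBestOverlapOffset(const std::vector<float>& input,
                                  const std::vector<float>& ref,
                                  std::size_t seekLen)
{
    assert(!ref.empty() && seekLen > 0);
    assert(input.size() >= seekLen + ref.size() - 1);
    std::size_t bestOffset = 0;
    double bestCorr = 0.0;
    for (std::size_t offset = 0; offset < seekLen; ++offset) {
        double corr = 0.0;  // unnormalized cross-correlation at this offset
        for (std::size_t i = 0; i < ref.size(); ++i)
            corr += static_cast<double>(input[offset + i]) * ref[i];
        if (offset == 0 || corr > bestCorr) {
            bestCorr = corr;
            bestOffset = offset;
        }
    }
    return bestOffset;
}

// Hypothetical sketch: linear cross-fade over the overlap region. The
// previous sequence fades out while the next one fades in.
std::vector<float> crossfade(const std::vector<float>& prevTail,
                             const std::vector<float>& nextHead)
{
    assert(prevTail.size() == nextHead.size());
    const std::size_t n = prevTail.size();
    std::vector<float> out(n);
    for (std::size_t i = 0; i < n; ++i) {
        const float w = static_cast<float>(i) / static_cast<float>(n);
        out[i] = prevTail[i] * (1.0f - w) + nextHead[i] * w;
    }
    return out;
}
```

In this sketch the cost of the search grows with both the seek window length and the overlap length, which matches the CPU burden notes for these two parameters.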
Note that these parameters can also be set at run time with the
functions "TDStretch::setParameters()" and "SoundTouch::setSetting()".
The table below summarizes how the
parameters can be adjusted for different applications:
Parameter name | Default value magnitude | Larger value affects... | Smaller value affects... | Effect on CPU burden |
SEQUENCE_MS | Default value is relatively large, chosen for slowing down music tempo | Larger value is usually better for slowing down tempo. Growing the value decelerates the "echoing" artifact when slowing down the tempo. | Smaller value might be better for speeding up tempo. Reducing the value accelerates the "echoing" artifact when slowing down the tempo. | Increasing the parameter value reduces computation burden |
SEEKWINDOW_MS | Default value is relatively large, chosen for slowing down music tempo | Larger value eases finding a good mixing position, but may cause a "drifting" artifact | Smaller values reduce the possibility of finding a good mixing position, but also reduce the "drifting" artifact. | Increasing the parameter value increases computation burden |
OVERLAP_MS | Default value is relatively large, chosen to suit the above parameters. | | If you reduce the "sequence ms" setting, you might wish to try a smaller value. | Increasing the parameter value increases computation burden |
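The guidance that longer sequences suit slower tempos and shorter ones suit faster tempos can be sketched as a clamped linear mapping from tempo to sequence length. This is only an illustration of the idea, not the automatic calculation SoundTouch actually performs: TempoRange and sequenceMsForTempo are hypothetical names, tempo is assumed to be a ratio where 1.0 means the original speed, and the end-point values in the usage note are made up for the example.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical tuning range: sequence length msAtLow applies at tempoLow
// and msAtHigh at tempoHigh, with tempo given as a ratio (1.0 = original).
struct TempoRange {
    double tempoLow;
    double tempoHigh;
    double msAtLow;
    double msAtHigh;
};

// Hypothetical sketch: picks a sequence length in milliseconds by linear
// interpolation between the end points, clamping tempo to the range.
double sequenceMsForTempo(double tempo, const TempoRange& r)
{
    assert(r.tempoHigh > r.tempoLow);
    const double t = std::max(r.tempoLow, std::min(tempo, r.tempoHigh));
    const double slope = (r.msAtHigh - r.msAtLow) / (r.tempoHigh - r.tempoLow);
    return r.msAtLow + slope * (t - r.tempoLow);
}
```

For instance, with the made-up range {0.5, 2.0, 120.0, 60.0}, a tempo of 0.5 selects a 120 ms sequence, a tempo of 2.0 selects 60 ms, and tempos outside the range are clamped to those end points.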
5. Change History
5.1. SoundTouch library Change History
1.5.0:
- Added normalization to correlation calculation and improved automatic seek/sequence parameter calculation to improve sound quality
- Bugfixes:
  - Fixed negative array indexing in quick seek algorithm
  - FIR autoalias filter running too far in processing buffer
  - Check against zero sample count in rate transposing
  - Fix for x86-64 support: Removed pop/push instructions from the cpu detection algorithm.
  - Check against empty buffers in FIFOSampleBuffer
  - Other minor fixes & code cleanup
- Fixes in compilation scripts for non-Intel platforms
- Added Dynamic-Link-Library (DLL) version of SoundTouch library build,
provided with Delphi/Pascal wrapper for calling the dll routines
- Added #define PREVENT_CLICK_AT_RATE_CROSSOVER that prevents a click artifact
when crossing the nominal pitch from either positive to negative side or vice
versa
1.4.1:
- Fixed a buffer overflow bug in BPM detect algorithm routines if processing
more than 2048 samples at one call
1.4.0:
- Improved sound quality by automatic calculation of time stretch algorithm
processing parameters according to tempo setting
- Moved BPM detection routines from SoundStretch application into SoundTouch
library
- Bugfixes: Usage of uninitialized variables, GNU build scripts, compiler errors
due to 'const' keyword mismatch.
- Source code cleanup
v1.3.1:
- Changed static class declaration to GCC 4.x compiler compatible syntax.
- Enabled MMX/SSE-optimized routines also for GCC compilers. Earlier
the MMX/SSE-optimized routines were written in compiler-specific inline
assembler, now these routines are migrated to use compiler intrinsic
syntax which allows compiling the same MMX/SSE-optimized source code with
both Visual C++ and GCC compilers.
- Set floating point as the default sample format and added switch to
the GNU configure script for selecting the other sample format.
v1.3.0:
- Fixed tempo routine output duration inaccuracy due to rounding
error
- Implemented separate processing routines for integer and
floating arithmetic to allow improvements to floating point routines
(earlier used algorithms mostly optimized for integer arithmetic also
for floating point samples)
- Fixed a bug that distorts sound if sample rate changes during the
sound stream
- Fixed a memory leak that appeared in MMX/SSE/3DNow! optimized
routines
- Reduced redundant code pieces in MMX/SSE/3DNow! optimized
routines vs. the standard C routines.
- MMX routine incompatibility with new gcc compiler versions
- Other miscellaneous bug fixes
v1.2.1:
- Added automake/autoconf scripts for GNU
platforms (courtesy of David Durham)
- Fixed SCALE overflow bug in rate transposer
routine.
- Fixed 64bit address space bugs.
- Created a 'soundtouch' namespace for
SAMPLETYPE definitions.
v1.2.0:
- Added support for 32bit floating point sample
data type with SSE/3DNow! optimizations for Win32 platform (SSE/3DNow! optimizations currently not supported in GCC environment)
- Replaced 'make-gcc' script for GNU environment
by master Makefile
- Added time-stretch routine configurability to
SoundTouch main class
- Bugfixes
v1.1.1:
- Moved SoundTouch under lesser GPL license (LGPL). This allows using SoundTouch library in programs that aren't
released under GPL license.
- Changed MMX routine organization so that MMX optimized routines are now implemented in classes that are derived from
the basic classes having the standard non-mmx routines.
- MMX routines to support gcc version 3.
- Replaced windows makefiles by script using the .dsw files
v1.01:
- "mmx_gcc.cpp": Added "using namespace std" and
removed "return 0" from a function with void return value to fix
compiler errors when compiling the library in Solaris environment.
- Moved file "FIFOSampleBuffer.h" to "include"
directory to allow accessing the FIFOSampleBuffer class from external
files.
v1.0:
6. Acknowledgements
Kudos to these people who have contributed to development or submitted
bugfixes since SoundTouch v1.3.1:
- Arthur A
- Richard Ash
- Stanislav Brabec
- Christian Budde
- Brian Cameron
- Jason Champion
- Patrick Colis
- Justin Frankel
- Jason Garland
- Takashi Iwai
- Paulo Pizarro
- RJ Ryan
- John Sheehy
Moral greetings to all other contributors and users also!
7. LICENSE
SoundTouch audio processing library
Copyright (c) Olli Parviainen
This library is free software; you can
redistribute it and/or modify it under the terms of the GNU
Lesser General Public License version 2.1 as published by the Free Software
Foundation.
This library is distributed in the hope
that it will be useful, but WITHOUT ANY WARRANTY; without even
the implied warranty of MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE. See the GNU Lesser General Public License for
more details.
You should have received a copy of the GNU
Lesser General Public License along with this library; if not,
write to the Free Software Foundation, Inc., 59 Temple Place,
Suite 330, Boston, MA 02111-1307 USA