
Pitch Shifting

Phil Schatzmann edited this page Mar 29, 2024 · 6 revisions

Pitch shifting changes the pitch of an input signal without changing its length or speed. A higher pitch gives you a Mickey Mouse voice; a lower pitch gives you a Hulk voice.

In my first implementation I used the work of Stephan Bernsee and implemented a solution based on FFT. It was a tremendous failure: the microcontroller was just not fast enough to process the signal, so the sound was breaking up badly!

So I needed another approach: that's when I found YetAnotherElectronicsChannel's suggestion to use a ring buffer instead.

Though audio waves vary vastly, they tend to remain stable over a short period of time. So we just need to provide a ring buffer that we write to at constant speed and read back from with variable floating-point offsets (= variable speed). The only challenge with this approach is that we need some special logic for the moment when the read pointer overruns the write pointer (or vice versa).
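As a rough illustration of this idea (not the AudioTools code), the following minimal sketch writes one sample per call and advances a floating-point read position by the pitch factor. The class name `TinyVariableSpeedBuffer` and its methods are hypothetical. It assumes a non-negative step and deliberately contains no logic for the case where the read pointer meets the write pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical illustration only: not part of AudioTools.
class TinyVariableSpeedBuffer {
 public:
  explicit TinyVariableSpeedBuffer(size_t size) : data_(size, 0) {
    assert(size > 1);
  }

  // Write at constant speed: exactly one slot per incoming sample.
  void write(int16_t sample) {
    data_[write_pos_] = sample;
    write_pos_ = (write_pos_ + 1) % data_.size();
  }

  // Read at variable speed: return the sample at the current (truncated)
  // position, then advance by a fractional step (the pitch factor, >= 0).
  int16_t read(float step) {
    int16_t result = data_[static_cast<size_t>(read_pos_)];
    read_pos_ += step;
    while (read_pos_ >= static_cast<float>(data_.size())) {
      read_pos_ -= static_cast<float>(data_.size());  // wrap around
    }
    return result;
  }

 private:
  std::vector<int16_t> data_;
  size_t write_pos_ = 0;
  float read_pos_ = 0.0f;
};
```

With a step of 1.0 the output equals the input; a step above 1.0 skips samples (higher pitch), and a step below 1.0 repeats samples (lower pitch).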

Since a variable pitch ring buffer is key to this approach, I am providing different implementations that we can evaluate below:

Arduino Test Sketch

I just generate a sine wave and run it through the pitch conversion. Here is the Arduino sketch that prints out the resulting signal, which can then be analysed in the Serial Plotter:

#include "AudioTools.h"

float pitch_shift = 1.3;
int buffer_size = 1000;
uint16_t sample_rate=44100;
uint8_t channels = 1;                                      // The stream will have 1 channel
SineWaveGenerator<int16_t> sineWave(32000);                // subclass of SoundGenerator with max amplitude of 32000
GeneratedSoundStream<int16_t> sound(sineWave);             // Stream generated from sine wave
CsvStream<int16_t> out(Serial, 1);                         // Final output to Serial
// use one of VariableSpeedRingBufferSimple, VariableSpeedRingBuffer, VariableSpeedRingBuffer180
PitchShiftOutput<int16_t, VariableSpeedRingBuffer<int16_t>> pitchShift(out);
StreamCopy copier(pitchShift, sound);                       // copies sound through pitchShift to out

// Arduino Setup
void setup(void) {  
  // Open Serial 
  Serial.begin(115200);
  AudioLogger::instance().begin(Serial, AudioLogger::Warning);

  // Define CSV Output
  auto config = out.defaultConfig();
  config.sample_rate = sample_rate; 
  config.channels = channels;
  out.begin(config);

  // configure pitch shift
  auto pcfg = pitchShift.defaultConfig();
  pcfg.copyFrom(config);
  pcfg.pitch_shift = pitch_shift;
  pcfg.buffer_size = buffer_size;
  pitchShift.begin(pcfg);

  // Setup sine wave
  sineWave.begin(channels, sample_rate, N_B4);
  Serial.println("started...");
}

// Arduino loop - copy sound to out 
void loop() {
  copier.copy();
}

VariableSpeedRingBufferSimple

This implementation has no special overrun logic, so we get quite a lot of unwanted noise.
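To get a feeling for how often such an overrun happens, here is a back-of-the-envelope calculation. It assumes a simplified model in which one sample is written per step and the read pointer advances by `pitch_shift` samples per step, so the distance between the two pointers changes by `|pitch_shift - 1|` per step. The function name is hypothetical and not part of the library.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: number of written samples between two pointer
// collisions in a ring buffer with buffer_size slots (simplified model).
double samplesBetweenOverruns(double pitch_shift, int buffer_size) {
  double drift = std::fabs(pitch_shift - 1.0);  // pointer drift per written sample
  assert(drift > 0.0);                          // no drift means no overrun
  return buffer_size / drift;
}
```

With the values from the test sketch (`pitch_shift = 1.3`, `buffer_size = 1000`), this model gives roughly 3333 samples between collisions. At 44100 Hz that is about 76 ms, or around 13 discontinuities per second, which would explain why the noise is so audible.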

VariableSpeedRingBuffer180

This is the logic taken from YetAnotherElectronicsChannel. The quality is much better, but we still get some strange behaviour.
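The name suggests a second read pointer that is offset by half the buffer (180 degrees). One common way to hide the overrun in such a scheme is to crossfade between the two read positions, so that the one near the write pointer is silent at the moment it is overtaken. The following is only a sketch of that general idea under these assumptions. It is neither the code from the library nor from YetAnotherElectronicsChannel, and both function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: circular distance (in samples) between two positions
// that both lie in the range [0, size).
float circularDistance(float read_pos, float write_pos, float size) {
  float d = std::fabs(write_pos - read_pos);
  return d > size / 2.0f ? size - d : d;
}

// Hypothetical helper: mix two taps that sit half a buffer apart.
// The tap close to the write pointer is faded out, the far one is faded in.
float crossfadeTaps(float sample_a, float sample_b,
                    float read_pos_a, float write_pos, float size) {
  // 0 when tap A sits on the write pointer, 1 when it is half a buffer away
  float weight_a = circularDistance(read_pos_a, write_pos, size) / (size / 2.0f);
  return weight_a * sample_a + (1.0f - weight_a) * sample_b;
}
```

Because tap B is always half a buffer away from tap A, its weight is simply `1 - weight_a`, so the two weights always sum to 1.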

VariableSpeedRingBuffer

I tried to implement my own optimized logic, which interpolates values and tries to align the phase when the read pointer overtakes the write pointer.
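The interpolation part can be pictured with a small sketch: instead of truncating the fractional read position to the nearest slot, the two neighbouring samples are blended. This shows plain linear interpolation as an assumption about what "interpolates values" means here. The function name is hypothetical, and the phase alignment is not covered.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: linear interpolation at a fractional ring-buffer position.
float interpolatedRead(const std::vector<int16_t>& buffer, float read_pos) {
  assert(!buffer.empty() && read_pos >= 0.0f);
  size_t i0 = static_cast<size_t>(read_pos) % buffer.size();  // sample before
  size_t i1 = (i0 + 1) % buffer.size();                       // sample after (wraps)
  float frac = read_pos - std::floor(read_pos);               // 0.0 up to (not including) 1.0
  return (1.0f - frac) * buffer[i0] + frac * buffer[i1];
}
```

For a buffer holding 0, 100, 200, 300, a read position of 1.5 yields a value halfway between the samples at index 1 and index 2.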

I could not find any flaws in my implementation.
