Fundamental Frequency Extraction - OBSOLETE
There are many different algorithms for extracting the fundamental frequency of sounds. This algorithm is based on the spectrogram, and appears to perform well in the face of noisy signals and widely varying fundamental frequencies.
I shall describe the process of extracting a fundamental frequency spectrum (ffs) from a regular spectrogram spectrum (s). This is the process used to generate the "Pitch view" in Luscinia. The fundamental frequency extraction algorithm in the measurement section in closely related.
1) Assume that you have a spectrum generated by FFT, and transformed into a dB scale. Assume that there are y points in the spectrum, from 0 to maxf Hz
2) Create a sequence of y points (called logTransform) on a log, base 2 scale, that run from minf Hz to maxf Hz. These points correspond to the logarithmic frequency scale of the final spectrum
logMax=log2(maxf)
logMin=log2(minf)
step=(logMax-logMin)/y
Then:
logTransform[i]=2 ^ (logMin+(i*step))
where i is the ith position on the scale
3) Consider each frequency in logTransform in turn (the candidates), and assess how well it fits as a model of fundamental frequency. To do this, we have to cycle through each of the frequencies in the original spectrum:
i is the position of the candidate frequency in logTransform
j is the position of the frequency within spectrum
Then sum over all j
spectrum[j]*phase[i][j]
spectrum[j] is the intensity of the spectrum at position j
4) phase is a matrix of the phase difference between all possible pairs of frequencies in the spectrum and in logTransform. "Phase difference" is used loosely here - calculating phase difference using a typical trig function (cos) did not give very good results - the drop-off of the cos function is convex, while I find a more concave function works best:
More detail about how phase is calculated.
a) find the nearest integer multiple of logTransform[i] to j. In other words: if logTransform[i] IS a fundamental frequency, what would be the nearest harmonic to j?
b) measure the absolute difference between the nearest harmonic to j and j itself
In java, a and b together looks like:
phase[i][j]=Math.abs(i-transform[j]*Math.round(i/transform[j]));
c) normalize the result of b by the candidate frequency (logTransform[j]):
phase[i][j]/=transform[j]*(1-fundAdjust)+fundAdjust;
phase[i][j]*=2;
if (phase[i][j]>1){phase[i][j]=1;}
Note that here I have introduced a parameter, fundAdjust. This parameter affects the degree of normalization - if it is 1, then the normalization is removed, and if it is 0, normalization is complete. This parameter adjusts the rate of increase of the function as the phase difference increases.
d) take a logarithm of the result of c. I multiply the output of c by 100 first to generate positive values.
e) normalize all of the phase differences for this candidate frequency (i) so that they add up to 1.
5) Going back to the evaluation of the candidate frequency using the spectrum, when we calculate
spectrum[j]*phase[i][j]
what we are doing is multiplying the intensity of each part of the spectrum by the phase difference between where the frequency at each part of the spectrum is, and where we expect harmonics to actually be. in other words, if there is a lot of energy in the spectrum away from where harmonics are predicted, then this product will give us a large number. If we sum all of the spectrum in this way, and get a small value, that means that most of the energy in the spectrum was found close to where we predicted it should be