Normalization for Dynamic Time Warping

The acoustic measurements selected for dynamic time warping (DTW) must be normalized, because variation measured in different units, such as Hz or ms, is not directly comparable. How the parameters are normalized appears to play a large role in the overall success of the DTW algorithm.

The simplest normalization would be to divide each parameter by its standard deviation over the entire dataset being compared. A problem emerged with this method, however: two long elements that differed by x% in length were scored as more dissimilar than two short elements that differed by the same proportion. Normalization by the standard deviation of each individual element had the opposite problem: an element that varies little in a certain parameter will be scored as identical to an element that varies greatly in that parameter, so long as the trajectory of that parameter is similar in the two elements (e.g. it goes from high to low).
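The first of these problems can be illustrated with a small Python sketch. It is a deliberate simplification: it looks at a single parameter (duration) and ignores the warping itself, and the duration values and the function name normalized_difference are hypothetical, chosen only for illustration.

```python
import statistics

# Hypothetical element durations in ms (illustrative values only).
durations = [10.0, 11.0, 100.0, 110.0]

# Overall normalization: a single standard deviation for the whole dataset.
overall_sd = statistics.pstdev(durations)

def normalized_difference(a, b, sd):
    """Absolute difference between two measurements after dividing both by sd."""
    return abs(a / sd - b / sd)

short_pair = normalized_difference(10.0, 11.0, overall_sd)    # differ by 10%
long_pair = normalized_difference(100.0, 110.0, overall_sd)   # also differ by 10%

# The long pair scores as far more dissimilar despite the equal proportion.
print(short_pair, long_pair)
```

Because both pairs are divided by the same overall standard deviation, the normalized difference scales with the absolute difference, so the long pair is penalized more heavily.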

The solution employed by Luscinia is slightly more computationally intensive: normalize by the joint standard deviation of each pair of elements being compared. In practice, this appears to give very good results: comparisons involving elements of widely varying lengths and frequencies produce UPGMA trees in which distances between obviously shared elements are similar and small, no matter how long the elements are.
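A plausible form for the joint standard deviation is sketched below. It assumes the population form (dividing by m + n rather than m + n - 1), taken about the joint mean of both elements; the choice of denominator and the notation x_{A,i}, x_{B,j} and \bar{x}_{AB} are assumptions introduced here for illustration.

```latex
\bar{x}_{AB} = \frac{1}{m+n}\left(\sum_{i=1}^{m} x_{A,i} + \sum_{j=1}^{n} x_{B,j}\right),
\qquad
s_{AB} = \sqrt{\frac{1}{m+n}\left(\sum_{i=1}^{m}\left(x_{A,i}-\bar{x}_{AB}\right)^{2}
       + \sum_{j=1}^{n}\left(x_{B,j}-\bar{x}_{AB}\right)^{2}\right)}
```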

where x is an acoustic measure for two elements A and B, which have lengths m and n respectively, and s_AB is the joint standard deviation for the two elements.

However, a problem with this method arises for a measure that does not vary within an element but does vary between elements. The normalized distance between two such elements no longer reflects how far apart they are in this parameter: all pairs of elements receive a similar distance score. I doubt this is a serious problem in practice, except when the "Gap after" measure is included (by definition, this measure does not vary within an element). The solution, which is not very elegant, is to also calculate the overall standard deviation, s (calculated in much the same way as the joint standard deviation, but including all elements in the data set rather than just two), and to allow normalization by a combination of the overall and joint standard deviations:
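One plausible form of this combination is a simple linear weighting. The direction of the weighting (q applied to the overall standard deviation rather than to the joint one) is an assumption; at q = 0.5 the two directions give the same result.

```latex
s_f = q\,s + (1 - q)\,s_{AB}
```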

Here s_f is the final standard deviation used to normalize the elements, and q is the "SD Ratio" parameter found in the "Comparison parameters" window. I recommend leaving q at 0.5 for most comparisons.
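The following Python sketch puts the pieces together for a measure that is constant within each element, such as a gap duration. It is a minimal illustration and not Luscinia's actual code: the function names, the population form of the standard deviation, the direction of the q weighting, and the data values are all assumptions made for the example.

```python
import math

def pooled_sd(*elements):
    """Population SD of all measurements in the given elements, pooled together."""
    values = [v for element in elements for v in element]
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))

def final_sd(a, b, dataset, q=0.5):
    """Blend the overall and joint SDs (assumed form: q weights the overall SD)."""
    s_overall = pooled_sd(*dataset)   # all elements in the data set
    s_joint = pooled_sd(a, b)         # only the pair being compared
    return q * s_overall + (1.0 - q) * s_joint

def normalized_gap(a, b, sd):
    """Difference between two within-element-constant measures, scaled by sd."""
    return abs(a[0] - b[0]) / sd

# Hypothetical gap durations in ms, constant within each element.
a = [20.0, 20.0, 20.0]
b = [60.0, 60.0, 60.0]
c = [25.0, 25.0, 25.0]
dataset = [a, b, c]

# Joint SD alone: a 40 ms difference and a 5 ms difference score the same.
joint_ab = normalized_gap(a, b, pooled_sd(a, b))
joint_ac = normalized_gap(a, c, pooled_sd(a, c))

# Blended SD with q = 0.5: the larger difference now scores as more dissimilar.
blend_ab = normalized_gap(a, b, final_sd(a, b, dataset))
blend_ac = normalized_gap(a, c, final_sd(a, c, dataset))

print(joint_ab, joint_ac)
print(blend_ab, blend_ac)
```

With equal-length elements, normalizing by the joint standard deviation alone gives both pairs the same score. Blending in the overall standard deviation restores the expected ordering, and it also avoids dividing by zero when two elements share exactly the same constant value.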