What is DTW?
Short Answer
- In time series analysis, dynamic time warping (DTW) is one of the algorithms for measuring similarity between two temporal sequences, which may vary in speed.
- DTW has been applied to temporal sequences of video, audio, and graphics data. Any data that can be turned into a linear sequence can be analyzed with DTW.
Long Story
Dynamic Time Warping
- The objective of time series comparison methods is to produce a distance metric between two input time series. The similarity of two time series is typically measured by the Euclidean distance:
\[d(x, y) = \sqrt{\sum_{t}(x_{t}-y_{t})^{2}}\]
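As a minimal sketch of the formula above, the snippet below computes the point-by-point Euclidean distance with NumPy. The function name `euclidean_distance` and the toy values are assumptions made for illustration; they do not come from the original data.

```python
import numpy as np

def euclidean_distance(x, y):
    """Euclidean distance between two equal-length series, compared point by point."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape:
        # The formula pairs x_t with y_t, so both series need the same length.
        raise ValueError("both series must have the same length")
    return float(np.sqrt(np.sum((x - y) ** 2)))

# Two toy series (made-up values): the same shape, shifted by one time step.
x = [0.0, 1.0, 2.0, 1.0, 0.0]
y = [0.0, 0.0, 1.0, 2.0, 1.0]
d = euclidean_distance(x, y)
```

Because each point is compared only with the point at the same time index, a small shift in time is penalized even when the two shapes are identical.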
- Dynamic time warping is a seminal time series comparison technique that has been used for speech and word recognition since the 1970s with sound waves as the source.
- This technique can be used not only for pattern matching but also for anomaly detection.
- Looking at the red and blue lines in the following graph, traditional point-by-point time series matching is extremely restrictive. Dynamic time warping, however, allows the two curves to be matched up evenly even though their timestamps are not in sync.
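To make the idea of warping concrete, here is a minimal sketch of the classic dynamic-programming form of DTW. It assumes one-dimensional series and uses the absolute difference as the local cost; the function name `dtw_distance` and the toy values are illustrative choices, not part of the original article.

```python
import numpy as np

def dtw_distance(x, y):
    """Classic O(len(x) * len(y)) DTW with absolute difference as the local cost."""
    n, m = len(x), len(y)
    # cost[i, j] is the minimal accumulated cost of aligning x[:i] with y[:j].
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(x[i - 1] - y[j - 1])
            cost[i, j] = local + min(
                cost[i - 1, j],      # y[j-1] is reused for another point of x
                cost[i, j - 1],      # x[i-1] is reused for another point of y
                cost[i - 1, j - 1],  # both series advance by one step
            )
    return float(cost[n, m])

# Toy series (made-up values): the same bump, shifted by one time step.
x = [0.0, 1.0, 2.0, 1.0, 0.0]
y = [0.0, 0.0, 1.0, 2.0, 1.0]

pointwise = sum(abs(a - b) for a, b in zip(x, y))  # rigid, index-by-index cost
warped = dtw_distance(x, y)                        # cost after optimal warping
```

With these values the rigid comparison accumulates a cost of 4.0, while DTW aligns the two bumps and leaves only the mismatch at the final point, giving 1.0.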
A Cryptocurrency Example
- We collected historical price time series for three cryptocurrencies (Bitcoin, Ethereum, Binance) over the period from 11/01/2020 to 11/30/2020. Since their prices differ hugely in scale, we normalized them in order to plot them in a single figure.
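The text does not say which normalization was used. As one plausible sketch, assuming min-max scaling to the range [0, 1], the step could look like this; the helper name `min_max_normalize` and the prices are hypothetical.

```python
import numpy as np

def min_max_normalize(prices):
    """Rescale a price series to [0, 1] (an assumed choice of normalization)."""
    prices = np.asarray(prices, dtype=float)
    lo, hi = prices.min(), prices.max()
    if hi == lo:
        # A constant series has no range; map it to zeros to avoid dividing by zero.
        return np.zeros_like(prices)
    return (prices - lo) / (hi - lo)

# Made-up prices on very different scales, for illustration only.
coin_a = [100.0, 150.0, 200.0]
coin_b = [1.0, 1.5, 2.0]
norm_a = min_max_normalize(coin_a)
norm_b = min_max_normalize(coin_b)
```

After scaling, both toy series become 0, 0.5, 1, so they can share one axis even though the raw prices differ by a factor of 100.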
- Although each cryptocurrency has its own story, all of them show similar tendencies. Intuitively, we have the impression that the time series of Bitcoin and Ethereum are the most similar. Binance, by contrast, drifts away from them a little, and it seems more dissimilar to Ethereum than to Bitcoin.
- To test our intuition, we first calculate the correlation coefficient matrix of these three time series of normalized prices.
| | Bitcoin | Ethereum | Binance |
|---|---|---|---|
| Bitcoin | 1.00 | 0.92 | 0.74 |
| Ethereum | 0.92 | 1.00 | 0.86 |
| Binance | 0.74 | 0.86 | 1.00 |
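A correlation matrix of this kind can be sketched with NumPy's `np.corrcoef`, which computes Pearson correlations. The three series below are hypothetical stand-ins, since the article's price data is not reproduced here, so the numbers will not match the table.

```python
import numpy as np

# Hypothetical normalized series; not the article's real price data.
a = np.array([0.0, 0.2, 0.5, 0.7, 1.0])
b = np.array([0.0, 0.3, 0.4, 0.8, 1.0])
c = np.array([1.0, 0.6, 0.5, 0.1, 0.0])

# np.corrcoef treats each row as one variable and returns the
# matrix of Pearson correlation coefficients between the rows.
corr = np.corrcoef(np.vstack([a, b, c]))
```

The result is symmetric with ones on the diagonal, the same layout as the table above.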
- Now, we use the function fastdtw from the fastdtw package to calculate the DTW distances between these time series.
- Python code for DTW:
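import numpy as np
from scipy.spatial.distance import euclidean
from fastdtw import fastdtw

# btc and bnb hold the normalized price series of Bitcoin and Binance.
# fastdtw returns a (distance, path) tuple; [0] keeps only the distance.
btc_bnb_dtw = fastdtw(btc, bnb, dist=euclidean)[0]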
| | Bitcoin | Ethereum | Binance |
|---|---|---|---|
| Bitcoin | 0.00 | 1.13 | 3.37 |
| Ethereum | 1.13 | 0.00 | 4.02 |
| Binance | 3.37 | 4.02 | 0.00 |
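A distance matrix like the one above can be assembled by running DTW on every pair of series. The sketch below is self-contained, so it swaps in a plain dynamic-programming DTW instead of the fastdtw package (which computes an approximation). The helper names `dtw_distance` and `pairwise_matrix` and the toy series are assumptions for illustration.

```python
import numpy as np

def dtw_distance(x, y):
    """Plain dynamic-programming DTW with absolute difference as the local cost."""
    n, m = len(x), len(y)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i, j] = abs(x[i - 1] - y[j - 1]) + min(
                cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])

def pairwise_matrix(series, dist_fn):
    """Symmetric matrix of dist_fn over every pair in a name -> series mapping."""
    names = list(series)
    mat = np.zeros((len(names), len(names)))
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            # The distance is symmetric, so fill both triangles at once.
            mat[i, j] = mat[j, i] = dist_fn(series[names[i]], series[names[j]])
    return names, mat

# Toy series (made-up values), keyed by hypothetical names.
series = {
    "a": [0.0, 1.0, 2.0, 1.0, 0.0],
    "b": [0.0, 0.0, 1.0, 2.0, 1.0],
    "c": [2.0, 1.0, 0.0, 1.0, 2.0],
}
names, dtw_matrix = pairwise_matrix(series, dtw_distance)
```

The diagonal stays at zero because no pair (i, i) is ever computed, which matches the zero self-distances in the table.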
- The inference we obtain from the correlation matrix is that Bitcoin and Ethereum are highly correlated (0.92), which coincides with our intuition. However, Binance and Ethereum have a higher correlation coefficient (0.86) than Binance and Bitcoin (0.74), which contradicts our impression that Binance is more dissimilar to Ethereum.
- In contrast, the DTW distance matrix shows that all three time series have a DTW distance of 0 to themselves. At the same time, both Ethereum and Binance have a smaller distance to Bitcoin (1.13 and 3.37) than to each other (4.02), which fits our intuition.