Detecting Changes in Time Series Data with ChangeFinder: A Python Implementation

Change Point Detection using ChangeFinder: A Python Implementation

Change point detection is a statistical technique for identifying the points at which the behavior of a time series changes significantly. In this blog post, we will explore how to implement change point detection using the ChangeFinder library in Python.

Introduction to ChangeFinder

ChangeFinder is an open-source Python library (installed as changefinder) for online change point detection. It processes a time series one observation at a time and returns a change score for each point. The underlying method fits a sequentially discounting autoregressive (SDAR) model to the data, smooths the resulting outlier scores, and then scores the smoothed series a second time, so that sustained changes stand out from isolated outliers.
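To give a feel for score-based detection, here is a deliberately simplified sketch. It is not the library's code: it has no autoregressive terms, no smoothing, and no second stage. It only scores each point by how surprising it is under a Gaussian whose mean and variance are updated with a discounting rate. The function name discounted_scores and the sample data are made up for illustration.

```python
import math

def discounted_scores(values, r=0.05):
    """Negative log-likelihood of each value under a Gaussian whose mean and
    variance are updated with discounting rate r (illustration only)."""
    mean, var = values[0], 1.0
    scores = []
    for x in values:
        # Score the point before the model has been updated with it
        scores.append(0.5 * math.log(2 * math.pi * var) + (x - mean) ** 2 / (2 * var))
        # Discount old information and blend in the new observation
        mean = (1 - r) * mean + r * x
        # Small floor keeps the variance strictly positive on constant data
        var = max((1 - r) * var + r * (x - mean) ** 2, 1e-6)
    return scores

# Hypothetical data: a level shift from 0.0 to 5.0 at index 50
series = [0.0] * 50 + [5.0] * 50
scores = discounted_scores(series)
peak = scores.index(max(scores))  # index of the most surprising point
```

In this toy series the largest score lands on the first point after the shift, which is the behavior a change score is meant to have.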

Assumptions

Before we dive into the implementation, let’s make some assumptions:

We have a time series dataset y that we want to analyze.
We assume that the data is normally distributed and has no missing values.
We are interested in detecting changes in the mean or variance of the data.

Installing ChangeFinder

To use ChangeFinder, you need to install it first. You can do this using pip:

pip install changefinder

Importing Libraries

Before we start implementing change point detection, let’s import the necessary libraries:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import changefinder

Loading Data

Let’s assume that we have a CSV file templeture_data_.csv containing our time series data. We can load it using pandas:

df_templeture = pd.read_csv('templeture_data_.csv')
y = pd.Series(
    df_templeture.templeture.values,
    index=pd.date_range(start='2019-11-11 22:00:00', periods=len(df_templeture), freq='min'),
)
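If the CSV file is not at hand, a synthetic series with a known shift can stand in while you try out the later steps. The sketch below is an assumption for experimentation only; the name y_demo, the temperature levels, and the noise scale are all invented.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed so the example is repeatable

# 300 readings around 20.0 followed by 300 readings around 25.0 (hypothetical values)
values = np.concatenate([
    rng.normal(20.0, 0.5, 300),
    rng.normal(25.0, 0.5, 300),
])

# Same one-minute datetime index style as the real data
y_demo = pd.Series(
    values,
    index=pd.date_range(start='2019-11-11 22:00:00', periods=len(values), freq='min'),
)
```

Because the position of the shift is known (index 300), it is easy to judge whether the scores computed later peak in the right place.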

Preprocessing Data

Before we apply change point detection, let’s preprocess our data by calculating the first-order difference:

# diff() leaves NaN in the first position; drop it so it is not fed to the detector
y_diff = y.diff().dropna()
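One detail of differencing is easy to miss: the first observation has no predecessor, so pandas puts NaN in the first position of the result. A small example on made-up readings shows this, along with dropna() as one way to remove it.

```python
import pandas as pd

toy = pd.Series([10.0, 12.0, 15.0, 15.0])  # hypothetical readings

toy_diff = toy.diff()
print(toy_diff.tolist())   # first entry is NaN, then 2.0, 3.0, 0.0

toy_clean = toy_diff.dropna()
print(toy_clean.tolist())  # [2.0, 3.0, 0.0]
```

A NaN passed into an online model tends to contaminate every later estimate, so it is worth removing before scoring.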

Applying Change Point Detection

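Now it’s time to apply change point detection using ChangeFinder. We will use the ChangeFinder class from the library and set three parameters: r, the discounting rate that controls how quickly older observations are forgotten; order, the order of the autoregressive model (1 here); and smooth, the size of the window used to smooth the scores: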

cf = changefinder.ChangeFinder(r=0.01, order=1, smooth=7)

We can then apply change point detection to our data using the update method:

result = np.empty(len(y_diff))
for i, d in enumerate(y_diff):
    result[i] = cf.update(d)
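The loop above produces one score per observation, but it does not say which points count as changes. One simple heuristic is to flag scores that sit well above the rest. The rule below (mean plus three standard deviations) and the helper name flag_changes are assumptions for illustration; they are not part of changefinder, and the cutoff usually needs tuning for real data.

```python
import numpy as np

def flag_changes(scores, n_sigma=3.0):
    """Return indices whose score exceeds mean + n_sigma * std (hypothetical rule)."""
    scores = np.asarray(scores, dtype=float)
    # nan-aware statistics so stray NaN scores do not break the threshold
    threshold = np.nanmean(scores) + n_sigma * np.nanstd(scores)
    return np.flatnonzero(scores > threshold)

# Made-up scores: flat at 0.1 with a single spike at index 12
demo_scores = np.full(30, 0.1)
demo_scores[12] = 9.0

print(flag_changes(demo_scores))  # [12]
```

With the scores from the loop, the same call would be flag_changes(result), and the returned positions can be mapped back to timestamps through y_diff.index.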

Visualizing Results

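Finally, let’s visualize our results by plotting the change scores together with the differenced data:

fig = plt.figure()
ax = fig.add_subplot(111)
# Plot the scores against the same datetime index as the observations,
# because the two axes created by twinx() share one x-axis
ax.plot(y_diff.index, result, label="score")
ax2 = ax.twinx()
ax2.plot(y_diff.index, y_diff.values, alpha=0.3, label="observation")
plt.show()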

Discussion

The resulting plot shows the change score against the left axis and the differenced observations against the right axis. A higher score indicates that a change in the data is more likely at that point.

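However, if the plot shows only a few data points, the cause is usually the input or the plotting call rather than the smooth=7 parameter. y.diff() leaves a NaN in the first position, and a NaN passed to update is likely to propagate through the model's running estimates, so that later scores also become NaN and are not drawn. Plotting the scores against a plain integer index while the observations use a datetime index also squeezes one of the curves, because both axes share the same x-axis. Removing the NaN before the loop and plotting both series against the same index avoids both problems.

If the scores still look unsatisfactory once the input is clean, we can adjust the parameters of the ChangeFinder class. For example, a larger order lets the autoregressive model use more past values when predicting each point:

cf = changefinder.ChangeFinder(r=0.01, order=12, smooth=7)

Whether this helps depends on the data, so it is worth comparing the scores for a few settings of r, order, and smooth before settling on one.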
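When tuning r, it helps to translate it into a time scale. Assuming the weight of an observation decays as (1 - r) ** k after k further steps, which is how a discounting rate is usually applied, the half-life of an observation can be computed directly. The helper name half_life is made up for this sketch.

```python
import math

def half_life(r):
    """Steps after which an observation's weight has halved,
    assuming weights decay as (1 - r) ** k."""
    return math.log(0.5) / math.log(1.0 - r)

for r in (0.01, 0.05, 0.1):
    print(f"r={r}: half-life of about {half_life(r):.1f} steps")
```

With one-minute data and r=0.01, the half-life is roughly 69 steps, so the model remembers about an hour of history. A larger r reacts faster but also produces noisier scores.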

Conclusion

In this blog post, we demonstrated how to implement change point detection using ChangeFinder in Python. We covered the assumptions, installed the library, loaded the data, preprocessed it by taking the first-order difference, computed change scores, visualized the results, and discussed common pitfalls and how to address them.

By following these steps and tuning the parameters of the ChangeFinder class for your own data, you should be able to detect changes in your time series.

Last modified on 2025-01-27