Plotting Peaks and Valleys in Time Series Data with Python and SciPy

Python is a popular language for data analysis due to its simplicity, flexibility, and extensive library support. Among these libraries, SciPy (Scientific Python) and pandas are particularly useful for scientific computing and data manipulation. In this article, we will explore how to find and plot peaks and valleys in a dataset using SciPy and pandas, with NumPy and Matplotlib in supporting roles.

Introduction

Peaks and valleys are common features in time series data. A peak is a local maximum: the values rise up to it and fall after it. A valley is a local minimum: the values fall down to it and rise after it. Locating these points helps identify patterns, turning points in trends, and anomalies in the data.
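The definition can be checked directly by comparing each point with its two immediate neighbors. The following is a minimal sketch of that idea, using hypothetical sample values; it is only an illustration of the definition, not the method used in the rest of this article.

```python
import numpy as np

# Hypothetical sample values, chosen only to illustrate the definition.
values = np.array([1.0, 3.0, 2.0, 0.5, 1.5, 4.0, 3.5])

# Only interior points are tested: the first and last samples have one neighbor.
interior = values[1:-1]
left = values[:-2]
right = values[2:]

# A peak is strictly higher than both neighbors; a valley is strictly lower.
# Adding 1 converts positions in `interior` back to positions in `values`.
peak_positions = np.where((interior > left) & (interior > right))[0] + 1
valley_positions = np.where((interior < left) & (interior < right))[0] + 1

print("peaks at positions:", peak_positions.tolist())
print("valleys at positions:", valley_positions.tolist())
```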

In this article, we will discuss how to plot peaks and valleys in a dataset using Python with SciPy and Pandas. We will cover the following topics:

How to load and manipulate time series data
How to find peaks and valleys in a dataset
How to plot peaks and valleys

Loading and Manipulating Time Series Data

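To work with time series data, we first need to load it into Python. The pandas library reads a CSV file into a DataFrame with read_csv.

import pandas as pd

# Load the data; index_col=False tells pandas not to use the first column as the index
df = pd.read_csv('data.csv', index_col=False)

Once the data is loaded, we can rename the columns to make them more meaningful.

# Rename the timestamp column
df = df.rename(columns={'Timestamp': 'Date'})

# Convert the dates so they plot on a proper time axis
# (assumes the column holds parseable timestamps)
df['Date'] = pd.to_datetime(df['Date'])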
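To try the loading step without a file on disk, the sketch below reads a small CSV from an in-memory string. The CSV content is hypothetical and simply mirrors the Timestamp and Data column names used in this article; parse_dates is used here as one way to get real datetime values.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for data.csv.
csv_text = """Timestamp,Data
2024-01-01,1.0
2024-01-02,3.0
2024-01-03,2.0
2024-01-04,0.5
"""

# parse_dates converts the Timestamp column to datetime64 while reading.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Timestamp"])
df = df.rename(columns={"Timestamp": "Date"})

# Sorting by date keeps row positions in time order for the later steps.
df = df.sort_values("Date").reset_index(drop=True)

print(df.dtypes)
```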

Finding Peaks and Valleys

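To find peaks and valleys in a dataset, we can use the argrelextrema function from scipy.signal. It compares each point with its order neighbors on either side and returns the indices of the points for which the comparison holds against all of them. Because np.less_equal and np.greater_equal are non-strict comparisons, every point of a flat run is reported as an extremum.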

import numpy as np
from scipy.signal import argrelextrema

# Find the indices of local minima (valleys)
min_indices = argrelextrema(df['Data'].values, np.less_equal, order=3)[0]

# Find the indices of local maxima (peaks)
max_indices = argrelextrema(df['Data'].values, np.greater_equal, order=3)[0]
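The choice of comparator and of order changes which points are reported. The sketch below, on a hypothetical series with a flat top, compares the strict np.greater with the non-strict np.greater_equal, and then widens the window from 1 to 3.

```python
import numpy as np
from scipy.signal import argrelextrema

# Hypothetical series with a flat top at positions 2 and 3
# and a small bump at position 6.
values = np.array([0.0, 1.0, 2.0, 2.0, 1.0, 0.0, 1.0, 0.0])

# Strict comparison: points on the flat top are not greater than each other.
strict = argrelextrema(values, np.greater, order=1)[0]

# Non-strict comparison: both points of the flat top are reported.
non_strict = argrelextrema(values, np.greater_equal, order=1)[0]

# A wider window drops the small bump, which is lower than the flat top.
wider = argrelextrema(values, np.greater_equal, order=3)[0]

print("strict, order=1:    ", strict.tolist())
print("non-strict, order=1:", non_strict.tolist())
print("non-strict, order=3:", wider.tolist())
```

If flat runs should count as a single peak, the reported indices need to be merged afterwards, or a strict comparator used when the data has no plateaus.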

Plotting Peaks and Valleys

To plot peaks and valleys, we can use the matplotlib library.

import matplotlib.pyplot as plt

# Plot the data
plt.plot(df['Date'], df['Data'])

# Mark the peaks; iloc is used because argrelextrema returns positions, not index labels
plt.scatter(df['Date'].iloc[max_indices], df['Data'].iloc[max_indices], color='red', marker='x')

# Mark the valleys
plt.scatter(df['Date'].iloc[min_indices], df['Data'].iloc[min_indices], color='green', marker='o')

# Display the figure
plt.show()
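For a version that runs end to end without a data file, the sketch below builds a hypothetical sine-wave series, finds its extrema, and writes the figure to disk. The synthetic data, the Agg backend, and the output filename peaks_valleys.png are assumptions made for illustration.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from scipy.signal import argrelextrema

# Hypothetical data: two cycles of a sine wave sampled daily.
dates = pd.date_range("2024-01-01", periods=40, freq="D")
df = pd.DataFrame({"Date": dates, "Data": np.sin(np.linspace(0, 4 * np.pi, 40))})

values = df["Data"].to_numpy()
max_indices = argrelextrema(values, np.greater, order=3)[0]
min_indices = argrelextrema(values, np.less, order=3)[0]

fig, ax = plt.subplots()
ax.plot(df["Date"], df["Data"], label="Data")

# One scatter call per marker type; iloc selects rows by position.
ax.scatter(df["Date"].iloc[max_indices], df["Data"].iloc[max_indices],
           color="red", marker="x", zorder=3, label="Peaks")
ax.scatter(df["Date"].iloc[min_indices], df["Data"].iloc[min_indices],
           color="green", marker="o", zorder=3, label="Valleys")

ax.legend()
fig.autofmt_xdate()  # tilt the date labels so they do not overlap
fig.savefig("peaks_valleys.png")
```

In an interactive session, the backend line can be dropped and plt.show() called instead of saving the figure.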

Renaming Columns

The examples so far assume that the values live in a column named 'Data', but real files rarely agree on column names. Renaming the column of interest to a fixed name lets the same analysis code run on every file. The pandas rename method does this.

# Rename the columns
df = df.rename(columns={'Data': 'Value'})
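One pitfall is worth a short illustration: by default, rename silently ignores keys that do not match any column, so a typo leaves the frame unchanged. The sketch below uses a hypothetical two-row frame and a deliberately misspelled key to show the default behavior and the errors="raise" option.

```python
import pandas as pd

# Hypothetical frame whose value column is called 'Data'.
df = pd.DataFrame({"Date": ["2024-01-01", "2024-01-02"], "Data": [1.0, 2.0]})

# 'Dta' is a deliberate typo: the unknown key is ignored and nothing changes.
unchanged = df.rename(columns={"Dta": "Value"})
print(list(unchanged.columns))

# errors="raise" turns the same typo into a KeyError.
try:
    df.rename(columns={"Dta": "Value"}, errors="raise")
    typo_detected = False
except KeyError:
    typo_detected = True

print("typo detected:", typo_detected)
```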

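Renaming also lets us wrap these steps in a function that works for any column: the column named by typ is renamed to a fixed name before the search.

def peaks_valleys(path, typ, acc):
    # Load the data and give the column of interest a fixed name
    df = pd.read_csv(path, index_col=False)
    df = df.rename(columns={typ: 'Typ'})

    # Assumption: acc sets how many neighbors on each side are compared (the order argument)
    # Find the indices of local minima (valleys)
    min_indices = argrelextrema(df['Typ'].values, np.less_equal, order=acc)[0]

    # Find the indices of local maxima (peaks)
    max_indices = argrelextrema(df['Typ'].values, np.greater_equal, order=acc)[0]

    # Return the results so the caller can plot them
    return df, min_indices, max_indices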
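A fixed comparison window can still flag small wiggles as extrema. As an alternative to argrelextrema (a different function, not what the code above uses), scipy.signal.find_peaks can keep only the peaks above a given prominence, and valleys can be found by negating the signal. The sample values and the prominence threshold of 1.0 below are assumptions chosen for illustration.

```python
import numpy as np
from scipy.signal import find_peaks

# Hypothetical series: one large peak and one large valley,
# each followed by a small wiggle of height 0.1.
values = np.array([0.0, 2.0, 1.8, 1.9, 0.0, -2.0, -1.8, -1.9, 0.0])

# Keep only peaks that stand out from their surroundings by at least 1.0.
peaks, _ = find_peaks(values, prominence=1.0)

# Valleys of the signal are peaks of the negated signal.
valleys, _ = find_peaks(-values, prominence=1.0)

print("peaks:", peaks.tolist())
print("valleys:", valleys.tolist())
```

The small wiggles are local extrema too, but their prominence is only 0.1, so the threshold filters them out.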

Plotting Peaks and Valleys with Multiple Columns

If we have multiple columns that need to be analyzed, we can use a loop to plot the peaks and valleys.

def plot_peaks_valleys(path):
    # Load the data
    df = pd.read_csv(path, index_col=False)

    # Handle each column in a single pass so its indices are not overwritten
    for typ in ['Data1', 'Data2']:
        values = df[typ].values

        # Find the indices of local minima (valleys) and local maxima (peaks)
        min_indices = argrelextrema(values, np.less_equal, order=3)[0]
        max_indices = argrelextrema(values, np.greater_equal, order=3)[0]

        # Plot the column, then mark its peaks and valleys
        plt.plot(df['Date'], df[typ], label=typ)
        plt.scatter(df['Date'].iloc[max_indices], df[typ].iloc[max_indices], color='red', marker='x')
        plt.scatter(df['Date'].iloc[min_indices], df[typ].iloc[min_indices], color='green', marker='o')

    plt.legend()
    plt.show()
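When the columns have different scales, it can be easier to read one subplot per column. The sketch below is one possible layout, using hypothetical Data1 and Data2 series generated in memory; the Agg backend and the output filename multi_column.png are likewise assumptions.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from scipy.signal import argrelextrema

# Hypothetical frame with two value columns sampled daily.
dates = pd.date_range("2024-01-01", periods=60, freq="D")
x = np.linspace(0, 6 * np.pi, 60)
df = pd.DataFrame({"Date": dates, "Data1": np.sin(x), "Data2": np.cos(x / 2)})

columns = ["Data1", "Data2"]

# Store the extrema per column so that no result overwrites another.
extrema = {}
for col in columns:
    values = df[col].to_numpy()
    extrema[col] = {
        "peaks": argrelextrema(values, np.greater, order=3)[0],
        "valleys": argrelextrema(values, np.less, order=3)[0],
    }

# One subplot per column, sharing the date axis.
fig, axes = plt.subplots(len(columns), 1, sharex=True)
for ax, col in zip(axes, columns):
    peaks = extrema[col]["peaks"]
    valleys = extrema[col]["valleys"]
    ax.plot(df["Date"], df[col])
    ax.scatter(df["Date"].iloc[peaks], df[col].iloc[peaks],
               color="red", marker="x", zorder=3)
    ax.scatter(df["Date"].iloc[valleys], df[col].iloc[valleys],
               color="green", marker="o", zorder=3)
    ax.set_ylabel(col)

fig.autofmt_xdate()
fig.savefig("multi_column.png")
```

With the strict comparators used here, the first and last samples are never reported, so the maximum of Data2 at the very start of the series is not marked.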

Conclusion

In this article, we discussed how to plot peaks and valleys in a dataset using Python with SciPy and Pandas. We covered the following topics:

Loading and manipulating time series data
Finding peaks and valleys
Plotting peaks and valleys
Renaming columns
Plotting peaks and valleys with multiple columns

By following this guide, you should be able to analyze your own time series data using Python with SciPy and Pandas.

Last modified on 2024-12-30