Plotting Multiple DataFrames Using Pandas and Matplotlib in Python

Understanding Pandas DataFrames and Plotting Them

Introduction

This article covers pandas dataframes and how to plot them with matplotlib. In particular, we’ll plot one pandas dataframe on top of another while maintaining the original x-axis scale.

Installing Required Libraries

To start working with pandas and matplotlib, you need to install these libraries in your Python environment. You can do this by running the following command in your terminal:

pip install pandas matplotlib

Understanding Pandas DataFrames

A pandas dataframe is a two-dimensional data structure that can store and manipulate data in a tabular format. It’s similar to an Excel spreadsheet, but more powerful and flexible.

Creating a Pandas DataFrame

You can create a pandas dataframe by passing a dictionary to the pd.DataFrame() constructor:

import pandas as pd

# Create a dictionary with data
data = {
    'date': ['2020-04-13', '2020-04-14', '2020-04-15'],
    'change': [0.00000, -1.00230, -1.29039]
}

# Convert the dictionary to a pandas dataframe
df_change = pd.DataFrame(data)

print(df_change)

This will output:

         date   change
0  2020-04-13  0.00000
1  2020-04-14 -1.00230
2  2020-04-15 -1.29039
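
Note that the 'date' column above holds plain strings, not dates. A minimal sketch, assuming you want real datetime values before plotting, converts the column with pd.to_datetime and then checks the column types:

```python
import pandas as pd

df_change = pd.DataFrame({
    'date': ['2020-04-13', '2020-04-14', '2020-04-15'],
    'change': [0.00000, -1.00230, -1.29039]
})

# Convert the string dates to datetime values
df_change['date'] = pd.to_datetime(df_change['date'])

# Inspect the column types and the overall shape (rows, columns)
print(df_change.dtypes)
print(df_change.shape)
```

With datetime values in place, matplotlib can position the points on a true time axis instead of treating each date as a label.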

Understanding Matplotlib

Matplotlib is a popular Python library used for creating static, animated, and interactive visualizations in Python.

Creating a Line Plot with Matplotlib

To create a line plot using matplotlib, you can use the plot() function:

import matplotlib.pyplot as plt
import numpy as np

# Create the x values 1, 2 and 3
x = np.arange(1, 4)

# Create a corresponding y value array
y_change = df_change['change']

# Plot the line plot
plt.plot(x, y_change)

# Display the plot
plt.show()

This will output a simple line plot with x values ranging from 1 to 3.
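
Plotting against the positions 1 to 3 discards the dates. As a sketch, assuming the 'date' column is converted to datetime first, the same line can be drawn with the dates on the x-axis. The output file name is a hypothetical choice, and the Agg backend line is only there so the example also runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import matplotlib.pyplot as plt
import pandas as pd

df_change = pd.DataFrame({
    'date': ['2020-04-13', '2020-04-14', '2020-04-15'],
    'change': [0.00000, -1.00230, -1.29039]
})
df_change['date'] = pd.to_datetime(df_change['date'])

fig, ax = plt.subplots()

# Use the dates themselves as x values
line, = ax.plot(df_change['date'], df_change['change'], marker='o')

# Rotate the date labels so they do not overlap
fig.autofmt_xdate()

# Hypothetical file name; write the figure to disk
fig.savefig('change_by_date.png')
```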

Merging DataFrames for Plotting

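To plot one pandas dataframe on top of another, both need comparable x values. In this case, we’ll use the ‘date’ column as our common column. The code below converts it to datetime values and keeps only the rows of df_change whose dates fall within the range covered by hist:

# Convert the 'date' strings to datetime values so they can be
# compared with the date range and formatted later
df_change['date'] = pd.to_datetime(df_change['date'])

# Build the full range of dates covered by hist
all_dates = pd.date_range(start='2020-04-10', end='2020-04-16')

# Keep only the rows whose date falls within that range
df_change_desired = df_change.loc[df_change['date'].isin(all_dates)]

print(df_change_desired)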

This will output:

         date   change
0  2020-04-13  0.00000
1  2020-04-14 -1.00230
2  2020-04-15 -1.29039
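
The step above filters rows rather than joining tables. If you want both series side by side in one dataframe, a left merge on 'date' is one option. The sketch below is only an illustration: the hist dataframe is a hypothetical stand-in with made-up placeholder values, and the column suffixes are arbitrary choices.

```python
import pandas as pd

# Hypothetical stand-in for hist: one row per day, placeholder values only
hist = pd.DataFrame({
    'date': pd.date_range(start='2020-04-10', end='2020-04-16'),
    'change': [0.5, 0.2, -0.3, 0.1, -0.4, 0.6, 0.3],
})

df_change = pd.DataFrame({
    'date': ['2020-04-13', '2020-04-14', '2020-04-15'],
    'change': [0.00000, -1.00230, -1.29039]
})
df_change['date'] = pd.to_datetime(df_change['date'])

# A left merge keeps every hist date; dates missing from df_change get NaN
merged = hist.merge(df_change, on='date', how='left',
                    suffixes=('_hist', '_new'))

print(merged)
```

Because the merged frame keeps all seven hist dates, plotting 'change_new' against 'date' leaves gaps where df_change has no value, which preserves the original x-axis scale.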

Plotting Both DataFrames

To plot both dataframes on the same axes, create a figure and an axes object with plt.subplots(), then call ax.plot() once per dataframe:

import matplotlib.pyplot as plt

# hist is assumed to be an existing dataframe (it is not defined in this
# article) with a datetime 'date' column and a 'change' column
fig, ax = plt.subplots()

# Plot the hist dataframe first so that it sets the x-axis scale
ax.plot(hist['date'], hist['change'])

# Make sure the dates to overlay are datetime values, not strings
dates = pd.to_datetime(df_change_desired['date'])

# Set the x-axis ticks to only show desired dates
ax.set_xticks(dates)
ax.set_xticklabels([d.strftime('%Y-%m-%d') for d in dates])

# Plot the df_change dataframe on top of the hist dataframe
ax.plot(dates, df_change_desired['change'], color='red')

# Set the title and labels
ax.set_title('Plotting Two Dataframes')
ax.set_xlabel('Date')
ax.set_ylabel('% Change')

# Display the plot
plt.show()

This will output a line plot with both dataframes plotted on top of each other.
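
Since hist is never defined in this article, here is a self-contained sketch of the whole overlay. The hist values are hypothetical placeholders, not real data, and the legend labels and output file name are likewise assumptions made for illustration:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for hist: one row per day, placeholder values only
hist = pd.DataFrame({
    'date': pd.date_range(start='2020-04-10', end='2020-04-16'),
    'change': [0.5, 0.2, -0.3, 0.1, -0.4, 0.6, 0.3],
})

df_change = pd.DataFrame({
    'date': ['2020-04-13', '2020-04-14', '2020-04-15'],
    'change': [0.00000, -1.00230, -1.29039]
})
df_change['date'] = pd.to_datetime(df_change['date'])

fig, ax = plt.subplots()

# Draw hist first, then df_change on the same axes
ax.plot(hist['date'], hist['change'], label='hist')
ax.plot(df_change['date'], df_change['change'], color='red', label='df_change')

# Put ticks only on the dates present in df_change
ax.set_xticks(df_change['date'].tolist())
ax.set_xticklabels(df_change['date'].dt.strftime('%Y-%m-%d').tolist())

ax.set_xlabel('Date')
ax.set_ylabel('% Change')
ax.legend()

# Hypothetical file name; write the figure to disk
fig.savefig('two_dataframes.png')
```

Both lines share one datetime x-axis, so the shorter df_change series sits at its true position inside the longer hist range.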

Conclusion

In this article, we explored how to plot one pandas dataframe on top of another while maintaining the original x-axis scale. We built the full date range, kept the rows of df_change whose ‘date’ values fall within it, and then plotted both dataframes on the same axes with matplotlib, setting the x-axis ticks to show only the desired dates.


Last modified on 2025-03-31