Creating Pie Charts for Each Column in a Pandas DataFrame: A Customizable Approach

In this article, we will explore how to create pie charts for each column in a Pandas DataFrame. This is particularly useful when you work with categorical data and want to see how the values in each column are distributed across the categories.

Introduction to Pandas and DataFrames

Pandas is a powerful library used for data manipulation and analysis in Python. A DataFrame is a two-dimensional table of data with rows and columns, similar to an Excel spreadsheet or a SQL table. Each column represents a variable, and each row represents an observation.

In this article, we will use the Pandas library to create a DataFrame from sample data and then count how often each category occurs in each column. We will also explore how to visualize these counts using pie charts.

Sample Data

To illustrate the concepts discussed in this article, let’s start with some sample data:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({'a': ['table', 'chair', 'chair', 'lamp', 'bed'],
                   'b': ['lamp', 'candle', 'chair', 'lamp', 'bed'],
                   'c': ['mirror', 'mirror', 'mirror', 'mirror', 'mirror']})

print(df)

This code creates a DataFrame df with three columns: a, b, and c. The values in each column represent different categories.
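As a quick check on the sample data, the following sketch counts the categories in a single column and the number of distinct categories per column. It uses only `Series.value_counts()` and `DataFrame.nunique()`; the variable name `counts_a` is chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'a': ['table', 'chair', 'chair', 'lamp', 'bed'],
                   'b': ['lamp', 'candle', 'chair', 'lamp', 'bed'],
                   'c': ['mirror', 'mirror', 'mirror', 'mirror', 'mirror']})

# Count how often each category occurs in column 'a'
counts_a = df['a'].value_counts()
print(counts_a)

# Number of distinct categories in each column
print(df.nunique())
```

Column c contains a single category, so its pie chart will later consist of one full wedge.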

Tabulating and Visualizing the Data

To tabulate the data, we count the occurrences of each category in every column. A first attempt at plotting the result is a bar chart:

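import matplotlib.pyplot as plt

# Tabulate the data: count each category per column
# (the top-level pd.value_counts is deprecated in recent pandas versions,
# so Series.value_counts is applied to each column instead)
df2 = df.apply(pd.Series.value_counts).fillna(0)

# Create a bar plot of the tabulated data
df2.plot.bar()

# Display the plot
plt.show()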

However, this approach produces a single bar chart with all columns combined, which does not show the distribution within each column on its own. This is not ideal for our purpose.
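Before moving on to pie charts, it can help to look at the tabulated frame itself. The sketch below rebuilds it with `pd.Series.value_counts`, which newer pandas versions prefer over the top-level `pd.value_counts`, and casts the counts to integers for readability; the `astype(int)` step is an addition made here, not part of the original snippet.

```python
import pandas as pd

df = pd.DataFrame({'a': ['table', 'chair', 'chair', 'lamp', 'bed'],
                   'b': ['lamp', 'candle', 'chair', 'lamp', 'bed'],
                   'c': ['mirror', 'mirror', 'mirror', 'mirror', 'mirror']})

# One row per distinct category, one column per original column.
# Categories that never occur in a column come back as NaN, hence fillna(0).
df2 = df.apply(pd.Series.value_counts).fillna(0).astype(int)
print(df2)
```

The result has six rows, one for each distinct category found anywhere in the DataFrame, and each column sums to the number of rows in df.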

Creating Pie Charts for Each Column

To create pie charts for each column, we can use the following code:

# Plotting pie charts for each column
df2.plot(kind='pie', subplots=True,
         autopct='%1.1f%%', startangle=270, fontsize=17,
         layout=(2,2), figsize=(10,10))

This code lays the charts out on a 2x2 grid and draws one pie chart per column, so with three columns one grid position stays unused. Each pie chart shows how the values in that column are distributed across the categories.

However, this approach still has some limitations. For example, the subplots have no explicit titles and the legends are left at their defaults, which can make the plots difficult to interpret.
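Some of these limitations can be eased without leaving pandas. The sketch below assumes that passing a list to `title` together with `subplots=True` is acceptable for your pandas version (it assigns one title per subplot), turns the legends off with `legend=False`, and clears the y-axis label that pandas otherwise places beside each pie. The "Column ..." title text is made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display; remove for interactive use
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': ['table', 'chair', 'chair', 'lamp', 'bed'],
                   'b': ['lamp', 'candle', 'chair', 'lamp', 'bed'],
                   'c': ['mirror', 'mirror', 'mirror', 'mirror', 'mirror']})
df2 = df.apply(pd.Series.value_counts).fillna(0)

# One title per column; the list length must match the number of columns
axes = df2.plot(kind='pie', subplots=True, layout=(2, 2), figsize=(10, 10),
                autopct='%1.1f%%', startangle=270,
                title=[f'Column {c}' for c in df2.columns],
                legend=False)

# Remove the column name that pandas draws as a y-axis label
for ax in np.ravel(axes):
    ax.set_ylabel('')
```

Call `plt.show()` from `matplotlib.pyplot` (with an interactive backend) or save the figure to view the result. This keeps the code short, but finer control over each subplot is easier with matplotlib directly.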

Customizing the Pie Charts

To customize the pie charts further, we can create the subplots with matplotlib directly and draw each pie chart ourselves. Here’s an updated code snippet:

# Plotting pie charts for each column with customizations

import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({'a': ['table', 'chair', 'chair', 'lamp', 'bed'],
                   'b': ['lamp', 'candle', 'chair', 'lamp', 'bed'],
                   'c': ['mirror', 'mirror', 'mirror', 'mirror', 'mirror']})

# Tabulate the data: count each category per column
# (the top-level pd.value_counts is deprecated in recent pandas versions)
df2 = df.apply(pd.Series.value_counts).fillna(0)

# Create a 2x2 grid of subplots; with three columns, one position stays unused
fig, axs = plt.subplots(2, 2, figsize=(10, 10))

# Draw one pie chart per column and give it a title
for ax, col in zip(axs.flat, df.columns):
    # Skip zero-count categories so their labels do not pile up on the chart
    counts = df2[col][df2[col] > 0]
    ax.pie(counts.values, labels=counts.index, autopct='%1.1f%%', startangle=270,
           textprops={'fontsize': 12})
    ax.set_title(col)

# Hide any subplot that did not receive a chart
for ax in axs.ravel()[len(df.columns):]:
    ax.set_visible(False)

# Layout so plots do not overlap
fig.tight_layout()

# Display the plot
plt.show()

This updated code snippet creates a 2x2 grid of subplots and draws one pie chart per column, each showing the distribution of values across the categories in that column. Each chart's title is set with the set_title method, and the category labels are drawn next to the wedges rather than in a legend.
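The grid size above is hard-coded for three columns. As a possible generalization, the sketch below wraps the same steps in a helper that derives the number of rows from the number of columns and hides unused positions. The function name `plot_column_pies` and its parameters `ncols` and `size` are hypothetical and introduced here only for illustration.

```python
import math

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display; remove for interactive use
import matplotlib.pyplot as plt
import pandas as pd


def plot_column_pies(df, ncols=2, size=4):
    """Draw one pie chart per DataFrame column (hypothetical helper)."""
    nrows = math.ceil(len(df.columns) / ncols)
    # squeeze=False keeps axs two-dimensional even for a single row or column
    fig, axs = plt.subplots(nrows, ncols,
                            figsize=(size * ncols, size * nrows),
                            squeeze=False)
    axes = axs.ravel()
    for ax, col in zip(axes, df.columns):
        counts = df[col].value_counts()  # only categories that actually occur
        ax.pie(counts.values, labels=counts.index,
               autopct='%1.1f%%', startangle=270)
        ax.set_title(col)
    # Hide grid positions that did not receive a chart
    for ax in axes[len(df.columns):]:
        ax.set_visible(False)
    fig.tight_layout()
    return fig, axs


df = pd.DataFrame({'a': ['table', 'chair', 'chair', 'lamp', 'bed'],
                   'b': ['lamp', 'candle', 'chair', 'lamp', 'bed'],
                   'c': ['mirror', 'mirror', 'mirror', 'mirror', 'mirror']})
fig, axs = plot_column_pies(df)
```

Counting each column separately with `value_counts()` means zero-count categories never reach the chart, so no filtering step is needed. The helper assumes the DataFrame has at least one column.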

Conclusion

In this article, we explored how to create pie charts for each column in a Pandas DataFrame. We discussed various options available in the matplotlib library for customizing the plots. By following these steps and using the provided code snippets, you can create meaningful and interpretable pie charts for your categorical data.

Example Use Cases

Here are some example use cases where creating pie charts for each column can be useful:

  • Visualizing the distribution of values across different categories in a dataset.
  • Comparing the relative frequencies of different categories in a dataset.
  • Identifying patterns or trends in a dataset by comparing the proportions of different categories.

By using the techniques discussed in this article, you can create effective pie charts for your Pandas DataFrames and gain insights into your data.
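For the second use case, comparing relative frequencies, the proportions that the pie charts display can also be computed as numbers. This sketch assumes `value_counts(normalize=True)` applied column by column; the variable name `proportions` is chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'a': ['table', 'chair', 'chair', 'lamp', 'bed'],
                   'b': ['lamp', 'candle', 'chair', 'lamp', 'bed'],
                   'c': ['mirror', 'mirror', 'mirror', 'mirror', 'mirror']})

# Share of each category within its column; absent categories become 0
proportions = df.apply(lambda s: s.value_counts(normalize=True)).fillna(0)
print(proportions.round(2))
```

Each column of `proportions` sums to 1, which makes it easy to compare how strongly a category is represented in different columns.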


Last modified on 2023-10-21