Reference a Pandas DataFrame with Another DataFrame in Python
In this article, we will explore how to reference one pandas DataFrame from another. We’ll use two DataFrames as an example: df_item and df_bill. The goal is to map the item_id column in df_bill to the corresponding item_name from df_item.
Introduction
Pandas is a powerful library for data manipulation and analysis in Python. A common task is looking up values in one DataFrame using a key column from another. This can be done by building a lookup Series from one DataFrame and passing it to the map() method of a column in the other.
Understanding the DataFrames
Let’s take a closer look at our two DataFrames:
import pandas as pd

df_item = pd.DataFrame({
    'item_id': [2, 3, 4, 5],
    'item_name': ['Noodles', 'Vegetables', 'Dairy Products', 'Ice Cream']
})
df_bill = pd.DataFrame({
    'bill_no': [201, 202, 203, 204, 205],
    'item_id': [3, 2, 4, 3, 5]
})
In df_item, the item_id column acts as a primary key: each value identifies exactly one row. We want to look up each item_id in df_bill and replace it with the corresponding item_name from df_item.
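The lookup only works cleanly if item_id really is unique in df_item. As a minimal sketch, the check below uses the Series.is_unique property; the duplicated frame df_dup and its 'Fruit' row are hypothetical and exist only to show the failing case.

```python
import pandas as pd

df_item = pd.DataFrame({
    'item_id': [2, 3, 4, 5],
    'item_name': ['Noodles', 'Vegetables', 'Dairy Products', 'Ice Cream']
})

# A lookup keyed on item_id is only unambiguous if every ID appears once.
key_is_unique = df_item['item_id'].is_unique
print(key_is_unique)

# Hypothetical duplicate row, added only to illustrate the failure case.
extra_row = pd.DataFrame({'item_id': [3], 'item_name': ['Fruit']})
df_dup = pd.concat([df_item, extra_row], ignore_index=True)
dup_is_unique = df_dup['item_id'].is_unique
print(dup_is_unique)
```

Running a check like this before building the lookup is a cheap way to catch ambiguous keys early.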
Mapping Series
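One way to achieve this is to call map() on the df_bill['item_id'] Series, passing a Series that is indexed by item_id and holds the item names. Because item_name should replace item_id in the result, we also remove the item_id column from df_bill, either by dropping it after the mapping or by popping it.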
Method 1: Removing the Column
We first build a lookup Series from df_item. Calling set_index() makes item_id the index, so selecting item_name gives a Series that maps each item ID to its corresponding name:
s = df_item.set_index('item_id')['item_name']
Next we map the values of df_bill['item_id'] onto s. This has to happen before the column is removed; otherwise df_bill['item_id'] raises a KeyError.
item_name = df_bill['item_id'].map(s)
Finally, we remove the item_id column with the drop() function and attach the new column using the assign() function:
df_bill = df_bill.drop('item_id', axis=1).assign(item_name=item_name)
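One detail worth knowing about map(): an item_id that has no entry in the lookup Series becomes NaN instead of raising an error. The sketch below assumes a hypothetical bill, df_bill_extra, containing an ID (9) that is not in df_item; the 'Unknown' placeholder is an arbitrary choice.

```python
import pandas as pd

df_item = pd.DataFrame({
    'item_id': [2, 3, 4, 5],
    'item_name': ['Noodles', 'Vegetables', 'Dairy Products', 'Ice Cream']
})

# Hypothetical bill containing item_id 9, which does not exist in df_item.
df_bill_extra = pd.DataFrame({
    'bill_no': [201, 202, 206],
    'item_id': [3, 2, 9]
})

s = df_item.set_index('item_id')['item_name']

# IDs missing from the lookup map to NaN rather than raising an error.
mapped = df_bill_extra['item_id'].map(s)

# Replace missing names with a placeholder label.
named = mapped.fillna('Unknown')
print(named.tolist())
```

Whether to fill, drop, or flag such rows depends on the data; the point is only that unmatched IDs pass through silently.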
Method 2: Popping Values
Alternatively, we can use the pop() function, which removes a column from a DataFrame and returns it as a Series:
s = df_item.set_index('item_id')['item_name']
df_bill['item_name'] = df_bill.pop('item_id').map(s)
Here the item_id column is removed from df_bill and mapped onto s in a single statement, and the result is stored as the new item_name column.
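Note that DataFrame.pop() changes the DataFrame it is called on. A minimal sketch of that side effect, working on a copy (bill_copy is a name introduced here for illustration) so that the original df_bill keeps its item_id column:

```python
import pandas as pd

df_bill = pd.DataFrame({
    'bill_no': [201, 202, 203, 204, 205],
    'item_id': [3, 2, 4, 3, 5]
})

# Work on a copy so the original df_bill keeps its item_id column.
bill_copy = df_bill.copy()

# pop() returns the column as a Series and deletes it from bill_copy in place.
popped = bill_copy.pop('item_id')

print(popped.tolist())          # the removed values
print(list(bill_copy.columns))  # remaining columns in the copy
print(list(df_bill.columns))    # the original is unchanged
```

Copying first is useful when the same DataFrame is needed again later, for example to compare the two methods side by side.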
Putting it all Together
Now we can use either method to achieve the desired result. Here’s the complete code:
import pandas as pd

# Create the DataFrames
df_item = pd.DataFrame({
    'item_id': [2, 3, 4, 5],
    'item_name': ['Noodles', 'Vegetables', 'Dairy Products', 'Ice Cream']
})
df_bill = pd.DataFrame({
    'bill_no': [201, 202, 203, 204, 205],
    'item_id': [3, 2, 4, 3, 5]
})

# Lookup Series: item_id -> item_name
s = df_item.set_index('item_id')['item_name']

# Method 1: map the column, then drop it
item_name = df_bill['item_id'].map(s)
result_1 = df_bill.drop('item_id', axis=1).assign(item_name=item_name)

# Method 2: pop the column and map it (on a copy, because pop() modifies in place)
result_2 = df_bill.copy()
result_2['item_name'] = result_2.pop('item_id').map(s)

# Print the resulting DataFrame (result_2 has the same contents)
print(result_1)
Output
When we run this code, we should see the following output:
   bill_no       item_name
0      201      Vegetables
1      202         Noodles
2      203  Dairy Products
3      204      Vegetables
4      205       Ice Cream
This shows that each item_id in df_bill has been replaced by the corresponding item_name from df_item.
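As a cross-check, the same table can be produced with a left join. This is not one of the two methods described in this article, only a sketch of an alternative using DataFrame.merge(); the validate argument asks pandas to raise an error if item_id is not unique in df_item.

```python
import pandas as pd

df_item = pd.DataFrame({
    'item_id': [2, 3, 4, 5],
    'item_name': ['Noodles', 'Vegetables', 'Dairy Products', 'Ice Cream']
})
df_bill = pd.DataFrame({
    'bill_no': [201, 202, 203, 204, 205],
    'item_id': [3, 2, 4, 3, 5]
})

# A left join keeps every bill row, in its original order.
merged = df_bill.merge(df_item, on='item_id', how='left', validate='many_to_one')

# Drop the key column so only bill_no and item_name remain.
result = merged.drop(columns='item_id')
print(result)
```

For a single lookup column, map() is usually the lighter tool; merge() becomes more convenient when several columns from df_item are needed at once.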
Conclusion
In this article, we explored how to reference one pandas DataFrame from another, using df_item and df_bill as an example. By mapping the item_id column in df_bill onto s, a lookup Series built from df_item, we converted each item_id into its corresponding item_name.
We showed two methods for achieving this result: mapping the item_id column and then removing it from df_bill, or popping the column and mapping its values. Both approaches produce the same output.
I hope this article has helped you understand how to reference one pandas DataFrame within another.
Last modified on 2025-05-04