Pandas MultiIndex Selection and Division
This section explores how to choose which index level of a multi-index Series to align on when dividing it by a single-index Series.
Introduction to Pandas MultiIndex Series
A multi-index Series is a pandas Series whose index is a MultiIndex: each entry is labeled by a tuple of values, one per index level, rather than by a single label. This is particularly useful for storing and manipulating data sets with more than one dimension.
For example, consider a dataset containing information about products sold in different regions, where each product has multiple variants (e.g., different sizes or colors). In this case, we might use a multi-index series to store the data, where the first index represents the product and the second index represents the variant.
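The product/variant layout described above can be sketched as follows. The product names, variant names, and sales figures are hypothetical values chosen only for illustration; `pd.MultiIndex.from_tuples` is one of several ways to build the index.

```python
import pandas as pd

# Hypothetical (product, variant) labels, listed in sorted order.
index = pd.MultiIndex.from_tuples(
    [("mug", "blue"), ("mug", "red"), ("shirt", "large"), ("shirt", "small")],
    names=["product", "variant"],
)

# Hypothetical units sold for each (product, variant) pair.
sales = pd.Series([7, 5, 30, 12], index=index, name="units_sold")

# Selecting on the first level returns every variant of that product.
shirt_sales = sales.loc["shirt"]

# A full tuple selects a single entry.
red_mug_sales = sales.loc[("mug", "red")]

print(shirt_sales)
print(red_mug_sales)
```

Selecting with a single label drops the first level and leaves a Series indexed by variant, while a full tuple returns one scalar value.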
Creating a Pandas MultiIndex Series
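The examples in this section start from a pandas DataFrame built with the pd.DataFrame constructor. Passing a dictionary of columns produces a DataFrame with a single-level index taken from the keys of the inner dictionaries; the multi-level index appears later, when the data is grouped by more than one column. For example: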
import pandas as pd
data = {
"account": {"0": 383080, "1": 383080, "2": 383080, "3": 412290, "4": 412290, "5": 412290, "6": 412290, "7": 412290, "8": 218895, "9": 218895, "10": 218895, "11": 218895},
"name": {"0": "Will LLC", "1": "Will LLC", "2": "Will LLC", "3": "Jerde-Hilpert", "4": "Jerde-Hilpert", "5": "Jerde-Hilpert", "6": "Jerde-Hilpert", "7": "Jerde-Hilpert", "8": "Kulas Inc", "9": "Kulas Inc", "10": "Kulas Inc", "11": "Kulas Inc"},
"order": {"0": 10001, "1": 10001, "2": 10001, "3": 10005, "4": 10005, "5": 10005, "6": 10005, "7": 10005, "8": 10006, "9": 10006, "10": 10006, "11": 10006},
"sku": {"0": "B1-20000", "1": "S1-27722", "2": "B1-86481", "3": "S1-06532", "4": "S1-82801", "5": "S1-06532", "6": "S1-47412", "7": "S1-27722", "8": "S1-27722", "9": "B1-33087", "10": "B1-33364", "11": "B1-20000"},
"quantity": {"0": 7, "1": 11, "2": 3, "3": 48, "4": 21, "5": 9, "6": 44, "7": 36, "8": 32, "9": 23, "10": 3, "11": -1},
"unit_price": {"0": 33.69, "1": 21.12, "2": 35.99, "3": 55.82, "4": 13.62, "5": 92.55, "6": 78.91, "7": 25.42, "8": 95.66, "9": 22.55, "10": 72.3, "11": 72.18},
"ext_price": {"0": 235.83, "1": 232.32, "2": 107.97, "3": 2679.36, "4": 286.02, "5": 832.95, "6": 3472.04, "7": 915.12, "8": 3061.12, "9": 518.65, "10": 216.9, "11": 72.18}
}
df = pd.DataFrame(data)
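The DataFrame above still has a single-level index. One possible way to obtain a multi-index Series from it is `set_index` with a list of columns; the sketch below assumes `account` and `sku` as the two levels and uses only three rows of the order data to stay short.

```python
import pandas as pd

# Three rows taken from the order data above, trimmed to three columns.
df = pd.DataFrame({
    "account": [383080, 383080, 383080],
    "sku": ["B1-20000", "S1-27722", "B1-86481"],
    "quantity": [7, 11, 3],
})

# Move two columns into the index, keep one column as the values,
# and sort the index so label lookups are efficient.
quantity = df.set_index(["account", "sku"])["quantity"].sort_index()

print(quantity.index.names)
print(quantity.loc[(383080, "S1-27722")])
```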
Grouping and Aggregating Data with MultiIndex
To group and aggregate data in a pandas DataFrame, you can use the groupby method. Grouping by more than one column yields a result whose index is a MultiIndex. For example:
import pandas as pd
data = {
"account": {"0": 383080, "1": 383080, "2": 383080, "3": 412290, "4": 412290, "5": 412290, "6": 412290, "7": 412290, "8": 218895, "9": 218895, "10": 218895, "11": 218895},
"name": {"0": "Will LLC", "1": "Will LLC", "2": "Will LLC", "3": "Jerde-Hilpert", "4": "Jerde-Hilpert", "5": "Jerde-Hilpert", "6": "Jerde-Hilpert", "7": "Jerde-Hilpert", "8": "Kulas Inc", "9": "Kulas Inc", "10": "Kulas Inc", "11": "Kulas Inc"},
"order": {"0": 10001, "1": 10001, "2": 10001, "3": 10005, "4": 10005, "5": 10005, "6": 10005, "7": 10005, "8": 10006, "9": 10006, "10": 10006, "11": 10006},
"sku": {"0": "B1-20000", "1": "S1-27722", "2": "B1-86481", "3": "S1-06532", "4": "S1-82801", "5": "S1-06532", "6": "S1-47412", "7": "S1-27722", "8": "S1-27722", "9": "B1-33087", "10": "B1-33364", "11": "B1-20000"},
"quantity": {"0": 7, "1": 11, "2": 3, "3": 48, "4": 21, "5": 9, "6": 44, "7": 36, "8": 32, "9": 23, "10": 3, "11": -1},
"unit_price": {"0": 33.69, "1": 21.12, "2": 35.99, "3": 55.82, "4": 13.62, "5": 92.55, "6": 78.91, "7": 25.42, "8": 95.66, "9": 22.55, "10": 72.3, "11": 72.18},
"ext_price": {"0": 235.83, "1": 232.32, "2": 107.97, "3": 2679.36, "4": 286.02, "5": 832.95, "6": 3472.04, "7": 915.12, "8": 3061.12, "9": 518.65, "10": 216.9, "11": 72.18}
}
df = pd.DataFrame(data)
grouped_df = df.groupby(['account', 'name'])
agg_result = grouped_df.agg({'quantity': 'sum', 'unit_price': 'mean', 'ext_price': 'max'})
print(agg_result)
Output:
                      quantity  unit_price  ext_price
account name
218895  Kulas Inc           57   65.672500    3061.12
383080  Will LLC            21   30.266667     235.83
412290  Jerde-Hilpert      158   53.264000    3472.04
Note that the agg method performs the aggregation on the grouped DataFrame: the 'quantity', 'unit_price', and 'ext_price' columns are aggregated with sum, mean, and max, respectively.
Because the data is grouped by both 'account' and 'name', the result is indexed by a two-level MultiIndex with one row per (account, name) pair. The printed values depend on the input data.
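To return to the question this section opens with, one way to choose the alignment level explicitly is the `level` argument of `Series.div`. The sketch below is an illustration under stated assumptions, not the only approach: it uses five rows of the order data, assumes `account` and `sku` as the two index levels, and divides each line's `ext_price` by its account total.

```python
import pandas as pd

# Five rows from the order data above, trimmed to three columns.
df = pd.DataFrame({
    "account": [218895, 383080, 383080, 412290, 412290],
    "sku": ["S1-27722", "B1-20000", "S1-27722", "S1-06532", "S1-82801"],
    "ext_price": [3061.12, 235.83, 232.32, 2679.36, 286.02],
})

# Multi-index Series: ext_price keyed by (account, sku).
by_account_sku = df.set_index(["account", "sku"])["ext_price"]

# Single-index Series: total ext_price keyed by account only.
account_totals = df.groupby("account")["ext_price"].sum()

# level="account" names the MultiIndex level to match against the
# index of account_totals; each total is broadcast across its SKUs.
share = by_account_sku.div(account_totals, level="account")

print(share)
```

Each value in `share` is that line's fraction of its account's total, so the fractions within an account sum to 1.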
Last modified on 2025-02-12