Creating Python Dictionary from Excel
Introduction
In this article, we will explore how to create a dictionary in Python using data imported from an Excel file. We will go through the process step-by-step, explaining each part and providing examples.
Requirements
To follow along with this tutorial, you’ll need:
- Python 3.x installed on your computer
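- The xlrd library, which can be installed using pip: pip install xlrd. Note that xlrd 2.0 and later reads only the older .xls format; to open an .xlsx file you need an xlrd release earlier than 2.0 or a different library such as openpyxl.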
Excel Data Structure
Before diving into the code, let’s take a look at how data is structured in an Excel file. The data is stored in rows and columns, with each cell containing a value.
For example, for an Excel file with three sheets (MM, DD, FF), some sample data might look like this:
MM | FF
---|---
s1 | c1
s2 | c2
s3 | c3
Using xlrd to Import Data
To start, we’ll use the xlrd library to open and read our Excel file. We can import xlrd and open the workbook using the following code:
import xlrd
file_location = "data.xlsx"
workbook = xlrd.open_workbook(file_location)
In this example, replace "data.xlsx" with the actual path to your Excel file.
Defining Sheets and Data Structures
We can then select which sheets we want to use by calling sheet_by_name:
M_Sheet = workbook.sheet_by_name("MM")
D_Sheet = workbook.sheet_by_name("DD")
F_Sheet = workbook.sheet_by_name("FF")
Here, M_Sheet, D_Sheet, and F_Sheet are the sheet objects we’ll use to access the data.
We define lists M, D, and F to store the values from each sheet. These will be used to create our final dictionary:
M = []
for i in range(M_Sheet.nrows):
    value = M_Sheet.cell(i, 0).value
    M.append(value)
D = []
for j in range(D_Sheet.nrows):
    value = D_Sheet.cell(j, 0).value
    D.append(value)
F = []
for f in range(F_Sheet.nrows):
    value = F_Sheet.cell(f, 0).value
    F.append(value)
This code loops through each row of the sheet and appends the value from column 0 to our respective lists.
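The three loops above differ only in the sheet they read, so they could be folded into one helper. The following is a minimal sketch, not part of the original code: `column_values` and the `FakeSheet` stand-in are hypothetical names introduced here so the example runs without an Excel file. It relies only on the `nrows` attribute and the `cell(row, col).value` accessor already used above.

```python
from types import SimpleNamespace


def column_values(sheet, col=0):
    """Return the values of one column, top to bottom."""
    return [sheet.cell(row, col).value for row in range(sheet.nrows)]


class FakeSheet:
    """Hypothetical stand-in that mimics the parts of an xlrd sheet used here."""

    def __init__(self, rows):
        self._rows = rows
        self.nrows = len(rows)

    def cell(self, row, col):
        # xlrd returns a cell object exposing .value; mimic that shape.
        return SimpleNamespace(value=self._rows[row][col])


demo_sheet = FakeSheet([["s1"], ["s2"], ["s3"]])
M = column_values(demo_sheet)
print(M)  # ['s1', 's2', 's3']
```

With a real workbook, the call would be `column_values(M_Sheet)`. xlrd sheet objects also provide `col_values(0)`, which returns the same list directly.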
However, three separate lists do not record which value belongs to which key: the relationship exists only through list positions. A dictionary makes that relationship explicit, with keys taken from one list and values from another.
Creating a Dictionary
Our goal now is to create a nested dictionary whose outer keys come from sheet MM, whose inner keys come from sheet DD, and whose values come from sheet FF. The code below shows how we can achieve this:
# S and C are the lists read from sheets MM and FF above
S = M
C = F
dico_s = {}
for s in S:
    dico_d = {}
    for d in D:
        idx = D.index(d) + len(D) * S.index(s)
        dico_d[d] = C[idx]
    dico_s[s] = dico_d
print(dico_s)
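In this code, S is the list of values from sheet MM, D is the list of values from sheet DD, and C is the list of values from sheet FF. The inner loop uses the index method to compute a position in C: the position of d in D, plus len(D) times the position of s in S. In other words, C is read as a flat list that holds len(D) values for the first entry of S, then len(D) values for the second, and so on.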
The result is a nested dictionary: each key from S maps to an inner dictionary that pairs every key from D with its value from C.
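Calling `index` inside the loop rescans the list on every iteration and returns the first match, so a repeated entry in `S` or `D` would map to the wrong position. A possible alternative, sketched here with a hypothetical `build_nested` function, takes the positions from `enumerate` instead. It assumes, as above, that `C` holds `len(S) * len(D)` values stored one `S` entry after another.

```python
def build_nested(S, D, C):
    """Map each s in S to a dict {d: value}, reading C row by row."""
    nested = {}
    for i, s in enumerate(S):
        inner = {}
        for j, d in enumerate(D):
            # Same arithmetic as D.index(d) + len(D) * S.index(s),
            # but the positions come from enumerate instead of a search.
            inner[d] = C[j + len(D) * i]
        nested[s] = inner
    return nested


S = ["s1", "s2", "s3"]
D = ["d1", "d2"]
C = ["c1", "c2", "c3", "c4", "c5", "c6"]
print(build_nested(S, D, C))
```

For lists without repeated entries, this should produce the same dictionary as the loop based on `index`.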
Example Walkthrough
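Let’s walk through an example where we want to create a dictionary with three outer keys (s1, s2, s3), each mapped to two inner keys (d1, d2). To keep the example self-contained, the lists are generated in code rather than read from the Excel sheets. Here’s the code: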
nb_s = 3
nb_d = 2
S = ['s' + str(x) for x in range(1, nb_s + 1)]
D = ['d' + str(x) for x in range(1, nb_d + 1)]
C = ['c' + str(x) for x in range(1, (len(S) * len(D)) + 1)]
print(S)
print(D)
print(C)
dico_s = {}
for s in S:
    dico_d = {}
    for d in D:
        idx = D.index(d) + len(D) * S.index(s)
        dico_d[d] = C[idx]
    dico_s[s] = dico_d
print(dico_s)
This code creates three lists S, D, and C using list comprehensions. Then it iterates through S and D to create a dictionary where each outer key comes from S (standing in for sheet MM), each inner key comes from D (sheet DD), and each value comes from C (sheet FF).
The final output should look like this (the dictionary is wrapped across lines here for readability):
['s1', 's2', 's3']
['d1', 'd2']
['c1', 'c2', 'c3', 'c4', 'c5', 'c6']
{'s1': {'d1': 'c1', 'd2': 'c2'},
 's2': {'d1': 'c3', 'd2': 'c4'},
 's3': {'d1': 'c5', 'd2': 'c6'}}
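To see where each value comes from, it can help to print the index computed for every pair of keys. The short sketch below recomputes the index for the three-by-two example; it is an illustration added here, not part of the original script, and it takes the positions from `enumerate` rather than `index`.

```python
S = ["s1", "s2", "s3"]
D = ["d1", "d2"]
C = ["c1", "c2", "c3", "c4", "c5", "c6"]

lookups = []
for i, s in enumerate(S):
    for j, d in enumerate(D):
        idx = j + len(D) * i  # position of the (s, d) pair in C
        lookups.append((s, d, idx, C[idx]))
        print(f"{s}, {d}: idx = {j} + {len(D)} * {i} = {idx} -> {C[idx]}")
```

For example, the pair (s2, d1) gives idx = 0 + 2 * 1 = 2, which selects c3.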
Creating a Dictionary from Excel with a Different Number of Entries
If the sheets contain a different number of entries, only the list sizes change; the loop itself stays the same.
Here’s how it looks with four outer keys and six inner keys:
nb_s = 4
nb_d = 6
S = ['s' + str(x) for x in range(1, nb_s + 1)]
D = ['d' + str(x) for x in range(1, nb_d + 1)]
C = ['c' + str(x) for x in range(1, (len(S) * len(D)) + 1)]
print(S)
print(D)
print(C)
dico_s = {}
for s in S:
    dico_d = {}
    for d in D:
        idx = D.index(d) + len(D) * S.index(s)
        dico_d[d] = C[idx]
    dico_s[s] = dico_d
print(dico_s)
The output will look like this (long lines are wrapped here for readability):
['s1', 's2', 's3', 's4']
['d1', 'd2', 'd3', 'd4', 'd5', 'd6']
['c1', 'c2', 'c3', 'c4', 'c5', 'c6', 'c7', 'c8', 'c9', 'c10', 'c11', 'c12',
 'c13', 'c14', 'c15', 'c16', 'c17', 'c18', 'c19', 'c20', 'c21', 'c22', 'c23', 'c24']
{'s1': {'d1': 'c1', 'd2': 'c2', 'd3': 'c3', 'd4': 'c4', 'd5': 'c5', 'd6': 'c6'},
 's2': {'d1': 'c7', 'd2': 'c8', 'd3': 'c9', 'd4': 'c10', 'd5': 'c11', 'd6': 'c12'},
 's3': {'d1': 'c13', 'd2': 'c14', 'd3': 'c15', 'd4': 'c16', 'd5': 'c17', 'd6': 'c18'},
 's4': {'d1': 'c19', 'd2': 'c20', 'd3': 'c21', 'd4': 'c22', 'd5': 'c23', 'd6': 'c24'}}
In this example, we have four outer keys (s1, s2, s3, s4), each mapped to six inner keys (d1 to d6) taken from D, with the 24 values taken in order from C.
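The index arithmetic only works when `C` contains exactly `len(S) * len(D)` values: a shorter list raises `IndexError` partway through, and any extra values in a longer list are silently ignored. One way to guard against this, sketched below with a hypothetical `nested_from_flat` helper, is to check the length first and then pair each slice of `C` with the keys in `D`.

```python
def nested_from_flat(S, D, C):
    """Build {s: {d: value}} from a flat list C laid out one S entry at a time."""
    expected = len(S) * len(D)
    if len(C) != expected:
        raise ValueError(f"expected {expected} values, got {len(C)}")
    width = len(D)
    return {
        # Slice out the block of values belonging to the i-th entry of S.
        s: dict(zip(D, C[i * width:(i + 1) * width]))
        for i, s in enumerate(S)
    }


S = ["s" + str(x) for x in range(1, 5)]
D = ["d" + str(x) for x in range(1, 7)]
C = ["c" + str(x) for x in range(1, len(S) * len(D) + 1)]
result = nested_from_flat(S, D, C)
print(result["s4"]["d6"])  # c24
```

Failing early with a `ValueError` is a design choice for this sketch; depending on the data, padding or skipping incomplete rows might be more appropriate.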
Last modified on 2024-08-19