Merging Multiple Excel Files with Password Protection in Python
===========================================================
This article shows how to compile multiple Excel files into one master file and protect that file with a password, using the pandas and openpyxl libraries.
Introduction
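Openpyxl is a popular Python library for reading and writing Excel (.xlsx) files. It lets us read and modify spreadsheet data and apply password protection to individual sheets and to the workbook structure. This protection restricts editing in Excel; it does not encrypt the file, and openpyxl cannot open workbooks that are encrypted with a password to open. In this tutorial, we use pandas (with openpyxl as its .xlsx engine) to read multiple Excel files, then use openpyxl to write the merged, password-protected master file.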
Requirements
Before we begin, make sure you have the following libraries available:
- openpyxl: used to interact with Excel files.
- pandas: used to work with tabular data in Python.
- os: used for working with file paths. It is part of the Python standard library and needs no installation.
You can install the two third-party libraries using pip:
pip install openpyxl pandas
Setting Up the Environment
Create a new project folder and navigate into it. Create a new file named merged_exels.py (or any other name you prefer) and copy the code below into it.
import os
import pandas as pd
from openpyxl import Workbook
from openpyxl.workbook.protection import WorkbookProtection

# Set the path to the directory containing the Excel files
cwd = os.path.abspath(r'C:/Users/eldri/OneDrive/Desktop/test/')

# Name of the sheet to read from each file and to use in the master file
shtname = 'Sheet1'

# Name of the output file, so that it is skipped if it sits in the same folder
outname = 'compiled_xl.xlsx'

# Collect one DataFrame per input file
frames = []

# Loop through each file in the directory
for file in sorted(os.listdir(cwd)):
    # Only process .xlsx files; skip the output file and Excel lock files (~$...)
    if not file.endswith('.xlsx') or file == outname or file.startswith('~$'):
        continue
    # os.listdir() returns bare file names, so build the full path
    path = os.path.join(cwd, file)
    # Open the Excel file using pandas (openpyxl is the engine for .xlsx).
    # Sheet or workbook-structure protection does not block reading, but an
    # encrypted (password-to-open) file cannot be read this way, and
    # openpyxl cannot recover a file's password.
    with pd.ExcelFile(path) as excel_file:
        # Skip files that do not contain the wanted sheet
        if shtname not in excel_file.sheet_names:
            continue
        # Read the data from the sheet into a DataFrame
        frames.append(excel_file.parse(shtname))

# Combine the data and renumber the rows so indices from different files do not collide
xltot = pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()


# Save the combined data to an Excel file with password protection
def save_to_excel(xltot, shtname):
    wb = Workbook()
    # Use the default worksheet as the active sheet and name it
    ws = wb.active
    ws.title = shtname
    # Set the value of cell A1 to be the title of the master Excel file
    ws['A1'] = 'Compiled Excels'
    # Write the column headers in row 2
    for j, name in enumerate(xltot.columns, start=1):
        ws.cell(row=2, column=j, value=name)
    # Copy data from the DataFrame into the worksheet, starting at row 3
    for i, row in enumerate(xltot.itertuples(index=False), start=3):
        for j, val in enumerate(row, start=1):
            ws.cell(row=i, column=j, value=val)
    # Protect the workbook structure and the sheet with a password.
    # This restricts editing in Excel; it does not encrypt the file.
    wb.security = WorkbookProtection(workbookPassword='password', lockStructure=True)
    ws.protection.sheet = True
    ws.protection.password = 'password'
    # Save the workbook to a file with .xlsx extension
    wb.save(outname)


# Call the function to save the data to an Excel file
save_to_excel(xltot, shtname)
print("Done")
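The protection step at the end of the script can also be tried on its own. The following is a minimal sketch, assuming openpyxl's WorkbookProtection class and the worksheet protection attribute; the helper name protect_workbook and the sample password are illustrative, not part of the original script. Keep in mind that this kind of protection restricts editing in Excel and does not encrypt the file.

```python
from openpyxl import Workbook
from openpyxl.workbook.protection import WorkbookProtection


def protect_workbook(wb, password):
    """Lock the workbook structure and every sheet with the given password.

    Hypothetical helper for illustration. It restricts editing in Excel
    but does not encrypt the file contents.
    """
    # Prevent adding, deleting, or renaming sheets without the password
    wb.security = WorkbookProtection(workbookPassword=password, lockStructure=True)
    for ws in wb.worksheets:
        # Turn sheet protection on and store the password (openpyxl hashes it)
        ws.protection.sheet = True
        ws.protection.password = password
    return wb


wb = Workbook()
wb.active['A1'] = 'Compiled Excels'
protect_workbook(wb, 'password')
```

Calling wb.save() afterwards writes the protected workbook; anyone opening it in Excel would need the password to unlock the sheet or change the workbook structure.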
Explanation
This script reads Sheet1 from multiple Excel files into one master DataFrame using pandas, then saves the combined data to a password-protected Excel file using openpyxl.
Here’s how it works:
- The script starts by setting the path to the folder that contains the Excel files.
- It lists all the files in that folder using os.listdir().
- It initializes an empty container to collect the data from all files.
- It loops through each file in the folder and checks whether it is an Excel file (with the .xlsx extension).
- For each match, it opens the file using pandas and reads the sheet named Sheet1 into a DataFrame. Sheet or workbook protection does not prevent reading, but openpyxl cannot recover a file's password, and an encrypted file must be decrypted before pandas can read it.
- It appends the data to the master DataFrame.
- After reading all files, it writes the combined data to a new workbook, applies password protection, and saves it as compiled_xl.xlsx.
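The reading steps above can be sketched as one reusable function. This is a minimal sketch, not the script itself: the function name merge_sheets and its parameters are hypothetical, and it uses pathlib's glob in place of os.listdir() to select the .xlsx files.

```python
from pathlib import Path

import pandas as pd


def merge_sheets(folder, sheet_name="Sheet1"):
    """Read `sheet_name` from every .xlsx file in `folder` into one DataFrame.

    Hypothetical helper for illustration; files without the sheet are skipped.
    """
    frames = []
    for path in sorted(Path(folder).glob("*.xlsx")):
        # Skip the temporary lock files Excel creates while a workbook is open
        if path.name.startswith("~$"):
            continue
        with pd.ExcelFile(path) as book:
            if sheet_name in book.sheet_names:
                frames.append(book.parse(sheet_name))
    # ignore_index=True renumbers rows so indices from different files do not collide
    return pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()
```

Collecting the per-file DataFrames in a list and concatenating once avoids repeatedly copying the growing master DataFrame inside the loop.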
Example Use Cases
This script is useful when you need to combine the data from multiple Excel files into one master file. Only cell values are carried over; the formatting and layout of the source files are not preserved. Here are a few examples of how you can use this script:
- Data consolidation: You have multiple Excel files containing different data and want to consolidate them into one file for easier analysis or reporting.
- File merging: You need to merge multiple Excel files based on certain criteria (e.g., date range, department). The script does not filter rows itself, but the merged DataFrame can be filtered with pandas before saving.
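Merging by criteria amounts to filtering the combined DataFrame before it is written out. Here is a minimal sketch with made-up data; the column names date, department, and amount are assumptions for illustration and would need to match the columns in your own files.

```python
import pandas as pd

# Hypothetical merged data; the column names are assumptions for illustration
merged = pd.DataFrame(
    {
        "date": pd.to_datetime(["2024-01-15", "2024-02-20", "2024-03-05"]),
        "department": ["Sales", "HR", "Sales"],
        "amount": [100, 250, 175],
    }
)

# Keep only Sales rows dated in the first quarter of 2024
start, end = pd.Timestamp("2024-01-01"), pd.Timestamp("2024-03-31")
mask = (merged["department"] == "Sales") & merged["date"].between(start, end)
filtered = merged[mask].reset_index(drop=True)
```

The filtered DataFrame can then be passed to the saving step in place of the full merged data.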
Best Practices
Here are a few best practices to keep in mind when using this script:
- Use a consistent file naming convention: Make sure all your Excel files have the same format and extension (.xlsx) for easy identification.
- Set strong passwords: Choose a strong, unique password for each Excel file rather than a placeholder such as 'password', and avoid hard-coding it in the script.
- Regularly back up your data: Use a backup system (e.g., external hard drive, cloud storage) to save your Excel files regularly.
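One way to follow the password advice above is to keep the password out of the source code. The sketch below reads it from an environment variable and falls back to a randomly generated one; the function name and the variable name EXCEL_MASTER_PASSWORD are hypothetical choices for this example.

```python
import os
import secrets


def get_workbook_password(env_var="EXCEL_MASTER_PASSWORD"):
    """Return the password from an environment variable, or generate one.

    The variable name is a hypothetical choice for this example.
    """
    password = os.environ.get(env_var)
    if not password:
        # token_urlsafe(16) builds a random URL-safe string from 16 random bytes
        password = secrets.token_urlsafe(16)
    return password


password = get_workbook_password()
```

If a password is generated this way, it has to be stored somewhere safe (for example a password manager), otherwise the protected sheet cannot be unlocked later.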
Conclusion
In this article, we explored how to compile multiple Excel files into one master file and add password protection to it using Python with the pandas and openpyxl libraries. We also discussed some best practices for using this script in real-world scenarios.
Last modified on 2024-08-29