For many years, I have been creating excel spreadsheets with a similar look and feel. My template consists of data organized into excel tables with similar categories having the same background colors and a merged cell with a category title above them (outside the table). The background colors are dark(ish) to provide contrast with a bold white font. I use blank columns before and after the table as borders and hide grid lines and unused columns.
I have been manually creating these spreadsheets in Excel, but when I needed to update one of them every week, I decided to create a python script to automate the process. The script worked well, but it was very hard to maintain. If I needed to add or remove a column, I would have to make a lot of changes, and if I needed to adapt the script to another dataset, I would have to almost start from scratch.
I promised myself that I would make a cleaner version of the excel exporter as soon as I had time. The original script was written using the pandas and xlswriter libraries, where I put the data on the sheet using pandas and formatted everything I needed using xlswriter. However, recently I needed to add hyperlinks to some cells in a particular column, and I was unable to do so with my current approach. So, I decided to completely rewrite my script to make it easier to adapt to new uses and allow for the possibility of adding hyperlinks to cells.
I don't intend to maintain this repository. However, I believe that the new version makes it very easy to create spreadsheets with my template and may be useful to others. Additionally, I think that the code is organized well enough to be adapted to other templates or used as a reference on how to organize a script to generate excel spreadsheets with complex formatting.
The Excel Exporter was created utilizing openpyxl and has the capability to export multiple sheets into a single Excel file.
The data being exported is sourced from dictionaries (one for each sheet). In cases where the data contains associated URLs, the exporter will automatically convert them into hyperlinks within the Excel file.
The following animation shows how the data is organized:
Configurations for columns, groups, sheets, and the workbook as a whole are stored in a YAML file.
- Workbook: File name, Cell formats and Sheets.
- Cell Formats: Font size, Horizontal alignment, Vertical alignment and Line break
- Sheets: Sheet Name, Group(s), and Column(s)
- Groups: Name and Background Color
- Columns: Order, Title (to be display in Excel), Variable Name (in the python dictionary), Cell format (previouly defined), Group it belongs to (previously defined) and Column Width
The following animation shows how the YAML file is organized:
Note:
The YAML animation doesn't include the workbook parameter update_message
, which was included later. Check it in example_config.yaml line 2.
Just use pip
pip install excel-exporter-bms
or poetry
poestry add excel-exporter-bms
To run this tutorial you should download example.py and example_config.yaml. It is expected to generate a file like example_output.xlsx
It's necessary to properly prepare your data before using the exporter. The data must be a list of dictionary of lists, where each dictionary represents a sheet in the excel file. Inside the dictionary each key represents a column and its corresponding list contains the column data. Each element in the list represents a cell value. If a hyperlink is required, the cell value should be represented as a tuple, with the first element being the cell value and the second element being the URL for the hyperlink.
data = [
{
"employee_id": [(1001, 'https://employee.my_company.com/id/1001'),
(1002, 'https://employee.my_company.com/id/1002')],
"employee_name": ["John Doe", "Jane Doe"],
"department": ["Sales South Area", "Sales West Area"],
"job_title": ["Manager", "Associate"]
},
{
"employee_id": [1001, 1002],
"sales_current_year": [100000, 120000],
"sales_last_year": [90000, 110000]
}
]
- If your data originally comes from a list of dictionaries, you can use the
utils
module to convert it.
from excel_exporter.utils.data_conversion import from_list_of_dicts
sheet1_data = from_list_of_dicts(sheet1_list_of_dicts)
data = [sheet1_data]
- If your data is in a pandas dataframe, a similar conversion process applies.
from excel_exporter.utils.data_conversion import from_pandas_dataframe
sheet1_data = from_pandas_dataframe(sheet1_df)
data = [sheet1_data]
Configure the YAML based on the example described on the previous section
Before using the exporter, you'll need to load the YAML configuration file that you'll be using to export your data.
from excel_exporter.configuration.load_config import load_config
config = load_config('example_config.yaml')
This exporter was created for periodically updated spreadsheets. So it also important to include the time and date of the dataset. These can be retrieved using datetime module.
from datetime import datetime
update_time = datetime.now()
Once your data is prepared and the YAML is loaded, you're ready to run the exporter. Simply pass your data, configuration and update time as arguments to the exporter, and the export process will begin. It will return a openpyxl workbook.
from excel_exporter.exporter.create_excel import export_excel
workbook = export_excel(data, config, update_time)
Once the export process is complete, you can choose to either save the workbook file as a real file on your local machine, or upload it to a remote server using an API. Here is an example of how to save the file locally:
workbook.save(config.file_name)