JSON from Excel using Python openpyxl

Powering an application from a collection of static files rather than a call to an API or database server is well worth considering. Removing a production database and data access layer from the mix eliminates complexity and trims development time. Static files aren’t right for every situation, but for small apps they may be all you need.

In my days building back ends for interactive news applications, most of our apps consumed JSON. Simple apps ran off one or two static JSON files. The more complex ones used hundreds. Updating content was as easy as generating new JSON files and uploading them to a file server.

I’ve written about how to use Python to generate JSON files from a SQL database. But it’s just as easy to use Python to make JSON out of Excel spreadsheets. The key ingredient is the Python library openpyxl, which offers a collection of functions for reading and writing Excel .xlsx workbooks. You can add it to your Python environment by running pip install openpyxl.

Let’s look at how it works, then use it along with some Python to generate JSON.

Basic openpyxl functions

Let’s say we have an Excel workbook containing a small table repeated over three worksheets. The table in each case looks like this:

Here are some snippets of code — just scratching the surface — to interact with it programmatically:

# Demonstrates basic openpyxl functions for working with Excel files
# (.xlsx format)

from openpyxl import load_workbook

# load the workbook
wb = load_workbook('excel-xlrd-sample.xlsx')

# Print the sheet names
print(wb.sheetnames)

# Get a sheet by name
sheet = wb['Cars20']

# Iterate through sheet rows, returning each as a tuple:
for row in sheet.values:
    print(row)

# If you just want the first column:
for cell in sheet['A']:
    print(cell.value)

# Index individual cells:
cell_c4 = sheet['C4'].value

# Or you can use:
cell_c4 = sheet.cell(row=4, column=3).value

# Let's say you want the same cell in every sheet of a workbook:
for sheet in wb.sheetnames:
    open_sheet = wb[sheet]
    cell_x = open_sheet['D2'].value

From Excel to JSON

Pretty cool stuff. Now, let’s convert our sample spreadsheet to JSON. I’ll borrow some of the techniques I discussed when outlining how to use Python to build JSON from a SQL database:

import json
from collections import OrderedDict
from itertools import islice
from openpyxl import load_workbook

# Open the workbook and select a worksheet
wb = load_workbook('excel-xlrd-sample.xlsx')
sheet = wb['Cars20']

# List to hold dictionaries
cars_list = []

# Iterate through each row in worksheet and fetch values into dict
for row in islice(sheet.values, 1, sheet.max_row):
    cars = OrderedDict()
    cars['car-id'] = row[0]
    cars['make'] = row[1]
    cars['model'] = row[2]
    cars['miles'] = row[3]
    cars_list.append(cars)

# Serialize the list of dicts to JSON
j = json.dumps(cars_list)

# Write to file
with open('data.json', 'w') as f:
    f.write(j)

Here’s the breakdown: We open the workbook, select a sheet, and set up a blank list to hold rows of data. Then we use Python’s islice tool to iterate through the rows in the sheet, skipping the header and stopping at the max_row property of the sheet.

For each row, we add each cell to a key/value pair in a dictionary, then add each dictionary to a list. Then we dump the list to JSON and write to a file.

Of course, a spreadsheet this simple doesn’t need a Python script to make its way to JSON. Just use Mr. Data Converter. But as soon as your JSON requirements gain complexity, the ability to use Python to nest nodes, build strings, and transform data on the fly makes this approach very appealing.

Like what you see here? Please check out my book, Practical SQL: A Beginner’s Guide to Storytelling With Data, available from No Starch Press and through all fine booksellers.

Generate JSON From SQL Using Python

Author’s note: This post has been updated from its original 2012 version to use PostgreSQL and the Python psycopg2 library. Also, if you like what you read here, please check out my book Practical SQL: A Beginner’s Guide to Storytelling with Data from No Starch Press.

Let’s say you want to generate a few hundred — or even a thousand — flat JSON files from a SQL database. Maybe you want to power a data visualization but have neither the time nor the desire to spin up a server to dynamically generate the data. So, you want flat files, each one small for quick loading. And a lot of them.

A few lines of Python is all you need.

I’ve gone this route for a few data-driven visuals, creating JSON files out of large database tables. Python works well for this, with its JSON encoder/decoder offering a flexible set of tools for converting Python objects to JSON. Let’s see how this works.
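As a small taste of that flexibility, Python’s built-in encoder can handle the mixed types that come back from database rows. Values such as dates aren’t JSON-serializable on their own, but the `default` hook of `json.dumps` lets you supply a fallback converter:

```python
import datetime
import json

# A row as it might come back from a database query: a tuple of mixed types.
row = (1, 'Samantha', 'Baker', datetime.date(2018, 5, 1))

# json.dumps can't serialize a date by itself, but the `default` hook
# lets you supply a fallback converter -- here, plain str().
print(json.dumps(row, default=str))
# prints: [1, "Samantha", "Baker", "2018-05-01"]
```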

The Data

I’m using PostgreSQL for this tutorial. Here’s a script that creates a table and fills it with two rows of data:

CREATE TABLE students_test (
    id integer,
    first_name text,
    last_name text,
    street text,
    city text,
    st text,
    zip text
);

INSERT INTO students_test VALUES
(1, 'Samantha', 'Baker', '9 Main St.', 'Hyde Park', 'NY', '12538'),
(2, 'Mark', 'Salomon', '3 Stavesacre Terrace', 'Dallas', 'TX', '75204');

The Script

Here’s an example Python script that generates two JSON files from that query. One file contains JSON row arrays, and the other has JSON key-value objects. Below, we’ll walk through it step by step.
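The full script isn’t reproduced in this excerpt. As a minimal sketch of the approach — with the database fetch via psycopg2 hedged into comments so the example runs standalone, and with illustrative output filenames — the core logic might look like this:

```python
import json

# Column names matching the students_test table.
COLUMNS = ['id', 'first_name', 'last_name', 'street', 'city', 'st', 'zip']

def rows_to_json(rows):
    """Return two JSON strings: one of row arrays, one of key/value objects."""
    arrays = json.dumps([list(row) for row in rows])
    objects = json.dumps([dict(zip(COLUMNS, row)) for row in rows])
    return arrays, objects

# In the real script, the rows come from PostgreSQL via psycopg2, e.g.:
#   conn = psycopg2.connect('dbname=your_database')
#   cur = conn.cursor()
#   cur.execute('SELECT * FROM students_test;')
#   rows = cur.fetchall()
# Hard-coded here so the sketch runs without a database:
rows = [
    (1, 'Samantha', 'Baker', '9 Main St.', 'Hyde Park', 'NY', '12538'),
    (2, 'Mark', 'Salomon', '3 Stavesacre Terrace', 'Dallas', 'TX', '75204'),
]

arrays_json, objects_json = rows_to_json(rows)

# Write each style to its own file (filenames are illustrative).
with open('students_arrays.json', 'w') as f:
    f.write(arrays_json)
with open('students_objects.json', 'w') as f:
    f.write(objects_json)
```

Separating the conversion into a function keeps the database access and the JSON formatting independent, which makes it easy to loop the same logic over hundreds of queries or tables.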


Setting up Python in Windows 10

Installing Python under Windows 10 is fairly easy as long as you set up your system environment correctly. Here’s your quick guide:

  1. Visit the official Python download page and grab the Windows installer for the latest version of Python 3.
  2. Navigate to the folder where you saved the installation file. Right-click the installer and select “Run as Administrator.” Click “Yes” when Windows asks if you want the program to make changes to your computer. 
  3. The next dialog asks whether you want to “Install Now” or “Customize Installation.” First, check the boxes that say “Install launcher for all users” and “Add Python 3.9 to PATH”. Then, choose “Customize Installation.”
  4. On the next screen, check all boxes under “Optional Features.” Click next.
  5. Under “Advanced Options,” set the location where you want to install Python. I like to use a location I can easily find, such as:

That will remind me that I’ve installed Python 3.9.

Also under “Advanced Options,” check the following boxes:

  • “Install for all users”
  • “Associate files with Python”
  • “Create shortcuts for installed applications”
  • “Add Python to environment variables”
  • “Precompile standard library”
  6. Click through the dialog to finish the installation and exit the installer.

Testing Your Install

Now, to launch the Python interpreter from the command line, you can open a command prompt (Start Menu > Windows System > Command Prompt) and type:

python
That will load the Python interpreter:

Python 3.9.1 (tags/v3.9.1:1e5d33e, Dec 7 2020, 17:08:21) [MSC v.1927 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.

Because of the settings you chose during the install, you can now run this interpreter — and, more important, a script — from any directory on your system.

Type exit() and hit Return to exit the interpreter and get back to a C: prompt.

Optional: Set up useful Python packages

Python 3 comes with the package installer pip already in place, which makes it super easy to add useful packages to your Python installation. The syntax is this (replace package_name_here with the name of a package you want to install):

pip install package_name_here

Let’s add a couple of must-have utilities for web scraping: Requests and BeautifulSoup. You can use pip to install both with one command:

pip install beautifulsoup4 requests

csvkit, which I covered here, is a great tool for dealing with comma-delimited text files. Add it:

pip install csvkit

You’re now set to get started using and learning Python under Windows 10. If you’re looking for a guide, start with the Official Python tutorial.

Like What You Read Here?
My book Practical SQL: A Beginner’s Guide to Storytelling with Data from No Starch Press offers a step-by-step guide to learning SQL and working with relational databases. Learn how to wrangle the everyday data you encounter to gain meaningful insights.

Analyzing Shapefile Data with PostgreSQL

This post is adapted from material in my book Practical SQL: A Beginner’s Guide to Storytelling with Data from No Starch Press.

Spend some time digging into geographic information systems (GIS) and soon enough you’ll encounter a shapefile. It’s a GIS file type developed by mapping software firm Esri for use in its popular ArcGIS platform. A shapefile contains the geometric information to describe a shape—a river, road, lake, or town boundary, for example—plus metadata about the shape, such as its name.

Because the shapefile has become a de facto standard for publishing GIS data, other applications and software libraries use shapefiles too, such as the open source QGIS.

While researching GIS topics for a chapter in my book, Practical SQL, I learned that it’s easy to import a shapefile into a PostGIS-enabled PostgreSQL database. The information that describes each shape is stored in a column of data type geometry, and so you can run spatial queries to calculate area, distances, intersections of objects, and more.

Here’s a quick exercise, adapted from the book.
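The exercise itself isn’t included in this excerpt. To give a flavor of it: after loading a shapefile with the shp2pgsql command-line tool that ships with PostGIS, you can query the resulting geometry column with spatial functions. The table and column names below are illustrative, not the book’s exact exercise:

```sql
-- Hypothetical example: a shapefile of county boundaries imported with
-- shp2pgsql into a table named counties_shp, whose shapes live in a
-- geometry column named geom. ST_Area() on a geography cast returns
-- square meters; dividing by 2589988.110336 converts to square miles.
SELECT name,
       ST_Area(geom::geography) / 2589988.110336 AS square_miles
FROM counties_shp
ORDER BY square_miles DESC
LIMIT 5;
```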


‘Practical SQL’ Book in Early Release

My first book, Practical SQL: A Beginner’s Guide to Storytelling with Data, is out in early release from No Starch Press starting today! If you pre-order from No Starch, you can download the Introduction and first four chapters now. You’ll get additional chapters regularly until the final version comes out in February 2018.

Practical SQL is for people who encounter data in their everyday lives and want to know how to analyze or transform it. The book covers real-world data and scenarios, from analyzing U.S. Census demographics to the duration of taxi rides in New York City. I’ve aimed the exercises at beginning SQL coders, and all the code and data can be downloaded via No Starch’s site.

The database you’ll use is the free, open-source PostgreSQL, along with the pgAdmin 4 graphical user interface. We cover all the basics you’ll find in standard ANSI SQL along with PostgreSQL-specific features such as full text search and GIS.

More to come as additional chapters hit early release!