Story hunting in birth, death data

Tracking the U.S. government’s annual count of births and deaths is one of my little obsessions. I keep annual totals in a spreadsheet and look forward to observing, with each new year of data, the trends.

By studying this most basic of demographics, we can learn much about a nation’s past—and its unfolding future.

For example, the CDC’s provisional 2018 U.S. birth data released in May 2019 showed that births in the U.S. dropped for a fourth year in a row, to the lowest level in 32 years. In a story for The Wall Street Journal, we encapsulated the trend and what demographers point to as likely causes: sharp declines in the teen birth rate, increased use of longer-acting contraceptives, and more women in the workforce delaying childbirth, among others.

It’s also useful, with demographic data (and other topics, such as the economy), to take a long-term view. For example, below are the annual numbers of births and deaths from 1933 to 2018, plotted via the Google Charts API:

These two fever lines contain several generational milestones worth watching. To me, the most interesting trend is the one developing at the far right of the chart.

At a time when the annual number of deaths continues to climb (give partial credit to the leading edge of the Baby Boomer generation passing age 70), the number of births in the U.S. is showing no signs of a rebound. In fact, the ratio of births to deaths, 1.37 in 2017, is lower than it was in 1936 at the height of the Great Depression.
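For readers who keep their own spreadsheet of totals, the ratio mentioned above is simple division. Here is a minimal sketch in Python. The two totals are rounded, approximate figures that I am assuming for illustration, and the function name is hypothetical; swap in the CDC's published counts before relying on the result.

```python
# Approximate 2017 totals, rounded for illustration only (assumed inputs;
# replace with the CDC's published counts for real analysis).
BIRTHS_2017 = 3_860_000
DEATHS_2017 = 2_810_000


def births_to_deaths_ratio(births: int, deaths: int) -> float:
    """Return births divided by deaths, rounded to two decimal places."""
    if deaths <= 0:
        raise ValueError("deaths must be a positive number")
    return round(births / deaths, 2)


ratio_2017 = births_to_deaths_ratio(BIRTHS_2017, DEATHS_2017)
print(f"Births per death, 2017: {ratio_2017}")
```

Running the same function over every year in a series makes it easy to spot when the ratio last sat this low.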

This trend has demographers sounding alarms and is the sort of story line that invites a deeper analysis. But if we step back and look at the longer-term picture, birth and death data holds even more story lines tied to generational development. Consider:

  • The first baby boomers — those born in 1946 — are turning 73 in 2019. The youngest are firmly in their mid-fifties. This cohort of 70 million-plus is either already retired or looking at doing so within a decade.
  • The Gen Xers who follow have hit middle age and are now in their late 30s to early 50s. (Gen X poster boy Eddie Vedder of Pearl Jam hit the half-century mark in 2014.)
  • The first of the Millennials — the “echo boomers” whose numbers peaked in 1990 — are in their mid to late 30s.
  • Generation Z, born starting in 1997, is just beginning to make its presence known in the workforce.
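The ages in the list above come from subtracting a birth year from the current year. A small sketch, using only the two birth years stated in the list (the helper name `age_in` is hypothetical):

```python
def age_in(year: int, birth_year: int) -> int:
    """Return the age a person born in birth_year reaches during year."""
    if birth_year > year:
        raise ValueError("birth_year cannot be later than year")
    return year - birth_year


# Birth years taken from the list above.
FIRST_BOOMER_BIRTH_YEAR = 1946
FIRST_GEN_Z_BIRTH_YEAR = 1997

print(age_in(2019, FIRST_BOOMER_BIRTH_YEAR))  # 73
print(age_in(2019, FIRST_GEN_Z_BIRTH_YEAR))   # 22
```

Change the first argument to see where each cohort stands in any other year.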

Each generation brings a new sensibility to the stages of life, and the relative size and makeup of each group — not to mention its cultural context — gives journalists plenty of opportunity for storytelling.

For example, much has been written about the bump of post-World War II babies marching closer to retirement (someday), Social Security, and the years where health care becomes a major concern. But what about the inevitable? Notice that the number of deaths in the U.S. has ticked up to about 2.8 million a year. Expect that to climb as Boomers head into the years where death rates rise dramatically. How will 4 million deaths annually affect the industry around end-of-life care, not to mention the business of funeral homes and cemetery plots?

These sorts of trends are slow-burning, but they reflect movements that exert hidden but massive force on our culture, much like the tides. The savvy data analyst keeps an eye on them not just for what they say this year but what they reveal over time.

Setting up Python in Windows 10

Installing Python under Windows 10 is fairly easy as long as you set up your system environment correctly. Below is my quick guide, which follows similar how-to’s I’ve written for installing Python under Windows 7 and under Windows 8.1.

Ready? Here’s your quick guide:

Set up Python on Windows 10

1. Visit the official Python download page and grab the Windows installer for the latest version of Python 3. One note:

  • Python is available in two versions: Python 2 and Python 3. For beginners, that may be confusing. In short, Python 3 is the current and future state of the language; Python 2 is a legacy version that still has a large base of users. Python 2 will reach its end of life in January 2020 and will get only bug fixes until then.

2. Right-click on the installer and select “Run as Administrator.” Click “Yes” when Windows asks if you want the program to make changes to your computer.

3. The next dialog asks whether you want to “Install Now” or “Customize Installation.” You want to “Customize Installation,” so click that.

4. On the next screen, check all boxes under “Optional Features.” Click “Next.”
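Once the installer has finished, one way to confirm which interpreter Windows actually runs is a short script like the sketch below. The file name `check_python.py` and the function name are my own placeholders, not part of the installer. Save the script, then run it from a command prompt with `python check_python.py`.

```python
import sys


def describe_interpreter() -> str:
    """Return the running interpreter's version and the path to its executable."""
    v = sys.version_info
    return f"Python {v.major}.{v.minor}.{v.micro} at {sys.executable}"


# Stop early if the command prompt found a legacy Python 2 install instead.
if sys.version_info.major < 3:
    raise SystemExit("This is Python 2; check which install is on your PATH.")

print(describe_interpreter())
```

If the path printed is not the folder you chose during setup, another Python install is probably ahead of it in your system environment.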

Analyzing Shapefile Data with PostgreSQL

This is one in a series of posts adapted from material in the book Practical SQL.

Spend some time digging into geographic information systems (GIS) and soon enough you’ll encounter a shapefile. It’s a GIS file type developed by mapping software firm Esri for use in its popular ArcGIS platform. A shapefile contains the geometric information to describe a shape—a river, road, lake, or town boundary, for example—plus metadata about the shape, such as its name.

Because the shapefile has become a de facto standard for publishing GIS data, other applications and software libraries use shapefiles too, such as the open source QGIS.

While researching GIS topics for a chapter in my book, Practical SQL, I learned that it’s easy to import a shapefile into a PostGIS-enabled PostgreSQL database. The information that describes each shape is stored in a column of data type geometry, and so you can run spatial queries to calculate area, distances, intersections of objects, and more.
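To give a feel for what an area calculation over a shape's coordinates involves, here is a rough sketch in plain Python using the shoelace formula on a flat plane. It is my own illustration, not the book's exercise and not how PostGIS is implemented; in the database you would call a spatial function such as `ST_Area` on the geometry column instead. The function name and sample coordinates are hypothetical.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def polygon_area(vertices: Sequence[Point]) -> float:
    """Planar area of a simple polygon via the shoelace formula.

    Vertices are (x, y) pairs listed in order around the boundary.
    The result is in squared coordinate units.
    """
    if len(vertices) < 3:
        raise ValueError("a polygon needs at least three vertices")
    total = 0.0
    for i, (x1, y1) in enumerate(vertices):
        # Pair each vertex with the next one, wrapping back to the start.
        x2, y2 = vertices[(i + 1) % len(vertices)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Hypothetical 4-by-3 rectangle standing in for a town boundary.
rectangle = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
print(polygon_area(rectangle))  # 12.0
```

Real boundaries are stored in longitude and latitude or a projected coordinate system, so the units of the answer depend on the data; that is one reason to let the database's spatial functions do the work.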

Here’s a quick exercise, adapted from the book. Continue…

DC PostgreSQL User Group, June 2018

Many thanks to Stephen Frost of Crunchy Data and Brad Sneade of LiveSafe for inviting me to speak about my book Practical SQL at the DC PostgreSQL User Group in early June! The night featured good food, fun conversations, and tales from me on how a journalist came to write a book about PostgreSQL and data analysis.

I shared tips I picked up along the way on using PostGIS, crosstabs, statistics functions, and Python within PostgreSQL—all topics I cover in the book.

The DC PostgreSQL User Group features a warm, inviting atmosphere. Check out its Meetup page and consider stopping in if you’re in the region.

A few tweets from the evening:

Next time, I will bring a bigger screen …

‘Practical SQL’ Available in Bookstores!

I’m thrilled to say that Practical SQL: A Beginner’s Guide to Storytelling with Data is officially released today! The title is published by No Starch Press and distributed via Penguin Random House, which means you can find it wherever books are sold.

From the description:

Practical SQL is an approachable and fast-paced guide to SQL (Structured Query Language), the standard programming language for defining, organizing, and exploring data in relational databases. The book focuses on using SQL to find the story your data tells, with the popular open-source database PostgreSQL and the pgAdmin interface as its primary tools.

Much of Practical SQL is based on the years I spent in newsrooms, including USA TODAY, poring over data sets in search of a story. SQL-driven databases were a central part of my toolkit, allowing me to organize, clean, and find meaning in data sets ranging from a handful of rows up to millions of records across dozens of tables. Today, the language is still widely used, powering thousands upon thousands of software applications.

Please check out the bundle from No Starch that includes a print copy plus ebook versions (PDF, .mobi, and .epub). No Starch Press is a thoughtful company that supports the open source software community, so you can feel good backing them. No Starch often runs promotions, so follow the company on Twitter or get their newsletter for deals.

Of course, you can also order the book through Amazon, and copies should be on shelves at Barnes & Noble or your favorite independent local bookstore.

In coming weeks, I’ll announce some special giveaways, share tips from the book, and book some in-store appearances. Stay tuned.