‘Practical SQL’ Book in Early Release

My first book, Practical SQL: A Beginner’s Guide to Storytelling with Data, is out in early release from No Starch Press starting today! If you pre-order from No Starch, you can download the Introduction and first four chapters now. You’ll get additional chapters regularly until the final version comes out in February 2018.

Practical SQL is for people who encounter data in their everyday lives and want to know how to analyze or transform it. The book covers real-world data and scenarios, from analyzing U.S. Census demographics to the duration of taxi rides in New York City. I’ve aimed the exercises at beginning SQL coders, and all the code and data can be downloaded via No Starch’s site.

The database you’ll use is the free, open-source PostgreSQL, along with the pgAdmin 4 graphical user interface. We cover all the basics you’ll find in standard ANSI SQL along with PostgreSQL-specific features such as full-text search and GIS.

More to come as additional chapters hit early release!

NoVa-Py Talk: Building a Python Package

One of the most popular uses of the API for DocumentCloud, the document research/publishing platform where I work, is to bulk-upload hundreds or thousands of documents. People usually hack their own code together to do this, sometimes using the Python or Ruby wrappers for the API.

After talking with users and hearing their thoughts about the workflow — a desire to have a record of each file’s URL once uploaded, for example — I saw an opportunity to add some luxury to the process. A couple of months, a lot of research, and a few bruises later, I had my first Python package: pneumatic.

pneumatic does a few things to make life easier. It grabs information about each uploaded file and saves it in a SQLite database, which you can dump to CSV. It uses Python’s multiprocessing module to try to add some speed (recognizing that this is a network-bound task). And it scans all subfolders for files, which is handy when you obtain a collection of files organized that way.
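Here is a minimal sketch of that record-keeping idea, not pneumatic’s actual code. It walks a folder tree, logs each file in SQLite, and dumps the table to CSV. The function names, the uploads table, and its columns are assumptions made up for illustration, and the upload step itself is left out.

```python
import csv
import os
import sqlite3


def record_files(root_dir, db_path):
    """Walk root_dir and all subfolders, logging each file in SQLite."""
    conn = sqlite3.connect(db_path)
    # Hypothetical table layout; a real uploader would also store the URL
    conn.execute(
        'CREATE TABLE IF NOT EXISTS uploads (file_name TEXT, full_path TEXT)'
    )
    count = 0
    for folder, _subfolders, file_names in os.walk(root_dir):
        for file_name in sorted(file_names):
            conn.execute(
                'INSERT INTO uploads (file_name, full_path) VALUES (?, ?)',
                (file_name, os.path.join(folder, file_name)),
            )
            count += 1
    conn.commit()
    conn.close()
    return count


def dump_to_csv(db_path, csv_path):
    """Write every row of the uploads table to a CSV file with a header."""
    conn = sqlite3.connect(db_path)
    rows = conn.execute(
        'SELECT file_name, full_path FROM uploads ORDER BY full_path'
    ).fetchall()
    conn.close()
    with open(csv_path, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(['file_name', 'full_path'])
        writer.writerows(rows)
    return len(rows)
```

Calling record_files() on a folder and then dump_to_csv() on the resulting database would leave a spreadsheet-friendly record of everything the walk found.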

Learning about Python packaging was as much a part of the project as creating the library itself. The folks at the Northern Virginia Python Users Group were kind enough to invite me to share what I learned recently. Click through the title card to view the slides.



A Hierarchy of User Experience Needs

Remember Abraham Maslow’s hierarchy of needs? In the 1940s, the Brooklyn-born psychologist plotted categories of human motivation on a continuum. He theorized that our first desires are to secure basic physical needs, such as food and shelter. Once we satisfy those, we’re free to seek higher things. Think love and belonging, or well-being.

The people who use software products also have needs, and they too come in something of a continuum. This fact’s not often well-articulated by users, but it shows up in bug reports, support requests and casual conversations. Let’s call it a Hierarchy of User Experience Needs. Consider these statements:

  • “Tried logging in and got an error message.”
  • “An option to customize menu colors would be great.”
  • “How are you planning to handle integration with [Hot New Service X]?”

If we pay attention, we can place each of those comments along a continuum. Some reflect needs that fall toward the basic and immediate; others are more aesthetic and emotional. Over time, if feedback bunches up at any point, that’s a cue to address that part of your product with an appropriate response.

As a proposal (while not ignoring all the other takes on the idea), let’s explore a hierarchy that plots these needs:


Here it is in detail, starting from the bottom:

1. Invitation: The Product Welcomes.

Let’s start at the life-or-death end of the hierarchy. Before users sign up, before they download a single file or enter a credit card, they need a gut-level feeling about your product: assurance that it’s going to work with them and for them — not against them.

They’ll know just by exploring. Is the interface simple or jumbled? Is there ample documentation? Is it written for brainiacs or for everyone? Does the cost reflect the value of the problem your product solves? Will you give help?

Hints that your product fails here include a level of engagement that falls off sharply from initial interest. If something’s turning people away before they really get started, dig to find out what that is.

2. Function: The Product Works, or Else.

Available, functioning software also resides at the food-and-shelter end of user needs. Throw a roadblock here — a site that won’t load, a procedure that hangs, results that are inaccurate — and not much else matters. Users will exit towards an alternative.

Feedback at this point on the continuum usually carries a desperate tone — some form of, “Help!” But be on the lookout for low-grade annoyances that surface in conversations such as complaints that a search didn’t return expected results or that it’s nigh impossible to access a menu. Not all product failures are equivalently spectacular, but it’s perilous to ignore even small ones.

3. Speed: It’s a Need.

It’s amazing that a spinning beach ball can become annoying after about 4 seconds, but it does. The best of the Internet is fast, and that’s the bar users expect: that a product will do its thing Right Away. So, if your mapping application’s data layer takes forever to render on top of base tiles, that’s disappointing. If a search takes 20 seconds to return results, that’s a lifetime compared with the expectation Google search delivers.

Optimization is hard, but at this side of the Hierarchy of User Experience Needs, where fulfillment is still more life-and-death than nicety, it’s worth watching. Recently, a fast food place opened in our neighborhood, and after two tries in which a drive-thru purchase took 20 minutes, we’ve started saying, “Let’s find something else.” Don’t let your users do the same.

4. Beauty: The Product is Pleasing.

Once you satisfy a user’s basic needs for fast, functional, easy-to-grok software, they can look for the product to meet higher-order, less-tangible needs. One is a simple desire for the product to delight. More than the simple friendliness expressed in the basic Invitation need, at this point in the hierarchy users want to have their senses tingled.

This means your product pays attention to how it looks, how it interacts and communicates, and even how it sounds. If possible, it should evoke some sense of wonder. Nothing should be jarring or too raw (unless that’s part of the appeal). The team message client Slack is an exemplary model here.

5. Partnership: We’re In this Together.

Heading towards even-higher needs, the next is a sense that the team behind the product is at the user’s side, sharing their concerns and invested in their success. Call it empathy.

This need’s expressed less in complaints and more in desire. When users talk about wanting to hear from the team, whether in a forum, on Twitter or via a blog post, that’s a tip they’re seeking connection. Even users of a wildly successful product will chafe against lengthy absences of communication from the project’s owner.

Miss meeting this need and you miss an opportunity for building loyalty and lasting bonds.

6. Vision: We’ll Grow Together.

This final need is about hope. No one using your product really wants to be in the same place next year, doing the same things and trying to solve the same problems. (Even if they think they do, they don’t.) In this top-most part of the Hierarchy of User Experience Needs, users join their personal aspirations with those of the product in hope of growing together.

To meet this, product owners must communicate vision. Where is the product going? How will it anticipate next year’s needs? How has it done so in the past?

Vision is as critical to retaining customers as the sense of Partnership. Fulfill it, and your users will have confidence to renew subscriptions, suggest your product to colleagues, and resist the temptation to defect to some other shiny object.

Your thoughts?

Add your ideas and share what you’ve learned!

Nieman Lab’s 2016 Product Predictions

This past December, the Nieman Lab rounded up more than 100 predictions for 2016 about the news business — its people, its economics and, of course, the craft of storytelling itself. As the editors described it:

Each year, we ask some of the smartest people in journalism and digital media what they think is coming in the next 12 months. Here’s what they had to say.

Of the pieces, a few are particularly relevant to those of us working on products related to content creation and publishing. For example, how can we support the unique needs of small, local media outlets? How should our technology grow in light of new publishing platforms such as Facebook’s Instant Articles? And if the homepage is less relevant than ever, how can our products increase engagement on article pages?

Lots to ponder as we face the unwritten page that is 2016. Some highlights from Nieman’s report:


Rise of the platform: “The coming year will see more companies abandon websites altogether to save costs, pouring all resources into media creation and leaving presentation and distribution entirely to outside platforms. … The hot new job next year in distributed media companies will be platform partnerships manager — the person who acts as the interface between editorial, technology, and outside partners.”

Platforms, take 2: “By 2016, most content will be consumed … on other people’s platforms. … It’s early still, but (a) our content is being viewed at a higher volume than before, and (b) we’re monetizing those views at a higher rate.”

Frictionless video: “In 2016, push video will get smarter. It will know what to suggest and when we’re most likely to watch and participate.”

Botification: “It will be the use of bots in news that will be a major development in 2016. … Next year, we’ll start seeing the emergence of fully fledged AI personal assistants.”


There’s revenue in quality: “People — lots of people, of all ages, incomes, backgrounds, and nationalities — will pay money for good content.”

Monetizing your tribe: “There’s a far more important game to be played in 2016 underpinning all the forward-looking business models and technologies: Everyone will be trying to win over and monetize a loyal base of readers or viewers.”

A better metric of success: “In 2016, data will be used to define a metric that publishers and platforms can both stand behind and use to measure success much more meaningfully.”

Small-publisher success: “In 2016, we’ll move beyond isolated examples to see a growing wave of small independent publishers launching publications and starting to achieve success.”

Local media 1: “As yesterday’s local media companies run toward digital models pioneered by Vox or The New York Times, they’re actually accelerating their death. But make no mistake, the financial rewards for building tomorrow’s Gannett, Lee, Tribune, and McClatchy are massive. … You’ll see BuzzFeed, Vox, and Vice acquire and actually listen to local media companies that are (a) growing and (b) profitable.”

Local media 2: “More and more, we’re seeing local news publishers putting the local back into news operations. Whether they’re members of Local Independent Online News Publishers or running other sorts of outlets, publishers who actually live in the communities their news organizations cover are showing the road to healthy, sustainable, and effective local news.”

Local media 3: “Journalism that truly serves and invests in the public creates a virtuous feedback loop in which the public will invest in and protect the journalism.”


Podcasting matures: “As streaming on-demand content grows, we’ll see the rise of audio discovery outside of a dedicated app, and integrated into the ways we already share content on social streams, the open web, and on mobile.”

Static is the new interactive: “We’ve already started to realize that not everything needs to be interactive. In 2016, we’ll see the rise of static graphics as news organizations seek to create good mobile experiences.”

More VR: “People will be scanning their own environments with their phones, and virtually, instantaneously, hanging out in each others’ spaces. The scenes of major news stories will also be scanned, and audiences will be walking around ‘inside’ them rather than watching them on a screen.”

The article is readers’ point of entry: “It’s no longer the home page. … At this year’s Columbia University School of Journalism conference Journalism + Silicon Valley, a participant asked Mark Thompson, CEO of The New York Times Co., to name his greatest challenge. His answer? ‘How do we get a person who reads one news story to read a second story in The New York Times?’ ”

Niche topic sites: “2016 will be a year of growth for subject-specific publishers, as broader audiences use single-topic sites for deep expertise on the issues that matter most to their families. Verticals are growing up.”

And, because we need some levity …

Realizing the Internet’s promise: “When 2016 draws to a close, we’ll look out at each other across that pulsing quasar of perfect connected knowledge and creative citizenry, and we’ll smile.”

Today’s weather in my inbox, via Python

In the category of “potentially useful but mostly just a learning exercise,” here’s a Python script that emails me the local weather report twice a day. I loaded it on a Raspberry Pi my family gave me as a gift last year, set up a cron task, and now each day when I wake up I have a forecast waiting in my inbox. Makes me feel special!

The script — compatible with Python 3.6 and Python 2.7 — uses the awesome Requests library to fetch two endpoints from the Weather Underground API. One provides a forecast, and the other offers a summary of yesterday’s weather. For emailing, it uses the standard Python smtplib.

The code’s available on GitHub, so fork it and make it your own. You’ll need to have the Requests and simplejson libraries installed. Contributions are welcome!

Here’s a quick overview on how to set it up:

First, you’ll need to sign up for a Weather Underground API key. The free developer level has more than enough calls per day for this app, so choose that unless you plan to obsess about the weather in an oversized manner.

The API key and your email parameters go into a local_settings.py file:

mail_settings = {
    'address': 'anyone@example.com',
    'pw': 'your-email-password',
    'smtp': 'post.example.com',
    'from': 'Mr. Weather Robot'
}

send_to_addresses = ['someone@example.com', 'someone_else@example.com']

api_key = 'your-wunderground-api-key'

Then, here’s the wx-mail.py file:

import datetime
import smtplib
import requests
import simplejson as json
from email.mime.text import MIMEText
from local_settings import mail_settings, send_to_addresses, api_key

def fetch_forecast(api_key, request_type):
    mail_url = 'http://api.wunderground.com/api/' + api_key + '/' +\
               request_type + '/forecast/q/PA/Reading.json'
    r = requests.get(mail_url)
    j = json.loads(r.text)
    return j

def build_html(forecast_json, yesterday_json):
    # build some HTML snippets to open and close this email
    html_open = """\
    <html>
      <head></head>
      <body>
    """
    html_close = """\
      </body>
    </html>
    """

    # let's now build the HTML body contents
    wxdate = forecast_json['forecast']['txt_forecast']['date']
    mail_text = '<h3>Hello, DeBarros family!</h3><p>Here is the ' +\
                'Leesburg, Va., weather forecast as of ' + wxdate + '</p>'
    forecast_length = len(forecast_json['forecast']['txt_forecast']['forecastday']) - 1

    # looping through the JSON object
    for i in range(0, forecast_length):
        cast = '<p><b>' +\
            forecast_json['forecast']['txt_forecast']['forecastday'][i]['title'] +\
            '</b>: ' +\
            forecast_json['forecast']['txt_forecast']['forecastday'][i]['fcttext'] +\
            '</p>'
        mail_text += cast

    # Now, for yesterday's weather summary ...
    # We'll pull the date and some weather data from the summary API endpoint
    summary_date = yesterday_json['history']['dailysummary'][0]['date']['pretty']

    high_low_temp = yesterday_json['history']['dailysummary'][0]['maxtempi'] +\
        ' / ' +\
        yesterday_json['history']['dailysummary'][0]['mintempi'] +\
        ' degrees Fahrenheit'

    max_min_humid = yesterday_json['history']['dailysummary'][0]['maxhumidity'] +\
        '% / ' +\
        yesterday_json['history']['dailysummary'][0]['minhumidity'] + '%'

    precipitation = yesterday_json['history']['dailysummary'][0]['precipi'] +\
        ' inches'

    max_wind_speed = yesterday_json['history']['dailysummary'][0]['maxwspdi'] +\
        ' mph'

    yesterday_html = """\
    <h3>Here's yesterday's weather summary:</h3>
    <p><b>High/low temperature: </b>""" + high_low_temp + '</p>' +\
    '<p><b>Max/min humidity: </b>' + max_min_humid + '</p>' +\
    '<p><b>Precipitation: </b>' + precipitation + '</p>' +\
    '<p><b>Maximum wind speed: </b>' + max_wind_speed + '</p>'

    # put it all together
    html_body = html_open + mail_text + yesterday_html + html_close
    return html_body

def send_email(mail_text):
    # Set the current time and add that to the message subject
    cur_date = datetime.date.today().strftime('%B %d, %Y')
    subject = 'Family forecast for ' + cur_date

    # Set up the message subject, etc. Then send it.
    COMMASPACE = ', '

    msg = MIMEText(mail_text, 'html')
    msg['Subject'] = subject
    msg['From'] = mail_settings['from']
    msg['To'] = COMMASPACE.join(send_to_addresses)

    server = smtplib.SMTP(mail_settings['smtp'], 25)
    server.login(mail_settings['address'], mail_settings['pw'])
    server.sendmail(mail_settings['address'], send_to_addresses,
                    msg.as_string())
    server.quit()

if __name__ == "__main__":
    forecast_json = fetch_forecast(api_key, 'forecast')
    yesterday_json = fetch_forecast(api_key, 'yesterday')
    mail_text = build_html(forecast_json, yesterday_json)
    send_email(mail_text)

The code’s straightforward, but a few things to note:

  • The Python standard smtplib provides all you need for sending the email. Check the official docs for examples.
  • I’ve gotten into the habit of using the simplejson library for wrangling API response objects, but the standard Python json library works just as well.
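Both notes can be tried out without touching the network. This hedged sketch falls back to the standard json module when simplejson isn’t installed, then builds (but does not send) an HTML message. The sample JSON string, the subject line, and the build_message helper are made up for illustration.

```python
from email.mime.text import MIMEText

try:
    import simplejson as json  # optional third-party library
except ImportError:
    import json  # standard library fallback with the same loads() call


def build_message(html_body, subject, sender, recipients):
    """Return an HTML email message ready to hand to smtplib."""
    msg = MIMEText(html_body, 'html')
    msg['Subject'] = subject
    msg['From'] = sender
    msg['To'] = ', '.join(recipients)
    return msg


# A made-up response fragment, shaped like the forecast data used above
sample = '{"forecast": {"txt_forecast": {"date": "6:00 AM EDT"}}}'
parsed = json.loads(sample)
wxdate = parsed['forecast']['txt_forecast']['date']

message = build_message(
    '<p>Forecast as of ' + wxdate + '</p>',
    'Family forecast',
    'Mr. Weather Robot',
    ['someone@example.com', 'someone_else@example.com'],
)
```

Passing message.as_string() to an SMTP connection’s sendmail() call would complete the send.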

Have fun, and may all your coding days be sunny and warm.