Calculating Medians With SQL

Given that median is such a valuable statistical measure, it’s baffling that Microsoft’s SQL Server and other relational databases (MySQL, PostgreSQL) don’t have a built-in MEDIAN function. Well, this week, after working through a data set in SQL Server — and deciding I didn’t want to push the data into SPSS to find medians — I hit the web to find a T-SQL workaround.

I found a ton of solutions (some from people with no clue about the difference between median and average), but the one below — adapted from a post by Adam Machanic at sqlblog.com — was the best. It produces accurate results and is fairly speedy to boot.

Here’s an example. Consider this table with student grades from two courses:
.

ID Class FirstName Grade
1 Math Bob 65
2 Math Joe 72
3 Math Sally 95
4 Science Bob 65
5 Science Joe 81
6 Science Sally 81
7 Science Mike 72

We’d like to find the median grade in each class. Here’s the script:
Continue…

Essential SQL Queries (To Me, At Least)

For a session of five-minute “lightning talks” at this week’s 2010 Investigative Reporters and Editors conference in Phoenix, I contributed “Five Essential Queries for SQL Server.” Aside from the basic SELECT statement, these are five techniques that, at least for me, either solved a tricky problem or made coding life more efficient. They came to me after some trial and error or from using the coder’s best friend, Google.

I realize that many journalists prefer the open source (free) MySQL to Microsoft’s product, so I’ve replicated the five queries below in MySQL syntax. You can download script files for either syntax here:

Five essential queries (MS SQL Server)
Five essential queries (MySQL)

Feedback and your ideas are welcome. Here they are:

1. Create a temporary table with identity column.
Temp tables are handy for storing and manipulating data when you need a table but don’t want to make it part of your actual schema. In SQL Server, the table variable is held in memory and disappears once the query finishes executing.
 

DECLARE @tmp TABLE (
   id int identity(1,1), 
   FirstName varchar(50)
   )

INSERT INTO @tmp (FirstName) VALUES ('Bob')
INSERT INTO @tmp (FirstName) VALUES ('Joe')
INSERT INTO @tmp (FirstName) VALUES ('Sally')

SELECT * FROM @tmp

Continue…