May 5, 2012
Sketches: How Mariano Rivera Compares to Baseball’s Best Closers

One of the best things about working at a newspaper is that you can come into work and do something different every day. Yesterday I had planned on spending the day doing some longer-term work in preparation for the Olympics and generally phoning it in Friday-style when a handful of us got assigned a daily – a graphic that looked back on Mariano Rivera’s career in light of his A.C.L. injury on Thursday. I was totally going to do an insane 3D-video that analyzed his cutter, but apparently someone did that already, so we went with charts instead. I looked at saves over time of top pitchers while my colleague Tom Giratikanon, who just started this week, compared Rivera across different categories.

We had a broad idea for what we were going for, which Matt Ericson sketched out by hand:

matte drawing

I scraped the data for the players with the most saves from baseball-reference.com (using an old template Shan Carter made using hpricot, which I learned is now “over”), then sketched the top 250 or so in R. This only takes a couple seconds to read about, but it was in fact at least two hours of screw-ups and swearing before I saw this chart:

first chart

Which eventually turned to this (we export odd colors to pick them up easily in Adobe Illustrator):

preprint

And the final print version:

Final print
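For anyone curious what the scraping step looks like in general, here is a minimal Python sketch that pulls rows out of an HTML table using only the standard library. It is not the hpricot template mentioned above; the sample table, its column names and the "Pitcher A" rows are all made up for illustration, and a real page would need its own handling.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_table(html):
    """Return one dict per body row, keyed by the header row's cell text."""
    parser = TableRows()
    parser.feed(html)
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Hypothetical markup, standing in for a real stats page.
SAMPLE = """
<table>
<tr><th>Player</th><th>Year</th><th>SV</th></tr>
<tr><td>Pitcher A</td><td>2001</td><td>30</td></tr>
<tr><td>Pitcher A</td><td>2002</td><td>41</td></tr>
</table>
"""

records = parse_table(SAMPLE)
```

Every value comes back as a string, so the saves column would still need converting to integers before charting.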

Online, we took basically the same approach, except we wanted to make them interactive, so Shan Carter pitched in some D3 expertise and Tom made his in Raphael, and six painless hours later, after all the programming, browser checking, conditional loading (which might not be a term) and Matt Ericson VPNing in from New Jersey to fix everything, we had a nice interactive, mostly mobile-friendly graphic:

web

Our approach wasn’t revolutionary or anything – in fact, Amanda and I used an identical charting form to chart home runs a couple years ago – but the package worked well, and if anything, Rivera stands out more in the saves chart than Barry Bonds does in the homers chart. And it was a promising start to the possibility of turning around this kind of work on deadline. 

April 30, 2012
Sketches from White House State Dinners

Elisabeth Bumiller’s recent profile of Jeremy Bernard, the first man and openly gay person to be the White House social secretary, used an interesting dataset: a list of everyone who has attended a state dinner in the Obama administration. I don’t have a ton of experience with Styles (or with “style”, for that matter), but this was a good chance to do something different with a new section. Except not that different, since charts are pretty much the only trick.

Alicia Parlapiano and I ended up using a sort of spiral plot, which we then just joined together in Illustrator. I remembered that we had used a similar technique in one of my first graphics at the Times to visualize which countries were good at which sports. (Then, as now, Amanda did the hard stuff.) So I ported the code from ActionScript to use for this, while also sizing for frequency of visits.

Here’s the sketch:

sketch

And how it looked in print:

print

Matt Ericson and Amanda Cox helped out on a late night to make a fun interactive version, perfect for gawking at all those people who were invited instead of you. 

Web v

April 23, 2012
White House Visits and Democratic Donors: Data Sketches and a Call for Votes

In last Sunday’s paper, Mike McIntire and Michael Luo published their investigation into White House visits by large Democratic donors. As simple as the chart was, we pondered many complex options before publishing it.

Early on, I thought some large-scale visualization of all major donors might be interesting, so I plotted a couple hundred of the top donors (based loosely on first and last names) with donations and WH visits on the same axis to see if there was any meaningful pattern. It looked like this:

data art throwaway

Although it looked sort of cool (in a meaningless data-art kind of way), nothing there illuminated the real focus of the story – namely, the possibility that large donors might get more access to the White House. Really, that was my only idea, and I was being annoying and complaining about it when Amanda Cox matter-of-factly told me to make a sketch that showed the percent chance of visiting the White House based on one’s total donation size. An hour later, I had this:

concept-1

We all liked it right away. Most of the remaining work went to matching the databases of donors and visitors as well as we could. That data work is important, but horribly unsexy and not really conducive to sketches. In general, we matched on middle initials where we could, and Matt Ericson helped me implement his handy Mr. People gem to get the various names parsed in a uniform fashion. Otherwise, all the data work was done in R, with a typically heavy-bordering-on-embarrassing level of assistance from Amanda. 
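The matching rule described above (same first and last name, with middle initials compared where both records have one) can be sketched in a few lines of Python. This is an illustration only, not the Mr. People gem or the R code used for the story; `name_key`, `same_person` and the suffix list are hypothetical.

```python
import re

# Hypothetical list of suffixes to ignore when comparing names.
SUFFIXES = {"jr", "sr", "ii", "iii", "iv"}


def name_key(raw):
    """Reduce a name to (first, middle_initial, last), all lowercased."""
    # Turn "Last, First Middle" into "First Middle Last".
    if "," in raw:
        last, rest = raw.split(",", 1)
        raw = f"{rest} {last}"
    cleaned = re.sub(r"[^a-z\s]", "", raw.lower())
    tokens = [t for t in cleaned.split() if t not in SUFFIXES]
    if len(tokens) < 2:
        return None  # not enough to match on
    middle = tokens[1][0] if len(tokens) > 2 else ""
    return (tokens[0], middle, tokens[-1])


def same_person(a, b):
    """True if first and last names agree and middle initials do not conflict."""
    ka, kb = name_key(a), name_key(b)
    if ka is None or kb is None:
        return False
    if (ka[0], ka[2]) != (kb[0], kb[2]):
        return False
    # Compare middle initials only when both records have one.
    return not (ka[1] and kb[1]) or ka[1] == kb[1]
```

A rule this loose will produce false matches on common names, which is one reason manual review of the matches still matters.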

Once we published, there was some discussion about the form of the chart on Twitter, and I admit it’s slightly odd. We had a lot of discussion about form on our end, too. So I present 4 options, each named for a delightful animal (we do a lot of animal-based filenames in the department, for some reason):

First, the “Blue Whale,” arguably the most straightforward, accessible approach. This form makes the trend the focus of the graphic:

Blue Whale

"Polar Bear" is perhaps the best chart for a more technical audience…

Polar bear

…but it might mean fewer people understand it. And is it me, or do the horizontal segments look like error margins instead of donation ranges? It’s not quite a scatterplot, since the percentages plotted represent “buckets” of donation sizes rather than individual points. 
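To make the “buckets” idea concrete, here is a minimal Python sketch of computing the percent of donors in each donation range who visited. It is not the R analysis behind the graphic; the function name, the bucket edges and the five sample donors are all invented for illustration.

```python
from bisect import bisect_right


def visit_rate_by_bucket(donors, edges):
    """Percent of donors who visited, per donation bucket.

    donors: iterable of (total_donated, visited) pairs.
    edges:  ascending lower bounds of the buckets, e.g. [0, 1000, 10000].
    Returns a list of (lower_bound, n_donors, pct_visited); pct is None
    for an empty bucket.
    """
    counts = [0] * len(edges)
    visits = [0] * len(edges)
    for amount, visited in donors:
        i = bisect_right(edges, amount) - 1  # bucket whose lower bound <= amount
        if i < 0:
            continue  # below the first bucket
        counts[i] += 1
        visits[i] += bool(visited)
    return [
        (lo, n, 100.0 * v / n if n else None)
        for lo, n, v in zip(edges, counts, visits)
    ]


# Made-up donors: (total donated in dollars, visited the White House?)
sample = [(500, False), (800, False), (1500, True), (2000, False), (50000, True)]
rates = visit_rate_by_bucket(sample, [0, 1000, 10000])
```

Each tuple in `rates` is one plotted value per bucket, which is why the chart is not quite a scatterplot of individual donors.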

A slightly different approach, the “Tree Lobster,” might indeed be the most accurate representation of this dataset:

tree lobster

But where’s the continuity? And seriously, how boring are bar charts? Also, labeling is hard on this thing, which is not a trivial problem.

Lastly, (Dull) Giraffe:

Dull giraffe

Seriously, this one is dull and maybe not worth discussing. Or is it? Discuss. Any discussion of these forms might happen on Twitter under the hashtag #chartingSpiritAnimals until I figure out how to put comments into this site, which, let’s face it, isn’t ever going to happen. 

If you’ve seen the graphic online or in print, you’ll know that we went with the Blue Whale. Aside from carrying the crucial Steve Duenes/Matt Ericson/Amanda Cox voting bloc (their decisions somehow track the majority vote 100% of the time), it felt suited for the data and the story it was published with.

print version

(It looks fine online too, but it’s sort of stranded on its own URL.)

Finally, as a disclaimer, the data plotted in these examples is slightly different than what went into print last week, as we did some manual tweaking on a handful of names, which moved a couple percentages up or down a tiny bit. 

Look forward to seeing if any data visualizers Tweet silly animal names this week. I’ll go first…

April 11, 2012
Mapping Process: Rick Santorum’s Race

We had a medium-sized graphic in today’s paper looking back on Rick Santorum’s campaign. The map was made in R using maptools, a package I find increasingly easy and fun to use. For me, the best part about visualizing data in R is that even when you screw things up pretty bad, the result usually looks pretty cool.

Anyway, the map is not revolutionary or anything, but it worked well to tell the story we wanted to tell. I took a screenshot of it at various points in the process (although a small army of people took care of most of the hard parts). Looked great online, too, thanks to that same small army.

Here, making sure I remember how to plot counties:

Map1

Sizing bubbles by margin of victory (too big, it turns out):

Map2

Getting the colors and sizing closer:

Map3
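A common way to keep bubbles from ballooning is to make the circle's area, rather than its radius, proportional to the value. The Python sketch below shows that scaling. Whether this map used exactly this rule is an assumption, and `bubble_radius` and its parameters are hypothetical names.

```python
import math


def bubble_radius(value, max_value, max_radius):
    """Radius for a circle whose area is proportional to |value|.

    The largest value (max_value) gets max_radius; everything else
    scales with the square root so that areas, not radii, compare.
    """
    if max_value <= 0:
        raise ValueError("max_value must be positive")
    return max_radius * math.sqrt(abs(value) / max_value)
```

With this rule a county with a quarter of the largest margin gets a circle with half the radius, so its area is a quarter as large.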

Exporting everything to a PDF so Illustrator can easily clean up the vector work:

Map4.1

In today’s paper:

Map5

Map6

March 10, 2012
Sketches from the Romney-Santorum ternary plot

This week the graphics department published a couple graphics based on exit poll data. The first one, made mostly by Shan Carter, was similar in many respects to the one he made in 2008 to show the differences in voters supporting Barack Obama and Hillary Clinton. (Known internally as the “delightful dancing boxes.”)

This view, which focused on Mitt Romney and Rick Santorum, was perfect for capturing the differences between their supporters, but we also wanted to show the influence of the other candidates, who have gotten substantial numbers of delegates.

Shan addressed this with a quick sketch:

shan sketch

Next they tried a ternary plot (I had to look it up myself), which is apparently beloved in geology and frequently used to describe soil samples. Anyway, I came on to the project late, after the concept had been more or less decided.

First, a sketch showing whom voters of a single demographic group supported in 7 different states. (Groups that supported Mitt Romney are farther to the right; groups for Santorum are farther to the left; groups supporting anyone else are toward the bottom.)

sketch1
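The layout described above is a standard ternary (barycentric) mapping: each group's three vote shares become one point inside a triangle. Here is a minimal Python sketch, assuming a unit-width triangle with Santorum at the top left, Romney at the top right and everyone else at the bottom. The exact orientation and scale of the published graphic are assumptions, and `ternary_xy` is a hypothetical name.

```python
import math

H = math.sqrt(3) / 2  # height of an equilateral triangle with side 1


def ternary_xy(romney, santorum, other):
    """Map three vote shares to (x, y) inside an inverted triangle.

    Corners (assumed): Santorum (0, H), Romney (1, H), other (0.5, 0).
    Shares are normalized, so percentages or raw counts both work.
    """
    total = romney + santorum + other
    if total <= 0:
        raise ValueError("shares must sum to a positive number")
    r, s, o = romney / total, santorum / total, other / total
    # Weighted average of the three corner positions.
    x = r * 1.0 + s * 0.0 + o * 0.5
    y = (r + s) * H
    return x, y
```

A group that split evenly three ways lands at the triangle's centroid, (0.5, 2H/3); a group that went entirely for one candidate lands on that candidate's corner.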

A different approach, and one we eventually went with, showed all the groups across a single state. This is for Iowa. 

iowa

Then we just tried to show this as best we could. One thought was to label the biggest groups and draw lines for the shift from another state. Here’s who Michigan voters supported, with the lines emphasizing the main groups’ change from New Hampshire.

lines

We really liked the lines in print, but once you animate the transitions you don’t really need them, since the motion has the same effect. (Plus I didn’t know how to program the lines anyway.)

Then we just had to build the thing, which we made using the D3 libraries. In Flash this thing would have been not so hard, and it was slow going at the beginning. But we’re as good as anyone at copy/pasting from demos, so it wasn’t too long before this:

demo1

became this:

final

Fun!

Also, we used this demo for hit detection. To make your own ternary plots, install the vcd package in R. Here’s the reference

February 19, 2012
Before and After: Analyzing ESPN “SportsCenter” Transcripts

A couple weeks ago, just in time for the Super Bowl, we published a couple fun graphics that used transcripts of ESPN’s “SportsCenter” as a way to look back on the NFL season. 

Originally, I had a concept in mind very similar to the one we (mostly Shan Carter) did in 2010 for the World Cup.

world cup

A colleague suggested instead using 3D players rather than photos, in part just to do something new and in part to give us a way to put more players on a field at one time. Here’s a progression of sketches on that concept:

Original whiteboard sketch:

bad drawing

A drawing for how it would fit on a print page:

print drawing

Graham Roberts’s proof of concept (with sizes semi-randomly assigned):

Proof of concept

We added some labels and charts to Graham’s final rendering:

final render

It ended up looking pretty cool and we were happy with it, but in the course of our analysis we noticed a lot of funny quotes and cliches from the announcers, and I couldn’t really find an interesting way to present them.

We made a ton of charts looking for keywords we wanted to inspect, which let us sift through the data a little faster (though eventually we would have to weed out non-NFL references by hand). This output showed charts for mentions of words, both cumulative and week-by-week, along with a list of the usage of each word in context:

report
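The kind of report described above (weekly mentions, a running total and each use in context) can be sketched in Python with the standard library. This is not the R code (tm, openNLP, Rstem) used for the project; `keyword_report`, the `width` parameter and the three sample transcripts are made up for illustration.

```python
import re
from collections import Counter
from itertools import accumulate


def keyword_report(transcripts, keyword, width=30):
    """Count mentions of a keyword by week and collect context snippets.

    transcripts: list of (week_number, text) pairs.
    Returns (weeks, weekly_counts, cumulative_counts, contexts), where
    contexts is a list of (week, snippet) with `width` characters of
    text on either side of each match.
    """
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    weekly = Counter()
    contexts = []
    for week, text in transcripts:
        for m in pattern.finditer(text):
            weekly[week] += 1
            start, end = max(0, m.start() - width), m.end() + width
            contexts.append((week, text[start:end].strip()))
    weeks = sorted({week for week, _ in transcripts})
    counts = [weekly[w] for w in weeks]  # weeks with no mention count as 0
    return weeks, counts, list(accumulate(counts)), contexts


# Invented transcripts, one per week.
sample = [
    (1, "Tebow time! Tebow wins."),
    (2, "No mention."),
    (3, "tebow again"),
]
weeks, counts, running, contexts = keyword_report(sample, "Tebow")
```

Matching on whole words alone will still pick up non-NFL uses of a term, so weeding by hand remains necessary.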

But presenting them was kind of a challenge. A straightforward approach (the only kind I know how to do, really) didn’t do much for anyone and took up a ton of space, so we dumped it:

dullard

We tried highlighting individual sentences (like “You’ll have more luck getting a ticktack out of the mouth of an alligator than getting information, especially about injuries, out of the mouth of Bill Belichick,” Aug. 10), but there wasn’t anything cohesive about a random list of quotes.

Then my boss said to write something original with the quotes as if I were writing for McSweeney’s. I said, great idea, imagining something like “Is It OK To Dunk On the President?”, one of my favorite McSweeney’s articles ever. Unfortunately, I couldn’t get it to work. Luckily, our intern, Ritchie King, who was already helping me with the analysis, could.

He turned a handful of silly cliches into a hilarious narrative about sports, war and Tim Tebow. We made his cliches piece the center of the graphic and had Sam Sifton read it online. (If you haven’t heard it yet, it’s worth a listen.)

tebow

Anyway, it was a fun project and proof that data is out there for almost any crazy idea. It also emphasized two important lessons. One, from Amanda Cox, is that you should make a hundred charts and pick the best one. We definitely did that – our project folder is full of boring analysis of various players and ideas. The second lesson is that the design and editing machine of the NYT graphics department can take a decent idea and turn it into something much better.

For the nerds out there, most of the analysis was done in R using the tm, openNLP and Rstem packages, but I can’t be sure which methods I used from which since Amanda just told me to import all of them.

4:50pm  |   URL: http://tmblr.co/ZpUrewGhFfjc
  
Filed under: Before and After

January 29, 2012
Before and After: Defense Puzzle Responses

A few weeks ago we published a “Defense Budget Puzzle” (a sequel of sorts to one we made in 2010 that dealt with the federal deficit) that focused on a series of choices that the Pentagon is making to cut its budget. This time, however, we stored the choices readers made when they “submitted” their plans. And last week, when Elisabeth Bumiller and Thom Shanker highlighted the Pentagon’s first major step toward that goal, we published the results of the more than 12,000 readers who submitted a plan.

We weren’t sure how to visualize the results to include a choice’s popularity and its cost. So we grouped them by category and explored a couple different presentations (both using very few lines of R). The first one used proportional circles, and I sort of liked it, but almost none of the smart people I showed it to did, which is pretty much the end of the story there.

Puzzle sketch 1

The second one was based on a simple chart we had run about ads the week before (the link to which I can’t seem to find).

Sketch2 Responses

It looked cleaner, fit better in the space and was pretty straightforward to make. So we went with it and built it. (Alas, my neon colors were changed to something more sensible.) Still looks pretty close to the final output, though.

Final responses

9:34pm  |   URL: http://tmblr.co/ZpUrewFbgU4p
  
Filed under: Before and After

January 15, 2012
Before, During and After: The Richest 1 Percent

This weekend the NYT published Shaila Dewan and Robert Gebeloff’s story about the richest 1 percent of Americans (a more diverse bunch than you’d think). The graphics department published a lot of work in print and online to accompany the article. Online, there was an interactive map that shows you where you and your income rank in 344 zones across the country and a treemap of what jobs the 1 percent hold. But the print version, made by Alicia DeSantis and Ford Fessenden, was really imaginative. (I’m writing this only as a fan - my involvement was limited to about 10 minutes of data monkeying.)

First, Alicia’s original sketch, written on some junk paper:

Alicia's original sketch

Originally, they wanted to export the “labels-map” using ArcMap, but to make it as easy as possible to style (it’s not so fun to try to dynamically color or manipulate strings in Arcview, as far as I know, anyway), I used R (specifically, the maptools library) to make a PDF, which takes only about 5 lines of code.

Here’s the original output as a proof of concept:

R output 1

Then, after a couple iterations, we did more styling on the programmatic side to cut down on manual labor.

R output 2

And the final product:

Final product, 1pct

This is a good example, I think, of using each medium to its best potential, meeting the design constraints of each. More and more, this means making totally separate versions of things – admittedly, it frequently takes twice the time and energy – but the mediums are just so different that what works well in one just doesn’t work well in another.

One thing I do wish we could do better online is integrating graphics in the context of stories and other assets – photos, videos, whatever. Unfortunately, we don’t get to make every web page by hand once per day like we do in print. 

spread

December 25, 2011
Before and After: The Path to 270

Last week we did a graphic showing potential paths to victory for Democrats in 2012. We only had a half day to make this, so we really scrambled to get it done, and the end product was once again pretty close to Matt Ericson’s original drawing.

Before:

Before

After:

after

8:57pm  |   URL: http://tmblr.co/ZpUrewDnXpjz
  
Filed under: Before and After

December 21, 2011
Before and After: Indonesian Paper

I was cleaning out my desk today and came across some sketches from some graphics we’ve published in the last year. Here’s a sketch drawn by Mike McIntire, an investigative reporter, as he described to me his story about the ties between a Tea Party group and an Asian Paper Company. (When someone needs to draw something to explain the story to you, it’s a good sign that the story needs a graphic.)

Before:

Before

After:

After-network

12:23am  |   URL: http://tmblr.co/ZpUrewDZHGpP
  
Filed under: Before and After 