Lisa says we should be keeping a "secret trash-blog" and that's what you've found here.
I learned a ton from chartsnthings (despite Kevin saying that he's too embarrassed to read it).
Things I wrote about things I made in 2017.
The idea is cribbed from an Upshot piece. We got a little crunched for time and weren't as fancy about customizing the feedback based on how the lines were drawn incorrectly. Still, I think the core idea of making people put their assumptions down and then overlaying reality on top of them works, and it increases memorability and engagement.
I'm a little surprised the form hasn't been done more; the chart-dragging code isn't super complex -- just 60 lines of d3 for an MVP. TODO EU you draw it
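The bookkeeping behind a draw-your-own-guess chart really is small. Here's a d3-free sketch of the core logic, with the rendering stripped out; all the names and shapes here are my own illustration, not the actual piece's code:

```javascript
// As the pointer drags across the chart, snap each sample to the nearest
// x bin so the guess is defined per data point, then score the finished
// guess against the real series (for the accuracy tracking).
function makeGuesser(xValues) {
  const guess = new Map()
  return {
    // called on every drag event with the pointer's chart coordinates
    record(x, y) {
      let nearest = xValues[0]
      for (const v of xValues) {
        if (Math.abs(v - x) < Math.abs(nearest - x)) nearest = v
      }
      guess.set(nearest, y)
    },
    isComplete: () => xValues.every(v => guess.has(v)),
    // mean absolute error against the true values
    score(truth) {
      const errs = xValues.map(v => Math.abs(guess.get(v) - truth.get(v)))
      return errs.reduce((a, b) => a + b, 0) / errs.length
    },
  }
}
```

In the real thing, `record` would be wired to a d3 drag handler and the reveal animation would only fire once `isComplete` is true.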
We've also been tracking the accuracy of people's responses, but need to come up with a more understandable form than the log scaled small multiple violin plot.
Bubbles done on a Saturday. Halo didn't work; Larry tweet with them in print
The House doesn't post live election results, so we had to send people into the chamber to count votes! (Rachel, Times Insider)
More congressional table journalism. Tried writing a tool to do this automatically from the Twitter and FB APIs, but videos are messy.
Jeremy got a hold of some interesting data this afternoon and we threw together a quick piece.
I made the map with d3 and a couple of canvas tricks from an old Bloomberg piece. There are too many points (80,000+!) to animate with svg, so I use two canvas layers. The top one is cleared every frame and each moving point is redrawn. The bottom layer only has points drawn on it and is never cleared, so it keeps a record of every location.
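The per-frame draw loop for the two-layer trick is roughly this (the context objects and sizes here are assumptions, a minimal sketch rather than the piece's actual code):

```javascript
// Two-layer canvas animation: the top context is wiped and redrawn every
// frame, while the bottom context is never cleared, so every point leaves
// a permanent trail of everywhere it has been.
function drawFrame(topCtx, bottomCtx, points, width, height) {
  topCtx.clearRect(0, 0, width, height)
  for (const p of points) {
    // draw each point on both layers; only the bottom copy persists
    for (const ctx of [topCtx, bottomCtx]) {
      ctx.beginPath()
      ctx.arc(p.x, p.y, 2, 0, 2 * Math.PI)
      ctx.fill()
    }
  }
}
```

Stacking the canvases with CSS and calling this from `requestAnimationFrame` gives you the moving dots plus the accumulated history for free.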
We briefly talked about showing time in different ways - a line chart or small multiple maps by hour - but there was a chunk of time missing. After publishing, I explored some alternative representations with d3-contour, which clearly show the higher rate of hacking in Eastern Europe and China. It's easier to use a nonlinear scale when you're programming at a higher level than drawing rectangles on top of each other.
Of course, number of hacked IPs per square mile is not the most meaningful thing in the world to show, so some kind of binning to compare the amount of hacking to the number of computers in different regions of the world would be an improvement.
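The normalization I have in mind is something like this: bin hacked-IP counts and computer counts onto the same grid, then show hacks per computer instead of raw density. The bin keys, data shapes, and cutoff are all placeholders for illustration:

```javascript
// Normalize hacked-IP counts by the number of computers in each bin,
// suppressing bins with too few computers to avoid noisy ratios.
function hackRate(hackedBins, computerBins, minComputers = 100) {
  const rate = new Map()
  for (const [bin, hacks] of hackedBins) {
    const computers = computerBins.get(bin) || 0
    if (computers >= minComputers) rate.set(bin, hacks / computers)
  }
  return rate
}
```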
I was curious about how the rest of the group stage would play out, so I threw together a visualization looking at how the result of each game will affect each team's chance of making it out of groups.
I started out thinking that I would follow the form of my previous crack at this problem. Since each team is only playing two games, you only get four groupings of points, which ends up not being as interesting - all of the interesting action happens in -other- teams' games. So I tried looking at the scenarios in a more structured way, shuffling the boxes to compare how the outcome of each game would change everyone's standings. I felt like I was learning interesting things from the different arrangements, but it was hard to see which games had the biggest impact, so I sorted each column individually instead of keeping each scenario in its own row. A little easier to read, but it loses some of the nice mathematical elegance of exploring a 6d hypercube.
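The hypercube framing is literal: with n remaining games, each binary win/loss outcome is one corner of an n-dimensional cube, so six games give 2^6 = 64 scenarios. A tiny sketch of the enumeration (the data shapes are made up for illustration):

```javascript
// Enumerate every possible outcome of the remaining games as a bitmask
// and compute each team's final win total in that scenario.
function enumerateScenarios(currentWins, games) {
  const scenarios = []
  for (let mask = 0; mask < 2 ** games.length; mask++) {
    const wins = {...currentWins}
    games.forEach(([home, away], i) => {
      // bit i of the mask decides who wins game i
      wins[mask & (1 << i) ? home : away]++
    })
    scenarios.push(wins)
  }
  return scenarios
}
```

From there, each team's "chance of making it out" is just the fraction of scenarios where they finish in a qualifying spot.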
Charts were still kinda weird so I stuck them inside of graph-scroll to make them more palatable. There's still way too much text in the piece, but I didn't give myself a ton of time for editing.
Edit: Darn it! I messed up the tie-breaking rules! I counted the total wins in games between the tied three teams instead of comparing the head to head records of each team. Sorry about the incorrect info everyone : (
If three or more teams are tied, the head-to-head record of all teams against each other team involved in the tiebreaker will be considered. If a single team owns a winning record (as defined as winning more than 50% of the games) against every other team in the tiebreaker, they are automatically granted the highest seed available in the tiebreaker, and a new tiebreaker is declared amongst the remaining teams.
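The rule quoted above can be sketched as code, which makes the difference from my buggy version concrete: a team only wins the tiebreaker outright if it has a winning head-to-head record against *every* other tied team, not just the most total wins among the tied teams. The `record(a, b)` interface here (returning a's and b's wins against each other) is my own, for illustration:

```javascript
// Head-to-head tiebreaker among three or more tied teams: a single team
// with a winning record against every other tied team takes the highest
// seed; otherwise the rule doesn't resolve anything and we return null.
function headToHeadWinner(tied, record) {
  const beats = (a, b) => {
    const [w, l] = record(a, b)
    return w > l // winning record: more than 50% of the games
  }
  const winners = tied.filter(t =>
    tied.every(other => other === t || beats(t, other)))
  return winners.length === 1 ? winners[0] : null
}
```

My original mistake was summing total wins in games between the tied teams, which can crown the wrong team whenever the head-to-head wins are spread unevenly.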
I started out just trying to show how much LeBron scored in each playoff game. A table of circles did a nice job of showing wins and losses, but area isn't easy to compare at such a small size. Using height instead helped, but I wasn't thrilled about information density; the pieces I'm proudest of are rich with detail that rewards exploration and careful reading.
I tried squeezing the graphic to fit other players in, but since the y position was used to encode both season and points scored, it wasn't possible to get very small. So I moved season to the x position, distinguishing it from series with white space, which opened up a lot of vertical room for other players. This came at the cost of being able to compare LeBron's record at different rounds of the playoffs across seasons — he hasn't lost a first round game for 5 years! — which I tried to alleviate by highlighting the result of the last series played in a season.
A little bit of polish made it prettier, but there were still significant problems with the form. The piece was about LeBron passing MJ's record, but because so much emphasis was placed on individual games, it was hard to compare the area of players' charts to see who had the most points. Tight on time, I fell back on the classic cumulative record chart. Updated with more transitions and scrolling!
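The cumulative record chart is the simplest form here computationally: a running total of points over a player's playoff games in order, plus (for this piece) the game where one player's line crosses another's record. A minimal sketch, with made-up numbers:

```javascript
// Running total of points across a player's playoff games, in order.
function cumulative(points) {
  let total = 0
  return points.map(p => (total += p))
}

// Index of the game where the running total first passes a record,
// or -1 if it never does.
function gamePassingRecord(points, record) {
  let total = 0
  for (let i = 0; i < points.length; i++) {
    total += points[i]
    if (total > record) return i
  }
  return -1
}
```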
After LeBron's 11 point game gave me some extra time, I went back to the idea of trying to show each playoff game. Stacked bars eliminated the huge amount of vertical variation between games, so players' points were more perceivable. To distinguish series, I switched color from win/loss to playoff depth. This made me a little sad — you can't trace your finger along and recall individual games like you could in the previous version — but it did a better job showing how players from different eras got their points. Finally, I used a small multiples layout instead of vertically sorting by time with a common x scale to fit more players in a smaller space.
Update to the post-election piece. Staggered transitions? Languishing code gets ugly.
This started as a quick follow up piece to reports that the DOJ was going to challenge affirmative action in college admissions. Historic demographic data on college admissions and young high school graduates wasn't easy to pull together quickly though, so we started to put together a bigger piece not tied to the news cycle.
The design of the top charts went through several iterations. We started out with slope charts showing how the student population of different demographics had changed at different types of schools over the last 35 years. Fitting the white percentages on common scales was tricky, so we switched to showing the difference between percent admissions and population.
I really wanted the gap charts to work - they show so many different stories with just a few lines! - so I spent some time tweaking the layout to squeeze them in. Distinguishing between positive and negative gaps wasn't intuitive though (even with particle animation), so we ended up using an even more slimmed down version of the slope charts.
If I had a little more time, I would have liked to try including more chart forms and alternative gap measurements (the ratio of percents isn't the same as the difference of percents!) by transitioning between them in a scrollytelling piece. That would have required a big rewrite of the copy and code, which didn't make sense to attempt while we were waiting for a break in the news to publish. Other things to explore: a wider selection of schools (we had a drop down that let you chart any of the ~4,000 schools, but weren't 100% confident in the data so it was cut) and graduation rates.
The trickiest part of this piece was finding the right data source. We wanted frequently updating, hourly data to show where the rain was falling the hardest and how much had fallen overall.
I started looking at the Global Precipitation Measurement Constellation, which has data on rainfall around the world in 30 minute slices released on a 6 hour lag. After spending a few hours figuring out how to open up netCDF files, I realized the data wasn't updated as regularly as I hoped. Coloring the data points by observation time showed the paths of satellites moving across the sky. Since not every point gets updated at the same time or on the same interval, calculating cumulative rainfall was trickier than just summing the hourly intervals - too tricky to do on deadline.
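To make the trickiness concrete: each grid point reports a rain rate at irregular times, so an accumulated total has to weight each reading by the hours since that particular point's previous reading, per point. A sketch under assumed data shapes (one reasonable convention would be weighting the current rate over the preceding gap, which is what this does):

```javascript
// observations: [{point, time (hours), rate (mm/hr)}, ...] sorted by time.
// Returns a Map from point id to accumulated rainfall, weighting each
// reading by the interval since that point's previous reading.
function accumulate(observations) {
  const lastTime = new Map()
  const total = new Map()
  for (const {point, time, rate} of observations) {
    const prev = lastTime.get(point)
    if (prev !== undefined) {
      total.set(point, (total.get(point) || 0) + rate * (time - prev))
    }
    lastTime.set(point, time)
  }
  return total
}
```

Naively summing the readings as if they were all hourly would over- or under-count every point that the satellites visited off-schedule.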
After spending most of a Saturday wandering down a dead end, I was ready to give up. Until Anjali found a NOAA ftp server with exactly the data I was looking for! The format was a bit strange - a shapefile with a grid of points showing calculated rainfall. I threw together a rough script to download the last few days of observations, combine them into a csv and plot the values.
Since both the cumulative and the hourly rainfall were interesting, I tried a bivariate color scale to trace the hurricane's path in red. You can see the eye of the hurricane as it lands! All the colors were a little too much to explain in a key though, so we switched to circles to show the current path of the hurricane. We also had to cut down on the spatial resolution to keep the file size under control - maybe a video would have been better, but I'm a big fan of tiny charts inside of tooltips.
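A bivariate color scale is just two thresholded scales picking a row and a column of a small palette grid, so the hurricane's path (high hourly rate) gets its own hue regardless of the accumulated total. The thresholds and palette below are placeholders, not the ones from the piece:

```javascript
// Map (cumulative, hourly) rainfall to a color in a 3x3 bivariate palette:
// cumulative picks the column, hourly rate picks the row.
function bivariate(cumulative, hourly) {
  const bin = (v, thresholds) => thresholds.filter(t => v >= t).length
  const col = bin(cumulative, [5, 15])   // total inches, say
  const row = bin(hourly, [0.5, 1.5])    // inches this hour, say
  const palette = [
    ['#eee', '#9bd', '#36a'],  // little rain right now
    ['#fc9', '#b8a', '#55a'],
    ['#e33', '#b22', '#717'],  // heavy rain right now: the red path
  ]
  return palette[row][col]
}
```

The downside we hit is exactly that the key has to explain nine colors at once, which is why the published version fell back to circles for the storm's current position.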
For more on all the technical details that went into this, check out the tutorial.
We exported data from the hurricane rescue map and animated the messages to get a sense of where people needed help over time.
With thousands of messages, there was way too much text to print everything. We manually looked through the messages to pull out interesting, representative snippets that conveyed what each of the dots popping up signified. Spacing them out semi-evenly during the animation was tricky when scrolling through thousands of rows in a spreadsheet, so I made a little chart to help see the timing.
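The timing problem reduces to a greedy pass over the messages in time order, keeping a snippet only if it comes a minimum gap after the last one kept, so labels don't pile up during fast stretches of the animation. A sketch with assumed data shapes:

```javascript
// Keep annotated messages that are at least `minGap` apart in time,
// walking them in chronological order.
function spaceSnippets(messages, minGap) {
  const kept = []
  for (const m of [...messages].sort((a, b) => a.time - b.time)) {
    if (!kept.length || m.time - kept[kept.length - 1].time >= minGap) {
      kept.push(m)
    }
  }
  return kept
}
```

A little dot plot of the kept timestamps makes it obvious at a glance where the annotations still bunch up.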
We had a brief moment of panic when we realized the basemaps (projected to Texas South in qgis) didn't line up with the dots (projected to Texas South in d3). Apparently, d3 assumes that the earth is a sphere while qgis uses a more accurate ellipsoid. Preprojecting the dots with mapshaper and adding them to the basemap to line up the scaling & translating fixed the problem after a couple of hours of head scratching. I'm slowly learning how to do GIS things.
Archie suggested a nice touch for animating dots that I'll be reusing. I've usually shown new data points entering by transitioning their size. Combined with all the text on the screen, this made lots of extra visual noise. Replacing the resizing with fading halos highlighted new points without nearly so much noise.
Trying to solve these problems on deadline and running low on sleep gives you tunnel vision. For a totally different perspective on the flooding in Port Arthur, read this: I downloaded an app. And suddenly, was part of the Cajun Navy.
Philip Klotzbach has been keeping a running tally of different records that Irma has broken.
To give his numbers a little bit of context, we started exploring different ways of representing Atlantic hurricanes. We tried a couple of different representations - scatterplots, maps, line charts. Since every chart had the same set of storms on it, I started playing around with ways of transitioning the charts into each other and we quickly decided to do a scrolly piece (we've done a lot of stacks in the last few weeks).
I took a couple of days the week after to rewrite it in regl. It includes my right-to-left time scale (so the westward paths don't invert) and a line-to-scatter transition that were just a little too confusing to publish.
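The right-to-left time scale is nothing exotic: it's a linear scale with a reversed range, so later dates land further left and westward storm tracks keep reading in their true direction. A d3-free sketch:

```javascript
// Linear scale with a reversed output range: domain[0] maps to range[1]
// and domain[1] maps to range[0], so time runs right to left.
function scaleReversed(domain, range) {
  const [d0, d1] = domain, [r0, r1] = range
  return t => r1 - ((t - d0) / (d1 - d0)) * (r1 - r0)
}
```

With d3 itself this is just a descending range, e.g. `d3.scaleTime().range([width, 0])`.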
Nadja did most of the work on this piece. To distinguish the year lines, I drew a slightly thicker black line behind each, a trick I picked up from a Bloomberg piece last year. To show the progression of time, we changed the color scale to encode years, which unfortunately made it quite similar to Bloomberg's Arctic Sea Ice chart.
I tried a couple of other approaches to differentiate the design a little: a radial chart, an area chart showing the max/min ice extent over time, a variation on that also showing the 25%/median/75% extent, and one with a gradient. I think the area charts do an effective job showing the trend, but a wiggling line chart is a more compelling form that doesn't require as much explanation, so we kept it.
For the second time in a row, my dream of a line to scatter plot transition was thwarted. The falling dots were a little too joyful for charts about the Arctic melting away. I couldn't totally get rid of them though; click the year button five times and scroll down.
/r/guns had a thread this morning about the type of weapon used in the Vegas shooting. We didn't have any concrete information about the weapons used yet, so we started to explore ways of helping a lay audience understand the noises different kinds of fire rates make.
Jon picked apart sound files to pick out when each gunshot occurred. We considered using the sound of each gunshot, but filtering out the background noise was too difficult, so I started toying around with webaudio.
Design based on one of my favorite graphics, which is a little sad.
After getting a couple of requests for an update to the 2016 version, I grabbed this year's data and threw it into the charts. The code wasn't quite as pretty as I remembered, but I think I've fixed the three-way tiebreaker bug that threw off the MSI chart—if not please let me know!
Hopefully next year I'll have a chance to explore another representation of this data. I'd like something that you can read top to bottom as matches progress. With our world cup coverage canceled there should be plenty of time!