Data storytelling and storytelling with data: is there a difference? A fellow conference attendee posed this question to me during last month’s Tapestry Conference in Annapolis. After thinking for a moment, I responded that for me, the difference lay in the process.
I envision data storytelling as when you’re looking at data and want to know, “What is this data trying to tell me?” Storytelling with data for me is where you have a story in mind and seek data to substantiate it. Data storytelling feels more quantitative; I imagine needing to collect, clean, manipulate, and analyze the data before crafting the story. Storytelling with data, however, feels more fluid, with the story and the data coming together concurrently.
I acknowledge this may be complete bunk, and I welcome thoughts and critiques from others. At the end of the day, defining data storytelling may be less important than actually doing it. But after attending the second Tapestry Conference on data storytelling, I’m left itching for a framework, or at least continued conversation. Data storytelling is a beautiful concept, applicable across many domains: journalism, academia, technology development, business, advocacy, public policy. It’s also in its infancy, and defining it might force structure on a realm that needs exploration and freedom.
That doesn’t mean we should avoid descriptions of what constitutes good data storytelling. Journalist and infographics professor Alberto Cairo offered a starting point in his keynote (slides) on visualization for communication as “the insightful art.” Visualization for general audiences, he said, should be:
1. Truthful: Present your best understanding of the truth.
2. Functional: Choose perceptual elements (e.g., color, font) that help your audience understand what you want to convey.
3. Beautiful: Please the senses of your reader.
4. Insightful: Help your reader understand the main point; explain what is surprising, relevant, or interesting about the data.
5. Enlightening: Change someone’s mind for the better.
Personally, I would put “beautiful” last, not because it’s unimportant, but because for me, conveying information forms the core of data storytelling.
Cairo encouraged us to be evidence-driven communicators, not activists. This is 100 percent true for journalists. However, activists who want to tell their story should feel welcome to adopt the principles of data storytelling. I agree that infographics should not massage data or mislead readers. But, as my aforementioned definition suggests, it’s possible for the story to precede the data.
Jock Mackinlay, researcher and Tableau Software VP, offered one check against misguided data storytelling: provide raw data with visualizations. Doing so can hook readers into your visualization, letting them explore it for themselves. It also validates the author and can promote conversation, enabling others to carry analysis further.
The importance of data literacy underpinned both presentations. Readers are going to see infographics from journalists and marketers, and they need to know how to differentiate them. Raw data provides the audience with a powerful tool, but only if the audience itself feels capable and empowered to take that data and run with it. Plenty of people do feel this way, and I hope that future Tapestry conferences will help us think of ways to build data literacy in our schools and workplaces so that even more people do.
Stay tuned for more Tapestry Conference posts.
Learn to code? The question populated headlines this year. The Atlantic’s Olga Khazan set journalists a-Twitter after pronouncing that journalism schools should not require students to “learn code.” She insisted her opposition extended to HTML and CSS, not data journalism, data analysis, or data visualization, making her post’s headline feel misleading given that those can require learning code.
Sean Mussenden of the American Journalism Review concisely expressed what I thought when reading Khazan’s piece. I fact-checked AJR articles in college, and tricking my brain into thinking I was fact-checking is the only thing that saved me from hurling a rock at my laptop while coding.
Four months ago I was a coding newbie. My crowning achievement was a Python script that determined whether a given string of text was of Tweet-able length. By December, I had cleaned and manipulated datasets in Python, created heat maps and scree plots in R, designed map visualizations in D3, and analyzed my Facebook and Twitter data. I needed the structure and graded homework assignments that graduate school courses in data manipulation, exploratory data analysis, and information visualization offered, but I wouldn’t have survived those classes without the wealth of resources on the Interwebz. These lessons I absorbed may help you meet your code-learning resolutions.
1. Find a tutorial that works for you
Free online tutorials abound. Shop around, take what works, and leave what doesn’t. I’m not suggesting giving up at the first sign of difficulty. Coding is hard, frustrating, tedious, and time-consuming. But it won’t always be. Rewards, even just the personal satisfaction of overcoming challenges, await those patient enough to try. Sink your time into a tutorial that fits your learning style and avoid wasting time on one that doesn’t. Last January I enrolled in a Coursera class on data analysis in R. The description said a programming background was helpful but not required. A week into the course, it was clear: a programming background was definitely required. I couldn’t afford to spend 10 hours on assignments I didn’t understand, so I stopped.
2. Google is your friend
Tutorials won’t give you all the information you need, but Google can help. Paste your error message into the search bar to get a sense of what went wrong. Or (and I found this more effective) type what you’re trying to accomplish. Even the craziest phrase (“after splitting elements in lines in python, keep elements together in for loop”) will get you somewhere. People often share snippets of code on forums like Stack Overflow. Test their code on your machine and see what happens. Debugging is a random walk, requiring you to chase links and try several strategies before that glorious moment when the code finally listens to you. Don’t worry. You’re learning even when you’re doing it wrong.
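To make that search phrase concrete, here is a minimal sketch of the kind of snippet such a query might turn up. It is an illustration only: the sample data, the comma separator, and the function name `split_lines` are all assumptions, not code from any particular forum answer.

```python
# Hypothetical sample data: each line holds a name and a city, separated by a comma.
lines = ["Ada, London", "Grace, New York", "Alan, Manchester"]


def split_lines(lines, sep=","):
    """Split each line on sep, keeping that line's pieces together in one list."""
    rows = []
    for line in lines:
        # Split the line, then trim stray whitespace from each piece.
        fields = [field.strip() for field in line.split(sep)]
        rows.append(fields)
    return rows


# Each row stays intact, so the loop can unpack both pieces at once.
for name, city in split_lines(lines):
    print(name, "lives in", city)
```

The point is less the code itself than the habit: run a borrowed snippet on a tiny input you understand, then adapt it to your own data.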
3. But people are your best friend
I tweeted my frustration with the Coursera class last January. To my surprise, digital storyteller Amanda Hickman responded to my tweets and set up a Tumblr to walk me through the basics of RStudio. People want to help, and their help will get you through the frustration of learning to code. This semester I saw the graduate student instructor nearly every week during office hours, bringing him the specific or conceptual questions that tutorials and Google couldn’t explain to me. When you get stuck, reach out. Ask that cousin who works in IT to help you debug something. Post on social media that you’re looking for help. Use Meetup to find fellow coders with whom you can meet face-to-face. Find groups like PyLadies (for Python) and go to their meetings. Don’t let impostor syndrome, or the feeling that you’re not really a “coder,” stop you. You are a coder.
4. Take breaks
My first coding professor said, “Don’t spend hours on a coding problem. Take a break and return when your mind is fresh.” LISTEN TO HIM. More than once, I sank six or seven hours into trying to debug code, only to collapse into bed and then solve the problem within an hour the next morning. When coding threatens to consume your life (or unleash dormant violent tendencies), say, “Eff this for now” and take a well-deserved break.
What do journalists, surgery center developers, professors, small business owners, and researchers have in common? All take in a lot of data and must translate and present that information to others in a compelling manner. Also, all of them attended Tapestry, the inaugural conference on data storytelling. The event offered a valuable opportunity to connect with all sorts of people who seek to shape this nascent field of data storytelling.
Wish you were there? Check out this Storify I created and learn more: http://storify.com/PriyaKumar/tapestry.
Journalism connects people with information, and data storytelling connects narratives to individuals. The former is what kept me a newspaper subscriber for three years, something I explore in this post on “lock-in” and resistance to adapting to new technologies. The latter is illustrated in examples such as the Washington Post’s Fiscal Cliff calculator and the site Syria Deeply. Why is engagement through, for example, interactive graphics and participatory journalism important? What does “engagement” even mean?
Poynter’s Matt Thompson included it in the “Buzzwurgatory,” a collection of those vague terms we put in headlines so the Google bots will find our page. Engagement may euphemistically refer to attracting more readers, he writes, but it should focus on incorporating the audience into our work. This helps journalists ensure they’re covering the “right” stories. The Atlantic’s Ta-Nehisi Coates recently tweeted a link to a 1996 James Fallows piece, “Why Americans Hate the Media.” Fallows describes how journalists consistently focus on the political horse race at the expense of digging into the policy questions that actually impact individuals. He writes:
“When ordinary citizens have a chance to pose questions to political leaders, they rarely ask about the game of politics. They want to know how the reality of politics will affect them—through taxes, programs, scholarship funds, wars.”
Data storytelling does that. The tools that exist to display information, from data visualizations to interactive databases to geocoded maps, encourage readers to explore. Take the Chicago Tribune’s Illinois School Report Card database, which enables readers to filter data by school name or address and also to examine trends among the whole system. Articles offer context to the data. The Guardian’s Datastore encourages Flickr users to submit photos of their own mashups that use Datablog information. And Syria Deeply calls on the audience to contribute information about a conflict that few journalists can access:
“Single-topic platforms, such as Syria Deeply, are not made up of a team of journalists and editors reporting to a passive audience. Instead, they embrace participatory journalism in which civilian journalists can collaborate and contribute to the news process with personal stories and firsthand accounts. As a platform, we are then able to aggregate and curate the most useful content on that topic into one space.”
Nonprofit organizations also find data storytelling a helpful tool to evaluate their services and ensure their funding provides the most benefit for people. GlobalGiving collects stories from individuals in Kenya and Uganda, feeds them into a software tool called Sensemaker, and derives insights that improve its operation and ultimately help the organization better accomplish its mission.
Through a course in online communities, I’m learning the principles behind how people behave and connect online. Understanding this helps data storytellers promote engagement, in the fullest sense of the word.
What are the most engaging examples of data storytelling you’ve seen lately? Write a comment and let everyone know.
How many rewards cards hang on your keychain? How many website accounts do you maintain? How much information do you share with organizations? Type your name into Spokeo and see what comes up. Chances are, it’s pretty accurate.
Many places collect personal information; that’s nothing new. But combine the ability to store unlimited amounts of data, aggregate and analyze massive datasets, and instantly release information into the public realm. You get the power to use customer behavior to determine when women are pregnant. You get maps that show addresses of people licensed to own pistols. You get the question of how aggressively to prosecute someone who downloads too many articles.
What are the implications of this? Thinking from a policy perspective can help journalists spur discussions around the role and use of data.
Journalists should also consider the policy implications of their own work. For example, the New York-based Journal News obtained gun license data, which was public, and mapped the addresses of those licensed to own pistols. This sparked an outcry among citizens and triggered debate in media circles as to whether “journalists have a free pass to do whatever they want with public-record data.” New York state then passed legislation that removed such information from public access. The incident reminds journalists to ask the question, “What do I hope to accomplish with this story?” at each step of the reporting process.
Government use of data is another area ripe for data storytelling. As Scott Shackford writes:
“The degradation of the Fourth and Fifth Amendments is an academic or theoretical matter for so many people and often lacks a strong human narrative to draw public outrage….Whereas, just about everybody’s on Facebook. Facebook’s privacy systems affect them directly every day, and they see it. So Americans are furious that Instagram might sell their photos, while shrugging at what the federal government might do with the exact same data.”
Data and policy are not independent. For this reason, policy coursework forms the third leg of my concentration in data storytelling (with data analysis and design being the first and second). Understanding what organizations do with data is as important as using data to present compelling stories.
How does storytelling happen? Someone has an idea, consults a variety of human and electronic sources, sifts through the information he or she has collected, extracts the meaningful parts, and distills them into a narrative. Of course, the process is often much more circuitous than this, but a successful story must cross each of these hurdles.
Data analysis helps journalists find compelling story ideas, identify additional sources to consult, and extract meaning. But how to ensure that readers and viewers grasp that meaning and understand the significance of the work? By paying attention to design.
Half of Americans get their news digitally (through online, mobile, social networking, email, or podcasts), and that number is going to grow. To a data storyteller, that means crafting a narrative for the digital environment and then, for outlets that are not online-only, adapting it to the print or broadcast product.
All journalists, not just data storytellers, must, as Martin Belam writes, “think reader, not editor.”
- Attention: Why would a reader be interested in this narrative? rather than, What does my editor want to see?
- Value: What will the reader hope to learn from this piece? rather than, What does my editor think is important?
- User Experience: How can the reader dig into this data and uncover more insight? rather than, I’ll hand this database off to the web person to post it on the website.
For data storytellers, this means thinking about the best way to present data, whether a simple spreadsheet or a time-lapsed bubble graph. It means thinking about whether to create a static infographic or an interactive data visualization. (Stay tuned for updates on my experience in the Knight Center for Journalism in the Americas’ online course, “Introduction to Infographics and Data Visualization.”)
The way information is presented significantly impacts the way people use it. If a website doesn’t load quickly, users will click away. If a user has to constantly scroll horizontally on a mobile screen because the text didn’t automatically adjust, the user will click away. Solving these problems takes design and coding savvy; if that sounds like you, check out the looming challenge of responsive design, or creating interfaces that optimize for viewing on any device.
Focusing on design doesn’t mean turning into a graphic designer or a web developer. It simply means thinking intentionally about the tools at hand and selecting the ones that best convey your narrative. Good writers write for their audience. Good designers design for their users. Good journalists need to do both.
Design forms the second leg of my concentration in Data Storytelling (data analysis being the first). Through coursework in graphic design, interaction design, and information visualization, I learn how people want to see information.
What engaging presentations of data have you encountered? Presentations that fell flat? Share them in the comments below.
What attracts people to journalism? A love of language? An insatiable curiosity about how the world works? These characteristics drive journalists to pursue stories and captivate an audience using the power of narrative. They also make journalists great candidates to learn coding.
Yes, computer programming, that gibberish-like collection of symbols and phrases that make our computers whir and our Internet function. That coding.
Programming languages resemble human languages in that they operate under a set of rules, or syntax, and they enable us to communicate with another group (in this case machines rather than people). Programming languages focus more on logic than math, and learning to code offers a reminder that there is more than one way to think about things. Learning any new language takes time and practice, but payoffs exist. The sheer glee of writing a few lines of code that actually function mirrors the deep satisfaction of writing a beautiful sentence. I experienced this when I used the len function in Python to write a program that states whether a given text is too long for a Tweet. Yes, this already exists, but it was my code, and it worked.
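A minimal sketch of such a program might look like the following. It is not the original script: the function names and the 140-character limit (Twitter’s cap at the time) are assumptions for illustration, and `len` here simply counts characters without any of Twitter’s special handling of links.

```python
# Assumption: 140 characters, Twitter's limit when this post was written.
TWEET_LIMIT = 140


def is_too_long(text, limit=TWEET_LIMIT):
    """Return True when text has more characters than the limit allows."""
    return len(text) > limit


def describe(text, limit=TWEET_LIMIT):
    """Build a short, human-readable verdict for the given text."""
    if is_too_long(text, limit):
        return f"Too long by {len(text) - limit} characters."
    return f"Fits, with {limit - len(text)} characters to spare."


# Try it on any sentence you like.
print(describe("Data storytelling connects narratives to individuals."))
```

All of the real work happens in one comparison, `len(text) > limit`. Everything else just turns that answer into a sentence a person can read.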
A more significant payoff of learning even basic coding is the ability to suss out stories from:
“stacks of financial disclosure forms, court records, legislative hearings, officials’ calendars or meeting notes, and regulators’ email messages. …With a suite of reporting tools, a journalist will be able to scan, transcribe, analyze, and visualize the patterns in these documents. Adaptation of algorithms and technology, rolled into free and open source tools, will level the playing field between powerful interests and the public by helping uncover leads and evidence that can trigger investigations by reporters.”
Call it computational journalism, precision journalism, or data journalism, but digging through unstructured data is how the media will undertake its watchdog responsibility. This doesn’t mean journalists need degrees in computer science (though it wouldn’t hurt), but it does mean that journalists should understand the capabilities of software and learn one or two tools they can apply in their daily reporting. John Diedrich did so with databases and won an award.
“To hang in there — to produce data-driven journalism, or design a mobile app, or write a long-form profile story — students need to have both good taste and a desire to master something. … At the root of all this talk about programming, apps, and so on, is the idea of story. But have our students seen the story in the data, in the graphic, in the app?”
Coding and data analysis form one leg of my concentration in Data Storytelling. I don’t intend to become a programmer, but I do want to speak the same language as a coder and understand how to tell a computer to dig in the way I want it to.
What has learning to code helped you accomplish? Share your story in the comments.