How To Find Data Sources and Organize Data To Fit Your Content Needs

Angela Skane

Apr 09, 2020

When finalizing a piece of content, have you ever realized you missed the release of data that you could have used as a primary source, forcing you to start over because your information was outdated? It is a hard pill to swallow, but this has happened to me on more than one occasion, and sometimes it meant missing out on timely content opportunities.

Finding public data sources is challenging enough, let alone keeping track of when a data set will be updated. Below you will find everything you need to get started: strategies, resources, and tips that can help you step up your content planning by putting free, fresh public data sources to work.

Tip 1: Sign Up for the Right Data Newsletters

Your first task is to figure out what public data sources exist, not only for your topic area but in general. If you don’t know what exists already, how can you brainstorm a new concept for content? This process involves a lot of planning and hard work, but it will save you time in the long run when you can continually refer back to it.

Start by having dozens of datasets delivered to your email each week. You can subscribe to data newsletters, which curate unique data sets that can help inspire your content ideas as well as send you information that is relevant for your brand.

Here is my list of data newsletters. Each one covers different types of content and is sent at a different frequency.

Numlock

http://walthickey.com/numlock-newsletter/

Delivers fun, surprising, and clean stats to your inbox. It is meant to highlight fun numbers within a topic rather than serve as a full data source.

Emailed daily

Data is Plural

https://tinyletter.com/data-is-plural

Focuses on data gathered by other people and is often unique and sometimes niche. These data stories often combine multiple datasets or include unique data.

Emailed weekly

Dataworld

https://data.world/

An update of the data featured on the site, a community-based platform that curates a combination of unique sources and links to government and other official data sources.

Emailed weekly

Kaggle

https://www.kaggle.com

A combination of coding resources for data science projects and public datasets to use. The site and newsletter are customizable and well suited to more advanced data scientists.

Emailed as updates happen in your selected interest areas

Flowing Data

http://flowingdata.com/

Highlights the latest trends in data visualization, statistics, and data. Great inspiration for finding data and ways to present it.

Emailed daily

The Pudding

https://pudding.cool/

Best used for content inspiration: it takes popular topics being discussed in society and tells the stories within them through data and visuals.

Emailed once every 6 weeks

My favorite newsletter from above is the Data is Plural newsletter. Not only does it send weekly updates, but it shares really unique data sets that aren't going to be the first results on search engines.

They also curate every dataset they send out into one big spreadsheet for followers to go back and easily reference any time. You can do a keyword search in this sheet to see if there is any data relevant to the ideas you are working on. 
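A keyword search like this can also be scripted against a downloaded copy of such a sheet. The following is only a minimal sketch: the column names and the two sample rows are made up for illustration and are not taken from the actual newsletter archive.

```python
import csv
import io


def search_rows(csv_text, keyword):
    """Return rows in which any cell contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row
        for row in reader
        if any(needle in str(value or "").lower() for value in row.values())
    ]


# Hypothetical sample standing in for a CSV export of the archive sheet.
SAMPLE = """edition,headline,text
2020.01.01,Example transit counts,Hourly ridership figures for a fictional city
2020.01.08,Example bird sightings,Volunteer-reported sightings by county
"""

matches = search_rows(SAMPLE, "ridership")
print([row["headline"] for row in matches])
```

With a real export, you would read the file contents into `csv_text` and adjust the column names to whatever the sheet actually uses.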

Another incredible source for inspiration from the list above is Flowing Data. This source serves two purposes for me:

  • Finding data sets/ideas.
  • Learning unique visualization techniques.

This newsletter provides a lot of amazing examples and can really inspire you to elevate the way you are presenting data in your content.

Tip 2: Subscribe to Subreddits on Data Sources and Visualizations

Reddit provides a never-ending supply of interesting content and inspiration. Beyond supplying me with emerging trends and topics, Reddit also helps me find data to support my ideas. 

To get started, I recommend a handful of subreddits to keep an eye on.

r/datasets

This subreddit is dedicated to the same topic we are exploring here: finding data. I always post in this forum when I could use some backup finding a difficult data point or when I am just having an off day and need some help finding what I am looking for.

These content creators and data specialists are an incredible group of people to build a connection with. On top of real-time support from the data community, this forum also hosts posts about cool and exciting data sets you can explore and add to your arsenal.

r/dataisbeautiful

Data is beautiful screenshot from Reddit

This subreddit serves two purposes: it helps you find data you can work with and shows you how you can visualize it. On the same note, r/dataisugly points you to datasets that exist while showing you how not to visualize them. These resources can help you understand what you can achieve (or what is visually ineffective) with the data you are exploring.

If you are new to using data in your content, seeing how others are bringing it to life can make the daunting task of jumping into data much less intimidating. 

r/usdataisbeautiful

This subreddit is similar to r/dataisbeautiful but dedicated specifically to U.S.-based data. If you have ever visited a government agency site, you know how difficult it can be to navigate. Trying to find the data you need from a government agency can be excruciating, but this forum helps get you going in the right direction and makes it much less painful. 

No matter which route you choose, you will be sure to find bookmark-able inspiration.

r/DataVizRequests

The subreddit r/DataVizRequests is less common as a source for finding datasets, but it is still useful. It encourages users to ask other readers to visualize the data they have available. So, like many of our other sources, this subreddit can lead you to data you can use in your content, or help you find a way to present the data you have already found. There are a lot of people who are excited to help visualize and find data, so tools like Reddit can help you find a team of data experts for your content ideas, without the hours of searching and recruiting that traditional talent searches require.

When working with any of these subreddits, avoid merely duplicating the visuals you find. Instead, look for unique ways to work with the data, or draw inspiration from the visual methods you come across.

Tip 3: Perform a Google Search for Data and Dataset

Don’t tell anyone around my office, but I don’t magically know of a data set that will have the exact statistic they are looking for. One of the best resources at our disposal is Google.

I simply perform a quick search of the topic, followed by “data” and “dataset.” This serves the obvious purpose of finding a data source to work with, but it also allows me to see how people have already covered this data set and what stories they were able to tell with it.

Doing a Google search with the term dataset
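As a small sketch of the search pattern described above, the helper below builds the two queries for a topic and turns each into a search link. It assumes the tip means two separate searches (topic plus "data", then topic plus "dataset"); the function names and the example topic are hypothetical.

```python
from urllib.parse import urlencode


def dataset_queries(topic):
    """Return the topic followed by "data" and by "dataset"."""
    return [f"{topic} {suffix}" for suffix in ("data", "dataset")]


def google_search_url(query):
    """Build a Google search link for a query string."""
    return "https://www.google.com/search?" + urlencode({"q": query})


# Hypothetical topic, used only to show the output shape.
for query in dataset_queries("coffee consumption"):
    print(query, "->", google_search_url(query))
```

Keeping the queries in a list makes it easy to paste the same set of links into your notes each time you research a new topic.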

The data you find can be great inspiration for spinoff ideas from things that have been done. What questions did everyone leave unanswered?

Very often, a quick internet search does the trick, providing a new source to add to my personal database of, well, databases. That leads right into my last and probably most important tip.

Tip 4: Curate a List of Data Sources for Your Topic Area

This strategy will help you quickly see how old data is and discover upcoming releases.

If you are not organized with your data sources, you will be overwhelmed. Digging through data sets to see which one contains the piece of information you are looking for can add hours to your work.

Organizing Your Data Lists

Start a sheet with the data source name, link, last publish date, anticipated update (which you can find based on patterns of previous releases, content calendars, or by emailing the data owner), and a summary column that allows you to quickly see the type of data the source captures.

Organizing data lists in Excel

It's a good practice to set aside time each quarter to go through and make sure you know what was previously released, as well as what is to come so you can properly plan out your content calendar. 

Once you start expanding your data collection, you can create new tabs within your sheet that sort your resources into different topic clusters. 
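If you prefer to keep this list in code rather than a spreadsheet, the same columns map onto a small record type. The sketch below is one possible layout, not a prescribed tool: the field names mirror the columns described above, the two entries are invented placeholders, and the 90-day window is an assumption standing in for the quarterly review.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class DataSource:
    name: str
    link: str
    last_published: date
    anticipated_update: date
    summary: str


def due_for_review(sources, today, window_days=90):
    """Return sources whose anticipated update is overdue or falls within the window."""
    cutoff = today + timedelta(days=window_days)
    due = [s for s in sources if s.anticipated_update <= cutoff]
    return sorted(due, key=lambda s: s.anticipated_update)


# Hypothetical entries for illustration only.
SOURCES = [
    DataSource("Example labor survey", "https://example.com/labor",
               date(2019, 9, 1), date(2020, 3, 1), "Employment by industry"),
    DataSource("Example housing report", "https://example.com/housing",
               date(2020, 1, 15), date(2021, 1, 15), "Median rent by metro area"),
]

for source in due_for_review(SOURCES, today=date(2020, 4, 9)):
    print(source.name, source.anticipated_update.isoformat())
```

Running the check each quarter surfaces both releases you may have missed and releases that are coming up soon enough to plan content around.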

Making Sure Your Data is Still Relevant 

As mentioned above, knowing how recent your data is helps ensure you have the most relevant data and stats to work with.

One thing to avoid is assuming what you find first on Google is the most recent data.

The best way to make sure you are working with the most recent statistics is to go to the source of the data and look at the release date, which often appears on the cover page of the published report. For some sources the date is difficult to find, so reaching out to the source by email or even Twitter can help you confirm whether the data is the most recent.

Deciding what is recent enough for your content is up to you. Some datasets are released in the current year but cover previous years. Some data sources run a few years behind in their reporting, and depending on the topic, that may be acceptable for your site's content.

In terms of link building, publishers may be more reluctant to publish stats that are more than a year old. It is all about what your content goals look like. If you find a page and can't determine a date, you can always run it through Archive.org to see when the page was first captured, which gives a rough idea of when it was published. It saves screen captures of pages, so you can also check if and when a page was updated.
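The Archive.org check can also be done programmatically through the Wayback Machine's availability endpoint, which returns the capture closest to a requested timestamp. The sketch below is hedged in two ways: asking for a very early timestamp to approximate the first capture is my assumption, not an official recipe, and a first capture date only tells you the page existed by then, not exactly when it was published. The helper names are hypothetical.

```python
import json
from datetime import datetime
from urllib.parse import urlencode
from urllib.request import urlopen

API = "https://archive.org/wayback/available"


def availability_url(page_url, timestamp="19960101"):
    """Build the request URL; an early timestamp asks for the oldest nearby capture."""
    return API + "?" + urlencode({"url": page_url, "timestamp": timestamp})


def closest_capture(payload):
    """Return (capture datetime, snapshot URL) from a parsed response, or None."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    captured = datetime.strptime(closest["timestamp"], "%Y%m%d%H%M%S")
    return captured, closest["url"]


def lookup(page_url):
    """Query the endpoint over the network and parse the result."""
    with urlopen(availability_url(page_url), timeout=10) as response:
        return closest_capture(json.load(response))


# Example call (requires network access):
# print(lookup("example.com"))
```

To see how a page changed over time, you would still browse the individual captures on Archive.org; this lookup only gives you a starting date.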

Conclusion

Starting out with data is intimidating, but sticking with it can mean having a large supply of fascinating, accurate, useful information to serve as the foundation or support for the content you are creating. 

Participating in data journalism allows you to stand out with trusted information and present your brand in a really authoritative way. Starting with the right foundation — high-quality data sources — is the key to success. Jump in with these tips and start working to incorporate data into your content creation.

Angela Skane is a creative strategy supervisor at Fractl. Throughout her career she has produced over 100 content marketing campaigns, many based on public and government data sources. Her content for clients has been featured on Men’s Health, Yahoo! Finance, Motley Fool, Consumer Affairs, and more.