2/22/2021
Get Twitter Data Using R and Tableau


During Tableau Conference-ish 2020, I published a viz called #DATA20 BY THE MINUTE, in which I visualized tweets with the #data20 hashtag counted by the minute. To do this, I used R to collect the data from Twitter. I thought others might find this useful for tracking their own tweets, hashtags, or other people and topics, so here is a quick blog post on how to get this data and bring it into Tableau.

Initiate the Code

To use this code, you will need a Twitter handle and a Twitter Developer App (free), which you can set up here. After creating an app, you will get an API Key, an API Secret Key and a Bearer Token. All three are needed to run the code that downloads the data. Note: scroll to the bottom of this blog post if you want to copy and paste all of the code at once.

Note: In the R code above, replace the consumer_key, consumer_secret, and bearer_token values with your own, inside the quotes (without the brackets). Every request sent to Twitter must include a token, so it is convenient to store it as an environment variable.
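As a minimal sketch of the environment-variable approach, the base R snippet below reads the three credentials back from the environment and stops with a clear message if any are missing. The variable names TWITTER_API_KEY, TWITTER_API_SECRET and TWITTER_BEARER_TOKEN and the helper read_twitter_credentials() are my own assumptions for illustration; use whatever names you prefer.

```r
# Hypothetical environment variable names (an assumption, not part of rtweet).
# Set them once, for example in your ~/.Renviron file, so that the keys
# never appear in a script you might share:
#   TWITTER_API_KEY=your-api-key
#   TWITTER_API_SECRET=your-api-secret-key
#   TWITTER_BEARER_TOKEN=your-bearer-token

read_twitter_credentials <- function() {
  vars <- c(
    consumer_key    = "TWITTER_API_KEY",
    consumer_secret = "TWITTER_API_SECRET",
    bearer_token    = "TWITTER_BEARER_TOKEN"
  )
  # Sys.getenv() returns NA for any variable that is not set
  values <- Sys.getenv(vars, unset = NA_character_)
  names(values) <- names(vars)

  missing <- is.na(values) | values == ""
  if (any(missing)) {
    stop("Missing environment variables: ",
         paste(vars[missing], collapse = ", "))
  }
  as.list(values)
}
```

Calling creds <- read_twitter_credentials() then gives you creds$consumer_key, creds$consumer_secret and creds$bearer_token to pass to whichever authentication function your installed version of rtweet provides.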

Get the #data20 Hashtag

After the setup code, the next step is the code that collects the data. The sample code below searches tweets for the hashtag #data20 and returns up to 25,000 results, not including retweets.
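A search like the one described can be sketched with rtweet's search_tweets() function. The wrapper name collect_hashtag() is hypothetical, and the sketch assumes the rtweet package is installed and that your token has already been set up. Because 25,000 is more than a single rate-limit window allows, retryonratelimit = TRUE tells rtweet to wait and continue.

```r
# A sketch only: collect_hashtag() is a hypothetical wrapper around
# rtweet::search_tweets(). It needs the rtweet package and a valid token.
collect_hashtag <- function(hashtag = "#data20", n = 25000, token = NULL) {
  rtweet::search_tweets(
    q = hashtag,              # the search query, here a hashtag
    n = n,                    # maximum number of tweets to return
    include_rts = FALSE,      # leave out retweets
    retryonratelimit = TRUE,  # pause and resume when the rate limit is hit
    token = token             # NULL lets rtweet look for a saved token
  )
}

# Example call (requires network access and valid credentials):
# tweets <- collect_hashtag("#data20", n = 25000)
```

Keep in mind that the standard search only reaches back about a week, so the number of tweets returned may be lower than the number requested.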

After the data is collected, the next bit of code will write the data to a CSV. Replace the path and file name below with your desired location.
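Here is one way the CSV step could look, shown on a small made-up data frame so that it runs without Twitter access. Tweet data often contains list columns (for example, several hashtags per tweet), which a plain write.csv() cannot handle, so this sketch collapses them into single strings first. The helper names, the toy data, the user_id column name and the output path are all assumptions for illustration; rtweet also ships its own write_as_csv() helper for the same purpose.

```r
# Collapse list columns (e.g. several hashtags per tweet) into one string each
flatten_list_columns <- function(df) {
  is_list <- vapply(df, is.list, logical(1))
  if (any(is_list)) {
    df[is_list] <- lapply(df[is_list], function(col) {
      vapply(col, function(x) paste(unlist(x), collapse = " "), character(1))
    })
  }
  df
}

# Write a flattened copy of the data frame to a CSV file
write_flat_csv <- function(df, path) {
  utils::write.csv(flatten_list_columns(df), path,
                   row.names = FALSE, fileEncoding = "UTF-8")
  invisible(path)
}

# Toy stand-in for the tweets returned by the search (made-up values)
toy_tweets <- data.frame(
  user_id = c("1", "2"),
  text = c("Hello #data20", "Another tweet"),
  stringsAsFactors = FALSE
)
toy_tweets$hashtags <- list(c("data20", "tableau"), "data20")

# Replace the path and file name with your desired location
tweets_path <- write_flat_csv(toy_tweets,
                              file.path(tempdir(), "data20_tweets.csv"))

# With real data, the user information could be written the same way, e.g.:
# users <- rtweet::users_data(tweets)
# write_flat_csv(users, "data20_users.csv")
```

Keeping IDs as text rather than numbers avoids losing digits when the files are opened later.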



The final output is two CSV files: one with the tweets, and the other a location file with the user information. You can create a relationship (noodle) between them or join them together in Tableau using the user id field.

Read Twitter Status IDs and Look up Tweets

Another useful technique is looking up tweets by their Twitter Status IDs. For example, I used this technique to track the activity of my Tableau Tips last year. I published 194 tips and wanted to see which tips were the most favorited at the end of the year. To do this, I used a Google Sheet that had a list of all of the Tweets, specifically the Tweet Status ID.

In the R code below, I read these Status IDs from a Google Sheet into R, then look up each of them to gather the information about each Tweet. In this case, the Status ID is at the end of the URL, so there is a line of code that parses the Status_ID from the URL. If you had a simple list of just the Status IDs that you wanted to track, then you wouldn’t need to parse them out of the URL.
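The parsing step can be sketched in base R with a regular expression. This assumes the tweet URLs have already been read into a character vector (the Google Sheet step is left out), and the function names extract_status_id() and lookup_tips() are hypothetical. The lookup uses rtweet's lookup_statuses(); I believe newer rtweet releases rename it to lookup_tweets(), so check the version you have installed.

```r
# Pull the numeric Status ID out of a tweet URL such as
#   https://twitter.com/<handle>/status/<status id>
# IDs are kept as character strings because they are too long to
# store safely as ordinary numbers.
extract_status_id <- function(urls) {
  pattern <- ".*/status/([0-9]+).*"
  ifelse(grepl(pattern, urls), sub(pattern, "\\1", urls), NA_character_)
}

# Hypothetical wrapper: parse the IDs, drop URLs that did not match,
# then look up the tweets. Needs the rtweet package and a valid token.
lookup_tips <- function(urls, token = NULL) {
  ids <- extract_status_id(urls)
  ids <- ids[!is.na(ids)]
  rtweet::lookup_statuses(ids, token = token)
}

# Example with made-up URLs (no network access needed for the parsing):
example_urls <- c(
  "https://twitter.com/HighVizAbility/status/1234567890123456789",
  "https://twitter.com/HighVizAbility/status/42?s=20",
  "https://example.com/not-a-tweet"
)
example_ids <- extract_status_id(example_urls)
```

URLs that do not contain a Status ID come back as NA, so they are easy to spot before the lookup runs.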

The rtweet package in R

There are a number of other tools available in the rtweet package. For example, you can get followers, mentions, favorites and timelines of a user. You can download members or subscribers of a list. You can retrieve the direct messages you have sent or received. You can download trends on Twitter, globally, using a city name, or even a longitude and latitude. For more information and sample code for doing some of these other things, check out the documentation on the rtweet package here.
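To give a flavor of a few of these, here is a short sketch using get_timeline(), get_followers(), get_favorites() and get_trends() from rtweet. The wrapper names explore_user() and worldwide_trends() are hypothetical, the default handle is just an example, and the calls need the rtweet package plus a valid token. Argument details can differ between rtweet versions, so treat this as a starting point rather than a reference.

```r
# Hypothetical wrapper that gathers a few things about one user.
# Nothing is requested from Twitter until the function is called.
explore_user <- function(user = "HighVizAbility", n = 100, token = NULL) {
  list(
    timeline  = rtweet::get_timeline(user, n = n, token = token),   # recent tweets
    followers = rtweet::get_followers(user, n = n, token = token),  # follower IDs
    favorites = rtweet::get_favorites(user, n = n, token = token)   # liked tweets
  )
}

# Trends are looked up by a "where on earth" ID; 1 is assumed here
# to mean worldwide.
worldwide_trends <- function(token = NULL) {
  rtweet::get_trends(woeid = 1, token = token)
}

# Example calls (require network access and valid credentials):
# info <- explore_user("HighVizAbility", n = 200)
# trends <- worldwide_trends()
```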

Below is all of the code used for this project as a quick reference for you to copy and paste.

I hope you find this information useful. If you have any questions feel free to email me at Jeff@DataPlusScience.com

Jeffrey A. Shaffer

Follow on Twitter @HighVizAbility




