LICENSE: Creative Commons CC BY-NC-ND 3.0 (for other licenses, please contact Dr. Rubens or Dr. Russell).
ATTRIBUTION:
M. G. Russell, J. Flora, M. Strohmaier, J. Poschko, R. Perez, N. Rubens.  Semantic Analysis of Energy-Related Conversations in Social Media: A Twitter Case Study.  International Conference of Persuasive Technology (Persuasive 2011), Columbus, OH, USA, Jun.2011.
NOTE: The dataset (data.txt.zip) is encrypted. To obtain the decryption key, please contact Dr. Martha G. Russell (martha.russell at stanford.edu) or Dr. Neil Rubens (rubens at ActiveIntelligence.org).
WARNING: USE OF DATASET IS NOT ALLOWED WITHOUT REGISTRATION



CONTENTS:
LICENSE.txt - this file
keywords.txt - keywords used to acquire the data
data.txt.zip - compressed and encrypted tweet data (to obtain the decryption key, see NOTE above)


Energy-Behaviors Dataset 
https://s3.amazonaws.com/energy-behaviors/dataset/2010.09.03-2011.01.04.xl.snapshot.zip


SUMMARY:
2,472,900 tweets
18,338 hashtags (w/ occurrence of 3 or more)
Date Range: 3.Sep.2010 - 3.Jan.2011 (4 months)
Data Collection Frequency: daily snapshots
Keywords: keywords related to energy-saving behaviors (see keywords.txt).
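
The hashtag figure above counts only hashtags that occur 3 or more times. A minimal sketch of how such a count could be computed is given below. It is not the procedure used to produce the figure above: the regular expression, the case-folding, and the function name frequent_hashtags are assumptions made for illustration, and the sample tweets are invented.

```python
import re
from collections import Counter
from typing import Iterable

# Assumption: a hashtag is '#' followed by one or more word characters.
HASHTAG = re.compile(r"#\w+")


def frequent_hashtags(tweets: Iterable[str], min_count: int = 3) -> Counter:
    """Count hashtags (case-insensitively) and keep those seen at least min_count times."""
    counts = Counter(
        tag.lower() for text in tweets for tag in HASHTAG.findall(text)
    )
    return Counter({tag: n for tag, n in counts.items() if n >= min_count})


# Invented sample tweets, for illustration only.
sample = [
    "Turn off the lights #energy #green",
    "#Energy audit today",
    "saving #energy and money",
    "go #green",
]
frequent = frequent_hashtags(sample)
```

Here "#energy" appears three times and is kept, while "#green" appears twice and is dropped.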



DESCRIPTION:

This dataset was constructed as part of the "Social Media Analytics for Monitoring and Changing Energy Consumption Behavior" initiative of the Stanford ARPAe project.  For more details, please refer to: M. G. Russell, J. Flora, M. Strohmaier, J. Poschko, R. Perez, N. Rubens.  Semantic Analysis of Energy-Related Conversations in Social Media: A Twitter Case Study.  International Conference of Persuasive Technology (Persuasive 2011), Columbus, OH, USA, Jun.2011.

The initial purpose of constructing this dataset was to assist in understanding the role of social media in changing consumers' energy behavior.  We believe it could be useful for other purposes as well, and are therefore releasing it publicly.

Data was acquired on a daily basis using the NodeXL Twitter Importer module *xl1, which captured the latest messages containing energy-related keywords (see keywords.txt).  The eco-linguistic keywords used to collect the tweets were developed at Stanford University by Drs. June Flora, Carrie Armel, and Martha Russell, under sponsorship from the US Advanced Research Projects Agency for Energy, and Media X at Stanford University.


*xl1
The NodeXL Twitter Importer module creates a separate file for each keyword.  This dataset is an amalgamation of these files.  Some tweets are duplicated in the dataset, since a single tweet could be captured in the files of several keywords.
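
Users who need each tweet only once may want to remove these duplicates before analysis. The sketch below shows one way to do so while keeping the first occurrence and the original order. The layout of data.txt is not described in this file, so the tab-separated sample lines, the use of the first field as a tweet identifier, and the function name drop_duplicates are all assumptions made for illustration.

```python
from typing import Callable, Hashable, Iterable, Iterator


def drop_duplicates(
    records: Iterable[str],
    key: Callable[[str], Hashable] = lambda line: line,
) -> Iterator[str]:
    """Yield each record once, keeping the first occurrence and the input order."""
    seen = set()
    for record in records:
        k = key(record)
        if k in seen:
            continue  # already emitted a record with this key
        seen.add(k)
        yield record


# Hypothetical sample lines: "<tweet id><TAB><text>". The real file layout may differ.
sample = [
    "101\tsaving energy at home",
    "102\tnew solar panels installed",
    "101\tsaving energy at home",
]

# Assumption: the first tab-separated field identifies the tweet.
unique = list(drop_duplicates(sample, key=lambda line: line.split("\t", 1)[0]))
```

With the default key, whole lines are compared, which is the safer choice when no reliable identifier field is available.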



FOR MORE INFORMATION:

M. G. Russell, J. Flora, M. Strohmaier, J. Poschko, R. Perez, N. Rubens.  Semantic Analysis of Energy-Related Conversations in Social Media: A Twitter Case Study.  International Conference of Persuasive Technology (Persuasive 2011), Columbus, OH, USA, Jun.2011.

Website of the Change Energy Behavior Initiative:
http://mediax.stanford.edu/changeeb.html

Web-interface to the tweetonomy network based on this dataset:
http://energy.twex.poeschko.com






