Data, Data, Data
- http://www.kdnuggets.com/datasets/
- http://thedatahub.org/
- http://www.infochimps.com/datasets
- Project Gutenberg – 40,000+ free books
- Million Song Dataset – descriptors for 1,000,000+ popular songs
- Research Quality Data Sets – collected by Hilary Mason (@hmason)
- WordNet – a large lexical database of English words
- Google Public Data – thousands of datasets
- Yahoo! Research Datasets – datasets about language, networks, etc.
- Konect, the Koblenz Network Collection – network datasets
- Pachube – clearinghouse for live data streams from sensors, etc.
Some APIs
Personal Visualization / Quantified Self
- Fitbit
- Quantified Self
- IOGraph
- WattVision, for your energy usage
- DrinkingDiary, for your alcohol intake
- Bedposted, for your sex life
- RunKeeper, for your running paths
- LastGraph, for your music
- And yet more tools for helping you keep track of data about yourself
Other Resources
- Mechanical Turk (tool for organizing low-cost human data labor)
- OpenPaths (tool for collecting your GPS wanderings)
- Microaggressions (data set / collection tool)