Mining the Social Web, 2nd Edition


How can you tap into the wealth of social web data to discover who’s making connections with whom, what they’re talking about, and where they’re located? With this expanded and thoroughly revised edition, you’ll learn how to acquire, analyze, and summarize data from all corners of the social web, including Facebook, Twitter, LinkedIn, Google+, GitHub, email, websites, and blogs.

  • Employ the Natural Language Toolkit, NetworkX, and other scientific computing tools to mine popular social web sites
  • Apply advanced text-mining techniques, such as clustering and TF-IDF, to extract meaning from human language data
  • Bootstrap interest graphs from GitHub by discovering affinities among people, programming languages, and coding projects
  • Build interactive visualizations with D3.js, an extraordinarily flexible HTML5 and JavaScript toolkit
  • Take advantage of more than two dozen Twitter recipes, presented in O’Reilly’s popular “problem/solution/discussion” cookbook format

The example code for this unique data science book is maintained in a public GitHub repository. It’s designed to be easily accessible through a turnkey virtual machine that facilitates interactive learning with an easy-to-use collection of IPython Notebooks.

Table of Contents
Part I: A Guided Tour of the Social Web
Chapter 1. Mining Twitter: Exploring Trending Topics, Discovering What People Are Talking About, and More
Chapter 2. Mining Facebook: Analyzing Fan Pages, Examining Friendships, and More
Chapter 3. Mining LinkedIn: Faceting Job Titles, Clustering Colleagues, and More
Chapter 4. Mining Google+: Computing Document Similarity, Extracting Collocations, and More
Chapter 5. Mining Web Pages: Using Natural Language Processing to Understand Human Language, Summarize Blog Posts, and More
Chapter 6. Mining Mailboxes: Analyzing Who’s Talking to Whom About What, How Often, and More
Chapter 7. Mining GitHub: Inspecting Software Collaboration Habits, Building Interest Graphs, and More
Chapter 8. Mining the Semantically Marked-Up Web: Extracting Microformats, Inferencing over RDF, and More

Part II: Twitter Cookbook
Chapter 9. Twitter Cookbook

Part III: Appendixes
Appendix A. Information About This Book’s Virtual Machine Experience
Appendix B. OAuth Primer
Appendix C. Python and IPython Notebook Tips & Tricks

Book Details

  • Paperback: 448 pages
  • Publisher: O’Reilly Media; 2nd Edition (October 2013)
  • Language: English
  • ISBN-10: 1449367615
  • ISBN-13: 978-1449367619
