Mining the Social Web


Want to tap the tremendous amount of valuable social data in Facebook, Twitter, LinkedIn, and Google+? This refreshed edition helps you discover who’s making connections with social media, what they’re talking about, and where they’re located. You’ll learn how to combine social web data, analysis techniques, and visualization to find what you’ve been looking for in the social haystack—as well as useful information you didn’t know existed.

Each standalone chapter introduces techniques for mining data in different areas of the social web, including blogs and email. All you need to get started is a programming background and a willingness to learn basic Python tools.

  • Get a straightforward synopsis of the social web landscape
  • Use adaptable scripts on GitHub to harvest data from social network APIs such as Twitter, Facebook, LinkedIn, and Google+
  • Learn how to employ easy-to-use Python tools to slice and dice the data you collect
  • Explore social connections in microformats with the XHTML Friends Network
  • Apply advanced mining techniques such as TF-IDF, cosine similarity, collocation analysis, document summarization, and clique detection
  • Build interactive visualizations with web technologies based upon HTML5 and JavaScript toolkits

“A rich, compact, useful, practical introduction to a galaxy of tools, techniques, and theories for exploring structured and unstructured data.”
–Alex Martelli, Senior Staff Engineer, Google

Table of Contents
Chapter 1. Introduction: Hacking on Twitter Data
Chapter 2. Microformats: Semantic Markup and Common Sense Collide
Chapter 3. Mailboxes: Oldies but Goodies
Chapter 4. Twitter: Friends, Followers, and Setwise Operations
Chapter 5. Twitter: The Tweet, the Whole Tweet, and Nothing but the Tweet
Chapter 6. LinkedIn: Clustering Your Professional Network for Fun (and Profit?)
Chapter 7. Google+: TF-IDF, Cosine Similarity, and Collocations
Chapter 8. Blogs et al.: Natural Language Processing (and Beyond)
Chapter 9. Facebook: The All-in-One Wonder
Chapter 10. The Semantic Web: A Cocktail Discussion

Book Details

  • Paperback: 360 pages
  • Publisher: O’Reilly Media (January 2011)
  • Language: English
  • ISBN-10: 1449388345
  • ISBN-13: 978-1449388348
