How much data is floating around on the Internet for you to use in your applications? If the sizes of 21 Recipes for Mining Twitter and Data Source Handbook are any indication, the answer is: not much. These are easily the shortest books I’ve ever reviewed (60 pages for Mining Twitter, and a mind-boggling 30 for Data Source Handbook). This is not necessarily a bad thing, but when a book is short enough to be basically a blog post, I have to wonder how useful it will be as a publication.
Mining Twitter is a fun little book, and I like its emphasis on hands-on projects. As with his book Mining The Social Web, author Matthew Russell has chosen to write his applications in Python, which poses a learning curve for Python newbies. Still, the techniques are easy enough to understand and the projects produce some useful results. I particularly like the mashups of Twitter data and geolocation in the last couple of projects.
The problem with such a short book is that some topics get only the barest coverage. OAuth, for example, which protects Twitter data and allows access with authorization, can be a complicated concept, but in Mining Twitter the subject gets only a couple of pages. Matthew provides some code that sends credentials to Twitter and gets back access to user data, but readers who don’t really comprehend the process will need to go online and research OAuth themselves.
The Data Source Handbook is interesting to me because, despite the ribbing I’ve just given the book for its small size, there really are a lot of public data sources and APIs that give access to great information. There are almost 60 APIs listed in this book, which might be part of the problem: at roughly two data sources per page, there is no real documentation other than a short description and an example of the API call and its result. Readers can use Data Source Handbook only as a catalog of public data sources, and will have to learn more by checking the APIs out online.
The real benefit of Data Source Handbook is the sheer number of public data sources it covers. I knew of only a few of these APIs before I read this book, and it’s the kind of book that’s easy to keep on the shelf and refer to when an application needs some data. The book is organized into sections (websites, people, search, geolocation and more), so it’s easy to find what you need. Just remember that this book is only a developer’s starting point for finding and employing a public API.
21 Recipes for Mining Twitter
Matthew A. Russell
Published by O’Reilly
Data Source Handbook
Published by O’Reilly