On XKCD, Machine Learning, Google Vision, and Birds

Hi there! I'm Herbert, a software engineer and fan of a webcomic called XKCD, a comic about life, love, math, science, and technology. Only a few short years ago, this comic came out:

This has long been a problem in computer science: things that humans take for granted as simple are in fact really complicated. Your standard human three-year-old can probably tell whether a picture contains a bird, but until recently it took serious effort to even attempt that with a computer. Last week, however, I attended Google's GCP Next 2016 conference in San Francisco, where I was introduced to Google's Vision API for the first time and was inspired to create this bird recognition app.

I am writing this at 11:50 PM PST. I made my initial git commit at 10:10 PM. In less than two hours, I've created what XKCD's Randall Munroe pretty accurately described, just two years ago, as a 5-year, multi-PhD student project. The really amazing thing is how extraordinarily simple it all was. Most of the difficulty was in getting my project set up properly; the actual API calls took about 5 minutes. At some point I'll toss this all up on GitHub, but there really is very little code to this entire project. All of the heavy lifting (the machine learning algorithms, the training on images, everything) is part of the Vision API. I highly recommend checking out the docs if you're interested.

Cloud Vision Docs
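
To give a feel for how little code the "is there a bird?" check needs, here is a minimal sketch in Python. It is not the code behind this app, just an illustration under a few assumptions: it talks to the Vision API's REST endpoint (`images:annotate`) with a `LABEL_DETECTION` feature, it authenticates with an API key, and the helper names (`build_label_request`, `contains_bird`, `annotate`) and the 0.5 score cutoff are made up for this example.

```python
import base64
import json
import urllib.request

# REST endpoint for the Cloud Vision API (v1).
VISION_URL = "https://vision.googleapis.com/v1/images:annotate"


def build_label_request(image_bytes, max_results=10):
    """Build the JSON body for a label-detection request (hypothetical helper)."""
    return {
        "requests": [
            {
                # The image is sent inline as base64-encoded bytes.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": "LABEL_DETECTION", "maxResults": max_results}],
            }
        ]
    }


def contains_bird(response, min_score=0.5):
    """Return True if any returned label is 'bird' with a high enough score.

    The 0.5 cutoff is an arbitrary assumption for this sketch.
    """
    for result in response.get("responses", []):
        for label in result.get("labelAnnotations", []):
            is_bird = label.get("description", "").lower() == "bird"
            if is_bird and label.get("score", 0.0) >= min_score:
                return True
    return False


def annotate(image_bytes, api_key):
    """POST the request to the Vision API and return the parsed JSON response."""
    body = json.dumps(build_label_request(image_bytes)).encode("utf-8")
    req = urllib.request.Request(
        f"{VISION_URL}?key={api_key}",
        data=body,  # supplying data makes this a POST
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

With a real API key, usage would be something like `contains_bird(annotate(open("photo.jpg", "rb").read(), api_key))`. Matching only the exact label "bird" is the simplest possible choice; a real app might also accept more specific labels the API returns.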

Back to the birds