Deep Learning, Is it Magic?

Kenny Darrell

August 20, 2014


A collection of things has led me to want to learn more about Deep Learning. I have heard the term for some time now but am not really sure what it is all about. I have had some experience with image processing from my days in undergraduate engineering and image mining from my days as an engineer. This has led to an interest in trying my luck at image mining as well. Both of these areas seem to be all the rage now, as many major companies have been buying startups related in any way to images or deep learning. Kaggle has taken notice as well.

It appears one is also a great tool for solving the other. So where to start with learning some of these things? At one point, when I was a graduate student, I knew a fair deal about neural networks. Some of that has since faded, but I thought it sufficient to jump in and try to use some deep learning tools. Later I can try to dig into the math.

There are tutorials all over; some are good and some are bad. Here is a link to one that gave me some success.

Setting Up the Virtual Environment

The first requirement you need to fulfill is to install Vagrant and be able to use it. This was a great learning experience in itself as I have been meaning to learn one of these virtual development environment tools. I can't say that I am an expert but it was fairly easy to get up to speed and feel comfortable with this one. I am sure my knowledge only scratched the surface but I already see the value in them.

Once this is set up, you can run the following from the command line.

mkdir ~/dl_webcast
cd ~/dl_webcast

vagrant box add dl_webcast

vagrant init dl_webcast
vagrant up

vagrant ssh

The real value in this virtual environment is that it already includes a pretrained model. The tutorial has you evaluate an image that comes preloaded. To jump right in and use the pre-built model, run the Python script as seen below.

cd ~/caffe
python python/ --print_results examples/images/cat.jpg foo


Now let's try this with some images I found online since the example only comes with one image to evaluate.

beagle 0.49523
English foxhound 0.07124
Walker hound 0.05694
basset 0.05413
basset 0.01672

Doberman 0.75929
miniature pinscher 0.09810
Rottweiler 0.07136
toy terrier 0.02606
black-and-tan coonhound 0.02000

The results for these images of dogs look pretty good. Now let's try something I think is a little harder. I can't tell the difference between an alligator and a crocodile. One image even has another animal to throw the model off. How well does it work?

American alligator 0.72230
African crocodile 0.27429
common iguana 0.00146
Komodo dragon 0.00028
rock python 0.00024

African crocodile 0.39933
American alligator 0.16945
alligator lizard 0.05302
agama 0.05002
garter snake 0.04964

Wow, I am pretty impressed. This seemed to work pretty well. Deep Learning does seem like magic at first, but these are all pictures from the internet. How well will it work on real images? I wanted to try an image I took with my phone. I sacrificed life and limb to get the picture below, hoping it would be a harder test. I was also thinking this would be closer to how this could be used in a cool application: a phone app that tells you what kind of wildlife you are looking at.

skunk 0.05580
coho 0.04459
American alligator 0.04231
tench 0.03036
banjo 0.02585

WHAT, it thinks I am a skunk. After the bike ride it took to get there my wife may agree with that classification, but I sure don't. I am at ease, though, after looking at the probability: at under 6%, the model has very low confidence in its guess. I then wondered what would happen if I modified the picture a bit. I cropped off a lot of the perimeter to see if it got any better.

American alligator 0.04321
banjo 0.03946
apiary 0.03699
jersey 0.03635
neck brace 0.02906

That looks a little better. It does get the alligator and, most importantly, it does not think that I am a skunk. I think it sees me as the jersey, or the neck brace?
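Trimming the perimeter of a photo, as was done for the result above, amounts to a center crop. The sketch below only computes the crop rectangle. The function name `center_crop_box`, the 20% trim, and the photo dimensions are all assumptions for illustration; the post does not say which tool or how much trimming was actually used.

```python
def center_crop_box(width, height, trim_fraction):
    """Return (left, upper, right, lower) for a crop that removes
    trim_fraction of the width and height from each side.

    Hypothetical helper for illustration; not part of the tutorial.
    """
    if not 0 <= trim_fraction < 0.5:
        raise ValueError("trim_fraction must be in [0, 0.5)")
    dx = int(width * trim_fraction)   # pixels removed from left and right
    dy = int(height * trim_fraction)  # pixels removed from top and bottom
    return (dx, dy, width - dx, height - dy)


# Assumed example: a 3264x2448 phone photo with 20% trimmed from every edge.
box = center_crop_box(3264, 2448, 0.2)
print(box)
```

A tuple in this (left, upper, right, lower) order is what Pillow's `Image.crop` accepts, so the cropped file could then be passed to the classifier script in place of the original.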


What happens if I take myself out and just look at the alligator?

aircraft carrier 0.08376
snow leopard 0.07310
grey whale 0.06862
indri 0.05899
steel arch bridge 0.04745

This is very interesting. It is obviously wrong, but I can see where it was going with the aircraft carrier, the whale and the bridge. This reassures me that these models are not magic. They do a good job of classifying images, but the problem is not completely solved. I think they deserve the attention they have been getting lately, though. Perhaps I will be able to dig deeper and look at their inner workings more in the future.
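The informal confidence checks made throughout this post can be written down as a small sketch. It parses result lists in the "label score" format shown above and flags whether the top score clears a threshold. The helper names and the 0.5 cutoff are assumptions chosen for illustration, not anything the tutorial or Caffe prescribes.

```python
def parse_predictions(text):
    """Parse 'label score' lines into a list of (label, score) pairs."""
    preds = []
    for line in text.strip().splitlines():
        # Labels may contain spaces, so split only on the last whitespace run.
        label, score = line.rsplit(None, 1)
        preds.append((label, float(score)))
    return preds


def top_prediction(preds, threshold=0.5):
    """Return (label, score, is_confident) for the highest-scoring entry.

    The 0.5 threshold is an arbitrary, assumed cutoff.
    """
    label, score = max(preds, key=lambda p: p[1])
    return label, score, score >= threshold


# Two of the result lists from this post, copied verbatim.
doberman = """Doberman 0.75929
miniature pinscher 0.09810
Rottweiler 0.07136
toy terrier 0.02606
black-and-tan coonhound 0.02000"""

phone_photo = """skunk 0.05580
coho 0.04459
American alligator 0.04231
tench 0.03036
banjo 0.02585"""

for name, text in [("doberman", doberman), ("phone photo", phone_photo)]:
    label, score, confident = top_prediction(parse_predictions(text))
    print(name, "->", label, score, "confident" if confident else "low confidence")
```

Under this assumed cutoff the Doberman result counts as confident, while the skunk guess for the phone photo does not.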