A collection of things has led me to want to learn more about Deep Learning. I have heard the term for some time now but am not really sure what it is all about. I have had some experience with image processing from my days in undergraduate engineering and image mining from my days as an engineer. That has left me wanting to try my luck at image mining again. Both of these areas seem to be all the rage now, as many major companies have been buying startups related in any way to images or deep learning. Kaggle has taken notice as well.
It appears one is also a great tool for solving the other. So where do I start learning these things? At one point, when I was a graduate student, I knew a fair deal about neural networks. Some of that has since faded, but I thought it was enough to jump in and try some deep learning tools. Later I can try to dig into the math.
There are tutorials all over; some are good and some are bad. Here is a link to one that gave me some success.
The first requirement you need to fulfill is to install Vagrant and be able to use it. This was a great learning experience in itself as I have been meaning to learn one of these virtual development environment tools. I can't say that I am an expert but it was fairly easy to get up to speed and feel comfortable with this one. I am sure my knowledge only scratched the surface but I already see the value in them.
Once this is set up, you can run the following from the command line.
mkdir ~/dl_webcast
cd ~/dl_webcast
vagrant box add dl_webcast https://d2rlgkokhpr1uq.cloudfront.net/dl_webcast.box
vagrant init dl_webcast
vagrant up
vagrant ssh
The real value in this virtual environment is that it already includes a pretrained model. The tutorial has you evaluate an image that comes preloaded. To jump right in and use the pre-built model, run the Python script as shown below.
cd ~/caffe
python python/classify.py --print_results examples/images/cat.jpg foo
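Since the example ships with only one image, classifying several of your own means repeating that command. Here is a minimal sketch of a wrapper, assuming it is run from ~/caffe inside the VM. The helper names are my own invention, and the only arguments used are the ones in the command above (including the `foo` output argument, passed through unchanged).

```python
import subprocess

def build_command(image_path, output_name="foo"):
    """Build the same classify.py invocation shown above for one image."""
    return [
        "python", "python/classify.py",
        "--print_results",
        image_path,
        output_name,
    ]

def classify_all(image_paths, run=subprocess.call):
    """Run the classifier once per image and return {path: exit_code}.

    `run` is injectable so the loop can be tried without Caffe installed.
    """
    results = {}
    for path in image_paths:
        results[path] = run(build_command(path))
    return results
```

Calling `classify_all(["examples/images/cat.jpg"])` would repeat the run above; any other file names you pass in are up to you.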
Now let's try this with some images I found online since the example only comes with one image to evaluate.
beagle | 0.49523 |
English foxhound | 0.07124 |
Walker hound | 0.05694 |
basset | 0.05413 |
basset | 0.01672 |

Doberman | 0.75929 |
miniature pinscher | 0.09810 |
Rottweiler | 0.07136 |
toy terrier | 0.02606 |
black-and-tan coonhound | 0.02000 |
The results for these dog images look pretty good. Now let's try something I think is a little harder. I can't tell the difference between an alligator and a crocodile. One image even has another animal in it to throw the model off. How well does it work?
American alligator | 0.72230 |
African crocodile | 0.27429 |
common iguana | 0.00146 |
Komodo dragon | 0.00028 |
rock python | 0.00024 |

African crocodile | 0.39933 |
American alligator | 0.16945 |
alligator lizard | 0.05302 |
agama | 0.05002 |
garter snake | 0.04964 |
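One way to compare the two results above is the gap between the top guess and the runner-up. The sketch below assumes the rows have been captured as text in the `label | probability |` layout used in this post; the script's raw output may be laid out differently, and the function names are hypothetical.

```python
def parse_results(text):
    """Parse 'label | probability |' rows into (label, probability) pairs."""
    pairs = []
    for line in text.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) < 2 or not parts[0]:
            continue  # skip blank or malformed lines
        pairs.append((parts[0], float(parts[1])))
    return pairs

def top_margin(pairs):
    """Gap between the best and second-best probabilities."""
    ranked = sorted(pairs, key=lambda p: p[1], reverse=True)
    return ranked[0][1] - ranked[1][1]

first = parse_results("""American alligator | 0.72230 |
African crocodile | 0.27429 |
common iguana | 0.00146 |""")
second = parse_results("""African crocodile | 0.39933 |
American alligator | 0.16945 |
alligator lizard | 0.05302 |""")

print(round(top_margin(first), 5))   # 0.44801
print(round(top_margin(second), 5))  # 0.22988
```

The second image, the one with the extra animal, gets a noticeably smaller margin, which matches the feeling that it was the harder of the two.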
Wow, I am pretty impressed. This seemed to work pretty well. Deep Learning does seem like magic at first, but these are all pictures from the internet. How well will it work on real images? I wanted to try an image I took with my phone. I sacrificed life and limb to get the picture below, hoping it would be a harder test. I was also thinking this would be closer to how this could be used in a cool application: a phone app that tells you what kind of wildlife you are looking at.
skunk | 0.05580 |
coho | 0.04459 |
American alligator | 0.04231 |
tench | 0.03036 |
banjo | 0.02585 |
WHAT? It thinks I am a skunk. After the bike ride it took to get there my wife may agree with that classification, but I sure don't. I am at ease, though, after looking at the probability: it has very low confidence in its guess. I then wondered what would happen if I modified the picture a bit. I took off a lot of the perimeter to see if it got any better.
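However the trimming is done, it comes down to picking a smaller rectangle inside the original image. A minimal sketch of that arithmetic follows. The function name, image size and trim fraction are made up for illustration, and the tuple order (left, upper, right, lower) is chosen to match what Pillow's `Image.crop` expects, in case you want to hand the box to it.

```python
def perimeter_crop_box(width, height, fraction):
    """Return (left, upper, right, lower) after trimming `fraction` of the
    width and height from each side of the image."""
    if not 0 <= fraction < 0.5:
        raise ValueError("fraction must be in [0, 0.5)")
    dx = int(width * fraction)   # pixels removed from the left and right
    dy = int(height * fraction)  # pixels removed from the top and bottom
    return (dx, dy, width - dx, height - dy)

# Hypothetical 1600x1200 photo with a quarter trimmed from every side.
print(perimeter_crop_box(1600, 1200, 0.25))  # (400, 300, 1200, 900)
```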
American alligator | 0.04321 |
banjo | 0.03946 |
apiary | 0.03699 |
jersey | 0.03635 |
neck brace | 0.02906 |
That looks a little better. It does get the alligator and, most importantly, it does not think that I am a skunk. I think it sees me as the jersey, or the neck brace?
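Even so, a top guess at around 4% is hardly a confident answer. For something like the phone app idea, it might make sense to report a label only when the top probability clears some bar. Here is a rough sketch; the function name is hypothetical and the 0.5 threshold is an arbitrary assumption, not something taken from the tutorial.

```python
def confident_label(pairs, threshold=0.5):
    """Return the top label if its probability reaches `threshold`, else None."""
    label, prob = max(pairs, key=lambda p: p[1])
    return label if prob >= threshold else None

# Top two rows from the Doberman result and from the cropped phone picture.
doberman = [("Doberman", 0.75929), ("miniature pinscher", 0.09810)]
cropped = [("American alligator", 0.04321), ("banjo", 0.03946)]

print(confident_label(doberman))  # Doberman
print(confident_label(cropped))   # None
```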
What happens if I take myself out and just look at the alligator?
aircraft carrier | 0.08376 |
snow leopard | 0.07310 |
grey whale | 0.06862 |
indri | 0.05899 |
steel arch bridge | 0.04745 |
This is very interesting. It is obviously wrong, but I can see where it was going with the aircraft carrier, the whale and the bridge. This reassures me that these models are not magic. They do a good job of classifying images, but the problem is far from completely solved. I think they deserve the attention they have been getting lately, though. Perhaps I will be able to dig deeper and look at their inner workings in the future.