Friday, June 11, 2010

Finding Identical and Different Images

It's a common problem to have lots of images, and when you do, it's even more common to have lots of duplicates. The problem with images is that they can look the same to the eye but be vastly different at the binary level: different resolutions or slightly different cropping, yet at a glance the same. In IT it's common to calculate a hash, a magic number that changes radically for even a slight difference in the input. What's more useful for finding duplicate images is a hash that changes only slightly between near-identical images.
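
To make the contrast concrete, here is a minimal sketch in plain Ruby. It is not the plugin's algorithm: it uses a much simpler "average hash" (one bit per pixel, set when the pixel is brighter than the image's mean), and the 4x4 grayscale pixel arrays, along with the helper names `average_hash` and `hamming_distance`, are made up for illustration.

```ruby
require "digest"

# Hypothetical 4x4 grayscale "image": flat array of brightness values (0-255).
original = [
  10, 20, 200, 210,
  15, 25, 205, 215,
  12, 22, 202, 212,
  18, 28, 208, 218
]
# The same picture, one brightness step lighter (a near-duplicate).
tweaked = original.map { |v| v + 1 }
# A genuinely different picture: the negative of the original.
different = original.map { |v| 255 - v }

# Average hash: bit i is set when pixel i is brighter than the mean.
# This is a simplified stand-in for a real perceptual hash.
def average_hash(pixels)
  mean = pixels.sum.to_f / pixels.size
  pixels.each_with_index.reduce(0) do |bits, (value, i)|
    value > mean ? bits | (1 << i) : bits
  end
end

# Number of bit positions in which two integer hashes differ.
def hamming_distance(a, b)
  (a ^ b).to_s(2).count("1")
end

# A conventional hash of the raw bytes changes completely for a tiny edit.
crypto_a = Digest::SHA256.hexdigest(original.pack("C*"))
crypto_b = Digest::SHA256.hexdigest(tweaked.pack("C*"))

perceptual_a = average_hash(original)
perceptual_b = average_hash(tweaked)

puts "SHA-256 digests equal?      #{crypto_a == crypto_b}"
puts "Distance to near-duplicate: #{hamming_distance(perceptual_a, perceptual_b)}"
puts "Distance to different image: #{hamming_distance(perceptual_a, average_hash(different))}"
```

With this toy data the SHA-256 digests do not match, while the average hash puts the tweaked copy at a distance of 0 bits and the negative at 16 bits. A small distance suggests a likely duplicate; where to draw the threshold is a judgment call that depends on the hash and the images.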

The Ruby plugin linked below, phashion, computes exactly that kind of hash. I'm looking at using it in an upcoming project, and will update this article as I go. What's even more exciting is the ability to do the same with video.

http://github.com/mperham/phashion