Right in line with my too-obvious-to-be-worth-anything prediction, Google have just released a Labs image similarity feature for Google Images. Others have commented on this already, but obviously this is hugely interesting for me because of my current work on Empora’s exploratory visual search, so I’m going to throw in my tuppence as well.
Below are my first impressions.
Google Similar Images (GSI) offers just one piece of functionality: the ability to find images that are similar to your selected image. You may only select images from their chosen set; there’s no dynamic image search capacity yet. Similar images are displayed either as a conventional result set when you click on “similar images”, or as a list of thumbnails in the header when you click through to see the original source.
The aims of this work will be (broadly):
- Keeping up with the Joneses. The other major search engines are working on similar functionality and Google can’t be seen to fall behind.
- User engagement. The more time you spend exploring on Google, the more their brand is burned into your subconscious.
- Later expansion of search monetisation. AdSense and AdWords get a better CTR than untargeted advertising because they adapt to the context of your search. If context can also be established visually, there seems to be strong potential for revenue.
The quality of results for a project like this is always going to be variable, as the compromises between precision, recall, performance, and cost are going to continue to be sketched out in crayon until more mature vocabularies and toolsets are available. That said, Google need to keep users impressed, and they’ve done pretty well.
A few good examples:
A few bad examples:
- Not shoes at all actually
- What exactly is the similarity measure here? Faces? Bridalwear? Hair products?
Under the hood
Once the “qtype=similar” parameter is set in the URL, the only parameter that affects the set of similar images is the “tbnid”, which identifies the query image. The text query parameter does not seem to change the result set; it only changes the accompanying UI. While this doesn’t allow us to draw any dramatic conclusions, it does mean the results depend on the image alone, which would allow them to pre-compute the results for each image.
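To make the pre-computation point concrete, here is a minimal sketch of how a result cache could be keyed if only “tbnid” matters. The host, path, the “q” parameter name and the “abc123” identifier are made up for illustration; only “qtype=similar” and “tbnid” come from what I observed, and this is a guess at the shape of the idea, not a description of Google’s implementation.

```python
from urllib.parse import urlparse, parse_qs

def similar_cache_key(url):
    """Return the tbnid for a similar-images request, or None otherwise.

    Assumption: the result set depends only on tbnid, so the text
    query is deliberately ignored when building the key.
    """
    params = parse_qs(urlparse(url).query)
    if params.get("qtype", [""])[0] != "similar":
        return None
    return params.get("tbnid", [None])[0]

# Hypothetical URLs: same image id, different text queries.
url_a = "http://example.com/images?q=red+shoes&qtype=similar&tbnid=abc123"
url_b = "http://example.com/images?q=something+else&qtype=similar&tbnid=abc123"

# Both requests map to the same key, so one precomputed result list
# could serve both; only the surrounding UI would differ.
print(similar_cache_key(url_a) == similar_cache_key(url_b))
```

If the text query did influence the results, it would have to be part of the key, and per-image pre-computation would stop being practical.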
The first clear conclusion is that metadata is involved. Google have obviously been leveraging their formidable text index, and why not? The image similarity behaviour indicates that the textual metadata associated with images is being used to affect the results. One of the clearest indicators is that they’re capable of recognising the same individual’s face as long as that person’s name is mentioned. Unnamed models don’t benefit from the same functionality.
My second insight is that they’re almost certainly using a structural technique such as wavelet decomposition to detect shapes within images. The dead giveaway here is that search results are strongly biased towards photographs taken from the same angle.
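For readers who haven’t met wavelet-based matching, here is a toy sketch of the general idea using the Haar wavelet: decompose a greyscale image, keep only the positions and signs of the largest coefficients as a signature, and compare signatures. The function names, the `keep` parameter and the tiny 4x4 “images” are all my own inventions for illustration; I have no knowledge of which technique Google actually uses.

```python
def haar_1d(values):
    """Full 1D Haar decomposition; len(values) must be a power of two."""
    out = list(values)
    n = len(out)
    while n > 1:
        half = n // 2
        avgs = [(out[2 * i] + out[2 * i + 1]) / 2 for i in range(half)]
        diffs = [(out[2 * i] - out[2 * i + 1]) / 2 for i in range(half)]
        out[:n] = avgs + diffs
        n = half
    return out

def haar_2d(image):
    """Standard 2D decomposition: transform every row, then every column."""
    rows = [haar_1d(row) for row in image]
    cols = [haar_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]

def signature(image, keep=4):
    """Positions and signs of the `keep` largest non-zero coefficients.

    The coefficient at (0, 0) is the overall average brightness and is
    skipped, so the signature describes shape rather than exposure.
    """
    coeffs = haar_2d(image)
    ranked = sorted(
        ((abs(v), y, x, v > 0)
         for y, row in enumerate(coeffs)
         for x, v in enumerate(row)
         if (y, x) != (0, 0) and v != 0),
        reverse=True,
    )
    return {(y, x, positive) for _, y, x, positive in ranked[:keep]}

def similarity(a, b, keep=4):
    """Fraction of signature entries shared by two images (0.0 to 1.0)."""
    sa, sb = signature(a, keep), signature(b, keep)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / max(len(sa), len(sb))

# Toy 4x4 greyscale "images": bright on the left, a lower-contrast
# version of the same layout, and a mirrored version.
left = [[9, 9, 1, 1]] * 4
left_low_contrast = [[7, 7, 3, 3]] * 4
mirrored = [[1, 1, 9, 9]] * 4

print(similarity(left, left_low_contrast))
print(similarity(left, mirrored))
```

The same layout at a different contrast matches, while the mirrored layout does not, because flipping the image flips the sign of the dominant coefficient. A scheme along these lines would be sensitive to where things sit in the frame, which fits the same-angle bias described above.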
I suspect that they’re not yet using a visual fingerprinting technique (for example, one built on local features found by a corner detector such as FAST) to recognise photographs of the same object. If they were doing this already, I’d expect them to have used it to remove duplicate images. This may well come later.
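As a rough illustration of what duplicate removal could look like, here is a sketch using an average hash. To be clear, this is a much simpler stand-in for the feature-based fingerprinting mentioned above, not FAST itself, and it only catches near-identical copies rather than different photographs of the same object. It assumes every image has already been reduced to a small greyscale grid of the same size; the names, the threshold and the sample pixel values are hypothetical.

```python
def average_hash(pixels):
    """One bit per pixel: 1 if brighter than the image's mean, else 0."""
    flat = [v for row in pixels for v in row]
    mean = sum(flat) / len(flat)
    return [1 if v > mean else 0 for v in flat]

def hamming(a, b):
    """Number of positions at which two equal-length hashes differ."""
    return sum(x != y for x, y in zip(a, b))

def dedupe(images, max_distance=0):
    """Keep the first image of each group of near-identical images.

    `images` is a list of (name, pixel_grid) pairs; two images count as
    duplicates when their hashes differ in at most `max_distance` bits.
    """
    kept = []
    for name, pixels in images:
        h = average_hash(pixels)
        if all(hamming(h, kept_hash) > max_distance for _, kept_hash in kept):
            kept.append((name, h))
    return [name for name, _ in kept]

# Made-up 2x2 thumbnails: an image, a recompressed copy with slightly
# different pixel values, and a genuinely different image.
results = [
    ("original", [[10, 200], [220, 30]]),
    ("recompressed", [[12, 198], [215, 33]]),
    ("different", [[200, 10], [30, 220]]),
]

print(dedupe(results))
```

Running a pass like this over a result set would collapse the repeated copies of the same picture into one entry, which is the behaviour I’d expect to see if some form of fingerprinting were in place.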
All in all my impression is that they’ve implemented this stuff well, but that there’s a lot more yet to come. Namely:
- Handling of duplicates, i.e. separating the search for similar images from the search for instances of the same image
- A revenue stream