Wednesday 21st January, 2009
Yesterday Andrew Stromberg pointed me to the excellent iPhone app by image-matching outfit Snaptell.
Snaptell’s application takes an input image (of an album, DVD, or book) supplied by the user and identifies that product, linking to 3rd party services. This is equivalent to the impressive TinEye Music but with a broader scope. As Andrew points out, the app performs very well at recognising these products.
Algorithmically, the main problems faced by someone designing a system to do this are occlusions (e.g. someone covering a DVD cover with their thumb as they hold it) and transformations (e.g. a skewed camera angle, or a product that’s rotated in the frame).
There are a number of techniques for solving these problems (e.g. the SIFT and SURF algorithms), most of which use repeatable methods to find key points or patterns within images, and then encode those features in a way that is invariant to rotation (i.e. will still match when upside-down) and tolerant of an acceptable level of distortion. At query-time the search algorithm can then find the images with the most relevant clusters of matching keypoints.
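To make the query-time step a little more concrete, here is a minimal sketch in plain Python. It is not Snaptell’s method (which I haven’t seen), and it skips keypoint detection entirely: it assumes each indexed image has already been reduced to a handful of descriptor vectors. The function names, the toy 4-dimensional descriptors and the image ids are all made up for illustration. Each query descriptor votes for the image that owns its nearest neighbour, but only if that neighbour is clearly closer than the runner-up (the ratio test described in the SIFT paper).

```python
import math
from collections import Counter


def distance(a, b):
    """Euclidean distance between two descriptor vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def match_image(query_descriptors, index, ratio=0.75):
    """Rank indexed images by the number of confident descriptor matches.

    index is a list of (image_id, descriptor) pairs (a hypothetical layout).
    A query descriptor only votes when its nearest neighbour is closer than
    `ratio` times the second nearest, which discards ambiguous matches.
    """
    votes = Counter()
    for q in query_descriptors:
        ranked = sorted((distance(q, d), image_id) for image_id, d in index)
        if len(ranked) < 2:
            continue  # the ratio test needs at least two candidates
        (best, best_id), (second, _) = ranked[0], ranked[1]
        if best < ratio * second:
            votes[best_id] += 1
    return votes.most_common()


# Toy index: real descriptors would be e.g. 128-dimensional SIFT vectors.
index = [
    ("lucene_in_action", (0.9, 0.1, 0.0, 0.2)),
    ("lucene_in_action", (0.1, 0.8, 0.3, 0.0)),
    ("lucene_in_action", (0.0, 0.2, 0.9, 0.4)),
    ("other_book", (0.5, 0.5, 0.5, 0.5)),
    ("other_book", (0.2, 0.1, 0.1, 0.9)),
]

# Two descriptors survive from a partly covered cover; the third is ambiguous.
query = [
    (0.88, 0.12, 0.02, 0.18),
    (0.12, 0.79, 0.28, 0.03),
    (0.5, 0.5, 0.1, 0.1),
]

results = match_image(query, index)
```

Because every keypoint votes independently, covering part of the cover only removes some of the votes rather than breaking the match, which is roughly why this family of techniques copes with occlusion. A real system would replace the brute-force sort with an approximate nearest-neighbour index and would usually check that the matched keypoints agree geometrically.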
It seems like Snaptell have mastered a version of these techniques. When I tested the app’s behaviour (using my copy of Lucene in Action) I chose an awkward camera angle and obscured around a third of the cover with my hand and it still worked perfectly. Well done Snaptell.
Monday 19th January, 2009
The hyperlink revolution allowed text documents to be joined together. This created usable relationships between data that have enabled one of the biggest technological shifts of the recent age: large-scale adoption of the internet. Try to imagine Wikipedia or Google without hyperlinks and you’ll see how critical this technique is to the web.
We’re on the verge of another revolution, this time in computer vision.
Imagine a world where the phone in your pocket could be used to find or create links in the physical world. You could get reviews for a restaurant you were standing outside without even knowing its name, or where you were. You could listen to snippets of an album before you bought it, or find out where nearby has the same item for less. You could read about the history of an otherwise unmarked and anonymous building, get visual directions, or use your camera phone as a window into a virtual game in the real world.
A team at the University of Ljubljana (the J is pronounced like a Y for anyone unfamiliar) have released a compelling video demonstrating their implementation of visual linking. They use techniques that I assume are derived from SIFT to match known buildings in an unconstrained walk through a neighbourhood. These image segments are then converted into links to contextually relevant information.
When you combine this with other techniques, such as the contour-based work being done by Jamie Shotton of MSR, you start to see how that future will appear. Bring in the mass adoption of GPS handsets driven by the iPhone amongst others and it’s pretty clear there’s going to be a change in the way people create and access information.
The only questions are who, and when.
Wednesday 15th October, 2008
I’ve just noticed that we now have job specs for the positions we’re looking to fill.
The office is on Portobello Road in west London. We’re playing with some cutting-edge technology, in a positive and enthusiastic environment, so if you’re looking or have any questions then get in touch.
Tuesday 23rd September, 2008
Building an image search index takes quite a lot of processing power. Apart from all the usual mucking about that building a regular search index entails, you also have to download, resize, and analyse all the images that you want in your index. That analysis itself will consist of many different tasks, usually including the use of visual features to analyse colour, texture, shape, etc. and the use of classifiers to recognise specific objects.
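As a rough illustration of the simplest kind of visual feature mentioned above, here is a sketch of a coarse colour histogram in plain Python. It assumes the image has already been downloaded, resized and decoded into a list of (r, g, b) tuples with values from 0 to 255; the function name and the bin count are my own choices for the example, not anything taken from a particular library.

```python
def colour_histogram(pixels, bins_per_channel=2):
    """Return a normalised colour histogram for an iterable of (r, g, b) pixels.

    Each channel (0-255) is quantised into bins_per_channel buckets, giving
    bins_per_channel ** 3 bins in total. The result sums to 1.0 for a
    non-empty image, so images of different sizes can be compared.
    """
    counts = [0] * (bins_per_channel ** 3)
    total = 0
    for r, g, b in pixels:
        # Map each 0-255 channel value to a bucket in [0, bins_per_channel).
        rb = r * bins_per_channel // 256
        gb = g * bins_per_channel // 256
        bb = b * bins_per_channel // 256
        counts[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1
        total += 1
    if total == 0:
        return [0.0] * len(counts)
    return [c / total for c in counts]


# Four toy pixels: two reds, one blue, one black.
pixels = [(255, 0, 0), (250, 10, 5), (0, 0, 255), (0, 0, 0)]
hist = colour_histogram(pixels)
```

With two bins per channel the two red pixels land in the same bin, so half of the histogram’s mass sits there. Texture and shape features, and the classifiers, are considerably more expensive than this, and running them over every image is where most of the processing power goes.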
Canadian visual search outfit Incogna have taken an interesting approach to image processing: from what I can tell, they build their image search indexes using massively parallel GPUs. Asking around the team, I gather that technique has produced some very successful tests, anecdotally at least, so I’ll be keeping an eye on these guys in future.