I’ve written previously about work by Canadian image search technologists Incogna to harness the power of graphics processors to index images, but I’ve only just found time to try out their image search beta.
A bit of context
The challenges surrounding image search are significant and numerous (just like the opportunities). To start with, an image search engine has to have most of the main features and properties of a text search engine, so its builders face some degree of natural language processing, semantic indexing, and all the challenges that come with creating a distributed search index.
When you add image comparison into the mix, that puts even more load on your infrastructure in terms of bandwidth, storage, the CPU time required to extract usable data from images, and the data structures required to hold that visual data.
While those aspects of a search engine design can be tricky to get right, I don’t see them as the hardest problem. I think the hardest problem is caused by the fact that a picture paints a thousand words. An image contains so much information that it’s hard to know exactly what aspect of it a user is looking at… and hence what they mean by similarity and relevance. For example, if a user submits a holiday snap from Yosemite as a query image, are they looking for other pictures of Yosemite, or for shirts like the guy in the picture is wearing?
The point I’m getting to is that nobody has really nailed this problem yet, because it’s hard.
Playing with this search engine is fun. The user interface is slick and responsive, and search results are returned in a fraction of a second (it’s hard to overstate how important the perception of speed is to user experience). Looking under the hood, the interface uses Prototype to handle animation and talk to a JSON API, which is a solid choice.
From the results returned from a few similarity searches, they seem to be calculating similarity using some sort of shape analysis, some colour data, and also making significant use of non-visual metadata.
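To make the colour part of that concrete, here is a toy sketch of one common way colour similarity can be computed: quantise pixels into coarse colour bins, build a normalised histogram per image, and score pairs with histogram intersection. This is purely illustrative — Incogna haven't published how their similarity measure actually works, and real systems would combine this with shape features and metadata.

```python
def colour_histogram(pixels, bins=4):
    """Build a normalised histogram over quantised (r, g, b) pixels.

    `pixels` is a list of (r, g, b) tuples with channels in 0..255.
    Each channel is quantised into `bins` buckets.
    """
    hist = {}
    step = 256 // bins
    for r, g, b in pixels:
        key = (r // step, g // step, b // step)
        hist[key] = hist.get(key, 0) + 1
    total = len(pixels)
    return {k: v / total for k, v in hist.items()}

def similarity(hist_a, hist_b):
    """Histogram intersection: 1.0 means identical colour distributions."""
    return sum(min(hist_a.get(k, 0.0), hist_b.get(k, 0.0))
               for k in set(hist_a) | set(hist_b))

# Three synthetic "images" as flat pixel lists.
red = colour_histogram([(250, 10, 10)] * 100)
pink = colour_histogram([(250, 10, 10)] * 50 + [(250, 200, 200)] * 50)
blue = colour_histogram([(10, 10, 250)] * 100)

print(similarity(red, pink))  # 0.5 -- half the colour mass overlaps
print(similarity(red, blue))  # 0.0 -- no shared colour bins
```

The appeal of a scheme like this is that histograms are tiny compared to the source images, so they can be precomputed at crawl time and compared cheaply at query time.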
The input data they’ve crawled seems to have come from a fairly wide range of sources, although it’s hard to see how big their index is in total since they return quite a tight result set, opting for precision rather than recall.
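For anyone unfamiliar with the trade-off: precision measures how clean the returned set is, recall measures how complete it is, and a deliberately tight result set like this maximises the former at the expense of the latter. A minimal sketch (the example sets are made up for illustration):

```python
def precision_recall(returned, relevant):
    """Compute precision and recall for one query.

    precision = fraction of returned results that are relevant
    recall    = fraction of all relevant items that were returned
    """
    returned, relevant = set(returned), set(relevant)
    hits = returned & relevant
    precision = len(hits) / len(returned) if returned else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# Ten relevant images exist in the index; the engine returns a tight
# set of four, all of them relevant: perfect precision, modest recall.
p, r = precision_recall(
    returned=["a", "b", "c", "d"],
    relevant=["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"],
)
print(p, r)  # 1.0 0.4
```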
That leads me to my only criticism of the beta, which is that precision is only useful if you can be sure you have a reasonable understanding of what is relevant to a user. The beta doesn’t offer the user a mechanism to help it determine relevance, which is the “thousand words” problem I described above. To become really useful, the search engine has to let users help it decide what is relevant. That said, I don’t have all the answers myself, and Incogna seem like a smart bunch. I expect them to have some very interesting ideas over the next 12 months.