Pixsta team launches Empora fashion site

Friday 3rd April, 2009

Last night we finally broke a bottle of champagne against the side of the good ship Empora and watched her slide out of the dock. We’ve been working on the project for the past couple of months, so it’s a pleasure to see it go live.

As well as the usual search functionality you’d expect on a retail site, Empora enables searching and browsing using the content of product images (currently either women’s clothes or men’s clothes). When you view a product you’re also shown items that may relate to it visually, either in terms of shape or colour.
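For anyone curious what relating products “visually, in terms of colour” can mean in practice, here’s a minimal sketch of comparing two product images by their colour distributions with OpenCV. It is not Empora’s actual implementation, and the filenames are placeholders.

```python
# A minimal sketch (not Empora's implementation) of colour-based similarity
# between two product images, using OpenCV histograms. Filenames are placeholders.
import cv2


def colour_histogram(path, bins=[8, 8, 8]):
    """Build a normalised 3D colour histogram in HSV space."""
    image = cv2.imread(path)
    hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0, 1, 2], None, bins, [0, 180, 0, 256, 0, 256])
    return cv2.normalize(hist, hist).flatten()


def colour_similarity(path_a, path_b):
    """Higher scores mean more similar colour distributions."""
    return cv2.compareHist(colour_histogram(path_a),
                           colour_histogram(path_b),
                           cv2.HISTCMP_CORREL)


print(colour_similarity("dress_a.jpg", "dress_b.jpg"))
```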

As with any project there are always things I’d change, and things that aren’t done yet, but overall I’m pretty chuffed with what our team has accomplished so far. We’re by no means finished though. Expect big things in the near future.


Twitter search parallels other vertical search domains

Sunday 1st March, 2009

In case you haven’t tried it already, Twitter’s search tool is very well implemented. It’s effective, slick, and very fast.

Being able to quickly and efficiently search through the life streams and conversations of a good proportion of the thought leaders and early adopters in the UK and US seems to me like something with a bit of potential… a stream that’s ripe for news and knowledge management apps like Techmeme, Silobreaker, and Google News. It’s a fair bet that conversation and life-streaming will be a valuable search domain, just like user-uploaded video (apparently YouTube searches outnumber Yahoo’s).

Conventional (i.e. text- and metadata-driven) image search is another search domain in which the big search companies seem willing to absorb losses. As I (and many others) have mentioned before, their willingness to do this stems from their desire to occupy user mindshare for the entire search concept, rather than piecemeal domains or verticals. As we can see from attempts by Google and Microsoft to include content-based image retrieval (CBIR) functionality, that eagerness is not likely to be restricted to textual image search.

My opinion is obviously biased, but I wouldn’t be that surprised to see “conversation” (Twitter, Friendfeed and life-streaming) and “product” (including price and visual similarity features) tabs integrated into the search boxes of the big three in the relatively near future.


Recognising specific products in images

Wednesday 21st January, 2009

Yesterday Andrew Stromberg pointed me to the excellent iPhone app by image-matching outfit Snaptell.

Snaptell’s application takes an input image (of an album, DVD, or book) supplied by the user and identifies that product, linking out to third-party services. This is equivalent to the impressive TinEye Music but with a broader scope. As Andrew points out, the app performs very well at recognising these products.

Algorithmically, the main problems faced by someone designing a system to do this are occlusions (e.g. someone covering a DVD cover with their thumb as they hold it) and transformations (e.g. a skewed camera angle, or a product that’s rotated in the frame).

There are a number of techniques to solve these problems (e.g. the SIFT and SURF algorithms), most of which involve using repeatable methods to find key points or patterns within images, and then encoding those features in a way that is invariant to rotation (i.e. will still match when upside-down) and to an acceptable level of distortion. At query time the search algorithm can then find the images with the most relevant clusters of matching keypoints.
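As a rough illustration of that general approach (and not Snaptell’s actual pipeline), here’s a sketch using OpenCV’s ORB detector, a freely available alternative to SIFT/SURF. The filenames are placeholders.

```python
# A sketch of keypoint matching between a photo and a known product image.
# ORB stands in for SIFT/SURF; this is illustrative, not Snaptell's method.
import cv2

orb = cv2.ORB_create(nfeatures=1000)

query = cv2.imread("photo_of_cover.jpg", cv2.IMREAD_GRAYSCALE)
reference = cv2.imread("catalogue_cover.jpg", cv2.IMREAD_GRAYSCALE)

# Detect keypoints and compute rotation-invariant descriptors for both images.
kp_q, desc_q = orb.detectAndCompute(query, None)
kp_r, desc_r = orb.detectAndCompute(reference, None)

# Match descriptors and keep only pairs that pass Lowe's ratio test,
# which throws away ambiguous correspondences.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
matches = matcher.knnMatch(desc_q, desc_r, k=2)
good = [m for m, n in (p for p in matches if len(p) == 2)
        if m.distance < 0.75 * n.distance]

# A large cluster of good matches suggests the reference product appears
# in the photo, even when it's partly occluded or at an awkward angle.
print(f"{len(good)} good matches")
```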

It seems like Snaptell have mastered a version of these techniques. When I tested the app’s behaviour (using my copy of Lucene in Action) I chose an awkward camera angle and obscured around a third of the cover with my hand, and it still worked perfectly. Well done Snaptell.


Welcome to the image link revolution

Monday 19th January, 2009

The hyperlink revolution allowed text documents to be joined together. This created usable relationships between data that have enabled one of the biggest technological shifts of recent times… the large-scale adoption of the internet. Try to imagine Wikipedia or Google without hyperlinks and you’ll see how critical this technique is to the web.

We’re on the verge of another revolution, this time in computer vision.

Imagine a world where the phone in your pocket could be used to find or create links in the physical world. You could get reviews for a restaurant you were standing outside without even knowing its name, or where you were. You could listen to snippets of an album before you bought it, or find out which nearby shop has the same item for less. You could read about the history of an otherwise unmarked and anonymous building, get visual directions, or use your camera phone as a window into a virtual game in the real world.

A team at the University of Ljubljana (the ‘j’ is pronounced like a ‘y’, for anyone unfamiliar) have released a compelling video demonstrating their implementation of visual linking. They use techniques that I assume are derived from SIFT to match known buildings in an unconstrained walk through a neighbourhood. The matched image segments are then converted into links that surface contextually relevant information.
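By way of illustration (and certainly not the Ljubljana team’s actual code), here’s a sketch of how a matched building could become an on-screen link: estimate where the known reference appears in the camera frame via a homography, then attach a URL to that region. The keypoints and matches are assumed to come from something like the ORB matching sketch above, and the URL mapping is entirely hypothetical.

```python
# A hedged sketch of turning a matched reference image into a link region.
import cv2
import numpy as np


def locate_reference(kp_ref, kp_frame, good_matches, ref_shape):
    """Project the reference image's corners into the camera frame, giving a
    quadrilateral that could be rendered as a tappable link region.
    Assumes matches were made with reference descriptors as the query side
    and frame descriptors as the train side."""
    if len(good_matches) < 10:          # too few matches: show no link at all
        return None
    src = np.float32([kp_ref[m.queryIdx].pt for m in good_matches]).reshape(-1, 1, 2)
    dst = np.float32([kp_frame[m.trainIdx].pt for m in good_matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    if H is None:
        return None
    h, w = ref_shape[:2]
    corners = np.float32([[0, 0], [w, 0], [w, h], [0, h]]).reshape(-1, 1, 2)
    return cv2.perspectiveTransform(corners, H)


# Hypothetical mapping from recognised buildings to contextual information.
BUILDING_LINKS = {"ljubljana_town_hall": "https://example.org/town-hall-history"}
```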

Combine this with other techniques, such as the contour-based work being done by Jamie Shotton at MSR, and you start to see how that future will look. Bring in the mass adoption of GPS handsets, driven by the iPhone amongst others, and it’s pretty clear there’s going to be a change in the way people create and access information.

The only questions are who, and when.


Incogna monetise pure image search

Monday 12th January, 2009

I must have missed the launch of this feature, but Incogna’s most recent blog post talks about how they’ve implemented visual advertising. The results vary, but overall they’ve implemented it well.

I’ve written about Incogna’s image search before, but there’s more to add: as a user you have no visibility into the depth or type of data available to you, nor does the app currently give you any way to move through it other than text search and query images.

Establishing context (or, lost in the supermarket)

Any fans of Steve Krug’s usability classic will recognise the metaphor here. If you’re standing in a supermarket aisle you can see both the length of the aisle and the contents of the shelves (at least the ones near you). You also know your rough position in the store, and can see the overhead signs.

Using that input data you can navigate (with a few hiccups) anywhere in the store.

Incogna’s app currently allows you to compare visually, and to search using text, but the depth and type of results remains hidden. As such there’s no real way to effectively navigate within the data set.

I should be clear at this point that this isn’t a criticism of Incogna’s app. This is not a problem with an easy or obvious solution. What I’m suggesting is that there’s still scope for some killer navigation features in this area.

Making money

The monetisation feature on Incogna appears only when their system thinks it can produce a good match between your search and the sponsored products. This is a wise move, since irrelevant ads would ruin the user experience.

It seems like the results use mainly visual comparison data, possibly with some categorisation thrown in. It worked brilliantly with pictures of trucks, but curiously, while I was browsing Canon cameras, it presented sponsored ads for televisions (both are rectangular, I suppose).
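To make the gating idea concrete, here’s a toy sketch (not Incogna’s actual logic; every name in it is hypothetical) of only surfacing sponsored products when the best visual-similarity score clears a threshold and the category lines up:

```python
# A toy sketch of ad gating: show sponsored products only on a strong match.
# All names and the threshold value are hypothetical.

SIMILARITY_THRESHOLD = 0.8   # arbitrary cut-off for this sketch


def sponsored_results(query_features, query_category, sponsored_index, visual_similarity):
    """Return up to three sponsored products worth showing, or nothing at all."""
    scored = []
    for product in sponsored_index:
        score = visual_similarity(query_features, product["features"])
        # Require both a strong visual match and a matching category, so that
        # irrelevant ads are suppressed rather than shown anyway.
        if score >= SIMILARITY_THRESHOLD and product.get("category") == query_category:
            scored.append((score, product))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [product for _, product in scored[:3]]
```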

Having fun

The main issue standing in the way of Incogna’s revenue stream is that their app is not yet fun to use. As mentioned above, there’s no sense of position or direction. You can’t learn anything about the images you find without clicking through to the source site, and you can’t properly refine your search… you have to start again, which means there’s no big advantage over Google, or any other text-based image search.

More another time.


Dose – alpha entering final contractions

Thursday 27th November, 2008

We’re nearing completion of the first alpha of our new content-based image retrieval (CBIR) engine. It seems wrong to let it be born without a name, so I’ve settled provisionally on Dose, which stands for Distributed Object Similarity Engine.

It’ll be a little while before we’ve got any user-facing products to show for all our hard work, but we’ve learnt a lot and should have something good at the end of it.


Idée’s Multicolr Search Lab

Thursday 27th November, 2008

This morning I had another play with the Multicolr Search Lab from visual search outfit Idée Inc and decided to make some notes, which I’ve posted below.

Idée in context

Idée are one of the bigger players in the visual search space, although they currently occupy quite a different market to my new team Pixsta.

Idée’s biggest product so far is TinEye, which is used to find uses of a single specific source image across the Internet. The main commercial use appears to be detection of copyright infringement, a service they provide to photographers and copyright owners for a fee. To my (admittedly limited) knowledge they’re the only company offering this specific service.

Multicolr

The Multicolr Search Lab (MSL) is a proof-of-concept that demonstrates Idée’s ability to index images by colour. As an image-based ‘labs’ project, the UI naturally reflects a mixture of Google Labs functionality and Flickr’s Web 2.0 styling. It’s clean and simple. I like it a lot.

Naturally, as a ‘labs’ project it has no direct revenue stream, but it’s a nice demo and there may well be uses for this type of technology in some areas: interior design, for example. Add a simple hook-up to a printing service like Photobox and you’ve got a revenue stream for photographers and for the service itself.

As Multicolr is currently running at an adequate speed over a stated 10 million images, I’d guess that the underlying technology is ready for at least enterprise-scale applications, if not internet scale.

Multicolr engine

From my previous play with Multicolr I had certain expectations as to how it worked internally.

In any search application you need precision, but you also need to be able to bring back close matches if no direct matches exist. Since Multicolr uses RGB hex values as query terms, I suspected that they’d rounded those hex values to match their chosen quantisation, and were matching on those approximate colours.

This isn’t what they’re doing, as I discovered when I changed the colour specified in their nicely RESTful URL to a subtly different colour (one that should fall within the same quantisation bin). If they were using naive quantisation the results would have remained the same. In fact, they changed.

So although they may still be using some form of quantisation to avoid gaps in queries, it seems like they’re also allowing the raw colour value (either in RGB or a different colour space) into the index and scoring images against it at query time, i.e. the quantisation seen in the UI may well be totally arbitrary.
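To make the naive-quantisation assumption concrete, here’s a small sketch of what I’d expected: round each RGB channel of the hex colour into a fixed number of bins (the bin count is an arbitrary guess, not Idée’s). Under that scheme two subtly different colours land in the same bin and would return identical results, which is exactly what didn’t happen.

```python
# A sketch of naive hex-colour quantisation. The bin count is an assumption;
# Idée's real quantisation (if any) is unknown.

BINS_PER_CHANNEL = 16


def quantise_hex(hex_colour, bins=BINS_PER_CHANNEL):
    """Map a hex colour like '3366CC' to an (r, g, b) bin index."""
    r = int(hex_colour[0:2], 16)
    g = int(hex_colour[2:4], 16)
    b = int(hex_colour[4:6], 16)
    width = 256 // bins
    return (r // width, g // width, b // width)


# Two subtly different colours fall into the same bin, so a purely quantised
# index would return identical results for both queries.
print(quantise_hex("3366CC") == quantise_hex("3467CD"))  # True
```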

All in all, a good piece of work. Well done Idée.