Novel Image Understanding
There are over 200 billion images on the Internet today, and this collection continues to grow by leaps and bounds. Image search engines typically surface only a portion of those images, and they often rely on the text surrounding an image on a webpage or on the image file’s name. With the growing number of images on the Internet, it is important to be able to organize and surface them in the most efficient, meaningful way possible so that better images can be shown to searchers.
We want to move beyond simple image classification. Textual tags associated with an image often tell us that it contains, say, a tiger. Not all images are labeled this way, but there are more than enough labeled images on any one subject to fill a search-result page.
People come to image-search engines for many reasons. Users type an average of 2.2 words per query, but the underlying request is much more subtle, often representing an information or entertainment need that would normally require a much longer and deeper query.
We need novel and useful ways to organize and structure image content. Can we sort celebrity pictures by their subject’s age when the photo was taken? Or by their hair style? Can we discover how a logo has evolved over time? Can we organize pictures by their geographic location or the type of object? There are many ways to organize photos. What are the ways that are not obvious? What can we do better than we can do today?
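As one concrete illustration of the geographic question above, the following minimal sketch groups photos into latitude/longitude grid cells. It is only a baseline under stated assumptions: the `lat`, `lon`, and `id` field names, the one-degree cell size, and the sample coordinates are all hypothetical and chosen for illustration.

```python
import math
from collections import defaultdict


def grid_cell(lat, lon, cell_deg=1.0):
    """Map a coordinate to the integer index of the grid cell containing it."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def group_by_location(photos, cell_deg=1.0):
    """Group photo ids by grid cell.

    `photos` is assumed to be an iterable of dicts with hypothetical
    keys "id", "lat", and "lon"; real metadata may be shaped differently.
    """
    groups = defaultdict(list)
    for photo in photos:
        cell = grid_cell(photo["lat"], photo["lon"], cell_deg)
        groups[cell].append(photo["id"])
    return dict(groups)


# Illustrative sample data (approximate coordinates, not from any dataset).
photos = [
    {"id": "a", "lat": 48.8584, "lon": 2.2945},
    {"id": "b", "lat": 48.8606, "lon": 2.3376},
    {"id": "c", "lat": 40.6892, "lon": -74.0445},
]
groups = group_by_location(photos)
```

A fixed grid is the simplest possible organizing principle and splits nearby photos that straddle a cell boundary; a less obvious organization would likely need a proper clustering method and signals beyond raw coordinates.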
Users would like a better fit between their information and entertainment requests and the content returned by a search. How can we better organize multimedia content to fit users’ needs and desires?
Researchers working on this challenge will develop a way to successfully measure user intent, the relevance of organizational principles, and performance efficiency. In addition, creativity in presenting results and in making searching and browsing easy will be a key criterion. Lastly, the elegance of the solution will be judged by its ease of integration into a search engine’s pipeline and by the efficiency with which it can understand user intent and process one or more meaningful clusters. Efficiency here refers to processing speed: if one technology takes too long to provide a meaningful organization while another can process the same data very quickly, the latter is much more attractive.
Yahoo’s Flickr is a great source for many high-quality images on a myriad of subjects. Use the Flickr API to download the images you need to understand and test your ideas.
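A possible starting point for collecting test images is sketched below. It builds a request for the Flickr REST method `flickr.photos.search` and turns a returned photo record into a static image URL. The endpoint, parameter names, and URL pattern follow Flickr's public documentation as best understood here and should be checked against the current API reference; the helper names, the chosen `extras`, and the `YOUR_API_KEY` placeholder are assumptions made for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

FLICKR_REST_ENDPOINT = "https://api.flickr.com/services/rest/"


def build_search_url(api_key, text, per_page=100, page=1):
    """Build a flickr.photos.search request URL that returns plain JSON."""
    params = {
        "method": "flickr.photos.search",
        "api_key": api_key,
        "text": text,
        "per_page": per_page,
        "page": page,
        # Extra metadata that may help with organizing results.
        "extras": "geo,tags,date_taken",
        "format": "json",
        "nojsoncallback": 1,
    }
    return FLICKR_REST_ENDPOINT + "?" + urlencode(params)


def photo_url(photo, size_suffix="b"):
    """Build a static image URL from a photo record in the search response.

    Assumes the record carries "server", "id", and "secret" fields.
    """
    return "https://live.staticflickr.com/{}/{}_{}_{}.jpg".format(
        photo["server"], photo["id"], photo["secret"], size_suffix
    )


def search_photos(api_key, text, per_page=100):
    """Run one search and return the list of photo records (needs network)."""
    with urlopen(build_search_url(api_key, text, per_page)) as response:
        data = json.load(response)
    return data["photos"]["photo"]


# Example request URL; replace the placeholder with a real API key.
example_url = build_search_url("YOUR_API_KEY", "tiger", per_page=10)
```

Calling `search_photos("YOUR_API_KEY", "tiger")` with a valid key and passing each record to `photo_url` would yield URLs that can then be downloaded, subject to Flickr's terms of use and each photo's license.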