Last modification: 2-Oct-10
Fotopedia’s system for ranking the quality and suitability of photos is based on counting votes. This results in cumulative ratings like +2 (few people have seen the image, or maybe people don’t like it), +22 (a more popular image), or -2 (some people care enough to vote it down). Fotopedia’s rating system has multiple purposes:
- It helps remove less interesting or less relevant photos.
- It results in a ranking among the stronger photos (e.g. to view the “best” ones).
- It helps motivate the photographers.
If the system for rating and ranking works well, users looking for information find great photos, and photographers and rankers stay motivated and keep contributing their photos and energy. A good ranking system should thus help Fotopedia outperform more straightforward alternatives (e.g. Flickr, Google) when you are interested in high-quality photos that illustrate a particular subject.
Purpose of this posting
Fotopedia plans to update their rating/ranking/voting system. This posting analyzes the “old” ranking system, presents some general ideas for improvements, and hopes to trigger some good discussion on the topic. The second part of this posting becomes a bit more concrete about what an improved rating and ranking system could look like.
Learning from the search engines
Search engines such as Google face a similar challenge: they need to present the most relevant Internet pages for a search query at the top of the list of results. This helps the user find results quickly.
Initially, page ranking algorithms selected pages based on keyword matching: if you searched for apple boot camp you would find pages that contain all 3 words. Google in particular excels at guessing a page’s relevance by looking at the other pages that link to that page. Google’s proprietary PageRank algorithm uses the estimated ratings of linking pages to rank the linked page. The assumption is that if a page is interesting, other pages will link to it. And if a page is really interesting, other interesting pages will link to it.
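To make the idea concrete, here is a minimal sketch of the textbook, simplified form of PageRank. It is not Google's actual (proprietary) implementation. The damping value of 0.85, the iteration count, and the page names in the toy example are assumptions chosen purely for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of rank.

    links maps each page name to the list of pages it links to.
    Returns a dict mapping page name to a rank; ranks sum to 1.
    """
    # Collect every page that appears as a source or as a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with every page equally ranked.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline, independent of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on, split evenly over its links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: "popular" is linked from two pages,
# "obscure" is linked from none.
toy_web = {
    "hub": ["popular"],
    "blog": ["popular", "hub"],
    "popular": ["hub"],
    "obscure": ["blog"],
}
ranks = pagerank(toy_web)
```

In this toy web, "popular" ends up ranked above "blog", which in turn ranks above "obscure": a link counts for more when the linking page is itself well ranked.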
So Google essentially uses information that is automatically harvested from the Internet to guess which pages fit your search best. Note that this means that there is no need for people to answer questions about the pages in order to find out which ones are good. The entire system is thus automated (albeit with the use of vast compute resources).
Fotopedia’s challenge is different: if you are looking for photos of Grapes, the software requires humans to link photos to the Grape article. And the software needs additional human help to judge the quality and relevance of the submission.
Despite differences between searching for text and searching for photos, we can still learn something from the search engines:
- Ranking algorithm accuracy is critical: Google kept competition (remember AltaVista?) at bay largely by using smarter ranking algorithms.
- You might be able to get hints about photo quality/relevance by using available data in a smarter way. It is OK for the ranking system to be a bit fancy, and the exact method of scoring doesn’t need to be visible to the users – it is only important that users see enough of the ranking system to know that their votes make a difference and to believe that the system is fair.
And obviously, the quality of the questions that users are asked about photos is critical in determining how much you can learn about the suitability of the photos and how long it takes the system to learn it. Both points (learning the right things, and learning them with as little user input as possible) are areas that could be improved.
The “old” Fotopedia rating system
A user can vote once on any photo (per article in which it is used). Giving it a “thumbs up” adds +1 to the score or rating of the photo. Alternatively, a “thumbs down” decreases the rating by 1. The rating only applies to that photo in the context of one specific article, even though the photo may be linked to multiple articles.
For example, my photo of the violin of a famous 19th century Norwegian violinist scored +5 for the article on the violinist and +5 for the article about the violin’s builder, but it received a -1 rating as a general picture of a violin.
Scores of +5 or more (counting the original poster) currently promote an image from Candidate to Top, meaning that it becomes an official photo for that article. Enough subsequent thumbs-downs can kick the photo out of the list of Top photos. The threshold between Candidate and Top is sometimes manually set by Fotopedia staff to higher values like +10 when there are numerous submissions (e.g. Flower has 300 submissions).
This photo was linked to the violinist, the violin maker and to Violin.
powered by Fotopedia
Moderators are empowered to give larger rating boosts – mainly to shorten the learning process (which can now take months or years). I am not aware that moderators (can) reduce the score of a photo by more than one. But they can undo the link between a photo and an article if the photo is unsuitable or has had a negative score for a long time.
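The mechanics described in this section can be sketched as a small model. This is only an illustration of the rules as I understand them, not Fotopedia's actual code: the class name, method names, and user and photo identifiers are hypothetical, the original poster's submission is assumed to count as a +1 vote, and moderator boosts are left out.

```python
TOP_THRESHOLD = 5  # default from the text; staff may raise it (e.g. to 10)


class ArticleRatings:
    """Vote tally for the photos linked to one article (hypothetical model)."""

    def __init__(self, top_threshold=TOP_THRESHOLD):
        self.top_threshold = top_threshold
        self.votes = {}  # photo_id -> {user: +1 or -1}

    def vote(self, photo_id, user, value):
        """Record a thumbs-up (+1) or thumbs-down (-1)."""
        if value not in (1, -1):
            raise ValueError("a vote is either +1 or -1")
        photo_votes = self.votes.setdefault(photo_id, {})
        if user in photo_votes:
            # Assumption: one vote per user per photo per article.
            raise ValueError("user already voted on this photo here")
        photo_votes[user] = value

    def score(self, photo_id):
        """Cumulative rating: the plain sum of all votes."""
        return sum(self.votes.get(photo_id, {}).values())

    def status(self, photo_id):
        """'Top' at or above the threshold, otherwise 'Candidate'."""
        if self.score(photo_id) >= self.top_threshold:
            return "Top"
        return "Candidate"


# The same photo is rated separately in each article it is linked to.
violinist = ArticleRatings()
violin = ArticleRatings()

for user in ["poster", "u1", "u2", "u3", "u4"]:
    violinist.vote("violin-photo", user, +1)

violin.vote("violin-photo", "poster", +1)
violin.vote("violin-photo", "u1", -1)
violin.vote("violin-photo", "u2", -1)
```

Here the photo reaches +5 and Top status in the violinist article while sitting at -1 as a Candidate in the Violin article. Because the status is derived from the current score, later thumbs-downs can move a Top photo back to Candidate.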
A smarter ranking algorithm needed
Unlike Google, Fotopedia currently does its ranking using very simple algorithms on manually provided information. A user has only two ways to influence a photo’s rating and ranking:
- increase the rating by one
- decrease the rating by one
That means everybody (except for moderators?) has the same impact – regardless of their track record or qualifications. Examples of information that is not used:
- is the +1/-1 choice made for visual or for content reasons?
- how knowledgeable is the photographer/submitter? (e.g. New Yorker supplying a New York photo)
- how knowledgeable is the voter? (e.g. New Yorker voting on a New York photo)
- is the photo rated in another context? (e.g. if a photo is linked to both the “Gondola” and the “Venice” articles, a rating in one context leaves the rating in the other unchanged)
- is the photo better/worse/similar to existing photos in the set?
- is the photo exceptional, excellent, good, average, etc. according to the voter?
What kind of photos should rank high?
It helps to be explicit (and agree) on what we are trying to measure and what we are trying to achieve with the measurement.
The question currently asked of a voter is:
“Is this a great photo to illustrate this article?”
This helps, but doesn’t tell us whether we are looking for a great-looking photo that is somehow relevant for the article, or a photo that adds significant information to the article but may not be visually great. So I believe it is important to answer whether Fotopedia mainly aims to be
- a reliable source of information and show interesting aspects of the topic (goal is to be an “encyclopedia”),
- a source of visually pleasing pictures showing the subject and showing it in an interesting way (goal is to be a series of “coffee table books”), or
- both of these at the same time?
The Quality Chart currently says:
The world isn’t perfect. You might feel the need to represent it differently using Photoshop and your artistic talent. The encyclopedia isn’t the place to express such needs. We illustrate the world as it is, in all its beauty and ugliness. Artistic and overprocessed photos (including HDR) don’t belong.
Although this may suggest that Fotopedia images should be first and foremost informative, Fotopedia’s derivative iPad applications like “Heritage” rely heavily on the visual side. The current lack of emphasis on captions and the set of article links when rating a photo also suggest that the visual side is currently getting more attention.
Salon, library or both?
My assumption in the rest of this discussion piece is that an ideal Fotopedia photo should be both informative and visually pleasing – although I can’t define either precisely. This means that Fotopedia would target both the salon’s coffee table and the library’s bookshelf – so to speak.
The photo is linked to the Lighthouse article and is visually pleasing. It clearly illustrates what a lighthouse is and does, but information about location is missing.
Being “informative” and accurate is needed if Fotopedia aims to serve as a visual wrapper around, or companion to, Wikipedia. But I believe “attractive” is also needed because:
- Even newspapers select photos (“President holding speech”) on both criteria. Newspapers are commercial products and readers prefer newspapers with “nice” pictures. The same applies to Fotopedia.
- Fotopedia plans to use its image collection to publish more “coffee table books” (like Fotopedia Heritage for the iPad). By definition, coffee table books should attract casual browsing and depend heavily on picture and graphical design quality.
- The vast majority of Fotopedia photos are already visually attractive. In fact, I wouldn’t be surprised if many voters decide to vote +1 mainly when the picture is nice (e.g. “good enough to hang on the wall”) and on-topic – but in that order!
- The rating is critical to motivate photographers to submit images. The rating system shouldn’t be too different from how photographers and their customers (editors, clubs, relatives) rate images – especially for documentary images. A photographer (e.g. for National Geographic) will strive to support the article’s text with attractive pictures.
A final example
An example of emphasis on aesthetics without worrying about the encyclopedia goals is a series of photos of fruit being dropped into water. This gives visually interesting photos, and one of them actually earned the highest ranking within the Grape article. I would say it is clearly less suitable as an encyclopedia photo because the water and the splashing don’t convey anything relevant: it doesn’t tell me about grapes, how they are grown or how they are used. So the ranking suggests that people often rate pictures on their photographic merits, while ignoring the informational merits.
A visually attractive and technically challenging photo, but unsuited to illustrate grapes.
[ see continuation in part 2 of this article ]