Fotopedia – the rating system (1/2)

Last modification: 2-Oct-10

Fotopedia’s system for ranking the quality and suitability of photos is based on counting votes. This results in cumulative ratings like +2 (few people have seen the image, or maybe people don’t like it), +22 (a more popular image), or -2 (some people care enough to vote it down). Fotopedia’s rating system has multiple purposes:

  1. It helps remove less interesting or less relevant photos.
  2. It results in a ranking among the stronger photos (e.g. to view the “best” ones).
  3. It helps motivate the photographers.

If the system for rating and ranking works well, users looking for information find great photos. If it works well, photographers and rankers also stay motivated and keep contributing their photos and energy. A good ranking system should thus help Fotopedia outperform alternative and more straightforward ways (e.g. Flickr, Google) when you are interested in high-quality photos that illustrate a particular subject.

Ratings (e.g. +1) are valid in the context of an article ("Black cat").

Purpose of this posting

Fotopedia plans to update their rating/ranking/voting system. This posting analyzes the “old” ranking system, presents some general ideas for improvements, and hopes to trigger some good discussion on the topic. The second part of this posting becomes a bit more concrete about what an improved rating and ranking system could look like.

Learning from the search engines

Search engines such as Google face a similar challenge: they need to present the most relevant Internet pages for a search query at the top of the list of results. This helps the user find results quickly.

Initially such page ranking algorithms selected pages based on keyword matching: if you search for apple boot camp you would find pages which contain all 3 words. But Google in particular excels at guessing a page’s relevance by looking at the other pages that link to that page. Google’s proprietary PageRank algorithm uses the estimated ratings of linking pages to rank the linked page. The assumption is that if a page is interesting, other pages will link to it. And if a page is really interesting, other interesting pages will link to it.
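
The idea that a page inherits importance from the pages linking to it can be sketched with the simplified, textbook form of PageRank. This is not Google's production algorithm; the damping factor of 0.85, the iteration count and the toy link graph are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified textbook PageRank (power iteration).

    links maps each page to the list of pages it links to.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with every page equally important.
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on, split evenly over its links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy web: three pages link to "popular", one links to "obscure".
toy_web = {
    "home": ["popular", "obscure"],
    "blog": ["popular"],
    "news": ["popular"],
    "popular": ["home"],
    "obscure": [],
}
ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page:8s} {score:.3f}")
```

In this toy graph "popular" outranks "obscure" because more pages link to it, and "home" benefits from being linked to by the well-ranked "popular" page. No human ever has to vote on any page.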

So Google essentially uses information that is automatically harvested from the Internet to guess which pages fit your search best. Note that this means that there is no need for people to answer questions about the pages in order to find out which ones are good. The entire system is thus automated (albeit with the use of vast compute resources).

Fotopedia’s challenge is different: if you are looking for photos of Grapes, the software requires humans to link photos to the Grape article. And the software needs additional human help to judge the quality and relevance of the submission.

Despite differences between searching for text and searching for photos, we can still learn something from the search engines:

  1. Ranking algorithm accuracy is critical: Google kept competition (remember AltaVista?) at bay largely by using smarter ranking algorithms.
  2. You might be able to get hints about photo quality/relevance by using available data in a smarter way. It is OK for the ranking system to be a bit fancy, and the exact method of scoring doesn’t need to be visible to the users – it is only important that users see enough of the ranking system to know that their votes make a difference and to believe that the system is fair.

And obviously, the quality of the questions asked to users about photos is critical in determining how much you can learn about the suitability of the photos and how long it takes the system to learn what it can learn. Both points (learning the right things, and learning them with as little user input as possible) are areas that could be improved.

The “old” Fotopedia rating system

A user can vote once on any photo (per article in which it is used). Giving it a “thumbs up” adds +1 to the score or rating of the photo. Alternatively, a “thumbs down” decreases the rating by 1. The rating only applies to that photo in the context of one specific article – even though the photo may be linked to multiple articles.

For example, my photo of the violin of a famous 19th century Norwegian violinist scored +5 for the article on the violinist and +5 for the article about the violin’s builder, but it received a -1 rating as a general picture of a violin.

Scores of +5 or more (counting the original poster) currently promote an image from Candidate to Top, meaning that it becomes an official photo for that article. Enough subsequent thumbs-downs can kick the photo out of the list of Top photos. The threshold between Candidate and Top is sometimes manually set by Fotopedia staff to higher values like +10 when there are numerous submissions (e.g. Flower has 300 submissions).
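
The voting rules described so far can be summarized in a short sketch. This is only my reading of the visible behavior, not Fotopedia's actual code: the class name, the method names and the rule that a repeated vote replaces the earlier one are assumptions, and the article names are placeholders.

```python
from collections import defaultdict

DEFAULT_TOP_THRESHOLD = 5  # staff may raise this per article, e.g. to 10


class ArticleRatings:
    """Per-article photo ratings: one +1/-1 vote per user, photo and article."""

    def __init__(self, top_threshold=DEFAULT_TOP_THRESHOLD):
        self.top_threshold = top_threshold
        # votes[(photo, article)][user] is +1 or -1
        self.votes = defaultdict(dict)

    def vote(self, user, photo, article, thumbs_up):
        # Assumption: voting again replaces the user's earlier vote.
        self.votes[(photo, article)][user] = 1 if thumbs_up else -1

    def rating(self, photo, article):
        return sum(self.votes.get((photo, article), {}).values())

    def status(self, photo, article):
        # A photo can drop back to Candidate if later votes lower the rating.
        if self.rating(photo, article) >= self.top_threshold:
            return "Top"
        return "Candidate"


ratings = ArticleRatings()
# The submitter's own vote counts towards the +5.
for user in ["submitter", "user1", "user2", "user3", "user4"]:
    ratings.vote(user, "violin.jpg", "Violinist", thumbs_up=True)
ratings.vote("user5", "violin.jpg", "Violin", thumbs_up=False)

print(ratings.rating("violin.jpg", "Violinist"), ratings.status("violin.jpg", "Violinist"))
print(ratings.rating("violin.jpg", "Violin"), ratings.status("violin.jpg", "Violin"))
```

The same photo ends up as Top in one article and as a Candidate with a negative rating in another, because each rating is keyed on the combination of photo and article.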

This photo was linked to the violinist, the violin maker and to Violin.

Moderators are empowered to give larger rating boosts – mainly to shorten the learning process (which can now take months or years). I am not aware that moderators (can) reduce the score of a photo by more than one. But they can undo the link between a photo and an article if the photo is unsuitable or has had a negative score for a long time.

A smarter ranking algorithm needed

Unlike Google, Fotopedia currently does its ranking using very simple algorithms on manually provided information. Currently a user only has two ways to influence a photo’s rating and ranking:

  1. increase the rating by one
  2. decrease the rating by one

That means everybody (except for moderators?) has the same impact – regardless of their track record or qualifications. Examples of information that is not used:

  • is the +1/-1 choice made for visual or for content reasons?
  • how knowledgeable is the photographer/submitter? (e.g. New Yorker supplying a New York photo)
  • how knowledgeable is the voter? (e.g. New Yorker voting on a New York photo)
  • say a photo is linked to both the “Gondola” and the “Venice” articles. If the photo is rated in one context, the rating in the second context is unchanged.
  • is the photo better/worse/similar to existing photos in the set?
  • is the photo exceptional, excellent, good, average, etc. according to the voter?

What kind of photos should rank high?

It helps to be explicit (and agree) on what we are trying to measure and what we are trying to achieve with the measurement.

The question currently asked to a voter is:

“Is this a great photo to illustrate this article?”

This helps, but doesn’t tell us whether we are looking for a great-looking photo that is somehow relevant for the article, or a photo that adds significant information to the article but may not be visually great. So I believe it is important to answer whether Fotopedia mainly aims to be

  • a reliable source of information and show interesting aspects of the topic (goal is to be an “encyclopedia”),
  • a source of visually pleasing pictures showing the subject and showing it in an interesting way (goal is to be a series of “coffee table books”), or
  • both of these at the same time?

The Quality Chart currently says:

The world isn’t perfect. You might feel the need to represent it differently using Photoshop and your artistic talent. The encyclopedia isn’t the place to express such needs. We illustrate the world as it is, in all its beauty and ugliness. Artistic and overprocessed photos (including HDR) don’t belong.

Although this may suggest that Fotopedia images should be first and foremost informative, Fotopedia’s derivative iPad applications like “Heritage” rely heavily on the visual side. The current lack of emphasis on captions and the set of article links when rating a photo also suggest that the visual side is currently getting more attention.

Salon, library or both?

My assumption in the rest of this discussion piece is that an ideal Fotopedia photo should be both informative and visually pleasing – although I can’t define either precisely. This means that Fotopedia would target both the salon’s coffee table and the library’s bookshelf – so to speak.

The photo is linked to the Lighthouse article and is visually pleasing. It clearly illustrates what a lighthouse is and does, but information about location is missing.

The “informative” side and the accuracy of the information are needed if Fotopedia aims to serve as a visual wrapper around, or companion to, Wikipedia. But I believe “attractive” is also needed because:

  1. Even newspapers select photos (“President holding speech”) on both criteria. Newspapers are commercial products and readers prefer newspapers with “nice” pictures. The same applies to Fotopedia.
  2. Fotopedia plans to use its image collection to publish more “coffee table books” (like Fotopedia Heritage for the iPad). By definition, coffee table books should attract casual browsing and depend heavily on picture and graphical design quality.
  3. The vast majority of Fotopedia photos are already visually attractive. In fact, I wouldn’t be surprised if many voters decide to vote +1 mainly when the picture is nice (e.g. “good enough to hang on the wall”) and on-topic – but in that order!
  4. The rating is critical to motivate photographers to submit images. The rating system shouldn’t be too different from how photographers and their customers (editors, clubs, relatives) rate images – especially for documentary images. A photographer (e.g. for National Geographic) will strive to support the article’s text with attractive pictures.

A final example

An example of emphasis on aesthetics without worrying about the encyclopedia goals is a series of photos of fruit being dropped into water. This gives visually interesting photos, and the photo below actually earned the highest ranking within the Grape article. I would say it is clearly less suitable as an encyclopedia photo because the water and the splashing don’t convey anything relevant: it doesn’t tell me about grapes, how they are grown or how they are used. So the ranking suggests that people often rate pictures on their photographic merits, while ignoring the informational merits.

A visually attractive and technically challenging photo, but unsuited to illustrate grapes.

[ see continuation in part 2 of this article ]

Lightroom 3 review

Adobe released Lightroom 3.0 on June 8th, after eight months of public beta testing.

I simply kept using LR 2.7 during the beta testing period as the beta version didn’t allow you to easily import LR 2.x databases, and I didn’t want to run any risk with my existing catalog data. So, even though official releases can still have some bugs, I upgraded to LR 3.0 as soon as the final version was available.

For an overview of what’s new in Lightroom 3.0, see for example Adobe’s own site.

Upgrade process

Lightroom 3.0 installs itself alongside the LR 2.x version. When the software is first run, a new LR v3 catalog (=metadata database) is generated alongside the old version 2 catalog. This means you can go back to the old version if necessary. For my catalog containing 25,000 images (and over 100,000 keywords), the conversion took about 15 minutes.

Lens correction modules

Lightroom 3 has the option to correct vignetting, lens distortion and lateral chromatic aberration for

  • Canon (26 lenses and 2 point-and-shoots),
  • Nikon (7 lenses and 1 point-and-shoot),
  • Sigma (2 APS-C lenses and 3 full-frame lenses),
  • Sony (Sony DT 18-200mm only), and
  • Tamron (Tamron DI 28-75mm only)

This can be seen as a simple equivalent to DxO’s Optics Pro lens correction modules: the program automatically corrects these defects based on calibration data provided by Adobe (in some cases with support from the lens maker). I still hope that Adobe will acquire DxO’s technology – but this seems less likely now that Lightroom 3.0 does the low-hanging fruit part of what DxO does.

These are the Canon lenses supported at present (2 of my 3 lenses are covered; the 100mm f/2.8 macro is missing):

Canon lens support in Lightroom 3

The first lens is incidentally the Canon PowerShot G10/G11 point-and-shoot camera, but can also be used for the Canon PowerShot S90. The number of supported lenses will likely continue to grow: modules can be provided by Adobe, third parties, or even by end-users (Adobe provides software for this).

Support for managing video files

My directory tree containing all my pictures also contains a few dozen short HD video fragments made using my Canon 5D Mark II. It is a good idea to run “Synchronize Folder…” on the file system because this allows Lightroom 3.0 to find and import these videos. In my case, it also picked up some JPGs that my daughter had made with her camera and had manually placed in the directory tree. The support for videos is currently pretty basic: you can see a thumbnail, can view it using an external application (e.g. Windows Media Player), can add keywords, and can (obviously) export the file – which in this case just means copying the file as-is.

This isn’t much, but should be enough for now to prevent the following scenario that was easily possible with Lightroom 2 with a newer model camera:

  1. you take hundreds of pictures (JPG or Raw) with your DSLR, but also a video (.MOV)
  2. you use Lightroom 2.x to import the pictures from your flash card. It warns that there are some movie files, but it doesn’t do anything with them.
  3. you are eager to see your pictures, so you start running Lightroom (adding keywords, deleting the weak images, etc.).
  4. you put your flash card back in the camera and… reformat the flash card: the video files are now lost.

Lightroom 3, in contrast, imports both the pictures and any file format that it recognizes as videos (I have seen .mov, .mp4, .avi work).
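
If you are stuck with software that skips video files, a small check before reformatting the card can prevent the scenario above. The sketch below is not a Lightroom feature; it is a standalone script under the simplifying assumption that a file counts as imported when a file with the same name exists anywhere in the picture directory tree. The function name and the directory layout in the demo are hypothetical.

```python
import tempfile
from pathlib import Path


def files_missing_from_library(card_dir, library_dir):
    """Return files on the card whose names appear nowhere under library_dir.

    Assumption: comparing file names (case-insensitively) is good enough;
    a stricter check would also compare file sizes or checksums.
    """
    library_names = {
        path.name.lower()
        for path in Path(library_dir).rglob("*")
        if path.is_file()
    }
    return sorted(
        path
        for path in Path(card_dir).rglob("*")
        if path.is_file() and path.name.lower() not in library_names
    )


# Demo with temporary directories standing in for the flash card and library.
with tempfile.TemporaryDirectory() as tmp:
    card = Path(tmp, "card", "DCIM")
    library = Path(tmp, "library", "2010-06")
    card.mkdir(parents=True)
    library.mkdir(parents=True)
    for name in ["IMG_0001.CR2", "IMG_0002.JPG", "MVI_0003.MOV"]:
        (card / name).touch()
    for name in ["IMG_0001.CR2", "IMG_0002.JPG"]:  # the video was skipped
        (library / name).touch()

    missing = files_missing_from_library(Path(tmp, "card"), Path(tmp, "library"))
    missing_names = [path.name for path in missing]

print(missing_names)
```

Anything the function reports is still only on the card, so the card should not be reformatted yet.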

The 2003 versus the 2010 “process”

Adobe’s original image improvement “flow” or “process” was getting a bit out of date (it hasn’t fundamentally changed since Adobe Camera Raw was introduced in 2003). So the Lightroom engineers needed a way to improve this without causing old photos to suddenly start looking slightly different. Thus by default, Lightroom 3 still uses the “2003 process” for existing images in catalogs and uses a new “2010 process” for anything that is newly imported. You have full control over which process you want – these are just the defaults. The main improvement in the 2010 process is supposedly the handling of high-ISO images.

Below is a 100% crop of a raw image taken with a Canon 5D Mk II using a 24-105mm f/4L IS USM lens. The lens is good, but not great, so we can see some lens artifacts when we zoom in all the way.

Image EXIF data: ISO 200 with Highlight Tone Protection enabled (essentially underexposed!), 32mm, f/6.7, 1/250, tripod, raw @ 21 MPixels. In addition, the original image was 1 stop too dark (probably due to spot metering). The HTP and the underexposure together mean that the dark parts of the image exhibit chroma noise – even at 200 ISO. Warning: the differences between the images are very small. I will point them out, but if you want to compare them, you can download the files and compare them in a slideshow-like tool.

100% crop of a 21 MPixel image using Lightroom 3.0 using 2010 process without lens correction

In this image (remember that you are seeing only 1.1% of the surface area of the full image – the full image is 10× wider and 10× higher), look for:

  • the purple fringe at the border between the sleeve and Leigh’s arm. This is chromatic aberration. It is not too visible here because we are not too far from the center of the image; it would have been visibly worse at 24mm.
  • the purple color noise in the lady’s gray sweater and my blue sweater. Surprising in a 200 ISO image, but this is again because of the HTP setting and accidental underexposure.
  • the moiré in the striped pink blouse. A resolution of 21 MPixel may sound more than high enough, but it is actually not too high by modern standards: it corresponds to the same pixel pitch as an 8 MPixel APS-C camera.
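
The pixel-pitch comparison in the last bullet can be checked with some quick arithmetic. The sensor dimensions are nominal values that I am assuming here (36 × 24 mm for full frame, 22.3 × 14.9 mm for a Canon APS-C sensor); the pixel count is that of the Canon 5D Mark II (5616 × 3744).

```python
# Nominal sensor sizes in millimeters (assumed values, see text above).
FULL_FRAME_MM = (36.0, 24.0)
APS_C_MM = (22.3, 14.9)

# Canon 5D Mark II image size in pixels (about 21 MPixel).
FULL_FRAME_PIXELS = (5616, 3744)

# Pixel pitch: the width of a single pixel on the sensor.
pitch_mm = FULL_FRAME_MM[0] / FULL_FRAME_PIXELS[0]
pitch_um = pitch_mm * 1000.0

# How many pixels of that size fit on an APS-C sensor?
aps_c_pixels = (APS_C_MM[0] / pitch_mm) * (APS_C_MM[1] / pitch_mm)
aps_c_megapixels = aps_c_pixels / 1e6

print(f"pixel pitch: {pitch_um:.2f} micrometers")
print(f"equivalent APS-C resolution: {aps_c_megapixels:.1f} MPixel")
```

With these assumed dimensions the result comes out at roughly 8 MPixel, in line with the claim in the bullet.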

Test image using Lightroom 3.0, 2010 process and lens correction

This image is very similar to the previous one. But look for:

  • There is less/no purple fringe at the border between sleeve and arm. The lens correction module for the 24-105mm lens has automatically fixed this.
  • When you compare this image to the one without lens correction, you see slightly different cropping on the left side. Check out the pearl necklace. This is due to the circa 1% distortion: you lose a few pixels.

Test image using Lightroom's 2003 process with lens correction

This image is again quite similar to the previous two – but things to look for:

  • there seems to be less purple color noise in the lady’s gray sweater and my blue sweater.

Adobe obviously intends the 2010 process to outperform the 2003 process. In this case, the 2003 process does a better job (argggh; the first comment by a reader incidentally seems to confirm this). Adobe demonstrates the differences between the two processes mainly using high-ISO images.

Test image using DxO v6.2 with lens correction (and black level and exposure tweaking to match the overall appearance of the Lightroom output)

The same file processed using DxO Optics Pro version 6.2. DxO specializes in noise reduction and correction of lens aberrations. Using default settings, I would say it has gotten rid of the chroma noise, but at the cost of detail in the sweaters. This essentially means that when your image is too noisy, you use spatial low-pass filtering to reduce the noise – at some loss of detail. You can tune all these settings in both Lightroom and DxO, so you may be able to fix this by moving away from the default settings – after all it is a critical trade-off. Note also that the aliasing in the pink blouse is less than in the Lightroom images, suggesting more effective de-mosaicing filtering.

Lens correction benefits

The cropped image used above does not show much of the lens correction. So let’s look at another image shot using a Canon EF 24-105mm f/4L lens. This image with lots of straight lines and indirect lighting incidentally shows a hall where Belgian coal miners used to shower after their shift.

ISO 200, 24mm, f/7.1, 1/100, full image (shown with and without lens correction)

If you compare the two images, you clearly see that the lens has quite some distortion at 24mm (and with a full-frame sensor). The website even calls this 4.3% distortion “massive”. A direct comparison also shows a light fall-off of 1.5 stops in the corners (this is at f/7.1; it would have been more at f/4).
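
To get a feeling for what 1.5 stops of fall-off means: each stop is a factor of two in light, so the ratio follows directly from the definition of a stop. The function name below is just for illustration.

```python
def stops_to_light_ratio(stops):
    """Each photographic stop corresponds to a factor of two in light."""
    return 2.0 ** stops


falloff_ratio = stops_to_light_ratio(1.5)
corner_fraction = 1.0 / falloff_ratio

print(f"center receives {falloff_ratio:.2f}x the light of the corners")
print(f"corners receive {corner_fraction:.0%} of the light at the center")
```

So the corners receive only about a third of the light that the center gets, and correcting this means amplifying the corners by almost a factor of three (along with their noise).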

100% crop of a corner (shown with and without lens correction)

The cyan and magenta fringes are clearly visible (both around the roof and between the tiles) and are largely corrected using the lens correction module. In the crop, both images are distortion corrected for practical reasons. It is worth noting that although all of this is pretty advanced stuff, you only need to click on a checkbox to activate lens correction.

iPhone support

The cameras inside the iPhone 3G and 3GS are both supported. These are popular cameras that undoubtedly have medium-quality optics. Possibly Adobe added this as a bonus for people with fancy cameras who also use iPhones. Anyone using the iPhone as a main camera probably doesn’t care too much about image quality.

I am pretty sure that the Canon PowerShot S90 is also supported. In fact, the Adobe software silently did distortion correction for this model without informing the user or giving users the option to enable or disable the feature. This was a design decision by Canon: correct residual lens aberrations in Canon software and where possible also in major 3rd party software.

Tethering your camera

Connecting your camera via a USB cable to a laptop or desktop is easy and can be useful. Every picture you take is sent over the USB cable and shows up in Lightroom pretty much immediately. The picture that shows up in Lightroom can automatically be given some keyword or preset or get the same adjustments as the previous image.

You can either trigger the camera’s shutter using the camera’s shutter button(s), or trigger the shutter from the computer (using a mouse or keyboard). You can also see important camera settings like ISO/aperture/shutter_speed/white_balance, but you cannot adjust these from the computer. I didn’t manage to start a video this way, but that may not be terribly useful anyway.

So the tethering works (at least with a handful of recent cameras) and is easy to use. But don’t expect the ability to really control the camera from your armchair.

Oddities and bugs

  • Lens correction and image resizing. The lens correction module fixes distortion, resulting in a warped picture that is then automatically cropped back to a rectangle. You lose some pixels at the edges. The resulting image size in pixels is, however, identical to the original image size. I guess that choice is ok for casual users, but how about demanding users? Another way to explain what I mean: if you straighten the image using Lightroom (e.g. rotate it by 1 degree), you get a different image size and different aspect ratio. Distortion correction is somewhat comparable, but behaves differently.
  • File count in the keyword hierarchy. I have a keyword hierarchy that includes Locations > Europe (1) > Belgium (32) > Brussels (61) > Manneke Pis (3). The numbers indicate the number of pictures. In Lightroom 2.x the number following Europe would show the total number of images classified as Europe, including images that had only Belgium as keyword. This is clearly no longer the case. I probably only have 1 image that is labeled Europe that is not attributed to any specific country. I can click to get all Europe images (11017), but how do I get the odd one? If I cannot find it easily, why show me the number?
  • Tethered shooting. Turning off the camera while it was still connected to the computer for tethered shooting, gave a weird colored animation on the LCD on the back of the Canon 5D Mark II: essentially the camera thought it still needed to store a file. The animation looked like it might be designed for the WiFi adapter (which I don’t have) because something similar happens when the camera starts messing with FTP and HTTP protocols to push files to a nearby server.
  • Editing video capture time. Lightroom 3 doesn’t claim to be able to edit the capture time of videos, but this is a bit of an inconvenience. I had previously (LR 2.x) adjusted the times of a set of images to match the local timezone in which they were shot. Now I wanted to do the same for the videos in LR 3.0. It can’t. So now my videos show up in the wrong locations when the files are sorted based on capture time. I guess the capture time can be edited: the time is stored somewhere in/with the file itself. And if absolutely necessary, you could edit just the capture time as stored in the database (and risk losing the change if you resynchronize metadata).
  • Lens correction and freedom of choice. By default, lens correction will use the lens the image was taken with. Surprisingly, Lightroom 3 also lets you use the correction modules for other lenses, even lenses from other brands and lenses that don’t have the focal length you are using. This is nice for playing around with (“what if I select a fish-eye”) or to use a similar module if the one you need is not available. But there is no warning if you select a “wrong” module: it gets stored in your catalog.
  • Deletion. When a single image is viewed within a directory, and the image is deleted, Lightroom loses track of where you were within the directory. This applies for both Development view and Library view. This behavior is different than Lightroom 2, and a bit of a nuisance.
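
The keyword-count oddity in the list above comes down to two different ways of counting: images that carry a keyword directly, versus all images found anywhere below that keyword in the hierarchy. A minimal sketch of both counts follows, together with the query I was missing (images tagged with a keyword but with none of its descendants). The image IDs and data structures are made up for illustration and say nothing about how Lightroom stores its catalog.

```python
# Keyword hierarchy: parent -> list of child keywords.
children = {
    "Locations": ["Europe"],
    "Europe": ["Belgium"],
    "Belgium": ["Brussels"],
    "Brussels": ["Manneke Pis"],
    "Manneke Pis": [],
}

# Hypothetical images that carry each keyword directly.
direct = {
    "Locations": set(),
    "Europe": {"img_001"},
    "Belgium": {"img_002", "img_003"},
    "Brussels": {"img_003", "img_004"},
    "Manneke Pis": {"img_005"},
}


def subtree_images(keyword):
    """All images tagged with the keyword or any keyword below it."""
    images = set(direct[keyword])
    for child in children[keyword]:
        images |= subtree_images(child)
    return images


def only_direct(keyword):
    """Images tagged with the keyword but with none of its descendants."""
    below = set()
    for child in children[keyword]:
        below |= subtree_images(child)
    return direct[keyword] - below


print(len(direct["Europe"]))          # the count Lightroom 3 shows
print(len(subtree_images("Europe")))  # the count Lightroom 2.x showed
print(sorted(only_direct("Europe")))  # "the odd one"
```

Using sets rather than adding up the per-keyword numbers matters here: an image tagged with both Belgium and Brussels would otherwise be counted twice.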