Thoughts about the Glicko rating system
For a while I’ve been intrigued by some of the statistics used for rating people at certain tasks, e.g. chess. There are several mathematical systems for assigning numerical ratings to chess players, including the Elo rating system and the Glicko rating system. I don’t really play chess anymore (arguably, I never really did), but I’m still interested in how these rating systems work, and how well they could be applied to other areas.
Similar systems have been used in various sports, and I wanted to play around with the idea.
I wrote a plugin for TextMate that lets you use a MultiMarkdown table as a database of items, ratings, and error factors (labelled RD, short for “ratings deviation”). You can then have the computer randomly pick two items, and you enter the winner (or a tie). Alternatively, you can specify the two items to compare.
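As a rough illustration (the item names here are made up), the raw table might look something like this, with each item starting at the standard Glicko defaults of a 1500 rating and a 350 RD:

```
| Item    | Rating | RD  |
|:--------|-------:|----:|
| Movie A |   1500 | 350 |
| Movie B |   1500 | 350 |
| Movie C |   1500 | 350 |
```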
To try it out, I used my list of Favorite Movies. I randomly chose pairs of movies and decided which one I liked best. The computer then applied the Glicko scoring system and assigned new ratings after each pairing. You could argue that, over time, the list will approach a “true” ordering of these movies in terms of which is the “best.” So now my list of favorite movies is beginning to have some order to it. I’ll continue to judge matches, so the list should become more refined over time. I’ll even leave the scores and error factors visible.
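For anyone curious about the math, here’s a minimal sketch in Python (not the plugin’s actual code) of the single-game update from Glickman’s Glicko paper. Each result pulls the winner’s rating up and the loser’s down, weighted by how uncertain both ratings are, and shrinks the RD of each item as it accumulates matches:

```python
import math

Q = math.log(10) / 400  # Glicko scaling constant

def g(rd):
    # Discount an opponent's influence by the uncertainty (RD) of their rating.
    return 1 / math.sqrt(1 + 3 * Q**2 * rd**2 / math.pi**2)

def expected_score(r, r_opp, rd_opp):
    # Expected score for a player rated r against an opponent (r_opp, rd_opp).
    return 1 / (1 + 10 ** (-g(rd_opp) * (r - r_opp) / 400))

def glicko_update(r, rd, r_opp, rd_opp, score):
    # score is 1 for a win, 0.5 for a tie, 0 for a loss.
    e = expected_score(r, r_opp, rd_opp)
    d_sq = 1 / (Q**2 * g(rd_opp)**2 * e * (1 - e))
    denom = 1 / rd**2 + 1 / d_sq
    new_r = r + (Q / denom) * g(rd_opp) * (score - e)
    new_rd = math.sqrt(1 / denom)  # RD shrinks: we now know more about this item
    return new_r, new_rd

# Two fresh items at the defaults (1500, RD 350); the first wins:
# glicko_update(1500, 350, 1500, 350, 1)  ->  roughly (1662, 290)
```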
One trick I used: when the computer picks a random pair, the first item is always the one with the highest RD (i.e., the item that has competed the least). This prevents some items from being left behind. The second item in the pair is completely random; a sketch of the selection rule follows.
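In code, that selection rule amounts to just a few lines. Again a sketch in Python, with hypothetical field names:

```python
import random

def pick_pair(items):
    # Pick the item we know least about (highest RD), then a random opponent.
    first = max(items, key=lambda item: item["rd"])
    opponents = [item for item in items if item is not first]
    return first, random.choice(opponents)
```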
The plugin needs some more tweaking before it’s ready for public use, but I’m curious whether anyone else out there is interested in such a thing.
I’m also curious about how else I could use this process…