Introduction
SpergSort is a highly opinionated implementation of adaptive sorting that uses the Bradley-Terry model for paired comparisons, with the goal of generating the most informative ranking of 乃木坂46 members. Based on Gwern’s research on statistical pairwise ranking and sorting of items, it provides a statistically robust way to rank idols through pairwise comparisons.
Mathematical Foundation
Bradley-Terry Model
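At its core, SpergSort uses the Bradley-Terry model, which posits that each item i has a latent “strength” parameter and that the probability of i being preferred over j depends only on the strengths of the two items.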
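For reference, the standard form of the Bradley-Terry model gives the probability that item i is preferred over item j as p_i / (p_i + p_j), where p_i and p_j are the two strengths. The sketch below illustrates that form in Python and fits strengths with the classic minorization-maximization (MM) update. It is not SpergSort's own code: the function names `win_probability` and `fit_bradley_terry`, the iteration count, and the small floor on strengths are all assumptions made for illustration.

```python
from collections import defaultdict


def win_probability(p_i: float, p_j: float) -> float:
    """Bradley-Terry probability that item i is preferred over item j."""
    return p_i / (p_i + p_j)


def fit_bradley_terry(comparisons, iterations=200):
    """Estimate strengths from (winner, loser) pairs with the MM update.

    Hypothetical helper: strengths are normalized to sum to 1, and a tiny
    floor keeps items with zero wins from collapsing to exactly 0.
    """
    items = sorted({item for pair in comparisons for item in pair})
    wins = defaultdict(int)   # total wins per item
    games = defaultdict(int)  # number of comparisons per unordered pair
    for winner, loser in comparisons:
        wins[winner] += 1
        games[frozenset((winner, loser))] += 1

    strength = {item: 1.0 for item in items}
    for _ in range(iterations):
        updated = {}
        for i in items:
            denom = sum(
                games[frozenset((i, j))] / (strength[i] + strength[j])
                for j in items
                if j != i
            )
            raw = wins[i] / denom if denom > 0 else strength[i]
            updated[i] = max(raw, 1e-12)
        total = sum(updated.values())
        strength = {i: s / total for i, s in updated.items()}
    return strength


# A is preferred over B in 3 of 4 comparisons.
strengths = fit_bradley_terry([("A", "B")] * 3 + [("B", "A")])
```

With three wins out of four, the fitted strengths give A a predicted win probability of about 0.75 against B, which matches the observed proportion.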
Adaptive Selection
Rather than using a fixed comparison sort algorithm, SpergSort employs an adaptive strategy that prioritizes comparisons that would be most informative. For each potential comparison between items i and j, we calculate a score:
Where the score combines:
- the number of previous comparisons between i and j
- a term that represents uncertainty (larger differences suggest a more certain ordering)
- a time penalty that decreases as time passes since the last comparison
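The exact scoring formula is not reproduced in this text, so the following is only a rough sketch of how the three ingredients described above might be combined. Everything in it is an assumption: the function names `pair_score` and `next_pair`, the multiplicative combination, the exponential decay, and the `tau` constant are hypothetical choices, not SpergSort's actual formula.

```python
import math
from itertools import combinations


def pair_score(score_i, score_j, n_compared, seconds_since_last, tau=300.0):
    """Hypothetical informativeness score for comparing items i and j.

    closeness: near-equal scores mean the ordering is still uncertain.
    novelty:   pairs that were already compared often tell us less.
    recency:   a penalty that decays as time passes since the last comparison.
    """
    closeness = 1.0 / (1.0 + abs(score_i - score_j))
    novelty = 1.0 / (1.0 + n_compared)
    recency_penalty = math.exp(-seconds_since_last / tau)
    return closeness * novelty * (1.0 - recency_penalty)


def next_pair(scores, counts, last_seen, now):
    """Return the pair with the highest hypothetical score.

    counts and last_seen are keyed by frozenset({i, j}); pairs that were
    never compared are treated as compared infinitely long ago.
    """
    def score(pair):
        i, j = pair
        key = frozenset(pair)
        elapsed = now - last_seen[key] if key in last_seen else math.inf
        return pair_score(scores[i], scores[j], counts.get(key, 0), elapsed)

    return max(combinations(sorted(scores), 2), key=score)


scores = {"a": 0.0, "b": 0.1, "c": 2.0}
first = next_pair(scores, counts={}, last_seen={}, now=0.0)
```

In this sketch the closest-scored, never-compared pair is chosen first; once that pair has just been shown, its penalty pushes the selection to a different pair.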
Confidence Level
The confidence in our sorting is determined by two factors:
- Average comparisons per item
- Standard deviation of scores
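As a minimal sketch, the two quantities above can be computed as follows. The helper name `confidence_inputs` and the use of the population standard deviation are assumptions; how SpergSort turns these two numbers into a confidence level is not shown here.

```python
from statistics import mean, pstdev


def confidence_inputs(comparison_counts, scores):
    """Return (average comparisons per item, standard deviation of scores).

    Hypothetical helper: both arguments are dicts keyed by item.
    """
    avg_comparisons = mean(list(comparison_counts.values()))
    score_spread = pstdev(list(scores.values()))
    return avg_comparisons, score_spread


avg, spread = confidence_inputs(
    {"a": 2, "b": 4, "c": 6},
    {"a": 1.0, "b": 2.0, "c": 3.0},
)
```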
For n items, we target:
<90 IQ Section: Sorting with Fruits 🍎🍌🍊
Imagine you have a basket of different fruits and you want to rank them by how much you like them. But here’s the catch - you can only compare two fruits at a time!
How It Works (Simple Version)
- The app shows you two fruits
- You pick which one you like better
- The app remembers your choice
- It keeps showing you different pairs
- Eventually, it figures out your complete ranking!
For example:
- Round 1: 🍎 vs 🍌 (you pick 🍎)
- Round 2: 🍊 vs 🍎 (you pick 🍊)
- Round 3: 🍊 vs 🍌 (you pick 🍊)
Final ranking:
- 🍊 Orange (best)
- 🍎 Apple (second)
- 🍌 Banana (third)
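The three rounds above can be replayed in a few lines of Python. This sketch simply counts wins, which is a simplification for illustration and not the Bradley-Terry fit the app uses; the function name `rank_by_wins` is hypothetical.

```python
from collections import Counter

# Each round is (left fruit, right fruit, the one you picked).
rounds = [
    ("apple", "banana", "apple"),
    ("orange", "apple", "orange"),
    ("orange", "banana", "orange"),
]


def rank_by_wins(rounds):
    """Rank fruits by how many times they were picked, most wins first."""
    wins = Counter()
    for left, right, pick in rounds:
        wins[left] += 0   # make sure every fruit appears, even with no wins
        wins[right] += 0
        wins[pick] += 1
    return [fruit for fruit, _ in sorted(wins.items(), key=lambda kv: -kv[1])]


ranking = rank_by_wins(rounds)
```

Orange wins twice, apple once, and banana never, so the result matches the final ranking listed above.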
Use cases
Entertainment Content Ranking
- Example: Tom Scott’s “What Is The Best Thing?” project, which used 1.2 million pairwise-comparison votes to rank 7,188 items and determine the best things in the world
- Perfect for ranking subjective content like media, songs, or fan preferences
Product Feature Prioritization
- Example: Labster’s UX Research team used pairwise comparison to test assumptions about customer feature requests
- Helped identify that what the sales team thought was a top priority actually didn’t make the top 10 customer needs
Competitive Gaming Rankings
- Example: Chess.com’s player rating system using Glicko-2
- Each game serves as a pairwise comparison between players
- Players start at 1500 points and move up/down based on performance