People frequently consult average ratings on online recommendation platforms before making consumption decisions. Research on the wisdom-of-the-crowd phenomenon suggests that average ratings provide unbiased quality estimates. Yet we argue that the process by which average ratings are updated creates a systematic bias. In analyses of more than 80 million online ratings, we found that items with high average ratings tend to attract more additional ratings than items with low average ratings. We call this asymmetry in how average ratings are updated endogenous crowd formation. Using computer simulations, we showed that it implies the emergence of a negative bias in average ratings. This bias affects items with few ratings particularly strongly, which leads to ranking mistakes. The average-rating rankings of items with few ratings are worse than their quality rankings. We found evidence for the predicted pattern of biases in an experiment and in analyses of large online-rating data sets.
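As an illustration only, the mechanism of endogenous crowd formation can be sketched in a few lines of Python. This is not the simulation reported here. The logistic attraction rule, the Gaussian rating noise, the uniform quality distribution, and every parameter value (`slope`, `few_cutoff`, `n_periods`, the midpoint of 3) are assumptions chosen to make the idea concrete. Each hypothetical item starts with one noisy rating, and in every period it attracts an additional rating with a probability that increases with its current average rating.

```python
import math
import random
from statistics import fmean


def simulate(n_items=5000, n_periods=20, slope=2.0, few_cutoff=5, seed=0):
    """Toy model: items with a higher current average attract more ratings.

    All modelling choices here are illustrative assumptions, not the
    specification used in the study.
    """
    rng = random.Random(seed)
    errors = []  # pairs of (number of ratings, average rating minus quality)

    for _ in range(n_items):
        quality = rng.uniform(2.0, 4.0)          # assumed true quality
        total = quality + rng.gauss(0.0, 1.0)    # first rating: quality + noise
        count = 1

        for _ in range(n_periods):
            average = total / count
            # Assumed attraction rule: logistic in the current average rating.
            p_new = 1.0 / (1.0 + math.exp(-slope * (average - 3.0)))
            if rng.random() < p_new:
                total += quality + rng.gauss(0.0, 1.0)
                count += 1

        errors.append((count, total / count - quality))

    few = [e for c, e in errors if c <= few_cutoff]
    many = [e for c, e in errors if c > few_cutoff]
    return {
        "overall": fmean(e for _, e in errors),  # mean bias across all items
        "few": fmean(few),                       # mean bias, few ratings
        "many": fmean(many),                     # mean bias, many ratings
    }


if __name__ == "__main__":
    for group, bias in simulate().items():
        print(f"{group:>7}: mean(average rating - quality) = {bias:+.3f}")
```

Every individual rating in this sketch is an unbiased draw around the item's quality, so any bias in the final averages comes from the updating process alone: an item whose early ratings happen to fall below its quality attracts few further ratings, and the error is rarely corrected.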