Saturday Slide: Seeing Stars

One of the App Store’s unique selling points is its user-driven rating system, which allows customers to easily sound off on how much (or little) they like a particular piece of software. It’s as easy as assigning a rating from one to five stars. From there, the system aggregates and averages, and “the cream will rise” to the top, as Apple’s VP of iPod and iPhone Product Marketing Greg Joswiak has said.

In theory.

In real life, where money changes hands and developers eat or go hungry, the degree to which an app’s star rating affects its success is less than obvious. If user ratings are such an integral part of the App Store’s merchandising scheme, shouldn’t it be possible to list all games that have earned an average rating of 4.5 or 5 stars? You can’t. Star ratings don’t surface anywhere other than a particular app’s page.

Further, Apple’s main promotional areas–New and Noteworthy, What’s Hot, What We’re Playing, and Best (insert genre here) Games–don’t appear to have any direct connection to star scores at all. And although I haven’t run a full statistical regression, a game’s position on the Top Paid Apps list and its star rating seem weakly correlated at best. Commercially speaking, this list is the App Store’s hallowed ground, where even 99-cent games start to make tens of thousands of dollars a day. We’ve got the 3.5 star StickWars in the driver’s seat at the moment, followed by the 4.5 star Flight Control and the 3.5 star Parking Lot.
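For anyone who wants to run the numbers themselves, here is a minimal sketch of the check, assuming a plain Pearson correlation is an acceptable first pass. The `pearson` helper is my own illustration, not anything Apple publishes, and the only data fed in are the three chart positions and star ratings quoted above. Three points can't prove anything; a real test would need the whole list.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the mean (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# The only data points taken from this column: the current top three paid apps.
chart_positions = [1, 2, 3]       # StickWars, Flight Control, Parking Lot
star_ratings = [3.5, 4.5, 3.5]

r = pearson(chart_positions, star_ratings)
print(f"correlation between chart position and rating: {r:.2f}")
```

For these three games the coefficient comes out to zero, which fits the "weakly correlated at best" impression but is far too small a sample to settle the question.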

For some games, the mechanism may be working as intended. Flight Control has pulled in its absurdly high rating across more than 7,000 reviews, suggesting that it’s legit (we agree). But if StickWars and Parking Lot qualify as the “cream” of the App Store, consider me vegan. It appears that there are other forces at work here that are far more important than star ratings, which the App Store’s (supposedly) meritocratic model is (ostensibly) predicated upon.

This should surprise no one. Of course customers will consider many other factors when making a purchase, including a game’s price, its name, the genre, and even how cool its icon looks. We’ve seen that the Top 10 is largely the territory of 99-cent, unbranded microgames, and we’ve also noticed that there tends to be a lot of turnover on the list, although there are some exceptions (like the excellent Flick Fishing). So where do star ratings and user reviews fit into this crowded picture, exactly?

To find out, I got in touch with Jani Kahrama of Secret Exit, makers of SPiN and Zen Bound. Jani doesn’t just make amazing games; he’s also one of the deepest thinkers on App Store economics around. And sure enough, he’s applied a lot of brainpower to this question.

He raises the point that game ratings tend to display a powerful selection effect, based on the kind of users a particular game attracts. “A niche application that caters only to a specific crowd can get a high rating average because only those who know what they’re buying are rating it,” he explains. “Apps that break into the mainstream get many impulse purchases, and there are users who give 1-star ratings for apps that don’t meet their expectations.”

In other words, customers want to get what they think they’re paying for. And games that are in the Top 10 will necessarily attract a much broader spectrum of customers than those that aren’t… including people who never really play games and therefore have no point of reference. It’s a calibration issue: one man’s three-star rating may be another’s four-star, even though they actually like the game exactly the same amount. Jani notes that the new bar graph has helped to reduce this effect, though, since it gives customers a better big-picture perspective.

Poor user reviews can be commercially significant, too, depending on how they surface. “We did see a drop in sales [for Zen Bound] on the same week when we noticed negative user reviews make the front page on the game page in the desktop iTunes App Store,” writes Jani. “It’s difficult to determine whether there’s a correlation.”

There’s a selection bias fly in this ointment, too, due to the method iTunes uses to choose which user reviews will surface first. It involves the “thumbs up/thumbs down” meta-rating tool whereby readers can choose whether or not they found a review useful. iTunes surfaces the three most useful reviews on an app’s front page, and these are often minority opinions.

“[For Zen Bound,] there were ~300 5-star reviews vs. ~20 reviews that had a negative tone,” Jani recounts. “But because the people who disliked the game had a smaller set of reviews to vote up, it ended up in a situation where only negative reviews made it to the front page. Despite 300 glowing reviews, the ones on the front page said it was 1) boring, 2) not worth the money and 3) neurotic.”
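The arithmetic behind Jani's account is easy to sketch. The toy model below is an assumption of mine, not a description of how iTunes actually tallies votes: every voter upvotes exactly one review that matches their own opinion, and votes are spread evenly across the matching reviews. The review counts (300 positive, 20 negative) come from Jani; the vote counts and the `top_reviews` function are hypothetical.

```python
def top_reviews(n_positive, n_negative, fan_votes, critic_votes, shown=3):
    """Return the tones of the `shown` most-upvoted reviews.

    Hypothetical model: each voter upvotes one review that shares their
    opinion, and votes are dealt round-robin across the matching reviews.
    """
    positives = [{"tone": "positive", "helpful": 0} for _ in range(n_positive)]
    negatives = [{"tone": "negative", "helpful": 0} for _ in range(n_negative)]

    # Fans' votes are diluted across many glowing reviews.
    for i in range(fan_votes):
        positives[i % n_positive]["helpful"] += 1
    # Critics' votes pile up on a much smaller set of negative reviews.
    for i in range(critic_votes):
        negatives[i % n_negative]["helpful"] += 1

    ranked = sorted(positives + negatives,
                    key=lambda review: review["helpful"],
                    reverse=True)
    return [review["tone"] for review in ranked[:shown]]


# Review counts from Jani's account; the vote counts are made up.
front_page = top_reviews(n_positive=300, n_negative=20,
                         fan_votes=600, critic_votes=60)
print(front_page)
```

In this made-up scenario fans cast ten times as many votes as critics, yet each positive review ends up with two votes and each negative review with three, so all three front-page slots go to negative reviews.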

Whoops. Bet the App Store guys weren’t counting on that happening, especially for a game that really is the “cream” of the App Store.
