I sound like a broken record here, but I want to emphasize that avoiding the score adjustment is not a design goal of this system. The point of the adjustment is to allow entries in the bottom half to be ranked relative to one another while minimizing randomness: scores backed by fewer ratings carry less confidence, so they get scaled down.
Also, keep in mind we’re using the median, not the average, so if a few people go all in and rate a lot of entries, the median will not be affected. The median is a good representation of how participants of the jam are voting.
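To illustrate why the median is the safer statistic here, a quick sketch with entirely made-up per-entry rating counts (none of these numbers come from a real jam):

```python
from statistics import mean, median

# Hypothetical rating counts per entry. Suppose a couple of entries end
# up with far more ratings than the rest of the pack.
rating_counts = [8, 9, 10, 10, 11, 12, 120, 150]

print(mean(rating_counts))    # 41.25 -- dragged way up by the two outliers
print(median(rating_counts))  # 10.5  -- essentially unaffected
```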
Lastly, for higher medians, also understand that increasing the median by 1 has less of a score impact on entries that are around the median, because the ratio of ratings received to the median number of ratings is what’s used (e.g. the shortfall at 99/100 is much smaller than at 9/10).
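As a rough sketch of that effect: the exact scaling curve isn’t spelled out in this thread, so treat the linear `adjusted_score` below as a hypothetical stand-in; only the use of the ratings/median ratio is stated above.

```python
def adjusted_score(raw_score: float, ratings: int, median_ratings: float) -> float:
    """Hypothetical sketch: scale down scores with fewer ratings than
    the median, using the ratings/median ratio. The real curve may differ."""
    if ratings >= median_ratings:
        return raw_score
    return raw_score * (ratings / median_ratings)

# The penalty near the median shrinks as the median grows:
print(round(1 - 9 / 10, 2))    # 0.1  -- a 10% reduction at a median of 10
print(round(1 - 99 / 100, 2))  # 0.01 -- only a 1% reduction at a median of 100
```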
By adjusting the algorithm with a focus on reducing the number of entries that get a score adjustment, you essentially introduce entropy into the rankings (so I don’t agree that your suggestion would stabilise the system). The main goal of the ranking algorithm is to avoid “fluke” situations where a submission that wasn’t seen by many people happened to get a high rating because the people who did see it didn’t care, were biased, or something else. Especially in a jam like GMTK, where public ratings are allowed, this is definitely a concern. The secondary goal is to let entries that were both highly scored and received a large number of ratings rise in the rankings. You may argue this can hide “hidden gem” submissions (seen by few, but actually very good), but since those projects have less confidence in their overall score, I think it’s a necessary sacrifice to accomplish the goal of ranking every entry relative to every other entry. (For example, on itch.io’s browse pages, “hidden gems” are a good thing, so we use a different algorithm to allow that type of content to surface.)
All that said, I wasn’t immediately dismissing your idea. I think it’s an interesting suggestion, and I would need time to run results from existing jams and observe the kind of impact it has. Just thinking about it offhand, I believe it would most likely introduce a higher “fluke” factor for jams that have lower medians. It’s hard to intuitively reason about how `Median * 80%` compares to something like `40th percentile` for the adjustment cutoff without actually running the numbers to see how it performs.
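For anyone who wants to experiment, here’s a minimal sketch for comparing where the two proposed cutoffs would land. The distribution of rating counts is made up for illustration; neither cutoff reflects what the live system does.

```python
from statistics import median, quantiles

# Hypothetical per-entry rating counts for a small jam with a low median.
rating_counts = sorted([2, 3, 3, 4, 4, 5, 5, 6, 8, 12, 20, 35])

med = median(rating_counts)                   # 5.0
cutoff_a = med * 0.8                          # Median * 80% -> 4.0
cutoff_b = quantiles(rating_counts, n=10)[3]  # 40th percentile (4th of 9 decile cut points)

below_a = sum(c < cutoff_a for c in rating_counts)
below_b = sum(c < cutoff_b for c in rating_counts)
print(cutoff_a, cutoff_b)  # how far apart the two cutoffs land
print(below_a, below_b)    # how many entries fall under each cutoff
```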
Regarding your point about communication, I definitely agree that we can do more to inform participants that rating other games boosts their own entry’s visibility. In the case of GMTK, I don’t remember offhand how the host communicated that to the participants, but I believe most people understood that they should be rating games.
Thanks!