Throwing out highs/lows is a common practice in many scoring systems for precisely this reason.
The only question is when a value counts as too high or too low. I would say standard deviation + 1.0 rating point. As a consequence, 2-3 of my ratings would be eliminated (Bern and Salzburg), but that seems fine.
Not a problem for me.
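A minimal sketch of such a cutoff, assuming it means "drop any rating that lies more than (standard deviation + 1.0) away from the mean". That reading, the function name trim_outliers and the sample ratings are my assumptions, not something the site does today:

```python
from statistics import mean, pstdev

def trim_outliers(ratings, margin=1.0):
    """Drop ratings farther than (std deviation + margin) from the mean.

    Assumption: the cutoff is measured from the mean, using the
    population standard deviation of all ratings for one site.
    """
    if len(ratings) < 2:
        return list(ratings)
    center = mean(ratings)
    cutoff = pstdev(ratings) + margin
    return [r for r in ratings if abs(r - center) <= cutoff]

# Made-up ratings for one site; the lone 0.5 is the low outlier.
ratings = [3.0, 3.5, 3.5, 4.0, 4.0, 4.5, 0.5]
kept = trim_outliers(ratings)
```

Here only the 0.5 is dropped; the 4.5 stays because it is within the cutoff.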
You could try using the median and the median absolute deviation to measure central tendency & dispersion. These are more robust against outliers.
Not possible, or at least not out of the box, with SQL functions. I do think this could be interesting and I need to check.
You are currently using an 11-point unipolar interval scale for rating. Most rating scales use 5 or (less frequently) 7 points. The more points, the less accurate your ratings will be, due to effects such as central tendency bias. Half stars are also much more difficult for a respondent to process mentally (really).
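If SQL does not offer it, both values could be computed in application code instead. A rough sketch using only the Python standard library; the helper name median_abs_deviation and the sample ratings are made up for illustration:

```python
from statistics import median

def median_abs_deviation(values):
    """Median of the absolute distances from the median (unscaled MAD)."""
    m = median(values)
    return median([abs(v - m) for v in values])

# Made-up ratings for one site, including one very low vote.
ratings = [3.0, 3.5, 3.5, 4.0, 4.0, 4.5, 0.5]
center = median(ratings)                # central tendency
spread = median_abs_deviation(ratings)  # dispersion
```

The single 0.5 vote barely moves either number, which is the robustness the suggestion is after.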
Well... For me I see the following scale:
1* 0.5 -> Should not be on the list.
2* 1.0 -> Pretty miserable.
3* 1.5-2.0 -> Below standard
4* 2.5 -> Average
5* 3.0-3.5 -> Above Average
6* 4.0 -> Good
7* 4.5 -> Exceptional
8* 5.0 -> World Wonder
Maybe one could group 2.5 with 3.0 and 3.5 with 4.0 to get to a seven-point scale. What does everyone else think? I agree that 11 seems a bit too much, especially in the middle.
Because the population is small, you might try using a Bayesian average.
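To make the proposed grouping concrete, here is a small sketch that maps the current half-star values onto seven levels. The level numbers 1-7 and the name to_seven_point are only illustrative:

```python
# Grouping from the post: 1.5 with 2.0 (already one bucket),
# plus the proposed 2.5 with 3.0 and 3.5 with 4.0.
SEVEN_POINT = {
    0.5: 1,
    1.0: 2,
    1.5: 3, 2.0: 3,
    2.5: 4, 3.0: 4,
    3.5: 5, 4.0: 5,
    4.5: 6,
    5.0: 7,
}

def to_seven_point(stars):
    """Map a half-star rating (0.5 to 5.0) to a level from 1 to 7."""
    return SEVEN_POINT[stars]

levels = [to_seven_point(s) for s in (0.5, 2.0, 3.0, 3.5, 5.0)]
```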
We already apply a Wilson score lower bound to the average (at 25%) to compute the score. I think this is covered. I was also thinking about normalizing each voter and awarding points based on that.
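For comparison, this is roughly what the suggested Bayesian average looks like. It is not the Wilson-based score we use now, and prior_mean and prior_weight are hypothetical tuning values (for example the average over all sites and a handful of "phantom" votes):

```python
def bayesian_average(ratings, prior_mean, prior_weight):
    """Pull a site's mean toward prior_mean; the pull fades as votes accumulate.

    prior_weight acts like a number of phantom votes at prior_mean.
    """
    n = len(ratings)
    return (prior_weight * prior_mean + sum(ratings)) / (prior_weight + n)

# Two perfect votes barely move the score; fifty of them do.
few = bayesian_average([5.0, 5.0], prior_mean=3.0, prior_weight=5)
many = bayesian_average([5.0] * 50, prior_mean=3.0, prior_weight=5)
```

With only two votes the score stays close to the prior, so a site cannot reach the top of the list on a couple of enthusiastic ratings.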
You could try positively weighting votes from "experts", this is quite often used in rating scales!
Weighting by visited sites and reviews written would be fun. I will see how I can make that happen. Maybe give one vote per 100 visited sites and per 50 reviews written?
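A quick sketch of how that could work. I am assuming every voter keeps one base vote so that newcomers still count; that base vote, the function names and the sample numbers are all made up:

```python
def vote_weight(visited_sites, reviews_written):
    """One base vote (assumption), plus one per 100 sites and one per 50 reviews."""
    return 1 + visited_sites // 100 + reviews_written // 50

def weighted_average(votes):
    """votes: list of (rating, visited_sites, reviews_written) tuples."""
    total_weight = sum(vote_weight(v, r) for _, v, r in votes)
    weighted_sum = sum(rating * vote_weight(v, r) for rating, v, r in votes)
    return weighted_sum / total_weight

# A newcomer rates 4.0; a veteran (450 sites, 120 reviews) rates 2.0.
votes = [(4.0, 30, 0), (2.0, 450, 120)]
score = weighted_average(votes)
```

The veteran's vote counts seven times here, so the result lands much closer to 2.0 than to the plain average of 3.0.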
You could try forcing an objective and a subjective rating - both are valid. "Rate the site" and "Rate your visit", much like meltwaterfalls does in his reviews. This gives interesting insights - often respondents give ratings based on what they think they should think.
I think we should stay with one scale, and it should be "Rate the site", not "Rate the visit".
Finally, you could be more explicit about what is being rated.
That's why we are having this discussion ;)