
Ranking bias has been appraised by the European Union's (EU) European Commission (EC) as a 2.42 billion euro offense (roughly 2.7 billion US dollars). That's the record-breaking fine that the EC placed on Google Search last month. In addition to the fine, they also put a price on a team capable of monitoring Google Search: 10 million euro. That's the size of the contract that the EC put out for a team to monitor Google Search rankings for favoritism toward Alphabet services. So wtf is ranking bias? Here's one take on quantifying ranking bias by some friends from the Max Planck Institutes and Indiana University. The implementation is my own, and I take full credit for any and all bugs.

Kulshrestha, J., Eslami, M., Messias, J., Zafar, M. B., Ghosh, S., Gummadi, K. P., & Karahalios, K. (2017).
Quantifying search bias: Investigating sources of bias for political searches in social media. Proceedings of the
20th ACM Conference on Computer-Supported Cooperative Work and Social Computing (CSCW 2017).

Quantifying bias

In the paper cited above, Kulshrestha et al. (2017) proposed the following formulas for quantifying ranking bias.

Given a page of $N$ search results that have each been assigned a bias score $s_i$, the input bias for a given query $IB(q)$ is the average of all bias scores $s_i$ present in the rankings:

$$IB(q) = \frac{1}{N}\sum_{i=1}^{N} s_i$$
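
As a minimal sketch, input bias is just the mean of the scores. The function name `get_input_bias` and the sample scores are made up for illustration; they are not from the paper.

```python
def get_input_bias(bias_scores):
    """Return IB(q): the mean bias score over all N results for a query."""
    if not bias_scores:
        raise ValueError("bias_scores must contain at least one score")
    return sum(bias_scores) / len(bias_scores)


# Hypothetical scores for a four-result page (values invented for illustration).
example_scores = [1.0, 0.5, -0.5, -1.0]
print(get_input_bias(example_scores))  # 0.0
```

Note that the order of the scores has no effect here: input bias only describes what is on the page, not where it sits.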

In order to calculate the output bias, Kulshrestha et al. first calculate the bias $B$ for a query $q$ until rank $r$:

$$B(q, r) = \frac{\sum_{i=1}^{r} s_i}{r}$$

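def get_bias_till(bias_scores, till_rank):
    # Map each rank r (1-indexed) to B(q, r), the mean bias of the top r results.
    # Ranks stop at the end of the list if till_rank exceeds its length.
    bias_till = {}
    for rank in range(1, min(till_rank, len(bias_scores)) + 1):
        bias_till[rank] = sum(bias_scores[:rank]) / rank
    return bias_till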

Then the output bias $OB$ is the average of $B(q, i)$ over all ranks $i$ from 1 to $r$:

$$OB(q, r) = \frac{\sum_{i=1}^{r} B(q, i)}{r}$$

def get_output_bias(bias_scores, till_rank):
    # OB(q, r): the mean of B(q, i) over ranks 1..r.
    bias_till = get_bias_till(bias_scores, till_rank)
    total_bias = sum(bias_till.values())  # dict.values() works on Python 3
    output_bias = total_bias / till_rank
    return output_bias
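
One way to read this formula (worked out here, not quoted from the paper): swapping the order of summation shows that $OB$ is a weighted average of the scores, with weights $w_j$ that shrink as rank increases.

$$OB(q, r) = \frac{1}{r}\sum_{i=1}^{r} \frac{1}{i}\sum_{j=1}^{i} s_j = \sum_{j=1}^{r} w_j s_j, \qquad w_j = \frac{1}{r}\sum_{i=j}^{r} \frac{1}{i}$$

For $r = 3$ the weights are $w_1 = 11/18$, $w_2 = 5/18$, and $w_3 = 2/18$. They sum to 1, and the top result counts more than five times as much as the third.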

And ranking bias is simply the difference of $OB$ and $IB$:

$$RB(q, r) = OB(q, r) - IB(q) $$

def get_ranking_bias(bias_scores, till_rank):
    # RB(q, r) = OB(q, r) - IB(q), where IB(q) is the mean of all bias scores.
    input_bias = sum(bias_scores) / len(bias_scores)
    return get_output_bias(bias_scores, till_rank) - input_bias
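
To sanity check the definitions, here is a compact, self-contained restatement run on two made-up pages. The name `ranking_bias` and the scores are hypothetical. As an assumption for this sketch, the output bias is averaged over the ranks actually present, which matches the formula whenever `till_rank` does not exceed the number of results.

```python
def ranking_bias(bias_scores, till_rank):
    """Compact restatement of RB(q, r) = OB(q, r) - IB(q) for a quick check."""
    top = bias_scores[:till_rank]
    # B(q, i) for i = 1..r: the mean of the top-i scores.
    running_means = [sum(top[:i]) / i for i in range(1, len(top) + 1)]
    # OB(q, r): the mean of those running means.
    output_bias = sum(running_means) / len(running_means)
    # IB(q): the mean of every score on the page, regardless of order.
    input_bias = sum(bias_scores) / len(bias_scores)
    return output_bias - input_bias


# Hypothetical pages: the same four scores, ordered two different ways.
positive_first = [1.0, 0.5, -0.5, -1.0]
negative_first = [-1.0, -0.5, 0.5, 1.0]

print(ranking_bias(positive_first, 4))  # roughly 0.52
print(ranking_bias(negative_first, 4))  # roughly -0.52
```

Both pages have an input bias of zero, so the nonzero results come entirely from the ordering: whichever scores sit at the top pull the ranking bias in their direction.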