Two notes on the original Google paper
A few days ago Djoerd Hiemstra gave a guest lecture on estimating the size of big data problems, as part of a Big Data course I am currently taking. In preparation, we read the 1998 paper by Sergey Brin and Lawrence Page (read it here), in which they introduced the anatomy of their search engine called “Google”. We did so in particular because it is interesting to compare their estimates of Google's size and scalability with the colossus it has become today.
At the end of his guest lecture, he also pointed out two “fun facts” that I’d like to share briefly here.
1) The original paper on PageRank that fundamentally changed how the web looks today was rejected by the SIGIR 1998 conference.
2) In Appendix A of the paper mentioned above, the authors discuss the dangers of advertising for search engines.
The first point is awkward and warrants a discussion of how much it really means to be accepted at an academic conference. Despite being rejected, the paper and the company that followed from it completely reshaped the social reality of many people.
The second point is also interesting and has an ironic note to it, given that we now know the direction in which Google headed. Follow the link above to read it for yourself (the paper is freely accessible), but here are two fragments:
Out of historical experience, the authors “expect that advertising funded search engines will be inherently biased towards the advertisers and away from the needs of the consumers” (p. 18).
In general, it could be argued from the consumer point of view that the better the search engine is, the fewer advertisements will be needed for the consumer to find what they want. This of course erodes the advertising supported business model of the existing search engines. However, there will always be money from advertisers who want a customer to switch products, or have something that is genuinely new. But we believe the issue of advertising causes enough mixed incentives that it is crucial to have a competitive search engine that is transparent and in the academic realm. (p. 18).