Original blog post here: What are Google thinking
I’d like to respond to some of the comments made on hackernews and puremango about my post yesterday, and about Binggate / copygate in general.
Reply: “There was never a ‘link’ between [hiybbprqag] and the spiked sites, so Bing must have been directly copying Google”
Yes, I said “linking”, but I didn’t mean it in the sense of a hyperlink (my bad) but rather a conceptual link. What I meant was: “Google spotted that Bing sometimes included results that seemed to be copied from Google, so Google set up a honeypot – they made some made-up words like [hiybbprqag] cause Google’s SERPs to link to random unrelated sites. A few weeks later, around 8% of those sites showed up on Bing for those queries.” I’ve edited the original post with this new wording.
It’s still not valid to jump from that to “Bing are copying Google”. All we can tell from that is “Bing are using information about the URLs users are on and the pages they click”.
Here, in the clearest terms I can manage, are the two pictures of events:
1) Bing are copying Google
When you’re on Google, and Google alone, Bing kicks in some query string detector – based on URL, form fields, reading Google’s SERP HTML, whatever – and then waits for you to click a page. When you do, that query string is sent to Bing along with the URL of the page you clicked. That information creates a relationship between the query and the page which is then used to create Bing’s results.
2) Bing are just monitoring all clickthroughs
When you’re on any website, Bing kicks in some “What kind of page are we on” detector – based on URL, form fields, reading the HTML, whatever, and then waits for you to click a page. When you do, that data is sent to Bing along with the URL of the page you clicked. That information creates a relationship between the query and the page which is then used to create Bing’s results.
The first can reasonably be called a case of Bing copying Google; the second cannot.
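To make the difference concrete, here is a minimal Python sketch of the two pictures. Everything in it is hypothetical: the function names, the list of query parameter names and the example URLs are my own assumptions for illustration, and none of it is taken from the Bing toolbar. It only shows why the honeypot can't tell the two apart.

```python
from urllib.parse import urlparse, parse_qs

# Assumption: parameter names that commonly carry a search query.
QUERY_PARAMS = ("q", "query", "search", "p")


def extract_query(page_url):
    """Return the first search-like query string parameter, or None."""
    params = parse_qs(urlparse(page_url).query)
    for name in QUERY_PARAMS:
        if name in params:
            return params[name][0]
    return None


def google_only_signal(page_url, clicked_url):
    """Picture 1: record (query, click) only when the user is on Google."""
    host = urlparse(page_url).hostname or ""
    if host != "google.com" and not host.endswith(".google.com"):
        return None
    query = extract_query(page_url)
    return (query, clicked_url) if query else None


def generic_signal(page_url, clicked_url):
    """Picture 2: record (query, click) on any page that looks like a search."""
    query = extract_query(page_url)
    return (query, clicked_url) if query else None


# A honeypot click on Google looks identical under both pictures...
serp = "https://www.google.com/search?q=hiybbprqag"
click = "http://example.com/unrelated"
print(google_only_signal(serp, click))
print(generic_signal(serp, click))

# ...and only a click on some other site's search page separates them.
other = "https://shop.example.org/find?q=red+shoes"
item = "https://shop.example.org/item/1"
print(google_only_signal(other, item))
print(generic_signal(other, item))
```

Both functions return the same pair for the honeypot query, so Google's experiment is consistent with either picture. You'd need to run the same kind of test on a non-Google site to distinguish them.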
Moreover, why would Bing do that? If they wanted to copy Google, why wouldn’t they just scrape Google directly? But why would they want to copy Google? Because they’re not smart enough to create their own search engine? That’s what Google would like you to believe. I think it’s perfectly within MS’s budget to attract high-quality PhD students who know a thing or two about information retrieval. I mean, just search for [Microsoft Google defector] and you can see that a lot of Googlers used to work at MS. The talent is surely not so massively different that MS have to resort to copying?!? And if they did, why this convoluted strategy – plausible deniability, perhaps. I favour the simpler explanation.
Reply: Whatever, the main issue is that Bing Toolbar is spyware.
Firstly: the communications between Bing toolbar and Bing are encrypted, which hints to me that MS actually do care about protecting your data – the only reason to encrypt it is to ensure that malicious third parties can’t sniff your search data. That’s circumstantial, and could alternatively be read as “They don’t want anyone to know what info they’re collecting”, I concede. Secondly: if it were the main issue, why isn’t it the main issue? Thirdly: users accepted the ‘spying’ in the EULA. Now, I know that no-one reads these and that it’s not a nice tactic for companies to hide behind that kind of get-out clause, so, fourthly: everyone spies on users. I recorded your browser and referer via Google Analytics when you visited this blog. All your web-based email is being ‘spied’ on to create better adverts. This is the trade we make in exchange for free, amazing web services. If you want privacy, use DDG.
Reply: It should be “Google is” not “Google are”
I’m English, that’s how we roll.
Reply: The bigger issue is that if this continues, Bing will end up as a copy of Google
Sure, for obscure queries, and perhaps the larger trend that Matt Cutts & co noticed is down to this. So Bing will likely change it. Harry Shum actually said “this is a new kind of clickfraud”. I’d be massively surprised if Bing don’t change the way their “ClickRank” system works (my shorthand word meaning “what I think Bing’s doing”). In the same way, if Googlebombing had continued, Google would have ended up full of spam. This is a non-issue; it won’t happen. As many have pointed out, “It would have been smart to exclude Google from the start”. Yes, it would, but Bing didn’t anticipate that Google would have such an apparently huge effect on them, much less that Google would handle this so immaturely.
Also, Google ‘copy’ snippets of webpages, they ‘copy’ images, they ‘copy’ entire websites for cache and for instant previews. They ‘spy’ on users for autosuggest. No-one has a problem with any of this. Because that’s how the web works. To get ahead, you have to start looking at more invasive signals. My own research on adaptive websites focused on tracking clicks, scrolling, word highlighting etc while a user is on a page. This is useful data which can be used to produce a better product if it’s handled sensitively.
I’m not saying “Bing do not have code in there to copy from Google directly”. They may do, they may be absolutely as desperate as Google paint them to be, but for that to be true, a lot of non-obvious, unsupported things (such as “it only happens on google.com”) have to also be true. Occam’s razor suggests we prefer the simpler explanation until further evidence arises.