LiveLeak = 500,000+ Visitors/Day -> Not Indexed by Google?
LiveLeak.com is a sometimes-controversial video-based site which receives over a half-million visitors a day and was, until just now, completely unindexed in Google - because they accidentally told Google not to index their site! The funniest part is that, as of 4 AM PST, their own front page still has a ‘try it for yourself’ link that encourages users to see that they aren’t indexed by Google.
In a rather tragic personal video, a depressed-sounding LiveLeak administrator describes how he has been trying to contact Google about this potential act of censorship. The video has drawn comments ranging from “google has stepped out of bounds” to “google is with the government in censoring the internet and they must be stopped!!!” Meanwhile, of course, they were informed (by a friend of mine) of the problem and, as you can see below, have fixed it:
Even the Wikipedia article on Google censorship contains an (albeit now contested) sub-section pointing out this issue. How can a site with half a million visitors a day and numerous web professionals looking into the problem NOT figure this out sooner? Go(d)ogle only knows, but now that they have removed the block they are sure to get a flood of links from Google. Maybe it was just reverse psychology to begin with :P


Whoops, I’m a dumbass - I misread this article the first time - my bad! Didn’t realize you were posting a later screenshot LOL!
If you wanted to block all robots then you’d do this:
User-agent: *
Disallow: /
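For anyone who wants to check the difference rather than argue about it, here is a minimal sketch using Python's standard `urllib.robotparser`. The helper name `is_allowed` and the two sample files are illustrative assumptions, not LiveLeak's actual robots.txt, and the standard-library parser is only a stand-in for however Googlebot really reads the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical sample files for illustration; not copied from liveleak.com.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"   # every path is off limits
ALLOW_ALL = "User-agent: *\nDisallow:\n"     # empty Disallow permits everything


def is_allowed(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for name, text in (("block all", BLOCK_ALL), ("allow all", ALLOW_ALL)):
        print(name, is_allowed(text, "http://www.liveleak.com/"))
```

The only difference between the two files is the single slash after `Disallow:`, which is easy to get wrong by accident.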
Erm…You do know that isn’t how it works right? Google were indexing another URL for the site just not the .com. Funny post though, nice one :)
Just to help you out
http://en.wikipedia.org/wiki/Robots_exclusion_file
Seriously bro, you’re going to look pretty dumb soon. That robots.txt is right and allows all robots. Look it up or look foolish :)
[…] 11th, 2007 There is no danger of censorship. It is the site’s own robots.txt file that pushed it out of.. Google’s robots ;) Posted by pipda Filed in altri […]
Hrm … yes, dumb. Actually, it seems pretty dumb to me that you couldn’t figure out I took that screenshot AFTER they fixed the problem :S But I digress - thanks for pointing out that I needed to make my article crystal clear on that point!
Kroq
The screenshot is the same as it was when this whole thing started cause it was the first thing I went to look at. Someone on Digg came to the same conclusion as you and then made a retraction. Sorry.
500,000 unique visitors a day, and not indexed….
[…] I’m talking about LiveLeak, a video portal whose videos don’t always sit comfortably with everyone, according to what I read around the various blogs. Personally, I haven’t explored it in depth.
The issue is a different one: it would seem that this site is not indexed by G…
No apologies necessary … after all, you’re wrong. It was on there originally - I just didn’t get a screenshot of it because I didn’t realize they were about to remove it. Thanks for playing tho :D
The robots.txt file didn’t change though. This is the point I’m getting at. On the day the video came out a Digg member posted the same .txt saying that was blocking it then retracted his statement after he realised. However, your story does make for more interesting copy :)
The robots.txt indicated “Disallow: /”. Yahoo and MSN/Live.com don’t check the robots.txt file as often as Google does and instead reference a cached version, which could have been out of date. That would explain how those two engines had LiveLeak content in their index and Google did not.
Unless you really want to think Google cock-blocked LiveLeak’s poorly-run and slow video service from taking their own traffic away. Seeing as YouTube removes 99% of the material that makes up LiveLeak’s index on sight, that’s a very stupid conclusion to come to.
you know … all i REALLY ask is that people READ the post before they respond. that screenshot was taken later
My last response was directed to Trojan and the hordes of idiots who came up with weird-ass conspiracy theories instead of any sort of logical, real-world conclusion.
lol whoops
Did I say I thought Google blocked it? I doubt Google would care about a website so far under their product. I was just talking about the .txt file, which was the same when I checked it when this first came out as it was when this entry was written. It also appears they had other URLs indexed. I was just interested. If you all want to get defensive and put words in my mouth about conspiracies, it says more about you than me.
Yes, ummm, it’s a little hard NOT to put words in your mouth, since the ones you use in the first place are confusing at best. Sorry!
@Trojan:
Are you fucking kidding me? You’re actually hiding behind that ignorant, half-assed response?
Please explain yourself or get laughed at.
@Fred
Whut? Laugh away big man, it’s all good :)
[…] This site, for new readers tuning in, has taken a lot of big-name internet companies to task, from LiveLeak to Digg and many others along the way. Possibly the only site I have posted only positive praise […]
@kroq aka Anonymous/Fred
Don’t be an idiot! We all know Anonymous, Fred and kroq are the same person.
LOLZ ummm … back that up with some semantic analysis or IP work or, well, anything. Otherwise … congrats! I have approved your comment for everyone else to see how much of an idiot you are :D