The Internet

Why Do Google Hit Numbers Vary?

Supa-Fly writes "I have a question about some conflicting results from the search engine Google. I did a search for "pictures of mountains" and got exactly 1 million results. My friend did the same search (from the same office) and got 1,010,000 results. A second friend did the same search and got 1,020,000. These have not changed, and every person gets the same results each time. My question is: what is up with the discrepancies in Google's search results?" Since this question is hard to answer from the outside, Craig Silverstein of Google kindly supplies his best answer below.

Craig writes: "Thanks for the great question. We get this from time to time and hopefully I can clear up some of the confusion. The number of estimated pages listed to the top right of a Google search results page is, indeed, an estimate. It's a good estimate, but still an estimate.

There are many reasons why one might see a difference in the estimated number of pages returned for the same query. It's most likely the queries made by your co-workers were sent to different Google datacenters in what appears to have been a round-robin fashion. The index at any given Google datacenter can change slightly over the course of a day (each index is refreshed completely every three to four weeks). Depending on which datacenter finishes a query, the estimated number of results may vary.

Without having direct access to your environment it is hard for me to tell for sure; however, I believe this is the case."
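The round-robin explanation above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the datacenter names are invented, the three estimates are simply the figures quoted in the question, and nothing here reflects how Google actually routes queries (for instance, it does not model why each person kept seeing the same number).

```python
from itertools import cycle

# Hypothetical per-datacenter estimates for one query. The counts echo the
# three figures reported in the question; the datacenter names are invented.
ESTIMATES = {
    "dc-a": 1_000_000,
    "dc-b": 1_010_000,
    "dc-c": 1_020_000,
}

def round_robin_estimates(estimates, n_queries):
    """Return the estimate each of n_queries would see under round-robin dispatch."""
    datacenters = cycle(estimates)  # cycles over the dict keys in insertion order
    return [estimates[next(datacenters)] for _ in range(n_queries)]

results = round_robin_estimates(ESTIMATES, 4)
print(results)
```

Three consecutive queries land on three different datacenters and so report three different estimates; the fourth wraps around to the first datacenter again.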

This discussion has been archived. No new comments can be posted.

  • by ergo98 ( 9391 ) on Monday February 10, 2003 @09:32PM (#5275957) Homepage Journal
    Several weeks back I happened to mention a very nice new restaurant in Toronto on one of my pages, and within days it shot to the #2 position on Google for searches on several variants of this restaurant's name. I knew this because suddenly I was seeing close to a hundred hits per day from people looking for this restaurant. Note that this restaurant has such a unique name that there are only around 5 pages of links in all anyway. Then the hits suddenly stopped entirely, and a search on Google showed my page had been purged from the database: despite it being a unique name with few hits, it no longer even registered. A week later it was suddenly back in the #2 spot again.

    No idea why this happened, but it is entertaining to see it vary.
  • Amazing! (Score:5, Insightful)

    by PeterClark ( 324270 ) on Monday February 10, 2003 @09:34PM (#5275973) Journal
    An "Ask Slashdot" that actually went to the source for the answer first, without the usual bad/wrong/pointless pontificating that normally goes along with it. How long can such a good thing last, I wonder.
    :Peter
  • number oddities (Score:5, Interesting)

    by millette ( 56354 ) <robin@@@millette...info> on Monday February 10, 2003 @09:34PM (#5275976) Homepage Journal
    What's really odd is searching for a few words with OR, and noticing that adding words actually lowers the number of results obtained.
    • Re:number oddities (Score:4, Informative)

      by pete_p ( 70057 ) on Monday February 10, 2003 @09:39PM (#5276022) Homepage
      That's because Google doesn't do boolean searches. It will ignore the or (too common a word) and end up treating it like an and search.
      • Re:number oddities (Score:5, Informative)

        by sparkz ( 146432 ) on Monday February 10, 2003 @09:43PM (#5276052) Homepage
        Wrong. OR is a boolean operator to Google. Check the "Advanced Search" link.
      • Re:number oddities (Score:5, Informative)

        by millette ( 56354 ) <robin@@@millette...info> on Monday February 10, 2003 @09:47PM (#5276085) Homepage Journal
        Actually, if you use an uppercase OR, it will perform a boolean search. Otherwise, the search defaults to an AND, unless of course you're using doublequotes "like this" to search for a phrase.
        • Re:number oddities (Score:5, Insightful)

          by Forgotten ( 225254 ) on Monday February 10, 2003 @11:45PM (#5276770)
          I nearly always use double quotes to search for phrases. It works extremely well with google. You can also combine multiple phrases, and unquoted terms as well.

          In fact, I'm surprised no one else mentioned that searching for "pictures of mountains" (quotes included) yields 1320 hits, which are likely to be much more useful than the other 998,690 or so. Though in this case I really would have searched for "pictures of mountains" OR "mountain pictures" (or done two searches).

          If you're not going to use the quotes, there's precious little point including the word "of" in the query.

          There are other useful tricks for the google search field listed on the help page, but double quotes is by far the most useful overall.

          (another handy trick if you're using Mac IE is to hack the app's resource fork so the '?' address bar shortcut goes to google instead of MSN - a trick expanded on in iCab's built in URL expansion)
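The distinction drawn in the thread above (default AND, uppercase OR, and quoted phrases) can be sketched with a toy corpus. Everything here is an assumption for illustration: the documents are invented, matching is naive whitespace splitting and substring search, and none of it claims to mirror Google's real query engine.

```python
# Toy corpus: document id -> text. Invented purely for illustration.
DOCS = {
    1: "pictures of mountains in the alps",
    2: "mountain pictures and maps",
    3: "pictures of my cat",
    4: "mountains are tall",
    5: "mountains and pictures",
}

def docs_with_word(word):
    """Ids of documents containing the word as a whole token."""
    return {doc_id for doc_id, text in DOCS.items() if word in text.split()}

def search_and(*words):
    """Default behaviour described in the thread: every word must appear."""
    return set.intersection(*(docs_with_word(w) for w in words))

def search_or(*words):
    """Uppercase OR: any one of the words is enough."""
    return set().union(*(docs_with_word(w) for w in words))

def search_phrase(phrase):
    """Double quotes: the words must appear together, in order (naive substring match)."""
    return {doc_id for doc_id, text in DOCS.items() if phrase in text}

print(search_and("pictures", "mountains"))
print(search_or("pictures", "mountains"))
print(search_phrase("pictures of mountains"))
```

With exact counting, adding another term to an OR query can only keep the result set the same size or grow it, which is why a shrinking count suggests the displayed number is an estimate rather than a true tally.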
  • googledance (Score:5, Interesting)

    by wfmcwalter ( 124904 ) on Monday February 10, 2003 @09:35PM (#5275982) Homepage
    There's a number of websites (dare I say "fansites") devoted to the study of google result variance - the so-called googledance.

    this [google-dance.com] and this [webrankinfo.com]

  • by FunkSoulBrother ( 140893 ) on Monday February 10, 2003 @09:37PM (#5275999)
    It's too bad Google doesn't have one of those things where you can watch everyone's search scrolling down the screen live. I bet there would be a lot of "pictures of mountains" searches right about now.

    I think some engine (MetaCrawler?) had that back in the day. It was fun to watch, and I believe they didn't censor it.
  • by Tiber ( 613512 ) <josh.knarr@gmail.com> on Monday February 10, 2003 @09:38PM (#5276010) Homepage
    About a month ago, someone posted this story [kuro5hin.org] over on K5 [kuro5hin.org] regarding the google dance [internet-a...manual.com]. Good to see it's run by a marketing site, I couldn't think of anyone who might have more of an interest in rankings than those bastards. :P
  • Eureka! (Score:5, Funny)

    by creative_name ( 459764 ) <pauls@nospaM.ou.edu> on Monday February 10, 2003 @09:38PM (#5276016)
    No wonder I couldn't find the website I was looking for! It was in those missing 10,000 websites. If I had only gotten those and checked through them as thoroughly as I checked the other 1,010,000 then I would have certainly found it.

    Humor aside, this is pretty interesting. A lot like when you vote in a poll, go back to the main /. page, and the poll from last week appears. You'd think the Uber Midgets and Stealth Ninjas could get it right ;-)
  • by Ayanami Rei ( 621112 ) <rayanami AT gmail DOT com> on Monday February 10, 2003 @09:39PM (#5276025) Journal
    like snowflakes falling
    google queries melt upon
    different servers

    like the wild flowers
    each view of the database
    unique, yet alike

    and...
    it's that time of month
    google dances, results wiggle
    w00t first haiku post
  • ...and got 40,000 more search results (10,010,000 to 10,050,000). "Of" isn't included in the original search anyway, so I wonder why removing it yields a different estimate.
    • Probably for the same reason that the original search numbers were different for different people. As others have said, when Google removes the word 'of' it essentially treats it as if there was an 'and' there. If you remove 'of' manually it does the exact same thing.

      Guido, my good man, I do believe you have witnessed first hand the not-so-elusive google-dance.
  • by sssmashy ( 612587 ) on Monday February 10, 2003 @09:42PM (#5276046)

    It's simple, really... mountains are the new thing in pornography. People are snapping and posting so many pictures of naughty, erotically shaped rock formations that the number of mountain pics available worldwide on the net is rising by about 10,000 every 10 minutes.

    Soon, the number of phallic granite pics worldwide will even exceed the number of Jenna Jameson facials. Quite the phenomenon, really.

  • *grin* (Score:5, Funny)

    by Eric Seppanen ( 79060 ) on Monday February 10, 2003 @09:42PM (#5276049)
    Finally, proof that all Ask Slashdot questions could be more quickly answered by simply checking with Google :)
  • by elhondo ( 545224 ) on Monday February 10, 2003 @09:45PM (#5276072)
    Results have been inconsistent ever since they let those damn pigeons unionize. He's obviously covering for the union.
  • by jsprat ( 442568 ) on Monday February 10, 2003 @09:49PM (#5276100)
    Here's what I get:

    "pictures of mountains" 986,000
    "pictures of of mountains" 1,010,000
    "pictures of of of mountains" 1,020,000

    Two of these pages had a different top-ranked link.
    Funny thing, all three times Google told me "of is a very common word and was not included in my search", but it made a difference!

    Regardless of these results, Google is the best search engine. Period.

    • by Wild Wizard ( 309461 ) on Monday February 10, 2003 @10:01PM (#5276171) Journal
      Has no one mentioned the advanced settings you can use, which change what sites you get in a search?

      w/english only: 1,010,000
      w/all languages: 1,040,000
      w/strict filter and all languages: 903,000
      w/strict filter and english only: 881,000
    • Deciding to test Google's AI, I took this a step further:

      all things are not always are not always you need to know you learned from Dr Richard s Wallace.: 2,240

      all things are not always are not always are not always me need to know me learned from Dr Richard s Wallace: 3,900,000

      But all things are not always are not always are not always are not always you need to know you learned from Dr Richard s Wallace: 5,490,000

      But all things are not always are not always are not always are not always are not always me need to know me learned from Dr Richard s Wallace: 5,490,000

      etc.

  • I have to wonder... (Score:4, Interesting)

    by greechneb ( 574646 ) on Monday February 10, 2003 @09:49PM (#5276101) Journal
    I wonder if this is the same reason that when I search, I get a list of 7 pages, and then after getting to page 5, there are only 6 pages. I would think that they would have a cookie set saying which server they are gathering their data from for each search, though...

    It is kind of aggravating to be expecting 7 pages and get only 6. I always think that the mystical disappearing page contains my wanted result, though. :(
    • by RedWizzard ( 192002 ) on Monday February 10, 2003 @10:42PM (#5276383)
      I get a list of 7 pages, and then after getting to page 5, there are only 6 pages.
      I believe that what's happening there is that as you move through the pages of results Google realises that some of the later results are similar to some of the earlier results and omits them. You can get them back by clicking on the link at the end of the last page.
  • Re: Google Results (Score:3, Interesting)

    by Anonymous Coward on Monday February 10, 2003 @09:50PM (#5276110)
    Searches for : 'pictures of mountains'

    1) through no proxy (resulting in forward to google.ca as I live in Canada): 996,000

    2) through guardster.com (resulting in google.com): 1,040,000

    What is it you Americans are getting that we in Canada are not? :)
  • by Anonymous Coward on Monday February 10, 2003 @09:52PM (#5276122)
    Craig Silverstein's "explanation" is compelling and fits the circumstances, but of course it would. The use of technical jargon such as "round robin" and "datacenter" is the first sign of conspiracy, and it doesn't take long before the entire facade crumbles to reveal the truth!

    Craig "Silverstein", if that's his real name, is one of the very few CIA and FBI double operatives working at the agencies' Virginia TIA global computer network monitoring laboratories, and Google is one of their most brilliant covers, providing not only a convenient alibi for gathering psychological desire/search profiles on people all over the world, but the mechanism for doing so as well. The estimate numbers are simply the individual lookup tracking identifiers. That's why they're different for every individual that performs a search!

    Craig "Silverstein" was also one of the guests invited to the dinner party between the Reagan / Bush Jrs. the night after the attempted assassina... one sec, a van just pulled up outsi

  • I get less with the phrase "pictures of mountains" (including quotes), I get an estimate of 1,320. The estimate also has to deal with quotes and the logic behind google (AND logic or is it OR logic).
  • by MacOS_Rules ( 170853 ) on Monday February 10, 2003 @09:55PM (#5276140) Homepage
    bevis: Huh-huh. They said *mountains*. Huh-huh.

    *smack*

    butt_head: They are slashdot. They make such references to screw up the google database, thus completely validating their news stories. Inevitable reposts will bump the number even higher!

    bevis: like a conspiracy. huh-huh

    butt_head: conspiracies are cool!
  • google fight! [googlefight.com]

    It's the answer to every problem.
  • Ugly Hullabaloo (Score:3, Interesting)

    by swordboy ( 472941 ) on Monday February 10, 2003 @10:02PM (#5276174) Journal
    Here's some radio commentary [wnyc.org] on the subject matter. I heard it the other day on Public Radio International [pri.org]. An interesting read and somewhat related...
  • Another quirk I just noticed. My personal webpage was just recently indexed by Google. When searching for terms in it and getting it near the top of the search results, a cached link is seen next to it. This link works.

    However, if I enter cache:URL_HERE, it says it cannot find it. This feature works for other webpages, so I know it's not my syntax.
  • by El Camino SS ( 264212 ) on Monday February 10, 2003 @10:16PM (#5276242)

    Perhaps we should Ask Jeeves.

    Hmmmmmm?
  • by pipingguy ( 566974 ) on Monday February 10, 2003 @10:18PM (#5276254)
    I suspect that Google employs the nucular radiation-enhanced, super pigeons [google.com] as actual editors (as opposed to the slashdot phenomenon). Regular columbiformes are probably relegated to mundane crawling/pecking duty.

    My site, which is admittedly somewhat unique, has been listed in the top 5 for over 18 months now if the appropriate keywords are used.

    Of course, it also helps that those keywords are snarklbort, giffleblag and byzgetford.
  • by Tsali ( 594389 ) on Monday February 10, 2003 @10:23PM (#5276285)
    ... anyone care to tell me what they see
    on this one? [google-fight.com]
    My soon-to-be ex-wife is trying to screw me out of back pay and my share of the company that I helped build. She changed the contact page of the company to remove my name and changed the name of the page, but thanks to Google cache I could retrieve it and will be showing it to the labor board in the morning. Woooohoooo!

  • Google Dance (Score:5, Informative)

    by kiwirob ( 588600 ) on Monday February 10, 2003 @10:46PM (#5276403) Homepage
    Results can also vary due to the Google Dance.

    Google has 7 data centers, each with a copy of its index, and these are "usually" mapped to www.google.com [google.com]. But Google also has versions located at www2.google.com [google.com] and www3.google.com [google.com].

    During the monthly update there can be a different version of the index on each of the 3 hosts. A website, www.google-dance-tool.1hut.com [1hut.com], provides results for a search done on all 3 of Google's indexes.

    To check whether the Google dance is happening, the most common technique is to check the "back links" for major sites like Yahoo by typing "link:www.yahoo.com" into the search box. This will list all the sites with links to "www.yahoo.com".

    The Google Dance Tool site mentioned above checks Google every 5 minutes to see if the dance is on. Once it has started, the site sends out an automated email to subscribers (like me) so I can visit the site and see what the search positions for the next month on Google will be, using their Google dance tool search.
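The backlink check described in the comment above amounts to comparing the same count across the three hostnames. A minimal sketch follows, assuming the counts have already been collected by hand: the numbers are invented, no request is made to any server, and the function names are hypothetical rather than anything the Google Dance Tool site is known to use.

```python
# Hypothetical backlink estimates for "link:www.yahoo.com" on each hostname.
# Only the three hostnames come from the comment; the numbers are invented.
BACKLINKS = {
    "www.google.com": 650_000,
    "www2.google.com": 662_000,
    "www3.google.com": 650_000,
}

def dance_in_progress(counts):
    """True when the hosts disagree, suggesting they serve different index versions."""
    return len(set(counts.values())) > 1

def odd_hosts_out(counts):
    """Hosts whose count differs from the most common value."""
    values = list(counts.values())
    baseline = max(set(values), key=values.count)  # most frequent count
    return sorted(host for host, count in counts.items() if count != baseline)

print(dance_in_progress(BACKLINKS))
print(odd_hosts_out(BACKLINKS))
```

Since the displayed counts are estimates that can drift during an ordinary day, a disagreement between hosts is a hint that an update is under way, not proof.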
  • as you can get through those first 1 010 000 results, then i'd start to worry about why it sometimes comes up with 20 000 more.
  • Google cheats (Score:2, Interesting)

    by Anonymous Coward
    They claim 76,300,000 pages with 'computer'. Try actually getting past 1000. It just stops.
  • ...google for the answer ?
  • "We have determined that the result varies because google does not like some of you for personal reasons."
  • ...all you needed to do was ask: plenty of "pictures of mountains" on my site [gdargaud.net]. And, no, this is not a troll.
  • by lucasw ( 303536 ) <lucasw@NoSPaM.icculus.org> on Tuesday February 11, 2003 @12:52AM (#5277053) Homepage Journal

    Spell Check:
    Type in candidate spellings of a word, and assume the spelling with the most search results is the right one:
    'amatuer' -> 3.9e6 hits, 'amateur' -> 35e6 hits. Amateur it is.
    'modelling' -> 2.6e6 hits, 'modeling' -> 5.7e6 hits. Close call, perhaps both are acceptable?

    Ego Boost:
    Everyone knows about this one: see what comes up under your own name (put it in quotes if necessary). Hopefully, if you run a small website or comment with your real name frequently in a google-searchable place, that'll come up first. But you'll have to work hard to beat out all those genealogy sites that just list thousands of names, graveyard roll-calls and whatnot. Oh, and there's some court case from five years ago where your name is featured prominently. My namesake is shared with one of the first shaken babies to die and become a major local (wherever it happened) news story, so not much of a boost after all.

    Stalking:
    I'd imagine this is pretty similar to the previous, but with names of other people you know or used to know: your old college sweetheart died in 1892! Wait...

    Trademark pre-research
    You need a product name- something fresh and original, and easily googleable? Start with a few ideas, and use a thesaurus (and don't forget cool foreign language words/roots) to refine the name until google hits are down to a zero. Run words together or otherwise potential customers will end up at sites that just randomly use those words at different points of the text- assume the customer is too dumb or lazy to use quotes.
    'NodeZero' is my new badass something-or-other- wait there's 1K hits, how about 'NodeNull'? Only 8 now, that's good, but better yet try 'NodeNothing'- zero results.
    After the google test, see if the .com, .org, or .net site with the same name resolves, just in case.

    I'm sure there's many more...
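The spell-check trick from the comment above boils down to picking the candidate with the most hits. Here is a rough sketch that uses only the hit counts quoted in that comment; it performs no search itself, and the `close_ratio` threshold for deciding that two spellings are "both acceptable" is an assumption made up for this example.

```python
# Hit counts quoted in the comment above. A real tool would need to look
# these up somehow, which this sketch deliberately does not attempt.
HITS = {
    "amatuer": 3.9e6,
    "amateur": 35e6,
    "modelling": 2.6e6,
    "modeling": 5.7e6,
}

def pick_spelling(candidates, hits, close_ratio=3.0):
    """Return (best, alternatives).

    best is the candidate with the most hits; alternatives are runners-up
    whose count is within a factor of close_ratio of the best (hypothetical
    cutoff for a "close call").
    """
    ranked = sorted(candidates, key=lambda word: hits[word], reverse=True)
    best = ranked[0]
    alternatives = [w for w in ranked[1:] if hits[w] * close_ratio >= hits[best]]
    return best, alternatives

print(pick_spelling(["amatuer", "amateur"], HITS))
print(pick_spelling(["modelling", "modeling"], HITS))
```

With these numbers, "amateur" wins outright, while "modeling" wins with "modelling" flagged as a plausible alternative, matching the commenter's "close call" reading.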

  • by Prof.Phreak ( 584152 ) on Tuesday February 11, 2003 @04:15AM (#5277741) Homepage
    try reading: The Anatomy of a Large-Scale Hypertextual Web Search Engine

    http://www7.scu.edu.au/programme/fullpapers/1921/com1921.htm [scu.edu.au]
