Objectively Comparing Competing Search Engines?
aendeuryu asks: "My default search engine of choice is, like most of you I assume, Google. That said, some complaints about Google over the years do seem to have some merit -- basically, that sometimes the indices aren't always updated, that it's too easy to manipulate via googlebombing or legislation, and that maybe too many of its featured services never get out of beta stage. Maybe the fact that Google has gone so long without significant competition is enough to make one at least begin to ask questions about it possibly becoming stagnant. Personally, I'm so used to doing things the Google way (and achieving acceptable results quickly) that I'm not really interested in switching -- case in point, all the above links referenced were quickly found via Google. However, what am I missing out on by not giving (for example) Yahoo search a shot? Or, more to the point, how would one go about trying to effectively and objectively compare competing search engines? In what areas have people found Google to have become obsolete for their purposes? Have less ignorant people than myself figured out ways to test a competing search engine's efficacy for themselves?"
Dont bother (Score:5, Insightful)
Re:Dont bother (Score:3, Informative)
The times that I have had problems are when I am not exactly sure what I am looking for in a few quick words. I can put it together in a question, such as "What is my house in Utah worth?" or "Why are flamingos pink?".
In those cases, I usually use ask.com. That will get me going on a few pages, at which point I will know more clearly what I'm looki
Re:Dont bother (Score:4, Insightful)
I know everyone loves google, and I use it too, but I find that where it used to be an efficient way to find information, it's becoming less and less so as time goes on because of this sort of crap. As far as I'm concerned, if I need to pay to access the information, google should not be indexing that information and putting up links to the sign up page for me to waste my time with when the answer is already freely available elsewhere and that freely available source is in their index. If I wanted to use pay sites to provide my answers, I wouldn't be using google in the first place, would I?
Re:Dont bother (Score:3, Informative)
Re:Dont bother (Score:4, Informative)
For example: a question about Java [experts-exchange.com]. The question comes first, then the SIGN UP! bla bla, then a bunch of categories, but if you scroll down further, you'll find answers to the question, including the 'accepted answer' and such.
Hope this is useful to someone.
Cheers
Re:Dont bother (Score:3, Informative)
Re:Dont bother (Score:3, Informative)
"Flamingos are not born pink. They are white at birth. However, a substance -- called carotenoids -- in the foods they eat produce the bright pink color.
Flamingos would lose their shading if they could not eat carotenoid-filled foods like plankton, shrimp, or -- as handlers at the Philadelphia Zoo have found -- carrots. "
Re:Dont bother (Score:5, Interesting)
These are the top 4 results for "Why are Flamingos Pink?" (entered without quotation marks) in the top 3 search engine providers Yahoo!, MSN, and Google.
Yahoo!
http://www.finelinefeatures.com/pink/
http://www.shopping.com/xGS-Pink_Flamingos~FD-113
http://www.thewildones.org/Animals/flamingo.html Contains Answer
http://www.overstock.com/cgi-bin/d2.cgi?cid=48422
MSN
http://199.216.204.14/project04/legends2004/why_f
http://home.nycap.rr.com/useless/pink_flamingo/
http://www.cat1234.com/id56.htm
http://www.straightdope.com/columns/010518.html Contains Answer
Google
http://www.straightdope.com/columns/010518.html Contains Answer
http://www.thewildones.org/Animals/flamingo.html Contains Answer
http://www.amazon.com/exec/obidos/tg/detail/-/006
http://webexhibits.org/causesofcolor/7.html Contains Answer
As we can see... Google outperforms the other two, offering 3 sites that actually contain the answer in the top 4 results, two of which are the top two. CLEARLY providing better results on at least this topic than either Yahoo or MSN.
Anders
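The tally in the comment above can be reproduced mechanically. This is only a sketch: the True/False flags are transcribed from the "Contains Answer" marks in the listing, and the helper names `hits` and `first_hit_rank` are made up for illustration.

```python
# Relevance flags for the top 4 results of each engine, in rank order,
# transcribed from the listing above (True = marked "Contains Answer").
results = {
    "Yahoo!": [False, False, True, False],
    "MSN": [False, False, False, True],
    "Google": [True, True, False, True],
}


def hits(flags):
    """Number of results that were judged to contain the answer."""
    return sum(flags)


def first_hit_rank(flags):
    """1-based rank of the first result containing the answer, or None."""
    for rank, flag in enumerate(flags, start=1):
        if flag:
            return rank
    return None


for engine, flags in results.items():
    print(f"{engine}: {hits(flags)} of {len(flags)} relevant, "
          f"first answer at rank {first_hit_rank(flags)}")
```

A single query says very little on its own; the same tally over a larger set of queries would be more convincing.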
Re:Dont bother (Score:3, Insightful)
Sites like this come up all too frequently, even in google. What would be really sweet would be a way to EXCLUDE certain sites. Maybe it's already possible
Re:Dont bother (Score:4, Informative)
What would be really sweet would be a way to EXCLUDE certain sites.
For each site that you want to exclude, add a term along the lines of -site:overstock.com to your query.
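If you exclude the same sites often, the query string can be assembled for you. A minimal sketch, assuming only the -site: syntax mentioned above; the function name `exclude_sites` is hypothetical.

```python
def exclude_sites(query, sites):
    """Append one -site:DOMAIN term to the query for each domain to exclude."""
    terms = [query.strip()]
    terms.extend(f"-site:{site}" for site in sites)
    return " ".join(terms)


print(exclude_sites("why are flamingos pink", ["overstock.com", "shopping.com"]))
```

The resulting string is what you would paste into the search box.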
Re:Dont bother (Score:3, Interesting)
Yes, at least on that topic. But here's my favorite: Try searching for WHO IS THE WHO on Google and Yahoo. Google is nowhere near an answer on page one, but Yahoo's first result is The Who's home page.
I know that this probably is caused by algorithms and how the different search engines treat stop words, but still: It seems as if no one search engine is best at everything yet, although Google currently (probably)
Re:Dont bother (Score:3, Informative)
Googling for "The Who" gave me mostly relevant results.
Re:Dont bother (Score:3, Informative)
I merely tried to point out that in some cases -- such as this -- searches phrased as questions can return no relevant answers at all on Google.
Another thing: I may have composed my search in an idiotic fashion. But don't you think most people are idiot
Re:Dont bother (Score:3, Funny)
What a litigious world we live in where you have to sue your search engine (by default) when you don't get the results you want!
Re:Dont bother - why? Parallel to OS Wars (Score:5, Insightful)
Re:Dont bother (Score:3, Informative)
Or at least mentioned in the comments: vivisimo.com
Re:Dont bother (Score:5, Funny)
heh, I remember when we had to prepare our gopher searches on punch cards and wait days for machine time to run them, only to find that the research paper we thought we'd found was actually ascii porn with little popup jcl terminal windows selling "CHEEP A5PRIN" (because nobody had invented viagra!). And once your name got out there, your bitnet account would be so full of spam that you wouldn't even want to use your wyse terminal! But you know what? We were thankful for the opportunity to be on the Internet.
you kids today...
Re:Dont bother (Score:5, Funny)
Meh. You think that was bad? Why, I remember when we had to hardwire our Internet searches on plugboards and read the results off of a teletype. Let me tell you, it was pretty tough rendering a web page on a machine without any memory. And every now and then some joker would wire the AC from the wall into a board just for laughs. No, we didn't mind the odd electrocution - it was all part of the fun of the Golden Age of computing.
Back in those days, spam was SPAM, and it came in a can. And we liked it!
Damn kids are soft these days... (Score:3, Funny)
dogpile.com (Score:5, Informative)
Alternates (Score:5, Informative)
Yahoo search is okay, not as nice as google, but a good second.
Alltheweb.com has found things google hasn't, but in general I rarely use it.
I rarely use MSN because it was awful all the times I tried it. Same for Altavista.
In general, if I'm searching for something I'll use google first and then Yahoo and Alltheweb to catch anything that google may have missed.
Re:Alternates (Score:2, Interesting)
I know that's not true, but generally if what I'm looking for isn't in the first two or three results pages of Google, then I give up.
This has only happened to me a few times (not finding what I want with Google), however it does bring up an interesting point. I trust Google results so much, is it possible that all the search results can be misleading or wrong information?
Re:Alternates (Score:4, Informative)
Here is my alternative. It is called Copernic Agent [copernic.com]. It is a desktop application that searches multiple search engines and returns the results sorted by relevance. It will then let you further refine your search by searching against the actual pages in the result list. There is a free version that is crippleware. I bought the personal version and it was my favorite tool for searching job sites when I was unemployed.
Bizarre MSN search results (Score:2, Interesting)
My site has nothing to do with UTC or Flash. Turns out, it indexed my lame little archive page that displays article dates in UTC format. One of the article titles was something like "Flash Storm," so it indexed the "UTC" portion of the previous article's date and the word "Flash" that began the ne
astalavista.box.sk (Score:2)
Re:Alternates (Score:3, Interesting)
Alltheweb.com produces the same results as Yahoo search (basically ever since Yahoo merged with Overture). Yet you describe them as being distinct and with different qualities. You even will search on one after searching with the other.
Re:Alternates (Score:5, Informative)
I use Google for almost everything (Score:2, Informative)
Re:I use Google for almost everything (Score:2, Insightful)
Re:I use Google for almost everything (Score:2)
But the site is very cool and there's some great stuff on there making the wait worth it.
I quite like Google. (Score:2, Informative)
Re:I quite like Google. (Score:2)
I haven't found a good alternative to Google though...
Re:I quite like Google. (Score:3, Informative)
Not sure why you end up at different fr/dk/... domains though
Subjective (Score:4, Insightful)
Even by comparing keyword search side by side, one can still consider a worse result better, but who's to judge except the user?
I kept using Yahoo until it stopped giving me results that I thought were good enough, then I switched to Google, and I'll keep using Google until it stops returning good enough results.
Appalling (Score:5, Funny)
I have been browsing your internet site for several hours and am generally impressed with your coverage of IT related issues. However, when I saw an article on Google I just had to voice my opinion. I would just like to say how incredibly appalled I am with the Google internet search engine. My main concern with Google is how easy it makes it for malicious people to find information on the now illegal Bittorent computer software.
Some background information on Bittorent and what makes it so dangerous:
1. The Bittorent computer software allows distribution of copyrighted material.
2. In doing so it inadvertently causes excessive use of bandwidth. Now you might say that this is fairly harmless, but is it really? The effects of electromagnetic radiation pollution caused by this cannot be underestimated. Just think of the millions of wired and wireless connections lighting up and emitting those deadly electromagnetic rays and all the innocent men, women and children being exposed to them.
Every bittorent user has blood on his (or hers) hands. From this point on, I am boycotting Google and advise any person with a shred of decency to do so too.
Re:Appalling (Score:2, Insightful)
Re:Appalling (Score:5, Funny)
Re:Appalling (Score:3, Funny)
It took me a while to get that this was satire. But just in case it wasn't a funny satirical post but instead a trolling astroturfer, I'll explain it better for the overzealous Slashdotters out there who are going to rip on this guy without comprehension.
If we follow his warped logic we should boycott everything. For example:
Here is background on the trucking industry and why it is so very dangerous:
1. Trucks should be banned because they allow
Just stick with what works. (Score:5, Funny)
I ask my wife the same thing. Honey, I'm used to doing things your way.. and I always get acceptable results from you.. but what am I missing out on by not giving (for example) Veronica a shot?
At least Google will never make you sleep on the couch, or give them half of all your assets. Hopefully.
I tried others...but I never changed my home page (Score:5, Interesting)
Why not find out .... (Score:5, Funny)
Sarcastic answer (Score:2, Funny)
Re:Sarcastic answer (Score:5, Funny)
Re:Sarcastic answer (Score:2)
Try this.. (Score:5, Funny)
Re:Try this.. (Score:2)
What I use (Score:2)
If I am looking to buy something offline, I use yell.com.
If I am looking for software, I use something like freshmeat or one of the rpm search facilities.
Otherwise, I use Google.
Try Yahoo (Score:2, Interesting)
Precision and Recall (Score:5, Informative)
When you measure a search technology, the values you typically look for are precision and recall. Precision asks "of the X results you gave me, how many are relevant?" Recall asks "of the Y relevant pages that exist in the world, how many did you give me?"
You can't measure recall for a public search engine, but you can measure precision. Take a set of sample queries, and some users. Have them perform the queries, go through the first ~100 results, and give each a "thumbs up" (relevant) or "thumbs down" (not relevant).
Your overall score will measure precision: if at N=100 all 100 were relevant, that's 1.0. If only 50 were judged relevant, precision is 0.5.
You can estimate recall by judging say 1,000 documents (phew). Then sample precision at N=10, 100, 500, etc, assuming that is an "exhaustive" list of documents in the world.
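The arithmetic described above is short enough to sketch. Assumptions: each result's judgment is stored as True (thumbs up) or False (thumbs down) in rank order, the judged pool is treated as the complete set of relevant documents, and the function names and sample data are invented for illustration.

```python
def precision_at(judgments, n):
    """Fraction of the first n judged results that were marked relevant."""
    top = judgments[:n]
    return sum(top) / len(top) if top else 0.0


def pooled_recall_at(judgments, n, relevant_in_pool):
    """Relevant results in the first n, divided by all relevant documents
    in the judged pool (which is assumed to be exhaustive)."""
    if relevant_in_pool == 0:
        return 0.0
    return sum(judgments[:n]) / relevant_in_pool


# Made-up judgments for one query: True = thumbs up, False = thumbs down.
judgments = [True, True, False, True, False, False, True, False, False, False]
relevant_in_pool = sum(judgments)  # pretend these 10 documents are the world

for n in (1, 5, 10):
    print(n, precision_at(judgments, n),
          pooled_recall_at(judgments, n, relevant_in_pool))
```

Averaging these numbers over many queries and several judges gives one score per engine at each cutoff.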
A simple way to test recall (Score:2)
Re:A simple way to test recall (Score:2)
Needs to be more complex though (Score:3, Insightful)
1) HOW relevant is a page, and is that page more highly ranked? It doesn't do me any good to have 99 slightly relevant results and 1 highly relevant result, if that one is at the end. So you have to measure how relevant the page is, and how high it appears in the search, and weight that.
2) The ability to find the correct page. Sometimes it's not that you are looking for general information on a topic, there's a specific page yo
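Point 1 can be turned into a number by giving each result a relevance grade and discounting it by rank. The sketch below uses discounted cumulative gain (DCG), a standard information-retrieval measure; it is a substitution chosen for illustration, not something the comment specifies, and the grade scale (1 = slightly relevant, 3 = highly relevant) is an assumption.

```python
import math


def dcg(grades):
    """Discounted cumulative gain: the grade at rank r is divided by
    log2(r + 1), so relevant pages count for more near the top."""
    return sum(grade / math.log2(rank + 1)
               for rank, grade in enumerate(grades, start=1))


# The scenario from point 1: 99 slightly relevant results (grade 1) and
# one highly relevant result (grade 3), placed either last or first.
best_last = [1] * 99 + [3]
best_first = [3] + [1] * 99

print(dcg(best_last), dcg(best_first))
```

Both lists contain exactly the same pages, so plain precision cannot tell them apart, while the rank-discounted score is higher when the best page comes first.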
Re:Precision and Recall (Score:2)
Metacrawler.com (Score:3, Interesting)
Re:Metacrawler.com (Score:3, Interesting)
So I hit up metacrawler.com for "sendmail tips". Just for the heck of it.
Result #4: Tips on EBay, Find Tip items at low prices.
Result #5: ServSafe Alcohol (R) Training Program, Comprehensive interactive training for those who serve alcohol.
Erm, what the hell? Leaving aside the fact that these are sponsored links thinly disguised as real results, they seem to lack relevance somewhat.
Re:Metacrawler.com (Score:3, Funny)
That may just be telling you that sendmail can drive you to drink
Re:Metacrawler.com (Score:2)
Re:Metacrawler.com (Score:4, Funny)
Re:Metacrawler.com (Score:3, Interesting)
Since then, when I haven't found w
Teoma (Score:5, Interesting)
Presentation (Score:4, Interesting)
I hate to say it, but... (Score:4, Insightful)
Frankly, I think you're on the right track when you ask, "What am I missing out on by not giving Yahoo search a shot?"
Likewise, I think you're on the wrong track when you go on, "Or, more to the point, how would one go about trying to effectively and objectively compare competing search engines?"
Comparing the results of searches is necessarily subjective. Only that first question has a real answer.
RD
Re:I hate to say it, but... (Score:2)
If there isn't, how are you going to answer your question #1 -- gut feeling? By missing out - do you mean parts of the web? usability features?
As I said in an earlier post -- it's nearly impossible. But that doesn't mean you can't come up with a reusable metric to make an objective judgement.
Search Engine Watch (Score:2, Informative)
Any algorithm can be gamed (Score:3, Insightful)
Here's a nice comparison (Score:5, Interesting)
http://www.langreiter.com/exec/yahoo-vs-google.ht
Sorry if it gets slashdotted.
Re:Here's a nice comparison (Score:2)
hahahhaha (Score:2)
Listen to the Buzz (Score:3, Interesting)
You don't have to bother evaluating better web based technologies. When they are worth using others will tell you about them. It's the nature of the web.
For example, a professor of the university department in which I worked came back from Digital Research Labs, enthusing about a great new search algorithm the designers of Digital's Computer Aided Design software had come up with. A short time later Altavista was 'it'.
The same happened a few years later. The buzz from colleagues and those on the web was about a new search engine called Google.
The short answer is, "Don't go looking for the 'next search engine'. It will find you."
Wikipedia (Score:3, Insightful)
Punctuation (Score:5, Insightful)
I _used_ to go to altavista every time i had a search that involved specific punctuation, usually some kind of coding question. Now i just get frustrated with google while trying to find some related term i can add in that will give me the results i want.
Google Is Good Enough For Most (Score:2)
Yahoo and Altavista worked ok for me before Google came along, but the clean interface and good results drew me in. So, the only thing that would convince me to switch to a different search engine would be if Google started cluttering
Why Google works (Score:5, Insightful)
Re:Why Google works (Score:3, Informative)
2. MSN Search has no graphical ads.
3. MSN Search separates the paid results just as clearly as Google does.
So, when was the last time you looked at MSN Search? Last year?
Re:Why Google works (Score:2)
One way to test (Score:5, Insightful)
phentermine
home loans
poker
mesothelioma
viagra
miserable failure
Then look at the sites that rank at the top. It's very easy to tell which search engines are more susceptible to manipulation. A quick look at the backlinks for sites favorably ranking in those competitive keywords tells you how that SE is doing.
Here's my opinion on the race between Google, Yahoo & MSN. Google has more sites that are authorities in the top results, and Google penalizes over-optimization; however, extreme examples of over-optimization continue to show up in Google. Yahoo is a moderate success and does a fair job of filtering out spammy sites as well as authorities like wikipedia - wikipedia will always rise to the top in G but not in Y - and this is good for Y because you get more variety. MSN does an average job of filtering out blog spam, but new sites are too favorably ranked, and this is because MSN is new and has no recorded history of URLs. My personal preference is to use G simply because it loads the fastest in my browser... Maybe it's also worth pointing out that my company has several URLs ranked favorably in the terms listed above - looking at the change in rankings over time certainly helps give insight into which SE is better. MSN & Y are by far easier to manipulate than G, but G gives the most traffic.
search.yahoo.com (Score:5, Informative)
Lately my Google results have been so Google bombed that I've been going back and forth between the two. I can't say for sure yet, but I may be in the middle of a bit of a personal transition.
Depending on what you're searching for, Google is often so front-loaded with dead-end advertiser links that its results aren't really worth much. Although it has to be said, it depends what type of a search user you are, and what types of things you're looking for.
Google is still the king of advanced search.
Vertical Search (Score:2)
For example you could do your music searching on your iPod or stereo, your yellow pages searches on your mobile phone, your video searches on your pvr. Of course it makes sense to expose a web front end to these engines as well, but it
3 cheers for objectivity. (Score:5, Funny)
MSN's sandbox test searchpage (Score:4, Informative)
Too bad the search results aren't nearly as up to par as google's results (in my opinion)
http://start.com/1 [start.com]
Simple Method (Score:5, Informative)
I've stuck with Google for a while, but I used to do surveys pretty often. My approach was to start preparing a couple of days in advance, by keeping notes about things I was searching for. Then I'd take three or four of them, usually the ones that I'd had the most trouble refining, and try them out on a bunch of search engines. For each, I'd keep track of how many searches I had to do and how many junk pages I had to get through before I could get to something useful on that subject. It usually became clear pretty quickly which search engines were allowing me to make efficient use of my time and which were wasting my time.
Another thing you might want to do is check out some of the newer "clustering" or "concept map" search engines such as Vivisimo or Kartoo, to see whether they suit your searching style better. They're really quite different from the search engines we've gotten used to, so the metrics I just described don't quite work for them. That doesn't mean they're better or worse - just different.
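The bookkeeping for the survey described above (how many searches and how many junk pages it took to reach something useful) fits in a few lines. A rough sketch: the engine names and numbers are placeholders, not measurements, and `summarize` is a hypothetical helper.

```python
from statistics import mean


def summarize(trials):
    """Average searches and junk pages per engine.

    trials is a list of (engine, searches_needed, junk_pages_seen) tuples,
    one per test query per engine.
    """
    by_engine = {}
    for engine, searches, junk in trials:
        by_engine.setdefault(engine, []).append((searches, junk))
    return {
        engine: (mean(s for s, _ in rows), mean(j for _, j in rows))
        for engine, rows in by_engine.items()
    }


# Placeholder notes from a hypothetical survey of three saved queries.
trials = [
    ("engine_a", 2, 4), ("engine_a", 4, 6), ("engine_a", 3, 5),
    ("engine_b", 1, 0), ("engine_b", 2, 3), ("engine_b", 3, 3),
]

for engine, (searches, junk) in summarize(trials).items():
    print(f"{engine}: {searches} searches, {junk} junk pages on average")
```

Lower is better on both counts. As noted above, these metrics do not transfer well to the clustering engines.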
Yahoo seems lazy (Score:4, Interesting)
Also, Yahoo and MSN both seem extremely poor about figuring out the "right" url to link to. It's almost as if they index the first thing on any domain they come across, instead of trying to figure out where on the site most people link to, so you'll often find yourself deep-linked into a site where you'd prefer to be looking at a higher-level page to start. Google deeplinks too, but it seems to be only when it's really more relevant to the content.
I don't use a9 much, but it seems like google with a different skin. I swear sometimes they're snarfing google's results and storing them. Not that this is all bad, since Google's results tend to be some of the best, but it's still eerie.
Who Cares? (Score:2)
Who cares? Froogle and Google News (and for that matter, Gmail and Google Maps) are functional, and can be used today. Why would you let the word "beta" get in your way? Is there an unfulfilled promise?
basic benchmarks (Score:2)
Teoma used to be good... (Score:3, Informative)
Surprisingly, I still use Ask Jeeves (www.ask.com) for things - and find it finds things that Google has completely missed!
I guess you have to use a combination of several to really find everything you want - though Google by far is the best one.
I base it on bot/spider visits (Score:3, Informative)
What's weird I'm noticing is that I don't see anything from something like a Yahoo bot at http://klomdark.servebeer.com:443/analog/report.h
Google still leads, however. I wonder where Yahoo is getting its data, unless it's from a crawl previous to fall 2003, as I'm not tracking logs from that far back. Strange.
Search Engine Watch (Score:5, Informative)
For some time now, Search Engine Watch [searchenginewatch.com] has provided a good editorial and comparison on various search engines. They focus on marketing topics, but also tend to talk a lot about the underlying technology, etc.
A recent roundup of engines is at http://searchenginewatch.com/links/article.php/2156221 [searchenginewatch.com].
In the old days... (Score:3, Interesting)
Well a hell of a lot of those "old" search engines are still around! And they have become better over time. Google at one time was so much nicer than the others that people sort of got "lazy" and stopped browsing around the engines. But everyone else didn't just curl up and die.
So just start engine hopping again. Try Google first if you must, but then try Yahoo, search.msn, alltheweb or search.com or other meta search engines that search all the real search engines for you.
Multiple sources of info have always been and always will be better than one giant conglomerate of info such as Google is becoming.
My search engine interface project (Score:3, Interesting)
I've just started a (Java) project to interface to a number of search engines. It might be a good place to start if you feel like doing some coding. See https://argos.dev.java.net/ [java.net] - there is no release yet but the code is in CVS.
It currently supports Blogdigger, Feedster, Del.icio.us, Google, MSN and Yahoo (and Google Desktop search). I'd like to include Ask.com, too, but they don't provide a programmatic interface and I refuse to screen-scrape.
In my opinion none of the other search engines are close to Google in quality of results. I've found (to my surprise) that Ask.com gives me the second best results (they bought the old Teoma search engine, which was always okay. It had an index almost the size of Google's, which neither MSN nor Yahoo can match yet.)
Re:Questions (Score:3, Informative)
Google Answers [google.com]
Re:Questions (Score:5, Insightful)
The other day I needed to know, for obscure reasons, the number of heroin addicts in Dublin. This is the kind of info that you know is probably on the web, but is going to be hard to find with Google.
I used BrainBoost - "How many heroin addicts are there in Dublin?" [brainboost.com], and, bam, first line of the result - "There are 13,000 heroin addicts in Dublin."
That's damn impressive. Out of curiosity I tried to see if I could find the same info with Google - it was fairly tough. Took three or four searches, eventually resorting to
which is a fairly specialized search that average users probably wouldn't be able to construct. The BrainBoost search, on the other hand, was completely natural; my grandma could have done it. So, thumbs up for BrainBoost for question answering.
Still, it's not the kind of thing you'll want every day. For day-to-day search, Google is the tool, but BB is worth a look.
Re:Questions (Score:3, Informative)
Engines of that kind are indeed nice. Still, they have their own oddities. For laughs, I tried to ask the system whether the moon is made of cheese.
It turns out that the moon is indeed made of cheese!
"is moon made of cheese?"
"The Moon is Made of Cheese"
I guess it will still take some time before search engines of that kind become more popular than the traditional ones.
Re:Questions (Score:3, Informative)
I have actually found searching for a plain English question to work in a number of other instances, as well.
Re:Questions (Score:2)
I don't think so. Google Answers a) costs money, and b) queries humans instead of a database (and that takes time).
AskJeeves is free and quickly queries a database.
What Google Answers provides should be more accurate, since humans can determine what you mean better; however, pay-per-"search" would outprice simple questions like "Who wrote House of Leaves?" AND make them pointless to ask since, by the time you received an answer, it probably wouldn't be pertinent. Google Answer
Re:Try them out yourself (Score:2, Funny)
Re:Try them out yourself (Score:2)
dumb joke, get over it (Score:2, Funny)
Re:other search engines (Score:3, Interesting)
Re:Clusty.com (Score:2)
FYI -- Clusty is owned by the folks that created vivisimo [vivisimo.com].
Re:Major reason Yahoo is better... (Score:3, Informative)
This is absolute rubbish. Google DOES crawl dynamic pages quite happily. It's crawled all of my sites with no problem.
Neither (no) search engine crawls dynamic sites where there are no links to the dynamic content (eg where you HAVE to search using keywords to find the content) but Google and Yahoo are happy to index any dynamic page which is directly linked to even if it has lots of parameters in the URL. Google has indexed 15000 dynamic pages on a directory site of mine quite happily.