Truth in Advertising?
PerformanceEng wonders: "I work as an engineer for a large technology company in the U.S., and have been privy to what I find an interesting practice. It's well known that marketing data sheets often paint the best picture of a product while leaving the devil in the details. I've come to expect this, and when I am evaluating technology, I always have a skeptic's eye for claims made by the sales and marketing folks.
However, I've also witnessed our product go into test labs (usually for the purpose of running a series of tests for a 'bake off' in a trade publication). It's not uncommon to 'tune' the configuration of the device under test so it performs in the best light (not unlike tuning your car to pass emissions tests). I have seen it go as far as exploiting weaknesses in the test in ways that, if discovered by the test operator, would be considered bad faith. To the other engineers: Are you aware of this kind of practice at your company? To the IT professionals: How much faith do you put in these sorts of publications and their 'bake offs'? To everyone: When does spin doctoring cross the line and become false advertising?"
Video drivers (Score:4, Informative)
ATI's 'Quake' optimization. (Score:4, Informative)
Red-Handed, Red-Faced, Red Alert (Score:3, Informative)
Developer Quote Of The Week: "What we do is, given a benchmark, we try to do as well as we can on it, and make sure that our system is the fastest benchmark -- I mean, fastest system -- in the world." -- Brian Croll, Sun Microsystems' director of marketing for Solaris
Two weeks ago, Sun Microsystems got caught with its hand in the benchmarking cookie jar. Or did it? Depending on your point of view, Sun either grossly misrepresented the performance of its Solaris Java just-in-time compiler by fooling Pendragon Software's CaffeineMark performance test, or Sun proved the CaffeineMark is not an acceptable measure of Java compiler performance.
For those who may have missed it, here's the background: In a Nov. 4 press release, Ivan Phillips, president of Pendragon Software, in Libertyville, Ill., a developer of software for personal digital assistants, accused Sun of engineering its new Java compiler to trick the CaffeineMark into reporting higher performance results.
When Sun's compiler detected a block of 600 bytecodes unique to the CaffeineMark (a technique known as pattern matching), the compiler bypassed data processing and instead returned a value the benchmark expected. This fooled the test into reporting performance results 300 times faster than the compiler would deliver in real-world use. Third-party developers subsequently validated Phillips' assertion. Interestingly, when Pendragon's engineers altered the test to appear different to Sun's compiler, the compiler's shortcut no longer triggered, and its performance plummeted. Java compilers under Windows 95, Windows NT, and the Mac OS delivered uniform results under both the original and altered tests.
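For readers who haven't seen this kind of trick up close, here is a minimal sketch of the general technique the article describes. Everything in it is invented for illustration: the fake workload, the fingerprint, the canned score, and the function names do not come from Sun's compiler or the CaffeineMark.

```python
import hashlib

# Hypothetical sketch: a runtime that fingerprints a known benchmark
# workload and returns a precomputed answer instead of doing the work.
KNOWN_BENCHMARK_DIGEST = hashlib.sha256(b"caffeinemark-like-loop").hexdigest()
CANNED_SCORE = 30000  # canned value the benchmark expects

def run_workload(bytecodes: bytes) -> int:
    """Execute a workload -- unless it matches a known benchmark."""
    if hashlib.sha256(bytecodes).hexdigest() == KNOWN_BENCHMARK_DIGEST:
        # Pattern matched: skip execution and return the expected value.
        return CANNED_SCORE
    # Honest path: actually do the (slow) computation.
    return sum(bytecodes) % 65536

print(run_workload(b"caffeinemark-like-loop"))   # hits the shortcut
print(run_workload(b"caffeinemark-like-loop!"))  # one byte changed: honest path
```

Note how fragile the shortcut is: change a single byte of the workload, exactly as Pendragon's engineers did when they altered the test, and the pattern match fails, exposing the honest (slower) path.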
Sun officials initially admitted no wrongdoing, and were quick to point out that optimizing software to improve benchmark scores is an accepted practice among computer technology vendors. "People are optimizing against the benchmark," says Brian Croll, Sun's director of marketing for Solaris.
Further, Croll maintained that the aberrant results indicate a fundamental flaw in Pendragon's benchmark suite, and do not represent any impropriety by Sun. "I don't know how valid the [CaffeineMark] is," Croll said. Then last week, during a day-long media briefing at Sun's Mountain View, Calif., headquarters, Sun officials updated their explanation of events. SunSoft president Janpieter Scheerder said the company was not trying "to do anything malicious;" rather, Sun engineers simply "optimized too much."
A Sun spokesperson at the event blamed the incident on human error, and said an engineering prototype somehow found its way through Sun's rigorous (you would think) development and quality assurance processes, and onto the Web, with documentation and an overblown press release in tow.
What if Pendragon officials had not discovered Sun's alleged trickery? What if Sun engineers had tweaked their compiler to improve its score only 10-fold, instead of the eye-popping 300-fold increase that alerted Pendragon officials?
Sun's PR machine had already posted a press release, in which they touted their "new Web-enhanced Solaris operating environment" as delivering "the world's fastest Java technology performance." The release also claimed Solaris' compiler was 50% faster than the best Windows NT score, and cited the CaffeineMark as proof.
If Pendragon officials had not discovered the ruse, Sun's formidable sales and marketing machine would now be steam-rolling press and IT decision-makers alike, trumpeting Solaris' performance advantage over Microsoft's Windows NT and waving Sun's illicitly obtained CaffeineMark results as evidence.
"Any benchmark, no matter what its original purpose, is subject to use as 'benchmarketing,'" says Larry Gray, board member of the Standard Performance Evaluation Corp. (SPEC), in Manassas, Va., a consortium that administers many well- known benchmarks. "I'd guess may
Are you kidding me? (Score:2, Informative)
This is a different situation. (Score:4, Informative)
Tuning can have a dramatic difference in performance, and unless you're familiar with all of the products involved, it's impossible to get the best performance out of each one.
The original poster is talking about cases where one of the systems has been modified so it is not a default install, and specifically customized before being sent to the tester, so that it performs better (like with ATI's Quake 'optimization' [tech-report.com]).
As another example, there were some folks trying to get higher rankings in SETI@home [zdnet.com.au] who would return bogus results -- as that was faster than actually performing the calculations. If someone knows that the results won't (or can't) be checked for accuracy, only for time, they can boost their rankings dramatically.
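The usual countermeasure is spot-checking: redundantly assign some work units to a trusted reference and compare answers. Here's a minimal sketch of that idea; the client functions, the sample rate, and the squaring "computation" are all invented stand-ins, not SETI@home's actual protocol.

```python
import random

def honest_client(work_unit: int) -> int:
    return work_unit * work_unit  # stand-in for the real computation

def cheating_client(work_unit: int) -> int:
    return 0  # returns instantly with a bogus answer

def spot_check(client, reference_client, work_units, sample_rate=0.5, seed=1):
    """Recompute a random sample of results; return False if any mismatch."""
    rng = random.Random(seed)
    for wu in work_units:
        if rng.random() < sample_rate:
            if client(wu) != reference_client(wu):
                return False  # bogus result caught
    return True

print(spot_check(honest_client, honest_client, range(100)))    # True
print(spot_check(cheating_client, honest_client, range(100)))  # False
```

Even a modest sampling rate makes wholesale result-faking risky, which is why later distributed-computing projects routinely issued the same work unit to multiple clients and compared the answers.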
Sun Called on Java Claims (Score:3, Informative)
Tweaking Java test?: Sun Microsystems has been accused of manipulating Java benchmark software and using the results to state that its Solaris "runs Java applications 50 percent faster than Windows NT." Pendragon Software, maker of the benchmark software CaffeineMark, has put out a press release that claims Sun found a way to cheat on the benchmark tests, and then advertised the bogus scores. Sun has since removed the Java compiler from its download page, Pendragon says, but the original press release remains on the Sun site.
Sun admits Java testing error [com.com]
Sun Microsystems (SUNW) today conceded errors in the results of recent tests involving its Java programming language.
The company erred in not admitting that it matched code from a Java benchmark tool for one of its own Java compilers, Sun Software president Janpieter Scheerder said today. A benchmark is a battery of tests that measures the speed and performance of software running in various configurations.
Kicking off the "Inside Sun Software Day," Scheerder began his remarks with a mea culpa for Sun's actions, revealed last week in a report by CNET's NEWS.COM. At that time, Pendragon Software, maker of the CaffeineMark Java benchmark test, accused Sun of taking code from the CaffeineMark software and adding it to a beta version of the Solaris 2.6 Java just-in-time compiler. Pendragon is one of several developers that have created Java benchmarks.
Last week, Brian Croll--director of product marketing for Solaris, Sun's flavor of the Unix operating system--denied that Sun lifted the code. Today, however, Scheerder made it clear that Sun had made a big mistake.
"Nobody was trying to do anything malicious," Scheerder said. "We just optimized [the Solaris Java compiler] too much."
A Sun public relations manager called the episode a "big-time organizational breakdown" in which an engineering prototype that was never meant to go public was posted on the Web with all attendant documentation, along with a press release that touted the software's performance. Sun has also posted an explanation on its Web site.
"Sun committed an unintentional error when we published Java performance numbers for an engineering prototype that included code that specifically looked for a piece of code in the Caffeinemark 3.0 benchmark," according to a company statement.
In a release dated October 20, Sun bragged that, according to the CaffeineMark 3.0 test, Solaris 2.6 ran Java applications 50 percent faster than Windows NT. But it neglected to say that it had set the compiler to look specifically for a chunk of code from CaffeineMark. Looking for such a large chunk of specific code risks diverting too much of the compiler's resources, resulting in lower performance once the compiler is deployed in the real world, said Ivan Phillips, president of Pendragon.
After taking issue with Sun's test results, Phillips said he asked Sun to retract its claims and remove the compiler from its Web site. As of last week, Sun had not retracted its claims, so Phillips went public with his accusations.
Scheerder stressed today that the compiler, which was part of the Solaris 2.6 Java Development Kit 1.1.4 beta, was not shipping product. The company pulled it from its Web site soon after Phillips contacted them last month.
The news comes four days before the International Organization for Standardization (ISO) decides if Sun is qualified to be the official submitter of Java technology if and when Java becomes an international standard.
The official submitter has the responsibility to gather industry consensus and present it to the ISO's technical committee for consideration. There is some concern that Sun, which owns Java, might not be a neutral submitter. So far, 11 countries have voted yes on Sun's bid and one country--the United States--has voted no. A total of 27 countries are scheduled to vote by Friday.
I worked for HP.... (Score:3, Informative)
Advertising Claims (Score:2, Informative)
The upshot of all of this is that when it comes to it, a prospective customer will usually say "prove it" and, well, you have to. I for one took great pride in being part of the tech/development/demonstration team, in that I had a say on what went into the sales literature, as I'd often be the one proving it...
Needless to say, as it was MY arse on the line, I managed to complete demonstrations without any screw-ups.
These kinds of "white papers" (Score:2, Informative)
I don't blame companies for acting this way, as it is a sales force's job to sell. I just ignore all of these white papers. I do however pay a great deal of attention to what companies like Gartner say about various products. They are paid by us (the consumers) as opposed to the producer and are not quite as susceptible to false analysis.
Re:Consumer audio (Score:3, Informative)
It sounds too insane to be true. I almost dismissed the entire site as being an elaborate hoax, but searching for "magnan cable" on Google produces so many hits in apparently serious places, I can only conclude it is real. Unless the whole high-end audio world is having a laugh at our expense.
Always Read the Forums! (Score:3, Informative)
Re:Consumer Reports pays cash (Score:5, Informative)
As a hypothetical, let's say that CR judges crash-worthiness of a car using a 35 mph head-on collision test. Car manufacturers which know this are going to optimize the structural integrity of the car to hold up well under this test at the expense of other types of crashes (side impact crashes, say). Another car may not perform as well in the head-on test, but it may be safer over an entire universe of possible crashes. However, because it is not optimized for the CR crash test, it won't get as high a rating.
Lest you think I am pulling stuff out of my butt, this situation actually occurred with respect to the Insurance Institute for Highway Safety. Up until a few years ago, cars were generally crash tested using the head-on methodology. However, the IIHS decided to start using an offset crash methodology, since it was more likely to occur in real life. They found the results from the offset crashes did not necessarily match the results from the head-on crashes. Cars that did well in the head-on tests did not do as well in the offset crash tests. Obviously manufacturers had optimized crash-worthiness for the test and not for overall safety.
So where does the blame lie? I would say it lies both with the testers and the manufacturers. The testers are to blame for coming up with a test that doesn't necessarily reflect real life. Meanwhile car makers are to blame for designing products to "beat the test" rather than to be safe overall.
I think the same is true in the case of the original poster. His company isn't doing anything illegal; if the tests can be beaten so easily, then what good are they? In fact, one could argue that his company is helping in the sense that they are revealing the test's shortcoming. However, I find it hard to believe that their underlying motives are altruistic. I would guess that their motivation for tweaking their system is to beat the test for their own gain, and not for some higher moral purpose. So in a sense they are violating the spirit of the competition, in my opinion, even if what they are doing isn't wrong in the legal sense.
Re:Consumer Reports pays cash (Score:4, Informative)
Until he retired, my uncle was head of their paint testing laboratory, and this is exactly what he did. He would, for example, test a paint's opacity by applying a coat directly to an unprimed test pattern. He used to drive the paint companies nuts -- but when he said a paint will cover in a single coat that's exactly what a consumer could expect.
Re:ATI's 'Quake' optimization. (Score:3, Informative)
The tone of the article almost has an edge of "I can't believe we do this in our industry, I feel so dirty!" to it. The poster of the story is obviously some kind of new college hire, or hasn't been in the industry for very long, or something. All vendors do this, all the time. It's just the way it is.
Anonymously, because ... (Score:1, Informative)
For decades, plastics manufacturers sent prepared -- unrepresentative -- samples in for testing.
Once they had certification, they used the more flammable, cheaper, more malleable plastic in products.
Glad you asked. Can't tell you how I learned that, but it was a long time back; I think the practice stopped a decade ago. I hope it did.
Still plenty of stuff out there too dangerous to use:
State Fire Marshals -- Products Too Unsafe for Use in the Home
http://www.firemarshals.org/issues/home/docs/In
Re:"Digital Ready" headphones -- for digital ears? (Score:1, Informative)
Lots of people don't seem to know this and keep spouting the "world is analog" myth.
And yeah, the high-end audio world is full of crap. Unquestionably. There is no scientific proof whatsoever for any of their outrageous claims.
Re:"Digital Ready" headphones -- for digital ears? (Score:2, Informative)
These amps generate a high-frequency digital signal, which is pulse width modulated. Ordinary transducers cannot reproduce individual pulses due to their inertia, but they do get "nudged" a little by each pulse. In effect, the transducer averages the signal out, converting the high-frequency digital signal into a low-frequency analog one -- the audio.
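The averaging described above is easy to see numerically. A minimal sketch, with invented values (a 100-step period and a 30% duty cycle chosen purely for illustration): averaging a PWM pulse train recovers the analog level encoded by its duty cycle.

```python
def pwm_period(duty: float, steps: int = 100) -> list:
    """One PWM period as 0/1 samples with the given duty cycle."""
    high = round(duty * steps)
    return [1] * high + [0] * (steps - high)

def average(samples: list) -> float:
    """Crude stand-in for the transducer's mechanical low-pass filtering."""
    return sum(samples) / len(samples)

# A 30% duty-cycle pulse train averages out to an analog level of 0.30.
signal = pwm_period(0.30) * 10  # ten periods of the pulse train
print(average(signal))  # 0.3
```

A real speaker cone does this filtering mechanically and continuously rather than as a block average, but the principle is the same: the pulse density, not the individual pulses, is what you hear.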