Java Programming

TPC-C Benchmarks For JDBC?

woggo asks: "I need to benchmark two different JDBC drivers for a research project and would like to use a standard benchmark. I was able to find this implementation of TPC-W, but that is too much of a test of the Web server to be useful for my purposes. Does anyone know of a freely-available Java implementation of TPC-C? It needs to be reasonably conformant and I need to be able to cite the results in a paper without violating a license agreement, which would seem to exclude evaluation versions of products."
This discussion has been archived. No new comments can be posted.

  • JMeter [apache.org] has preliminary support for testing database drivers; I added it myself. It isn't TPC-C, but it's free and is actively being developed (in theory). (Of course, it's written in Java, so this post won't get modded up. Why are the Slashdot moderators so anti-Java? You'd think Microsoft created it.)
  • by Socializing Agent ( 262655 ) on Monday December 18, 2000 @09:06PM (#549710)
    OpenLink Software has released JBench, their TPC-A and TPC-C implementation under the GPL. The catch? You have to fill out a form to download it as part of their (non-free) Virtuoso db suite.

    Check here [openlinksw.com] for more info.

  • by MemRaven ( 39601 ) <kirkNO@SPAMkirkwylie.com> on Monday December 18, 2000 @10:05PM (#549711)
    First, what precisely are you trying to benchmark? Just saying you're trying to benchmark a driver isn't very specific, and won't get you far. Are you trying to measure total query throughput? Total user/query concurrency? Transactional performance? Time to First Row (i.e., query latency)? Time to Commit (i.e., transactional latency)? Under what workloads? Highly normalized relational data? Star/snowflake schemas? Nothing but updates? Queries? Insertions? 2-way joins? 10-way joins? If you don't know what you're testing for, you're not going to do a very good job of testing it. Figure out what your benchmark needs to measure, and then find one which matches that; don't just go looking for a benchmark.

    The TPC is quite correct in stating that a benchmark is not about one particular component: you can't test components in complete isolation. You must look at how a whole system performs under a specific workload, so it's actually pretty difficult to measure one component on its own. If you're testing just the driver, you need to compare the drivers against the same workload, same schema, same data, same database. You can't meaningfully compare the WebLogic Type 4 driver for Sybase against the open-source Postgres Type 2 driver. And since different databases are optimized for different things, you'll need to bear that in mind (you probably know this already, but it's important when constructing a benchmark for your application).

    Next is the issue of actually publishing benchmark results. First of all, you cannot publish a full TPC benchmark result without it being audited. You can use their data and their queries, but you cannot publish the comprehensive metrics (such as QphD or QthM) because the TPC does not permit it. Look at how academic research papers in the database field do it: if they're measuring query performance, they give you comparative numbers on the various queries, and maybe combine them into some artificial figure, but they don't attempt to publish an official result. If you're looking at the TPC benchmarks, that's a very important consideration.

    Now you get to the issue of publishing a result on ANY database. Not going to happen. Database vendors' license agreements generally forbid publishing benchmarks of their software under any circumstances, even for academic purposes. The only people who get around that are those working for the vendors themselves, such as the IBM Almaden research people. In short, you can't publish the results very easily, so have your advisor look into what you can use and/or publish. You might be stuck with Postgres or MySQL only.

    And finally, there's the issue of what TPC-C is. Nobody does TPC-C in Java, and nobody does it with JDBC. Have you ever read a full disclosure report for an audited run? Not just the executive summary, the FULL disclosure report. They don't use Java. Hell, they don't even use SQL. They code directly to a low-level, completely native API, entirely in C. Here's a big insight the TPC and the RDBMS vendors will never tell you: TPC-C is designed to indicate how quickly an RDBMS can act just like a mainframe. That isn't very fast at all, so they have to "cheat" by taking SQL out of the picture. Read their client code: it's all native C library calls.

    So with this we come to my recommendation: Use the TPC-W benchmark, for the following reasons:

    That's what people use JDBC for. JDBC isn't used for TPC-C or TPC-H or TPC-Q workloads. It's used for web workloads. Why not try the benchmark used to simulate web workloads with a technology designed for web pages?

    You can always play with the parameters. Don't like the reliance on static pages? Take them out entirely. Talk directly to the middle tier. Or reduce them in percentages to the point where they don't factor highly in the mix. If you can't get this tuned to where you see a performance difference, then the results are simple: JDBC performance doesn't matter. That's a pretty good result in itself.

    You can also just take the bits out you care about. Are you willing to munge your own source code? Then extract the database code from TPC-W and run that.

    You might also want to look at AS3AP. It's good for measuring raw concurrency, but it's a completely artificial benchmark with no relation to the real world.

    Much like any other Ask Slashdot, we haven't been given enough information to give you a truly definitive answer, but hopefully this helps. If you care, email me for more information.

    Kirk Wylie
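
    The metrics Kirk distinguishes above (total fetch time vs. time-to-first-row) can be sketched as a tiny measurement harness. This is a hypothetical illustration, not part of any TPC kit: the query is stubbed as a Supplier of an Iterator so the sketch runs without a live database, but the same stopwatch placement applies around a real Statement.executeQuery() and the ResultSet.next() loop.

    ```java
    import java.util.Iterator;
    import java.util.List;
    import java.util.function.Supplier;

    // Hypothetical micro-harness separating two of the metrics discussed
    // above: time-to-first-row (query latency) vs. total fetch time.
    // The query is stubbed so the sketch runs standalone; with real JDBC
    // you would time Statement.executeQuery() and ResultSet.next() the
    // same way.
    public class QueryTimer {
        /** Returns {nanosToFirstRow, nanosTotal, rowCount}. */
        public static long[] time(Supplier<Iterator<String>> query) {
            long start = System.nanoTime();
            Iterator<String> rows = query.get();   // executeQuery() equivalent
            long firstRow = 0;
            long count = 0;
            while (rows.hasNext()) {
                rows.next();                        // ResultSet.next() equivalent
                if (count++ == 0) {
                    firstRow = System.nanoTime() - start;  // latency
                }
            }
            long total = System.nanoTime() - start;        // total fetch time
            return new long[] { firstRow, total, count };
        }

        public static void main(String[] args) {
            long[] r = time(() -> List.of("row1", "row2", "row3").iterator());
            System.out.println("rows fetched: " + r[2]);
        }
    }
    ```

    A driver that streams rows lazily can have excellent time-to-first-row and poor total fetch time, or vice versa, which is exactly why the two numbers need to be reported separately.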

  • UCITA Reporter Ray Nimmer complained of "distortions" in the debate on UCITA, identifying as a "misrepresentation" the claim "that UCITA allows licensors to prevent licensees from commenting about the products." [111] He said: "This allegation makes nice copy and superficial impact, but is simply untrue. You can scroll through the UCITA draft and will not find any such provision."

    that was a close call.

    :::

  • Kirk--

    Thanks for your thoughts. The reason I'm "benchmarking drivers" is that one driver, which I am developing, has a client-side relation cache and delegates to the native driver on cache misses. I don't think my question was inexact: I asked for a TPC-C implementation in Java, not for "a db benchmark in Java". I asked for one specific thing, not for a recommendation of a type of thing.

    The plan is to compare "apples to apples" by running my driver on top of a native driver against the native driver alone under a simulated "real" workload, using some benchmark that most people in the field will have a frame of reference for. Since I don't have a $50k computer at my immediate disposal, I'm not going to be publishing numbers that are directly comparable to the published TPC results, but the ratio between the two configurations should be meaningful, and the fact that it is TPC-C should tell readers what sort of test it is.

    Next is the issue of actually publishing benchmark results. First of all, you cannot publish a full TPC benchmark without it being audited. You can use their data, use their queries, but you cannot publish comprehensive numbers (such as QphD or QthM) because it's not permitted by the TPC. Look at how academic research papers in the database field do it.
    I'm not interested in publishing "TPC results" for the purpose of selling hardware or software. I'm interested in saying "On this TPC-C-like workload, the driver with cache performed n times better than the driver without."
    Now you get to the issue of publishing a result on ANY database. Not going to happen. Database vendors don't allow anybody to publish any benchmarks of their software under any circumstances, not even for academic reasons. The only people who get around that are people working for the companies, such as the IBM Almaden research people. You can't publish the results very easily, and have your advisor look into what you can use and/or publish.
    Well, that's not entirely true. (I am at Wisconsin, the "inspiration" for the benchmark clause since 1983....) Although I was planning to test cached Postgres vs. uncached Postgres, DB2 does allow one to publish benchmarks. And if you want to publish relative benchmarks for a system with different parameters/drivers/etc. (as many papers have done), it is pretty straightforward to do it legally: just compare "DBMS A" to "DBMS B". That's what the papers I've read lately have done. However, Oracle-to-Sybase comparisons are not what I'm interested in, as I said.

    Thanks for your help, and I'm sorry I wasn't clearer; I should have specified that I'm interested in comparing total throughput of two drivers on the same db. I may wind up using only the db code from TPC-W, but in that case I think it would be smarter to just write my own benchmark, since "the db code from TPC-W" would have little more credibility/recognizability than "this benchmark with the following constraints...."

    cheers,
    wb
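
    The design wb describes (a driver with a client-side relation cache that delegates misses to the native driver) can be sketched as a delegating wrapper. Everything here is hypothetical illustration: the native driver is stubbed as a Function so the sketch runs standalone, whereas a real implementation would wrap java.sql.Connection and Statement and also handle invalidation on writes.

    ```java
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.function.Function;

    // Hypothetical sketch of the design described above: a caching layer
    // that answers repeated reads from a client-side cache and delegates
    // misses to the native driver. The native driver is stubbed as a
    // Function so the sketch runs without a real JDBC backend.
    public class CachingDriver {
        private final Function<String, List<String>> nativeDriver;
        private final Map<String, List<String>> cache = new HashMap<>();
        private int misses = 0;

        public CachingDriver(Function<String, List<String>> nativeDriver) {
            this.nativeDriver = nativeDriver;
        }

        public List<String> executeQuery(String sql) {
            // Hit: answered from the client-side cache, no round trip.
            // Miss: delegate to the native driver and remember the rows.
            return cache.computeIfAbsent(sql, q -> {
                misses++;
                return nativeDriver.apply(q);
            });
        }

        public int misses() {
            return misses;
        }

        public static void main(String[] args) {
            CachingDriver d = new CachingDriver(sql -> List.of("row for: " + sql));
            d.executeQuery("SELECT * FROM item WHERE id = 1");
            d.executeQuery("SELECT * FROM item WHERE id = 1"); // served from cache
            System.out.println("native-driver calls: " + d.misses());
        }
    }
    ```

    The apples-to-apples comparison wb proposes then amounts to running the same workload once through the wrapper and once directly against the native driver, and reporting only the ratio of the two throughputs rather than absolute TPC-style numbers.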
