Open, Web-Based OLAP Clients? 125

Zoloft asks: "I'm looking for a web-based OLAP client; something with lots of nifty features that PHBs find appealing. The normal available offerings are proprietary, expensive, and closed, closed, closed. Open-source would be nice but not required. Money is not an issue. What's important to me is: it's UNIX native - not NT native and ported with one of those bloated NT-to-UNIX layers - and its data formats are *open*, *readable* and *programmable* - whether it sits in files or a database. I have been beaten down for 10 months with this product that was forced upon me, which shall remain nameless of course. The only way to "develop" a custom app was through its piggish graphical front-end binary, obfuscated file formats and no programming or scripting hooks. I could go on and on, but you know the deal."
This discussion has been archived. No new comments can be posted.

  • by Anonymous Coward
    I've got the same problem. How would Cognos Power Play fit this problem? Thx.
  • by Anonymous Coward
    Comes to mind if you are talking about On Line Analytical Processing. But it seems to have disappeared from the web site. Or were you asking a different question? -- cary
  • by Anonymous Coward
    It used to be available for Linux for $99. Is it no longer available?
  • by Anonymous Coward
    Check out Hyperion Essbase. It has server for Unix/Solaris/AIX/NT plus web extensions (you buy them extra :/). What else to say... it works.
  • by Anonymous Coward
    Pretty well. And it is open source. It doesn't
    yet have a database link (wanna write one?), but
    it does deal with some standard file formats
    like NetCDF. The control syntax is C-like, but
    the matrix syntax is - um - concise. But what
    do you expect from a product that will multiply
    a 10 x 3 x 6 x 4 matrix by a 3 x 6 matrix and
    give you (is this right?) a 10x4 result?

    Plus it can do graphs.

    You could set up a persistent Yorick process
    with a PHP or Mod-Perl web front end, suck
    in the data once and slice and dice it however
    you want. Plus it has an MPI interface so
    you can sling operations out across multiple
    processors.

    You can read more about yorick here

    http://www.linuxgazette.com/issue26/obrien.html


    -- cary
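The shape question in the comment above ("is this right?") can be checked without Yorick. The sketch below is plain Python, not Yorick syntax, and it assumes the multiply contracts both shared axes (lengths 3 and 6) in one go; the function name `contract` and the all-ones test data are made up for illustration.

```python
from itertools import product

def contract(a, b):
    """Contract a[i][j][k][l] with b[j][k] over j and k, giving c[i][l]."""
    n_i, n_j, n_k, n_l = len(a), len(a[0]), len(a[0][0]), len(a[0][0][0])
    if (len(b), len(b[0])) != (n_j, n_k):
        raise ValueError("the shared axes must have matching lengths")
    return [
        [
            sum(a[i][j][k][l] * b[j][k]
                for j, k in product(range(n_j), range(n_k)))
            for l in range(n_l)
        ]
        for i in range(n_i)
    ]

# A 10 x 3 x 6 x 4 array and a 3 x 6 array, both filled with ones.
a = [[[[1] * 4 for _ in range(6)] for _ in range(3)] for _ in range(10)]
b = [[1] * 6 for _ in range(3)]

c = contract(a, b)
shape = (len(c), len(c[0]))
print(shape)  # the two shared axes vanish, leaving 10 x 4
```

Under that assumption the answer is yes: the 3 and 6 axes are summed away and a 10 x 4 result is left.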
  • by Anonymous Coward
    I've often thought that if I ever wrote ONE piece of open source software, OLAP would be it.

    I Love Multidimensional Databases... ESSBASE and Microsoft Plato being the ones I know best.

    I'll tell you about the database I worked on, it was for a County government, the dimensions were Theme, Program, Package, Job Class, Department, Year, Account, and Version.

    So, if you wanted to know how much the County Hospital Nurses planned to spend on Pencils in '97, the database knew.

    It pulled data from MS Access, Oracle, SAP, IBM DB/2, and old Prime minicomputers, piled it all into a 2 gigabyte database, and allowed real-time browsing and editing.

    2 gigs is what the consultants set it up to be, I optimized it down to 500 megs, which, since the server only had 200 megs of RAM, made queries go 10 times faster. They liked me there :)

    Interestingly, if I extracted the 'level zero' information, and the database schema, and the attached Access database, I had everything needed to reconstruct the entire 2 gig database... those files, pkzipped, fit onto two 1.44 meg floppies.

    Basically, an OLAP database pre-calculates every possible combination of the data, and stores it for fast (on-line) retrieval.
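A minimal sketch of that pre-calculation, assuming a simple sum as the only measure. The dimension names echo the county budget example, but the rows, the amounts, and the `ALL` marker are invented for illustration; real OLAP servers are far more selective about which combinations they store.

```python
from collections import defaultdict
from itertools import product

ALL = "*"  # hypothetical marker meaning "rolled up over this dimension"

def build_cube(rows, dims, measure):
    """Pre-aggregate the measure for every combination of dimension values."""
    cube = defaultdict(float)
    for row in rows:
        # Each dimension contributes either its own value or the ALL marker,
        # so one level-zero row feeds 2 ** len(dims) cells.
        choices = [(row[d], ALL) for d in dims]
        for key in product(*choices):
            cube[key] += row[measure]
    return dict(cube)

# Made-up level-zero data.
rows = [
    {"department": "Hospital", "account": "Pencils", "year": 1997, "amount": 120.0},
    {"department": "Hospital", "account": "Paper", "year": 1997, "amount": 80.0},
    {"department": "Sheriff", "account": "Pencils", "year": 1997, "amount": 40.0},
]
cube = build_cube(rows, ["department", "account", "year"], "amount")

# Every question is now a single dictionary lookup.
print(cube[("Hospital", "Pencils", 1997)])
print(cube[(ALL, "Pencils", ALL)])
print(cube[(ALL, ALL, ALL)])
```

This also shows why the level-zero extract is so much smaller than the full database: everything else in the cube is derived from it.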
  • by Zoloft ( 218 )
    A lot of great responses and quite a few interesting suggestions I never bumped
    into in my own travels. Thanks to all of you!

    You're all wonderful!
    Let's hold hands and sing......

    Sorry, I was getting carried away.
    Ok, I got some things to say - I just didn't want to put all of it in my
    original post.

    First a rant: No, the product wasn't from Micro$oft. However, I do hold them
    indirectly responsible. By pushing NT in everyone's face, and spreading this
    poor excuse for a development environment - Point-&-Click, Drag-&-Drop -
    and every vendor perpetuating this mind-set, now everyone's life has to be
    miserable.

    The PHBs are impressed when a marketing droid shows a demo on his 550 Pentium III
    laptop, running their product on NT, and of course it's native to NT, and the
    database it talks to is on the same laptop, and the table has 10 rows. Great.
    I have to use this damn mutant version running under Solaris on a 143MHz sparc,
    and talk to a database with million-row tables, 1,000 miles away!

    And now some further explanation: the project I worked on was not really an OLAP
    app, rather an ad-hoc query front-end. The vendor claims this is an OLAP tool,
    but that's like saying DOSSHELL is a GUI. Now we are evaluating real OLAP
    products. So far, the landscape doesn't look good. And I won't be thrilled with the
    higher-ups if they again "standardize" on one square-peg product that we have to
    squeeze into a round hole.

    Ok, let me answer a few things-
    SamBeckett says: "The only reasonable solution to this lad's problem would be to
    develop his own system."

    Of course! I wish I could convince the decision-makers. They're looking for a
    simplistic, one-size-fits-all solution. News Flash: There ain't none!

    Here's something useful I found, looks very good. However, the bulk of the
    information they provide requires an expensive subscription. But you pay for not
    only information but also consulting service. I only wish those upstairs would
    also see the value.

    http://www.olapreport.com

    And here's a quote from http://www.olapreport.com/How_not_to_buy.htm

    "The process of product selection can be tedious and expensive, so it is
    very tempting to try and pick a single preferred product set that will be
    used not just for the immediate project, but will become the official or
    unofficial standard for subsequent projects for years to come. However,
    this rarely works and can be dangerous."

    This makes perfect sense to me. Nevertheless, I'm afraid we're going down that
    over-trodden road to Hell.

    I have three questions-

    Cistron?: www.cistron.com -> www.cistron.nl, all in Dutch. Is there an English
    version? Or am I looking in the wrong place?

    Holos? : www.holos.com? Unfortunately, this website has nothing going on -
    looks like they're refurbishing. A search at Seagate for holos
    produces nothing.

    What (or who) are Livingston and Merit?

    Again, thanks for all this amazing feedback!
  • The implementation probably varies by vendor, but every solution I'm familiar with requires recomputation when adding records.

    As a result, the usual model for generating a data store is to update the MDDB on a periodic basis rather than on a record-by-record basis. How often depends on the quantity of data and the immediacy of your needs. Daily, weekly, monthly, quarterly, and annual schedules are all used. Keep in mind that there is often (usually <g>) much more processing involved in ensuring data quality and integrity than with simply updating the data store. In businesses I've been involved with (healthcare, finance, pharmaceuticals), 30 - 90 day data lags are commonplace.

    The result is that a "data warehouse" typically distributes "data marts" -- periodic updates or versions of the data store -- on a regular basis.

  • Does this mean that all of SAS is to be ported to linux?

    Don't hold your breath. I know a thing or two [netcom.com] on the topic [1].

    The news that I've heard is that a code port has been done. Some problems were encountered, but they were resolved with the help of RedHat. The question is whether or not Linux fits with SAS Institute's (SI) traditional customer base and business model. SAS is a mature product, with about 30,000 installed sites, roughly 2% base growth per year, and 16% revenue growth in 1998. 52% of revenues are still related to mainframe platforms (with 27% PC and 17% Unix). In short: a market that's not exactly bleeding edge, with more blood being squeezed from the same old turnips.

    SAS programmers have been known to rant [infoworld.com] about several other shortcomings....

    My feeling is that SAS will come around to supporting Linux eventually. They might even surprise me and make an announcement in the next few weeks -- it's regional user group conference time, a favorite time of year to announce new products (their latest release, v7, has been featured for the last four years running <g>....). But I'd put the probability at about 25%. A leaked internal discussion indicates that there are serious internal conflicts over marketing, and until Jim Goodnight says "SAS will run on Linux", it's not going to happen. I tend to get good information both directly and through the mailing list I maintain, and I've heard nothing. Might try shaking the bushes a bit....

    However the open source movement has gathered momentum to the point that SAS is simply going to miss out. Flexible tools, source, server-based and distributed applications, are the new wave. SAS has got itself a neat little niche, but it's got an uphill grade -- getting steeper -- if it wants to catch up with the new wave.

    [1] Yeah, I know the site's stale. The sad news is that it's current -- there's simply nothing to report. The mailing list carries more current information, but it's also tooooo quiet....

  • SAS's big push for the past five years has been data warehousing -- it's a growth path outside their traditional academic, DP, insurance, pharmaceutical, and healthcare business. Good as they are, they're having trouble competing due to the cost and complexity of their solutions.

    eCommerce is nothing if not a huge data gathering operation -- all of it live and on line -- with incredible opportunities to study customer behavior in detail. (For what it's worth, I see both sides of the ethical coin here -- DW is great for my wallet but really bad for my conscience, and I'm one of those people who pays cash, doesn't get the "club card" at the local grocery store, and supplies false information when I do -- Seymour Cray buys my milk and cereal).

    Example: eToys got through Christmas last year with a cluster of 90 P-class PCs running Linux and MySQL. They're not running big Iron, they've just started making the mistake of going to Oracle (well, I can understand why they're doing it but they'll live to regret it as well). Point being -- the growth area right now isn't happening on MVS, NT, or even Solaris. These are Linux or xBSD shops, and that's where their expertise lies.

    SI are more than welcome to unload the clip into their own foot if they want, but I'm not planning on waiting around to mop up the blood. I'll note that SI have been supporting Linux for their IntrNet product for a couple of years now. But the main products are not available.

  • by KMSelf ( 361 ) <karsten@linuxmafia.com> on Friday October 01, 1999 @06:08PM (#1644116) Homepage

    I've been programming SAS for seven years, and have begun exploring open source alternatives. I don't know of any general solutions, but components exist.

    What specific features do you need in an OLAP tool? Among those I can think of:

    • Raw data parser -- equivalent to the SAS data step, awk, or Perl -- something to turn raw ASCII (or EBCDIC, or binary) into structured data.

    • Procedural data language -- a general-use language for string manipulation, numeric conversion, arithmetic, date and time manipulation and conversion, probability and statistical functions, data and flow control, etc. Similar to Perl, the SAS data step, PL/I.

    • Statistical and summarization functions -- at a minimum, frequencies, univariate statistics, ranking, and correlation procedures. R, Perl, and other solutions can provide these.

    • Performance, scalability, and efficiency options -- basic data manipulation should be quick and efficient, the solution should scale to terabyte or petabyte size data stores, and system resources should be used efficiently with respect to memory, CPU, storage, and bandwidth.

    • Graphics capability -- ability to generate business and scientific plots and charts.

    • Report-generating capability -- XML, HTML, postscript, monospaced output.

    • A data store, or data stores, or interfaces to existing data stores (eg: RDBMSs).

    • A multidimensional data model.

    • Metadata management, including descriptions, validation rules, and business rules.

    • Task control and scheduling.

    • Front-end development tools.

    • Interactive data exploration and analysis tools.

    I'm not saying that a solution should have all of these features, but they are, in rough ranking, the ones I'd be looking for. My preferred model is to build a solution from existing components, or at least structure it from multiple modules, rather than look for a single integrated system. One thing SAS has taught me is that the single integrated system isn't the best way to fly.

    Anyone else have thoughts on relative importance, unnecessary items, or other features they would want to see?

  • OKay, I'm an idiot, I've seen this acronym being tossed around on /. for months now, and I still don't know what it means. Please explain what a "PHB" is...

    Folks will answer and tell you it's a "Pointy haired boss". Those of us who have one (or had) know better. It's "Pointy haired bastard".

  • Oracle is exceedingly Open To Your Buying Product From Them.

    Oracle salescritters are also extremely open to the idea of you adopting functionality that will forcibly tie you to buying future software from them.

    It unfortunately is quite tempting to "buy in" to using software that ties you to spending a whole lot more money on Oracle software. And this is why Larry Ellison is a billionaire...

  • For info on OLAP, I've written up link/definition information elsewhere in the thread.

    As for "PHB," that is the acronym that Scott Adams, famed for the comic strip Dilbert, coined to describe the Pointy Haired Boss. Usually used to describe someone who is completely devoid of practical knowledge but who is "in charge."

  • It's always useful to present some definitions when using a TLA or other acronym that everyone is not expected to be familiar with...

    OLAP [olapcouncil.org] usually stands for On Line Analytical Processing. (Footnote: the OLAP Council website claims to intend to provide common definitions, but does not actually provide a definition for OLAP...)

    Datamation [datamation.com] describes it thus:

    On-Line Analytical Processing (OLAP) is a category of software technology that enables analysts, managers and executives to gain insight into data through fast, consistent, interactive access to a wide variety of possible views of information that has been transformed from raw data to reflect the real dimensionality of the enterprise as understood by the user.

    OLAP is pretty strongly associated with the common buzzword, "Data Warehousing."

    More precisely, what it is about is the notion of taking the data created by an online transaction processing system, and collecting this into a big database that you then want to do "analysis" on.

    The point here is that the analysts that are looking for patterns need to have a separate copy, as the things they do may hit a DB server hard, and are probably not friendly to the transaction-oriented operations of "Entering Invoices," "Processing Sales," "Paying Bills," and such.

    SAS is pretty big on OLAP, as they have been building powerful statistical software for many years now.

  • More info about Applix's TM/1 can be found at
    http://www.applix.com/applixware/linux/prodovertm1.cfm [applix.com]
  • The only problem I've seen with Cistron Radius is that it tends to puke every now and again when hooked up to MySQL. Haven't been able to tell if it's MySQL or Radius which causes it, but all of a sudden the whole shebang stops working and all it needs is a restart and it's good for another couple weeks -- anyone?

    This server's got 128 megs of memory and 128 swap. It's been up about 3/4 of the year now running mail and web in addition to radius with no other problems.
  • I had to do the same thing with Radius. Until I showed that the proprietary software leaked 100 KB/S at our planned peak load.

    So I cranked out one based on the Livingston code and hooked to gdbm that could handle 5X the traffic, and didn't have memory leaks.

    They couldn't argue the obvious. In general, I'm very hesitant to base architectures on proprietary, closed products. It ends up being much cheaper for the company to go with an open solution.

    It is pretty difficult to prove it to most PHBs, though. Even after pointing out the 100 KB/S memory leakage (I'm not kidding), they just wanted to restart the server every hour.
  • I don't know if it was available at the time, but this was in early '98. There were two options that I knew of at the time, Livingston and Merit.
  • SQL may not appear on the front end, but that may be what the server gets from the tool. Most OLAP queries aim at aggregating all the data within particular ranges, eg "all sales of books costing over $20, in stores in the Southwestern US, to customers who paid by credit card." An OLAP tool would build the query by presenting the user with choices about each dimension, and then accumulating SQL predicates to define each range, eg "price > 20", "payment_type = 'credit card'", etc. Once this process is over, the tool comes up with a whopper of a query, possibly involving dozens of predicates. The database then has to look at this beast, and without any human assistance, come up with a good way of fetching the data.
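A rough sketch of that predicate-accumulation step, assuming the front end hands over the user's choices as (column, operator, value) triples. The `sales` table, its columns, and the sample rows are hypothetical; SQLite stands in for whatever database the tool really talks to.

```python
import sqlite3

ALLOWED_OPS = {"=", ">", "<", ">=", "<="}

def build_query(table, measure, filters):
    """Turn (column, operator, value) choices into one parameterized query."""
    for name in [table, measure] + [column for column, _, _ in filters]:
        if not name.isidentifier():
            raise ValueError(f"bad identifier: {name!r}")
    clauses, params = [], []
    for column, op, value in filters:
        if op not in ALLOWED_OPS:
            raise ValueError(f"bad operator: {op!r}")
        clauses.append(f"{column} {op} ?")  # the value is bound, never inlined
        params.append(value)
    where = " AND ".join(clauses) if clauses else "1=1"
    return f"SELECT SUM({measure}) FROM {table} WHERE {where}", params

# Made-up fact table standing in for the warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (price REAL, region TEXT, payment_type TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        (25.0, "Southwest", "credit card"),
        (30.0, "Southwest", "cash"),
        (15.0, "Southwest", "credit card"),
        (40.0, "Northeast", "credit card"),
    ],
)

# One entry per dimension the user narrowed down in the front end.
filters = [
    ("price", ">", 20),
    ("region", "=", "Southwest"),
    ("payment_type", "=", "credit card"),
]
sql, params = build_query("sales", "price", filters)
total = conn.execute(sql, params).fetchone()[0]
print(sql)
print(total)
```

With dozens of dimensions the WHERE clause simply keeps growing, which is the "whopper of a query" the database optimizer is then left to deal with.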
  • Multidimensional refers more to the physical data layout than the actual logical model, which can still be relational.

    In a conventional RDBMS, a tuple of values (a,b,c,d), say, can end up in any free block in the table (and indexes are needed for each column searched). In a multidimensional db, tuples are laid out as if their values were array subscripts, the way a multidimensional array would be in, say, FORTRAN. It's a good deal easier to find data that way, since it's the same as an array lookup. The downside is that the data has to be fairly dense, or else you'll end up with a lot of empty holes and wasted space.

    In a data warehouse, the array is usually organized along dimensions that are well-populated, such as time, location, cost, etc., so the sparsity is manageable. There's also a high priority on looking up things by virtually any dimension or combination of dimensions, so the array format is particularly useful.
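The array-subscript layout can be sketched in a few lines. This is an illustration only: the dimension sizes are invented, and it uses row-major order (FORTRAN itself is column-major, which changes the arithmetic but not the idea).

```python
def cell_offset(coords, shape):
    """Row-major offset of the cell at `coords` in an array of `shape`."""
    if len(coords) != len(shape):
        raise ValueError("need one coordinate per dimension")
    offset = 0
    for index, size in zip(coords, shape):
        if not 0 <= index < size:
            raise IndexError(f"index {index} out of range for size {size}")
        offset = offset * size + index
    return offset

# Hypothetical cube: 12 months x 50 stores x 200 products.
shape = (12, 50, 200)
cells = [None] * (12 * 50 * 200)  # every cell is allocated, filled or not

# Store one value; finding it again is pure arithmetic, no index needed.
cells[cell_offset((3, 7, 42), shape)] = 99.5
print(cells[cell_offset((3, 7, 42), shape)])

# The cost of sparse data: allocated cells that hold nothing.
filled = sum(1 for cell in cells if cell is not None)
print(f"{filled} of {len(cells)} cells in use")
```

The last two lines show the sparsity trade-off mentioned above: space is reserved for every possible combination whether or not a fact exists for it.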
  • I have been beaten down for 10 months with this product that was forced upon me, which shall remain nameless of course. The only way to "develop" a custom app was through its piggish graphical front-end binary, obfuscated file formats and no programming or scripting hooks. I could go on and on, but you know the deal

    Ho! You mean Microsoft Access! Yeah, I know about that, I've been through it too. A brain-damaging experience ... Unfortunately, the suits love it :)
  • *sniffle* Offtopic? Hrmph. He said "a product that shall remain nameless". Seems like an open invitation to humor a certain MS offering. Alas.. some moderators have had their humour surgically removed.

    --
  • Generally it's the implementation of said algorithms that is considered proprietary. Most of them are indeed published, and not necessarily proprietary.

  • AOLserver is most definitely open source and free.
  • After further review, the ACS system is also open source and free.

    Check out http://photo.net/wtr/using-the-acs.html [photo.net] for details.

  • > From what I've seen, Oracle products on UNIX
    > fit the bill here. Note that he says
    > that "Money is not an issue."

    Yup. They sure do. Hard to get more scriptable on *NIX than perl modules, which exist to tie into Oracle DBs. Also SQL*Loader, which is eminently scriptable, SQL*Plus, usable on ANY platform, etc. Oracle Discoverer is an OK (though pricey) OLAP reporting tool; also has a web-interface. You can always use Oracle DBs as the back end, and heck, use PHP or some other customizable, OSS solution for your front end tool.


  • What options are out there (if any) if one wanted to create "real" Multidimensional Databases under Linux? Oracle I assume could do it, but can they be accessed on a Windows PC with an ODBC/OLE DB driver?
  • If I remember correctly PostgreSQL can do multidimensional DBs along with relational-object DBs. For all I know MySQL can as well.

    I searched the mailing lists for "Multidimensional" and came up blank. Diving into the create_table.1 command, I found the following examples:

    1) A normal table:
    create table emp (name char16, salary float4, bdate abstime)

    2) A table that "inherits" another table (what the OO claim is all about):
    create table permemp (plan char16) inherits (emp)

    3) A table that stores noughts-and-crosses boards in a 2-dimensional array:
    create table tictactoe (game int4, board char[][])

    None of these are a true multidimensional table - so my question still stands...
  • OLAP stands for OnLine Analytical Processing. From what little I know, it's basically a program that can take information in the form of a (usually multidimensional) database and figure out stats and trends. Basically it's one of those programs that figure things out like 20-40 year old males are the primary consumers of home electronics. (Like we already did not know that)

    The problem is the voodoo most OLAPs perform on the data is very complex and proprietary.
  • 1. What is a PHB? perhaps I know not

    PHB is the acronym for Pointy-Haired Bosses...you know...like in Dilbert....

    2. What is OLAP?

    That's already been answered up above...
  • PHP is a server side scripting language often used for database-linked web pages.

    The logical leap from here to an OLAP engine is far too great for any sane person to make. OLAP and the Web have no direct or necessary relationship.

    New definition:

    "frind (n): a type of software, very useful for the type of work for which it was developed, but which is pushed relentlessly by the indiscriminate as being the solution to all problems past and future."

    You're not obligated to post on this forum, and if you don't know what you're talking about it's probably best not to.

  • Er, it sounds like the original poster is a programmer and someone is paying THEM to do the work.

    In the real world, most business people (bosses) want and are paid to solve business problems, not to finance open source initiatives which may provide something more or less like what they originally wanted at some indeterminate time.

    Running a business dependent on the output of one or more open source projects yet to reach fruition sounds like an extraordinarily effective way to go broke in the fastest possible time.

    I think the cosource way works if people "invest" in software, much as they invest in shares or property, with the opportunity to reap the benefits (which may be more or less than they would hope for) at an indeterminate later time. The "other interested people" who may invest may not come on board in the timeframe you would wish for, and if you can't wait that long or justify paying for the whole thing yourself then you lose.

    Trying to meet a corporate deadline using the methods you advocate would be far too risky IMHO.
  • If you want to build a custom Web-based OLAP client, I'd check out Internetivity's dbProbe (www.dbprobe.com) and AlphaBlox's AlphaBlox (www.alphablox.com).

    dbProbe stands out because of its very small Java client (about 100KB the last time I tested it), while still providing a full-featured interface similar to the fat client products (Cognos, Brio, BusinessObjects, etc.). Its cube is generated as ASCII text, so it's pretty malleable. The server runs on Linux or you can go directly to a bigger back-end like MetaCube or TM1. InterNetivity did lots of things right here.

    AlphaBlox is a newer and generally rawer product, but is specifically designed as a toolkit for building custom OLAP and reporting applications. Using JavaScript, HTML and their Java components, you can assemble a vertical OLAP application in a few hours. It's all client-side, so your cubes need to be relatively small (3 or 4 dimensions).

    There are lots of other Web-based interfaces (virtually every OLAP vendor has one now), but the rest are quite closed and intended to be used largely as-is.

    Regards,
    Tim Dyck
    PC Week Labs
  • PHB--Pointy Haired Bosses--from Dilbert
  • The underlying data structures in MDDs (multi-dimensional databases) are generally proprietary--and why shouldn't they be? You don't expect to access RDBs (relational databases) through reading the file directly... you use SQL. In the same way you access MDDs through a language or front-end.

    An emerging 'standard' for accessing MDDs is MDX (Multi-Dimensional eXpressions), which first appeared in MS OLAP Services, although it's now popping up in most major OLAP servers (except Cognos at this stage). You don't have to 'program' the OLAP server--you just create an MDX expression. MDX provides access to pretty much all the power of most OLAP servers.

    For programming the build/maintenance processes, all tools have some form of OLE interface (in NT) or other interface for the Unix tools.

    The way the data is stored and 'indexed' is very important--trying to roll your own tool will _not_ create the same order of magnitude of performance of any commercial tool. For example, MS OLAP Services includes an optimisation algorithm that calculates which preaggregations to create to maximise query response.

    The tools I would suggest looking at would be MS OLAP Services for small to medium jobs (up to 100 million records it certainly works fine), or Cognos (although it's showing its age... I'd wait for the next release). For bigger jobs I'd look at Teradata (which runs most of the world's biggest data warehouses) or Redbrick (which is now part of Informix).
  • wtf does OLAP stand for?
  • by jayped ( 25281 ) on Friday October 01, 1999 @01:22PM (#1644145)
    IBM has OLAP for DB2. Here's the link.

    http://www.software.ibm.com/data/db2/db2olap/ [ibm.com]

    Plus IBM is Open Source friendly. (although DB2 OLAP server is not yet available on Linux)
  • One more thing about Holos (at least before the web interface stuff), is that the interface is rude, horrid, and bletcherous---the client on the PC side actually opens up a faked-up telnet session, but doesn't keep-alive, so when the clients go down (all the time---they're PCs) you get lots of hung sessions using up bunches of memory, so you wind up setting up idle-user sweepers, with that constant battle between "more clever" users complaining about being logged out prematurely vs. the application staff complaining about tied up system resources.

    Now, the web stuff is supposed to fix all of that, but I haven't seen that yet. Fortunately, I'm not the SA for "those guys" any more.
  • The Olap Council [olapcouncil.org] proposed the MDAPI [olapcouncil.org], a server-independent API for accessing OLAP data, some time ago (first time around '97, I think). They still have some major vendors on their member list [olapcouncil.org], with Oracle seeming to be most actively pushing it (last year, that is). Yet, for about a year, nothing has been heard about it. In the meantime, there is Microsoft's "OLEDB for OLAP", also supported by several server vendors, but, obviously, running only on Windows, so it's not really an alternative. Does anybody know if MDAPI still has any supporters, and if so, what has been going on there in the last year? I can't seem to find any evidence the whole thing is even alive anymore.
  • WEKA [waikato.ac.nz]? Various machine learning/data mining tools. Java. GPL.

    And you'll keep getting assorted answers in various directions unless you're more specific...

  • dbProbe [internetivity.com]? Web OLAP. Commercial. Java.
  • Basically, an OLAP database pre-calculates every possible combination of the data, and stores it for fast (on-line) retrieval.

    Does it recalculate the statistics, etc. every time I update or add a new record?

  • Why didn't you just use Cistron RADIUSD?

    It's fast, stable, and open-source. We've used it for the past year (with and without NIS) and never had a problem with it.

    One of the best examples of OSS I can think of.
  • which is nice if you want to use their pre-built analytical products for CPE or Census statistics. Otherwise, with SAS you're looking at a very big development project just to build your databases.

    BTW, IBM isn't using "OLAP" anymore. Apparently none of the PHB's could remember what it meant, so now IBM is calling it "Business Intelligence" (as though buying DB2 could increase their IQs).

    This buzzword has arisen recently in publications such as "Enterprise Computing" and "Technical Support" (which seem to keep arriving somehow).

    But IBM shouldn't be dissed about DB2 WRT having really solid systems. With the right hardware, software, and technical staff, you can guarantee 100% (not 99.999%) DB2 uptime 24x7x365 (for only a few tens of millions annually, of course). ;-)

    When companies finally grow up and realize that they're spending millions of dollars on out of control menageries of hundreds of unstable NT database servers, they invariably go upscale to either Unix (if they're smart) or MVS (if rich).

    Some are actually rational and do both, depending in the case of each application upon what really makes the most sense. (These are the minority.)

    Anyway, DB2 on Linux will be a killer (if IBM doesn't totally screw it up, as they mostly do, when it comes to marketing outside their sacred "account control" business model). However, I only wish them wisdom, lest they wake up in a few years and find MS-Access running on their hot 9672's!

    S390





  • Ellison is rich because their product is good.

    According to this article [forbes.com] he's worth $13 billion. I think most of that's on paper in the form of stock and options in Oracle. Whether that's related to the quality of their product is anybody's guess.

  • by carnifex_maximus ( 35084 ) on Friday October 01, 1999 @12:33PM (#1644154)
    On-Line Analytic Processing is one reading. Try 'data warehouse' or 'data mart' as well. Cheers, Drieux
  • If I remember correctly PostgreSQL can do multidimensional DBs along with relational-object DBs. For all I know MySQL can as well.
  • Short answer: No. Multi-dimensional databases means you have n sets of 'columns' or 'rows'. I.e. a regular DB (e.g. Access) which divides its tables into columns and rows is 2D, strictly. An nD DB would have n-dimensions *each* divided into 'columns'.

    BTW, An open-source nD DB would kick-ass. But it has to be geared for *very* fast read access times...

    Anything out there?

  • OK. If in 2D world, the basic structure of a database is a 'table', in nD it will be, say an 'array'. An nD array will be:

    Array: "Sales"
    Dimension 1: "Store"
    Dimension 2: "Product"
    Dimension 3: "Time"
    .
    .
    etc...

    So you could store the sales history of every product sold in every store of a national chain (say Sears) in any given time interval.

    Then you can do things, like identify regional trends in sales, inventory optimization, etc.

    Of course, we're talking TBs of data, and hardware and software designed for speed ;-)

    You could theoretically do everything in C/C++, but I doubt anybody does that: you need very, very fast access to the data, which is usually done by a proprietary app/language (which is what the post is looking for). Then you can build application logic on top of that in anything you want really...

    Plug alert: this is what my employer [retek.com] does...
  • This is one of the best references I've seen for OLAP terminology. [moulton.com]

    Basically, OLAP, data mining, etc. are about optimizing databases for analysis (i.e., data warehousing) and then applying machine learning algorithms to find interesting, useful bits of information which you weren't aware of before. This process goes by many different names, including OLAP, data mining, KDD (knowledge discovery in databases), and intelligent data analysis, among others.

    While some of this is pure hype, there is some very cool, interesting work going on.
  • by tarvid ( 48247 )
    I had a lot of fun with Dynamicube from www.datadynamics.com. I applied it to 6 million record files with up to 10 dimensions. Wish I could find the same for Linux.

    There are several datacube efforts, some in actual practice which are not labeled as such. One is analog which builds a datacube from httpd access logs.

    Once one has well defined categories, all it takes is to add one dimension for the sum, form the cartesian product of the categories and sum all the combinations of the zeroth member in each category. The process is reversible so that records can be added and deleted.
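
    A rough Python sketch of that recipe, under some assumptions of my own: the "zeroth member" of each category is written as `ALL`, the measure is a simple hit count, and the log records are invented. Each record is added to every combination of (its own member or `ALL`) across the categories, and deleting is the same walk with the sign flipped:

```python
from collections import defaultdict
from itertools import product

ALL = "*"  # stands in for the "zeroth member" of each category


def cells_for(record):
    """Every cube cell a record contributes to: for each category,
    either its own member or the ALL member (2**n combinations)."""
    return product(*[(member, ALL) for member in record])


def add(cube, record, amount=1):
    for cell in cells_for(record):
        cube[cell] += amount


def delete(cube, record, amount=1):
    # Reversal is just adding the negated amount.
    add(cube, record, -amount)


# Hypothetical access-log records: (host, page, status)
cube = defaultdict(int)
add(cube, ("alpha", "/index.html", "200"))
add(cube, ("alpha", "/faq.html", "404"))
add(cube, ("beta", "/index.html", "200"))

print(cube[(ALL, ALL, ALL)])            # grand total of hits
print(cube[(ALL, "/index.html", ALL)])  # hits on one page, any host
delete(cube, ("beta", "/index.html", "200"))
print(cube[(ALL, "/index.html", ALL)])  # one fewer after the delete
```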

    It can also be generalized to more than sums and there are some parallels to n-dimensional correlation matrices.

    The problem gets a bit knottier when the data is sparse. I suppose an index of keys would ameliorate the problem.

    I would be willing to help in such an effort and can help lay out specs.

  • I too didn't know what PHB was off the bat, even though I am an avid Dilbert fan. After a while of starting the day by reading a Dilbert strip and perusing Slashdot, a bulb went on in my head and I derived that PHB was Pointy Haired Boss.
  • It's more complicated than that. OLAP applies complicated (usually proprietary) statistical algorithms to (entire) complex databases in order to extract information which isn't immediately apparent from just looking at a piece.
  • Actually...yes. That is why the data warehousing applications tend to be proprietary. They hire statisticians and mathematicians to deal with it.
  • 1. Pointy-Haired Boss...see that "boring" layer of computers known as Dilbert.

    2. On Line Analytical Processing. Data Warehousing, Data Mining [insert other buzzword here]. Not my cup of tea, certainly. But I'm not sure that you can call it the "boring" layer!
  • About a year ago, I built a web front-end to the Essbase OLAP server. It was pretty tacky on the back-end, using a bunch of Perl scripts and the Essbase Web Gateway, but it had a decent web interface and looked good enough to print from the browser to give to an executive.

    I know that Essbase is pretty expensive, and the version I was using ran on an NT server. But I know they have ports to other platforms, and I _heard_ that they were considering porting it to Linux.

    Also, since the system was pretty hard to maintain, the development team that I am on developed an ASP solution. Essbase has open Visual Basic APIs, so we wrote ASP components to hook into the OLAP server...it is still sitting in QA right now, but it seems to be really robust and has plenty of room to grow.

    If you want more info, email me

  • That's right! I forgot that it had the C API also...hell, with that you could write a mod_essbase module for Apache and make it really platform independent.

    Of course we chose ASP because I work for a large company whose standards committee is hooked on Microsoft products.


  • PowerPlay isn't particularly "open", and has the limitations talked about above. Largely graphical user interface, little back end programming ability, and no adaptive interface to speak of. Their next generation software looks a lot better, but of course it isn't out yet. The current release has had, in my opinion, some problems scaling. Check my reply to the main topic for more.
  • OLAP stuff is tough to write from scratch; even if you've got a good underlying database that's designed for OLAP, writing code to navigate a 7 dimension or higher cube isn't pretty. What is it you need to do that a low to mid tier package like Cognos won't do?

    Holos, which Seagate recently bought, is very open and strong. They have a structured programming language that gives you as much support as I imagine you could use for back AND front end work. Their web interface is developing, but it's in early stages yet... hopefully by next calendar year they'll have a new release out that will pretty that part up. If you need real strong scalability, Unix support (sorry, no Linux to my knowledge), and fairly open control, it may be worth looking into.

    Unfortunately, people that need the sophistication of OLAP haven't been the people that write OpenSource software, so I don't know of any truly open solutions. If anyone wants to write one, tho, I'm willing to help!
  • this is the closest thing to an intelligent post i have bothered to read. the answer, my dears, is essbase. http://www.hyperion.com. essbase is the world's fastest and most capable multidimensional database engine. it has been ported to every unix and the linux port is in beta/qa.

    essbase has had an open api (both in c, and vb) for years, and dozens of partners who have written front-ends like cognos, business objects, brio and a host of others which plug and play to the database.

    olap is not olap in the way that relational is relational. the fact of the matter is that arbor software recruited *the* dr. codd to sum up the weaknesses of relational database for realtime access as in the way relational dweeb dba's have been attempting for years in datawarehousing. only two vendors have implemented the full scope of codd's rules. they are arbor (now merged with hyperion) who own essbase, and oracle who purchased iri's express multidimensional engine. microsoft, on the late freight, bought an israeli joint called panorama, scrounged up some of the engineers from oracle and are now trying to undercut the market with a deally called microsoft olap services now bundled with sql server 7.0. (codenamed 'plato'). for those of you with enough sense to resist microsoft, i needn't comment further on this nt only wizard-based thingy.

    oracle has decommitted and begun to defund the express division. all of their marketing people have been canned. they are hoping that people will continue to remain clueless and assume that extensions to sql will handle everything 'olap'. of course unless you pay attention to the full spec of things codd said many years ago, then you are likely to buy all the marketing crap and actually believe that relational databases are going to do everything.

    the momentum remains with dweebBAs with big iron who hack their way through datawarehouse projects without real server-based olap. to them it's all 'slice & dice' and other PHB front-end silliness. in the meantime, people who know are scarce. the rest of us are making buttloads of cash in the consulting biz, or are running the market-leading hyperion.

    the people who really know this stuff live at centrobe, beacon analytics, mcanalyst, data into action, pinnacle solutions, answerspace, navigator systems, ranzal & associates, symmetry and eds.

    bottom line is this. there is one company who is going to crunch the uncrunchable, and i'm talking about weblogs from yahoo, and that will be hyperion. it won't be done with relational crap.
  • data warehousing and data marting have completely different dynamics. data warehousing is all about storing everything in a consistent manner such that it can be replicated for any reason without losing its contextual meaning.

    data marts are specifically designed to answer specific questions about some process of the business.

    datamarts built with relational technology are quick and easy to design but offer limited functionality, performance and flexibility. datamarts built with true olap engines do all that work better, but they take a bit more brains to setup. (mostly the kind of brains that tell you relational technology doesn't cut it.)

  • there are two kinds of voodoo. chicken dancing voodoo and patent office voodoo. if everybody could do this, there would be 10 olap servers on the market. as it stands, there are about 4 worth mentioning.

    olap engines also figure out things like, practically speaking, how would i tune the profitability [hyperion.com] of international container shipping using activity based costing methods?

  • powerplay is essentially a front-end tool. it makes little cache files called 'powercubes' which reside on the client after you vacuum up a big query from a source database. the main presumption of this design is that you intend to navigate around in a tiny subset of multidimensional space. the second presumption of this is that your server can't handle real-time queries. some people call that olap. it's not really.

  • here's a reference to Codd's Rules. [olapreport.com]

    this is a little bit dated but a decent enough overview. [mit.edu]

  • ibm santa teresa has done something like this. they actually have the ability to select out a 'square' of data within a fact table. this square has its own security, logging and it is optimized for incremental update and read.

    by 'square' i mean only the area defined by some contiguous columns intersecting contiguous rows WITHOUT the rest of the record. it's basically cell level retrieval, with security.
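
    To make the 'square' concrete, here is a small Python sketch. It only illustrates the idea and is not how IBM implements it: the fact table, the column names and the `square` function are invented, and the security and logging parts are left out.

```python
# Hypothetical fact table: column names plus records.
COLUMNS = ["store", "product", "units", "revenue", "cost"]
FACTS = [
    ["Chicago", "hammer", 12, 120.0, 70.0],
    ["Chicago", "drill", 3, 300.0, 210.0],
    ["Boston", "hammer", 5, 50.0, 30.0],
    ["Boston", "saw", 7, 140.0, 90.0],
]


def square(rows, first_row, last_row, first_col, last_col):
    """Return only the cells where a contiguous run of rows meets a
    contiguous run of columns (bounds inclusive), not whole records."""
    col_lo = COLUMNS.index(first_col)
    col_hi = COLUMNS.index(last_col)
    return [row[col_lo:col_hi + 1] for row in rows[first_row:last_row + 1]]


# Rows 1..2, columns "units".."revenue": the rest of each record is not read out.
print(square(FACTS, 1, 2, "units", "revenue"))
```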
    --

    if tm1 went open source, that would be an amazing thing.
  • if you don't mind crappy performance and like doing everything by hand, then holos might be a bargain. i don't mean to flame, but if you knew how many customers buy holos based on its elevated rhetoric and replaced it as soon as they could afford to...

    my sources say that seagate doesn't even do its own olap on holos. and the holos front end tools allow you to use essbase as a back end through the essbase api [hyperion.com]. explain that.

    on the other hand, if you'd rather program a 4gl than use an api, i can sympathize with a choice of holos.

    by the way, if you want java beans, try painted word [paintedword.com]. if you want java component based web, try alphablox. [alphablox.com]

  • say "doornail".

    the market is saturated with front-end tool vendors, and all of the ones who are making any money have already ported to the essbase api and/or the microsoft api. the spec got completed but by that time the market-share game was already over. therefore, there is no interest in doing anything new and untested.

    some folks have complained about the need for a full objecty kind of api for essbase. hyperion is scratching its head about api stuff. the confusion is easy to understand: why do more when there are so many successful vendors out there using the current api?
  • I can save you a lot of time and trouble offline. yes i sell the stuff, but i have also done implementations going back several years. i have seen what folks are doing all around the country, and i understand the strengths and weaknesses of the major products.

    it's very exasperating for me to see people thrashing around on this issue, especially when they start talking up the minor vendors.

    if you absolutely are bent on not spending 25grand for your basic essbase server license and want to build your own stuff, then i really can't help you. that's just beneath the radar. if you want to go for the cheap stuff, my best recommendation would be tm/1.

    but if you want a first class system. mail me or discuss it here. mbowen@panix.com
  • i have never met anyone under the age of 40 who had the patience to work with SAS in any olap or D/W project. on the other hand, people i meet get to understand the genius of essbase.

    sas is a brilliant company for leveraging legacy apps. if you did a masters anywhere doing statistical research or clinical trials, you've tasted sas. lots of folks learn it in postgrad studies. but using sas in today's market is a little like mba's using their same hp calculators to run the company finances. it just doesn't make sense except for the fact that you are extremely comfortable with it. but what can you say to people who are in love?

    hyperion as a company has a problem in reaching slashdot-type folks. that's probably because it came from the finapps space. not a lot of bsd in corporate finance. be that as it may, don't let that convince you to try and build olap from scratch. i've met with very bright guys from citibank who built their own. (believe me it was a seriously hostile, new york style meeting). but when i explained the internals of essbase to them, they came around. i mean the stuff is patented [ibm.com], and the inventor understands everything about sparse matrix math and all that eigenstuff.

    there are a significant number of computational problems that solid olap technology addresses. the fact that ibm [ibm.com], acknowledged master of sql optimization, has opted to oem the hyperion technology rather than build their own is all the proof anyone should need.

  • The original poster asked for:

    Open-source would be nice but not required. Money is not an issue. What's important to me is: it's UNIX native - not NT native and ported with one of those bloated NT-to-UNIX layers - and its data formats are *open*, *readable* and *programmable* - whether it sits in files or a database.

    From what I've seen, Oracle products on UNIX fit the bill here. Note that he says that "Money is not an issue."

    *open* doesn't mean Open Sourced. Oracle products are definitely programmable, and readable, if that means the data is conveniently manipulated.

    As to Larry Ellison being a Billionaire. Oracle sells in a competitive database market. Ellison is rich because their product is good. Is there some requirement to hate those who are successful?

  • by JordanH ( 75307 ) on Friday October 01, 1999 @01:06PM (#1644179) Homepage Journal
    I don't know much about OLAP. I do know that Oracle is in that market, their products are reasonably "Open" and programmable and they claim to integrate well with the Web. I believe they've committed to porting their entire product line to Linux as well, but they have it all on numerous flavors of Unix today.

    Check out this link at Oracle's web site [oracle.com]

  • You ignorant c*nt!
  • 1. What is a PHB? perhaps I know not.

    2. What is OLAP?

    These are being asked because I rarely venture into the "boring" layer of computers, namely the commercial, money-for-making-more-money level.
  • Couldn't CGIs be considered an implementation of OLAP?
  • Just having a couple of fields in the database, e.g.:

    database 1 has a column for x, one for y, and another for z. Can that be considered multidimensional?
  • Well maybe compared to AI, or quake, or perhaps biological life simulations.
  • How can a statistical algorithm be considered proprietary, except to its author? From what I have seen in journals devoted to mathematics, most of the ideas applied to problems are far too complex for some average joe company to formulate. Do companies hire theoretical mathematicians to do their program design?
  • so sort of like

    database 1
    database 1 part a
    database 1 part a subsection 1

    ? maybe I cannot see this in my head yet, but isn't that just a simple C++ program away? Is there a good text on multidimensional databases?
    1. PHB stands for 'Pointy Haired Boss'. See Dilbert [dilbert.com]...
    2. OLAP, others have already done an excellent job of describing 'On-Line Analytical Processing'. I will only point out that it is a management buzzword right now and, like most such buzzwords, it means different things to different people.

    Jack

  • I did a search a couple of months ago to find an open source OLAP, but I couldn't find one. This could simply mean that I didn't look hard enough. About the best thing I came up with was TM/1, which is now owned by Applix. I've done a little probing to see if they ever plan an open source release, but I've found nothing.

    I think that Erik Thomsen's book "OLAP Solutions" comes with an evaluation version of TM/1.

  • Down here in Australia we just had the users group conference for Asia Pacific, and the product they are releasing is V8 and a series of vision products (HV CFO etc.).
    I don't see SAS on Linux in the near future, but as they have released htmSQL on Linux, SAS will be providing support for Linux. I thought that providing support for Linux was one of the major reasons that the port had not gone production.

    BUT with the growth of Linux as a web server platform and the porting of htmSQL, is intranet next? And then SAS? There is a trend towards the web as a reporting tool, with webAF and webEIS as well as integrated HTML output in base SAS in V8 with ODS (and, incidentally, publish and subscribe, and CBI {portal} products like the joint venture with Intraspect).

    The topic of open source is interesting tho, as the latest news from Jim himself is that the long term plan for SAS is to get out of the client market, and V8 has made available COM/DCOM (CORBA in the first maint release) classes to enable third-party thin client development on whatever platform you want to develop on.

    Grem
  • Firstly the link in one of the other posts didn't seem to work so try this. [sas.com]
    Secondly which SAS programming language don't you like ? Base SAS ,SCL, webAF or webEIS.
    I agree that SCL syntax can be a bit annoying at times, but the next version of SAS, V8, to be released early next year, has much better syntax, more like C++ or Java. WebAF is just Java with a lot of extra classes added to it and an IDE, so you know what to expect there.

    But the main reason for using a product like SAS is that you don't have to rewrite all the statistical and analytic back end procedures. However if you don't like the front end there is a standard server for Open OLAP server available from sas as well as several different web front ends.

    For more info check
    SAS OLAP [sas.com]
    Cognos OLAP [cognos.com]
    Oracle OLAP [oracle.com]

    An aside: SAS is releasing htmSQL 2.0 for Linux as well as all its standard platforms on Tuesday. Does this mean that all of SAS is to be ported to Linux?

    Grem
  • Just a quick comment about Ellison being a billionaire. I am not saying that Oracle products are necessarily bad, but their databases are comparable to Sybase and Informix, their applications are not better than SAP's and Siebel's, and their OLAP engine was actually ACQUIRED a couple of years ago together with a company (I don't know what it was called, maybe Express, just like Oracle's OLAP engine).

    The reason why Oracle is so successful is their amazing marketing talent and great salespeople. After all, Ellison himself used to be a salesman, and that's the way he runs the company. Once again, I'm not trashing their soft, I'm just saying that marketing is WAY more important. Just look at Borland (Inprise, whatever)
    -- The word "woman" is not politically correct any longer.
  • I'm data administrator for a 400+ GB financials warehouse on Oracle. Yes, it does the job very well. We also front-end the warehouse with custom ColdFusion web apps and are porting to BusinessObjects' new BusinessIntelligence web product. The combination offers excellent results with minimal (2 DBAs) maintenance. The DBAs are primarily involved in designing new tables and data sourcing operations (and de-normalizing my models). We support several thousand users with an average of about 50 concurrent multi-dimensional queries. We're running on twinned HP UNIX servers and have had exactly one hour of down time in the last year and a half. So far our users seem to think that Oracle is the appropriate product to use.
  • Pig Headed Bastard(s) ;)

    --Clay

  • by CodeMunch ( 95290 ) on Friday October 01, 1999 @06:58PM (#1644194) Homepage
    Disclaimer: Yes, I do realize the question was for Unix...So if you don't wanna hear about some NT stuff, stop reading this message.

    Some OLAP (On-Line Analytical Processing - i.e. reporting) stuff off the top of my head that I've had contact with:

    Cognos [cognos.com]

    CrystalReports/Info [seagatesoftware.com]

    ActiveReports/ActiveCube by DataDynamics [datadynamics.com]

    Cyberprise [cyberprise.com]

    My company tried Cognos and it seems to be a heavy hitter to satisfy the PHBs. It's got data mining/drilling down, stored static cubes so you don't need to go back to the DB (makes it fast) and when you drill down till you are out of data, you can go into the DB off the cubes. You can run reports right off the cube data. Unfortunately I wasn't a part of that venture but from what my co-worker (developer) says, it was pretty slick... YOU DO PAY through the nose for it.

    Crystal Reports: Very slick & easy to use. Almost idiot proof to run them off the web BUT the web engine is single threaded and you can only run one at a time on the web server (useless!). If your DB is slow and 5 people ask for a report each at the same time and they all take 5 minutes, the last person will be waiting 25 minutes...you know that by then they've already clicked refresh 15 gazillion times (or the default install of IE has given up). The ActiveX and Java controls that come with Crystal 7 that allow you to view reports through the browser are sweeeeet. You can export reports to RTF and a couple other formats right from the browser. Oh ya, it's also VERY easy to design reports and the COM interface makes it easy to work with. I demo'd CrystalInfo but select boxes confuse our users enough that we didn't want to give them the ability to create reports on a whim if they don't understand the underlying tables. You can pay some company 10k U.S. for a multithreaded Crystal Print engine. Crystal 7 is reasonably priced.
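
    To put numbers on the single-threaded case, here is a tiny Python sketch of the queueing arithmetic. It assumes all five requests arrive together and each report takes exactly 5 minutes; the function name and the `workers` parameter are hypothetical, just for illustration. With one engine the reports finish one after another; with more engines they overlap.

```python
import heapq


def finish_times(durations, workers=1):
    """Minutes until each request completes when requests that all
    arrive at time 0 are served in order by `workers` engines."""
    free_at = [0] * workers  # when each engine is next available
    heapq.heapify(free_at)
    done = []
    for minutes in durations:
        start = heapq.heappop(free_at)  # earliest free engine
        end = start + minutes
        heapq.heappush(free_at, end)
        done.append(end)
    return done


reports = [5, 5, 5, 5, 5]
print(finish_times(reports, workers=1))  # single-threaded engine
print(finish_times(reports, workers=5))  # one engine per request
```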

    On the same lines, ActiveReports (and ActiveCube) by DataDynamics is quite a bit more useful, although nowhere near as easy to use as Crystal, & doesn't come with the handy pre-built functions to manipulate/shape data (but I like doing stuff from scratch ;). It is an ActiveX Designer plugin for VB. You create all your reports in VB and then create a generic report object to wrap around those ActiveX designers. And the best thing of all was that you could run multiple reports simultaneously, which beats the pants off of Crystal. The export controls aren't as full featured though, and there aren't many export options, but there are enough that you can get data into pretty much any app. Besides, too many options confuse users ;) ActiveReports is reasonably priced.

    Cyberprise was another thing my company tried, but I can't say much about it as our interest leaned more towards Cognos.

    Anyway, currently we are using a crappy buggy 16-bit Helpdesk software by Applix (transitioning to a home-brew Oracle Forms app instead) and the reporting was buggy and useless...not to mention they save the user's password in a text file in the root of the system drive...but I digress.
    This Oracle thing (started out on web...too flaky, going client/server unfortunately) needed web reporting, so all the Crystal Reports we had suited it perfectly to run via ASP (Active Server Pages). I designed a system that allowed me to put all sorts of dynamic web selection forms in front of the Crystal engine and pretty much run any report we had. I can add options to the selection form just by inserting into the DB and it pops up on the web page.
    This allowed users to run pre-defined report templates against the system to extract the stuff they needed. All in all, it works great (except for the slow DB and single threaded Crystal report engine) and I'm in the midst of modifying it to be able to run Crystal and ActiveReports (so I can port everything to ActiveReports).
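
    A rough sketch of the "insert a row and it shows up on the form" idea, in Python with sqlite3 rather than ASP and Oracle. The table name `report_options`, its columns and the `render_form` helper are invented for illustration and are not the system described above.

```python
import sqlite3
from html import escape

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE report_options (report TEXT, field TEXT, label TEXT, choice TEXT)"
)
conn.executemany(
    "INSERT INTO report_options VALUES (?, ?, ?, ?)",
    [
        ("open_tickets", "priority", "Priority", "High"),
        ("open_tickets", "priority", "Priority", "Low"),
        ("open_tickets", "team", "Team", "Network"),
    ],
)


def render_form(conn, report):
    """Build an HTML selection form from whatever option rows exist."""
    rows = conn.execute(
        "SELECT field, label, choice FROM report_options "
        "WHERE report = ? ORDER BY rowid",
        (report,),
    ).fetchall()
    fields = {}  # field -> (label, [choices]); dicts keep insertion order
    for field, label, choice in rows:
        fields.setdefault(field, (label, []))[1].append(choice)
    parts = ['<form action="/run" method="get">']
    parts.append(f'<input type="hidden" name="report" value="{escape(report)}">')
    for field, (label, choices) in fields.items():
        parts.append(f'<label>{escape(label)} <select name="{escape(field)}">')
        parts.extend(f"<option>{escape(c)}</option>" for c in choices)
        parts.append("</select></label>")
    parts.append('<input type="submit" value="Run report"></form>')
    return "\n".join(parts)


# A new choice appears on the form as soon as its row is inserted.
conn.execute(
    "INSERT INTO report_options VALUES ('open_tickets', 'team', 'Team', 'Desktop')"
)
print(render_form(conn, "open_tickets"))
```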

    As was previously mentioned, you could use PHP script or Perl or C/C++ on Linux to do your stuff but that would require a lot of work.

    Sorry I couldn't give you any Linux info. Perhaps these companies have something coming down the pipe.

    --Clay

  • by SPrintF ( 95561 )
    "Olap" is Finnish in origin. It is synonymous with "data whorehouse". I don't understand it; I just explain it.
  • by SamBeckett ( 96685 ) on Friday October 01, 1999 @01:26PM (#1644196)

    I happen to work as an intern at a Fortune 50 company right now-- and I've been asking around... They have every spare engineer and technician working on algorithms/programs/controls etc. for the automation of data analysis.

    NASA and the Air Force started it years ago with a project they did to detect early failures on rotor shafts and bearings (and thereby save money on unscheduled repairs).

    I find this stuff very boring and rudimentary. The *only* reason any company keeps this proprietary is because they don't want people to know how crappy their products are (I'm referring to in-house OLAP development, where release would mean exposure to their most sensitive failure data).

    IBM apparently uses OLAP on their hard drives because they don't quote their mean-time-between failures. I even called them and asked.

    I've used their SAS programming language earlier in the year, and I'll tell you-- it's NO walk in the park. The syntax is worse than umm.. well.. I guess it's just the worst.

    The only reasonable solution to this lad's problem would be to develop his own system.. It's the cheapest, probably the most reliable and, would be, by far, the most customizable.

    I feel for him for having to do this kind of work though because I was driven mad by it. I still have a bad taste in my mouth from it.

    My suggestion to him : hire a bunch of interns and have *them* do it.

  • PHP is the answer to your problem. PHP is a programming language, so you'll find all the flexibility you want. Still, you don't need much code to get started!

    Check out the homepage [php.net] for more information

  • The only freeware web-based OLAP solution I've seen is the Data Warehouse Subsystem bundled with Philip Greenspun's ArsDigita Community System (ACS), which runs under AOLserver and Oracle. Although I use the ACS, I haven't run the Data Warehouse Subsystem myself, but I have seen Philip demonstrate some of its capabilities - it seemed comparable to the various web-enabled commercial OLAP solutions. The Data Warehouse Subsystem is relational OLAP (ROLAP), so all of the multi-dimensional aggregation is done within the database itself. FWIW, I develop Business Objects solutions at my work and am happy with it. http://www.photo.net/doc/dw.html
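
    To illustrate what "aggregation done within the database itself" means for ROLAP, the sketch below uses Python's sqlite3 with an invented star schema (the `sales` and `stores` tables and their columns), not the ACS data model or Oracle. The cube-style summary is just a `GROUP BY` over dimension columns that the database evaluates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE stores (store_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE sales (store_id INTEGER, product TEXT, month TEXT, amount REAL);
    INSERT INTO stores VALUES (1, 'East'), (2, 'East'), (3, 'West');
    INSERT INTO sales VALUES
        (1, 'hammer', '1999-09', 80.0),
        (2, 'hammer', '1999-09', 50.0),
        (3, 'hammer', '1999-09', 40.0),
        (3, 'drill',  '1999-09', 300.0);
    """
)

# The database joins the fact table to a dimension table and aggregates;
# the client only receives the already-summarized rows.
query = """
    SELECT st.region, sa.product, SUM(sa.amount) AS total
    FROM sales AS sa
    JOIN stores AS st ON st.store_id = sa.store_id
    GROUP BY st.region, sa.product
    ORDER BY st.region, sa.product
"""
for region, product, total in conn.execute(query):
    print(region, product, total)
```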
  • AFAIK the ACS system and AOLserver are open source not freeware. Oracle is neither. And yes, it runs under UNIX.
  • I'd look at building my own client and DEFINITELY back-ending it with MS SQL 7.0 and their OLAP. Yes, yes, yes, it's the Spawn of Bill and it hasn't got any penguins in it, but it's still the best thing out there for under a gazillion bucks. It's also pretty easy to build PHB compliant or web-based front ends to it, as it's amenable to remote control from a pretty simple API(sic)

    Use your favoured web tech du jour to front it, but personally I'd be using a stack of client-side XML, even if that meant entering the IE5 tar pit.

    Whatever you do, start out by reading Kimball's / Inmon's books. The Data Warehouse Toolkit is abso-damned-lutely essential reading before starting. Buy it now! Give copies to your friends! Their pets! Even their PHBs!

    If the data volume is enormous, then switch to RedBrick - by then you'd be well into SuitSpace though.

  • I have on my workbench a product that will meet this need, though it is not entirely open (not critical): it's UNIX native, and its data formats are *open*, *readable* and *programmable*. Its multi-dimensional database is all of the above and extremely powerful. I would like to build this as a product if I can find some power users who are interested in working with me. .. Dan
  • Well, as far as I know Hyperion does indeed have plans to port Essbase to Linux.

    But then it would be nice to get something like Comshare DeciWeb's grids from a *NIX/Apache server (and having the charts and overviews would be just so damn cool!)

    BTW I use vim to edit Essbase/BudgetPlus scripts, anyone interested in exchanging macros?

    I've got:
    * syntax highlighting
    * a few macros (an enhanced "%" to match potentially nested fix/endfix, if/elseif/else/endif, !LoopOn.../!EndLoop)
    * Completion (type @ID^N to get @IDescendants)
    * a tcl/tk program to generate Tags (yup we use some unix native tools under Windows too)

    anyone interested?

