Education Open Source Stats

Ask Slashdot: Statistical Analysis Packages For Libraries?

HolyLime writes "I'm a librarian in a small academic library. Increasingly the administration is asking our department to collect data on various aspects of our activities: classes taught, students helped, circulation, collection development, and so on. This is generating a large stream of data that is difficult and time-consuming to analyze qualitatively. For anything complicated, I currently use Excel, or an analogous spreadsheet program. I am aware of statistical analysis programs, like SPSS or SAS. Can anyone give me recommendations for statistical analysis programs? I also place emphasis on anything that is open source and easy to implement since it will allow me to bypass the convoluted purchase approval process."
This discussion has been archived. No new comments can be posted.

  • A good database? (Score:2, Interesting)

    by Anonymous Coward on Wednesday November 16, 2011 @03:42PM (#38076908)

    Hear me out. We deal with about 3 million data-producing elements and track in real-time to near-real-time. We ingest everything into MySQL (via macros, scripts, tools, etc.) and normalize the data on the way in. For analysis we simply query. Those queries may have their outcome displayed in a simple report generator, or (more often than not) via HTML5 Canvas graphs/charts, Cacti graphs, etc. What we're doing doesn't lend itself well to a SAS type solution. If you could use SAS for what you're doing, this probably wouldn't work for you.
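    The ingest-normalize-query pattern described above can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses SQLite from the Python standard library rather than MySQL so that it runs without a server, and the table names, column names, and sample rows are all made up for a library setting.

```python
import sqlite3

# Hypothetical raw rows as they might come out of a spreadsheet export:
# (date, desk name as typed, event type as typed, count)
RAW_ROWS = [
    ("2011-11-01", " Reference ", "Question", 12),
    ("2011-11-01", "reference", "question", 5),
    ("2011-11-02", "Circulation", "Checkout", 40),
    ("2011-11-02", "CIRCULATION", "checkout", 35),
]


def normalize(label: str) -> str:
    """Trim whitespace and lowercase so spelling variants map to one key."""
    return label.strip().lower()


def load(conn: sqlite3.Connection, rows) -> None:
    """Create a small normalized schema and ingest the rows into it."""
    conn.executescript(
        """
        CREATE TABLE desk (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE event (
            day TEXT,
            desk_id INTEGER REFERENCES desk(id),
            kind TEXT,
            n INTEGER
        );
        """
    )
    for day, desk, kind, n in rows:
        name = normalize(desk)
        # Each distinct desk is stored once; events refer to it by id.
        conn.execute("INSERT OR IGNORE INTO desk (name) VALUES (?)", (name,))
        (desk_id,) = conn.execute(
            "SELECT id FROM desk WHERE name = ?", (name,)
        ).fetchone()
        conn.execute(
            "INSERT INTO event VALUES (?, ?, ?, ?)",
            (day, desk_id, normalize(kind), n),
        )


def totals_by_desk(conn: sqlite3.Connection):
    """Once the data is normalized, analysis is a plain query."""
    return conn.execute(
        """
        SELECT desk.name, SUM(event.n)
        FROM event JOIN desk ON desk.id = event.desk_id
        GROUP BY desk.name
        ORDER BY desk.name
        """
    ).fetchall()


conn = sqlite3.connect(":memory:")
load(conn, RAW_ROWS)
print(totals_by_desk(conn))
```

    Cleaning labels on the way in is what keeps the later queries simple: "Reference" and "reference" end up as one desk instead of two.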

  • R and Python (Rpy2) (Score:3, Interesting)

    by mpetch ( 692893 ) <mpetch@capp-sysware.com> on Wednesday November 16, 2011 @03:51PM (#38077026)
    I have grown accustomed to doing statistical analysis with Python and R together, via rpy2: http://rpy.sourceforge.net/rpy2 [sourceforge.net]
  • by eldavojohn ( 898314 ) * <eldavojohn@noSpAM.gmail.com> on Wednesday November 16, 2011 @03:52PM (#38077042) Journal

    I also place emphasis on anything that is open source and easy to implement since it will allow me to bypass the convoluted purchase approval process.

    Sorry to burst your bubble, but if you want good support and easy implementation, you have to look for normal paid-for solutions. Besides, open source is not a synonym for free. This is especially true with specialized software or something you want good support for. Open source just means you get the code as well, so you can implement your own additions (without use of plugins) or change it.

    Your point may be valid, but it would be more convincing if you named some proprietary products that beat R and WEKA at their own game. Sure, I've used Matlab; it can't be beat in some respects and is heavily supported. But is it worth buying just because it effortlessly interfaces with Excel spreadsheets, when the person could get by with a simple export from Excel and an R script run on the resulting files? Not worth the cash, in my opinion. I don't go out and buy every piece of software to evaluate it, though. I'm aware of Matlab and Mathematica and have used them quite a bit ... but I still prefer R and WEKA. So, CmdrPony, go ahead and list all the proprietary point-and-click-omg-it-just-works software for our friend here. We're all waiting.
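    For what the export-then-script workflow looks like in practice, here is a minimal sketch. It is written in Python rather than R, the CSV text stands in for a file saved from Excel, and the column names and figures are invented for illustration.

```python
import csv
import io
import statistics

# Stand-in for a sheet saved from Excel as CSV; columns and values are made up.
EXPORTED_CSV = """month,classes_taught,students_helped
2011-08,4,110
2011-09,9,240
2011-10,7,205
2011-11,6,180
"""


def summarize(csv_text: str, column: str) -> dict:
    """Return basic descriptive statistics for one numeric column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    values = [float(row[column]) for row in reader]
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }


summary = summarize(EXPORTED_CSV, "students_helped")
print(summary)
```

    With a real export, `io.StringIO(csv_text)` would be replaced by an open file handle; everything else stays the same.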

    But unless you get a product from a company that is spending money to develop it, you never get good software and good support.

    Say, friendo, have you ever heard of Linux? Eclipse? Audacity? PostgreSQL? VLC?

    No one can make both because everything in this world costs money, and developers have to live too. The open source and free software model works well for the likes of Google and Firefox because the development gets paid for with money made from advertising. Statistical analysis software, and other specialized software, is a different matter.

    Can you tell me what advertising model is employed to funnel money through Firefox into Google? I mean, Google makes a competing product called Chrome -- the rendering engines are even different! What in the world are you free basing?

  • by LWATCDR ( 28044 ) on Wednesday November 16, 2011 @04:00PM (#38077142) Homepage Journal

    It almost seems like you are not doing statistics as much as creating reports from data.
    Maybe you should be using a database instead of a spreadsheet or a statistics program.
    The uber-geek way would be to set up a LAMP server and create a web-based system.
    The more convenient way would be something like Access.
    You can then use Excel or the database program to manipulate the data as needed.

    In the end, if you know Excel you may want to stick with it. I see people use Excel for databases all the time. Drives me a bit nuts, but sometimes whatever works is just fine.
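    To make the "reports from data" idea concrete, here is a rough sketch of a monthly activity report built with a database query. It assumes SQLite (bundled with Python) in place of Access or a LAMP stack, and the `activity` table, its columns, and the sample rows are hypothetical.

```python
import sqlite3

# Hypothetical activity log: (date, kind of activity, students involved)
ROWS = [
    ("2011-09-12", "class", 25),
    ("2011-09-20", "consultation", 1),
    ("2011-10-03", "class", 30),
    ("2011-10-15", "consultation", 1),
    ("2011-10-21", "consultation", 1),
]


def monthly_report(rows):
    """Load the rows into a table and aggregate them by month and kind."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE activity (day TEXT, kind TEXT, students INTEGER)"
    )
    conn.executemany("INSERT INTO activity VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT substr(day, 1, 7) AS month, kind,
               COUNT(*) AS sessions, SUM(students) AS students
        FROM activity
        GROUP BY month, kind
        ORDER BY month, kind
        """
    ).fetchall()
    conn.close()
    return result


for month, kind, sessions, students in monthly_report(ROWS):
    print(f"{month}  {kind:<12}  sessions={sessions}  students={students}")
```

    The same grouping could be done with a pivot table in Excel; the query version is easier to rerun unchanged each time new rows arrive.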

  • by Anonymous Coward on Wednesday November 16, 2011 @04:09PM (#38077244)

    He said he wants something that is easy to implement, and the only reason he is going with open source is because then he doesn't have to ask for purchase approval. Which IMO is a really stupid reason and will hurt in the long run - it's insane to take worse software just because you don't want to ask your boss if it's okay to buy this one.

    Horse shit. I've seen projects die because they couldn't get software through the approval process. Better to try 10 apps that are free and run in userspace (so no need to get IT involved for an Administrator install) than to wait for management approvals, budget cycles, and IT support, and never get the project done. If I'd done that on the job, I'd have been fired for taking too long to do my work.

    I also resent the implication that "free" means "worse."

    Sorry to burst your bubble, but if you want good support and easy implementation, you have to look for normal paid-for solutions. Besides, open source is not a synonym for free. This is especially true with specialized software or something you want good support for. Open source just means you get the code as well, so you can implement your own additions (without use of plugins) or change it.

    I'm guessing you haven't used R. Not only is there a thorough user manual [r-project.org], but there are books [amazon.com] from most major statistical and instructional groups on how to use R, AND the R-help mailing list [stat.ethz.ch] has answered every R question I've ever had, AND there are local R user groups [revolutionanalytics.com] where you can get support similar to how LUGs work.

    But unless you get a product from a company that is spending money to develop it, you never get good software and good support. No one can make both because everything in this world costs money, and developers have to live too. The open source and free software model works well for the likes of Google and Firefox because the development gets paid for with money made from advertising. Statistical analysis software, and other specialized software, is a different matter.

    Please shut up. If your assumption were true, R would not exist. R exists, so you're just an asshat.

    My advice to the original poster: Use R if you have any familiarity with programming. Any higher level math/stat course OR experience with basic programming will let you get started in R. If you've been doing this all in Excel already, you're probably ready to hop into R. If you're still uncomfortable, I'm sure one of the people who value your academic library could help out.
