The Internet

Obtaining Multi-Tier Application Logs for Reseach? 40

arohann asks: "I'm a research assistant in a well-known university in the US. As part of the research work my group is doing, we need access to the logs from a production system of an n-tier web-application. I've been looking around for a while with no result. Most places reply with a flat 'No!'. I was wondering if there is anyone who could help/advise with this. Please read about our requirement below and do let me know if you can help?"
"We want to examine the request arrival behaviour of a real-world web-application and will also need to examine how long each request takes to be processed at each tier. We would collect this data over a few days and then use it to build a real-world model of the request behaviour of an internet application. This model would be used in our analysis and profiling of clustered, multi-tier, internet applications.

Of course, we realize it may be that some of this data cannot be shared due to client privacy concerns. However, let me assure you that we are not interested in any client details and we're not particularly concerned with what kind of an application it is as long as it's at least 3-tier, is a production system (we need a real-world model), and is used daily. We are also willing to sign a confidentiality agreement if necessary and follow any company protocol required to ensure that security and confidentiality are preserved.

Of course, if this results in any research paper publications, we would give credit to the supplier of the data.

Hoping to hear back from everyone soon ;)"
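
For readers wondering what "request arrival behaviour" might mean in practice, here is a minimal sketch, assuming the log can be reduced to a list of arrival timestamps in seconds. The function name `interarrival_stats` and the input format are hypothetical and are not taken from the question. It computes the mean gap between consecutive requests and the coefficient of variation of those gaps (values near 1 are what a Poisson-like arrival process would give, larger values suggest bursty traffic).

```python
from statistics import mean, pstdev


def interarrival_stats(arrival_times):
    """Return (mean gap, coefficient of variation) for arrival timestamps in seconds.

    The input format is an assumption: one timestamp per request, in any order.
    """
    ordered = sorted(arrival_times)
    # Gap between each request and the one that arrived just before it.
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    if not gaps:
        raise ValueError("need at least two arrivals")
    avg = mean(gaps)
    # Coefficient of variation: population standard deviation over the mean.
    cv = pstdev(gaps) / avg if avg > 0 else float("inf")
    return avg, cv
```

Only the timestamps are needed for this part of the analysis, so client details never have to leave the site that owns the logs.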
This discussion has been archived. No new comments can be posted.

  • i doubt you will get to see logs like that without taking them by force
  • No!
  • a candy bar?

    Before you laugh, it goes amazingly far where I work....
    • You would be surprised what you can get for $8 and a candy bar [sptimes.com].

      That said, the OP may find that if he gave us the log analysis tools and algorithms he wants to apply to the log files, a bunch of us would run the analysis on our own logs and send him the results. That way he would get the benefit of a slew of different data sets, instead of just one or two.

      Companies are not too keen on sending out their internal datasets (Sarbanes / Oxley might have something to do with that, or the thought of being caught i
  • My Suggestions, (Score:3, Insightful)

    by colemanguy ( 915683 ) <deanklenda@c o l emanguy.com> on Saturday November 12, 2005 @11:47PM (#14018482) Homepage
    My guess is you're gonna need to try to contact them using something other than email, probably some sort of certified letter.
  • no (Score:1, Insightful)

    by chris_mahan ( 256577 )
    I will echo that: No!

    System logs are for the machine's administrators and for software developers, not researchers.

    If you guys want research material, build your own systems and sink in the tens of millions of dollars to do that. If your app is decent you'll have more log data than you could possibly wish for.

    • by Nik13 ( 837926 )
      Well, I doubt they'd manage to get logs from my workplace either (not for free at least)... But it should not cost tens of millions to build a somewhat simple 3 tier web app (especially with all the free tools that can generate a good part of the code for you, or using O/R mappers like [N]Hibernate). Then collecting data is a non issue (you could even offer someone to make them an app to solve a real world problem in exchange for some logs perhaps - sounds like a good deal).

      Although I see limited use for a
      • by Nik13 ( 837926 )
        (I hate replying to myself but anyways...)

        and will also need to examine how long each request takes to be processed at each tier.

        That can vary greatly in N-Tier apps. In N-Tier apps, many people put business logic in sprocs in the database (and sometimes some in the clients... poor design usually although can be used to "double check" things), while others will have exactly 100.0% of the BL in the Business Logic Layer and none anywhere else (none in sprocs whatsoever). Things like that will affect results g
  • I'm not familiar with this "reseach" that you speak of, but it doesn't sound like something I'd want to donate my logs to. It sounds like some sort of weird internet version of felching.. Ew. No thank you.
  • by jhoger ( 519683 ) on Sunday November 13, 2005 @12:11AM (#14018558) Homepage
    I'd start with companies that are already offering internships through your university. Find professors and graduate students that already have a working relationship with private sector folks and get introduced through them.

    Just cold calling or sending in letters or email is about as effective as you've found it to be.

    Also you should try looking through published articles in trade journals and find out which companies are sponsoring research in your field by association with existing published research.

    The fact is that you'll certainly have to sign an NDA and likely they will have to scrub the data anyway. One way or another it's going to cost the donors $$$ that you aren't going to reimburse. Your project will have to fit in with their research goals or they'll be returning a favor from someone else.

    -- John.
  • by deranged unix nut ( 20524 ) on Sunday November 13, 2005 @12:11AM (#14018559) Homepage
    Your best bet would be to have your professors call in a favor from former students or their contacts in the industry.

    Most companies will consider this to be a security risk. They don't even want you to know the rough design of their backends let alone collect data from it.

    Some companies wouldn't know how to gather what you want and wouldn't risk letting you touch their systems.

    Most of these systems are probably messy, kludged together by former employees and hacked by current employees just enough to keep them running.

    If you have some time, get an internship and do your research on the side. :)

  • by stienman ( 51024 ) <adavis&ubasics,com> on Sunday November 13, 2005 @12:31AM (#14018621) Homepage Journal
    I'm a research assistant in a well-known university in the US. As part of the research work my group is doing, we need access to the logs from a production system of an n-tier web-application.

    Welcome to capitalism, we hope you enjoy your stay. While here, please note that TANSTAAFL [wikipedia.org].

    Asking for data from a business requires a lot of work on your part. You must somehow convince them that all the effort they are going to spend collecting, sanitizing, and providing you with the information is going to pay off for them in a reasonable way. Since this request involves several months of data, and more employee involvement than a 5 minute survey, you'll have to build a strong relationship with a company who has this data.

    Opportunities include:
    • The research will help you identify areas where improvement will save $$$ in [bandwidth|speed|latency|etc]
    • We can supply one or more interns to do all the internal work as well as work on a few other projects of your choosing
    • You (manager, CEO, IT lackey) got your degree here and still have fuzzy feelings for the school
    • Oh benevolent ones! May we sip at the firehose? Verily, this research will help this university provide graduates of the caliber which will dazzle the eyes! Yea, they will be cheap, too.
    The key here, as in everything to do with business, is to network, network, network. Don't email - you cannot possibly explain your research in a way that will make them go, "Gee, I think I'd like to devote company resources to these kids at the university of whatever!" in an email. At best send an email such as, "Dear sir, blah blah blah, we are researching n-tier applications and would like to spend a few moments talking with you about your architecture. When would be a good time to call?" Give it two days - Call them in any case except if they patently refuse to talk to you. Don't engage in email conversations - in order to get good buy-in, you need to talk to them (if only briefly) so they can associate a voice with the email. Then email all you want.

    You may have better luck calling at the outset, introducing yourself and your research, then asking who at the company would be suited to help you out with your research. Then engage that person. Don't get too low on the totem pole or you may end up with someone who is ineffective within the company at getting you what you want. Certain companies (Google, for instance) are resource rich and may be easier to work with, especially if you can get one or two workers involved and spending their 20% time helping you. If your research isn't exciting on a general level, you're in for a rough ride.

    Once you've started a conversation (with several people at different companies - you're still trying to get something they will be reluctant to give) then you can start edging into what you need to complete your research. This whole process will take 2-6 months just to set everything up. I hope you've started early.

    Good luck.

    -Adam
  • ...
    > if this results in any research paper publications,
    > we would give credit to the supplier of the data.

    If that's all you offer in return, which company will allocate the resources to verify that (a) this breaches no privacy laws and (b) business advantage isn't sacrificed? ...And rightly so for companies whose constitution is 'maximise profit'.

    Some suggestions:

    1. Offer a quid-pro-quo to companies you contact: in return for access, you will deliver (say) a multi-page detailed architectural review and specific recommendations on potential improvements, reviewed, say, by your professor.

    2. Talk to people who run websites for non-profits, or open-source/ creative-commons websites like wikipedia.org, sourceforge.net, even slashdot. The attitude there may be more sympathetic to your efforts and the admins more willing to knock up a few Perl scripts to strip logs of sensitive information.

    3. Offer to be a website maintainer for a large independent open-source / community effort and obtain agreement on your access to logs.
  • I can probably help (Score:2, Interesting)

    by abradsn ( 542213 )
    I've written a couple of these (including one for an extremely large software company that I'm sure you've heard of), and I'm currently working on one right now on the side (for my own personal gain).

    If you reply to this comment with your email address, then maybe we can work something out.

    I need some help with testing my current project, and you need some data. It's actually more work for me to have someone besides myself test the software but the quality should be higher and it could help you out.
    • Hi, Yes I'm interested in the logs you have to offer for our research. Can you tell me a bit more about the application itself ? My email is arn129@psu.edu Thanks and regards Arjun
  • If you are from a "large university", how come you can't find any big app log files right on campus? Most "large universities" have plenty of "n-tier web-applications". Methinks your request smells bad.
    • If you are from a "large university", how come you can't find any big app log files right on campus? Most "large universities" have plenty of "n-tier web-applications". Methinks your request smells bad.

      At my university the response would be a big fat NO!. I have asked the admins for some software that I desperately needed to do my research work efficiently (OSS) and they just said "no that's a security risk". I fail to see how some basic image processing software is a security risk, but they have it in t

  • You say you want it to make a business model etc, but you don't say what your final goal is.

    Since you're already on /. tell us what you really want in the end, and we might find a workaround. One rule with asking questions is never assume you are right. You ask for help, so let us draw the conclusions.
  • ... of building yourself a net of vmwares for this purpose?
  • I can't help, but it smells fishy to me. Why don't you tell us the name of the "well-known university"? Where is your email address? Why don't you answer the post that asks for your email address? Why are the questioners of an Ask Slashdot question never replying to the answers?
  • Send a Perl program that changes all the fields that might be sensitive, so that the person can test it and convince themselves they're not going to leak information to you.

    That will make them realize that you understand some of the constraints they are under, and that you're a nice person (:-))

    In particular, transform
    http://www.sin.com/porno [sin.com] into
    193'd-seen-HTTP-address

    --dave
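
The transformation described above can be sketched roughly as follows. This is only an illustration in Python rather than the Perl the comment suggests, and the class name `UrlScrubber`, the URL pattern, and the log line layout are all assumptions. Each distinct URL is replaced by a token recording the order in which it was first seen, so repeated requests for the same address stay recognisable as repeats while the address itself never leaves the site.

```python
import re

# Assumption: URLs in the log are whitespace-delimited and start with http:// or https://.
URL_PATTERN = re.compile(r"https?://\S+")


class UrlScrubber:
    """Replace each distinct URL with an 'N'd-seen-HTTP-address' token."""

    def __init__(self):
        # Maps a URL to the ordinal of its first appearance; stays with the data owner.
        self._seen = {}

    def _token(self, match):
        url = match.group(0)
        if url not in self._seen:
            self._seen[url] = len(self._seen) + 1
        return f"{self._seen[url]}'d-seen-HTTP-address"

    def scrub_line(self, line):
        # Timing and status fields are left untouched; only URLs are rewritten.
        return URL_PATTERN.sub(self._token, line)
```

A log owner could run it over a few lines, inspect the output, and check for themselves that nothing sensitive survives before sending anything out. Other fields such as client addresses or user names would need the same treatment.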

  • Does sourceforge fit your definition?
  • Um, if you're at a "well-known university in the US", then I'm sure your campus IT and Admin departments have some interesting setups. Try getting log data from the on-line registration system, during the register/add/drop period at the beginning of a semester - plenty of traffic and data moving around.

    +0, Obvious?

    Christ, what do they teach in schools these days?
  • The way to make this happen is to offer something more than credit in return. If you are going to collect a lot of data on an application, you'll have the opportunity to give them some cool reports about where the bottlenecks are. Offer the reports in exchange for the data. Sure, they *could* make the reports themselves, but most places never do.
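
A bottleneck report of the kind mentioned above might look something like this minimal sketch. It assumes the per-tier processing times have already been extracted from the logs into a dictionary. The function name `tier_report`, the tier names, and the choice of statistics are hypothetical. It ranks tiers by their share of the total processing time, largest first.

```python
from statistics import mean, quantiles


def tier_report(samples):
    """Summarise per-tier processing times.

    samples: dict mapping a tier name to a non-empty list of per-request
    processing times in seconds (an assumed input format).
    Returns one row per tier, sorted by share of total time, largest first.
    """
    totals = {tier: sum(times) for tier, times in samples.items()}
    grand_total = sum(totals.values())
    if grand_total <= 0:
        raise ValueError("no processing time recorded")
    rows = []
    for tier, times in samples.items():
        # quantiles(..., n=20) yields 19 cut points; the last is the 95th percentile.
        p95 = quantiles(times, n=20)[-1] if len(times) >= 2 else times[0]
        rows.append({
            "tier": tier,
            "mean_s": mean(times),
            "p95_s": p95,
            "share": totals[tier] / grand_total,
        })
    return sorted(rows, key=lambda row: row["share"], reverse=True)
```

The first row is the tier where most of the time goes, which is usually the first thing a site owner wants to know.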

    If you can't get in on your own, convince some important vendor (e.g., IBM's Websphere group) that you're legit and can help them if they'll help you.

    And if nobody