Statistical Analyzers for HTTP Logs?
krishnaD asks: "I have been using webalizer to generate access log reports for the site, but lately my customers are asking for statistics like the average amount of time visitors spend on the site; if a person reaches page X, what is the probability that from there he goes to page Y; from which links people exited the site; and so on. Basically, they are asking for a detailed flow analysis of visitor usage patterns. Are there any tools that will do this kind of analysis? I'd love to know what kind of tools other sysadmins use to generate reports for their clients."
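The "probability of going from X to Y" part of the question is just a first-order transition count over per-visitor page sequences. A minimal sketch in Python (the page paths and sample sessions below are made up for illustration; any log analyzer would first have to reconstruct the sessions):

```python
from collections import defaultdict

def transition_probabilities(sessions):
    """Given a list of per-visitor page sequences, return
    P(next = Y | current = X) for every observed page pair."""
    counts = defaultdict(lambda: defaultdict(int))
    for pages in sessions:
        # count each consecutive page-to-page transition
        for cur, nxt in zip(pages, pages[1:]):
            counts[cur][nxt] += 1
    probs = {}
    for cur, nexts in counts.items():
        total = sum(nexts.values())
        probs[cur] = {nxt: n / total for nxt, n in nexts.items()}
    return probs

# hypothetical reconstructed sessions
sessions = [
    ["/", "/products", "/buy"],
    ["/", "/products", "/support"],
    ["/", "/about"],
]
probs = transition_probabilities(sessions)
# probs["/products"] is {"/buy": 0.5, "/support": 0.5}
```

The hard part, as several replies below point out, is building the `sessions` list reliably from raw logs in the first place.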
Analog [analog.cx] may be the most popular, but I also found it rather difficult to set up and to get useful data out of.
Balam
Urchin - cream of the crop (Score:3, Informative)
If you get an account with Verio [verio.net], you will get your stats in Urchin for free.
Re:Urchin - cream of the crop (Score:2, Insightful)
When I worked for a company that used it, my complaints were:
- cost
- not open source (there were a few features I would have loved to add or alter)
One really beautiful feature is that you can incorporate your sales stats into the program. I haven't tried it yet, but from what I remember, it lets you directly check your sales/visitor ratio, where your purchasers came from, how long they stayed, etc.
Net.Genesis (Score:1)
They also have an API that you can use to build custom functionality and/or match data against other systems (like a customer database).
Re:Urchin - cream of the crop (Score:3, Informative)
But it doesn't have as much detail as other vendors like Webtrends. You can't really do campaign analysis.
Review of these two [nwc.com]
More linkage (Score:3, Informative)
Log Analyzers (reallybig.com) [reallybig.com]
Web Log Analyzers (2K Communications) [2kweb.net]
WebTrends (Score:2, Insightful)
Re:WebTrends (Score:5, Informative)
I've used WebTrends for about a year, and couldn't be less impressed. Randomly chokes on logs that webalizer handles without trouble. Hard-to-use interface. Reports a number of things that you really can't tell from web logs.
On the plus side, the PHBs love it.
Re:WebTrends (Score:1)
I have to pretty much agree with both points here, after using it a couple of years back. It actually seemed to get worse with new versions. And it was pretty costly for what it actually did.
Re:WebTrends (Score:1)
Note: by default, many Mac users cannot properly navigate WebTrends Reports (Log Analyzer) due to a Java issue. That matters where I work, since over half of the boxes are Macs.
Re:WebTrends (Score:2, Interesting)
I wrote some perl scripts and used the GD modules to simulate something close to WebTrends output until I came up with something better.
Soon I found analog. The charts were not nearly as pretty as WebTrends, but the numbers were accurate and it ran about 15-20 times faster.
Finally, I found the ReportMagic add on for analog, and I started creating accurate -- and attractive -- reports again.
Not difficult (Score:2)
Re:Not difficult (Score:5, Informative)
The best way to get around this is setting a session cookie via Apache. Then you key off that.
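For reference, this is roughly what the mod_usertrack approach looks like in the Apache config (a sketch: directive names are from the Apache docs, the module path and cookie name are placeholders; mod_usertrack exposes the cookie value to the log via the `%{cookie}n` note):

```apacheconf
# Have Apache issue a per-visitor tracking cookie
LoadModule usertrack_module modules/mod_usertrack.so
CookieTracking on
CookieName      session

# Log the cookie value alongside the combined format, so the
# analyzer can key sessions off it instead of guessing from IPs
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %{cookie}n" cookielog
CustomLog logs/access_log cookielog
```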
Re:Not difficult (Score:1)
That's fine for any new logs, but you also want something that works with your old data, even if it's not as easy a solution or requires a couple of different approaches. I don't think any manager type would be pleased without a retrospective view. (And if they didn't ask for it, adding it anyway can only help next time you ask them for a pay rise.)
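For old, cookie-less logs, the usual fallback is a heuristic: treat hits from the same IP and User-Agent as one visitor, and start a new session after a period of inactivity. A minimal sketch in Python (the 30-minute cutoff and the sample hits are assumptions, not anything the log format guarantees):

```python
from datetime import datetime, timedelta

TIMEOUT = timedelta(minutes=30)

def sessionize(hits):
    """hits: (ip, user_agent, timestamp, path) tuples, sorted by
    timestamp.  Returns a list of sessions (lists of paths), cutting
    a session when a visitor is idle longer than TIMEOUT."""
    last_seen = {}   # (ip, ua) -> timestamp of that visitor's previous hit
    current = {}     # (ip, ua) -> paths in the visitor's open session
    sessions = []
    for ip, ua, ts, path in hits:
        key = (ip, ua)
        if key in last_seen and ts - last_seen[key] > TIMEOUT:
            # idle too long: close the old session, start fresh
            sessions.append(current.pop(key))
        last_seen[key] = ts
        current.setdefault(key, []).append(path)
    sessions.extend(current.values())  # flush still-open sessions
    return sessions

t0 = datetime(2002, 1, 1, 12, 0)
hits = [
    ("1.2.3.4", "Mozilla/4.0", t0, "/"),
    ("1.2.3.4", "Mozilla/4.0", t0 + timedelta(minutes=5), "/faq"),
    ("1.2.3.4", "Mozilla/4.0", t0 + timedelta(hours=2), "/"),  # new visit
]
sessions = sessionize(hits)  # [["/", "/faq"], ["/"]]
```

Proxies and NAT make the IP+User-Agent key unreliable, which is exactly why the cookie approach above is preferred for new logs.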
Re:Not difficult (Score:2)
Then you run into people like me who routinely deny cookies unless the site has a valid reason for issuing them. This has become easier than ever for the average user with IE6's cookie management.
W3Perl (Score:4, Informative)
I'd recommend W3Perl http://www.w3perl.com/softs/index.html which is a kind of mess of perl scripts, but is surprisingly fast (much faster than other perl-only stats packages), and it is the most full-featured free package I've ever come across.
Setup is kind of a pain; it's rather complex, owing to the vast array of configurable thingies, but it works pretty well once it's put together.
There are some genuinely innovative features, such as a tree view of your website weighted by the popularity of each branch.
Worth a look if you are on a feature hunt. It requires some arcane image generation program to make the pretty graphs.
Oh, and if you were hoping to explore the code: be aware that the guy who wrote it is French.
Sawmill (Score:2, Interesting)
A couple of years ago, I did some research for webstats packages for our websites, and came up with a package that I haven't seen mentioned yet: Sawmill [sawmill.net] is the best tool for the kinds of questions you mentioned -- it can run as a CGI program (or as its own daemon) and does on-the-fly limiting, different reports, etc. So if they want to know what kind of browsers people were using in the Support section at 3am, they can get that.
I put together a Perl CGI to handle combining logs from all of our different servers, and then feed the combined log to Sawmill (or FunnelWeb, the other package we wound up using).
-Esme
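Combining per-server logs into one time-ordered stream, as described above, is just a k-way merge if each server's log is already sorted by timestamp. A minimal Python sketch (the server names and numeric timestamps are placeholders; real code would parse the Common Log Format date field first):

```python
import heapq

def merge_logs(*logs):
    """Merge several per-server hit lists, each already sorted by
    timestamp, into one combined time-ordered stream.  Each hit is
    a (timestamp, line) pair."""
    return list(heapq.merge(*logs, key=lambda hit: hit[0]))

www1 = [(1, "www1 GET /"), (5, "www1 GET /faq")]
www2 = [(2, "www2 GET /"), (3, "www2 GET /buy")]
combined = merge_logs(www1, www2)
# combined is ordered by timestamp: 1, 2, 3, 5
```

`heapq.merge` streams lazily, so this scales to logs too large to hold in memory if you feed it file iterators instead of lists.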
Assumptions (Score:2, Interesting)
Do not assume that people browse with just one browser window. I cannot speak for others, but normally, when I leave a site, I close the window that site was in. It is not often that I follow a link out; if there are interesting links, I open them in new windows. It is not uncommon for me to have 16-32 windows open, often on 2-4 desktops.
Yes, I know there are tricks to discourage this sort of browsing. Those also discourage me from visiting the sites, if I can find friendlier alternatives.
Re:Assumptions (Score:1)
I always figured they were using some kind of best-guess algorithm, i.e. the first page of a session would be the one without a local referer, and the last page of the session would be the last page visited with a local referer since the session started. Pulling links over to another window, I'm pretty sure, still sends the referer. It might screw with the "visit path" features, but not with session time or exit page.
Can anyone with more experience shed some light on how it is done?
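The best-guess rules described above are simple to state in code. A sketch in Python (the site name, numeric timestamps, and sample hits are all made up; this is the heuristic as the parent describes it, not any particular product's algorithm):

```python
def summarize_session(hits, site="example.com"):
    """hits: time-ordered (timestamp, path, referer) tuples for one
    visitor.  Entry page = first hit whose referer is empty or
    off-site; exit page = last hit; session time = last - first."""
    entry = next((p for t, p, r in hits if site not in (r or "")),
                 hits[0][1])
    exit_page = hits[-1][1]
    duration = hits[-1][0] - hits[0][0]
    return entry, exit_page, duration

hits = [
    (0,   "/",         "http://google.com/search"),
    (40,  "/products", "http://example.com/"),
    (100, "/buy",      "http://example.com/products"),
]
entry, exit_page, duration = summarize_session(hits)
# entry "/", exit "/buy", duration 100
```

Note the built-in blind spot: the time spent reading the final page is invisible, since nothing after the last hit is logged.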
add to webalizer (Score:2)
Even if you don't find stats packages that do what you want, you can make webalizer a lot better.
One word: Excel (Score:2, Interesting)
Re:One word: Excel (Score:2)
The good thing about doing the raw logs is that they give a better idea of how much traffic has passed through the site. This is usually of more use to smaller sites that have to pay by bandwidth used, or, more specifically, to their providers.
If you're just looking at tracking specific data, there's no easier way than to have all that data written to a database. You can have your web bugs tweaked to save off exactly what you want; you get all your data, no extraneous crap, and you can track whatever you like.
ModLogAn (Score:4, Interesting)
It produces similar reports, but it can work with a lot of servers, including FTP servers, firewalls, a bunch of web servers, RealServer, SHOUTcast, Squid, etc.
Sawmill (Score:1)
It's not free, but it is very nice.
Jeremy
phpOpenTracker (Score:1)
if buying software is not a problem... (Score:1)
http://www.quest.com/funnel_web/analyzer/
We used to use ILux (Score:2)
WebTrends Log Analyzer (Score:1)
Blue Martini (Score:1)