Take It Up a Level

Volume 2 Issue 4 

 Logfiles vs JavaScript
Quick Question: An advertising pundit once said, "The most important word in the vocabulary of advertising is TEST. If you pretest your product with consumers, and pretest your advertising, you will do well in the marketplace."

First to respond with the correct answer wins a gift certificate to Starbucks.

This article outlines the pros and cons of two sources of data for tracking web site traffic and campaign information: logfiles (files generated by the server and read later) and JavaScript (code inserted into each page of the site that pings a different server, often hosted by a third party). It has been adapted from ClickTracks' original document. If you would like to learn more about how this data is processed, please contact ClickTracks directly.

Logfiles
Logfiles have been around since the beginning of the web. Basic analysis of this data started with open source programs like Analog (note: the author of Analog is Dr. Stephen Turner, the CTO of ClickTracks). Logfiles easily yield useful data such as the bandwidth consumed on the server, 404 errors, and peak usage. More complex data, like visitor sessions or unique visitors, requires more sophisticated analysis. Modern log analyzers may also determine the following, but be sure to confirm this with the solution provider.
  1. Visitor Sessions: Determined with good accuracy if the analysis software is able to strip graphics files, join distinct pages into a single visitor session (coping with problems caused by dynamic IP addresses), and manage session timeouts.
  2. Session Accuracy: Improved if a session cookie is available. Ask whether the standard session cookies generated by JSP, ASP and PHP are supported, or whether a custom session cookie can be configured.
  3. Unique Visitors: Available if a unique, persistent cookie is available. Ask whether the solution provider can manage the cookie and map it back to an original campaign, even if clickthrough and purchase are separated by many weeks.
  4. Tracking visitors across multiple sites/domains: Cookies are often not transferred when the user moves from domain to domain, so heuristics are needed.
  5. Robots and spiders: Ask whether they can be filtered out and placed into separate reports.
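As a rough illustration of items 1, 2 and 5, the sketch below groups already-parsed log hits into visitor sessions. It is not how ClickTracks or any particular analyzer works. The 30-minute timeout, the list of graphics suffixes, the robot markers, and the hit field names (`ip`, `agent`, `path`, `time`, `session_cookie`) are all assumptions chosen for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumed values for illustration; real analyzers make these configurable.
SESSION_TIMEOUT = timedelta(minutes=30)
GRAPHICS_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png", ".ico", ".css")
ROBOT_MARKERS = ("bot", "spider", "crawler")


def is_robot(user_agent):
    """Crude robot check based on substrings of the user-agent string."""
    agent = user_agent.lower()
    return any(marker in agent for marker in ROBOT_MARKERS)


def build_sessions(hits):
    """Group hits into visitor sessions.

    Each hit is a dict with 'ip', 'agent', 'path' (without query string)
    and 'time' (a datetime), plus an optional 'session_cookie'.
    Returns (sessions, robot_hits); each session is a list of hits.
    """
    by_visitor = defaultdict(list)
    robot_hits = []
    for hit in hits:
        if is_robot(hit["agent"]):
            robot_hits.append(hit)  # kept aside for a separate report
            continue
        if hit["path"].lower().endswith(GRAPHICS_SUFFIXES):
            continue  # strip graphics and other non-page requests
        # Prefer the session cookie; fall back to IP plus user agent.
        key = hit.get("session_cookie") or (hit["ip"], hit["agent"])
        by_visitor[key].append(hit)

    sessions = []
    for visitor_hits in by_visitor.values():
        visitor_hits.sort(key=lambda h: h["time"])
        current = [visitor_hits[0]]
        for hit in visitor_hits[1:]:
            # A long gap between page views starts a new session.
            if hit["time"] - current[-1]["time"] > SESSION_TIMEOUT:
                sessions.append(current)
                current = [hit]
            else:
                current.append(hit)
        sessions.append(current)
    return sessions, robot_hits
```

The fallback key of IP address plus user agent is exactly the heuristic that dynamic IP addresses and shared proxies undermine, which is why a session cookie improves accuracy when one is present in the logfile.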
Problems With Logfiles
Much has been said about logfiles and their disadvantages. To summarize:
  1. Both session and persistent cookies must be present in the logfile. It is the responsibility of the website to set them. There is nothing complicated about doing this, but some companies cannot gather the IT resources needed to include them. In the long term, online businesses should complete this task themselves, but other factors delay or prevent it.
  2. Caching of pages by ISPs and proxies can distort the data and lead to inaccuracies. For a while this was a major differentiator promoted by solution providers selling only JavaScript solutions, which suffer less from caching problems. In general, the number of cached pages has declined as the cost of maintaining the cache hardware has outweighed the cost of the bandwidth saved. Nevertheless, logfile data remains somewhat inaccurate.
JavaScript
JavaScript (sometimes known as 'client-side tagging', 'page tagging' or erroneously as 'cookies') requires some code to be pasted into each page to be tracked. When the page is loaded in the end user's browser, a request is sent to a server (often part of a third party service) and the data is gathered.
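To make the request concrete, here is a minimal Python sketch of both ends of such a tracking call: the URL a page tag might request, and how a collection server might turn it back into a hit record. The endpoint `https://stats.example.com/hit.gif` and the parameter names (`page`, `ref`, `sid`) are hypothetical; real services define their own.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_beacon_url(endpoint, data):
    """Build the URL a page tag would request, with data in the query string."""
    return endpoint + "?" + urlencode(data)


def parse_beacon(request_url):
    """Turn a tracking request URL into a flat hit record on the server side."""
    query = parse_qs(urlsplit(request_url).query)
    # parse_qs returns a list of values per key; keep the first of each.
    return {name: values[0] for name, values in query.items()}


# Hypothetical endpoint and parameter names, for illustration only.
url = build_beacon_url(
    "https://stats.example.com/hit.gif",
    {"page": "/cart.html", "ref": "https://example.com/", "sid": "abc123"},
)
hit = parse_beacon(url)
```

In a real deployment the first half runs as JavaScript in the visitor's browser; it is written in Python here only so that the round trip can be shown in one place.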

JavaScript gained popularity very quickly because it makes it very easy to generate reliable visitor session data. Since the script is able to set its own cookie, companies requiring good session data can get it without needing the main site to set a session cookie. When IT resources are already spread thin, it is useful to have responsibility for setting the cookie pushed elsewhere. By extension, JavaScript can also set a persistent cookie.

JavaScript can also more easily extract data from the contents of the page, which is often not available in the URL. The total purchase value of a shopping cart is a good example.

JavaScript-based tracking also nicely sidesteps the problem of tracking over multiple domains, since the session cookie exists inside the domain where the data is gathered, and not the domains of the site.

Problems with JavaScript
  1. Some server activity, such as redirects and PDF downloads, is opaque to the script and cannot be tracked.
  2. Logfile analysis is still needed for technical stats like bandwidth / 404s.
  3. In almost all cases the data is trapped on a third party service. As a company grows and becomes frustrated with its present system, it must evaluate the switching costs.
  4. While more accurate, it is still not perfect. For example, JavaScript errors, DNS failures and other glitches result in no data being recorded, whereas a logfile would be unaffected.
  5. Pages become less reliable as more JavaScript is added. The problem manifests as reduced overall reliability rather than as easily identified failure points.
  6. The cookies issued are 'third party' in that they do not originate from the domain hosting the web pages. For session cookies this is OK, but persistent cookies require special handling through P3P compact policy headers. Present JavaScript code should handle this (ask your provider), but implementers should be aware that future changes to IE and other browsers might clamp down on third party cookies.
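The 'third party' distinction in item 6 comes down to comparing the host of the page with the domain that set the cookie. The sketch below is a deliberately naive version of that comparison, assuming plain suffix matching on host names. Browsers apply more involved rules, and the host names used are hypothetical.

```python
def is_third_party(page_host, cookie_domain):
    """Return True if cookie_domain does not cover the host serving the page.

    Naive suffix matching only; an illustration, not a browser's algorithm.
    """
    page = page_host.lower().rstrip(".")
    domain = cookie_domain.lower().strip(".")
    return not (page == domain or page.endswith("." + domain))


# Hypothetical hosts: the site itself versus an external tracking service.
first_party = is_third_party("www.example.com", ".example.com")
third_party = is_third_party("www.example.com", "stats.tracker.example.net")
```

Here `first_party` is `False` because the cookie domain covers the page's host, while `third_party` is `True` because the tracking service lives under a different domain.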
It Is Your Choice
Web analytics is a complex subject, but the underlying technology is relatively simple. There are, after all, only two ways to obtain the data. ClickTracks' open approach extends to your choice of how to gather the data. Find out how ClickTracks aims to help you make the right choice.


