What machines think people do: a basic primer on web analytics

In a previous life, I once wrote:

Fundamentally, evaluation should be about measuring performance objectively in order to make improvements:

  • measuring: involves a process for collecting, recording and sharing data, perhaps from a number of sources, or of different types
  • performance: how successful the activity has been, which means how well it met its objectives, budget and timeframe – including unintended outcomes or side-effects
  • objectively: trying to overcome natural personal and psychological inclinations to look on the bright side, remember ‘peaks’ or anecdotes, and try to consider every aspect of the activity fairly and in proportion
  • in order to make improvements: not just evaluating for its own sake, but with the aim of making it better in future, through refining techniques or developing individuals

Source: Connecting with Communities: a good practice guide to outreach, CLG (2006)

I still think there’s something in that definition, and it came to mind when someone challenged me to write an intro to web analytics – a field awash with data and trends where it’s more important than anywhere to ask yourself: why?

So why analyse?

Before you analyse, ask yourself two questions: ‘What is my site for?’ and ‘What do I want to achieve by analysing?’ The answer to the first will help you work out what kind of measures are worth paying attention to; the answer to the second will help you be clear on what the numbers really mean for you. It’s important, because analytics can help you do all kinds of things:

To track progress towards a goal. Web analytics can help you benchmark and track trends over time to see how your site is performing: if the purpose of your website is to sell things, how well are you doing that? If it’s to build a community, are people coming back? If it’s to build an easy-to-use web app, do people get beyond the front page? It’s an obvious point, but not all goals are the same, so Good for one site is Bad for another.

To compare approaches. Web analytics are all about comparisons. Analytics can help you see whether site A or site B sends you more traffic, whether a particular piece of link text works better or worse than the one you tried last month, and whether people are more interested in your blog posts on Kerry Katona or Kefalonia.

To assess the value of what you’re doing. Return on investment is a dirty phrase, but the bottom line is that analytics can give you some of the raw materials for a story about what you achieve for the effort and money you invest in your site. But it’s the story and the insight into why people visit and what they do when they come that’s really interesting.

To kill failure. Some sites or sections or campaigns flop for one reason or another. Analytics can tell you which ones they are, so you can try something else instead.

What do analytics look like?

Broadly defined, I’d say there are four main kinds of analytics:

  1. Server-side: these are based on the big log files stored by the server your website is hosted on, which adds a line each time a web browser requests something from it. Each line records the machine address, the page or image requested, and the requesting machine’s operating system and web browser, something like:
    123.123.123.123 - - [26/Apr/2000:00:23:48 -0400] "GET /pics/wpaper.gif HTTP/1.0" 200 6248 "http://www.jafsoft.com/asctortf/" "Mozilla/4.05 (Macintosh; I; PPC)"
  2. Client-side: these are stats collected using a service like Google Analytics or SpeedTrap, which uses a bit of JavaScript code in each page of a website to detect information about the page itself and the visitor’s machine. The tools are often able to provide quite a bit of detail – even sometimes providing ‘heat maps’ showing which parts of the page are most clicked on – but don’t record anything if the machine visiting the site has JavaScript turned off (as a few percent of visitors still do).
  3. Trackers & counters: particularly on the social web, you’ll find lots of services which track how many times something has been played, favourited, commented on or clicked through to. From comments on YouTube videos and views of a Flickr photo to clicks on a shortened bit.ly link from Twitter, all of these help to answer the ‘how many?’ question.
  4. Panels & data-crunchers: some services can tell you how popular your site is without even asking you. Well, they claim to. Tools like Alexa track the websites visited by people who install a certain toolbar, or are paid to be members of a certain panel, while services like Hitwise use data from internet service providers to track which websites are visited by their customers – and fascinatingly, what the demographic and consumer profile is of those people. Whether these numbers mean anything depends a lot on the profile of their sample, and whether they reflect visitors to your site – if you’re a big consumer site, there’s a chance they will. At the very least, they’ll give you some comparisons in terms of rank order and traffic volumes for some of those kinds of sites.
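To make the server-side case concrete, here is a minimal Python sketch that splits a log line like the sample above into named fields. It assumes the log uses the common Apache-style ‘combined’ layout shown in that sample; the field names and the parse_log_line function are my own labels rather than part of any particular analytics package.

```python
import re
from datetime import datetime

# Apache-style "combined" log layout, as in the sample line above.
# The group names are illustrative labels, not an official schema.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_log_line(line):
    """Return a dict of fields from one log line, or None if it doesn't match."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    # Convert the text fields that are really dates or numbers.
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    record["status"] = int(record["status"])
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


sample = (
    '123.123.123.123 - - [26/Apr/2000:00:23:48 -0400] '
    '"GET /pics/wpaper.gif HTTP/1.0" 200 6248 '
    '"http://www.jafsoft.com/asctortf/" "Mozilla/4.05 (Macintosh; I; PPC)"'
)
record = parse_log_line(sample)
print(record["ip"], record["path"], record["status"], record["referrer"])
```

Real log files vary with server configuration, so treat the pattern as a starting point to adapt rather than something that will fit every host.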

What to look for

Trends: the absolute number of visitors or pages is usually less interesting than the trend over time. Are more people coming than last month? Are weekends always low, or is it something we did? Trends help you establish whether a change or campaign consistently has a positive or negative effect.
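One rough way to answer the ‘are weekends always low?’ question is to compare each day with the norm for that day of the week. The Python sketch below does that with made-up numbers; the function names and the data are purely illustrative, and a real analysis would want more than four weeks of history.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean


def weekday_averages(daily_visits):
    """Average visits for each weekday (0 = Monday ... 6 = Sunday)."""
    by_weekday = defaultdict(list)
    for day, visits in daily_visits.items():
        by_weekday[day.weekday()].append(visits)
    return {weekday: mean(counts) for weekday, counts in by_weekday.items()}


def relative_to_norm(daily_visits, day):
    """Ratio of one day's visits to the average for that weekday."""
    norm = weekday_averages(daily_visits)[day.weekday()]
    return daily_visits[day] / norm


# Made-up data: four weeks of 100 visits on weekdays and 40 at weekends...
start = date(2009, 6, 1)  # a Monday
daily_visits = {}
for offset in range(28):
    day = start + timedelta(days=offset)
    daily_visits[day] = 40 if day.weekday() >= 5 else 100
# ...except for one Saturday when something drove extra traffic.
daily_visits[date(2009, 6, 20)] = 80

busy_saturday = relative_to_norm(daily_visits, date(2009, 6, 20))
ordinary_sunday = relative_to_norm(daily_visits, date(2009, 6, 21))
print(busy_saturday, ordinary_sunday)
```

A ratio well above 1 on a normally quiet day suggests something you did (or something someone else did) made a difference; a ratio near 1 suggests it is just the usual weekly rhythm.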

Journeys: when people come to your site, they move about. It’s a common fallacy to assume people visit the homepage, click on an item from the main menu, then an item from that page, then the next page, and then view your detailed page. But analytics often show the reality of people who land from a search engine deep within your site, unaware of your homepage. The journey tools within Google Analytics can help you spot where people give up and go elsewhere, so you can take action to make the journey smoother and keep the traffic (if that’s what your site’s about).

High & low stories: the high and low points can tell you some interesting stories in themselves – what made a particular blog post twice as popular as the norm? And was it bad timing or bad content maybe that made that other post sink like a stone?

Surprises: analytics are full of surprises, like the geographic origin of visitors (plenty of UK consumers read US sites, and vice versa), or the screen resolutions on which people are reading your site. Have a large cohort of iPhone readers browsing your site on a 320×480 screen? Consider tweaking your stylesheet.

What to look at

Hits v Pages v Visits v Uniques. A hit is a request for a single part of a web page, like an image or a stylesheet – so it isn’t a great measure, as a page with lots of pictures and associated files can have a lot of hits for very few actual people visiting. Pages is a better measure, more comparable to measures used in traditional media given the correlation with the number of times an advert is displayed, for example. A visit gets closer to the concept of ‘how many real people visited the website’, looking at the number of times someone came along to the site and viewed a set of pages in a single session. Finally, ‘unique visitors’ tries to de-duplicate visits from the same person or machine, to give a cleaner measure of the number of different people who came to the site. I generally find unique visitors the most valuable number, as it gives me a sense of the real human audience for the content.
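The difference between the four measures is easier to see when you count them from the same set of requests. This Python sketch is only an illustration: the list of asset file extensions, the 30-minute gap used to split one visit from the next and the visitor labels are all assumptions of mine, and real packages identify visitors and sessions in their own ways.

```python
from datetime import datetime, timedelta

# Assumptions: which files count as page furniture rather than pages,
# and how long a pause ends one visit and starts another.
ASSET_EXTENSIONS = (".gif", ".jpg", ".png", ".css", ".js", ".ico")
SESSION_GAP = timedelta(minutes=30)


def summarise(requests):
    """Count hits, pages, visits and uniques from (visitor, time, path) tuples."""
    page_views = [r for r in requests if not r[2].lower().endswith(ASSET_EXTENSIONS)]
    visits = 0
    last_seen = {}
    for visitor, when, _path in sorted(page_views, key=lambda r: r[1]):
        previous = last_seen.get(visitor)
        # A first appearance, or a return after a long pause, starts a new visit.
        if previous is None or when - previous > SESSION_GAP:
            visits += 1
        last_seen[visitor] = when
    return {
        "hits": len(requests),
        "pages": len(page_views),
        "visits": visits,
        "uniques": len(last_seen),
    }


t = datetime(2009, 6, 1, 9, 0)
requests = [
    ("a", t, "/"),
    ("a", t, "/style.css"),
    ("a", t, "/logo.gif"),
    ("a", t + timedelta(minutes=5), "/about/"),
    ("b", t + timedelta(minutes=10), "/blog/post-1/"),
    ("a", t + timedelta(hours=3), "/"),
]
summary = summarise(requests)
print(summary)
```

Here six hits boil down to four pages, three visits and two unique visitors, which is why the headline number depends so much on which measure is being quoted.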

Time on page. By seeing how long it takes, on average, for someone to move from page to page within your site, analytics work out the average time spent on each page and on the site as a whole per visit. It’s an interesting measure which can give you an indication of whether your content is being properly read or just skimmed. If your aim is engagement and your time-on-page is just a few seconds on average, there may be a problem – a longer time-on-page is generally thought better for most sites.
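As a rough sketch of how that calculation can work, the Python below takes the gap between one page view and the next within a visit as the time spent on the first page. The data layout and function name are my own invention; the important side-effect to notice is that the last page of each visit gets no time at all, because there is no following request to measure against.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def time_on_page(visits):
    """Average seconds per page, given visits as lists of (time, path) tuples."""
    durations = defaultdict(list)
    for page_views in visits:
        ordered = sorted(page_views, key=lambda view: view[0])
        # Pair each page view with the one that followed it in the same visit.
        for (start, path), (next_start, _next_path) in zip(ordered, ordered[1:]):
            durations[path].append((next_start - start).total_seconds())
    return {path: sum(times) / len(times) for path, times in durations.items()}


t = datetime(2009, 6, 1, 9, 0)
visits = [
    [(t, "/"), (t + timedelta(seconds=20), "/about/"), (t + timedelta(seconds=80), "/contact/")],
    [(t, "/"), (t + timedelta(seconds=40), "/blog/")],
]
averages = time_on_page(visits)
print(averages)
```

In this example the home page averages 30 seconds and the about page 60, while the two exit pages don’t appear in the results however long people actually spent reading them.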

Bounces. A special case of visits is the so-called ‘bounce’, where a visitor views only a single page on your site. Perhaps they come to the home page, realise you’re not for them, and click back to the Google search results. Or perhaps they land on the in-depth article they were looking for, and need to look no further. A lower bounce rate is generally thought better.
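Going by the definition above, a bounce rate is simply the share of visits that contain exactly one page view. A minimal Python sketch, with made-up visits and a function name of my own choosing:

```python
def bounce_rate(visits):
    """Share of visits with exactly one page view (0.0 if there are no visits)."""
    if not visits:
        return 0.0
    bounces = sum(1 for pages in visits if len(pages) == 1)
    return bounces / len(visits)


visits = [
    ["/"],                              # bounced off the home page
    ["/", "/about/"],
    ["/blog/post-1/"],                  # found the article and left
    ["/", "/blog/", "/blog/post-1/"],
]
rate = bounce_rate(visits)
print(rate)
```

Note that under this definition no time limit is involved: a visitor who reads one page for ten minutes counts as a bounce just as much as one who leaves in two seconds.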

Conversions. Some of the more complex functionality in Google Analytics lets you define goals and ‘funnels’ to analyse how people move through your site towards a defined sales objective – maybe downloading a document, completing a multi-page form or clicking the ‘buy’ button. Non-transactional government sites often don’t look as hard at this measure, but that’s not to say they shouldn’t. A lot of websites are created simply to look good or get lots of readers, but establishing some more stretching objectives like getting visitors to sign up to a newsletter, subscribe to an RSS feed or complete a form to join a ‘supporters’ scheme is more likely to show the value of the web longer-term in mobilising support and engagement from otherwise passing trade.
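The idea behind a funnel can be sketched in a few lines of Python: list the steps in order, then count how many visits reached each one. This is not how Google Analytics implements it, just an illustration of the concept; the newsletter URLs and the funnel_counts function are hypothetical.

```python
def funnel_counts(visits, steps):
    """Count how many visits reached each funnel step, in order."""
    counts = [0] * len(steps)
    for pages in visits:
        position = 0  # index of the next step this visit has yet to reach
        for page in pages:
            if position < len(steps) and page == steps[position]:
                counts[position] += 1
                position += 1
    return counts


# Hypothetical newsletter sign-up journey
steps = ["/newsletter/", "/newsletter/form/", "/newsletter/thanks/"]
visits = [
    ["/", "/newsletter/", "/newsletter/form/", "/newsletter/thanks/"],
    ["/newsletter/", "/newsletter/form/"],
    ["/newsletter/", "/about/"],
    ["/blog/"],
]
counts = funnel_counts(visits, steps)
conversion = counts[-1] / counts[0] if counts[0] else 0.0
print(counts, conversion)
```

Reading the counts from left to right shows where people drop out: here three visits start the journey, two reach the form and only one gets to the thank-you page.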

Referrers. Blimey, where did all those people come from? Referrer information tells you which site the visitor was on when they clicked on a link to your site. Looking at the list of sites which refer traffic to you can often open your eyes to unexpected organisations or individuals who found your content interesting and chose to link to you. ‘Direct entry’ generally means someone typed your URL in themselves or, in these days of desktop Twitter clients like TweetDeck, that they came to your site from a source outside of their main web browser. Many referrals are likely to come from search results pages which, in these days of Google dominance, are most useful in that they give you…

Keywords that people typed into the search engine in order to find the link to your site. These give you a sense of the popularity of different phrases used to describe your content, as well as some of the most amusing and surprising insights into your analytics – at the time of writing, this site is on the first page of results in Google for the phrase ‘sell stuff’.
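For the curious, here is a small Python sketch of how a referrer can be sorted into direct entry, search or an ordinary link, and how the search phrase can be pulled out of it. It assumes the search engine passes the phrase in the referrer URL, which many no longer do; the table of query parameter names is a simplified assumption and classify_referrer is my own function name.

```python
from urllib.parse import urlparse, parse_qs

# Assumed query parameter that carries the search phrase for each engine.
SEARCH_PARAMS = {"google": "q", "bing": "q", "yahoo": "p"}


def classify_referrer(referrer):
    """Return ('direct', None), ('search', phrase) or ('link', hostname)."""
    if not referrer or referrer == "-":
        return ("direct", None)
    parsed = urlparse(referrer)
    host = (parsed.hostname or "").lower()
    query = parse_qs(parsed.query)
    for engine, param in SEARCH_PARAMS.items():
        # Match the engine name as a whole part of the hostname.
        if engine in host.split(".") and param in query:
            return ("search", query[param][0])
    return ("link", host)


from_search = classify_referrer("http://www.google.co.uk/search?q=sell+stuff&hl=en")
from_link = classify_referrer("http://www.jafsoft.com/asctortf/")
from_nowhere = classify_referrer("-")
print(from_search, from_link, from_nowhere)
```

Counting up the results across all your visits gives you the referrer and keyword reports in miniature.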

Browser stats. In making design choices about your site, browser stats can tell you what proportion of visitors used outdated browsers (such as Internet Explorer 6) and therefore do or don’t need catering for, as well as the screen resolutions they have, which can inform what kind of layout you go for – often very useful when combatting the oft-quoted stipulation that government sites need to work for the sizable minority of visitors on 800×600 browsers. They’re a minority, but they’re not as sizable as the folks browsing your postage stamp pages at 1600×1200.

Social stats

Social media tools and platforms introduce a new dynamic. On the whole, you don’t get the richness of traditional web analytics (though some platforms such as Ning let you plug in Google Analytics code if you pay a bit extra). But on the flip side, your analytics are much more public, which introduces its own interesting dynamics of ‘popular content’ and feeds the ego.

  • Views of videos, pictures or presentations are probably the most straightforward, along with click throughs of link services like bit.ly (tip: take any bit.ly link e.g. http://bit.ly/3zfftT and add /info/ in the middle to see the public stats on that link – e.g. http://bit.ly/info/3zfftT). [UPDATE: And as Robin says in the comments below, it's worth mentioning the stats built into Feedburner, which lets you track the otherwise untrackable activities of people who come to your content via your RSS feed and never actually visit the site itself]
  • Comments are the next notch up, showing who has engaged with the content to the extent of responding to it, e.g. @replying to a Tweet
  • Shares in the form of bookmarks on services like Delicious, Digg or StumbleUpon or re-tweets on Twitter indicate people who liked or felt inspired to spread your content to their own networks or save it for later. Ditto for starring/favouriting items.
  • Embeds and responses in the form of inbound links to your site (which you can pick up by searching for link:http://blog.helpfultechnology.com on Google or Google Blogsearch or using services like BackType) are maybe the highest form of engagement, where people are moved to respond – hopefully positively – to your content.

Measuring social media stats is both easy (they’re often public, and pretty straightforward) and hard (they’re spread over lots of sites, and can overlap or tell conflicting stories). Tools like PostRank (h/t Treepixie) are emerging to help disentangle the mess, and put these stats alongside your own site’s web analytics.

What analytics don’t tell you

With so much information, it’s easy to assume that’s the whole story, but of course it isn’t. Web analytics tell you what machines think people do, not why they do it, or even who they really are. Beware of treating one-off spikes and troughs as trends or significant patterns – maybe Google just tweaked their algorithm that day, or your site went down for an hour without you noticing. It’s also hard to assess the true extent of engagement from hard stats alone, and that’s often better done from a deeper sense of what people who come to your site say in the comments and do when they send you enquiries and feedback forms. Above all, be careful of attributing cause and effect to the stories you see in the stats: use the flexibility of stats to compare alternative approaches before deciding that you’ve been doing it wrong. See what norms you can glean from tools like Alexa or Hitwise if they’re appropriate to your audience, or informally from friends and colleagues if like me you operate on a smaller scale. And remember that stats can’t tell you much about the who and why – so consider using an old-fashioned visitor survey or subscriber questionnaire (or even just a blog post asking people to tell you a bit about themselves in the comments) to understand the visitor profile of your readers and what they want when they get there. More about that in another post, I suspect.

Congratulations for getting this far – you can be sure I’ll be watching the time on page carefully to see if you read it properly :)

13 Comments

  1. Steph – awesome post, very comprehensive and accurate.

    Just one small omission – Feedburner? By the above reckoning I didn’t exist until I clicked through to leave this comment (there’s a bit more to Social stats too, but perhaps it helps to keep the pixie dust to ourselves for the moment)…

  2. @Robin: good point, added above.

    @Paul C: fair point :) I look at a lot of this through the lens of Google Analytics which is undeniably awesome for the price. But there are plenty of other packages worth exploring once you want to do more sophisticated things like track groups of pages, run more customised reports or get stats on internally-hosted tools.

    @Paul J: interesting – I’ve updated above.

  3. Heh :} I kindof seriously meant it! But you’re right, should add that disclaimer they always add ‘please remember that other packages are available …’

    One of the interesting things I’ve found with Google Analytics is that a fair few people don’t realise how much it can do, so they over-discount it. ‘It’s free so it can’t be that good’ – Google’s so big they think there must be some sort of (bad) hidden motive.

    Couple of add points: you can share views and that can be very useful for looking at why or whether a similar page is more/less successful (I’m going to be looking at promoting this for local government); through adding code to URLS you can track clicks in places like email newsletters, email clicks on pages, PDF clicks.

    The SocMed stats stuff is also useful btw (forgot to mention).

  4. I’m glad you included what analytics don’t tell you. They can give you volumes of valuable data, but nothing really beats watching someone, or surveying someone.

  5. Thanks a lot for this – exactly what I needed to analyse my 10 regular readers ;0)

    I actually think maybe my tumblr bounce rate isn’t so bad given that my updates are also on the homepage, or is there a time limit on it? Is it people who leave within 10 seconds of landing on the page or something like that?

    Anyway good thorough introduction for people like me.

  6. Wow, I feel smarter already. Thanks for the run-down. Plus, hey, PostRank got a mention, so even better!

    And where were you when we were beta testing Analytics? :)

    Actually, if you haven’t checked it out yet, I’d love to get you in there. Perhaps sweeten the pot with three free months in exchange for feedback/review? If you’re interested, just give me a holler.

  7. Great post Steph, I don’t know how I missed this. In many ways this is more useful than the COI guidance on analytics as an introduction to the subject.

    Interesting that you favour unique visitors as a measure of the “human audience”. With people potentially connecting through several devices, I tend to think of it as “unique devices”. Also, the “unique” bit means it only makes sense over a given time period. Are we interested in uniques per month, per quarter, per year etc? I guess it varies for different services.

    One thing I would mention is that, while client-side analytics may not report on Javascript-disabled browsers, server-side analytics tend to report all people from the same company as one person (due to accessing via the same IP address and user agent). Server-side data also needs to be filtered with the latest robots and spiders exclusion list while client-side has a more natural filter built in.

    Lastly, it might be worth mentioning combined metrics such as pages per visit or visits per visitor because they tend to bring these sometimes huge numbers into something more manageable. It’s amazing what dividing one by the other can do!

  8. A very good introduction for a beginner like me.

    I’ve had a question for some time which I can’t find an answer to, which you might be able to help with regarding Keyword site statistics.

    Am I right in thinking that if a keyword search features my site, it doesn’t necessarily mean it has been visited by a real person? It just means I’m in a list supplied by Google or whatever.