I had an interesting telephone conversation with one of our clients recently about benchmarking their website. They are a major government organisation and are redeveloping their website. They wanted to benchmark the site before and after the redevelopment, and the main driver for the benchmarking seemed to be to justify the expenditure on the redevelopment.
Knowing that no such benchmark exists, I was forced to think about the best way to do this. What I came up with was as follows.
If you are going to try to benchmark something, you have to know what it is. What is a successful website? Well, that depends on what you want it to do. If you are selling widgets, then counting the number or value of widgets sold might be a good benchmark. But if yours is an information site, like this client's, how do you define success? Well, they didn't know. When I asked why they had a website I got no clear answer beyond 'Well, everyone has a website'.
So I had to define success for them. The definition I like for public sector information sites is: 'target users can find the information they want quickly and easily, and this information will be useful and relevant to them'.
There are three key elements to this definition: the target users, finding information quickly and easily, and the usefulness and relevance of that information.
Defining the target users was easy, i.e. civil servants and the Westminster village, but how best to measure whether users can find information quickly and easily? You could ask the users themselves, by putting them on the site and either getting them to do tasks of their own choosing or setting some tasks, and asking them to rate the ease of obtaining information on a suitable scale (e.g. 1-10). The problem with this, as anyone who has watched any usability testing will know, is that most users are no good at rating how easy it is for them to find things. You can watch the most appalling user experience, where the user fails to achieve any of their goals, and when you ask them how it was they say 'Oh, quite good, probably a 7 or an 8' - while the observers have their heads in their hands over how bad the site is!
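For what it's worth, if you did go down the self-rating route, the arithmetic side is trivial - the hard part is trusting the numbers, not computing them. A minimal sketch in Python, with invented tasks and ratings purely for illustration, might look something like this:

```python
# A sketch of summarising 1-10 task-ease ratings before and after a redesign.
# Task names and scores are invented for illustration only.
from statistics import mean

# task -> list of ease ratings (1 = very hard, 10 = very easy) from participants
before = {
    "find the ministerial contacts page": [7, 8, 6, 7],
    "download the latest annual report": [8, 7, 9, 8],
}
after = {
    "find the ministerial contacts page": [9, 8, 9, 8],
    "download the latest annual report": [9, 9, 8, 9],
}

for task in before:
    print(f"{task}: before {mean(before[task]):.1f}, after {mean(after[task]):.1f}")
```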
Alternatively, you could have an 'expert' do the rating instead of the user. They could record the goal success rate and time how long it took to achieve a task, on the basis that quicker is better. They could also develop a rating scale that took account of whether the user went straight to the goal or had to backtrack several times to find what they wanted. The problem with this approach is that we know the actual speed of achieving a task only loosely relates to a user's sense of satisfaction. We know that if all the links in a trail to a goal give off a strong 'scent of information' then, even if there are a lot of links in that trail, users will be more satisfied than they would be with a shorter trail of low-scent links. So the expert's view only partially reflects the user's actual experience.
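Again, tabulating the expert-recorded measures is the easy part. A minimal sketch, with invented session data (the field names are my own, not any standard), might look like this - and, as argued above, it would still only loosely track what the users actually felt:

```python
# A sketch of the expert-recorded measures mentioned above: task success,
# time on task and backtracks. The session data and field names are invented.
from statistics import median

sessions = [
    {"task": "find ministerial contacts", "success": True,  "seconds": 95,  "backtracks": 3},
    {"task": "find ministerial contacts", "success": False, "seconds": 240, "backtracks": 7},
    {"task": "download annual report",    "success": True,  "seconds": 40,  "backtracks": 0},
    {"task": "download annual report",    "success": True,  "seconds": 75,  "backtracks": 2},
]

successes = [s for s in sessions if s["success"]]
print(f"task success rate: {len(successes) / len(sessions):.0%}")
print(f"median time on successful tasks: {median(s['seconds'] for s in successes)}s")
print(f"mean backtracks per attempt: {sum(s['backtracks'] for s in sessions) / len(sessions):.1f}")
```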
Similar problems exist with measuring the usefulness and relevance of the content - though here the user's view is probably the best measure available. But users don't know how else the information might be presented. Some will say they are happy with that 50-page PDF even when the bit they want is a single sentence on page 43. Give them an alternative where they can go straight to what they want and they will give you an entirely different opinion of the PDF.
You could simply count the usability issues identified by an 'expert', with some sort of severity rating, e.g. minor, moderate, severe. But this assumes the expert identifies all the usability issues, and we know from Rolf Molich's CUE studies that this doesn't happen: different experts identify different usability issues on the same site. All this approach would show is that the site had eliminated the issues identified by that particular expert, which might not relate to the usability issues actually experienced by users.
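Counting issues by severity is, again, the easy bit; a minimal sketch with an invented issue list might be as simple as this - and a different expert would hand you a different list:

```python
# A sketch of tallying expert-identified issues by severity.
# The issue list is invented; a different expert would produce a different one.
from collections import Counter

issues = [
    ("site search returns nothing for common terms", "severe"),
    ("key content only available as a long PDF",     "moderate"),
    ("navigation labels use internal jargon",        "severe"),
    ("inconsistent link styling",                    "minor"),
]

print(Counter(severity for _, severity in issues))
# Counter({'severe': 2, 'moderate': 1, 'minor': 1})
```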
So the conversation with the client went round in circles for a while, discussing these sorts of issues, until the client said:
'Well of course, if the key managers who worry about whether I had spent the money on the development wisely could see the site was better, I don't have a problem.'
Which, of course, provided the solution to the problem. Just get the key stakeholders into a usability lab to watch some user testing of the original site and then, later, of the redeveloped site. They'll soon know whether it is better.
My guess is that watching just one test participant would answer the question of whether the money had been spent wisely, and would be far more convincing than any number of numbers written in a report!