It has been argued (e.g. Penzo, 2005) that eye-tracking can augment standard usability testing methodologies by providing quantitative as well as qualitative data, and by providing insight into micro-behaviours on a site. Standard think-aloud usability testing provides qualitative information about what testers are looking at and how they feel about a web page, but eye tracking can provide a wealth of other information such as:
- How testers' eyes move around a page - the 'saccades'
- How long testers look at elements of a page - the 'fixations'
- How often testers look at page elements - the 'number of fixations'
- How long they look at the page elements - the 'mean fixation duration'
- The dwell time within each screen
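The metrics above can be made concrete with a small sketch. This is an illustrative example, not something from the article: it assumes raw gaze data arrives as (timestamp, x, y) samples and uses a simple dispersion-threshold fixation detector (the thresholds are arbitrary for the example) to derive the number of fixations and the mean fixation duration.

```python
# Illustrative sketch (assumed data format and thresholds): detecting
# fixations in raw gaze samples with a dispersion-threshold approach,
# then deriving the metrics listed above.

def detect_fixations(samples, max_dispersion=25, min_duration=100):
    """samples: list of (timestamp_ms, x, y) gaze points.
    Returns fixations as (start_ms, end_ms, centre_x, centre_y)."""
    fixations = []
    i = 0
    while i < len(samples):
        j = i
        window = [samples[i]]
        # Grow the window while the points stay within the dispersion limit.
        while j + 1 < len(samples):
            candidate = window + [samples[j + 1]]
            xs = [p[1] for p in candidate]
            ys = [p[2] for p in candidate]
            if (max(xs) - min(xs)) + (max(ys) - min(ys)) > max_dispersion:
                break
            window = candidate
            j += 1
        duration = window[-1][0] - window[0][0]
        if duration >= min_duration:
            cx = sum(p[1] for p in window) / len(window)
            cy = sum(p[2] for p in window) / len(window)
            fixations.append((window[0][0], window[-1][0], cx, cy))
            i = j + 1
        else:
            i += 1  # too short to count as a fixation; slide on
    return fixations

# Example: 60 Hz samples - a steady fixation, then a saccade to a new point.
samples = [(t * 16, 100, 100) for t in range(10)] + \
          [(160 + t * 16, 400, 300) for t in range(10)]
fixes = detect_fixations(samples)
count = len(fixes)                                     # number of fixations
mean_dur = sum(e - s for s, e, _, _ in fixes) / count  # mean fixation duration (ms)
```

The jump between the two fixation centres is the saccade; summing fixation durations per page gives the dwell time.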
The main visual outputs from eye tracking that most people are familiar with are the heat map (essentially a summary of fixations) and the gaze plot (a summary of where testers have looked on a page).
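In data terms, a heat map is just fixation duration accumulated per screen region, and a gaze plot is the ordered sequence of fixation points. A minimal sketch, with the fixation format and grid size assumed for illustration:

```python
# Illustrative sketch (assumed data): a heat map accumulates fixation
# duration per screen cell; a gaze plot keeps the fixation order.
from collections import defaultdict

def heat_map(fixations, cell=100):
    """fixations: list of (duration_ms, x, y).
    Returns {(col, row): total_fixation_ms} per screen cell."""
    grid = defaultdict(int)
    for duration, x, y in fixations:
        grid[(x // cell, y // cell)] += duration
    return dict(grid)

def gaze_plot(fixations):
    """The ordered fixation points - what a gaze plot visualises."""
    return [(x, y) for _, x, y in fixations]

fixations = [(200, 120, 80), (300, 130, 90), (150, 500, 400)]
print(heat_map(fixations))   # {(1, 0): 500, (5, 4): 150}
print(gaze_plot(fixations))  # [(120, 80), (130, 90), (500, 400)]
```

A real tool would render the grid as colour intensity over a screenshot, but the aggregation itself is no more than this.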
Historically, eye-tracking equipment was cumbersome, difficult to calibrate and fairly unreliable, and it would not work for people wearing glasses or thick mascara. It has, however, improved significantly in recent years. It is now quick to calibrate, reliable (people can look away from the screen without disturbing the results), unobtrusive (you do not need to wear any special headgear) and works with most people (including those who like to wear lashings of mascara or thick specs). But it still requires specialist equipment, which is expensive, and it remains confined mainly to lab-based environments.
Producing meaningful heat maps or gaze plots requires large sample sizes - the Nielsen Norman Group (2006) argues that at least 30 testers are needed to produce a meaningful heat map. These large sample sizes then generate a great deal of data and a lot of post-session analysis. In a commercial environment, both the larger sample sizes and the extensive data analysis have implications for the turnaround speed and cost of a project.
Interpreting Eye Tracking Data
But the real issue with eye tracking is the interpretation of the data. A number of commentators (e.g. Spool, 2006; Graphpaper.com, 2006) have questioned the value of eye tracking, partly because of the practicalities but also because the results can be misinterpreted:
- Eye tracking can show us where someone was looking on a screen, but it can't tell us what they were seeing or thinking whilst looking at it - did they register what they were looking at, did they understand it? And if the tester did not look at something directly, did they see it with their peripheral vision?
- Does a long fixation mean that someone is looking at something because it is interesting to them, or because it's very confusing and they're having to spend time making sense of it?
- Does a busy gaze plot tell us that there are lots of interesting things to look at on a page, or that a user is confused about where to go to achieve their goals?
These commentators have also questioned whether eye tracking delivers additional diagnostic value commensurate with the additional cost and time: 'the good conclusions seem to be the same conclusions that a good UI designer (one who understands the desired effect of the design) would come up with without the aid of any tools, just going by their design instincts' (Graphpaper.com, 2006).
We have observed that the interpretation of heat maps has to be very task-specific. If a tester is asked to look for a specific item, this will strongly influence the way they look at a web page, and can have major implications for the conclusions that can be drawn. For example, in a Nielsen workshop on eye tracking, an example was given of a JCPenney.com home page where much of the page was taken up with an image of teenage bedroom furniture and linen. The task that had been set was 'look for a gift for a baby girl'. The heat maps showed that testers were looking at the main navigation - not the image. It was presented at the workshop that the image was a waste of real estate. Is that really the case, or was it just that the image wasn't relevant to this task? It may well have contributed to an impression of brand values and to showcasing the product. It would be very easy to manipulate conclusions from heat maps if they were used without other qualitative feedback on a site. As Spool (2006) says:
The colorful heatmaps are cool (or warm?) to look at, but what are they actually telling you? When someone is gazing at something, is it because they want to look there? Or because the page made them look there? Or because they are resting their eyes there?
We have found that eye tracking is a useful additional tool in assessing usability - but that it is actually most useful when used in real time, to enable our clients to get greater insight into how users look at a page. It is this real-time use, rather than the production of heat maps, that our clients find most valuable.
It's partly about engagement with the process. Spool (2006) relates the experience of usability testing at Google where the lab with the eye tracker gets most use because
the developers pay more attention to the test when the little dot is bouncing around the screen
It's true - we notice how our clients focus on the screen when the eye tracker is on. And it enables people who have had less exposure to users on sites to understand how users look at screens - the way they look at navigation bars, the way they read text, the way they ignore advertisements!
Time and time again, whilst observing usability testing sessions using the think-aloud protocol, we notice that what a person does is quite different from what they say - i.e. their behaviours are not consistent with their attitudes. Arguably, eye tracking enhances the outputs from usability testing by providing a different dimension to understanding those behaviours.
Analysing these two streams of data in parallel provides an even richer picture of issues which may get in the way of users achieving their goals, and which might affect the user experience on a site.
How and when to use it
The first and most obvious use of eye tracking is alongside conventional think-aloud usability testing. As an overlay on the screen being viewed by a tester, the eye tracking shows precisely what the user is looking at, leaving far less doubt about which screen elements are being looked at or have been noticed. This can add significantly to the understanding of the usability issues on a website. It is particularly valuable when a testing session, or a video of it, is being viewed by a client, as it makes explicit, without explanation, many issues that may be obvious to an experienced usability professional. We find it can be used concurrently with think-aloud, rather than as retrospective think-aloud (i.e. reviewing the recording afterwards with the tester), provided the tester is allowed to explore the site within a loosely structured scenario - focused on achieving one specific goal - rather than in a tightly scripted format (which we don't think is a very effective way to usability test a site anyway).
One area where we have found eye tracking to be of particular value is in comparing two versions of a page design or wireframes, or in assessing the effectiveness of specific screen elements, e.g. promotions, navigation bars etc. This type of use was investigated and validated by Bojko (2006), but again she comments that 'eye movement measures on their own have a limited applicability in user experience research and should almost always be used in combination with other measures, including behaviour and user attitudes/preferences'.
We rarely produce heat maps for our clients, often because budgets or project timescales do not allow for the required number of testers or the scale of post-session analysis.
But real time eye tracking, interpreted in the light of accompanying usability think-aloud evidence, provides greater insight than think-aloud alone, and enables our clients to get a much better understanding of what works and what doesn't work on their web pages.