5 tips for maximizing the value of eye tracking studies in VR
How to create the best virtual reality environment for your research project.
With almost a decade of eye tracking experience under his belt, Dr Tim Holmes is one of the great thought leaders within the industry. In this article he shares some useful lessons he's learned along the way.
The first tip is the direct result of a specific example I saw, where the gaze replay in the manufacturer's software did not make sense when visualized over the stimuli, because the images had been displayed at a different pixel resolution from the images used for visualization. This meant that fixations were not being visualized on the task-related objects in the stimulus. Now, as much as researchers would like participants to look "in the right place" all the time, that isn't realistic, and in this case, because some of the participants were not typically developing, it was credible that some of the fixations would be "a bit off", so the problem had been overlooked. In fact, the instruction slide in the experiment gave the game away: reading eye movements are fairly predictable and, sure enough, participants were showing a pattern of fixations consistent with reading but extending beyond the text in all directions. I have seen similar problems with screen aspect ratios, where a mismatch between the recording and visualization aspect ratios produced a slight but systematic offset of gaze points, typically in one dimension. The point to remember here is that fixations tend to cluster around regions of an image that are highly informative, highly salient, or strongly related to the task being performed. If your data clusters elsewhere, it's worth checking that your set-up is mapping the gaze data correctly. Another good solution is to always include some kind of calibration stimulus in your experiment so you can check that gaze is being mapped correctly.
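To make the resolution fix concrete, here is a minimal sketch of remapping recorded gaze coordinates onto a visualization image of a different resolution. The function and variable names are illustrative, not taken from any vendor's software, and it assumes both images share the same aspect ratio (if they don't, you also need to account for letterboxing offsets).

```python
# Minimal sketch: remap gaze points recorded against the resolution the
# stimulus was displayed at onto the resolution of the visualization image.
# Names are illustrative placeholders, not a vendor API.

def remap_gaze(points, display_res, viz_res):
    """Scale (x, y) gaze points from display_res to viz_res.

    points      -- iterable of (x, y) tuples in display pixels
    display_res -- (width, height) the stimulus was shown at
    viz_res     -- (width, height) of the visualization image
    """
    sx = viz_res[0] / display_res[0]
    sy = viz_res[1] / display_res[1]
    return [(x * sx, y * sy) for x, y in points]

# Example: recorded at 1920x1080, visualized over a 1280x720 export.
print(remap_gaze([(960, 540)], (1920, 1080), (1280, 720)))  # [(640.0, 360.0)]
```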
OK, so this could be a blog entry in its own right! As I work with a lot of people who are eye tracking in the real world rather than a university lab (yes, I know, some universities do real-world research too!), I frequently encounter studies where something unexpected has happened – be it the weather, the presence of other people in the environment, internal reflections in corrective lenses, or there simply not being enough time in the testing schedule to collect data from the participant. A little pre-testing would have resolved the majority of the problems I have ever encountered with eye tracking studies.
Now, when I say piloting, I don't just mean 'running a participant through the task' or reading through the script – I mean walking the route at different times of day for a shopper study, calibrating multiple participants with your chosen eye tracker in the actual environment you will be testing in, and running your analytics to ensure you can actually get the data you need from the study you've designed. And, of course, make sure you've actually selected an appropriate paradigm and eye tracker for the question you are researching. What makes sense in the lab or meeting room can suddenly seem fraught with issues once you actually try it out.
As an academic researcher, this is something I always do; as a commercial researcher, I know it is something there is often little time or budget for. But trust me, you can save yourself a whole lot of heartache by grabbing a couple of co-workers and getting them to act as participants before you commit thousands in budget and days of work to recruiting, testing, and analyzing the data from participants in a study you have not piloted. I guarantee that this one tip alone will result in better outcomes for everyone who tries it.
In academic research, we tend not to use heat-maps that often and instead focus on quantitative analysis of the actual gaze measures, such as average fixation duration, time to first fixation, and saccade latency. Commercially, there is a heavy reliance on the heat-map as the go-to visualization of all things eye tracking, but heat-maps only work if the following assumption is true: everyone in the sample behaved in roughly the same way over the same period of time. If there are spatial or temporal outliers in your sample, then the heat-map produced by most commercially available software will almost certainly be misleading, resulting in inaccurate insights and incorrect recommendations. If you must rely on heat-maps, then it is essential that you also look at individual gaze plots/replays to identify those outliers and remove them from the visualization you use to tell your story. Otherwise, I'm sorry to say, you are fudging your results, and whilst that might make your stakeholders/customers happy, it will not generate the gains you are promising them.
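If you do have to pool data into a heat-map, it helps to screen for spatial outliers first. Below is a hedged sketch of one simple screening approach, assuming you have fixation coordinates per participant: flag anyone whose fixation centroid sits far from the group's, then inspect their individual replays before deciding to exclude them. The z-score method and threshold are illustrative choices, not a standard procedure.

```python
import statistics

def flag_spatial_outliers(fixations_by_participant, z_thresh=2.5):
    """Flag participants whose fixation centroid sits unusually far
    from the group centroid.

    fixations_by_participant -- dict mapping participant id to a list
                                of (x, y) fixation coordinates
    """
    if len(fixations_by_participant) < 3:
        return []  # too few participants to judge outliers
    centroids = {
        p: (statistics.fmean(x for x, _ in fx),
            statistics.fmean(y for _, y in fx))
        for p, fx in fixations_by_participant.items()
    }
    gx = statistics.fmean(cx for cx, _ in centroids.values())
    gy = statistics.fmean(cy for _, cy in centroids.values())
    dists = {p: ((cx - gx) ** 2 + (cy - gy) ** 2) ** 0.5
             for p, (cx, cy) in centroids.items()}
    mean_d = statistics.fmean(dists.values())
    sd = statistics.stdev(dists.values())
    return [p for p, d in dists.items()
            if sd > 0 and (d - mean_d) / sd > z_thresh]
```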
Speaking of heat-maps, how many times have you noticed a hot-spot right in the center of the image? This results from something we call central fixation bias: a tendency for observers to look at the center of the scene, which is especially prevalent in the first few fixations following the onset of the stimulus. As a researcher, controlling fixation prior to presentation of a stimulus is something I do all the time, even in free-viewing paradigms, because knowing the origin of a participant's eye movements helps in the analysis of stimulus-driven attention. From a purely mechanical perspective, though, this means there will always be a cluster of gaze points around the fixation control target, usually a cross, even after it has been removed, because it takes time for the next eye movements to be planned and executed.
So, should your average market researcher be worried about this? Well, the answer is a big YES! One easy way to mitigate much of this effect is to analyze only those fixations which occur 0.5–1 second AFTER the onset of the image, as in the sketch below. This will at least mitigate most of the initial bias. But here's the really important thing to remember: if you want to test whether a brand, product, or claim will attract and capture attention, don't place it in the center of the scene. I previously advised on a findability study comparing competing candidate re-designs for a product. On the initial planograms I received, the new design was always in the middle of the middle shelf, meaning it would be highly unlikely that any difference between the designs would be detectable from the research. And research buyers, if you're ever presented with a heat-map showing a big red blob in the center of the screen and your product happens to be sitting underneath it, I advise you to take everything said about those results with a very large pinch of salt.
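As an illustration of that time-window trick, here is a tiny sketch that drops fixations from the first 500 ms after stimulus onset. The field name ('start_ms') is an assumed placeholder; adapt it to whatever your analysis software exports.

```python
# Sketch: exclude early fixations to reduce central fixation bias.
# 'start_ms' is an assumed field name, not a standard export column.

def fixations_after_onset(fixations, onset_ms, skip_ms=500):
    """Keep fixations starting at least skip_ms after stimulus onset."""
    return [f for f in fixations if f["start_ms"] >= onset_ms + skip_ms]
```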
It's no secret that I feel heat-maps are potentially misleading territory if not executed correctly, so this raises the obvious question: what SHOULD you do when analyzing your results? Moreover, if you are a commercial researcher who doesn't want to dive into raw data every time you run a study, are there easy tools you can use to add some value over and above the heat-maps you'll be REQUIRED to include in your presentations? Fear not, my friend, I do indeed have answers to these questions, but their scope is limited to those who are interested in how the visual scene might be affecting attention and/or the ability to complete a task. If you are interested in the mechanics of the eye movements themselves, I don't think there is any substitute for digging into the raw data.
In scientific research, we usually have a hypothesis that we want to test, something like "Participants in group A will find object 1 faster than object 2" – yeah, we're crazy like that! In commercial research, the questions might be framed more generally, and often use dangerously ambiguous concepts like "engagement" or "like", but in principle any research that's worth doing will have a question at its core: Does my product stand out on the shelf? Does the layout of this website reduce time to purchase? Does the Sat Nav distract the driver's attention from the road? What these questions have in common is that they ask about the timing or location of attention to objects in the visual scene, and for questions like this a few Areas of Interest (AOIs) can take your research to another level, because you can generate measures like Time To First Fixation, Accumulated Dwell Time, Revisits, etc., which are particularly useful if you're performing any kind of A/B testing.
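To make those measures concrete, here is a minimal sketch computing Time To First Fixation, Accumulated Dwell Time, and Revisits for a single rectangular AOI. The data layout (fixations as dicts with 'start_ms', 'duration_ms', 'x', 'y') is an assumption for illustration, not any vendor's export format.

```python
# Illustrative AOI measures from a fixation list. A fixation is assumed
# to be a dict with 'start_ms', 'duration_ms', 'x' and 'y'; the AOI is
# an axis-aligned rectangle (x0, y0, x1, y1).

def in_aoi(fix, aoi):
    x0, y0, x1, y1 = aoi
    return x0 <= fix["x"] <= x1 and y0 <= fix["y"] <= y1

def aoi_metrics(fixations, aoi):
    hits = [f for f in fixations if in_aoi(f, aoi)]
    if not hits:
        return {"ttff_ms": None, "dwell_ms": 0, "revisits": 0}
    ttff = min(f["start_ms"] for f in hits)      # Time To First Fixation
    dwell = sum(f["duration_ms"] for f in hits)  # Accumulated Dwell Time
    # Count entries into the AOI; every entry after the first is a revisit.
    entries, inside = 0, False
    for f in sorted(fixations, key=lambda f: f["start_ms"]):
        hit = in_aoi(f, aoi)
        if hit and not inside:
            entries += 1
        inside = hit
    return {"ttff_ms": ttff, "dwell_ms": dwell, "revisits": entries - 1}
```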
But you need to be careful with your definition of AOIs! For example, covering an entire web page or planogram with AOIs that leave no white space visible is typically not a good idea. Every eye tracker has a limit to its accuracy; you can find this out from the manufacturer, but usually it's around 0.5–1 degree of visual angle. What that means is that at approximately 60 cm viewing distance the gaze point is accurate to within roughly 0.5–1 cm; with a wearable eye tracker the distance to the target may be larger, and this figure increases accordingly. If your AOIs leave no white space in between, then fixations on the borders between your AOIs will still be reported, but it's possible that noise from the eye tracker's limited accuracy is assigning them to the wrong AOI and affecting your results. Moreover, our own placement of gaze is not always that accurate either; for example, when I am reading text I typically position my gaze slightly above the words rather than directly on them. What this means is that AOIs should typically cover an area slightly larger (by about 0.5 degrees in all directions) than the object you're interested in, and ideally there should be a gap between your AOI and any other AOI on all sides. This way, you can confidently report that gaze points in the AOI genuinely relate to that AOI and are not spilling over from an adjacent one.
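How much screen space 0.5 degrees corresponds to depends on your set-up, so here is a short worked sketch of the conversion. The numbers used (60 cm viewing distance, a 52 cm wide display at 1920 px) are example assumptions, not recommendations.

```python
import math

# Convert a visual angle into on-screen pixels for a given set-up, to
# work out how much padding to add around each AOI.

def degrees_to_pixels(deg, viewing_distance_cm, screen_width_cm, screen_width_px):
    size_cm = 2 * viewing_distance_cm * math.tan(math.radians(deg) / 2)
    return size_cm * (screen_width_px / screen_width_cm)

# Example set-up: 60 cm viewing distance, 52 cm wide display, 1920 px across.
pad_px = degrees_to_pixels(0.5, 60, 52, 1920)
print(f"Pad each AOI edge by about {pad_px:.0f} px")  # roughly 19 px
```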
This is the hopefully obvious corollary to tip 5a, because if you cannot be certain which AOI a person was looking at when AOIs touch, you CERTAINLY cannot be certain when they overlap! In this case, however, your eye tracking software will happily report fixations in two AOIs simultaneously, which means you might count fixations twice. Moreover, if the AOIs overlap because they occupy the same (x, y) coordinates but in different depth (z) planes, then it's also possible that the layering of the AOIs will cause your eye tracking software to allocate the gaze point to the wrong AOI if you have not allowed for occlusions or parallax corrections (i.e., whether the image on the retina is focused on the near or the far object).
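If overlap is unavoidable, one workable fallback is to make the assignment rule explicit so each fixation counts toward at most one AOI. A hedged sketch, where the front-most AOI in a priority-ordered list wins; the ordering rule is an illustrative choice you would need to justify for your own study.

```python
# Resolve each fixation to at most one AOI to avoid double counting.
# aois_by_priority: list of (name, (x0, y0, x1, y1)), front-most first.

def assign_fixation(fix, aois_by_priority):
    for name, (x0, y0, x1, y1) in aois_by_priority:
        if x0 <= fix["x"] <= x1 and y0 <= fix["y"] <= y1:
            return name  # first (highest-priority) hit wins
    return None  # fixation fell outside every AOI
```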
All this becomes especially important when working with wearable eye trackers, where AOIs don't just apply to static areas in the scene but are constantly in motion due to head movements. For this reason, tools like Tobii Pro Lab provide the ability to create dynamic AOIs and, to save you some of the pain of positioning them frame by frame, apply interpolation algorithms that animate their positions between key frames and let you specify when AOIs are visible, to cater for occlusions. Remember that these algorithms typically transform or morph a 2D AOI shape between key frames rather than matching the actual outline of the object the AOI is associated with, so you need to use enough key frames to accurately capture the acceleration/deceleration and resizing of the object during the study.
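To illustrate the key-frame idea (this is not Tobii Pro Lab's actual algorithm), here is a minimal sketch of linear interpolation of a rectangular AOI between key frames; real tools may use more sophisticated easing and shape morphing.

```python
# Linear key-frame interpolation for a moving rectangular AOI.
# keyframes: time-sorted list of (t, (x0, y0, x1, y1)).

def interpolate_aoi(keyframes, t):
    """Return the AOI rectangle at time t, or None outside the keyed
    range (e.g. when the object is occluded or off-screen)."""
    if not keyframes or t < keyframes[0][0] or t > keyframes[-1][0]:
        return None
    for (t0, r0), (t1, r1) in zip(keyframes, keyframes[1:]):
        if t0 <= t <= t1:
            a = 0.0 if t1 == t0 else (t - t0) / (t1 - t0)
            return tuple(c0 + a * (c1 - c0) for c0, c1 in zip(r0, r1))
    # Single key frame with t exactly at its time stamp.
    return keyframes[-1][1]
```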