Building for UX: Connecting eye gaze to UI objects

Resource Details

  • Written by

    Lawrence Yau

  • Read time

    12 min

You've decided the time has come to try eye tracking as a form of user input. Maybe you've seen eye tracking technology in products from Sony, Meta, or Apple. Maybe you've thought "wouldn't it be awesome if…" while imagining application control and object interaction driven by eye movements. After all, nothing could be quicker than shooting a glance to choose an item.

Tobii fundamentals of eye tracking illustration

Once you start experimenting with gaze input, it will become clear that it's not like designing for a mouse or touchscreen. Your eyes are always moving. Even when they rest (i.e., during a fixation), there are small, involuntary movements, which you can learn about in Eye Movement: Types and functions explained.

Plus, there is always some uncertainty around where the user is actually looking versus where the eye tracker reports the user is looking. The amount of error is a function of both the eye tracking hardware and the person being tracked. These characteristics of eye movement and input signal quality add unique challenges to the creation of gaze-driven interfaces.

In this article, we'll learn how the basic UI concept of pointing requires special handling when creating interfaces with eye-based input.

What is the user looking at?

The basic function of an eye tracker is to tell the system where the user is looking. In general, that information is represented by a vector in space originating from the eye: the gaze vector. The user's gaze might be provided to applications as a point on the screen, the gaze position. When the gaze vector aligns with an interactive item, that item becomes focused.

Tobii Gaze diagram

The simplest implementation of gaze control for a screen-based UI would be to use gaze position in place of mouse position, then add some mechanism for activation, such as a gesture or button press. The simplest version of gaze control in 3D would be to ray cast from the gaze vector instead of from the controller or hand. While the approach is simple, there are reasons why eye gaze is not a drop-in substitute for hand-based pointer input.
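
To make that concrete, here is a minimal sketch of the 2D case in Python. It assumes a hypothetical tracker API that reports gaze position in screen pixels; the widget list, rectangle format, and activation callback are illustrative only and do not correspond to any specific eye tracking SDK.

    # Minimal gaze-as-pointer sketch: hit-test the reported gaze position
    # against rectangular widgets and activate on an explicit trigger.

    def point_in_rect(px, py, rect):
        """rect is (x, y, width, height) in screen pixels."""
        x, y, w, h = rect
        return x <= px <= x + w and y <= py <= y + h

    def widget_under_gaze(widgets, gaze_x, gaze_y):
        """widgets is a list of (name, rect) pairs; returns the first hit, if any."""
        for name, rect in widgets:
            if point_in_rect(gaze_x, gaze_y, rect):
                return name
        return None

    def on_activate(widgets, gaze_x, gaze_y):
        """Called on a button press or gesture: activate whatever is under gaze."""
        target = widget_under_gaze(widgets, gaze_x, gaze_y)
        if target is not None:
            print(f"Activating {target}")

    # Example usage (coordinates are arbitrary):
    # buttons = [("play", (100, 100, 200, 80)), ("stop", (320, 100, 200, 80))]
    # on_activate(buttons, *latest_gaze_position())   # latest_gaze_position is hypothetical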

Let's look at some important differences: 

How is eye gaze different from mousing or touch? 

Tobii Eye Gaze table

Lowest resolution and stability – Measured gaze can differ from actual gaze by
several degrees or more. Just as touch UIs need larger widgets than mouse-driven UIs to accommodate fingertip-sized input, gaze-driven UIs need even more space for each widget. Consider that a typical touchscreen keyboard is the width of a smartphone, whereas a typical eye-controlled keyboard spans a full-size tablet screen.

Gap between human and input resolution - Eyes can focus on tiny details, just as a mouse pointer can, but eye tracking cannot match the accuracy of a mouse. The conventional onscreen mouse pointer would be inappropriate to use with gaze input since it would nearly always be offset from where the user is looking and would present a visual distraction near the area of focus. In any case, people don’t need to be told where they are looking.

Input is secondary to scanning - Gaze tends to move everywhere due to its primary role of scanning visual information. User feedback and activation mechanisms should be compatible with scanning activity to avoid the Midas Touch problem, where users unintentionally activate objects by gazing upon them.

A deeper discussion of UI design with eye tracking can be found in Interaction Design Fundamentals.

What’s the best way to deal with eye tracker inaccuracy?

You may be wondering how to avoid user frustration when eye tracking accuracy is low and there is no visible pointer to help the user self-correct. Let’s look at several techniques to deal with inaccurate eye gaze information.

Solution #1 – Larger, center-weighted targets

Larger targets are easier to focus on; however, gaze positions near the boundary are still at risk of escaping outside the target. Therefore, the most visually salient features should be located toward the center of the target to guide the user's eyes away from the edges.
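
As a rough illustration, the following Python sketch computes a central "salient zone" inside a target rectangle where the icon or label could be drawn, while the full rectangle remains interactive. The rectangle format and the 0.6 fraction are assumptions for illustration, not recommended values.

    # Center-weighting sketch: the whole rectangle stays interactive, but the
    # visually salient content is confined to a centered sub-region so the
    # user's gaze is drawn away from the edges.

    def salient_zone(rect, fraction=0.6):
        """rect is (x, y, width, height); returns the centered inner rectangle
        where the icon or label should be drawn."""
        x, y, w, h = rect
        inner_w, inner_h = w * fraction, h * fraction
        return (x + (w - inner_w) / 2, y + (h - inner_h) / 2, inner_w, inner_h)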

Center weighted diagram

Advantages

  • Easy and intuitive to implement

Disadvantages

  • Impacts the UI aesthetic, making controls look chunkier
  • Consumes more screen real estate
  • Effectiveness reduced at larger distances in 3D UIs - targets shrink with distance

When to use

If the design is flexible, this is a simple and robust solution.

Solution #2 - Expanded hit region

The active zone of a gaze target is enlarged invisibly to capture gaze positions that are just outside the visual boundary. This technique is used in 2D and 3D interfaces to allow small or irregularly-shaped targets to be more easily activated. The expanded zone is transparent, so the apparent target size does not change.
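
A minimal Python sketch of the 2D case follows, assuming rectangular targets described as (x, y, width, height) tuples. The 24-pixel margin is a placeholder to be tuned through testing; in 3D, the equivalent step would be enlarging the collision mesh.

    # Expanded hit region sketch: the visible rectangle is unchanged, but the
    # hit test accepts gaze positions within an invisible margin around it.

    def hit_with_margin(px, py, rect, margin=24.0):
        x, y, w, h = rect
        return (x - margin <= px <= x + w + margin and
                y - margin <= py <= y + h + margin)

    def widget_under_gaze(widgets, gaze_x, gaze_y, margin=24.0):
        """widgets is a list of (name, rect) pairs. Prefer a direct hit on the
        visible rectangle; otherwise fall back to the expanded zone."""
        for name, rect in widgets:
            if hit_with_margin(gaze_x, gaze_y, rect, margin=0.0):
                return name
        for name, rect in widgets:
            if hit_with_margin(gaze_x, gaze_y, rect, margin):
                return name
        return None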

Tobii Expanded hit region diagram

Advantages  

  • Invisible, respects the visual design 
  • Easy to implement by adding active margins or enlarging the 3D collision mesh 

Disadvantages 

  • Not suitable for overlapping or tightly spaced targets – empty space around targets becomes interactive territory 
  • Hard to ensure clear space around objects in 3D – transparent collision meshes in the foreground may block visible background targets 
  • Getting the right margin/collider scale requires experimentation 

When to use 

Active margins are ideal for 2D grid-based UIs without overlapping or touching targets. The technique can also work in 3D if the caveats above are acceptable.

Solution #3 – Visible gaze direction 

Although problematic for the reasons mentioned above, visualizing gaze direction may make sense in certain circumstances, such as when the UI operation tolerates gaze offsets.

Tobii - Show gaze position

Advantages 

  • Providing user feedback generally empowers users to develop their own strategies for working around inaccuracy

Disadvantages 

  • Distracting and unnatural 
  • May frustrate users who experience larger gaze position offsets
  • May be more trouble than it’s worth 

When to use 

Rarely, if ever. If the interaction design benefits from a rough estimate of gaze, for example to highlight an area of the screen, showing a spotlight effect around the gaze position can provide feedback for UI operations while limiting distraction. The highlighted region should be large enough to encompass the user’s actual gaze. 
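
Here is a Python sketch of the spotlight idea, assuming the application can estimate expected gaze error in degrees and the display's pixels-per-degree. Both values and the drawing call are illustrative placeholders, not part of any real API.

    # Spotlight feedback sketch: size the highlighted region so it comfortably
    # covers the user's actual point of regard despite tracking offsets.

    def spotlight_radius_px(expected_error_deg=2.0, pixels_per_degree=40.0,
                            padding=1.5):
        """Radius of the highlighted region, padded beyond the expected error."""
        return expected_error_deg * pixels_per_degree * padding

    def draw_spotlight(draw_soft_circle, gaze_x, gaze_y):
        # draw_soft_circle is a stand-in for whatever drawing primitive the
        # UI framework provides (e.g. a low-opacity radial gradient).
        draw_soft_circle(center=(gaze_x, gaze_y),
                         radius=spotlight_radius_px(),
                         opacity=0.15)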

Solution #4 - Explicit disambiguation 

As with a confirmation dialog, the user is prompted to clarify or confirm their choice when the system is uncertain of their intent.

Disambiguation diagram

Advantages 

  • Handles difficult cases where target clustering is unavoidable 
  • Familiar interaction pattern that can be easy to learn 
  • Potential signature moment if designed well 

Disadvantages 

  • Design and development complexity 

When to use 

Consider this technique when the layout of visual targets can't be controlled and UI dialog features are available. Clarification may use a non-gaze input mechanism such as speech or body gesture. Additionally, context-sensitive behavior can identify and filter candidate targets to minimize dialog complexity.
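
A rough Python sketch of the candidate-gathering step is shown below, assuming rectangular 2D targets and an abstract prompt callback; the radius would come from the expected gaze error, and the prompt itself (a dialog, speech, or gesture) is left abstract.

    # Explicit disambiguation sketch: collect every target whose center lies
    # within the gaze-error radius and only prompt when more than one remains.

    import math

    def candidates_near_gaze(widgets, gaze_x, gaze_y, radius_px):
        """widgets is a list of (name, (x, y, w, h)) pairs."""
        found = []
        for name, (x, y, w, h) in widgets:
            cx, cy = x + w / 2, y + h / 2
            if math.hypot(cx - gaze_x, cy - gaze_y) <= radius_px:
                found.append(name)
        return found

    def resolve_target(widgets, gaze_x, gaze_y, radius_px, prompt_user):
        candidates = candidates_near_gaze(widgets, gaze_x, gaze_y, radius_px)
        if not candidates:
            return None
        if len(candidates) == 1:
            return candidates[0]        # unambiguous, no dialog needed
        return prompt_user(candidates)  # ask the user to pick one candidate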

Solution #5 – Machine learning algorithm 

This technique feeds gaze input and scene information to an algorithm that determines which object the user is looking at. Ideally, the algorithm is tuned to handle a variety of scenarios involving objects of different sizes in different locations, possibly in motion.
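
As a conceptual illustration only, the Python sketch below uses a deliberately simplified scoring heuristic; it is not Tobii's G2OM algorithm, and the weights and inputs are assumptions made for the example.

    # Simplified gaze-to-object scoring: rank candidates by angular distance
    # from the gaze ray and apparent (angular) size, and treat the best-scoring
    # object as focused. Weights are arbitrary placeholders.

    def score(angular_distance_deg, angular_size_deg,
              w_distance=1.0, w_size=0.3):
        return -w_distance * angular_distance_deg + w_size * angular_size_deg

    def focused_object(candidates):
        """candidates is a list of (obj, angular_distance_deg, angular_size_deg)
        tuples precomputed from the scene and the current gaze ray."""
        best, best_score = None, float("-inf")
        for obj, dist_deg, size_deg in candidates:
            s = score(dist_deg, size_deg)
            if s > best_score:
                best, best_score = obj, s
        return best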

 

Advantages 

  • Invisible, respects the visual design 
  • No UI constraints regarding minimum target sizes, clear zones or overlapping targets 
  • No need to tweak design parameters for best results 

Disadvantages 

  • Adds computational load that may require additional resources 
  • Algorithm is a black box and not necessarily portable 

When to use 

When the algorithm is available and computationally suited to the application, this solution is quick to implement and immediately improves the user experience. One implementation of this technique is Tobii’s G2OM (Gaze to Object Mapping) available for Unity applications. 

Summary 

User interaction driven by eye gaze is a natural evolution in humanizing computing experiences. Natural human eye movements and the variable signal quality of eye tracking devices create new challenges for effective UI design. Designers and developers can enhance user success, efficiency, and comfort by implementing UI techniques specific to gaze input.

 


Author

  • Tobii employee

    Lawrence Yau

    Sales Solution Architect, TOBII

    Lawrence is currently a Solution Architect in Tobii's XR, Screen-based, and Automotive Integration Sales team where he shares his excitement and know-how about the ways attention computing will fuse technology's capabilities with human intent. At Tobii, Lawrence is captivated by the numerous ways that eye tracking enables natural digital experiences, provides opportunities to improve ourselves and others, and shifts behavior to achieve more satisfying and sustainable lives. With these transformative goals, he is invested in the success of those who are exploring and adopting eye tracking technologies. He is delighted to share his knowledge and passion with the XR community. His restless curiosity for humanizing technology has taken his career through facilitating integration of eye tracking technologies, developing conversational AI agents, designing the user experience for data governance applications, and building e-learning delivery and development tools. Lawrence received his BE in Electrical Engineering at The Cooper Union for the Advancement of Science and Art, and his MHCI at the Human-Computer Interaction Institute of Carnegie Mellon University.
