The Future of Eye Tracking


November 22, 2000

What are you looking at? In the next couple of decades your computer may be able to answer that question. Dozens of prototype systems currently under development in research institutions allow their creators to interact visually with the PC in 2D and 3D ways never achieved before. Computer processing speeds are nearing the point where they are no longer the limiting factor for vision tracking; the real limits lie in our ability to build an efficient physical sensing mechanism. Potential applications extend far beyond interacting with a PC and into realms such as vehicle and aircraft steering, advertising, disability aids, MRI scanning, sociological study, microscope control, remote control of robotics, and even following infant or animal eye activity. What might this technology look like? The final solution might be accomplished through a video camera embedded in glasses, a desktop camera, measurements of electric fields in contact lenses, measurements of electric potential around the eye, or even low-powered eye-scanning lasers. To make an accurate prediction about the look and feel of future eye tracking mechanisms, we will investigate the advantages and disadvantages of current prototypes and then discuss the future enhancements that will usher in the new interface.

A popular approach is video image analysis from a camera attached to a pair of glasses. Clemson University's research focuses on providing a "gaze contingent virtual environment" through a pair of head-mounted virtual reality (VR) glasses. The hardware requirements of eight MIPS processors with 8 GB of RAM are demanding by today's standards, but within the next 20 years the equivalent will adorn the average desktop PC. The VR glasses themselves are nothing new; what is new is the small video camera contained within them that provides a feed of the user's eye movements (figure 1). The images are fed into the computer, and the pupils' locations are translated into tracing movements on a monitor. Since the end goal of the Clemson research is a system that lets users view aircraft virtually for visual inspection and job training, a three-dimensional world inside a cargo bay is used for experimentation (figure 2).

Figure 1. Clemson University's VR eye tracking lab.

Figure 2. Clemson University's 3D cargo bay with a visual trace of the user's eye movements.

The "true" use of the third dimension raises complexity to an entirely new level: in addition to mapping the x and y eye coordinates, the camera must also measure the size of the pupil to estimate the focus level. This is nearly impossible to measure accurately because pupil size varies significantly between individuals, and where the eye is focused depends on other factors as well. On Clemson's system only the x and y coordinates are mapped, so the user navigates left/right/up/down in the 3D world while forward and backward movements are input via a keyboard or joystick. This 3D method of control would not only be a valuable training exercise for airplane mechanics, but it could also be the basis of a fantastic gaming experience.
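As a rough illustration of this split between gaze and keyboard input, the following sketch maps tracked x/y gaze coordinates to the view direction while a conventional device supplies forward motion. All names and the sensitivity constant are hypothetical; Clemson's actual implementation is not published here.

    import math
    from dataclasses import dataclass

    @dataclass
    class Camera:
        x: float = 0.0
        z: float = 0.0
        yaw: float = 0.0    # radians, left/right
        pitch: float = 0.0  # radians, up/down
        speed: float = 2.0  # world units per second

    SENSITIVITY = 1.5  # assumed radians/second per unit of gaze offset

    def update_camera(cam, gaze_x, gaze_y, forward_input, dt):
        """Advance the camera one frame.

        gaze_x, gaze_y: gaze position normalized to [-1, 1],
        with (0, 0) at the screen center.
        forward_input: -1..1 from a keyboard or joystick, since
        depth cannot be recovered from x/y gaze alone.
        """
        # Gaze offset from the screen center steers the view.
        cam.yaw += gaze_x * SENSITIVITY * dt
        cam.pitch += gaze_y * SENSITIVITY * dt
        # Depth movement comes from the conventional device.
        cam.x += math.cos(cam.yaw) * forward_input * cam.speed * dt
        cam.z += math.sin(cam.yaw) * forward_input * cam.speed * dt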

The Interactive Systems Labs at Carnegie Mellon have decided against glasses and replaced them with a simple video camera on the desktop. According to their web-site, "Many eye gaze tracking methods rely on intrusive techniques". The desktop system can be easier to market to the public, since video cameras are relatively cheap and don't appear "geeky". Unfortunately, the computer must perform the additional tasks of finding the face, then finding the eyes, and lastly determining the direction of the pupil (figure 3). The user must also remain in direct range of the camera, whereas with glasses the user has more freedom to move about and turn away from the monitor. This drawback becomes irrelevant, however, in applications such as Windows navigation for the disabled, as in the ERICA system developed at the University of Virginia (figure 4) and the Eyegaze communication system by LC Technologies.

Figure 3. Face tracking system at Carnegie Mellon.

Figure 4. ERICA eye tracking system for the disabled.

Using this technique, Carnegie Mellon researchers have developed an interface for viewing panoramic images, in which the user scrolls through a 360-degree panorama with his or her gaze and zooms in and out with spoken commands.
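The face-then-eyes-then-pupil pipeline can be sketched with off-the-shelf detectors. The sketch below uses OpenCV's stock Haar cascades rather than CMU's own method, and the darkest-pixel pupil estimate is a simplifying assumption:

    import cv2

    face_cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    eye_cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_eye.xml")

    def find_pupils(frame):
        """Return pupil centers as (x, y) points in frame coordinates."""
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        pupils = []
        # Step 1: find the face.
        for (fx, fy, fw, fh) in face_cascade.detectMultiScale(gray, 1.3, 5):
            face = gray[fy:fy + fh, fx:fx + fw]
            # Step 2: find the eyes within the face region.
            for (ex, ey, ew, eh) in eye_cascade.detectMultiScale(face):
                eye = face[ey:ey + eh, ex:ex + ew]
                # Step 3: take the darkest pixel as a crude pupil estimate.
                min_loc = cv2.minMaxLoc(cv2.GaussianBlur(eye, (7, 7), 0))[2]
                pupils.append((fx + ex + min_loc[0], fy + ey + min_loc[1]))
        return pupils

Each stage narrows the search region for the next, which is what makes the extra work tractable in real time.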

As a variation on glasses-free image analysis, Penn State University's eye tracking facility uses an infrared device that "can measure eye movements of individuals who are seated in front of its infrared (IR) emitter." Under this procedure the pupil appears as a bright spot to the camera because of its IR-reflective properties, allowing the camera to easily pick out the exact location of the eye.
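Because the bright pupil dominates the IR image, locating it can be as simple as thresholding and taking a centroid. A minimal sketch, with an assumed threshold value that a real system would calibrate:

    import numpy as np

    def bright_pupil_center(ir_frame, threshold=240):
        """Locate the pupil in an IR image via the bright-pupil effect.

        ir_frame: 2D uint8 array from a camera aligned with the IR
        emitter; the retina reflects the IR back, so the pupil
        saturates the sensor.
        """
        ys, xs = np.nonzero(ir_frame >= threshold)
        if len(xs) == 0:
            return None  # no bright spot: subject blinked or looked away
        # The centroid of the bright region approximates the pupil center.
        return float(xs.mean()), float(ys.mean())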

The Kwangju Institute of Science and Technology is a prominent researcher in retina-based user interfaces. Coupled with voice recognition and lip reading, their video camera system picks out facial features and then tracks their movements. Their experiments revealed certain limitations stemming from the variability of human eye shapes: although the prototype locates the eyes quite accurately under optimal lighting and distance conditions, it was unable to locate the eyes of certain individuals. An example of this limitation is shown in figure 5. Once the eye is picked out, its movements can be used to control a tracing tool on the screen (figure 6).

Figure 5. Limitations in eye scanning techniques at the Kwangju Institute of Science and Technology.

Figure 6. Using the eye tracking tool to trace a path on the screen with Kwangju's eye tracker.

The University of Bielefeld in Germany has taken eye tracking beyond the PC and into the realm of sociology. Their strange-looking contraption (figure 7) analyzes the scene the participant is viewing: a small camera mounted on the unit records points of interest as the user looks around. In essence, it feeds an image to a monitor that lets anyone watching see what the person wearing the eye tracker sees. Such a system could potentially let a computer analyze the environment the user watches over time and draw conclusions about which types of visual cues interest the user. By analyzing a scene over a period of time, a summation of the areas viewed can be assembled into an image showing which areas within the frame of reference are viewed most intensely (figure 8).

Figure 7. The Bielefeld cyborg contraption.

Figure 8. A 2D Gaussian distribution plot from the Bielefeld system shows which areas are viewed most over a period of time (lower right).
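The "summation of the areas viewed" amounts to accumulating a weighted 2D Gaussian at each fixation. A small sketch of that idea; the sigma value and the (x, y, duration) sample format are assumptions for illustration:

    import numpy as np

    def gaze_heatmap(fixations, width, height, sigma=25.0):
        """Sum a 2D Gaussian per fixation to show where attention dwelled.

        fixations: list of (x, y, duration) gaze samples in image coordinates.
        sigma: spread of each Gaussian in pixels (an assumed value).
        """
        yy, xx = np.mgrid[0:height, 0:width]
        heat = np.zeros((height, width))
        for x, y, duration in fixations:
            # Longer fixations contribute proportionally more weight.
            heat += duration * np.exp(-((xx - x) ** 2 + (yy - y) ** 2)
                                      / (2.0 * sigma ** 2))
        if not fixations:
            return heat
        return heat / heat.max()  # normalize to [0, 1] for display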

Perhaps one of the more bizarre approaches to eye tracking has been the use of contact lenses (figure 9). While such methods are intrusive to the user, their accuracy is worth noting. According to a thesis by Arne Glenstrup at the University of Copenhagen,

"By making the user wear a special contact lens, it is possible to make quite accurate recordings of the direction of gaze. Two lens techniques exist: by engraving one or more plane mirror surfaces on the lens, reflections of light beams can be used to calculate the position of the eye. And by implanting a tiny induction coil into the lens, the exact positioning of the lens can be recorded through the use of high-frequency electro-magnetic fields placed around the user's head."

Unfortunately, high-frequency electro-magnetic fields are not entirely safe, and many users who do not already wear contacts may decide against putting them in solely to add eye tracking to their computer. On the other hand, once the contacts are in, the user is completely free of wires and head-mounted devices.

Figure 9. Using contact lenses to find gaze direction.

Like contact lenses, the method of measuring electric potential around the eye (electrooculography) has generated little publicity. Using an array of extremely sensitive electrical sensors placed around the eye, the computer can estimate where the eye is looking by reading which muscle fibers are activated (figure 10). Unfortunately, the accuracy of these electrical readings is far from perfect. The sensors must also be very carefully attached around the eye with an uncomfortable sticky material, making public acceptance unlikely.

Figure 10. Electrical signals in the muscles surrounding the eye can be interpreted to determine eye direction.
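In the simplest reading, each electrode pair yields a potential difference roughly proportional to gaze angle along one axis. A hypothetical sketch: the gains are placeholder values that a real system would obtain by per-user calibration, and the averaging stands in for the much heavier filtering such noisy signals need.

    def gaze_from_potentials(h_microvolts, v_microvolts,
                             gain_h=0.05, gain_v=0.05):
        """Estimate (horizontal, vertical) gaze angles in degrees.

        h_microvolts: potential difference across the horizontal
        electrode pair; v_microvolts: across the vertical pair.
        gain_*: assumed degrees per microvolt, set by calibration.
        """
        return gain_h * h_microvolts, gain_v * v_microvolts

    def smooth(samples, window=16):
        """Average the most recent samples to suppress sensor noise."""
        recent = samples[-window:]
        return sum(recent) / len(recent)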

A technique that has not yet been explored is closely related to the virtual retinal display (VRD). A company called Microvision is the primary research force behind VRD, a technology developed at the University of Washington (figures 12 and 14). It allows them to "paint" rows of data across a person's retina using a low-powered laser, so that the person sees a virtual computer screen that appears about an arm's length away (figures 11 and 13). Perhaps a spin-off of this technology could use the same laser to "read" the position of the retina. At the time of this report, no attempts to pursue this approach to eye tracking have been made. The potential of the method is tremendous; however, the virtual retinal display itself is still in its experimental infancy, so eye tracking methods based on a VRD variant lie farther along the timeline than the methods currently in experimentation.


Figure 11. (top) The image is scanned onto the retina.
Figure 12. (bottom) "Nomad", Microvision's VRD prototype.

Figure 13. (top) Rendering of a future virtual screen.
Figure 14. (bottom) Microvision's prototype helmet for the US Army's Virtual Cockpit program.

After examining the various technologies currently in development, it becomes much easier to formulate a realistic vision of what the future holds for eye tracking. Three requirements must be satisfied before any eye tracking technology becomes mainstream: unobtrusiveness, accuracy, and real-time processing.

First, the device must be unobtrusive. While most of today's techno-geek crowd wouldn't be entirely opposed to wearing a device like the Bielefeld prototype if it meant increased productivity, the rest of the population would sooner strap a scuba mask over their head at a company meeting than be caught dead sporting such a horrid contraption. The health risks should be no greater than the risk of carpal tunnel syndrome from a QWERTY keyboard.

Second, the device must be accurate. If the user looks at the "X" in the top right-hand corner of a Windows Explorer window to close it, the computer must immediately know that he or she is looking at that location. The device should not have to interpret a "click" or the desire to select or drag and drop a specific object; to do so would require computer intelligence far beyond anything we will have within the next century. Actions such as selection can be left to other channels such as speech recognition, a remote device, a keyboard, or perhaps a switch placed in a tooth that triggers when bitten (see the sketch below). Because the eye naturally "darts" around in saccades, applications such as graphic art will probably not be accomplished via eye tracking.
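To illustrate this division of labor (the eye points, a separate input commits), here is a hypothetical sketch in which gaze supplies the cursor position and selection fires only when an external trigger arrives:

    from dataclasses import dataclass

    @dataclass
    class Widget:
        name: str
        x: int
        y: int
        w: int
        h: int

        def contains(self, px, py):
            return (self.x <= px < self.x + self.w
                    and self.y <= py < self.y + self.h)

    def on_trigger(widgets, gaze_x, gaze_y):
        """Invoked by the trigger input (key, voice command, bite switch).

        The tracker only reports where the gaze rests; it never has
        to infer the intent to click.
        """
        for widget in widgets:
            if widget.contains(gaze_x, gaze_y):
                return widget.name
        return None

    # Example: a 16x16 close button at the top right of a 1024x768 screen.
    widgets = [Widget("close", 1004, 4, 16, 16)]
    assert on_trigger(widgets, 1010, 10) == "close"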

Third, all movement and calculations must be accomplished in real time. Today's computers are nearing the speeds required for real-time processing, and real-time processing coupled with high accuracy will give eye tracking the advantage over today's mouse and trackball interfaces. Glenstrup's thesis, after summing the execution times of various mental processes, finds "empirical evidence that using an eye-gaze based interface actually can be faster than traditional selection devices." The study concludes that "the eye is there before the hand".

Comparing the technologies we have now against the requirements for mainstream adoption, one begins to doubt that any of the current approaches can live up to the standards. Perhaps the most promising technology that could herald the new age of eye tracking lies far in the future, in the form of nano-bots. If it were possible to give a user an injection of nano-bots capable of traveling through the bloodstream and attaching harmlessly to specific nerve fibers, tracking that person's eye would be a simple extension of the technology. Certain chips would be programmed to seek out the visual cortex or the nerves attached to and contained in the eye. The human body could be used as an antenna to broadcast each nano-bot's signal indicating whether its nerve is firing, so that through statistical analysis the computer could determine where the user is looking. Along with interpreting the nerve's signal, each nano-bot could be equipped with instruments to activate the fiber at will, allowing the computer to display an image to the user as well as receive one. Such an approach would be not only dead accurate but also totally unobtrusive.

Eye tracking will not provide a new metaphor for human-computer interaction, although it may allow for modifications to today's windowed interface. It will not replace the keyboard, nor the buttons on a mouse. It could, however, replace the mouse's tracking function: not only would eye tracking be faster than the standard mouse, it would also leave the hands free for typing. And in virtual reality environments such as simulations and games, there is little doubt that retinal tracking would be superior to a joystick or mouse.


Bibliography


Arne John Glenstrup and Theo Engell-Nielsen, "Eye Controlled Media: Present and Future State", University of Copenhagen (Institute of Computer Science), Spring 1995.

Clemson University, Virtual Reality Eye Tracking Laboratory, http://www.vr.clemson.edu/eyetracking/, 2000.

ERICA Corporation, "Experience the Power of Eye Gaze", http://www.ericainc.com/system.html

Eye-Tracking Group, University of Bielefeld, Bielefeld, Germany, "Tracking of Eye Movements and Visual Attention", 1999.

Kyung-Nam Kim, "Methods of Eye-Gaze Tracking Based on Computer Vision", Kwangju Institute of Science and Technology, 1998.

LC Technologies, Inc., product brochure, "The Eyegaze Communication System".

Penn State University, Industrial and Manufacturing Engineering, http://www.ie.psu.edu/labs/hf/eye.htm

Rainer Stiefelhagen, Jie Yang, and Alex Waibel, "Tracking Eyes and Monitoring Eye Gaze", Carnegie Mellon University, http://www.is.cs.cmu.edu/ISL.multimodal.eye.html, 1998.