RESEARCH STARTER

Visual search

Visual search is the cognitive process of locating a specific object or image within a field of competing distractors. This complex skill depends on several factors, including the number of distractors and their similarity to the target, the searcher's focus of attention, and contextual cues that can aid the search. Researchers have found that the brain processes visual information by interpreting distinct features such as size, color, and shape, which are analyzed in different areas of the brain. This intricate process allows individuals to recognize objects swiftly, often in a fraction of a second, even when those objects differ from ones seen before.

The effectiveness of visual search can be influenced by numerous factors, including familiarity with the target object, the number of distractors present, and the expected location of the target based on context. For example, when looking for keys in a room, a person's search strategy is guided by past experiences that narrow down likely locations. Challenges arise in high-stakes scenarios, such as medical professionals scanning for tumors or security personnel searching for weapons, where the prevalence effect can cause searchers to overlook rare targets because repeated target-free searches have conditioned them not to expect one.

Ongoing research aims to enhance understanding of visual search processes to improve training for individuals in critical roles and to develop technology that can replicate these human capabilities. This exploration of visual search not only sheds light on human cognition but also informs practical applications across various fields.

Authored By: Ungvarsky, Janine
Published In: 2020
Related Topics:
Brain; Software development; Tumors
Related Articles:
Bias in Visual Short-Term Memory for Motion Induced by Perceptually Suppressed Distractors.; Breaking the silence: Exploring the influence of auditory singularity on visual search.; Early visual modulation and selection predict saccadic timing during visual search: An ERP study.; Visual search attentional bias modification reduced the attentional bias in socially anxious individuals.; Visual Search, Pupillary Response, and Scoring Differences Between Expert and Novice Judges in Artistic Swimming: An Exploratory Study.

Full Article

Visual search refers to the process of identifying a target object or image among a number of competing distractors. Several factors can complicate the search, including the number of distractors, how similar the distractors are to the target, how much attention the searcher is applying, and whether there is some context to guide the search. Visual searches are part of everyday behavior, yet they rely on a very complex cognitive skill that involves many interacting components. Computer scientists and AI researchers have developed visual search systems that can retrieve or interpret images using learned multimodal representations, but matching human flexibility in complex real-world scenes remains an active challenge.

Background

The first significant efforts to investigate how the visual search process works in the human brain were conducted in the 1970s by English-born psychologist Anne Treisman. Treisman was initially interested in how the ear and brain processed auditory stimulation, and she expanded this to studying how visual stimuli are processed. Treisman determined that different aspects of an object—for instance, its size, color, shape, and movement—are processed in different parts of the brain, often in parallel during early stages of perception. She then sought to discover how the brain creates a single image from these different features. She developed the feature integration theory (FIT) to explain how the brain binds the different aspects of an object into one cohesive image.

This work led her to investigate how people distinguish objects from one another. Treisman studied how the brain uses visual cues, past experiences, context, and other factors to locate objects and tell one from another. Her research not only provided a greater understanding of how sight and visual attention function but also helped experts in other fields. For instance, her work helped engineers who design traffic and emergency signs find ways to make the warnings more noticeable.

Overview

Seeing an object goes far beyond the biological function of light entering the eye and stimulating nerves to send signals to the brain. Researchers have determined that those signals are sent to different parts of the brain, including areas that determine how far away the object is and decode its color, size, shape, and other features. The brain must then interpret all of this information and bind it together into one cohesive image. This happens even when the current object differs in some way from the person's previous experience with the object. For instance, houses with one or two stories all register as houses; cars are cars whether they are black or blue or sports cars or antique models; a cat is recognizable whether it is standing, sitting, lying down, or half hidden behind a box. This also happens in a fraction of a second as the brain completes this complicated analysis and the object or image is recognized.

Researchers have found that a number of factors influence how the brain processes this complex input. They have determined that the more information the brain has about the target object, the easier it is to identify the object. The brain is set up to look for an object's shape, its orientation, and its size. Other factors, such as complex feature combinations and how parts of the object intersect (does it form a T or an X, for example), may require focused attention for processing. The brain also uses previous experiences to help in the search process. For instance, a person scanning a room for a set of keys will focus on the lower half of the room, including the tops of furniture and the floor, but ignore places such as the sides and backs of chairs because experience indicates the keys are not likely to be found there.

Several sources of guidance help many searches take a fraction of a second rather than many seconds or even minutes. One is the knowledge that some areas are more likely than others to contain the search target. A person trying to find the way to an appointment will look for directional signs on the wall or suspended from the ceiling because that is where the signs are most likely to be located. Knowing the parameters of the search is also very useful. A person who is shown pictures of clocks with the hands in different positions and asked to find the ones that are similar will spend much more time on the task than someone who is told to find the clocks indicating the quarter hour.
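The benefit of checking likely places first can be illustrated with a small calculation. The sketch below is a hypothetical toy model, not a model taken from the research described here: the location names and probabilities are invented for illustration, and it assumes the searcher inspects one location at a time and stops as soon as the target is found.

```python
def expected_inspections(order, prior):
    """Average number of locations checked before the target is found.

    `order` is the sequence in which locations are inspected, and `prior`
    maps each location to the assumed probability that it holds the target.
    """
    return sum(rank * prior[loc] for rank, loc in enumerate(order, start=1))


# Invented probabilities for where a set of keys might be (they sum to 1).
prior = {
    "table top": 0.5,
    "floor": 0.3,
    "sofa cushion": 0.15,
    "back of chair": 0.05,
}

# Guided search: the most likely places are inspected first.
guided = sorted(prior, key=prior.get, reverse=True)
# Unguided comparison: the same places in the opposite order.
unguided = list(reversed(guided))

guided_cost = expected_inspections(guided, prior)
unguided_cost = expected_inspections(unguided, prior)
```

With these invented numbers, the guided order needs about 1.75 inspections on average, while the reversed order needs about 3.25. Real searches are not strictly one location at a time, so the figures only indicate the direction of the effect.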

The number of distractors that are present can also complicate a search. A distractor is anything that makes the target harder to identify. In the previously mentioned clock scenario, all clocks that are not indicating the quarter hour are distractors. Nearly anything can be a distractor. For a person waiting for a friend at the airport, for instance, all the other people are distractors.

Researchers have also identified the likelihood that the target will be found as a factor in visual searches. Difficulty is not likely to deter someone who is searching for something like a wedding ring dropped in deep-pile carpeting, because the searcher is sure the ring is there. However, low likelihood has been identified as a challenge for people such as medical professionals who scan hundreds of samples a day looking for tumors or other signs of disease. It is also an issue for workers whose job it is to scan for contraband objects such as weapons and bombs. The mere fact that they might review thousands of images without seeing a target makes it more likely that they will overlook the target when it appears. This is partially a function of attention and partially a function of what is known as the prevalence effect, the tendency of the brain to not see something because it does not expect to see it. Hundreds of visual searches that end without finding a target essentially condition the brain to not see the target.
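One way to picture this conditioning is a signal-detection sketch. It is offered here as an illustrative assumption, not as the account given in the sources listed in the bibliography. In the sketch, the searcher answers "present" only when the visual evidence exceeds a decision criterion, and a long run of target-free searches is assumed to push that criterion higher. All numeric values are hypothetical.

```python
import math


def normal_cdf(x):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def miss_rate(sensitivity, criterion):
    """Probability of answering "absent" when a target is actually present.

    Assumes evidence on target-present trials is normally distributed with
    mean `sensitivity` and unit variance, and that the searcher answers
    "present" only when the evidence exceeds `criterion`.
    """
    return normal_cdf(criterion - sensitivity)


SENSITIVITY = 2.0             # hypothetical; held fixed, so targets are equally visible
NEUTRAL_CRITERION = 1.0       # hypothetical criterion when targets are common
CONSERVATIVE_CRITERION = 2.0  # hypothetical criterion after many target-free searches

common_targets = miss_rate(SENSITIVITY, NEUTRAL_CRITERION)
rare_targets = miss_rate(SENSITIVITY, CONSERVATIVE_CRITERION)
```

Under these assumed values, the miss rate rises from roughly 16 percent to 50 percent even though the target itself is no harder to see. The only thing that changes in the sketch is how much evidence the searcher demands before reporting a target.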

For the most part, visual search processes happen subconsciously, without the searcher being aware of the factors that shape the search. Researchers continue to study how these processes occur so that people whose jobs depend on successful visual searches can be trained to perform them more effectively. They also seek to understand how the human brain completes these processes in an effort to program computers to assist with such tasks as finding tumors, identifying terror threats, and detecting other potential dangers.

By the 2020s, visual search in computing had expanded into multimodal retrieval. Instead of relying only on hand-engineered visual features, many systems use learned representations that place images, text, and sometimes video into shared vector spaces. This allows people to search for images using text, use an image to find similar content, or combine an image with language to refine a query. These methods are now used in consumer search products, cloud AI services, and retrieval-augmented systems for visually rich documents. At the same time, high-stakes uses such as medicine and threat detection still require careful validation, because strong performance on benchmarks does not guarantee safe real-world use.
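The idea of a shared vector space can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not a description of any particular product. The three-dimensional vectors are made up by hand to stand in for the output of learned image and text encoders, the file names are hypothetical, and averaging an image vector with a text vector is a simplification of how real systems combine the two.

```python
import math


def cosine_similarity(a, b):
    """Similarity of two vectors by the angle between them (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query_vector, index, top_k=2):
    """Return the names of the `top_k` indexed items most similar to the query."""
    scored = [(cosine_similarity(query_vector, vec), name) for name, vec in index.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]]


# Made-up vectors standing in for the output of a learned image encoder.
IMAGE_INDEX = {
    "red_sports_car.jpg": [0.9, 0.1, 0.0],
    "blue_sedan.jpg": [0.7, 0.0, 0.6],
    "tabby_cat.jpg": [0.0, 0.9, 0.2],
}

# Made-up vectors standing in for the output of a learned text encoder.
TEXT_QUERY_CAR = [1.0, 0.0, 0.1]   # hypothetical embedding of "a car"
TEXT_QUERY_BLUE = [0.0, 0.0, 1.0]  # hypothetical embedding of "blue"

# Text-to-image search: both cars rank above the cat.
text_results = search(TEXT_QUERY_CAR, IMAGE_INDEX)

# Image plus text: start from the red car and nudge the query toward "blue".
combined_query = [
    (i + t) / 2 for i, t in zip(IMAGE_INDEX["red_sports_car.jpg"], TEXT_QUERY_BLUE)
]
refined_results = search(combined_query, IMAGE_INDEX, top_k=1)
```

Because text and images live in the same space, one similarity function serves every query type. With these invented vectors, the text query returns the two car images, and the combined query returns the blue sedan.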

Bibliography

Conkle, Ann. "Visual Search Gets Real." Association for Psychological Science, 2 Aug. 2010, www.psychologicalscience.org/observer/visual-search-gets-real. Accessed 9 Apr. 2026.

Davis, E.T., and J. Palmer. "Visual Search and Attention: An Overview." Spatial Vision, vol. 17, no. 4–5, 2004, pp. 249–55, doi:10.1163/1568568041920168. Accessed 10 Apr. 2026.

Duecker, Katharina, et al. "Guided Visual Search is Associated with Target Boosting and Distractor Suppression in Early Visual Cortex." Communications Biology, vol. 8, no. 912, 11 June 2025, doi:10.1038/s42003-025-08321-3. Accessed 10 Apr. 2026.

Eckstein, Miguel P. "Visual Search: A Retrospective." Journal of Vision, vol. 11, no. 5, 30 Dec. 2011, doi:10.1167/11.5.14. Accessed 9 Apr. 2026.

Kristjánsson, Árni. "Reconsidering Visual Search." i-Perception, vol. 6, no. 6, 8 Nov. 2015, journals.sagepub.com/doi/pdf/10.1177/2041669515614670. Accessed 9 Apr. 2026.

Mei, Lang, et al. "A Survey of Multimodal Retrieval-Augmented Generation." Cornell University, 2025, doi:10.48550/arXiv.2504.08748. Accessed 10 Apr. 2026.

Rieger, Tobias, et al. "Likelihood Systems Can Improve Hit Rates in Low-Prevalence Visual Search over Binary Systems." Human Factors: The Journal of the Human Factors and Ergonomics Society, vol. 67, no. 9, Feb. 2025, pp. 861–76, doi:10.1177/00187208251320589. Accessed 9 Apr. 2026.

"Visual Search for Features and Conjunctions." JoVE, 19 Mar. 2015, www.jove.com/science-education/10062/visual-search-for-features-and-conjunctions. Accessed 9 Apr. 2026.

Yu, Shi, et al. "VisRAG: Vision-based Retrieval-augmented Generation on Multi-modality Documents." Cornell University, 2024, doi:10.48550/arXiv.2410.10594. Accessed 10 Apr. 2026.
