Conceptual Metaphors for Explaining Search Engines

David G. Hendry and Efthimis N. Efthimiadis
Information School
University of Washington, Seattle, WA 98195
{dhendry, efthimis}@u.washington.edu

ABSTRACT

To explore how people naturally depict algorithmic processes, 232 participants were asked to sketch how a search engine works. While the sketches reveal a diverse range of visual and conceptual approaches, a subset of the sketches seem to exhibit an underlying regularity for describing algorithmic processes. To explain this regularity, we propose the conceptual metaphor A SEARCH ENGINE IS A SERIES OF TEXT TRANSFORMATIONS and describe a set of mappings from sketchable graphic elements to abstractions in the search engine domain. We believe that this metaphor can be applied to enable people to more easily conceptualize, describe, and explore complex systems.

Author Keywords

Conceptual Metaphor, Mental Model, Visual Notation, End-User Programming, Sketching, Search Engines

ACM Classification Keywords

H.1 MODELS AND PRINCIPLES: H.1.2 User/Machine Systems; K.3.2 Comp. and Info. Science Education

INTRODUCTION

A key feature of any learning situation is the conversation that takes place between the learner and his or her materials. Schön [5] calls attention to the power of sketching with pencil and paper to support this conversation. While he emphasizes the individual designer progressing through cycles of action followed by reflection, sketches also serve important communication functions, such as eliciting comments from others [2].

To facilitate this conversation, with oneself and with colleagues or critics, conventions for depicting abstractions can be introduced. A person, for example, might draw a rectangle and fill it with four small circles. The circles might represent documents and the rectangle might represent a folder to hold the documents.
In this way, sketches can create a mapping between perceptual elements on a display and abstractions in a problem domain. When a directed line is added, with its source touching a document and its destination lying within a second folder, we might ask: What is happening to the document? If this is a transformation of some kind, should it happen to all documents in the source folder or to only some of them? If the answer is only some of them, then how are they identified? Once the transformation is complete, what happens next to the document? How is the transformation specified? And so on. In this way, sketches allow us to reason in a relatively concrete fashion so that inferences can be made about a more abstract domain. In this case, the perceptual elements (dots, rectangles, and a directed line) provide a conceptual metaphor for the relatively more abstract domain of documents and folders.

In this paper we report preliminary findings from a study of how people naturally depict such abstractions in sketches. After a brief introduction to conceptual metaphor in the next section, we analyze a collection of sketches that were created in response to the task "Draw a sketch to explain how search engines, such as Google, work." Then, we propose that the metaphor A SEARCH ENGINE IS A SERIES OF TEXT TRANSFORMATIONS captures much of their underlying regularity and describe its mappings in detail.

Note: This is a working paper. Please send comments to both authors.

CONCEPTUAL METAPHOR

A conceptual metaphor is a set of mappings from a relatively concrete domain to a more abstract domain [3]. Through these mappings, the more abstract domain is more readily understood. For example, a metaphor for making things is THE OBJECT COMES OUT OF THE SUBSTANCE, as in "I made the database out of the document titles." Conceptual metaphors have been found to be a fundamental cognitive mechanism [3], and their use is pervasive in ordinary conversation and for reasoning in science and technology.
More recently, Lakoff and Núñez [4] have applied this approach to analyzing the conceptual structure of mathematics, aiming to unpack the layers of metaphor that enable mathematical reasoning. For example, the metaphor NUMBERS ARE POINTS ON A LINE is central to Euclidean geometry and trigonometry. By identifying such metaphors and the specific mappings that exist between source and target domains, they claim that mathematics becomes more comprehensible. In sum, this ambitious project aims to propose nothing less than an empirical basis for an embodied-mind theory of mathematics. The analysis that follows relies on two concepts from this theory that are now briefly and informally introduced.

The first concept is the Container schema, which consists of three distinctions: an inside, an outside, and a boundary. These distinctions, which are both perceptual and conceptual, allow us to readily see that four dots are within the container and one dot is outside of it.

The second concept is the Source-Path-Goal schema, which consists of a source, a trajectory, and a goal. Below, we can see a source and a goal for a moving object. The dotted line indicates that part of the trajectory has yet to be realized and the arrow indicates the direction of the trajectory. If a moving object is at position x, then we can conclude that all points between the source and x have been visited and that the points between x and the goal have yet to be visited.

[Figure: a trajectory running from Source, through position x, to Goal]

As with the Container schema, we are able to make these inferences because of our ability to recognize relationships from perceptual information. Importantly, words for the spatial relationships associated with these schemas are found in all human languages and are learned early in life; thus, these schemas are both perceptual and conceptual [4]. It is for this reason that the conceptual metaphor given in the introduction, which employed both the Container and Source-Path-Goal schemas, was so easily understood.

STUDY

Task

Undergraduate and graduate students at the University of Washington were prompted to draw sketches on 8.5 x 11 in. paper sheets. Students were given approximately 10 min to complete the task at the start of regularly scheduled classes. Participation was voluntary and anonymous. A sample of 232 sketches was collected for analysis in the spring and autumn of 2003. The following is a qualitative analysis of a subset of these sketches.

Content Analysis

To analyze these sketches, a normative model was created by drawing on textbook accounts of search engines, and raters coded all 232 sketches. The procedural details and results of this analysis are reported elsewhere [1]. The full sample of sketches reveals a tremendous diversity of approaches for explaining the operation of search engines.

Some of the sketches were largely representational. In the following sketch, the query is depicted by a horizontal input box akin to Google's visual design. The query is sent into the world, somehow yielding results. One must assume that the image of the world also depicts a matching process.

A second class of sketches is largely systems-oriented, where box-and-line symbols are employed to depict key entities and processes. In this sketch the entities "User 1", "Search Engine", and "Database" are connected with directed communication pathways that are signaled by arrows and the words "Searches for", "requests", and "returns".

Conceptual metaphors

Another class of sketches, representing approximately 20% of the sample, aims to explain computational processes, and we shall restrict the following analysis to this class. To begin, consider this example and accompanying annotation: "Data is sub-divided into smaller parts until the data can be easily search through [sic]. Much like a sorting algorithm."

Here, we can recognize a tree structure where a database consists of rectangles, which, in turn, consist of rectangles. The four dots ("....") and the use of a diagonal line appear to be used to indicate "more of the same." This figure is imprecise and it raises more questions than it answers, but even so it enables metaphoric reasoning.
For example, if those rectangles are "smaller parts [of data]", then we can ask: How small are they? How did they get broken up? Is additional information associated with the parts? How are the parts ordered vertically? Thus, even an ambiguous sketch purporting to explain how a search engine works can initiate a questioning process by mapping from the source domain (the sketch's elements) to the target domain (search engines).

Next, we isolate and discuss elements that are readily discerned from the sketches.

1. Words excerpted from the sketches are underlined.

Documents are points

The sketches often depict documents as single points or various geometric shapes; often, many instances are shown.

Containers

Organic shapes are often used to represent the internet. Occasionally, documents are not explicitly depicted. On the other hand, rectangles or other geometric shapes are typically used to contain computed or well-specified data.

Documents are text fragments

Sometimes documents are depicted in greater detail. Often, short wavy lines are used to represent text fragments.

Transformations

Directed lines are typically used to depict a transformation of data from one form into another.

Interestingly, the meanings of the lines can depend on their sources and destinations. On the right, the set of four directed lines seems to indicate the robot will visit the web and the vertical line seems to indicate the robot will send data onward.

In the next example, documents, depicted with wavy lines, are transformed into word lists, depicted with shorter lines. The word lists appear to be merged at the Google database, as signaled by the converging lines.

Sometimes lines loop across objects, indicating that a process must go out and come back, as with this sketch:

Further, this sketch shows several interesting forms of abstraction, with circles indicating processes and rectangles indicating information structures, both of which are nested.

Iteration and selection

Ellipses (...) are sometimes used to depict repeated operations, but more often several examples are shown, leaving it to the reader to infer that additional instances might be involved. Rules for selecting items or stopping conditions, if given at all, are stated in word annotations. In this sketch, flow, looping, and selection can be recognized (e.g., note the ghosted and crossed document icons and the curved, cyclic directed lines):

DISCUSSION

Of the 232 sketches in the full sample, about 20% depict some algorithmic operation of search engines.
As illustrated by the above excerpts, these sketches reveal a diverse range of conceptual and visual approaches. In fact, perhaps the most notable feature of this body of sketches is just how different they are. This can be seen especially in the images of robots, webs, documents, and so on, but also in the overall composition of the sketches, how annotations are used to complement the sketches, and the degree of completeness and correctness of the sketches.

Nevertheless, close inspection of the sketches seems to suggest an underlying regularity where search engines are described by showing how units of text are successively transformed to smaller fragments. This transformation process can be expressed with the metaphor A SEARCH ENGINE IS A SERIES OF TEXT TRANSFORMATIONS. Under this metaphor, people explain the operation of a search engine by describing 1) how web pages are broken up into words that, in turn, are stored in some fashion; and 2) how queries are broken up into words and matched against the previously stored word data, yielding identifiers to the original pages. To do this, the sketches, at varying degrees of completeness, show a sequence of transformative phases.

To represent these transformations, people seem to converge on a conceptual vocabulary that maps a graphic notation to computational processes. The following chart proposes a core set of mappings from the source domain (Sketching) to the target domain (Search Engine) that define the metaphor A SEARCH ENGINE IS A SERIES OF TEXT TRANSFORMATIONS. Importantly, the Container and Source-Path-Goal schemas can be readily identified in the sketches.

Source Domain (Sketching)     Target Domain (Search Engine)
-------------------------     -----------------------------
Point or shape                Document
Short, wavy line              Text fragment
Cloud                         Large, undifferentiated
Rectangle                     Computed or well-specified
Directed line                 Transform text fragment
Lines diverging from          Iterate through contained items
Line attached to              Iterate through contained items
Converging lines              Accumulate text fragments or documents

Several brief observations are now made.
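As a concrete aside before those observations, the two phases named by the metaphor (pages broken into words and stored; queries broken into words and matched, yielding page identifiers) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not a description of how any real search engine or any participant's sketch works: the function names `tokenize`, `build_index`, and `search`, the sample pages, and the choice to require every query word to match are all hypothetical. The comments tie each step to a mapping from the chart.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Directed line: transform a text fragment into a list of lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Phase 1: break pages into words and store word -> page identifiers."""
    index = defaultdict(set)               # rectangle: computed, well-specified data
    for page_id, text in pages.items():    # iterate through contained items
        for word in tokenize(text):        # transform text fragment
            index[word].add(page_id)       # converging lines: accumulate
    return index


def search(index, query):
    """Phase 2: break the query into words and match them against stored words.

    Assumption: a page matches only if it contains every query word.
    """
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample data standing in for the "cloud" of web pages.
pages = {
    "page1": "Sketches of search engines",
    "page2": "Search engines transform text",
    "page3": "Metaphors for mathematics",
}
index = build_index(pages)
print(sorted(search(index, "search engines")))  # ['page1', 'page2']
```

Each function corresponds to one transformative phase, so the program reads as a series of text transformations in the sense of the metaphor. Rules for selecting items (here, the all-words-must-match rule) sit outside the graphic notation, just as they appear only as word annotations in the sketches.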
First, this is a conceptual notation that aims to capture the underlying expressiveness of the sketches. Thus, it is claimed that this notation should be able to represent a range of text-processing systems. Second, the ability to represent rules for selecting or qualifying items is not captured by this notation. In the sketches, such rules are generally expressed by attaching word annotations to particular elements of the sketch. Third, this notation allows for various levels of abstraction, and sketches at a high level of abstraction can be expanded to show greater levels of detail. This comes from the composability property of Container schemas [4].

Finally, we speculate that this notation is natural; that is, it can be easily learned and provides an effective bridge for reasoning from sketches to problem domains. We think this is so because: 1) the notation was derived from sketches produced in response to a time-limited, open-ended task; 2) the notation relies upon the Container and Source-Path-Goal schemas to a large extent and thus draws upon fundamental cognitive concepts for everyday reasoning [4]; and 3) the notation is amenable to practically any visual style, thereby accommodating different target domains and the stylistic needs or inclinations of the sketcher.

CONCLUSION

This preliminary study suggests that many people employ the conceptual metaphor A SEARCH ENGINE IS A SERIES OF TEXT TRANSFORMATIONS when asked to sketch the operation of a search engine. Certainly, this is not the only metaphor employed, but it does seem particularly interesting for its focus on the algorithmic features of search engines. In future work, we hope to investigate 1) how this conceptual metaphor can be used to teach people about search engines; and 2) how the notation might be used to bridge the gulf between informal sketching and computer programs for text processing.
Finally, we think a great deal more can be learned about how people conceptualize systems by examining the sketches that people produce in response to tasks such as "Draw a sketch of how such and such a system works" or "Assuming that a SEARCH ENGINE IS A SERIES OF TEXT TRANSFORMATIONS, draw a sketch of how you think it works."

REFERENCES

1. Hendry, D. G. and Efthimiadis, E. Conceptual models for search engines. Submitted to Journal of the American Society for Information Science and Technology.
2. Henderson, K. Flexible sketches and inflexible data bases: Visual communication, conscription devices, and boundary objects in design engineering. Science, Technology, & Human Values, 16, 2 (1991), 448-473.
3. Lakoff, G. and Johnson, M. Metaphors We Live By. Chicago: The University of Chicago Press, 1980.
4. Lakoff, G. and Núñez, R. E. Where Mathematics Comes From: How the Embodied Mind Brings Mathematics into Being. New York: Basic Books, 2000.
5. Schön, D. The Reflective Practitioner: How Professionals Think in Action. New York: Basic Books, 1983.