More on Query Types

Focused retrieval vs exploration, and why it matters

Mar 02, 2021

It’s pretty weird that we treat “Internet search” as one unified thing. In my last post, I briefly mentioned that modern search technologies work well with some types of queries and poorly with others, but I didn’t dig into the specifics of what each of those query types entails, which is why we’re here now.

While there are many different reasons why someone might type something into Google these days, all queries fall into one of two big functional buckets:

Focused Retrieval Queries, or FRQs; or
Exploratory Search.

These are pretty self-explanatory from the outset, but we’ll dive into them in more detail in a second. For now, take a moment to think about your last five Google searches. What motivated each one? Do you remember? More importantly, did you know what you were looking for ahead of time? Could you tell when you’d found it?

1. Focused Retrieval Queries

FRQs are the result of a knowledgeable agent seeking something in particular, like a PDF version of a particular textbook or journal article, a video they saw a week ago and now want to share with a friend, or a recipe for scones.

If you answered yes to those last two questions, your searches were FRQs. You knew what you were looking for ahead of time, which made it easy enough to tell when a given result was or was not what you were looking for.

As I mentioned in my previous piece, modern search paradigms are reasonably successful at responding to FRQs, assuming that the aforementioned knowledgeable agent knows how to formulate an effective query and doesn’t just search “goose” when what they want is this:1

[Image: the “I smel drug” meme, returned as the top result for the query “i smel drug meme”]

The search engine’s success in handling FRQs is due largely to the linguistic behavior patterns of the agent performing the search. In particular, an agent who has greater cognitive access to the desired result (as in the case of the FRQ) will produce a query with a greater relevant keyword density. Without thinking too hard, they can home in on what sets their desired result apart (what makes it memorable) and have access to the language necessary to relay that to the engine.

A note on cognitive access

We use the terms “cognitive access” and “cognitive availability” to describe two frames of the same relationship between an agent and their desired result. In particular, cognitive access refers to the agent’s ability to “see” what their desired outcome is, while cognitive availability refers to the property of that outcome of being readily accessible.

Regardless of which phrase we prefer, the concept remains the same. But why define a concept at all?

When material is more cognitively available, we spend fewer braincycles working towards accessing it, because it’s right there.

Imagine you have a beach ball full of money. If you have greater cognitive access to it, you can pretty much ignore the valve, because the valve is old news. You go in there and say, okay, this beach ball contains exactly five hundred and forty-one dollars. Information!

If, instead, you have lesser cognitive access, you might get hung up on its surface characteristics, such as the colors of the stripes or the mechanism of the valve, and totally miss the fact that the thing’s really just a pile of cash in disguise.

Which of these agents will more readily identify an extension of their query in a room full of similarly colored beach balls?

Keyword-based search and FRQ supremacy

If we claim that the goal of search in general is to “find what you’re looking for” where “what you’re looking for” is well defined beforehand, it makes sense to build a keyword-driven engine. After all, like we saw above, the FRQ is almost by definition a keyword-rich object.

Logically, in order to receive a particular type of result, we will submit an input containing the keywords that we relate to that particular thing, and therefore it is in the search engine’s best interest to pay special attention to keywords in our input.
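To make the keyword logic concrete, here is a minimal sketch of keyword-overlap ranking. Everything in it is an assumption made for illustration: the three-document corpus, the document names, and the scoring rule are invented, and no real search engine is this simple.

```python
# Hypothetical mini-corpus (document id -> text), invented for illustration.
DOCS = {
    "goose_meme": "i smel drug goose meme image",
    "goose_facts": "goose species migration facts",
    "goose_recipe": "roast goose recipe",
}


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def keyword_score(query, doc_text):
    """Count how many query tokens also occur in the document."""
    doc_terms = set(tokenize(doc_text))
    return sum(1 for term in tokenize(query) if term in doc_terms)


def rank(query, docs):
    """Return (score, doc_id) pairs, highest score first; ties sort by id."""
    scored = [(keyword_score(query, text), doc_id) for doc_id, text in docs.items()]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


specific = rank("i smel drug meme", DOCS)  # one document clearly wins
vague = rank("goose", DOCS)                # every document ties
```

With the keyword-rich query, one document scores far above the rest. With the single word “goose,” all three documents tie, and the engine has nothing to separate them with.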

As the Internet has grown to ubiquity, though, we’ve begun to move away from the targeted search practices of the early days. This has only been accelerated by the ever-growing sphere of peripheral influence surrounding each one of us. We are now far more likely to run into stray bits of information that may pique our interests and lead us to perform queries whose desired end results remain unknown to us until after the search is already underway.

This is different.

The search engine as we know it wasn’t made for this. In many cases, it still does a reasonably good job — I’d venture to guess that the acceptable performance of the FRQ-based search engine when faced with “what is this keyword” queries in combination with its proven success in the FRQ realm is what has allowed the field to effectively stagnate — but in many others it fails spectacularly.

The modern search engine exists to process and respond to FRQs. Pretending otherwise will only hold us back.

2. Exploratory Search

In contrast to the FRQ, an exploratory search is one with no well-defined end goal. It is searching for searching’s sake, venturing out into the great unknown to get a sense of what’s out there.2 It is also a modern search engine’s nightmare.

Where the FRQ is deep, exploratory search is shallow. Keyword relevance and density are comparatively low; semantic depth is minimal. Many of these searches could likely be performed using only the top 1,000-10,000 English words, turning the laser precision of the FRQ keyword paradigm on its head.
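One rough way to picture that claim is to measure how much of a query is made up of everyday words. The sketch below is only an illustration: the `COMMON_WORDS` set is a tiny hand-picked stand-in for a real word-frequency list, and the function name and example queries are made up.

```python
# Tiny stand-in for a real "top N English words" list (an assumption).
COMMON_WORDS = {
    "how", "do", "i", "make", "my", "computer", "go", "faster",
    "the", "a", "to", "what", "is",
}


def common_word_share(query, common_words=COMMON_WORDS):
    """Return the fraction of query tokens found in the common-word set."""
    tokens = query.lower().split()
    if not tokens:
        return 0.0
    return sum(token in common_words for token in tokens) / len(tokens)


exploratory_share = common_word_share("how do i make my computer go faster")
focused_share = common_word_share("nvme ssd trim fstab options")
```

Under this toy list, the first query consists entirely of common words, while the second, jargon-heavy query contains none.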

This is not to say that exploratory search is necessarily “simple” or “immature.” Rather, the searches often express complex ideas at their core, using vague, non-domain-specific language.

The hallmark of an exploratory search is a particular degree of ignorance of a domain’s semantic landscape. The word “ignorance” carries some negative connotations, but here I intend it to be read in full neutrality. Ignorance is good — it’s room to grow! But also, it makes Google shit the bed.

Designing for exploratory search

The average response to the Google-shitting-the-bed problem is to try — frequently in vain — to modify your query in such a way that you somehow manage to tap into the secret algorithmic voodoo driving the search engine and eventually surface with a useful result.

If that sounds needlessly complicated, join the club! It is idiotic to the max, like using a field hockey stick to play tennis.

Oh? What’s that? You wouldn’t use a tool designed for a fundamentally different purpose to perform a specific function that it is not at all suited for? They’re both ball sports! Basically the same thing! Don’t be silly!

Yeah, that’s how dumb it is.

Rather than whining about the ~abysmal state of things~ like a couple of Real Winners, though, let’s think about how we might address this issue head-on. We have a user story — Alice and Bob want to find useful information on a topic whose jargon is pretty alien to them — and we have the Internet, which is full of useful information, but currently indexed by its jargon.3

A platform designed with exploratory search in mind can’t index by jargon. Jargon exists at an inaccessible semantic depth. Rather, any platform intended to support this query style must focus instead on the query’s intent: the shape of the query function, as it were.

This may not be intuitive at first, especially if you’re not in the habit of viewing queries or other bits of text through a functional lens, but bear with me. In a later post, I’ll get into the excruciating technical detail behind this formulation. For now, just repeat “the query is what it does” under your breath until it becomes habit.

The long and short of it is, design for exploratory search is a fundamentally different process from the traditional FRQ-centric formulation. Priorities may not be shared; optimizations for one scheme may actively nerf the other. There is no reason, beyond sheer laziness, why this should be treated as an extension of the keyword-based FRQ model.

The Interaction Challenge

People aren’t going to stop using the same search engine for both types of query. Habit is habit, and being able to type some words into the address bar without really thinking beforehand about what type of task you’re performing is one of those invaluable features that defines the collective psychology of the information age.

Thus, despite all my ramblings on how these tasks are fundamentally different and require fundamentally different design frameworks, the end result should by all means feel, to the user, not all that different from its current iteration. There are obvious improvements to be made on the UI side of things, but, in particular, a user’s decision-making experience should not change.4

We must, therefore, find a way to integrate the two systems. Luckily, this seems right from the outset to be reasonably straightforward. If we can create a unified system that first establishes a query’s function, as I mentioned above, we can then search the space in that functional context, whatever it may be.5 In short, identify the query type, and perform the appropriate type of search for that query type. More on this later!
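The “identify, then dispatch” idea can be sketched as a small router. This is not the formulation promised for a later post: the classification heuristic (share of everyday words), the threshold, the word list, and both search functions are placeholders invented here to show the control flow only.

```python
# Hypothetical stand-in for a real common-word list (an assumption).
COMMON_WORDS = {
    "how", "do", "i", "make", "my", "computer", "go", "faster",
    "the", "a", "to", "what", "is",
}


def classify_query(query, threshold=0.8):
    """Guess the query type from its share of common words (toy heuristic)."""
    tokens = query.lower().split()
    if not tokens:
        return "exploratory"
    share = sum(token in COMMON_WORDS for token in tokens) / len(tokens)
    return "exploratory" if share >= threshold else "focused"


def focused_search(query):
    """Placeholder for a keyword-based FRQ backend."""
    return [f"keyword match for: {query}"]


def exploratory_search(query):
    """Placeholder for a backend built around exploratory search."""
    return [f"topic overview for: {query}"]


HANDLERS = {"focused": focused_search, "exploratory": exploratory_search}


def search(query):
    """Single entry point: classify first, then dispatch to the right backend."""
    return HANDLERS[classify_query(query)](query)
```

The user still types into one box and calls one `search` function; the split between query types happens behind it, so their decision-making experience stays the same.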

Summary

1) Focused Retrieval Queries are highly descriptive; Exploratory Search is not. Exploratory Search is open-ended; Focused Retrieval Queries are not. Focused Retrieval Queries play nicely with keyword-based search of various sorts; Exploratory Search does not.

2) It is unlikely that the average user consciously classifies their query type before performing a search. It is unlikely that the average user would readily change this behavior when faced with an alternative search tool designed for Exploratory Search in particular. To design our systems effectively, we must not alter the user’s decision-making experience.

[1] In this case, the query “i smel drug meme” works perfectly, returning this image right at the top, followed by several variations. Separately, it cracks me up.

[2] To be clear, the intent is still to find something relevant, but the constraints for relevance are not known beforehand. It is through the search itself that these conditions become known.

[3] We use the phrase “useful information” here to denote topic-appropriate results with a high informational content relative to the searching agents’ baseline knowledge. In particular, “useful information” in the context of an agent is topic-appropriate content that is both novel and immediately actionable. Jargon saturation does not exceed the threshold at which it becomes a barrier to accessibility; content is not a mere restatement of prior knowledge.

[4] More to be said on UI improvements later.

[5] Fortunately, it seems that Google has been moving in this direction. Several of their newer and more experimental features seem to be geared towards responding appropriately to latent patterns in user behavior, indicating that this is at the very least on their minds. Fingers crossed for a more functional future!

