Faceting and indexes

One topic that came up repeatedly at a recent SI conference was faceting. Publishers presented it as something new that overcomes problems with search engines and lets people get to the information they want more quickly.

In fact, faceting is something that indexes already do. It is, however, an aspect of indexes that non-indexers very rarely understand, and often one of the last things that indexing trainees properly grasp.

The problem, of course, is terminology: new terms lend something an aura of mystery. In fact we have all become familiar with faceting, although we may not have known its name, as it has been used in online stores for over a decade. An example from Amazon will demonstrate:

On Amazon, if I wish to find a specific book, I set “Books” in the dropdown list, and in the Search box I enter “Gaddis, John Lewis. The United States and the Origins of the Cold War”.

Amazon finds 112 books which it thinks match our (rather precise) criteria, and presents them in a sequential list, sorted “by relevance”, down the right-hand side of the page. (The book that exactly matches the one we searched for is number three in that list.) In the left-hand column, however, are the facets.

Rather than scrolling back and forth through 112 books to find the one we want, we can click to see only the 87 of these 112 categorized as “History”, or only the 13 categorized as “Biography” (or only the one categorized as “Home & Garden”! That one, rather surprisingly, turns out to be American Intelligence in War-time London: The Story of the OSS: The OSS in London (Cass Series–Studies in Intelligence), by Nelson MacPherson).

That particular facet was “Department”, but there are other facets too, such as Format and Author.
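The mechanics behind such a facet panel can be sketched in a few lines of Python. This is only an illustration, not how Amazon actually implements it: the records, titles and field names below are all invented, and the functions `facet_counts` and `narrow` are hypothetical helpers.

```python
from collections import Counter

# Hypothetical search results; titles and facet values are invented for illustration.
results = [
    {"title": "Book A", "department": "History", "format": "Paperback"},
    {"title": "Book B", "department": "History", "format": "Hardcover"},
    {"title": "Book C", "department": "Biography", "format": "Paperback"},
    {"title": "Book D", "department": "Home & Garden", "format": "Hardcover"},
]


def facet_counts(records, facet):
    """Count how many records fall under each value of one facet."""
    return Counter(record[facet] for record in records)


def narrow(records, facet, value):
    """Keep only the records whose facet matches the chosen value."""
    return [record for record in records if record[facet] == value]


# The counts are what the left-hand column displays next to each facet value.
department_counts = facet_counts(results, "department")

# Clicking a facet value narrows the result list to matching records only.
history_only = narrow(results, "department", "History")
```

The same result set can be counted under any facet, so `facet_counts(results, "format")` would give the figures for a Format panel without changing the data.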

Indexes do exactly this too, but rather than using the term “facet” they use the term “aspect” (the term used in Indexing, The Art of by G. Norman Knight [1979] p. 98). So, if you look up the topic “schools” in a book index, you might find that there are 25 pages referenced. Indexing rules say that this is too many to present to the user in a single list:

schools, 13, 16, 35, 38, 52, 74, 145, 167, 214, 226, 235, 257, 270, 275, 280, 285, 297, 302, 318, 324, 357, 359, 365, 377, 390

so subheadings are created, dividing the topic up, showing aspects of it:

schools
  legislation concerning, 13, 16, 35, 38, 52, 74
  literature on, 145, 167, 214, 226, 297, 302
  teaching salaries in, 235, 257, 270, 275, 280, 285
  teaching methods in, 318, 324, 357, 359, 365, 377, 390
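Seen as data, the subheadings partition the original list of page numbers: each locator is filed under one aspect, and merging the aspects gives back the undivided entry. A minimal sketch, using the page numbers from the “schools” example (the dictionary layout and the helper `all_locators` are assumptions made for illustration):

```python
# Page locators for "schools", grouped by aspect (numbers from the example entry).
schools = {
    "legislation concerning": [13, 16, 35, 38, 52, 74],
    "literature on": [145, 167, 214, 226, 297, 302],
    "teaching salaries in": [235, 257, 270, 275, 280, 285],
    "teaching methods in": [318, 324, 357, 359, 365, 377, 390],
}


def all_locators(aspects):
    """Merge every subheading's pages back into one sorted, undivided list."""
    return sorted(page for pages in aspects.values() for page in pages)


# The merged list is the same 25 pages as the single long entry.
flat = all_locators(schools)
```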

One further point is that indexes are far more sophisticated data structures than index users appreciate, precisely because the end result is easy to use. For example, Amazon still shows facets even when there is only one search result, whereas indexing rules say that if there are fewer than five results for an index heading then no faceting (subheads) should be applied.
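That rule can be expressed as a small sketch. The threshold of five comes from the rule as stated here; treating every heading with five or more locators as one that gets subheads is a simplifying assumption, as are the function name `format_entry` and the invented “libraries” entry.

```python
MIN_LOCATORS_FOR_SUBHEADS = 5  # threshold as stated in the text; house styles may differ


def format_entry(heading, aspects, threshold=MIN_LOCATORS_FOR_SUBHEADS):
    """Render an index entry, using subheadings only when there are enough locators."""
    pages = sorted({page for locators in aspects.values() for page in locators})
    if len(pages) < threshold:
        # Too few locators to be worth dividing: one undivided line.
        return [f"{heading}, {', '.join(map(str, pages))}"]
    lines = [heading]
    for aspect, locators in aspects.items():
        lines.append(f"  {aspect}, {', '.join(map(str, sorted(locators)))}")
    return lines


# Hypothetical heading with only three locators: no subheads are shown.
short_entry = format_entry("libraries", {"funding of": [12, 40], "staffing of": [88]})

# Six locators: the aspects appear as subheadings.
long_entry = format_entry(
    "schools",
    {"legislation concerning": [13, 16, 35], "literature on": [145, 167, 214]},
)
```

A facet panel that followed the same rule would simply hide itself when the result list is already short enough to scan.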

Perhaps we ought to use the term ‘facets’ in our communications with publishers and authors?



About James Lamb

James Lamb has a degree in Computer Science and Mathematics from London University, worked for over 20 years as a senior IT technician and team leader, much of that time for dealing rooms of international banks, and became a full-time, professional indexer in 2004.
This entry was posted in client communications and SIdelights (SI newsletter).

One Response to Faceting and indexes

  1. Pilar Wyman says:

    Interesting, James! Thanks for posting. Sorry we didn’t get more chance to chat at the last conference.

    What you call “faceting” I call “facet analysis.” (You say, “tomato,” I say, “tomato”?). Others may call it dimensional modeling. However we get to all those aspects may not matter too much. Whatever we call it may not matter too much — as long as the content of the index is helpful to the reader and the search.

    Here’s to quality indexes,
