Information Classification and Retrieval - The folks and their tags

Roger Hudson - Web Usability

The Web Standards Group, Sydney - 17 August, 2006


Classification and navigation

Hoover and Port

"What's in a name? That which we call a rose
By any other name would smell as sweet."

But only if we can find the rose.

Classifying things

Classification of living things


Carl (Carolus) Linnaeus, from Sweden: Father of Taxonomy.

The classification of living things, "Systema Naturae", published in 1758.

Classifying information (books)


Melville Dewey

Dewey grew up in a small U.S. town in the 1850s.

An insular man with an American Christian view of the world.

Dewey developed a system for classifying books which has become the most widely used classification system in the world.

Dewey Decimal Classification System

Dewey Decimal Classification System published in 1876. 

Organises non-fiction books into 10 general subject areas.

Each subject area has:

000 General Knowledge
100 Philosophy and Psychology
200 Religion and Mythology
300 Social Science
400 Language and Grammar
500 Natural Science and Mathematics
600 Applied Science, Technology
700 Arts and Recreation
800 Literature and Poetry
900 History and Geography
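As a rough illustration of how a fixed taxonomy like this assigns every item to exactly one slot, the ten main classes can be sketched as a lookup on the first digit of a Dewey number. The function name `main_class` and the validation rule are assumptions made for this sketch, not part of the Dewey system itself.

```python
# The ten main classes as listed above, keyed by their hundreds digit.
DEWEY_MAIN_CLASSES = {
    0: "General Knowledge",
    1: "Philosophy and Psychology",
    2: "Religion and Mythology",
    3: "Social Science",
    4: "Language and Grammar",
    5: "Natural Science and Mathematics",
    6: "Applied Science, Technology",
    7: "Arts and Recreation",
    8: "Literature and Poetry",
    9: "History and Geography",
}


def main_class(call_number: str) -> str:
    """Return the main class for a Dewey number such as '682' or '004.67'.

    Hypothetical helper: it only looks at the three digits before the point.
    """
    whole_part = call_number.split(".")[0]
    if len(whole_part) != 3 or not whole_part.isdigit():
        raise ValueError(f"expected three digits before the point: {call_number!r}")
    return DEWEY_MAIN_CLASSES[int(whole_part[0])]


if __name__ == "__main__":
    for number in ("682", "004.67", "808"):
        print(number, "->", main_class(number))
```

Each number resolves to a single class, which is the strength of the scheme and also its limitation: an item that belongs in two places still gets only one.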

Dewey and information retrieval

A taxonomy with closely defined subject categories:

682: Small forge work (blacksmithing)

Great in the 1870s. What about the internet today?

004.019: Human Computer Interaction
004.67: General Internet books
005.72: Web usability books
808: Web Writing (in Literature)

Website taxonomy

Traditional approach to classifying web content

Navigation and information retrieval

In the early days, simplicity was the way to go.

But as sites got bigger new approaches were needed.

More navigation menus

Expanding menus

All's well in the world. Everything clearly defined. And in its right place.

But, not everything can be clearly defined

Back to the library for a moment


Indian mathematician and librarian S. R. Ranganathan.

The Dewey Decimal System;

Ranganathan saw limitations with the Dewey System in the 1930s.

Introducing ‘Facets’

Ranganathan introduced the idea of classifying complex objects by the different “facets” they contain.

He proposed five facets for library material.

Rather than putting an object into a slot, facets allow for a composite classification of the object.
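A minimal sketch of what a composite classification might look like in practice: each item carries several facet values at once, and a search can combine any of them. The catalogue entries, the facet names (`subject`, `place`, `period`) and the `find` helper are all hypothetical, chosen only to illustrate the idea; they are not Ranganathan's five facets.

```python
from typing import Dict, List

# Hypothetical catalogue: every item is described by several facets
# rather than being filed under one shelf number.
CATALOGUE: List[Dict[str, str]] = [
    {"title": "Smithing in Rural England", "subject": "blacksmithing",
     "place": "England", "period": "19th century"},
    {"title": "Temple Metalwork", "subject": "blacksmithing",
     "place": "India", "period": "12th century"},
    {"title": "Railways of the Raj", "subject": "railways",
     "place": "India", "period": "19th century"},
]


def find(catalogue: List[Dict[str, str]], **wanted: str) -> List[str]:
    """Return titles of items whose facets match every requested value."""
    return [
        item["title"]
        for item in catalogue
        if all(item.get(facet) == value for facet, value in wanted.items())
    ]


if __name__ == "__main__":
    print(find(CATALOGUE, place="India"))
    print(find(CATALOGUE, place="India", period="19th century"))
```

The same item can be reached by place, by period, by subject, or by any combination, with no single "correct" slot.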

S. R. Ranganathan, “Colon Classification”, published in 1933

Facets and the I.T. age

Colon Classification and its notation are complex and not widely used by libraries.

The concept of Facets underpins developments in information technologies.

Facets and the web

Wine Facets:

Web content is virtual and accessible from anywhere via hyperlinks.

Facets allow content holders to:

Faceted systems are also flexible and can easily accommodate new content entries.

Many ways to find a recipe

Browsing for tabouleh

Browse by Middle Eastern:

Browse by Meatless:
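The two browse paths to the same recipe can be sketched as an index keyed by facet and value. This is only an illustration: the recipe list, the facet names `cuisine` and `diet`, and the `build_browse_index` helper are assumptions, not how any particular recipe site works.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical recipes, each described by two facets.
RECIPES: List[Dict[str, str]] = [
    {"name": "Tabouleh", "cuisine": "Middle Eastern", "diet": "Meatless"},
    {"name": "Lamb kofta", "cuisine": "Middle Eastern", "diet": "Contains meat"},
    {"name": "Minestrone", "cuisine": "Italian", "diet": "Meatless"},
]


def build_browse_index(recipes: List[Dict[str, str]]) -> Dict[Tuple[str, str], List[str]]:
    """Map each (facet, value) pair to the recipe names filed under it."""
    index: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for recipe in recipes:
        for facet, value in recipe.items():
            if facet != "name":  # the name identifies the recipe, it is not a facet
                index[(facet, value)].append(recipe["name"])
    return index


BROWSE = build_browse_index(RECIPES)

if __name__ == "__main__":
    print(BROWSE[("cuisine", "Middle Eastern")])
    print(BROWSE[("diet", "Meatless")])
```

Tabouleh appears under both headings, so a visitor finds it whichever way they choose to browse.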

Looking for craft – more facets

Etsy facets

NB: Objects are also listed by categories

Shopping by colour

Cool for some

But, maybe not for grumpy old men!

Patriotic memory bracelet

Tagging and the folk

2004, tagging takes off with the release of two folksonomic tagging sites:

Folksonomy

"A folksonomy is a set of uncontrolled tags provided by individuals for their own retrieval purposes of that object and these tags are shared publicly."

Thomas Vander Wal - http://www.vanderwal.net/index.html

Folksonomy and tagging

‘Folksonomy’ is an open-ended labelling system that allows users to categorise online content.

Users provide descriptive keywords or ‘tags’, which use familiar, shared vocabularies.

Folksonomy is the sharing of tags provided by different users.

Assumption:

If enough people tag an object, interesting and useful patterns will emerge.
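One way to picture this assumption is to pool the tags that several people give the same object and count them. The users and tags below are invented for illustration, and counting each tag once per user while ignoring case is a design choice of this sketch, not a rule of any tagging site.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical tags supplied by four users for the same bookmarked page.
USER_TAGS: Dict[str, List[str]] = {
    "user_a": ["davinci", "art", "painting"],
    "user_b": ["davinci", "book", "thriller"],
    "user_c": ["DaVinci", "art", "renaissance"],
    "user_d": ["davinci", "art"],
}


def aggregate_tags(user_tags: Dict[str, List[str]]) -> Counter:
    """Count how many users applied each tag, ignoring case."""
    counts: Counter = Counter()
    for tags in user_tags.values():
        # A set ensures one user cannot inflate a tag by repeating it.
        counts.update({tag.lower() for tag in tags})
    return counts


TAG_COUNTS = aggregate_tags(USER_TAGS)

if __name__ == "__main__":
    for tag, users in TAG_COUNTS.most_common():
        print(f"{tag}: {users}")
```

With only four users the shared tags already stand out from the one-off tags; the pattern is the agreement, not any individual's choice.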

Da Vinci search with del.icio.us

Results relate to Da Vinci the painter and the Da Vinci Code book.

Tagging produced interesting and useful associations.

Potential benefits

Users offer differing perspectives on how resources can be organised and described.

Users designate terms that make sense to them.

Users provide machine-readable metadata for information content.

Tagging can enhance search engine information retrieval.

Folksonomies can help support emergent vocabularies and multilingual information classification and retrieval.

Cat

Germaine from Switzerland

Cat chat

Ella from Poland

Da Vinci search with flickr

All of the first page results relate to the book, not the painter.

Interesting, but perhaps not so useful.

Potential issues

Looking for answers - Rough and ready survey

July 2006, participants:

Key questions:

Who has tagged in the past

How many of the participants have previously tagged web content?

8 out of 40 participants

How will the participants tag two survey photos?

Photo #1 tags

49 different tags used. Most common tags:

29 unique tags including:

Photo #2 tags

67 different tags used. Most common tags:

47 unique tags including:
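Figures like these can be derived from raw responses along the following lines. This sketch assumes that "different tags" means distinct tags across all participants and that "unique tags" means tags supplied by only one participant; the sample responses are invented and are not the survey data.

```python
from collections import Counter
from typing import Dict, List

# Invented responses: one list of tags per participant (NOT the survey data).
SAMPLE_RESPONSES: List[List[str]] = [
    ["harbour", "bridge"],
    ["bridge", "sydney"],
    ["Bridge", "water"],
    ["sydney", "boat"],
]


def tag_summary(responses: List[List[str]]) -> Dict[str, object]:
    """Summarise distinct tags, tags used by one participant only, and the top tag."""
    counts: Counter = Counter()
    for tags in responses:
        # Normalise case and whitespace, and count each tag once per participant.
        counts.update({tag.strip().lower() for tag in tags})
    return {
        "different": len(counts),
        "used_once": sorted(tag for tag, n in counts.items() if n == 1),
        "most_common": counts.most_common(1),
    }


SUMMARY = tag_summary(SAMPLE_RESPONSES)

if __name__ == "__main__":
    print(SUMMARY)
```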

How many will tag in the future

At the end of the survey each participant was asked:

"If in the future you could provide tags for web content (pages, images) that might be helpful to you and other users, how often would you do this?"

Never  Infrequently  Sometimes  Often  Always
4      15            10         6      5
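To make the proportions easier to read, the five counts can be converted to percentages of the 40 participants. The counts come from the survey result above; the variable names are just for this sketch.

```python
# Survey counts as reported above.
RESPONSES = {
    "Never": 4,
    "Infrequently": 15,
    "Sometimes": 10,
    "Often": 6,
    "Always": 5,
}

TOTAL = sum(RESPONSES.values())

# Share of participants giving each answer, as a percentage.
SHARES = {answer: round(100 * count / TOTAL, 1) for answer, count in RESPONSES.items()}

# Participants who would tag rarely or not at all.
RELUCTANT = RESPONSES["Never"] + RESPONSES["Infrequently"]

if __name__ == "__main__":
    for answer, share in SHARES.items():
        print(f"{answer}: {share}%")
    print(f"Never or infrequently: {RELUCTANT} of {TOTAL}")
```

On these numbers, 19 of the 40 participants (47.5%) said they would tag never or infrequently, while 11 (27.5%) said often or always.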

Comments include:

I just want to get the information and get out.

I might if it helps other people.

Don't have the time.

What's in it for me?

Issues for discussion

What do you do with large numbers of tags?

How do you handle wilfully misleading tags?

Do you allow/encourage idiosyncratic tagging?

Tagging idiosyncrasy

Tag Clouds: Another way to find things

Back to Etsy

Seller tags in a cloud

Popular link tags

Cloud – number of directory entries

Tag cloud mock-up
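A tag cloud usually sizes each item by how often it occurs. One simple way to do this, sketched below, is to scale font size linearly between a minimum and a maximum. The tag counts, the pixel range and the `font_sizes` helper are assumptions for illustration; real sites may use logarithmic or bucketed scaling instead.

```python
from typing import Dict


def font_sizes(counts: Dict[str, int], min_px: int = 12, max_px: int = 32) -> Dict[str, int]:
    """Map each tag's count to a font size, scaled linearly between min_px and max_px."""
    if not counts:
        return {}
    lowest, highest = min(counts.values()), max(counts.values())
    spread = highest - lowest
    sizes = {}
    for tag, count in counts.items():
        # If every tag has the same count, place them all mid-range.
        scale = (count - lowest) / spread if spread else 0.5
        sizes[tag] = round(min_px + scale * (max_px - min_px))
    return sizes


# Hypothetical counts, e.g. number of directory entries per tag.
EXAMPLE_COUNTS = {"beaches": 40, "ferries": 10, "museums": 25}
EXAMPLE_SIZES = font_sizes(EXAMPLE_COUNTS)

if __name__ == "__main__":
    for tag, size in EXAMPLE_SIZES.items():
        print(f"{tag}: {size}px")
```

Here the most frequent tag is drawn at 32px and the least frequent at 12px, so bigger means "more entries", not "more important".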

Back to the rough and ready survey

Two questions:

Survey: Responses #1

What is this? (the tag cloud)

Most seem to recognise it as a list or index of links relating to Sydney.

Survey: Responses #2

Why are some items bigger?

Wide range of responses:

Other concerns

Use of tags and ‘Tag clouds’ for information retrieval raises some interesting questions:

Where to now?

Information Architecture is dead:

Who knows more about what they want than the user?

Folksonomy is a mess:

With mob indexing how will we know where anything is or find what we want?

Traditional hierarchies, facets, tags and folksonomies are all interesting and potentially useful.

It is a question of finding the right balance.

There's more than one way to skin a cat - with apologies to cat lovers

Be cautious when people say they know the one and only way.

Unless of course, the answer is...

42