Sydney Sharepoint User Group: Taxonomies & Sharepoint (Tuesday 18 August)

Here are the slides & notes for a presentation I gave last night to the Sydney Sharepoint User Group on the topic of Sharepoint & Taxonomies. The presso is basically in 3 sections.

Slides 2-14: Introduction to Taxonomies

I started off by asking the audience how they might group wine together (or classify it) – the answers included colour, variety, vintage, region, bottle shape, sweetness. Then we had a look at the way a wine shop orders them. The point here (apart from giving Brendan a bit of a plug) is that there are many ways to group things – and some of the most useful ones for users/consumers are not necessarily obvious from the object itself. Hierarchies & facets were then discussed via the systems of Dewey* (Dewey Decimal) and Ranganathan (Colon Classification) and some real-world examples.
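
For anyone who prefers to see the hierarchy vs facets distinction in code, here is a rough Python sketch using the wine example. The wines, field names and function names are all made up for illustration (they are not from the slides): a hierarchy gives each item one fixed path, while facets let you slice by any combination of independent attributes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Wine:
    name: str
    colour: str
    variety: str
    region: str
    vintage: int


# Hypothetical sample data, invented purely for illustration.
WINES = [
    Wine("Wine A", "red", "shiraz", "Barossa", 2006),
    Wine("Wine B", "red", "pinot noir", "Yarra Valley", 2007),
    Wine("Wine C", "white", "riesling", "Clare Valley", 2007),
    Wine("Wine D", "white", "chardonnay", "Yarra Valley", 2006),
]


def hierarchy_path(wine):
    """Hierarchical view: every wine sits at exactly one place in a tree."""
    return (wine.colour, wine.variety, wine.name)


def facet_filter(wines, **facets):
    """Faceted view: keep wines matching every requested attribute value."""
    return [
        w for w in wines
        if all(getattr(w, field) == value for field, value in facets.items())
    ]


if __name__ == "__main__":
    print(hierarchy_path(WINES[0]))
    print([w.name for w in facet_filter(WINES, region="Yarra Valley")])
    print([w.name for w in facet_filter(WINES, colour="white", vintage=2007)])
```

In the hierarchy, region and vintage are invisible unless you rebuild the whole tree around them; with facets, they are just another filter.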

The “we don’t need structure, we just need search” comment also got a mention – which resonated with a few people in the audience. The answer(s) to this include: i. taxonomies & metadata can make search better & ii. taxonomies & information mapping are about more than just findability. We ended that segment with Patrick’s taxonomy map.

Slides 15-25: Taxonomies in Organisations

This section could be represented by a 3 x 3 matrix – who is involved in taxonomies vs what they are doing. I split the “who” into 3 broad groups:

Experts – and here I mostly mean taxonomy experts but it could also be subject matter experts.
Machines – language processing / semantic software (but this could also include process automation software).
Users – general people who just do, y’know, stuff.

You need to involve all 3 groups but each has its strengths & weaknesses. And then I tackle 3 broad activities:

Building a taxonomy (or folksonomy or ontology).
Applying terms to documents.
Consuming – which in this situation means doing things with documents based on their metadata. This could be as simple as someone searching & finding something, or as involved as some fancy processing based on an ontology.

Slides 26-37: Sharepoint

Sharepoint’s basic methods of managing metadata are:

This is a good start but Sharepoint has three main deficits:

It doesn’t handle hierarchical relationships between terms in lists well – it treats each list as though it is independent.
Metadata can easily get caught in site “islands”.
It doesn’t do any of the fancy machine classification.
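
To show why the first deficit matters, here is a small Python sketch of what hierarchy-aware metadata buys you. It is plain Python, not Sharepoint code, and the terms, document names and function names are all hypothetical: once terms know their broader terms, a search for "Wine" can also pick up documents tagged only with "Shiraz", which independent flat lists cannot do.

```python
# Hypothetical term hierarchy: each term maps to its broader (parent) term.
# None marks the root. All terms are invented for illustration.
PARENT = {
    "Beverages": None,
    "Wine": "Beverages",
    "Red wine": "Wine",
    "Shiraz": "Red wine",
    "White wine": "Wine",
}

# Hypothetical documents and the terms applied to them.
DOCS = {
    "tasting-notes.doc": ["Shiraz"],
    "supplier-list.xls": ["White wine"],
    "canteen-menu.doc": ["Beverages"],
}


def ancestors(term):
    """Return the chain of broader terms for a term, nearest first."""
    chain = []
    parent = PARENT[term]
    while parent is not None:
        chain.append(parent)
        parent = PARENT[parent]
    return chain


def matches(doc_terms, query_term):
    """True if a document carries the query term or any narrower term."""
    return any(t == query_term or query_term in ancestors(t) for t in doc_terms)


def search(query_term):
    """Return the names of matching documents, sorted alphabetically."""
    return sorted(name for name, terms in DOCS.items() if matches(terms, query_term))


if __name__ == "__main__":
    print(search("Wine"))
    print(search("Red wine"))
```

With flat, independent lists, the search for "Wine" would find nothing here, because no document is tagged with that exact term.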

A range of third-party vendors have arisen to meet these needs – each offering very different functionality at varying costs.

*I don’t know whether to be offended or impressed by Dewey’s classification of Australia with extra-terrestrial worlds.

4 Responses to Sydney Sharepoint User Group: Taxonomies & Sharepoint (Tuesday 18 August)

Shannon says:

August 24, 2009 at 9:38 am

I would like to know the answer to the question posed on the last slide — “what experiences do you have with 3rd party products?” We are looking for something that will require very little user intervention…cost is always an issue but we have budgeted a fair amount for this, knowing that taxonomy is an issue in sharepoint.

Adam says:

September 21, 2009 at 3:25 am

nice ppt matt! just like last semester, the principles of multimedia presentation!

innotecture says:

November 6, 2009 at 9:50 pm

Email from Alice Redmond-Neal @ Data Harmony:

SharePoint seems to be taking over the world, and other software systems are having to play along or get left behind. Besides those you mentioned, you should know about Data Harmony software and its connector to SharePoint and numerous other systems. (I refrained from replying to your original post because the TaxoCoP discourages much active contribution from vendors.)

Data Harmony includes
1) Thesaurus Master–a rich taxonomy/thesaurus construction and management tool and
2) M.A.I.–a rule-based categorizer using taxonomy terms that can run automatically or in editor-assist mode.

The two components have been integrated with SharePoint to provide document auto-categorization from a controlled vocabulary. Editors or document handlers can modify the categorization before saving. The ability to continually finetune rules that are transparent to editors leads to extremely precise categorization results. Categorization is based on the approved vocabulary, not floating concept clusters.

I encourage you to take a look at http://www.DataHarmony.com in general to learn a bit about Thesaurus Master and M.A.I., and specifically at http://www.dataharmony.com/library/news/08-11-18-AII_Develops_Connectors_Library.html

Pingback: Titus Labs SharePoint MetaData and Classification Blog