UsingCategories

This page is still under development.

Categories are already implemented in PmWiki 2, and in most wikis they don't require any special code or markup; they're just a useful convention in wiki pages. The idea is that every page in a given category has a link to a shared category page.

There is also a shortcut markup for category entries: [[!category]] creates a link to Category/category.

Since PmWiki has WikiGroups, I'll assume that category links go to the Category group. However, bear in mind that you can use any group or page naming scheme for this -- it's just a convention and doesn't require special programming support. TODO: special wiki-var for categories

The key aspects of building a category are:

  1. On every page that belongs to a category "XYZ", create a link on that page to [[Category/XYZ]] or [[Category.XYZ]].
  2. Then, to see a list of all of the pages that belong to category XYZ, simply do a search to list all pages that have links to the Category.XYZ page.
Note: at present, all pages that contain the text "Category.XYZ" are displayed. A future version will allow (:pagelist linkto=Category.XYZ:) or (:pagelist backlink=Category.XYZ:) (I'm still working on the syntax), which will search strictly for links and not text. TODO: check the implementation and update these docs.
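
The two steps above amount to a backlink lookup. The following Python sketch illustrates the idea only; it is not PmWiki's implementation. The `pages` dictionary and both function names are hypothetical, and the pattern handles only the plain [[Group/Name]] and [[Group.Name]] link forms. It follows links rather than matching text, so a page that merely mentions "Category.XYZ" is not listed.

```python
import re

# Matches [[Group/Name]] or [[Group.Name]]; a simplification of real wiki link syntax.
LINK_RE = re.compile(r"\[\[(\w+)[./](\w+)\]\]")

def outgoing_links(text):
    """Return the set of 'Group.Name' targets linked from a page's text."""
    return {f"{group}.{name}" for group, name in LINK_RE.findall(text)}

def pages_in_category(pages, category):
    """List pages whose text links to the given category page, e.g. 'Category.XYZ'."""
    return sorted(name for name, text in pages.items()
                  if category in outgoing_links(text))

# Hypothetical sample pages (page name -> page text).
pages = {
    "Main.Alpha": "Some text. [[Category/XYZ]]",
    "Main.Beta": "Other text. [[Category.XYZ]] [[Category/ABC]]",
    "Main.Gamma": "Mentions Category.XYZ in plain text only.",
}
print(pages_in_category(pages, "Category.XYZ"))
```

Here Main.Alpha and Main.Beta are listed, while Main.Gamma is skipped because it contains no actual link.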

PmWiki 2 makes this second step incredibly simple, since you can easily generate a list of all pages with links to Category.XYZ by doing

    (:pagelist Category.XYZ:)

So, that's really all there is to categories. But wait, there's more!

As John Rankin pointed out in his excellent post, the above convention, combined with the fact that all of the category pages are in a single group, means that we can do much more. If we create a page called Category.GroupFooter, and put the following markup in that page

    (:pagelist Category.{$Name}:)

then every page in the Category group, even empty/non-existent ones, will display a list of pages that are in the category named by the page. For example, the Category.XYZ page display will automatically include Category.GroupFooter, and {$Name} in the GroupFooter will be replaced by the name of the current page (XYZ) to give us the (:pagelist Category.XYZ:) directive described above! So, we don't even need to write a bunch of (:pagelist Category.<Name>:) directives to create the page listings -- just creating a generic one in Category.GroupHeader or Category.GroupFooter will do it for all categories we might create.
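
The {$Name} substitution described above can be pictured as a plain string replacement. This Python sketch is only an illustration, not how PmWiki implements page variables; `expand_footer` is a hypothetical helper.

```python
def expand_footer(footer_markup, page_name):
    """Substitute the {$Name} page variable with the current page's name."""
    return footer_markup.replace("{$Name}", page_name)

# The single generic directive stored in Category.GroupFooter.
footer = "(:pagelist Category.{$Name}:)"

# Each category page gets its own directive from the same footer text.
for page in ("XYZ", "Horror"):
    print(expand_footer(footer, page))
```

One footer therefore serves every page in the group, including pages that have not been created yet.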

The pagelist directive finds any instance of "Category.XYZ" in the text, and it also finds any page that has an outgoing link to Category.XYZ (regardless of whether that link was written as [[Category.XYZ]], [[Category/XYZ]], [[Category.X(Y)Z]], or even [[x y z]]).
Pages such as RecentChanges also show up in the category by default and should probably be filtered out with list=normal:

    (:pagelist Category.{$Name} list=normal:)

(A link to a non-existent category page is automatically turned into an action=edit link, so to view such a page you have to type its address into the browser's URL field.)

So, what do we get? Every page belonging to Category.XYZ has a link to Category.XYZ, and following that link automatically displays a list of all pages in the XYZ category.

But that's not all...!

The Category.* pages can themselves be placed into categories! So, to follow John's excellent example, let's suppose we have the following film pages in the categories listed to the right:

 Film.SeanOfTheDead    [[!Horror]] [[!Comedy]] [[!2003]]
 Film.InMyFathersDen   [[!Drama]] [[!2004]]
 Film.TheCorporation   [[!Documentary]] [[!2003]]

Now then, we can create Category.Horror, Category.Comedy, Category.Drama, and Category.Documentary, and in each one of those pages we put [[!Genre]]. In Category.2003 and Category.2004, we put [[!Year]].

So, what happens when we display Category.Genre? We see links to "Comedy", "Drama", "Documentary", and "Horror", because they're in the Genre category. When we click on one of those links, we see all of the films listed in that category. Similarly, if we display Category.Year, we see links to "2003" and "2004", each of which in turn displays the list of films for that year.

Finally, in Category.Genre and Category.Year we can put [[!Category]], which makes them "top-level" categories reachable from the Category.Category page. Voila, we now have an instant "hierarchy":

   Category.Category
       Category.Genre
           Category.Comedy
               Film.SeanOfTheDead
           Category.Drama
               Film.InMyFathersDen
           Category.Documentary
               Film.TheCorporation
           Category.Horror
               Film.SeanOfTheDead
       Category.Year
           Category.2003
               Film.SeanOfTheDead
               Film.TheCorporation
           Category.2004
               Film.InMyFathersDen

Note however that this isn't a "strict" hierarchy--i.e., any page or category can appear simultaneously in multiple categories. For example, Category.Documentary could be a member of both the Genre and top-level category listings.
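
The tree above follows entirely from the category links on each page. As a rough Python sketch (the `memberships` data and `tree_lines` helper are hypothetical names for illustration, not part of PmWiki), inverting those links and walking down from Category.Category reproduces the listing, with children in alphabetical order:

```python
from collections import defaultdict

# Category links from the film example: page -> categories it links to.
memberships = {
    "Film.SeanOfTheDead": ["Horror", "Comedy", "2003"],
    "Film.InMyFathersDen": ["Drama", "2004"],
    "Film.TheCorporation": ["Documentary", "2003"],
    "Category.Horror": ["Genre"],
    "Category.Comedy": ["Genre"],
    "Category.Drama": ["Genre"],
    "Category.Documentary": ["Genre"],
    "Category.2003": ["Year"],
    "Category.2004": ["Year"],
    "Category.Genre": ["Category"],
    "Category.Year": ["Category"],
}

# Invert the links: category page -> pages that belong to it.
members = defaultdict(list)
for page, categories in memberships.items():
    for category in categories:
        members[f"Category.{category}"].append(page)

def tree_lines(page, depth=0):
    """Return indented lines for a page and everything reachable below it.

    Assumes the links contain no cycles; a cycle would recurse forever.
    """
    lines = ["    " * depth + page]
    for child in sorted(members.get(page, [])):
        lines.extend(tree_lines(child, depth + 1))
    return lines

print("\n".join(tree_lines("Category.Category")))
```

Because a page is simply listed under every category it links to, Film.SeanOfTheDead appears three times, which is the non-strict behavior noted above.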

Each category page can have content text before the generated list, e.g., to give a generic description of things in the category. (Or it can be empty, which works fine.)

And all of this works "out of the box" without any modifications to PmWiki 2.0! John goes a step further and proposes that we create a special markup for "Category", so that one can write things like [[!Comedy]] and [[!Genre]] instead of the longer [[Category/Comedy]] and [[Category/Genre]], but at the moment I'm leaving this as a (one-line!) local customization until it's widely adopted or we have a good standard markup for it. TODO: This is implemented as a test since v2develop25(?)

When we choose a good markup, this is all that's needed: (uses [[!Comedy]] for illustration)
  
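    // Group that holds the category pages; SDV only sets it if it isn't already set.
    SDV($CategoryGroup,'Category');
    // Rewrite [[!Name]] into a link to the category page, wrapped in a span for styling.
    Markup('[[!','<links','/\[\[!([^\|\]]+?)\]\]/',
      "<span class='category'>[[$CategoryGroup/$1]]</span>");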
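
To see what the rule does to page text, here is a rough Python equivalent of the same pattern and replacement. It is a sketch for experimentation only, written in Python rather than PHP; `CATEGORY_GROUP` and `expand_category_markup` are hypothetical names, and PmWiki's markup engine applies its rules in its own way.

```python
import re

CATEGORY_GROUP = "Category"  # mirrors $CategoryGroup in the PHP snippet

# Same pattern as the PHP rule: [[!name]] where name contains no '|' or ']'.
CATEGORY_RE = re.compile(r"\[\[!([^\|\]]+?)\]\]")

def expand_category_markup(text):
    """Rewrite [[!name]] into a span-wrapped [[Category/name]] link."""
    return CATEGORY_RE.sub(
        lambda m: f"<span class='category'>[[{CATEGORY_GROUP}/{m.group(1)}]]</span>",
        text,
    )

print(expand_category_markup("A film tagged [[!Comedy]] and [[!2003]]."))
```

Ordinary links such as [[Main/Page]] are left untouched, since the pattern requires the leading "!".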
  

Hope this helps. The advantage of a separate category markup is that authors can assign pages to categories independently and explicitly. It lets authors distinguish between a category reference and a reference to a page that happens to be a category page.

The hard part about using categories is choosing a good vocabulary. Site content managers may wish to follow the Guidelines for the establishment and development of monolingual thesauri (ISO 2788-1986) and the Guidelines for the establishment and development of multilingual thesauri (ISO 5964-1985). Questions to think about include:

  • whether a scheme already exists and can be reused
  • number of levels in a multilevel scheme (not too shallow, not too deep -- e.g., 3)
  • number of categories per page (not too many, not too few -- e.g., 3)
  • consistent use of singular ([[Mercury]] is a [[!planet]]) or plural ([[Mercury]] is in the [[!planets]] category)
  • disambiguation and use of phrases ([[!musical instruments]] and [[!medical instruments]]) or Cookbook/SubpageMarkup ([[!Instruments*Musical]] and [[!Instruments*Medical]])

(Quoted from postings to the pmwiki-users mailing list.)

TODO: add the text of John Rankin's post too?


samples (created in groups SampleCategory and SampleFilm):



This page may have a more recent version on pmwiki.org: PmWiki:UsingCategories, and a talk page: PmWiki:UsingCategories-Talk.