Saturday, July 16, 2011

Hidden (great) secrets inside ATVocabularyManager

ATVocabularyManager is a well-known Plone product developed by BlueDynamics that makes it simple to manage the vocabulary values used by your content directly inside Plone.
That's all: the power of the product is all in this first sentence.

Generic Setup Strikes Back
In my personal experience, projects that need a Plone user to be able to change vocabularies are not common, although in information architecture this is quite a common need (they call it "Controlled Vocabulary"). So I used ATVocabularyManager a couple of times in the past, but I never got into the habit of relying on it.

Another thing I didn't like was the nonexistent Generic Setup integration.

Recently a customer asked us for a new project with some new content types, with a lot of fields using controlled vocabularies (with many values inside). He also explicitly asked to be able to manage and change them in the future.

So I looked at ATVocabularyManager again, hoping that something had changed in the meantime.

A New Hope
This time I noticed immediately that the latest releases (the 1.5 branch for Plone 3.3, and 1.6 for Plone 4) give us something new.

First of all, I finally understood that the "AT" prefix obviously means "Archetypes", but you can read it as "vocabularies are managed using some Archetypes content". Does this mean that you can use it only if your vocabularies are used inside Archetypes content types? No!
You can also use it to provide whatever ZCML vocabulary you need (portlets? Dexterity?).

The other thing I found is that now we have Generic Setup integration. Great!

The Generic Setup integration
Right now the integration is not complete. It seems that the export step is not there, but looking at the code I saw that the import step (the most important one!) is available. Right now the product only suffers from some missing documentation.

How does the import step work? Instead of creating the usual vocabulary content types ("Simple Vocabulary", "Sorted Simple Vocabulary", ...), it is based on the "IMS VDEX Vocabulary File" content type.

At first glance this may seem the least user-friendly and most obscure content type (and this is probably true), but going back to information architecture it is probably the best choice, because it's based on an international XML standard for handling vocabularies: IMS VDEX.
More good news: this format also supports i18n (and Plone handles it too)!

What I needed after this was simple: to provide a VDEX-compatible XML file. How? Let me show a complete example.

How to add the GS support
First of all you need to add a "vocabularies.xml" file to your profile directory.
The format of the file is as follows:

<?xml version="1.0"?>
<object name="portal_vocabularies" meta_type="ATVocabularyManager">
<object name="test.vdex" />
</object>

So you need to provide a reference to a vocabulary file for every vocabulary you need to import (both the .vdex and .xml file extensions are valid).

Where to put all the vocabulary files? You also need to add a "vocabularies" directory at the same level. Simply put all the files inside it.
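Since a typo in a file name is easy to make here, the layout described above can be checked with a few lines of Python before running the import. This is only a sketch: it assumes Python 3, the function name missing_vocabulary_files is made up for illustration, and it knows nothing about how ATVocabularyManager itself reads the profile.

```python
import os
import xml.etree.ElementTree as ET


def missing_vocabulary_files(profile_dir):
    """Return the names listed in vocabularies.xml that have no matching
    file inside the sibling "vocabularies" directory.

    Hypothetical helper, not part of ATVocabularyManager.
    """
    tree = ET.parse(os.path.join(profile_dir, "vocabularies.xml"))
    vocab_dir = os.path.join(profile_dir, "vocabularies")
    # Each <object name="..."/> directly under the root names one file.
    names = [child.get("name") for child in tree.getroot().findall("object")]
    return [
        name for name in names
        if not os.path.isfile(os.path.join(vocab_dir, name))
    ]
```

Calling missing_vocabulary_files("profiles/default") on a correct profile should return an empty list.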

Now I will show the file format of our test.vdex file.

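<vdex xmlns="">
  <vocabName>
    <langstring language="en">A test vocabulary</langstring>
    <langstring language="it">Un vocabolario di test</langstring>
  </vocabName>
  <term><termIdentifier>aaa</termIdentifier><caption>
      <langstring language="en">A value</langstring>
      <langstring language="it">Un valore</langstring>
  </caption></term>
  <term><termIdentifier>bbb</termIdentifier><caption>
      <langstring language="en">Another value</langstring>
      <langstring language="it">Un altro valore</langstring>
  </caption></term>
</vdex>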

That's all. We created a vocabulary with two entries inside (their keys are "aaa" and "bbb"). As said above, managing this directly from Plone is not very comfortable (the simpler vocabulary types are easier to understand), however this is great for a Generic Setup install step!
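To double check what a VDEX file contains without going through Plone, something like the following Python 3 sketch can list the keys and captions for one language. The element names (term, termIdentifier, caption, langstring) are the ones defined by IMS VDEX; the helper names and the embedded sample are assumptions made for illustration, and the code deliberately ignores XML namespaces and nested terms.

```python
import xml.etree.ElementTree as ET

# Sample written for this sketch; it mirrors the two-entry vocabulary.
SAMPLE = """
<vdex>
  <term>
    <termIdentifier>aaa</termIdentifier>
    <caption>
      <langstring language="en">A value</langstring>
      <langstring language="it">Un valore</langstring>
    </caption>
  </term>
  <term>
    <termIdentifier>bbb</termIdentifier>
    <caption>
      <langstring language="en">Another value</langstring>
      <langstring language="it">Un altro valore</langstring>
    </caption>
  </term>
</vdex>
"""


def local(tag):
    """Strip a "{namespace}" prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def read_terms(xml_text, language="en"):
    """Map each top-level term key to its caption in the given language."""
    root = ET.fromstring(xml_text)
    result = {}
    for term in root:
        if local(term.tag) != "term":
            continue
        key, caption = None, None
        for child in term:
            if local(child.tag) == "termIdentifier":
                key = (child.text or "").strip()
            elif local(child.tag) == "caption":
                for entry in child:
                    if (local(entry.tag) == "langstring"
                            and entry.get("language") == language):
                        caption = (entry.text or "").strip()
        if key is not None:
            result[key] = caption
    return result
```

A term without a caption in the requested language maps to None, which makes missing translations easy to spot.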

Internationalization note
There is a little bug in this approach. It seems that even if you don't plan to internationalize your vocabularies (for example, you only need to provide an Italian, Spanish or some other translation) you still need to provide the English one too, or the ATVocabularyManager control panel will not show you the right vocabulary title (you get something like "unnamed vocabulary" instead of "Un vocabolario di test").
But you can use a trick and duplicate your locale-specific translation as the English one. Also, put the English translation first.
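If the langstring entries are generated by a script, the trick can be applied automatically. The Python 3 sketch below is just one possible way to do it; with_english_first is a hypothetical helper, not something provided by ATVocabularyManager or vdexcsv.

```python
def with_english_first(translations, fallback):
    """Return (language, text) pairs with "en" always present and first.

    translations: dict mapping language codes to text.
    fallback: language whose text is copied when "en" is missing.
    """
    english = translations.get("en", translations[fallback])
    pairs = [("en", english)]
    # Remaining languages follow in alphabetical order.
    pairs += [
        (lang, text)
        for lang, text in sorted(translations.items())
        if lang != "en"
    ]
    return pairs
```

For an Italian-only title, with_english_first({"it": "Un vocabolario di test"}, "it") yields the Italian text twice, once tagged "en" and once tagged "it".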

Not very comfortable right now
If you still think so, you are right. VDEX may be a well-known standard for handling vocabularies, but building a vocabulary in this XML format is not very simple.

For example, the customer gave us a document (an MS Word attachment, obviously) with a set of lists of values. The easy way is to put all of this into some CSV files, where the columns are "Italian translation" and "English translation".
But after that we need to convert them to the VDEX format. How?

What I did was to look on the Cheese Shop (PyPI) for a library that can convert a CSV into a VDEX file. What I found was vdexcsv!
Two funny things about it:
  • It was released something like two hours before I performed that search!
  • The company behind this product is again BlueDynamics!
What does this product do? You only need to easy_install it, then you are gifted with a new shell command: csv2vdex.

The documentation is clear: you provide a CSV file and some parameters, and you obtain your VDEX file.

For the example above I used a CSV like this:

"aaa";"A value";"Un valore"
"bbb";"Another value";"Un altro valore"

This was named test.csv.
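If the values come from somewhere other than a spreadsheet, a CSV in the same shape (semicolon separated, every field quoted) can be produced with Python's standard csv module. A minimal Python 3 sketch, where rows_to_csv is a name chosen only for this example:

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize (key, english, italian) tuples as ';'-separated,
    fully quoted CSV text, one row per line."""
    buffer = io.StringIO()
    writer = csv.writer(
        buffer,
        delimiter=";",
        quoting=csv.QUOTE_ALL,
        lineterminator="\n",
    )
    writer.writerows(rows)
    return buffer.getvalue()


ROWS = [
    ("aaa", "A value", "Un valore"),
    ("bbb", "Another value", "Un altro valore"),
]
```

Writing rows_to_csv(ROWS) to a file named test.csv gives the two lines shown above.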

Then I called the script in this way:
csv2vdex test-vocab 'A test vocabulary,Un vocabolario di test' test.csv test.vdex --languages en,it --startrow 1

One last step: the generated VDEX file is perfect except for the root node name. As in the example above, the root node must be called vdex, but the script generates it as vocabulary. However, after I contacted Jens (the product creator, who also helped me reach these results) he agreed to change this in future releases of vdexcsv.
For now, simply rename the node!
EDIT (2011-08-22): the good guys released vdexcsv 1.1, which is now fully standard compliant, so there is no more need to manually rename the node!
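For anyone still stuck on a vdexcsv release older than 1.1, the rename does not have to be done by hand. The following Python 3 sketch uses only the standard library; rename_root is a hypothetical helper and the code assumes the whole document lives in a single (or no) XML namespace.

```python
import xml.etree.ElementTree as ET


def rename_root(xml_text, new_name="vdex"):
    """Return xml_text with its root element renamed to new_name,
    keeping the root's namespace (if any) unchanged."""
    root = ET.fromstring(xml_text)
    if root.tag.startswith("{"):
        namespace = root.tag[1:].split("}", 1)[0]
        # Serialize that namespace as the default one (no "ns0:" prefixes).
        ET.register_namespace("", namespace)
        root.tag = "{%s}%s" % (namespace, new_name)
    else:
        root.tag = new_name
    return ET.tostring(root, encoding="unicode")
```

Reading the generated file, passing its text through rename_root and writing the result back is enough; child elements are left untouched.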