Dan McCreary's Blog: October 2007

Thursday, October 18, 2007

When XML Schema Imports Become XQueries

For the last five years I have been using an XML Schema design process that I learned from the people at Georgia Tech Research Institute (GTRI) as part of my work with GJXDM and the NIEM. This work has been strongly influenced by Jim Hendler, who co-wrote the Semantic Web article with Tim Berners-Lee. Although GTRI wanted to use RDF (graphs) to represent the exchange data, they had to settle on XML and XML Schemas, since those technologies were already widely used by most criminal justice agencies. They compromised by still using a striped model within the XML Schema driven structures.

Creating an exchange document starts with creating a metadata registry of all the data elements used in a family of XML Schemas. A shopping-cart-like tool is used to select a subset of the data elements. These data elements form a "wantlist". The checkout process creates a zip file that, when uncompressed, creates a set of XML Schema files that you import into your constraint schema.

This process is an order of magnitude easier to use than it was before GTRI created the subschema generation tools. The subschema generated by the checkout process includes not only the XML Schema type definitions you selected but also all the data types that they depend on. The metadata registry maintains an internal dependency list, so if you needed just ten data elements but they depended on ten other data types, you got all 20. You didn't have to manually figure out what additional data types to include.
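The dependency step can be pictured as a walk over a graph. This is only a minimal sketch of the idea, not how the GTRI tools are actually implemented; the function name and the element names in the sample registry are made up for illustration.

```python
from collections import deque

def expand_wantlist(wantlist, depends_on):
    """Return the wantlist plus everything it transitively depends on."""
    selected = set()
    queue = deque(wantlist)
    while queue:
        name = queue.popleft()
        if name in selected:
            continue  # already included, so skip it (also guards against cycles)
        selected.add(name)
        # Queue the direct dependencies so their dependencies get visited too.
        queue.extend(depends_on.get(name, ()))
    return selected

# Hypothetical dependency list: each type maps to the types it uses directly.
depends_on = {
    "Person": ["PersonName", "Address"],
    "PersonName": ["NameText"],
    "Address": ["StreetText", "StateCode"],
}

subschema = expand_wantlist(["Person"], depends_on)
print(sorted(subschema))
```

Asking for just "Person" pulls in the five types it needs, which is the same bookkeeping the checkout process did for us.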

I adapted the GTRI shopping-cart process by adding a "subscribers list" to each of the data elements used in my local metadata registries and local XML Schema creation. I used a large, complex and brittle Ant script to update all the imported files for each XML Schema we were developing. But this could not easily be done during a requirements gathering meeting.

In the past we have viewed this subschema generation as a process of interacting with a web application that creates a set of files. One of our first insights is that this process is in fact just a transformation of the metadata registry based on the rules created by a wantlist and the dependency graph. It does not have to be a manual "batch" process.

In a recent blog post (Metadata Web Services) I pointed out that tools like eXist and XForms allow us to store metadata in XML format, let business analysts (BAs) and subject matter experts (SMEs) update it easily, and query it using languages like XQuery. One of the great realizations I had on my last project was that XQuery has a simple construct that allows you to "chain" web services. The argument to the doc() function is just a URL. If that URL points to a web service that returns XML, you can easily build composite web services. Web services that are transformations can be chained together. This allows you to start using enterprise integration patterns to build reusable services at a very fine grain.
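To make the chaining idea concrete without a running server, here is a rough sketch in Python rather than XQuery. Everything in it is an assumption made for illustration: the routes, the registry layout, and the fetch() helper, which stands in for an HTTP GET (the role doc() plays in XQuery).

```python
import xml.etree.ElementTree as ET

# Hypothetical registry content; a real one would live in the XML database.
REGISTRY_XML = """<registry>
  <element name="PersonName" status="approved"/>
  <element name="Address" status="draft"/>
  <element name="StateCode" status="approved"/>
</registry>"""

def registry_service(fetch):
    """Base service: returns the raw registry XML."""
    return REGISTRY_XML

def approved_service(fetch):
    """Transformation: keeps only approved elements from /registry."""
    root = ET.fromstring(fetch("/registry"))
    out = ET.Element("approved")
    for el in root.findall("element"):
        if el.get("status") == "approved":
            ET.SubElement(out, "element", {"name": el.get("name")})
    return ET.tostring(out, encoding="unicode")

def names_service(fetch):
    """Transformation chained on top of /approved: returns just the names."""
    root = ET.fromstring(fetch("/approved"))
    out = ET.Element("names")
    for el in root.findall("element"):
        ET.SubElement(out, "name").text = el.get("name")
    return ET.tostring(out, encoding="unicode")

ROUTES = {
    "/registry": registry_service,
    "/approved": approved_service,
    "/names": names_service,
}

def fetch(url):
    """Stand-in for an HTTP GET: look up the service and return its XML."""
    return ROUTES[url](fetch)

print(fetch("/names"))
```

Each service only knows the URL of the one below it, so a new transformation can be added to the chain without touching the others.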

Putting these facts together, it has occurred to me that the ultimate goal of the metadata registry process is to allow data element definitions and enumerated values to be changed during an XML Schema design process. We should be able to change a definition, hit "save" in the XForms application that edits data elements and then just refresh the generated subschema. The refresh process must call an XQuery that calls web services that analyze the dependency graph and update the data elements imported into the XML Schema.

This helps us get to the goal of updating a model quickly and generating new artifacts directly from the model. This is part of the overall model-driven development process. The artifacts we create include XML Schemas, instance documents generated from the XML Schemas and XForms applications to view and edit the documents. This faster turn-around time allows our users to quickly see if their definitions and enumerated values are being precisely captured and used to create actual systems.

Wednesday, October 10, 2007

XForms Tutorial and Cookbook Being Translated to French and Used in China

I got nice notes from people in both France and China saying that our XForms Tutorial and Cookbook is now being used there.

The French version is being hosted on a version of the MediaWiki server that also has a new macro for code syntax highlighting.

I have not yet figured out whether this will work in Wikibooks, but I have my doubts. The URL for the French version is here: http://xforms.free-web-hosting.biz/mediawiki-1.9.3/index.php/Wikilivre

Hints of the Chinese version were posted by a zhouliyi from ZheJiang, China. These were posted in the Cookbook Guest Registry: http://en.wikibooks.org/wiki/XForms/Guest_Registry

Please let me know if anyone else is using XForms so we can maintain a list of users.

Metadata Web Services

I have been using the eXist native XML database/web server for about nine months now and it is starting to change the way I think about metadata management.

My latest project for a financial institution requires us to quickly build XForms to manage various metadata as well as data. What I am finding is that my old method of storing metadata in XML files on a file system and then transforming the metadata using Apache Ant was a complex process. My new approach is to store the XML directly in eXist. This used to be a little bit hard, since I thought that you had to use the eXist web interface to upload each XML file and Ant scripts to back up your eXist database.

This all changed when I was shown how to use the Microsoft Windows WebDAV tool. Copying files to eXist and backing up the entire data store is just a drag and drop using Windows.

Now an entirely new set of metadata web services is becoming much easier to build. Take the simple task of building a pick list of enumerated values for a form. XForms allows you to use the select1 control and specify an itemset using an XPath expression. I can now just load the data elements into an instance and grab the values and labels directly from the enumerations in the metadata registry files.

The only drawback to this is that you load more metadata (like the full definitions) than you need for building the form. But once again eXist comes to the rescue. It takes just a few lines of code to create a little web service (using XQuery) that you pass a code table to and that returns just the label/value pairs. Using this method, the selection lists are always up to date and don't require any "batch" updates.
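The real service is an XQuery running in eXist, but the transformation itself is simple enough to sketch in Python. The registry layout here (DataElement, Enumerations, Item and so on) is a hypothetical one invented for the example, not the actual format of my registry files.

```python
import xml.etree.ElementTree as ET

# Hypothetical data element as it might be stored in the registry,
# including the full definitions that the form does not need.
DATA_ELEMENT = """<DataElement>
  <Name>PaymentMethodCode</Name>
  <Definition>A code for the way a payment was made.</Definition>
  <Enumerations>
    <Item>
      <Value>CC</Value>
      <Label>Credit Card</Label>
      <Definition>Payment made with a credit card.</Definition>
    </Item>
    <Item>
      <Value>CK</Value>
      <Label>Check</Label>
      <Definition>Payment made with a paper check.</Definition>
    </Item>
  </Enumerations>
</DataElement>"""

def code_table(data_element_xml):
    """Strip a data element down to the label/value pairs a pick list needs."""
    root = ET.fromstring(data_element_xml)
    out = ET.Element("code-table", {"name": root.findtext("Name")})
    for item in root.iterfind("Enumerations/Item"):
        pair = ET.SubElement(out, "item")
        ET.SubElement(pair, "label").text = item.findtext("Label")
        ET.SubElement(pair, "value").text = item.findtext("Value")
    return out

table = code_table(DATA_ELEMENT)
print(ET.tostring(table, encoding="unicode"))
```

The result carries only what a select1 itemset would bind to, so the form loads labels and values and nothing else.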

What I am learning from this is that in the past, metadata management was usually an afterthought, something that the coding standards people used to enforce database column naming conventions. But with metadata being stored in eXist, metadata becomes part of application services. Building apps is just assembling forms that pull metadata from the registry in real time.

If you are concerned that the metadata registry server will be overloaded with requests for information each time forms load, remember that these services are RESTful. The results can also be cached so they don't have to be regenerated. I still have more to learn about how to make these services fast, but since metadata is small it can usually stay in RAM, so disk I/O is very limited.

All of these developments are just small pieces of the puzzle of putting well-managed metadata at the core of your enterprise development methodologies. It is really the heart of the model-driven enterprise.

Let me know if you are creating metadata web services. I would like to know what things you feel are useful to your users.

- Dan