Ruminations on NoSQL, XRX, XQuery, Semantics, STEM, Arduino, Internet-of-Things (IoT) and empowering the non-programmer.
Thursday, March 23, 2006
A Little Bit of Semantics Goes a Long Way
Monday, March 13, 2006
Impressions of Semantic Technology Conference 2006
I just returned from the 2006 Semantic Technology Conference in San Jose, CA. This was a fantastic conference with over 600 people attending 72 presentations over four days. That is more than twice last year's attendance. It shows that semantics are really becoming a hot topic, although there is still plenty of room for discussion on when and how this will impact the industry. The biggest problem for me was that with up to six concurrent sessions running it was very hard to decide which sessions to attend.
The attendees were mostly vendors of semantic technology products, large customers from government agencies like the DOD and NASA, some academics working on semantic research topics, a few venture capital firms and a handful of independent consultants like myself. Some of the people were from the "old AI school" and were promoting the use of description logic (DL) within an ontology. Others, like me, were simply searching for semantic integration techniques for the deployment of intelligent agents.
I think one of the most important presentations was the keynote given by Jim Hendler and Ora Lassila, two of the three people who wrote the original article in Scientific American with Tim Berners-Lee. The presentation discussed some of the linkages between traditional AI (with complex ontologies) and the current document-centric web with simple un-typed links between documents. My favorite sound bite was "Linking is Power". The metaphor was that semantics are the "plumbing" necessary to allow intelligent agents to perform interesting work over the world wide web. If agents don't know how to access distributed data they will be restricted to local databases, not a great prospect. And the way to promote agent interoperability is to link ontologies, something I have been advocating for a long time. They also pointed out that Google now has around 12K ontologies (search Google for ontology filetype:owl). But the problem is that each ontology is an "island" and few people are linking ontologies. Looking for ontologies with lots of equivalentClass statements turns up fewer than 40 ontologies.
This leads to the natural question: how do we encourage ontology linking? One standard proposal is to encourage everyone to link to a standard reference upper ontology such as SUMO or CYC. The problem is that few people can agree on what this upper ontology should be.
I have been trying to champion the application of some basic economic theory to the publishing of ontology links. If we do a supply-and-demand analysis on ontology links we see that there are few economic incentives to publish inter-ontology data element links. If people did publish them, agents could take advantage of them and automatically perform semantic translation. Jim Hendler and Ora Lassila both thought this was a worthy idea and indicated they would support it. I hope to kick this off as some type of reward/recognition for next year's conference. Tony Shaw from Wilshire Conferences has also been very supportive of the idea.
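To make the "semantic translation" point concrete, here is a minimal sketch of how an agent might follow published equivalence links from a term in one ontology to a term in another. Everything in it is an assumption for illustration: the ontology prefixes (`k12`, `school`, `upper`), the terms, and the function names are hypothetical, and a real agent would read equivalentClass statements from OWL files rather than from a hard-coded list.

```python
from collections import defaultdict, deque

# Hypothetical equivalence links, standing in for published
# equivalentClass statements between three made-up ontologies.
EQUIVALENT_CLASS_LINKS = [
    ("k12:Student", "school:Pupil"),
    ("school:Pupil", "upper:Learner"),
    ("k12:Teacher", "school:Instructor"),
]


def build_link_graph(links):
    """Index the links in both directions, since equivalence is symmetric."""
    graph = defaultdict(set)
    for left, right in links:
        graph[left].add(right)
        graph[right].add(left)
    return graph


def translate(term, target_prefix, graph):
    """Breadth-first search for an equivalent term in the target ontology.

    Returns the first matching term found, or None when no chain of
    published links reaches the target ontology.
    """
    seen = {term}
    queue = deque([term])
    while queue:
        current = queue.popleft()
        if current.startswith(target_prefix + ":"):
            return current
        for neighbour in sorted(graph.get(current, ())):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return None


graph = build_link_graph(EQUIVALENT_CLASS_LINKS)
print(translate("k12:Student", "upper", graph))
print(translate("k12:Teacher", "upper", graph))
```

In this toy data the student term can be translated because someone published both hops, while the teacher term cannot, because nobody published the link from `school:Instructor` onward. That missing link is exactly the supply problem described above.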
Semantic wikis were also a very hot topic. Any presentation with the word Wiki in the title was packed full of people. The topic of using wikis to build controlled vocabularies was mentioned in several presentations. Problems with locking down approved terms were also mentioned. Ideally a metadata wiki would have different access control for different sections of a single page. The approved definitions would require stakeholder team approval for changes.
I advocated for first adding simple typed links into MediaWiki for obvious relationships like subClassOf, instanceOf, partOf and basic GIS-type things like insideOf, capitalOf. Note that adding a prefix to current MediaWiki links could allow this to be added incrementally to Wikipedia. I think this goes along the lines of "A little semantics goes a long way". Typed links are an awesome way to add semantics to any system and I predict that there will be dozens of semantic wikis in the near future.
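As a rough sketch of the typed-link idea, the snippet below pulls (page, relation, target) triples out of wiki text. The `[[relation::Target]]` prefix form is an assumed syntax chosen for illustration, not something MediaWiki itself defined at the time of the conference, and the `linksTo` label for ordinary untyped links is likewise made up.

```python
import re

# Assumed syntax: [[relation::Target]] is a typed link, while the existing
# [[Target]] and [[Target|label]] forms stay valid and remain untyped.
LINK_PATTERN = re.compile(r"\[\[(?:(\w+)::)?([^\]|]+)(?:\|[^\]]*)?\]\]")


def extract_links(page_title, wikitext):
    """Return (page, relation, target) triples for every link in the text."""
    triples = []
    for match in LINK_PATTERN.finditer(wikitext):
        # Untyped links get a generic, hypothetical "linksTo" relation.
        relation = match.group(1) or "linksTo"
        target = match.group(2).strip()
        triples.append((page_title, relation, target))
    return triples


text = "Paris is the [[capitalOf::France]] and lies on the [[Seine|river Seine]]."
for triple in extract_links("Paris", text):
    print(triple)
```

Because untyped links still parse, pages could pick up typed links one edit at a time, which is the incremental path suggested above.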
For some time I have been suggesting that OpenCYC and Wikipedia will eventually merge into a single system. It would be very easy to add simple link types to Wikipedia. We have thousands of volunteers waiting in the wings. But when I talked to Doug Lenat, he indicated that CYC has over 16,000 types of relationships. Yikes! We would almost need a mini expert system to figure out the link type. The training for adding consistent links might be challenging. Nonetheless the prospect of Wikipedia evolving into HAL over the next ten years is exciting.
I always enjoy going to Doug Lenat's presentations. He indicated that over half of the new rules added to CYC in the last year were created by reading natural language text. That is VERY exciting: automated machine learning. We just hope it is not another false peak. I also asked Doug how I could add my K-12 ontology into CYC. He indicated that I would have to learn about CYC's relationship types to do this effectively. But he also agreed that ontology linking was the Rosetta Stone for enabling web-wide intelligent agents.
For all the advanced topics, I was struck by how little use was made of structured XML Schemas to capture semantics using controlled vocabularies, or of importing semantic XML schemas created by subschema generators from metadata registries. I was also very impressed by Contivo's new Builder product, which is going for an incredibly low $500.
There also appears to be good movement toward incrementally adding semantics to HTML documents using RDF/A. I attended an excellent presentation by a person from MIT who was working on these standards. Although his arguments were not all clear, the one point he did make was that adding RDF in the "class" attributes would have minimal negative impact on the current web.
I was also glad to hear there is more interest in the semantic web in Minnesota. Apparently Lockheed Martin is starting a new logistics agent project that will be done in Eagan, Minnesota. They indicated that much of this would be based on semantic technologies. Lockheed sent seven people to the conference at the last minute, so at least I know they have a budget for training. Logistics problems like those from Katrina could clearly benefit from semantic agents. Lockheed has been promoting use of the semantic web in their literature: See the article on page 14.
My only real complaint with the conference is that the organizers allow people to present without sharing any of their slides, even though the slides are supposedly "due" almost two months before the conference. We are forced to take notes and ask for e-mailed copies of the slides. My success rate at getting copies mailed out after a conference has been very low. <shameless bragging> I believe I was also one of the few presenters who wrote the supplementary paper for the conference. Providing only the PDF versions of the slides also prevents you from getting builds and speaker's notes. For an example see my full presentation web site.</shameless bragging>
Next year's Semantic Technology Conference will also be in San Jose but has been moved to early April.