Thursday, March 23, 2006

A Little Bit of Semantics Goes a Long Way

There are now over 20 semantic wikis under development. For example see here. What is interesting is the large number of ways that people are approaching this process. One of the phrases that I heard a lot at the Semantic Technology Conference was "A Little Bit of Semantics Goes a Long Way". And although I admire all the innovations that people are putting into their semantic wiki projects, I think that just adding attributes to a page and adding types to links is all that we really need to test the concepts out. It was also pointed out to me that several other wikis do have strong ACL features and that you can "lock down" pages using tools such as Apache administration tools. There also seems to be a lot of discussion about implementing semantic wikis using triple stores. This seems to me a little ahead of its time. Using a triple store may make a system more flexible, but it appears to me that the current structure is more than flexible enough to meet most business requirements. If people wanted to work on something useful, I would suggest something like a natural language front-end to Wikipedia so people could type in plain-language questions like "What is the population of Minneapolis?". That would be useful! - Dan

Monday, March 13, 2006

Impressions of Semantic Technology Conference 2006

I just returned from the 2006 Semantic Technology Conference in San Jose, CA. This was a fantastic conference with over 600 people attending 72 presentations over four days. That is more than twice last year's attendance. It shows that semantics is really becoming a hot topic, although there is still plenty of room for discussion on when and how this will impact the industry.  The biggest problem for me was that with up to six concurrent sessions running it was very hard to decide which sessions to attend.

The attendees were mostly vendors of semantic technology products, large customers from government agencies like the DOD and NASA, some academics working on semantic research topics, a few venture capital firms and a handful of independent consultants like myself.  Some of the people were from the "old AI school" and were promoting the use of description logic (DL) within an ontology. Others, like me, were simply searching for semantic integration techniques for the deployment of intelligent agents.

I think one of the most important presentations was the keynote given by Jim Hendler and Ora Lassila, two of the three people who wrote the original article in Scientific American with Tim Berners-Lee.  The presentation discussed some of the linkages between traditional AI (with complex ontologies) and the current document-centric web with simple un-typed links between documents.  My favorite sound bite was "Linking is Power".  The metaphor was that semantics are the "plumbing" necessary to allow intelligent agents to perform interesting work over the world wide web. If agents don't know how to access distributed data they will be restricted to local databases, not a great prospect.  And the way to promote agent interoperability is to link ontologies, something I have been advocating for a long time.  They also pointed out that Google now has around 12K ontologies (search Google for ontology filetype:owl).  But the problem is that each ontology is an "island" and few people are linking ontologies.  Looking for ontologies with lots of equivalentClass statements turns up fewer than 40 ontologies.

This leads to the natural question: how do we encourage ontology linking?  One standard proposal is to encourage everyone to link to a standard reference upper ontology such as SUMO or CYC.  The problem is that few people can agree on what this upper ontology should be.

I have been trying to champion the application of some basic economic theory to the publishing of ontology links.  If we do a supply-and-demand analysis on ontology links we see that there are few economic incentives to publish inter-ontology data element links.  If people did publish them, then agents could take advantage of them and automatically perform semantic translation.  Jim Hendler and Ora Lassila both thought this was a worthy idea and indicated they would support it.  I hope to kick this off as some type of reward/recognition for next year's conference.  Tony Shaw from Wilshire Conferences has also been very supportive of the idea.
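To make the semantic translation idea concrete, here is a minimal sketch of how an agent might follow published equivalence links to move a term from one ontology to another. The prefixes (`k12`, `sumo`, `hr`), the example terms and the function names are all hypothetical, and a real agent would read equivalentClass statements from OWL files rather than from a hard-coded list.

```python
from collections import deque

# Hypothetical published links: each pair asserts that two terms are equivalent.
EQUIVALENT_LINKS = [
    ("k12:Student", "sumo:Student"),
    ("sumo:Student", "hr:Learner"),
    ("k12:School", "sumo:School"),
]

def build_graph(links):
    """Build an undirected graph, since equivalence holds in both directions."""
    graph = {}
    for a, b in links:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def translate(term, target_prefix, graph):
    """Breadth-first search for an equivalent term in the target ontology.

    Returns the first equivalent term found, or None if no chain of
    published links reaches the target ontology.
    """
    seen = {term}
    queue = deque([term])
    while queue:
        current = queue.popleft()
        if current != term and current.startswith(target_prefix + ":"):
            return current
        for neighbor in sorted(graph.get(current, ())):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return None

graph = build_graph(EQUIVALENT_LINKS)
student_in_hr = translate("k12:Student", "hr", graph)
school_in_hr = translate("k12:School", "hr", graph)
```

In this sketch `k12:Student` reaches the `hr` ontology only because someone published both links through the shared term, while `k12:School` cannot be translated because the second link was never published. That is the "island" problem in miniature.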

Semantic wikis were also a very hot topic.  Any presentation with the word Wiki in the title was packed full of people.  The topic of using wikis to build controlled vocabularies was mentioned in several presentations.  Problems with locking down approved terms were also mentioned.  Ideally a metadata wiki would have different access control for different sections of a single page.  The approved definitions would require stakeholder team approval for changes.

I advocated for first adding simple typed links into MediaWiki for obvious relationships like subClassOf, instanceOf, partOf and basic GIS-type things like insideOf, capitalOf.  Note that adding a prefix to the current MediaWiki links could allow this to be added incrementally to Wikipedia.  I think this goes along the lines of "A little semantics goes a long way".  Typed links are an awesome way to add semantics to any system and I predict that there will be dozens of semantic wikis in the near future.
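A minimal sketch of what the typed-link idea could look like in practice. It assumes a hypothetical `linkType::` prefix inside the normal double-bracket link syntax; the `::` separator, the generic `linksTo` type and the function name are assumptions for illustration, not part of the MediaWiki software discussed here.

```python
import re

# Hypothetical syntax: [[linkType::Target]] is a typed link,
# [[Target]] or [[Target|label]] is an ordinary untyped link.
LINK_PATTERN = re.compile(r"\[\[(?:(\w+)::)?([^\]|]+)(?:\|[^\]]*)?\]\]")

def extract_triples(page_title, wikitext):
    """Return (subject, link_type, target) triples for every link on a page.

    Untyped links are kept and given the generic type "linksTo", so
    existing pages keep working while types are added incrementally.
    """
    triples = []
    for match in LINK_PATTERN.finditer(wikitext):
        link_type = match.group(1) or "linksTo"
        target = match.group(2).strip()
        triples.append((page_title, link_type, target))
    return triples

text = (
    "[[instanceOf::City]] Saint Paul is the [[capitalOf::Minnesota]]. "
    "See also [[Minneapolis]]."
)
triples = extract_triples("Saint Paul", text)
```

Because the prefix is optional, pages without typed links parse exactly as before, which is what makes an incremental rollout plausible.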

For some time I have been suggesting that OpenCYC and Wikipedia will eventually merge into a single system.  It would be very easy to add simple link types to Wikipedia.  We have thousands of volunteers waiting in the wings.  But when I talked to Doug Lenat, he indicated that CYC has over 16,000 types of relationships.  Yikes!  We would almost need a mini-expert system to figure out the link type.  The training for adding consistent links might be challenging.  Nonetheless the prospect of Wikipedia evolving into HAL over the next ten years is exciting.

I always enjoy going to Doug Lenat's presentations.  He indicated that over half of the new rules added to CYC in the last year were added by reading natural language text.  That is VERY exciting: automated machine learning.  We just hope it is not another false peak.  I also asked Doug how I could add my K-12 ontology into CYC.  He indicated that I would have to learn about CYC's relationship types to do this effectively.  But he also agreed that ontology linking was the Rosetta Stone for enabling web-wide intelligent agents.

For all the advanced topics, I was struck by the lack of the use of structured XML Schemas to capture semantics using controlled vocabularies and importing semantic XML schemas created by subschema generators from metadata registries.  I was also very impressed by Contivo's new Builder product which is going for an incredibly low $500.

There also appears to be good movement toward incrementally adding semantics to HTML documents using RDF/A.  I attended an excellent presentation by a person from MIT who was working on these standards.  Although his arguments were not all clear, the one point he did make was that adding RDF in the "class" attributes would have minimal negative impact on the current web.

I was also glad to hear there is more interest in the semantic web in Minnesota.  Apparently Lockheed-Martin is starting a new logistics agent project that will be done in Eagan, Minnesota.  They indicated that much of this would be based on semantic technologies.  Lockheed sent seven people to the conference at the last minute so at least I know they have a budget for training.  Logistics problems like those from Katrina could clearly benefit from semantic agents.  Lockheed has been promoting the use of the semantic web in their literature:  See the article on page 14.

My only real complaint with the conference is that the conference organizers allow people to present without sharing any of their slides even though they are supposedly "due" almost two months before the conference is scheduled.   We are forced to take notes and ask for e-mailed copies of the slides.  My success rate at getting copies mailed out after a conference has been very low.  <shameless bragging> I believe I was also one of the few presenters that wrote the supplementary paper for the conference.  Providing only the PDF versions of the slides also prevents you from getting builds and speaker's notes.  For an example see my full presentation web site.</shameless bragging>

The Semantic Technology Conference will also be in San Jose next year but has been moved to early April.

Friday, March 03, 2006

Semantic Technology Conference

I am heading out to speak at the Semantic Technology Conference in San Jose, CA. If you have any comments on my presentation, please post them here. Here is a link to my presentation site: http://www.danmccreary.com/presentations/semweb2006/ I have animations on the PowerPoint slides, and make sure to check out the notes pages. The conference looks to be a good one. There are many speakers on real-world case studies and there seems to be a growing market for consultants that understand the social aspects of metadata publishing. I always like hearing Doug Lenat talk. His charts on automated machine learning are the most realistic path we have to HAL and real AI. But it is still 14 years before we have the resources to pass the Turing Test.

Learning Objects for Economics: Dynamic Graphs in SVG

I have been interested in the creation of Learning Objects and the metadata tags associated with them. To get started I have created a set of dynamic economic graphs on my web site: http://www.danmccreary.com/svg/ I wrote the graphs using SVG since I am already familiar with XML and JavaScript. These are simple supply and demand graphs, but their dynamic nature allows the learner to visualize how, as inputs change, other factors such as total revenue and profit also change. I am willing to do more of these if people are interested. I have sent notes to several people that indicated an interest in economics but did not get any hits yet. One problem is that SVG is part of Firefox but still not 100% implemented. The Adobe SVG viewer also has some extensions. My friend Jack Nutting is doing work with SVG animation on cell phones. So students could study economics on the school bus with their cell phones. I would also like to allow the graphs to scale to fill the screen but the slider object I am using does not currently do this. So here are my questions: 1) Who would use these objects? 2) How would they find them? 3) What enhancements would you make? 4) How could these objects be integrated into an LMS using Moodle? I am allowing non-profits to use the initial versions free of charge but would people pay me to write more complex versions?
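For readers who want to see the relationship the graphs visualize, here is a minimal numeric sketch. It assumes straight-line demand and supply curves and a constant per-unit cost; those simplifications, the function names and all the numbers are illustrative assumptions and are not taken from the SVG graphs themselves.

```python
def equilibrium(demand_intercept, demand_slope, supply_intercept, supply_slope):
    """Find where linear demand and supply curves cross.

    Demand: price = demand_intercept - demand_slope * quantity
    Supply: price = supply_intercept + supply_slope * quantity
    """
    quantity = (demand_intercept - supply_intercept) / (demand_slope + supply_slope)
    price = demand_intercept - demand_slope * quantity
    return quantity, price

def revenue_and_profit(quantity, price, unit_cost):
    """Total revenue and profit, assuming a constant cost per unit sold."""
    revenue = price * quantity
    profit = revenue - unit_cost * quantity
    return revenue, profit

# Baseline market (illustrative numbers).
base_q, base_p = equilibrium(100, 2, 20, 2)
base_revenue, base_profit = revenue_and_profit(base_q, base_p, unit_cost=40)

# Move the demand "slider": the whole demand curve shifts up by 20.
new_q, new_p = equilibrium(120, 2, 20, 2)
new_revenue, new_profit = revenue_and_profit(new_q, new_p, unit_cost=40)
```

Shifting one input moves the equilibrium point, and total revenue and profit move with it. That is the effect the sliders are meant to let a learner see.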