Saturday, April 22, 2006

Natural Language Interface to Wikipedia

I have been saying for a while that because Wikipedia is a database, it is much easier to link it into a natural language search system. The Wikipedia search function is also very limited and does not allow you to type in complete questions the way ask.com does. I recently found out that MIT is already building a very powerful natural language interface in their START project. http://start.csail.mit.edu I did a search for "Who was Tesla?" and got a link right to the Wikipedia article. Very cool! Although there seems to be very limited documentation on START and I do not believe that there is an open source version, I believe this points in the direction that wiki software should be moving. Could your relational database benefit from this? When I was at the Minnesota Department of Education I wanted to build a natural language interface to their "cubes" of student test data. For example, "What school district has the highest 8th grade math scores for girls?" Give it a try and let me know what you think.
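To make the relational database idea a little more concrete, here is a minimal sketch of how one question template could be mapped to a SQL query. Everything in it is an assumption made for illustration: the `scores` table, its columns, the made-up rows, and the `answer` helper are all hypothetical, and this is not how START or the Minnesota Department of Education cubes work. A real system would need far more than a single regular expression.

```python
import re
import sqlite3

# Hypothetical schema and made-up rows, for illustration only.
ROWS = [
    ("District A", 8, "math", "F", 61.5),
    ("District B", 8, "math", "F", 72.0),
    ("District B", 8, "math", "M", 70.5),
    ("District C", 8, "reading", "F", 80.0),
]

# A single question template; a real interface would need many more.
PATTERN = re.compile(
    r"what school district has the (highest|lowest) "
    r"(\d+)(?:st|nd|rd|th) grade (\w+) scores for (girls|boys)\??",
    re.IGNORECASE,
)


def build_db():
    """Create an in-memory database holding the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE scores (district TEXT, grade INTEGER, "
        "subject TEXT, gender TEXT, avg_score REAL)"
    )
    conn.executemany("INSERT INTO scores VALUES (?, ?, ?, ?, ?)", ROWS)
    return conn


def answer(conn, question):
    """Return the matching district, or None if the question is not understood."""
    match = PATTERN.fullmatch(question.strip())
    if match is None:
        return None
    extreme, grade, subject, who = match.groups()
    # The sort direction comes from a fixed set of two values, never from raw input.
    order = "DESC" if extreme.lower() == "highest" else "ASC"
    gender = "F" if who.lower() == "girls" else "M"
    row = conn.execute(
        "SELECT district FROM scores "
        "WHERE grade = ? AND subject = ? AND gender = ? "
        f"ORDER BY avg_score {order} LIMIT 1",
        (int(grade), subject.lower(), gender),
    ).fetchone()
    return row[0] if row else None


conn = build_db()
print(answer(conn, "What school district has the highest 8th grade math scores for girls?"))
```

With the sample rows above this prints `District B`. A question that does not fit the template, such as "Who was Tesla?", returns `None`, which is exactly the gap a real natural language system has to close.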

1 comment:

Anonymous said...

Dan, thanks for your thoughts and for pointing out the START project. I gave it a little whirl and was quite (pleasantly, of course!) surprised that I got one correct answer for the question "What is the population of India?", with some demographic info and a link to India. Very neat! It bothers me that although the site has been up since 1993, I had not known about it until today, when I ran into your blog. Why? I wonder why. Is it that the site did not have a critical mass of "answers"? Is it because I was plain incompetent? Or is it more proof that we, the people, are so swayed by hype and marketing (Google, for example) that we have lost our collective ability/guts to point out the diamond yet to be discovered?

Thanks for pointing this out, like many other things :)