Testing Open Calais

04.17.2008 | Topics: blog, Uncategorized |

I’ve been thinking a lot about tagging and entity extraction lately (I know, exciting!). We use Inform on most of our properties and it performs very well, but new techniques and tools are popping up all over the place. One of those tools is Open Calais by Reuters.

Open Calais is a free web service API (free to use, though not open source) for automatically extracting keywords from text. I hadn’t had time to fiddle with it until I heard that someone had created an Open Calais WordPress Plugin.

I immediately added both the archive and auto tagger to this site to test it out. While it came up with some odd suggestions and omitted some obvious ones (I tested a Computerworld article about MS Vista, and it didn’t recommend “Vista”), I was optimistic after this first test. Even for this post, Calais missed the obvious “Open Calais” and “WordPress” as key entities, offering up only “API” and “Reuters” as tag suggestions. Calais is far from perfect, but it is a promising step in the right direction.

The Calais WordPress plugin itself was very impressive, integrating seamlessly with WordPress’s native tagging functionality. Basically, you can use Calais to recommend tags and then refine them (adding or removing) as you see fit. Open Calais is officially on my watch list.
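
For the curious, the recommend-then-refine workflow boils down to something like the rough Python sketch below. This is not the plugin's actual code (the plugin is written for WordPress), and `refine_tags` is a hypothetical helper I made up for illustration. It just assumes you start with a list of suggested tags and then add or remove a few by hand.

```python
def refine_tags(suggested, add=(), remove=()):
    """Merge suggested tags with manual additions and drop unwanted ones.

    Hypothetical helper for illustration only; not part of Open Calais
    or the WordPress plugin. Order is preserved and duplicates are
    dropped case-insensitively.
    """
    removed = {tag.lower() for tag in remove}
    seen = set()
    result = []
    for tag in list(suggested) + list(add):
        key = tag.lower()
        if key in removed or key in seen:
            continue  # skip tags the editor removed or that already appear
        seen.add(key)
        result.append(tag)
    return result


# The suggestions Calais gave for this post, plus the two it missed.
suggested = ["API", "Reuters"]
final_tags = refine_tags(suggested, add=["Open Calais", "WordPress"])
print(final_tags)  # ['API', 'Reuters', 'Open Calais', 'WordPress']
```

The point is simply that the machine proposes and the editor disposes: the suggestions are a starting list, not the final word.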
