Web-scale Content Extraction-as-a-Service
Extractiv allowed users to transform unstructured Web content into structured semantic data -- from their desktops. It combined a powerful Web crawler infrastructure (made available by 80Legs) with the amazing text extraction software developed by Language Computer Corporation (LCC). The result? A Web service that could crawl, download, and mark up unstructured text faster (and with better precision) than other data center-based solutions. It was cheap, too!
Extractiv had two APIs: the On-Demand platform allowed users to upload their own documents or URLs to be processed one at a time, and the Crawling platform let users create jobs that crawled the Web and processed its content.
Extractiv was the second natural language processing company incubated at Language Computer Corporation during my tenure as CEO. John Lehmann and Shion Deysarkar led the company from its founding in 2009 to its closure in 2011.