ICIC 2011 Programme, Tuesday 25 Oct

From BACON to XML – Why Patent Information Publishers are still Converting Image Data to Searchable Text

 

Of the almost 150 active patent-issuing authorities, fewer than 20% publish regular updates of their full-text data, and fewer still have published their complete backfiles. This presentation outlines why and how the patent information publishing industry is still creating full-text data from images, more than 25 years after the EPO began its BACkfile CONversion (BACON) project. We will describe the breadth of sources currently available to vendors, some of the processes and systems involved in converting images to text, and the scope for adding value to existing first-level patent data, and we will discuss some of the new content that is fast emerging. We will also highlight some of the issues faced by users of such databases and offer insights into overcoming those challenges.