
Tuesday 20 October 2008

09:00 - 09:30

From Gutenberg to Blackberry: Challenges for the European Patent System(s)

The patent is a policy tool aimed at stimulating innovation. This presentation first explains the economic role of patent systems and their importance within innovation systems; in this respect, the design of patent systems is a key issue. Particular focus is placed on quality and cost factors, illustrated through historic and recent cases as well as simulations. The presentation is inspired by recent research and by the book by Guellec and van Pottelsberghe (2007), “The Economics of the European Patent System”, Oxford University Press, which calls for a more ‘economic’ approach to the design of patent systems.

09:30 - 10:00

Are Pictures Worth a Thousand Words? Text Mining, Gisting and Visualisation

A significant proportion of scientists’ and information specialists’ time is spent reading, annotating and analysing information sources, ranging from news feeds to patents to full-text scientific literature. Understanding all the implications of a corpus is a challenge that can be tackled with tools which combine text processing, entity extraction, automatic recognition of correlations between those entities, and graphics. Visualisations of text corpora using techniques such as entity and relationship (fact) frequency tables, Venn diagrams, heat maps and computer-generated networks, pathways or spidergrams make comprehension easier and save time. Graphic analysis of large text corpora is perhaps the only way to perform this task effectively, ascribing authority and reliability to relationships automatically extracted from multiple sources.

Automatically generated pathways are a particularly useful technique for combining information gained from several sources in order to generate new knowledge, whether in competitive intelligence or research. The balance between stringent and relaxed entity extraction, or between full retrieval and complete relevance, can be tuned to give appropriate levels of information depending on the software solution chosen, the application area and the demands of the users.
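As a purely illustrative sketch (not the speakers’ software), the following Python fragment shows one simple way an entity and relationship (fact) frequency table can be built: a dictionary-based extractor finds known entities in each sentence and counts how often pairs of them co-occur. The entity list and example corpus are invented for demonstration.

```python
from collections import Counter
from itertools import combinations
import re

# Toy dictionary of known entities; a real system would use curated
# thesauri and machine-learned recognisers.
ENTITIES = ["aspirin", "ibuprofen", "cox-1", "cox-2"]

def extract_entities(sentence):
    """Return the known entities mentioned in one sentence (case-insensitive)."""
    lowered = sentence.lower()
    return [name for name in ENTITIES
            if re.search(r"\b" + re.escape(name) + r"\b", lowered)]

def fact_frequency_table(corpus):
    """Count how often each pair of entities co-occurs within a sentence."""
    pairs = Counter()
    for document in corpus:
        for sentence in re.split(r"[.!?]", document):
            mentions = sorted(set(extract_entities(sentence)))
            pairs.update(combinations(mentions, 2))
    return pairs

corpus = [
    "Aspirin inhibits COX-1 and COX-2.",
    "Ibuprofen is a non-selective inhibitor of COX-1 and COX-2.",
]
for (a, b), count in fact_frequency_table(corpus).most_common():
    print(f"{a} / {b}: {count}")
```

Such a co-occurrence table is the raw material from which the networks, pathways and heat maps described above can then be drawn.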

10:00 - 10:30

Automated Knowledge Discovery to Support the Development of New Drugs

The ever-increasing amount of available scientific literature sparks new approaches to knowledge extraction. At Merck Serono, we are using state-of-the-art text mining technology to discover "hidden" or new links between biomedically relevant entities. In this way we try to validate new scientific hypotheses and so add value to the molecules in our pipeline. In this presentation we share our latest experiences, exemplified by a well-validated case study.

11:15 - 11:45

Visualization and Text Mining of Patent and Non-patent Data

Text mining is now being used more widely in patent and non-patent literature searching, especially to analyse large, complex data sets rapidly. The supervised approach (classification) and the unsupervised approach (clustering and projection techniques) are both popular in text mining and together provide strong instruments for a variety of tasks. Text mining and advanced visualisation are two important techniques in patent analytics. This presentation describes the work of Treparel, undertaken together with Philips, on the combined use of classification and clustering with different advanced visualisation techniques. The technical principles and the business case of some applications of text mining and visualisation will be presented and discussed.
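For readers unfamiliar with the distinction, the toy Python sketch below (using scikit-learn, with invented documents, labels and parameters; it is not Treparel’s or Philips’ implementation) contrasts the two approaches: an unsupervised clustering step that groups patent-like texts without labels, and a supervised classifier trained on labelled examples.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

# Invented toy "claims"; real patent analytics would work on full documents.
docs = [
    "A battery electrode comprising a lithium compound.",
    "Method for charging a lithium-ion cell.",
    "An antibody binding a tumour-specific antigen.",
    "Pharmaceutical composition comprising a monoclonal antibody.",
]
labels = ["electronics", "electronics", "pharma", "pharma"]

vectorizer = TfidfVectorizer(stop_words="english")
vectors = vectorizer.fit_transform(docs)

# Unsupervised: group the documents into two clusters without using labels.
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)
print("cluster assignments:", clusters)

# Supervised: learn the labelled categories, then classify an unseen claim.
classifier = LogisticRegression(max_iter=1000).fit(vectors, labels)
new_claim = vectorizer.transform(["A lithium anode for a rechargeable battery."])
print("predicted class:", classifier.predict(new_claim))
```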

11:45 - 12:15

Complementarity Between Public and Commercial Databases of Bioactive Compounds: Extending the Linkage Between Chemistry and Biology

The last few years have seen a revolution in open cheminformatics, as exemplified by the growth of PubChem, DrugBank and other databases. Consequently, medicinal chemists and biologists now have access to high-utility public sources of bioactive compounds that they can not only download and/or query directly over the Web but that also link to structured bioinformatic data. This work (PubMed ID 17897036) reviews compound content comparisons between selected public and commercial databases, particularly those that specify relationships between compounds and their activity against primary protein targets, thereby linking chemistry to biology. After collecting 19 different commercial and public data sources, including selected bioactive subsets, stringent filtering for unique content was applied to facilitate standardised comparison of content. The resulting 19 x 19 matrix shows the pair-wise comparison of each set of compounds. Detailed results will be presented, but overall they emphasise the complementarity of combining sources. This conclusion is supported by a Venn-type analysis of GVKBIO, WOMBAT (both commercial) and PubChem (public). These compound databases show not only overlap but also unique content and different types of molecular-target bioinformatic connectivity in each case, because of their different strategies for source selection and expert curation.
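As a minimal sketch of the pair-wise matrix idea only (the actual study compared 19 curated sources after stringent filtering; the database names and identifiers below are placeholders), each source can be reduced to a set of canonical compound identifiers, such as InChIKeys, and every pair of sets intersected:

```python
# Placeholder databases, each reduced to a set of compound identifiers
# (e.g. InChIKeys). The names and keys are invented for illustration.
databases = {
    "public_A":     {"KEY001", "KEY002", "KEY003"},
    "public_B":     {"KEY002", "KEY004"},
    "commercial_C": {"KEY001", "KEY002", "KEY005"},
}

names = sorted(databases)

# Pair-wise overlap matrix: entry (row, col) = number of shared compounds.
print("overlap matrix, columns:", names)
for row in names:
    counts = [len(databases[row] & databases[col]) for col in names]
    print(row, counts)

# Unique content per source, the other half of the complementarity argument.
for name in names:
    others = set().union(*(databases[n] for n in names if n != name))
    print(name, "unique compounds:", len(databases[name] - others))
```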

14:15 - 14:45

Development of Technology for Transforming Analysis of Patent Information

The immense pressure to improve drug discovery in recent years has led to changes affecting strategies for protecting intellectual property-based investments in a number of ways. The pressure to bring drug candidates ever more rapidly to the market has led to a significant increase in the risk of losing investments in patent litigation. Unrecognised by most, the requirement to submit prior art deemed material to patentability to patent examiners, and to specifically point out the novelty and non-obviousness of a claimed invention, threatens to reverse the long-established presumption of patentability given to a patent application and exposes corporations to a substantially greater risk of patent litigation. Keeping in mind that approximately 150,000 chemical patent applications are filed each year in the US alone, and that the quality of patent searches varies widely between applications, the issue of patent validity is becoming one of the key problems of current patent systems. Because Pfizer creates intellectual property in multiple phases of the R&D process, the company initiated efforts about five years ago to develop technology that could assist in protecting its R&D-based investments. Since most of Pfizer’s investments involve, to some extent, patents relating to utilities of chemical structures, proteins, DNA or RNA sequences, one primary goal of this initiative was to improve the accuracy and speed of the analysis of patent claim information. Herein we describe the development of technology that assists scientists in understanding what is actually claimed in a competitor’s patent. The outcome of this analysis is of strategic importance because it determines the risk of losing an investment in patent litigation.

15:45 - 16:15

Searches Obtained from First-Level and Value-Add Patent Data

Patent data can be searched either from a collection of first-level original patent data sets from the issuing authorities, or from single sources of value-add data from the commercial information providers. In terms of the results obtained, each has its own advantages: for example, the first-level data can provide the most comprehensive text-based searching, whereas the value-add databases offer abstracts in English for many more countries, plus advanced indexing to aid retrieval. However, combining and de-duplicating results from the various sources can be difficult, and the differing methods of calculating patent family relationships can bring further complications. This presentation examines a case study demonstrating these issues, comparing a search from first-level and value-add patent sources. Options for combining and de-duplicating the results are then discussed, as are the possibilities for creating answer sets compiled according to INPADOC and invention-based patent families.
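As a hedged illustration of the de-duplication step (the records, field names and family identifiers below are invented and do not reflect any provider’s actual formats), hits from different sources can be grouped on a family identifier so that each family contributes a single answer-set entry:

```python
from collections import defaultdict

# Invented hits from two kinds of sources; publication numbers and
# family IDs are placeholders for illustration only.
hits = [
    {"source": "first-level", "publication": "US7000001B2",   "family": "F100"},
    {"source": "value-add",   "publication": "EP1800001A1",   "family": "F100"},
    {"source": "value-add",   "publication": "JP2008000001A", "family": "F200"},
]

# Group publications by family identifier so duplicates collapse into
# one invention-based answer-set entry per family.
families = defaultdict(set)
for hit in hits:
    families[hit["family"]].add(hit["publication"])

for family_id, members in sorted(families.items()):
    print(family_id, "->", ", ".join(sorted(members)))
```

The same grouping logic applies whether an INPADOC-style or an invention-based family definition is used; only the identifier assigned to each record changes.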

16:15 - 16:45

China, Japan and Korea - Advantages of Searching in Original-Language Databases

Currently, more than half of all new patent applications published in the world are written in Japanese, Chinese or Korean. Japan, China and Korea are all among the top five biggest patenting nations in the world. Every year, the Japanese Patent Office receives some 400,000 patent applications, the majority of which are filed by domestic applicants. In the last ten years, applications from domestic applicants doubled in Korea, and increased more than eight-fold in China. A considerable part of the prior art thus generated in East Asia will stay at a national level and not be published elsewhere in the world in a western language.

The above illustrates that patent documentation from East Asia has become indispensable. The patent information user faces the challenge of dealing with a large number of prior art documents that are, in many cases, neither readable nor fully searchable in English. Those relying entirely on English-language coverage limit themselves to searching abstracts and bibliographic information, often miss out on utility models altogether, and face a serious delay of several weeks or months until English information becomes available. This presentation looks at the possible risks of searching only English-language information and points out ways for users of East Asian patent information to overcome the language barrier and search East Asian patent data more efficiently.