II-SDV Programme

Tuesday 9 April 2019

Conference starts at 09:00

The IUPAC InChI Chemical Structure Standard – Today and the Future


For almost 100 years IUPAC has been well known around the world for its efforts in standardizing nomenclature in chemistry. At the start of the present century it became clear to all involved in chemical structure representation that, with the extensive use of computers and electronic information in all aspects of chemistry and related sciences, an IUPAC standard was necessary. From this critical need, the InChI project was launched by IUPAC in cooperation with the US standards agency, NIST. The result of this effort has been the development, maintenance, and expansion of the capabilities of the open-source, non-proprietary International Chemical Identifier (InChI), first by NIST and now by the InChI Trust, a not-for-profit UK charity.

This brief presentation of InChI will highlight ongoing efforts to strengthen and extend this standard for chemical structures and its hashed form, the InChIKey. Information standards are critical to enable effective and efficient communication of scientific content. Validation and reproducibility of research results are critical to advances in science. Without a chemical structure standard, it was becoming impossible to find and share all the reported results needed for a particular purpose. The costs of experiments are ever increasing; hence the need for increased efficiency in labs around the world. Open Access, Open Data and Open Standards are areas that are expanding rapidly and are facilitating faster and more effective research discovery. However, before you can share data about a chemical, you need to find where the information has been made available on the Internet. Collaborative, interoperable, and global dissemination standards are essential in a more networked world.

Physical Quantity Search with the Simplicity of Keyword Searching


When searching across full-text patent data, the numerical data given in a document is often at least as important in determining the document’s relevance as the keywords present. Chemists or metallurgists may be interested in compositions or alloys citing specific amounts or ranges of a certain metal or compound, and engineers may be interested in specific dimensions, current measurements or temperature ranges. Unfortunately, searching comprehensively for physical quantities using conventional text searching is practically impossible, because many lexically distinct expressions can denote matching quantities. The fact that measurements of the same type may use different units further complicates matters.

Minesoft have been working on the problem of facilitating searches including physical quantity criteria, and here we report on our success with using text-mining to automate the identification and interpretation of quantities in patents in our new PatDocs tool.

All indexed terms and user queries are converted into ranges in standardized units; for example, “>5 to ≤10 miles per hour” is interpreted as the range (2.2352, 4.4704] m/s. Rather than forcing the user to learn a search-engine-specific syntax, queries are written in the same formats that appear in actual documents. The search then finds all indexed ranges, with the same standardized units, that intersect with the user’s query range.
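The normalisation and intersection steps described above can be sketched roughly as follows. This is a minimal illustration and not the PatDocs implementation: the `Range` class, the `to_si` helper and the small unit table are hypothetical names introduced here, and parsing the query text itself is left out.

```python
from dataclasses import dataclass

# Conversion factors to the standardized unit (m/s). Illustrative subset only.
TO_METRES_PER_SECOND = {
    "m/s": 1.0,
    "km/h": 1000.0 / 3600.0,
    "mph": 0.44704,  # 1 mile per hour in m/s
}


@dataclass(frozen=True)
class Range:
    """A numeric range in standardized units with open or closed endpoints."""
    low: float
    high: float
    low_closed: bool
    high_closed: bool

    def intersects(self, other: "Range") -> bool:
        # Disjoint when one range ends strictly before the other begins.
        if self.high < other.low or other.high < self.low:
            return False
        # Ranges that only touch at one endpoint overlap only if both
        # ranges include that endpoint.
        if self.high == other.low:
            return self.high_closed and other.low_closed
        if other.high == self.low:
            return other.high_closed and self.low_closed
        return True


def to_si(low, high, unit, low_closed=True, high_closed=True) -> Range:
    """Convert a range given in `unit` into a Range in m/s (hypothetical helper)."""
    factor = TO_METRES_PER_SECOND[unit]
    return Range(low * factor, high * factor, low_closed, high_closed)


# ">5 to ≤10 miles per hour" becomes roughly (2.2352, 4.4704] m/s
query = to_si(5, 10, "mph", low_closed=False, high_closed=True)
# An indexed quantity such as "16 to 20 km/h" (about 4.44 to 5.56 m/s)
indexed = to_si(16, 20, "km/h")
print(query.intersects(indexed))  # True
```

Storing every quantity as a range in one standardized unit means a single intersection test covers queries and documents that use different units.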

It is also important to know what a quantity is referring to. For specific cases, such as alloy compositions, this is captured during the text-mining; for example, “2 wt% Fe” refers specifically to a weight percentage of iron. For the general case, we now allow searching for quantities in close proximity to arbitrary phrases, or even other quantities. The tool also shows the context in which a query matched and allows quantity queries to be combined with metadata queries.
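One simple way to picture proximity searching between a phrase and a quantity is sketched below. It is an assumption-laden illustration rather than the tool's actual method: the `QUANTITY` pattern recognises only three example units, distance is measured in characters, and the function name `quantities_near` is hypothetical.

```python
import re

# Toy quantity pattern: a number followed by one of a few example units.
QUANTITY = re.compile(r"\d+(?:\.\d+)?\s*(?:wt%|mm|°C)")


def quantities_near(text, phrase, max_chars=40):
    """Return quantities found within max_chars characters of the phrase."""
    phrase_spans = [
        m.span() for m in re.finditer(re.escape(phrase), text, flags=re.IGNORECASE)
    ]
    hits = []
    for q in QUANTITY.finditer(text):
        for start, end in phrase_spans:
            # Gap between the quantity and the phrase; 0 if they overlap.
            gap = max(start - q.end(), q.start() - end, 0)
            if gap <= max_chars:
                hits.append(q.group())
                break
    return hits


text = (
    "The alloy contains 2 wt% Fe and is annealed at 450 °C. "
    "The housing wall is 3 mm thick."
)
print(quantities_near(text, "annealed", max_chars=10))  # ['450 °C']
```

A production system would more plausibly measure proximity in tokens or sentences and work from pre-indexed quantity positions instead of rescanning the text.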



Leaning towards improved and efficient discoverabilities: The why? how? and when? of Natural Language Processing Aids


The primary goal of any content search system is to require the minimum possible input from a user and return an efficient answer set for the user’s query. To that end, ‘augmented browsing’, a technique focused on guiding users by teaching machines to infer and take logical decisions from a query, is a readily relatable and timely solution.


This presentation will focus on state-of-the-art search technologies, including augmented browsing, in a data model centered on semantic entities. All named entities with unique properties and definitions, along with traditional keywords, form the central units, also known as ‘semantic units’. The implementation also exploits multifaceted ontologies that define intra-connections among the semantic units. The environment also allows external communication via inter-connections between semantic units wherever applicable.


This case study also explores the scope and scalability of augmented browsing. Specific focus will be on the impact of augmented browsing when compound-word searches are used in conjunction with different combinations of words, keywords, entities and phrases. The semantic units include domain entities, geographical locations and heuristic entities. A detailed account of Natural Language Processing as a query-parsing component is also presented.


All in all, this technology and its scalable results pave the way for unique implementations of user-centric search and browsing modules for data lakes as well as structured indexed content.

10:30 - 11:00

Exhibition and Networking Break

Semantic e-Science - The Future of Information Provision in the Digital Age

With the digital revolution, the provision of scientific information has fully entered a new mode of operation. The fast advance of technology is transforming the way information is provided to scientists and is opening new opportunities. Data and information storage has moved from a paper-based, manual affair to an activity in which computers are necessary. As a result, a vast amount of scientific information is produced and collected daily, and a research organization does not have enough resources to collect everything.

With its emphasis on data-intensive thinking, e-Science is moving towards data-intensive technologies, is becoming a new technology driver, and requires a rethinking of infrastructure architecture components, solutions and processes for scientific information provision. This calls for a new systematic approach to tackling the challenges of data-intensive computing, one that provides scientists and decision makers with practical tools for handling and exploring the data and information they need.


Design Patents, Art, Science, Strategy? - Decoding functional characteristics that manifest in visual Ornamentation for overlap Detection


The commercial implications of intellectual property rights (IPR) infringement risks are typically the least considered factor when the design process is undertaken. When evaluating a new technology introduction project or new product development program, it is important to consider the impact of potential design alternatives, in addition to technical performance parameters. While greater efficiency may be achieved, it may come at an unacceptable cost in the form of unanticipated royalty payments which destroy margins. Design changes can be costly, particularly once the later stages of technology maturity and product incorporation have been reached. This research paper uncovers how design patents can hold a strategic position for a company by carrying valuable, decodable, hidden technically functional information in some technology sectors, for example the tire industry. Spotting the gap between utility and design patents on the same subject matter, and unearthing the technically functional information, can help in understanding a competitor’s technical know-how for competitive intelligence and in designing an R&D innovation roadmap; in validating design and functional information at the pre-application stage; and in avoiding or tracking costly infringements, both advertent and inadvertent.

Firms as patent applicants in France in 2017

Firms represent a prime target for public authority awareness-raising policies especially as regards innovation and filing patents. Yet it is not always easy to get a handle on this population in terms of statistics, meaning that it is particularly difficult to systematically identify in the patent databases different kinds of firms that do file patent applications.


Two censuses of firms conducted in 1999 and 2007, organised jointly by Bpifrance and INPI, allow INPI to identify each year the different kinds of firms among the companies that filed a patent application at the French patent office. This study reveals the importance of the different kinds of firms within the total patent applicant population, as well as firms’ behaviours.





12:30 - 14:00

Lunch, Exhibition and Networking (also start of Analytics and Visualization Meeting)

IP Analytics and Data Visualizations: The Art of effective Communication

Human beings have too much information to process, and we use heuristics to filter information and make decisions quickly; policymakers process evidence and the environment in which they operate in exactly the same way. The world of intellectual property is complex; analysis of IP databases is inherently full of caveats, assumptions and ambiguities. There is a natural disconnect between the two, meaning that effective communication to support evidence-based decision making is key. In recent years the digital age has given us increasingly beautiful and dynamic visualisations; when these are thrown into the mix, do they help or hinder the art of effective communication?






A Client focused Display - Exemplified Compounds Table display linked to Citing Publications

Exemplified compounds resulting from a substructure search are key to making filings based on novelty and inventive step. Conversations with our attorney clients revealed that they prefer to see a table of structures with hyperlinks to references, rather than a list of references with the structures beneath them. To provide this structure-led display we have had to rely on internal macros, written many years ago, to reorganize the output from Chemical Abstracts, but the macros are not supported by GSK IT and often crash with large volumes of data. Analytical table tools in third-party software have not been able to provide a solution either.

We use BizInt Smart Charts for Patents to deliver reports, but it did not have the capability to include hit structures and was reference rather than structure oriented. We have now further collaborated with BizInt to develop a method enabling us to present our structures in an easy to read tabular format hyperlinked to their references. These tables are now used in our search reports for the chemistry attorneys, giving a succinct view of the compounds exemplified within a publication (patent and literature) to facilitate decision making.
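The change from a reference-led to a structure-led display amounts to inverting a mapping. The sketch below illustrates that idea only; it is not the BizInt Smart Charts method or the authors' macros, and the record fields (`id`, `url`, `compounds`), the sample data and the function name are all hypothetical.

```python
from collections import defaultdict


def structure_led_table(references):
    """Invert reference -> compounds records into compound -> reference links."""
    table = defaultdict(list)
    for ref in references:
        for compound in ref["compounds"]:
            link = (ref["id"], ref["url"])
            # Avoid listing the same reference twice for one compound.
            if link not in table[compound]:
                table[compound].append(link)
    return dict(table)


# Hypothetical reference-led input: each reference lists its exemplified compounds.
references = [
    {"id": "REF-A", "url": "https://example.org/REF-A",
     "compounds": ["compound-1", "compound-2"]},
    {"id": "REF-B", "url": "https://example.org/REF-B",
     "compounds": ["compound-2"]},
]

for compound, links in structure_led_table(references).items():
    print(compound, [ref_id for ref_id, _ in links])
```

Each row of the resulting table corresponds to one structure, with its citing references available as hyperlinks.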



15:00 - 15:30

Exhibition and Networking Break

Competitive Intelligence: how to optimize the analysis of pipeline and clinical trials data with BizInt Smart Charts and VantagePoint


BizInt, for data compilation, selection and chart visualisation, and VantagePoint, for specific graphic data representations, can support competitive intelligence analysis.

· Pipeline and clinical trials data
· Structure, reliability and updating of data
· Need to query and export data from different sources
· Added value of verification and visualization of information
· Description of BizInt and VantagePoint
· Practical examples of using these two tools to produce competitive intelligence reports



What’s it all about? Or … How I fell at the last fence

When I first started delivering patent landscapes, I was taught that there are three steps to a patent landscape analysis: create the data set; clean the data set; analyse the data set.

In an R&D environment, this met the expectations of project scientists. Despite this, Patent Landscaping never became the strategic tool I knew it should be. Is that because the process lacked the key step: reporting and communicating to senior stakeholders? Short of time and attention, they want to know what a landscape means; preferably in one PowerPoint slide.

How do we communicate with senior stakeholders effectively and ensure that Patent Landscapes play an effective part in strategic decision making? 

Speaker Panel Q&A

18:30 - 20:00

Reception (for Analytics and Visualization Meeting attendees only)