Why can’t patent searchers rely on keyword or semantic searching to find all relevant patents?

Most patent searching is based on keyword or semantic searching. I understand this fully and use keyword searching a lot myself.  

However, from long experience, I have learnt that relying on keyword searching alone means that many highly relevant patents are either missed or ranked incorrectly in the list of results. For this reason, we recommend that all searchers supplement their keyword or semantic searches with a citation-based search such as Cluster Searching (where the seed patents for the Cluster Search could be the best hits found using other search methods).

There are four good reasons for this.

1) Patent applicants can be very inconsistent with their use of technical words. 

2) Semantic or keyword searching can return thousands or hundreds of thousands of irrelevant hits

3) Semantic searching can be misled by different languages 

4) Even if keyword searching finds relevant hits, the order of results can be misleading

 

1) Patent applicants can be very inconsistent with their use of technical words 

As just one example, consider the following list of synonyms for the humble cardboard ‘box’:

  • carton
  • carrier 
  • container
  • package 
  • packaging
  • receptacle
  • case 
  • pack  
  • crate

In some cases, it might be possible for search engines to understand every possible synonym for a technical concept, but this is unlikely to hold for all technical concepts. In one real-world client example, a key prior art citation for a patent that referred to ‘carbon dioxide’ was missed by USPTO examiners and a major patent law firm because it instead used the very rare synonym ‘carbonic acid gas’.

In contrast, the citation searching approach used by Cluster Searching relies on the expertise of a number of patent examiners and applicants in the immediate field to recognise rare synonyms for technical concepts. This greatly increases the probability that at least one of these searchers makes the link between patents that would be missed by semantic analysis alone.
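The difference can be illustrated with a minimal Python sketch. Everything in it is invented for illustration: the patent identifiers, the abstracts and the citation links are hypothetical, and the one-hop citation lookup is only a stand-in for citation-based searching in general, not a description of how Cluster Searching works internally.

```python
# Hypothetical mini-corpus: identifiers and abstracts are invented.
ABSTRACTS = {
    "P1": "A beverage dispenser that injects carbon dioxide into water.",
    "P2": "Apparatus for charging liquids with carbonic acid gas.",
    "P3": "A folding carton with a hinged lid.",
}

# Hypothetical citation links: each key cites the patents in its list.
CITES = {
    "P1": ["P2"],
    "P2": [],
    "P3": [],
}


def keyword_search(term):
    """Return ids of patents whose abstract contains the exact term."""
    term = term.lower()
    return sorted(pid for pid, text in ABSTRACTS.items() if term in text.lower())


def citation_neighbours(seeds):
    """Return patents that cite, or are cited by, any of the seed patents."""
    found = set()
    for citing, cited_list in CITES.items():
        for cited in cited_list:
            if citing in seeds:
                found.add(cited)
            if cited in seeds:
                found.add(citing)
    return sorted(found - set(seeds))


keyword_hits = keyword_search("carbon dioxide")
citation_hits = citation_neighbours(keyword_hits)
print(keyword_hits)   # patents found by the keyword alone
print(citation_hits)  # extra patents reached through citation links
```

In this toy data the keyword search never sees the ‘carbonic acid gas’ document, but following the citation link from the keyword hit does reach it.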

 

2) Semantic or keyword searching can return thousands or hundreds of thousands of irrelevant hits 

Returning to the simple example of cardboard boxes, a search for prior art based on the main technical keywords used for cardboard boxes (for example ‘box’, ‘lid’, ‘flap’, etc.) will return thousands or tens of thousands of patents, almost all of which are not very relevant to the patent application that you may be looking for, as shown by the results for this query in the search engine “The Lens”:

[Image: boxlidfapquery.gif]

 

To deal with these large numbers of irrelevant hits, searchers and examiners are forced to make assumptions about which keywords are important, and every such assumption risks filtering out relevant patents.

As an example of this, consider the feedback from a patent examiner for a major patent office after trialling Cluster Searching:

“The top ranked [prior art] patent, USxxxxxxx, was very good because it found an element that is hard to search for, i.e. ‘independent power settings’.  The ‘independent power settings’ are hard to search for because searching for something like “power near2 settings” would return several thousand hits and most of them would not be for independent power settings and limiting that search with an adjective like “independent” would filter out a lot of good results.  And it’s amazing that it came up first in your search…”  – Patent Examiner of a world leading patent office
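The trade-off the examiner describes can be sketched in a few lines of Python. The snippets below are invented, and the `near` helper is only an assumed reading of a proximity operator such as “near2” (two words within a given number of positions of each other); real search systems may define the operator differently.

```python
import re

# Hypothetical snippets, invented for illustration only.
DOCS = {
    "D1": "each heating zone has independent power settings",
    "D2": "the power level settings of each zone are separately adjustable",
    "D3": "default power saving settings for a display",
}


def near(text, first, second, max_gap):
    """True if `first` and `second` occur within `max_gap` word positions."""
    words = re.findall(r"[a-z]+", text.lower())
    pos_first = [i for i, w in enumerate(words) if w == first]
    pos_second = [i for i, w in enumerate(words) if w == second]
    return any(abs(i - j) <= max_gap for i in pos_first for j in pos_second)


# Broad proximity query: catches everything, relevant or not.
broad = sorted(d for d, t in DOCS.items() if near(t, "power", "settings", 2))

# Narrowing with the adjective drops documents that word the idea differently.
narrow = [d for d in broad if "independent" in DOCS[d]]

print(broad)
print(narrow)
```

Here the broad query also returns the unrelated display document, while adding ‘independent’ discards the document that describes the same idea as ‘separately adjustable’.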

 

3) Semantic searching can be misled by different languages

Semantic-based searching is much less likely to pick up similar patents in different languages, because it will struggle to recognise synonyms in foreign languages. Returning to the earlier box example, it is worth considering some of the foreign-language equivalents of the word ‘packaging’.

  • verpackung             (German)
  • conditionnement     (French)
  • imballaggio             (Italian)
  • embalaje                 (Spanish)
  • パッケージング         (Japanese)
  • 包装                         (Simplified Chinese)
  • 포장                         (Korean)

In contrast, we like to believe that Cluster Searching is ‘language agnostic’, as it relies on the ability of patent examiners to recognise similarities between patents that may even be filed in foreign languages.

 

4) Even if keyword searching finds relevant hits, the order of results can be misleading

Semantic-based searching tends to rank results based on the overall similarity of keywords. So even if a keyword or semantic search tool recognises that a box is also a carton and a carrier, it may rank patents that refer to ‘cartons’ and ‘carriers’ as being much less relevant than patents that refer to ‘boxes’. However, in practice, many of the patents that refer to cartons may be much more similar than many patents that refer to boxes.

In contrast, patent examiners and applicants will identify and cite similar patents even if these patents use different synonyms for the same concepts.
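The ranking effect can be sketched with a naive word-overlap score. This is a deliberately crude stand-in: the query and documents are invented, and real semantic engines use far richer models, so treat it only as an illustration of how vocabulary differences can push a relevant document down a results list.

```python
import re

# Hypothetical query and documents, invented for illustration only.
QUERY = "cardboard box with a tear-open lid"
DOCS = {
    "A": "metal tool box with a lockable lid",
    "B": "paperboard carton having a tear-open top flap",
}


def tokens(text):
    """Lower-case word set of a text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def overlap_score(query, doc):
    """Jaccard similarity between the word sets of query and document."""
    q, d = tokens(query), tokens(doc)
    return len(q & d) / len(q | d)


# Highest-scoring document first.
ranked = sorted(DOCS, key=lambda k: overlap_score(QUERY, DOCS[k]), reverse=True)
print(ranked)
```

The metal tool box outranks the paperboard carton simply because it shares the words ‘box’ and ‘lid’ with the query, even though the carton is arguably the closer match.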

 

>>>>>>>>>>>

For all of the above reasons, we recommend that all clients use Cluster Searching to complement their existing semantic or keyword searching processes. Case studies have shown that it produces results that differ from those of conventional searching and yet are still very relevant. As an example, consider the case study found here. Although originally published in August 2014, its results are still very applicable to Cluster Searching.

An example of a query from a recent blog on Cluster Searching is found below:

 

[Image: ClusterQuery_20150714-041021_1.jpg]

 

But doesn’t this produce a lot more work?

Cluster Searching has been carefully designed to be very easy and fast to use. You can obtain and review usable results in minutes. Considering the importance of many patent searches to their clients, and the genuine importance of not missing relevant patents in these searches, we think that these few minutes are well worth spending.

 

Want to know more?

Ambercite offers free and completely confidential two-week trials of Cluster Searching. Please contact us to arrange a demonstration and trial. Testimonials for Cluster Searching are found here, including:

“Ambercite Clustering has managed to create a tool to enhance chemical patent due diligence in a remarkable way…revealing art which may cover compounds in discovery within claims of a generic scope.  Congratulations!” – Patent Analyst, global pharmaceutical company, USA
