Crowdsourcing vs Cluster Searching – how do they compare?


One of the most economically important reasons for patent searching is to help invalidate patents. In recent years, crowdsourcing platforms such as Article One, BluePatent and Patexia have run competitions in which prior art searchers compete to find the best prior art against patents and inventions in return for prizes. Prizes have ranged up to $50,000 or higher, but at the date of this blog Article One was offering prizes of US$4,500, while Patexia was offering prizes of $5,000.


 

Cluster Searching is a new type of patent searching, recently developed by Ambercite, that can automatically find patents similar to one or more seed (or starting) patents. These seed patents can come from many different sources; one common source is a list of already known prior art patents.

So how does cluster searching compare to crowdsourcing?

To answer this question, I went to the Article One website and found a current competition to invalidate US6510434, which describes a "System and method for retrieving information from a database using an index of XML tags and metafiles". There were still 32 days remaining in this competition when I wrote this blog.

The competition page included a list of 49 already known prior art patents, 5 of which were supplied by the examiner. This list of 49 patents was ideal as the set of seed patents for a cluster search, and we also weighted the examiner-cited patents to be 5 times as important in the final result as the other patents.
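The post does not say how the weighting works internally, so the following is only a minimal sketch of one way a 5x seed weight could feed into a ranking. It assumes, purely for illustration, that each seed patent is linked to some related patents and that a candidate's score is the sum of the weights of the seeds it is linked to. The function `rank_candidates`, the link data and the patent labels are all hypothetical and are not Ambercite's method.

```python
from collections import defaultdict

def rank_candidates(links, seed_weights, top_n=50):
    """Rank candidates by the summed weight of the seeds they are linked to.

    links:        dict mapping each seed to the patents it is linked to
                  (hypothetical data; the real link source is not given)
    seed_weights: dict mapping each seed to its weight
    """
    scores = defaultdict(float)
    for seed, weight in seed_weights.items():
        for candidate in links.get(seed, ()):
            # Seeds are already known prior art, so do not re-report them.
            if candidate not in seed_weights:
                scores[candidate] += weight
    # Highest score first; ties broken by label for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]

# Toy example: one examiner-cited seed weighted 5x, two other seeds at 1x.
seed_weights = {"SEED_EXAMINER_1": 5.0, "SEED_OTHER_1": 1.0, "SEED_OTHER_2": 1.0}
links = {
    "SEED_EXAMINER_1": ["CAND_A", "CAND_B"],
    "SEED_OTHER_1": ["CAND_B", "CAND_C"],
    "SEED_OTHER_2": ["CAND_C", "SEED_OTHER_1"],
}
ranked = rank_candidates(links, seed_weights)
print(ranked)
```

In this toy data, a candidate linked only to the examiner-cited seed outranks one linked to both of the ordinary seeds, which is the practical effect of the heavier weight.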

Our automated cluster searching took a matter of seconds to produce a list of the 50 patents most similar to the list of 49 seed patents. In a second query, we allowed the search to run until all potentially similar patents were found. This produced a list of 26,000 potentially similar patents in about 60 seconds. In practice, 26,000 patents is probably too many to review individually, but it would be possible to screen this list using a combination of keywords, particularly as the list is ranked in order of potential similarity.

So what did we find? The 20 best candidates (out of the 26,000 we found) according to our system are shown below. In accordance with the rules of the search competition, all of these were filed before 28 December 1999.

 


| # | Patent | Owner | Title | Filing Date |
|---|--------|-------|-------|-------------|
| 1 | US6356920 | AWARE INC X | Dynamic, hierarchical data exchange system | 1999-03-08 |
| 2 | US6199195 | SCIENCE APPLIC INT CORP | Automatically generated objects within extensible object frameworks and links to enterprise resources | 1999-07-08 |
| 3 | US6418448 | SARKAR SHYAM SUNDAR | Method and apparatus for processing markup language specifications for data and metadata used inside multiple related internet documents to navigate, query and manipulate information from a plurality of object relational databases over the web | 1999-12-06 |
| 4 | US6366934 | IBM | Method and apparatus for querying structured documents using a database extender | 1999-06-02 |
| 5 | US6343287 | SUN MICROSYSTEMS INC | External data store link for a profile service | 1999-05-19 |
| 6 | US5920854 | INFOSEEK CORP | Real-time document collection search engine with phrase indexing | 1996-08-14 |
| 7 | US6012098 | IBM | Servlet pairing for isolation of the retrieval and rendering of data | 1998-02-23 |
| 8 | US5764906 | NETWORD LLC | Universal electronic resource denotation, request and delivery system | 1995-11-07 |
| 9 | US5708780 | OPEN MARKET INC | Internet server access control and monitoring systems | 1995-06-07 |
| 10 | US5754938 | HERZ; FREDERICK S. M. | Pseudonymous server for system for customized electronic identification of desirable objects | 1995-10-31 |
| 11 | US5572643 | JUDSON; DAVID H. | Web browser with dynamic display of information objects during linking | 1995-10-19 |
| 12 | US5530852 | SUN MICROSYSTEMS INC | Method for extracting profiles and topics from a first file written in a first markup language and generating files in different markup languages containing the profiles and topics for use in accessing data described by the profiles and topics | 1994-12-20 |
| 13 | US5710887 | BROADVISION | Computer system and method for electronic commerce | 1995-08-29 |
| 14 | US5819271 | MULTEX SYSTEMS INC | Corporate information communication and delivery system and method including entitlable hypertext links | 1996-10-29 |
| 15 | US6154738 | CALL; CHARLES GAINOR | Methods and apparatus for disseminating product information via the internet using universal product codes | 1999-05-21 |
| 16 | US5862325 | INTERMIND CORP | Computer-based communication system and method using metadata defining a control structure | 1996-09-27 |
| 17 | US5649186 | SILICON GRAPHICS INC | System and method for a computer-based dynamic information clipping service | 1995-08-07 |
| 18 | US5920859 | IDD ENTERPRISES L P | Hypertext document retrieval system and method | 1997-02-05 |
| 19 | US5907837 | MICROSOFT CORP | Information retrieval system in an on-line network including separate content and layout of published titles | 1995-11-17 |
| 20 | US5724424 | OPEN MARKET INC | Digital active advertising | 1995-11-29 |
 

Are these relevant to the US patent being invalidated? A quick scan suggests that some of them are quite relevant, and this is only the top 20 results out of the 26,000 we found (although the likelihood of relevance is much higher for the highest ranked patents). We would suggest that a keyword review of, for example, the top 100 patents would be the next step in the process, and this is what existing users of cluster searching are already doing.
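A keyword review of this kind could be sketched as follows. The rows are copied from the results table above and the cutoff date comes from the competition rules. The keywords are hypothetical, chosen only for illustration, and in practice one would probably screen abstracts or claims rather than titles, which are the only text available in this post.

```python
from datetime import date

# Filing-date cutoff taken from the competition rules quoted in the post.
CUTOFF = date(1999, 12, 28)

# (rank, patent, title, filing date): a few rows copied from the table.
results = [
    (1, "US6356920", "Dynamic, hierarchical data exchange system", date(1999, 3, 8)),
    (4, "US6366934", "Method and apparatus for querying structured documents "
                     "using a database extender", date(1999, 6, 2)),
    (6, "US5920854", "Real-time document collection search engine with "
                     "phrase indexing", date(1996, 8, 14)),
    (16, "US5862325", "Computer-based communication system and method using "
                      "metadata defining a control structure", date(1996, 9, 27)),
    (20, "US5724424", "Digital active advertising", date(1995, 11, 29)),
]

# Hypothetical screening keywords (an assumption, not from the post).
keywords = ["index", "metadata", "structured document"]

def screen(rows, keywords, cutoff):
    """Keep rows filed before the cutoff whose title contains any keyword."""
    kept = []
    for rank, patent, title, filed in rows:
        text = title.lower()
        if filed < cutoff and any(k in text for k in keywords):
            kept.append(patent)
    return kept  # input order (similarity rank) is preserved

shortlist = screen(results, keywords, CUTOFF)
print(shortlist)
```

Because the input is already ranked by similarity, the shortlist keeps that order, so a reviewer can still start with the most promising candidates.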

So how do Crowdsourcing and Cluster Searching compare? Crowdsourcing would potentially deliver some very good results, including non-patent prior art.

On the other hand, Cluster Searching is: 

  • Fast. It takes mere seconds to produce results from known prior art.
  • Completely confidential, as you can run the process in-house.
  • Flexible. We can easily run any combination of seed patents. For example, we could have simply run a search based on the 5 patents nominated by the examiner.

If I were the defendant in a patent assertion case, I would probably try a variety of approaches to invalidating the patent(s) in question, including conventional keyword and class code searching, and crowdsourcing if needed. But I would certainly include Cluster Searching as a key and early part of the process, both to:

  • generate potential candidates from already known prior art patents (as I have done above),
  • and also to expand the result sets based on new prior art patents discovered in other search processes.

Because Cluster Searching is so fast, it is possible to apply it at multiple stages in the search process.

Does the ability to find new prior art to help invalidate patents in mere seconds appeal to you? If so, please contact us to arrange a free trial of cluster searching (qualified prospects only).
