When Winston Churchill said, “broadly speaking, the short words are the best, and the old words best of all”, he (most likely) wasn’t speaking about the words found in patent specifications. However, in saying that, he is both right and wrong.
Old, short words would be described today as “plain English”. Describing something in plain English adds succinctness and clarity to any document. Patent specifications, or at least the claims, are supposed to possess both qualities, but sometimes manage it only in the most obscure way possible, by using new, long words. There have been many years of innovation since Winston was alive, and by necessity we have had to invent new words or repurpose old ones to describe our modern world, so perhaps we can be a little forgiving when using one of our newfangled words.
He is right though, because we shouldn’t forget the old words. They still may adequately describe a modern innovation. Calculators weren’t always called calculators; Charles Babbage called his a difference engine, and computers once referred to humans, not machines. They, of course, used a necktop, not a laptop.
In this article we’re going to talk about words, or as they are more commonly known in the searching world, “keywords”, and how you can use them effectively. I’ll be straying into territory I’ve covered before, so here are links to my articles on patent classification and narrowing your search results.
Before we start manipulating our keywords, we need to determine what the keywords are. One of the biggest mistakes is only considering the words you know to describe a concept. This may be because it is commonly used industry jargon or simply how you have described it. Even when you have settled on a set of keywords, be open to discovering more as you read through any specifications you think might be relevant, and add those new words into your keyword set, and search again.
Don’t be overly descriptive when determining your set of keywords. A doohickey bolt is just a bolt; a thingummy panel is just a panel; and a whatsit tube is just a tube. Sometimes using doohickey, thingummy or whatsit might be required, but usually when they are the industry jargon, and never when it’s something you just made up.
This is where the above reference to old words comes in. Patent literature goes back a long way to simpler times, and with a whole lot less prior art, inventions could be described more simply and broadly without having to navigate a minefield of potential infringements, so go back further than your own time and history, and consider ye olde time technology terminology as well.
Using ‘tube’ as the initial keyword should lead you to words like cylinder, duct, pipe, pipeline, pipette, conduit, tunnel, chute or straw. You can also go a little more abstract with ‘hollow’, or technical with ‘lumen’. Not all of these will be relevant to your particular concept. For instance, ‘straw’ is more likely to be used in foodstuffs or packaging applications compared to say, ‘tunnel’, so you don’t have to use every keyword you can find. Stick to the most likely in the first instance, and broaden your search with others as necessary later.
Now that you have a set of keywords, what can be done to ensure you will capture all instances of those keywords?
If you have ‘tube’, what about tubes, tubing, tubular, and so on? Sometimes it’s effective to just add these to your set, but it’s easier to use truncation, so tube, tubes, tubing and tubular can all be summarised with ‘tub*’. As further examples, pipe, pipeline and pipette above can be truncated to ‘pipe*’, and cylinder can become ‘cylind*’ to also cover cylindrical.
In most freely available patent databases an asterisk can be used to truncate, but the USPTO uses a $ symbol. If in doubt, check the help pages available in whatever system you’re searching in. Truncation not only allows you to get a number of keywords for the price of one, but can also be used to cover spelling variations, such as American/British spelling, or capture typos.
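To see what a truncated keyword actually sweeps up, here is a minimal Python sketch. It is not how any patent database works internally; it simply uses the standard library's fnmatchcase to mimic trailing truncation with an asterisk. The word list and the function name truncation_matches are made up for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical word list standing in for terms found in specifications.
WORDS = [
    "tube", "tubes", "tubing", "tubular", "tuberculosis",
    "pipe", "pipeline", "pipette", "cylinder", "cylindrical",
]

def truncation_matches(pattern, words=WORDS):
    """Return the words matched by a truncated keyword such as 'tub*'."""
    return [word for word in words if fnmatchcase(word, pattern)]

print(truncation_matches("tub*"))
print(truncation_matches("pipe*"))
print(truncation_matches("cylind*"))
```

The first result includes ‘tuberculosis’ alongside the tube words, which is the price of a short stem: the shorter the truncated keyword, the more unrelated words it catches.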
Here’s something else to consider. I like to have a number of sets of keywords, each relating to a different aspect of the concept. For example if I’m searching for ‘red bicycle tubes’, I’ll have three sets of keywords: one for keywords relating to ‘red’, another for ‘bicycle’, and one for ‘tubes’. It just means I can more easily manipulate them by combining them in different ways.
So, let’s start combining those sets of keywords. To keep it simple I’ll refer to the various sets of keywords as A, B and C, and individual keywords within those sets as A1, A2, A3, etc.
There are three Boolean operators to be aware of. These are OR, AND and NOT.
Firstly, stay away from NOT as much as possible. Using A NOT B means you are taking everything from A that is not also in B. The problem with using NOT is that you don’t know what you’re excluding. NOT is suitable in many other applications, but not for patent searching. A patent specification often includes a background of the invention, or a description of related art, where earlier inventions in the field are described. Suppose there is a specification that is all about B but also happens to describe A, your concept, in great and precise detail in the description of related art. By conducting a search for A NOT B, you will exclude that highly relevant specification from your search results.
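The risk with NOT can be illustrated with a small Python sketch that treats each search result as a set of specification identifiers. The three specifications, their text and the helper name hits are all hypothetical; the point is only that set difference silently drops anything that mentions B, however relevant it is to A.

```python
# Hypothetical mini-collection: specification id -> text.
SPECS = {
    "spec-1": "a red bicycle with a lightweight tubular frame",
    "spec-2": "a scooter deck the related art includes a red bicycle tube described in detail",
    "spec-3": "a scooter with folding handlebars",
}

def hits(keyword):
    """Return the ids of specifications containing the keyword as a whole word."""
    return {spec_id for spec_id, text in SPECS.items() if keyword in text.split()}

a = hits("bicycle")   # specifications mentioning the concept we care about
b = hits("scooter")   # specifications we were tempted to exclude
a_not_b = a - b       # A NOT B: in A, but not also in B

print(sorted(a))
print(sorted(a_not_b))
```

Here spec-2 is about a scooter but describes the bicycle tube in its related art, so it appears in A and vanishes from A NOT B.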
In general, with OR and AND you will end up with search strings that look like
(A1 OR A2 OR A3) AND (B1 OR B2)
Likewise you can search for A AND C, or A AND B AND C. What you want, ideally, is a number of search results that is manageable. That may mean for example splitting up your sets of keywords or removing the truncation and spelling out some of those words in full. For example when truncating tube, etc. earlier we would also pick up tuberculosis and tuberosity, among others that are clearly not relevant, although you would hope that when combining tub* with bicycle you wouldn’t find much in the field of bicycle tuberculosis (FYI, the answer to that is not zero).
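If you keep your keyword sets in separate lists, assembling search strings of this shape is easy to script. The following Python sketch is one way to do it; the function names or_group and build_query are invented here, and the keywords in the ‘red’ and ‘bicycle’ sets are illustrative guesses rather than a recommended set.

```python
def or_group(keywords):
    """Join one set of keywords with OR and wrap it in brackets."""
    return "(" + " OR ".join(keywords) + ")"

def build_query(*keyword_sets):
    """Combine any number of keyword sets with AND."""
    return " AND ".join(or_group(keywords) for keywords in keyword_sets)

# The generic form used in the article.
print(build_query(["A1", "A2", "A3"], ["B1", "B2"]))

# Illustrative sets for the 'red bicycle tubes' example.
red = ["red", "crimson", "scarlet"]
bicycle = ["bicycle*", "bike*"]
tube = ["tub*", "pipe*", "cylind*"]

print(build_query(bicycle, tube))        # A AND C style
print(build_query(red, bicycle, tube))   # A AND B AND C style
```

Keeping each set in its own list makes it simple to drop a set, swap a truncated keyword for its spelled-out forms, or split a large set in two when the number of results becomes unmanageable.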
The problem with using AND and OR is that they don’t care how the two words are related in a patent specification. All you know is that two of your keywords appear in a specification, somewhere. One could be in the first sentence and one in the last, but you won’t know until you read it. Ideally you would like the words to be more closely related than that; at least the same sentence or paragraph.
Let’s be clear though, AND and OR are very important, and you absolutely have to use them. What follows does not replace those primary tools, but just allows you to get your keywords a little closer together if you need to.
There are two methods for getting keywords closer together. The first is to search for a direct phrase. This can be as long as you want but two or three words is really all that will be effective. If we take our ‘red bicycle tubes’ example from above, we could search for something like “red bicycle” or “bicycle tube*”. Note how I’ve used double quotes this time. In most freely available searching databases you will need to use double quotes to indicate a direct phrase search. Espacenet allows you to enter the words without quotes as it will automatically consider it a direct phrase search. You can usually use truncation on one or both of the words.
The problem with this search is that you wouldn’t pick up any combination of keywords such as ‘red racing bicycle’ or ‘bicycle inner tube’. To find those you can use a feature called proximity searching, which could also be called ‘words within so many words’. Not many freely available databases have this feature, but it does allow you to expand your search beyond the restrictive direct phrase search.
PatentScope allows for the use of NEAR to find two keywords within five words of each other, but it also allows for the use of a ~ (tilde) to customise that gap. So you could search for ‘red NEAR bicycle’, which is the same as “red bicycle”~5, or you can go further to “red bicycle”~10, and these would pick up ‘red racing bicycle’.
AusPat used to let you use NEAR but that has been replaced by /n/ where n is the number of words in the gap between your keywords, so “red /5/ bicycle”. Note that both this and the PatentScope tilde search have to take the form of a direct phrase search, using double quotes along with the ‘words within words’ operator.
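The difference between a direct phrase search and a proximity search can be sketched in a few lines of Python. This is an illustration only: the function names are invented, and the way the gap is counted here (the difference between the two word positions) is an assumption, since each database counts the gap in its own way.

```python
import re

def tokenize(text):
    """Split text into lowercase words, ignoring punctuation."""
    return re.findall(r"[a-z]+", text.lower())

def phrase_match(text, first, second):
    """True if 'second' immediately follows 'first' (a direct phrase search)."""
    tokens = tokenize(text)
    return any(a == first and b == second for a, b in zip(tokens, tokens[1:]))

def within(text, first, second, max_distance):
    """True if the two words occur at most max_distance positions apart."""
    tokens = tokenize(text)
    first_positions = [i for i, t in enumerate(tokens) if t == first]
    second_positions = [i for i, t in enumerate(tokens) if t == second]
    return any(abs(i - j) <= max_distance
               for i in first_positions for j in second_positions)

near = "A red racing bicycle with an inner tube."
far = ("Red paint was applied to the garage door long before "
       "anyone bought a bicycle.")

print(phrase_match(near, "red", "bicycle"))   # phrase search misses it
print(within(near, "red", "bicycle", 5))      # proximity search finds it
print(within(far, "red", "bicycle", 5))       # too far apart to count
```

A plain AND search would return both example sentences; the proximity check keeps only the one where the two keywords are plausibly describing the same thing.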
So, rack your brain (cerebellum, grey matter, head…), grab a thesaurus (dictionary, onomasticon, lexicon, source book…) or whatever you can, to come up with a list of suitable alternatives to your initial keywords, and try combining them in a few different ways to assist with your search.
Authored by Frazer McLennan and Gareth Dixon, PhD