Patents are a unique forward-looking data source that can be a great indicator of innovation and emerging trends, particularly with regard to new technologies. Working with this data to pinpoint companies that are poised to directly benefit can be challenging; data can cluster around specific industries, geographies, and market and company sizes. Timing is also an intangible element. Using knowledge graph technology to identify where the concentration of innovation is, and what companies are involved, is one way to make practical use of patent data.
In this webinar with Yewno COO Ruth Pickering and Tony Guida, Portfolio Manager and Executive Director at RAM Active Investments, we examine some of Tony’s findings in working with knowledge graphs and patent data.
For a full recording, please visit the link below, or keep reading for a text summary.
Video Link: https://www.youtube.com/watch?v=oE_J-Su-410
Yewno and Tony Guida previously partnered to apply Yewno’s Concept Exposure Data to a news data set. Could you give us a brief recap and tell us what we can expect from patents when the same multi-factor model is applied to an entirely different data source?
The commonality here is “concepts” and the signals that can be drawn from related concepts. A “concept” is something derived from a knowledge graph. We can measure the degree to which companies are exposed to certain concepts in financial literature (for example, earnings estimates or cash flows) and create signals to construct portfolios.
Patents are very different from news, but patent analysis is following the same path that news and sentiment data did in their early days. Patents are public, so you can find a lot of information, but what’s really interesting is that most use of the data amounts to simply counting patents.
While counting patents is better than nothing, knowledge graph technology goes further: it captures clusters of innovation. With knowledge graphs, you can pick a theme and then get more granular with sub-categories and vectors. For example, if we want to look at “Quantum,” we can drill down into quantum computing, quantum currency, and so on. Similarly, we can adjust the duration and expand the theme to include new concepts.
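The drill-down described above can be sketched as a breadth-first walk over a concept graph. This is only an illustration, not Yewno's implementation: the `CONCEPT_GRAPH` dictionary and the `drill_down` function are hypothetical, and only "Quantum", "Quantum computing", and "Quantum currency" come from the discussion. The deeper nodes are invented to show a second level.

```python
from collections import deque

# Hypothetical toy graph: each concept maps to its narrower sub-concepts.
# Only the first level under "Quantum" is taken from the text; the deeper
# nodes are made up purely for illustration.
CONCEPT_GRAPH = {
    "Quantum": ["Quantum computing", "Quantum currency"],
    "Quantum computing": ["Qubit", "Quantum error correction"],
    "Quantum currency": [],
    "Qubit": [],
    "Quantum error correction": [],
}


def drill_down(graph, theme, max_depth):
    """Return concepts reachable from `theme` within `max_depth` hops,
    in breadth-first order (the theme itself is excluded)."""
    seen = {theme}
    found = []
    queue = deque([(theme, 0)])
    while queue:
        concept, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the requested granularity
        for child in graph.get(concept, []):
            if child not in seen:
                seen.add(child)
                found.append(child)
                queue.append((child, depth + 1))
    return found


if __name__ == "__main__":
    print(drill_down(CONCEPT_GRAPH, "Quantum", 1))
    print(drill_down(CONCEPT_GRAPH, "Quantum", 2))
```

Raising `max_depth` corresponds to going "deeper, and then even deeper" into a theme. Adding entries to the dictionary corresponds to growing the theme to include new concepts.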
Can you tell us more about the patent data landscape and what’s available?
This was a fun exercise because I’d done some work with patent data. Patents used to be a bit boring – they were more for lawyers, and not linked to investment because the financial community hadn’t figured out how to generate alpha signals.
What’s changed is that now you can literally type a keyword into Google and see the patents and which companies are active in this area. We are also now able to find what words are used in the patents, a step beyond the counting that I referred to earlier.
However, words are often either too technical or too vague, so what you get is a list of topics. It’s difficult to carve out the details yourself. A knowledge graph like Yewno’s is a more semantic representation of the knowledge, helping you to identify and connect concepts.
How does Yewno sit in this universe and what’s different?
The technology is totally different from anything that has been done so far; it’s a generational jump. Patents haven’t been tapped that much because they have a longer duration: you have had to buy and hold to take advantage of the signals. However, Yewno extracts concepts that can be linked across the whole universe of patents. This helps to identify the trends behind the patents and capture the semantics to see (and capitalize on) a cross-section of innovation. The beauty of knowledge graphs is that you can go deeper, and then even deeper.
How did you select the concepts for your research?
This was difficult because Yewno has MILLIONS of concepts. How do you choose just a handful? I simply asked myself which topics I’m very excited about but don’t currently have signals or metrics for. Being a quant, I do love to cover other sciences, and quantum computing is one thing I’m really interested in, so that’s what I chose to start with.
Can you lay out for us how you constructed your research and what you found?
Using aggregate scores and weekly data, I constructed a universe of stocks that are very active in patents. I looked at the 300 most active patent filers globally over the last 5 years, then narrowed that down to the 250 most productive filers. The horizon was mid to long term.
I found a bit of a bias toward developed countries: the US, Japan, Germany, and Switzerland. There was also a sector bias toward Technology, Capital Goods, Software, Pharmaceuticals, and Media, industries that typically invest in research and development. However, it was more balanced across sectors than I thought it would be.
Once I was able to identify the groups creating patents and the groups that were not, I went deeper. I extracted aggregate data and weighted it using four factors to identify current and future hot technologies. I wanted to see what the most covered concepts were. It turned out that Augmented Reality was the top one, followed by Virtual Reality, Artificial Intelligence, Graphic Processing, Cache (Computing), and a number of others, including several types of Quantum.
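One way to picture the weighting step is as a weighted sum of per-concept factor values, followed by a ranking. The sketch below is an assumption about the general shape of such a calculation, not the method actually used in the research. The four factors are not named in the discussion, so `factor_1` through `factor_4`, the equal weights, the placeholder concept names, and all numbers are hypothetical.

```python
# Hypothetical factor names and equal weights: the discussion says four
# factors were used but does not name them or give their weights.
WEIGHTS = {
    "factor_1": 0.25,
    "factor_2": 0.25,
    "factor_3": 0.25,
    "factor_4": 0.25,
}


def aggregate_score(factors, weights=WEIGHTS):
    """Weighted sum of one concept's factor values."""
    return sum(weights[name] * factors[name] for name in weights)


def rank_concepts(concept_factors, weights=WEIGHTS):
    """Return (concept, score) pairs sorted from highest to lowest score."""
    scored = {
        concept: aggregate_score(factors, weights)
        for concept, factors in concept_factors.items()
    }
    return sorted(scored.items(), key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Made-up inputs with placeholder concept names; these are not results.
    example = {
        "concept_x": {"factor_1": 4.0, "factor_2": 4.0, "factor_3": 4.0, "factor_4": 4.0},
        "concept_y": {"factor_1": 1.0, "factor_2": 2.0, "factor_3": 3.0, "factor_4": 2.0},
        "concept_z": {"factor_1": 4.0, "factor_2": 4.0, "factor_3": 2.0, "factor_4": 2.0},
    }
    for concept, score in rank_concepts(example):
        print(concept, score)
```

With real factor data in place of the placeholders, the top of such a ranking would correspond to the "most covered concepts" mentioned above.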
Ultimately, did you find enough or the right types of Concept Exposure signals to construct any kind of strategy around Patent Data?
Actually, yes. I was able to create a portfolio out of the aggregate data, assuming a bi-annual rebalancing to mimic an Exchange Traded Fund (ETF). The ETF performed much the same as the aggregate. It’s not only possible; there is a good case for creating thematic ETF-type products from this data.
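A rebalancing schedule of this kind could be sketched as follows. This is a minimal illustration under stated assumptions: "bi-annual" is read here as twice a year, the portfolio is equal-weighted across the top-scoring names, and the function names, dates, and scores are all hypothetical. None of these details are confirmed by the discussion.

```python
from datetime import date, timedelta


def rebalance_dates(weekly_dates):
    """Pick the first weekly observation in each half-year
    (H1 = January to June, H2 = July to December)."""
    picked = []
    last_period = None
    for d in sorted(weekly_dates):
        period = (d.year, 1 if d.month <= 6 else 2)
        if period != last_period:
            picked.append(d)
            last_period = period
    return picked


def equal_weight_top_k(scores, k):
    """Equal-weight the k highest-scoring names (an assumed weighting scheme)."""
    top = sorted(scores, key=scores.get, reverse=True)[:k]
    return {name: 1.0 / len(top) for name in top} if top else {}


if __name__ == "__main__":
    # Sixty made-up weekly observation dates starting on Friday 2020-01-03.
    weeks = [date(2020, 1, 3) + timedelta(weeks=i) for i in range(60)]
    print(rebalance_dates(weeks))
    # Made-up scores for placeholder tickers.
    print(equal_weight_top_k({"AAA": 0.9, "BBB": 0.5, "CCC": 0.7}, k=2))
```

Between rebalance dates the holdings are left untouched, which is what makes the portfolio behave like a rules-based thematic product rather than a weekly trading signal.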