Prathik Roy is Product Director for Data and AI Solutions at Springer Nature, one of the world's largest academic publishing companies. A quantum chemist and materials scientist by training, he spent years in R&D before gravitating towards product management, and has spent the past 12 years helping publishers understand the value locked inside their content.
In this episode, Prathik makes the case that publishers are sitting on some of the most strategically valuable data in the world, and that most of them are only beginning to understand what that means in the age of AI. The conversation moves from the collapse of traditional publishing metrics to the emergence of token-based pricing, rights management as a product discipline, and what it means to serve users who now live inside AI systems rather than on your platform.
Prathik draws on his background in scientific publishing to explain why peer review is a genuine moat, why AI developers are asking publishers to restructure centuries-old formats, and why the prohibited section of a licensing agreement matters more than most product managers realise.
Key Takeaways
For 20 years, publishing was built on the assumption that value is visible — clicks, downloads, page views. AI has broken that model. A scientist can now get a synthesis of 300 papers without opening a single PDF. Publishers who still measure only what happens on their own platforms are measuring the wrong thing.
The quality signal embedded in scientific publishing — acceptance rates, citations, expert validation — is precisely what AI developers need and cannot easily replicate at scale. That is the leverage point. Publishers who understand this can command significantly higher licensing value than those treating their content as a commodity.
AI developers are asking for content formatted with bullet points and question-and-answer structures — formats that run counter to how scientific articles have been written for centuries. Publishers who adapt their content architecture for machine consumption will capture more value from AI pipelines than those who do not.
If you are building a data product, understanding rights in and rights out is as important as building the product itself. Prathik's rule: if the prohibited section of your agreement is not longer than the rights section, you are probably doing something wrong.
The next revenue frontier in data-intensive publishing is not flat subscription fees — it is pricing tied to how content is consumed inside AI systems and the outcomes that consumption generates. Token-based models and revenue-share arrangements are already emerging in scientific publishing.
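As a rough illustration of how the two pricing models mentioned above differ from a flat fee, here is a minimal sketch. The function names, the per-million-token rate, and the revenue-share percentage are all hypothetical placeholders chosen for the arithmetic; none of them come from the episode or from any real licensing agreement.

```python
def token_based_fee(tokens_consumed: int, price_per_million_tokens: float) -> float:
    """Fee that scales with how many tokens of licensed content were consumed.

    Both the metering unit and the rate are assumptions for illustration.
    """
    return tokens_consumed / 1_000_000 * price_per_million_tokens


def revenue_share(licensee_revenue: float, share: float) -> float:
    """Fee tied to the outcome: a fraction of the revenue the licensee earned.

    `share` is a fraction between 0 and 1 (hypothetical value below).
    """
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must be between 0 and 1")
    return licensee_revenue * share


# Hypothetical month: 2.5 million tokens consumed at 40.0 per million tokens,
# and 10,000 of attributable licensee revenue at a 25% share.
usage_fee = token_based_fee(2_500_000, 40.0)
outcome_fee = revenue_share(10_000.0, 0.25)
print(usage_fee, outcome_fee)
```

The point of the sketch is only that both fees move with consumption or outcomes, whereas a flat subscription stays the same however heavily the content is used.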
Product managers need to build pipelines that track how content flows into AI systems, how often it is retrieved, and whether the outputs generated are traceable back to the original source. Session time and page views will not tell you what you need to know.
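A minimal sketch of what such tracking might look like, assuming the AI licensee reports one event per retrieval. The `RetrievalEvent` fields, the `summarize` function, the licensee names, and the example identifiers are all hypothetical; a real pipeline would depend on what the licensing agreement obliges the AI developer to report.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrievalEvent:
    source_id: str         # identifier of the original article (hypothetical format)
    licensee: str          # AI system that retrieved the content
    cited_in_output: bool  # whether the generated output was traceable to the source


def summarize(events: list[RetrievalEvent]) -> dict[str, dict[str, float]]:
    """Per source: how often it was retrieved and how often it was attributed."""
    retrievals = Counter(e.source_id for e in events)
    attributed = Counter(e.source_id for e in events if e.cited_in_output)
    return {
        source_id: {
            "retrievals": count,
            "attribution_rate": attributed[source_id] / count,
        }
        for source_id, count in retrievals.items()
    }


# Made-up events for two articles and two licensees.
events = [
    RetrievalEvent("10.0000/example.1", "assistant-a", True),
    RetrievalEvent("10.0000/example.1", "assistant-a", False),
    RetrievalEvent("10.0000/example.1", "assistant-b", True),
    RetrievalEvent("10.0000/example.2", "assistant-b", False),
]

for source_id, stats in summarize(events).items():
    print(source_id, stats)
```

Retrieval counts and attribution rates of this kind replace session time and page views as the usage signal, since the reader never visits the publisher's platform.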
Podcast transcripts, audio, and video all contain extractable IP that can be packaged, enriched, and licensed to financial firms, research companies, and AI developers. Extraction and enrichment come first; licensing and outcome-based models follow from there.