Out there on the information superhighway a tiny little text file just got run over; the problem is that the one left bleeding is you.
You are reading the Upgrade Media newsletter sent on 18 July 2024. Sign up now to receive future newsletters directly to your inbox or via LinkedIn.
Be honest: until now, had you ever really asked what robots.txt did? Of course you hadn’t; not least because you have more pressing things to do than research all the little text files that help the web work.
Robots.txt is a tiny text file that tells automated visitors what they can and can’t access on your website. Specifically, it grants or withholds permission from search engine crawlers. It all started in the 90s, when limited hardware resources meant an overenthusiastic crawler could crash a site, but it has since become the gentlemen’s agreement by which the web runs.
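For illustration, a minimal robots.txt might look like the sketch below. GPTBot (OpenAI) and CCBot (Common Crawl) are real crawler user agents; which paths you open or close is entirely your choice, and – as the rest of this article argues – compliance is voluntary:

```
# Let traditional search engines in, except for the admin area
User-agent: Googlebot
Disallow: /admin/

# Ask AI training crawlers to stay out entirely
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /
```

The file sits at the root of the site (e.g. example.com/robots.txt); it is a request, not a lock.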
Until now.
AI, at least in the hands of certain companies, does not seem to respect robots.txt: it simply barges past to not only rifle through your content but cheerfully steal it.
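To see what “respecting” the file means in practice, here is a minimal sketch of how a well-behaved crawler checks robots.txt before fetching a page, using Python’s standard-library robotparser. The bot name and URLs are hypothetical examples:

```python
from urllib import robotparser

# Parse a robots.txt that reserves the /premium/ section
# (in real use the crawler would fetch this file from the site root).
rp = robotparser.RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /premium/",
])

# A compliant crawler asks before every fetch:
print(rp.can_fetch("FriendlyBot", "https://example.com/news/story"))     # True: allowed
print(rp.can_fetch("FriendlyBot", "https://example.com/premium/story"))  # False: stays out
```

The whole system rests on the crawler bothering to ask; nothing stops a bot from skipping the check entirely, which is exactly the problem described above.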
The old promise was that if you let search engines access your information, it was understood to be in return for them linking back to you.
But the new generation of search engines like Perplexity and Arc Search skip the inconvenience of backlinks and instead simply supply the user with answers.
That in itself is driving a rethink of the relationship between search engines and publishers.
However, news from Forbes shows that it is even more serious.
Wired’s staff discovered that Perplexity was not only ignoring robots.txt but serving up information found only in Forbes articles – without crediting Forbes at all. AI piracy, in fact. It even went one better: Wired discovered that the article it then wrote about Perplexity’s piracy was itself being pirated, by Perplexity.
Some companies are demanding money before the new generation of AI search tools can even access their content – an effective paywall that involves negotiating keys for the application programming interface (API) that gives the robots access. Nobody really expects robots.txt to be respected any more. While this is an option for Reddit, or for newspapers with deep pockets and lots of lawyers, it is a less obvious solution for the local press.
With the dual theft of content and the loss of traffic from search referral, it might seem tempting to retreat to the castle and throw up a hard paywall. As any mediaeval siege expert will tell you, however, this mentality only works if you have sufficient provisions within the castle.
Read our June 2024 newsletter: Sink, or swim – the pink slime is coming
The catch is that while we all want to keep the AI invaders out, we really, really want to welcome not only our regulars, but complete strangers who might take a fancy to what we have on offer.
The hard paywall has several characteristics:
Total blocking of content: It prevents all access to content without subscription, reserving all articles for paying subscribers only.
Immediate confrontation: Readers encounter the paywall as soon as they try to read a premium article, forcing them to subscribe in order to continue.
Strict strategy: This model is the most rigid form of paywall, potentially effective for the most engaged readers but likely to drive away less engaged ones.
Increasing the visibility of paid content: By increasing the visibility of paid articles, more users see the paywall, which can increase conversions.
These points are now out of step with the needs of today’s publishers, who must instead adjust their content strategy with finesse and subtlety to attract users who are not necessarily familiar with them. Even newspapers that are the only title in town are now up against YouTube and TikTok, and paywalls are not the weapon of choice in that battle.
The challenge is to use your data on reader behaviour to serve them better:
Total blocking of content: Not all of us have the luxury of being the Financial Times and can afford such a strategy. If you set up a paywall, start making smart exceptions to encourage users to discover and enjoy your content.
Immediate confrontation: Let’s be strategic and apply the paywall to content that readers value highly.
Strict strategy: Only a few media can use a hard paywall. In a competitive environment marked by the emergence of intelligent browsers and AI in search, it is essential to adopt more subtle strategies.
Increasing the visibility of paid content: This strategy is outdated and only applies to a handful of reference media such as the Financial Times or the Wall Street Journal. For the local press, it would be counter-productive.
Conclusion: More than ever, publishers need an AI-informed content strategy – one that intelligently links content and form, highlights the values of the medium, and adapts to the needs of users at the key moments when they choose to be informed.
That’s a big ask. It requires rapid, even immediate assessment of data to serve the right information to the right person in a way that makes them want more – enough to pay for it. The good news is that this too is coming, and also courtesy of AI. Bizarrely, even as we fight off the first wave of AI pirates, we should be opening the side door and asking smart AI gatekeepers how to better usher in the right kind of visitors.
About Upgrade Media: Upgrade Media is a creative agency, strategy consultancy, training center and media transformation think tank, through its brand New World Encounters.
◾️ We work with media and communications companies to accelerate their digital transformation, evolve their organizations and their print and digital products, and develop team agility.
◾️ Check out our Upgrade Media website to learn more about our projects and approach.
◾️ We hope this article and our other content inspire you!
Thank you for reading.
Keep up to date with all our news by subscribing to our newsletter by e-mail or via LinkedIn.