GPT-J: Is your media sufficiently different from its competitors?

How can you optimize your content strategy? Our AI aims to help media outlets stand out from their competitors.

In an ultra-competitive world, it is vital to work on your brand so that you stand apart from your competitors. In the media world, the choice of topics, angles and headlines is what the audience relies on when deciding whether or not to engage. So, are you sure you are unique enough?

What did we set out to do?

We worked with GPT-J, a language model developed by a community of researchers, engineers and developers (EleutherAI).

We took this model (which works in a similar way to GPT-3, the family of models behind ChatGPT), kept only its “brain”, and trained it to guess the brand behind the headlines of articles from reference media.
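The post does not detail the training setup. One common reading of "keeping only the brain" is to freeze the pretrained model, use it to turn each headline into a vector, and train a small classification head on top. The sketch below illustrates that idea only: `encode` is a hashed bag-of-words stand-in for the language model (an assumption, not GPT-J), and the headlines, brand names and hyperparameters are made up.

```python
import math
import zlib

DIM = 256  # size of the stand-in feature vector (hypothetical value)

def encode(headline: str) -> list[float]:
    """Stand-in for the frozen language model: hashes words into a unit vector.
    In a real setup this would be the model's internal representation of the headline."""
    vec = [0.0] * DIM
    for word in headline.lower().split():
        vec[zlib.crc32(word.encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def softmax(scores: list[float]) -> list[float]:
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def train_head(features, labels, n_classes, epochs=200, lr=0.5):
    """Trains only the classification head (one weight row per brand)
    with plain stochastic gradient descent on the cross-entropy loss."""
    weights = [[0.0] * DIM for _ in range(n_classes)]
    for _ in range(epochs):
        for x, y in zip(features, labels):
            probs = softmax([sum(w * v for w, v in zip(row, x)) for row in weights])
            for c in range(n_classes):
                grad = probs[c] - (1.0 if c == y else 0.0)
                for i in range(DIM):
                    weights[c][i] -= lr * grad * x[i]
    return weights

def predict_proba(weights, headline: str) -> list[float]:
    x = encode(headline)
    return softmax([sum(w * v for w, v in zip(row, x)) for row in weights])

def predict_brand(weights, headline: str) -> int:
    probs = predict_proba(weights, headline)
    return max(range(len(probs)), key=lambda c: probs[c])

# Toy, invented headlines; "brand_a" and "brand_b" are placeholders.
BRANDS = ["brand_a", "brand_b"]
TRAIN = [
    ("markets rally as stocks climb", 0),
    ("stocks slide as markets wobble", 0),
    ("local festival draws record crowds", 1),
    ("village festival welcomes crowds", 1),
]
features = [encode(h) for h, _ in TRAIN]
labels = [y for _, y in TRAIN]
head = train_head(features, labels, n_classes=len(BRANDS))

print(BRANDS[predict_brand(head, "stocks rally")])
```

The per-brand probabilities returned by `predict_proba` are the kind of output that can then be scored or plotted; with a real language model, only `encode` would change.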

Then, we represented this work visually to turn it into a real strategic tool: it shows how a publication can use the right headline either to set itself apart from its competitors or, alternatively, to blend in with them. What matters in these graphs is the position of each media outlet relative to the others.

Test 1: Focus on 5 US/UK media

We worked on a first dataset of 10,000 article titles, the most shared on Facebook, from major English-language international media: The New York Times, The WSJ, The Washington Post and The Guardian.

The first results are amazing.

Screenshot of our interactive map.

This interactive map lets us discover the titles that are clearly representative of an individual brand, as opposed to the titles that are perfectly interchangeable between the media studied.

We can conclude that The New York Times, The Washington Post and The Guardian each have very distinctive content, because the AI clearly succeeded in isolating them, while the points for the WSJ and The New York Times blend together.

Test 2: Focus on 5 French media

We continued our investigations on a second dataset of 50,000 article titles, the most shared on Facebook, from the following media: Le Monde, Le Figaro, Ouest-France, Le Parisien and France Télévisions. To keep this chart readable, we plotted only a random sample of articles.

This interactive map is exciting to explore!

Each point corresponds to a headline, deliberately truncated to its first few letters (contact us for the full version). We recommend exploring it on a tablet or a desktop computer for ease of reading.

We can clearly see that our AI is very successful at distinguishing media like Ouest-France and Le Monde, but less successful at telling apart Le Figaro, France Télévisions and Le Parisien. Indeed, over the course of several tests, Le Figaro and Le Parisien are the two media whose styles our AI has the most difficulty recognizing.

For those who are interested: more about our scores

We obtain a ROC AUC score of 0.93 for the first dataset and 0.87 for the second.
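For readers unfamiliar with the metric, here is a minimal sketch of how a ROC AUC score can be computed for a multi-brand classifier. The post does not say how the score was averaged across brands; this example assumes a one-vs-rest macro average, and the function names and sample numbers are purely illustrative.

```python
def binary_auc(scores, is_positive):
    """ROC AUC for one brand against all others: the probability that a
    randomly chosen positive example is scored above a randomly chosen
    negative one (ties count as half)."""
    pos = [s for s, p in zip(scores, is_positive) if p]
    neg = [s for s, p in zip(scores, is_positive) if not p]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def macro_ovr_auc(probabilities, labels, n_classes):
    """Unweighted mean of the one-vs-rest AUCs, one per brand.
    probabilities[i][c] is the predicted probability that headline i
    belongs to brand c; labels[i] is the true brand index."""
    aucs = []
    for c in range(n_classes):
        scores = [row[c] for row in probabilities]
        aucs.append(binary_auc(scores, [y == c for y in labels]))
    return sum(aucs) / n_classes

# Invented predictions for four headlines and two brands.
example_probs = [[0.9, 0.1], [0.4, 0.6], [0.6, 0.4], [0.2, 0.8]]
example_labels = [0, 0, 1, 1]
print(macro_ovr_auc(example_probs, example_labels, n_classes=2))
```

A score of 0.5 means the ranking is no better than chance, while 1.0 means every headline of a brand is ranked above every headline of the others.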

Below is the rule of thumb from Hosmer and Lemeshow's Applied Logistic Regression, on which we rely to assess our results:

0.5 = No discrimination

0.5-0.7 = Low discrimination

0.7-0.8 = Acceptable discrimination

0.8-0.9 = Excellent discrimination

0.9 and above = Exceptional discrimination
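The rule of thumb above can be written as a small lookup. This is only a sketch: the function name is hypothetical, and because the ranges share their endpoints, the handling of boundary values (each threshold belongs to the higher band, and anything at or below 0.5 counts as no discrimination) is an assumption.

```python
def discrimination_band(auc: float) -> str:
    """Maps a ROC AUC score to the rule-of-thumb label listed above.
    Boundary handling is an assumption: each threshold falls in the higher band."""
    if not 0.0 <= auc <= 1.0:
        raise ValueError("AUC must be between 0 and 1")
    if auc >= 0.9:
        return "exceptional"
    if auc >= 0.8:
        return "excellent"
    if auc >= 0.7:
        return "acceptable"
    if auc > 0.5:
        return "low"
    return "no discrimination"

# The two scores reported in this article.
for score in (0.93, 0.87):
    print(score, discrimination_band(score))
```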

We are continuing our research and will share our findings with you very soon.

