IP Talent

Welcome to the Machine: Measures you can take to stop AI scraping your content

I saw the viral 'Goodbye Meta AI' repost doing the rounds on socials.

It's a hoax.

But what is clear is the scepticism over scraping in the name of 'AI'. Specifically, GenAI.

The legal conflict over scraping is heating up on multiple fronts:

  • record labels (Warner, Sony and Atlantic) are suing AI music providers, and
  • Getty is suing Stability AI for copying its images.

But there is a new target. You.

Many social platforms have now disclosed that they are (and have been) using public data to train their AI models.

And on the platforms that don't, and that prohibit it, third-party crawlers are doing it anyway.

In fact, it goes further.

The richest data sets used to train chatbots are videos with subtitles.

Why?

Because they reliably capture how people speak, in the correct context and with their mannerisms.

Pauses, pace, rhythm and flow etc.

Leading experts in the industry have dubbed it:

the Holy Grail

And we're all content creators, right?

While the law gets its act together, here are some steps you can take to protect yourself:

  • Double down on your privacy
  • Add a digital signature, watermark or image cloak to your images
  • If you have a website, prohibit scraping in your 'terms'
  • Opt out: many platforms now offer this (Meta, Squarespace, Adobe)
  • If you suspect your data has been scraped without permission, send a Data Subject Access Request (DSAR). Data privacy is a human right, but you might want to speak with a privacy lawyer first
  • If you can't beat them, join them. There are now agencies paying for content to train AI models
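
For the website step, a written prohibition in your terms can be paired with a machine-readable request in a robots.txt file. The sketch below is only an illustration, not something taken from the steps above: the list of crawler names (GPTBot, CCBot, Google-Extended, ClaudeBot) is an assumption and is not exhaustive, and the function names and the example.com URL are hypothetical. It builds the file text and checks it with Python's standard robots.txt parser.

```python
from urllib.robotparser import RobotFileParser

# Assumed list of AI-crawler user-agent tokens. It is not from the article,
# it is not exhaustive, and it will need updating as new crawlers appear.
AI_CRAWLERS = ["GPTBot", "CCBot", "Google-Extended", "ClaudeBot"]


def build_robots_txt(agents=AI_CRAWLERS):
    """Return robots.txt text asking each listed crawler to stay off the whole site."""
    blocks = [f"User-agent: {agent}\nDisallow: /\n" for agent in agents]
    # Every other crawler (for example an ordinary search engine) stays allowed:
    # an empty Disallow line means nothing is off limits.
    blocks.append("User-agent: *\nDisallow:\n")
    return "\n".join(blocks)


def is_allowed(robots_txt, agent, url="https://example.com/blog/post"):
    """Check, with the standard-library parser, whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Save this output as robots.txt at the root of your website.
    print(build_robots_txt())
```

Bear in mind that robots.txt is a request, not a lock. Well-behaved crawlers honour it, but as noted above, some third-party crawlers scrape regardless, so it works alongside your terms rather than replacing them.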

Scraping and crawling are not new. Google and other search engines rely on them.

But is the modern-day 'Yellow Pages' a different proposition to AI?

Only time will tell.

By Jack Jones
Published October 2024