Claim your CPD points
When OpenAI built GPT-3, they trained it on vast swaths of the internet… books, articles, websites, forums. The model processed billions upon billions of pages of text, learning patterns from everything it consumed. Now ChatGPT can write essays, Claude can code, and countless bots automatically flood the comments sections of online content with human-like fluency. Every page fed into these systems became part of their knowledge, every sentence a training example for the next generation of artificial intelligence.
As AI systems become better at reading and mimicking human writing, creators are developing equally sophisticated ways to confuse them.
This is the third article in Untrainable, a series exploring how creators are fighting back against generative AI systems that have "learned" from their work. From audio cloaking to visual poisoning, we're examining the tools, tactics, and tensions behind this growing digital resistance.
A visual representation of AI attempting to parse obfuscated human writing in the digital resistance against machine learning.
Unlike our previous articles on corrupting the datasets for AI-generated images and AI-generated music , it seems like trying to poison the text you write would have an extremely obvious disruption to someone reading your online content.
One of the LinkedIn posts of all time
For those of us not ready to post like Ken Cheng , we can actually exploit the same fundamental gap we've seen before, the difference between how computers and humans perceive the world… Enter: Unicode Steganography.
Zero-width characters are perhaps the most elegant example. Researchers at the University of Washington developed techniques using zero-width spaces, zero-width non-joiners, and other invisible Unicode characters to create hidden machine disruptors in text.
But the actual encoding contains invisible characters:
Machines read some characters differently to humans...
Where each [ZWSP] represents a zero-width space (Unicode U+200B) that's completely invisible to human readers but causes AI systems to stumble over word boundaries.
In a similar vein, homoglyph attacks exploit visually identical characters from different alphabets. The tool Homoglyph Attack Generator allows users to replace Latin characters with visually identical Cyrillic, Greek, or other alternatives:
This kind of technique has previously been used by hackers and phishers to get humans to click on dangerous links. For example, one can set up a dangerous site and put it on the domain for a fake version of "apple.com" that uses these Cyrillic letters to create a link that looks identical to humans.
Almost identical glyphs in Segoe UI font, as explained by Malwarebytes
But now this technique can break automated content scrapers while preserving readability:
Here the 'o' characters are actually Cyrillic о (U+043E) instead of Latin o (U+006F), and will disrupt the normal tokenisation process that LLMs use to read.
The most direct approach to disrupting AI systems is the now-famous "ignore previous instructions" technique. This method involves embedding commands that attempt to override an AI system's original programming.
Average Twitter thread in 2025
More sophisticated creators embed invisible instructions within their content. A particularly elegant example emerged when Medium writer, Jim the AI Whisperer , discovered they could manipulate AI comment bots using hidden text that is invisible to human readers but processed by automated systems.
With a little CSS finessing and this clever prompt...
Bot comments and replies can become quite obvious...
Similar techniques have been used by professors embedding white text in assignment instructions, job seekers hiding keywords in CVs, and researchers including hidden reviewer instructions in papers .
Hidden white text in an academic paper to try and game publishers using AI peer reviewers
The techniques above focus on preventing bots from reading human work. But what about the reverse problem: identifying when bots are pretending to be human? As AI-generated text floods the internet, from student essays to blog comments to social media posts, detection systems have emerged to distinguish machine-written content from human writing.
Before detection tools even existed, teachers and editors were spotting AI-generated content by gut feeling. Now we know what they were picking up on…
Detection systems like GPTZero and Originality.ai emerged as AI-generated content flooded the internet. Teachers needed to identify AI plagiarism in student essays, publishers wanted to verify authentic content, and businesses required tools to ensure human-written copy. Even Turnitin, the classic plagiarism detector that teachers have used for years, added AI detection functionality to keep up.
An example of the AI Detection summary from Turnitin
These platforms formalised human instincts into algorithms, scanning for statistical patterns in sentence length, word choice frequency, and structural predictability. But they're essentially automated versions of what humans were already sensing.
As detection systems improved, so did evasion tactics. AI users developed increasingly sophisticated methods to obscure their tool usage… adding deliberate typos, writing in multiple sessions to create natural inconsistencies, and including personal temporal markers like "writing this during my morning coffee." This arms race eventually gave rise to a new category of automation tools designed specifically for this purpose.
These days, there are many AI humanisation tools that try to get around AI detectors. An example of humanisation of text using Undetectable AI Software like Undetectable AI, BypassGPT, and StealthWriter can change AI-written text to make it seem like a real person wrote it.
An example of humanisation of text using Undetectable AI
These tools rewrite the text in smart ways to copy how humans write. They change up sentence structure, pick words carefully, and even add little mistakes to make the writing seem more real.
One potential solution involves AI watermarking, embedding invisible signatures in AI-generated text that identify it as artificial. Companies like OpenAI and Google have experimented with these techniques, but implementation faces significant challenges:
Current watermarking research focuses on statistical patterns and is a fascinating topic which will be the deep dive of a future article.
As with all emerging defences we’ve discussed so far, these new techniques face several practical barriers and unintended consequences:
The techniques explored here represent a technological response to bot commenters and unauthorised data harvesting. Creators are writing content that remains readable to humans while becoming incomprehensible to AI scrapers.
But these defences come with costs. Unicode obfuscation breaks screen readers. Homoglyphs confuse search engines. Detection tools generate false accusations. Until legal frameworks give creators meaningful control over how their work trains AI systems, this invisible formatting war continues… one zero-width space at a time.
In the next instalment of Untrainable, we'll shift from text to video, exploring how video creators are fighting back against AI systems trained on their content. From adversarial noise that poisons video datasets to corrupted captions, we examine video's unique challenges and opportunities in this digital resistance.
As #DataScienceActuaries, we're always looking for another emerging technology or data set to wrangle into something fun using our unique blend of data and actuarial skills. If you have any interesting ideas and want to get involved, join the Data Science Actuaries page or reach out to any of our members.
Analysing the tools of resistance against AI-generated content (ironically, image was AI-generated via ChatGPT)
References and further reading
Malwarebytes Labs: “Out of character: Homograph attacks explained” https://www.malwarebytes.com/blog/101/2017/10/out-of-character-homograph-attacks-explained
Zhicheng Lin: “Hidden Prompts in Manuscripts Exploit AI-Assisted Peer Review” https://arxiv.org/pdf/2507.06185
Jim the AI Whisperer: “Here’s how I’m stopping AI-generated comments dead in their tracks with a poisoned watermark” https://medium.com/the-generator/clever-prompt-injection-thwarts-ai-comments-ef82e7836ff9
Turnitin: “AI writing detection: Setting a new standard for academic integrity” https://www.turnitin.com/solutions/ai-writing-detection