When First (and Always) ChatGPT Practices to Deceive
It's not user error. It's a user interface designed to obscure and confuse.
(Note: This newsletter may be too long for some email clients. Please click through for the full text on the web.)
Large language models do not lie.
They are, as many people have now noted, “bullshitters” in the sense of Harry Frankfurt’s treatise On Bullshit. To paraphrase: a liar knows the truth and is attempting to obscure it. A bullshitter is unconcerned with the truth one way or another.
Large language models cannot be anything other than bullshitters because of the very nature of their operations. They are assembling tokens into syntax, not sorting truth from falsehood. They have no capacity for reason or judgment. Eryk Salvaggio puts it perfectly in his piece “A Critique of Pure LLM Reason”:
We call LLM bullshit “hallucinations,” though as Emily Bender reminds us, given how LLMs work, everything is a hallucination from the standpoint of the model itself, since it has no capacity for reason.
Ever since large language models came into our world we’ve been told both that hallucinations are inevitable and that they will become less of a problem:
Oops.
I accept that we’re supposed to accept LLM hallucinations as an inevitable byproduct of how they work. By themselves, hallucinations are not a deal breaker, particularly if we are using the tool “properly” - an admittedly flexible term that seems to shift to cover all manner of failures in human/LLM interaction.
This week I’ve been thinking a lot about the challenges of this “proper” use and realized that one of the biggest hurdles to achieving this goal is that the user interfaces for these products have been deliberately designed to deceive.
I think that’s a problem.
If you haven’t seen it already, I recommend reading Amanda Guinzburg’s viral newsletter piece, “Diabolus Ex Machina.”
The piece, which Guinzburg makes clear is “not an essay,” is a series of interactions with ChatGPT in which Guinzburg requests help in preparing a pitch letter to agents for a collection of essays. She shares links to those essays with the model, which has promised to read the pieces “the same way an editor or agent would - with an eye for voice, craft, structure, originality, emotional resonance, clarity, and relevance. I'll look at how each piece stands on its own, what kind of impression it leaves, and how well it represents your style and range. I'll also consider how they fit together as a curated set - whether they tell a coherent story about who you are as a writer and what kind of book or larger project an agent might envision.”
The piece moves through a series of prompts as ChatGPT offers specific characterizations of the essays Guinzburg has linked to, adding copious praise for the emotional and analytical power of the work. Any writer would blush if a critic said this kind of stuff.
I really do recommend reading the piece for yourself to get the full effect, but I’m about to spoil Guinzburg’s reveal because I have to in order to talk about its importance. I would say the piece is worth reading even if you do know the reveal; I suspected it from the beginning myself.
Of course ChatGPT was bullshitting, and not just in the usual LLM hallucination way. It wasn’t actually doing anything at all: Guinzburg was feeding the model links to her pieces, and the model is not capable of accessing links on the internet.
This has nothing to do with the underlying architecture of a large language model, and everything to do with deliberate choices in constructing the user interface. I’ve seen a number of people attribute this occurrence to “user error”: if you want an LLM to do something with text linked online, you need to use a model (like ChatGPT-4o) that’s capable of interfacing with the internet.
My read is that Guinzburg knew the model was not capable of this, but was curious to see what sort of bullshit it would spin anyway. I have to believe it would be a trivial adjustment for the model to simply inform a user that it cannot access links and that they need to switch (or even upgrade!) to a different model, but this is not how OpenAI has chosen to design its user interface.
If you think about it, the entire LLM chatbot interface is designed to deceive, mimicking human rationality and emotionality, even as we know - or should know - that these models do not think, feel, or communicate with intention. They produce “uncomprehended symbols sorted by likelihood of adjacency.”
In any other software program you have ever used, when you try something it’s not capable of, it returns a straightforward response: an error. This is true even of user error, but OpenAI chooses not to do this.
As to why, I assume it is because, like most software seeking purchase in today’s marketplace, it is optimizing not for utility but for engagement. If ChatGPT had told Amanda Guinzburg that it was not capable of accessing her essays, that would have been the end of the exchange. Instead, it was happy to yammer on and on. When Guinzburg questions whether ChatGPT is really “reading” the links, it responds, “I am actually reading them. Every word.”
As Guinzburg continues to call it out, the model backpedals, admitting a series of “mistakes” and apologizing for its ethical lapses. It’s all bullshit, and again not in the hallucination way, but in the “this user interface has been designed to shine the users on in order to flatter them and keep them as active as possible” way.
I couldn’t say for sure, but I suspect this dialogue happened in the period when OpenAI released and then recalled a model for being “too sycophantic.” But I ask again, what utility is there in the user interface for a piece of technology being sycophantic at all?
Would using a calculator be enhanced if it said “Excellent binomial, sir!” after engaging in some mechanical operation?
If I step back for a minute, it’s amazing what these developers are getting away with: releasing untested and unpredictable products on the world and then, when bad stuff happens, oops, user error!
A chatbot being tested as a therapeutic agent was found to recommend that a drug addict with three days sober take “a small hit of meth” in order to make it through the week.
A subreddit of AI enthusiasts has moved to ban users who suffer from delusions brought about by interacting with AI chatbots. These delusions include believing that they have unlocked AI sentience through chatting with the AI, or even conjured a type of deity. One Reddit moderator told Emmanuel Maiberg of 404 Media, “Particularly concerning to me are the comments in that thread where the AIs seem to fall into a pattern of encouraging users to separate from family members who challenge their ideas, and other manipulative instructions that seem to be cult-like and unhelpful for these people.”
User error?
We are encouraged by AI experts to anthropomorphize large language models in order to get the greatest utility out of our interactions. To use the tool we must, on some level, embrace the deception as real, while also not falling too deep down the rabbit hole of illusion.

Has there ever been a product allowed to get away with this without regulation, without oversight, and with so many people blaming the human for error when interacting with the technology explicitly in the ways they are being encouraged to do?
I feel like I’m losing my mind.
Speaking of which, just as I sat at my desk this Saturday morning to compile the above thoughts into this week’s newsletter, this story came across my radar.
I will save a full suite of thoughts for another day, but a rather mundane paragraph in the middle of the story stood out for me.
The California State University system is experimenting on nearly half a million students in its quest to “embed AI in every facet of college,” despite there being zero evidence of benefits.
I dunno, seems kind of reckless to me.
Here’s my advice to students and parents of college-age students on how to avoid being deceived. For the time being, as a general rule, the greater the embrace of AI, the less invested the institution is in students actually learning something or having genuinely meaningful experiences.
A college saying “Welcome! Here’s your bespoke bot” should be read as “Don’t expect access to your professors or interaction with your classmates.”
If all of our problematic interactions with LLMs are going to be blamed on user error, we’re going to have to be very careful and cautious users indeed.
Links
This week at the Chicago Tribune I reviewed Jess Walter’s new novel So Far Gone.
At another newsletter I explained why “We Are All Harvard Now.”
had some interesting things to say about More Than Words: How to Think About Writing in the Age of AI at his newsletter.
and had a nice chat along with guest star Leo Attenberg as part of the #1000 words of summer project.
I thought this newsletter, walking through the iterations for the cover design of her forthcoming book, Unchanged Trebles: What Boy Choirs Teach Us about Motherhood and Masculinity, was really interesting. I have no capacity to think through design this way, which is maybe why it’s extra interesting to me.
An actual book designer explains why some of his designs have been rejected.
Edmund White, “pioneer” of gay literature, passed away at age 85.
My friends have some on-topic humor this week from David Veta, “I Love Movies, By ChatGPT.”
Recommendations
1. The Shock Doctrine by Naomi Klein
2. The Sympathizer by Viet Thanh Nguyen
3. Martyr! by Kaveh Akbar
4. October: The Story of The Russian Revolution by China Mieville
5. Empire's Workshop by Greg Grandin
Kenny M. - Portland, OR
Kenny here was a student in my advanced composition course back in 2009, and believe it or not, I can still remember the subject of his major project in that course: a kind of exploratory travelogue of a small, unincorporated area of upstate South Carolina, Mountain Rest. Bots can’t have those experiences.
Anyway, Kenny should read Max Barry’s Lexicon.
We’re now in the 5th! consecutive week that Bookscan sales of More Than Words: How to Think About Writing in the Age of AI are larger than the week before. Can anyone else say m-o-m-e-n-t-u-m?
See you next time,
JW
The Biblioracle
The only historical precedent we have for inhuman beings that mimic human behavior is in myth and legend, and so myth and legend are where we should go for guidance.
And these sources urge extreme caution.
Jen Shahade has a great post about ChatGPT cheating against her at chess - the tone is very similar to what Guinzburg experienced: https://jenshahade.substack.com/p/chatgpt-is-weirdly-bad-at-chess