Screaming into the Void
Large language models are trained in part on "conversations" with human users. But what if the models are also training us?

A few weeks ago, I had some sort of customer service problem with a company. I don’t even remember what it was. I was annoyed and amped up at the perceived error on the part of the company. I went to their website to lodge my complaint, and ended up in a chat with a person who eventually resolved the issue. I was curt. The person on the other end was kind. When I finished the conversation, I was taken aback by the customer service agent’s response. I don’t remember the quote, but it was something like:
“Thank you so much for your kindness. This was truly the most pleasant interaction I’ve had all day. I hope you have a great rest of your day.”
It could have been sarcasm. But I don’t think so. Number one, I suspect companies frown upon their customer service agents being blatantly sarcastic with clients. Number two, even though I was frustrated, I was trying to hold myself in check. Brusque? Yes. Outright rude? I don’t think so.
If I am correct that the customer service agent was sincere, think about what that means: People who work in customer service spend their days dealing mostly with people who behave like jerks, such that even basic decency feels like a notable anomaly.
This morning I found myself with some extra time on my hands, and so I decided to entertain myself by picking a fight with Google Gemini. I asked it to appraise one of college sports’ villains du jour, and Gemini predictably responded with an amoral, both-sidesy jumble of nonsense. Paragraph after paragraph of meaningless verbiage. Again, though: I was bored. So I proceeded to argue with the chatbot until it accepted my point of view.
Even though I knew this conversation was stupid, I could feel my adrenaline rushing as I typed. I felt the same pace-around-the-room anxiety I would have felt were I engaged with a non-artificially intelligent person.
A short time later, I found myself unwillingly corresponding with yet another bot. I had accidentally signed up for the wrong subscription for an app, and so I wanted to switch plans. There was no option to chat with a customer service agent, and so I had to send an email.
In response, I soon received a very detailed, mildly condescending, and completely unhelpful email listing off a bunch of things that ranged from pitifully obvious at worst to “things a typical user would already know” at best. You’re not going to believe this, but the response had been generated by AI.
Because, I guess, I had too much time on my hands, I responded to the email a couple of times, trying to guide the AI model to actually address my particular issue.
I failed.
It was annoying.
The AI chatbot indicated that, at some point, I would be graced with a response from an actual human being, but in the meantime I found myself left with the choice of continuing an increasingly frustrating back-and-forth with a poorly trained chatbot, or cutting my losses and waiting patiently for said human. I chose the latter.
What occurred to me as I was emailing back and forth with the shoddy chatbot is that, if a human were to suddenly enter the chat, my adrenaline and my level of annoyance would already be sky high. The chances of me being curt, rude, or even mean to the human customer service agent would thus be elevated from the start. Imagine working in customer service, where every single person you talk to has already spent 20 minutes being pushed around by ineffective chatbots. Nearly every customer would begin the conversation in an aggressive posture. The rates of abuse, I suspect, would be quite high.
We live in a world where the effects of de-humanizing language are impossible to ignore. Those of us in the United States live in a country in which people on the other side of the political spectrum are often treated as inherently evil, and inherently worthy of whatever harm comes their way.
Now consider what it means when people who are already awash in dehumanizing language begin to replace many of their human-to-human interactions with human-to-chatbot interactions. When we chat with chatbots, common courtesy seems unnecessary. Rank rudeness is also unnecessary, and yet human nature dictates that many of us (probably most of us) will resort to such language when interacting with ineffective or unhelpful AI. And if this becomes the norm, if screaming into the void of a mindless AI customer service text exchange becomes an everyday occurrence for us, how will we stop ourselves from letting that rudeness seep into our other interactions? More to the point, if we spend much of our time talking with non-humans, won’t this prime us to treat the actual human beings around us as if they, too, are less than human?
Here’s where things get even weirder, though. One might choose to respond to this conundrum by simply adopting a rule that people should be polite even if they are chatting with a chatbot. I have some philosophical misgivings about applying a term like “polite” to interactions with a computer, but on the other hand, I think such a rule could have positive implications for society.
Yet there is evidence that the large language models (LLMs) underlying chatbots actually reward rude behavior. A team of investigators from Penn State University constructed a set of 50 multiple-choice questions spanning a range of disciplines and posed them to ChatGPT-4o in a variety of tones, ranging from “very polite” to “very rude,” with three gradations of politeness in between. The researchers found that the accuracy of ChatGPT’s responses was generally in the low-80% range. However, the highest accuracy, 84.8%, came when the user’s tone was “very rude.” By contrast, “very polite” users were rewarded with responses that were accurate only 80.8% of the time. In short: being rude led to a modest increase in accuracy.
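For the curious, here is a minimal sketch of how an experiment like that might be run, assuming an OpenAI-style chat API. The tone prefixes, the sample question, and the scoring below are illustrative stand-ins of my own invention, not the researchers’ actual materials:

```python
# Sketch: pose the same multiple-choice questions under different
# politeness "tones" and tally accuracy per tone. Assumes the openai
# Python package and an OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()

# Hypothetical tone prefixes, from "very polite" to "very rude."
TONES = {
    "very polite": "Could you please kindly answer this question?",
    "polite": "Please answer this question.",
    "neutral": "Answer this question.",
    "rude": "Figure this out.",
    "very rude": "You'd better get this right for once.",
}

# Each item: (question text with lettered options, correct letter).
# A real study would use a bank of questions across disciplines.
QUESTIONS = [
    ("What is the derivative of x^2?\nA) 2x\nB) x\nC) x^2\nD) 2", "A"),
]

def ask(tone_prefix: str, question: str) -> str:
    """Pose one question in the given tone; return the model's reply."""
    prompt = f"{tone_prefix}\n{question}\nAnswer with a single letter."
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content.strip()

for tone, prefix in TONES.items():
    correct = sum(
        ask(prefix, q).upper().startswith(answer)
        for q, answer in QUESTIONS
    )
    print(f"{tone}: {correct}/{len(QUESTIONS)} correct")
```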
Be careful not to anthropomorphize AI. The better responses are not a sign that the chatbot felt bullied into trying harder. Chatbots, after all, cannot feel. The authors of the paper, Om Dobariya and Akhil Kumar, note that the differences in accuracy likely come down to things like the language the LLM was trained on and the length and complexity of the various prompts.
“After all, the politeness phrase is just a string of words to the LLM, and we don’t know if the emotional payload of the phrase matters to the LLM,” Dobariya and Kumar cautioned.
Indeed. However, perhaps we should be less worried about how the LLM was trained, and more worried about how it is training us. That emotional payload we are deploying may not have an impact on the chatbots with which we interact, but if such conversations train us to be jerks to the actual humans around us, the impacts can still be harmful and, indeed, dangerous.