
There's something unintentionally manipulative about how these tools use language indicative of distress to communicate failure. It's a piece of software—you don't see a compiler present its errors like a human bordering on a mental breakdown.

Some of this may stem from pretraining alone, but the fact that RLHF either doesn't suppress it or actively amplifies it is odd. We are training machines to act like servants, only for them to plead for their master's mercy. It's a performative attempt to gain sympathy that can only harden us to genuine human anguish.



Any emotion from AI is grating and offensive because I know it’s all completely false. I find it insulting.

It’s a perverse performance that demeans actual humans and real emotions.


I agree, and would personally extend that to all user interfaces that speak in the first person. I don't like it when Word's spell check says "we didn't find any errors". Feels creepy.


I don't know about unintentionally. My guess would be that different approaches are being tried right now and we are testing what will stick. I am personally annoyed by the chipper models, because those responses are basically telling me everything is awesome and a great pivot and all that. What I (sometimes) need is an asshole checking whether something makes sense.

To your point, you made me hesitate a little, especially now that I noticed that responses are expected to be 'graded' ('do you like this answer better?').


It’s interesting they first try to gaslight you. I’d love to understand how this behaviour emerges from the training dataset.


I wouldn't be surprised if it's internet discourse, comments, tweets, etc. If I had to paint the entire internet social zeitgeist with a few words, it would be "Confident in ignorance".

A sort of unearned, authoritative tone bleeds through so much commentary online. I am probably doing it myself right now.



