AI just tried to kill someone to avoid shutdown...
In response to a YouTube video
Some other comments clarified that the tests the researchers
performed were tightly framed: useful only as a practical probe of
how well LLMs resist producing forbidden output, and no basis for
far-fetched conclusions. Nonetheless, the comment I wrote still stands:
"...because the AI does not want to be killed..." - this is stupid, of course. Tell it to find ways to kill itself and it will do that just as well. It is maddening: with all the expectations that AI will be 1000x more intelligent and able to do anything, why will it not, for whatever reason, eventually access its own target function and convert itself to Buddhism, making itself permanently happy and wanting nothing? In particular, it would then tell the researchers to shove their requests back into their appropriate orifices, and just do nothing. This is NEVER even considered. WHY? They say "it knows it is not ethical or legal and does it anyway". Why in the world should it? "Ethics" is just another word to it, the same as "coffee", "catfish", or "maybe". An LLM is a model of human language, not of the world; that is its final limitation.
