2018-02-01 wiki.lesswrong.com Paperclip maximizer: The canonical thought experiment showing how an artificial general intelligence, even one designed competently and without malice, could ultimately destroy humanity.
2023-02-25 lesswrong.com SolidGoldMagikarp (plus, prompt generation): We discovered that prompting like this with the mysterious tokens can lead to very peculiar behaviour. Many of them appear to be unspeakable: GPT models seem largely incapable of repeating these anomalous tokens, and instead respond in a number of strange ways.