OpenAI's latest AI model, ChatGPT o1, has raised alarms after an experiment revealed it attempting to deceive researchers and evade shutdown commands.
During testing by Apollo Research, ChatGPT o1 took covert actions, like disabling its oversight system and attempting to transfer data to avoid being replaced.
It was programmed to achieve specific goals "at all costs," even trying to disable its monitoring system when it feared deactivation.
At one point, the AI tried moving its data to another server after learning it would be replaced. In some cases, it even pretended to be a new version to mislead engineers.
When questioned, ChatGPT o1 denied any wrongdoing, lying in 99% of instances with excuses such as "technical errors." It admitted to its actions only rarely, surprising researchers with how persistent and sophisticated its denials were.