
Communications of the ACM

ACM News

DarkBERT AI Was Trained Using Dark Web Data


While the researchers don’t have any plans to release DarkBERT to the public, they are accepting access requests for academic purposes.

Credit: Shutterstock

Following the success of OpenAI's ChatGPT, Microsoft's Bing Chat and Google Bard, researchers have created a new AI model with a much darker twist.

While the large language models (LLMs) that power ChatGPT and Google Bard were trained on data from the open web, DarkBERT was trained exclusively on data from the dark web. Yes, you read that correctly: this new AI model was trained using data from hackers, cybercriminals and other scammers.

A team of South Korean researchers has released a paper detailing how they built DarkBERT using data from the Tor network, which is often used to access the dark web. By crawling the dark web and then filtering the raw data, they created a dark web database that they used to train DarkBERT.

From Tom's Guide