Security researchers are warning that data exposed online, even briefly, can linger in generative AI chatbots like Microsoft Copilot long after it has been made private.
Lasso, an Israeli cybersecurity company that focuses on emerging generative AI threats, has found that thousands of GitHub repositories that were once public and have since been made private remain exposed through Copilot. They belong to some of the biggest companies in the world, including Microsoft.
Ophir Dror, co-founder of Lasso, told Parhlo World that the company found content from its own GitHub repository appearing in Copilot because Microsoft’s Bing search engine had indexed and cached it. Dror said the repository had been mistakenly made public for a brief period and has since been set to private. Trying to access it on GitHub returned a “page not found” error.
“It was a surprise that on Copilot we found one of our own private repositories,” Dror said. “I wouldn’t see this information if I were to surf the web.” But anyone could get this information by asking Copilot the right question.
After realizing that any data on GitHub that had been public, even briefly, could potentially be exposed by tools like Copilot, Lasso investigated further.
Lasso compiled a list of repositories that were public at any point in 2024 and identified the ones that had since been deleted or made private. Relying on Bing’s caching mechanism, the company found that more than 20,000 since-private GitHub repositories still had data accessible through Copilot, affecting more than 16,000 organizations.
Lasso says the affected companies include Amazon Web Services, Google, IBM, PayPal, Tencent, and Microsoft itself. For some of them, the company said, Copilot could be prompted to return private GitHub archives that contain intellectual property, sensitive business data, access keys, and tokens.
Lasso said it used Copilot to retrieve the contents of a GitHub repository that Microsoft has since deleted. The repository hosted a tool that let people use Microsoft’s cloud AI service to create “offensive and harmful” AI images.
Dror said Lasso contacted all of the companies that were “severely affected” by the data exposure and advised them to rotate or revoke any compromised keys.
None of the affected companies named by Lasso responded to Parhlo World’s questions. Microsoft also did not respond to Parhlo World’s inquiry.
In November 2024, Lasso told Microsoft what it had found. Microsoft told Lasso that the problem was of “low severity” and that this caching behavior was “acceptable.” Starting in December 2024, Microsoft stopped including links to Bing’s cache in its search results.
But Lasso says that although the caching feature was disabled, Copilot could still access the data even when it was no longer visible through regular web searches. This suggests that the fix was only temporary.