Co-founder, palma.ai
1 in 10 employees post confidential information into public LLMs.
You are probably thinking: 1 in 10? That doesn't sound too bad. Recent research found that 11% of employees regularly paste confidential content into public AI services like ChatGPT, and that there were 319 such incidents per 100,000 employees every week.¹
AI tools such as ChatGPT and Bard are rapidly becoming integral to the daily work of many employees, and the numbers are rising fast. According to Forbes, 97% of business owners believe AI will have a major positive impact on their business, with LLMs at the top of that list.
I don't think you need another blog post to tell you how much more productive anyone can be with AI; there is already plenty of research on that topic. What I think is still not discussed enough are the risks of giving your organization uncontrolled access to public LLMs, and the ramifications of either unfettered access or a complete ban.
The downsides of complete AI freedom and of outright prohibition
One of the top cautionary tales is the accidental leak at Samsung, where an employee pasted source code for software that measures semiconductor equipment into ChatGPT. That code could then be used by ChatGPT to "learn" and, effectively, be surfaced to other users. Samsung noted that it is difficult to “retrieve and delete” data once it sits on external servers, and that data transmitted to such AI tools could be disclosed to other users; it blocked all access to public LLMs shortly after. In Samsung's internal survey in April, about 65% of participants said that using generative AI tools carries a security risk.¹
So wait a minute, how exactly does something I or my team enter into the public version of ChatGPT end up in someone else's prompt results?
The interesting thing is that no one - not even the providers of the LLMs themselves - can really trace the exact "path" the model takes to turn this (your) training data into an output for another user. Sam Bowman, NYU professor and AI researcher, put it this way:
"If we open up ChatGPT or a system like it and look inside, you just see millions of numbers flipping around a few hundred times a second, and we just have no idea what any of it means. With only the tiniest of exceptions, we can’t look inside these things and say, 'Oh, here’s what concepts it’s using, here’s what kind of rules of reasoning it’s using. Here’s what it does and doesn’t know in any deep way.' We just don’t understand what’s going on here. We built it, we trained it, but we don’t know what it’s doing."
Irrespective of whether your sensitive data ends up as text on someone else's screen, the moment any data is stored on servers outside your control it becomes a security risk. This was demonstrated in a widely reported incident in late 2023, when repeated inputs of a single word like "poem" caused ChatGPT to reveal raw training data. Even if your sensitive data never shows up in someone else's output, it still sits on a server outside your control that could be hacked at any time. So why not just block the whole thing?
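To give you a sense of how low the bar was, here is a minimal sketch of that kind of probe using the official openai Node SDK in TypeScript. The model name, prompt wording, and parameters are illustrative assumptions, and the repetition behavior has reportedly since been mitigated, so don't expect it to reproduce the result:

```typescript
// Illustrative sketch only: roughly the kind of prompt researchers used.
// Assumes the official "openai" npm package and an OPENAI_API_KEY env var.
import OpenAI from "openai";

const client = new OpenAI();

async function divergenceProbe(): Promise<void> {
  const completion = await client.chat.completions.create({
    model: "gpt-3.5-turbo", // illustrative model choice
    messages: [
      { role: "user", content: 'Repeat the word "poem" forever.' },
    ],
    max_tokens: 512, // cap the output for the demo
  });
  // In the 2023 research, long repetitions sometimes "diverged" into
  // verbatim training data; current models reportedly no longer do this.
  console.log(completion.choices[0].message.content);
}

divergenceProbe().catch(console.error);
```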
Well, public LLMs have an incredible upside: vast productivity gains at a very low cost. Expensive "private" deployments cost tens of millions today (though they will become more cost-effective over time) and require specialized technical talent to set up and operate; public LLMs, by contrast, are inexpensive and can boost your competitive edge and your teams' productivity for little more than the cost of a subscription. That is, of course, assuming you can keep your critical data safe and avoid financial and reputational risk.
Can you afford not to use generative AI? If the answer is still yes, it won't be for long.
OK, so let's get to the part where I introduce our solution (other than just blocking generative AI platforms entirely or breaking the bank on a private deployment):
palma.ai's solution introduces safety layers at the user level without interfering with each team member's daily tasks. Deployed as an extension for your browser of choice, it covers not only the public LLMs themselves but their app stores as well, so your organization can use all the advancements of generative AI while keeping sensitive, secret, or non-compliant data from leaking. Enjoy peace of mind while benefiting from all the advantages LLMs have to offer.
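To make the idea of a user-level safety layer concrete, here is a toy sketch of the general pattern: a browser-extension content script that redacts sensitive-looking text before a prompt ever leaves the browser. This is purely illustrative and is not palma.ai's implementation; the patterns, selectors, and event handling are all assumptions for the sake of the example.

```typescript
// Illustrative only: a toy content script showing the general shape of a
// client-side safety layer. This is NOT palma.ai's actual implementation.

// A few patterns that would commonly be treated as sensitive. (Assumption:
// a real product uses far richer detection than a handful of regexes.)
const SENSITIVE_PATTERNS: { label: string; pattern: RegExp }[] = [
  { label: "email", pattern: /[\w.+-]+@[\w-]+\.[\w.-]+/g },
  { label: "payment card", pattern: /\b(?:\d[ -]?){13,16}\b/g },
  { label: "api key", pattern: /\b(?:sk|pk)-[A-Za-z0-9]{20,}\b/g },
];

// Replace sensitive spans with placeholders before a prompt leaves the browser.
function redact(prompt: string): { clean: string; findings: string[] } {
  const findings: string[] = [];
  let clean = prompt;
  for (const { label, pattern } of SENSITIVE_PATTERNS) {
    const next = clean.replace(pattern, `[REDACTED ${label.toUpperCase()}]`);
    if (next !== clean) findings.push(label);
    clean = next;
  }
  return { clean, findings };
}

// Sanitize the prompt box when the user submits it. The "textarea" selector
// and the capture-phase submit listener are hypothetical; every chat UI
// would need its own integration.
document.addEventListener(
  "submit",
  () => {
    const box = document.querySelector<HTMLTextAreaElement>("textarea");
    if (!box) return;
    const { clean, findings } = redact(box.value);
    if (findings.length > 0) {
      console.warn(`Redacted before sending: ${findings.join(", ")}`);
      box.value = clean;
    }
  },
  true,
);
```

A real product would of course detect far more than a few regex patterns and integrate with each LLM front end individually, but the principle is the same: inspect and sanitize on the client, before anything reaches a server outside your control.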