Are they contradicting themselves, or am I just not seeing the potential security issue with a cloud GenAI service? I'm really curious what the consensus is outside the circle of people I've talked to, and also what critical company data, if any, can somehow live in every other cloud service but not in GenAI?
In terms of API usage, OpenAI has never used prompts for training, but this is very poorly understood among enterprise CEOs and CIOs. Executives heard about the Samsung incident early on (confidential information submitted by employees via the ChatGPT interface, which did use submitted data for training by default at the time), and their trust was shaken in a fundamental way.
The email analogy is very apt - companies already send all of their secrets to other people's computers for processing (cloud compute, email, etc.) without any issue. BUT there's a big caveat: abuse moderation. Prompts, including API calls, are normally stored by OpenAI/MS/etc. for a certain period and may be viewed by a human to check for abuse (e.g. using the system to generate phishing emails). This causes significant issues for certain types of data. Worth noting that the moderation-by-default approach is in the process of being dialed down, and there are now top-tier enterprise plans that are no longer moderated by third parties by default.
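To make the email analogy concrete: here's a minimal sketch of what an API call looks like, assuming the official openai Python SDK and an OPENAI_API_KEY environment variable (the model name is just an example). The point is that the prompt leaves your infrastructure and is processed on the provider's servers, exactly like the body of an outbound email, and is subject to whatever retention and moderation policy applies to your plan.

```python
# Minimal sketch: sending company text to a third party's servers via the API.
# Assumes the openai Python package (v1+) and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Anything placed in `messages` is transmitted to and processed on OpenAI's
# infrastructure, just as an email body would be processed by your mail provider.
response = client.chat.completions.create(
    model="gpt-4o-mini",  # example model name
    messages=[{"role": "user", "content": "Summarize this internal memo: ..."}],
)
print(response.choices[0].message.content)
```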
TL;DR: The concern stems from an early loss of trust (Samsung), there is a valid issue for certain types of data (abuse moderation), and there are ways around it if you have enough money (enterprise plans).