One of the biggest challenges with cloud-based inference for LLMs is keeping user data private. Is it possible to combine local and cloud machines to solve this?
For example, could we run the first and last layers of an LLM on a local machine to protect data privacy, and use the cloud for the remaining layers to speed things up? We could fine-tune the first and last layers locally so that their weights differ from the public checkpoint and never leave the local machine.
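To make the idea concrete, here is a rough sketch of the split I have in mind, with the cloud round-trip simulated by a plain function call. The names and toy dimensions (`cloud_middle_layers`, the block sizes) are hypothetical placeholders, not a real API:

```python
import torch
import torch.nn as nn

VOCAB, DIM, N_MID = 1000, 64, 4

def make_block() -> nn.Module:
    # Stand-in for a transformer block; a real model would use attention.
    return nn.Sequential(nn.Linear(DIM, DIM), nn.GELU())

# --- Local machine: the fine-tuned "outer" layers, never uploaded --------
embedding = nn.Embedding(VOCAB, DIM)
first_block = make_block()
last_block = make_block()
lm_head = nn.Linear(DIM, VOCAB)

# --- Cloud: only the frozen middle layers --------------------------------
middle_blocks = nn.Sequential(*[make_block() for _ in range(N_MID)])

def cloud_middle_layers(hidden: torch.Tensor) -> torch.Tensor:
    """Simulated remote call: the cloud sees hidden states, never raw tokens."""
    return middle_blocks(hidden)

@torch.no_grad()
def split_inference(token_ids: torch.Tensor) -> torch.Tensor:
    h = first_block(embedding(token_ids))  # local: tokens -> hidden states
    h = cloud_middle_layers(h)             # remote: the bulk of the compute
    return lm_head(last_block(h))          # local: hidden states -> logits

logits = split_inference(torch.randint(0, VOCAB, (1, 8)))
print(logits.shape)  # torch.Size([1, 8, 1000])
```

In a real deployment the hidden states would cross the network, which is exactly the traffic a privacy analysis would need to scrutinize.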
Please let me know if there is any ongoing research on such an approach for privacy-aware inference.
Thank you.
You should instead try looking into Homomorphic Encryption:
https://huggingface.co/blog/encrypted-llm
It is resource-intensive and slower, but in my opinion it serves your purpose better: the hidden states your scheme would send to the cloud can still leak information about the input, whereas with FHE the server computes on ciphertexts and never sees your data in the clear.
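To give a feel for what that looks like in practice, here is a minimal sketch using Zama's Concrete ML, the library behind the post above. The toy model, the calibration data, and the `n_bits` value are my own illustrative assumptions; check the post and the Concrete ML docs for the real GPT-2 example:

```python
import torch
import torch.nn as nn
from concrete.ml.torch.compile import compile_torch_model

class TinyNet(nn.Module):
    """Toy MLP standing in for one model component to run under FHE."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))

    def forward(self, x):
        return self.net(x)

model = TinyNet()
calibration_data = torch.randn(100, 16)  # used to calibrate quantization

# Quantize the network and compile it into an FHE circuit; n_bits trades
# accuracy against ciphertext size and latency.
quantized_module = compile_torch_model(model, calibration_data, n_bits=6)

x = torch.randn(1, 16).numpy()
# fhe="simulate" tests the quantized circuit quickly in the clear;
# fhe="execute" encrypts the input, evaluates homomorphically, and decrypts.
print(quantized_module.forward(x, fhe="simulate"))
```

`fhe="simulate"` lets you validate the quantized circuit quickly without encryption; switching to `fhe="execute"` performs the actual encrypted run, which is where the heavy resource cost shows up.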