TRUSTS - Privacy Preserving Analytics and Quantum Computing
16 October 2020
If we talk about data analytics and quantum computing one of the first buzz words coming to our mind are privacy and data security. Usually cryptographic methods form the basis of secure systems and guarantee that our data is safe. In the webinar Privacy Preserving Analytics and Quantum Computing, Roman Walch, Matthias Böhm and Andreas Trügler from the Know-Center, Austria, explain cryptography, federated learning and quantum computing in more detail. Also in the related TRUSTS research project several of these methods play a key role.
Privacy-preserving analytics - cryptographic methods
Generally speaking, cryptography is a way of encrypting data to protect it from unauthorised access and to transmit and store it in a secure way. Due to limitations of classical cryptography and data markets getting gradually more complex new methods or protocols are indispensable. One of the greatest breakthroughs in modern cryptography is the ability to perform calculations directly on encrypted data. This means that the true content of data is never revealed to other parties and stays fully private, although another party, e.g. a server or cloud, can perform calculations on the encrypted data. Two new protocols that allow such privacy-preserving analytics are Multi-Party Computation (MPC) or Fully Homomorphic Encryption (FHE).
In many business processes usually a trusted third party plays the role of an intermediary in case the other two parties don’t trust each other. In cryptography we usually want to get rid of such trusted third parties and ensure save and secure data exchange without the need of any middleman or mediator. MPC protocols allow that users can compute a mutual function or output without disclosing their input data, i.e. no user knows the input of the others but all can benefit from the calculation based on the combination of their data. To give an example let’s have a look at the use case of collaborative learning. Several parties want to use a machine learning (ML) model, which will work best if they all combine their data, but the users don’t want to actually share their data to others or competitors. This is where MPC comes in, it allows to solve this problem by keeping the input fully private and nevertheless still allows to perform calculations based on the complete data since all parties participate in the computation.
Computation on encrypted data
Let us assume you encrypt two numbers. If you want to add or multiply the encrypted versions, this will not work since you cannot do arithmetics with gibberish expressions. This is why the development of Fully Homomorphic Encryption (FHE) was such a big breakthrough. This novel method allows to do exactly that, to add the two gibberish expressions and if you decrypt the result (another encrypted expression) you get the same number as if summing up the numbers in plain text. In this way, you can outsource calculations to a server or cloud without having to share any data, the true content remains hidden and fully secure and only you as data provider have the key to unlock the results.
Limitations and challenges
Although the cutting-edge cryptographic and privacy-preserving protocols are very promising there are still limitations and problems of inefficiency that we have to overcome. In case of MPC, there is a huge increase of communication between the involved parties and also the corresponding security guarantees influence the performance. In case of FHE, there is again a huge performance overhead and the calculation on encrypted data is, at the moment, much slower than on unencrypted data (100 to 1000 times). This depends very much on the actual problem and number of necessary calculations and also faster solutions are possible, but it works as a rule of thumb. However, there is a lot of research going on right now to improve these protocols.
When it comes to Federated Learning (FL) you also do not directly share your data with others, but again a global machine learning model can be trained on a server or cloud, where only the parameters of the model are exchanged between server an clients. First you download the shared model, update it by running the model locally on your device and sending only the updated parameters back to the central server where they are aggregated and combined with other results. Then the improved model is sent back to the devices again and you do the next run locally on your device. In other words, thanks to FL you can collect experience from a wide range of local datasets spread across different locations. This allows that multiple parties can collaborate on the development and training of models again without having to share their data.
Quantum computing - what does Schrödinger´s cat have to do with it?
Jumping from cryptography and federated learning to quantum computing may seem a bit abrupt, but there are several links to this exiting new technology. Especially if you ask the question on how to keep your data secure in the next decades. Let’s start with the main ingredients that are needed to build a quantum computer. The basic idea is to use quantum mechanical effects for the calculation of specific problems – a concept that was already introduced in the 1980s by Richard Feynman. First you need a physical two-state system that can create a so-called qubit and where the two possible states can form a superposition. Superposition is also what you might know from Schrödinger’s cat, a state where two possibilities occur at the same time. Then you need a second ingredient from quantum mechanics called entanglement, where you connect two or more of such qubits in a way that they are not independent from each other anymore but form one single quantum state. We cannot experience this in our every day life, this is again a true quantum feature of nature. The last part of a quantum computer is a way to interact and communicate with the entangled qubits, where usually interference (e.g. of microwave impulses) is used. The huge advantage of quantum computers is that N qubits describe 2N possible states at the same time. So by manipulating the entangled system you can massively perform parallel operations on all states. This is why they can perform calculations with an incredible speed compared to classical computers – John Preskill, one of the quantum computing pioneers, formed the term quantum supremacy. Last year in a Nature publication there was a first claim that quantum supremacy has been reached, by performing a calculation on a quantum computer in 200 seconds, which would have taken 10 000 years on a super computer. There was some debate about that immediately after the publication, it was at least a very important milestone that showed the great possibilities of quantum computers. Given a large enough quantum computer (large means with a high number of high-quality qubits) there are algorithms that can break modern cryptographic systems based on factorisation and such machines would also revolutionise machine learning and AI. The threat of breaking crypto-systems like RSA with quantum computers also lead to a new research field called post-quantum cryptography, where new methods and algorithms are currently developed that are still secure - even if large scale quantum computers are available. One of these methods is FHE by the way – one of the mentioned links to the cryptographic methods described at the beginning of this post.
Where are we now?
First quantum computing applications are already used in several industries such as the automotive (quantum traffic optimisation, research on solid state batteries, etc.), medical and pharma (drug discovery, disease evolution), or finance (derivative pricing, portfolio selection and optimization, etc.) industry. However, quantum computing is still in its infancy. There are still many problems to solve and the current machines are far away of breaking RSA. The interaction with the environment usually destroys quantum properties, so quantum computers need to be isolated or cooled down to minimise such interactions. In physics this is called decoherence, which also solves the conundrum of Schrödinger’s cat by the way. However, new algorithms can already be developed now and especially hybrid methods where classical machine learning is combined with quantum algorithms have a lot of potential. First use cases already exploit advantages with the currently available quantum computers and although quantum computers will not replace classical computers as we know them we can expect fantastic new possibilities in the upcoming years.