SPINDLE - Details

Machine Learning

Machine learning allows computers to make choices without being explicitly programmed for each case. Because the field is vast, we only cover supervised learning, the branch of machine learning used in the demonstrator.

Supervised machine learning tries to map inputs to outputs by finding the links between them. The program is first given a list of known input/output pairs. From these, it builds what is called a "model" of the data, which can then be used to estimate the most likely output from the inputs alone.
For example, imagine that you want to determine the color of a pen from some of its features. You first collect the products you can find locally and list them in the table below.

Producer	Length	Price	Color
Pen corp	12 cm	1 CHF	Red
Pentastic	15 cm	1.20 CHF	Black
Pentastic	10 cm	0.70 CHF	Red
Pen of chaos	20 cm	3 CHF	Black

By analysing it manually, you can find some relations:

Pen corp only produces red pens
Pen of chaos only produces black pens
The pen is red if its price is below or equal to 1 CHF, else it is black
The pen is red if its length is below or equal to 12 cm, else it is black

It is already a bit cumbersome to determine these relations, and that's only for 4 products. Imagine having hundreds of producers, thousands of products and many other features.
That's where supervised machine learning helps: it discovers the relations between the product features and the output by itself, creating a "model" that represents the data. The result is not as readable as what a human would write, but it takes much less time to produce.
Usually, the model is only accurate to a certain degree, as the whole data set can rarely be captured by a single relation, so anyone using it for prediction has to be aware that it may return wrong results. It also only reproduces what it has already seen and cannot anticipate changes. For example, if a producer starts making cheap black pens, the model will get it wrong and state that the pen is red, because all the cheap pens it has seen so far are red.
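
To make this concrete, here is a minimal sketch of a "model" learned from the pen table. It assumes the simplest possible learner, a single price threshold chosen to classify as many known pens as possible. The function names are illustrative only, and this is not the algorithm used in the demonstrator.

```python
# Pen table from the text: (producer, length in cm, price in CHF, color).
PENS = [
    ("Pen corp", 12, 1.00, "Red"),
    ("Pentastic", 15, 1.20, "Black"),
    ("Pentastic", 10, 0.70, "Red"),
    ("Pen of chaos", 20, 3.00, "Black"),
]


def predict(threshold, price):
    """Apply the learned rule: Red if price <= threshold, else Black."""
    return "Red" if price <= threshold else "Black"


def train_price_rule(pens):
    """Pick the observed price that, used as a threshold, gets the most pens right."""
    best_threshold, best_correct = None, -1
    for _, _, candidate, _ in pens:
        correct = sum(
            predict(candidate, price) == color for _, _, price, color in pens
        )
        if correct > best_correct:
            best_threshold, best_correct = candidate, correct
    # Return the rule and its accuracy on the known pens.
    return best_threshold, best_correct / len(pens)


threshold, accuracy = train_price_rule(PENS)
print(threshold, accuracy)       # 1.0 1.0
print(predict(threshold, 0.50))  # Red
```

The learned threshold matches the price rule found by hand. The last line shows the limitation described above: a new black pen sold at 0.50 CHF would still be predicted as red.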

Homomorphic Cryptography

Cryptography is the science of encoding a message so that only the intended recipient can decode it. It is used throughout computing: when you log into your computer, connect to a website or even receive a phone call.
To ensure that no malicious intermediary can decode the message, the sender and the receiver have to share some kind of secret, such as a passphrase. As such, even if someone intercepts the encoded message, they'll be unable to decode it without this secret.

Let's create a very simple cryptography system which can only encode digits from 0 to 9. We map each digit to another digit in the same range by shifting it by a fixed amount, in this case 4, and wrapping around after 9. This shift amount is the secret that needs to be shared with the receiver.

digit in clear	0	1	2	3	4	5	6	7	8	9
encoded digit	4	5	6	7	8	9	0	1	2	3

For example, if you want to send 7 to someone, you'll first encode it as 1 and send it to the receiver, who will decode it back to 7. This way, we are able to send a message, so that only the person having the secret can decode it.
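
The shift system can be sketched in a few lines. The function names `encode` and `decode` are chosen here for illustration; the wrap-around after 9 is done with a modulo 10.

```python
SHIFT = 4  # the shared secret from the table above


def encode(digit, shift=SHIFT):
    """Shift a digit forward, wrapping around after 9."""
    return (digit + shift) % 10


def decode(digit, shift=SHIFT):
    """Shift a digit back, wrapping around below 0."""
    return (digit - shift) % 10


print([encode(d) for d in range(10)])  # [4, 5, 6, 7, 8, 9, 0, 1, 2, 3]
print(encode(7), decode(encode(7)))    # 1 7
```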

Homomorphic encryption is a special kind of encryption in which applying a mathematical operation to two encoded messages gives, once decoded, the same result as applying that operation to the original messages. Say you want to find the sum of some encoded messages: you can add these encoded messages and decode the result to obtain the sum, as if the messages had never been encoded in the first place.
The cryptographic system we defined before is not homomorphic. As an exercise left to the reader, you can add two clear messages (0 + 1 = 1) and check whether the sum of their encrypted counterparts (4 + 5 = 9) decodes to the same result.
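
For readers who want to check the exercise, the following sketch redefines the shift system (with illustrative function names) and compares the sum of the clear digits with the decoded sum of the encoded digits.

```python
SHIFT = 4  # same secret as in the shift table


def encode(digit):
    return (digit + SHIFT) % 10


def decode(digit):
    return (digit - SHIFT) % 10


a, b = 0, 1
clear_sum = a + b                    # sum of the clear digits
encoded_sum = encode(a) + encode(b)  # sum of the encoded digits
decoded_sum = decode(encoded_sum)    # what the receiver would read

print(clear_sum, encoded_sum, decoded_sum)  # 1 9 5
```

The receiver reads 5 instead of 1, so adding encoded digits does not give the sum of the clear digits.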

The maths behind many homomorphic encryption systems is very complex and won't be explained here. It is enough to know that the current state-of-the-art homomorphic encryption systems support both addition and multiplication.
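
Without going into that maths, the property itself can be observed in a much simpler scheme. The sketch below uses textbook (unpadded) RSA with tiny example parameters, which is homomorphic for multiplication only. It is an illustration chosen for this page, not a state-of-the-art system and not secure; the key values and function names are assumptions made for the example.

```python
# Toy textbook RSA key built from the small primes 61 and 53. NOT secure.
N = 61 * 53  # public modulus, 3233
E = 17       # public exponent
D = 2753     # private exponent: (E * D) % 3120 == 1, with 3120 = 60 * 52


def encrypt(message):
    return pow(message, E, N)


def decrypt(cipher):
    return pow(cipher, D, N)


a, b = 7, 6
# Multiply the two encrypted values without ever decrypting them.
encrypted_product = (encrypt(a) * encrypt(b)) % N

print(decrypt(encrypted_product))  # 42
print(a * b)                       # 42
```

Decrypting the product of the two encrypted values gives the product of the clear values, as long as that product stays below the modulus.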

For more information, contact the C4DT Factory