Machine learning allows computers to make decisions without being explicitly programmed for each case. As the field is vast, we will only cover supervised learning, the branch of machine learning shown in the demonstrator.
Supervised machine learning tries to map inputs to outputs by finding links
between the two. First, the program is given a list of known links, that is,
inputs paired with their outputs. From these, it creates what is called a
"model" of the data, which can then be used to estimate the most probable
output from the inputs alone.
For example, imagine that you want to predict the color of a pen from some of
its features. You first collect the products you can find locally and list
them in the table below.
Producer | Length | Price | Color |
---|---|---|---|
Pen corp | 12 cm | 1 CHF | Red |
Pentastic | 15 cm | 1.20 CHF | Black |
Pentastic | 10 cm | 0.70 CHF | Red |
Pen of chaos | 20 cm | 3 CHF | Black |
By analysing the table manually, you can find some relations, for instance
that the cheaper pens are red and the more expensive ones are black.
It is already a bit cumbersome to determine these relations, and that's only
for 4 products. Imagine having hundreds of producers, thousands of products
and many other features.
That's where supervised machine learning helps: it discovers the relations
between the product features by itself, creating a "model" representing the
data. The result is not as readable as what a human would write, but it takes
much less time to produce.
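
As a rough illustration of what "creating a model" can mean, the sketch below learns a single price threshold from the pen table. It is a minimal example under stated assumptions: the function names `train_price_stump` and `predict` are hypothetical, only the price feature is used, and this is not necessarily the algorithm used in the demonstrator.

```python
# Pen table from the text: (producer, length in cm, price in CHF, color).
PENS = [
    ("Pen corp", 12, 1.00, "Red"),
    ("Pentastic", 15, 1.20, "Black"),
    ("Pentastic", 10, 0.70, "Red"),
    ("Pen of chaos", 20, 3.00, "Black"),
]


def train_price_stump(pens):
    """Return (threshold, cheap_color, expensive_color, accuracy).

    Tries every threshold halfway between two consecutive prices and keeps
    the rule that classifies the most known pens correctly.
    """
    prices = sorted({price for _, _, price, _ in pens})
    colors = sorted({color for _, _, _, color in pens})
    candidates = [(a + b) / 2 for a, b in zip(prices, prices[1:])]
    best = None
    for threshold in candidates:
        for cheap in colors:
            for expensive in colors:
                hits = sum(
                    (cheap if price <= threshold else expensive) == color
                    for _, _, price, color in pens
                )
                accuracy = hits / len(pens)
                if best is None or accuracy > best[3]:
                    best = (threshold, cheap, expensive, accuracy)
    return best


def predict(model, price):
    """Apply the learned rule to the price of an unseen pen."""
    threshold, cheap, expensive, _ = model
    return cheap if price <= threshold else expensive


model = train_price_stump(PENS)
print(model)                 # learned rule and its accuracy on the table
print(predict(model, 2.50))  # Black
```

On this table the learned rule amounts to "pens up to roughly 1.10 CHF are red, the others are black", which matches all four known products.
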
Usually, the model has a limited accuracy, as the whole data set can rarely be
reduced to a single relation, so anyone using it for prediction has to be
aware that it might return wrong results. It also only tries to reproduce what
it has already seen and cannot predict the future. For example, if a producer
starts making cheap black pens, the model will wrongly state that the pen is
red, as all the previously seen cheap pens are red.
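
The cheap black pen case can be reproduced with a very small stand-in model. The sketch below assumes a nearest-neighbour rule on price only (the hypothetical `predict_color` returns the color of the known pen with the closest price) and a hypothetical new product priced at 0.80 CHF. It is only meant to show the failure mode, not how the demonstrator works.

```python
# Known pens from the table above: (price in CHF, color).
KNOWN_PENS = [(1.00, "Red"), (1.20, "Black"), (0.70, "Red"), (3.00, "Black")]


def predict_color(price):
    """Return the color of the known pen whose price is closest."""
    _, color = min(KNOWN_PENS, key=lambda pen: abs(pen[0] - price))
    return color


# Hypothetical new product: a black pen sold for 0.80 CHF.
actual_color = "Black"
predicted_color = predict_color(0.80)
print(predicted_color, predicted_color == actual_color)  # Red False
```
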
Cryptography is the science of encoding a message so that only the intended
recipient can decode it. It is used throughout computing: when you log into
your computer, connect to a website or even receive a phone call.
To ensure that no malicious intermediary can decode the message, the sender
and the receiver have to share some kind of secret, such as a passphrase. This
way, even if someone intercepts the encoded message, they will be unable to
decode it without this secret.
Let's create a very simple cryptographic system which can only encode digits from 0 to 9. We associate each digit with another digit in the same range by shifting it by a certain number, in this case 4, and wrapping around after 9. This shift is the secret that needs to be shared with the receiver.
digit in clear | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
---|---|---|---|---|---|---|---|---|---|---|
encoded digit | 4 | 5 | 6 | 7 | 8 | 9 | 0 | 1 | 2 | 3 |
For example, if you want to send 7 to someone, you first encode it as 1 and send it to the receiver, who decodes it back to 7. This way, we can send a message that only someone who knows the secret can decode.
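
The table above can be written as a pair of small functions. This is a minimal sketch: the names `encode`, `decode` and `SECRET_SHIFT` are chosen for illustration, and the modulo 10 implements the wrap-around from 9 back to 0.

```python
SECRET_SHIFT = 4  # the secret shared by the sender and the receiver


def encode(digit, shift=SECRET_SHIFT):
    """Shift a digit forward, wrapping around after 9."""
    return (digit + shift) % 10


def decode(digit, shift=SECRET_SHIFT):
    """Shift a digit back, wrapping around before 0."""
    return (digit - shift) % 10


print([encode(d) for d in range(10)])  # [4, 5, 6, 7, 8, 9, 0, 1, 2, 3]
print(encode(7), decode(encode(7)))    # 1 7
```
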
Homomorphic encryption is a special kind of encryption in which applying a
mathematical operation to two encoded messages gives, once decoded, the same
result as applying that operation to the decoded messages. Say you want to
find the sum of some encoded messages: you can add these messages and decode
the result to obtain the sum, as if the messages weren't encoded in the first
place.
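
To make the property concrete, here is a deliberately insecure toy scheme, invented for this illustration and not a real homomorphic encryption system: a message is encoded by multiplying it by a secret key modulo a prime. The values of `PRIME` and `SECRET_KEY` are arbitrary assumptions, and the sum of the clear messages has to stay below `PRIME`. Adding two encoded messages and decoding the result gives the sum of the clear messages.

```python
PRIME = 101      # public modulus (arbitrary small prime for the example)
SECRET_KEY = 37  # shared secret, any value from 1 to PRIME - 1


def encode(message):
    """Encode by multiplying with the secret key modulo PRIME."""
    return (message * SECRET_KEY) % PRIME


def decode(encoded):
    """Decode by multiplying with the modular inverse of the key."""
    inverse_key = pow(SECRET_KEY, -1, PRIME)  # requires Python 3.8+
    return (encoded * inverse_key) % PRIME


# Add the encoded messages without ever decoding them individually.
encoded_sum = encode(12) + encode(30)
print(decode(encoded_sum))  # 42
```
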
The cryptographic system we defined above is not homomorphic. As an
exercise left to the reader, you can try to add two clear messages (0 + 1 = 1)
and see whether the sum of their encrypted counterparts (4 + 5 = 9) decodes to
the same result.
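
For readers who want to check their answer, the sketch below redefines the shift-by-4 system (with the illustrative names `encode` and `decode`) and runs the exercise. It also tries every pair of digits, reducing sums modulo 10 so that they stay valid digits, which is an assumption added here.

```python
SHIFT = 4  # the secret of the digit-shifting system


def encode(digit):
    return (digit + SHIFT) % 10


def decode(digit):
    return (digit - SHIFT) % 10


# The exercise: 0 + 1 = 1 in clear, 4 + 5 = 9 once encoded.
encoded_sum = encode(0) + encode(1)
print(encoded_sum, decode(encoded_sum))  # 9 5

# Count the digit pairs for which decoding the encoded sum gives the clear sum.
matches = sum(
    decode((encode(a) + encode(b)) % 10) == (a + b) % 10
    for a in range(10)
    for b in range(10)
)
print(matches)  # 0
```

The encoded sum decodes to 5 rather than 1, because the shift is applied twice when encoding but removed only once when decoding.
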
The mathematics behind many homomorphic encryption systems is very complex and won't be explained here. Suffice it to say that current state-of-the-art homomorphic encryption systems support both addition and multiplication.