Vanilla RSA 64bits implementation in C

RSA is a cryptosystem upon which secure communication on Internet relies. It is based on two keys, one public (everyone knows it) used to cipher data, and one private (only the reader knows it) to decipher data. Its security comes from the difficulty of factoring the product of large prime numbers (which are used for the key generation).

Of course it was on my to-study list since a while, now the time has come to loose a few hours into this rabbit hole. This article is the result cheat sheet. Disclaimer: I'm not an expert in cipher algorithm, this article is the result of personal study with material available on the web and motivated by curiosity only. So, double check what I say, and you're responsible for what you do with it (anyway you won't be able to much out of it).

Key generation

To create the public and private keys one needs two prime numbers \(p\) and \(q\). They must be choosen randomly and be very large (several hundreds of digit) to achieve any significant level of security. How to obtain them is actually trickier than it seems, so I'll just consider we have them. But, not started yet and already a practical problem: even a uint64_t gives you only 19 digits. So we need a big int implementation, which I don't have (yet, obviously). Rabbit hole I said...

For now I'll satisfy myself with uint64_t. It immediately discards the implementation below for practical use as it limits it to easy-to-crack keys, but stays valid in principle and serves its educational purpose. Everything works exactly the same with any primes, and I'll use super small ones here for tests.

First, we calculate what's called the modulus: \(n=pq\). Except for potential overflow nothing complicated here.

Next, we calculate what's called the totient of the modulus: \(\lambda=lcm(p-1,q-1)\). And here comes a nice occasion to get lost into the weeds... The previous equation is the one given in the Wikipedia article, but if you crawl the web you'll easily find people saying that the totient is \(\phi=(p-1)(q-1)\), definitely not the same thing. The web is a pile of crap, sure, but Introduction to algorithms also uses \(\phi\) (at least in the fourth edition). I have as much faith in such a book as I have in Wikipedia, so what ? Well this post gives us the answer: \(\phi\) is Euler's totient, used in the original RSA spec, and \(\lambda\) is the Carmichael's totient, used in modern implementation. Conclusion, I'll stick with \(\lambda\).

The lowest common multiple can be computed as follow: \(lcm(a,b)=|ab|/gcd(a,b)\) (we can ignore the absolute value here as \(a\) and \(b\) will always be strictly positive) and the greatest common divisor can be computed using the Stein algorithm.

Next, we choose what's called the public exponent \(e\) such as \(2\lt e\lt \lambda\) and \(e\) is not coprime to \(\lambda\). That's another tricky choice to make as it affects the resilience to attack and speed of ciphering. According to the Wikipedia article 65537 is a common choice. Then, referring to this article to handle small \(\lambda\) in my toy examples, I'll use the largest value in \([3,5,17,257,65537]\) (Fermat numbers).

The public key consists of \(n\) and \(e\).

Next, we calculate what's called the private exponent using modular multiplicative inverse: \(d\cong e^{-1}\quad mod\quad\lambda\). The congruence can be rewritten as \(ed\cong 1\quad mod\quad\lambda\), and is equivalent to solving \(ed+\lambda k=1\) for \((d,k)\) which is a linear diophantine equation. As, by definition of \(e\), \(gcd(e,\lambda)=1\), that equation can be solved using the extended Euclidian algorithm which finds the solution \((x,y)\) to the equation \(ax+by=gcd(a,b)\).

However there is another technical complication here. The extended Euclidian algorithm needs signed integers, while we've been using unsigned ones so far. Using signed ones instead is not satisfying as it reduces the range of values we can use for the keys, and it creates other complications later anyway. Fortunately, Jeffrey Hurchalla provides a solution for modular multiplicative inverse with unsigned integers in this article. Converted to C it becomes:

Note that it may still return a negative value for \(d\), but as it's calculated modulo \(\lambda\) it's trivial to bring it back to a positive value.

The private key consists of \(n\) and \(d\).

And now we have everything we need for key generation:

Here are some examples to test the implementation.

p | q | d | n | e |

61 | 53 | 173 | 3233 | 257 |

11 | 5 | 13 | 55 | 17 |

7 | 17 | 17 | 119 | 17 |

53 | 59 | 1109 | 3127 | 257 |

11922649 | 74112287 | 426228107209745 | 883614784488263 | 65537 |

Ciphering/Deciphering

Ciphering a value \(m\) is then performed as follow: \(m'=m^e\quad mod\quad n\). Deciphering the value back is performed as follow: \(m=m'^d\quad mod\quad n\). Now we need a way to calculate modular exponentation (given the large number we're manipulating direct calculation is not an option). This can be done using the exponentation by squaring. Buuut, here again overflows are waiting around the corner to smash you in the face. The correct way to implement it is:

The Chinese remainder theorem can also be used to optimise the calculation under the condition you know the private key (i.e. can be applied to decipher only). I'll lazily satisfy myself with exponentiation by squaring here.

However, the smart reader you are will immediately notice that as we use modulo \(n\) the ciphered/deciphered value cannot be greater than or equal to \(n\). How then can we cipher/decipher arbitrarily large messages ? Well in practice we don't. RSA is used with very large \(n\) to cipher very short messages whose integer representation is smaller than \(n\).

For example, a 2048 bits key with a large value could cipher almost up to 2048/8=256 bytes long messages. It isn't much, but sufficient to cipher the key used by another type of ciphering algorithm (like AES), more suited for large amount of data. Why, then, don't we simply use the other ciphering algorithm only ? RSA is slow but allows to securely exchange information, AES is fast but need a way to exchange securely its key. With both working in tandem you get the best of both worlds.

A little test on "Hello world!" and ciphering/deciphering byte per byte:

Using \(p=61\) and \(q=53\):

And using \(p=11922649\) and \(q=74112287\):

Block ciphering

Still, we should be able to cipher messages of any size. The (not so) simple way to do it is to split the original message into blocks small enough that there integer representation is smaller than \(n\). That's what I did in the previous example, but going done to byte per byte is a bit extreme. Let's see a more efficient way to do it (even if in practice it's never used that way because cf the "padding" section below).

To ensure that a message is smaller than \(n\) its size \(s\) (in byte) must be such as \(n\gt2^{8s}-1\), or \(log_2(n+1)/8\gt s\). However, the result of ciphering may be up to \(n-1\) and then require at least \(s+1\) bytes. This leads to the following scheme: for ciphering, cipher per block of \(s\) bytes of data padded with 0s up to \(s+1\) bytes, and for deciphering, decipher per block of \(s+1\) bytes and discard the extra byte. \(log2()\) uses floating point values, and given those we are manipulating here we certainly want to stay away from accuracy problems. Instead I'll calculate it that way:

For example, if \(n=3233\) then \(s=1\), and if \(n=883614784488263\) then \(s=6\).

Here lies another trap: when padding and discarding one must take into account the endianness of the architecture it is working on. Padding the wrong way will make your message's integer representation larger, which is exactly the opposite of what we are trying to do. A simple test for endianness can be perform as follow:

If you're compiling with GCC you can also use the predefined macro __BYTE_ORDER__ (cf here).

And now we're ready to cipher/decipher several bytes at once:

Testing again on "Hello world!":

Using \(p=61\) and \(q=53\):

And using \(p=11922649\) and \(q=74112287\):

Great, we surely deserve a cookie and a cup of coffee for going so far! But the story does not end here, far from it...

Padding scheme and Signing

In the present "vanilla" implementation, RSA is a deterministic cipher algorithm (the same input always gives the same output), which makes it actually quite insecure. To correct this, what's called padding scheme (not the same as padding in block ciphering) is introduced to add randomness to the ciphered message.

As explained in the previous section, we can cipher several bytes at once up to a certain number depending on the modulus. A padding scheme sacrifices a few of these bytes to insert padding bits (part random, part specified by the scheme) instead of the message data. Of course, the whole message is still encrypted, it just eventually takes a few more blocks if necessary. There are several schemes, each in several versions, the rabbit hole keeps getting deeper and deeper and I'm running out of time so I'll give up here. Just as a note for my future self, the currently recommended scheme is OAEP and more detailed can be found in the RFC 8017.

In addition to padding scheme, there is yet another operation which add even more security. Even if the whole ciphering model described so far is secured, anyone can still send a message claiming he/she is someone else. The signing mechanism prevents this. The sender hashes the message and cipher the result with his/her own private key, this gives the signature. Which hash function is used depends on the implementation (I've found references to MD5 and SHA1). The signature is appended to the message which can then be ciphered with the receiver public key. The receiver deciphers the received message with his/her private key, removes the signature (whose size is known by specification), deciphers the signature with the sender public key and checks that it matches the hash of the deciphered message. If it does, the identity of the sender is assured (unless the private key has been compromised of course).

Conclusion

The implementation introduced here is far from complete, or even useful. Nonetheless it's already capable of ciphering/deciphering messages of any size using the original RSA algorithm. It allowed me to gain a good comprehension of that algorithm, and I hope to revisit it in the future to add padding scheme, signature, and refactor everything with an implementation of large integers. I also hope it may help someone else to better understand how that fundamental algorithm works. You can also check this article by Allan which has motived me to spend time on this algorithm.

All the code in this article is available as one C file here. Compile with gcc -o rsa rsa.c.