Problem: Modular exponentiation. Solution:

The idea is to use square and multiply (or double and add, in terms of powers) like this:

base to the power of 0b10 = base ^ 2 = (base squared)

base to the power of 0b11 = base ^ 3 = (base squared) * base

base to the power of 0b100 = base ^ 4 = (base squared) squared

base to the power of 0b101 = base ^ 5 = ((base squared) squared) * base

base to the power of 0b110 = base ^ 6 = ((base squared) * base) squared

So in this instance we want to first square the base and then multiply the result by the base. The most illustrative examples are the last two. Notice that the order of set vs unset bits determines the order of squaring and multiplication. In particular, we square after every bit, but we multiply upon seeing a set bit. Thus it could be written for clarity:

10 square then nothing

11 square then multiply

100 square then nothing then square then nothing

101 square then nothing then square then multiply

110 square then multiply then square then nothing

111 square then multiply then square then multiply

In other words, assuming there are at least 2 bits, we start from the leftmost bit, square, go right one place, multiply if bit is set, else nothing, then square and move rightward again, stop if there is nothing on the right.

Can we do it reversely, going from right to left, so that we can use the modulo and division by 2 operations instead of having to convert the exponent to a bit array first?

First, we need a different representation of what the powers are.

a ^ 0b1 = a ^ 1

a ^ 0b10 = a ^ 2 = a ^ 2

a ^ 0b11 = a ^ 3 = a ^ 2 * a ^ 1

a ^ 0b100 = a ^ 4 = a ^ 4

a ^ 0b101 = a ^ 5 = a ^ 4 * a ^ 1

a ^ 0b110 = a ^ 6 = a ^ 4 * a ^ 2

a ^ 0b111 = a ^ 7 = a ^ 4 * a ^ 2 * a ^ 1

As you can see, the set bits correspond exactly to which terms of the base get multiplied into the result. Notice also these terms are powers of 2, so we can generate them sequentially by squaring. Thus, we start from the rightmost (lowest) bit, if the bit is set, we multiply the term with the result, otherwise we do nothing, square the term, go to the next bit and repeat. This method requires keeping track of a third variable, the current term, in addition to the current result and base, which were the only 2 variables we needed to keep track of in the left-to-right method.

My Python implementation of left-to-right (LR) and right-to-left (RL) methods

Time complexity: The time complexity of the square-and-multiply algorithm is linear with respect to the number of bits in the exponent whereas the time complexity of the naive approach – multiplying the base with itself an exponent number of times – is exponential with respect to the number of bits in the exponent. You cannot do better than linear time because you have to at least look at each of the bits in the exponent, so this algorithm is optimal in terms of asymptotic time complexity.