integer power of 2

further optimisation

We saw that if B = 1 << b, then

A * B == A << b
A / B == A >> b
A % B == A & (B - 1) == A & ((1U << b) - 1)
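
The three identities above can be checked with a small sketch. The helper names (mul_pow2, div_pow2, mod_pow2, check_pow2_identities) are hypothetical and only serve the illustration. The sketch assumes unsigned operands and a shift count smaller than the bit width of unsigned int; for negative signed values, >> and / do not round the same way, so the shortcut for division should not be assumed to hold there.

```c
#include <assert.h>

/* Hypothetical helpers: b is the exponent, so B == 1U << b. */
static unsigned mul_pow2(unsigned a, unsigned b) { return a << b; }               /* a * B */
static unsigned div_pow2(unsigned a, unsigned b) { return a >> b; }               /* a / B */
static unsigned mod_pow2(unsigned a, unsigned b) { return a & ((1U << b) - 1U); } /* a % B */

/* Compare each shortcut with the plain operator over a small range. */
void check_pow2_identities(void)
{
    for (unsigned b = 0; b < 8; b++) {
        unsigned B = 1U << b;
        for (unsigned a = 0; a < 1000; a++) {
            assert(mul_pow2(a, b) == a * B);
            assert(div_pow2(a, b) == a / B);
            assert(mod_pow2(a, b) == a % B);
        }
    }
}
```

Calling check_pow2_identities() from a test program should complete without tripping an assertion.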

But there are interesting properties when we have two powers of 2.

B1 * B2 == (1 << b1) * (1 << b2) == 1 << (b1 + b2)
B1 / B2

if B1 >= B2, (1 << b1) / (1 << b2) == 1 << (b1 - b2)
if B1 < B2, 0
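
As a rough illustration of the product and quotient rules above, the sketch below works on the exponents only. The names pow2_product, pow2_quotient and check_pow2_pairs are hypothetical, and the sketch assumes b1 + b2 stays below the bit width of unsigned int.

```c
#include <assert.h>

/* Hypothetical helpers: B1 == 1U << b1 and B2 == 1U << b2. */
static unsigned pow2_product(unsigned b1, unsigned b2)
{
    return 1U << (b1 + b2);                    /* B1 * B2 */
}

static unsigned pow2_quotient(unsigned b1, unsigned b2)
{
    return (b1 >= b2) ? 1U << (b1 - b2) : 0U;  /* B1 / B2, 0 when B1 < B2 */
}

/* Compare both rules with the plain operators for small exponents. */
void check_pow2_pairs(void)
{
    for (unsigned b1 = 0; b1 < 8; b1++) {
        for (unsigned b2 = 0; b2 < 8; b2++) {
            unsigned B1 = 1U << b1, B2 = 1U << b2;
            assert(pow2_product(b1, b2) == B1 * B2);
            assert(pow2_quotient(b1, b2) == B1 / B2);
        }
    }
}
```

The point of the sketch is that only an addition or a subtraction of exponents is needed, followed by one shift.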

A / (B1 / B2) == A / (1 << (b1 - b2)) == A >> (b1 - b2), because (B1 / B2) can't be zero (division by zero is undefined in C), so b1 >= b2
A * (B1 / B2)

if b1 - b2 >= 0, A * (1 << (b1 - b2)) == A << (b1 - b2)
if b1 - b2 < 0, 0
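
The two cases above could be written as follows. This is only a sketch under the same assumptions as before (unsigned operands, B1 == 1U << b1, B2 == 1U << b2); div_by_quotient, mul_by_quotient and check_quotient_ops are hypothetical names.

```c
#include <assert.h>

/* a / (B1 / B2): only meaningful when b1 >= b2, otherwise B1 / B2 is 0. */
static unsigned div_by_quotient(unsigned a, unsigned b1, unsigned b2)
{
    return a >> (b1 - b2);
}

/* a * (B1 / B2): the quotient, and so the product, is 0 when b1 < b2. */
static unsigned mul_by_quotient(unsigned a, unsigned b1, unsigned b2)
{
    return (b1 >= b2) ? a << (b1 - b2) : 0U;
}

/* Compare the shift forms with the plain expressions. */
void check_quotient_ops(void)
{
    for (unsigned b1 = 0; b1 < 8; b1++) {
        for (unsigned b2 = 0; b2 < 8; b2++) {
            unsigned q = (1U << b1) / (1U << b2);
            for (unsigned a = 0; a < 1000; a++) {
                assert(mul_by_quotient(a, b1, b2) == a * q);
                if (b1 >= b2)   /* skip the division by zero case */
                    assert(div_by_quotient(a, b1, b2) == a / q);
            }
        }
    }
}
```

Note that the caller of div_by_quotient is responsible for ensuring b1 >= b2, exactly as the plain C expression requires a non-zero divisor.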

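This means a macro is not enough, because the compiler is often not clever enough to detect this by itself. To get efficient code, it is better to feed the compiler with precomputed values, as the two functions below and their generated ARM code show.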

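typedef unsigned int uint;

/* Divisor left as (1U << b) / 4: the compiler emits a real division. Requires b >= 2. */
uint divu3(uint a, uint b)
{
        return a / ((1U<<b) / 4);
}

/* Same result with the exponent precomputed by hand: a single shift. Requires b >= 2. */
uint divu300(uint a, uint b)
{
        return a / (1U<<(b-2));
}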

divu3:
        stmfd   sp!, {r3, lr}
        mov     r3, #1
        mov     r1, r3, asl r1
        mov     r1, r1, lsr #2
        bl      __aeabi_uidiv
        ldmfd   sp!, {r3, pc}
divu300:
        sub     r1, r1, #2
        mov     r0, r0, lsr r1
        mov     pc, lr

PS: the ARM compiler is not able to optimize A / B and A * B …