Efficient C for ARM
Looping Structures
Examples
Here’s a routine to calculate a factorial.
    int fact1(int N)
    {
        int i, fact = 1;
        for (i = 1; i <= N; i++)
            fact *= i;
        return fact;
    }
Compiled by ARM ADS:
    fact1
            MOV     r2,#1           ; fact
            MOV     r1,#1           ; i
    loop
            CMP     r1,r0           ; i <= N
            MULLE   r2,r1,r2
            ADDLE   r1,r1,#1
            BLE     loop
            MOV     r0,r2
            MOV     pc,lr
The revised routine counts down.
    int fact2(int N)
    {
        int fact = 1;
        do
            fact *= N;
        while (--N != 0);
        return fact;
    }
Compiled by ARM ADS:
    fact2
            MOV     r1,#1
    loop
            MUL     r1,r0,r1
            SUBS    r0,r0,#1
            BNE     loop
            MOV     r0,r1
            MOV     pc,lr
After the changes:
- The counting up for loop was changed to a counting down do .. while loop.
- This allows ADD+CMP to become SUBS.
- i has been removed, reducing register pressure.
- The inner loop is now an instruction shorter: 3 instructions, where the inner loop of fact1 is 4.

But do note the subtly different behaviour:
- Iterates at least once (i.e. N == 0 won't work: the decrement takes N below zero, so the exit test N != 0 is not met and the loop runs away.)
- Uses a pre-decrement combined with test.

Writing loops to count down is beneficial because:
- The SUBS instruction (subtract, setting flags) which is generated gives us a compare with zero for free.
- Replacing the ADD+CMP with SUBS saves a cycle per iteration.
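
If N == 0 has to be handled, one option is to guard the count-down loop rather than revert to counting up. The following is a minimal sketch, not part of the original routines: the names fact_up and fact_down are hypothetical, and fact_up is included only as a count-up reference to compare against. The guard is a single test outside the loop, so the loop body is the same do .. while as in fact2. What the compiler emits for it is not shown here and would need checking.

```c
/* Hypothetical names for illustration; not from the original routines. */

/* Count-up reference, same shape as fact1. */
int fact_up(int n)
{
    int i, fact = 1;
    for (i = 1; i <= n; i++)
        fact *= i;
    return fact;
}

/* Count-down variant with a guard for n <= 0. */
int fact_down(int n)
{
    int fact = 1;

    /* Without this test the do .. while body would run once with
       n <= 0 and the decrement would step away from zero. */
    if (n <= 0)
        return fact;

    do
        fact *= n;          /* multiply by n, n-1, ..., 1 */
    while (--n != 0);       /* pre-decrement combined with test */

    return fact;
}
```

For example, fact_down(5) gives 120 and fact_down(0) gives 1, matching fact_up. Results are only meaningful while N! fits in an int (N <= 12 for a 32-bit int).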