It is used for better performance, but on its own it only works for leaf functions.
Working scenario (leaf functions)
    void foo() { return; }
    void bar() { return; }

    int main() {
        foo();  // Write the return address (the instruction after this call) into LR, go to foo, come back via LR
        bar();  // Write the return address (the instruction after this call) into LR, go to bar, come back via LR
        return 0;
    }
Nice, it's fast, because we only keep the return address in a register (LR) and never have to write it to the stack.
Non-working scenario (non-leaf function)
    void bar();

    void foo() {
        bar();   // Overwrites LR with the return address into foo (the instruction after this call), go to bar, come back via LR
        return;  // You can't get back to main, because LR still points into foo, not into main!
    }

    void bar() { return; }

    int main() {
        foo();  // Write the return address (the instruction after this call) into LR, go to foo, and come back via LR?
        return 0;
    }
Shit, for non-leaf functions we have to do what Intel/AMD always do and store the return address (here, the LR) on the stack before making the nested call.
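To make the two scenarios concrete, here is a minimal sketch in plain C that simulates the link register with an ordinary variable. Everything in it is an assumption made for illustration: the "addresses" RET_TO_MAIN and RET_TO_FOO are made-up integers, and lr, push, pop, foo_unsaved, foo_saved and call_from_main are hypothetical names, not real ARM or compiler APIs. On real hardware the save and restore would typically be a push of LR in the function prologue and a matching pop in the epilogue.

```c
#include <assert.h>

/* Toy model only: "addresses" are arbitrary integers, not machine addresses. */
enum { RET_TO_MAIN = 100, RET_TO_FOO = 200, STACK_SIZE = 8 };

int lr;                 /* simulated link register */
int stack[STACK_SIZE];  /* simulated call stack */
int sp;                 /* number of stack slots in use */

void push(int value) { assert(sp < STACK_SIZE); stack[sp++] = value; }
int  pop(void)       { assert(sp > 0); return stack[--sp]; }

/* Leaf function: it calls nothing, so LR is untouched when it returns.
   The return value is the address it "returns to". */
int bar_leaf(void) { return lr; }

/* Non-leaf function that does NOT save LR. */
int foo_unsaved(void) {
    lr = RET_TO_FOO;  /* calling bar overwrites LR with the return address into foo */
    bar_leaf();       /* bar gets back into foo just fine */
    return lr;        /* foo now "returns" to RET_TO_FOO instead of RET_TO_MAIN */
}

/* Non-leaf function that saves LR on the stack around the nested call. */
int foo_saved(void) {
    push(lr);         /* prologue: keep the return address into main */
    lr = RET_TO_FOO;  /* calling bar overwrites LR as before */
    bar_leaf();
    lr = pop();       /* epilogue: restore the return address into main */
    return lr;        /* foo returns to RET_TO_MAIN */
}

/* Models main calling foo: the call puts the return address into main in LR. */
int call_from_main(int (*foo)(void)) {
    lr = RET_TO_MAIN;
    return foo();     /* the address foo ends up returning to */
}
```

In this model, call_from_main(foo_unsaved) yields RET_TO_FOO (the return address into main is lost), while call_from_main(foo_saved) yields RET_TO_MAIN. The price of the fix is one stack write and one stack read per non-leaf call, which is exactly the cost the leaf case avoids.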