Alternatively, for a power of 2 unroll, the loop count and remainder can be
established with a shift and mask. This is convenient if also making a
computed jump into the middle of a large loop.

The limbs that are not a multiple of the unrolling can be handled in various
ways, for example:

A simple loop at the end (or the start) to process the excess.
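
As an illustration only, the shift-and-mask split and the simple loop for the excess can be sketched in C rather than assembly. The names `sum_limbs`, `limb_t`, `UNROLL_LOG2` and `UNROLL` are hypothetical and chosen for this sketch, and a plain limb sum stands in for whatever operation the real loop would perform.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for illustration; none of these come from GMP. */
#define UNROLL_LOG2 2
#define UNROLL      (1u << UNROLL_LOG2)   /* 4 limbs per unrolled pass */

typedef unsigned long limb_t;

/* Sum n limbs, wrapping modulo the limb size. */
limb_t sum_limbs(const limb_t *p, size_t n)
{
    assert(p != NULL || n == 0);

    size_t passes = n >> UNROLL_LOG2;   /* loop count: n / UNROLL by shift */
    size_t excess = n & (UNROLL - 1);   /* remainder:  n % UNROLL by mask  */
    limb_t s = 0;

    /* Unrolled part: UNROLL limbs per pass. */
    while (passes-- > 0) {
        s += p[0];
        s += p[1];
        s += p[2];
        s += p[3];
        p += UNROLL;
    }

    /* Simple loop at the end to process the excess limbs. */
    while (excess-- > 0)
        s += *p++;

    return s;
}
```

With an unroll of 4, a size of 11 limbs gives 2 unrolled passes and 3 excess limbs. The same split would apply if the excess were processed at the start instead, or if the remainder were used to pick an entry point for a computed jump.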