An Oldie But Goodie
Dec. 19th, 2011 06:52 pm
I was just reviewing some code which was copying a linear array of 16 values into a 4x4 matrix as follows:
for (int i = 0; i < 16; ++i) {
    matrix[i / 4][i % 4] = array[i];
}
That immediately jogged my inner optimizer. For non-negative integers, shifting right by 2 bits is equivalent to dividing by 4, but faster on most processors. Similarly, masking off everything except the lower two bits is equivalent to taking the value modulo 4, but again faster. So this should be significantly faster:
for (int i = 0; i < 16; ++i) {
    matrix[i >> 2][i & 0x3] = array[i];
}
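The index arithmetic in the two loops agrees for every non-negative value, which is all this loop ever sees. A minimal sketch to check that is below; the helper names `same_index_math` and `differs_for_negative` are made up for illustration and are not part of the reviewed code. The second helper shows why the trick does not carry over blindly to negative signed values.

```cpp
#include <cassert>

// Returns true if shift/mask indexing matches divide/modulo indexing
// for every i in [0, limit). Illustrative helper, not from the reviewed code.
bool same_index_math(int limit) {
    assert(limit >= 0);
    for (int i = 0; i < limit; ++i) {
        if ((i >> 2) != (i / 4) || (i & 0x3) != (i % 4)) {
            return false;
        }
    }
    return true;
}

// Since C++11, % takes the sign of the dividend, so -5 % 4 is -1.
// On a two's complement machine, -5 & 0x3 is 3, so the two forms disagree.
bool differs_for_negative() {
    int i = -5;
    return (i & 0x3) != (i % 4);
}
```

Calling `same_index_math(16)` covers exactly the indices used in the loops above.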
But wait. Surely modern compilers are capable of spotting this kind of optimization themselves!
I tried it out and found that even at the highest level of optimization in g++, the second version is 60 times faster than the first one.
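One way such a comparison might be timed is sketched below. This is not the harness behind the figure above: the function names, the repetition count, and the `volatile` sink are all assumptions for illustration, and whatever ratio it reports will depend on the compiler version, flags, and hardware.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// First version: divide and modulo.
void copy_div_mod(const int (&array)[16], int (&matrix)[4][4]) {
    for (int i = 0; i < 16; ++i) {
        matrix[i / 4][i % 4] = array[i];
    }
}

// Second version: shift and mask.
void copy_shift_mask(const int (&array)[16], int (&matrix)[4][4]) {
    for (int i = 0; i < 16; ++i) {
        matrix[i >> 2][i & 0x3] = array[i];
    }
}

// Runs `copy` the given number of times and returns elapsed nanoseconds.
// Hypothetical helper; a serious benchmark would also need warm-up runs
// and repeated measurements.
template <typename Copy>
std::int64_t time_ns(Copy copy, int repetitions) {
    assert(repetitions > 0);
    int array[16];
    for (int i = 0; i < 16; ++i) {
        array[i] = i;
    }
    int matrix[4][4] = {};
    volatile int sink = 0;  // discourages the compiler from dropping the copies
    auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < repetitions; ++r) {
        array[0] = r;  // vary the input so iterations are not identical
        copy(array, matrix);
        sink = matrix[0][0] + matrix[3][3];
    }
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

Calling `time_ns(copy_div_mod, 10000000)` and `time_ns(copy_shift_mask, 10000000)` from a program built with `g++ -O3` gives two numbers to compare.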
Nice to see that some of that old learning is still useful today.