This implementation runs in O(1), compared to O(log n) for the previous one. In my tests it was faster for all values greater than 4 (int) and greater than 8 (long).