在C ++中循环移位（旋转）操作的最佳实践

左和右移运算符（<<和>>）已经在C ++中可用。但是，我无法find如何执行循环移位或旋转操作。

如何执行“向左旋转”和“向右旋转”？

在这里旋转两次

Initial --> 1000 0011 0100 0010

应该导致：

 Final --> 1010 0000 1101 0000

一个例子会有帮助。

（编者注：如果旋转计数为零，或者编译为不止一个旋转机器指令，许多常见的expression式旋转的方法都会受到未定义的行为的影响，这个问题的答案应该logging最佳实践。

有关更多关于asm gcc / clang为x86生成的更多详细信息，另请参阅另一个循环问题的此答案的早期版本。

用C和C ++来表示旋转的最好的方法是避免任何未定义的行为，这似乎是John Regehr的实现。我已经适应它旋转的types的宽度（例如，假设uint32_t是32位宽，虽然C / C + +只能保证至less那么宽。我试图保持它的可读性，通过忽略这种检查的东西）。

 #include <stdint.h> // for uint32_t #include <limits.h> // for CHAR_BIT // #define NDEBUG #include <assert.h> static inline uint32_t rotl32 (uint32_t n, unsigned int c) { const unsigned int mask = (CHAR_BIT*sizeof(n) - 1); // assumes width is a power of 2. // assert ( (c<=mask) &&"rotate by type width or more"); c &= mask; return (n<<c) | (n>>( (-c)&mask )); } static inline uint32_t rotr32 (uint32_t n, unsigned int c) { const unsigned int mask = (CHAR_BIT*sizeof(n) - 1); // assert ( (c<=mask) &&"rotate by type width or more"); c &= mask; return (n>>c) | (n<<( (-c)&mask )); }

适用于任何无符号的整数types，不只是uint32_t ，所以你可以为其他大小的版本。

另请参见C ++ 11模板版本，其中包含大量的安全检查（包括types宽度为2的static_assert ） ，例如在某些24位DSP或36位主机上不是这种情况。

我build议只使用该模板作为名称包含显式旋转宽度的包装的后端。 整数提升规则意味着rotl_template(u16 & 0x11UL, 7)将执行32或64位旋转，而不是16 （取决于unsigned long 整数的宽度）。即使uint16_t & uint16_t也被C ++的整数提升规则提升为有signed int整数，除了int不超过uint16_t 。

在x86上 ，这个版本通过编译器内联到一个单独的rol r32, cl （或者rol r32, imm8 ），因为编译器知道x86的旋转和移位指令和 C源一样掩盖了shift-count。

编译器支持这种避免在UB上使用UB的习惯用法，对于uint32_t x和unsigned int n用于variables数转换：

铿锵声：从clang3.5开始，可变计数转动，在此之前多次转换+或insn。
海湾合作委员会：认可variables计数旋转自gcc4.9 ，多class+ + insn之前。 gcc5，然后在wikipedia版本中优化掉分支和掩码，只需使用ror或rol指令进行variables计数。
ICC ：自ICC13或更早版本以来支持可变计数旋转。当BMI2不适用于rorx eax,edi,25到10时shld edi,edi,7恒定计数轮换使用shld edi,edi,7 ，它比一些CPU（特别是AMD，也包括一些Intel）保存一个MOV。
MSVC：x86-64 CL19：只识别恒定计数旋转。（维基百科成语是公认的，但分支和AND没有被优化掉）。使用x86（包括x86-64）上的<intrin.h>的_rotl / _rotr内在函数。

gcc for ARM使用一个and r1, r1, #31进行variables计数的旋转，但仍然用一条指令实际旋转 ： ror r0, r0, r1 。所以gcc没有意识到，旋转计数本质上是模块化的。正如ARM文档所说， “移位长度为n ROR大于32与移位长度为n-32 ROR相同” 。我认为海湾合作委员会在这里会感到困惑，因为ARM上的左/右移位使计数饱和，所以32位或更多的移位将清除寄存器。（与x86不同，在这种情况下，移位屏蔽计数与旋转一样）。在识别旋转惯用语之前，它可能决定需要一个AND指令，因为非循环移位是如何在该目标上工作的。

当前的x86编译器仍然使用一个额外的指令来屏蔽8位和16位旋转的variables计数，可能出于同样的原因，他们不能避免ARM上的AND。这是一个错过的优化，因为性能不依赖任何x86-64 CPU上的旋转计数。（屏蔽计数是由于性能方面的原因而引入的，因为它处理的是迭代迭代，而不是像现代CPU那样具有恒定延迟。）

顺便说一句，偏好旋转右variables计数旋转，以避免让编译器做32-n来实现像ARM和MIPS只提供一个旋转权的体系结构左旋。（这可以优化编译时常数。）

有趣的事实：ARM并没有真正的专用移位/旋转指令，它只是MOV，源操作数在ROR模式下通过桶式移位器： mov r0, r0, ror r1 。所以一个旋转可以折叠成一个EOR指令的寄存器源操作数或其他东西。

确保你使用无符号types的n和返回值，否则它不会是一个旋转 。（用于x86目标的gcc执行算术右移，将符号位的副本移动而不是零，导致在将两个移位的值相加时出现问题。负符号整数的右移是在C中由实现定义的行为。）

另外， 确保移位计数是一个无符号types ，因为(-n)&31带有符号types可以是补码或符号/量值，而不是用无符号或二进制补码得到的模块2 ^ n。（请参阅Regehr博客文章的评论）。对于每个我已经查看过的编译器， unsigned int都可以很好地适用于每个x宽度。其他一些types实际上还是击败了一些编译器的成语识别，所以不要只使用与x相同的types。

一些编译器提供了旋转的内在特性，如果可移植版本不能在你所定位的编译器上生成好的代码，那么它比内联的asm好得多。对于我所知道的任何编译器，没有跨平台的内在函数。这些是一些x86选项：

英特尔文档<immintrin.h>提供_rotl和_rotl64内在函数，右移也是如此。 MSVC需要<intrin.h> ，而gcc需要<x86intrin.h> 。一个#ifdef照顾gcc与icc，但铿锵似乎并没有提供任何地方，除了在MSVC兼容模式下使用-fms-extensions -fms-compatibility -fms-compatibility-version=17.00 。而它的散发为他们吸（额外的掩蔽和一个CMOV）。
MSVC： _rotr8和_rotr16 。
gcc和icc（不是<x86intrin.h> ）： <x86intrin.h>还提供了__rolb / __rorb用于8位左右旋转， __rolw / __rorw （16位）， __rold / __rord （32位）， __rolq / __rorq （64位，仅针对64位目标定义）。对于窄旋转，实现使用__builtin_ia32_rolhi或...qi ，但32位和64位旋转使用shift /或（对UB没有保护，因为ia32intrin.h的代码只能在gcc for x86上工作）。 GNU C似乎没有任何跨平台__builtin_rotate函数的方式，它为__builtin_popcount （即扩展到目标平台上的任何最佳，即使它不是一个单一的指令）。大多数情况下，你从成语识别中获得好的代码。

 // For real use, probably use a rotate intrinsic for MSVC, or this idiom for other compilers. This pattern of #ifdefs may be helpful #if defined(__x86_64__) || defined(__i386__) #ifdef __MSC_VER #include <intrin.h> #else #include <x86intrin.h> // Not just <immintrin.h> for compilers other than icc #endif uint32_t rotl32_x86_intrinsic(rotwidth_t x, unsigned n) { //return __builtin_ia32_rorhi(x, 7); // 16-bit rotate, GNU C return _rotl(x, n); // gcc, icc, msvc. Intel-defined. //return __rold(x, n); // gcc, icc. // can't find anything for clang } #endif

据推测一些非x86编译器也有内在特性，但是我们不要扩展这个社区wiki的答案来包含它们。（也许在现有的关于内部函数的答案中这样做）。

（这个答案的旧版本build议MSVC特定的内联asm（只适用于32位x86代码），或http://www.devx.com/tips/Tip/14043为C版本。评论回复。）;

内联asm打败了许多优化 ，尤其是MSVC风格，因为它强制input被存储/重载。仔细编写的GNU C inline-asm循环可以使count成为编译时常量移位计数的立即操作数，但是如果要移位的值也是编译时常量，它仍然不能完全优化内联后。 https://gcc.gnu.org/wiki/DontUseInlineAsm 。

由于它是C ++，所以使用一个内联函数：

 template <typename INT> INT rol(INT val) { return (val << 1) | (val >> (sizeof(INT)*CHAR_BIT-1)); }

C ++ 11变体：

 template <typename INT> constexpr INT rol(INT val) { static_assert(std::is_unsigned<INT>::value, "Rotate Left only makes sense for unsigned types"); return (val << 1) | (val >> (sizeof(INT)*CHAR_BIT-1)); }

大多数编译器都有这个内在的东西。 Visual Studio，例如_rotr8，_rotr16

明确：

 template<class T> T ror(T x, unsigned int moves) { return (x >> moves) | (x << sizeof(T)*8 - moves); }

如何使用标准的bitset这样的事情…

 #include <bitset> #include <iostream> template <std::size_t N> inline void rotate(std::bitset<N>& b, unsigned m) { b = b << m | b >> (Nm); } int main() { std::bitset<8> b(15); std::cout << b << '\n'; rotate(b, 2); std::cout << b << '\n'; return 0; }

HTH，

详细来说，你可以应用以下逻辑。

如果位模式是33602整数

 1000 0011 0100 0010

然后你需要用2个右移shifs：首先复制位模式，然后左移：长度 – RightShift即长度是16右移值是2 16 – 2 = 14

14次左转后，你得到。

 1000 0000 0000 0000

现在右移价值33602，根据需要2次。你得到

 0010 0000 1101 0000

在14倍左移值和2倍右移值之间取一个或。

 1000 0000 0000 0000
 0010 0000 1101 0000
 ===================
 1010 0000 1101 0000
 ===================

你得到你的转移翻转值。记住比特明智的操作更快，这甚至不需要任何循环。

如果x是一个8位的值，你可以使用这个：

 x=(x>>1 | x<<7);

假设你想要右移L位，inputx是一个有N位的数字：

 unsigned ror(unsigned x, int L, int N) { unsigned lsbs = x & ((1 << L) - 1); return (x >> L) | (lsbs << (NL)); }

正确答案如下：

 #define BitsCount( val ) ( sizeof( val ) * CHAR_BIT ) #define Shift( val, steps ) ( steps % BitsCount( val ) ) #define ROL( val, steps ) ( ( val << Shift( val, steps ) ) | ( val >> ( BitsCount( val ) - Shift( val, steps ) ) ) ) #define ROR( val, steps ) ( ( val >> Shift( val, steps ) ) | ( val << ( BitsCount( val ) - Shift( val, steps ) ) ) )

源代码x位数字

 int x =8; data =15; //input unsigned char tmp; for(int i =0;i<x;i++) { printf("Data & 1 %d\n",data&1); printf("Data Shifted value %d\n",data>>1^(data&1)<<(x-1)); tmp = data>>1|(data&1)<<(x-1); data = tmp; }

另一个build议

 template<class T> inline T rotl(T x, unsigned char moves){ unsigned char temp; __asm{ mov temp, CL mov CL, moves rol x, CL mov CL, temp }; return x; }

以下是DídacPérez的答案略有改进的版本，实现了两个方向，并演示了使用unsigned char和unsigned long long值的这些函数的用法。几个注释：

为了编译器优化，内联函数被内联
我使用了一个cout << +value技巧来简单地输出一个无符号字符数字，我在这里find： https : //stackoverflow.com/a/28414758/1599699
为了清晰和安全，我build议使用明确的<put the type here>语法。
我使用unsigned char作为shiftNum参数，因为我在附加细节部分find了这里：

如果加法expression式为负或者加法expression式大于或等于（提升的） 移位expression式中的位数，则移位操作的结果是不确定的。

以下是我正在使用的代码：

 #include <iostream> using namespace std; template <typename T> inline T rotateAndCarryLeft(T rotateMe, unsigned char shiftNum) { static const unsigned char TBitCount = sizeof(T) * 8U; return (rotateMe << shiftNum) | (rotateMe >> (TBitCount - shiftNum)); } template <typename T> inline T rotateAndCarryRight(T rotateMe, unsigned char shiftNum) { static const unsigned char TBitCount = sizeof(T) * 8U; return (rotateMe >> shiftNum) | (rotateMe << (TBitCount - shiftNum)); } void main() { //00010100 == (unsigned char)20U //00000101 == (unsigned char)5U == rotateAndCarryLeft(20U, 6U) //01010000 == (unsigned char)80U == rotateAndCarryRight(20U, 6U) cout << "unsigned char " << 20U << " rotated left by 6 bits == " << +rotateAndCarryLeft<unsigned char>(20U, 6U) << "\n"; cout << "unsigned char " << 20U << " rotated right by 6 bits == " << +rotateAndCarryRight<unsigned char>(20U, 6U) << "\n"; cout << "\n"; for (unsigned char shiftNum = 0U; shiftNum <= sizeof(unsigned char) * 8U; ++shiftNum) { cout << "unsigned char " << 21U << " rotated left by " << +shiftNum << " bit(s) == " << +rotateAndCarryLeft<unsigned char>(21U, shiftNum) << "\n"; } cout << "\n"; for (unsigned char shiftNum = 0U; shiftNum <= sizeof(unsigned char) * 8U; ++shiftNum) { cout << "unsigned char " << 21U << " rotated right by " << +shiftNum << " bit(s) == " << +rotateAndCarryRight<unsigned char>(21U, shiftNum) << "\n"; } cout << "\n"; for (unsigned char shiftNum = 0U; shiftNum <= sizeof(unsigned long long) * 8U; ++shiftNum) { cout << "unsigned long long " << 3457347ULL << " rotated left by " << +shiftNum << " bit(s) == " << rotateAndCarryLeft<unsigned long long>(3457347ULL, shiftNum) << "\n"; } cout << "\n"; for (unsigned char shiftNum = 0U; shiftNum <= sizeof(unsigned long long) * 8U; ++shiftNum) { cout << "unsigned long long " << 3457347ULL << " rotated right by " << +shiftNum << " bit(s) == " << rotateAndCarryRight<unsigned long long>(3457347ULL, shiftNum) << "\n"; } cout << "\n\n"; system("pause"); }

 --- Substituting RLC in 8051 C for speed --- Rotate left carry Here is an example using RLC to update a serial 8 bit DAC msb first: (r=DACVAL, P1.4= SDO, P1.5= SCLK) MOV A, r ?1: MOV B, #8 RLC A MOV P1.4, C CLR P1.5 SETB P1.5 DJNZ B, ?1 Here is the code in 8051 C at its fastest: sbit ACC_7 = ACC ^ 7 ; //define this at the top to access bit 7 of ACC ACC = r; B = 8; do { P1_4 = ACC_7; // this assembles into mov c, acc.7 mov P1.4, c ACC <<= 1; P1_5 = 0; P1_5 = 1; B -- ; } while ( B!=0 ); The keil compiler will use DJNZ when a loop is written this way. I am cheating here by using registers ACC and B in c code. If you cannot cheat then substitute with: P1_4 = ( r & 128 ) ? 1 : 0 ; r <<= 1; This only takes a few extra instructions. Also, changing B for a local var char n is the same. Keil does rotate ACC left by ADD A, ACC which is the same as multiply 2. It only takes one extra opcode i think. Keeping code entirely in C keeps things simpler sometimes.

重载一个函数：

 unsigned int rotate_right(unsigned int x) { return (x>>1 | (x&1?0x80000000:0)) } unsigned short rotate_right(unsigned short x) { /* etc. */ }

 #define ROTATE_RIGHT(x) ( (x>>1) | (x&1?0x8000:0) )

在C ++中循环移位（旋转）操作的最佳实践

〜x +〜y ==〜（x + y）总是假的？

检查标志位Java

你有没有在实际项目中使用位移？

使用按位或0来放置一个数字

“int mask =〜0”的目的是什么？

什么是最快/最有效的方法来find一个整数在C中的最高设置位（MSB）？

按位（＆）如何在Java中工作？

在字节中设置特定位

在C / C ++中，最简单的方法是颠倒一个字节中的位的顺序？

什么是位掩码？