Does "i += ++i + 1" exhibit undefined behavior in C++11?
This question came up while I was reading (the answers to) "So why is i = ++i + 1 well-defined in C++11?"
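I gather that the subtle explanation is that (1) the expression ++i returns an lvalue, but + takes prvalues as operands, so a conversion from lvalue to prvalue must be performed; this involves obtaining the current value of that lvalue (rather than one more than the old value of i), and must therefore be sequenced after the side effect of the increment (i.e., updating i); (2) the LHS of the assignment is also an lvalue, so its value evaluation does not involve fetching the current value of i; while this value computation is unsequenced with respect to the value computation of the RHS, this poses no problem; (3) the value computation of the assignment itself involves updating i (again), but is sequenced after the value computation of its RHS, and hence after the previous update to i; no problem.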
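The value categories that points (1) and (2) rely on can be checked directly. The following is a minimal sketch, not part of the original question: the helper names is_lvalue and check_value_categories are hypothetical, and the check uses overload resolution between int& and int&& parameters. Every statement is a separate full-expression, so nothing here is unsequenced.

```cpp
#include <cassert>

// Hypothetical helpers for this sketch: overload resolution picks the
// int& overload for lvalue arguments and the int&& overload for rvalues.
bool is_lvalue(int&)  { return true; }
bool is_lvalue(int&&) { return false; }

bool check_value_categories() {
    int i = 0;
    bool ok = true;
    ok = ok && is_lvalue(i);         // a plain variable name is an lvalue
    ok = ok && is_lvalue(++i);       // prefix increment yields an lvalue
    ok = ok && is_lvalue(i = 5);     // simple assignment yields an lvalue
    ok = ok && is_lvalue(i += 1);    // compound assignment yields an lvalue
    ok = ok && !is_lvalue(++i + 1);  // binary + yields a prvalue
    ok = ok && !is_lvalue(i++);      // postfix increment yields a prvalue
    return ok;
}
```

Calling check_value_categories() is expected to return true on a conforming C++11 compiler; it says nothing about sequencing, only about which operands need an lvalue-to-rvalue conversion.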
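Fine, so there is no UB there. Now my question is what happens if one changes the assignment operator from = to += (or a similar operator).
Does the evaluation of the expression i += ++i + 1 lead to undefined behavior?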
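As I see it, the standard seems to contradict itself here. Since the LHS of += is still an lvalue (and its RHS still a prvalue), the same reasoning as above applies as far as (1) and (2) are concerned; there is no undefined behavior in the evaluation of the operands of +=. As for (3), the operation of the compound assignment += (more precisely, the side effect of that operation; its value computation, if needed, is in any case sequenced after its side effect) now must both fetch the current value of i, and then (obviously sequenced after it, even if the standard does not say so explicitly, since otherwise the evaluation of such operators would always invoke undefined behavior) add the RHS and store the result back into i. Both these operations would give undefined behavior if they were unsequenced with respect to the side effect of the ++, but as argued above (the side effect of the ++ is sequenced before the value computation of the + giving the RHS of the += operator, and that value computation is sequenced before the operation of the compound assignment), that is not the case.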
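But on the other hand the standard also says that E += F is equivalent to E = E + F, except that (the lvalue) E is evaluated only once. Now in our example the value computation of i (which is what E is here) as an lvalue does not involve anything that needs to be sequenced with respect to other actions, so doing it once or twice makes no difference; our expression should be strictly equivalent to E = E + F. But here is the problem: it is pretty obvious that evaluating i = i + (++i + 1) would give undefined behavior! What gives? Or is this a defect in the standard?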
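Added. I have slightly modified my discussion above, to do more justice to the proper distinction between side effects and value computations, and I use "evaluation" of an expression (as does the standard) to encompass both. I think my main question is not just whether behavior is defined in this example, but how one must read the standard in order to decide this. Notably, should one take the equivalence of E op= F to E = E op F as the ultimate authority for the semantics of the compound assignment operation (in which case the example clearly has UB), or merely as an indication of which mathematical operation is involved in determining the value to be assigned (namely the one identified by op, with the lvalue-to-rvalue converted LHS of the compound assignment operator as left operand and its RHS as right operand)? The latter option makes it much harder to argue for UB in this example, as I have tried to explain. I admit that it is tempting to make the equivalence authoritative (so that compound assignments become a kind of second-class primitive, whose meaning is given by rewriting in terms of first-class primitives; the language definition would thus be simplified), but there are rather strong arguments against this: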
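- The equivalence is not absolute, because of the "E is evaluated only once" exception. Note that this exception is essential to avoid turning into undefined behavior any use in which the evaluation of E involves a side effect, for instance the fairly common a[i++] += b; usage. In fact I think no absolutely equivalent rewriting that eliminates compound assignments is possible; using a fictive ||| operator to designate unsequenced evaluations, one might try to define E op= F; (with int operands for simplicity) as equivalent to { int& L=E ||| int R=F; L = L + R; }, but then the example no longer has UB. In any case the standard gives us no rewriting recipe.
- The standard does not treat compound assignments as second-class primitives for which no separate definition of semantics is necessary. For instance in 5.17 (emphasis mine):
"The assignment operator (=) and the compound assignment operators all group right-to-left. […] In all cases, the assignment is sequenced after the value computation of the right and left operands, and before the value computation of the assignment expression. With respect to an indeterminately-sequenced function call, the operation of a compound assignment is a single evaluation."
- If the intention were to let compound assignments be mere shorthands for simple assignments, there would be no reason to include them explicitly in this description. The final sentence even directly contradicts what would be the case if the equivalence were taken to be authoritative.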
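The "E is evaluated only once" exception from the first point can be observed in well-defined code. This is a small sketch with hypothetical function names; the second function spells out the reference-binding rewrite from the question with ordinary sequencing (two statements), so it does not model the fictive ||| operator.

```cpp
#include <cassert>

// Returns how many times the index was advanced by `a[i++] += b`.
int index_steps_compound() {
    int a[3] = {10, 20, 30};
    int i = 0;
    a[i++] += 5;               // E = a[i++] is evaluated once, so i becomes 1
    assert(a[0] == 15);
    return i;
}

// The same effect with E bound to a reference before the addition.
int index_steps_reference() {
    int a[3] = {10, 20, 30};
    int i = 0;
    int& L = a[i++];           // E evaluated here, exactly once
    L = L + 5;                 // fetch, add, store through the same object
    assert(a[0] == 15 && a[1] == 20);
    return i;
}
```

A naive textual expansion to a[i++] = a[i++] + 5 would evaluate i++ twice, which is what the exception clause rules out.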
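If one admits that compound assignments have a semantics of their own, then the point arises that their evaluation involves (apart from the mathematical operation) more than just a side effect (the assignment) and a value computation (sequenced after the assignment): it also involves an unnamed operation of fetching the (previous) value of the LHS. This would normally be dealt with under the heading of "lvalue-to-rvalue conversion", but doing so here is hard to justify, since no operator is present that takes the LHS as an rvalue operand (though there is one in the expanded "equivalent" form). It is precisely this unnamed operation whose potentially unsequenced relation with the side effect of ++ would cause UB, but this unsequenced relation is nowhere explicitly stated in the standard, because the unnamed operation itself is not. It is hard to justify UB using an operation whose very existence is only implicit in the standard.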
About the description of i = ++i + 1
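"I gather that the subtle explanation is that (1) the expression ++i returns an lvalue, but + takes prvalues as operands, so a conversion from lvalue to prvalue must be performed;"
Probably; see CWG active issue 1642.
"this involves obtaining the current value of that lvalue (rather than one more than the old value of i) and must therefore be sequenced after the side effect of the increment (i.e., updating i)"
The sequencing here is defined for the increment (indirectly, via +=, see (a)): the side effect of ++ (the modification of i) is sequenced before the value computation of the whole expression ++i. The latter refers to computing the result of ++i, not to loading the value of i.
"(2) the LHS of the assignment is also an lvalue, so its value evaluation does not involve fetching the current value of i; while this value computation is unsequenced with respect to the value computation of the RHS, this poses no problem"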
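I don't think that is properly defined in the standard, but I'd agree.
"(3) the value computation of the assignment itself involves updating i (again),"
The value computation of i = expr is only required when you use the result, e.g. int x = (i = expr); or (i = expr) = 42;. The value computation itself does not modify i.
The modification of i in the expression i = expr that happens because of the = is called the side effect of =. This side effect is sequenced before the value computation of i = expr; or rather, the value computation of i = expr is sequenced after the side effect of the assignment in i = expr.
In general, the value computations of the operands of an expression are of course sequenced before the side effect of that expression.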
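"but is sequenced after the value computation of its RHS, and hence after the previous update to i; no problem."
The side effect of the assignment i = expr is sequenced after the value computations of the operands i (A) and expr of the assignment.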
The expr in this case is a +-expression: expr1 + 1. The value computation of this expression is sequenced after the value computations of its operands expr1 and 1.
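The expr1 here is ++i. The value computation of ++i is sequenced after the side effect of ++i (the modification of i) (B).
That is why i = ++i + 1 is safe: there is a chain of "sequenced before" relations between the side effect in (B) and the side effect of the assignment, while the value computation in (A) does not read the value of that variable.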
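The chain described above can be written out with one statement per step. This is only a sketch of one ordering that is consistent with the guarantees (the function name is hypothetical); step (A) could equally be placed later, since it only determines the identity of i and never loads its value.

```cpp
#include <cassert>

// Models `i = ++i + 1` as explicitly ordered steps.
int model_simple_assignment(int start) {
    int i = start;
    int* lhs = &i;        // (A) value computation of the LHS: identity of i, no load
    i = i + 1;            // (B) side effect of ++i
    int* pre = &i;        // value computation of ++i: an lvalue referring to i
    int rhs = *pre + 1;   // lvalue-to-rvalue conversion, then the addition
    *lhs = rhs;           // side effect of =, after both operand value computations
    return i;
}
```

Starting from 0, the model yields 2: the increment gives 1, the addition gives 2, and the assignment stores 2.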
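(a) The standard defines ++expr in terms of expr += 1, which in turn is defined as expr = expr + 1 with expr being evaluated only once.
For this expr = expr + 1, we therefore have only one value computation of expr. The side effect of = is sequenced before the value computation of the whole expr = expr + 1, and it is sequenced after the value computations of the operands expr (LHS) and expr + 1 (RHS).
This corresponds to my claim that for ++expr, the side effect is sequenced before the value computation of ++expr.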
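Two observable consequences of footnote (a) and of the remark about using the result of an assignment can be checked in well-defined code. The function names below are hypothetical and the sketch only illustrates the results of the operators, not the sequencing rules themselves.

```cpp
#include <cassert>

// ++a and (b += 1) both yield an lvalue referring to their operand,
// and both leave the incremented value in it.
bool preincrement_matches_compound() {
    int a = 41, b = 41;
    int& ra = ++a;        // result of ++a refers to a
    int& rb = (b += 1);   // result of b += 1 refers to b
    return &ra == &a && &rb == &b && a == 42 && b == 42;
}

// The value computation of `i = expr` matters only when the result is used,
// as in the initialization of x; it happens after the store to i.
int use_assignment_result() {
    int i = 0;
    int x = (i = 7);
    return x;
}
```

use_assignment_result() returns 7, the value stored by the assignment before its result is read.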
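About i += ++i + 1
"Does the value computation of i += ++i + 1 involve undefined behavior? Since the LHS of += is still an lvalue (and its RHS still a prvalue), the same reasoning as above applies as far as (1) and (2) are concerned; as for (3), the value computation of the += operator now must both fetch the current value of i, and then (obviously sequenced after it, even if the standard does not say so explicitly, since otherwise the execution of such operators would always invoke undefined behavior) perform the addition of the RHS and store the result back into i."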
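I think here is the problem: adding the i on the LHS of i += to the result of ++i + 1 requires knowing the value of i, which is a value computation (and can mean loading the value of i). This value computation is unsequenced with respect to the modification performed by ++i. This is essentially what you say in your alternative description, following the rewrite mandated by the standard, i += expr -> i = i + expr. Here, the value computation of the i within i + expr is unsequenced with respect to the value computation of expr. That is where you get UB.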
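Please note that a value computation can have two kinds of result: the "address" of an object, or the value of an object. In an expression i = 42, the value computation of the LHS "produces the address" of i; that is, the compiler needs to figure out where to store the RHS (under the rules of observable behavior of the abstract machine). In an expression i + 42, the value computation of i produces the value. In the paragraph above I was referring to the second kind, hence [intro.execution] p15 applies:
"If a side effect on a scalar object is unsequenced relative to either another side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined."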
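Another approach for i += ++i + 1
"the value computation of the += operator now must both fetch the current value of i, and then […] perform the addition of the RHS"
The RHS is ++i + 1. Computing the result of this expression (its value computation) is unsequenced with respect to the value computation of i from the LHS. So the word "then" in this sentence is misleading: of course it must first load i and then add the result of the RHS to it. But there is no order between the side effect of the RHS and the value computation that obtains the value of the LHS. For example, you could get for the LHS either the old or the new value of i, as modified by the RHS.
In general, a store and a "concurrent" load constitute a data race, which leads to undefined behavior.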
Addressing the addendum
"using a fictive ||| operator to designate unsequenced evaluations, one might try to define E op= F; (with int operands for simplicity) as equivalent to { int& L=E ||| int R=F; L = L + R; }, but then the example no longer has UB."
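Let E be i and F be ++i (we don't need the + 1). Then, for i = ++i:
int* lhs_address;
int lhs_value;
int* rhs_address;
int rhs_value;
    (         lhs_address = &i)
||| (i = i+1, rhs_address = &i, rhs_value = *rhs_address);
*lhs_address = rhs_value;
On the other hand, for i += ++i:
    (         lhs_address = &i, lhs_value = *lhs_address)
||| (i = i+1, rhs_address = &i, rhs_value = *rhs_address);
int total_value = lhs_value + rhs_value;
*lhs_address = total_value;
This is intended to represent my understanding of the sequencing guarantees. Note that the , operator sequences all value computations and side effects of its LHS before those of its RHS. Parentheses do not affect sequencing. In the second case, i += ++i, we have a modification of i that is unsequenced with respect to an lvalue-to-rvalue conversion of i => UB.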
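To see why the missing order matters for i += ++i, the two operand groups from the pseudocode above can be run in each of the two possible sequential orders. This is a sketch with hypothetical function names; it models only two interleavings, whereas real undefined behavior is not limited to these outcomes.

```cpp
#include <cassert>

// LHS group first: the old value of i is fetched before ++i modifies it.
int compound_lhs_first(int i) {
    int* lhs_address = &i;
    int lhs_value = *lhs_address;          // lvalue-to-rvalue conversion of the LHS
    i = i + 1;                             // side effect of ++i
    int rhs_value = i;                     // value of the RHS
    *lhs_address = lhs_value + rhs_value;  // store performed by +=
    return i;
}

// RHS group first: the fetch of the LHS sees the incremented value.
int compound_rhs_first(int i) {
    i = i + 1;                             // side effect of ++i
    int rhs_value = i;                     // value of the RHS
    int* lhs_address = &i;
    int lhs_value = *lhs_address;          // lvalue-to-rvalue conversion of the LHS
    *lhs_address = lhs_value + rhs_value;  // store performed by +=
    return i;
}
```

Starting from 0, the first order yields 1 and the second yields 2. For i = ++i there is no fetch in the LHS group, so the same exercise gives one result for both orders.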
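"The standard does not treat compound assignments as second-class primitives for which no separate definition of semantics is necessary."
I would say it is a redundancy. The rewrite from E1 op = E2 to E1 = E1 op E2 also covers which expression types and value categories are required (on the RHS; 5.17/1 says something about the LHS), what happens to pointer types, the required conversions, etc. The sad thing is that the sentence about "With respect to an…" in 5.17/1 is not in 5.17/7 as an exception to that equivalence.
In any case, I think we should compare the guarantees and requirements for compound assignment with those for simple assignment plus the operator, and see if there is any contradiction.
Once we put that "With respect to an…" also in the list of exceptions in 5.17/7, I don't think there is a contradiction.
As it turns out, as you can see in the discussion of Marc van Leeuwen's answer, this sentence leads to the following interesting observation: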
int i; // global
int& f() { return ++i; }
int main() {
    i = i + f(); // (A)
    i += f();    // (B)
}
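It seems that (A) has two possible outcomes, since the evaluation of the body of f is indeterminately sequenced with respect to the value computation of the i in i + f().
In (B), on the other hand, the evaluation of the body of f() is sequenced before the value computation of i, since += must be seen as a single operation, and f() certainly needs to be evaluated before the assignment of +=.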
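The two possible outcomes claimed for (A) can be made concrete by writing both orders as separate statements. This is a sketch under the answer's reading; i_global stands in for the global i of the example, and the two function names are hypothetical.

```cpp
#include <cassert>

int i_global = 0;                       // stands in for the global i
int& f() { return ++i_global; }         // same body as in the example

// (A) with the left operand of + loaded before f is called.
int sum_load_then_call() {
    i_global = 0;
    int left = i_global;                // value computation of i: 0
    int& r = f();                       // body of f runs: i_global becomes 1
    i_global = left + r;                // 0 + 1
    return i_global;
}

// (A) with f called before the left operand of + is loaded.
int sum_call_then_load() {
    i_global = 0;
    int& r = f();                       // body of f runs: i_global becomes 1
    int left = i_global;                // value computation of i: 1
    i_global = left + r;                // 1 + 1
    return i_global;
}
```

The first order leaves 1 in the variable and the second leaves 2. Under the reading given for (B), only the second order is available, because the call may not intervene between the fetch and the store of +=.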
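The expression
i += ++i + 1
does invoke undefined behavior. The language-lawyer method requires us to go back to the defect report that resulted in
i = ++i + 1 ;
becoming well-defined in C++11, which is defect report 637, "Sequencing rules and example disagree". It starts out saying:
"In 1.9 [intro.execution] paragraph 16, the following expression is still listed as an example of undefined behavior:
i = ++i + 1;
However, it appears that the new sequencing rules make this expression well-defined"
The logic used in the report is as follows: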
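- The assignment side effect is required to be sequenced after the value computations of both its LHS and RHS (5.17 [expr.ass] paragraph 1).
- The LHS (i) is an lvalue, so its value computation involves computing the address of i.
- In order to value-compute the RHS (++i + 1), it is necessary to first value-compute the lvalue expression ++i and then do an lvalue-to-rvalue conversion on the result. This guarantees that the incrementation side effect is sequenced before the computation of the addition operation, which in turn is sequenced before the assignment side effect. In other words, it yields a well-defined order and final value for this expression.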
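So in this question the change affects the RHS, which goes from
++i + 1
to
i + ++i + 1
because the draft C++11 standard, section 5.17 "Assignment and compound assignment operators", says:
"The behavior of an expression of the form E1 op = E2 is equivalent to E1 = E1 op E2 except that E1 is evaluated only once. […]"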
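So now we have a situation where the value computation of i in the RHS is not sequenced relative to ++i, and so we have undefined behavior. This follows from section 1.9 paragraph 15, which says:
"Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced. [ Note: In an expression that is evaluated more than once during the execution of a program, unsequenced and indeterminately sequenced evaluations of its subexpressions need not be performed consistently in different evaluations. -end note ] The value computations of the operands of an operator are sequenced before the value computation of the result of the operator. If a side effect on a scalar object is unsequenced relative to either another side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined."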
A pragmatic way to show this is to test the code with clang, which generates the following warning (see it live):
warning: unsequenced modification and access to 'i' [-Wunsequenced]
i += ++i + 1 ;
  ~~ ^
for this code:
int main()
{
    int i = 0 ;
    i += ++i + 1 ;
}
This is further bolstered by this explicit test example in clang's test suite:
a += ++a;
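Whatever the verdict on the original expression, the ambiguity disappears once each modification of i gets its own full-expression. The sketch below shows two possible intents; which one a given author meant is an assumption, and both function names are hypothetical.

```cpp
#include <cassert>

// Assumed intent 1: increment first, then add (new value + 1) to the new value.
int increment_then_add(int i) {
    ++i;                 // first full-expression: the only modification here
    i += i + 1;          // both operands only read i; a single store follows
    return i;
}

// Assumed intent 2: add (new value + 1) to the value i had before the increment.
int add_to_old_value(int i) {
    const int old = i;   // fetch the old value explicitly
    ++i;
    i = old + i + 1;     // no modification of i among the operands
    return i;
}
```

Starting from 0, the first version yields 3 and the second yields 2, and neither should trigger the unsequenced-modification warning, since no expression both modifies and separately reads i without sequencing.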
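Yes, it is UB!
The evaluation of your expression
i += ++i + 1
proceeds in the following steps.
5.17p1 (C++11) states (emphasis mine):
"The assignment operator (=) and the compound assignment operators all group right-to-left. All require a modifiable lvalue as their left operand and return an lvalue referring to the left operand. The result in all cases is a bit-field if the left operand is a bit-field. In all cases, the assignment is sequenced after the value computation of the right and left operands, and before the value computation of the assignment expression."
What does "value computation" mean?
1.9p12 gives the answer:
"Accessing an object designated by a volatile glvalue (3.10), modifying an object, calling a library I/O function, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment. Evaluation of an expression (or a sub-expression) in general includes both value computations (including determining the identity of an object for glvalue evaluation and fetching a value previously assigned to an object for prvalue evaluation) and initiation of side effects."
Since your code uses a compound assignment operator, 5.17p7 tells us how this operator behaves:
"The behavior of an expression of the form E1 op= E2 is equivalent to E1 = E1 op E2 except that E1 is evaluated only once."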
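Hence the evaluation of the expression E1 ( == i) involves both determining the identity of the object designated by i and an lvalue-to-rvalue conversion to fetch the value stored in that object. But the evaluations of the two operands E1 and E2 are not sequenced with respect to each other. Thus we get undefined behavior, since the evaluation of E2 ( == ++i + 1) initiates a side effect (updating i).
1.9p15:
"… If a side effect on a scalar object is unsequenced relative to either another side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined."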
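The following statements in your question/comments seem to be the root of your misunderstanding:
"(2) the LHS of the assignment is also an lvalue, so its value evaluation does not involve fetching the current value of i"
"fetching a value can be part of a prvalue evaluation. But in E += F the only prvalue is F, so fetching the value of E is not part of the evaluation of the (lvalue) subexpression E"
Whether an expression is an lvalue or an rvalue says nothing about how that expression is to be evaluated. Some operators require lvalues as their operands, others require rvalues.
Clause 5p8:
"Whenever a glvalue expression appears as an operand of an operator that expects a prvalue for that operand, the lvalue-to-rvalue (4.1), array-to-pointer (4.2), or function-to-pointer (4.3) standard conversions are applied to convert the expression to a prvalue."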
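In a simple assignment, the evaluation of the LHS only requires determining the identity of the object. In a compound assignment such as +=, the LHS must likewise be a modifiable lvalue, but in this case the evaluation of the LHS consists of determining the identity of the object and an lvalue-to-rvalue conversion. It is the result of this conversion (which is a prvalue) that is added to the result (also a prvalue) of the evaluation of the RHS.
"But in E += F the only prvalue is F, so fetching the value of E is not part of the evaluation of the (lvalue) subexpression E"
That is not true, as I explained above. In your example F is a prvalue expression, but F may just as well be an lvalue expression. In that case, the lvalue-to-rvalue conversion is also applied to F. 5.17p7, as cited above, tells us what the semantics of the compound assignment operators are. The standard states that the behavior of E += F is the same as that of E = E + F, except that E is evaluated only once. Here, the evaluation of E includes the lvalue-to-rvalue conversion, because the binary operator + requires its operands to be rvalues.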
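There is no clear case for Undefined Behavior here
Sure, an argument leading to UB can be given, as I indicated in the question, and it has been repeated in the answers given so far. However, it involves a strict reading of 5.17:7 that is both self-contradictory and in contradiction with explicit statements in 5.17:1 about compound assignment. With a weaker reading of 5.17:7 the contradictions disappear, as does the argument for UB. Hence my conclusion is neither that there is UB here, nor that there is clearly defined behavior, but that the text of the standard is inconsistent and should be modified to make clear which reading prevails (and I suppose this means a defect report should be written). Of course one might invoke here the fall-back clause in the standard (the note in 1.3.24) that evaluations for which the standard fails to define the behavior [unambiguously and self-consistently] are undefined behavior, but that would turn any use of compound assignments (including prefix increment/decrement operators) into UB, something that might appeal to certain implementors, but certainly not to programmers.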
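Instead of arguing about the given problem, let me present a slightly modified example that brings out the inconsistency more clearly. Assume one has defined
int& f (int& a) { return a; }
a function that does nothing and returns its (lvalue) argument. Now modify the example to
n += f(++n) + 1;
Note that while some extra conditions about the sequencing of function calls are given in the standard, this would at first glance not seem to affect the example, since there are no side effects at all from the function call (not even locally inside the function): the incrementation happens in the argument expression for f, whose evaluation is not subject to those extra conditions. Indeed, let us apply the crucial argument for undefined behavior (CAUB), namely 5.17:7, which says that the behavior of such a compound assignment is equivalent to that of (in this case)
n = n + f(++n) + 1;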
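except that n is evaluated only once (an exception that makes no difference here). The evaluation of the statement I just wrote clearly has UB (the value computation of the first (prvalue) n in the RHS is unsequenced with respect to the side effect of the ++ operation, which involves the same scalar object (1.9:15), and you're dead).
So the evaluation of n += f(++n) + 1 has undefined behavior as well, right? Wrong! Read in 5.17:1 that
"With respect to an indeterminately-sequenced function call, the operation of a compound assignment is a single evaluation. [ Note: Therefore, a function call shall not intervene between the lvalue-to-rvalue conversion and the side effect associated with any single compound assignment operator. -end note ]"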
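This language is far from as precise as I would like it to be, but I don't think it is a stretch to assume that "indeterminately-sequenced" should mean "with respect to that operation of a compound assignment". The (non-normative, I know) note makes it clear that the lvalue-to-rvalue conversion is part of the operation of the compound assignment. Now, is the call of f indeterminately-sequenced with respect to the operation of the compound assignment of +=? I'm unsure, because the 'sequenced' relation is defined for individual value computations and side effects, not for complete evaluations of operators, which may involve both. In fact the evaluation of a compound assignment operator involves three items: the lvalue-to-rvalue conversion of its left operand, the side effect (the assignment proper), and the value computation of the compound assignment (which is sequenced after the side effect, and returns the original left operand as an lvalue). Note that the existence of the lvalue-to-rvalue conversion is never explicitly mentioned in the standard except in the note cited above; in particular, the standard makes no (other) statement at all regarding its sequencing relative to other evaluations. It is pretty clear that in the example the call of f is sequenced before the side effect and value computation of += (since the call occurs in the value computation of the right operand to +=), but it might be indeterminately-sequenced with respect to the lvalue-to-rvalue conversion part. I recall from my question that since the left operand of += is an lvalue (and necessarily so), one cannot construe the lvalue-to-rvalue conversion to have occurred as part of the value computation of the left operand.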
However, by the principle of the excluded middle, the call to f must either be indeterminately-sequenced with respect to the operation of the compound assignment of +=, or not indeterminately-sequenced with respect to it; in the latter case it must be sequenced before it because it cannot possibly be sequenced after it (the call of f being sequenced before the side effect of +=, and the relation being anti-symmetric). So first assume it is indeterminately-sequenced with respect to the operation. Then the cited clause says that with respect to the call of f the evaluation of += is a single operation, and the note explains that this means the call should not intervene between the lvalue-to-rvalue conversion and the side effect associated with +=; it should either be sequenced before both, or after both. But being sequenced after the side effect is not possible, so it should be before both. This makes (by transitivity) the side effect of ++ sequenced before the lvalue-to-rvalue conversion, exit UB. Next assume the call of f is sequenced before the operation of +=. Then it is in particular sequenced before the lvalue-to-rvalue conversion, and again by transitivity so is the side effect of ++; no UB in this branch either.
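Under the reading argued for above (the call of f precedes both the fetch and the store of +=), the evaluation of n += f(++n) + 1 can be modeled with one statement per step. This is a sketch of that reading only, not a claim about what any compiler does; the function f of the example is renamed identity here, and model_single_operation is a hypothetical name.

```cpp
#include <cassert>

// The do-nothing function from the example, under a different name.
int& identity(int& a) { return a; }

// Models `n += f(++n) + 1` when the call comes before the whole operation of +=.
int model_single_operation(int n) {
    n = n + 1;               // side effect of ++n, sequenced before the call
    int& r = identity(n);    // the call; it may not intervene inside +=
    int rhs = r + 1;         // value computation of the right operand
    int fetched = n;         // lvalue-to-rvalue conversion belonging to +=
    n = fetched + rhs;       // side effect of +=
    return n;
}
```

Starting from 0, this model yields 3: the increment gives 1, the right operand is 2, and the fetch sees the already incremented 1.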
Conclusion: 5.17:1 contradicts 5.17:7 if the latter is taken (CAUB) to be normative for questions of UB resulting from unsequenced evaluations by 1.9:15. As I said, CAUB is self-contradictory as well (by arguments indicated in the question), but this answer is getting too long, so I'll leave it at this for now.
Three problems, and two proposals for resolving them
Trying to understand what the standard writes about these matters, I distinguish three aspects in which the text is hard to interpret; they all are of a nature that the text is insufficiently clear about what model its statements are referring to. (I cite the texts at the end of the numbered items, since I do not know the markup to resume a numbered item after a quote)
1. The text of 5.17:7 is of an apparent simplicity that, although the intention is easy to grasp, gives us little hold when applied to difficult situations. It makes a sweeping claim (equivalent behavior, apparently in all aspects), but its application is thwarted by the exception clause. What if the behavior of E1 = E1 op E2 is undefined? Well then that of E1 op= E2 should be as well. But what if the UB was due to E1 being evaluated twice in E1 = E1 op E2? Then evaluating E1 op= E2 should presumably not be UB, but if so, then defined as what? This is like saying "the youth of the second twin was exactly like that of the first, except that he did not die at childbirth." Frankly, I think this text, which has evolved little since the C version "A compound assignment of the form E1 op = E2 differs from the simple assignment expression E1 = E1 op E2 only in that the lvalue E1 is evaluated only once.", might be adapted to match the changes in the standard.
(5.17) 7 The behavior of an expression of the form E1 op= E2 is equivalent to E1 = E1 op E2 except that E1 is evaluated only once. […]
2. It is not so clear what precisely the actions (evaluations) are between which the 'sequenced' relation is defined. It is said (1.9:12) that evaluation of an expression includes value computations and initiation of side effects. Though this appears to say that an evaluation may have multiple (atomic) components, the sequenced relation is actually mostly defined (e.g. in 1.9:14,15) for individual components, so that it might be better to read this as saying that the notion of "evaluation" encompasses both value computations and (initiation of) side effects. However, in some cases the 'sequenced' relation is defined for the (entire) execution of an expression or statement (1.9:15) or for a function call (5.17:1), even though a passage in 1.9:15 avoids the latter by referring directly to executions in the body of a called function.
(1.9) 12 Evaluation of an expression (or a sub-expression) in general includes both value computations (…) and initiation of side effects. […]
13 Sequenced before is an asymmetric, transitive, pair-wise relation between evaluations executed by a single thread […]
14 Every value computation and side effect associated with a full-expression is sequenced before every value computation and side effect associated with the next full-expression to be evaluated. […]
15 When calling a function (whether or not the function is inline), every value computation and side effect associated with any argument expression, or with the postfix expression designating the called function, is sequenced before execution of every expression or statement in the body of the called function. […] Every evaluation in the calling function (including other function calls) … is indeterminately sequenced with respect to the execution of the called function […]
(5.2.6, 5.17) 1 … With respect to an indeterminately-sequenced function call, …
3. The text should more clearly acknowledge that a compound assignment involves, in contrast to a simple assignment, the action of fetching the value previously assigned to its left operand; this action is like lvalue-to-rvalue conversion, but does not happen as part of the value computation of that left operand, since it is not a prvalue; indeed it is a problem that 1.9:12 only acknowledges such an action for prvalue evaluation. In particular the text should be clearer about which 'sequenced' relations are given for that action, if any.
(1.9) 12 Evaluation of an expression… includes… value computations (including determining the identity of an object for glvalue evaluation and fetching a value previously assigned to an object for prvalue evaluation)
The second point is the least directly related to our concrete question, and I think it can be solved simply by choosing a clear point of view and reformulating passages that seem to indicate a different point of view. Given that one of the main purposes of the old sequence points, and now the 'sequenced' relation, was to make clear that the side effect of postfix-increment operators is unsequenced with respect to actions sequenced after the value computation of that operator (thus giving e.g. i = i++ UB), the point of view must be that individual value computations and (initiation of) individual side effects are "evaluations" for which "sequenced before" may be defined. For pragmatic reasons I would also include two more kinds of (trivial) "evaluations": function entry (so that the language of 1.9:15 may be simplified to: "When calling a function…, every value computation and side effect associated with any of its argument expressions, or with the postfix expression designating the called function, is sequenced before entry of that function") and function exit (so that any action in the function body gets by transitivity sequenced before anything that requires the function value; this used to be guaranteed by a sequence point, but the C++11 standard seems to have lost such a guarantee; this might make calling a function ending with return i++; potentially UB where this is not intended, and used to be safe). Then one can also be clear about the "indeterminately sequenced" relation of function calls: for every function call, and every evaluation that is not (directly or indirectly) part of evaluating that call, that evaluation shall be sequenced (either before or after) with respect to both entry and exit of that function call, and it shall have the same relation in both cases (so that in particular such external actions cannot be sequenced after function entry but before function exit, as is clearly desirable within a single thread).
Now to resolve points 1. and 3., I can see two paths (each affecting both points), which have different consequences for the defined or not behavior of our example:
Compound assignments with two operands, and three evaluations
Compound operations have their two usual operands, an lvalue left operand and a prvalue right operand. To settle the unclarity of 3., it is included in 1.9:12 that fetching the value previously assigned to an object also may occur in compound assignments (rather than only for prvalue evaluation). The semantics of compound assignments are defined by changing 5.17:7 to
In a compound assignment op=, the value previously assigned to the object referred to by the left operand is fetched, the operator op is applied with this value as left operand and the right operand of op= as right operand, and the resulting value replaces that of the object referred to by the left operand.
(That gives two evaluations, the fetch and the side effect; a third evaluation is the trivial value computation of the compound operator, sequenced after both other evaluations.)
For clarity, state clearly in 1.9:15 that value computations in operands are sequenced before all value computations associated with the operator (rather than just those for the result of the operator ), which ensures that evaluating the lvalue left operand is sequenced before fetching its value (one can hardly imagine otherwise), and also sequences the value computation of the right operand before that fetch, thus excluding UB in our example. While at it, I see no reason not to also sequence value computations in operands before any side effects associated with the operator (as they clearly must); this would make mentioning this explicitly for (compound) assignments in 5.17:1 superfluous. On the other hand do mention there that the value fetching in a compound assignment is sequenced before its side effect.
Compound assignments with three operands, and two evaluations
In order to obtain that the fetch in a compound assignment will be unsequenced with respect to the value computation of the right operand, making our example UB, the clearest way seems to be to give compound operators an implicit third (middle) operand, a prvalue, not represented by a separate expression, but obtained by lvalue-to-rvalue conversion from the left operand (this three-operand nature corresponds to the expanded form of compound assignments, but by obtaining the middle operand from the left operand, it is ensured that the value is fetched from the same object to which the result will be stored, a crucial guarantee that is only vaguely and implicitly given in the current formulation through the "except that E1 is evaluated only once" clause). The difference with the previous solution is that the fetch is now a genuine lvalue-to-rvalue conversion (since the middle operand is a prvalue) and is performed as part of the value computation of the operands to the compound assignment, which makes it naturally unsequenced with the value computation of the right operand. It should be stated somewhere (in a new clause that describes this implicit operand) that the value computation of the left operand is sequenced before this lvalue-to-rvalue conversion (it clearly must). Now 1.9:12 can be left as it is, and in place of 5.17:7 I propose
In a compound assignment op= with left operand a (an lvalue), and middle and right operands b and c respectively (both prvalues), the operator op is applied with b as left operand and c as right operand, and the resulting value replaces that of the object referred to by a.
(That gives one evaluation, the side effect, with as second evaluation the trivial value computation of the compound operator, sequenced after it.)
The changes to 1.9:15 and 5.17:1 suggested in the previous solution could still apply, but would not give our original example defined behavior. However, the modified example at the top of this answer would still have defined behavior, unless the part of 5.17:1 saying "compound assignment is a single operation" is scrapped or modified (there is a similar passage in 5.2.6 for postfix increment/decrement). The existence of those passages would suggest that detaching the fetch and store operations within a single compound assignment or postfix increment/decrement (and by extension making our example UB) was not the intention of those who wrote the current standard, but this of course is mere guesswork.
From the compiler writer's perspective, they don't care about "i += ++i + 1", because whatever the compiler does, the programmer may not get the correct result, but they surely get what they deserve. And nobody writes code like that. What the compiler writer cares about is
*p += ++(*q) + 1;
The code must read *p and *q, increase *q by 1, and increase *p by some amount that is calculated. Here the compiler writer cares about restrictions on the order of read and write operations. Obviously if p and q point to different objects, the order makes no difference, but if p == q then it will make a difference. Again, p will be different from q unless the programmer writing the code is insane.
By making the code undefined, the language allows the compiler to produce the fastest possible code without caring for insane programmers. By making the code defined, the language forces the compiler to produce code that conforms to the standard even in insane cases, which may make it run slower. Both compiler writers and sane programmers don't like that.
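The ordering freedom discussed here can be illustrated by writing *p += ++(*q) + 1 in two explicit orders and comparing the aliased and non-aliased cases. This is a sketch with hypothetical function names; it shows two orders a compiler might pick, not what any particular compiler emits.

```cpp
#include <cassert>

// Order 1: read *p before incrementing *q.
void update_read_first(int* p, int* q) {
    int old_p = *p;          // fetch for +=
    *q = *q + 1;             // side effect of ++(*q)
    *p = old_p + *q + 1;     // store for +=
}

// Order 2: increment *q before reading *p.
void update_increment_first(int* p, int* q) {
    *q = *q + 1;             // side effect of ++(*q)
    int old_p = *p;          // fetch for +=
    *p = old_p + *q + 1;     // store for +=
}

// Hypothetical driver: runs an update on zero-initialized ints, aliased or not,
// and returns the final value of the object p points to.
int run(void (*update)(int*, int*), bool alias) {
    int x = 0, y = 0;
    update(&x, alias ? &x : &y);
    return x;
}
```

With distinct objects both orders leave 2 in *p; with p == q the first order gives 2 and the second gives 3, which is the difference the compiler would have to care about if the aliased case were defined.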
So even if the behaviour is defined in C++11, it would be very dangerous to use it, because (a) a compiler might not be changed from C++03 behaviour, and (b) it might be undefined behaviour in C++14, for the reasons above.