Is there a performance difference between i++ and ++i in C++?

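We have the question: is there a performance difference between i++ and ++i in C?

What's the answer for C++?

[Executive Summary: Use ++i if you don't have a specific reason to use i++.]

For C++, the answer is a bit more complicated.

If i is a simple type (not an instance of a C++ class), then the answer given for C ("No, there is no performance difference") holds, since the compiler is generating the code.

However, if i is an instance of a C++ class, then i++ and ++i are making calls to one of the operator++ functions. Here's a standard pair of these functions: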

Foo& Foo::operator++() // called for ++i
{
    this->data += 1;
    return *this;
}

Foo Foo::operator++(int ignored_dummy_value) // called for i++
{
    Foo tmp(*this); // variable "tmp" cannot be optimized away by the compiler
    ++(*this);
    return tmp;
}

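Since the compiler isn't generating the code, but just calling an operator++ function, there is no way to optimize away the tmp variable and its associated copy constructor. If the copy constructor is expensive, then this can have a significant performance impact.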
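To make the extra copy visible, here is a minimal sketch that counts copy-constructor calls. CountedFoo and the two helper functions are hypothetical names invented for this illustration. Because the counter makes each copy observable, the sketch only shows what happens at the source level; it says nothing about what an optimizer does with a trivial, inlined copy constructor.

```cpp
#include <cassert>

// Hypothetical class for illustration: counts calls to its copy constructor.
struct CountedFoo {
    int data = 0;
    static int copies;

    CountedFoo() = default;
    CountedFoo(const CountedFoo& other) : data(other.data) { ++copies; }

    CountedFoo& operator++() {       // prefix: increments in place
        data += 1;
        return *this;
    }

    CountedFoo operator++(int) {     // postfix: must keep the old value
        CountedFoo tmp(*this);       // one copy-constructor call
        ++(*this);
        return tmp;                  // may add another copy if not elided
    }
};

int CountedFoo::copies = 0;

// Number of copy-constructor calls caused by one prefix increment.
int copies_for_prefix() {
    CountedFoo f;
    const int before = CountedFoo::copies;
    ++f;
    assert(f.data == 1);
    return CountedFoo::copies - before;
}

// Number of copy-constructor calls caused by one postfix increment.
int copies_for_postfix() {
    CountedFoo f;
    const int before = CountedFoo::copies;
    f++;
    assert(f.data == 1);
    return CountedFoo::copies - before;
}
```

Under these assumptions copies_for_prefix() reports no copies, while copies_for_postfix() reports at least one, even though the result of f++ is thrown away.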

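Yes. There is.

The ++ operator may or may not be defined as a function. For primitive types (int, double, ...) the operators are built in, so the compiler will probably be able to optimize your code. But in the case of an object that defines the ++ operator, things are different.

The operator++(int) function must create a copy. That is because postfix ++ is expected to return a different value than the one the object ends up holding: it must save its value in a temporary variable, increment its value and return the temporary. In the case of operator++(), prefix ++, there is no need to create a copy: the object can increment itself and then simply return itself.

Here is an illustration of the point: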

struct C
{
    C& operator++();   // prefix
    C operator++(int); // postfix

private:

    int i_;
};

C& C::operator++()
{
    ++i_;
    return *this; // self, no copy created
}

C C::operator++(int ignored_dummy_value)
{
    C t(*this);
    ++(*this);
    return t; // return a copy
}

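Every time you call operator++(int) you must create a copy, and the compiler can't do anything about it. When given the choice, use operator++(); this way you don't save a copy. It might be significant in the case of many increments (large loop?) and/or large objects.

Here's a benchmark for the case when the increment operators are in different translation units. Compiled with g++ 4.5.

Ignore the style issues for now.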

// a.cc
#include <array>
#include <ctime>
#include <iostream>

class Something {
public:
    Something& operator++();
    Something operator++(int);
private:
    std::array<int,PACKET_SIZE> data; // PACKET_SIZE comes from -DPACKET_SIZE=...
};

int main () {
    Something s{}; // value-initialized so the increments read defined values

    for (int i=0; i<1024*1024*30; ++i) ++s; // warm up
    std::clock_t a = std::clock();
    for (int i=0; i<1024*1024*30; ++i) ++s;
    a = std::clock() - a;

    for (int i=0; i<1024*1024*30; ++i) s++; // warm up
    std::clock_t b = std::clock();
    for (int i=0; i<1024*1024*30; ++i) s++;
    b = std::clock() - b;

    std::cout << "a=" << (a/double(CLOCKS_PER_SEC))
              << ", b=" << (b/double(CLOCKS_PER_SEC)) << '\n';
    return 0;
}

O(n) increment

Test

// b.cc
#include <array>

class Something {
public:
    Something& operator++();
    Something operator++(int);
private:
    std::array<int,PACKET_SIZE> data;
};

// Prefix: touches every element, so the cost grows with PACKET_SIZE.
Something& Something::operator++()
{
    for (auto it=data.begin(), end=data.end(); it!=end; ++it)
        ++*it;
    return *this;
}

// Postfix: copies the whole array, then delegates to prefix.
Something Something::operator++(int)
{
    Something ret = *this;
    ++*this;
    return ret;
}

Results

Results (timings are in seconds) with g++ 4.5 on a virtual machine:

Flags (--std=c++0x)         ++i     i++
-DPACKET_SIZE=50 -O1       1.70    2.39
-DPACKET_SIZE=50 -O3       0.59    1.00
-DPACKET_SIZE=500 -O1     10.51   13.28
-DPACKET_SIZE=500 -O3      4.28    6.82

O(1) increment

Test

Let us now take the following file:

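// c.cc
#include <array>

class Something {
public:
    Something& operator++();
    Something operator++(int);
private:
    std::array<int,PACKET_SIZE> data;
};

// Prefix: deliberately does no work (constant-time increment).
Something& Something::operator++()
{
    return *this;
}

// Postfix: still copies the whole array before delegating to prefix.
Something Something::operator++(int)
{
    Something ret = *this;
    ++*this;
    return ret;
}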

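It does nothing in the increment. This simulates the case where incrementing has constant complexity.

Results

The results now differ dramatically: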

Flags (--std=c++0x)         ++i     i++
-DPACKET_SIZE=50 -O1       0.05    0.74
-DPACKET_SIZE=50 -O3       0.08    0.97
-DPACKET_SIZE=500 -O1      0.05    2.79
-DPACKET_SIZE=500 -O3      0.08    2.18
-DPACKET_SIZE=5000 -O3     0.07   21.90
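For anyone who wants to repeat the measurement, the timing part can be factored into a small helper. This is only a sketch: seconds_for is a hypothetical name, and it uses std::chrono::steady_clock (wall-clock time) rather than the std::clock() call (CPU time) from the benchmark above, so the numbers are not directly comparable with the tables.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: runs `body` `iterations` times and returns the
// elapsed wall-clock time in seconds, measured with a monotonic clock.
template <typename Body>
double seconds_for(long iterations, Body body) {
    assert(iterations >= 0);
    const auto start = std::chrono::steady_clock::now();
    for (long i = 0; i < iterations; ++i) {
        body();
    }
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

With a Something object s in scope, a call such as seconds_for(1024L*1024*30, [&s] { ++s; }) would time the prefix form, and the same call with s++ would time the postfix form. As in the original test, the operators should live in another translation unit so the calls are not simply optimized away.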

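Conclusion

Performance-wise

If you do not need the previous value, make it a habit to use pre-increment. Be consistent even with built-in types: you'll get used to it, and you won't run the risk of an unnecessary performance loss if you ever replace a built-in type with a custom type.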
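The "replace a built-in type with a custom type" case shows up most often in templates. A minimal sketch, with count_steps as a made-up helper name: the same loop body is instantiated for a plain int and for a list iterator, and it only relies on != and prefix ++.

```cpp
#include <cassert>
#include <cstddef>
#include <list>

// Hypothetical helper: counts the steps needed to get from `first` to `last`.
// It works unchanged for integers and for iterators.
template <typename T>
std::size_t count_steps(T first, T last) {
    std::size_t steps = 0;
    for (; first != last; ++first) {   // prefix: no temporary is requested
        ++steps;
    }
    return steps;
}

// Example use; the asserts document the expected results.
void example_count_steps() {
    assert(count_steps(0, 5) == 5);                    // built-in int
    const std::list<int> xs{1, 2, 3};
    assert(count_steps(xs.begin(), xs.end()) == 3);    // class-type iterator
}
```

Written with first++ instead, the int instantiation would most likely compile to the same code, while the iterator instantiation would ask for a copy of the iterator on every step.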

Semantic-wise

i++ says: increment i, I am interested in the previous value, though.
++i says: increment i, I am interested in the current value, or: increment i, no interest in the previous value. Again, you'll get used to it, even if you are not right now.
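A short sketch of the two meanings with built-in ints (the function name is just for illustration):

```cpp
#include <cassert>

// Shows which value each form yields; the asserts state the results.
void example_increment_values() {
    int i = 5;
    int old_value = i++;   // postfix: yields the value before the increment
    assert(old_value == 5 && i == 6);

    int new_value = ++i;   // prefix: yields the value after the increment
    assert(new_value == 7 && i == 7);

    // Typical use of the previous value: write at the current index,
    // then move on to the next slot.
    int buffer[3] = {0, 0, 0};
    int pos = 0;
    buffer[pos++] = 10;
    buffer[pos++] = 20;
    assert(buffer[0] == 10 && buffer[1] == 20 && pos == 2);
}
```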

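Knuth.

Premature optimization is the root of all evil. As is premature pessimization.

It's not entirely correct to say that the compiler can't optimize away the temporary variable copy in the postfix case. A quick test with VC shows that it, at least, can do that in certain cases.

In the following example, the code generated is identical for prefix and postfix, for instance: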

#include <stdio.h>

class Foo
{
public:

    Foo() { myData=0; }
    Foo(const Foo &rhs) { myData=rhs.myData; }

    const Foo& operator++()
    {
        this->myData++;
        return *this;
    }

    const Foo operator++(int)
    {
        Foo tmp(*this);
        this->myData++;
        return tmp;
    }

    int GetData() { return myData; }

private:

    int myData;
};

int main(int argc, char* argv[])
{
    Foo testFoo;

    // Read the count at run time so the loop cannot be folded to a constant.
    int count;
    printf("Enter loop count: ");
    scanf("%d", &count);

    for(int i=0; i<count; i++)
    {
        testFoo++;
    }

    printf("Value: %d\n", testFoo.GetData());
}

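Whether you write ++testFoo or testFoo++, you'll still get the same resulting code. In fact, without reading the count in from the user, the optimizer got the whole thing down to a constant. So this: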

for(int i=0; i<10; i++)
{
    testFoo++;
}

printf("Value: %d\n", testFoo.GetData());

Resulted in the following:

00401000 push 0Ah
00401002 push offset string "Value: %d\n" (402104h)
00401007 call dword ptr [__imp__printf (4020A0h)]

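So while it's certainly the case that the postfix version could be slower, it may well be that the optimizer is good enough to get rid of the temporary copy if you're not using it.

The Google C++ Style Guide says: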

Preincrement and Predecrement

Use prefix form (++i) of the increment and decrement operators with
iterators and other template objects.

Definition: When a variable is incremented (++i or i++) or decremented (--i or
i--) and the value of the expression is not used, one must decide
whether to preincrement (decrement) or postincrement (decrement).

Pros: When the return value is ignored, the "pre" form (++i) is never less
efficient than the "post" form (i++), and is often more efficient.
This is because post-increment (or decrement) requires a copy of i to
be made, which is the value of the expression. If i is an iterator or
other non-scalar type, copying i could be expensive. Since the two
types of increment behave the same when the value is ignored, why not
just always pre-increment?

Cons: The tradition developed, in C, of using post-increment when the
expression value is not used, especially in for loops. Some find
post-increment easier to read, since the "subject" (i) precedes the
"verb" (++), just like in English.

Decision: For simple scalar (non-object) values there is no reason to prefer one
form and we allow either. For iterators and other template types, use
pre-increment.

@Ketan

...raises over-looked detail regarding intent vs performance. There are times when we want to use iter++ instead of ++iter.

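Obviously post- and pre-increment have different semantics, and I'm sure everyone agrees that when the result is used you should use the appropriate operator. I think the question is what one should do when the result is discarded (as in for loops). The answer to this question (IMHO) is that, since the performance considerations are negligible at best, you should do what is more natural. For me ++i is more natural, but my experience tells me that I'm in a minority and that using i++ will cause less mental overhead for most people reading your code.

After all, that's the reason the language is not called "++C".[*]

[*] Insert obligatory discussion about ++C being a more logical name.

I would like to point out an excellent post by Andrew Koenig that appeared on Code Talk very recently.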

http://dobbscodetalk.com/index.php?option=com_myblog&show=Efficiency-versus-intent.html&Itemid=29

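At our company we also use the convention of ++iter for consistency and performance where applicable. But Andrew raises an overlooked detail regarding intent vs. performance. There are times when we want to use iter++ instead of ++iter.

So, first decide your intent, and if pre or post does not matter, go with pre, as it will have some performance benefit by avoiding the creation of an extra object that is then thrown away.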
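One commonly cited case where iter++ expresses the intent (offered as an illustration, not necessarily the example from the article) is erasing from a node-based container while iterating. The helper name erase_negative is made up for this sketch.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper: removes every entry whose value is negative.
void erase_negative(std::map<std::string, int>& m) {
    for (auto it = m.begin(); it != m.end(); /* advanced in the body */) {
        if (it->second < 0) {
            // it++ advances `it` first and hands the old position to erase,
            // so `it` never refers to the erased element.
            m.erase(it++);
        } else {
            ++it;  // result unused, so prefix is the natural choice
        }
    }
}

// Example use; the asserts document the expected results.
void example_erase_negative() {
    std::map<std::string, int> m{{"a", 1}, {"b", -2}, {"c", 3}};
    erase_negative(m);
    assert(m.size() == 2);
    assert(m.count("b") == 0);
}
```

Since C++11, std::map::erase also returns the iterator following the erased element, so it = m.erase(it) is an alternative that avoids the question altogether.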

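Mark: Just wanted to point out that operator++ functions are good candidates for inlining, and if the compiler elects to do so, the redundant copy will be eliminated in most cases (e.g. for POD types, which iterators usually are).

That said, it's still better style to use ++iter in most cases. :-)

The performance difference between ++i and i++ becomes more apparent when you think of operators as value-returning functions and look at how they are implemented. To make it easier to understand what's happening, the following code examples use int as if it were a struct.

++i increments the variable, then returns the result. This can be done in place and with minimal CPU time, requiring only one line of code in many cases: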

// Pseudocode: int is treated as if it were a struct.
int& int::operator++() {
    return *this += 1;
}

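But the same cannot be said of i++.

Post-incrementing, i++, is often seen as returning the original value before incrementing. However, a function can only return a result when it is finished. As a result, it becomes necessary to create a copy of the variable containing the original value, increment the variable, then return the copy holding the original value: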

// Pseudocode: int is treated as if it were a struct.
int int::operator++(int& _Val) {
    int _Original = _Val;
    _Val += 1;
    return _Original;
}

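When there is no functional difference between pre-increment and post-increment, the compiler can perform optimization such that there is no performance difference between the two. However, if a composite data type such as a struct or class is involved, the copy constructor will be called on post-increment, and it will not be possible to perform this optimization if a deep copy is needed. As such, pre-increment is generally faster and requires less memory than post-increment.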
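Since int is not really a class, the two pseudocode operators cannot be compiled as written. The same idea can be expressed with ordinary free functions; pre_increment and post_increment are names invented for this sketch.

```cpp
#include <cassert>

// Stand-in for prefix ++: modifies the variable and returns a reference to it.
int& pre_increment(int& value) {
    value += 1;
    return value;            // refers to the variable itself, no copy
}

// Stand-in for postfix ++: has to remember the old value before changing it.
int post_increment(int& value) {
    int original = value;    // copy of the old value
    value += 1;
    return original;         // the copy is what the caller receives
}

// Example use; the asserts document the expected results.
void example_increment_functions() {
    int x = 1;
    assert(pre_increment(x) == 2 && x == 2);
    assert(post_increment(x) == 2 && x == 3);
}
```

For an int the copy is a single register move and disappears under optimization; the extra step only becomes a real cost when copying the type is itself expensive.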

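++i - faster when not using the return value

i++ - faster when using the return value

When the return value is not used, the compiler is guaranteed not to use a temporary in the case of ++i. It is not guaranteed to be faster, but it is guaranteed not to be slower.

When the return value is used, i++ allows the processor to push both the increment and the left-hand side into the pipeline, since they don't depend on each other. ++i may stall the pipeline, because the processor cannot start on the left-hand side until the pre-increment operation has meandered all the way through. Again, a pipeline stall is not guaranteed, since the processor may find other useful things to stick in.

@Mark: I deleted my previous answer because it was a bit flip, and it deserved a downvote for that alone. I actually think it's a good question, in the sense that it asks what's on the minds of a lot of people.

The usual answer is that ++i is faster than i++, and no doubt it is, but the bigger question is "when should you care?"

If the fraction of CPU time spent incrementing iterators is less than 10%, then you may not care.

If the fraction of CPU time spent incrementing iterators is greater than 10%, you can look at which statements are doing that iterating. See if you could just increment integers rather than using iterators. Chances are you could, and while it may be in some sense less desirable, chances are pretty good that you will save essentially all the time spent in those iterators.

I've seen an example where the iterator incrementing was consuming well over 90% of the time. In that case, switching to integer incrementing reduced execution time by essentially that amount (i.e. better than a 10x speedup).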
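What "increment integers rather than using iterators" looks like in practice, as a rough sketch over a std::vector (function names are invented for the example):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sums the elements by walking iterators.
long sum_with_iterators(const std::vector<int>& v) {
    long total = 0;
    for (auto it = v.begin(); it != v.end(); ++it) {
        total += *it;
    }
    return total;
}

// Same loop, but incrementing a plain integer index instead.
long sum_with_index(const std::vector<int>& v) {
    long total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        total += v[i];
    }
    return total;
}

// Example use; both versions must agree on the result.
void example_sums() {
    const std::vector<int> v{1, 2, 3, 4};
    assert(sum_with_iterators(v) == 10);
    assert(sum_with_index(v) == 10);
}
```

Whether the index version is actually faster depends on the iterator type and the build settings. In optimized builds std::vector iterators are typically thin wrappers around a pointer, so a gap of the size described above is more plausible with checked (debug) iterators or heavyweight custom iterators. Measure before rewriting.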

@wilhelmtell

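The compiler can elide the temporary. Verbatim from the other thread:

The C++ compiler is allowed to eliminate stack-based temporaries even if doing so changes program behavior. MSDN link for VC 8: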

http://msdn.microsoft.com/en-us/library/ms364057(VS.80).aspx
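The rule being referred to is copy elision. A minimal sketch, using a made-up Tracked type whose copy constructor has a visible side effect: the compiler is still allowed to skip the copy when a local object is returned by value.

```cpp
#include <cassert>

// Hypothetical type whose copy constructor has a visible side effect.
struct Tracked {
    static int copies;
    Tracked() = default;
    Tracked(const Tracked&) { ++copies; }
};

int Tracked::copies = 0;

// Returns a named local by value; the compiler may construct it directly
// in the caller's storage (named return value optimization).
Tracked make_tracked() {
    Tracked t;
    return t;
}

// Reports how many copies one call produced with this compiler and settings.
int copies_from_make_tracked() {
    const int before = Tracked::copies;
    Tracked result = make_tracked();
    (void)result;  // silence unused-variable warnings
    const int made = Tracked::copies - before;
    assert(made >= 0);
    return made;
}
```

The count is deliberately not fixed in the code: eliding the copy of a named local is permitted but optional, so the result can differ between compilers and flags. Many compilers report zero copies here even without optimization flags, but that is an observation, not a guarantee.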

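Time to provide folks with gems of wisdom ;) - there is a simple trick to make C++ postfix increment behave pretty much the same as prefix increment (I invented this for myself, but then saw it in other people's code as well, so I'm not alone).

Basically, the trick is to use a helper class to postpone the increment until after the return, and RAII comes to the rescue: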

#include <iostream>

class Data {
    private: class DataIncrementer {
        private: Data& _dref;

        public: DataIncrementer(Data& d) : _dref(d) {}

        // Runs when the helper goes out of scope, i.e. after the
        // return value of the postfix operator has been created.
        public: ~DataIncrementer() {
            ++_dref;
        }
    };

    private: int _data;

    public: Data() : _data{0} {}

    public: Data(int d) : _data{d} {}

    public: Data(const Data& d) : _data{ d._data } {}

    public: Data& operator=(const Data& d) {
        _data = d._data;
        return *this;
    }

    public: ~Data() {}

    public: Data& operator++() { // prefix
        ++_data;
        return *this;
    }

    public: Data operator++(int) { // postfix
        DataIncrementer t(*this);
        return *this; // copied first, incremented by ~DataIncrementer afterwards
    }

    public: operator int() {
        return _data;
    }
};

int
main() {
    Data d(1);

    std::cout << d << '\n';
    std::cout << ++d << '\n';
    std::cout << d++ << '\n';
    std::cout << d << '\n';

    return 0;
}

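I invented this for some heavy custom iterator code, and it cuts down the run time. The cost of postfix compared with prefix is now one reference, and where a custom operator does a lot of heavy moving around, prefix and postfix yielded the same run time for me.

Both are as fast ;)
If you look at it, it is the same calculation for the processor; only the order in which it is done differs.

For example, the following code: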

#include <stdio.h>

int main()
{
    int a = 0;
    a++;
    int b = 0;
    ++b;
    return 0;
}

Generates the following assembly:

0x0000000100000f24 <main+0>: push %rbp
0x0000000100000f25 <main+1>: mov %rsp,%rbp
0x0000000100000f28 <main+4>: movl $0x0,-0x4(%rbp)
0x0000000100000f2f <main+11>: incl -0x4(%rbp)
0x0000000100000f32 <main+14>: movl $0x0,-0x8(%rbp)
0x0000000100000f39 <main+21>: incl -0x8(%rbp)
0x0000000100000f3c <main+24>: mov $0x0,%eax
0x0000000100000f41 <main+29>: leaveq
0x0000000100000f42 <main+30>: retq

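You see that for both a++ and ++b it's an incl mnemonic, so it's the same operation ;)

The reason why you ought to use ++i even on built-in types, where there's no performance advantage, is to create a good habit for yourself.

The intended question was about the case where the result is unused (that's clear from the question for C). Can somebody fix this, since the question is "community wiki"?

About premature optimization, Knuth is often quoted. That's right. But Donald Knuth would never use that to defend the horrible code you can see these days. Ever seen a = b + c among Java Integers (not int)? That amounts to 3 boxing/unboxing conversions. Avoiding stuff like that is important. And uselessly writing i++ instead of ++i is the same mistake.
EDIT: As phresnel nicely puts it in a comment, this can be summed up as "premature optimization is evil, as is premature pessimization".

Even the fact that people are more used to i++ is an unfortunate C legacy, caused by a conceptual mistake by K&R (if you follow the intent argument, that's a logical conclusion; and defending K&R because they're K&R is meaningless: they're great, but they aren't as great as language designers; countless mistakes exist in the C design, ranging from gets() to strcpy() to the strncpy() API, which should have had the strlcpy() API from day 1).

By the way, I'm one of those not used enough to C++ who finds ++i annoying to read. Still, I use it, since I acknowledge that it's right.

++i is faster than i++ because it doesn't return an old copy of the value.

It's also more intuitive: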