Problem description
I wrote a simple multithreaded program as follows:
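#include <chrono>
#include <future>
#include <iostream>
#include <thread>

static bool finished = false;  // plain bool shared between two threads

int func()
{
    size_t i = 0;
    while (!finished)  // unsynchronized read in the worker thread
        ++i;
    return i;
}

int main()
{
    auto result = std::async(std::launch::async, func);
    std::this_thread::sleep_for(std::chrono::seconds(1));
    finished = true;  // unsynchronized write in the main thread
    std::cout << "result =" << result.get();
    std::cout << "\nmain thread id=" << std::this_thread::get_id() << std::endl;
}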
It behaves normally in debug mode in Visual Studio, or with -O0 in gcc, and prints the result after 1 second. But it hangs and prints nothing in Release mode, or with -O1, -O2, or -O3.
Recommended answer
Two threads accessing a non-atomic, unguarded variable, where at least one of them writes to it, is undefined behavior (UB). This concerns finished. You could make finished of type std::atomic<bool> to fix this.
My fix:
#include <atomic>
#include <chrono>
#include <future>
#include <iostream>
#include <thread>

// Brace initialization also compiles before C++17.
static std::atomic<bool> finished{false};

int func()
{
    size_t i = 0;
    while (!finished)  // atomic load on every iteration
        ++i;
    return i;
}

int main()
{
    auto result = std::async(std::launch::async, func);
    std::this_thread::sleep_for(std::chrono::seconds(1));
    finished = true;  // atomic store
    std::cout << "result =" << result.get();
    std::cout << "\nmain thread id=" << std::this_thread::get_id() << std::endl;
}
Output:
result =1023045342
main thread id=140147660588864
Live demo on coliru
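The answer names two ways out of the data race: make the variable atomic, or guard it. As a rough sketch of the guarded option, the flag could be read and written only while holding a std::mutex. The names finished_mutex, finished_guarded, is_finished, set_finished, count_until_finished, and run_guarded are made up for this illustration and do not appear in the original post.

```cpp
#include <chrono>
#include <cstddef>
#include <future>
#include <mutex>
#include <thread>

// Hypothetical names, for illustration only.
static std::mutex finished_mutex;
static bool finished_guarded = false;

// Read the flag while holding the lock.
bool is_finished()
{
    std::lock_guard<std::mutex> lock(finished_mutex);
    return finished_guarded;
}

// Write the flag while holding the same lock.
void set_finished(bool value)
{
    std::lock_guard<std::mutex> lock(finished_mutex);
    finished_guarded = value;
}

// Spin until the flag is set, counting iterations.
std::size_t count_until_finished()
{
    std::size_t i = 0;
    while (!is_finished())
        ++i;
    return i;
}

// Start the worker, let it spin for `delay`, stop it, and return its count.
std::size_t run_guarded(std::chrono::milliseconds delay)
{
    set_finished(false);
    auto result = std::async(std::launch::async, count_until_finished);
    std::this_thread::sleep_for(delay);
    set_finished(true);
    return result.get();
}
```

Calling run_guarded(std::chrono::seconds(1)) from main would mirror the program above. For a single flag, std::atomic<bool> is usually the lighter choice; a mutex tends to pay off when several variables must change together.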
Somebody may think: 'It's a bool, probably one bit. How can this be non-atomic?' (I did when I started with multithreading myself.)
But note that lack of tearing is not the only thing that std::atomic gives you. It also makes concurrent read and write access from multiple threads well-defined, which stops the compiler from assuming that re-reading the variable will always see the same value.
Leaving a bool unguarded and non-atomic can cause additional issues:
- The compiler might decide to optimize the variable into a register, or even CSE multiple accesses into one and hoist the load out of the loop.
- The variable might be cached for a CPU core. (In real life, CPUs have coherent caches, so this is not a real problem. But the C++ standard is loose enough to cover hypothetical C++ implementations on non-coherent shared memory, where atomic<bool> with memory_order_relaxed store/load would work but volatile wouldn't. Using volatile for this would be UB, even though it works in practice on real C++ implementations.)
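The first point can be illustrated with a single-threaded sketch. It is only an approximation of what an optimizer is permitted to do, not the output of any particular compiler. The function names and the max_iters cap are assumptions added here so that the example always terminates.

```cpp
#include <cstddef>

// As written in the source: the flag is read on every iteration.
// `max_iters` is a hypothetical cap so the sketch always terminates.
std::size_t spin_as_written(const bool& flag, std::size_t max_iters)
{
    std::size_t i = 0;
    while (!flag && i < max_iters)
        ++i;
    return i;
}

// What an optimizer may effectively produce for a plain bool:
// the load is hoisted out of the loop and the flag is read only once.
std::size_t spin_as_hoisted(const bool& flag, std::size_t max_iters)
{
    const bool stop = flag;  // single load, before the loop
    std::size_t i = 0;
    while (!stop && i < max_iters)
        ++i;
    return i;
}
```

Within one thread the two functions always return the same value, which is why the transformation is legal for a plain bool. Without the cap, the hoisted form never notices a write made by another thread after the loop has started, which matches the hang seen with -O1 and above.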
To prevent this from happening, the compiler must be told explicitly not to do it.
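One way to say this explicitly is to spell out the atomic loads and stores. The sketch below uses memory_order_relaxed, which the answer mentions as sufficient for a flag like this. The names stop_requested, count_until_stopped, and run_relaxed are hypothetical. Relaxed ordering is assumed to be enough here only because no other data is published through the flag.

```cpp
#include <atomic>
#include <chrono>
#include <cstddef>
#include <future>
#include <thread>

// Hypothetical name, for illustration only.
static std::atomic<bool> stop_requested{false};

// Spin until the flag is set; each iteration performs a real atomic load,
// so the compiler may not replace the loop condition with a cached value.
std::size_t count_until_stopped()
{
    std::size_t i = 0;
    while (!stop_requested.load(std::memory_order_relaxed))
        ++i;
    return i;
}

// Start the worker, let it spin for `delay`, stop it, and return its count.
std::size_t run_relaxed(std::chrono::milliseconds delay)
{
    stop_requested.store(false, std::memory_order_relaxed);
    auto result = std::async(std::launch::async, count_until_stopped);
    std::this_thread::sleep_for(delay);
    stop_requested.store(true, std::memory_order_relaxed);
    return result.get();
}
```

If the worker also had to see other writes made before the store, release/acquire ordering or the default sequentially consistent ordering would be the safer choice.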
I'm a little bit surprised by the evolving discussion concerning the potential relation of volatile to this issue. Thus, I'd like to add my two cents:
- Is volatile useful with threads
- Who's afraid of a big bad optimizing compiler?