Question
C++11 introduced a standardized memory model, but what exactly does that mean? And how is it going to affect C++ programming?
This article (by Gavin Clarke who quotes Herb Sutter) says that,
The memory model means that C++ code now has a standardized library to call regardless of who made the compiler and on what platform it's running. There's a standard way to control how different threads talk to the processor's memory.
"When you are talking about splitting [code] across different cores that's in the standard, we are talking about the memory model. We are going to optimize it without breaking the following assumptions people are going to make in the code," Sutter said.
Well, I can memorize this and similar paragraphs available online (as I've had my own memory model since birth :P) and can even post as an answer to questions asked by others, but to be honest, I don't exactly understand this.
C++ programmers used to develop multi-threaded applications even before, so how does it matter if it's POSIX threads, or Windows threads, or C++11 threads? What are the benefits? I want to understand the low-level details.
I also get this feeling that the C++11 memory model is somehow related to C++11 multi-threading support, as I often see these two together. If it is, how exactly? Why should they be related?
As I don't know how the internals of multi-threading work, and what memory model means in general, please help me understand these concepts. :-)
Recommended answer
First, you have to learn to think like a Language Lawyer.
The C++ specification does not make reference to any particular compiler, operating system, or CPU. It makes reference to an abstract machine that is a generalization of actual systems. In the Language Lawyer world, the job of the programmer is to write code for the abstract machine; the job of the compiler is to actualize that code on a concrete machine. By coding rigidly to the spec, you can be certain that your code will compile and run without modification on any system with a compliant C++ compiler, whether today or 50 years from now.
The abstract machine in the C++98/C++03 specification is fundamentally single-threaded. So it is not possible to write multi-threaded C++ code that is "fully portable" with respect to the spec. The spec does not even say anything about the atomicity of memory loads and stores or the order in which loads and stores might happen, never mind things like mutexes.
Of course, you can write multi-threaded code in practice for particular concrete systems – like pthreads or Windows. But there is no standard way to write multi-threaded code for C++98/C++03.
The abstract machine in C++11 is multi-threaded by design. It also has a well-defined memory model; that is, it says what the compiler may and may not do when it comes to accessing memory.
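To make the contrast with pthreads or Windows threads concrete, here is a minimal sketch that starts and joins threads using only the standard library's std::thread. The function name sum_in_two_threads and the way the work is split are illustrative assumptions, not part of the original answer.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical helper: sums a vector by giving each half to its own thread.
// Only standard C++11 facilities are used, so nothing here is tied to
// pthreads or the Windows API.
long sum_in_two_threads(const std::vector<int>& values) {
    // In this sketch an empty input is treated as a caller error.
    assert(!values.empty());
    const std::size_t mid = values.size() / 2;
    long left = 0, right = 0;

    // Each thread writes a different variable, so there is no data race.
    std::thread t1([&] {
        left = std::accumulate(values.begin(), values.begin() + mid, 0L);
    });
    std::thread t2([&] {
        right = std::accumulate(values.begin() + mid, values.end(), 0L);
    });

    // join() synchronizes with the end of each thread, so reading the
    // results afterwards is well defined.
    t1.join();
    t2.join();
    return left + right;
}
```

Calling sum_in_two_threads on a vector from main should compile unchanged with any conforming C++11 toolchain, although some platforms still need a linker flag such as -pthread.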
Consider the following example, where a pair of global variables are accessed concurrently by two threads:
Global
int x, y;

Thread 1            Thread 2
x = 17;             cout << y << " ";
y = 37;             cout << x << endl;
What might Thread 2 output?
Under C++98/C++03, this is not even Undefined Behavior; the question itself is meaningless because the standard does not contemplate anything called a "thread".
Under C++11, the result is Undefined Behavior: the two threads access the same non-atomic variables concurrently and at least one of them writes, which the standard calls a data race, and plain loads and stores need not be atomic in general. That may not seem like much of an improvement, and by itself it is not.
But with C++11, you can write this:
Global
atomic<int> x, y;

Thread 1            Thread 2
x.store(17);        cout << y.load() << " ";
y.store(37);        cout << x.load() << endl;
Now things get much more interesting. First of all, the behavior here is defined. Thread 2 could now print 0 0 (if it runs before Thread 1), 37 17 (if it runs after Thread 1), or 0 17 (if it runs after Thread 1 assigns to x but before it assigns to y).
What it cannot print is 37 0, because the default mode for atomic loads/stores in C++11 is to enforce sequential consistency. This just means all loads and stores must be "as if" they happened in the order you wrote them within each thread, while operations among threads can be interleaved however the system likes. So the default behavior of atomics provides both atomicity and ordering for loads and stores.
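The two-column example can be turned into a small program to try out. The sketch below assumes the same x and y, wraps one run in a function, and repeats it; the names run_seq_cst_trial and saw_37_0 are hypothetical helpers introduced for illustration. Repetition cannot prove the guarantee, it only shows that the forbidden result does not turn up.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <utility>

// One run of the example: returns the (y, x) pair that Thread 2 read.
std::pair<int, int> run_seq_cst_trial() {
    std::atomic<int> x{0}, y{0};
    int seen_y = 0, seen_x = 0;

    std::thread t1([&] {
        x.store(17);  // default order is std::memory_order_seq_cst
        y.store(37);
    });
    std::thread t2([&] {
        seen_y = y.load();
        seen_x = x.load();
    });

    t1.join();
    t2.join();
    return std::make_pair(seen_y, seen_x);
}

// Hypothetical helper: repeats the trial and reports whether the
// forbidden result "37 0" ever showed up.
bool saw_37_0(int trials) {
    assert(trials > 0);
    for (int i = 0; i < trials; ++i) {
        const std::pair<int, int> r = run_seq_cst_trial();
        if (r.first == 37 && r.second == 0) return true;
    }
    return false;
}
```

With sequentially consistent atomics, saw_37_0 should return false no matter how many trials are run; which of the three permitted results appears in a given trial depends on scheduling.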
Now, on a modern CPU, ensuring sequential consistency can be expensive. In particular, depending on the architecture, the compiler may have to emit full-blown memory barriers around the accesses here. But if your algorithm can tolerate out-of-order loads and stores; i.e., if it requires atomicity but not ordering; i.e., if it can tolerate 37 0 as output from this program, then you can write this:
Global
atomic<int> x, y;

Thread 1                                Thread 2
x.store(17, memory_order_relaxed);      cout << y.load(memory_order_relaxed) << " ";
y.store(37, memory_order_relaxed);      cout << x.load(memory_order_relaxed) << endl;
The more modern the CPU, the more likely this is to be faster than the previous example.
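A common case that needs atomicity but not ordering is a shared event counter. The sketch below is an illustrative assumption rather than part of the original answer: count_events is a hypothetical helper, and it uses std::atomic's fetch_add with memory_order_relaxed so that no increment is lost, without ordering the increments against any other memory access.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper: several threads bump one shared counter.
int count_events(int num_threads, int increments_per_thread) {
    assert(num_threads > 0 && increments_per_thread >= 0);
    std::atomic<int> counter{0};
    std::vector<std::thread> workers;

    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&counter, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i) {
                // Atomic read-modify-write: no increment is lost, but no
                // ordering is imposed on surrounding memory accesses.
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (std::thread& w : workers) w.join();

    // join() orders every increment before this final read.
    return counter.load(std::memory_order_relaxed);
}
```

Relaxed ordering is enough here because only the final total matters; if other data had to become visible together with the counter, a stronger ordering would be needed.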
Finally, if you just need to keep particular loads and stores in order, you can write:
Global
atomic<int> x, y;

Thread 1                                Thread 2
x.store(17, memory_order_release);      cout << y.load(memory_order_acquire) << " ";
y.store(37, memory_order_release);      cout << x.load(memory_order_acquire) << endl;
This takes us back to ordered loads and stores, so 37 0 is no longer a possible output, but it does so with minimal overhead. (In this trivial example, the result is the same as full-blown sequential consistency; in a larger program, it would not be.)
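Release/acquire is most often used to publish ordinary, non-atomic data from one thread to another. The following is a minimal sketch of that idea under assumed names (publish_and_receive, payload, and ready are hypothetical): the release store on the flag pairs with the acquire load that observes it, so the plain write to payload is guaranteed to be visible to the reader.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: hands `value` from a producer thread to a consumer
// thread through a plain int, guarded by a release/acquire flag.
int publish_and_receive(int value) {
    int payload = 0;                  // deliberately NOT atomic
    std::atomic<bool> ready{false};
    int received = -1;

    std::thread producer([&] {
        payload = value;                               // plain write
        ready.store(true, std::memory_order_release);  // publish it
    });
    std::thread consumer([&] {
        // Spin until the flag is set. The acquire load that sees `true`
        // synchronizes with the release store above.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        received = payload;  // sees the producer's write, no data race
    });

    producer.join();
    consumer.join();
    assert(received == value);
    return received;
}
```

If both operations on ready were memory_order_relaxed instead, the read of payload would race with the write and the program would have Undefined Behavior.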
Of course, if the only outputs you want to see are 0 0 or 37 17, you can just wrap a mutex around the original code. But if you have read this far, I bet you already know how that works, and this answer is already longer than I intended :-).
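For completeness, the mutex version might look roughly like the sketch below. It assumes the same x and y as plain ints and guards both threads' critical sections with one std::mutex; run_mutex_trial is a hypothetical name chosen for illustration.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <utility>

// One run of the original example with a mutex around each thread's work.
// Returns the (y, x) pair that Thread 2 read.
std::pair<int, int> run_mutex_trial() {
    int x = 0, y = 0;  // plain ints, protected by the mutex
    std::mutex m;
    int seen_y = 0, seen_x = 0;

    std::thread t1([&] {
        std::lock_guard<std::mutex> lock(m);
        x = 17;
        y = 37;
    });
    std::thread t2([&] {
        std::lock_guard<std::mutex> lock(m);
        seen_y = y;
        seen_x = x;
    });

    t1.join();
    t2.join();

    // Thread 2 sees either none of Thread 1's writes or both of them.
    assert((seen_y == 0 && seen_x == 0) || (seen_y == 37 && seen_x == 17));
    return std::make_pair(seen_y, seen_x);
}
```

Because each thread holds the lock for its whole body, the intermediate state 0 17 can no longer be observed either.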
So, bottom line. Mutexes are great, and C++11 standardizes them. But sometimes for performance reasons you want lower-level primitives (e.g., the classic double-checked locking pattern). The new standard provides high-level gadgets like mutexes and condition variables, and it also provides low-level gadgets like atomic types and the various flavors of memory barrier. So now you can write sophisticated, high-performance concurrent routines entirely within the language specified by the standard, and you can be certain your code will compile and run unchanged on both today's systems and tomorrow's.
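As an illustration of the double-checked locking pattern mentioned above, here is one way it can be written with C++11 atomics. This is a sketch under assumptions: Config and get_config are hypothetical names, and the object is intentionally never deleted to keep the example short.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical type standing in for something expensive to construct.
struct Config {
    int value = 42;
};

// Returns a lazily created shared Config using double-checked locking.
Config* get_config() {
    static std::atomic<Config*> instance{nullptr};
    static std::mutex init_mutex;

    // First check: no lock taken. The acquire load pairs with the release
    // store below, so a non-null pointer refers to a fully built object.
    Config* p = instance.load(std::memory_order_acquire);
    if (p == nullptr) {
        std::lock_guard<std::mutex> lock(init_mutex);
        // Second check: another thread may have won the race for the lock.
        p = instance.load(std::memory_order_relaxed);
        if (p == nullptr) {
            p = new Config();  // intentionally leaked in this sketch
            instance.store(p, std::memory_order_release);
        }
    }
    assert(p != nullptr);
    return p;
}
```

In everyday code the same effect is usually simpler to get from a function-local static object, whose initialization C++11 guarantees to be thread-safe, or from std::call_once.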
Although to be frank, unless you are an expert and working on some serious low-level code, you should probably stick to mutexes and condition variables. That's what I intend to do.
For more on this stuff, see this blog post.