問題描述
我在編譯時編寫了斐波那契數計算程序(constexpr)使用 C++11 支持的模板元編程技術的問題.目的這是為了計算模板元編程方法與舊的傳統方法之間的運行時間差異.
I wrote the program Fibonacci number calculation in compile time (constexpr) problem using the template metaprogramming techniques supported in C++11. The purpose of this is to calculate the difference in the run-time between the template metaprogramming approach and the old conventional approach.
// Template Metaprograming Approach
template<int N>
constexpr int fibonacci() {return fibonacci<N-1>() + fibonacci<N-2>(); }
template<>
constexpr int fibonacci<1>() { return 1; }
template<>
constexpr int fibonacci<0>() { return 0; }
// Conventional Approach
int fibonacci(int N) {
if ( N == 0 ) return 0;
else if ( N == 1 ) return 1;
else
return (fibonacci(N-1) + fibonacci(N-2));
}
我在我的 GNU/Linux 系統上為 N = 40 運行了兩個程序,并測量了時間和發現傳統解決方案(1.15 秒)比基于模板的解決方案(0.55 秒)慢大約兩倍.這是一項重大改進,因為這兩種方法都基于遞歸.
I ran both programs for N = 40 on my GNU/Linux system and measured the time and found that that conventional solution (1.15 second) is around two times slower than the template-based solution (0.55 second). This is a significant improvement as both approaches are based on the recursion.
為了更深入地理解它,我在 g++ 中編譯了程序(-fdump-tree-all 標志),發現編譯器實際上生成了 40 個不同的函數(如 fibonacci<40>、fibonacci<39>...斐波那契<0>).
To understand it more I compiled the program (-fdump-tree-all flag) in g++ and found that compiler actually generated the 40 different functions (like fibonacci<40>, fibonacci<39>...fibonacci<0>).
constexpr int fibonacci() [with int N = 40] () {
int D.29948, D.29949, D.29950;
D.29949 = fibonacci<39> ();
D.29950 = fibonacci<38> ();
D.29948 = D.29949 + D.29950;
return D.29948;
}
constexpr int fibonacci() [with int N = 39] () {
int D.29952, D.29953, D.29954;
D.29953 = fibonacci<38> ();
D.29954 = fibonacci<37> ();
D.29952 = D.29953 + D.29954;
return D.29952;
}
...
...
...
constexpr int fibonacci() [with int N = 0] () {
int D.29962;
D.29962 = 0;
return D.29962;
}
我也在GDB中調試了程序,發現上面的功能都是執行次數與傳統遞歸方法相同.如果程序的兩個版本都以相同的次數(遞歸)執行該函數,那么模板元編程技術如何實現這一點?我還想知道您對基于模板元編程的方法與其他版本相比如何以及為什么要花費一半時間的看法?這個程序能不能比現在的程序更快?
I also debugged the program in GDB and found that all the above functions are executed an equal number of times as with the conventional recursive approach. If both versions of the program are executing the function an equal number of times (recursive), then how is this achieved by template metaprogramming techniques? I would also like to know your opinion about how and why a template metaprogramming based approach is taking half time compared to the other version? Can this program be made faster than the current one?
基本上,我的目的是盡可能多地了解內部發生的事情.
Basically my intention here is to understand what's going on internally as much as possible.
我的機器是帶有 GCC 4.8.1 的 GNU/Linux,我對兩個程序都使用了優化 -o3
.
My machine is GNU/Linux with GCC 4.8.1, and I used the optimization -o3
for both programs.
推薦答案
試試這個:
template<size_t N>
struct fibonacci : integral_constant<size_t, fibonacci<N-1>{} + fibonacci<N-2>{}> {};
template<> struct fibonacci<1> : integral_constant<size_t,1> {};
template<> struct fibonacci<0> : integral_constant<size_t,0> {};
使用 clang 和 -Os
,編譯時間大約為 0.5 秒,零 時間運行 N=40
.您的傳統"方法大約在 0.4 秒內編譯并在 0.8 秒內運行.只是為了檢查,結果是102334155
對嗎?
With clang and -Os
, this compiles in roughly 0.5s and runs in zero time for N=40
. Your "conventional" approach compiles in roughly 0.4s and runs in 0.8s. Just for checking, the result is 102334155
right?
當我嘗試您自己的 constexpr
解決方案時,編譯器運行了幾分鐘,然后我停止了它,因為顯然內存已滿(計算機開始凍結).編譯器正在嘗試計算最終結果,而您的實現在編譯時使用效率極低.
When I tried your own constexpr
solution the compiler run for a couple of minutes and then I stopped it because apparently memory was full (computer started freezing). The compiler was trying to compute the final result and your implementation is extremely inefficient to be used at compile time.
使用此解決方案,在實例化 N
時會重復使用 N-2
、N-1
處的模板實例化.所以 fibonacci<40>
實際上在編譯時作為一個值被知道,在運行時沒有什么可做的.這是一種動態編程方法,當然,如果您在 N<計算之前將所有值存儲在
0
到 N-1
處,當然您可以在運行時執行相同的操作/代碼>.
With this solution, template instantiations at N-2
, N-1
are re-used when instantiating N
. So fibonacci<40>
is actually known at compile time as a value, and there is nothing to do at run-time. This is a dynamic programming approach and of course you can do the same at run time if you store all values at 0
through N-1
before computing at N
.
使用您的解決方案,編譯器可以在編譯時評估fibonacci
,但不需要.在您的情況下,所有或部分計算都留給運行時.就我而言,所有計算都是在編譯時嘗試的,因此永無止境.
With your solution, the compiler can evaluate fibonacci<N>()
at compile time but is not required to. In your case, all or part of computation is left for run time. In my case, all computation is attempted at compile time, hence never ending.
這篇關于在 C++11 中計算編譯時(constexpr)中的斐波那契數(遞歸方法)的文章就介紹到這了,希望我們推薦的答案對大家有所幫助,也希望大家多多支持html5模板網!