Why is OpenMP under Ubuntu 12.04 slower than the serial version?

Question

I have read other questions on this topic, but none of them solved my problem.

I wrote the code below, and both the pthread version and the OpenMP version come out slower than the serial version. I'm very confused.

Compiled in this environment:

Ubuntu 12.04 64bit 3.2.0-60-generic
g++ (Ubuntu 4.8.1-2ubuntu1~12.04) 4.8.1

CPU(s):                2
On-line CPU(s) list:   0,1
Thread(s) per core:    1
Vendor ID:             AuthenticAMD
CPU family:            18
Model:                 1
Stepping:              0
CPU MHz:               800.000
BogoMIPS:              3593.36
L1d cache:             64K
L1i cache:             64K
L2 cache:              512K
NUMA node0 CPU(s):     0,1

Compile command:

g++ -std=c++11 ./eg001.cpp -fopenmp

#include <cmath>
#include <cstdio>
#include <ctime>
#include <omp.h>
#include <pthread.h>

#define NUM_THREADS 5
const int sizen = 256000000;

struct Data {
    double * pSinTable;
    long tid;
};

void * compute(void * p) {
    Data * pDt = (Data *)p;
    const int start = sizen * pDt->tid/NUM_THREADS;
    const int end = sizen * (pDt->tid + 1)/NUM_THREADS;
    for(int n = start; n < end; ++n) {
        pDt->pSinTable[n] = std::sin(2 * M_PI * n / sizen);
    }
    pthread_exit(nullptr);
}

int main()
{
    double * sinTable = new double[sizen];
    pthread_t threads[NUM_THREADS];
    pthread_attr_t attr;
    pthread_attr_init(&attr);
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);

    clock_t start, finish;

    start = clock();
    int rc;
    Data dt[NUM_THREADS];
    for(int i = 0; i < NUM_THREADS; ++i) {
        dt[i].pSinTable = sinTable;
        dt[i].tid = i;
        rc = pthread_create(&threads[i], &attr, compute, &dt[i]);
    }//for
    pthread_attr_destroy(&attr);
    for(int i = 0; i < NUM_THREADS; ++i) {
        rc = pthread_join(threads[i], nullptr);
    }//for
    finish = clock();
    printf("from pthread: %lf\n", (double)(finish - start)/CLOCKS_PER_SEC);

    delete sinTable;
    sinTable = new double[sizen];

    start = clock();
#   pragma omp parallel for
    for(int n = 0; n < sizen; ++n)
        sinTable[n] = std::sin(2 * M_PI * n / sizen);
    finish = clock();
    printf("from omp: %lf\n", (double)(finish - start)/CLOCKS_PER_SEC);

    delete sinTable;
    sinTable = new double[sizen];

    start = clock();
    for(int n = 0; n < sizen; ++n)
        sinTable[n] = std::sin(2 * M_PI * n / sizen);
    finish = clock();
    printf("from serial: %lf\n", (double)(finish - start)/CLOCKS_PER_SEC);

    delete sinTable;

    pthread_exit(nullptr);
    return 0;
}

Output:

from pthread: 21.150000
from omp: 20.940000
from serial: 20.800000

I wondered whether my OpenMP code was the problem, so I used pthread to do the same thing.

However, that did not help either, so I wonder whether this might be a problem with OpenMP/pthread on Ubuntu.

A friend of mine who also has an AMD CPU and Ubuntu 12.04 ran into the same problem, so I have some reason to believe it is not limited to my machine.

If anyone has the same problem, or has a clue about its cause, thanks in advance.

In case the code is not good enough, I also ran a benchmark and pasted the result here:

http://pastebin.com/RquLPREc

Benchmark URL: http://www.cs.kent.edu/~farrell/mc08/lectures/progs/openmp/microBenchmarks/src/download.html

New information:

I ran the code on Windows (without the pthread version) with VS2012.

I used 1/10 of sizen because Windows does not let me allocate such a large chunk of memory. The results are:

from omp: 1.004
from serial: 1.420
from FreeNickName: 735 (this one is the improvement suggested by @FreeNickName)

Does this indicate that it could be a problem with Ubuntu?

The problem is solved by using the omp_get_wtime function, which is portable across operating systems. See the answer by Hristo Iliev.

Some tests on the disputed suggestions by FreeNickName:

(Sorry, I had to run these on Ubuntu because the Windows machine belongs to one of my friends.)

--1-- Changed delete to delete [] (but without memset) (-std=c++11 -fopenmp):

from pthread: 13.491405
from omp: 13.023099
from serial: 20.665132
from FreeNickName: 12.022501

--2-- With memset immediately after new (-std=c++11 -fopenmp):

from pthread: 13.996505
from omp: 13.192444
from serial: 19.882127
from FreeNickName: 12.541723

--3-- With memset immediately after new (-std=c++11 -fopenmp -march=native -O2):

from pthread: 11.886978
from omp: 11.351801
from serial: 17.002865
from FreeNickName: 11.198779

--4-- With memset immediately after new, and with FreeNickName's version placed before the OpenMP "for" version (-std=c++11 -fopenmp -march=native -O2):

from pthread: 11.831127
from FreeNickName: 11.571595
from omp: 11.932814
from serial: 16.976979

--5-- With memset immediately after new, with FreeNickName's version placed before the OpenMP "for" version, and with NUM_THREADS set to 5 instead of 2 (my machine is dual core):

from pthread: 9.451775
from FreeNickName: 9.385366
from omp: 11.854656
from serial: 16.960101

Recommended answer

There is nothing wrong with OpenMP in your case. What is wrong is the way you measure the elapsed time.

Using clock() to measure the performance of multithreaded applications on Linux (and most other Unix-like OSes) is a mistake, since it does not return the wall-clock (real) time but the accumulated CPU time of all threads in the process (and on some Unix flavours even the accumulated CPU time of all child processes). With two threads each running for about 10 seconds, clock() therefore reports about 20 seconds even though only about 10 seconds have actually elapsed. Your parallel code shows better performance on Windows because there clock() returns the real time and not the accumulated CPU time.
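To make the distinction concrete, here is a minimal sketch (not part of the original answer) that times the same piece of work with both std::clock() and std::chrono::steady_clock. The Timing struct and the measure() helper are hypothetical names introduced for illustration, and std::chrono stands in as a wall-clock source that needs no OpenMP.

```cpp
#include <cassert>
#include <chrono>
#include <ctime>
#include <functional>

// Hypothetical helper type: holds both measurements for one run.
struct Timing {
    double cpu_seconds;   // from std::clock(): CPU time of the whole process on Linux
    double wall_seconds;  // from steady_clock: real elapsed time
};

// Runs `work` once and records CPU time and wall-clock time around it.
Timing measure(const std::function<void()>& work) {
    assert(static_cast<bool>(work));  // an empty std::function cannot be called

    const std::clock_t cpu_start = std::clock();
    const auto wall_start = std::chrono::steady_clock::now();

    work();

    const std::clock_t cpu_end = std::clock();
    const auto wall_end = std::chrono::steady_clock::now();

    Timing t;
    t.cpu_seconds = static_cast<double>(cpu_end - cpu_start) / CLOCKS_PER_SEC;
    t.wall_seconds =
        std::chrono::duration<double>(wall_end - wall_start).count();
    return t;
}
```

If measure() is wrapped around the parallel loop from the question on a dual-core Linux machine, cpu_seconds would be expected to come out close to the sum over both threads, while wall_seconds reflects the time that actually passed.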

The best way to prevent such discrepancies is to use the portable OpenMP timer routine omp_get_wtime():

double start = omp_get_wtime();
#pragma omp parallel for
for(int n = 0; n < sizen; ++n)
    sinTable[n] = std::sin(2 * M_PI * n / sizen);
double finish = omp_get_wtime();
printf("from omp: %lf\n", finish - start);

For non-OpenMP applications, you should use clock_gettime() with the CLOCK_REALTIME clock:

struct timespec start, finish;
clock_gettime(CLOCK_REALTIME, &start);
#pragma omp parallel for
for(int n = 0; n < sizen; ++n)
    sinTable[n] = std::sin(2 * M_PI * n / sizen);
clock_gettime(CLOCK_REALTIME, &finish);
printf("from omp: %lf\n", (finish.tv_sec + 1.e-9 * finish.tv_nsec) -
                          (start.tv_sec + 1.e-9 * start.tv_nsec));
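The timespec arithmetic above can be wrapped in a small helper. This is a sketch under stated assumptions: elapsed_seconds() is a hypothetical name, not a POSIX function. It subtracts the seconds and nanoseconds fields separately before converting to double, which avoids losing sub-microsecond precision when tv_sec holds a large epoch value.

```cpp
#include <cassert>
#include <time.h>

// Hypothetical helper: wall-clock seconds between two clock_gettime() readings.
// The fields are subtracted first, so the large tv_sec values cancel
// before the result is converted to double.
double elapsed_seconds(const timespec& start, const timespec& finish) {
    // POSIX keeps tv_nsec in the range [0, 1e9).
    assert(start.tv_nsec >= 0 && start.tv_nsec < 1000000000L);
    assert(finish.tv_nsec >= 0 && finish.tv_nsec < 1000000000L);

    return static_cast<double>(finish.tv_sec - start.tv_sec) +
           1e-9 * static_cast<double>(finish.tv_nsec - start.tv_nsec);
}
```

Both readings must come from the same clock. Where the system provides it, CLOCK_MONOTONIC can be passed to clock_gettime() instead of CLOCK_REALTIME, so that adjustments to the system time during the run do not distort the interval.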
