Do any JVM's JIT compilers generate code that uses vectorized floating point instructions?

Question

Let's say the bottleneck of my Java program really is a set of tight loops that compute a bunch of vector dot products. Yes, I've profiled; yes, it's the bottleneck; yes, it's significant; yes, that's just how the algorithm is; yes, I've run ProGuard to optimize the bytecode, etc.

The work is, essentially, dot products. As in, I have two float[50] arrays and I need to compute the sum of pairwise products. I know processor instruction sets exist to perform these kinds of operations quickly and in bulk, like SSE or MMX.

Yes, I can probably access these by writing some native code in JNI. The JNI call turns out to be pretty expensive.

I know you can't guarantee what a JIT will or won't compile. Has anyone ever heard of a JIT generating code that uses these instructions? And if so, is there anything about the Java code that helps make it compilable this way?

Probably a "no"; worth asking.

Recommended answer

So, basically, you want your code to run faster. JNI is the answer. I know you said it didn't work for you, but let me show you that you are wrong.

Here is Dot.java:

import java.nio.FloatBuffer;
import org.bytedeco.javacpp.*;
import org.bytedeco.javacpp.annotation.*;

// Have JavaCPP include Dot.h and build the native code with its "fastfpu" compiler options.
@Platform(include = "Dot.h", compiler = "fastfpu")
public class Dot {
    // Load the native library that JavaCPP generates for this class.
    static { Loader.load(); }

    // Pure Java version: two arrays and the dot product over them.
    static float[] a = new float[50], b = new float[50];
    static float dot() {
        float sum = 0;
        for (int i = 0; i < 50; i++) {
            sum += a[i]*b[i];
        }
        return sum;
    }
    // Native version: accessors for the global arrays in Dot.h and the C++ dot product.
    static native @MemberGetter FloatPointer ac();
    static native @MemberGetter FloatPointer bc();
    static native @NoException float dotc();

    public static void main(String[] args) {
        // View the native arrays as direct NIO buffers.
        FloatBuffer ab = ac().capacity(50).asBuffer();
        FloatBuffer bb = bc().capacity(50).asBuffer();

        // Warm up both versions before timing.
        for (int i = 0; i < 10000000; i++) {
            a[i%50] = b[i%50] = dot();
            float sum = dotc();
            ab.put(i%50, sum);
            bb.put(i%50, sum);
        }
        // Time the pure Java version.
        long t1 = System.nanoTime();
        for (int i = 0; i < 10000000; i++) {
            a[i%50] = b[i%50] = dot();
        }
        // Time the native version.
        long t2 = System.nanoTime();
        for (int i = 0; i < 10000000; i++) {
            float sum = dotc();
            ab.put(i%50, sum);
            bb.put(i%50, sum);
        }
        long t3 = System.nanoTime();
        // Report the average time per iteration.
        System.out.println("dot(): " + (t2 - t1)/10000000 + " ns");
        System.out.println("dotc(): "  + (t3 - t2)/10000000 + " ns");
    }
}

And Dot.h:

float ac[50], bc[50];

inline float dotc() {
    float sum = 0;
    for (int i = 0; i < 50; i++) {
        sum += ac[i]*bc[i];
    }
    return sum;
}

We can compile and run that with JavaCPP using this command:

$ java -jar javacpp.jar Dot.java -exec

With an Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz, Fedora 30, GCC 9.1.1, and OpenJDK 8 or 11, I get this kind of output:

dot(): 39 ns
dotc(): 16 ns

Or roughly 2.4 times faster. We need to use direct NIO buffers instead of arrays, but HotSpot can access direct NIO buffers as fast as arrays. On the other hand, manually unrolling the loop does not provide a measurable boost in performance in this case.
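
To make the last two points concrete, here is a minimal pure Java sketch of a dot product over direct NIO buffers, next to a four-way manually unrolled variant. The class DirectDot and its method names are hypothetical and are not part of JavaCPP or of the benchmark above. The sketch only shows what the two loop shapes look like; it makes no claim about how either one performs on a given JVM.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

// Hypothetical helper class for illustration; not part of JavaCPP or Dot.java.
final class DirectDot {
    private DirectDot() {}

    // Allocates an off-heap (direct) buffer of n floats in the platform's byte order.
    static FloatBuffer allocate(int n) {
        return ByteBuffer.allocateDirect(n * Float.BYTES)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
    }

    // Straightforward dot product using absolute gets, so buffer positions are untouched.
    static float dot(FloatBuffer a, FloatBuffer b, int n) {
        float sum = 0;
        for (int i = 0; i < n; i++) {
            sum += a.get(i) * b.get(i);
        }
        return sum;
    }

    // Four-way unrolled variant with independent accumulators.
    // The summation order differs from dot(), so results can differ in the last bits.
    static float dotUnrolled(FloatBuffer a, FloatBuffer b, int n) {
        float s0 = 0, s1 = 0, s2 = 0, s3 = 0;
        int i = 0;
        for (; i + 3 < n; i += 4) {
            s0 += a.get(i) * b.get(i);
            s1 += a.get(i + 1) * b.get(i + 1);
            s2 += a.get(i + 2) * b.get(i + 2);
            s3 += a.get(i + 3) * b.get(i + 3);
        }
        float sum = (s0 + s1) + (s2 + s3);
        // Handle the leftover elements when n is not a multiple of 4.
        for (; i < n; i++) {
            sum += a.get(i) * b.get(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 50;
        FloatBuffer a = allocate(n), b = allocate(n);
        for (int i = 0; i < n; i++) {
            a.put(i, i);
            b.put(i, 2);
        }
        System.out.println(dot(a, b, n));
        System.out.println(dotUnrolled(a, b, n));
    }
}
```

Because the unrolled version adds the products in a different order, the two methods are only guaranteed to agree exactly on inputs like the small integers used in main; for general float data, compare them with a tolerance.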
