Problem description
I understand how mathematically equivalent arithmetic operations can give different results due to numerical errors (e.g. summing floats in different orders).
However, it surprises me that adding zeros to sum can change the result. I thought that this always holds for floats, no matter what: x + 0. == x.
Here's an example. I expected all the lines to be exactly zero. Can anybody please explain why this happens?
import numpy as np

M = 4  # number of random values
Z = 4  # number of additional zeros
for i in range(20):
    a = np.random.rand(M)
    b = np.zeros(M + Z)
    b[:M] = a  # b holds the same values as a, followed by Z zeros
    print(a.sum() - b.sum())
-4.4408920985e-16
0.0
0.0
0.0
4.4408920985e-16
0.0
-4.4408920985e-16
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
2.22044604925e-16
0.0
4.4408920985e-16
4.4408920985e-16
0.0
It seems not to happen for smaller values of M and Z.
I also made sure that a.dtype == b.dtype.
Here is one more example, which also demonstrates that python's builtin sum behaves as expected:
a = np.array([0.1, 1.0/3, 1.0/7, 1.0/13, 1.0/23])
b = np.array([0.1, 0.0, 1.0/3, 0.0, 1.0/7, 0.0, 1.0/13, 1.0/23])
print(a.sum() - b.sum())
=> -1.11022302463e-16
print(sum(a) - sum(b))
=> 0.0
I'm using numpy V1.9.2.
Recommended answer
Short answer: You are seeing the difference between
a + b + c + d
and
(a + b) + (c + d)
which, because of floating point inaccuracies, is not the same.
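To make the grouping effect concrete, here is a small sketch with hand-picked values (they are an illustration chosen for this example, not taken from the question). It relies only on the fact that 2**-53 is half the spacing between 1.0 and the next double, so adding it to 1.0 on its own is rounded away:

```python
# u is half the gap between 1.0 and the next representable double,
# so 1.0 + u rounds back to 1.0 (ties go to the even mantissa).
u = 2.0 ** -53
a, b, c, d = 1.0, u, u, u

left_to_right = ((a + b) + c) + d  # each tiny term is rounded away in turn
pairwise = (a + b) + (c + d)       # c + d == 2**-52 is large enough to survive

print(left_to_right)             # 1.0
print(pairwise)                  # 1.0000000000000002
print(left_to_right - pairwise)  # -2.220446049250313e-16
```

Both expressions add the same four numbers; only the parenthesization differs.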
Long answer: Numpy implements pair-wise summation as an optimization of both speed (it allows for easier vectorization) and rounding error.
The numpy sum-implementation can be found here (function pairwise_sum_@TYPE@). It essentially does the following:
- If the length of the array is less than 8, a regular for-loop summation is performed. This is why the strange result is not observed if Z < 4 in your case (with M = 4, b then has fewer than 8 elements): the same for-loop summation will be used in both cases.
- If the length is between 8 and 128, it accumulates the sums in 8 bins r[0]-r[7], then sums them by ((r[0] + r[1]) + (r[2] + r[3])) + ((r[4] + r[5]) + (r[6] + r[7])).
- Otherwise, it recursively sums two halves of the array.
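The three cases above can be sketched in pure Python. This is a simplified model written for illustration, not NumPy's actual C code: the name pairwise_sum, the handling of leftover elements when the length is not a multiple of 8, and the alignment of the split point are assumptions of this sketch.

```python
def pairwise_sum(a):
    """Simplified model of the three cases described above."""
    n = len(a)
    if n < 8:
        # Short input: plain left-to-right loop.
        res = 0.0
        for x in a:
            res += x
        return res
    if n <= 128:
        # Accumulate into 8 bins; element i goes to bin i % 8.
        r = list(a[:8])
        full = n - n % 8
        for i in range(8, full, 8):
            for j in range(8):
                r[j] += a[i + j]
        res = ((r[0] + r[1]) + (r[2] + r[3])) + ((r[4] + r[5]) + (r[6] + r[7]))
        # Assumption: leftover elements are added one by one at the end.
        for x in a[full:]:
            res += x
        return res
    # Long input: split into two halves and recurse.
    half = n // 2
    half -= half % 8  # assumption: keep the split aligned to the 8 bins
    return pairwise_sum(a[:half]) + pairwise_sum(a[half:])


# Hand-picked values (not from the question): 2**-53 added to 1.0 is rounded away.
u = 2.0 ** -53
a = [1.0, u, u, u]   # length 4: plain loop
b = a + [0.0] * 4    # length 8: 8-bin branch
print(pairwise_sum(a) - pairwise_sum(b))  # -2.220446049250313e-16
```

Padding with four zeros moves the input from the first branch to the second, which changes the grouping of the non-zero terms even though the zeros themselves contribute nothing.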
Therefore, in the first case you get a.sum() = a[0] + a[1] + a[2] + a[3] and in the second case b.sum() = (a[0] + a[1]) + (a[2] + a[3]), which leads to a.sum() - b.sum() != 0.
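The same reasoning suggests why a strictly left-to-right accumulation is unaffected by the zeros: adding 0.0 to a finite float is exact, so the zeros leave every partial sum unchanged and the grouping of the non-zero terms stays the same. A minimal sketch using the values from the question, with a hand-written loop (sequential_sum is a hypothetical helper for this example, not a NumPy or Python function):

```python
def sequential_sum(values):
    """Add the values strictly from left to right."""
    total = 0.0
    for x in values:
        total += x
    return total


a = [0.1, 1.0/3, 1.0/7, 1.0/13, 1.0/23]
b = [0.1, 0.0, 1.0/3, 0.0, 1.0/7, 0.0, 1.0/13, 1.0/23]

# The zeros in b do not alter any partial sum, so both totals are identical.
print(sequential_sum(a) - sequential_sum(b))  # 0.0
```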