Problem description
In this StackOverflow question:
Generating random integer from a range
the accepted answer suggests the following formula for generating a random integer between the given min and max, with both min and max included in the range:
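A modulo-based mapping of the kind discussed in the rest of this question can be sketched as follows. This is an illustrative assumption, not a quotation of the accepted answer; the name range_random_modulo is hypothetical, and the sketch assumes min <= max and max - min < RAND_MAX.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical helper: maps rand() onto [min, max] with the modulo operator.
// Assumes min <= max and max - min < RAND_MAX (no overflow handling).
int range_random_modulo(int min, int max) {
    assert(min <= max);
    int span = max - min + 1;          // number of distinct outputs
    return min + std::rand() % span;   // uneven unless span divides RAND_MAX + 1
}
```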
But it also says:
This is still slightly biased towards lower numbers ... It's also possible to extend it so that it removes the bias.
But it doesn't explain why it is biased towards lower numbers or how to remove the bias. So the question is: is this the best approach to generating a random integer within a (signed) range without relying on anything fancy, just the rand() function, and if it is, how can the bias be removed?
I've just tested the while-loop algorithm suggested by @Joey against floating-point extrapolation:
to see how uniformly "balls" fall into and are distributed among a number of "buckets", with one test for the floating-point extrapolation and another for the while-loop algorithm. But the results varied with the number of "balls" (and "buckets"), so I couldn't easily pick a winner. The working code can be found at this Ideone page. For example, with 10 buckets and 100 balls, the maximum deviation from the ideal probability among buckets is smaller for the floating-point extrapolation than for the while-loop algorithm (0.04 and 0.05 respectively), but with 1000 balls the maximum deviation of the while-loop algorithm is smaller (0.024 and 0.011), and with 10000 balls the floating-point extrapolation again does better (0.0034 and 0.0053), and so on without much consistency. The possibility that neither algorithm consistently produces a more uniform distribution than the other makes me lean towards the floating-point extrapolation, since it appears to run faster than the while-loop algorithm. So is it fine to choose the floating-point extrapolation algorithm, or are my tests and conclusions not completely correct?
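The "maximum deviation from the ideal probability" measured above can be sketched as below. The function name max_deviation and its interface are hypothetical and are not taken from the linked Ideone program; the sketch only assumes that each bucket's hit count has already been collected.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper: largest absolute gap between a bucket's observed
// frequency and the ideal probability 1 / (number of buckets).
double max_deviation(const std::vector<int>& counts) {
    assert(!counts.empty());
    long balls = 0;
    for (int c : counts) balls += c;   // total number of "balls" thrown
    assert(balls > 0);
    const double ideal = 1.0 / static_cast<double>(counts.size());
    double worst = 0.0;
    for (int c : counts) {
        double observed = static_cast<double>(c) / static_cast<double>(balls);
        worst = std::max(worst, std::fabs(observed - ideal));
    }
    return worst;
}
```

With few balls, a figure computed this way reflects both any bias in the algorithm and ordinary sampling variation.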
Recommended answer
The problem occurs when the number of outputs from the random number generator (RAND_MAX+1) is not evenly divisible by the desired range (max-min+1). Since there will be a consistent mapping from a random number to an output, some outputs will be mapped to more random numbers than others. This holds regardless of how the mapping is done: you can use modulo, division, conversion to floating point, or whatever voodoo you can come up with, and the basic problem remains.
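The counting argument can be checked by enumerating every generator output. The sketch below assumes a hypothetical generator with 256 outputs (0 to 255) and a target range of 10 values; the constants and function names are illustrative only.

```cpp
#include <array>
#include <cassert>

constexpr int kInputs = 256;   // assumed generator outputs: 0..255
constexpr int kBuckets = 10;   // assumed size of the target range

// Counts how many generator values land in each bucket under modulo mapping.
std::array<int, kBuckets> count_modulo() {
    std::array<int, kBuckets> counts{};
    for (int r = 0; r < kInputs; ++r) {
        counts[r % kBuckets]++;
    }
    return counts;
}

// Same count when r is scaled into [0, kBuckets) in floating point.
std::array<int, kBuckets> count_scaled() {
    std::array<int, kBuckets> counts{};
    for (int r = 0; r < kInputs; ++r) {
        int bucket = static_cast<int>(r / static_cast<double>(kInputs) * kBuckets);
        assert(bucket >= 0 && bucket < kBuckets);
        counts[bucket]++;
    }
    return counts;
}
```

Under these assumptions both mappings give six buckets 26 inputs and four buckets 25 inputs. Modulo puts the extra inputs on the lowest six buckets, while scaling spreads them across the range, so the unevenness moves but does not disappear.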
The magnitude of the problem is very small, and undemanding applications can generally get away with ignoring it. The smaller the range and the larger RAND_MAX is, the less pronounced the effect will be.
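The size of the effect can be estimated with a short calculation. The symbols N, n, q and r are introduced here for illustration, and the estimate assumes a mapping that spreads the generator outputs as evenly as possible, as modulo and scaling do.

```latex
\[
N = \mathrm{RAND\_MAX} + 1, \qquad
n = \mathit{max} - \mathit{min} + 1, \qquad
N = qn + r, \quad 0 \le r < n
\]
\[
P(\text{favoured output}) = \frac{q+1}{N}, \qquad
P(\text{other output}) = \frac{q}{N}, \qquad
\frac{(q+1)/N}{1/n} - 1 = \frac{n - r}{N} < \frac{n}{N}
\quad (r \neq 0)
\]
```

For example, assuming RAND_MAX = 32767 and a range of 10 values, q = 3276 and r = 8, so a favoured output is only 2/32768 (about 0.006%) more likely than the ideal 1/10.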
I took your example program and tweaked it a bit. First I created a special version of rand that only has a range of 0-255, to better demonstrate the effect. I made a few tweaks to rangeRandomAlg2. Finally I changed the number of "balls" to 1000000 to improve the consistency. You can see the results here: http://ideone.com/4P4HY
Notice that the floating-point version produces two tightly grouped probabilities, near either 0.101 or 0.097, with nothing in between. This is the bias in action.
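These two values are what one would expect if each bucket receives either 26 or 25 of the 256 generator outputs. The bucket count of 10 is taken from the question's example and is an assumption about the linked program.

```latex
\[
256 = 25 \cdot 10 + 6, \qquad
\frac{26}{256} = 0.1015625, \qquad
\frac{25}{256} = 0.09765625
\]
```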
I think calling this "Java's algorithm" is a bit misleading - I'm sure it's much older than Java.
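The while-loop approach is usually a form of rejection sampling: generator outputs that fall in the incomplete final block are discarded and redrawn, so every result is backed by the same number of inputs. A minimal sketch follows. It is not necessarily the code from the linked Ideone program, the name range_random_unbiased is hypothetical, and it assumes min <= max and max - min < RAND_MAX.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical helper: integer in [min, max] via rejection sampling on rand().
// Assumes min <= max and max - min < RAND_MAX (no overflow handling).
int range_random_unbiased(int min, int max) {
    assert(min <= max);
    // Number of generator outputs and number of desired outputs.
    const unsigned long total = static_cast<unsigned long>(RAND_MAX) + 1UL;
    const unsigned long span = static_cast<unsigned long>(max - min) + 1UL;
    // Largest multiple of span that does not exceed total.
    const unsigned long limit = total - total % span;
    unsigned long r;
    do {
        r = static_cast<unsigned long>(std::rand());
    } while (r >= limit);              // redraw values from the incomplete block
    return min + static_cast<int>(r % span);
}
```

Because limit is always more than half of total, each draw is accepted with probability above 1/2 and the loop runs fewer than two iterations on average. The result is only as uniform as rand() itself.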