問題描述
我有兩種方法可以在 [0..n-1] 范圍內生成 m 個不同的隨機數
I have two methods of generating m distinct random numbers in the range [0..n-1]
方法一:
//C++-ish pseudocode
int result[m];
for(i = 0; i < m; ++i)
{
int r;
do
{
r = rand()%n;
}while(r is found in result array at indices from 0 to i)
result[i] = r;
}
方法二:
//C++-ish pseudocode
int arr[n];
for(int i = 0; i < n; ++i)
arr[i] = i;
random_shuffle(arr, arr+n);
result = first m elements in arr;
當 n 遠大于 m 時,第一種方法更有效,否則第二種方法更有效.但是更大"不是一個嚴格的概念,是嗎?:)
The first method is more efficient when n is much larger than m, whereas the second is more efficient otherwise. But "much larger" isn't that strict a notion, is it? :)
問題: 我應該使用 n 和 m 的什么公式來確定方法 1 還是方法 2 更有效?(就運行時間的數學期望而言)
Question: What formula of n and m should I use to determine whether method1 or method2 will be more efficient? (in terms of mathematical expectation of the running time)
推薦答案
純數學:
讓我們計算兩種情況下 rand()
函數調用的數量并比較結果:
Pure mathematics:
Let's calculate the quantity of rand()
function calls in both cases and compare the results:
案例 1:當您已經選擇了 k 個數字時,讓我們看看在步驟 i = k
上調用的數學期望.通過一次 rand()
調用獲得一個數字的概率等于 p = (n-k)/n
.我們需要知道此類調用數量的數學期望,這導致獲得我們還沒有的數字.
Case 1:
let's see the mathematical expectation of calls on step i = k
, when you already have k numbers chosen. The probability to get a number with one rand()
call is equal to p = (n-k)/n
. We need to know the mathematical expectation of such calls quantity which leads to obtaining a number we don't have yet.
使用1
調用獲得它的概率是p
.使用 2
調用 - q * p
,其中 q = 1 - p
.在一般情況下,在 n
次調用之后準確獲得它的概率是 (q^(n-1))*p
.因此,數學期望為 Sum[ n * q^(n-1) * p ], n = 1 -->INF
.這個總和等于 1/p
(由 wolfram alpha 證明).
The probability to get it using 1
call is p
. Using 2
calls - q * p
, where q = 1 - p
. In general case, the probability to get it exactly after n
calls is (q^(n-1))*p
. Thus, the mathematical expectation is
Sum[ n * q^(n-1) * p ], n = 1 --> INF
. This sum is equal to 1/p
(proved by wolfram alpha).
因此,在步驟 i = k
上,您將執行 1/p = n/(nk)
調用 rand()
功能.
So, on the step i = k
you will perform 1/p = n/(n-k)
calls of the rand()
function.
現在讓我們總結一下:
Sum[ n/(n - k) ], k = 0 -->m - 1 = n * T
- 方法 1 中的 rand
調用次數.
這里 T = Sum[ 1/(n - k) ], k = 0 -->m - 1
Sum[ n/(n - k) ], k = 0 --> m - 1 = n * T
- the number of rand
calls in method 1.
Here T = Sum[ 1/(n - k) ], k = 0 --> m - 1
情況 2:
這里 rand()
在 random_shuffle
內被調用 n - 1
次(在大多數實現中).
Here rand()
is called inside random_shuffle
n - 1
times (in most implementations).
現在,要選擇方法,我們必須比較這兩個值: n * T ?n - 1
.
因此,要選擇合適的方法,請按上述方法計算 T
.如果 T <(n - 1)/n
最好使用第一種方法.否則使用第二種方法.
Now, to choose the method, we have to compare these two values: n * T ? n - 1
.
So, to choose the appropriate method, calculate T
as described above. If T < (n - 1)/n
it's better to use the first method. Use the second method otherwise.
這篇關于在 [0..n-1] 范圍內生成 m 個不同的隨機數的文章就介紹到這了,希望我們推薦的答案對大家有所幫助,也希望大家多多支持html5模板網!