这是一个非常快速的完全矢量化版本:
def vectorized(prob_matrix, items): s = prob_matrix.cumsum(axis=0) r = np.random.rand(prob_matrix.shape[1]) k = (s < r).sum(axis=0) return items[k]
从理论上讲 ,
searchsorted是用于在累积求和的概率中查找随机值的正确函数,但
m由于相对较小,因此
k = (s <r).sum(axis=0)最终速度要快得多。它的时间复杂度为O(m),而
searchsorted方法为O(log(m)),但这仅对大得多才有意义
m。
另外
,
cumsum也是O(m),所以
vectorized和@perimosocordiae
improved都是O(m)。(
m实际上,如果您的计算机更大,则必须运行一些测试以查看
m此方法变慢之前可以达到的大小。)
这是我使用
m = 10和
n = 10000(使用函数
original和
improved@perimosocordiae的答案)得到的时间:
In [115]: %timeit original(prob_matrix, items)1 loops, best of 3: 270 ms per loopIn [116]: %timeit improved(prob_matrix, items)10 loops, best of 3: 24.9 ms per loopIn [117]: %timeit vectorized(prob_matrix, items)1000 loops, best of 3: 1 ms per loop
定义功能的完整脚本为:
import numpy as npdef improved(prob_matrix, items): # transpose here for better data locality later cdf = np.cumsum(prob_matrix.T, axis=1) # random numbers are expensive, so we'll get all of them at once ridx = np.random.random(size=n) # the one loop we can't avoid, made as simple as possible idx = np.zeros(n, dtype=int) for i, r in enumerate(ridx): idx[i] = np.searchsorted(cdf[i], r) # fancy indexing all at once is faster than indexing in a loop return items[idx]def original(prob_matrix, items): choices = np.zeros((n,)) # This is slow, because of the loop in Python for i in range(n): choices[i] = np.random.choice(items, p=prob_matrix[:,i]) return choicesdef vectorized(prob_matrix, items): s = prob_matrix.cumsum(axis=0) r = np.random.rand(prob_matrix.shape[1]) k = (s < r).sum(axis=0) return items[k]m = 10n = 10000 # Or some very large numberitems = np.arange(m)prob_weights = np.random.rand(m, n)prob_matrix = prob_weights / prob_weights.sum(axis=0, keepdims=True)