Two ways to generate a random string in Python

While reading the Python version of the WeChat JS API demo, I noticed that the random string in the middle of it is generated like this:

# Method 1
''.join(random.choice(string.ascii_letters + string.digits) for _ in range(15)

It could also be written another way:

# Method 2
s = ''
for _ in range(15):
    s += random.choice(string.ascii_letters + string.digits)

Clearly, Method 1 is more elegant and more Pythonic, while Method 2 reads more like C++/Java.

Is there any performance difference between the two?


ionicwang #1

"Is there any performance difference between the two?"
Please test it yourself instead of asking others to test it for you.

sinazl #2

There are two common variants here: one uses random.choices, the other uses random.choice with a generator expression. Let me just show you the code.

import random
import string

# Method 1: use random.choices (Python 3.6+)
def random_string_choices(length=10):
    # string.ascii_letters is the upper- and lowercase letters, string.digits is the digits
    characters = string.ascii_letters + string.digits
    # k sets how many characters to draw; join them into one string
    return ''.join(random.choices(characters, k=length))

# Method 2: use random.choice with a generator expression
def random_string_choice(length=10):
    characters = string.ascii_letters + string.digits
    # draw one character per iteration, length times, then join
    return ''.join(random.choice(characters) for _ in range(length))

# Quick test
if __name__ == "__main__":
    print("Method 1 (choices):", random_string_choices(12))
    print("Method 2 (choice):", random_string_choice(12))

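Both work, with a small difference. random.choices builds the whole list in a single call, so the code is cleaner. random.choice with a generator expression is the older idiom and more portable; it also runs on Python 2.7. These days the first is the usual choice unless you need to support old versions.

Summary: just use random.choices.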

gougou168 #3

1. The first approach is preferable; it is a bit faster
2. Your first snippet is missing a parenthesis
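
For reference, a runnable version of Method 1 with the closing parenthesis restored might look like the sketch below. The wrapper name `random_string` and its `length` parameter are only illustrative; they are not from the original post.

```python
import random
import string


def random_string(length=15):
    """Hypothetical helper: Method 1 with balanced parentheses."""
    alphabet = string.ascii_letters + string.digits
    # join() takes the whole generator expression as its single argument,
    # so the call needs one ')' for range(...) and one more for join(...)
    return ''.join(random.choice(alphabet) for _ in range(length))


if __name__ == '__main__':
    print(random_string())
```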

caililin #4

Addendum: in theory……

gougou168 #5

1,000,000 runs with timeit:
First: 18 s
Second: 17.3 s
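
A measurement like this could be reproduced with something along the lines of the sketch below. The function names `method_1`, `method_2`, and `benchmark` are made up for illustration, and the absolute numbers will differ by machine and Python version, so treat it as a starting point rather than the exact script used above.

```python
import random
import string
import timeit

ALPHABET = string.ascii_letters + string.digits


def method_1(length=15):
    # join over a generator expression
    return ''.join(random.choice(ALPHABET) for _ in range(length))


def method_2(length=15):
    # explicit loop with string concatenation
    s = ''
    for _ in range(length):
        s += random.choice(ALPHABET)
    return s


def benchmark(number=100000):
    # timeit.timeit accepts a callable and returns total seconds for `number` calls
    return {
        'method_1': timeit.timeit(method_1, number=number),
        'method_2': timeit.timeit(method_2, number=number),
    }


if __name__ == '__main__':
    for name, seconds in benchmark().items():
        print('{}: {:.3f} s'.format(name, seconds))
```

Raising `number` to 1,000,000 matches the run count quoted above, at the cost of a longer wait.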

wuwangju #6

Since Python 3.6 you can use:
''.join(random.choices(string.ascii_letters + string.digits, k=15)

eggper #7

I hadn't noticed, thanks for pointing it out.

zlyuanteng #8

Thanks. By the look of it, you dropped a parenthesis too, same as me ^_^

wuwangju #9

This standard-library function is the fastest: 1,000,000 characters took 3.15765118598938 s

htzhanglong #10

https://gist.github.com/wonderbeyond/1806c7b43d3e642e5ad0aee7052b8e8f

These are my notes, copied from Django's implementation. Why is it written in such a complicated way? Feel free to share your thoughts.

phonegap100 #11

Didn't your IDE tell you that the i in "for i" is never used?

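bupafengyu #12

It's worth comparing the source of choice and choices:
https://hg.python.org/cpython/file/tip/Lib/random.py#l252
https://hg.python.org/cpython/file/tip/Lib/random.py#l340
choice generates a random integer index.
choices maps the weights (equal by default) onto a number line from 0 to 1, then calls random() to get a float between 0 and 1 and looks up where it falls on that line.
Both are built on random() underneath, and choices is the more complex of the two, so in principle it should be slower.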
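
The two index-picking strategies described above can be sketched roughly as follows. These are simplified stand-ins written for illustration, not the CPython source: `pick_like_choice` and `pick_like_choices` are hypothetical names, and the weighted path of the real choices (which searches cumulative weights) is left out.

```python
import random
from math import floor


def pick_like_choice(seq, rng=random):
    # choice-style: draw one random integer index in [0, len(seq))
    return seq[rng.randrange(len(seq))]


def pick_like_choices(seq, k, rng=random):
    # choices-style with equal weights: scale a float in [0.0, 1.0)
    # up to the sequence length and truncate it to an index
    n = len(seq)
    return [seq[floor(rng.random() * n)] for _ in range(k)]


if __name__ == '__main__':
    letters = 'abcdef'
    print(pick_like_choice(letters))
    print(''.join(pick_like_choices(letters, 10)))
```

Passing a seeded `random.Random` instance as `rng` makes the output repeatable, which is handy when comparing the two.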

Testing with cProfile:
>>> cProfile.run('"".join(random.choice(string.ascii_letters + string.digits) for _ in range(10**7))')
60321941 function calls in 21.869 seconds

Ordered by: standard name

  ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
10000001    5.516    0.000   20.772    0.000  <string>:1(<genexpr>)
       1    0.000    0.000   21.869   21.869  <string>:1(<module>)
10000000    6.283    0.000    8.918    0.000  random.py:222(_randbelow)
10000000    5.381    0.000   15.256    0.000  random.py:252(choice)
       1    0.000    0.000   21.869   21.869  {built-in method builtins.exec}
10000000    0.956    0.000    0.956    0.000  {built-in method builtins.len}
10000000    0.785    0.000    0.785    0.000  {method 'bit_length' of 'int' objects}
       1    0.000    0.000    0.000    0.000  {method 'disable' of '_lsprof.Profiler' objects}
10321936    1.851    0.000    1.851    0.000  {method 'getrandbits' of '_random.Random' objects}
       1    1.097    1.097   21.869   21.869  {method 'join' of 'str' objects}

>>> cProfile.run('"".join(random.choices(string.ascii_letters + string.digits, k=10**7))')
10000007 function calls in 3.463 seconds

Ordered by: standard name

  ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
       1    0.014    0.014    3.463    3.463  <string>:1(<module>)
       1    0.000    0.000    3.374    3.374  random.py:340(choices)
       1    2.780    2.780    3.374    3.374  random.py:352(<listcomp>)
       1    0.000    0.000    3.463    3.463  {built-in method builtins.exec}
       1    0.000    0.000    0.000    0.000  {built-in method builtins.len}
       1    0.000    0.000    0.000    0.000  {method 'disable' of '_lsprof.Profiler' objects}
       1    0.075    0.075    0.075    0.075  {method 'join' of 'str' objects}
10000000    0.594    0.000    0.594    0.000  {method 'random' of '_random.Random' objects}

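What this shows:
1. At the bottom, the choice approach uses getrandbits. The source carries this comment:
# Only call self.getrandbits if the original random() builtin method
# has not been overridden or if a new getrandbits() was supplied.
This suggests getrandbits should be faster than random, otherwise the standard library would not use it this way.

2. The choice approach makes about 6 times as many function calls as the choices approach, and its run time is also roughly 6 times longer, so the two are very likely related.

3. Looking at tottime, the choices approach spends most of its time at random.py:352:
return [population[_int(random() * total)] for i in range(k)]
Building the list is expensive here, which makes sense.

The choice approach spends most of its time in <string>:1, random.py:222, and random.py:252.
choice is a 5-line function, so it is hard to see why it eats this much time.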
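
To check the function-call observation on a smaller input, something like the sketch below could be used. It profiles each approach with cProfile and reads the `total_calls` attribute that `pstats.Stats` keeps (the same number printed in the "function calls" header). The helper names `with_choice`, `with_choices`, and `count_calls` are invented for this example, and the exact counts depend on the Python version.

```python
import cProfile
import pstats
import random
import string

ALPHABET = string.ascii_letters + string.digits


def with_choice(n):
    # one Python-level choice() call per character
    return ''.join(random.choice(ALPHABET) for _ in range(n))


def with_choices(n):
    # a single choices() call that loops internally
    return ''.join(random.choices(ALPHABET, k=n))


def count_calls(func, n):
    # profile one invocation and return the total number of recorded calls
    profiler = cProfile.Profile()
    profiler.enable()
    func(n)
    profiler.disable()
    return pstats.Stats(profiler).total_calls


if __name__ == '__main__':
    n = 100000
    print('choice :', count_calls(with_choice, n))
    print('choices:', count_calls(with_choices, n))
```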

sinazl #13

λ python -m timeit -n 1000 -r 10 -s "import random, string" "''.join(random.choices(string.ascii_letters + string.digits, k=10000))"
1000 loops, best of 10: 1.53 msec per loop
λ python -m timeit -n 1000 -r 10 -s "import os" "os.urandom(10000)"
1000 loops, best of 10: 2.91 usec per loop

Note: 1 msec (millisecond) = 1000 usec (microseconds)

bupafengyu #14

Nice, I didn't know about this approach.
>>> cProfile.run('binascii.hexlify(os.urandom(10**7)).decode()')
6 function calls in 0.789 seconds

Ordered by: standard name

  ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
       1    0.005    0.005    0.789    0.789  <string>:1(<module>)
       1    0.032    0.032    0.032    0.032  {built-in method binascii.hexlify}
       1    0.000    0.000    0.789    0.789  {built-in method builtins.exec}
       1    0.747    0.747    0.747    0.747  {built-in method posix.urandom}
       1    0.005    0.005    0.005    0.005  {method 'decode' of 'bytes' objects}
       1    0.000    0.000    0.000    0.000  {method 'disable' of '_lsprof.Profiler' objects}
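
One caveat with the hexlify route: each random byte becomes two hex characters, so the result only uses the 16 characters 0-9 and a-f rather than the 62 letters and digits of the earlier methods. A small sketch of a fixed-length helper built on it, with the hypothetical name `random_hex`:

```python
import binascii
import os


def random_hex(length):
    """Hypothetical helper: `length` random hex characters from os.urandom."""
    # each byte yields two hex characters, so round the byte count up
    n_bytes = (length + 1) // 2
    hex_text = binascii.hexlify(os.urandom(n_bytes)).decode('ascii')
    # trim the extra character when `length` is odd
    return hex_text[:length]


if __name__ == '__main__':
    print(random_hex(15))
```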

zlyuanteng #15

The whole middle chunk of that code is there to build the value passed to random.seed(). Probably to get better randomness.
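
If better randomness is the goal, one way to avoid hand-rolled seeding is random.SystemRandom, which draws from the operating system's randomness source and ignores seed(). The sketch below is only an illustration of that idea, not the code from the linked notes; `random_string_sys` and its parameters are assumed names.

```python
import random
import string

# SystemRandom pulls from os.urandom, so no manual seeding is involved
_sysrand = random.SystemRandom()


def random_string_sys(length=12, allowed=string.ascii_letters + string.digits):
    """Hypothetical helper: random string drawn via the OS randomness source."""
    return ''.join(_sysrand.choice(allowed) for _ in range(length))


if __name__ == '__main__':
    print(random_string_sys())
```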

wuwangju #16

That is probably because <string>:1, random.py:222, and random.py:252 are all implemented in Python and are also the most frequently called, so they dominate the run time.
choice:
10000001    5.516    0.000   20.772    0.000  <string>:1(<genexpr>)
10000000    6.283    0.000    8.918    0.000  random.py:222(_randbelow)
10000000    5.381    0.000   15.256    0.000  random.py:252(choice)

In choices, by contrast, the most frequently called function is implemented in C, so it costs very little.
choices:
10000000    0.594    0.000    0.594    0.000  {method 'random' of '_random.Random' objects}

Someone above also mentioned os.urandom(), which makes a system call (such as /dev/urandom on Unix or CryptGenRandom on Windows) to produce a run of random bytes and is faster still.

caililin #17

Python 3.6 has the secrets module, so there's no need to write this yourself 😂
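
A minimal sketch of what that could look like with secrets (Python 3.6+). The function name `secure_random_string` is made up for illustration; `secrets.choice` and `secrets.token_hex` are the standard-library calls.

```python
import secrets
import string


def secure_random_string(length=15):
    """Hypothetical helper: letters and digits picked with secrets.choice."""
    alphabet = string.ascii_letters + string.digits
    return ''.join(secrets.choice(alphabet) for _ in range(length))


if __name__ == '__main__':
    print(secure_random_string())
    # token_hex(n) returns 2 * n hex characters built from n random bytes
    print(secrets.token_hex(8))
```

secrets is aimed at tokens and passwords, so it trades some speed for randomness that is suitable for security-sensitive use.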

gougou168 #18

In [42]: %time s = binascii.b2a_hex(os.urandom(10**7//2));
CPU times: user 34.8 ms, sys: 318 ms, total: 353 ms
Wall time: 353 ms