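What is the best way to convert a string to bytes in Python 3?

There appear to be two different ways to convert a string to bytes, as seen in the answers to TypeError: 'str' does not support the buffer interface.

Which of these methods is better or more Pythonic? Or is it just a matter of personal preference?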

    b = bytes(mystring, 'utf-8')
    b = mystring.encode('utf-8')
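As a quick sanity check, the two spellings can be compared directly. This is only a minimal sketch; the sample string and the variable names are made up for illustration.

```python
# Hypothetical sample text containing non-ASCII characters.
mystring = "h\u00e9llo w\u00f6rld"

via_constructor = bytes(mystring, "utf-8")
via_method = mystring.encode("utf-8")

# Both spellings should yield equal bytes objects.
print(via_constructor == via_method)
print(type(via_constructor) is bytes and type(via_method) is bytes)
```

So the choice between them is about readability rather than about the result.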

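If you look at the docs for bytes, they point you to bytearray:

bytearray([source[, encoding[, errors]]])

Return a new array of bytes. The bytearray type is a mutable sequence of integers in the range 0 <= x < 256. It has most of the usual methods of mutable sequences, described in Mutable Sequence Types, as well as most methods that the bytes type has; see Bytes and Byte Array Methods.

The optional source parameter can be used to initialize the array in a few different ways:

If it is a string, you must also give the encoding (and optionally, errors) parameters; bytearray() then converts the string to bytes using str.encode().

If it is an integer, the array will have that size and will be initialized with null bytes.

If it is an object conforming to the buffer interface, a read-only buffer of the object will be used to initialize the bytes array.

If it is an iterable, it must be an iterable of integers in the range 0 <= x < 256, which are used as the initial contents of the array.

Without an argument, an array of size 0 is created.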
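The source kinds listed in the quoted documentation can be tried out one by one. The following is a small sketch with made-up variable names, not an exhaustive tour of the constructor.

```python
# From a string: an encoding must be given.
from_text = bytearray("abc", "utf-8")

# From an integer: an array of that size, filled with null bytes.
zero_filled = bytearray(3)

# From an object supporting the buffer interface (here, a bytes object).
from_buffer = bytearray(b"\x01\x02")

# From an iterable of integers in the range 0 <= x < 256.
from_ints = bytearray([65, 66, 67])

# Without an argument: an array of size 0.
empty = bytearray()

print(from_text, zero_filled, from_buffer, from_ints, empty)

# bytes accepts the same kinds of source, but the result is immutable.
print(bytes([65, 66, 67]) == b"ABC")
```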

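So bytes can do much more than just encode a string. It is Pythonic that it lets you call the constructor with any supported type of source argument.

For encoding a string, I think some_string.encode(encoding) is more Pythonic than using the constructor, because it is the most self-documenting: "take this string and encode it with this encoding" is clearer than bytes(some_string, encoding), where there is no explicit verb.

Edit: I checked the Python source. If you pass a unicode string to bytes in CPython, it calls PyUnicode_AsEncodedString, which is the implementation of encode; so you are just skipping a level of indirection if you call encode yourself.

Also, see Serdalis' comment: unicode_string.encode(encoding) is also more Pythonic because its inverse is byte_string.decode(encoding), and symmetry is nice.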
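The symmetry argument is easy to see in a round trip. A minimal sketch, assuming a made-up sample string and UTF-8 as the encoding:

```python
# Hypothetical sample: "naive cafe" with two accented characters.
text = "na\u00efve caf\u00e9"
encoding = "utf-8"

encoded = text.encode(encoding)     # str -> bytes
decoded = encoded.decode(encoding)  # bytes -> str

# The round trip should give back the original string.
print(decoded == text)

# The byte length can differ from the character count:
# each accented character here takes two bytes in UTF-8.
print(len(text), len(encoded))
```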

It is easier than you might think:

    my_str = "hello world"
    my_str_as_bytes = str.encode(my_str)
    type(my_str_as_bytes)  # ensure it is byte representation
    my_decoded_str = my_str_as_bytes.decode()
    type(my_decoded_str)  # ensure it is string representation

The absolute best way is neither of the two, but a third. The first parameter of encode defaults to 'utf-8', so the best way is

    b = mystring.encode()

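This is also faster, because in the C code the default argument is passed not as the string "utf-8" but as NULL, which is much faster to check!

Here are some timings: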

    In [1]: %timeit -r 10 'abc'.encode('utf-8')
    The slowest run took 38.07 times longer than the fastest. This could mean that an intermediate result is being cached.
    10000000 loops, best of 10: 183 ns per loop

    In [2]: %timeit -r 10 'abc'.encode()
    The slowest run took 27.34 times longer than the fastest. This could mean that an intermediate result is being cached.
    10000000 loops, best of 10: 137 ns per loop

Despite the warning, the times were very stable across repeated runs; the deviation was only about 2%.
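Outside IPython, a comparable measurement can be sketched with the standard timeit module. The helper name ns_per_call and the run counts are arbitrary choices for illustration, and the size of the gap (or whether there is one at all) may well differ between machines and Python versions.

```python
import timeit

NUMBER = 200_000  # calls per timing run (arbitrary choice)


def ns_per_call(stmt):
    """Return the best-of-5 time for stmt, in nanoseconds per call."""
    best = min(timeit.repeat(stmt, repeat=5, number=NUMBER))
    return best / NUMBER * 1e9


explicit_ns = ns_per_call("'abc'.encode('utf-8')")
default_ns = ns_per_call("'abc'.encode()")

print(f"encode('utf-8'): {explicit_ns:.0f} ns per call")
print(f"encode():        {default_ns:.0f} ns per call")
```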

    so_string = 'stackoverflow'
    so_bytes = so_string.encode()

You can simply convert a string to bytes using:

    a_string.encode()

and you can simply convert bytes to a string using:

    some_bytes.decode()

bytes.decode and str.encode both have encoding='utf-8' as the default value.
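The defaults can be checked against the explicit form. A short sketch, using a made-up non-ASCII sample so that the encoding actually matters:

```python
# Hypothetical sample: two CJK characters.
text = "\u4f60\u597d"

# Calling encode() with no argument should match the explicit UTF-8 call.
data = text.encode()
print(data == text.encode("utf-8"))

# Likewise for decode().
print(data.decode() == data.decode("utf-8") == text)
```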

The following functions (taken from Effective Python) might be useful for converting str to bytes and bytes to str:

    def to_bytes(bytes_or_str):
        if isinstance(bytes_or_str, str):
            value = bytes_or_str.encode()  # uses 'utf-8' for encoding
        else:
            value = bytes_or_str
        return value  # Instance of bytes

    def to_str(bytes_or_str):
        if isinstance(bytes_or_str, bytes):
            value = bytes_or_str.decode()  # uses 'utf-8' for encoding
        else:
            value = bytes_or_str
        return value  # Instance of str
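For a feel of how these helpers behave, here is a usage sketch. The two functions are restated in condensed form only so that the snippet runs on its own; the sample inputs are made up.

```python
def to_bytes(bytes_or_str):
    # Condensed restatement of the helper above: encode str, pass bytes through.
    return bytes_or_str.encode() if isinstance(bytes_or_str, str) else bytes_or_str


def to_str(bytes_or_str):
    # Condensed restatement of the helper above: decode bytes, pass str through.
    return bytes_or_str.decode() if isinstance(bytes_or_str, bytes) else bytes_or_str


# Either kind of input ends up as the requested type.
print(to_bytes("abc"), to_bytes(b"abc"))
print(to_str(b"abc"), to_str("abc"))
```

Because input that already has the target type is passed through unchanged, the helpers are safe to call at an interface boundary where either type might arrive.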

