Is there a simple way to remove multiple spaces in a string?

Suppose this is the string:

    The   fox jumped   over    the log.

It should turn into:

    The fox jumped over the log.

What is the simplest one- or two-liner that achieves this, without splitting and going into lists?

    >>> import re
    >>> re.sub(' +', ' ', 'The     quick brown    fox')
    'The quick brown fox'

foo is your string:

 " ".join(foo.split())

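Be warned, though, that this removes "all whitespace characters (space, tab, newline, return, formfeed)" (thanks to hhsaffar, see comments). That is, "this is \ta test\n" will effectively end up as "this is a test".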
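To make the warning concrete, here is a small sketch comparing the split/join one-liner with a spaces-only regex on the same input. The variable names are made up for this illustration.

```python
import re

text = "this   is \ta  test\n"

# split() with no argument splits on any run of whitespace and drops
# leading/trailing whitespace, so the tab and the newline disappear too.
collapsed_all = " ".join(text.split())

# ' +' only touches runs of the space character.
collapsed_spaces = re.sub(" +", " ", text)

print(repr(collapsed_all))     # 'this is a test'
print(repr(collapsed_spaces))  # 'this is \ta test\n'
```

If tabs and newlines must survive, the regex (or a replace loop) is the safer choice.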

    import re
    s = "The   fox jumped   over    the log."
    re.sub(r"\s\s+" , " ", s)

or

    re.sub(r"\s\s+", " ", s)

since the space before the comma is listed as a pet peeve in PEP 8, as mentioned by moose in the comments.

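Using regexes with "\s" and doing a simple string.split() will also remove other whitespace, such as newlines, carriage returns, and tabs. Unless that is desired, here are examples that collapse only multiple spaces.

EDIT: As I'm wont to do, I slept on this, and besides correcting a typo in the last results (v3.3.3 @ 64-bit, not 32-bit), the obvious hit me: the test string was rather trivial.

So I got 11 paragraphs, 1000 words, 6665 bytes of Lorem Ipsum to get more realistic time tests. I then added random-length runs of extra spaces throughout: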

 original_string = ''.join(word + (' ' * random.randint(1, 10)) for word in lorem_ipsum.split(' '))

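I also corrected the "proper join". If one cares, the one-liner essentially strips any leading/trailing spaces, whereas this corrected version preserves a leading/trailing space (but only ONE ;-). (I found this because the randomly spaced lorem_ipsum got extra spaces on the end and thus failed the assert.)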

    # setup = '''

    import re

    def while_replace(string):
        while '  ' in string:
            string = string.replace('  ', ' ')

        return string

    def re_replace(string):
        return re.sub(r' {2,}' , ' ', string)

    def proper_join(string):
        split_string = string.split(' ')

        # To account for leading/trailing spaces that would simply be removed
        beg = ' ' if not split_string[ 0] else ''
        end = ' ' if not split_string[-1] else ''

        # versus simply ' '.join(item for item in string.split(' ') if item)
        return beg + ' '.join(item for item in split_string if item) + end

    original_string = """Lorem    ipsum        ... no, really, it kept going...          malesuada enim feugiat.         Integer imperdiet    erat."""

    assert while_replace(original_string) == re_replace(original_string) == proper_join(original_string)

    #'''

    # while_replace_test
    new_string = original_string[:]

    new_string = while_replace(new_string)

    assert new_string != original_string

    # re_replace_test
    new_string = original_string[:]

    new_string = re_replace(new_string)

    assert new_string != original_string

    # proper_join_test
    new_string = original_string[:]

    new_string = proper_join(new_string)

    assert new_string != original_string

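NOTE: ~~The "while version" made a copy of original_string, as I believe that once it is modified on the first run, successive runs would be faster (if only by a bit).~~ ~~As this adds time, I added this string copy to the other two so that the times show the difference only in logic.~~ Keep in mind that the setup of a timeit instance is executed only once, while the statement runs on every iteration. The way I originally did this, the while loop worked on the same label, original_string, so on the second run there was nothing left to do. The way it's set up now, calling a function and using two different labels, that isn't a problem. I've added assert statements to all the workers to verify that we change something on every iteration (for those who may be dubious). For example, change it to this and it breaks: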

    # while_replace_test
    new_string = original_string[:]

    new_string = while_replace(new_string)

    assert new_string != original_string # will break the 2nd iteration

    while '  ' in original_string:
        original_string = original_string.replace('  ', ' ')
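The run-once behaviour of the setup code is easy to see in isolation. The following is a minimal sketch, not the author's benchmark: the names state and changed are invented for this illustration, and the globals argument of timeit.Timer requires Python 3.5 or later.

```python
import timeit

changed = []  # records whether each run of the statement modified the string

# The setup runs once per timing run, so the string is created a single time.
setup = "state = {'s': 'a  b'}"

# The statement runs `number` times against that same object.
stmt = (
    "before = state['s']\n"
    "state['s'] = before.replace('  ', ' ')\n"
    "changed.append(before != state['s'])"
)

timeit.Timer(stmt=stmt, setup=setup, globals={"changed": changed}).timeit(number=3)
print(changed)  # [True, False, False]
```

Only the first iteration does real work, which is why a benchmark that mutates its own input under the same name ends up timing a no-op.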

    Tests run on a laptop with an i5 processor running Windows 7 (64-bit).

        timeit.Timer(stmt = test, setup = setup).repeat(7, 1000)

    test_string = 'The   fox   jumped   over\n\t    the log.' # trivial

    Python 2.7.3, 32-bit, Windows
                    test |    minimum |    maximum |    average |     median
    ---------------------+------------+------------+------------+-----------
      while_replace_test |   0.001066 |   0.001260 |   0.001128 |   0.001092
         re_replace_test |   0.003074 |   0.003941 |   0.003357 |   0.003349
        proper_join_test |   0.002783 |   0.004829 |   0.003554 |   0.003035

    Python 2.7.3, 64-bit, Windows
                    test |    minimum |    maximum |    average |     median
    ---------------------+------------+------------+------------+-----------
      while_replace_test |   0.001025 |   0.001079 |   0.001052 |   0.001051
         re_replace_test |   0.003213 |   0.004512 |   0.003656 |   0.003504
        proper_join_test |   0.002760 |   0.006361 |   0.004626 |   0.004600

    Python 3.2.3, 32-bit, Windows
                    test |    minimum |    maximum |    average |     median
    ---------------------+------------+------------+------------+-----------
      while_replace_test |   0.001350 |   0.002302 |   0.001639 |   0.001357
         re_replace_test |   0.006797 |   0.008107 |   0.007319 |   0.007440
        proper_join_test |   0.002863 |   0.003356 |   0.003026 |   0.002975

    Python 3.3.3, 64-bit, Windows
                    test |    minimum |    maximum |    average |     median
    ---------------------+------------+------------+------------+-----------
      while_replace_test |   0.001444 |   0.001490 |   0.001460 |   0.001459
         re_replace_test |   0.011771 |   0.012598 |   0.012082 |   0.011910
        proper_join_test |   0.003741 |   0.005933 |   0.004341 |   0.004009

    test_string = lorem_ipsum
    # Thanks to http://www.lipsum.com/
    # "Generated 11 paragraphs, 1000 words, 6665 bytes of Lorem Ipsum"

    Python 2.7.3, 32-bit
                    test |    minimum |    maximum |    average |     median
    ---------------------+------------+------------+------------+-----------
      while_replace_test |   0.342602 |   0.387803 |   0.359319 |   0.356284
         re_replace_test |   0.337571 |   0.359821 |   0.348876 |   0.348006
        proper_join_test |   0.381654 |   0.395349 |   0.388304 |   0.388193

    Python 2.7.3, 64-bit
                    test |    minimum |    maximum |    average |     median
    ---------------------+------------+------------+------------+-----------
      while_replace_test |   0.227471 |   0.268340 |   0.240884 |   0.236776
         re_replace_test |   0.301516 |   0.325730 |   0.308626 |   0.307852
        proper_join_test |   0.358766 |   0.383736 |   0.370958 |   0.371866

    Python 3.2.3, 32-bit
                    test |    minimum |    maximum |    average |     median
    ---------------------+------------+------------+------------+-----------
      while_replace_test |   0.438480 |   0.463380 |   0.447953 |   0.446646
         re_replace_test |   0.463729 |   0.490947 |   0.472496 |   0.468778
        proper_join_test |   0.397022 |   0.427817 |   0.406612 |   0.402053

    Python 3.3.3, 64-bit
                    test |    minimum |    maximum |    average |     median
    ---------------------+------------+------------+------------+-----------
      while_replace_test |   0.284495 |   0.294025 |   0.288735 |   0.289153
         re_replace_test |   0.501351 |   0.525673 |   0.511347 |   0.508467
        proper_join_test |   0.422011 |   0.448736 |   0.436196 |   0.440318

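For the trivial string, the while loop appears to be the fastest, followed by the Pythonic string split/join, with the regex bringing up the rear.

For non-trivial strings, there seems to be a bit more to consider. 32-bit 2.7? It's regex to the rescue! 2.7 64-bit? A while loop is best, by a decent margin. 32-bit 3.2, go with the "proper" join. 64-bit 3.3, go for a while loop. Again.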
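Since the ranking changes between interpreters, it is worth re-running the comparison on your own setup. The following is a rough sketch of such a harness for Python 3, not the script that produced the tables above: the summarize helper and the short sample string are assumptions made for this illustration.

```python
import re
import statistics
import timeit


def while_replace(string):
    while '  ' in string:
        string = string.replace('  ', ' ')
    return string


def re_replace(string):
    return re.sub(r' {2,}', ' ', string)


def proper_join(string):
    split_string = string.split(' ')
    beg = ' ' if not split_string[0] else ''
    end = ' ' if not split_string[-1] else ''
    return beg + ' '.join(item for item in split_string if item) + end


# Stand-in input (an assumption); swap in your own text for realistic numbers.
sample = 'The   fox   jumped   over\n\t    the log.'


def summarize(func, text, repeat=7, number=1000):
    """Return (minimum, maximum, average, median) over `repeat` timing runs."""
    timings = timeit.repeat(lambda: func(text), repeat=repeat, number=number)
    return (min(timings), max(timings),
            statistics.mean(timings), statistics.median(timings))


for func in (while_replace, re_replace, proper_join):
    row = summarize(func, sample)
    print(func.__name__.ljust(14), ' | '.join(f'{value:.6f}' for value in row))
```

Because each call receives the text as an argument and returns a new string, every iteration does the same amount of work.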

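In the end, one can improve performance if/where/when needed, but it's always best to remember the mantra:

Make It Work
Make It Right
Make It Fast

IANAL, YMMV, Caveat Emptor!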

I have to agree with Paul McGuire's comment above. To me,

  ' '.join(the_string.split())

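is vastly preferable to whipping out a regex. My measurements (Linux, Python 2.5) show split-then-join to be almost 5 times faster than doing "re.sub(...)", and still 3 times faster if you precompile the regex once and perform the operation multiple times. It is also, by any measure, easier to understand and much more Pythonic.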
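For readers who want to repeat the measurement, here is a hedged sketch of the two variants being compared, with the regex precompiled. The function names and the sample string are made up for this illustration, and the ratio you see will depend on your Python version and input.

```python
import re
import timeit

WHITESPACE_RUN = re.compile(r'\s+')  # compiled once, reused on every call


def collapse_with_regex(text):
    # strip() mirrors the way split() discards leading/trailing whitespace
    return WHITESPACE_RUN.sub(' ', text).strip()


def collapse_with_split(text):
    return ' '.join(text.split())


sample = '  The   fox jumped \t over\n the log.  '

for func in (collapse_with_regex, collapse_with_split):
    seconds = timeit.timeit(lambda: func(sample), number=10000)
    print(func.__name__, round(seconds, 4))
```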

Similar to the previous solutions, but more specific: replace two or more whitespace characters with a single space:

    >>> import re
    >>> s = "The   fox jumped   over    the log."
    >>> re.sub(r'\s{2,}', ' ', s)
    'The fox jumped over the log.'

A simple solution:

    >>> import re
    >>> s = "The   fox jumped   over    the log."
    >>> print(re.sub(r'\s+', ' ', s))
    The fox jumped over the log.

Another alternative:

    >>> import re
    >>> text = 'this is a string with   multiple spaces and \t\ttabs'
    >>> text = re.sub('[ \t]+', ' ', text)
    >>> print(text)
    this is a string with multiple spaces and tabs
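The regex answers so far differ only in the pattern they pass to re.sub, and the patterns are not interchangeable. As a quick illustration (the sample string is invented for this sketch), here is what each one does to the same input:

```python
import re

# Two spaces, two tabs, a newline wrapped in single spaces, and a lone tab.
sample = 'a  b\t\tc \n d\te'

patterns = [r' +', r' {2,}', r'[ \t]+', r'\s{2,}', r'\s+']
results = {pattern: re.sub(pattern, ' ', sample) for pattern in patterns}

for pattern, result in results.items():
    print(f'{pattern!r:10} -> {result!r}')

# ' +'      and ' {2,}' -> 'a b\t\tc \n d\te'   (only runs of spaces change)
# '[ \t]+'              -> 'a b c \n d e'       (tabs go, newlines stay)
# '\s{2,}'              -> 'a b c d\te'         (a lone tab survives)
# '\s+'                 -> 'a b c d e'          (all whitespace collapses)
```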

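One line of code to remove all extra spaces before, after, and within a sentence:

    sentence = "  The   fox jumped   over    the log.  "
    sentence = ' '.join(filter(None, sentence.split(' ')))

Explanation:

Split the entire string into a list.
Filter empty elements from the list.
Rejoin the remaining elements* with a single space.

*The remaining elements should be words or words with punctuation, etc. I did not test this extensively, but this should be a good starting point. All the best!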
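The three steps can be sketched one at a time, which makes the intermediate values visible. The names pieces and non_empty are introduced here purely for illustration.

```python
sentence = "  The   fox jumped   over    the log.  "

# Step 1: splitting on a single space leaves an empty string for every
# extra space, e.g. ['', '', 'The', '', '', 'fox', ...]
pieces = sentence.split(" ")

# Step 2: filter(None, ...) drops falsy items, i.e. the empty strings.
non_empty = list(filter(None, pieces))

# Step 3: rejoin the surviving words with exactly one space.
result = " ".join(non_empty)

print(non_empty)  # ['The', 'fox', 'jumped', 'over', 'the', 'log.']
print(result)     # The fox jumped over the log.
```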

This also seems to work:

    while "  " in s:
        s = s.replace("  ", " ")

where the variable s represents your string.
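The loop matters because a single replace pass is not enough for runs longer than two spaces. A small sketch (collapse_spaces is a name chosen for this example):

```python
def collapse_spaces(s):
    # Keep replacing pairs of spaces until no pair is left.
    while "  " in s:
        s = s.replace("  ", " ")
    return s


s = "a" + " " * 5 + "b"

# One pass replaces non-overlapping pairs: five spaces become three.
once = s.replace("  ", " ")

print(repr(once))                # 'a   b'
print(repr(collapse_spaces(s)))  # 'a b'
```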

    def unPretty(S):
        # Given a dictionary, JSON, list, float, int, or even a string,
        # return a string stripped of CR, with LF replaced by a space and
        # multiple spaces reduced to one.
        return ' '.join(str(S).replace('\n', ' ').replace('\r', '').split())

If it's whitespace you're dealing with, splitting on None will not include empty strings in the returned value.

https://docs.python.org/2/library/stdtypes.html#str.split
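A short sketch of the difference between the two split modes described in the linked documentation (the sample string is invented for this illustration):

```python
text = "  a  b "

# An explicit separator keeps an empty string for every extra space.
with_separator = text.split(" ")

# No separator (equivalent to sep=None) splits on runs of whitespace
# and never returns empty strings.
without_separator = text.split()

print(with_separator)     # ['', '', 'a', '', 'b', '']
print(without_separator)  # ['a', 'b']
```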

    string = 'This is a  string full of spaces          and taps'
    string = string.split(' ')
    while '' in string:
        string.remove('')
    string = ' '.join(string)
    print(string)

Results:

    This is a string full of spaces and taps

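To remove white space, taking into account leading, trailing, and extra white space between words, use:

    (?<=\s) +|^ +(?=\s)| (?= +[\n\0])

The first alternative deals with leading white space, the second with leading white space at the start of the string, and the last with trailing white space.

As proof of use, this link will provide you with a test.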

https://regex101.com/r/meBYli/4

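Let me know if you find an input that breaks this regex.

Also, this is meant to be used with the re.split function.

In some cases it's desirable to replace consecutive occurrences of each whitespace character with a single instance of that character. You'd use a regular expression with backreferences to do that.

(\s)\1{1,} matches any whitespace character followed by one or more occurrences of that same character. Now all you need to do is specify the first group (\1) as the replacement for the match.

Wrapping this in a function: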

    import re

    def normalize_whitespace(string):
        return re.sub(r'(\s)\1{1,}', r'\1', string)

    >>> normalize_whitespace('The   fox jumped   over    the log.')
    'The fox jumped over the log.'
    >>> normalize_whitespace('First    line\t\t\t \n\n\nSecond    line')
    'First line\t \nSecond line'

    import re
    string = re.sub('[ \t\n]+', ' ', 'The     quick brown \n\n \t fox')

This replaces every run of tabs, newlines, and spaces with a single space.
