Collapse sequences of white space into a single character and trim string
Consider the following example:
"    Hello      this  is a   long       string!   "
I want to convert it to:
"Hello this is a long string!"
OS X 10.7+ and iOS 3.2+
Use the native regular expression solution provided by hfossli.
Otherwise
Either use your favourite regular expression library or use the following Cocoa-native solution:
    NSString *theString = @"    Hello      this  is a   long       string!   ";

    NSCharacterSet *whitespaces = [NSCharacterSet whitespaceCharacterSet];
    NSPredicate *noEmptyStrings = [NSPredicate predicateWithFormat:@"SELF != ''"];

    // Split on whitespace, drop the empty pieces, rejoin with single spaces.
    NSArray *parts = [theString componentsSeparatedByCharactersInSet:whitespaces];
    NSArray *filteredArray = [parts filteredArrayUsingPredicate:noEmptyStrings];
    theString = [filteredArray componentsJoinedByString:@" "];
Regular expressions and NSCharacterSet are here to help you. This solution trims leading and trailing whitespace and also collapses runs of multiple spaces.
    NSString *original = @"    Hello      this  is a   long       string!   ";

    // Collapse each run of spaces into a single space.
    NSString *squashed = [original stringByReplacingOccurrencesOfString:@"[ ]+"
                                                             withString:@" "
                                                                options:NSRegularExpressionSearch
                                                                  range:NSMakeRange(0, original.length)];

    // Remove whatever whitespace is left at either end.
    NSString *final = [squashed stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
Logging final
gives
"Hello this is a long string!"
Possible alternative regex patterns:
- Replace only spaces:
[ ]+
- Replace spaces and tabs:
[ \\t]+
- Replace spaces, tabs and newlines:
\\s+
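To make the difference between the three patterns concrete, here is a minimal sketch. The helper name CollapseWithPattern is hypothetical (it does not come from the answer); it only wraps the two calls shown above so that the pattern can be swapped.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper, not part of the original answer: replaces every match
// of `pattern` with a single space, then trims whitespace and newlines from
// both ends of the result.
static NSString *CollapseWithPattern(NSString *input, NSString *pattern) {
    NSString *squashed =
        [input stringByReplacingOccurrencesOfString:pattern
                                         withString:@" "
                                            options:NSRegularExpressionSearch
                                              range:NSMakeRange(0, input.length)];
    return [squashed stringByTrimmingCharactersInSet:
            [NSCharacterSet whitespaceAndNewlineCharacterSet]];
}
```

For the input @"  a \t b\n\nc  ", the pattern @"[ ]+" should leave the tab and both newlines in place, @"[ \\t]+" should leave only the newlines, and @"\\s+" should produce @"a b c".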
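Performance rundown
- This solution: 7.6 seconds
- Splitting, filtering, joining (Georg Schölly): 13.7 seconds
Ease of extension, performance, number of lines of code and the number of objects created make this solution appropriate.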
Actually, there's a very simple solution to that:
    NSString *string = @"  spaces in front and at the end  ";
    // Trims leading and trailing whitespace only; inner runs are left as they are.
    NSString *trimmedString = [string stringByTrimmingCharactersInSet:
                               [NSCharacterSet whitespaceAndNewlineCharacterSet]];
    NSLog(@"%@", trimmedString);
(Source)
With a regular expression, but without the need for any external framework:
    NSString *theString = @"    Hello      this  is a   long       string!   ";
    theString = [theString stringByReplacingOccurrencesOfString:@" +"
                                                     withString:@" "
                                                        options:NSRegularExpressionSearch
                                                          range:NSMakeRange(0, theString.length)];
One-line solution:
    NSString *whitespaceString = @"  String with whitespaces  ";
    // Note: this removes every space, including the ones between words.
    NSString *trimmedString = [whitespaceString stringByReplacingOccurrencesOfString:@" " withString:@""];
This should do it...
    NSString *s = @"this  is    a   string    with  lots   of     white space";
    NSArray *comps = [s componentsSeparatedByCharactersInSet:[NSCharacterSet whitespaceCharacterSet]];

    NSMutableArray *words = [NSMutableArray array];
    for (NSString *comp in comps) {
        // Skip the empty components produced by consecutive whitespace.
        // (Testing for > 1 would also drop one-letter words such as "a".)
        if ([comp length] > 0) {
            [words addObject:comp];
        }
    }

    NSString *result = [words componentsJoinedByString:@" "];
Another option for regular expressions is RegexKitLite, which is very easy to embed in an iPhone project:
    [theString stringByReplacingOccurrencesOfRegex:@" +" withString:@" "];
Try this:
    NSString *theString = @"    Hello      this  is a   long       string!   ";

    // Keep replacing double spaces with single spaces until none are left.
    while ([theString rangeOfString:@"  "].location != NSNotFound) {
        theString = [theString stringByReplacingOccurrencesOfString:@"  " withString:@" "];
    }
Here is a snippet from an NSString extension, where "self" is the NSString instance. By passing [NSCharacterSet whitespaceAndNewlineCharacterSet] and ' ' as the two arguments, it can be used to collapse contiguous whitespace into a single space.
    - (NSString *)stringCollapsingCharacterSet:(NSCharacterSet *)characterSet
                                   toCharacter:(unichar)ch
    {
        NSUInteger fullLength = [self length];
        NSUInteger length = 0;
        unichar *newString = malloc(sizeof(unichar) * (fullLength + 1));

        BOOL isInCharset = NO;
        for (NSUInteger i = 0; i < fullLength; i++) {
            unichar thisChar = [self characterAtIndex:i];

            if ([characterSet characterIsMember:thisChar]) {
                isInCharset = YES;
            } else {
                // A run of set members just ended: emit one replacement character.
                // A leading run therefore becomes one `ch`; a trailing run is dropped.
                if (isInCharset) {
                    newString[length++] = ch;
                }
                newString[length++] = thisChar;
                isInCharset = NO;
            }
        }

        newString[length] = '\0';

        NSString *result = [NSString stringWithCharacters:newString length:length];
        free(newString);

        return result;
    }
Alternative solution: get yourself a copy of OgreKit (the Cocoa regular expression library).
- OgreKit (Japanese webpage, the code is in English)
- OgreKit (Google autotranslation)
The whole function is then:
    NSString *theStringTrimmed = [theString stringByTrimmingCharactersInSet:
                                  [NSCharacterSet whitespaceAndNewlineCharacterSet]];
    // The backslash must be doubled inside an Objective-C string literal.
    OGRegularExpression *regex = [OGRegularExpression regularExpressionWithString:@"\\s+"];
    return [regex replaceAllMatchesInString:theStringTrimmed withString:@" "];
Short and sweet.
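If you're after the fastest solution, a carefully constructed series of instructions using NSScanner would probably work best, but that would only be necessary if you plan to process huge (many megabytes) blocks of text.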
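The answer gives no code for the NSScanner route, so the following is only a sketch of one way it might look, not the answerer's implementation. The function name CollapseWithScanner is hypothetical, and no claim is made here about how fast it is compared with the regex versions.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: rebuilds the string word by word with NSScanner.
static NSString *CollapseWithScanner(NSString *input) {
    NSCharacterSet *whitespace = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    NSScanner *scanner = [NSScanner scannerWithString:input];

    // The scanner skips these characters before each scan operation, so
    // leading, trailing and repeated whitespace never ends up in `word`.
    // (This is also NSScanner's default; it is set here to be explicit.)
    scanner.charactersToBeSkipped = whitespace;

    NSMutableString *result = [NSMutableString stringWithCapacity:input.length];
    NSString *word = nil;

    // Each iteration reads one run of non-whitespace characters.
    while ([scanner scanUpToCharactersFromSet:whitespace intoString:&word]) {
        if (result.length > 0) {
            [result appendString:@" "];
        }
        [result appendString:word];
    }
    return result;
}
```

Because the scan set and the skip set are the same here, tabs and newlines are collapsed along with spaces; passing a narrower set would be the place to change that.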
@Mathieu Godart's answer is the best one, but a line is missing. All the answers only reduce the spaces between words, but what if there are tabs, or a tab where a space should be, as in " this is text \t , and\tTab between , so on "? In three lines of code we can reduce the white space and get the string we want:
    NSString *str_aLine = @"    this is text \t , and\tTab between      , so on    ";
    // replace tabs with spaces
    str_aLine = [str_aLine stringByReplacingOccurrencesOfString:@"\t" withString:@" "];
    // reduce runs of spaces to one space
    str_aLine = [str_aLine stringByReplacingOccurrencesOfString:@" +"
                                                     withString:@" "
                                                        options:NSRegularExpressionSearch
                                                          range:NSMakeRange(0, str_aLine.length)];
    // trim white space from the beginning and the end
    str_aLine = [str_aLine stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
The result is
"this is text , and Tab between , so on"
Without replacing the tabs, the result would be (tabs shown as \t):
"this is text \t , and\tTab between , so on"
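You can also use a simple while loop. There is no RegEx magic in there, so maybe it is easier to understand and alter in the future:
    // yourNSStringObject must be an NSMutableString; the method returns the number of replacements made.
    while ([yourNSStringObject replaceOccurrencesOfString:@"  "
                                               withString:@" "
                                                  options:0
                                                    range:NSMakeRange(0, [yourNSStringObject length])] > 0);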
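The following two regular expressions will work, depending on the requirements:
- @" +" for matching runs of spaces (it does not match tabs; use @"[ \\t]+" for spaces and tabs)
- @"\\s{2,}" for matching runs of two or more spaces, tabs and line breaks
Then apply NSString's instance method stringByReplacingOccurrencesOfString:withString:options:range: to replace them with a single space.
For example:
    [string stringByReplacingOccurrencesOfString:regex
                                      withString:@" "
                                         options:NSRegularExpressionSearch
                                           range:NSMakeRange(0, [string length])];
Note: I did not use the "RegexKitLite" library for the above functionality on iOS 5.x and above.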
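The same replacement can also be written against NSRegularExpression directly, which may be convenient when one compiled pattern is applied to many strings. This is a sketch under that assumption rather than something taken from the answers above; the function name CollapseWhitespace is hypothetical, and NSRegularExpression needs a newer system than the NSRegularExpressionSearch option does.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: collapses every run of whitespace (spaces, tabs,
// newlines) into one space and trims both ends.
static NSString *CollapseWhitespace(NSString *input) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"\\s+"
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        // The pattern failed to compile; return the input unchanged.
        return input;
    }

    NSString *collapsed =
        [regex stringByReplacingMatchesInString:input
                                        options:0
                                          range:NSMakeRange(0, input.length)
                                   withTemplate:@" "];

    return [collapsed stringByTrimmingCharactersInSet:
            [NSCharacterSet whitespaceAndNewlineCharacterSet]];
}
```

In real code the regex object would usually be created once and reused; it is built inside the function here only to keep the sketch self-contained.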