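Using Objective-C/Cocoa to unescape Unicode characters, i.e. \u1234

Some sites that I am fetching data from return UTF-8 strings in which the non-ASCII characters are escaped, i.e.: \u5404\u500b\u90fd

Is there a built-in Cocoa function that might help with this, or will I have to write my own decoding algorithm?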

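There is no built-in function for this kind of unescaping.

You can cheat a little with NSPropertyListSerialization, since an "old text style" plist supports escapes of the form \Uxxxx: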

    NSString* input = @"ab\"cA\"BC\\u2345\\u0123"; // will cause trouble if you have "abc\\\\uvw"
    NSString* esc1 = [input stringByReplacingOccurrencesOfString:@"\\u" withString:@"\\U"];
    NSString* esc2 = [esc1 stringByReplacingOccurrencesOfString:@"\"" withString:@"\\\""];
    NSString* quoted = [[@"\"" stringByAppendingString:esc2] stringByAppendingString:@"\""];
    NSData* data = [quoted dataUsingEncoding:NSUTF8StringEncoding];
    // Note: this method is deprecated in newer SDKs in favor of
    // propertyListWithData:options:format:error:
    NSString* unesc = [NSPropertyListSerialization propertyListFromData:data
                                                       mutabilityOption:NSPropertyListImmutable
                                                                 format:NULL
                                                       errorDescription:NULL];
    assert([unesc isKindOfClass:[NSString class]]);
    NSLog(@"Output = %@", unesc);

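But bear in mind that this is not very efficient. You are much better off writing your own parser. (By the way, are you decoding JSON strings? If so, you could use an existing JSON parser.)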
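For anyone wondering what "your own parser" might look like, here is a minimal sketch in plain C, which can be called directly from Objective-C. The names (`unescape_unicode` and its helpers) are made up for illustration and are not part of any Apple API. It assumes the input is a NUL-terminated UTF-8 string, decodes only `\uXXXX` escapes (combining surrogate pairs), leaves an escaped backslash (`\\`) untouched so that input like `abc\\uvw` is not misread, and copies malformed escapes through verbatim.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Value of a hex digit, or -1 if c is not one (this includes the NUL terminator). */
static int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Parses "\uXXXX" at p. Returns the 16-bit code unit, or -1 if p does not
   start with a well-formed escape. Never reads past the NUL terminator. */
static long parse_escape(const char *p) {
    long value = 0;
    int i;
    if (p[0] != '\\' || p[1] != 'u') return -1;
    for (i = 2; i < 6; i++) {
        int digit = hex_value(p[i]);
        if (digit < 0) return -1;
        value = value * 16 + digit;
    }
    return value;
}

/* Writes code point cp to out as UTF-8 and returns the number of bytes (1 to 4). */
static size_t put_utf8(unsigned long cp, char *out) {
    if (cp < 0x80) {
        out[0] = (char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (char)(0xC0 | (cp >> 6));
        out[1] = (char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (char)(0xE0 | (cp >> 12));
        out[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (char)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (char)(0xF0 | (cp >> 18));
    out[1] = (char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (char)(0x80 | (cp & 0x3F));
    return 4;
}

/* Decodes \uXXXX escapes from `in` into UTF-8 in `out`.
   `out` must have room for strlen(in) + 1 bytes; the output is never longer
   than the input. Caveat: "\u0000" decodes to a NUL byte, which ends the
   C string early. */
void unescape_unicode(const char *in, char *out) {
    assert(in != NULL && out != NULL);
    while (*in != '\0') {
        long unit;
        if (strncmp(in, "\\\\", 2) == 0) {
            /* Escaped backslash: copy both characters so that a following
               'u' is not mistaken for the start of an escape. */
            *out++ = *in++;
            *out++ = *in++;
            continue;
        }
        unit = parse_escape(in);
        if (unit < 0) {
            *out++ = *in++;          /* ordinary character or malformed escape */
            continue;
        }
        in += 6;
        if (unit >= 0xD800 && unit <= 0xDBFF) {
            /* High surrogate: combine with a following low surrogate, if any. */
            long low = parse_escape(in);
            if (low >= 0xDC00 && low <= 0xDFFF) {
                unit = 0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00);
                in += 6;
            }
        }
        if (unit >= 0xD800 && unit <= 0xDFFF) {
            unit = 0xFFFD;           /* lone surrogate: substitute U+FFFD */
        }
        out += put_utf8((unsigned long)unit, out);
    }
    *out = '\0';
}
```

From Objective-C, the decoded buffer can then be turned back into a string with `[NSString stringWithUTF8String:buf]`. Whether an escaped backslash should stay as two characters or be collapsed to one depends on the format your server actually emits, so treat that branch as an assumption to adjust.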

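It's correct that Cocoa does not offer a solution, but Core Foundation does: CFStringTransform.

CFStringTransform lives in a dusty, remote corner of Mac OS (and iOS), so it is a little-known gem. It is the front end to Apple's ICU-compatible string transformation engine. It can perform real magic, such as transliteration between Greek and Latin (or just about any known script), but it can also be used for mundane tasks like unescaping strings from a crappy server: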

    NSString *input = @"\\u5404\\u500b\\u90fd";
    // declared as NSMutableString so the bridge cast to CFMutableStringRef is explicit
    NSMutableString *convertedString = [input mutableCopy];

    CFStringRef transform = CFSTR("Any-Hex/Java");
    CFStringTransform((__bridge CFMutableStringRef)convertedString, NULL, transform, YES);

    NSLog(@"convertedString: %@", convertedString);
    // prints: 各個都, tada!

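As I said, CFStringTransform is really powerful. It supports a number of predefined transforms, such as case mappings, normalizations, and Unicode character name conversion. You can even design your own transforms.

~~I have no idea why Apple does not make it available from Cocoa.~~

Edit 2015:

OS X 10.11 and iOS 9 add the following method to Foundation: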

    - (nullable NSString *)stringByApplyingTransform:(NSString *)transform reverse:(BOOL)reverse;

So the example above becomes…

    NSString *input = @"\\u5404\\u500b\\u90fd";
    NSString *convertedString = [input stringByApplyingTransform:@"Any-Hex/Java"
                                                         reverse:YES];

    NSLog(@"convertedString: %@", convertedString);

Thanks to @nschmidt for the heads-up.

Here's what I ended up writing. Hopefully it will help someone.

    + (NSString*) unescapeUnicodeString:(NSString*)string
    {
        // unescape quotes and backslashes
        NSString* unescapedString = [string stringByReplacingOccurrencesOfString:@"\\\"" withString:@"\""];
        unescapedString = [unescapedString stringByReplacingOccurrencesOfString:@"\\\\" withString:@"\\"];

        // tokenize based on the unicode escape marker
        NSMutableString* tokenizedString = [NSMutableString string];
        NSScanner* scanner = [NSScanner scannerWithString:unescapedString];
        // NSScanner skips whitespace by default, which would drop spaces
        // at the start of a token; disable that so the text is preserved
        scanner.charactersToBeSkipped = nil;
        while ([scanner isAtEnd] == NO)
        {
            // read up to the next unicode marker
            // if a string has been scanned, it's a token
            // and should be appended to the tokenized string
            NSString* token = @"";
            [scanner scanUpToString:@"\\u" intoString:&token];
            if (token != nil && token.length > 0)
            {
                [tokenizedString appendString:token];
                continue;
            }

            // the scanner is now at a "\u" marker
            // check that the marker plus 4 hex digits (6 characters)
            // fit before the end of the string (could be malformed)
            // and if they don't, copy the remainder verbatim
            // and move the scanner to the end
            NSUInteger location = [scanner scanLocation];
            NSUInteger remaining = scanner.string.length - location;
            if (remaining < 6)
            {
                [tokenizedString appendString:[scanner.string substringFromIndex:location]];
                [scanner setScanLocation:scanner.string.length];
                continue;
            }

            // move the location past the unicode marker
            // then read in the next 4 characters
            location += 2;
            NSRange range = {location, 4};
            token = [scanner.string substringWithRange:range];
            unichar codeValue = (unichar) strtol([token UTF8String], NULL, 16);
            [tokenizedString appendString:[NSString stringWithFormat:@"%C", codeValue]];

            // move the scanner past the 4 characters
            // then keep scanning
            location += 4;
            [scanner setScanLocation:location];
        }

        // done
        return tokenizedString;
    }

    + (NSString*) escapeUnicodeString:(NSString*)string
    {
        // first escape backslashes and quotes
        // note that the backslash has to be escaped before the quote
        // otherwise it will end up with an extra backslash
        NSString* escapedString = [string stringByReplacingOccurrencesOfString:@"\\" withString:@"\\\\"];
        escapedString = [escapedString stringByReplacingOccurrencesOfString:@"\"" withString:@"\\\""];

        // convert to encoded unicode
        // do this by getting the data for the string
        // in UTF-16 little endian (this matches the host byte order on
        // little-endian CPUs; it is not network byte order)
        NSData* data = [escapedString dataUsingEncoding:NSUTF16LittleEndianStringEncoding allowLossyConversion:YES];
        size_t bytesRead = 0;
        const char* bytes = data.bytes;
        NSMutableString* encodedString = [NSMutableString string];

        // loop through the byte array
        // read two bytes at a time; if the value is
        // above 0x7E it is written as a \uXXXX escape,
        // otherwise it is an ASCII character
        // the %C format will write the character value of the bytes
        while (bytesRead < data.length)
        {
            uint16_t code = *((uint16_t*) &bytes[bytesRead]);
            if (code > 0x007E)
            {
                [encodedString appendFormat:@"\\u%04X", code];
            }
            else
            {
                [encodedString appendFormat:@"%C", code];
            }
            bytesRead += sizeof(uint16_t);
        }

        // done
        return encodedString;
    }

Simple code:

    const char *cString = [unicodeStr cStringUsingEncoding:NSUTF8StringEncoding];
    NSString *resultStr = [NSString stringWithCString:cString encoding:NSNonLossyASCIIStringEncoding];

From: https://stackoverflow.com/a/7861345
