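How can I split a "caps"-delimited string into an array?
How do I go from this string: "ThisIsMyCapsDelimitedString"
...to this string: "This Is My Caps Delimited String"
The fewest lines of code in VB.NET is preferred, but C# is also welcome.
Cheers!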
I made this a while ago. It matches each component of a CamelCase name.
    /([A-Z]+(?=$|[A-Z][a-z])|[A-Z]?[a-z]+)/g
For example:
    "SimpleHTTPServer" => ["Simple", "HTTP", "Server"]
    "camelCase" => ["camel", "Case"]
To convert that so it just inserts spaces between the words:
    Regex.Replace(s, "([a-z](?=[A-Z])|[A-Z](?=[A-Z][a-z]))", "$1 ")
If you need to handle digits:
    /([A-Z]+(?=$|[A-Z][a-z]|[0-9])|[A-Z]?[a-z]+|[0-9]+)/g
    Regex.Replace(s, "([a-z](?=[A-Z]|[0-9])|[A-Z](?=[A-Z][a-z]|[0-9])|[0-9](?=[^0-9]))", "$1 ")
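Since the question asks for an array, here is a minimal sketch that collects the matches of the first pattern above with Regex.Matches. The class name CamelCaseSplitter and the method SplitWords are made up for illustration; only the pattern comes from the answer.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class CamelCaseSplitter
{
    // Pattern from the answer above: either a run of capitals that ends at the
    // end of the string or just before a capital that starts a new word, or an
    // optional capital followed by lowercase letters.
    private static readonly Regex WordPattern =
        new Regex("[A-Z]+(?=$|[A-Z][a-z])|[A-Z]?[a-z]+");

    // Hypothetical helper: returns each matched component as one array element.
    public static string[] SplitWords(string input)
    {
        return WordPattern.Matches(input)
                          .Cast<Match>()
                          .Select(m => m.Value)
                          .ToArray();
    }
}
```

Calling CamelCaseSplitter.SplitWords("SimpleHTTPServer") should give the three elements "Simple", "HTTP" and "Server". Characters the pattern does not cover, such as digits or underscores, are simply skipped in this sketch.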
    Regex.Replace("ThisIsMyCapsDelimitedString", "(\\B[A-Z])", " $1")
Great answer, MizardX! I tweaked it slightly to treat digits as separate words, so "AddressLine1" becomes "Address Line 1" instead of "Address Line1":
    Regex.Replace(s, "([a-z](?=[A-Z0-9])|[A-Z](?=[A-Z][a-z]))", "$1 ")
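Just for a little variety... here is an extension method that doesn't use a regular expression.
    using System.Collections.Generic;
    using System.Linq;

    public static class CamelSpaceExtensions
    {
        public static string SpaceCamelCase(this String input)
        {
            return new string(InsertSpacesBeforeCaps(input).ToArray());
        }

        // Yields a space before every uppercase character, including the
        // first one, so "PascalCase" input comes back with a leading space.
        private static IEnumerable<char> InsertSpacesBeforeCaps(IEnumerable<char> input)
        {
            foreach (char c in input)
            {
                if (char.IsUpper(c))
                {
                    yield return ' ';
                }

                yield return c;
            }
        }
    }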
Grant Wagner's excellent comment aside:
    Dim s As String = RegularExpressions.Regex.Replace("ThisIsMyCapsDelimitedString", "([A-Z])", " $1")
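I needed a solution that supports acronyms and numbers. This regex-based solution treats the following patterns as individual "words":
- A capital letter followed by lowercase letters
- A sequence of consecutive digits
- Consecutive capital letters (interpreted as an acronym); a new word can start with the last capital, e.g. "HTMLGuide" => "HTML Guide", "TheATeam" => "The A Team"
You could do it as a one-liner:
    Regex.Replace(value, @"(?<!^)((?<!\d)\d|(?(?<=[A-Z])[A-Z](?=[a-z])|[A-Z]))", " $1")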
A more readable approach might be better:
    using System.Text.RegularExpressions;

    namespace Demo
    {
        public class IntercappedStringHelper
        {
            private static readonly Regex SeparatorRegex;

            static IntercappedStringHelper()
            {
                const string pattern = @"
                    (?<!^) # Not start
                    (
                        # Digit, not preceded by another digit
                        (?<!\d)\d
                        |
                        # Upper-case letter, followed by lower-case letter if
                        # preceded by another upper-case letter, e.g. 'G' in HTMLGuide
                        (?(?<=[A-Z])[A-Z](?=[a-z])|[A-Z])
                    )";

                var options = RegexOptions.IgnorePatternWhitespace | RegexOptions.Compiled;

                SeparatorRegex = new Regex(pattern, options);
            }

            public static string SeparateWords(string value, string separator = " ")
            {
                return SeparatorRegex.Replace(value, separator + "$1");
            }
        }
    }
Here is an extract from the (xUnit) tests:
    [Theory]
    [InlineData("PurchaseOrders", "Purchase-Orders")]
    [InlineData("purchaseOrders", "purchase-Orders")]
    [InlineData("2Unlimited", "2-Unlimited")]
    [InlineData("The2Unlimited", "The-2-Unlimited")]
    [InlineData("Unlimited2", "Unlimited-2")]
    [InlineData("222Unlimited", "222-Unlimited")]
    [InlineData("The222Unlimited", "The-222-Unlimited")]
    [InlineData("Unlimited222", "Unlimited-222")]
    [InlineData("ATeam", "A-Team")]
    [InlineData("TheATeam", "The-A-Team")]
    [InlineData("TeamA", "Team-A")]
    [InlineData("HTMLGuide", "HTML-Guide")]
    [InlineData("TheHTMLGuide", "The-HTML-Guide")]
    [InlineData("TheGuideToHTML", "The-Guide-To-HTML")]
    [InlineData("HTMLGuide5", "HTML-Guide-5")]
    [InlineData("TheHTML5Guide", "The-HTML-5-Guide")]
    [InlineData("TheGuideToHTML5", "The-Guide-To-HTML-5")]
    [InlineData("TheUKAllStars", "The-UK-All-Stars")]
    [InlineData("AllStarsUK", "All-Stars-UK")]
    [InlineData("UKAllStars", "UK-All-Stars")]
For more variety, using plain old C# objects, the following produces the same output as @MizardX's excellent regular expression.
    // Requires: using System.Text;
    public string FromCamelCase(string camel)
    {
        // omitted checking camel for null
        StringBuilder sb = new StringBuilder();
        int upperCaseRun = 0;
        foreach (char c in camel)
        {
            // append a space only if we're not at the start
            // and we're not already in an all caps string.
            if (char.IsUpper(c))
            {
                if (upperCaseRun == 0 && sb.Length != 0)
                {
                    sb.Append(' ');
                }
                upperCaseRun++;
            }
            else if (char.IsLower(c))
            {
                if (upperCaseRun > 1) // The first new word will also be capitalized.
                {
                    sb.Insert(sb.Length - 1, ' ');
                }
                upperCaseRun = 0;
            }
            else
            {
                upperCaseRun = 0;
            }
            sb.Append(c);
        }

        return sb.ToString();
    }
    string s = "ThisIsMyCapsDelimitedString";
    string t = Regex.Replace(s, "([A-Z])", " $1").Substring(1);
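Below is a prototype that converts the following to Title Case:
- snake_case
- camelCase
- PascalCase
- sentence case
- Title Case (keeps the current formatting)
Obviously you would only need the "ToTitleCase" method yourself.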
    using System;
    using System.Collections.Generic;
    using System.Globalization;
    using System.Text.RegularExpressions;

    public class Program
    {
        public static void Main()
        {
            var examples = new List<string>
            {
                "THEQuickBrownFox",
                "theQUICKBrownFox",
                "TheQuickBrownFOX",
                "TheQuickBrownFox",
                "the_quick_brown_fox",
                "theFOX",
                "FOX",
                "QUICK"
            };

            foreach (var example in examples)
            {
                Console.WriteLine(ToTitleCase(example));
            }
        }

        private static string ToTitleCase(string example)
        {
            var fromSnakeCase = example.Replace("_", " ");
            var lowerToUpper = Regex.Replace(fromSnakeCase, @"(\p{Ll})(\p{Lu})", "$1 $2");
            var sentenceCase = Regex.Replace(lowerToUpper, @"(\p{Lu}+)(\p{Lu}\p{Ll})", "$1 $2");
            return new CultureInfo("en-US", false).TextInfo.ToTitleCase(sentenceCase);
        }
    }
The console output would be as follows:
    THE Quick Brown Fox
    The QUICK Brown Fox
    The Quick Brown FOX
    The Quick Brown Fox
    The Quick Brown Fox
    The FOX
    FOX
    QUICK
Blog post reference
A naive regex solution. It will not handle O'Conner, and it also adds a space at the start of the string.
    string s = "ThisIsMyCapsDelimitedString";
    string split = Regex.Replace(s, "[A-Z0-9]", " $&");
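There is probably a more elegant solution, but this is what I came up with off the top of my head:
    string myString = "ThisIsMyCapsDelimitedString";

    for (int i = 1; i < myString.Length; i++)
    {
        // True for any character that upper-casing leaves unchanged,
        // so digits and punctuation also get a space in front of them.
        if (myString[i].ToString().ToUpper() == myString[i].ToString())
        {
            myString = myString.Insert(i, " ");
            i++; // skip past the space that was just inserted
        }
    }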
Try using
    "([A-Z]*[^A-Z]*)"
The result works for mixed letters and digits:
    Regex.Replace("AbcDefGH123Weh", "([A-Z]*[^A-Z]*)", "$1 ");
    Abc Def GH123 Weh
    Regex.Replace("camelCase", "([A-Z]*[^A-Z]*)", "$1 ");
    camel Case
An implementation of the pseudocode from https://stackoverflow.com/a/5796394/4279201:
    private static StringBuilder camelCaseToRegular(string i_String)
    {
        StringBuilder output = new StringBuilder();
        int i = 0;
        foreach (char character in i_String)
        {
            // Insert a space before every ASCII capital except the first character.
            if (character <= 'Z' && character >= 'A' && i > 0)
            {
                output.Append(" ");
            }
            output.Append(character);
            i++;
        }
        return output;
    }
To match the boundary between a non-uppercase character and a character in the Unicode category Uppercase Letter: (?<=\P{Lu})(?=\p{Lu})
    Dim s = Regex.Replace("CorrectHorseBatteryStaple", "(?<=\P{Lu})(?=\p{Lu})", " ")
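Because that pattern is zero-width, it can also be passed to Regex.Split to get an array directly, which is what the question asked for. A minimal C# sketch follows; the class and method names are hypothetical.

```csharp
using System.Text.RegularExpressions;

public static class CapsBoundarySplitter
{
    // Hypothetical helper: splits at every position that sits between a
    // non-uppercase character and an uppercase letter. Nothing is consumed,
    // so no characters are lost from the pieces.
    public static string[] SplitOnCaps(string input)
    {
        return Regex.Split(input, @"(?<=\P{Lu})(?=\p{Lu})");
    }
}
```

With "CorrectHorseBatteryStaple" this should give "Correct", "Horse", "Battery" and "Staple". A run of capitals contains no such boundary, so an acronym stays attached to the word that follows it: "TheHTMLGuide" splits into "The" and "HTMLGuide".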
Regex is about 10-12 times slower than a simple loop:
    // Requires: using System.Text;
    public static string CamelCaseToSpaceSeparated(this string str)
    {
        if (string.IsNullOrEmpty(str))
        {
            return str;
        }

        var res = new StringBuilder();

        res.Append(str[0]);
        for (var i = 1; i < str.Length; i++)
        {
            if (char.IsUpper(str[i]))
            {
                res.Append(' ');
            }
            res.Append(str[i]);
        }
        return res.ToString();
    }
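To check the speed difference on your own machine, a rough timing harness along these lines could be used. It is only a sketch: the SplitTiming class and its members are invented for illustration, the regex is the \B[A-Z] pattern from an earlier answer, and a Stopwatch loop is not a rigorous benchmark.

```csharp
using System;
using System.Diagnostics;
using System.Text;
using System.Text.RegularExpressions;

public static class SplitTiming
{
    // Pattern taken from an earlier answer: a capital that is not at a word boundary.
    private static readonly Regex CapsRegex = new Regex(@"(\B[A-Z])");

    public static string WithRegex(string input)
    {
        return CapsRegex.Replace(input, " $1");
    }

    // Same loop as the answer above, written as an ordinary static method.
    public static string WithLoop(string input)
    {
        if (string.IsNullOrEmpty(input)) return input;

        var sb = new StringBuilder(input.Length * 2);
        sb.Append(input[0]);
        for (var i = 1; i < input.Length; i++)
        {
            if (char.IsUpper(input[i])) sb.Append(' ');
            sb.Append(input[i]);
        }
        return sb.ToString();
    }

    // Runs the conversion repeatedly and returns the elapsed wall-clock time.
    public static TimeSpan Measure(Func<string, string> convert, string input, int iterations)
    {
        convert(input); // warm-up call so one-time setup cost is not counted
        var watch = Stopwatch.StartNew();
        for (var i = 0; i < iterations; i++)
        {
            convert(input);
        }
        watch.Stop();
        return watch.Elapsed;
    }
}
```

For example, compare SplitTiming.Measure(SplitTiming.WithRegex, "ThisIsMyCapsDelimitedString", 1000000) with the same call using SplitTiming.WithLoop. The ratio will depend on the runtime, the regex options and the input, so treat the 10-12x figure as that answer's own measurement.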