How to convert a MatchCollection to a string array
Is there a better way to convert a MatchCollection to a string array than this:
    MatchCollection mc = Regex.Matches(strText, @"\b[A-Za-z-']+\b");
    string[] strArray = new string[mc.Count];
    for (int i = 0; i < mc.Count; i++)
    {
        strArray[i] = mc[i].Groups[0].Value;
    }
PS: mc.CopyTo(strArray, 0) throws an exception: "At least one element in the source array could not be cast down to the destination array type."
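The exception arises because a MatchCollection holds Match objects rather than strings, so CopyTo has nothing it can place in a string[]. As a minimal sketch (the helper name ToStringArray and the sample sentence are assumptions for illustration, not part of the question), the matches can be copied into a Match[] first and then converted with Array.ConvertAll:

```csharp
using System;
using System.Text.RegularExpressions;

static class CopyToSketch
{
    // Hypothetical helper: copies the matches into a Match[] and then
    // projects each Match to its matched text.
    public static string[] ToStringArray(MatchCollection mc)
    {
        Match[] matches = new Match[mc.Count];
        mc.CopyTo(matches, 0); // the element types agree, so nothing has to be cast to string
        return Array.ConvertAll(matches, m => m.Value);
    }

    static void Main()
    {
        // The sample input is an assumption for illustration.
        MatchCollection mc = Regex.Matches("don't stop me now", @"\b[A-Za-z-']+\b");
        Console.WriteLine(string.Join(", ", ToStringArray(mc)));
    }
}
```

This is not shorter than the original for loop; it mainly shows where CopyTo does work.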
Try:
    var arr = Regex.Matches(strText, @"\b[A-Za-z-']+\b")
        .Cast<Match>()
        .Select(m => m.Value)
        .ToArray();
Dave Bish's answer is good and works properly.
It is worth noting, though, that replacing Cast<Match>() with OfType<Match>() will speed things up.
The code would become:
    var arr = Regex.Matches(strText, @"\b[A-Za-z-']+\b")
        .OfType<Match>()
        .Select(m => m.Groups[0].Value)
        .ToArray();
The result is exactly the same (and it solves the OP's problem in exactly the same way), but it is faster for huge strings.
Test code:
    // put it in a console application
    static void Test()
    {
        Stopwatch sw = new Stopwatch();
        StringBuilder sb = new StringBuilder();
        string strText = "this will become a very long string after my code has done appending it to the stringbuilder ";
        Enumerable.Range(1, 100000).ToList().ForEach(i => sb.Append(strText));
        strText = sb.ToString();

        sw.Start();
        var arr = Regex.Matches(strText, @"\b[A-Za-z-']+\b")
            .OfType<Match>()
            .Select(m => m.Groups[0].Value)
            .ToArray();
        sw.Stop();
        Console.WriteLine("OfType: " + sw.ElapsedMilliseconds.ToString());

        sw.Reset();
        sw.Start();
        var arr2 = Regex.Matches(strText, @"\b[A-Za-z-']+\b")
            .Cast<Match>()
            .Select(m => m.Groups[0].Value)
            .ToArray();
        sw.Stop();
        Console.WriteLine("Cast: " + sw.ElapsedMilliseconds.ToString());
    }
The output is:
    OfType: 6540
    Cast: 8743
So for very long strings, Cast() is slower.
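Timing aside, the two operators differ in behavior: OfType<T>() skips elements that are not of type T, while Cast<T>() throws an InvalidCastException on the first such element. Every element of a MatchCollection is a Match, so both return the same array here. A small sketch on a mixed object[] (the class name, method names, and sample data are hypothetical, chosen only to make the difference visible):

```csharp
using System;
using System.Collections;
using System.Linq;

static class CastVersusOfType
{
    // Returns only the string elements; anything else is silently skipped.
    public static string[] StringsViaOfType(IEnumerable source)
    {
        return source.OfType<string>().ToArray();
    }

    // Returns true when Cast<string>() fails because of a non-string element.
    public static bool CastThrows(IEnumerable source)
    {
        try
        {
            source.Cast<string>().ToArray();
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    static void Main()
    {
        object[] mixed = { "alpha", 42, "beta" };
        Console.WriteLine(string.Join(", ", StringsViaOfType(mixed))); // alpha, beta
        Console.WriteLine(CastThrows(mixed));                          // True
    }
}
```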
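I ran the exact same benchmark that Alex posted and found that sometimes Cast was faster and sometimes OfType was faster, but the difference between the two was negligible. However, ugly as it is, the for loop was consistently faster than both of the other two.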
    Stopwatch sw = new Stopwatch();
    StringBuilder sb = new StringBuilder();
    string strText = "this will become a very long string after my code has done appending it to the stringbuilder ";
    Enumerable.Range(1, 100000).ToList().ForEach(i => sb.Append(strText));
    strText = sb.ToString();

    //First two benchmarks

    sw.Start();
    MatchCollection mc = Regex.Matches(strText, @"\b[A-Za-z-']+\b");
    var matches = new string[mc.Count];
    for (int i = 0; i < matches.Length; i++)
    {
        matches[i] = mc[i].ToString();
    }
    sw.Stop();
Results:
    OfType: 3462
    Cast: 3499
    For: 2650
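Single-run timings like these can be skewed by one-time costs such as JIT compilation, which may be part of why the two sets of numbers disagree. The following is only a sketch of a somewhat fairer comparison: each approach gets one untimed warm-up call and is then averaged over several runs. The helper names, the repetition count of 10000, and the run count of 5 are assumptions, and no timings are claimed for it.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

static class ConversionBenchmark
{
    const string Pattern = @"\b[A-Za-z-']+\b";

    public static string[] WithCast(string text)
    {
        return Regex.Matches(text, Pattern).Cast<Match>().Select(m => m.Value).ToArray();
    }

    public static string[] WithOfType(string text)
    {
        return Regex.Matches(text, Pattern).OfType<Match>().Select(m => m.Value).ToArray();
    }

    public static string[] WithForLoop(string text)
    {
        MatchCollection mc = Regex.Matches(text, Pattern);
        var result = new string[mc.Count];
        for (int i = 0; i < result.Length; i++)
        {
            result[i] = mc[i].Value;
        }
        return result;
    }

    // Hypothetical helper: one untimed warm-up call, then the average
    // elapsed milliseconds over `runs` timed calls.
    public static double AverageMilliseconds(Func<string, string[]> convert, string text, int runs)
    {
        convert(text); // warm-up, not timed
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < runs; i++)
        {
            convert(text);
        }
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds / runs;
    }

    static void Main()
    {
        var sb = new StringBuilder();
        for (int i = 0; i < 10000; i++)
        {
            sb.Append("this will become a very long string after my code has done appending it to the stringbuilder ");
        }
        string text = sb.ToString();

        Console.WriteLine("OfType: " + AverageMilliseconds(WithOfType, text, 5));
        Console.WriteLine("Cast:   " + AverageMilliseconds(WithCast, text, 5));
        Console.WriteLine("For:    " + AverageMilliseconds(WithForLoop, text, 5));
    }
}
```

Whatever the numbers turn out to be on a given machine, all three methods return the same array, so the choice is between readability and a possibly small speed difference.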
Consider the following code...
    var emailAddress = "joe@sad.com; joe@happy.com; joe@elated.com";
    List<string> emails = Regex.Matches(emailAddress, @"([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})")
        .Cast<Match>()
        .Select(m => m.Groups[0].Value)
        .ToList();
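The snippets in this thread read the matched text in three ways: m.Value, m.Groups[0].Value, and m.ToString(). Group 0 is the whole match, so all three return the same string, while higher group numbers return the parenthesized sub-captures. A minimal sketch using the same pattern (the helper names and the sample addresses are assumptions for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

static class EmailGroupsSketch
{
    // Same pattern as in the snippet above.
    const string Pattern =
        @"([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})";

    // Hypothetical helper: returns the text captured by group 1,
    // which is the part of each address before the '@'.
    public static List<string> LocalParts(string input)
    {
        return Regex.Matches(input, Pattern)
            .Cast<Match>()
            .Select(m => m.Groups[1].Value)
            .ToList();
    }

    // True when Groups[0].Value, Value and ToString() agree for every match.
    public static bool WholeMatchAccessorsAgree(string input)
    {
        return Regex.Matches(input, Pattern)
            .Cast<Match>()
            .All(m => m.Groups[0].Value == m.Value && m.ToString() == m.Value);
    }

    static void Main()
    {
        string input = "joe@sad.com; ann@happy.com"; // sample data, an assumption
        Console.WriteLine(string.Join(", ", LocalParts(input))); // joe, ann
        Console.WriteLine(WholeMatchAccessorsAgree(input));      // True
    }
}
```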
Good luck!