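Natural sort order string comparison in Java - is one built in?
I'd like some kind of string comparison function that preserves natural sort order [1]. Is there anything like this built into Java? I can't find anything in the String class, and the Comparator class only knows of two implementations.
I can roll my own (it's not a very hard problem), but I'd rather not reinvent the wheel.
In my specific case, I have software version strings that I want to sort, so I want "1.2.10.5" to be considered greater than "1.2.9.1".
[1] By "natural" sort order, I mean comparing strings the way a human would compare them, as opposed to the "ascii-betical" sort order that only makes sense to programmers. In other words, "image9.jpg" is less than "image10.jpg", "album1set2page9photo1.jpg" is less than "album1set2page10photo5.jpg", and "1.2.9.1" is less than "1.2.10.5".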
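In Java, the "natural" ordering of strings means "lexicographical" ordering, so there is no implementation in the core library like the one you're looking for.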
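To make the point about lexicographic ordering concrete, here is a minimal sketch that sorts the strings from the question using String's built-in Comparable ordering. The class and method names are made up for illustration only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative class name; not part of the JDK or of any answer in this thread.
class LexicographicOrderDemo {

    // Sorts with String's built-in Comparable ordering (character by character).
    static List<String> sortedLexicographically(String... values) {
        return Arrays.stream(values).sorted().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // '1' sorts before '9', so the "10" entries come first.
        System.out.println(sortedLexicographically("image9.jpg", "image10.jpg"));
        // prints [image10.jpg, image9.jpg]
        System.out.println(sortedLexicographically("1.2.9.1", "1.2.10.5"));
        // prints [1.2.10.5, 1.2.9.1]
    }
}
```

Both results are the reverse of what a human reader would expect, which is why a custom comparator is needed.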
There are open-source implementations, though.
Here's one: NaturalOrderComparator.java
Make sure you read the Cougaar Open Source License.
I hope this helps!
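String implements Comparable, and that is what natural ordering means in Java (comparison using the Comparable interface). You can put the strings in a TreeSet, or sort them using the Collections or Arrays classes.
However, in your case you don't want "natural ordering"; you really want a custom comparator, which you can then pass to the Collections.sort method or to the Arrays.sort method that takes a comparator.
As for the specific logic to implement in the comparator (numbers separated by dots), I'm not aware of any existing standard implementation, but as you said, it's not a hard problem.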
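As a rough sketch of such a custom comparator for dot-separated version strings, something like the following could be passed to Collections.sort. The class name VersionComparator is hypothetical, and the code assumes every component is a non-negative integer that fits in an int; when one version is a prefix of the other, the shorter one is treated as smaller (an assumption, since the question does not specify this case).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical comparator for dot-separated numeric versions such as "1.2.10.5".
class VersionComparator implements Comparator<String> {

    @Override
    public int compare(String v1, String v2) {
        String[] p1 = v1.split("\\.");
        String[] p2 = v2.split("\\.");
        int common = Math.min(p1.length, p2.length);
        for (int i = 0; i < common; i++) {
            // Compare each component numerically rather than as text.
            int c = Integer.compare(Integer.parseInt(p1[i]), Integer.parseInt(p2[i]));
            if (c != 0) {
                return c;
            }
        }
        // All shared components are equal: the version with fewer components sorts first.
        return Integer.compare(p1.length, p2.length);
    }

    public static void main(String[] args) {
        List<String> versions = new ArrayList<>(Arrays.asList("1.2.10.5", "1.10", "1.2.9.1", "1.2"));
        Collections.sort(versions, new VersionComparator());
        System.out.println(versions); // prints [1.2, 1.2.9.1, 1.2.10.5, 1.10]
    }
}
```

Input that contains non-numeric components (for example "1.2-beta") would make Integer.parseInt throw a NumberFormatException, so a more general natural-order comparator is needed for mixed strings.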
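EDIT: In your comment, your link gets you here, which does a decent job if you don't mind the fact that it is case sensitive. Here is that code, modified to allow you to pass in String.CASE_INSENSITIVE_ORDER: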
/*
 * The Alphanum Algorithm is an improved sorting algorithm for strings
 * containing numbers. Instead of sorting numbers in ASCII order like
 * a standard sort, this algorithm sorts numbers in numeric order.
 *
 * The Alphanum Algorithm is discussed at http://www.DaveKoelle.com
 *
 *
 * This library is free software; you can redistribute it and/or
 * modify it under the terms of the GNU Lesser General Public
 * License as published by the Free Software Foundation; either
 * version 2.1 of the License, or any later version.
 *
 * This library is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
 * Lesser General Public License for more details.
 *
 * You should have received a copy of the GNU Lesser General Public
 * License along with this library; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
 *
 */

import java.util.Comparator;

/**
 * This is an updated version with enhancements made by Daniel Migowski,
 * Andre Bogus, and David Koelle
 *
 * To convert to use Templates (Java 1.5+):
 *   - Change "implements Comparator" to "implements Comparator<String>"
 *   - Change "compare(Object o1, Object o2)" to "compare(String s1, String s2)"
 *   - Remove the type checking and casting in compare().
 *
 * To use this class:
 *   Use the static "sort" method from the java.util.Collections class:
 *   Collections.sort(your list, new AlphanumComparator());
 */
public class AlphanumComparator implements Comparator<String> {
    // Comparator used for the non-numeric chunks; defaults to String.compareTo.
    private Comparator<String> comparator = new NaturalComparator();

    public AlphanumComparator(Comparator<String> comparator) {
        this.comparator = comparator;
    }

    public AlphanumComparator() {
    }

    private final boolean isDigit(char ch) {
        return ch >= 48 && ch <= 57;
    }

    /** Length of string is passed in for improved efficiency (only need to calculate it once) **/
    private final String getChunk(String s, int slength, int marker) {
        StringBuilder chunk = new StringBuilder();
        char c = s.charAt(marker);
        chunk.append(c);
        marker++;
        if (isDigit(c)) {
            while (marker < slength) {
                c = s.charAt(marker);
                if (!isDigit(c))
                    break;
                chunk.append(c);
                marker++;
            }
        } else {
            while (marker < slength) {
                c = s.charAt(marker);
                if (isDigit(c))
                    break;
                chunk.append(c);
                marker++;
            }
        }
        return chunk.toString();
    }

    public int compare(String s1, String s2) {
        int thisMarker = 0;
        int thatMarker = 0;
        int s1Length = s1.length();
        int s2Length = s2.length();

        while (thisMarker < s1Length && thatMarker < s2Length) {
            String thisChunk = getChunk(s1, s1Length, thisMarker);
            thisMarker += thisChunk.length();

            String thatChunk = getChunk(s2, s2Length, thatMarker);
            thatMarker += thatChunk.length();

            // If both chunks contain numeric characters, sort them numerically
            int result = 0;
            if (isDigit(thisChunk.charAt(0)) && isDigit(thatChunk.charAt(0))) {
                // Simple chunk comparison by length.
                int thisChunkLength = thisChunk.length();
                result = thisChunkLength - thatChunk.length();
                // If equal, the first different number counts
                if (result == 0) {
                    for (int i = 0; i < thisChunkLength; i++) {
                        result = thisChunk.charAt(i) - thatChunk.charAt(i);
                        if (result != 0) {
                            return result;
                        }
                    }
                }
            } else {
                result = comparator.compare(thisChunk, thatChunk);
            }

            if (result != 0)
                return result;
        }

        return s1Length - s2Length;
    }

    private static class NaturalComparator implements Comparator<String> {
        public int compare(String o1, String o2) {
            return o1.compareTo(o2);
        }
    }
}
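I have tested three of the Java implementations mentioned by others here and found that they work slightly differently, but none of them as I would expect.
Neither AlphaNumericStringComparator nor AlphanumComparator ignores whitespace, so pic2 is placed before pic 1.
On the other hand, NaturalOrderComparator ignores not only whitespace but also all leading zeros, so sig[1] precedes sig[0].
Regarding performance, AlphaNumericStringComparator is about 10x slower than the other two.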
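The whitespace behaviour follows from chunking: a comparator that splits strings into digit and non-digit runs sees "pic" and "pic " as different text chunks. The sketch below only illustrates that effect; it uses a regular expression for brevity and is not the code of any of the comparators named above, and the class and method names are invented.

```java
import java.util.Arrays;

// Illustrative helper; names are assumptions, not code from the libraries discussed here.
class ChunkOrderDemo {

    // Splits a string at every boundary between a digit and a non-digit.
    static String[] chunks(String s) {
        return s.split("(?<=\\D)(?=\\d)|(?<=\\d)(?=\\D)");
    }

    public static void main(String[] args) {
        String[] a = chunks("pic2");
        String[] b = chunks("pic 1");
        System.out.println(Arrays.toString(a)); // prints [pic, 2]
        System.out.println(Arrays.toString(b)); // prints [pic , 1]
        // "pic" is a prefix of "pic ", so it compares as smaller and the
        // numeric chunks (2 versus 1) are never reached.
        System.out.println(a[0].compareTo(b[0]) < 0); // prints true
    }
}
```

Stripping or collapsing whitespace before comparing would be one way to get pic 1 ahead of pic2, at the cost of treating "pic 1" and "pic1" as equal.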
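Have a look at this implementation. It should be as fast as possible: no regular expressions, no array manipulation, no helper method calls, just a couple of flags and a lot of cases.
It should sort any combination of numbers inside strings, and it properly handles numbers that are equal by moving on to the rest of the string.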
public static int naturalCompare(String a, String b, boolean ignoreCase) {
    if (ignoreCase) {
        a = a.toLowerCase();
        b = b.toLowerCase();
    }
    int aLength = a.length();
    int bLength = b.length();
    int minSize = Math.min(aLength, bLength);
    char aChar, bChar;
    boolean aNumber, bNumber;
    boolean asNumeric = false;       // true while both strings are inside a digit run
    int lastNumericCompare = 0;      // first digit difference seen in the current run
    for (int i = 0; i < minSize; i++) {
        aChar = a.charAt(i);
        bChar = b.charAt(i);
        aNumber = aChar >= '0' && aChar <= '9';
        bNumber = bChar >= '0' && bChar <= '9';
        if (asNumeric) {
            if (aNumber && bNumber) {
                if (lastNumericCompare == 0)
                    lastNumericCompare = aChar - bChar;
            } else if (aNumber) {
                return 1;            // a's number has more digits
            } else if (bNumber) {
                return -1;           // b's number has more digits
            } else if (lastNumericCompare == 0) {
                if (aChar != bChar)
                    return aChar - bChar;
                asNumeric = false;
            } else {
                return lastNumericCompare;
            }
        } else if (aNumber && bNumber) {
            asNumeric = true;
            if (lastNumericCompare == 0)
                lastNumericCompare = aChar - bChar;
        } else if (aChar != bChar) {
            return aChar - bChar;
        }
    }
    if (asNumeric) {
        if (aLength > bLength && a.charAt(bLength) >= '0' && a.charAt(bLength) <= '9') // as number
            return 1;  // a has bigger size, thus b is smaller
        else if (bLength > aLength && b.charAt(aLength) >= '0' && b.charAt(aLength) <= '9') // as number
            return -1; // b has bigger size, thus a is smaller
        else if (lastNumericCompare == 0)
            return aLength - bLength;
        else
            return lastNumericCompare;
    } else {
        return aLength - bLength;
    }
}
How about using the split() method of String to parse the string into its individual numbers, and then comparing them one by one?
@Test
public void test() {
    System.out.print(compare("1.12.4".split("\\."), "1.13.4".split("\\."), 0));
}

public static int compare(String[] arr1, String[] arr2, int index) {
    // If the arrays do not have equal size and the comparison has reached the upper bound of one of them,
    // then the longer array is considered the bigger (--> 2.2.0 is bigger than 2.2)
    if (arr1.length <= index || arr2.length <= index)
        return arr1.length - arr2.length;
    int result = Integer.parseInt(arr1[index]) - Integer.parseInt(arr2[index]);
    // Equal components: recurse into the next one
    return result == 0 ? compare(arr1, arr2, index + 1) : result;
}
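I haven't checked the corner cases, but this should work, and it's fairly compact.
It concatenates the digits and then compares them. If that doesn't apply, it carries on character by character.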
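public int compare(String o1, String o2) {
    if (o1 == null || o2 == null)
        return 0;
    for (int i = 0; i < o1.length() && i < o2.length(); i++) {
        // Compare numerically only when both strings have a digit at this position
        if (Character.isDigit(o1.charAt(i)) && Character.isDigit(o2.charAt(i))) {
            String dig1 = "", dig2 = "";
            // Collect the run of digits starting at i in each string
            for (int x = i; x < o1.length() && Character.isDigit(o1.charAt(x)); x++) {
                dig1 += o1.charAt(x);
            }
            for (int x = i; x < o2.length() && Character.isDigit(o2.charAt(x)); x++) {
                dig2 += o2.charAt(x);
            }
            // Integer.valueOf throws NumberFormatException if a run does not fit in an int
            if (Integer.valueOf(dig1) < Integer.valueOf(dig2)) return -1;
            if (Integer.valueOf(dig1) > Integer.valueOf(dig2)) return 1;
        }
        if (o1.charAt(i) < o2.charAt(i)) return -1;
        if (o1.charAt(i) > o2.charAt(i)) return 1;
    }
    // One string is a prefix of the other: the shorter one sorts first
    return o1.length() - o2.length();
}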
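This may be a late reply, but my answer can help others who need a comparator like this.
I checked a couple of other comparators too, and mine seems more efficient than the others I compared it with. I also tried the one that Yishai posted. For an alphanumeric data set of 100 entries, mine takes only half the time of that one.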
import java.util.Comparator;

/**
 * Sorter that compares the given alphanumeric strings. This iterates through each character to
 * decide the sort order. There are 3 possible cases while iterating:
 *
 * <li>If both have the same non-digit characters, then the consecutive characters will be considered for
 * comparison.</li>
 *
 * <li>If both have numbers at the same position (with/without non-digit characters), the consecutive
 * digit characters will be considered to form the valid integer representation of the characters,
 * which will be taken and compared.</li>
 *
 * <li>At any point, if the comparison gives the order (either > or <), then the consecutive characters
 * will not be considered.</li>
 *
 * For example, this will be the ordered output of the given list of Strings (the bold characters decide
 * the order): <i><b>2</b>b,<b>100</b>b,a<b>1</b>,A<b>2</b>y,a<b>100</b>,</i>
 *
 * @author kannan_r
 *
 */
class AlphaNumericSorter implements Comparator<String> {
    /**
     * Does the alphanumeric sort of the given two strings
     */
    public int compare(String theStr1, String theStr2) {
        char[] theCharArr1 = theStr1.toCharArray();
        char[] theCharArr2 = theStr2.toCharArray();
        // Guard against empty strings before looking at the first character
        if (theCharArr1.length > 0 && theCharArr2.length > 0
                && Character.isDigit(theCharArr1[0]) && Character.isDigit(theCharArr2[0])) {
            return sortAsNumber(theCharArr1, theCharArr2, 0);
        }
        return sortAsString(theCharArr1, theCharArr2, 0);
    }

    /**
     * Sort the given arrays as strings starting from the given position. This will be a simple case
     * insensitive sort of each character. But at any given position, if there are digits in both
     * arrays, then the method sortAsNumber will be invoked for the given position.
     *
     * @param theArray1 The first character array.
     * @param theArray2 The second character array.
     * @param thePosition The position starting from which the calculation will be done.
     * @return positive number when the Array1 is greater than Array2<br/>
     *         negative number when the Array2 is greater than Array1<br/>
     *         zero when the Array1 is equal to Array2
     */
    private int sortAsString(char[] theArray1, char[] theArray2, int thePosition) {
        int aResult = 0;
        if (thePosition < theArray1.length && thePosition < theArray2.length) {
            // Lower-case both characters so the comparison is case insensitive, as documented
            aResult = Character.toLowerCase(theArray1[thePosition])
                    - Character.toLowerCase(theArray2[thePosition]);
            if (aResult == 0) {
                ++thePosition;
                if (thePosition < theArray1.length && thePosition < theArray2.length) {
                    if (Character.isDigit(theArray1[thePosition]) && Character.isDigit(theArray2[thePosition])) {
                        aResult = sortAsNumber(theArray1, theArray2, thePosition);
                    } else {
                        aResult = sortAsString(theArray1, theArray2, thePosition);
                    }
                } else {
                    // One array is exhausted: the shorter one sorts first
                    aResult = theArray1.length - theArray2.length;
                }
            }
        } else {
            aResult = theArray1.length - theArray2.length;
        }
        return aResult;
    }

    /**
     * Sorts the characters in the given arrays as numbers starting from the given position. When
     * sorted as numbers, the consecutive characters starting from the given position up to the first
     * non-digit character will be considered.
     *
     * @param theArray1 The first character array.
     * @param theArray2 The second character array.
     * @param thePosition The position starting from which the calculation will be done.
     * @return positive number when the Array1 is greater than Array2<br/>
     *         negative number when the Array2 is greater than Array1<br/>
     *         zero when the Array1 is equal to Array2
     */
    private int sortAsNumber(char[] theArray1, char[] theArray2, int thePosition) {
        int aResult = 0;
        int aNumberInStr1;
        int aNumberInStr2;
        if (thePosition < theArray1.length && thePosition < theArray2.length) {
            if (Character.isDigit(theArray1[thePosition]) && Character.isDigit(theArray2[thePosition])) {
                aNumberInStr1 = getNumberInStr(theArray1, thePosition);
                aNumberInStr2 = getNumberInStr(theArray2, thePosition);
                aResult = aNumberInStr1 - aNumberInStr2;
                if (aResult == 0) {
                    thePosition = getNonDigitPosition(theArray1, thePosition);
                    if (thePosition != -1) {
                        aResult = sortAsString(theArray1, theArray2, thePosition);
                    } else {
                        // The first array ends with this number: fall back to length
                        aResult = theArray1.length - theArray2.length;
                    }
                }
            } else {
                aResult = sortAsString(theArray1, theArray2, ++thePosition);
            }
        } else {
            aResult = theArray1.length - theArray2.length;
        }
        return aResult;
    }

    /**
     * Gets the position of the non-digit character in the given array starting from the given
     * position.
     *
     * @param theCharArr The character array.
     * @param thePosition The position after which the array needs to be checked for a non-digit
     *        character.
     * @return The position of the first non-digit character in the array, or -1 if there is none.
     */
    private int getNonDigitPosition(char[] theCharArr, int thePosition) {
        for (int i = thePosition; i < theCharArr.length; i++) {
            if (!Character.isDigit(theCharArr[i])) {
                return i;
            }
        }
        return -1;
    }

    /**
     * Gets the integer value of the number starting from the given position of the given array.
     *
     * @param theCharArray The character array.
     * @param thePosition The position from which the number needs to be calculated.
     * @return The integer value of the number.
     */
    private int getNumberInStr(char[] theCharArray, int thePosition) {
        int aNumber = 0;
        for (int i = thePosition; i < theCharArray.length; i++) {
            if (!Character.isDigit(theCharArray[i])) {
                return aNumber;
            }
            // Shift the accumulated value one decimal place and add the new digit
            aNumber = aNumber * 10 + (theCharArray[i] - '0');
        }
        return aNumber;
    }
}
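Using RuleBasedCollator is also an option, although you have to add all the sort order rules in advance, so it is not a good solution if you also want to take larger numbers into account.
Adding specific customizations such as 2 < 10 is quite easy, though, and might be useful for sorting special version identifiers such as Trusty < Precise < Xenial < Yakkety.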
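import java.text.Collator;
import java.text.RuleBasedCollator;
import java.util.List;
import java.util.stream.IntStream;
import static java.util.Arrays.asList;
import static java.util.Collections.shuffle;
import static java.util.stream.Collectors.joining;

// Start from the default locale's rules and append "0 < 1 < 2 < ... < 99".
RuleBasedCollator localRules = (RuleBasedCollator) Collator.getInstance();
String extraRules = IntStream.range(0, 100).mapToObj(String::valueOf).collect(joining(" < "));
// The RuleBasedCollator constructor throws java.text.ParseException on malformed rules.
RuleBasedCollator c = new RuleBasedCollator(localRules.getRules() + " & " + extraRules);

List<String> a = asList("1-2", "1-02", "1-20", "10-20", "fred", "jane", "pic01", "pic02", "pic02a",
        "pic 5", "pic05", "pic 7", "pic100", "pic100a", "pic120", "pic121");
shuffle(a);
a.sort(c);
System.out.println(a);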
Maybe this implementation would suit you better.
(By the way, should I copy/paste it here?)