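Identify duplicates in a List

I have a List of type Integer, for example:

    [1, 1, 2, 3, 3, 3]

I would like a method that returns all the duplicates, for example:

    [1, 3]

What is the best way to do this?

The add method of Set returns a boolean indicating whether the value was newly added (true if it was not yet in the set, false if it already existed; see the Set documentation).

So just iterate through all the values: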

    public Set<Integer> findDuplicates(List<Integer> listContainingDuplicates) {
        final Set<Integer> setToReturn = new HashSet<>();
        final Set<Integer> set1 = new HashSet<>();

        for (Integer yourInt : listContainingDuplicates) {
            // add() returns false when the value was already in set1
            if (!set1.add(yourInt)) {
                setToReturn.add(yourInt);
            }
        }
        return setToReturn;
    }

I needed a solution to this as well. I used leifg's solution and made it generic.

    private <T> Set<T> findDuplicates(Collection<T> list) {
        Set<T> duplicates = new LinkedHashSet<T>();
        Set<T> uniques = new HashSet<T>();

        for (T t : list) {
            if (!uniques.add(t)) {
                duplicates.add(t);
            }
        }
        return duplicates;
    }

    int[] nums = new int[] {1, 1, 2, 3, 3, 3};
    Arrays.sort(nums);
    for (int i = 0; i < nums.length - 1; i++) {
        if (nums[i] == nums[i + 1]) {
            System.out.println("duplicate item " + nums[i + 1] + " at Location" + (i + 1));
        }
    }

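Obviously you can do whatever you want with the duplicates instead of printing them (for example, put them in a Set to get a unique list of duplicate values). This approach also has the benefit of recording the location of each duplicate item (its index in the sorted array).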
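The variation mentioned above, collecting into a Set rather than printing, might look like the following. This is only a sketch: the class name SortedDuplicates and the decision to sort a copy of the input are assumptions made for illustration, not part of the original answer.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

class SortedDuplicates {

    // Returns each duplicated value once, in ascending order.
    static Set<Integer> findDuplicates(int[] values) {
        int[] nums = values.clone();          // sort a copy so the caller's array is untouched
        Arrays.sort(nums);
        Set<Integer> duplicates = new LinkedHashSet<>();
        for (int i = 0; i < nums.length - 1; i++) {
            if (nums[i] == nums[i + 1]) {
                duplicates.add(nums[i]);      // the Set ignores repeated adds of the same value
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        System.out.println(findDuplicates(new int[] {1, 1, 2, 3, 3, 3})); // prints [1, 3]
    }
}
```

Because of the sort this costs O(n log n), compared with roughly O(n) for the HashSet-based answers.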

I took John Strickler's solution and rewrote it to use the streams API introduced in JDK 8:

    private <T> Set<T> findDuplicates(Collection<T> list) {
        Set<T> uniques = new HashSet<T>();

        return list.stream()
                .filter(e -> !uniques.add(e))
                .collect(Collectors.toSet());
    }
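The filter in the stream version above mutates an external HashSet, which is a stateful lambda and would not be safe on a parallel stream. One possible alternative, sketched here under the assumption that the collection contains no null elements (groupingBy rejects null keys), is to count occurrences first and then keep the elements seen more than once. The class name StreamDuplicates is made up for this example.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

class StreamDuplicates {

    static <T> Set<T> findDuplicates(Collection<T> list) {
        return list.stream()
                // build a Map<T, Long> of element -> number of occurrences
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
                .entrySet().stream()
                // keep only the elements that occur more than once
                .filter(entry -> entry.getValue() > 1)
                .map(Map.Entry::getKey)
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        System.out.println(findDuplicates(Arrays.asList(1, 1, 2, 3, 3, 3)));
    }
}
```

This makes two passes (one over the list, one over the counts) but keeps every lambda free of side effects.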

This works too:

    public static Set<Integer> findDuplicates(List<Integer> input) {
        List<Integer> copy = new ArrayList<Integer>(input);
        // Remove one occurrence of every distinct value; whatever is left is a duplicate
        for (Integer value : new HashSet<Integer>(input)) {
            copy.remove(value);
        }
        return new HashSet<Integer>(copy);
    }

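You can use something like this:

    List<Integer> newList = new ArrayList<Integer>();
    // Work on a copy: removing from a list while iterating over it throws ConcurrentModificationException
    List<Integer> remaining = new ArrayList<Integer>(yourOldList);
    for (Integer i : yourOldList) {
        remaining.remove(i);   // Integer argument, so this is remove(Object), not remove(int index)
        if (remaining.contains(i) && !newList.contains(i)) {
            newList.add(i);
        }
    }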

Using Guava on Java 8:

    private Set<Integer> findDuplicates(List<Integer> input) {
        // Linked* preserves insertion order, so the returned Set's iteration order is somewhat like the original list
        LinkedHashMultiset<Integer> duplicates = LinkedHashMultiset.create(input);

        // Remove all entries with a count of 1
        duplicates.entrySet().removeIf(entry -> entry.getCount() == 1);

        return duplicates.elementSet();
    }

Use a MultiMap to store each value as a key/value set. Then iterate through the keys and find the ones that have more than one value.
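The answer above names no particular MultiMap implementation, so the following sketch stands one in with a plain JDK Map<Integer, List<Integer>>. Storing the list index as the value is an assumption made here for illustration; any per-occurrence value would do. The class name MultimapDuplicates is likewise hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class MultimapDuplicates {

    static Set<Integer> findDuplicates(List<Integer> input) {
        // key: list element, values: every index at which that element occurs
        Map<Integer, List<Integer>> positions = new LinkedHashMap<>();
        for (int index = 0; index < input.size(); index++) {
            positions.computeIfAbsent(input.get(index), key -> new ArrayList<>()).add(index);
        }

        // a key with more than one stored value is a duplicate
        Set<Integer> duplicates = new LinkedHashSet<>();
        for (Map.Entry<Integer, List<Integer>> entry : positions.entrySet()) {
            if (entry.getValue().size() > 1) {
                duplicates.add(entry.getKey());
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        System.out.println(findDuplicates(Arrays.asList(1, 1, 2, 3, 3, 3))); // prints [1, 3]
    }
}
```

A side benefit of this layout is that the map also records where each duplicate occurs in the original, unsorted list.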

If you use Eclipse Collections, this will work:

    MutableList<Integer> list = Lists.mutable.with(1, 1, 2, 3, 3, 3);
    Set<Integer> dupes = list.toBag().selectByOccurrences(i -> i > 1).toSet();

Note: I am a committer for Eclipse Collections.

Lambdas might be a solution:

    Integer[] nums = new Integer[] {1, 1, 2, 3, 3, 3};
    List<Integer> list = Arrays.asList(nums);
    List<Integer> dps = list.stream()
            .distinct()
            .filter(entry -> Collections.frequency(list, entry) > 1)
            .collect(Collectors.toList());

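This is a problem where functional techniques shine. For example, the following F# solution is both clearer and less bug-prone than the best imperative Java solution (and I work with both Java and F# every day).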

    [1;1;2;3;3;3]
    |> Seq.countBy id
    |> Seq.choose (fun (key,count) -> if count > 1 then Some(key) else None)

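Of course, this question is about Java. So my suggestion is to adopt a library that brings functional features to Java. For example, it could be solved using my own library as follows (there are several others worth looking at too):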

    Seq.of(1,1,2,3,3,3)
        .groupBy(new Func1<Integer,Integer>() {
            public Integer call(Integer key) {
                return key;
            }
        }).filter(new Predicate<Grouping<Integer,Integer>>() {
            public Boolean call(Grouping<Integer, Integer> grouping) {
                return grouping.getGrouping().count() > 1;
            }
        }).map(new Func1<Grouping<Integer,Integer>,Integer>() {
            public Integer call(Grouping<Integer, Integer> grouping) {
                return grouping.getKey();
            }
        });

A compact, generified version of the top answer, with an added empty check and a preallocated Set size:

    public static final <T> Set<T> findDuplicates(final List<T> listWhichMayHaveDuplicates) {
        final Set<T> duplicates = new HashSet<>();
        final int listSize = listWhichMayHaveDuplicates.size();
        if (listSize > 0) {
            final Set<T> tempSet = new HashSet<>(listSize);
            for (final T element : listWhichMayHaveDuplicates) {
                if (!tempSet.add(element)) {
                    duplicates.add(element);
                }
            }
        }
        return duplicates;
    }

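Try this to find the duplicate items in a list:

    ArrayList<String> arrayList1 = new ArrayList<String>();
    arrayList1.add("A");
    arrayList1.add("A");
    arrayList1.add("B");
    arrayList1.add("B");
    arrayList1.add("B");
    arrayList1.add("C");

    for (int x = 0; x < arrayList1.size(); x++) {
        System.out.println("arrayList1 :" + arrayList1.get(x));
    }

    // seen holds every distinct item; duplicates holds the items added more than once
    Set<String> seen = new TreeSet<String>();
    Set<String> duplicates = new TreeSet<String>();
    for (String item : arrayList1) {
        if (!seen.add(item)) {
            duplicates.add(item);
        }
    }

    Iterator<String> it = duplicates.iterator();
    while (it.hasNext()) {
        System.out.println("Duplicate :" + it.next());
    }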

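Put the list in a set (this keeps exactly one copy of each item), remove one occurrence of every set item from the original list (so it will then contain only the extra copies of items that occur more than once), and put the list in a new set (this collapses those extra copies to one each):

    List<Item> list = ...;
    // remove(Object) drops a single occurrence; removeAll would drop every occurrence and empty the list
    for (Item item : new HashSet<Item>(list)) {
        list.remove(item);
    }
    return new HashSet<Item>(list);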

Create a Map<Integer,Integer> and iterate over the list: if an element is already in the map, increase its value; otherwise add it to the map with a value of 1.
Then iterate over the map and add every key whose value is >= 2 to the result.

    public static void main(String[] args) {
        List<Integer> list = new LinkedList<Integer>();
        list.add(1);
        list.add(1);
        list.add(1);
        list.add(2);
        list.add(3);
        list.add(3);

        // element -> number of occurrences
        Map<Integer, Integer> map = new HashMap<Integer, Integer>();
        for (Integer x : list) {
            Integer val = map.get(x);
            if (val == null) {
                map.put(x, 1);
            } else {
                map.put(x, val + 1);   // put() overwrites the old count
            }
        }

        List<Integer> result = new LinkedList<Integer>();
        for (Map.Entry<Integer, Integer> entry : map.entrySet()) {
            if (entry.getValue() > 1) {
                result.add(entry.getKey());
            }
        }

        for (Integer x : result) {
            System.out.println(x);
        }
    }

This should work for both sorted and unsorted lists.

    public void testFindDuplicates() {
        List<Integer> list = new ArrayList<Integer>();
        list.add(1);
        list.add(1);
        list.add(2);
        list.add(3);
        list.add(3);
        list.add(3);

        Set<Integer> result = new HashSet<Integer>();
        int currentIndex = 0;
        for (Integer i : list) {
            // i is a duplicate if it occurs again later in the list
            if (!result.contains(i) && list.subList(currentIndex + 1, list.size()).contains(i)) {
                result.add(i);
            }
            currentIndex++;
        }

        assertEquals(2, result.size());
        assertTrue(result.contains(1));
        assertTrue(result.contains(3));
    }

    public class DuplicatesWithOutCollection {

        public static void main(String[] args) {
            // The input must be sorted so that equal values are adjacent
            int[] arr = new int[] { 2, 3, 4, 6, 6, 8, 10, 10, 10, 11, 12, 12 };
            int[] duplicates = findDuplicates(arr);
            for (int value : duplicates) {
                System.out.println(value);
            }
        }

        // Returns each value that occurs more than once, using plain arrays only
        private static int[] findDuplicates(int[] arr) {
            // At most half of the elements can be distinct duplicated values
            int[] temp = new int[arr.length / 2];
            int i = 0, j = 0;
            while (i < arr.length - 1) {
                if (arr[i] == arr[i + 1]) {
                    int value = arr[i];
                    temp[j] = value;
                    j++;
                    // Skip the rest of this run of equal values
                    while (i < arr.length && arr[i] == value) {
                        i++;
                    }
                } else {
                    i++;
                }
            }
            // Trim the unused tail of temp
            int[] result = new int[j];
            System.arraycopy(temp, 0, result, 0, j);
            return result;
        }
    }

    import java.util.Scanner;

    public class OnlyDuplicates {

        public static void main(String[] args) {
            System.out.print(" Enter a set of 10 numbers: ");
            int[] numbers = new int[10];
            Scanner input = new Scanner(System.in);
            for (int i = 0; i < numbers.length; i++) {
                numbers[i] = input.nextInt();
            }
            numbers = onlyDuplicates(numbers);
            System.out.print(" The numbers are: ");
            for (int i = 0; i < numbers.length; i++) {
                System.out.print(numbers[i] + " ");
            }
        }

        // Returns each value that occurs more than once in list, once each
        public static int[] onlyDuplicates(int[] list) {
            int[] array = new int[0];
            for (int i = 0; i < list.length; i++) {
                // A value is a duplicate if it also occurs earlier in the input
                boolean seenBefore = contains(list, i, list[i]);
                // Report it only if it has not been reported already
                if (seenBefore && !contains(array, array.length, list[i])) {
                    array = add2Array(array, list[i]);
                }
            }
            return array;
        }

        // Returns true if value occurs among the first length elements of source
        public static boolean contains(int[] source, int length, int value) {
            for (int i = 0; i < length; i++) {
                if (source[i] == value) {
                    return true;
                }
            }
            return false;
        }

        // Copy source to dest
        // If dest is shorter than source, return false
        public static boolean copyArray(int[] source, int[] dest) {
            if (source.length > dest.length) {
                return false;
            }
            for (int i = 0; i < source.length; i++) {
                dest[i] = source[i];
            }
            return true;
        }

        // Increase array size by one and add integer to the end of the array
        public static int[] add2Array(int[] source, int data) {
            int[] dest = new int[source.length + 1];
            copyArray(source, dest);
            dest[source.length] = data;
            return dest;
        }
    }

If you know the maximum value (for example < 10000), you can trade space for speed. I can't remember the exact name of this technique.

Roughly, in Java:

    // Does not handle the case where memory allocation fails.
    // Assumes every value lies in the range 0 <= val < max; it can probably be
    // extended to unknown or larger values, maybe by sorting first.
    public static List<Integer> getDuplicates(List<Integer> list, int max) {
        // One flag per possible value; Java initialises the array to false
        boolean[] buckets = new boolean[max];

        // Find duplicates
        List<Integer> result = new ArrayList<Integer>();
        for (int val : list) {
            if (buckets[val]) {
                result.add(val);      // seen before, so it is a duplicate
            } else {
                buckets[val] = true;
            }
        }
        return result;
    }

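Try this:

For example, if the list values are [1,2,3,4,5,6,4,3,7,8], the duplicate items are [3,4].

    Collections.sort(list);
    List<Integer> dup = new ArrayList<>();
    for (int i = 0; i < list.size() - 1; i++) {
        // Compare Integer objects with equals(); == compares references and fails outside the Integer cache
        if (list.get(i).equals(list.get(i + 1))) {
            if (!dup.contains(list.get(i + 1))) {
                dup.add(list.get(i + 1));
            }
        }
    }
    System.out.println("duplicate item " + dup);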
