Java中的迭代笛卡尔积
我想计算Java中任意数量的非空集的笛卡尔乘积。
我已经写了迭代代码…
public static <T> List<Set<T>> cartesianProduct(List<Set<T>> list) { List<Iterator<T>> iterators = new ArrayList<Iterator<T>>(list.size()); List<T> elements = new ArrayList<T>(list.size()); List<Set<T>> toRet = new ArrayList<Set<T>>(); for (int i = 0; i < list.size(); i++) { iterators.add(list.get(i).iterator()); elements.add(iterators.get(i).next()); } for (int j = 1; j >= 0;) { toRet.add(Sets.newHashSet(elements)); for (j = iterators.size()-1; j >= 0 && !iterators.get(j).hasNext(); j--) { iterators.set(j, list.get(j).iterator()); elements.set(j, iterators.get(j).next()); } elements.set(Math.abs(j), iterators.get(Math.abs(j)).next()); } return toRet; }
但是我发现它相当不雅。 有人有更好的,仍然迭代的解决scheme? 一个解决scheme,使用一些奇妙的function类似的方法? 否则…有关如何改善它的build议? 错误?
我写了一个解决scheme,不需要你在内存中填充一个大集合。 不幸的是,所需的代码长达数百行。 你可能要等到它出现在番石榴项目( http://guava-libraries.googlecode.com ),我希望将在今年年底。 抱歉。 🙁
请注意,如果您在笛卡儿积的集合的数量是在编译时已知的固定数量,则您可能不需要这样的实用程序 – 您可以使用该嵌套for循环的数目。
编辑:代码现在发布。
Sets.cartesianProduct()
我想你会很高兴的。 它只是按照你的要求创build个人名单。 并没有用它们的所有MxNxPxQ来填充内存。
如果你想检查来源,这是在727线 。
请享用!
下面的答案使用迭代而不是recursion。 它使用我以前的答案相同的Tuple
类。
这是一个单独的答案,因为恕我直言,都是有效的,不同的方法。
这是新的主要课程:
public class Example { public static <T> List<Tuple<T>> cartesianProduct(List<Set<T>> sets) { List<Tuple<T>> tuples = new ArrayList<Tuple<T>>(); for (Set<T> set : sets) { if (tuples.isEmpty()) { for (T t : set) { Tuple<T> tuple = new Tuple<T>(); tuple.add(t); tuples.add(tuple); } } else { List<Tuple<T>> newTuples = new ArrayList<Tuple<T>>(); for (Tuple<T> subTuple : tuples) { for (T t : set) { Tuple<T> tuple = new Tuple<T>(); tuple.addAll(subTuple); tuple.add(t); newTuples.add(tuple); } } tuples = newTuples; } } return tuples; } }
这是我写的一个迭代的,懒惰的实现。 该接口与Google的Sets.cartesianProduct非常相似,但它更加灵活:它在Iterables而不是Sets中处理。 这个代码和它的unit testing在https://gist.github.com/1911614 。
/* Copyright 2012 LinkedIn Corp. Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. */ import com.google.common.base.Function; import com.google.common.collect.Iterables; import java.lang.reflect.Array; import java.util.ArrayList; import java.util.Arrays; import java.util.Collections; import java.util.Iterator; import java.util.List; import java.util.NoSuchElementException; /** * Implements the Cartesian product of ordered collections. * * @author <a href="mailto:jmkristian@gmail.com">John Kristian</a> */ public class Cartesian { /** * Generate the <a href="http://en.wikipedia.org/wiki/Cartesian_product">Cartesian * product</a> of the given axes. For axes [[a1, a2 ...], [b1, b2 ...], [c1, c2 ...] * ...] the product is [[a1, b1, c1 ...] ... [a1, b1, c2 ...] ... [a1, b2, c1 ...] ... * [aN, bN, cN ...]]. In other words, the results are generated in same order as these * nested loops: * * <pre> * for (T a : [a1, a2 ...]) * for (T b : [b1, b2 ...]) * for (T c : [c1, c2 ...]) * ... * result = new T[]{ a, b, c ... }; * </pre> * * Each result is a new array of T, whose elements refer to the elements of the axes. If * you prefer a List, you can call asLists(product(axes)). * <p> * Don't change the axes while iterating over their product, as a rule. Changes to an * axis can affect the product or cause iteration to fail (which is usually bad). To * prevent this, you can pass clones of your axes to this method. * <p> * The implementation is lazy. This method iterates over the axes, and returns an * Iterable that contains a reference to each axis. Iterating over the product causes * iteration over each axis. Methods of each axis are called as late as practical. */ public static <T> Iterable<T[]> product(Class<T> resultType, Iterable<? extends Iterable<? extends T>> axes) { return new Product<T>(resultType, newArray(Iterable.class, axes)); } /** Works like product(resultType, Arrays.asList(axes)), but slightly more efficient. */ public static <T> Iterable<T[]> product(Class<T> resultType, Iterable<? extends T>... axes) { return new Product<T>(resultType, axes.clone()); } /** * Wrap the given arrays in fixed-size lists. Changes to the lists write through to the * arrays. */ public static <T> Iterable<List<T>> asLists(Iterable<? extends T[]> arrays) { return Iterables.transform(arrays, new AsList<T>()); } /** * Arrays.asList, represented as a Function (as used in Google collections). */ public static class AsList<T> implements Function<T[], List<T>> { @Override public List<T> apply(T[] array) { return Arrays.asList(array); } } /** Create a generic array containing references to the given objects. */ private static <T> T[] newArray(Class<? super T> elementType, Iterable<? extends T> from) { List<T> list = new ArrayList<T>(); for (T f : from) list.add(f); return list.toArray(newArray(elementType, list.size())); } /** Create a generic array. */ @SuppressWarnings("unchecked") private static <T> T[] newArray(Class<? super T> elementType, int length) { return (T[]) Array.newInstance(elementType, length); } private static class Product<T> implements Iterable<T[]> { private final Class<T> _resultType; private final Iterable<? extends T>[] _axes; /** Caution: the given array of axes is contained by reference, not cloned. */ Product(Class<T> resultType, Iterable<? extends T>[] axes) { _resultType = resultType; _axes = axes; } @Override public Iterator<T[]> iterator() { if (_axes.length <= 0) // an edge case return Collections.singleton(newArray(_resultType, 0)).iterator(); return new ProductIterator<T>(_resultType, _axes); } @Override public String toString() { return "Cartesian.product(" + Arrays.toString(_axes) + ")"; } private static class ProductIterator<T> implements Iterator<T[]> { private final Iterable<? extends T>[] _axes; private final Iterator<? extends T>[] _iterators; // one per axis private final T[] _result; // a copy of the last result /** * The minimum index such that this.next() will return an array that contains * _iterators[index].next(). There are some special sentinel values: NEW means this * is a freshly constructed iterator, DONE means all combinations have been * exhausted (so this.hasNext() == false) and _iterators.length means the value is * unknown (to be determined by this.hasNext). */ private int _nextIndex = NEW; private static final int NEW = -2; private static final int DONE = -1; /** Caution: the given array of axes is contained by reference, not cloned. */ ProductIterator(Class<T> resultType, Iterable<? extends T>[] axes) { _axes = axes; _iterators = Cartesian.<Iterator<? extends T>> newArray(Iterator.class, _axes.length); for (int a = 0; a < _axes.length; ++a) { _iterators[a] = axes[a].iterator(); } _result = newArray(resultType, _iterators.length); } private void close() { _nextIndex = DONE; // Release references, to encourage garbage collection: Arrays.fill(_iterators, null); Arrays.fill(_result, null); } @Override public boolean hasNext() { if (_nextIndex == NEW) { // This is the first call to hasNext(). _nextIndex = 0; // start here for (Iterator<? extends T> iter : _iterators) { if (!iter.hasNext()) { close(); // no combinations break; } } } else if (_nextIndex >= _iterators.length) { // This is the first call to hasNext() after next() returned a result. // Determine the _nextIndex to be used by next(): for (_nextIndex = _iterators.length - 1; _nextIndex >= 0; --_nextIndex) { Iterator<? extends T> iter = _iterators[_nextIndex]; if (iter.hasNext()) { break; // start here } if (_nextIndex == 0) { // All combinations have been generated. close(); break; } // Repeat this axis, with the next value from the previous axis. iter = _axes[_nextIndex].iterator(); _iterators[_nextIndex] = iter; if (!iter.hasNext()) { // Oops; this axis can't be repeated. close(); // no more combinations break; } } } return _nextIndex >= 0; } @Override public T[] next() { if (!hasNext()) throw new NoSuchElementException("!hasNext"); for (; _nextIndex < _iterators.length; ++_nextIndex) { _result[_nextIndex] = _iterators[_nextIndex].next(); } return _result.clone(); } @Override public void remove() { for (Iterator<? extends T> iter : _iterators) { iter.remove(); } } @Override public String toString() { return "Cartesian.product(" + Arrays.toString(_axes) + ").iterator()"; } } } }
基于索引的解决scheme
使用索引是一个简单的select,它是快速和高效的,可以处理任意数量的集合。 实现Iterable可以方便地在for-each循环中使用。 有关使用示例,请参阅#main方法。
public class CartesianProduct implements Iterable<int[]>, Iterator<int[]> { private final int[] _lengths; private final int[] _indices; private boolean _hasNext = true; public CartesianProduct(int[] lengths) { _lengths = lengths; _indices = new int[lengths.length]; } public boolean hasNext() { return _hasNext; } public int[] next() { int[] result = Arrays.copyOf(_indices, _indices.length); for (int i = _indices.length - 1; i >= 0; i--) { if (_indices[i] == _lengths[i] - 1) { _indices[i] = 0; if (i == 0) { _hasNext = false; } } else { _indices[i]++; break; } } return result; } public Iterator<int[]> iterator() { return this; } public void remove() { throw new UnsupportedOperationException(); } /** * Usage example. Prints out * * <pre> * [0, 0, 0] a, NANOSECONDS, 1 * [0, 0, 1] a, NANOSECONDS, 2 * [0, 0, 2] a, NANOSECONDS, 3 * [0, 0, 3] a, NANOSECONDS, 4 * [0, 1, 0] a, MICROSECONDS, 1 * [0, 1, 1] a, MICROSECONDS, 2 * [0, 1, 2] a, MICROSECONDS, 3 * [0, 1, 3] a, MICROSECONDS, 4 * [0, 2, 0] a, MILLISECONDS, 1 * [0, 2, 1] a, MILLISECONDS, 2 * [0, 2, 2] a, MILLISECONDS, 3 * [0, 2, 3] a, MILLISECONDS, 4 * [0, 3, 0] a, SECONDS, 1 * [0, 3, 1] a, SECONDS, 2 * [0, 3, 2] a, SECONDS, 3 * [0, 3, 3] a, SECONDS, 4 * [0, 4, 0] a, MINUTES, 1 * [0, 4, 1] a, MINUTES, 2 * ... * </pre> */ public static void main(String[] args) { String[] list1 = { "a", "b", "c", }; TimeUnit[] list2 = TimeUnit.values(); int[] list3 = new int[] { 1, 2, 3, 4 }; int[] lengths = new int[] { list1.length, list2.length, list3.length }; for (int[] indices : new CartesianProduct(lengths)) { System.out.println(Arrays.toString(indices) // + " " + list1[indices[0]] // + ", " + list2[indices[1]] // + ", " + list3[indices[2]]); } }
}
使用Google Guava 19和Java 8非常简单:
假设你有想要关联的所有数组的列表…
public static void main(String[] args) { List<String[]> elements = Arrays.asList( new String[]{"John", "Mary"}, new String[]{"Eats", "Works", "Plays"}, new String[]{"Food", "Computer", "Guitar"} ); // Create a list of immutableLists of strings List<ImmutableList<String>> immutableElements = makeListofImmutable(elements); // Use Guava's Lists.cartesianProduct, since Guava 19 List<List<String>> cartesianProduct = Lists.cartesianProduct(immutableElements); System.out.println(cartesianProduct); }
创build不可变列表的方法如下:
/** * @param values the list of all profiles provided by the client in matrix.json * @return the list of ImmutableList to compute the Cartesian product of values */ private static List<ImmutableList<String>> makeListofImmutable(List<String[]> values) { List<ImmutableList<String>> converted = new LinkedList<>(); values.forEach(array -> { converted.add(ImmutableList.copyOf(array)); }); return converted; }
输出如下:
[ [John, Eats, Food], [John, Eats, Computer], [John, Eats, Guitar], [John, Works, Food], [John, Works, Computer], [John, Works, Guitar], [John, Plays, Food], [John, Plays, Computer], [John, Plays, Guitar], [Mary, Eats, Food], [Mary, Eats, Computer], [Mary, Eats, Guitar], [Mary, Works, Food], [Mary, Works, Computer], [Mary, Works, Guitar], [Mary, Plays, Food], [Mary, Plays, Computer], [Mary, Plays, Guitar] ]
我相信这是正确的。 它不是寻求效率,而是通过recursion和抽象来获得清洁的风格。
关键抽象是引入一个简单的Tuple
类。 这有助于后来的仿制药:
class Tuple<T> { private List<T> list = new ArrayList<T>(); public void add(T t) { list.add(t); } public void addAll(Tuple<T> subT) { for (T t : subT.list) { list.add(t); } } public String toString() { String result = "("; for (T t : list) { result += t + ", "; } result = result.substring(0, result.length() - 2); result += " )"; return result; } }
有了这门课,我们可以写出如下的课程:
public class Example { public static <T> List<Tuple<T>> cartesianProduct(List<Set<T>> sets) { List<Tuple<T>> tuples = new ArrayList<Tuple<T>>(); if (sets.size() == 1) { Set<T> set = sets.get(0); for (T t : set) { Tuple<T> tuple = new Tuple<T>(); tuple.add(t); tuples.add(tuple); } } else { Set<T> set = sets.remove(0); List<Tuple<T>> subTuples = cartesianProduct(sets); System.out.println("TRACER size = " + tuples.size()); for (Tuple<T> subTuple : subTuples) { for (T t : set) { Tuple<T> tuple = new Tuple<T>(); tuple.addAll(subTuple); tuple.add(t); tuples.add(tuple); } } } return tuples; }
}
我有一个体面的例子,但是为了简洁起见我省略了这个例子。
您可能对笛卡尔产品的另一个问题感兴趣(编辑:删除保存超链接,search标签笛卡尔产品)。 这个答案有一个很好的recursion解决scheme,我很难改进。 你特别想要一个迭代解决scheme,而不是recursion解决scheme?
编辑:
在perl中查看堆栈溢出的另一个迭代解决scheme并做了一个简单的解释之后 ,下面是另一个解决scheme:
public static <T> List<Set<T>> uglyCartesianProduct(List<Set<T>> list) { List<Iterator<T>> iterators = new ArrayList<Iterator<T>>(list.size()); List<T> elements = new ArrayList<T>(list.size()); List<Set<T>> toRet = new ArrayList<Set<T>>(); for (int i = 0; i < list.size(); i++) { iterators.add(list.get(i).iterator()); elements.add(iterators.get(i).next()); } for(int i = 0; i < numberOfTuples(list); i++) { toRet.add(new HashSet<T>()); } int setIndex = 0; for (Set<T> set : list) { int index = 0; for (int i = 0; i < numberOfTuples(list); i++) { toRet.get(index).add((T) set.toArray()[index % set.size()]); index++; } setIndex++; } return toRet; } private static <T> int numberOfTuples(List<Set<T>> list) { int product = 1; for (Set<T> set : list) { product *= set.size(); } return product; }
这是一个懒惰的迭代器方法,它使用函数来产生适当的输出types。
public static <T> Iterable<T> cartesianProduct( final Function<Object[], T> fn, Object[]... options) { final Object[][] opts = new Object[options.length][]; for (int i = opts.length; --i >= 0;) { // NPE on null input collections, and handle the empty output case here // since the iterator code below assumes that it is not exhausted the // first time through fetch. if (options[i].length == 0) { return Collections.emptySet(); } opts[i] = options[i].clone(); } return new Iterable<T>() { public Iterator<T> iterator() { return new Iterator<T>() { final int[] pos = new int[opts.length]; boolean hasPending; T pending; boolean exhausted; public boolean hasNext() { fetch(); return hasPending; } public T next() { fetch(); if (!hasPending) { throw new NoSuchElementException(); } T out = pending; pending = null; // release for GC hasPending = false; return out; } public void remove() { throw new UnsupportedOperationException(); } private void fetch() { if (hasPending || exhausted) { return; } // Produce a result. int n = pos.length; Object[] args = new Object[n]; for (int j = n; --j >= 0;) { args[j] = opts[j][pos[j]]; } pending = fn.apply(args); hasPending = true; // Increment to next. for (int i = n; --i >= 0;) { if (++pos[i] < opts[i].length) { for (int j = n; --j > i;) { pos[j] = 0; } return; } } exhausted = true; } }; } }; }
我为Strings表编写了一个recursion笛卡尔乘积algorithm。 你可以修改它以使集合成立。 下面是algorithm。 这也在我的文章中解释
public class Main { public static void main(String[] args) { String[] A = new String[]{ "a1", "a2", "a3" }; String[] B = new String[]{ "b1", "b2", "b3" }; String[] C = new String[]{ "c1" }; String[] cp = CartesianProduct(0, A, B, C); for(String s : cp) { System.out.println(s); } } public static String[] CartesianProduct(int prodLevel, String[] res, String[] ...s) { if(prodLevel < s.length) { int cProdLen = res.length * s[prodLevel].length; String[] tmpRes = new String[cProdLen]; for (int i = 0; i < res.length; i++) { for (int j = 0; j < s[prodLevel].length; j++) { tmpRes[i * res.length + j] = res[i] + s[prodLevel][j]; } } res = Main.CartesianProduct(prodLevel + 1, tmpRes, s); } return res; }}