I've answered an interesting question on StackOverflow.com
The question was, provided two unique sets, how to produce all posible combination of subsets. For example provided these sets:
List<string> employees = new List<string>() { "Adam", "Bob"};
List<string> jobs = new List<string>() { "1", "2", "3"};
We can split the two sets to the maximum of two subsets, and here are all the combinations:
Adam : 1
Bob : 2 , 3
Adam : 1 , 2
Bob : 3
Adam : 1 , 3
Bob : 2
Adam : 2 Bob : 1 , 3
Adam : 2 , 3
Bob : 1
Adam : 3
Bob : 1 , 2
Adam, Bob : 1, 2, 3
First of all before you write your code down, lets understand the underlying combinatorics of this question.
Basically, you require that for any partition of set A, you require the same number of parts in set B.
E.g. If you split set A to 3 groups you require that set B will be split to 3 groups as well, if you don't at least one element won't have a corresponding group in the other set.
It's easier to picture it with splitting set A to 1 group we must have one group made from set B as in your example (Adam, Bob : 1, 2, 3) .
It's easier to picture it with splitting set A to 1 group we must have one group made from set B as in your example (Adam, Bob : 1, 2, 3) .
So now, we know that set A has n elements and that set B has k elements.
So naturally we cannot request that any set be split more that
Let's say we split both sets to 2 groups (partitions) each, now we have
Another example is 3 groups each would give us
However, we're still not finished, after any set is split into several subsets (groups), we can still order the elements in many combinations.
So for
This can be found by using the Stirling number of the second kind formula:
(Eq 1)
So naturally we cannot request that any set be split more that
Min(n,k)
.Let's say we split both sets to 2 groups (partitions) each, now we have
1*2=2!
unique pairs between the two sets.Another example is 3 groups each would give us
1*2*3=3!
unique pairs.However, we're still not finished, after any set is split into several subsets (groups), we can still order the elements in many combinations.
So for
m
partitions of a set we need to find how many combinations of placing n
elements in m
partitions.This can be found by using the Stirling number of the second kind formula:
(Eq 1)
This formula gives you The number of ways of partitioning a set of
So the total number of combinations of pairs for all partitions from 1 to
(Eq 2)
n
elements into k
nonempty sets.So the total number of combinations of pairs for all partitions from 1 to
min(n,k)
is:(Eq 2)
In short, it's the sum of all partition combinations of both sets, times all combinations of pairs.
So now we understand how to partition and pair our data we can write the code down:
Code:
So basically, if we look at our final equation (2) we understand the we need four parts of code to our solution.
1. A summation (or loop)
2. A way to get our Stirling sets or partitions from both sets
3. A way to get the Cartesian product of both Stirling sets.
4. A way to permutate through the items of a set. (n!)
1. A summation (or loop)
2. A way to get our Stirling sets or partitions from both sets
3. A way to get the Cartesian product of both Stirling sets.
4. A way to permutate through the items of a set. (n!)
On StackOverflow you can find many ways of how to permuatate items and how to find cartesian products, here is a an example (as extension method):
public static IEnumerable<IEnumerable<T>> Permute<T>(this IEnumerable<T> list)
{
if (list.Count() == 1)
return new List<IEnumerable<T>> { list };
return list.Select((a, i1) => Permute(list.Where((b, i2) => i2 != i1)).Select(b => (new List<T> { a }).Union(b)))
.SelectMany(c => c);
}
This was the easy part of the code.
The more difficult part (imho) is finding all possible
So to solve this, I first solved the greater problem of how to find all possible partitions of a set (not just of size n).
The more difficult part (imho) is finding all possible
n
partitions of a set.So to solve this, I first solved the greater problem of how to find all possible partitions of a set (not just of size n).
I came up with this recursive function:
public static List<List<List<T>>> AllPartitions<T>(this IEnumerable<T> set)
{
var retList = new List<List<List<T>>>();
if (set == null || !set.Any())
{
retList.Add(new List<List<T>>());
return retList;
}
else
{
for (int i = 0; i < Math.Pow(2, set.Count()) / 2; i++)
{
var j = i;
var parts = new [] { new List<T>(), new List<T>() };
foreach (var item in set)
{
parts[j & 1].Add(item);
j >>= 1;
}
foreach (var b in AllPartitions(parts[1]))
{
var x = new List<List<T>>();
x.Add(parts[0]);
if (b.Any())
x.AddRange(b);
retList.Add(x);
}
}
}
return retList;
}
The return value of :
We do not have to use the type List here, but it simplifies indexing.
List<List<List<T>>>
just means a List of all possible partitions where a partition is a list of sets and a set is a list of elements.We do not have to use the type List here, but it simplifies indexing.
So now let's put everything together:
Main Code
//Initialize our sets
var employees = new [] { "Adam", "Bob" };
var jobs = new[] { "1", "2", "3" };
//iterate to max number of partitions (Sum)
for (int i = 1; i <= Math.Min(employees.Length, jobs.Length); i++)
{
Debug.WriteLine("Partition to " + i + " parts:");
//Get all possible partitions of set "employees" (Stirling Set)
var aparts = employees.AllPartitions().Where(y => y.Count == i);
//Get all possible partitions of set "jobs" (Stirling Set)
var bparts = jobs.AllPartitions().Where(y => y.Count == i);
//Get cartesian product of all partitions
var partsProduct = from employeesPartition in aparts
from jobsPartition in bparts
select new { employeesPartition, jobsPartition };
var idx = 0;
//for every product of partitions
foreach (var productItem in partsProduct)
{
//loop through the permutations of jobPartition (N!)
foreach (var permutationOfJobs in productItem.jobsPartition.Permute())
{
Debug.WriteLine("Combination: "+ ++idx);
for (int j = 0; j < i; j++)
{
Debug.WriteLine(productItem.employeesPartition[j].ToArrayString() + " : " + permutationOfJobs.ToArray()[j].ToArrayString());
}
}
}
}
Output:
Partition to 1 parts:
Combination: 1
{ Adam , Bob } : { 1 , 2 , 3 }
Partition to 2 parts:
Combination: 1
{ Bob } : { 2 , 3 }
{ Adam } : { 1 }
Combination: 2
{ Bob } : { 1 }
{ Adam } : { 2 , 3 }
Combination: 3
{ Bob } : { 1 , 3 }
{ Adam } : { 2 }
Combination: 4
{ Bob } : { 2 }
{ Adam } : { 1 , 3 }
Combination: 5
{ Bob } : { 3 }
{ Adam } : { 1 , 2 }
Combination: 6
{ Bob } : { 1 , 2 }
{ Adam } : { 3 }
We can easily check our out put just by counting the results.
In this example we have a set of 2 elements, and a set of 3 elements, Equation 2 states that we need S(2,1)S(3,1)1!+S(2,2)S(3,2)2! = 1+6 = 7
which is exactly the number of combinations we got.
In this example we have a set of 2 elements, and a set of 3 elements, Equation 2 states that we need S(2,1)S(3,1)1!+S(2,2)S(3,2)2! = 1+6 = 7
which is exactly the number of combinations we got.
For reference here are examples of Stirling number of the second kind:
S(1,1) = 1
S(1,1) = 1
S(2,1) = 1
S(2,2) = 1
S(2,2) = 1
S(3,1) = 1
S(3,2) = 3
S(3,3) = 1
S(3,2) = 3
S(3,3) = 1
S(4,1) = 1
S(4,2) = 7
S(4,3) = 6
S(4,4) = 1
S(4,2) = 7
S(4,3) = 6
S(4,4) = 1
Here is an example of a ToString I use for display purposes:
public static String ToArrayString<T>(this IEnumerable<T> arr)
{
string str = "{ ";
foreach (var item in arr)
{
str += item + " , ";
}
str = str.Trim().TrimEnd(',');
str += "}";
return str;
}
Here is a fast permuation algorithm if you want faster results:
The main part of this algorithm is finding the Stirling sets, I've used an inneficient Permutation method, here is a faster one based on the QuickPerm algorithm:
public static IEnumerable<IEnumerable<T>> QuickPerm<T>(this IEnumerable<T> set)
{
int N = set.Count();
int[] a = new int[N];
int[] p = new int[N];
var yieldRet = new T[N];
var list = set.ToList();
int i, j, tmp ;// Upper Index i; Lower Index j
T tmp2;
for (i = 0; i < N; i++)
{
// initialize arrays; a[N] can be any type
a[i] = i + 1; // a[i] value is not revealed and can be arbitrary
p[i] = 0; // p[i] == i controls iteration and index boundaries for i
}
yield return list;
//display(a, 0, 0); // remove comment to display array a[]
i = 1; // setup first swap points to be 1 and 0 respectively (i & j)
while (i < N)
{
if (p[i] < i)
{
j = i%2*p[i]; // IF i is odd then j = p[i] otherwise j = 0
tmp2 = list[a[j]-1];
list[a[j]-1] = list[a[i]-1];
list[a[i]-1] = tmp2;
tmp = a[j]; // swap(a[j], a[i])
a[j] = a[i];
a[i] = tmp;
//MAIN!
// for (int x = 0; x < N; x++)
//{
// yieldRet[x] = list[a[x]-1];
//}
yield return list;
//display(a, j, i); // remove comment to display target array a[]
// MAIN!
p[i]++; // increase index "weight" for i by one
i = 1; // reset index i to 1 (assumed)
}
else
{
// otherwise p[i] == i
p[i] = 0; // reset p[i] to zero
i++; // set new index value for i (increase by one)
} // if (p[i] < i)
} // while(i < N)
}
This will take cut the time by 80-90%
However, most of the CPU cycles go to string building, which is specifically needed for this example.
This will be a make it a bit faster:
However, most of the CPU cycles go to string building, which is specifically needed for this example.
This will be a make it a bit faster:
results.Add(productItem.employeesPartition[j].Aggregate((a, b) => a + "," + b) + " : " + permutationOfJobs.ToArray()[j].Aggregate((a, b) => a + "," + b));
Compiling in x64 will be better cause these strings are taking a lot of memory.
I liked the question, it wasn't trivial. I want to make the AllPartitions method more optimized in the future.
No comments:
Post a Comment