Sunday, July 3, 2011

Python: Generating all subsets of size k for n objects.

In our main blog, we presented a routine for generating all subsets of set, expressed as an array or vector of indices to the n objects (0 to n-1).See Generating all subsets of a set with n objects In this article we present an "odometer" method of generating all subsets of specific size k where k < n. For example, for n = 5 and k = 3, we expect the function to return the set containing the following vectors: [0,1,2], [0, 1,3], [0,1,4], [0,2,3] and so on up to [2,3,4]. Here is our Python code.

# -*- coding: utf-8 -*-

"""
File ksubsets.py
Author Ernesto P. Adorio, Ph.D.
UPDEPP, UP Clarkfield
Universty of the Philippines Extension Program in Pampanga
Desc generates subsets (as indices) with k elements from a set of size n.
Version 0.0.1 2011.07.03
"""


def ksubsets (n, k):
"""
generates the list of subsets (as index array) of n elements
chosen k at a time. Combinations actually!
"""
#sanity check
if n <0 or k <0 or n - k < 0: return None N = range(k) retval = [N[:]] print "initial N=", N #update current N vector. level = k-1 while level >=0:
d = N[level]
#test if element is already at maximum value.
if d < n - k + level: # No, increment the value. N[level]+= 1 #repair trailing elements. for i in range(level+1, k): N[i] = N[i-1] + 1 #append newly generated subset. retval.append(N[:]) level = k-1 else: # position N[level] already at maximum value! level -= 1 return retval for i, subset in enumerate(ksubsets(5, 3)): print i, subset





When the above program is run, it outputs the expected results:

0 [0, 1, 2]
1 [0, 1, 3]
2 [0, 1, 4]
3 [0, 2, 3]
4 [0, 2, 4]
5 [0, 3, 4]
6 [1, 2, 3]
7 [1, 2, 4]
8 [1, 3, 4]
9 [2, 3, 4]

How does the code work? First it initializes the first member of the set of subsets as
[0, 1, ..., k-1]. Next, starting from the rightmost (level = k-1)position it checks whether the current element is at maximum. If it is not it is incremented and any trailing positions adjusted to be 1 greater than at the preceding position. Otherwise the level is decremented and the process repeats.

Exercise: write a generator version of the above code. We will present the solution on the next revision.

The itertools module has a combinations function. After importing, one simply writes:
list(combinations (range(5), 3)).. Python just makes it TOO easy!

No comments:

Post a Comment