few problems, approaches and solutions

Category Archives: programming

min-cost perfect matching

If you would like to play with minimum-cost perfect matching, klein-tardos is a C++ implementation. The intent of the implementation was to understand how the algorithm works. Also, most of the C++ I know is from my experience with this problem, so its use should probably be limited to understanding and experimenting with the algorithm. My implementation is based on the algorithm presented in Algorithm Design by Kleinberg and Tardos. priqueue.h is an implementation of a priority queue with deletion. indexedList.h provides an abstraction for two lists with a reverse function, which switches an element from one list to the other. It is used to represent the outgoing and incoming edge lists of a vertex; the reverse function can then switch an edge from the outgoing list to the incoming list.
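
Roughly, the two-list idea behaves like the sketch below. This is only an illustration, not the actual indexedList.h: every edge id sits in exactly one of two intrusive doubly-linked lists, so moving it to the other list is a constant-time operation.

#define MAXE 1024

/* hypothetical sketch of the two-list abstraction */
struct two_lists {
    int next[MAXE], prev[MAXE];   /* intrusive links, indexed by edge id */
    int which[MAXE];              /* 0 = outgoing, 1 = incoming */
    int head[2];                  /* list heads, initialize both to -1 */
};

static void push(struct two_lists *t, int list, int e)
{
    t->which[e] = list;
    t->prev[e] = -1;
    t->next[e] = t->head[list];
    if (t->head[list] != -1)
        t->prev[t->head[list]] = e;
    t->head[list] = e;
}

static void reverse_edge(struct two_lists *t, int e)
{
    int w = t->which[e];

    /* unlink e from its current list ... */
    if (t->prev[e] != -1) t->next[t->prev[e]] = t->next[e];
    else                  t->head[w] = t->next[e];
    if (t->next[e] != -1) t->prev[t->next[e]] = t->prev[e];

    /* ... and push it onto the other one */
    push(t, 1 - w, e);
}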

The algorithm in general is known as the Hungarian algorithm. The implemented algorithm is different from the ones here, here and here. Each iteration finds and augments the shortest augmenting path in the bipartite graph using Dijkstra's algorithm. But essentially, it uses the same fundamental concepts to solve the problem. The idea is to build a matching which doesn't have a negative cycle in its residual graph. A more formal treatment follows.

Consider a bipartite graph G = (V, E) with a cost associated with each edge. If M is a matching in G, G_M denotes the corresponding residual graph for M. Quoting from Algorithm Design:

Let M be a perfect matching. If there are no negative-cost directed cycles C in G_M, then M is a minimum-cost perfect matching.

Prof. Golin’s notes explain the same concept via the equality graph.
The Hungarian algorithm relies on this fact to find the minimum-cost matching. A more general version of this statement is also the basis for minimum-cost flow algorithms.

The Hungarian algorithm, in each iteration, builds a larger matching while keeping G_M free from negative-cost directed cycles. In the end, the algorithm reaches a perfect matching which is also a minimum-cost matching.

The tougher part is avoiding the negative-cost cycles. That's where the labeling comes in (feasible labeling in Prof. Golin’s notes). After each augmentation the graph is relabeled with reduced costs, which don't change the solution and also keep G_M free of negative-cost cycles.
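
As a rough sketch of the idea (not the actual code; p and dist here are hypothetical arrays holding the node prices and the distances from the last Dijkstra run), the relabeling just folds the shortest-path distances into the prices, which keeps every reduced cost p(u) + c(u,v) - p(v) nonnegative:

/* sketch: shift the node prices by this iteration's Dijkstra distances;
   with nonnegative reduced costs, G_M cannot contain a negative-cost
   directed cycle. */
void relabel(double p[], const double dist[], int nvertices)
{
    int v;
    for (v = 0; v < nvertices; v++)
        p[v] += dist[v];
}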

If you want to know more, you should read the above-mentioned notes and, if possible, Algorithm Design. I have tried to make the core part quite visible in the implementation. In fact, the main loop of the algorithm is just

void min_weight_assignment() {
    initiate_node_prices();

    for (int i=0; i<x.size(); i++) {
        dijkstra_spath(s);  // shortest paths from source
        augment_spath();
        reduce_node_prices();
    }
}

An observation: as I mentioned, all approaches are almost equivalent. The matrix method seems to be different from the others, but the similarity can be seen by treating the column nodes as left nodes of the bipartite graph and the row nodes as right ones; entry (i, j) of the cost matrix is then just the cost of the edge between row node i and column node j.

Beust Sequence! aka sequence of numbers without repeating digits.

The original statement of the problem is here. The problem asks to generate an increasing sequence of numbers with no digit repeating within the same number. The problem is quite similar to generating permutations, except that a number can't start with 0. Here is my take on the problem. There are three basic things to take care of: first, generating permutations; second, taking care of the initial-0 problem; and third, not going beyond the limit.

If we are given a list and asked to produce all permutations, all we have to do is, for each entry in the list, consider the permutations of the rest of the entries (this applies recursively). To generate permutations in lexical order, we just have to select the elements in increasing order.
In the code, lst stores the lexically ordered digits. Each recursion chooses one element from the list and passes the rest to the next recursive call.

The second problem is to not count initial 0's. This is handled by left and the associated recursive call. left essentially tracks whether all the entries to the left are 0. If all the left entries are 0, and the current choice is also 0, then we pass the whole choice list to the next recursive call.

The third is to not cross the limit. This is the part which convolutes my solution. The variables atmax, rnge, chldmaxd and maxd all work towards solving this part. The idea is to keep track, from the left, of the digits which have reached the maximum. Similar to left above, atmax tells whether all choices to the left have reached their maximum; this is used to restrict the choices for the next recursive call. Ex: if the limit is 1245 and the choice so far is 12, then the choice for the next level is restricted to 0-4. But if it is 11, then the next choice could be anything between 0-9.

sol = []
def seq(inp):
    digits = tuple([i for i in range(10)])
    maxd = [int(i) for i in str(inp)]
    # number in reverse, with the first element always 0
    # ex: the maxd would be [0, 5, 2, 1] for limit of 125
    maxd.append(0)
    maxd.reverse()

    srep = tuple([str(i) for i in range(10)])

    def seq_gen(n, atmax, lst, s, left):
        if n==1:
            for i in lst:
                sol.append(s+srep[i])
            return

        rnge = (atmax and [maxd[n]] or [9])[0]

        chldmaxd = False
        for i in range(len(lst)):
            if lst[i] <= rnge:
                if atmax and lst[i]==rnge:
                    chldmaxd = True
                if left and lst[i] == 0:
                    seq_gen(n-1, chldmaxd, lst, s, True)
                else:
                    seq_gen(n-1, chldmaxd, lst[:i]+lst[i+1:], s+srep[lst[i]], False)
            else:
                break

    seq_gen(len(maxd)-1, 1, digits, "", True)

The above code's performance could be improved with a simple change in the termination condition:

if n==1:
    for i in lst:
        sol.append(s+srep[i])
    return

Timing analysis for the solution:

real    0m29.297s
user    0m28.762s
sys     0m0.344s

As approaches in Java performed quite a bit faster than this, I tried my own implementation in C/C++, with Bob’s idea of using integers rather than strings to save the sequence. It takes about 230ms ($ time ./a.out) to generate the whole sequence on my system.

The code is available here

On the other hand, if we just want the count of such numbers (8877690), then here is one approach. The numbers range from 1 digit to 10 digits, because there are only 10 digits and we can't go beyond that without repetition. Considering each such case (ex: 3-digit numbers), the most significant digit can't be 0, because otherwise it just becomes a 2-digit number (which we account for separately). So there are 9 options for the first digit [1-9]. From the second digit onwards, we just have to count the permutations of the remaining digits. In our example of 3-digit numbers, this turns out to be 9*(9*8), written better as 9*(9!/7!).
Doing this for all 1, 2, …, 10 digit cases and summing gives us the answer.
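
As a quick sketch (plain C, independent of the code above), the whole count can be computed like this:

#include <stdio.h>

/* For k-digit numbers the leading digit has 9 choices (1-9) and the
   remaining k-1 positions are drawn, without repetition, from the 9
   digits left over: 9 * 9!/(10-k)!.  Sum over k = 1..10. */
int main(void)
{
    long long total = 0;
    int k, j;

    for (k = 1; k <= 10; k++) {
        long long cnt = 9;              /* choices for the leading digit */
        for (j = 0; j < k - 1; j++)
            cnt *= 9 - j;               /* choices for the remaining positions */
        total += cnt;
    }
    printf("%lld\n", total);            /* prints 8877690 */
    return 0;
}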

google treasure hunt

Two problems from the Google treasure hunt which, although not quite tough, were nevertheless interesting.

One was about a robot starting from the top-left corner of a matrix, trying to reach the bottom-right corner. The robot could only go right or down. The problem was to find the number of unique possible paths. My first thought was to use dynamic programming. Some more thought, and I was able to see a pattern in the solution. The basic idea was, as in dynamic programming, to solve the smaller problems first. The observation was that the paths could be calculated one level at a time, along a line parallel to the anti-diagonal, as it sweeps through the matrix from the destination cell towards the starting cell.

Here’s the code doing just that.

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

int main(int argc, char ** argv)
{
  int r, c, i, j, diag;
  unsigned long long *pa;
  assert(argc == 3);

  r = atoi(argv[1]);
  c = atoi(argv[2]);

  diag = r<c? r:c;
  pa = calloc( diag, sizeof pa[0] );

  for (i=0;i<diag;i++)
    pa[i] = 0;
  pa[0] = 1;

  for (i=0;i<r+c-2;i++)
    for (j=diag-1;j>0;j--)
      pa[j] = pa[j] + pa[j-1];

  printf("paths %llu \n", pa[diag-1]);

  return 0;
}

The implementation worked for small inputs. But for the actual input, long long was not enough (fortunately, and presumably intended by the problem designer). Some more thought about the process, and the insight came. The pattern seemed familiar, although incomplete: it was Pascal's triangle. It was easy to find the entry needed from Pascal's triangle and its corresponding value, C(n+m-2, n-1). But to calculate it would still need more precision than what's available with C data types (otherwise my program would have given the answer). Google also didn't help. Looking for a bignum library led me to this site, which provided an expression calculator showing the capabilities of the library. So all I had to do was fill in the equation and copy-paste the solution.
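
If you'd rather compute it locally, any bignum library will do; GMP, for instance, has a binomial-coefficient routine. A minimal sketch (the grid size below is a placeholder, not the actual treasure-hunt input; link with -lgmp):

#include <stdio.h>
#include <gmp.h>

int main(void)
{
    unsigned long r = 57, c = 43;   /* placeholder grid size */
    mpz_t paths;

    mpz_init(paths);
    /* number of monotone paths in an r x c grid: C(r+c-2, r-1) */
    mpz_bin_uiui(paths, r + c - 2, r - 1);
    gmp_printf("paths %Zd\n", paths);
    mpz_clear(paths);

    return 0;
}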

The other one was the last problem. Here is a verbatim instance of the problem.

Find the smallest number that can be expressed as
the sum of 3 consecutive prime numbers,
the sum of 35 consecutive prime numbers,
the sum of 553 consecutive prime numbers,
the sum of 829 consecutive prime numbers,
and is itself a prime number.
For example, 41 is the smallest prime number that can be expressed as
the sum of 3 consecutive primes (11 + 13 + 17 = 41) and
the sum of 6 consecutive primes (2 + 3 + 5 + 7 + 11 + 13 = 41).

A shorter sequence would always end with larger primes than a longer sequence, the reason being that the same sum is distributed over a smaller set of consecutive primes. How does that help? Slide the smallest sequence to include the next prime, and simultaneously update all longer sequences to have a sum greater than or equal to the sequences shorter than them. Here's some slightly cryptic code for the solution.

struct cseq {
    int seql; /* seq len */
    int seqt; /* seq top */
    int seqs; /* seq sum */
} *sa;

void slide_sequence(struct cseq *s)
{
    s->seqt++;
    s->seqs -= prime[s->seqt - s->seql];
    s->seqs += prime[s->seqt];
}

int align_to_sum(struct cseq *s, int sum)
{
    while (s->seqs < sum)
        slide_sequence(s);
    return s->seqs == sum;
}

int main()
{
    /* initialize sa and primes array and other variables. */

    for each consecutive prime {
        /* invar: bigger sequences have sum >= smaller ones. */
        eqcnt = 0;

        slide_sequence( &(sa[0]) );

        for (j = 1; j < nseq; j++)
            eqcnt += align_to_sum( &(sa[j]), sa[0].seqs );

        if (eqcnt == nseq - 1 && is_prime(sa[0].seqs))
            break;  /* sa[0].seqs is the answer */
    }
}

The complete code is available here.

Here are the timing details.

$time ./a.out
seq: 3, start: 221709, end: 221711, sum: 9221231
seq: 35, start: 23088, end: 23122, sum: 9221231
seq: 553, start: 1650, end: 2202, sum: 9221231
seq: 829, start: 930, end: 1758, sum: 9221231

real 0m0.210s
user 0m0.208s
sys 0m0.000s

Problem 4-2

Problem:

We are given an array A of size n with distinct elements such that 0 <= A[i] <= n for each index i. Note that the array must be missing an integer from the range [0, n], as the array has size n and the range has n+1 elements. The only way to access elements is through the function

bit_access(int * A, int i, int j)

which returns the j-th bit of element i in array A.

We have to find out the missing number in O(n) time.

Approach:

Let the b-th bit be the most significant set bit of n (counting bits from 1). What we observe is that the elements of A can be divided into two sets based on their b-th bit: those which have the bit set, and those which don't. We know that the number of values in [0, n] with the b-th bit 0 is exactly 2^{b-1} (for example, with n = 5, binary 101, these are 0-3). If the number of 0's in the array doesn't match this expected count, then the missing element has its b-th bit 0; otherwise it is 1. The algorithm works on this observation by first partitioning the array according to the b-th bit and then applying the same logic recursively on the sub-array which is missing an element from its range.

Here’s the code for partition and recursive search of solution.

/* partition the array elements according to bit b of each
   element and return the partition index. */
int partition(int * arr, int n, int b)
{
    int i, tmp, part = 0;

    /* invar: part points to the first set bit seen so far. */
    for (i = 0; i < n; i++)
        if (bit_access(arr, i, b) == 0) {
            /* move the 0-bit element to the front */
            tmp = arr[i]; arr[i] = arr[part]; arr[part] = tmp;
            part++;
        }

    return part;
}

The recursive search partitions on the current bit, decides from the count which side is missing an element, and recurses into that side; since the recursive call is the last thing done, it could be optimized away as a tail call.
The recursion is T(n) = T(n/2) + O(n), which solves to O(n), hence a solution for the problem.
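
Here is a minimal sketch of how the recursive search can be built on top of partition; find_missing and msb below are illustrative helper names, and bits are counted from 1:

/* sketch only: hi is the top of the current value range [0, hi] and
   arr[0..n-1] holds all but one value from that range. */
static int msb(int x)   /* 1-indexed position of the highest set bit, 0 for x == 0 */
{
    int b = 0;
    while (x) { b++; x >>= 1; }
    return b;
}

int find_missing(int *arr, int n, int hi)
{
    int b, half, zeros;

    if (hi == 0)                      /* range is just {0}, so 0 is missing */
        return 0;

    b     = msb(hi);
    half  = 1 << (b - 1);             /* values below half have bit b clear */
    zeros = partition(arr, n, b);     /* 0-bit elements now occupy arr[0..zeros-1] */

    if (zeros < half)                 /* a 0-bit value is the missing one */
        return find_missing(arr, zeros, half - 1);
    else                              /* a 1-bit value is the missing one */
        return half + find_missing(arr + zeros, n - zeros, hi - half);
}

Calling find_missing(A, n, n) returns the missing number in O(n) total time, as the recurrence above indicates.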

subtleties of programming

While coding for problem 2.3-7, somehow I was not in my default mode of not thinking while coding. To test my programs I was writing a random array generator. One part of it was to swap two array entries. Now, the most straightforward way to swap two array elements is

int tmp = arr[i];
arr[i]  = arr[r];
arr[r]  = tmp;

But, knowing the xor method, I thought why not save a variable. Basically, if a and b are to be swapped, then this code suffices.

a ^= b;
b ^= a;
a ^= b;

So now the question is, would this work on an array? Ex:

arr[i] ^= arr[r];
arr[r] ^= arr[i];
arr[i] ^= arr[r];

Here's a subtle bug: if i and r are the same, then it won't work. The reason is that we lose the original value; if both indexes are the same, the array entry gets xor'ed with itself. The Wiki page explains it better, and also why the xor swap should be avoided on desktop systems anyway. Again, I learn the evils of premature optimization. Subtleties of programming!
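
So for the array case the straightforward fix is to guard against the two indexes aliasing:

if (i != r) {   /* xor swap destroys the value when both refer to the same cell */
    arr[i] ^= arr[r];
    arr[r] ^= arr[i];
    arr[i] ^= arr[r];
}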

problem 2.3-7

Problem:

We are given a set S of n integers, and another integer x. We need to find out whether there exist two integers in S whose sum is x.

Approach:

The straightforward answer is to calculate the sum of each possible pair of elements in S, and see if any of them matches x.

Assume S is sorted, and consider the sum of the smallest and the largest element. Every pair containing the largest element sums to at least this value, and every pair containing the smallest element sums to at most this value. So if x is less than this sum, the largest element of S cannot be in the pair we are searching for; similarly, if x is larger than this sum, the smallest element of S cannot be in the pair. The same reasoning can then be applied to the subproblem (the remaining list).

At each stage we need the smallest and largest element of the current list. We could find them for each subproblem in O(n), or we could sort the set S once up front and then access the smallest and largest elements in O(1). The second approach solves the problem in O(n log n).

Coding the approach:

As you might have noticed, the solution is a recursive one, but it doesn't have to be coded that way. The recursive solution can easily be converted to an iterative one, as shown below. I have skipped much of the code, just to present the heart of the method.

int sum_check(int *arr, int n, int x)
{
  int u, l, s;
  u=0;l=n-1;

  while (u<l) {
    s = arr[u]+arr[l];
    if (x < s)
      l--;
    else if (x > s)
      u++;
    else
      return 1;
  }

 return 0;
}

But let's look at the recursive approach also.

int sum_check_rec(int *arr, int n, int x)
{
    int s = arr[0] + arr[n-1];

    if (n == 1) return 0;
    if (x == s) return 1;

    /* drop the smallest element if x > s, the largest one otherwise */
    return sum_check_rec(arr + (x > s), n - 1, x);
}