few problems, approaches and solutions

Monthly Archives: June 2008

Beust Sequence! aka sequence of numbers without repeating digits.

The original statement of the problem is here . The problem asks us to generate an increasing sequence of numbers in which no digit repeats within a number. It is quite similar to generating permutations, except that a number cannot start with 0. Here is my take on the problem. There are three basic things to take care of: first, generating the permutations; second, handling the leading-0 problem; and third, not going beyond the limit.

If we are given a list and asked to produce all permutations, all we have to do is, for each entry in the list, generate the permutations of the rest of the entries (this applies recursively). To generate the permutations in lexicographic order, we just pick the entries in increasing order.
In the code, lst stores the lexicographically arranged digits. Each recursion chooses one element from the list and passes the rest to the next recursive call.
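To make the recursion concrete before the full constrained version below, here is a minimal sketch of plain lexicographic permutation generation (the helper name perms is mine, not part of the original code):

def perms(lst, prefix=""):
    """Yield all permutations of the sorted list lst in lexicographic order:
    pick each remaining element in increasing order, recurse on the rest."""
    if not lst:
        yield prefix
        return
    for i in range(len(lst)):        # lst stays sorted, so choices are increasing
        yield from perms(lst[:i] + lst[i+1:], prefix + str(lst[i]))

# e.g. list(perms([0, 1, 2])) -> ['012', '021', '102', '120', '201', '210']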

The second problem is not counting leading 0s. This is handled by left and the associated recursive call: left tracks whether all the entries to the left are 0. If they are, and the current choice is also 0, then we pass the whole choice list to the next recursive call (so the 0 is not consumed as a digit).

The third is not crossing the limit. This is the part that convolutes my solution. The variables atmax, rnge, chldmaxd and maxd all work towards solving it. The idea is to keep track, from the left, of the digits that have reached their maximum. Similar to left above, atmax tells whether all choices to the left have reached their maximum; this is used to restrict the choices for the next recursive call. For example, if the limit is 1245 and the choice so far is 12, then the choice at the next level is restricted to 0-4. But if the choice so far is 10, the next digit can be anything in 0-9.

sol = []
def seq(inp):
    digits = tuple([i for i in range(10)])
    maxd = [int(i) for i in str(inp)]
    # number in reverse, with the first element always 0
    # ex: maxd would be [0, 5, 2, 1] for a limit of 125
    maxd.append(0)
    maxd.reverse()

    srep = tuple([str(i) for i in range(10)])

    def seq_gen(n, atmax, lst, s, left):
        if n == 1:
            for i in lst:
                sol.append(s + srep[i])
            return

        rnge = (atmax and [maxd[n]] or [9])[0]

        chldmaxd = False
        for i in range(len(lst)):
            if lst[i] <= rnge:
                if atmax and lst[i] == rnge:
                    chldmaxd = True
                if left and lst[i] == 0:
                    seq_gen(n-1, chldmaxd, lst, s, True)
                else:
                    seq_gen(n-1, chldmaxd, lst[:i] + lst[i+1:],
                            s + srep[lst[i]], False)
            else:
                break

    seq_gen(len(maxd)-1, 1, digits, "", True)

The above code's performance could be improved with a simple change in the termination condition, handling the last digit directly instead of recursing one more level:

if n == 1:
    for i in lst:
        sol.append(s + srep[i])
    return

Timing analysis for the solution:

real    0m29.297s
user    0m28.762s
sys     0m0.344s

As the approaches in Java performed quite a bit faster than this, I tried my own implementation in C/C++, with Bob's idea of using integers rather than strings to store the sequence. It takes about 230 ms ($ time ./a.out) to generate the whole sequence on my system.

The code is available here.

On the other hand, if we just want the count of such numbers (8877690), here is one approach. The numbers range from 1 to 10 digits long, because there are only 10 distinct digits and we can't go beyond that without repetition. Consider each length separately (say, 3-digit numbers): the most significant digit can't be 0, because otherwise it would just be a 2-digit number (which we account for separately), so there are 9 options [1-9] for the first digit. From the second digit onwards, we just count the permutations of the remaining digits. For 3-digit numbers this works out to 9*(9*8), written better as 9*(9!/7!).
Doing this for all lengths 1, 2, …, 10 and summing gives the answer.
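As a quick sanity check on that count, here is a small sketch in Python (the function name is mine, not from the post):

from math import factorial

def count_distinct_digit_numbers():
    total = 0
    for d in range(1, 11):
        # 9 choices for the leading digit (1-9), then arrange d-1 of the
        # remaining 9 digits in the other positions: 9! / (9 - (d-1))!
        total += 9 * factorial(9) // factorial(10 - d)
    return total

print(count_distinct_digit_numbers())   # prints 8877690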


twiddling with weblogger.el (emacs + wordpress)

Although wordpress.com provides a feature-rich editor, I prefer writing my posts in emacs. Edit (in emacs), copy, paste (to wordpress) is possible, but it would be more interesting to post from within emacs. That's where weblogger.el comes in. Here is some help on how to set it up. Still, I was a little uncomfortable with the default configuration. For example, I am rather used to hitting \C-x\C-s in emacs, and each time I did that, the post would be published and the buffer closed. I would then have to fetch the blog entries again (\M-x weblogger-fetch-entries).

This post is about the hacks made to weblogger.el to suit my needs. If it helps you, feel free to download the file from here. Rename it with the extension .el.

Essentially what I wanted was for the mode to save the post (either published or as a draft) without closing the buffer. The change for this seemed simple: just remove the (bury-buffer) from the respective functions. The exception was that if we started with a new post, saving the buffer would send it to the server, but the buffer itself wouldn't be updated with the information for the new post created at the server. So I changed the code to look for this exception while saving: if the post is new, simply do (weblogger-fetch-entries).

A few more changes to customize my editing experience. I like editing/programming in auto-fill mode, but the available width for posts in wordpress themes usually doesn't match that of auto-fill mode. The alternative was longlines mode. It allows editing with word wrapping (soft newlines) and doesn't introduce any (hard) newlines. Unfortunately, in this mode too, saving a post caused all the soft newlines to show up as real newlines in the published post. What I needed was some way to turn off longlines-mode before saving, and then turn it back on if the buffer is not to be buried. I have introduced two new hooks, one to be run before saving the draft, and the other to be run afterwards if the buffer is not to be closed.

Here are the weblogger-related initializations from my .emacs:

(add-hook 'weblogger-start-edit-entry-hook
	  (lambda()
	    (flyspell-mode 1)
	    (flyspell-buffer)	; spell check the fetched post
	    (longlines-mode)))
(add-hook 'weblogger-publish-entry-hook
	  (lambda()
	    (when longlines-mode
	      (longlines-mode-off))))
(add-hook 'weblogger-publish-entry-end-hook
	  (lambda()
	    (longlines-mode)))

(weblogger-select-configuration "my configuration name" )

Two tools of the trade that helped me the most were eval-buffer and \C-h f.

  • \M-x eval-buffer evaluates a buffer, so you don't have to restart emacs after each change to see its effects.
  • Emacs lisp is a self-documenting system: you provide the description of a function within the function itself. \C-h f <name of function> shows that description.

That's how I tried to understand the code structure and the various library functions. To see the key bindings and command options, try \C-h a weblogger. It's fairly easy to change the key bindings by editing the weblogger.el file.

I am fairly new to emacs and emacs lisp, so please do consider the chance of disk-formatting bugs lurking in my code.

google treasure hunt

Two problems from the Google treasure hunt which, although not particularly tough, were nevertheless interesting.

One was about a robot starting from the top-left corner of a matrix and trying to reach the bottom-right corner. The robot can only move right or down, and the problem was to find the number of unique possible paths. My first thought was to use dynamic programming. Some more thought, and I was able to see a pattern in the solution. The basic idea is, as in dynamic programming, to solve the smaller problems first. The observation is that the path counts can be computed one level at a time, along a line parallel to the anti-diagonal as it sweeps through the matrix from the destination cell towards the starting cell.

Here’s the code doing just that.

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

int main(int argc, char ** argv)
{
  int r, c, i, j, diag;
  unsigned long long *pa;
  assert(argc == 3);

  r = atoi(argv[1]);
  c = atoi(argv[2]);

  diag = r<c ? r : c;
  pa = calloc(diag, sizeof pa[0]);

  for (i = 0; i < diag; i++)
    pa[i] = 0;
  pa[0] = 1;

  /* sweep a line parallel to the anti-diagonal r+c-2 times;
     pa[j] accumulates the paths through the j-th cell on that line */
  for (i = 0; i < r+c-2; i++)
    for (j = diag-1; j > 0; j--)
      pa[j] = pa[j] + pa[j-1];

  printf("paths %llu \n", pa[diag-1]);

  return 0;
}

The implementation worked for small inputs, but for the actual input long long was not enough (fortunately, and presumably as intended by the problem designer). Some more thought about the process, and the insight came: the pattern seemed familiar, although incomplete. It was Pascal's triangle. It was easy to find the required entry of Pascal's triangle and its value, \mathrm{C}^{n+m-2}_{n-1}. But calculating it needs more precision than what's available with C data types (otherwise my program would have given the answer). Google also didn't help. Looking for a bignum library led me to this site , which provides an expression calculator showing the capabilities of the library. So all I had to do was fill in the expression and copy-paste the solution.
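With a language that has built-in big integers, the same lookup is a couple of lines; here is a sketch in Python (the grid dimensions below are placeholders, not the actual treasure-hunt input):

from math import comb   # Python 3.8+; exact binomial coefficients on big integers

def paths(n, m):
    """Monotone (right/down only) lattice paths in an n x m grid:
    choose which n-1 of the n+m-2 steps go down."""
    return comb(n + m - 2, n - 1)

print(paths(3, 3))     # 6, matches the small cases from the DP above
print(paths(57, 43))   # placeholder dimensions; already overflows unsigned long long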

The other one was the last problem. Here is a verbatim instance of the problem.

Find the smallest number that can be expressed as
the sum of 3 consecutive prime numbers,
the sum of 35 consecutive prime numbers,
the sum of 553 consecutive prime numbers,
the sum of 829 consecutive prime numbers,
and is itself a prime number.
For example, 41 is the smallest prime number that can be expressed as
the sum of 3 consecutive primes (11 + 13 + 17 = 41) and
the sum of 6 consecutive primes (2 + 3 + 5 + 7 + 11 + 13 = 41).

A shorter sequence always ends with larger primes than a longer sequence with the same sum, the reason being that the same sum is distributed over fewer consecutive primes. How does that help? Slide the shortest sequence forward to include the next prime, and simultaneously advance every longer sequence until its sum is greater than or equal to the sums of the sequences shorter than it. Here's a little cryptic code for the solution.

struct cseq {
  int seql; /* seq len */
  int seqt; /* seq top: index of the last prime in the sequence */
  int seqs; /* seq sum */
} *sa;

void slide_sequence(struct cseq *s)
{
  s->seqt++;
  s->seqs -= prime[s->seqt - s->seql];
  s->seqs += prime[s->seqt];
}

/* slide s forward until its sum reaches sum; return whether it matched exactly */
int align_to_sum(struct cseq *s, int sum)
{
  while (s->seqs < sum)
    slide_sequence(s);
  return s->seqs == sum;
}

int main()
{
  /* initialize sa and the primes array and other variables. */

  for each consecutive prime {
    /* invar: bigger sequences have sum >= the smaller ones. */
    eqcnt = 0;

    slide_sequence( &(sa[0]) );

    /* align each longer sequence to sa[0]'s new sum using align_to_sum()
       and count how many match it exactly */
    for (j=1; ...

The code is available here.

Here are the timing details.

$time ./a.out
seq: 3, start: 221709, end: 221711, sum: 9221231
seq: 35, start: 23088, end: 23122, sum: 9221231
seq: 553, start: 1650, end: 2202, sum: 9221231
seq: 829, start: 930, end: 1758, sum: 9221231

real 0m0.210s
user 0m0.208s
sys 0m0.000s
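For completeness, here is a self-contained sketch of the same sliding idea in Python; the sieve limit, the helper names, and checking the target's primality by trial division are all my assumptions, not taken from the original code:

def primes_up_to(limit):
    """Sieve of Eratosthenes returning the list of primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i*i::i] = bytearray(len(range(i*i, limit + 1, i)))
    return [i for i, p in enumerate(sieve) if p]

def is_prime(n, small_primes):
    """Trial division; small_primes must cover sqrt(n)."""
    for p in small_primes:
        if p * p > n:
            return True
        if n % p == 0:
            return False
    return True

def smallest_common_sum(lengths, limit=4_000_000):
    """Smallest prime that is a sum of l consecutive primes for every l in
    lengths.  The sieve limit is a guess; raise it if a window runs off the
    end of the prime list."""
    prime = primes_up_to(limit)
    lengths = sorted(lengths)
    start = [0] * len(lengths)                    # window start index per length
    sums = [sum(prime[:l]) for l in lengths]      # current window sums

    while True:
        target = sums[0]
        ok = True
        for k in range(1, len(lengths)):
            # slide the longer window until its sum reaches the target
            while sums[k] < target:
                sums[k] += prime[start[k] + lengths[k]] - prime[start[k]]
                start[k] += 1
            ok = ok and sums[k] == target
        if ok and is_prime(target, prime):
            return target
        # no match: slide the shortest window forward by one prime
        sums[0] += prime[start[0] + lengths[0]] - prime[start[0]]
        start[0] += 1

print(smallest_common_sum([3, 35, 553, 829]))     # should reproduce the 9221231 above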

Problem 4-2

Problem:

Given an array A of size n with distinct elements such that 0 <= A[i] <= n for each i. Note that the array must be missing an integer from the range [0,n], since the array has n elements while the range has n+1. The only way to access elements is through the function

bit_access(int * A, int i, int j)

which returns the j^{th} bit of element i in array A.

We have to find out the missing number in O(n) time.

Approach:

Let the b^{th} bit be the most significant set bit of n. The elements of A can be divided into two sets based on their b^{th} bit: those that have the bit set and those that don't. In the full range [0, n] there are exactly 2^{b-1} values with the b^{th} bit equal to 0, and the remaining values have it set. So if the number of 0s in the array doesn't match that expected count, the missing element has its b^{th} bit as 0; otherwise it is 1. The algorithm works on this observation by first partitioning the array according to the b^{th} bit and then applying the same logic recursively on the sub-array that is missing an element from its part of the range.

Here's the code for the partition; the recursive search that drives it is sketched below.

/* partition the array elements according to the b-th bit of each
   element and return the partition index. */
int partition(int * arr, int n, int b)
{
  int i, t, part = 0;

  /* invar: part points to the first element seen so far whose b-th bit is set. */
  for (i = 0; i < n; i++) {
    if (!bit_access(arr, i, b)) {
      t = arr[i]; arr[i] = arr[part]; arr[part] = t;
      part++;
    }
  }
  return part;
}
The recursion for the search is T(n) = T(n/2) + O(n), which solves to O(n), giving a solution to the problem.
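To show the partition idea in context, here is a self-contained sketch of the whole search in Python; bit_access is simulated on ordinary ints, and the search keeps explicit candidate index lists instead of partitioning in place (these choices are mine, not the original code's):

import random

def bit_access(A, i, j):
    """Return the j-th bit (counting from the least significant) of A[i]."""
    return (A[i] >> j) & 1

def find_missing(A, n):
    """Find the one value in [0, n] absent from A (len(A) == n, all distinct).
    Each pass inspects one bit of every remaining candidate, so the work is
    n + n/2 + n/4 + ... = O(n) bit accesses."""
    candidates = list(range(len(A)))          # indices of A still in play
    missing = 0
    for b in range(n.bit_length() - 1, -1, -1):
        zeros, ones = [], []
        for i in candidates:
            (ones if bit_access(A, i, b) else zeros).append(i)
        # a complete sub-range would contribute exactly 1 << b values with
        # bit b clear; seeing fewer means the missing value has bit b == 0
        if len(zeros) < (1 << b):
            candidates = zeros
        else:
            candidates = ones
            missing |= 1 << b
    return missing

# quick check: drop one value from [0, 10] and recover it
vals = [v for v in range(11) if v != 6]
random.shuffle(vals)
print(find_missing(vals, 10))   # 6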