Andy's Math/CS page

Wednesday, May 30, 2007

Flooding and Kleitman's Theorem

In their classic, and highly entertaining, survey book on probabilistic methods in combinatorics, Alon and Spencer pose the following question: suppose we generate a random graph on n vertices, independently including each edge with probability 1/2. Suppose we do not look at the graph right away, but are informed that the graph we got contains a Hamiltonian path, that is, a path visiting each vertex exactly once.

Does this knowledge make it more or less likely that the graph is nonplanar?

Of course, it should make it more likely. Please stop and verify that this ought to be true. Can you prove it?

Results that prove facts like this are called 'correlation inequalities', and the simplest one that applies here is called Kleitman's Theorem. (In Alon and Spencer, Kleitman's Theorem is derived from more general results, but I encourage you to try to prove it directly.)

Say a subset A of the Boolean cube is monotone if, whenever x is in A and y >= x (bitwise domination), then y is in A too. We may also say that A is 'upwards closed'.

Kleitman's Theorem states that, given two monotone subsets A, B of the Boolean cube, if we pick a random cube element z, the chance of z being in B cannot decrease if we condition on the event that z is in A.

So, in the example, let the bits of the cube represent the presence/absence of edges in a graph of fixed size. Let A and B be the properties of Hamiltonicity and nonplanarity, respectively. Then Kleitman yields the intuitive fact we were looking for.

***

Now I am going to describe how Kleitman's result solves the previous flooded-city puzzle I posed last time. In fact, trying to reprove Kleitman's Theorem was what led me to pose the flooding puzzle in the first place, although I now tend to think of Kleitman's as the logically prior result.

To do this requires an easy generalization of Kleitman's Theorem, where we replace the boolean cube in the statement with {0, 1, ... m}^n. So, e.g., vector (3, 4, 5) dominates (3, 3, 4), but is incomparable with (4, 0, 0). In fact, we'll only need this result for n = 2.


Create a bipartite graph U x F, with unflooded houses on the left and flooded houses on the right. Add an edge from an unflooded house h to a flooded house h' if there exists an h-to-h' path with all steps running north or east.

Given a set S of unflooded houses, let N(S) denote the flooded houses adjacent to some house of S in the bipartite graph.



In the example above, the green-circled houses are S and the orange-circled flooded houses are N(S).

Each unflooded house contains a unit amount of water to be delivered, while each flooded house needs an amount |U|/|F| of water delivered, which can only be delivered from a house that is adjacent in the graph we defined. Then, by a version of Hall's Theorem, not difficult to derive from the original formulation, the delivery plan we seek exists if and only if for every subset S of U, the total water-need of N(S) meets or exceeds the amount of water available from S.

Suppose that the desired allocation does not exist. Then by Hall's Theorem, there exists some subset S of U such that the water supplied by S exceeds the need of N(S).

We take such an S and iteratively add into S all the unflooded north- and east-neighbors of houses in S. This doesn't increase the size of N(S), but it does increase the amount of water supplied by S, so the resulting modified S is still a witness to the impossibility of our desired allocation.

We repeat this until the union of S and N(S) is upwards-closed. Call the resulting set S'. Since S' supplies more water than N(S') demands, the ratio |S'|/|N(S')| exceeds |U|/|F|.

If S were as in the example above, S' would be the purple-circled set of houses below:



This means that, if we pick a house at random from within the union of S' and N(S'), we are less likely to get a flooded house than if we picked a house uniformly at random from the whole city. This means that, in picking uniformly from the whole city, the (upwards-closed) event of getting a flooded house is negatively correlated with the upwards-closed event of getting a house from the union of S' and N(S')--a contradiction to the generalized Kleitman's Theorem.

We conclude that the desired delivery plan exists, and can be found in polynomial time using known algorithms for bipartite matching.

Labels: , ,

Wednesday, May 23, 2007

Not Social Commentary

There's been some unfortunate flooding in Grid City:



The water swept in from the Northeast, so if any house is currently flooded, and it has neighbors to the North or East, it's a safe bet they're flooded too. (Other than this fact, and the grid layout, you should disregard the specifics of the map given.)

The water is low and flooded houses are still livable, but they need tap water. In a show of social cohesion befitting Grid City, each non-flooded household immediately donates 50 gallons of water to the effort. The goal is to distribute this water evenly among the flooded houses, and there is no shortage of volunteers to help. Also, quantities of water can be poured and combined or subdivided with precision.

But water is heavy and, let us say, has to be carried around on foot, so some plan is needed. An ideal of efficiency would be that water is never carried in the South or West directions.

Can both objectives be met simultaneously?

***

In the next post, I'll explain how this puzzle came up and my solution. But I'm curious to hear other approaches first, if you care to try.

Labels:

Wednesday, May 16, 2007

Dogs in the Mineshaft

A couple posts ago I described some shenanigans with cats in a binary tree.
But that was just the front yard. The back yard is worse: it contains the opening of an old mineshaft that descends infinitely far into the earth's depths.

Well, there's some funky smell down there that dogs just have to investigate. So, every day, some finite number of dogs may enter or go further down the (unbranching) shaft. They never go back upwards. Here's the scene:



As before, you've got angry owners on your hands. You refuse to close off the shaft (it's too nifty) but again you're willing to try and ensure that any individual dog descends only finitely far into the shaft. Your tool this time: it turns out that after enjoying a good steak, a dog will curl up and fall asleep indefinitely.

The difficulty is that, before eating any steak you give it, the dog may carry it arbitrarily far further down the tunnel. If it passes sleeping dogs, they may be awakened by the steak aroma, and resume their descent.

Money, time, and steaks are no object, and you can outrun the dogs, who won't smell the steaks if they're sealed in your coat pocket. Can you keep your promise to the owners?

To get you started, consider a simple but unsuccessful strategy of giving a steak to every dog every day. The problem is that some dog D may have a sequence of 'accomplice' dogs: every time D falls asleep, a new accomplice will run into the tunnel and hang out above D in the shaft. When you give the accomplice a steak, it'll scamper past D, who'll rouse and go further. In fact, it may happen that every dog goes infinitely far down the tunnel!

Another simple idea is to give a steak every day to the dog furthest down the tunnel, if that dog is not already asleep. This avoids ever waking up sleeping dogs (assuming that the previous day's steak has had its effect; if not we can just wait longer), but it falls victim to a 'rear-guard' attack: a dog that trails constantly just above/behind the current leader.

***

This is a challenging puzzle (unless there's a glitch in the set-up, which I've checked over). But it's well worth thinking about, because the solution strategy--the 'finite injury' or 'priority' method of Friedberg and Muchnik--revolutionized computability theory. Some would say it marked the end of its comprehensibility to outside observers, but I'm optimistic that is not the case. The ingenious strategy is lovely, simple, and, fingers crossed, easier to see in a puzzle setting. I will discuss the solution in a future post, so stay tuned.

Update: Scratch that, Gareth Rees has solved the puzzle and clearly explained the solution for us, in the comments section of 'Trees and Infinity, part III'. I may still talk more in the future about its significance to computability.

Labels: ,

Monday, May 14, 2007

Feed Me

A note to readers: I now have an RSS feed, accessible on the right toolbar ('Atom'). When used with appropriate software/sites (I have feeds on my homepage at My Yahoo), this will let you know when I've put up a new post--useful for a blog like this, where posts come strictly at random intervals. As always, I encourage feedback, which will bring more-frequent and more closely-tailored posts.

I try not to do throwaway posts, so how can I salvage this one? Hmm... a puzzle suggests itself:

Modern undergraduate that you are, you're watching bootleg 'Scrubs' episodes on your laptop. This goes on for three hours or so. Meanwhile, your site-feeds page occasionally updates to display someone's new blog post. When your favored blogs post, the content hits your feed within 10 minutes, but the precise delay is nondeterministic--could be 0, 10, or anything in between.

You register the order in which the feed updates come in. Now: is there an efficient method to calculate the number of 'true orders' in which the bloggers could have made their updates?

For example: say updates A, B, C came at 12:30, 12:35, and 12:41 respectively. Then this is consistent with the true-orders ABC, BAC, ACB, but not with, e.g. CAB. Thus the answer here is 3 ways.

Good luck! Does your method work if each individual blogger has their own (known) interval of possible delays?

Clarification: I haven't solved this problem. It might be #P-complete, and the proof of this might be very technical. It seems related to work on counting linear extensions of posets (which is #P-complete in general), discussed by Suresh here.

Labels:

The 'Sack of Potatoes' Theorem

Tired of packing ham sandwiches for lunch every day? Have some spuds for a change.

You've got a countable collection A_1, A_2, A_3, ... of potatoes. A potato is a convex body of diameter at most 1.

(That is: if p, q are points in a potato P, P also contains the entire line segment between them; furthermore, p, q are at a distance at most 1 from each other.)

Furthermore, most of these spuds are incredibly dinky: the total volume of the entire collection is at most some finite value v, say, 4 cubic inches. You'd really like to eat them all.

Trouble is, mashing or cutting them up beforehand would ruin their freshness. You want to bring them all to work whole in your lunchbox.

Prove that these spuds can be packed into a box of finite dimensions. In fact, such dimensions can be computed solely in terms of v, independent of the precise shape of the potatoes.

***

How to get started on a puzzle like this? One should try to find a sequence of reductions to simplify the problem. So, first, suppose we could do it for rectangular potatoes. Show how to pack a single potato of volume c into a box of volume at most a constant multiple of c, also with bounded diameter. Then, if you can pack these boxes (which have finite total volume), you've packed the potatoes they contain.

I solved the 2-D case, but haven't yet worked on 3-D or n dimensions (where it is also true). I read about this result in a charming volume: The Scottish Book, a collection of research problems (some simple, some still open) compiled by high-caliber, and very imaginative, mathematicians, including Ulam, Banach, and Mazur, who met regularly in the 'Scottish Coffee House', confusingly located in Lwow, Poland, during the '30s. Their book of problems was kept there by the management and notes the prizes--'A bottle of wine', etc.--offered for various solutions. It's definitely worth looking over, both for its diverse contents and for the image of mathematical community it offers.

Labels: , ,

Tuesday, May 08, 2007

Trees and Infinity, part III



(Hat tip to Gareth--my first fan art!)

Today, a computability theory-inspired quickie puzzle. Can you face down the infinite in its many guises?

Cats in a Tree

Some GMO-contaminated sludge washed onto your property during the latest freak weather event. A few weeks later, a mysterious tree has sprouted on the lawn, and its progress is astounding--every day, a new layer of binary growth appears:



This would be peachy, but there's a problem: your neighborhood suffers from a plague of cats. Mangy, poorly drawn, tree-climbing cats:



On any given day, some (finite) number of cats may start climbing the tree, or continue their climb from a previous day. What they don't know how to do is come down, not even a single level.
The cats all have irate, protective owners. They want you to cut down this cat sinkhole at once (after rescuing their pets). But you're not going for that--you want the world's largest tree, with unending growth.

Here's the compromise you'd like to offer the owners: a guarantee that every cat C will eventually be prevented from climbing above some level n (which may depend on C).

To achieve this, you have the following means: on any day, you can climb to any number of nodes of the tree and strip the bark there, preventing further growth above the stripped nodes.
Of course, stripping the root node would stop the cats. The challenge, again, is to achieve unending growth as well (though not necessarily the full binary tree).

Good luck!



***

This puzzle models the theorem that there exists an infinite recursive tree with no infinite recursive path. (Interestingly, however, no recursive tree T can have the Halting problem computable with the aid of an arbitrarily chosen infinite path from T. Promise problems are subtle...)

Can the whole theory of computability be 'sweetened up' this way, like the Flintstones vitamins I gobbled up in my youth? Here's a proposal: once students understand programs' capacity to simulate other programs and other computational models, dovetail search procedures, and other 'infrastructural' techniques, it may be preferable to actively 'forget' the computational statement of a problem one aims to prove.
So, in the above puzzle, the cats correspond to different simulated programs in an enumeration, but we are so profoundly unable to predict in advance the behavior of these programs that it might be better to think of the situation adversarially--hence, mischievous cats who do just as they please.
Of course, one could also make a case for the aid to visualization, and possibly generalization, given by the metaphoric puzzle format, but here I mean specifically to ask whether metaphors might help students develop a stronger familiarity with the black-box methodology of computer science.

Next time: the Finite Injury method!

Labels: ,

Thursday, May 03, 2007

The Antichain King

Some readers should be familiar with the classic Kruskal-Katona Theorem. Well, legend came to life today... Gyula O. H. Katona (of the Renyi Institute in Hungary) just came to our department and gave a talk about his main area of expertise, extremal set theory.

Sperner's Theorem was the result that launched this now-sprawling field. It addresses the following question: How large a family of subsets {S_1, ... S_m} of {1, 2, ... n} can we find, such that no S_i contains another S_j? (Such a family is called an antichain in the lattice of subsets of {1, 2, ... n} partially ordered by set inclusion.)

Sperner states that the collection of all sets of size (floor-of-) n/2 gives the biggest such antichain you could hope for. It's clear that this set is a maximal antichain (can't be augmented); the claim is that it's also globally optimal.

Katona mentioned some nice applications of Sperner's Theorem. Here's a cute one: given a set of real numbers all greater than 1, use Sperner's Theorem to bound the number of subset sums taking on a value in any interval [x, x+1), x a positive real. This bounds the concentration of the random variable that outputs a randomly chosen subset-sum.

He mentioned many generalizations as well. Sperner's Theorem can be viewed as a statement about the number of sets necessary to guarantee an ordered pair, but we can ask about guaranteeing ordered triples, diamonds, or any poset P for that matter. He and colleagues have results of this form, and he conjectures that for any poset P, there exists a constant c depending only on P such that, for all n, there exists a maximum P-excluding family of subsets of {1, 2, ... n}, such that all sets in the family have size in the range (n/2 - c(P), n/2 + c(P)). So, the structure of optimality in Sperner's result could extend way further (Erdos proved a result of this type for all chains P).

Finally, Katona mentioned the 'group testing' problem that first drew him into this whole web of intrigue:

Given n, r, there are n boxes, r < n of which contain (say) radioactive material. We can in one step test any subset of the boxes for presence of radioactivity (but get no sense of how many radioactive boxes are in the group, just 'some' or 'none'). How many tests are necessary to determine the r hot boxes?

Sperner theory is relevant to the nonadaptive-query case.

Well? Can you see such an elegantly posed question and not try your hand at it? Thanks, Professor Katona!

Labels: