Andy's Math/CS page

Friday, March 21, 2008

News, and some Number Theory

Time for a personal update: I'm enjoying myself greatly here in Cambridge, and was recently admitted as a transfer student to MIT's EECS department. The move has both personal and academic advantages for me, but let me emphasize that I think UCSD has a lot to offer prospective students, and in addition is home to many cool people whom I miss... I would be happy to offer advice about either school.

When I started writing here as an undergrad, I knew very few people in the worlds of TCS and Complexity, and the blog was a way of reaching out to a community I wanted to join. Now, thanks in part to blogging (thru which I met Scott, my advisor), I'm immersed in that community at a school with a large, vibrant Theory group. This is to say that, though writing here still interests me, it no longer answers an urgent personal need, and I am most likely to post new material in response to a request for specific topics or post types.

Today I was perusing a favorite book of mine, 'Gems of Theoretical Computer Science' by U. Schoning & R. Pruim. One chapter partially treats the undecidability of determining whether systems of Diophantine equations (polynomial equations where solutions are required to be integral) have a solution. This was one of Hilbert's problems from 1900, solved in 1971 and full of deep math.

The authors pose the following exercise: show that the version where variables must take nonnegative values, is reducible to the integer version. (And vice-versa; also the rational-values version is reducible to the integral version; the converse appears to be open...)

Think about it...

The reduction they give: Given a set of polynomial equations {P_i(X_1, ... X_k)}, we want to determine if they're simultaneously satisfiable with nonnegative values. Introduce, for each j <= k, the variables Y_{j, 1}, Y_{j, 2}, Y_{j, 3}, Y_{j_4}.

Now add to the system, for each j <= k, the constraint that

X_j = Y_{j, 1}^2 + Y_{j, 2}^2 + Y_{j, 3}^2 + Y_{j_4}^2.

Claim: the new system is satisfiable over the integers iff the original one is satisfiable over the nonnegative integers! The proof, of course, is an easy consequence of Lagrange's Theorem that every nonnegative integer is the sum of 4 integer squares.

So, my question is, could a reduction be found that doesn't rely on Lagrange's Theorem? Or its weaker variants where the constant 4 is replaced with some constant c > 4. Or maybe for some constant c, the proof is really so simple that I will be satisfied that we are performing this reduction with the cheapest tools.

If our plan in the reduction is to constrain the original variables in a form analogous to the above, namely

X_j = Q_j(Y, Z, ...),

where Y, Z, ... are new integral variables, is there any way around proving a version of Lagrange's Theorem? Generally we show that polynomials are identically nonnegative by expressing them as a sum of one or more squares, e.g.,

Y^2 + Z^2 - 2YZ = (Y - Z)^2.

Using this template, we'd be reliant on Lagrange. However, in his thesis, Minkowski conjectured that there exist nonnegative real polynomials *not* so expressible, and this was proved by Hilbert. A simple example I found online is

Q(X, Y, Z) = X^4*Y^2 + Y^4*Z^2 + Z^4*Y^2 - 3X^2*Y^2*Z^2,

and another counterexample is derived systematically in the classy and useful inequalities book 'The Cauchy-Schwartz Master Class' by J.M. Steele. (Artin proved, however, that every poly that's identically nonnegative, is expressible as the sum of finitely many *rational* functions squared.)

Could it be that one of these creatures is easier to use in the reduction (both to prove it's nonnegative, and ranges over all nonnegative values)? Somehow I doubt it. Anyways, I just wanted to point to an instance where a reduction one would expect to be straightforward seems to require fairly nontrivial understanding of the problem domain.

Labels: ,

Friday, March 07, 2008

Beasts of Probability and Plane Geometry

Say you're trying to predict whether some event E occurs or not. There is another collection of events I_1, I_2, ... I_k, which are positive predictors of E: for every j, E occurs with probability at least .99 conditioning on the event that I_j occurs.

Can we lower-bound the probability that E occurs conditioning on the event that *at least one* I_j occurs?

(Think about it before reading on.)

Here's a simple example of what can go wrong: let the underlying probability space be a sequence of n unbiased coin flips. Let E be the event that at least 2/3 of the flips come up heads. For each subset S of {1, 2, ... n} of size exactly
.4n, let I_S be the event that all the coin flips indexed by S come up heads.

If n is large enough, we have that

i) E is almost surely false, yet

ii) Almost surely, some I_S is satisfied--even though

iii) Conditioning on any fixed event I_S, E becomes almost surely true (since we then expect half of the remaining flips to come up heads, yielding about a .4 + .5*.6 = .7 fraction of heads total).

One can also modify this example to make conditioning on the union of the I_j's actually decrease the probability that E occurs.

This kind of conclusion seems somewhat pathological and inconvenient, so it's natural to look for restrictions that prevent it from arising. The simplest would be to restrict the number, k, of predictor variables: for the above hypotheses, we have that the probability of E conditioning on the union of the I_j's is at least

.99 / (1 + .01(k - 1)).

(to see this, think about the worst possible case, which resembles a 'sunflower' in probability space.)

A more interesting direction is to restrict the structure of the predictor events within probability space. For instance, suppose that the probability space is a uniformly drawn point from the unit interval, E is some arbitrary subset of the interval, and each I_j is the indicator variable for some fixed subinterval. Then, regardless of the number of I_j, we can conclude that E occurs with high probability conditioning on the union of I_j's; not quite .99, but close. See my closely related earlier post for details.

It is natural to try to extend this result to higher dimensions, and for predictor indicator-sets given by axis-parallel rectangles this succeeds by an induction (although the effect gets exponentially weaker with the dimension). Similar results hold if the sets are required to be 'fat' convex bodies, in which case an argument much like the 1-dimensional one works.

However, allowing rotated rectangles destroys the effect even in two dimensions. Here's one interpretation: consider the unit square as a city, and take the set S to be the set of Democratic households.

In pathological cases, it's possible to find a covering of the square by overlapping convex 'precincts', such that

i) in each precinct, 99% of the households are Democratic, yet

ii) overall, 99% of houses are Republican!

Such sets seem truly bizarre. For a long time I was convinced they couldn't exist, but after failing to prove this, I finally tracked down a construction in a Harmonic Analysis textbook by Elias Stein (who, besides seeming impressive in his own right, was PhD advisor to Fields Medalists Charles Fefferman and Terence Tao). These sets, whose construction resembles a kind of hydra spawning ever more and tinier heads, are related to the more well-known Besicovitch/Kakeya sets. One can even achieve a kind of fantastical limit, in the following theorem for which Stein provides a reference:

There exists a subset S of measure zero in the unit square, such that for every point x on the square, there exists a line L thru x, such that S contains all of L except, possibly, x itself!

Labels: ,