Mathematics of the Worst-Case Scenario
Optimization is an obvious and major theme of applied mathematics. What may be less obvious to the public is the extent to which mathematics relies on a dual process, which one might call 'pessimization'.
Suppose we want to partially relate two functions, f and g, with a statement of the form
"f(x) >= M implies g(x) >= N". Informally, f and g are positively related, and we want to quantify that relationship.
How do we get the strongest possible relationship? Fixing M, we choose x to minimize g(x) subject to the constraint f(x) >= M. This minimum value of g(x) is the tightest lower bound N we can get, and yields the most informative statement.
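For concreteness, here is a minimal Python sketch of this brute-force pessimization over a finite domain (the function names and the toy example are my own illustrative choices, not anything canonical):

```python
def tightest_bound(domain, f, g, M):
    """Best N for which 'f(x) >= M implies g(x) >= N' holds on a finite
    domain: minimize g over the feasible set {x : f(x) >= M}."""
    feasible = [x for x in domain if f(x) >= M]
    return min(g(x) for x in feasible) if feasible else None

# Toy example: among x < 1000, "digit sum >= 20" implies "x >= 299",
# and 299 is the tightest such bound.
digit_sum = lambda x: sum(int(d) for d in str(x))
print(tightest_bound(range(1000), digit_sum, lambda x: x, 20))  # 299
```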
Why is this 'pessimization'? We look for statements of the above form when we are interested in ensuring that g(x) is high (g is some measure of 'goodness', and f is some simpler measure that we can more easily estimate and which is heuristically related to g); but, to achieve this we have to spend much of the time thinking perversely about how low g could wriggle subject to the constraint that f is high.
This kind of thinking is ubiquitous. The subject is too general (really only a 'theme') to have truly distinctive methods or theoretical constructs (in particular, there is no fundamental difference between optimization and pessimization--the latter is just optimization from the devil's perspective), but where it treats discrete, finite objects and collections it is sometimes called 'extremal combinatorics'; I recommend Jukna's book by that name.
Sometimes the heuristic relationship between our two functions has a clear logical grounding. For instance, let f(G) measure the number of edges in a graph, and let h(G) measure the number of triangles; adding an edge to a graph can never decrease the number of triangles. However, f(G) > f(G') does not imply that h(G) >= h(G'); G' may be more 'pessimally' organized than G with respect to triangles. Turan's Theorem, giving exact conditions on f to guarantee that h > 0, is considered the founding result of extremal combinatorics. Can you prove such a theorem? What do triangle-pessimal graphs on m vertices look like?
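If you want to play with the question, here is a small Python sketch of my own (not part of any library) that counts triangles and exhibits the classic pessimal configuration, a complete bipartite graph: many edges, no triangles.

```python
from itertools import combinations

def count_triangles(n, edges):
    """Count 3-cliques in a graph on vertices 0..n-1 given as an edge list."""
    E = {frozenset(e) for e in edges}
    return sum(1 for a, b, c in combinations(range(n), 3)
               if {frozenset((a, b)), frozenset((b, c)), frozenset((a, c))} <= E)

# The complete bipartite graph on m = 8 vertices has m^2/4 = 16 edges,
# yet contains no triangle at all.
m = 8
bipartite = [(u, v) for u in range(m // 2) for v in range(m // 2, m)]
print(len(bipartite), count_triangles(m, bipartite))  # 16 0
```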
The preceding example is really a special case. Having a triangle and having at least a certain number of edges are both 'monotone properties' in the sense that they are monotone functions of the 0/1 adjacency matrix of a graph. A simple result of Kleitman, provable by induction, says that if A and B are two such properties and we uniformly generate a graph on n labeled vertices, Prob[A & B hold] >= Prob[A holds]*Prob[B holds].
So, e.g., being informed that our graph has diameter at least 10 cannot make it less likely to be planar, as should be intuitive ("diam >= 10" and planarity are antimonotone properties, so we apply the result to their negations and then rephrase).
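Here is a quick Monte Carlo sanity check of the inequality, a Python sketch of mine with two particular monotone properties chosen purely for illustration; the estimated Prob[A & B] should come out at least as large as the product, up to sampling noise.

```python
import random
from itertools import combinations

def random_graph(n):
    """Uniform over labeled graphs on n vertices: each edge flips a fair coin."""
    return {e for e in combinations(range(n), 2) if random.random() < 0.5}

def has_triangle(n, E):
    return any({(a, b), (b, c), (a, c)} <= E
               for a, b, c in combinations(range(n), 3))

# Monotone properties: A = "contains a triangle", B = "has at least 9 edges".
n, trials = 6, 20000
cA = cB = cAB = 0
for _ in range(trials):
    E = random_graph(n)
    A, B = has_triangle(n, E), len(E) >= 9
    cA += A; cB += B; cAB += A and B
print(cAB / trials, ">=", (cA / trials) * (cB / trials), "(up to sampling noise)")
```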
A pleasant feature of investigations in extremal combinatorics is that they can be jump-started with simple questions; the exotic structure emerges naturally in exploring pessimal objects. Or, sometimes, the pessimal structures end up being very simple and familiar, but meeting them in the context of the investigation sheds new light on them. For example, we may heuristically observe that collections of 'many' distinct numbers contain 'many' distinct pairwise differences; even though the pessimal structure that pops out when we pursue this relationship is nothing more than n points in arithmetic progression, it's still a nice problem (which becomes more interesting when raised to 2 or more dimensions).
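To make the contrast vivid, a little Python sketch of my own: an arithmetic progression produces the fewest distinct differences, while a generic set of the same size produces nearly the maximum.

```python
import random

def distinct_differences(S):
    """Number of distinct values |a - b| over pairs of distinct elements."""
    return len({abs(a - b) for a in S for b in S if a != b})

n = 20
ap = [7 * i for i in range(n)]              # arithmetic progression: pessimal
generic = random.sample(range(10**6), n)    # a 'typical' set of the same size
print(distinct_differences(ap), distinct_differences(generic))
# The AP yields only n - 1 = 19 differences; a generic set yields close to
# the maximum possible, n*(n-1)/2 = 190.
```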
Some quite simple results of extremal combinatorics have important uses in elementary probability and in complexity theory. For example, take a square 0/1 matrix with 'many' ones; we'd like to conclude that it has many rows with many ones each; if we're willing to settle for either many 1-heavy rows or many 1-heavy columns, we can do much better (since pessimal examples against the 'many-rows' conclusion promote the 'many-columns' conclusion) when we quantify our various 'manys'. If we'd also settle for even a single 1-heavy row or column, we can do still better.
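A tiny Python illustration (my own toy example, with an arbitrary choice of thresholds): the matrix that defeats the 'many heavy rows' conclusion does so precisely by creating many heavy columns.

```python
def heavy_counts(M, t):
    """Count rows and columns of a square 0/1 matrix M with at least t ones."""
    n = len(M)
    rows = sum(1 for i in range(n) if sum(M[i]) >= t)
    cols = sum(1 for j in range(n) if sum(M[i][j] for i in range(n)) >= t)
    return rows, cols

n, t = 8, 6  # call a row or column '1-heavy' if at least 6 of its 8 entries are ones
# Half the entries are ones, but all crammed into the leftmost n//2 columns:
M = [[1 if j < n // 2 else 0 for j in range(n)] for _ in range(n)]
print(heavy_counts(M, t))  # (0, 4): no heavy rows, but plenty of heavy columns
```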
Such results have some immediate interpretations in terms of conditional probabilities. They also fuel a kind of robust reduction from any function f(x) on n bits to the function g(x, y) = (f(x), f(y)) (with the two matrix dimensions in our analysis corresponding to the two inputs to g). This is at the heart of a certain approach to hardness amplification, including a proof of Yao's XOR Lemma, given in this paper (see also my earlier post on the XOR Lemma). In fact, Jukna's book is rife with complexity apps, and was written with computer scientists in mind.
As I've said, no 'deep' unifying theory of extremal combinatorics exists, and I feel more inclined to share problems than to exposit. But this time, instead of belaboring my idiosyncratic preferences, let me point you to a great site with more problems than you'll ever solve: Kalva, a clearing-house of old Olympiads with many solutions. Many sterling combinatorics puzzles lie amidst the usual geometry and number theory. Olympiads are hard, but with this many problems to peruse you WILL find ones you can solve--and that feels good.
Labels: general math
12 Comments:
Hey, I hit upon your blog yesterday. I really liked your articles, and learnt a lot from them. Keep writing!
By Anonymous, at 11:50 AM
Thanks, Prasad! Topic suggestions are welcome, by the way, as is feedback on my balance of exposition, puzzles, and random musings.
By Andy D, at 2:06 PM
Hi Andy! I followed the trail you left from Scott's page. I noticed you've marked the same blogs as I have as well.
Perhaps you would like to comment on the algebraic geometry community's obsession with dimensional reduction, in particular 2D embeddings. I cannot understand why topology and graph theorists insist on battling with planarity. Note that there IS loss of generality when it comes to these results. E.g., your edge-triangle heuristic cannot nontrivially be extended to an edge-cuboidal 3D or an edge-hypercube nD situation.
One argument in favor is the 3D limitations of human perception and the consequent applicability of nifty solutions found to the real world. But what about from a purely theoretical standpoint?
In this post again, when you talk about a simpler "representative" function f(x)>=N, there are so many reductions - 1) your RHS is an oh-so-uninformative scalar, 2) you understandably want your LHS to involve a less convoluted expression than its analogue in g(x) (or, in a heuristic model like a neural net, fewer layers and/or connections). Cheap computability is fine, but at what theoretical cost?
I'm not questioning the intent of reductionism, whose motivation is convincing enough for me. Holistic mathematical treatment is something I haven't studied, so I'm not qualified to ask questions.
By Anonymous, at 9:24 AM
Thanks for the comment, Snehid.
I'm far from expert in any of the fields of math you mentioned, so I don't want to make sweeping statements about the pros and cons of 'dimensionality reduction'.
In fact I'm not sure what you mean. Do you mean
1) theorems that only apply to planar graphs & other low-dimensional objects, or
2) a tendency for research to try to reduce high-dimensional problems to lower-dimensional ones in a justified, WLOG fashion?
In general, I think type-1 research is very well-motivated: e.g., planar graphs have special properties, such as 4-colorability, that make them interesting to study. In topology, low-dimensional questions like the Poincare conjecture often prove especially interesting (and in that case, apparently *harder* than the general n > 3 case). Nor do we necessarily need 'theoretical' reasons to concentrate on low-dimensional problems: for instance, I would say personally that the notion of a planar graph is no less 'natural' than that of a general graph. Think about the objects that seem natural to you.
Type-1 research also most readily utilizes the spatial reasoning abilities we've evolved. It's worthwhile to push the limits of human abilities, but it's also important, and enjoyable, to work with our existing strengths.
As for type-2 research (attempting to reduce the n-dimensional case to lower-dimensional ones), I think it is a great thing when it can be achieved (often it can't), since it can make problems much easier. What's to object?
Maybe there's a third meaning of your question that I didn't catch.
By Andy D, at 3:08 PM
Two other notes:
1) when I referred to a 'triangle' in a graph, I didn't mean a planar graph necessarily. By a triangle in a graph I mean 3 nodes all connected, i.e. a 3-clique. Also, for any chosen k, with enough edges we can guarantee there will be a k-clique; so the result does generalize nicely.
2) As you said, I often want a simpler 'heuristic' function f to stand in for g or help show that g is large. I don't have any comment on the extent to which satisfactory heuristic functions f can be found in situations we care about. Certainly not as often as we'd wish; but this kind of thinking remains important (in many cases, what's the alternative?).
By Andy D, at 3:21 PM
Hey Andy,
I'm far from qualified (older than you, but not yet in grad school) to pose all but the most ill-informed of questions. I just thought I'd offer some fodder for future posts, like you asked. Let me try and give my previous ramble a more cohesive front.
Yes, I mean both 1 and 2. In fact, I mean the aggregate 1o2 - "A tendency for research to try to reduce high-dimensional problems to lower-dimensional ones, followed by the search for theorems that only apply to planar graphs & other low-dimensional objects created by the former."
What I did not know is that these reductions are WLOG. Besides Perelman's topology results (for which, I understand, higher dimensions are less constrained and easier to work with), I did not know that lower-D embeddings always offered results that were easily extrapolated to higher D. Could you confirm this for me?
You said: "Think about the objects that seem natural to you."
Like I asked, isn't this intuitive real-world appeal restricting our question space and causing a kind of tunnel vision?
By Anonymous, at 11:07 PM
I didn't mean to suggest either that we can reliably reduce high-D problems to low-D ones, or that low-D theorems reliably shed light on high-D.
So, for example, there's no analogue of the 4-color theorem for 3 dimensions, since every graph is realizable in a natural way in 3-space and general graphs can have arbitrary chromatic number.
But sometimes the 'effective dimensionality' of a problem is less than it appears. So, if you have a manifold sitting in n dimensions, that is locally 2-dimensional, then you're not so far from the planar case: there is some k such that every graph embeddable on that manifold is k-colorable. (Also, if the manifold is compact, there is a surface homeomorphic to this one embeddable in 4-space--another dimensionality reduction, due I think to Nash.)
Sometimes dimensionality reduction for a problem is possible in principle, but at great cost. For instance, if we seek the maximum of a function over a Boolean n-cube, we can dissect the n-cube into two (n-1)-cubes, find the maxima M_1, M_2 on those two, and return max(M_1, M_2). But the amount of search we do this way is exponential in n.
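(A throwaway Python sketch of that recursion, just to make the exponential blow-up concrete; the names are mine:)

```python
def cube_max(f, n, prefix=()):
    """Maximize f over {0,1}^n by splitting into two (n-1)-cubes.
    Exhaustive: the recursion visits all 2^n points."""
    if n == 0:
        return f(prefix)
    return max(cube_max(f, n - 1, prefix + (0,)),
               cube_max(f, n - 1, prefix + (1,)))

print(cube_max(sum, 5))  # maximizing the number of ones over {0,1}^5 gives 5
```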
In response to your worry about 'tunnel vision' in low dimensions: I'm not personally focused on spatial investigations, so my opinion is a casual one, but for the most part I respectfully disagree. Generalizing to n dimensions is a valid step, but neither the most imaginative nor the most likely to yield fundamentally new phenomena. It is, however, likely to destroy interesting things that only happen in low-D.
To preserve these interesting phenomena one should look for different kinds of generalization. The best and most impressive example I can think of is the Robertson-Seymour Theorem, which takes Kuratowski's characterization of planar graphs and generalizes it to a ridiculous extent by looking at 'minor-closed graph families'. A related and also important generalization of planarity is bounded treewidth, which when assumed in graphs often allows a kind of dimensionality reduction in graph algorithms.
Hope this is useful! But there are plenty of more-qualified geometers/topologists/graph theorists out there to talk to.
By Andy D, at 2:21 PM