When studying local extreme points of functions of several (real) variables, the textbook asks in one of the exercises to consider the polynomial
Here the gradient has a single zero, so the polynomial has a unique critical point. Since the Hessian at that point is positive definite, the critical point is a local minimum and, since it is the only critical point, it is in fact an absolute minimum.
Our function being a polynomial, it is reasonable to expect that there is an algebraic explanation as to why this value is its minimum, and why the minimum lies where it does. After all, this is what happens in one variable: if $p(x)=ax^2+bx+c$ and $a>0$, then completing the square gives
$$p(x)=a\left(x+\frac{b}{2a}\right)^2+c-\frac{b^2}{4a},$$
and obviously $p$ has a minimum at $x=-\dfrac{b}{2a}$, and this minimum is $c-\dfrac{b^2}{4a}$.
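The one-variable computation is easy to spot-check numerically. In this quick sketch (a generic quadratic of my own, not tied to any particular example), both the completed-square identity and the resulting lower bound are tested at random points:

```python
import random

# Completing the square in one variable: a*x^2 + b*x + c equals
# a*(x + b/(2a))^2 + (c - b^2/(4a)) whenever a != 0, so for a > 0
# the minimum is c - b^2/(4a), attained at x = -b/(2a).
random.seed(1)

def p(a, b, c, x):
    return a * x**2 + b * x + c

def completed(a, b, c, x):
    return a * (x + b / (2 * a))**2 + (c - b**2 / (4 * a))

for _ in range(1000):
    a = random.uniform(0.1, 5)       # keep a > 0
    b = random.uniform(-5, 5)
    c = random.uniform(-5, 5)
    x = random.uniform(-10, 10)
    # the two expressions agree identically:
    assert abs(p(a, b, c, x) - completed(a, b, c, x)) < 1e-9
    # every value is at least the claimed minimum:
    assert p(a, b, c, x) >= c - b**2 / (4 * a) - 1e-9
```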
The polynomial of the example above can be analyzed this way as well. A bit of algebra shows that we can write it as a sum of two squares of linear polynomials plus a constant, and it follows immediately that it has that constant as its minimum value, achieved precisely when both squares vanish, i.e., at the critical point found above.
(One can go further, and explain how to go in a systematic way about the 'bit of algebra' that led to the representation above, but I will leave that for a future occasion.)
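To illustrate what such a systematic 'bit of algebra' can look like, here is a check on a made-up quadratic of my own (not the textbook's polynomial), completing the square first in $x$ and then in $y$, with exact rational arithmetic:

```python
from fractions import Fraction as F
import random

# A made-up example: q = x^2 + 2xy + 3y^2 - 4x + 6y + 10.
# Completing the square in x, then in y, gives
#   q = (x + y - 2)^2 + 2*(y + 5/2)^2 - 13/2,
# so the minimum is -13/2, attained at y = -5/2, x = 9/2.
def q(x, y):
    return x**2 + 2*x*y + 3*y**2 - 4*x + 6*y + 10

def decomposed(x, y):
    return (x + y - 2)**2 + 2 * (y + F(5, 2))**2 - F(13, 2)

random.seed(2)
for _ in range(500):
    x = F(random.randint(-100, 100), random.randint(1, 10))
    y = F(random.randint(-100, 100), random.randint(1, 10))
    assert q(x, y) == decomposed(x, y)      # the identity is exact
    assert q(x, y) >= F(-13, 2)             # so -13/2 is a lower bound

assert q(F(9, 2), F(-5, 2)) == F(-13, 2)    # attained at (9/2, -5/2)
```

The minimum is read off directly from the representation: it is the leftover constant, attained where both squares vanish.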
What we did here is not a mere coincidence. Hilbert's 17th of the 23 problems of his famous address to the Second International Congress of Mathematicians in Paris, 1900, asks whether every polynomial with real coefficients which is non-negative for all (real) values of its variables is actually a sum of squares of rational functions. (A rational function is a quotient of polynomials.) A non-negative polynomial is usually called positive definite, but I won't use this terminology here.
If Hilbert's problem had an affirmative solution, this would provide a clear explanation as to why such a polynomial is non-negative: squares, and sums of squares, are obviously non-negative.
Hilbert himself showed that a non-negative polynomial $p$ is a sum of squares of polynomials in any of the following cases:
- $p$ is a polynomial in only one variable,
- $p$ is a quadratic polynomial (as in our example above), in any number of variables,
- $p$ is a quartic polynomial in two variables.
Emil Artin solved Hilbert's problem (in the affirmative) in 1927. The stronger conclusion, that every non-negative polynomial is a sum of squares of polynomials, is unfortunately false in general. Here is a counterexample:
$$x^4y^2+x^2y^4-3x^2y^2+1.$$
That this polynomial is non-negative follows from the inequality between the arithmetic and the geometric means; see example 1 below. This polynomial appears in the paper of Marie-Françoise Roy, The role of Hilbert's problems in real algebraic geometry, in Proceedings of the ninth EWM Meeting, Loccum, Germany 1999, and one can also check that it is not a sum of squares of polynomials by applying the algorithm I refer to below.
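The counterexample I have in mind here is $x^4y^2+x^2y^4-3x^2y^2+1$; a quick numerical sanity check of its non-negativity (the AM-GM argument appears as a comment):

```python
import itertools

# m(x, y) = x^4 y^2 + x^2 y^4 - 3 x^2 y^2 + 1 is non-negative:
# by AM-GM applied to the three terms 1, x^4 y^2, x^2 y^4,
#   (1 + x^4 y^2 + x^2 y^4) / 3 >= (x^6 y^6)^(1/3) = x^2 y^2.
def motzkin(x, y):
    return x**4 * y**2 + x**2 * y**4 - 3 * x**2 * y**2 + 1

grid = [i / 10 for i in range(-30, 31)]  # [-3.0, 3.0] in steps of 0.1
assert all(motzkin(x, y) >= -1e-12 for x, y in itertools.product(grid, grid))

# Equality holds exactly when x^2 = y^2 = 1:
assert motzkin(1.0, 1.0) == 0.0 and motzkin(-1.0, 1.0) == 0.0
```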
Although Hilbert himself was aware that some non-negative polynomials are not sums of squares of polynomials, I believe it was the logician Raphael Robinson who exhibited the first explicit example of such a polynomial, in his nice paper Some definite polynomials which are not sums of squares of real polynomials, in Selected questions of algebra and logic (collection dedicated to the memory of A. I. Mal′cev) (Russian), pp. 264–282, Izdat. "Nauka" Sibirsk. Otdel., Novosibirsk, 1973.
Here I quote from the MathSciNet review by W. Ellison:
The key result is the following lemma: A polynomial of degree at most 3 that vanishes at 8 of the 9 points $(x,y)$ with $x,y\in\{-1,0,1\}$ must also vanish at the 9th point.
The polynomial $\Phi(x,y)=x^2(x^2-1)^2+y^2(y^2-1)^2$ vanishes to the 2nd order at each of the 9 points and we can find a sextic curve that does not pass through $(0,0)$ but does have double points at each of the other 8 points, for example $\Psi(x,y)=(x^2-1)(y^2-1)(x^2+y^2-1)$. It follows that the function $\Psi/\Phi$ is bounded above in the whole plane. Thus, for a suitable $\lambda>0$ (in fact one can take $\lambda=1$) it follows that $\varphi=\Phi-\lambda\Psi$ is positive definite and that $\varphi(0,0)=\lambda>0$. It is now clear that we cannot have $\varphi=\sum_i f_i^2$ with real polynomials $f_i$, for these polynomials are at most cubics and so would have to vanish at $(0,0)$, since they vanish at the other 8 points.
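Assuming my reconstruction of the review's formulas (the names `Phi`, `Psi`, `R` are mine), the various vanishing claims, and the fact that $\Phi-\Psi$ expands to Robinson's polynomial, can be checked mechanically:

```python
import itertools
import random

# Phi(x,y) = x^2 (x^2-1)^2 + y^2 (y^2-1)^2
# Psi(x,y) = (x^2-1)(y^2-1)(x^2+y^2-1)
# and (with lambda = 1) R = Phi - Psi is Robinson's polynomial.
def Phi(x, y):
    return x**2 * (x**2 - 1)**2 + y**2 * (y**2 - 1)**2

def Psi(x, y):
    return (x**2 - 1) * (y**2 - 1) * (x**2 + y**2 - 1)

def R(x, y):  # the expanded form of Phi - Psi
    return (x**6 + y**6 - x**4 * y**2 - x**2 * y**4
            - x**4 - y**4 + 3 * x**2 * y**2 - x**2 - y**2 + 1)

points = list(itertools.product([-1, 0, 1], repeat=2))  # the 9 points

# Phi vanishes at all 9 points; Psi vanishes at the 8 points other than
# the origin while Psi(0,0) = -1; hence R vanishes at those 8 points
# and R(0,0) = 1.
for (a, b) in points:
    assert Phi(a, b) == 0
    assert Psi(a, b) == (-1 if (a, b) == (0, 0) else 0)
    assert R(a, b) == (1 if (a, b) == (0, 0) else 0)

# R really is Phi - Psi, and it looks non-negative on a random sample.
random.seed(3)
for _ in range(1000):
    x, y = random.uniform(-3, 3), random.uniform(-3, 3)
    assert abs(R(x, y) - (Phi(x, y) - Psi(x, y))) < 1e-8
    assert R(x, y) >= -1e-8
```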
It is interesting to see that several logicians have worked on this problem; notably Robinson, Henkin and Kreisel.
This is a good place to mention that there are algorithms that, given a non-negative polynomial, decide whether it admits a representation as a sum of squares of polynomials, and if so, find such a sum. This is the main result of the nice paper An algorithm for sums of squares of real polynomials, by Victoria Powers and Thorsten Woermann, J. Pure and Appl. Alg. 127 (1998), 99-104. (As of this writing, the paper is also available at Dr. Powers’s webpage.)
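To give the flavor of what such an algorithm certifies, here is a toy sketch of the underlying 'Gram matrix' idea (my own illustration, not the Powers-Woermann algorithm itself): a polynomial is a sum of squares exactly when it can be written as $z^TQz$ with $Q$ positive semidefinite and $z$ a vector of monomials, and once such a $Q$ is produced, the decomposition is easy to verify.

```python
import random

# Toy example: p(x, y) = x^4 - x^2 y^2 + y^4. On the monomial vector
# z = (x^2, xy, y^2), the Gram matrix [[1, 0, -1/2], [0, 0, 0], [-1/2, 0, 1]]
# represents p and is positive semidefinite, which yields the decomposition
#   p = (x^2 - y^2/2)^2 + (3/4) y^4.
# Finding the matrix is the hard part; checking the result is mechanical.
def p(x, y):
    return x**4 - x**2 * y**2 + y**4

def sos(x, y):
    return (x**2 - y**2 / 2)**2 + (3 / 4) * y**4

random.seed(4)
for _ in range(1000):
    x, y = random.uniform(-5, 5), random.uniform(-5, 5)
    assert abs(p(x, y) - sos(x, y)) < 1e-7
```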
Many natural inequalities that appear frequently in analysis can effectively be stated as asserting that certain polynomials are non-negative, so in fact one can (at least in principle) prove these inequalities by simply displaying a representation of the given polynomials as sums of squares.
1. The inequality between the arithmetic and the geometric means asserts that
$$\frac{x_1^n+x_2^n+\cdots+x_n^n}{n}\ge x_1x_2\cdots x_n$$
for all non-negative numbers $x_1,\dots,x_n$, with equality only when $x_1=x_2=\cdots=x_n$. This classical result is one of the most quoted inequalities in analysis. Notice that it is equivalent to the assertion that the polynomial
$$\frac{x_1^{2n}+x_2^{2n}+\cdots+x_n^{2n}}{n}-x_1^2x_2^2\cdots x_n^2$$
is non-negative, and one may hope that it is expressible as a sum of squares (here, I replaced each $x_i$ with $x_i^2$ to dispense with the restriction that the $x_i$ must be non-negative).
As suggested, one may try to prove this inequality by simply exhibiting the polynomial above as a sum of squares. Such a representation was actually found by Hurwitz in 1891, in his paper Über den Vergleich des arithmetischen und des geometrischen Mittels, which can be found in his Math. Werke, 505-507, Basel, Birkhäuser, 1933.
To write the corresponding representation in a palatable way, let's introduce some notation. Given a function $f(x_1,\dots,x_n)$, let's write $\Sigma f$ for the result of adding all the expressions of the form $f(x_{i_1},\dots,x_{i_n})$, for all possible rearrangements $i_1,\dots,i_n$ of the indices $1,\dots,n$ (there are $n!$ such possible rearrangements, where $n!$, $n$-factorial, is defined by setting $0!=1$ and $(n+1)!=(n+1)\cdot n!$ for all $n$).
For example, $\Sigma\,x_1=(n-1)!\,(x_1+\cdots+x_n)$ and $\Sigma\,x_1x_2\cdots x_n=n!\,x_1x_2\cdots x_n$.
In Hurwitz's representation, each summand is of the form $\Sigma f$ with $f$ a product of squares, so each summand is clearly non-negative, and the resulting expression makes it obvious that equality can only occur if the $x_i$ are all equal.
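For small $n$ the resulting sum-of-squares representations are easy to write down and check. Here is a numerical spot-check of the cases $n=2$ and $n=3$ (these particular decompositions are standard ones, not necessarily in Hurwitz's exact form):

```python
import random

random.seed(5)

# n = 2: (x^4 + y^4)/2 - x^2 y^2 = (1/2) (x^2 - y^2)^2.
for _ in range(500):
    x, y = random.uniform(-3, 3), random.uniform(-3, 3)
    lhs = (x**4 + y**4) / 2 - x**2 * y**2
    assert abs(lhs - (x**2 - y**2)**2 / 2) < 1e-8

# n = 3: 2 (x^6 + y^6 + z^6 - 3 x^2 y^2 z^2)
#      = (x^2 + y^2 + z^2) ((x^2-y^2)^2 + (y^2-z^2)^2 + (z^2-x^2)^2);
# distributing the first factor writes the left side as a sum of
# squares such as (x (x^2 - y^2))^2.
for _ in range(500):
    x, y, z = (random.uniform(-3, 3) for _ in range(3))
    lhs = 2 * (x**6 + y**6 + z**6 - 3 * x**2 * y**2 * z**2)
    rhs = (x**2 + y**2 + z**2) * (
        (x**2 - y**2)**2 + (y**2 - z**2)**2 + (z**2 - x**2)**2)
    assert abs(lhs - rhs) < 1e-7
```

Both decompositions also make the case of equality transparent: every square vanishes exactly when the variables agree up to sign, i.e., when the original $x_i$ (before the substitution $x_i\mapsto x_i^2$) are all equal.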
2. Another frequently used classical inequality is the Cauchy-Schwarz inequality,
$$\left(a_1b_1+\cdots+a_nb_n\right)^2\le\left(a_1^2+\cdots+a_n^2\right)\left(b_1^2+\cdots+b_n^2\right),$$
with equality only if the tuples $(a_1,\dots,a_n)$ and $(b_1,\dots,b_n)$ are proportional. This result is usually shown by analyzing the discriminant of a non-negative quadratic polynomial, and it is also widely presented in the more compact form $|\vec a\cdot\vec b|\le\|\vec a\|\,\|\vec b\|$ for arbitrary vectors $\vec a,\vec b$ in $\mathbb{R}^n$ and the usual dot product. Letting
$$C=\left(\sum_{i=1}^n a_i^2\right)\left(\sum_{i=1}^n b_i^2\right)-\left(\sum_{i=1}^n a_ib_i\right)^2,$$
the inequality is seen to be equivalent to the assertion that the polynomial $C$ is non-negative, and so (as in the previous example) it should be possible to prove this by exhibiting $C$ as a sum of squares. The resulting identity is due to Lagrange, namely,
$$C=\sum_{1\le i<j\le n}\left(a_ib_j-a_jb_i\right)^2,$$
from which the case of equality also follows immediately.
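Lagrange's identity is easy to confirm numerically; a quick sketch:

```python
import random
from itertools import combinations

# Lagrange's identity: (sum a_i^2)(sum b_i^2) - (sum a_i b_i)^2
# equals the sum of (a_i b_j - a_j b_i)^2 over all pairs i < j.
def lagrange_gap(a, b):
    return (sum(ai**2 for ai in a) * sum(bi**2 for bi in b)
            - sum(ai * bi for ai, bi in zip(a, b))**2)

def sum_of_squares(a, b):
    return sum((a[i] * b[j] - a[j] * b[i])**2
               for i, j in combinations(range(len(a)), 2))

random.seed(6)
for n in (2, 3, 5, 8):
    for _ in range(200):
        a = [random.uniform(-2, 2) for _ in range(n)]
        b = [random.uniform(-2, 2) for _ in range(n)]
        assert abs(lagrange_gap(a, b) - sum_of_squares(a, b)) < 1e-8
```

Note how the right-hand side vanishes exactly when all the $2\times2$ determinants $a_ib_j-a_jb_i$ do, i.e., when the two tuples are proportional.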
3. Let me close with an example of a classical inequality that leads to non-negative polynomials for which I do not know whether they admit representations as sums of squares of polynomials. This inequality is due to Minkowski: For non-negative $x_1,\dots,x_n,y_1,\dots,y_n$, we have
$$\left(\prod_{i=1}^n(x_i+y_i)\right)^{1/n}\ge\left(\prod_{i=1}^n x_i\right)^{1/n}+\left(\prod_{i=1}^n y_i\right)^{1/n}.$$
As in example 1 above, we can rewrite this as the assertion that a polynomial is non-negative, namely,
$$M=\prod_{i=1}^n\left(x_i^{2n}+y_i^{2n}\right)-\left(x_1^2\cdots x_n^2+y_1^2\cdots y_n^2\right)^n\ge0.$$
That the inequality holds is a consequence of the inequality between the arithmetic mean and the geometric mean. For example, expanding the first product, one finds exactly $2^n$ terms, each with exactly $k$ many factors of the form $x_i^{2n}$ and $n-k$ many factors $y_j^{2n}$, for some $k$. For any $i$ and any $k$ with $0\le k\le n$, precisely $\binom{n-1}{k-1}$ many of the $\binom nk$ terms with exactly $k$ factors of the form $x_j^{2n}$ contain the factor $x_i^{2n}$, so, by the arithmetic mean-geometric mean inequality, the sum of these $\binom nk$ terms is at least $\binom nk$ times the product of the $x_i^2$ to the power $k$ times the product of the $y_i^2$ to the power $n-k$ (since $\binom{n-1}{k-1}\big/\binom nk=k/n$), which is precisely the corresponding term when we expand the $n$-th power we are subtracting in $M$.
However, it is not clear to me that $M$ is a sum of squares of polynomials. The problem in trying to apply the representation from example 1 is similar to the problem one finds in trying to apply that representation to the polynomial $x^4y^2+x^2y^4-3x^2y^2+1$: namely, since for example $x^4y^2$ is not a perfect sixth power, the representation from example 1 gives us the polynomial as a sum of squares of functions involving cubic roots rather than polynomials.
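As a sanity check of the non-negativity of $M$ (in the form I wrote above, which is my reading of the substitution), here is a numerical test for small $n$; note that for $n=2$ one can check by hand that $M$ is even a perfect square, $(x_1^2y_2^2-x_2^2y_1^2)^2$:

```python
import random

# M = prod_i (x_i^{2n} + y_i^{2n}) - (x_1^2...x_n^2 + y_1^2...y_n^2)^n
def minkowski_poly(xs, ys):
    n = len(xs)
    prod = 1
    for xi, yi in zip(xs, ys):
        prod *= xi**(2 * n) + yi**(2 * n)
    px = py = 1
    for xi, yi in zip(xs, ys):
        px *= xi**2
        py *= yi**2
    return prod - (px + py)**n

random.seed(0)
for n in (2, 3, 4):
    for _ in range(200):
        xs = [random.uniform(-2, 2) for _ in range(n)]
        ys = [random.uniform(-2, 2) for _ in range(n)]
        assert minkowski_poly(xs, ys) >= -1e-6

# For n = 2, M is the perfect square (x1^2 y2^2 - x2^2 y1^2)^2:
for _ in range(200):
    x1, x2, y1, y2 = (random.uniform(-2, 2) for _ in range(4))
    assert abs(minkowski_poly([x1, x2], [y1, y2])
               - (x1**2 * y2**2 - x2**2 * y1**2)**2) < 1e-9
```

So the question of whether $M$ is a sum of squares only becomes unclear (to me) starting with $n=3$, where the cube-root obstruction described above first appears.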