Archive for March, 2009

Does Beauty Equal Truth in Physics and Math?

Wednesday, March 11th, 2009

It is not uncommon to hear physicists or mathematicians talk about the “beauty”, “simplicity” or “elegance” of equations or theorems, and even claim that they are sometimes led to a correct formula (or away from an incorrect one) by considering what is “simple” or “elegant”. Consider, for example, the words of the Nobel Prize-winning physicist Murray Gell-Mann:

“Three or four of us in 1957 put forward a partially complete theory of the weak [nuclear] force, in disagreement with the results of seven experiments. It was beautiful and so we dared to publish it, believing that all those experiments must be wrong. In fact, they were all wrong.”

and Albert Einstein’s remark:

“I have deep faith that the principle of the universe will be beautiful and simple.”

Could there be something to these remarkable claims? Is beauty in physics evidence for some kind of “intelligent” universe, or are there more mundane explanations? Is elegance in mathematics evidence for an underlying structure to reality? Or can this be explained away by psychological or practical considerations?

To begin answering these questions, an important thing to notice about the aesthetics of equations is that what appears to be simple or elegant may only be so because of the way that symbols are defined. For example, consider the remarkable and rather minimalist “heat equation”

Δf = f ‘

which, when solved for the function f with a given condition on its boundary, will describe how heat would actually flow over time on any specified surface in any number of dimensions. Is it not astounding that we can describe such a powerful physical law with just these 5 characters? Even if you don’t understand the mathematics or physics, bear with me because you will still be able to understand my point.

A deeper look at this equation shows us that the apparent simplicity here is in large part an illusion. First of all, the Δ, which is known in this context as the Laplace operator, can be thought of as simply a shorthand notation. If we replace Δf with its definition (here in three dimensions of space), we are left with the markedly less simple equation:

d^2/dx^2 f + d^2/dy^2 f + d^2/dz^2 f = f ‘

where d^2/dx^2, d^2/dy^2, and d^2/dz^2 are second derivatives with respect to the x, y and z dimensions of space. The right hand side of the equation can now be replaced with its definition, where the tick (‘) applied to f is understood to mean that we are taking one derivative with respect to time. This gives us:

d^2/dx^2 f + d^2/dy^2 f + d^2/dz^2 f = d/dt f.

Even without knowing what this equation means, you can see that things are starting to get fairly complicated and are looking quite a bit less elegant. Derivative operations (which are taken a total of seven times in the above equation) are not themselves trivial operations, and are (typically) defined via a limiting procedure. If we apply the definition of the derivative to the d/dt on the right hand side, we get:

d^2/dx^2 f + d^2/dy^2 f + d^2/dz^2 f = lim h→0 (f(x,y,z,t+h) - f(x,y,z,t))/h.

Now, if we are crazy enough to replace the remaining six derivative operations with the definition of the derivative, we are left with an equation which is just plain long and ugly, even after performing some simplification:

lim h→0 (1/h^2) * ( 3 f(x,y,z,t) + f(x+2h,y,z,t) - 2 f(x+h,y,z,t) + f(x,y+2h,z,t) - 2 f(x,y+h,z,t) + f(x,y,z+2h,t) - 2 f(x,y,z+h,t) )

= lim h→0 (f(x,y,z,t+h) - f(x,y,z,t))/h.
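For readers who would like to see that the long form really does say the same thing as the short one, here is a minimal numerical sketch. It is my own illustration, not part of the standard presentation: the test function f, the sample point, and the step sizes are all assumptions chosen for convenience. The function used, exp(-3t) sin(x) sin(y) sin(z), is one known solution of the three dimensional heat equation, so both sides of the expanded equation should nearly agree once h is small.

```python
import math


def f(x, y, z, t):
    """An assumed test function: one exact solution of the 3D heat equation."""
    return math.exp(-3.0 * t) * math.sin(x) * math.sin(y) * math.sin(z)


def expanded_lhs(func, x, y, z, t, h):
    """Left side of the fully expanded equation, evaluated at a finite h."""
    total = (3.0 * func(x, y, z, t)
             + func(x + 2 * h, y, z, t) - 2.0 * func(x + h, y, z, t)
             + func(x, y + 2 * h, z, t) - 2.0 * func(x, y + h, z, t)
             + func(x, y, z + 2 * h, t) - 2.0 * func(x, y, z + h, t))
    return total / h ** 2


def expanded_rhs(func, x, y, z, t, h):
    """Right side: the difference quotient in time, evaluated at a finite h."""
    return (func(x, y, z, t + h) - func(x, y, z, t)) / h


def gap(h, point=(0.5, 0.7, 0.9, 0.1)):
    """Absolute mismatch between the two sides at an arbitrary sample point."""
    return abs(expanded_lhs(f, *point, h) - expanded_rhs(f, *point, h))


if __name__ == "__main__":
    # The mismatch should shrink as h shrinks, as the limit suggests.
    for h in (1e-1, 1e-2, 1e-3):
        print(h, gap(h))
```

The ugly formula and the five character formula are the same statement; only the notation differs.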

The point to realize here is that mathematicians and physicists make very careful choices when selecting their notation to vastly compress very complicated ideas. Typically, they define symbols in such a way as to make important formulas easy to write down and work with. However, if they chose to, they could always pick notations which would make even the “simplest” formula look nasty. For example, whenever we find ‘1’ in an equation, we could (if we were completely crazy) replace it using the following formula:

1 = ∑ k≥0 (-1)^k (π/2)^(2k+1) / (2k+1)!.

Doing so would not change any of our results, but it sure would confuse a lot of people and make the formulas much harder to work with.
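If you doubt that the sum above is really just 1 in disguise (it is the power series for the sine function evaluated at π/2), a few lines of Python can check it. This is only a quick sketch; the function name and the number of terms are my own choices.

```python
import math


def one_the_hard_way(terms=20):
    """Partial sum of the series for sin(pi/2), which should approach 1."""
    x = math.pi / 2
    return sum((-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
               for k in range(terms))


if __name__ == "__main__":
    for terms in (1, 2, 3, 5, 20):
        print(terms, one_the_hard_way(terms))
```

Even three terms already land within one percent of 1, and twenty terms agree with 1 to about machine precision.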

All of this being said, notation is not the end of the story. Another important point to consider is that in many cases a single physical law can cause a multitude of different effects which may not, at first, appear to be related. To give some classic examples, before Newton’s era it was not at all obvious that the force that causes us to fall to the ground when we jump is the same force that keeps planets in orbit in our solar system. Likewise, before the 1800s it was not known that electric fields, magnetic fields and light are in fact manifestations of a single phenomenon now known as electromagnetism. Similarly, before the era of Einstein it was not understood that conservation of energy and conservation of momentum could be thought of as effectively being part of a single conservation law.

There are a number of cases in physics where simpler and more elegant theories have won out over more complex theories because they correctly identify seemingly unrelated phenomena as having a single cause. Theories which treat inherently connected ideas as being wholly different are destined to be replaced since their lack of unification creates redundancy and therefore unnecessary complexity in the theory. This is one important reason why ugly, complicated theories can often be outdone by what seem to be more beautiful ones. We find it more beautiful to have one explanation for two results than to have two distinct explanations, and if the results really are just caused by one phenomenon, the single explanation will typically be easier to express and work with mathematically than both of the other two.

Another, related reason why we might expect simplicity to win out over complexity comes from a rule of thumb known as Occam’s Razor. This idea, which is often bandied about as if it were obviously and unquestionably true, states that when given many possible explanations for something that are otherwise equally plausible, we should prefer the one that is the simplest (or that makes the fewest assumptions). While Occam’s Razor certainly makes some intuitive sense, we can place the idea on a slightly more rigorous footing by considering results from the now blossoming field of machine learning, which concerns itself (in large part) with getting computers to make intelligent predictions by learning from past examples.

When computer scientists attempt to estimate how good a particular learning algorithm is at making predictions, typically what they find is that the expected future error of the algorithm is dependent on what might be called an “Occam term”, which punishes models based on their complexity. The more complex a model is, the more of this kind of penalty it will incur, and so the less accurate the algorithm will tend to be when making predictions. Here, depending on the mathematical analysis carried out, “complexity” can be measured in a variety of different ways, including the number of free parameters in the model, the number of bits of information required to specify the model, or the maximum number of points the model will always be able to categorize without making an error. The idea is that while very complex models are good at explaining past data (i.e. data that is used to train the models), they tend to (all else being equal) make more errors than simple models on future data (i.e. data that is not available at the time when the models are trained).

Now, since physicists are in the business of trying to guess (or predict) the rules of the universe from experiments (which are just like the “past examples” in the machine learning setting), it is intuitive to think that an “Occam term” will apply to them as well. Hence, while this is not by any means an airtight argument, we have some reason to think that in the scientific method, just as in the machine learning setting, simpler theories do truly tend to be more useful than complex ones, so long as both explain all of the currently available experimental evidence.
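To make the trade-off between complexity and predictive accuracy a little more concrete, here is a toy sketch in Python. Everything in it is an assumption made for illustration: the “true law” is a straight line, the “measurement noise” is a fixed alternating error of ±0.5 (so that the run is reproducible), the simple model is a least squares line with two free parameters, and the complex model is a polynomial with eight free parameters that passes through every past measurement exactly. It is not one of the formal error bounds mentioned above, just an instance of the behavior those bounds describe.

```python
def true_law(x):
    """The assumed underlying rule that generates the data."""
    return 2.0 * x + 1.0


# Past data: eight noisy measurements (the noise is deterministic on purpose).
TRAIN_X = [float(i) for i in range(8)]
TRAIN_Y = [true_law(x) + (0.5 if i % 2 == 0 else -0.5)
           for i, x in enumerate(TRAIN_X)]

# Future data: points between the old ones, compared against the true law.
TEST_X = [i + 0.5 for i in range(7)]
TEST_Y = [true_law(x) for x in TEST_X]


def fit_line(xs, ys):
    """Simple model: ordinary least squares line (two free parameters)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_interpolant(xs, ys):
    """Complex model: Lagrange polynomial through every training point."""
    def model(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            weight = 1.0
            for j, xj in enumerate(xs):
                if j != i:
                    weight *= (x - xj) / (xi - xj)
            total += yi * weight
        return total
    return model


def mse(model, xs, ys):
    """Mean squared error of a model on a data set."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


if __name__ == "__main__":
    line = fit_line(TRAIN_X, TRAIN_Y)
    poly = fit_interpolant(TRAIN_X, TRAIN_Y)
    print("line: past", mse(line, TRAIN_X, TRAIN_Y), "future", mse(line, TEST_X, TEST_Y))
    print("poly: past", mse(poly, TRAIN_X, TRAIN_Y), "future", mse(poly, TEST_X, TEST_Y))
```

The complex model explains the past perfectly and the simple model does not, yet the simple model is the better guide to the future data.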

A good example of Occam’s Razor which came up in practice is the Ptolemaic explanation of the motion of the planets, which apparently was the “accepted theory” in some places for “over 13 centuries”. The basic idea of this theory was that planetary motion consists of “epicycles” around the fixed planet earth. This means that planets were thought to make circular orbits around earth, but that during these circular orbits the planets orbited in smaller circles along the orbits, and along those smaller circular orbits they orbited in still smaller circles, etc. This model was intrinsically very complex because by adjusting the epicycles so that there were a sufficient number of circular orbits within circular orbits at appropriate speeds one could have described pretty much any shape of orbit whatsoever, real or imagined. In other words, the model had a large number of free variables which gave it enormous flexibility and therefore complexity. Copernicus eventually laid waste to the Ptolemaic model by replacing it with a far simpler model with far fewer free parameters, which he accomplished merely by shifting the center of the circular planetary orbits to be the sun rather than the earth. However, the basic form of his new theory still did not agree perfectly with observation, and so required some ad hoc refinements that introduced extra complexity. This final complexity was eventually removed by Kepler who refined the model yet again by allowing for elliptical rather than circular orbits, which is now known to be an excellent explanation for the orbits that are observed.

The key difference in these explanations for orbits is that the theory of epicycles is complex enough to explain almost any conceivable orbit you could ever think of, whereas Kepler’s idea of elliptical orbits with the sun at one focus of the ellipses was just complex enough to explain what was actually observed but without being complex enough to explain the universe had we observed substantively different orbits than those that actually exist. In other words, Kepler’s theory is precisely as complicated as it needs to be to explain reality.
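The claim that enough circles upon circles can reproduce almost any orbit can be sketched in code. Note the assumptions: this is the modern Fourier series reading of epicycles (one circle per frequency, with the radii and starting angles computed by a discrete Fourier transform), not Ptolemy’s actual construction, and the square “orbit” and all function names are made up for illustration. No planet moves in a square, yet the circles trace through every sampled point of it.

```python
import cmath


def square_path(n_points):
    """Points (as complex numbers) walking once around a square of side 2."""
    corners = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
    per_side = n_points // 4
    points = []
    for k in range(4):
        start, end = corners[k], corners[(k + 1) % 4]
        for s in range(per_side):
            points.append(start + (end - start) * s / per_side)
    return points


def epicycle_coefficients(points):
    """Discrete Fourier transform: one complex number per circle, whose
    magnitude is the circle's radius and whose angle is its starting phase."""
    n = len(points)
    return [sum(p * cmath.exp(-2j * cmath.pi * k * m / n)
                for m, p in enumerate(points)) / n
            for k in range(n)]


def trace(coeffs, m):
    """Position at time step m: the sum of all the rotating circles."""
    n = len(coeffs)
    return sum(c * cmath.exp(2j * cmath.pi * k * m / n)
               for k, c in enumerate(coeffs))


if __name__ == "__main__":
    orbit = square_path(32)
    circles = epicycle_coefficients(orbit)
    worst = max(abs(trace(circles, m) - p) for m, p in enumerate(orbit))
    print("circles used:", len(circles), "worst miss:", worst)
```

With as many circles as sample points, the fit to the sampled path is exact up to rounding error, whatever the shape. That flexibility is precisely the kind of complexity that Occam’s Razor warns against.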

There are a few more points about the relationship between beauty and truth in physics and math that I feel are worth mentioning. To begin with, as physicist Murray Gell-Mann (quoted above) mentions in his TED talk on beauty and truth in physics, symmetry plays a key role in simplicity. For example, since all of the known laws of physics treat the three dimensions of space equally, we can often greatly simplify equations by writing things such as

∇ f = some expression

rather than having to write an equivalent but much more cumbersome set of equations where we treat each dimension of space separately, as in:

df/dx = some expression

df/dy = some expression

df/dz = some expression.

The point here is that symmetry makes it easy to simplify equations. Of course, this argument goes beyond just the symmetry of the three dimensions of space, and applies also to symmetry in time, rotation, etc.

Another idea that should be mentioned is that typically mathematical expressions have a number of different equivalent forms. For example, we could define the exponential function e^x using any of the following equivalent definitions:

f(x) = lim h→∞ (1 + x/h)^h

f(x) = f ‘(x) & f(0) = 1

f(x) = ∑ k≥0 x^k/k!

f(ln(x)) = x

f(x+y) = f(x) f(y) & f(1) = e

f(x) = cosh(x) + sinh(x)

None of these definitions for e^x is intrinsically better than any other. Mathematicians have the choice to use whichever definition is more useful for any given purpose, and oftentimes it is precisely the simpler or more “elegant” definitions that are used most commonly because they are easier to understand and manipulate.
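As a quick sanity check that these definitions describe the same function, here is a short Python sketch comparing three of them against the built in exponential. The function names, the finite value of h, and the number of series terms are my own assumptions; the limit and the infinite sum are necessarily truncated, so agreement is only approximate.

```python
import math


def exp_limit(x, h=10 ** 6):
    """Limit definition, with a large but finite h standing in for infinity."""
    return (1.0 + x / h) ** h


def exp_series(x, terms=30):
    """Power series definition, truncated after a fixed number of terms."""
    return sum(x ** k / math.factorial(k) for k in range(terms))


def exp_hyperbolic(x):
    """Definition as the sum of hyperbolic cosine and hyperbolic sine."""
    return math.cosh(x) + math.sinh(x)


if __name__ == "__main__":
    x = 1.5
    print("built in  ", math.exp(x))
    print("limit     ", exp_limit(x))
    print("series    ", exp_series(x))
    print("hyperbolic", exp_hyperbolic(x))
    # Inverse property: applying the function to ln(3) should give back 3.
    print("f(ln 3)   ", exp_series(math.log(3.0)))
```

The limit form converges slowly (its error shrinks roughly in proportion to 1/h), while the series reaches machine precision within a few dozen terms, which is one practical reason to prefer one definition over another for a given purpose.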

As a final point, it is worth noting that much of the most theoretical mathematical work is driven more by the aesthetic and psychological appeal of the theorems produced than by the importance of those theorems in solving practical problems that arise in the real world. One prime example of this phenomenon is the field of number theory, which, while popular and very elegant, found almost no practical applications before it was (unexpectedly) linked to the field of cryptography and secure online banking. It should also be noted that it is likely easier to publish results that strike the reviewers as elegant rather than clumsy and awkward. Keeping these ideas in mind, it is no surprise to find that some of the most researched areas of math even today have great beauty but few real world applications.

In conclusion, the relationship between beauty and truth in physics and math is a complicated one, which relates to practical considerations such as choices for notation and definitions, psychological phenomena such as the personal preferences and aesthetic sensibilities of the practitioners, and deeper physical or mathematical ideas such as symmetry, the unification of seemingly unrelated results, and Occam’s razor. In the end, it is clear that beauty is an important, if not fundamental, part of math and science.

We, Evil?

Wednesday, March 4th, 2009

Could it be that we are slave holders, witch burners or Nazis and yet don’t even know it? By that I mean could we, as individuals, or as societies, be unwittingly perpetrating acts that future generations will consider to be unspeakably evil? That idea may sound absurd, yet many slave holders, witch burners and Nazis saw nothing wrong in what they did. Surely many of them would be shocked to know that their actions are used today as quintessential examples of moral corruption. So how can we be absolutely certain that we too are not committing acts that will be called evil? As the Utilitarian philosopher Peter Singer wrote in his book Practical Ethics,

“It is easy for us to criticize the prejudices of our grandfathers, from which our fathers freed themselves. It is more difficult to distance ourselves from our own views, so that we can dispassionately search for prejudices among the beliefs and values we hold. What is needed now is a willingness to follow the arguments where they lead, without a prior assumption that the issue is not worth attending to.”

Consider the following practices that are commonplace today, but which could perhaps, given the right circumstances, one day be looked upon with astonishment and horror:

1. Our treatment of the environment.

If during our lifetimes global warming or global pollution is pushed past a point of no return that leaves future generations reaping the catastrophic consequences (such as water levels dramatically rising, amplified natural disasters, increased spread of diseases, extinction of many species, etc.), we could one day be thought of as the astoundingly self-serving wreckers of planet earth. The evidence is now abundant that man-made global warming is real, and many technologies are available to fight this problem, though the world seems to lack the will to use them.

2. Our treatment of the poor.

If poverty on our planet is ever greatly reduced, future generations may look back upon the incredibly unequal distribution of wealth between and within countries that is commonplace today and be shocked that those with wealth and power did not do more in the name of equality. Recall that every time we buy ourselves a new pair of expensive shoes we may well have been able to use that money to help prevent someone from dying of malnutrition or becoming infected with a disease like AIDS or malaria. The “Millennium Campaign” cites some sobering statistics, such as that “800 million people go to bed hungry every day” and that “nearly half the world’s population is living on less than $2 a day”. Almost everyone would agree that murdering a person by withholding food until they starve to death is an act of extreme evil. Yet there are few who acknowledge that allowing someone to starve to death when you could have easily prevented it (through only a small sacrifice on your part) is evil as well. In a country where most people make less than $2 a day, a hundred dollars can go a long way.

3. Our treatment of animals.

If one day humans fully accept that many animals experience pain and emotions that are similar in nature to what we feel, then perhaps society could begin to feel the same outrage over hurting any intelligent animal that many now experience when dogs, cats or endangered species are mistreated. Could keeping a dog in a small cage for its entire life before slaughtering it (which most people seem to consider cruel) really be much different in an ethical sense from doing the same to a pig or a lamb or even a chicken? If the world only concedes that killing an animal (whether it’s a chicken, turkey, cow, pig or sheep) is one thousandth as “bad” as killing a human, that would make the approximately 10 billion animals killed in the U.S. in 2008 (to satisfy our gastronomic preferences) equivalent to the death of 10 million humans.

4. Our inaction during genocides and atrocities committed throughout the world.

Future generations may come to feel that the powerful nations of the world did not do nearly enough when hundreds of thousands of ordinary people were slaughtered with machetes in Rwanda in 1994, or when approximately 1.5 million Cambodians died from “execution, starvation, and forced labor” under Pol Pot in the 1970s, or when about ten million people were killed in Germany under Hitler in the 1930s and 40s, or when millions of people died of starvation in the Ukrainian Soviet Socialist Republic under Stalin’s rule during the 1930s. For that matter, perhaps the American people will be blamed for not doing more to stop over a hundred thousand civilian deaths during the Iraq war, which was allegedly fought over what proved to be non-existent weapons of mass destruction. What’s more, maybe future generations will reject America’s wartime justifications for killing hundreds of thousands of Japanese civilians with nuclear weapons in 1945.

5. Our poor control over nuclear weapons.

If one day there is a nuclear war that claims tens of millions or even hundreds of millions of lives or seriously alters the earth’s climate due to nuclear winter, perhaps such a catastrophe will be in large part blamed upon the current generations, which failed to do enough to secure and prevent the spread of nuclear weapons in the world and stop nuclear war.

6. Our refusal to give marriage rights to homosexuals.

We look back with disbelief on the years before 1920 when women still could not vote in America and on the time before the Voting Rights Act in 1965 when many black men and women were denied the right to vote by what are now often referred to as “discriminatory voting practices”. Perhaps one day America’s (not yet fully repealed) sodomy laws and ongoing refusal to give full marriage rights to homosexuals will be looked upon as similarly backward, ignorant and contemptible.

The list above is by no means perfect or complete. There are undoubtedly examples mentioned above that future generations will never blame us for or even consider unethical (though I do not know for certain which examples those are). Likewise, there are surely plenty of potential evils that I have failed to think of or mention (due to my own ignorance, fallibility, bias, and acceptance of cultural norms).

Please note that I do not, by any means, believe that the average person is evil or corrupt. On the contrary, I think that most people do what they feel is right most of the time. That being said, I hope that the examples listed above will help convince you that the accepted norms of today are not necessarily beyond reproach, and that it is worth taking a careful look at the potentially harmful ways of acting that we take for granted.

As Peter Singer alludes to in the quote above, it can be very difficult to criticize actions that are considered normal or are expected in your culture. For example, it is easy to condemn slavery when raised in a world that rejects it, but it is far more difficult to condemn it (or even recognize it as unethical) when all of your relatives and friends own slaves. Let us work hard to identify and stamp out our own evils, and not wait idly for future generations to label us as slave holders, witch burners, or Nazis.