Wednesday, July 16, 2014

About the meaning of Schrödinger Equation

It can be a little bit confusing - especially through the first chapters - to grasp what exactly is being developed in the analysis (see the book on the right). It is easy to miss that what is being developed is not a physical theory, but an argument about physical theories; every argument exists at an abstraction level above physics. The first chapters in particular are highly abstract, as they meticulously detail an epistemological framework for the analysis (a framework that is most likely completely unfamiliar to the reader).

I find that people typically start to get a better perspective on the analysis when the Schrödinger equation suddenly appears in plain sight, in a rather surprising manner.

So to provide a softer landing, I decided to write some commentary and thoughts on that issue. Don't mistake this for an analytical defense of the argument; for that it's better to follow the arguments in the book. I'm merely providing a quick overview, and pointing out some things worth thinking about.

Let me first jump forward to page 51, where equation (3.15) is:
-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2}\Psi(x,t)+V(x)\Psi(x,t)=i\hbar\frac{\partial}{\partial t}\Psi(x,t)

which is identical to the Schrödinger equation in one dimension for a single particle.
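As a quick sanity check on the form of this equation (this is not part of the book's argument), consider a free particle, $V(x)=0$, and try a plane wave. The wave number $k$ and angular frequency $\omega$ are symbols introduced here only for illustration:

```latex
% Trial solution: a plane wave with wave number k and angular frequency \omega
\Psi(x,t) = e^{i(kx - \omega t)}

% Left-hand side with V(x) = 0
-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2}\Psi = \frac{\hbar^2 k^2}{2m}\Psi

% Right-hand side
i\hbar\frac{\partial}{\partial t}\Psi = \hbar\omega\,\Psi

% The equation therefore holds exactly when
\hbar\omega = \frac{\hbar^2 k^2}{2m}
```

With the Planck and de Broglie relations $E=\hbar\omega$ and $p=\hbar k$, this condition is the familiar $E = p^2/2m$ for a free particle.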

In Richard's analysis this expression arises from radically different considerations than its historical twin, and the steps that were taken to get here offer some insight as to why Schrödinger's equation is a valid approximate general description of the entire universe.

To get some perspective on the issue, let me do a quick hand-wavy run through some of the history of the Schrödinger equation.

Usually the $\Psi$ in the Schrödinger equation is interpreted as representing a probability amplitude (whose absolute square is the probability of observation results). Erwin Schrödinger didn't arrive at this representation by trying to describe probability amplitudes per se. Rather, it arose from a series of assumptions and definitions that were simply necessary to explain some observations, under certain pre-existing physics definitions:

Max Planck created a concept of "discrete energy bundles" as part of his explanation for "black body radiation". Motivated by this, Albert Einstein suggested that electromagnetic radiation exists in spatially localized bundles - call them "photons" - while they would still preserve their wave-like properties where suitable. Consequently, Louis de Broglie showed that wave behavior would also explain observations related to electrons, and further hypothesized that all massive particles could be associated with waves describing their behavior.

Schrödinger set out to find a wave equation that would represent the behavior of an electron in 3 dimensions; in particular, he set out to find a wave equation that would correctly reproduce the observed emission spectrum of a hydrogen atom.

Schrödinger arrived at a valid equation, which would "detail the behavior of wave function $\Psi$ but say nothing of its underlying nature". There is no underlying reasoning that would have directly led Schrödinger to this particular equation; there is just the simple fact that this is the form that yields valid probabilistic expectations for the observed measurement outcomes.

It was only after the fact that the desire to explain what in nature makes the Schrödinger equation valid led to a multitude of hypothetical answers (i.e. all the ontological QM interpretations).

Effectively, what we know is that the equation does correctly represent the probabilities of measurement outcomes, but all the ideas as to why are merely additional beliefs.

Just as a side-note, every step on the historical route towards the Schrödinger equation represents a small change in the pre-existing physics framework; every step contains an assumption that the pre-existing framework represents reality mostly correctly. Physical theories don't just spring out from radically new perspectives, but rather they tend to be sociologically constructed as a logical evolution from previous paradigms, generalized to explain a wider range of phenomena. These successful generalizations may cause revolutionary paradigm shifts, as was the case with the Schrödinger equation.

Alright, back to Richard's analysis. I will step backwards from equation (3.15) (the Schrödinger Equation) to provide a quick but hand-wavy preview of what it is all about.

First note that here, too, $\Psi$ is related to the probability of a measurement outcome via $P = \Psi^{\dagger} \cdot \Psi$. But it has not been interpreted as such after the fact; rather, it has been explicitly defined from the get-go as any unknown function that yields observed probabilities in a self-consistent fashion. Any function that does so can be seen as a valid explanation of some data.

For more detail on this definition, see equation (2.13) on page 33 and the arguments leading up to it. Note especially that the definition (2.13) is specifically designed to not exclude any possible functions embedded inside $\Psi$. It is important that this move does not close out any possibilities pre-emptively; if it did, we would have just made an indefensible assumption about the universe. The definition (2.13) will have an impact on the mathematical appearance of any expressions, but this is most correctly seen as an abstract (and in many ways arbitrary) choice of mathematical terminology. Its consequences should be viewed as purely epistemological (purely related to the mental terminology embedded in our explanation), not ontological (relating to nature in itself). E.g. the squaring comes from the definition of probability, and $\Psi$ being a complex function simply impacts the apparent form of its constraints (which play an important role in the form of the Schrödinger equation, as becomes evident a little later).
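To see why a definition of this kind does not exclude anything, here is a minimal sketch for the simplest case, assuming a scalar complex-valued $\Psi$ (the book's definition (2.13) is more general, and the phase function $\theta$ is a symbol introduced here only for illustration):

```latex
% Any non-negative probability density P can be written as \Psi^\dagger \Psi:
\Psi(x,t) = \sqrt{P(x,t)}\; e^{i\theta(x,t)}, \qquad \theta(x,t) \in \mathbb{R}

% because the phase cancels in the product
\Psi^{\dagger}\Psi = \sqrt{P}\,e^{-i\theta}\,\sqrt{P}\,e^{i\theta} = P
```

The phase $\theta$ is left completely free by $P$, so writing probabilities this way restricts nothing about the probabilities themselves; it only chooses how they are written down.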

Let's jump to equation (3.14), which is logically equivalent to the Schrödinger equation; only a couple of simple algebraic steps stand between the two expressions. There is absolutely no reason to do these steps other than to point out that the equations do indeed represent the same constraint.

As is mentioned in the book, (3.14) was obtained as a starting point for a perturbation attack, to find the constraints on a $\Psi$ for a single element in an unknown universe (under a few conditions, which I'll return to).

To get some idea of what that means, let me first point out that the underlying constraints embedded in the equation have a deceptively simple source. The equations tagged as (2.7) on page 26 are:

\sum_{i=1}^{n}\frac{\partial}{\partial \tau_i}P(x_1, \tau_1, x_2, \tau_2, \ldots, x_n, \tau_n, t)=0

\sum_{i=1}^{n}\frac{\partial}{\partial x_i}P(x_1, \tau_1, x_2, \tau_2, \ldots, x_n, \tau_n, t)=0

which simply means that, under any explanation of any kind of universe, the probability of an outcome of any measurement is always a function of some data points that provide a context (the meaning) for that measurement. But the probability is never a function of the assignment of labels (names) to those data points.

Alternatively, you can interpret this statement in terms of an abstract coordinate system $(x, \tau)$ (a view also developed carefully in the book), in which case we could say that the probability of an outcome of a measurement is not a function of the location of the context inside the coordinate system. Effectively, that is to say that the defined coordinate system does not carry any meaning in its locations. After all, it is explicitly established as a general notation capable of representing any kind of explanation.
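The connection between the vanishing sums in (2.7) and "location carries no meaning" can be sketched with a standard chain-rule argument. This is not a quotation from the book, and the shift parameter $a$ is introduced here only for illustration:

```latex
% Shift every x_i by the same amount a and differentiate with respect to a.
% The chain rule produces exactly the sum that (2.7) sets to zero.
\frac{d}{da}P(x_1 + a, \tau_1, \ldots, x_n + a, \tau_n, t)
  = \sum_{i=1}^{n}\frac{\partial P}{\partial x_i} = 0

% Hence P does not depend on a:
P(x_1 + a, \tau_1, \ldots, x_n + a, \tau_n, t) = P(x_1, \tau_1, \ldots, x_n, \tau_n, t)
```

The same argument applies to a uniform shift of all the $\tau_i$. Only the relative positions of the data points can matter to $P$, never where the whole collection sits in the coordinate system.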

Note that what the data points are, and what they mean, is a function of each and every possible explanation. Thus the only constraints that are meaningful to this analysis are those that would apply to any kind of assignment of labels.

Note that an exactly analogous symmetry constraint is defined for the partial derivative with respect to $t$, the index for time-wise evolution. See (2.19), where it is expressed in terms of $\Psi$.

The other type of underlying constraint is represented in equation (2.10) with a Dirac delta function, meaning at its core that different data points cannot be represented as the same data point by any self-consistent explanation; a rather simple epistemological fact.

The definition of $\Psi$ via $P = \Psi^{\dagger} \cdot \Psi$, together with some algebra, leads to a succinct expression of these universal epistemological constraints as a single equation, (2.23):

\left \{ \sum_i \vec{\alpha}_i \cdot \vec{\triangledown}_i + \sum_{i \neq j} \beta_{ij} \delta(\vec{x}_i - \vec{x}_j)   \right \}\Psi=\frac{\partial}{\partial t}\Psi = im\Psi

which was somewhat useless for me to write down here, as you need to view the associated definitions in the book anyway to understand what it means. You can see page 39 for definitions of the alpha and beta elements, and investigate the details of this expression better there. For those who want to just carry on for now, effectively this amounts to a single expression representing exactly the above-mentioned constraints on $\Psi$, without creating any additional constraints.

View this as a constraint that arises from the fact that any explanation of anything must establish a self-consistent terminology to refer to what-is-being-explained, and this is the constraint that any self-consistent terminology in itself will obey, regardless of what the underlying data is.

Chapter 3 describes the steps from this point onward, leading us straight into Schrödinger's expression. It is worth thinking about what those steps actually are.

The first steps are concerned with algebraic manipulations to separate the collection of elements into multiple sets under the common probability relationship P(set #1 and set #2) = P(set #1)P(set #2 given set #1) (page 45).

This leads us to equation (3.6), which is an exact constraint that a single element must obey in order to satisfy the underlying epistemological constraints. But this expression is still wonderfully useless, since we don't know anything about the impact of the rest of the universe (the $\Psi_r$).

From this point on, the moves that are made are approximations that cannot be defended from an ontological point of view, but their epistemological impact is philosophically significant.

The first move (on page 48) is the assumption that there is only negligible feedback between the rest of the universe and the element of interest. Effectively, the universe is taken as stationary in time, and the element of interest is assumed to not have an impact on the rest of the universe.

Philosophically this can be seen in multiple ways. Personally I find it interesting to think about the fact that, if there exists a logical mechanism to create object definitions in a way where those objects have negligible feedback to the rest of the universe, then there are rather obvious benefits in simplicity for adopting exactly such object definitions, whether or not those definitions are real or merely a mental abstraction.

Note further that if it was not possible - via reasonable approximations or otherwise - to define microscopic and macroscopic "objects" independently from the rest of the universe, so that those objects can be seen as universes unto themselves, the alternative would be that any proposed theory would have to constantly represent the state of the entire universe. I.e. the variables of the representation would have to include all represented coordinates of everything in the universe simultaneously.

That is to say, whether or not reality was composed of complex feedback loops, any method of modeling probabilistic expectations with as few feedback mechanisms as possible would be desirable, and from our point of view such explanations would appear to be the simplest way to understand reality.

The next steps are just algebraic moves under the above assumption, leading to equation (3.12) on page 50. Following that equation, the third and final approximation is set as:

\frac{\partial}{\partial t} \Psi \approx -iq\Psi

which leads to an expression that is already effectively equivalent to Schrödinger's equation, simply implying that this approximation plays a role in the exact form of Schrödinger's Equation. See more commentary about this from page 53 onward.
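To get a feel for what this approximation says, one can treat it as an exact equality and solve it. This is only an illustrative sketch, not the book's own argument; $\Psi(x,0)$ denotes the value at $t=0$, and $q$ is assumed here to be a real constant:

```latex
% Treating the approximation as exact:
\frac{\partial}{\partial t}\Psi(x,t) = -iq\,\Psi(x,t)
\quad\Longrightarrow\quad
\Psi(x,t) = \Psi(x,0)\,e^{-iqt}

% The time dependence is a pure phase, so the probability is constant in time:
\Psi^{\dagger}(x,t)\,\Psi(x,t) = \Psi^{\dagger}(x,0)\,\Psi(x,0)
```

Under these assumptions, to the extent that the approximation holds, only the phase of $\Psi$ evolves in time while the associated probability stays put.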

And there it is: the equation that implies wave-particle duality for the entire universe, and that yielded a revolution in the concepts of modern physics, arises from entirely epistemological constraints and a few assumptions that are forced upon us in order to remove overwhelming complexity from a representation of a universe.

The steps that got us here tell us exactly what makes the Schrödinger Equation generally valid. When we create our world view, we define the elements (the mental terminology) with exactly the same epistemological constraints that would also yield the Schrödinger Equation in Richard's analysis. The only difference between different representations (everyday, classical, or quantum mechanical) is that different approximations are made for simplicity's sake.

The steps that the field of physics took towards the Schrödinger equation were always dependent on the elements that had already been defined as a useful representation of reality. They were merely concerned with creating generalized expressions that would represent the behavior of those elements.

The so-called quantum mystery arises from additional beliefs about the nature of reality - redundant beliefs that the elements we define as part of our understanding of reality are also ontological elements in themselves. There exist many different beliefs (QM interpretations) that each yield a possible answer to the nature behind quantum mechanics, but scientifically speaking, there is no longer any need to explain the validity of the Schrödinger Equation from any hypothetical ontological perspective.

So it appears the critical mistake is to assume that the particles we have defined are also in themselves real objects, from which our understanding of reality arises. Rather, the converse appears to be true; a useful method of representing the propagation of probabilistic expectations between observations is driving what we have defined as objects, and consequently this epistemological issue critically affects how we view the universe meaningfully in the first place. After all, understanding reality is meaningful only insofar as we can correctly predict the future, and the only meaningful object definitions have to be directly related to that fact.

Thus, to avoid redundant beliefs in the subject matter, Schrödinger equation can be seen simply as a valid general representation of the propagation of our expectations (of finding a defined particle) between observations, governed by the same constraints that govern what we define as "objects". The exact reason why the particles "out there" appear to be reacting to our mere observations is that the pure observation X implies the existence of a particle Y only in our own mental classification of reality. That is why the particles that we have defined do not behave as particles would in-between observations.

To assume the ontological existence of the associated particles as we have defined them is not only redundant, but also introduces a rather unsolvable quantum mystery.


  1. Hi Anssi,

    Perhaps the "normal" derivation of the Schrodinger equation would be great to have in parallel.

    Would you also like to explain the step from 3.8 to 3.9? The function g(x) seems quite complicated and is related to V(x) in the end. Does V(x) not normally have some physical meaning (other than the assumptions that the rest of the universe does not influence the system, that the rest of the universe is stationary in time, and that the time derivative of the wave function is equal to the wave function times an imaginary constant times q)?

    1. Thank you for the comment.

      There is really no "normal" derivation of Schrödinger equation, as historically it is directly based on some postulates of quantum mechanics. That is the meaning of the Feynman quote on Wikipedia:

      "Where did we get that (equation) from? Nowhere. It is not possible to derive it from anything you know. It came out of the mind of Schrödinger."

      When Schrödinger created the equation, he was simply motivated by de Broglie's notions and other associated developments in physics at that time. Motivated by these ideas, and some guess work, he came up with an equation that replicates observational data, but he did not really know why.

      If you google for a derivation of Schrödinger equation, you will get various presentations deconstructing some algebraic connections to Newtonian definitions. For instance I spotted this blog post which is related to that issue. See also the video link at Addendum 5 in that page.

      At any rate, within the current understanding of physics it would be said that Schrödinger equation cannot be reduced to anything more fundamental. That is no longer exactly true, as it appears to arise from constraints that govern sensible object definitions.

      This just entails a complete paradigm shift from the idea that we are trying to explain a world made of persistent particles that behave like waves when we are not looking, to the idea that defined particles are in fact part of a mental terminology useful for representing some information (reality) in meaningful manner.

      $V(x)$ in the Schrödinger equation represents potential energy at location $x$. It can be used to model some external influence, for instance to model some boundaries for the particle.

      In the analysis, $g(\vec{x})$ is a shorthand for an integral that represents the impact of the rest of the universe, given the location of the single element of interest. It is something which can be expressed as a function of a single $x$ (that single element of interest), if a solution to the rest of the universe is known or the rest of the universe can be ignored. To understand that bit better, see how the function $F$ arose in (3.4), and also see the additional clarifying comments related to that function on page 52.

      If people want, this could be perhaps explained in greater detail in a separate post.

      So, effectively, $g$ is embedded inside $V$, and they are indeed associated with the same idea (representing external influences).

      You should also read from the book the few pages following directly after the Schrödinger equation (3.15), which further clarifies some details surrounding this equation, and is somewhat relevant to your questions regarding its "normal" derivation.

  2. Regarding MrQuincle's comment, “Would you also like to explain the step from 3.8 to 3.9.” together with “The function g(x) seems quite complicated and is related to V(x) in the end.” it may be that he is confusing dVr-1 with V(x), two very different references. The expression dVr-1 refers to the differential volume associated with the x variables in the “remaining” collection; see equation (3.5). The “minus 1” refers to the fact that x sub one is not being integrated over!

    The V(x) defined in equation (3.15) is an entirely different thing. That expression is no more than the consequence of identifying g(x) (the result of the integration) with Schrödinger's V(x). It essentially identifies the result of that integration over the rest of the universe with what is ordinarily seen as potential energy.

    I hope that clears things up.