From First Principles

Our goal here is to consider how the assignments of space and time coordinates by various distinct observers are related, and how that relativity of coordinates to observers is not arbitrary but rather is determined by the invariance of some aspects of the observers’ experience.

In particular, the condition that if one observer sees another moving at velocity $#v#$ then the other sees the first moving at the same speed in the opposite direction (ie at velocity $#-v#$) is quite restrictive. It forces the transformation law to be either Galilean (with an absolute time coordinate) or Lorentzian for some particular maximum possible speed $#c#$.

Let’s first look at the most basic aspects of how we assign time and space coordinates to our experiences.

Each observer quickly learns that aspects of its experience can be identified with particular values or ranges of a “time” parameter, $#t#$, and a spatial position, $#x#$.

For any choice of time and space coordinates, $#t#$ and $#x#$, for an observer who we’ll call $#O#$, going to the corresponding coordinates $#t’#$ and $#x’#$ of the same event as seen by some other observer $#O’#$ defines a function such that $$\mathscr{G}_{OO’}(\begin{pmatrix}t\\x\end{pmatrix})=\begin{pmatrix}t’\\x’\end{pmatrix}$$

If the observers are both in the same place and stationary with respect to one another, then we find that much of their experience is the same (and those experiences which are the same are identified as “external” or “physical” “events”). Furthermore, we also find that if they both use the same particular events as markers for their coordinate systems, then they will agree on the coordinates of any other event.

We also expect that all observers experience the same “laws of physics” relating different aspects of their experience and so that they will find the same function $#\mathscr{G}#$ giving the spacetime coordinates of another observer whenever the state of that other observer has the same relation to their own (and to the distribution of other matter in the universe). So in otherwise empty space, if the displacements in space and time and velocity changes are the same from $#O_{1}#$ to $#O_{1}’#$ as from $#O_{2}#$ to $#O_{2}’#$, then $#\mathscr{G}_{O_{1}O_{1}’}=\mathscr{G}_{O_{2}O_{2}’}#$. (Even though our local experience suggests that the geometry of spacetime is independent of what objects it contains, there is no reason in principle for that independence to hold exactly and universally – and it turns out that the relative positions of nearby large masses do in fact influence this geometry. But for now we’ll restrict our attention to cases where space is “close enough” to empty that we can neglect those effects.)

If the axis scales and directions of $#O#$ and $#O’#$ are defined so as to agree on the coordinates of a remote event when they are together and not moving relative to one another, then when they are separated in space or time but not moving relative to one another, the space and time coordinates of $#O#$ with respect to $#O’#$ are the negatives of those of $#O’#$ with respect to $#O#$, and if there is a rotation which takes the spatial axes for $#O#$ to those of $#O’#$ its inverse of course does the opposite. And when they are in relative motion, if observer $#O#$ sees $#O’#$ moving with velocity $#v#$, then $#O’#$ sees $#O#$ as moving with velocity $#-v#$.

[In a universe with just two separated observers, it might be natural for each to define the preferred direction, or x-axis, in space as pointing towards the other. In this perfectly symmetrical situation each would assign the same number to the velocity of the other, and each would have the same rule for finding the coordinates used by the other, so repeating that rule should give back the original coordinates. But if those observers then come together they will see that they have assigned opposite x coordinates to any remote event.]

From now on we’ll restrict to the case where both frameworks coincide at $#t=t’=0#$ with $#x=x’=0#$ and the only difference between $#O#$ and $#O’#$ is a relative motion at constant velocity. We will also assume that both observers identify the $#x#$-axis direction as being that of the motion of $#O’#$ relative to $#O#$ and will ignore the perpendicular directions.

In that situation we’ll use $#\mathscr{G}_{v}#$ to denote the special case of $#\mathscr{G}_{OO’}#$ and so we have $#\mathscr{G}_{v}^{-1}=\mathscr{G}_{-v}#$.

Even if the transformation rules aren’t linear in $#x#$ and $#t#$, the same relation applies to their linear approximations (ie Jacobian matrices of derivatives), which I will denote by the corresponding Greek letter, so we have $#\Gamma_{v}^{-1}=\Gamma_{-v}#$, where $$\Gamma_{v}=\begin{pmatrix}\gamma_{00_{v}}&\gamma_{01_{v}}\\\gamma_{10_{v}}&\gamma_{11_{v}}\end{pmatrix}=\begin{pmatrix}\gamma_{v}&\delta_{v}\\\epsilon_{v}&\alpha_{v}\end{pmatrix}$$

$$\text{with }\gamma_{ij_{v}}=\frac{\partial x’_{i}}{\partial x_{j}}(\begin{pmatrix}0\\0\end{pmatrix})\text{ where }x_{0}=t\text{ and }x_{1}=x$$

If both observers measure space and time from the event where they coincide (ie $#x=x’=0#$ at $#t=t’=0#$), then for $#O’#$ moving at velocity $#v#$ with respect to $#O#$ we have $$x’=0 \iff x=vt \hspace{5mm}\text{and}\hspace{5mm} x=0 \iff x’=-vt’$$ and if we just consider the linear approximation, then for an event at time $#t#$ and position $#x#$ for $#O#$, the coordinates $#x’#$ and $#t’#$ for $#O’#$ are related by a transformation $$\begin{pmatrix}t’\\x’\end{pmatrix}=\Gamma_{v}\begin{pmatrix}t\\x\end{pmatrix}=\begin{pmatrix}\gamma_{v}&\delta_{v}\\\epsilon_{v}&\alpha_{v}\end{pmatrix}\begin{pmatrix}t\\x\end{pmatrix}$$

[or equivalently $#\begin{pmatrix}x’\\t’\end{pmatrix}=\begin{pmatrix}\alpha_{v}&\epsilon_{v}\\\delta_{v}&\gamma_{v}\end{pmatrix}\begin{pmatrix}x\\t\end{pmatrix}#$]

For a linear function, the condition $$x’=0 \iff x=vt$$ forces $#x’=\epsilon_{v} t+\alpha_{v} x#$ to be a constant multiple of $#x-vt#$ where the constant (which here must be $#\alpha_{v}#$) is independent of $#x#$ and $#t#$ but possibly depends on $#v#$.

So $#\epsilon_{v}=-\alpha_{v} v#$.

And since $$\Gamma_{v}^{-1}=\frac{1}{\alpha_{v}\gamma_{v}-\epsilon_{v}\delta_{v}} \begin{pmatrix} \alpha_{v} & -\delta_{v} \\ -\epsilon_{v} & \gamma_{v} \end{pmatrix}$$ the condition $#\Gamma_{v}^{-1}=\Gamma_{-v}#$ can be written as $$\frac{1}{\alpha_{v}\gamma_{v}-\epsilon_{v}\delta_{v}} \begin{pmatrix} \alpha_{v} & -\delta_{v} \\ -\epsilon_{v} & \gamma_{v} \end{pmatrix}=\begin{pmatrix}\gamma_{-v}&\delta_{-v}\\\epsilon_{-v}&\alpha_{-v}\end{pmatrix}$$

So by the same argument which gave $#\epsilon_{v}=-\alpha_{v} v#$, the condition $#x=0 \iff x’=-vt’#$ forces $#x#$ to be a constant multiple of $#x’-(-v)t’=x’+vt’#$.

Since $#x=\epsilon_{-v} t’+\alpha_{-v} x’=\frac{\gamma_{v} x’-\epsilon_{v} t’}{\alpha_{v}\gamma_{v}-\epsilon_{v}\delta_{v}}#$, this gives $#\epsilon_{-v}=-\alpha_{-v} (-v)=\alpha_{-v} v#$. Reading $#\epsilon_{-v}#$ and $#\alpha_{-v}#$ off the inverse matrix (where the common determinant factor cancels), this becomes $#-\epsilon_{v}=\gamma_{v} v#$, which makes $#-\epsilon_{v} / \gamma_{v} =v#$ and so $#\gamma_{v}=-\epsilon_{v}/v=\alpha_{v}#$.
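As a rough numerical illustration of why the reciprocity condition pins down $#\alpha_{v}=\gamma_{v}#$, the sketch below follows the worldline $#x=0#$ of $#O#$ through a linear map with $#\epsilon_{v}=-\alpha_{v}v#$ and reports the velocity $#x’/t’#$ that $#O’#$ would assign to $#O#$. The function name and the sample values are our own, chosen only for illustration. The result should come out as $#-v#$ only when the two diagonal entries agree.

```python
def velocity_of_O_seen_by_O_prime(gamma_v, alpha_v, v):
    """Velocity x'/t' that O' assigns to O, whose worldline is x = 0.

    Hypothetical helper for illustration: gamma_v and alpha_v are the
    diagonal entries of the linearised transformation acting on (t, x).
    """
    epsilon_v = -alpha_v * v  # from the condition x' = 0 <=> x = v t
    # An event (t, 0) on O's worldline maps to (t', x') = (gamma_v*t, epsilon_v*t),
    # so the ratio x'/t' depends on neither t nor delta_v.
    return epsilon_v / gamma_v


v = 0.6  # arbitrary sample velocity
print(velocity_of_O_seen_by_O_prime(1.25, 1.25, v))  # equal diagonal entries
print(velocity_of_O_seen_by_O_prime(2.0, 1.0, v))    # unequal diagonal entries
```

With unequal diagonal entries the returned value is $#-(\alpha_{v}/\gamma_{v})v#$ rather than $#-v#$, which is exactly what the condition $#x=0 \iff x’=-vt’#$ rules out.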

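Also, comparing the top-left entries in the matrix form of $#\Gamma_{v}^{-1}=\Gamma_{-v}#$ gives $#\gamma_{-v}=\frac{\alpha_{v}}{\alpha_{v}\gamma_{v}-\epsilon_{v}\delta_{v}}#$ with $#\alpha_{v}=\gamma_{v}#$. So if we assume that this factor does not depend on the direction of the relative motion (ie that $#\gamma_{-v}=\gamma_{v}#$, as we would expect if space has no preferred direction), then we must have $$1=\alpha_{v}\gamma_{v}-\epsilon_{v}\delta_{v}=\gamma_{v}^{2}-\epsilon_{v}^{2}\frac{\delta_{v}}{\epsilon_{v}}=\gamma_{v}^{2}(1-v^{2}\frac{\delta_{v}}{\epsilon_{v}})$$ so $$\gamma_{v}=\frac{1}{\sqrt{1- v^{2}\frac{\delta_{v}}{\epsilon_{v}}}}$$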

If $#\delta_{v}=0#$ then $#\gamma_{v}=1#$ and vice versa, and in that case we get the traditional Galilean transformation which assumes that there is a universal time coordinate which is the same for all observers.
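To make the Galilean case concrete, here is a minimal sketch that builds the matrix with $#\gamma_{v}=\alpha_{v}=1#$, $#\delta_{v}=0#$ and $#\epsilon_{v}=-v#$ and checks the two conditions used above. The helper names and the sample numbers are assumptions of ours, not part of the derivation.

```python
def galilean(v):
    """2x2 matrix acting on (t, x) for a Galilean change of frame at velocity v."""
    # gamma = alpha = 1, delta = 0, epsilon = -v
    return [[1.0, 0.0],
            [-v, 1.0]]


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def apply(m, event):
    """Apply the matrix m to an event given as a (t, x) pair."""
    t, x = event
    return (m[0][0] * t + m[0][1] * x, m[1][0] * t + m[1][1] * x)


v = 3.0
# An event on the worldline x = v t of the moving observer O'.
t_prime, x_prime = apply(galilean(v), (2.0, 6.0))
print(t_prime, x_prime)                  # time is unchanged and x' vanishes
print(matmul(galilean(-v), galilean(v)))  # undoing the change of frame
```

The first line printed shows that the time coordinate is shared by both observers, and the second that the transformation for $#-v#$ is the inverse of the one for $#v#$.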

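But if $#\delta_{v}\neq 0#$, then we can write $$\gamma_{v}=\frac{1}{\sqrt{1- \frac{v^{2}}{c^{2}}+\mathscr{O}(v^4)}}$$ by defining $#c#$ as (the $#v#$-independent part of) $#\sqrt{\frac{\epsilon_{v}}{\delta_{v}}}#$.

If we take the ratio $#\frac{\epsilon_{v}}{\delta_{v}}#$ to be exactly $#c^{2}#$ for all $#v#$ (ie drop the higher-order terms), so that $#\delta_{v}=\epsilon_{v}/c^{2}=-v\gamma_{v}/c^{2}#$ and $#\gamma_{v}=1/\sqrt{1-v^{2}/c^{2}}#$, this gives $$\Gamma_{v}=\begin{pmatrix}\gamma_{v}&-\frac{v}{c^2}\gamma_{v}\\-v\gamma_{v}&\gamma_{v}\end{pmatrix}$$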

which is just the standard Lorentz transformation for a “boost” of velocity $#v#$.
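The properties that went into the derivation can be checked numerically for this matrix. The following is only a sketch under our own assumptions: units in which $#c=1#$ by default, an arbitrary sample velocity, and helper names invented for the example.

```python
import math


def lorentz_boost(v, c=1.0):
    """Matrix acting on (t, x) for a boost of velocity v, assuming |v| < c."""
    gamma = 1.0 / math.sqrt(1.0 - v * v / (c * c))
    return [[gamma, -v * gamma / (c * c)],
            [-v * gamma, gamma]]


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def determinant(m):
    """Determinant of a 2x2 matrix stored as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


v = 0.6  # sample velocity in units where c = 1
boost = lorentz_boost(v)

# Determinant condition: alpha*gamma - epsilon*delta should be 1.
print(determinant(boost))

# Reciprocity: the boost for -v should undo the boost for v (up to rounding).
print(matmul(lorentz_boost(-v), boost))

# The origin of O' (events with x = v t) should have x' = 0; here t = 1.
print(boost[1][0] * 1.0 + boost[1][1] * v)
```

Plain nested lists are used rather than a matrix library so that the example only needs the standard library.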

———————————–

Since the formula for $#\gamma_{v}#$ only makes sense for $#|v|<c#$, we should check that speeds greater than $#c#$ don’t arise when we combine these transformations.

For example if an observer $#O”#$ is observed by $#O’#$ to be moving at velocity $#w#$ when $#O’#$ is moving at velocity $#v#$ with respect to $#O#$, then how does $#O”#$ appear to be moving as seen by $#O#$?

$$\begin{align} \begin{pmatrix}t”\\x”\end{pmatrix}&=\Gamma_{w}\begin{pmatrix}t’\\x’\end{pmatrix}=\Gamma_{w}\Gamma_{v}\begin{pmatrix}t\\x\end{pmatrix}\\&=\begin{pmatrix}\gamma_{w}&-\frac{w}{c^2}\gamma_{w}\\-w\gamma_{w}&\gamma_{w}\end{pmatrix}\begin{pmatrix}\gamma_{v}&-\frac{v}{c^2}\gamma_{v}\\-v\gamma_{v}&\gamma_{v}\end{pmatrix}\begin{pmatrix}t\\x\end{pmatrix} \\&=\begin{pmatrix}\gamma_{w}\gamma_{v}(1+\frac{vw}{c^{2}})&-\frac{v+w}{c^2}\gamma_{w}\gamma_{v}\\-(v+w)\gamma_{w}\gamma_{v}&\gamma_{w}\gamma_{v}(1+\frac{vw}{c^{2}})\end{pmatrix} \begin{pmatrix}t\\x\end{pmatrix} \\&=\begin{pmatrix}\gamma_{v[+]w}&-\frac{v[+]w}{c^2}\gamma_{v[+]w}\\-(v[+]w)\gamma_{v[+]w}&\gamma_{v[+]w}\end{pmatrix}\begin{pmatrix}t\\x\end{pmatrix}=\Gamma_{v[+]w}\begin{pmatrix}t\\x\end{pmatrix}\end{align}$$

with $#v[+]w=\frac{v+w}{1+\frac{vw}{c^{2}}}#$.

Thus $#x”=0 \iff 0=-(v[+]w)\gamma_{v[+]w}t+\gamma_{v[+]w}x#$ which gives $#x=(v[+]w)t#$, and so the origin for $#O”#$ appears to $#O#$ to be moving with velocity $#v[+]w#$ (which is always less than $#c#$ if $#v#$ and $#w#$ are, and is equal to $#c#$ if either $#v#$ or $#w#$ is).
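The composition rule can also be checked numerically. The sketch below, again with $#c=1#$ by default and with sample velocities and helper names that are our own assumptions, multiplies two boost matrices and compares the product with a single boost at the combined velocity $#v[+]w#$.

```python
import math


def add_velocities(v, w, c=1.0):
    """Combined velocity v[+]w = (v + w) / (1 + v w / c^2)."""
    return (v + w) / (1.0 + v * w / (c * c))


def lorentz_boost(v, c=1.0):
    """Matrix acting on (t, x) for a boost of velocity v, assuming |v| < c."""
    gamma = 1.0 / math.sqrt(1.0 - v * v / (c * c))
    return [[gamma, -v * gamma / (c * c)],
            [-v * gamma, gamma]]


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


v, w = 0.6, 0.7                    # both below c = 1, though v + w exceeds it
u = add_velocities(v, w)           # velocity of O'' as seen by O
composed = matmul(lorentz_boost(w), lorentz_boost(v))
direct = lorentz_boost(u)

print(u)         # should stay below 1
print(composed)  # should agree with the next line up to rounding
print(direct)
```

Note that `lorentz_boost` itself cannot be evaluated at $#v=c#$, since the square root vanishes there, but `add_velocities` can, and combining $#c#$ with any smaller speed returns $#c#$ again.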