Abstract
The Hilbert space effect algebra is a fundamental mathematical structure which is used to describe unsharp quantum measurements in Ludwig’s formulation of quantum mechanics. Each effect represents a quantum (fuzzy) event. The relation of coexistence plays an important role in this theory, as it expresses when two quantum events can be measured together by applying a suitable apparatus. This paper’s first goal is to answer a very natural question about this relation, namely, when two effects are coexistent with exactly the same effects? The other main aim is to describe all automorphisms of the effect algebra with respect to the relation of coexistence. In particular, we will see that they can differ quite a lot from usual standard automorphisms, which appear for instance in Ludwig’s theorem. As a byproduct of our methods we also strengthen a theorem of Molnár.
Similar content being viewed by others
1 Introduction
1.1 On the classical mathematical formulation of quantum mechanics
Throughout this paper H will denote a complex, not necessarily separable, Hilbert space with dimension at least 2. In the classical mathematical formulation of quantum mechanics such a space is used to describe experiments at the atomic scale. For instance, the famous Stern–Gerlach experiment (which was one of the firsts showing the reality of the quantum spin) can be described using the two-dimensional Hilbert space \({\mathbb {C}}^2\). In the classical formulation of quantum mechanics, the space of all rank-one projections \({{\mathcal {P}}}_1(H)\) plays an important role, as its elements represent so-called quantum pure-states (in particular in the Stern–Gerlach experiment they represent the quantum spin). The so-called transition probability between two pure states \(P, Q \in {{\mathcal {P}}}_1(H)\) is the number \(\mathrm{tr}PQ\), where \(\mathrm{tr}\) denotes the trace. For the physical meaning of this quantity we refer the interested reader to e.g. [33]. A very important cornerstone of the mathematical foundations of quantum mechanics is Wigner’s theorem, which states the following.
Wigner’s Theorem
Given a bijective map \(\phi :{{\mathcal {P}}}_1(H)\rightarrow {{\mathcal {P}}}_1(H)\) that preserves the transition probability, i.e. \(\mathrm{tr}\phi (P)\phi (Q) = \mathrm{tr}PQ\) for all \(P,Q\in {{\mathcal {P}}}_1(H)\), one can always find either a unitary, or an antiunitary operator \(U:H\rightarrow H\) that implements \(\phi \), i.e. we have \(\phi (P) = UPU^*\) for all \(P\in {{\mathcal {P}}}_1(H)\).
For an elementary proof see [11]. As explained thoroughly by Simon in [29], this theorem plays a crucial role (together with Stone’s theorem and some representation theory) in obtaining the general time-dependent Schrödinger equation that describes quantum systems evolving in time (and which is usually written in the form \(i \hslash \tfrac{d}{dt} |\Psi (t)\rangle = {\hat{H}} |\Psi (t)\rangle \), where \(\hslash \) is the reduced Planck constant, \({\hat{H}}\) is the Hamiltonian operator, and \(|\Psi (t)\rangle \) is the unit vector that describes the system at time t).
One of the main objectives of quantum mechanics is the study of measurement. In the classical formulation an observable (such as the position/momentum of a particle, or a component of a particle’s spin) is represented by a self-adjoint operator. Equivalently, we could say that an observable is represented by a projection-valued measure \(E:{{\mathcal {B}}}_{\mathbb {R}}\rightarrow {{\mathcal {P}}}(H)\) (i.e. the spectral measure of the representing self-adjoint operator), where \({{\mathcal {B}}}_{\mathbb {R}}\) denotes the set of all Borel sets in \({\mathbb {R}}\) and \({{\mathcal {P}}}(H)\) the space of all projections (also called sharp effects) acting on H. If \(\Delta \) is a Borel set, then the quantum event that we get a value in \(\Delta \) corresponds to the projection \(E(\Delta )\). However, this mathematical formulation of observables implicitly assumes that measurements are perfectly accurate, which is far from being the case in real life. This was the crucial thought which led Ludwig to give an alternative axiomatic formulation of quantum mechanics which was introduced in his famous books [18] and [19].
1.2 On Ludwig’s mathematical formulation of quantum mechanics
This paper is related to Ludwig’s formulation of quantum mechanics, more precisely, we shall examine one of the theory’s most important relations, called coexistence (see the definition later). The main difference compared to the classical formulation is that (due to the fact that no perfectly accurate measurement is possible in practice) quantum events are not sharp, but fuzzy. Therefore, according to Ludwig, a quantum event is not necessarily a projection, but rather a self-adjoint operator whose spectrum lies in [0, 1]. Such an operator is called an effect, and the set of all such operators is called the Hilbert space effect algebra, or simply the effect algebra, which will be denoted by \({{\mathcal {E}}}(H)\). Clearly, we have \({{\mathcal {P}}}(H)\subset {{\mathcal {E}}}(H)\). A fuzzy or unsharp quantum observable corresponds to an effect-valued measure on \({{\mathcal {B}}}_{\mathbb {R}}\), which is often called a normalised positive operator-valued measure, see e.g. [13] for more details on this. We point out that the role of effects and positive operator-valued measures was already emphasised in the earlier book [8] of Davies. For some of the subsequent contributions to the theory we refer the reader to the work of Kraus [17] and the recent book of Busch–Lahti–Pellonpää–Ylinen [3].
Let us point out that, contradicting to its name, \({{\mathcal {E}}}(H)\) is obviously not an actual algebra. There are a number of operations and relations on the effect algebra that are relevant in mathematical physics. First of all, the usual partial order \(\le \), defined by \(A\le B\) if and only if \(\langle Ax, x \rangle \le \langle B x , x \rangle \) for all \(x \in H\), expresses that the occurrence of the quantum event A implies the occurrence of B. We emphasise that \(({{\mathcal {E}}}(H),\le )\) is not a lattice, because usually there is no largest effect C whose occurrence implies both A and B (see [1, 25, 31] for more details on this). Note that, as can be easily shown, we have \({{\mathcal {E}}}(H) = \{A \in {{\mathcal {B}}}(H) :A=A^*, 0\le A\le I\}\), where \({{\mathcal {B}}}(H)\) denotes the set of all bounded operators on H, \(A^*\) the adjoint of A, and I the identity operator. Hence sometimes the literature refers to \({{\mathcal {E}}}(H)\) as the operator interval [0, I].
Second, the so called ortho-complementation \(\perp \) is defined by \(A^\perp = I - A\), and it can be thought of as the complement event (or negation) of A, i.e. A occurs if and only if \(A^\perp \) does not.
We are mostly interested in the relation of coexistence. Ludwig called two effects coexistent if they can be measured together by applying a suitable apparatus. In the language of mathematics (see [18, Theorem IV.1.2.4]), this translates into the following definition:
Definition 1.1
\(A, B \in {{\mathcal {E}}}(H)\) are said to be coexistent, in notation \(A \sim B\), if there are effects \(E,F,G \in {{\mathcal {E}}}(H)\) such that
We point out that in the earlier work [8] Davies examined the simultaneous measurement of unsharp position and momentum, which is closely related to coexistence. It is apparent from the definition that coexistence is a symmetric relation. Although it is not trivial from the above definition, two sharp effects \(P,Q \in {{\mathcal {P}}}(H)\) are coexistent if and only if they commute (see Sect. 2), which corresponds to the classical formulation. We will denote the set of all effects that are coexistent with \(A\in {{\mathcal {E}}}(H)\) by
and more generally, if \({\mathcal {M}} \subset {{\mathcal {E}}}(H)\), then \({\mathcal {M}}^\sim := \cap \{ A^\sim :A\in {\mathcal {M}}\}\).
The relation of order in \({{\mathcal {E}}}(H)\) is fairly well-understood. However, the relation of coexistence is very poorly understood. In the case of qubit effects (i.e. when \(\dim H=2\)) the recent papers of Busch–Schmidt [4], Stano–Reitzner–Heinosaari [30] and Yu–Liu–Li–Oh [36] provide some (rather complicated) characterisations of coexistence. Although there are no similar results in higher dimensions, it was pointed out by Wolf–Perez–Garcia–Fernandez in [35] that the question of coexistence of pairs of effects can be phrased as a so-called semidefinite program, which is a manageable numerical mathematical problem. We also mention that Heinosaari–Kiukas–Reitzner in [14] generalised the qubit coexistence characterisation to pairs of effects in arbitrary dimensions that belong to the von Neumann algebra generated by two projections.
To illustrate how poorly the relation of coexistence is understood, we note that the following very natural question has not been answered before—not even for qubit effects:
What does it mean for two effects A and B to be coexistent with exactly the same effects?
As our first main result we answer this very natural question. Namely, we will show the following theorem, where \({{\mathcal {F}}}(H)\) and \(\mathcal {SC}(H)\) denote the set off all finite-rank and scalar effects on H, respectively.
Theorem 1.1
For any effects \(A,B\in {{\mathcal {E}}}(H)\) the following are equivalent:
-
(i)
\(B\in \{A, A^\perp \}\) or \(A,B \in \mathcal {SC}(H)\),
-
(ii)
\(A^\sim = B^\sim \).
Moreover, if H is separable, then the above statements are also equivalent to
-
(iii)
\(A^\sim \cap {{\mathcal {F}}}(H) = B^\sim \cap {{\mathcal {F}}}(H)\).
Physically speaking, the above theorem says that the (unsharp) quantum events A and B can be measured together with exactly the same quantum events if and only if they are the same, or they are each other’s negation, or both of them are scalar effects.
1.3 Automorphisms of \({{\mathcal {E}}}(H)\) with respect to two relations
Automorphisms of mathematical structures related to quantum mechanics are important to study because they provide the right tool to understand the time-evolution of certain quantum systems (see e.g. [18, Chapters V-VII] or [29]). In case when this mathematical structure is \({{\mathcal {E}}}(H)\), we call a map \(\phi :{{\mathcal {E}}}(H)\rightarrow {{\mathcal {E}}}(H)\) a standard automorphism of the effect algebra if there exists a unitary or antiunitary operator \(U:H\rightarrow H\) that (similarly to Wigner’s theorem) implements \(\phi \), i.e. we have
Obviously, standard automorphisms are automorphisms with respect to the relations of order:
of ortho-complementation:
and also of coexistence:
One of the fundamental theorems in the mathematical foundations of quantum mechanics states that every ortho-order automorphism is a standard automorphism, which was first stated by Ludwig.
Ludwig’s Theorem
(1954, Theorem V.5.23 in [18]). Let H be a Hilbert space with \(\dim H \ge 2\). Assume that \(\phi :{{\mathcal {E}}}(H) \rightarrow {{\mathcal {E}}}(H)\) is a bijective map satisfying (\(\le \)) and (\(\perp \)). Then \(\phi \) is a standard automorphism of \({{\mathcal {E}}}(H)\). Conversely, every standard automorphism satisfies (\(\le \)) and (\(\perp \)).
We note that Ludwig’s proof was incomplete and that he formulated his theorem under the additional assumption that \(\dim H \ge 3\). The reader can find a rigorous proof of this version for instance in [5]. Let us also point out that the two-dimensional case of Ludwig’s theorem was only proved in 2001 in [22].
It is very natural to ask whether the conclusion of Ludwig’s theorem remains true, if one replaces either (\(\le \)) by (\(\sim \)), or (\(\perp \)) by (\(\sim \)). Note that in light of Theorem 1.1, in the former case the condition (\(\perp \)) becomes almost redundant, except on \(\mathcal {SC}(H)\). However, as scalar effects are exactly those that are coexistent with every effect (see Sect. 2), this problem basically reduces to the characterisation of automorphisms with respect to coexistence only—which we shall consider later on.
In 2001, Molnár answered the other question affirmatively under the assumption that \(\dim H \ge 3\).
Molnár’s Theorem (2001, Theorem 1 in [20]). Let H be a Hilbert space with \(\dim H \ge 3\). Assume that \(\phi :{{\mathcal {E}}}(H) \rightarrow {{\mathcal {E}}}(H)\) is a bijective map satisfying ( \(\le \)) and ( \(\sim \)). Then \(\phi \) is a standard automorphism of \({{\mathcal {E}}}(H)\). Conversely, every standard automorphism satisfies ( \(\le \)) and ( \(\sim \)).
In this paper we shall prove the two-dimensional version of Molnár’s theorem.
Theorem 1.2
Assume that \(\phi :{{\mathcal {E}}}({\mathbb {C}}^2) \rightarrow {{\mathcal {E}}}({\mathbb {C}}^2)\) is a bijective map satisfying (\(\le \)) and (\(\sim \)). Then \(\phi \) is a standard automorphism of \({{\mathcal {E}}}({\mathbb {C}}^2)\). Conversely, every standard automorphism satisfies (\(\le \)) and (\(\sim \)).
Note that Molnár used the fundamental theorem of projective geometry to prove the aforementioned result, therefore his proof indeed works only if \(\dim H \ge 3\). Here, as an application of Theorem 1.1, we shall give an alternative proof of Molnár’s theorem that does not use this dimensionality constraint, hence fill this dimensionality gap in. More precisely, we will reduce Molnár’s theorem and Theorem 1.2 to Ludwig’s theorem (see the end of Sect. 2).
1.4 Automorphisms of \({{\mathcal {E}}}(H)\) with respect to only one relation
It is certainly a much more difficult problem to describe the general form of automorphisms with respect to only one relation. Of course, here we mean either order preserving (\(\le \)), or coexistence preserving (\(\sim \)) maps, as it is easy (and not at all interesting) to describe bijective transformations that satisfy (\(\perp \)). It has been known for quite some time that automorphisms with respect to the order relation on \({{\mathcal {E}}}(H)\) may differ a lot from standard automorphisms, although they are at least always continuous with respect to the operator norm. We do not state the related result here, but only mention that the answer finally has been given by the second author in [26, Corollary 1.2] (see also [28]).
The other main purpose of this paper is to give the characterisation of all automorphisms of \({{\mathcal {E}}}(H)\) with respect to the relation of coexistence. As can be seen from our result below, these maps can also differ a lot from standard automorphisms, moreover, unlike in the case of (\(\le \)) they are in general not even continuous.
Theorem 1.3
Let H be a Hilbert space with \(\dim H \ge 2\), and \(\phi :{{\mathcal {E}}}(H) \rightarrow {{\mathcal {E}}}(H)\) be a bijective map that satisfies (\(\sim \)). Then there exists a unitary or antiunitary operator \(U:H\rightarrow H\) and a bijective map \(g:[0,1] \rightarrow [0,1]\) such that we have
and
Conversely, every map of the above form preserves coexistence in both directions.
Observe that in the above theorem if we assume that our automorphism is continuous with respect to the operator norm, then up to unitary-antiunitary equivalence we obtain that \(\phi \) is either the identity map, or the ortho-complementation: \(A\mapsto A^\perp \). Also note that the converse statement of the theorem follows easily by Theorem 1.1. As we mentioned earlier, the description of all automorphisms with respect to (\(\sim \)) and (\(\perp \)) now follows easily, namely, we get the same conclusion as in the above theorem, except that now g further satisfies \(g(1-t) = 1-g(t)\) for all \(0\le t\le 1\).
1.5 Quantum mechanical interpretation of automorphisms of \({{\mathcal {E}}}(H)\)
In order to explain the above automorphism theorems’ physical interpretation, let us go back first to Wigner’s theorem. Assume there are two physicists who analyse the same quantum mechanical system using the same Hilbert space H, but possibly they might associate different rank-one projections to the same quantum (pure) state. However, we know that they always agree on the transition probabilities. Then according to Wigner’s theorem, there must be either a unitary, or an antiunitary operator with which we can transform from one analysis into the other (like a ”coordinate transformation”).
For the interpretation of Ludwig’s theorem, let us say there are two physicists who analyse the same quantum fuzzy measurement, but they might associate different effects to the same quantum fuzzy event. If we at least know that both of them agree on which pairs of effects are ortho-complemented, and which effect is larger than the other (i.e. implies the occurrence of the other), then by Ludwig’s theorem there must exist either a unitary, or an antiunitary operator that gives us the way to transform from one analysis into the other.
As for the interpretation of our Theorem 1.3, if we only know that our physicists agree on which pairs of effects are coexistent (i.e. which pairs of quantum events can be measured together), then there is a map \(\phi \) satisfying (2) and (3) that transforms the first physicist’s analysis into the other’s.
1.6 The outline of the paper
In the next section we will prove our first main result, Theorem 1.1, and as an application, we prove Molnár’s theorem in an alternative way that works for qubit effects as well. This will be followed by Sect. 3 where we prove our other main result, Theorem 1.3, in the case when \(\dim H = 2\). Then in Sect. 4, using the two-dimensional case, we shall prove the general version of our result. Let us point out once more that, unless otherwise stated, H is not assumed to be separable. We will close our paper with some discussion on the qubit case and some open problems in Sects. 5–6.
2 Proofs of Theorems 1.1, 1.2, and Molnár’s Theorem
We start with some definitions. The symbol \({{\mathcal {P}}}(H)\) will stand for the set of all projections (idempotent and self-adjoint operators) on H, and \({{\mathcal {P}}}_1(H)\) will denote the set of all rank-one projections. The commutant of an effect A intersected with \({{\mathcal {E}}}(H)\) will be denoted by
and more generally, for a subset \({\mathcal {M}}\subset {{\mathcal {E}}}(H)\) we will use the notation \({\mathcal {M}}^c := \cap \{ A^c :A\in {\mathcal {M}} \}\). Also, we set \(A^{cc} := (A^c)^c\) and \({\mathcal {M}}^{cc} := ({\mathcal {M}}^c)^c\).
We continue with three known lemmas on the structure of coexistent pairs of effects that can all be found in [27]. The first two have been proved earlier, see [4, 21].
Lemma 2.1
For any \(A\in {{\mathcal {E}}}(H)\) and \(P \in {{\mathcal {P}}}(H)\) the following statements hold:
-
(a)
\(A^\sim = {{\mathcal {E}}}(H)\) if and only if \(A \in \mathcal {SC}(H)\),
-
(b)
\(P^\sim = P^c\),
-
(c)
\(A^c \subseteq A^\sim \).
Lemma 2.2
Let \(A,B \in {{\mathcal {E}}}(H)\) so that their matrices are diagonal with respect to some orthogonal decomposition \(H = \oplus _{i\in \mathcal {I}} H_i\), i.e. \(A = \oplus _{i\in \mathcal {I}} A_i\) and \(B = \oplus _{i\in \mathcal {I}} B_i \in {{\mathcal {E}}}(\oplus _{i\in \mathcal {I}} H_i)\). Then \(A\sim B\) if and only if \(A_i\sim B_i\) for all \(i\in \mathcal {I}\).
Lemma 2.3
Let \(A,B \in {{\mathcal {E}}}(H)\). Then the following are equivalent:
-
(i)
\(A \sim B\),
-
(ii)
there exist effects \(M,N \in {{\mathcal {E}}}(H)\) such that \(M \le A\), \(N \le I-A\), and \(M+N = B\).
We continue with a corollary of Lemma 2.1.
Corollary 2.4
For any effect A and projection \(P\in A^\sim \) we have \(P\in A^c\). In particular, we have \(A^\sim \cap {{\mathcal {P}}}(H) = A^c\cap {{\mathcal {P}}}(H)\).
Proof
Since coexistence is a symmetric relation, we obtain \(A\in P^\sim \), which implies \(AP=PA\). \(\square \)
The next four statements are easy consequences of Lemma 2.3, we only prove two of them.
Corollary 2.5
For any effect A we have \(A^\sim = \left( A^\perp \right) ^\sim \).
Corollary 2.6
Let \(A\in {{\mathcal {E}}}(H)\) such that either \(0\notin \sigma (A)\), or \(1\notin \sigma (A)\). Then there exists an \(\varepsilon >0\) such that \(\{C\in {{\mathcal {E}}}(H):C\le \varepsilon I\} \subseteq A^\sim \).
We recall the definition of the strength function of \(A\in {{\mathcal {E}}}(H)\):
see [2] for more details and properties.
Corollary 2.7
Assume that \(A\in {{\mathcal {E}}}(H)\), \(0 < t \le 1\), and \(P\in {{\mathcal {P}}}_1(H)\). Then the following conditions are equivalent:
-
(i)
\(A\sim tP\);
-
(ii)
$$\begin{aligned} t \le \Lambda (A,P) + \Lambda (A^\perp ,P). \end{aligned}$$(4)
Proof
By (ii) of Lemma 2.3 we have \(A\sim tP\) if and only if there exist \(t_1, t_2 \ge 0\) such that \(t = t_1 + t_2\), \(t_1 P \le A\) and \(t_2 P \le A^\perp \), which is of course equivalent to (4). \(\square \)
Corollary 2.8
Let \(A,B \in {{\mathcal {E}}}(H)\) such that \(A^\sim \subseteq B^\sim \). Assume that with respect to the orthogonal decomposition \(H = H_1 \oplus H_2\) the two effects have the following block-diagonal matrix forms:
Then we also have
In particular, if \(A^\sim = B^\sim \), then \(A_1^\sim = B_1^\sim \) and \(A_2^\sim = B_2^\sim \).
Proof
Let \(P_1\) be the orthogonal projection onto \(H_1\). By Lemma 2.2 we observe that
which immediately implies (5). \(\square \)
Next, we recall the Busch–Gudder theorem about the explicit form of the strength function, which we shall use frequently here. We also adopt their notation, so whenever it is important to emphasise that the range of a rank-one projection P is \({\mathbb {C}}\cdot x\) with some \(x\in H\) such that \(\Vert x\Vert =1\), we write \(P_x\) instead. Furthermore, the symbol \(A^{-1/2}\) denotes the algebraic inverse of the bijective restriction \(A^{1/2}|_{(\mathrm{Im\,}A)^-}:(\mathrm{Im\,}A)^-\rightarrow \mathrm{Im\,}(A^{1/2})\), where \(\cdot ^-\) stands for the closure of a set. In particular, for all \(x \in \mathrm{Im\,}(A^{1/2})\) the vector \(A^{-1/2} x\) is the unique element in \((\mathrm{Im\,}A)^-\) which \(A^{1/2}\) maps to x.
Busch–Gudder Theorem
(1999, Theorem 4 in [2]) For every effect \(A \in {{\mathcal {E}}}(H)\) and unit vector \(x\in H\) we have
We proceed with proving some new results which will be crucial in the proofs of our main theorems. The first lemma is probably well-known, but as we did not find it in the literature, we state and prove it here. Recall that WOT and SOT stand for the weak- and strong operator topologies, respectively.
Lemma 2.9
For any effect \(A\in {{\mathcal {E}}}(H)\), the set \(A^\sim \) is convex and WOT-compact, hence it is also SOT- and norm-closed. Moreover, if H is separable, then the subset \(A^\sim \cap {{\mathcal {F}}}(H)\) is SOT-dense, hence also WOT-dense, in \(A^\sim \).
Proof
Let \(t\in [0,1]\) and \(B_1,B_2\in A^\sim \). By Lemma 2.3 there are \(M_1,M_2,N_1,N_2\in {{\mathcal {E}}}(H)\) such that \(M_1+N_1 = B_1\), \(M_2+N_2 = B_2\), \(M_1\le A, N_1\le I-A\) and \(M_2\le A, N_2\le I-A\). Hence setting \(M = tM_1+(1-t)M_2\in {{\mathcal {E}}}(H)\) and \(N = tN_1+(1-t)N_2\in {{\mathcal {E}}}(H)\) gives \(M+N = tB_1+(1-t)B_2\) and \(M\le A, N\le I-A\), thus \(tB_1+(1-t)B_2\sim A\), so \(A^\sim \) is indeed convex.
Next, we prove that \(A^\sim \) is WOT-compact. Clearly, \({{\mathcal {E}}}(H)\) is WOT-compact, as it is a bounded WOT-closed subset of \({{\mathcal {B}}}(H)\) (see [7, Proposition IX.5.5]), therefore it is enough to show that \(A^\sim \) is WOT-closed. Let \(\{B_\nu \}_\nu \subseteq A^\sim \) be an arbitrary net that WOT-converges to B, we shall show that \(B\sim A\) holds. For every \(\nu \) we can find two effects \(M_\nu \) and \(N_\nu \) such that \(M_\nu +N_\nu = B_\nu \), \(M_\nu \le A\) and \(N_\nu \le I-A\). By WOT-compactness of \({{\mathcal {E}}}(H)\), there exists a subnet \(\{B_\xi \}_\xi \) such that \(M_\xi \rightarrow M\) in WOT with some effect M. Again, by WOT-compactness of \({{\mathcal {E}}}(H)\), there exists a subnet \(\{B_\eta \}_\eta \) of the subnet \(\{B_\xi \}_\xi \) such that \(N_\eta \rightarrow N\) in WOT with some effect N. Obviously we also have \(B_\eta \rightarrow B\) and \(M_\eta \rightarrow M\) in WOT. Therefore we have \(M+N = B\) and by definition of WOT convergence we also obtain \(M\le A\), \(N\le I-A\), hence indeed \(B\sim A\). Closedness with respect to the other topologies is straightforward.
Concerning our last statement for separable spaces, first we point out that for every effect C there exists a net of finite rank effects \(\{C_\nu \}_\nu \) such that \(C_\nu \le C\) holds for all \(\nu \) and \(C_\nu \rightarrow C\) in SOT. Denote by \(E_C\) the projection-valued spectral measure of C, and set \(C_n = \sum _{j=0}^{n} \frac{j}{n} E_C\left( \left[ \frac{j}{n},\frac{j+1}{n}\right) \right) \) for every \(n\in {\mathbb {N}}\). Clearly, each \(C_n\) has finite spectrum, satisfies \(C_n\le C\), and \(\Vert C_n - C\Vert \rightarrow 0\) as \(n\rightarrow \infty \). For each spectral projection \(E_C\left( \left[ \frac{j}{n},\frac{j+1}{n}\right) \right) \) we can take a sequence of finite-rank projections \(\{P_k^{j,n}\}_{k=1}^\infty \) such that \(P_k^{j,n}\le E_C\left( \left[ \frac{j}{n},\frac{j+1}{n}\right) \right) \) for all k and \(P_k^{j,n} \rightarrow E_C\left( \left[ \frac{j}{n},\frac{j+1}{n}\right) \right) \) in SOT as \(k\rightarrow \infty \). Define \(C_{n,k} := \sum _{j=0}^{n} \frac{j}{n} P_k^{j,n}\). It is apparent that \(C_{n,k} \le C_n\) for all n and k, and that for each n we have \(C_{n,k} \rightarrow C_n\) in SOT as \(k\rightarrow \infty \). Therefore the SOT-closure of \(\{C_{n,k}:n,k\in {\mathbb {N}}\}\) contains each \(C_n\), hence also C, thus we can construct a net \(\{C_\nu \}_\nu \) with the required properties.
Now, let \(B\in A^\sim \) be arbitrary, and consider two other effects \(M,N\in {{\mathcal {E}}}(H)\) that satisfy the conditions of Lemma 2.3 (ii). Set \(C:= M\oplus N \in {{\mathcal {E}}}(H\oplus H)\), and denote by \(E_M\) and \(E_N\) the projection-valued spectral measures of M and N, respectively. Clearly, \(E_C\left( \left[ \frac{j}{n},\frac{j+1}{n}\right) \right) = E_M\left( \left[ \frac{j}{n},\frac{j+1}{n}\right) \right) \oplus E_N\left( \left[ \frac{j}{n},\frac{j+1}{n}\right) \right) \) for each j and n. In the above construction we can choose finite-rank projections of the form \(P_k^{j,n} = Q_k^{j,n}\oplus R_k^{j,n} \in {{\mathcal {P}}}(H\oplus H)\) where \(Q_k^{j,n}, R_k^{j,n} \in {{\mathcal {P}}}(H)\), \(Q_k^{j,n}\le E_M\left( \left[ \frac{j}{n},\frac{j+1}{n}\right) \right) \) and \(R_k^{j,n}\le E_N\left( \left[ \frac{j}{n},\frac{j+1}{n}\right) \right) \) holds for all k, n. Then each element \(C_\nu \) of the convergent net is an orthogonal sum of the form \(M_\nu \oplus N_\nu \in {{\mathcal {E}}}(H\oplus H))\). It is apparent that \(M_\nu , N_\nu \in {{\mathcal {F}}}(H)\), \(M_\nu \le M\) and \(N_\nu \le N\) for all \(\nu \), and that \(M_\nu \rightarrow M\), \(N_\nu \rightarrow N\) holds in SOT. Therefore \(M_\nu +N_\nu \in {{\mathcal {F}}}(H)\cap A^\sim \) and \(M_\nu +N_\nu \) converges to \(M+N = B\) in SOT, the proof is complete. \(\square \)
We proceed to investigate when do we have the equation \(A^\sim = B^\sim \) for two effects A and B, which will take several steps. We will denote the set of all rank-one effects by \({{\mathcal {F}}}_1(H) := \{tP :P\in {{\mathcal {P}}}_1(H), 0<t\le 1\}\).
Lemma 2.10
Let \(H = H_1\oplus H_2\) be an orthogonal decomposition and assume that \(A, B\in {{\mathcal {E}}}(H)\) have the following matrix decompositions:
where \(\lambda _1, \lambda _2, \mu _1, \mu _2 \in [0,1]\), and \(I_1\) and \(I_2\) denote the identity operators on \(H_1\) and \(H_2\), respectively. Then the following are equivalent:
-
(i)
\(\Lambda (A,P) + \Lambda (A^\perp ,P) = \Lambda (B,P) + \Lambda (B^\perp ,P)\) holds for all \(P\in {{\mathcal {P}}}_1(H)\),
-
(ii)
\(A^\sim \cap {{\mathcal {F}}}_1(H) = B^\sim \cap {{\mathcal {F}}}_1(H)\),
-
(iii)
either \(\lambda _1=\lambda _2\) and \(\mu _1=\mu _2\), or \(\lambda _1=\mu _1\) and \(\lambda _2=\mu _2\), or \(\lambda _1 + \mu _1 = \lambda _2 + \mu _2 =1\).
Proof
The directions (iii)\(\Longrightarrow \)(ii)\(\iff \)(i) are trivial by Lemma 2.1 (a) and Corollaries 2.5, 2.7, so we shall only consider the direction (i)\(\Longrightarrow \)(iii). First, a straightforward calculation using the Busch–Gudder theorem gives the following for every \(x_1\in H_1, x_2\in H_2, \Vert x_1\Vert = \Vert x_2\Vert = 1\) and \(0\le \alpha \le \tfrac{\pi }{2}\):
where we use the interpretations \(\tfrac{1}{0} = \infty \), \(\tfrac{1}{\infty } = 0\), \(\infty \cdot 0 = 0\), \(\infty + \infty = \infty \), and \(\infty + a = \infty \), \(\infty \cdot a = \infty \) (\(a>0\)), in order to make the formula valid also for the case when \(\lambda _1 = 0\) or \(\lambda _2 = 0\). Clearly, (8) depends only on \(\alpha \), but not on the specific choices of \(x_1\) and \(x_2\). We define the following two functions
and
which are the same by our assumptions. By (8), for all \(0\le \alpha \le \tfrac{\pi }{2}\) we have
Next, we observe the following implications:
-
if \(\lambda _1 = \lambda _2\), then \(T_A(\alpha )\) is the constant 1 function,
-
if \(\lambda _1 = 0\) and \(\lambda _2 = 1\), then \(T_A(\alpha )\) is the characteristic function \(\chi _{\{0,\pi /2\}}(\alpha )\),
-
if \(\lambda _1 = 0\) and \(0< \lambda _2 < 1\), then \(T_A(\alpha )\) is continuous on \(\left[ 0,\tfrac{\pi }{2}\right) \), but has a jump at \(\tfrac{\pi }{2}\), namely \(\lim _{\alpha \rightarrow \tfrac{\pi }{2}-} T_A(\alpha ) = 1-\lambda _2\) and \(T_A(\tfrac{\pi }{2}) = 1\),
-
if \(\lambda _1 = 1\) and \(0< \lambda _2 < 1\), then \(T_A(\alpha )\) is continuous on \(\left[ 0,\tfrac{\pi }{2}\right) \), but has a jump at \(\tfrac{\pi }{2}\), namely \(\lim _{\alpha \rightarrow \tfrac{\pi }{2}-} T_A(\alpha ) = \lambda _2\) and \(T_A(\tfrac{\pi }{2}) = 1\),
-
if \(\lambda _1,\lambda _2 \in (0,1)\), then \(T_A(\alpha )\) is continuous on \(\left[ 0,\tfrac{\pi }{2}\right] \),
-
if \(\lambda _1 \ne \lambda _2\), then we have \(T_A(0) = T_A(\tfrac{\pi }{2}) = 1\) and \(T_A(\alpha ) < 1\) for all \(0< \alpha < \tfrac{\pi }{2}\).
All of the above statements are rather straightforward computations using the formula (9), let us only show the last one here. Clearly, \(T_A(0) = T_A(\tfrac{\pi }{2}) = 1\) is obvious. As for the other assertion, if \(\lambda _1,\lambda _2 \in (0,1)\), then we can use the strict version of the weighted harmonic-arithmetic mean inequality:
If \(\lambda _1 = 0< \lambda _2 < 1\), then we calculate in the following way:
The remaining cases are very similar.
The above observations together with Corollary 2.5 and (9) readily imply the following:
-
\(A\in \mathcal {SC}(H)\) if and only if \(B\in \mathcal {SC}(H)\),
-
\(A\in {{\mathcal {P}}}(H)\setminus \mathcal {SC}(H)\) if and only if \(B\in {{\mathcal {P}}}(H)\setminus \mathcal {SC}(H)\), in which case \(B \in \{A, A^\perp \}\),
-
there exists a \(P\in {{\mathcal {P}}}(H)\setminus \mathcal {SC}(H)\) and a \(t\in (0,1)\) with \(A\in \{tP, I-tP\}\) if and only if \(B\in \{tP, I-tP\}\),
-
\(\lambda _1,\lambda _2 \in (0,1)\) and \(\lambda _1\ne \lambda _2\) if and only if \(\mu _1, \mu _2 \in (0,1)\) and \(\mu _1\ne \mu _2\).
So what remained is to show that in the last case we further have \(B \in \{A, A^\perp \}\), which is what we shall do below.
Let us introduce the following functions:
and
Our aim is to prove that \({\mathcal {T}}_A(s) = {\mathcal {T}}_B(s)\) (\(s\in [0,1]\)) implies either \(\lambda _1=\mu _1\) and \(\lambda _2=\mu _2\), or \(\lambda _1 + \mu _1 = \lambda _2 + \mu _2 =1\). The derivative of \({\mathcal {T}}_A\) is
from which we calculate
Therefore, if we managed to show that the function
is injective on the set \(\Delta := \{(x,y)\in {\mathbb {R}}^{2}:0<y<x<1 \}\), then we are done (note that \(F(x,y) = F(1-x,1-y)\)). For this assume that with some \(c,d>0\) we have
or equivalently,
If we substitute \(u = \tfrac{x-y}{2}\) and \(v = \tfrac{x+y}{2}\), then we get
Now, considering the sum and difference of these two equations and manipulate them a bit gives
From these latter equations we conclude
which clearly implies that F is globally injective on \(\Delta \), and the proof is complete. \(\square \)
We have an interesting consequence in finite dimensions.
Corollary 2.11
Assume that \(2\le \dim H < \infty \) and \(A,B \in {{\mathcal {E}}}(H)\). Then the following are equivalent:
-
(i)
\(\Lambda (A, P) + \Lambda (A^\perp , P) = \Lambda (B, P) + \Lambda (B^\perp , P)\) for all \(P\in P_1(H)\),
-
(ii)
\(A^\sim \cap {{\mathcal {F}}}_1(H) = B^\sim \cap {{\mathcal {F}}}_1(H)\),
-
(iii)
either \(A,B \in \mathcal {SC}(H)\), or \(A=B\), or \(A=B^\perp \).
Proof
The directions (i)\(\iff \)(ii)\(\Longleftarrow \)(iii) are trivial, so we shall only prove the (ii)\(\Longrightarrow \)(iii) direction. First, let us consider the two-dimensional case. As we saw in the proof of Lemma 2.10, we have \(A^\sim \cap {{\mathcal {F}}}_1(H) = {{\mathcal {F}}}_1(H)\) if and only if A is a scalar effect (see the first set of bullet points there). Therefore, without loss of generality we may assume that none of A and B are scalar effects. Notice that by Lemma 2.1, A and B commute with exactly the same rank-one projections, hence A and B possess the forms in (7) with some one-dimensional subspaces \(H_1\) and \(H_2\), and an easy application of Lemma 2.10 gives (iii).
As for the general case, since again A and B commute with exactly the same rank-one projections, we can jointly diagonalise them with respect to some orthonormal basis \(\{e_j\}_{j=1}^n\), where \(n = \dim H\):
Of course, for any two distinct \(i, j \in \{1,\dots , n\}\) we have the following equation for the strength functions:
which instantly implies
By the two-dimensional case this means that we have one of the following cases:
-
\(\lambda _i = \lambda _j\) and \(\mu _i = \mu _j\),
-
\(\lambda _i \ne \lambda _j\) and either \(\mu _i = \lambda _i\) and \(\mu _j = \lambda _j\), or \(\mu _i = 1-\lambda _i\) and \(\mu _j = 1-\lambda _j\).
From here it is easy to conclude (iii). \(\square \)
The commutant of an operator \(T\in {{\mathcal {B}}}(H)\) will be denoted by \(T' := \{ S\in {{\mathcal {B}}}(H) :ST=TS \}\), and more generally, if \({\mathcal {M}}\subseteq {{\mathcal {B}}}(H)\), then we set \({\mathcal {M}}' := \cap \{ T' :T\in {\mathcal {M}}\}\). We shall use the notations \(T'' := (T')'\) and \({\mathcal {M}}'' := ({\mathcal {M}}')'\) for the double commutants.
Lemma 2.12
For any \(A,B \in {{\mathcal {E}}}(H)\) the following three assertions hold:
-
(a)
If \(A^\sim \subseteq B^\sim \), then \(B\in A''\).
-
(b)
If \(\dim H \le \aleph _0\) and \(A^\sim \subseteq B^\sim \), then there exists a Borel function \(f:[0,1] \rightarrow [0,1]\) such that \(B = f(A)\).
-
(c)
If B is a convex combination of \(A, A^\perp , 0\) and I, then \(A^\sim \subseteq B^\sim \).
Proof
(a): Assume that \(C\in A'\). Our goal is to show \(B\in C'\). We express C in the following way:
where \(C_{\mathfrak {R}}\) and \(C_{\mathfrak {I}}\) are self-adjoint (they are usually called the real and imaginary parts of C). Since A is self-adjoint, \(C^* \in A'\), hence \(C_{\mathfrak {R}}, C_{\mathfrak {I}} \in A'\). Let \(E_{\mathfrak {R}}\) and \(E_{\mathfrak {I}}\) denote the projection-valued spectral measures of \(C_{\mathfrak {R}}\) and \(C_{\mathfrak {I}}\), respectively. By the spectral theorem ([7, Theorem IX.2.2]), Lemma 2.1 and Corollary 2.4, we have \(E_{\mathfrak {R}}(\Delta ), E_{\mathfrak {I}}(\Delta ) \in A^c \subseteq A^\sim \subseteq B^\sim \), therefore also \(E_{\mathfrak {R}}(\Delta ), E_{\mathfrak {I}}(\Delta ) \in B'\) for all \(\Delta \in {{\mathcal {B}}}_{\mathbb {R}}\), which gives \(C\in B'\).
(b): This is an easy consequence of [7, Proposition IX.8.1 and Lemma IX.8.7].
(c): If \(A\sim C\), then also \(A^\perp , 0\) and \(I\sim C\). Hence by the convexity of \(C^\sim \) we obtain \(B\sim C\). \(\square \)
Now, we are in the position to prove our first main result.
Proof of Theorem 1.1
If H is separable, then the equivalence (ii)\(\iff \)(iii) is straightforward by Lemma 2.9. For general H the direction (i)\(\Longrightarrow \)(ii) is obvious, therefore we shall only prove (ii)\(\Longrightarrow \)(i), first in the separable, and then in the general case. By Lemma 2.1, we may assume throughout the rest of the proof that A and B are non-scalar effects. We will denote the spectral subspace of a self-adjoint operator T associated to a Borel set \(\Delta \subseteq {\mathbb {R}}\) by \(H_T(\Delta )\).
(ii)\(\Longrightarrow \)(i) in the separable case: We split this part into two steps.
STEP 1: Here, we establish two estimations, (11) and (12), for the strength functions of A and B on certain subspaces of H. Let \(\lambda _1, \lambda _2 \in \sigma (A), \lambda _1 \ne \lambda _2\) and \(0< \varepsilon < \tfrac{1}{2}|\lambda _1-\lambda _2|\). Then the spectral subspaces \(H_1 = H_A\left( (\lambda _1-\varepsilon , \lambda _1+\varepsilon ) \right) \) and \(H_2 = H_A\left( (\lambda _2-\varepsilon , \lambda _2+\varepsilon ) \right) \) are non-trivial and orthogonal. Set \(H_3\) to be the orthogonal complement of \(H_1\oplus H_2\), then the matrix of A written in the orthogonal decomposition \(H = H_1\oplus H_2 \oplus H_3\) is diagonal:
Note that \(H_3\) might be a trivial subspace. Since by Corollary 2.4A and B commute with exactly the same projections, the matrix of B in \(H = H_1\oplus H_2 \oplus H_3\) is also diagonal:
At this point, let us emphasise that of course \(H_j, A_j\) and \(B_j\) (\(j=1,2,3\)) all depend on \(\lambda _1, \lambda _2\) and \(\varepsilon \), but in order to keep our notation as simple as possible, we will stick with these symbols. However, if at any point it becomes important to point out this dependence, we shall use for instance \(B_j^{(\lambda _1, \lambda _2, \varepsilon )}\) instead of \(B_j\). Similar conventions apply later on.
Observe that by Corollary 2.8 we have
Now, we pick two arbitrary points \(\mu _1\in \sigma (B_1)\) and \(\mu _2\in \sigma (B_2)\). Then obviously, the following two subspaces are non-zero subspaces of \(H_1\) and \(H_2\), respectively:
Similarly as above, we have the following matrix forms where \({\check{H}}_j = H_j \ominus {\widehat{H}}_j\) \((j=1,2)\):
and
Note that \({\check{H}}_1\) or \({\check{H}}_2\) might be trivial subspaces. Again by Corollary 2.8, we have
Let us point out that by construction \(\sigma ({\widehat{A}}_j) \subseteq [\lambda _j-\varepsilon ,\lambda _j+\varepsilon ]\) and \(\sigma ({\widehat{B}}_j) \subseteq [\mu _j-\varepsilon ,\mu _j+\varepsilon ]\). Corollary 2.7 gives the following identity for the strength functions, where \({\widehat{I}}_j\) denotes the identity on \({\widehat{H}}_j\) \((j=1,2)\):
Define
and notice that we have the following two estimations for all rank-one projections P:
and
Note that the above estimations hold for any arbitrarily small \(\varepsilon \) and for all suitable choices of \(\mu _1\) and \(\mu _2\) (which of course depend on \(\varepsilon \)).
STEP 2: Here we show that \(B\in \{A,A^\perp \}\). Let us define the following set that depends only on \(\lambda _j\):
Notice that as this set is an intersection of monotonically decreasing (as \(\varepsilon \searrow 0\)), compact, non-empty sets, it must contain at least one element. Also, observe that if \(\mu _1\in {\mathcal {C}}_1\) and \(\mu _2\in {\mathcal {C}}_2\), then (11) and (12) hold for all \(\varepsilon >0\).
We proceed with proving that either \({\mathcal {C}}_1 = \{\lambda _1\}\) and \({\mathcal {C}}_2 = \{\lambda _2\}\), or \({\mathcal {C}}_1 = \{1-\lambda _1\}\) and \({\mathcal {C}}_2 = \{1-\lambda _2\}\) hold. Fix two arbitrary elements \(\mu _1 \in {\mathcal {C}}_1\) and \(\mu _2 \in {\mathcal {C}}_2\), and assume that neither \(\lambda _1 = \mu _1\) and \(\lambda _2 = \mu _2\), nor \(\lambda _1 + \mu _1 = \lambda _2 + \mu _2 = 1\) hold. From here our aim is to get a contradiction. As we showed in the proof of Lemma 2.10, there exists an \(\alpha _0 \in \left( 0,\tfrac{\pi }{2}\right) \) such that we have
where we interpret both sides as in (8). Notice that both summands on both sides depend continuously on \(\lambda _1,\lambda _2,\mu _1\) and \(\mu _2\). Therefore there exists an \(\varepsilon >0\) small enough and a rank-one projection \(P = P_{\cos \alpha _0 {\widehat{x}}_1+\sin \alpha _0 {\widehat{x}}_2}\), with \({\widehat{x}}_1\in {\widehat{H}}_1, {\widehat{x}}_2\in {\widehat{H}}_2, \Vert {\widehat{x}}_1\Vert =\Vert {\widehat{x}}_2\Vert =1\), such that the closed intervals bounded by the right- and left-hand sides of (11), and those of (12) are disjoint—which is a contradiction.
Observe that as we can do the above for any two disjoint elements of the spectrum \(\sigma (A)\), we can conclude that one of the following possibilities occur:
or
From here, we show that (13) implies \(A=B\), and (14) implies \(B=A^\perp \). As the latter can be reduced to the case (13), by considering \(B^\perp \) instead of B, we may assume without loss of generality that (13) holds. By Lemma 2.12 and [7, Theorem IX.8.10], there exists a function \(f\in L^\infty (\mu )\), where \(\mu \) is a scalar-valued spectral measure of A, such that \(B = f(A)\). Moreover, we have \(B = A\) if and only if \(f(\lambda ) = \lambda \) \(\mu \)-a.e, so we only have to prove the latter equation. Let us fix an arbitrarily small number \(\delta > 0\). By the spectral mapping theorem ( [7, Theorem IX.8.11]) and (13) we notice that for every \(\lambda \in \sigma (A)\) there exists an \(0< \varepsilon _{\lambda } < \delta \) such that
where \(\mu -\mathrm {essran}\) denotes the essential range of a function with respect to \(\mu \) (see [7, Example IX.2.6]). Now, for every \(\lambda \in \sigma (A)\) we fix such an \(\varepsilon _{\lambda }\). Clearly, the intervals \(\{(\lambda -\varepsilon _{\lambda },\lambda +\varepsilon _{\lambda }) :\lambda \in \sigma (A)\}\) cover the whole spectrum \(\sigma (A)\), which is a compact set. Therefore we can find finitely many of them, let’s say \(\lambda _1, \dots , \lambda _n\) so that
Finally, we define the function
By definition we have \(\Vert h - \mathrm {id}_{\sigma (A)} \Vert _{\infty } \le \delta \) where the \(\infty \)-norm is taken with respect to \(\mu \) and \(\mathrm {id}_{\sigma (A)}(\lambda ) = \lambda \) \((\lambda \in \sigma (A))\). But notice that by (15) we also have \(\Vert h - f \Vert _{\infty } \le \delta \), and hence \(\Vert f - \mathrm {id}_{\sigma (A)} \Vert _\infty \le 2\delta \). As this inequality holds for all positive \(\delta \), we actually get that \(f(\lambda ) = \lambda \) for \(\mu \)-a.e. \(\lambda \).
(ii)\(\Longrightarrow \)(i) in the non-separable case: It is well-known that there exists an orthogonal decomposition \(H = \oplus _{i\in \mathcal {I}} H_i\) such that each \(H_i\) is a non-trivial, separable, invariant subspace of A, see for instance [7, Proposition IX.4.4]. Since A and B commute with exactly the same projections, both are diagonal with respect to the decomposition \(H = \oplus _{i\in \mathcal {I}} H_i\):
By Corollary 2.8 we have \(A_i^\sim = B_i^\sim \) for all \(i\in \mathcal {I}\), therefore the separable case implies
Without loss of generality we may assume from now on that there exists an \(i_0\in \mathcal {I}\) so that \(A_{i_0}\) is not a scalar effect. (In case all of them are scalar, we simply combine two subspaces \(H_{i_1}\) and \(H_{i_2}\) so that \(\sigma (A_{i_1})\ne \sigma (A_{i_2})\)). This implies either \(A_{i_0} = B_{i_0}\), or \(B_{i_0} = A_{i_0}^\perp \). By considering \(B^\perp \) instead of B if necessary, we may assume from now on that \(A_{i_0} = B_{i_0}\) holds.
Finally, let \(i_1 \in \mathcal {I} \setminus \{i_0\}\) be arbitrary, and let us consider the orthogonal decomposition \(H = \oplus _{i\in \mathcal {I}\setminus \{i_0,i_1\}} H_i \oplus K\) where \(K = H_{i_0}\oplus H_{i_1}\). Similarly as above, we obtain either \(A_{i_0}\oplus A_{i_1} = B_{i_0} \oplus B_{i_1}\), or \(B_{i_0}\oplus B_{i_1} = A_{i_0}^\perp \oplus A_{i_1}^\perp \), but since \(A_{i_0} = B_{i_0}\), we must have \(A_{i_1} = B_{i_1}\). As this holds for arbitrary \(i_1\), the proof is complete. \(\square \)
Now, we are in the position to give an alternative proof of Molnár’s theorem which also extends to the two-dimensional case.
Proof of Theorem 1.2 and Molnár’s theorem
By (a) of Lemma 2.1 and (\(\sim \)) we obtain \(\phi (\mathcal {SC}(H)) = \mathcal {SC}(H)\), moreover, the property (\(\le \)) implies the existence of a strictly increasing bijection \(g:[0,1] \rightarrow [0,1]\) such that \(\phi (\lambda I) = g(\lambda ) I\) for every \(\lambda \in [0,1]\). By Theorem 1.1 we conclude
We only have to show that the same holds for scalar operators, because then the theorem is reduced to Ludwig’s theorem. For any effect A and any set of effects \({{\mathcal {S}}}\) let us define the following sets \(A^\le := \{B\in {{\mathcal {E}}}(H):A\le B\}\), \(A^\ge := \{B\in {{\mathcal {E}}}(H):A\ge B\}\) and \({{\mathcal {S}}}^\perp := \{B^\perp :B\in {{\mathcal {S}}}\}\). Observe that for any \(s,t\in [0,1]\) we have
if and only if \(t=1-s\) and \(s< \tfrac{1}{2}\). Thus for all \(s < \tfrac{1}{2}\) we obtain
which by (16) implies \(g(1-s) = 1-g(s)\) and \(g(s)< \tfrac{1}{2}\), therefore we indeed have (\(\perp \)) for every effect. \(\square \)
3 Proof of Theorem 1.3 in Two Dimensions
In this section we prove our other main theorem for qubit effects. In order to do that we need to prove a few preparatory lemmas. We start with a characterisation of rank-one projections in terms of coexistence.
Lemma 3.1
For any \(A\in {{\mathcal {E}}}({\mathbb {C}}^2)\) the following are equivalent:
-
(i)
there are no effects \(B\in {{\mathcal {E}}}({\mathbb {C}}^2)\) such that \(B^\sim \subsetneq A^\sim \),
-
(ii)
\(A \in {{\mathcal {P}}}_1({\mathbb {C}}^2)\).
Proof
The case when \(A\in \mathcal {SC}({\mathbb {C}}^2)\) is trivial, therefore we may assume otherwise throughout the proof.
(i)\(\Longrightarrow \)(ii): Suppose that \(A \notin {{\mathcal {P}}}_1({\mathbb {C}}^2)\), then by Corollary 2.6 there exists an \(\varepsilon >0\) such that \(\{C\in {{\mathcal {E}}}({\mathbb {C}}^2) :C \le \varepsilon I\} \subseteq A^\sim \). Let \(B\in {{\mathcal {P}}}_1({\mathbb {C}}^2)\cap A^c\), then we have \(B^\sim = B^c = A^c\subseteq A^\sim \). But it is very easy to find a \(C\in {{\mathcal {E}}}({\mathbb {C}}^2)\) such that \(C \le \varepsilon I\) and \(C\notin B^c\), therefore we conclude \(B^\sim \subsetneq A^\sim \).
(ii)\(\Longrightarrow \)(i): If \(A \in {{\mathcal {P}}}_1({\mathbb {C}}^2)\), \(B\in {{\mathcal {E}}}({\mathbb {C}}^2)\) and \(B^\sim \subsetneq A^\sim \), then also \(B^c\subsetneq A^c\), which is impossible. \(\square \)
Note that the above statement does not hold in higher dimensions, see the final section of this paper for more details. We continue with a characterisation of rank-one and ortho-rank-one qubit effects in terms of coexistence.
Lemma 3.2
Let \(A\in {{\mathcal {E}}}({\mathbb {C}}^2)\setminus \mathcal {SC}({\mathbb {C}}^2)\). Then the following are equivalent:
-
(i)
A or \(A^\perp \in {{\mathcal {F}}}_1({\mathbb {C}}^2)\setminus {{\mathcal {P}}}_1({\mathbb {C}}^2)\),
-
(ii)
There exists at least one \(B\in {{\mathcal {E}}}({\mathbb {C}}^2)\) such that \(B^\sim \subsetneq A^\sim \), and for every such pair of effects \(B_1, B_2\) we have either \(B_1^\sim \subseteq B_2^\sim \), or \(B_2^\sim \subseteq B_1^\sim \).
Moreover, if (i) holds, i.e. A or \(A^\perp = tP\) with \(P\in {{\mathcal {P}}}_1({\mathbb {C}}^2)\) and \(0<t<1\), then we have \(B^\sim \subseteq A^\sim \) if and only if B or \(B^\perp = sP\) with some \(t\le s\le 1\).
Proof
First, notice that by Theorem 1.1 and Lemma 2.12 (c) we have
(i)\(\Longrightarrow \)(ii): If we have \(B^\sim \subseteq (tP)^\sim \) with some rank-one projection P, \(t \in (0,1]\) and qubit effect B, then by Lemma 2.12 (b) we obtain \(P\in B^c\) and \(B\notin \mathcal {SC}({\mathbb {C}}^2)\). Furthermore, since \(B^\sim \cap {{\mathcal {F}}}_1({\mathbb {C}}^2) \subseteq (tP)^\sim \cap {{\mathcal {F}}}_1({\mathbb {C}}^2)\), by Corollary 2.7 we obtain
where we use the notation from the proof of Lemma 2.10. Thus, the discontinuity of \(T_{tP}(\alpha )\) at either \(\alpha = 0\), or \(\alpha = \tfrac{\pi }{2}\), implies the discontinuity of \(T_B(\alpha )\) at the same \(\alpha \). Whence we conclude either \(B=sP\), or \(B=I-sP\) with some \(t\le s\le 1\).
(ii)\(\Longrightarrow \)(i): By Lemma 3.1, (ii) cannot hold for elements of \({{\mathcal {P}}}_1({\mathbb {C}}^2)\), so we only have to check that if \(A, A^\perp \notin {{\mathcal {F}}}_1({\mathbb {C}}^2) \cup \mathcal {SC}({\mathbb {C}}^2)\), then (ii) fails. Suppose that the spectral decomposition of A is \(\lambda _1 P + \lambda _2 P^\perp \) where \(1> \lambda _1>\lambda _2 > 0\). Then by Lemma 2.12 (c) we find that \(\left( \lambda _1 P\right) ^\sim \subseteq A^\sim \) and \(\left( (1-\lambda _2) P^\perp \right) ^\sim \subseteq A^\sim \) (see Figure 1), but by the previous part neither \((\lambda _1 P)^\sim \subseteq \left( (1-\lambda _2) P^\perp \right) ^\sim \), nor \(\left( (1-\lambda _2) P^\perp \right) ^\sim \subseteq (\lambda _1 P)^\sim \) holds. \(\square \)
For a visualisation of \((tP)^\sim \cap {{\mathcal {F}}}_1({\mathbb {C}}^2)\) see Sect. 5. Before we proceed with the proof of Theorem 1.3 for qubit effects, we need a few more lemmas about rank-one projections acting on \({\mathbb {C}}^2\).
Lemma 3.3
For all \(P, Q\in {{\mathcal {P}}}_1({\mathbb {C}}^2)\) we have
Proof
Since \(\mathrm{tr}(P-Q) = 0\), the eigenvalues of the self-adjoint operator \(P-Q\) are \(\lambda \) and \(-\lambda \) with some \(\lambda \ge 0\). Hence we have \(\Vert P-Q\Vert ^2 = -\det (P-Q)\). Applying a unitary similarity if necessary, we may assume without loss of generality that \((1,0) \in \mathrm{Im\,}P\). Obviously, there exist \(0\le \vartheta \le \tfrac{\pi }{2}\) and \(0\le \mu \le 2\pi \) such that \((\cos \vartheta , e^{i\mu }\sin \vartheta ) \in \mathrm{Im\,}Q\). Thus the matrix forms of P and Q in the standard basis are
and
where we used the notation of the Busch–Gudder theorem. Now, an easy calculation gives us \(\det (P-Q) = -\sin ^2\vartheta \) and \(\mathrm{tr}PQ = \cos ^2\vartheta \). Hence the second equation in (17) is proved, and the third one follows from \(\mathrm{tr}P^\perp Q = 1 - \mathrm{tr}PQ\). \(\square \)
For \(P\in {{\mathcal {P}}}_1({\mathbb {C}}^2)\) and \(s\in [0,1]\), let us use the following notation:
Next, we examine this set.
Lemma 3.4
For all \(P\in {{\mathcal {P}}}_1({\mathbb {C}}^2)\) the following statements are equivalent:
-
(i)
\(s = \sin \tfrac{\pi }{4}\),
-
(ii)
there exists an \(R\in {\mathcal {M}}_{P,s}\) such that \(R^\perp \in {\mathcal {M}}_{P,s}\),
-
(iii)
for all \(R\in {\mathcal {M}}_{P,s}\) we have also \(R^\perp \in {\mathcal {M}}_{P,s}\).
Proof
One could use the Bloch representation (see Sect. 5), however, let us give here a purely linear algebraic proof. Note that for any \(R_1, R_2 \in {{\mathcal {P}}}_1({\mathbb {C}}^2)\) we have \(\Vert R_1-R_2\Vert = 1\) if and only if \(R_2 = R_1^\perp \). Without loss of generality we may assume that P has the matrix form of (18). Then for any \(0\le \vartheta \le \tfrac{\pi }{2}\) and \(R_1, R_2 \in {\mathcal {M}}_{P,\sin \vartheta }\) we have
with some \(\mu _1,\mu _2\in {\mathbb {R}}\). Hence, we get
Notice that the right-hand side is always less than or equal to 1. Moreover, for any \(\mu _1\in {\mathbb {R}}\) there exist a \(\mu _2\in {\mathbb {R}}\) such that \(\Vert R_1-R_2\Vert =1\) if and only if \(\vartheta = \tfrac{\pi }{4}\). This completes the proof. \(\square \)
Lemma 3.5
Let \(P,Q\in {{\mathcal {P}}}_1({\mathbb {C}}^2)\) and \(s,t\in (0,1)\). Then the following are equivalent:
-
(i)
\(tP\sim sQ\)
-
(ii)
either \(Q=P\), or \(Q=P^\perp \), or
$$\begin{aligned} s \le \frac{1}{\tfrac{1}{1-t}\Vert P^\perp -Q\Vert ^2+\Vert P-Q\Vert ^2}. \end{aligned}$$
Proof
The case when \(Q\in \{P,P^\perp \}\) is trivial, so from now on we assume otherwise. Recall that two rank-one effects with different images are coexistent if and only if their sum is an effect, see [20, Lemma 2]. Therefore, (i) is equivalent to \(I-tP-sQ \ge 0\). Since \(\mathrm{tr}(I-tP-sQ) = 2-t-s > 0\), the latter is further equivalent to \(\det (I-tP-sQ) \ge 0\). Without loss of generality we may assume that P and Q have the matrix forms written in (18) and (19) with \(0<\vartheta <\tfrac{\pi }{2}\). Then a calculation gives
From the latter we get that \(\det (I-tP-sQ) \ge 0\) holds if and only if
which, by (17) is equivalent to (ii). \(\square \)
Note that we have
We need one more lemma.
Lemma 3.6
Let \(P,Q\in {{\mathcal {P}}}_1({\mathbb {C}}^2)\). Then there exists a projection \(R\in {{\mathcal {P}}}_1({\mathbb {C}}^2)\) such that
Proof
Again, one could use the Bloch representation, however, let us give here a purely linear algebraic proof. We may assume without loss of generality that P and Q are of the form (18) and (19). Then for any \(z\in {\mathbb {C}}\), \(|z|=1\) the rank-one projection
satisfies \(\Vert P-R\Vert = \sin \tfrac{\pi }{4}\). In order to complete the proof we only have to find a z with \(|z|=1\) such that \(\mathrm{tr}RQ = \tfrac{1}{2}\), which is an easy calculation. Namely, we find that \(z=ie^{i\mu }\) is a suitable choice. \(\square \)
Now, we are in the position to prove our second main result in the low-dimensional case.
Proof of Theorem 1.3 in two dimensions
The proof is divided into the following three steps:
-
1
we show some basic properties of \(\phi \), in particular, that it preserves commutativity in both directions,
-
2
we show that \(\phi \) maps pairs of rank-one projections with distance \(\sin \tfrac{\pi }{4}\) into pairs of rank-one projections with the same distance,
-
3
we finish the proof by examining how \(\phi \) acts on rank-one projections and rank-one effects.
STEP 1: First of all, the properties of \(\phi \) imply
and
Hence, it is straightforward from Lemma 2.1 that there exists a bijection \(g:[0,1] \rightarrow [0,1]\) such that
Also, by Lemma 3.1 we easily infer
thus, in particular, we get
By Theorem 1.1 we also obtain
Now, we observe that \(\phi \) preserves commutativity in both directions. Indeed we have the following for every \(A,B \in {{\mathcal {E}}}({\mathbb {C}}^2)\setminus \mathcal {SC}({\mathbb {C}}^2)\):
Note that we easily get the same conclusion using (20) if any of the two effects is a scalar effect.
Next, notice that Lemma 3.2 implies
Therefore, by interchanging the \(\phi \)-images of tP and \(I-tP\) for some \(0<t<1\) and \(P\in {{\mathcal {P}}}_1({\mathbb {C}}^2)\), we may assume without loss of generality that
Hence we obtain the following for all rank-one projections P:
Thus, again by interchanging the \(\phi \)-images of P and \(P^\perp \) for some \(P\in {{\mathcal {P}}}_1({\mathbb {C}}^2)\), and using Lemma 3.2, we may assume without loss of generality that for every \(P\in {{\mathcal {P}}}_1({\mathbb {C}}^2)\) there exists a strictly increasing bijective map \(f_P:(0,1]\rightarrow (0,1]\) such that
STEP 2: We define the following set for any qubit effect of the form tP, \(0<t<1, P\in {{\mathcal {P}}}_1({\mathbb {C}}^2)\):
(For a visualisation of \(\ell _{tP}\) see Sect. 5.) Using Lemma 3.5 we see that
By the properties of \(\phi \) we obtain
Next, using the set introduced in (22), we prove the following property of \(\phi \):
By a straightforward calculation we get that
where
Note that \(s(t,r) = \sin \tfrac{\pi }{4}\) holds if and only if \(t= r\). By Lemma 3.4, this is further equivalent to the following:
Notice that by (23) this is equivalent to the following:
which is further equivalent to \(f_P(t) = f_{P^\perp }(r)\).
Hence we can conclude a few important properties of \(\phi \). First, we have
Second, since for every \(0<t<1\) and \(P\in {{\mathcal {P}}}_1({\mathbb {C}}^2)\) we have
therefore using (21) gives (24).
Furthermore, we also obtain
By Lemma 3.6, for all \(Q_1, Q_2\in {{\mathcal {P}}}_1({\mathbb {C}}^2)\) there exists a rank-one projection P such that
Therefore, applying (25) and noticing that \(t\mapsto \tfrac{1-t}{1-t/2}\) is a strictly decreasing bijection of (0, 1) gives that
Thus we conclude that there exists a strictly increasing bijection \(f:(0,1]\rightarrow (0,1]\) such that
We also observe that (25) implies
therefore we notice that
which is a consequence of the fact that the unique solution of the equation \(t = \tfrac{1-t}{1-t/2}\), \(0<t<1\), is \(t = 2-\sqrt{2}\).
STEP 3: Next, applying [12, Theorem 2.3] gives that there exists a unitary or antiunitary operator \(U:{\mathbb {C}}^2\rightarrow {\mathbb {C}}^2\) such that we have
Since either both \(U^*\phi (\cdot )U\) and \(\phi (\cdot )\) satisfy our assumptions simultaneously, or none of them does, therefore without loss of generality we may assume that we have
We now claim that
Let us assume otherwise, then there exist two rank-one projections P and Q such that \(\Vert P-Q\Vert < \sin \tfrac{\pi }{4}\), \(\phi (P) = P\) and \(\phi (Q) = Q^\perp \). Note that \(\Vert P-Q^\perp \Vert = \sqrt{1-\Vert P-Q\Vert ^2}> \sin \tfrac{\pi }{4} > \Vert P-Q\Vert \). By (23) and (28) we have
Therefore putting first \(R=Q\) and then \(R=Q^\perp \) gives
and
But this implies that f interchanges two different numbers which contradicts to its strict increasingness—proving our claim (29).
Note that for every \(0\le \vartheta \le \tfrac{\pi }{2}\) and \(0\le \mu < 2\pi \) we have
where \(\cdot ^t\) stands for the transposition, and we used the notation of the Busch–Gudder theorem. It is well-known, and can be verified by an easy computation, that we have \(A^t = KAK^*\) for every qubit effect A, where K is the coordinate-wise conjugation antiunitary operator: \(K(z_1,z_2) = (\overline{z_1},\overline{z_2})\) \((z_1,z_2\in {\mathbb {C}})\). Therefore from now on we may assume without loss of generality that we have
i.e. \(\phi \) fixes all rank-one projections.
Finally, observe that (30) and (31) implies
thus we obtain \(\phi (tP) = tP\) for all \(P\in {{\mathcal {P}}}_1({\mathbb {C}}^2)\) and \(\sqrt{2}-1< t < 1\). But this further implies
for all \(\sqrt{2}-1< t < 1\), from which we conclude
i.e. \(\phi \) fixes all rank-one effects. From here we only need to apply Corollary 2.11 and transform back to our original \(\phi \) to complete the proof. \(\square \)
4 Proof of Theorem 1.3 in the General Case
Here we prove the general case of our main theorem, utilising the above proved low-dimensional case. We start with two lemmas.
Lemma 4.1
Let \(P \in {{\mathcal {P}}}(H)\setminus \mathcal {SC}(H)\) and \(A \in {{\mathcal {E}}}(H) \setminus \{ P, P^\perp \}\). Then there exists a rank-one effect \(R\in {{\mathcal {F}}}_1(H)\) such that \(R \sim A\) but \(R \not \sim P\).
Proof
Assume that \(A \in {{\mathcal {E}}}(H)\) such that \(A^\sim \cap {{\mathcal {F}}}_1(H) \subseteq P^\sim = P^c\) holds. We have to show that then either \(A=P\), or \(A=P^\perp \). Clearly, A is not a scalar effect. By Corollary 2.7 we obtain that
Notice that the set
has two connected components (with respect to the operator norm topology), namely
However, by the Busch–Gudder theorem we obtain that
Since \(\mathrm{supp}\left( \Lambda (P,\cdot ) + \Lambda (P^\perp ,\cdot )\right) \) is a closed set, we obtain
Notice that the left-hand side of (34) is connected if and only if A is not a projection, in which case it must be a subset of one of the components of the right-hand side. However, this is impossible because the left-hand side contains a maximal set of pairwise orthogonal rank-one projections. Therefore \(A\in {{\mathcal {P}}}(H)\), and in particular \(\mathrm{supp}\left( \Lambda (A,\cdot ) + \Lambda (A^\perp ,\cdot )\right) \) has two connected components. From here using (33) for both A and P we easily complete the proof. \(\square \)
We introduce a new relation on \({{\mathcal {E}}}(H)\setminus \mathcal {SC}(H)\). For \(A,B \in {{\mathcal {E}}}(H)\setminus \mathcal {SC}(H)\) we write \(A \prec B\) if and only if for every \(C\in A^\sim \setminus \mathcal {SC}(H)\) there exists a \(D\in B^\sim \setminus \mathcal {SC}(H)\) such that \(C^\sim \subseteq D^\sim \). Clearly, for every non-scalar effect B we have \(B \prec B\) and \(B^\perp \prec B\). In particular \(\prec \) is a reflexive relation, but it is not antisymmetric. It is also straightforward from the definition that \(\prec \) is a transitive relation, i.e. \(A \prec B\) and \(B \prec C\) imply \(A \prec C\).
We proceed with characterising non-trivial projections in terms of the relation of coexistence.
Lemma 4.2
Assume that \(A \in {{\mathcal {E}}}(H)\setminus \mathcal {SC}(H)\). Then the following two statements are equivalent:
-
(i)
\(A \in {{\mathcal {P}}}(H)\),
-
(ii)
\(\# \{ B \in {{\mathcal {E}}}(H)\setminus \mathcal {SC}(H) :B \prec A \} = 2\).
Proof
(i)\(\Longrightarrow \)(ii): Suppose that \(B \in {{\mathcal {E}}}(H)\setminus \mathcal {SC}(H)\), \(B\ne A\), \(B\ne A^\perp \) and \(B \prec A\). We need to show that this assumption leads to a contradiction. By Lemma 4.1 there exists a rank one effect tQ, with some \(Q\in {{\mathcal {P}}}_1(H)\) and \(t \in (0,1]\), such that \(tQ \sim B\) but \(tQ\not \sim A\). From \(B \prec A\) we know that there exists a non-scalar effect D such that
By Lemma 2.12 (a) we have
where the latter equation is easy to see (even in non-separable Hilbert spaces). Since we also have \(D\in A^c\), we obtain \(Q\in A^c\), hence the contradiction \(tQ\in A^c = A^\sim \).
(ii)\(\Longrightarrow \)(i): Here we use contraposition, so let us assume that \(A \in \left( {{\mathcal {E}}}(H)\setminus {{\mathcal {P}}}(H)\right) \setminus \mathcal {SC}(H)\). We shall construct a non-trivial projection P (which is obviously different from both A and \(A^\perp \)) such that \(P\prec A\). First, notice that there exists an \(0< \varepsilon < \tfrac{1}{2}\) such that \(H_A\left( \left( \varepsilon , 1-\varepsilon \right] \right) \notin \left\{ \{0\}, H \right\} \). Indeed, otherwise an elementary examination of the spectrum gives that \(\sigma (A) \subseteq \{\varepsilon _0,1-\varepsilon _0\}\) holds with some \(0< \varepsilon _0 < \tfrac{1}{2}\). As A is non-scalar, we actually get \(\sigma (A) = \{\varepsilon _0,1-\varepsilon _0\}\), which implies that \(H_A\left( \left( \varepsilon _0, 1-\varepsilon _0 \right] \right) \) is a non-trivial subspace.
Let us now consider the orthogonal decomposition \(H = H_1 \oplus H_2 \oplus H_3\) where
With respect to this orthogonal decomposition we have
Since coexistence is invariant under taking the ortho-complements, we may assume without loss of generality that \(H_3\ne \{0\}\). Let us set
Our goal is to show that \(P \prec A\). Let C be an arbitrary non-scalar effect coexistent with P. Then, since C and P commute, the matrix form of C is
Consider the effect \(D := \varepsilon \cdot C\) and notice that
Clearly, by Lemmas 2.3 and 2.12 we have \(D\sim A\) and \(C^\sim \subseteq D^\sim \), which completes the proof. \(\square \)
Next, we characterise commutativity preservers on \({{\mathcal {P}}}(H)\). We note that the following theorem has been proved before implicitly in [23] for separable spaces, and was stated explicitly in [24, Theorem 2.8]. In order to prove the theorem for general spaces, one only has to use the ideas of [23], however, we decided to include the proof for the sake of completeness and clarity.
Theorem 4.3
Let H be a Hilbert space of dimension at least three and \(\phi :{{\mathcal {P}}}(H)\rightarrow {{\mathcal {P}}}(H)\) be a bijective mapping that preserves commutativity in both directions, i.e.
Then there exists a unitary or antiunitary operator \(U:H\rightarrow H\) such that
Proof
For an arbitrary set \({\mathcal {M}} \subseteq {{\mathcal {P}}}(H)\) let us use the following notations: \({\mathcal {M}}^{\mathfrak {c}}:= {\mathcal {M}}^c \cap {{\mathcal {P}}}(H)\) and \({\mathcal {M}}^{{\mathfrak {c}}{\mathfrak {c}}} := ({\mathcal {M}}^{\mathfrak {c}})^{\mathfrak {c}}\). By the properties of \(\phi \) we immediately get \(\phi ({\mathcal {M}}^{\mathfrak {c}}) = \phi ({\mathcal {M}})^{\mathfrak {c}}\) and \(\phi ({\mathcal {M}}^{{\mathfrak {c}}{\mathfrak {c}}}) = \phi ({\mathcal {M}})^{{\mathfrak {c}}{\mathfrak {c}}}\) for all subset \({\mathcal {M}}\).
Next, let P and Q be two arbitrary commuting projections. Then (for instance by the Halmos’s two projections theorem) we have
where \(H_1 = \mathrm{Im\,}P \cap \mathrm{Im\,}Q\), \(H_2 = \mathrm{Im\,}P \cap \mathrm{Ker}Q\), \(H_3 = \mathrm{Ker}P \cap \mathrm{Im\,}Q\), \(H_4 = \mathrm{Ker}P \cap \mathrm{Ker}Q\) and \(H = H_1\oplus H_2 \oplus H_3 \oplus H_4\). Note that some of these subspaces might be trivial. We observe that
Hence we conclude that \(\#\{P,Q\}^{{\mathfrak {c}}{\mathfrak {c}}} = 2^{\#\{j:H_j \ne \{0\}\}}\). In particular, \(\#\{P,Q\}^{{\mathfrak {c}}{\mathfrak {c}}} = 2\) if and only if \(P,Q\in \{0,I\}\), and \(\#\{P,Q\}^{{\mathfrak {c}}{\mathfrak {c}}} = 4\) if and only if either \(P\notin \{0,I\}\) and \(Q\in \{I,0,P,P^\perp \}\), or \(Q\notin \{0,I\}\) and \(P\in \{I,0,Q,Q^\perp \}\).
Now, we easily conclude the following characterisation of rank-one and co-rank-one projections:
This implies that
Note that we also have \(\phi (P^\perp ) = \phi (P)^\perp \) for every \(P\in {{\mathcal {P}}}(H)\), as \(P^{\mathfrak {c}}= Q^{\mathfrak {c}}\) holds exactly when \(P=Q\) or \(P+Q = I\). Since changing the images of some pairs of ortho-complemented projections to their orto-complementations does not change the property (35), we may assume without loss of generality that \(\phi ({{\mathcal {P}}}_1(H)) = {{\mathcal {P}}}_1(H)\). It is easy to see that two rank-one projections commute if and only if either they coincide, or they are orthogonal to each other. Thus, as \(\dim H \ge 3\), Uhlhorn’s theorem [32] gives that there exist a unitary or antiunitary operator \(U:H\rightarrow H\) such that
Finally, note that for every projection \(Q\in {{\mathcal {P}}}(H)\) we have
from which we easily complete the proof. \(\square \)
Before we prove Theorem 1.3 in the general case, we need one more technical lemma for non-separable Hilbert spaces. We will use the notation \({{\mathcal {E}}}_{fs}(H)\) for the set of all effects whose spectrum has finitely many elements.
Lemma 4.4
For all \(A\in {{\mathcal {E}}}_{fs}(H)\) we have
Proof
We only have to observe the following for all \(A\in {{\mathcal {E}}}(H)\) with \(\#\sigma (A) = n \in {\mathbb {N}}\), where \(E_1, \dots E_n\) are the spectral projections and \(H_j = \mathrm{Im\,}E_j\) \((j=1,2,\dots n)\):
\(\square \)
Now, we are in the position to prove our second main theorem in the general case.
Proof of Theorem 1.3 for spaces of dimension at least three
The proof will be divided into the following steps:
-
1
we show that \(\phi \) maps \({{\mathcal {E}}}_{fs}(H)\) onto itself,
-
2
we prove that \(\phi \) has the form (2) on \({{\mathcal {E}}}_{fs}(H)\setminus \mathcal {SC}(H)\),
-
3
we show that \(\phi \) has the form (2) on \({{\mathcal {E}}}(H)\setminus \mathcal {SC}(H)\).
STEP 1: First, similarly as in the previous section, we easily get the existence of a bijective function \(g:[0,1] \rightarrow [0,1]\) such that
Of course, the properties of \(\phi \) imply \(\phi (A)^\sim = \phi (A^\sim )\) for all \(A\in {{\mathcal {E}}}(H)\), and also
From the latter it follows that
Hence by Lemma 4.2 we obtain
and therefore Lemma 2.1 (b) implies that the restriction \(\phi |_{P(H)\setminus \{0,I\}}\) preserves commutativity in both directions. Applying Theorem 4.3 then gives that up to unitary–antiunitary equivalence and element-wise ortho-complementation, we have
From now on we may assume without loss of generality that this is the case.
Next, by the spectral theorem [7, Theorem IX.2.2] we have
Therefore we obtain
and thus also
In particular, we have
Hence for all \(A\in {{\mathcal {E}}}_{fs}(H)\) there exists a polynomial \(p_A\) such that \(p_A(\sigma (A)) \subset [0,1]\) and
As a similar statement holds for \(\phi ^{-1}\), we immediately get \(\phi ({{\mathcal {E}}}_{fs}(H)) = {{\mathcal {E}}}_{fs}(H)\). Also, notice that \(\#\sigma (\phi (A)) = \#\sigma (p_A(A)) \le \#\sigma (A)\) and \(\#\sigma (\phi ^{-1}(A)) \le \#\sigma (A)\) hold for all \(A\in {{\mathcal {E}}}_{fs}(H)\). Whence we obtain
In particular, the restriction \(p_A|_{\sigma (A)}\) is injective.
STEP 2: Now, let M be an arbitrary two-dimensional subspace of H and let \(P_M\in {{\mathcal {P}}}(H)\) be the orthogonal projection onto M. Consider two arbitrary effects \(A,B\in (P_M)^\sim \cap {{\mathcal {E}}}_{fs}(H)\) which therefore have the following matrix representations:
Obviously,
Note that by (39), the polynomial \(p_A\) acts injectively on \(\sigma (A)\), therefore
and of course, similarly for B. We observe that by Lemma 2.2 the following two equations hold:
and
It is important to observe that by (38) the set in (41) is the \(\phi \)-image of (40). Thus we obtain the following equivalence if \(A_M \notin \mathcal {SC}(M)\):
Now, we are in the position to use the previously proved two-dimensional version. Let
and let us say that two elements of \({\mathfrak {E}}(M)\) are coexistent, in notation \(\approx \), if either one of them is \(\mathcal {SC}(M)\), or the two elements are \(\{D,D^\perp \}\) and \(\{E,E^\perp \}\) with \(D\sim E\). Clearly, the bijective restriction
induces a well-defined bijection on \({\mathfrak {E}}(M)\) by
Notice that this map also preserves the relation \(\approx \) in both directions. Indeed, for all \(A,B\in (P_M)^\sim \cap {{\mathcal {E}}}_{fs}(H)\), \(A_M,B_M \notin \mathcal {SC}(H)\) we have
Therefore, using the two-dimensional version of Theorem 1.3, we obtain a unitary or antiunitary operator \(U_M:M\rightarrow M\) such that
and
Observe that this implies the following: for any pair of orthogonal unit vectors \(x,y\in M\) we must have either \(U_M({\mathbb {C}}\cdot x) = {\mathbb {C}}\cdot x\) and \(U_M({\mathbb {C}}\cdot y) = {\mathbb {C}}\cdot y\), or \(U_M({\mathbb {C}}\cdot x) = {\mathbb {C}}\cdot y\) and \(U_M({\mathbb {C}}\cdot y) = {\mathbb {C}}\cdot x\). As \(U_M\) is continuous, we have either the first case for all orthogonal pairs \({\mathbb {C}}\cdot x,{\mathbb {C}}\cdot y\), or the second for every such pair. But a similar statement holds for all two-dimensional subspaces, therefore it is easy to show that the second possibility cannot occur. Consequently, we have \(U_M({\mathbb {C}}\cdot x) = {\mathbb {C}}\cdot x\) for all unit vectors \(x\in M\), from which it follows that \(U_M\) is a scalar multiple of the identity operator. Thus we obtain the following for every two-dimensional subspace M:
From here it is rather straightforward to obtain
STEP 3: Observe that (43) holds for every \(A\in {{\mathcal {F}}}(H)\), therefore an application of Theorem 1.1 and Corollary 2.5 completes the proof in the separable case. As for the general case, let us consider an arbitrary effect \(A\in {{\mathcal {E}}}(H)\setminus {{\mathcal {E}}}_{fs}(H)\) and an orthogonal decomposition \(H = \oplus _{i\in \mathcal {I}} H_i\) such that each \(H_i\) is a separable invariant subspace of A. By (38) and Lemma 2.1 (b), each \(H_i\) is an invariant subspace also for \(\phi (A)\), in particular, we have
Without loss of generality we may assume from now on that there exists an \(i_0\in \mathcal {I}\) so that \(A_{i_0}\) is not a scalar effect.
Now, let \(i\in \mathcal {I}\), \(F\in {{\mathcal {F}}}(H)\) and \(\mathrm{Im\,}F \subseteq H_{i}\) be arbitrary. Then by (43) we have
In particular, \(A_{i}^\sim \cap {{\mathcal {F}}}(H_{i}) = \mathcal {A}_{i}^\sim \cap {{\mathcal {F}}}(H_{i})\), therefore by Theorem 1.1 we get that for all i we have either \(A_i, \mathcal {A}_i\in \mathcal {SC}(H)\), or \(A_i = \mathcal {A}_i\), or \(\mathcal {A}_i = A_i^\perp \). By considering \(A^\perp \) instead of A if necessary, we may assume that we have \(A_{i_0} = \mathcal {A}_{i_0}\). Finally, for any \(i_1\in \mathcal {I}\setminus \{i_0\}\) let us consider the orthogonal decomposition \(H = \oplus _{i\in \mathcal {I}\setminus \{i_0,i_1\}} H_i \oplus (H_{i_0} \oplus H_{i_1}) \). Similarly as above, we then get \(A_{i_0}\oplus A_{i_1} = \mathcal {A}_{i_0} \oplus \mathcal {A}_{i_1}\), and the proof is complete. \(\square \)
5 A Remark on the Qubit Case
Here we visualise the set \(A^\sim \cap {{\mathcal {F}}}_1({\mathbb {C}}^2)\) for a general rank-one qubit effect A. First, let us introduce Bloch’s representation. Consider the following vector space isomorphism between the space of all \(2 \times 2\) Hermitian matrices \({{\mathcal {B}}}_{sa}({\mathbb {C}}^2)\) and \({\mathbb {R}}^4\), see also [4]:
where
are the Pauli matrices. Clearly, we have \(\rho (0) = (0,0,0,0)\), \(\rho (I) = (1,0,0,0)\). The Bloch representation is usually defined as the restriction \(\rho |_{{{\mathcal {P}}}_1({\mathbb {C}}^2)}\) which maps \({{\mathcal {P}}}_1({\mathbb {C}}^2)\) onto a sphere of the three-dimensional affine subspace \(\{ (1/2,x_1,x_2,x_3) :x_j\in {\mathbb {R}}, j=1,2,3 \}\) with centre at (1/2, 0, 0, 0) and radius 1/2. Indeed, as the general form of a rank-one projection in \({\mathbb {C}}^2\) is
where \(0\le \vartheta \le \tfrac{\pi }{2}\) and \(0\le \mu < 2\pi \), a not too hard calculation gives that
Recall the remarkable angle doubling property of the Bloch representation, namely, we have \(\Vert P-Q\Vert = \sin \theta \) if and only if the angle between the vectors \(\rho (P) - \tfrac{1}{2}e_0\) and \(\rho (Q) - \tfrac{1}{2}e_0\) is exactly \(2\theta \).
Next, we call a positive (semi-definite) element of \({{\mathcal {B}}}_{sa}({\mathbb {C}}^2)\) a density matrix if its trace is 1, or in other words, if it is a convex combination of some rank-one projections. Therefore \(\rho \) maps the set of all \(2\times 2\) density matrices onto the closed ball of the three-dimensional affine subspace \(\{ (1/2,x_1,x_2,x_3) :x_j\in {\mathbb {R}}, j=1,2,3 \}\) with centre at (1/2, 0, 0, 0) and radius 1/2. Hence, we see that the cone of all positive (semi-definite) \(2\times 2\) matrices is mapped onto the infinite cone spanned by (0, 0, 0, 0) and the aforementioned ball. Thus \(\rho \) maps \({{\mathcal {E}}}({\mathbb {C}}^2)\) onto the intersection of this cone and its reflection through the point \(\rho (\tfrac{1}{2}I) = (\tfrac{1}{2},0,0,0)\).
We can re-write (44) as follows:
where
is an orthonormal system in \({\mathbb {R}}^4\). Let \(S_\mu \) be the three-dimensional subspace spanned by \(e_0, e_\mu , e_3\). Then the set \(\rho ({{\mathcal {E}}}({\mathbb {C}}^2))\cap S_\mu \) can be visualised as a double cone of \({\mathbb {R}}^3\), by regarding \(e_0, e_\mu , e_3\) as the standard basis of \({\mathbb {R}}^3\), see Figure 2. Note that \(\rho ({{\mathcal {P}}}_1({\mathbb {C}}^2))\cap S_\mu \) is the circle where the boundaries of the two cones meet.
We continue with visualising the set \((tP_{(1, 0)})^\sim \) for an arbitrary \(0<t<1\). Note that then visualising \((tP)^\sim \) for a general rank-one projection P is very similar, we simply have to apply a unitary similarity (which by well-known properties of the Bloch representation, acts as a rotation on the sphere \(\rho ({{\mathcal {P}}}_1({\mathbb {C}}^2))\)). Equation (9) gives the following:
Now, let us consider the vector
which is orthogonal to
From here a bit tedious computation gives
Therefore by Corollary 2.7 and (46) we conclude that \(\rho \left( (t P_{(1,0)})^\sim \cap {{\mathcal {F}}}_1({\mathbb {C}}^2)\right) \) is the union of the line segment \(\{\rho \left( s P_{(1,0)}\right) :0<s\le 1 \} = \{ \tfrac{s}{2} e_0 + \tfrac{s}{2} e_3 :0<s\le 1 \}\) and of the area on the boundary of \(\rho ({{\mathcal {E}}}({\mathbb {C}}^2))\) which is either on, or below the affine hyperplane whose normal vector is u and which contains \(\rho (P_{(1,0)}^\perp )\), see Figure 3. We note that using the notation of (22), the ellipse on the boundary is exactly the set
Therefore \(\rho (\ell _{tP_{(1,0)}})\) is a punctured ellipsoid.
If one illustrates the set \(\rho \left( (A)^\sim \right) \cap \rho \left( {{\mathcal {F}}}_1({\mathbb {C}}^2)\right) \cap S_\mu \) with \(A, A^\perp \notin \mathcal {SC}({\mathbb {C}}^2) \cup {{\mathcal {F}}}_1({\mathbb {C}}^2)\) in the way as above, then one gets a set on the boundary of the cone which is bounded by a continuous closed curve containing the \(\rho \)-images of the spectral projections.
6 Final Remarks and Open Problems
First, we prove the analogue of Lemma 3.1 for finite dimensional spaces of dimension at least three.
Lemma 6.1
Let H be a Hilbert space with \(2\le \dim H < \infty \) and \(A\in {{\mathcal {E}}}(H)\). Then the following are equivalent:
-
(i)
\(0,1 \in \sigma (A)\),
-
(ii)
there exists no effect \(B\in {{\mathcal {E}}}(H)\) such that \(B^\sim \subsetneq A^\sim \).
Proof
If \(\dim H = 2\), then (i)\(\iff \)(ii) was proved in Lemma 3.1, so from now on we will assume \(2< \dim H < \infty \). Also, as the case when \(A\in \mathcal {SC}(H)\) is trivial, we assume otherwise throughout the proof.
(i)\(\Longrightarrow \)(ii): Suppose that \(0,1 \in \sigma (A)\) and consider an arbitrary effect B with \(B^\sim \subseteq A^\sim \). By Lemma 2.12, A and B commute. If \(0 = \lambda _1 \le \lambda _2 \le \dots \le \lambda _{n-1} \le \lambda _n = 1\) are the eigenvalues of A, then the matrices of A and B written in an orthonormal basis of joint eigenvectors are the following:
with some \(\mu _1,\dots \mu _n \in [0,1]\). Notice that by Corollary 2.8, for all \(1\le i < j \le n\) we have
In particular, choosing \(i = 1, j = n\) implies either \(\mu _1 = 0\) and \(\mu _n = 1\), or \(\mu _1 = 1\) and \(\mu _n = 0\). Assume the first case. If we set \(i=1\), then Lemma 3.2 and (47) imply \(\mu _j \ge \lambda _j\) for all \(j = 2,\dots , n-1\). But on the other hand, setting \(j=n\) implies \(\mu _i \le \lambda _i\) for all \(i = 2,\dots , n-1\). Therefore we conclude \(B = A\). Similarly, assuming the second case implies \(B=A^\perp \).
(ii)\(\Longrightarrow \)(i): Assume (i) does not hold, then there exists a positive number \(\varepsilon \) such that \(\sigma (A) \subseteq [0,1-\varepsilon ]\) or \(\sigma (A^\perp ) \subseteq [0,1-\varepsilon ]\). Suppose the first possibility holds, then \(\tfrac{1}{1-\varepsilon }A \notin \{A,A^\perp \}\) and \(\left( \tfrac{1}{1-\varepsilon }A\right) ^\sim \subseteq A^\sim \). The second case is very similar. \(\square \)
We only proved the above lemma and Corollary 2.11 in the finite dimensional case. The following two questions would be interesting to examine:
Question 6.2
Does the statement of Corollary 2.11 remain true for general infinite dimensional Hilbert spaces?
Question 6.3
Does the statement of Lemma 6.1 hold if \(\dim H \ge \aleph _0\)?
Finally, our first main theorem characterises completely when \(A^\sim = B^\sim \) happens for two effects A and B. However, we gave only some partial results about when \(A^\sim \subseteq B^\sim \) occurs, e.g. Lemma 2.12.
Question 6.4
How can we characterise the relation \(A^\sim \subseteq B^\sim \) for effects A, B?
We believe that a complete answer to this latter question would represent a substantial step towards the better understanding of coexistence.
References
Ando, T.: Problem of Infimum in the Positive Cone, Analytic and Geometric Inequalities and Applications, Mathematical Application., vol. 478, pp. 1–12. Kluwer Academic Publication, Dordrecht (1999)
Busch, P., Gudder, S.: Effects as functions on projective Hilbert space. Lett. Math. Phys. 47, 329–337 (1999)
Busch, P., Lahti, P., Pellonpää, J.-P., Ylinen, K.: Quantum Measurement. Springer, Berlin (2016)
Busch, P., Schmidt, H.-J.: Coexistence of qubit effects. Quantum Inf. Process. 9, 143–169 (2010)
Cassinelli, G., De Vito, E., Lahti, P., Leverero, A.: A theorem of Ludwig revisited. Found. Phys. 8, 921–941 (2000)
Chevalier, G.: Wigner’s Theorem and Its Generalizations, Handbook of Quantum logic and quantum structures, 429–475. Elsevier, Amsterdam (2007)
Conway, J.B.: A Course in Functional Analysis, Graduate Texts in Mathematics, vol. 96, 2nd edn. Springer, New York (1990)
Davies, E.B.: Quantum Theory of Open Systems. Academic Press, New York (1976)
Faure, C.-A.: An elementary proof of the fundamental theorem of projective geometry. Geom. Dedicata 90, 145–151 (2002)
Foulis, D.J., Bennett, M.K.: Effect algebras and unsharp quantum logics. Found. Phys. 24, 1331–1352 (1994)
Gehér, G.P.: An elementary proof for the non-bijective version of Wigner’s theorem Phys. Lett. A 378, 2054–2057 (2014)
Gehér, G.P.: Symmetries of projective spaces and spheres. Int. Math. Res. Not. IMRN 7, 2205–2240 (2020)
Gudder, S.: Sharp and unsharp quantum effects. Adv. Appl. Math. 20, 169–187 (1998)
Heinosaari, T., Kiukas, J., Reitzner, D.: Coexistence of effects from an algebra of two projections. J. Phys. A Math. Theor. 47, 225301 (2014)
Houtappel, R.M.F., van Dam, H., Wigner, E.P.: The conceptual basis and use of the geometric invariance principles. Rev. Modern Phys. 37, 595–632 (1965)
Kadison, R.V.: Order properties of bounded self-adjoint operators. Proc. Am. Math. Soc. 2, 505–510 (1951)
Kraus, K.: States, effects, and operations, fundamental notions of quantum theory. In: Böhm, A., Dollard, J.D., Wootters, W.H. (eds.) Lecture Notes in Physics, vol. 190. Springer, Berlin (1983)
Ludwig, G.: Foundations of Quantum Mechanics, vol. I, (Translated from the German by Carl A. Hein), Springer-Verlag, New York (1983)
Ludwig, G.: Foundations of Quantum Mechanics, vol. II, (Translated from the German by Carl A. Hein), Springer-Verlag, New York (1985)
Molnár, L.: Characterization of the automorphisms of Hilbert space effect algebras. Commun. Math. Phys. 223, 437–450 (2001)
Molnár, L.: Selected Preserver Problems on Algebraic Structures of Linear, Operators and on Function Spaces. Lecture notes in mathematics, vol. 1895. Springer, Berlin (2007)
Molnár, L., Páles, Zs: \(^\perp \)-order automorphisms of Hilbert space effect algebras: the two-dimensional case. J. Math. Phys. 42, 1907–1912 (2001)
Molnár, L., Šemrl, P.: Nonlinear commutativity preserving maps on self-adjoint operators. Quart. J. Math. 56, 589–595 (2005)
Molnár, L., Šemrl, P.: Transformations of the unitary group on a Hilbert space. J. Math. Anal. Appl. 388, 1205–1217 (2012)
Moreland, T.J., Gudder, S.P.: Infima of Hilbert space effects. Linear Algebra Appl. 286, 1–17 (1999)
Šemrl, P.: Comparability preserving maps on Hilbert space effect algebras. Commun. Math. Phys. 313, 375–384 (2012)
Šemrl, P.: Automorphisms of Hilbert space effect algebras. J. Phys. A 48, 195301 (2015)
Šemrl, P.: Groups of order automorphisms of operator intervals. Acta Sci. Math. (Szeged) 84, 125–136 (2018)
Simon, B.: Quantum dynamics: from automorphism to Hamiltonian. In: Lieb, E.H., Simon, B., Wightman, A.S. (eds.) Studies in Mathematical Physics, Essays in Honor of Valentine Bargmann, Princeton Series in Physics, pp. 327–349. University Press, Princeton (1976)
Stano, P., Reitzner, D., Heinosaari, T.: Coexistence of qubit effects. Phys. Rev. A 78, 012315 (2008)
Titkos, T.: Ando’s theorem for nonnegative forms. Positivity 16, 619–626 (2012)
Uhlhorn, U.: Representation of symmetry transformations in quantum mechanics. Ark. Fysik 23, 307–340 (1963)
Uhlmann, A.: Transition probability (fidelity) and its relatives. Found Phys. 41, 288–298 (2011)
Wigner, E.P.: Gruppentheorie und ihre Anwendung auf die Quantenmechanik der Atomspektrum. Fredrik Vieweg und Sohn, Brunswick (1931)
Wolf, M.M., Perez-Garcia, D., Fernandez, C.: Measurements incompatible in quantum theory cannot be measured jointly in any other no-signaling theory. Phys. Rev. Lett. 103, 230402 (2009)
Yu, S., Liu, N.-L., Li, L., Oh, C.H.: Joint measurement of two unsharp observables of a qubit. Phys. Rev. A 81, 062116 (2008)
Author information
Authors and Affiliations
Corresponding author
Additional information
Communicated by H. T. Yau
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
György Pál Gehér was supported by the Leverhulme Trust Early Career Fellowship, ECF-2018-125. He was also partly supported by the Hungarian National Research, Development and Innovation Office-NKFIH (K115383)
Peter Šemrl was supported by Grants N1-0061, J1-8133, and P1-0288 from ARRS, Slovenia.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Gehér, G.P., Šemrl, P. Coexistency on Hilbert Space Effect Algebras and a Characterisation of Its Symmetry Transformations. Commun. Math. Phys. 379, 1077–1112 (2020). https://doi.org/10.1007/s00220-020-03873-3
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00220-020-03873-3