
2 Introduction to quantum mechanics

In this chapter we acquire familiarity with elementary linear algebra while introducing the notation used by physicists to describe quantum mechanics (which differs from that used in most introductions to linear algebra). We then review the basic postulates of quantum mechanics. Later we introduce superdense coding, a surprising and illuminating example of quantum information processing which combines many of the postulates of quantum mechanics in a simple setting. Finally, we develop powerful tools such as the density operator, purifications, and the Schmidt decomposition, which are especially useful in the study of quantum computation and quantum information.

2.1 Linear algebra

The basic objects of linear algebra are vector spaces. The vector space of most interest to us is \(\mathbb{C}^n\), the space of all \(n\)-tuples of complex numbers \((z_1,\cdots,z_n)\).

2.1.1 Bases, operators and matrices

  • A spanning set \(\ket{v_1}, \ket{v_2}, \cdots, \ket{v_n}\) for a vector space is a set of vectors such that any vector \(\ket v\) in the vector space can be written as a linear combination \(\sum_ia_i\ket{v_i}\). For example, the two vectors \(\begin{bmatrix}0\\1\end{bmatrix}\) and \(\begin{bmatrix}1\\0\end{bmatrix}\) span the vector space \(\mathbb C^2\). Generally, a vector space may have many different spanning sets.
  • A set of non-zero vectors \(\ket{v_1}, \ket{v_2}, \cdots, \ket{v_n}\) is linearly dependent if there exists a set of complex numbers \(a_1, \cdots, a_n\) with \(a_i\neq0\) for at least one \(i\), such that \(a_1\ket{v_1} + a_2\ket{v_2} + \cdots + a_n\ket{v_n} = 0\); otherwise the set is linearly independent.
  • It can be shown that any two sets of linearly independent vectors which span a vector space \(V\) contain the same number of elements; such a set is called a basis, and the number of elements in a basis is defined to be the dimension of \(V\).
Exercise 2.1

(Linear dependence: example) Show that \((1, -1)\), \((1,2)\) and \((2,1)\) are linearly dependent.

solution
\[ (1)\begin{bmatrix}1\\-1\end{bmatrix} + (1)\begin{bmatrix}1\\2\end{bmatrix} +(-1)\begin{bmatrix}2\\1\end{bmatrix} = 0 \nonumber \]
  • A linear operator \(A:V \rightarrow W\) is linear in its inputs: \(A\bigg(\sum_ia_i\ket{v_i}\bigg) = \sum_i a_i A\ket{v_i}\). When we say that a linear operator \(A\) is defined on a vector space \(V\), we mean that \(A\) is a linear operator from \(V\) to \(V\).

  • The most convenient way to understand linear operators is in terms of their matrix representations. Suppose \(A:V \rightarrow W\) is a linear operator, \(\ket{v_1},\cdots,\ket{v_m}\) is a basis for \(V\), and \(\ket{w_1},\cdots,\ket{w_n}\) is a basis for \(W\). Then for each \(i\) in the range \(1,\cdots,m\), there exist complex numbers \(A_{1i},\cdots,A_{ni}\) such that \(A\ket{v_i} = \sum_{j=1}^n A_{ji}\ket{w_j}\); the matrix with entries \(A_{ji}\) is the matrix representation of \(A\).

Exercise 2.2

(Matrix representations: example) Suppose \(V\) is a vector space with basis vectors \(\ket0\) and \(\ket1\), and \(A\) is a linear operator from \(V\) to \(V\) such that \(A\ket0 = \ket1\) and \(A\ket1 = \ket0\). Give a matrix representation for \(A\), with respect to the input basis \(\ket0, \ket1\), and the output basis \(\ket0, \ket1\). Find input and output bases which give rise to a different matrix representation of \(A\).

solution
\[ A = \begin{bmatrix}0 &1\\1 &0\end{bmatrix} \]

Now if we keep the input basis \(\{\ket0, \ket1\}\) but use the output basis \(\{\ket1, \ket0\}\), then the matrix representation of \(A\) becomes \(\begin{bmatrix}1 &0\\0 &1\end{bmatrix}\).

Exercise 2.3

(Matrix representation for operator products) Suppose \(A\) is a linear operator from vector space \(V\) to vector space \(W\) , and \(B\) is a linear operator from vector space \(W\) to vector space \(X\). Let \(\ket{v_i}, \ket{w_j}, \ket{x_k}\) be bases for the vector spaces \(V\), \(W\), and \(X\), respectively. Show that the matrix representation for the linear transformation \(BA\) is the matrix product of the matrix representations for \(B\) and \(A\), with respect to the appropriate bases.

solution

From the description we know \(A\ket{v_i} = \sum_j A_{ji}\ket{w_j}\) and \(B\ket{w_j} = \sum_k B_{kj}\ket{x_k}\). Combining the two equations we get

\[ BA\ket{v_i} = \sum_j\sum_k B_{kj} A_{ji} \ket{x_k} = \sum_k (BA)_{ki} \ket{x_k} \nonumber \]

, hence the matrix representation of the linear transformation \(BA\) is the matrix product of the matrix representations of \(B\) and \(A\).

Exercise 2.4

(Matrix representation for identity) Show that the identity operator on a vector space \(V\) has a matrix representation which is one along the diagonal and zero everywhere else, if the matrix representation is taken with respect to the same input and output bases. This matrix is known as the identity matrix.

solution
\[ I\ket{v_j} = \ket{v_j} = \sum_i I_{ij}\ket{v_i}, \;\;\forall j\;\;\Rightarrow\; I_{ij} = \delta_{ij} \]

2.1.2 The Pauli matrices and inner products

  • Four extremely useful matrices (which we shall often have occasion to use) are the Pauli matrices:
\[ \begin{align*} \sigma_0\equiv I\equiv\begin{bmatrix}1 & 0\\0 & 1\end{bmatrix}& & \sigma_1\equiv X\equiv\begin{bmatrix}0 & 1\\1 & 0\end{bmatrix} \\ \\ \sigma_2\equiv Y\equiv\begin{bmatrix}0 & -i\\i & 0\end{bmatrix}& & \sigma_3\equiv Z\equiv\begin{bmatrix}1 & 0\\0 & -1\end{bmatrix} \end{align*} \]
  • An inner product is a function which takes two vectors \(\ket v\) and \(\ket w\) from a vector space as input and produces a complex number as output. The standard quantum mechanical notation for the inner product is \(\braket{v}{w}\); on \(\mathbb C^n\) it is defined by \((\ket v,\ket w)\equiv\sum_iv_i^*w_i\). Note that \(\braket{v}{w} = \braket{w}{v}^*\).
Exercise 2.5

Verify that \((\cdot,\cdot)\) just defined is an inner product on \(\mathbb C^n\)

solution

Let \(\ket v = (v_1, v_2, \cdots, v_n)^\intercal\) and \(\ket w = (w_1, w_2, \cdots, w_n)^\intercal\); the three conditions that must be satisfied are:

\[ \begin{align*} \bigg(\ket v, \sum_i\lambda_i\ket{w_i}\bigg) &= \sum_jv_j^*\Big(\sum_i\lambda_i{w_{ij}}\Big) = \sum_i\lambda_i\Big(\sum_jv_j^*{w_{ij}}\Big) = \sum_i\lambda_i\bigg(\ket v, \ket{w_i}\bigg) \\ \big(\ket v, \ket w\big) &= \sum_iv_i^*w_i = \sum_i(w_i^*v_i)^* = \big(\ket w, \ket v\big)^* \\ \big(\ket v, \ket v\big) &= \sum_iv_i^*v_i = \sum_i\abs{v_i}^2 \geq 0, \text{ with equality iff } \ket v = 0 \end{align*} \]

Thus \((\cdot,\cdot)\) is an inner product on \(\mathbb C^n\).

Exercise 2.6

Show that any inner product \((\cdot,\cdot)\) is conjugate-linear in the first argument, \(\bigg(\sum_i\lambda_i\ket{w_i}, \ket{v}\bigg) = \sum_i\lambda_i^*\bigg(\ket{w_i}, \ket{v}\bigg)\).

solution
\[ \bigg(\sum_i\lambda_i\ket{w_i}, \ket{v}\bigg) = \bigg(\ket{v}, \sum_i\lambda_i\ket{w_i}\bigg)^* = \bigg( \sum_i\lambda_i(\ket v, \ket{w_i})\bigg)^* = \sum_i\lambda_i^*(\ket v, \ket{w_i})^* = \sum_i\lambda_i^*\bigg(\ket{w_i}, \ket{v}\bigg). \]
  • Discussions of quantum mechanics often refer to Hilbert space. In the finite dimensional complex vector spaces that come up in quantum computation and quantum information, a Hilbert space is exactly the same thing as an inner product space.
Exercise 2.7

Verify that \(\ket w\equiv(1,1)\) and \(\ket v\equiv(1,-1)\) are orthogonal. What are the normalized forms of these vectors?

solution

\(\braket{w}{v} = 1-1 = 0\Rightarrow \text{orthogonal}\), \(\ket w = \frac{1}{\sqrt2}(1,1)\) and \(\ket v = \frac{1}{\sqrt2}(1,-1)\).

  • Suppose \(\ket{w_1},\cdots,\ket{w_d}\) is a basis set for some vector space \(V\) with an inner product. There is a useful method, the Gram–Schmidt procedure, which can be used to produce an orthonormal basis set \(\ket{v_1},\cdots,\ket{v_d}\) for the vector space \(V\):
  • define \(\ket{v_1} \equiv \frac{\ket{w_1}}{\Vert\ket{w_1}\Vert}\)

  • for \(1\leq k\leq(d-1)\) define \(\ket{v_{k+1}}\) inductively by:

\[ \ket{v_{k+1}} \equiv \frac{\ket{w_{k+1}}-\sum_{i=1}^k\braket{v_i}{w_{k+1}}\ket{v_i}}{\Vert \ket{w_{k+1}}-\sum_{i=1}^k\braket{v_i}{w_{k+1}}\ket{v_i} \Vert} \]

, and one can verify that the vectors \(\ket{v_1}, \cdots, \ket{v_d}\) so produced form an orthonormal set which is also a basis for \(V\).
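As a quick numerical companion (a minimal numpy sketch of the procedure above, not from the book; `gram_schmidt` is an illustrative name):

```python
import numpy as np

def gram_schmidt(ws):
    """Minimal Gram-Schmidt: turn linearly independent vectors |w_k>
    into an orthonormal set |v_k>, following the inductive definition above."""
    vs = []
    for w in ws:
        # subtract the projections <v_i|w> |v_i> onto the vectors built so far
        u = w - sum(np.vdot(v, w) * v for v in vs)
        vs.append(u / np.linalg.norm(u))
    return vs

# example: an arbitrary basis of C^2
vs = gram_schmidt([np.array([1.0, 1.0], dtype=complex),
                   np.array([1.0, 0.0], dtype=complex)])
print(np.round([[np.vdot(a, b) for b in vs] for a in vs], 10))  # identity matrix
```

Note that `np.vdot` conjugates its first argument, matching \(\braket{v_i}{w}\).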

Exercise 2.8

Prove that the Gram–Schmidt procedure produces an orthonormal basis for V .

solution

One can prove this by induction on \(k\): each \(\ket{v_{k+1}}\) is normalized by construction, and its inner product with any earlier \(\ket{v_j}\) (\(j\leq k\)) is proportional to \(\braket{v_j}{w_{k+1}} - \braket{v_j}{w_{k+1}} = 0\); since the \(d\) resulting vectors are orthonormal, they are linearly independent and hence a basis for \(V\).

  • The completeness relation: for any orthonormal basis \(\ket i\), \(\sum_i\ket i\bra i = I\).

  • One application of the completeness relation is to give a means for representing any operator in the outer product notation: suppose \(A:V\to W\) is a linear operator, \(\ket{v_i}\) is an orthonormal basis for \(V\), and \(\ket{w_j}\) an orthonormal basis for \(W\). By applying the completeness relation twice we obtain:

    \[ \begin{align*} A &= I_WAI_V\\ &= \sum_{i,j}\ket{w_j}\bra{w_j}A\ket{v_i}\bra{v_i}\\ &= \sum_{i,j}\bra{w_j}A\ket{v_i}\ket{w_j}\bra{v_i} \end{align*} \]

    , which is the outer product representation for \(A\). We also see from this equation that \(A\) has matrix element \(\bra{w_j}A\ket{v_i}\) in the \(i^{\text{th}}\) column and \(j^\text{th}\) row, with respect to the input basis \(\ket{v_i}\) and output basis \(\ket{w_j}\).

  • A second application illustrating the usefulness of the completeness relation is the Cauchy–Schwarz inequality:

    \[ \braket{v}{v}\braket{w}{w} \geq \vert \braket{w}{v} \vert^2 \]

✏ Proof of the Cauchy–Schwarz inequality

(The book proves this using the completeness relation; here I use a different, more rigorous method.) Suppose \(\Vert v\Vert^2 \equiv V\) and \(\braket{u}{v}\equiv c\); then we have:

\[ \begin{align*} 0\leq\frac{1}{\Vert v \Vert^2}\bigg\Vert \Vert v \Vert^2 u - \braket{v}{u}v \bigg\Vert^2 &= \frac{1}{V}\braket{Vu-\bar{c}v}{Vu-\bar{c}v} \\ &= \frac{1}{V}\bigg(V^2\Vert{u}\Vert^2 - V\bar{c}\braket{u}{v} - cV\braket{v}{u} + c\bar{c}\braket{v}{v}\bigg) \\ &= V\Vert{u}\Vert^2 - \bar{c}c - c\bar{c} + c\bar{c} \\ &= V\Vert{u}\Vert^2 - \bar{c}c \\ &= \Vert{v}\Vert^2\Vert{u}\Vert^2 - \Big\vert\braket{u}{v}\Big\vert^2 \end{align*} \]

Therefore we have:

\[ \Vert{v}\Vert^2\Vert{u}\Vert^2 \geq \Big\vert\braket{u}{v}\Big\vert^2 \]

Exercise 2.9

(Pauli operators and the outer product) The Pauli matrices can be considered as operators with respect to an orthonormal basis \(\ket0, \ket1\) for a two-dimensional Hilbert space. Express each of the Pauli operators in the outer product notation.

solution
\[ I = \ket0\bra0 + \ket1\bra1 \;\;\;\;\; X = \ket0\bra1 + \ket1\bra0 \;\;\;\;\; Y = -i\ket0\bra1 + i\ket1\bra0 \;\;\;\;\; Z = \ket0\bra0 - \ket1\bra1\nonumber \]

Exercise 2.10

Suppose \(\ket{v_i}\) is an orthonormal basis for an inner product space \(V\). What is the matrix representation for the operator \(\ket{v_j}\bra{v_k}\), with respect to the \(\ket{v_i}\) basis?

solution

All entries are \(0\) except for a single \(1\) in the \(j^\text{th}\) row and \(k^\text{th}\) column.

2.1.3 Eigenvectors and Hermitian operators

  • An eigenvector of a linear operator \(A\) on a vector space is a non-zero vector \(\ket v\) such that \(A\ket{v} = v\ket{v}\), where \(v\) is a complex number known as the eigenvalue of \(A\) corresponding to \(\ket{v}\).

  • The characteristic function is defined to be \(c(\lambda)\equiv \det{\vert A-\lambda I \vert}\). (it can be shown that the characteristic function depends only upon the operator \(A\), and not on the specific matrix representation used for \(A\)) The solutions of the characteristic equation \(c(\lambda) =0\) are the eigenvalues of the operator.

  • The eigenspace corresponding to an eigenvalue \(v\) is the set of vectors which have eigenvalue \(v\). It is a vector subspace of the vector space on which \(A\) acts. When an eigenspace is more than one dimensional we say that it is degenerate. For example, the matrix \(A\) defined by:

\[ A\equiv\begin{bmatrix} 2&0&0\\0&2&0\\0&0&0 \end{bmatrix}\nonumber \]

has two eigenvectors \((1,0,0)\) and \((0,1,0)\) corresponding to the eigenvalue 2. The two eigenvectors are said to be degenerate because they are linearly independent eigenvectors of \(A\) with the same eigenvalue.

  • A diagonal representation for an operator \(A\) on a vector space \(V\) is a representation \(A = \sum_i\lambda_i\ket i\bra i\), where the vectors \(\ket i\) form an orthonormal set of eigenvectors for \(A\), with corresponding eigenvalues \(\lambda_i\). An operator is said to be diagonalizable if it has a diagonal representation. e.g. the Pauli Z matrix can be written:
\[ Z = \begin{bmatrix}1&0\\0&-1\end{bmatrix} = \ket0\bra0-\ket1\bra1 \]

, where the matrix representation is with respect to orthonormal vectors \(\ket0\) and \(\ket1\), respectively. Diagonal representations are sometimes also known as orthonormal decompositions.

Exercise 2.11

(Eigen decomposition of the Pauli matrices) Find the eigenvectors, eigenvalues, and diagonal representations of the Pauli matrices \(X\), \(Y\), and \(Z\).

solution
  • for the \(X\) matrix, \(\lambda = \pm1\) with \(\ket{\lambda_{-1}} = \frac{1}{\sqrt2}\begin{bmatrix}1\\-1\end{bmatrix}\) and \(\ket{\lambda_{+1}} = \frac{1}{\sqrt2}\begin{bmatrix}1\\1\end{bmatrix}\), diagonal representation \(\ket{\lambda_{+1}}\bra{\lambda_{+1}}-\ket{\lambda_{-1}}\bra{\lambda_{-1}}\).
  • for the \(Y\) matrix, \(\lambda = \pm1\) with \(\ket{\lambda_{-1}} = \frac{1}{\sqrt2}\begin{bmatrix}1\\-i\end{bmatrix}\) and \(\ket{\lambda_{+1}} = \frac{1}{\sqrt2}\begin{bmatrix}1\\i\end{bmatrix}\), diagonal representation \(\ket{\lambda_{+1}}\bra{\lambda_{+1}}-\ket{\lambda_{-1}}\bra{\lambda_{-1}}\).
  • for the \(Z\) matrix, \(\lambda = \pm1\) with \(\ket{\lambda_{-1}} = \begin{bmatrix}0\\1\end{bmatrix}\) and \(\ket{\lambda_{+1}} = \begin{bmatrix}1\\0\end{bmatrix}\), diagonal representation \(\ket{\lambda_{+1}}\bra{\lambda_{+1}}-\ket{\lambda_{-1}}\bra{\lambda_{-1}} = \ket0\bra0-\ket1\bra1\).

Exercise 2.12

Prove that the matrix

\[ \begin{bmatrix}1&0\\1&1\end{bmatrix}\nonumber \]

is not diagonalizable.

solution

Characteristic function: \((1-\lambda)^2 = 0\Rightarrow\lambda=1\), and the corresponding eigenspace is one-dimensional, spanned by \(\ket{\lambda_1} = \begin{bmatrix}0\\1\end{bmatrix}\). A diagonal representation would have to take the form \(c\ket{\lambda_1}\bra{\lambda_1}\), but \(c\ket{\lambda_1}\bra{\lambda_1} = c\begin{bmatrix}0&0\\0&1\end{bmatrix} \neq \begin{bmatrix}1&0\\1&1\end{bmatrix}\) for any \(c\), so the matrix is not diagonalizable.

Exercise 2.13

If \(\ket w\) and \(\ket v\) are any two vectors, show that \((\ket w\bra v)^\dagger = \ket v\bra w\).

solution

Let \(M\equiv\ket w\bra v\); then \(M_{ij} = w_iv_j^*\), while \((\ket v\bra w)_{ij} = v_iw_j^*\). Since \((M^\dagger)_{ij} = (M_{ji})^* = (w_jv_i^*)^* = v_iw_j^*\), we conclude that \((\ket w\bra v)^\dagger = \ket v\bra w\).

Or, without the matrix representation, consider:

\[ \begin{cases} &\braket{\psi}{(\ket w\bra v)\phi}^* = \braket{(\ket w\bra v)^\dagger\psi}{\phi}^* = \braket{\phi}{(\ket w\bra v)^\dagger\psi} \\ &\braket{\psi}{(\ket w\bra v)\phi}^* = (\braket{\psi}{w}\braket{v}{\phi})^* = \braket{\phi}{v}\braket{w}{\psi} \end{cases}\nonumber \]

, by comparing the two expressions we have \((\ket w\bra v)^\dagger = \ket v\bra w\).

Exercise 2.14

(Anti-linearity of the adjoint) Show that the adjoint operation is anti-linear,

\[ \bigg(\sum_ia_iA_i\bigg)^\dagger = \sum_ia_i^*A_i^\dagger \nonumber \]
solution
\[ \begin{align*} \braket{\Big(\sum_ia_iA_i\Big)^\dagger\psi}{\phi} &= \braket{\psi}{\Big(\sum_ia_iA_i\Big)\phi}\\ &= \sum_ia_i \braket{\psi}{A_i\phi}\\ &= \sum_ia_i \braket{A_i^\dagger\psi}{\phi}\\ &= \braket{\Big(\sum_ia_i^*A_i^\dagger\Big)\psi}{\phi}. \end{align*} \]

Exercise 2.15

Show that \((A^\dagger)^\dagger = A\).

solution

\(\braket{(A^\dagger)^\dagger\psi}{\phi} = \braket{\psi}{A^\dagger\phi} = \braket{A^\dagger\phi}{\psi}^* = \braket{\phi}{A\psi}^* = \braket{A\psi}{\phi}\), hence \((A^\dagger)^\dagger = A\).

  • An operator \(A\) whose adjoint is \(A\) is known as a Hermitian or self-adjoint operator.

  • An important class of Hermitian operators is the projectors: Suppose \(W\) is a \(k\)-dimensional vector subspace of the \(d\)-dimensional vector space \(V\). Using the Gram–Schmidt procedure it is possible to construct an orthonormal basis \(\ket1,\cdots,\ket d\) for \(V\) such that \(\ket1,\cdots,\ket k\) is an orthonormal basis for \(W\). By definition,

\[ P\equiv \sum_{i=1}^{k}\ket i\bra i \]

is the projector onto the subspace \(W\).

  • From the definition it can be shown that \(\ket v\bra v\) is Hermitian for any vector \(\ket v\), so \(P\) is also Hermitian: \(P^\dagger = P\).

  • The orthogonal complement of \(P\) is the operator \(Q\equiv I-P\), which is the projector onto the vector space spanned by \(\ket{k+1},\cdots,\ket d\).

Exercise 2.16

Show that any projector \(P\) satisfies the equation \(P^2 = P\).

solution

\(P^2 = \bigg(\sum_i\ket i\bra i\bigg)\bigg(\sum_j\ket j\bra j\bigg) = \sum_{i,j}\ket i\braket{i}{j}\bra j = \sum_{i,j}\ket i\bra j\delta_{ij} = \sum_i\ket i\bra i = P\)

  • An operator \(A\) is said to be normal if \(AA^\dagger = A^\dagger A\). Obviously, a Hermitian operator is also normal. There is a remarkable representation theorem for normal operators known as the spectral decomposition, which states that an operator is normal if and only if it is diagonalizable.

Theorem 2.1

(Spectral decomposition) Any normal operator \(M\) on a vector space \(V\) is diagonal with respect to some orthonormal basis for \(V\). Conversely, any diagonalizable operator is normal.

proof

The converse statement is easy to prove, hence we start with the forward one, proving the implication by the method of induction on \(d\), the dimension of \(V\).

  1. For \(d=1\) case, it's trivial.

  2. For the \(d>1\) case, first let \(\lambda\) be one of the eigenvalues of \(M\), let \(P\) be the projector onto the \(\lambda\) eigenspace, and \(Q\) the projector onto the orthogonal complement. Then \(M = IMI = (P+Q)M(P+Q) = PMP+QMP+PMQ+QMQ\). The first term is \(PMP=\lambda P\), and the second term \(QMP = \lambda QP = 0\) (since \(MP = \lambda P\) by the definition of the eigenspace); we claim the third term \(PMQ\) also vanishes, for the following reason:

  3. proof: Let \(\ket v\) be an element of the subspace onto which \(P\) projects. Then \(M(M^\dagger\ket v) = M^\dagger M\ket v = \lambda (M^\dagger\ket v)\), i.e. \(M^\dagger\ket v\) is itself an eigenvector of \(M\) with eigenvalue \(\lambda\), and therefore lies in the same subspace. Thus \(M^\dagger P\) maps everything into that subspace, so \(QM^\dagger P=0\); taking the adjoint gives \(PMQ=0\). (Note that \(P\) and \(Q\) are Hermitian.)

Now we have arrived at \(M = PMP+QMQ\); next we prove that \(QMQ\) is normal:

  • proof: \(QM = QMI = QM(P+Q) = QMP + QMQ = 0+QMQ\), and \(QM^\dagger = QM^\dagger(P+Q) = 0+QM^\dagger Q\). Therefore, by the normality of \(M\) and property \(Q^2=Q\) (one can prove it trivially), we have:
\[ \begin{align*} (QMQ)(QM^\dagger Q) &= QMQM^\dagger Q \\ &= QMM^\dagger Q \\ &= QM^\dagger MQ \\ &= QM^\dagger QMQ \\ &= (QM^\dagger Q)(QMQ) \end{align*} \]

, thus \(QMQ\) is normal. Then, by induction, \(QMQ\) is diagonal with respect to some orthonormal basis for the subspace onto which \(Q\) projects. (For instance, in the \(d=2\) case: \(P\) has dimension at least \(1\), so \(Q\) has dimension at most \(1\), and we already proved the theorem for \(d=1\) in the first step; induction thus advances \(d\) from \(1\) to \(2\), and so on.) On the other hand \(PMP = \lambda P\) is already diagonal with respect to some orthonormal basis for \(P\), therefore it follows that \(M=PMP+QMQ\) is diagonal with respect to some orthonormal basis for the total vector space \(V\).

This conclusion means that \(M\) can be written as \(M = \sum_i\lambda_i\ket i\bra i\), where \(\lambda_i\) are the eigenvalues of \(M\), \(\ket i\) is an orthonormal basis for \(V\), while each individual \(\ket i\) being an eigenvector of \(M\) with eigenvalue \(\lambda_i\).

We can also express \(M\) in terms of projector: \(M = \sum_i\lambda_iP_i\), where \(P_i\) are the projectors onto the \(\lambda_i\) eigenspace of \(M\). And:

  • These projectors satisfy the completeness relation \(\sum_iP_i = I\),
  • and the orthonormality relation \(P_iP_j = \delta_{ij}P_i\).
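To see the spectral decomposition concretely, here is a small numpy check (a minimal sketch, not from the book; `eigh` covers the Hermitian case, which suffices for this illustration):

```python
import numpy as np

# A Hermitian (hence normal) test matrix; eigh returns eigenvalues and an
# orthonormal eigenbasis, i.e. the spectral decomposition M = sum_i l_i |i><i|.
M = np.array([[4, 3], [3, 4]], dtype=complex)
lam, V = np.linalg.eigh(M)

# reconstruct M from the projectors P_i = |i><i|
P = [np.outer(V[:, i], V[:, i].conj()) for i in range(len(lam))]
assert np.allclose(sum(l * p for l, p in zip(lam, P)), M)

# completeness sum_i P_i = I and orthonormality P_i P_j = delta_ij P_i
assert np.allclose(sum(P), np.eye(2))
assert np.allclose(P[0] @ P[1], 0)
print("eigenvalues:", lam)  # [1. 7.]
```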

Exercise 2.17

Show that a normal matrix is Hermitian if and only if it has real eigenvalues.

solution
  • forward proof: If \(A\) is Hermitian then \(A^\dagger = A\). Let \(\ket\lambda\) be an eigenvector of \(A\) with eigenvalue \(\lambda\); then \(\bra{\lambda}A\ket\lambda = \lambda\braket{\lambda}{\lambda} = \lambda\). Taking the complex conjugate, \(\lambda^* = \bra{\lambda}A^\dagger\ket\lambda = \bra{\lambda}A\ket\lambda = \lambda\), so the eigenvalues are real.
  • backward proof: Since \(A\) is normal, it has a spectral decomposition \(A = \sum_i\lambda_i\ket i\bra i\); taking its adjoint, \(A^\dagger = \sum_i\lambda_i^*\ket i\bra i\). The two expressions are equal because \(\lambda_i^* = \lambda_i\), that is, \(A^\dagger = A\).

Therefore a normal matrix is Hermitian if and only if it has real eigenvalues.

  • A matrix \(U\) is said to be unitary if \(U^\dagger U=I\); likewise, an operator \(U\) is said to be unitary if \(U^\dagger U=I\). A unitary operator also satisfies \(UU^\dagger = I\), and therefore \(U\) is normal and has a spectral decomposition.

  • Geometrically, unitary operators are important because they preserve inner products between vectors: The inner product of \(U\ket v\) and \(U\ket w\) is the same as the inner product of \(\ket v\) and \(\ket w\).

\[ (U\ket v, U\ket w) = \bra{v}U^\dagger U\ket w = \bra v I\ket w = \braket{v}{w} \]

This gives an outer product representation of any unitary operator \(U\): let \(\ket{v_i}\) be any orthonormal basis set and define \(\ket{w_i}\equiv U\ket{v_i}\); then \(\ket{w_i}\) is also an orthonormal basis set, since \(U\) preserves inner products. Clearly, \(U=\sum_i\ket{w_i}\bra{v_i}\).

Exercise 2.18

Show that all eigenvalues of a unitary matrix have modulus \(1\), that is, can be written in the form \(e^{i\theta}\) for some real \(\theta\).

solution

Suppose \(\ket\lambda\) is one of \(U\)'s eigenvectors, so \(U\ket\lambda = \lambda\ket\lambda\). Then \(1 = \braket{\lambda} = \bra{\lambda}U^\dagger U\ket\lambda = \lambda^*\lambda\braket{\lambda} = \vert\lambda\vert^2\). Therefore \(\lambda = e^{i\theta}\) for some real \(\theta\).

Exercise 2.19

(Pauli matrices: Hermitian and unitary) Show that the Pauli matrices are Hermitian and unitary.

solution

By direct calculation: each Pauli matrix satisfies \(\sigma_i^\dagger = \sigma_i\) and \(\sigma_i^2 = I\), hence \(\sigma_i^\dagger\sigma_i = I\).
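A numerical double-check of the same facts (a minimal numpy sketch, not from the book):

```python
import numpy as np

# The four Pauli matrices as numpy arrays.
I = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

for name, s in [("I", I), ("X", X), ("Y", Y), ("Z", Z)]:
    hermitian = np.allclose(s, s.conj().T)            # sigma^dagger == sigma
    unitary = np.allclose(s.conj().T @ s, np.eye(2))  # sigma^dagger sigma == I
    print(name, "Hermitian:", hermitian, "unitary:", unitary)
```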

Exercise 2.20

(Basis changes) Suppose \(A'\) and \(A''\) are matrix representations of an operator \(A\) on a vector space \(V\) with respect to two different orthonormal bases, \(\ket{v_i}\) and \(\ket{w_i}\). Then the elements of \(A'\) and \(A''\) are \(A'_{ij} = \bra{v_i}A\ket{v_j}\) and \(A''_{ij} = \bra{w_i}A\ket{w_j}\). Characterize the relationship between \(A'\) and \(A''\).

solution

Define \(U\equiv\sum_i\ket{w_i}\bra{v_i}\), then we have:

\[ \begin{align*} A'_{ij} &= \bra{v_i}A\ket{v_j} \\ &= \sum_{k,l}\bra{v_i}\ket{w_k}\bra{w_k}A\ket{w_l}\bra{w_l}\ket{v_j} \\ &= \sum_{k,l}\bra{v_i}U\ket{v_k}\bra{w_k}A\ket{w_l}\bra{v_l}U^\dagger\ket{v_j} \\ &= \sum_{k,l}U_{ik}A''_{kl}U^\dagger_{lj} \end{align*} \]

, that is, \(A' = UA''U^\dagger\) with \(U\) unitary.

  • A special subclass of Hermitian operators is the positive operators (extremely important!):
  • A positive operator \(A\) is defined to be an operator such that for any vector \(\ket v\), \((\ket v, A\ket v)\) is a real non-negative number.
  • \(A\) is positive definite if \((\ket v, A\ket v)\) is strictly greater than zero for all \(\ket v\neq0\). (In Exercise 2.24 one will show that any positive operator is automatically Hermitian, and therefore by the spectral decomposition has diagonal representation \(\sum_i\lambda_i\ket i\bra i\), with non-negative eigenvalues \(\lambda_i\))

Exercise 2.21

Repeat the proof of the spectral decomposition for the case when \(M\) is Hermitian, simplifying the proof wherever possible.

solution

If \(M\) is Hermitian, \(M^\dagger = M\). We pick up at the step where \(QMQ\) must be shown to be normal; this is now immediate, since

\[ (QMQ)^\dagger = Q^\dagger M^\dagger Q^\dagger = QMQ \]

, i.e. \(QMQ\) is Hermitian and hence normal. The induction then proceeds as before, deducing that \(QMQ\) is diagonal, and so on...

Exercise 2.22

Prove that two eigenvectors of a Hermitian operator with different eigenvalues are necessarily orthogonal.

solution
\[ \begin{align*} &\bra{\lambda_i}H\ket{\lambda_j} = \lambda_j\braket{\lambda_i}{\lambda_j} \\ &\bra{\lambda_j}H\ket{\lambda_i} = \lambda_i\braket{\lambda_j}{\lambda_i} \\ &\bra{\lambda_j}H^\dagger\ket{\lambda_i} = \lambda_j^*\braket{\lambda_j}{\lambda_i} = \lambda_i\braket{\lambda_j}{\lambda_i} \;\;\text{ (taking the adjoint of the first equation)} \\ & \Rightarrow (\lambda_j^*-\lambda_i)\braket{\lambda_j}{\lambda_i} = (\lambda_j-\lambda_i)\braket{\lambda_j}{\lambda_i} = 0 \end{align*} \]

, therefore if \(\lambda_i\neq\lambda_j\) then \(\braket{\lambda_j}{\lambda_i}=0\), that is, the eigenvectors are necessarily orthogonal.

Exercise 2.23

Show that the eigenvalues of a projector \(P\) are all either \(0\) or \(1\).

solution

Since \(P^2 = P\), acting twice on an eigenvector \(\ket\lambda\) gives:

\[ \begin{cases} P\ket\lambda = \lambda\ket\lambda \\ P^2\ket\lambda = \lambda P\ket\lambda = \lambda^2\ket\lambda \end{cases}\nonumber \]

, thus \(\lambda = \lambda^2 \Rightarrow \lambda \in \{0,1\}\).

Exercise 2.24

(Hermiticity of positive operators) Show that a positive operator is necessarily Hermitian. (Hint: Show that an arbitrary operator \(A\) can be written \(A = B+iC\) where \(B\) and \(C\) are Hermitian.)

solution

Suppose \(A\) is an arbitrary operator; it can be expressed as

\[ A = \bigg(\frac{A+A^\dagger}{2}\bigg) + i\bigg(\frac{A-A^\dagger}{2i}\bigg)\equiv B+iC \nonumber \]

, where \(B\) and \(C\) are both obviously Hermitian. Positivity of \(A\) means that \(\bra vA\ket v \geq 0\), and in particular \(\bra vA\ket v\in\mathbb{R}\), for every \(\ket v\).

\[ \begin{align*} \bra vA\ket v = \bra v(B+iC)\ket v = \bra vB\ket v + i\bra vC\ket v \in \mathbb{R} \end{align*} \]

Since \(B\) and \(C\) are Hermitian, \(\bra vB\ket v\) and \(\bra vC\ket v\) are both real, so the above forces \(\bra vC\ket v = 0\) for every \(\ket v\). A Hermitian operator whose expectation vanishes on every vector is the zero operator (all its eigenvalues occur as such expectations), therefore \(C=0\) and \(A=B\) is necessarily Hermitian.

Exercise 2.25

Show that for any operator \(A\), \(A^\dagger A\) is positive.

solution

Let \(\bra\psi A^\dagger A\ket\psi = c\) with \(\ket\psi\) an arbitrary vector, we have \(c^* = \bra\psi (A^\dagger A)^\dagger\ket\psi = \bra\psi A^\dagger A\ket\psi = c\), hence \(c\in\mathbb R\). On the other hand, \(\bra\psi A^\dagger A\ket\psi = \Big(A\ket\psi, A\ket\psi\Big) = \Big\Vert A\ket\psi\Big\Vert^2 \geq 0\), therefore \(A^\dagger A\) is positive.

2.1.4 Tensor products

  • The tensor product is a way of putting vector spaces together to form larger vector spaces.

  • Suppose \(V\) and \(W\) are vector spaces of dimension \(m\) and \(n\) respectively, and are both Hilbert spaces. Then \(V\otimes W\) (read "V tensor W") is an \(mn\) dimensional vector space.

  • The elements of \(V\otimes W\) are linear combinations of tensor products \(\ket v\otimes\ket w\).

  • In particular, if \(\ket i\) and \(\ket j\) are orthonormal bases for the spaces \(V\) and \(W\), then \(\ket i\otimes\ket j\) is a basis for \(V\otimes W\).

  • We often use the abbreviated notations \(\ket v\ket w\), \(\ket{v,w}\), or \(\ket{vw}\) for the tensor product \(\ket v\otimes\ket w\).

  • By definition the tensor product satisfies the following properties:

    • For an arbitrary scalar \(z\) and elements \(\ket v\) of \(V\) and \(\ket w\) of \(W\),
    \[ z(\ket v\otimes\ket w) = (z\ket v)\otimes\ket w = \ket v\otimes(z\ket w) \]
    • For arbitrary \(\ket{v_1}\) and \(\ket{v_2}\) in \(V\) and \(\ket w\) in \(W\),
    \[ (\ket{v_1} + \ket{v_2})\otimes\ket w = \ket{v_1}\otimes\ket w + \ket{v_2}\otimes\ket w \]
    • For arbitrary \(\ket{w_1}\) and \(\ket{w_2}\) in \(W\) and \(\ket v\) in \(V\),
    \[ \ket v\otimes(\ket{w_1}+\ket{w_2}) = \ket v\otimes\ket{w_1} + \ket v\otimes\ket{w_2} \]
  • Now about the linear operator acting on the \(V\otimes W\) space, if \(A\) and \(B\) are the linear operators on \(V\) and \(W\), respectively, then we can define a linear operator \(A\otimes B\) on \(V\otimes W\) by the equation:

    \[ (A\otimes B)(\ket v\otimes\ket w) \equiv A\ket v\otimes B\ket w \]

    , we can then extend this definition to all elements of \(V\otimes W\) in the natural way to ensure the linearity of \(A\otimes B\):

    \[ (A\otimes B)\bigg(\sum_ia_i\ket{v_i}\otimes\ket{w_i}\bigg) \equiv \sum_ia_iA\ket{v_i}\otimes B\ket{w_i} \]
  • The inner product on the \(V\otimes W\) space is naturally defined as,

    \[ \Bigg(\sum_ia_i\ket{v_i}\otimes\ket{w_i},\;\sum_jb_j\ket{v_j'}\otimes\ket{w_j'}\Bigg) \equiv \sum_{ij}a_i^*b_j\braket{v_i}{v_j'}\braket{w_i}{w_j'} \]
  • From this inner product, the inner product space \(V\otimes W\) inherits the other structure we are familiar with, such as notions of an adjoint, unitarity, normality, and Hermiticity.

  • The above discussion and definition can be made more concrete by switching to a convenient matrix representation known as the Kronecker product. Suppose \(A\) is an \(m\times n\) matrix, and \(B\) is a \(p\times q\) matrix; then we have the matrix representation:

    \[ A\otimes B\equiv \begin{bmatrix} A_{11}B &A_{12}B &\cdots &A_{1n}B \\ A_{21}B &A_{22}B &\cdots &A_{2n}B \\ \vdots &\vdots &\ddots &\vdots \\ A_{m1}B &A_{m2}B &\cdots &A_{mn}B \\ \end{bmatrix} \]

    . For example, the tensor product of the vectors \((1,2)\) and \((3,4)\) is the vector:

    \[ \begin{bmatrix} 1\\2 \end{bmatrix} \otimes \begin{bmatrix} 3\\4 \end{bmatrix} = \begin{bmatrix} 1\times 3\\1\times 4\\2\times 3\\2\times4 \end{bmatrix} = \begin{bmatrix} 3\\4\\6\\8 \end{bmatrix}\nonumber \]

    , and the tensor product of the Pauli matrices \(X\) and \(Y\) is:

    \[ X\otimes Y = \begin{bmatrix} 0\cdot Y & 1\cdot Y\\ 1\cdot Y &0\cdot Y \end{bmatrix} = \begin{bmatrix} 0 &0 &0 &-i\\ 0 &0 &i &0\\ 0 &-i &0 &0\\ i &0 &0 &0 \end{bmatrix}\nonumber \]
  • There is a useful notation \(\ket\psi^{\otimes k}\), which means \(\ket{\psi}\) tensored with itself \(k\) times. For example, \(\ket\psi^{\otimes 3} = \ket\psi\otimes \ket\psi\otimes \ket\psi\). An analogous notation is also used for operators on tensor product spaces.
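numpy's `kron` implements exactly this Kronecker product, so the examples above can be checked mechanically (a minimal sketch, not from the book):

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)

# numpy's kron builds precisely the block matrix A ⊗ B shown above
print(np.kron(np.array([1, 2]), np.array([3, 4])))  # [3 4 6 8]
print(np.kron(X, Y))

# |psi>^{⊗3} by repeated Kronecker products (cf. Exercise 2.26)
psi = np.array([1, 1], dtype=complex) / np.sqrt(2)
psi3 = psi
for _ in range(2):
    psi3 = np.kron(psi3, psi)
print(psi3)  # eight entries, all equal to 1/(2*sqrt(2))
```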

Exercise 2.26

Let \(\ket\psi = (\ket0+\ket1)/\sqrt2\). Write out \(\ket\psi^{\otimes 2}\) and \(\ket\psi^{\otimes 3}\) explicitly, both in terms of tensor products like \(\ket0\ket1\), and using the Kronecker product.

solution
\[ \begin{align*} \ket\psi^{\otimes 2} &= \frac{1}{2}\big(\ket0\ket0+\ket0\ket1+\ket1\ket0+\ket1\ket1\big) =\frac{1}{2}\begin{bmatrix}1\\1\\1\\1\end{bmatrix}\\ \ket\psi^{\otimes 3} &= \frac{1}{2\sqrt2}\big(\ket0\ket0\ket0+\ket0\ket0\ket1+\ket0\ket1\ket0+\ket0\ket1\ket1+ \ket1\ket0\ket0+\ket1\ket0\ket1+\ket1\ket1\ket0+\ket1\ket1\ket1\big) \\&=\frac{1}{2\sqrt2}\begin{bmatrix}1\\1\\1\\1\\1\\1\\1\\1\end{bmatrix} \end{align*} \]

Exercise 2.27

Calculate the matrix representation of the tensor products of the Pauli operators (a) \(X\) and \(Z\); (b) \(I\) and \(X\); (c) \(X\) and \(I\). Is the tensor product commutative?

solution
\[ \begin{align*} X\otimes Z &= \begin{bmatrix}0 &0 &1 &0 \\ 0 &0 &0 &-1\\ 1 &0 &0 &0\\ 0 &-1 &0 &0\end{bmatrix} \\ I\otimes X &= \begin{bmatrix}0 &1 &0 &0 \\ 1 &0 &0 &0\\ 0 &0 &0 &1\\ 0 &0 &1 &0\end{bmatrix} \\ X\otimes I &= \begin{bmatrix}0 &0 &1 &0 \\ 0 &0 &0 &1\\ 1 &0 &0 &0\\ 0 &1 &0 &0\end{bmatrix} \\ \end{align*} \]

, the tensor product is not commutative.

Exercise 2.28

Show that the transpose, complex conjugation, and adjoint operations distribute over the tensor product.

\[ (A\otimes B)^* = A^*\otimes B^*\;;\;\;(A\otimes B)^\intercal = A^\intercal\otimes B^\intercal \;;\;\;(A\otimes B)^\dagger = A^\dagger\otimes B^\dagger \]
solution
\[ \begin{align*} (A\otimes B)^* &= \begin{bmatrix} A_{11}^*B^* &\cdots &A_{1n}^*B^* \\ \vdots&\ddots&\vdots \\ A_{m1}^*B^* &\cdots &A_{mn}^*B^* \end{bmatrix} = A^*\otimes B^* \\ (A\otimes B)^\intercal &= \begin{bmatrix} A_{11}B^\intercal &\cdots &A_{m1}B^\intercal \\ \vdots&\ddots&\vdots \\ A_{1n}B^\intercal &\cdots &A_{mn}B^\intercal \end{bmatrix} = A^\intercal\otimes B^\intercal \\ (A\otimes B)^\dagger &= \begin{bmatrix} A_{11}^*B^\dagger &\cdots &A_{m1}^*B^\dagger \\ \vdots&\ddots&\vdots \\ A_{1n}^*B^\dagger &\cdots &A_{mn}^*B^\dagger \end{bmatrix} = A^\dagger\otimes B^\dagger \end{align*} \]

Exercise 2.29

Show that the tensor product of two unitary operators is unitary.

solution

Let \(U_1\) and \(U_2\) be the two unitary operators (therefore \(U_1U_1^\dagger = I,\;U_2U_2^\dagger=I\)), then we have:

\[ \begin{align*} (U_1\otimes U_2)(U_1\otimes U_2)^\dagger = U_1U_1^\dagger\otimes U_2U_2^\dagger = I\otimes I = I \end{align*} \]

Exercise 2.30

Show that the tensor product of two Hermitian operators is Hermitian.

solution

Let \(H_1\) and \(H_2\) be the two Hermitian operators, then we have:

\[ \begin{align*} (H_1\otimes H_2)^\dagger = H_1^\dagger \otimes H_2^\dagger = H_1\otimes H_2 = (H_1\otimes H_2) \end{align*} \]

Exercise 2.31

Show that the tensor product of two positive operators is positive.

solution

Let \(A\) and \(B\) be the two positive operators. Since positive operators are Hermitian (Exercise 2.24), they have spectral decompositions \(A = \sum_i\lambda_i\ket i\bra i\) and \(B = \sum_j\mu_j\ket j\bra j\) with all \(\lambda_i,\mu_j\geq0\). Then:

\[ \begin{align*} A\otimes B = \sum_{i,j}\lambda_i\mu_j\big(\ket i\otimes\ket j\big)\big(\bra i\otimes\bra j\big) \end{align*} \]

is diagonal in the orthonormal basis \(\ket i\otimes\ket j\) with non-negative eigenvalues \(\lambda_i\mu_j\), hence positive. (Checking only product vectors, for which \(\bra{v\otimes w}A\otimes B\ket{v\otimes w} = \bra vA\ket v\cdot\bra wB\ket w \geq0\), would not suffice, since positivity must hold for every vector of \(V\otimes W\).)

Exercise 2.32

Show that the tensor product of two projectors is a projector.

solution

Let \(P_1\) and \(P_2\) be the two projectors, then we have:

\[ \begin{align*} (P_1\otimes P_2)^2 = (P_1\otimes P_2)(P_1\otimes P_2) = P_1^2\otimes P_2^2 = P_1\otimes P_2 \end{align*} \]

Exercise 2.33

The Hadamard operator on one qubit may be written as

\[ H = \frac{1}{\sqrt2}\bigg[\big(\ket0+\ket1\big)\bra0\; + \big(\ket0-\ket1\big)\bra1\;\bigg] \]

Show explicitly that the Hadamard transform on \(n\) qubits, \(H^{\otimes n}\), may be written as

\[ H^{\otimes n} = \frac{1}{\sqrt{2^n}}\sum_{\vec x,\vec y}(-1)^{\vec x\cdot \vec y}\ket {\vec x}\bra {\vec y} \]

Write out an explicit matrix representation for \(H^{\otimes 2}\).

solution
\[ \begin{align*} H^{\otimes n} &= \frac{1}{\sqrt{2^n}}\bigg(\sum_{x_1,y_1}(-1)^{x_1\cdot y_1}\ket {x_1}\bra {y_1}\bigg) \otimes \bigg(\sum_{x_2,y_2}(-1)^{x_2\cdot y_2}\ket {x_2}\bra {y_2}\bigg) \otimes\cdots\otimes \bigg(\sum_{x_n,y_n}(-1)^{x_n\cdot y_n}\ket {x_n}\bra {y_n}\bigg) \\ &= \frac{1}{\sqrt{2^n}}\sum_{\vec x,\vec y}(-1)^{\vec x\cdot \vec y}\ket {\vec x}\bra {\vec y} \end{align*} \]

, and the matrix representation for \(H^{\otimes 2}\) is:

\[ \frac{1}{2}\begin{bmatrix} 1&1&1&1\\1&-1&1&-1\\1&1&-1&-1\\1&-1&-1&1 \end{bmatrix}\nonumber \]

2.1.5 Operator functions

  • Generically, given a function \(f:\mathbb C\to\mathbb C\), it is possible to define a corresponding matrix function on normal matrices.

  • Let \(A = \sum_aa\ket a\bra a\) be a spectral decomposition for a normal operator \(A\), we then define:

    \[ f(A) \equiv \sum_a f(a)\ket a\bra a \]

    , one can show that \(f(A)\) is uniquely defined.

  • This procedure can be used to define the square root of a positive operator, the logarithm of a positive-definite operator, or the exponential of a normal operator. As an example,

    \[ e^{\theta Z} = \begin{bmatrix}e^\theta &0\\0 &e^{-\theta}\end{bmatrix}\nonumber \]
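A sketch of how \(f(A)\) can be computed in practice for a Hermitian \(A\) (the helper name `op_func` is illustrative; a general normal operator would need a unitary eigendecomposition rather than `eigh`):

```python
import numpy as np

def op_func(A, f):
    """Apply f to a Hermitian operator via its spectral decomposition,
    f(A) = sum_a f(a)|a><a|."""
    lam, V = np.linalg.eigh(A)
    return V @ np.diag(f(lam)) @ V.conj().T

Z = np.array([[1, 0], [0, -1]], dtype=float)
theta = 0.3
print(op_func(theta * Z, np.exp))   # diag(e^theta, e^-theta), as above

A = np.array([[4, 3], [3, 4]], dtype=float)
print(op_func(A, np.sqrt))          # square root of A (cf. Exercise 2.34)
```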

Exercise 2.34

Find the square root and logarithm of the matrix

\[ \begin{bmatrix}4&3\\3&4\end{bmatrix}\nonumber \]
solution

The spectral decomposition for the matrix is:

\[ \begin{align*} \begin{bmatrix}4&3\\3&4\end{bmatrix} = 1\cdot\Bigg(\frac{1}{\sqrt2}\begin{bmatrix}1\\-1\end{bmatrix}\frac{1}{\sqrt2}\begin{bmatrix}1&-1\end{bmatrix}\Bigg) + 7\cdot\Bigg(\frac{1}{\sqrt2}\begin{bmatrix}1\\1\end{bmatrix}\frac{1}{\sqrt2}\begin{bmatrix}1&1\end{bmatrix}\Bigg) \end{align*} \]

Therefore we have:

\[ \begin{align*} \sqrt A &= \frac{1}{2}\begin{bmatrix}\sqrt7+1&\sqrt7-1\\\sqrt7-1&\sqrt7+1\end{bmatrix} \\ \ln A &= \frac{1}{2}\begin{bmatrix}\ln7&\ln7\\\ln7&\ln7\end{bmatrix} \end{align*} \]

Exercise 2.35

(Exponential of the Pauli matrices) Let \(\vec v\) be any real, three-dimensional unit vector and \(\theta\) a real number. Prove that

\[ \exp(i\theta\vec v\cdot\vec\sigma) = (\cos\theta)I + i\sin(\theta)\vec v\cdot\vec\sigma \]

where \(\vec v\cdot\vec\sigma\equiv\sum_{i=1}^3v_i\sigma_i\).

solution

We first calculate the eigenvector and eigenvalue of \(\vec v\cdot\vec\sigma = \begin{bmatrix}v_3&v_1-iv_2\\v_1+iv_2&-v_3\end{bmatrix}\):

\[ \begin{align*} \det\begin{bmatrix}v_3-\lambda&v_1-iv_2\\v_1+iv_2&-v_3-\lambda\end{bmatrix} &= \lambda^2-(v_1^2+v_2^2+v_3^2) = 0 \\ &\Rightarrow \lambda = \pm\sqrt{v_1^2+v_2^2+v_3^2} \end{align*} \]

, but since \(\vec v\) is stated as a unit vector, we have \(\sqrt{v_1^2+v_2^2+v_3^2}=1\), hence \(\lambda = \pm1\). Denote the eigenvectors with \(\ket{\lambda_1}\) and \(\ket{\lambda_{-1}}\).

\[ \begin{align*} \exp(i\theta\vec v\cdot\vec\sigma) &= \exp(1\cdot i\theta)\ket{\lambda_1}\bra{\lambda_1} + \exp(-1\cdot i\theta)\ket{\lambda_{-1}}\bra{\lambda_{-1}} \\ &= (\cos\theta + i\sin\theta)\ket{\lambda_1}\bra{\lambda_1} + (\cos\theta - i\sin\theta)\ket{\lambda_{-1}}\bra{\lambda_{-1}} \\ &= (\cos\theta)\big(\ket{\lambda_1}\bra{\lambda_1}+\ket{\lambda_{-1}}\bra{\lambda_{-1}}\big) + (i\sin\theta)\big(\ket{\lambda_1}\bra{\lambda_1}-\ket{\lambda_{-1}}\bra{\lambda_{-1}}\big) \\ &= (\cos\theta)I + i\sin(\theta)\vec v\cdot\vec\sigma \end{align*} \]

, since the spectral decomposition of \(\vec v\cdot\vec\sigma\) is \(\ket{\lambda_1}\bra{\lambda_1}-\ket{\lambda_{-1}}\bra{\lambda_{-1}}\).
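A numerical spot-check of this identity (a minimal numpy sketch, computing the matrix exponential through the spectral decomposition of \(\vec v\cdot\vec\sigma\); the particular \(\theta\) and \(\vec v\) are arbitrary choices):

```python
import numpy as np

I = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

theta = 0.7
v = np.array([1.0, 2.0, 2.0]); v /= np.linalg.norm(v)  # a unit vector
vs = v[0] * X + v[1] * Y + v[2] * Z                    # v . sigma

# left side: exp(i theta v.sigma) via the spectral decomposition of v.sigma
lam, V = np.linalg.eigh(vs)
lhs = V @ np.diag(np.exp(1j * theta * lam)) @ V.conj().T
# right side: cos(theta) I + i sin(theta) v.sigma
rhs = np.cos(theta) * I + 1j * np.sin(theta) * vs
assert np.allclose(lhs, rhs)
```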

  • The trace of \(A\) (another matrix function) is defined to be the sum of its diagonal elements:

    \[ \tr(A) \equiv \sum_iA_{ii} \]
  • The trace is easily seen to be cyclic, \(\tr(AB) = \tr(BA)\), and linear, \(\tr(A+B) = \tr(A)+\tr(B)\) and \(\tr(zA) = z\tr(A)\), where \(A\) and \(B\) are arbitrary matrices and \(z\) a complex number.

  • From the cyclic property it follows that the trace of a matrix is invariant under unitary similarity transformations \(A\rightarrow UAU^\dagger\) (cf. Exercise 2.20 on basis changes), that is, \(\tr(UAU^\dagger) = \tr(U^\dagger UA) = \tr(A)\); hence it makes sense to define the trace of an operator \(A\) to be the trace of any matrix representation of \(A\).

  • Useful result: suppose \(\ket\psi\) is a unit vector and \(A\) is an arbitrary operator. To evaluate \(\tr(A\ket\psi\bra\psi)\) one can use the Gram–Schmidt procedure to extend \(\ket\psi\) to an orthonormal basis \(\ket i\) which includes \(\ket\psi\) as the first element. In that way we have:

    \[ \tr(A\ket\psi\bra\psi) = \sum_i\bra iA\ket\psi\braket{\psi}{i} = \expval{A}{\psi} \]

, which is extremely useful in evaluating the trace of an operator.
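A quick numerical check of this trace identity on random data (a minimal numpy sketch):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
psi = rng.normal(size=3) + 1j * rng.normal(size=3)
psi /= np.linalg.norm(psi)  # make |psi> a unit vector

lhs = np.trace(A @ np.outer(psi, psi.conj()))  # tr(A |psi><psi|)
rhs = psi.conj() @ A @ psi                     # <psi|A|psi>
assert np.allclose(lhs, rhs)
```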

Exercise 2.36

Show that the Pauli matrices except for \(I\) have trace zero.

solution

\(\tr(X) = \tr(Y) = \tr(Z) = 0\)

Exercise 2.37

(Cyclic property of the trace) If \(A\) and \(B\) are two linear operators show that

\[ \tr(AB) = \tr(BA) \]
solution

Suppose \(A\) is \(m\times n\) and \(B\) is \(n\times m\); then we have:

\[ \begin{align*} \tr(AB) = \sum_{i=1}^m \bigg(\sum_{j=1}^n A_{ij} B_{ji}\bigg) = \sum_{j=1}^n \bigg(\sum_{i=1}^m B_{ji} A_{ij}\bigg) = \tr(BA) \end{align*} \]

Exercise 2.38

(Linearity of the trace) If A and B are two linear operators, show that

\[ \tr(A+B) = \tr(A)+\tr(B) \]

and if \(z\) is an arbitrary complex number show that

\[ \tr(zA) = z\tr(A) \]
solution

Expand the matrices in terms of their elements; both identities then follow by simple calculation.

Exercise 2.39

(The Hilbert–Schmidt inner product on operators) The set \(L_V\) of linear operators on a Hilbert space \(V\) is obviously a vector space - the sum of two linear operators is a linear operator, \(zA\) is a linear operator if \(A\) is a linear operator and \(z\) is a complex number, and there is a zero element \(0\). An important additional result is that the vector space \(L_V\) can be given a natural inner product structure, turning it into a Hilbert space.

(1) Show that the function \((\cdot,\cdot)\) on \(L_V\times L_V\) defined by

\[ (A,B)\equiv \tr(A^\dagger B) \]

is an inner product function. This inner product is known as the Hilbert–Schmidt or trace inner product.

(2) If \(V\) has \(d\) dimensions show that \(L_V\) has dimension \(d^2\).

(3) Find an orthonormal basis of Hermitian matrices for the Hilbert space \(L_V\).

solution

(1) To prove this we need to check that \((\cdot,\cdot)\) satisfies linearity in the second argument, \((A,B) = (B,A)^*\), and \((A, A)\geq0\) with equality if and only if \(A=0\).

\[ \begin{align*} \bigg(A, \sum_i\lambda_iB_i\bigg) &= \tr(A^\dagger\sum_i\lambda_iB_i) = \tr(\sum_i\lambda_iA^\dagger B_i) = \sum_i\lambda_i\tr(A^\dagger B_i) = \sum_i\lambda_i\big(A, B_i\big) \\ (A,B)^* &= (\tr(A^\dagger B))^* = \tr\Big((A^\dagger B)^\dagger\Big) = \tr(B^\dagger A) = (B,A) \\ (A,A) &= \tr(A^\dagger A) = \sum_{i,j}\vert A_{ij}\vert^2 \geq0, \text{ with equality iff } A=0 \end{align*} \]

, hence the function \((\cdot,\cdot)\) on \(L_V\times L_V\) defined by \((A,B)\equiv \tr(A^\dagger B)\) is an inner product function.

(2) A matrix in \(L_V\) has \(d\times d = d^2\) independent entries, and the \(d^2\) operators \(\ket{v_j}\bra{v_k}\) form a basis of \(L_V\) (cf. Exercise 2.10). Therefore \(\dim(L_V) = d^2\).

(3) One such basis: take \(\ket{v_j}\bra{v_j}\) for each \(j\), together with \(\frac{1}{\sqrt2}\big(\ket{v_j}\bra{v_k}+\ket{v_k}\bra{v_j}\big)\) and \(\frac{i}{\sqrt2}\big(\ket{v_j}\bra{v_k}-\ket{v_k}\bra{v_j}\big)\) for each pair \(j<k\). These are \(d^2\) Hermitian matrices which are orthonormal under the trace inner product. (Equivalently, one could start from the upper-triangular matrix units and run the Gram–Schmidt procedure.)
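For \(d=2\) the claim in (3) can also be verified numerically: the normalized Pauli matrices \(\sigma_i/\sqrt2\) are orthonormal under the trace inner product (a minimal numpy sketch; `hs` is an illustrative helper name):

```python
import numpy as np

I = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def hs(A, B):
    """Hilbert-Schmidt inner product (A, B) = tr(A^dagger B)."""
    return np.trace(A.conj().T @ B)

# the normalized Paulis form an orthonormal Hermitian basis of L_V for d = 2
basis = [s / np.sqrt(2) for s in (I, X, Y, Z)]
gram = np.array([[hs(a, b) for b in basis] for a in basis])
assert np.allclose(gram, np.eye(4))
```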

2.1.6 The commutator and anti-commutator

  • The commutator between two operators A and B is defined to be:

    \[ [A,B]\equiv AB-BA \]
  • If \(AB=BA\), we say \(A\) commutes with \(B\).

  • Similarly, the anti-commutator of two operators \(A\) and \(B\) is defined by:

    \[ \{A,B\}\equiv AB+BA \]
  • We say \(A\) anti-commutes with \(B\) if \(\{A,B\}=0\).
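These definitions are easy to play with numerically; the sketch below verifies a few of the Pauli relations from the exercises that follow (a minimal numpy sketch):

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def comm(A, B):  return A @ B - B @ A   # the commutator [A, B]
def acomm(A, B): return A @ B + B @ A   # the anti-commutator {A, B}

assert np.allclose(comm(X, Y), 2j * Z)   # [X, Y] = 2iZ (Exercise 2.40)
assert np.allclose(acomm(X, Y), 0)       # {X, Y} = 0   (Exercise 2.41)
assert np.allclose(X @ Y, (comm(X, Y) + acomm(X, Y)) / 2)  # Exercise 2.42
```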

Theorem 2.2

(Simultaneous diagonalization theorem) Suppose \(A\) and \(B\) are Hermitian operators. Then \([A,B]=0\) if and only if there exists an orthonormal basis such that both \(A\) and \(B\) are diagonal with respect to that basis. We say that \(A\) and \(B\) are simultaneously diagonalizable in this case.

proof

The converse statement is easy to prove (if \(A\) and \(B\) are diagonal in the same orthonormal basis then \([A,B]=0\)), so we start with the forward one.

Let \(\ket{a, j}\) be an orthonormal basis for the eigenspace \(V_a\) of \(A\) with eigenvalue \(a\); the index \(j\) labels possible degeneracies (one eigenvalue may correspond to several eigenvectors, and the index distinguishes them). Since \([A,B]=0\), we have:

\[ AB\ket{a,j} = BA\ket{a,j} = aB\ket{a,j} \]

, therefore we can view \(B\ket{a,j}\) as an element of the eigenspace \(V_a\). Let \(P_a\) denote the projector onto the space \(V_a\) and define \(B_a\equiv P_aBP_a\) (defined this way so that \(B_a\) is Hermitian on \(V_a\)); we can therefore give \(B_a\) a spectral decomposition in terms of an orthonormal set of eigenvectors which span the space \(V_a\). Suppose these eigenvectors are \(\ket{a,b,k}\), where the indices \(a\) and \(b\) label the eigenvalues of \(A\) and \(B_a\), and \(k\) is an extra index to allow for the possibility of a degenerate \(B_a\).

By the same argument as in the displayed equation above, \(B\ket{a,b,k}\) is an element of \(V_a\), so \(B\ket{a,b,k} = P_aB\ket{a,b,k}\). Moreover we have \(P_a\ket{a,b,k} = \ket{a,b,k}\); combining the two relations we have:

\[ B\ket{a,b,k} = P_aBP_a\ket{a,b,k} = B_a\ket{a,b,k} = b\ket{a,b,k} \]

, which indicates that \(\ket{a,b,k}\) is an eigenvector of \(B\) with eigenvalue \(b\), thus \(\ket{a,b,k}\) is an orthonormal set of eigenvectors of both \(A\) and \(B\), spanning the entire vector space on which \(A\) and \(B\) are defined. That is, \(A\) and \(B\) are simultaneously diagonalizable.

Exercise 2.40

(Commutation relations for the Pauli matrices) Verify the commutation relations

\[ [X,Y] = 2iZ;\;\;\;[Y,Z] = 2iX;\;\;\;[Z,X] = 2iY \]

There is an elegant way of writing this using \(\epsilon_{jkl}\), the antisymmetric tensor on three indices, for which \(\epsilon_{jkl}=0\) except for \(\epsilon_{123} = \epsilon_{231} = \epsilon_{312}=1\) and \(\epsilon_{321} = \epsilon_{132} = \epsilon_{213}=-1\):

\[ [\sigma_j, \sigma_k] = 2i\sum_{l=1}^3\epsilon_{jkl}\sigma_l \]
solution
\[ \begin{align*} [X,Y] &= \begin{bmatrix}i&0\\0&-i\end{bmatrix}-\begin{bmatrix}-i&0\\0&i\end{bmatrix} = 2i\begin{bmatrix}1&0\\0&-1\end{bmatrix} \\ [Y,Z] &= \begin{bmatrix}0&i\\i&0\end{bmatrix}-\begin{bmatrix}0&-i\\-i&0\end{bmatrix} = 2i\begin{bmatrix}0&1\\1&0\end{bmatrix} \\ [Z,X] &= \begin{bmatrix}0&1\\-1&0\end{bmatrix}-\begin{bmatrix}0&-1\\1&0\end{bmatrix} = 2i\begin{bmatrix}0&-i\\i&0\end{bmatrix} \end{align*} \]

Exercise 2.41

(Anti-commutation relations for the Pauli matrices) Verify the anti-commutation relations

\[ \{\sigma_i,\sigma_j\} = 0 \]

, where \(i\neq j\) are both chosen from the set \(\{1,2,3\}\). Also verify that for \(i\in\{0,1,2,3\}\),

\[ \sigma_i^2 = I \]
solution

One can verify these relations by direct matrix multiplication; for example \(XY = iZ = -YX\), so \(\{X,Y\}=0\), and each \(\sigma_i^2=I\) follows the same way.

Exercise 2.42

Verify that

\[ AB = \frac{[A,B]+\{A,B\}}{2} \]
solution
\[ \begin{align*} \frac{[A,B]+\{A,B\}}{2}= \frac{AB-BA+AB+BA}{2} = \frac{2AB}{2} = AB \end{align*} \]

Exercise 2.43

Show for \(j,k\in\{1,2,3\}\),

\[ \sigma_j\sigma_k = \delta_{jk}I + i\sum_{l=1}^3\epsilon_{jkl}\sigma_l \]
solution

From the conclusions of the previous exercises (2.40, 2.41 and 2.42), using \([\sigma_j,\sigma_k] = 2i\sum_{l=1}^3\epsilon_{jkl}\sigma_l\) and \(\{\sigma_j,\sigma_k\} = 2\delta_{jk}I\) (zero for \(j\neq k\), and \(\sigma_j^2=I\) for \(j=k\)), we have:

\[ \begin{align*} \sigma_j\sigma_k = \frac{[\sigma_j,\sigma_k]+\{\sigma_j,\sigma_k\}}{2} = \frac{2i\sum_{l=1}^3\epsilon_{jkl}\sigma_l + 2\delta_{jk}I}{2} = \delta_{jk}I + i\sum_{l=1}^3\epsilon_{jkl}\sigma_l \end{align*} \]

Exercise 2.44

Suppose \([A,B]=\{A,B\}=0\) and \(A\) is invertible. Show that \(B\) must be \(0\).

solution

From the above assumption we have \(AB=BA\) and \(AB=-BA\), hence \(BA=-BA\). And since \(A\) is invertible, \(BAA^{-1} = -BAA^{-1}\Rightarrow B=-B\), which indicates \(B=0\).

Exercise 2.45

Show that \([A,B]^\dagger = [B^\dagger, A^\dagger]\).

solution
\[ \begin{align*} [A,B]^\dagger = (AB)^\dagger - (BA)^\dagger = B^\dagger A^\dagger - A^\dagger B^\dagger = [B^\dagger, A^\dagger] \end{align*} \]

Exercise 2.46

Show that \([A,B] = -[B,A]\).

solution

This follows by simply expanding both sides.

Exercise 2.47

Suppose \(A\) and \(B\) are Hermitian. Show that \(i[A,B]\) is also Hermitian.

solution

From the conclusions of the previous exercises (2.45 and 2.46), we have:

\[ \begin{align*} (i[A,B])^\dagger = -i[A,B]^\dagger = -i[B^\dagger, A^\dagger] = -i[B,A] = i[A,B] \end{align*} \]

, hence \(i[A,B]\) is also Hermitian.

2.1.7 The polar and singular value decompositions

  • The polar and singular value decompositions are useful ways of breaking linear operators up into simpler parts.
  • In particular, these decompositions allow us to break general linear operators up into products of unitary operators and positive operators.

Theorem 2.3

(Polar decomposition) Let \(A\) be a linear operator on a vector space \(V\). Then there exist a unitary operator \(U\) and positive operators \(J\) and \(K\) such that

\[ A = UJ = KU \]

, where the unique positive operators \(J\) and \(K\) satisfying these equations are defined by \(J\equiv \sqrt{A^\dagger A}\) and \(K = \sqrt{AA^\dagger}\). Moreover, if \(A\) is invertible then \(U\) is unique.

We call the expression \(A=UJ\) the left polar decomposition of \(A\), and \(A=KU\) the right polar decomposition of \(A\). Most often, we’ll omit the 'right' or 'left' nomenclature, and use the term 'polar decomposition' for both expressions, with context indicating which is meant.

proof

From Exercise 2.25 we know that for any operator \(A\), \(A^\dagger A\) is positive, hence \(J\equiv \sqrt{A^\dagger A}\) is a positive operator and can be given a spectral decomposition:

\[ J = \sum_i\lambda_i\ket i\bra i,\;(\lambda_i\geq0) \]

Now define \(\ket{\psi_i}\equiv A\ket i\), so that \(\braket{\psi_i} = \bra iA^\dagger A\ket i = \lambda_i^2\). For those \(i\) with \(\lambda_i\neq0\), define \(\ket{e_i}\equiv\ket{\psi_i}/\lambda_i\), which are normalized; they are also mutually orthogonal, since for \(i\neq j\), \(\braket{e_i}{e_j} = \bra{i}A^\dagger A\ket j/(\lambda_i\lambda_j) = \bra{i}J^2\ket j/(\lambda_i\lambda_j) = 0\).

We can extend the orthonormal set \(\ket{e_i}\) to an orthonormal basis using the Gram–Schmidt procedure, supplying vectors \(\ket{e_i}\) for those \(i\) omitted above because \(\lambda_i=0\). We then define a unitary operator \(U\equiv \sum_i\ket{e_i}\bra i\), so that:

\[ \begin{cases} \lambda_i\neq0,\;\;\;UJ\ket{i} = \lambda_i\ket{e_i} = \ket{\psi_i} = A\ket i \\ \lambda_i=0, \;\;\;UJ\ket{i} = 0 \end{cases}\nonumber \]

from the above cases we see that \(UJ\) and \(A\) act identically on every basis vector \(\ket i\), therefore \(A = UJ\).

Now if \(A\) is invertible, then so is \(J\), and \(U = AJ^{-1}\) is unique. The right polar decomposition follows naturally because \(A = UJ = UJU^\dagger U = KU\), where \(K\equiv UJU^\dagger\) is a positive operator; moreover \(AA^\dagger=KUU^\dagger K^\dagger = K^2\), so \(K = \sqrt{AA^\dagger}\).

The following singular value decomposition combines the polar decomposition with the spectral theorem:

Corollary 2.4

(Singular value decomposition) Let \(A\) be a square matrix. Then there exist unitary matrices \(U\) and \(V\) , and a diagonal matrix \(D\) with non-negative entries such that

\[ A=UDV \]

The diagonal elements of \(D\) are called the singular values of \(A\).

proof

By the polar decomposition we have \(A = SJ\) with unitary \(S\) and positive \(J\) (left polar decomposition). And by the spectral theorem, \(J = TDT^\dagger\) with unitary \(T\) and diagonal \(D\) (notice that \(D\) has no negative entries since \(J\) is positive). Combining the two relations we get:

\[ A = STDT^\dagger = UDV \]

, where \(U\equiv ST\) and \(V\equiv T^\dagger\).

Exercise 2.48

What is the polar decomposition of a positive matrix \(P\) ? Of a unitary matrix \(U\) ? Of a Hermitian matrix, \(H\) ?

solution
  • positive matrix \(P\) : Since \(P\) is positive, it has spectral decomposition: \(P = \sum_i\lambda_i\ket i\bra i\) with all \(\lambda_i \geq 0\), and we have

    \[ J = \sqrt{P^\dagger P} = \sqrt{PP} = \sum_i\sqrt{\lambda_i^2}\ket i\bra i = P\nonumber \]

, thus \(J = P\) and the polar decomposition is \(P = IP\); we may take \(U=I\).

  • unitary matrix \(U\) : \(J = \sqrt{U^\dagger U} = I\), thus \(U = UI\) for all \(U\).

  • Hermitian matrix \(H\): \(J = \sqrt{H^\dagger H} = \sqrt{HH} = \sqrt{H^2}\), thus \(H = U\sqrt{H^2}\). Note that in general \(\sqrt{H^2}\neq H\), since \(H = \sum_i\lambda_i\ket i\bra i\) but \(\sqrt{H^2} = \sum_i\big\vert\lambda_i\big\vert\ket i\bra i\).

Exercise 2.49

Express the polar decomposition of a normal matrix in the outer product representation.

solution

A normal matrix \(A\) is diagonalizable: \(A=\sum_i\lambda_i\ket i\bra i\). We use this to calculate \(J\):

\[ J = \sqrt{A^\dagger A} = \sqrt{\sum_i\sum_j\lambda_j^*\lambda_i\ket j\braket{j}{i}\bra i} = \sqrt{\sum_i\vert\lambda_i\vert^2\ket i\bra i} = \sum_i\big\vert\lambda_i\big\vert\ket i\bra i \]

, therefore \(A = UJ\) with \(J = \sum_i\big\vert\lambda_i\big\vert\ket i\bra i\) and \(U = \sum_{i:\lambda_i\neq0}\frac{\lambda_i}{\vert\lambda_i\vert}\ket i\bra i\) (completed with arbitrary phases on the indices with \(\lambda_i=0\)).

Exercise 2.50

Find the left and right polar decompositions of the matrix \(A=\begin{bmatrix}1&0\\1&1\end{bmatrix}\).

solution

\(A^\dagger = \begin{bmatrix}1&1\\0&1\end{bmatrix},\;\;A^\dagger A = \begin{bmatrix}2&1\\1&1\end{bmatrix},\;\;AA^\dagger = \begin{bmatrix}1&1\\1&2\end{bmatrix}\)

  • For \(A^\dagger A\), its eigenvalues and associated eigenvectors are:

    \[ \begin{cases} \lambda_+ = \frac{3+\sqrt5}{2}, \;\;\; \ket{\lambda_+} = \frac{1}{\sqrt{10-2\sqrt5}}\begin{bmatrix}2\\\sqrt5-1\end{bmatrix} \\ \lambda_- = \frac{3-\sqrt5}{2}, \;\;\; \ket{\lambda_-} = \frac{1}{\sqrt{10+2\sqrt5}}\begin{bmatrix}2\\-\sqrt5-1\end{bmatrix} \end{cases}\nonumber \]

    , then we can calculate:

    \[ \begin{cases} \ket{\lambda_+}\bra{\lambda_+} = \frac{1}{5-\sqrt5}\begin{bmatrix}2&\sqrt5-1\\\sqrt5-1&3-\sqrt5\end{bmatrix} \\ \ket{\lambda_-}\bra{\lambda_-} = \frac{1}{5+\sqrt5}\begin{bmatrix}2&-\sqrt5-1\\-\sqrt5-1&3+\sqrt5\end{bmatrix} \end{cases}\nonumber \]

    , and notice that:

    \[ \begin{cases} \sqrt{\lambda_+} = \frac{\sqrt5+1}{2}, \;\;\;\frac{1}{\sqrt{\lambda_+}} = \frac{\sqrt5-1}{2}\\ \sqrt{\lambda_-} = \frac{\sqrt5-1}{2}, \;\;\;\frac{1}{\sqrt{\lambda_-}} = \frac{\sqrt5+1}{2} \end{cases}\nonumber \]

    , therefore we may calculate \(J\):

    \[ J = \sqrt{A^\dagger A} = \sqrt{\lambda_+}\ket{\lambda_+}\bra{\lambda_+} + \sqrt{\lambda_-}\ket{\lambda_-}\bra{\lambda_-} = \frac{1}{\sqrt5} \begin{bmatrix}3&1\\1&2\end{bmatrix}\nonumber \]

    and then \(J^{-1}\):

    \[ J^{-1} = \frac{1}{\sqrt5} \begin{bmatrix}2&-1\\-1&3\end{bmatrix}\nonumber \]

    , thus we can calculate \(U=AJ^{-1}\)

    \[ U=AJ^{-1} = \frac{1}{\sqrt5}\begin{bmatrix}1&0\\1&1\end{bmatrix} \begin{bmatrix}2&-1\\-1&3\end{bmatrix} \nonumber = \frac{1}{\sqrt5}\begin{bmatrix}2&-1\\1&2\end{bmatrix} \]

    . Hence the left polar decomposition of \(A\) is:

    \[ A=\begin{bmatrix}1&0\\1&1\end{bmatrix} = UJ = \Bigg( \frac{1}{\sqrt5}\begin{bmatrix}2&-1\\1&2\end{bmatrix} \Bigg) \Bigg( \frac{1}{\sqrt5}\begin{bmatrix}3&1\\1&2\end{bmatrix} \Bigg)\nonumber \]
  • For the \(AA^\dagger\) case, or the right polar decomposition, perform similar process we get:

    \[ A=\begin{bmatrix}1&0\\1&1\end{bmatrix} = KU = \Bigg( \frac{1}{\sqrt5}\begin{bmatrix}2&1\\1&3\end{bmatrix} \Bigg) \Bigg( \frac{1}{\sqrt5}\begin{bmatrix}2&-1\\1&2\end{bmatrix} \Bigg)\nonumber \]

Tedious work, but still worth a try!
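In practice the polar decompositions can be read off from a singular value decomposition, which makes a handy numerical cross-check of the result above (a minimal numpy sketch; from \(A = W\Sigma V^\dagger\) one gets \(U = WV^\dagger\), \(J = V\Sigma V^\dagger\), \(K = W\Sigma W^\dagger\)):

```python
import numpy as np

A = np.array([[1, 0], [1, 1]], dtype=float)

# SVD gives A = W diag(s) Vh; then U = W Vh, J = Vh^dagger diag(s) Vh,
# and K = W diag(s) W^dagger reproduce the left and right polar decompositions.
W, s, Vh = np.linalg.svd(A)
U = W @ Vh
J = Vh.conj().T @ np.diag(s) @ Vh   # sqrt(A^dagger A)
K = W @ np.diag(s) @ W.conj().T     # sqrt(A A^dagger)

assert np.allclose(U @ J, A) and np.allclose(K @ U, A)
print(np.round(np.sqrt(5) * J, 6))  # [[3 1] [1 2]], i.e. J as computed above
print(np.round(np.sqrt(5) * K, 6))  # [[2 1] [1 3]], i.e. K as computed above
```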

2.2 The postulates of quantum mechanics

2.2.1 State space & Evolution

The first postulate of quantum mechanics sets up the arena in which quantum mechanics takes place.

Postulate 1

Associated to any isolated physical system is a complex vector space with inner product (Hilbert space) known as the state space of the system. The system is completely described by its state vector, which is a unit vector in the system’s state space.

  • A qubit has a two-dimensional state space. Suppose \(\ket0\) and \(\ket1\) form an orthonormal basis for that state space; then an arbitrary state vector in that space can be written as

    \[ \ket\psi = a\ket0+b\ket1 \]

    , where \(a\) and \(b\) are complex numbers.

  • That \(\ket\psi\) is a unit vector, \(\braket{\psi} = 1\), is equivalent to \(\vert a\vert^2 + \vert b\vert^2 = 1\); this is often called the normalization condition for state vectors.

As for how the state \(\ket\psi\) of a quantum mechanical system changes with time, we have the second postulate:

Postulate 2

The evolution of a closed quantum system is described by a unitary transformation. The state \(\ket\psi\) of the system at time \(t_1\) is related to the state \(\ket{\psi'}\) of the system at time \(t_2\) by a unitary operator \(U\) which depends only on the times \(t_1\) and \(t_2\):

\[ \ket{\psi'} = U\ket{\psi} \]
  • One thing to remember: quantum mechanics only tells us that a quantum system can be described by a state space; it does not say what that space should look like for a given system. Likewise, it tells us that the time-evolution operator \(U\) is unitary, but not which \(U\) corresponds to which physical process.
  • Some common and important unitary operators on a single qubit:
    • Pauli \(X\) matrix: quantum NOT gate, bit flip matrix.
    • Pauli \(Z\) matrix: phase flip matrix.
    • Hadamard gate.
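A small numpy sketch of these gates acting on a normalized qubit state (illustrative, not from the book); unitarity guarantees the normalization condition is preserved:

```python
import numpy as np

# computational basis states and common single-qubit gates
ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)                # bit flip
Z = np.array([[1, 0], [0, -1]], dtype=complex)               # phase flip
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)  # Hadamard

psi = (ket0 + ket1) / np.sqrt(2)   # a normalized state a|0> + b|1>
for U in (X, Z, H):
    out = U @ psi
    # unitary evolution preserves <psi|psi> = 1
    assert np.isclose(np.vdot(out, out).real, 1.0)
print(H @ ket0)  # (|0> + |1>)/sqrt(2)
```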

Exercise 2.51

Verify that the Hadamard gate \(H\) is unitary.

solution
\[ HH^\dagger = H^\dagger H = \frac{1}{2}\begin{bmatrix}1&1\\1&-1\end{bmatrix}\begin{bmatrix}1&1\\1&-1\end{bmatrix} = \frac{1}{2}\begin{bmatrix}2&0\\0&2\end{bmatrix} = I\nonumber \]

Exercise 2.52

Verify that \(H^2 = I\).

solution
\[ H^2 = \frac{1}{2}\begin{bmatrix}1&1\\1&-1\end{bmatrix}\begin{bmatrix}1&1\\1&-1\end{bmatrix} = \frac{1}{2}\begin{bmatrix}2&0\\0&2\end{bmatrix} = I\nonumber \]

Exercise 2.53

What are the eigenvalues and eigenvectors of \(H\)?

solution

Characteristic equation: \(\lambda^2-1=0\), thus \(\lambda=\pm1\).

\[ \begin{cases} \lambda= -1,\;\; \dfrac{1}{\sqrt2}\begin{bmatrix}1&1\\1&-1\end{bmatrix}\begin{bmatrix}a\\b\end{bmatrix} = -\begin{bmatrix}a\\b\end{bmatrix},\;\;\ket{\lambda_{-1}}=\frac{1}{\sqrt{4+2\sqrt2}}\begin{bmatrix}1\\-1-\sqrt2\end{bmatrix} \\ \lambda= +1,\;\; \dfrac{1}{\sqrt2}\begin{bmatrix}1&1\\1&-1\end{bmatrix}\begin{bmatrix}a\\b\end{bmatrix} = +\begin{bmatrix}a\\b\end{bmatrix},\;\;\ket{\lambda_{+1}}=\frac{1}{\sqrt{4-2\sqrt2}}\begin{bmatrix}1\\-1+\sqrt2\end{bmatrix} \end{cases}\nonumber \]

Actually we have a refined version of postulate 2 for the continuous-time case:

Postulate 2 (refined)

The time evolution of the state of a closed quantum system is described by the Schrödinger equation,

\[ i\hbar \frac{d\ket\psi}{dt} = H\ket\psi \]

, where \(\hbar\) is the reduced Planck constant and \(H\) is the Hamiltonian (not the Hadamard matrix) of the closed system.

  • The above refined postulate implies that if we know the Hamiltonian of a system, then we understand its dynamics completely. The catch is that figuring out the Hamiltonian of a system is difficult, even for today's physicists.

  • The spectral decomposition of the Hamiltonian (it is Hermitian, hence normal) is

    \[ H = \sum_EE\ket E\bra E \]

with the states \(\ket E\) being the energy eigenstates, sometimes called stationary states, and \(E\) the energy of that state. The lowest energy is known as the ground state energy for the system, and the corresponding energy eigenstate (or eigenspace) is known as the ground state.

  • The reason the eigenstates of \(H\) are called stationary states is that time evolution merely multiplies them by a numerical factor:

    \[ \ket E\rightarrow e^{-iEt/\hbar}\ket E. \]
  • To verify that postulates 2 and 2′ are consistent: first write down the solution of the Schrödinger equation,

    \[ \ket{\psi(t_2)} = \exp\bigg[{\frac{-iH(t_2-t_1)}{\hbar}}\bigg] \ket{\psi(t_1)} = U(t_1,t_2)\ket{\psi(t_1)},\;\;\; \text{where } U(t_1,t_2)\equiv \exp\bigg[{\frac{-iH(t_2-t_1)}{\hbar}}\bigg] \]

    , and then apply Exercise 2.55. Furthermore, any unitary operator \(U\) can be realized in the form \(U = e^{iK}\) for some Hermitian operator \(K\).

  • In short, there is a one-to-one correspondence between the two pictures: unitary operators give the discrete-time description of dynamics, while Hamiltonians give the continuous-time description.
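
To make the correspondence concrete, here is a small numpy sketch (my own illustration; the choice of Pauli \(X\) as Hamiltonian is arbitrary) that builds \(U(t_1,t_2)=\exp[-iH(t_2-t_1)/\hbar]\) from the spectral decomposition of \(H\) and checks that it is unitary, as Exercise 2.55 proves:

```python
import numpy as np

def evolution_operator(H, t1, t2, hbar=1.0):
    """U(t1, t2) = exp(-i H (t2 - t1) / hbar), via H's spectral decomposition."""
    E, V = np.linalg.eigh(H)                      # H = V diag(E) V^dagger (H Hermitian)
    phases = np.exp(-1j * E * (t2 - t1) / hbar)   # each |E> picks up e^{-iE(t2-t1)/hbar}
    return V @ np.diag(phases) @ V.conj().T

H_ham = np.array([[0, 1], [1, 0]], dtype=complex)  # toy Hamiltonian, in units hbar = 1
U = evolution_operator(H_ham, 0.0, np.pi / 2)
assert np.allclose(U @ U.conj().T, np.eye(2))      # unitary, cf. Exercise 2.55
```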

Exercise 2.54

Suppose \(A\) and \(B\) are commuting Hermitian operators. Prove that \(e^Ae^B = e^{A+B}\).

solution

We utilize Theorem 2.2: \([A,B]=0\) if and only if there exists an orthonormal basis such that both \(A\) and \(B\) are diagonal with respect to that basis. Therefore we can write \(A = \sum_ia_i\ket i\bra i\), \(B = \sum_jb_j\ket j\bra j\), and \(A+B = \sum_k(a_k+b_k)\ket k\bra k\).

\[ \begin{align*} e^Ae^B &= \sum_i\sum_je^{a_i}e^{b_j}\ket i\bra i\ket {j}\bra j \\ &= \sum_i\sum_j e^{a_i+b_j}\delta_{ij}\ket i\bra j \\ &= \sum_ie^{a_i+b_i}\ket i\bra i \\ &= \sum_ke^{a_k+b_k}\ket k\bra k = e^{A+B} \end{align*} \]

Exercise 2.55

Prove that \(U(t_1, t_2)\) defined here is unitary.

solution

Note that \(H = \sum_EE\ket E\bra E\), therefore we have:

\[ \begin{align*} U(t_1,t_2)U^\dagger(t_1,t_2) &= \exp\bigg[{\frac{-iH(t_2-t_1)}{\hbar}}\bigg]\exp\bigg[{\frac{iH(t_2-t_1)}{\hbar}}\bigg] \\ &= \Bigg(\sum_{E}\exp\bigg[{\frac{-iE(t_2-t_1)}{\hbar}}\bigg]\ket E\bra E\Bigg) \Bigg(\sum_{E'}\exp\bigg[{\frac{iE'(t_2-t_1)}{\hbar}}\bigg]\ket {E'}\bra {E'}\Bigg) \\ &= \sum_{E}\sum_{E'}\exp\bigg[{\frac{i(E'-E)(t_2-t_1)}{\hbar}}\bigg]\ket E \delta_{EE'}\bra{E'} \\ &= \sum_E\exp(0)\ket E\bra E = I \end{align*} \]

, similarly we have \(U^\dagger(t_1,t_2)U(t_1,t_2)= I\), hence the operator is unitary.

Exercise 2.56

Use the spectral decomposition to show that \(K\equiv -i\log(U)\) is Hermitian for any unitary \(U\), and thus \(U=\exp(iK)\) for some Hermitian \(K\).

solution

Since \(U\) is unitary, it is also normal, hence has a spectral decomposition. Also from Exercise 2.18, we know that all eigenvalues of a unitary matrix can be written in the form \(e^{i\theta}\) for some real \(\theta\).

\[ K\equiv -i\log(U) = -i\sum_k\log(e^{i\theta_k})\ket k\bra k = -i\sum_ki\theta_k\ket k\bra k = \sum_k\theta_k\ket k\bra k\nonumber \]

, since \(\theta_k\in\mathbb R\) for all \(k\), we have \(\theta_k^* = \theta_k\) and thus \(K\) is Hermitian.

  • Sometimes we may describe a quantum system which is not closed using a time-varying Hamiltonian; this often serves as a good approximation to a closed system, so we can apply unitary operators to the system without feeling too guilty.

2.2.2 Quantum measurement

  • When we make a measurement on a quantum system, the interaction with the measuring apparatus makes the system no longer closed, so its evolution cannot be strictly described by a unitary transformation. How do we account for this? Postulate 3 provides the answer:
Postulate 3

Quantum measurements are described by a collection \(\{M_m\}\) of measurement operators. These are operators acting on the state space of the system being measured. The index \(m\) refers to the measurement outcomes that may occur in the experiment. If the state of the quantum system is \(\ket\psi\) immediately before the measurement then the probability that result \(m\) occurs is given by

\[ p(m) = \expval{M_m^\dagger M_m}{\psi} \]

, and the state of the system after the measurement is

\[ \frac{M_m\ket\psi}{\sqrt{\expval{M_m^\dagger M_m}{\psi}}} \]

. The measurement operators satisfy the completeness equation,

\[ \sum_m M_m^\dagger M_m = I \]

. The completeness equation expresses the fact that probabilities sum to one: \(1=\sum_m p(m) = \sum_m\expval{M_m^\dagger M_m}{\psi}\).

  • A simple but important example of a measurement is the measurement of a qubit in the computational basis. In this case the measurement operators are defined by \(M_0 = \ket0\bra0\), \(M_1 = \ket1\bra1\). Notice that the completeness equation is obeyed.
  • Suppose the state right before the measurement is \(\ket\psi = a\ket0+b\ket1\). Then the probability of obtaining outcome \(0\) is \(p(0) = \expval{M_0^\dagger M_0}{\psi} = \expval{M_0}{\psi} = \vert a\vert^2\); similarly the probability of obtaining \(1\) is \(\vert b\vert^2\). The states after measurement in the two cases are therefore \(\dfrac{M_0\ket\psi}{\vert a\vert} = \dfrac{a}{\vert a\vert}\ket0\) and \(\dfrac{M_1\ket\psi}{\vert b\vert} = \dfrac{b}{\vert b\vert}\ket1\).
  • We will see in later sections that the factors with modulus \(1\), like \(a/\vert a\vert\) and \(b/\vert b\vert\), can effectively be ignored, so the two post-measurement states are effectively \(\ket0\) and \(\ket1\).
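
Here is a minimal numpy sketch of this computational-basis measurement (the amplitudes \(a,b\) below are arbitrary choices satisfying the normalization condition):

```python
import numpy as np

a, b = 0.6, 0.8j                       # any a, b with |a|^2 + |b|^2 = 1 will do
psi = np.array([a, b], dtype=complex)

M0 = np.array([[1, 0], [0, 0]], dtype=complex)   # |0><0|
M1 = np.array([[0, 0], [0, 1]], dtype=complex)   # |1><1|
assert np.allclose(M0.conj().T @ M0 + M1.conj().T @ M1, np.eye(2))  # completeness

for m, M in [(0, M0), (1, M1)]:
    p = np.vdot(psi, M.conj().T @ M @ psi).real  # p(m) = <psi|M_m^dagger M_m|psi>
    post = M @ psi / np.sqrt(p)                  # post-measurement state
    print(f"p({m}) = {p:.2f}, post-measurement state = {post}")
```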

Exercise 2.57

(Cascaded measurements are single measurements) Suppose \(\{L_l\}\) and \(\{M_m\}\) are two sets of measurement operators. Show that a measurement defined by the measurement operators \(\{L_l\}\) followed by a measurement defined by the measurement operators \(\{M_m\}\) is physically equivalent to a single measurement defined by measurement operators \(\{N_{lm}\}\) with the representation \(N_{lm}\equiv M_mL_l\).

solution

The key is to show that the outcome probabilities and the post-measurement states of the two procedures coincide.

  • Suppose we do \(\{L_l\}\), followed by \(\{M_m\}\), the middle state \(\ket{\psi_1}\) and the final state \(\ket{\psi_2}\) are:

    \[ \begin{align*} \ket{\psi_1} &= \dfrac{L_l\ket\psi}{\sqrt{\expval{L_l^\dagger L_l}{\psi}}} \\ \ket{\psi_2} &= \dfrac{M_m\ket{\psi_1}}{\sqrt{\expval{M_m^\dagger M_m}{\psi_1}}} \\ &= \dfrac{M_mL_l\ket\psi}{\sqrt{\expval{L_l^\dagger L_l}{\psi}}}\dfrac{\sqrt{\expval{L_l^\dagger L_l}{\psi}}}{\sqrt{\expval{L_l^\dagger M_m^\dagger M_mL_l}{\psi}}} \\ &= \dfrac{M_mL_l\ket\psi}{\sqrt{\expval{L_l^\dagger M_m^\dagger M_mL_l}{\psi}}} \end{align*} \]
  • Now if we perform a single \(N_{lm}\) measurement instead, we have:

    \[ \begin{align*} \ket{\psi_3} &= \dfrac{N_{lm}\ket\psi}{\sqrt{\expval{N_{lm}^\dagger N_{lm}}{\psi}}} \\ &= \dfrac{M_mL_l\ket\psi}{\sqrt{\expval{L_l^\dagger M_m^\dagger M_mL_l}{\psi}}} \end{align*} \]

    , which is identical to the first scenario.

2.2.3 Distinguishing quantum states

  • In chapter 1 we've discussed that non-orthogonal quantum states cannot be reliably distinguished. With postulate 3 as a firm foundation we can now give a much more convincing demonstration of this fact.
  • We demonstrate "distinguishability" with the usual Alice-and-Bob story. Suppose both parties know a fixed set of states \(\{\ket{\psi_i}, 1\leq i\leq n\}\); Alice picks one state from this set and sends it to Bob. How should Bob arrange his measurement operators to identify the index \(i\) of the state?
    • If the states \(\ket{\psi_i}\) are orthonormal, Bob can define measurement operators \(M_i\equiv \ket{\psi_i}\bra{\psi_i}\), one for each possible index \(i\), together with an additional \(M_0 = \sqrt{I-\sum_{i\neq0}\ket{\psi_i}\bra{\psi_i}}\) to satisfy the completeness relation. If Alice sends \(\ket{\psi_i}\), Bob obtains outcome \(i\) with probability \(p(i)=1\), so he can reliably distinguish the states.
    • If the states \(\ket{\psi_i}\) are not orthonormal, suppose Bob tried the same strategy, with some rule in mind such as "outcome \(j\) means the state was \(\ket{\psi_1}\)". This cannot work: if another state \(\ket{\psi_2}\) has a component parallel to \(\ket{\psi_1}\), there is a chance it also yields outcome \(j\), which would lead Bob to the wrong conclusion. A more rigorous proof follows.

✏ Proof that non-orthogonal states can’t be reliably distinguished

A proof by contradiction shows that no measurement distinguishing the non-orthogonal states \(\ket{\psi_1}\) and \(\ket{\psi_2}\) is possible. Suppose such a measurement is possible, and define \(E_i\equiv \sum_{j:f(j)=i}M_j^\dagger M_j\) (i.e. every outcome \(j\) with \(f(j)=i\) is interpreted as the state being \(i\)); then the reliability assumption may be written as:

\[ \expval{E_1}{\psi_1} = 1;\;\;\;\expval{E_2}{\psi_2} = 1 \]

Clearly we have \(\sum_iE_i = I\) (since the \(E_i\) are just a regrouping of the \(M_j^\dagger M_j\)), therefore \(\sum_i\expval{E_i}{\psi_1} = 1\); but since this \(1\) comes entirely from the contribution of \(E_1\), we have \(\expval{E_2}{\psi_1} = 0\;\Rightarrow \sqrt{E_2}\ket{\psi_1}=0\).

Now we decompose \(\ket{\psi_2}\) into components parallel and perpendicular to \(\ket{\psi_1}\): \(\ket{\psi_2} = \alpha\ket{\psi_1} + \beta\ket{\varphi}\). Then \(\sqrt{E_2}\ket{\psi_2}=\beta\sqrt{E_2}\ket\varphi\), and since \(\ket{\psi_1}\) and \(\ket{\psi_2}\) are non-orthogonal we have \(\vert\beta\vert<1\), which implies:

\[ 1= \expval{E_2}{\psi_2} = \vert\beta\vert^2\expval{E_2}{\varphi}\leq \vert\beta\vert^2 \leq 1 \]

, which is a contradiction, since \(\vert\beta\vert^2<1\) strictly (the second-to-last "\(\leq\)" is due to \(\expval{E_2}{\varphi}\leq \sum_i\expval{E_i}{\varphi}=\expval{I}{\varphi}=1\)).

2.2.4 Projective measurements

  • A projective measurement is described by an observable, \(M\), a Hermitian operator on the state space of the system being observed. The observable has a spectral decomposition,

    \[ M = \sum_mmP_m \]

    , where \(P_m\) is the projector onto the eigenspace of \(M\) with eigenvalue \(m\). (The possible measurement outcomes are exactly the eigenvalues \(m\) of the observable \(M\).)

  • Upon measuring the state \(\ket\psi\), the probability of getting result \(m\) is

    \[ p(m) = \expval{P_m}{\psi} \]

    , and if \(m\) was really measured, the state immediately after the measurement is

    \[ \dfrac{P_m\ket\psi}{\sqrt{p(m)}} \]
  • Relation to Postulate 3: the measurement operators in Postulate 3 need only satisfy the completeness relation \(\sum_mM^\dagger_mM_m = I\). If we now add the extra restriction that the \(M_m\) be orthogonal projectors, that is,

    • \(M_m\) are Hermitian, and
    • \(M_mM_{m'} = \delta_{mm'}M_m\)

    then Postulate 3 reduces to exactly a projective measurement.

  • Nice properties of projective measurements:

    • Easy to calculate average values:
    \[ \textbf E(M) = \sum_mm\cdot p(m) = \sum_m m \expval{P_m}{\psi} = \bra\psi\bigg(\sum_m mP_m\bigg)\ket\psi =\expval{M}{\psi} \equiv \langle M \rangle \]
    • Able to calculate standard deviation:
    \[ \Delta M = \sqrt{\Big\langle \big(M - \langle M \rangle\big)^2 \Big\rangle} = \sqrt{\Big\langle M^2 \Big\rangle - \langle M \rangle^2} \]
  • This formulation of measurements and standard deviations in terms of observables gives rise, in an elegant way, to results such as the Heisenberg uncertainty principle.
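
As a small illustration (my own sketch; the observable \(Z\) and the state \(\ket+\) are arbitrary choices), the average and standard deviation can be computed directly from these formulas:

```python
import numpy as np

Z = np.diag([1.0, -1.0]).astype(complex)            # observable
psi = np.array([1, 1], dtype=complex) / np.sqrt(2)  # state |+>

mean = np.vdot(psi, Z @ psi).real                   # <M>   = <psi|M|psi>
second = np.vdot(psi, Z @ Z @ psi).real             # <M^2> = <psi|M^2|psi>
std = np.sqrt(second - mean**2)                     # Delta M
print(mean, std)                                    # 0.0 1.0
```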

Exercise 2.58

Suppose we prepare a quantum system in an eigenstate \(\ket\psi\) of some observable \(M\), with corresponding eigenvalue \(m\). What is the average observed value of \(M\), and the standard deviation?

solution
\[ \begin{align*} (\Delta M)^2 &= \big\langle M^2 \big\rangle - \langle M \rangle^2 \\ &= \bra\psi\bigg(\sum_{m,m'} mm'P_mP_{m'}\bigg)\ket\psi - \expval{M}{\psi}^2 \\ &= m \bra\psi\bigg(\sum_{m} mP_m\bigg)\ket\psi - m^2 \\ &= m^2-m^2 = 0 \end{align*} \]

, and obviously \(\langle M \rangle=m\).

✏ The Heisenberg uncertainty principle

Suppose \(A\) and \(B\) are two Hermitian operators; a simple derivation then gives:

\[ \Big\vert \expval{[A,B]}{\psi} \Big\vert^2 + \Big\vert \expval{\{A,B\}}{\psi} \Big\vert^2 = 4\Big\vert \expval{AB}{\psi} \Big\vert^2\;\Rightarrow \Big\vert \expval{[A,B]}{\psi} \Big\vert^2 \leq 4\Big\vert \expval{AB}{\psi} \Big\vert^2 \]

(because if we assume \(\expval{AB}{\psi} = x+iy\) where \(x, y\in\mathbb R\), then \(\expval{[A,B]}{\psi} = 2iy\) and \(\expval{\{A,B\}}{\psi} = 2x\), therefore it follows the relation.)

Now we can then apply Cauchy–Schwarz inequality to the last term \(4\Big\vert \expval{AB}{\psi} \Big\vert^2\) in the above equation:

\[ \Big\vert \expval{AB}{\psi} \Big\vert^2 \leq \expval{A^2}{\psi}\expval{B^2}{\psi} \]

Hence we have:

\[ \Big\vert \expval{[A,B]}{\psi} \Big\vert^2 \leq 4\expval{A^2}{\psi}\expval{B^2}{\psi} \]

Now we substitute \(A\) and \(B\) with \(A = C-\langle C\rangle\) and \(B = D-\langle D\rangle\) where \(C\) and \(D\) are two observables, which makes:

\[ \begin{align*} \expval{[A,B]}{\psi} &= \expval{CD - C\langle{D}\rangle - \langle{C}\rangle D + \langle{C}\rangle\langle{D}\rangle -DC + D\langle{C}\rangle + \langle{D}\rangle C - \langle{D}\rangle\langle{C}\rangle}{\psi} = \expval{[C,D]}{\psi} \\ \expval{A^2}{\psi} &= \expval{C^2}{\psi} - \expval{C}{\psi}^2 = (\Delta C)^2 \\ \expval{B^2}{\psi} &= \expval{D^2}{\psi} - \expval{D}{\psi}^2 = (\Delta D)^2 \end{align*} \]

, which ultimately leads to the most commonly used form of the Heisenberg uncertainty principle:

\[ \Delta C\Delta D \geq \dfrac{\Big\vert \expval{[C,D]}{\psi} \Big\vert}{2} \]

(One might be tempted to interpret the uncertainty principle as: "to measure \(C\) to a given accuracy, the process must disturb the value of \(D\)". This is wrong. The correct reading is: prepare a large number of identical copies of the state \(\ket\psi\), say a million; measure \(C\) on the first half and \(D\) on the second half; the standard deviations of the two sets of results will then obey the bound above.)

Example: If we use observables \(X\) and \(Y\) to measure the quantum state \(\ket0\), the uncertainty principle tells us that,

\[ \Delta X\Delta Y \geq \dfrac{\vert \expval{2iZ}{0} \vert}{2} = 1 \nonumber \]

One elementary consequence of this is that \(\Delta X\) and \(\Delta Y\) must both be strictly greater than \(0\), as can be verified by direct calculation.

  • Two widely used nomenclatures for measurements deserve emphasis:

    • Rather than giving an observable to describe a projective measurement, often people simply list a complete set of orthogonal projectors \(P_m\) satisfying the relations \(\sum_mP_m = I\) and \(P_mP_{m'} = \delta_{mm'}P_m\). The corresponding observable implicit in this usage is \(M = \sum_mmP_m\).
    • Another widely used phrase, to "measure in a basis \(\ket m\)", where \(\ket m\) form an orthonormal basis, simply means to perform the projective measurement with projectors \(P_m = \ket m\bra m\).
  • Example: If we measure the observable \(Z\) on the state \(\ket\psi = \frac{\ket0+\ket1}{\sqrt2}\), we will get either \(+1\) or \(-1\), since \(Z\) has eigenvalues \(\lambda_\pm = \pm1\). In more detail, we will measure \(+1\) with probability

    \[ \expval{P_1}{\psi} = \braket{\psi}{\lambda_{+1}}\braket{\lambda_{+1}}{\psi} = \braket{\psi}{0}\braket{0}{\psi} = \frac{1}{2} \nonumber \]

    , and similarly the result \(-1\) with \(50\%\) probability.

  • More generally, suppose \(\vec v\) is any real three-dimensional unit vector. Then we can define an observable:

    \[ \vec v\cdot \vec\sigma \equiv v_1\sigma_1 + v_2\sigma_2 + v_3\sigma_3 \]

    . Measurement of this observable is sometimes referred to as a measurement of spin along the \(\vec v\) axis.

Exercise 2.59

Suppose we have a qubit in the state \(\ket0\), and we measure the observable \(X\). What are the average value and standard deviation of \(X\)?

solution
\[ \begin{align*} \langle{X}\rangle &= \expval{X}{0} = \begin{bmatrix}1&0\end{bmatrix} \begin{bmatrix}0&1\\1&0\end{bmatrix} \begin{bmatrix}1\\0\end{bmatrix} = 0 \\ (\Delta X)^2 &= \expval{X^2}{0}-0^2 = \begin{bmatrix}1&0\end{bmatrix} \begin{bmatrix}1&0\\0&1\end{bmatrix} \begin{bmatrix}1\\0\end{bmatrix} = 1 \\ &\Rightarrow \Delta X=1 \end{align*} \]

Exercise 2.60

Show that \(\vec v\cdot \vec\sigma\) has eigenvalues \(\pm1\), and that the projectors onto the corresponding eigenspaces are given by \(P_\pm = \dfrac{I\pm\vec v\cdot \vec\sigma}{2}\).

solution

We've already shown in Exercise 2.35 the eigenvalues are \(\pm1\). Now we calculate their eigenvectors:

  • For \(\lambda = 1\):

    \[ \begin{align*} &\begin{bmatrix}v_3-1&v_1-iv_2\\v_1+iv_2&-v_3-1\end{bmatrix} \begin{bmatrix}a\\b\end{bmatrix} = \begin{bmatrix}0\\0\end{bmatrix} \;\;\Rightarrow\;\; \ket{\lambda_+} = \begin{bmatrix}a\\b\end{bmatrix} = \frac{1}{\sqrt{2(1-v_3)}}\begin{bmatrix}v_1-iv_2\\1-v_3\end{bmatrix} \end{align*} \]

    , then we can calculate projector:

    \[ \begin{align*} P_+ = \ket{\lambda_+}\bra{\lambda_+} &= \frac{1}{2(1-v_3)}\begin{bmatrix}v_1-iv_2\\1-v_3\end{bmatrix}\begin{bmatrix}v_1+iv_2&1-v_3\end{bmatrix} \\ &= \frac{1}{2(1-v_3)} \begin{bmatrix}1-v_3^2 & (v_1-iv_2)(1-v_3)\\(v_1+iv_2)(1-v_3) & (1-v_3)^2\end{bmatrix} \\ &= \frac{1}{2} \begin{bmatrix}1+v_3 & v_1-iv_2\\v_1+iv_2 & 1-v_3\end{bmatrix} \\ &= \frac{1}{2}\bigg(\begin{bmatrix}1&0\\0&1\end{bmatrix} + \begin{bmatrix}v_3 & v_1-iv_2\\v_1+iv_2 & -v_3\end{bmatrix}\bigg) = \frac{1}{2}\Big(I + \vec v\cdot \vec\sigma\Big) \end{align*} \]
  • For \(\lambda = -1\):

    \[ \begin{align*} &\begin{bmatrix}v_3+1&v_1-iv_2\\v_1+iv_2&-v_3+1\end{bmatrix} \begin{bmatrix}a\\b\end{bmatrix} = \begin{bmatrix}0\\0\end{bmatrix} \;\;\Rightarrow\;\; \ket{\lambda_-} = \begin{bmatrix}a\\b\end{bmatrix} = \frac{1}{\sqrt{2(1+v_3)}}\begin{bmatrix}v_1-iv_2\\-1-v_3\end{bmatrix} \end{align*} \]

    , then we can calculate projector:

    \[ \begin{align*} P_- = \ket{\lambda_-}\bra{\lambda_-} &= \frac{1}{2(1+v_3)}\begin{bmatrix}v_1-iv_2\\-1-v_3\end{bmatrix}\begin{bmatrix}v_1+iv_2&-1-v_3\end{bmatrix} \\ &= \frac{1}{2(1+v_3)} \begin{bmatrix}1-v_3^2 & -(v_1-iv_2)(1+v_3)\\-(v_1+iv_2)(1+v_3) & (1+v_3)^2\end{bmatrix} \\ &= \frac{1}{2} \begin{bmatrix}1-v_3 & -(v_1-iv_2)\\-(v_1+iv_2) & 1+v_3\end{bmatrix} \\ &= \frac{1}{2}\bigg(\begin{bmatrix}1&0\\0&1\end{bmatrix} - \begin{bmatrix}v_3 & v_1-iv_2\\v_1+iv_2 & -v_3\end{bmatrix}\bigg) = \frac{1}{2}\Big(I - \vec v\cdot \vec\sigma\Big) \end{align*} \]

. Hence we have \(P_\pm = \dfrac{I\pm\vec v\cdot \vec\sigma}{2}\).
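
A quick numeric cross-check of this result (a sketch of mine; the vector \(\vec v\) below is an arbitrary unit vector):

```python
import numpy as np

sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]

v = np.array([1.0, 2.0, 2.0])
v /= np.linalg.norm(v)                                # make it a unit vector
v_dot_sigma = sum(vi * si for vi, si in zip(v, sigma))

assert np.allclose(np.linalg.eigvalsh(v_dot_sigma), [-1.0, 1.0])  # eigenvalues +-1

I = np.eye(2)
P_plus, P_minus = (I + v_dot_sigma) / 2, (I - v_dot_sigma) / 2
assert np.allclose(P_plus @ P_plus, P_plus)             # idempotent: a projector
assert np.allclose(P_plus @ P_minus, np.zeros((2, 2)))  # mutually orthogonal
```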

Exercise 2.61

Calculate the probability of obtaining the result \(+1\) for a measurement of \(\vec v\cdot \vec\sigma\), given that the state prior to measurement is \(\ket0\). What is the state of the system after the measurement if \(+1\) is obtained?

solution
\[ \begin{align*} p(+1) = \expval{P_+}{0} &= \expval{\frac{I+\vec v\cdot \vec\sigma}{2}}{0} \\ &= \frac{1}{2} + \frac{v_3}{2} = \frac{1+v_3}{2} \end{align*} \]

, and the post-measurement state is

\[ \begin{align*} \dfrac{P_+\ket0}{\sqrt{p(+1)}} = \frac{1}{\sqrt{\dfrac{1+v_3}{2}}} \cdot\frac{1}{2}\begin{bmatrix}1+v_3\\v_1+iv_2 \end{bmatrix} = \frac{1}{\sqrt{2(1+v_3)}} \begin{bmatrix}1+v_3\\v_1+iv_2 \end{bmatrix} &= \sqrt\frac{1+v_3}{2} \begin{bmatrix}1\\\dfrac{v_1+iv_2}{1+v_3} \end{bmatrix} \\ &= \sqrt\frac{1+v_3}{2}\frac{1}{v_1-iv_2} \begin{bmatrix}v_1-iv_2\\1-v_3 \end{bmatrix} \\ &= \sqrt\frac{1+v_3}{2}\frac{1}{\sqrt{1-v_3^2}} \begin{bmatrix}v_1-iv_2\\1-v_3 \end{bmatrix} \text{ (this step discards an overall phase, so it holds only up to a global phase)}\\ &= \frac{1}{\sqrt{2(1-v_3)}}\begin{bmatrix}v_1-iv_2\\1-v_3\end{bmatrix} = \ket{\lambda_+} \end{align*} \]

2.2.5 POVM measurements

  • Postulate 3 gives us two key pieces of information:

    • a rule describing the measurement statistics (i.e. the probabilities of the various measurement outcomes).
    • a rule describing the post-measurement state of the system.

But sometimes we simply don't care about the post-measurement state of the system; all we want to know is which outcome occurred. In such instances there is a mathematical tool known as the POVM formalism, which is especially well adapted to the analysis of such measurements.

  • POVM stands for Positive Operator-Valued Measure. (Never mind for now where this name comes from.)

    • Suppose a measurement described by measurement operators \(M_m\) is performed upon a quantum system in state \(\ket\psi\). Then the probability of getting outcome \(m\) is \(p(m) = \expval{M_m^\dagger M_m}{\psi}\).
    • Now define \(E_m\equiv M_m^\dagger M_m\).
    • Then, by elementary linear algebra and Postulate 3, each \(E_m\) is a positive operator such that \(\sum_mE_m = I\) and \(p(m) = \expval{E_m}{\psi}\).
    • The operators \(E_m\) are called the POVM elements, and the complete set \(\{E_m\}\) is called the POVM.
  • Example: consider a projective measurement described by measurement operators \(P_m\), where \(P_m\) are projectors such that \(P_mP_{m'}= \delta_{mm'}P_m\) and \(\sum_mP_m = I\). In this instance (and only this instance) all the POVM elements are the same as the measurement operators themselves, since \(E_m\equiv P_m^\dagger P_m = P_m\).

✏ General measurements, projective measurements, and POVMs

Why learn general measurements first, before projective measurements or POVMs? Most physicists start directly from projective measurements, but QCQI is all about precise control of quantum systems; compared with the typical setting where only coarse measurements are possible, starting from general measurements is the more suitable choice.

General measurements also have several advantages:

  • They are mathematically simpler than projective measurements, having fewer restrictions; for instance, there is no condition like \(P_iP_j = \delta_{ij}P_i\).
  • There are important problems in QCQI (such as the optimal way to distinguish a set of quantum states) the answer to which involves a general measurement, rather than a projective measurement.
  • Many measurements we meet in everyday life are "non-repeatable" (e.g. if we use a silvered screen to measure the position of a photon we destroy the photon in the process. This certainly makes it impossible to repeat the measurement of the photon's position!), whereas projective measurements are repeatable: after measuring once with \(P_m\), applying \(P_m\) again does not change the state of the system. To describe non-repeatable measurement processes, we are forced to bring in the general measurement formalism.

Where do POVMs fit in this picture? POVMs are best viewed as a special case of the general measurement formalism, providing the simplest means by which one can study general measurement statistics, without the necessity for knowing the post-measurement state. They are a mathematical convenience that sometimes gives extra insight into quantum measurements.

Exercise 2.62

Show that any measurement where the measurement operators and the POVM elements coincide is a projective measurement.

solution

Assume the measurement operators we're considering are \(M_m\), writing down the statement mathematically we have

\[ E_m = M_m^\dagger M_m = M_m\nonumber \]

, from Exercise 2.25 we deduce that \(M_m\) is positive, hence Hermitian; therefore \(M_m^\dagger M_m = M_m^2 = M_m\). The last equality says that each \(M_m\) is a projector, and since projectors summing to the identity are automatically mutually orthogonal, the measurement is projective.

  • Suppose now that \(\{E_m\}\) is some arbitrary set of positive operators such that \(\sum_mE_m = I\). We will show below that there exists a set of measurement operators \(M_m\) defining a measurement described by the POVM \(\{E_m\}\).

    • Define \(M_m\equiv\sqrt{E_m}\) in order to make \(\sum_mM_m^\dagger M_m = I\).
    • For this reason it is convenient to define a POVM to be any set of operators \(\{E_m\}\) such that:
      • each operator \(E_m\) is positive.
      • the completeness relation \(\sum_mE_m = I\) is obeyed (probabilities sum to one).
    • To complete the description of POVMs, we note again that given a POVM \(\{E_m\}\), the probability of getting outcome \(m\) is given by \(p(m) = \expval{E_m}{\psi}\).
  • So far it may not be obvious what POVMs are actually good for, so let's look at an example. Suppose Alice gives Bob a qubit prepared in one of two states:

    \[ \ket{\psi_1} = \ket0\;\;,\;\;\ket{\psi_2} = \frac{\ket0+\ket1}{\sqrt2}\nonumber \]

    , though it is impossible for Bob to reliably determine which \(\ket{\psi_i}\) he has been given, it is possible for him to perform a measurement which distinguishes the states some of the time, but never makes an error of mis-identification:

  • Consider a POVM containing three elements:

    \[ \begin{align} E_1 &\equiv \frac{\sqrt2}{1+\sqrt2} \ket1\bra1 \\ E_2 &\equiv \frac{\sqrt2}{1+\sqrt2} \frac{\big(\ket0-\ket1\big)\big(\bra0-\bra1\big)}{2} \\ E_3 &\equiv I-E_1-E_2 \end{align} \]

    , you can verify that these are positive operators satisfying the completeness relation, and therefore form a legitimate POVM.

  • If Bob received \(\ket{\psi_1} = \ket0\), he then performs the measurement described by the POVM \(\{E_1, E_2, E_3\}\):

    • There is zero probability that he will observe the result \(E_1\). (So if Bob does observe \(E_1\), he knows for certain that he received \(\ket{\psi_2}\).)
  • If Bob received \(\ket{\psi_2} = \frac{\ket0+\ket1}{\sqrt2}\), he then performs the measurement described by the POVM \(\{E_1, E_2, E_3\}\):

    • There is zero probability that he will observe the result \(E_2\). (The design of \(E_2\) is quite clever!!)
  • Therefore he can perform the measurement with the rule in mind:

    \[ \begin{cases} \text{outcome: } E_1 \Rightarrow\text{he received }\ket{\psi_2} \\ \text{outcome: } E_2 \Rightarrow\text{he received }\ket{\psi_1} \\ \text{outcome: } E_3 \Rightarrow\text{he has no idea which state he received} \end{cases}\nonumber \]

    , the price of never making an error is that sometimes Bob's measurement yields zero information.
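
The whole protocol is easy to simulate; here is a minimal numpy sketch (my own) that reproduces the outcome statistics:

```python
import numpy as np

k = np.sqrt(2) / (1 + np.sqrt(2))
ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)
minus = (ket0 - ket1) / np.sqrt(2)

E1 = k * np.outer(ket1, ket1.conj())    # never fires on |psi_1> = |0>
E2 = k * np.outer(minus, minus.conj())  # never fires on |psi_2> = |+>
E3 = np.eye(2) - E1 - E2                # the "no idea" outcome

psi1 = ket0
psi2 = (ket0 + ket1) / np.sqrt(2)
for name, psi in [("psi1", psi1), ("psi2", psi2)]:
    probs = [np.vdot(psi, E @ psi).real for E in (E1, E2, E3)]  # p(m) = <psi|E_m|psi>
    print(name, np.round(probs, 3))
# psi1 -> [0.    0.293 0.707]: E1 never occurs, so outcome E1 certifies psi2
# psi2 -> [0.293 0.    0.707]: E2 never occurs, so outcome E2 certifies psi1
```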

Exercise 2.63

Suppose a measurement is described by measurement operators \(M_m\). Show that there exist unitary operators \(U_m\) such that \(M_m = U_m\sqrt{E_m}\), where \(E_m\) is the POVM associated to the measurement.

solution

From polar decomposition (Theorem 2.3), \(A = UJ,\;J\equiv\sqrt{A^\dagger A}\), we have \(M_m = U_m\sqrt{M_m^\dagger M_m} = U_m\sqrt{E_m}\).

Exercise 2.64

Suppose Bob is given a quantum state chosen from a set \(\ket{\psi_1},\cdots,\ket{\psi_m}\) of linearly independent states. Construct a POVM \(\{E_1, E_2, \cdots, E_{m+1}\}\) such that if outcome \(E_i\)occurs (\(1\leq i\leq m\)), then Bob knows with certainty that he was given the state \(\ket{\psi_i}\). (The POVM must be such that \(\expval{E_i}{\psi_i}>0\) for each \(i\).)

solution

This means that, given the state \(\ket{\psi_i}\), the probability of observing any outcome \(E_j\) with \(j\neq i\) (for \(j\leq m\)) must be \(0\); so each \(E_i\) must be built from a vector with every component along the other states removed, i.e.:

\[ E_i = A\ket{\psi_i'}\bra{\psi_i'} \]

, where \(\ket{\psi_i'}\equiv\big(I-P_{\neq i}\big)\ket{\psi_i}\) is the component of \(\ket{\psi_i}\) orthogonal to the span of the other states \(\{\ket{\psi_j}\}_{j\neq i}\) (\(P_{\neq i}\) being the projector onto that span; this component is nonzero because the states are linearly independent), and the constant \(A>0\) is chosen small enough to maintain the positivity of \(E_{m+1} = I-\sum_{i=1}^mE_i\).

2.2.6 Phase & Composite systems

  • State \(e^{i\theta}\ket\psi\) is equal to state \(\ket\psi\) up to the global phase factor \(e^{i\theta}\).
  • The statistics of measurement predicted for these two states are the same. proof: \(\expval{e^{-i\theta}M_m^\dagger M_me^{i\theta}}{\psi} = \expval{M_m^\dagger M_m}{\psi}\), therefore from an observational point of view these two states are identical.
  • Relative phase: consider state \(\frac{\ket0+\ket1}{\sqrt2}\) and \(\frac{\ket0-\ket1}{\sqrt2}\), they are the same up to a relative phase shift because the amplitudes of \(\ket0\) are identical, and the amplitudes of \(\ket1\) differ only by a relative phase factor of \(-1\).
    • the relative phase is a basis-dependent concept, unlike the global phase.
    • this gives rise to physically observable differences in measurement statistics.
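
A small numpy sketch (mine; the phase \(0.7\) below is arbitrary) showing that a global phase is invisible to measurement while a relative phase is not:

```python
import numpy as np

plus = np.array([1, 1], dtype=complex) / np.sqrt(2)
shifted = np.exp(1j * 0.7) * plus                       # global phase on |+>
minus = np.array([1, -1], dtype=complex) / np.sqrt(2)   # relative phase -1

def prob(M, s):
    return np.vdot(s, M.conj().T @ M @ s).real          # p = <s|M^dagger M|s>

M0 = np.diag([1.0, 0.0]).astype(complex)                # measure |0><0|
print(prob(M0, plus), prob(M0, shifted))                # 0.5 0.5 -- identical stats

P_plus = np.outer(plus, plus.conj())                    # projector onto |+>
print(prob(P_plus, plus), prob(P_plus, minus))          # 1.0 0.0 -- observably different
```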

Exercise 2.65

Express the states \(\frac{\ket0+\ket1}{\sqrt2}\) and \(\frac{\ket0-\ket1}{\sqrt2}\) in a basis in which they are not the same up to a relative phase shift.

solution

Express them in the new basis \(\ket+\equiv\frac{\ket0+\ket1}{\sqrt2}\) and \(\ket-\equiv\frac{\ket0-\ket1}{\sqrt2}\): the two states become \(\ket+\) and \(\ket-\) respectively, which are not the same up to a relative phase shift.

How should we describe states of a composite system, made up of two or more distinct physical systems? This is what Postulate 4 is for:

Postulate 4

The state space of a composite physical system is the tensor product of the state spaces of the component physical systems. Moreover, if we have systems numbered \(1\) through \(n\), and system number \(i\) is prepared in the state \(\ket{\psi_i}\), then the joint state of the total system is \(\ket{\psi_1}\otimes\ket{\psi_2}\otimes\cdots\otimes\ket{\psi_n}\).

  • Notation convention: for example, in a system containing three qubits, \(X_2\) is a Pauli \(\sigma_x\) operator acting on the second qubit.

Exercise 2.66

Show that the average value of the observable \(X_1Z_2\) for a two qubit system measured in the state \(\frac{\ket{00}+\ket{11}}{\sqrt2}\) is zero.

solution
\[ \frac{\bra{00}+\bra{11}}{\sqrt2}\Big(X_1Z_2\Big)\frac{\ket{00}+\ket{11}}{\sqrt2} = \frac{\bra{00}+\bra{11}}{\sqrt2}\frac{\ket{10}-\ket{01}}{\sqrt2} =0 \]

  • Projective measurements together with unitary dynamics are sufficient to implement a general measurement. The proof of this statement makes use of composite quantum systems, and is a nice illustration of Postulate 4 in action:

    • Suppose we have a quantum system with state space \(Q\), and we want to perform a measurement described by measurement operators \(M_m\) on the system \(Q\).

    • To do this, we first introduce an ancilla system with state space \(M\), having an orthonormal basis \(\ket m\) in one-to-one correspondence with the possible outcomes of the measurement we wish to implement. (This ancilla can be regarded either as a mathematical fiction or as a real physical system.)

    • Let \(\ket0\) be any fixed state of \(M\), define an operator \(U\) on products \(\ket\psi\ket0\) of states \(\ket\psi\) from \(Q\) with the state \(\ket0\) by

      \[ U\ket\psi\ket0\equiv \sum_mM_m\ket\psi\ket m \]
    • Due to the completeness relation and the orthonormality of \(\ket m\), we can see that \(U\) preserves inner products between states of the form \(\ket\psi\ket0\):

      \[ \begin{align*} \bra0\bra\varphi U^\dagger U\ket\psi\ket0 &= \sum_{m,m'}\bra\varphi M_m^\dagger M_{m'}\ket\psi \braket{m}{m'} \\ &= \sum_m \bra\varphi M_m^\dagger M_{m}\ket\psi \\ &= \braket{\varphi}{\psi} \end{align*} \]
    • And by the results of Exercise 2.67 it follows that \(U\) can be extended to a unitary operator on the space \(Q\otimes M\), which we also denote by \(U\).

    • After letting \(U\) act on \(\ket\psi\ket0\), suppose we perform a projective measurement on the two systems described by projectors \(P_m\equiv I_Q\otimes\ket m\bra m\). The outcome \(m\) occurs with probability

      \[ \begin{align*} p(m) &= \bra\psi\bra0U^\dagger P_m U\ket\psi\ket0 \\ &= \sum_{m', m''} \bra{m'}\bra{\psi}M_{m'}^\dagger \;\Big(I_Q\otimes\ket m\bra m\Big)\;M_{m''}\ket{\psi}\ket{m''} \\ &= \sum_{m', m''} \delta_{m'm}\,\delta_{mm''}\,\bra{\psi}M_{m'}^\dagger M_{m''}\ket{\psi} \\ &= \expval{M_m^\dagger M_m}{\psi} \end{align*} \]

      , just as given in Postulate 3. The joint state of the system \(QM\) after measurement, conditional on result \(m\) occurring, is given by:

      \[ \begin{align*} \frac{P_mU\ket\psi\ket0}{\sqrt{\bra0\expval{U^\dagger P_m U}{\psi}\ket0}} &= \frac{M_m\ket\psi\ket m}{\sqrt{\bra\psi M^\dagger_{m}M_m\ket\psi}} \\ &= \Bigg(\frac{M_m\ket\psi}{\sqrt{\bra\psi M^\dagger_{m}M_m\ket\psi}}\Bigg)\cdot \ket m \\ &= \text{(state of system }Q)\cdot\text{(state of system }M) \end{align*} \]

    , and notice that the post-measurement state of system \(Q\) is exactly as given in Postulate 3. Thus unitary dynamics, projective measurements, and the ability to introduce ancilla systems together allow any measurement of the form described in Postulate 3 to be realized.
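
For the computational-basis measurement \(M_0=\ket0\bra0\), \(M_1=\ket1\bra1\), the required \(U\) happens to be the CNOT gate (a fact specific to this choice of \(M_m\); the sketch below is my own illustration of the construction, not the general case):

```python
import numpy as np

a, b = 0.6, 0.8
psi = np.array([a, b], dtype=complex)
ket0 = np.array([1, 0], dtype=complex)
state = np.kron(psi, ket0)                # |psi>|0> on Q (x) M

CNOT = np.array([[1, 0, 0, 0],            # U|psi>|0> = M0|psi>|0> + M1|psi>|1>
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)
after = CNOT @ state

for m in (0, 1):
    ketm = np.eye(2)[:, m]
    P_m = np.kron(np.eye(2), np.outer(ketm, ketm))  # P_m = I_Q (x) |m><m|
    p = np.vdot(after, P_m @ after).real
    print(f"p({m}) = {p:.2f}")            # 0.36 and 0.64, matching |a|^2 and |b|^2
```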

Exercise 2.67

Suppose \(V\) is a Hilbert space with a subspace \(W\). Suppose \(U:W\rightarrow V\) is a linear operator which preserves inner products, that is, for any \(\ket{w_1}\) and \(\ket{w_2}\) in \(W\),

\[ \bra{w_1}U^\dagger U\ket{w_2} = \braket{w_1}{w_2}\nonumber \]

Prove that there exists a unitary operator \(U':V\rightarrow V\) which extends \(U\). That is, \(U'\ket w = U\ket w\) for all \(\ket w\) in \(W\), but \(U'\) is defined on the entire space \(V\). Usually we omit the prime symbol \('\) and just write \(U\) to denote the extension.

solution

Since \(W\) is a subspace of \(V\), we can find an orthogonal complement \(W^\perp\), so that \(V = W\oplus W^\perp\). Let \(\ket{w_i}\), \(\ket{w_j'}\), and \(\ket{u_j'}\) be orthonormal bases for \(W\), \(W^\perp\), and \((UW)^\perp\) (the orthogonal complement of the image of \(W\) under \(U\)), respectively.

Now we define \(U':V\rightarrow V\) as \(U'\equiv \sum_i^{\dim W}\ket{u_i}\bra{w_i} + \sum_j^{\dim W^\perp}\ket{u_j'}\bra{w_j'}\), where \(\ket{u_i} \equiv U\ket{w_i}\). Now we can verify the unitarity of \(U'\) by direct calculation:

\[ (U')^\dagger U' = \bigg(\sum_i\ket{w_i}\bra{u_i} + \sum_j\ket{w_j'}\bra{u_j'}\bigg) \bigg(\sum_i\ket{u_i}\bra{w_i} + \sum_j\ket{u_j'}\bra{w_j'}\bigg) = \sum_i\ket{w_i}\bra{w_i} + \sum_j\ket{w_j'}\bra{w_j'} = I \nonumber \]

, and similarly we have:

\[ U'(U')^\dagger = \bigg(\sum_i\ket{u_i}\bra{w_i} + \sum_j\ket{u_j'}\bra{w_j'}\bigg) \bigg(\sum_i\ket{w_i}\bra{u_i} + \sum_j\ket{w_j'}\bra{u_j'}\bigg) = \sum_i\ket{u_i}\bra{u_i} + \sum_j\ket{u_j'}\bra{u_j'} = I \nonumber \]

. Thus \(U'\) is unitary. Moreover we can check for all \(\ket{w}\in W\),

\[ \begin{align*} U'\ket{w} &= \bigg(\sum_i\ket{u_i}\bra{w_i} + \sum_j\ket{u_j'}\bra{w_j'}\bigg)\ket{w} \\ &= \sum_i\ket{u_i}\braket{w_i}{w} + \sum_j\ket{u_j'}\braket{w_j'}{w} \\ &= \sum_i\ket{u_i}\braket{w_i}{w} + 0 \;\;\;\;\;(\text{since } \ket{w_j'}\perp \ket w) \\ &= \sum_iU\ket{w_i}\braket{w_i}{w} = U\ket w \end{align*} \]

, implying that \(U'\) is an extension of \(U\).

  • Postulate 4 also enables us to define entanglement (interesting!!): consider the two-qubit state \(\ket\psi=\frac{\ket{00}+\ket{11}}{\sqrt2}\); no one can write it in the form \(\ket\psi = \ket a\ket b\). Hence the following exercise:
Exercise 2.68

Prove that \(\ket\psi\neq\ket a\ket b\) for all single qubit states \(\ket a\) and \(\ket b\).

solution

Suppose \(\ket\psi=\frac{\ket{00}+\ket{11}}{\sqrt2} = \ket a\ket b = (a_0\ket0+a_1\ket1)(b_0\ket0+b_1\ket1)\), therefore we have:

\[ \begin{cases} a_0b_0 = 1/\sqrt2 \\ a_0b_1=0 \\ a_1b_0=0 \\ a_1b_1 = 1/\sqrt2 \end{cases}\nonumber \]

, which is algebraically impossible: \(a_0b_1=0\) forces \(a_0=0\) or \(b_1=0\), contradicting \(a_0b_0=1/\sqrt2\) or \(a_1b_1=1/\sqrt2\) respectively.

We say that a state of a composite system having this property (that it can’t be written as a product of states of its component systems) is an entangled state. For reasons which nobody fully understands, entangled states play a crucial role in quantum computation and quantum information, and arise repeatedly through the remainder of this book.
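In fact, a two-qubit state \(\ket\psi = \sum_{jk}c_{jk}\ket j\ket k\) is a product state exactly when its coefficient matrix \((c_{jk})\) has rank one, which gives a simple numeric test (a sketch of mine using this standard criterion):

```python
import numpy as np

def is_product(psi):
    """psi: 4-vector in the basis 00, 01, 10, 11. A product state iff the 2x2
    coefficient matrix has rank 1 (i.e. a single nonzero singular value)."""
    return np.linalg.matrix_rank(psi.reshape(2, 2)) == 1

bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)        # (|00>+|11>)/sqrt2
prod = np.kron(np.array([1, 0]), np.array([1, 1]) / np.sqrt(2))  # |0>|+>
print(is_product(bell), is_product(prod))                        # False True
```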

A global view of quantum mechanics:

  • Postulate 1 sets the arena for quantum mechanics, by specifying how the state of an isolated quantum system is to be described.
  • Postulate 2 tells us that the dynamics of closed quantum systems are described by the Schrödinger equation, and thus by unitary evolution.
  • Postulate 3 tells us how to extract information from our quantum systems by giving a prescription for the description of measurement.
  • Postulate 4 tells us how the state spaces of different quantum systems may be combined to give a description of the composite system.

Might it be possible to reformulate quantum mechanics in a mathematically equivalent way so that it had a structure more like classical physics? It turns out that, via Bell's inequality, one can show that quantum mechanics cannot escape its counter-intuitive nature.

2.3 Superdense coding

  • Alice can send 2 classical bits of information via transmission of only a single qubit to Bob.

  • First Alice and Bob share a Bell pair \(\ket\psi = \frac{\ket{00}+\ket{11}}{\sqrt2}\), then if Alice wishes to:

    \[ \text{encode classical information: } \begin{cases} \text{00},\;\ket\psi\xrightarrow{\text{perform nothing}}\dfrac{\ket{00}+\ket{11}}{\sqrt2} \\ \text{01},\;\ket\psi\xrightarrow{\text{perform }Z\text{ gate}}\dfrac{\ket{00}-\ket{11}}{\sqrt2} \\ \text{10},\;\ket\psi\xrightarrow{\text{perform }X\text{ gate}}\dfrac{\ket{01}+\ket{10}}{\sqrt2} \\ \text{11},\;\ket\psi\xrightarrow{\text{perform }iY\text{ gate}}\dfrac{\ket{01}-\ket{10}}{\sqrt2} \end{cases} \]
  • These four states are known as the Bell basis, Bell states, or EPR pairs. Since they form an orthonormal basis, they can be distinguished by an appropriate quantum measurement.

  • After Alice has performed the appropriate operation on her own qubit, she sends it to Bob.

  • Bob then performs a measurement in the Bell basis, and can thereby determine which of the four possible bit strings Alice sent.
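
The full protocol fits in a few lines of numpy (my own simulation; Alice's \(11\) operation is written as \(XZ\), which equals \(iY\) up to a global phase):

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)  # (|00>+|11>)/sqrt2
encodings = {"00": I2, "01": Z, "10": X, "11": X @ Z}      # Alice's local operation

bell_basis = {                                             # Bob's measurement basis
    "00": np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2),
    "01": np.array([1, 0, 0, -1], dtype=complex) / np.sqrt(2),
    "10": np.array([0, 1, 1, 0], dtype=complex) / np.sqrt(2),
    "11": np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2),
}

for msg, U in encodings.items():
    sent = np.kron(U, I2) @ bell                           # Alice acts on her qubit only
    probs = {k: abs(np.vdot(v, sent)) ** 2 for k, v in bell_basis.items()}
    assert max(probs, key=probs.get) == msg                # Bob decodes perfectly
print("all four messages decoded correctly")
```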

Exercise 2.69

Verify that the Bell basis forms an orthonormal basis for the two qubit state space.

solution

The Bell states defined above are conventionally labeled as:

\[ \ket{\Phi^+} = \dfrac{1}{\sqrt2}\begin{bmatrix}1\\0\\0\\1\end{bmatrix} \;\;\;\;\;\;\;\;\;\;\; \ket{\Phi^-} = \dfrac{1}{\sqrt2}\begin{bmatrix}1\\0\\0\\-1\end{bmatrix} \;\;\;\;\;\;\;\;\;\;\; \ket{\Psi^+} = \dfrac{1}{\sqrt2}\begin{bmatrix}0\\1\\1\\0\end{bmatrix} \;\;\;\;\;\;\;\;\;\;\; \ket{\Psi^-} = \dfrac{1}{\sqrt2}\begin{bmatrix}0\\1\\-1\\0\end{bmatrix}\nonumber \]

Check their linear independence:

\[ \begin{align*} a\ket{\Phi^+} + b\ket{\Phi^-} + c\ket{\Psi^+} + d\ket{\Psi^-} &= 0 \\ &\Rightarrow a+d = a-d = b+c = b-c = 0 \\ &\Rightarrow a = b = c = d = 0 \end{align*} \]

, hence they are linearly independent; moreover each has norm \(1\) and the pairwise inner products vanish by direct calculation, so they form an orthonormal basis for the two-qubit state space.

Exercise 2.70

Suppose \(E\) is any positive operator acting on Alice's qubit. Show that \(\expval{E\otimes I}{\psi}\) takes the same value when \(\ket\psi\) is any of the four Bell states. Suppose some malevolent third party ('Eve') intercepts Alice's qubit on the way to Bob in the superdense coding protocol. Can Eve infer anything about which of the four possible bit strings \(00,01,10,11\) Alice is trying to send? If so, how, or if not, why not?

solution

You may check by direct matrix calculation that \(\expval{E\otimes I}{\psi_i} = \frac{1}{2}\big(E_{00} + E_{11}\big) = \frac{1}{2}\big(\expval{E}{0}+\expval{E}{1}\big) = \frac{1}{2}\tr(E)\) for every Bell state \(\ket{\psi_i}\). Now if Eve "intercepts" Alice's qubit by performing measurement operators \(M_m\) on it, the probability that Eve gets outcome \(m\) is \(p(m) = \expval{M_m^\dagger M_m\otimes I}{\psi_i}\), and since \(M_m^\dagger M_m\) is positive, we immediately conclude that Eve cannot tell which Bell state Alice is transmitting to Bob, since every \(p(m)\) is independent of the state.

2.4 The Density Operator

  • The density operator (or density matrix) is an alternative formulation of quantum mechanics to the language of state vectors.
  • The two formulations are mathematically equivalent, but the density operator serves as a more convenient language for thinking about some commonly encountered scenarios in quantum mechanics.

In the following three sections we first introduce the density operator using the concept of an ensemble of quantum states; next we derive some general properties of the density operator; and last we describe an application: the density operator as a tool for describing individual subsystems of a composite quantum system.

2.4.1 Ensembles of Quantum States

  • The density operator language provides a convenient means for describing quantum systems whose state is not completely known.

  • To be more precise, suppose a quantum system is in one of a number of states \(\ket{\psi_i}\), with respective probabilities \(p_i\); we call \(\{p_i,\ket{\psi_i}\}\) an ensemble of pure states. The density operator for the system is defined as:

    \[ \rho\equiv \sum_ip_i\ket{\psi_i}\bra{\psi_i} \]

    , which is also often known as the density matrix (the two terms are used interchangeably).

  • All the postulates of quantum mechanics stated earlier can be reformulated in the language of density operators. For example, the evolution of the density operator under a unitary \(U\) is described by the equation:

    \[ \rho = \sum_ip_i\ket{\psi_i}\bra{\psi_i} \xrightarrow{\;U\;} \sum_ip_iU\ket{\psi_i}\bra{\psi_i}U^\dagger = U\rho U^\dagger \]
  • Measurements are also easily described in the language of density operators. If the initial state was \(\ket{\psi_i}\), then the probability of getting result \(m\) is

    \[ p(m|i) = \expval{M_m^\dagger M_m}{\psi_i} = \tr(M_m^\dagger M_m\ket{\psi_i}\bra{\psi_i}) \]

    (the last equality follows from \(\tr(A\ket\psi\bra\psi) = \expval{A}{\psi}\), which you can verify by expanding the trace in an orthonormal basis). By the law of total probability, the total probability of obtaining \(m\) is:

    \[ \begin{align*} p(m) &= \sum_ip(m|i)p_i \\ &= \sum_ip_i \tr(M_m^\dagger M_m\ket{\psi_i}\bra{\psi_i}) \\ &= \tr(M_m^\dagger M_m\rho) \end{align*} \]
  • What is the density operator of the system after obtaining the measurement result \(m\)? If the initial state was \(\ket{\psi_i}\) then the state after obtaining the result \(m\) is

    \[ \ket{\psi_i^m} = \frac{M_m\ket{\psi_i}}{\sqrt{\expval{M_m^\dagger M_m}{\psi_i}}} \]

    , which implies after a measurement which yields the result \(m\) we have an ensemble of states \(\ket{\psi_i^m}\) with respective probabilities \(p(i|m)\). The corresponding density operator \(\rho_m\) is therefore:

    \[ \rho_m = \sum_i p(i|m)\ket{\psi_i^m}\bra{\psi_i^m} = \sum_i p(i|m)\frac{M_m\ket{\psi_i}\bra{\psi_i}M_m^\dagger}{\expval{M_m^\dagger M_m}{\psi_i}} \]

    , where we do some probability manipulations for calculational convenience (you can verify the conversion below yourself):

    \[ p(i|m) = \frac{p(i,m)}{p(m)} = \frac{p(m|i)\cdot p_i}{p(m)} \]

    , therefore the equation becomes:

    \[ \begin{align} \rho_m &= \sum_i p(i|m)\frac{M_m\ket{\psi_i}\bra{\psi_i}M_m^\dagger}{\expval{M_m^\dagger M_m}{\psi_i}}\nonumber \\ &= \sum_i\frac{p(m|i)\cdot p_i}{p(m)}\frac{M_m\ket{\psi_i}\bra{\psi_i}M_m^\dagger}{\expval{M_m^\dagger M_m}{\psi_i}}\nonumber \\ &= \sum_i\frac{\expval{M_m^\dagger M_m}{\psi_i}\cdot p_i}{\tr(M_m^\dagger M_m\rho)}\frac{M_m\ket{\psi_i}\bra{\psi_i}M_m^\dagger}{\expval{M_m^\dagger M_m}{\psi_i}}\nonumber \\ &= \sum_i \frac{p_i}{\tr(M_m^\dagger M_m\rho)}M_m\ket{\psi_i}\bra{\psi_i}M_m^\dagger\nonumber \\ &= \frac{M_m \rho M_m^\dagger}{\tr(M_m^\dagger M_m\rho)} \end{align} \]
  • A quantum system whose state is known exactly (a single \(\ket\psi\) with probability \(1\)) is said to be in a pure state. In that case the density operator is simply \(\rho = \ket\psi\bra\psi\); otherwise, \(\rho\) is in a mixed state (it is a mixture of the different pure states in the ensemble for \(\rho\)). In a later exercise (Exercise 2.71) we will prove:

    \[ \begin{cases} \text{a pure state}&\longrightarrow&\tr(\rho^2)=1\\ \text{a mixed state}&\longrightarrow&\tr(\rho^2)<1 \end{cases} \]
  • If a system is prepared in the state \(\rho_i\) with probability \(p_i\), then the system may be described by \(\sum_ip_i\rho_i\). proof: suppose \(\rho_i\) is the mixed state arising from the pure-state ensemble \(\{p_{ij},\ket{\psi_{ij}}\}\); then the probability of the system being in the state \(\ket{\psi_{ij}}\) is \(p_ip_{ij}\), and from this we can work out the density matrix of the system:

    \[ \rho = \sum_{ij}p_ip_{ij}\ket{\psi_{ij}}\bra{\psi_{ij}} = \sum_ip_i\rho_i \]

    , where \(\rho_i = \sum_jp_{ij}\ket{\psi_{ij}}\bra{\psi_{ij}}\). We say that \(\rho\) is a mixture of the states \(\rho_i\) with probabilities \(p_i\).

  • This notion of a mixture will come up again and again later. For example, if we have a quantum system in the state \(\rho_m\) with probability \(p(m)\), but know nothing about the actual value of \(m\), then the state of such a system is described by the density operator:

    \[ \rho = \sum_mp(m)\rho_m = \sum_m \tr(M_m^\dagger M_m\rho)\frac{M_m \rho M_m^\dagger}{\tr(M_m^\dagger M_m\rho)} = \sum_m M_m \rho M_m^\dagger \]

    , which is a nice compact formula which may be used as the starting point for analysis of further operations on the system.
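
A tiny numpy sketch (mine; \(\rho\) below is an arbitrary valid density matrix) showing this "measured but unread" update for the computational-basis measurement; note how the off-diagonal terms of \(\rho\) are wiped out:

```python
import numpy as np

rho = np.array([[0.75, 0.25],
                [0.25, 0.25]], dtype=complex)  # trace 1, positive

M0 = np.diag([1.0, 0.0]).astype(complex)       # |0><0|
M1 = np.diag([0.0, 1.0]).astype(complex)       # |1><1|

# Unread measurement: rho -> sum_m M_m rho M_m^dagger
rho_after = sum(M @ rho @ M.conj().T for M in (M0, M1))
print(rho_after.real)                          # diag(0.75, 0.25): coherences destroyed
```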

2.4.2 General Properties of the Density Operator

In this section we move to an intrinsic characterization of density operators that does not rely on an ensemble interpretation.

Theorem 2.5

(Characterization of density operators) An operator \(\rho\) is the density operator associated to some ensemble \(\{p_i, \ket{\psi_i}\}\) if and only if it satisfies the conditions:

(1) Trace condition: \(\rho\) has trace equal to one.

(2) Positivity condition: \(\rho\) is a positive operator.

proof

Suppose \(\rho = \sum_ip_i\ket{\psi_i}\bra{\psi_i}\) is a density operator. Then

\[ \tr(\rho) = \sum_ip_i\tr(\ket{\psi_i}\bra{\psi_i}) = \sum_ip_i = 1 \]

, hence the trace condition is satisfied. Now suppose \(\ket\varphi\) is an arbitrary vector in the state space. Then

\[ \expval{\rho}{\varphi} = \sum_ip_i\braket{\varphi}{\psi_i}\braket{\psi_i}{\varphi} = \sum_ip_i\big\vert\braket{\varphi}{\psi_i}\big\vert^2 \geq 0 \]

, hence the positivity condition is satisfied.

Conversely, suppose \(\rho\) is any operator satisfying the trace and positivity conditions. Since \(\rho\) is positive, it must have a spectral decomposition

\[ \rho = \sum_j\lambda_j\ket j\bra j \]

, where the vectors \(\ket j\) are orthonormal, and the \(\lambda_j\) are real, non-negative eigenvalues of \(\rho\). From the trace condition we see that \(\sum_j\lambda_j=1\). Therefore, a system in state \(\ket j\) with probability \(\lambda_j\) will have density operator \(\rho\); that is, the ensemble \(\{\lambda_j,\ket{j}\}\) is an ensemble of states giving rise to the density operator \(\rho\).

This theorem gives us a way to characterize density operators purely by their intrinsic properties; that is, we can define a density operator to be a positive operator \(\rho\) which has trace equal to one. We can now reformulate the postulates of quantum mechanics in the density operator picture:

Postulate 1: Associated to any isolated physical system is a complex vector space with inner product (that is, a Hilbert space) known as the state space of the system. (Identical so far to the earlier statement.) The system is completely described by its density operator, which is a positive operator \(\rho\) with trace one, acting on the state space of the system. If a quantum system is in the state \(\rho_i\) with probability \(p_i\), then the density operator for the system is \(\sum_ip_i\rho_i\).

Postulate 2: The evolution of a closed quantum system is described by a unitary transformation. That is, the state \(\rho\) of the system at time \(t_1\) is related to the state \(\rho'\) of the system at time \(t_2\) by a unitary operator \(U\) which depends only on the times \(t_1\) and \(t_2\),

\[ \rho' = U\rho U^\dagger \]

Postulate 3: Quantum measurements are described by a collection \(\{M_m\}\) of measurement operators. These are operators acting on the state space of the system being measured. The index \(m\) refers to the measurement outcomes that may occur in the experiment. If the state of the quantum system is \(\rho\) immediately before the measurement then the probability that result \(m\) occurs is given by

\[ p(m) = \tr(M_m^\dagger M_m\rho) \]

, and the state of the system after the measurement is

\[ \frac{M_m\rho M_m^\dagger}{\tr(M_m^\dagger M_m\rho)} \]

The measurement operators satisfy the completeness equation,

\[ \sum_mM_m^\dagger M_m = I \]

Postulate 4: The state space of a composite physical system is the tensor product of the state spaces of the component physical systems. Moreover, if we have systems numbered \(1\) through \(n\), and system number \(i\) is prepared in the state \(\rho_i\), then the joint state of the total system is \(\rho_1\otimes\rho_2\otimes\cdots\otimes\rho_n\).

This reformulation of the four postulates of quantum mechanics is mathematically equivalent to the state vector-based description. The advantage of switching to this way of thinking is that it gives a better description of:

  • quantum systems whose state is not completely known
  • the subsystems of a composite quantum system

Exercise 2.71

(Criterion to decide if a state is mixed or pure) Let \(\rho\) be a density operator. Show that \(\tr(\rho^2)\leq 1\), with equality holds if and only if \(\rho\) is a pure state.

solution
\[ \begin{align*} \tr(\rho^2) &= \tr(\sum_{i,j}p_ip_j\ket{i}\bra{i}\ket{j}\bra{j}) \\ &= \tr(\sum_{i,j}p_ip_j\ket{i}\delta_{ij}\bra{j}) \\ &= \tr(\sum_{i}p_i^2\ket{i}\bra{i}) \\ &= \sum_i p_i^2 \end{align*} \]

Since \(\rho\) is positive with unit trace (so that \(0\leq p_i\leq 1\), hence \(p_i^2\leq p_i\) for all \(i\), with \(\sum_ip_i = 1\)), we have \(\sum_ip_i^2 \leq 1\). Equality holds if and only if exactly one \(p_i\) equals \(1\), i.e. \(\rho = \ket\psi\bra\psi\) is a pure state. (Here the \(p_i\) and \(\ket i\) come from the spectral decomposition of \(\rho\), so the \(\ket i\) are orthonormal.)

Watch out!! Many people cleverly assume that the eigenvalues and eigenvectors of a \(\rho\) are the quantum states making it up, or that there is some mysterious relation between the two; time to shatter that dream! For example, one might suppose a quantum system with density matrix

\[ \rho = \frac{3}{4}\ket0\bra0+\frac{1}{4}\ket1\bra1 \nonumber \]

must be in the state \(\ket0\) with probability \(\frac{3}{4}\) and in the state \(\ket1\) with probability \(\frac{1}{4}\). Here is a counterexample: if we define

\[ \begin{align*} \ket a &\equiv \sqrt\frac{3}{4}\ket0 + \sqrt\frac{1}{4}\ket1 \\ \ket b &\equiv \sqrt\frac{3}{4}\ket0 - \sqrt\frac{1}{4}\ket1 \end{align*} \]

, then this ensemble will also give rise to \(\rho = \frac12\ket{a}\bra{a}+\frac12\ket{b}\bra{b} = \frac{3}{4}\ket0\bra0+\frac{1}{4}\ket1\bra1\). That is, these two different ensembles of quantum states give rise to the same density matrix!
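
This is easy to confirm numerically (a two-line check of my own):

```python
import numpy as np

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)
a = np.sqrt(3 / 4) * ket0 + np.sqrt(1 / 4) * ket1
b = np.sqrt(3 / 4) * ket0 - np.sqrt(1 / 4) * ket1

rho1 = 0.75 * np.outer(ket0, ket0.conj()) + 0.25 * np.outer(ket1, ket1.conj())
rho2 = 0.5 * np.outer(a, a.conj()) + 0.5 * np.outer(b, b.conj())
assert np.allclose(rho1, rho2)   # two different ensembles, one density matrix
```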

Instead, we ask which class of ensembles does give rise to a given density matrix. That is, when do two sets of vectors \(\{\ket{\tilde\psi_i}\}\) and \(\{\ket{\tilde\varphi_i}\}\) generate the same operator \(\rho\)? (Notation: \(\ket{\tilde\psi_i} = \sqrt{p_i}\ket{\psi_i}\), so that \(\rho = \sum_i{\ket{\tilde\psi_i}}{\bra{\tilde\psi_i}}\); the probability is absorbed into the state vector, which may then no longer be normalized.) The answer to this question has many applications in QCQI:

Theorem 2.6

(Unitary freedom in the ensemble for density matrices) The sets \({\ket{\tilde\psi_i}}\) and \({\ket{\tilde\varphi_j}}\) generate the same density matrix if and only if

\[ \ket{\tilde\psi_i} = \sum_ju_{ij}\ket{\tilde\varphi_j} \]

, where \(u_{ij}\) is a unitary matrix of complex numbers, with indices \(i\) and \(j\), and we "pad" whichever set of vectors \(\ket{\tilde\psi_i}\) or \(\ket{\tilde\varphi_j}\) is smaller with additional \(0\) vectors so that the two sets have the same number of elements. In terms of the normalized states \(\ket{\psi_i}, \ket{\varphi_j}\), this says \(\sqrt{p_i}\ket{\psi_i} = \sum_ju_{ij}\sqrt{q_j}\ket{\varphi_j}\) for some unitary matrix \(u_{ij}\).

proof

Suppose \(\tilde{\ket{\psi_i}} = \sum_ju_{ij}\tilde{\ket{\varphi_j}}\) for some unitary \(u_{ij}\). Then

\[ \begin{align*} \sum_i \tilde{\ket{\psi_i}} \tilde{\bra{\psi_i}} &= \sum_{ijk} u_{ij}\tilde{\ket{\varphi_j}}\; u_{ik}^*\tilde{\bra{\varphi_k}} \\ &= \sum_{jk} \bigg(\sum_iu_{ki}^\dagger u_{ij}\bigg) \tilde{\ket{\varphi_j}}\tilde{\bra{\varphi_k}} \\ &= \sum_{jk} I_{kj}\;\tilde{\ket{\varphi_j}}\tilde{\bra{\varphi_k}} \\ &= \sum_j \tilde{\ket{\varphi_j}}\tilde{\bra{\varphi_j}} \end{align*} \]

, which shows that \(\tilde{\ket{\psi_i}}\) and \(\tilde{\ket{\varphi_j}}\) generate the same operator.

Conversely, suppose

\[ A = \sum_i \tilde{\ket{\psi_i}} \tilde{\bra{\psi_i}} = \sum_j \tilde{\ket{\varphi_j}}\tilde{\bra{\varphi_j}}. \]

Let \(A = \sum_k\lambda_k\ket k\bra k\) be a spectral decomposition for \(A\) such that the \(\ket k\) are orthonormal and the \(\lambda_k\) are strictly positive. Our strategy is to relate the states \(\tilde{\ket{\psi_i}}\) to the states \(\tilde{\ket k}\equiv \sqrt{\lambda_k}\ket k\), similarly relate the states \(\tilde{\ket{\varphi_j}}\) to the states \(\tilde{\ket k}\), and finally combine the two relations.

  • Let \(\ket\psi\) be any vector orthogonal to the space spanned by the \(\tilde{\ket k}\), so that \(\braket{\psi}{\tilde k}\braket{\tilde k}{\psi}=0\) for all \(k\); thus \(0 = \expval{A}{\psi} = \sum_i\braket{\psi}{\tilde{\psi_i}}\braket{\tilde{\psi_i}}{\psi} = \sum_i\big\vert\braket{\psi}{\tilde{\psi_i}}\big\vert^2\).

  • This implies \(\braket{\psi}{\tilde{\psi_i}} = 0\) for all \(i\) and all \(\ket\psi\) orthogonal to the space spanned by the \(\tilde{\ket k}\).

  • It follows that each \(\tilde{\ket{\psi_i}}\) can be expressed as a linear combination of the \(\tilde{\ket k}\) (since it is orthogonal to the orthogonal complement of the span of the \(\tilde{\ket k}\)), that is, \(\tilde{\ket{\psi_i}} = \sum_kc_{ik}\tilde{\ket k}\).

  • Since \(A = \sum_k\lambda_k\ket k\bra k = \sum_k\tilde{\ket k}\tilde{\bra k} = \sum_i\tilde{\ket{\psi_i}}\tilde{\bra{\psi_i}}\), we see that

    \[ \sum_k\tilde{\ket k}\tilde{\bra k} = \sum_{kl} \bigg(\sum_ic_{ik}c_{il}^*\bigg)\tilde{\ket k}\tilde{\bra l} \]

    , the operators \(\tilde{\ket k}\tilde{\bra l}\) are easily seen to be linearly independent (since, as noted above, the \(\ket k\) in the spectral decomposition of \(A\) are orthonormal)

  • Therefore we have \(\sum_i c_{ik}c_{il}^* = \delta_{kl}\), which ensures that we may append extra columns to \(c\) (think of \(c\) as a matrix to see why it is columns) to obtain a unitary matrix \(v\) such that \(\tilde{\ket{\psi_i}} = \sum_kv_{ik}\tilde{\ket k}\); similarly we find another unitary matrix \(w\) such that \(\tilde{\ket{\varphi_j}} = \sum_kw_{jk}\tilde{\ket k}\).

  • Thus we have \(\tilde{\ket{\psi_i}} = \sum_ju_{ij}\tilde{\ket{\varphi_j}}\), where \(u = vw^{-1} = vw^\dagger\) (recall that for a unitary matrix \(w^{-1}=w^\dagger\)).

Exercise 2.72

(Bloch sphere for mixed states) The Bloch sphere picture for pure states of a single qubit was introduced in Section 1.2. This description has an important generalization to mixed states as follows.

(1) Show that an arbitrary density matrix for a mixed state qubit may be written as

\[ \rho = \frac{I + \vec r\cdot\vec\sigma}{2} \]

, where \(\vec r\) is a real three-dimensional vector such that \(\Vert\vec r\Vert\leq 1\). This vector is known as the Bloch vector for the state \(\rho\).

(2) What is the Bloch vector representation for the state \(\rho = I/2\) ?

(3) Show that a state \(\rho\) is pure if and only if \(\Vert\vec r\Vert= 1\).

(4) Show that for pure states the description of the Bloch vector we have given coincides with that in Section 1.2

solution

(1) From Exercise 2.35 we have: (note that \(\rho\) is Hermitian with unit trace, so it has only three real degrees of freedom)

\[ \frac{I + \vec r\cdot\vec\sigma}{2} = \frac{1}{2}\begin{bmatrix}1+r_3 &r_1-ir_2 \\ r_1+ir_2 & 1-r_3\end{bmatrix} \xrightarrow{\text{must be}}\rho= \begin{bmatrix}a&b\\b^*&c\end{bmatrix} \]

Another property of density matrix is that it's positive, i.e. real non-negative eigenvalues:

\[ \begin{align*} \det(\rho-\lambda I) &= \frac{1}{4}\bigg((1-2\lambda)^2 - (r_1^2+r_2^2+r_3^2)\bigg) = 0 \\ &\Rightarrow \lambda = \frac{1\pm \Vert\vec r\Vert}{2} \geq 0\;\;\;\;(\because\text{positive matrix}) \end{align*} \]

, hence \(\Vert\vec r\Vert\) must be \(\leq 1\).

(2) \(\rho = I/2\) implies \(\vec r=0\), which corresponds to the origin of Bloch sphere.

(3) From Exercise 2.71 we know that \(\rho\) is pure if and only if \(\tr(\rho^2)=1\), hence (cf. Exercise 2.40)

\[ \tr\bigg(\frac{1}{4}\Big( I + 2\vec{r}\cdot\vec\sigma + (\vec{r}\cdot\vec\sigma)^2 \Big)\bigg) = \tr\bigg(\frac{1}{4}\Big( I + 2\vec{r}\cdot\vec\sigma + \Vert\vec r\Vert^2I \Big)\bigg) = \frac{1}{4}\Big(2+2\Vert\vec r\Vert^2\Big) = 1 \]

, it naturally follows that \(\Vert\vec r\Vert=1\) in order to fulfill the relation.

(4) Suppose \(\ket\psi=\alpha\ket0+\beta\ket1\); with \(\rho = \ket\psi\bra\psi\) and the constraint \(\tr(\rho)=1\), we have \(\vert\alpha\vert^2+\vert\beta\vert^2=1\) (which coincides with the normalization condition from Section 1.2), and it follows that we can rewrite the state in the form:

\[ \ket\psi = e^{i\gamma}\bigg( \cos\frac{\theta}{2}\ket0 + e^{i\varphi}\sin\frac{\theta}{2}\ket1\bigg) \]
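
Computing \(\rho=\ket\psi\bra\psi\) for this state gives the Bloch vector \(\vec r = (\sin\theta\cos\varphi, \sin\theta\sin\varphi, \cos\theta)\), exactly the Bloch sphere coordinates of Section 1.2. In general the Bloch vector can be read off as \(r_k = \tr(\rho\sigma_k)\), inverting part (1) via \(\tr(\sigma_j\sigma_k)=2\delta_{jk}\); a small numpy sketch of mine:

```python
import numpy as np

sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]

def bloch_vector(rho):
    """r_k = tr(rho sigma_k), inverting rho = (I + r . sigma)/2."""
    return np.array([np.trace(rho @ s).real for s in sigma])

pure = np.array([[1, 0], [0, 0]], dtype=complex)  # |0><0|
mixed = np.eye(2, dtype=complex) / 2              # I/2, maximally mixed
for rho in (pure, mixed):
    r = bloch_vector(rho)
    print(r, np.linalg.norm(r))  # norm 1 for the pure state, 0 for I/2
```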

Exercise 2.73

Let \(\rho\) be a density operator. A minimal ensemble for \(\rho\) is an ensemble \(\{p_i, \ket{\psi_i}\}\) containing a number of elements equal to the rank of \(\rho\). Let \(\ket\psi\) be any state in the support of \(\rho\). (The support of a Hermitian operator \(A\) is the vector space spanned by the eigenvectors of \(A\) with non-zero eigenvalues.) Show that there is a minimal ensemble for \(\rho\) that contains \(\ket\psi\), and moreover that in any such ensemble, \(\ket\psi\) must appear with probability

\[ p = \frac{1}{\expval{\rho^{-1}}{\psi}} \]

, where \(\rho^{-1}\) is defined to be the inverse of \(\rho\), when \(\rho\) is considered as an operator acting only on the support of \(\rho\). (This definition removes the problem that \(\rho\) may not have an inverse.)

solution

See the last two pages of this paper, and the solution of this paper.