Tensors 1

Notations

Einstein summation convention is used here. A matrix M is denoted as [M] and its ij-th element is referred to by [M]_{ij}. Quantities or coefficients are indexed, for example, as u^i, A_{ij}, or A_i^j. These indices do not automatically pertain to the row and column indices of a matrix, but the quantities can be represented by matrices through isomorphisms once their indices are interpreted as rows and columns of matrices.

Coordinates of a vector

Let V be an n-dimensional vector space and \mathcal B=\{e_1,\cdots, e_n\} with e_i\in V be a basis for V. Then, we define the coordinate function as,

    \[[\cdot]_{\mathcal B}:V\to \mathbb M^{n\times 1}\]

such that for a vector v\in V written by its components (with respect to \mathcal B) as v=v_ie_i, the function acts as,

    \[[v]_{\mathcal B}=\begin{bmatrix}v_1 \\ \vdots \\ v_n \end{bmatrix}\]

The coordinate function is a linear map.
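As a quick numerical illustration (a minimal numpy sketch; the basis vectors and the vector below are arbitrary choices, not taken from the text), the coordinates [v]_{\mathcal B} can be computed by collecting the basis vectors as columns of a matrix and solving a linear system:

    import numpy as np

    # Hypothetical basis B = {e1, e2} of R^2, expressed in the standard basis.
    e1 = np.array([1.0, 1.0])
    e2 = np.array([1.0, -1.0])
    E = np.column_stack([e1, e2])   # columns are the basis vectors

    v = np.array([3.0, 1.0])        # an arbitrary vector

    # [v]_B solves E @ [v]_B = v, i.e. v = v_1*e1 + v_2*e2
    v_B = np.linalg.solve(E, v)
    print(v_B)                      # -> [2. 1.], since v = 2*e1 + 1*e2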

Change of basis for vectors

Let \mathcal B and \tilde {\mathcal B} be two bases for V, then,

\tilde e_i=F_{ji}e_j and e_i=B_{ji}\tilde e_j

where the indices of the scalar terms F_{ji} and B_{ji} are intentionally set this way. So, if all F_{mn} are collected into a matrix [F], then the sum F_{ji}e_j is over the rows of the matrix for a particular column. In other words, we can utilize the rule of matrix multiplication and write,

    \[\tilde e_i=F_{ji}e_j \quad\text{i.e.}\quad \begin{bmatrix} \tilde e_1\\ \vdots \\ \tilde e_n \end{bmatrix}=[F]^{\rm T}\begin{bmatrix} e_1\\ \vdots \\ e_n \end{bmatrix}\]

The same is true for [B] with [B]_{ij}:=B_{ij}. In the above formulations, note that j is a dummy index (i.e. we can equivalently write \tilde e_i=F_{ji}e_j=F_{ki}e_k).

Setting \mathcal B as the initial (old) basis and writing the current (new) basis \tilde {\mathcal B} in terms of \mathcal B is referred to as the forward transform, denoted by F_{ij}. Correspondingly, B_{ij} is called the backward transform.

The relation between the forward and backward transforms is obtained as follows,

    \[\begin{split}e_i &= B_{ji} \tilde e_j=B_{ji}F_{kj}e_k\\&\implies B_{ji}F_{kj}=\delta_{ik}\\&\therefore [F]=[B]^{-1} \ , [B] = [F]^{-1}\end{split}\]

We now find how vector coordinates are transformed relative to different bases. A particular v\in V can be expressed by its components according to any of \mathcal B or \tilde{\mathcal B} basis, therefore,

    \[v=v_ie_i=\tilde v_i \tilde e_i\]

To find the relation between [v]_{\tilde{\mathcal B}} and [v]_{\mathcal B} we write,

    \[\begin{split}v&=v_ie_i=\tilde v_i\tilde e_i \implies v_ie_i = v_i B_{ji} \tilde e_j\equiv C_j\tilde e_j\implies C_j=\tilde v_j\\&\therefore \tilde v_i = B_{ij}v_j\ , \quad [v]_{\tilde {\mathcal B}}=[B][v]_{\mathcal B}\\&\implies v_i = F_{ij}\tilde v_j\ , \quad [v]_{\mathcal B}=[F][v]_{\tilde {\mathcal B}}\end{split}\]

As can be observed, the old basis is transformed to the new basis by the forward transform F_{ij}, while the old coordinates v_i are transformed to the new ones, \tilde v_i, by the backward transform B_{ij}. Because the coordinates of v behave contrary to the basis vectors under the transformation, the coordinates or the scalar components are said to be contravariant. A vector can be called a contravariant object because its scalar components (coordinates) transform differently from the basis vectors whose linear combination equals the vector. Briefly,

Proposition: Let v=v_ie_i. Then, the scalar components/coordinates v_i are transformed by B_{ij} if and only if the basis vectors e_i are transformed by F_{ij}, such that B_{ji}F_{kj}=\delta_{ik}.
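The proposition can be checked numerically; the sketch below uses an arbitrary pair of bases of \mathbb R^2 (not from the text): the new basis vectors are built with [F], the coordinates are transformed with [B]=[F]^{-1}, and the reconstructed vector is unchanged.

    import numpy as np

    # Old basis as columns (in standard coordinates) and a forward transform F;
    # both are arbitrary choices for illustration.
    E = np.array([[1.0, 0.0],
                  [0.0, 2.0]])      # columns: e_1, e_2
    F = np.array([[1.0, 1.0],
                  [0.0, 1.0]])      # tilde e_i = F_ji e_j  =>  E_tilde = E @ F

    E_tilde = E @ F                 # new basis vectors as columns
    B = np.linalg.inv(F)            # backward transform

    v_B = np.array([3.0, 4.0])      # coordinates of v in the old basis
    v = E @ v_B                     # the (invariant) vector itself

    v_Btilde = B @ v_B              # coordinates transform contravariantly
    print(np.allclose(E_tilde @ v_Btilde, v))   # True: same vector, new components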

Later, a vector is called a contravariant tensor. For the sake of notation and to distinguish between the transformations of the basis and the coordinates of a vector, the index of a coordinate is written as a superscript to show it is contravariant. Therefore,

    \[v = v^ie_i=\tilde v^i\tilde e_i\]

Linear maps and linear functionals

Definition: \mathcal L(V, W) is defined as the space of all linear maps V\to W where the domain and codomain are vector spaces.

It can be proved that \mathcal L is a vector space (\mathcal L, +, \cdot); hence, for T_1, T_2\in \mathcal L(V,W), \alpha\in \mathbb R, and v\in V,

    \[(\alpha\cdot T_1)(v)=\alpha T_1(v)\quad , \quad (T_1+T_2)(v)=T_1(v)+T_2(v)\]

Note that the addition on the LHS is an operator in \mathcal L and the addition on the RHS is an operator in W.

Proposition 1: Let T\in \mathcal L(V, W), i.e. a linear map from a vector space V to another one W. If \mathcal B=\{e_1, \cdots, e_n\} is a basis for V, and T(e_i)=w_i for w_i\in W and i=1,\cdots , n, then T is uniquely defined over V.

This proposition says a linear map over a space is uniquely determined by its action on the basis vectors of that space. In other words, if T(e_i)=w_i and T^*(e_i)=w_i then \forall v\in V, \ T(v)=T^*(v). Proof: let T(e_i)=w_i (given by the nature of T), then for v\in V such that v=v^ie_i, we can write v^iT(e_i)=v^iw_i, therefore, T(v^ie_i)=T(v)=v^iw_i. Because the v^i's are unique for (a particular) v, the sum v^iw_i is unique for v and hence T(v) must be unique for any v\in V. In other words, there is only one T over V such that T(e_i)=w_i.

As a side remark, if \mathcal B=\{e_1, \cdots, e_n\} is a basis for V, hence spanning V, then \{T(e_i)| i=1,\cdots n\} spans the range of T; the range of T is a subspace of W.

By this proposition, a matrix completely determining a linear map can be obtained. Let V be n-dimensional with a basis \mathcal B=\{e_i\}_1^n, and W be m-dimensional with a basis \mathcal B'=\{e_i'\}_1^m. Then there are coefficients T_i^j such that,

    \[T(e_i)= T_i^j e_j'\]

In the notation T_i^j, the index j is a superscript because for a fixed e_i, and hence a fixed i, the term T_i^j is a coordinate of T(e_i)\in W and is therefore contravariant (e.g. T(e_3)=T_3^j e_j'\equiv v^je_j').

For v\in V, and w=T(v), with the coordinates [v]_{\mathcal B} and [w]_{\mathcal B'}, we can show that,

    \[w^j = T_i^jv^i\]

This expression can be written as the matrix multiplication [w]_{\mathcal B'}=[T][v]_{\mathcal B}, where [T]:=\mathcal M(T)\in \mathbb M^{m\times n} is presented by its elements as,

    \[\begin{bmatrix} T_1^1 && T_2^1 && \cdots && T_n^1 \\T_1^2 && T_2^2 && \cdots && T_n^2 \\\vdots && \vdots && \cdots && \vdots\\T_1^m && T_2^m && \cdots && T_n^m \end{bmatrix}\]

As a remark, the above can be viewed column-wise and written as,

    \[[T]=\begin{bmatrix} [T(e_1)]_{\mathcal B'} && [T(e_2)]_{\mathcal B'} && \cdots && [T(e_n)]_{\mathcal B'} \end{bmatrix}\]
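A minimal numpy sketch of this recipe (the map and bases below are made up for illustration): the j-th column of [T] is [T(e_j)]_{\mathcal B'}.

    import numpy as np

    # Basis of V (columns) and basis of W (columns); arbitrary choices.
    E = np.array([[1.0, 1.0],
                  [0.0, 1.0]])              # e_1, e_2 in V = R^2
    Ep = np.eye(3)                          # e'_1, e'_2, e'_3 in W = R^3

    # Images T(e_1), T(e_2) in W, given in standard coordinates of R^3.
    Te = [np.array([1.0, 2.0, 0.0]),
          np.array([0.0, 1.0, 1.0])]

    # Column j of [T] is [T(e_j)]_{B'}: solve Ep @ col = T(e_j).
    T_mat = np.column_stack([np.linalg.solve(Ep, w) for w in Te])
    print(T_mat.shape)                      # (3, 2): an m-by-n matrix

    # For any v in V: [T(v)]_{B'} = [T] @ [v]_B
    v_B = np.array([2.0, -1.0])
    print(T_mat @ v_B)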

Linear functional (linear form or covector)

Definition: a linear functional on V is a linear map f\in V^* :=\mathcal L(V,\mathbb F). The space V^* is called the dual space of V.

Proposition: Let \mathcal B=\{e_1, \cdots, e_n\} and \varepsilon_i \in V^* be defined as \varepsilon_i(e_j):=\delta_{ij}. Then \{\varepsilon_i\}_1^n, called the dual basis of \mathcal B, is a basis of V^*, and hence \dim V = \dim V^*.

Proof: first we show that the \varepsilon_i's are linearly independent, i.e. c_i\varepsilon_i=0 \implies c_i=0\ \forall i=1, \cdots, n. Note that on the RHS, 0\in V^*. Applying c_i\varepsilon_i to a v\in V and setting c_i\varepsilon_i(v)=0, we get,

    \[c_i\varepsilon_i(v)=c_i\varepsilon_i(v^je_j)=0\implies c_iv^j\varepsilon_i(e_j)=0\implies c_iv^j\delta_{ij}=0\implies c_iv^i=0\]


Since v is arbitrary (choosing v=e_k gives c_k=0), we conclude c_i=0\ \forall i ■.

Now we prove that \{\varepsilon_i\}_1^n spans V^*, i.e. \forall f \in V^*\ \exists \{c_1, \cdots, c_n\} such that f=c_i\varepsilon_i. To this end, we apply both sides to a basis vector of V and write f(e_j)=c_i\varepsilon_i(e_j), which implies f(e_j)=c_j; explicitly, c_j is found as c_j=f(e_j). Consequently, f=f(e_i)\varepsilon_i ■.

Consider V and \mathcal B. If f\in V^*, then the matrix of the linear functional/map f is

    \[[M]=\mathcal M(f)=\begin{bmatrix} f(e_1) && \cdots && f(e_n)\end{bmatrix}\in \mathbb M^{1\times n}\]

So, for v\in V as v=v^ie_i we can write,

    \[f(v)=[M][v]_\mathcal B\quad \in \mathbb R\]

Result: if the coordinates of a vector are shown by a column vector or single-column matrix (which is a vector in the space \mathbb M^{n\times 1}), then a row vector or a single-row matrix represents the matrix of a linear functional.
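As a small sketch (arbitrary numbers), a covector acts on a vector by multiplying its 1×n matrix with the n×1 coordinate column:

    import numpy as np

    # Matrix of a covector f with respect to some basis B: a single-row matrix.
    f_row = np.array([[2.0, 1.0, -3.0]])    # [f(e_1) f(e_2) f(e_3)]

    v_B = np.array([[1.0],
                    [4.0],
                    [2.0]])                 # coordinates of v as a column

    print(f_row @ v_B)                      # f(v) = 2*1 + 1*4 - 3*2 = 0 (a 1x1 result)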

Definition: a linear functional f\in V^*, which can be identified with a row vector as its matrix, is also called a covector.

Like vectors, a covector (and any linear map) is a mathematical object that is independent of a basis (i.e. invariant). The geometric representation of a vector in (or by an isomorphism with) \mathbb R^3 is an arrow in \mathbb E^3. A covector on a space isomorphic to \mathbb R^2 can be represented by a set (stack) of iso lines in \mathbb E^2, and a covector on a space isomorphic to \mathbb R^3 by a stack of iso planes/surfaces in \mathbb E^3.

Example: Let \mathcal B = \{e_1, e_2\} be a basis of V and [2,1] be the matrix of a covector f in some V^*. Then, if [x]_{\mathcal B} = [x_1,x_2]^{\rm T}, we can write,

    \[y=[2,1]\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \implies y = 2x_1 + x_2\]

which, for different values of y, is a set of (iso) lines in a Cartesian CS defined by two axes x_1 and x_2 along e_{1g} and e_{2g} that are the geometric representations of e_1 and e_2. The Cartesian axes are not necessarily orthogonal.

If we choose any other basis \tilde {\mathcal B} = \{\tilde e_1, \tilde e_2\} for V, then the matrix of the covector f changes. Also, the geometric representations of \{\tilde e_1, \tilde e_2\} are different from e_{1g} and e_{2g}, but the geometric representation of the covector (the stack of iso lines) keeps the same shape, since the covector itself is basis-independent.

Example: Let \mathcal B = \{e_1, e_2\} be a basis of V and \mathcal B^* = \{\varepsilon_1, \varepsilon_2\} be a basis for V^*. This means \varepsilon_i\in V^* and \varepsilon_i(e_j):=\delta_{ij}. Then, the matrix of each dual basis vector is as,

    \[\mathcal M(\varepsilon_i)=\begin{bmatrix}\varepsilon_i(e_1) && \varepsilon_i(e_2)\end{bmatrix} = \begin{bmatrix}\delta_{i1} && \delta_{i2}\end{bmatrix}\]

Change of basis for covectors

Let \mathcal B = \{e_i\}_1^n and \tilde{\mathcal B} = \{\tilde e_i\}_1^n be two bases for V, and hence, \mathcal B^* = \{\varepsilon_i\}_1^n and \tilde{\mathcal B}^* = \{\tilde \varepsilon_i\}_1^n be two bases for V^*. Each dual basis vector \tilde \varepsilon_i can be written in terms of the (old) dual basis vectors by using a linear transformation as \tilde \varepsilon_i = Q_{ij}\varepsilon_j. Now, the coefficients Q_{ij} are to be determined as follows,

    \[\begin{split}\tilde \varepsilon_i(e_k) &= Q_{ij}\varepsilon_j(e_k)=Q_{ij}\delta_{jk}=Q_{ik}\\\implies \tilde \varepsilon_i(e_k) &= Q_{ik}\\\therefore Q_{ij}&=\tilde \varepsilon_i(e_j)\end{split}\]

Using the formula e_i=B_{ji}\tilde e_j regarding the change of basis of vectors, the above continues as,

    \[\begin{split}Q_{ij}&=\tilde \varepsilon_i(e_j)=\tilde \varepsilon_i(B_{kj}\tilde e_k)\\\text{by linearity of covectors}&= B_{kj}\tilde \varepsilon_i(\tilde e_k)=B_{kj}\delta_{ik}=B_{ij}\\\therefore Q_{ij}&=B_{ij}\end{split}\]

This indicates that the dual basis vectors are transformed by the backward transformation. Referring to the index convention, we use a superscript for objects that are transformed through the backward transformation. Therefore,

    \[\tilde \varepsilon^i=B_{ij}\varepsilon^j\]

meaning that dual basis vectors are contravariant because they behave contrary to the basis vectors in transformation from e_i to \tilde e_i.

Now let f\in V^*. Writing f=c_i\varepsilon^i=\tilde c_j \tilde \varepsilon^j and using the above relation, we get,

    \[\begin{split}c_i\varepsilon^i&=\tilde c_j \tilde \varepsilon^j \implies c_i F_{ij}\tilde \varepsilon^j=\tilde c_j \tilde \varepsilon^j\\\tilde c_j &= F_{ij}c_i\end{split}\]

meaning that the components of a covector transform in a covariant manner when the basis of the vector space changes from e_i to \tilde e_i.

Briefly, the following relations have been shown:

    \[\tilde e_i=F_{ji}e_j\ ,\quad \tilde v^i=B_{ij}v^j\ ,\quad \tilde \varepsilon^i=B_{ij}\varepsilon^j\ ,\quad \tilde c_i=F_{ji}c_j\]
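These rules can be verified numerically; the sketch below (with an arbitrary forward transform and arbitrary components) also checks that the scalar f(v)=c_iv^i is unchanged when the vector coordinates and the covector components are transformed consistently.

    import numpy as np

    # Arbitrary invertible forward transform and its backward counterpart.
    F = np.array([[2.0, 1.0],
                  [1.0, 1.0]])
    B = np.linalg.inv(F)

    c = np.array([3.0, -1.0])       # covector components c_i (old basis)
    v_B = np.array([1.0, 2.0])      # vector coordinates v^i (old basis)

    c_tilde = F.T @ c               # covariant:     tilde c_j = F_ij c_i
    v_tilde = B @ v_B               # contravariant: tilde v^i = B_ij v^j

    # The scalar f(v) = c_i v^i is basis independent.
    print(np.isclose(c @ v_B, c_tilde @ v_tilde))   # True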

Basis and change of basis for the space of linear maps \mathcal L(V, W)

As can be proved, \mathcal L(V,W) is a linear vector space and any linear map is a vector in it. Therefore, we should be able to find a basis for this space. If V is n-dimensional and W is m-dimensional, then \mathcal L(V,W) is mn-dimensional and hence its basis should have m\times n vectors, i.e. linear maps. Let's enumerate the basis vectors of \mathcal L as \varphi_{ij}\in \mathcal L (V,W) for i=1, \cdots , m and j=1, \cdots , n; then any linear map T can be written as,

    \[ T = c_{ij}\varphi_{ij}\]

By proposition 1, any linear map is uniquely determined by its action on the basis vectors of its domain. If \mathcal B = \{e_i\}_1^n is a basis for V, then for any basis vector e_k,

    \[T(e_k)=c_{ij}\varphi_{ij}(e_k)\]

Setting a basis for W as \mathcal B' = \{e_i'\}_1^m and writing T(e_k)=a_{ik}e_i', the above equation becomes,

    \[a_{ik}e_i'=c_{ij}\varphi_{ij}(e_k)\]

This equation holds if,

    \[\begin{matrix}c_{ij}=a_{ij} \text{ and }  \varphi_{ij}(e_k)=e_i' && \text{ if } k=j \\ \varphi_{ij}(e_k)=0 && \text{ if } k\ne j\end{matrix}\]

Therefore, we can choose a set of m\times n basis vectors \varphi_{ij} for \mathcal L (V,W) as,

    \[\varphi_{ij}(e_k) =  \begin{cases}e_i'\text{ if }k=j\\0\text{ if } k\neq  j\end{cases}\]

By employing the basis of V^*, the above can be written as,

    \[\{\varphi_i^j=e_i'\varepsilon^j \mid i=1,\cdots, m \text{ and } j=1,\cdots, n\}\]

The term e_i'\varepsilon^j is obviously a linear map V\to W. It can be readily shown that the derived basis vectors are linearly independent, i.e. if the linear combination c_{ij}e'_i\varepsilon^j is the zero map, c_{ij}e'_i\varepsilon^j(v)=0 for any v\in V (here, note that 0\in \mathcal L), then c_{ij}=0.

So, a linear map T can be written as a linear combination T = c_{ij}e_i'\varepsilon^j. Here, it is necessary to use the index level convention. To this end, we observe that for a fixed i the term c_{ij} couples with \varepsilon^j and represents the coordinates of a covector. As coordinates of a covector are covariant, the index j is written as a subscript. For a fixed j though, the term c_{ij} couples with e_i' and represents the coordinates of a vector. As coordinates of a vector are contravariant, the index i should be raised. Therefore, we write,

    \[T = c_j^ie_i'\varepsilon^j\]

The coefficients c_j^i can be determined as,

    \[\forall e_k\in \mathcal B\quad T(e_k) = c_j^ie_i'\varepsilon^j(e_k)=c_k^ie_i'\]

This states that c_k^i are the coordinates of T(e_k) with respect to the basis of W, i.e. \mathcal B'. Comparing with what was derived before, T(e_k)=T_k^ie_i', we can conclude that c_k^i = T_k^i. Therefore,

    \[T=T_j^i e_i'\varepsilon^j\]

The above result can also be derived from w^i = T_j^iv^j as follows.

    \[\begin{split} w^i &= T_j^iv^j \implies w = (T_j^iv^j)e_i' = T_j^i\varepsilon^j(v)e_i'\\&\therefore T(v) = T_j^i\varepsilon^j(v)e_i' \text{ or } T = T_j^i e_i'\varepsilon^j \end{split}\]

Change of basis of \mathcal L(V,W) is as follows.

For \mathcal L(V,W), let \mathcal B=\{e_i\}_1^n and \tilde {\mathcal B}=\{\tilde e_i\}_1^n be bases for V, and \mathcal B'=\{e_i'\}_1^m and \tilde {\mathcal B}'=\{\tilde e_i'\}_1^m be bases for W. Also, \mathcal B^*=\{\varepsilon^i\}_1^n and \tilde {\mathcal B}^*=\{\tilde \varepsilon^i\}_1^n are the corresponding bases of V^*. Forward and backward transformation pairs in V and W are denoted as (F_{ij}, B_{ij}) and (F'_{ij}, B'_{ij}).

    \[\begin{split} T&=T_j^ie_i'\varepsilon^j = \tilde {T_j^i} \tilde e_i'\tilde \varepsilon^j \\&\implies T_j^ie_i'\varepsilon^j = \tilde {T_j^i} F'_{ki} e_k' B_{js}\varepsilon^s \implies T_s^k = \tilde {T_j^i} F'_{ki} B_{js}\\&(\text{ by } B_{nl}F_{lm}=\delta_{nm}) \ \implies B'_{lk}F_{sr}T_s^k = \tilde {T_j^i} \delta_{li}\delta_{jr}=\tilde T_r^l\\&\therefore \tilde {T_j^i} = B'_{ik}F_{sj}T_s^k\end {split}\]


Note that the coordinates T_s^k of a linear map need two transformations such that the covariant index s of T_s^k pertains to the forward transformation and the contravariant index k pertains to the backward transformation.

Example: let T\in \mathcal L(V,V), then,

    \[T = T_i^je_j\varepsilon^i \quad , \quad \tilde {T_s^t} = B_{tj}F_{is}T_i^j\]

If the matrices [F], [F]^{-1}=[B], and [T] are considered, we can write,

    \[ [\tilde T] =  [F]^{-1}[T][F]\]
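A quick numerical check of this similarity transformation (an arbitrary [T] and [F]; the point is that the coordinate relation [w]=[T][v] holds in both bases):

    import numpy as np

    T = np.array([[1.0, 2.0],
                  [0.0, 3.0]])      # matrix of T in the old basis
    F = np.array([[1.0, 1.0],
                  [1.0, 2.0]])      # arbitrary invertible forward transform
    B = np.linalg.inv(F)

    T_tilde = B @ T @ F             # [tilde T] = [F]^{-1} [T] [F]

    v_B = np.array([1.0, -2.0])     # coordinates of some v in the old basis
    w_B = T @ v_B                   # coordinates of w = T(v) in the old basis

    # The same relation holds between the new coordinates.
    print(np.allclose(T_tilde @ (B @ v_B), B @ w_B))   # True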

Bilinear forms

A bilinear form is a bilinear map defined as T:V\times V\to \mathbb R. Setting a basis for V, a bilinear form can be represented by matrix multiplications on the coordinates of the input vectors. If \{e_i\}_1^n is a basis for V, then

    \[\begin{split} T(u,v)&=T(u^ie_i,v^je_j)=u^iv^jT(e_i, e_j)\\&\implies T(u,v)=u^iv^jT_{ij} \quad \text{ with } T_{ij}:=T(e_i, e_j) \end{split}\]

which can be written as,

    \[T(u,v)=[u]_{\mathcal B}^{\rm T}[T][v]_{\mathcal B}\]

where [T]\in\mathbb M^{n\times n} with [T]_{ij}=T_{ij}.

The expression u^iv^jT(e_i, e_j) indicates that a bilinear form is uniquely defined by its action on the basis vectors. This is the same as what was shown for linear maps by proposition 1. This comes from the fact that a bilinear form is a linear map with respect to one of its arguments at a time.

Now we seek a basis for the space of bilinear forms, i.e. \mathcal L_b(V\times V, \mathbb R). This is a vector space with the following defined operations.

    \[\begin{split}\forall T_1, T_2 \in \mathcal L_b\quad (T_1+T_2)(u,v) &= T_1(u,v) + T_2(u,v)\\\forall \alpha\in \mathbb R\quad (\alpha T)(u,v) &= \alpha T(u,v)= T(\alpha u, v)= T(u, \alpha v)\end{split}\]

The dimension of this space is n^2; therefore, for any bilinear form T there are basis bilinear forms \rho_{ij}\in \mathcal L_b(V\times V, \mathbb R) such that,

    \[T=c_{ij}\rho_{ij}\]

From the result T(u,v)=u^iv^jT(e_i, e_j)=u^iv^jT_{ij} we can conclude that

    \[\begin{split}T(u,v)&=u^iv^jT_{ij}= \varepsilon^i(u)\varepsilon^j(v)T_{ij}\implies T(\cdot,\cdot) = T_{ij}\varepsilon^i(\cdot)\varepsilon^j(\cdot)\\\therefore c_{ij}&=T_{ij}, \quad \rho_{ij}= \varepsilon^i\varepsilon^j\quad \text {or } \rho_{ij}(e_s,e_t)=\begin{cases} 1& s=i \text { and } t=j\\ 0& \text {otherwise}\end{cases}\end{split}\]

Following the index level convention, the indices of T_{ij} should stay as subscripts because each index pertains to the covariant coordinates of a covector after fixing the other index.

If \mathcal B and \tilde {\mathcal B} are two bases for V, then the change of basis for the space of bilinear forms is as follows.

    \[\begin{split} T&=T_{ij}\varepsilon^i\varepsilon^j = \tilde T_{ij} \tilde \varepsilon^i\tilde \varepsilon^j = \tilde T_{ij}B_{is}\varepsilon^sB_{jt}\varepsilon^t\\&\implies T_{st}=\tilde T_{ij}B_{is}B_{jt}\\&\therefore \tilde T_{kl} = F_{sk}F_{tl}T_{st}\end {split}\]
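In matrix form this rule reads [\tilde T]=[F]^{\rm T}[T][F]. The sketch below (arbitrary numbers) verifies it and checks that the value T(u,v) is unchanged when the coordinates are transformed accordingly.

    import numpy as np

    T = np.array([[2.0, 1.0],
                  [1.0, 3.0]])      # components T_ij in the old basis
    F = np.array([[1.0, 2.0],
                  [0.0, 1.0]])      # arbitrary invertible forward transform
    B = np.linalg.inv(F)

    T_tilde = F.T @ T @ F           # covariant in both indices

    u_B, v_B = np.array([1.0, 1.0]), np.array([2.0, -1.0])
    u_t, v_t = B @ u_B, B @ v_B     # contravariant change of the coordinates

    print(np.isclose(u_B @ T @ v_B, u_t @ T_tilde @ v_t))   # True: T(u,v) is invariant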

Example: the metric bilinear map (metric tensor)

Dot/inner product on the vector space V over \mathbb R is defined as a bilinear map \langle \cdot , \cdot \rangle : V\times V \to \mathbb R such that \langle u , v \rangle = \langle v , u \rangle and v\ne 0 \iff \langle v , v \rangle > 0. In this regard, two objects (that can have geometric interpretations for Euclidean spaces) are defined as,

1- Length of a vector \|v\|^2:=\langle v,v\rangle
2- Angle between two vectors \cos \theta :=\langle u/\|u\|,v/\|v\|\rangle

Let's see how the dot product is expressed through the coordinates of vectors. With \{e_i\}_1^n being a basis for V, we can write,

    \[u\cdot v :=\langle u, v \rangle = u^iv^jg_{ij} \quad \text {s.t}\quad g_{ij}=e_i\cdot e_j\]

The term g_{ij} is called the metric tensor and its components can be presented by an n-by-n matrix as [g]_{ij}=e_i\cdot e_j.

If the basis is an orthonormal basis, i.e. e_i\cdot e_j=\delta_{ij}, then g_{ij}=\delta_{ij} and [g] is the identity matrix. Therefore, v\cdot u= u^iv^i and \|v\|^2 = v^iv^i.
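A small sketch (with an arbitrary non-orthonormal basis of \mathbb R^2) computing [g] as the Gram matrix of the basis and using it for dot products and lengths purely from coordinates:

    import numpy as np

    # Non-orthonormal basis of R^2 (arbitrary choice), stored as columns of E.
    E = np.array([[1.0, 1.0],
                  [0.0, 1.0]])
    g = E.T @ E                     # [g]_ij = e_i . e_j (Gram matrix)

    u_B = np.array([1.0, 2.0])      # coordinates u^i
    v_B = np.array([3.0, -1.0])     # coordinates v^j

    # u . v = u^i v^j g_ij, computed from coordinates and the metric only
    print(np.isclose(u_B @ g @ v_B, (E @ u_B) @ (E @ v_B)))        # True

    # ||v||^2 = v^i v^j g_ij
    print(np.isclose(v_B @ g @ v_B, np.linalg.norm(E @ v_B)**2))   # True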

Multilinear forms

A general multilinear form is a multilinear map defined as F:V_1\times V_2\times \cdots \times V_n\to \mathbb R, where each V_i is a vector space. Particularly, setting V_i=V leads to a simpler multilinear form T:V\times V\times \cdots \times V\to \mathbb R.

Following the same steps as shown for a bilinear map, a multilinear form T:V\times V\times \cdots \times V\to \mathbb R can be written as,

    \[\begin{split} T(u,v,\cdots , z)&=T(u^ie_i,v^je_j, \cdots, z^ke_k)= u^iv^j\cdots z^k T(e_i,e_j, \cdots, e_k)\\&\implies T(u,v,\cdots , z)=u^iv^j\cdots z^kT_{ij\cdots k} \quad \text{with}\quad T_{ij\cdots k}:=T(e_i, e_j, \cdots e_k) \end{split}\]

which implies,

    \[T = T_{ij\cdots k}\varepsilon^i \varepsilon^j\cdots \varepsilon^k\]

showing that a multilinear form can be written as a linear combination of products of basis covectors.

Definition of a tensor

Defining the following terms,

  • Vector space V and basis \{e_i\}_1^n and another basis \{\tilde e_i\}_1^n.
  • Basis transformation as \tilde e_j=F_{ij}e_{i}, and therefore e_j=B_{ij}\tilde e_{i}.
  • The dual vector space of V as V^*.
  • Vector space V' and basis \{e_i'\}_1^m and another basis \{\tilde e_i'\}_1^m.
  • Basis transformation as \tilde e_j'=F'_{ij}e_i', and therefore e_j'=B'_{ij}\tilde e_i'.
  • Linear map T\in \mathcal L(V,V').
  • Bilinear form \mathfrak B \in \mathcal L(V,V; \mathbb R).

we concluded that,

    \[\begin{split}v=v^ie_i=\tilde v^i\tilde e_i &\implies \tilde v^i = B_{ij}v^j\quad \text{contravariantly}\\\varepsilon^i, \tilde \varepsilon^i \in V^*, \varepsilon^i(e_j)=\tilde \varepsilon^i(\tilde e_j)=\delta_{ij} &\implies \tilde \varepsilon^i=B_{ij}\varepsilon^j \quad \text{contravariantly}\\f=c_i\varepsilon^i=\tilde c_j \tilde \varepsilon^j&\implies \tilde c_j = F_{ij}c_{i}\quad \text{covariantly}\\T=T_j^ie_i'\varepsilon^j = \tilde {T_j^i} \tilde e_i'\tilde \varepsilon^j &\implies \tilde {T_j^i} = B'_{il}F_{kj}T_k^l\quad \text{contravariant- and covariantly}\\\mathfrak B=\mathfrak B_{ij}\varepsilon^i\varepsilon^j = \tilde {\mathfrak B}_{ij} \tilde \varepsilon^i\tilde \varepsilon^j &\implies \tilde {\mathfrak B}_{ij} = F_{ki}F_{lj}{\mathfrak B}_{kl} \quad \text{covariantly and covariantly}\end{split}\]

It is observed that if a vector v\in V is written in terms of a single sum/linear combination of basis vectors of V, then the components of the vector change contravariantly with respect to a change of basis. Then, covectors are considered and it is observed that their components change covariantly upon a change of basis. A linear map can be written as a linear combination of products of basis vectors and covectors; the coefficients of this combination are seen to change both contra- and covariantly when the bases (of V and V') change. A bilinear form, though, can be written as a linear combination of products of covectors, and the corresponding coefficients change covariantly with a change of basis. These results can be generalized toward an abstract definition of a mathematical object called a tensor. There are the following two approaches for algebraically defining a tensor.

Tensor as a multilinear form

Motivated by how a linear map and a bilinear form are written by combining basis vectors and covectors, a generalized combination of these vectors can be considered. For example,

    \[\mathcal T:=\mathcal T^{ij}{}_k{}^l{}_s{}^t\, e_i e_j\varepsilon^k e_l\varepsilon^s e_t\]

This object \mathcal T consists of a linear combination of a unified (merged) set of basis vectors and covectors e_i e_j\varepsilon^k e_l\varepsilon^s e_t (of V and V^*) by scalar coefficients \mathcal T^{ij}{}_k{}^l{}_s{}^t. According to the type of the basis vectors, the indices become sub- or superscripts, which determines the type of the transformation pertaining to each index. By reordering the basis vectors and covectors, we can write,

    \[\mathcal T:=\mathcal T_{ks}^{ijlt} \varepsilon^k \varepsilon^s e_i e_j e_l  e_t\]

This motivates defining a multilinear form \mathcal F:V\times V\times V^* \times V^*\times V^*\times V^* \to \mathbb R, where each covector factor \varepsilon^k accepts a vector argument and each vector factor e_i accepts a covector argument. This approach implies,

    \[\mathcal T_{ks}^{ijlt}\equiv \mathcal F_{ks}^{ijlt}=\mathcal F(e_k, e_s, \varepsilon^i, \varepsilon^j, \varepsilon^l, \varepsilon^t)\]

Such an array can be realized as the components of some multilinear map \mathcal F. This motivates viewing multilinear maps as the intrinsic objects underlying tensors.

Before expressing the definition of a tensor, we define a notation. A linear map and a bilinear form are respectively written as a linear combination of e_i\varepsilon^j and \varepsilon^i\varepsilon^j. Any of these (for any i,j \le the dimensions of the corresponding spaces) can be considered as one new object and denoted as, for example, \clubsuit_i^j:=e_i\varepsilon^j and \spadesuit^{ij}:=\varepsilon^i\varepsilon^j. The writing of the basis vectors and/or basis covectors adjacent to each other is usually denoted by e_i\otimes\varepsilon^j and \varepsilon^i\otimes\varepsilon^j. This notation is referred to as the tensor product of (basis) vectors. A general definition will be presented later. Using this notation, for now, we can write a linear map and a bilinear form as,

    \[T=T_j^ie_i\otimes\varepsilon^j\quad\quad \mathfrak B = \mathfrak B_{ij}\varepsilon^i\otimes\varepsilon^j\]

This notation can be extended to any finite linear combination of tensor products of basis vectors and/or covectors where the combination coefficients take indices following the index level convention. For example we can write,

    \[\mathcal T:=\mathcal T^{ij}{}_k{}^l{}_s{}^t\, e_i\otimes e_j\otimes\varepsilon^k\otimes e_l\otimes\varepsilon^s\otimes e_t\]
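To connect this with concrete arrays: storing the components of a tensor as a multidimensional array, each contravariant index picks up a backward factor and each covariant index a forward factor under a change of basis. The sketch below (random components, arbitrary F) checks this for a type-(1,1) and a type-(0,2) tensor using np.einsum, mirroring \tilde T_j^i = B_{ik}F_{sj}T_s^k and \tilde T_{kl}=F_{sk}F_{tl}T_{st}.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 3
    F = rng.normal(size=(n, n))     # arbitrary invertible forward transform
    B = np.linalg.inv(F)

    # Type-(1,1) components T^k_s stored as T[k, s]:
    T = rng.normal(size=(n, n))
    T_tilde = np.einsum('ik,sj,ks->ij', B, F, T)    # tilde T^i_j = B_ik F_sj T^k_s
    print(np.allclose(T_tilde, B @ T @ F))          # True

    # Type-(0,2) components G_st stored as G[s, t]:
    G = rng.normal(size=(n, n))
    G_tilde = np.einsum('sk,tl,st->kl', F, F, G)    # tilde G_kl = F_sk F_tl G_st
    print(np.allclose(G_tilde, F.T @ G @ F))        # True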

Linear Algebra Cheat Sheet (1)

0- Notations and convention

A variable/quantity/element can be scalar, vector, or a matrix; its type should be understood from the context if not explicitly declared.

0-1- A vector z\in \mathbb{F}^{n} is considered as a column vector, i.e. a single-column matrix. Therefore:

z\in \mathbb{F}^{n}\equiv z\in \mathbb{F}^{n\times 1}

A row vector is therefore defined as  w^\text T\in \mathbb{F}^{1\times n}.

0-2- Dot product of column vectors:

\text{For }x,y\in \mathbb{R}^{n\times 1}, \ x\cdot y:=x^\text Ty

0-3- The ij-th element of a matrix A is denoted by A_{ij}. The i-th column and j-th row of the matrix are respectively denoted by A_{,i} and A_{j,} .

0-4- Columns and rows of a matrix are column and row vectors. An m\times n matrix can then be represented as:

\begin{bmatrix} v_1& v_2& \dots & v_n\end{bmatrix}\ \text{ s.t } v_i=A_{,i}

or

\begin{bmatrix}u_1^\text T\\ u_2^\text T\\ \vdots\\ u_m^\text T\end{bmatrix}\ \text{ s.t } u_i^\text T=A_{i,}


1- Orthogonal matrix

A square matrix A\in \mathbb{R}^{n\times n} is said (by definition) to be orthogonal iff A^{-1}=A^\text T, provided that A^{-1}, the inverse of A, exists. As a result, AA^{-1}=AA^\text T = I_n.

The following are equivalent for A:
a) A is orthogonal.
b) the column vectors of A are ortho-normal.
c) the row vectors of A are ortho-normal.
d) A is size preserving: \|Ax\|=\|x\|, \|.\| being the Euclidean norm, and x\in\ \mathbb{R}^{n\times 1}.
e) A is dot product preserving: Ax\cdot Ay=x\cdot y .
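A quick numerical sketch checking a few of these equivalences on a rotation matrix (an arbitrary angle; rotations are orthogonal):

    import numpy as np

    theta = 0.7                                     # arbitrary angle
    A = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])

    print(np.allclose(A.T @ A, np.eye(2)))          # A^T = A^{-1}
    x = np.array([3.0, 4.0])
    y = np.array([-1.0, 2.0])
    print(np.isclose(np.linalg.norm(A @ x), np.linalg.norm(x)))   # size preserving
    print(np.isclose((A @ x) @ (A @ y), x @ y))                   # dot product preserving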


2- Some matrix multiplication identities

A\in \mathbb{F}^{m\times n},\ x\in \mathbb{F}^{n\times 1}\ \text{ then } Ax=\sum_{i=1}^n x_iA_{,i}

A\in \mathbb{F}^{m\times n}\ , D\in \mathbb{F}^{n\times n}\ \text{ and diagonal }, B\in \mathbb{F}^{n\times k},\ \text{ s.t.}\\ \ \\ A=\begin{bmatrix} v_1& v_2& \dots & v_n\end{bmatrix}, \ B=\begin{bmatrix}u_1^\text T\\ u_2^\text T\\ \vdots\\ u_n^\text T\end{bmatrix},\ D_{ij}= \begin{cases} \lambda_i &\text{, } i=j \\ 0 &\text{, } i\ne j \end{cases} \\ \ \\ \text{then }ADB=\sum_{i=1}^n \lambda_iv_i u_i^\text T
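A numerical sketch (random shapes and values) of the identity ADB=\sum_i \lambda_i v_i u_i^\text T:

    import numpy as np

    rng = np.random.default_rng(1)
    m, n, k = 4, 3, 5
    A = rng.normal(size=(m, n))
    B = rng.normal(size=(n, k))
    lam = rng.normal(size=n)
    D = np.diag(lam)

    lhs = A @ D @ B
    rhs = sum(lam[i] * np.outer(A[:, i], B[i, :]) for i in range(n))
    print(np.allclose(lhs, rhs))    # True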

3- Change of basis

Let \beta:=\{b_1,\dots,b_n\} and \beta':=\{b'_1,\dots,b'_n\} be two bases of \mathbb R^n. Then the following holds for the coordinates of a vector v\in\mathbb R^n with respect to the two bases:

[v]_{\beta'}=P_{\beta\to\beta'}[v]_{\beta}\quad\text{s.t} \\ \ \\ P_{\beta\to\beta'}=\begin{bmatrix} [b_1]_{\beta'}& [b_2]_{\beta'}& \dots & [b_n]_{\beta'}\end{bmatrix}

It can be proved that:

[v]_{\beta}=P_{\beta'\to\beta}[v]_{\beta'}\quad\text{s.t} \\ \ \\ P_{\beta'\to\beta}=P^{-1}_{\beta\to\beta'}
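A sketch (two arbitrary bases of \mathbb R^2) building P_{\beta\to\beta'} column by column from [b_i]_{\beta'}:

    import numpy as np

    # Two arbitrary bases of R^2, stored as columns (in standard coordinates).
    Bmat = np.array([[1.0, 1.0],
                     [0.0, 1.0]])       # beta  = {b_1, b_2}
    Bpmat = np.array([[2.0, 0.0],
                      [1.0, 1.0]])      # beta' = {b'_1, b'_2}

    # Column i of P is [b_i]_{beta'}: solve Bpmat @ col = b_i for each i.
    P = np.linalg.solve(Bpmat, Bmat)    # P_{beta -> beta'}

    v = np.array([3.0, 2.0])            # a vector in standard coordinates
    v_beta = np.linalg.solve(Bmat, v)   # [v]_beta
    v_betap = np.linalg.solve(Bpmat, v) # [v]_beta'

    print(np.allclose(P @ v_beta, v_betap))                             # True
    print(np.allclose(np.linalg.inv(P), np.linalg.solve(Bmat, Bpmat)))  # P_{beta'->beta} = P^{-1}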

Rank of a Matrix

Let A\in \mathbb{R}^{m\times n} (the results also hold for \mathbb{C}^{m\times n}). Then, the column rank/row rank of A is defined as the dimension of the column/row space of A, i.e. the dimension of the vector space spanned by the columns/rows of A; this is then equivalent to the number of linearly independent columns/rows (column/row vectors) of A.

Theorem: column rank of A = row rank of A.

Definition: the rank of a matrix, rank(A), is the dimension of either the column or row space of A; simply the number of linearly independent columns or rows of A.

Definition: for a linear map f: \mathbb{R}^n\longrightarrow  \mathbb{R}^m, the rank of the linear map is defined as the dimension of the image of f. This definition is equivalent to the definition of the matrix rank as every linear map f:\mathbb{R}^n\longrightarrow  \mathbb{R}^m has a matrix A\in  \mathbb{R}^{m\times n} by which it can be written as f(x)=Ax.

Proposition: \text{rank}(A)\le \min(m,n). This leads to these definitions: A matrix A is said to be full rank iff \text{rank}(A)= \min(m,n), i.e. the largest possible rank, and it is said to be rank deficient iff \text{rank}(A)\lt \min(m,n), i.e. not having full rank.

Properties of rank

For  A\in  \mathbb{R}^{m\times n}:

1- only a zero matrix has rank zero.

2- If B\in  \mathbb{R}^{n\times k}, then  \text{rank}(AB)\le \min( \text{rank}(A) , \text{rank}(B))

3- \text{rank}(A+B)\le \text{rank}(A)+ \text{rank}(B)

4-  \text{rank}(A)=\text{rank}(A^{\text T})

5-  \text{rank}(A)= \text{rank}(A^{\text T}) =\text{rank}(AA^{\text T})= \text{rank}(A^{\text T}A)

6- If v\in \mathbb{R}^{r\times 1} is non-zero, then V=vv^\text T \in \mathbb{R}^{r\times r} has \text{rank}(V)=1. In addition, for v_i \in \mathbb{R}^{r\times 1}, the sum S=\sum_{i=1}^{n} v_iv_i^\text T satisfies \text{rank}(S)\le n, i.e. S has at most rank n.

7- A diagonalizable square matrix C\in \mathbb{R}^{n\times n} can be decomposed as C=U\Gamma U^{-1} where \Gamma is a diagonal matrix containing the eigenvalues of C. Then, \text{rank}(C)=\text{rank}(\Gamma), i.e. the number of non-zero eigenvalues of C.

8- For a square matrix C\in \mathbb{R}^{n\times n}, the following are equivalent: C is full rank, C is invertible, C has a non-zero determinant, and C has n non-zero eigenvalues.
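A few of these properties checked numerically with np.linalg.matrix_rank (random matrices; a sketch, not a proof):

    import numpy as np

    rng = np.random.default_rng(2)
    A = rng.normal(size=(4, 3))
    B = rng.normal(size=(3, 5))

    rA, rB = np.linalg.matrix_rank(A), np.linalg.matrix_rank(B)
    print(np.linalg.matrix_rank(A @ B) <= min(rA, rB))              # property 2
    print(np.linalg.matrix_rank(A) == np.linalg.matrix_rank(A.T))   # property 4
    print(np.linalg.matrix_rank(A.T @ A) == rA)                     # property 5

    v = rng.normal(size=(4, 1))
    print(np.linalg.matrix_rank(v @ v.T) == 1)                      # property 6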

Proofs

P6:

V=\begin{bmatrix}v_1\begin{bmatrix}v_1\\v_2\\ \vdots\\v_r \end{bmatrix}&v_2\begin{bmatrix}v_1\\v_2\\ \vdots\\v_r \end{bmatrix}&\dots &v_r\begin{bmatrix}v_1\\v_2\\ \vdots\\v_r \end{bmatrix}\end{bmatrix}

where v_i\in \mathbb{R} are the coordinates of the vector v. This indicates that each column of V is a scalar multiple of v; therefore, for v\ne 0 the column space is one-dimensional. Hence, \text{rank}(V)=1.

For the second part, property 3 proves the statement.