The following is a set of notes I prepared while discussing with someone the concepts of vector spaces, the metric, and manifolds, with particular attention to common pitfalls, such as confusing $dx^i$ (the $i$-th component of the infinitesimal vector $d{\bf x}$) with $d{\bf x}_i$ (an infinitesimal vector, possibly the $i$-th basis vector of the tangent space of a manifold at a specific point).

Vector spaces

We start with a vector space $V$. It is a set with elements, for which the following is true:

  1. We have a commutative operation called "addition": if ${\bf v},{\bf w}\in V$ then ${\bf v}+{\bf w}={\bf w}+{\bf v}\in V$.
  2. We can multiply an element by a real number: if ${\bf v}\in V$ and $\alpha\in\mathbb{R}$, then $\alpha{\bf v}\in V$.

That's all, well, almost all (there are also a few standard axioms, e.g., associativity and distributivity), that there is to a vector space. Of course it follows from the above that if ${\bf v},{\bf w}\in V$ and $\alpha,\beta\in\mathbb{R}$, then $\alpha{\bf v}+\beta{\bf w}\in V$, i.e., linear combinations of vectors are themselves vectors in this vector space.

Now it so happens that in a vector space, we can express all vectors as linear combinations of a (finite or infinite) set of basis vectors:
\begin{align}
{\bf v}=v^1{\bf e}_1+v^2{\bf e}_2+...+v^n{\bf e}_n.\tag{1}
\end{align}
The number $n$ is the dimensionality of the vector space. It need not be finite. (E.g., the vector space of quantum mechanical states of a system is usually infinite-dimensional.)

Inner products

We can also endow the vector space with an "inner product". That is to say, we can define an operation for which the following is true:

  1. For every ${\bf v},{\bf w}\in V$, we have $\langle{\bf v},{\bf w}\rangle\in\mathbb{R}$;
  2. $\langle{\bf v},{\bf w}\rangle=\langle{\bf w},{\bf v}\rangle$;
  3. $\langle{\bf v},{\bf v}\rangle=0$ if and only if ${\bf v}$ is the zero vector, ${\bf v}=0{\bf v}={\bf 0}$;
  4. For $\alpha,\beta\in\mathbb{R}$ and ${\bf v},{\bf w},{\bf u}\in V$, $\langle\alpha{\bf v}+\beta{\bf w},{\bf u}\rangle=\alpha\langle{\bf v},{\bf u}\rangle+\beta\langle{\bf w},{\bf u}\rangle$.

But now things get interesting! Let us look at the inner product of ${\bf v}=v^1{\bf e}_1+...+v^n{\bf e}_n$ and ${\bf w}=w^1{\bf e}_1+...+w^n{\bf e}_n$. Remember, $v^1...v^n$ are just real numbers, and so are $w^1...w^n$. So then:
\begin{align}
\langle{\bf v},{\bf w}\rangle&{}=\langle v^1{\bf e}_1+...+v^n{\bf e}_n,w^1{\bf e}_1+...+w^n{\bf e}_n\rangle\nonumber\\
&{}=\sum_{i=1}^n v^i\langle{\bf e}_i,w^1{\bf e}_1+...+w^n{\bf e}_n\rangle=\sum_{i=1}^n\sum_{j=1}^n\langle{\bf e}_i,{\bf e}_j\rangle v^i w^j,\tag{2}
\end{align}
or, introducing the symbol $g_{ij}=\langle{\bf e}_i,{\bf e}_j\rangle$ as the "metric" and using the Einstein summation convention, we have
\begin{align}
\langle{\bf v},{\bf w}\rangle=g_{ij}v^iw^j.\tag{3}
\end{align}
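Equations (2) and (3) are easy to check numerically. Here is a small Python sketch (the basis vectors and components are arbitrary illustrative choices): we expand ${\bf v}$ and ${\bf w}$ in a non-orthogonal basis of $\mathbb{R}^2$, take the ordinary dot product, and compare with $g_{ij}v^iw^j$:

```python
# Check (2)-(3): expand v and w in a (non-orthogonal) basis of R^2, take
# the ordinary dot product, and compare with g_ij v^i w^j, where
# g_ij = <e_i, e_j>.  All numbers are arbitrary illustrative choices.
e = [[1.0, 0.0],
     [1.0, 1.0]]                      # basis vectors e_1, e_2
v_comp = [1.0, 2.0]                   # components v^i
w_comp = [4.0, -1.0]                  # components w^j

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# assemble the actual vectors v = v^i e_i and w = w^j e_j
v = [sum(v_comp[i] * e[i][k] for i in range(2)) for k in range(2)]
w = [sum(w_comp[j] * e[j][k] for j in range(2)) for k in range(2)]

g = [[dot(e[i], e[j]) for j in range(2)] for i in range(2)]   # the metric

direct = dot(v, w)                    # <v, w> computed directly
via_metric = sum(g[i][j] * v_comp[i] * w_comp[j]
                 for i in range(2) for j in range(2))
print(direct, via_metric)
```

The two numbers agree, as (2) promises.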

There is another way of looking at this result. Just as $w^j$ are the components of the vector ${\bf w}$ in the basis ${\bf e}_j$, perhaps $v_j=g_{ij}v^i$ are components of a corresponding "covector" in an alternate vector space with basis ${\bf e}^j$? In that case, we can express the inner product as a Cartesian dot product between the vector and the covector, so long as the following is true:
\begin{align}
{\bf e}^j\cdot{\bf e}_i=\delta^j_i,\tag{4}
\end{align}
so that
\begin{align}
(g_{ij}v^i{\bf e}^j)\cdot(w^k{\bf e}_k)=
g_{ij}v^iw^k({\bf e}^j\cdot{\bf e}_k)=
g_{ij}v^iw^k\delta^j_k=
g_{ij}v^iw^j.\tag{5}
\end{align}
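The index-lowering recipe in (5) can be checked the same way; again, all numbers are arbitrary illustrative choices:

```python
# Check (5): lower the index of v with the metric, then take a plain dot
# product of the covector components v_j with the vector components w^j;
# the result equals g_ij v^i w^j.  Basis and components are arbitrary.
e = [[1.0, 0.0],
     [1.0, 1.0]]                     # a non-orthogonal basis of R^2
g = [[sum(a * b for a, b in zip(e[i], e[j])) for j in range(2)]
     for i in range(2)]              # g_ij = <e_i, e_j>
v = [1.0, 2.0]                       # v^i
w = [4.0, -1.0]                      # w^j

v_low = [sum(g[i][j] * v[i] for i in range(2)) for j in range(2)]  # v_j
dot = sum(v_low[j] * w[j] for j in range(2))                       # v_j w^j
metric_form = sum(g[i][j] * v[i] * w[j] for i in range(2) for j in range(2))
print(dot, metric_form)
```

The plain dot product $v_jw^j$ of covector components with vector components reproduces $g_{ij}v^iw^j$.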

Change of basis

The set of basis vectors is by no means unique. What happens when we change bases? Why, we have
\begin{align}
{\bf v}=v^1{\bf e}_1+...+v^n{\bf e}_n=v'^1{\bf j}_1+...+v'^n{\bf j}_n.\tag{6}
\end{align}
How can we relate these two representations of the same vector? For starters, we may remember that ${\bf j}_i$ is just another vector, so it can be represented in the original basis:
\begin{align}
{\bf j}_i=J^1_i{\bf e}_1+...+J^n_i{\bf e}_n=\sum_{j=1}^n J^j_i{\bf e}_j.\tag{7}
\end{align}
This means that
\begin{align}
{\bf v}=\sum_{i=1}^n v'^i{\bf j}_i=\sum_{i=1}^n\sum_{j=1}^n v'^i J^j_i{\bf e}_j,\tag{8}
\end{align}
and therefore (with summation implied),
\begin{align}
v^j=J^j_iv'^i.\tag{9}
\end{align}
The numbers $J^j_i$, which are the representations of the basis vectors ${\bf j}_i$ in the original basis ${\bf e}_j$, thus form a matrix that characterizes the transformation of the representation of any vector ${\bf v}$ from the ${\bf j}_i$ basis to the ${\bf e}_j$ basis.
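As a quick numerical illustration of (9), here is a Python sketch (the matrix $J$ and the components $v'^i$ are arbitrary illustrative choices):

```python
# Change of basis, (7)-(9): J[j][i] = J^j_i holds the representation of
# the new basis vector j_i in the old basis, and the components
# transform as v^j = J^j_i v'^i.  All numbers are arbitrary choices.
J = [[1.0, 2.0],
     [0.0, 1.0]]               # column i is j_i expressed in the e-basis
v_prime = [3.0, -1.0]          # components v'^i in the new basis

v = [sum(J[j][i] * v_prime[i] for i in range(2)) for j in range(2)]
print(v)                       # components v^j in the old basis
```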

Manifolds

So far so good, but this is all very abstract, of course. Let us now move on to the concept of a manifold. A manifold $\mathcal{M}$ is a set that can be covered by finitely (or countably) many copies of $\mathbb{R}^n$. By "covered" we mean that the set $\mathcal{M}$ can be split up into finitely (or countably) many (possibly overlapping) subsets, called charts, the elements of each of which can be mapped one-to-one onto a subset of $\mathbb{R}^n$. Given a set $\mathcal{M}$, the smallest value of $n$ for which such a mapping is possible is the dimensionality of $\mathcal{M}$.

For instance, the Euclidean plane can be mapped by just a single copy of $\mathbb{R}^2$, so its dimensionality is 2. The surface of a sphere, like the Earth, needs at least two copies of $\mathbb{R}^2$ but it is still two dimensional. Other, topologically more complex surfaces may require more than two charts to be covered in full, but so long as all the charts are subsets of $\mathbb{R}^2$, the dimensionality is still 2.

At this point, it's prudent to offer a remark that will be important in a moment: a coordinate chart does not form a vector space, at least not in the sense we might expect. E.g., 60 degrees N, 270 degrees W plus 70 degrees N, 180 degrees W gives... 130 degrees North, 450 degrees W? That makes no sense.

So let's say we have an $n$-dimensional manifold. Pick an element of it that, in some coordinate chart, is represented by the $n$-tuple $(x^1,...,x^n)$. At that point in the manifold, we can consider neighboring points that are characterized by infinitesimal displacements of the coordinates: $x^1+dx^1,...,x^n+dx^n$. As the displacements are infinitesimal, any higher-order, nonlinear terms can be ignored: the displacements combine linearly under addition and under multiplication by real numbers. In other words, these infinitesimal coordinate displacements form a vector space. In fact, they can be used to form a basis:
\begin{align}
d{\bf x}_1&{}=dx^1(1,0,...,0),\nonumber\\
d{\bf x}_2&{}=dx^2(0,1,...,0),\nonumber\\
&...\nonumber\\
d{\bf x}_n&{}=dx^n(0,0,...,1).\tag{10}
\end{align}
Any (infinitesimal) vector can be expressed using this basis:
\begin{align}
d{\bf v}=v^id{\bf x}_i,\tag{11}
\end{align}
with summation implied.

It is now time to stress the distinction between $d{\bf x}_i$ (an infinitesimal vector) vs. $dx^i$ (an infinitesimal real number). Note the consistent use of boldface to mark vectors.

Of course the same patch of a manifold can be covered by different coordinate charts. In another coordinate chart, we have different infinitesimal displacements and a different expression for $d{\bf v}$:
\begin{align}
d{\bf v}=v'^jd{\bf y}_j.\tag{12}
\end{align}
We can, of course, express one set of coordinate differentials in terms of the other. More specifically, the chain rule tells us that
\begin{align}
dx^j=\frac{\partial x^j}{\partial y^i}dy^i,\tag{13}
\end{align}
i.e., comparing against (9), we have $J^j_i=\partial x^j/\partial y^i$, thus
\begin{align}
v^j=\frac{\partial x^j}{\partial y^i}v'^i.\tag{14}
\end{align}
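To see (13) at work, here is a Python sketch using polar coordinates as the primed system (a standard example; the point and the displacement values are arbitrary choices): the actual coordinate differences of two nearby points match the chain-rule prediction to first order.

```python
import math

# Chain rule (13): dx^j = (∂x^j/∂y^i) dy^i, illustrated with polar
# coordinates y = (r, θ) mapping to Cartesian coordinates x.
def cart(r, th):
    return (r * math.cos(th), r * math.sin(th))

r, th = 2.0, 0.5
J = [[math.cos(th), -r * math.sin(th)],   # J[j][i] = ∂x^j/∂y^i
     [math.sin(th),  r * math.cos(th)]]

dr, dth = 1e-6, -2e-6                     # small coordinate displacement
x0 = cart(r, th)
x1 = cart(r + dr, th + dth)
dx_exact = [x1[j] - x0[j] for j in range(2)]
dx_chain = [J[j][0] * dr + J[j][1] * dth for j in range(2)]
print(dx_exact, dx_chain)                 # agree to first order
```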

We also have, of course
\begin{align}
\frac{\partial x^j}{\partial x^i}=\delta^j_i.\tag{15}
\end{align}

This also tells us that if we consider $d{\bf x}_i$ a set of infinitesimal basis vectors, then
\begin{align}
\frac{\partial}{\partial{\bf x}_i}=\frac{\partial}{\partial x^i}(0,...,1,...,0)\tag{16}
\end{align}
(with the 1 in the $i$-th slot) would represent the corresponding set of covariant basis vectors.

Metric manifolds

So far, we found a way to recognize infinitesimal displacements in a manifold as forming a vector space, but this vector space is not yet endowed with an inner product: we have not introduced a metric. Earlier, we wrote $g_{ij}=\langle{\bf e}_i,{\bf e}_j\rangle$, but this is not an actionable definition; indeed, it would be circular, since in order to form the inner product $\langle{\bf e}_i,{\bf e}_j\rangle$ we'd need the metric first!

Instead, we are free to introduce a metric of our choosing. In particular, we are free to introduce a metric to compute the infinitesimal squared length $ds^2$ of the vector that is constructed as a combination of the basis vectors $d{\bf x}_i$:
\begin{align}
ds^2=g_{ij}dx^idx^j.\tag{17}
\end{align}
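As a concrete example (the standard one, not anything derived above): on the Euclidean plane in polar coordinates, the metric is $g=\mathrm{diag}(1,r^2)$, so (17) reads $ds^2=dr^2+r^2\,d\theta^2$. A quick Python sketch compares this against the Cartesian distance of two nearby points (the point and displacement values are arbitrary choices):

```python
import math

# (17) in polar coordinates, where g = diag(1, r^2), so
# ds^2 = dr^2 + r^2 dθ^2.  Cross-checked against the Euclidean distance
# computed in Cartesian coordinates for a small displacement.
r, th = 2.0, 0.5
dr, dth = 1e-6, 3e-6

g = [[1.0, 0.0],
     [0.0, r * r]]
d = (dr, dth)
ds2 = sum(g[i][j] * d[i] * d[j] for i in range(2) for j in range(2))

x0 = (r * math.cos(th), r * math.sin(th))
x1 = ((r + dr) * math.cos(th + dth), (r + dr) * math.sin(th + dth))
ds2_euclid = (x1[0] - x0[0]) ** 2 + (x1[1] - x0[1]) ** 2
print(ds2, ds2_euclid)   # agree to leading order
```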

In the context of a metric manifold, then, what is the meaning of the infinitesimal basis vectors $d{\bf x}_i$? At any point in the manifold, characterized by the coordinates $(x^1,...,x^n)$, the vectors $d{\bf x}_i=dx^i(0,...,1,...,0)$ form a basis of the "tangent space", a vector space that is characterized by the Euclidean metric $g_{ij}=1$ if $i=j$, 0 otherwise*.

Finally, note the transformation laws. Suppose we switch coordinates. The squared norm $ds^2$ remains invariant:
\begin{align}
ds^2=g_{ij}dx^idx^j=g_{ij}\frac{\partial x^i}{\partial y^k}\frac{\partial x^j}{\partial y^l}dy^kdy^l=g'_{kl}dy^kdy^l,\tag{18}
\end{align}
where
\begin{align}
g'_{kl}=\frac{\partial x^i}{\partial y^k}\frac{\partial x^j}{\partial y^l}g_{ij}.\tag{19}
\end{align}
Contrast this with how, say, $dx^i$ transforms:
\begin{align}
dy^i=\frac{\partial y^i}{\partial x^k}dx^k.\tag{20}
\end{align}
When we consistently use the index notation for components, quantities with an upper index transform as (20) and are called contravariant; quantities that transform like (19) are covariant.
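The transformation law (19) can also be verified numerically. Here is a sketch (using the standard Cartesian-to-polar example; the evaluation point is an arbitrary choice) showing that the Euclidean metric, transformed to polar coordinates, becomes $\mathrm{diag}(1,r^2)$:

```python
import math

# (19) in action: the Euclidean metric g_ij = diag(1, 1) in Cartesian
# coordinates x, transformed to polar coordinates y = (r, θ) with the
# Jacobian ∂x^i/∂y^k, comes out as diag(1, r^2).
r, th = 2.0, 0.5
g = [[1.0, 0.0],
     [0.0, 1.0]]                          # Cartesian metric
J = [[math.cos(th), -r * math.sin(th)],   # J[i][k] = ∂x^i/∂y^k
     [math.sin(th),  r * math.cos(th)]]

g_prime = [[sum(J[i][k] * J[j][l] * g[i][j]
                for i in range(2) for j in range(2))
            for l in range(2)]
           for k in range(2)]
print(g_prime)   # ≈ [[1, 0], [0, r**2]]
```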

The properties of the inner product imply that the metric cannot be degenerate, which means that its inverse $g^{jk}$ exists:
\begin{align}
g_{ij}g^{jk}=\delta_i^k.\tag{21}
\end{align}
The covariant metric $g_{ij}$ and its contravariant counterpart $g^{ij}$ can be used to convert a contravariant vector into a corresponding covariant vector and vice versa. This "raising" or "lowering" of indices is the means by which the components of a vector can be found in the tangent space or the cotangent space.
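Here is a small sketch of (21), and of raising and lowering in action (the metric values are arbitrary illustrative choices; any symmetric, nondegenerate matrix works):

```python
# (21): invert a 2x2 metric explicitly, check g_ij g^jk = δ_i^k, then
# verify that lowering an index with g_ij and raising it again with
# g^jk returns the original components.
g = [[2.0, 1.0],
     [1.0, 3.0]]
det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
g_inv = [[ g[1][1] / det, -g[0][1] / det],
         [-g[1][0] / det,  g[0][0] / det]]

delta = [[sum(g[i][j] * g_inv[j][k] for j in range(2))
          for k in range(2)]
         for i in range(2)]

v = [1.0, 2.0]                                                     # v^i
v_low = [sum(g[i][j] * v[i] for i in range(2)) for j in range(2)]  # v_j
v_up = [sum(g_inv[j][k] * v_low[j] for j in range(2)) for k in range(2)]
print(delta, v_up)   # the Kronecker delta and the original components
```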

The transformation matrix is of course the Jacobian matrix,
\begin{align}
J^k_i=\frac{\partial x^k}{\partial y^i},\tag{22}
\end{align}
which we already encountered in (9) even before we introduced the concept of a manifold or that of coordinate charts.

A quantity that transforms as
\begin{align}
T'^{a..c}_{i..k}=\underbrace{\hat{J}^a_b...\hat{J}^c_d}_{m}~\underbrace{J^j_i...J^l_k}_{n}~T^{b..d}_{j..l},\tag{23}
\end{align}
where $\hat{J}^i_k=\partial y^i/\partial x^k$ is the inverse of $J^k_i$, is called a tensor of rank $(m,n)$, with $m$ contravariant and $n$ covariant indices. Note that, consistent with (19) and (20), each contravariant index transforms with $\hat{J}$, like $dy^i$, while each covariant index transforms with $J$, like the metric.
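As a sanity check of the tensor transformation law, here is a sketch (using the polar-coordinate Jacobian; the evaluation point is an arbitrary choice) that transforms the rank-$(1,1)$ Kronecker delta. Since the transformation contracts the Jacobian against its inverse, the delta necessarily comes out unchanged:

```python
import math

# Tensor law (23) for a rank-(1,1) tensor: transform δ^b_j with the
# polar-coordinate Jacobian J (contravariant index gets the inverse,
# covariant index gets J itself).  The result is the delta again.
r, th = 2.0, 0.5
J = [[math.cos(th), -r * math.sin(th)],    # J^k_i = ∂x^k/∂y^i
     [math.sin(th),  r * math.cos(th)]]
det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
Jhat = [[ J[1][1] / det, -J[0][1] / det],  # inverse: ∂y^i/∂x^k
        [-J[1][0] / det,  J[0][0] / det]]

T = [[1.0, 0.0],
     [0.0, 1.0]]                           # T^b_j = δ^b_j
T_prime = [[sum(Jhat[a][b] * J[j][i] * T[b][j]
                for b in range(2) for j in range(2))
            for i in range(2)]
           for a in range(2)]
print(T_prime)   # ≈ the identity again
```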


*The use of a misleading symbol like $\delta_{ij}$ was avoided here on purpose.