Analytic Geometry

Norm

A norm on a vector space $V$ is a function $||\cdot|| : V \to \mathbb{R}$ such that $\forall \lambda \in \mathbb{R},\; x,y \in V$, we have absolute homogeneity ($||\lambda x|| = |\lambda|\,||x||$), the triangle inequality ($||x+y|| \leq ||x|| + ||y||$), and positive definiteness ($||x|| \geq 0$, with $||x|| = 0$ iff $x = 0$).

Manhattan Norm (L1 norm)

$$||x||_1 = \sum_{i=1}^n |x_i|$$

Euclidean Norm (L2 norm)

$$||x||_2 = \sqrt{\sum_{i=1}^n x_i^2}$$

When we write $||x||$ without any subscript, we usually mean the Euclidean norm.
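As a quick numerical sketch (assuming NumPy; the vector here is an arbitrary example), the two norms can be computed directly or via `np.linalg.norm`:

```python
import numpy as np

x = np.array([3.0, -4.0, 1.0])  # arbitrary example vector

l1 = np.abs(x).sum()          # Manhattan norm: sum of absolute values
l2 = np.sqrt((x ** 2).sum())  # Euclidean norm: square root of sum of squares

# np.linalg.norm gives the same results (ord=1 and ord=2)
assert np.isclose(l1, np.linalg.norm(x, ord=1))
assert np.isclose(l2, np.linalg.norm(x, ord=2))
print(l1, l2)
```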

General inner product

An inner product is any mapping $<\cdot,\cdot> : V \times V \to \mathbb{R}$ such that $\forall\; x,y,z \in V,\; p,q \in \mathbb{R}$, we have bilinearity ($<px+qy,\,z> = p<x,z> + q<y,z>$, and similarly in the second argument), symmetry ($<x,y> = <y,x>$), and positive definiteness ($<x,x> > 0$ whenever $x \neq 0$).

Suppose $\bold{x} = \sum_i x_i \bold{b_i}$ and $\bold{y} = \sum_i y_i \bold{b_i}$, then $<\bold{x,y}> = \sum_i x_i <\bold{b_i,y}> = \bold{x^Tz}$, where $z_i = <\bold{b_i,y}> = <\bold{b_i},\sum_j y_j \bold{b_j}> = \sum_j <\bold{b_i,b_j}> y_j = \bold{r_i^Ty}$, where $\bold{r_i}$ is the column vector whose $j^\text{th}$ entry is $<\bold{b_i,b_j}>$, and $\bold{r_i^T}$ is the corresponding row vector.

From here, it’s easy to see that $<\bold{x,y}> = \bold{x^TAy}$, where $A_{ij} = <\bold{b_i,b_j}>$, i.e. $\bold{A}$ is the matrix whose $i^\text{th}$ row is $\bold{r_i^T}$.

Since $A_{ij} = \;<\bold{b_i,b_j}>\; = \;<\bold{b_j,b_i}>\; = A_{ji}$, $\bold{A}$ is a symmetric matrix.

Moreover, because $<\bold{x,x}>$ is positive for every $\bold{x} \in V$ except $\bold{x = 0}$ (where it is $0$), we have $\forall \bold{x} \in V - \{\bold{0}\},\;\; \bold{x^TAx} > 0$, which is what we call a positive definite matrix.

So finally, an inner product is an operation on $V \subseteq \mathbb{R}^n$ given by $<\bold{x,y}> = \bold{x^TAy}$, where $\bold{A} \in \mathbb{R}^{n\times n}$ is a symmetric, positive definite matrix.
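A minimal numerical sketch of this (assuming NumPy; the particular matrix $\bold{A}$ below is just an arbitrary symmetric positive definite example, and `inner` is a hypothetical helper name):

```python
import numpy as np

# An arbitrary symmetric positive definite matrix; any such A defines an inner product
A = np.array([[2.0, 0.5],
              [0.5, 1.0]])

def inner(x, y, A):
    """General inner product <x, y> = x^T A y."""
    return x @ A @ y

x = np.array([1.0, 2.0])
y = np.array([-1.0, 3.0])

print(inner(x, y, A))                      # <x, y>
print(np.allclose(A, A.T))                 # symmetry of A
print(np.all(np.linalg.eigvalsh(A) > 0))   # positive definiteness: all eigenvalues > 0
```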

Induced norms

Any definition of a norm that can be expressed using an inner product as $||\bold{x}|| = \sqrt{<\bold{x,x}>}$ is called an induced norm.

For any general inner product, the Cauchy-Schwarz inequality guarantees that

$|<\bold{x,y}>| \leq ||\bold{x}||\;||\bold{y}||$, and thus, there is always a way of defining an angle between two vectors.

Angle between vectors

For any 2 vectors $\bold{x,y} \in V$, if the angle between them is $\theta$, then:

$$\cos(\theta) = \frac{<\bold{x,y}>}{||\bold{x}||\;||\bold{y}||}$$

And thus, we also have a notion of orthogonality, which is when $\cos(\theta) = 0$ and thus $<\bold{x,y}> = 0$.
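Continuing the sketch above (same hypothetical NumPy setup and arbitrary SPD matrix $\bold{A}$), the Cauchy-Schwarz bound and the angle formula look like this:

```python
import numpy as np

A = np.array([[2.0, 0.5],
              [0.5, 1.0]])   # same arbitrary SPD matrix as before

def inner(x, y):
    return x @ A @ y

def norm(x):
    return np.sqrt(inner(x, x))   # norm induced by the inner product

x = np.array([1.0, 2.0])
y = np.array([-1.0, 3.0])

# Cauchy-Schwarz: |<x, y>| <= ||x|| ||y||
assert abs(inner(x, y)) <= norm(x) * norm(y)

cos_theta = inner(x, y) / (norm(x) * norm(y))
theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))  # angle between x and y under <.,.>
print(theta)
```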

Orthonormal bases

A basis $B = \{\bold{b_1,b_2,\dots,b_n}\}$ is called orthonormal iff $\forall i,j \in \mathbb{N}$ with $i,j \leq n$, we have $<\bold{b_i,b_i}> = 1$ and, whenever $i \neq j$, $<\bold{b_i,b_j}> = 0$. With respect to such a basis, the inner product is basically the dot product of the coordinate vectors (the matrix $\bold{A}$ above becomes $\bold{I}$).
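A quick check of this claim (a sketch assuming NumPy, using the columns of a rotation matrix as an arbitrary orthonormal basis under the standard dot product):

```python
import numpy as np

# Columns of a rotation matrix form an orthonormal basis of R^2 (standard dot product)
t = 0.3
B = np.array([[np.cos(t), -np.sin(t)],
              [np.sin(t),  np.cos(t)]])

G = B.T @ B                       # Gram matrix G_ij = <b_i, b_j>
assert np.allclose(G, np.eye(2))  # orthonormal => Gram matrix is the identity

lam = np.array([1.0, 2.0])        # coordinates of x in basis B
mu  = np.array([3.0, -1.0])       # coordinates of y in basis B
x, y = B @ lam, B @ mu

# Inner product of the vectors equals the plain dot product of their coordinates
assert np.isclose(x @ y, lam @ mu)
```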

Orthogonal Projections

Suppose $U$ is a vector subspace of $V \subseteq \mathbb{R}^n$ with basis $B = \{\bold{b_1,b_2,\dots,b_m}\}$; then for any $\bold{x} \in V$, the projection $\bold{x}_U \in U$ is the vector in $U$ with the least distance $||\bold{x - x}_U||$, as given by a norm induced from the inner product $<\cdot,\cdot>$.

It can be shown that this is the case exactly when $\bold{x - x}_U$ is perpendicular to every vector in $B$: for any other $\bold{u} \in U$, the Pythagorean theorem gives $||\bold{x - u}||^2 = ||\bold{x - x}_U||^2 + ||\bold{x}_U - \bold{u}||^2 \geq ||\bold{x - x}_U||^2$, so no other point of $U$ is closer.

To find the coordinates of $\bold{x}_U$ as expressed in $B$, say $\bold{\lambda}$, we write the equations

$0 = \;<\bold{b_i}, \bold{x - x}_U> = <\bold{b_i}, \bold{x - B\lambda}> = \bold{b_i^TA(x - B\lambda)}$ (here $\bold{B}$ also denotes the $n \times m$ matrix whose columns are the basis vectors $\bold{b_i}$). Stacking up these equations for all $i$ up to $m$, we get: $\bold{0 = B^TA(x - B\lambda)} \implies \bold{B^TAB\lambda = B^TAx} \implies \bold{\lambda = (B^TAB)^{-1}B^TAx} \implies \bold{x_U = B(B^TAB)^{-1}B^TAx}$
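A sketch of the general formula (assuming NumPy; $\bold{A}$, $\bold{B}$, and $\bold{x}$ below are arbitrary illustrative choices):

```python
import numpy as np

A = np.array([[2.0, 0.5, 0.0],
              [0.5, 1.0, 0.0],
              [0.0, 0.0, 3.0]])   # arbitrary SPD matrix defining <x, y> = x^T A y

B = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])        # columns = basis of the subspace U (m = 2)

x = np.array([1.0, 2.0, 3.0])

lam = np.linalg.solve(B.T @ A @ B, B.T @ A @ x)   # lambda = (B^T A B)^{-1} B^T A x
x_U = B @ lam                                      # projection of x onto U

# The residual x - x_U is A-orthogonal to every basis vector b_i
assert np.allclose(B.T @ A @ (x - x_U), 0.0)
```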

Special case : If the default basis that everything is expressed in is an orthonormal basis, then $\bold{A = I}$ and we basically have $\bold{\lambda = (B^TB)^{-1}B^Tx}$ and $\bold{x_U = B(B^TB)^{-1}B^Tx}$. This is what we’ll usually use. It is basically the least-squares solution of the problem $\bold{B\lambda \approx x}$.
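As a short sketch (NumPy again; the matrix and vector are arbitrary), the same $\bold{\lambda}$ drops out of a standard least-squares solver:

```python
import numpy as np

B = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
x = np.array([1.0, 2.0, 3.0])

lam_normal = np.linalg.solve(B.T @ B, B.T @ x)     # (B^T B)^{-1} B^T x
lam_lstsq, *_ = np.linalg.lstsq(B, x, rcond=None)  # least-squares solution of B lam ~ x

assert np.allclose(lam_normal, lam_lstsq)
x_U = B @ lam_normal   # projection of x onto the span of the columns of B
```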

Special Special case : If the default basis as well as the basis $B$ is orthonormal, then $\bold{B^TB = I}^{m\times m}$ and thus $\bold{\lambda = B^Tx}$ and $\bold{x}_U = \bold{BB^Tx}$.

Note that here $\bold{BB^T}$ is not (always) equal to $\bold{I}^{n\times n}$.
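A short sketch of this case (NumPy; two orthonormal columns in $\mathbb{R}^3$, chosen arbitrarily):

```python
import numpy as np

# Two orthonormal vectors in R^3 spanning a plane U
B = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])
x = np.array([1.0, 2.0, 3.0])

assert np.allclose(B.T @ B, np.eye(2))   # B^T B = I (m x m)

lam = B.T @ x          # coordinates of the projection in basis B
x_U = B @ B.T @ x      # the projection itself: (1, 2, 0)

print(np.allclose(B @ B.T, np.eye(3)))   # False: B B^T is a projector, not I (n x n)
```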

Projection of affine spaces

To project a vector is to find the point in a subspace closest to it, viewing the vector as a point. Thus, being able to project a point onto any general affine space (hyperplane) enables us to do things like SVC and SVM .

The projection of a point $\bold{x}$ on a hyperplane $\bold{x_0} + U$ is given by $\bold{x_0} + (\bold{x - x_0})_U$
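A sketch of this (NumPy, orthonormal default basis; the support point $\bold{x_0}$ and direction space below are arbitrary picks):

```python
import numpy as np

B = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])       # orthonormal basis of the direction space U
x0 = np.array([0.0, 0.0, 5.0])   # support point of the affine space x0 + U
x = np.array([1.0, 2.0, 3.0])

proj_U = B @ B.T @ (x - x0)      # project the shifted point onto U
x_proj = x0 + proj_U             # shift back: projection onto the affine space
print(x_proj)                    # (1, 2, 5)
```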

Gram-Schmidt Orthogonalization

For a basis $B = \{\bold{b_1, b_2, \dots, b_n}\}$, define

$\bold{u_1 = b_1}$ and $\bold{u_k} = \bold{b_k} - (\bold{b_k})_{\text{span}(\bold{u_1,u_2,\dots,u_{k-1}})}$ for $1 < k \leq n$.

The basis $U = \{\bold{u_1,u_2,u_3,\dots,u_n}\}$ is an orthogonal basis; normalizing each $\bold{u_k}$ to unit length makes it orthonormal.

Here $(\bold{b_k})_{\text{span}(\bold{u_1,u_2,u_3,\dots,u_{k-1}})}$ is the projection of $\bold{b_k}$ on the span of the $k-1$ elements of $U$ that we have already calculated.
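A minimal sketch of the procedure (NumPy, standard dot product; `gram_schmidt` is a hypothetical helper name, and the final normalization step is included to get an orthonormal result):

```python
import numpy as np

def gram_schmidt(B):
    """Orthogonalize the columns of B (standard dot product), then normalize."""
    U = []
    for b in B.T:                        # iterate over the columns of B
        u = b.copy()
        for v in U:
            u -= (v @ b) / (v @ v) * v   # subtract projection of b onto v
        U.append(u)
    # Normalize each u_k to get an orthonormal basis
    return np.column_stack([u / np.linalg.norm(u) for u in U])

B = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])          # arbitrary basis of R^3 (as columns)
Q = gram_schmidt(B)
print(np.allclose(Q.T @ Q, np.eye(3)))   # True: columns are orthonormal
```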