Wired Thoughts: What *is* the square-root of -1?

Have you ever wondered why complex numbers exist? Have you ever wondered why the hell j == square-root of -1? Were you ever dissatisfied by the explanation your math or engineering profs gave you in college? If you are mathematically inclined (you need to have studied a significant amount of college-level mathematics) and are curious about the way the world works, keep reading.

In math we have the concept of the real number line ... a bunch of numbers that lie in a straight line that stretches from negative infinity to positive infinity. These are 1-D numbers:

... -3, -2, -1, 0, 1, 2, 3 ...

We can easily wrap our heads around this concept. Now, what if we consider 2-D numbers? Numbers that are simply a pair of "x-y" coordinates? Well, this was a little harder to grasp in high school, but we still understood it relatively easily:


                .
                .
                3
                2
                1
... -3, -2, -1, 0, 1, 2, 3 ...
               -1
               -2
               -3
                .
                .

Perhaps we can give these 2-D numbers a special name: 2D vectors. A vector is simply a pair of numbers, (x, y), where x denotes the horizontal portion of the vector and y denotes the vertical portion.

Ok, so far so good. Hold on to your hats, we are about to get way more technical.

Consider what a "2D rotation" operation in computer graphics is: it is a linear transformation and hence can be represented as a matrix multiplication. It is well known that rotating counter-clockwise by an angle theta in the 2D plane is left-multiplication by this 2x2 matrix:

[ cos(theta)  -sin(theta) ]
[ sin(theta)   cos(theta) ]

What if we set theta == pi/2 ? (90 degrees counter-clockwise) then that matrix, call it A, becomes:


[ 0      -1]
[ 1       0]

Voilà. Now we can rotate any point, (x, y), by left-multiplying it with this matrix A:


[ 0     -1]   [x]         ==  [w]
[ 1      0]   [y]             [z]

where (w, z) is the resultant vector after rotation.
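A quick way to check this is with a few lines of Python. This is only a sketch: the helper names `rotation_matrix` and `apply` are made up for illustration, and it uses plain nested lists rather than a matrix library.

```python
import math

def rotation_matrix(theta):
    """Return the 2x2 counter-clockwise rotation matrix for angle theta (radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s],
            [s,  c]]

def apply(matrix, vector):
    """Left-multiply a 2D vector (x, y) by a 2x2 matrix."""
    x, y = vector
    return (matrix[0][0] * x + matrix[0][1] * y,
            matrix[1][0] * x + matrix[1][1] * y)

# The exact 90-degree matrix A from the text (cos(pi/2) = 0, sin(pi/2) = 1).
A = [[0, -1],
     [1,  0]]

w, z = apply(A, (3, 4))
print(w, z)  # -4 3
```

The point (3, 4) lands on (-4, 3), which is the same point turned a quarter of the way around the origin. Calling `rotation_matrix(math.pi / 2)` gives the same matrix up to floating-point rounding.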

Now, let's do something very common for square matrices: let's compute the eigenvalues of this matrix A.

Remember, an eigenvalue of a matrix is a scalar such that, for some special non-zero vector (an eigenvector), multiplying that vector by the scalar is the same as multiplying it by the matrix.

In other words, the eigenvalues are all the values e such that:

Ax = ex

Where A is the matrix whose eigenvalues we are trying to find, and e is the scalar eigenvalue we are trying to solve for. x is a non-zero vector (an eigenvector that goes with e), not just any vector.

So, from that equation we have:

(A-eI)x = 0

Where "I" is the 2x2 identity matrix.

Ok, since x is not a zero vector, the matrix (A - eI) must be singular (otherwise we could invert it and get x = 0), so we have:

det(A - eI) = 0, or:

det [ -e   -1 ]
    [  1   -e ]   == 0

Now we have to simply find the roots of the characteristic polynomial of this matrix, to find the value of e:

e^2 + 1 == 0, or e = +/- sqrt(-1)
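The same roots can be computed numerically. The sketch below assumes the standard fact that a 2x2 matrix has characteristic polynomial e^2 - trace*e + det; the function name `eigenvalues_2x2` is hypothetical, and `cmath.sqrt` is used because the discriminant is negative here.

```python
import cmath

def eigenvalues_2x2(m):
    """Roots of det(m - e*I) = e^2 - trace*e + det = 0 for a 2x2 matrix m."""
    (a, b), (c, d) = m
    trace = a + d
    det = a * d - b * c
    # Complex square root, so a negative discriminant is not an error.
    root = cmath.sqrt(trace * trace - 4 * det)
    return ((trace + root) / 2, (trace - root) / 2)

A = [[0, -1],
     [1,  0]]

for e in eigenvalues_2x2(A):
    # Each root should satisfy e^2 + 1 == 0.
    print(e, e * e + 1 == 0)
```

For A the trace is 0 and the determinant is 1, so the polynomial is exactly e^2 + 1, and both roots that come back are purely imaginary.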

We have arrived at the most beautiful conclusion ever: without ever mentioning anything about complex numbers, abstract algebra, Abelian groups, Euler's formulae, etc, we find ourselves in need of defining the quantity sqrt(-1).

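Again, remember what an eigenvalue really means in this situation: for the special vectors x that satisfy Ax = ex, multiplying by this well-known matrix, A, has exactly the same effect as multiplying by a mysterious quantity, sqrt(-1). Better still, if we fold the pair (x, y) into the single 2-D number x + y*sqrt(-1), then multiplying it by sqrt(-1) gives -y + x*sqrt(-1), i.e. the pair (-y, x), which is precisely what A does to (x, y). So to rotate a vector by 90 degrees in 2D space we can either multiply by A, or we can multiply by sqrt(-1).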
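Both readings of "multiply by sqrt(-1)" can be checked with Python's built-in complex type, which writes sqrt(-1) as `1j`. This is a sketch under two assumptions that are not spelled out in the post: the pair (x, y) is read as the single number x + jy, and the eigenvector paired with the eigenvalue j is taken to be (1, -j). The helper names are hypothetical.

```python
A = [[0, -1],
     [1,  0]]

def apply(matrix, vector):
    """Left-multiply a 2D vector by a 2x2 matrix (entries may be complex)."""
    x, y = vector
    return (matrix[0][0] * x + matrix[0][1] * y,
            matrix[1][0] * x + matrix[1][1] * y)

def rotate_with_j(vector):
    """Treat (x, y) as the complex number x + jy, multiply by j, unpack again."""
    x, y = vector
    rotated = 1j * complex(x, y)
    return (rotated.real, rotated.imag)

# Multiplying x + jy by j agrees with multiplying (x, y) by A.
for point in [(1, 0), (3, 4), (-2, 5)]:
    assert apply(A, point) == rotate_with_j(point)

# The eigenvector equation A v = j v holds for v = (1, -j).
v = (1, -1j)
assert apply(A, v) == (1j * v[0], 1j * v[1])
print("both checks passed")
```

The first loop works for any real point. The second check only works for that particular complex vector, which is the usual restriction on an eigenvalue equation.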

Of course, it makes sense that the eigenvalue for a rotation matrix wouldn't be a simple real scalar, since a real number multiplied with a vector would simply stretch, shrink, or flip that vector, not turn it by 90 degrees. Cool eh?

We can see from this that the fact that j (or i) == sqrt(-1) is not merely a convention that some mathematician came up with ... it is a fundamental part of dealing with numbers. Much like how negative numbers were "discovered" at a point in time where they might have been seen as "useless".

So. How do we find a mystery number, m, that gives us -1 when we multiply by it twice? Easy. We think in two dimensions.

We are basically looking for 1 x m x m = -1

The number 1 represented in 2 dimensions is the vector (1, 0). We know that the vector (-1, 0) is the same as the vector (1, 0) but rotated 180 degrees. So, we rotate (1, 0) by 90 degrees twice to get to (-1, 0). How do we "rotate by 90 degrees" again? Easy ... we just showed, it's by multiplying by "sqrt(-1)". Hence we have m = j. Or more clearly, the number that represents a 90 degree counter-clockwise rotation is simply this new construct "j".
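Here is the two-quarter-turns argument as a small sketch. It assumes the convention from earlier in the post that x is the horizontal part of a vector, so the number 1 sits at (1, 0). The `apply` helper is hypothetical, and `1j` is Python's spelling of j.

```python
A = [[0, -1],
     [1,  0]]

def apply(matrix, vector):
    """Left-multiply a 2D vector (x, y) by a 2x2 matrix."""
    x, y = vector
    return (matrix[0][0] * x + matrix[0][1] * y,
            matrix[1][0] * x + matrix[1][1] * y)

one = (1, 0)                         # the number 1 on the horizontal axis
quarter_turn = apply(A, one)         # rotated 90 degrees
half_turn = apply(A, quarter_turn)   # rotated 90 degrees again
print(quarter_turn, half_turn)  # (0, 1) (-1, 0)

# The same two steps done with the number j instead of the matrix A.
print(1 * 1j * 1j)  # (-1+0j)
```

Either way, two quarter turns take 1 to -1, which is the "1 x m x m = -1" equation with m = j.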

One last question to take care of: Why can we use j in normal algebraic equations and treat it just like any other ordinary number? Magic? No. The reason is subtle: We arrived at the need for a "real number that is equal to the sqrt(-1)" when we were figuring out the eigenvalue of that rotation matrix A. Of course, no such real number exists, but if it did exist, then it would just be yet another number! We can multiply/divide/add/subtract it just like we can multiply/divide/add/subtract the number 7. The only caveat is that it is a rotation and hence lives on a different (and orthogonal) axis to the real-number axis. No problem, we just have to start thinking in 2D when we need the concept of sqrt(-1). There is a very real connection though between j and the real numbers, and that connection is precisely that, when we square j (i.e. j^2), we get -1, a negative number that we are very familiar with.
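To see j behave like any other number, here is a short sketch using Python's built-in complex type. The values of `z` and `w` are arbitrary examples picked for illustration.

```python
z = 2 + 3j
w = 1 - 1j

print(z + w)  # (3+2j)
print(z * w)  # (5+1j)
print(z / w)  # (-0.5+2.5j)

# The usual rules of algebra still hold, e.g. the distributive law ...
assert (z + w) * 1j == z * 1j + w * 1j

# ... and squaring j brings us back to the real axis.
assert 1j * 1j == -1
```

Nothing here needs special-case rules: every step is ordinary algebra plus the single fact that j times j is -1.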

That's it!

If you understood this post then congratulations: you have just linked two seemingly unrelated fields in mathematics: Linear Algebra, and Complex Algebra.

Until next time,
--Shafik

10 comments:

Anonymous said...: "Why can we use j in normal algebraic equations and treat it just like any other ordinary number? Magic? No. The reason is subtle: We arrived at the need for a "real number that is equal to the sqrt(-1)" when we were figuring out the eigen value of that rotation Matrix A. Of course, no such number exists, but if it did exist, then it just be yet another number!"

I hope this doesn't come off as negative because it should be a compliment: It's like you're describing the ring of complex numbers without delving into the definition of a ring or going into abstract algebra.

I'm taking an introduction to ring theory class currently. I don't know your background but maybe you could write a paper about this.; October 8, 2008 at 10:13 PM
Anonymous said...: Another way I try to explain imaginary numbers to high school students is by graphing two parabolas. One with two real-numbered zeros (explaining that it hits the x axis) and then some other parabola that doesn't hit the x axis, and then asking how I can make this problem equal zero, since the parabola is always above/below the axis.; October 8, 2008 at 10:25 PM
Anonymous said...: Very nice overview! Compelling and accessible.

For your next trick, maybe you can generalize to the quaternions, which are the natural representation of 4-D spacetime...?; October 8, 2008 at 11:31 PM
Othello said...: Ax=ex for all x. But only in this particular case. Why?; October 8, 2008 at 11:56 PM
Anonymous said...: This challenged my original thinking on "i," which is/was that it is merely placeholder for sqrt(-1)... I never really understood why there was this emphasis on graphing imaginary numbers out like this, but the way you put it, it is quite a bit more intriguing.

A lot of it I didn't understand and just glossed over (I'm mathematically inclined but the highest-level math I've taken is first year algebra in university...and that was 10 years ago) but even just your comparison with the negative numbers (which I initially balked at) definitely got me thinking.; October 8, 2008 at 11:56 PM
Tomas said...: There is some flawed math here:

For each eigenvalue of a linear transformation A there is _one_ associated eigenvector x.

That is, after having found an eigenvalue, it cannot transform any vector x in the same way as the matrix - only its associated eigenvector.

It is perhaps informative to say that when searching for eigenvalues we are searching for _vectors_ such that they come through the transformation unchanged, save for a scale factor (the eigenvalue).; October 9, 2008 at 2:27 AM
carton said...: As Thomas said, there is some sloppy math here. Ax = ex is only valid when x is the right eigenvector of A, it's not true for all x. In simpler terms, multiplying a vector by i isn't the same as multiplying by A unless the vector is lying on the x (horizontal) axis.; October 9, 2008 at 6:20 AM
Anonymous said...: Stop using j for sqrt(-1). That has validity only in a subset of electrical engineering. Virtually all other disciplines use i.; October 9, 2008 at 8:22 AM
amckenny said...: Heh, I don't even remember what the importance of matrices are in math, much less what eigenvalues are :-P.

I think I'll stick with my flavor of math:

A=P*(1+r/n)^(n*t)

and

E(Ri)=Rf+Beta*(Rm-Rf)
(where Beta=Cov(Ri,Rm)/Var(Rm))

:-) Finance... gotta love it!; October 9, 2008 at 1:50 PM
Unknown said...: I like this interpretation of how a linear algebra problem necessitates this new object called an imaginary number. However don't discount how the high school problems - i.e. solving a quadratic with no real roots - also validly give rise to this concept!; October 9, 2008 at 2:05 PM