## Bounded linear operators on ∞-dimensional vector spaces

A matrix is a box filled with numbers:

$\mathbf{A} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}.$

with the context understood to be that the matrix will be multiplying something in an inner-product sense.

That is, a “matrix on the left” is read with an arrow → going right across its rows; a “matrix on the right” is read with an arrow ↓ going down its columns.

If you read about spectral theorems or bounded linear operators or even just abstract vector spaces you might come across, as I did, mention of “infinite-dimensional spaces”. What could that even mean? How do the dimensions fit together? How can I picture an infinite-dimensional thing?

I recently learned the answer and it’s not nearly as hard as I thought; I’ll share my new perspective with you.

• Normally we talk about an entry a_{i,j} in the matrix. It’s indexed by {row,column} where i,j ∈ {1,…,N}.
• The “infinite-dimensional vector space” idea uses the same a_{i,j}, but with i, j ∈ [0,1]: the continuous line segment, which bijects with [1,N] (another continuous line segment; x ↦ (x−1)/(N−1) is the bijection).
• So the matrix entries function the same way; they’re just now to be thought of as “continuous rows”.
• If you have the mental machinery to envisage a probability distribution (even better, a 2-D joint distribution) then you have what’s required to “picture” this thing.
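If you like, the picture can be made concrete in code. The sketch below (Python, with an arbitrary kernel of my own choosing) discretises a “continuous matrix” a_{s,t} = K(s,t), s, t ∈ [0,1], into an ordinary n×n matrix; applying it to a sampled function is plain matrix multiplication, and refining the grid approaches the integral operator it represents.

```python
import numpy as np

# A "continuous matrix" is a kernel K(s, t) with s, t in [0, 1].
# Chopping [0, 1] into n points turns it into an ordinary n-by-n matrix,
# and applying it to a sampled function is just matrix multiplication
# (a Riemann-sum approximation of the integral of K(s,t)·f(t) dt).
def kernel(s, t):
    return np.exp(-np.abs(s - t))   # an arbitrary smooth kernel

f = lambda t: np.sin(2 * np.pi * t)

for n in (10, 100, 1000):
    grid = np.linspace(0, 1, n)
    S, T = np.meshgrid(grid, grid, indexing="ij")
    A = kernel(S, T) / n            # the n-by-n "pixelated" operator; 1/n plays the role of dt
    g = A @ f(grid)                 # (A f)(s) approximates the integral of K(s,t) f(t) dt
    print(n, g[0])                  # settles down as the grid refines
```

More subsquares in the grid, same multiplication rule: that is all the “infinite-dimensional operator” is.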

If you picture each of the matrix “blocks” as corresponding to a light/darkness value representing the quantity inside, then the “infinite-dimensional” linear operator would just be “more subsquares in the grid”.

If you want to allow complex values, then there are Elias Wegert’s pictures, which use colour as a “circular” value (the complex argument) rather than brightness as a “straight” value. His pullbacks on a complex square, Rect(Z) → arg(f) (I used 1000×1000 resolution), look fairly continuous—like an infinite-dimensional linear operator taking complex values a_{i,j} ∈ ℂ, i, j ∈ [0,1].
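A minimal sketch of that kind of picture, assuming nothing more than numpy (the function f(z) = z² is my arbitrary choice, not Wegert’s):

```python
import numpy as np

# Wegert-style phase portrait: colour each point of a square grid by the
# argument of f(z), rescaled to a hue in [0, 1]. At high enough resolution
# the discrete grid of "matrix entries" looks continuous.
n = 1000
x = np.linspace(-1, 1, n)
Z = x[None, :] + 1j * x[:, None]           # the complex square
F = Z**2                                    # any holomorphic f will do
hue = (np.angle(F) + np.pi) / (2 * np.pi)   # arg(f) rescaled to [0, 1]
print(hue.shape)                            # a 1000x1000 array of colour values
```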

That’s the formal aspects taken care of. What kinds of things might an infinite-dimensional space be needed to represent? Here are some ideas:

## CALCULUS: the Really Really Short Version

So, you never went to university… or you assiduously avoided all maths whilst at university… or you started but were frightened away by the epsilons and deltas…. But you know the calculus is one of the pinnacles of human thought, and it would be nice to know just a bit of what they’re talking about…

There are some good intro-to-calculus videos floating around online. I think I can explain differentiation and integration—the two famous operations of calculus—even more briefly.



Let’s talk about sequences of numbers. Sequences that make sense next to each other, like your child’s height at different ages, not just an unrelated assemblage of numbers which happen to be beside each other. If you have handy a sequence of numbers that’s relevant to you, that’s great.



Differentiation and integration are two ways of transforming the sequence to see it differently-but-more-or-less-equivalently.

Consider the sequence 1, 2, 3, 4, 5. If I look at the differences I could rewrite this sequence as [starting point of 1], +1, +1, +1, +1. All I did was look at the difference between each number in the sequence and its neighbour. If I did the same thing to the sequence 1, 4, 9, 16, 25, the differences would be [starting point of 1], +3, +5, +7, +9.

That’s the derivative operation. It’s basically first-differencing, except in real calculus you would have an infinite, continuous thickness of data—as many numbers between 1, 4, and 9 as you want. In R you can use the diff operation on a sequence of related data to automate what I did above. For example do

```r
seq <- 1:5
diff(seq)
seq2 <- seq * seq
diff(seq2)
```

A couple of things you may notice:

• I could have started at a different point and talked about a sequence with the same changes but a different initial value. For example 5, 6, 7, 8, 9 does the same +1, +1, +1, +1 but starts at 5.
• I could second-difference the numbers, differencing the first-differences: +3, +5, +7, +9 (the differences in the sequence of square numbers) gets me ++2, ++2, ++2.
• I could third-difference the numbers, differencing the second-differences: +++0, +++0.
• Every time I diff I lose one of the observations. This isn’t a problem in the infinitary version although sometimes even infinitely-thick sequences can only be differentiated a few times, for other reasons.
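The repeated differencing above can be checked in Python as well (numpy’s `diff` takes an order argument `n`; this mirrors R’s `diff(x, differences = n)`):

```python
import numpy as np

squares = np.array([1, 4, 9, 16, 25])
print(np.diff(squares))        # first differences:  [3 5 7 9]
print(np.diff(squares, n=2))   # second differences: [2 2 2]
print(np.diff(squares, n=3))   # third differences:  [0 0]
# each diff shortens the sequence by one observation
```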

The other famous tool for looking differently at a sequence is to look at cumulative sums: cumsum in R. This is integration. Looking at “total so far” in the sequence.

Consider again the sequence 1, 2, 3, 4, 5. If I added up the “total so far” at each point I would get 1, 3, 6, 10, 15. This is telling me the same information – just in a different way. The fundamental theorem of calculus says that diff undoes cumsum: diff( cumsum( 1:5 ) ) gives back 2, 3, 4, 5, and keeping the starting total of 1 recovers the original 1, 2, 3, 4, 5. You can verify this without a calculator by subtracting neighbours—looking at differences—amongst 1, 3, 6, 10, 15. (Go ahead, try it; I’ll wait.)
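Here is that round trip in Python (a numpy analogue of the R `diff` and `cumsum`): differencing the running totals returns the steps, and keeping the starting total recovers the whole sequence.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
totals = np.cumsum(x)        # the "total so far": [1, 3, 6, 10, 15]
recovered = np.diff(totals)  # the steps after the start: [2, 3, 4, 5]
print(np.concatenate(([totals[0]], recovered)))  # [1 2 3 4 5] -- the original back
```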

Let’s look back at the square sequence 1, 4, 9, 16, 25. If I cumulatively sum I’d have 1, 5, 14, 30, 55. Pick any sequence of numbers that’s relevant to you and do cumsum and diff on it as many times as you like.



Those are the basics. Why are people so interested in this stuff? Why did it make such a splash and why is it considered to be in the canon of human progress? Here are a few reasons:

• If the difference in a sequence goes from +, +, +, +, … to −, −, −, −, …, then the numbers climbed a hill and started going back down. In other words the sequence reached a maximum. We like to maximise things: efficiency, profit, and so on.
• A corresponding statement could be made for valley-bottoms. We like to minimise things like cost, waste, usage of valuable materials, etc.
• The diff verb takes you from position → velocity → acceleration, so this mathematics relates fundamental quantities in physics.
• The cumsum verb takes you from acceleration → velocity → position, which allows you to calculate things like work. Therefore you can pre-plan, for example, the energy cost of doing something at a scale too large and costly to just try.
• What’s the difference between income and wealth? Well, if you define net income to be what you earn less what you spend, then wealth = cumsum(net income) and net income = diff(wealth). Another everyday relationship made absolutely crystal clear.
• In the infinitary version, diff and cumsum take symbolic formulae to other symbolic formulae. For example diff( x² ) = 2x (look back at the square sequence above if you didn’t notice this the first time). This means instead of having to try (or make your computer try) a lot of stuff to see what’s going to work, you can just plain understand something.
• Also because of the symbolic nicety: post-calculus, if you only know how, e.g., diff( diff( diff( x ))) relates to x – but don’t know a formula for x itself – you’re not totally up a creek. You can use calculus tools to make relationships between varying diff levels of a sequence, just as good as a normal formula – thus expanding the landscape of things you can mathematise and solve.
• In fact diff( diff( x )) = − x is the equation of simple harmonic motion (springs, pendulums, waves) and therefore the source of the physical properties of all materials (hardness, conductivity, density, why the sky is blue, etc.) – which derive from chemistry, which derives from Schrödinger’s Equation, which is solved by the “harmonic” diff( diff( x )) = − x.
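A discrete sketch of that last claim: a sampled sine wave second-differences to (approximately) minus itself, once the step size h is accounted for, since diff(diff(x)) ≈ h²·x″.

```python
import numpy as np

# Sample sin(t) finely and second-difference it: the result, rescaled by
# the step size squared, is approximately -sin(t) at the interior points.
h = 0.001
t = np.arange(0, 2 * np.pi, h)
x = np.sin(t)
second_diff = np.diff(x, n=2) / h**2          # approximates x''(t)
print(np.max(np.abs(second_diff + x[1:-1])))  # close to zero: x'' = -x
```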

Calculus isn’t “the end” of mathematics. It’s barely even before or after other mathematical stuff you may be familiar with. For example it doesn’t come “after” trigonometry, although the two do relate to each other if you’re familiar with both. You could apply the “differencing” idea to groups, topology, imaginary numbers, or other things. Calculus is just a tool for looking at the same thing in a different way.


## Nonlinear Superdimensions?

I’m not the first person to say "ceteris paribus is a lie". What this aphorism means is that if you make a c.p. assumption in order to think something through, then the conclusion you reach may be irrelevant to the real world.

Worse, because people don’t understand models, someone might take your careful “A implies B” statement to mean “both A and B are the case”. For example, rather than Edgeworth boxes implying that trade is always mutually beneficial, people might take you to mean that

1. exchange is characterised by Edgeworth boxes
2. private transactions are always Pareto optimal

which is not at all what the theory’s saying. The theory is just connecting assumptions to conclusion: yes, if this were true, then that would surely follow. Which is great because some people don’t actually think such things through.



Anyway. Ceteris paribus assumptions make thinking easier, but they hamstring whatever you find out—so that it may be useless, or (hopefully not) worse than useless: misleading.

But maybe it’s possible to keep the crutch of c.p. and make it less foolish.

There are some situations where it’s impossible to do what I’m going to suggest—like where space overlaps itself. But in Euclidean spaces it is possible.

Econometricians are already familiar with principal components analysis: you make one “composite dimension” out of a fixed linear combination of the existing dimensions.

composite = 0.4 × X₁ + 0.2 × X₂ + 1.7 × X₃

This is what I’m calling a “super-dimension”.

You hold all other things constant so you can think logically about a situation that has the geometry of a single straight line. By creating a composite dimension maybe one could still use the handy ceteris-paribus assumption but roll more of real-life into the model too.
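As a sketch, with made-up data (the coefficients are the ones from the formula above; everything else is hypothetical):

```python
import numpy as np

# Hypothetical data: three observed dimensions, ten observations each.
rng = np.random.default_rng(0)
X1, X2, X3 = rng.normal(size=(3, 10))

# The fixed-coefficient composite "super-dimension" from the formula above:
composite = 0.4 * X1 + 0.2 * X2 + 1.7 * X3

# One number per observation: the situation now has the geometry of a
# single straight line, along which "all else" can be held constant.
print(composite.shape)
```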



For example, let’s say as wealth ↑, trips to the emergency room ↓. Then you could form a composite dimension with a positive coefficient on wealth and a negative one on emergency-room visits, and talk about both at once with everything else held constant. One step forward relative to talking about wealth ↑↓ alone.

But wait — maybe these are only linearly related in a small neighbourhood of some point. Well, we could still create a composite “super-dimension” by varying the coefficients. This could come in the form of pre-transforming wealth to log of wealth, or something else — like a threshold effect where we use two or three linear pieces (e.g., rich enough with slope = 0, way too poor with slope = 0, and a linear decrease in the middle). In general, whereas linear means +k+k+k+k+k+k+…, nonlinear can be interpreted as +1.2k+k+.9k+.8k+k+1.1k+1.3k+1.2k+1.4k+…. So instead of constructing a composite dimension with fixed coefficients before ignoring everything else, perhaps one could vary the coefficients along with the space.
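Here is the threshold version as a sketch, with made-up cutoffs and slope (none of these numbers come from any real study):

```python
import numpy as np

# Threshold idea: emergency-room visits respond to wealth with slope 0
# below a floor and above a ceiling, and a linear decrease in between.
# Floor, ceiling, slope, and base rate are illustration values only.
def er_visits(wealth, floor=10_000, ceiling=100_000, slope=-4e-5, base=5.0):
    w = np.clip(wealth, floor, ceiling)   # flat outside [floor, ceiling]
    return base + slope * (w - floor)     # linear decrease in the middle

wealth = np.array([1_000, 10_000, 55_000, 100_000, 1_000_000])
print(er_visits(wealth))   # the two poorest and the two richest get the flat pieces
```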

That’s all. This may not be a new idea.


Albert Wenger, one of the owners of tumblr

At minute 31:

• Google did not invent keyword advertising
• GoTo, later renamed Overture, out of IdeaLab, invented it
• and were acquired by Yahoo
• Google improved upon the keyword search idea, turning keyword search into a viable business model
• They realised there needs to be such a thing as a quality score—i.e., you don’t myopically give the ad space to the highest bidder. Long-term revenue maximisation required asking what the users want, and not p***ing them off.

Thank goodness someone with sense and mathematical credentials (W.W.Sawyer) has put the ghastly A Mathematician’s Apology to bed.

That Hardy was a very great mathematician is beyond question…. However, when any person eminent in some field makes statements outside that field, it is legitimate to consider the validity of these statements….

Hardy writes

I hate ‘teaching’….I love lecturing, and have lectured a great deal to extremely able classes. [2.]

Here lecturing means imparting mathematical knowledge to those able to understand it with little or no difficulty; teaching means giving time and effort to make it accessible to those who require assistance…. Good [management] consists in appreciating the merits of a wide variety of individuals and combining them into an effective team. [I]t is precisely this appreciation that Hardy lacks. He makes the extraordinary statement

Most people can do nothing at all well. [3.]

…[H]e regards you as doing well only if you are one of the ten best in the world at this particular activity…. [T]hat very few people do anything well is [then] an [obvious] consequence.

However in life we continually depend on the co-operation of men and women far below this exacting standard….

[E]ven … the … process that links the great mathematicians of one generation to those of the next [depends on them]. There may of course be direct contact, as when Riemann [studied] … under Gauss. But the fact that Gauss was able to reach university at all was due to two teachers, Buttner … and Bartels….[4.]

In science the importance of the expositor is perhaps as great as that of the discoverer. Mendel’s work in genetics remained unknown for many years because there was no one to publicize it and fight for it as Huxley did for Darwin.

He makes this curiously objective division of mankind into minds that are first-class, second class and so on…. There is no part of this that should be accepted as sound advice. If there is something you think worth doing, that you are able to do, that you have the opportunity to do, and that you enjoy doing, wisdom lies in getting on with it, and not giving a second’s thought to what ordinal number attaches to you in some system of intellectual snobbery. As for concern with the self, you are both happiest and most effective when you are so absorbed in what you are doing that for a while you forget the limited being that is actually performing it.


The discovery of the laws of numbers is made upon the ground of the original, already prevailing error, that there are many similar things (but in reality there is nothing similar), at least, that there are things (but there is no “thing”). The supposition of plurality always presumes that there is something which appears frequently,—but here already error reigns, already we imagine beings, unities, which do not exist. Our sensations of space and time are false, for they lead—examined in sequence—to logical contradictions. In all scientific determinations we always reckon inevitably with certain false quantities, but as these quantities are at least constant, as, for instance, our sensation of time and space, the conclusions of science have still perfect accuracy and certainty in their connection with one another; one may continue to build upon them—until that final limit where the erroneous original suppositions, those constant faults, come into conflict with the conclusions, for instance in the doctrine of atoms. There still we always feel ourselves compelled to the acceptance of a “thing” or material “substratum” that is moved, whilst the whole scientific procedure has pursued the very task of resolving everything substantial (material) into motion; here, too, we still separate with our sensation the mover and the moved and cannot get out of this circle, because the belief in things has from immemorial times been bound up with our being. When Kant says, “The understanding does not derive its laws from Nature, but dictates them to her,” it is perfectly true with regard to the idea of Nature which we are compelled to associate with her (Nature = World as representation, that is to say as error), but which is the summing up of a number of errors of the understanding. The laws of numbers are entirely inapplicable to a world which is not our representation—these laws obtain only in the human world.

Fried Rice Nietzsche. Human, All Too Human

Like Spinoza, Nietzsche thinks about something for a bit and decides whatever conclusion he came to must have been the correct one. Learn something before you open your mouth, fool.

Yes, it’s ultimately futile to try to make a comprehensive comparison of things. Ultimately nothing is the same. But you know what? We don’t need to be so picky. Or rigid. Even though every rock is unique, they’re all comparable in some ways, like for example, they all fall under the category of “rock”. So I can put them in an equivalence class for the time being—without robbing them of their unique individuality, just saying they are comparable without being identical.

I feel like I’m stating the obvious. And this person is a venerated Western intellectual? Give me a break.



If by the “laws of numbers” he means to undermine mathematics, then Nietzsche’s critique falls short of most interesting mathematical stuff.

Reine Geometrie (pure geometry)

We don’t “lose” the insights from the A→B process because, in Fried Rice’s opinion, there’s some problem with counting things.

Yes, not all rocks are the same. But. We can still make equivalence-classes of rocks—treating them as the same for the time being.

And even if you couldn’t—that wouldn’t change the weirdness that happens when you mix two things like plus and times (you get prime numbers, which show up at not-totally-predictable times)

…or what happens when you combine shifts and swaps (the symmetric group).

Nobody is “making this up”. Nor does it depend upon some person’s viewpoint. You can work out the symmetric group of order 3 and you’ll find the same thing I found when I worked it out.
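You can check this yourself in a few lines of Python; the names `swap` and `shift` are my labels for a transposition and a 3-cycle:

```python
from itertools import permutations

# The symmetric group of order 3: all six permutations of (0, 1, 2).
elements = list(permutations(range(3)))

def compose(p, q):
    """Apply q first, then p."""
    return tuple(p[q[i]] for i in range(3))

swap = (1, 0, 2)    # a "swap": exchange the first two positions
shift = (1, 2, 0)   # a "shift": cycle everything along by one
print(compose(swap, shift), compose(shift, swap))  # different! order matters
```

Whoever works this out gets the same six elements and the same non-commuting composition table.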

This crap is just like what I’ve seen of Rousseau, Spinoza, Hegel, even Leibniz. People so full of themselves they think every time they clear their throat someone should get out a pen.

Remember that this is the same bloke who posited, with the Eternal Return, that states of affairs must recur given an infinite amount of time. Which is wrong: dynamical systems can wander off and never come back, like a random walker in 3-D.
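A quick Monte-Carlo sketch of that claim (walk length and walker count are arbitrary choices; Pólya’s theorem says the true probability of ever returning in 3-D is about 0.34):

```python
import random

# A simple random walk on the 3-D integer lattice is transient (Polya):
# many walkers leave the origin and never come back.
random.seed(0)
STEPS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def returns_to_origin(n_steps=1000):
    x = y = z = 0
    for _ in range(n_steps):
        dx, dy, dz = random.choice(STEPS)
        x, y, z = x + dx, y + dy, z + dz
        if x == y == z == 0:
            return True
    return False

walks = 2000
frac = sum(returns_to_origin() for _ in range(walks)) / walks
print(frac)   # well below 1 -- most walkers wander off for good
```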

To me it’s much worse to invent false histories or pretend to authority than to suggest we treat a handful of rocks as equivalent for-the-moment. Get off yourself, Nietzsche.