Posts tagged with size

Rank-Nullity Theorem

The rank-nullity theorem in linear algebra says that dimensions either get

• thrown in the trash
• or show up

after the mapping.

By “the trash” I mean the origin—that black hole of linear algebra, the /dev/null, the ultimate crisscross paper shredder, the ashpile, the wormhole to void and cancelled oblivion; that country from whose bourn no traveller ever returns.

The way I think about rank-nullity is this. I start out with all my dimensions lined up—separated, independent, not touching each other, not mixing with each other. ||||||||||||| like columns in an Excel table. I can think of the dimensions as separable, countable entities like this whenever it’s possible to rejigger the basis to make the dimensions linearly independent.

I prefer to always think about the linear stuff in its nicely rejiggered state, and to treat how to get it there as a separate issue.

So you’ve got your 172 row × 81 column matrix mapping 172→ separate dimensions into →81 dimensions. I’ll forget, for now, about the fact that some of the resultant →81 dimensions might end up as linear combinations of the input dimensions. Just pretend that each input dimension is getting its own linear λ stretch. Now linear just means multiplication.

Linear stretches λ affect the entire dimension the same. They turn a list like [1 2 3 4 5] into [3 6 9 12 15] (λ=3). It couldn’t turn into [10 20 30 −42856712 50] (λ=10, except not everywhere the same stretch=multiplication).

Also remember – everything has to stay centred on 0. (That’s why you always know there will be a zero subspace.) This is linear, not affine. Things stay in place and basically just stretch (or rotate).

So if my entire 18th input dimension [… −2 −1 0 1 2 3 4 5 …] has to get transformed the same, to [… −2λ −λ 0 λ 2λ 3λ 4λ 5λ …], then linearity has simplified this large thing full of possibility and data, into something so simple I can basically treat it as a stick |.

If that’s the case—if I can’t put dimensions together but just have to λ stretch them or nothing, and if what happens to an element of the dimension happens to everybody in that dimension exactly equally—then of course I can’t stick all the 172→ input dimensions into the →81 dimension output space. 172−81 = 91 of them have to go in the trash (effectively, λ=0 on those inputs).

So then the rank-nullity theorem, at least in the linear context, has turned the huge concept of dimension (try to picture 11-D space again would you mind?) into something as simple as counting to 11 |||||||||||.
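If you want to watch the counting happen, here’s a minimal numpy sketch (the 81×172 shape is just numpy’s column-vector convention for a map from 172 dimensions into 81):

```python
import numpy as np

# A random matrix from 172 input dimensions to 81 output dimensions.
# (numpy acts on column vectors, so the shape is (outputs, inputs).)
rng = np.random.default_rng(0)
A = rng.normal(size=(81, 172))

rank = np.linalg.matrix_rank(A)   # dimensions that show up after the mapping
nullity = A.shape[1] - rank       # dimensions thrown in the trash (λ=0)

print(rank, nullity)              # a generic random matrix lands at 81 and 91
assert rank + nullity == 172      # rank-nullity: the sticks always count up to 172
```

A generic (random) matrix uses every output dimension it can, so 91 input dimensions go to the origin.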

Why is the slope of perpendicular lines flipped over and switched signs?

Oh! This one only took me 17 years or so to figure out. This was a “fact” I had committed to memory in school but never thought about why.



From The Symplectization of Science by Mark Gotay and James Isenberg:

There are some connections to circles and homogeneous coordinates (v/‖v‖) but let’s leave those for another time.

Gotay & Isenberg’s exposition using the metric makes it clear that the
/‖v‖ part of the definition of cosine isn’t where the right-angle concept comes from. It comes from the v₁ w₁ + v₂ w₂.

$$\begin{gathered} \text{Given two vectors named } \vec{v}, \vec{w} \text{ made up of } \\ \begin{pmatrix} v_1 \\ v_2 \end{pmatrix} = \vec{v} \quad \text{and} \quad \begin{pmatrix} w_1 \\ w_2 \end{pmatrix} = \vec{w} \\ \text{the metric {\small (which is going to encode geometric information)} is:} \\ \boxed{g = v_1 \cdot w_1 + v_2 \cdot w_2} \end{gathered}$$



So if the slope of my starting line is m, why is the slope of its perpendicular line −1/m?

First I could draw some examples.

I drew these with http://www.garrettbartley.com/graphpaper.html which is a good place to count out the “rise over run” and “negative run over rise” Δx & Δy distances to make sure they really do look perpendicular.

The length and the (affine or “shift”) positioning of perpendicular line segments doesn’t matter to their perpendicularity. So to make life easier on myself I’ll centre everything on zero and make the segments equal length.

The metric formula is going to work. Let’s say my first vector v is (+1,+1) (one to the right and one up) and my second vector w goes one to the right and one down, i.e. (+1,−1). Then the metric would do:

+1 • +1 (horizontal) + +1 • −1 (vertical)

which cancels.



What if it were a slope of 9.18723 or something I don’t want to think about inverting?

This is a case where it’s probably easier to think in terms of abstractions and deduce, rather than using imagination in the conventional way.

If I went over +a steps to the right and +b steps to the up (slope=b/a), then the metric would do:

a•? + b•¿

What is that missing? If I plugged in (?←−b, ¿←a) or (?←b, ¿←−a), the metric would definitely always cancel.

And in either of those cases, the slope of the question marks (second line) would be −a/b.

So the multiplicative inverse (flipping) corresponds to swapping terms in the metric so that the two parts anti-match. And the additive inverse (sign change) means the anti-matched pairs will “fold in” to zero each other (rather than amplifying=doubling one another).
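To re-check the recipe on that ugly slope with a computer (a tiny sketch; 9.18723 is the made-up slope from above):

```python
# The "swap and negate one" recipe: for any run a and rise b,
# plugging (-b, a) into the metric makes the two parts anti-match.
def metric(v, w):
    return v[0] * w[0] + v[1] * w[1]   # g = v1·w1 + v2·w2

a, b = 1.0, 9.18723                    # slope m = b/a, which I didn't want to invert
v = (a, b)
w = (-b, a)                            # slope of w is a/(-b) = -a/b = -1/m

assert metric(v, w) == 0.0             # the anti-matched pairs fold in to zero
```

The cancellation is exact: a·(−b) and b·a have the same magnitude and opposite signs, so even floating point gets 0.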

How I Use The Pythagorean Theorem Every Day

OK, not every day. But whenever I shop for packaged retail goods, like a coffee or something at the grocer’s.

The Pythagorean theorem demonstrates that a slightly larger circle has twice as much area as a slightly smaller circle.

(Since the diagonal of that square is √2 long relative to the "1" of the interior radius=leg of the right triangle. So the outer radius=hypotenuse=√2, and √2 squared is 2.)
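In code, using the unit-radius inner circle and the √2-radius outer circle:

```python
import math

inner = 1.0               # radius = leg of the right triangle
outer = math.sqrt(2)      # radius = hypotenuse = diagonal of the unit square

inner_area = math.pi * inner ** 2
outer_area = math.pi * outer ** 2

ratio = outer_area / inner_area
print(ratio)              # ≈ 2: the slightly-wider circle holds twice as much
```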

And some of us know from Volume Integrals in calculus class that a cylinder's volume = circle area × height — and something like a sausage with a fat middle, or a cup with a wider mouth than base, can be thought of as a “stack” of circle areas

or in the case of a tapered glass, a “rectangle minus triangle” (when the circle is collapsed so just looking at base-versus-height “camera straight ahead on the table” view).

The shell-or-washer-method volume integral lessons were, I think, supposed to teach about symbolic manipulation, but I got a sense of what shapes turn out to be big or small volume as well.

By integrating dheight sized slices of circles that make up a larger 3-D shape, I can apply the square-of-the-radius lesson of the Pythagorean theorem to how real-life “cylinders” or “cylinder-like things” will compare in volume.

• A regulation Ultimate Frisbee can hold 6 beers. (It’s flat/short, but really wide)

• The “large” size may not look much bigger, but its volume can in fact be much bigger.
• Starbucks keeps the base of their Large cups small, I think, to make the large size look noticeably larger (since we apparently perceive the height difference better than the circle difference). (Maybe also so they fit in cup holders in cars.)
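Here’s that disc-stacking as a sketch, on a made-up tapered cup (all dimensions hypothetical):

```python
import math

def stacked_volume(radius_at, height, slices=100_000):
    """Integrate circle areas over dheight-sized slices (the washer idea)."""
    dh = height / slices
    return sum(math.pi * radius_at(i * dh) ** 2 * dh for i in range(slices))

# A hypothetical tapered cup: base radius 3 cm widening to 4.5 cm over 12 cm of height.
cup = lambda h: 3 + 1.5 * h / 12
print(stacked_volume(cup, 12))   # ≈ 537 cm³, much closer to the wide rim's cylinder
                                 # (≈763 cm³) than the narrow base's (≈339 cm³) would suggest
```

Because the radius gets squared, the wide part of the cup punches above its weight in the stack.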

∂ Campbell’s

Cylinder = line-segment × disc

C = | × ●

The “product rule” from calculus works as well with the boundary operator ∂ as with the differentiation operator ∂.

∂C  =   ∂| × ●   +   | × ∂●
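A numeric sanity check of that decomposition, with made-up can dimensions: the two terms of ∂C are the two end discs (∂| is two points, each carrying a disc) plus the tube (the segment swept along the circle ∂●), and together they give the textbook surface area.

```python
import math

r, h = 2.0, 5.0                       # hypothetical can: radius 2, height 5

end_discs = 2 * (math.pi * r ** 2)    # ∂| × ● : two points, each × a disc
tube      = h * (2 * math.pi * r)     # | × ∂● : the segment × the circle

# The boundary "product rule" recovers the familiar 2πr(r + h)
assert math.isclose(end_discs + tube, 2 * math.pi * r * (r + h))
```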


Noticed:

• It’s easier for me to grok statistical significance (p's and t's) from a scatterplot than magnitude (β's).
• Even though magnitude can be the most important thing, it’s "hidden" off to the left.

Note to self: look off to the left more, and for longer.
• But I’m set up to understand the correlativeness in a subᵢ, subⱼ sense — which particular countries fit the pattern, as well as how closely.

Questions:

• Minute __:__ Do each of the dimensions of social problems correlate individually, or is this only a mass effect of the combination?

If it’s true that raising marginal tax rates on the rich lowers crime rates without paying for any anti-crime programmes, that’s almost a free lunch.

UPDATE: Oh, hey, six months after I watch this and 3 days after I put up the story, I see Harvard Business Review has a story corroborating the same effect, pointing out instead how economists don’t look at the p's and t's on a regression table. I feel like I “mentally cross out” any lines with a low t value and then wonder about the F value on a regression with the “worthless” line removed.

Emotion Zero

In 20th-century abstract mathematics, one builds up ideas and properties—not assuming anything except what one is told. You think 2+3=5? Well in my space that I just made up, e₂⊕e₃ = e₁, and 5 doesn’t even exist!

Concepts are added in incrementally, like

• ‖A‖ means the “size” of A. size exists
• ‖A − B‖ means the “distance” between A and B. plus exists & negative exists; or, comparison exists

• (If zero exists, we could say the size of A = the distance between A and 0: ‖ A − 0 ‖ = ‖A‖.)

• ⟨ A | B ⟩ means A “times” B. times exists
• arccos( ⟨A|B⟩ ‖A‖⁻¹ ‖B‖⁻¹ ) means the “angle” between A and B. inverses exist & times exists, so angle exists
• topology adds in neighbourhood relationships—not necessarily in a way that you can infer size or distance (∵¬□∃ metric), but so that you could talk about paths or connectedness
• order or ranking — is it a total order? a transitive order? a partial order? a lattice? Order is subordinate to size, to distance, and to linearity.
• dimensionality — a set containing { ‘a’, ‘b’, the moon, 12, the vector (0 1 1 0 1)∈ℝ⁵, my cat’s hairball } doesn’t inherently have dimensions to it — so structured sets like ℝ² are supposed to explain how their universe breaks down
• linearity — possibly the scariest word in mathematics class? I’ve tried and will continue to try to explain it elsewhere, but “linear” is an extremely-restrictive-but-not-that-restrictive-because-so-many-things-are-linear-once-you-allow-calculus-and-maps-across-domains-for-example-fourier-transforms property. Linearity presumes monotonicity (order preservation), size, and a kind of “constancy” that tells you if 2 went to 4, then 13 is going to go to 26. Or “the 26 of the present land”.

Someone GPL’ed this nice (but not comprehensive) chart of two paths through the theory space—starting with a pair (thing, operation) [“magma”—sweet name, right?] and gradually adding more and more axioms until you get to a group.
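A sketch of that axiom-stacking in code, with addition mod 5 as the (thing, operation) pair (my example, not the chart’s):

```python
from itertools import product

# Start with a (thing, operation) pair and check the axioms one at a time.
S = range(5)
op = lambda x, y: (x + y) % 5

closed       = all(op(x, y) in S for x, y in product(S, S))          # magma
associative  = all(op(op(x, y), z) == op(x, op(y, z))
                   for x, y, z in product(S, S, S))                  # semigroup
has_identity = any(all(op(e, x) == x == op(x, e) for x in S)
                   for e in S)                                       # monoid (e = 0 here)
has_inverses = all(any(op(x, y) == 0 for y in S) for x in S)         # group

assert closed and associative and has_identity and has_inverses
```

Drop any one of the later checks and you slide back down the chart toward a bare magma.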

Mathematical words obtain everyday meaning—sometimes unexpected meaning—in applications. For example

• "angle" might mean "correlation" — the angle between two pulse-trains would be their correlation; and in recommendation engines the matrix “cosine distance” is a basic measure of similarity
• "multiplication" — well what if you want to multiply two functions together? You could convolve them. Convolution doesn’t seem very much at all the same action as 3×8 = three groups of eight. Neither do Photoshop blends seem like multiplication, but some of them are.
• "size" — well maybe I mean "how well the business did" on a slew of different metrics — in which case, are there 20 different conceptions of "size"? I guess so.
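Both of the first two reinterpretations above fit in a few lines (toy numbers made up):

```python
import math

# "angle" as correlation: the cosine between two rating vectors, recommendation-engine style.
def cosine(v, w):
    dot = sum(a * b for a, b in zip(v, w))
    norm = lambda u: math.sqrt(sum(a * a for a in u))
    return dot / (norm(v) * norm(w))

print(cosine([5, 3, 0, 1], [4, 2, 0, 1]))     # close to 1: similar tastes

# "multiplication" as convolution: nothing like three groups of eight, yet it is
# exactly the coefficient rule you use when multiplying polynomials.
def convolve(f, g):
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

assert convolve([1, 2], [1, 3]) == [1, 5, 6]  # (1+2x)(1+3x) = 1+5x+6x²
```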

Could you multiply two trees together? Could you define the angle between two natural numbers? The angle between two business models? Sure. If you know what you’re doing and why, you might even come up with a conclusion that makes sense. It all depends on (a) your ingenuity, (b) domain knowledge of the real-life situation, and (c) mathematical vocabulary.

Sometimes there is more than one interpretation that works with a given set. For example, {0,1} × {0,1} → {0,1} might be joined to operations that define “logical AND" and "logical OR”, or it might be interpreted just as on/off. Or it might be interpreted as the story of unrequited love.



All of that preface is meant to dislodge any notions you might have that ℝ² is somehow a “default” or “standard” paradigm. Sometimes number×number is an appropriate metaphor and sometimes not.

For example in the movie Rogue Trader, Nick Leeson’s boss is portrayed talking about “synergy” and “the information curve”. “Nick has positioned himself right there on the information curve!” It’s a parody and nobody seems to know quite what “the information curve” is (what’s on the axes? why is it curved?) but because Nick appears to be earning 70% of Barings’ profits, nobody questions the information curve.

Your typical crappy airport “business advice” books—Thomas Friedman kind of crap—will throw around 2-D charts that make no sense as well. Please leave some pics in the comments if you know what I’m talking about and examples come to mind. Here are a few dubious 2-D metaphors:

The “political compass” labels reduce the complexity of the world in particular ways that suit the rhetorical aims of these libertarian authors. For example projecting totalitarianism and populism into the same neighbourhood when one could just as well project them onto opposite ends of some other spectrum.

Here are some dubious scales—where either order, linearity, or 1-dimensionality is suspect.

(Remember: {“heroic”, “pragmatic”, “circumspect”, “brazen”} also comprises or belongs to a scale—in the ggplot sense of the word as well as other senses.)

Wow! You mean that losses are bad and earnings are good? That is some insightful business insight.

Crappy reductions needn’t be 2-D. The MBTI is a crappy reduction of personality in 4-D. And here are some in 1-D and other-D:

I like how step 5 leads to step 2. This should be a list rather than a flow.

Order, 1-dimensionality questionable.

Again, a list. This one has a heading. Apparently headings deserve 4 connecting wires whereas list items only deserve 3?

This is just a list of things. There is no “center” or “flow” or “order” or “cycle” relationship. Maybe “give them” and “get them” could have used a two-way arrow between them.

8-D and I just do not understand what these axis labels mean.

I actually spent hours finding the worst graphics evar. Not gonna tell you my google keywords though.



And, not to be critical all the time, here’s a 2-D metaphor that does work:



Stagepiece one: undermine the conceit that ℝ² is a default. Stagepiece two: cruddy graphics from various domains that force a metaphor that doesn’t really work. And now, the main act.

Today, I want to take aim at a highly suspect 2-D chart from the world of psychology:  the affect × intensity description of feelings.

Right away when I look at this, it seems like an overly limiting and not internally valid picture of emotional range. Like so many taxonomies, it gets deeply under my skin in a way that I can’t explain, except to shout: Bad theory! Bad theory!  I mean — how does it make sense to say

1. that each of these states is a point, as opposed to a spray or splotch or something else
2. that this precise “point” is the same for all individuals
3. that “delighted” is slightly to the left of “happy”, while “happy” is directly above “pleased”
4. that “sleepy” is to the right of “tired” instead of the other way around
5. that tired and sleepy are the same distance from each other as “pleased” and “glad”
6. WTF is “droopy”? It sounds like a word to be applied to a plant, not a person. I also don’t think it qualifies as an emotion. "Droopy" sounds like a word Good Housekeeping would use to shame a 1950’s American married woman for not being perky! happy! sexy! listening! rubbing his feet! when her husband returns home from work.
7. Are “sleepy” and “tense” actually moods or emotions? They sound like physical states.
8. All of these emotions are near the perimeter, but some are closer to the origin than others
9. sad minus gloomy = satisfied minus calm
All of those assumptions are implicit in the drawing.

Remember what I was outlining at first. In abstract mathematics and in deciding the shape of a theory, we shouldn’t assume anything that doesn’t have to be assumed to explain the results.

I could attack the valence-intensity model in at least two ways.

1. First would be to exclaim “But you didn’t justify any of that stuff! Linearity? Dimensionality? Order? You skipped it all! Where’s the justification?”
2. Second, perhaps a little stronger than merely asking for backup, would be to point out flaws. For example if I could find a counterexample showing that emotional states don’t have magnitude, can’t be added, don’t break down on dimensions, or aren’t linear across dimensions.
The easiest critique of type [2] I could think of is to question the existence of a “zero-point” emotion. It might be possible to have low-or-zero activation of an emotion on the intensity axis, but on the valence axis? Could I have high intensity of zero valence? What about high intensity in the negative direction at zero valence? It doesn’t make sense.

I came up with a list—several years ago—of different feelings which all could contend for “emotional zero”.

• neutral
• feel blank
• both happy and sad (bittersweet)
• not sure
• ambivalent
• "I feel nothing"
• kinda sorta
• middling

That’s just feelings we have the words for. There are lots of nameless emotions (or emotional superpositions) that could contend for the neutral canvas — the origin from which all other emotions are measured.

The fact that so many clearly distinct feelings all contend for the “origin” made me think there is, in fact, no origin. But making the space affine (removing zero) doesn’t fix the problems I had begun to notice with the circumplex view of the emotional spectrum. I think we just have to think of the range of emotions as a totally different kind of space. I don’t know its topology; I do believe there should be some “activation level” (like a scalar) at least sometimes; I do believe that superpositions are possible.

http://isomorphismes.tumblr.com/post/4840897988/logic-emotion

[G]eometry and number[s]…are unified by the concept of a coordinate system, which allows one to convert geometric objects to numeric ones or vice versa. …

[O]ne can view the length |AB| of a line segment AB not as a number (which requires one to select a unit of length), but more abstractly as the equivalence class of all line segments that are congruent to AB.

With this perspective, |AB| no longer lies in the standard semigroup ℝ⁺, but in a more abstract semigroup (the space of line segments quotiented by congruence), with addition now defined geometrically (by concatenation of intervals) rather than numerically.

A unit of length can now be viewed as just one of many different isomorphisms Φ: ℒ → ℝ⁺ between ℒ and ℝ⁺, but one can abandon … units and just work with ℒ directly. Many statements in Euclidean geometry … can be phrased in this manner.

(Indeed, this is basically how the ancient Greeks…viewed geometry, though of course without the assistance of such modern terminology as “semigroup” or “bilinear”.)
Terence Tao

(Source: terrytao.wordpress.com)

Measure: Sizing up the Continuum

For those not in the know, here’s what mathematicians mean by the word “measurable”:

1. The problem of measure is to assign an ℝ size ≥ 0 to a set. (The points not necessarily contiguous.) In other words, to answer the question:
How big is that?
2. Why is this hard? Well just think about the problem of sizing up a contiguous ℝ subinterval between 0 and 1.

• It’s obvious that [.4, .6] is .2 long and that
• [0, .8] has a length of .8.
• I don’t know what the length of [¼√2, √π/3] is but … it should be easy enough to figure out.
• But real numbers can go on forever: .2816209287162381682365...1828361...1984...77280278254....
• Most of them (the transcendentals) we don’t even have words or notation for.
• So there are a potentially infinite number of digits in each of these real numbers — which is essentially why the real numbers are so f#cked up — and therefore ∃ an infinitely infinite number of numbers just between 0% and 100%.

Yeah, I said infinitely infinite, and I meant that. More real numbers exist in-between .999999999999999999999999 and 1 than there are atoms in the universe. There are more real numbers just in that teensy sub-interval than there are integers (and there are infinitely many integers).

In other words, if you filled a set with all of the things between .99999999999999999999 and 1, there would be infinity things inside. And not a nice, tame infinity either. This infinity is an infinity that just snorted a football helmet filled with coke, punched a stripper, and is now running around in the streets wearing her golden sparkly thong and brandishing a chainsaw:

Talking still of that particular infinity: in a set-theoretic continuum sense, ∃ infinite number of points between Barcelona and Vladivostok, but also an infinite number of points between my toe and my nose. Well, now the simple and obvious has become not very clear at all!

So it’s a problem of infinities, a problem of sets, and a problem of the continuum being such an infernal taskmaster that it took until the 20th century for mathematicians to whip-crack the real numbers into shape.
3. If you can define “size” on the [0,1] interval, you can define it on the [−535,19^19] interval as well, by extension.

If you can’t even define “size” on the [0,1] interval — how do you think you’re going to define it on all of ℝ? Punk.
4. A reasonable definition of “size” (measure) should work for non-contiguous subsets of ℝ such as “just the rational numbers” or “all solutions to cos² x = 0” (they’re not next to each other) as well.

Just another problem to add to the heap.
5. Nevertheless, the monstrosity has more-or-less been tamed. Epsilons, deltas, open sets, Dedekind cuts, Cauchy sequences, well-orderings, and metric spaces had to be invented in order to bazooka the beast into submission, but mostly-satisfactory answers have now been obtained.

It just takes a sequence of 4-5 university-level maths classes to get to those mostly-satisfactory answers.

One is reminded of the hypermathematicians from The Hitchhiker’s Guide to the Galaxy who time-warp themselves through several lives of study before they begin their real work.
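One taste of how the whip-cracking works, on the “just the rational numbers” problem from above: the rationals are countable, so you can cover the i-th one with an interval of length ε/2ⁱ and the whole cover still totals less than ε — for any ε you like. A sketch of the bookkeeping:

```python
# Cover the i-th rational (in some fixed countable listing) with an interval of
# length eps / 2**i. The total cover length is a geometric series summing to
# less than eps, so the rationals' measure is squeezed below every positive
# number: it is 0.
eps = 0.001
total = sum(eps / 2 ** i for i in range(1, 51))

assert total < eps            # the whole infinite set hides under any eps
print(total)                  # just barely under 0.001
```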

For a readable summary of the reasoning & results of Henri Lebesgue's measure theory, I recommend this 4-page PDF by G.H. Meisters. (NB: His weird ∁ symbol means complement.)

That doesn’t cover the measurement of probability spaces, functional spaces, or even more abstract spaces. But I don’t have an equally great reference for those.

Oh, I forgot to say: why does anyone care about measurability? Measure theory is just a highly technical prerequisite to true understanding of a lot of cool subjects — like complexity, signal processing, functional analysis, Wiener processes, dynamical systems, Sobolev spaces, and other interesting and relevant such stuff.

It’s hard to do very much mathematics with those sorts of things if you can’t even say how big they are.

Angle = Volume

This is trippy, and profound.

The determinant — which tells you the change in size after a matrix transformation 𝓜 — is just an Instance of the Alternating Multilinear Map.

(Alternating meaning it goes + − + − + − + − ……. Multilinear meaning linear in every term, ceteris paribus:

$$a \, f(\cdots \blacksquare \cdots) + b \, f(\cdots \blacksquare \cdots) = f(\cdots \, a\,\blacksquare + b\,\blacksquare \, \cdots)$$

• f is the multilinear mapping
• a, b ∈ the underlying number corpus 𝕂
• the above holds for any term ■ (if done one-at-a-time) )



Now we trip. The inner product — which tells you the “angle” between 2 things, in a super abstract sense — is also an instantiation of the Alternating Multilinear Map.

In conclusion, mathematics proves that Size is the same kind of thing as Angle.

Say whaaaaaat? I’m going to go get high now and watch Koyaanisqatsi.
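If you’d rather poke the determinant’s alternating-multilinear-ness yourself before getting high, a small numpy sketch:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(4, 4))

# Alternating: swapping two rows flips the sign (+ − + − …)
B = A[[1, 0, 2, 3]]
assert np.isclose(np.linalg.det(B), -np.linalg.det(A))

# Multilinear: linear in each row, ceteris paribus
a, b = 2.0, -3.0
v, w = rng.normal(size=4), rng.normal(size=4)
Av, Aw, Avw = A.copy(), A.copy(), A.copy()
Av[2], Aw[2], Avw[2] = v, w, a * v + b * w
assert np.isclose(a * np.linalg.det(Av) + b * np.linalg.det(Aw),
                  np.linalg.det(Avw))
```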

Unmeasurable Distances

I wrote earlier about the many different ways to measure distance. One way I didn’t include is unmeasurable distance.

Sometimes A is

• tastier,
• sexier,
• cooler,
• more interesting,
• or otherwise better endowed

than B … but it’s impossible to quantify by how much. No problem; just say that A≻B but that |A−B| is undefined.

It’s still the case that if A is sexier than B and B is sexier than C, it must follow that A is sexier than C.

Symbolically: A≻B & B≻C ⇒ A≻C.

This concept opens up many parts of human experience to the mathematical imagination.

I will also express my view on moral rates of income tax using orderings ≻.

Oh, and if you’re into this kind of thing: using orders instead of measurable quantities kind of saved the economic concept of “utility”. Kind of saved it. At least instead of talking about 174.27819 hedons, nowadays you can just say X is lexicographically preferred to Y. Ordinal utility instead of cardinal utility.