Posts tagged with high dimensional spaces

"There is more difference within the sexes than between them."
‒Ivy Compton-Burnett, Mother and Son

"In all of human biology, there is no greater difference than of that between men and women."
—Some biology notes I found online

These two statements sound like rhetorical opposites, but in fact both are true.

(Says me. I can’t prove this, but I bet that taking everything into consideration, divisions between men & women are greater than those between liberals & conservatives, blacks & non-blacks, tall & short, sick & well, D&D players and people who get laid, etc.)

Let me show how both statements can hold at the same time.

It’s just like how most men are slower than female Olympians, yet the average man is faster than the average woman.

[Figure: who is faster, men or women? (NB: Not real data.)]
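To make the overlap concrete, here is a minimal sketch with made-up numbers (again, not real data). The means, spreads, and the "fastest 0.1% of women stand in for Olympians" cutoff are all assumptions chosen for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical 100 m times in seconds (lower is faster). NOT real data.
men = [random.gauss(15.0, 2.0) for _ in range(100_000)]
women = [random.gauss(16.0, 2.0) for _ in range(100_000)]

mean_men = statistics.fmean(men)
mean_women = statistics.fmean(women)

# Assumption: the fastest 0.1% of women stand in for "female Olympians".
elite_cutoff = sorted(women)[len(women) // 1000]

# Share of men whose time is slower than that elite cutoff.
share_men_slower = sum(t > elite_cutoff for t in men) / len(men)

print(f"mean time, men:   {mean_men:.2f} s")
print(f"mean time, women: {mean_women:.2f} s")
print(f"elite women's cutoff: {elite_cutoff:.2f} s")
print(f"share of men slower than elite women: {share_men_slower:.1%}")
```

Under these assumptions the average man is faster than the average woman, and nearly all men are still slower than the elite women.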

Thanks to Stats in the Wild, here’s some real data on the ages of Olympians that makes the point:

[Figure: age distribution of recent Olympians, by sport and by sex]
[Figure: age distribution of (historical) Olympians by gender]

Measurement

Even when a difference is statistically significant (such as: “boys sprint faster than girls”), its magnitude may be so small that the difference, while indisputable, is also unimportant. (“Statistical significance” is a confusing term in this respect: it says the difference is unlikely to be chance, not that it is large.)
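A rough sketch of that gap between "significant" and "large", using a two-sample z statistic and Cohen's d. The function name and every number passed to it are hypothetical, picked only to show the effect of a big sample.

```python
import math

def z_p_and_effect_size(mean_a, mean_b, sd, n):
    """Two-sample z statistic, two-sided p-value, and Cohen's d.

    Assumes two groups of equal size n that share the same standard deviation.
    """
    standard_error = sd * math.sqrt(2.0 / n)
    z = (mean_a - mean_b) / standard_error
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    d = (mean_a - mean_b) / sd              # Cohen's d: the gap in sd units
    return z, p, d

# Hypothetical test scores: a half-point gap on a scale with sd = 15.
z, p, d = z_p_and_effect_size(mean_a=100.5, mean_b=100.0, sd=15.0, n=100_000)

print(f"z = {z:.2f}, p = {p:.2e}, Cohen's d = {d:.3f}")
```

With a hundred thousand people per group the p-value is tiny, yet the gap is only about a thirtieth of a standard deviation.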

Consider that there are many ways you could measure differences among people. Here are some that come up frequently in the gender wars:

  • height, weight, curvature, angle of femur
  • IQ, SAT scores, reading tests
  • speed, throwing distance, fine motor skills
  • communication skills, emotional intelligence
  • went to college, profession is engineer
  • finding things in the refrigerator, ability to focus, ability to multitask

There are many ways to measure each of these “dimensions”. For example, does "speed" mean in the 100m dash, 200m dash, marathon, trail running, bike race, or triathlon? The answers would be correlated with one another, but not perfectly so.

A billion points in a million-dimensional space

Now you are faced with 6.7 billion points (one per person) in an N-dimensional space, where N is the number of things you could measure. In round numbers, say a billion points in a million-dimensional space. (Some dimensions may be collinear.)

[Figure: differences between men and women in a billion-dimensional space]

On the one hand, there are always lots of pink and blue dots mixing in with each other (e.g. men who sew better than most women). And, per Ivy’s point, the typical distance between two pinks (variation among men) is greater than the distance from the pink centroid to the blue centroid (variation between men and women).
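Here is a toy version of that comparison, scaled way down from a million dimensions. It assumes every trait is an independent unit-variance Gaussian and that the two groups differ by the same small shift on every trait; `DIMS`, `N_PER_GROUP`, and `SHIFT` are made-up values, not estimates of anything.

```python
import math
import random

random.seed(1)  # fixed seed so the sketch is reproducible

DIMS = 200          # hypothetical number of measured traits
N_PER_GROUP = 300   # hypothetical number of people per group
SHIFT = 0.3         # assumed gap between group means on every trait, in sd units

def sample_group(mean, n):
    """Draw n points whose coordinates are independent Gaussians around `mean`."""
    return [[random.gauss(mean, 1.0) for _ in range(DIMS)] for _ in range(n)]

def centroid(points):
    """Coordinate-wise average of a list of points."""
    return [sum(col) / len(points) for col in zip(*points)]

def mean_pairwise_distance(points, n_pairs=2000):
    """Average Euclidean distance over randomly chosen pairs from one group."""
    total = 0.0
    for _ in range(n_pairs):
        a, b = random.sample(points, 2)
        total += math.dist(a, b)
    return total / n_pairs

group_a = sample_group(0.0, N_PER_GROUP)
group_b = sample_group(SHIFT, N_PER_GROUP)

within_a = mean_pairwise_distance(group_a)
between_centroids = math.dist(centroid(group_a), centroid(group_b))

print(f"typical distance between two members of one group: {within_a:.1f}")
print(f"distance between the two group centroids:          {between_centroids:.1f}")
```

In this setup two members of the same group sit several times farther apart than the two centroids do.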

At the same time, though, if you had to choose just one factor by which to color these dots and get maximal classification power, it would have to be gender.

In other words, gender differences may generate a maximally separating hyperplane, but Euclidean distances between differently-gendered points are often small, and Euclidean distances between same-gendered points are often large.
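A small sketch of the hyperplane half of the claim, under the same toy assumptions (independent unit-variance traits, the same small shift on every one). It uses a nearest-centroid hyperplane rather than a true maximum-margin one, simply because that needs no libraries; all constants are hypothetical.

```python
import random

random.seed(2)  # fixed seed so the sketch is reproducible

DIMS = 200          # hypothetical number of measured traits
N_PER_GROUP = 400   # hypothetical number of people per group
SHIFT = 0.3         # assumed gap between group means on every trait, in sd units

def sample_group(mean, n):
    """Draw n points whose coordinates are independent Gaussians around `mean`."""
    return [[random.gauss(mean, 1.0) for _ in range(DIMS)] for _ in range(n)]

def centroid(points):
    """Coordinate-wise average of a list of points."""
    return [sum(col) / len(points) for col in zip(*points)]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

group_a = sample_group(0.0, N_PER_GROUP)
group_b = sample_group(SHIFT, N_PER_GROUP)

# Fit the hyperplane on one half of each group, score it on the other half.
half = N_PER_GROUP // 2
train_a, test_a = group_a[:half], group_a[half:]
train_b, test_b = group_b[:half], group_b[half:]

c_a, c_b = centroid(train_a), centroid(train_b)
normal = [b - a for a, b in zip(c_a, c_b)]          # points from centroid A to centroid B
midpoint = [(a + b) / 2 for a, b in zip(c_a, c_b)]  # the hyperplane passes through here
threshold = dot(normal, midpoint)

def predict_b(point):
    """True if the point falls on group B's side of the hyperplane."""
    return dot(normal, point) > threshold

correct = sum(not predict_b(p) for p in test_a) + sum(predict_b(p) for p in test_b)
accuracy = correct / (len(test_a) + len(test_b))

print(f"held-out accuracy of the centroid hyperplane: {accuracy:.1%}")
```

Each trait alone separates the groups poorly, but the small shifts add up along the direction joining the centroids, so one hyperplane sorts most of the held-out points correctly even though the clouds overlap on every single axis.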