Conversation topics on Facebook by age.
Posts tagged with data visualisation
Eric Fischer cross-referenced geolocated Tweets from across the world with data on known transport nodes. He then created what are in effect transit cartograms, with the thickness of a road or other mass transport line corresponding to the volume of Tweets sent along its path.
Aggregated by John Burn-Murdoch on The Guardian; it came to me through @vruba and reading.am.
The classic red/green colouring scheme for trading screens seems too alarmist.
Conceptually, the red/green distinction makes sense as corresponding to stop/go in traffic signals. But traffic signals need to be neon and striking in a hectic 3-D environment where it’s paramount for everyone to definitely not-miss the
But in a sheltered 2-D environment where goals commonly include to master emotion, to control passive reactivity, to keep a long-term head in the middle of short-term volatility, and to digest (calmly) massive amounts of information en simultáneo, neon red/green seems too grating.
I made the above picture with R, of course, like this:
> require(quantmod)
> getSymbols("^GVZ")
> chartSeries(GVZ)
> reChart(up.col="light blue", dn.col="yellow")
(GVZ is the gold volatility index.)
It’s not a perfect colour scheme (I would use Lab to do better), but it already improves on
If we take that as a starting point, a less alarmist colour scheme for trading software could use the blue/yellow dichotomy to indicate whether a security price went up or down. Use a neutral chroma for “small” moves (this depends upon one’s time-frame, but properly the definition of “big move” should be calibrated to an exponential moving average with some width depending on one’s market telescope). Intensity of the move could be signalled with lightness, so that most figures on a screen are a readable lightness of a neutral colour, but “big moves” are tinged with convexly more chroma and very-convexly more lightness.
The definition of “up/down” might be refigured as whether the trader is short/long the security in question, or perhaps redness/greenness could be used in conjunction with the “market view” of cold/hot, to indicate whether a security is moving for/against one’s strategy. That too could be seen as overly alarming, but a (pseudo)convex coding of red-ness might again solve the problem, only invoking the “panic mode” when there’s really something to worry about.
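As a rough sketch of the blue/yellow idea (not a finished design), the mapping could look something like this in R. The function name move_colours, the width and small parameters, and the particular hue, chroma and lightness numbers are all assumptions made up for illustration. hcl() is base R's hue/chroma/luminance colour constructor, used here as a convenient stand-in for Lab.

```r
# Hypothetical helper: turn a series of price moves into calm blue/yellow colours.
# `width` is the EMA window; `small` is how many "typical moves" still count as small.
move_colours <- function(returns, width = 20, small = 1) {
  n <- length(returns)
  alpha <- 2 / (width + 1)                  # usual EMA smoothing weight
  typical <- numeric(n)                     # EMA of past absolute moves
  typical[1] <- abs(returns[1])
  for (i in seq_len(n)[-1]) {
    typical[i] <- alpha * abs(returns[i - 1]) + (1 - alpha) * typical[i - 1]
  }
  size <- abs(returns) / pmax(typical, 1e-12)   # move size relative to the yardstick
  excess <- pmax(size - small, 0)               # zero for "small" moves
  chroma <- pmin(10 + 15 * excess^2, 80)        # convex in the excess, capped
  lightness <- pmin(55 + 4 * excess^3, 90)      # more convex still, capped
  hue <- ifelse(returns >= 0, 240, 90)          # up = blue-ish, down = yellow-ish
  hcl(h = hue, c = chroma, l = lightness)
}

# Three quiet moves and one big one (illustrative numbers, not market data).
move_colours(c(0.001, -0.001, 0.001, 0.05))
```

Small moves all come out as the same near-neutral tone, and only the last one gets a strong tint. The caps on chroma and lightness are there so that one extreme move cannot wash out the whole screen.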
I’ll note some of the flaws for later reference in a longer piece I’m working on where I try to hit the highlights of numeracy / practical data literacy for non-statisticians.
Components of internet traffic 1995-2005
"the center of gravity of … media … is moving to a post-HTML environment,” we promised nearly a decade and half ago. The examples of the time were a bit silly — a “3-D furry-muckers VR space” and “headlines sent to a pager”…
I remember how in the late 90’s people would speculate that everyone would become a co-creator (in fact big-money books were written to this theme). But maybe the lesson is that there are a relatively small number of passionate artists and artisans trying to get the word out about their stuff, and well-organised corps are very good at getting us to pay attention to certain art and not other art, although “viral” is a fairly chaotic Wild West, certainly more so than three-channel broadcast. The “peer-to-peer” category I interpret as people trading albums and movies by the top artists. The picture doesn’t go up to 2010 but I think big corps have made inroads into the fuchsia video band by now.
One more mathematical observation about this chart: the total amount of traffic obviously exploded during 1995-2005 but we see a constant height on the graph. So that’s like “modulo size changes” aka the familiar
If you’re using R for the first time you may have looked at ?plot (2 page help file) or ?par (12 page help file) to figure out what’s going on. It’s overwhelming.
This document explains the parameters I always bother to set. That way you can get decent plots without reading every
(If you are just using R for the very first time and need some data, type data(pima) to load an interesting pre-cleaned data set; pima ships with the faraway package, so run library(faraway) first. Then do plot(faithful) to see how the base::plot function works. Type ??pima if you can’t find the dataset.)
> plot(faithful, pch=20, col=rgb(.1,.1,.1,.5), cex=.6)
Firstly: what is par? When you type par( lwd=3, col="#333333", yaxt="n" ), it will open an empty box that will hold your next plot( dnorm, -3, 3). You can run different plots in the box and as long as you don’t close it, the line-width will be 3 times bigger than default, the y-axis won’t have labels, and the colour will be dark-grey.
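To see that in action, here is a small sketch. Saving the value returned by par() in a variable called old is my addition, so that the defaults can be put back afterwards; the second curve (dlogis) is just an arbitrary extra example.

```r
# Set the defaults once; par() hands back the previous values.
old <- par(lwd = 3, col = "#333333", yaxt = "n")

plot(dnorm, -3, 3)    # thick dark-grey bell curve, no y-axis
plot(dlogis, -6, 6)   # a second plot in the same box picks up the same settings

par(old)              # restore whatever was set before
```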
There are a lot of plotting options. Here are the ones I use regularly:
cex = .8. Decreases the size of type or plotted points by 20%.
par(new=TRUE). Use this to plot two things on top of each other. Beware, the labels will overprint over each other too (but this doesn’t matter for quick, casual plots).
col = "red", col = "#333333". I think #333333 is the best default colour and I use red if a point or line needs to stand out.
col=rgb(.1,.1,.1,.5). This is another decent grey for overplotting. I used this in the Old Faithful plot at the top. The first three numbers are Red, Green, Blue and the fourth is alpha, i.e. opacity (0 is fully transparent, 1 is solid).
lwd = 3. This is a good line width, I think, especially with the dark grey
pch = 20. Plots points with a small circle. pch=19 is a slightly larger dot and pch=15 is a square. Read after the second group of bullets for more info.
png("name of the plot.png"). Then do plot(z), and remember to finish it off with dev.off(). [dev.off() means device off; the par() window and the png() file are considered “graphic devices”.]
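Put together, the options in this list might be used like so. This is only a sketch: the file name, the use of tempdir(), and the red lowess() trend line are choices made for the example.

```r
# Write the plot to a PNG file instead of the screen.
out <- file.path(tempdir(), "old faithful.png")   # hypothetical file name
png(out)

# Small semi-transparent grey dots, slightly shrunk.
plot(faithful, pch = 20, cex = .8, col = rgb(.1, .1, .1, .5))

# A thick red line for the thing that should stand out.
lines(lowess(faithful), lwd = 3, col = "red")

dev.off()   # close the PNG device so the file is actually written
```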
Here are the ones I use less regularly, but still more than weird stuff like
lend = "butt". Line ending is cut off square rather than rounded (the default). I use this before I make a histogram.
xlog=TRUE. “Hubble made this significantly worse chart before it was discovered that all data look like straight lines on log-log plots.” —Lawrence Krauss
las=1. If you want all of your axis labels to be printed horizontally.
mfrow=c(2,2). If you want to juxtapose four plots next to each other. Use mfrow and they will write like a typewriter, left-to-right and starting over on a new line after 2 spots have been filled in. Use mfcol=c(3,3) and they will fill in vertically. (Try it if what I said doesn’t make sense.)
yaxt="n". This suppresses printing the vertical axis labels. I do this when plotting a distribution because those vertical numbers aren’t meaningful.
main="It's a plot about nothing. Don't you get it? People _love_ nothing!". This is the title of the plot.
legend( "topright", legend=c("control", "placebo", "test group"), fill = c("black", "#333333", "red"), border="white", bty="n"). This is how I find legends look good. You should only need to change the placement, legend, and fill to make it work for your plot.
To plot multiple figures in the same picture do mfcol=c(3,2). Then the next six = three × two plots you run will go in left-to-right or up-to-down order, filling in six spots. Run par(mfrow=c(1,1)) after you’re done, to go back to one plot.
png("a plot about nothing.png"); plot( stuff ); dev.off(). The dev.off() tells the system to go back to normal (printing to the screen: PNG device off).
R community: how to get some sweet, sweet log-axis tickmarks. Read all about it.
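Here is a sketch that strings several of these less-regular options together on one page. The panel contents and the legend labels are invented for the example; only the parameters themselves come from the list above.

```r
# Four panels, filled row by row; par() returns the old settings for later.
old <- par(mfrow = c(2, 2), las = 1, lend = "butt")

hist(faithful$waiting, col = "#333333", border = "white", main = "waiting")
hist(faithful$eruptions, col = "#333333", border = "white", main = "eruptions")

# A distribution: the vertical numbers are not meaningful, so drop that axis.
plot(density(faithful$waiting), lwd = 3, col = "#333333", yaxt = "n",
     ylab = "", main = "density of waiting")

# Two curves plus a legend (labels here are made up for the example).
plot(dnorm, -4, 4, lwd = 3, col = "#333333", yaxt = "n", ylab = "",
     main = "two bell curves")
curve(dlogis, -4, 4, add = TRUE, lwd = 3, col = "red")
legend("topright", legend = c("normal", "logistic"),
       fill = c("#333333", "red"), border = "white", bty = "n")

par(old)   # back to one plot per picture
```

Swapping mfrow for mfcol in the first line fills the same four spots column by column instead.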
Most of these can be done inside of plot( dnorm, -3, 3, lwd = 3) or beforehand in a par(lwd=3); plot( dnorm, -3, 3). With par(new=TRUE) and par(mfrow=c(2,2)), though, you need to do them in a
If you forget what the colours or the pch shapes are, do this: plot( 1:25, pch=1:25, col=1:25 ). You’ll get this:
So basically, you only want pch=20 and sometimes pch=15, like I said.
One more thing you might like to learn is how to colour important data points red and normal ones grey. I’ll explain that another time.
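In the meantime, one common way to do it (a sketch, and not necessarily the approach a fuller explanation would take) is to hand col a whole vector of colours, one per point. The rule for "important" below, a wait of more than 90 minutes, is an arbitrary assumption.

```r
# Flag the points that matter (assumed rule: waits longer than 90 minutes).
important <- faithful$waiting > 90

# One colour per row: red for flagged points, dark grey for the rest.
cols <- ifelse(important, "red", "#333333")

plot(faithful, pch = 20, col = cols)
```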