Posts tagged with time series

World record progression for the men’s Long Jump.

A jump process.

PUN FULLY INTENDED.

(Source: Wikipedia)


## Subtraction Is Crazy

I was re-reading Michael Murray’s explanation of cointegration:

and marvelling at the calculus.

Of course it’s not just any subtraction. It’s subtracting a function from a shifted version of itself. Still doesn’t sound like a universal revolution.

(But of course the observation that the lagged first-difference will be zero around an extremum (turning point), along with symbolic formulæ for (infinitesimal) first-differences of a function, made a decent splash.)

$f^\prime \equiv \lim_{\mathrm{lag} \downarrow 0} \frac{\mathrm{lag}(f) - f}{|\mathrm{lag}|}$

Jeff Ryan wrote some R functions that make it easy to first-difference financial time series.

Here’s how to do the first differences of Goldman Sachs’ share price:

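require(quantmod)
getSymbols("GS")          # downloads the data into an xts object named GS
gs <- Cl(GS)              # keep just the closing prices
plot(  gs - lag(gs)  )    # today's close minus the previous close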


Look how much more structured the result is! Now all of the numbers are within a fairly narrow band. With length(gs) I found 1570 observations. For comparison, here are 1570 random normals, drawn with plot(rnorm(1570, sd=10), type="l"):

Not perfectly similar, but very close!
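As a point of reference, here is a minimal sketch of the same subtraction applied to a simulated random walk instead of real prices. The series is made up (Gaussian steps with sd=10, chosen only to mirror the rnorm() comparison), and base R’s diff() stands in for gs - lag(gs).

```r
# Simulated stand-in for a price series: a pure Gaussian random walk.
# (Assumption: 1570 steps with sd = 10, mirroring the rnorm() comparison.)
set.seed(42)
n     <- 1570
steps <- rnorm(n, sd = 10)
walk  <- cumsum(steps)          # the "price" path: running sum of the steps

# First difference: walk[t] - walk[t-1]. For a random walk this
# recovers the original Gaussian steps (all but the first one).
first_diff <- diff(walk)

# Top panel wanders; bottom panel stays in a narrow band.
op <- par(mfrow = c(2, 1))
plot(walk, type = "l", main = "Random walk")
plot(first_diff, type = "l", main = "Its first differences")
par(op)
```

Whatever structure is left in the differenced GS series beyond this baseline (the bursts of volatility, for instance) is what separates it from a random walk.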

Looking at the first differences next to a Gaussian brings out what’s different between public equity markets and a random walk. What sticks out to me is the vol leaping up aperiodically in the $GS time series. I think I got even a little closer by drawing the stdevs from a Poisson distribution, plot(rnorm(1570, sd=rpois(1570, lambda=5)), type="l"), but I’ll end there with the graphical futzing.

What’s really amazing to me is how much difference a subtraction makes.

## What’s the difference between POSIXct and POSIXlt?

POSIXct is the signed number of seconds since “the epoch” (midnight UTC on 1 January 1970). For example it was

• 1351298112 UTC (GMT)

when I wrote this. (1351298112 UTC = Sat Oct 27, 12:35 am GMT = Fri Oct 26, 8:35 pm EDT = Fri Oct 26, 5:35 pm PDT = 2:35 pm HST)

POSIXlt is a named list of calendar components (seconds, minutes, hours, day of month, month, year, …), which is much closer to the many text | character | string formats people actually write, such as:

• May 17, 2017
• 17/5/2017
• 17-5-17 05:24:39

(Source: stat.ethz.ch)

#### “You can keep blaming your parents for your life in your 20’s, but by the time you’re 30 it’s your own fault.”

(I’m having a difficult time getting an original source on this quote.)

This is like unknotting an autoregressive term in a time series. Even if the past only has a hold on the present back to 5 years ago, your upbringing still influences you when you’re 70. Because

• who you were at 15 influences who you were at 20 (ρ¹),
• which in turn influences who you were at 25 (ρ²),
• and so on until 11 half-decades later there’s a ρ¹¹ echo of your 15-year-old self whose apprehension at the way she looked (or rather didn’t look) rumbles faintly, faintly, faintly, faintly, faintly, faintly, faintly, faintly, faintly, faintly, faintly through time: the decisions then affected the next decisions which altered the next decisions … on and on to the present.

$\mathrm{AR[1]:} \quad x_t = \rho \cdot x_{t-1} + \text{external forcing}$

If −1<ρ<1, then the rumble of the initial spike’s thunder diminishes geometrically over time.
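The geometric fade can be checked with a few lines of R. This is a minimal sketch under simplifying assumptions: a single unit shock at step 0, no further external forcing, and a helper name (ar1_echo) made up for the illustration.

```r
# Echo of a single unit shock through an AR(1) process,
# x_t = rho * x_{t-1}, with the external forcing switched off.
# (ar1_echo is a hypothetical helper, not from any package.)
ar1_echo <- function(rho, steps) {
  x <- numeric(steps + 1)
  x[1] <- 1                                  # the initial spike
  for (t in seq_len(steps)) {
    x[t + 1] <- rho * x[t]                   # each step keeps a fraction rho
  }
  x
}

echo_half <- ar1_echo(0.5, 11)   # eleven half-decades at rho = 1/2
echo_nine <- ar1_echo(0.9, 11)   # eleven half-decades at rho = 0.9

# Last entry of each vector is what remains after eleven steps: rho^11.
signif(echo_half[12], 2)
signif(echo_nine[12], 3)
```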
So a ρ=½ only shivers .00049 eleven knots into the future, and even a ρ=.9 recedes to a .314 by the time it’s that deep in the past.

Maybe I can spot a corollary to the new parents’ dilemma as well. If the present choices are always framed by the habits formed in the past, then ε perturbations in the baby’s care echo forward, and forward, and forward…and can they really be undone?

@IgorCarron blogs recent applications of compressive sensing and matrix factorisation every week. (Compressive sensing solves underdetermined systems of equations, for example trying to fill in missing data, by L₁-norm minimisation.) This week: reverse-engineering biochemical pathways and complex systems analysis.

## Financial Time Series

One of the negative reviews of my DIY MFE piece said the following:

“Financial markets are nothing more than an infinite time-series”

Of course, it was a recent grad saying this. I would like to respond with a parable.

Once upon a time there were two securities analysts, Gemma and Yu Fen. Each was in her office analysing data, making phone calls, and trying to figure out what derivatives she should long and short to get the proper exposure on the same security, XXX, when it made its next big move.

On Friday at work, Gemma received a package with a book she had ordered from Cambridge Press: Nonlinear Filters in the Analysis of Financial Time Series. She had a dinner date, but decided halfway through that the guy was annoying, laid down an embarrassing sum of cash, and bailed to meet up with her friends. It was a great evening out and when she woke up Saturday morning, Gemma started the book.

Yu Fen had plans in Paris for the weekend with her girlfriend (they have an apartment in the Troisième). They also went out Friday night and, as often happens at expensive bars, a rich, old guy started buying them both drinks.
Since everybody in this story is totally square and corporate, the conversation quickly turned to what they all do for a living, and Richard (the rich guy) seemed fascinated by everything that Yu Fen said about her analysis of the XXX security.

Richard, Yu Fen, and many others got hammered at the expensive bar that night. During the course of their hanging out, Richard let it slip that he ran a hedge fund, and that he was planning to take out a massive short on XXX as soon as it passed 571.91. Richard opined that the fundamentals of XXX weren’t actually sound enough to support a price of more than 400. His analysis is boring so I won’t repeat it but Yu Fen was interested. She understood and agreed with all of his points. Yu Fen is a good enough judge of character (and of drunks) that she knew he wasn’t putting her on; some combination of pretty girl, good conversation, and expensive Scotch had led him to divulge his real position.

After that night, the weekend finished pleasantly but without anything else financially relevant occurring.

Back in London, Gemma wrote some code snippets to test some of the most interesting nonlinear filters from her book. She finished two-thirds of the book between Saturday and Sunday reading. The concepts were actually a lot simpler than she had foreseen…but then Gemma has a Ph.D. in a related area.

The next week, Gemma worked her new filters into the existing code infrastructure and started analysing XXX with the new set of tools. She found that a few of the methods from the book, when applied judiciously on the right parts of the past data, transformed the signal in such a way as to shed light on one of the crucial questions she had had about the implied volatility surface.

Yu Fen kept her eye on XXX at the same time and also felt like she had new insight. She had gotten Richard’s business card and called his office, pretending to be a deep-pocketed potential investor and fishing for information about the firm’s position on XXX.
She also spent the week making phone calls to check out the fundamental weaknesses in XXX that Richard had delineated. It was difficult to winnow the disinformation out of what she was told, but bullsh*t-detection is one of Yu Fen’s strong suits. Richard’s viewpoint basically checked out, and Yu Fen even found out where the pockets of false support for XXX were and what price they would drop out at. Yu Fen couldn’t convince her desk to give her all of the leverage she wanted, but she loaded up on some disgusting immediate short positions against XXX and even liquidated some of her other positions early to get more attack power on XXX.

Meanwhile, trading volumes in XXX were growing. Its unflagging ascent had heretofore embarrassed sharp analysts and confounded great traders. Increasing numbers of news articles called a “bubble” in XXX, but for over three years now the bubble had not popped.

Gemma, of course, wasn’t naive enough to merely take positions on whether a security would go up or down. She mainly modelled probability distributions of several of the Greeks. Gemma would update the probability distributions of her estimates as new data came in. Her signals were then based on estimates of higher moments of these distributions (the robustness of which had been improved by her weekend reading). Gemma made some important tweaks to her portfolio to reflect the statistical picture she had gained from answering some questions with the new nonlinear filters, in particular saving money by pulling out of a few hedges that she had over-secured herself with.

The next week, on Tuesday at around 1pm, XXX started testing 572. It had been climbing at a rate of roughly 8 points/month with a weekly standard deviation of 15, and on Tuesday it displayed slightly unusual behavior, lagging to 550 at the market’s close. But over the next two days, XXX’s vertiginous drop to 508 surprised nearly everyone who cared about XXX.
Those who traded security XXX, or who had invested a significant chunk of their portfolios in XXX, were generally shocked and panicked. At 530, news wires warned of a speculative attack, or tried to point out causal factors, or took analyst quotes on the situation. Investors missed their kids’ soccer games, came home late, and ran their fingers through their hair as they sought frantically to figure out whether to flee, hedge, hold fast, or double down on XXX.

Yu Fen, Richard, and a few others were among the few who knew where XXX’s final equilibrium price should be: somewhere in the 380-420 range. Yu Fen had to adjust the timing of her shorts as XXX went down, since it was never clear how much selling pressure it would take to kill the synthetic rallies that each of the players she had investigated tried to mount. Along with the money pumping up XXX’s price at each of the moves was a small flurry of news stories questioning which way XXX would move next. Some were penned or prompted by those fighting XXX’s decline. But Yu Fen trusted her original analysis, trimmed her sails, and rode XXX all the way to 365 before yanking off her shorts.

Gemma, meanwhile, had her positions ravaged. Richard’s firm’s attack on security XXX fundamentally altered the market participants’ perceptions and analysis of XXX. All of Gemma’s higher moments had been estimated using data from the old regime. When the regime shifted, she kept feeding the new data into her prior, but the model took too long to shift its recommendations.

At year’s end, Yu Fen only took home four times the bonus that the rest of her desk did, and Gemma didn’t get in much trouble because the move in XXX had been so unprecedented that nobody could have seen it coming.
Nevertheless Yu Fen got a big head out of it and started being resented by her coworkers, while Gemma felt discouraged because of the XXX blunder and a number of other issues, and the next year started floating her resume to business schools looking to expand their quant staff.

Moral: Financial data does come as a time series, but future moves can’t necessarily be predicted by time series analysis. Price(AAPL, pre-iPod) is drawn from a different distribution than Price(AAPL, post-iPod), and so on.

And also: A given market isn’t a 1-D time series (price). It’s two (bid & ask) 2-D time series (price & volume), … and if you count different types of orders (stops and limits), it’s more like six or eight 2-D time series that are all interconnected.

So there. /snark

GARCH stands for Generalized Autoregressive Conditional Heteroskedasticity. To translate:

• Skedasticity refers to the volatility or wiggle of a time series.
• Heteroskedastic means that the wiggle itself tends to wiggle.
• Conditional means the wiggle of the wiggle depends on something else.
• Autoregressive means that the wiggle of the wiggle depends on its own past wiggle.
• Generalized means that the wiggle of the wiggle can depend on its own past wiggle in all kinds of wiggledy ways.

“Contrary to the recently popular “behavioral” approach which proposes to take advantage of economic “irrationality”, I suggest that value-added comes from creating investments with more [psychologically] attractive risk-sharing characteristics.”

Andrew Lo & Craig MacKinlay, A Non-Random Walk Down Wall Street

“Examples of time series include the water level of the Nile river in Egypt, sun spots, the stock market close price for IBM, a digital recording of a Jazz pianist playing at a night club (in this case the time component of the time series is determined by the sampling rate) and the neutron emission of a radioactive element as the element decays.”
http://www.bearcave.com/misl/misl_tech/wavelets/forecast/index.html

Also neuronal spike-trains, historical GDP of a country, seismic data readings, electrocardiograms, network data logs, histories of corporate earnings, ice cores, and tree ring data.

That’s functional in the sense that the data of interest forms a mathematical function or curve, not in the sense that flats are functional and high heels are not.

$f: \{ \mathrm{space} \} \to \{ \mathrm{another\ space} \}$

So say you’re dealing with like a bit of handwriting, or a dinosaur footprint [x(h), y(h)], or a financial time series \$(t), or a weather time series [long vector], or a bunch of electrodes all over someone’s brain [short vector], or measuring several points on an athlete’s body to see how they sync up [short vector]. That is not point data. It’s a time series, or a “space series”, or both.

Techniques include:

• principal components analysis on the Fourier components
• landmark registration
• using derivatives or differences
• fitting splines
• smoothing and penalties for over-smoothing
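
Two of the techniques above (fitting splines with a smoothing penalty, and using derivatives) can be sketched in base R. This is only an illustration on made-up data, a noisy sine curve, using stats::smooth.spline() with its default penalty selection; it is not taken from the book’s examples.

```r
# Made-up functional data: one noisy curve observed at 200 time points.
set.seed(1)
t_obs <- seq(0, 2 * pi, length.out = 200)
truth <- sin(t_obs)
y_obs <- truth + rnorm(length(t_obs), sd = 0.1)

# Fit a cubic smoothing spline. The roughness penalty is chosen
# automatically (generalised cross-validation is the default).
fit <- smooth.spline(t_obs, y_obs)

# Evaluate the smoothed curve and its first derivative on the same grid.
smooth_curve <- predict(fit, t_obs)$y
velocity     <- predict(fit, t_obs, deriv = 1)$y   # should resemble cos(t)

# Grey points: raw data. Solid line: smoothed curve. Dashed: derivative.
plot(t_obs, y_obs, col = "grey", xlab = "t", ylab = "x(t)")
lines(t_obs, smooth_curve, lwd = 2)
lines(t_obs, velocity, lty = 2)
```

The point of treating the data as a curve is that the derivative comes from the fitted function, not from differencing the noisy observations directly.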

The problem you’re always trying to solve is the “big p, small n problem”.  Lots of causes (p) and not enough data (n) to resolve them precisely.

You can see all of their examples, with code, at http://www.springerlink.com/content/978-0-387-95414-1.
