Friday, October 16, 2009

Galleon Group Raj Rajaratnam's Insider Trades


There was a pretty big insider trading bust today, involving high-level executives, investor relations firms, and Raj Rajaratnam of the Galleon Group hedge fund. From what I've been able to gather from the charges, I compiled a set of charts for several of the insider trades. The set excludes trades in stocks that no longer trade publicly (such as Hilton). Judge the charts for yourself. It would be interesting to develop a more systematic study of trading patterns prior to big jumps, and see how insiders show their hands in the stock's price.


The charges appear here.

Tuesday, October 13, 2009

List of publicly traded companies with industry and sector classification from Yahoo! Finance

Quantitative traders and investors doing research often need a comprehensive list of publicly traded US equities along with their sector and industry classification. The Yahoo! Finance industry browser is great for that, but getting the data into a neat form requires some programming ability. Since I often see people requesting a list of companies with industry and sector classification, I have decided to post the one I use. It contains the vast majority of US-listed stocks (excluding some very small OTC-BB names). The rows are tab-delimited, and the columns are ticker, sector, industry, market cap (in billions as of 10/1/2009), and company name. I hope this is useful for you.

The list is available here.
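For readers who want to load the file, here is a minimal sketch of a parser for the layout described above. The field names, the helper functions, and the sample rows are my own illustrations (the sample rows are made up, not taken from the list), and the handling of non-numeric market caps is an assumption.

```python
import csv
from collections import defaultdict

# Column order as described in the post; the field names are illustrative labels.
FIELDS = ["ticker", "sector", "industry", "market_cap_billions", "name"]

def parse_company_rows(lines):
    """Parse tab-delimited rows into dicts, converting market cap to float."""
    rows = []
    for record in csv.reader(lines, delimiter="\t"):
        if len(record) != len(FIELDS):
            continue  # skip blank or malformed rows
        row = dict(zip(FIELDS, record))
        try:
            row["market_cap_billions"] = float(row["market_cap_billions"])
        except ValueError:
            row["market_cap_billions"] = None  # assumption: some caps may be missing
        rows.append(row)
    return rows

def tickers_by_sector(rows):
    """Group tickers under their sector name."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["sector"]].append(row["ticker"])
    return dict(groups)

# Made-up sample rows, for illustration only.
sample = [
    "AAA\tTechnology\tSoftware\t1.5\tExample Software Inc.",
    "BBB\tFinancial\tBanks\t12.0\tExample Bank Corp.",
    "CCC\tTechnology\tSemiconductors\t0.3\tExample Chips Ltd.",
]
print(tickers_by_sector(parse_company_rows(sample)))
```

With a real file, passing an open file handle (opened with `newline=""`) in place of `sample` should work the same way, since `csv.reader` accepts any iterable of lines.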

Wednesday, July 1, 2009

Short-term VIX


The heatmap below shows 10-minute SPY returns depending on the 20-minute fast stochastic oscillator (called kfast here) applied to 5-second bar closes of SPY (vertical axis) and VIX (horizontal axis). The oscillator measures the current price relative to the price range in the last 20 minutes. So, when it takes on a value of 0 it means a new 20-minute low, whereas 1 would be a new 20-minute high. The color of each bin is the mean divided by the standard deviation (i.e. the Sharpe ratio, not annualized) of all the 10-minute returns that fall in it.
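The calculation behind such a heatmap can be sketched roughly as follows. This is not the original code: the function names, the number of bins, the use of simple (rather than log) returns, the population standard deviation, and the value 0.5 for a flat window are all assumptions made for illustration.

```python
from statistics import mean, pstdev

BARS_20_MIN = 240  # 20 minutes of 5-second bars
BARS_10_MIN = 120  # 10 minutes of 5-second bars

def kfast(closes, window):
    """Position of each close within the range of the trailing `window` closes (0 to 1)."""
    out = []
    for i in range(len(closes)):
        if i + 1 < window:
            out.append(None)  # not enough history yet
            continue
        recent = closes[i + 1 - window : i + 1]
        lo, hi = min(recent), max(recent)
        # Assumption: a perfectly flat window is reported as the midpoint.
        out.append(0.5 if hi == lo else (closes[i] - lo) / (hi - lo))
    return out

def forward_returns(closes, horizon):
    """Simple return from bar i to bar i + horizon; None where the future is unknown."""
    n = len(closes)
    return [closes[i + horizon] / closes[i] - 1 if i + horizon < n else None
            for i in range(n)]

def binned_sharpe(k_spy, k_vix, fwd_returns, n_bins=5):
    """Mean / standard deviation of forward returns in each (SPY bin, VIX bin) cell."""
    cells = {}
    for ks, kv, r in zip(k_spy, k_vix, fwd_returns):
        if ks is None or kv is None or r is None:
            continue
        i = min(int(ks * n_bins), n_bins - 1)  # kfast == 1.0 goes in the top bin
        j = min(int(kv * n_bins), n_bins - 1)
        cells.setdefault((i, j), []).append(r)
    result = {}
    for key, rets in cells.items():
        sd = pstdev(rets)
        if len(rets) > 1 and sd > 0:
            result[key] = mean(rets) / sd
    return result
```

Given aligned lists `spy` and `vix` of 5-second closes, the grid would come from `binned_sharpe(kfast(spy, BARS_20_MIN), kfast(vix, BARS_20_MIN), forward_returns(spy, BARS_10_MIN))`.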

We can see that a rise in SPY with a drop in VIX is generally bullish (bottom left orange corner). A rising VIX along with a rising SPY is bearish (bottom right), and so is a falling SPY with a dropping VIX (top left).

The data used here is 5-second bars for SPY and VIX for 370 trading days starting on 2008/01/10.

Friday, April 24, 2009

Intraday market action leading up to FOMC (Fed) Meetings


You know the drill. We will look at the last several FOMC meetings, and the days that preceded them, to gain a sense of what market behavior to expect for the week of April 27. There is a Fed meeting on Wednesday, April 29.

The attached graph consists of a series of subplots. In each subplot we show the price of SPY for all the days up to and including Fed day. So, the very last color (yellow) is the day of the Fed meeting. The only exception is the bottom right subplot, in which I use the current day (Friday, April 24th) in order to give a sense of what the market has been like in the period leading up to now.

One notable thing is how many fakeouts (mostly down, then up) there are on the day of the announcement, and the overall high volatility of the days preceding it. This is always talked about by traders, and it is nice to observe directly in the data.

Friday, March 20, 2009

Options Expiration Intraday Action


Friday, March 20, 2009 is quadruple witching expiration. Traders believe that expiration days are difficult to trade and that market makers and others play a variety of tricks in order to settle their positions.

I am not sure what the proper way to trade options expiration is, but thought I should assemble intraday prices of SPY (ETF tracking S&P 500) from options expiration Fridays (and some Thursdays) of the last year. Each square in the grid is the intraday SPY price for the options expiration Friday in the title, as well as the trading days preceding it. There are a few bad prints in the data, which I did not have time to correct, but I think it is still quite useful.

These are interesting to look at and maybe guide your decision making somehow, though admittedly the sample is small.

Friday, October 24, 2008

Market Woes

Currently, the market is focused on punishing everything not explicitly bailed out by the government. So specs will hammer all stocks/asset classes/governments/etc. until the government takes a slice or allows them to fail. Insurance companies are set to be bailed out, so hedge funds and governments are on target right now. If we don't hear something on Sunday afternoon, it might get ugly Monday morning with fear, shutdowns and default rumors swirling.

Tuesday, September 18, 2007

Financial Markets Visualization

As I was driving in my car, I was pondering the regime-switching nature of the world financial markets. I wondered about how the relationships between the performance of various asset classes or industry groups evolve over time. For example, we all know that at certain times small caps outperform large-caps. More interestingly, there are times when the movement in short term interest rates may be positively or negatively correlated with the general market, depending on the underlying fundamental worries in the period. Does the market experience regime switches, which completely break the existing correlations between its various segments?

Even if I can't directly answer the above questions, I decided they should be easily quantifiable and with a few hours of work I should be able to produce a cool visualization that somehow illuminates the inner workings of the market.

Since I haven't worked with financial data recently, my first task was to somehow acquire a reasonable dataset to work with. Yahoo! Finance was perfect for the task due to the easily automatable downloading of data using its "Save as Spreadsheet" link. A couple of Python scripts later, I had an automated pipeline that would allow me to download and merge (there are often missing days for some of the timeseries) data for arbitrary tickers. As a proxy for various market segments, I searched for ETFs which have been in existence since at least 1998, and settled on the following ones: EWJ, MDY, SPY, XLE, XLF, XLP, XLU, XLV, along with the transportation index ^DJT, and ^IRX and ^TNX for the 90-day bill and 10-year bond. My dataset seems reasonably representative of the complexities of the market, spanning industries, countries, and company sizes.
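The merge step might look something like the sketch below. The post does not say how missing days were handled, so dropping any date that is missing for at least one ticker is an assumption here (forward-filling would be another option). The function name and the sample prices are made up for illustration.

```python
def merge_on_common_dates(series_by_ticker):
    """Align several price series on the dates they all share.

    series_by_ticker maps ticker -> {date_string: close}. Dates are assumed
    to be ISO strings (YYYY-MM-DD) so that sorting them is chronological.
    Returns (dates, {ticker: [close for each date]}).
    """
    common = set.intersection(*(set(s) for s in series_by_ticker.values()))
    dates = sorted(common)
    aligned = {t: [s[d] for d in dates] for t, s in series_by_ticker.items()}
    return dates, aligned

# Made-up prices, for illustration only.
raw = {
    "SPY": {"2007-09-12": 147.0, "2007-09-13": 148.1, "2007-09-14": 148.3},
    "^TNX": {"2007-09-12": 4.41, "2007-09-14": 4.46},  # one day missing
}
dates, aligned = merge_on_common_dates(raw)
print(dates)    # only the dates present for every ticker
print(aligned)
```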

As is customary, I converted my dataset of absolute prices to N-day relative log returns (I tried N of 1, 2, and 5, and used N=2 in the graphs below). My visualization idea was to split up the time series into non-overlapping periods of K days, and produce a graph showing the similarity between pairs of periods. So, for example, I should be able to see that the market right now is really different from the market in 2001.
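As a rough sketch of these two steps: the function names are illustrative, and computing an N-day return on every day (so that consecutive returns overlap when N > 1) is an assumption, since the post does not say which convention was used.

```python
import math

def log_returns(prices, n):
    """N-day log return ending on each day: log(p[t] / p[t - n])."""
    return [math.log(prices[i] / prices[i - n]) for i in range(n, len(prices))]

def split_periods(values, k):
    """Non-overlapping chunks of k consecutive values; a trailing partial chunk is dropped."""
    return [values[i:i + k] for i in range(0, len(values) - k + 1, k)]

prices = [100.0, 101.0, 99.5, 102.0, 103.5, 101.0, 104.0]  # made-up closes
rets = log_returns(prices, 2)
print(split_periods(rets, 2))
```

With several securities, each period would become a table of returns with one column per security, which is the shape the similarity measure needs.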

The measure of similarity I decided to use was the KL-divergence (relative entropy) between the joint multivariate distributions of returns in the two distinct periods. I modeled the distribution of returns in each period as a multivariate Gaussian (one dimension for each security in my dataset). Since I specifically wanted to investigate similarity computed based on second-order characteristics, I scaled the data for each period of K days to have zero mean and unit variance. Had I not scaled the data, much of my similarity metric would have depended on things such as the direction of the market or its volatility. Since such effects are usually rather obvious, removing them by centering and scaling the data allowed me to focus almost entirely on the inter-security structure of market returns. KL-divergence is not a proper distance metric since it is not symmetric (i.e. the KL-divergence between the distributions P and Q is different from the one between Q and P), so my distance measure became the average of the KL-divergence computed both ways.
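The original analysis was done in MATLAB. The distance described above can be sketched in Python with NumPy, using the standard closed form for the KL-divergence between two zero-mean Gaussians, KL(P||Q) = 0.5 * (tr(Σq⁻¹Σp) - d + ln det Σq - ln det Σp). The function names and the random test data are illustrative assumptions, not the original implementation.

```python
import numpy as np

def standardize(x):
    """Scale each column of a (days x securities) array to zero mean, unit variance."""
    return (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)

def gaussian_kl(cov_p, cov_q):
    """KL(P || Q) for zero-mean multivariate Gaussians with the given covariances."""
    d = cov_p.shape[0]
    trace_term = np.trace(np.linalg.solve(cov_q, cov_p))  # tr(cov_q^-1 cov_p)
    _, logdet_p = np.linalg.slogdet(cov_p)
    _, logdet_q = np.linalg.slogdet(cov_q)
    return 0.5 * (trace_term - d + logdet_q - logdet_p)

def symmetric_kl(period_a, period_b):
    """Average of KL(A || B) and KL(B || A) after standardizing each period."""
    cov_a = np.cov(standardize(period_a), rowvar=False)
    cov_b = np.cov(standardize(period_b), rowvar=False)
    return 0.5 * (gaussian_kl(cov_a, cov_b) + gaussian_kl(cov_b, cov_a))

# Random data standing in for two 60-day periods of returns on 4 securities.
rng = np.random.default_rng(0)
period_a = rng.normal(size=(60, 4))
period_b = rng.normal(size=(60, 4))
print(symmetric_kl(period_a, period_b))
```

After standardization the covariance of each period is its correlation matrix, so the distance depends only on the correlation structure. Each period needs more days than securities, otherwise the covariance matrix is singular and the divergence is undefined.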

Implementing the above was a breeze thanks to MATLAB. Below, I have included the pairwise distance for non-overlapping 60-day periods, whose date is indicated on the axes. I only display the upper triangle of the distance matrix since it is symmetric.

The graph above uses 60-day periods, and the distributions compared are 2-day log returns, as described previously. The redder the color, the more similar two periods are, and the bluer it is, the more dissimilar. The height of the bear market in 2001 is the most dissimilar to any other period. Similarly, the bull market of 2005 is characterized by a red blob near the diagonal, which means that the covariance structure of the market remained fairly constant. We can see how the current period we are in compares against the past by examining the last column of the matrix: it is not particularly similar to any other period (i.e. absence of very red colors) and is in fact as dissimilar to anything else in recent history as any other period in the dataset. In general, by examining the lower right corner of the graph we can see that the most recent period of 60 days has the sharpest difference from the immediately preceding periods of any time period in the recent past. However, the difference, while noticeable, might be less striking than one would expect given the turmoil and the extreme volatility recently. While past correlations were broken, as is evident from the prominence of the rightmost column of the matrix, there was also a large increase in the sheer magnitude of the moves, which we specifically excluded from this analysis.

Below, I also include a similar plot, only over 30-day periods. We can see that the fall of the market in February 2007 was really unlike anything seen in the past several years, in terms of the market covariance structure. With the shorter periods, periods are overall more similar, which might be partly due to the fact that it is difficult to reliably estimate the covariance matrix for such small periods. This can be addressed by projecting to a lower-dimensional space, as in PCA, and computing distances on the projected distributions.
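A minimal sketch of the projection idea, which was not part of the analysis shown here: the choice to fit the principal components on the two periods pooled together is an assumption (fitting on the full history is another option), and the function names are illustrative.

```python
import numpy as np

def pca_basis(data, n_components):
    """Top principal directions (as columns) of a (days x securities) array."""
    centered = data - data.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:n_components].T

def project_periods(period_a, period_b, n_components):
    """Project both periods onto a shared low-dimensional basis."""
    basis = pca_basis(np.vstack([period_a, period_b]), n_components)
    return period_a @ basis, period_b @ basis

# Random data standing in for two 30-day periods of returns on 11 securities.
rng = np.random.default_rng(0)
a_low, b_low = project_periods(rng.normal(size=(30, 11)),
                               rng.normal(size=(30, 11)), n_components=3)
print(a_low.shape, b_low.shape)
```

The projected arrays have far fewer columns than days, so their covariance matrices are better conditioned, and the same symmetric KL distance could then be computed on them.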


What do you think about this visualization style? How would you use the underlying ideas of analyzing second-order structure to trade better? I would like to run this on a different basket of assets/industries that is more representative of what is truly important. Any ideas what to include?


In order to make sense of the information, here is a plot of the SPY ETF (proxy for the S&P 500):

P.S. This visualization was actually inspired by tools used to display the correlation structure of markers in the human genome.