*The first of two posts looking at how one year's market performance has a bearing on what comes next. The second looks in more detail at the actual figures.*

6 years ago, I published a mathematical post for Christmas, looking at the numbers in the song, "The Twelve Days of Christmas".

Time for a mathematical post for New Year, this time doing some statistics. To what extent is the performance of the financial market next year independent of what happened this year? That's to say: If the markets had a good 2022, does that mean it's the turn of 2023 to turn down, or is 2023 likely to follow suit and be good, or does 2022 have no bearing on 2023?

As it happens, 2022 was not a good year. With 4 trading days left, the S&P 500 index of US stocks is down 18%. So what, if anything, does that mean for 2023?

A tweet by Bloomberg caught my eye. It links to this post, S&P 500 Facing a Historical Warning Sign After This Year's Slump. That's behind a paywall so I can't read the article, but I can see their two bullet points below the headline:

- Consecutive years of declines in US stock benchmark are rare
- When they do occur, second year has been worse than the first

The tweet said basically the same:

Consecutive down years are rare for US stocks, so after this year’s drop, there’s only a low probability they will decline again in 2023. Yet if they do, history shows that investors will have to brace for another very unpleasant 12 months

They make two claims.

1. Following a down year, there is "only a low probability" that the next year will also decline. So next year is less likely to see a decline than would otherwise be the case, precisely because this year has seen a net decline.

2. In the unlikely event that 2023 is a second decline in a row, 2024 would be more likely to see a decline than if we hadn't just had two consecutive down years.

The article bullet points make a third claim:

3. When you do get two bad years in a row, the drop in the second year is worse than in the first.

Let's look at the stats to see how the first two claims stack up.

## Our Method

We need to examine the history of the S&P 500. It is currently a list of 500 large companies trading on a stock market in the USA, weighted by market cap. It is not the 500 largest, but is instead chosen by committee to reflect a balance of sectors. It has taken this form since 1957, although arguably its predecessors have been doing basically the same thing since 1926. So we'll look at figures from 1926 to 2021, then repeat the calculations for 1957 to 2021. (We don't include 2022 because we are looking to see if a down year was followed by another down year, and we don't yet know what will happen after 2022.)

We will look at how many years in the date range saw the index decline. That will enable us to calculate a mean probability that any given year will see a decline, and also a standard deviation of that probability. We will then look at how many of those down-years were followed by a second down-year. Restricting our data set to years following a decline, we'll be able to work out the probability of a (second) declining year in that data set. Then we ask the question: Is this probability statistically significantly different to the probability of a red year in the full data set?

Lastly, we restrict again. We now just look at the years that follow two decline years in a row. Again, we ask: how many of those were red, for the third time running? What is the probability that the third year is red? Is that different from the probability for the full set of years, by an amount that is statistically significant?

## 1926-2021

First, let's look at the full data set we have here:

### Full Data Set: 1926-2021

1926 to 2021 gives us 96 years. In those the S&P500 declined in 25 years, and saw a positive return in 71 years.

The negative years were 1929, 1930, 1931, 1932, 1934, 1937, 1939, 1940, 1941, 1946, 1953, 1957, 1962, 1966, 1969, 1973, 1974, 1977, 1981, 1990, 2000, 2001, 2002, 2008, 2018.

So the probability of a negative year is 25 / 96, or 26.042%.

Each year either declines or it doesn't, so a single year is one trial of a binomial distribution. The variance is p(1-p), or 19.26%. The standard deviation is the square root of this, or 43.89%.
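The arithmetic above can be reproduced in a few lines of Python. This is only a sketch: the list of down years is copied from this post, and the variable names are my own.

```python
import math

# Down years for 1926-2021, copied from the list in this post.
DOWN_YEARS = [
    1929, 1930, 1931, 1932, 1934, 1937, 1939, 1940, 1941, 1946,
    1953, 1957, 1962, 1966, 1969, 1973, 1974, 1977, 1981, 1990,
    2000, 2001, 2002, 2008, 2018,
]
FIRST_YEAR, LAST_YEAR = 1926, 2021

n_years = LAST_YEAR - FIRST_YEAR + 1       # 96 years in the range
p_down = len(DOWN_YEARS) / n_years         # probability of a down year
variance = p_down * (1 - p_down)           # variance of a single yes/no trial
std_dev = math.sqrt(variance)

# Prints 26.042%, 19.26%, 43.89% (the figures quoted above)
print(f"{p_down:.3%}, {variance:.2%}, {std_dev:.2%}")
```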

### Years Following Down-Year: 1926-2021

What happened in the year after those 25 down-years? The following year was also down on 8 occasions:

- 1929-1930
- 1930-1931
- 1931-1932
- 1939-1940
- 1940-1941
- 1973-1974
- 2000-2001
- 2001-2002

So, just looking at the 25 years that followed a decline year, the probability of a second decline was 8/25, or 32%.
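Counting the back-to-back declines by hand is easy to get wrong, so here is a small Python sketch that does the same count. It assumes the list of down years given earlier in this post; the names are illustrative.

```python
# Down years for 1926-2021, copied from the list in this post.
DOWN_YEARS = [
    1929, 1930, 1931, 1932, 1934, 1937, 1939, 1940, 1941, 1946,
    1953, 1957, 1962, 1966, 1969, 1973, 1974, 1977, 1981, 1990,
    2000, 2001, 2002, 2008, 2018,
]

down = set(DOWN_YEARS)

# A down year counts as "followed by a decline" if the next year is also listed.
pairs = [(year, year + 1) for year in DOWN_YEARS if year + 1 in down]

p_second_decline = len(pairs) / len(DOWN_YEARS)

for first, second in pairs:
    print(f"{first}-{second}")
print(f"{len(pairs)} of {len(DOWN_YEARS)} = {p_second_decline:.0%}")
```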

Straight away you'll notice that this is higher than the 26.042% decline-rate in the overall data set.

However we need to ask if 32% is *significantly* different from 26.042%. That's to say: If you pick 25 years at random (25 typical years), you wouldn't expect those 25 to be perfectly representative and to give you exactly 26.042% of them down every time you pick a sample. There's some variation. So we need to work out the allowable *range* of variation before we say, "hang on, there's something different going on here".

To do that, we need that standard deviation figure from earlier. That was 43.89%. That was the standard deviation (never mind the definition for that) of just picking one year at random. If we're averaging out a sample of 25 years (as we are here) we have to divide that standard deviation by the square root of 25. That just happens to be 5 exactly. 43.89%, divided by 5, gives a standard deviation for an average of 25 years of 8.778%.

So how much variation do we allow from 26.042% before we say this is not behaving typically? That's a matter of judgement, but we decide using something called "confidence intervals". Basically, to say there is evidence this sample is behaving differently from normal, we say that in the normal universe this is the kind of thing we'd only see less than 5% of the time. Or 2.5% of the time. Pick your number, you're picking your confidence interval. You're deciding how unusual this sample needs to be before you can say there's evidence it comes from a different universe.

(This is how scientists determine if a vaccine works. Don't give 500 people a vaccine, and see what proportion get better. Give a different 500 people the vaccine, and see what proportion of them get better. It's not enough to ask if more got better with the vaccine. You have to ask if enough more people got better before you conclude you've found a cure for Covid-19, malaria, or whatever it is.)

If you want to work with a 95% confidence interval, you need to see your sample differ from the universe you're matching against by at least 1.96 times the standard deviation. For comparison, a 97.5% confidence interval would use 2.24x standard deviation, and a 90% confidence interval would use 1.64x standard deviation

Our standard deviation, for a sample of 25, was 8.778%. The probability of a down year in the full data set was 26.042%

- At 90% confidence, 1.64 standard deviations is 14.44%. That gives a range of 11.6% to 40.5%. Our sample of 25 needs a probability outside of this for there to be statistically significant evidence it is not behaving typically.
- At 95% confidence, 1.96 standard deviations is 17.20%. That gives a range of 8.8% to 43.2%.
- At 97.5% confidence, 2.24 standard deviations is 19.67%. That gives a range of 6.4% to 45.7%
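The three ranges above can be sketched in Python as follows. The function name `typical_range` is my own, and I am assuming the multipliers are two-sided values from the normal distribution (which is where 1.64, 1.96 and 2.24 come from), looked up here with the standard library's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

p_all = 25 / 96                               # down-year rate, 1926-2021
sd_one_year = math.sqrt(p_all * (1 - p_all))  # about 43.89%


def typical_range(p, sd, n, confidence):
    """Range in which a sample of n typical years should fall.

    Assumes a two-sided range built from the normal distribution,
    so 95% confidence leaves 2.5% in each tail.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # e.g. 1.96 for 95%
    half_width = z * sd / math.sqrt(n)              # z times the sample's standard deviation
    return p - half_width, p + half_width


for confidence in (0.90, 0.95, 0.975):
    low, high = typical_range(p_all, sd_one_year, 25, confidence)
    print(f"{confidence:.1%} confidence: {low:.1%} to {high:.1%}")
```

The observed 32% can then be compared against each printed range.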

Our sample of 25 years, the years that followed a down-year, featured 32% of those years showing a decline. Most statisticians would work with 95% confidence at the absolute lowest. But even if we allow 90% confidence intervals, 32% is still not statistically significant.

**Conclusion**: There is no evidence to suggest the years following a negative return on the S&P 500 behave any differently to any other year.

### Years Following Double-Down Years: 1926 to 2021

We know there were 8 years when a decline was followed by a second decline. Bloomberg also made the claim that when you do get two declines in a row, a third decline is then more likely than usual.

What happened in those 8 cases? Let's look:

- 1929-1930: 1931 was DOWN
- 1930-1931: 1932 was DOWN
- 1931-1932: 1933 was UP
- 1939-1940: 1941 was DOWN
- 1940-1941: 1942 was UP
- 1973-1974: 1975 was UP
- 2000-2001: 2002 was DOWN
- 2001-2002: 2003 was UP

So 50/50. 4 times out of 8, there was a third decline after the two. The other 4 times, there was not.
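The same tally can be sketched in Python. It assumes the list of down years from earlier in this post, and that any year in the range that isn't on that list finished up.

```python
# Down years for 1926-2021, copied from the list in this post.
DOWN_YEARS = [
    1929, 1930, 1931, 1932, 1934, 1937, 1939, 1940, 1941, 1946,
    1953, 1957, 1962, 1966, 1969, 1973, 1974, 1977, 1981, 1990,
    2000, 2001, 2002, 2008, 2018,
]

down = set(DOWN_YEARS)

# Pairs of consecutive down years ("double-downs").
double_downs = [(year, year + 1) for year in DOWN_YEARS if year + 1 in down]

third_down = 0
for first, second in double_downs:
    third = second + 1
    # Assumption: a year missing from DOWN_YEARS was an up year.
    outcome = "DOWN" if third in down else "UP"
    third_down += outcome == "DOWN"
    print(f"{first}-{second}: {third} was {outcome}")

print(f"{third_down} of {len(double_downs)} were followed by a third decline")
```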

Instinctively, we feel that a third decline half of the time feels high. Much higher than the 26.042% of down years in the overall sample. It seems that, this time, Bloomberg are on to something: After 2 declines in a row, you are more likely to see a third decline.

Let's look at the numbers, rather than going by what we feel.

The standard deviation this time is 43.89% divided by the square root of 8 (we have a sample of 8, rather than the 25 we had last time). That's 15.52%

- At 90% confidence, 1.64 standard deviations gives a range of 0.5% to 51.6%.
- At 95% confidence, 1.96 standard deviations gives a range of -4.4% to 56.5%.
- At 97.5% confidence, 2.24 standard deviations gives a range of -8.7% to 60.8%.

So, again, even at a very loose 90% confidence level, a reading of 50% is within the kind of thing you'd expect from a sample of 8 years (if only just at 90%, where the upper limit is 51.6%).
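Another way to phrase the same comparison is to ask how many standard deviations each observed rate sits from the overall 26.042%, and compare that with the 1.64, 1.96 and 2.24 multipliers. A minimal sketch, with a helper name (`z_score`) of my own choosing:

```python
import math

p_all = 25 / 96                               # down-year rate, 1926-2021
sd_one_year = math.sqrt(p_all * (1 - p_all))  # about 43.89%


def z_score(observed, p, sd, n):
    """How many sample standard deviations the observed rate is from p."""
    return (observed - p) / (sd / math.sqrt(n))


z_after_one = z_score(8 / 25, p_all, sd_one_year, 25)  # years after one decline
z_after_two = z_score(4 / 8, p_all, sd_one_year, 8)    # years after two declines

print(f"After one down year:  {z_after_one:.2f} standard deviations")
print(f"After two down years: {z_after_two:.2f} standard deviations")
```

Both figures come out below 1.64, which is the same verdict as the ranges give.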

**Conclusion**: There is no evidence to suggest the years following two consecutive negative returns on the S&P 500 behave any differently to any other year.

## 1957-2021

Let's now follow the same method again, this time just looking at the years when the exact methodology of the present-day S&P500 has been used. I'll keep the explanation much more brief, as I'm repeating the process with a smaller data set.

### Full Data Set: 1957-2021

We have 65 years of data. 14 saw a decline. Probability of a decline is 21.54%. Variance is 16.90%, and standard deviation is 41.11%.

### Years Following Down-Year: 1957-2021

Of those 14 years that were red, 3 were followed by another red one. Probability is 21.43%.

A sample of 14 years means we divide the 41.11% by square root of 14 to get a standard deviation for a sample of 14 years of 10.99%.

- At 90% confidence, 1.64 standard deviations gives a range of 3.5% to 39.6%.
- At 95% confidence, 1.96 standard deviations gives a range of 0.0% to 43.1%.
- At 97.5% confidence, 2.24 standard deviations gives a range of -3.1% to 46.2%.

In any case, 21.43% is well within the confidence interval.
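Since this section repeats the earlier method on a shorter date range, the whole thing can be wrapped in one function. This is a sketch under my own naming (`summarise` and the dictionary keys are hypothetical), using the list of down years from earlier in this post.

```python
import math

# Down years for 1926-2021, copied from the list in this post.
DOWN_YEARS = [
    1929, 1930, 1931, 1932, 1934, 1937, 1939, 1940, 1941, 1946,
    1953, 1957, 1962, 1966, 1969, 1973, 1974, 1977, 1981, 1990,
    2000, 2001, 2002, 2008, 2018,
]


def summarise(down_years, first_year, last_year):
    """Repeat the post's method for any date range."""
    in_range = [y for y in down_years if first_year <= y <= last_year]
    n_years = last_year - first_year + 1
    p = len(in_range) / n_years                  # overall down-year rate
    sd = math.sqrt(p * (1 - p))                  # single-year standard deviation
    down = set(in_range)
    # Note: a down year in last_year itself would have no following year
    # in the data; that doesn't arise here because 2021 was not a down year.
    repeats = sum(1 for y in in_range if y + 1 in down)
    return {
        "n_years": n_years,
        "n_down": len(in_range),
        "p_down": p,
        "sd": sd,
        "repeats": repeats,
        "p_repeat": repeats / len(in_range),
        "sd_sample": sd / math.sqrt(len(in_range)),
    }


stats_1957 = summarise(DOWN_YEARS, 1957, 2021)
for name, value in stats_1957.items():
    print(f"{name}: {value:.4f}" if isinstance(value, float) else f"{name}: {value}")
```

Calling `summarise(DOWN_YEARS, 1926, 2021)` should give back the figures from the first half of the post.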

**Conclusion**: There is no evidence to suggest the years following a negative return on the S&P 500 behave any differently to any other year.

### Years Following Double-Down Years: 1957 to 2021

Of the 3 years when there were two drops in a row, a third drop followed on 1 occasion.

A data set of 3 is too small to prove anything. At 97.5% confidence you end up saying anything from -32% to +75% is fine. Even 0 out of 3 would be within the bounds of normal behaviour, and anything short of 3 out of 3 would also be normal.

**Conclusion**: There is not enough data from 1957 to 2021 to look at what happened after two drops in a row.

## Claim 3

What of the claim that the second year will be worse than the first, in those cases where a decline year is followed by a second year down? That one does stack up.

Looking from 1926 to 2021, remember there were 8 times when a decline year was followed by a second year in the red. On 7 out of 8 of those occurrences, the second year declined by more than the first.

If it were 50/50, so that the second year is as likely to decline by more as by less, the chance of the second decline being the greater one on at least 7 occasions out of 8 is 9/256, or 3.5%. That is below the 5% and 10% thresholds that go with 95% and 90% confidence (looking only in the direction of a bigger second drop), so there is evidence that the second drop is greater.
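The 3.5% figure is a coin-toss calculation, and it can be checked with a short Python sketch. The function name `prob_at_least` is my own, and the 50/50 assumption is the one stated above.

```python
from math import comb


def prob_at_least(k, n, p=0.5):
    """Chance of k or more successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# 7 or 8 "second year was worse" outcomes out of 8, if each were a fair coin toss.
p_7_or_more = prob_at_least(7, 8)   # (8 + 1) / 256
p_exactly_7 = comb(8, 7) / 2**8     # 8 / 256, for comparison

print(f"At least 7 of 8: {p_7_or_more:.1%}")
print(f"Exactly 7 of 8:  {p_exactly_7:.1%}")
```

The "at least" figure is the one to use, because an 8-out-of-8 result would count as even stronger evidence.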

So this third claim does hold.

## Conclusions

Neither of the headlines from the Bloomberg tweet matches the figures. There is no evidence that a negative year on the S&P 500 is less likely to be followed by a negative one, and there is no evidence to suggest two negative years are more likely to be followed by a third.

Maybe I misunderstood their tweet. Maybe "bracing for another unpleasant 12 months" refers to the 12 months about to begin, not the one that would begin once we've experienced a second down-year in a row. They weren't clear.

Notice I have not proved that the stock market performance one year is independent of the year before. There are statistical tools to examine the degree of correlation, and I have not used those. It would be interesting to see how correlated Year Y+1's performance is with Year Y's.

What I have done is look at the claim that there is a pattern to suggest something specific and different happens following either one down year or two, and found that there is no evidence for such a pattern. The natural variation in stock market performance from one year to the next accounts for any supposed pattern.

I would question whether Bloomberg's headline here is anything more than clickbait. That is a weighty charge to bring: I generally consider Bloomberg to be a responsible provider that is at the quality end of business and financial analysis and journalism.

However, they claimed that after a down year on the S&P 500 we can expect a better year. Looking at the whole sample of 1926 to 2021, the S&P was down on the year 26% of the time, but down 32% of the time following a declining year. Superficially, that suggests you are *more* likely to see a drop following a year in the red, which is quite different from their headline of *less* likely. Now, we mustn't say that you are more likely to see a second red year; we saw above that 32% is close enough to 26% not to be statistically significant. But to drop a headline when there is no statistical significance, and when the raw data if anything suggests the reverse pattern, does look rather like clickbait from here.

Even just looking at 1957 to 2021, 21.43% is so similar to 21.54% that it is poor to put out a headline saying that you are less likely to experience a second decline.

In conclusion, given the year we've had in 2022, we cannot conclude that we are more or less likely to see a negative year in 2023. Not statistically we can't. But the sub-headline in Bloomberg's article is correct: Should 2023 turn out to be a negative year, it is likely to be a more negative one than this one. But that's a big if. On average, the S&P only ends the year down one year in four, and 2023 is no more likely to be such a year than any other. We'll know in 12 months' time.

If you've enjoyed this post, you may like to read the sequel posted in early January.
