History

Posted on June 25, 2015; updated June 29, 2015. Categories: Analytics, History

Bubble Markets, Burst Markets

Wall Street forecasters are notoriously bad at predicting what the markets are going to do. In fact the forecasts for 2001, 2002, and 2008 were actually worse than guessing. Granted, predicting the future is a hard job, but when it comes to stock markets, there are some things you can count on. Disclaimer: This is a look at the numbers; it is not investment advice.

Let’s take the Standard & Poor’s 500. It is an index of the stock values of 500 large US companies, much broader than the Dow Jones 30-company average. It isn’t a stock, and you can’t buy shares in it. But it is a convenient tool for tracking the overall condition of the stock market. It may also reflect on the state of the economy, which we’ll look at in a bit. Below are the monthly closing values of the S&P 500 since 1950. Its value was about 17 points in January of 1950 and it closed around 2100 points here in June of 2015. It’s bounced around plenty in between.

Closing values of the S&P 500 stock index.

One of the questions to ask is whether the markets are overvalued or undervalued. Forecasters hope to predict crashes, but also to look for good buying opportunities. Short term fluctuations in the markets have proven to be very unpredictable. But longer term trends are a different story, and looking at them can give huge insights into what’s currently going on.

But first we have to look at the numbers in a different way. The raw data plot above makes things more difficult than they really need to be because it fails to let you clearly see the trend in how the index grows. Stocks have a tendency to grow exponentially in time. This is no secret, and most of the common online stock performance charts give you a log view option. Exponential growth is why advisors recommend that most working people get into investments early and ride them out for the long haul.

The exponential growth in the S&P is easy to see in the plot below, where I plotted the logarithm of the index value. For convenience I also plotted the straight line fit to these data; this is its exponential trend. Note that these data span six and a half decades, so we have some bull and bear markets in there, and whatever came in between them. And what you see is that no matter what short term silliness was going on, the index value always came back to the red line, whether from above or from below. It didn’t necessarily stay there very long, but the line represents the stability position. It is a kind of first order term in a perturbation theory model, if you will. The line shows the value that the short term fluctuations move around.

Here I’ve taken the logarithm (base 10) of the index values to show the exponential growth trend. The grey area represents the confidence intervals.
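For anyone who wants to try this kind of fit, here is a minimal sketch in Python. It is not the code behind the plots; the series below is synthetic (made up to start near 17 points and grow tenfold every 33.3 years, with an artificial wiggle), and the function names are my own. The idea is simply an ordinary least-squares line through the base-10 logarithm of the index values.

```python
import math

def fit_log10_trend(years, values):
    """Least-squares line through (year, log10(value)).

    Returns (intercept, slope) in log10 units.
    """
    logs = [math.log10(v) for v in values]
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, logs))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def trend_value(year, intercept, slope):
    """Model (trend) value of the index at a given year."""
    return 10 ** (intercept + slope * year)

# Synthetic monthly series, NOT real index data (an assumption for illustration).
years = [1950 + m / 12 for m in range(786)]
values = [17 * 10 ** ((y - 1950) / 33.3) * (1 + 0.1 * math.sin(y)) for y in years]

intercept, slope = fit_log10_trend(years, values)
print("years to grow tenfold:", round(1 / slope, 1))
print("trend value in 1950:", round(trend_value(1950, intercept, slope), 1))
```

With real monthly closes in place of the synthetic series, the fitted line plays the role of the red trend line, and 1/slope is the time the trend takes to grow by a factor of ten.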

This return to the line is a little bit clearer if we plot the difference between the index and the trend. This would seem to be a reasonable way to spot overvalued or undervalued markets. It means that in 2000, when the S&P was some 800 points over its long term model value, the rapid drop back down to the line should have caught no one by surprise.

Differences between the S&P 500 index value and the exponential trend model value.

But this look at the numbers is a bit disingenuous. That’s because the value of the index has changed by huge amounts since 1950, so small point swings that we don’t care about at all today were a much bigger deal then. This makes more recent fluctuations appear to be a bigger deal than they may really be. So what we want to see is the percentage of the change, not the actual change.

And on top of this, let’s mark recession years (from the Federal Reserve Economic Database) in red. From this view we can see the bubble markets develop and the panics that result when they burst (hello 2008). We can also see that every recession brought a drop in the index (some bigger than others), but not every index drop represented a recession. In the tech bubble of the late 1990s the market was 110% overvalued at its peak. The crash of 2008 had it drop to about 45%, which is considerably undervalued. All that in 8 years. I think it’s safe to call that a panic. I know it made me panic.

Deviations in the S&P 500 index value from the exponential model are shown as a percentage of the index values. And recession years (from FRED) are shown in light red.

What we see is that the exponential model does a good job at calculating the baseline (stable position) values. If it didn’t, the recession-related drops in the index wouldn’t line up with the FRED data, and things like the 1990s bubble and the 2008 financial meltdown wouldn’t match the timeline. But they do. Quite well, actually. So this is a useful analysis tool.

It is also enlightening to take the same looks at the NASDAQ index since it represents a different sector of the stock market. NASDAQ started in 1971 and is more of a technology focused index. The NASDAQ composite index is created from all of the stocks listed on the NASDAQ exchange, which is more than 3000 stocks. So more companies in the index means this is a broader look, but it is focused on tech stocks.

So, as with the S&P above, here are the raw data. It looks similar to the S&P, and the size of the tech bubble is clearer. The initial monthly close of the index was 101 points, and it is over 5000 today.

Closing values of the NASDAQ stock index.

Not surprising to anyone, this index also grows with an exponential trend. The NASDAQ was absolutely on fire in the late 1990s. I wonder if this is what Prince meant when he wanted to party like it was 1999. Maybe he knew that would be the time to cash out?

Here I’ve taken the logarithm (base 10) of the index values to show the exponential growth trend. The grey area represents the confidence intervals.

The size of the dot-com bubble is clearer if we look at the deviation from the model, as we did with the S&P. At the height of the tech bubble, the NASDAQ was about 3500 points overvalued. Considering that the model puts its expected value at about 1300 points in 2000, I have to ask myself, what were they thinking?

Differences between the NASDAQ index value and the exponential trend model value.

The percent deviation plot shows this very clearly. At the height of the tech bubble, the NASDAQ was some 275% overvalued, roughly two and a half times the S&P 500’s 110% overvaluation. Before the late 1990s the NASDAQ had never strayed more than about 50% from the model value. Warren Buffett has said that the rear view mirror is always clearer than the windshield, but maybe Stevie Wonder shouldn’t be the one doing the driving.
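The percentage figure can be checked against the point values quoted above. This small Python sketch assumes the deviation is measured relative to the model value (the numbers quoted in this post are consistent with that reading, but it is my assumption), and it uses the rounded figures from the text: a model value of about 1300 and an overshoot of about 3500 points.

```python
def percent_deviation(index_value, model_value):
    """Deviation of the index from the trend, as a percentage of the trend value."""
    return 100.0 * (index_value - model_value) / model_value

# Rounded figures quoted in the text for the NASDAQ at the 2000 peak.
nasdaq_model = 1300.0
nasdaq_peak = nasdaq_model + 3500.0

print(round(percent_deviation(nasdaq_peak, nasdaq_model)))  # 269
```

The rounded inputs give about 269%, in line with the "some 275%" read off the plot.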

Deviations in the NASDAQ index value from the exponential model are shown as a percentage of the index values. And recession years (from FRED) are shown in light red.

From this perspective, the NASDAQ today actually looks a few percentage points undervalued, so tech still seems to be a slightly better buy than the broader market (this is not investment advice).

Not only that, but the growth model of the NASDAQ, based on its 45 years of data, shows that it grows considerably faster than the broader market. If you go back and look at the raw data for either of the two indices, you’ll notice something special about the nature of exponential growth. The time it takes to double (triple, etc.) is a constant. As these are bigger numbers and because it is convenient, let’s look at the time it takes to grow by a factor of ten (decuple). The S&P 500 index decuples every 33.3 or so years. The NASDAQ composite, on the other hand, decuples every ~24 years (about 23 years and 11 months, give or take). This has huge implications for growth. That’s nine fewer years to grow by the same factor of 10.
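To put the decuple times in more familiar terms, here is a short Python sketch that converts them into doubling times and equivalent annual growth rates. The decuple times are the ones quoted above (33.3 years, and about 23 years 11 months); the function names are mine.

```python
import math

def doubling_time(decuple_years):
    """Years to double, given the years needed to grow by a factor of ten."""
    return decuple_years * math.log10(2)

def annual_growth_rate(decuple_years):
    """Equivalent compound annual growth rate as a fraction (0.07 means 7%)."""
    return 10 ** (1 / decuple_years) - 1

for name, decuple in [("S&P 500", 33.3), ("NASDAQ", 23 + 11 / 12)]:
    print(name,
          "doubles every", round(doubling_time(decuple), 1), "years,",
          "about", round(100 * annual_growth_rate(decuple), 1), "% per year")
```

On these numbers the S&P trend doubles roughly every ten years and the NASDAQ trend roughly every seven, which is where the gap between the two indices comes from.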

Now comes the dangerous part. Let’s take both of these indices and forecast their model values out thirty years. Both of the datasets contain more than thirty years’ worth of data, so forecasting this far out is a bit of a stretch, but not without some reasonable basis. Still, this is an exercise in “what if,” not promises, and certainly not investment advice.

Since we started with the S&P, let’s look at that first. If the historic growth trends continue, the model forecasts that the S&P 500 (currently around 2000 points) should be bouncing around the 10,000 point mark some time in the middle of 2038.

S&P 500 data, along with its exponential model fit, extended out thirty years. The grey area represents the confidence intervals.

The NASDAQ, on the other hand, which is currently around 5000 points, should average around 10,000 in late 2021, and 100,000 near the end of 2045. (Note: the S&P should be around 16,000 points at that time). Today the ratio of the NASDAQ to the S&P is about 2.4. But in 2045 it could reasonably be expected to be more than 6. Depending on the number of zeroes in your investment portfolio (before the decimal point…), that could be significant.

NASDAQ data, along with its exponential model fit, extended out thirty years. The grey area represents the confidence intervals.
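The forecast dates above follow from nothing more than the decuple times. As a rough cross-check in Python (a sketch using the rounded figures from this post, not the fitted models themselves), start each trend at the 10,000-point crossing quoted above and carry it forward to the end of 2045.

```python
def extrapolate(value_now, years_ahead, decuple_years):
    """Carry a trend value forward, assuming it grows tenfold every decuple_years."""
    return value_now * 10 ** (years_ahead / decuple_years)

# NASDAQ trend: about 10,000 in late 2021, so roughly 24 years to the end of 2045.
nasdaq_2045 = extrapolate(10_000, 24, 23 + 11 / 12)
# S&P 500 trend: about 10,000 in mid-2038, so roughly 7.5 years to the end of 2045.
sp500_2045 = extrapolate(10_000, 7.5, 33.3)

print(round(nasdaq_2045), round(sp500_2045), round(nasdaq_2045 / sp500_2045, 1))
```

This lands near 100,000 for the NASDAQ and in the neighborhood of the 16,000 quoted for the S&P, for a ratio of about 6. The same "what if" caveat applies.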

This forecasting method will not predict market crashes. But that’s OK, because the professionals who try to forecast them can’t do that either. (Now if only Goldman Sachs would hire me.) What it can do is give us a very clear idea of whether the market is over- or undervalued. By forecasting the stable position trend, we can easily spot bubbles, identify their size, and perhaps make wise decisions as a result.

Posted on January 9, 2015. Categories: Analytics, History

Followup: The Effect of Elections on Gasoline Prices

My intention for the last post, The Effect of Elections on Gasoline Prices, was to be as thorough and quantitative as possible. A friend who is properly trained in statistics pointed out the need to run significance tests on the results. This is good advice and the analysis will be complete with its inclusion.

That last post ended with a visualization of the non-seasonal changes in gasoline prices in the months leading up to the election (August to November) for election years (Presidential or midterm), and used the same data in the same timeframe in non-election years as a control. We used inflation-adjusted, constant 2008 dollars to properly subtract the real seasonal changes and discover real trends in the analysis. That final figure (below) clearly showed that there is no trend of election-related price decreases. In fact, prices have tended to increase somewhat as the election nears. But the question that I failed to adequately address last time is: Are the price changes in election years significantly different from those of non-election years? This is the definitive question.

Non-seasonal, August to November changes in U.S. regular unleaded gasoline prices from 1976 to 2013. The comparison is made for election and non-election years. Original data source is the U.S. Bureau of Labor Statistics.

Because any sampled data set will suffer from sampling errors (it would be extremely difficult for every gas station in the country to be included in the BLS study each month), the sampled distribution will differ somewhat from the actual distribution. This is important because we frequently represent and compare data sets using their composite statistical values, like their mean values. And two independent samplings of the same distribution will produce two sets with different mean values; this makes understanding significant differences between them an important problem. What we need is a way to determine how different the datasets are, and if these differences are meaningful or if they are simply sampling errors (errors of chance).

Fortunately we are not the first to need such a tool. Mathematicians have developed a way to compare datasets to determine if their differences are significant or not. These are “tests of significance.” The t-test is one of these tests. It tells us how probable a difference in means at least as large as the one we observed would be if the two samples really came from distributions with the same mean. The first thing we should do is look at the distributions of these price changes. The two large election-year price drops (2006, 2008) are very clearly seen to be outliers, and the significant overlap of the distribution of price changes is readily visible.

Distributions of non-seasonal, August to November changes in U.S. regular unleaded gasoline prices from 1976 to 2013. Original data source is the U.S. Bureau of Labor Statistics.

Distributions of non-seasonal, August to November changes in U.S. regular unleaded gasoline prices from 1976 to 2013 for both election and non-election years. Original data source is the U.S. Bureau of Labor Statistics.

It is clear that were it not for the outliers in the election year data, these distributions would be considered to be very nearly identical. But to characterize the significance of their differences, we’ll run an independent t-test. The primary output of the test that we are concerned with is the p-value. This is the probability of seeing a difference in means at least this large purely by chance, assuming there is no real difference between the two groups. Recall that the maximum value of a probability is 1. If it matters, I’m using R for data analysis.

Welch Two Sample t-test

data:  electionyear$changes and nonelectionyear$changes
t = -0.6427, df = 21.385, p-value = 0.5273
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
  -0.2530637 0.1334810
sample estimates:
  mean of x mean of y 
-0.02367507 0.03611627

This p-value tells us that, if there were no real difference between election and non-election years, we would still see a difference in means at least this large about 52.7% of the time. That is far above the usual 5% threshold, so we cannot reject the null hypothesis that the true difference in means is 0. The 95% confidence interval, which includes 0, says the same thing. This answers the question that we posed and indicates that the changes in gas prices in election years are not significantly different from those of non-election years.
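The test can be reproduced approximately without R. The sketch below is plain Python, not the R call that produced the output above, and it uses the per-year changes as listed in the original post, which are rounded to the cent, so the results differ slightly from R’s. The p-value comes from a simple numerical integration of the t density, which is my shortcut rather than a library routine.

```python
import math
from statistics import mean, variance

# Rounded August-to-November changes ($/gallon) as listed in the original post.
election = [0.08, 0.09, -0.02, 0.00, 0.10, 0.05, 0.02, 0.37, 0.09, 0.07,
            0.10, 0.07, 0.14, 0.12, 0.21, -0.64, -1.40, 0.20, -0.10]
non_election = [0.06, 0.17, 0.04, -0.02, 0.04, 0.05, -0.01, 0.08, 0.11, 0.01,
                0.04, 0.10, -0.09, 0.00, -0.09, 0.35, 0.13, -0.09, -0.20]

def welch_t(x, y):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    vx = variance(x) / len(x)
    vy = variance(y) / len(y)
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on the density from 0 to |t|."""
    b = abs(t)
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3  # probability mass between 0 and |t|
    return 1 - 2 * central

t, df = welch_t(election, non_election)
p = two_sided_p(t, df)
print(f"t = {t:.4f}, df = {df:.2f}, p-value = {p:.4f}")
```

Even with the rounded inputs, the statistic, degrees of freedom, and p-value come out very close to the R output, and the conclusion is unchanged.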

Posted on January 4, 2015. Categories: Analytics, History

The Effect of Elections on Gasoline Prices

A quick web search shows that I’m not the only one who’s heard the talk about how gasoline prices always decline before an upcoming election. Of course, this always gets mentioned when local gas prices are declining and an election is coming up. But is this actually true? Do gas prices in the United States decrease leading up to an election? There are lots of articles written about this topic, and some even use numbers and statistics to back up their position, but I intend to be a bit more thorough here.

To get started, we’ll use the inflation-adjusted price of gasoline that we get from the U.S. Bureau of Labor Statistics. We’ve looked at these data in a previous post, and if you’re at all interested in how constant-dollar costs are calculated, you should go read that post first. This dataset goes back to 1976, so it includes a sizable number of election years. Nineteen, to be precise. It is important to use inflation adjusted data in this analysis because it compensates for the price changes from the changing buying power of the dollar. Ten cent changes in the price of gas in 1980 and in 2010 aren’t the same, and inflation-adjusted prices account for this.

Unleaded regular gasoline prices from 1980 to 2014 in constant 2008 dollars. Source: U.S. Bureau of Labor Statistics.

The first thing we should do is to look for annual trends. It is entirely reasonable to expect that the price of gas shows some regular changes each year, so we should first understand if this happens and by how much so that we can account for it in our analysis. To do this, we break the above graph up into its parts. We separate the observed data into the parts that repeat on an annual basis (seasonal changes), the parts that change more slowly (long-term trend), and the parts that change more quickly (remainder). If we add all these parts back together we will get the original observation. For gasoline, this additive decomposition gives us the results plotted below. Note the differences in the y-axis scales for the different plots. The “trend” component is the largest. Seasonal variations swing about seventeen cents from high to low, but the remainder, the fast-changing non-periodic fluctuations, can be in excess of a dollar, though they generally exist inside of a quarter of a dollar on either side of zero.

Decomposition of U.S. Gasoline prices into seasonal and other components. Original data source: U.S. Bureau of Labor Statistics.
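To make the idea of an additive decomposition concrete, here is a rough Python sketch of the textbook moving-average version. This is not necessarily the method behind the plot above (the routine actually used isn’t named here), and the monthly series is synthetic: a slow linear rise plus a made-up seasonal cycle with a 17-cent swing.

```python
import math

def decompose_additive(series, period=12):
    """Classical additive decomposition: series = trend + seasonal + remainder.

    Assumes an even period and at least two full periods of data.
    Trend and remainder are None at the ends, where the centered
    moving average is undefined.
    """
    n = len(series)
    half = period // 2
    trend = [None] * n
    for i in range(half, n - half):
        window = series[i - half:i + half + 1]
        # Centered moving average: the two end points get half weight.
        trend[i] = (sum(window) - 0.5 * (window[0] + window[-1])) / period

    # Average the detrended values for each position in the cycle.
    sums = [0.0] * period
    counts = [0] * period
    for i in range(n):
        if trend[i] is not None:
            sums[i % period] += series[i] - trend[i]
            counts[i % period] += 1
    raw = [s / c for s, c in zip(sums, counts)]
    offset = sum(raw) / period  # center the seasonal component on zero
    seasonal = [raw[i % period] - offset for i in range(n)]

    remainder = [None if trend[i] is None else series[i] - trend[i] - seasonal[i]
                 for i in range(n)]
    return trend, seasonal, remainder

# Synthetic monthly prices, NOT BLS data: slow rise plus an annual cycle.
prices = [2.0 + 0.01 * i + 0.085 * math.sin(2 * math.pi * i / 12) for i in range(120)]
trend, seasonal, remainder = decompose_additive(prices)
print("seasonal swing:", round(max(seasonal) - min(seasonal), 2))
```

Because the synthetic series has no noise, the remainder here is essentially zero; with real prices, the remainder is where the fast, non-periodic fluctuations end up.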

The seasonal component here is the part we’re interested in first. There are a couple of ways to extract this component in a decomposition, but because we are using constant-dollar prices and looking for the true seasonal fluctuations, we don’t want to window the filter at all. So let’s take a closer look at that seasonal component. Recall that this is the component that repeats every year since 1976.

Seasonal change in U.S. gasoline prices (regular unleaded) in constant 2008 dollars. Original source: U.S. Bureau of Labor Statistics.

Not too surprisingly, we see there is a 7.7 cent increase in the summer, peaking in the June driving season, and a 9.2 cent decrease in gas prices in the winter, with the bottom arriving in December. Interestingly, since elections are in November, they occur during the natural seasonal price decline. This will not be a problem for us.

The key to determining whether or not non-seasonal conditions (i.e., elections) impact prices is simply to use the non-seasonal components of the prices for comparison, that is, only the trend and remainder components. By excluding the seasonal fluctuations we can see how the non-seasonal prices changed, and from there we can observe the effect of elections on price changes.

For the sake of simplicity, let’s define the time before the election that we are interested in as August through November. This is an assumption on my part. But the last quarter before the election seems to be the time we should pay the most attention to. We can repeat the analysis using any other window of time if we desire. Since we’ve already identified the seasonal changes in the decomposition, this is straightforward. I’ve highlighted the pre-election time on the following graph for ease of viewing.

Non-seasonal changes in U.S. regular unleaded gasoline prices, in constant 2008 dollars during election years from 1976 to 2012. Original source: U.S. Bureau of Labor Statistics.
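The arithmetic behind each year’s number is simple: subtract the seasonal component from the observed price (which leaves trend plus remainder), then take the November value minus the August value. A minimal Python sketch, with entirely hypothetical prices and seasonal offsets:

```python
def non_seasonal_change(price_start, price_end, seasonal_start, seasonal_end):
    """Change in the non-seasonal (trend + remainder) part of the price."""
    return (price_end - seasonal_end) - (price_start - seasonal_start)

# Hypothetical numbers for illustration only: the pump price falls 10 cents
# from August to November, but the seasonal component falls by the same amount.
change = non_seasonal_change(price_start=3.00, price_end=2.90,
                             seasonal_start=0.03, seasonal_end=-0.07)
print(round(change, 2))
```

In this made-up case the whole drop at the pump is seasonal, so the non-seasonal change is zero, which is exactly the kind of decline we don’t want to credit to an election.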

So when we do this we find the following for election year, non-seasonal gas price changes:

1976: +$0.08/gallon
1978: +$0.09/gallon
1980: -$0.02/gallon
1982: No Change
1984: +$0.10/gallon
1986: +$0.05/gallon
1988: +$0.02/gallon
1990: +$0.37/gallon
1992: +$0.09/gallon
1994: +$0.07/gallon
1996: +$0.10/gallon
1998: +$0.07/gallon
2000: +$0.14/gallon
2002: +$0.12/gallon
2004: +$0.21/gallon
2006: -$0.64/gallon
2008: -$1.40/gallon
2010: +$0.20/gallon
2012: -$0.10/gallon

So only four out of the last 19 election years have shown a drop in gas prices that was not part of the normal seasonal variation. And only two of those were by more than a dime. The average change here is a 2.3 cent drop, but that is very heavily influenced by the 2008 drop of $1.40/gallon, statistically an outlier. The median value of an 8.1 cent increase is more in line with the typical behavior. And of the years when the prices don’t drop, the average increase is 11.5 cents. In other words, there is no election-year drop in gasoline prices using the BLS data.

We should ask ourselves how these election year results differ from those of non-election years. This is also straightforward to answer.

Non-seasonal changes in U.S. regular unleaded gasoline prices, in constant 2008 dollars during non-election years from 1977 to 2013. Original source: U.S. Bureau of Labor Statistics.

And we find the following non-election year, non-seasonal gas price change results (August to November):

1977: +$0.06/gallon
1979: +$0.17/gallon
1981: +$0.04/gallon
1983: -$0.02/gallon
1985: +$0.04/gallon
1987: +$0.05/gallon
1989: -$0.01/gallon
1991: +$0.08/gallon
1993: +$0.11/gallon
1995: +$0.01/gallon
1997: +$0.04/gallon
1999: +$0.10/gallon
2001: -$0.09/gallon
2003: No Change
2005: -$0.09/gallon
2007: +$0.35/gallon
2009: +$0.13/gallon
2011: -$0.09/gallon
2013: -$0.20/gallon

And so we find that just six of the nineteen non-election years showed August to November gas price decreases, and only one of those was more than a dime drop. The average price change in non-election years is a 3.6 cent increase with the median value of a 3.8 cent increase (it is nice when they agree). And for the years that show an increase, the average change is a rise of 9.9 cents. I think it is easier to grasp this visually.

This tells us that the median non-seasonal gas price change between August and November is actually 4.3 cents/gallon higher (in constant 2008 dollars) in an election year than in the same time frame in a non-election year. The caveat here is that we are dealing with national prices instead of local, but I think we can call this myth busted.
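The summary numbers are easy to recompute from the two lists above. This Python sketch uses the values exactly as printed, which are rounded to the cent, so the results agree with the figures in the text only to within rounding (for example, the medians come out as 8 and 4 cents rather than 8.1 and 3.8).

```python
from statistics import mean, median

# Rounded August-to-November changes ($/gallon), copied from the lists above.
election = [0.08, 0.09, -0.02, 0.00, 0.10, 0.05, 0.02, 0.37, 0.09, 0.07,
            0.10, 0.07, 0.14, 0.12, 0.21, -0.64, -1.40, 0.20, -0.10]
non_election = [0.06, 0.17, 0.04, -0.02, 0.04, 0.05, -0.01, 0.08, 0.11, 0.01,
                0.04, 0.10, -0.09, 0.00, -0.09, 0.35, 0.13, -0.09, -0.20]

def summarize(changes):
    """Number of price drops, mean change, and median change."""
    drops = sum(1 for c in changes if c < 0)
    return drops, mean(changes), median(changes)

for label, changes in [("election", election), ("non-election", non_election)]:
    drops, avg, med = summarize(changes)
    print(f"{label}: {drops} drops, mean {avg:+.3f}, median {med:+.2f}")
```

The mean for election years is dragged below zero by 2006 and 2008, while the median, which ignores how extreme those two years were, stays positive and above the non-election median.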

Posted on December 30, 2014. Categories: Analytics, History, Inflation

The Cost of Things: A Constant-Dollar Look at Common Goods

The other day I went to the grocery store to get some things to make tacos. We often end up getting ground turkey these days, but I admit to preferring ground beef for this sort of thing. But after I looked at the prices of ground beef, I believe I understand why we usually get turkey. In case you haven’t noticed, the price for ground beef is on the rise, having crossed $4/lb in August of 2014 with no looking back. This perplexes me somewhat as ground beef comes from cows, which are not dangerous, rare, or hard to kill. Oh, for the cheap ground beef days of the 1980s, right?

Average U.S. prices for ground chuck (100% Beef) from 1980 to 2014. Source: U.S. Bureau of Labor Statistics.

Now any time you want to compare prices over some length of time you run into the same problem. The buying power of the dollar is not constant. It changes over time, and this makes it somewhat difficult to tell whether something is getting more or less expensive or if the buying power of the dollar is changing. Or both. The folks at the U.S. Bureau of Labor Statistics track the buying power of the dollar each month by finding out what it costs to purchase a basket of particular items in the marketplace. The result of their work is called the Consumer Price Index, or CPI. So whether the cost of something actually changes depends on how it changes compared with the index. To see this, we use the index to adjust past prices to reflect what they would have been if the dollar had kept a constant buying power.

This method isn’t perfect because it only reflects the buying power of the dollar at some point in time and doesn’t directly consider the change in the money supply. You might have thought that the amount of money in the U.S. economy was a fixed number, and that would be a reasonable assumption. But it would be a very wrong assumption. An increase in the money supply will decrease buying power but takes some time to be recognized by the market. Because of this, the CPI will lag somewhat behind actual values. In any case, we won’t let perfection be the enemy of good enough.

In order to use the CPI to compare historic prices, we pick a year and normalize everything with respect to that year’s value. Below is a plot of the CPI using 2008 as the reference year (where the CPI = 1). To use the chart, we can look and see that in 1985 the CPI was one-half, and this indicates that prices doubled between 1985 and 2008. Specifically, this means that the prices of the things in the BLS’s market basket doubled. That’s another way of saying the buying power of the dollar was cut in half in that time period. So much for your savings.

Consumer Price Index (CPI) from 1980 to 2014, using 2008 as the reference.
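In code, the adjustment is a single division. Here is a minimal Python sketch, assuming the CPI has already been normalized so that the reference year (2008) equals 1; the function name and the example price are mine, not BLS figures.

```python
def to_constant_dollars(nominal_price, cpi_relative):
    """Convert a price to reference-year dollars.

    cpi_relative is the CPI for the year of the price divided by the
    CPI for the reference year (so it is 1.0 in the reference year).
    """
    if cpi_relative <= 0:
        raise ValueError("cpi_relative must be positive")
    return nominal_price / cpi_relative

# From the chart: the normalized CPI in 1985 was about one-half.
# A hypothetical $1.00 price tag in 1985 is then $2.00 in 2008 dollars.
print(to_constant_dollars(1.00, 0.5))  # 2.0
```

Every constant-dollar chart that follows is just this division applied month by month.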

With this information in hand, we can have another look at the prices of ground beef in constant dollars. And now we see that the cost of ground beef was surprisingly high in the 1980s but decreased until about 2000. It then started its way up sharply around the middle of 2013 leading up to today’s prices. So ground beef really is getting more expensive, and by no small amount. Not good for taco lovers who prefer beef to turkey. And before you ask, they don’t track ground turkey prices, and I don’t know why.

Historic pricing of ground chuck in the U.S. in constant 2008 dollars. Source: U.S. Bureau of Labor Statistics.

On a side note, who would have ever thought, at the time, that we would look back at the 1980s as the days of the strong dollar?

In any case, the BLS tracks the prices of a lot of other products. We can get a feel for the health of the economy by looking at how the cost of things has changed using constant-valued currency. So let’s continue with food. Specifically, chicken. The inflation adjusted cost of fresh, whole chicken in constant 2008 dollars shows that chicken prices have been stable since about 1990, which means the price at the register has increased at about the rate of inflation. Hopefully your income has as well.

BLS data for U.S. fresh whole chicken prices in constant 2008 dollars. Source: U.S. Bureau of Labor Statistics.

Milk is an interesting product to look at because it is one of the items in the BLS market basket that is used to calculate the CPI. Because of this, we should expect the cost of milk to move pretty much in line with the CPI and give us a flat cost curve, which it does, on average. These numbers are nationwide averages, so some fluctuations should be expected. But we also have (or have had in any case) minimum prices federally guaranteed at various times during this period. Meaning there’s no limit to how high milk prices can go, but prices won’t fall below a set value. And if that happens, your tax dollars are used to pay for the milk you didn’t buy in the store.

Milk prices from 1980 to 2014 in constant 2008 dollar prices. Source: U.S. Bureau of Labor Statistics.

Sugar is the last of the food products that we’ll look at. Sugar is another one of those agricultural products with price supports, but this one isn’t in the BLS market basket. Interestingly, the supported prices don’t generally adjust in line with inflation. So while the retail price of sugar (not shown) has been largely steady throughout the decades, the inflation-adjusted true cost of it has been coming down. It was the production shortages in 1979 and 1980 that led to the soaring prices in the early 1980s, which ultimately led to the switch to high fructose corn syrup (HFCS) by food and beverage companies. Other references show similar or larger spikes in sugar prices in the 1960s and 1970s. It isn’t clear that it was worth it, but sugar hasn’t been quite so volatile since the switch.

Sugar prices from 1980 to 2014 in constant 2008 dollars. Source: U.S. Bureau of Labor Statistics.

There are some other interesting products to look at. Electricity is one of them. I haven’t seen anything that shows seasonal price fluctuations quite as clearly as the electricity cost chart. What is especially interesting here is that sometimes the retail price (not shown) of electricity increased while the cost in constant dollars came down. Being a regulated public utility, the prices for electricity are generally not market-driven, but based on cost-recovery, so we can read from this that the costs of the electric utility business have been decreasing fairly consistently since the mid 1980s. Here in New England they’re talking about 40% increases in electricity rates in the next year or so, which would take us back to the historically high 1980s costs. There is no historic precedent for that kind of single year price increase. At least, not in the last three decades.

The cost of electricity in the U.S. from 1980 to 2014 in constant 2008 dollars. Source: U.S. Bureau of Labor Statistics.

Since we are on the topic of energy, it makes sense to look at gasoline. Gasoline showed a fairly flat cost curve from the mid 1980s through the early 2000s before it went all to hell. There was a significant correction in late 2008, which is curious. And in spite of the fact that the prices are on the decline, they’re still historically high by around a dollar per gallon. I always hear about how gasoline prices decline in election years. That isn’t clear from this look, so that will be a topic I’ll dig into soon.

Unleaded regular gasoline prices from 1980 to 2014 in constant 2008 dollars. Source: U.S. Bureau of Labor Statistics.

Staying with energy, for those who heat with natural gas, you’ll see that current rates are close to average lows. But costs have clearly been on a roller coaster for the last 15 years.

Natural gas prices from 1980 to 2014 in constant 2008 dollars. Source: U.S. Bureau of Labor Statistics.

So what does this tell us? First, that someone has to be making some serious money in the ground beef industry. The price of ground beef is growing well above the rate of inflation, and I’ll bet that the difference is being pocketed by someone clever enough to pull it off. Second, I was surprised to see how some of the inflation-adjusted prices have actually declined. My instincts were to guess that I would see most costs on the rise, but flat or decreasing costs seem to be the rule and not the exception. This is, of course, a simple survey rather than an all-encompassing study. For all I know, every other product is skyrocketing. Given the slope of the inflation curve, retail prices are doing a good job of increasing at the register. May your income increase ever faster.

Posted on December 24, 2014. Categories: Analytics, Crops, History

Feeding America: The Extraordinary Increase in US Farm Productivity (Part 2)

After my last post on the remarkable increase in field crop yields in American agriculture, I was interested in seeing where else yield improvements were observed and hopefully getting a better understanding of the causes. I’m interested in understanding how much of an impact mechanization (i.e., tractors and harvesters) had as opposed to improvements in seed, fertilizers, and pesticides. Certainly all of these came together to boost productivity, but what had the greatest impact?

So on to (non-field crop) vegetables. The same USDA website has statistics for vegetables as well as for field crops, but to get any data of any historical significance, we can’t afford to be choosy. Only two crops have data going back to the pre-1970s days, and thankfully they go back to the 1860s. So here we have the crop yield data in cwt/acre (cwt is a centum weight, or hundredweight—a one hundred pound increment) for potatoes and sweet potatoes. Given what we have already seen, this isn’t terribly surprising. It looks much the same as the graphs for corn or wheat. That is, a relatively flat yield curve until the 1930s followed by a sharp upturn and a monotonic increase spanning seven decades. I still find this to be remarkable.

Potatoes and Sweet Potatoes yields in centum weight/acre (1868-2014).

And what about how the yield has grown? Looking below we can see that potatoes are 8 times more productive than they were some seventy years ago. Considering the crops we have looked at so far, that is the record (beating rice’s sevenfold yield increase).

Yield growth for Potatoes and Sweet Potatoes (1868-2014).

Explaining this is somewhat of a problem. Every crop we have looked at has shown the same yield curve behavior. But certainly not every crop had a sudden, massive successful hybridization improvement at the same time. And if pesticides and fertilizer were primarily responsible, then why wouldn’t we see a large step change in the yield curve instead of steady incremental growth? We do know that the tractor and other mechanized equipment came to popular use in the 1930s, but isolating its impact on crop yields is difficult.

To understand the effect of mechanization, we have to look at something else. We have to look at a product that doesn’t rely on fertilizers or pesticides. We have to look at an agricultural product where hybrid seed isn’t a factor. We have to look at a product where mechanization is the primary driver of yield growth. We have to look at milk.

Milk yield in lbs/head from 1924-2014.

And milk shows the exact same behavior. In 2014, a single dairy cow in the United States could be expected to produce close to 22,000 pounds of milk in a year. That is no small feat considering that the same cow was producing only about one-fifth of that at just over 4,000 pounds annually in 1924. (Well not the same cow…) This suggests that mechanization, and the knowledge behind it, has been the primary driving factor in the increase in farm productivity over the last 70 years.

Milk yield growth from 1924-2014.

And by knowledge I mean that it isn’t enough to make a milking system that pumps faster or has larger storage tanks to increase the time between handling. Today’s automated milking systems let a cow decide when she needs to be milked, enter a milking stall on her own, and have the system automatically attach and begin. This is an interesting video. I don’t know the language, but it shows we are a far cry from a three-legged stool and a bucket.

Now you might say, “Not so fast! The dairy industry gives cows hormones to increase milk production!” And while there is an element of truth there, it doesn’t explain what we see in the yield. In the 1930s it was found that Bovine Growth Hormone (BGH, also bovine somatotropin, or BST) boosted milk production. But to get BGH they had to extract it from the pituitary glands of cow cadavers, so it couldn’t be widely used. It wasn’t until the 1970s that the gene for producing it was identified. Recombinant bovine somatotropin (rBST) didn’t receive FDA approval for use until 1993, and wasn’t commercially available until 1994. It also isn’t clear that it has always made great economic sense, and adoption has fallen short of what commercial producers expected, at least at times. So hormones can’t explain a yield increase that began decades before they were available.

This takes us back to mechanization. Knowledge, ingenuity, and the desire to improve have led us to build equipment to do the job better on the farm. Better tractors, seed planters, harvesters, milkers, irrigation systems, and so on have allowed us to do more with less. And the most interesting thing of all is that in most of the data that we’ve seen, there is no leveling off in sight. We haven’t hit the maximum yet; production yields continue to increase each year, on average.

Predicting the future is hard, but it is very easy to look backward in time to see how we got to where we are today. I don’t imagine a farmer alive in the 1920s would ever have imagined that it would be possible to be getting the production levels we are getting today. Likewise, today it seems impossible that these numbers can continue to increase for another seven decades. In spite of that, I’m going to bet on the impossible coming to reality when it comes to feeding America.

Posted on December 14, 2014 Analytics, Crops, History, Population

Feeding America: The Extraordinary Increase in US Farm Productivity

A few nights ago I was re-reading a P.J. O’Rourke book on terribleness. In the chapter on famine, everyone’s favorite bedtime reading subject, something he said struck me. “In most of the world, food production has well outpaced the growth of population. In the 1930s American wheat growers had an average yield of thirteen bushels per acre. By 1970 the yield was thirty-one bushels. In the same period the corn yield went from twenty-six bushels per acre to seventy-seven.” This was unexpected. I have the picture in my mind of the differences between arithmetic and geometric progressions when it comes to comparing food production and population. This is courtesy of Rev. Malthus, whose treatise was nicely summarized in the same book on terribleness in the chapter on overpopulation: “…there’s no end to the number of babies that can be made, but you can only plant so much wheat before you run the plow into the side of the house.”

According to data from the fine folks at the USDA, the story here is rather interesting. Yes, the wheat yield in the 1930s was in the low teens. But what is surprising is that it had been there at least since the USDA began recording wheat crop yields in 1866! In other words, the amount of wheat an acre of farmland produced showed no significant improvements for at least some seventy years, until the early 1930s. Since then, however, it has increased at a roughly constant rate, reaching its all time high of 47.1 bushels per acre in 2013. And it hasn’t settled at that yield; it continues to increase.

U.S. wheat crop yields, in bushels per acre. 1866-2014.

It should come as no great surprise, then, that corn yield numbers tell essentially the same story. But I was surprised by the fact that the upturn in the yield happens at about the same date. Corn production was a flat 25 bushels per acre for decades and started its way upward at about 1930. For the curious, it peaked at 173.4 bushels per acre in 2014 and continues to climb as well. That is a huge number. A corn farmer in 1900 working hard to get his 28 bushels per acre never in his most fantastic dreams thought yields like this were possible.

U.S. corn crop yields, in bushels per acre. 1866-2014.

We can look at this another way, by plotting the corn yield against the wheat yield. And what we see is that the relationship is well behaved. When wheat yields increase, so do corn yields, though not necessarily in the same proportion. After a point, every 10 bushels/acre growth in wheat equals about a 40 bushel/acre growth in corn. This suggests there is more to the story — that there is some common factor that drives this effect.
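
As a rough check on that ratio, here is a minimal sketch that uses only figures quoted in this post. It is a two-point estimate, not a fit to the full USDA series: the starting yields are O’Rourke’s 1930s numbers and the ending yields are the USDA record highs, which fall in slightly different years (2013 for wheat, 2014 for corn). The variable names are my own.

```python
# Two-point slope estimate from figures quoted in the post.
# This is NOT a regression on the USDA data, only a sanity check.
wheat_start, wheat_end = 13.0, 47.1   # bushels/acre: 1930s (O'Rourke), 2013 record
corn_start, corn_end = 26.0, 173.4    # bushels/acre: 1930s (O'Rourke), 2014 record

# Corn yield gained per bushel/acre of wheat yield gained
slope = (corn_end - corn_start) / (wheat_end - wheat_start)
corn_gain_per_10_wheat = 10 * slope

print(f"corn gain per 10 bu/acre of wheat: {corn_gain_per_10_wheat:.0f} bu/acre")
```

The result comes out a little above 40 bushels/acre of corn per 10 bushels/acre of wheat, which is in line with what the plot shows.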

The crop yields for U.S. corn plotted against the yields of U.S. wheat for the years 1866-2014.

And there is more to the story. The USDA doesn’t just measure wheat and corn production. They determine the yields for all of the field crops. So that we can compare like things, first are the crop yields that are measured in bushels per acre: barley, corn, flaxseed, rye, sorghum, soybeans, and wheat. Of those, only rye and sorghum yields look as though they have stopped increasing. The rest show this continuously increasing trend over time, starting around the same year — 1930.

USDA crop yields (bushels/acre) for barley, corn, flaxseed, rye, sorghum, soybeans, and wheat.

And then we have the crop yields that are measured by weight (pounds per acre). These are: beans, cotton, hay, hops, peanuts, peppermint oil, rice, spearmint oil, sugarbeets, and tobacco. These also show the same behavior: yield growth beginning around 1930. Interestingly, hay and tobacco yields seem to have joined rye and sorghum yields in leveling off (showing classic error function behavior).

USDA crop yields (pounds/acre) for beans, cotton, hay, hops, peanuts, peppermint oil, rice, spearmint oil, sugarbeets, and tobacco.

If it wasn’t clear before, it is now. Sometime around the 1930s, something quite dramatic began to happen in field crop production. An agricultural revolution of sorts. And while the effect is present for all of them, it varies in its impact depending on the crop. To show this, we normalize the yield rate by the average of the first few years in the dataset for a given crop. This shows us how the yield rate has grown in time. And we can compare them all if we plot them all with the same y-axis scale (conveniently done below). So the biggest crops in terms of yield rate growth are: corn (~7x), cotton (~6x), peanuts (~5.5x), rice (~7x), sorghum (4-6x).

Yield growth in US field crops. Plotted is the multiplier in the yield rate since the start of data collection for the given field crop.
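
The normalization behind this plot can be sketched as follows. This is a minimal illustration rather than the exact procedure used for the graph: the function name `yield_growth`, the choice of five baseline years for "the first few years," and the sample numbers are all assumptions made for the example.

```python
from statistics import mean

def yield_growth(yields, n_baseline=5):
    """Divide each yield by the mean of the first n_baseline values.

    n_baseline=5 is an assumed stand-in for "the first few years".
    """
    if len(yields) < n_baseline:
        raise ValueError("need at least n_baseline values")
    baseline = mean(yields[:n_baseline])
    return [y / baseline for y in yields]

# Hypothetical yields in bushels/acre, made up for illustration (not USDA data)
example = [24.0, 26.0, 25.0, 25.0, 25.0, 50.0, 100.0, 175.0]
growth = yield_growth(example)
print([round(g, 2) for g in growth])
```

Because every crop is divided by its own starting level, crops measured in different units (bushels or pounds per acre) end up on the same dimensionless scale and can share one y-axis.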

I find it interesting that the curves for soybeans and cotton begin to increase in 1920, which is a few years before the others. Peanuts seem to be among the last to join the group, as their yield didn’t start taking off until about 1950. Did some experimentation take place with soybean and cotton crops and then, once successful, carry over to other crops such as wheat, corn, and then peanuts? Soybeans became quite important in the US around 1910, so this is plausible, though I haven’t found anything to suggest this is what actually happened.

We do know that hybrid seed became all the rage starting around 1930. Gregor Mendel demonstrated plant hybridization in the 1860s with peas, but it wasn’t until the 1930s that hybrid seed was able to produce a corn crop that was well suited to mechanical harvesting. One report says: “The tractor, corn picker, and hybrid seed corn came together to raise labor and land productivity in corn production in the late 1930s.” Tractor development (power takeoff, rubber tires) and sales really started to take off in the 1920s, which fits the timeline perfectly. And I’ll bet that improved chemical fertilizers and pesticides factor in as well.

In any case, I should go back to the original statement where O’Rourke said food crop yield rates were increasing faster than the population. In 1927 the world’s population is estimated to have been about 2 billion people. This is convenient for our comparison as it about coincides with our 1930 start of this agricultural revolution. Corn and wheat yields both doubled by about 1960, when the global population was about 3 billion (a 1.5x increase). So far, so good. And we reached about 7 billion people in 2012, an increase of a factor of 3.5 since 1930. This about matches the rate of increase in wheat production. But corn and rice have managed to stay ahead by a considerable margin, so O’Rourke is right and Reverend Malthus has been wrong. At least, since this “revolution” started.

But these crop yield numbers aren’t global numbers, they’re for domestic production. How do they compare against the growth in the U.S. population? Quite favorably, it turns out. In 1930 the U.S. population was about 123 million. It grew by about 50 percent to about 180 million in 1960, which is in line with global population increase we just considered, where crop yields for corn and wheat doubled. But in 2012 the U.S. population reached 313 million, a growth factor of only 2.5x over 1930. The slowest of the food crop yields shown above have at least grown at the same pace as the domestic population. But the major food crops like corn and rice are out in front by a mile, with their yields growing some 2.8 times faster than the growth in domestic population.
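
The comparison can be written out as a few lines of arithmetic. This sketch uses only the round figures quoted above; the roughly 7x multiplier for corn is the approximate value read off the yield growth plot, and the variable names are my own.

```python
# Population figures quoted in the text (approximate)
world_pop_1927 = 2.0e9
world_pop_2012 = 7.0e9
us_pop_1930 = 123e6
us_pop_2012 = 313e6

# Approximate corn yield multiplier read from the yield growth plot
corn_yield_growth = 7.0

world_growth = world_pop_2012 / world_pop_1927   # 3.5x
us_growth = us_pop_2012 / us_pop_1930            # about 2.5x
corn_vs_us = corn_yield_growth / us_growth       # how far corn is ahead

print(f"world population growth: {world_growth:.1f}x")
print(f"US population growth:    {us_growth:.2f}x")
print(f"corn yield growth relative to US population growth: {corn_vs_us:.2f}x")
```

With the unrounded US figure the ratio comes out near 2.75; rounding the population growth to 2.5x first gives the 2.8 quoted above.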

This is startling, but in a good way. It is a win for science and innovation and demonstrates how humans can manipulate their situation to work to their advantage. The primary food crop yields show no signs of leveling off. It makes sense that they probably will at some point. But for now it is nice to know that the Reverend Malthus is wrong.

Posted on June 10, 2013 Cemetery, History, Population

Death and Burial Trends in Amherst

When it comes to understanding how we use our cemeteries in Amherst, something was still missing in my mind after going through the population growth in town, the annual burial numbers in our cemeteries, and also the cemetery lot sales numbers. To quickly review, since 1934 we have averaged just 22 burials per year (+/- 6) in Amherst, NH cemeteries, with no trend up or down (see below). The lot sales numbers indicate that we sell on average fewer than 8 cemetery lots per year (between 1971 and 2008), indicating that most of the burials are in lots that have been previously purchased (perpetual care lots).

Annual number of burials in Amherst, NH cemeteries. Data from Department of Public Works records. The red line shows the average value of 22.

This is rather surprising in light of the town’s population growth in that time. As per the US Census, we have gone from about 1000 people in Amherst to over 11,000 in that amount of time (see below). So the conundrum is that even though there are 11 times more of us here now than there were 8 decades ago, the number interred in our cemeteries each year hasn’t changed. At all.

US Census data for the town of Amherst, NH.

This doesn’t make a whole lot of sense, but the numbers are all real. Something to ponder while we press on.

One missing piece of this puzzle is the number of Amherst residents who die every year. Now you might guess that the number of residents who die and the number of people who are buried here would be basically the same. This would make sense, but this would be wrong. The number of resident deaths in Amherst is tabulated and reported every year in the annual town reports, which are available in the town library’s archive room, and that is where I obtained the numbers (graphed below). These are still remarkably small numbers, though they do show a slow growth trend, tripling in 8 decades.

Amherst Resident Deaths from 1934 to 2012, as recorded in Annual Town Reports. The blue line shows the running average.

So here is the conundrum now: since 1934 the population of Amherst has increased by a factor of 11, the number of annual resident deaths has gone up by a factor of 3, and despite those increases, the burial rate has not changed.

There’s a twist to this story; we just haven’t seen it yet. It is difficult to plot resident deaths on the same graph as the burial numbers for comparison, but we do need to compare them. So instead let’s look at the ratio of burials to deaths. This will be a much more useful graph. Plotted here (see below) are our annual Amherst cemetery burials as a percentage of Amherst resident deaths each year. There is something quite surprising going on. Do you see it?

Burials in Amherst, NH cemeteries as a percentage of Amherst resident deaths. The unlikely percentages are caused by the “brought from away and buried in Amherst” burials.

It turns out that it has been quite common for people to be “brought from away and buried in Amherst.” In some years those numbers have exceeded the numbers of Amherst residents who died. When that happens, we get some statistically very unlikely burial percentages, which you see above. Now why people would be brought in for burial is a question for the historians. But up until ten years ago or so, these “brought from away” numbers were in the town reports. Whether or not this is still going on is an unanswered question at this point, and we have no information on the number “sent away” for burial.
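
To see how the percentage can exceed 100%, here is a minimal sketch. The counts are invented for illustration and are not taken from the town reports; the function name is my own.

```python
def burial_percentage(burials, resident_deaths):
    """Burials in town cemeteries as a percentage of resident deaths."""
    if resident_deaths <= 0:
        raise ValueError("resident_deaths must be positive")
    return 100.0 * burials / resident_deaths

# Hypothetical year (made-up numbers, not from the town reports)
resident_deaths = 10     # Amherst residents who died
resident_burials = 8     # of those, buried in Amherst
brought_from_away = 6    # non-residents buried in Amherst

total_burials = resident_burials + brought_from_away
pct = burial_percentage(total_burials, resident_deaths)
print(f"{pct:.0f}% of resident deaths")
```

In this made-up year only 80% of the residents who died were buried in town, yet the plotted percentage would be 140% because the numerator includes burials that have no matching resident death.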

One thing that is readily discernible here is that the percentage of dead which are buried is decreasing. Let’s assume for a moment that the number of “brought from away” burials is currently the same as those who are sent away. If this is true, the 2012 numbers show that about 50% of our residents who die are not buried. Now if the number “brought from away” is greater than those sent away, as it must have been for many years, this fraction of residents not buried increases. So what happens to the 50%+ who aren’t buried?

The answer, I believe, is cremation, an increasingly popular alternative to burial in this part of the country. That explanation is consistent with the numbers that are reported here, and also with reports from funeral homes in the area, which say that more than 60% are choosing cremation over burial. The numbers from Amherst are somewhat muddied because, as the head of DPW tells me, cremated remains are sometimes buried, which requires a lot and is recorded as a burial. So cremation rates in town are greater than the burial numbers alone would suggest.

Now I think we have a reasonably complete picture of how we use our cemeteries in Amherst. A useful thing to know for planning purposes would be the numbers of unsold and perpetual care (sold but unused) lots currently in town cemeteries. We could use these, along with what we know from historic numbers, to forecast the number of years left before new cemetery space is required.

Posted on May 25, 2013 Cemetery, History

Amherst Cemetery Lot Sales

There is one significant aspect of our town cemeteries that has managed to evade the conversation. Burial lots can be purchased at the time of need prior to a funeral, or they can be purchased in advance. Lots bought in advance are known as “perpetual care” lots, and it is not uncommon for people or families to make these purchases, sometimes in quantity, for the future.

We have already looked at the usage of cemetery lots in town by examining the annual interment data and finding a relatively constant average 22 burials/year over many decades. But what we have not looked at is how quickly new lots are being purchased. That is, how quickly are we purchasing the unsold lots in our existing Amherst cemeteries, and how quickly are we using up those which were already purchased? Understanding these numbers lets us understand the demands our town places on our cemetery land reserves.

This is a relatively straightforward question to answer, at least during a span of time over which we can get numbers. The fine people at our Department of Public Works have been very patient and most helpful with my requests for information. As it turns out, one of the annual reporting duties of the DPW is to notify the NH Attorney General’s office of the sale of cemetery lots (the AG’s office oversees cemetery trusts), so this information is already tabulated on an annual basis. Unfortunately there are some years (1982-1991) for which the numbers can’t be readily located in town records (these records used to be maintained in other places and they would seem to have either been lost/misplaced or were perhaps never recorded). I may request copies of these reports from the AG’s office, but I digress. What data I have now will suffice.

The graph below plots burials in Amherst cemeteries (in blue) during the years for which I was able to obtain lot purchase data (1971-2008), and also the lot sales data (in red). Note the 1982-1991 lapse in red points on the graph, which marks the missing record years. It is, I think, reasonable to assume that the trend bridges the gap here. In any case, the sale of new lots is a surprisingly small number, with the average over more than three decades being 7.6 cemetery lots sold per year.

Amherst cemetery burials and lot sales from 1971-2008.

If you are the sort of person who sees value in histograms, we can view it that way too. If you aren’t used to histograms, what this shows is a count of how many times each number shows up in each set. In other words, how many years there were sales or usage of a given number of burial lots. You can see that there were six years where six lots were sold (red), and also six years where 20 lots were used (blue) and each set peaks very close to its average lots/year value (7.6 for lots sold, 22 for burials). This is just another illustration of the separation between the number of lots used and sold each year.

Histogram view of the burial lots used and sold for the years 1971-2008.

These data give us some insight into how our cemeteries are used. It is quite clear from the above graphs that more burials are performed than there are sales of new lots each year. We have no good way of knowing if the lot purchases were made for immediate use, or if the lots were bought to be held in perpetual care. But let us assume they were purchased for immediate use. Now the difference between the two data sets establishes the lower limit to the number of previously sold (perpetual care) lots used for burial each year. This is graphed below.

Usage of previously purchased (perpetual care) cemetery lots in Amherst, 1971-2008 (lower limit).

This is useful because it can be used to calculate the lower bound on the percentage of burials in Amherst which draw from the previously sold, or perpetual care, lots, as plotted below. The average value is 68.3%.

Percentage of Amherst cemetery burials using previously held (perpetual care) lots. Note that this is a lower bound because we are assuming that lot sales go to immediate burials and not to perpetual care.
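
The lower-bound calculation can be sketched like this. The function name is hypothetical, and clamping the difference at zero (for a year in which more lots were sold than burials performed) is my own assumption for the example. The inputs below are the two long-run averages quoted in this post, not the year-by-year records.

```python
def perpetual_care_lower_bound(burials, lots_sold):
    """Lower bound on burials that used previously purchased lots.

    Assumes every lot sold in a year is used for a burial that same year.
    Clamping at zero is an assumption made for this sketch.
    """
    if burials <= 0:
        raise ValueError("burials must be positive")
    used = max(burials - lots_sold, 0)
    return used, 100.0 * used / burials

# Long-run averages quoted in the text
avg_burials = 22.0
avg_lots_sold = 7.6

used, pct = perpetual_care_lower_bound(avg_burials, avg_lots_sold)
print(f"at least {used:.1f} burials/year ({pct:.1f}%) used perpetual care lots")
```

Applied to the two averages, this gives about 65%, a little below the 68.3% quoted for the graph. Averaging the percentage year by year generally gives a different number than taking the percentage of the two averages, so the two figures need not match exactly.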

What these data indicate is that an average of at least 68% of burials in Amherst each year are performed using perpetual care lots (i.e. lots which were not purchased in the burial year). The lot sales numbers from DPW indicate that 7.6 new Amherst cemetery lots are sold on average each year, which is the actual demand for new cemetery land in Amherst. As the town’s population surpasses 12,000 today, I will admit to being rather surprised by this.

The obvious followup question here is how many unsold plots remain in our existing cemeteries? This is one which can only be answered by the DPW. The other question is what is the difference between the number of deaths in town each year and the burial numbers. Understanding this would help us to interpret what we see here.

Posted on May 23, 2013 (updated May 27, 2013) Cemetery, History, Population

Amherst Cemetery Usage

With the cemeteries in Amherst making headlines in local news, I thought it was appropriate to obtain some numbers on the historic demand for the town’s cemetery land. The actual number of burials per year is only one facet of the issue, though. The other big aspect, which has not been a part of public discussion as far as I am aware, is the sale of burial plots. I am also working on those data in order to provide a more complete picture of our resource demands. That will have to wait, though. (Note: Now complete.)

The town’s Department of Public Works has been very helpful with providing access to the town’s burial records (for which I am most appreciative), which go back to 1934 and are graphed below. These data conclude with the current year’s burial numbers, which, this being May, should be understood to be incomplete.

Annual number of burials in Amherst, NH cemeteries. Data from Department of Public Works records.

This graph nicely shows the fluctuations in the annual interment rates, and indicates that the data set is well represented by its mean value (red line). In order to make some other comparisons, however, we can determine the average number of burials per year, averaged over a decade, and replot. The graph below shows the average number of burials per year in the decade preceding the data point. So the 20.5 value in 1950 is the average number of burials from 1940-1949, and so on. The trend was clearly upward for the 2000-2010 decade, though it has fallen back to an average of 25 since 2010, as the graph above illustrates (not counting this year for obvious reasons).

Annualized mean burials in Amherst, NH cemeteries.
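
The decade averaging described above can be sketched as follows. The function name and the sample burial counts are invented for illustration; the real numbers are in the DPW records.

```python
from statistics import mean

def preceding_decade_means(burials_by_year):
    """Map each decade year (e.g. 1950) to the mean over the ten
    preceding years (1940-1949). Incomplete decades are skipped."""
    first, last = min(burials_by_year), max(burials_by_year)
    start = first + (-first) % 10   # first year ending in 0 at or after `first`
    result = {}
    for decade_start in range(start, last + 1, 10):
        years = range(decade_start, decade_start + 10)
        if all(y in burials_by_year for y in years):
            result[decade_start + 10] = mean(burials_by_year[y] for y in years)
    return result

# Hypothetical burial counts for 1934-1949 (made up, not DPW data)
sample = {year: 20 + year % 2 for year in range(1934, 1950)}
decade_means = preceding_decade_means(sample)
print(decade_means)
```

The partial stretch from 1934 to 1939 is dropped because it does not cover a full decade, so the only point produced for this sample is the one plotted at 1950.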

In a previous post on the town’s population growth, the US Census was the primary source of data. There, the data fall on the decade years and show Amherst’s considerable growth. I have replotted these Amherst population data below, restricting the graph to just the years for which I have burial data. In the 60 years shown here, the population of the town increased from 1,461 to 11,201 (by count of the US Census Bureau).

US Census population data for Amherst during the burial data years.

The remarkable thing about this is that while the town’s population grew roughly 7.7-fold (11,201 in 2010 vs. 1,461 in 1950, an increase of about 667%), the interment rate has remained largely flat. In other words, a shrinking percentage of the town’s population is being interred over this time, as shown graphically below. This should not be confused with our absolute burial space demand, which is clearly demonstrated by the first graph in this post.

Percentage of the population of Amherst buried, averaged for each decade.
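
A back-of-the-envelope version of this calculation is sketched below. It assumes a flat 22 burials per year at both ends, which is the long-run mean rather than the actual decade averages used for the graph, so the output only illustrates the size of the effect.

```python
# Assumption: a flat long-run mean of 22 burials/year at both dates.
# The plotted values use decade averages, which differ somewhat.
avg_burials_per_year = 22

# US Census counts quoted in the text
census = {1950: 1461, 2010: 11201}

pct_buried = {
    year: 100.0 * avg_burials_per_year / population
    for year, population in census.items()
}

for year, pct in pct_buried.items():
    print(f"{year}: {pct:.2f}% of the population buried per year")
```

Under that assumption the annual share falls from about 1.5% of the population in 1950 to about 0.2% in 2010, simply because the denominator grew while the numerator did not.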

This is a very surprising result, and it is probably not attributable to longevity, although life expectancy numbers in the US have increased (see below). I have no numbers on cremation or private cemetery usage, but those are likely factors in play.

Historic life expectancy in the United States.

What the population and interment numbers do tell us is that there are more of us in town every year, but a relatively fixed number of us are buried here annually.

Posted on May 22, 2013 (updated May 27, 2013) History, Population

Historic Population and Growth of Amherst and Neighboring Towns

The town of Amherst, NH has had much growth in the past few decades. Some insight into that growth can be found by digging through town records as published in our annual Town Reports (available in the Reference Room in the town library). Page 80 of the town report for the year 2000 provides data on the town’s annual population, as taken by Selectmen’s census, since 1960 and is shown here. I find it interesting that we had only 2000 people in town in 1960. These are very informative data, but are somewhat limited in value because of their short time scale.

Amherst, NH population as recorded in annual town reports.

The US Census records the decennial population, something they’ve been doing since 1790, which is plotted below for Amherst from 1910 until the most recent one in 2010. Take note that the US Census data and the Selectmen’s Census from above do not generally agree in their absolute numbers, though they do follow the same trends during the years they overlap. These US Census data, while they do not contain the fine level of detail that the Selectmen’s Census does, paint a much broader picture of the history and growth of the town and are useful for that analysis.

US Census data for the town of Amherst, NH.

From this information, we can consider how and when our population has changed significantly. For this, we will examine the percentage change in the population of the town from the previous decennial census (graph appears below). These values paint a remarkable picture of the town’s growth. The 1960 and 1970 decades (1970 and 1980 census values) show enormous growth in the town. Between 1960 and 1970, the town’s population more than doubled (from 2061 to 4605). And from 1970 to 1980 it almost doubled again (4605 to 8243). After 1980, the growth rate plummeted and has remained relatively low.

The percentage change in the population of Amherst, NH from its previous decennial US Census.
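
The percentage change plotted here is simple to compute. This minimal sketch uses only the three census values quoted above; the function and variable names are my own.

```python
def percent_change(previous, current):
    """Percentage change from the previous census count to the current one."""
    return 100.0 * (current - previous) / previous

# US Census counts for Amherst quoted in the text
census = {1960: 2061, 1970: 4605, 1980: 8243}

years = sorted(census)
changes = {
    cur: percent_change(census[prev], census[cur])
    for prev, cur in zip(years, years[1:])
}

for year, change in changes.items():
    print(f"{year}: {change:+.1f}% vs. previous census")
```

A change above 100% means the population more than doubled, which is the case for the 1970 census (about +123%); the 1980 census comes in at about +79%, or "almost doubled."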

To understand if this trend was broad or simply localized to Amherst, we can look at the same historic data for nearby towns. The US Census populations of Bedford and Hollis are plotted together below with the Amherst data from above. Note the similarly large population growth at approximately the same times.

US Census data for the towns of Amherst, Bedford, and Hollis, NH.

We can also calculate the percent change for Bedford and Hollis and plot those data with our Amherst data from above. The absolute values vary somewhat, but the data for the three towns all have in common several decades of large growth which peaked around the 1970 time period.

The percentage change in the population of Amherst, Bedford, and Hollis NH from their respective previous decennial US Census.

From these combined charts, we can conclude that the rapid population growth in the 1960s and 1970s was not localized to just Amherst. Amherst and its neighboring towns experienced a population boom in the decades following the “baby boom” (1946-1964).

Perceptual Contrast