The Big Friday Effect

The Big Friday effect is my name for the observation that more mail is delivered on Friday than on any other day of the week.  It became apparent very quickly as I began to analyze my USPS mail and has persisted through the entire year that I’ve been conducting this mail-counting experiment.  You can see the Big Friday effect in the figure below, which plots the total number of pieces of mail I’ve received by weekday.  It is a curious effect with an interesting cause.

The quantity of mail, by category, by the day of the week it was delivered. Notice that Friday is significantly higher than the other weekdays.

I wanted to look more deeply into the weekday distribution to understand what is behind Big Friday.  I analyzed the mail from individual senders to see how it was distributed throughout the week, restricting the analysis to the 15 largest senders (see below).  This limits me to senders with mail volumes of roughly 1 item per month or more.  Any sender with less volume than that can’t have much of an impact on any given day.

Mail totals for the top 15 senders of mail (to me), broken down into categories.
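As a rough illustration of how such a tally can be put together, here is a sketch in R.  It assumes a data frame named `mail` with one row per delivered item and columns `sender`, `category`, and `date`; those names, and the use of dplyr, are my own placeholders rather than the actual analysis code.

# Tally deliveries per sender per weekday for the 15 highest-volume senders.
library(dplyr)

top15 <- mail %>%
  count(sender, sort = TRUE) %>%    # total items per sender, largest first
  slice_head(n = 15) %>%
  pull(sender)

mail %>%
  filter(sender %in% top15) %>%
  mutate(weekday = weekdays(date)) %>%
  count(sender, weekday)            # deliveries per sender per weekday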

Plotting each sender’s mail by weekday is revealing.  Most of them have mail delivery distributed throughout the week, which is what you would expect for a mostly random sending process.  There are two notable exceptions — senders with the majority of their deliveries concentrated into a single weekday.

The weekday mail totals for the top 15 senders. Note that the y-axes all have independent scales. Note the scale for Redplum.

The Amherst Citizen is a local newspaper that is generally delivered on Wednesdays.  This does give Wednesday a boost in the top figure, but it isn’t a huge contributor because it doesn’t come out every week.  It is also not that reliably delivered on Wednesday, with Thursday deliveries running about 1/3 as many as Wednesday’s.  But look at Redplum (lower left), the well-known junk mail merchant.  With 44 deliveries on Friday alone, it dominates the weekday totals.  Thursday and Saturday have two each.  Considering that there are 54 weeks of mail deliveries in these numbers, Redplum would seem to be very effective at getting advertisements and coupons delivered to households just in time for weekend shopping plans.

To illustrate just how strongly Redplum impacts the numbers, we can look at the mail by weekday for these top 15:

Mail received by the listed senders by the weekday it was received. Note the large contribution from Redplum on Friday.

And then without Redplum’s contribution.  And so we find that Big Friday is all about one junk mail marketer being very precise with their product’s delivery.

Mail received by the listed senders by the weekday it was received. Redplum was removed to illustrate its effect.

Book Review: How To Create A Mind — Ray Kurzweil

If you set out to build a computer that could think like a person, where would you start?  What would you want to know?  How would you test your ideas?

When we build a computer simulation, we often start by studying the thing we want to simulate.  We then build a theory or a mathematical model of how it works.  And then we test the model against cases where we already know the answer and use the results to update the model.  Lather, rinse, and repeat until you get something useful or you determine that your model just doesn’t work.

“All models are wrong, but some models are useful.” —George E. P. Box

We can’t start to understand the mind from the perspective of trying to build one without first studying the brain.  Lots of animals have brains, so what is it that distinguishes mammalian brains from those of the lower animals?  And how do human brains differ from those of other mammals?  Can we isolate key functions and put together a good working theory of operation?

Ray Kurzweil, in How To Create A Mind, takes this approach.  Kurzweil is an inventor, an engineer, an entrepreneur; he is not a neuroscientist.  He quite clearly intends to see this work carried to its conclusion, when an electronic mind at or beyond human intelligence becomes reality.

Having talked with people who have worked in the field of Artificial Intelligence (AI), I think a few remarks are appropriate before continuing.  First, the term “Artificial Intelligence” makes some people shudder.  This seems to be due in part to the fact that the field didn’t advance as quickly as everyone believed it would in the 20th century, but also in part to the fact that modern “smart” computers remain unable to perform anything even close to “common sense,” even those that use algorithms like the ones Kurzweil proposes.  The argument goes that since the brute-force capabilities of, say, Deep Blue or Watson are so vast, their “smarts” come simply from immense computational capability and not from any particularly smart algorithms.  In essence, since Watson or Siri doesn’t “understand” you the way other humans do, they never will.  End of story.

There is some truth here.  Even advanced modern computers can’t make logical inferences like a human can.  I just asked Siri, “Tell me something I should know.”  She responded with, “That may be beyond my abilities at the moment.”  But I am not convinced that is the end of the story.  Nate Silver, in The Signal and the Noise, talks a lot about forecasts by experts, and much of what he says suggests we shouldn’t give them too much weight, given just how often their predictions are terrible.

I’m very persuaded by Kurzweil’s Law of Accelerating Returns enabling things in the not too distant future that we can’t imagine as possible today.  There is simply too much evidence in support of it to ignore.  The capabilities of today’s computers would shock the engineers who built the ENIAC.  In 1949, Popular Mechanics suggested that computers might someday weigh less than 1.5 tons.  Ken Olsen, founder of Digital Equipment Corporation, famously said in 1977 that, “There is no reason anyone would want a computer in their home.”  These dates aren’t that far in the past, so it is clear that very bright people can suffer from a remarkable lack of vision, particularly in an industry where technological capabilities double in a span of less than two years.  So I think it is reasonable to expect that the continuing growth of information processing capability will give us some pretty amazing things in the years to come.  Exactly what they’ll be is less certain.

“If a…scientist says that something is possible he is almost certainly right, but if he says that it is impossible he is very probably wrong.” —Arthur C. Clarke

Kurzweil proposes the neocortex as the key differentiating element in advanced thinking, and he proposes pattern recognition as the key neocortical function as part of his Pattern Recognition Theory of Mind.  The American neuroscientist Vernon Mountcastle discovered the columnar organization of the neocortex, its fundamental building block, in 1957.  This organization, the cortical column, exists through pretty much the entire neocortex, regardless of whether it is processing speech, vision, hearing, etc.  Kurzweil proffers that this single processing unit, fed different inputs, can execute largely the same algorithm (pattern recognition) to achieve the necessary results, whether it is working on vision, speech, or hearing.  We know that one area of the brain can do the work of others when necessary — an effect known as plasticity.  This is well documented and gives key support to the idea of a common algorithm being used throughout the neocortex, though not specifically to it being pattern recognition.

But the approach is very effective.  Kurzweil long ago started a company to create natural language processing software.  You know it today as Nuance, the folks who make Apple’s Siri assistant work.  When the company was developing algorithms for natural language processing, lots of different approaches were tried.  It was the pattern recognition approach, implemented using a hidden Markov model, that was by far the most successful.  Kurzweil argues that Siri, when attempting to process your request, performs an algorithm very similar to the one your brain must use to process language, and that this should be thought of as a form of intelligence.  I find his arguments somewhat persuasive, but I have a colleague who argues quite strongly against that interpretation and supports his position well.  It is certainly food for thought while there are no objective answers.
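To give a flavor of the machinery involved, here is a toy sketch in R of Viterbi decoding for a two-state hidden Markov model.  This is not code from the book or from any speech product; the states, symbols, and probabilities are invented for illustration, and real recognizers work on acoustic features with vastly larger models.

# Toy HMM: hidden states emit observable symbols; Viterbi finds the most
# likely hidden-state sequence for an observed symbol sequence.
states  <- c("vowel", "consonant")              # hypothetical hidden states
symbols <- c("a", "b", "c")                     # hypothetical observed symbols

start <- c(vowel = 0.6, consonant = 0.4)        # initial state probabilities
trans <- matrix(c(0.7, 0.3,
                  0.4, 0.6),
                nrow = 2, byrow = TRUE, dimnames = list(states, states))
emit  <- matrix(c(0.5, 0.4, 0.1,
                  0.1, 0.3, 0.6),
                nrow = 2, byrow = TRUE, dimnames = list(states, symbols))

viterbi <- function(obs) {
  n    <- length(obs)
  v    <- matrix(-Inf, length(states), n, dimnames = list(states, NULL))
  back <- matrix(NA_integer_, length(states), n)
  v[, 1] <- log(start) + log(emit[, obs[1]])
  for (t in 2:n) {
    for (s in seq_along(states)) {
      cand       <- v[, t - 1] + log(trans[, s])   # score of arriving in state s
      back[s, t] <- which.max(cand)
      v[s, t]    <- max(cand) + log(emit[s, obs[t]])
    }
  }
  path <- integer(n)
  path[n] <- which.max(v[, n])
  for (t in (n - 1):1) path[t] <- back[path[t + 1], t + 1]
  states[path]
}

viterbi(c("a", "c", "b", "a"))   # most likely hidden-state sequence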

In spite of the fact that the author is not a neuroscientist, there is a lot of neuroscience in these pages.  Here you’ll read about the exciting work of Benjamin Libet, V. S. Ramachandran, Michael Gazzaniga, and others, and dig into the concepts of free will versus determinism, decision making, and consciousness.  What it all comes down to, in Kurzweil’s view, is that human brains execute a complex algorithm, that modern technology isn’t yet capable of this level of complexity, but that it will be someday.  Given that, how willing will we be to accept the consciousness of intelligent machines?  What will a machine need to do to convince a human it is conscious?  Is the Turing test enough?  You’ll have to come to your own conclusions here.  Given the way my children interact with Siri, I suspect that Kurzweil’s assumption of ready adoption by the next generation (though perhaps not older ones) is probably correct.

This is relevant because Kurzweil predicts that the “someday” when truly intelligent machines arrive will begin in 2029.  If you’re familiar at all with any of his previous work, you know that his Law of Accelerating Returns is pervasive in his thought: technological progress increases at an exponential rate, and around 2029 is when he predicts technology will be sufficiently mature to support strong AI.  This is from the perspective of raw processing capabilities, not from extrapolating the successful demonstrations of any sort of machine intelligence.  Mind you, a machine has just passed the King’s Wise Men self-awareness test.  Kurzweil might be right.

But is brute force processing enough for the emergence of a conscious mind?  Kurzweil certainly thinks so.  But I don’t believe that USC neuroscientist Antonio Damasio would agree with him.  In his own writings, Damasio argues that consciousness grew out of a concept of self, which in turn is a function of biological value.  As individual biological cells organized into increasingly complex systems, their own evolutionary survival came to depend on a higher level of cooperative activity.  Each cell’s natural inclination toward survival is the driving force in his view, and the connections that cells make to the brain through the nervous system amplify this survival instinct.  Damasio sees feelings and emotions as part of the mapping that the brain does of the body, a feedback mechanism for understanding how it is doing, and consciousness as built up in this way and for this reason.  It is a wildly different view from Kurzweil’s in the sense that the driving force is not related to computational complexity.  Instead, it is a hierarchical, evolution-driven survival behavior.  This raises the question: can a machine, without a biological survival instinct, develop a concept of the self and ultimately consciousness?  I expect more time will be spent pondering this question in the near future, not less.

“When we reflect upon the manifold phases of life and consciousness which have been evolved already, it would be rash to say that no others can be developed, and that animal life is the end of all things.  There was a time when fire was the end of all things: another when rocks and water were so.” —Samuel Butler, 1871

This book was hard to put down.  Kurzweil very thoroughly researches and documents his material, and whether you find him to be a genius or perhaps slightly insane, he always makes a strong case for his position.  It isn’t easy to go on the record with the sorts of predictions that Kurzweil has come to be known for, and few people do it.  But he’s smart, he’s gutsy, and he’s right far more than he’s wrong.  Spending a few hundred pages in the brilliance of Kurzweil is time well spent.

Book Review: Who’s In Charge — Michael Gazzaniga

Philosophers, theologians, and scientists have debated the concept of free will for centuries.  The very likable concept of free will is at odds with the very nature of our observations — things tend to have causes.  And so the question remains, how much decision-making freedom do we really have?  And what’s this conscious “we” concept, while we’re at it?

In the 1800s, a rather well-known mathematician made a bold statement with some long-ranging consequences.  He said that if we knew the position and momentum of every particle in the universe, then we could calculate forward in time and predict the future.  This built on the foundations of the physical laws that Isaac Newton observed and was the start of what became known as determinism.  Since the brain is subject to physical laws (in essence, its functionality is a complex set of chemical reactions), this leaves no room for free will, which pretty much everyone believes they have.

“We may regard the present state of the universe as the effect of its past and the cause of its future.  An intellect which at a certain moment would know all forces that set nature in motion, and all positions of all items of which nature is composed, if this intellect were also vast enough to submit these data to analysis, it would embrace in a single formula the movements of the greatest bodies of the universe and those of the tiniest atom; for such an intellect nothing would be uncertain and the future just like the past would be present before its eyes.”

—Pierre Simon Laplace, A Philosophical Essay on Probabilities

As modern neuroscience has developed, our understanding of the brain and of the mind has progressed.  But as with all good research, every good answer leads us to more questions.  Michael Gazzaniga is a neuroscientist and a professor at one of my alma maters.  In Who’s In Charge he takes you on a journey through the concepts of emergence and consciousness, the distributed nature of the brain, the role of the Interpreter, and ultimately how these might change what we think about free will.

The conscious mind has considerably less control over the human body than it would like to believe.  This is the underlying theme of the book.  Gazzaniga’s personal career with split-brain patients (a treatment for severe epilepsy), and his review of modern neuroscience are convincing to that effect.  While it is nice to think that “we” call the shots, it becomes clear that “we” aren’t always in charge and who this “we” is has some interesting properties.

“The brain has millions of local processors making important decisions.  It is a highly specialized system with critical networks distributed throughout the 1,300 grams of tissue.  There is no one boss in the brain.  You are certainly not the boss of the brain.  Have you ever succeeded in telling your brain to shut up already and go to sleep?”

What our conscious self thinks is largely the result of a process that takes place in the left hemisphere of our brain.  Gazzaniga calls it “the interpreter.”  This process’s job is to make sense of things, to paint a consistent story from the sensory information that enters the brain.  Faced with explaining things that it has no good data for, the interpreter makes things up, a process known as confabulation. There is a story of a young woman undergoing brain surgery (for which you are often awake).  When a certain part of her brain was stimulated, she laughed.  When asked why she laughed, she remarked, “You guys are just so funny standing there.”  This is confabulation.

“What was interesting was that the left hemisphere did not say, “I don’t know,” which truly was the correct answer.  It made up a post hoc answer that fit the situation.  It confabulated, taking cues from what it knew and putting them together in an answer that made sense.” 

But the brain is even stranger than that.  If you touch your finger to your nose, the sensory signals from the finger and from the nose take measurably different times to reach the brain.  Different enough that the brain receives the signal from the nose well before it receives the signal from your finger.  It is the interpreter that alters the times and tells you that the two events happened simultaneously.

This is where neuroscience’s contribution to the sensation of free will comes into play.  Gazzaniga says, “What is going on is the match between ever-present multiple mental states and the impinging contextual forces within which it functions.  Our interpreter then claims we freely made a choice.”  This is supported by Benjamin Libet’s experiments, which demonstrated that the brain is “aware” of events well before the conscious mind knows about them.  Libet even goes so far as to declare that consciousness is “out of the loop” in human decision making.  This is still hotly debated, but fascinating.

Gazzaniga argues that consciousness is an emergent property of the brain.  Emergence is, in essence, a property of a complex system that is not predictable from the properties of the parts alone.  It is a sort of cooperative phenomenon of complex systems.  Or, as Gazzaniga more cleverly puts it, “You’d never predict the tango if you only studied neurons.”  Emergence is a part of what’s known as complexity theory, which has increased in popularity in the last decade or so.  But at this point, designating something as an emergent property is still really just a way to say you don’t know why something happens.  And despite all the advances that have been made in neuroscience, we still fundamentally don’t understand consciousness.

Gazzaniga makes the case that the development of society likely had much to do with the development of our more advanced cognitive abilities.  That is, as animals developed more social behavior, increased cognitive skills became necessary, and this was probably the driving force in the evolution of the neocortex.

“Oxford University anthropologist Robin Dunbar has provided support for some type of social component driving the evolutionary expansion of the brain.  He has found that each primate species tends to have a typical social group size; that brain size correlates with social group size in primates and apes; that the bigger the neocortex, the larger the social group; and that the great apes require a bigger neocortex per given group size than do the other primates.”

There is some physiological evidence to support the relationship between society and neocortical function in the case of mirror neurons.  They were first discovered in macaque monkeys: when a monkey grabbed a grape, the same neuron fired both in the grape-grabbing monkey and in another monkey who watched him grab it.  Humans have mirror neurons too, though in much greater numbers.  They serve to create a sympathetic simulation of an event which drives emotional responses.  That is, the way we understand the emotional states of others is by simulating their mental states.  So when Bill Clinton told Bob Rafsky that he felt his pain, perhaps he really did.

This is not Gazzaniga’s first book, and it shows.  The work is well planned and executed.  He uses clear language to describe some of the wonderful discoveries of modern neuroscience and makes them accessible for laymen to learn and enjoy.  He discusses his own fascinating research, for which he is well known in his field, along with other hot topics in neuroscience and their implications for modern society and for the free will debate.  He ends the book by discussing how modern neuroscience can and should be used in regard to the legal system, which caught me somewhat by surprise.  It is a fine chapter, but it doesn’t read like the rest of the book, feeling like a separate work that was added as an afterthought.

I enjoyed Who’s In Charge? immensely.  It is an excellent read and will undoubtedly challenge some of your thoughts and enlighten you about how we think about the mind and the brain today.

Bubble Markets, Burst Markets

Wall Street forecasters are notoriously bad at predicting what the markets are going to do.  In fact, the forecasts for 2001, 2002, and 2008 were worse than guessing.  Granted, predicting the future is a hard job, but when it comes to stock markets, there are some things you can count on.  Disclaimer:  This is a look at the numbers; it is not investment advice.

Let’s take the Standard & Poor’s 500.  It is an index of the stock values of 500 large US companies, much broader than the Dow Jones’s 30-company average.  It isn’t a stock, and you can’t buy shares in it.  But it is a convenient tool for tracking the overall condition of the stock market.  It may also reflect the state of the economy, which we’ll look at in a bit.  Below are the monthly closing values of the S&P 500 since 1950.  Its value was about 17 points in January of 1950, and it closed around 2100 points here in June of 2015.  It’s bounced around plenty in between.

Closing values of the S&P 500 stock index.

One of the questions to ask is whether the markets are overvalued or undervalued.  Forecasters hope to predict crashes, but also to look for good buying opportunities.  Short term fluctuations in the markets have proven to be very unpredictable.  But longer term trends are a different story, and looking at them can give huge insights into what’s currently going on.

But first we have to look at the numbers in a different way.  The raw data plot above makes things harder than they need to be because it doesn’t let you clearly see the trend in how the index grows.  Stocks have a tendency to grow exponentially in time.  This is no secret, and most of the common online stock performance charts give you a log view option.  Exponential growth is why advisors recommend that most working people get into investments early and ride them out for the long haul.

The exponential growth in the S&P is easy to see in the plot below, where I plotted the logarithm of the index value.  For convenience I also plotted the straight-line fit to these data — this is its exponential trend.  Note that these data span six and a half decades, so we have some bull and bear markets in there — and whatever came in between them.  And what you see is that no matter what short-term silliness was going on, the index value always came back down to the red line.  It didn’t necessarily stay there very long, but the line represents the stability position.  It is a kind of first-order term in a perturbation theory model, if you will.  The line shows the value that the short-term fluctuations move around.

Here I’ve taken the logarithm (base 10) of the index values to show the exponential growth trend. The grey area represents the confidence intervals.
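For the curious, the fit itself is nothing exotic.  Here is a minimal sketch in R, assuming a data frame `sp500` with columns `date` and `close` holding the monthly closes (placeholder names, not the original script):

# Fit a straight line to log10 of the index; back-transform to get the trend.
sp500$years <- as.numeric(sp500$date - min(sp500$date)) / 365.25
fit         <- lm(log10(close) ~ years, data = sp500)
sp500$trend <- 10^predict(fit)    # model value, back in index points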

This return to the line is a little bit clearer if we plot the difference between the index and the trend.  This would seem to be a reasonable way to spot overvalued or undervalued markets.  Meaning that in 2000, when the S&P was some 800 points over its long-term model value, the corresponding rapid drop back down to the line should have caught no one by surprise.

Differences between the S&P 500 index value and the exponential trend model value.

But this look at the numbers is a bit disingenuous.  That’s because the value of the index has changed by huge amounts since 1950, so small point swings that we don’t care about at all today were a much bigger deal then.  This makes more recent fluctuations appear to be a bigger deal than they really are.  So what we want to see is the percentage change, not the absolute change.
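Continuing the sketch from above (same assumed `sp500` frame and `fit`), both flavors of deviation are one-liners:

sp500$deviation <- sp500$close - sp500$trend              # deviation in index points
sp500$pct_dev   <- 100 * sp500$deviation / sp500$trend    # deviation relative to the trend value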

And on top of this, let’s mark recession years (from the Federal Reserve Economic Data, FRED) in red.  From this view we can see the bubble markets develop and the panics that follow when they burst (hello, 2008).  And we see that every recession brought a drop in the index (some bigger than others), but not every index drop represented a recession.  In the tech bubble of the late 1990s the market was 110% overvalued at its peak.  The crash of 2008 dropped it to about 45% below the model value, which is considerably undervalued.  All that in 8 years.  I think it’s safe to call that a panic.  I know it made me panic.

Deviations in the S&P 500 index value from the exponential model are shown as a percentage of the index values. And recession years (from FRED) are shown in light red.

What we see is that the exponential model does a good job at calculating the baseline (stable position) values.  If it didn’t, the recession-related drops in the index wouldn’t line up with the FRED data, and things like the 1990s bubble and the 2008 financial meltdown wouldn’t match the timeline.  But they do.  Quite well, actually.  So this is a useful analysis tool.

It is also enlightening to take the same look at the NASDAQ composite index, since it represents a different slice of the stock market.  The NASDAQ started in 1971 and is a more technology-focused exchange.  The composite index is built from all of the stocks listed on the NASDAQ exchange, more than 3000 of them.  So more companies in the index means this is a broader look, but one focused on tech stocks.

So, as with the S&P above, here are the raw data.  They look similar to the S&P, and the size of the tech bubble is clearer.  The initial monthly close of the index was 101 points, and it is over 5000 today.

Closing values of the NASDAQ stock index.

Not surprisingly, this index also grows with an exponential trend.  The NASDAQ was absolutely on fire in the late 1990s.  I wonder if this is what Prince meant when he wanted to party like it was 1999.  Maybe he knew that would be the time to cash out?

Here I’ve taken the logarithm (base 10) of the index values to show the exponential growth trend. The grey area represents the confidence intervals.

The size of the dot-com bubble is clearer if we look at the deviation from the model, as we did with the S&P.  At the height of the tech bubble, the NASDAQ was about 3500 points overvalued.  Considering that the model puts its expected value at about 1300 points in 2000, I have to ask myself, what were they thinking?

Differences between the NASDAQ index value and the exponential trend model value.

The percent deviation plot shows this very clearly.  At the height of the tech bubble, the NASDAQ was some 275% overvalued, almost three times the S&P 500’s overvaluation.  Before the late 1990s the NASDAQ had never strayed more than about 50% from the model value.  Warren Buffett has said that the rear view mirror is always clearer than the windshield, but maybe Stevie Wonder shouldn’t be the one doing the driving.

Deviations in the NASDAQ index value from the exponential model are shown as a percentage of the index values. And recession years (from FRED) are shown in light red.

From this perspective, the NASDAQ today actually looks a few percentage points undervalued, so tech still seems to be a slightly better buy than the broader market (this is not investment advice).

Not only that, but the growth model of the NASDAQ, based on its 45 years of data, shows that it grows considerably faster than the broader market.  If you go back and look at the raw data for either of the two indices, you’ll notice something special about the nature of exponential growth: the time it takes to double (or triple, etc.) is constant.  Since these are bigger numbers, and because it is convenient, let’s look at the time it takes to grow by a factor of ten (to decuple).  The S&P 500 index decuples every 33.3 or so years.  The NASDAQ composite, on the other hand, decuples every ~24 years (about 23 years and 11 months, give or take).  This has huge implications for growth.  That’s nine fewer years to grow by the same factor of 10.
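The arithmetic behind those decupling times is worth a line or two; this is my own restatement of the figures above, not additional data.  If the straight-line fit in log space is $\log_{10} V(t) = a + b\,t$, then the index grows by a factor of ten whenever $b\,T = 1$, so the decupling time is

$$T_{\times 10} = \frac{1}{b}.$$

A decupling time of 33.3 years therefore corresponds to a slope of roughly 0.030 per year for the S&P 500, while 23.9 years corresponds to roughly 0.042 per year for the NASDAQ.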

Now comes the dangerous part.  Let’s take both of these indices and forecast their model values out thirty years.  Both of the datasets contain more than thirty years’ worth of data, so forecasting this far out is a bit of a stretch, but not without some reasonable basis.  Still, this is an exercise in “what if,” not promises, and certainly not investment advice.

Since we started with the S&P, let’s look at that first.  If the historic growth trends continue, the model forecasts that the S&P 500 (currently around 2000 points) should be bouncing around the 10,000 point mark some time in the middle of 2038.

S&P 500 data, along with its exponential model fit, extended out thirty years. The grey area represents the confidence intervals.
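The forecast itself is just the fitted line carried forward.  A sketch, reusing the assumed `sp500` frame and `fit` from the earlier snippets:

# Extend the model 30 years past the end of the data, on a monthly grid.
future          <- data.frame(years = seq(0, max(sp500$years) + 30, by = 1/12))
future$forecast <- 10^predict(fit, newdata = future)
tail(future)    # model values roughly thirty years out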

The NASDAQ, on the other hand, which is currently around 5000 points, should average around 10,000 in late 2021, and 100,000 near the end of 2045.  (Note: the S&P should be around 16,000 points at that time).  Today the ratio of the NASDAQ to the S&P is about 2.4.  But in 2045 it could reasonably be expected to be more than 6.  Depending on the number of zeroes in your investment portfolio (before the decimal point…), that could be significant.

NASDAQ data, along with its exponential model fit, extended out thirty years. The grey area represents the confidence intervals.

This forecasting method will not predict market crashes.  But that’s OK, because the professionals who try to forecast them can’t do it either.  (Now if only Goldman Sachs would hire me.)  What it can do is give us a very clear idea of whether the market is over- or undervalued.  By forecasting the stable position trend, we can easily spot bubbles, identify their size, and perhaps make wise decisions as a result.

The Adjacent Possible and the Law of Accelerating Returns

A concept that inventor and futurist Ray Kurzweil drives home in his books is what he calls the Law of Accelerating Returns: the observation that technology growth (among other things) follows an exponential curve.  He shows this over no small number of pages for a variety of technologies and concepts.  Most famous is Moore’s Law, in which Gordon Moore (one of the founders of Intel Corporation) observed that the number of transistors on a die doubled in a fixed amount of time (about every two years).  Kurzweil argues that this exponential growth pattern applies to both technological and biological evolution.  In other words, progress grows exponentially in time.  It should be clear that this is an observation rather than something derived from fundamental scientific theories.

What makes this backward-looking observation particularly interesting is that, even though we observe it to be generally true over vast periods of time, humans are very linear thinkers and have a difficult time envisioning exponential growth rates forward in time.  Kurzweil is a notable exception to that rule.  Because of exponential growth, the technological progress we make in the next 50 years will not be the same as what we have realized in the last 50 years.  It will be very much larger.  Almost unbelievably larger — the equivalent of the progress made in the last ~600 years.  This is the nature of exponential growth (and why some people find Kurzweil’s predictions difficult to swallow).

Interestingly, when Derek Price surveyed the scientific literature in 1961, an exponential growth in scientific publications was readily observed, but dismissed as unsustainable; Price took the unsustainability of that growth rate to be obvious.  The survey was revisited in 2010 (citing the original work), with the exponential growth still being observed nearly five decades later.  So this linear forecasting is a handicap that seems to exist even when we have data to the contrary staring us in the face.

On the other hand we have biologist Stuart Kauffman.  He introduced the concept of the Adjacent Possible, which was made more widely known in Steven Johnson’s excellent book, Where Good Ideas Come From.  The Adjacent Possible is another backward-looking observation, one that describes how biological complexity has progressed through the combining of whatever nature had on hand at the time.  At first glance this sounds sort of bland and altogether obvious.  But it is a hugely powerful statement when you dig a little deeper.  This is a way of defining what change is possible.  Combining things that already exist is how things of greater complexity are formed.  Said slightly differently, what is actual today defines what is possible tomorrow.  And what becomes possible will then influence what can become actual.  And so on.  So while dramatic changes can happen, only certain changes are possible based on what is here now.  And thus the set of actual/possible combinations expands in time, increasing the complexity of what’s in the toolbox.

Johnson describes it in this way:

“Four billion years ago, if you were a carbon atom, there were a few hundred molecular configurations you could stumble into.  Today that same carbon atom, whose atomic properties haven’t changed one single nanogram, can help build a sperm whale or a giant redwood or an H1N1 virus, along with a near infinite list of other carbon-based life forms that were not part of the adjacent possible of prebiotic earth.  Add to that an equally [long] list of human concoctions that rely on carbon—every single object on the planet made of plastic, for instance—and you can see how far the kingdom of the adjacent possible has expanded since those fatty acids self-assembled into the first membrane.” — Steven Johnson, Where Good Ideas Come From

Kauffman’s complexity theory is really an ingenious observation.  Perhaps what is most shocking is that, given how obvious it is in hindsight, no one managed to put it into words before.  I should note that Charles Darwin’s contemporaries expressed the same sentiment about his theory.

What is next most shocking is that Kauffman’s observation is basically the same as Kurzweil’s.  We have to do a little bit of math to show this is true.  I promise, it isn’t too painful.

The Adjacent Possible is all about combinations.  So first let’s assume we have some set of n objects.  We want to take k of them at a time and determine how many unique k-sized combinations there are.  This is known in mathematics as “n choose k.”  In other words, if I have three objects, how many different ways are there to combine them two at a time?  That’s what we’re working out.  There is a shortcut in math notation that says if we are going to multiply a number by all of the positive integers less than it, we can write the number with an exclamation mark.  So 3x2x1 would simply be written as 3!, and the exclamation mark is pronounced “factorial” when you read it.  This turns out to be very helpful in counting combinations.  Our n choose k counting problem can then be written as:

$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$$

You can try this out for relatively small numbers for n and k and see that this is true.

The pertinent question, however, is what the total number of combinations is for all possible values of k.  That is, if I have n objects, how many unique ways can I combine them if I take them one at a time, two at a time, three at a time, etc., all the way up to the whole set?  To find this out you evaluate the above equation for all values of k from 0 all the way to n and sum the results.  When you do this you find that the answer is 2^n.  Or written more mathematically:

$$\sum_{k=0}^{n} \binom{n}{k} = 2^n$$

So as an example, let us take 3 objects (n=3), let’s call them playing cards, and count all of the possible combinations of these three cards, as shown in the table below.  Note that there are exactly 2^3=8 distinct combinations.  Here a 1 in the row indicates a card’s inclusion in that combination.  We have no cards, all combinations of one card, all combinations of two cards, and then all three cards, for a total of 8 unique combinations.

Card 3 Card 2 Card 1
0 0 0
0 0 1
0 1 0
0 1 1
1 0 0
1 0 1
1 1 0
1 1 1

You can repeat this for any size set and you’ll find that the total number of unique combinations of any size for a set of size n will always be 2^n.  If you are familiar with base 2 math, you might have recognized that already.  So for n=3 objects we have the 2^3 (8) combinations that we just saw.  And for n=4 we get 2^4 (16) combinations, for n=5 we have 2^5 (32) combinations, and so on.
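If you’d rather let the computer do the counting, R’s built-in choose() makes the check a one-liner (a quick verification, not part of the argument above):

choose(3, 2)                                     # 3: ways to pick 2 cards from 3
sum(choose(3, 0:3))                              # 8: matches the table above
sapply(1:10, function(n) sum(choose(n, 0:n)))    # 2, 4, 8, ..., 1024, i.e. 2^(1:10)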

So in other words, the number of possible combinations grows exponentially with the number of objects in the set.  But this exponential growth is exactly what Kurzweil observes in his Law of Accelerating Returns.  Kurzweil simply pays attention to how n grows with time, while Kauffman pays attention to the growth of (bio)diversity without being concerned about the time aspect.

Kauffman uses this model to describe the growth in complexity of biological systems.  That simple structures first evolved, and that combinations of those simple things made structures that were more complex, and that combinations of these more complex structures went on to create even more complex structures.  A simple look at any living thing shows a mind-boggling amount of complexity, but sometimes it is obvious how the component systems evolved.  Amino acids lead to proteins.  Proteins lead to membranes.  Membranes lead to cells.  Cells combine and specialize.  Entire biological systems develop.  Each of these steps relies on components of lower complexity as bits of their construction.

Kurzweil’s observation is one of technological progress.  The limits of ideas are pushed through paradigm after paradigm, but still it is the combination of ideas that enables us to come up with the designs, the processes, and the materials that get more transistors on a die year after year.  That is to say, semiconductor engineers 30 years ago had no clue how they would get around the challenges they faced in reaching today’s level of sophistication.  But adding new ways of thinking about the problems led to entirely new types of solutions (paradigms), and the growth curve kept its pace.

Linking combinatorial complexity to progress gives us the modern definition of innovation: innovation is really the exploring and exploiting of the Adjacent Possible.  It is easy to look back in time and see the exponential growth of innovation that has brought us to the quality of life we have today.  It is much easier to dismiss the idea that it will continue, because we are faced with problems that we don’t currently have good ideas about how to solve.  What we see from Kurzweil’s and Kauffman’s observations is that the likelihood of coming up with good ideas, better ideas, life-changing ideas, increases exponentially in time, and happily, we have no good reason to expect this behavior to change.

The USPS and You

Every day but Sunday, a government employee comes to that place you call home and leaves you with any number of items.  Packages perhaps, but certainly letters, bills, advertisements, or magazines.  Most of these are sent to you by complete strangers.  Is there something interesting or valuable that can be learned by paying attention to what arrives in the mailbox?

Questions we might want to ask:  “How much mail do I get?”, “Who sends me mail?”, “How often do they send it?”, and “What kinds of mail do I get?”  Advertisers certainly have each one of us in their databases.  I’m sort of curious to know something about what they think they know about me.  But I’m also eager to explore what can be learned by simply paying attention to something that goes on around me with a high degree of regularity.

I’ve mentioned this before, but my method here is to record the sender and the category of each piece of mail I receive daily.  This is for mail specifically directed to me, or not specifically directed to anyone (e.g., “Resident”).  I’ve been doing this since the end of July 2014, so I have a fair amount of data now.

Let’s start with quantity.  On average I’m getting about 100 pieces of mail per month.  This is pretty consistent over 8 months, but note that things picked up at the beginning of November and then dropped back in January.  The rate (i.e., slope) didn’t really change; there was just a shift in the baseline.  The November shift is undoubtedly from election-related mail.  The January shift is the post-Christmas dropoff that we’ll see later.

Cumulative amount of delivered mail.
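For reference, the running total is a short calculation.  A sketch in R, assuming a data frame `mail` with one row per item and a `date` column (placeholder names, not my actual logging code):

daily        <- as.data.frame(table(mail$date))   # items received per calendar day
names(daily) <- c("date", "n")
plot(as.Date(daily$date), cumsum(daily$n), type = "l",
     xlab = "Date", ylab = "Cumulative pieces of mail")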

One of the more interesting observations is the breakdown of the mail by category.  It should come as no surprise these days that the majority of mail is advertising.  If you include political advertising (a category I break out separately), this overall advertising category accounts for more than half of the mail I get in my letterbox.  Considering that the USPS’s own numbers suggest about 52% of the mail was advertising in 2014, it looks like my dataset is representative.  Interestingly, the percentage of mail that was advertising in 2005 was only about 47%, so the share of mail that is advertising is on the increase.  This is not unexpected.  The NY Times published a piece in 2012 indicating that the Postal Service had announced its plan for addressing the huge decreases in first-class mail: to focus on increasing the amount of advertising mail that it carries.  The Wall Street Journal has a piece from 2011 showing that the advertising percentage was only about 25% in 1980 and has been increasing steadily ever since.  Mission accomplished.

Categorical percentages of delivered mail.

The next largest category, “Information”, is communications from people that I know or businesses that I deal with.  In other words, mail I want or care about in some fashion.  This is about 22% of the total.  Bills are a separate category as I think they are different enough to track separately.  Yes I still get magazines.  No I don’t wish to convert to a digital subscription.  But thank you for asking.

I find it interesting to look at the breakdown of the composition of the mail over time.  Judging from the sharp changes in color in the largest category (bottom bar), you can probably guess when the last state primary and general election took place.  But note that in general, each week is dominated by advertisements.  Notable exceptions are the week leading up to an election, when political advertisements dominate (note that these are still advertisements), and the weeks leading up to Christmas.  That last period shows an increase in “Information” mail, largely because of Christmas cards.

Weekly mail by category. Note that 2015 began mid-week.

Let’s look more closely at the advertisement mail numbers all by themselves.  October was the peak month, which is somewhat surprising given the frenzy over Black Friday shopping.  Predictably, direct mail fell off in January after the end of the Christmas shopping season.  But somewhat surprisingly, it climbs back without too much delay.

Amount of advertising mail received each month.

So who exactly is it that sends me so much junk mail?  Good question.  Redplum is the biggest of them all by far.  Also known as Valassis Communications, Inc., they provide media and marketing services internationally, and they are one of the largest coupon distributors in the world.  In other words, they’re a junk mail vendor.  You can count on them, as I’m sure the USPS does, for a weekly delivery of a collection of ads contained inside of their overwrap.  After that I have Citibank, Bank of America, SiriusXM, and Geico, in that order.  I would not have expected Geico to show up this high on the list, but there they are.

The amount of advertising mail sorted by sender, restricted to those with 2 or more pieces of mail being delivered.

Another question to consider is when all this mail comes.  We looked before at the monthly advertising numbers, but we can dig a little deeper and look at how mail deliveries vary by weekday.  If we look at raw numbers, we notice that Friday is by far the biggest mail day in terms of the number of items received.  This has been consistently true for the entire time I have been analyzing my mail.  I don’t have a good explanation for this observation.

The quantity of mail, by category, with the day of the week it was delivered.

But there’s more to it than just that.  We don’t get mail every weekday.  Lots of federal holidays fall on Mondays, when there is no mail delivery.  What we really want to do is look at how much mail we get for every day that mail was actually delivered.  This lets us compensate for an uneven number of delivery days per weekday.  When we do this, we find things even out quite a bit.  Big Friday is still the king, but the other days even out quite nicely.  Understanding what is going on with Friday deliveries is something I’m interested in.

Mail by category each weekday, normalized to the number of days mail was delivered each weekday.
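A sketch of that normalization in R, again assuming a `mail` data frame with one row per item and a `date` column (placeholder names): divide each weekday’s item count by the number of distinct days on which mail actually arrived on that weekday.

mail$weekday  <- weekdays(mail$date)
items_per_day <- table(mail$weekday)                      # total items by weekday
delivery_days <- tapply(mail$date, mail$weekday,
                        function(d) length(unique(d)))    # days with at least one delivery
items_per_day[names(delivery_days)] / delivery_days       # average items per delivery day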

What you can see from all this is that you are (or I am, in any case) more likely to get certain types of mail on some days than on others.  This is somewhat easier to see if we plot each category by itself.  I find it remarkable that I basically don’t get bills on Wednesdays.  Credit card applications come primarily on Saturdays.  Charities don’t ask me for money on Mondays.  And political ads come on Thursdays and Fridays.  I’ll bet that if I further broke down the advertisement category into senders, more weekday specificity would emerge.

Normalized daily mail categories per weekday.

In the interest of completeness, we finish up by looking at the statistics of the daily mail delivery.  That is, how often do we get some particular number of pieces of mail in the letterbox?  Here we don’t concern ourselves with the category, only the quantity and how many times that quantity shows up.  We can see from the plot that we most often find three pieces of mail and have never found more than thirteen.  This distribution of quantities approximately follows what is known as a Poisson distribution.  It has nothing to do with fish; rather, it was named after the French mathematician Siméon Denis Poisson.  The red line fit is a scaled Poisson distribution with the average (lambda) equal to 3.5.  This indicates that, on average, I get 3.5 pieces of mail daily.  This is slightly lower than the mean value of 3.9 from the plots above, but they’re calculated in slightly different ways and have somewhat different meanings.

The distribution of mail quantities follows a Poisson distribution.
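Here is roughly how that comparison can be drawn in R, assuming a vector `daily_counts` holding the number of items received on each delivery day (a placeholder name); the Poisson curve is scaled by the number of days, as described above.

tab      <- table(factor(daily_counts, levels = 0:max(daily_counts)))
lambda   <- 3.5                                            # the fitted average
expected <- length(daily_counts) * dpois(0:max(daily_counts), lambda)
plot(0:max(daily_counts), as.numeric(tab), type = "h",
     xlab = "Pieces of mail per delivery day", ylab = "Number of days")
lines(0:max(daily_counts), expected, col = "red")          # scaled Poisson overlay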

The most unexpected things that I have observed are the Big Friday effect and the amount of regularity in the weekly delivery of some specific types of mail.  As they have endured over eight months of data collection, I am inclined to think they are real, but it will be interesting to watch and see if they persist after an entire year of mail collection.  It is also interesting that the Wikipedia article on the Poisson distribution specifically mentions that it should apply to mail, seemingly appropriately, but I can find no record anywhere that anyone has actually done this experiment.

Followup: The Effect of Elections on Gasoline Prices

My intention for the last post, The Effect of Elections on Gasoline Prices, was to be as thorough and quantitative as possible.  A friend who is properly trained in statistics pointed out the need to run significance tests on the results.  This is good advice and the analysis will be complete with its inclusion.

That last post ended with a visualization of the non-seasonal changes in gasoline prices in the months leading up to the election (August to November) for election years (Presidential or midterm), and used the same data in the same timeframe in non-election years as a control.  We used inflation-adjusted, constant 2008 dollars to properly subtract the real seasonal changes and discover real trends in the analysis.  That final figure (below) clearly showed that there is no trend of election-related price decreases.  In fact, prices have tended to increase somewhat as the election nears.  But the question that I failed to adequately address last time is:  Are the price changes in election years significantly different from those of non-election years?  This is the definitive question.

Non-seasonal, August to November changes in U.S. regular unleaded gasoline prices from 1976 to 2013. The comparison is made for election and non-election years. Original data source is the U.S. Bureau of Labor Statistics.

Because any sampled data set will suffer from sampling errors (it would be extremely difficult for every gas station in the country to be included in the BLS study each month), the sampled distribution will differ somewhat from the actual distribution.  This is important because we frequently represent and compare data sets using their composite statistical values, like their mean values.  And two independent samplings of the same distribution will produce two sets with different mean values; this makes understanding significant differences between them an important problem.  What we need is a way to determine how different the datasets are, and if these differences are meaningful or if they are simply sampling errors (errors of chance).

Fortunately we are not the first to need such a tool.  Mathematicians have developed ways to compare datasets and determine whether their differences are significant.  These are “tests of significance.”  The t-test is one of them: it estimates how likely it is that a difference between the means of two samples at least as large as the one observed would arise by chance if both samples came from the same underlying distribution.  The first thing we should do is look at the distributions of these price changes.  The two large election-year price drops (2006, 2008) are very clearly seen to be outliers, and the significant overlap of the distributions of price changes is readily visible.

Distributions of non-seasonal, August to November changes in U.S. regular unleaded gasoline prices from 1976 to 2013 for both election and non-election years. Original data source is the U.S. Bureau of Labor Statistics.

It is clear that were it not for the outliers in the election-year data, these distributions would be considered very nearly identical.  But to characterize the significance of their differences, we’ll run an independent t-test.  The primary output of the test that we are concerned with is the p-value: the probability of seeing a difference in means at least this large purely by chance if election and non-election years really behaved the same.  Recall that the maximum value of a probability is 1.  If it matters, I’m using R for data analysis.
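For completeness, the call that produces output like the block below is simply the following, assuming two data frames `electionyear` and `nonelectionyear` (the names shown in the output), each with a `changes` column holding the August-to-November price changes:

t.test(electionyear$changes, nonelectionyear$changes)   # Welch's test is R's default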

Welch Two Sample t-test

data:  electionyear$changes and nonelectionyear$changes
t = -0.6427, df = 21.385, p-value = 0.5273
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
  -0.2530637 0.1334810
sample estimates:
  mean of x mean of y 
-0.02367507 0.03611627

This p-value tells us that a difference in means at least as large as the one we observed would arise by chance about 52.7% of the time if election and non-election years behaved the same.  That is nowhere near significant, so we fail to reject the null hypothesis that the true difference in means is zero.  This answers the question that we posed and indicates that the changes in gas prices in election years are not significantly different from those of non-election years.