A Layman’s Guide to Separating Causation from Correlation … and Noticing When Claims of Causality are Invalid

Imagine you’re the Minister for Education, deciding how large to make a school district. Larger school districts offer parents more school choice. You look at data from thousands of school districts and find that, in larger districts, child performance is better. You’re tempted to infer that district size increases child performance. But, as we know, correlation doesn’t imply causation. There are two alternative explanations:

  1. Reverse causality: child performance increases district size. When kids are doing better, a school district is allowed to expand.
  2. Omitted variables: neither district size nor child performance cause each other, but a third variable causes both. If parents care about education, they will both demand school choice (larger school districts) and also tutor their kids at home (increasing child performance).
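
To see how an omitted variable can manufacture a correlation, here is a minimal simulation sketch. Every number in it is made up for illustration: `parent_care` is a hypothetical omitted variable, and district size is given zero true effect on performance by construction.

```python
import random

def slope(x, y):
    """OLS slope from regressing y on x (with an intercept)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var

rng = random.Random(0)
n = 20_000

# Hypothetical omitted variable: how much parents care about education.
parent_care = [rng.gauss(0, 1) for _ in range(n)]

# Caring parents demand more school choice (larger districts) ...
district_size = [c + rng.gauss(0, 1) for c in parent_care]
# ... and tutor at home. District size itself has ZERO effect here.
performance = [c + rng.gauss(0, 1) for c in parent_care]

naive = slope(district_size, performance)
print(f"naive slope of performance on district size: {naive:.2f}")
```

Even though district size does nothing in this simulated world, the naive slope comes out at roughly 0.5, because parental care drives both variables.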

The problem of separating causality from correlation occurs in virtually every question that we try to study with data.

  • Showing that adults with a degree earn higher salaries doesn’t mean that university is a worthwhile investment. It might be that high-ability kids go to university and their high ability would have led to them earning more anyway (ability is an omitted variable). Or, kids who expect high salaries in the future (e.g. due to being from well-connected families) are more willing to take on the debt to go to university today (reverse causality).
  • Showing that socially responsible firms perform better doesn’t mean that social responsibility pays off. It might be that, only once a firm is already performing well can it invest in social responsibility (reverse causality). Or, a forward-thinking management team (i) performs better, and (ii) gives thought to social issues (management quality is an omitted variable).
  • Showing that firms that cut investment subsequently perform badly doesn’t mean that cutting investment is bad. A McKinsey study makes the very strong causal claim to have found “finally, evidence that managing for the long-term pays off”. Their claim has been accepted as gospel by many, without recognising reverse causality – when a firm knows that its future prospects are poor, it should cut investment today. Presumably, this is what McKinsey advises its clients to do!
  • Showing that firms where a CEO has a high equity stake (owns a lot of shares) subsequently perform better doesn’t mean that equity incentives work. It might be that, when a CEO expects a firm to perform better in the future, she’s more willing to hold shares today.
  • Showing that a fad diet leads to weight loss doesn’t mean the diet caused weight loss. It might be that the desire to lose weight caused a person to choose the diet, and also to exercise more and it’s the latter that led to the weight loss (omitted variables).

The problem is even worse due to confirmation bias, as I explained in my recent TEDx talk, “From Post-Truth to Pro-Truth”. We jump to the conclusion that fits our view of the world.

  • Professors like me are all too eager to believe that our fascinating class is what got students that job.
  • We want to think that “nice guys finish first” – that responsible companies beat irresponsible ones.
  • Those who view businesses as evil and self-serving will want to think that those who cut investment (to pay dividends or buy back stock) get their comeuppance later.
  • People like me who spend their lives studying incentive compensation really want to believe that incentives actually matter, and that they’re not wasting their time.
  • Any proponent of a fad diet or slimming pill will claim they’re to thank for your six-pack abs.

We must be very, very careful about interpreting evidence as causal, when it only shows a correlation. Fortunately, there are now clever techniques to separate causality from correlation – (I) instruments, (II) natural experiments, and (III) regression discontinuity. This article aims to explain these techniques in simple language. But before starting, I must caution that these techniques are only valid in very rare cases. Some papers use one of the three “magic phrases” to try to claim that they have identified causality, and then back it up with the most technical language possible to give an aura of statistical sophistication and batter the reader into submission. But, as I’ll explain, you don’t need to be a statistical expert to see whether the authors are trying to pull the wool over your eyes. All you need is common sense. For each of these techniques, I have a “Reader Beware” section on what to look for. The intended audience for this post is practitioners, who might use academic research to guide policy or practice, so I will paint with a broad brush. For a more detailed academic treatment, please see Roberts and Whited (2012).

I begin by defining terms. We are interested in the causal effect of an independent variable (e.g. district size, degree) on a dependent variable (e.g. child performance, future income). A causal interpretation is only possible if the independent variable is exogenous (randomly assigned) – if university places were randomly given to some school leavers and not others, and those that went to university earned more, we could infer that the degree caused the higher salary. However, most variables are endogenous. They are not randomly assigned, but the product of something else – the dependent variable itself (expecting a high future income encourages you to get a degree today – reverse causality), or a third variable that also affects the dependent variable (high ability makes you more willing to get a degree – omitted variables).

I. Instruments

How do we solve the problem that the independent variable is endogenous? In a medical trial, you would randomly assign the independent variable (a new drug) by giving it to some patients (a treated group) and a placebo to others (a control group). But, we can’t do this in social sciences – we can’t force some firms to give their CEOs high equity stakes and others to give low equity stakes.

So, what we want is something as-good-as-random. This is an instrument – something that randomly shocks the independent variable, just like random assignment of a new drug. In the school district example, Hoxby (2000) used rivers as a shock to district size. In the U.S., school districts were formed in the 18th century, when crossing a river was difficult due to no cars and few bridges, and so districts very rarely crossed rivers. Hoxby found that school districts that were naturally smaller, due to rivers, exhibited worse performance. Since these districts were “randomly” assigned a small size, the results imply a causal effect from district size to child performance.

A valid instrument must be:

  1. Relevant. It must affect the independent variable of interest. Rivers are relevant, as they placed natural boundaries on district size.
  2. Exogenous. It must not affect the dependent variable except through the independent variable. Rivers are unlikely to affect a child’s performance other than through affecting district size. (Technically, this is referred to as “satisfying the exclusion restriction”; I will use “exogenous” for short).

To give an example of the ingenuity of some valid instruments:

  • Does a family firm perform better when it appoints a family CEO rather than an external CEO, or worse due to nepotism? If family-run firms perform better, it could be due to reverse causality: if the firm is performing well, the owners will keep it within the family; if it’s not, they will need an outsider to fix it. Bennedsen, Nielsen, Perez-Gonzalez, and Wolfenzon (2007) use the gender of the CEO’s first-born child as an instrument. Gender is:
    1. Relevant: when the first child is male, family owners are more likely to pass on control to a family CEO than when the first child is female.
    2. Exogenous: it’s unlikely that the gender of a CEO’s first child will affect the performance of the family firm other than through affecting whether the next CEO is from within or outside the family.
  • Rather than studying whether firms actually have a family CEO, the authors predict whether firms will have a family CEO based on the gender of the first-born child. They found that firms with a higher probability of having a family CEO (due to having a male first child) perform worse. Since whether a firm is predicted to have a family CEO is random – because the gender of the first child is random – this implies that family CEOs cause worse performance.
  • Does watching TV cause autism? If the correlation is positive, it may be that autistic kids watch TV more (reverse causality), or neglectful parents both abandon their kids to watch TV, and also cause autism (omitted variables). Waldman, Nicholson, Adilov, and Williams (2008) use rainfall as an instrument. Rainfall is:
    1. Relevant: rainfall causes kids to watch TV, since they can’t play sport outside.
    2. Exogenous: rainfall doesn’t cause autism other than through its impact on TV-watching (it doesn’t suddenly cause parents to be neglectful).
  • Rather than studying the actual number of hours of TV-watching, the authors predict TV-watching based on rainfall. They found that kids with higher predicted TV watching are more likely to be autistic. Since predicted TV-watching is random – because rainfall is random – this implies that watching TV causes autism.
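
The “predict, then regress” logic in both examples can be sketched in a few lines. This is a toy simulation, not the method of either paper: the variable names, the assumed true effect of 2.0, and the strength of the omitted variable are all invented for illustration.

```python
import random

def fit(x, y):
    """Return (intercept, slope) from an OLS regression of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    b = cov / var
    return my - b * mx, b

rng = random.Random(1)
n = 20_000
TRUE_EFFECT = 2.0  # assumed causal effect of x on y in this toy world

z = [rng.gauss(0, 1) for _ in range(n)]  # instrument (think: rainfall)
u = [rng.gauss(0, 1) for _ in range(n)]  # omitted variable, unobserved

# x is endogenous: it is moved by the instrument AND the omitted variable.
x = [zi + ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]
# y depends on x and, separately, on the omitted variable.
y = [TRUE_EFFECT * xi + 3.0 * ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]

# Naive regression of y on actual x: contaminated by u.
_, ols_slope = fit(x, y)

# Stage 1: predict x using only the instrument.
a1, b1 = fit(z, x)
x_hat = [a1 + b1 * zi for zi in z]
# Stage 2: regress y on the predicted (as-good-as-random) part of x.
_, iv_slope = fit(x_hat, y)

print(f"naive: {ols_slope:.2f}  instrumented: {iv_slope:.2f}")
```

In this setup the naive slope lands near 3, overstating the effect, while the instrumented slope lands near the assumed 2. The sketch only recovers the point estimate; standard errors from a hand-rolled second stage like this would not be correct.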

Reader Beware

Often authors will claim causality by using the magic word “instruments” (or “instrumental variables”), when the instruments are actually invalid because they are not exogenous (it is relatively easy to find instruments that are relevant). A reader should ask the following questions:

  • Can the “instrument” affect the dependent variable other than through the independent variable? Let’s return to the earlier question of whether the CEO’s equity stake causes better future performance. We might use CEO age as an instrument for her equity stake, as older CEOs tend to have accumulated more shares. But, CEO age is not exogenous, since it might directly affect firm performance. Older CEOs might perform better as they are more experienced, or worse as they are entrenched.
  • What causes the instrument to vary to begin with, and could this factor also affect the dependent variable?  Even if CEO age did not directly affect firm performance (older CEOs are just as good as younger CEOs), whatever drives cross-sectional variation in age may do so. For example, trouble in the firm’s business model may lead to a firm retaining an old CEO, and also reduce firm performance.
  • Is the instrument a lagged variable? Some papers use last year’s independent variable as an instrument – in our setting, this would be the CEO’s equity stake last year. It’s relevant – last year’s equity stake will be linked to this year’s, since equity stakes tend to be stable over time. Surely it’s also exogenous – since it’s last year’s stake, it was already set in advance of this year? But, whatever causes this year’s stake to be endogenous also likely causes last year’s stake to be endogenous. Last year, the CEO could have forecast performance to be good this year, and so chosen to hold more shares.
  • Is the instrument a group average? Some papers use a group average as an instrument – in our setting, this would be the average equity stake among CEOs in the same industry as firm X. It’s relevant – if rival firms are giving their CEOs lots of equity, firm X must do so too, to remain competitive. Surely it’s also exogenous – the equity stake of other CEOs shouldn’t affect firm X’s performance? But, any endogeneity in firm X’s equity stake is simply soaked up at the industry level (see Section 2.3.4 of Gormley and Matsa (2014) for more detail). If the industry as a whole is performing well, firm X will perform well, and CEOs of other firms in the industry will gladly hold high equity stakes.
  • Are the authors up-front about their instruments? A tell-tale sign is when, in the introduction to a paper, authors say something like “we control for endogeneity using instruments and show that the results remain robust” without explaining what the instruments are until much later in the paper. Finding valid instruments is very difficult and it is the authors’ responsibility to explain what the instruments are and justify why they are relevant and exogenous. Not being up-front about what the instruments are suggests the authors may themselves not be sufficiently convinced about their validity, and so they bury them deep into the paper.
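
To make the lagged-variable warning concrete, here is a toy simulation in which the CEO’s equity stake has no effect on performance at all, yet using last year’s stake as an “instrument” still produces a positive estimate. The variables `prospects` and `habit` and all magnitudes are hypothetical.

```python
import random

def cov(x, y):
    """Sample covariance of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)

rng = random.Random(2)
n = 20_000

# Persistent, unobserved driver: the CEO's private view of firm prospects.
prospects = [rng.gauss(0, 1) for _ in range(n)]
# Stable part of the stake that is unrelated to prospects.
habit = [rng.gauss(0, 1) for _ in range(n)]

# Both years' stakes respond to the same persistent prospects.
stake_last = [h + p + rng.gauss(0, 1) for h, p in zip(habit, prospects)]
stake_now = [h + p + rng.gauss(0, 1) for h, p in zip(habit, prospects)]

# The true effect of the stake on performance is ZERO by construction.
performance = [p + rng.gauss(0, 1) for p in prospects]

# Instrumental-variable estimate using last year's stake as the instrument.
iv_lagged = cov(stake_last, performance) / cov(stake_last, stake_now)
print(f"estimated effect with lagged instrument: {iv_lagged:.2f}")
```

The estimate comes out around 0.5 rather than 0, because whatever made this year’s stake endogenous (here, `prospects`) was already baked into last year’s stake.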

Even though some papers may claim to have statistically proven exogeneity, there is no valid test to do this. So, the best way to assess exogeneity is to use common sense – could the “instrument” (or whatever drives the instrument) affect the dependent variable other than through the independent variable? Note that no instrument will be completely exogenous and one can always spin stories to argue that it is not. For example, one could spin a story that rivers directly affect child performance, because when kids look out onto a river, they get inspired to be more creative. Ultimately, the reader must use common sense to see whether such stories are reasonable.

As an example of how authors might use complex technical language to overwhelm the reader into believing they have shown causality, consider the following extract:

“We reestimated our models using the xtabond2 procedure in STATA, which utilizes the generalized method of moments (GMM) model also known as system GMM. The xtabond2 procedure is designed for panels that may contain fixed effects and heteroscedastic and correlated errors within units, and employs first differencing, which instruments variables with suitable lags of their own first differences, to eliminate these issues and potential sources of omitted variable bias (please see Arellano & Bover, 1995; Blundell & Bond, 1998; Roodman, 2009). Furthermore, and importantly, xtabond2 also allows the ability to specify variables as endogenous to examine whether potential endogeneity is influencing findings.”

Sounds impressive, but when you strip back the technical language, you see that the authors are using “lags” (i.e. last year’s variable – more precisely, the change in the variable from last year), which is generally invalid for the reasons discussed above. I use the above extract in no way to poke fun at this paper, but to stress that it’s common sense, not technical sophistication, that enables us to assess validity.

II. Natural Experiments

As discussed earlier, in social sciences, it is hard for the researcher to randomly assign treatments. A natural experiment is when firms are naturally (i.e., without the researcher having to do anything) divided into treated and control groups, for example if a law affects some firms but not others.

Bertrand and Mullainathan (2003) study whether takeover defenses worsen firm performance by entrenching CEOs and allowing them to coast. Their natural experiment is the adoption of state anti-takeover laws. Crucially, different states passed these laws in different years. Consider two plants located in New York, one of which belongs to a Delaware-incorporated firm and the other to a California-incorporated firm. In 1998, Delaware but not California passed anti-takeover laws. The Delaware-owned plant is affected by the law and part of the treated group; the California-owned plant is unaffected by the law and part of the control group.

Assume that, after 1998, we found that the Delaware-owned plant produced 2 (units of output) and the California-owned plant produced 7.  We might conclude that anti-takeover laws reduce output by 5. But, such a conclusion would be premature. Perhaps inefficient firms happen to incorporate in Delaware, and so the Delaware-owned plant was performing poorly even before 1998. Thus, it’s not the law that caused the Delaware-owned plant to perform poorly – it was performing poorly anyway. So, we must perform what’s known as a difference-in-differences analysis, which is best explained by the following (hypothetical) example:

              Pre-1998   Post-1998   Difference
Delaware          8          2           -6
California       11          7           -4
Difference       -3         -5           -2

Since the Delaware-owned plant is generally less efficient, it was already performing worse than the California plant pre-1998. The difference in their performance was -3 in the bottom row.  After 1998, the difference widened to -5. So, the difference-in-differences – the change in the difference after 1998 – is -2, and so we can conclude that anti-takeover laws cause performance to fall by 2. Crucially, we use the pre-1998 difference in performance to control for the fact that Delaware-owned plants might be inherently different from California-owned plants.

We could also reach the same -2 conclusion by using the right-hand column, rather than the bottom row. The performance of the Delaware-owned plant fell from 8 to 2 after 1998 – a difference of -6. But, we can’t attribute this decline to the anti-takeover law, because many other events could have happened in 1998 that caused this fall – perhaps the economy went into recession in 1998. This is the role of the control group – the California-owned plant. We can use its difference in performance of -4 to measure the impact of other events that happened in 1998. The difference-in-differences is -2. So, we reach the same conclusion that anti-takeover laws cause performance to fall by 2.
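
The two routes through the table can be written out as plain arithmetic. This sketch just re-does the hypothetical example above; the dictionary layout and variable names are my own.

```python
# Output per plant, taken from the hypothetical table above.
output = {
    "Delaware":   {"pre": 8,  "post": 2},   # treated group
    "California": {"pre": 11, "post": 7},   # control group
}

treated, control = output["Delaware"], output["California"]

# Route 1 (bottom row): how much did the treated-minus-control gap widen?
gap_pre = treated["pre"] - control["pre"]      # 8 - 11 = -3
gap_post = treated["post"] - control["post"]   # 2 - 7  = -5
did_via_gaps = gap_post - gap_pre              # -5 - (-3) = -2

# Route 2 (right-hand column): treated change minus control change.
change_treated = treated["post"] - treated["pre"]   # 2 - 8  = -6
change_control = control["post"] - control["pre"]   # 7 - 11 = -4
did_via_changes = change_treated - change_control   # -6 - (-4) = -2

# Both routes are the same subtraction, just grouped differently.
assert did_via_gaps == did_via_changes
print(did_via_gaps)  # -2
```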

Reader Beware

  • Are the treated and control groups trending in the same direction? The California-owned plant is only a valid control for other events that happened in 1998 if it is affected by the same events as the Delaware-owned plant. This is why Bertrand and Mullainathan use two plants located in New York – if the New York economy suffers a recession, it should have the same effect on both plants. If they had instead compared a plant incorporated and located in Delaware to a plant incorporated and located in California, the latter would not be a good control as Delaware may have suffered a recession in 1998 but not California. So, it is critical that the treated and control groups be trending in the same direction – the change in their performance post-1998 should have been the same if no law had been passed. This is known as the parallel trends assumption.
    • Note that we do not require the treated and control groups to be similar. In the above example, Delaware-owned plants are less efficient than California-owned plants. The level of their productivity is different pre-1998 – we only require the change or trend in their productivity around 1998 to have been the same had no law been passed. We can check this by checking the trends in performance of both plants for several years prior to 1998.
  • Was the natural experiment anticipated? If the law change was anticipated, firms could respond in anticipation of the law. Then, a researcher might incorrectly conclude that the law had no effect – because the changes had already been made before the law got passed. Moreover, as Hennessy and Strebulaev (2016) show, anticipation may not only cause the measured effect to be weaker, but also to have the wrong sign.
  • Was the natural experiment exogenous? If firms could have lobbied for the law change, then it is no longer random whether a plant is treated or a control. Perhaps Delaware-incorporated firms knew that their future prospects were poor and lobbied legislators to pass anti-takeover laws in anticipation. As a result, we cannot conduct natural experiments using changes implemented by firms (as some papers do). For example, conducting a “difference-in-differences” between firms who chose to engage in stock splits and firms that do not, would not allow causal inference, since firms endogenously choose whether they are in the treated group (those who split their stock) and whether they are in the control group (those who don’t).
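
The parallel trends check from the first bullet can be sketched very simply: look at whether the gap between treated and control was stable in the years before the law. The annual figures below are invented purely for illustration, and `gap_changes` is a hypothetical helper, not a standard test.

```python
def gap_changes(treated, control):
    """Year-to-year change in the treated-minus-control gap."""
    gaps = [t - c for t, c in zip(treated, control)]
    return [later - earlier for earlier, later in zip(gaps, gaps[1:])]

# Made-up annual output for the five years before the law (1993 to 1997).
control_pre = [15, 14, 13, 12, 11]
treated_parallel = [12, 11, 10, 9, 8]    # lower level, same trend
treated_diverging = [16, 14, 12, 10, 8]  # already falling faster

print(gap_changes(treated_parallel, control_pre))   # [0, 0, 0, 0]
print(gap_changes(treated_diverging, control_pre))  # [-1, -1, -1, -1]
```

In the first case the levels differ but the gap never moves, so the control plant is a plausible benchmark. In the second, the treated plant was already losing ground every year before the law, so a widening gap after 1998 could not be pinned on the law.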

III. Regression Discontinuity

Here, randomness occurs due to the independent variable falling either just below or just above a cutoff in an unpredictable way. For example, Cunat, Gine, and Guadalupe (2012) study the effect of shareholder proposals to increase shareholder rights. Showing that firm performance improves after such proposals are passed does not imply that the proposals caused the improvement, because they are endogenous. Perhaps a large engaged blockholder made the proposals, and it could be the blockholder that improved firm performance. So, they compare proposals that narrowly pass (with 51% of the vote) to those that narrowly fail (with 49% of the vote). Whether the vote narrowly passes or narrowly fails is essentially random, and uncorrelated with other factors such as the presence of blockholders – if there were large blockholders, they would likely increase the vote from 49% to (say) 80%, not 51%. They compare the stock price reaction to the vote outcome, as well as changes in long-term performance, of firms where a shareholder proposal narrowly passes to firms where a shareholder proposal narrowly fails (similar to a difference-in-differences). Since the stock price and long-term performance improve significantly more for the former set of firms, they show that increased shareholder rights cause higher firm value and long-term performance.
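
The intuition can be illustrated with a toy simulation. It is not the authors’ estimation procedure: the 30% blockholder share, the vote distributions, the assumed true effect of 1.0, and the 48–52% window are all assumptions chosen for illustration, and a real study would typically fit regressions on each side of the cutoff rather than compare simple means.

```python
import random

rng = random.Random(3)
n = 100_000
TRUE_EFFECT = 1.0  # assumed effect of a proposal passing, in this toy world

firms = []
for _ in range(n):
    blockholder = rng.random() < 0.3
    # An engaged blockholder pushes the vote far above the cutoff.
    vote = rng.gauss(75 if blockholder else 45, 10)
    vote = min(100.0, max(0.0, vote))
    passed = vote >= 50
    # Blockholders also improve performance directly (the confounder).
    perf = TRUE_EFFECT * passed + 2.0 * blockholder + rng.gauss(0, 1)
    firms.append((vote, passed, perf))

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def pass_minus_fail(sample):
    """Average performance of passing firms minus failing firms."""
    return (mean(p for _, ok, p in sample if ok)
            - mean(p for _, ok, p in sample if not ok))

# Naive: compare all passing proposals with all failing ones.
naive = pass_minus_fail(firms)

# Discontinuity: keep only votes within two points of the 50% cutoff.
close = [f for f in firms if 48 <= f[0] < 52]
rd = pass_minus_fail(close)

print(f"naive: {naive:.2f}  close votes only: {rd:.2f}")
```

The naive comparison roughly doubles the assumed effect, because passing proposals are disproportionately backed by blockholders. Among close votes, blockholders are rare on both sides of the cutoff, and the estimate falls back to about 1.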

For other examples of regression discontinuity that I have blogged about, see Flammer and Bansal (2017) on the effect of shareholder proposals to implement long-term incentives, and Malenko and Shen (2016) on the effect of proxy advisors on voting outcomes.

Reader Beware

  • Can firms perfectly manipulate the independent variable, i.e. choose whether they are above or below the threshold? Suppose directors have control over the votes of shares held in an employee benefit trust. Normally, they do not vote these shares, to avoid investor concerns about them distorting vote outcomes. However, in extreme conditions, they may. For close votes, control of these votes allows firms to essentially choose whether the vote is 51% or 49%. They might allow the proposal to pass if the firm is performing well (since it is not afraid of greater shareholder power), and cause it to fail if the firm is performing poorly. Then, whether the proposal passes or fails is endogenous – it depends on firm performance.
    • Note that if firms can only partially (not perfectly) manipulate the vote, regression discontinuity is still valid as there is still some randomness as to whether the vote narrowly passes or narrowly fails.
  • Are firms comparable on other dimensions above and below the threshold? Firms above the threshold are treated and firms below are controls. The treated and control firms should be comparable on all other dimensions. Comparability might be violated if (hypothetically) firms with higher-quality management were able to predict when the vote is going to be close and persuade “swing” shareholders to vote against the proposal. Thus, management quality might jump when you move from above to below the threshold.

IV. An Alternative Technique: Common Sense

Finding valid instruments, natural experiments, and discontinuities is difficult. So, an alternative approach to get closer towards causality is to use common sense. For example, if your effect is indeed causal, it should be stronger in certain circumstances. If a higher CEO equity stake caused superior firm performance, through providing the CEO with better incentives, the effect should be stronger where CEOs have greatest freedom to slack – in firms with little ownership by institutional investors, poor governance, and low product market competition. This is what von Lilienfeld-Toal and Ruenzi (2014) show, as blogged about here.

Brav, Jiang, Partnoy, and Thomas (2008) show that, after hedge fund activists acquire a large stake in a firm and announce an intention to influence control, performance improves. There could be reverse causality if the hedge fund predicted the improvements and acquired the large stake in anticipation. As blogged about here, the authors support causality by showing that the improvements are stronger when the hedge fund employs hostile tactics, and remain significant even when the hedge fund already had a large stake prior to announcing its activist intent.

Note that common sense does not show causality as cleanly as the first three methods; it can only suggest causality. (In the first example, perhaps the measures of governance are inaccurate). But, it should be added to the toolkit. Just as a discerning reader should use common sense to avoid being impressed by complex, but invalid, statistical techniques, he/she should also be open to common sense approaches to suggesting causality, even if they cannot prove it. Researchers using this approach must be careful not to make strong causal claims.

CEOs Cut Investment To Sell Their Own Shares At High Prices

One of the most fundamental concerns with corporations is that they focus on short-term profit rather than investing for the long-term. This is a particular concern in the 21st century, where innovation is particularly critical for competitive success. Moreover, allegations of short-termism have serious social repercussions. Long-term investments, such as reducing carbon emissions, developing blockbuster drugs, or training workers, typically benefit stakeholders as well as shareholders, but short-term profit only goes to shareholders. The concern that corporations exploit stakeholders to pander to shareholders has led to a substantial loss of trust in business and threatens its social license to operate.

But Where’s The Evidence?

However, actually finding evidence that short-termism even exists is extremely difficult. My prior post discussed several pieces of evidence to the contrary. Indeed, my 2007 “job market paper” (that you take on the academic job market in the final year of your PhD) was a theoretical model of how large shareholders can alleviate short-termism. In the first few minutes of most seminars, I’d get the question “What evidence is there that short-termism is even a problem in the first place?” I had to admit that there was little hard evidence – the best was a 2006 survey of executives where 78% admitted to sacrificing long-term value to meet earnings targets, but this is only what executives claimed that they did, rather than what they actually did.

“No evidence of short-termism”, however, is not the same as “evidence of no short-termism”. There simply wasn’t evidence either way, since it’s hard to measure a CEO’s short-term concerns. One measure might be the amount of shares that she sells in the short-term. If the CEO sells a ton of shares in Q3 2017, then she wants the stock price to be particularly high in Q3 2017. Thus, she might cut investment in Q3 2017. But, a correlation between CEO equity sales and investment cuts would not imply causation. The problem is that CEO equity sales are endogenous – they are a deliberate choice of the CEO, and so this choice may be driven by other factors that also drive investment. For example, if prospects are looking bleak in Q3 2017, this might cause the CEO to rationally scale back investment, and separately to sell her shares.

Documenting Short-Termism: A New Approach

In a recent paper, Vivian Fang, Katharina Lewellen, and I* initiated a new approach. Rather than studying the shares that the CEO actually sells, we study the amount of shares that are scheduled to vest. For example, if a CEO was given a chunk of shares in Q3 2012, with a 5-year vesting period, they first become saleable in Q3 2017. CEOs typically sell a large portion of their shares when they vest, to diversify their portfolio (we verify this in the data). Thus, if the CEO knows that her shares will be vesting in Q3 2017, and so she’s likely to sell a large portion, she has incentives to cut Q3 2017 investment. Importantly, the driver of Q3 2017 vesting equity is the decision to grant the CEO shares back in Q3 2012. That was five years ago, and so is likely exogenous to (not driven by) Q3 2017 investment opportunities. Thus, any correlation between Q3 vesting equity and Q3 investment cuts is likely to be causal.

We include both shares and options in our measure of vesting equity and estimate this amount at the quarterly level. This is because the highest frequency with which investment is reported is also at the quarterly level. We regressed the change in investment (measured five different ways) on vesting equity and many control variables that may also drive investment cuts (e.g. investment opportunities or financing constraints).

CEOs Cut Investment When Their Equity Vests

We find a significant negative correlation between vesting equity and the growth rate in investment – using all five investment measures. Moreover, these results are robust to:

  • Removing equity grants where vesting depends on hitting certain performance targets, rather than reaching the end of a pre-specified time period (e.g. 5 years)
  • Considering only vesting stock or only vesting options
  • Including or excluding controls
  • Regressing the change in investment not on the amount of vesting equity, but the amount of equity sales that can be predicted by vesting equity

Alternative Explanations

So, the link between vesting equity and investment cuts appears to be robust. This is consistent with the idea that the CEO inefficiently cuts good investment projects to boost short-term earnings and thus the short-term stock price (the myopia hypothesis). But, as I explained in a recent TEDx talk, finding that the data is consistent with a hypothesis does not mean that the data supports the hypothesis – because it could also be consistent with alternative hypotheses.

The main concern is the efficiency hypothesis. Perhaps the CEO cuts bad investment projects, and so the cut in investment is efficient. Let’s say cutting investment is hard. It takes effort to identify wasteful projects and shut them down, and doing so may make the CEO unpopular. CEOs may instead prefer to coast and enjoy the quiet life. But, when the CEO is about to sell her shares, she overcomes her inertia and is willing to take tough decisions. If true, then short-term pressures are motivating, rather than distracting – a bit like how an impending essay deadline forces students to stop procrastinating.

We tested the efficiency hypothesis in two ways. First, if equity vesting causes the CEO to get her act together, you’d expect her to improve efficiency not just by cutting investment, but also by cutting other expenses or increasing sales growth. But, we find no evidence of this. Second, we show that CEOs cut investment less when the cuts are more costly to them (the CEO is younger, so she suffers more from the long-term consequences of scrapping an efficient investment; or the firm is younger or smaller, suggesting that the investment is more valuable). These tests suggest that the investment cut is indeed likely to be inefficient.

How Does the CEO Benefit?

One complication is that Q3 earnings aren’t announced until the start of Q4. So, how does a CEO who sells equity in Q3 benefit from the earnings increase that results from the investment cut? We show that vesting equity increases the likelihood that the CEO issues positive earnings guidance in the same quarter. Doing so boosts the stock price by 2.5%, thus indeed allowing her to cash out at a high price. Indeed, we find that the CEO’s equity sales are concentrated in a small window immediately following the guidance event. So, the full picture appears to be – the CEO knows that her equity is vesting in Q3, so she cuts investment in Q3 and also issues positive earnings guidance in Q3, boosting the stock price and allowing her to sell her shares upon vesting.

If the CEO boosts earnings-per-share by 5c, how much positive earnings guidance should she give? Probably around 4-5c – then, she will benefit as much as possible from the earnings increase – but not more than that, or else she will subsequently underperform expectations. (The same reason explains why the CEO can’t issue positive guidance without the investment cut – both go hand-in-hand.) Indeed, we find that, when more equity vests, the firm is particularly likely to beat the analyst forecast by a narrow margin (0-1 cent) but not a wide margin, consistent with the CEO communicating nearly all of the earnings increase ahead of time.

What Does It Mean For CEO Pay Design?

Executive pay is a highly controversial topic. Most people agree that it should be reformed, but the reforms typically focus on the level of pay. As I wrote earlier (see myth #5), the level of average CEO pay in the US is only 0.05% of firm value. Instead, these results suggest that the horizon of pay is more important – it affects the CEO’s incentives to invest, with potentially substantial implications for the company’s long-run success and the value it creates to other stakeholders. Cutting pay in half will win more headlines than extending the vesting horizon from (say) 3 to 7 years, but the latter is likely much more impactful. Indeed, the paper was referenced in the UK government’s Green Paper on Corporate Governance to justify the proposal to extend vesting periods. Here I describe a redesign of executive pay based in part on the results of this paper (since implemented by some companies), and here I summarize a paper by other scholars showing positive causal effects of long-term equity compensation.

* I apologize for covering one of my own papers. In this blog I typically cover other people’s papers. But, I needed to write about this paper anyway to fulfill the “dissemination” terms of my European Research Council grant, so decided to share it here.

Is Short Termism Really A Problem?

“Myopia [short-termism] is a first-order problem faced by the modern firm. In the last century, firms were predominantly capital-intensive, but nowadays competitive success increasingly depends on intangible assets such as human capital and R&D capabilities (Zingales (2000)). Building such competencies requires significant and sustained investment. Indeed, Thurow (1993) argues that investment is an issue of national importance that will critically determine the U.S.’s success in global competition.”

So I wrote in my 2007 “job market paper”, later published in the 2009 Journal of Finance. The “job market paper” is the signature paper from your PhD thesis, that you take on the academic job market and often ends up seeding your future research agenda as a faculty member. Indeed, most of my work over the last 10 years has focused on the causes of and potential solutions to short-termism. These include short-term executive contracts, excessive disclosure requirements, the stock market ignoring intangibles, and investors owning too small stakes. So, I have a vested interest in claiming that short-termism is a massive problem. With The Purposeful Company, I have been applying these insights from research to propose policy reforms that will encourage companies, investors, and stakeholders to think more long-term.

But, as with all issues, it is important to consider different perspectives. This will help address the problem of “confirmation bias” – only accepting evidence or arguments that reinforce your viewpoint and rejecting those that contradict it – that I discussed in a recent TEDx talk, “From Post-Truth to Pro-Truth”. Here I summarise an excellent, contrarian article entitled “Are U.S. Companies Too Short-Term Oriented?” by Chicago’s Steve Kaplan, one of the world’s leading authorities on corporate finance. Steve presents a number of cogent arguments for why the problem of short-termism may be exaggerated, which I summarize here.

  1. The Boy Who Cried Wolf

Short-termism is not a new allegation, particularly in the US. My job market paper opened with a 1992 quote from renowned Harvard professor Michael Porter:

The nature of competition has changed, placing a premium on investment in increasingly complex and intangible forms—the kinds of investment most penalized by the U.S. [capital allocation] system.

Porter argued that the US stock market was excessively liquid, leading to shareholders buying and selling companies based on short-term profits rather than long-term value. He advocated a move towards the Japanese system of long-term, stable stakes. However, the evidence of the past 25 years has suggested that the Japanese model has not been the panacea previously thought. While this may be for reasons other than its illiquidity, more direct evidence shows that liquidity has many beneficial effects on firm value.

Steve also includes a 1979 quote from renowned corporate lawyer Marty Lipton, and a 1980 quote by Harvard professors Robert Hayes and William Abernathy, alleging the problem of short-termism. If companies were underinvesting since the 1980s, surely they’ll feel the effects today, after nearly 40 years? But, Steve uses data from 1951 to show that US corporate profits are now near all-time highs, and that the long-term growth in profits has easily outstripped supposedly more “long-term” countries such as Japan. Moreover, the growth in profits was faster from the 1980s onwards than before – indeed, this is the very period in which the financial sector and the focus on shareholder value maximization (both alleged drivers of short-termism) started taking off. Thus, critics alleging short-termism may seem like the boy who cried wolf.

One may argue that corporate profits are not the best measure of value creation, since they are narrowly focused on shareholders. However, evidence suggests that, in the long-term, shareholders and stakeholders are aligned: serving stakeholders ultimately benefits shareholders – and 40 years is a long time period. More direct evidence suggests that society has benefited. Steve cites numbers from the World Bank suggesting that, in 1980, 2 billion people lived in extreme poverty (44% of the world’s population), which by 2012 fell to 900 million (13%). The World Bank projected last year that, for the first time, this number was expected to have fallen below 10%. Steve writes that “while causality is hard to prove and many factors have contributed to this result, US companies – through international outsourcing and globalization – have played an important role in these outcomes.”

  2. No Open Goal
Those who believe that investors are too myopic should celebrate, rather than lament, this behavior. If investors are indeed too focused on the short-term, and thus not financing companies with superb long-term prospects, this gives critics an open goal – the critics can put their money where their mouth is, address the financing gap, and make a killing.

And that’s what venture capital (VC) tries to do. Its end investors commit their capital for 5-10 years, allowing a VC fund to make long-term investments that address the financing gap. So, (a) if short-termism has increased over time, the scope for venture capital should have increased over time, and (b) if short-termism is a problem, venture capital should be unusually profitable.

Steve shows that neither hypothesis is true. Starting with (a), the capital committed to VC funds as a fraction of the stock market has fluctuated in a relatively narrow band of 0.10-0.20% over the last 35 years. This does not suggest there are huge untapped opportunities to invest in innovation. Turning to (b), numerous studies suggest that, while VC funds outperform the market, this outperformance is relatively modest. For every 1% increase in the stock market, VC earns around 1.1-1.2%, and this modest outperformance may not fully control for the greater risk of VC nor be scalable.

Similar results also hold for private equity, which – like VC – also has committed capital and thus the ability to make long-term investments. Also – like VC – it invests in private firms, which are shielded from the alleged short-termism of the stock market, such as the need to report quarterly earnings.

  3. Unicorn Valuations
The Price/Earnings ratio compares the price of a share to its current earnings. If the P/E is high, then the stock market is valuing a firm much more highly than can be justified by its current earnings, because it is taking into account the potential for future profits. The current P/E ratio of the S&P 500 is 25, versus a historical median of 15. Indeed, the high valuations of unicorns, despite them making little or even negative earnings, suggest that the stock market must be forward-looking and valuing something other than current profits.
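To make the P/E logic concrete, here is a minimal Python sketch. The share price and earnings figures are made up for illustration; only the multiples of 25 and 15 come from the text above. Treating the gap between the two multiples as the part of the price that reflects expected future profits is a rough reading, not a formal decomposition.

```python
def price_earnings_ratio(price_per_share: float, earnings_per_share: float) -> float:
    """Price of a share divided by its current earnings per share."""
    if earnings_per_share <= 0:
        # Many unicorns fall here: with zero or negative earnings the ratio
        # is not meaningful, yet the market still assigns a positive price.
        raise ValueError("P/E is not meaningful for zero or negative earnings")
    return price_per_share / earnings_per_share


# Hypothetical firm (illustrative numbers, not from the article).
PRICE = 100.0
EARNINGS_PER_SHARE = 4.0
HISTORICAL_MEDIAN_PE = 15.0  # historical median quoted in the text

pe = price_earnings_ratio(PRICE, EARNINGS_PER_SHARE)

# Price that current earnings would support at the historical median multiple.
price_at_median_multiple = HISTORICAL_MEDIAN_PE * EARNINGS_PER_SHARE
premium_over_median = PRICE - price_at_median_multiple

print(f"P/E: {pe:.1f}")
print(f"Price at a multiple of 15: {price_at_median_multiple:.0f}")
print(f"Premium over that price: {premium_over_median:.0f}")
```

With these numbers, the hypothetical firm trades at a multiple of 25, and 40 of its 100 share price sits above what current earnings would support at the historical median multiple.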

Relatedly, U.S. companies are increasingly less likely to be profitable when they go public. This holds not only for tech and biotech IPOs, but more broadly. Investors are increasingly likely to back biotech and fracking firms, even though they have significant periods of negative cash flows.

So Why Is Short-Termism Seen As A Problem? 

Given the evidence above, why is short-termism seen as such a problem? Steve points to a number of potential causes:

  • Executives may have a vested interest in claiming that short-termism is a problem, to make them less accountable. Claiming that they shouldn’t be evaluated until 3 years down the line guarantees them employment for at least 3 years.
  • Companies are seen as focusing excessively on share buybacks and dividend payouts rather than investment. In Steve’s words, “This argument is something of a non sequitur. It suggests that in a buyback or dividend, the money simply disappears rather than going to investors who spend it or use it to make other investments. It also suggests that companies that don’t need money should invest it anyway, rather than give it back to shareholders.”
    • Indeed, I believe that the current criticism of dividends and, in particular, buybacks, stems from substantial misunderstandings. I discuss these misunderstandings (in non-technical language) in p7-8 of my supplementary evidence to the UK House of Commons.
  • Confirmation bias. In the current political climate, many people see companies as evil, and are very willing to accept evidence that supports this view and reject evidence that contradicts it. As Steve writes, “the short-termers ignore a lot of evidence that goes against their position”.

Where Do We Stand?

Has Steve’s paper wiped out a large chunk of my research agenda and policy initiatives? No – it reinforces the need to take an evidence-based, circumspect approach to reform. It points to short-termism being a much more nuanced problem than the media or politicians claim. It is very tempting to make sweeping, unqualified statements (e.g. “all firms are short-termist”), as these are more likely to be turned into headlines or Tweeted in 140 characters. But, doing so is very dangerous. Few issues are black-and-white; indeed, despite being a staunch Remainer, I posted on the case for Brexit. Conveying the view that all executives are crooks who sacrifice long-term value for short-term profit contributes to an anti-business sentiment which in turn is a potential contributor to the rise of populism, Trump’s election, and the Brexit vote. Being fast and loose with the evidence has serious consequences. Moreover, the view that short-termism is a universal and pervasive epidemic has supported calls to “throw the baby out with the bathwater”, i.e. abandon the current system – that has led to substantial technological progress, rising corporate profits, and diminishing poverty – by mandating workers on boards, making managers less accountable by reducing shareholder rights, and tearing up current corporate forms for untried, untested alternatives.

Instead, diagnosis precedes treatment. Before deciding whether to amputate, a doctor will study whether a condition is local and can instead be spot-treated. Similarly, the optimal response to short-termism depends on how pervasive the problem is, and what the causes are. All the reforms that I have been proposing aim to work within the system, since my reading of the evidence is we do not have an epidemic, and so we do not want to tear up the system that has created many long-term unicorns. Moreover, the specific dimensions to reform should be driven by the evidence, which seems to suggest that buybacks and stock market liquidity are not causes of short-termism, but short-term executive pay and fragmented share ownership may be.

Corporate Governance in China

China will soon become the largest economy in the world, but many Westerners (myself included) know very little about it. Moreover, the vast majority of research on corporate governance is on the US. We often assume that these findings will apply throughout the world, but this assumption is unwarranted – the institutional setup is very different across different countries.

I thus sought to educate myself on China, and came across an excellent article by Fuxiu Jiang and Kenneth Kim of the Renmin University of China. In addition to providing a non-technical survey of Chinese corporate governance in its own right, it also introduces a special issue of the Journal of Corporate Finance with many papers on Chinese corporate governance. I summarize the article in bullet-point format below. All of these points I learned from the original article, so please cite it (not me) if you use anything from it (Jiang, Fuxiu and Kenneth A. Kim (2015): “Corporate Governance in China: A Modern Perspective”. Journal of Corporate Finance 32, 190-216). I hope you find this as helpful as I did.

Institutional Background

Capital Markets

  • On December 19, 1990 and July 3, 1991 the Shanghai and Shenzhen Stock Exchanges were launched. Shanghai is analogous to NYSE and Shenzhen to Nasdaq.
  • Regular domestic shares are A-shares, denominated in RMB. A small fraction of firms have B-shares, denominated in foreign currency (US or Hong Kong dollars).
    • B shares have the same cash flow rights as A shares, but were originally restricted to foreign investors.
      • Since 2001, Chinese can own B shares
      • Since 2003, qualified foreign institutional investors (QFIIs) can own A-shares
    • B shares are less than 0.5% of the total market cap on the two exchanges
  • Regulator is China Securities Regulatory Commission (CSRC), the equivalent of SEC
  • Shares are divided into tradable shares (TS, 1/3) and nontradable shares (NTS, 2/3). Initially, controlling shareholders (often the state or legal persons) held NTS, and domestic individual investors held TS.
  • Individual investors are typically uninformed speculators, leading to stock market volatility. Government has thus promoted institutional investors
    • In April 1998, the first closed-end fund was introduced. Open-end mutual funds and index funds were subsequently introduced.
    • October 27, 1999: insurance companies were approved to own stocks indirectly through a securities investment fund. October 24, 2004: insurance funds were allowed to invest in stocks directly.
    • As above, QFIIs could hold A-shares from 2003
    • Thus, tradable shares became held also by domestic and foreign institutional investors
  • Split share structure was to ensure that the government could retain control of firms. But, government realised that non-tradability is a problem – since NTS holders don’t benefit from stock price appreciation, they had little incentive to pursue shareholder value maximisation. Thus, conflict between TS and NTS
  • April 2005: government initiated the Split Share Reform, to transform all NTS into TS. Since this would dilute the value of TS, NTS holders had to negotiate a compensation plan with TS holders (typically additional shares)
    • Pilot programs conducted in April and June 2005. Reform expanded to all listed firms in August. By end of 2007, almost all firms had established a plan and timetable to convert NTS into TS. Since 2005, NTS are called “restricted shares” to convey the fact that they will eventually become tradable
  • Turnover is high. Even though it’s fallen, it still remains high by international standards. Average holding period of 1 year (4 months) on Shanghai (Shenzhen) Stock Exchange

Corporate Governance

  • For listed firms, a shareholder meeting is required once per year
    • Interim meetings can be called by large shareholders
  • A listed firm must have 5-19 directors
    • Board must meet at least two times per year
    • Since June 30, 2003, at least 1/3 of the board must be independent (can’t be related to the manager, be one of the top 10 shareholders or own 1% of shares, or have a business relationship with the firm).
    • Since China has concentrated ownership, primary duty of independent directors is to monitor large controlling shareholders on behalf of minority shareholders. In countries with dispersed ownership, it’s to monitor management on behalf of all shareholders.
  • Board structure is two-tier: in addition to the board of directors, there is a board of supervisors. Must have at least three supervisors, include representatives of shareholders, and at least 1/3 must be employees
  • Note that it’s the board chair who’s typically in charge of a company, not the CEO or General Manager (GM is often the title given to the CEO)
    • Chairs typically work full-time and go to work every day, unlike in the UK and US

Internal Governance: Stylized Facts and Interpretation

  • Ownership concentration
    • In 2012, largest shareholder owns, on average, 1/3 of the firm; 5 largest own over half of the firm
    • Ownership concentration has declined over time, particularly from 2005 to 2006 since common compensation in the Split Share Reform was to transfer shares from NTS to TS holders
    • Firms where the large shareholder owns > 50% have higher ROE but lower Q than other firms. Thus, even ignoring causality, it’s hard to say whether large shareholders are good or bad for firm value
    • From 2007, firms with multiple large shareholders outperform firms with single large shareholders in ROE. This may be because 2007 is the first year when firms have more TS than NTS, so governance through exit is strong (one large shareholder can threaten to sell if another large shareholder doesn’t cooperate with it)
    • When the government is a large shareholder, it does not tunnel for private benefits (e.g. perks), but it may sacrifice shareholder value for political objectives such as maintaining employment or overinvesting to prop up GDP
  • Managerial ownership
    • SOEs: managers have very little stake, typically because the manager is a government official appointed by the state
    • Non-SOEs: average ownership is 16%, since most non-SOEs are family firms or founded by entrepreneurs. But, median ownership is 0% in most years and 1.1% in 2012. Managers are rarely given shares or options as compensation; managers only become significant shareholders if it’s a family firm or if they buy the shares personally
  • Managerial pay
    • Pay has rapidly increased in a short period of time, but remains modest globally. In 2012, median pay for top manager of SOEs is RMB 470k ($77k)
    • Pay is not an important incentive for SOE managers. They’re government employees, so are incentivized by being promoted to high-level government positions when their term as firm managers has finished. Also, poorly-performing SOE managers are fired. Thus, incentives still matter, but aren’t provided by pay
  • Institutional ownership
    • Has risen over time, largely driven by emergence of mutual funds
    • But, ownership remains small.
      • In 2012, total institutional (mutual fund) ownership averages 17.4% (7.6%).
      • Median ownership of a mutual fund was 0.067% in 2011
    • In 2011, average holding period for a mutual fund is less than 6 months
  • Board structure
    • CEOs are chairs 25% of the time in non-SOEs, 10% of the time in SOEs
  • Capital structure
    • Average leverage in non-financial firms is 1/3. High compared to UK and US
    • Debt is unlikely to discipline managers in China since creditor rights are weak. Thus, bankruptcies are extremely rare
    • Banks don’t appear to monitor. Qian and Yeung (2015): even when controlling shareholders are tunneling from minority shareholders, banks continue to lend, and loan terms aren’t unfavorable.
  • Dividend policy
    • Dividends are very small: around 1%. Potential reasons:
      • Minority shareholders aren’t able to pressure firms to pay out earnings as dividends, since minority shareholder rights are weak.
      • Turnover is high, and so minority shareholders are speculators going after capital gains rather than caring about dividends
    • Dividends are largely driven by regulations.
      • E.g. Number of paying firms more than doubles in 2000 because a CSRC regulation, with effect from March 2001, required a Chinese-listed firm to pay dividends for three consecutive years if it wants to sell new shares

External Governance

  • As China has transitioned from a centrally planned economy to a market-oriented one, China has issued many laws and securities regulations, but China remains internationally weak in its laws, enforcement, and punishment
  • Government recognizes this and is taking steps. 2002 is referred to as the “Year of Corporate Governance of China”
    • Released Code of Corporate Governance
    • CSRC enacted many governance reforms and regulations, e.g. improving disclosure requirements when large shareholders change
    • CSRC undertook an unprecedented large-scale review of 1,175 listed firms. Found that 30% had significant governance problems. Many CEOs were fired, many firms were fined.
  • Unlike other countries, little governance through managerial labor market, which is nascent
    • SOEs don’t compete among themselves for the best managers, since the government is the only demand-side entity
    • Many non-SOE firms are family firms, so little external hiring historically. May change going forwards as firms become more complex, and China’s one-child policy limits number of family candidates
  • Unlike other countries, little governance through corporate control market, which is nascent
    • State won’t sell SOEs to a raider
    • For non-SOEs, ownership is so concentrated that it would be hard for a raider to gain control
    • But, this may change going forwards given that almost all shares are now tradable
  • Like other countries, product market competition is an effective governance mechanism
  • Many Chinese firms engage in CSR to curry favor with the government, since one of the government’s main roles is to promote social welfare (like other countries). Lin et al. (2015): firms that engage in CSR are more likely to receive government subsidies
  • Cross-listings are likely an effective way for Chinese firms to obtain good governance

China’s Corporate Governance Code

  • Like most codes, contains broad and vague language that describes guiding principles rather than explicit regulations. There are eight chapters
  1. Shareholder rights
  2. Rules for controlling shareholders, including advocating a “reasonably balanced shareholding” (multiple sizable blockholders rather than a single large blockholder)
  3. Rules for directors and board of directors
  4. Duties and responsibilities of the supervisory board. Board is accountable to all shareholders and oversees both directors and senior management
  5. Performance assessments for directors, supervisors, and management
  6. Stakeholders. Firms should be good corporate citizens and cooperate with, inform, listen to, and honor the legal rights of stakeholders
  7. Disclosure. Firms must fully and accurately disclose all information required by law
  8. Code comes into effect on the date of issuance

Dangers of Using a Company-Wide Discount Rate

Any Finance 101 class will emphasize that the appropriate discount rate for a project depends on the project’s own characteristics, not the firm as a whole. If a utilities firm moves into media (e.g. Vivendi), it should use a media beta – not a utilities beta – to calculate the discount rate. However, a survey found that 58% of firms use a single company-wide discount rate for all projects, rather than a discount rate specific to the project’s characteristics. Indeed, when I was in investment banking, several clients would use their own cost of capital to discount a potential M&A target’s cash flows.

But the important question is – does this really matter? Perhaps an ivory-tower academic will tell you the correct weighted average cost of capital (WACC) is 11.524% but if you use 10%, is that good enough? Given the cash flows of a project are so difficult to estimate to begin with, it seems pointless to “fine-tune” the WACC calculation.

An interesting paper, entitled “The WACC Fallacy: The Real Effects of Using a Unique Discount Rate”, addresses the question. The paper is forthcoming in the Journal of Finance and co-authored by Philipp Krueger of Geneva, Augustin Landier of Toulouse and David Thesmar of HEC Paris.

This paper shows that it matters. The authors first looked at organic investment (capital expenditure, or “capex”). If your core business is utilities and the non-core division is media, you should be using a media discount rate for non-core capex. But, if you incorrectly use a utilities discount rate, the discount rate is too low and you’ll be taking too many projects. The authors indeed find that capex in a non-core division is greater if the non-core division has a higher beta than the core division. Moreover, they find the effect is smaller (a) in recent years, consistent with the increase in finance education (e.g. MBAs), (b) for larger divisions – if the non-core division is large, then management puts the effort into getting it right, (c) when management has high equity incentives, as these also give them incentives to get it right.
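To see why a too-low discount rate leads to too many projects, here is a minimal Python sketch. It uses the textbook CAPM to turn a beta into a discount rate and assumes an all-equity-financed project for simplicity. The risk-free rate, market risk premium, betas, and cash flows are all hypothetical numbers chosen for illustration; none of them come from the paper.

```python
def capm_discount_rate(risk_free: float, beta: float, market_premium: float) -> float:
    """Textbook CAPM: required return = risk-free rate + beta * market risk premium."""
    return risk_free + beta * market_premium


def npv(rate: float, cash_flows: list[float]) -> float:
    """Net present value; cash_flows[0] occurs today, cash_flows[t] in year t."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))


# Hypothetical inputs (assumptions for illustration only).
RISK_FREE = 0.03
MARKET_PREMIUM = 0.05
UTILITIES_BETA = 0.5  # core business
MEDIA_BETA = 1.5      # non-core division

# A media project: invest 100 today, receive 108 in one year.
media_project = [-100.0, 108.0]

company_wide_rate = capm_discount_rate(RISK_FREE, UTILITIES_BETA, MARKET_PREMIUM)
project_rate = capm_discount_rate(RISK_FREE, MEDIA_BETA, MARKET_PREMIUM)

npv_company_wide = npv(company_wide_rate, media_project)
npv_project_specific = npv(project_rate, media_project)

print(f"Utilities rate {company_wide_rate:.1%}: NPV = {npv_company_wide:+.2f}")
print(f"Media rate {project_rate:.1%}: NPV = {npv_project_specific:+.2f}")
```

Under these assumptions the same media project has a positive NPV at the utilities rate and a negative NPV at the media rate, so a firm applying its company-wide rate would accept a project that it should reject.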

The authors then turn to M&A. They find that conglomerates tend to buy high-WACC targets rather than low-WACC targets, again consistent with them erroneously using their own WACC to value a target, when they should be using the target’s own high WACC. Moreover, the attraction of studying M&A is the authors can measure the stock market’s reaction to the deal, to quantify how much value is destroyed. They find that shareholder returns are 0.8% lower when the target’s WACC is higher than the acquirer’s WACC. They study 6,115 deals and the average acquirer size is $2bn. Thus, the value destruction is 0.8% * $2bn * 6,115 = $98bn lost to acquirers in aggregate because they don’t apply a simple principle taught in Finance 101!
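The back-of-the-envelope calculation above can be reproduced in a few lines of Python. The sketch uses only the three numbers quoted in the paragraph and, like the paragraph, applies the 0.8% shortfall to every deal at the average acquirer size. The variable names are mine.

```python
RETURN_SHORTFALL = 0.008      # shareholder returns are 0.8% lower
AVG_ACQUIRER_SIZE_BN = 2.0    # average acquirer size, in $bn
NUM_DEALS = 6115              # deals studied

# Value lost per deal, in $bn (0.016bn, i.e. $16m).
loss_per_deal_bn = RETURN_SHORTFALL * AVG_ACQUIRER_SIZE_BN

# Aggregate value lost across all deals, in $bn.
aggregate_loss_bn = loss_per_deal_bn * NUM_DEALS

print(f"Loss per deal: ${loss_per_deal_bn * 1000:.0f}m")
print(f"Aggregate loss: ${aggregate_loss_bn:.1f}bn")
```

The aggregate comes to roughly $97.8bn, which rounds to the $98bn quoted above.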

We often wonder whether textbook finance theory is relevant in the real world – perhaps you don’t need the “academically” right answer and it’s sufficient to be close enough. But this paper shows that “getting it right” does make a big difference.

Why Banks Should Use Less Debt Financing

In the aftermath of the financial crisis, there have been numerous calls for banks to finance themselves less with debt and more with equity, to reduce the risk of another crisis. But this has been met with great resistance by bankers. They argue that equity is costlier than debt, and so forcing them to use more equity will make it more expensive for them to raise capital. If they can’t raise as much capital, they won’t be able to lend as much to small businesses and homeowners; if it’s more expensive to raise capital, they’ll need to take on riskier projects to generate a high enough return to meet their cost of capital. For example, Jamie Dimon of JP Morgan has said (paraphrased): “If they force us to hold more equity, we will have to take on riskier projects to hit our required return on equity”.

The Modigliani-Miller theorem, taught in undergrad or MBA finance 101, tells us that (under certain conditions), firm value is independent of capital structure – equity is no more costly than debt. Indeed, Jamie Dimon’s seemingly intuitive argument involves not one, not two, but three violations of basic finance theory:

  1. It treats the required return on equity as a constant (as if it were pi or Avogadro’s number). But, basic finance theory tells us that it depends on financial risk. If the firm is financed by more equity, it’s less risky, and so shareholders demand a lower return on equity. Banks won’t need to take on more risk, because the target will have fallen.
  2. Basic finance theory tells us that the required return on equity also depends on business risk. If the firm “takes on riskier projects”, shareholders will demand a higher return as a result. Thus, banks won’t have an incentive to take on more risk, because this will cause the target to rise.
  3. Equity is not something that you “hold”. It doesn’t sit idly on the balance sheet doing nothing – the bank can invest or lend the money raised by equity. Equity isn’t an asset, it’s a liability – it’s how a bank finances itself. If a firm finances itself with equity rather than debt (changes its liability mix), it needn’t change the projects it invests in (its asset mix).
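The first two points can be illustrated with a minimal Python sketch of the textbook Modigliani-Miller relationship without taxes, where the required return on equity equals the return on assets plus the debt-to-equity ratio times the spread between the return on assets and the cost of debt. The 5% return on assets, the 4.5% cost of debt, and the two equity ratios are hypothetical numbers. Holding the cost of debt fixed as leverage changes is a simplifying assumption.

```python
def required_return_on_equity(asset_return: float, debt_cost: float,
                              equity_ratio: float) -> float:
    """Modigliani-Miller (no taxes): r_E = r_A + (D/E) * (r_A - r_D)."""
    if not 0 < equity_ratio <= 1:
        raise ValueError("equity_ratio must be in (0, 1]")
    debt_to_equity = (1 - equity_ratio) / equity_ratio
    return asset_return + debt_to_equity * (asset_return - debt_cost)


def overall_cost_of_capital(asset_return: float, debt_cost: float,
                            equity_ratio: float) -> float:
    """Weighted average of the equity and debt costs at the given equity ratio."""
    r_e = required_return_on_equity(asset_return, debt_cost, equity_ratio)
    return equity_ratio * r_e + (1 - equity_ratio) * debt_cost


# Hypothetical bank (assumptions for illustration only).
ASSET_RETURN = 0.05   # reflects business risk
DEBT_COST = 0.045     # held fixed for simplicity

for equity_ratio in (0.06, 0.16):
    r_e = required_return_on_equity(ASSET_RETURN, DEBT_COST, equity_ratio)
    total = overall_cost_of_capital(ASSET_RETURN, DEBT_COST, equity_ratio)
    print(f"Equity ratio {equity_ratio:.0%}: required return on equity "
          f"{r_e:.1%}, overall cost of capital {total:.1%}")
```

With more equity, the required return on equity falls (point 1) while the overall cost of capital stays at 5%. Raising `ASSET_RETURN` to represent riskier projects raises the required return on equity at any given equity ratio (point 2).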

The fallacies inherent in most bankers’ arguments are exposed in Anat Admati and Martin Hellwig’s influential book “The Bankers’ New Clothes”; see this link for non-technical articles on this topic. However, some bankers may counter that the Modigliani-Miller theorem doesn’t hold in the real world. There are valid reasons for why it’s advantageous to finance with debt rather than equity – debt gives tax shields, and incentivizes management to work harder to avoid bankruptcy.

But a new paper by Roni Kisin and Asaf Manela of the Olin School of Business at Washington University in St. Louis exposes these arguments – using banks’ own actions! They find that bankers’ own behavior suggests that they don’t view debt as useful – that the above advantages of debt are small in the real world. Their identification is clever. They exploit the fact that, prior to the crisis, banks had access to a loophole – asset-backed commercial paper conduits (a form of securitization) that allowed them to lower their equity capital requirements by 90%.

Using these conduits was costly – the interest rate on asset-backed commercial paper is higher than that on directly-issued commercial paper (which didn’t benefit from the loophole). Thus, banks traded off the benefits (of reducing equity capital requirements) with the costs of using the conduit. If financing themselves with equity, rather than debt, truly was costly, banks would have used the conduits to a large degree – particularly since the availability of the loophole was well-known to all banks.

But they didn’t. Roni and Asaf estimate that, based on the limited usage of these conduits, it’s not costly for banks to finance themselves with equity. Even if banks were to increase their equity ratios from 6% to 16%, this would cost all U.S. banks in aggregate $3.7 billion. The average cost per bank is $143 million, or 4% of annual profits. Lending interest rates would rise by 0.03% and quantities would decrease by 1.5%. While the above numbers are not small, they are far lower than the numbers bandied about by bankers, and arguably a small price to pay to substantially reduce the risk of another crisis.

One caveat is that the authors are clear that they quantify the cost of increasing equity capital requirements, rather than the cost of increasing equity capital. It may be that the cost of increasing equity capital requirements is low, not because the cost of raising equity is low, but because banks have other ways of complying with the requirements (e.g. other loopholes, or changing the riskiness of the assets they invest in). Nevertheless, the paper provides innovative evidence that the cost of increasing capital requirements is much lower than what many banks claim.

How Corporate Credit Ratings Induce Short-Termism

Credit rating agencies were under particular scrutiny in the recent financial crisis, as critics argue they gave too high ratings to securities that turned out to be toxic. One potential culprit is the “issuer-pays” model, where it is the company being rated that pays for credit ratings, which may encourage rating agencies to be overly-generous to win business.

But, a recent paper by my new LBS colleague Taylor Begley points to an important additional cost of corporate credit ratings – and one that arises even if ratings are perfectly accurate. Companies may engage in short-term behavior to achieve a particular credit rating. This problem arises because credit ratings are discrete categories (e.g. AAA, AA+, BB) rather than a continuous number (e.g. 93.2, 87.8). Thus, a company has a strong incentive to just get into the AAA- category rather than be at the top of the AA+ category.

In turn, a major driver of credit ratings is a company’s financial ratios. For example, for firms with an excellent business risk profile, a Debt/EBITDA ratio of 1.5-2.0 typically leads to a rating of AA; a ratio of 2.0-3.0 typically leads to a rating of A. For firms with a fair business risk profile, a Debt/EBITDA ratio of 1.5-2.0 typically leads to a rating of BBB-; a ratio of 2.0-3.0 typically leads to a rating of BB+ (which is below investment-grade, i.e. has “junk” status). (Source: Standard & Poor’s Business Risk / Financial Risk Matrix).

These discrete thresholds thus give companies incentives to lie just below a threshold. They can achieve this by short-term behavior such as cutting research and development (R&D). This increases EBITDA, thus reducing the Debt/EBITDA ratio and potentially meeting the threshold. Importantly, the incentives to engage in short-termism depend on where the firm is compared to the next lowest threshold. A firm with a Debt/EBITDA ratio of 2.1 has strong incentives to engage in short-termism, because it has a high chance of being able to lower it to below 2.0, but a firm with a Debt/EBITDA ratio of 2.5 has much weaker incentives. Taylor indeed finds that firms close to a threshold are significantly more likely to cut not only R&D, but also selling, general, and administrative (SG&A) expenses, which include expenditure on advertising, information technology, employee training, and other forms of organizational capital.
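To make the threshold arithmetic concrete, here is a minimal sketch in Python. It is my own illustration rather than anything from Taylor’s paper. It uses only the two “fair business risk profile” bands quoted above, and it assumes that R&D is fully expensed (so each dollar cut adds a dollar to EBITDA), that a ratio of exactly 2.0 falls into the worse band, and that debt and EBITDA take the made-up values shown. The function names are hypothetical.

```python
# Bands quoted in the text for a "fair" business risk profile.
# Assumption: each band includes its lower bound and excludes its upper bound.
FAIR_PROFILE_BANDS = [
    (1.5, 2.0, "BBB-"),  # investment grade
    (2.0, 3.0, "BB+"),   # "junk" status
]


def rating_for(debt_to_ebitda):
    """Return the typical rating for a ratio, or None outside the quoted bands."""
    for low, high, rating in FAIR_PROFILE_BANDS:
        if low <= debt_to_ebitda < high:
            return rating
    return None


def ratio_after_rd_cut(debt, ebitda, rd_cut):
    """Debt/EBITDA after an R&D cut, assuming the cut flows one-for-one into EBITDA."""
    return debt / (ebitda + rd_cut)


def rd_cut_to_reach(debt, ebitda, threshold):
    """R&D cut at which Debt/EBITDA equals the threshold; any larger cut goes below it."""
    return max(0.0, debt / threshold - ebitda)


# Hypothetical firms, both with EBITDA of 100 (say, $100 million).
near = {"debt": 210.0, "ebitda": 100.0}  # Debt/EBITDA = 2.1
far = {"debt": 250.0, "ebitda": 100.0}   # Debt/EBITDA = 2.5

print(rd_cut_to_reach(near["debt"], near["ebitda"], 2.0))  # 5.0
print(rd_cut_to_reach(far["debt"], far["ebitda"], 2.0))    # 25.0

# A cut of 6 takes the near-threshold firm from BB+ to BBB-.
print(rating_for(near["debt"] / near["ebitda"]))                        # BB+
print(rating_for(ratio_after_rd_cut(near["debt"], near["ebitda"], 6)))  # BBB-
```

Under these assumptions, the firm at 2.1 must cut spending by just over 5% of EBITDA to cross the threshold, while the firm at 2.5 must cut by just over 25%. That gap is why the temptation is so much stronger for firms sitting just above a threshold.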

Other papers have previously found evidence of short-termism to meet other types of thresholds – for example, companies may cut R&D to ensure their earnings fall just above analyst earnings expectations. But a particularly novel finding of this paper is that Taylor is able to document negative long-run effects of such short-termism. Companies close to ratings thresholds subsequently suffer declines in the number of patents that they produce, and also the number of citations to their patents (a measure of the quality of innovation). They also experience declines in profitability and valuation ratios.

The cost of credit ratings that critics typically focus upon is that inaccurate ratings lead to redistributional consequences. If the rating of a security is too high, the buyer pays too much for it. Thus, the seller wins and the buyer loses. While these redistributional concerns are clearly very important, they don’t directly affect the overall size of the pie (sellers get a larger slice, buyers a smaller slice). In contrast, Taylor shows that credit ratings have efficiency (rather than just redistributional) consequences – they affect the overall size of the pie. If companies cut investment to meet ratings thresholds, they erode their future value, making everyone worse off in the long run. This is a particular concern for the 21st century firm, whose value is especially driven by intangible assets (such as brand strength, innovative capabilities, and corporate culture) which require several years to build and bear fruit.

The paper certainly does not argue that credit ratings should be scrapped; these costs must be weighed against their numerous benefits. Many financial targets (e.g. analyst earnings expectations) also have the potential to lead to short-termism. Rather, the paper highlights a potential cost of credit ratings that boards may be able to mitigate. One potential remedy that I discussed in a previous post is to increase the vesting period of executives’ stock and options, to tie them to the long-run performance of the firm.