(This article was initially featured on the blog for my book, “Grow the Pie: How Great Companies Deliver Both Purpose and Profit”)
Investors are increasingly scrutinizing the ESG performance of companies. This is a positive development – if CEOs know that investors evaluate them on ESG factors, not just short-term earnings, they’ll prioritise these factors over quarterly profit. Indeed, research shows that investors are less likely to sell a company that’s announced weak earnings if its ESG performance is strong. This suggests that investors recognise that quarterly earnings are a less relevant measure of performance for companies that take their social impact seriously.
But one major challenge is that it’s very difficult to measure ESG performance. This challenge may be why some investors focus on short-term financial metrics when evaluating a company. We know what a company’s profits and dividends are – but we don’t know what its ESG performance is, as it’s so difficult to define.
For that reason, ESG rating agencies (such as MSCI, Sustainalytics, Vigeo Eiris, ISS, RobecoSAM, and Asset4) can play a major positive role. They painstakingly collect and aggregate a range of information on a company’s ESG performance – its own disclosures, third-party reports (e.g. from NGOs), news items, and proprietary research through company interviews and questionnaires. They come up with an overall ESG score, as well as scores for the individual components (E, S, and G) separately.
However, different rating agencies disagree substantially on a company’s ESG performance. The correlation between ESG ratings across different providers is around 0.3. This contrasts with credit ratings, where the correlation between ratings by S&P and Moody’s is around 0.99.
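To make the correlation figures concrete, here is a minimal sketch of how the agreement between two raters could be computed. The scores below are invented purely for illustration and do not come from any real provider; `pearson`, `rater_1` and `rater_2` are hypothetical names.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up ESG scores (0-100) that two hypothetical raters assign
# to the same five companies, in the same order.
rater_1 = [70, 55, 80, 40, 65]
rater_2 = [50, 60, 45, 55, 75]

print(f"Correlation between raters: {pearson(rater_1, rater_2):.2f}")
```

A value near 1 means the two raters rank companies almost identically, as with credit ratings; a value near 0.3 means that knowing one rater's score tells you relatively little about the other's.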
Two recent papers do a deep dive into the source of the disagreement. One is an MIT Sloan working paper entitled Aggregate Confusion: The Divergence of ESG Ratings. It decomposes the disagreement into three sources:
- 53% is due to measurement: raters measure the same ESG attribute with different indicators. For example:
  - Labour practices could be evaluated on the basis of workforce turnover, or number of labour cases against the firm
  - Female friendliness could be measured by the gender pay gap, the percentage of women on the board, or the percentage of women in the workforce
- 44% is due to scope: different raters include different attributes
  - Most raters consider a firm’s greenhouse gas emissions when evaluating its environmental record, but only some will include electromagnetic radiation
  - One rating agency may include lobbying, another might not
- 3% is due to weight: different raters place different weights on the individual components when calculating the overall score
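The three sources above can be illustrated with a toy calculation. This is only a sketch under simple assumptions, not the paper's methodology: the firm, the indicator values, the two raters and their weights are all hypothetical, and each rater's overall score is assumed to be a weighted average of the indicators it chooses to include.

```python
# Hypothetical indicator values for one firm, each scaled to 0-1
# (higher is better). Every number here is made up for illustration.
firm = {
    "ghg_emissions": 0.8,
    "em_radiation": 0.2,
    "workforce_turnover": 0.7,
    "labour_cases": 0.3,
}

# Each hypothetical rater maps the indicators it uses to a weight.
# Scope:       only rater B includes electromagnetic radiation.
# Measurement: rater A gauges labour practices by workforce turnover,
#              rater B by the number of labour cases.
# Weight:      the raters weight greenhouse gas emissions differently.
rater_a = {"ghg_emissions": 0.5, "workforce_turnover": 0.5}
rater_b = {"ghg_emissions": 0.4, "em_radiation": 0.2, "labour_cases": 0.4}

def overall_score(firm, rater):
    """Weighted average of the indicators that this rater includes."""
    total_weight = sum(rater.values())
    return sum(firm[name] * w for name, w in rater.items()) / total_weight

score_a = overall_score(firm, rater_a)
score_b = overall_score(firm, rater_b)
print(f"Rater A: {score_a:.2f}, Rater B: {score_b:.2f}")  # Rater A: 0.75, Rater B: 0.48
```

Both raters look at the same firm, yet one scores it 0.75 and the other 0.48, without either making an error.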
Surprisingly, raters even disagree on objective facts that can be verified from public records. Across raters, the indicator for whether a company is a member of the UN Global Compact has a correlation of 0.86, and the indicator for whether the CEO is also the Chairman has a correlation of only 0.56. The authors also point to a strong “rater effect” – if a provider rates a company highly on one attribute, it tends to rate it highly on other attributes too. This is an example of the “halo effect” extensively documented in psychology.
The second study is What a Difference an ESG Ratings Provider Makes! by Research Affiliates. (For full disclosure, I’m an Advisor to Research Affiliates, but had no involvement in the study.) Rather than looking at aggregate data like the first study, it does a deep dive into specific examples, looking at how two different providers rate two specific companies:
- Provider 1 ranks Wells Fargo as top-quartile in Governance, while Provider 2 ranks it in the bottom 5%. That’s because Provider 1 counts the fake bank accounts scandal within “Information to Customers”, which is part of its Social score, so the scandal does not affect its Governance score.
- Provider 1 gives Facebook a top 10% Environmental rating, while Provider 2 ranks it in the bottom 30%. That’s because their environmental ratings measure quite different attributes, and weight those attributes differently.
What does this all mean for investors? It does not at all mean that ESG ratings are useless, or that providers are biased or incompetent. ESG performance is simply difficult to measure, and reasonable people can disagree – just as one equity research analyst will rate a company as a Buy and another as a Sell.
But what it does mean is that you can’t just take an ESG rating off the shelf and then trade according to it. You need to understand what the ESG rating is actually capturing, because different providers define ESG in different ways. Some investors may care about electromagnetic radiation but others may not, and so an investor needs to understand whether the rating takes this into account.
I once did a Wall Street Journal debate on “Does SRI Make Financial Sense?” where my opponent argued that SRI is futile since it’s impossible to define ESG performance. But this “confusion” is actually good for investors, rather than bad. If there were one unambiguously correct ESG rating, then we wouldn’t need human investors and a computer could put on an ESG strategy. That we can’t rely on ratings – but need to deeply understand a company, talk to management, and take into account its strategic context – means that human investors can add substantial value even in a big data world.
Similarly, if there were one unambiguously correct ESG rating, it would be priced into the market – so you wouldn’t be able to make money trading on it. The book discusses multiple sources of ESG information that are not taken into account by the market, because the market has difficulty in valuing intangible factors (like ESG) that are hard to measure. What this means is that all investors – not just “socially responsible” investors – should take ESG factors seriously. While they’re sometimes dismissed as “non-financial” factors, evidence shows that they often become financially material in the long run.