I have a claims data set that can be represented in the familiar triangular form. I know from an exogenous source that the data conforms with the assumptions of the chain ladder in its so-called EDF form, so the chain ladder is a valid forecast model.
The data set is at the transaction level, with each transaction tagged with its date, and so I can choose any granularity I wish for the triangle: annual, quarterly, weekly, daily, or even indefinitely fine. Which should I choose if my objective is to produce a chain ladder forecast of loss reserve with minimum prediction error?
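To make the choice concrete: whichever granularity is selected, the triangle is just a re-binning of the same transactions. Here is a minimal sketch in Python (using pandas; the column names accident_date, payment_date and amount are hypothetical, not taken from any particular data set) of how one triangle per granularity might be built.

```python
import pandas as pd

def build_triangle(txns: pd.DataFrame, freq: str = "Y") -> pd.DataFrame:
    """Aggregate transaction-level claims into an incremental triangle.

    txns is assumed to contain datetime columns 'accident_date' and
    'payment_date' and a numeric column 'amount'. freq is any pandas
    period alias: 'Y' (annual), 'Q' (quarterly), 'W' (weekly), 'D' (daily).
    """
    origin = txns["accident_date"].dt.to_period(freq)
    calendar = txns["payment_date"].dt.to_period(freq)
    # Development lag, in whole periods of the chosen granularity.
    dev = (calendar - origin).map(lambda offset: offset.n)
    return txns.assign(origin=origin, dev=dev).pivot_table(
        index="origin", columns="dev", values="amount",
        aggfunc="sum", fill_value=0.0,
    )
```

The same call with freq="Q" or freq="D" re-bins the identical data into the quarterly or daily version of the triangle.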
Most people respond to this question somewhat along the following lines. Clearly, if I adopt extremely coarse granularity, say 5-year development periods, my model will make use of little information and can be expected to deliver poor results. So I should begin to increase granularity from this extreme. But how far? If I reduce my development periods to days, my data cells will become very erratic, and this will also produce poor results. There must be an optimum granularity between the extremes. Mustn’t there?
Wrong!
The technical answer will appear shortly, but a lapse in the above logic can already be noted. The respondent is clearly correct in assessing that data cells of short duration behave erratically but, by considering nothing further, overlooks the fact that there will be very many of these cells, and their erratic variations will tend to cancel out.
But this is merely an intuitive argument that fine granularity might not be as absurd as one is tempted to think. It says nothing about optimum granularity.
A rigorous investigation of the situation is carried out in this open-access paper. This is one of a trio of papers that I published during 2025 on the chain ladder and granularity. Detailed citations are given at the end of this article.
The results in the paper are mathematically, not numerically, based. The conclusions are, therefore, black and white. I won’t dwell on the theory other than to mention the basic concept used to arrive at the conclusions. This is the concept of a sufficient statistic.
Essentially, a sufficient statistic (for some parameter governing a data set) is a summary statistic of the data set that contains just as much information as the entire data set for the purpose of estimating the parameter. The concept may be applied to the chain ladder.
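For reference (this is the standard Fisher-Neyman factorisation criterion, not a result of the paper): a statistic $T(Y)$ is sufficient for a parameter $\theta$ if the likelihood factorises as

$$ L(\theta; y) = g\big(T(y), \theta\big)\, h(y), $$

where $h$ does not involve $\theta$. Once $T(y)$ is known, the remainder of the data carries no further information about $\theta$.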
There are actually two distinct chain ladder models, usually referred to as the EDF chain ladder and the Mack chain ladder, the details of whose definitions need not detain us here. Suffice it to say that, although they are quite different models, there is an equivalence relation between their parameter sets and they yield the same estimates of loss reserve.
It is a simple matter to show that the sufficient statistics for the parameters of the EDF chain ladder are the row and column sums of the subject claim triangle. There is then a relatively simple argument demonstrating that a choice of finer granularity will always lead to smaller prediction error in the forecast liability.
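The flavour of the sufficiency claim can be seen from the simplest member of the EDF class (a sketch of the idea, not the paper's general argument). Suppose the incremental cells are independent with $Y_{ij} \sim \mathrm{Poisson}(\alpha_i \beta_j)$, where $\alpha_i$ is a row parameter and $\beta_j$ a column parameter. The log-likelihood is

$$ \ell(\alpha, \beta) = \sum_{i,j} \left( Y_{ij} \ln \alpha_i + Y_{ij} \ln \beta_j - \alpha_i \beta_j - \ln Y_{ij}! \right), $$

and, apart from the parameter-free term $\sum_{i,j} \ln Y_{ij}!$, this depends on the data only through the row sums $\sum_j Y_{ij}$ and the column sums $\sum_i Y_{ij}$. By the factorisation criterion above, these sums are sufficient.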
There is one technical exception. If granularity is made too fine, then the chance of a cell being empty increases, which leads to a higher chance of zero row and/or column sums appearing. These carry no additional information. In this case, the prediction error does not decrease, but neither does it increase.
Thus, the optimum granularity is the finest such that any finer granularity will achieve nothing more than the introduction of one or more zero rows or columns.
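The flavour of the result can also be checked by simulation. The sketch below (Python; all numerical settings are illustrative assumptions, not taken from the paper) simulates cross-classified Poisson triangles, applies the chain ladder at a fine granularity and again after re-binning the same data into periods twice as long, and compares the prediction errors. One would expect the root-mean-square error at the finer granularity to come out no larger.

```python
import numpy as np

rng = np.random.default_rng(1)

def cl_reserve(tri):
    """Chain ladder reserve from an incremental triangle.

    tri: (n, n) array with cell (i, j) observed iff i + j < n;
    unobserved cells are zero. Returns the forecast of the total
    of the unobserved (future) cells.
    """
    n = tri.shape[0]
    cum = tri.cumsum(axis=1)
    # Volume-weighted development factors, column j to column j + 1.
    f = np.ones(n - 1)
    for j in range(n - 1):
        rows = n - 1 - j  # rows with both columns j and j + 1 observed
        den = cum[:rows, j].sum()
        if den > 0:
            f[j] = cum[:rows, j + 1].sum() / den
    # Project each row's latest cumulative to ultimate.
    reserve = 0.0
    for i in range(1, n):
        latest = cum[i, n - 1 - i]
        reserve += latest * (np.prod(f[n - 1 - i:]) - 1.0)
    return reserve

n_fine, n_coarse = 20, 10          # coarse periods are twice as long
alpha = np.full(n_fine, 500.0)     # row (origin) parameters
beta = np.exp(-0.3 * np.arange(n_fine))
beta /= beta.sum()                 # column (development) parameters
mean = np.outer(alpha, beta)

i, j = np.indices((n_fine, n_fine))
observed = i + j < n_fine              # calendar cut-off at fine scale
I, J = i // 2, (i + j) // 2 - i // 2   # coarse origin and development

errs_fine, errs_coarse = [], []
for _ in range(2000):
    full = rng.poisson(mean)
    fine = np.where(observed, full, 0)
    coarse = np.zeros((n_coarse, n_coarse))
    np.add.at(coarse, (I[observed], J[observed]), full[observed])
    true_outstanding = full[~observed].sum()
    errs_fine.append(cl_reserve(fine) - true_outstanding)
    errs_coarse.append(cl_reserve(coarse) - true_outstanding)

print("RMSE, fine granularity  :", np.sqrt(np.mean(np.square(errs_fine))))
print("RMSE, coarse granularity:", np.sqrt(np.mean(np.square(errs_coarse))))
```

Note that the coarse triangle here is built from exactly the same observed transactions as the fine one, re-binned by coarse origin period I and coarse development lag J; the only difference between the two forecasts is the granularity supplied to the chain ladder.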
This result may seem surprising, but there is a well-known parallel in mortality estimation. This is the Kaplan-Meier (“KM”) estimator of a mortality rate, or of any other type of failure rate, on the basis of continuous-scale observations. Like the chain ladder algorithm, it is a maximum likelihood estimator.
The KM estimator proceeds as follows: order the distinct observed failure times; at each such time, estimate the conditional probability of surviving it as the proportion of the individuals then at risk who do survive; and estimate the survival function at any time as the product of these conditional probabilities over all failure times up to that point.
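In symbols (standard notation for the KM estimator, included here for concreteness): if $d_k$ failures are observed at time $t_k$, out of $n_k$ individuals then at risk, the estimated survival function is

$$ \hat{S}(t) = \prod_{k:\, t_k \le t} \left( 1 - \frac{d_k}{n_k} \right), $$

a product of very many conditional survival estimates, each over a vanishingly short period.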
This involves the same concepts as in the chain ladder case: maximum granularity, estimates over very short periods, and very many of these estimates.
Does the above lead anywhere useful in practice? I would argue that it does.
Granted, the argument assumes the “pure” chain ladder, and almost no-one uses this in practice. There may be missing portions of the triangle; outlier values requiring modification; “actuarial judgement” fiddles; and so on. How well does the above conclusion extrapolate from its stylised environment to the real world?
By a continuity argument, one might expect that as one moves away from the pure chain ladder by small degrees, the conclusion on granularity would continue to hold. After all, the statement only says that more information leads to better forecasts. Credible in any setting.
Of course, wholesale departures from the pure model may strain the continuity argument and even cause its collapse. The fact is that we cannot know unless the contending model is formulated in all its detail.
Ultimately, results such as those discussed here are indicative. They apply in a very strict environment, but hint that they possibly apply under wider circumstances. In the present case, there is no great difficulty in believing that increased information leads to improved forecasts under fairly general conditions. One would be justified in extending modelling into data of greater granularity with confidence that more efficient forecasting is likely.
The Exponential Dispersion Family (EDF) chain ladder and data granularity. Risks (2025), 13(4), 65.
The Mack chain ladder and data granularity for preserved development periods. Risks (2025), 13(7), 132.
Chain ladder under aggregation of calendar periods. Risks (2025), 13(11), 215.