The Distribution Behind the Drawdown

A drawdown is not only how much capital was lost. It is the path through which the loss became visible.

Allocators do not experience volatility as a standard deviation. They experience capital moving away from its prior high, time passing, confidence decaying, and liquidity decisions becoming harder. Maximum drawdown is one of the few risk statistics that speaks directly to that experience. It is also one of the easiest statistics to misuse.

A realized maximum drawdown is not a property of a return distribution alone. It is a functional of an ordered path. Two histories can contain exactly the same returns, the same mean, the same volatility, the same skewness, and the same terminal value, while producing materially different maximum drawdowns. The difference is sequence.

Scope of this note

Atamus Capital does not publish proprietary strategy rules, signals, feature definitions, datasets, data transformations, model architectures, candidate-generation methods, training procedures, parameters, investment universes, portfolio construction methods, execution processes, trade-level information, position-level information, or performance results. This note examines the mathematics of drawdown through closed-form calculations, deterministic numerical experiments, and controlled Monte Carlo experiments under fully specified probability laws. No Atamus Capital strategy return, backtest, live result, internal threshold, research pipeline, holding period, implementation assumption, model-development workflow, or portfolio construction method is disclosed. The models, parameters, horizons, and simulations below are public illustrations for studying path-dependent risk. They are not Atamus Capital internal settings.

Abstract

Maximum drawdown is a path functional. It depends on the running maximum of a wealth or log-wealth process and therefore depends on the order in which returns arrive, not only on their marginal distribution. We formalize drawdown as \(D_t=M_t-X_t\), where \(M_t=\sup_{s\leq t}X_s\), and study the maximum drawdown \(\overline D_T=\sup_{t\leq T}D_t\). The Brownian benchmark gives an analytical expected maximum drawdown representation in terms of the universal functions of Magdon-Ismail, Atiya, Pratap, and Abu-Mostafa. In the zero-drift case, the representation reduces to the exact closed form \(\mathbb E[\overline D_T]=\sigma\sqrt{\pi T/2}\). A 12,000-path convergence experiment validates this continuous-time formula against sampled Brownian paths. At annual volatility 16 percent and horizon three years, the continuous zero-drift expected maximum log drawdown is 0.347329 log units, or 34.7329 log percentage points. A daily-grid Monte Carlo estimate is 0.334424 log units, rising to 0.341931 log units when the sampling grid is refined to 1,008 steps per year. A separate controlled experiment under stated assumptions shows that the same set of one-year daily returns can produce maximum wealth drawdowns of 4.22 percent, 8.68 percent, or 47.28 percent solely through reordering. Further simulations show that positive serial dependence and volatility clustering reshape the entire drawdown distribution. The conclusion is simple: drawdown is not a scalar inconvenience appended to return. It is a distribution induced by sequence, dependence, and tails.

1. The object being measured

Let \(X_t\) denote cumulative log wealth or cumulative arithmetic profit measured on a common scale. Define the running maximum

\[ M_t=\sup_{0\leq s\leq t}X_s, \]

the drawdown process

\[ D_t=M_t-X_t, \]

and the maximum drawdown over horizon \([0,T]\)

\[ \overline D_T=\sup_{0\leq t\leq T}D_t =\sup_{0\leq u\leq v\leq T}\left(X_u-X_v\right). \]

The last equality is the most revealing form. Maximum drawdown is the largest peak-to-subsequent-trough loss. The trough must occur after the peak. This temporal ordering is why drawdown cannot be inferred from a histogram of returns alone.
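The definition above translates into a single pass over an ordered path. The following is a minimal sketch, assuming the path is supplied as a plain sequence of log-wealth levels; the function name and the example path are hypothetical and chosen only to exercise the definition.

```python
from typing import Sequence


def max_drawdown(path: Sequence[float]) -> float:
    """Largest peak-to-subsequent-trough decline of an ordered path X_0, ..., X_n."""
    if not path:
        raise ValueError("path must contain at least one observation")
    running_max = path[0]   # M_t: highest level seen so far
    worst = 0.0             # running value of sup_t D_t
    for x in path:
        running_max = max(running_max, x)
        worst = max(worst, running_max - x)   # D_t = M_t - X_t
    return worst


# Hypothetical log-wealth path (not market or strategy data).
example_path = [0.00, 0.05, 0.02, 0.08, -0.01, 0.03]
print(f"max log drawdown: {max_drawdown(example_path):.4f}")
```

Because the running maximum is updated before the gap is measured, the trough is always compared with a peak that occurred at or before it, which is the ordering constraint \(u\leq v\) in the double supremum.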

For a geometric wealth process \(V_t\), it is often cleaner to work with

\[ X_t=\log V_t. \]

A log drawdown of size \(d\) corresponds to a wealth drawdown

\[ 1-e^{-d}. \]

For small \(d\), the two are nearly the same. For large losses, the distinction matters. A 50 percent log drawdown is not a 50 percent wealth loss. It corresponds to

\[ 1-e^{-0.50}=39.35\%. \]

Throughout this note, Brownian formulas are stated in log units. When useful, we also report the corresponding wealth drawdown equivalent.
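The conversion between log units and wealth units can be sketched as two small helpers. The function names are hypothetical. The sketch uses `math.expm1` and `math.log1p` only for numerical accuracy at small drawdowns.

```python
import math


def log_to_wealth_drawdown(d: float) -> float:
    """Wealth drawdown fraction 1 - exp(-d) implied by a log drawdown d >= 0."""
    return -math.expm1(-d)


def wealth_to_log_drawdown(w: float) -> float:
    """Log drawdown -log(1 - w) implied by a wealth drawdown fraction 0 <= w < 1."""
    return -math.log1p(-w)


for d in (0.01, 0.10, 0.50, 1.00):
    print(f"log drawdown {d:.2f} -> wealth drawdown {log_to_wealth_drawdown(d):.4%}")
```

The gap between the two scales is negligible near zero and widens as losses grow, which is why the tables in this note report both.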

2. Same returns, different drawdowns

Consider a discrete sequence of log returns \(r_1,\ldots,r_n\). Let \(X_0=0\) and

\[ X_k=\sum_{i=1}^{k}r_i, \qquad \overline D_n=\max_{0\leq i\leq j\leq n}(X_i-X_j). \]

Let \(\pi\) be a permutation of \(\{1,\ldots,n\}\). In general,

\[ \overline D_n(r_1,\ldots,r_n) \neq \overline D_n(r_{\pi(1)},\ldots,r_{\pi(n)}). \]

The marginal distribution has not changed. The sample mean has not changed. The sample variance has not changed. Terminal log return has not changed. The path has changed.

In the model-based illustration used for Figure 1, a single set of 252 daily log returns is generated from standardized \(t_5\) shocks and shifted so that the terminal log return is exactly 6.00 percent. The multiset of returns is then reordered three ways. The terminal result is identical in all cases. The maximum drawdown is not.

Ordering	Terminal log return	Max log drawdown	Wealth drawdown equivalent
Alternating gains and losses	6.00%	4.3117%	4.2201%
Random order	6.00%	9.0804%	8.6804%
Clustered losses after gains	6.00%	64.0143%	47.2783%

The example is intentionally stylized. Its purpose is not to imitate a production return stream. Its purpose is to isolate a mathematical fact: path risk is not a function of marginal returns alone.
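The reordering effect is easy to reproduce in outline. The sketch below is not the experiment behind the table: it assumes Gaussian shocks at a 16 percent annualized volatility rather than standardized \(t_5\) shocks, and the three ordering rules are one plausible reading of the labels, so its numbers will differ from those reported. All function names and the seed are hypothetical.

```python
import math
import random
from typing import Dict, List, Sequence


def max_drawdown_from_returns(returns: Sequence[float]) -> float:
    """Maximum log drawdown of the cumulative-sum path started at X_0 = 0."""
    level, peak, worst = 0.0, 0.0, 0.0
    for r in returns:
        level += r
        peak = max(peak, level)
        worst = max(worst, peak - level)
    return worst


def alternating(returns: Sequence[float]) -> List[float]:
    """Interleave gains and losses, pairing the largest gain with the largest loss."""
    gains = sorted((r for r in returns if r >= 0), reverse=True)
    losses = sorted(r for r in returns if r < 0)
    out: List[float] = []
    for i in range(max(len(gains), len(losses))):
        if i < len(gains):
            out.append(gains[i])
        if i < len(losses):
            out.append(losses[i])
    return out


def clustered_losses_after_gains(returns: Sequence[float]) -> List[float]:
    """Place every gain first and every loss afterwards (descending sort)."""
    return sorted(returns, reverse=True)


rng = random.Random(7)  # arbitrary seed for reproducibility
raw = [rng.gauss(0.0, 0.16 / math.sqrt(252)) for _ in range(252)]
shift = (0.06 - sum(raw)) / len(raw)   # force terminal log return to 6 percent
returns = [r + shift for r in raw]

shuffled = list(returns)
rng.shuffle(shuffled)

orderings: Dict[str, List[float]] = {
    "alternating": alternating(returns),
    "random": shuffled,
    "clustered losses after gains": clustered_losses_after_gains(returns),
}
for name, seq in orderings.items():
    print(f"{name:30s} terminal={sum(seq):.4f}  max_dd={max_drawdown_from_returns(seq):.4f}")
```

Placing all gains before all losses attains the largest drawdown any permutation can produce, because the drawdown then equals the absolute sum of the negative returns.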

Figure 1. Same returns, different drawdowns. The same multiset of returns can produce materially different maximum drawdowns solely through sequence. Terminal log return is fixed at 6.00 percent in every ordering.

Proposition 1. Positive homogeneity of maximum drawdown

For any \(c>0\),

\[ \overline D_T(cX)=c\overline D_T(X). \]

Proof. The running maximum of the scaled path is

\[ M_t^{cX}=\sup_{s\leq t}cX_s=c\sup_{s\leq t}X_s=cM_t. \]

Therefore

\[ D_t^{cX}=M_t^{cX}-cX_t=c(M_t-X_t)=cD_t. \]

Taking the supremum over \(t\in[0,T]\) gives the result.

Positive homogeneity is useful because it tells us how drawdown scales with exposure in a fixed path model. It does not make drawdown a linear function of a portfolio process under rebalancing, dependence, or constraints. It also does not remove path dependence.
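Proposition 1 can be checked numerically on any fixed path. The sketch below assumes a hypothetical Gaussian random-walk path and an arbitrary scale factor; it illustrates the identity and is not a proof.

```python
import random
from typing import Sequence


def max_drawdown(path: Sequence[float]) -> float:
    """Maximum drawdown of an ordered path of levels."""
    peak, worst = path[0], 0.0
    for x in path:
        peak = max(peak, x)
        worst = max(worst, peak - x)
    return worst


rng = random.Random(0)  # arbitrary seed
path = [0.0]
for _ in range(500):
    path.append(path[-1] + rng.gauss(0.0, 0.01))   # hypothetical daily log returns

c = 2.5                                 # any positive exposure multiplier
scaled_path = [c * x for x in path]
lhs = max_drawdown(scaled_path)         # drawdown of the scaled path
rhs = c * max_drawdown(path)            # scaled drawdown of the original path
print(f"D(cX)={lhs:.6f}  c*D(X)={rhs:.6f}")
```

The two quantities agree up to floating-point rounding. The check applies to a fixed path scaled by a constant; it says nothing about exposure that varies through time.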

3. The Brownian benchmark

A tractable benchmark begins with drifted Brownian motion:

\[ X_t=\mu t+\sigma W_t, \qquad 0\leq t\leq T, \]

where \(W_t\) is standard Brownian motion, \(\mu\) is the drift per unit time, and \(\sigma>0\) is the diffusion coefficient. For a geometric Brownian motion

\[ \frac{dV_t}{V_t}=\widehat\mu\,dt+\widehat\sigma\,dW_t, \]

Ito's lemma gives

\[ X_t=\log V_t = \left(\widehat\mu-\frac{1}{2}\widehat\sigma^2\right)t +\widehat\sigma W_t. \]

Thus the Brownian formula can be applied to log wealth with

\[ \mu=\widehat\mu-\frac{1}{2}\widehat\sigma^2, \qquad \sigma=\widehat\sigma. \]
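The drift adjustment determines which regime applies, because the sign that matters is that of the log drift, not the arithmetic drift. A minimal sketch follows; the function name and the input values are hypothetical illustrations.

```python
def log_drift(mu_hat: float, sigma_hat: float) -> float:
    """Drift of log wealth for a geometric Brownian motion with arithmetic drift mu_hat."""
    return mu_hat - 0.5 * sigma_hat ** 2


# Hypothetical inputs: the same 16 percent volatility with two arithmetic drifts.
for mu_hat in (0.08, 0.01):
    mu = log_drift(mu_hat, 0.16)
    regime = "positive" if mu > 0 else "negative" if mu < 0 else "zero"
    print(f"mu_hat={mu_hat:.2%}  log drift={mu:.4%}  regime={regime}")
```

In the second case the arithmetic drift is positive while the log drift is negative, so the log-wealth path falls in the negative-drift regime of the Brownian formula.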

Magdon-Ismail, Atiya, Pratap, and Abu-Mostafa derive the distribution and expected value of the maximum drawdown of a Brownian motion with drift. With

\[ x=\frac{\mu^2T}{2\sigma^2}, \]

the expected maximum drawdown has the analytical representation

\[ \mathbb E[\overline D_T] = \begin{cases} \dfrac{2\sigma^2}{\mu}Q_p(x), & \mu>0,\\[8pt] \sigma\sqrt{\dfrac{\pi T}{2}}, & \mu=0,\\[8pt] -\dfrac{2\sigma^2}{\mu}Q_n(x), & \mu<0. \end{cases} \]

The functions \(Q_p\) and \(Q_n\) are universal functions obtained from eigenvalue expansions of a reflected Brownian first-passage problem. They do not depend separately on \(\mu\), \(\sigma\), or \(T\). They depend only on \(x=\mu^2T/(2\sigma^2)\). Figure 2 uses published tabulated values of these universal functions with interpolation in \(\log x\) to draw the nonzero-drift Brownian expectation curves. The interpolation is used only for visualization between published tabulated points and is not a fitted market result. The zero-drift curve and the validation in Figure 3 use the exact closed form directly.

The phase transition is the important structural result. As \(T\to\infty\), expected maximum drawdown grows logarithmically under positive drift, as a square root under zero drift, and linearly under negative drift:

\[ \mathbb E[\overline D_T] \sim \begin{cases} \dfrac{\sigma^2}{\mu}\left(\frac12\log T+\log\frac{\mu}{\sigma}+0.63519\right), & \mu>0,\\[10pt] \sigma\sqrt{\dfrac{\pi T}{2}}, & \mu=0,\\[10pt] -\mu T-\dfrac{\sigma^2}{\mu}, & \mu<0. \end{cases} \]

A profitable drift does not eliminate drawdowns. It changes their scaling law. This distinction matters. A strategy can have positive expected log growth and still have a drawdown distribution that is too severe for a mandate, liquidity profile, investor base, or governance structure.
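The three scaling laws can be evaluated directly from the asymptotic expressions above. This is a sketch of the large-\(T\) approximations only, not an implementation of \(Q_p\) or \(Q_n\); the function names and the drift values of plus and minus 6 percent are hypothetical. For nonzero drift the expressions are reliable only when \(x=\mu^2T/(2\sigma^2)\) is large.

```python
import math


def expected_mdd_zero_drift(sigma: float, T: float) -> float:
    """Exact expected maximum log drawdown of driftless Brownian motion."""
    return sigma * math.sqrt(math.pi * T / 2.0)


def expected_mdd_asymptotic(mu: float, sigma: float, T: float) -> float:
    """Large-T asymptotic expected maximum drawdown by drift regime."""
    if sigma <= 0.0 or T <= 0.0:
        raise ValueError("sigma and T must be positive")
    if mu == 0.0:
        return expected_mdd_zero_drift(sigma, T)      # square-root growth (exact)
    if mu > 0.0:
        # logarithmic growth in T
        return (sigma ** 2 / mu) * (0.5 * math.log(T) + math.log(mu / sigma) + 0.63519)
    return -mu * T - sigma ** 2 / mu                   # linear growth in T


sigma = 0.16
for T in (100.0, 1000.0):   # long horizons, where the asymptotics are meaningful
    row = [expected_mdd_asymptotic(mu, sigma, T) for mu in (0.06, 0.0, -0.06)]
    print(f"T={T:7.0f}  positive={row[0]:.4f}  zero={row[1]:.4f}  negative={row[2]:.4f}")
```

At short horizons the positive-drift expression can be badly off, which is why the finite-horizon curves rely on the tabulated universal functions rather than on these limits.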

Figure 2. Brownian expected drawdown across drift regimes (positive, zero, and negative drift). The zero-drift curve is exact. The nonzero-drift curves use the published \(Q_p\) and \(Q_n\) universal-function table with interpolation in \(\log x\) for display. The table reports expected maximum drawdown in log units at selected horizons.

Horizon	Positive drift	Zero drift	Negative drift
0.25 years	0.0944	0.1003	0.1066
2.64 years	0.2699	0.3259	0.4023
5.09 years	0.3494	0.4526	0.6058
7.55 years	0.4047	0.5509	0.7960
10.00 years	0.4497	0.6341	0.9903

4. Expected drawdown is not a limit

The Brownian expectation is a mean over paths. It is not an upper bound.

For the zero-drift case with \(\sigma=16\%\) annually and \(T=3\) years, the formula gives

\[ \mathbb E[\overline D_T] =0.16\sqrt{\frac{3\pi}{2}} =0.3473286. \]

As a wealth drawdown equivalent,

\[ 1-e^{-0.3473286}=29.3427\%. \]

A single observed three-year drawdown below this number does not prove the path was safe. A single observed three-year drawdown above this number does not prove the process was defective. The statistic is random. The right object is the distribution.

The sampled-grid experiment in Figure 3 validates the continuous formula by simulating Brownian paths at increasingly fine grids. The sampled-grid maximum drawdown is a lower bound for the continuous-time maximum because it does not observe extrema between grid points. As the grid becomes finer, the estimate moves toward the continuous formula.

Steps per year	Paths	Mean sampled max log drawdown	Ratio to formula
63	12,000	0.324420	93.4043%
126	12,000	0.330436	95.1363%
252	12,000	0.334424	96.2845%
504	12,000	0.339251	97.6743%
1008	12,000	0.341931	98.4459%

The convergence is not cosmetic. It is a warning about measurement. A drawdown measured from monthly observations is not the same object as a drawdown measured from daily observations, and neither is exactly the continuous path functional.
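The sampling effect can be illustrated with a small pure-Python simulation. This sketch differs from the experiment reported above: it uses far fewer paths, and it obtains the coarse grid by subsampling each finely sampled path, so the coarse drawdown is no larger than the fine one path by path. The function names, path count, and seed are hypothetical choices.

```python
import math
import random
from statistics import fmean
from typing import Sequence, Tuple


def max_drawdown(path: Sequence[float]) -> float:
    """Maximum drawdown of an ordered path of levels."""
    peak, worst = path[0], 0.0
    for x in path:
        peak = max(peak, x)
        worst = max(worst, peak - x)
    return worst


def simulate_grid_effect(n_paths: int = 800, sigma: float = 0.16, T: float = 3.0,
                         fine_steps_per_year: int = 252, stride: int = 4,
                         seed: int = 1) -> Tuple[float, float]:
    """Mean sampled max drawdown on a coarse subgrid and on the fine grid."""
    rng = random.Random(seed)
    n_steps = int(round(T * fine_steps_per_year))
    step_sd = sigma * math.sqrt(1.0 / fine_steps_per_year)
    coarse, fine = [], []
    for _ in range(n_paths):
        x, path = 0.0, [0.0]
        for _ in range(n_steps):
            x += rng.gauss(0.0, step_sd)      # zero-drift Brownian increment
            path.append(x)
        fine.append(max_drawdown(path))
        coarse.append(max_drawdown(path[::stride]))   # observe every 4th point only
    return fmean(coarse), fmean(fine)


formula = 0.16 * math.sqrt(math.pi * 3.0 / 2.0)   # continuous-time expectation
coarse_mean, fine_mean = simulate_grid_effect()
print(f"63 steps/yr : {coarse_mean:.4f}  ratio {coarse_mean / formula:.2%}")
print(f"252 steps/yr: {fine_mean:.4f}  ratio {fine_mean / formula:.2%}")
```

With this design the ordering between the two grids holds on every path by construction. Only the comparison with the continuous-time formula is subject to Monte Carlo error.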

Figure 3. Sampled paths converge upward: sampled-grid approximation versus continuous-time formula. Sampled-grid maximum drawdown is a lower-bound approximation to the continuous path functional. The estimate rises as the observation grid is refined.

5. The distribution behind one drawdown number

Now consider log wealth following the drifted Brownian benchmark of Section 3, with annual drift \(\mu=6\%\), annual volatility \(\sigma=16\%\), and horizon \(T=3\) years, sampled daily. This is not a forecast and not an Atamus strategy assumption. It is a controlled public benchmark.

The simulation uses 15,000 paths. The maximum log drawdown distribution is summarized below.

Statistic	Log drawdown	Wealth drawdown equivalent
Mean	0.272145	23.8256%
Median	0.248286	21.9864%
90th percentile	0.424922	34.6179%
95th percentile	0.490361	38.7595%
99th percentile	0.630300	46.7568%
CED95	0.577450	43.8672%

Here \(\text{CED95}\) denotes the conditional expected log drawdown at or above the 95th percentile. The wealth column reports the wealth-drawdown equivalent of that log quantity, \(1-e^{-\text{CED95}}\). Formally,

\[ \text{CED95} = \mathbb E\left[\overline D_T\mid \overline D_T\geq q_{0.95}(\overline D_T)\right]. \]

This is analogous to expected shortfall, but applied to the distribution of maximum drawdowns rather than one-period losses. It emphasizes that the severe part of the drawdown distribution is not described by the historical maximum alone.
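Given a sample of simulated maximum drawdowns, the quantile and the conditional expected drawdown can be estimated as below. This is a sketch under a stated assumption: the quantile is taken as the order statistic at rank \(\lceil pn\rceil\), one of several common conventions, and the source does not say which convention it uses. The function names and the toy sample are hypothetical.

```python
import math
from statistics import fmean
from typing import Sequence, Tuple


def empirical_quantile(sample: Sequence[float], p: float) -> float:
    """Order-statistic quantile at rank ceil(p * n) (assumed convention)."""
    if not sample or not 0.0 < p < 1.0:
        raise ValueError("need a non-empty sample and 0 < p < 1")
    ordered = sorted(sample)
    k = max(0, math.ceil(p * len(ordered)) - 1)
    return ordered[k]


def conditional_expected_drawdown(sample: Sequence[float], p: float = 0.95) -> Tuple[float, float]:
    """Return (quantile, mean of the drawdowns at or above that quantile)."""
    q = empirical_quantile(sample, p)
    tail = [d for d in sample if d >= q]
    return q, fmean(tail)


# Toy sample of maximum log drawdowns (hypothetical values, not simulation output).
toy = [0.10, 0.12, 0.15, 0.18, 0.20, 0.22, 0.25, 0.30, 0.40, 0.55]
q, ced = conditional_expected_drawdown(toy, p=0.80)
print(f"q80={q:.4f}  CED80={ced:.4f}  wealth equivalent of CED={1 - math.exp(-ced):.4%}")
```

The wealth equivalent is applied to the log-scale average afterwards, matching the way the table above converts its log column.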

Figure 4. One drawdown number is drawn from a distribution: maximum drawdown quantiles and conditional expected drawdown. A realized maximum drawdown is one pathwise observation. The risk object is the distribution induced by the assumed path process.

6. Serial dependence changes the distribution

Suppose daily log returns have the same unconditional mean and variance, but different serial dependence. Let

\[ r_t-\bar\mu=\phi(r_{t-1}-\bar\mu)+\varepsilon_t, \qquad \varepsilon_t\sim\mathcal N\left(0,(1-\phi^2)\sigma_d^2\right), \]

where \(\sigma_d=0.16/\sqrt{252}\), \(\bar\mu=0.06/252\), and \(|\phi|<1\). The stationary variance is \((1-\phi^2)\sigma_d^2/(1-\phi^2)=\sigma_d^2\), so every value of \(\phi\) has the same unconditional daily variance. The path distribution still changes.

The model-based three-year results are:

φ	Median wealth drawdown	95th percentile	CED95
-0.25	16.9382%	30.3196%	34.7312%
0.00	21.9336%	38.2878%	43.2290%
0.25	27.8970%	48.0778%	53.8867%
0.50	35.7420%	59.3586%	65.4281%

The marginal daily volatility is unchanged. The annual drift is unchanged. The horizon is unchanged. Positive serial dependence clusters losses and gains. Clustering changes peak-to-trough behavior.

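This is one reason drawdown deserves separate treatment from volatility. One-day volatility can be preserved while drawdown risk changes materially. Under this AR(1) law the variance of long multi-day sums scales by approximately \((1+\phi)/(1-\phi)\) relative to the IID case, so horizon volatility moves with \(\phi\) even though daily volatility does not.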
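The AR(1) design above can be sketched in pure Python. The path count is much smaller than a production experiment would use, the initial deviation is drawn from the stationary law (an assumption, since the source does not state the initialization), and the function name and seed are hypothetical. The output will not match the table exactly.

```python
import math
import random
from statistics import median
from typing import Dict, List


def simulate_ar1_max_drawdowns(phi: float, n_paths: int = 300, years: int = 3,
                               annual_vol: float = 0.16, annual_drift: float = 0.06,
                               seed: int = 0) -> List[float]:
    """Max log drawdowns of daily AR(1) return paths with matched daily variance."""
    if abs(phi) >= 1.0:
        raise ValueError("stationarity requires |phi| < 1")
    rng = random.Random(seed)
    sigma_d = annual_vol / math.sqrt(252)
    mu_d = annual_drift / 252
    innov_sd = math.sqrt(1.0 - phi ** 2) * sigma_d   # keeps Var(r_t) = sigma_d^2
    out: List[float] = []
    for _ in range(n_paths):
        dev = rng.gauss(0.0, sigma_d)                # assumed stationary start
        level, peak, worst = 0.0, 0.0, 0.0
        for _ in range(252 * years):
            dev = phi * dev + rng.gauss(0.0, innov_sd)
            level += mu_d + dev
            peak = max(peak, level)
            worst = max(worst, peak - level)
        out.append(worst)
    return out


median_log_dd: Dict[float, float] = {}
for phi in (-0.25, 0.0, 0.25, 0.50):
    median_log_dd[phi] = median(simulate_ar1_max_drawdowns(phi))
    wealth = 1.0 - math.exp(-median_log_dd[phi])
    print(f"phi={phi:+.2f}  median log dd={median_log_dd[phi]:.4f}  wealth equivalent={wealth:.2%}")
```

Scaling the innovation variance by \(1-\phi^2\) is what holds the unconditional daily variance fixed, so any change in the printed medians comes from dependence rather than from daily volatility.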

Figure 5. Serial dependence reshapes path risk: same unconditional variance, different path distribution. All processes use the same unconditional daily variance and annual drift. Positive serial dependence increases drawdown severity by clustering losses.

7. Tails and volatility clustering

Marginal tail behavior and conditional volatility also reshape the drawdown distribution. We compare four fully specified daily benchmark processes with the same annual drift and the same unconditional one-day variance, scaled from a 16 percent IID annualized volatility convention:

\[ \begin{aligned} &\text{IID Gaussian},\\ &\text{IID standardized Student }t_5,\\ &\text{Gaussian AR(1) with }\phi=0.30,\\ &\text{GARCH-}t_5\text{ with }\alpha=0.10,\ \beta=0.88. \end{aligned} \]

For the GARCH process,

\[ r_t=\bar\mu+\varepsilon_t, \qquad \varepsilon_t=\sigma_t z_t, \]

with standardized \(t_5\) innovations and

\[ \sigma_t^2=\omega+\alpha\varepsilon_{t-1}^2+\beta\sigma_{t-1}^2, \qquad \omega=(1-\alpha-\beta)\sigma_d^2. \]

The resulting maximum drawdown summaries, expressed as wealth drawdown equivalents, are:

Process	95th percentile	99th percentile	CED95
IID Gaussian	38.7595%	46.7568%	43.8672%
IID standardized t₅	38.4294%	46.7815%	43.5565%
Gaussian AR(1), φ = 0.30	50.3539%	59.5925%	56.2646%
GARCH-t₅	40.4838%	53.5653%	49.5620%

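The comparison should not be overinterpreted. These are not fitted market models. They are controlled counterfactuals. Their value is that they isolate mechanisms. In this design, fat-tailed IID shocks alone leave all three statistics close to the Gaussian case. Positive serial dependence produces the largest values at every reported statistic. Volatility clustering with fat-tailed shocks moves the 95th percentile only modestly relative to IID Gaussian (40.48 versus 38.76 percent) but has a much stronger effect deeper in the tail, at the 99th percentile (53.57 versus 46.76 percent) and in \(\text{CED95}\) (49.56 versus 43.87 percent).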
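The GARCH-\(t_5\) recursion specified above can be sketched with the standard library alone. Two choices here are assumptions not stated in the source: the conditional variance is initialized at its unconditional level, and the standardized \(t_5\) draw is built from a normal and a chi-square variate. Function names and the seed are hypothetical.

```python
import math
import random
from typing import List


def standardized_t5(rng: random.Random) -> float:
    """Student t with 5 degrees of freedom, rescaled to unit variance."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(2.5, 2.0)       # chi-square(5) = Gamma(shape 5/2, scale 2)
    t = z / math.sqrt(chi2 / 5.0)
    return t * math.sqrt(3.0 / 5.0)         # Var(t_5) = 5/3


def simulate_garch_t5_returns(n_days: int, rng: random.Random, alpha: float = 0.10,
                              beta: float = 0.88, annual_vol: float = 0.16,
                              annual_drift: float = 0.06) -> List[float]:
    """Daily log returns from GARCH(1,1) with standardized t_5 innovations."""
    sigma_d2 = annual_vol ** 2 / 252
    mu_d = annual_drift / 252
    omega = (1.0 - alpha - beta) * sigma_d2  # pins the unconditional variance at sigma_d^2
    var_t = sigma_d2                         # assumed start at the unconditional level
    returns: List[float] = []
    for _ in range(n_days):
        eps = math.sqrt(var_t) * standardized_t5(rng)
        returns.append(mu_d + eps)
        var_t = omega + alpha * eps ** 2 + beta * var_t
    return returns


rng = random.Random(11)  # arbitrary seed
path_returns = simulate_garch_t5_returns(756, rng)   # three years of daily returns
level, peak, worst = 0.0, 0.0, 0.0
for r in path_returns:
    level += r
    peak = max(peak, level)
    worst = max(worst, peak - level)
print(f"one simulated path: max log drawdown = {worst:.4f}")
```

A single path is only one draw. Reproducing tail statistics such as the 99th percentile requires many paths, and heavy-tailed innovations make those estimates converge slowly.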

Figure 6. Tails and volatility clustering alter the severe tail. The comparison isolates mechanisms under benchmark processes with the same annual drift and the same unconditional one-day variance, scaled from a 16 percent IID annualized volatility convention. No market or strategy data is used.

8. What a drawdown statistic can and cannot say

A historical maximum drawdown answers one question:

What was the largest observed peak-to-trough decline on this sampled path?

It does not answer, by itself:

What is the expected maximum drawdown under the assumed process?
What is the 95th percentile of maximum drawdown?
How much of the observed drawdown came from sequence rather than marginal volatility?
How sensitive is drawdown to sampling frequency?
How does serial dependence alter path risk?
What happens under volatility clustering or fat-tailed shocks?

The distinction is important for systematic research. A favorable historical maximum drawdown can be produced by a benign ordering of losses. An unfavorable historical maximum drawdown can be produced by an unusually concentrated ordering. Neither observation should be ignored. Neither should be treated as a complete law.

Atamus views drawdown as a distributional object. The realized number is only one pathwise observation. The risk question is broader:

\[ \mathcal L(\overline D_T\mid \mathcal M), \]

where \(\mathcal M\) denotes the modeling assumptions, sampling frequency, dependence structure, tail model, implementation constraints, and conditioning information used to generate or evaluate paths.

A drawdown policy that considers only the realized historical maximum is implicitly treating one path as the distribution. That is rarely a defensible assumption.

9. Conclusion

Maximum drawdown is allocator-relevant because it measures the shape of pain. It is mathematically difficult for the same reason. It is not a moment of one-period returns. It is a supremum over an ordered path.

The Brownian benchmark provides a useful anchor. It shows that expected maximum drawdown has distinct scaling regimes: logarithmic under positive drift, square-root under zero drift, and linear under negative drift. The zero-drift formula

\[ \mathbb E[\overline D_T]=\sigma\sqrt{\frac{\pi T}{2}} \]

is simple, exact, and useful. The general drifted case is more complex but still analytically structured through universal functions.

The simulations show why theory alone is not enough. Sampling frequency matters. Order matters. Serial dependence matters. Tail behavior matters. Volatility clustering matters. A drawdown is not just a number at the bottom of a performance table. It is the visible trace of a path distribution.

The institutional standard is therefore not to ask whether a drawdown was acceptable after it occurred. The stronger question is what distribution of drawdowns was admissible before the path was known.

References

1. Magdon-Ismail, M., Atiya, A. F., Pratap, A., and Abu-Mostafa, Y. S. On the Maximum Drawdown of a Brownian Motion. Journal of Applied Probability, 41(1), 147-161, 2004.

2. Magdon-Ismail, M. and Atiya, A. F. An Analysis of the Maximum Drawdown Risk Measure. Risk, 2004.

3. Goldberg, L. R. and Mahmoud, O. Drawdown: From Practice to Theory and Back Again. Mathematics and Financial Economics, 2017.

4. Chekhlov, A., Uryasev, S., and Zabarankin, M. Drawdown Measure in Portfolio Optimization. International Journal of Theoretical and Applied Finance, 2005.

5. Bollerslev, T. Generalized Autoregressive Conditional Heteroskedasticity. Journal of Econometrics, 1986.

Disclaimer

Research notes published by Atamus Capital are provided for general informational and research purposes only. They do not constitute investment advice, trading advice, a recommendation, an offer to sell, or a solicitation to buy any security, fund interest, account, or investment product.
