| Title: | Critical Line Algorithm in Pure R |
|---|---|
| Description: | Implements 'Markowitz' Critical Line Algorithm ('CLA') for classical mean-variance portfolio optimization, see Markowitz (1952) <doi:10.2307/2975974>. Care has been taken for correctness in light of previous buggy implementations. |
| Authors: | Yanhao Shi [aut], Martin Maechler [aut, cre]
|
| Maintainer: | Martin Maechler <[email protected]> |
| License: | GPL (>= 3) | file LICENSE |
| Version: | 0.96-3 |
| Built: | 2026-06-05 09:57:23 UTC |
| Source: | https://github.com/cran/CLA |
The Critical Line Algorithm was first proposed by Markowitz(1987) to solve the mean-variance optimal portfolio problem.
We solve the problem with “box” constraints, i.e., allow to
specify lower and upper bounds (via lB and uB) for each
asset weight.
Here we provide a pure R implementation, quite fine tuned and debugged compared to earlier ones.
CLA(mu, covar, lB, uB, check.cov = TRUE, check.f = TRUE, tol.lambda = 1e-07, give.MS = TRUE, keep.names = TRUE, trace = 0)CLA(mu, covar, lB, uB, check.cov = TRUE, check.f = TRUE, tol.lambda = 1e-07, give.MS = TRUE, keep.names = TRUE, trace = 0)
mu |
numeric vector of length |
covar |
the |
lB, uB
|
vectors of length |
check.cov |
|
check.f |
|
tol.lambda |
the tolerance when checking for lambda changes or being zero. |
give.MS |
|
keep.names |
|
trace |
an integer (or |
The current implementation of the CLA is based (via Norring's)
on Bailey et al.(2013). We have found buglets in that implementation
which lead them to introduce their “purge” routines
(purgeNumErr, purgeExcess),
which are no longer necessary.
Even though this is a pure R implementation, the algorithm is quite fast
also when the number of assets is large (1000s), though that
depends quite a bit on the exact problem.
an object of class "CLA" which is a
list with components
weights_set |
a |
free_indices |
a |
gammas |
numeric vector of length |
lambdas |
numeric vector of length |
MS_weights |
the |
The exact results of the algorithm, e.g., the assets with non-zero weights, may slightly depend on the (computer) platform, e.g., for the S&P 500 example, differences between 64-bit or 32-bit, version of BLAS or Lapack libraries etc, do have an influence, see the R script ‘tests/SP500-ex.R’ in the package sources.
Alexander Norring did the very first version (unpublished master thesis). Current implementation: Yanhao Shi and Martin Maechler
Markowitz, H. (1952) Portfolio selection, The Journal of Finance 7, 77–91; doi:10.2307/2975974.
Markowitz, H. M. (1987, 1st ed.) and Markowitz, H. M. and Todd, P. G. (2000) Mean-Variance Analysis in Portfolio Choice and Capital Markets; chapters 7 and 13.
Niedermayer, A. and Niedermayer, D. (2010) Applying Markowitz’s Critical Line Algorithm, in J. B. Guerard (ed.), Handbook of Portfolio Construction, Springer; chapter 12, 383–400; doi:10.1007/978-0-387-77439-8_12.
Bailey, D. H. and López de Prado, M. (2013) An open-source implementation of the critical-line algorithm for portfolio optimization, Algorithms 6(1), 169–196; doi:10.3390/a6010169,
Yanhao Shi (2017) Implementation and applications of critical line algorithm for portfolio optimization; unpublished Master's thesis, ETH Zurich.
MS;
for plotting CLA results: plot.CLA.
data(muS.sp500) ## Full data taking too much time for example set.seed(47) iS <- sample.int(length(muS.sp500$mu), 24) CLsp.24 <- CLA(muS.sp500$mu[iS], muS.sp500$covar[iS, iS], lB=0, uB=1/10) CLsp.24 # using the print() method for class "CLA" plot(CLsp.24) if(require(Matrix)) { ## visualize how weights change "along turning points" show(image(Matrix(CLsp.24$weights_set, sparse=TRUE), main = "CLA(muS.sp500 <random_sample(size=24)>) $ weights_set", xlab = "turning point", ylab = "asset number")) } ## A 3x3 example (from real data) where CLA()'s original version failed ## and 'check.f = TRUE' produces a warning : mc3 <- list( mu = c(0.0408, 0.102, -0.023), cv = matrix(c(0.00648, 0.00792, 0.00473, 0.00792, 0.0334, 0.0121, 0.00473, 0.0121, 0.0793), 3, 3, dimnames = list(NULL, paste0(c("TLT", "VTI","GLD"), ".Adjusted")))) rc3 <- with(mc3, CLA(mu=mu, covar=cv, lB=0, uB=1, trace=TRUE))data(muS.sp500) ## Full data taking too much time for example set.seed(47) iS <- sample.int(length(muS.sp500$mu), 24) CLsp.24 <- CLA(muS.sp500$mu[iS], muS.sp500$covar[iS, iS], lB=0, uB=1/10) CLsp.24 # using the print() method for class "CLA" plot(CLsp.24) if(require(Matrix)) { ## visualize how weights change "along turning points" show(image(Matrix(CLsp.24$weights_set, sparse=TRUE), main = "CLA(muS.sp500 <random_sample(size=24)>) $ weights_set", xlab = "turning point", ylab = "asset number")) } ## A 3x3 example (from real data) where CLA()'s original version failed ## and 'check.f = TRUE' produces a warning : mc3 <- list( mu = c(0.0408, 0.102, -0.023), cv = matrix(c(0.00648, 0.00792, 0.00473, 0.00792, 0.0334, 0.0121, 0.00473, 0.0121, 0.0793), 3, 3, dimnames = list(NULL, paste0(c("TLT", "VTI","GLD"), ".Adjusted")))) rc3 <- with(mc3, CLA(mu=mu, covar=cv, lB=0, uB=1, trace=TRUE))
Find and , given ) and
CLA result.
findMu(Sig0, result, covar, tol.unir = 1e-06, equal.tol = 1e-06)findMu(Sig0, result, covar, tol.unir = 1e-06, equal.tol = 1e-06)
Sig0 |
numeric vector of |
result |
a |
covar |
the same |
tol.unir |
numeric tolerance passed to |
equal.tol |
numeric tolerance to be used in
|
a list with components
Mu |
numeric vector of same length, say |
weight |
numeric |
Master thesis, p.33
data(muS.sp500) ## Full data taking too much time for example if(getRversion() >= "3.6") .Rk <- RNGversion("3.5.0") # for back compatibility & warning set.seed(2016) iS <- sample.int(length(muS.sp500$mu), 17) if(getRversion() >= "3.6") do.call(RNGkind, as.list(.Rk)) # revert cov17 <- muS.sp500$covar[iS, iS] CLsp.17 <- CLA(muS.sp500$mu[iS], covar=cov17, lB=0, uB = 1/2) CLsp.17 # 16 turning points summary(tpS <- CLsp.17$MS_weights[,"Sig"]) str(s0 <- seq(0.0186, 0.0477, by = 0.0001)) mu.. <- findMu(s0, result=CLsp.17, covar=cov17) str(mu..) stopifnot(dim(mu..$weight) == c(17, length(s0))) plot(s0, mu..$Mu, xlab=quote(sigma), ylab = quote(mu), type = "o", cex = 1/4) points(CLsp.17$MS_weights, col = "tomato", cex = 1.5)data(muS.sp500) ## Full data taking too much time for example if(getRversion() >= "3.6") .Rk <- RNGversion("3.5.0") # for back compatibility & warning set.seed(2016) iS <- sample.int(length(muS.sp500$mu), 17) if(getRversion() >= "3.6") do.call(RNGkind, as.list(.Rk)) # revert cov17 <- muS.sp500$covar[iS, iS] CLsp.17 <- CLA(muS.sp500$mu[iS], covar=cov17, lB=0, uB = 1/2) CLsp.17 # 16 turning points summary(tpS <- CLsp.17$MS_weights[,"Sig"]) str(s0 <- seq(0.0186, 0.0477, by = 0.0001)) mu.. <- findMu(s0, result=CLsp.17, covar=cov17) str(mu..) stopifnot(dim(mu..$weight) == c(17, length(s0))) plot(s0, mu..$Mu, xlab=quote(sigma), ylab = quote(mu), type = "o", cex = 1/4) points(CLsp.17$MS_weights, col = "tomato", cex = 1.5)
Find and , given ) and
CLA result.
findSig(Mu0, result, covar, equal.tol)findSig(Mu0, result, covar, equal.tol)
Mu0 |
numeric vector of |
result |
a |
covar |
the same |
equal.tol |
numeric tolerance to be used in
|
a list with components
Sig |
numeric vector of same length, say |
weight |
numeric |
Master thesis, p.33
data(muS.sp500) ## Full data taking too much time for example: Subset of n=21: if(getRversion() >= "3.6") .Rk <- RNGversion("3.5.0") # for back compatibility & warning set.seed(2018) iS <- sample.int(length(muS.sp500$mu), 21) if(getRversion() >= "3.6") do.call(RNGkind, as.list(.Rk)) # revert cov21 <- muS.sp500$covar[iS, iS] CLsp.21 <- CLA(muS.sp500$mu[iS], covar=cov21, lB=0, uB = 1/2) CLsp.21 # 14 turning points summary(tpM <- CLsp.21$MS_weights[,"Mu"]) str(m0 <- c(min(tpM),seq(0.00205, 0.00525, by = 0.00005), max(tpM))) sig. <- findSig(m0, result=CLsp.21, covar=cov21) str(sig.) stopifnot(dim(sig.$weight) == c(21, length(m0))) plot(sig.$Sig, m0, xlab=quote(sigma), ylab = quote(mu), type = "o", cex = 1/4) points(CLsp.21$MS_weights, col = "tomato", cex = 1.5) title("Efficient Frontier from CLA()") mtext("findSig() to interpolate between turning points", side=3)data(muS.sp500) ## Full data taking too much time for example: Subset of n=21: if(getRversion() >= "3.6") .Rk <- RNGversion("3.5.0") # for back compatibility & warning set.seed(2018) iS <- sample.int(length(muS.sp500$mu), 21) if(getRversion() >= "3.6") do.call(RNGkind, as.list(.Rk)) # revert cov21 <- muS.sp500$covar[iS, iS] CLsp.21 <- CLA(muS.sp500$mu[iS], covar=cov21, lB=0, uB = 1/2) CLsp.21 # 14 turning points summary(tpM <- CLsp.21$MS_weights[,"Mu"]) str(m0 <- c(min(tpM),seq(0.00205, 0.00525, by = 0.00005), max(tpM))) sig. <- findSig(m0, result=CLsp.21, covar=cov21) str(sig.) stopifnot(dim(sig.$weight) == c(21, length(m0))) plot(sig.$Sig, m0, xlab=quote(sigma), ylab = quote(mu), type = "o", cex = 1/4) points(CLsp.21$MS_weights, col = "tomato", cex = 1.5) title("Efficient Frontier from CLA()") mtext("findSig() to interpolate between turning points", side=3)
Compute the vectors of means () and standard deviations
(), for all the turning points of a CLA
result.
MS(weights_set, mu, covar)MS(weights_set, mu, covar)
weights_set |
numeric matrix ( |
mu |
expected (log) returns (identical to argument of
|
covar |
covariance matrix of (log) returns (identical to
argument of |
These are trivially computable from the CLA()'s result.
To correctly interpolate this, “hyperbolic”
interpolation is needed, provided by the findSig and
findMu functions.
a list with components
Sig |
numeric vector of length |
Mu |
numeric vector of length |
Yanhao Shi
CLA.
## The function is quite simply MS ## and really an auxiliary function for CLA(). ## TODO: add small (~12 assets) example## The function is quite simply MS ## and really an auxiliary function for CLA(). ## TODO: add small (~12 assets) example
The simple example Data of Markowitz and Todd (2000); used for illustrating the CLA; reused in Bailey and López de Prado (2013).
data("muS.10ex")data("muS.10ex")
A list with two components,
Named num [1:10] 1.175 1.19 0.396 1.12 0.346 ...
names : chr [1:10] "X1" "X2" "X3" "X4" ...
num [1:10, 1:10] 0.4076 0.0318 0.0518 0.0566 0.033 ...
From ‘http://www.quantresearch.info/CLA_Data.csv.txt’ (URL no longer working, Aug.2020!) by López de Prado.
Markowitz, H. M. (1987, 1st ed.) and Markowitz, H. M. and Todd, P. G. (2000) Mean-Variance Analysis in Portfolio Choice and Capital Markets, page 335.
Bailey, D. H. and López de Prado, M. (2013) An open-source implementation of the critical-line algorithm for portfolio optimization, Algorithms 6(1), 169–196; doi:10.3390/a6010169, p. 16f.
data(muS.10ex) str(muS.10ex) CLA.10ex <- with(muS.10ex, CLA(mu, covar, lB=0, uB=1)) if(require("Matrix")) drop0(zapsmall(CLA.10ex$weights_set)) ## The results, summarized, as in Bayley and López de Prado (Table 2, p.18) : with(CLA.10ex, round(cbind(MS_weights[,2:1], lambda=lambdas, t(weights_set)), 3)) CLA.10ex.1c <- with(muS.10ex, CLA(mu, covar, lB=1/100, uB=1)) round(CLA.10ex.1c$weights_set, 3)data(muS.10ex) str(muS.10ex) CLA.10ex <- with(muS.10ex, CLA(mu, covar, lB=0, uB=1)) if(require("Matrix")) drop0(zapsmall(CLA.10ex$weights_set)) ## The results, summarized, as in Bayley and López de Prado (Table 2, p.18) : with(CLA.10ex, round(cbind(MS_weights[,2:1], lambda=lambdas, t(weights_set)), 3)) CLA.10ex.1c <- with(muS.10ex, CLA(mu, covar, lB=1/100, uB=1)) round(CLA.10ex.1c$weights_set, 3)
If are the basically the scale standardized log returns
for of 476 stocks from S&P 500, as
from SP500, then somehow
averaged over time; actually as predicted by muSigma() at the
end of the time period, and
are estimated covariances.
These are the main “inputs” needed for the CLA algorithm, see
CLA.
data("muS.sp500")data("muS.sp500")
A list with two components,
Named num [1:476] 0.00233 0.0035 0.01209 0.00322 0.00249 ...
names : chr [1:476] "A" "AA" "AAPL" "ABC" ...
num [1:476, 1:476] 0.001498 0.000531 0.000536 ...
It is as simple as this:
data(SP500, package="FRAPO")
system.time(muS.sp500 <- muSigmaGarch(SP500)) # 26 sec. (lynne, 2017)
muSigmaGarch() which was used to construct it.
data(muS.sp500) str(muS.sp500)data(muS.sp500) str(muS.sp500)
Compute (mu, Sigma) for a set of assets via a GARCH fit to each
individual asset, using package
fGarch's garchFit().
muSigmaGarch(x, formula = ~garch(1, 1), cond.dist = "std", trace = FALSE, ...)muSigmaGarch(x, formula = ~garch(1, 1), cond.dist = "std", trace = FALSE, ...)
x |
numeric matrix or data frame ( |
formula |
optional formula for |
cond.dist |
the conditional distribution to be used for the garch process. |
trace |
logical indicating if some progress of |
... |
optional arguments to |
a list with components
mu |
numeric vector of length |
covar |
covariance matrix ( |
muS.sp500 which has been produced via muSigmaGarch.
CLA which needs (mu, covar) as crucial input.
if(requireNamespace("FRAPO")) { data(NASDAQ, package = "FRAPO") ## 12 randomly picked stocks from NASDAQ data iS <- if(FALSE) { ## created (w/ warning, in new R) by RNGversion("3.5.0"); set.seed(17); iS <- sample(ncol(NASDAQ), 12) } else c(341L, 2126L, 1028L, 1704L, 895L, 1181L, 454L, 410L, 1707L, 425L, 950L, 5L) X. <- NASDAQ[, iS] muSig <- muSigmaGarch(X.) stopifnot(identical(names(muSig$mu), names(NASDAQ)[iS]), identical(dim(muSig$covar), c(12L,12L)), all.equal(unname(muSig$mu), c( 7.97, -4.05, -14, 21.5, -5.36, -15.3, -15.9, 11.8, -1.64, -14, 3.13, 121) / 10000, tol = 0.0015)) }if(requireNamespace("FRAPO")) { data(NASDAQ, package = "FRAPO") ## 12 randomly picked stocks from NASDAQ data iS <- if(FALSE) { ## created (w/ warning, in new R) by RNGversion("3.5.0"); set.seed(17); iS <- sample(ncol(NASDAQ), 12) } else c(341L, 2126L, 1028L, 1704L, 895L, 1181L, 454L, 410L, 1707L, 425L, 950L, 5L) X. <- NASDAQ[, iS] muSig <- muSigmaGarch(X.) stopifnot(identical(names(muSig$mu), names(NASDAQ)[iS]), identical(dim(muSig$covar), c(12L,12L)), all.equal(unname(muSig$mu), c( 7.97, -4.05, -14, 21.5, -5.36, -15.3, -15.9, 11.8, -1.64, -14, 3.13, 121) / 10000, tol = 0.0015)) }
A partly experimental plot() method for
CLA() objects.
It draws the efficient frontier in the aka (mean,
std.dev.) plane.
Currently, this is quite rudimentary.
Future improvements would allow
- to add the/some single asset points,
- to correctly (‘hyperbolically’) interpolate between turning points
- add text about the number of (unique) critical points
- add option add = FALSE which when TRUE would use
lines instead plot.
## S3 method for class 'CLA' plot(x, type = "o", main = "Efficient Frontier", xlab = expression(sigma(w)), ylab = expression(mu(w)), col = adjustcolor("blue", alpha.f = 0.5), pch = 16, ...)## S3 method for class 'CLA' plot(x, type = "o", main = "Efficient Frontier", xlab = expression(sigma(w)), ylab = expression(mu(w)), col = adjustcolor("blue", alpha.f = 0.5), pch = 16, ...)
x |
|
type |
the |
main |
main |
xlab, ylab
|
x- and y- axis labels, passed to |
col, pch
|
color and point type, passed to
|
... |
potentially further arguments passed to
|
Martin Maechler.
## TODO %% Add A. Norring's small 12-asset example see --> ../TODO ## ---- one example is in help(CLA)## TODO %% Add A. Norring's small 12-asset example see --> ../TODO ## ---- one example is in help(CLA)