Title: | Utilities from 'Seminar fuer Statistik' ETH Zurich |
---|---|
Description: | Useful utilities ['goodies'] from Seminar fuer Statistik ETH Zurich, some of which were ported from S-plus in the 1990s. For graphics, have pretty (Log-scale) axes eaxis(), an enhanced Tukey-Anscombe plot, combining histogram and boxplot, 2d-residual plots, a 'tachoPlot()', pretty arrows, etc. For robustness, have a robust F test and robust range(). For system support, notably on Linux, provides 'Sys.*()' functions with more access to system and CPU information. Finally, miscellaneous utilities such as simple efficient prime numbers, integer codes, Duplicated(), toLatex.numeric() and is.whole(). |
Authors: | Martin Maechler [aut, cre] , Werner Stahel [ctb] (Functions: compresid2way(), f.robftest(), last(), p.scales(), p.dnorm()), Andreas Ruckstuhl [ctb] (Functions: p.arrows(), p.profileTraces(), p.res.2x()), Christian Keller [ctb] (Functions: histBxp(), p.tachoPlot()), Kjetil Halvorsen [ctb] (Functions: KSd(), ecdf.ksCI()), Alain Hauser [ctb] (Functions: cairoSwd(), is.whole(), toLatex.numeric()*), Christoph Buser [ctb] (to function Duplicated()), Lorenz Gygax [ctb] (to function p.res.2fact()), Bill Venables [ctb] (Functions: empty.dimnames(), primes()), Tony Plate [ctb] (to inv.seq()), Isabelle Flückiger [ctb], Marcel Wolbers [ctb], Markus Keller [ctb], Sandrine Dudoit [ctb], Jane Fridlyand [ctb], Greg Snow [ctb] (to loessDemo()), Henrik Aa. Nielsen [ctb] (to loessDemo()), Vincent Carey [ctb], Ben Bolker [ctb], Philippe Grosjean [ctb], Frédéric Ibanez [ctb], Caterina Savi [ctb], Charles Geyer [ctb], Jens Oehlschlägel [ctb] |
Maintainer: | Martin Maechler <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.1-20 |
Built: | 2024-10-23 10:24:05 UTC |
Source: | https://github.com/mmaechler/sfsmisc |
AsciiToInt
returns integer
codes in 0:255
for each (one byte) character in strings
. ichar
is an
alias for it, for old S compatibility.
strcodes
implements in R the basic engine for translating
characters to corresponding integer codes.
chars8bit()
is the inverse function of
AsciiToint
, producing “one byte” characters from integer
codes. Note that it (and hence strcodes()
depends on the
locale, see Sys.getlocale()
.
AsciiToInt(strings) ichar(strings) chars8bit(i = 1:255) strcodes(x, table = chars8bit(1:255))
AsciiToInt(strings) ichar(strings) chars8bit(i = 1:255) strcodes(x, table = chars8bit(1:255))
strings , x
|
|
i |
numeric (integer) vector of values in |
table |
a vector of (unique) character strings, typically of one character each. |
Only codes in 1:127
make up the ASCII encoding which should be
identical for all R versions, whereas the ‘upper’ half
is often determined from the ISO-8859-1 (aka “ISO-Latin 1)”
encoding, but may well differ, depending on the locale setting, see
also Sys.setlocale
.
Note that 0
is no longer allowed since, R does not allow
\0
aka nul
characters in a string anymore.
AsciiToInt
(and hence ichar
) and chars8bit
return a
vector of the same length as their argument.
strcodes(x, tab)
returns a list
of the same
length
and names
as x
with list
components of integer vectors with codes in 1:255
.
Martin Maechler, partly in 1991 for S-plus
chars8bit(65:70)#-> "A" "B" .. "F" stopifnot(identical(LETTERS, chars8bit(65:90)), identical(AsciiToInt(LETTERS), 65:90)) ## may only work in ISO-latin1 locale (not in UTF-8): try( strcodes(c(a= "ABC", ch="1234", place = "Zürich")) ) ## in "latin-1" gives {otherwise should give NA instead of 252}: ## Not run: $a [1] 65 66 67 $ch [1] 49 50 51 52 $place [1] 90 252 114 105 99 104 ## End(Not run) myloc <- Sys.getlocale() if(.Platform $ OS.type == "unix") withAutoprint({ # ''should work'' here try( Sys.setlocale(locale = "de_CH") )# "try": just in case strcodes(c(a= "ABC", ch="1234", place = "Zürich")) # no NA hopefully AsciiToInt(chars8bit()) # -> 1:255 {if setting latin1 succeeded above} chars8bit(97:140) try( Sys.setlocale(locale = "de_CH.utf-8") )# "try": just in case chars8bit(97:140) ## typically looks different than above }) ## Resetting to original locale .. works "mostly": lapply(strsplit(strsplit(myloc, ";")[[1]], "="), function(cc) try(Sys.setlocale(cc[1], cc[2]))) -> .scratch Sys.getlocale() == myloc # TRUE if we have succeeded to reset it
chars8bit(65:70)#-> "A" "B" .. "F" stopifnot(identical(LETTERS, chars8bit(65:90)), identical(AsciiToInt(LETTERS), 65:90)) ## may only work in ISO-latin1 locale (not in UTF-8): try( strcodes(c(a= "ABC", ch="1234", place = "Zürich")) ) ## in "latin-1" gives {otherwise should give NA instead of 252}: ## Not run: $a [1] 65 66 67 $ch [1] 49 50 51 52 $place [1] 90 252 114 105 99 104 ## End(Not run) myloc <- Sys.getlocale() if(.Platform $ OS.type == "unix") withAutoprint({ # ''should work'' here try( Sys.setlocale(locale = "de_CH") )# "try": just in case strcodes(c(a= "ABC", ch="1234", place = "Zürich")) # no NA hopefully AsciiToInt(chars8bit()) # -> 1:255 {if setting latin1 succeeded above} chars8bit(97:140) try( Sys.setlocale(locale = "de_CH.utf-8") )# "try": just in case chars8bit(97:140) ## typically looks different than above }) ## Resetting to original locale .. works "mostly": lapply(strsplit(strsplit(myloc, ";")[[1]], "="), function(cc) try(Sys.setlocale(cc[1], cc[2]))) -> .scratch Sys.getlocale() == myloc # TRUE if we have succeeded to reset it
Produce nice expressions for
axis
labeling instead of the scientific notation
"a E<k>"
.
axTexpr(side, at = axTicks(side, axp = axp, usr = usr, log = log), axp = NULL, usr = NULL, log = NULL, drop.1 = FALSE)
axTexpr(side, at = axTicks(side, axp = axp, usr = usr, log = log), axp = NULL, usr = NULL, log = NULL, drop.1 = FALSE)
side |
integer in 1:4 specifying the axis side, as for
|
at |
numeric vector; with identical default as in
|
axp , usr , log
|
as for |
drop.1 |
logical indicating if |
This is just a utility with the same arguments as
axTicks
, a wrapper pretty10exp(at, *)
.
an expression of the same length as x
, with elements of the
form a %*% 10 ^ k
.
Martin Maechler
pretty10exp
;
eaxis
, axis
, axTicks
.
x <- 1e7*(-10:50) y <- dnorm(x, m=10e7, s=20e7) plot(x,y)## not really nice, the following is better: ## For horizontal y-axis labels, need more space: op <- par(mar= .1+ c(5,5,4,1)) plot(x,y, axes= FALSE, frame=TRUE) aX <- axTicks(1); axis(1, at=aX, label= axTexpr(1, aX)) ## horizontal labels on y-axis: aY <- axTicks(2); axis(2, at=aY, label= axTexpr(2, aY), las=2) par(op) ### -- only 'x' and using log-scale there: plot(x,y, xaxt= "n", log = "x") aX <- axTicks(1); axis(1, at=aX, label= axTexpr(1, aX)) ## Now an "engineer's version" ( more ticks; only label "10 ^ k" ) : axp <- par("xaxp") #-> powers of 10 *inside* 'usr' axp[3] <- 1 # such that only 10^. are labeled aX <- axTicks(1, axp = axp) xu <- 10 ^ par("usr")[1:2] e10 <- c(-1,1) + round(log10(axp[1:2])) ## exponents of 10 *outside* 'usr' v <- c(outer(1:9, e10[1]:e10[2], function(x,E) x * 10 ^ E)) v <- v[xu[1] <= v & v <= xu[2]] plot(x,y, xaxt= "n", log = "x", main = "engineer's version of x - axis") axis(1, at = aX, label = axTexpr(1, aX, drop.1=TRUE)) # 'default' axis(1, at = v, label = FALSE, tcl = 2/3 * par("tcl"))
x <- 1e7*(-10:50) y <- dnorm(x, m=10e7, s=20e7) plot(x,y)## not really nice, the following is better: ## For horizontal y-axis labels, need more space: op <- par(mar= .1+ c(5,5,4,1)) plot(x,y, axes= FALSE, frame=TRUE) aX <- axTicks(1); axis(1, at=aX, label= axTexpr(1, aX)) ## horizontal labels on y-axis: aY <- axTicks(2); axis(2, at=aY, label= axTexpr(2, aY), las=2) par(op) ### -- only 'x' and using log-scale there: plot(x,y, xaxt= "n", log = "x") aX <- axTicks(1); axis(1, at=aX, label= axTexpr(1, aX)) ## Now an "engineer's version" ( more ticks; only label "10 ^ k" ) : axp <- par("xaxp") #-> powers of 10 *inside* 'usr' axp[3] <- 1 # such that only 10^. are labeled aX <- axTicks(1, axp = axp) xu <- 10 ^ par("usr")[1:2] e10 <- c(-1,1) + round(log10(axp[1:2])) ## exponents of 10 *outside* 'usr' v <- c(outer(1:9, e10[1]:e10[2], function(x,E) x * 10 ^ E)) v <- v[xu[1] <= v & v <= xu[2]] plot(x,y, xaxt= "n", log = "x", main = "engineer's version of x - axis") axis(1, at = aX, label = axTexpr(1, aX, drop.1=TRUE)) # 'default' axis(1, at = v, label = FALSE, tcl = 2/3 * par("tcl"))
Provides a graphics device for Sweave, based on
cairo_pdf
. The advantage of cairoSwd()
compared
to pdf()
is its support of Unicode characters.
cairoSwd(name, width, height, ...)
cairoSwd(name, width, height, ...)
name |
file name prefix to which ‘.pdf’ will be appended. |
width , height
|
in inches, see |
... |
further arguments, passed to |
Sweave devices need to have an argument list as above.
Usage in a Sweave chunk:
<<some-plot, fig=TRUE, grdevice=cairoSwd>>=
Alain Hauser
Capture output and print first and last parts, eliding
middle parts. Particularly useful for teaching purposes, and, e.g.,
in Sweave (RweaveLatex
).
By default, when middle = NA
, capture.output(EXPR, first, last)
basically does
co <- capture.output(EXPR) writeLines(head(co, first)) cat( ... dotdots ...) writeLines(tail(co, last))
capture.and.write(EXPR, first, last = 2, middle = NA, i.middle, dotdots = " ....... ", n.dots = 2)
capture.and.write(EXPR, first, last = 2, middle = NA, i.middle, dotdots = " ....... ", n.dots = 2)
EXPR |
the (literal) expression the output of which is to be captured. |
first |
integer: how many lines should be printed at beginning. |
last |
integer: how many lines should be printed at the end. |
middle |
numeric (or NA logical): |
i.middle |
index start of middle part |
dotdots |
string to be used for elided lines |
n.dots |
number of |
return value of capture.output(EXPR)
.
Martin Maechler, ETH Zurich
x <- seq(0, 10, by = .1) ## for matrix, dataframe, .. first lines include a header line: capture.and.write( cbind(x, log1p(exp(x))), first = 5) ## first, *middle* and last : capture.and.write( cbind(x, x^2, x^3), first = 4, middle = 3, n.dots= 1)
x <- seq(0, 10, by = .1) ## for matrix, dataframe, .. first lines include a header line: capture.and.write( cbind(x, log1p(exp(x))), first = 5) ## first, *middle* and last : capture.and.write( cbind(x, x^2, x^3), first = 4, middle = 3, n.dots= 1)
col01scale
and colcenter
(re)scale the columns of a
matrix. These are simple one-line utilities, mainly with a didactical
purpose.
colcenter (mat) col01scale(mat, scale.func = function(x) diff(range(x)), location.func = mean)
colcenter (mat) col01scale(mat, scale.func = function(x) diff(range(x)), location.func = mean)
mat |
numeric matrix, to rescaled. |
scale.func , location.func
|
two functions mapping a numeric vector to a single number. |
a matrix with the same attributes as the input mat
.
Martin Maechler
The standard R function scale()
.
## See the simple function definitions: colcenter ## simply one line col01scale# almost as simple
## See the simple function definitions: colcenter ## simply one line col01scale# almost as simple
For an analysis of variance or regression with (at least) two factors: Plot components + residuals for two factors according to Tukey's “forget-it plot”. Try it!
compresid2way(aov, data=NULL, fac=1:2, label = TRUE, numlabel = FALSE, xlab=NULL, ylab=NULL, main=NULL, col=c(2,3,4,4), lty=c(1,1,2,4), pch=c(1,2))
compresid2way(aov, data=NULL, fac=1:2, label = TRUE, numlabel = FALSE, xlab=NULL, ylab=NULL, main=NULL, col=c(2,3,4,4), lty=c(1,1,2,4), pch=c(1,2))
aov |
either an |
data |
data frame containing |
fac |
the two factors used for plotting. Either column numbers or
names for argument |
label |
logical indicating if levels of factors should be shown in the plot. |
numlabel |
logical indicating if effects of factors will be shown in the plot. |
xlab , ylab , main
|
the usual |
col , lty , pch
|
colors, line types, plotting characters to be used
for plotting [1] positive residuals, [2] negative residuals,
[3] grid, [4] labels. If |
For a two-way analysis of variance, the plot shows the additive components of the fits for the two factors by the intersections of a grid, along with the residuals. The observed values of the target variable are identical to the vertical coordinate.
The application of the function has been extended to cover more complicated models. The components of the fit for two factors are shown as just described, and the residuals are added. The result is a “component plus residual” plot for two factors in one display.
Invisibly, a list with components
compy |
data.frame containing the component effects of the two factors, and combined effects plus residual |
coef |
coefficients: Intercept and effects of the factors |
Werner Stahel [email protected]
F. Mosteller and J. W. Tukey (1977) Data Analysis and Regression: A Second Course in Statistics. Addison-Wesley, Reading, Mass., p. 176.
John W. Tukey (1977) Exploratory Data Analysis. Addison-Wesley, Reading, Mass., p. 381.
## From Venables and Ripley (2002) p.165. N <- c(0,1,0,1,1,1,0,0,0,1,1,0,1,1,0,0,1,0,1,0,1,1,0,0) P <- c(1,1,0,0,0,1,0,1,1,1,0,0,0,1,0,1,1,0,0,1,0,1,1,0) K <- c(1,0,0,1,0,1,1,0,0,1,0,1,0,1,1,0,0,0,1,1,1,0,1,0) yield <- c(49.5,62.8,46.8,57.0,59.8,58.5,55.5,56.0,62.8,55.8,69.5,55.0, 62.0,48.8,45.5,44.2,52.0,51.5,49.8,48.8,57.2,59.0,53.2,56.0) npk <- data.frame(block=gl(6,4), N=factor(N), P=factor(P), K=factor(K), yield=yield) npk.cr <- compresid2way(yield ~ N+P+K, data=npk, fac=c("P","K")) ## Fisher's 1926 data on potatoe yield data(potatoes) pot.aov <- aov(yield ~ nitrogen+potash+pos, data=potatoes) compresid2way(pot.aov, pch=as.character(potatoes$pos)) compresid2way(yield~nitrogen+potash, data=subset(potatoes, pos == 2)) ## 2 x 3 design : data(warpbreaks) summary(fm1 <- aov(breaks ~ wool + tension, data = warpbreaks)) compresid2way(fm1)
## From Venables and Ripley (2002) p.165. N <- c(0,1,0,1,1,1,0,0,0,1,1,0,1,1,0,0,1,0,1,0,1,1,0,0) P <- c(1,1,0,0,0,1,0,1,1,1,0,0,0,1,0,1,1,0,0,1,0,1,1,0) K <- c(1,0,0,1,0,1,1,0,0,1,0,1,0,1,1,0,0,0,1,1,1,0,1,0) yield <- c(49.5,62.8,46.8,57.0,59.8,58.5,55.5,56.0,62.8,55.8,69.5,55.0, 62.0,48.8,45.5,44.2,52.0,51.5,49.8,48.8,57.2,59.0,53.2,56.0) npk <- data.frame(block=gl(6,4), N=factor(N), P=factor(P), K=factor(K), yield=yield) npk.cr <- compresid2way(yield ~ N+P+K, data=npk, fac=c("P","K")) ## Fisher's 1926 data on potatoe yield data(potatoes) pot.aov <- aov(yield ~ nitrogen+potash+pos, data=potatoes) compresid2way(pot.aov, pch=as.character(potatoes$pos)) compresid2way(yield~nitrogen+potash, data=subset(potatoes, pos == 2)) ## 2 x 3 design : data(warpbreaks) summary(fm1 <- aov(breaks ~ wool + tension, data = warpbreaks)) compresid2way(fm1)
Kumulative Verteilung von x
aufzeichnen, auf Wunsch auch Median
und Quartile.
This is just an old German language version of
plot.ecdf()
used for teaching at ETHZ.
cum.Vert.funkt(x, Quartile = TRUE, titel = TRUE, Datum = TRUE, rang.axis = n <= 20, xlab = "", main = "", ...)
cum.Vert.funkt(x, Quartile = TRUE, titel = TRUE, Datum = TRUE, rang.axis = n <= 20, xlab = "", main = "", ...)
x |
numeric vector whose empirical distribution should be plotted. |
Quartile |
logical indicating if all 3 non-trivial quartiles should be drawn. |
titel |
logical indicating if a German title should be drawn. |
Datum |
logical indicating if |
rang.axis |
logical indicating if all the ranks should be marked at the y-axis. Defaults to true if there are not more than 20 observations. |
xlab , main
|
x-axis label and main title; default to empty. |
... |
optional further arguments, passed to |
the return value of plotStep()
which is called
internally, invisibly.
Martin Maechler et al.
plotStep
on which it is based; but you should
really consider using plot.ecdf()
from the
stats package instead of this.
cum.Vert.funkt(runif(12)) cum.Vert.funkt(runif(20)) Z <- rnorm(50) cum.Vert.funkt(Z)
cum.Vert.funkt(runif(12)) cum.Vert.funkt(runif(20)) Z <- rnorm(50) cum.Vert.funkt(Z)
Compute numerical derivatives of given observations
(x,y)
, using cubic smoothing splines with GCV, see
smooth.spline
. In other words, estimate
and/or
for the model
D1D2(x, y, xout = x, spar.offset = 0.1384, deriv = 1:2, spl.spar = NULL)
D1D2(x, y, xout = x, spar.offset = 0.1384, deriv = 1:2, spl.spar = NULL)
x , y
|
numeric vectors of same length, supposedly from a model
|
xout |
abscissa values at which to evaluate the derivatives. |
spar.offset |
numeric fudge added to the smoothing parameter,
see |
deriv |
integer in |
spl.spar |
direct smoothing parameter for |
It is well known that for derivative estimation, the optimal smoothing
parameter is larger (more smoothing) than for the function itself.
spar.offset
is really just a fudge offset added to the
smoothing parameter. Note that in R's implementation of
smooth.spline
, spar
is really on the
scale.
When deriv = 1:2
(as per default), both derivatives are
estimated with the same smoothing parameter which is suboptimal
for the single functions individually. Another possibility is to call
D1D2(*, deriv = k)
twice with k = 1
and k = 2
and
use a larger smoothing parameter for the second derivative.
a list with several components,
x |
the abscissae values at which the derivative(s) are evaluated. |
D1 |
if |
D2 |
if |
spar |
the smoothing parameter used in the (final)
|
df |
the equivalent degrees of freedom in that
|
Martin Maechler, in 1992 (for S).
D2ss
which calls smooth.spline
twice,
first on y
, then on the values;
smooth.spline
on which it relies completely.
set.seed(8840) x <- runif(100, 0,10) y <- sin(x) + rnorm(100)/4 op <- par(mfrow = c(2,1)) plot(x,y) lines(ss <- smooth.spline(x,y), col = 4) str(ss[c("df", "spar")]) plot(cos, 0, 10, ylim = c(-1.5,1.5), lwd=2, main = expression("Estimating f'() : "~~ frac(d,dx) * sin(x) == cos(x))) offs <- c(-0.1, 0, 0.1, 0.2, 0.3) i <- 1 for(off in offs) { d12 <- D1D2(x,y, spar.offset = off) lines(d12$x, d12$D1, col = i <- i+1) } legend(2,1.6, c("true cos()",paste("sp.off. = ", format(offs))), lwd=1, col = 1:(1+length(offs)), cex = 0.8, bg = NA) par(op)
set.seed(8840) x <- runif(100, 0,10) y <- sin(x) + rnorm(100)/4 op <- par(mfrow = c(2,1)) plot(x,y) lines(ss <- smooth.spline(x,y), col = 4) str(ss[c("df", "spar")]) plot(cos, 0, 10, ylim = c(-1.5,1.5), lwd=2, main = expression("Estimating f'() : "~~ frac(d,dx) * sin(x) == cos(x))) offs <- c(-0.1, 0, 0.1, 0.2, 0.3) i <- 1 for(off in offs) { d12 <- D1D2(x,y, spar.offset = off) lines(d12$x, d12$D1, col = i <- i+1) } legend(2,1.6, c("true cos()",paste("sp.off. = ", format(offs))), lwd=1, col = 1:(1+length(offs)), cex = 0.8, bg = NA) par(op)
Compute the numerical first or 2nd derivatives of given
observations
(x[i], y ~= f(x[i]))
.
D1tr
is the trivial discrete first derivative
using simple difference ratios, whereas D1ss
and D2ss
use cubic smoothing splines (see smooth.spline
)
to estimate first or second derivatives, respectively.
D2ss
first uses smooth.spline
for the first derivative
and then applies the same to the predicted values
(where
are the values of
xout
) to find .
D1tr(y, x = 1) D1ss(x, y, xout = x, spar.offset = 0.1384, spl.spar=NULL) D2ss(x, y, xout = x, spar.offset = 0.1384, spl.spar=NULL)
D1tr(y, x = 1) D1ss(x, y, xout = x, spar.offset = 0.1384, spl.spar=NULL) D2ss(x, y, xout = x, spar.offset = 0.1384, spl.spar=NULL)
x , y
|
numeric vectors of same length, supposedly from a model
|
xout |
abscissa values at which to evaluate the derivatives. |
spar.offset |
numeric fudge added to the smoothing parameter(s),
see |
spl.spar |
direct smoothing parameter(s) for |
It is well known that for derivative estimation, the optimal smoothing
parameter is larger (more smoothing needed) than for the function itself.
spar.offset
is really just a fudge offset added to the
smoothing parameters. Note that in R's implementation of
smooth.spline
, spar
is really on the
scale.
D1tr()
and D1ss()
return a numeric vector of the length
of y
or xout
, respectively.
D2ss()
returns a list with components
x |
the abscissae values (= |
y |
estimated values of |
spl.spar |
numeric vector of length 2, contain the |
spar.offset |
as specified on input (maybe rep()eated to length 2). |
Martin Maechler, in 1992 (for S).
D1D2
which directly uses the 2nd derivative of
the smoothing spline; smooth.spline
.
## First Derivative --- spar.off = 0 ok "asymptotically" (?) set.seed(330) mult.fig(12) for(i in 1:12) { x <- runif(500, 0,10); y <- sin(x) + rnorm(500)/4 f1 <- D1ss(x=x,y=y, spar.off=0.0) plot(x,f1, ylim = range(c(-1,1,f1))) curve(cos(x), col=3, add= TRUE) } set.seed(8840) x <- runif(100, 0,10) y <- sin(x) + rnorm(100)/4 op <- par(mfrow = c(2,1)) plot(x,y) lines(ss <- smooth.spline(x,y), col = 4) str(ss[c("df", "spar")]) xx <- seq(0,10, len=201) plot(xx, -sin(xx), type = 'l', ylim = c(-1.5,1.5)) title(expression("Estimating f''() : " * frac(d^2,dx^2) * sin(x) == -sin(x))) offs <- c(0.05, 0.1, 0.1348, 0.2) i <- 1 for(off in offs) { d12 <- D2ss(x,y, spar.offset = off) lines(d12, col = i <- i+1) } legend(2,1.6, c("true : -sin(x)",paste("sp.off. = ", format(offs))), lwd=1, col = 1:(1+length(offs)), cex = 0.8, bg = NA) par(op)
## First Derivative --- spar.off = 0 ok "asymptotically" (?) set.seed(330) mult.fig(12) for(i in 1:12) { x <- runif(500, 0,10); y <- sin(x) + rnorm(500)/4 f1 <- D1ss(x=x,y=y, spar.off=0.0) plot(x,f1, ylim = range(c(-1,1,f1))) curve(cos(x), col=3, add= TRUE) } set.seed(8840) x <- runif(100, 0,10) y <- sin(x) + rnorm(100)/4 op <- par(mfrow = c(2,1)) plot(x,y) lines(ss <- smooth.spline(x,y), col = 4) str(ss[c("df", "spar")]) xx <- seq(0,10, len=201) plot(xx, -sin(xx), type = 'l', ylim = c(-1.5,1.5)) title(expression("Estimating f''() : " * frac(d^2,dx^2) * sin(x) == -sin(x))) offs <- c(0.05, 0.1, 0.1348, 0.2) i <- 1 for(off in offs) { d12 <- D2ss(x,y, spar.offset = off) lines(d12, col = i <- i+1) } legend(2,1.6, c("true : -sin(x)",paste("sp.off. = ", format(offs))), lwd=1, col = 1:(1+length(offs)), cex = 0.8, bg = NA) par(op)
These functions are provided for compatibility with older versions of the sfsmisc package only, and may be defunct as soon as of the next release.
pmax.sa(scalar, arr) pmin.sa(scalar, arr)
pmax.sa(scalar, arr) pmin.sa(scalar, arr)
scalar |
numeric scalar. |
arr |
any numeric R object, typically array. |
pmax.sa(s, a)
and pmin.sa(s, a)
return (more-dimensional) arrays.
These have been deprecated, because pmax
and
pmin
do so too, if the array is used as
first argument.
This function implements a simple Gaussian maximum likelihood discriminant rule, for diagonal class covariance matrices.
In machine learning lingo, this is called “Naive Bayes” (for continuous predictors). Note that naive Bayes is more general, as it models discrete predictors as multinomial, i.e., binary predictor variables as Binomial / Bernoulli.
dDA(x, cll, pool = TRUE) ## S3 method for class 'dDA' predict(object, newdata, pool = object$pool, ...) ## S3 method for class 'dDA' print(x, ...) diagDA(ls, cll, ts, pool = TRUE)
dDA(x, cll, pool = TRUE) ## S3 method for class 'dDA' predict(object, newdata, pool = object$pool, ...) ## S3 method for class 'dDA' print(x, ...) diagDA(ls, cll, ts, pool = TRUE)
x , ls
|
learning set data matrix, with rows corresponding to cases (e.g., mRNA samples) and columns to predictor variables (e.g., genes). |
cll |
class labels for learning set, must be consecutive integers. |
object |
object of class |
ts , newdata
|
test set (prediction) data matrix, with rows corresponding to cases and columns to predictor variables. |
pool |
logical flag. If true (by default), the covariance matrices
are assumed to be constant across classes and the discriminant rule
is linear in the data. Otherwise ( |
... |
further arguments passed to and from methods. |
dDA()
returns an object of class dDA
for which there are
print
and predict
methods. The latter
returns the same as diagDA()
:
diagDA()
returns an integer vector of class predictions for the
test set.
Sandrine Dudoit, [email protected] and
Jane Fridlyand, [email protected] originally wrote
stat.diag.da()
in CRAN package sma which was modified
for speedup by Martin Maechler [email protected]
who also introduced dDA
etc.
S. Dudoit, J. Fridlyand, and T. P. Speed. (2000) Comparison of Discrimination Methods for the Classification of Tumors Using Gene Expression Data. (Statistics, UC Berkeley, June 2000, Tech Report #576)
lda
and qda
from the
MASS package;
naiveBayes
from e1071.
## two artificial examples by Andreas Greutert: d1 <- data.frame(x = c(1, 5, 5, 5, 10, 25, 25, 25, 25, 29), y = c(4, 1, 2, 4, 4, 4, 6:8, 7)) n.plot(d1) library(cluster) (cl1P <- pam(d1,k=4)$cluster) # 4 surprising clusters with(d1, points(x+0.5, y, col = cl1P, pch =cl1P)) i1 <- c(1,3,5,6) tr1 <- d1[-i1,] cl1. <- c(1,2,1,2,1,3) cl1 <- c(2,2,1,1,1,3) plot(tr1, cex=2, col = cl1, pch = 20+cl1) (dd.<- diagDA(tr1, cl1., ts = d1[ i1,]))# ok (dd <- diagDA(tr1, cl1 , ts = d1[ i1,]))# ok, too! points(d1[ i1,], pch = 10, cex=3, col = dd) ## use new fit + predict instead : (r1 <- dDA(tr1, cl1)) (r1.<- dDA(tr1, cl1.)) stopifnot(dd == predict(r1, new = d1[ i1,]), dd.== predict(r1., new = d1[ i1,])) plot(tr1, cex=2, col = cl1, bg = cl1, pch = 20+cl1, xlim=c(1,30), ylim= c(0,10)) xy <- cbind(x= runif(500, min=1,max=30), y = runif(500, min=0, max=10)) points(xy, cex= 0.5, col = predict(r1, new = xy)) abline(v=c( mean(c(5,25)), mean(c(25,29)))) ## example where one variable xj has Var(xj) = 0: x4 <- matrix(c(2:4,7, 6,8,5,6, 7,2,3,1, 7,7,7,7), ncol=4) y <- c(2,2, 1,1) m4.1 <- dDA(x4, y, pool = FALSE) m4.2 <- dDA(x4, y, pool = TRUE) xx <- matrix(c(3,7,5,7), ncol=4) predict(m4.1, xx)## gave integer(0) previously predict(m4.2, xx)
## two artificial examples by Andreas Greutert: d1 <- data.frame(x = c(1, 5, 5, 5, 10, 25, 25, 25, 25, 29), y = c(4, 1, 2, 4, 4, 4, 6:8, 7)) n.plot(d1) library(cluster) (cl1P <- pam(d1,k=4)$cluster) # 4 surprising clusters with(d1, points(x+0.5, y, col = cl1P, pch =cl1P)) i1 <- c(1,3,5,6) tr1 <- d1[-i1,] cl1. <- c(1,2,1,2,1,3) cl1 <- c(2,2,1,1,1,3) plot(tr1, cex=2, col = cl1, pch = 20+cl1) (dd.<- diagDA(tr1, cl1., ts = d1[ i1,]))# ok (dd <- diagDA(tr1, cl1 , ts = d1[ i1,]))# ok, too! points(d1[ i1,], pch = 10, cex=3, col = dd) ## use new fit + predict instead : (r1 <- dDA(tr1, cl1)) (r1.<- dDA(tr1, cl1.)) stopifnot(dd == predict(r1, new = d1[ i1,]), dd.== predict(r1., new = d1[ i1,])) plot(tr1, cex=2, col = cl1, bg = cl1, pch = 20+cl1, xlim=c(1,30), ylim= c(0,10)) xy <- cbind(x= runif(500, min=1,max=30), y = runif(500, min=0, max=10)) points(xy, cex= 0.5, col = predict(r1, new = xy)) abline(v=c( mean(c(5,25)), mean(c(25,29)))) ## example where one variable xj has Var(xj) = 0: x4 <- matrix(c(2:4,7, 6,8,5,6, 7,2,3,1, 7,7,7,7), ncol=4) y <- c(2,2, 1,1) m4.1 <- dDA(x4, y, pool = FALSE) m4.2 <- dDA(x4, y, pool = TRUE) xx <- matrix(c(3,7,5,7), ncol=4) predict(m4.1, xx)## gave integer(0) previously predict(m4.2, xx)
Compute the other diagonal identity matrix.
The result is basically a fast version of diag(n)[, n:1]
.
diagX(n)
diagX(n)
n |
positive integer. |
a numeric matrix with many zeros – apart from
1
s in the other diagonal.
Martin Maechler, 1992.
diag
.
diagX(4) for(m in 1:5) stopifnot(identical(diagX(m), diag(m)[, m:1, drop = FALSE]))
diagX(4) for(m in 1:5) stopifnot(identical(diagX(m), diag(m)[, m:1, drop = FALSE]))
Integer number representations in other Bases.
Formally, for every element x[i]
, compute the (vector
of) “digits” of the
base
representation of the number
,
.
Revert such a representation to integers.
digitsBase(x, base = 2, ndigits = 1 + floor(1e-9 + log(max(x,1), base))) ## S3 method for class 'basedInt' as.integer(x, ...) ## S3 method for class 'basedInt' print(x, ...) as.intBase(x, base = 2) bi2int(xlist, base)
digitsBase(x, base = 2, ndigits = 1 + floor(1e-9 + log(max(x,1), base))) ## S3 method for class 'basedInt' as.integer(x, ...) ## S3 method for class 'basedInt' print(x, ...) as.intBase(x, base = 2) bi2int(xlist, base)
x |
For For |
base |
integer, at least 2 specifying the base for representation. |
ndigits |
number of bits/digits to use. |
... |
potential further arguments passed to methods, notably
|
xlist |
a |
For digitsBase()
, an object, say m
, of class
"basedInt"
which is basically a (ndigits
x n
)
matrix
where m[,i]
corresponds to x[i]
,
n <- length(x)
and attr(m,"base")
is the input
base
.
as.intBase()
and the as.integer
method for
basedInt
objects return an integer
vector.
bi2int()
is the low-level workhorse of as.intBase()
.
Some of these functions existed under names digits
and
digits.v
in previous versions of the sfsmisc package.
Martin Maechler, Dec 4, 1991 (for S-plus; then called digits.v
).
digitsBase(0:12, 8) #-- octal representation empty.dimnames(digitsBase(0:33, 2)) # binary ## This may be handy for just one number (and default decimal): digits <- function(n, base = 10) as.vector(digitsBase(n, base = base)) digits(128982734) # 1 2 8 9 8 2 7 3 4 digits(128, base = 8) # 2 0 0 ## one way of pretty printing (base <= 10!) b2ch <- function(db) noquote(gsub("^0+(.{1,})$"," \\1", apply(db, 2, paste, collapse = ""))) b2ch(digitsBase(0:33, 2)) #-> 0 1 10 11 100 101 ... 100001 b2ch(digitsBase(0:33, 4)) #-> 0 1 2 3 10 11 12 13 20 ... 200 201 ## Hexadecimal: i <- c(1:20, 100:106) M <- digitsBase(i, 16) hexdig <- c(0:9, LETTERS[1:6]) cM <- hexdig[1 + M]; dim(cM) <- dim(M) b2ch(cM) #-> 1 2 3 4 5 6 7 8 9 A B C D E F 10 11 ... 6A ## IP (Internet Protocol) numbers coding: <n>.<n>.<n>.<n> <--> longinteger ip_ntoa <- function(n) apply(digitsBase(n, base = 256), 2, paste, collapse=".") ip_ntoa(2130706430 + (0:9))# "126.255.255.254" ... "127.0.0.7" ## and the inverse: ip_aton <- function(a) bi2int(lapply(strsplit(a, ".", fixed=TRUE), as.integer), 256) n <- 2130706430 + (0:9) head(ip <- ip_ntoa(n)) head(ip_aton(ip)) stopifnot( n == ip_aton(ip_ntoa(n )), ip == ip_ntoa(ip_aton(ip))) ## Inverse of digitsBase() : as.integer method for the "basedInt" class as.integer(M) ## or also as.intBase() working from strings: (cb <- apply(digitsBase(0:33, 4), 2, paste, collapse = "")) ##-> "000" "001" ..... "200" "201" all(0:33 == as.intBase(cb, base = 4))
digitsBase(0:12, 8) #-- octal representation empty.dimnames(digitsBase(0:33, 2)) # binary ## This may be handy for just one number (and default decimal): digits <- function(n, base = 10) as.vector(digitsBase(n, base = base)) digits(128982734) # 1 2 8 9 8 2 7 3 4 digits(128, base = 8) # 2 0 0 ## one way of pretty printing (base <= 10!) b2ch <- function(db) noquote(gsub("^0+(.{1,})$"," \\1", apply(db, 2, paste, collapse = ""))) b2ch(digitsBase(0:33, 2)) #-> 0 1 10 11 100 101 ... 100001 b2ch(digitsBase(0:33, 4)) #-> 0 1 2 3 10 11 12 13 20 ... 200 201 ## Hexadecimal: i <- c(1:20, 100:106) M <- digitsBase(i, 16) hexdig <- c(0:9, LETTERS[1:6]) cM <- hexdig[1 + M]; dim(cM) <- dim(M) b2ch(cM) #-> 1 2 3 4 5 6 7 8 9 A B C D E F 10 11 ... 6A ## IP (Internet Protocol) numbers coding: <n>.<n>.<n>.<n> <--> longinteger ip_ntoa <- function(n) apply(digitsBase(n, base = 256), 2, paste, collapse=".") ip_ntoa(2130706430 + (0:9))# "126.255.255.254" ... "127.0.0.7" ## and the inverse: ip_aton <- function(a) bi2int(lapply(strsplit(a, ".", fixed=TRUE), as.integer), 256) n <- 2130706430 + (0:9) head(ip <- ip_ntoa(n)) head(ip_aton(ip)) stopifnot( n == ip_aton(ip_ntoa(n )), ip == ip_ntoa(ip_aton(ip))) ## Inverse of digitsBase() : as.integer method for the "basedInt" class as.integer(M) ## or also as.intBase() working from strings: (cb <- apply(digitsBase(0:33, 4), 2, paste, collapse = "")) ##-> "000" "001" ..... "200" "201" all(0:33 == as.intBase(cb, base = 4))
Duplicated() generalizes the duplicated
method for
vectors, by returning indices of “equivalence classes” for
duplicated entries and returning nomatch
(NA
by default)
for unique entries.
Note that duplicated()
is not TRUE
for the first time a
duplicate appears, whereas Duplicated()
only marks unique
entries with nomatch
(NA
).
Duplicated(v, incomparables = FALSE, fromLast = FALSE, nomatch = NA_integer_)
Duplicated(v, incomparables = FALSE, fromLast = FALSE, nomatch = NA_integer_)
v |
a vector, often character, factor, or numeric. |
incomparables |
a vector of values that cannot be compared,
passed to both |
fromLast |
logical indicating if duplication should be considered
from the reverse side, i.e., the last (or rightmost) of identical
elements would correspond to |
nomatch |
passed to |
an integer vector of the same length as v
. Can be used as a
factor
, e.g., in split
,
tapply
, etc.
Christoph Buser and Martin Maechler, Seminar fuer Statistik, ETH Zurich, Sep.2007
uniqueL
(also in this sfsmisc package);
duplicated
, match
.
x <- c(9:12, 1:4, 3:6, 0:7) data.frame(x, dup = duplicated(x), dupL= duplicated(x, fromLast=TRUE), Dup = Duplicated(x), DupL= Duplicated(x, fromLast=TRUE))
x <- c(9:12, 1:4, 3:6, 0:7) data.frame(x, dup = duplicated(x), dupL= duplicated(x, fromLast=TRUE), Dup = Duplicated(x), DupL= Duplicated(x, fromLast=TRUE))
An extended axis()
function which labels more
prettily, in particular for log-scale axes.
It makes use of plotmath or (LaTeX) expression
s of
the form for labeling a
log-scaled axis and when otherwise exponential formatting would be
used (see
pretty10exp
).
eaxis(side, at = if(log) axTicks(side, axp=axp, log=log, nintLog=nintLog) else axTicks(side, axp=axp, log=log), labels = NULL, log = NULL, use.expr = log || format.info(as.numeric(at), digits=7)[3] > 0, f.smalltcl = 3/5, at.small = NULL, small.mult = NULL, equidist.at.tol = 0.002, small.args = list(), draw.between.ticks = TRUE, between.max = 4, outer.at = TRUE, drop.1 = TRUE, sub10 = FALSE, las = 1, nintLog = max(12, par("lab")[2 - is.x]), axp = NULL, n.axp = NULL, max.at = Inf, lab.type = "plotmath", lab.sep = "cdot", ...)
eaxis(side, at = if(log) axTicks(side, axp=axp, log=log, nintLog=nintLog) else axTicks(side, axp=axp, log=log), labels = NULL, log = NULL, use.expr = log || format.info(as.numeric(at), digits=7)[3] > 0, f.smalltcl = 3/5, at.small = NULL, small.mult = NULL, equidist.at.tol = 0.002, small.args = list(), draw.between.ticks = TRUE, between.max = 4, outer.at = TRUE, drop.1 = TRUE, sub10 = FALSE, las = 1, nintLog = max(12, par("lab")[2 - is.x]), axp = NULL, n.axp = NULL, max.at = Inf, lab.type = "plotmath", lab.sep = "cdot", ...)
side |
integer in 1:4, specifying side of |
at |
numeric vector of (“normalsized”) tick locations; by
default |
labels |
|
log |
logical or |
use.expr |
logical specifying if |
f.smalltcl |
factor specifying the lengths of the small ticks in proportion to the normalsized, labeled ticks. |
at.small |
locations of small ticks; the default,
|
small.mult |
positive integer (or |
equidist.at.tol |
a small positive number, a tolerance to be used
for checking equidistant |
small.args |
optional |
draw.between.ticks |
(only if |
between.max |
(only if |
outer.at |
logical specifying that |
drop.1 |
logical specifying if |
sub10 |
logical, integer (of length 1 or 2) or |
nintLog |
only used in R > 2.13.x, when |
axp |
to be passed to |
n.axp |
to be set to
see |
max.at |
maximal number of |
las , ...
|
arguments passed to (the first) |
lab.type |
string, passed to |
lab.sep |
separator between mantissa and exponent for LaTeX labels,
see |
Martin Maechler
axis
,
axTicks
,
axTexpr
,
pretty10exp
.
x <- lseq(1e-10, 0.1, length = 201) plot(x, pt(x, df=3), type = "l", xaxt = "n", log = "x") eaxis(1) ## without small ticks: eaxis(3, at.small=FALSE, col="blue") ## If you like the ticks, but prefer traditional (non-"plotmath") labels: plot(x, gamma(x), type = "l", log = "x") eaxis(1, labels=NA) x <- lseq(.001, 0.1, length = 1000) plot(x, sin(1/x)*x, type = "l", xaxt = "n", log = "x") eaxis(1) eaxis(3, n.axp = 1)# -> xaxp[3] = 1: only 10^j (main) ticks ## non- log-scale : draw small ticks, but no "10^k" if not needed: x <- seq(-100, 100, length = 1000) plot(x, sin(x)/x, type = "l", xaxt = "n") eaxis(1) # default -> {1, 2, 5} * 10^j ticks eaxis(3, n.axp = 2)# -> xaxp[3] := 2 -- approximately two (main) ticks x <- seq(-1, 1, length = 1000) plot(x, sin(x)/x, type = "l", xaxt = "n") eaxis(1, small.args = list(col="blue")) x <- x/1000 plot(x, 1-sin(x)/x, type = "l", xaxt = "n", yaxt = "n") eaxis(1) eaxis(2) ## more labels than default: op <- par(lab=c(10,5,7)) plot(x, sin(x)/x, type = "l", xaxt = "n") eaxis(1) # maybe (depending on your canvas), there are too many, ## in that case, maybe use plot(x, sin(x)/x, type = "l", xaxt = "n") eaxis(1, axTicks(1)[c(TRUE,FALSE)]) # drop every 2nd label eaxis(3, labels=FALSE) ## ore use 'max.at' which thins as well: plot(x, sin(x)/x, type = "l", xaxt = "n") eaxis(1, max.at=6) par(op) ### Answering R-help "How do I show real values on a log10 histogram", 26 Mar 2013 ## the data: set.seed(1); summary(x <- rlnorm(100, m = 2, sdl = 3)) ## the plot (w/o x-axis) : r <- hist(log10(x), xaxt = "n", xlab = "x [log scale]") ## the nice axis: axt <- axTicks(1) eaxis(1, at = axt, labels = pretty10exp(10^axt, drop.1=TRUE)) ## Additionally demo'ing 'sub10' options: plot(r, xaxt="n") eaxis(1, at = axt, labels = pretty10exp(10^axt, drop.1=TRUE, sub10 = 2)) ## or plot(r, xaxt="n") eaxis(1, at = axt, labels = pretty10exp(10^axt, drop.1=TRUE, sub10 = "10")) ## or plot(r, xaxt="n") eaxis(1, at = axt, labels = pretty10exp(10^axt, drop.1=TRUE, sub10 = c(-2, 2)))
x <- lseq(1e-10, 0.1, length = 201) plot(x, pt(x, df=3), type = "l", xaxt = "n", log = "x") eaxis(1) ## without small ticks: eaxis(3, at.small=FALSE, col="blue") ## If you like the ticks, but prefer traditional (non-"plotmath") labels: plot(x, gamma(x), type = "l", log = "x") eaxis(1, labels=NA) x <- lseq(.001, 0.1, length = 1000) plot(x, sin(1/x)*x, type = "l", xaxt = "n", log = "x") eaxis(1) eaxis(3, n.axp = 1)# -> xaxp[3] = 1: only 10^j (main) ticks ## non- log-scale : draw small ticks, but no "10^k" if not needed: x <- seq(-100, 100, length = 1000) plot(x, sin(x)/x, type = "l", xaxt = "n") eaxis(1) # default -> {1, 2, 5} * 10^j ticks eaxis(3, n.axp = 2)# -> xaxp[3] := 2 -- approximately two (main) ticks x <- seq(-1, 1, length = 1000) plot(x, sin(x)/x, type = "l", xaxt = "n") eaxis(1, small.args = list(col="blue")) x <- x/1000 plot(x, 1-sin(x)/x, type = "l", xaxt = "n", yaxt = "n") eaxis(1) eaxis(2) ## more labels than default: op <- par(lab=c(10,5,7)) plot(x, sin(x)/x, type = "l", xaxt = "n") eaxis(1) # maybe (depending on your canvas), there are too many, ## in that case, maybe use plot(x, sin(x)/x, type = "l", xaxt = "n") eaxis(1, axTicks(1)[c(TRUE,FALSE)]) # drop every 2nd label eaxis(3, labels=FALSE) ## ore use 'max.at' which thins as well: plot(x, sin(x)/x, type = "l", xaxt = "n") eaxis(1, max.at=6) par(op) ### Answering R-help "How do I show real values on a log10 histogram", 26 Mar 2013 ## the data: set.seed(1); summary(x <- rlnorm(100, m = 2, sdl = 3)) ## the plot (w/o x-axis) : r <- hist(log10(x), xaxt = "n", xlab = "x [log scale]") ## the nice axis: axt <- axTicks(1) eaxis(1, at = axt, labels = pretty10exp(10^axt, drop.1=TRUE)) ## Additionally demo'ing 'sub10' options: plot(r, xaxt="n") eaxis(1, at = axt, labels = pretty10exp(10^axt, drop.1=TRUE, sub10 = 2)) ## or plot(r, xaxt="n") eaxis(1, at = axt, labels = pretty10exp(10^axt, drop.1=TRUE, sub10 = "10")) ## or plot(r, xaxt="n") eaxis(1, at = axt, labels = pretty10exp(10^axt, drop.1=TRUE, sub10 = c(-2, 2)))
Plots the empirical (cumulative) distribution function (ECDF) for
univariate data, together with upper and lower simultaneous 95% confidence curves,
computed via Kolmogorov-Smirnov' , see
KSd
.
ecdf.ksCI(x, main = NULL, sub = NULL, xlab = deparse(substitute(x)), ci.col = "red", ...)
ecdf.ksCI(x, main = NULL, sub = NULL, xlab = deparse(substitute(x)), ci.col = "red", ...)
x |
|
main , sub , xlab
|
arguments passed to |
ci.col |
color for confidence interval lines. |
... |
optional arguments passed to |
Nothing. Used for its side effect, to produce a plot.
Presently, will only work if length(x)
> 9.
Kjetil Halvorsen
Bickel and Doksum, see KSd
.
ecdf
and plot.stepfun
in standard
R.
ecdf.ksCI( rchisq(50,3) )
ecdf.ksCI( rchisq(50,3) )
Compute points on (the boundary of) an ellipse which is given by elementary geometric parameters.
ellipsePoints(a, b, alpha = 0, loc = c(0, 0), n = 201, keep.ab.order=FALSE)
ellipsePoints(a, b, alpha = 0, loc = c(0, 0), n = 201, keep.ab.order=FALSE)
a , b
|
length of half axes in (x,y) direction. Note that
|
alpha |
angle (in degrees) giving the orientation of the ellipse,
i.e., the original (x,y)-axis ellipse is rotated by |
loc |
center (LOCation) of the ellipse. |
n |
number of points to generate. |
keep.ab.order |
logical indicating if Note that |
A numeric matrix of dimension n x 2
, each row containing the
(x,y) coordinates of a point.
Martin Maechler, March 2002.
the ellipse package and ellipsoidhull
and ellipsoidPoints
in the cluster package.
## Simple Ellipse, centered at (0,0), x-/y- axis parallel: ep <- ellipsePoints(5,2) str(ep) plot(ep, type="n",asp=1) ; polygon(ep, col = 2) ## (a,b) = (2,5) is equivalent to (5,2) : lines(ellipsePoints(2,5), lwd=2, lty=3) ## keep.order=TRUE : Now, (2,5) are axes in x- respective y- direction: lines(ellipsePoints(2,5, keep.ab.order=TRUE), col="blue") ## rotate by 30 degrees : plot(ellipsePoints(5,2, alpha = 30), asp=1) abline(h=0,v=0,col="gray") abline(a=0,b= tan( 30 *pi/180), col=2, lty = 2) abline(a=0,b= tan(120 *pi/180), col=3, lty = 2) ## NB: use x11(type = "Xlib") for the following if you can if(dev.interactive(TRUE)) { ## Movie : rotating ellipse : nTurns <- 4 # #{full 360 deg turns} for(al in 1:(nTurns*360)) { ep <- ellipsePoints(3,6, alpha=al, loc = c(5,2)) plot(ep,type="l",xlim=c(-1,11),ylim=c(-4,8), asp=1, axes = FALSE, xlab="", ylab="") } ## Movie : rotating _filled_ ellipse {less nice to look at} for(al in 1:180) { ep <- ellipsePoints(3,6, alpha=al, loc = c(5,2)) plot(ep,type="n",xlim=c(-1,11),ylim=c(-4,8), asp=1, axes = FALSE, xlab="", ylab="") polygon(ep,col=2,border=3,lwd=2.5) } }# only if interactive
## Simple Ellipse, centered at (0,0), x-/y- axis parallel: ep <- ellipsePoints(5,2) str(ep) plot(ep, type="n",asp=1) ; polygon(ep, col = 2) ## (a,b) = (2,5) is equivalent to (5,2) : lines(ellipsePoints(2,5), lwd=2, lty=3) ## keep.order=TRUE : Now, (2,5) are axes in x- respective y- direction: lines(ellipsePoints(2,5, keep.ab.order=TRUE), col="blue") ## rotate by 30 degrees : plot(ellipsePoints(5,2, alpha = 30), asp=1) abline(h=0,v=0,col="gray") abline(a=0,b= tan( 30 *pi/180), col=2, lty = 2) abline(a=0,b= tan(120 *pi/180), col=3, lty = 2) ## NB: use x11(type = "Xlib") for the following if you can if(dev.interactive(TRUE)) { ## Movie : rotating ellipse : nTurns <- 4 # #{full 360 deg turns} for(al in 1:(nTurns*360)) { ep <- ellipsePoints(3,6, alpha=al, loc = c(5,2)) plot(ep,type="l",xlim=c(-1,11),ylim=c(-4,8), asp=1, axes = FALSE, xlab="", ylab="") } ## Movie : rotating _filled_ ellipse {less nice to look at} for(al in 1:180) { ep <- ellipsePoints(3,6, alpha=al, loc = c(5,2)) plot(ep,type="n",xlim=c(-1,11),ylim=c(-4,8), asp=1, axes = FALSE, xlab="", ylab="") polygon(ep,col=2,border=3,lwd=2.5) } }# only if interactive
Remove all dimension names from an array for compact printing.
empty.dimnames(a)
empty.dimnames(a)
a |
an |
Returns a
with its dimnames replaced by empty character strings.
Bill Venables / Martin Maechler, Sept 1993.
unname
removes the dimnames.
empty.dimnames(diag(5)) # looks much nicer (a <- matrix(-9:10, 4,5)) empty.dimnames(a) # nicer, right?
empty.dimnames(diag(5)) # looks much nicer (a <- matrix(-9:10, 4,5)) empty.dimnames(a) # nicer, right?
Draws a scatter plot, adding vertical “error bars” to all the points.
errbar(x, y, yplus, yminus, cap = 0.015, ylim = range(y,yplus,yminus), xlab= deparse(substitute(x)), ylab= deparse(substitute(y)), ...)
errbar(x, y, yplus, yminus, cap = 0.015, ylim = range(y,yplus,yminus), xlab= deparse(substitute(x)), ylab= deparse(substitute(y)), ...)
x |
vector of x values. |
y |
vector of y values. |
yplus |
vector of y values: the tops of the error bars. |
yminus |
vector of y values: the bottoms of the error bars. |
cap |
the width of the little lines at the tops and bottoms of the error bars in units of the width of the plot. Default is 0.015. |
ylim |
(numeric of length 2): the y-axis extents with a sensible default. |
xlab , ylab
|
axis labels for the plot, as in
|
... |
Graphical parameters (see |
Originally Charles Geyer, U.Chicago, early 1991; then Martin Mächler.
errbar
in package Hmisc is similar.
y <- rnorm(10); d <- 1 + .1*rnorm(10) errbar(1:10, y, y + d, y - d, main="Error Bars example")
y <- rnorm(10); d <- 1 + .1*rnorm(10) errbar(1:10, y, y + d, y - d, main="Error Bars example")
Compute a robust F-Test, i.e., a Wald test for multiple coefficients
of an rlm
object.
f.robftest(object, var = -1)
f.robftest(object, var = -1)
object |
result of |
var |
variables. Either their names or their indices; the
default, |
This builds heavily on summary.rlm()
, the
summary
method for rlm
results.
An object of class "htest"
, hence with the standard print
methods for hypothesis tests. This is basically a list with components
statistic |
the F statistic, according to ... |
df |
numerator and denominator degrees of freedom. |
data.name |
(extracted from input |
alternative |
|
p.value |
the P-value, using an F-test on |
Werner Stahel, July 2000; updates by Martin Maechler.
FIXME — Need some here !
rlm
, summary.aov
, etc.
if(require("MASS")) { ## same data as example(rlm) data(stackloss) summary(rsl <- rlm(stack.loss ~ ., stackloss)) f.robftest(rsl) } else " forget it "
if(require("MASS")) { ## same data as example(rlm) data(stackloss) summary(rsl <- rlm(stack.loss ~ ., stackloss)) f.robftest(rsl) } else " forget it "
Compute the prime factorization(s) of integer(s) n
.
factorize(n, verbose = FALSE)
factorize(n, verbose = FALSE)
n |
vector of integers to factorize. |
verbose |
logical indicating if some progress information should be printed. |
works via primes
, currently in a cheap way, sub-optimal
for large composite .
A named list
of the same length as n
,
each element a 2-column matrix with column "p"
the prime
factors and column~"m"
their respective exponents (or
multiplities), i.e., for a prime number n
, the resulting matrix
is cbind(p = n, m = 1)
.
Martin Maechler, Jan. 1996.
For factorization of moderately or really large numbers, see the gmp
package, and its factorize()
.
factorize(47) factorize(seq(101, 120, by=2))
factorize(47) factorize(seq(101, 120, by=2))
Construct a “list”, really an environment
typically of functions and optionally other R objects, where the
function
s and formula
s all
all share the same environment.
Consequently, the functions may call each other.
On technical level, this is just a simple wrapper around
list2env()
.
funEnv(..., envir = NULL, parent = parent.frame(), hash = (...length() > 100), size = max(29L, ...length()))
funEnv(..., envir = NULL, parent = parent.frame(), hash = (...length() > 100), size = max(29L, ...length()))
... |
an arbitrary named “list” of R objects,
typically including several |
envir |
an |
parent |
(for the case |
hash , size
|
(for the case |
an environment
, say E
, containing the objects
from ...
(plus those in envir
), and all function
objects' environment()
is E.
Martin Maechler
ee <- funEnv(f = function(x) g(2*(x+1)), g = function(y) hh(y+1), hh = function(u) u^2, info = "Some Information (not a function)") ls(ee) # here the same as names(ee) ## Check that it works: i.e., that "f sees g" and "g sees hh": stopifnot(all.equal(ee$f(pi), (2*pi+3)^2)) ee$f(0:4) # [1] 9 25 49 81 121
ee <- funEnv(f = function(x) g(2*(x+1)), g = function(y) hh(y+1), hh = function(u) u^2, info = "Some Information (not a function)") ls(ee) # here the same as names(ee) ## Check that it works: i.e., that "f sees g" and "g sees hh": stopifnot(all.equal(ee$f(pi), (2*pi+3)^2)) ee$f(0:4) # [1] 9 25 49 81 121
Compute the hat matrix or smoother matrix, of ‘any’ (linear) smoother, smoothing splines, by default.
hatMat(x, trace= FALSE, pred.sm = function(x, y, ...) predict(smooth.spline(x, y, ...), x = x)$y, ...)
hatMat(x, trace= FALSE, pred.sm = function(x, y, ...) predict(smooth.spline(x, y, ...), x = x)$y, ...)
x |
numeric vector or matrix. |
trace |
logical indicating if the whole hat matrix, or only its trace, i.e. the sum of the diagonal values should be computed. |
pred.sm |
a function of at least two arguments |
... |
optionally further arguments to the smoother function
|
The hat matrix (if
trace = FALSE
as per default) or
a number, , the trace of
, i.e.,
.
Note that dim(H) == c(n, n)
where n <- length(x)
also in
the case where some x values are duplicated (aka ties).
Martin Maechler [email protected]
Hastie and Tibshirani (1990). Generalized Additive Models. Chapman & Hall.
smooth.spline
, etc.
Note the demo, demo("hatmat-ex")
.
require(stats) # for smooth.spline() or loess() x1 <- c(1:4, 7:12) H1 <- hatMat(x1, spar = 0.5) # default : smooth.spline() matplot(x1, H1, type = "l", main = "columns of smoother hat matrix") ## Example 'pred.sm' arguments for hatMat() : pspl <- function(x,y,...) predict(smooth.spline(x,y, ...), x = x)$y pksm <- function(x,y,...) ksmooth(sort(x),y, "normal", x.points=x, ...)$y ## Rather than ksmooth(): if(require("lokern")) pksm2 <- function(x,y,...) glkerns(x,y, x.out=x, ...)$est ## Explaining 'trace = TRUE' all.equal(sum(diag((hatMat(c(1:4, 7:12), df = 4)))), hatMat(c(1:4, 7:12), df = 4, trace = TRUE), tol = 1e-12) ## ksmooth() : Hk <- hatMat(x1, pr = pksm, bandwidth = 2) cat(sprintf("df = %.2f\n", sum(diag(Hk)))) image(Hk) Matrix::printSpMatrix(as(round(Hk, 2), "sparseMatrix")) ##---> see demo("hatmat-ex") for more (and larger) examples
require(stats) # for smooth.spline() or loess() x1 <- c(1:4, 7:12) H1 <- hatMat(x1, spar = 0.5) # default : smooth.spline() matplot(x1, H1, type = "l", main = "columns of smoother hat matrix") ## Example 'pred.sm' arguments for hatMat() : pspl <- function(x,y,...) predict(smooth.spline(x,y, ...), x = x)$y pksm <- function(x,y,...) ksmooth(sort(x),y, "normal", x.points=x, ...)$y ## Rather than ksmooth(): if(require("lokern")) pksm2 <- function(x,y,...) glkerns(x,y, x.out=x, ...)$est ## Explaining 'trace = TRUE' all.equal(sum(diag((hatMat(c(1:4, 7:12), df = 4)))), hatMat(c(1:4, 7:12), df = 4, trace = TRUE), tol = 1e-12) ## ksmooth() : Hk <- hatMat(x1, pr = pksm, bandwidth = 2) cat(sprintf("df = %.2f\n", sum(diag(Hk)))) image(Hk) Matrix::printSpMatrix(as(round(Hk, 2), "sparseMatrix")) ##---> see demo("hatmat-ex") for more (and larger) examples
Utility to view PDF-rendered help
pages; particularly
useful in case they contain mathematical formulas or otherwise
sophisticated formats.
helppdf(topic, viewer = getOption("pdfviewer"), quiet = !interactive(), ...)
helppdf(topic, viewer = getOption("pdfviewer"), quiet = !interactive(), ...)
topic |
the topic, passed to |
viewer |
a pdf viewer; the default is typically what you want interactively. |
quiet |
|
... |
further optional arguments passed to |
Returns the full path of the pdf file produced.
Martin Maechler
if(interactive()) { ## Both calls work : helppdf(Normal) helppdf("NegBinomial") } else if(.Platform$OS.type != "windows") { # batch mode (Windows often too slow for this) od <- setwd(tempdir()) ff <- helppdf(Normal, viewer=NULL) stopifnot(file.exists(ff)) ; print(ff) setwd(od)# revert to previous dir. }
if(interactive()) { ## Both calls work : helppdf(Normal) helppdf("NegBinomial") } else if(.Platform$OS.type != "windows") { # batch mode (Windows often too slow for this) od <- setwd(tempdir()) ff <- helppdf(Normal, viewer=NULL) stopifnot(file.exists(ff)) ; print(ff) setwd(od)# revert to previous dir. }
Creates a histogram and a horizontal boxplot on the current graphics device.
histBxp(x, nclass, breaks, probability=FALSE, include.lowest=TRUE, xlab = deparse(substitute(x)), ..., width=0.2, boxcol=3, medcol=2, medlwd=5, whisklty=2, staplelty=1)
histBxp(x, nclass, breaks, probability=FALSE, include.lowest=TRUE, xlab = deparse(substitute(x)), ..., width=0.2, boxcol=3, medcol=2, medlwd=5, whisklty=2, staplelty=1)
x |
numeric vector of data for histogram. Missing values
( |
nclass |
recommendation for the number of classes (i.e., bars) the histogram should
have. The default is a number proportional to the logarithm of the length
of |
breaks |
vector of the break points for the bars of the histogram. The count in the
i-th bar is |
probability |
logical flag: if |
include.lowest |
If |
xlab |
character or expression for x axis labeling. |
... |
additional arguments to |
width |
width of the box relative to the height of the histogram. DEFAULT is
|
boxcol |
color of filled box. The default is |
medcol |
the color of the median line. The special value, |
medlwd |
median line width. The special value |
whisklty |
whisker line type. The special value |
staplelty |
staple (whisker end cap) line type. The special value Graphical parameters (see |
If include.lowest
is FALSE
the bottom breakpoint must be
strictly less than the minimum of the data, otherwise (the default) it
must be less than or equal to the minimum of the data. The top
breakpoint must be greater than or equal to the maximum of the data.
This function has been called hist.bxp()
for 17 years; in 2012,
the increasingly strong CRAN policies required a new name (which could not
be confused with an S3 method name).
S-Plus: Markus Keller, Christian Keller; port to R in 1990's: Martin Mächler.
hist
, barplot
,
boxplot
, rug
and
scat1d
in the Hmisc package.
lab <- "50 samples from a t distribution with 5 d.f." mult.fig(2*3, main = "Hist() + Rug() and histBxp(*)") for(i in 1:3) { my.sample <- rt(50, 5) hist(my.sample, main=lab); rug(my.sample)# for 50 obs., this is ok, too.. histBxp(my.sample, main=lab) }
lab <- "50 samples from a t distribution with 5 d.f." mult.fig(2*3, main = "Hist() + Rug() and histBxp(*)") for(i in 1:3) { my.sample <- rt(50, 5) hist(my.sample, main=lab); rug(my.sample)# for 50 obs., this is ok, too.. histBxp(my.sample, main=lab) }
Given where
, compute a cheap
approximation of
.
integrate.xy(x, fx, a, b, use.spline=TRUE, xtol=2e-08)
integrate.xy(x, fx, a, b, use.spline=TRUE, xtol=2e-08)
x |
abscissa values. |
fx |
corresponding values of |
a , b
|
the boundaries of integration; these default to min(x) and max(x) respectively. |
use.spline |
logical; if TRUE use an interpolating spline. |
xtol |
tolerance factor, typically around
|
Note that this is really not good for noisy fx
values;
probably a smoothing spline should be used in that case.
Also, we are not yet using Romberg in order to improve the trapezoid rule. This would be quite an improvement in equidistant cases.
the approximate integral.
Martin Maechler, May 1994 (for S).
integrate
for numerical integration of
functions.
x <- 1:4 integrate.xy(x, exp(x)) print(exp(4) - exp(1), digits = 10) # the true integral for(n in c(10, 20,50,100, 200)) { x <- seq(1,4, len = n) cat(formatC(n,wid=4), formatC(integrate.xy(x, exp(x)), dig = 9),"\n") }
x <- 1:4 integrate.xy(x, exp(x)) print(exp(4) - exp(1), digits = 10) # the true integral for(n in c(10, 20,50,100, 200)) { x <- seq(1,4, len = n) cat(formatC(n,wid=4), formatC(integrate.xy(x, exp(x)), dig = 9),"\n") }
Compute a short expression for a given integer vector, typically
an index, that can be expressed shortly, using :
etc.
inv.seq(i)
inv.seq(i)
i |
vector of (usually increasing) integers. |
a call
(“the inside of an
expression
”) to be eval()
ed to
return the original i
.
Martin Maechler, October 1995; more elegant implementation from Tony Plate.
rle
for another kind of integer vector coding.
(rr <- inv.seq(i1 <- c(3:12, 20:24, 27, 30:33))) eval(rr) stopifnot(eval(rr) == i1) e2 <- expression(c(20:13, 3:12, -1:-4, 27, 30:31)) (i2 <- eval(e2)) (r2 <- inv.seq(i2)) stopifnot(all.equal(r2, e2[[1]])) ## Had {mapply()} bug in this example: ii <- c(1:3, 6:9, 11:16) stopifnot(identical(ii, eval(inv.seq(ii))))
(rr <- inv.seq(i1 <- c(3:12, 20:24, 27, 30:33))) eval(rr) stopifnot(eval(rr) == i1) e2 <- expression(c(20:13, 3:12, -1:-4, 27, 30:31)) (i2 <- eval(e2)) (r2 <- inv.seq(i2)) stopifnot(all.equal(r2, e2[[1]])) ## Had {mapply()} bug in this example: ii <- c(1:3, 6:9, 11:16) stopifnot(identical(ii, eval(inv.seq(ii))))
This function tests whether a numeric
or complex
vector
or array consists of whole numbers. The function is.integer
is not appropriate for this since it tests whether the vector is of class
integer
(see examples).
is.whole(x, tolerance = sqrt(.Machine$double.eps))
is.whole(x, tolerance = sqrt(.Machine$double.eps))
x |
|
tolerance |
maximal distance to the next whole number |
The return value has the same dimension as the argument x
: if x
is a vector, the function returns a logical
vector of the same length;
if x
is a matrix or array, the function returns a logical
matrix
or array of the same dimensions. Each entry in the result indicates whether
the corresponding entry in x
is whole.
Alain Hauser <[email protected]>
## Create a random array, matrix, vector set.seed(307) a <- array(runif(24), dim = c(2, 3, 4)) a[4:8] <- 4:8 m <- matrix(runif(12), 3, 4) m[2:4] <- 2:4 v <- complex(real = seq(0.5, 1.5, by = 0.1), imaginary = seq(2.5, 3.5, by = 0.1)) ## Find whole entries is.whole(a) is.whole(m) is.whole(v) ## Numbers of class integer are always whole is.whole(dim(a)) is.whole(length(v))
## Create a random array, matrix, vector set.seed(307) a <- array(runif(24), dim = c(2, 3, 4)) a[4:8] <- 4:8 m <- matrix(runif(12), 3, 4) m[2:4] <- 2:4 v <- complex(real = seq(0.5, 1.5, by = 0.1), imaginary = seq(2.5, 3.5, by = 0.1)) ## Find whole entries is.whole(a) is.whole(m) is.whole(v) ## Numbers of class integer are always whole is.whole(dim(a)) is.whole(length(v))
Generate numeric sequences applying a linear recursion nr.it
times.
iterate.lin.recursion(x, coeff, delta = 0, nr.it)
iterate.lin.recursion(x, coeff, delta = 0, nr.it)
x |
numeric vector with initial values, i.e., specifying
the beginning of the resulting sequence; must be of length (larger
or) equal to |
coeff |
coefficient vector of the linear recursion. |
delta |
numeric scalar added to each term; defaults to 0. If not zero, determines the linear drift component. |
nr.it |
integer, number of iterations. |
numeric vector, say r
, of length n + nr.it
, where
n = length(x)
. Initialized as r[1:n] = x
, the recursion
is r[k+1] = sum(coeff * r[(k-m+1):k])
, where m = length(coeff)
.
Depending on the zeroes of the characteristic polynomial of coeff
,
there are three cases, of convergence, oszillation and divergence.
Martin Maechler
seq
can be regarded as a trivial special case.
## The Fibonacci sequence: iterate.lin.recursion(0:1, c(1,1), nr = 12) ## 0 1 1 2 3 5 8 13 21 34 55 89 144 233 ## seq() as a special case: stopifnot(iterate.lin.recursion(4,1, d=2, nr=20) == seq(4, by=2, length=1+20)) ## ''Deterministic AR(2)'' : round(iterate.lin.recursion(1:4, c(-0.7, 0.9), d = 2, nr=15), dig=3) ## slowly decaying : plot(ts(iterate.lin.recursion(1:4, c(-0.9, 0.95), nr=150)))
## The Fibonacci sequence: iterate.lin.recursion(0:1, c(1,1), nr = 12) ## 0 1 1 2 3 5 8 13 21 34 55 89 144 233 ## seq() as a special case: stopifnot(iterate.lin.recursion(4,1, d=2, nr=20) == seq(4, by=2, length=1+20)) ## ''Deterministic AR(2)'' : round(iterate.lin.recursion(1:4, c(-0.7, 0.9), d = 2, nr=15), dig=3) ## slowly decaying : plot(ts(iterate.lin.recursion(1:4, c(-0.9, 0.95), nr=150)))
Computes the critical value for Kolmogorov-Smirnov's , for
sample sizes
and confidence level 95%.
KSd(n)
KSd(n)
n |
the sample size, |
Based on tables values given in the reference below.
For uses interpolations from exact values, elsewhere
uses asymptotic approximation.
The critical value for D (two-sided) for significance level 0.05 (or confidence level 95%).
Kjetil Halvorsen and Martin Maechler
Peter J. Bickel and Kjell A. Doksum (1977), Mathematical Statistics: Basic Ideas and Selected Topics. Holden Day. Section 9.6 and table IX.
Is used from ecdf.ksCI
.
KSd(90) KSd(1:9)# now works op <- par(mfrow=c(2,1)) plot(KSd, 10, 150)# nice abline(v = c(75,85), col = "gray") plot(KSd, 79, 81, n = 1001)# *very* tiny discontinuity at 80 par(op)
KSd(90) KSd(1:9)# now works op <- par(mfrow=c(2,1)) plot(KSd, 10, 150)# nice abline(v = c(75,85), col = "gray") plot(KSd, 79, 81, n = 1001)# *very* tiny discontinuity at 80 par(op)
Extract the last elements of a vector.
last(x, length.out = 1, na.rm = FALSE)
last(x, length.out = 1, na.rm = FALSE)
x |
any vector. |
length.out |
integer indicating how many element are desired. If
positive, return the |
na.rm |
logical indicating if the last non-missing value (if any)
shall be returned. By default (it is |
a vector of length abs(length.out)
of last values from x
.
This function may eventually be deprecated for the standard R
function tail()
.
Useful for the turnogram()
function in package
pastecs.
Werner Stahel ([email protected]), and independently, Philippe Grosjean ([email protected]), Frédéric Ibanez ([email protected]).
a <- c(NA, 1, 2, NA, 3, 4, NA) last(a) last(a, na.rm=TRUE) last(a, length = 2) last(a, length = -3)
a <- c(NA, 1, 2, NA, 3, 4, NA) last(a) last(a, na.rm=TRUE) last(a, length = 2) last(a, length = -3)
Add confidence/prediction hyperbolas for
to a plot with data or regression line.
linesHyperb.lm(object, c.prob=0.95, confidence=FALSE, k=if (confidence) Inf else 1, col=2, lty=2, do.abline=TRUE)
linesHyperb.lm(object, c.prob=0.95, confidence=FALSE, k=if (confidence) Inf else 1, col=2, lty=2, do.abline=TRUE)
object |
result of |
c.prob |
coverage probability in |
confidence |
logical; if true, do (small) confidence band, else,
realistic prediction band for the mean of |
k |
integer or |
col , lty
|
attributes for the |
do.abline |
logical; if true, the regression line is drawn as well. |
With predict.lm(*, interval=)
is available,
this function linesHyperb.lm
is only slightly more general for
its k
argument.
Martin Maechler, Oct 1995
predict.lm(*, interval=)
optionally computes
prediction or confidence intervals.
data(swiss) plot(Fertility ~ Education, data = swiss) # the data (lmS <- lm(Fertility ~ Education, data = swiss)) linesHyperb.lm(lmS) linesHyperb.lm(lmS, conf=TRUE, col="blue")
data(swiss) plot(Fertility ~ Education, data = swiss) # the data (lmS <- lm(Fertility ~ Education, data = swiss)) linesHyperb.lm(lmS) linesHyperb.lm(lmS, conf=TRUE, col="blue")
A version of list(...)
, but with “automatically”
named list components.
list_(...)
list_(...)
... |
components to make up the resulting |
The names are extracted from sys.call()
, and the function
is written to be fast (rather than easy to ready for the uninitiated ;-)
a list
with the components in the arguments with
names
taken from their call to list_(..)
.
Martin Maechler
a <- 1:4; lett <- letters[1:9]; CH <- "Suisse" all.equal(list (a, lett), list_(a, lett)) # "names for current but not for target" str(list(a, lett, CH)) # [[1]], [[2]], .. (no names) str(list_(a, lett, CH))# $a $lett .. stopifnot(identical( list (a, lett, CH), unname(L <- list_(a, lett, CH))), is.list(L), names(L) == c("a", "lett", "CH"), identical(lett, L$lett) ## etc ) ## The function is currently defined as function (...) `names<-`(list(...), vapply(sys.call()[-1L], as.character, ""))
a <- 1:4; lett <- letters[1:9]; CH <- "Suisse" all.equal(list (a, lett), list_(a, lett)) # "names for current but not for target" str(list(a, lett, CH)) # [[1]], [[2]], .. (no names) str(list_(a, lett, CH))# $a $lett .. stopifnot(identical( list (a, lett, CH), unname(L <- list_(a, lett, CH))), is.list(L), names(L) == c("a", "lett", "CH"), identical(lett, L$lett) ## etc ) ## The function is currently defined as function (...) `names<-`(list(...), vapply(sys.call()[-1L], as.character, ""))
A graphical and interactive demonstration and visualization of how
loess
works. By clicking on the graphic, the user
determines the current estimation window which is visualized together
with the weights.
loessDemo(x, y, span = 1/2, degree = 1, family = c("gaussian", "symmetric"), nearest = FALSE, nout = 501, xlim = numeric(0), ylim = numeric(0), strictlim = TRUE, verbose = TRUE, inch.sym = 0.25, pch = 4, shade = TRUE, w.symbols = TRUE, sym.col = "blue", w.col = "light blue", line.col = "steelblue")
loessDemo(x, y, span = 1/2, degree = 1, family = c("gaussian", "symmetric"), nearest = FALSE, nout = 501, xlim = numeric(0), ylim = numeric(0), strictlim = TRUE, verbose = TRUE, inch.sym = 0.25, pch = 4, shade = TRUE, w.symbols = TRUE, sym.col = "blue", w.col = "light blue", line.col = "steelblue")
x , y
|
numeric vectors of the same length; the demo is about
|
span |
the smoothing parameter |
degree |
the degree of the polynomials to be used; must be in |
family |
if |
nearest |
logical indicating how |
nout |
the number of points at which to evaluate, i.e,
determining |
xlim |
x-range; to extend or determine (iff |
ylim |
y-range; to extend or determine (iff |
strictlim |
logical determining if |
verbose |
logical ...... |
inch.sym |
symbol size in inches of the maximal weight circle symbol. |
pch |
plotting character, see |
shade |
logical; if true, |
w.symbols |
logical indicating if the non-zero weights should be
visualized by circles with radius proportional to |
sym.col , w.col , line.col
|
colors for the symbols, weights and lines, respectively. |
As function loess.demo()
, written and posted to S-news, on 27
Sep 2001, by Greg Snow, Brigham Young University,
it was modified by Henrik Aa. Nielsen, IMM, DTU,
and subsequently spiffed up for R by Martin Maechler.
if(dev.interactive()) { if(requireNamespace("lattice")) { data("ethanol", package = "lattice") attach(ethanol) loessDemo(E,NOx, span=.25) loessDemo(E,NOx, span=.25, family = "symmetric") loessDemo(E,NOx, degree=0)# Tricube Kernel estimate } ## Artificial Example with one outlier n2 <- 50; x <- 1:(1+2*n2) fx <- (x/10 - 5)^2 y <- fx + 4*rnorm(x) y[n2+1] <- 1e4 loessDemo(x,y, span=1/3, ylim= c(0,1000))# not robust !! loessDemo(x,y, span=1/3, family = "symm") loessDemo(x,y, span=1/3, family = "symm", w.symb = FALSE, ylim = c(0,40)) loessDemo(x,y, span=1/3, family = "symm", ylim = c(0,40)) ## but see warnings() --- there's a "fixup" }
if(dev.interactive()) { if(requireNamespace("lattice")) { data("ethanol", package = "lattice") attach(ethanol) loessDemo(E,NOx, span=.25) loessDemo(E,NOx, span=.25, family = "symmetric") loessDemo(E,NOx, degree=0)# Tricube Kernel estimate } ## Artificial Example with one outlier n2 <- 50; x <- 1:(1+2*n2) fx <- (x/10 - 5)^2 y <- fx + 4*rnorm(x) y[n2+1] <- 1e4 loessDemo(x,y, span=1/3, ylim= c(0,1000))# not robust !! loessDemo(x,y, span=1/3, family = "symm") loessDemo(x,y, span=1/3, family = "symm", w.symb = FALSE, ylim = c(0,40)) loessDemo(x,y, span=1/3, family = "symm", ylim = c(0,40)) ## but see warnings() --- there's a "fixup" }
Generate sequences which are equidistant on a log-scale.
lseq(from, to, length)
lseq(from, to, length)
from |
starting value of sequence. |
to |
end value of the sequence. |
length |
desired length of the sequence. |
a numeric
vector of length length
.
seq
.
(x <- lseq(1, 990, length= 21)) plot(x, x^4, type = "b", col = 2, log = "xy") if(with(R.version, major >= 2 && minor >= 1)) plot(x, exp(x), type = "b", col = 2, log = "xy")
(x <- lseq(1, 990, length= 21)) plot(x, x^4, type = "b", col = 2, log = "xy") if(with(R.version, major >= 2 && minor >= 1)) plot(x, exp(x), type = "b", col = 2, log = "xy")
“Translate” an R matrix (like object) into a LaTeX table,
using \begin{tabular} ...
.
mat2tex(x, file= "mat.tex", envir = "tabular", nam.center = "l", col.center = "c", append = TRUE, digits = 3, title)
mat2tex(x, file= "mat.tex", envir = "tabular", nam.center = "l", col.center = "c", append = TRUE, digits = 3, title)
x |
a matrix |
file |
names the file to which LaTeX commands should be written |
envir |
a string, the LaTeX environment name; default is
|
nam.center |
character specifying row names should be center;
default |
col.center |
character (vector) specifying how the columns should
be centered; must have values from |
append |
logical; if |
digits |
integer; setting of |
title |
a string, possibly using LaTeX commands, which will span the columns of the LaTeX matrix |
No value is returned. This function, when used correctly, only writes LaTeX commands to a file.
For S: Vincent Carey [email protected], from a post on Feb.19, 1991 to S-news. Port to R (and a bit more) by Martin Maechler [email protected].
latex
in package Hmisc is more flexible
(but may surprise by its auto-printing ..).
mex <- matrix(c(pi,pi/2,pi/4,exp(1),exp(2),exp(3)),nrow=2, byrow=TRUE, dimnames = list(c("$\\pi$","$e$"), c("a","b","c"))) mat2tex(mex, file = print(tf <- tempfile("mat", , ".tex")), title="$\\pi, e$, etc." ) ## The last command produces the file "mat<xyz>.tex" containing ##> \begin{tabular} {| l|| c| c| c|} ##> \multicolumn{ 4 }{c}{ $\pi, e$, etc. } \\ \hline ##> \ & a & b & c \\ \hline \hline ##> $\pi$ & 3.14 & 1.57 & 0.785 \\ \hline ##> $e$ & 2.72 & 7.39 & 20.1 \\ \hline ##> \end{tabular} ## Now you have to properly embed the contents of this file ## in a LaTeX document -- for example, you will need a ## preamble, the \begin{document} statement, etc. ## Note that the backslash needs protection in dimnames ## or title actions. mat2tex(mex, stdout(), col.center = c("r","r","c"))
mex <- matrix(c(pi,pi/2,pi/4,exp(1),exp(2),exp(3)),nrow=2, byrow=TRUE, dimnames = list(c("$\\pi$","$e$"), c("a","b","c"))) mat2tex(mex, file = print(tf <- tempfile("mat", , ".tex")), title="$\\pi, e$, etc." ) ## The last command produces the file "mat<xyz>.tex" containing ##> \begin{tabular} {| l|| c| c| c|} ##> \multicolumn{ 4 }{c}{ $\pi, e$, etc. } \\ \hline ##> \ & a & b & c \\ \hline \hline ##> $\pi$ & 3.14 & 1.57 & 0.785 \\ \hline ##> $e$ & 2.72 & 7.39 & 20.1 \\ \hline ##> \end{tabular} ## Now you have to properly embed the contents of this file ## in a LaTeX document -- for example, you will need a ## preamble, the \begin{document} statement, etc. ## Note that the backslash needs protection in dimnames ## or title actions. mat2tex(mex, stdout(), col.center = c("r","r","c"))
missingCh
can be used to test whether a value was specified
as an argument to a function. Very much related to the standard R
function missing
, here the argument is given by its
name, a character string.
As missingCh()
calls missing()
, do consider the
caveats about the latter, see missing
.
missingCh(x, envir = parent.frame())
missingCh(x, envir = parent.frame())
x |
a |
envir |
a (function evaluation) |
a logical
indicating if the argument named x
is
missing
in the function “above”, typically the
caller of missingCh
, but see the use of envir
in the
vapply
example.
Martin Maechler
tst1 <- function(a, b, dd, ...) ## does not work an with argument named 'c' ! c(b = missingCh("b"), dd = missingCh("dd")) tst1(2)#-> both 'b' and 'dd' are missing tst1(,3,,3) ## b dd ## FALSE TRUE -- as 'b' is not missing but 'dd' is. Tst <- function(a,b,cc,dd,EEE, ...) vapply(c("a","b","cc","dd","EEE"), missingCh, NA, envir=environment()) Tst() ## TRUE ... TRUE -- as all are missing() Tst(1,,3) ## a b cc dd EEE ## FALSE TRUE FALSE TRUE TRUE ## ..... ..... ## as 'a' and 'cc' where not missing() ## Formal testing: stopifnot(tst1(), !tst1(,3,3), Tst(), Tst(1,,3, b=2, E="bar") == c(0,0,1,0,0)) ## maybe surprising that this ^^ becomes 'dd' and only 'cc' is missing
tst1 <- function(a, b, dd, ...) ## does not work an with argument named 'c' ! c(b = missingCh("b"), dd = missingCh("dd")) tst1(2)#-> both 'b' and 'dd' are missing tst1(,3,,3) ## b dd ## FALSE TRUE -- as 'b' is not missing but 'dd' is. Tst <- function(a,b,cc,dd,EEE, ...) vapply(c("a","b","cc","dd","EEE"), missingCh, NA, envir=environment()) Tst() ## TRUE ... TRUE -- as all are missing() Tst(1,,3) ## a b cc dd EEE ## FALSE TRUE FALSE TRUE TRUE ## ..... ..... ## as 'a' and 'cc' where not missing() ## Formal testing: stopifnot(tst1(), !tst1(,3,3), Tst(), Tst(1,,3, b=2, E="bar") == c(0,0,1,0,0)) ## maybe surprising that this ^^ becomes 'dd' and only 'cc' is missing
Do simple matrix plots, providing an easy interface to
matplot
by using a default x variable.
mpl(mat, ...) p.m(mat, ...)
mpl(mat, ...) p.m(mat, ...)
mat |
numeric matrix. |
... |
further arguments passed to |
p.m(m)
use the first column of m
as variable,
whereas
mpl(m)
uses the integers 1, 2, ..., nrow(m)
as coordinates and rownames(m)
as axis labels if possible.
These were really created for playing around with curves etc, and
probably should be deprecated since in concrete examples, using
matplot()
directly is more appropriate.
Martin Maechler
matplot
,
plot.mts(*, plot.type = "single")
.
data(animals, package = "cluster") mpl(animals, type = "l")
data(animals, package = "cluster") mpl(animals, type = "l")
Easy Setup for plotting multiple figures (in a rectangular layout) on
one page. It allows to specify a main title and uses smart
defaults for several par
calls.
mult.fig(nr.plots, mfrow, mfcol, marP = rep(0, 4), mgp = c(if(par("las") != 0) 2. else 1.5, 0.6, 0), mar = marP + 0.1 + c(4,4,2,1), oma = c(0,0, tit.wid, 0), main = NULL, tit.wid = if (is.null(main)) 0 else 1 + 1.5*cex.main, cex.main = par("cex.main"), line.main = cex.main - 1/2, col.main = par("col.main"), font.main = par("font.main"), ...)
mult.fig(nr.plots, mfrow, mfcol, marP = rep(0, 4), mgp = c(if(par("las") != 0) 2. else 1.5, 0.6, 0), mar = marP + 0.1 + c(4,4,2,1), oma = c(0,0, tit.wid, 0), main = NULL, tit.wid = if (is.null(main)) 0 else 1 + 1.5*cex.main, cex.main = par("cex.main"), line.main = cex.main - 1/2, col.main = par("col.main"), font.main = par("font.main"), ...)
nr.plots |
integer; the number of plot figures you'll want to draw. |
mfrow , mfcol
|
instead of |
marP |
numeric(4) vector of figure margins to add
(“Plus”) to default |
mgp |
argument for |
mar |
argument for |
oma |
argument for |
main |
character. The main title to be used for the whole graphic. |
tit.wid |
numeric specifying the vertical width to be used for the
main title; note that this is only used for the default value of
|
cex.main |
numeric; the character size to be used for the main title. |
line.main |
numeric; the margin line at which the title is written
(via |
col.main , font.main
|
color and font for main title, passed to
|
... |
further arguments to |
A list
with two components that are lists themselves, a
subset of par()
,
new.par |
the current |
old.par |
the |
Martin Maechler, UW Seattle, 1990 (for S
).
opl <- mult.fig(5, main= expression("Sine Functions " * sin(n * pi * x))) x <- seq(0, 1, len = 201) for (n in 1:5) plot(x, sin(n * pi * x), ylab ="", main = paste("n = ",n)) par(opl$old.par) rr <- mult.fig(mfrow=c(5,1), main= "Cosinus Funktionen", cex = 1.5, marP = - c(0, 1, 2, 0)) for (n in 1:5) plot(x, cos(n * pi * x), type = 'l', col="red", ylab ="") str(rr) par(rr$old.par) ## The *restored* par settings: str(do.call("par", as.list(names(rr$new.par)))) ## Manual setting of `tit.wid` in case subsequent code also manages par(): mult.fig(4, tit.wid = 2)$old.par -> opar plot(lm(sr ~ pop15 + pop75 + dpi + ddpi, data = LifeCycleSavings)) par(opar) # reset
opl <- mult.fig(5, main= expression("Sine Functions " * sin(n * pi * x))) x <- seq(0, 1, len = 201) for (n in 1:5) plot(x, sin(n * pi * x), ylab ="", main = paste("n = ",n)) par(opl$old.par) rr <- mult.fig(mfrow=c(5,1), main= "Cosinus Funktionen", cex = 1.5, marP = - c(0, 1, 2, 0)) for (n in 1:5) plot(x, cos(n * pi * x), type = 'l', col="red", ylab ="") str(rr) par(rr$old.par) ## The *restored* par settings: str(do.call("par", as.list(names(rr$new.par)))) ## Manual setting of `tit.wid` in case subsequent code also manages par(): mult.fig(4, tit.wid = 2)$old.par -> opar plot(lm(sr ~ pop15 + pop75 + dpi + ddpi, data = LifeCycleSavings)) par(opar) # reset
n.code
convert “round integers” to short character strings.
This is useful to build up variable names in simulations, e.g.
code2n
is the inverse function of n.code()
.
n.code(n, ndig = 1, dec.codes = c("", "d", "c", "k")) code2n(ncod, ndig = 1, dec.codes = c("", "d", "c", "k"))
n.code(n, ndig = 1, dec.codes = c("", "d", "c", "k")) code2n(ncod, ndig = 1, dec.codes = c("", "d", "c", "k"))
n |
integer vector. |
ncod |
character vector, typically resulting from
|
ndig |
integer giving number of digits before the coding character. |
dec.codes |
character code for 1, 10, 100, 1000 (etc). |
n.code(n)
returns a character
vector of the same
length as n
.
code2n(ncod)
returns a integer
vector of the same
length as ncod
.
Usually, code2n(n.code(n)) == n
.
Martin Maechler
n10 <- c(10,20,90, 100,500, 2000,10000) (c10 <- n.code(n10))#-> "1d" "2d" "9d" "1c" .. stopifnot(code2n(c10) == n10)
n10 <- c(10,20,90, 100,500, 2000,10000) (c10 <- n.code(n10))#-> "1d" "2d" "9d" "1c" .. stopifnot(code2n(c10) == n10)
A utility function which basically calls plot(*, type="n")
and text
. To have names or numbers instead of points
in a plot is useful for identifaction, e.g., in a residual plot, see
also TA.plot
.
n.plot(x, y = NULL, nam = NULL, abbr = n >= 20 || max(nchar(nam))>=8, xlab = NULL, ylab = NULL, log = "", cex = par("cex"), col = par("col"), ...)
n.plot(x, y = NULL, nam = NULL, abbr = n >= 20 || max(nchar(nam))>=8, xlab = NULL, ylab = NULL, log = "", cex = par("cex"), col = par("col"), ...)
x , y
|
coordinates at which to plot. If |
nam |
the labels to plot at each (x,y). Per default, these
taken from the data |
abbr |
logical indicating if the |
xlab , ylab
|
labels for the x- and y- axis, the latter being empty by default. |
log |
character specifying if log scaled axes should be used, see
|
cex |
plotting character expansion, see |
col |
color to use for |
... |
further arguments to be passed to the |
invisibly, a character vector with the labels used.
Martin Maechler, since 1992
n.plot(1:20, cumsum(rnorm(20))) data(cars) with(cars, n.plot(speed, dist, cex = 0.8, col = "forest green"))
n.plot(1:20, cumsum(rnorm(20))) data(cars) with(cars, n.plot(speed, dist, cex = 0.8, col = "forest green"))
This function “smoothes” an improper correlation matrix as it can result
from cor
with use="pairwise.complete.obs"
or
hetcor
.
It is deprecated now, in favor of
nearPD()
from package Matrix.
nearcor(R, eig.tol = 1e-6, conv.tol = 1e-07, posd.tol = 1e-8, maxits = 100, verbose = FALSE)
nearcor(R, eig.tol = 1e-6, conv.tol = 1e-07, posd.tol = 1e-8, maxits = 100, verbose = FALSE)
R |
a square symmetric approximate correlation matrix |
eig.tol |
defines relative positiveness of eigenvalues compared to largest, default=1e-6. |
conv.tol |
convergence tolerance for algorithm, default=1.0e-7 |
posd.tol |
tolerance for enforcing positive definiteness, default=1.0e-8 |
maxits |
maximum number of iterations |
verbose |
logical specifying if convergence monitoring should be verbose. |
This implements the algorithm of Higham (2002), then forces symmetry,
then forces positive definiteness using code from
posdefify
. This implementation does not make
use of direct LAPACK access for tuning purposes as in the MATLAB code
of Lucas (2001). The algorithm of Knol DL and ten Berge (1989) (not
implemented here) is more general in (1) that it allows contraints to
fix some rows (and columns) of the matrix and (2) to force the
smallest eigenvalue to have a certain value.
A list
, with components
cor |
resulting correlation matrix |
fnorm |
Froebenius norm of difference of input and output |
iterations |
number of iterations used |
converged |
logical |
Jens Oehlschlägel
See those in posdefify
.
the slightly more flexible nearPD
which also
returns a classed matrix (class dpoMatrix
).
For new code, nearPD()
is really preferred to nearcor()
,
which hence is considered deprecated.
hetcor
, eigen
;
posdefify
for a simpler algorithm.
cat("pr is the example matrix used in Knol DL, ten Berge (1989)\n") pr <- matrix(c(1, 0.477, 0.644, 0.478, 0.651, 0.826, 0.477, 1, 0.516, 0.233, 0.682, 0.75, 0.644, 0.516, 1, 0.599, 0.581, 0.742, 0.478, 0.233, 0.599, 1, 0.741, 0.8, 0.651, 0.682, 0.581, 0.741, 1, 0.798, 0.826, 0.75, 0.742, 0.8, 0.798, 1), nrow = 6, ncol = 6) ncr <- nearcor(pr) nr <- ncr$cor plot(pr[lower.tri(pr)], nr[lower.tri(nr)]); abline(0,1, lty=2) round(cbind(eigen(pr)$values, eigen(nr)$values), 8) cat("The following will fail:\n") try(factanal(cov=pr, factors=2)) cat("and this should work\n") try(factanal(cov=nr, factors=2)) if(require("polycor")) { n <- 400 x <- rnorm(n) y <- rnorm(n) x1 <- (x + rnorm(n))/2 x2 <- (x + rnorm(n))/2 x3 <- (x + rnorm(n))/2 x4 <- (x + rnorm(n))/2 y1 <- (y + rnorm(n))/2 y2 <- (y + rnorm(n))/2 y3 <- (y + rnorm(n))/2 y4 <- (y + rnorm(n))/2 dat <- data.frame(x1, x2, x3, x4, y1, y2, y3, y4) x1 <- ordered(as.integer(x1 > 0)) x2 <- ordered(as.integer(x2 > 0)) x3 <- ordered(as.integer(x3 > 1)) x4 <- ordered(as.integer(x4 > -1)) y1 <- ordered(as.integer(y1 > 0)) y2 <- ordered(as.integer(y2 > 0)) y3 <- ordered(as.integer(y3 > 1)) y4 <- ordered(as.integer(y4 > -1)) odat <- data.frame(x1, x2, x3, x4, y1, y2, y3, y4) xcor <- cor(dat) pcor <- cor(data.matrix(odat)) # cor() no longer works for factors hcor <- hetcor(odat, ML=TRUE, std.err=FALSE)$correlations ncor <- nearcor(hcor)$cor try(factanal(covmat=xcor, factors=2, n.obs=n)) try(factanal(covmat=pcor, factors=2, n.obs=n)) try(factanal(covmat=hcor, factors=2, n.obs=n)) try(factanal(covmat=ncor, factors=2, n.obs=n)) }
cat("pr is the example matrix used in Knol DL, ten Berge (1989)\n") pr <- matrix(c(1, 0.477, 0.644, 0.478, 0.651, 0.826, 0.477, 1, 0.516, 0.233, 0.682, 0.75, 0.644, 0.516, 1, 0.599, 0.581, 0.742, 0.478, 0.233, 0.599, 1, 0.741, 0.8, 0.651, 0.682, 0.581, 0.741, 1, 0.798, 0.826, 0.75, 0.742, 0.8, 0.798, 1), nrow = 6, ncol = 6) ncr <- nearcor(pr) nr <- ncr$cor plot(pr[lower.tri(pr)], nr[lower.tri(nr)]); abline(0,1, lty=2) round(cbind(eigen(pr)$values, eigen(nr)$values), 8) cat("The following will fail:\n") try(factanal(cov=pr, factors=2)) cat("and this should work\n") try(factanal(cov=nr, factors=2)) if(require("polycor")) { n <- 400 x <- rnorm(n) y <- rnorm(n) x1 <- (x + rnorm(n))/2 x2 <- (x + rnorm(n))/2 x3 <- (x + rnorm(n))/2 x4 <- (x + rnorm(n))/2 y1 <- (y + rnorm(n))/2 y2 <- (y + rnorm(n))/2 y3 <- (y + rnorm(n))/2 y4 <- (y + rnorm(n))/2 dat <- data.frame(x1, x2, x3, x4, y1, y2, y3, y4) x1 <- ordered(as.integer(x1 > 0)) x2 <- ordered(as.integer(x2 > 0)) x3 <- ordered(as.integer(x3 > 1)) x4 <- ordered(as.integer(x4 > -1)) y1 <- ordered(as.integer(y1 > 0)) y2 <- ordered(as.integer(y2 > 0)) y3 <- ordered(as.integer(y3 > 1)) y4 <- ordered(as.integer(y4 > -1)) odat <- data.frame(x1, x2, x3, x4, y1, y2, y3, y4) xcor <- cor(dat) pcor <- cor(data.matrix(odat)) # cor() no longer works for factors hcor <- hetcor(odat, ML=TRUE, std.err=FALSE)$correlations ncor <- nearcor(hcor)$cor try(factanal(covmat=xcor, factors=2, n.obs=n)) try(factanal(covmat=pcor, factors=2, n.obs=n)) try(factanal(covmat=hcor, factors=2, n.obs=n)) try(factanal(covmat=ncor, factors=2, n.obs=n)) }
Compute the number of sign changes in the sequence y
.
nr.sign.chg(y)
nr.sign.chg(y)
y |
numeric vector. |
an integer giving the number of sign changes in sequence y
.
Note that going from positive to 0 to positive is not a sign change.
Martin Maechler, 17 Feb 1993.
(y <- c(1:2,1:-1,0:-2)) nr.sign.chg(y)## = 1
(y <- c(1:2,1:-1,0:-2)) nr.sign.chg(y)## = 1
Draws arrows, like the arrows
function, but with
“nice” filled arrow heads.
p.arrows(x1, y1, x2, y2, size = 1, width, fill = 2, ...)
p.arrows(x1, y1, x2, y2, size = 1, width, fill = 2, ...)
x1 , y1
|
coordinates of points from which to draw. |
x2 , y2
|
coordinates of points to which to draw. |
size |
symbol size as a fraction of a character height; default 1. |
width |
width of the arrow head; defaults to .... |
fill |
color for filling the arrow head. |
... |
further arguments passed to |
Andreas Ruckstuhl, 19 May 1994; (cosmetic by MM).
example(arrows, echo = FALSE) #-> x, y, s plot(x,y, main="p.arrows(.)") p.arrows(x[s], y[s], x[s+1], y[s+1], col= 1:3, fill = "dark blue")
example(arrows, echo = FALSE) #-> x, y, s plot(x,y, main="p.arrows(.)") p.arrows(x[s], y[s], x[s+1], y[s+1], col= 1:3, fill = "dark blue")
Plot the date (and time, if required) in German, at the lower right hand margin of your plot.date
p.datum(outer = FALSE, cex = 0.75, ...)
p.datum(outer = FALSE, cex = 0.75, ...)
outer |
logical; passed to |
cex |
non-negative; passed to |
... |
any arguments to |
plot(1) p.datum()
plot(1) p.datum()
These are utilities for pretty plotting of often used parametric densities.
p.dnorm (mu = 0, s = 1, h0.col = "light gray", ms.lines = TRUE, ms.col = "gray", ...) p.dchisq(nu, h0.col = "light gray", ...) p.dgamma(shape, h0.col = "light gray", ...)
p.dnorm (mu = 0, s = 1, h0.col = "light gray", ms.lines = TRUE, ms.col = "gray", ...) p.dchisq(nu, h0.col = "light gray", ...) p.dgamma(shape, h0.col = "light gray", ...)
mu , s
|
numbers, the mean and standard deviation of the normal distribution. |
nu |
positive number, the degrees of freedom |
shape |
number, the |
h0.col |
color specification for the line |
ms.lines |
logical, used for the normal only: should lines be
drawn at the mean and |
ms.col |
color for the |
... |
further parameter passed to |
Werner Stahel et al.
the underlying density functions,
dnorm
, dchisq
, dgamma
.
p.dnorm() p.dnorm(mu=1.5, add = TRUE, ms.lines = FALSE) # add to the plot above p.dchisq(2, main= "Chi^2 Densities -- nu = 2,3,4") p.dchisq(3, add = TRUE, col = "red") p.dchisq(4, add = TRUE, col = "blue") op <- par(mfrow = c(2,2), mgp = c(1.6, 0.6,0), mar = c(3,3,1,1)) for(sh in 1:4) p.dgamma(sh) par(op)
p.dnorm() p.dnorm(mu=1.5, add = TRUE, ms.lines = FALSE) # add to the plot above p.dchisq(2, main= "Chi^2 Densities -- nu = 2,3,4") p.dchisq(3, add = TRUE, col = "red") p.dchisq(4, add = TRUE, col = "blue") op <- par(mfrow = c(2,2), mgp = c(1.6, 0.6,0), mar = c(3,3,1,1)) for(sh in 1:4) p.dgamma(sh) par(op)
Add a horizontal boxplot to the current plot. This is mainly an
auxiliary function for histBxp
, since
boxplot(*, horizontal = TRUE, add = TRUE)
is usually
much preferable to this.
p.hboxp(x, y.lo, y.hi, boxcol = 3, medcol = 2, medlwd = 5, whisklty = 2, staplelty = 1)
p.hboxp(x, y.lo, y.hi, boxcol = 3, medcol = 2, medlwd = 5, whisklty = 2, staplelty = 1)
x |
univariate data set. |
y.lo , y.hi
|
minimal and maximal user coordinates
or |
boxcol , medcol
|
color of the box and the median line. |
medlwd |
line width of median line. |
whisklty , staplelty
|
line types of the whisker and the staple, the latter being used for the outmost non-outliers. |
....
Martin Maechler building on code from Markus and Christian Keller.
boxplot(**, horizontal = TRUE, add= TRUE)
.
## ==> See code in 'histBxp' (.) and example(histBxp) ! ##
## ==> See code in 'histBxp' (.) and example(histBxp) ! ##
Displays a series of plots of the profile t function and the likelihood
profile traces for the parameters in a nonlinear regression model that
has been fitted with nls
and profiled with
profile.nls
.
p.profileTraces(x, cex = 1, subtitle = paste("t-Profiles and traces of ", deparse(attr(x,"summary")$formula)))
p.profileTraces(x, cex = 1, subtitle = paste("t-Profiles and traces of ", deparse(attr(x,"summary")$formula)))
x |
an object of class |
cex |
character expansion, see |
subtitle |
a subtitle to set for the plot. The default now
includes the |
the stats-internal stats:::plot.profile.nls
plot
method just does “the diagonals”.
Andreas Ruckstuhl, R port by Isabelle Flückiger and Marcel Wolbers
profile
, and nls
(which has
unexported profile
and stats:::plot.profile.nls
methods).
require(stats) data(Puromycin) Treat <- Puromycin[Puromycin$state == "treated", ] fm <- nls(rate ~ T1*conc/(T2+conc), data=Treat, start = list(T1=207,T2=0.06)) (pr <- profile(fm)) # quite a few things.. op <- par(mfcol=1:2) plot(pr) # -> 2 'standard' plots par(op) ## ours: p.profileTraces(pr)
require(stats) data(Puromycin) Treat <- Puromycin[Puromycin$state == "treated", ] fm <- nls(rate ~ T1*conc/(T2+conc), data=Treat, start = list(T1=207,T2=0.06)) (pr <- profile(fm)) # quite a few things.. op <- par(mfcol=1:2) plot(pr) # -> 2 'standard' plots par(op) ## ours: p.profileTraces(pr)
Plots a numeric “residual like” variable against two factor covariates, using boxplots.
p.res.2fact(x, y, z, restricted, notch = FALSE, xlab = NULL, ylab = NULL, main = NULL)
p.res.2fact(x, y, z, restricted, notch = FALSE, xlab = NULL, ylab = NULL, main = NULL)
x , y
|
two factors or numeric vectors giving the levels of factors. |
z |
numeric vector of same length as |
restricted |
positive value which truncates the size. The corresponding symbols are marked by stars. |
notch |
logical indicating if the boxplots should be notched, see
|
xlab , ylab
|
axis labels, see |
main |
main title passed to |
if values are restricted, this make use of the auxiliar
function u.boxplot.x
.
Lorenz Gygax [email protected] and Martin Maechler, Jan.95;
starting from p.res.2x()
.
p.res.2x
, boxplot
,
plot.lm
, TA.plot
.
I <- 8; J <- 3; K <- 20 xx <- factor(rep(rep(1:I, rep(K,I)),J)) yy <- factor(rep(1:J, rep(I*K,J))) zz <- rt(I*J*K, df=5) #-- Student t with 5 d.f. p.res.2fact(xx,yy,zz, restr= 4, main= "i.i.d. t <- 5 random |.| <= 4") mtext("p.res.2fact(xx,yy,zz, restr= 4, ..)", line=1, adj=1, outer=TRUE, cex=1) ## Real data data(warpbreaks) (fm1 <- lm(breaks ~ wool*tension, data = warpbreaks)) ## call via formula method of p.res.2x(): p.res.2x(~ ., fm1) # is shorter than, but equivalent to ## p.res.2x(~ wool + tension, fm1) ## or the direct ## with(warpbreaks, p.res.2fact(wool, tension, residuals(fm1))) ## ## whereas this is "transposed": p.res.2x(~ tension+wool, fm1)
I <- 8; J <- 3; K <- 20 xx <- factor(rep(rep(1:I, rep(K,I)),J)) yy <- factor(rep(1:J, rep(I*K,J))) zz <- rt(I*J*K, df=5) #-- Student t with 5 d.f. p.res.2fact(xx,yy,zz, restr= 4, main= "i.i.d. t <- 5 random |.| <= 4") mtext("p.res.2fact(xx,yy,zz, restr= 4, ..)", line=1, adj=1, outer=TRUE, cex=1) ## Real data data(warpbreaks) (fm1 <- lm(breaks ~ wool*tension, data = warpbreaks)) ## call via formula method of p.res.2x(): p.res.2x(~ ., fm1) # is shorter than, but equivalent to ## p.res.2x(~ wool + tension, fm1) ## or the direct ## with(warpbreaks, p.res.2fact(wool, tension, residuals(fm1))) ## ## whereas this is "transposed": p.res.2x(~ tension+wool, fm1)
Plot Residuals, e.g., of a multiple linear regression, against two (predictor) variables, using positively and negatively oriented line segments for positive and negative residuals.
This is a (S3) generic function with a default
and a
formula
method.
p.res.2x(x, ...) ## Default S3 method: p.res.2x(x, y, z, restricted, size = 1, slwd = 1, scol = 2:3, xlab = NULL, ylab = NULL, main = NULL, xlim = range(x), ylim = range(y), ...) ## S3 method for class 'formula' p.res.2x(x = ~., data, main = deparse(substitute(data)), xlab = NULL, ylab = NULL, ...)
p.res.2x(x, ...) ## Default S3 method: p.res.2x(x, y, z, restricted, size = 1, slwd = 1, scol = 2:3, xlab = NULL, ylab = NULL, main = NULL, xlim = range(x), ylim = range(y), ...) ## S3 method for class 'formula' p.res.2x(x = ~., data, main = deparse(substitute(data)), xlab = NULL, ylab = NULL, ...)
x , y
|
numeric vectors of the same length specifying 2
covariates. For the |
z |
numeric vector of same length as |
restricted |
positive value which truncates the size. The corresponding symbols are marked by stars. |
size |
the symbols are scaled so that |
slwd , scol
|
line width and color(s) for the residual
|
xlab , ylab , main
|
axis labels, and title see |
xlim , ylim
|
the basic x- and y- axis extents, see
|
... |
further arguments passed to |
data |
(for the |
Each residual zz[i]
is visualized as line segment centered at
,
, where the
lengths of the segments are proportional to the absolute
values
.
Positive residuals' line segments have slope , and negative ones
slope
, and
scol
is used to use different colors for
negative and positive segments.
The formula interface calls p.res.2fact()
when
both x
and y
are factor
s.
Andreas Ruckstuhl in June 1991 and Martin Maechler, in 1992, '94, 2003-4.
Stahel, W.~A. (2008) Statistische Datenanalyse: Eine Einführung für Naturwissenschaftler, 5. Auflage, Vieweg, Wiesbaden; Paragraph 13.8.r and 13.8.v.
p.res.2fact
,
plot.lm
,
TA.plot
.
xx <- rep(1:10,7) yy <- rep(1:7, rep(10,7)) zz <- rnorm(70) p.res.2x(xx,yy,zz, restricted = 2, main = "i.i.d. N(0,1) random residuals") example(lm.influence, echo = FALSE) op <- mult.fig(2, marP=c(-1,-1,-1,0), main="p.res.2x(*,*, residuals(lm.SR))")$old.par with(LifeCycleSavings, { p.res.2x(pop15, ddpi, residuals(lm.SR), scol=c("red", "blue")) p.res.2x(pop75, dpi, residuals(lm.SR), scol=2:1) }) ## with formula interface: p.res.2x(~ pop15 + ddpi, lm.SR, scol=c("red", "blue")) p.res.2x(~ pop75 + dpi, lm.SR, scol=2:1) par(op) # revert par() settings above
xx <- rep(1:10,7) yy <- rep(1:7, rep(10,7)) zz <- rnorm(70) p.res.2x(xx,yy,zz, restricted = 2, main = "i.i.d. N(0,1) random residuals") example(lm.influence, echo = FALSE) op <- mult.fig(2, marP=c(-1,-1,-1,0), main="p.res.2x(*,*, residuals(lm.SR))")$old.par with(LifeCycleSavings, { p.res.2x(pop15, ddpi, residuals(lm.SR), scol=c("red", "blue")) p.res.2x(pop75, dpi, residuals(lm.SR), scol=2:1) }) ## with formula interface: p.res.2x(~ pop15 + ddpi, lm.SR, scol=c("red", "blue")) p.res.2x(~ pop75 + dpi, lm.SR, scol=2:1) par(op) # revert par() settings above
Give scale conversion factors of three coordinate systems in use for traditional R graphics: use, cm, symbol.
p.scales(unit = relsysize * 2.54 * min(pin), relsysize = 0.05)
p.scales(unit = relsysize * 2.54 * min(pin), relsysize = 0.05)
unit |
length of unit (or x and y units) of symbol coordinates in cm. |
relsysize |
same, as a proportion of the plotting area. |
A numeric 2x2 matrix, with rows named x
and y
, and
columns, named "sy2usr"
and "usr2cm"
which give the
scale conversion factors from ‘symbol’ (as given) to
‘usr’ coordinates and from these to ‘cm’, respectively.
Werner Stahel, 1990; simplification: M.Maechler, 1993, 2004
par("usr")
, of also ("pin")
on which this
is based.
p.scales()
p.scales()
Puts a symbol (pointer) on a plot at each of the specified locations.
p.tachoPlot(x, y, z, angle=c(pi/4,3*pi/4), size, method = c("robust", "sensitive", "rank"), legend = TRUE, show.method = legend, xlab = deparse(substitute(x)), ylab = deparse(substitute(y)), xlim, ylim, ...)
p.tachoPlot(x, y, z, angle=c(pi/4,3*pi/4), size, method = c("robust", "sensitive", "rank"), legend = TRUE, show.method = legend, xlab = deparse(substitute(x)), ylab = deparse(substitute(y)), xlim, ylim, ...)
x , y , z
|
coordinates of points. Numeric vectors of the same length.
Missing values ( |
angle |
numeric vector whose elements give the angles between the horizontal baseline and the minimum and maximum direction of the pointer measured clockwise in radians. |
size |
length of the pointers in cm. |
method |
string specifying the method to calculate the angle of
the pointer. One of The minimum and maximum direction of the pointer corresponds to
min(z) and max(z) if method is |
legend |
logical flag: if |
show.method |
logical flag, defaulting to |
xlab , ylab
|
labels for x and y axis; defaults to the ‘expression’ used in the function call. |
xlim , ylim
|
numeric of length 2, the limits for the x and y axis,
respectively; see |
... |
further arguments to |
A scatter plot of the variables x and y is plotted. The value of the third variable z is given by the direction of a pointer (similar to a tachometer). Observations whose z-coordinate is missing are marked by a dot.
A plot is created on the current graphics device.
Christian Keller, June 1995
data(state) data(USArrests) p.tachoPlot(state.center $x, state.center $y, USArrests[,"UrbanPop"]) data(mtcars) par(mfrow=c(2,2)) ## see the difference between the three methods (not much differ. here!) p.tachoPlot(mtcars$hp, mtcars$disp, mtcars$mpg, method="sens") p.tachoPlot(mtcars$hp, mtcars$disp, mtcars$mpg, method="rank") p.tachoPlot(mtcars$hp, mtcars$disp, mtcars$mpg, method="rob")
data(state) data(USArrests) p.tachoPlot(state.center $x, state.center $y, USArrests[,"UrbanPop"]) data(mtcars) par(mfrow=c(2,2)) ## see the difference between the three methods (not much differ. here!) p.tachoPlot(mtcars$hp, mtcars$disp, mtcars$mpg, method="sens") p.tachoPlot(mtcars$hp, mtcars$disp, mtcars$mpg, method="rank") p.tachoPlot(mtcars$hp, mtcars$disp, mtcars$mpg, method="rob")
For longer time-series, it is sometimes important to spread the time-series plots over several subplots. p.ts(.) does this both automatically, and under manual control.
Actually, this is a generalization of plot.ts
(with different defaults).
p.ts(x, nrplots = max(1, min(8, n %/% 400)), overlap = nk %/% 16, date.x = NULL, do.x.axis = !is.null(date.x), do.x.rug = FALSE, ax.format, main.tit = NULL, ylim = NULL, ylab = "", xlab = "Time", quiet = FALSE, mgp = c(1.25, .5, 0), ...)
p.ts(x, nrplots = max(1, min(8, n %/% 400)), overlap = nk %/% 16, date.x = NULL, do.x.axis = !is.null(date.x), do.x.rug = FALSE, ax.format, main.tit = NULL, ylim = NULL, ylab = "", xlab = "Time", quiet = FALSE, mgp = c(1.25, .5, 0), ...)
x |
timeseries (possibly multivariate) or numeric vector. |
nrplots |
number of sub-plots. Default: in {1..8},
approximately |
overlap |
by how much should subsequent plots overlap. Defaults to about 1/16 of sub-length on each side. |
date.x |
a time “vector” of the same length as |
do.x.axis |
logical specifying if an x axis should be drawn (i.e., tick marks and labels). |
do.x.rug |
logical specifying if |
ax.format |
when |
main.tit |
Main title (over all plots). Defaults to name
of |
ylim |
numeric(2) or NULL; if the former, specifying the y-range for the plots. Defaults to a common pretty range. |
ylab , xlab
|
labels for y- and x-axis respectively, see
description in |
quiet |
logical; if |
mgp |
|
... |
further graphic parameters for each |
A page of nrplots
subplots is drawn on the current
graphics device.
Martin Maechler, [email protected]; July 1994 (for S).
p.ts()
calls mult.fig()
for setup.
Further, plot.ts
and plot
.
stopifnot(require(stats)) ## stopifnot(require(datasets)) data(sunspots) p.ts(sunspots, nr=1) # == usual plot.ts(..) p.ts(sunspots) p.ts(sunspots, nr=3, col=2) data(EuStockMarkets) p.ts(EuStockMarkets[,"SMI"]) ## multivariate : p.ts(log10(EuStockMarkets), col = 2:5) ## with Date - x-axis (dense random dates): set.seed(12) x <- as.Date("2000-02-29") + cumsum(1+ rpois(1000, lambda= 2.5)) z <- cumsum(.1 + 2*rt(1000, df=3)) p.ts(z, 4, date.x = x) p.ts(z, 6, date.x = x, ax.format = "%b %Y", do.x.rug = TRUE)
stopifnot(require(stats)) ## stopifnot(require(datasets)) data(sunspots) p.ts(sunspots, nr=1) # == usual plot.ts(..) p.ts(sunspots) p.ts(sunspots, nr=3, col=2) data(EuStockMarkets) p.ts(EuStockMarkets[,"SMI"]) ## multivariate : p.ts(log10(EuStockMarkets), col = 2:5) ## with Date - x-axis (dense random dates): set.seed(12) x <- as.Date("2000-02-29") + cumsum(1+ rpois(1000, lambda= 2.5)) z <- cumsum(.1 + 2*rt(1000, df=3)) p.ts(z, 4, date.x = x) p.ts(z, 6, date.x = x, ax.format = "%b %Y", do.x.rug = TRUE)
A simple utility for displaying simple S vectors; can be used as debugging utility.
paste.vec(name, digits = options()$digits)
paste.vec(name, digits = options()$digits)
name |
string with an variable name which must exist in the current environment (R session). |
digits |
how many decimal digits to be used; passed to
|
a string of the form "NAME = x1 x2 ..."
Martin Maechler, about 1992.
x <- 1:4 paste.vec(x) ##-> "x = 1 2 3 4"
x <- 1:4 paste.vec(x) ##-> "x = 1 2 3 4"
a simple “version”, or wrapper for
packageDescription()
, returning a named character vector,
including "file"
, and still has a useful print()
method.
pkgDesc (pkg, lib.loc = NULL, fields = NULL, ...) pkgBuilt(pkg, lib.loc = NULL, ...)
pkgDesc (pkg, lib.loc = NULL, fields = NULL, ...) pkgBuilt(pkg, lib.loc = NULL, ...)
pkg |
a |
lib.loc |
library location to find the package in; the default
|
fields |
a character vector (or |
... |
further optional arguments passed to |
a named character
vector, with names
, the
fields, identical to the names of the list
returned
by packageDescription
, plus its "file"
attribute.
Additionally the resulting vector is of class "Dlist"
which
activates a useful print()
method.
The file
is always returned; not the least that the author
wants to see it quite often as his .libPaths()
is
non-trivial and typically longer than 4 entries.
Martin Maechler, Jan. 2021
packageDescription
,
.libPaths
.
str(pd <- pkgDesc("sfsmisc")) pd[c("Date","Packaged", "Built","file")] pkgBuilt("sfsmisc") ## Show "Built" (and "file") for all packages whose namespaces are loaded: lNs <- loadedNamespaces() mlNs <- sapply(lNs, pkgBuilt) t(mlNs) # typically prints nicely pkgs <- c("grid", "lattice", "MASS", "Matrix", "nlme", "lme4", "sfsmisc") pkgs <- c("foobar", "barbar", pkgs, "kitty") # + names that typically don't exist pkgsOk <- basename(find.package(pkgs, quiet=TRUE)) mpkg <- sapply(pkgsOk, pkgBuilt) stopifnot(is.matrix(mpkg), nrow(mpkg) == 2) mpkg["Built",]
str(pd <- pkgDesc("sfsmisc")) pd[c("Date","Packaged", "Built","file")] pkgBuilt("sfsmisc") ## Show "Built" (and "file") for all packages whose namespaces are loaded: lNs <- loadedNamespaces() mlNs <- sapply(lNs, pkgBuilt) t(mlNs) # typically prints nicely pkgs <- c("grid", "lattice", "MASS", "Matrix", "nlme", "lme4", "sfsmisc") pkgs <- c("foobar", "barbar", pkgs, "kitty") # + names that typically don't exist pkgsOk <- basename(find.package(pkgs, quiet=TRUE)) mpkg <- sapply(pkgsOk, pkgBuilt) stopifnot(is.matrix(mpkg), nrow(mpkg) == 2) mpkg["Built",]
List some system level information about the compiled code library,
typically its dependencies, for R packages with compiled code; for
Unix-alikes or more generally when cmd
is installed locally.
pkgLibs(pkg, cmd = if(Sys.info()[["sysname"]] == "Darwin") "otool -L" else "ldd")
pkgLibs(pkg, cmd = if(Sys.info()[["sysname"]] == "Darwin") "otool -L" else "ldd")
pkg |
|
cmd |
a |
Note that there seems some language confusion as “DLL” on Windows is also used for “Dynamic-link Library” and Wikipedia warns about confusing the two concepts (“dynamically loaded ..” vs “dynamic-link ..”).
a named list
with one entry per package in pkg
, the
names
being the directory / folder names of the
corresponding pkgs from pkg
.
The exact structure of such entries is currently subject to change and you should not rely on its exact format for now.
Martin Maechler
‘Dynamic Loading’ on Wikipedia, https://en.wikipedia.org/wiki/Dynamic_loading
On Windows, “DLL” is also used for Dynamic-link library, https://en.wikipedia.org/wiki/Dynamic-link_library.
man ldd
from a terminal on a valid OS.
dyn.load()
,
library.dynam()
, and getLoadedDLLs()
.
Also, .C
, .Call
which use such DLLs.
# for the example only using standard R packages : myPkgs <- c("stats", "MASS", "rpart", "Matrix") pl <- pkgLibs(myPkgs) pl stopifnot(exprs = { is.list(pl) length(pl) == length(myPkgs) is.character(pkgD <- names(pl)) }) ## Have seen this failing when a strange development version of "Matrix" was picked up: try( stopifnot( dir.exists(pkgD)) )
# for the example only using standard R packages : myPkgs <- c("stats", "MASS", "rpart", "Matrix") pl <- pkgLibs(myPkgs) pl stopifnot(exprs = { is.list(pl) length(pl) == length(myPkgs) is.character(pkgD <- names(pl)) }) ## Have seen this failing when a strange development version of "Matrix" was picked up: try( stopifnot( dir.exists(pkgD)) )
For one-dimensional nonparametric regression, plot the data and fitted values, typically a smooth function, and optionally use segments to visualize the residuals.
plotDS(x, yd, ys, xlab = "", ylab = "", ylim = rrange(c(yd, ys)), xpd = TRUE, do.seg = TRUE, seg.p = 0.95, segP = list(lty = 2, lwd = 1, col = 2), linP = list(lty = 1, lwd = 2.5, col = 3), ...)
plotDS(x, yd, ys, xlab = "", ylab = "", ylim = rrange(c(yd, ys)), xpd = TRUE, do.seg = TRUE, seg.p = 0.95, segP = list(lty = 2, lwd = 1, col = 2), linP = list(lty = 1, lwd = 2.5, col = 3), ...)
x , yd , ys
|
numeric vectors all of the same length, representing
Alternatively, |
xlab , ylab
|
x- and y- axis labels, as in |
ylim |
limits of y-axis to be used; defaults to a robust range of the values. |
xpd |
see |
do.seg |
logical indicating if residual segments should be drawn,
at |
seg.p |
segment percentage of segments to be drawn, from
|
segP |
list with named components |
linP |
list with named components |
... |
further arguments passed to |
Non-existing components in the lists segP
or linP
will result in the par
defaults to be used.
plotDS()
used to be called pl.ds
up to November 2007.
Martin Maechler, since 1990
seqXtend()
to construct more smooth ys
“objects”.
data(cars) x <- cars$speed yd <- cars$dist ys <- lowess(x, yd, f = .3)$y plotDS(x, yd, ys) ## More interesting : Version of example(Theoph) data(Theoph) Th4 <- subset(Theoph, Subject == 4) ## just for "checking" purposes -- permute the observations: Th4 <- Th4[sample(nrow(Th4)), ] fm1 <- nls(conc ~ SSfol(Dose, Time, lKe, lKa, lCl), data = Th4) ## Simple plotDS(Th4$Time, Th4$conc, fitted(fm1), sub = "Theophylline data - Subject 4 only", segP = list(lty=1,col=2), las = 1) ## Nicer: Draw the smoother not only at x = x[i] (observations): xsm <- unique(sort(c(Th4$Time, seq(0, 25, length = 201)))) ysm <- c(predict(fm1, newdata = list(Time = xsm))) plotDS(Th4$Time, Th4$conc, ys = list(x=xsm, y=ysm), sub = "Theophylline data - Subject 4 only", segP = list(lwd=2), las = 1)
data(cars) x <- cars$speed yd <- cars$dist ys <- lowess(x, yd, f = .3)$y plotDS(x, yd, ys) ## More interesting : Version of example(Theoph) data(Theoph) Th4 <- subset(Theoph, Subject == 4) ## just for "checking" purposes -- permute the observations: Th4 <- Th4[sample(nrow(Th4)), ] fm1 <- nls(conc ~ SSfol(Dose, Time, lKe, lKa, lCl), data = Th4) ## Simple plotDS(Th4$Time, Th4$conc, fitted(fm1), sub = "Theophylline data - Subject 4 only", segP = list(lty=1,col=2), las = 1) ## Nicer: Draw the smoother not only at x = x[i] (observations): xsm <- unique(sort(c(Th4$Time, seq(0, 25, length = 201)))) ysm <- c(predict(fm1, newdata = list(Time = xsm))) plotDS(Th4$Time, Th4$conc, ys = list(x=xsm, y=ysm), sub = "Theophylline data - Subject 4 only", segP = list(lwd=2), las = 1)
Plots a step function
f(x)= , i.e., a piecewise constant function of one variable.
With one argument, plots the empirical cumulative distribution
function.
plotStep(ti, y, cad.lag = TRUE, verticals = !cad.lag, left.points= cad.lag, right.points= FALSE, end.points= FALSE, add = FALSE, pch = par('pch'), xlab=deparse(substitute(ti)), ylab=deparse(substitute(y)), main=NULL, ...)
plotStep(ti, y, cad.lag = TRUE, verticals = !cad.lag, left.points= cad.lag, right.points= FALSE, end.points= FALSE, add = FALSE, pch = par('pch'), xlab=deparse(substitute(ti)), ylab=deparse(substitute(y)), main=NULL, ...)
ti |
numeric vector = |
y |
numeric vector |
cad.lag |
logical: Draw 'cad.lag', i.e., “continue à droite, limite à gauche”. Default = TRUE. |
verticals |
logical: Draw vertical lines? Default= |
left.points |
logical: Draw left points? Default= |
right.points |
logical: Draw right points? Default= |
end.points |
logical: Draw 2 end points? Default= |
add |
logical: Add to existing plot? Default= |
pch |
plotting character for points, see |
xlab , ylab
|
labels of x- and y-axis |
main |
main title; defaults to the call' if you do not want a title,
use |
... |
Any valid argument to |
invisibly: List with components t
and y
.
Calls plot(..), points(..), segments(..) appropriately and plots on current graphics device.
Martin Maechler, Seminar for Statistics, ETH Zurich, [email protected], 1991 ff.
The plot
methods plot.ecdf
and
plot.stepfun
in R which are conceptually nicer.
segments(..., method = "constant")
.
##-- Draw an Empirical CDF (and see the default title ..) plotStep(rnorm(15)) plotStep(runif(25), cad.lag=FALSE) plotStep(runif(25), cad.lag=FALSE, add=TRUE, lty = 2) ui <- sort(runif(20)) plotStep(ui, ni <- cumsum(rpois(19, lambda=1.5) - 1.5), cad.lag = FALSE) plotStep(ui, ni, verticals = TRUE, right.points = TRUE) plotStep(rnorm(201), pch = '.') #- smaller points
##-- Draw an Empirical CDF (and see the default title ..) plotStep(rnorm(15)) plotStep(runif(25), cad.lag=FALSE) plotStep(runif(25), cad.lag=FALSE, add=TRUE, lty = 2) ui <- sort(runif(20)) plotStep(ui, ni <- cumsum(rpois(19, lambda=1.5) - 1.5), cad.lag = FALSE) plotStep(ui, ni, verticals = TRUE, right.points = TRUE) plotStep(rnorm(201), pch = '.') #- smaller points
Evaluate one or several univariate polynomials at several locations,
i.e. compute coef[1] + coef[2]*x + ... + coef[p+1]* x^p
(in the simplest case where x
is scalar and coef
a vector).
polyn.eval(coef, x)
polyn.eval(coef, x)
coef |
“numeric” vector or matrix. If a vector, Note that |
x |
“numeric” vector or array. Either |
The stable “Horner rule” is used for evaluation in any case.
When length(coef) == 1L
, polyn.eval(coef, x)
now returns a
vector of length(x)
whereas previously, it just gave the number
coef
independent of x
.
numeric vector or array, depending on input dimensionalities, see above.
Martin Maechler, ages ago.
For much more sophisticated handling of polynomials, use the
polynom package, see, e.g., predict.polynomial
.
For multivariate polynomials (and also for nice interface to the
orthopolynom package), consider the mpoly package.
polyn.eval(c(1,-2,1), x = 0:3)# (x - 1)^2 polyn.eval(c(0, 24, -50, 35, -10, 1), x = matrix(0:5, 2,3))# 5 zeros! (cf <- rbind(diag(3), c(1,-2,1))) polyn.eval(cf, 0:5) x <- seq(-3,7, by=1/4) cf <- 4:1 (px <- polyn.eval(cf, x)) # is exact if((gmpT <-"package:gmp" %in% search()) || require("gmp")) withAutoprint({ pxq <- polyn.eval(coef = as.bigq(cf, 1), x=x) pxq stopifnot(pxq == px) if(!gmpT) detach("package:gmp") }) if((RmpfrT <-"package:Rmpfr" %in% search()) || require("Rmpfr")) withAutoprint({ pxM <- polyn.eval(coef = mpfr(cf, 80), x=x) # 80 bits accuracy pxM stopifnot(pxM == px) if(!RmpfrT) detach("package:Rmpfr") }) stopifnot(identical(polyn.eval(12, x), rep(12, length(x))), identical(polyn.eval(7, diag(3)), matrix(7, 3,3)))
polyn.eval(c(1,-2,1), x = 0:3)# (x - 1)^2 polyn.eval(c(0, 24, -50, 35, -10, 1), x = matrix(0:5, 2,3))# 5 zeros! (cf <- rbind(diag(3), c(1,-2,1))) polyn.eval(cf, 0:5) x <- seq(-3,7, by=1/4) cf <- 4:1 (px <- polyn.eval(cf, x)) # is exact if((gmpT <-"package:gmp" %in% search()) || require("gmp")) withAutoprint({ pxq <- polyn.eval(coef = as.bigq(cf, 1), x=x) pxq stopifnot(pxq == px) if(!gmpT) detach("package:gmp") }) if((RmpfrT <-"package:Rmpfr" %in% search()) || require("Rmpfr")) withAutoprint({ pxM <- polyn.eval(coef = mpfr(cf, 80), x=x) # 80 bits accuracy pxM stopifnot(pxM == px) if(!RmpfrT) detach("package:Rmpfr") }) stopifnot(identical(polyn.eval(12, x), rep(12, length(x))), identical(polyn.eval(7, diag(3)), matrix(7, 3,3)))
From a matrix m
, construct a "close" positive definite
one.
posdefify(m, method = c("someEVadd", "allEVadd"), symmetric = TRUE, eigen.m = eigen(m, symmetric= symmetric), eps.ev = 1e-07)
posdefify(m, method = c("someEVadd", "allEVadd"), symmetric = TRUE, eigen.m = eigen(m, symmetric= symmetric), eps.ev = 1e-07)
m |
a numeric (square) matrix. |
method |
a string specifying the method to apply; can be abbreviated. |
symmetric |
logical, simply passed to |
eigen.m |
the |
eps.ev |
number specifying the tolerance to use, see Details below. |
We form the eigen decomposition
where is the
diagonal matrix of eigenvalues,
, with decreasing eigenvalues
.
When the smallest eigenvalue are less than
Eps <- eps.ev * abs(lambda[1])
, i.e., negative or “almost
zero”, some or all eigenvalues are replaced by positive
(>= Eps
) values,
.
Then,
is computed
and rescaled in order to keep the original diagonal (where that is
>= Eps
).
a matrix of the same dimensions and the “same” diagonal
(i.e. diag
) as m
but with the property to
be positive definite.
As we found out, there are more sophisticated algorithms to solve
this and related problems. See the references and the
nearPD()
function in the Matrix package.
We consider nearPD()
to also be the successor of this package's nearcor()
.
Martin Maechler, July 2004
Section 4.4.2 of Gill, P.~E., Murray, W. and Wright, M.~H. (1981) Practical Optimization, Academic Press.
Cheng, Sheung Hun and Higham, Nick (1998) A Modified Cholesky Algorithm Based on a Symmetric Indefinite Factorization; SIAM J. Matrix Anal.\ Appl., 19, 1097–1110.
Knol DL, ten Berge JMF (1989) Least-squares approximation of an improper correlation matrix by a proper one. Psychometrika 54, 53–61.
Highham (2002) Computing the nearest correlation matrix - a problem from finance; IMA Journal of Numerical Analysis 22, 329–343.
Lucas (2001) Computing nearest covariance and correlation matrices. A thesis submitted to the University of Manchester for the degree of Master of Science in the Faculty of Science and Engeneering.
eigen
on which the current methods rely.
nearPD()
in the Matrix package.
(Further, the deprecated nearcor()
from this package.)
set.seed(12) m <- matrix(round(rnorm(25),2), 5, 5); m <- 1+ m + t(m); diag(m) <- diag(m) + 4 m posdefify(m) 1000 * zapsmall(m - posdefify(m))
set.seed(12) m <- matrix(round(rnorm(25),2), 5, 5); m <- 1+ m + t(m); diag(m) <- diag(m) + 4 m posdefify(m) 1000 * zapsmall(m - posdefify(m))
Fisher's potato crop data set is of historical interest as an early example of a multi-factor block design.
data(potatoes)
data(potatoes)
A data frame with 64 observations on the following 5 variables.
a factor with levels 1:4
.
a factor with 16 levels A
to H
and
J
to Q
, i.e., LETTERS[1:17][-9]
.
a factor specifying the amount of nitrogen
sulfate (), with the four levels
0,1,2,4
.
a factor specifying the amount of potassium (K,
‘kalium’) sulfate, with the four levels 0,1,2,4
.
a numeric vector giving the yield of potatoes in ...
Bennett, J. H. (1972) Collected Papers of R. A. Fischer vol.~II, 1925-31; The University of Adelaide.
T.Eden and R. A. Fisher (1929) Studies in Crop Variation. VI. Experiments on the Response of the Potato to Potash and Nitrogen. J. Agricultural Science 19, 201–213. Accessible from Bennett (1972), see above.
data(potatoes) ## See the experimental design: with(potatoes, { cat("4 blocks of experiments;", "each does every (nitrogen,potash) combination (aka 'treat'ment) once.", '', sep="\n") print(ftable(table(nitrogen, potash, treat))) print(ftable(tt <- table(pos,potash,nitrogen))) tt[cbind(pos,potash,nitrogen)] <- as.character(treat) cat("The 4 blocks pos = 1, 2, 3, 4:\n") ftable(tt) }) ## First plot: with(potatoes, interaction.plot(potash,nitrogen, response=yield)) ## ANOVAs: summary(aov(yield ~ nitrogen * potash + Error(pos), data = potatoes)) # "==>" can use simply summary(aov(yield ~ nitrogen + potash + pos, data = potatoes)) # and summary(aov(yield ~ nitrogen + potash, data = potatoes))
data(potatoes) ## See the experimental design: with(potatoes, { cat("4 blocks of experiments;", "each does every (nitrogen,potash) combination (aka 'treat'ment) once.", '', sep="\n") print(ftable(table(nitrogen, potash, treat))) print(ftable(tt <- table(pos,potash,nitrogen))) tt[cbind(pos,potash,nitrogen)] <- as.character(treat) cat("The 4 blocks pos = 1, 2, 3, 4:\n") ftable(tt) }) ## First plot: with(potatoes, interaction.plot(potash,nitrogen, response=yield)) ## ANOVAs: summary(aov(yield ~ nitrogen * potash + Error(pos), data = potatoes)) # "==>" can use simply summary(aov(yield ~ nitrogen + potash + pos, data = potatoes)) # and summary(aov(yield ~ nitrogen + potash, data = potatoes))
Produce nice expressions to be used
instead of the scientific notation
"a E<k>"
.
pretty10exp(x, drop.1 = FALSE, sub10 = FALSE, digits = 7, digits.fuzz, off = pmax(10^-digits, 2^-(l10x*log2(10)+1075)), lab.type = c("plotmath","latex"), lab.sep = c("cdot", "times"))
pretty10exp(x, drop.1 = FALSE, sub10 = FALSE, digits = 7, digits.fuzz, off = pmax(10^-digits, 2^-(l10x*log2(10)+1075)), lab.type = c("plotmath","latex"), lab.sep = c("cdot", "times"))
x |
numeric vector (e.g. axis tick locations) |
drop.1 |
logical indicating if |
sub10 |
logical, Special cases: |
digits |
number of digits for mantissa ( |
digits.fuzz |
the old deprecated name for |
off |
a numeric offset in |
lab.type |
a string indicating how the result should look like.
By default, ( |
lab.sep |
character separator between mantissa and exponent for LaTeX labels; it will be prepended with a backslash, i.e., ‘"cdot"’ will use ‘"\cdot"’ |
For the default lab.type = "plotmath"
,
an expression of the same length as x
, typically with elements
of the form a %*% 10 ^ k
.
Exceptions are 0
which is kept simple, if drop.1
is
true and ,
10 ^ k
is used, and if sub10
is not false, a %*% 10 ^ 0
as a
, and a %*% 10 ^ k
as
as the corresponding formatted number a * 10^k
independently of
drop.1
.
Otherwise, a character
vector of the same length as
x
. For lab.type = "latex"
, currently the only
alternative to the default, these strings are LaTeX (math mode)
compatible strings.
If sub10
is set, it will typically be a small number such as 0,
1, or 2. Setting sub10 = TRUE
will be interpreted as
sub10 =1
where resulting exponents will either be
negative or
.
Martin Maechler; Ben Bolker contributed lab.type = "latex"
and lab.sep
.
axTexpr
and eaxis()
which build on
pretty10exp()
, notably the eaxis()
example plots.
The new toLatex.numeric
method which gives very similar
results with option scientific = TRUE
.
Further, axis
, axTicks
.
pretty10exp(-1:3 * 1000) pretty10exp(-1:3 * 1000, drop.1 = TRUE) pretty10exp(c(1,2,5,10,20,50,100,200) * 1e3) pretty10exp(c(1,2,5,10,20,50,100,200) * 1e3, drop.1 = TRUE) set.seed(17); lx <- rlnorm(10, m=8, s=6) pretty10exp(lx, digits = 3) pretty10exp(lx, digits = 3, sub10 = 2) pretty10exp(lx, digits = 3, lab.type="latex") pretty10exp(lx, digits = 3, lab.type="latex", lab.sep="times", sub10=2) ## use regular formatted numbers from 0.03 to 300 : pretty10exp(3*10^(-3:4), sub10 = c(-2,2)) pretty10exp(3*10^(-3:4), sub10 = c(-2,2), lab.type = "l") ax <- 10^(-6:0) - 2e-16 pretty10exp(ax, drop.1=TRUE) # nice for plotting pretty10exp(ax, drop.1=TRUE, sub10=TRUE) pretty10exp(ax, drop.1=TRUE, sub10=c(-2,2)) ## in sfsmisc version <= 1.0-16, no 'digits', ## i.e., implicitly had digits := #{double precision digits} == (dig. <- .Machine$double.digits * log10(2)) # 15.95 pretty10exp(ax, drop.1=TRUE, digits= dig.) # ''ugly'' ## Subnormal numbers x <- sort(c(outer(10^-(323:305), 1:9))); x <- c(x[1]/2, x) tail(x, 12) # nice head(x, 6) # "ugly" (they are multiple's of 2^-1074): head(x, 6) / 2^-1074 # nice head(p0 <- pretty10exp(x, off = 10^-7), 30) # previous behavior {before 'off' existed} str(head(pTen <- lapply(p0, `[[`, 3L))) str(exTen <- sapply(pTen, `[[`, 3L)) # -324 -324 .. head(f0 <- sapply(p0, `[[`, 2L), 17) head(p1 <- pretty10exp(x))# new default str(head(pTen1 <- lapply(p1, `[[`, 3L))) str(exTen1 <- sapply(pTen1, `[[`, 3L)) # -324 -324 .. head(f1 <- sapply(p1, `[[`, 2L), 17) # head(cbind(x, f0, f1, exTen, exTen1), 80) (nEQ <- which(sapply(1:length(p0), function(i) p0[[i]] != p1[[i]]))) cbind(x, f0, f1, exTen, exTen1)[nEQ,] stopifnot(is.finite(f1), 0.5 <= f1, f1 <= 9)
pretty10exp(-1:3 * 1000) pretty10exp(-1:3 * 1000, drop.1 = TRUE) pretty10exp(c(1,2,5,10,20,50,100,200) * 1e3) pretty10exp(c(1,2,5,10,20,50,100,200) * 1e3, drop.1 = TRUE) set.seed(17); lx <- rlnorm(10, m=8, s=6) pretty10exp(lx, digits = 3) pretty10exp(lx, digits = 3, sub10 = 2) pretty10exp(lx, digits = 3, lab.type="latex") pretty10exp(lx, digits = 3, lab.type="latex", lab.sep="times", sub10=2) ## use regular formatted numbers from 0.03 to 300 : pretty10exp(3*10^(-3:4), sub10 = c(-2,2)) pretty10exp(3*10^(-3:4), sub10 = c(-2,2), lab.type = "l") ax <- 10^(-6:0) - 2e-16 pretty10exp(ax, drop.1=TRUE) # nice for plotting pretty10exp(ax, drop.1=TRUE, sub10=TRUE) pretty10exp(ax, drop.1=TRUE, sub10=c(-2,2)) ## in sfsmisc version <= 1.0-16, no 'digits', ## i.e., implicitly had digits := #{double precision digits} == (dig. <- .Machine$double.digits * log10(2)) # 15.95 pretty10exp(ax, drop.1=TRUE, digits= dig.) # ''ugly'' ## Subnormal numbers x <- sort(c(outer(10^-(323:305), 1:9))); x <- c(x[1]/2, x) tail(x, 12) # nice head(x, 6) # "ugly" (they are multiple's of 2^-1074): head(x, 6) / 2^-1074 # nice head(p0 <- pretty10exp(x, off = 10^-7), 30) # previous behavior {before 'off' existed} str(head(pTen <- lapply(p0, `[[`, 3L))) str(exTen <- sapply(pTen, `[[`, 3L)) # -324 -324 .. head(f0 <- sapply(p0, `[[`, 2L), 17) head(p1 <- pretty10exp(x))# new default str(head(pTen1 <- lapply(p1, `[[`, 3L))) str(exTen1 <- sapply(pTen1, `[[`, 3L)) # -324 -324 .. head(f1 <- sapply(p1, `[[`, 2L), 17) # head(cbind(x, f0, f1, exTen, exTen1), 80) (nEQ <- which(sapply(1:length(p0), function(i) p0[[i]] != p1[[i]]))) cbind(x, f0, f1, exTen, exTen1)[nEQ,] stopifnot(is.finite(f1), 0.5 <= f1, f1 <= 9)
Find all prime numbers aka ‘primes’ less than .
Uses an obvious sieve method (and some care), working with
logical
and and integer
s to be quite fast.
primes(n, pSeq = NULL)
primes(n, pSeq = NULL)
n |
a (typically positive integer) number. |
pSeq |
optionally a vector of primes (2,3,5,...) as if from a
|
As the function only uses max(n)
, n
can also be a
vector of numbers.
The famous prime number theorem states that , the
number of primes below
is asymptotically
in the sense that
.
Equivalently, the inverse of , the
-th prime number
is around
; recent results (Pierre Dusart, 1999),
prove that
numeric vector of all prime numbers .
Bill Venables (<= 2001); Martin Maechler gained another 40% speed, carefully working with logicals and integers.
factorize
. For large , use the gmp package
and its
isprime
and nextprime
functions.
(p1 <- primes(100)) system.time(p1k <- primes(1000)) # still lightning fast stopifnot(length(p1k) == 168) system.time(p.e7 <- primes(1e7)) # still only 0.3 sec (2015 (i7)) stopifnot(length(p.e7) == 664579) ## The famous pi(n) := number of primes <= n: pi.n <- approxfun(p.e7, seq_along(p.e7), method = "constant") pi.n(c(10, 100, 1000)) # 4 25 168 plot(pi.n, 2, 1e7, n = 1024, log="xy", axes = FALSE, xlab = "n", ylab = quote(pi(n)), main = quote("The prime number function " ~ pi(n))) eaxis(1); eaxis(2) ## Exploring p(n) := the n-th prime number ~=~ n * pnn(n), where ## pnn(n) := log n + log log n pnn <- function(n) { L <- log(n); L + log(L) } n <- 6:(N <- length(PR <- primes(1e5))) m.pn <- cbind(l.pn = ceiling(n*(pnn(n)-1)), pn = PR[n], u.pn = floor(n*pnn(n))) matplot(n, m.pn, type="l", ylab = quote(p[n]), main = quote(p[n] ~~ "with lower/upper bounds" ~ n*(log(n) + log(log(n)) -(1~"or"~0)))) ## (difference to the lower approximation) / n --> ~ 0.0426 (?) : plot(n, PR[n]/n - (pnn(n)-1), type = 'l', cex = 1/8, log="x", xaxt="n") eaxis(1); abline(h=0, col=adjustcolor(1, 0.5))
(p1 <- primes(100)) system.time(p1k <- primes(1000)) # still lightning fast stopifnot(length(p1k) == 168) system.time(p.e7 <- primes(1e7)) # still only 0.3 sec (2015 (i7)) stopifnot(length(p.e7) == 664579) ## The famous pi(n) := number of primes <= n: pi.n <- approxfun(p.e7, seq_along(p.e7), method = "constant") pi.n(c(10, 100, 1000)) # 4 25 168 plot(pi.n, 2, 1e7, n = 1024, log="xy", axes = FALSE, xlab = "n", ylab = quote(pi(n)), main = quote("The prime number function " ~ pi(n))) eaxis(1); eaxis(2) ## Exploring p(n) := the n-th prime number ~=~ n * pnn(n), where ## pnn(n) := log n + log log n pnn <- function(n) { L <- log(n); L + log(L) } n <- 6:(N <- length(PR <- primes(1e5))) m.pn <- cbind(l.pn = ceiling(n*(pnn(n)-1)), pn = PR[n], u.pn = floor(n*pnn(n))) matplot(n, m.pn, type="l", ylab = quote(p[n]), main = quote(p[n] ~~ "with lower/upper bounds" ~ n*(log(n) + log(log(n)) -(1~"or"~0)))) ## (difference to the lower approximation) / n --> ~ 0.0426 (?) : plot(n, PR[n]/n - (pnn(n)-1), type = 'l', cex = 1/8, log="x", xaxt="n") eaxis(1); abline(h=0, col=adjustcolor(1, 0.5))
printTable2()
prints a 2-way contingency table “with all
bells and whistles” (currently using German labeling).
margin2table()
computes marginals, adds them to the table and
returns a margin2table
object the print method for which adds
text decorations (using "-"
and "|"
).
printTable2(table2, digits = 3) margin2table(x, totName = "sum", name.if.empty=FALSE) ## S3 method for class 'margin2table' print(x, digits = 3, quote = FALSE, right = TRUE, ...)
printTable2(table2, digits = 3) margin2table(x, totName = "sum", name.if.empty=FALSE) ## S3 method for class 'margin2table' print(x, digits = 3, quote = FALSE, right = TRUE, ...)
table2 |
a matrix with non-negative integer entries, i.e. the contingency table. |
x |
a matrix; for |
digits |
Anzahl Dezimalstellen, auf die die Häufigkeiten gerundet werden sollen. |
quote , right
|
logicals passed to |
totName |
string to use as row- and column- name if |
name.if.empty |
logical indicating if the margin “totals” should be named in any case. |
... |
further potential arguments, unused currently. |
margin2table
returns a matrix with added marginals,
i.e., an extra row and column, and is of class "margin2table"
(and "table"
still) which has a nice print method.
printTable2
is just producing output.
Martin Maechler, Feb.1993; then Dec 2003
margin2table(diag(4),,TRUE) m <- diag(3); colnames(m) <- letters[1:3] margin2table(m) margin2table(m / sum(m)) data(HairEyeColor) margin2table(HairEyeColor[,, "Male"]) printTable2(HairEyeColor[,, "Male"]) printTable2(HairEyeColor[,, "Female"])
margin2table(diag(4),,TRUE) m <- diag(3); colnames(m) <- letters[1:3] margin2table(m) margin2table(m / sum(m)) data(HairEyeColor) margin2table(HairEyeColor[,, "Male"]) printTable2(HairEyeColor[,, "Male"]) printTable2(HairEyeColor[,, "Female"])
This is defunct now:
The global DEBUG
has been a cheap precursor to R's
options(verbose= .)
(or a verbose
function argument).
This function prints out its arguments as cat()
does,
additionally printing the name of function in which it's been called —
only when a global variable DEBUG
exists and is
TRUE
.
prt.DEBUG(..., LEVEL = 1)
prt.DEBUG(..., LEVEL = 1)
... |
arguments to be passed to |
LEVEL |
integer (or logical) indicating a debugging level for printing. |
Martin Maechler, originally for S-PLUS.
Closes the PostScript or PDF file
(postscript
,pdf
), openend by a previous
ps.do
(or pdf.latex
, or ...) call, using
dev.off
, and additionally opens a previewer for that
file, unless the previewer is already up. This almost provides
an ‘interactive’ device (like x11
) for
postscript
or pdf
.
ps.end(call.gv= NULL, command = getOption("eps_view"), debug = getOption("verbose")) pdf.end(call.viewer= NULL, command = getOption("pdfviewer"), debug = getOption("verbose"))
ps.end(call.gv= NULL, command = getOption("eps_view"), debug = getOption("verbose")) pdf.end(call.viewer= NULL, command = getOption("pdfviewer"), debug = getOption("verbose"))
call.gv , call.viewer
|
logical, indicating if the postscript or
acrobat reader (e.g., ghostview or |
command |
character, giving a system command for PostScript previewing.
By default, |
debug |
logical; if |
Depends on Unix tools, such as ps
.
Martin Maechler
postscript
, postscript
pdf.do
, ps.do
,
...
if(interactive() ) { myPS <- tempfile("ex", fileext = ".ps") ps.do(myPS) data(sunspots) plot(sunspots) ps.end() tempfile("ex-sun", fileext = ".pdf") -> myPDF pdf.latex(myPDF) plot(sunspots) pdf.end(call. = FALSE) # basically the same as dev.off() } ps.latex(tempfile("ex2", fileext = ".eps")) plot(sunspots) ps.end(call.gv = FALSE) # basically the same as dev.off()
if(interactive() ) { myPS <- tempfile("ex", fileext = ".ps") ps.do(myPS) data(sunspots) plot(sunspots) ps.end() tempfile("ex-sun", fileext = ".pdf") -> myPDF pdf.latex(myPDF) plot(sunspots) pdf.end(call. = FALSE) # basically the same as dev.off() } ps.latex(tempfile("ex2", fileext = ".eps")) plot(sunspots) ps.end(call.gv = FALSE) # basically the same as dev.off()
All functions start a pseudo PostScript or Acrobat preview device, using
postscript
or pdf
, and further registering
the file name for subsequent calls to pdf.end()
or
ps.end()
.
pdf.do(file, paper = "default", width = -1, height = -1, onefile = FALSE, title = NULL, version = "1.4", quiet = FALSE, ...) pdf.latex(file, height = 5 + main.space * 1.25, width = 9.5, main.space=FALSE, lab.space = main.space, paper = "special", title = NULL, lab=c(10, 10, 7), mgp.lab=c(1.6, 0.7, 0), mar=c(4, 4, 0.9, 1.1), ...) ps.do(file, width=-1, height=-1, onefile=FALSE, horizontal=FALSE, title = NULL, ...) ps.latex(file, height = 5 + main.space * 1.25, width = 9.5, main.space=FALSE, lab.space = main.space, paper = "special", title = NULL, lab=c(10, 10, 7), mgp.lab=c(1.6, 0.7, 0), mar=c(4, 4, 0.9, 1.1), ...)
pdf.do(file, paper = "default", width = -1, height = -1, onefile = FALSE, title = NULL, version = "1.4", quiet = FALSE, ...) pdf.latex(file, height = 5 + main.space * 1.25, width = 9.5, main.space=FALSE, lab.space = main.space, paper = "special", title = NULL, lab=c(10, 10, 7), mgp.lab=c(1.6, 0.7, 0), mar=c(4, 4, 0.9, 1.1), ...) ps.do(file, width=-1, height=-1, onefile=FALSE, horizontal=FALSE, title = NULL, ...) ps.latex(file, height = 5 + main.space * 1.25, width = 9.5, main.space=FALSE, lab.space = main.space, paper = "special", title = NULL, lab=c(10, 10, 7), mgp.lab=c(1.6, 0.7, 0), mar=c(4, 4, 0.9, 1.1), ...)
file |
character giving the PostScript/PDF file name to be written. |
height |
device height in inches, |
width |
device width in inches; for this and
|
onefile , horizontal
|
logicals passed to
|
title |
PostScript/PDF (not plot!) title passed to
|
version |
a string describing the PDF version that will be
required to view the output, see |
quiet |
logical specifying that some (informative/warning) messages should not be issued. |
main.space |
logical; if true, leave space for a main title (unusual for LaTeX figures!). |
lab.space |
logical; if true, leave space for x- and y- labels
(by not subtracting from |
paper |
character (or missing), typically |
lab |
integer of length 3, |
mgp.lab |
three decreasing numbers determining space for axis
labeling, see |
mar |
four numbers, indicating marginal space, see
|
... |
arguments passed to |
ps.latex
and pdf.latex
have an additional
LaTeX
flavor,
and just differ by some extra par
settings from the
*.do
siblings: E.g., after ps.do(..)
is called, the graphical parameters c("mar", "mgp", "lab")
are
reset (to values that typically are better than the defaults for LaTeX
figures).
Whereas the defaults for paper
, width
, and height
differ between pdf
and postscript
,
they are set such as to provide very similar functionality, for
the functions ps.do()
and pdf.do()
; e.g., by default,
both use a full plot on portrait-oriented page of the default paper,
as per getOption("papersize")
.pdf.do()
sets the default paper
to "special"
when both width
and height
are specified.
A list with components
old.par |
containing the old |
new.par |
containing the newly set |
Martin Maechler
ps.end
, pdf
, postscript
,
dev.print
.
if(interactive()) { ps.latex("ps.latex-ex.ps", main= TRUE) data(sunspots) plot(sunspots,main=paste("Sunspots Data, n=",length(sunspots)),col="red") ps.end() pdf.latex("pdf.latex-ex.pdf", main= TRUE) data(sunspots) plot(sunspots,main=paste("Sunspots Data, n=",length(sunspots)),col="red") pdf.end() ps.do("ps_do_ex.ps") example(plot.function) ps.end() pdf.do("pdf_do_ex.pdf", width=12, height=5) plot(sunspots, main="Monthly Sunspot numbers (in Zurich, then Tokyo)") pdf.end() }
if(interactive()) { ps.latex("ps.latex-ex.ps", main= TRUE) data(sunspots) plot(sunspots,main=paste("Sunspots Data, n=",length(sunspots)),col="red") ps.end() pdf.latex("pdf.latex-ex.pdf", main= TRUE) data(sunspots) plot(sunspots,main=paste("Sunspots Data, n=",length(sunspots)),col="red") pdf.end() ps.do("ps_do_ex.ps") example(plot.function) ps.end() pdf.do("pdf_do_ex.pdf", width=12, height=5) plot(sunspots, main="Monthly Sunspot numbers (in Zurich, then Tokyo)") pdf.end() }
Determine the quadrant of planar points, i.e. in which of the four parts cut by the x- and y- axis the points lie. Zero values (i.e. points on the axes) are treated as if positive.
quadrant(x, y=NULL)
quadrant(x, y=NULL)
x , y
|
numeric vectors of the same length, or |
numeric vector of same length as x
(if that's a vector) with
values in 1:4
indicating the quadrant number of the
corresponding point.
xy <- as.matrix(expand.grid(x= -7:7, y= -7:7)); rownames(xy) <- NULL (qu <- quadrant(xy)) plot(xy, col = qu+1, main = "quadrant() number", axes = FALSE) abline(h=0, v=0, col="gray") # the x- and y- axis text(xy, lab = qu, col = qu+1, adj = c(1.4,0))
xy <- as.matrix(expand.grid(x= -7:7, y= -7:7)); rownames(xy) <- NULL (qu <- quadrant(xy)) plot(xy, col = qu+1, main = "quadrant() number", axes = FALSE) abline(h=0, v=0, col="gray") # the x- and y- axis text(xy, lab = qu, col = qu+1, adj = c(1.4,0))
These functions provide quasi random numbers or space filling or
low discrepancy sequences in the -dimensional unit cube.
sHalton(n.max, n.min = 1, base = 2, leap = 1) QUnif (n, min = 0, max = 1, n.min = 1, p, leap = 1, silent = FALSE)
sHalton(n.max, n.min = 1, base = 2, leap = 1) QUnif (n, min = 0, max = 1, n.min = 1, p, leap = 1, silent = FALSE)
n.max |
maximal (sequence) number. |
n.min |
minimal sequence number. |
n |
number of |
base |
integer |
min , max
|
lower and upper limits of the univariate intervals.
Must be of length 1 or |
p |
dimensionality of space (the unit cube) in which points are generated. |
leap |
integer indicating (if |
silent |
logical asking to suppress the message about enlarging
the prime table for large |
sHalton(n,m)
returns a numeric vector of length n-m+1
of
values in .
QUnif(n, min, max, n.min, p=p)
generates n-n.min+1
p-dimensional points in returning a numeric matrix
with p columns.
For leap
Kocis and Whiten recommend values of
, and particularly the
for dimensions
up to 400.
Martin Maechler
James Gentle (1998) Random Number Generation and Monte Carlo Simulation; sec.\ 6.3. Springer.
Kocis, L. and Whiten, W.J. (1997) Computational Investigations of Low-Discrepancy Sequences. ACM Transactions of Mathematical Software 23, 2, 266–294.
32*sHalton(20, base=2) stopifnot(sHalton(20, base=3, leap=2) == sHalton(20, base=3)[1+2*(0:9)]) ## ------- a 2D Visualization ------- Uplot <- function(xy, axes=FALSE, xlab="", ylab="", ...) { plot(xy, xaxs="i", yaxs="i", xlim=0:1, ylim=0:1, xpd = FALSE, axes=axes, xlab=xlab, ylab=ylab, ...) box(lty=2, col="gray40") } do4 <- function(n, ...) { op <- mult.fig(4, main=paste("n =", n,": Quasi vs. (Pseudo) Random"), marP=c(-2,-2,-1,0))$old.par on.exit(par(op)) for(i in 1:2) { Uplot(QUnif(n, p=2), main="QUnif", ...) Uplot(cbind(runif(n), runif(n)), main="runif", ...) } } do4(100) do4(500) do4(1000, cex = 0.8, col="slateblue") do4(10000, pch= ".", col="slateblue") do4(40000, pch= ".", col="slateblue")
32*sHalton(20, base=2) stopifnot(sHalton(20, base=3, leap=2) == sHalton(20, base=3)[1+2*(0:9)]) ## ------- a 2D Visualization ------- Uplot <- function(xy, axes=FALSE, xlab="", ylab="", ...) { plot(xy, xaxs="i", yaxs="i", xlim=0:1, ylim=0:1, xpd = FALSE, axes=axes, xlab=xlab, ylab=ylab, ...) box(lty=2, col="gray40") } do4 <- function(n, ...) { op <- mult.fig(4, main=paste("n =", n,": Quasi vs. (Pseudo) Random"), marP=c(-2,-2,-1,0))$old.par on.exit(par(op)) for(i in 1:2) { Uplot(QUnif(n, p=2), main="QUnif", ...) Uplot(cbind(runif(n), runif(n)), main="runif", ...) } } do4(100) do4(500) do4(1000, cex = 0.8, col="slateblue") do4(10000, pch= ".", col="slateblue") do4(40000, pch= ".", col="slateblue")
Read an emacs “Org” table (in file
or text
) by
read.table()
.
read.org.table(file, header = TRUE, skip = 0, encoding = "native", fileEncoding = "", text, quiet=FALSE, ...)
read.org.table(file, header = TRUE, skip = 0, encoding = "native", fileEncoding = "", text, quiet=FALSE, ...)
file |
a file name, a |
header |
logical indicating if the org table has header line (in
the usual |
skip |
integer number of initial lines to skip. |
encoding |
to be used in the main
|
fileEncoding |
if |
text |
instead of |
quiet |
|
... |
further arguments passed to |
TODO: It should be easy to extend read.org.table()
to also
work for some of the proposed Markdown formats for tables.
Please write to maintainer("sfsmisc")
or open a
github issue if you are interested.
Org-Mode Manual on tables, https://orgmode.org/manual/Tables.html
Org tutorial for tables, https://orgmode.org/worg/org-tutorials/tables.html
CRAN package ascii
can write org tables.
read.table
t1 <- " | a | var2 | C | |---+------+-----| | 2 | may | 3.4 | | 7 | feb | 4.7 | " d <- read.org.table(text = t1) d stopifnot(dim(d) == c(2, 3), identical(names(d), c("a", "var2", "C")), d[,"a"] == c(2,7))
t1 <- " | a | var2 | C | |---+------+-----| | 2 | may | 3.4 | | 7 | feb | 4.7 | " d <- read.org.table(text = t1) d stopifnot(dim(d) == c(2, 3), identical(names(d), c("a", "var2", "C")), d[,"a"] == c(2,7))
relErrV()
:Compute the signed relative error componentwise (“vectorized”)
between the target
and current
vectors,
using the absolute error, i.e., the difference
in case the relative error is not well defined, i.e., when target
is zero or infinite.
relErr()
:simply the mean absolute value of the
relative errors between target
and current
vectors;
typically the “same” as
all.equal.numeric(target, vector, tolerance=0, countEQ=TRUE)
.
Currently useful only when both vectors are finite.
relErrV(target, current, eps0 = .Machine$double.xmin) relErr (target, current)
relErrV(target, current, eps0 = .Machine$double.xmin) relErr (target, current)
target |
numeric, possibly scalar. |
current |
numeric vector of |
eps0 |
non-negative number; values |
relErrV()
:a numeric vector of the same length (or array
of the same dimension) as current
.
relErr()
:a single number.
Martin Maechler, originally as part of Matrix package's ‘test-tools.R’.
all.equal.numeric()
is similar in spirit but returns TRUE
or
string containing the mean relative or absolute error.
## relErrV() test example: showing how it works fine with {NA, Inf, 0} : eps <- 1e-4*c(-9, -8, -6, -4, 0.5, 1, 5) target <- c(-1:1, 0, 0, NA, NaN, Inf, -Inf, Inf, 0 , Inf, 1 , -3:3) current <- c(-1:1,1e-7,NaN,NA, 0 , Inf, Inf, 0, Inf, 1, Inf, -3:3+ eps) cbind(target, current, absE = current-target, relE = relErrV(target,current)) -> M ; M stopifnot(exprs = { is.logical(isFr <- is.finite(rF <- M[,"relE"])) target==current | isFr == is.finite(aF <- M[,"absE"]) identical(aF[!isFr] , rF[!isFr]) identical(numeric(), relErrV(numeric(), integer())) # length 0 {used to fail} }) tools::assertError(relErrV(1, numeric()), verbose=TRUE) # no longer allowed ## relErr() is pretty simple --- (possibly too simple, currently) relErr relErr(target, current) # NA (of course) all.equal.numeric(target, current) ## "'is.NA' value mismatch ..." ## comparison after dropping NA's : hasN <- is.na(target) | is.na(current) all.equal(target[!hasN], current[!hasN], tolerance=0) # "Mean abs. diff.: Inf" relErr(target[!hasN], current[!hasN]) # NaN (to improve?) ## comparison after only keeping cases where both are finite: finN <- is.finite(target) & is.finite(current) all.equal(target[finN], current[finN], tol=0) # "Mean abs.d.: 0.000279.." all.equal(target[finN], current[finN], tol=0, countEQ=TRUE) # " " : 0.000239.. relErr(target[finN], current[finN]) # 0.0002392929
## relErrV() test example: showing how it works fine with {NA, Inf, 0} : eps <- 1e-4*c(-9, -8, -6, -4, 0.5, 1, 5) target <- c(-1:1, 0, 0, NA, NaN, Inf, -Inf, Inf, 0 , Inf, 1 , -3:3) current <- c(-1:1,1e-7,NaN,NA, 0 , Inf, Inf, 0, Inf, 1, Inf, -3:3+ eps) cbind(target, current, absE = current-target, relE = relErrV(target,current)) -> M ; M stopifnot(exprs = { is.logical(isFr <- is.finite(rF <- M[,"relE"])) target==current | isFr == is.finite(aF <- M[,"absE"]) identical(aF[!isFr] , rF[!isFr]) identical(numeric(), relErrV(numeric(), integer())) # length 0 {used to fail} }) tools::assertError(relErrV(1, numeric()), verbose=TRUE) # no longer allowed ## relErr() is pretty simple --- (possibly too simple, currently) relErr relErr(target, current) # NA (of course) all.equal.numeric(target, current) ## "'is.NA' value mismatch ..." ## comparison after dropping NA's : hasN <- is.na(target) | is.na(current) all.equal(target[!hasN], current[!hasN], tolerance=0) # "Mean abs. diff.: Inf" relErr(target[!hasN], current[!hasN]) # NaN (to improve?) ## comparison after only keeping cases where both are finite: finN <- is.finite(target) & is.finite(current) all.equal(target[finN], current[finN], tol=0) # "Mean abs.d.: 0.000279.." all.equal(target[finN], current[finN], tol=0, countEQ=TRUE) # " " : 0.000239.. relErr(target[finN], current[finN]) # 0.0002392929
Simple constructors of a constant character string from one character, notably a “blank” string of given string length.
M.M. is now ‘mentally deprecating’ bl.string
in
favor of using repChar()
in all cases.
With R 3.3.0 (May 2016), the new function
strrep()
was introduced; it is faster typically, and more
flexible, e.g. accepting a vector for the 2nd argument.
This (for now informally) deprecates all uses of repChar()
and bl.string()
.
repChar(char, no) bl.string(no)
repChar(char, no) bl.string(no)
char |
single character (or arbitrary string). |
no |
non-negative integer. |
One string, i.e., character(1)
), for bl.string
a
blank string, fulfilling n == nchar(bl.string(n))
.
Martin Maechler, early 1990's (for bl.string
).
r <- sapply(0:8, function(n) ccat(repChar(" ",n), n)) cbind(r) repChar("-", 4) repChar("_", 6) ## it may make sense to a string of more than one character: repChar("-=- ", 6) ## show the very simple function definitions: repChar bl.string
r <- sapply(0:8, function(n) ccat(repChar(" ",n), n)) cbind(r) repChar("-", 4) repChar("_", 6) ## it may make sense to a string of more than one character: repChar("-=- ", 6) ## show the very simple function definitions: repChar bl.string
Rotate planar (xy) points by angle phi
(in radians).
rot2(xy, phi)
rot2(xy, phi)
xy |
numeric 2-column matrix, or coercable to one. |
phi |
numeric scalar, the angle in radians (i.e., |
A two column matrix as xy
, containing the rotated points.
Martin Maechler, Oct.1994
## Rotate three points by 60 degrees : (xy0 <- rbind(c(1,0.5), c(1,1), c(0,1))) (Txy <- rot2(xy0, phi = 60 * pi/180)) plot(xy0, col = 2, type = "b", asp = 1, xlim=c(-1,1), ylim=c(0,1.5), main = "rot2(*, pi/3) : 2d rotation by 60°") points(Txy, col = 3, type = "b") O <- rep(0,2); P2 <- rbind(xy0[2,], Txy[2,]) arrows(O,O,P2[,1],P2[,2], col = "dark gray") xy0 <- .8*rbind(c(1,0), c(.5,.6), c(.7,1), c(1,1), c(.9,.8), c(1,0)) - 0.2 plot(xy0, col= 2, type="b", main= "rot2( <polygon>, pi/4 * 1:7)", asp=1, xlim=c(-1,1),ylim=c(-1,1), lwd= 2, axes = FALSE, xlab="", ylab="") abline(h=0, v=0, col="thistle"); text(1.05, -.05, "x"); text(-.05,1.05, "y") for(phi in pi/4 * 0:7) do.call("arrows",c(list(0,0),rot2(xy0[2,], phi), length=0.1, col="gray40")) for(phi in pi/4 * 1:7) polygon(rot2(xy0, phi = phi), col = 1+phi/(pi/4), border=2, type = "b")
## Rotate three points by 60 degrees : (xy0 <- rbind(c(1,0.5), c(1,1), c(0,1))) (Txy <- rot2(xy0, phi = 60 * pi/180)) plot(xy0, col = 2, type = "b", asp = 1, xlim=c(-1,1), ylim=c(0,1.5), main = "rot2(*, pi/3) : 2d rotation by 60°") points(Txy, col = 3, type = "b") O <- rep(0,2); P2 <- rbind(xy0[2,], Txy[2,]) arrows(O,O,P2[,1],P2[,2], col = "dark gray") xy0 <- .8*rbind(c(1,0), c(.5,.6), c(.7,1), c(1,1), c(.9,.8), c(1,0)) - 0.2 plot(xy0, col= 2, type="b", main= "rot2( <polygon>, pi/4 * 1:7)", asp=1, xlim=c(-1,1),ylim=c(-1,1), lwd= 2, axes = FALSE, xlab="", ylab="") abline(h=0, v=0, col="thistle"); text(1.05, -.05, "x"); text(-.05,1.05, "y") for(phi in pi/4 * 0:7) do.call("arrows",c(list(0,0),rot2(xy0[2,], phi), length=0.1, col="gray40")) for(phi in pi/4 * 1:7) polygon(rot2(xy0, phi = phi), col = 1+phi/(pi/4), border=2, type = "b")
Compute generalized ‘rot13’ character translations or “rotations”
In the distant past, considered as poor man's encryption, such rotations are way too poor nowadays and provided mainly for didactical reasons.
rotn(ch, n = 13)
rotn(ch, n = 13)
ch |
a |
n |
an integer in |
Note that the default n = 13
makes rotn
into
a function that is its own inverse.
Written after having searched for it and found
seqinr::rot13()
which was generalized and rendered more
transparently to my eyes.
a character as ch
, but with each character (which
belongs to letters
or LETTERS
“rotated” by n
(positions in the alphabet).
Martin Maechler
rot2
, a completely different rotation (namely in the
plane aka ).
rotn(c("ABC", "a","b","c"), 1) rotn(c("ABC", "a","b","c"), 2) rotn(c("ABC", "a","b","c"), 26) # rotation by 26 does not change much (ch <- paste("Hello", c("World!", "you too"))) rotn(ch) rotn( rotn(ch ) ) # rotn(*, 13) is its own inverse
rotn(c("ABC", "a","b","c"), 1) rotn(c("ABC", "a","b","c"), 2) rotn(c("ABC", "a","b","c"), 26) # rotation by 26 does not change much (ch <- paste("Hello", c("World!", "you too"))) rotn(ch) rotn( rotn(ch ) ) # rotn(*, 13) is its own inverse
Given a real numbers with the particular property that
is integer, find integer numbers
which are close to
(
), and have identical “marginal”
sum,
sum(x) == sum(y)
.
As I found later, the problem is known as “Apportionment Problem” and it is quite an old problem with several solution methods proposed historically, but only in 1982, Balinski and Young proved that there is no method that fulfills three natural desiderata.
Note that the (first) three methods currently available here were all (re?)-invented by M.Maechler, without any knowledge of the litterature. At the time of writing, I have not even checked to which (if any) of the historical methods they match.
roundfixS(x, method = c("offset-round", "round+fix", "1greedy"))
roundfixS(x, method = c("offset-round", "round+fix", "1greedy"))
x |
a numeric vector which must sum to an integer |
method |
character string specifying the algorithm to be used. |
Without hindsight, it may be surprising that all three methods give
identical results (in all situations and simulations considered),
notably that the idea of ‘mass shifting’ employed in the
iterative "1greedy"
algorithm seems equivalent to the much simpler
idea used in "offset-round"
.
I am pretty sure that these algorithms solve the
optimization problem,
,
typically for all
simultaneously, but have not bothered to find a formal proof.
a numeric vector, say r
, of the same length as x
, but
with integer values and fulfulling sum(r) == sum(x)
.
Martin Maechler, November 2007
Michel Balinski and H. Peyton Young (1982) Fair Representation: Meeting the Ideal of One Man, One Vote;
https://en.wikipedia.org/wiki/Apportionment_paradox
https://www.ams.org/samplings/feature-column/fcarc-apportionii3
round
etc
## trivial example kk <- c(0,1,7) stopifnot(identical(kk, roundfixS(kk))) # failed at some point x <- c(-1.4, -1, 0.244, 0.493, 1.222, 1.222, 2, 2, 2.2, 2.444, 3.625, 3.95) sum(x) # an integer r <- roundfixS(x) stopifnot(all.equal(sum(r), sum(x))) m <- cbind(x=x, `r2i(x)` = r, resid = x - r, `|res|` = abs(x-r)) rbind(m, c(colSums(m[,1:2]), 0, sum(abs(m[,"|res|"])))) chk <- function(y) { cat("sum(y) =", format(S <- sum(y)),"\n") r2 <- roundfixS(y, method="offset") r2. <- roundfixS(y, method="round") r2_ <- roundfixS(y, method="1g") stopifnot(all.equal(sum(r2 ), S), all.equal(sum(r2.), S), all.equal(sum(r2_), S)) all(r2 == r2. & r2. == r2_) # TRUE if all give the same result } makeIntSum <- function(y) { n <- length(y) y[n] <- ceiling(y[n]) - (sum(y[-n]) %% 1) y } set.seed(11) y <- makeIntSum(rnorm(100)) chk(y) ## nastier example: set.seed(7) y <- makeIntSum(rpois(100, 10) + c(runif(75, min= 0, max=.2), runif(25, min=.5, max=.9))) chk(y) ## Not run: for(i in 1:1000) stopifnot(chk(makeIntSum(rpois(100, 10) + c(runif(75, min= 0, max=.2), runif(25, min=.5, max=.9))))) ## End(Not run)
## trivial example kk <- c(0,1,7) stopifnot(identical(kk, roundfixS(kk))) # failed at some point x <- c(-1.4, -1, 0.244, 0.493, 1.222, 1.222, 2, 2, 2.2, 2.444, 3.625, 3.95) sum(x) # an integer r <- roundfixS(x) stopifnot(all.equal(sum(r), sum(x))) m <- cbind(x=x, `r2i(x)` = r, resid = x - r, `|res|` = abs(x-r)) rbind(m, c(colSums(m[,1:2]), 0, sum(abs(m[,"|res|"])))) chk <- function(y) { cat("sum(y) =", format(S <- sum(y)),"\n") r2 <- roundfixS(y, method="offset") r2. <- roundfixS(y, method="round") r2_ <- roundfixS(y, method="1g") stopifnot(all.equal(sum(r2 ), S), all.equal(sum(r2.), S), all.equal(sum(r2_), S)) all(r2 == r2. & r2. == r2_) # TRUE if all give the same result } makeIntSum <- function(y) { n <- length(y) y[n] <- ceiling(y[n]) - (sum(y[-n]) %% 1) y } set.seed(11) y <- makeIntSum(rnorm(100)) chk(y) ## nastier example: set.seed(7) y <- makeIntSum(rpois(100, 10) + c(runif(75, min= 0, max=.2), runif(25, min=.5, max=.9))) chk(y) ## Not run: for(i in 1:1000) stopifnot(chk(makeIntSum(rpois(100, 10) + c(runif(75, min= 0, max=.2), runif(25, min=.5, max=.9))))) ## End(Not run)
Compute a robust range, i.e. the usual range()
as long
as there are no outliers, using the “whisker boundaries” of
boxplot
, i.e., boxplot.stats
.
rrange(x, range=1, coef = 1.5, na.rm = TRUE)
rrange(x, range=1, coef = 1.5, na.rm = TRUE)
x |
numeric vector the robust range of which shall be computed. |
range |
number for S compatibility; |
coef |
numeric multiplication factor definying the outlier boundary, see ‘Details’ below. |
na.rm |
logical indicating how |
The robust range is really just what boxplot.stats(x,
coef=coef)
returns as the whisker boundaries.
This is the most extreme values x[j]
still inside median
plus/minus coef * IQR
.
numeric vector c(m,M)
with which is (not
strictly) inside
range(x) = c(min(x),max(x))
.
Martin Maechler, 1990.
range
, fivenum
,
boxplot
and boxplot.stats
.
A more sophisticated robust range for (strongly) asymmetric data can
be derived from the skewness adjusted boxplot statistics
adjboxStats
which is a generalization of
boxplot.stats
.
stopifnot(rrange(c(1:10,1000)) == c(1,10))
stopifnot(rrange(c(1:10,1000)) == c(1,10))
Produce a sequence of unique values (sorted increasingly),
containing the initial set of values x
.
This can be useful for setting prediction e.g. ranges in nonparametric
regression.
seqXtend(x, length., method = c("simple", "aim", "interpolate"), from = NULL, to = NULL)
seqXtend(x, length., method = c("simple", "aim", "interpolate"), from = NULL, to = NULL)
x |
numeric vector. |
length. |
integer specifying approximately the desired
|
method |
string specifying the method to be used. The default,
|
from , to
|
numbers to be passed to (the default method for)
|
numeric vector of increasing values, of approximate length
length.
(unless length. < length(unique(x))
in which case, the result
is simply sort(unique(x))
),
containing the original values of x
.
From, r <- seqXtend(x, *)
, the original values are at
indices ix <- match(x,r)
, i.e., identical(x, r[ix])
.
method = "interpolate"
typically gives the best results. Calling
roundfixS
, it also need more computational resources
than the other methods.
Martin Maechler
seq
; plotDS
can make particularly
good use of seqXtend()
a <- c(1,2,10,12) seqXtend(a, 12)# --> simply 1:12 seqXtend(a, 12, "interp")# ditto seqXtend(a, 12, "aim")# really worse stopifnot(all.equal(seqXtend(a, 12, "interp"), 1:12)) ## for a "general" x, however, "aim" aims better than default x <- c(1.2, 2.4, 4.6, 9.9) length(print(seqXtend(x, 12))) # 14 length(print(seqXtend(x, 12, "aim"))) # 12 length(print(seqXtend(x, 12, "int"))) # 12 ## "interpolate" is really nice: xt <- seqXtend(x, 100, "interp") plot(xt, main="seqXtend(*, 100, \"interpol\")") points(match(x,xt), x, col = 2, pch = 20) # .... you don't even see that it's not equidistant # whereas the cheap method shows ... xt2 <- seqXtend(x, 100) plot(xt2, col="blue") points(match(x,xt2), x, col = 2, pch = 20) ## with "Date" objects Drng <- as.Date(c("2007-11-10", "2012-07-12")) (px <- pretty(Drng, n = 16)) # say, for the main labels ## say, a finer grid, for ticks -- should be almost equidistant n3 <- 3*length(px) summary(as.numeric(diff(seqXtend(px, n3)))) # wildly varying summary(as.numeric(diff(seqXtend(px, n3, "aim")))) # (ditto) summary(as.numeric(diff(seqXtend(px, n3, "int")))) # around 30
a <- c(1,2,10,12) seqXtend(a, 12)# --> simply 1:12 seqXtend(a, 12, "interp")# ditto seqXtend(a, 12, "aim")# really worse stopifnot(all.equal(seqXtend(a, 12, "interp"), 1:12)) ## for a "general" x, however, "aim" aims better than default x <- c(1.2, 2.4, 4.6, 9.9) length(print(seqXtend(x, 12))) # 14 length(print(seqXtend(x, 12, "aim"))) # 12 length(print(seqXtend(x, 12, "int"))) # 12 ## "interpolate" is really nice: xt <- seqXtend(x, 100, "interp") plot(xt, main="seqXtend(*, 100, \"interpol\")") points(match(x,xt), x, col = 2, pch = 20) # .... you don't even see that it's not equidistant # whereas the cheap method shows ... xt2 <- seqXtend(x, 100) plot(xt2, col="blue") points(match(x,xt2), x, col = 2, pch = 20) ## with "Date" objects Drng <- as.Date(c("2007-11-10", "2012-07-12")) (px <- pretty(Drng, n = 16)) # say, for the main labels ## say, a finer grid, for ticks -- should be almost equidistant n3 <- 3*length(px) summary(as.numeric(diff(seqXtend(px, n3)))) # wildly varying summary(as.numeric(diff(seqXtend(px, n3, "aim")))) # (ditto) summary(as.numeric(diff(seqXtend(px, n3, "int")))) # around 30
Collect (and print) information about the current R session and
environment, using sessionInfo()
and more mostly
low-level and platform dependent information.
isRshared()
is a utility called from sessionInfoX()
.
sessionInfoX(pkgs = NULL, list.libP = FALSE, extraR.env = TRUE) ## S3 method for class 'sessionInfoX' print(x, locale = TRUE, RLIBS = TRUE, Renv = TRUE, ...) isRshared(platform = .Platform)
sessionInfoX(pkgs = NULL, list.libP = FALSE, extraR.env = TRUE) ## S3 method for class 'sessionInfoX' print(x, locale = TRUE, RLIBS = TRUE, Renv = TRUE, ...) isRshared(platform = .Platform)
pkgs |
|
list.libP |
a logical indicating if for all
|
extraR.env |
logical indicating if all environment
variables should be recorded which start with |
x |
typically the result of |
locale |
logical, passed to |
RLIBS |
logical indicating if the information about R_LIBS should be printed. |
Renv |
logical indicating if the information about R environment variables should be printed. |
... |
passed to |
platform |
For isRshared()
, a logical
indicating if R has
been installed as “shared”, i.e., linked to ‘libR*’ shared
library.
For sessionInfoX()
, an object of S3 class "sessionInfoX"
, a list
with components (there may be more, experimental and not yet listed here):
sInfo |
simply the value of |
sysInf |
the value of |
capabilities |
the value of |
Machine |
the value of |
compiledBy |
for R 4.3.0 and newer, the value of |
extSoft |
for R 3.2.0 and newer, the value of |
grSoft |
for R 3.2.0 and newer, the value of |
tclVersion |
for R 3.2.0 and newer and when tcltk is loaded,
the Tcl version ( |
LAPACK |
for R 3.0.3 and newer, the value of |
pcre |
for R 3.1.3 and newer, the value of |
pkgDescr |
If |
libPath |
the value of |
RLIBS |
a |
n.RLIBS |
simply a |
R.env |
a named character vector with the “important” R
environment variables |
xR.env |
if |
shared |
(not available on Windows, where it is conceptually always true:)
|
Martin Maechler, December 2015 ff.
sessionInfo
,
.libPaths
, R.version
, Sys.getenv
.
six0 <- sessionInfoX() six0$shared # useful (for some, e.g., MM) on Unix alikes sixN <- sessionInfoX("nlme", list.libP = TRUE) sixN # -> print() method for "sessionInfoX" names(sixN) str(sixN, max = 1)# outline of lower-level structure str(sixN$pkgDescr) # list with one component "nlme"
six0 <- sessionInfoX() six0$shared # useful (for some, e.g., MM) on Unix alikes sixN <- sessionInfoX("nlme", list.libP = TRUE) sixN # -> print() method for "sessionInfoX" names(sixN) str(sixN, max = 1)# outline of lower-level structure str(sixN$pkgDescr) # list with one component "nlme"
From base R's R.version.string
, produce a somewhat
shorter version, with or without date, notably also for patched or
development versions of R.
Main use is for plotting or construction of file of variable names.
shortRversion(Rv = R.version, Rst = Rv$status, Rvstring = if (!is.null(s <- Rv$version.string)) s else R.version.string, date = Rst != "", spaces = TRUE)
shortRversion(Rv = R.version, Rst = Rv$status, Rvstring = if (!is.null(s <- Rv$version.string)) s else R.version.string, date = Rst != "", spaces = TRUE)
Rv |
|
Rst |
a string specifying the status of R's version. For
released versions of R, this is |
Rvstring |
a string with a default that should work even for R versions previous to 1.0.0. |
date |
logical specifying if the date of the R version should be included in the result; by default, this will be true only for non-released versions of R. |
spaces |
logical indicating if the result may contain spaces (aka
‘blanks’); setting it to false, replaces the blanks by |
a character
string, typically a shortened version of Rvstring
.
Martin Maechler
shortRversion() ## (including the date, typically for an R Core developer) ## but this is shorter: (Rver <- shortRversion(date=FALSE)) shortRversion(spaces=FALSE)# e.g. for a file of even directory name shortRversion(spaces=FALSE, date=FALSE)# even shorter, ditto ## If you want even shorter { abbreviate() will remove spaces, too }: abbreviate(shortRversion(), 11) abbreviate(shortRversion(date=FALSE), 13)
shortRversion() ## (including the date, typically for an R Core developer) ## but this is shorter: (Rver <- shortRversion(date=FALSE)) shortRversion(spaces=FALSE)# e.g. for a file of even directory name shortRversion(spaces=FALSE, date=FALSE)# even shorter, ditto ## If you want even shorter { abbreviate() will remove spaces, too }: abbreviate(shortRversion(), 11) abbreviate(shortRversion(date=FALSE), 13)
Rounds to significant digits similarly to signif
.
signi(x, digits = 6)
signi(x, digits = 6)
x |
numeric vector to be rounded. |
digits |
number of significant digits required. |
numeric vector “close” to x
, i.e. by at least digits
significant digits.
This is really just round(x, digits - trunc(log10(abs(x))))
and
hence mainly of didactical use. Rather use signif()
otherwise.
Martin Maechler, in prehistoric times (i.e. before 1990).
(x1 <- seq(-2, 4, by = 0.5)) identical(x1, signi(x1))# since 0.5 is exact in binary arithmetic (x2 <- pi - 3 + c(-5,-1,0, .1, .2, 1, 10,100)) signi(x2, 3)
(x1 <- seq(-2, 4, by = 0.5)) identical(x1, signi(x1))# since 0.5 is exact in binary arithmetic (x2 <- pi - 3 + c(-5,-1,0, .1, .2, 1, 10,100)) signi(x2, 3)
Source (via sys.source()
) and attach
(attach
) an R source file.
sourceAttach(file, pos=2, name = paste(abbreviate(gsub(fsep,"", dirname(file)), 12, method="both.sides"), basename(file), sep=fsep), keep.source = getOption("keep.source.pkgs"), warn.conflicts = TRUE)
sourceAttach(file, pos=2, name = paste(abbreviate(gsub(fsep,"", dirname(file)), 12, method="both.sides"), basename(file), sep=fsep), keep.source = getOption("keep.source.pkgs"), warn.conflicts = TRUE)
file |
file name |
pos |
passed to |
name |
character, with a smart default, passed to |
keep.source |
logical, see |
warn.conflicts |
logical, see |
the return value of attach()
.
Martin Maechler, 29 Jul 2011
sourceAttach(system.file("test-tools-1.R", package="Matrix", mustWork=TRUE)) search() # shows the new "data base" at position 2 ## look what it contains: ls.str(pos = 2)
sourceAttach(system.file("test-tools-1.R", package="Matrix", mustWork=TRUE)) search() # shows the new "data base" at position 2 ## look what it contains: ls.str(pos = 2)
Provide an overview over all datasets available by
data()
in a (list of) given R packages.
str_data(pkgs, filterFUN, ...)
str_data(pkgs, filterFUN, ...)
pkgs |
character vector of names of R packages. |
filterFUN |
|
... |
potentical further arguments to be passed to
|
invisibly (see invisible
) a list
with
named components matching the pkgs
argument. Each of these
components is a named list with one entry per data(.)
argument
name. Each entry is a character
vector of the names
of all objects, typically only one.
The side effect is, as with str()
, to print
everything (via cat
) to the console.
Martin Maechler
str_data("cluster") str_data("datasets", max=0, give.attr = FALSE) ## Filtering (and return value) dfl <- str_data("datasets", filterFUN=is.data.frame) str(df.d <- dfl$datasets) ## dim() of all those data frames: t(sapply(unlist(df.d), function(.) dim(get(.)))) ### Data sets in all attached packages but "datasets" (and stubs): s <- search() (Apkgs <- sub("^package:", '', s[grep("^package:", s)])) str_data(Apkgs[!Apkgs %in% c("datasets", "stats", "base")])
str_data("cluster") str_data("datasets", max=0, give.attr = FALSE) ## Filtering (and return value) dfl <- str_data("datasets", filterFUN=is.data.frame) str(df.d <- dfl$datasets) ## dim() of all those data frames: t(sapply(unlist(df.d), function(.) dim(get(.)))) ### Data sets in all attached packages but "datasets" (and stubs): s <- search() (Apkgs <- sub("^package:", '', s[grep("^package:", s)])) str_data(Apkgs[!Apkgs %in% c("datasets", "stats", "base")])
Return information about the Linux hardware, notably the CPU (the central processor unit) and memory of the computer R is running on. This is currently only available for Linux.
These functions exist on other unix-alike platforms, but produce an error when called.
Sys.procinfo(procfile) Sys.cpuinfo() Sys.meminfo() Sys.memGB(kind = "MemTotal") Sys.MIPS()
Sys.procinfo(procfile) Sys.cpuinfo() Sys.meminfo() Sys.memGB(kind = "MemTotal") Sys.MIPS()
procfile |
name of file the lines of which give the CPU info “as on Linux” |
kind |
a |
The Sys.*info()
functions return a "simple.list"
,
here basically a named character vector,
(where the names have been filtered through make.names(*,
unique=TRUE)
which is of importance for multi-processor or multi-core
CPUs, such that vector can easily be indexed.
Sys.memGB()
returns available memory in giga bytes [GB];Sys.MIPS()
returns a number giving an approximation of
the Million Iinstructions Per Second that
the CPU processes (using “bogomips”). This is a performance
measure of the basic non-numeric processing capabilities.
For single-core Linux systems, often about twice the basic clock rate
in “MHz” (as available by Sys.cpuinfo()["cpu.MHz"]
); now,
with multicore systems, the result is often around (but smaller than)
2 * #{cores} * clock.rate
.
These currently do rely on the Linux ‘/proc/’ file system, and may not easily be portable to non-Linux environments.
On multi-processor machines, Sys.cpuinfo()
contains each field
for each processor (i.e., names(Sys.cpuinfo())
has
duplicated
entries).
Conceivably, the bogoMIPS source code is open and available and could be built into R.
Martin Maechler
Sys.ps
, etc.
(n.cores <- parallel::detectCores()) if(substr(R.version[["os"]], 1,5) == "linux") { ##-- only on Linux Sys.cpuinfo() # which is often ugly; this looks much better: length(Sys.cpu2 <- local({I <- Sys.cpuinfo(); I[ !grepl("^flags", names(I)) ] })) ## may still be too much, notably if n.cores > 2: (Sys3 <- Sys.cpu2[!grepl("[.][0-9]+$", names(Sys.cpu2))]) Sys.MIPS() ## just the 'bogomips' from above: Sys.MIPS() / as.numeric(Sys.cpuinfo()["cpu.MHz"]) ## ~~ 2 * #{cores} ((no longer)) ## Available Memory -- can be crucial: Sys.memGB() #- default "MemTotal" if(Sys.memGB("MemFree") > 16) message("Be happy! You have more than 16 Gigabytes of free memory") }
(n.cores <- parallel::detectCores()) if(substr(R.version[["os"]], 1,5) == "linux") { ##-- only on Linux Sys.cpuinfo() # which is often ugly; this looks much better: length(Sys.cpu2 <- local({I <- Sys.cpuinfo(); I[ !grepl("^flags", names(I)) ] })) ## may still be too much, notably if n.cores > 2: (Sys3 <- Sys.cpu2[!grepl("[.][0-9]+$", names(Sys.cpu2))]) Sys.MIPS() ## just the 'bogomips' from above: Sys.MIPS() / as.numeric(Sys.cpuinfo()["cpu.MHz"]) ## ~~ 2 * #{cores} ((no longer)) ## Available Memory -- can be crucial: Sys.memGB() #- default "MemTotal" if(Sys.memGB("MemFree") > 16) message("Be happy! You have more than 16 Gigabytes of free memory") }
These functions return process id and status information, typically about the running R process.
Sys.ps(process= Sys.getpid(), fields = c("pid", "pcpu", "time", "vsz", "comm"), usefile = length(fields) > 10, ps.cmd = Sys.ps.cmd(), verbose = getOption("verbose"), warn.multi = verbose || any(fields != "ALL")) Sys.sizes(process = Sys.getpid(), ps.cmd = Sys.ps.cmd())
Sys.ps(process= Sys.getpid(), fields = c("pid", "pcpu", "time", "vsz", "comm"), usefile = length(fields) > 10, ps.cmd = Sys.ps.cmd(), verbose = getOption("verbose"), warn.multi = verbose || any(fields != "ALL")) Sys.sizes(process = Sys.getpid(), ps.cmd = Sys.ps.cmd())
process |
the process id, an integer. |
fields |
character strings of |
usefile |
logical; if true, |
ps.cmd |
character string, giving the “ps” command name to be used. |
verbose |
logical ... |
warn.multi |
logical ... |
Use man ps
on your respective Unix system, to see what fields are
supported exactly. Unix dialects do differ here, and,
SunOS-Solaris even has more than one ps command...
Note, that Sys.sizes()
currently returns two integers which are
“common” to Solaris and Linux.
Martin Maechler
Sys.info
, Sys.getpid
,
proc.time
.
(.pid <- Sys.getpid()) ## process ID of current process Sys.sizes(.pid) ## The default process statistics about the running R process try( Sys.ps() )
(.pid <- Sys.getpid()) ## process ID of current process Sys.sizes(.pid) ## The default process statistics about the running R process try( Sys.ps() )
From a linear (or glm) model fitted, produce the so-called Tukey-Anscombe plot. Useful (optional) additions include: 0-line, lowess smooth, 2sigma lines, and automatic labeling of observations.
TA.plot(lm.res, fit= fitted(lm.res), res= residuals(lm.res, type="pearson"), labels= NULL, main= mk.main(), xlab = "Fitted values", draw.smooth= n >= 10, show.call = TRUE, show.2sigma= TRUE, lo.iter = NULL, lo.cex= NULL, par0line = list(lty = 2, col = "gray"), parSmooth = list(lwd = 1.5, lty = 4, col = 2), parSigma = list(lwd = 1.2, lty = 3, col = 4), verbose = FALSE, ...)
TA.plot(lm.res, fit= fitted(lm.res), res= residuals(lm.res, type="pearson"), labels= NULL, main= mk.main(), xlab = "Fitted values", draw.smooth= n >= 10, show.call = TRUE, show.2sigma= TRUE, lo.iter = NULL, lo.cex= NULL, par0line = list(lty = 2, col = "gray"), parSmooth = list(lwd = 1.5, lty = 4, col = 2), parSigma = list(lwd = 1.2, lty = 3, col = 4), verbose = FALSE, ...)
lm.res |
|
fit |
fitted values; you probably want the default here. |
res |
residuals to use. Default: Weighted ("Pearson") residuals if weights have been used for the model fit. |
labels |
strings to use as plotting symbols for each point.
Default( |
main |
main title to plot. Default: sophisticated, resulting in
something like "Tukey-Anscombe Plot of : y ~ x" constructed from
|
xlab |
x-axis label for plot. |
draw.smooth |
logical; if |
show.call |
logical; if |
show.2sigma |
logical; if |
lo.iter |
positive integer, giving the number of lowess
robustness iterations. The default depends on the model and
is |
lo.cex |
character expansion ("cex") for lowess and other marginal texts. |
par0line |
a list of arguments (with reasonable defaults) to be passed to
|
parSmooth , parSigma
|
each a list of arguments (with reasonable
default) for drawing the smooth curve (if |
verbose |
logical indicating if some construction details should
be reported ( |
... |
further graphical parameters are passed to
|
The above mentioned plot is produced on the current graphic device.
Martin Maechler, Seminar fuer Statistik, ETH Zurich, Switzerland; [email protected]
plot.lm
which also does a QQ normal plot and more.
data(stackloss) TA.plot(lm(stack.loss ~ stack.x)) example(airquality) summary(lmO <- lm(Ozone ~ ., data= airquality)) TA.plot(lmO) TA.plot(lmO, label = "O") # instead of case numbers if(FALSE) { TA.plot(lm(cost ~ age+type+car.age, claims, weights=number, na.action=na.omit)) } ##--- for aov(.) : ------------- data(Gun, package = "nlme") TA.plot( aov(rounds ~ Method + Physique/Team, data = Gun)) ##--- Not so clear what it means for GLM, but: ------ if(require(rpart)) { # for the two datasets only data(solder, package = "rpart") TA.plot(glm(skips ~ ., data = solder, family = poisson), cex= .6) data(kyphosis, package = "rpart") TA.plot(glm(Kyphosis ~ poly(Age,2) + Start, data=kyphosis, family = binomial), cex=.75) # smaller title and plotting characters }
data(stackloss) TA.plot(lm(stack.loss ~ stack.x)) example(airquality) summary(lmO <- lm(Ozone ~ ., data= airquality)) TA.plot(lmO) TA.plot(lmO, label = "O") # instead of case numbers if(FALSE) { TA.plot(lm(cost ~ age+type+car.age, claims, weights=number, na.action=na.omit)) } ##--- for aov(.) : ------------- data(Gun, package = "nlme") TA.plot( aov(rounds ~ Method + Physique/Team, data = Gun)) ##--- Not so clear what it means for GLM, but: ------ if(require(rpart)) { # for the two datasets only data(solder, package = "rpart") TA.plot(glm(skips ~ ., data = solder, family = poisson), cex= .6) data(kyphosis, package = "rpart") TA.plot(glm(Kyphosis ~ poly(Age,2) + Start, data=kyphosis, family = binomial), cex=.75) # smaller title and plotting characters }
For the case of more than two categories or indices (in INDEX
),
traditional tapply(*, simplify = TRUE)
still returns a
list when an array may seem more useful and natural. This is provided
by tapplySimpl()
if the function FUN()
is defined such
as to return a vector of the same length in all cases.
tapplySimpl(X, INDEX, FUN, ...)
tapplySimpl(X, INDEX, FUN, ...)
X |
an atomic object, typically a vector. All these arguments
are as in |
INDEX |
list of (typically more than one) factors, each of same
length as |
FUN |
the function to be applied. For the result to be
simplifiable, |
... |
optional arguments to |
If the above conditions are satisfied, the list returned from
r <- tapply(X, INDEX, FUN, ...)
is simplified into an
array
of rank , i.e.,
1+length(INDEX)
; otherwise, tapplySimpl()
returns the list
r
, i.e., the same as tapply()
.
Martin Maechler, 14 Jun 1993 (for S-plus).
tapply(*, simplify=TRUE)
.
## Using tapply() would give a list (with dim() of a matrix); ## here we get 3-array: data(esoph) with(esoph, { mima <<- tapplySimpl(ncases/ncontrols, list(agegp, alcgp), range) stopifnot(dim(mima) == c(2, nlevels(agegp), nlevels(alcgp))) }) aperm(mima)
## Using tapply() would give a list (with dim() of a matrix); ## here we get 3-array: data(esoph) with(esoph, { mima <<- tapplySimpl(ncases/ncontrols, list(agegp, alcgp), range) stopifnot(dim(mima) == c(2, nlevels(agegp), nlevels(alcgp))) }) aperm(mima)
This is graphical user interface (GUI) to density
,
allowing for dynamic bandwidth choice and a simple kind of zooming,
relying on library(tcltk)
.
tkdensity(y, n = 1024, log.bw = TRUE, showvalue = TRUE, xlim = NULL, do.rug = size < 1000, kernels = NULL, from.f = if (log.bw) -2 else 1/1000, to.f = if (log.bw) +2.2 else 2, col = 2)
tkdensity(y, n = 1024, log.bw = TRUE, showvalue = TRUE, xlim = NULL, do.rug = size < 1000, kernels = NULL, from.f = if (log.bw) -2 else 1/1000, to.f = if (log.bw) +2.2 else 2, col = 2)
y |
numeric; the data the density of which we want. |
n |
integer; the number of abscissa values for
|
log.bw |
logical; if true (default), the gui scrollbar is on a log bandwidth scale, otherwise, simple interval. |
showvalue |
logical; if true, the value of the current (log) bandwidth is shown on top of the scrollbar. |
xlim |
initial |
do.rug |
logical indicating if |
kernels |
character vector of kernel names as allowable for the
|
from.f , to.f
|
numeric giving the left and right limit of the bandwidth scrollbar. |
col |
color to be used for the density curve. |
library(tcltk)
must be working, i.e., Tcl/Tk must have been
installed on your platform, and must have been visible during R's
configuration and/or installation.
You can not only choose the bandwidth (the most important parameter), but also the kernel, and you can zoom in and out (in x-range only).
none.
(How could this be done? tcltk
widgets run as separate processes!)
Martin Maechler, building on demo(tkdensity)
.
if (dev.interactive(TRUE)) ## does really not make sense otherwise if(try(require("tcltk"))) { ## sometimes (rarely) there, but broken data(faithful) tkdensity(faithful $ eruptions) set.seed(7) if(require("nor1mix")) tkdensity(rnorMix(1000, MW.nm9), kernels = c("gaussian", "epanechnikov")) }
if (dev.interactive(TRUE)) ## does really not make sense otherwise if(try(require("tcltk"))) { ## sometimes (rarely) there, but broken data(faithful) tkdensity(faithful $ eruptions) set.seed(7) if(require("nor1mix")) tkdensity(rnorMix(1000, MW.nm9), kernels = c("gaussian", "epanechnikov")) }
Formats real numbers, possibly in scientific notation, with a given number of digits after the decimal point. Output can be used in LaTeX math mode, e.g., for printing numbers in a table, where each number has to be printed with the same number of digits after the decimal point, even if the last digits are zeros.
## S3 method for class 'numeric' toLatex(object, digits = format.info(object)[2], scientific = format.info(object)[3] > 0, times = "\\cdot", ...)
## S3 method for class 'numeric' toLatex(object, digits = format.info(object)[2], scientific = format.info(object)[3] > 0, times = "\\cdot", ...)
object |
a numeric vector. |
digits |
number of digits after the decimal point (for the
mantissa if |
scientific |
logical indicating if scientific notation |
times |
character string indicating the LaTeX symbol to be used for the ‘times’ sign. |
... |
unused; for compatibility with |
a character
vector of the same length as object
,
containing the formatted numbers.
We use digits
for round
, i.e., round after
the decimal point on purpose, rather than signif()
icant
digit rounding as used by print()
or
format()
.
Alain Hauser
pretty10exp
which gives expression
s
similar to our scientific=TRUE
.
toLatex
with other methods.
xx <- pi * 10^(-9:9) format(xx) formatC(xx) toLatex(xx) #-> scientific = TRUE is chosen toLatex(xx, scientific=FALSE) sapply(xx, toLatex) sapply(xx, toLatex, digits = 2)
xx <- pi * 10^(-9:9) format(xx) formatC(xx) toLatex(xx) #-> scientific = TRUE is chosen toLatex(xx, scientific=FALSE) sapply(xx, toLatex) sapply(xx, toLatex, digits = 2)
R does not have S' concept of frame = 0
, aka ‘session
frame’. These two function were an attempt to provide a portable
way for working with frame 0, particularly when porting code
from S.
They have been deprecated since August 2013.
u.assign0(x, value, immediate = FALSE) u.get0(x)
u.assign0(x, value, immediate = FALSE) u.get0(x)
x |
character string giving the name of the object. |
value |
any R object which is to be assigned. |
immediate |
logical, for S compatibility. No use in R. |
Really don't use these anymore...
Martin Maechler
Return the x-coordinates in an ‘n-way’ side-by-side boxplot. This is an auxiliary function and exists mainly for backcompatibility with S-plus.
u.boxplot.x(n, j = 1:n, fullrange = 100)
u.boxplot.x(n, j = 1:n, fullrange = 100)
n |
number of boxplots. |
j |
indices of boxplots. |
fullrange |
x-coords as 'uniform' in |
a numeric vector of length n
, with values inside where
fullrange
.
Martin Maechler
u.boxplot.x(7) # == 8.93 22.62 36.3 ... 91.07
u.boxplot.x(7) # == 8.93 22.62 36.3 ... 91.07
Return one string of the form "day/month/year", plus "hour:minutes", optionally.
u.date(short=FALSE)
u.date(short=FALSE)
short |
logical; if |
String with current date (and time).
Martin Maechler, ca. 1992
u.date() u.date(short = TRUE)
u.date() u.date(short = TRUE)
Daten der Form 8710230920 aufspalten in Jahr, Monat, Tag, Std, Min
u.datumdecode(d, YMDHMnames = c("Jahr", "Monat", "Tag", "Std", "Min"))
u.datumdecode(d, YMDHMnames = c("Jahr", "Monat", "Tag", "Std", "Min"))
d |
numeric dates in the form YYMMDDHHMM. |
YMDHMnames |
(column) names to be used for the result. |
a numeric matrix (or vector) with 5 columns containing the year, month, etc.
MM: This is a wrong concept, and also suffers from the “millenium bug” (by using only 2 digits for the year).
?? (someone at SfS ETH)
R's proper date-time coding: DateTimeClasses
;
u.date
etc.
u.datumdecode(8710230920) ## Jahr Monat Tag Std Min ## 87 10 23 9 20 u.datumdecode(c(8710230900, 9710230920, 0210230920)) ## Jahr Monat Tag Std Min ## [1,] 87 10 23 9 00 ## [2,] 97 10 23 9 20 ## [3,] 2 10 23 9 20
u.datumdecode(8710230920) ## Jahr Monat Tag Std Min ## 87 10 23 9 20 u.datumdecode(c(8710230900, 9710230920, 0210230920)) ## Jahr Monat Tag Std Min ## [1,] 87 10 23 9 00 ## [2,] 97 10 23 9 20 ## [3,] 2 10 23 9 20
Return current date and time as a string, possibly including day of the week in German.
u.Datumvonheute(W.tag=2, Zeit=FALSE) C.Monatsname C.Wochentag C.Wochentagkurz C.weekday
u.Datumvonheute(W.tag=2, Zeit=FALSE) C.Monatsname C.Wochentag C.Wochentagkurz C.weekday
W.tag |
logical or integer specifying you want weekday (‘Wochentag’).
|
Zeit |
logical or integer specifying if time ("Zeit") is desired.
|
A string with the current date/time, in the form specified by the arguments.
The C.*
are character
vector “constants”,
the German ones actually used by u.Datumvonheute
.
Caterina Savi, Martin Maechler
u.date
for a similar English version, and
p.datum
which plots.
For English month names, etc month.name
.
u.Datumvonheute() u.Datumvonheute(W.tag=1, Zeit=TRUE) u.Datumvonheute(W.tag= FALSE, Zeit=2)
u.Datumvonheute() u.Datumvonheute(W.tag=1, Zeit=TRUE) u.Datumvonheute(W.tag= FALSE, Zeit=2)
Compute only for high values and keep low ones –
antisymmetrically such that
u.log(x)
is (once) continuously
differentiable, it computes
for
and
for
.
u.log(x, c = 1)
u.log(x, c = 1)
x |
numeric vector to be transformed. |
c |
scalar, > 0 |
Alternatively, the ‘IHS’ (inverse hyperbolic sine)
transform has been proposed, first in more generality as
curves by Johnson(1949); in its simplest form,
,
which is also antisymmetric, continous and once differentiable as our
u.log(.)
.
numeric vector of same length as x
.
Martin Maechler, 24 Jan 1995
N. L. Johnson (1949). Systems of Frequency Curves Generated by Methods of Translation, Biometrika 36, pp 149–176. doi:10.2307/2332539
Werner Stahel's sophisticated version of John Tukey's “started log” (which was ), with concave extension to negative values and adaptive default choice of
;
logst
in his CRAN package relevance, or
LogSt
in package DescTools.
curve(u.log, -3, 10); abline(h=0, v=0, col = "gray20", lty = 3) curve(1 + log(x), .01, add = TRUE, col= "brown") # simple log curve(u.log(x, 2), add = TRUE, col=2) curve(u.log(x, c= 0.4), add = TRUE, col=4) ## Compare with IHS = inverse hyperbolic sine == asinh ihs <- function(x) log(x+sqrt(x^2+1)) # == asinh(x) {aka "arsinh(x)" or "sinh^{-1} (x)"} xI <- c(-Inf, Inf, NA, NaN) stopifnot(all.equal(xI, asinh(xI))) # but not for ihs(): cbind(xI, asinh=asinh(xI), ihs=ihs(xI)) # differs for -Inf x <- runif(500, 0, 4); x[100+0:3] <- xI all.equal(ihs(x), asinh(x)) #==> is.NA value mismatch: asinh() is correct {i.e. better!} curve(u.log, -2, 20, n=1000); abline(h=0, v=0, col = "gray20", lty = 3) curve(ihs(x)+1-log(2), add=TRUE, col=adjustcolor(2, 1/2), lwd=2) curve(ihs(x), add=TRUE, col=adjustcolor(4, 1/2), lwd=2) ## for x >= 0, u.log(x) is nicely between IHS(x) and shifted IHS ## a log10-scale version of asinh() {aka "IHS" }: ihs10(x) := asinh(x/2) / ln(10) ihs10 <- function(x) asinh(x/2)/log(10) xyaxis <- function() abline(h=0, v=0, col = "gray20", lty = 3) leg3 <- function(x = "right") legend(x, legend = c(quote(ihs10(x) == asinh(x/2)/log(10)), quote(log[10](1+x)), quote(log[10](x))), col=c(1,2,5), bty="n", lwd=2) curve(asinh(x/2)/log(10), -5, 20, n=1000, lwd=2); xyaxis() curve(log10(1+x), col=2, lwd=2, add=TRUE) curve(log10( x ), col=5, lwd=2, add=TRUE); leg3() ## zoom out and x-log-scale curve(asinh(x/2)/log(10), .1, 100, log="x", n=1000); xyaxis() curve(log10(1+x), col=2, add=TRUE) curve(log10( x ), col=5, add=TRUE) ; leg3("center") curve(log10(1+x) - ihs10(x), .1, 1000, col=2, n=1000, log="x", ylim = c(-1,1)*0.10, main = "absolute difference", xaxt="n"); xyaxis(); eaxis(1, sub10=1) curve(log10( x ) - ihs10(x), col=4, n=1000, add = TRUE) curve(abs(1 - ihs10(x) / log10(1+x)), .1, 5000, col=2, log = "xy", ylim = c(6e-9, 2), main="|relative error| of approx. ihs10(x) := asinh(x/2)/log(10)", n=1000, axes=FALSE) eaxis(1, sub10=1); eaxis(2, sub10=TRUE) curve(abs(1 - ihs10(x) / log10(x)), col=4, n=1000, add = TRUE) legend("left", legend = c(quote(log[10](1+x)), quote(log[10](x))), col=c(2,4), bty="n", lwd=1) ## Compare with Stahel's version of "started log" ## (here, for *vectors* only, and 'base', as Desctools::LogSt(); ## by MM: "modularized" by providing a threshold-computer function separately: logst_thrWS <- function(x, mult = 1) { lq <- quantile(x[x > 0], probs = c(0.25, 0.75), na.rm = TRUE, names = FALSE) if (lq[1] == lq[2]) lq[1] <- lq[2]/2 lq[1]^(1 + mult)/lq[2]^mult } logst0L <- function(x, calib = x, threshold = thrFUN(calib, mult=mult), thrFUN = logst_thrWS, mult = 1, base = 10) { ## logical index sub-assignment instead of ifelse(): { already in DescTools::LogSt } res <- x # incl NA's notNA <- !is.na(sml <- (x < (th <- threshold))) i1 <- sml & notNA; res[i1] <- log(th, base) + ((x[i1] - th)/(th * log(base))) i2 <- !sml & notNA; res[i2] <- log(x[i2], base) attr(res, "threshold") <- th attr(res, "base") <- base res } logst0 <- function(x, calib = x, threshold = thrFUN(calib, mult=mult), thrFUN = logst_thrWS, mult = 1, base = 10) { ## Using pmax.int() instead of logical indexing -- NA's work automatically - even faster xm <- pmax.int(threshold, x) res <- log(xm, base) + (x - xm)/(threshold * log(base)) attr(res, "threshold") <- threshold attr(res, "base") <- base res } ## u.log() is really using natural log() -- whereas logst() defaults to base=10 curve(u.log, -4, 10, n=1000); abline(h=0, v=0, col = "gray20", lty = 3); points(-1:1, -1:1, pch=3) curve(log10(x) + 1, add=TRUE, col=adjustcolor("midnightblue", 1/2), lwd=4, lty=6) curve(log10(x), add=TRUE, col=adjustcolor("skyblue3", 1/2), lwd=4, lty=7) curve(logst0(x, threshold= 2 ), add=TRUE, col=adjustcolor("orange",1/2), lwd=2) curve(logst0(x, threshold= 1 ), add=TRUE, col=adjustcolor(2, 1/2), lwd=2) curve(logst0(x, threshold= 1/4), add=TRUE, col=adjustcolor(3, 1/2), lwd=2, lty=2) curve(logst0(x, threshold= 1/8), add=TRUE, col=adjustcolor(4, 1/2), lwd=2, lty=2)
curve(u.log, -3, 10); abline(h=0, v=0, col = "gray20", lty = 3) curve(1 + log(x), .01, add = TRUE, col= "brown") # simple log curve(u.log(x, 2), add = TRUE, col=2) curve(u.log(x, c= 0.4), add = TRUE, col=4) ## Compare with IHS = inverse hyperbolic sine == asinh ihs <- function(x) log(x+sqrt(x^2+1)) # == asinh(x) {aka "arsinh(x)" or "sinh^{-1} (x)"} xI <- c(-Inf, Inf, NA, NaN) stopifnot(all.equal(xI, asinh(xI))) # but not for ihs(): cbind(xI, asinh=asinh(xI), ihs=ihs(xI)) # differs for -Inf x <- runif(500, 0, 4); x[100+0:3] <- xI all.equal(ihs(x), asinh(x)) #==> is.NA value mismatch: asinh() is correct {i.e. better!} curve(u.log, -2, 20, n=1000); abline(h=0, v=0, col = "gray20", lty = 3) curve(ihs(x)+1-log(2), add=TRUE, col=adjustcolor(2, 1/2), lwd=2) curve(ihs(x), add=TRUE, col=adjustcolor(4, 1/2), lwd=2) ## for x >= 0, u.log(x) is nicely between IHS(x) and shifted IHS ## a log10-scale version of asinh() {aka "IHS" }: ihs10(x) := asinh(x/2) / ln(10) ihs10 <- function(x) asinh(x/2)/log(10) xyaxis <- function() abline(h=0, v=0, col = "gray20", lty = 3) leg3 <- function(x = "right") legend(x, legend = c(quote(ihs10(x) == asinh(x/2)/log(10)), quote(log[10](1+x)), quote(log[10](x))), col=c(1,2,5), bty="n", lwd=2) curve(asinh(x/2)/log(10), -5, 20, n=1000, lwd=2); xyaxis() curve(log10(1+x), col=2, lwd=2, add=TRUE) curve(log10( x ), col=5, lwd=2, add=TRUE); leg3() ## zoom out and x-log-scale curve(asinh(x/2)/log(10), .1, 100, log="x", n=1000); xyaxis() curve(log10(1+x), col=2, add=TRUE) curve(log10( x ), col=5, add=TRUE) ; leg3("center") curve(log10(1+x) - ihs10(x), .1, 1000, col=2, n=1000, log="x", ylim = c(-1,1)*0.10, main = "absolute difference", xaxt="n"); xyaxis(); eaxis(1, sub10=1) curve(log10( x ) - ihs10(x), col=4, n=1000, add = TRUE) curve(abs(1 - ihs10(x) / log10(1+x)), .1, 5000, col=2, log = "xy", ylim = c(6e-9, 2), main="|relative error| of approx. ihs10(x) := asinh(x/2)/log(10)", n=1000, axes=FALSE) eaxis(1, sub10=1); eaxis(2, sub10=TRUE) curve(abs(1 - ihs10(x) / log10(x)), col=4, n=1000, add = TRUE) legend("left", legend = c(quote(log[10](1+x)), quote(log[10](x))), col=c(2,4), bty="n", lwd=1) ## Compare with Stahel's version of "started log" ## (here, for *vectors* only, and 'base', as Desctools::LogSt(); ## by MM: "modularized" by providing a threshold-computer function separately: logst_thrWS <- function(x, mult = 1) { lq <- quantile(x[x > 0], probs = c(0.25, 0.75), na.rm = TRUE, names = FALSE) if (lq[1] == lq[2]) lq[1] <- lq[2]/2 lq[1]^(1 + mult)/lq[2]^mult } logst0L <- function(x, calib = x, threshold = thrFUN(calib, mult=mult), thrFUN = logst_thrWS, mult = 1, base = 10) { ## logical index sub-assignment instead of ifelse(): { already in DescTools::LogSt } res <- x # incl NA's notNA <- !is.na(sml <- (x < (th <- threshold))) i1 <- sml & notNA; res[i1] <- log(th, base) + ((x[i1] - th)/(th * log(base))) i2 <- !sml & notNA; res[i2] <- log(x[i2], base) attr(res, "threshold") <- th attr(res, "base") <- base res } logst0 <- function(x, calib = x, threshold = thrFUN(calib, mult=mult), thrFUN = logst_thrWS, mult = 1, base = 10) { ## Using pmax.int() instead of logical indexing -- NA's work automatically - even faster xm <- pmax.int(threshold, x) res <- log(xm, base) + (x - xm)/(threshold * log(base)) attr(res, "threshold") <- threshold attr(res, "base") <- base res } ## u.log() is really using natural log() -- whereas logst() defaults to base=10 curve(u.log, -4, 10, n=1000); abline(h=0, v=0, col = "gray20", lty = 3); points(-1:1, -1:1, pch=3) curve(log10(x) + 1, add=TRUE, col=adjustcolor("midnightblue", 1/2), lwd=4, lty=6) curve(log10(x), add=TRUE, col=adjustcolor("skyblue3", 1/2), lwd=4, lty=7) curve(logst0(x, threshold= 2 ), add=TRUE, col=adjustcolor("orange",1/2), lwd=2) curve(logst0(x, threshold= 1 ), add=TRUE, col=adjustcolor(2, 1/2), lwd=2) curve(logst0(x, threshold= 1/4), add=TRUE, col=adjustcolor(3, 1/2), lwd=2, lty=2) curve(logst0(x, threshold= 1/8), add=TRUE, col=adjustcolor(4, 1/2), lwd=2, lty=2)
u.sys()
is a convenient wrapper (of system()
) to call to
the underlying operating system. The main purpose has been to provide
a function with identical UI both in S-PLUS and R.
MM thinks you shouldn't use this anymore, usually.
Sys.ps.cmd()
returns the ‘ps’ (‘process status’)
OS command name (as character
string), and is typically
usable on unix alikes only.
u.sys(..., intern = TRUE) Sys.ps.cmd()
u.sys(..., intern = TRUE) Sys.ps.cmd()
... |
any number of strings – which will be
|
intern |
logical – note that the default is reversed from
the one in |
Martin Maechler
system
, really!;
on non-Windows, Sys.ps()
which makes use of Sys.ps.cmd()
.
u.sys # shows how simply the function is defined : ## Not run: function (..., intern = TRUE) system(paste(..., sep = ""), intern = intern) ## End(Not run) # All *running* processes of user [sometimes only R]: try ( u.sys(Sys.ps.cmd(), "ur") )
u.sys # shows how simply the function is defined : ## Not run: function (..., intern = TRUE) system(paste(..., sep = ""), intern = intern) ## End(Not run) # All *running* processes of user [sometimes only R]: try ( u.sys(Sys.ps.cmd(), "ur") )
Give regularly spaced points on interval with mean 0
(exactly) and variance about 1 (very close for even
n
and larger round.dig
). Note that depends on
n
.
unif(n, round.dig = 1 + trunc(log10(n)))
unif(n, round.dig = 1 + trunc(log10(n)))
n |
positive integer specifying the number of points desired. |
round.dig |
integer indicating to how many digits the result is rounded. |
numeric vector of length n
, symmetric around 0, hence
with exact mean 0
, and variance approximately 1.
It relies on the fact that .
Martin Maechler, ca 1990
runif
for producing uniform random numbers.
(u <- unif(8)) var(u) (u. <- unif(8, 12))# more digits in result, hence precision for Var : var(u.)
(u <- unif(8)) var(u) (u. <- unif(8, 12))# more digits in result, hence precision for Var : var(u.)
A version of unique
keeping enough information to
reverse (or invert) to the original data.
uniqueL(x, isuniq = !duplicated(x), need.sort = is.unsorted(x))
uniqueL(x, isuniq = !duplicated(x), need.sort = is.unsorted(x))
x |
numeric vector, of length |
isuniq |
logical vector of the same length as |
need.sort |
logical indicating if |
list of two components,
ix |
integer vector of indices |
xU |
vector of values from |
such that both x[isuniq] === xU
and xU[ix] === x
.
Martin Maechler
Duplicated
from the sfsmisc package in
addition to the standard unique
and
duplicated
.
x0 <- c(1:3,2:7,8:4) str(r0 <- uniqueL(x0)) with(r0, xU[ix]) ## == x0 !
x0 <- c(1:3,2:7,8:4) str(r0 <- uniqueL(x0)) with(r0, xU[ix]) ## == x0 !
Concatenate vector elements or anything using
paste(*, collapse = .)
.
These are simple short abbreviations I have been using in my own codes
in many places.
vcat(vec, sep = " ") ccat(...)
vcat(vec, sep = " ") ccat(...)
vec , ...
|
any vector and other arguments to be pasted to together. |
sep |
the separator to use, see the Details section. |
The functions are really just defined as
vcat := function(vec, sep = " ") paste(vec, collapse = sep)
ccat := function(...) paste(..., collapse = "", sep = "")
a character string (of length 1) with the concatenated arguments.
Martin Maechler, early 1990's.
paste
, as.character
,
format
. cat()
is really for printing.
ch <- "is" ccat("This ", ch, " it: ", 100, "%") vv <- c(1,pi, 20.4) vcat(vv) vcat(vv, sep = ", ")
ch <- "is" ccat("This ", ch, " it: ", 100, "%") vv <- c(1,pi, 20.4) vcat(vv) vcat(vv, sep = ", ")
The main motivation for this function has been the easy construction
of a “full GAM formula” from something as simple as
Y ~ .
.
The potential use is slightly more general.
wrapFormula(f, data, wrapString = "s(*)")
wrapFormula(f, data, wrapString = "s(*)")
f |
the initial |
data |
|
wrapString |
|
a formula
very similar to f
; just replacing each
additive term by its wrapped version.
There are limits for this to work correctly; notably the right hand
side of the formula f
should not be nested or otherwise
complicated, rather typically just .
as in the examples.
Martin Maechler, May 2007.
formula
;
gam
from package mgcv (or also from
package gam).
myF <- wrapFormula(Fertility ~ . , data = swiss) myF # Fertility ~ s(Agriculture) + s(....) + ... if(require("mgcv")) { m1 <- gam(myF, data = swiss) print( summary(m1) ) plot(m1, pages = 1) ; title(format(m1$call), line= 2.5) } ## other wrappers: wrapFormula(Fertility ~ . , data = swiss, wrap = "lo(*)") wrapFormula(Fertility ~ . , data = swiss, wrap = "poly(*, 4)")
myF <- wrapFormula(Fertility ~ . , data = swiss) myF # Fertility ~ s(Agriculture) + s(....) + ... if(require("mgcv")) { m1 <- gam(myF, data = swiss) print( summary(m1) ) plot(m1, pages = 1) ; title(format(m1$call), line= 2.5) } ## other wrappers: wrapFormula(Fertility ~ . , data = swiss, wrap = "lo(*)") wrapFormula(Fertility ~ . , data = swiss, wrap = "poly(*, 4)")
Produce the grid used by persp
, contour, etc, as
an N x 2
matrix.
This is really outdated by expand.grid()
nowadays.
xy.grid(x, y)
xy.grid(x, y)
x , y
|
any vectors of same mode. |
a 2-column matrix of “points” for each combination of x
and
y
, i.e. with length(x) * length(y)
rows.
Martin Maechler, 26 Oct 1994.
expand.grid
which didn't exist when
xy.grid
was first devised.
plot(xy.grid(1:7, 10*(0:4))) x <- 1:3 ; y <- 10*(0:4) xyg <- xy.grid(x,y) ## Compare with expand.grid() : m2 <- as.matrix(expand.grid(y,x)[, 2:1]) dimnames(m2) <- NULL stopifnot(identical(xyg, m2))
plot(xy.grid(1:7, 10*(0:4))) x <- 1:3 ; y <- 10*(0:4) xyg <- xy.grid(x,y) ## Compare with expand.grid() : m2 <- as.matrix(expand.grid(y,x)[, 2:1]) dimnames(m2) <- NULL stopifnot(identical(xyg, m2))
Given smoother data and maybe weights
,
with multiple
, use the unique x values, replacing the
's by their (weighted) mean and updating the weights
accordingly.
xy.unique.x(x, y, w, fun.mean = mean, ...)
xy.unique.x(x, y, w, fun.mean = mean, ...)
x , y
|
numeric vectors of same length. Alternatively, |
w |
numeric vector of non-negative weights – or missing which corresponds to all weights equal. |
fun.mean |
the mean |
... |
optional arguments all passed to |
Numeric matrix with three columns, named
x
, y
and w
with unique x
values and
corresponding y
and weights w
.
Martin Maechler, 8 Mar 1993.
e.g., smooth.spline
uses something like
this internally.
## simple example: x <- c(1,1,2,4,3,1) y <- 1:6 rbind(x, y) xy.unique.x(x, y) # x y w # 1 1 3 3 # 2 2 3 1 # 3 4 4 1 # 4 3 5 1 xy.unique.x(x, y, fromLast = TRUE)
## simple example: x <- c(1,1,2,4,3,1) y <- 1:6 rbind(x, y) xy.unique.x(x, y) # x y w # 1 1 3 3 # 2 2 3 1 # 3 4 4 1 # 4 3 5 1 xy.unique.x(x, y, fromLast = TRUE)