TO DO
------
o Generalizing 'wgt.himedian': We'd want a C API on which R builds.
There are pure R implementations:
- 'weighted.median()' in limma
and I have generalized it ---> inst/ex-funs.R
- more general code (different 'tie' strategies; weighted *quantile*s)
in /u/maechler/R/MM/STATISTICS/robust/weighted-median.R
- The 'Hmisc' package has wtd.quantile()
o covOGK():
The argument name 'weight.fn' is pretty ugly and the default function
name 'hard.rejection()' is just awful (we need a globally available
function as 'role model').
- Could allow 'n.iter = 0' to simply compute Cov()_{ij} = rcov(X_i, X_j)
o scaleTau2(): Also do a cheap finite-sample correction [MM] !
[done partly; but undocumented, since bound to change]
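
Sketch for the 'wgt.himedian' item above (pure R, not the planned C API).
It assumes the usual definition of the weighted high median: the smallest
x[j] whose cumulative weight strictly exceeds half the total weight.
The name 'wgtHiMedianR' is hypothetical and only used for illustration.

```r
## Hypothetical pure-R weighted high median (illustration only).
## Assumption: the result is the smallest x[j] such that the summed weight of
## all x[i] <= x[j] is strictly greater than half of the total weight.
wgtHiMedianR <- function(x, w = rep(1, length(x))) {
    stopifnot(length(x) == length(w), length(x) >= 1,
              !anyNA(x), !anyNA(w), all(w >= 0), sum(w) > 0)
    o  <- order(x)        # sort x, carrying the weights along
    cw <- cumsum(w[o])    # cumulative weight in sorted order
    x[o][which(2 * cw > sum(w))[1]]
}

wgtHiMedianR(c(1, 2, 3, 4))               # equal weights: upper of the two middle values
wgtHiMedianR(c(1, 2, 3), w = c(1, 1, 5))  # the heavy weight pulls the result to 3
```

This is O(n log n) because of the sort; a C version could use selection
instead, which is one reason to want the C API.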
--- rrcov etc ---
o rrcov.control() __ NEEDS name change ! ______
probably use mcd.control() and lts.control()
or forget about *control() completely?
since there are only a few in each ?
o tolellipse() --> renamed to tolEllipsePlot()
- maybe use cluster::ellipsoidPoints()
- allow other percentiles than just 97.5%
- maybe *return* something
o plot(mcd. ) [ R/covPlot.R ] : should show the call
Default for 'ask' should be smarter: depend on
prod(par("mfrow")) < #{plots} (which depends on 'classic' and p=2)
o ltsReg(): has undocumented '$resid'
in addition to '$residuals' and '$raw.residuals';
drop it or document it !
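
Sketch for the smarter 'ask' default mentioned in the plot() item above.
The helper name 'needsAsk' and its argument 'nPlots' are hypothetical;
how the number of plots is derived from 'classic' and p is left open here.

```r
## Hypothetical helper: TRUE when the plots do not all fit on one page,
## i.e. when prod(par("mfrow")) < number of plots.
needsAsk <- function(nPlots, mfrow = par("mfrow")) {
    prod(mfrow) < nPlots
}

needsAsk(4, mfrow = c(1, 1))  # four plots on a 1 x 1 layout: would ask
needsAsk(4, mfrow = c(2, 2))  # all four fit on a 2 x 2 layout: would not ask
```

In the plot method this would probably be combined with dev.interactive(),
e.g. ask = needsAsk(nPlots) && dev.interactive().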
--- glmrob ---
o predict.glmrob() __NEEDED__ and probably also residuals.glmrob()
should allow a 'type = *' specification like the *.glm() methods.
o glmrob(*, weights.on.x = "robCov") currently uses MASS::cov.rob(), i.e.
"MVE" and Andreas had a comment that "mcd" is worse.
But still, I (MM) strongly believe we should use covMcd() instead.
HOWEVER: Need something better when 'X' has (binary!) factors!
"hat" +- works, but needs more work
o psi.*(...., rho = FALSE/TRUE) functions from Andreas
should be replaced by using the new psi_func objects
--> ./R/psi-funs-AR.R & ./man/pkg-internal.Rd
o glmrob() needs a bit more tests in ./tests/
[also consider those from man/glmrob.Rd]
take those from Martin's old 'robGLM1' package (need more!)
o --> first test already shows that Martin's tests for "huberC == Inf"
were *not* yet moved from robGLM1 to glmrob()...
(in other words: glmrob() should work
o also, ni = 0 does not work quite as it should ( ./tests/binom-ni-small.R )
o obj $ df ... maybe should be defined -- for "glm" methods to be
applicable
o summary.glmrob() should be better documented;
we should decide if the current return value is fine.
o Eva's code (and MM's) also computed & returned the "asymptotic efficiency"!
o anova.glmrob(): More modularization, allowing to provide own 'test' function.
Test if Huber's C are different. Need theory to compare different C's and
same model (which includes classical vs robust).
o drop1() would be nice
--- nlrob ---
o nlrob() needs tests in ./tests/ -- you can take some from man/nlrob.Rd
o summary.nlrob() is currently a "no-op" -- printing it should summarize
robustness weights!
------
o Add data sets from the MMY-book -- mostly done {do we have *all* ?}
--- lmrob --- --- --- --- ---
o more tests in tests/
o fully implement and test the multivariate case (y = matrix with > 1 col.)
o src/lmrob.c :
Things to fix:
- does many many vector and vector-matrix things itself
instead of using BLAS and LAPACK
- computes median() and MAD() itself instead of using R's sort() routines
----
o Alternative version of covOGK() for correlation-only
using Huber's correlation formula which ensures [-1,1] range
--> ~/R/MM/Pkg-ex/robustbase/robcorgroesser1.R
and ~/R/MM/STATISTICS/robust/pairwise-new.R
o package 'riv' (author @ epfl.ch!) has 'slc()' ~= cov.S(.) -- in pure R code
doesn't Valentin have a version too?
otherwise: test this, ask author for "donation" to robustbase
o adjOutlyingness() :
The typo-bug is corrected, and I have tidied up the code.
Still a bit problematic when denominator = 0.
Currently we drop all the c/0 = Inf and 0/0 = NaN values.
MM: Maybe, it's the fact that the coef = 1.5 should really depend on
the sample size n and will be too large for small n (??)
--> should ask Mia and maybe Guy Brys
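
Sketch for the correlation-only covOGK() item above. It assumes that the
formula meant is the standardized sum/difference form
r = (s(u+v)^2 - s(u-v)^2) / (s(u+v)^2 + s(u-v)^2), with u = x/s(x) and
v = y/s(y), which lies in [-1,1] by construction. The function name
'robCorPair' and the choice of mad() as the scale are assumptions.

```r
## Hypothetical pairwise robust correlation (illustration only).
## 'sigma' is any robust scale function; mad() is just a convenient default.
robCorPair <- function(x, y, sigma = mad) {
    u <- x / sigma(x)          # standardize each variable first
    v <- y / sigma(y)
    sp2 <- sigma(u + v)^2      # squared scale of the sum
    sm2 <- sigma(u - v)^2      # squared scale of the difference
    (sp2 - sm2) / (sp2 + sm2)  # both terms are >= 0, so the ratio is in [-1, 1]
}

x <- c(1, 2, 3, 4, 10)
y <- c(2, 1, 4, 3, -5)
robCorPair(x, y)
robCorPair(x, x)   # perfectly related: the difference has scale 0
```

The result is NaN when both scales are zero, so a real implementation
would need to handle that case (and a zero sigma(x) or sigma(y)).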