File: TODO

robustbase 0.5-0-1-1
TO DO
------

o  Generalizing  'wgt.himedian':  We'd want a C API on which R builds.

   There are pure R implementations:
    - 'weighted.median()' in limma,
      and I have generalized it ---> inst/ex-funs.R
    - more general code (different 'tie' strategies; weighted *quantile*s)
      in /u/maechler/R/MM/STATISTICS/robust/weighted-median.R
    - The 'Hmisc' package has wtd.quantile()
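
   A minimal sketch of a weighted high median, in Python purely for
   illustration (this is neither the package's C code nor any of the R
   versions listed above). It assumes the definition "the smallest x[j]
   such that the weights of all x[i] <= x[j] sum to strictly more than
   half of the total weight"; the name weighted_high_median is hypothetical.

```python
def weighted_high_median(x, weights=None):
    """Weighted high median under the assumed definition:
    the smallest x[j] whose cumulative weight (over all x[i] <= x[j])
    is strictly greater than half of the total weight."""
    if weights is None:
        weights = [1.0] * len(x)          # unweighted case
    if len(x) == 0 or len(x) != len(weights):
        raise ValueError("x and weights must be non-empty and of equal length")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")

    half = total / 2.0
    running = 0.0
    value = None
    # Walk through the observations in increasing order of x,
    # accumulating weight until more than half of the total is covered.
    for value, w in sorted(zip(x, weights)):
        running += w
        if running > half:
            return value
    return value  # only reached through floating-point rounding


if __name__ == "__main__":
    print(weighted_high_median([1, 2, 3, 4]))            # even n: upper middle
    print(weighted_high_median([1, 2, 3], [1, 1, 10]))   # heavy weight on 3
```

   Replacing "half" by prob * total would give one possible weighted
   quantile; the different 'tie' strategies mentioned above are not covered.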

o covOGK():
  The argument name 'weight.fn' is pretty ugly and the default function
  name 'hard.rejection()' is just awful (we need a globally available
  function as 'role model').

  - Could allow 'n.iter = 0' to simply compute Cov()_{ij} = rcov(X_i, X_j)
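
  A sketch of what 'n.iter = 0' could compute, in Python for illustration
  only. It assumes rcov() is the Gnanadesikan-Kettenring identity
  rcov(X, Y) = (s(X+Y)^2 - s(X-Y)^2) / 4 and takes the normalized MAD as
  the scale s(); both choices and all function names here are assumptions,
  not the covOGK() code.

```python
from statistics import median


def mad(values, constant=1.4826):
    """Normalized median absolute deviation (assumed scale estimate)."""
    center = median(values)
    return constant * median([abs(v - center) for v in values])


def rcov_gk(x, y, scale=mad):
    """Pairwise robust covariance via the Gnanadesikan-Kettenring identity."""
    s_plus = scale([a + b for a, b in zip(x, y)])
    s_minus = scale([a - b for a, b in zip(x, y)])
    return (s_plus ** 2 - s_minus ** 2) / 4.0


def pairwise_cov(columns, scale=mad):
    """Cov_{ij} = rcov(X_i, X_j) for all pairs of columns, no iterations."""
    p = len(columns)
    cov = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i, p):
            # Fill both triangles so the matrix is symmetric by construction.
            cov[i][j] = cov[j][i] = rcov_gk(columns[i], columns[j], scale)
    return cov


if __name__ == "__main__":
    x = [1.0, 2.0, 4.0, 7.0, 11.0]
    y = [2.0, 1.0, 5.0, 3.0, 8.0]
    for row in pairwise_cov([x, y]):
        print(row)
```

  Without the orthogonalization iterations the resulting matrix is
  symmetric but need not be positive semidefinite.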

o scaleTau2():  Also do a cheap finite-sample correction [MM] !
		[done partly; but undocumented, since bound to change]

--- rrcov etc ---

o rrcov.control() __ NEEDS  name change ! ______
   probably use  mcd.control() and lts.control()

   or forget about *control() completely?
   since there are only a few in each ??

o tolellipse() --> renamed to tolEllipsePlot()
     - maybe use  cluster::ellipsoidPoints()
     - allow other percentiles than just  97.5%
     - maybe *return* something

o plot(mcd. ) [ R/covPlot.R ] : should show the call
	     Default for 'ask' should be smarter: depend on
	     prod(par("mfrow")) < #{plots} (which depends on 'classic' and p=2)

o ltsReg():  has  undocumented '$resid'
	     in addition to '$residuals' and '$raw.residuals';
	     drop it or document it !

--- glmrob ---

o predict.glmrob() __NEEDED__ and probably also residuals.glmrob()
  should allow a 'type = *' specification like the *.glm() methods.

o glmrob(*, weights.on.x = "robCov")  currently uses  MASS::cov.rob(), i.e.
  "MVE" and Andreas had a comment that "mcd" is worse.
  But still, I (MM) strongly believe we should use  covMcd() instead.
  HOWEVER: Need something better when 'X' has (binary!) factors!
  "hat" more or less works, but needs more work

o psi.*(....,  rho = FALSE/TRUE)   functions from Andreas
  should be replaced by using the new  psi_func  objects
  --> ./R/psi-funs-AR.R  &  ./man/pkg-internal.Rd

o glmrob() needs a bit more tests in ./tests/
           [also consider those from man/glmrob.Rd]
   take those from Martin's old 'robGLM1' package (need more!)
o  --> first test already shows that Martin's tests for "huberC == Inf"
      were *not* yet moved from robGLM1 to glmrob()...
   (in other words:  glmrob() should work

o also, ni = 0 does not work quite as it should ( ./tests/binom-ni-small.R )

o obj $ df ...  maybe should be defined -- for "glm" methods to be
  applicable

o summary.glmrob() should be better documented;
  we should decide if the current return value is fine.

o Eva's code (and MM's) also computed & returned the "asymptotic efficiency"!

o anova.glmrob(): More modularization, allowing users to supply their own 'test' function.
  Test if Huber's C are different. Need theory to compare different C's and
  same model (which includes classical vs robust).

o drop1() would be nice


--- nlrob ---

o nlrob() needs tests in ./tests/ -- you can take some from man/nlrob.Rd

o summary.nlrob() is currently a "no-op" -- printing it should summarize
		  robustness weights!

------

o Add data sets from the MMY-book -- mostly done {do we have *all* ?}


--- lmrob ---

o more tests in tests/

o fully implement and test the multivariate case (y = matrix with > 1 col.)

o src/lmrob.c :

   Things to fix:

   - does a great many vector and matrix-vector operations itself
     instead of using BLAS and LAPACK
   - computes median() and MAD() itself instead of using R's  sort() routines

----

o Alternative version of covOGK() for correlation-only
  using Huber's correlation formula which ensures [-1,1] range
  --> ~/R/MM/Pkg-ex/robustbase/robcorgroesser1.R
  and ~/R/MM/STATISTICS/robust/pairwise-new.R

o package 'riv' (author @ epfl.ch!) has 'slc()'  ~=  cov.S(.)  -- in pure R code
  doesn't Valentin have a version too?
  otherwise: test this, ask author for "donation" to robustbase

o adjOutlyingness() :
    The typo-bug is corrected, and I have made the code prettier.
    Still a bit problematic when denominator = 0.
    Currently, all the c/0 = Inf and 0/0 = NaN values are dropped.

    MM: Maybe, it's the fact that the   coef = 1.5  should really depend on
        the sample size  n   and will be too large for small n (??)
  --> should ask Mia and maybe Guy Brys
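
    The "drop the non-finite values" step can be sketched as follows
    (Python, for illustration only). It assumes the outlyingness of one
    observation is the maximum over directions of numerator / denominator;
    the function name and the argument layout are hypothetical and do not
    mirror the adjOutlyingness() code.

```python
import math


def max_finite_ratio(numerators, denominators):
    """Maximum of numerator/denominator over directions, skipping every
    direction whose ratio would be c/0 = Inf or 0/0 = NaN."""
    ratios = []
    for num, den in zip(numerators, denominators):
        if den == 0:
            continue                      # would give Inf or NaN: drop it
        ratio = num / den
        if math.isfinite(ratio):          # also drop non-finite inputs
            ratios.append(ratio)
    # If every direction was dropped there is nothing left to maximize.
    return max(ratios) if ratios else math.nan


if __name__ == "__main__":
    print(max_finite_ratio([1.0, 2.0, 0.0, 3.0], [2.0, 0.0, 0.0, 1.0]))
```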