File: DEoptim.Rd

package info (click to toggle)
r-cran-deoptim 2.2-8-1
  • links: PTS, VCS
  • area: main
  • in suites: bookworm, forky, sid, trixie
  • size: 252 kB
  • sloc: ansic: 431; makefile: 5
file content (330 lines) | stat: -rw-r--r-- 17,124 bytes parent folder | download | duplicates (2)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
\name{DEoptim}
\alias{DEoptim}
\title{Differential Evolution Optimization}
\concept{minimization}
\description{
  Performs evolutionary global optimization via the Differential Evolution algorithm.
}
\usage{
DEoptim(fn, lower, upper, control = DEoptim.control(), ..., fnMap=NULL)
}
\arguments{
  \item{fn}{the function to be optimized (minimized). The function should have as its first 
    argument the vector of real-valued parameters to optimize, and return a scalar real result. \code{NA} 
    and \code{NaN} values are not allowed.}
  \item{lower, upper}{two vectors specifying scalar real lower and upper
    bounds on each parameter to be optimized, so that the i-th element 
    of \code{lower} and \code{upper} applies to the i-th parameter. The implementation searches
    between \code{lower} and \code{upper} for the global optimum (minimum) of \code{fn}.}
  \item{control}{a list of control parameters; see \code{\link{DEoptim.control}}.}
  \item{fnMap}{an optional function that will be run after each population is
    created, but before the population is passed to the objective function. This
    allows the user to impose integer/cardinality constraints.   See the
    the sandbox directory of the source code for a simple example.}
  \item{...}{further arguments to be passed to \code{fn}.}
}
\details{
  \code{DEoptim} performs optimization (minimization) of \code{fn}. 

  The \code{control} argument is a list; see the help file for
  \code{\link{DEoptim.control}} for details.  

  The \R implementation of Differential Evolution (DE), \pkg{DEoptim}, was first published on the Comprehensive \R Archive
  Network (CRAN) in 2005 by David Ardia. Early versions were written in 
  pure \R. Since version 2.0-0 (published to CRAN in 2009) the package has relied on an
  interface to a C implementation of DE, which is significantly 
  faster on most problems as compared to the implementation in 
  pure \R. The C interface is in many respects similar to the MS
  Visual C++ v5.0 implementation of the Differential Evolution algorithm
  distributed with the book \emph{Differential Evolution -- A Practical
  Approach to Global Optimization} by Price, K.V., Storn, R.M., Lampinen
  J.A, Springer-Verlag, 2006. Since version
  2.0-3 the C implementation dynamically allocates the memory required
  to store the population, removing limitations on the number of members
  in the population and length of the parameter vectors that may be
  optimized.  Since version 2.2-0, the package allows for parallel
  operation, so that the evaluations of the objective function may be 
  performed using all available cores.  This is accomplished using either
  the built-in \pkg{parallel} package or the \pkg{foreach} package.  If
  parallel operation is desired, the user should set \code{parallelType}
  and make sure that the arguments and packages needed by the objective
  function are available; see \code{\link{DEoptim.control}}, the example
  below and examples in the sandbox directory for details.  
  
  Since becoming publicly available, the package \pkg{DEoptim} has been
  used by several authors to solve optimization problems arising in
  diverse domains; see Mullen et al. (2011) for a review.
  
  To perform a maximization (instead of minimization) of a given
  function, simply define a new function which is the opposite of the
  function to maximize and apply \code{DEoptim} to it.
  
  To integrate additional constraints (other than box constraints) on the parameters \code{x} of
  \code{fn(x)}, for instance \code{x[1] + x[2]^2 < 2}, integrate the
  constraint within the function to optimize, for instance: 
  \preformatted{
    fn <- function(x)\{
      if (x[1] + x[2]^2 >= 2)\{
        r <- Inf
      else\{
        ...
      \}
      return(r)
    \}
  }
  This simplistic strategy usually does not work all that well for gradient-based or Newton-type 
  methods. It is likely to be alright when the solution is in the interior of the feasible region, but when 
  the solution is on the boundary, optimization algorithm would have a difficult time converging. Furthermore, when
  the solution is on the boundary, this strategy would make the algorithm converge to an inferior solution in the interior. 
  However, for methods such as DE which are not gradient based, this strategy might not be that bad.

  Note that \code{DEoptim} stops if any \code{NA} or \code{NaN} value is
  obtained. You have to redefine your function to handle these values
  (for instance, set \code{NA} to \code{Inf} in your objective function).

  It is important to emphasize that the result of \code{DEoptim} is a random variable, 
  i.e., different results may be obtained when the algorithm is run repeatedly with the same 
  settings. Hence, the user should set the random seed if they want to reproduce the results, e.g., by 
  setting \code{set.seed(1234)} before the call of \code{DEoptim}.

  \code{DEoptim} relies on repeated evaluation of the objective function
  in order to move the population toward a global minimum. Users
  interested in making \code{DEoptim} run as fast as possible should
  consider using the package in parallel mode (so that all CPU's
  available are used), and also 
  ensure that evaluation of the objective function is as efficient as
  possible (e.g. by using vectorization in pure \R code, or
  writing parts of the objective function in a
  lower-level language like C or Fortran).
  
  Further details and examples of the \R package \pkg{DEoptim} can be found
  in Mullen et al. (2011) and Ardia et al. (2011a, 2011b) or look at the 
  package's vignette by typing \code{vignette("DEoptim")}. Also, an illustration of
  the package usage for a high-dimensional non-linear portfolio optimization problem 
  is available by typing \code{vignette("DEoptimPortfolioOptimization")}. 
  
  Please cite the package in publications. Use \code{citation("DEoptim")}.
}
\value{
  The output of the function \code{DEoptim} is a member of the \code{S3} class \code{DEoptim}. More precisely,
  this is a list (of length 2) containing the following elements:\cr

  \code{optim}, a list containing the following elements:
  \itemize{
    \item \code{bestmem}: the best set of parameters found.
    \item \code{bestval}: the value of \code{fn} corresponding to \code{bestmem}.
    \item \code{nfeval}: number of function evaluations.
    \item \code{iter}: number of procedure iterations.
  }

  \code{member}, a list containing the following elements:
  \itemize{
    \item \code{lower}: the lower boundary.
    \item \code{upper}: the upper boundary.
    \item \code{bestvalit}: the best value of \code{fn} at each iteration.
    \item \code{bestmemit}: the best member at each iteration.
    \item \code{pop}: the population generated at the last iteration.
    \item \code{storepop}: a list containing the intermediate populations.
  }

  Members of the class \code{DEoptim} have a \code{plot} method that
  accepts the argument \code{plot.type}.\cr
  \code{plot.type = "bestmemit"} results
  in a plot of the parameter values that represent the lowest value of the objective function
  each generation. \code{plot.type = "bestvalit"} plots the best value of
  the objective function each generation. Finally,
  \code{plot.type = "storepop"} results in a plot of
  stored populations (which are only available if these have been saved by
  setting the \code{control} argument of \code{DEoptim} appropriately). Storing intermediate populations 
  allows us to examine the progress of the optimization in detail.   
  A summary method also exists and returns the best parameter vector, the best value of the objective function,
  the number of generations optimization ran, and the number of times the 
  objective function was evaluated. 
}
\note{
  \emph{Differential Evolution} (DE) is a search heuristic introduced by Storn and Price (1997). 
  Its remarkable performance as a global optimization algorithm on continuous numerical minimization 
  problems has been extensively explored; see Price et al. (2006). DE belongs to the class of genetic 
  algorithms which use biology-inspired operations of 
  crossover, mutation, and selection on a population in order to minimize an objective 
  function over the course of successive generations (see Mitchell, 1998). As with other evolutionary algorithms, 
  DE solves optimization problems by evolving a population of candidate solutions using alteration and selection
  operators. DE uses floating-point instead of bit-string encoding of population members, and 
  arithmetic operations instead of logical operations in mutation. DE is particularly well-suited to find the global optimum of a 
  real-valued function of real-valued parameters, and does not require that the function be
  either continuous or differentiable.

  Let \eqn{\mathit{NP}}{NP} denote the number of parameter vectors (members) \eqn{x \in R^d}{x in R^d} in the population. 
  In order to create the initial generation, \eqn{\mathit{NP}}{NP} guesses for the optimal value
  of the parameter vector are made, either using random values between lower and upper 
  bounds (defined by the user) or using values given by
  the user. Each generation involves creation of a new population from
  the current population members \eqn{\{ x_i \,|\, i = 1, \ldots, \mathit{NP}\}}{{x_i | i=1,...,NP}}, 
  where \eqn{i} indexes the vectors that make up the population.
  This is accomplished using \emph{differential mutation} of the
  population members. An initial mutant parameter vector \eqn{v_i} is
  created by choosing three members of the population, \eqn{x_{r_0}},
  \eqn{x_{r_1}} and \eqn{x_{r_2}}, at random. Then \eqn{v_i} is
  generated as

  \deqn{v_i \doteq x_{r_0} + \mathit{F} \cdot (x_{r_1} - x_{r_2})}{v_i := x_{r_0} + F * (x_{r_1} - x_{r_2})}  

  where \eqn{\mathit{F}}{F} is the differential weighting factor, effective values for which are
  typically between 0 and 1. After the first mutation operation, mutation is
  continued until \eqn{d} mutations have been made, with a crossover probability 
  \eqn{\mathit{CR} \in [0,1]}{CR in [0,1]}.
  The crossover probability \eqn{\mathit{CR}}{CR} controls the fraction of the parameter
  values that are copied from the mutant. If an element of the trial parameter vector is found to violate the
  bounds after mutation and crossover, it is reset in such a way that the bounds are respected (with the
  specific protocol depending on the implementation).
  Then, the objective function values associated with the children are determined. If a trial
  vector has equal or lower objective function value than the previous
  vector it replaces the previous vector in the population;
  otherwise the previous vector remains. Variations of this scheme have also
  been proposed; see Price et al. (2006) and \code{\link{DEoptim.control}}.

  Intuitively, the effect of the scheme is that the shape of the distribution of the population in the
  search space is converging with respect to size and direction towards areas with high
  fitness. The closer the population gets to the global optimum, the more the distribution
  will shrink and therefore reinforce the generation of smaller difference vectors.  

  As a general advice regarding the choice of \eqn{\mathit{NP}}{NP}, \eqn{\mathit{F}}{F} and \eqn{\mathit{CR}}{CR}, 
  Storn et al. (2006) state the following: Set the number
  of parents \eqn{\mathit{NP}}{NP} to 10 times the number of parameters, select differential weighting factor 
  \eqn{\mathit{F} = 0.8}{F = 0.8} and crossover constant \eqn{\mathit{CR} = 0.9}{CR = 0.9}. Make sure that you initialize your parameter vectors
  by exploiting their full numerical range, i.e., if a parameter is allowed to exhibit
  values in the range [-100, 100] it is a good idea to pick the initial values from this
  range instead of unnecessarily restricting diversity. If you experience misconvergence in 
  the optimization process you usually have to increase the value for \eqn{\mathit{NP}}{NP}, but often you only have to adjust
  \eqn{\mathit{F}}{F} to be a little lower or higher than 0.8. If you increase
  \eqn{\mathit{NP}}{NP} and simultaneously lower \eqn{\mathit{F}}{F} a little, convergence is more
  likely to occur but generally takes longer, i.e., DE is getting
  more robust (there is always a convergence speed/robustness trade-off).

  DE is much more sensitive to the choice of \eqn{\mathit{F}}{F} than it is to
  the choice of \eqn{\mathit{CR}}{CR}. \eqn{\mathit{CR}}{CR} is more like a fine tuning element. High
  values of \eqn{\mathit{CR}}{CR} like \eqn{\mathit{CR} = 1}{CR = 1} give faster convergence if convergence
  occurs. Sometimes, however, you have to go down as much as \eqn{\mathit{CR} = 0}{CR = 0} to
  make DE robust enough for a particular problem. For more details on the DE strategy, we refer 
  the reader to Storn and Price (1997) and Price et al. (2006).
}
\references{
  Ardia, D., Boudt, K., Carl, P., Mullen, K.M., Peterson, B.G. (2011)
  Differential Evolution with \pkg{DEoptim}. An Application to Non-Convex Portfolio Optimization. 
  \emph{R Journal}, 3(1), 27-34. 
  \doi{10.32614/RJ-2011-005}

  Ardia, D., Ospina Arango, J.D., Giraldo Gomez, N.D. (2011)
  Jump-Diffusion Calibration using Differential Evolution. 
  \emph{Wilmott Magazine}, 55 (September), 76-79.
  \doi{10.1002/wilm.10034}
  
  Mitchell, M. (1998) 
  \emph{An Introduction to Genetic Algorithms}.
  The MIT Press. ISBN 0262631857.

  Mullen, K.M, Ardia, D., Gil, D., Windover, D., Cline,J. (2011). 
  \pkg{DEoptim:} An R Package for Global Optimization by Differential Evolution. 
  \emph{Journal of Statistical Software}, 40(6), 1-26.
  \doi{10.18637/jss.v040.i06}

  Price, K.V., Storn, R.M., Lampinen J.A. (2006)
  \emph{Differential Evolution - A Practical Approach to Global Optimization}.
  Berlin Heidelberg: Springer-Verlag. ISBN 3540209506.

  Storn, R. and Price, K. (1997) 
  Differential Evolution -- A Simple and Efficient Heuristic for Global Optimization over Continuous Spaces,
  \emph{Journal of Global Optimization}, 11:4, 341--359.  
}
\author{
  David Ardia, Katharine Mullen \email{mullenkate@gmail.com}, 
  Brian Peterson and Joshua Ulrich.
}
\seealso{
  \code{\link{DEoptim.control}} for control arguments,
  \code{\link{DEoptim-methods}} for methods on \code{DEoptim} objects,
  including some examples in plotting the results;
  \code{\link{optim}} or \code{\link{constrOptim}}
  for alternative optimization algorithms.
}
\examples{
  ## Rosenbrock Banana function
  ## The function has a global minimum f(x) = 0 at the point (1,1).  
  ## Note that the vector of parameters to be optimized must be the first 
  ## argument of the objective function passed to DEoptim.
  Rosenbrock <- function(x){
    x1 <- x[1]
    x2 <- x[2]
    100 * (x2 - x1 * x1)^2 + (1 - x1)^2
  }

  ## DEoptim searches for minima of the objective function between
  ## lower and upper bounds on each parameter to be optimized. Therefore
  ## in the call to DEoptim we specify vectors that comprise the
  ## lower and upper bounds; these vectors are the same length as the
  ## parameter vector.
  lower <- c(-10,-10)
  upper <- -lower
 
  ## run DEoptim and set a seed first for replicability
  set.seed(1234)
  DEoptim(Rosenbrock, lower, upper)

  ## increase the population size
  DEoptim(Rosenbrock, lower, upper, DEoptim.control(NP = 100))

  ## change other settings and store the output
  outDEoptim <- DEoptim(Rosenbrock, lower, upper, DEoptim.control(NP = 80,
                        itermax = 400, F = 1.2, CR = 0.7))
  
  ## plot the output
  plot(outDEoptim)

  ## 'Wild' function, global minimum at about -15.81515
  Wild <- function(x)
    10 * sin(0.3 * x) * sin(1.3 * x^2) +
       0.00001 * x^4 + 0.2 * x + 80

  plot(Wild, -50, 50, n = 1000, main = "'Wild function'")

  outDEoptim <- DEoptim(Wild, lower = -50, upper = 50,
                        control = DEoptim.control(trace = FALSE))
  
  plot(outDEoptim)

  DEoptim(Wild, lower = -50, upper = 50,
          control = DEoptim.control(NP = 50))
 
  ## The below examples shows how the call to DEoptim can be
  ## parallelized.
  ## Note that if your objective function requires packages to be
  ## loaded or has arguments supplied via \code{...}, these should be
  ## specified using the \code{packages} and \code{parVar} arguments
  ## in control.  
  \dontrun{ 

  Genrose <- function(x) {
     ## One generalization of the Rosenbrock banana valley function (n parameters)
     n <- length(x)
     ## make it take some time ... 
     Sys.sleep(.001) 
     1.0 + sum (100 * (x[-n]^2 - x[-1])^2 + (x[-1] - 1)^2)
  }

  # get some run-time on simple problems
  maxIt <- 250                     
  n <- 5

  oneCore <- system.time( DEoptim(fn=Genrose, lower=rep(-25, n), upper=rep(25, n),
                   control=list(NP=10*n, itermax=maxIt)))

  withParallel <-  system.time( DEoptim(fn=Genrose, lower=rep(-25, n), upper=rep(25, n),
                   control=list(NP=10*n, itermax=maxIt, parallelType=1)))

  ## Compare timings 
  (oneCore)
  (withParallel)
 }
}
\keyword{nonlinear}
\keyword{optimize}