% This file is part of the 'foreign' package for R
% It is distributed under the GPL version 2 or later

\name{write.dta}
\alias{write.dta}
\title{Write Files in Stata Binary Format}
\usage{
write.dta(dataframe, file, version = 7,
          convert.dates = TRUE, tz = "GMT",
          convert.factors = c("labels", "string", "numeric", "codes"))
}
\arguments{
  \item{dataframe}{a data frame.}
  \item{file}{character string giving filename.}
  \item{version}{Stata version: 6, 7, 8 and 10 are supported, and 9 is
    mapped to 8.}
  \item{convert.dates}{Convert \code{Date} and \code{POSIXt} objects
    to Stata dates?}
  \item{tz}{timezone for date conversion}
  \item{convert.factors}{how to handle factors}
  } 
\description{
  Writes the data frame to file in the Stata binary
  format.  Does not write matrix variables.
}
\details{
  The major difference between Stata versions is that 7.0 and later
  allow 32-character variable names (5 and 6 were restricted to
  8-character names). The \code{abbreviate} function is used to trim long
  variable names to the permitted length. A warning is given if this is
  needed, and it is an error for the abbreviated names not to be unique.

  The columns in the data frame become variables in the Stata data set.
  Missing values are correctly handled.  Optionally, R date/time objects
  (\code{POSIXt} classes) are converted into the Stata format.  This loses
  information -- Stata dates are in days since 1960-1-1.  \code{POSIXct}
  objects can be written without conversion but will not be understood as
  dates by Stata;  \code{POSIXlt} objects cannot be written without
  conversion.

  There are four options for handling factors. The default is to use
  Stata \code{value labels} for the factor levels.
  With \code{convert.factors="string"}, the factor levels are written as
  strings. With \code{convert.factors="numeric"} the numeric values
  of the levels are written, or \code{NA} if they cannot be coerced to
  numeric. Finally, \code{convert.factors="codes"} writes the underlying
  integer codes of the factors. This last used to be the only available
  method and is provided largely for backwards compatibility.

  For Stata 8 or later use the default \code{version=7} -- the only
  advantage of Stata 8 format is that it can represent multiple
  different missing value types, and R doesn't have them. Stata 10
  allows longer format lists, but R does not make use of them.

  Note that the Stata formats are documented to use ASCII strings --
  \R does not enforce this, but use of non-ASCII character strings will
  not be portable as the encoding is not recorded.

  Stata uses some large numerical values to represent missing
  values. This function does not currently check, and hence integers
  greater than \code{2147483620} and doubles greater than
  \code{8.988e+307} may be misinterpreted by Stata.
}
\value{
  \code{NULL}
}
\references{
  The Stata 6.0 User's Manual, the Stata 7.0 Programming manual, and the
  Stata 8.0 and 9.0 online help describe the file formats.
} 
\author{Thomas Lumley}
\seealso{
  \code{\link{read.dta}},
  \code{\link{attributes}},
  \code{\link{DateTimeClasses}},
  \code{\link{abbreviate}}
}
\examples{
write.dta(swiss, swissfile <- tempfile())
read.dta(swissfile)
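
## The lines below are an illustrative sketch of the options described
## in Details, not a definitive recipe.  The names 'df', 'f_lab',
## 'f_num' and 'f_date' are made up for this example.

## Factor handling: the default writes Stata value labels, whereas
## convert.factors = "numeric" writes the numeric values of the levels.
df <- data.frame(id = 1:3, grp = factor(c("10", "20", "10")))
write.dta(df, f_lab <- tempfile())
write.dta(df, f_num <- tempfile(), convert.factors = "numeric")
str(read.dta(f_lab)$grp)
str(read.dta(f_num)$grp)

## Date handling: with convert.dates = TRUE (the default) a Date column
## is written as a Stata date, i.e. days since 1960-1-1.
df$when <- as.Date("2000-01-01") + 0:2
write.dta(df, f_date <- tempfile())
read.dta(f_date)$when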
}
\keyword{file}