% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/dataset-format.R
\name{FragmentScanOptions}
\alias{FragmentScanOptions}
\alias{CsvFragmentScanOptions}
\alias{ParquetFragmentScanOptions}
\alias{JsonFragmentScanOptions}
\title{Format-specific scan options}
\description{
A \code{FragmentScanOptions} holds options specific to a \code{FileFormat} and a scan
operation.
}
\section{Factory}{

\code{FragmentScanOptions$create()} takes the following arguments:
\itemize{
\item \code{format}: A string identifier of the file format. Currently supported values:
\itemize{
\item "parquet"
\item "csv"/"text", aliases for the same format.
}
\item \code{...}: Additional format-specific options

\code{format = "parquet"}:
\itemize{
\item \code{use_buffered_stream}: Read files through buffered input streams rather than
loading entire row groups at once. This may be enabled
to reduce memory overhead. Disabled by default.
\item \code{buffer_size}: Size of buffered stream, if enabled. Default is 8KB.
\item \code{pre_buffer}: Pre-buffer the raw Parquet data. This can improve performance
on high-latency filesystems. Disabled by default.
\item \code{thrift_string_size_limit}: Maximum string size allocated for decoding thrift
strings. May need to be increased in order to read
files with especially large headers. Default value
100000000.
\item \code{thrift_container_size_limit}: Maximum size of thrift containers. May need to be
increased in order to read files with especially large
headers. Default value 1000000.
}

\code{format = "text"}: see \link{CsvConvertOptions}. Note that options can only be
specified using the Arrow C++ library's option names. In addition, \code{block_size} from
\link{CsvReadOptions} may be given.
}

It returns the appropriate subclass of \code{FragmentScanOptions}
(e.g. \code{CsvFragmentScanOptions}).
}
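\section{Example}{

The following is a minimal sketch of the factory described above, not an
authoritative usage guide. It assumes the \code{arrow} package is installed and
was built with dataset support. The specific values (a 64 KB buffer, a 1 MB
block size) and the choice of \code{strings_can_be_null} as the
\link{CsvConvertOptions} setting are illustrative assumptions only.

\preformatted{
library(arrow)

# Parquet: read through a buffered input stream with a 64 KB buffer
# (illustrative value) instead of loading whole row groups at once.
parquet_opts <- FragmentScanOptions$create(
  "parquet",
  use_buffered_stream = TRUE,
  buffer_size = 65536
)
inherits(parquet_opts, "ParquetFragmentScanOptions")

# Text/CSV: options are given under their Arrow C++ names.
# strings_can_be_null is a CsvConvertOptions setting; block_size is the
# one CsvReadOptions setting the documentation above says may be passed.
csv_opts <- FragmentScanOptions$create(
  "text",
  strings_can_be_null = TRUE,
  block_size = 1048576L
)
inherits(csv_opts, "CsvFragmentScanOptions")
}

Both calls go through the same factory, and the \code{format} string alone
decides which subclass is returned.
}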