% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/parsing.R
\name{parse.folder}
\alias{parse.folder}
\alias{parse.file.list}
\alias{parse.file}
\alias{parse.mitcr}
\alias{parse.mitcrbc}
\alias{parse.migec}
\alias{parse.vdjtools}
\alias{parse.immunoseq}
\alias{parse.immunoseq2}
\alias{parse.immunoseq3}
\alias{parse.tcr}
\alias{parse.mixcr}
\alias{parse.imseq}
\alias{parse.migmap}
\title{Parse input table files with immune receptor repertoire data.}
\usage{
parse.file(.filename, 
.format = c('mitcr', 'mitcrbc', 'migec', 'vdjtools', 'immunoseq', 
'mixcr', 'imseq', 'tcr'), ...)

parse.file.list(.filenames, 
.format = c('mitcr', 'mitcrbc', 'migec', 'vdjtools', 'immunoseq', 
'mixcr', 'imseq', 'tcr'), .namelist = NA)

parse.folder(.folderpath, 
.format = c('mitcr', 'mitcrbc', 'migec', 'vdjtools', 'immunoseq', 
'mixcr', 'imseq', 'tcr'), ...)

parse.mitcr(.filename)

parse.mitcrbc(.filename)

parse.migec(.filename)

parse.vdjtools(.filename)

parse.immunoseq(.filename)

parse.immunoseq2(.filename)

parse.immunoseq3(.filename)

parse.mixcr(.filename)

parse.imseq(.filename)

parse.tcr(.filename)

parse.migmap(.filename)
}
\arguments{
\item{.folderpath}{Path to the folder with text cloneset files.}

\item{.format}{String that specifies the input format.}

\item{...}{Parameters passed to \code{parse.cloneset}.}

\item{.filename}{Path to the input file with cloneset data.}

\item{.filenames}{Vector or list with paths to files with cloneset data.}

\item{.namelist}{Either NA or a character vector of the same length as \code{.filenames} with names for the output data frames.}
}
\value{
Data frame with immune receptor repertoire data. Each row in this data frame corresponds to a clonotype.
The data frame has the following columns:

- "Umi.count" - number of barcodes (events, UMIs);

- "Umi.proportion" - proportion of barcodes (events, UMIs);

- "Read.count" - number of reads;

- "Read.proportion" - proportion of reads;

- "CDR3.nucleotide.sequence" - CDR3 nucleotide sequence;

- "CDR3.amino.acid.sequence" - CDR3 amino acid sequence;

- "V.gene" - names of aligned Variable gene segments;

- "J.gene" - names of aligned Joining gene segments;

- "D.gene" - names of aligned Diversity gene segments;

- "V.end" - last positions of aligned V gene segments (1-based);

- "J.start" - first positions of aligned J gene segments (1-based);

- "D5.end" - positions of the 5' end of aligned D gene segments (1-based);

- "D3.end" - positions of the 3' end of aligned D gene segments (1-based);

- "VD.insertions" - number of inserted nucleotides (N-nucleotides) at V-D junction (-1 for receptors with VJ recombination);

- "DJ.insertions" - number of inserted nucleotides (N-nucleotides) at D-J junction (-1 for receptors with VJ recombination);

- "Total.insertions" - total number of inserted nucleotides (number of N-nucleotides at V-J junction for receptors with VJ recombination).
}
\description{
Load the TCR data from the file with the given filename into a data frame, or load all 
files from the given folder into a list of data frames. The folder must contain only files in the specified format.
Input files can be either plain text files or files compressed with gzip ("filename.txt.gz") or bzip2 ("filename.txt.bz2").
For a general parser see \code{\link{parse.cloneset}}.

Parsers are available for:
MiTCR ("mitcr"), MiTCR w/ UMIs ("mitcrbc"), MiGEC ("migec"), VDJtools ("vdjtools"), 
ImmunoSEQ ("immunoseq" or "immunoseq2" for the old and new formats, respectively),
MiXCR ("mixcr"), IMSEQ ("imseq") and tcR ("tcr", data frames saved with the \code{repSave()} function).

Output of MiXCR should contain either all hits or best hits for each gene segment.

Output of IMSEQ should be generated with the parameter "-on". In this case the output data frame will contain no positions of aligned gene segments
because of limitations of the IMSEQ output.

tcR's data frames should be saved with the \code{repSave()} function.
}
\examples{
\dontrun{
# Parse the file "~/mitcr/immdata1.txt" as a MiTCR file.
immdata1 <- parse.file("~/mitcr/immdata1.txt", 'mitcr')
# Parse a gzip-compressed VDJtools file.
immdata3 <- parse.file("~/mitcr/immdata3.txt.gz", 'vdjtools')
# Parse files "~/data/immdata1.txt" and "~/data/immdata2.txt" as MiGEC files.
immdata12 <- parse.file.list(c("~/data/immdata1.txt",
                             "~/data/immdata2.txt"), 'migec')
# Parse all files in "~/data/" as MiGEC files.
immdata <- parse.folder("~/data/", 'migec')
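# A further sketch that uses only the arguments documented above. The sample
# names "donor1" and "donor2" are hypothetical, chosen for illustration.
immdata12n <- parse.file.list(c("~/data/immdata1.txt",
                                "~/data/immdata2.txt"), 'migec',
                              .namelist = c("donor1", "donor2"))
# Assuming the returned list elements carry the names given in .namelist,
# inspect one repertoire through columns listed in the Value section.
head(immdata12n$donor1[, c("CDR3.amino.acid.sequence", "V.gene",
                           "J.gene", "Read.count")])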
}
}
\seealso{
\link{parse.cloneset}, \link{repSave}, \link{repLoad}
}