[![GoDoc](https://godoc.org/github.com/brentp/vcfgo?status.svg)](https://godoc.org/github.com/brentp/vcfgo)
[![Build Status](https://travis-ci.org/brentp/vcfgo.svg)](https://travis-ci.org/brentp/vcfgo)
[![Coverage Status](https://coveralls.io/repos/brentp/vcfgo/badge.svg)](https://coveralls.io/r/brentp/vcfgo)

vcfgo is a golang library to read, write and manipulate files in the variant call format.

# vcfgo
--
    import "github.com/brentp/vcfgo"

Package vcfgo implements a Reader and Writer for variant call format. It eases
reading, filtering, and modifying VCFs even if they are not to spec. Example:

## Usage

```go
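package main

import (
	"fmt"
	"os"

	"github.com/brentp/vcfgo"
)

func main() {
	f, err := os.Open("examples/test.auto_dom.no_parents.vcf")
	if err != nil {
		panic(err)
	}
	defer f.Close()

	rdr, err := vcfgo.NewReader(f, false)
	if err != nil {
		panic(err)
	}
	for {
		// Read returns nil once there are no more records
		variant := rdr.Read()
		if variant == nil {
			break
		}
		fmt.Printf("%s\t%d\t%s\t%v\n", variant.Chromosome, variant.Pos, variant.Ref(), variant.Alt())
		// check the error before asserting the type; a failed lookup would make dp.(int) panic
		dp, err := variant.Info().Get("DP")
		if err != nil {
			panic(err)
		}
		fmt.Printf("depth: %v\n", dp.(int))
		sample := variant.Samples[0]
		// we can get the PL field as a list (-1 is the default in case of a missing value)
		PL, err := variant.GetGenotypeField(sample, "PL", -1)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%v\n", PL)
		_ = sample.DP
	}
	// report any errors accumulated while parsing
	if err := rdr.Error(); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}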
```

## Status

`vcfgo` is well-tested, but still in development. It tries to tolerate errors
while still reporting them: after every `rdr.Read()` call, the caller can check
`rdr.Error()` to see what went wrong. Parsing continues past these errors unless
the caller explicitly chooses to stop.

Info and sample fields are pre-parsed and stored as `map[string]interface{}`, so
callers must use a type assertion to recover the appropriate type upon retrieval.
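
As a rough illustration of what retrieving values from such a map involves, the sketch below uses a local stand-in type (`infoMap`) and a hypothetical helper (`describe`); neither is part of vcfgo, and the concrete Go types stored for each field (for example `float64` here) are assumptions made for the example.

```go
package main

import "fmt"

// infoMap is a local stand-in for a parsed INFO field; it is NOT the vcfgo type.
type infoMap map[string]interface{}

// describe is a hypothetical helper that reports the dynamic type and value
// stored under key, using a type switch instead of a single assertion.
func describe(m infoMap, key string) string {
	v, ok := m[key]
	if !ok {
		return key + ": missing"
	}
	switch t := v.(type) {
	case int:
		return fmt.Sprintf("%s: int %d", key, t)
	case float64:
		return fmt.Sprintf("%s: float %.2f", key, t)
	case []int:
		return fmt.Sprintf("%s: ints %v", key, t)
	case bool:
		return fmt.Sprintf("%s: flag %t", key, t)
	default:
		return fmt.Sprintf("%s: %T %v", key, t, t)
	}
}

func main() {
	// Example values only; the stored types are assumptions for this sketch.
	info := infoMap{"DP": 35, "AF": 0.25, "AC": []int{1, 2}, "DB": true}
	for _, k := range []string{"DP", "AF", "AC", "DB", "MQ"} {
		fmt.Println(describe(info, k))
	}
	// The comma-ok form avoids a panic when the stored type differs.
	if dp, ok := info["DP"].(int); ok {
		fmt.Println("depth:", dp)
	}
}
```

A plain assertion such as `v.(int)` panics when the stored value has another type, so the comma-ok form or a type switch is usually the safer choice for files that may not be to spec.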

#### type Header

```go
type Header struct {
	SampleNames   []string
	Infos         map[string]*Info
	SampleFormats map[string]*SampleFormat
	Filters       map[string]string
	Extras        map[string]string
	FileFormat    string
	// contig id maps to a map of length, URL, etc.
	Contigs map[string]map[string]string
}
```

Header holds all the type and format information for the variants.

#### func  NewHeader

```go
func NewHeader() *Header
```
NewHeader returns a Header with the requisite allocations.

#### type Info

```go
type Info struct {
	Id          string
	Description string
	Number      string // A G R . ''
	Type        string // STRING INTEGER FLOAT FLAG CHARACTER UNKNOWN
}
```

Info holds the Info and Format fields.

#### func (*Info) String

```go
func (i *Info) String() string
```
String returns a string representation.

#### type InfoMap

```go
type InfoMap map[string]interface{}
```

InfoMap holds the parsed Info field which can contain floats, ints and lists
thereof.

#### func (InfoMap) String

```go
func (m InfoMap) String() string
```
String returns a string that matches the original info field.

#### type Reader

```go
type Reader struct {
	Header *Header

	LineNumber int64
}
```

Reader holds information about the current line number (for errors) and the VCF
header that indicates the structure of records.

#### func  NewReader

```go
func NewReader(r io.Reader, lazySamples bool) (*Reader, error)
```
NewReader returns a Reader.

#### func (*Reader) Clear

```go
func (vr *Reader) Clear()
```
Clear empties the cache of errors.

#### func (*Reader) Error

```go
func (vr *Reader) Error() error
```
Error() aggregates the multiple errors that can occur into a single object.

#### func (*Reader) Read

```go
func (vr *Reader) Read() *Variant
```
Read returns a pointer to a Variant, or nil when there are no more records (as
in the usage loop above). After each call, the caller is expected to check
Reader.Error().

#### type SampleFormat

```go
type SampleFormat Info
```

SampleFormat holds the type info for Format fields.

#### func (*SampleFormat) String

```go
func (i *SampleFormat) String() string
```
String returns a string representation.

#### type SampleGenotype

```go
type SampleGenotype struct {
	Phased bool
	GT     []int
	DP     int
	GL     []float32
	GQ     int
	MQ     int
	Fields map[string]string
}
```

SampleGenotype holds the information about a sample. Several fields are
pre-parsed, but all fields are kept in Fields as well.

#### func  NewSampleGenotype

```go
func NewSampleGenotype() *SampleGenotype
```
NewSampleGenotype allocates the internals and returns a SampleGenotype.

#### func (*SampleGenotype) String

```go
func (sg *SampleGenotype) String(fields []string) string
```
String returns the string representation of the sample field.
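
To make the `Phased` and `GT` fields concrete, here is a minimal sketch of how a genotype string could be split into allele indexes. `parseGT` is a hypothetical helper written for this example and is not vcfgo's parser; in particular, encoding a missing allele (`.`) as -1 is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseGT is a hypothetical helper, not part of vcfgo. It splits a GT string
// such as "0/1" or "1|0" into allele indexes and reports whether the genotype
// is phased (uses "|"). A missing allele (".") is recorded as -1 (assumption).
func parseGT(gt string) ([]int, bool, error) {
	phased := strings.Contains(gt, "|")
	fields := strings.FieldsFunc(gt, func(r rune) bool {
		return r == '/' || r == '|'
	})
	if len(fields) == 0 {
		return nil, false, fmt.Errorf("empty genotype %q", gt)
	}
	alleles := make([]int, 0, len(fields))
	for _, f := range fields {
		if f == "." {
			alleles = append(alleles, -1)
			continue
		}
		n, err := strconv.Atoi(f)
		if err != nil {
			return nil, false, fmt.Errorf("bad allele %q in genotype %q", f, gt)
		}
		alleles = append(alleles, n)
	}
	return alleles, phased, nil
}

func main() {
	for _, gt := range []string{"0/1", "1|0", "./.", "x/1"} {
		alleles, phased, err := parseGT(gt)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> alleles=%v phased=%t\n", gt, alleles, phased)
	}
}
```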

#### type VCFError

```go
type VCFError struct {
	Msgs  []string
	Lines []int64
}
```

VCFError satisfies the error interface and allows multiple errors. This is
useful because, for example, on a single line, every sample may have a field
that doesn't match the description in the header. We want to keep parsing but
also let the caller know about the error.

#### func  NewVCFError

```go
func NewVCFError() *VCFError
```
NewVCFError allocates the needed ingredients.

#### func (*VCFError) Add

```go
func (e *VCFError) Add(err error, line int64)
```
Add adds an error and the line number within the VCF where the error took place.

#### func (*VCFError) Clear

```go
func (e *VCFError) Clear()
```
Clear empties the stored messages.

#### func (*VCFError) Error

```go
func (e *VCFError) Error() string
```
Error returns a string with all errors delimited by newlines.

#### func (*VCFError) IsEmpty

```go
func (e *VCFError) IsEmpty() bool
```
IsEmpty returns true if there are no errors stored.

#### type Variant

```go
type Variant struct {
	Chromosome string
	Pos        uint64
	Id         string
	Ref        string
	Alt        []string
	Quality    float32
	Filter     string
	Info       InfoMap
	Format     []string
	Samples    []*SampleGenotype
	Header     *Header
	LineNumber int64
}
```

Variant holds the information about a single site. It is analogous to a row in a
VCF file.

#### func (*Variant) GetGenotypeField

```go
func (v *Variant) GetGenotypeField(g *SampleGenotype, field string, missing interface{}) (interface{}, error)
```
GetGenotypeField uses the information from the header to parse the correct type
from a genotype field. It returns an interface that can be asserted to the
expected type.
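
The idea of header-driven parsing can be sketched as follows. `parseField` is a hypothetical helper, not the library's code: it takes the raw text of a field, the type declared for it in the header, and a value to return when the field is missing. Always returning a slice for numeric types is a simplification made for this example.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseField is a hypothetical helper, not part of vcfgo. It converts the raw
// text of a genotype field into a Go value according to the declared type.
// Missing values ("" or ".") yield the caller-supplied default.
func parseField(raw, typ string, missing interface{}) (interface{}, error) {
	if raw == "" || raw == "." {
		return missing, nil
	}
	parts := strings.Split(raw, ",")
	switch strings.ToUpper(typ) {
	case "INTEGER":
		out := make([]int, len(parts))
		for i, p := range parts {
			n, err := strconv.Atoi(p)
			if err != nil {
				return missing, fmt.Errorf("value %q is not an integer", p)
			}
			out[i] = n
		}
		return out, nil
	case "FLOAT":
		out := make([]float64, len(parts))
		for i, p := range parts {
			x, err := strconv.ParseFloat(p, 64)
			if err != nil {
				return missing, fmt.Errorf("value %q is not a float", p)
			}
			out[i] = x
		}
		return out, nil
	case "STRING":
		return raw, nil
	}
	return missing, fmt.Errorf("unsupported type %q", typ)
}

func main() {
	pl, err := parseField("0,30,255", "Integer", -1)
	if err != nil {
		panic(err)
	}
	// The result is an interface{}; assert it to the expected type.
	if ints, ok := pl.([]int); ok {
		fmt.Println("PL:", ints, "first:", ints[0])
	}
	gq, _ := parseField(".", "Integer", -1)
	fmt.Println("missing GQ falls back to:", gq)
}
```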

#### func (*Variant) String

```go
func (v *Variant) String() string
```
String gives a string representation of a variant.

#### type Writer

```go
type Writer struct {
	io.Writer
	Header *Header
}
```

Writer allows writing VCF files.

#### func  NewWriter

```go
func NewWriter(w io.Writer, h *Header) (*Writer, error)
```
NewWriter returns a writer after writing the header.

#### func (*Writer) WriteVariant

```go
func (w *Writer) WriteVariant(v *Variant)
```
WriteVariant writes a single variant.