@multi
Feature: Multi-sample VCF
Here we take a VCF line and parse the information for each of
multiple named samples.
Scenario: When parsing a record
Given the multi sample header line
"""
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Original s1t1 s2t1 s3t1 s1t2 s2t2 s3t2
"""
When I parse the header
Given multisample vcf line
"""
1 10321 . C T 106.30 . AC=5;AF=0.357;AN=14;BaseQRankSum=3.045;DP=1537;Dels=0.01;FS=5.835;HaplotypeScore=220.1531;MLEAC=5;MLEAF=0.357;MQ=26.69;MQ0=258;MQRankSum=-4.870;QD=0.10;ReadPosRankSum=0.815 GT:AD:DP:GQ:PL 0/1:189,25:218:30:30,0,810 0/0:219,22:246:24:0,24,593 0/1:218,27:248:34:34,0,1134 0/0:220,22:248:56:0,56,1207 0/1:168,23:193:19:19,0,493 0/1:139,22:164:46:46,0,689 0/1:167,26:196:20:20,0,522
"""
When I parse the record
Then I expect rec.valid? to be true
And I expect rec.chrom to contain "1"
And I expect rec.pos to contain 10321
And I expect rec.ref to contain "C"
And I expect multisample rec.alt to contain ["T"]
And I expect rec.qual to be 106.30
And I expect rec.info.ac to be 5
And I expect rec.info.af to be 0.357
And I expect rec.info.dp to be 1537
And I expect rec.info['dp'] to be 1537
And I expect rec.info.readposranksum to be 0.815
And I expect rec.info['ReadPosRankSum'] to be 0.815
And I expect rec.info.fields to contain ["AC", "AF", "AN", "BASEQRANKSUM", "DP", "DELS", "FS", "HAPLOTYPESCORE", "MLEAC", "MLEAF", "MQ", "MQ0", "MQRANKSUM", "QD", "READPOSRANKSUM"]
And I expect rec.sample['Original'].ad to be [189,25]
And I expect rec.sample['Original'].gt to be "0/1"
And I expect rec.sample['s3t2'].ad to be [167,26]
And I expect rec.sample['s3t2'].dp to be 196
And I expect rec.sample['s3t2'].gq to be 20
And I expect rec.sample['s3t2'].pl to be [20,0,522]
# The nicer, self-resolving form: sample names as methods on rec.sample
And I expect rec.sample.original.gt to be "0/1"
And I expect rec.sample.s3t2.pl to be [20,0,522]
# Even better: sample names resolve directly on the record
And I expect rec.original.gt? to be true
And I expect rec.original.gt to be "0/1"
And I expect rec.s3t2.pl to be [20,0,522]
# Check for missing data
And I expect test rec.missing_samples? to be false
And I expect test rec.original? to be true
# Special functions: genotype as allele indices (gti) and as allele strings (gts)
And I expect r.original? to be true
And I expect r.original.gti? to be true
And I expect r.original.gti to be [0,1]
And I expect r.original.gti[1] to be 1
And I expect r.original.gts? to be true
And I expect r.original.gts to be ["C","T"]
And I expect r.original.gts[1] to be "T"
Given multisample vcf line
"""
1 10723 . C G 73.85 . AC=4;AF=0.667;AN=6;BaseQRankSum=1.300;DP=18;Dels=0.00;FS=3.680;HaplotypeScore=0.0000;MLEAC=4;MLEAF=0.667;MQ=20.49;MQ0=11;MQRankSum=1.754;QD=8.21;ReadPosRankSum=0.000 GT:AD:DP:GQ:PL ./. ./. 1/1:2,2:4:6:66,6,0 1/1:4,1:5:3:36,3,0 ./. ./. 0/0:6,0:6:3:0,3,33
"""
When I parse the record
Then I expect rec.pos to contain 10723
And I expect rec.valid? to be true
And I expect rec.original? to be false
And I expect rec.sample.s1t1? to be false
And I expect rec.sample.s3t2? to be true
And I expect rec.missing_samples? to be true
# Phased genotype
Given multisample vcf line
"""
1 10723 . C G 73.85 . AC=4;AF=0.667;AN=6;BaseQRankSum=1.300;DP=18;Dels=0.00;FS=3.680;HaplotypeScore=0.0000;MLEAC=4;MLEAF=0.667;MQ=20.49;MQ0=11;MQRankSum=1.754;QD=8.21;ReadPosRankSum=0.000 GT:AD:DP:GQ:PL 0|1 ./. 1/1:2,2:4:6:66,6,0 1/1:4,1:5:3:36,3,0 ./. ./. 0/0:6,0:6:3:0,3,33
"""
When I parse the record
Then I expect rec.pos to contain 10723
And I expect rec.valid? to be true
And I expect r.original? to be true
And I expect r.original.gts? to be true
And I expect r.original.gts to be ["C","G"]
And I expect r.original.gts[0] to be "C"
And I expect r.original.gts[1] to be "G"
# INFO fields whose names share a tail (END and CIEND) must not be confused
Given multisample vcf line
"""
1 10723 . C G 73.85 . AC=4;AF=0.667;CIEND=999;END=111;AN=6;BaseQRankSum=1.300;DP=18;Dels=0.00;FS=3.680;HaplotypeScore=0.0000;MLEAC=4;MLEAF=0.667;MQ=20.49;MQ0=11;MQRankSum=1.754;QD=8.21;ReadPosRankSum=0.000 GT:AD:DP:GQ:PL 0|1 ./. 1/1:2,2:4:6:66,6,0 1/1:4,1:5:3:36,3,0 ./. ./. 0/0:6,0:6:3:0,3,33
"""
When I parse the record
Then I expect r.info.end to be 111
And I expect r.info.ciend to be 999