File: obisample.rst

package info (click to toggle)
obitools 1.2.13%2Bdfsg-5
  • links: PTS, VCS
  • area: main
  • in suites: bookworm
  • size: 4,652 kB
  • sloc: python: 18,199; ansic: 1,542; makefile: 98
file content (67 lines) | stat: -rw-r--r-- 2,363 bytes parent folder | download | duplicates (3)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
.. automodule:: obisample
   
   :py:mod:`obisample` specific options
   ------------------------------------   

   .. cmdoption::  -s ###, --sample-size ###   
   
        Specifies the size of the generated sample.
                       
            - without the ``-a`` option, sample size is expressed as the exact number of sequence 
              records to be sampled (default: number of sequence records in the input file). 
       
            - with the ``-a`` option, sample size is expressed as a fraction of the
              sequence record numbers in the input file 
              (expressed as a number between 0 and 1).
 
    *Example:*
    
    		.. code-block:: bash
    
       			> obisample -s 1000 seq1.fasta > seq2.fasta
     
		Samples randomly 1000 sequence records from the ``seq1.fasta`` file, with replacement, 
		and saves them in the ``seq2.fasta`` file.
   
   .. cmdoption::  -a, --approx-sampling   
   
                   Switches the resampling algorithm to an approximative one, 
                   useful for large files.
                   
                   The default algorithm selects exactly the number of sequence records
                   specified with the ``-s`` option. When the ``-a`` option is set, 
                   each sequence record has a probability to be selected related to the
                   ``count`` attribute of the sequence record and the ``-s`` fraction. 

   	*Example:*
        
    		.. code-block:: bash
    
       			> obisample -s 0.5 -a seq1.fastq > seq2.fastq
    
		Samples randomly half of the sequence records of the ``seq1.fastq`` file, 
		without replacement, 
		and saves them in the ``seq2.fastq`` file.

   .. cmdoption::  -w, --without-replacement   
   
                   Asks for sampling without replacement.

   	*Example:*
        
    		.. code-block:: bash

       			> obisample -s 1000 -w seq1.fasta > seq2.fasta

   		Samples randomly 1000 sequence records from the ``seq1.fasta`` file, without replacement 
   		(the input file must contain at least 1000 sequences), and saves them in the ``seq2.fasta`` file.

   .. include:: ../optionsSet/inputformat.txt

   .. include:: ../optionsSet/defaultoptions.txt

   :py:mod:`obisample` used sequence attribute
   -------------------------------------------   

           - :doc:`count <../attributes/count>`