.. include:: includeme.rst

.. doctest::
   :hide:

   >>> import pybedtools
   >>> a = pybedtools.example_bedtool('a.bed')
   >>> b = pybedtools.example_bedtool('b.bed')

Chaining methods together (pipe)
--------------------------------

One useful thing about :class:`BedTool` methods is that they often return a
new :class:`BedTool`.  In practice, this means that we can chain together
multiple method calls all in one line, similar to piping on the command
line.

For example, this intersect and merge can be combined into one command:

.. doctest::
    :options: +NORMALIZE_WHITESPACE

    >>> # These two lines...
    >>> x1 = a.intersect(b, u=True)
    >>> x2 = x1.merge()

    >>> # ...can be combined into one line:
    >>> x3 = a.intersect(b, u=True).merge()

    >>> x2 == x3
    True
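
Chaining works because each method returns a new object of the same type.
The following is a minimal, standard-library-only sketch of that pattern.
The ``Intervals`` class and its ``overlapping`` and ``merged`` methods are
hypothetical stand-ins invented for illustration; they are not part of
:mod:`pybedtools`, whose methods call the BEDTools_ programs instead.

.. code-block:: python

    class Intervals:
        """Hypothetical stand-in for a BedTool: a list of (start, stop) pairs."""

        def __init__(self, pairs):
            self.pairs = sorted(pairs)

        def __eq__(self, other):
            return isinstance(other, Intervals) and self.pairs == other.pairs

        def overlapping(self, other):
            # Keep each interval of self that overlaps anything in other
            # (loosely analogous to intersect(..., u=True)).
            kept = [
                (s, e) for s, e in self.pairs
                if any(s < oe and os_ < e for os_, oe in other.pairs)
            ]
            return Intervals(kept)  # returning a new object enables chaining

        def merged(self):
            # Combine overlapping or touching intervals into single intervals.
            out = []
            for s, e in self.pairs:
                if out and s <= out[-1][1]:
                    out[-1] = (out[-1][0], max(out[-1][1], e))
                else:
                    out.append((s, e))
            return Intervals(out)


    a_toy = Intervals([(1, 100), (100, 200), (150, 500), (900, 950)])
    b_toy = Intervals([(155, 200), (800, 901)])

    # Two separate steps ...
    step1 = a_toy.overlapping(b_toy)
    step2 = step1.merged()

    # ... or one chained expression.
    chained = a_toy.overlapping(b_toy).merged()
    assert step2 == chained

Because every method hands back an ``Intervals`` instance, the chained
expression and the two-step version go through exactly the same calls.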

A rule of thumb is that all methods that wrap BEDTools_ programs return
:class:`BedTool` objects, so you can chain them together. Many
:mod:`pybedtools`-unique methods return :class:`BedTool` objects too; check
the docs to be sure (see :ref:`good docs principle`). The
:meth:`BedTool.saveas` method, as we saw in one of the examples above, is
one of them.  That means we can insert :meth:`BedTool.saveas` calls into
the chain above to save the intermediate steps under meaningful filenames
for later use:

.. doctest::

    >>> x4 = a.intersect(b, u=True).saveas('a-with-b.bed').merge().saveas('a-with-b-merged.bed')

Now we have new files in the current directory called :file:`a-with-b.bed`
and :file:`a-with-b-merged.bed`.  Since :meth:`BedTool.saveas` returns a
:class:`BedTool` object, ``x4`` points to the :file:`a-with-b-merged.bed`
file.

Sometimes it can be cleaner to separate consecutive calls on each line:

.. doctest::

    >>> x4 = a\
    ... .intersect(b, u=True)\
    ... .saveas('a-with-b.bed')\
    ... .merge()\
    ... .saveas('a-with-b-merged.bed')

Operator overloading
--------------------

There's an even easier way to chain together commands.

I found myself doing intersections so much that I thought it would be
useful to overload the ``+`` and ``-`` operators to do intersections.
To illustrate, these two example commands do the same thing:

.. doctest::
 
    >>> x5 = a.intersect(b, u=True)
    >>> x6 = a + b

    >>> x5 == x6
    True

Just as the ``+`` operator assumes ``intersectBed`` with the ``-u`` arg, the
``-`` operator assumes ``intersectBed`` with the ``-v`` arg:


.. doctest::

    >>> x7 = a.intersect(b, v=True)
    >>> x8 = a - b

    >>> x7 == x8
    True
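
In Python, this kind of overloading is done by defining the ``__add__`` and
``__sub__`` special methods on a class. The sketch below shows the general
mechanism with a hypothetical ``Intervals`` class. Its names and its
``invert`` parameter are assumptions made up for this illustration, and it
should not be read as how :mod:`pybedtools` implements its operators.

.. code-block:: python

    class Intervals:
        """Hypothetical stand-in for a BedTool: a list of (start, stop) pairs."""

        def __init__(self, pairs):
            self.pairs = sorted(pairs)

        def __eq__(self, other):
            return isinstance(other, Intervals) and self.pairs == other.pairs

        def _hits(self, pair, other):
            # True if the given interval overlaps any interval in other.
            s, e = pair
            return any(s < oe and os_ < e for os_, oe in other.pairs)

        def overlapping(self, other, invert=False):
            # invert=False keeps intervals that have an overlap (like -u);
            # invert=True keeps intervals that have none (like -v).
            return Intervals(
                p for p in self.pairs if self._hits(p, other) != invert
            )

        def __add__(self, other):
            # "x + y" delegates to the overlap-keeping form.
            return self.overlapping(other)

        def __sub__(self, other):
            # "x - y" delegates to the inverted form.
            return self.overlapping(other, invert=True)


    a_toy = Intervals([(1, 100), (100, 200), (150, 500), (900, 950)])
    b_toy = Intervals([(155, 200), (800, 901)])

    with_hit = a_toy + b_toy
    without_hit = a_toy - b_toy

    assert with_hit == a_toy.overlapping(b_toy)
    assert without_hit == a_toy.overlapping(b_toy, invert=True)

The operators add no new behavior here; they are shorthand for the two
method calls, which is the same relationship the doctests above check.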


If you want to operate on the :class:`BedTool` that is returned by an
addition or subtraction, you'll need to wrap the operation in parentheses,
because a method call binds more tightly than ``+`` or ``-``.  This is
another way to chain the intersection and merge from the example above:

.. doctest:: 

    >>> x9 = (a + b).merge()
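
The need for parentheses follows from Python's operator precedence. As a
small check that uses only the standard library :mod:`ast` module (the
expressions are parsed, never executed, so ``a`` and ``b`` are just names in
a string), the two spellings can be compared like this:

.. code-block:: python

    import ast

    unparenthesized = ast.parse("a + b.merge()", mode="eval").body
    parenthesized = ast.parse("(a + b).merge()", mode="eval").body

    # Without parentheses the top-level node is the addition; merge() is
    # called on b alone, before the addition happens.
    assert isinstance(unparenthesized, ast.BinOp)
    assert isinstance(unparenthesized.right, ast.Call)

    # With parentheses the top-level node is the merge() call, and its
    # receiver is the result of the addition.
    assert isinstance(parenthesized, ast.Call)
    assert isinstance(parenthesized.func.value, ast.BinOp)

In other words, ``a + b.merge()`` would merge ``b`` first and then intersect,
which is a different operation from merging the result of ``a + b``.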

And to double-check that all these methods return the same thing:

.. doctest::

    >>> x2 == x3 == x4 == x9
    True


You can learn more about chaining in :ref:`chaining principle`.