.. sidebar:: ToC

    .. contents::

.. _tutorial-datastructures-modifiers:

Modifiers
=========

Learning Objective
  In this tutorial you will learn how to modify the elements of a container without copying them using SeqAn modifiers.
  You will learn about the different specializations and how to work with them.

Difficulty
  Basic

Duration
  20 min

Prerequisites
  :ref:`tutorial-getting-started-first-steps-in-seqan`, :ref:`tutorial-datastructures-sequences`

Overview
--------

Modifiers provide a different view of other classes.
They can be used to change how the elements of a container appear without changing or copying the underlying data.
For example, suppose someone gave you an algorithm that works on two arbitrary :dox:`String` objects, but you want to apply it to a string and its reverse (its left-to-right mirror image).
The classical approach would be to copy the string, mirror all elements of the copy from left to right, and call the algorithm with both strings.
With modifiers, e.g. a :dox:`ModifiedString`, you can create the reverse in :math:`\mathcal{O}(1)` extra memory without copying the original string.
This is useful when the original sequence is large.

Modifiers implement a certain concept (e.g. :dox:`ContainerConcept`, :dox:`RandomAccessIteratorConcept Iterator`, ...) or class interface (:dox:`String`, ...) and thus can be used as such.
The mirror modifier is already part of SeqAn; it implements the class interface of :dox:`String` and can therefore be used in every algorithm that works on strings.
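
To make the idea concrete, here is a minimal sketch of a reverse view written with the C++ standard library only.
``ReverseView``, ``toString`` and ``demoReverseView`` are hypothetical names chosen for illustration; this is not SeqAn's implementation, and the view is read-only for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical illustration, not a SeqAn class: a read-only reverse view.
// It stores only a pointer to the host, so it needs O(1) extra memory.
template <typename THost>
class ReverseView
{
public:
    explicit ReverseView(THost const & host) : host_(&host) {}

    std::size_t size() const { return host_->size(); }

    // Position i of the view maps to position size() - 1 - i of the host.
    typename THost::value_type operator[](std::size_t i) const
    {
        return (*host_)[host_->size() - 1 - i];
    }

private:
    THost const * host_;
};

// Materializes a view into a std::string, e.g. for printing or comparing.
template <typename TView>
std::string toString(TView const & view)
{
    std::string out;
    for (std::size_t i = 0; i < view.size(); ++i)
        out += view[i];
    return out;
}

// Builds a view, then edits the host to show that the view follows it.
std::string demoReverseView()
{
    std::string host = "hello world";
    ReverseView<std::string> reversed(host);
    assert(toString(reversed) == "dlrow olleh");

    host.replace(0, 5, "HELLO");  // modify the host, not the view
    return toString(reversed);
}
```

Calling ``demoReverseView()`` returns ``dlrow OLLEH``: the host was edited after the view was created, and the view reflects the change because it never copied the characters.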

The Modified String
-------------------

The :dox:`ModifiedString ModifiedString` is a modifier that implements the :dox:`String` interface and thus can be used like a :dox:`String`.
It has two template parameters.
The first one specifies a sequence type (e.g. :dox:`String`, :dox:`Segment`, ...) and the second one specifies the modifier's behavior.
This can be :dox:`ModReverseString` for mirroring a string left to right or :dox:`ModViewModifiedString` for applying a function to every single character (e.g. 'C'->'G', 'A'->'T', ...).

ModReverse
^^^^^^^^^^

We begin with the specialization :dox:`ModReverseString` from the example above.
Suppose we are given a string:

.. includefrags:: demos/tutorial/modifiers/modreverse.cpp
   :fragment: main

and want to get the reverse.
So we need a :dox:`ModifiedString` specialized with ``String<char>`` and :dox:`ModReverseString`.
We create the modifier and link it with ``myString``:

.. includefrags:: demos/tutorial/modifiers/modreverse.cpp
   :fragment: modifier

The result is:

.. includefrags:: demos/tutorial/modifiers/modreverse.cpp
   :fragment: output1

.. includefrags:: demos/tutorial/modifiers/modreverse.cpp.stdout
   :fragment: output1

To verify that we didn't copy ``myString``, we replace an infix of the original string and see that, as a side effect, the modified string has also changed:

.. includefrags:: demos/tutorial/modifiers/modreverse.cpp
   :fragment: output2

.. includefrags:: demos/tutorial/modifiers/modreverse.cpp.stdout
   :fragment: output2

ModView
^^^^^^^

Another specialization of the :dox:`ModifiedString` is the :dox:`ModViewModifiedString` modifier.
Assume we need all characters of ``myString`` to be in upper case without copying ``myString``.
In SeqAn, you first create a functor (a unary function object in the STL sense) that converts a character to its upper-case counterpart:

.. includefrags:: demos/tutorial/modifiers/modview.cpp
   :fragment: functor

and then create a :dox:`ModifiedString` specialized with ``ModView<MyFunctor>``:

.. includefrags:: demos/tutorial/modifiers/modview.cpp
   :fragment: mod_str

The result is:

.. includefrags:: demos/tutorial/modifiers/modview.cpp
   :fragment: output

.. includefrags:: demos/tutorial/modifiers/modview.cpp.stdout

The upper-case functor and several other predefined functors are already part of SeqAn (in ``seqan/modifier/modifier_functors.h``).
The following functors can be used as an argument of :dox:`ModViewModifiedString`:

``FunctorUpcase<TValue>``
  Converts each character of type ``TValue`` to its upper-case character

``FunctorLowcase<TValue>``
  Converts each character of type ``TValue`` to its lower-case character

``FunctorComplement<Dna>``
  Converts each nucleotide to its complementary nucleotide

``FunctorComplement<Dna5>``
  The same for the :dox:`Dna5` alphabet

``FunctorConvert<TInValue,TOutValue>``
  Converts the type of each character from ``TInValue`` to ``TOutValue``

So instead of defining our own functor, we could have used a predefined one:

.. includefrags:: demos/tutorial/modifiers/modview.cpp
   :fragment: predefined
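
The mechanism behind a functor-based view can be sketched with the standard library alone.
In the sketch below, ``ToUpper``, ``FunctorView`` and ``demoFunctorView`` are hypothetical names, not SeqAn classes, and the view is restricted to ``char`` hosts and read-only access to keep it short.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// A unary function object: maps one character to its upper-case form.
struct ToUpper
{
    char operator()(char c) const
    {
        return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
};

// Hypothetical illustration, not a SeqAn class: a read-only view that
// applies a functor to each character of the host when it is accessed.
template <typename THost, typename TFunctor>
class FunctorView
{
public:
    explicit FunctorView(THost const & host, TFunctor functor = TFunctor())
        : host_(&host), functor_(functor) {}

    std::size_t size() const { return host_->size(); }

    // The functor runs on every access; no converted copy is stored.
    char operator[](std::size_t i) const { return functor_((*host_)[i]); }

private:
    THost const * host_;
    TFunctor functor_;
};

// Reads the host through the view and returns what the view shows.
std::string demoFunctorView()
{
    std::string host = "A man, a plan";
    FunctorView<std::string, ToUpper> upper(host);

    std::string out;
    for (std::size_t i = 0; i < upper.size(); ++i)
        out += upper[i];

    assert(host == "A man, a plan");  // the host itself is unchanged
    return out;
}
```

``demoFunctorView()`` returns ``A MAN, A PLAN`` while ``host`` keeps its original mixed case.
Because the functor is a template parameter, swapping in another conversion only requires a different function object type.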

Assignment 1
""""""""""""

.. container:: assignment

   Type
     Review

   Objective
     In this assignment you will create a modifier using your own functor.
     Assume you are given two DNA sequences as strings, as in the code example below.
     Suppose you know that in one of the sequences a few 'C' nucleotides have been converted into 'T' nucleotides, but you still want to compare the sequences.
     Extend the code example as follows:

     #. Write a functor which converts all 'C' nucleotides to 'T' nucleotides.
     #. Define a :dox:`ModifiedString` with the specialization :dox:`ModViewModifiedString` using this functor.
     #. Apply the modifier to both sequences and compare them, treating every 'C' as a 'T'.
        Print the results.

    .. includefrags:: demos/tutorial/modifiers/assignment1.cpp

   Solution
      .. container:: foldable

         .. includefrags:: demos/tutorial/modifiers/assignment1_solution.cpp

         .. includefrags:: demos/tutorial/modifiers/assignment1_solution.cpp.stdout

^^^^^^^^^

For some commonly used modifiers you can use the following shortcuts:

+-----------------------------------+---------------------------------------------------------------------------------+
| Shortcut                          | Substitution                                                                    |
+===================================+=================================================================================+
| ``ModComplementDna``              | ``ModView<FunctorComplement<Dna> >``                                            |
+-----------------------------------+---------------------------------------------------------------------------------+
| ``ModComplementDna5``             | ``ModView<FunctorComplement<Dna5> >``                                           |
+-----------------------------------+---------------------------------------------------------------------------------+
| ``DnaStringComplement``           | ``ModifiedString<DnaString, ModComplementDna>``                                 |
+-----------------------------------+---------------------------------------------------------------------------------+
| ``Dna5StringComplement``          | ``ModifiedString<Dna5String, ModComplementDna5>``                               |
+-----------------------------------+---------------------------------------------------------------------------------+
| ``DnaStringReverse``              | ``ModifiedString<DnaString, ModReverse>``                                       |
+-----------------------------------+---------------------------------------------------------------------------------+
| ``Dna5StringReverse``             | ``ModifiedString<Dna5String, ModReverse>``                                      |
+-----------------------------------+---------------------------------------------------------------------------------+
| ``DnaStringReverseComplement``    | ``ModifiedString<ModifiedString<DnaString, ModComplementDna>, ModReverse>``     |
+-----------------------------------+---------------------------------------------------------------------------------+
| ``Dna5StringReverseComplement``   | ``ModifiedString<ModifiedString<Dna5String, ModComplementDna5>, ModReverse>``   |
+-----------------------------------+---------------------------------------------------------------------------------+

The Modified Iterator
---------------------

We have seen how a :dox:`ModifiedString` can be used to modify strings without touching or copying the original data.
The same can be done with iterators.
The :dox:`ModifiedIterator` implements the :dox:`RandomAccessIteratorConcept Iterator` concept and thus can be used in every algorithm or data structure that expects an iterator.
In fact, we have already used the :dox:`ModifiedIterator` unknowingly in the examples above, because the :dox:`ModifiedString` returns a corresponding :dox:`ModifiedIterator` via the :dox:`ContainerConcept#Iterator` meta-function.
The main work is done in the :dox:`ModifiedIterator`, whereas the :dox:`ModifiedString` only overloads the :dox:`ContainerConcept#begin` and :dox:`ContainerConcept#end` functions.
Normally, you will use the :dox:`ModifiedString`, and perhaps the result of its :dox:`ContainerConcept#Iterator` meta-function, rather than a :dox:`ModifiedIterator` directly.
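
The division of labor between view and iterator can be illustrated with a small standard-library sketch.
``UpcaseIterator``, ``UpcaseView`` and ``demoModifiedIterator`` are hypothetical names, not SeqAn's classes, and the iterator below is only an input iterator, whereas the :dox:`ModifiedIterator` supports random access.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <iterator>
#include <string>

// Hypothetical illustration: an iterator adaptor that wraps a host iterator
// and upper-cases each character when it is dereferenced.
class UpcaseIterator
{
public:
    using iterator_category = std::input_iterator_tag;
    using value_type = char;
    using difference_type = std::ptrdiff_t;
    using pointer = char const *;
    using reference = char;

    explicit UpcaseIterator(std::string::const_iterator it) : it_(it) {}

    // The modification happens here, on access.
    char operator*() const
    {
        return static_cast<char>(std::toupper(static_cast<unsigned char>(*it_)));
    }
    UpcaseIterator & operator++() { ++it_; return *this; }
    UpcaseIterator operator++(int) { UpcaseIterator tmp(*this); ++it_; return tmp; }
    bool operator==(UpcaseIterator const & other) const { return it_ == other.it_; }
    bool operator!=(UpcaseIterator const & other) const { return it_ != other.it_; }

private:
    std::string::const_iterator it_;
};

// The view itself does little: begin() and end() wrap the host's iterators.
class UpcaseView
{
public:
    explicit UpcaseView(std::string const & host) : host_(&host) {}
    UpcaseIterator begin() const { return UpcaseIterator(host_->begin()); }
    UpcaseIterator end() const { return UpcaseIterator(host_->end()); }

private:
    std::string const * host_;
};

// Feeds the view's iterators to code that only knows about iterators.
std::string demoModifiedIterator()
{
    std::string host = "acgt";
    UpcaseView view(host);
    assert(std::distance(view.begin(), view.end()) == 4);
    return std::string(view.begin(), view.end());
}
```

``demoModifiedIterator()`` returns ``ACGT``.
The ``std::string`` range constructor and ``std::distance`` see ordinary iterators and need no knowledge of the view.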

Nested Modifiers
----------------

Because a modifier implements a certain concept and operates on classes that model this concept, two modifiers can be chained to create a new modifier.
We have seen how the :dox:`ModifiedString` specialized with :dox:`ModReverseString` and :dox:`ModViewModifiedString` can be used.
Now we want to combine them to create a modifier for the reverse complement of a :dox:`DnaString`.
We begin with the original string:

.. includefrags:: demos/tutorial/modifiers/nested.cpp
   :fragment: string

Then we define the modifier that complements a :dox:`DnaString`:

.. includefrags:: demos/tutorial/modifiers/nested.cpp
   :fragment: complement

This modifier should now be reversed from left to right:

.. includefrags:: demos/tutorial/modifiers/nested.cpp
   :fragment: reverse

The original string can be passed to the constructor:

.. includefrags:: demos/tutorial/modifiers/nested.cpp
   :fragment: constructor

The result is:

.. includefrags:: demos/tutorial/modifiers/nested.cpp
   :fragment: output

.. includefrags:: demos/tutorial/modifiers/nested.cpp.stdout
   :fragment: output


Using a predefined shortcut, the whole example could be reduced to:

.. includefrags:: demos/tutorial/modifiers/nested.cpp
   :fragment: alternative
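
Why nesting works can be seen in a small standard-library sketch in which a tag type, passed as the second template parameter, selects the behavior.
``ModifiedView``, ``Reverse``, ``Complement`` and ``demoReverseComplement`` are hypothetical names that only imitate the shape of the SeqAn types; mapping every non-ACGT character to ``N`` is an assumption made for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical tags that play the role of the second template parameter.
struct Reverse {};
struct Complement {};

// Primary template: only declared; each tag selects a partial specialization.
template <typename THost, typename TSpec>
class ModifiedView;

// Complement view: A<->T and C<->G, computed on access.
template <typename THost>
class ModifiedView<THost, Complement>
{
public:
    explicit ModifiedView(THost const & host) : host_(&host) {}
    std::size_t size() const { return host_->size(); }
    char operator[](std::size_t i) const
    {
        switch ((*host_)[i])
        {
            case 'A': return 'T';
            case 'C': return 'G';
            case 'G': return 'C';
            case 'T': return 'A';
            default:  return 'N';  // assumption: anything else maps to 'N'
        }
    }

private:
    THost const * host_;
};

// Reverse view: accepts any host that offers size() and operator[],
// including another ModifiedView.
template <typename THost>
class ModifiedView<THost, Reverse>
{
public:
    explicit ModifiedView(THost const & host) : host_(&host) {}
    std::size_t size() const { return host_->size(); }
    char operator[](std::size_t i) const { return (*host_)[size() - 1 - i]; }

private:
    THost const * host_;
};

// Chains both views and reads the reverse complement through them.
std::string demoReverseComplement()
{
    std::string dna = "ATTACGG";
    typedef ModifiedView<std::string, Complement> TComplement;
    typedef ModifiedView<TComplement, Reverse> TReverseComplement;

    TComplement complement(dna);             // inner view over the string
    TReverseComplement revComp(complement);  // outer view over the inner view
    assert(revComp.size() == dna.size());

    std::string out;
    for (std::size_t i = 0; i < revComp.size(); ++i)
        out += revComp[i];
    return out;
}
```

``demoReverseComplement()`` returns ``CCGTAAT`` without ever storing a complemented or reversed copy of ``dna``.
Unlike the SeqAn modifier, which accepts the original string in its constructor, this sketch builds the inner view explicitly and requires it to outlive the outer one.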