1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192
|
==============
Best Practices
==============
While there are plenty of ways to interact with LibCST, we recommend some patterns
over others. Various best practices are laid out here along with their justifications.
Avoid ``isinstance`` when traversing
------------------------------------
Excessive use of ``isinstance`` implies that you should rewrite your check as a
matcher or unroll it into a set of visitor methods. Often, you should make use of
:func:`~libcst.ensure_type` to make your type checker aware of a node's type.
Often it is far easier to use :ref:`libcst-matchers` over explicit instance checks
in a transform. Matching against some pattern and then extracting a value from a
node's child is often easier and far more readable. Unfortunately this clashes with
various type-checkers which do not understand that :func:`~libcst.matchers.matches`
guarantees a particular set of children. Instead of instance checks, you should use
:func:`~libcst.ensure_type` which can be inlined and nested.
For example, if you have written the following::
def get_identifier_name(node: cst.CSTNode) -> Optional[str]:
if m.matches(node, m.Name()):
assert isinstance(node, cst.Name)
return node.value
return None
You could instead write something like::
def get_identifier_name(node: cst.CSTNode) -> Optional[str]:
return (
cst.ensure_type(node, cst.Name).value
if m.matches(node, m.Name())
else None
)
If you find yourself attempting to manually traverse a tree using ``isinstance``,
you can often rewrite your code using visitor methods instead. Nested instance checks
can often be unrolled into visitors methods along with matcher decorators. This may
entail adding additional state to your visitor, but the resulting code is far more
likely to work after changes to LibCST itself. For example, if you have written the
following::
class CountBazFoobarArgs(cst.CSTVisitor):
"""
Given a set of function names, count how many arguments to those function
calls are the identifiers "baz" or "foobar".
"""
def __init__(self, functions: Set[str]) -> None:
super().__init__()
self.functions: Set[str] = functions
self.arg_count: int = 0
def visit_Call(self, node: cst.Call) -> None:
# See if the call itself is one of our functions we care about
if isinstance(node.func, cst.Name) and node.func.value in self.functions:
# Loop through each argument
for arg in node.args:
# See if the argument is an identifier matching what we want to count
if isinstance(arg.value, cst.Name) and arg.value.value in {"baz", "foobar"}:
self.arg_count += 1
You could instead write something like::
class CountBazFoobarArgs(m.MatcherDecoratableVisitor):
"""
Given a set of function names, count how many arguments to those function
calls are the identifiers "baz" or "foobar".
"""
def __init__(self, functions: Set[str]) -> None:
super().__init__()
self.functions: Set[str] = functions
self.arg_count: int = 0
self.call_stack: List[str] = []
def visit_Call(self, node: cst.Call) -> None:
# Store all calls in a stack
if m.matches(node.func, m.Name()):
self.call_stack.append(cst.ensure_type(node.func, cst.Name).value)
def leave_Call(self, original_node: cst.Call) -> None:
# Pop the latest call off the stack
if m.matches(node.func, m.Name()):
self.call_stack.pop()
@m.visit(m.Arg(m.Name("baz") | m.Name("foobar")))
def _count_args(self, node: cst.Arg) -> None:
# See if the most shallow call is one we're interested in, so we can
# count the args we care about only in calls we care about.
if self.call_stack[-1] in self.functions:
self.arg_count += 1
While there is more code than the previous example, it is arguably easier to understand
and maintain each part of the code. It is also immune to any future changes to LibCST
which change's the tree shape. Note that LibCST is traversing the tree completely
in both cases, so while the first appears to be faster, it is actually doing the same
amount of work as the second.
Prefer ``updated_node`` when modifying trees
--------------------------------------------
When you are using :class:`~libcst.CSTTransformer` to modify a LibCST tree, only return
modifications to ``updated_node``. The ``original_node`` parameter on any
``leave_<Node>`` method is provided for book-keeping and is guaranteed to be
equal via ``==`` and ``is`` checks to the ``node`` parameter in the corresponding
``visit_<Node>`` method. Remember that LibCST trees are immutable, so the only
way to make a modification is to return a new tree. Hence, by the time we get to
calling ``leave_<Node>`` methods, we have an updated tree whose children have been
modified. Therefore, you should only return ``original_node`` when you want to
explicitly discard changes performed on the node's children.
Say you wanted to rename all function calls which were calling global functions.
So, you might write the following::
class FunctionRenamer(cst.CSTTransformer):
def leave_Call(self, original_node: cst.Call, updated_node: cst.Call) -> cst.Call:
if m.matches(original_node.func, m.Name()):
return original_node.with_changes(
func=cst.Name(
"renamed_" + cst.ensure_type(original_node.func, cst.Name).value
)
)
return original_node
Consider writing instead::
class FunctionRenamer(cst.CSTTransformer):
def leave_Call(self, original_node: cst.Call, updated_node: cst.Call) -> cst.Call:
if m.matches(updated_node.func, m.Name()):
return updated_node.with_changes(
func=cst.Name(
"renamed_" + cst.ensure_type(updated_node.func, cst.Name).value
)
)
return updated_node
The version that returns modifications to ``original_node`` has a subtle bug. Consider
the following code snippet::
some_func(1, 2, other_func(3))
Running the recommended transform will return us a new code snippet that looks like this::
renamed_some_func(1, 2, renamed_other_func(3))
However, running the version which modifies ``original_node`` will instead return::
renamed_some_func(1, 2, other_func(3))
That's because the ``updated_node`` tree contains the modification to ``other_func``.
By returning modifications to ``original_node`` instead of ``updated_node``, we accidentally
discarded all the work done deeper in the tree.
.. _libcst-config_best_practice:
Provide a ``config`` when generating code from templates
--------------------------------------------------------
When generating complex trees it is often far easier to pass a string to
:func:`~libcst.parse_statement` or :func:`~libcst.parse_expression` than it is to
manually construct the tree. When using these functions to generate code, you should
always use the ``config`` parameter in order to generate code that matches the
defaults of the module you are modifying. The :class:`~libcst.Module` class even has
a helper attribute :attr:`~libcst.Module.config_for_parsing` to make it easy to use. This
ensures that line endings and indentation are consistent with the defaults in the
module you are adding the code to.
For example, to add a print statement to the end of a module::
module = cst.parse_module(some_code_string)
new_module = module.with_changes(
body=(
*module.body,
cst.parse_statement(
"print('Hello, world!')",
config=module.config_for_parsing,
),
),
)
new_code_string = new_module.code
Leaving out the ``config`` parameter means that LibCST will assume some defaults
and could result in added code which is formatted differently than the rest of the
module it was added to. In the above example, because we used the config from the
already-parsed example, the print statement will be added with line endings matching
the rest of the module. If we neglect the ``config`` parameter, we might accidentally
insert a windows line ending into a unix file or vice versa, depending on what system
we ran the code under.
|