1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237
|
Gast, Beniget!
==============
Beniget is a collection of Compile-time analyse on Python Abstract Syntax Tree(AST).
It's a building block to write static analyzer or compiler for Python.
Beniget relies on `gast <https://pypi.org/project/gast/>`_ to provide a cross
version abstraction of the AST, effectively working on both Python2 and
Python3.
API
---
Basically Beniget provides three analyse:
- ``beniget.Ancestors`` that maps each node to the list of enclosing nodes;
- ``beniget.DefUseChains`` that maps each node to the list of definition points in that node;
- ``beniget.UseDefChains`` that maps each node to the list of possible definition of that node.
See sample usages and/or run ``pydoc beniget`` for more information :-).
Sample Usages
-------------
Detect unused imports
*********************
This is a very basic usage: look for def without any use, and warn about them, focusing on imported values.
.. code:: python
>>> import beniget, gast as ast
# parse some simple statements
>>> code = "from math import cos, sin; print(cos(3))"
>>> module = ast.parse(code)
# compute the def-use chains at module level
>>> duc = beniget.DefUseChains()
>>> duc.visit(module)
# grab the import statement
>>> imported = module.body[0].names
# inspect the users of each imported name
>>> for name in imported:
... ud = duc.chains[name]
... if not ud.users():
... print("Unused import: {}".format(ud.name()))
Unused import: sin
*NOTE*: Due to the dynamic nature of Python, one can fool this analysis by
calling the ``eval`` function, eventually through an indirection, or by performing a lookup
into ``globals()``.
Find all functions marked with a given decorator
************************************************
Let's assume we've got a ``@nice`` decorator applied to some functions. We can traverse the users
of this decorator to find which functions are decorated.
.. code:: python
# parse some simple statements
>>> code = """
... nice = lambda x: x
... @nice
... def aw(): pass
... def some(): pass"""
>>> module = ast.parse(code)
# compute the def-use chains at module level
>>> duc = beniget.DefUseChains()
>>> duc.visit(module)
# analysis to find parent of a node
>>> ancestors = beniget.Ancestors()
>>> ancestors.visit(module)
# find the nice definition
>>> nice = [d for d in duc.locals[module] if d.name() == "nice"][0]
# walkthrough its users
>>> for use in nice.users():
... # we're interested in the parent of the decorator
... parents = ancestors.parents(use.node)
... # direct parent of the decorator is the function
... fdef = parents[-1]
... print(fdef.name)
aw
Gather attributes of ``self``
*****************************
This analysis gathers all attributes of a class, by going through all methods and checking
the users of the first method parameter, investigating the one used in attribute lookup.
.. code:: python
>>> import gast as ast
>>> import beniget
>>> class Attributes(ast.NodeVisitor):
...
... def __init__(self, module_node):
... # compute the def-use of the module
... self.chains = beniget.DefUseChains()
... self.chains.visit(module_node)
... self.users = set() # all users of `self`
... self.attributes = set() # attributes of current class
...
... def visit_ClassDef(self, node):
... # walk methods and fill users of `self`
... for stmt in node.body:
... if isinstance(stmt, ast.FunctionDef):
... self_def = self.chains.chains[stmt.args.args[0]]
... self.users.update(use.node for use in self_def.users())
... self.generic_visit(node)
...
... def visit_Attribute(self, node):
... # any attribute of `self` is registered
... if node.value in self.users:
... self.attributes.add(node.attr)
>>> code = "class My(object):\n def __init__(self, x): self.x = x"
>>> module = ast.parse(code)
>>> classdef = module.body[0]
>>> attr = Attributes(module)
>>> attr.visit(classdef)
>>> list(attr.attributes)
['x']
*NOTE*: This is *not* an alias analysis, so assigning ``self`` to another variable, or
setting it in a tuple is not captured by this analysis. It's still possible to write such an
a analysis using def-use chains though ;-)
Compute the identifiers captured by a function
**********************************************
In Python, inner functions (and lambdas) can capture identifiers defined in the outer scope.
This analysis computes such identifiers by registering each identifier defined in the function,
then walking through all loaded identifier and checking whether it's local or not.
.. code:: python
>>> import gast as ast
>>> import beniget
>>> class Capture(ast.NodeVisitor):
...
... def __init__(self, module_node):
... # initialize def-use chains
... self.chains = beniget.DefUseChains()
... self.chains.visit(module_node)
... self.users = set() # users of local definitions
... self.captured = set() # identifiers that don't belong to local users
...
... def visit_FunctionDef(self, node):
... # initialize the set of node using a local variable
... for def_ in self.chains.locals[node]:
... self.users.update(use.node for use in def_.users())
... self.generic_visit(node)
...
... def visit_Name(self, node):
... # register load of identifiers not locally definied
... if isinstance(node.ctx, ast.Load):
... if node not in self.users:
... self.captured.add(node.id)
>>> code = 'def foo(x):\n def bar(): return x\n return bar'
>>> module = ast.parse(code)
>>> inner_function = module.body[0].body[0]
>>> capture = Capture(module)
>>> capture.visit(inner_function)
>>> list(capture.captured)
['x']
Compute the set of instructions required to compute a function
**************************************************************
This is actually very similar to the computation of the closure, but this time
let's use the UseDef chains combined with the ancestors.
.. code:: python
>>> import gast as ast
>>> import beniget
>>> class CaptureX(ast.NodeVisitor):
...
... def __init__(self, module_node, fun):
... self.fun = fun
... # initialize use-def chains
... du = beniget.DefUseChains()
... du.visit(module_node)
... self.chains = beniget.UseDefChains(du)
... self.ancestors = beniget.Ancestors()
... self.ancestors.visit(module_node)
... self.external = list()
... self.visited_external = set()
...
... def visit_Name(self, node):
... # register load of identifiers not locally defined
... if isinstance(node.ctx, ast.Load):
... uses = self.chains.chains[node]
... for use in uses:
... try:
... parents = self.ancestors.parents(use.node)
... except KeyError:
... return # a builtin
... if self.fun not in parents:
... parent = self.ancestors.parentStmt(use.node)
... if parent not in self.visited_external:
... self.visited_external.add(parent)
... self.external.append(parent)
... self.rec(parent)
...
... def rec(self, node):
... "walk definitions to find their operands's def"
... if isinstance(node, ast.Assign):
... self.visit(node.value)
... # TODO: implement this for AugAssign etc
>>> code = 'a = 1; b = [a, a]; c = len(b)\ndef foo():\n return c'
>>> module = ast.parse(code)
>>> function = module.body[3]
>>> capturex = CaptureX(module, function)
>>> capturex.visit(function)
>>> # the three top level assignments have been captured!
>>> list(map(type, capturex.external))
[<class 'gast.gast.Assign'>, <class 'gast.gast.Assign'>, <class 'gast.gast.Assign'>]
Acknowledgments
---------------
Beniget is in Pierre Augier's debt, for he triggered the birth of beniget and provided
countless meaningful bug reports and advices. Trugarez!
|