Source code: Lib/ast.py
The ast
module helps Python applications to process trees of the Python abstract syntax grammar. The abstract syntax itself might
change with each Python release; this module helps to find out programmatically what the current grammar looks like.
An abstract syntax tree can be generated by passing ast.PyCF_ONLY_AST as a flag to the compile() built-in function, or by using the parse() helper provided in this module. The result will be a tree of objects whose classes all inherit from ast.AST. An abstract syntax tree can be compiled into a Python code object using the built-in compile() function.
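The round trip described above can be sketched as follows. This is a minimal illustration, not the only way to use the module; the source string and the "<example>" filename are placeholders chosen for the sketch.

```python
import ast

source = "x = 1 + 2"

# Two ways to obtain a tree: the compile() flag, or the parse() helper.
tree_a = compile(source, "<example>", "exec", ast.PyCF_ONLY_AST)
tree_b = ast.parse(source, filename="<example>", mode="exec")

# Both results are trees of ast.AST instances with the same structure.
same = isinstance(tree_a, ast.AST) and ast.dump(tree_a) == ast.dump(tree_b)

# A tree can be handed back to compile() to get a code object.
code = compile(tree_b, "<example>", "exec")
namespace = {}
exec(code, namespace)
result = namespace["x"]
```

Here the tree is executed unchanged; a real tool would typically inspect or transform it between parsing and compiling.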
Abstract Grammar¶
The abstract grammar is currently defined as follows:
-- ASDL's 4 builtin types are:
-- identifier, int, string, constant

module Python
{
    mod = Module(stmt* body, type_ignore* type_ignores)
        | Interactive(stmt* body)
        | Expression(expr body)
        | FunctionType(expr* argtypes, expr returns)

    stmt = FunctionDef(identifier name, arguments args,
                       stmt* body, expr* decorator_list, expr? returns,
                       string? type_comment)
          | AsyncFunctionDef(identifier name, arguments args,
                             stmt* body, expr* decorator_list, expr? returns,
                             string? type_comment)

          | ClassDef(identifier name,
                     expr* bases,
                     keyword* keywords,
                     stmt* body,
                     expr* decorator_list)
          | Return(expr? value)

          | Delete(expr* targets)
          | Assign(expr* targets, expr value, string? type_comment)
          | AugAssign(expr target, operator op, expr value)
          -- 'simple' indicates that we annotate simple name without parens
          | AnnAssign(expr target, expr annotation, expr? value, int simple)

          -- use 'orelse' because else is a keyword in target languages
          | For(expr target, expr iter, stmt* body, stmt* orelse, string? type_comment)
          | AsyncFor(expr target, expr iter, stmt* body, stmt* orelse, string? type_comment)
          | While(expr test, stmt* body, stmt* orelse)
          | If(expr test, stmt* body, stmt* orelse)
          | With(withitem* items, stmt* body, string? type_comment)
          | AsyncWith(withitem* items, stmt* body, string? type_comment)

          | Match(expr subject, match_case* cases)

          | Raise(expr? exc, expr? cause)
          | Try(stmt* body, excepthandler* handlers, stmt* orelse, stmt* finalbody)
          | Assert(expr test, expr? msg)

          | Import(alias* names)
          | ImportFrom(identifier? module, alias* names, int? level)

          | Global(identifier* names)
          | Nonlocal(identifier* names)
          | Expr(expr value)
          | Pass | Break | Continue

          -- col_offset is the byte offset in the utf8 string the parser uses
          attributes (int lineno, int col_offset, int? end_lineno, int? end_col_offset)

          -- BoolOp() can use left & right?
    expr = BoolOp(boolop op, expr* values)
         | NamedExpr(expr target, expr value)
         | BinOp(expr left, operator op, expr right)
         | UnaryOp(unaryop op, expr operand)
         | Lambda(arguments args, expr body)
         | IfExp(expr test, expr body, expr orelse)
         | Dict(expr* keys, expr* values)
         | Set(expr* elts)
         | ListComp(expr elt, comprehension* generators)
         | SetComp(expr elt, comprehension* generators)
         | DictComp(expr key, expr value, comprehension* generators)
         | GeneratorExp(expr elt, comprehension* generators)
         -- the grammar constrains where yield expressions can occur
         | Await(expr value)
         | Yield(expr? value)
         | YieldFrom(expr value)
         -- need sequences for compare to distinguish between
         -- x < 4 < 3 and (x < 4) < 3
         | Compare(expr left, cmpop* ops, expr* comparators)
         | Call(expr func, expr* args, keyword* keywords)
         | FormattedValue(expr value, int conversion, expr? format_spec)
         | JoinedStr(expr* values)
         | Constant(constant value, string? kind)

         -- the following expression can appear in assignment context
         | Attribute(expr value, identifier attr, expr_context ctx)
         | Subscript(expr value, expr slice, expr_context ctx)
         | Starred(expr value, expr_context ctx)
         | Name(identifier id, expr_context ctx)
         | List(expr* elts, expr_context ctx)
         | Tuple(expr* elts, expr_context ctx)

         -- can appear only in Subscript
         | Slice(expr? lower, expr? upper, expr? step)

          -- col_offset is the byte offset in the utf8 string the parser uses
          attributes (int lineno, int col_offset, int? end_lineno, int? end_col_offset)

    expr_context = Load | Store | Del

    boolop = And | Or

    operator = Add | Sub | Mult | MatMult | Div | Mod | Pow | LShift
                 | RShift | BitOr | BitXor | BitAnd | FloorDiv

    unaryop = Invert | Not | UAdd | USub

    cmpop = Eq | NotEq | Lt | LtE | Gt | GtE | Is | IsNot | In | NotIn

    comprehension = (expr target, expr iter, expr* ifs, int is_async)

    excepthandler = ExceptHandler(expr? type, identifier? name, stmt* body)
                    attributes (int lineno, int col_offset, int? end_lineno, int? end_col_offset)

    arguments = (arg* posonlyargs, arg* args, arg? vararg, arg* kwonlyargs,
                 expr* kw_defaults, arg? kwarg, expr* defaults)

    arg = (identifier arg, expr? annotation, string? type_comment)
           attributes (int lineno, int col_offset, int? end_lineno, int? end_col_offset)

    -- keyword arguments supplied to call (NULL identifier for **kwargs)
    keyword = (identifier? arg, expr value)
               attributes (int lineno, int col_offset, int? end_lineno, int? end_col_offset)

    -- import name with optional 'as' alias.
    alias = (identifier name, identifier? asname)
             attributes (int lineno, int col_offset, int? end_lineno, int? end_col_offset)

    withitem = (expr context_expr, expr? optional_vars)

    match_case = (pattern pattern, expr? guard, stmt* body)

    pattern = MatchValue(expr value)
            | MatchSingleton(constant value)
            | MatchSequence(pattern* patterns)
            | MatchMapping(expr* keys, pattern* patterns, identifier? rest)
            | MatchClass(expr cls, pattern* patterns, identifier* kwd_attrs, pattern* kwd_patterns)

            | MatchStar(identifier? name)
            -- The optional "rest" MatchMapping parameter handles capturing extra mapping keys

            | MatchAs(pattern? pattern, identifier? name)
            | MatchOr(pattern* patterns)

             attributes (int lineno, int col_offset, int end_lineno, int end_col_offset)

    type_ignore = TypeIgnore(int lineno, string tag)
}
Node classes¶
class ast.AST¶
This is the base of all AST node classes. The actual node classes are derived from the Parser/Python.asdl file, which is reproduced above. They are defined in the _ast C module and re-exported in ast.
There is one class defined for each left-hand side symbol in the abstract grammar (for example, ast.stmt or ast.expr). In addition, there is one class defined for each constructor on the right-hand side; these classes inherit from the classes for the left-hand side trees. For example, ast.BinOp inherits from ast.expr. For production rules with alternatives (aka “sums”), the left-hand side class is abstract: only instances of specific constructor nodes are ever created.
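The class relationships described above can be checked directly. The following is a small sketch; the variable names are illustrative only.

```python
import ast

# Constructor classes inherit from the class of their left-hand side symbol.
binop_is_expr = issubclass(ast.BinOp, ast.expr)
expr_is_ast = issubclass(ast.expr, ast.AST)

# Parsing yields concrete constructor nodes (here a BinOp), not a bare ast.expr.
tree = ast.parse("x + y", mode="eval")
node_type = type(tree.body)
print(binop_is_expr, expr_is_ast, node_type.__name__)
```

Because of this hierarchy, isinstance(node, ast.expr) is a convenient test for "any expression node" regardless of the specific constructor.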
_fields¶
Each concrete class has an attribute _fields which gives the names of all child nodes.
Each instance of a concrete class has one attribute for each child node, of the type as defined in the grammar. For example, ast.BinOp instances have an attribute left of type ast.expr.
If these attributes are marked as optional in the grammar (using a question mark), the value might be None. If the attributes can have zero-or-more values (marked with an asterisk), the values are represented as Python lists. All possible attributes must be present and have valid values when compiling an AST with compile().
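As a brief sketch of how the grammar markers map to attribute values, the snippet below inspects _fields, an optional field (Return's "expr? value") and a zero-or-more field (Delete's "expr* targets"). The sample source strings are made up for the illustration.

```python
import ast

# Names of the child nodes, in the order given by the grammar.
fields = ast.BinOp._fields

# Optional field: a bare "return" has no value expression, so value is None.
func = ast.parse("def f():\n    return\n").body[0]
bare_return = func.body[0]

# Zero-or-more field: the targets of "del" are stored in a Python list.
delete = ast.parse("del a, b").body[0]
target_names = [target.id for target in delete.targets]
```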
lineno¶
col_offset¶
end_lineno¶
end_col_offset¶
Instances of ast.expr and ast.stmt subclasses have lineno, col_offset, end_lineno, and end_col_offset attributes. The lineno and end_lineno are the first and last line numbers of the source text span (1-indexed, so the first line is line 1), and the col_offset and end_col_offset are the corresponding UTF-8 byte offsets of the first and last tokens that generated the node. The UTF-8 offset is recorded because the parser uses UTF-8 internally.
Note that the end positions are not required by the compiler and are therefore optional. The end offset is after the last symbol; for example, one can get the source segment of a one-line expression node using source_line[node.col_offset : node.end_col_offset].
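The slicing idiom above works directly on a str when the line is pure ASCII. Because the offsets count UTF-8 bytes, a line containing non-ASCII text is better sliced in its encoded form. The sketch below shows both cases; the sample lines are invented for the illustration.

```python
import ast

# ASCII-only line: byte offsets and character offsets coincide.
source_line = "total = price * quantity"
binop = ast.parse(source_line).body[0].value
segment = source_line[binop.col_offset : binop.end_col_offset]

# Non-ASCII line: "\u00e9" occupies two bytes in UTF-8, so slice the bytes.
accented = "\u00e9 = a + b"
node = ast.parse(accented).body[0].value
encoded = accented.encode("utf-8")
segment_bytes = encoded[node.col_offset : node.end_col_offset].decode("utf-8")
```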
The constructor of a class ast.T parses its arguments as follows:
If there are positional arguments, there must be as many as there are items in T._fields; they will be assigned as attributes of these names.
If there are keyword arguments, they will set the attributes of the same names to the given values.
For example, to create and populate an ast.UnaryOp node, you could use

node = ast.UnaryOp()
node.op = ast.USub()
node.operand = ast.Constant()
node.operand.value = 5
node.operand.lineno = 0
node.operand.col_offset = 0
node.lineno = 0
node.col_offset = 0

or the more compact

node = ast.UnaryOp(ast.USub(), ast.Constant(5, lineno=0, col_offset=0),
                   lineno=0, col_offset=0)
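A hand-built node like this can be evaluated once it is wrapped in a top-level ast.Expression. The sketch below takes a slightly different route from the example above: instead of setting positions by hand, it lets ast.fix_missing_locations() fill them in. The "<ast>" filename is a placeholder.

```python
import ast

# Build the expression -5 using keyword arguments named after the fields.
node = ast.UnaryOp(op=ast.USub(), operand=ast.Constant(value=5))

# compile() in "eval" mode expects an Expression root node.
tree = ast.Expression(body=node)

# Fill in lineno/col_offset on every node that lacks them.
ast.fix_missing_locations(tree)

value = eval(compile(tree, "<ast>", "eval"))
```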
Changed in version 3.8: Class ast.Constant
is now used for all constants.
Changed in version 3.9: Simple indices are represented by their value, extended slices are represented as tuples.
Deprecated since version 3.8: Old classes ast.Num
, ast.Str
, ast.Bytes
, ast.NameConstant
and ast.Ellipsis
are still available, but they will be removed in future Python releases. In the meantime, instantiating them will return an instance of a different class.
Deprecated since version 3.9: Old classes ast.Index
and ast.ExtSlice
are still available, but they will
be removed in future Python releases. In the meantime, instantiating them will return an instance of a different class.
Note
The descriptions of the specific node classes displayed here were initially adapted from the fantastic Green Tree Snakes project and all its contributors.
Literals¶
class ast.Constant(value)¶
A constant value. The value attribute of the Constant literal contains the Python object it represents. The values represented can be simple types such as a number, string or None, but also immutable container types (tuples and frozensets) if all of their elements are constant.
>>> print(ast.dump(ast.parse('123', mode='eval'), indent=4))
Expression(
    body=Constant(value=123))

class ast.FormattedValue(value, conversion, format_spec)¶
Node representing a single formatting field in an f-string. If the string contains a single formatting field and nothing else, the node can be isolated; otherwise it appears in JoinedStr.
value is any expression node (such as a literal, a variable, or a function call).
conversion is an integer:
    -1: no formatting
    115: !s string formatting
    114: !r repr formatting
    97: !a ascii formatting
format_spec is a JoinedStr node representing the formatting of the value, or None if no format was specified. Both conversion and format_spec can be set at the same time.
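The conversion integers listed above are the code points of the conversion characters (ord('s') is 115, and so on). The sketch below reads them back from parsed f-strings; conversion_of is a hypothetical helper written for this illustration, not part of the ast module.

```python
import ast

def first_field(fstring_source):
    """Return the first FormattedValue in an f-string expression (hypothetical helper)."""
    joined = ast.parse(fstring_source, mode="eval").body
    return next(v for v in joined.values if isinstance(v, ast.FormattedValue))

def conversion_of(fstring_source):
    """Return the conversion code of the first formatting field (hypothetical helper)."""
    return first_field(fstring_source).conversion

codes = {
    "none": conversion_of('f"{x}"'),
    "!s": conversion_of('f"{x!s}"'),
    "!r": conversion_of('f"{x!r}"'),
    "!a": conversion_of('f"{x!a}"'),
}

# A conversion and a format spec on the same field.
both = first_field('f"{x!r:>10}"')
spec_is_joined_str = isinstance(both.format_spec, ast.JoinedStr)
```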
class ast.JoinedStr(values)¶
An f-string, comprising a series of FormattedValue and Constant nodes.
>>> print(ast.dump(ast.parse('f"sin({a}) is {sin(a):.3}"', mode='eval'), indent=4))
Expression(
    body=JoinedStr(
        values=[
            Constant(value='sin('),
            FormattedValue(
                value=Name(id='a', ctx=Load()),
                conversion=-1),
            Constant(value=') is '),
            FormattedValue(
                value=Call(
                    func=Name(id='sin', ctx=Load()),
                    args=[
                        Name(id='a', ctx=Load())],
                    keywords=[]),
                conversion=-1,
                format_spec=JoinedStr(
                    values=[
                        Constant(value='.3')]))]))

class ast.List(elts, ctx)¶
class ast.Tuple(elts, ctx)¶
A list or tuple. elts holds a list of nodes representing the elements. ctx is Store if the container is an assignment target (i.e. (x,y)=something), and Load otherwise.
>>> print(ast.dump(ast.parse('[1, 2, 3]', mode='eval'), indent=4))
Expression(
    body=List(
        elts=[
            Constant(value=1),
            Constant(value=2),
            Constant(value=3)],
        ctx=Load()))
>>> print(ast.dump(ast.parse('(1, 2, 3)', mode='eval'), indent=4))
Expression(
    body=Tuple(
        elts=[
            Constant(value=1),
            Constant(value=2),
            Constant(value=3)],
        ctx=Load()))

class ast.Set(elts)¶
A set. elts holds a list of nodes representing the set’s elements.
>>> print(ast.dump(ast.parse('{1, 2, 3}', mode='eval'), indent=4))
Expression(
    body=Set(
        elts=[
            Constant(value=1),
            Constant(value=2),
            Constant(value=3)]))

class ast.Dict(keys, values)¶
A dictionary. keys and values hold lists of nodes representing the keys and the values respectively, in matching order (what would be returned when calling dictionary.keys() and dictionary.values()).
When doing dictionary unpacking using dictionary literals, the expression to be expanded goes in the values list, with a None at the corresponding position in keys.
>>> print(ast.dump(ast.parse('{"a":1, **d}', mode='eval'), indent=4))
Expression(
    body=Dict(
        keys=[
            Constant(value='a'),
            None],
        values=[
            Constant(value=1),
            Name(id='d', ctx=Load())]))
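Since keys and values are kept in matching order, code that walks a Dict node usually pairs them with zip() and treats a None key as an unpacking. The following is a rough sketch of that pattern; describe_dict is a hypothetical helper, and it relies on ast.unparse(), which is available from Python 3.9.

```python
import ast

def describe_dict(source):
    """Label each entry of a dict display as a key/value pair or a ** unpacking.

    Hypothetical helper written for this illustration.
    """
    node = ast.parse(source, mode="eval").body
    entries = []
    for key, value in zip(node.keys, node.values):
        if key is None:
            # No key node: this position holds a **mapping unpacking.
            entries.append(("**", ast.unparse(value)))
        else:
            entries.append((ast.unparse(key), ast.unparse(value)))
    return entries

entries = describe_dict('{"a": 1, **d}')
```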
Variables¶
class ast.Name(id, ctx)¶
A variable name. id holds the name as a string, and ctx is one of the following types.
class ast.Load¶
class ast.Store¶
class ast.Del¶
Variable references can be used to load the value of a variable, to assign a new value to it, or to delete it. Variable references are given a context to distinguish these cases.
>>> print(ast.dump(ast.parse('a'), indent=4))
Module(
    body=[
        Expr(
            value=Name(id='a', ctx=Load()))],
    type_ignores=[])

>>> print(ast.dump(ast.parse('a = 1'), indent=4))
Module(
    body=[
        Assign(
            targets=[
                Name(id='a', ctx=Store())],
            value=Constant(value=1))],
    type_ignores=[])

>>> print(ast.dump(ast.parse('del a'), indent=4))
Module(
    body=[
        Delete(
            targets=[
                Name(id='a', ctx=Del())])],
    type_ignores=[])
class ast.Starred(value, ctx)¶
A *var variable reference. value holds the variable, typically a Name node. This type must be used when building a Call node with *args.
>>> print(ast.dump(ast.parse('a, *b = it'), indent=4))
Module(
    body=[
        Assign(
            targets=[
                Tuple(
                    elts=[
                        Name(id='a', ctx=Store()),
                        Starred(
                            value=Name(id='b', ctx=Store()),
                            ctx=Store())],
                    ctx=Store())],
            value=Name(id='it', ctx=Load()))],
    type_ignores=[])
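The remark about Call nodes can be illustrated by comparing a parsed call with one assembled by hand. This is only a sketch; the names f, a and rest are arbitrary.

```python
import ast

# In a parsed call, a *args argument shows up as a Starred node in args.
call = ast.parse("f(a, *rest)", mode="eval").body
starred = call.args[1]

# Building the equivalent call by hand requires the same Starred wrapper.
built = ast.Call(
    func=ast.Name(id="f", ctx=ast.Load()),
    args=[
        ast.Name(id="a", ctx=ast.Load()),
        ast.Starred(value=ast.Name(id="rest", ctx=ast.Load()), ctx=ast.Load()),
    ],
    keywords=[],
)

# ast.dump() omits position attributes by default, so the two dumps can be compared.
same = ast.dump(built) == ast.dump(call)
```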
Expressions¶
class ast.Expr(value)¶
When an expression, such as a function call, appears as a statement by itself with its return value not used or stored, it is wrapped in this container. value holds one of the other nodes in this section, a Constant, a Name, a Lambda, a Yield or YieldFrom node.
>>> print(ast.dump(ast.parse('-a'), indent=4))
Module(
    body=[
        Expr(
            value=UnaryOp(
                op=USub(),
                operand=Name(id='a', ctx=Load())))],
    type_ignores=[])

class ast.UnaryOp(op, operand)¶
A unary operation. op is the operator, and operand any expression node.
class ast.UAdd¶
class ast.USub¶
class ast.Not¶
class ast.Invert¶
Unary operator tokens. Not is the not keyword, Invert is the ~ operator.
>>> print(ast.dump(ast.parse('not x', mode='eval'), indent=4))
Expression(
    body=UnaryOp(
        op=Not(),
        operand=Name(id='x', ctx=Load())))

class ast.BinOp(left, op, right)¶
A binary operation (like addition or division). op is the operator, and left and right are any expression nodes.
>>> print(ast.dump(ast.parse('x + y', mode='eval'), indent=4))
Expression(
    body=BinOp(
        left=Name(id='x', ctx=Load()),
        op=Add(),
        right=Name(id='y', ctx=Load())))

class ast.Add¶
class ast.Sub¶
class ast.Mult¶
class ast.Div¶
class ast.FloorDiv¶
class ast.Mod¶
class ast.Pow¶
class ast.LShift¶
class ast.RShift¶
class ast.BitOr¶
class ast.BitXor¶
class ast.BitAnd¶
class ast.MatMult¶
Binary operator tokens.
class ast.BoolOp(op, values)¶
A boolean operation, ‘or’ or ‘and’. op is Or or And. values are the values involved. Consecutive operations with the same operator, such as a or b or c, are collapsed into one node with several values.
This doesn’t include not, which is a UnaryOp.
>>> print(ast.dump(ast.parse('x or y', mode='eval'), indent=4))
Expression(
    body=BoolOp(
        op=Or(),
        values=[
            Name(id='x', ctx=Load()),
            Name(id='y', ctx=Load())]))

class ast.And¶
class ast.Or¶
Boolean operator tokens.
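The collapsing of consecutive boolean operations described for BoolOp can be observed with a short sketch. The variable names are illustrative.

```python
import ast

# Three operands joined by the same operator become a single BoolOp node.
chained = ast.parse("a or b or c", mode="eval").body
names = [value.id for value in chained.values]

# Mixing operators nests them instead: "and" binds tighter than "or".
mixed = ast.parse("a or b and c", mode="eval").body
inner = mixed.values[1]
```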
class ast.Compare(left, ops, comparators)¶
A comparison of two or more values. left is the first value in the comparison, ops the list of operators, and comparators the list of values after the first element in the comparison.
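As a sketch of how the three fields line up, the snippet below takes a chained comparison apart: ops and comparators have the same length, and each operator is paired with the value to its right. The expression is invented for the illustration, and ast.unparse() requires Python 3.9 or later.

```python
import ast

cmp_node = ast.parse("1 < a <= 10", mode="eval").body

# The leftmost operand is stored separately from the rest.
left = ast.unparse(cmp_node.left)

# Pair each operator with the operand that follows it.
pairs = [
    (type(op).__name__, ast.unparse(right))
    for op, right in zip(cmp_node.ops, cmp_node.comparators)
]
```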
>>> print[ast.dump[ast.parse['1 5 if n