Guide to using the Python ast module

Source code: Lib/ast.py

The ast module helps Python applications to process trees of the Python abstract syntax grammar. The abstract syntax itself might change with each Python release; this module helps to find out programmatically what the current grammar looks like.

An abstract syntax tree can be generated by passing ast.PyCF_ONLY_AST as a flag to the compile() built-in function, or using the parse() helper provided in this module. The result will be a tree of objects whose classes all inherit from ast.AST. An abstract syntax tree can be compiled into a Python code object using the built-in compile() function.
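As a minimal sketch of the two routes described above (the source string and the "<example>" filename are arbitrary choices made for illustration), both calls should produce equivalent trees, and a tree can then be compiled and executed:

```python
import ast

source = "answer = 6 * 7"

# Route 1: the parse() helper from this module.
tree_a = ast.parse(source)

# Route 2: the built-in compile() with the PyCF_ONLY_AST flag.
tree_b = compile(source, "<example>", "exec", ast.PyCF_ONLY_AST)

# Both roots are ast.Module instances, which inherit from ast.AST.
print(type(tree_a).__name__, isinstance(tree_b, ast.AST))

# A tree can be compiled into a code object and then executed.
namespace = {}
exec(compile(tree_a, "<example>", "exec"), namespace)
print(namespace["answer"])
```

Calling ast.parse() is usually the more readable choice; the flag form is mainly useful when code already goes through compile().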

Abstract Grammar

The abstract grammar is currently defined as follows:

-- ASDL's 4 builtin types are:
-- identifier, int, string, constant

module Python
{
    mod = Module(stmt* body, type_ignore* type_ignores)
        | Interactive(stmt* body)
        | Expression(expr body)
        | FunctionType(expr* argtypes, expr returns)

    stmt = FunctionDef(identifier name, arguments args,
                       stmt* body, expr* decorator_list, expr? returns,
                       string? type_comment)
          | AsyncFunctionDef(identifier name, arguments args,
                             stmt* body, expr* decorator_list, expr? returns,
                             string? type_comment)

          | ClassDef(identifier name,
             expr* bases,
             keyword* keywords,
             stmt* body,
             expr* decorator_list)
          | Return(expr? value)

          | Delete(expr* targets)
          | Assign(expr* targets, expr value, string? type_comment)
          | AugAssign(expr target, operator op, expr value)
          -- 'simple' indicates that we annotate simple name without parens
          | AnnAssign(expr target, expr annotation, expr? value, int simple)

          -- use 'orelse' because else is a keyword in target languages
          | For(expr target, expr iter, stmt* body, stmt* orelse, string? type_comment)
          | AsyncFor(expr target, expr iter, stmt* body, stmt* orelse, string? type_comment)
          | While(expr test, stmt* body, stmt* orelse)
          | If(expr test, stmt* body, stmt* orelse)
          | With(withitem* items, stmt* body, string? type_comment)
          | AsyncWith(withitem* items, stmt* body, string? type_comment)

          | Match(expr subject, match_case* cases)

          | Raise(expr? exc, expr? cause)
          | Try(stmt* body, excepthandler* handlers, stmt* orelse, stmt* finalbody)
          | Assert(expr test, expr? msg)

          | Import(alias* names)
          | ImportFrom(identifier? module, alias* names, int? level)

          | Global(identifier* names)
          | Nonlocal(identifier* names)
          | Expr(expr value)
          | Pass | Break | Continue

          -- col_offset is the byte offset in the utf8 string the parser uses
          attributes (int lineno, int col_offset, int? end_lineno, int? end_col_offset)

          -- BoolOp() can use left & right?
    expr = BoolOp(boolop op, expr* values)
         | NamedExpr(expr target, expr value)
         | BinOp(expr left, operator op, expr right)
         | UnaryOp(unaryop op, expr operand)
         | Lambda(arguments args, expr body)
         | IfExp(expr test, expr body, expr orelse)
         | Dict(expr* keys, expr* values)
         | Set(expr* elts)
         | ListComp(expr elt, comprehension* generators)
         | SetComp(expr elt, comprehension* generators)
         | DictComp(expr key, expr value, comprehension* generators)
         | GeneratorExp(expr elt, comprehension* generators)
         -- the grammar constrains where yield expressions can occur
         | Await(expr value)
         | Yield(expr? value)
         | YieldFrom(expr value)
         -- need sequences for compare to distinguish between
         -- x < 4 < 3 and (x < 4) < 3
         | Compare(expr left, cmpop* ops, expr* comparators)
         | Call(expr func, expr* args, keyword* keywords)
         | FormattedValue(expr value, int conversion, expr? format_spec)
         | JoinedStr(expr* values)
         | Constant(constant value, string? kind)

         -- the following expression can appear in assignment context
         | Attribute(expr value, identifier attr, expr_context ctx)
         | Subscript(expr value, expr slice, expr_context ctx)
         | Starred(expr value, expr_context ctx)
         | Name(identifier id, expr_context ctx)
         | List(expr* elts, expr_context ctx)
         | Tuple(expr* elts, expr_context ctx)

         -- can appear only in Subscript
         | Slice(expr? lower, expr? upper, expr? step)

          -- col_offset is the byte offset in the utf8 string the parser uses
          attributes (int lineno, int col_offset, int? end_lineno, int? end_col_offset)

    expr_context = Load | Store | Del

    boolop = And | Or

    operator = Add | Sub | Mult | MatMult | Div | Mod | Pow | LShift
                 | RShift | BitOr | BitXor | BitAnd | FloorDiv

    unaryop = Invert | Not | UAdd | USub

    cmpop = Eq | NotEq | Lt | LtE | Gt | GtE | Is | IsNot | In | NotIn

    comprehension = (expr target, expr iter, expr* ifs, int is_async)

    excepthandler = ExceptHandler(expr? type, identifier? name, stmt* body)
                    attributes (int lineno, int col_offset, int? end_lineno, int? end_col_offset)

    arguments = (arg* posonlyargs, arg* args, arg? vararg, arg* kwonlyargs,
                 expr* kw_defaults, arg? kwarg, expr* defaults)

    arg = (identifier arg, expr? annotation, string? type_comment)
           attributes (int lineno, int col_offset, int? end_lineno, int? end_col_offset)

    -- keyword arguments supplied to call (NULL identifier for **kwargs)
    keyword = (identifier? arg, expr value)
               attributes (int lineno, int col_offset, int? end_lineno, int? end_col_offset)

    -- import name with optional 'as' alias.
    alias = (identifier name, identifier? asname)
             attributes (int lineno, int col_offset, int? end_lineno, int? end_col_offset)

    withitem = (expr context_expr, expr? optional_vars)

    match_case = (pattern pattern, expr? guard, stmt* body)

    pattern = MatchValue(expr value)
            | MatchSingleton(constant value)
            | MatchSequence(pattern* patterns)
            | MatchMapping(expr* keys, pattern* patterns, identifier? rest)
            | MatchClass(expr cls, pattern* patterns, identifier* kwd_attrs, pattern* kwd_patterns)

            | MatchStar(identifier? name)
            -- The optional "rest" MatchMapping parameter handles capturing extra mapping keys

            | MatchAs(pattern? pattern, identifier? name)
            | MatchOr(pattern* patterns)

             attributes (int lineno, int col_offset, int end_lineno, int end_col_offset)

    type_ignore = TypeIgnore(int lineno, string tag)
}

Node classes

class ast.AST

This is the base of all AST node classes. The actual node classes are derived from the Parser/Python.asdl file, which is reproduced above. They are defined in the _ast C module and re-exported in ast.

There is one class defined for each left-hand side symbol in the abstract grammar (for example, ast.stmt or ast.expr). In addition, there is one class defined for each constructor on the right-hand side; these classes inherit from the classes for the left-hand side trees. For example, ast.BinOp inherits from ast.expr. For production rules with alternatives (aka “sums”), the left-hand side class is abstract: only instances of specific constructor nodes are ever created.

_fields

Each concrete class has an attribute _fields which gives the names of all child nodes.

Each instance of a concrete class has one attribute for each child node, of the type as defined in the grammar. For example, ast.BinOp instances have an attribute left of type ast.expr.

If these attributes are marked as optional in the grammar (using a question mark), the value might be None. If the attributes can have zero-or-more values (marked with an asterisk), the values are represented as Python lists. All possible attributes must be present and have valid values when compiling an AST with compile().
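The relationship between the grammar and node attributes can be checked interactively. The following is a small sketch; the parsed snippets are arbitrary examples chosen for illustration:

```python
import ast

# _fields names the child nodes of a concrete class.
print(ast.BinOp._fields)

# Each instance has one attribute per field, typed as in the grammar.
binop = ast.parse("x + 1", mode="eval").body
print(isinstance(binop.left, ast.expr))

func = ast.parse("def f(a):\n    return a").body[0]

# 'expr? returns' is optional, so it is None when no annotation is given.
print(func.returns)

# 'stmt* body' is zero-or-more, so it is a Python list.
print(type(func.body).__name__, len(func.body))
```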

lineno
col_offset
end_lineno
end_col_offset

Instances of ast.expr and ast.stmt subclasses have lineno, col_offset, end_lineno, and end_col_offset attributes. The lineno and end_lineno are the first and last line numbers of the source text span (1-indexed, so the first line is line 1), and the col_offset and end_col_offset are the corresponding UTF-8 byte offsets of the first and last tokens that generated the node. The UTF-8 offset is recorded because the parser uses UTF-8 internally.

Note that the end positions are not required by the compiler and are therefore optional. The end offset is after the last symbol; for example, one can get the source segment of a one-line expression node using source_line[node.col_offset : node.end_col_offset].
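A short sketch of that slicing follows. The source line is an arbitrary example, and the direct slice is only safe here because the line is pure ASCII, where byte offsets and character offsets coincide. ast.get_source_segment() is shown alongside as the more general helper.

```python
import ast

source_line = "total = price * 2"
tree = ast.parse(source_line)
node = tree.body[0].value  # the BinOp node for 'price * 2'

# The end offset points just past the last symbol, so a plain slice works.
segment = source_line[node.col_offset : node.end_col_offset]
print(segment)

# get_source_segment() also copes with multi-line spans and non-ASCII text.
print(ast.get_source_segment(source_line, node))
```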

The constructor of a class ast.T parses its arguments as follows:

  • If there are positional arguments, there must be as many as there are items in T._fields; they will be assigned as attributes of these names.

  • If there are keyword arguments, they will set the attributes of the same names to the given values.

For example, to create and populate an ast.UnaryOp node, you could use

node = ast.UnaryOp()
node.op = ast.USub()
node.operand = ast.Constant()
node.operand.value = 5
node.operand.lineno = 0
node.operand.col_offset = 0
node.lineno = 0
node.col_offset = 0

or the more compact

node = ast.UnaryOp(ast.USub(), ast.Constant(5, lineno=0, col_offset=0),
                   lineno=0, col_offset=0)
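To see such a hand-built node evaluated, one possible approach is sketched below. It passes the fields as keywords, wraps the node in an ast.Expression root, and lets ast.fix_missing_locations() supply the position attributes instead of setting them by hand. The "<ast>" filename is an arbitrary placeholder.

```python
import ast

# Keyword arguments set the attributes of the same names.
node = ast.UnaryOp(op=ast.USub(), operand=ast.Constant(value=5))

# compile() in 'eval' mode expects an Expression root.
tree = ast.Expression(body=node)

# Fill in lineno/col_offset on every node that lacks them.
ast.fix_missing_locations(tree)

result = eval(compile(tree, "<ast>", "eval"))
print(result)
```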

Changed in version 3.8: Class ast.Constant is now used for all constants.

Changed in version 3.9: Simple indices are represented by their value, extended slices are represented as tuples.

Deprecated since version 3.8: Old classes ast.Num, ast.Str, ast.Bytes, ast.NameConstant and ast.Ellipsis are still available, but they will be removed in future Python releases. In the meantime, instantiating them will return an instance of a different class.

Deprecated since version 3.9: Old classes ast.Index and ast.ExtSlice are still available, but they will be removed in future Python releases. In the meantime, instantiating them will return an instance of a different class.

Note

The descriptions of the specific node classes displayed here were initially adapted from the fantastic Green Tree Snakes project and all its contributors.

Literals

class ast.Constant(value)

A constant value. The value attribute of the Constant literal contains the Python object it represents. The values represented can be simple types such as a number, string or None, but also immutable container types (tuples and frozensets) if all of their elements are constant.

>>> print(ast.dump(ast.parse('123', mode='eval'), indent=4))
Expression(
    body=Constant(value=123))

class ast.FormattedValue(value, conversion, format_spec)

Node representing a single formatting field in an f-string. If the string contains a single formatting field and nothing else, the node can be isolated; otherwise it appears in JoinedStr.

  • value is any expression node (such as a literal, a variable, or a function call).

  • conversion is an integer:

    • -1: no formatting

    • 115: !s string formatting

    • 114: !r repr formatting

    • 97: !a ascii formatting

  • format_spec is a JoinedStr node representing the formatting of the value, or None if no format was specified. Both conversion and format_spec can be set at the same time.
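The integer codes listed above are the character codes of the conversion letters (115 is ord('s'), 114 is ord('r'), 97 is ord('a')). The sketch below inspects both fields on parsed f-strings; first_field is a hypothetical helper written for this example and is not part of the ast module.

```python
import ast

def first_field(fstring_source):
    """Return the first FormattedValue in an f-string literal (hypothetical helper)."""
    joined = ast.parse(fstring_source, mode="eval").body
    return next(v for v in joined.values if isinstance(v, ast.FormattedValue))

# No conversion and no format spec.
plain = first_field('f"{x}"')
print(plain.conversion, plain.format_spec)

# Conversion and format spec set at the same time.
fancy = first_field('f"{x!r:>10}"')
print(fancy.conversion == ord("r"))
print(fancy.format_spec.values[0].value)
```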

class ast.JoinedStr(values)

An f-string, comprising a series of FormattedValue and Constant nodes.

>>> print(ast.dump(ast.parse('f"sin({a}) is {sin(a):.3}"', mode='eval'), indent=4))
Expression(
    body=JoinedStr(
        values=[
            Constant(value='sin('),
            FormattedValue(
                value=Name(id='a', ctx=Load()),
                conversion=-1),
            Constant(value=') is '),
            FormattedValue(
                value=Call(
                    func=Name(id='sin', ctx=Load()),
                    args=[
                        Name(id='a', ctx=Load())],
                    keywords=[]),
                conversion=-1,
                format_spec=JoinedStr(
                    values=[
                        Constant(value='.3')]))]))

class ast.List(elts, ctx)
class ast.Tuple(elts, ctx)

A list or tuple. elts holds a list of nodes representing the elements. ctx is Store if the container is an assignment target (i.e. (x,y)=something), and Load otherwise.

>>> print(ast.dump(ast.parse('[1, 2, 3]', mode='eval'), indent=4))
Expression(
    body=List(
        elts=[
            Constant(value=1),
            Constant(value=2),
            Constant(value=3)],
        ctx=Load()))
>>> print(ast.dump(ast.parse('(1, 2, 3)', mode='eval'), indent=4))
Expression(
    body=Tuple(
        elts=[
            Constant(value=1),
            Constant(value=2),
            Constant(value=3)],
        ctx=Load()))

class ast.Set(elts)

A set. elts holds a list of nodes representing the set’s elements.

>>> print(ast.dump(ast.parse('{1, 2, 3}', mode='eval'), indent=4))
Expression(
    body=Set(
        elts=[
            Constant(value=1),
            Constant(value=2),
            Constant(value=3)]))

class ast.Dict(keys, values)

A dictionary. keys and values hold lists of nodes representing the keys and the values respectively, in matching order (what would be returned when calling dictionary.keys() and dictionary.values()).

When doing dictionary unpacking using dictionary literals, the expression to be expanded goes in the values list, with a None at the corresponding position in keys.

>>> print(ast.dump(ast.parse('{"a":1, **d}', mode='eval'), indent=4))
Expression(
    body=Dict(
        keys=[
            Constant(value='a'),
            None],
        values=[
            Constant(value=1),
            Name(id='d', ctx=Load())]))

Variables

class ast.Name(id, ctx)

A variable name. id holds the name as a string, and ctx is one of the following types.

class ast.Load
class ast.Store
class ast.Del

Variable references can be used to load the value of a variable, to assign a new value to it, or to delete it. Variable references are given a context to distinguish these cases.

>>> print(ast.dump(ast.parse('a'), indent=4))
Module(
    body=[
        Expr(
            value=Name(id='a', ctx=Load()))],
    type_ignores=[])

>>> print(ast.dump(ast.parse('a = 1'), indent=4))
Module(
    body=[
        Assign(
            targets=[
                Name(id='a', ctx=Store())],
            value=Constant(value=1))],
    type_ignores=[])

>>> print(ast.dump(ast.parse('del a'), indent=4))
Module(
    body=[
        Delete(
            targets=[
                Name(id='a', ctx=Del())])],
    type_ignores=[])
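As an illustration of how the context can be used, the sketch below groups every Name in a small snippet by the class of its ctx. The snippet and the contexts dictionary are assumptions made for this example; ast.walk() is the module's own traversal helper.

```python
import ast

tree = ast.parse("total = a + b\ndel a")

# Map each context class name ('Load', 'Store', 'Del') to the names seen.
contexts = {}
for node in ast.walk(tree):
    if isinstance(node, ast.Name):
        contexts.setdefault(type(node.ctx).__name__, []).append(node.id)

for ctx_name in sorted(contexts):
    print(ctx_name, sorted(contexts[ctx_name]))
```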

class ast.Starred(value, ctx)

A *var variable reference. value holds the variable, typically a Name node. This type must be used when building a Call node with *args.

>>> print(ast.dump(ast.parse('a, *b = it'), indent=4))
Module(
    body=[
        Assign(
            targets=[
                Tuple(
                    elts=[
                        Name(id='a', ctx=Store()),
                        Starred(
                            value=Name(id='b', ctx=Store()),
                            ctx=Store())],
                    ctx=Store())],
            value=Name(id='it', ctx=Load()))],
    type_ignores=[])
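The Call case mentioned above can be inspected in the same way. In this sketch (the call expression is an arbitrary example), *args appears as a Starred node in the args list of the Call, while **kwargs appears as a keyword whose arg is None, matching the comment on keyword in the grammar:

```python
import ast

call = ast.parse("f(*args, **kwargs)", mode="eval").body

# *args is wrapped in a Starred node with a Load context.
star = call.args[0]
print(type(star).__name__, type(star.ctx).__name__, star.value.id)

# **kwargs is a keyword node with no identifier.
double_star = call.keywords[0]
print(double_star.arg, double_star.value.id)
```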

Expressions

class ast.Expr(value)

When an expression, such as a function call, appears as a statement by itself with its return value not used or stored, it is wrapped in this container. value holds one of the other nodes in this section, a Constant, a Name, a Lambda, a Yield or YieldFrom node.

>>> print(ast.dump(ast.parse('-a'), indent=4))
Module(
    body=[
        Expr(
            value=UnaryOp(
                op=USub(),
                operand=Name(id='a', ctx=Load())))],
    type_ignores=[])

class ast.UnaryOp(op, operand)

A unary operation. op is the operator, and operand any expression node.

class ast.UAdd
class ast.USub
class ast.Not
class ast.Invert

Unary operator tokens. Not is the not keyword, Invert is the ~ operator.

>>> print(ast.dump(ast.parse('not x', mode='eval'), indent=4))
Expression(
    body=UnaryOp(
        op=Not(),
        operand=Name(id='x', ctx=Load())))

class ast.BinOp(left, op, right)

A binary operation (like addition or division). op is the operator, and left and right are any expression nodes.

>>> print(ast.dump(ast.parse('x + y', mode='eval'), indent=4))
Expression(
    body=BinOp(
        left=Name(id='x', ctx=Load()),
        op=Add(),
        right=Name(id='y', ctx=Load())))

class ast.Add
class ast.Sub
class ast.Mult
class ast.Div
class ast.FloorDiv
class ast.Mod
class ast.Pow
class ast.LShift
class ast.RShift
class ast.BitOr
class ast.BitXor
class ast.BitAnd
class ast.MatMult

Binary operator tokens.

class ast.BoolOp(op, values)

A boolean operation, ‘or’ or ‘and’. op is Or or And. values are the values involved. Consecutive operations with the same operator, such as a or b or c, are collapsed into one node with several values.

This doesn’t include not, which is a UnaryOp.

>>> print(ast.dump(ast.parse('x or y', mode='eval'), indent=4))
Expression(
    body=BoolOp(
        op=Or(),
        values=[
            Name(id='x', ctx=Load()),
            Name(id='y', ctx=Load())]))

class ast.And
class ast.Or

Boolean operator tokens.
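The collapsing of consecutive BoolOp operations can be observed directly. In the sketch below (both expressions are arbitrary examples), a chain with a single operator yields one node with three values, while mixing or with and yields a nested BoolOp:

```python
import ast

# Same operator throughout: one BoolOp holding all three operands.
same_op = ast.parse("a or b or c", mode="eval").body
print(type(same_op.op).__name__, len(same_op.values))

# Mixed operators: 'and' binds tighter, so it becomes a nested BoolOp.
mixed = ast.parse("a or b and c", mode="eval").body
inner = mixed.values[1]
print(type(mixed.op).__name__, type(inner).__name__, type(inner.op).__name__)
```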

class ast.Compare(left, ops, comparators)

A comparison of two or more values. left is the first value in the comparison, ops the list of operators, and comparators the list of values after the first element in the comparison.
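As a sketch of how these three fields line up, the chained comparison below (an arbitrary example) has one left operand, two operators, and two comparators, with each operator paired to the comparator at the same index:

```python
import ast

node = ast.parse("1 <= a < 10", mode="eval").body

# left is the first value in the chain.
print(ast.dump(node.left))

# ops and comparators have equal length and are read pairwise.
for op, comparator in zip(node.ops, node.comparators):
    print(type(op).__name__, ast.dump(comparator))
```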

>>> print(ast.dump(ast.parse('1 5 if n
