Hướng dẫn python escape hyphen

String literals are described by the following lexical definitions:

stringliteral:   shortstring | longstring
shortstring:     "'" shortstringitem* "'" | '"' shortstringitem* '"'
longstring:      "'''" longstringitem* "'''" | '"""' longstringitem* '"""'
shortstringitem: shortstringchar | escapeseq
longstringitem:  longstringchar | escapeseq
shortstringchar: 
longstringchar:  
escapeseq:       "\" 

In plain English: String literals can be enclosed in matching single quotes (') or double quotes ("). They can also be enclosed in matching groups of three single or double quotes (these are generally referred to as triple-quoted strings). The backslash (\) character is used to escape characters that otherwise have a special meaning, such as newline, backslash itself, or the quote character. String literals may optionally be prefixed with a letter `r' or `R'; such strings are called raw strings and use different rules for backslash escape sequences.

In triple-quoted strings, unescaped newlines and quotes are allowed (and are retained), except that three unescaped quotes in a row terminate the string. (A ``quote'' is the character used to open the string, i.e. either ' or ".)

Unless an `r' or `R' prefix is present, escape sequences in strings are interpreted according to rules similar to those used by Standard C. The recognized escape sequences are:

Escape Sequence Meaning 
\newline Ignored
\\ Backslash (\)
\' Single quote (')
\" Double quote (")
\a ASCII Bell (BEL)
\b ASCII Backspace (BS)
\f ASCII Formfeed (FF)
\n ASCII Linefeed (LF)
\r ASCII Carriage Return (CR)
\t ASCII Horizontal Tab (TAB)
\v ASCII Vertical Tab (VT)
\ooo ASCII character with octal value ooo
\xhh... ASCII character with hex value hh...

In strict compatibility with Standard C, up to three octal digits are accepted, but an unlimited number of hex digits is taken to be part of the hex escape (and then the lower 8 bits of the resulting hex number are used in 8-bit implementations).

Unlike Standard C, all unrecognized escape sequences are left in the string unchanged, i.e., the backslash is left in the string. (This behavior is useful when debugging: if an escape sequence is mistyped, the resulting output is more easily recognized as broken.)

When an `r' or `R' prefix is present, backslashes are still used to quote the following character, but all backslashes are left in the string. For example, the string literal r"\n" consists of two characters: a backslash and a lowercase `n'. String quotes can be escaped with a backslash, but the backslash remains in the string; for example, r"\"" is a valid string literal consisting of two characters: a backslash and a double quote; r"\" is not a value string literal (even a raw string cannot end in an odd number of backslashes). Specifically, a raw string cannot end in a single backslash (since the backslash would escape the following quote character). Note also that a single backslash followed by a newline is interpreted as those two characters as part of the string, not as a line continuation.


See About this document... for information on suggesting changes.

I'm using the basic elasticsearch library in python 3.

I have a query:

query = {
        "query": {
            "bool": {
                "must": [{ "term": {"hostname": '"hal-pc"' } }]
            }
        }
    }

That I call with: page = es.search(index = index_name, body=query, search_type='scan', scroll='2m')

However I'm not getting any results. I can query on other fields so I know my query works, but when I add the search for a field with a hyphen in the value, I cannot find anything. How can I escape this character? I know that with normal ES queries you can send a message to configure your ES to respond to certain characters in certain ways, but I don't know how to do that in python.

asked Feb 14, 2017 at 16:01

2

If the field hostname is analyzed in mapping, elasticsearch does not store the field value as is. Instead, it stores "hal-pc" as two separate terms: "hal" and "pc".So, the doc might not be obtained when search for "hal-pc" using term filter.

You can search for "hal-pc" using Match query to get the necessary result. Or, by making the field hostname field not-analyzed and using term query as is.

{
    "query": {
        "match" : {
            "hostname": "hal-pc"
        }
    }
}

But, this might also return docs where hostname is just "hal" or just "pc" as well.

answered Feb 14, 2017 at 16:41

3