How do you take a string between two characters in Python?

from timeit import timeit
from re import search, DOTALL


def partition_find(string, start, end):
    return string.partition(start)[2].rpartition(end)[0]


def re_find(string, start, end):
    # applying re.escape to start and end would be safer
    return search(start + '(.*)' + end, string, DOTALL).group(1)


def index_find(string, start, end):
    return string[string.find(start) + len(start):string.rfind(end)]


# The wikitext of the "Alan Turing law" article from English Wikipedia
# //en.wikipedia.org/w/index.php?title=Alan_Turing_law&action=edit&oldid=763725886
string = """..."""
start = '==Proposals=='
end = '==Rival bills=='

assert index_find(string, start, end) \
       == partition_find(string, start, end) \
       == re_find(string, start, end)

print('index_find', timeit(
    'index_find(string, start, end)',
    globals=globals(),
    number=100_000,
))

print('partition_find', timeit(
    'partition_find(string, start, end)',
    globals=globals(),
    number=100_000,
))

print('re_find', timeit(
    're_find(string, start, end)',
    globals=globals(),
    number=100_000,
))

Result:

index_find 0.35047444528454114
partition_find 0.5327825636197754
re_find 7.552149639286381

re_find was more than 20 times slower than index_find in this example.
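
The comment in re_find notes that escaping the delimiters would be safer. A minimal sketch of that variant follows; the name re_find_escaped and the sample string are illustrative assumptions, not part of the benchmark above.

```python
import re


def re_find_escaped(string, start, end):
    # Hypothetical variant of re_find: re.escape makes the delimiters match
    # literally even if they contain regex metacharacters such as '$' or '['.
    pattern = re.escape(start) + '(.*)' + re.escape(end)
    match = re.search(pattern, string, re.DOTALL)
    # re.search returns None when either delimiter is missing.
    return match.group(1) if match else None


print(re_find_escaped('price: $[42]$ total', '$[', ']$'))
```

Unlike the benchmarked version, this sketch returns None instead of raising an AttributeError when nothing matches. That is a design choice made here for illustration.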

In this guide to splitting strings in Python, we’ll explore the various ways we can use the language to precisely split a string. When we split strings between characters in Python, it’s possible to extract part of a string from the whole (also known as a substring).

Learning how to split a string will be useful for any Python programmer. Whether you intend to use Python for web development, data science, or natural language processing, splitting a string will be a routine operation.

We’ll follow several procedures for obtaining substrings in Python. First, we’ll take a look at slice notation and the split() method. Afterwards, we’ll examine more advanced techniques, such as regex.

Split a String Between Characters with Slice Notation

When it comes to splitting strings, slice notation is an obvious choice for Python developers. With slice notation, we can find a subsection of a string.

Example: Split a string with slice notation

text = """BERNARDO
Well, good night.
If you do meet Horatio and Marcellus,
The rivals of my watch, bid them make haste."""

speaker = text[:8]

print(speaker)

Output

BERNARDO

Split a String by Character Position

To use this method, we need to know the start and end location of the substring we want to slice. We can use the index() method to find the index of a character in a string.

Example: How to find the index of a character in a string

sentence = "Jack and Jill went up the hill."

index1 = sentence.index("J", 0)
print(index1)

index2 = sentence.index("J", 1)
print(index2)

Output

0
9
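
One detail worth keeping in mind: index() raises a ValueError when the character is not found, while the related find() method returns -1. A small sketch, reusing the sentence from the example above:

```python
sentence = "Jack and Jill went up the hill."

# find() returns -1 when the substring is absent
print(sentence.find("z"))

# index() raises ValueError in the same situation
try:
    sentence.index("z")
except ValueError:
    print("'z' not found")
```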

A Quick Guide to Using split()

Python strings come with a built-in method for splitting: split(). This method can be used to split strings between characters. The split() method takes two optional parameters. The first is the separator (sep), which determines the string used to split the text.

The split() method returns a list of substrings from the original string. By passing different values to split(), we can split a string in a variety of ways.

Splitting Strings with the split() Method

We can specify the character to split a string by using the separator in the split() method. By default, split() will use whitespace as the separator, but we are free to provide other characters if we want.

Example: Splitting a string by whitespace

sentence = "The quick brown fox jumps over the lazy dog."

# split a string using whitespace
words = sentence.split()

print(words)

Output

['The', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog.']

Example: Splitting a string separated by commas

rainbow = "red,orange,yellow,green,blue,indigo,violet"

# use a comma to separate the string
colors = rainbow.split(',')

print(colors)

Output

['red', 'orange', 'yellow', 'green', 'blue', 'indigo', 'violet']

Use split() with Multiple Arguments

Using the split() method, we can also control how many splits are performed. The method takes a second parameter: maxsplit. This value tells split() the maximum number of splits to perform.

Example: Splitting multiple lines of text

text = """HORATIO
Before my God, I might not this believe
Without the sensible and true avouch
Of mine own eyes."""

lines = text.split(maxsplit=1)

print(lines)

Output

['HORATIO', 'Before my God, I might not this believe\nWithout the sensible and true avouch\nOf mine own eyes.']

Because we set maxsplit to a value of 1, the text was split into two substrings.
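
maxsplit can be combined with an explicit separator as well. The sketch below uses a made-up key=value string to show that only the first separator is used when maxsplit is 1.

```python
# Hypothetical sample data: the value itself contains '=' characters
setting = "query=a=1&b=2"

# Split once, at the first '=' only
key, value = setting.split("=", maxsplit=1)

print(key)
print(value)
```

Here key holds the text before the first equals sign, and value keeps everything after it, including the remaining equals signs.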

If we have a text that is divided by multiple identical characters, we can use the split() method to separate the strings between the characters.

Example: Using symbols to separate a string

nums = "1--2--3--4--5"

nums = nums.split('--')

print(nums)

Output

['1', '2', '3', '4', '5']

How to Find a String Between Two Symbols

We can combine the index() method with slice notation to extract a substring from a string. The index() method will give us the start and end locations of the substring. Once we know the location of the symbols ($'s in this case), we’ll extract the string using slice notation.

Example: Extracting a substring with the index() method

# extract the substring surrounded by $'s from the text
text = "Lorem ipsum dolor sit amet, $substring$ adipisicing elit."

start = text.index('$')
end = text.index('$', start + 1)

substring = text[start + 1:end]
print(f"Start: {start}, End: {end}")
print(substring)

Output

Start: 28, End: 38
substring
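
The same steps can be wrapped in a small helper. This is only a sketch: the function name between and its choice to return None when a delimiter is missing are assumptions made for illustration. It uses find() rather than index() so that a missing symbol does not raise an exception.

```python
def between(text, left, right):
    # Locate the left delimiter; find() returns -1 if it is absent.
    start = text.find(left)
    if start == -1:
        return None
    start += len(left)
    # Search for the right delimiter only after the left one.
    end = text.find(right, start)
    if end == -1:
        return None
    return text[start:end]


text = "Lorem ipsum dolor sit amet, $substring$ adipisicing elit."
print(between(text, "$", "$"))
```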

How to Use Regular Expressions to Split a String Between Characters

Regular expressions are a convenient way of searching a string or text for patterns. Because regular expression patterns (regex) are so versatile, they can be used to create very targeted searches.

Python comes with the re library. With regex, we can search text with a fine-tooth comb, looking for specific words, phrases, or even words of a certain length.

Example: Using a regular expression to search for a string

import re

text="""The Fulton County Grand Jury said Friday an investigation
of Atlanta's recent primary election produced "no evidence" that
any irregularities took place."""

# search the text for words that are 14 characters long
match = re.search(r"\w{14}", text)
print(match.group())

# search the text for the word "Atlanta"
atlanta = re.search(r"Atlanta", text)
print(atlanta.group())

Output

irregularities
Atlanta

Example: Using regex to find a date

sentence = "Tony was born on May 1st 1972."

date = re.search(r"\d{4}", sentence)

print(date.group())

Output

1972

In the examples above, we employed the search() function to find a substring using regular expression patterns. It takes two required arguments. The first is our regex pattern, and the second is the string we’d like to perform the search on.

Regular expressions use special characters and numbers to create targeted searches. For instance, our first example uses the special sequence \w to search for words.

Special Characters for Regular Expressions:

  • \w – Matches word characters (letters, digits, and the underscore)
  • \d – Matches digit characters (0-9)
  • \s – Matches whitespace characters
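
A short sketch of these character classes in action, written with the backslash syntax Python's re module expects (the sample string is made up):

```python
import re

sample = "Room 42 opens at 9 am"

# \d+ matches runs of one or more digits
digits = re.findall(r"\d+", sample)
print(digits)

# \w+ matches runs of one or more word characters
words = re.findall(r"\w+", sample)
print(words)

# \s+ matches runs of whitespace, used here as the split pattern
parts = re.split(r"\s+", sample)
print(parts)
```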

Example: Find if a string starts with a word with regex

speech= """HAMLET
O God, your only jig-maker. What should a man do
but be merry? for, look you, how cheerfully my
mother looks, and my father died within these two hours."""

match = re.search(r"^HAMLET", speech)
print("HAMLET" in match.group())

Output

True

Furthermore, we can use regex to find a string between two characters. In the next example, we’ll use a regex pattern to find a string between square brackets.
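
As a minimal sketch of that idea, assuming the text between the first pair of square brackets is what we want: the brackets are escaped because they are special characters in regex, and the lazy group (.*?) stops at the first closing bracket. The sample line is made up for illustration.

```python
import re

# Hypothetical stage direction embedded in a line of text
line = "Enter GHOST [in armour] and exit"

# \[ and \] match literal brackets; (.*?) lazily captures what lies between
match = re.search(r"\[(.*?)\]", line)

if match:
    print(match.group(1))
```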

Example: Regular expression to find all the characters between two special characters

speech="""KING CLAUDIUS
[Aside] O, 'tis too true!
How smart a lash that speech doth give my conscience!"""

match = re.search[r"[?
