If you want exact matches of words then consider word tokenizing the target string. I use the recommended word_tokenize from nltk:
from nltk.tokenize import word_tokenize
Here is the tokenized string from the accepted answer:
a_string = "A string is more than its parts!"
tokens = word_tokenize(a_string)
tokens
Out[46]: ['A', 'string', 'is', 'more', 'than', 'its', 'parts', '!']
The accepted answer gets modified as follows:
matches_1 = ["more", "wholesome", "milk"]
[x in tokens for x in matches_1]
Out[42]: [True, False, False]
As in the accepted answer, the word "more" is still matched. If "mo" becomes a match string, however, the accepted answer still finds a match. That is a behavior I did not want.
matches_2 = ["mo", "wholesome", "milk"]
[x in a_string for x in matches_2]
Out[43]: [True, False, False]
Using word tokenization, "mo" is no longer matched:
[x in tokens for x in matches_2]
Out[44]: [False, False, False]
That is the additional behavior that I wanted. This answer also responds to the duplicate question here.
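If installing nltk is not an option, a similar whole-word check can be sketched with the standard library's re module. This is a stand-in under stated assumptions, not what the answer above uses: the helper name whole_word_matches is hypothetical, and \b word boundaries do not split text exactly the way word_tokenize does (contractions and hyphenated words, for example, may behave differently).

```python
import re

def whole_word_matches(text, candidates):
    """Return one boolean per candidate: True if it occurs as a whole word in text."""
    return [
        # re.escape keeps characters such as '.' or '+' literal inside the pattern
        re.search(r"\b" + re.escape(word) + r"\b", text) is not None
        for word in candidates
    ]

a_string = "A string is more than its parts!"
print(whole_word_matches(a_string, ["more", "wholesome", "milk"]))  # [True, False, False]
print(whole_word_matches(a_string, ["mo", "wholesome", "milk"]))    # [False, False, False]
```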
Summary: In this tutorial, we will learn different ways to check for multiple substrings in another string in Python. Return True if the string contains any of the given substrings.
Method 1: Using any() with for loop to Check for Substrings
In this method, we iterate through the list of substrings and use the in operator to check whether each one exists in another string. We append the boolean results to a list and pass it to the any() function, which returns True or False indicating whether any of the substrings are present in the string.
The any() function in Python accepts an iterable (such as List, Dictionary, Tuple, etc.) as a parameter and returns True if any of the elements in the iterable is True.
substrings = ['python', 'python3', 'programming']
string = 'Learn programming at pencilprogrammer.com'
result_list = []
for x in substrings:
    # append True/False for substring x
    result_list.append(x in string)

# call any() with the list of boolean results
print(any(result_list))
Output: True
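The loop and the intermediate list can be folded into a single generator expression. The following is a minimal sketch of that variant using the same example data; the variable name found is only illustrative.

```python
substrings = ['python', 'python3', 'programming']
string = 'Learn programming at pencilprogrammer.com'

# any() consumes the generator lazily and stops at the first True it sees,
# so no intermediate list of booleans is built
found = any(x in string for x in substrings)
print(found)  # True
```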
One drawback of this method is that it is case-sensitive. If a substring is present in the main string but the case does not match, the check returns False.
We can overcome this by converting both strings to the same case (e.g., lower) inside the loop body:
substrings = ['mY', 'naME', 'is', 'KuMAR']
string = 'Author name: Adarsh Kumar'
result_list = []
for x in substrings:
    # append True/False for substring x, ignoring case
    result_list.append(x.lower() in string.lower())

# call any() with the list of boolean results
print(any(result_list))
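The case-insensitive check can also be wrapped in a small helper so the main string is normalized only once. This is a sketch, not part of the original tutorial: the function name contains_any and its ignore_case parameter are hypothetical, and str.casefold() is used here as a stricter alternative to lower().

```python
def contains_any(text, substrings, ignore_case=True):
    """Return True if at least one of the substrings occurs in text."""
    if ignore_case:
        # normalize once instead of on every loop iteration
        text = text.casefold()
        substrings = [s.casefold() for s in substrings]
    return any(s in text for s in substrings)

substrings = ['mY', 'naME', 'is', 'KuMAR']
string = 'Author name: Adarsh Kumar'
print(contains_any(string, substrings))                     # True
print(contains_any(string, substrings, ignore_case=False))  # False
```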
Alternatively, we can check for substrings using a regular expression, as discussed below.
Method 2: Using any() with regular expression (re)
Using regular expressions, we can easily check multiple substrings in a single-line statement.
We use the findall() method of the re module to get all the matches as a list of strings and pass it to any() to get the result as True or False.
import re
string = 'Python is good for Machine Learning and Data-Science'

# Pass the substrings separated by | as the 1st argument
# and the main string as the 2nd argument.
# Additionally, we can pass re.IGNORECASE as the
# 3rd argument to make matching case-insensitive.
match_list = re.findall(r'python|machine|good', string, re.IGNORECASE)
print(any(match_list))
Output: True
This is a concise, and typically fast, way to check whether a string contains any of the given substrings in Python.
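When the substrings come from a list rather than a hand-written pattern, the pattern can be built programmatically. The sketch below assumes the same example keywords; it uses re.escape so that special characters are matched literally, and re.search instead of findall because only a yes/no answer is needed.

```python
import re

substrings = ['python', 'machine', 'good']
string = 'Python is good for Machine Learning and Data-Science'

# Escape each substring so characters like '.' or '+' are matched literally
pattern = '|'.join(re.escape(s) for s in substrings)

# re.search returns a match object or None and stops at the first hit
found = re.search(pattern, string, re.IGNORECASE) is not None
print(found)  # True
```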
Python any() Function
The Python any() function accepts an iterable (list, tuple, dictionary, etc.) as an argument and returns True if any element of the iterable is truthy; otherwise it returns False. If the iterable is empty, any() returns False.
any vs all
- any will return True when at least one of the elements is Truthy.
- all will return True only when all the elements are Truthy.
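A short illustration of the difference, including the empty-iterable edge case; the sample list is arbitrary.

```python
values = [0, '', 'text']   # two falsy elements and one truthy element

any_result = any(values)   # True: at least one element is truthy
all_result = all(values)   # False: not every element is truthy

# Edge case: an empty iterable
any_empty = any([])        # False: there is no truthy element
all_empty = all([])        # True: there is no falsy element

print(any_result, all_result, any_empty, all_empty)  # True False False True
```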
Check if multiple strings exist in another string
In this case, we can use Python's any() function.
Here the script returns "Found a match" because at least one word from the list exists in the string.
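A minimal sketch of such a script follows. The sample data is an assumption chosen to agree with the surrounding text (the name myList, the word "one" at the start of the string, and "six" absent from it); it is not the original example.

```python
myList = ['one', 'six', 'ten']        # assumed keywords
text = 'one two three four five'      # assumed input string

# True as soon as one keyword is found inside the string
if any(word in text for word in myList):
    result = 'Found a match'
else:
    result = 'No match found'

print(result)  # Found a match
```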
How to check if string contains substring from list
If your list is long, a Python regular expression is often a better choice.
The example above returns "Found a match" because "one" exists in the list.
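One way to sketch the regular-expression approach is to compile the keywords into a single pattern that can be reused across many strings. The data below is assumed for illustration and is not the original example.

```python
import re

myList = ['one', 'six', 'ten']        # assumed keywords
text = 'one two three four five'      # assumed input string

# Build one alternation pattern; re.escape keeps special characters literal
pattern = re.compile('|'.join(re.escape(keyword) for keyword in myList))

result = 'Found a match' if pattern.search(text) else 'No match found'
print(result)  # Found a match
```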
Check If a String Contains Multiple Keywords
You can also solve this with plain iteration.
The script above returns "Found a match" because "one" exists in myList.
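An explicit loop version might look like the sketch below, again with assumed sample data. The break mirrors the short-circuit behavior of any().

```python
myList = ['one', 'six', 'ten']        # assumed keywords
text = 'one two three four five'      # assumed input string

result = 'No match found'
for keyword in myList:
    if keyword in text:
        result = 'Found a match'
        break  # no need to test the remaining keywords

print(result)  # Found a match
```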
All matches including duplicates in a string
If you want to get all the matches including duplicates from the list:
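A list comprehension over the words of the string keeps every occurrence, in order. This sketch assumes whitespace-separated words and uses made-up sample data.

```python
myList = ['one', 'six', 'ten']     # assumed keywords
text = 'one two one six five'      # assumed input with a repeated word

# Walk the string word by word so repeated words are reported each time
matches = [word for word in text.split() if word in myList]
print(matches)  # ['one', 'one', 'six']
```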
First word match in a string from list
If you want the first match with False as a default:
The example above returns "one" because the word "one" is the first word of the string and also exists in myList.
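One way to express "first match, else False" is next() with a default value. The data is assumed, and the sketch splits on whitespace, so punctuation attached to a word would prevent a match.

```python
myList = ['one', 'six', 'ten']        # assumed keywords
text = 'one two three four five'      # assumed input string

# next() returns the first word of the string found in myList,
# or the default (False) if the generator yields nothing
first_match = next((word for word in text.split() if word in myList), False)
no_match = next((word for word in 'two three'.split() if word in myList), False)

print(first_match)  # one
print(no_match)     # False
```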
Similarly, to check whether all the strings from the list are found, use all() instead of any().
The example above returns False because "six" is not in the string.
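A sketch of the all() variant with assumed data; the extra list of missing keywords is only there to show why the result is False.

```python
myList = ['one', 'six', 'ten']        # assumed keywords
text = 'one two three four five'      # assumed input string

# True only if every keyword occurs somewhere in the string
all_found = all(word in text for word in myList)

# Optional: list the keywords that were not found
missing = [word for word in myList if word not in text]

print(all_found)  # False
print(missing)    # ['six', 'ten']
```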