Use a two-way approach: split and analyze the words:
import re

strings = ["3n3k game gnma34 xbox360 table", "the a22b b3kj3 ps4 2ij2aln potato"]
exceptions = ['xbox360', 'ps4']

def cleanse(word):
    # Drop any word that contains a digit, unless it is a listed exception
    rx = re.compile(r'\D*\d')
    if rx.match(word) and word not in exceptions:
        return ''
    return word

# filter(None, ...) discards the empty strings returned for dropped words
nstrings = [" ".join(filter(None, (
    cleanse(word) for word in string.split())))
    for string in strings]
print(nstrings)
# ['game xbox360 table', 'the ps4 potato']
Additionally, I changed the regular expression to `\D*\d` and try to match it at the beginning of each "word" (with re.match()), as \w matches digits as well.
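To illustrate why the pattern is anchored with re.match(), here is a minimal sketch. The helper name contains_digit is hypothetical and only serves the demonstration; it is not part of the answer's code.

```python
import re

rx = re.compile(r'\D*\d')

def contains_digit(word):
    # Hypothetical helper: re.match() anchors at the start of the word, so
    # "any non-digits, then one digit" succeeds only if a digit occurs.
    return rx.match(word) is not None

for word in ["game", "gnma34", "3n3k", "ps4"]:
    print(word, contains_digit(word))

# \w matches digits too, so a \w-based pattern cannot separate
# purely alphabetic words from words that contain digits.
print(re.fullmatch(r'\w+', "1234") is not None)
```

Words such as ps4 still match here, which is why the exceptions list is checked separately.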
If you are able to upgrade to the newer regex module, you could use (*SKIP)(*FAIL) and a better expression without the need of a function:
\b(?:xbox360|ps4)\b     # define your exceptions
(*SKIP)(*FAIL)          # these shall fail
|                       # or match words with digits
\b[A-Za-z]*\d\w*\b
See a demo on regex101.com and the complete Python snippet here:
import regex as re

strings = ["3n3k game gnma34 xbox360 table", "the a22b b3kj3 ps4 2ij2aln potato 123123 1234"]
exceptions = [r'\d+', 'xbox360', 'ps4']

# Exceptions are matched first and skipped; every other word with a digit is removed
rx = re.compile(r'\b(?:{})\b(*SKIP)(*FAIL)|\b[A-Za-z]*\d\w*\b'.format("|".join(exceptions)))

nstrings = [" ".join(
    filter(None, (rx.sub('', word)
    for word in string.split())))
    for string in strings]
print(nstrings)
# ['game xbox360 table', 'the ps4 potato 123123 1234']
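If installing the third-party regex module is not an option, a similar result can be approximated with the standard re module by testing each whitespace-separated word against the exceptions first. This is a sketch of a different technique, not the (*SKIP)(*FAIL) approach, and the names keep, has_digit and strip_digit_words are hypothetical.

```python
import re

strings = ["3n3k game gnma34 xbox360 table",
           "the a22b b3kj3 ps4 2ij2aln potato 123123 1234"]
exceptions = [r'\d+', 'xbox360', 'ps4']

# A word that fully matches one of the exception patterns is always kept.
keep = re.compile("|".join(exceptions))
has_digit = re.compile(r'\d')

def strip_digit_words(text):
    # Hypothetical helper: keep a word if it is an exception or has no digit.
    return " ".join(
        word for word in text.split()
        if keep.fullmatch(word) or not has_digit.search(word)
    )

nstrings = [strip_digit_words(string) for string in strings]
print(nstrings)
```

Unlike the substitution above, this version drops a whole word as soon as it contains a digit, so results can differ for words that include punctuation.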
Created: May-28, 2021
Alphanumeric characters comprise the 26 letters of the alphabet and the digits 0 to 9. Non-alphanumeric characters are characters that are not letters or digits, like + and @.
In this tutorial, we will discuss how to remove non-alphanumeric characters from a string in Python.
Use the isalnum() Method to Remove All Non-Alphanumeric Characters in Python String
We can use the isalnum() method to check whether a given character or string is alphanumeric or not. We can check each character of the string individually and, if it is alphanumeric, keep it; the kept characters are then combined with the join() function.
For example,
string_value = "alphanumeric@123__"
s = ''.join(ch for ch in string_value if ch.isalnum())
print(s)
Output:
alphanumeric123
Use the filter() Function to Remove All Non-Alphanumeric Characters in Python String
The filter() function constructs an iterator from the elements of an iterable for which a given function returns true.
For our problem, the string is the iterable, and we use the isalnum() method as the test, so each character is checked and kept only if it is alphanumeric. The join() function then combines the remaining characters into a string.
For example,
string_value = "alphanumeric@123__"
s = ''.join(filter(str.isalnum, string_value))
print(s)
Output:
alphanumeric123
This method works in Python 3, where filter() returns an iterator that join() consumes directly.
Use Regular Expressions to Remove All Non-Alphanumeric Characters in Python String
A regular expression is a special sequence of characters that helps you match strings or sets of strings, using a specialized syntax held in a pattern. To use regular expressions, we import the re module.
We can use the sub() function from this module to replace every substring that matches a non-alphanumeric character with an empty string. Since \W does not match the underscore, the pattern lists _ alongside it.
For example,
import re
string_value = "alphanumeric@123__"
s = re.sub(r'[\W_]+', '', string_value)
print(s)
Output:
alphanumeric123
Alternatively, we can also use the following pattern.
import re
string_value = "alphanumeric@123__"
s = re.sub(r'[^a-zA-Z0-9]', '', string_value)
print(s)
Output:
alphanumeric123
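The two patterns give the same result for the ASCII example above, but they can differ on non-ASCII input. The sketch below uses a made-up sample string to show this: with str patterns in Python 3, \W is Unicode-aware, while the explicit a-zA-Z0-9 class is ASCII-only.

```python
import re

# Made-up sample: "caf" + e-acute (U+00E9) + "_123!"
text = "caf\u00e9_123!"

# \W is Unicode-aware for str patterns, so the accented letter is kept.
unicode_aware = re.sub(r'[\W_]+', '', text)

# The explicit ASCII class treats the accented letter as non-alphanumeric.
ascii_only = re.sub(r'[^a-zA-Z0-9]', '', text)

print(unicode_aware)
print(ascii_only)
```

Which pattern is preferable depends on whether accented and other non-ASCII letters should count as alphanumeric for the task at hand.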