How do you remove punctuation marks in Python?

When working with Python strings, we often need to remove certain characters from a string. This comes up in data preprocessing for data science as well as in everyday programming. Let's discuss several ways to perform this task in Python.

Method 1: Remove punctuation from a string with translate

The first two arguments to str.maketrans are empty strings, and the third is a string of the punctuation characters to remove. Passing the resulting table to the string's translate method tells Python to delete those characters. This is one of the best ways to strip punctuation from a string.
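
The original listing for this method did not survive extraction, so the following is only a minimal sketch of the translate approach described above. The helper name strip_punctuation and the sample string are illustrative assumptions, and string.punctuation covers ASCII punctuation only.

```python
import string


def strip_punctuation(text):
    """Return text with every character in string.punctuation deleted.

    strip_punctuation is a hypothetical helper name used for illustration.
    """
    # The third argument to str.maketrans lists characters to delete.
    table = str.maketrans('', '', string.punctuation)
    return text.translate(table)


# Sample input (an assumption chosen for illustration).
test_str = 'Gfg, is best: for ! Geeks ;'
print(strip_punctuation(test_str))
```

Building the table once and reusing it is worthwhile when many strings are cleaned, since the table does not depend on the input.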

Output:

Gfg is best for  Geeks 

Method 2: Remove punctuation from a string with a Python loop

This is the brute-force way to perform the task. We check each character against a raw string containing the punctuation marks and build a new string with those marks removed.
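
The original listing is not recoverable, so what follows is a rough sketch of the loop described above rather than the author's exact code. The exact contents of PUNCTUATION and the function name strip_punctuation_loop are assumptions made for illustration.

```python
# Raw string of characters treated as punctuation (an assumed set;
# adjust it to whatever marks need to be removed).
PUNCTUATION = r"""!()-[]{};:'"\,<>./?@#$%^&*_~"""


def strip_punctuation_loop(text, punctuation=PUNCTUATION):
    """Build a new string that skips every character found in punctuation."""
    kept = []
    for ch in text:
        if ch not in punctuation:
            kept.append(ch)
    return ''.join(kept)


test_str = "Gfg, is best : for ! Geeks ;"
print("The original string is : " + test_str)
print("The string after punctuation filter : " + strip_punctuation_loop(test_str))
```

Collecting the kept characters in a list and joining once avoids repeatedly copying the string, which is what happens when replace is called inside the loop.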

Output: 

The original string is : Gfg, is best : for ! Geeks ;
The string after punctuation filter : Gfg is best  for  Geeks 

Method 3: Remove punctuation from a string with regex

Punctuation can also be removed with a regex. Here we replace every punctuation mark with an empty string using a suitable pattern.
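
As a hedged sketch of the regex approach (the original listing was lost in extraction), one commonly used pattern is [^\w\s], which matches anything that is neither a word character nor whitespace. The pattern choice and the name strip_punctuation_regex are assumptions, not necessarily what the original article used.

```python
import re


def strip_punctuation_regex(text):
    """Delete every character that is neither a word character nor whitespace.

    Note: the underscore counts as a word character, so it is kept.
    """
    return re.sub(r'[^\w\s]', '', text)


test_str = "Gfg, is best : for ! Geeks ;"
print("The original string is : " + test_str)
print("The string after punctuation filter : " + strip_punctuation_regex(test_str))
```

If underscores should be removed as well, a character class built from string.punctuation with re.escape is one alternative.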

Output:

The original string is : Gfg, is best : for ! Geeks ;
The string after punctuation filter : Gfg is best  for  Geeks 

In terms of raw speed, the translate approach is the fastest of the options timed at the end of this section. In Python 2, the call is:

s.translate(None, string.punctuation)

In Python 3, the equivalent is:

s.translate(str.maketrans('', '', string.punctuation))

If speed is not a concern, another option is:

exclude = set(string.punctuation)
s = ''.join(ch for ch in s if ch not in exclude)

This is faster than calling s.replace for each character, but it does not perform as well as approaches that are not pure Python, such as regexes or string.translate, as the timings below show. For this type of problem, doing the work at as low a level as possible pays off.

Timing code (Python 2 syntax):

import re, string, timeit

s = "string. With. Punctuation"
exclude = set(string.punctuation)
table = string.maketrans("","")
regex = re.compile('[%s]' % re.escape(string.punctuation))

def test_set(s):
    return ''.join(ch for ch in s if ch not in exclude)

def test_re(s):  # From Vinko's solution, with fix.
    return regex.sub('', s)

def test_trans(s):
    return s.translate(table, string.punctuation)

def test_repl(s):  # From S.Lott's solution
    for c in string.punctuation:
        s=s.replace(c,"")
    return s

print "sets      :",timeit.Timer('f(s)', 'from __main__ import s,test_set as f').timeit(1000000)
print "regex     :",timeit.Timer('f(s)', 'from __main__ import s,test_re as f').timeit(1000000)
print "translate :",timeit.Timer('f(s)', 'from __main__ import s,test_trans as f').timeit(1000000)
print "replace   :",timeit.Timer('f(s)', 'from __main__ import s,test_repl as f').timeit(1000000)

This gives the following results:

sets      : 19.8566138744
regex     : 6.86155414581
translate : 2.12455511093
replace   : 28.4436721802