How can I simply strip all tags from an element I find in BeautifulSoup?
Hugo
asked Apr 25, 2013 at 4:26
Signal et Communication
Ingénierie Réseaux et Télécommunications
The expected result is:
Signal et Communication
Ingénierie Réseaux et Télécommunications
Here is the source code:
#!/usr/bin/env python3
from bs4 import BeautifulSoup

text = '''
Signal et Communication
Ingénierie Réseaux et Télécommunications
'''
# Name the parser explicitly so bs4 does not warn about guessing one
soup = BeautifulSoup(text, 'html.parser')
print(soup.get_text())
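A minimal sketch of the same call on input that still contains tags. The markup (a table cell holding two links separated by a line break) and the `href` values are assumptions made for illustration; they are not taken from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the tag structure and href values are assumptions.
html = (
    '<td><a href="#sc">Signal et Communication</a><br/>'
    '<a href="#irt">Ingénierie Réseaux et Télécommunications</a></td>'
)

soup = BeautifulSoup(html, "html.parser")

# get_text() concatenates every text node in the tree.
# separator is placed between text nodes; strip=True trims each node
# and skips nodes that are empty after trimming.
lines = soup.get_text(separator="\n", strip=True)
print(lines)
```

Passing a separator is optional, but without one the two link texts would be joined with nothing between them.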
answered Jul 20, 2015 at 16:37
SparkAndShine
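You can use the decompose method in bs4, which removes a tag together with its contents:
import bs4

# Assumed markup for illustration: the tags in the original string were lost.
soup = bs4.BeautifulSoup('<a>I linked to <i>example.com</i></a>', 'html.parser')
# Iterate over a copy of the children, because decompose() modifies the tree.
for child in list(soup.find('a').children):
    if isinstance(child, bs4.element.Tag):
        child.decompose()
print(soup)
Out: <a>I linked to </a>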
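Note that decompose() discards the text inside the removed tag as well. If the goal is to drop only the tag and keep its text, `unwrap()` may be closer to what is wanted. A small sketch contrasting the two, using hypothetical markup:

```python
from bs4 import BeautifulSoup

# Hypothetical markup used only to contrast the two methods.
html = "<p>I linked to <i>example.com</i></p>"

# decompose(): the <i> tag and the text inside it are both removed.
dropped = BeautifulSoup(html, "html.parser")
dropped.i.decompose()
dropped_html = str(dropped)

# unwrap(): the <i> tag is removed, but its text stays in place.
kept = BeautifulSoup(html, "html.parser")
kept.i.unwrap()
kept_html = str(kept)

print(dropped_html)
print(kept_html)
```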
answered Oct 17, 2013 at 22:37
danblack
Code to get the contents as text instead of HTML.
Here html_text is the HTML string whose text you want to extract:
from bs4 import BeautifulSoup

soup = BeautifulSoup(html_text, 'lxml')  # the 'lxml' parser requires the lxml package
text = soup.get_text()
print(text)
answered May 18, 2020 at 8:53
It looks like this is the way to do it, as simple as that.
This line joins together all the text parts within the current element:
''.join(htmlelement.find_all(text=True))
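As a quick check of the joining approach, the sketch below compares it with get_text() on a small element. The markup is hypothetical, and the `string=` keyword is used as the newer spelling of `text=` (this assumes bs4 4.4.0 or later).

```python
from bs4 import BeautifulSoup

# Hypothetical element with nested inline tags.
html = "<div>Hello <b>bold</b> and <i>italic</i> text</div>"
element = BeautifulSoup(html, "html.parser").div

# find_all(string=True) returns every text node under the element, in order.
parts = element.find_all(string=True)
joined = "".join(parts)

# For markup without comments, this matches what get_text() returns.
same_as_get_text = joined == element.get_text()
print(joined)
```

One difference worth knowing: `find_all(string=True)` also returns HTML comments, which get_text() leaves out by default.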
answered Apr 25, 2013 at 4:46
Daniele B
Here is the source code; it gets the text of the page at the given URL:
import bs4
import requests

URL = ''
page = requests.get(URL)
# Parse the response body and keep only its text
text = bs4.BeautifulSoup(page.content, 'html.parser').get_text()
print(text)
answered Mar 10, 2020 at 15:08
I am crawling data from a website. This website has code like this:
Tag b:
Hello
world!
This is what I tried:
new_data = data.find("span", {"class": "demo-span"})
print(new_data.get_text())
Expected output:
Hello world!
But the actual output is:
Tag b: Hello world!
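from bs4 import BeautifulSoup

# The start of this snippet was lost; the markup below is an assumed reconstruction.
html = '''
<span class="demo-span"><b>Tag b:</b> Hello world!</span>
'''
soup = BeautifulSoup(html, 'html.parser')
new_data = soup.find("span", {"class": "demo-span"})
new_data.b.decompose()  # remove the <b> tag together with its text
print(new_data.get_text(' ', strip=True))
# Hello world!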
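Since decompose() changes the parsed tree in place, one option is to work on a copy when the original element is still needed. The sketch below is only an illustration: `text_without` is a hypothetical helper name, and the markup is an assumed stand-in for the page being crawled.

```python
import copy

from bs4 import BeautifulSoup


def text_without(tag, name):
    """Return the text of `tag`, ignoring every descendant tag called `name`."""
    clone = copy.copy(tag)  # bs4 copies the whole subtree, detached from the original
    unwanted = clone.find(name)
    while unwanted is not None:
        unwanted.decompose()  # drop the tag and its text from the copy only
        unwanted = clone.find(name)
    return clone.get_text(" ", strip=True)


# Assumed markup for illustration.
html = '<span class="demo-span"><b>Tag b:</b> Hello world!</span>'
span = BeautifulSoup(html, "html.parser").find("span", {"class": "demo-span"})

cleaned = text_without(span, "b")
original = span.get_text(" ", strip=True)
print(cleaned)
print(original)
```

The original `span` is left untouched, so it can still be used for other extractions afterwards.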
answered Jun 12, 2018 at 8:00
Keyur Potdar