Stop BeautifulSoup From Removing Whitespace
BeautifulSoup is removing whitespace before newlines between tags: print BeautifulSoup(' \n ') The cod
Solution 1:
As a workaround, you could try replacing all <section>...</section> tags with <pre>...</pre> before parsing. Whitespace inside <pre> elements is significant, so BeautifulSoup then preserves the spaces in full. For example:
from bs4 import BeautifulSoup
import re
html = "<?xml version='1.0' encoding='UTF-8'?><section> \n</section>"
html = re.sub(r'(\</?)(section)(\>)', r'\1pre\3', html)
soup = BeautifulSoup(html, "lxml")
print(repr(soup.pre.text))  # repr used to show where the spaces are
Giving you (under Python 2; Python 3 omits the u prefix):
u' \n'
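The tag-rewriting step on its own can be sketched as a small reusable function. This is only an illustration using the standard library: the helper name rename_tag and its parameters are hypothetical, not part of BeautifulSoup, and a regex-based rewrite like this assumes reasonably simple markup.

```python
import re


def rename_tag(html, old="section", new="pre"):
    """Rewrite <old ...> and </old> as <new ...> and </new> (hypothetical helper)."""
    # Capture the opening "<" or "</", then match the tag name.
    # \b keeps the pattern from matching longer names such as <sections>.
    pattern = re.compile(r"(</?)" + re.escape(old) + r"\b", re.IGNORECASE)
    # A function replacement avoids backslash-escape surprises in `new`.
    return pattern.sub(lambda m: m.group(1) + new, html)


html = "<?xml version='1.0' encoding='UTF-8'?><section> \n</section>"
converted = rename_tag(html)
print(repr(converted))  # the whitespace between the tags is left untouched
```

Because only the tag name is replaced, any attributes on the original tag are carried over, and the converted string can be passed to BeautifulSoup as in the example above.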