Stop BeautifulSoup From Removing Whitespace
BeautifulSoup is removing whitespace before newlines between tags: print BeautifulSoup(' \n ') The cod
Solution 1:
As a workaround, you could try replacing every <section>...</section> pair with <pre>...</pre> before parsing. Whitespace inside a <pre> element is significant in HTML, so BeautifulSoup would then fully preserve the spaces. For example:
from bs4 import BeautifulSoup
import re
html = "<?xml version='1.0' encoding='UTF-8'?><section> \n</section>"
html = re.sub(r'(\</?)(section)(\>)', r'\1pre\3', html)
soup = BeautifulSoup(html, "lxml")
print(repr(soup.pre.text))  # repr used to show where the spaces are
Giving you (Python 2 shows the u prefix; Python 3 prints ' \n' without it):
u' \n'
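The regex in the example above only matches bare <section> and </section> tags, so a tag carrying attributes, such as <section class='a'>, would be left unconverted. The following is a minimal sketch of a more tolerant substitution step. The helper name swap_tag and its parameters are hypothetical and not part of BeautifulSoup, and the pattern assumes attribute values do not themselves contain a '>' character.

```python
import re


def swap_tag(html, old="section", new="pre"):
    """Rename every <old ...> and </old> tag to <new ...> and </new>.

    Hypothetical helper for illustration; it is not a BeautifulSoup API.
    """
    # (</?) captures '<' or '</'. The lookahead (?=[\s/>]) makes sure the
    # tag name ends here, so 'section' does not match 'sections'.
    # ([^>]*>) keeps any attributes plus the closing '>'.
    pattern = re.compile(
        r"(</?)" + re.escape(old) + r"(?=[\s/>])([^>]*>)",
        re.IGNORECASE,
    )
    # A function replacement avoids backreference surprises in `new`.
    return pattern.sub(lambda m: m.group(1) + new + m.group(2), html)


html = "<?xml version='1.0' encoding='UTF-8'?><section class='a'> \n</section>"
converted = swap_tag(html)
print(repr(converted))
```

The converted string can then be passed to BeautifulSoup(converted, "lxml") exactly as in the answer above. Note that the original tag name is lost after the swap, so this only suits cases where the text content is what matters.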