Python Regex "or" Gives Empty String When Using Findall
I'm using a simple regex, (.*?)(\d+[.]\d+)|(.*?)(\d+), to match an int/float/double value in a string. When I call findall, the output contains empty strings. The empty strings ...
Solution 1:
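Since you defined 4 capturing groups in the pattern, all four will always be part of each tuple in the re.findall output. The two groups that belong to the alternative that did not match come back as empty strings, unless you remove them afterwards (say, by using filter(None, ...)).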
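To make the source of the empty strings concrete, here is a minimal sketch that runs the pattern from the question on the two sample strings used later in this answer. The names pattern, samples, rows and cleaned are illustrative only.

```python
import re

# Pattern from the question: two alternatives, each with two capturing groups.
pattern = r"(.*?)(\d+[.]\d+)|(.*?)(\d+)"

samples = ["CA$1.90", "RM1"]
rows = [re.findall(pattern, s) for s in samples]
print(rows)
# Each tuple has 4 slots; the slots of the unused alternative are ''.

# Dropping the empty strings after the fact with filter(None, ...):
cleaned = [[tuple(filter(None, t)) for t in row] for row in rows]
print(cleaned)
```

Note that filter(None, ...) removes every empty string, so it would also drop group 1 when the number sits at the very start of the input and the prefix is legitimately empty. That is one reason reducing the number of groups is usually the cleaner fix.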
However, in the current situation, you may "shrink" your pattern to
r'(.*?)(\d+(?:\.\d+)?)'
Now, it will only have 2 capturing groups, and thus, findall
will only output 2 items per tuple in the resulting list.
Details:
(.*?) - Capturing group 1 matching any zero or more chars other than line break chars, as few as possible, up to the first occurrence of ...
(\d+(?:\.\d+)?) - Capturing group 2:
\d+ - one or more digits
(?:\.\d+)? - an optional non-capturing group that matches 1 or 0 occurrences of a . and 1+ digits.
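The same breakdown can be written directly into the pattern with re.VERBOSE, which ignores whitespace and allows comments inside the regex. This is only a sketch of the shortened pattern from this answer; the name rx_verbose is made up for illustration.

```python
import re

rx_verbose = re.compile(r"""
    (.*?)            # group 1: as few chars as possible (no line breaks)
    (                # group 2: the number itself
        \d+          #   one or more digits
        (?:\.\d+)?   #   optional fractional part, non-capturing
    )
""", re.VERBOSE)

# A compiled pattern reports how many capturing groups it has.
print(rx_verbose.groups)
print(rx_verbose.findall("CA$1.90"))
```

Checking the groups attribute is a quick way to predict how many items each findall tuple will contain before running the pattern on real data.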
See the Python demo:
import re
rx = r"(.*?)(\d+(?:[.]\d+)?)"
ss = ["CA$1.90", "RM1"]
for s in ss:
    print(re.findall(rx, s))
# => [('CA$', '1.90')]
# => [('RM', '1')]