Python Positive-lookbehind Split Variable-width
I thought I had set up the expression appropriately, but the split is not working as intended. c = re.compile(r'(?<=^\d\.\d{1,2})\s+'); for header in ['1.1 Introduction', '
Solution 1:
Lookbehinds in Python's re module must have a fixed width. Yours contains \d{1,2}, which can match one or two characters, so it is not valid.
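To see the failure concretely, here is a minimal sketch that tries to compile the pattern from the question with the standard re module. The variable names (pattern, compile_error) are only for illustration, and the exact wording of the error message may differ between Python versions.

```python
import re

# The pattern from the question: \d{1,2} can match one or two characters,
# so the lookbehind has no single fixed width.
pattern = r'(?<=^\d\.\d{1,2})\s+'

try:
    re.compile(pattern)
    compile_error = None
except re.error as exc:
    # The re module rejects the pattern at compile time, before any split runs.
    compile_error = exc

print(compile_error)
```

So the split is not what misbehaves; the expression is rejected before it is ever applied.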
You can use a capture group as a workaround:
import re
c = re.compile(r'(^\d\.\d{1,2})\s+')
for header in ['1.1 Introduction', '1.42 Appendix']:
    # re.split keeps the captured group; drop the empty first element with [1:]
    print(re.split(c, header)[1:])
Output:
['1.1', 'Introduction']
['1.42', 'Appendix']
Solution 2:
The error in your regex is the {1,2} part:
lookbehinds need to be fixed-width, so variable-length quantifiers are not allowed inside them.
Try this website to test your regex before you put it in code.
But in your case you don't need a regex at all.
Simply try this:
for header in ['1.1 Introduction', '1.42 Appendix']:
    print(header.split(' '))
result:
['1.1', 'Introduction']
['1.42', 'Appendix']
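One caveat: split(' ') breaks on every space, so a title with more than one word would be cut into several pieces. A possible variant, assuming you only want to separate the section number from the rest, is to limit the split to the first run of whitespace. The third header below is a made-up example, not from the question.

```python
# '2.3 Related Work' is a hypothetical header added to show a multi-word title.
headers = ['1.1 Introduction', '1.42 Appendix', '2.3 Related Work']

# sep=None splits on any run of whitespace; maxsplit=1 stops after the first split,
# so everything after the section number stays together.
parts = [header.split(None, 1) for header in headers]

for part in parts:
    print(part)
```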
hope this helps.
Solution 3:
My solution may look lame, but since you are only checking for one or two digits after the dot, you can use two fixed-width lookbehinds, one for each case.
c = re.compile(r'(?:(?<=^\d\.\d\d)|(?<=^\d\.\d))\s+')
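As a rough check, the two-lookbehind pattern can be run against the headers used in the other answers. This is only a sketch; the names headers and results are illustrative.

```python
import re

# Each alternative is a fixed-width lookbehind (4 and 3 characters),
# which the re module accepts.
c = re.compile(r'(?:(?<=^\d\.\d\d)|(?<=^\d\.\d))\s+')

headers = ['1.1 Introduction', '1.42 Appendix']
results = [c.split(header) for header in headers]

for result in results:
    print(result)
```

Because the lookbehinds match without consuming anything, there is no empty leading element to strip, unlike the capture-group approach.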