Python Regex Extract Width X Depth X Height
I am trying to extract the physical dimensions of items from a column 'Description' in a df to create a new column with it. Dimensions usually appear in this format (120x80x100) in
Solution 1:
You can use the regex, \d+\s*x\s*\d+(?:\s*x\s*\d+)?
Explanation:
\d+: One or more digits
\s*: Zero or more whitespace characters
x: Literal x
(?:\s*x\s*\d+)?: Optional non-capturing group
If you want the numbers to be of one to three digits, replace \d+ with \d{1,3} as shown in the regex, \d{1,3}\s*x\s*\d{1,3}(?:\s*x\s*\d{1,3})?.
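One caveat with the \d{1,3} variant: the quantifier only limits how many digits the pattern consumes, not what surrounds them, so a four-digit number can still match starting from its second digit. Below is a minimal sketch of that behaviour, with digit lookarounds added as one possible guard. The lookarounds and the sample strings are assumptions for illustration, not part of the original answer.

```python
import re

plain = r"\d{1,3}\s*x\s*\d{1,3}(?:\s*x\s*\d{1,3})?"
# Assumed guard: refuse to start or end a match in the middle of a longer number.
guarded = r"(?<!\d)" + plain + r"(?!\d)"

sample = "pallet 1200x80"  # hypothetical description text
plain_hits = re.findall(plain, sample)
guarded_hits = re.findall(guarded, sample)
print(plain_hits)    # ['200x80']
print(guarded_hits)  # []

# A well-formed dimension still matches with the guard in place.
ok_hits = re.findall(guarded, "crate 120x80x100")
print(ok_hits)       # ['120x80x100']
```

Whether the guard is appropriate depends on the data: if values above 999 are legitimate, the unrestricted \d+ form is probably the better choice.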
If your code requires you to use a group, do it as follows:
(\d{1,3}\s*x\s*\d{1,3}(?:\s*x\s*\d{1,3})?)
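The grouped form matters for the DataFrame case in the question, because pandas' Series.str.extract requires at least one capture group. Here is a rough sketch of creating the new column. The sample rows and the column name 'Dimensions' are made up for illustration; only 'Description' comes from the question.

```python
import pandas as pd

# Hypothetical sample data standing in for the real DataFrame.
df = pd.DataFrame({
    "Description": [
        "Oak table (120x80x100) with drawer",
        "Shelf 60 x 30, white",
        "Gift card, no dimensions",
    ]
})

pattern = r"(\d{1,3}\s*x\s*\d{1,3}(?:\s*x\s*\d{1,3})?)"

# With a single capture group, expand=False returns a Series.
# Rows without a match become NaN; only the first match per row is kept.
df["Dimensions"] = df["Description"].str.extract(pattern, expand=False)
print(df[["Description", "Dimensions"]])
```

If a description can contain several dimension strings, Series.str.findall with the ungrouped pattern returns all of them as a list per row instead.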
Solution 2:
We can try a re.findall approach with a regex pattern that covers two- and three-part dimensions, with or without whitespace around the x:
import re

inp = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit 120x80x100 sed do 120 x 80 x 100 eiusmod 120x80 tempor...'
dims = re.findall(r'\d+(?:\s*x\s*\d+){1,2}', inp)
print(dims)  # ['120x80x100', '120 x 80 x 100', '120x80']
Solution 3:
Something like this should work:
\d+(\s?x\s?\d+){1,2}
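One thing to watch with this pattern: it uses a capturing group, and re.findall returns the group's text rather than the whole match whenever the pattern contains a group. A small sketch of the difference, using a made-up sample string, with two ways around it (a non-capturing group, or re.finditer with group(0)):

```python
import re

text = "box 120x80x100 and lid 120 x 80"  # hypothetical sample

capturing = r"\d+(\s?x\s?\d+){1,2}"
non_capturing = r"\d+(?:\s?x\s?\d+){1,2}"

# findall returns only the last repetition of the capturing group.
group_only = re.findall(capturing, text)
print(group_only)  # ['x100', ' x 80']

# Option 1: make the group non-capturing.
full_a = re.findall(non_capturing, text)
print(full_a)      # ['120x80x100', '120 x 80']

# Option 2: keep the pattern and read the whole match from each match object.
full_b = [m.group(0) for m in re.finditer(capturing, text)]
print(full_b)      # ['120x80x100', '120 x 80']
```

The pattern as written is fine with re.search or re.finditer, where the full match is available through group(0).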