Replace String Between Tags If String Begins With "1"
Solution 1:
You can use the re module:
>>> import re
>>> text = 'textextextextext<tag>10005991</tag>textextextextext'
>>> re.sub(r'<tag>1(\d+)</tag>', '<tag>YYYYY</tag>', text)
'textextextextext<tag>YYYYY</tag>textextextextext'
re.sub replaces the text matched by the pattern with the second argument.
Quoting the documentation:
Return the string obtained by replacing the leftmost non-overlapping occurrences of pattern in string by the replacement repl. If the pattern isn’t found, string is returned unchanged.
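To illustrate the quoted behaviour, here is a minimal sketch. The sample strings are made up for the example and are not from the question. It shows that every non-overlapping match is replaced, and that a string without a match comes back unchanged.

```python
import re

PATTERN = r'<tag>1\d+</tag>'
REPLACEMENT = '<tag>YYYYY</tag>'

# Two tags whose content starts with 1 and one that starts with 2
# (illustrative sample data, an assumption).
sample = '<tag>10005991</tag> <tag>20005992</tag> <tag>10005993</tag>'
replaced = re.sub(PATTERN, REPLACEMENT, sample)
print(replaced)

# No tag content starts with 1 here, so the input is returned unchanged.
no_match = '<tag>20005994</tag>'
unchanged = re.sub(PATTERN, REPLACEMENT, no_match)
print(unchanged == no_match)
```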
Usage might look like this:
import re

# Open both files once; the output file must be opened for writing.
with open("file") as f, open("output", "w") as f2:
    for line in f:
        f2.write(re.sub(r'<tag>1(\d+)</tag>', '<tag>YYYYY</tag>', line))
Solution 2:
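You can use a regex here. Because the pattern contains no '.', the re.DOTALL flag is not needed, even for a multi-line string. In the pattern you can use lookarounds (a positive lookbehind and a positive lookahead) to match only the string between the tags. If you do need flags, pass them by keyword (flags=...), because the fourth positional argument of re.sub is count:
>>> print(re.sub(r'(?<=<tag>)1\d+(?=</?tag>)', r'YYYYYY', s))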
textextextextext<tag>YYYYYY<tag>textextextextext
textextextextext<tag>20005992</tag>textextextextext
textextextextext<tag>YYYYYY</tag>textextextextext
textextextextext<tag>20005994</tag>textextextextext
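For reference, the documentation describes re.DOTALL as follows:
Make the '.' special character match any character at all, including a newline; without this flag, '.' will match anything except a newline.
The flag only affects patterns that contain '.'.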
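As a small sketch of when the flag actually matters, the example below uses a pattern that does contain '.', together with a made-up string whose tag content spans two lines. Both the pattern and the sample string are assumptions chosen for illustration.

```python
import re

# Sample data (an assumption): the tag content is split across two lines.
s = '<tag>1000\n5991</tag>'

# Without re.DOTALL, '.' stops at the newline, so nothing matches.
without_flag = re.sub(r'<tag>1.*?</tag>', '<tag>YYYYYY</tag>', s)

# With re.DOTALL, '.' also matches the newline and the tag is replaced.
# The flag is passed by keyword because the fourth positional
# argument of re.sub is count, not flags.
with_flag = re.sub(r'<tag>1.*?</tag>', '<tag>YYYYYY</tag>', s, flags=re.DOTALL)

print(repr(without_flag))
print(repr(with_flag))
```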
Also, as @Bhargav Rao did in his answer, you can use grouping instead of lookarounds. Capture the closing tag and reuse it as \1 in the replacement, because a literal </?tag> in the replacement string would be inserted as-is:
>>> print(re.sub(r'<tag>1\d+(</?tag>)', r'<tag>YYYYYY\1', s))
textextextextext<tag>YYYYYY<tag>textextextextext
textextextextext<tag>20005992</tag>textextextextext
textextextextext<tag>YYYYYY</tag>textextextextext
textextextextext<tag>20005994</tag>textextextextext
Solution 3:
I think your best bet is to use ElementTree.
The main idea: 1) parse the file, 2) find the element values, 3) test your condition, 4) replace the value if the condition is met.
A good place to start with parsing is "How do I parse XML in Python?"
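The four steps could be sketched roughly as follows with the standard-library xml.etree.ElementTree module. The <root> wrapper and the sample values are assumptions, since the original file is not shown, and the input must be well-formed XML for the parser to accept it.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the <root> wrapper and values are assumptions.
xml_text = (
    '<root>'
    '<tag>10005991</tag>'
    '<tag>20005992</tag>'
    '<tag>10005993</tag>'
    '</root>'
)

root = ET.fromstring(xml_text)      # 1) parse the document
for elem in root.iter('tag'):       # 2) find the <tag> elements
    value = elem.text or ''         # elem.text is None for empty elements
    if value.startswith('1'):       # 3) test the condition
        elem.text = 'YYYYY'         # 4) replace the value

result = ET.tostring(root, encoding='unicode')
print(result)
```

Note that startswith('1') checks only the first character, whereas the regex versions also require the rest of the value to be digits. To read from and write back to a file, ET.parse and ElementTree.write can be used instead of ET.fromstring and ET.tostring.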