Python Regex To Find "document" Word In A Given String In Forward Direction And Replace By Empty String

If the word 'document' can be produced by removing characters from a given string, the letters spelling 'document' are to be removed from the string. If letters from the resulting

Solution 1:

I have a solution with regular expressions and recursion:

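from re import compile

candidates = ["doconeument", "documdocumentent",  "documentone",
              "pydocdbument", "documentdocument", "hansi"]
word = "document"

def strip_word(word, candidate):
    # Build ^(.*)d(.*)o(.*)c ... t(.*)$ : the letters of `word` in order,
    # with a capture group before, between, and after them.
    regex = compile("^(.*)" + "(.*)".join(word) + "(.*)$")
    match = regex.match(candidate)
    if not match:
        return candidate
    # Keep only the captured leftovers and retry on the shorter string.
    return strip_word(word, "".join(match.groups()))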

for cand in candidates:
    print(f"'{cand}' -> '{strip_word(word, cand)}'")

Edit: Corrected the code (the first two lines of the function had been left outside the code block).
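
To make the recursion easier to follow, here is a small sketch that prints the pattern built by the join and records every intermediate string. The names build_pattern and trace_strip are hypothetical helpers, not part of the answer, and re.escape is an added precaution for words containing regex metacharacters (it leaves the letters of "document" unchanged).

```python
import re


def build_pattern(word):
    # Same construction as in the answer, with each letter escaped
    # (an assumption added here; plain letters are left untouched).
    return re.compile("^(.*)" + "(.*)".join(map(re.escape, word)) + "(.*)$")


def trace_strip(word, candidate):
    # Iterative version of the recursion that keeps every intermediate string.
    regex = build_pattern(word)
    steps = [candidate]
    while True:
        match = regex.match(steps[-1])
        if match is None:
            return steps
        steps.append("".join(match.groups()))


print(build_pattern("document").pattern)
# ^(.*)d(.*)o(.*)c(.*)u(.*)m(.*)e(.*)n(.*)t(.*)$
print(trace_strip("document", "documdocumentent"))
# ['documdocumentent', 'document', '']
```

The last element of the returned list is the value strip_word would return for the same input.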

Solution 2:

If the given string fails to match the regular expression:

r'^([a-z]*)d([a-z]*)o([a-z]*)c([a-z]*)u([a-z]*)m([a-z]*)e([a-z]*)n([a-z]*)t([a-z]*)$'

the string is returned. If the regex matches the string, the string:

"\1\2\3\4\5\6\7\8\9"

is formed and an attempt is made to match that string with the regex. This process is repeated until there is no match, at which time the last string tested is returned. Note that each string thus produced contains 8 characters fewer than the preceding string.

If the regex matches the string, capture group 1 will contain the substring that precedes "d" in "document", capture group 2 will contain the substring that is between "d" and "o", and so on, with capture group 9 containing the substring that follows "t". Some or all of these substrings may be empty.
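
As an illustration of the capture groups, the sketch below applies the regex from this answer to one sample string, "pydocdbument" (borrowed from Solution 1's candidate list). The name DOCUMENT_RE is only a label chosen for this example.

```python
import re

# The regex from the answer, compiled once.
DOCUMENT_RE = re.compile(
    r'^([a-z]*)d([a-z]*)o([a-z]*)c([a-z]*)u([a-z]*)m([a-z]*)e([a-z]*)n([a-z]*)t([a-z]*)$'
)

match = DOCUMENT_RE.match("pydocdbument")

# Group 1 is the text before "d", group 4 the text between "c" and "u";
# every other group is empty for this input.
print(match.groups())
# ('py', '', '', 'db', '', '', '', '', '')

# Concatenating the nine groups gives the string with "document" removed.
print("".join(match.groups()))
# pydb
```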

I will leave it to the OP to produce the Python code needed to implement this algorithm.
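
For readers who want a starting point, one possible way to write the algorithm described in this answer is sketched below. It is not the answerer's code: the names DOCUMENT_RE, TEMPLATE and remove_document are made up for this example, and the sample inputs are the candidates from Solution 1. Because the regex only allows a-z, a string containing any other character is returned unchanged.

```python
import re

# The regex and the replacement string quoted in the answer.
DOCUMENT_RE = re.compile(
    r'^([a-z]*)d([a-z]*)o([a-z]*)c([a-z]*)u([a-z]*)m([a-z]*)e([a-z]*)n([a-z]*)t([a-z]*)$'
)
TEMPLATE = r"\1\2\3\4\5\6\7\8\9"


def remove_document(s):
    # Repeat until the regex no longer matches; each pass drops the 8 letters
    # of "document", so the loop always terminates.
    while True:
        match = DOCUMENT_RE.match(s)
        if match is None:
            return s
        # Match.expand fills the template with the nine captured substrings.
        s = match.expand(TEMPLATE)


for cand in ["doconeument", "documdocumentent", "documentone",
             "pydocdbument", "documentdocument", "hansi"]:
    print(f"'{cand}' -> '{remove_document(cand)}'")
# 'doconeument' -> 'one'
# 'documdocumentent' -> ''
# 'documentone' -> 'one'
# 'pydocdbument' -> 'pydb'
# 'documentdocument' -> ''
# 'hansi' -> 'hansi'
```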
