Including Word Boundary In String Modification To Be More Specific
Background
The following is a minor change from modification of skipping empty list and continuing with function
import pandas as pd
Names = [list(['ann']), list(
Solution 1:
Add a word boundary (\b) to both ends of each string in the lists of df.loc[m].P_Name, so that each name matches only as a whole word and not inside a longer word:
s = df.loc[m].P_Name.map(lambda x: [r'\b' + item + r'\b' for item in x])
s
Out[71]:
0                   [\bann\b]
2    [\belisabeth\b, \blis\b]
3           [\bhis\b, \bhe\b]
Name: P_Name, dtype: object
df.loc[m, 'Text'].replace(s, '**BLOCK**', regex=True)
Out[72]:
0         **BLOCK** had an anniversery today
2     I like **BLOCK** and **BLOCK** 5 lists
3    one day **BLOCK** and **BLOCK** cheated
Name: Text, dtype: object
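The same whole-word idea can also be sketched with Python's re module applied row by row. This is a minimal sketch, not the answer's own method: the sample frame is reconstructed from the outputs shown above (the original setup code is cut off), and block_names and the Blocked column are hypothetical names. re.escape is an addition so that names containing regex metacharacters are matched literally.

```python
import re

import pandas as pd

# Hypothetical sample data, reconstructed from the outputs shown above.
df = pd.DataFrame({
    "Text": [
        "ann had an anniversery today",
        "nothing here",
        "I like elisabeth and lis 5 lists",
        "one day he and his cheated",
        "same here",
    ],
    "P_Name": [["ann"], [], ["elisabeth", "lis"], ["his", "he"], []],
})


def block_names(text, names):
    """Replace whole-word occurrences of each name with **BLOCK**."""
    if not names:
        # Nothing to block for this row: return the text unchanged.
        return text
    # \b on both sides keeps 'ann' from matching inside 'anniversery'.
    pattern = r"\b(?:" + "|".join(re.escape(n) for n in names) + r")\b"
    return re.sub(pattern, "**BLOCK**", text)


# Hypothetical output column; rows with an empty name list keep their text.
df["Blocked"] = [block_names(t, n) for t, n in zip(df["Text"], df["P_Name"])]
print(df["Blocked"])
```

Without the \b anchors, 'lis' would also match inside 'lists' and 'he' inside 'cheated', which is the over-matching the word boundary is meant to prevent.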
Solution 2:
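Sometimes a plain for loop is the better tool. Splitting each text into words and replacing only exact matches gives whole-word behaviour without any regular expression: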
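# Split each text into words, replace exact matches, and join the words back together.
df['New'] = [pd.Series(x).replace(dict.fromkeys(y, '**BLOCK**')).str.cat(sep=' ') for x, y in zip(df.Text.str.split(), df.P_Name)]
# Keep the result only where P_Name is non-empty; assign back instead of relying on inplace=True.
df['New'] = df['New'].where(df.P_Name.astype(bool))
df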
                               Text  ...                                      New
0      ann had an anniversery today  ...       **BLOCK** had an anniversery today
1                      nothing here  ...                                      NaN
2  I like elisabeth and lis 5 lists  ...   I like **BLOCK** and **BLOCK** 5 lists
3        one day he and his cheated  ...  one day **BLOCK** and **BLOCK** cheated
4                         same here  ...                                      NaN

[5 rows x 4 columns]
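The token-based approach can also be written without building a Series per row. The following is a sketch under assumptions: the sample frame is reconstructed from the output above, and block_tokens is a hypothetical helper name, not part of the original answer.

```python
import pandas as pd

# Hypothetical sample data, reconstructed from the output shown above.
df = pd.DataFrame({
    "Text": [
        "ann had an anniversery today",
        "nothing here",
        "I like elisabeth and lis 5 lists",
        "one day he and his cheated",
        "same here",
    ],
    "P_Name": [["ann"], [], ["elisabeth", "lis"], ["his", "he"], []],
})


def block_tokens(text, names):
    """Replace whitespace-separated tokens that exactly equal a name."""
    blocked = set(names)
    return " ".join("**BLOCK**" if tok in blocked else tok for tok in text.split())


# Rows with an empty name list get a missing value, as in the output above.
df["New"] = [
    block_tokens(t, n) if n else None
    for t, n in zip(df["Text"], df["P_Name"])
]
print(df[["Text", "New"]])
```

Because tokens are compared for exact equality, a name followed by punctuation (for example 'ann,') is left untouched; the regex version with \b is the safer choice when the text contains punctuation.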