Text Mining With Python And Pandas

This may be a duplicate, but I had no luck finding it... I am working on some text mining in Python with pandas. I have words in a DataFrame and the Porter stemming next to it wi

Solution 1:

You can aggregate each group with set instead of list, which removes all the duplicates automatically:

import pandas as pd

# Sample data: each word with its Porter stem next to it
pda = pd.DataFrame.from_dict({'Word': ['bank', 'hold', 'banking', 'holding', 'bank'], 
                              'Porter': ['bank', 'hold', 'bank', 'hold', 'bank'], 
                              'SomeData': ['12', '13', '12', '13', '12']})

# Group by stem and collect the distinct words of each group into a set
pdm = pd.DataFrame(pda.groupby(['Porter'])['Word'].apply(set))
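Since a set has no defined order, the printed result can list the words of a group in any order. If a stable order matters, one possible variation (a sketch using the same sample data, not part of the original answer; the name words_per_stem is made up for illustration) is to turn each set into a sorted list:

```python
import pandas as pd

# Same sample data as in the answer above
pda = pd.DataFrame({
    'Word': ['bank', 'hold', 'banking', 'holding', 'bank'],
    'Porter': ['bank', 'hold', 'bank', 'hold', 'bank'],
    'SomeData': ['12', '13', '12', '13', '12'],
})

# Deduplicate the words of each stem with set(), then sort them so the
# order is deterministic; reset_index() turns 'Porter' back into a column.
words_per_stem = (
    pda.groupby('Porter')['Word']
       .apply(lambda words: sorted(set(words)))
       .reset_index()
)

print(words_per_stem)
```

This should give one row per stem, with 'bank' mapped to ['bank', 'banking'] and 'hold' mapped to ['hold', 'holding'].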
