
Distinct Combinations of Values in Pandas DataFrames

Is there an easy way to pull out the distinct combinations of values in a DataFrame? I've used pd.Series.unique() for single columns, but what about multiple columns? Example data: a DataFrame df with the columns number and letter.

Solution 1:

You can zip the columns and create a set:

>>> set(zip(df.number, df.letter))
{(1, 'a'), (1, 'b'), (2, 'a'), (3, 'b')}
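For reference, here is a self-contained sketch of the same idea. The original example frame is not shown, so the rows below are an assumption, chosen only so that the distinct pairs match the output above. Because a set has no defined order, the sketch also sorts the pairs for cases where a stable order matters.

```python
import pandas as pd

# Hypothetical sample data: the question's frame is not reproduced,
# so these rows are an assumption picked to match the output above.
df = pd.DataFrame({
    "number": [1, 1, 2, 3, 1, 3],
    "letter": ["a", "a", "a", "b", "b", "b"],
})

# Pair the two columns row by row; the set discards repeated pairs.
combos = set(zip(df["number"], df["letter"]))

# Sets are unordered, so sort the pairs when a stable order is needed.
ordered = sorted(combos)
print(ordered)
```

This approach leaves pandas and returns plain Python tuples, which is convenient for membership tests such as `(1, 'a') in combos`.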

Solution 2:

You can set the index to those columns and then call unique on the index:

In [165]:
idx = df.set_index(['number','letter']).index
idx.unique()

Out[165]:
array([(1, 'a'), (2, 'a'), (3, 'b'), (1, 'b')], dtype=object)
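A runnable sketch of this approach follows, again on hypothetical rows that are assumed only to reproduce the pairs shown above. The output above comes from an older pandas release; in recent versions `idx.unique()` should return a MultiIndex rather than an object array, so the sketch converts the result with `tolist()`, which yields tuples in both cases. As a plainly different technique, it also shows `drop_duplicates()`, which keeps the result as a DataFrame.

```python
import pandas as pd

# Hypothetical sample data (an assumption; the original frame is not shown).
df = pd.DataFrame({
    "number": [1, 1, 2, 3, 1, 3],
    "letter": ["a", "a", "a", "b", "b", "b"],
})

# Build an index from the two columns and keep only its unique entries.
idx = df.set_index(["number", "letter"]).index
unique_pairs = idx.unique()

# Convert to a list of tuples, in order of first appearance.
pairs = unique_pairs.tolist()
print(pairs)

# Alternative technique: drop duplicate rows and stay in DataFrame form.
distinct_rows = (
    df[["number", "letter"]]
    .drop_duplicates()
    .reset_index(drop=True)
)
print(distinct_rows)
```

Unlike the set-based version, both variants here preserve the order in which each combination first appears.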
