Sort Within Group Without Changing Group Order?
Solution 1:
You could create a temporary column that maps B, A and C to 1, 2 and 3, so that the groups keep their original, unsorted order. Sort by that column first, then drop it. Answer #1 is more dynamic and works even if the values in the group column are not grouped together consecutively. For Answer #2 they must be consecutive (the inputs for Answer #1 and Answer #2 are ordered differently).
Updated Answer #1 (per comment: the groups are not consecutive in rows, but we still want to order them by the order in which each group first appears.)
The code [l for l in enumerate((df['group'].unique()))] assigns a number to each group according to where that group's first value appears in the group column of the dataframe.
In[1]:
name group revenue
0 Name1 GroupB 1
3 Name4 GroupA 4
4 Name5 GroupA 5
8 Name7 GroupC 9
1 Name2 GroupB 2
2 Name3 GroupB 3
5 Name6 GroupA 6
6 Name7 GroupC 7
7 Name7 GroupC 8
dft = pd.DataFrame([l for l in enumerate((df['group'].unique()))], columns=['group_number','group'])
df = pd.merge(df, dft, how='left', on='group').sort_values(['group_number', 'revenue'], ascending = [True, False])
df
Out[1]:
name group revenue group_number
5 Name3 GroupB 3 0
4 Name2 GroupB 2 0
0 Name1 GroupB 1 0
6 Name6 GroupA 6 1
2 Name5 GroupA 5 1
1 Name4 GroupA 4 1
3 Name7 GroupC 9 2
8 Name7 GroupC 8 2
7 Name7 GroupC 7 2
To highlight what the enumerate line produces, here is the output of dft before the merge and sort.
dft = pd.DataFrame([l for l in enumerate((df['group'].unique()))], columns=['group_number','group'])
dft
Out[2]:
group_number group
0 0 GroupB
1 1 GroupA
2 2 GroupC
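As a self-contained sketch of the same idea, the numbering step can also be done with pd.factorize, which numbers values in order of first appearance. This is a swapped-in alternative to the enumerate/merge code above, not the original answer's method. The DataFrame reproduces the rows of the In[1] table but uses a default 0..8 index, which is an assumption made for illustration.

```python
import pandas as pd

# Rows taken from the In[1] table; groups are interleaved, not consecutive.
df = pd.DataFrame({
    'name': ['Name1', 'Name4', 'Name5', 'Name7', 'Name2', 'Name3', 'Name6', 'Name7', 'Name7'],
    'group': ['GroupB', 'GroupA', 'GroupA', 'GroupC', 'GroupB', 'GroupB', 'GroupA', 'GroupC', 'GroupC'],
    'revenue': [1, 4, 5, 9, 2, 3, 6, 7, 8],
})

# factorize assigns integer codes by order of first appearance:
# GroupB -> 0, GroupA -> 1, GroupC -> 2
codes, uniques = pd.factorize(df['group'])

result = (
    df.assign(group_number=codes)                       # temporary ordering column
      .sort_values(['group_number', 'revenue'], ascending=[True, False])
      .drop(columns='group_number')                     # remove the helper column again
)
print(result)
```

This avoids the merge, at the cost of relying on factorize's appearance-order numbering instead of an explicit lookup table like dft.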
Answer #2
import pandas as pd
df = pd.DataFrame({'name': ['Name1','Name2','Name3','Name4','Name5','Name6', 'Name7', 'Name7', 'Name7'],
'group':['GroupB','GroupB','GroupB','GroupA','GroupA','GroupA','GroupC','GroupC','GroupC'],'revenue':[1,2,3,4,5,6,7,8,9]})
df['cs'] = (df['group'] != df['group'].shift(1)).cumsum()
df = df.sort_values(['cs', 'revenue'], ascending = [True, False])
df
Out[11]:
name group revenue cs
2 Name3 GroupB 3 1
1 Name2 GroupB 2 1
0 Name1 GroupB 1 1
5 Name6 GroupA 6 2
4 Name5 GroupA 5 2
3 Name4 GroupA 4 2
8 Name7 GroupC 9 3
7 Name7 GroupC 8 3
6 Name7 GroupC 7 3
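The cs column is a run counter: it goes up by one each time the group value differs from the previous row. The small sketch below, on made-up Series (the names groups and interleaved are illustrative only), shows the intermediate values and why Answer #2 needs each group's rows to be consecutive.

```python
import pandas as pd

# Consecutive groups: each group forms exactly one run.
groups = pd.Series(['GroupB', 'GroupB', 'GroupA', 'GroupA', 'GroupC'])
# True where the value differs from the previous row
# (the first row is compared against NaN, so it is True).
is_new_run = groups != groups.shift(1)
cs = is_new_run.cumsum()
print(cs.tolist())  # [1, 1, 2, 2, 3]

# Interleaved groups: GroupB appears in two separate runs and gets two
# different numbers, so sorting by cs would not bring its rows together.
interleaved = pd.Series(['GroupB', 'GroupA', 'GroupB'])
cs_interleaved = (interleaved != interleaved.shift(1)).cumsum()
print(cs_interleaved.tolist())  # [1, 2, 3]
```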
For both answers, then just drop the temporary column (cs in Answer #2, group_number in Answer #1):
df = df.drop('cs', axis=1)
Out[12]:
name group revenue
2 Name3 GroupB 3
1 Name2 GroupB 2
0 Name1 GroupB 1
5 Name6 GroupA 6
4 Name5 GroupA 5
3 Name4 GroupA 4
8 Name7 GroupC 9
7 Name7 GroupC 8
6 Name7 GroupC 7
Solution 2:
Why use groupby at all? You could just chain multiple sort_values calls to get the correct sort order. For example, using data similar to the linked question, if you want revenue sorted descending within each group while the groups themselves are sorted ascending, you could do:
import pandas as pd
df = pd.DataFrame({'name': ['Name1','Name2','Name3','Name4','Name5','Name6', 'Name7', 'Name7', 'Name7'],
'group':['GroupB','GroupB','GroupB','GroupA','GroupA','GroupA','GroupC','GroupC','GroupC'],'revenue':[1,2,3,4,5,6,7,8,9]})
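# The second sort must be stable to keep the revenue order within each group;
# the default quicksort does not guarantee that, so mergesort is requested explicitly.
df.sort_values(by='revenue', ascending=False).sort_values(by='group', kind='mergesort')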
Which would return:
name group revenue
5 Name6 GroupA 6
4 Name5 GroupA 5
3 Name4 GroupA 4
2 Name3 GroupB 3
1 Name2 GroupB 2
0 Name1 GroupB 1
8 Name7 GroupC 9
7 Name7 GroupC 8
6 Name7 GroupC 7
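A possible variation, sketched here under the assumption that alphabetical group order is acceptable, is to pass both keys to a single sort_values call. That does not depend on the stability of a second sort. The names single_call and chained are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Name1', 'Name2', 'Name3', 'Name4', 'Name5', 'Name6', 'Name7', 'Name7', 'Name7'],
    'group': ['GroupB', 'GroupB', 'GroupB', 'GroupA', 'GroupA', 'GroupA', 'GroupC', 'GroupC', 'GroupC'],
    'revenue': [1, 2, 3, 4, 5, 6, 7, 8, 9],
})

# One call, two keys: group ascending, then revenue descending within each group.
single_call = df.sort_values(['group', 'revenue'], ascending=[True, False])

# Chained version for comparison; the second sort is made stable explicitly.
chained = (
    df.sort_values(by='revenue', ascending=False)
      .sort_values(by='group', kind='mergesort')
)

print(single_call)
print(single_call.equals(chained))
```

Both versions put GroupA before GroupB, so neither keeps the groups in the order they first appear in the data. If that order matters, the temporary-column approach from Solution 1 is the one to use.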