Skip to content Skip to sidebar Skip to footer

How Can I Merge A Pandas Dataframes Based On A Substring From One Of The Columns?

I have 2 dataframes: df1 and df2 df1 School Conference 0 Air Force Mt. West 1 Akron MAC 2 Alabama at Birmingham C-USA

Solution 1:

You can use list comprehension to check if the columns from each dataframe are in each other (you also compare case-insensitively) and then merge:

df1['SCHOOL_NAME'] = df1['School'].apply(lambda x: [y for y in df2['SCHOOL_NAME']
                                                    if x in y or y in x]).str[0]
df1 = df1.merge(df2, how='left').drop('SCHOOL_NAME', axis=1) #can pass on='SCHOOL_NAME' to merge.
df1
Out[1]: 
                  School Conference  RATE
0              Air Force   Mt. West  53.0
1                  Akron        MAC  77.0
2  Alabama at Birmingham      C-USA  75.0
3                 Auburn   Sun Belt  93.0

You could also search case-insensitively by adding .lower() to x and y:

df1['SCHOOL_NAME'] = df1['School'].apply(lambda x: [y for y in df2['SCHOOL_NAME']
                                                    if x.lower() in y.lower()
                                                    or y.lower() in x.lower()]).str[0]
df1 = df1.merge(df2, how='left').drop('SCHOOL_NAME', axis=1) #can pass on='SCHOOL_NAME' to merge.
df1
Out[2]:
                  School Conference  RATE
0              Air Force   Mt. West  53.0
1                  Akron        MAC  77.0
2  Alabama at Birmingham      C-USA  75.0
3                 Auburn   Sun Belt  93.0

Single line of code per comment:

df1 = (df1.assign(SCHOOL_NAME = df1['School'].apply(lambda x: [y for y in df2['SCHOOL_NAME']
                                                    if x.lower() in y.lower()
                                                    or y.lower() in x.lower()]).str[0])
          .merge(df2, how='left').drop('SCHOOL_NAME', axis=1))
df1
Out[3]: 
                  School Conference  RATE
0              Air Force   Mt. West  53.0
1                  Akron        MAC  77.0
2  Alabama at Birmingham      C-USA  75.0
3                 Auburn   Sun Belt  93.0

Post a Comment for "How Can I Merge A Pandas Dataframes Based On A Substring From One Of The Columns?"