
How To Compare Two Csv Files And Get The Difference?

I have two CSV files, a1.csv:

city,state,link
Aguila,Arizona,https://www.glendaleaz.com/planning/documents/AppendixAZONING.pdf
AkChin,Arizona,http://www.maricopa-az.gov/zoningcode/w

Solution 1:

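You can use pandas to read in the two files, concatenate them, and drop every row that appears more than once. With keep=False no copy of a duplicated row is kept, so what remains is the rows found in only one of the files: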

import pandas as pd
a = pd.read_csv('a1.csv')
b = pd.read_csv('a2.csv')
ab = pd.concat([a,b], axis=0)
# drop_duplicates returns a new DataFrame, so assign the result
diff = ab.drop_duplicates(keep=False)
print(diff)

Reference: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.drop_duplicates.html
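
If you also need to know which file each differing row came from, one possible alternative is an outer merge with indicator=True. This is only a sketch and not part of the answer above: the sample rows and example.com links are made up to stand in for a1.csv and a2.csv, and it assumes both files have the same columns.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for a1.csv and a2.csv
a1_text = """city,state,link
Aguila,Arizona,https://example.com/aguila
AkChin,Arizona,https://example.com/akchin
"""
a2_text = """city,state,link
Aguila,Arizona,https://example.com/aguila
Ajo,Arizona,https://example.com/ajo
"""

a = pd.read_csv(io.StringIO(a1_text))
b = pd.read_csv(io.StringIO(a2_text))

# Outer merge on all shared columns; indicator=True adds a "_merge"
# column with the values "left_only", "right_only" or "both"
merged = a.merge(b, how="outer", indicator=True)

only_in_a = merged[merged["_merge"] == "left_only"].drop(columns="_merge")
only_in_b = merged[merged["_merge"] == "right_only"].drop(columns="_merge")

print(only_in_a)
print(only_in_b)
```

Unlike drop_duplicates(keep=False), this does not discard a row just because it is repeated inside a single file.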

Solution 2:

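First, concatenate the DataFrames, then drop the duplicates, keeping the first occurrence of each row. Then reset the index so it runs consecutively from 0. Note that keep='first' gives you the combined rows of both files with repeats collapsed; if you want only the rows that differ, set keep to False.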

import pandas as pd

a = pd.read_csv('a1.csv')
b = pd.read_csv('a2.csv')
c = pd.concat([a,b], axis=0)

# Set keep to False if you don't want any of the duplicates at all
c.drop_duplicates(keep='first', inplace=True)
c.reset_index(drop=True, inplace=True)
print(c)
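
To see how the keep setting changes the result, here is a small sketch on in-memory data. The two DataFrames are made-up stand-ins for the CSV files, not the real data from the question.

```python
import pandas as pd

# Hypothetical rows standing in for a1.csv and a2.csv
a = pd.DataFrame({"city": ["Aguila", "AkChin"], "state": ["Arizona", "Arizona"]})
b = pd.DataFrame({"city": ["Aguila", "Ajo"], "state": ["Arizona", "Arizona"]})
c = pd.concat([a, b], axis=0)

# keep='first': one copy of every row (combined rows, repeats collapsed)
union = c.drop_duplicates(keep="first").reset_index(drop=True)

# keep=False: only rows that occur exactly once (the difference)
difference = c.drop_duplicates(keep=False).reset_index(drop=True)

print(union)
print(difference)
```

Here union still contains the Aguila row shared by both inputs, while difference holds only the AkChin and Ajo rows.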
