How to Compare Two CSV Files and Get the Difference?
I have two CSV files, a1.csv:
city,state,link
Aguila,Arizona,https://www.glendaleaz.com/planning/documents/AppendixAZONING.pdf
AkChin,Arizona,http://www.maricopa-az.gov/zoningcode/w
Solution 1:
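You can use pandas to read both files, concatenate them, and drop every row that appears more than once. Because keep=False discards all copies of a duplicated row, what remains is the rows found in only one file (a row repeated within a single file is dropped too):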
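import pandas as pd
a = pd.read_csv('a1.csv')
b = pd.read_csv('a2.csv')
# Stack the rows of both files into one DataFrame
ab = pd.concat([a, b], axis=0)
# drop_duplicates returns a new DataFrame, so keep the result
diff = ab.drop_duplicates(keep=False)
print(diff)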
Reference: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.drop_duplicates.html
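If you also need to know which file each differing row came from, one alternative (a different technique from the concat approach above) is an outer merge with indicator=True. The sketch below is only an illustration: the inline sample data stands in for a1.csv and a2.csv, and the "Mesa" row and the short u1/u2/u3 links are made up for the example.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for a1.csv and a2.csv
a = pd.read_csv(io.StringIO("city,state,link\nAguila,Arizona,u1\nAkChin,Arizona,u2\n"))
b = pd.read_csv(io.StringIO("city,state,link\nAkChin,Arizona,u2\nMesa,Arizona,u3\n"))

# With no "on" argument, merge joins on all columns the two frames share.
# indicator=True adds a "_merge" column: left_only, right_only, or both.
merged = a.merge(b, how="outer", indicator=True)

# Rows that are not present in both files
diff = merged[merged["_merge"] != "both"]

only_in_a = merged.loc[merged["_merge"] == "left_only", "city"].tolist()
only_in_b = merged.loc[merged["_merge"] == "right_only", "city"].tolist()
print(diff)
```

Unlike drop_duplicates(keep=False), this keeps a label for the side each row belongs to. Be aware that rows repeated within one file can be multiplied by the join.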
Solution 2:
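First, concatenate the DataFrames, then drop the duplicates while keeping the first occurrence of each, and reset the index so it stays consistent. Note that keep='first' yields every distinct row from both files; to get only the rows that differ, pass keep=False instead.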
import pandas as pd
a = pd.read_csv('a1.csv')
b = pd.read_csv('a2.csv')
c = pd.concat([a,b], axis=0)
# keep='first' keeps one copy of each duplicated row; use keep=False to drop every copy
c.drop_duplicates(keep='first', inplace=True)
c.reset_index(drop=True, inplace=True)
print(c)
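To see how the keep argument changes the result, here is a minimal sketch using small in-memory DataFrames instead of the CSV files. The "Mesa" row is made-up sample data, and the link column is omitted for brevity.

```python
import pandas as pd

# Hypothetical stand-ins for the contents of a1.csv and a2.csv
a = pd.DataFrame({"city": ["Aguila", "AkChin"], "state": ["Arizona", "Arizona"]})
b = pd.DataFrame({"city": ["AkChin", "Mesa"], "state": ["Arizona", "Arizona"]})
c = pd.concat([a, b], axis=0)

# keep="first": one copy of every distinct row (all rows from both files)
union = c.drop_duplicates(keep="first").reset_index(drop=True)

# keep=False: only rows that occur exactly once (the difference)
difference = c.drop_duplicates(keep=False).reset_index(drop=True)

union_cities = union["city"].tolist()
difference_cities = difference["city"].tolist()
print(union_cities)
print(difference_cities)
```

Here the shared AkChin row survives once with keep="first" and disappears entirely with keep=False.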