Concatenating And Saving Multiple Pairs Of CSV Files In Pandas
I am a beginner in Python. I have a hundred pairs of CSV files. The file names look like this: 25_13oct_speed_0.csv, 26_13oct_speed_0.csv, 25_13oct_speed_0.1.csv, 26_13oct_speed_0.1.csv
Solution 1:
The idea is to create a DataFrame from the list of file names and add two new columns by splitting each name on the first _ with Series.str.split:
print (files)
['25_13oct_speed_0.csv', '26_13oct_speed_0.csv',
'25_13oct_speed_0.1.csv', '26_13oct_speed_0.1.csv',
'25_13oct_speed_0.2.csv', '26_13oct_speed_0.2.csv']
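import pandas as pd

df1 = pd.DataFrame({'files': files})
# n=1 splits only on the first underscore: prefix in g, rest of the name in names
df1[['g','names']] = df1['files'].str.split('_', n=1, expand=True)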
print (df1)
                    files   g                names
0    25_13oct_speed_0.csv  25    13oct_speed_0.csv
1    26_13oct_speed_0.csv  26    13oct_speed_0.csv
2  25_13oct_speed_0.1.csv  25  13oct_speed_0.1.csv
3  26_13oct_speed_0.1.csv  26  13oct_speed_0.1.csv
4  25_13oct_speed_0.2.csv  25  13oct_speed_0.2.csv
5  26_13oct_speed_0.2.csv  26  13oct_speed_0.2.csv
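To see why n=1 matters here, it may help to compare it with an unbounded split. The following is a small sketch using one of the file names above; the variable names all_parts and first_only are only illustrative:

```python
import pandas as pd

s = pd.Series(['25_13oct_speed_0.1.csv'])

# Without n, every underscore acts as a separator,
# so the name is broken into four pieces.
all_parts = s.str.split('_', expand=True)

# With n=1, only the first underscore is used:
# the prefix and the remainder of the name.
first_only = s.str.split('_', n=1, expand=True)

print(all_parts.shape)              # (1, 4)
print(first_only.iloc[0].tolist())  # ['25', '13oct_speed_0.1.csv']
```

Because the remainder of the name is kept in one piece, files of the same pair end up with identical values in the names column, which is what the grouping step relies on.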
Then loop over the groups of the names column. Within each group, iterate over the rows with DataFrame.itertuples, read each file into a new DataFrame with read_csv, add a new column filled with the value from g if necessary, and append the result to a list. Finally, concat the list and save it to a new file named after the value in the names column:
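for i, g in df1.groupby('names'):
    out = []
    # read every file of the group and tag its rows with the file prefix
    for n in g.itertuples():
        df = pd.read_csv(n.files).assign(source=n.g)
        out.append(df)
    # stack the files of the group and save them under the shared name
    dfbig = pd.concat(out, ignore_index=True)
    print (dfbig)
    dfbig.to_csv(g['names'].iat[0])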
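The steps above can also be wrapped into one function. This is only a sketch under some assumptions that are not in the original answer: the function name concat_pairs, the src_dir and out_dir parameters, and the use of pathlib to collect the file names are all hypothetical, and the results are written to a separate folder without the index column.

```python
from pathlib import Path

import pandas as pd


def concat_pairs(src_dir, out_dir):
    """Concatenate CSV files that share everything after the first '_'."""
    src_dir, out_dir = Path(src_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Collect file names only (not full paths), e.g. '25_13oct_speed_0.csv'.
    files = sorted(p.name for p in src_dir.glob('*_*.csv'))
    if not files:
        return []

    df1 = pd.DataFrame({'files': files})
    df1[['g', 'names']] = df1['files'].str.split('_', n=1, expand=True)

    written = []
    for name, group in df1.groupby('names'):
        # Read each file of the group and record its prefix in 'source'.
        parts = [
            pd.read_csv(src_dir / row.files).assign(source=row.g)
            for row in group.itertuples()
        ]
        target = out_dir / name
        pd.concat(parts, ignore_index=True).to_csv(target, index=False)
        written.append(target)
    return written
```

A call such as concat_pairs('data', 'merged'), with both folder names made up for illustration, would write one file per pair into merged, for example merged/13oct_speed_0.csv. Writing to a separate folder keeps the combined files apart from the inputs, so running the function a second time does not pick up its own output.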