Python Pandas: Flatten With Arrays In Column
I have a pandas DataFrame with one column containing arrays. I'd like to 'flatten' it by repeating the values of the other columns for each element of the arrays. I succeed to m
Solution 1:
You need numpy.repeat with str.len to create columns x and y. For z, flatten the arrays with itertools.chain.from_iterable:
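import pandas as pd
import numpy as np
from itertools import chain

# toConvert is the DataFrame from the question (columns x, y and array column z)
df = pd.DataFrame({
    # repeat each x and y value once per element of the matching z array
    "x": np.repeat(toConvert.x.values, toConvert.z.str.len()),
    "y": np.repeat(toConvert.y.values, toConvert.z.str.len()),
    # concatenate all z arrays into one flat list
    "z": list(chain.from_iterable(toConvert.z))})
print(df)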
   x   y    z
0  1  10  101
1  1  10  102
2  1  10  103
3  2  20  201
4  2  20  202
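The same idea can be wrapped in a small helper that works for any number of other columns. This is only a sketch: the name flatten_column and the variable to_convert are made up for illustration, and the sample data is copied from the sample run in Solution 2. It assumes every cell in the array column is a list or tuple (no missing values).

```python
from itertools import chain

import numpy as np
import pandas as pd


def flatten_column(frame, column):
    """Return a new frame with one row per element of frame[column].

    Hypothetical helper: assumes every cell in `column` is a list or tuple.
    """
    # number of elements in each array, one entry per original row
    lengths = frame[column].str.len().to_numpy()
    others = [c for c in frame.columns if c != column]
    # repeat every other column once per array element
    data = {c: np.repeat(frame[c].to_numpy(), lengths) for c in others}
    # concatenate the arrays into one flat list
    data[column] = list(chain.from_iterable(frame[column]))
    # keep the original column order
    return pd.DataFrame(data, columns=frame.columns)


# sample data taken from the sample run in Solution 2
to_convert = pd.DataFrame({
    "x": [1, 2],
    "y": [10, 20],
    "z": [(101, 102, 103), (201, 202)],
})

flat = flatten_column(to_convert, "z")
print(flat)
```

The result gets a fresh 0..n-1 index, the same as in the output above.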
Solution 2:
Here's a NumPy based solution -
# list(...) is needed on Python 3, where map returns an iterator
np.column_stack((toConvert[['x','y']].values.\
    repeat(list(map(len,toConvert.z)),axis=0),np.hstack(toConvert.z)))
Sample run -
In [78]: toConvert
Out[78]:
   x   y                z
0  1  10  (101, 102, 103)
1  2  20       (201, 202)
In [79]: np.column_stack((toConvert[['x','y']].values.\
    ...: repeat(list(map(len,toConvert.z)),axis=0),np.hstack(toConvert.z)))
Out[79]:
array([[  1,  10, 101],
       [  1,  10, 102],
       [  1,  10, 103],
       [  2,  20, 201],
       [  2,  20, 202]])
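Note that this returns a plain NumPy array, so the column labels are lost. A minimal sketch of getting back to a DataFrame, assuming Python 3 and the same sample data (the names to_convert, stacked, flat and exploded are made up here). As a cross-check it also uses DataFrame.explode, a different technique from the one in this answer that should give the same rows on pandas 0.25 or later:

```python
import numpy as np
import pandas as pd

# sample data from the run above
to_convert = pd.DataFrame({
    "x": [1, 2],
    "y": [10, 20],
    "z": [(101, 102, 103), (201, 202)],
})

# number of elements in each z array
lengths = to_convert["z"].str.len().to_numpy()

stacked = np.column_stack((
    # repeat each (x, y) row once per element of its z array
    to_convert[["x", "y"]].to_numpy().repeat(lengths, axis=0),
    # join all z arrays into one flat 1-D array
    np.concatenate(to_convert["z"].tolist()),
))

# put the column labels back
flat = pd.DataFrame(stacked, columns=["x", "y", "z"])
print(flat)

# alternative, not part of the answer above: explode the array column directly
exploded = (
    to_convert.explode("z")
    .reset_index(drop=True)      # explode repeats the original index labels
    .astype({"z": "int64"})      # exploded values come back as object dtype
)
print(exploded)
```

Because column_stack builds one array with a single dtype, this route fits best when all columns are numeric. With mixed types, Solution 1 or explode keeps each column's own dtype.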