Creating New Column Using Output Of If Else Statement Causes Error
Solution 1:
I think the best approach is to use loc and isin. You can't compare a scalar with an array like that using if or elif, because the comparison returns one boolean per row and its truth value is ambiguous:
print(df)
year month day
0 2005 3 20
1 2005 4 20
2 2005 5 20
3 2005 6 20
4 2005 7 20
5 2005 8 20
6 2005 9 20
df['test'] = 'C'
df.loc[df['month'].isin([3,4,5]), 'test'] = 'A'
df.loc[df['month'].isin([6,7,8]), 'test'] = 'B'
print(df)
year month day test
0 2005 3 20 A
1 2005 4 20 A
2 2005 5 20 A
3 2005 6 20 B
4 2005 7 20 B
5 2005 8 20 B
6 2005 9 20 C
Or you can fill the test column with C explicitly, by listing the remaining months:
df.loc[df['month'].isin([3,4,5]), 'test'] = 'A'
df.loc[df['month'].isin([6,7,8]), 'test'] = 'B'
df.loc[df['month'].isin([1,2,9,10,11,12]), 'test'] = 'C'
print(df)
year month day test
0 2005 3 20 A
1 2005 4 20 A
2 2005 5 20 A
3 2005 6 20 B
4 2005 7 20 B
5 2005 8 20 B
6 2005 9 20 C
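For anyone who wants to run Solution 1 end to end, here is a minimal sketch. The DataFrame constructor call is my assumption, rebuilt from the printed sample above; the original post does not show how df was created.

```python
import pandas as pd

# Assumed reconstruction of the sample frame printed above.
df = pd.DataFrame({
    "year": [2005] * 7,
    "month": [3, 4, 5, 6, 7, 8, 9],
    "day": [20] * 7,
})

# Start with the default label, then overwrite the rows matched by each mask.
df["test"] = "C"
df.loc[df["month"].isin([3, 4, 5]), "test"] = "A"
df.loc[df["month"].isin([6, 7, 8]), "test"] = "B"

result = df["test"].tolist()
print(df)
```

Each isin call builds a boolean mask with one entry per row, and loc assigns only where the mask is True, so no if statement ever has to evaluate the whole column.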
Solution 2:
Try:
def valuesetter(x):
    if x in [3,4,5]: return "A"
    elif x in [6,7,8]: return "B"
    else: return "C"
df["test"] = list(map(valuesetter,df.month))
Solution 3:
The exception message you're getting is pretty self-explanatory. df['month'] is a Series, and the truth value of a Series is ambiguous because it holds a whole series of truth values rather than a single one. You can do what you're trying to do with pd.Series.map:
def assignmentFunction(value):
    if value in [3, 4, 5]:
        return 'A'
    elif value in [6, 7, 8]:
        return 'B'
    else:
        return 'C'
df['test'] = df['month'].map(assignmentFunction)
Solution 4:
You can use a comprehension to create your test column:
>>> df = pd.DataFrame({'month' : pd.Series(range(1,13))})
>>> df['test'] = ['A' if m in [3,4,5] else
...               'B' if m in [6,7,8] else
...               'C' for m in df['month']]
>>> df
month test
0 1 C
1 2 C
2 3 A
3 4 A
4 5 A
5 6 B
6 7 B
7 8 B
8 9 C
9 10 C
10 11 C
11 12 C
Or you can apply a function, which produces the same result:
>>> def value(month):
...     if month in [3,4,5]:
...         return 'A'
...     if month in [6,7,8]:
...         return 'B'
...     return 'C'
...
>>> df['test'] = df['month'].apply(value)
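As a quick sanity check, the sketch below compares the comprehension, apply, and the map call from Solution 3 on the same twelve-month frame. The variable names via_comprehension, via_apply and via_map are mine, chosen only for illustration.

```python
import pandas as pd


def value(month):
    # Scalar logic: called once per element, so a plain if is fine here.
    if month in [3, 4, 5]:
        return "A"
    if month in [6, 7, 8]:
        return "B"
    return "C"


df = pd.DataFrame({"month": pd.Series(range(1, 13))})

via_comprehension = [
    "A" if m in [3, 4, 5] else "B" if m in [6, 7, 8] else "C"
    for m in df["month"]
]
via_apply = df["month"].apply(value).tolist()
via_map = df["month"].map(value).tolist()

print(via_comprehension == via_apply == via_map)
```

All three work because the if runs on one month at a time, never on the whole column.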
Solution 5:
This answer mainly tries to explain the error that you're seeing. As I'm not a pandas user, I'll let the other answers speak to better ways to write this code...
df.month returns an array (strictly a pandas Series, which behaves like a numpy array here). some_array == 6 will return another array (constructed such that new_array[i] == True iff some_array[i] == 6).
Because of situations like this, in numpy, an array does not have a truth value (unlike normal python sequences). So, to test if an array is truthy, you need to specify what you mean. e.g. to specify that all elements must be truthy, you'd want: (df.month == 6).all()
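To see this for yourself, here is a small sketch that triggers the error and then resolves it with any() and all(). The sample months and the variable names (mask, raised, message) are assumptions for illustration, not taken from the question.

```python
import pandas as pd

# Assumed sample data: the months used elsewhere in this thread.
df = pd.DataFrame({"month": [3, 4, 5, 6, 7, 8, 9]})

# Element-wise comparison: one boolean per row, not a single True/False.
mask = df["month"] == 6

try:
    if mask:  # asks the whole Series for one truth value
        pass
    raised = False
except ValueError as exc:
    raised = True
    message = str(exc)

# Saying what you mean removes the ambiguity.
any_six = bool(mask.any())  # is at least one month equal to 6?
all_six = bool(mask.all())  # is every month equal to 6?

print(raised, any_six, all_six)
```

Note that any() and all() collapse the column to a single answer, so they fix the error but do not give a per-row label; for that you still need one of the approaches in the other answers.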