
Creating New Column Using Output Of If Else Statement Causes Error

I am using the following code:

if(df.month == 3 or df.month == 4 or df.month == 5):
    df.test = 'A'
elif(df.month == 6 or df.month == 7 or df.month == 8):
    df.test = 'B'
else:

Solution 1:

I think the best approach is to use loc and isin. You can't test a whole column with if or elif: comparing a Series with a scalar gives a Series of booleans, so its truth value is ambiguous:

print(df)

   year  month  day
0  2005      3   20
1  2005      4   20
2  2005      5   20
3  2005      6   20
4  2005      7   20
5  2005      8   20
6  2005      9   20

df['test'] = 'C'
df.loc[df['month'].isin([3,4,5]) , 'test'] = 'A'
df.loc[df['month'].isin([6,7,8]) , 'test'] = 'B'
print(df)

   year  month  day test
0  2005      3   20    A
1  2005      4   20    A
2  2005      5   20    A
3  2005      6   20    B
4  2005      7   20    B
5  2005      8   20    B
6  2005      9   20    C

Or you can assign the value C explicitly to the remaining months instead of setting it as a default:

df.loc[df['month'].isin([3,4,5]) , 'test'] = 'A'
df.loc[df['month'].isin([6,7,8]) , 'test'] = 'B'
df.loc[df['month'].isin([1,2,9,10,11,12]) , 'test'] = 'C'
print(df)

   year  month  day test
0  2005      3   20    A
1  2005      4   20    A
2  2005      5   20    A
3  2005      6   20    B
4  2005      7   20    B
5  2005      8   20    B
6  2005      9   20    C
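
For reference, here is a self-contained sketch of the first variant above. The DataFrame construction is an assumption made for illustration, since the answer only shows the printed frame and not how it was built.

```python
import pandas as pd

# Sample data matching the frame printed above (the construction is assumed).
df = pd.DataFrame({
    "year": [2005] * 7,
    "month": [3, 4, 5, 6, 7, 8, 9],
    "day": [20] * 7,
})

# Start with the default label, then overwrite the rows selected by a boolean mask.
df["test"] = "C"
df.loc[df["month"].isin([3, 4, 5]), "test"] = "A"
df.loc[df["month"].isin([6, 7, 8]), "test"] = "B"

print(df)
```

Each isin call returns a boolean Series, and loc uses it to pick the rows whose test value is replaced, so no Python-level if is evaluated on the column itself.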

Solution 2:

Try this:

def valuesetter(x):
    if x in [3,4,5]: return "A"
    elif x in [6,7,8]: return "B"
    else: return "C"

df["test"] = list(map(valuesetter,df.month))

Solution 3:

The exception message you're getting is fairly self-explanatory: df['month'] is a Series, and the truth value of a Series is ambiguous because it holds a whole series of truth values. You can do what you're trying to do with pd.Series.map:

def assignmentFunction(value):
    if value in [3, 4, 5]:
        return 'A'
    elif value in [6, 7, 8]:
        return 'B'
    else:
        return 'C'

df['test'] = df['month'].map(assignmentFunction)
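
To see where the error comes from, the sketch below reproduces it on a small made-up frame and shows the element-wise operator | as a stand-in for or. The sample data and variable names are assumptions for illustration, not part of the original question.

```python
import pandas as pd

df = pd.DataFrame({"month": [3, 6, 9]})  # hypothetical sample data

# Comparing a Series with a scalar yields a boolean Series, not a single bool.
mask = df["month"] == 3

# `if`, `or`, and `and` call bool() on their operands, which a Series refuses.
try:
    bool(mask)
    raised = False
except ValueError:
    raised = True

# Element-wise OR uses `|`; each comparison needs its own parentheses.
spring = (df["month"] == 3) | (df["month"] == 4) | (df["month"] == 5)

print(raised)
print(spring.tolist())
```

The result of the | expression is again a boolean Series, so it can be used as a row mask but still not as the condition of a plain if statement.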

Solution 4:

You can use a comprehension to create your test column:

>>> import pandas as pd
>>> df = pd.DataFrame({'month' : pd.Series(range(1,13))})
>>> df['test'] = ['A' if m in [3,4,5] else
...               'B' if m in [6,7,8] else
...               'C' for m in df['month']]
>>> df
    month test
0       1    C
1       2    C
2       3    A
3       4    A
4       5    A
5       6    B
6       7    B
7       8    B
8       9    C
9      10    C
10     11    C
11     12    C

Or you can apply a function, which produces the same result:

>>> def value(month):
...     if month in [3,4,5]:
...         return 'A'
...     if month in [6,7,8]:
...         return 'B'
...     return 'C'
...
>>> df['test'] = df['month'].apply(value)

Solution 5:

This answer mainly tries to explain the error that you're seeing. As I'm not a pandas user, I'll let the other answers speak to better ways to write this code...


df.month returns an array-like object (a pandas Series). some_array == 6 returns another array of booleans, constructed such that new_array[i] == True iff some_array[i] == 6.

Because of situations like this, a numpy array with more than one element does not have a truth value (unlike normal Python sequences). So, to test whether an array is truthy, you need to specify what you mean. For example, to require that all elements are truthy, you'd want: (df.month == 6).all()
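
A short sketch of that behaviour using a plain NumPy array. The array contents are made up for illustration, and any() is shown next to all() as the other common way to state the intent.

```python
import numpy as np

months = np.array([6, 6, 7])  # hypothetical sample values

# Element-wise comparison produces a boolean array of the same length.
is_june = months == 6

# bool() on a multi-element array raises ValueError, so state the intent instead.
all_june = bool(is_june.all())  # True only if every element equals 6
any_june = bool(is_june.any())  # True if at least one element equals 6

print(is_june.tolist(), all_june, any_june)
```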
