OneHotEncoder on Only a Single Feature Which Is a String
I want ONLY ONE of my features to be converted to separate binary features: df['pattern_id'] Out[202]: 0 3 1 3 ... 7440 2 7441 2 7442 3 Name: patt
Solution 1:
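If you take a look at the documentation for OneHotEncoder, you can see that the categorical_features argument expects '"all" or array of indices or mask', not a string. You can make your code work by changing it to the following lines (note that categorical_features was deprecated in scikit-learn 0.20 and removed in 0.22, so this only runs on older versions):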
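import numpy as np
import pandas as pd
from sklearn.preprocessing import OneHotEncoder
# Create a dataframe of random ints
df = pd.DataFrame(np.random.randint(0, 4, size=(100, 4)),
                  columns=['pattern_id', 'B', 'C', 'D'])
# categorical_features takes column indices, so look up the position of 'pattern_id'
onehotencoder = OneHotEncoder(categorical_features=[df.columns.tolist().index('pattern_id')])
# fit_transform returns a sparse matrix by default, not a DataFrame
df = onehotencoder.fit_transform(df)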
However, df will no longer be a DataFrame after this call, so I would suggest working directly with the numpy arrays.
Solution 2:
You can also do it like this:
import pandas as pd
from sklearn.preprocessing import OneHotEncoder
onehotenc = OneHotEncoder()
X = onehotenc.fit_transform(df.required_column.values.reshape(-1, 1)).toarray()
We need to reshape the column because fit_transform requires a 2-D array. You can then add columns to this numpy array and merge it with your DataFrame.
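The merge step described above could look roughly like the sketch below. The toy DataFrame, the 'pattern_id_' column prefix, and the variable names are assumptions made for illustration, not part of the original answer; the column names are read from the encoder's categories_ attribute.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data; 'pattern_id' stands in for the string column to encode.
df = pd.DataFrame({'pattern_id': ['a', 'b', 'a', 'c'],
                   'B': [1, 2, 3, 4]})

onehotenc = OneHotEncoder()
# fit_transform needs 2-D input, so reshape the single column to (n_rows, 1).
encoded = onehotenc.fit_transform(df['pattern_id'].values.reshape(-1, 1)).toarray()

# Build one column name per category, in the order the encoder learned them.
new_cols = ['pattern_id_' + str(c) for c in onehotenc.categories_[0]]
encoded_df = pd.DataFrame(encoded, columns=new_cols, index=df.index)

# Drop the original column and attach the binary columns next to the rest.
result = pd.concat([df.drop(columns='pattern_id'), encoded_df], axis=1)
print(result)
```

Reusing df.index for the encoded frame keeps the rows aligned when the two pieces are concatenated.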
Solution 3:
The recommended way to work with different column types is detailed in the sklearn documentation on ColumnTransformer.
Representative example:
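from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
# Numeric columns are scaled
numeric_features = ['age', 'fare']
numeric_transformer = Pipeline(steps=[('scaler', StandardScaler())])
# Categorical columns are one-hot encoded; unseen categories are ignored at transform time
categorical_features = ['embarked', 'sex', 'pclass']
categorical_transformer = OneHotEncoder(handle_unknown='ignore')
# Apply each transformer only to its own list of columns
preprocessor = ColumnTransformer(
    transformers=[
        ('num', numeric_transformer, numeric_features),
        ('cat', categorical_transformer, categorical_features)])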
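Applied to the original question, the same idea can be narrowed to a single column. The following is a minimal sketch, assuming a small made-up DataFrame with a string 'pattern_id' column; remainder='passthrough' keeps the other columns unchanged, and sparse_threshold=0 is used here only so the result is a dense array that is easy to inspect.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data; only 'pattern_id' should be one-hot encoded.
df = pd.DataFrame({'pattern_id': ['x', 'y', 'x', 'z'],
                   'B': [10, 20, 30, 40],
                   'C': [0.5, 1.5, 2.5, 3.5]})

preprocessor = ColumnTransformer(
    transformers=[
        # Encode just the one categorical column, selected by name.
        ('cat', OneHotEncoder(handle_unknown='ignore'), ['pattern_id'])],
    remainder='passthrough',  # leave every other column as it is
    sparse_threshold=0)       # always return a dense array

X = preprocessor.fit_transform(df)
print(X.shape)
```

The encoded columns come first in the output, followed by the passed-through columns in their original order.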