How to encode the labels and show the performance of encoded labels

In machine learning we often encounter data with categorical labels in one or more columns. These labels may be strings or numbers. Most machine learning models cannot consume this kind of data in its raw form, so the labels are first converted to numbers, commonly with label encoding. Label encoding is a technique that converts labels into numeric form so that they can be fed to a machine learning model, and it is an important data preprocessing step for supervised learning. In this method, each distinct value in a categorical column is replaced with an integer from 0 to N-1, where N is the number of distinct values. In scikit-learn, LabelEncoder is a utility class that normalizes labels so that they contain only values between 0 and n_classes-1. It is intended for encoding the target column (the labels to be predicted) rather than the input features.
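As a minimal sketch of the 0 to N-1 mapping described above, the snippet below encodes a small list of labels. The `colors` list is a hypothetical toy example and is not taken from iris.csv.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy labels, used only to illustrate the mapping
colors = ["red", "green", "blue", "green", "red"]

encoder = LabelEncoder()
encoded = encoder.fit_transform(colors)

# classes_ lists the distinct labels in sorted order;
# each label's position in that list is its integer code
print(encoder.classes_.tolist())  # ['blue', 'green', 'red']
print(encoded.tolist())           # [2, 1, 0, 1, 2]
```

There are three distinct labels here, so the codes run from 0 to 2.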

The following example demonstrates how to encode labels. Here I am using the iris.csv file as an example. You can download this file here

sklearn.preprocessing.LabelEncoder is used to perform label encoding. A detailed description can be found on the official website (https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.LabelEncoder.html)

Step-1: First we find the unique labels in the column variety as follows

import numpy as np
import pandas as pd

# Load the required data set
df = pd.read_csv('iris.csv')

df['variety'].unique()

Output

array(['Setosa', 'Versicolor', 'Virginica'], dtype=object)

Step-2: Now, using preprocessing.LabelEncoder(), we encode the labels in the variety column as follows

# Import label encoder
from sklearn import preprocessing

# Create the encoder object that maps string labels to integers.
label_encoder = preprocessing.LabelEncoder()

# Encode labels in column 'variety'.
df['variety'] = label_encoder.fit_transform(df['variety'])

df['variety'].unique()

Output

array([0, 1, 2])

As you can observe in the above output, Setosa is encoded as 0, Versicolor as 1, and Virginica as 2. LabelEncoder assigns the integers in sorted order of the original labels.
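Once a column is encoded, the original names are no longer visible in the data, so it can help to read the mapping back from the fitted encoder. The sketch below assumes a short hypothetical list, `varieties`, standing in for the variety column so that it runs without iris.csv. It uses the encoder's `classes_` attribute and `inverse_transform` method.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical stand-in for the 'variety' column of iris.csv
varieties = ["Setosa", "Versicolor", "Virginica", "Setosa"]

label_encoder = LabelEncoder()
codes = label_encoder.fit_transform(varieties)

# classes_ holds the original labels; the index of each label is its code
mapping = {str(label): code for code, label in enumerate(label_encoder.classes_)}
print(mapping)  # {'Setosa': 0, 'Versicolor': 1, 'Virginica': 2}

# inverse_transform converts integer codes back to the original labels
restored = label_encoder.inverse_transform(codes)
print(restored.tolist())  # ['Setosa', 'Versicolor', 'Virginica', 'Setosa']
```

Keeping the fitted encoder around is what makes it possible to report model predictions with the original label names instead of integers.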

Sunday, 30 January 2022
