Estimated time needed: 1 hour
In this lab, we will study how to convert a linear classifier into a multi-class classifier using multinomial logistic regression (also called softmax regression), One vs. All (One-vs-Rest), and One vs. One.
After completing this lab you will be able to:
- Understand and apply some theory behind:
  - Softmax regression
  - One vs. All (One-vs-Rest)
  - One vs. One
In multi-class classification, we classify data into more than two class labels. Unlike classification trees and k-nearest neighbours, extending linear classifiers to the multi-class case is not as straightforward. We can convert logistic regression into a multi-class classifier using multinomial logistic regression, or softmax regression; this is a generalization of logistic regression, but it does not work for support vector machines. One vs. All (One-vs-Rest) and One vs. One are two other multi-class classification techniques that can convert any two-class classifier into a multi-class classifier.
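Before working through the details, here is a minimal sketch (not part of the lab's own code cells) of how these three strategies appear in scikit-learn: LogisticRegression uses a softmax (multinomial) formulation for multi-class problems with its default lbfgs solver, while the OneVsRestClassifier and OneVsOneClassifier wrappers turn any two-class estimator, such as an SVM, into a multi-class classifier. The variable names X_demo and y_demo are used only for this illustration; run it after the installation and import cells below if you want to try it.

# Sketch only: the three multi-class strategies in scikit-learn
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier, OneVsOneClassifier
from sklearn.svm import SVC

X_demo, y_demo = load_iris(return_X_y=True)

softmax_clf = LogisticRegression(max_iter=1000).fit(X_demo, y_demo)  # multinomial / softmax
ovr_clf = OneVsRestClassifier(SVC()).fit(X_demo, y_demo)             # one binary SVM per class
ovo_clf = OneVsOneClassifier(SVC()).fit(X_demo, y_demo)              # one binary SVM per pair of classes

print(softmax_clf.score(X_demo, y_demo),
      ovr_clf.score(X_demo, y_demo),
      ovo_clf.score(X_demo, y_demo))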
For this lab, we are going to use several Python libraries such as scikit-learn, numpy, and matplotlib for visualizations. Some of these libraries might already be installed in your lab environment; executing the cells below will install any that are missing.
import piplite
await piplite.install(['pandas'])
await piplite.install(['matplotlib'])
await piplite.install(['numpy'])
await piplite.install(['scikit-learn'])
await piplite.install(['scipy'])
from pyodide.http import pyfetch

# Helper to download a file from a URL into the local lab environment
async def download(url, filename):
    response = await pyfetch(url)
    if response.status == 200:
        with open(filename, "wb") as f:
            f.write(await response.bytes())
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
import pandas as pd
This function plots the decision boundary of a fitted model together with the data points.
plot_colors = "ryb"
plot_step = 0.02

def decision_boundary(X, y, model, iris, two=None):
    # Build a mesh grid covering the feature space
    x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, plot_step),
                         np.arange(y_min, y_max, plot_step))
    plt.tight_layout(h_pad=0.5, w_pad=0.5, pad=2.5)

    # Predict the class for every point on the grid and colour the regions
    Z = model.predict(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)
    cs = plt.contourf(xx, yy, Z, cmap=plt.cm.RdYlBu)

    if two:
        # Plot the samples of the classes present in y
        for i, color in zip(np.unique(y), plot_colors):
            idx = np.where(y == i)
            plt.scatter(X[idx, 0], X[idx, 1], label=y, cmap=plt.cm.RdYlBu, s=15)
        plt.show()
    else:
        set_ = {0, 1, 2}
        print(set_)
        # Plot the classes present in y, removing each from set_ as it is drawn
        for i, color in zip(range(3), plot_colors):
            idx = np.where(y == i)
            if np.any(idx):
                set_.remove(i)
                plt.scatter(X[idx, 0], X[idx, 1], label=y, cmap=plt.cm.RdYlBu,
                            edgecolor='black', s=15)
        # Mark the samples of any remaining class with black crosses
        for i in set_:
            idx = np.where(iris.target == i)
            plt.scatter(X[idx, 0], X[idx, 1], marker='x', color='black')
        plt.show()
This function plots the probability of belonging to each class; each column corresponds to a class, and each row corresponds to a sample.
def plot_probability_array(X, probability_array):
    # Each class gets a block of 10 columns so the probabilities are easy to see
    plot_array = np.zeros((X.shape[0], 30))
    col_start = 0
    for class_, col_end in enumerate([10, 20, 30]):
        plot_array[:, col_start:col_end] = np.repeat(
            probability_array[:, class_].reshape(-1, 1), 10, axis=1)
        col_start = col_end
    plt.imshow(plot_array)
    plt.xticks([])
    plt.ylabel("samples")
    plt.xlabel("probability of 3 classes")
    plt.colorbar()
    plt.show()
In this lab we will use the iris dataset. It consists of the sepal and petal measurements of 3 different types of irises (Setosa y=0, Versicolour y=1, and Virginica y=2), stored in a 150×4 numpy.ndarray.
The rows are the samples and the columns are: Sepal Length, Sepal Width, Petal Length and Petal Width.
The plot below uses two of the four features: sepal width and petal width.
pair=[1, 3]
iris = datasets.load_iris()
X = iris.data[:, pair]
y = iris.target
np.unique(y)
array([0, 1, 2])
plt.scatter(X[:, 0], X[:, 1], c=y, cmap=plt.cm.RdYlBu)
plt.xlabel("sepal width (cm)")
plt.ylabel("petal width (cm)")
SoftMax regression is similar to logistic regression. The softmax function converts the raw scores, i.e. the dot products of $x$ with each of the parameter vectors $\theta_i$ for the $K$ classes, into probabilities:

$softmax(x,i) = \frac{e^{\theta_i^T x}}{\sum_{j=1}^K e^{\theta_j^T x}}$
The training procedure is almost identical to logistic regression. Consider the three-class example where $y \in \{0,1,2\}$ and we would like to classify $x_1$. We can use the softmax function to generate the probability of the sample belonging to each class:

$[softmax(x_1,0),\ softmax(x_1,1),\ softmax(x_1,2)]=[0.97, 0.02, 0.01]$

The index of each probability is the same as the class. We can make a prediction using the argmax function:

$\hat{y}=argmax_i \{softmax(x,i)\}$

For the above example, we can make a prediction as follows:

$\hat{y}=argmax_i \{[0.97, 0.02, 0.01]\}=0$
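To make the mapping from scores to probabilities concrete, here is a minimal numpy sketch (the scores are made-up values, not taken from any fitted model; they roughly reproduce the probabilities in the example above):

# Illustrative only: hypothetical raw scores theta_i^T x for the 3 classes
scores = np.array([4.0, 0.5, -1.0])
probs = np.exp(scores) / np.sum(np.exp(scores))   # softmax -> approx [0.96, 0.03, 0.01]
y_hat = np.argmax(probs)                          # predicted class index -> 0
print(probs, y_hat)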
sklearn does this automatically, but we can verify the prediction step ourselves. First, we fit the model:
lr = LogisticRegression(random_state=0).fit(X, y)
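As an optional aside (not one of the original lab steps here), the decision_boundary helper defined above can be used to visualize the regions this fitted model assigns to each class, since the model was trained on the two selected features:

decision_boundary(X, y, lr, iris)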
We generate the per-sample probabilities using the method predict_proba:
probability=lr.predict_proba(X)
We can plot the probability of belonging to each class; each column is the probability of belonging to a class, and the row number is the sample number.
plot_probability_array(X,probability)
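The predicted class is the column with the largest probability in each row. As a quick sanity check (an optional addition, not part of the original steps), we can confirm that taking the argmax of each row of predict_proba reproduces the output of predict:

# Optional check: argmax over the per-class probabilities should equal model.predict
softmax_prediction = np.argmax(probability, axis=1)
print(np.array_equal(lr.predict(X), softmax_prediction))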