3. Support Vector Machines

Aim

    To study and implement the Support Vector Machine (SVM) algorithm for classification and regression tasks, and to analyze how the hyperplane, margin, and choice of kernel function (linear, polynomial, RBF) influence the model’s performance.

Understand the Support Vector Machine Algorithm Before You Begin

Overview: Support Vector Machine (SVM) is a supervised machine learning algorithm used for both classification and regression tasks. For classification, it finds the boundary, called a hyperplane, that best separates data points of different classes in the feature space. The goal is to maximize the margin, which is the distance between the hyperplane and the nearest data points (called support vectors).
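
As a sketch of what maximizing the margin means, assume a linearly separable training set of n points x_i with labels y_i in {-1, +1}, and a hyperplane defined by w^T x + b = 0. The symbols w, b, x_i, y_i, and n are notation introduced here for illustration. Under those assumptions, the hard-margin problem can be written as:

```latex
\min_{\mathbf{w},\, b} \ \frac{1}{2}\lVert \mathbf{w} \rVert^{2}
\quad \text{subject to} \quad
y_i \left( \mathbf{w}^{\top} \mathbf{x}_i + b \right) \ge 1,
\qquad i = 1, \dots, n
```

With this scaling, the distance from the hyperplane to the nearest points is 1/||w||, so minimizing ||w|| maximizes the margin. The support vectors are the points for which the constraint holds with equality.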

SVM can efficiently perform non-linear classification using kernel functions, which implicitly map the input data into a higher-dimensional space where the classes are easier to separate. Common kernels include the linear, polynomial, and radial basis function (RBF) kernels.
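
To make the three kernels concrete, here is a minimal NumPy sketch that evaluates each one on a single pair of points. The function names and the default values are chosen for illustration only; the parameter names degree, gamma, and coef0 follow the ones scikit-learn's SVC uses, which is an assumption about the library used in this experiment.

```python
import numpy as np


def linear_kernel(x, z):
    # K(x, z) = <x, z>
    return float(np.dot(x, z))


def polynomial_kernel(x, z, degree=3, gamma=1.0, coef0=1.0):
    # K(x, z) = (gamma * <x, z> + coef0) ** degree
    return float((gamma * np.dot(x, z) + coef0) ** degree)


def rbf_kernel(x, z, gamma=0.5):
    # K(x, z) = exp(-gamma * ||x - z||^2)
    diff = np.asarray(x, dtype=float) - np.asarray(z, dtype=float)
    return float(np.exp(-gamma * np.dot(diff, diff)))


# Two illustrative 2-D points (not taken from the dataset).
x = np.array([1.0, 2.0])
z = np.array([2.0, 0.0])

print(linear_kernel(x, z))      # <x, z> = 2.0
print(polynomial_kernel(x, z))  # (2 + 1) ** 3 = 27.0
print(rbf_kernel(x, z))         # exp(-0.5 * 5) = exp(-2.5)
```

Each kernel returns a similarity score for the pair. The RBF value decays toward 0 as the points move apart and equals 1 when they coincide.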

Further Understanding: Support Vector Machines

Algorithm

  1. Load the Social_Network_Ads.csv dataset using pandas and use only the age and estimated salary features as the input data X.
  2. Take the purchased column as y (the target variable).
  3. Split the data into training and testing sets (75% training, 25% test).
  4. Perform Feature Scaling for training and test data.
  5. Create SVM classifier object with linear kernel.
  6. Train the model using the training sets.
  7. Make predictions using the testing set.
  8. Print the Confusion Matrix.
  9. Plot SVM classifier plot with classifier boundary.
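
The steps above can be sketched as follows, assuming scikit-learn and matplotlib are available. The column names "Age", "EstimatedSalary", and "Purchased" are assumptions about the CSV header; the helper names run_linear_svm and plot_decision_boundary, the random seed, and the output file name are hypothetical choices made for this sketch.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def run_linear_svm(df, feature_cols=("Age", "EstimatedSalary"),
                   target_col="Purchased", seed=0):
    """Steps 1-8: select columns, split, scale, train, predict, evaluate."""
    # Steps 1-2: column names are assumed; adjust them to the CSV header.
    X = df[list(feature_cols)].to_numpy(dtype=float)
    y = df[target_col].to_numpy()

    # Step 3: hold out 25% of the rows for testing.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed)

    # Step 4: fit the scaler on the training set only, then apply it to both.
    scaler = StandardScaler()
    X_train = scaler.fit_transform(X_train)
    X_test = scaler.transform(X_test)

    # Steps 5-7: linear-kernel classifier, training, and prediction.
    clf = SVC(kernel="linear")
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)

    # Step 8: rows are actual classes, columns are predicted classes.
    cm = confusion_matrix(y_test, y_pred)
    return clf, scaler, (X_test, y_test), cm


def plot_decision_boundary(clf, X, y, path="svm_linear_boundary.png"):
    """Step 9: shade the scaled feature plane by predicted class."""
    import matplotlib
    matplotlib.use("Agg")  # write to a file; no display is required
    import matplotlib.pyplot as plt

    xx, yy = np.meshgrid(
        np.linspace(X[:, 0].min() - 1, X[:, 0].max() + 1, 200),
        np.linspace(X[:, 1].min() - 1, X[:, 1].max() + 1, 200))
    zz = clf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
    plt.contourf(xx, yy, zz, alpha=0.3)
    plt.scatter(X[:, 0], X[:, 1], c=y, edgecolors="k")
    plt.xlabel("Age (scaled)")
    plt.ylabel("Estimated salary (scaled)")
    plt.savefig(path)
    plt.close()


# Example usage (step 1 loads the file named in the algorithm):
# df = pd.read_csv("Social_Network_Ads.csv")
# clf, scaler, (X_test, y_test), cm = run_linear_svm(df)
# print(cm)
# plot_decision_boundary(clf, X_test, y_test)
```

Fitting the scaler on the training set alone keeps information from the test set out of the model. Swapping kernel="linear" for another kernel name is the only change needed to try a different kernel.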

About Social Network Ads Dataset

Number of features: 5 (ID, gender, age, estimated salary, purchased (target variable to be predicted))
Number of samples: 400

Source: Social Network Dataset

Simulation

Visualizing Linear & Non-linear SVM Decision Boundaries

Pre-Lab Questions

  1. Why is standard scaling of features essential? Give the formula for standard scaling.
  2. What is the pandas library used for? Explain any four built-in functions of the pandas library.

Post-Lab Questions

  1. Calculate accuracy, precision, and recall for the linear-kernel SVM from the confusion matrix.
  2. Run the code with the polynomial, Gaussian (RBF), and sigmoid kernels, report the performance measures (accuracy, precision, and recall) for each, and determine which kernel performs best.
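
As a starting point for these questions, the sketch below derives the three measures from a 2x2 confusion matrix. It assumes the layout [[TN, FP], [FN, TP]], with rows for actual classes and columns for predicted classes. The function name is hypothetical and the counts are made up for illustration; they are not results from the dataset.

```python
def metrics_from_confusion_matrix(cm):
    """Return (accuracy, precision, recall) for cm = [[TN, FP], [FN, TP]]."""
    (tn, fp), (fn, tp) = cm
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Made-up counts for illustration only; not results from the dataset.
example_cm = [[50, 10], [5, 35]]
acc, prec, rec = metrics_from_confusion_matrix(example_cm)
print(acc, prec, rec)
```

The same function can be applied to the confusion matrix obtained for each kernel so that the kernels are compared on identical measures.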

Result

The SVM classifier with a linear kernel was successfully implemented on the Social Network Ads dataset. It accurately classified customer purchasing behavior, and the confusion matrix along with the decision boundary plot confirmed the model’s effective performance.