# Regression with Multilayer Perceptron (MLP) Using Python

In this article, we are going to understand how Multi-Layer Perceptrons can be used for Regression tasks and modeling.

## Introduction

The Multilayer Perceptron (MLP) is one of the earliest and most fundamental neural networks, sometimes called the "plain vanilla" neural network. It is classified as a feed-forward network, meaning data flows in only one direction, forward, unlike networks such as Recurrent Neural Networks (RNNs) that can process sequential data. MLPs are popular in tasks like image classification, Natural Language Processing (NLP), time series forecasting, and more. At their core, many of these tasks fall under classification, where the model assigns inputs to categories based on what it has learned. But neural networks are all-rounders: they can also be used for regression tasks, which predict continuous outcomes. In this article, we are going to understand how Multi-Layer Perceptrons can be used for regression tasks and modeling, and what makes them different from other regression models. So, let's begin.

## What is Regression?

In regression, you need to know two terms: the dependent variable and the independent variable. In the context of advertising and revenue, revenue earned is the dependent variable and the amount spent on advertising (across different advertising channels) is the independent variable. But why is that the case, and what does it mean? There is a relationship between advertising and revenue: when advertising spend increases, revenue tends to improve. The dependent variable, revenue earned, is the outcome of interest, and it depends on the amount spent on advertising, which is the independent variable. Advertising, by contrast, does not depend on the revenue earned; it can be controlled and changed as per our requirements. This distinction is one of the most important concepts in regression analysis.

### Linear Regression

You might have heard about Linear Regression when you started your journey into machine learning, because it is known as the "hello world" algorithm of machine learning. The basic idea behind linear regression is to fit a straight line to the data using the slope-intercept equation, y = mx + b, where m is the slope and b is the intercept. It is a popular regression method for datasets where the relationship between variables is close to linear.
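To make the idea concrete, here is a minimal sketch of fitting that straight line by least squares with NumPy. The advertising-style numbers are made up for illustration; the underlying trend is roughly y = 2x:

```python
import numpy as np

# Hypothetical advertising spend (x) and revenue (y) with a near-linear trend
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.0, 6.2, 7.9, 10.1])

# Fit y = m*x + b by least squares
m, b = np.polyfit(x, y, deg=1)
print(m, b)  # slope close to 2, intercept close to 0
```

`np.polyfit` with `deg=1` is just one of several ways to do this; `sklearn.linear_model.LinearRegression` would give the same line.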

### Non-linear Regression

Regression problems are not always linear in nature; such problems are known as non-linear regression. Here the relationship between variables is more complex: it might be exponential, logarithmic, or some other non-linear form. Unlike linear regression, non-linear regression models can capture more complex relationships, such as curves or other non-linear shapes.

Non-linear regression models can be more challenging to fit and interpret than linear regression models. The form of the non-linear function must be specified, and the parameters of the function must be estimated from the data.
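As a sketch of what "specifying the form and estimating the parameters" looks like in practice, here is a hypothetical exponential fit using SciPy's `curve_fit`. The true parameters (a = 1.5, b = 0.8) and the noise level are made up for this example:

```python
import numpy as np
from scipy.optimize import curve_fit

# The functional form must be specified up front: here y = a * exp(b * x)
def model(x, a, b):
    return a * np.exp(b * x)

# Synthetic data generated from the model plus a little noise
rng = np.random.default_rng(0)
x = np.linspace(0, 2, 50)
y = 1.5 * np.exp(0.8 * x) + rng.normal(0, 0.05, size=x.shape)

# Estimate the parameters a and b from the data
(a_hat, b_hat), _ = curve_fit(model, x, y)
print(a_hat, b_hat)  # should recover values near 1.5 and 0.8
```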

## How MLP Can Be Used for Regression Problems?

Multi-layer perceptrons (MLPs) are really good at learning complex relationships in a given dataset, so they can be used to solve non-linear regression problems. MLPs consist of multiple layers of interconnected nodes, or neurons. Each neuron in a layer receives input from the previous layer, applies a transformation function to it, and passes the output to the next layer. There are three kinds of layers: the input layer, the hidden layers, and the output layer. The hidden layers can be of any size.
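The "weighted sum, transform, pass along" step can be sketched in plain NumPy. This is a toy 1-16-1 network with random weights purely to show the mechanics; a real network would learn these weights during training:

```python
import numpy as np

# Random weights for illustration only (a trained MLP would learn these)
rng = np.random.default_rng(42)
W1, b1 = rng.normal(size=(1, 16)), np.zeros(16)   # input -> hidden
W2, b2 = rng.normal(size=(16, 1)), np.zeros(1)    # hidden -> output

def relu(z):
    return np.maximum(0, z)

x = np.array([[0.5]])        # one sample, one feature
hidden = relu(x @ W1 + b1)   # weighted sum + activation
output = hidden @ W2 + b2    # regression output: plain weighted sum
print(output.shape)          # (1, 1): one prediction for one sample
```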

A classification MLP and a regression MLP are not too different, but there are a few differences, of course. First, if you want to predict a single value (say, revenue earned because of advertising), you only need a single output neuron; if you want to predict multiple values, you can add multiple output neurons.

Second, in regression tasks we generally don't apply any activation function to the output layer of the MLP; it just computes the weighted sum and emits it as the output. But if you want the output constrained to a given range, for example between -1 and +1, you can use an activation like tanh (hyperbolic tangent).
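The reason tanh works for this is that it squashes any real input into the open interval (-1, 1), as a quick check shows:

```python
import numpy as np

# tanh maps any real number into (-1, 1), saturating for large inputs
z = np.array([-100.0, -1.0, 0.0, 1.0, 100.0])
out = np.tanh(z)
print(out.min() >= -1 and out.max() <= 1)  # True
```

For a range other than (-1, 1) you would rescale the tanh output, or simply leave the output layer linear as discussed above.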

Third, the loss functions commonly used in a regression MLP include Mean Squared Error (MSE) and Mean Absolute Error (MAE). MSE works well on datasets with few outliers, while MAE is a better measure on datasets with many outliers, because squaring amplifies large errors.
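A tiny made-up example shows why: with a single large outlier among the residuals, MSE blows up far more than MAE.

```python
import numpy as np

y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.1, 2.1, 2.9, 14.0])  # last prediction is an outlier

errors = y_pred - y_true
mse = np.mean(errors ** 2)      # dominated by the outlier: 10^2 = 100
mae = np.mean(np.abs(errors))   # the outlier contributes only 10
print(mse, mae)                 # ~25.0 vs ~2.6
```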

## Performing Regression using MLP

We are now going to implement everything we have discussed. The plan is to generate a synthetic dataset using NumPy and then evaluate how well a neural network identifies the patterns in it. This will help us see how an MLP handles non-linear regression tasks. After that, we will work with a real-world dataset that relates advertising spend to sales.

### Building dataset using NumPy

```python
import numpy as np
import matplotlib.pyplot as plt

# A noisy sine wave: a clearly non-linear pattern
X_train = np.linspace(-10, 10, 1000)
y_train = np.sin(X_train) + np.random.normal(0, 0.2, size=X_train.shape)
X_test = np.linspace(-10, 10, 500)
y_test = np.sin(X_test) + np.random.normal(0, 0.2, size=X_test.shape)
```

Now, if you plot the dataset, you'll see something like this,

```python
plt.scatter(X_train, y_train)
plt.show()
```

The training data follows a wavy pattern; you'll see the same pattern when you plot the test data, though not exactly the same points. A simple linear regression model cannot fit this dataset with low MSE because it is non-linear, but let's see what a neural network can do.

### Creating and Training MLP Model

Let's create our neural network. I'm going to use the Keras API bundled with TensorFlow (tf.keras) to create the network; the standalone Keras package works just as well.

```python
import tensorflow as tf

model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(62, activation='relu', input_dim=1))
model.add(tf.keras.layers.Dense(62, activation='relu'))
model.add(tf.keras.layers.Dense(1, activation='linear'))
```

We created a model whose input layer takes a single feature (input_dim=1), followed by two hidden layers with 62 neurons each. The number of neurons is completely up to you; I selected 62 because it performed well here. Now let's compile and train the model.

```python
model.compile(loss='mse', optimizer='adam', metrics=['mae'])
history = model.fit(X_train, y_train, epochs=500)
```

Nothing here should be unfamiliar: we already discussed MSE and MAE, and the remaining argument is the optimizer, for which I selected Adam.

### Plotting the Regression Curve

Here is how you plot the Regression Curve formed when training the model,

```python
plt.scatter(X_train, y_train)
plt.plot(X_train, model.predict(X_train), color='red')
plt.show()
```

The red curve in the graph is the regression curve formed by our MLP. This is really cool: the network actually finds the wavy pattern in the dataset, and this is basically how regression works with an MLP. Now, let's plot the testing data as well,

```python
pred = model.predict(X_test)
plt.scatter(X_test, y_test)
plt.plot(X_test, pred, color='red')
plt.show()
```

Plotting is a great way to inspect a model, but on its own it doesn't tell you much about a regression model's accuracy, so it is good to also check the MSE on the test set.

```python
from sklearn.metrics import mean_squared_error

pred = model.predict(X_test)
mean_squared_error(y_test, pred)
# 0.04299296078870944
```

That's not bad at all; the lower you can push the MSE, the more effective the model. You can also try a linear regression model for comparison.
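For a rough comparison, here is that linear baseline on a freshly generated copy of the same sine dataset (a fixed seed is used here, so the exact numbers will differ slightly from the run above):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Recreate the noisy sine dataset in the 2-D shape sklearn expects
rng = np.random.default_rng(0)
X_train = np.linspace(-10, 10, 1000).reshape(-1, 1)
y_train = np.sin(X_train).ravel() + rng.normal(0, 0.2, size=1000)
X_test = np.linspace(-10, 10, 500).reshape(-1, 1)
y_test = np.sin(X_test).ravel() + rng.normal(0, 0.2, size=500)

# A straight line cannot follow the wave, so its test MSE stays near
# the variance of sin(x) itself, roughly 0.5 -- far above the MLP's ~0.04
lin = LinearRegression().fit(X_train, y_train)
lin_mse = mean_squared_error(y_test, lin.predict(X_test))
print(lin_mse)
```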

## Predicting Sales on Advertising Dataset

OK, let's get into a real-world example: the Advertising Budget and Sales dataset from Kaggle. This dataset contains spend on different forms of advertising and the effect of that advertising on sales. You can download the dataset from Kaggle and use Pandas to load it.

```python
import pandas as pd

dataset = pd.read_csv('Advertising Budget and Sales.csv')
```

### Splitting the dataset

Here is how you can split the dataset into train and test sets,

```python
from sklearn.model_selection import train_test_split

X = dataset.iloc[:, 1:4]
y = dataset.iloc[:, -1]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
```