Regression
Regression
Regression analysis is a statistical method used to understand the relationship between a dependent variable and one or more independent variables. It is commonly employed for prediction, forecasting, and understanding the relationship between variables. It aims to model the relationship between the variables and make predictions about the dependent variable based on the values of the independent variables. There are different types of regression:-
Steps for regression modeling:-
- Hypothesize deterministic component.
- Estimate unknown model parameters.
- Specify probability distribution of random error term and estimate standard deviation of error.
- Evaluate model.
- Use model for prediction and estimation.
Types of regression
Simple Linear Regression:-
Simple linear regression is a statistical technique used to model the relationship between a single independent variable (predictor) and a continuous dependent variable (response). It assumes that there is a linear relationship between the independent variable x and dependent variable y.
Y=β0+β1X+ϵ
The goal of simple linear regression is to estimate the coefficients that minimize the sum of squared differences between the observed values of y and the values predicted by the linear model.
A linear regression model attempts to explain the relationship between a dependent (output variables) variable and one or more independent (predictor variable) variables using a straight line.
This straight line is represented using the following formula:
y = mx +c
Where, y: dependent variable
x: independent variable
m: Slope of the line (For a unit increase in the quantity of X, Y increases by m.1 = m units.)
c: y intercept (The value of Y is c when the value of X is 0)
The first step in finding a linear regression equation is to determine if there is a relationship between the two variables. We can do this by using the Correlation coefficient and scatter plot. When a correlation coefficient shows that data is likely to be able to predict future outcomes and a scatter plot of the data appears to form a straight line, we can use simple linear regression to find a predictive function. The next step is to find a straight line between Sales and Marketing that explain the relationship between them.
Simple linear regression is a fundamental technique in statistics and is widely used in various fields, including economics, social sciences, engineering, and machine learning. It provides a straightforward approach to modeling and understanding the relationship between two variables.
Multiple Regression:-
Multiple regression is a statistical technique used to analyze the relationship between a single dependent variable and multiple independent variables. It extends the concepts of simple linear regression, where only one independent variable is used, to cases where there are two or more independent variables. Multiple regression allows for the examination of how each independent variable contributes to the variation in the dependent variable while controlling for the effects of other variables. Multiple regression is widely used in various fields such as economics, social sciences, business, and engineering for modeling and understanding complex relationships between multiple variables. It provides a versatile framework for analyzing data and making predictions based on multiple predictors. The multiple regression model is represented by:-
Y=β0+β1X1+β2X2+…+βpXp+ϵ
- is the dependent variable (response),
- are the independent variables (predictors),
- is the intercept,
- are the coefficients (slopes) corresponding to each independent variable,
- is the error term.
- The goal of multiple regression is to estimate the coefficients that minimize the sum of squared differences between the observed values of and the values predicted by the regression model.
- Multiple regression is widely used in various fields such as economics, social sciences, business, and engineering for modeling and understanding complex relationships between multiple variables. It provides a versatile framework for analyzing data and making predictions based on multiple predictors.
Polynomial Regression:-
Polynomial regression is a type of regression analysis where the relationship between the independent variable X and the dependent variable Y is modeled as an n-degree polynomial function. Unlike simple linear regression, which assumes a linear relationship between X and Y, polynomial regression can capture non-linear relationships between the variables. It is also called the special case of Multiple Linear Regression in ML. Because we add some polynomial terms to the Multiple Linear regression equation to convert it into Polynomial Regression. The polynomial regression model is represented as:-
Y=β0+β1X+β2X2+…+βnXn+ϵ
Where:
- is the dependent variable (response),
- is the independent variable (predictor),
- are the coefficients,
- is the error term.
The degree of the polynomial (n) determines the complexity of the model and the flexibility to capture non-linear patterns in the data. Higher-degree polynomials can fit the data more closely but may also lead to overfitting, where the model learns the noise in the data rather than the underlying relationship.
Polynomial regression is useful for modeling non-linear relationships in the data and capturing complex patterns that cannot be captured by simple linear regression. However, it's important to be cautious about overfitting, especially when using higher-degree polynomials, and to validate the model using appropriate techniques such as cross-validation. It is a linear model with some modification in order to increase the accuracy. The dataset used in Polynomial regression for training is of non-linear nature.
It makes use of a linear regression model to fit the complicated and non-linear functions and datasets.
Hence, "In Polynomial regression, the original features are converted into Polynomial features of required degree (2,3,..,n) and then modeled using a linear model."
Steps for Polynomial Regression:-
The main steps involved in Polynomial Regression are given below:
- Data Pre-processing.
- Build a Linear Regression model and fit it to the dataset.
- Build a Polynomial Regression model and fit it to the dataset.
- Visualize the result for Linear Regression and Polynomial Regression model.
- Predicting the output.
Logistics Regression:-
Logistic regression is a statistical method used for modeling the relationship between a binary dependent variable and one or more independent variables. It is widely used for binary classification tasks, where the goal is to predict the probability that an observation belongs to one of two classes.
Despite its name, logistic regression is a classification algorithm rather than a regression algorithm. It is called "logistic regression" because it uses the logistic function (also known as the sigmoid function) to model the probability of the binary outcome. Logistic Regression was used in the biological sciences in early twentieth century. It was then used in many social science applications. Logistic Regression is used when the dependent variable(target) is categorical.
For example,
To predict whether an email is spam (1) or (0)
Whether the tumor is malignant (1) or not (0)
It is a binary classification problems (problems with two class values).Logistic regression is named for the function used at the core of the method, the logistic function. The logistic function, also called the sigmoid function. It’s an S-shaped curve that can take any real-valued number and map it into a value between 0 and 1, but never exactly at those limits.
1 / (1 + e^-value)
where, e is the base of the natural logarithms and value is the actual numerical value that you want to transform.
Logistic regression is a powerful and interpretable algorithm for binary classification tasks. It is widely used in various fields such as medicine, finance, marketing, and social sciences for predicting binary outcomes and making decisions based on probabilistic predictions.
.png)
Comments
Post a Comment