Linear Regression: A Model-Based Approach
Simple or single-variate linear regression is the simplest case of linear regression because it has a single independent variable: the input is one scalar x rather than a vector of several features.
The following figure illustrates simple linear regression:
When implementing simple linear regression, you typically start with a given set of input-output (x-y) pairs (green circles). These pairs are your observations. For example, the leftmost observation (green circle) has the input x = 5 and the actual output (response) y = 5. The next one has x = 15 and y = 20, and so on.
The estimated regression function (black line) has the equation f(x) = b₀ + b₁x. Your goal is to calculate the optimal values of the predicted weights b₀ and b₁, that is, the values that minimize the sum of squared residuals (SSR). These two weights fully determine the estimated regression function. The value of b₀, also called the intercept, shows the point where the estimated regression line crosses the y axis. It is the value of the estimated response f(x) for x = 0. The value of b₁ determines the slope of the estimated regression line.
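For a single input variable, the weights that minimize SSR have a well-known closed form: b₁ is the sum of cross-deviations of x and y divided by the sum of squared deviations of x, and b₀ = mean(y) - b₁ · mean(x). The following is a minimal sketch of that calculation in plain Python. Only the first two observations come from the text; the other four pairs and the function name fit_simple_linear_regression are assumptions made for illustration.

```python
# Observations: the first two pairs appear in the text. The remaining
# four are assumed values added only to make the example runnable.
xs = [5, 15, 25, 35, 45, 55]
ys = [5, 20, 14, 32, 22, 38]


def fit_simple_linear_regression(xs, ys):
    """Return (b0, b1), the weights that minimize the sum of squared residuals."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Slope: sum of cross-deviations divided by sum of squared x deviations.
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    b1 = s_xy / s_xx
    # Intercept: the fitted line passes through the point (x_mean, y_mean).
    b0 = y_mean - b1 * x_mean
    return b0, b1


b0, b1 = fit_simple_linear_regression(xs, ys)
print(f"intercept b0 = {b0:.3f}, slope b1 = {b1:.3f}")
# prints: intercept b0 = 5.633, slope b1 = 0.540
```

With these assumed observations, the fitted line gives f(5) ≈ 8.33, which agrees with the predicted response quoted for the leftmost point.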
The predicted responses (red squares) are the points on the regression line that correspond to the input values. For example, for the input x = 5, the predicted response is f(5) = 8.33 (represented with the leftmost red square).
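As a small sketch of how the predicted responses follow from the weights, the snippet below evaluates the line at each input. The weights b₀ ≈ 5.6333 and b₁ = 0.54 are not stated in the text; they are assumed values chosen because they are consistent with f(5) = 8.33. Inputs other than 5 and 15 are also assumed.

```python
# Assumed weights (not given in the text); they reproduce f(5) = 8.33.
b0 = 5.6333
b1 = 0.54


def f(x):
    """Estimated regression function: a straight line with intercept b0 and slope b1."""
    return b0 + b1 * x


# Inputs: 5 and 15 come from the text, the rest are assumed.
xs = [5, 15, 25, 35, 45, 55]

# Each predicted response is the point on the line above its input.
predictions = [round(f(x), 2) for x in xs]
print(predictions)

# The intercept is the predicted response at x = 0.
print(f(0) == b0)  # prints: True
```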
The residuals (vertical dashed gray lines) can be calculated as yᵢ - f(xᵢ) = yᵢ - b₀ - b₁xᵢ for i = 1, …, n. They are the vertical distances between the green circles and red squares. When you implement linear regression, you are minimizing the sum of the squares of these distances, which brings the red squares as close as possible to the observed green circles.
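To make the role of the residuals concrete, here is a hedged sketch that computes them and their sum of squares (SSR) for three candidate lines. The observations beyond the first two pairs, the weights labeled "fitted", and the two alternative lines are all assumptions chosen for illustration; the "fitted" weights are approximately the least-squares solution for this assumed data.

```python
# Observations: the first two pairs appear in the text, the rest are assumed.
xs = [5, 15, 25, 35, 45, 55]
ys = [5, 20, 14, 32, 22, 38]


def residuals(b0, b1):
    """Residuals y_i - b0 - b1 * x_i for every observation."""
    return [y - (b0 + b1 * x) for x, y in zip(xs, ys)]


def ssr(b0, b1):
    """Sum of squared residuals for the line with weights b0 and b1."""
    return sum(r ** 2 for r in residuals(b0, b1))


# Hypothetical candidate lines: (intercept, slope).
candidates = {
    "fitted": (5.6333, 0.54),            # approximately the least-squares weights
    "higher intercept": (8.0, 0.54),     # same slope, shifted upward
    "steeper slope": (5.6333, 0.70),     # same intercept, rotated
}

for name, (b0, b1) in candidates.items():
    print(f"{name}: SSR = {ssr(b0, b1):.2f}")
```

Among the three candidates, the "fitted" line yields the smallest SSR. Its residuals also sum to approximately zero, which is a property of a least-squares line that includes an intercept.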