For the variables x and y, the two regression lines are 6x + y = 30 and 3x + 2y = 25. What are the values of x̅, y̅ and r respectively?
This question was previously asked in NDA (Held On: 17 Nov 2019) Maths Previous Year paper.
Answer (Detailed Solution Below)
Option 3 : \(\frac{35}{9},\ \frac{20}{3},\ -0.5\)

Concept:
The line of regression of y on x is given by \(y - \bar y = b_{yx}\left( x - \bar x \right)\), where \(b_{yx}\) is called the regression coefficient of y on x.
Similarly, the line of regression of x on y is given by \(x - \bar x = b_{xy}\left( y - \bar y \right)\), where \(b_{xy}\) is called the regression coefficient of x on y.
The correlation coefficient r satisfies \(r^2 = b_{yx} \times b_{xy}\).
The two lines of regression intersect each other at \(\left( \bar x,\ \bar y \right)\).

Calculation:
Given: the two regression lines are 6x + y = 30 and 3x + 2y = 25.
Since the two lines of regression intersect at \(\left( \bar x,\ \bar y \right)\), solving the two equations simultaneously gives \(\left( \bar x,\ \bar y \right) = \left( \frac{35}{9},\ \frac{20}{3} \right)\).
We can write 6x + y = 30 as the line of regression of x on y:
\(x - \frac{35}{9} = -\frac{1}{6}\left( y - \frac{20}{3} \right)\) ------(1)
Comparing equation (1) with the line of regression of x on y, \(x - \bar x = b_{xy}\left( y - \bar y \right)\), we get \(b_{xy} = -\frac{1}{6}\).
Similarly, we can write 3x + 2y = 25 as the line of regression of y on x:
\(y - \frac{20}{3} = -\frac{3}{2}\left( x - \frac{35}{9} \right)\) ------(2)
Comparing equation (2) with the line of regression of y on x, \(y - \bar y = b_{yx}\left( x - \bar x \right)\), we get \(b_{yx} = -\frac{3}{2}\).
This assignment of the two lines is the only consistent one: taking them the other way round would give \(b_{yx} = -6\) and \(b_{xy} = -\frac{2}{3}\), whose product is 4, which is impossible because \(r^2 \le 1\).
Since \(r^2 = b_{yx} \times b_{xy}\),
\( \Rightarrow r^2 = \left( -\frac{1}{6} \right) \times \left( -\frac{3}{2} \right) = \frac{1}{4}\)
\( \Rightarrow r = \pm \frac{1}{2} = \pm 0.5\)
The signs of r, \(b_{xy}\) and \(b_{yx}\) are always the same, and both regression coefficients are negative here.
⇒ r = −0.5

Last updated on Sep 29, 2022
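The calculation above can be checked numerically. The following is a minimal sketch, not part of the original solution: the helper names (`intersection`, `coeff_x_on_y`, `coeff_y_on_x`) are hypothetical, and each line a·x + b·y = c is represented as a tuple (a, b, c). It uses exact fractions so the results can be compared directly with the answer.

```python
from fractions import Fraction
import math

def intersection(line_1, line_2):
    """Solve two lines a*x + b*y = c by Cramer's rule; returns (x, y)."""
    (a1, b1, c1), (a2, b2, c2) = line_1, line_2
    det = a1 * b2 - a2 * b1
    return Fraction(c1 * b2 - c2 * b1, det), Fraction(a1 * c2 - a2 * c1, det)

def coeff_x_on_y(line):
    """Read a*x + b*y = c as x = c/a - (b/a)*y, so b_xy = -b/a."""
    a, b, _ = line
    return Fraction(-b, a)

def coeff_y_on_x(line):
    """Read a*x + b*y = c as y = c/b - (a/b)*x, so b_yx = -a/b."""
    a, b, _ = line
    return Fraction(-a, b)

line_1 = (6, 1, 30)   # 6x + y = 30
line_2 = (3, 2, 25)   # 3x + 2y = 25

# The means are the intersection point of the two regression lines.
x_bar, y_bar = intersection(line_1, line_2)

# Assignment used in the solution: line_1 is x on y, line_2 is y on x.
b_xy = coeff_x_on_y(line_1)
b_yx = coeff_y_on_x(line_2)
r_squared = b_xy * b_yx

# The opposite assignment gives a product above 1, so it is rejected.
swapped_product = coeff_y_on_x(line_1) * coeff_x_on_y(line_2)

# r takes the common sign of the two regression coefficients.
sign = -1 if b_xy < 0 else 1
r = sign * math.sqrt(r_squared)

print(x_bar, y_bar, r)
```

Using `Fraction` avoids rounding in the intermediate steps; only the final square root is taken in floating point.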
Applied Statistics - Lesson 6
Lesson Overview
Last lesson we introduced correlation and the correlation coefficients of Pearson and Spearman. In this lesson we develop linear regression equations.

Linear Regression
Regression goes one step beyond correlation in identifying the relationship between two variables. It creates an equation so that values can be predicted within the range framed by the data. This is known as interpolation. Going beyond the observations is fraught with peril and is known as extrapolation. Nonetheless, extrapolation has important applications, such as projecting the federal deficit or necessary pension funding levels.

Since the discussion is on linear correlations and the predicted values need to be as close as possible to the data, the equation is called the best-fitting line or regression line. The regression line was named after the work Galton did on inherited characteristics that reverted (regressed) back to a mean value. That is, tall parents had children closer to the average.

Slope is an important concept, so we will review some important facts here.
In summary, if y = mx + b, then m is the slope and b is the y-intercept (i.e., the value of y when x = 0). Often linear equations are written in standard form with integer coefficients (Ax + By = C). Such relationships must be converted into slope-intercept form (y = mx + b) for easy use on the graphing calculator. One other form of an equation for a line is called the point-slope form and is as follows: y - y1 = m(x - x1). The slope, m, is as defined above, x and y are our variables, and (x1, y1) is a point on the line.

Special Slopes
It is important to understand the difference between positive, negative, zero, and undefined slopes. In summary, if the slope is positive, y increases as x increases, and the function runs "uphill" (going left to right). If the slope is negative, y decreases as x increases and the function runs downhill. If the slope is zero, y does not change and is thus constant: a horizontal line. Vertical lines are problematic in that there is no change in x. Thus our formula is undefined due to division by zero. Some will term this condition infinite slope, but be aware that we can't tell if it is positive or negative infinity! Hence the rather confusing term no slope is also in common usage for this situation.

An equation of a line can be expressed as y = mx + b or y = ax + b or even y = a + bx. As we see, the regression line has a similar equation. There are a wide variety of reasons to pick one equation form over another, and certain disciplines tend to pick one to the exclusion of the other. BE FLEXIBLE both on the order of the terms within the equation and on the symbols used for the coefficients! With the interdisciplinary nature of a lot of research these days, conflict between differing notations should be minimized.
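The conversion from standard form to slope-intercept form, together with the special-slope cases, can be sketched in a few lines. This is only an illustration: the function names `to_slope_intercept` and `describe` are hypothetical, the coefficients are assumed to be integers, and the degenerate case A = B = 0 is not handled.

```python
from fractions import Fraction

def to_slope_intercept(A, B, C):
    """Convert Ax + By = C to (m, b) for y = m*x + b.

    Returns None for a vertical line (B == 0), whose slope is undefined.
    """
    if B == 0:
        return None
    return Fraction(-A, B), Fraction(C, B)

def describe(line):
    """Classify a line returned by to_slope_intercept by its slope."""
    if line is None:
        return "vertical line, slope undefined"
    m, _ = line
    if m > 0:
        return "uphill"
    if m < 0:
        return "downhill"
    return "horizontal"

# 3x + 2y = 25 becomes y = (-3/2)x + 25/2, a downhill line.
m, b = to_slope_intercept(3, 2, 25)
print(m, b, describe((m, b)))

# x = 4 is vertical: division by zero is avoided by returning None.
print(describe(to_slope_intercept(1, 0, 4)))
```

Returning `None` for the vertical case mirrors the point made above: the slope formula is undefined there, so no number is reported.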
The y-intercept of the regression line is β₀ and the slope is β₁. The following formulas give the y-intercept and the slope of the equation:

β₀ = [Σy·Σx² − Σx·Σxy] ÷ [n·Σx² − (Σx)²]
β₁ = [n·Σxy − Σx·Σy] ÷ [n·Σx² − (Σx)²]

Notice that the denominators are the same, so that saves calculations. Also, the calculator will have values for certain portions. Another way to write the equation is in point-slope form, y − ȳ = β₁(x − x̄), where the centroid is the point that is always on the line. The centroid is the following ordered pair: (mean of x, mean of y).
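As a rough check of the intercept and slope calculation, the sketch below assumes the usual computational least-squares formulas, intercept = [Σy·Σx² − Σx·Σxy] ÷ [n·Σx² − (Σx)²] and slope = [n·Σxy − Σx·Σy] ÷ [n·Σx² − (Σx)²]. The function name `regression_from_sums` is hypothetical. The sums fed in (n = 5, Σx = 21, Σy = 7, Σx² = 115, Σxy = 14) are read off from the substitution in Solution 1 of the worked example in this lesson, not from the raw data points.

```python
def regression_from_sums(n, sum_x, sum_y, sum_x2, sum_xy):
    """Return (intercept, slope) of the least-squares line from summary sums."""
    denom = n * sum_x2 - sum_x ** 2   # shared denominator of both formulas
    if denom == 0:
        raise ValueError("all x values are equal; the slope is undefined")
    intercept = (sum_y * sum_x2 - sum_x * sum_xy) / denom
    slope = (n * sum_xy - sum_x * sum_y) / denom
    return intercept, slope

# Sums taken from the substitution shown in Solution 1.
n, sum_x, sum_y, sum_x2, sum_xy = 5, 21, 7, 115, 14
beta_0, beta_1 = regression_from_sums(n, sum_x, sum_y, sum_x2, sum_xy)

# The centroid (mean of x, mean of y) should lie on the fitted line.
mean_x, mean_y = sum_x / n, sum_y / n
centroid_gap = beta_0 + beta_1 * mean_x - mean_y

print(round(beta_0, 2), round(beta_1, 3), abs(centroid_gap) < 1e-12)
```

Computing the shared denominator once reflects the remark that the two formulas have the same denominator.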
There are certain guidelines for regression lines:
The y variable is often termed the criterion variable and the x variable the predictor variable. The slope is often called the regression coefficient and the intercept the regression constant. The slope can also be expressed compactly as β₁ = r × s_y/s_x, where s_x and s_y are the standard deviations of x and y. Normally we then predict values for y based on values of x. This still does not mean that y is caused by x. It is still imperative for the researcher to understand the variables under study and the context they operate under before making such an interpretation. Of course, simple algebra also allows one to calculate x values for a given value of y.

Example: Write the regression line for the following points:
Solution 1: With n = 5, Σx = 21, Σy = 7, Σx² = 115, and Σxy = 14, we have β₀ = [7·115 − 21·14] ÷ [5·115 − 21²] = 511 ÷ 134 = 3.81 and β₁ = [5·14 − 21·7] ÷ [5·115 − 21²] = −77 ÷ 134 = −0.575. Thus the regression line for this example is y = −0.575x + 3.81.

Solution 2: On your TI-83+ graphing calculator, enter the data into L1 and L2 and do a LinReg(ax+b) L1, L2 (STAT, CALC, 4) or LinReg(a+bx) L1, L2 (STAT, CALC, 8). You should get a screen showing the fitted coefficients a and b. There is no mathematical difference between the two linear regression forms LinReg(ax+b) and LinReg(a+bx); different professional groups simply prefer different notations. Preferred is perhaps too weak a word here. The calculator manufacturer included both forms since neither group was willing to compromise and use the other.

Note the presence on your TI-83+ graphing calculator of several other regression functions as well. Specifically, quadratic (y = ax² + bx + c), cubic (y = ax³ + bx² + cx + d), quartic (y = ax⁴ + bx³ + cx² + dx + e), exponential (y = ab^x), and power or variation (y = ax^b). Thus an easy way to find a quadratic through three points would be to enter the data in a pair of lists and then do a quadratic regression on the lists.

Least Squares Procedure
The method of least squares was first published in 1805 by Legendre. However, Gauss "communicated the whole matter to Olbers in 1802."

What is the Least Squares Property? The regression line is the line for which the sum of the squares of the residuals (the vertical distances between the observed y values and the values predicted by the line) is as small as possible.

Example: Find the Linear Regression line through (3,1), (5,6), (7,8) by brute force.
The residual at each point is y − (mx + b), so SSres = (1 − 3m − b)² + (6 − 5m − b)² + (8 − 7m − b)². Using the fact that (A + B + C)² = A² + B² + C² + 2AB + 2AC + 2BC, we can quickly find SSres = 101 + 83m² + 3b² − 178m − 30b + 30mb. This expression is quadratic in both m and b. We can rewrite it both ways and then find the vertex for each (which is the minimum since we are summing squares). Remember the vertex of y = ax² + bx + c is at x = −b/(2a). As a quadratic in b, SSres = 3b² + (30m − 30)b + (101 + 83m² − 178m), so the minimum is at b = −(30m − 30)/6 = 5 − 5m. As a quadratic in m, SSres = 83m² + (30b − 178)m + (101 + 3b² − 30b), so the minimum is at m = (178 − 30b)/166. Solving these two conditions together gives m = 1.75 and b = −3.75, so the regression line is y = 1.75x − 3.75.

Predicting Standard Scores
With some standard algebra it can be shown (Hinkle, page 129) that there is a direct (meaning the intercept is zero) relationship between standard y scores and standard x scores, with the correlation coefficient as the slope: the predicted standard y score is z_y = r × z_x.

Prediction Errors
Although we minimize the sum of the squared distances of the actual y scores from the predicted y scores (y'), there is a distribution of these distances or errors in prediction which is important to discuss. We will define these directed (signed) distances (residuals) as e = (y − y'), where y' is our predicted value. Clearly both positive and negative values occur, with a mean of zero. The variance of these errors is computed from the sum of the squared residuals, and its square root is the standard error of estimate.
When we consider multiple distributions it is often assumed that their standard deviations are equal. This property is called homoscedasticity. We often consider the conditional distribution or distribution of all y scores with the same value of x. If we assume these conditional distributions are all normal and homoscedastic, we can make probabilistic statements about the predicted scores. The standard deviation we use is the standard error calculated above.
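The properties of residuals and standard scores discussed above can be illustrated with the three points (3,1), (5,6), (7,8) from the brute-force example. This is a sketch under stated assumptions: the variable names are hypothetical, standard deviations use the n − 1 divisor, and the divisor for the standard error of estimate is deliberately left out because the lesson text does not state it here.

```python
import math

xs = [3, 5, 7]
ys = [1, 6, 8]
n = len(xs)

mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Sums of squares and cross-products about the means.
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

slope = sxy / sxx
intercept = mean_y - slope * mean_x
r = sxy / math.sqrt(sxx * syy)

# Residuals e = y - y', where y' is the predicted value.
predicted = [intercept + slope * x for x in xs]
residuals = [y - p for y, p in zip(ys, predicted)]
ss_res = sum(e ** 2 for e in residuals)

# Sample standard deviations (n - 1 divisor, an assumption of this sketch).
s_x = math.sqrt(sxx / (n - 1))
s_y = math.sqrt(syy / (n - 1))

# Standard scores: the predicted z_y should equal r * z_x at every point.
z_x = [(x - mean_x) / s_x for x in xs]
z_pred = [(p - mean_y) / s_y for p in predicted]

print(slope, intercept, residuals, ss_res)
```

The residuals sum to zero, and the slope agrees with r × s_y/s_x, which is the compact slope formula mentioned earlier in the lesson.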
What is the formula for regression equation X and Y?
The regression equation of Y on X, Y = a + bX, is used to estimate the value of Y when X is known. The regression equation of X on Y, X = c + dY, is used to estimate the value of X when Y is given. Here a, b, c and d are constants.
How do you find the regression line of Y on X and X on Y?
There are two lines of regression: that of Y on X and that of X on Y. The line of regression of Y on X is given by Y = a + bX, where a and b are unknown constants known as the intercept and slope of the equation. This is used to predict the unknown value of variable Y when the value of variable X is known.
What is regression coefficient formula?
For a line written as Y = aX + b (so that here a is the slope and b is the intercept), the regression coefficients are given by a = [n(∑xy) − (∑x)(∑y)] ÷ [n(∑x²) − (∑x)²] and b = [(∑y)(∑x²) − (∑x)(∑xy)] ÷ [n(∑x²) − (∑x)²].
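To see the two regression lines side by side, the sketch below applies the sum formulas above twice: once to regress Y on X and once, with the roles of the variables swapped, to regress X on Y. The function name `fit_line` and the sample points (3,1), (5,6), (7,8) are illustrative choices, not part of the answers above. The product of the two slopes should equal r².

```python
from fractions import Fraction

def fit_line(us, vs):
    """Regress v on u using the sum formulas; returns (slope, intercept)."""
    n = len(us)
    sum_u, sum_v = sum(us), sum(vs)
    sum_u2 = sum(u * u for u in us)
    sum_uv = sum(u * v for u, v in zip(us, vs))
    denom = n * sum_u2 - sum_u ** 2
    slope = Fraction(n * sum_uv - sum_u * sum_v, denom)
    intercept = Fraction(sum_v * sum_u2 - sum_u * sum_uv, denom)
    return slope, intercept

xs = [3, 5, 7]
ys = [1, 6, 8]

b_yx, a_yx = fit_line(xs, ys)   # Y on X: Y = a_yx + b_yx * X
b_xy, a_xy = fit_line(ys, xs)   # X on Y: X = a_xy + b_xy * Y

# The product of the two regression coefficients is r squared.
r_squared = b_yx * b_xy

# Both lines pass through the point of means (5, 5) for this data.
mean_x = Fraction(sum(xs), len(xs))
mean_y = Fraction(sum(ys), len(ys))

print(b_yx, a_yx, b_xy, a_xy, r_squared)
```

Both fitted lines pass through the point of means, which is why two given regression lines can be solved simultaneously to recover the means.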