In areas of research that face the problem of structural underidentification, it is helpful to be aware of the geometry of rank deficient models. What does the geometry of rank deficient models look like? How does constrained regression work? Why do some constraints not work? The geometry shows that there are some things we know about all possible solutions when using rank deficient models.
For example, in the rank-deficient-by-one situation, the OLS solutions to the normal equations all lie on a line in multidimensional space. We can describe this line explicitly: the line is identified. The constraint we use, whether it is implicit or explicitly chosen, determines one of the points on this line and, thus, one of the infinite number of least squares solutions.
Our choice is, of course, subject to error; it is no better than the choice of the constraint used to select that solution. This fact should keep researchers modest in their claims for solutions based on constrained regression. In each of the four cases of linear dependency discussed above, the matrix of independent variables is one less than full column rank since only two of the independent variables are linearly independent. Adding the third independent variable means that one of the three variables can be determined perfectly from the other two.
This three variable model has a rank of two and is rank deficient by one. Because of this linear dependency, no unique solution exists. One way to obtain a solution, however, is to impose a constraint on the possible solutions such as constraining the math test effect on GPA to be half as great as that of the verbal test effect. The constraints are often based on theory or past research. That is, the researcher has some reason to believe that math skills as measured by the test should be less important to the overall GPA than verbal skills as measured by the test.
Justifying that the math effect should be one-half as large as the verbal test effect requires a precision not often found in social research. Less theoretically, we can obtain a solution by using any appropriate generalized inverse. This identifies the model, but the solutions depend on the constraint employed (or the generalized inverse used), and different constraints can provide widely divergent results.
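The dependence of the solution on the constraint can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the simulated scores, the third predictor `total` (taken to be verbal plus math, which is one way to produce the linear dependency), and the variable names. It compares the minimum-norm solution from the Moore-Penrose generalized inverse with the solution obtained under the constraint that the math effect is half the verbal effect.

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed, illustrative data only
n = 200
verbal = rng.normal(size=n)
math_score = rng.normal(size=n)
total = verbal + math_score             # assumed exact linear dependency
gpa = 0.6 * verbal + 0.3 * math_score + rng.normal(scale=0.5, size=n)

# Put all variables in deviation-score form so the intercept drops out.
X = np.column_stack([verbal, math_score, total])
X = X - X.mean(axis=0)
y = gpa - gpa.mean()

rank_X = np.linalg.matrix_rank(X)       # two, not three: rank deficient by one

# Solution A: Moore-Penrose generalized inverse (minimum-norm solution).
# An explicit cutoff keeps the near-zero singular value from being inverted.
b_pinv = np.linalg.pinv(X, rcond=1e-10) @ y

# Solution B: impose b_math = 0.5 * b_verbal by substitution.
# y = b_verbal * (verbal + 0.5 * math) + b_total * total
Z = np.column_stack([X[:, 0] + 0.5 * X[:, 1], X[:, 2]])
g, *_ = np.linalg.lstsq(Z, y, rcond=None)
b_con = np.array([g[0], 0.5 * g[0], g[1]])

print("rank of X:", rank_X)
print("generalized inverse solution:", b_pinv)
print("constrained solution:        ", b_con)
print("same fitted values:", np.allclose(X @ b_pinv, X @ b_con))
```

Both coefficient vectors reproduce the same fitted values, so the data cannot choose between them; only the constraint does.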
Others have written on the geometry of generalized inverses or related topics, but this paper provides a unique, and more intuitive, view. It emphasizes the geometry of the solution space (not the construction of a generalized inverse), it does so from the row perspective (using row equations) rather than a column perspective (using column vectors), and it emphasizes the null space and the hyperspace of solutions that is parallel to the null space.
The method used is straightforward.
I begin with simple spaces of one, two, and three dimensions. I then extend this approach to situations with four or more dimensions. Understanding this geometry takes some effort even in the one-, two-, and three-dimensional situations and, obviously, more effort as we move to the geometry of four or more dimensions. To simplify, I will deal throughout with the normal equations associated with Ordinary Least Squares (OLS) regression, since this is the situation most familiar to readers.
I begin with the simplest situation, the bivariate case. We subtract the mean of the independent variable from each of the independent variable scores and the mean of the dependent variable from each of the dependent variable scores.
This leaves us with deviation scores and allows us to consider only the one regression coefficient between these two variables, since the intercept is zero. In this situation there is only one normal equation. With one independent and one dependent variable, only two quantities are needed to find the regression coefficient: the sum of squares for the independent variable and the sum of products for the independent and dependent variables.
In this two-variable situation there is one normal equation,

$$\left(\sum x^2\right) b = \sum xy, \qquad (1)$$

yielding the familiar solution $b = \sum xy / \sum x^2$. Using matrix algebra, we write this same equation as $X'Xb = X'y$. The prime means that the column vector has been transposed, in this case into a row vector. When we carry out the matrix multiplications, we end up with a single equation: equation (1). For concreteness, we create values for $\sum x^2$ and $\sum xy$ and place them into (1). Geometrically, the solution space has only one dimension, $b$, and equation (1) allows us to solve for a unique point on this line.
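As a small illustration of the bivariate case, the sketch below computes the single regression coefficient from the two quantities named above. The five pairs of scores are made up for this example and are not the values used in the paper.

```python
import numpy as np

# Hypothetical raw scores (illustrative only).
x_raw = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_raw = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

# Deviation scores: subtract each variable's mean.
x = x_raw - x_raw.mean()
y = y_raw - y_raw.mean()

sum_x2 = np.sum(x ** 2)      # sum of squares of the independent variable
sum_xy = np.sum(x * y)       # sum of products of x and y

# The one normal equation: sum_x2 * b = sum_xy.
b = sum_xy / sum_x2          # 0.6 for these made-up scores

# The same equation in matrix form, X'X b = X'y, with X a single column.
X = x.reshape(-1, 1)
b_matrix = np.linalg.solve(X.T @ X, X.T @ y)[0]

print(b, b_matrix)
```

With one equation and one unknown, the solution is a single point on the one-dimensional line of possible values of b.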
Equation (1) determines where on that one dimension of possible values of $b$ the solution lies. We extend this method by moving to the situation with two independent variables. We again center the variables by subtracting their means, so that all of the variables are in deviation score form.
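We distinguish between the two independent variables by subscripting them with a one or a two: $x_1$ or $x_2$. From an algebraic perspective the quantities of interest are $\sum x_1^2$, $\sum x_2^2$, $\sum x_1 x_2$, $\sum x_1 y$, and $\sum x_2 y$. Formulas from introductory texts that cover multiple regression allow one to place these quantities into formulas and solve for the two regression coefficients. The matrix algebra representation remains the same, $X'Xb = X'y$, but now the $X$ matrix contains two columns (one for each of the independent variables) and $n$ rows (one for each of the observations).
The vector $b$ has two elements (one for the regression coefficient for the first independent variable and one for the second independent variable). We write out the explicit matrix form of the equations using the sums of squares and cross-products:

$$\begin{bmatrix} \sum x_1^2 & \sum x_1 x_2 \\ \sum x_1 x_2 & \sum x_2^2 \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} = \begin{bmatrix} \sum x_1 y \\ \sum x_2 y \end{bmatrix} \qquad (2)$$

Carrying out the matrix multiplication in (2), we can write the two normal equations:

$$\begin{aligned} \left(\sum x_1^2\right) b_1 + \left(\sum x_1 x_2\right) b_2 &= \sum x_1 y \\ \left(\sum x_1 x_2\right) b_1 + \left(\sum x_2^2\right) b_2 &= \sum x_2 y \end{aligned} \qquad (3)$$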
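The construction of the two normal equations can be sketched in a few lines. The scores below are hypothetical and chosen only so that the two independent variables are not linearly dependent; the sketch forms the sums of squares and cross-products as $X'X$ and $X'y$ and solves the resulting two-equation system.

```python
import numpy as np

# Hypothetical raw scores (illustrative only).
x1_raw = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x2_raw = np.array([2.0, 1.0, 4.0, 3.0, 6.0])
y_raw = np.array([3.0, 4.0, 8.0, 9.0, 13.0])

# Deviation-score form: n rows, one column per independent variable.
X = np.column_stack([x1_raw - x1_raw.mean(), x2_raw - x2_raw.mean()])
y = y_raw - y_raw.mean()

XtX = X.T @ X    # [[sum x1^2, sum x1x2], [sum x1x2, sum x2^2]]
Xty = X.T @ y    # [sum x1y, sum x2y]

# Each row of XtX, paired with the matching entry of Xty, is one normal equation.
for row, rhs in zip(XtX, Xty):
    print(f"{row[0]:.2f} * b1 + {row[1]:.2f} * b2 = {rhs:.2f}")

# With full rank there is exactly one pair (b1, b2) satisfying both equations.
b = np.linalg.solve(XtX, Xty)
print("b1, b2 =", b)
```

`np.linalg.solve` succeeds here because $X'X$ has full rank; it raises an error when the matrix is exactly singular, which is the rank deficient situation.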
We again supply some appropriate values for the sums of squares and products [$\sum x_1^2$, $\sum x_2^2$, $\sum x_1 x_2$, $\sum x_1 y$, $\sum x_2 y$] and, placing these into (3), produce a set of two normal equations that could result from real data; these are equations (4). Geometrically, the solution space has two dimensions: one for $b_1$ and one for $b_2$. The normal equations in (4) are equations for lines, and if these two lines intersect in a point in this two-dimensional solution space, that point determines a unique solution to this two-equation system.
This is depicted in Figure 1. The horizontal axis represents the solutions for $b_1$ and the vertical axis the solutions for $b_2$. We construct the two lines based on the equations in (4) in the following manner. Using the first equation, if $b_2 = 0$ then $b_1 = 2$, so that one of the points on the line is (2, 0).
On the other hand, if $b_1 = 0$ then $b_2 = 4$, and a second point on this first line is (0, 4); these two points allow us to draw this first line in the two-dimensional solution space. For the second equation, if we set $b_1 = 0$ then $b_2 = 1$, so a second point on that line is (0, 1). This allows us to construct the second line. These two lines intersect at (1, 2); that is, $b_1 = 1$ and $b_2 = 2$. This is the geometric view of the solution to the normal equations with two independent variables.
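The intersection can be checked numerically from the points named in the text alone. This sketch does not use the coefficients of equations (4): it assumes the first line is the one through (2, 0) and (0, 4) and the second is the one through (0, 1) and (1, 2), so it confirms only that those stated points are mutually consistent. The helper `line_through` is a hypothetical name, and a line's equation is determined only up to a scale factor.

```python
import numpy as np

def line_through(p, q):
    """Return (a1, a2, c) such that a1*b1 + a2*b2 = c passes through p and q."""
    (p1, p2), (q1, q2) = p, q
    a1 = q2 - p2
    a2 = p1 - q1
    c = a1 * p1 + a2 * p2
    return a1, a2, c

# Points taken from the text; each point is (b1, b2).
first = line_through((2.0, 0.0), (0.0, 4.0))
second = line_through((0.0, 1.0), (1.0, 2.0))

# Stack the two row equations and solve for the point they share.
A = np.array([first[:2], second[:2]])
c = np.array([first[2], second[2]])
b1, b2 = np.linalg.solve(A, c)

print("intersection:", (b1, b2))   # approximately (1, 2)
```

Each row of `A` is one line in the $(b_1, b_2)$ plane, which is the row perspective: the solution is the single point lying on every row equation at once.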
This geometric picture of two lines meeting at a point is likely familiar to most readers, albeit from a different context. The solution is where the lines representing the two equations intersect: (1, 2). Imagine instead the situation in which the two equations are linearly dependent, as in equations (5), where the second equation is one-half times the first equation. There is no unique solution to these equations. When we substitute the second equation's value for $b_1$ into the first equation and solve for $b_2$, we obtain a rather uninformative result, since $b_2$ could take on any value. We say that $b_2$ is not identified.
If we substitute the value of $b_2$ from the second equation into the first equation, we find that $b_1$ is likewise not identified. Geometrically, we can plot the first equation as before and end up with the line for equation 1 in Figure 1.
When we plot the second line, we find that it crosses the $b_2$ axis at (0, 4) and the $b_1$ axis at (2, 0). That is, the lines for these two equations coincide. Any solutions to these equations lie on this line.
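A brief sketch makes the coincident-lines case concrete. The coefficients below are an assumption: they are one pair of equations in which the second is one-half times the first and both cross the axes at (2, 0) and (0, 4), and they may differ from the paper's equations (5) by a scale factor. The function name `solution` and the parameter `t` are likewise illustrative.

```python
import numpy as np

# Assumed dependent system: 4*b1 + 2*b2 = 8 and 2*b1 + 1*b2 = 4.
A = np.array([[4.0, 2.0],
              [2.0, 1.0]])
c = np.array([8.0, 4.0])

# Rank one: the two row equations describe the same line.
rank_A = np.linalg.matrix_rank(A, tol=1e-10)
rank_aug = np.linalg.matrix_rank(np.column_stack([A, c]), tol=1e-10)

# One particular solution (the minimum-norm point on the line).
b_p = np.linalg.pinv(A, rcond=1e-10) @ c

# Null-space direction: the right singular vector for the zero singular value.
_, _, Vt = np.linalg.svd(A)
v = Vt[-1]

def solution(t):
    """A point on the solution line: particular solution plus t times the null vector."""
    return b_p + t * v

print("rank of A:", rank_A, "rank of [A | c]:", rank_aug)
for t in (-2.0, 0.0, 3.0):
    b = solution(t)
    print(t, b, np.allclose(A @ b, c))
```

Every value of `t` gives a point that satisfies both equations, so the line of solutions is known exactly even though no single point on it is picked out without a further constraint.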