This article is rated C-class on Wikipedia's content assessment scale.
I have found use for the "pooled within-group covariance matrix." See for example http://people.revoledu.com/kardi/tutorial/LDA/Numerical%20Example.html or "Analyzing Multivariate Data" by Lattin, Carroll, and Green. This page seems a natural place to put a definition for such a thing, but it doesn't exactly fit into the flow of the page. Suggestions? Otherwise I'll just jump in at a future date. dfrankow (talk) 16:54, 29 December 2008 (UTC)
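A minimal sketch of one common definition of the pooled within-group covariance matrix, assuming each group is an array with one observation per row. The function name `pooled_within_group_cov` and the sample data are made up for illustration. This version divides the summed within-group scatter by N - G (total observations minus number of groups); some presentations weight the group covariances by n_g/N instead, so check which convention your source uses.

```python
import numpy as np

def pooled_within_group_cov(groups):
    """Pooled within-group covariance of a list of (n_g x p) arrays.

    Each group's scatter matrix is computed about that group's own mean,
    the scatters are summed, and the total is divided by N - G.
    """
    groups = [np.asarray(g, dtype=float) for g in groups]
    p = groups[0].shape[1]
    scatter = np.zeros((p, p))
    n_total = 0
    for g in groups:
        centered = g - g.mean(axis=0)      # center on the group's own mean
        scatter += centered.T @ centered   # within-group scatter
        n_total += g.shape[0]
    return scatter / (n_total - len(groups))

# Hypothetical data: two groups, two variables each.
group_a = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 5.0]])
group_b = np.array([[6.0, 1.0], [7.0, 2.0], [8.0, 2.0], [9.0, 4.0]])
pooled = pooled_within_group_cov([group_a, group_b])
```

With this convention the result equals the (n_g - 1)-weighted average of the per-group sample covariance matrices.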
This article starts with a big mess of slashes and sigmas and brackets... not sure if this is just my browser rendering something in a strange manner (though I'm using firefox, so I expect I'm not the only one seeing this) or quite what it is. I don't know how to fix it either; maybe somebody else does?
—The preceding unsigned comment was added by 129.169.10.56 (talk) 16:32, 5 December 2006 (UTC).
This formula looks very complicated at first sight. Actually, its derivation is quite simple (for simplicity we assume μ to be 0; just replace X everywhere by X - μ if you want):
We are almost finished. Of course, for every unit vector you get (in general) different values, so you do not have just one number as in the scalar case but a whole bunch of numbers (a continuum) parametrised by the unit vectors in n dimensions (actually only the direction counts; u and -u give the same value). Now comes the big trick: we do not have to keep this infinity of numbers, because, as you can see below, all the information is contained in the covariance matrix (wow!)
Now because u is a constant we have:
or
and we are done... (easy, isn't it :)
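A sketch of the projection argument this comment appears to describe, under its stated assumptions (μ = 0, u a constant unit vector). The symbol Σ for the covariance matrix and the exact ordering of the steps are assumptions, not a reconstruction of the original formulas.

```latex
\begin{align}
\operatorname{Var}(u^{\mathsf T}X)
  &= \operatorname{E}\!\left[(u^{\mathsf T}X)^2\right]
   = \operatorname{E}\!\left[u^{\mathsf T}X\,X^{\mathsf T}u\right] \\
  &= u^{\mathsf T}\,\operatorname{E}\!\left[XX^{\mathsf T}\right]u
   = u^{\mathsf T}\,\Sigma\,u .
\end{align}
```

So the variance of the data in any direction u is the quadratic form of Σ at u, which is why the single matrix Σ carries the information for all directions at once.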
I have moved the comments above to this discussion page for several reasons. The assertion that this very simple formula looks "very complicated" seems exceedingly silly and definitely not NPOV. Then whoever wrote it refers to "its derivation". That makes no sense. It is a definition, not an identity or any sort of proposition. What proposition that author was trying to derive is never stated. The writing is a model of unclarity. Michael Hardy 22:52 Mar 12, 2003 (UTC)
Okay, I'll try to explain the idea more clearly:
If you have some set of vector measurements, you can consider it as a cloud of points in n dimensions. If you want to find something interesting about your data set, you can look at the data from different directions or, what is essentially the same, perform a projection into 1, 2, or 3 dimensions. But there are many possible projections; which one to take? Life is not long enough to try them all ;-) One criterion you can apply (not the best one, but it is better than nothing and it works sometimes...) is to look at directions for which the data have large variance (this makes sense if you want to find the most "energetic" components...). I tried to explain that the covariance matrix is a tool (or at least can be interpreted this way) to represent the data variances in all possible directions in an effective and compact way. Once you understand this, it is immediately clear why it is useful to look for the eigenvalues and eigenvectors of the covariance matrix.
I can not expect to see your version of this ! ;-)
I have inserted into the article some language that I think addresses your point, which I still think was quite unclear as you wrote it originally. Michael Hardy 19:53 Mar 14, 2003 (UTC)
PS: "One criterion is ....", but some other criteria may exist. (My point here is that in standard English, "criterion" is the singular.) Michael Hardy 19:54 Mar 14, 2003 (UTC)
does not explain anything, and actually makes it more complicated because as you said this depends on the basis...
which version they find more illuminating... At the moment it is hardly possible because my version is quite hidden :-(
I think that it is better to explain the things one defines. The above explanations helped me more to understand the topic than the mathematically absolutely correct formulas one finds when looking for "covariance matrix". In my opinion, most of the people who use Wikipedia are interested in both versions, so they should see them on the same page and not hidden in the discussion group.
People, label yourself in your comments so we know who is talking. Also be a little more specific about what you are pointing at. There are too many "that", "I", and "you" for it to be clear who is talking about what. I suggest the four-tilde signature so the date is included. SEWilco 17:11, 15 Jan 2004 (UTC)
(SEWilco 08:39, 7 Jul 2004 (UTC)) Maybe something like this will be useful:
I've never encountered the usage of for denoting the covariance matrix. I've always used for this (and for autocorrelation matrix). This is standard notation practice in the field of signal processing. --Fredrik Orderud 12:26, 20 Apr 2005 (UTC)
Standard notation:
ALSO standard notation:
ALSO standard notation:
Unfortunately the first two of these usages jar with each other. The first and third are in perfect harmony. The first notation is found in William Feller's celebrated two-volume book on probability, which everybody is familiar with, so it's surprising that some people are suggesting it first appeared on Wikipedia. It's also found in some statistics texts, e.g., Sanford Weissberg's linear regression text. Michael Hardy 18:05, 28 Apr 2005 (UTC)
This article was starting to become an exemplar of crackpothood. Someone who apparently didn't like the opening paragraphs, instead of replacing them with other material, simply put the other material above those opening paragraphs, so that the article started over again, saying, in a later paragraph, "In statistics, a covariance matrix is ..." etc., and giving the same elaborate definition again, with stylistic differences. And that eventually became the second of FOUR such iterations, with stylistic differences! Other things were wrong too. Why, for example, was there a "stub" notice?? There should have been a "cleanup" notice right at the top, instead of a "stub" notice at the bottom. Inline TeX often gets misaligned or appears far too big, or both, on many browsers, but it looks as if someone went through and replaced perfectly good-looking non-TeX inline mathematical notation with TeX (e.g., "an n × n matrix" ---> "an matrix"). (TeX generally looks very good when "displayed", however. And when TeX is used in the normal way, as opposed to its use on Wikipedia, there's certainly no problem with inline math notation.) Using lower-case letters for random variables is jarring, since in many cases one wants to write such things as
and it is crucial to be careful about which of the "x"s above are capital and which are lower-case. The cleanup isn't finished yet .... Michael Hardy 19:16, 28 Apr 2005 (UTC)
I added the list of properties in the article. There were only two of them stated, and these properties should definitely be in an article about cov and var matrices. --Steffen Grønneberg 14:06, 5 October 2005 (UTC)
Somehow I find it difficult to understand the 5th property:
Shouldn't the correct formula be:
, since no relationship between "X" and "Y" is defined? Does anyone have a reference on this? --Fredrik Orderud 12:01, 11 October 2005 (UTC)
Yep, my bad. Fixed it now. The reference I used is Multivariate Analysis by K. V. Mardia, J. T. Kent, J. M. Bibby. It's in chap. 3. (http://www.amazon.com/gp/product/0124712525/103-2355319-3731041?v=glance&n=283155&n=507846&s=books&v=glance) Is there a place where it's usual to cite this? --Steffen Grønneberg 23:10, 12 October 2005 (UTC)
Before I go make a fool of myself, shouldn't A and B be q x p matrices instead of p x q matrices? --The imp 15:32, 15 March 2006 (UTC)
I think the A and B matrices have the proper dimensions, but I changed the description of to a p x 1 vector (from q x 1) and changed to a q x 1 vector (from a p x 1). I did this based on:
If I'm wrong, my apologies - feel free to correct it back to the original. —Preceding unsigned comment added by 64.22.160.1 (talk) 21:37, 8 May 2008 (UTC)
I don't see how var(X,Y) and cov(X,Y) "conflict". We don't say that gamma(x) and x! or asin and arcsin "conflict" or "jar". It is perfectly OK to have var(X)=cov(X)=cov(X,X). The fact that there is a one-argument function cov doesn't exclude there being a two-argument, related, function. Consider, say, Γ(x)=Γ(x,0). --Macrakis 21:51, 21 December 2005 (UTC)
It would be nice to have a section on computational methods for calculating covariance matrices. I am unfortunately not competent to write it.... Any volunteers? --Macrakis 21:51, 21 December 2005 (UTC)
    public Matrix CovarianceMatrix(double[,] myArray)
    {
        // Rows of myArray are observations; columns are the time series (TS).
        // For n TS there are n*(n-1) off-diagonal covariances, n*(n-1)/2 of them distinct.
        // Cov(X, X) = Var(X)
        // Cov(P, Q) = Cov(Q, P)
        // vcvMatrix is symmetric square, with Var(i) on the leading diagonal.
        // vcvMatrix is positive semi-definite (should I include a safety test??)
        // Cov(P, Q) is NOT unitless; its units are those of P times those of Q.
        // Matrix and mean(...) are assumed to be defined elsewhere;
        // mean(myArray) is taken to return the per-column means.
        int nCols = myArray.GetLength(1);
        int nRows = myArray.GetLength(0);
        Matrix vcvMatrix = new Matrix(nCols, nCols);
        double[] u = mean(myArray);

        for (int i = 0; i < nCols; i++) // rows of the vcvMatrix
        {
            for (int j = 0; j < nCols; j++) // cols of the vcvMatrix
            {
                double temp = 0;
                double covar = 0;
                for (int z = 0; z < nRows; z++)
                {
                    // accumulate products of deviations from the column means
                    temp += (myArray[z, i] - u[i]) * (myArray[z, j] - u[j]);
                }
                covar = temp / (nRows - 1); // sample covariance (divide by N - 1)
                vcvMatrix[i, j] = covar;
            }
        }

        if (!vcvMatrix.Symmetric)
        {
            throw new ApplicationException("VCV matrix is not symmetric ");
        }
        return vcvMatrix;
    }
Tashiro (talk) 16:43, 30 September 2012 (UTC)
What are the restrictions on what matrices can be covariance matrices? I guess the matrix has to be symmetric; is any symmetric matrix a possible covariance matrix? --Trovatore 23:11, 19 June 2006 (UTC)
In this definition, 'expected value operator' (mu) is used. Per my Excel program explanation of covariance, mu is just the average. Isn't it simpler to just say that mu is just the average value of X rather than the 'expected value'? --The preceding unsigned comment was added by Steve 10-Jan-07 71.121.7.79 (talk) 03:41, 11 January 2007 (UTC).
Yes, but "average" is quite ambiguous. It could be a weighted mean, the geometric mean, the median, or lots of other things. The "expected value" is a standard term for the equally-weighted arithmetic mean. LachlanA 00:30, 22 January 2007 (UTC)
You need to add what the typical E function is that is used in practice. That is, you don't divide by N, but N-1 typically. I frankly think using expectation is a mistake, as it makes this article more difficult for the newbie than it needs to be. Sure, it may be more general, but that doesn't mean more helpful, clear, or useful. —Preceding unsigned comment added by 71.111.251.229 (talk) 05:53, 2 March 2008 (UTC)
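A small sketch of the N versus N - 1 distinction raised here, assuming data stored with one observation per row. The array `data` is made up for illustration. NumPy's `np.cov` divides by N - 1 by default and by N when `bias=True`.

```python
import numpy as np

# Hypothetical data: 4 observations (rows) of 2 variables (columns).
data = np.array([[2.0, 8.0],
                 [4.0, 6.0],
                 [6.0, 7.0],
                 [8.0, 3.0]])

n = data.shape[0]
centered = data - data.mean(axis=0)   # subtract the column means

cov_unbiased = centered.T @ centered / (n - 1)   # sample estimate, divides by N - 1
cov_biased = centered.T @ centered / n           # divides by N

# The same two estimates via NumPy's built-in routine.
cov_np_default = np.cov(data, rowvar=False)
cov_np_biased = np.cov(data, rowvar=False, bias=True)
```

The two estimates differ only by the scalar factor N/(N - 1); the expectation-based definition in the article describes the population quantity that both of them estimate.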
"In statistics and probability theory, the covariance matrix is a matrix of covariances between elements of a vector. It is the natural generalization to higher dimensions of the concept of the variance of a scalar-valued random variable.
If X is a column vector with n scalar random variable components, and μk is the expected value of the kth element of X, i.e., μk = E(Xk), then the covariance matrix is defined as:"
I guess many people, like me, come to visit this article because they want to do some kind of statistical analysis within their studies or other related work. Of course I can only speak for myself, but the explanation that a covariance matrix is "a matrix of covariances" did not really help. Also, the fact that it's a natural generalization to higher dimensions of some concept didn't improve my understanding (and looking at the other comments I'm apparently not alone). Maybe somebody could add an introductory explanation or even a section in the text where this concept is explained for the uninitiated. 84.168.17.109 10:43, 7 February 2007 (UTC)
Agree, this article needs to be made more accessible.67.180.143.83 19:49, 9 February 2007 (UTC)
Apart from mathematica fans, no one - imho - would really find the Wolfram link useful. I don't see it as my place to modify it (in case there ARE lots of Mathematica fans), but perhaps the next Admin who reads this could ponder my suggestion Wrude bouie 14:02, 7 May 2007 (UTC)
Although the Wolfram site can be useful, it seems as though this article should point to at least a few good texts which discuss covariance matrices, their properties, their applications, in full glory. I personally only have seen them in one text, which was not a very good text for the ins and outs of covariance matrices (van Kampen); I am unenthusiastically adding it to the Further Reading section. Does anyone have anything better?
Also, it seems as though at least a little bit more development on the Wikipedia page would be nice: e.g., the mathematical and physical meaning of the "generalized variance" (determinant of Cov Matrix).
Alex Roberts 19:46, 20 May 2007 (UTC)
The van Kampen is a very good book, but very advanced and not really about statistics, more about random processes in molecules. Much better for this topic would be "Mathematical Statistics and Data Analysis" by John A. Rice or "Introduction to Mathematical Statistics" by P. G. Hoel, which is old but excellent. —Preceding unsigned comment added by AidanTwomey (talk • contribs) 09:14, 28 March 2008 (UTC)
You may be interested in this discussion on the relationship between the covariance matrix and the moment-of-inertia matrix. —Ben FrantzDale (talk) 20:29, 27 August 2008 (UTC)
The section "Which matrices are covariance matrices" mentions the existence of the matrix square root of a covariance matrix. Is there a name for this? It seems like "standard deviation matrix" or something similar would be appropriate. That is, aren't the eigenvalues of the standard deviations in the principal directions and so if you wanted to draw a "confidence ellipsoid", you'd transform a sphere by , just like you might draw confidence intervals at ±σ for a 1-D distribution? —Ben FrantzDale (talk) 14:37, 4 September 2008 (UTC)
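A numerical sketch of the idea in this comment, assuming a made-up 2 x 2 positive-definite covariance matrix `sigma`. The symmetric square root is built from the eigendecomposition; its eigenvalues are the square roots of the eigenvalues of `sigma`, that is, the standard deviations along the principal directions, and it maps the unit circle onto the "one-sigma" ellipse.

```python
import numpy as np

# Hypothetical covariance matrix (symmetric, positive definite).
sigma = np.array([[4.0, 1.2],
                  [1.2, 1.0]])

# Symmetric square root via the eigendecomposition sigma = V diag(l) V^T.
eigvals, eigvecs = np.linalg.eigh(sigma)
sqrt_sigma = eigvecs @ np.diag(np.sqrt(eigvals)) @ eigvecs.T

# Map points on the unit circle through the square root.
theta = np.linspace(0.0, 2.0 * np.pi, 200)
circle = np.vstack([np.cos(theta), np.sin(theta)])   # shape (2, 200)
ellipse = sqrt_sigma @ circle                        # points on the 1-sigma ellipse

# Every mapped point x satisfies x^T sigma^{-1} x = 1.
mahalanobis_sq = np.einsum("ij,ij->j", ellipse, np.linalg.solve(sigma, ellipse))
```

This only illustrates the geometry; it does not settle whether "standard deviation matrix" is an established name.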
Is the covariance matrix a tensor? That is, does it transform as a tensor? It appears so. --Ben FrantzDale (talk) 19:08, 26 November 2008 (UTC)
How does the Hessian of the log likelihood function for a zero-mean multivariate normal distribution relate to the covariance matrix? They appear to be matrix inverses of each other, but Wikipedia has no mention of it. —Ben FrantzDale (talk) 22:11, 29 December 2008 (UTC)
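A short worked sketch of the relationship asked about here, assuming the Hessian is taken with respect to the observation x (for a normal distribution the same matrix results when differentiating with respect to the mean parameter). Σ is assumed invertible.

```latex
\begin{align}
\ell(x) &= \ln p(x)
        = -\tfrac{1}{2}\,x^{\mathsf T}\Sigma^{-1}x
          - \tfrac{1}{2}\ln\det(2\pi\Sigma), \\
\nabla_x \ell(x) &= -\Sigma^{-1}x, \\
\nabla_x^2\, \ell(x) &= -\Sigma^{-1}.
\end{align}
```

So the negative Hessian of the log density is the precision matrix, which is the inverse of the covariance matrix.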
Is there any reference to the fact that covariance matrices for complex random vectors are Hermitian positive definite, could they be Hermitian positive semidefinite? Does anybody have a citation or at least an explanation of why this is? Most of the books in probability only deal with real vectors and when they do talk about complex, they don't go into stating the properties of the covariance matrix, just define the special case when vectors are complex. So, please, anybody with some insight about this, please. Felipe (talk) 19:41, 7 July 2010 (UTC)
Hmmm.. As far as I know the product X * Transpose(X) is not defined if X is a column matrix. It should be the opposite: Transpose(X) * X, shouldn't it? Yet, throughout the article, the first "notation" is used. —Preceding unsigned comment added by 217.211.151.32 (talk) 14:19, 6 March 2011 (UTC)
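A quick shape check of the two products mentioned in this comment, assuming X is stored as an n x 1 column. Both products are defined: X Xᵀ is the n x n outer product (the form used for the covariance matrix), while Xᵀ X is the 1 x 1 inner product. The vector `x` is made up for illustration.

```python
import numpy as np

x = np.array([[1.0], [2.0], [3.0]])   # 3 x 1 column vector

outer = x @ x.T   # (3 x 1)(1 x 3) -> 3 x 3 matrix
inner = x.T @ x   # (1 x 3)(3 x 1) -> 1 x 1 matrix

outer_shape = outer.shape
inner_shape = inner.shape
```

The inner product equals the trace of the outer product, so it corresponds to the sum of the diagonal entries rather than to the full matrix.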
Hi all,
Since I am not a registered user, I write here:
The definition of a covariance matrix as it is now suggests that X1 etc. are single elements, as the article starts with a random variable explanation. This is misleading in the rest of the article as well, since X1 being a single element is not true.
Note that COV(X,Y) is the sum over all i in X and Y (Xi-mean(X))*(Yi-mean(Y)). Please note that the number of elements within X and Y has to be the same.
In plain English: If you have n observations from 2 variables X and Y then there is just one covariance. If you have n observations of m variables (i.e. your data has m dimensions), you will have m!/(2(m-2)!) covariances.
To have variances and covariances conveniently in one place, the variance-covariance matrix displays all those covariances and variances in one matrix.
I think this is a very important article being looked up a lot and it should be corrected.
A good reference on the topic, with a crystal clear explanation, can be found here: http://users.ecs.soton.ac.uk/hbr03r/pa037042.pdf
Cheers Ben —Preceding unsigned comment added by 129.31.217.14 (talk) 23:05, 12 April 2011 (UTC)
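A tiny check of the count m!/(2(m-2)!) quoted in the comment above: it is the binomial coefficient "m choose 2", which simplifies to m(m-1)/2, the number of entries strictly above the diagonal of an m x m matrix. The function name is made up for illustration.

```python
from math import comb, factorial

def n_distinct_covariances(m):
    """Number of distinct off-diagonal covariances among m variables."""
    return factorial(m) // (2 * factorial(m - 2))

m = 4
count = n_distinct_covariances(m)   # pairs of distinct variables
n_variances = m                     # diagonal entries of the m x m matrix
```

Together with the m variances on the diagonal, these fill the symmetric m x m variance-covariance matrix.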
If the article mentioned this, yes; unfortunately it does not mention that it's taken over k=1,..kmax, and even if it did, it would remain very confusing to use the same letters for vectors and scalars. --Preceding unsigned comment added by 129.31.216.148 (talk) 16:07, 13 April 2011 (UTC)
Not sure if this is the same problem, but I find the definition very confusing. X is referred to as a "column vector." It would therefore seem to be a vector of scalars, e.g. [1 2 3 4 5]. Xi would then be the ith element of this vector. It would seem, then, that is the mean of this ith element. Which can't be right. The definition seems to be using Xi ambiguously, referring in one context to the ith scalar in a vector of scalars, and in another context as a vector of scalars, namely, the vector of (say) observed values of a variable quantity (e.g., height) in a given sample. Is this correct? If so, is there a way to re-write the definition in a standard notation that would remove the ambiguity? That would be helpful. — Preceding unsigned comment added by 76.100.128.83 (talk) 19:45, 2 December 2013 (UTC)
There are now some articles about matrix-valued random variables: e.g. matrix normal distribution, matrix t distribution. Some of these refer to a covariance matrix for these matrix random variables. It would be good to have something here about how such a thing is defined, and if there is a standard way of doing this. Clearly one can re-arrange the matrix into a vector and get a covariance matrix for this, but there must be a standard for working by rows or columns etc., and possibly a different way of treating symmetric=matrix-random-variables. Melcombe (talk) 12:26, 5 July 2012 (UTC)
The article states "The inverse of this matrix, Σ−1 is the inverse covariance matrix, also known as the concentration matrix or precision matrix;[1] see precision (statistics). The elements of the precision matrix have an interpretation in terms of partial correlations and partial variances.[citation needed]"
It's vague as to what is meant by "an interpretation". If the elements are the partial correlations, then please just say so. I'm not certain myself, but it seems that this indeed is the interpretation given by users of graphical models, as in Friedman, Hastie, & Tibshirani, 2007:
"The basic model for continuous data assumes that the observations have a multivariate Gaussian distribution with mean μ and covariance matrix Σ. If the ijth component of Σ−1 is zero, then variables i and j are conditionally independent, given the other variables."
161.130.188.94 (talk) 15:46, 14 April 2014 (UTC)Joe Hilgard
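A sketch of the usual relationship between the precision matrix and partial correlations, assuming the standard formula ρ_ij·rest = -P_ij / sqrt(P_ii P_jj), where P is the inverse of the covariance matrix. The 3 x 3 matrix `sigma` is made up: variables 1 and 3 are marginally correlated, yet the (1,3) entry of the precision matrix is zero.

```python
import numpy as np

# Hypothetical covariance matrix with entries 0.5 ** |i - j|.
sigma = np.array([[1.0, 0.5, 0.25],
                  [0.5, 1.0, 0.5],
                  [0.25, 0.5, 1.0]])

precision = np.linalg.inv(sigma)

# Partial correlation of i and j given all other variables.
d = np.sqrt(np.diag(precision))
partial_corr = -precision / np.outer(d, d)
np.fill_diagonal(partial_corr, 1.0)   # convention: a variable with itself
```

So the off-diagonal elements of Σ⁻¹ are not themselves partial correlations but rescaled, sign-flipped versions of them; a zero entry corresponds to zero partial correlation, which under a Gaussian model is the conditional independence in the quoted passage.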
The current Wikipedia article on "Covariance Matrix" has a section on its "properties". It would be useful to know how many of these properties are relevant and correct for the sample covariance matrix. The sample covariance matrix could be regarded as the covariance matrix of a population consisting exactly of the sample, except for the 1/(n-1) factor in the estimator. Is that difference important for "properties"?
Tashiro (talk) 19:15, 30 November 2014 (UTC)
It seems to me that the second paragraph in the introduction is about something other than what I understand by "covariance matrix". By "covariance matrix" I understand a matrix where the (i,j)th element is the covariance between variables X_i and X_j, but in this paragraph there are two sets of variables and the (i,j)th element is the covariance between X_i and Y_j. This is different from a covariance matrix in many ways, and in particular, it need not be symmetric, positive definite or even square.
From a brief glance the rest of the article seems to be in line with what I'd expect. This includes the first and third paragraphs of the introduction, which thus seem to contradict the second one. Is this use of multiple definitions intentional or should it be changed? Nathaniel Virgo (talk) 05:42, 10 October 2016 (UTC)
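A sketch of the distinction drawn in this comment, assuming simulated data with 3 X-variables and 2 Y-variables (all names and the data-generating choices are made up). The covariance matrix of X is square, symmetric and positive semi-definite; the cross-covariance matrix between X and Y here is 3 x 2, so it cannot be symmetric.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=(n, 3))                       # 3 X-variables
y = np.column_stack([x[:, 0] + rng.normal(size=n),
                     x[:, 1] - x[:, 2]])          # 2 Y-variables built from X

xc = x - x.mean(axis=0)
yc = y - y.mean(axis=0)

cov_xx = xc.T @ xc / (n - 1)         # covariance matrix of X: 3 x 3
cross_cov_xy = xc.T @ yc / (n - 1)   # cross-covariance of X and Y: 3 x 2

min_eig_xx = np.linalg.eigvalsh(cov_xx).min()   # non-negative up to rounding
```

The cross-covariance block is what appears in the off-diagonal corner when X and Y are stacked into a single vector and the ordinary covariance matrix of the stacked vector is computed.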
I am not familiar with the E operator being used for expectation value, but that may follow from my background in physics rather than statistics. Even so, there's a structural problem with the E notation. It's not really an operator because nothing is actually being operated on. X_i itself doesn't contain any information about its own mean value, so structurally, E(X_i) is a single token variable that represents some known mean value of X_i. It certainly doesn't look like a single token, and that can lead to serious confusion on the part of the reader while peppering the page with numerous E's. Wolfram Alpha uses , which I also find (slightly less) confusing for the same reason.
I suggest we use the , which clearly indicates that is a different variable than and that you can't get by doing , which is tempting and wrong. It also greatly shortens several formulae. This was standard notation in my mathematical education, and I'm sure I could dig up a few textbooks that use it.
Please discuss before we make any changes. Acronymsical (talk) 15:57, 3 August 2017 (UTC)
Section "Which matrices are covariance matrices?" reads:
"From the symmetry of the covariance matrix's definition it follows that only a positive-semidefinite matrix can be a covariance matrix."
This is false: symmetry (even with nonnegative diagonal entries) does not imply positive semi-definiteness, e.g.
.
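One illustrative counterexample, checked numerically. The matrix below is a hypothetical choice and not necessarily the one originally given in this comment: it is symmetric with positive diagonal entries, yet it has a negative eigenvalue, so it cannot be a covariance matrix.

```python
import numpy as np

# Hypothetical candidate: symmetric, diagonal entries positive.
candidate = np.array([[1.0, 2.0],
                      [2.0, 1.0]])

eigenvalues = np.linalg.eigvalsh(candidate)   # returned in ascending order

# If this were a covariance matrix, w^T C w would be Var(X1 - X2) for w = (1, -1).
w = np.array([1.0, -1.0])
quadratic_form = w @ candidate @ w
```

Here the quadratic form is negative, which would mean a negative variance for X1 - X2.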
The correct justification for the positive semi-definiteness of covariance matrices is:
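A sketch of the standard argument, in notation assumed here (w an arbitrary constant vector, μ = E[X], Σ the covariance matrix of X); it may differ in form from the proof referred to in this comment.

```latex
\begin{align}
w^{\mathsf T}\Sigma\,w
  &= w^{\mathsf T}\operatorname{E}\!\left[(X-\mu)(X-\mu)^{\mathsf T}\right]w \\
  &= \operatorname{E}\!\left[\bigl(w^{\mathsf T}(X-\mu)\bigr)^{2}\right]
   = \operatorname{Var}\!\left(w^{\mathsf T}X\right) \;\ge\; 0 .
\end{align}
```

Since this holds for every w, Σ is positive semi-definite; symmetry alone does not give this.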
I made this change. I found the proof in SE[1], but I don't know if I should cite SE. Also see 3.21 in Wasserman. — Preceding unsigned comment added by 2001:16A2:87EF:4900:782C:159B:2B2D:753B (talk) 14:48, 22 May 2018 (UTC)
References
The typography of the Covariance matrix#Block matrices section is inconsistent with the other sections, e.g. Covariance matrix#Definition. I see three ways how to restore consistency:
1. Edit bold and italic fonts in Covariance matrix#Block matrices to be consistent with Eq. 1, which is
Eq.1 |
2. Unfortunately, the typography in above equation itself is inconsistent since K is a matrix and μ is a vector, so they should be bold and roman (and T also should be roman):
Eq.1 |
3. However, subscripts X should be regarded as names rather than vectors, since using a vector as a subscript is an ill-defined concept. Therefore these subscripts should not be bold:
Eq.1 |
I think convention 3 is the best, but most Wikipedia articles on statistics use convention 2. Before I add a couple of sections to this article (see User_talk:Sandstein#Covariance_mapping_redirection) I would like to correct the typography but do not want to make it inconsistent with other articles. What do you think? Is it important to keep it consistent across several articles? Or only within any given article? Or is it OK to mix conventions even within the same article? FizykLJF (talk) 18:59, 3 August 2019 (UTC)
As described in the previous post, I have added a section on "Covariance mapping" and corrected some inconsistent typography and notation in other sections. I declare a possible conflict of interest: I am an expert in this research area and the author of reference 9. FizykLJF (talk) 14:05, 18 August 2019 (UTC)
The section links to a definition of "data matrices" where a data matrix is defined to contain one experiment per row. But in the same sentence two data matrices are defined with one experiment per column. This is confusing. The easiest solution would be to just remove the link to the "data matrix" article. A better solution might be to actually _use_ the "one experiment per row" definition, and adapt the formulas in the section on estimation accordingly. Ernstkl (talk) 09:57, 12 July 2021 (UTC)
Happy to be corrected here, but what is the reference for the definition currently (02 Mar 22) for the "autocorrelation matrix" as E[XX^T]? I have never seen this before. My understanding is that the autocorrelation matrix is the same as the "correlation matrix" given in the next subsection. E[XX^T] is just the non-centred (auto-)covariance matrix. I'm not aware it has a name, and it's not terribly useful by itself. I suggest "Relation to the autocorrelation matrix" is removed or completely replaced. Wanted to see reactions before I attempt.
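For reference, a sketch of how the non-centred second-moment matrix E[XXᵀ] relates to the covariance matrix, assuming μ = E[X] and writing Σ for the covariance matrix. Whether E[XXᵀ] should be called the "autocorrelation matrix" is the naming question raised above and is not settled by this identity.

```latex
\begin{align}
\Sigma
  &= \operatorname{E}\!\left[(X-\mu)(X-\mu)^{\mathsf T}\right]
   = \operatorname{E}\!\left[XX^{\mathsf T}\right] - \mu\mu^{\mathsf T}, \\
\operatorname{E}\!\left[XX^{\mathsf T}\right]
  &= \Sigma + \mu\mu^{\mathsf T}.
\end{align}
```

The two matrices coincide exactly when μ = 0.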
The variable rho is used without definition in section "Inverse of the covariance matrix." Jaguarmountain (talk) 17:43, 30 July 2023 (UTC)