Asymptotic distribution of t and M-W tests


Abhaya Indrayan

Nov 1, 2021, 12:47:59 AM
to MedS...@googlegroups.com
The theory suggests that the Welch test and Mann-Whitney (M-W) test are asymptotically (standard) normal irrespective of different n's. However, when I calculate these values for some inflammatory markers for differences between surviving and dead patients of COVID, for extremely large samples (n1>10000 and n2>1000), I find the values of Welch t and M-W z are very different. Both the distributions are highly skewed to the right and both are unimodal. The variances are very different. For example, for CRP, the values are as follows:
n1 = 12226, mean1 = 29.86, SD1 = 49.97, median1 = 8.61
n2 = 1038, mean2 = 74.98, SD2 = 86.39, median2 = 34.84
Welch t = 16.59 but M-W z = 22.29
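
As a quick check on the figures above, Welch's t can be recomputed from the summary statistics alone. The sketch below is only an illustration; the function name `welch_t` is made up here, and the sign convention (group 2 minus group 1) is an assumption.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic from summary statistics (group 2 minus group 1)."""
    # Unpooled standard error: each group contributes its own variance / n.
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean2 - mean1) / se

# CRP summary values quoted in the post.
t_crp = welch_t(29.86, 49.97, 12226, 74.98, 86.39, 1038)
print(round(t_crp, 2))
```

This agrees with the reported Welch t to two decimals, so the t value itself is not in doubt; the question is only why the M-W z differs from it.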

This is not an isolated case. I find similarly large differences between the Welch t and M-W z values for all the inflammatory markers.

What is it that I am missing? 

~Abhaya

--
Dr Abhaya Indrayan, 
Personal website: http://indrayan.weebly.com

Rakesh Biswas

Nov 1, 2021, 1:52:44 AM
to meds...@googlegroups.com
Could this happen due to disease progression or recovery? 

--
--
To post a new thread to MedStats, send email to MedS...@googlegroups.com .
MedStats' home page is http://groups.google.com/group/MedStats .
Rules: http://groups.google.com/group/MedStats/web/medstats-rules

---
You received this message because you are subscribed to the Google Groups "MedStats" group.
To unsubscribe from this group and stop receiving emails from it, send an email to medstats+u...@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/medstats/CAP7G4a6Y5mrxH%3D2AQaJ-4_5gr1Rs16NzrNzEvFb0G65VW3ythw%40mail.gmail.com.

Abhaya Indrayan

Nov 1, 2021, 2:59:58 AM
to meds...@googlegroups.com
Dr Biswas: Disease progression does not come into the picture. Whatever be the measurements and howsoever taken, the question is regarding asymptotic distribution of M-W test vis-a-vis Welch test.

Best.

~A. Indrayan
--
Dr Abhaya Indrayan, MSc,MS,PhD(OhioState),FSMS,FAMS,FRSS,FASc
Personal website: http://indrayan.weebly.com

John Whittington

Nov 1, 2021, 11:00:52 AM
to meds...@googlegroups.com
Hi Abhaya, I hope that all is well with you and yours.

I may be talking nonsense, but .... whilst, as you say,  theory indicates that the distribution of the test statistic is asymptotically Normal in both cases, they appear to be testing totally different hypotheses (see **), so I'm not sure that I would expect the value of the test statistic (hence the resulting 'p-value') to necessarily be the same (or even similar) in the two cases?

[ ** As I understand it, the Welch test is testing the null hypothesis that the means of the two groups are equal, whereas with the M-W test the null hypothesis is that the probability of a random value from one group being greater than a random value from the other group is equal to the probability of it being smaller (which seems to be a very different hypothesis). Is that not the case? ]

In other words, I am suggesting that, due to the hypotheses under test being different, the fact that both test statistics are asymptotically Normal does not mean that the tests themselves are necessarily 'asymptotically equivalent'.

However, as I said at the start, that may all be nonsense!!

Kind Regards,  John

John

----------------------------------------------------------------
Dr John Whittington,       Voice:    +44 (0) 1296 730225
Mediscience Services       Fax:      +44 (0) 1296 738893
Twyford Manor, Twyford,    E-mail:   Joh...@mediscience.co.uk
Buckingham  MK18 4EL, UK            
----------------------------------------------------------------

Abhaya Indrayan

Nov 1, 2021, 11:24:48 AM
to meds...@googlegroups.com
John: My point was that both asymptotically yield a z-value (standard normal deviate) and thus should give the same value. However, I may be absolutely wrong. Regards.

~Abhaya

Marc Schwartz

Nov 1, 2021, 11:42:41 AM
to meds...@googlegroups.com, Abhaya Indrayan
Abhaya,

Did you try to run a two sample t-test on the ranks of your data, rather than on the raw data, to compare the results with the M-W test?

Also, I would point out that the M-W test uses an adjustment in the calculation in the presence of ties in the ranks. That may be another source of the difference, depending upon the proportion of ties in the ranks of your data.

This paper may offer some additional insights:

Rank Transformations as a Bridge Between Parametric and Nonparametric Statistics
W. J. Conover and Ronald L. Iman
The American Statistician
Vol. 35, No. 3 (Aug., 1981), pp. 124-129 (6 pages)
https://www.jstor.org/stable/2683975
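
The comparison suggested above (a two-sample t-test on the ranks versus the M-W z) can be sketched as follows. This is an illustration only: the data are simulated log-normal values, not the CRP data, the helper names are made up, the t-test on ranks is the pooled (equal-variance) version, and the M-W z uses the tie-corrected variance without a continuity correction.

```python
import math
import random

def midranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def pooled_t(a, b):
    """Classical equal-variance two-sample t statistic (b minus a)."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    ss = sum((v - ma) ** 2 for v in a) + sum((v - mb) ** 2 for v in b)
    sp2 = ss / (na + nb - 2)
    return (mb - ma) / math.sqrt(sp2 * (1 / na + 1 / nb))

def mann_whitney_z(a, b):
    """Normal-approximation z from the rank sum of b (tie-corrected variance)."""
    na, nb = len(a), len(b)
    n = na + nb
    r = midranks(a + b)
    w = sum(r[na:])                      # rank sum of the second group
    centre = (n + 1) / 2                 # mean rank under the null
    var_w = na * nb / (n * (n - 1)) * sum((ri - centre) ** 2 for ri in r)
    return (w - nb * centre) / math.sqrt(var_w)

# Simulated right-skewed data with unequal group sizes (assumed, for illustration).
random.seed(2021)
x = [random.lognormvariate(2.0, 1.3) for _ in range(300)]
y = [random.lognormvariate(2.8, 1.3) for _ in range(100)]

n = len(x) + len(y)
ranks = midranks(x + y)
z = mann_whitney_z(x, y)
t_rank = pooled_t(ranks[:len(x)], ranks[len(x):])
t_raw = pooled_t(x, y)
print("t on raw data:", round(t_raw, 2))
print("t on ranks:   ", round(t_rank, 2))
print("M-W z:        ", round(z, 2))
```

With these definitions the pooled t on ranks is a monotone function of the tie-corrected z, namely t = z * sqrt((N - 2) / (N - 1 - z^2)), so the two stay close when N is large. The t on the raw data carries no such link to z.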

Regards,

Marc Schwartz

Rich Ulrich

Nov 1, 2021, 11:50:08 AM
to meds...@googlegroups.com
Yes, sorry, you are indeed "absolutely wrong". They are testing
different hypotheses, as John says. The M-W is sometimes described
as a test of medians (not means), but as Wikipedia says, "if both the
dispersions and shapes of the distribution of both samples differ,
the Mann-Whitney U test fails a test of medians."

I will offer a more subtle /failure of tests to converge/ - The chi-squared
test on contingency tables has a Likelihood formula and the more
common formula from Pearson.  These need not converge. (This came
up in a stats-group, many years ago.  Demos with Ns over 100 000
helped to make the point.)

What is different in their computations? - A large test score can show
multiple cells summing "moderate" differences, or from fewer cells
summing "large" differences ( consider (O-E)**2/E ).  One of them
(Pearson, IIRC) is more sensitive to "larger" differences.  Increasing
Ns for the same cell fractions will only increase the magnitude of both
scores, proportionately.
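
The contingency-table point above can be illustrated with a small sketch. The 2x2 counts are invented for the purpose, and the function name is made up; the code simply evaluates the Pearson statistic and the likelihood-ratio (G) statistic on the same cell fractions at increasing total N.

```python
import math

def pearson_and_g(table):
    """Return (Pearson X^2, likelihood-ratio G) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    x2 = 0.0
    g = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / total
            x2 += (obs - exp) ** 2 / exp          # the (O-E)**2/E term
            if obs > 0:
                g += 2 * obs * math.log(obs / exp)
    return x2, g

base = [[30, 10], [20, 40]]  # hypothetical counts, for illustration only
results = {}
for scale in (1, 10, 1000):
    table = [[scale * cell for cell in row] for row in base]
    results[scale] = pearson_and_g(table)
    x2, g = results[scale]
    print(scale, round(x2, 2), round(g, 2), round(x2 / g, 4))
```

Both statistics grow in proportion to the scale factor, so their ratio stays fixed and the gap between them does not shrink as N increases.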

--
Rich Ulrich


Alan Kimber

Nov 1, 2021, 11:50:49 AM
to meds...@googlegroups.com
Hi
I'm new to this group, so sorry if I'm missing the point somewhere, but there seem to be two issues.
First, just because two test statistics have the same (asymptotic) distribution, doesn't mean that they must always take the same value with a specific data set.
For example, suppose W has a standard normal distribution. Then by symmetry -W also has a standard normal distribution. If W=w, then -W=-w. So, unless w=0, W and -W will take different values.
Likewise we know that a t-test is more affected by outliers than MW. So an extreme observation may greatly affect the value of the t-statistic but affect the MW statistic much less.
Secondly, the null distributions may be standard normal, but surely the non-null distributions are not. If your data are essentially from the non-null region, then you wouldn't expect to see standard normality.
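
The point about outliers can be seen in a toy example. The data below are invented, the helper names are made up, and the M-W z uses the simple no-ties variance without a continuity correction (the toy data contain no ties).

```python
import math

def welch_t(a, b):
    """Welch's t statistic (b minus a) from raw data."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((v - ma) ** 2 for v in a) / (na - 1)
    vb = sum((v - mb) ** 2 for v in b) / (nb - 1)
    return (mb - ma) / math.sqrt(va / na + vb / nb)

def mw_z(a, b):
    """M-W z via the U count of pairs with b > a; assumes no ties."""
    na, nb = len(a), len(b)
    u = sum(1 for s in a for t in b if t > s)
    return (u - na * nb / 2) / math.sqrt(na * nb * (na + nb + 1) / 12)

x = [float(i) for i in range(1, 16)]      # 1, 2, ..., 15
y = [i + 0.5 for i in range(6, 21)]       # 6.5, 7.5, ..., 20.5
y_out = y[:-1] + [2000.0]                 # largest value replaced by an extreme one

t_before, z_before = welch_t(x, y), mw_z(x, y)
t_after, z_after = welch_t(x, y_out), mw_z(x, y_out)
print("before:", round(t_before, 2), round(z_before, 2))
print("after: ", round(t_after, 2), round(z_after, 2))
```

Replacing the largest value by an extreme one leaves every rank unchanged, so the M-W z is identical, while the inflated variance pulls the Welch t down sharply.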
I hope this helps. 
With best wishes
Alan



John Whittington

Nov 1, 2021, 1:19:37 PM
to meds...@googlegroups.com
At 15:50 01/11/2021, Alan Kimber wrote:
I'm new to this group, so sorry if I'm missing the point somewhere, but there seem to be two issues.

Welcome to the group.  As you will probably be aware, I am one of the group's managers, and I look forward to seeing your contributions here!


First, just because two test statistics have the same (asymptotic) distribution, doesn't mean that they must always take the same value with a specific data set.

As you realise, that was my very point.

Let's face it, even if one is considering only one test, it can often be used to test any number of hypotheses.  If the nature of the test is such that the test statistic has an asymptotically Normal distribution, that will always be the case, but the value of the test statistic (with a given data set) will obviously depend upon what hypothesis is being tested - and, as I said, the Welch and M-W tests are testing different hypotheses.

Kind Regards,
John

Simon, Stephen D.

Nov 1, 2021, 6:37:33 PM
to meds...@googlegroups.com

Both are asymptotically N(0, 1), but only under the null hypothesis. If the null hypothesis is false, the distribution would most certainly not be N(0, 1). A test statistic that was approximately N(0, 1) under the alternative hypothesis would have almost no power.

I'm guessing that the distributions would be non-central something or other and that the non-centrality parameter would be different for the two tests.
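
One way to see what each statistic is centred on when the null is false: the Welch t is roughly the mean difference divided by its standard error, while the M-W z is roughly (theta - 1/2) times sqrt(12 * n1 * n2 / (N + 1)), where theta is the probability that a random value from group 2 exceeds one from group 1. The sketch below backs out the theta implied by the reported z. It assumes the reported z used the no-ties variance and no continuity correction, which the original post does not state.

```python
import math

n1, n2 = 12226, 1038
z_mw = 22.29                      # M-W z reported for CRP

# Under the no-ties normal approximation, z = (theta_hat - 0.5) * scale.
scale = math.sqrt(12 * n1 * n2 / (n1 + n2 + 1))
theta_hat = 0.5 + z_mw / scale
print(round(scale, 1), round(theta_hat, 3))
```

Under these assumptions the implied estimate of theta is about 0.71. The Welch t of 16.59 and the M-W z of 22.29 are therefore standardised versions of two different effect measures (a mean difference and a probability of superiority), so there is no reason for them to coincide.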

Abhaya Indrayan

Nov 1, 2021, 9:38:57 PM
to meds...@googlegroups.com
Thanks for the responses. The major one is the difference in the null hypothesis for the two tests. I will examine how this affects the test-statistic values despite both being asymptotically N(0,1) under the null. 

Nice to see this group active again.

~Abhaya

John Whittington

Nov 1, 2021, 10:39:38 PM
to meds...@googlegroups.com
Abhaya - at least I received this post from you. Goodness knows why I didn't receive (directly) your earliest ones.

Yes, as I said before, I think that's the point.  If two tests are testing different hypotheses, then there's no reason why the test statistics should necessarily be even remotely similar, even if they both have the same asymptotic distribution.

To take a very simple example, if you had one test with the null hypothesis of mean=X and another with the null hypothesis of median=X, with the distribution of the test statistics being asymptotically Normal in both cases, and you had a large set of data whose distribution was very asymmetrical, you would expect to get very different test statistics (and corresponding p-values) with the two tests, wouldn't you?
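
The mean=X versus median=X example can be sketched with simulated data. Everything here is an assumption made for illustration: the data are exponential with rate 1 (mean 1, median ln 2, about 0.69), X is set to 1, the mean is tested with a one-sample t statistic, and the median with the large-sample sign-test z.

```python
import math
import random

random.seed(1)
n = 10000
x0 = 1.0   # hypothesised value X; equals the true mean but not the true median
data = [random.expovariate(1.0) for _ in range(n)]

# One-sample t statistic for H0: mean = x0.
mean = sum(data) / n
sd = math.sqrt(sum((v - mean) ** 2 for v in data) / (n - 1))
t_mean = (mean - x0) / (sd / math.sqrt(n))

# Sign-test z for H0: median = x0 (count of values above x0 is Binomial(n, 1/2)).
above = sum(v > x0 for v in data)
z_median = (above - n / 2) / math.sqrt(n / 4)

print("t for mean = X:  ", round(t_mean, 2))
print("z for median = X:", round(z_median, 2))
```

Both statistics are asymptotically N(0, 1) under their own null hypotheses, yet on the same skewed data the first stays near zero while the second is far out in the tail, because only the first null is true.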

Kind Regards,
John



Abhaya Indrayan

Nov 2, 2021, 12:43:26 AM
to meds...@googlegroups.com
John: You and others certainly have a point regarding different nulls. But this needs to be examined in more detail.

I have experienced a similar problem regarding not getting some mails in the trail while all replies are in the trail. I guess there is some issue with the Google group mails.

Regards.

~Abhaya