# "chi-square" | "chi-squared" | "chi squared" | "chi square" ?


As it's Friday ..

and I also really want to clean up help files and similar R documents, both in R's own sources and in my new 'DPQ' CRAN package:

As a trained mathematician, I'm uneasy if a thing has several easily confusable names, .. but as a somewhat humanistically educated person, I know that natural languages, English in this case, are much more flexible than computer languages or math...

Anyway, back to the question(s) .. which I had asked myself a couple of months ago, and on which I already remained slightly undecided:

The 0-th (meta-)question of course is

    0. Is it worth using only one written form for the χ²-distribution, e.g. "everywhere" in R?

The answer is not obvious, as already the first few words of the (English) Wikipedia clearly convey:

The URL is https://en.wikipedia.org/wiki/Chi-squared_distribution and the main title therefore also

    "Chi-squared distribution"

Then it reads

> This article is about the mathematics of the chi-squared distribution. For its uses in statistics, see chi-squared test. For the music [...]

> In probability theory and statistics, the chi-square distribution (also chi-squared or χ²-distribution) with k degrees of freedom is the distribution of a sum of the squares of k independent standard normal random variables.

> The chi-square distribution is a special case of the gamma distribution and is one of the most widely used probability distributions in inferential statistics, notably in hypothesis testing [........]
> [........]

So, in the title and 1st paragraph it's "chi-squared", but then everywhere(?) the text uses "chi-square".

Undoubtedly, Wilson & Hilferty (1931) has been an important paper and they use "Chi-square" in the title; also Johnson, Kotz & Balakrishnan (1995) (see R's help page ?pchisq) use "Chi-square" in the title of chapter 18 and then, diplomatically for chapter 29, "Noncentral χ²-Distributions" as title.

So it seems that, historically and using prestigious sources, "chi-square" dominates (notably if we do not count "χ²" as an alternative).

Things look a bit different when I study R's sources; on one hand, I find all 4 forms (see Subject); then in the "R source history", I see

    $ svn log -c11342
    ------------------------------------------------------------------------
    r11342 | <....> | 2000-11-14 ...

    Use `chi-squared'.
    ------------------------------------------------------------------------

which changed 16 (if I counted correctly) cases of 'chi-square' to 'chi-squared'.

I have not found any R-core internal (or public) reasoning about that change, but had kept it in mind and often worked along that "goal".

As a consequence, "statistically" speaking, much of R's own use has been standardized to use "chi-squared"; but as I mentioned, I still find all 4 variants even in "R base" package help files (which of course I could now change quite quickly, using Emacs M-x grep plus a script); but

... "as it is Friday" ... I'm interested to hear what others think, notably if you are native English (or "American" ;-) speaking and/or have some extra good knowledge on such matters...

Martin Maechler
ETH Zurich

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

## Re: "chi-square" | "chi-squared" | "chi squared" | "chi square" ?

Dear Martin,

Others struggle with this inconsistency as well; I found this discussion useful:
https://math.stackexchange.com/questions/1098138/chi-square-or-chi-squared

Denes

In reply to this post by Martin Maechler

I have the vague impression that "chi-squared" is more common in British usage and "chi-square" more common in American usage. I'm pretty sure that either is acceptable, although "chi-squared" sounds much better to my ear.

Of course within a given document (or collection of related documents) consistency is mandatory.

cheers,

Rolf
## Re: "chi-square" | "chi-squared" | "chi squared" | "chi square" ?

In reply to this post by Dénes Tóth

I have thought about this one myself, and just reading the posts and links has afforded me a more informed viewpoint. My guess is that it boils down to a contest between mathematics and prosody. To speakers of English, "square" in the mathematical sense implies the active form, as in "I square this number". When our focus is on the number itself, it is usually expressed in the passive: "This number has been squared".

My suggestion is that while "chi-square" may be more correct in the derivation of the statistic, "chi-squared" is more consistent with colloquial usage in using the passive form. It may also avoid confusion with the use of "square" as a noun, where the preceding word is often an adjective (e.g. "a red square").

Jim

## Re: "chi-square" | "chi-squared" | "chi squared" | "chi square" ?

On Sat, 19 Oct 2019, Jim Lemon wrote:

> My suggestion is that while "chi-square" may be more correct in the derivation of the statistic, "chi-squared" is more consistent with colloquial usage in using the passive form.

Jim,

This is a cogent suggestion that's pragmatic and defensible.

Thank you,

Rich

In reply to this post by Martin Maechler

What a delightful question. Bill Cochran discussed this in class one day about 50 years ago. He said the British usage (which I think he said was chi-squared, as is consistent with the other memories in this thread) is what he learned and previously used. But he had been in the US for so long that he was now using the American preference (chi-square).

Rich

## Re: [EXTERNAL] Re: "chi-square" | "chi-squared" | "chi squared" | "chi square" ?

oh my... I'd like to see the statistics on it before jumping to a conclusion that the American preference is "chi-square" and the British preference is "chi-squared". I don't see that at all.

------

In keeping with the pronunciation of x^2 and 3^2, maybe "chi-squared" makes the most sense.

The "chi-square"? Because the iterated dentals in "chi-squared distribution" and "chi-squared test" are a little cumbersome to pronounce, an even slightly lazy pronunciation would sound like "chi-square distribution" and "chi-square test". There's no need to write it that way though.

-Dan

--
Dan Dalthorp, PhD
USGS Forest and Rangeland Ecosystem Science Center
Forest Sciences Lab, Rm 311
3200 SW Jefferson Way
Corvallis, OR 97331
ph: 541-750-0953
[hidden email]