Structuring data for Correspondence Analysis

Structuring data for Correspondence Analysis

Alfredo-2
Hi, I am very new to R and need your help to do a correspondence
analysis, because I don't know how to structure the following data:

Thank you.

Alfredo

 

library(ca,lib.loc=folder)
table <- read.csv(file="C:\\Temp\\Survey_Data.csv", header=TRUE, sep=",")
head (table, n=20)

         Preference  Sex    Age   Time
1    News/Info/Talk    M  25-30  06-09
2         Classical    F    >35  09-12
3   Rock and Top 40    F  21-25  12-13
4              Jazz    M    >35  13-16
5    News/Info/Talk    F  25-30  16-18
6      Don't listen    F  30-35  18-20
...
19  Rock and Top 40    M  25-30  16-18
20   Easy Listening    F    >35  18-20

 

In SAS I would simply do this:

proc corresp data=table dim=2 outc=_coord;
   table Preference, Sex Age Time;
run;

 

I don't know how to convert a data frame to a frequency table in R so that
I can run this function properly:

ca <- ca(<frequency table>, graph=FALSE)



______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Structuring data for Correspondence Analysis

jholtman
I am not familiar with SAS, so what do you want your output to look like?
There is the 'table' function that might do the job and then there is
always 'dplyr' which can do the hard stuff.  So we need more information on
what you want.
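
As a minimal sketch of the 'table' suggestion: the data frame `survey` below is a hypothetical stand-in built from the six rows shown in the original post (the real file is not available here). Base R's table() cross-tabulates two of its columns into a frequency table.

```r
# Toy data: the first six rows shown in the original post (an assumption,
# standing in for the real CSV file).
survey <- data.frame(
  Preference = c("News/Info/Talk", "Classical", "Rock and Top 40",
                 "Jazz", "News/Info/Talk", "Don't listen"),
  Sex  = c("M", "F", "F", "M", "F", "F"),
  Age  = c("25-30", ">35", "21-25", ">35", "25-30", "30-35"),
  Time = c("06-09", "09-12", "12-13", "13-16", "16-18", "18-20")
)

# Two-way frequency table: rows = Preference, columns = Sex.
pref_by_sex <- table(survey$Preference, survey$Sex)
print(pref_by_sex)
```

The result is a contingency table of counts, which is the kind of input a correspondence analysis function usually expects.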

Jim Holtman
*Data Munger Guru*


*What is the problem that you are trying to solve? Tell me what you want to
do, not how you want to do it.*


On Fri, Mar 29, 2019 at 6:35 AM Alfredo <[hidden email]>
wrote:

> [...]

Re: Structuring data for Correspondence Analysis

John Kane-3
Hi Alfredo,
I have not used SAS nor done a correspondence analysis in many years,
but to give R-help readers an idea of what you are doing, we probably
need a short statement of the substantive problem that would lead to
the SAS program:
proc corresp data=table dim=2 outc=_coord;
   table Preference, Sex Age Time;

I believe that there are several packages in R that will do a
correspondence analysis (For one see
https://www.statmethods.net/advstats/ca.html). Have you checked out
any of the packages? If so which one are you thinking of using?

Next, we need to see some sample data. Have a look at these two links
that may help you give us more information on the problem and what you
are looking for.  It is important to supply some sample data. It does
not have to be much.  The very best way to supply the sample data is
to use the dput() function that you will find described in the links.

http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example

 http://adv-r.had.co.nz/Reproducibility.html
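
A minimal sketch of the dput() advice, assuming a small hypothetical data frame named `survey` in place of the real survey file: dput() prints R code that recreates the object, so readers can paste it straight into their own session.

```r
# Hypothetical three-row sample standing in for the real data.
survey <- data.frame(
  Preference = c("News/Info/Talk", "Classical", "Rock and Top 40"),
  Sex  = c("M", "F", "F"),
  Age  = c("25-30", ">35", "21-25"),
  Time = c("06-09", "09-12", "12-13")
)

# Print a text representation that can be pasted into an email.
dput(survey)

# Round trip: rebuild the object from that text, as a reader would.
txt     <- paste(capture.output(dput(survey)), collapse = "\n")
rebuilt <- eval(parse(text = txt))
print(all.equal(survey, rebuilt))
```

For a large data set, dput(head(survey, 20)) keeps the posted sample short.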

On Fri, 29 Mar 2019 at 17:35, jim holtman <[hidden email]> wrote:

> [...]



--
John Kane
Kingston ON Canada


Re: Structuring data for Correspondence Analysis

Michael Friendly
I think something like xtabs(~ Preference + Sex, data=table) will get you
started. With 3+ variables, you are probably looking for an MCA
or simple CA using the stacked approach.
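
A hedged sketch of building the frequency table directly from a data frame: base table() has no data argument, but the formula interface of xtabs() does. The data frame `survey` is a hypothetical toy example, not the poster's file.

```r
# Hypothetical toy data with three of the survey's variables.
survey <- data.frame(
  Preference = c("Jazz", "Classical", "Jazz", "Rock and Top 40"),
  Sex        = c("M", "F", "F", "M"),
  Age        = c(">35", ">35", "25-30", "21-25")
)

# Two-way table: variables are looked up inside `survey`.
pref_by_sex <- xtabs(~ Preference + Sex, data = survey)
print(pref_by_sex)

# Three-way table; ftable() flattens it for printing.
pref_sex_age <- xtabs(~ Preference + Sex + Age, data = survey)
print(ftable(pref_sex_age))
```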

Your SAS table statement,

table Preference, Sex Age Time;

treats Preference vs. all combinations of Sex, Age & Time.  This
corresponds to a loglinear model asserting Preference is jointly
independent of the other three.
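
A rough sketch of a table with Preference in the rows and every observed combination of Sex, Age and Time in the columns, using base R's interaction(). The toy data frame `survey` is an assumption copied from the rows shown in the original post.

```r
# Toy data: the first six rows shown in the original post (an assumption).
survey <- data.frame(
  Preference = c("News/Info/Talk", "Classical", "Rock and Top 40",
                 "Jazz", "News/Info/Talk", "Don't listen"),
  Sex  = c("M", "F", "F", "M", "F", "F"),
  Age  = c("25-30", ">35", "21-25", ">35", "25-30", "30-35"),
  Time = c("06-09", "09-12", "12-13", "13-16", "16-18", "18-20")
)

# One column factor whose levels are the observed Sex.Age.Time combinations;
# drop = TRUE discards combinations that never occur.
combo <- interaction(survey$Sex, survey$Age, survey$Time, drop = TRUE)

# Rows = Preference, columns = combinations of the other three variables.
pref_by_combo <- table(survey$Preference, combo)
print(dim(pref_by_combo))
```

With real data the number of columns grows quickly, so many cells may be empty; the stacked approach keeps the table much smaller.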

See the vignette for the vcdExtra package for this kind of thing more
generally.

install.packages("vcdExtra")
browseVignettes("vcdExtra")

See my book, Discrete Data Analysis with R, http://ddar.datavis.ca/

best,
-Michael

On 3/29/2019 9:35 AM, Alfredo wrote:

> [...]


--
Michael Friendly     Email: friendly AT yorku DOT ca
Professor, Psychology Dept. & Chair, ASA Statistical Graphics Section
York University      Voice: 416 736-2100 x66249 Fax: 416 736-5814
4700 Keele Street    Web:   http://www.datavis.ca  |  @datavisFriendly
Toronto, ONT  M3J 1P3 CANADA


Re: Structuring data for Correspondence Analysis

Alfredo-2
Hi Michael et al,

I solved it myself by simply running the code below.

Thanks anyway for the answers.

Alfredo

 

 

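library(ca)  # provides ca()

t <- read.csv(file="C:\\Temp\\radio_survey.csv", header=TRUE, sep=",")

# Cross-tabulate Preference against each of the other variables
t1 <- table(t$Preference, t$Sex)
t2 <- table(t$Preference, t$Age)
t3 <- table(t$Preference, t$Time)

# Stack the three two-way tables side by side (rows = Preference)
ct <- cbind(t1, t2, t3)

ca <- ca(ct)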
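
For readers without the CSV file, here is a self-contained sketch of the same stacked approach on toy data. The data frame `survey` and the object names are hypothetical; nd = 2 is assumed to play the role of dim=2 in the SAS call, and the ca step runs only if the ca package is installed.

```r
# Toy data: the first six rows shown in the original post (an assumption).
survey <- data.frame(
  Preference = c("News/Info/Talk", "Classical", "Rock and Top 40",
                 "Jazz", "News/Info/Talk", "Don't listen"),
  Sex  = c("M", "F", "F", "M", "F", "F"),
  Age  = c("25-30", ">35", "21-25", ">35", "25-30", "30-35"),
  Time = c("06-09", "09-12", "12-13", "13-16", "16-18", "18-20")
)

# One two-way table per column variable, all with Preference in the rows.
t_sex  <- table(survey$Preference, survey$Sex)
t_age  <- table(survey$Preference, survey$Age)
t_time <- table(survey$Preference, survey$Time)

# Stacked table: the column blocks sit side by side.
stacked <- cbind(t_sex, t_age, t_time)
print(stacked)

# Simple CA on the stacked table, keeping two dimensions.
if (requireNamespace("ca", quietly = TRUE)) {
  fit <- ca::ca(stacked, nd = 2)
  print(fit)
}
```

Each respondent is counted once in every block, so the grand total of the stacked table is the number of respondents times the number of column variables.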

 

 

 

 

-----Original Message-----
From: Michael Friendly <[hidden email]>
Sent: Saturday, March 30, 2019 16:52
To: Alfredo <[hidden email]>; [hidden email]
Subject: Re: Structuring data for Correspondence Analysis

 

[...]