data transformation

Adrian Johnson-6
Dear group,
My question is perhaps more of a statistical question using R.
I have a data matrix (400 x 400, normally distributed) with data
points ranging from -1 to +1.
For certain clustering algorithms, I suspect the tight data range is
not helping to resolve the clusters.

Is there a way to transform the data, something similar to logit, where
I dont lose normality of the data and yet can better expand the data
range?

Thanks
Adrian

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: data transformation

Adrian Johnson-6
I apologize, I forgot to mention another key detail.
In my matrix, values from -1 to <0 have one meaning, while values from >0
to 1 have a different meaning. So if I apply a logit transformation,
some of the positives become negative (values < 0.5, etc.). In that
case, the resulting transformed matrix is incorrect.

I want to transform numbers ranging from -1 to <0 and numbers
between >0 and 1 independently.

Thanks
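
To illustrate the sign flip described above, here is a small sketch using base R's qlogis() as the logit. The input values are made up for illustration and are not from the poster's matrix.

```r
# Made-up positive values in (0, 1)
x <- c(0.2, 0.5, 0.8)

# qlogis(p) computes log(p / (1 - p)), the logit of p
logit_x <- qlogis(x)

# Inputs below 0.5 map to negative values, inputs above 0.5 to positive ones,
# so the sign of a positive entry is not preserved
print(data.frame(x = x, logit = logit_x, sign_kept = sign(logit_x) == sign(x)))
```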

Re: data transformation

David Carlson
I don't think you have given us enough information. For example, is the 400x400 matrix a distance matrix, or does it represent 400 columns of information about 400 rows of observations? If it is a distance matrix, how is distance being measured? Your clarification suggests it may be a distance matrix of correlation coefficients. If distance has different meanings between -1 and 0 and between 0 and +1, getting interpretable results from cluster analysis will be difficult, but it is not clear what you mean by that.

-------------------------------------------------
David L. Carlson
Department of Anthropology
Texas A&M University

Re: data transformation

Richard M. Heiberger
In reply to this post by Adrian Johnson-6
This might work for you:

newy <- sign(oldy)*f(abs(oldy))

where f() is a monotonic transformation, perhaps a power function.
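
As a minimal sketch of the suggestion above, assuming a square-root power function for f(): the helper name `signed_power`, the exponent 0.5, and the simulated matrix are illustrative assumptions, not part of the original post.

```r
# Hypothetical helper: transform the magnitude with a power function and
# restore the sign, so negatives stay negative and positives stay positive.
signed_power <- function(x, p = 0.5) {
  sign(x) * abs(x)^p
}

set.seed(1)
# Simulated stand-in for the 400 x 400 matrix, clipped to [-1, 1]
oldy <- matrix(pmax(pmin(rnorm(400 * 400, sd = 0.3), 1), -1), nrow = 400)
newy <- signed_power(oldy, p = 0.5)

# Check that every entry kept its sign, and inspect the new range
print(all(sign(newy) == sign(oldy)))
print(range(newy))
```

With an exponent below 1, small magnitudes are stretched away from zero, but the overall range remains within [-1, 1]. Any other monotonic function on [0, 1] could be substituted for the power.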

On Sun, Jan 20, 2019 at 11:08 AM Adrian Johnson
<[hidden email]> wrote:

>
> I apologize, I forgot to mention another key detail.
> In my matrix, values from -1 to <0 have one meaning, while values from >0
> to 1 have a different meaning. So if I apply a logit transformation,
> some of the positives become negative (values < 0.5, etc.). In that
> case, the resulting transformed matrix is incorrect.
>
> I want to transform numbers ranging from -1 to <0 and numbers
> between >0 and 1 independently.
>
> Thanks

Re: data transformation

Jeff Newmiller
In reply to this post by Adrian Johnson-6
There is no "perhaps" about it. Nonsense phrases like "similar to logit, where I dont [sic] lose normality of the data" that lead into off-topic discussions of why one introduces transformations in the first place are perfect examples of why questions like this belong on a statistical theory discussion forum like StackExchange rather than here where the topic is the R language.

On January 20, 2019 6:02:15 AM PST, Adrian Johnson <[hidden email]> wrote:

>Dear group,
>My question is perhaps more of a statistical question using R.
>I have a data matrix (400 x 400, normally distributed) with data
>points ranging from -1 to +1.
>For certain clustering algorithms, I suspect the tight data range is
>not helping to resolve the clusters.
>
>Is there a way to transform the data, something similar to logit, where
>I dont lose normality of the data and yet can better expand the data
>range?
>
>Thanks
>Adrian

--
Sent from my phone. Please excuse my brevity.
