Ignoring the domain of RV in punif()

hamedhm
Hi All,

I recently discovered an interesting issue with the punif() function. Let
X ~ Uniform[a,b]; then the CDF is defined by F(x) = (x-a)/(b-a) for a <= x <= b.
The important fact here is the domain of the random variable X. Having said
that, R returns a CDF value for any value in the real domain.

I understand that one can justify this by extending the domain of X and
assigning zero probabilities to the values outside the domain. However,
theoretically, it is not correct to return a value for the CDF outside the
domain. So I propose a patch to the R function punif() to return an error in
these situations.

Example:
> punif(10^10)
[1] 1


Regards,
Hamed.

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
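
Editorial sketch: the behaviour described in the post above can be reproduced by evaluating punif() below, inside, and above the default interval [0, 1]. The wrapper punif_strict() is hypothetical; it is not part of R and only illustrates what the proposed error-raising behaviour might look like.

```r
# Evaluate the standard uniform CDF below, inside, and above [0, 1].
x <- c(-5, 0, 0.25, 1, 10^10)
p <- punif(x)                   # defaults: min = 0, max = 1
print(p)

# Hypothetical wrapper (NOT part of R): raises an error outside [min, max],
# which is the behaviour the post proposes.
punif_strict <- function(q, min = 0, max = 1) {
  if (any(q < min | q > max, na.rm = TRUE)) {
    stop("q lies outside [min, max]")
  }
  punif(q, min = min, max = max)
}

strict_inside  <- punif_strict(0.25)
strict_outside <- tryCatch(punif_strict(10^10),
                           error = function(e) conditionMessage(e))
```

Under this sketch, every value at or beyond the upper endpoint maps to 1 with the built-in function, while the hypothetical wrapper stops with an error instead.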

Re: Ignoring the domain of RV in punif()

Eric Berger
Hi Hamed,
I disagree with your criticism.
For a random variable X
X: D --> R
its CDF F is defined by
F: R --> [0,1]
F(z) = Prob(X <= z)

The fact that you wrote a convenient formula for the CDF
F(z) = (z-a)/(b-a)  a <= z <= b
in a particular range for z is your decision, and as you noted this formula
will give the wrong value for z outside the interval [a,b].
But the problem lies in your formula, not the definition of the CDF which
would be, in your case:

F(z) = 0               if z <= a
     = (z-a)/(b-a)   if a <= z <= b
     = 1               if z >= b

HTH,
Eric
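
Editorial sketch: the piecewise definition of the Uniform[a,b] CDF (value 0 below a, linear on [a,b], value 1 above b) can be written directly and compared with punif(). The function name cdf_unif and the values a = 2, b = 5 are arbitrary choices made for illustration.

```r
# Piecewise CDF of Uniform[a, b], defined for every real z.
# cdf_unif is an illustrative name, not an R function.
cdf_unif <- function(z, a = 0, b = 1) {
  stopifnot(a < b)
  ifelse(z <= a, 0,
         ifelse(z >= b, 1, (z - a) / (b - a)))
}

z <- c(-3, 2, 3.5, 5, 100)      # points below, inside, and above [2, 5]
manual  <- cdf_unif(z, a = 2, b = 5)
builtin <- punif(z, min = 2, max = 5)
print(rbind(z, manual, builtin))
```

The two rows should agree at every point, including those outside [a, b].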




Re: Ignoring the domain of RV in punif()

hamedhm
Hi Eric,

Thank you for your reply.

I should say that your justification makes sense to me. However, I doubt
that the CDF is defined by Pr(X <= x) for all x; that is, the domain of the
RV is totally ignored in the definition.

This creates a conflict between the formula and the theoretical definition.

Please see page 115 in
https://books.google.co.uk/books?id=FEE8D1tRl30C&printsec=frontcover&dq=statistical+distribution&hl=en&sa=X&ved=0ahUKEwjp3PGZmJzeAhUQqxoKHV7OBJgQ6AEIKTAA#v=onepage&q=uniform&f=false
The


Thanks.
Hamed.



Re: Ignoring the domain of RV in punif()

Eric Berger
Hi Hamed,
That reference is sloppy. Try looking at
https://en.wikipedia.org/wiki/Cumulative_distribution_function
and in particular the first example which deals with a Unif[0,1] r.v.

Best,
Eric


Re: Ignoring the domain of RV in punif()

hamedhm
Yes, now it makes more sense.

Okay, I think that I am convinced and we can close this ticket.

Thanks Eric.
Regards,
Hamed.






Re: Ignoring the domain of RV in punif()

Ted Harding
Before the ticket finally enters the waste bin, I think it is
necessary to explicitly explain what is meant by the "domain"
of a random variable. This is not (though in special cases
could be) the space of possible values of the random variable.

Definition of (real-valued) Random Variable (RV):
Let Z be a probability space, i.e. a set {z} of entities z
on which a probability distribution is defined. The entities z
do not need to be numeric. A real-valued RV X is a function
X:Z --> R defined on Z such that, for any z in Z, X(z) is a
real number. The set Z, in this context, is (by definition)
the *domain* of X, i.e. the space on which X is defined.
It may or may not be (and usually is not) the same as the set
of possible values of X.

Then, given any real value x0, the CDF of X at x0 is Prob[X <= x0].
The distribution function of X does not define the domain of X.

As a simple example: Suppose Q is a cube of side A, consisting of
points z=(u,v,w) with 0 <= u,v,w <= A. Z is the probability space
of points z with a uniform distribution of position within Q.
Define the random variable X:Q --> [0,1] as
  X(u,v,w) = x/A
Then X is uniformly distributed on [0,1], the domain of X is Q.
Then for x <= 0 Prob[X <= x] = 0, for 0 <= x <= 1 Prob[X <= x] = x,
for x >= 1 Prob[X <= x] = 1. These define the CDF. The set of possible
values of X is 1-dimensional, and is not the same as the domain of X,
which is 3-dimensional.

Hoping this helps!
Ted.

Re: Ignoring the domain of RV in punif()

Ted Harding
Sorry -- stupid typos in my definition below!
See at ===*** below.

On Tue, 2018-10-23 at 11:41 +0100, Ted Harding wrote:
Before the ticket finally enters the waste bin, I think it is
necessary to explicitly explain what is meant by the "domain"
of a random variable. This is not (though in special cases
could be) the space of possible values of the random variable.

Definition of (real-valued) Random Variable (RV):
Let Z be a probability space, i.e. a set {z} of entities z
on which a probability distribution is defined. The entities z
do not need to be numeric. A real-valued RV X is a function
X:Z --> R defined on Z such that, for any z in Z, X(z) is a
real number. The set Z, in this context, is (by definition)
the *domain* of X, i.e. the space on which X is defined.
It may or may not be (and usually is not) the same as the set
of possible values of X.

Then, given any real value x0, the CDF of X at x0 is Prob[X <= x0].
The distribution function of X does not define the domain of X.

As a simple example: Suppose Q is a cube of side A, consisting of
points z=(u,v,w) with 0 <= u,v,w <= A. Z is the probability space
of points z with a uniform distribution of position within Q.
Define the random variable X:Q --> [0,1] as
===***
  X(u,v,w) = x/A

Wrong! That should have been:

  X(u,v,w) = w/A
===***
Then X is uniformly distributed on [0,1], the domain of X is Q.
Then for x <= 0 Prob[X <= x] = 0, for 0 <= x <= 1 Prob[X <= x] = x,
for x >= 1 Prob[X <= x] = 1. These define the CDF. The set of possible
values of X is 1-dimensional, and is not the same as the domain of X,
which is 3-dimensional.

Hoping this helps!
Ted.
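
Editorial sketch: the cube example can be checked by simulation. Points (u,v,w) are drawn uniformly in the cube Q, X = w/A is computed, and the empirical CDF of X is compared with punif() at points inside and outside [0, 1]. The seed, the side A = 2, and the sample size are arbitrary assumptions made for illustration.

```r
set.seed(1)                     # arbitrary seed, for reproducibility only
A <- 2                          # side of the cube Q (arbitrary choice)
n <- 1e5                        # number of simulated points

# Draw points z = (u, v, w) uniformly in Q = [0, A]^3.
u <- runif(n, 0, A)
v <- runif(n, 0, A)
w <- runif(n, 0, A)

# The random variable X maps each 3-dimensional point to one real number.
X <- w / A

# Empirical CDF of X, evaluated below, inside, and above [0, 1].
x <- c(-1, 0.3, 0.8, 1, 7)
empirical <- ecdf(X)(x)
exact     <- punif(x)
print(round(cbind(x, empirical, exact), 3))
```

The domain of X here is the 3-dimensional cube, yet the CDF is a function of a single real argument and takes a value at every real x, including x = -1 and x = 7.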

Re: Ignoring the domain of RV in punif()

hamedhm
Hi Ted,

Thanks for the explanation.

I am largely convinced by Eric's answer and yours, but I still have some
lingering confusion, which is surely because I have forgotten some
fundamentals of probability.

In your cube example, the cumulative probability of reaching a point
outside the cube (u or v or w  > A) is 1; however, the bigger cube does not
exist (because the Q is the reference space). In other words, I feel that we
extend the space to accommodate any cube of any size! It looks a bit weird to
me!


Hamed.

Re: Ignoring the domain of RV in punif()

Ted Harding
Well, as a final (I hope!) clarification: It is not the case that
"the bigger cube does not exist (because the Q is the reference
space)". It does exist! Simply, the probability of the random point
being in the bigger cube, and NOT in the cube Q, is 0.

Hence "the cumulative probability of reaching a point outside the
cube (u or v or w  > A) is 1" is badly phrased. The "cumulative
probability" is not the probability of *reaching* a point, but of
being (in the case of a real random variable) less than or equal
to the given value. If Prob[X <= x1] = 1, then Prob[X > x1] = 0.
Hence if x0 is the minimum value such that Prob[X <= x0] = 1,
then X "can reach" x0. But for any x1 > x0, Prob[x0 < X <= x1] = 0.
Therefore, since X cannot be greater than x0, X *cannot reach* x1!
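A minimal R sketch of the statement above, assuming the standard Unif[0,1] case (punif() with its default min = 0 and max = 1), so that x0 = 1. The variable names are illustrative only:

```r
# Unif[0,1]: x0 = 1 is the smallest value with Prob[X <= x0] = 1.
x0 <- 1
x1 <- 10^10                      # any x1 > x0 would do

cdf_at_x0 <- punif(x0)           # Prob[X <= x0]
cdf_at_x1 <- punif(x1)           # Prob[X <= x1]

# Probability mass strictly beyond x0, up to and including x1.
mass_beyond <- cdf_at_x1 - cdf_at_x0

# Probability of exceeding x1, taken from the upper tail.
upper_tail <- punif(x1, lower.tail = FALSE)

print(c(cdf_at_x0 = cdf_at_x0, cdf_at_x1 = cdf_at_x1,
        mass_beyond = mass_beyond, upper_tail = upper_tail))
```

The CDF is perfectly well defined at x1; it simply carries no additional probability beyond x0.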

Best wishes,
Ted.

On Tue, 2018-10-23 at 12:06 +0100, Hamed Ha wrote:

> Hi Ted,
>
> Thanks for the explanation.
>
> I am mostly convinced by Eric's answer and yours. But I still have some
> lingering confusion, which is surely because I have forgotten some
> fundamentals of probability.
>
> In your cube example, the cumulative probability of reaching a point
> outside the cube (u or v or w  > A) is 1 however, the bigger cube does not
> exists (because the Q is the reference space). In other words, I feel that
> we extend the space to accommodate any cube of any size! That looks a bit
> weird to me!
>
>
> Hamed.
>
> On Tue, 23 Oct 2018 at 11:52, Ted Harding <[hidden email]> wrote:
>
> > Sorry -- stupid typos in my definition below!
> > See at ===*** below.
> >
> > On Tue, 2018-10-23 at 11:41 +0100, Ted Harding wrote:
> > Before the ticket finally enters the waste bin, I think it is
> > necessary to explicitly explain what is meant by the "domain"
> > of a random variable. This is not (though in special cases
> > could be) the space of possible values of the random variable.
> >
> > Definition of (real-valued) Random Variable (RV):
> > Let Z be a probability space, i.e. a set {z} of entities z
> > on which a probability distribution is defined. The entities z
> > do not need to be numeric. A real-valued RV X is a function
> > X:Z --> R defined on Z such that, for any z in Z, X(z) is a
> > real number. The set Z, in this context, is (by definition)
> > the *domain* of X, i.e. the space on which X is defined.
> > It may or may not be (and usually is not) the same as the set
> > of possible values of X.
> >
> > Then, given any real value x0, the CDF of X at x0 is Prob[X <= x0].
> > The distribution function of X does not define the domain of X.
> >
> > As a simple example: Suppose Q is a cube of side A, consisting of
> > points z=(u,v,w) with 0 <= u,v,w <= A. Z is the probability space
> > of points z with a uniform distribution of position within Q.
> > Define the random variable X:Q --> [0,1] as
> > ===***
> >   X(u,v,w) = x/A
> >
> > Wrong! That should have been:
> >
> >   X(u,v,w) = w/A
> > ===***
> > Then X is uniformly distributed on [0,1], the domain of X is Q.
> > Then for x <= 0 Prob[X <= x] = 0, for 0 <= x <= 1 Prob[X <= x] = x,
> > for x >= 1 Prob[X <= x] = 1. These define the CDF. The set of possible
> > values of X is 1-dimensional, and is not the same as the domain of X,
> > which is 3-dimensional.
> >
> > Hoping this helps!
> > Ted.
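
The cube example above can be checked by simulation. This is only a sketch: the side A = 2, the sample size and the seed are arbitrary illustrative choices, not values from the thread.

```r
set.seed(1)                      # arbitrary seed, only for reproducibility
A <- 2                           # side of the cube Q (illustrative value)
n <- 1e5                         # number of simulated points

# Points z = (u, v, w) uniformly distributed in Q = [0, A]^3: the domain of X.
z <- matrix(runif(3 * n, min = 0, max = A), ncol = 3,
            dimnames = list(NULL, c("u", "v", "w")))

# The random variable X(u, v, w) = w/A takes values in [0, 1].
x <- z[, "w"] / A

# Empirical CDF of X compared with punif() at points inside and outside [0, 1].
Fhat <- ecdf(x)
q <- c(-5, 0, 0.25, 0.5, 1, 5)
print(cbind(q = q, empirical = Fhat(q), punif = punif(q)))
```

The domain of X here is the 3-dimensional cube, while the CDF is evaluated on the whole real line, including points such as -5 and 5 that X never takes as values.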
> >
> > On Tue, 2018-10-23 at 10:54 +0100, Hamed Ha wrote:
> > > > Yes, now it makes more sense.
> > > >
> > > > Okay, I think that I am convinced and we can close this ticket.
> > > >
> > > > Thanks Eric.
> > > > Regards,
> > > > Hamed.
> > > >
> > > > On Tue, 23 Oct 2018 at 10:42, Eric Berger <[hidden email]> wrote:
> > > >
> > > > > Hi Hamed,
> > > > > That reference is sloppy. Try looking at
> > > > > https://en.wikipedia.org/wiki/Cumulative_distribution_function
> > > > > and in particular the first example which deals with a Unif[0,1] r.v.
> > > > >
> > > > > Best,
> > > > > Eric
> > > > >
> > > > >
> > > > > On Tue, Oct 23, 2018 at 12:35 PM Hamed Ha <[hidden email]> wrote:
> > > > >
> > > > >> Hi Eric,
> > > > >>
> > > > >> Thank you for your reply.
> > > > >>
> > > > >> I should say that your justification makes sense to me. However, I
> > > > >> doubt that the CDF is defined by Pr(X <= x) for all x, that is, that
> > > > >> the domain of the RV is totally ignored in the definition.
> > > > >>
> > > > >> That creates a conflict between the formula and the theoretical
> > > > >> definition.
> > > > >>
> > > > >> Please see page 115 in
> > > > >>
> > > > >>
> > https://books.google.co.uk/books?id=FEE8D1tRl30C&printsec=frontcover&dq=statistical+distribution&hl=en&sa=X&ved=0ahUKEwjp3PGZmJzeAhUQqxoKHV7OBJgQ6AEIKTAA#v=onepage&q=uniform&f=false
> > > > >> The
> > > > >>
> > > > >>
> > > > >> Thanks.
> > > > >> Hamed.