

Hello,
I'm trying to understand how to use the pbo package by looking at a
vignette. I'm curious about a part of the vignette that creates simulated
returns data. The package author transforms his simulated returns in a way
that I'm unfamiliar with, and that I haven't been able to find an
explanation for after searching around. I'm curious if I need to replicate
the transformation with real returns. For context, here is the vignette
(cleaned up a bit to make it reproducible):
(Full vignette:
https://cran.r-project.org/web/packages/pbo/vignettes/pbo.html)
library(pbo)

# First, we assemble the trials into an NxT matrix where each column
# represents a trial and each trial has the same length T. This example
# is random data so the backtest should be overfit.
set.seed(765)
n <- 100
t <- 2400
m <- data.frame(matrix(rnorm(n*t), nrow=t, ncol=n,
                       dimnames=list(1:t,1:n)), check.names=FALSE)

sr_base <- 0
mu_base <- sr_base/(252.0)
sigma_base <- 1.00/(252.0)**0.5
for ( i in 1:n ) {
  m[,i] = m[,i] * sigma_base / sd(m[,i])  # rescale
  m[,i] = m[,i] + mu_base - mean(m[,i])   # recenter
}

# We can use any performance evaluation function that can work with the
# reassembled sub-matrices during the cross-validation iterations.
# Following the original paper we can use the Sharpe ratio as
sharpe <- function(x, rf=0.03/252) {
  sr <- apply(x, 2, function(col) {
    er = col - rf                 # excess return over the daily risk-free rate
    return(mean(er)/sd(er))
  })
  return(sr)
}

# Now that we have the trials matrix we can pass it to the pbo function
# for analysis.
my_pbo <- pbo(m, s=8, f=sharpe, threshold=0)
summary(my_pbo)

Here's the portion I'm curious about:

sr_base <- 0
mu_base <- sr_base/(252.0)
sigma_base <- 1.00/(252.0)**0.5
for ( i in 1:n ) {
  m[,i] = m[,i] * sigma_base / sd(m[,i])  # rescale
  m[,i] = m[,i] + mu_base - mean(m[,i])   # recenter
}

Why is the data transformed within the for loop, and does this kind of
rescaling and recentering need to be done with real returns? Or is this
just something the author is doing to make his simulated returns look more
like the real thing?
Googling around turned up some articles on scaling volatility by the
square root of time, but the scaling in the code here doesn't look quite
like what I've seen. The rescalings I've seen multiply some short-term
(i.e. daily) measure of volatility by the root of time, but this isn't
quite that. Also, the documentation for the package doesn't include this
chunk of rescaling and recentering code. Documentation:
https://cran.r-project.org/web/packages/pbo/pbo.pdf
So:

1. Why is the data transformed in this way, and what is the result of
   this transformation?

2. Is it only necessary for this simulated data, or do I need to
   similarly transform real returns?

I read in the posting guide that stats questions are acceptable under
certain conditions; I hope this counts. Thanks for reading,
Joe
______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Wrong list.
Post on r-sig-finance instead.
Cheers,
Bert


Hi Joe,
The centering and rescaling is done for the purposes of his example, and
also to be consistent with his definition of the sharpe function.
In particular, note that the sharpe function has the rf (risk-free)
parameter with a default value of .03/252, i.e. an ANNUAL 3% rate converted
to a DAILY rate, expressed in decimal.
That means that the other argument to this function, x, should be DAILY
returns, expressed in decimal.
Suppose he wanted to create random data from a distribution of returns with
ANNUAL mean MU_A and ANNUAL std deviation SIGMA_A, both stated in decimal.
The equivalent DAILY
Then he does two steps: (1) generate a matrix of random values from the
N(0,1) distribution. (2) convert them to DAILY
After initializing the matrix with random values (from N(0,1)), he now
wants to create a series of DAILY
sr_base <- 0
mu_base <- sr_base/(252.0)
sigma_base <- 1.00/(252.0)**0.5
for ( i in 1:n ) {
  m[,i] = m[,i] * sigma_base / sd(m[,i])  # rescale
  m[,i] = m[,i] + mu_base - mean(m[,i])   # recenter
}


[resending - previous email went out by accident before complete]
Hi Joe,
The centering and rescaling is done for the purposes of his example, and
also to be consistent with his definition of the sharpe function.
In particular, note that the sharpe function has the rf (risk-free)
parameter with a default value of .03/252, i.e. an ANNUAL 3% rate converted
to a DAILY rate, expressed in decimal.
That means that the other argument to this function, x, should be DAILY
returns, expressed in decimal.
Suppose he wanted to create random data from a distribution of returns with
ANNUAL mean MU_A and ANNUAL std deviation SIGMA_A, both stated in decimal.
The equivalent DAILY returns would have mean MU_D = MU_A / 252 and standard
deviation SIGMA_D = SIGMA_A/SQRT(252).
He calls MU_D by the name mu_base and SIGMA_D by the name sigma_base.
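
As a rough check of that conversion, and of how it relates to the
square-root-of-time rule Joe mentions, here is a minimal base-R sketch. The
10% annual mean and 20% annual standard deviation are made-up illustrative
values (not from the vignette), and the simulation assumes independent daily
returns that simply add up over a 252-day year. Dividing an annual sd by
sqrt(252) is the same rule as multiplying a daily sd by sqrt(252), just run
in the other direction.

```r
# Illustrative annual figures (assumptions, not taken from the vignette)
mu_a    <- 0.10                  # annual mean return, in decimal
sigma_a <- 0.20                  # annual standard deviation, in decimal
days    <- 252                   # trading days per year, as in the vignette

mu_d    <- mu_a / days           # equivalent daily mean
sigma_d <- sigma_a / sqrt(days)  # equivalent daily standard deviation

# Scaling back up recovers the annual figures
c(annual_mean = mu_d * days, annual_sd = sigma_d * sqrt(days))

# Simulation check: sum 252 independent daily returns, for many simulated years
set.seed(1)
years  <- 5000
daily  <- matrix(rnorm(days * years, mean = mu_d, sd = sigma_d), nrow = days)
annual <- colSums(daily)         # one annual return per simulated year

# These should land close to mu_a and sigma_a (sampling noise aside)
c(mean = mean(annual), sd = sd(annual))
```
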
His loop now converts the random numbers in his matrix so that each column
has mean MU_D and std deviation SIGMA_D.
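
To see this concretely, the following sketch reruns the vignette's
simulation and loop using base R only (no pbo needed) and then checks the
column moments. The two range() calls at the end are an added check, not
part of the vignette. Note that the loop forces the SAMPLE mean and sd of
every column to the targets, not just the population values.

```r
set.seed(765)
n <- 100
t <- 2400
m <- data.frame(matrix(rnorm(n * t), nrow = t, ncol = n,
                       dimnames = list(1:t, 1:n)), check.names = FALSE)

sr_base    <- 0
mu_base    <- sr_base / 252        # target daily mean
sigma_base <- 1.00 / sqrt(252)     # target daily sd (annual sd of 1.00)

for (i in 1:n) {
  m[, i] <- m[, i] * sigma_base / sd(m[, i])   # rescale: sd becomes sigma_base
  m[, i] <- m[, i] + mu_base - mean(m[, i])    # recenter: mean becomes mu_base
}

# Added check: deviations from the targets should be zero up to
# floating-point error for every column
range(sapply(m, mean) - mu_base)
range(sapply(m, sd) - sigma_base)
```
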
HTH,
Eric


Hi Eric,
Thank you, that helps a lot. If I understand correctly, to use actual
returns from backtests rather than simulated returns, I need to make sure
my risk-adjusted return measure (the Sharpe ratio in this case) matches my
returns in scale (i.e. daily returns with a daily Sharpe ratio, monthly with
monthly, etc.). And I wouldn't need to transform the returns the way the
simulated returns are transformed in the vignette, as the real returns will
have whatever mean and std dev they happen to have. Is that correct?
Thanks, Joe


Correct
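
A minimal sketch of the point Joe summarizes, assuming a Sharpe function
that takes the per-period risk-free rate explicitly. The name
sharpe_per_period, the strategy names, and the randomly generated matrices
are all hypothetical stand-ins for real backtest output; the only thing
being illustrated is that rf is expressed on the same period as the returns
and that the returns themselves are left untransformed.

```r
# Sharpe ratio per column; rf must be on the same period as the returns in x
sharpe_per_period <- function(x, rf) {
  apply(x, 2, function(col) {
    er <- col - rf               # excess return per period
    mean(er) / sd(er)
  })
}

# Placeholder matrices standing in for real backtest returns (hypothetical)
set.seed(42)
strategies      <- c("strat_a", "strat_b", "strat_c")
daily_returns   <- matrix(rnorm(504 * 3, mean = 0.0004, sd = 0.01), ncol = 3,
                          dimnames = list(NULL, strategies))
monthly_returns <- matrix(rnorm(24 * 3, mean = 0.008, sd = 0.04), ncol = 3,
                          dimnames = list(NULL, strategies))

# Daily returns pair with a daily rf, monthly returns with a monthly rf
sr_daily   <- sharpe_per_period(daily_returns,   rf = 0.03 / 252)
sr_monthly <- sharpe_per_period(monthly_returns, rf = 0.03 / 12)
sr_daily
sr_monthly
```

Since the vignette passes the metric to pbo() as f, a one-argument wrapper
such as function(x) sharpe_per_period(x, rf = 0.03/12) could presumably be
supplied there for monthly data.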
Sent from my iPhone
> On 21 Nov 2017, at 22:42, Joe O < [hidden email]> wrote:
>
> Hi Eric,
>
> Thank you, that helps a lot. If I'm understanding correctly, if I’m wanting to use actual returns from backtests rather than simulated returns, I would need to make sure my riskadjusted return measure, sharpe ratio in this case, matches up in scale with my returns (i.e. daily returns with daily sharpe, monthly with monthly, etc). And I wouldn’t need to transform returns like the simulated returns are in the vignette, as the real returns are going to have whatever properties they have (meaning they will have whatever average and std dev they happen to have). Is that correct?
>
> Thanks, Joe
>
>
>> On Tue, Nov 21, 2017 at 5:36 AM, Eric Berger < [hidden email]> wrote:
>> [resending  previous email went out by accident before complete]
>> Hi Joe,
>> The centering and rescaling is done for the purposes of his example, and also to be consistent with his definition of the sharpe function.
>> In particular, note that the sharpe function has the rf (riskfree) parameter with a default value of .03/252 i.e. an ANNUAL 3% rate converted to a DAILY rate, expressed in decimal.
>> That means that the other argument to this function, x, should be DAILY returns, expressed in decimal.
>>
>> Suppose he wanted to create random data from a distribution of returns with ANNUAL mean MU_A and ANNUAL std deviation SIGMA_A, both stated in decimal.
>> The equivalent DAILY returns would have mean MU_D = MU_A / 252 and standard deviation SIGMA_D = SIGMA_A/SQRT(252).
>>
>> He calls MU_D by the name mu_base and SIGMA_D by the name sigma_base.
>>
>> His loop now converts the random numbers in his matrix so that each column has mean MU_D and std deviation SIGMA_D.
>>
>> HTH,
>> Eric
>>
>>
>>
>>> On Tue, Nov 21, 2017 at 2:33 PM, Eric Berger < [hidden email]> wrote:
>>> Hi Joe,
>>> The centering and rescaling is done for the purposes of his example, and also to be consistent with his definition of the sharpe function.
>>> In particular, note that the sharpe function has the rf (riskfree) parameter with a default value of .03/252 i.e. an ANNUAL 3% rate converted to a DAILY rate, expressed in decimal.
>>> That means that the other argument to this function, x, should be DAILY returns, expressed in decimal.
>>>
>>> Suppose he wanted to create random data from a distribution of returns with ANNUAL mean MU_A and ANNUAL std deviation SIGMA_A, both stated in decimal.
>>> The equivalent DAILY
>>>
>>> Then he does two steps: (1) generate a matrix of random values from the N(0,1) distribution. (2) convert them to DAILY
>>> After initializing the matrix with random values (from N(0,1)), he now wants to create a series of DAILY
>>> sr_base < 0
>>> mu_base < sr_base/(252.0)
>>> sigma_base < 1.00/(252.0)**0.5
>>> for ( i in 1:n ) {
>>> m[,i] = m[,i] * sigma_base / sd(m[,i]) # rescale
>>> m[,i] = m[,i] + mu_base  mean(m[,i]) # recenter}
>>>
>>>> On Tue, Nov 21, 2017 at 2:10 PM, Bert Gunter < [hidden email]> wrote:
>>>> Wrong list.
>>>>
>>>> Post on rsigfinance instead.
>>>>
>>>> Cheers,
>>>> Bert


Fantastic! Thank you for your help, Joe
On Tue, Nov 21, 2017 at 2:17 PM, Eric Berger < [hidden email]> wrote:
> Correct
>
> Sent from my iPhone
>
> On 21 Nov 2017, at 22:42, Joe O < [hidden email]> wrote:
>
> Hi Eric,
>
> Thank you, that helps a lot. If I'm understanding correctly, if I’m
> wanting to use actual returns from backtests rather than simulated returns,
> I would need to make sure my risk-adjusted return measure, the Sharpe ratio in
> this case, matches up in scale with my returns (i.e. daily returns with
> daily sharpe, monthly with monthly, etc). And I wouldn’t need to transform
> returns like the simulated returns are in the vignette, as the real returns
> are going to have whatever properties they have (meaning they will have
> whatever average and std dev they happen to have). Is that correct?
>
> Thanks, Joe
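
To make the frequency-matching point concrete, here is a minimal sketch. The `sharpe` function is the one from the vignette; the `daily` and `monthly` matrices are hypothetical stand-ins for real backtest returns (simulated here only so the example runs), and the monthly risk-free rate of 0.03/12 is an assumption that follows the same annual-3% convention.

```r
# Sharpe ratio per column, as in the vignette. rf must be expressed in the
# same units and at the same frequency as the returns in x.
sharpe <- function(x, rf = 0.03 / 252) {
  apply(x, 2, function(col) {
    er <- col - rf            # excess return over the risk-free rate
    mean(er) / sd(er)
  })
}

set.seed(1)

# Hypothetical stand-in for real backtests: 3 strategies, 504 daily returns.
# Real returns would be used as-is, with no rescaling or recentering.
daily <- matrix(rnorm(504 * 3, mean = 0.0005, sd = 0.01), ncol = 3)
sr_daily <- sharpe(daily)                       # daily returns, daily rf

# Hypothetical monthly returns need a monthly rf instead of the default.
monthly <- matrix(rnorm(24 * 3, mean = 0.01, sd = 0.04), ncol = 3)
sr_monthly <- sharpe(monthly, rf = 0.03 / 12)   # monthly returns, monthly rf

# With the pbo package installed, the daily case would be passed as in the
# vignette:
# my_pbo <- pbo(daily, s = 8, f = sharpe, threshold = 0)
```

Passing monthly returns with the default daily `rf` would not raise an error, but it would understate the risk-free hurdle, so the mismatch has to be caught by the user.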
>
>
> On Tue, Nov 21, 2017 at 5:36 AM, Eric Berger < [hidden email]>
> wrote:
>
>> [resending - previous email went out by accident before it was complete]
>> Hi Joe,
>> The centering and rescaling is done for the purposes of his example, and
>> also to be consistent with his definition of the sharpe function.
>> In particular, note that the sharpe function has the rf (risk-free)
>> parameter with a default value of 0.03/252, i.e. an ANNUAL 3% rate converted
>> to a DAILY rate, expressed in decimal.
>> That means that the other argument to this function, x, should be DAILY
>> returns, expressed in decimal.
>>
>> Suppose he wanted to create random data from a distribution of returns
>> with ANNUAL mean MU_A and ANNUAL std deviation SIGMA_A, both stated in
>> decimal.
>> The equivalent DAILY returns would have mean MU_D = MU_A / 252 and
>> standard deviation SIGMA_D = SIGMA_A/SQRT(252).
>>
>> He calls MU_D by the name mu_base and SIGMA_D by the name sigma_base.
>>
>> His loop now converts the random numbers in his matrix so that each
>> column has mean MU_D and std deviation SIGMA_D.
>>
>> HTH,
>> Eric
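
Eric's description of the loop can be checked numerically. The sketch below is an illustration, not part of the vignette: it uses a smaller matrix (5 columns instead of 100), and the names `raw`, `mean_err`, `sd_err`, `m_scaled` and `same` are made up for the example. It also shows that base R's `scale()` appears to give the same result in one step.

```r
set.seed(765)
n <- 5      # fewer columns than the vignette, to keep the check quick
t <- 2400
raw <- matrix(rnorm(n * t), nrow = t, ncol = n)   # N(0,1) draws

mu_base    <- 0 / 252            # MU_D: daily mean for an annual mean of 0
sigma_base <- 1.00 / sqrt(252)   # SIGMA_D: daily sd for an annual sd of 1

# The vignette's loop: rescale each column, then recenter it
m <- raw
for (i in 1:n) {
  m[, i] <- m[, i] * sigma_base / sd(m[, i])   # sample sd becomes sigma_base
  m[, i] <- m[, i] + mu_base - mean(m[, i])    # sample mean becomes mu_base
}

# Largest deviation of any column from the target mean and sd
mean_err <- max(abs(colMeans(m) - mu_base))
sd_err   <- max(abs(apply(m, 2, sd) - sigma_base))

# Same transformation in one step: standardize, then scale and shift
m_scaled <- scale(raw) * sigma_base + mu_base
same <- isTRUE(all.equal(m, m_scaled, check.attributes = FALSE))
```

Both `mean_err` and `sd_err` should be zero up to floating-point rounding, which is the sense in which each column "has mean MU_D and std deviation SIGMA_D": the sample statistics are forced to the targets, not merely drawn from a distribution with those parameters.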
>>
>>
>>
>> On Tue, Nov 21, 2017 at 2:33 PM, Eric Berger < [hidden email]>
>> wrote:
>>
>>> Hi Joe,
>>> The centering and rescaling is done for the purposes of his example,
>>> and also to be consistent with his definition of the sharpe function.
>>> In particular, note that the sharpe function has the rf (risk-free)
>>> parameter with a default value of 0.03/252, i.e. an ANNUAL 3% rate converted
>>> to a DAILY rate, expressed in decimal.
>>> That means that the other argument to this function, x, should be DAILY
>>> returns, expressed in decimal.
>>>
>>> Suppose he wanted to create random data from a distribution of returns
>>> with ANNUAL mean MU_A and ANNUAL std deviation SIGMA_A, both stated in
>>> decimal.
>>> The equivalent DAILY
>>>
>>> Then he does two steps: (1) generate a matrix of random values from the
>>> N(0,1) distribution. (2) convert them to DAILY
>>> After initializing the matrix with random values (from N(0,1)), he now
>>> wants to create a series of DAILY
>>> sr_base <- 0
>>> mu_base <- sr_base/(252.0)
>>> sigma_base <- 1.00/(252.0)**0.5
>>> for ( i in 1:n ) {
>>> m[,i] = m[,i] * sigma_base / sd(m[,i]) # rescale
>>> m[,i] = m[,i] + mu_base - mean(m[,i]) # recenter
>>> }
>>>

