

Hello everyone,
I have the following problem: I have a data.frame with multiple fields.
For a single combination of IM.type and Taxonomy, my calculation is the following:
D <- read.csv('Test_v2.csv')
names(D)
VC <- 0.01*( subset(D, IM.type == 'PGA' & Damage.state == 'DS1' & Taxonomy == 'ER+ETR_H1')[10:13] -
             subset(D, IM.type == 'PGA' & Damage.state == 'DS2' & Taxonomy == 'ER+ETR_H1')[10:13]) +
      0.02*( subset(D, IM.type == 'PGA' & Damage.state == 'DS2' & Taxonomy == 'ER+ETR_H1')[10:13] -
             subset(D, IM.type == 'PGA' & Damage.state == 'DS3' & Taxonomy == 'ER+ETR_H1')[10:13]) +
      0.43*( subset(D, IM.type == 'PGA' & Damage.state == 'DS3' & Taxonomy == 'ER+ETR_H1')[10:13] -
             subset(D, IM.type == 'PGA' & Damage.state == 'DS4' & Taxonomy == 'ER+ETR_H1')[10:13]) +
      1.0*( subset(D, IM.type == 'PGA' & Damage.state == 'DS4' & Taxonomy == 'ER+ETR_H1')[10:13])
So the question is: how can I do that in an automated way for all possible combinations and store the results in a new data.frame, which would look like this:
Ref.No.  Region         IM.type  Taxonomy   IM_1      IM_2  IM_3  IM_4  VC_1       VC_2          VC_3         VC_4
1622     South America  PGA      ER+ETR_H1  1.00E-06  0.08  0.16  0.24  3.49e-294  3.449819e-05  0.002748889  0.01122911
Best,
ioanna
______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Okay, I'm away for most of the day and might not be able to look at it
until tomorrow.
Jim
On Wed, Dec 18, 2019 at 9:27 AM Ioannou, Ioanna
< [hidden email]> wrote:
>
> Hello Jim ,
>
> I am very sorry. Here is the corrected sample data to play with:
>
> Test.v2 <- data.frame(Ref.No = c(1622, 1623, 1624, 1625, 1626, 1627, 1628, 1629),
> Region = rep(c('South America'), times = 8),
> IM.type = c('PGA', 'PGA', 'PGA', 'PGA', 'Sa', 'Sa', 'Sa', 'Sa'),
> Damage.state = c('DS1', 'DS2', 'DS3', 'DS4','DS1', 'DS2', 'DS3', 'DS4'),
> Taxonomy = c('ER+ETR_H1','ER+ETR_H1','ER+ETR_H1','ER+ETR_H1','ER+ETR_H2','ER+ETR_H2','ER+ETR_H2','ER+ETR_H2'),
> IM_1 = c(0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00),
> IM_2 = c(0.08, 0.08, 0.08, 0.08, 0.08, 0.08, 0.08, 0.08),
> IM_3 = c(0.16, 0.16, 0.16, 0.16, 0.16, 0.16, 0.16, 0.16),
> IM_4 = c(0.24, 0.24, 0.24, 0.24, 0.24, 0.24, 0.24, 0.24),
> Prob.of.exceedance_1 = c(0,0,0,0,0,0,0,0),
> Prob.of.exceedance_2 = c(0,0,0,0,0,0,0,0),
> Prob.of.exceedance_3 = c(0.26,0.001,0.00019,0.000000573,0.04,0.00017,0.000215,0.000472),
> Prob.of.exceedance_4 = c(0.72,0.03,0.008,0.000061,0.475,0.0007,0.00435,0.000405)
> )
>
> Basically I am using the total probability theorem to calculate a best estimate. I am stuck on how to do it for many cases. Many thanks for your patience.
>
> -----Original Message-----
> From: Jim Lemon [mailto: [hidden email]]
> Sent: Tuesday, December 17, 2019 10:22 PM
> To: Ioannou, Ioanna < [hidden email]>
> Subject: Re: [R] How to create a new data.frame based on calculation of subsets of an existing data.frame
>
> Hi Ioanna,
> After looking at your post for a while, I think that you are combining columns IM_1 to IM_4 to generate VC_1 to VC_4. First, you seem to have omitted the "Region" column from Test_v2, which means that your indices (10:13) run out of range. It seems to me that you would find it easier to write down what arithmetic operations you want and translate these into logical expressions to extract the rows.
>
> Jim
>


Hi Ioanna,
I looked at the problem this morning and tried to work out what you
wanted. With a problem like this, it is often easy when you have
someone point to the data and say "I want this added to that and this
multiplied by that". I have probably made the wrong guesses, but I
hope that you can correct my guesses and I can get the calculations
correct for you. For example, I have assumed that you want the sum of
the IM_* values for each set of damage states as the values for VC_1,
VC_2 etc.
D <- data.frame(Ref.No = c(1622, 1623, 1624, 1625, 1626, 1627, 1628, 1629),
 Region = rep(c('South America'), times = 8),
 IM.type = c('PGA', 'PGA', 'PGA', 'PGA', 'Sa', 'Sa', 'Sa', 'Sa'),
 Damage.state = c('DS1', 'DS2', 'DS3', 'DS4','DS1', 'DS2', 'DS3', 'DS4'),
 Taxonomy = c('ER+ETR_H1','ER+ETR_H1','ER+ETR_H1','ER+ETR_H1','ER+ETR_H2',
  'ER+ETR_H2','ER+ETR_H2','ER+ETR_H2'),
 IM_1 = c(0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00),
 IM_2 = c(0.08, 0.08, 0.08, 0.08, 0.08, 0.08, 0.08, 0.08),
 IM_3 = c(0.16, 0.16, 0.16, 0.16, 0.16, 0.16, 0.16, 0.16),
 IM_4 = c(0.24, 0.24, 0.24, 0.24, 0.24, 0.24, 0.24, 0.24),
 Prob.of.exceedance_1 = c(0,0,0,0,0,0,0,0),
 Prob.of.exceedance_2 = c(0,0,0,0,0,0,0,0),
 Prob.of.exceedance_3 =
  c(0.26,0.001,0.00019,0.000000573,0.04,0.00017,0.000215,0.000472),
 Prob.of.exceedance_4 =
  c(0.72,0.03,0.008,0.000061,0.475,0.0007,0.00435,0.000405),
 stringsAsFactors=FALSE)
# assume the above has been read in
# add the four columns to the data frame filled with NAs
D$VC_1 <- D$VC_2 <- D$VC_3 <- D$VC_4 <- NA
# names of the variables used in the calculations
calc_vars <- paste("Prob.of.exceedance",1:4,sep="_")
# get the rows for the four damage states
DS1_rows <- D$Damage.state == "DS1"
DS2_rows <- D$Damage.state == "DS2"
DS3_rows <- D$Damage.state == "DS3"
DS4_rows <- D$Damage.state == "DS4"
# step through all possible values of IM.type and Taxonomy
for(IM in unique(D$IM.type)) {
 for(Tax in unique(D$Taxonomy)) {
  # get a logical vector of the rows to be used in this calculation
  calc_rows <- D$IM.type == IM & D$Taxonomy == Tax
  cat(IM,Tax,calc_rows,"\n")
  # check that there are any such rows in the data frame
  if(sum(calc_rows)) {
   # if so, fill in the four values for these rows
   D$VC_1[calc_rows] <- sum(0.01 * (D[calc_rows & DS1_rows,calc_vars] -
    D[calc_rows & DS2_rows,calc_vars]))
   D$VC_2[calc_rows] <- sum(0.02 * (D[calc_rows & DS2_rows,calc_vars] -
    D[calc_rows & DS3_rows,calc_vars]))
   D$VC_3[calc_rows] <- sum(0.43 * (D[calc_rows & DS3_rows,calc_vars] -
    D[calc_rows & DS4_rows,calc_vars]))
   D$VC_4[calc_rows] <- sum(D[calc_rows & DS4_rows,calc_vars])
  }
 }
}
Jim


Hello Jim,
Thank you ever so much, it was very helpful. In fact, what I want to calculate is shown below. My very last question: if I want to save the outcome VC, IM.type and Taxonomy in a new data.frame, how can I do it?
# names of the variables used in the calculations
calc_vars <- paste("Prob.of.exceedance",1:4,sep="_")
# get the rows for the four damage states
DS1_rows <- D$Damage.state == "DS1"
DS2_rows <- D$Damage.state == "DS2"
DS3_rows <- D$Damage.state == "DS3"
DS4_rows <- D$Damage.state == "DS4"
# step through all possible values of IM.type and Taxonomy
for(IM in unique(D$IM.type)) {
 for(Tax in unique(D$Taxonomy)) {
  # get a logical vector of the rows to be used in this calculation
  calc_rows <- D$IM.type == IM & D$Taxonomy == Tax
  cat(IM,Tax,calc_rows,"\n")
  # check that there are any such rows in the data frame
  if(sum(calc_rows)) {
   # if so, compute the four values for this block
   VC <- 0.0 * (1 - D[calc_rows & DS1_rows,calc_vars]) +
    0.02 * (D[calc_rows & DS1_rows,calc_vars] -
     D[calc_rows & DS2_rows,calc_vars]) +
    0.10 * (D[calc_rows & DS2_rows,calc_vars] -
     D[calc_rows & DS3_rows,calc_vars]) +
    0.43 * (D[calc_rows & DS3_rows,calc_vars] -
     D[calc_rows & DS4_rows,calc_vars]) +
    1.0 * D[calc_rows & DS4_rows,calc_vars]
  }
 }
}
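The expression above can be read as a weighted sum over damage bands (no damage, DS1 only, ..., DS4), with weights 0, 0.02, 0.10, 0.43 and 1.0. A minimal sketch of that reading for a single IM.type/Taxonomy block follows. The helper name vc_from_exceedance is hypothetical, and it assumes the rows of the input are ordered DS1 to DS4; the numbers are the PGA / ER+ETR_H1 rows of the sample data.

```r
# Hypothetical helper: rows of P are DS1..DS4 (in that order),
# columns are the exceedance probabilities at each IM level.
vc_from_exceedance <- function(P, w = c(0.0, 0.02, 0.10, 0.43, 1.0)) {
  P <- as.matrix(P)
  upper <- rbind(1, P)   # prepend P(no damage or worse) = 1
  lower <- rbind(P, 0)   # append  P(beyond DS4) = 0
  bands <- upper - lower # 5 rows: 1-DS1, DS1-DS2, DS2-DS3, DS3-DS4, DS4
  drop(w %*% bands)      # weighted sum per column
}

# PGA / ER+ETR_H1 block from the sample data
P <- rbind(DS1 = c(0, 0, 0.26, 0.72),
           DS2 = c(0, 0, 0.001, 0.03),
           DS3 = c(0, 0, 0.00019, 0.008),
           DS4 = c(0, 0, 0.000000573, 0.000061))
colnames(P) <- paste("Prob.of.exceedance", 1:4, sep = "_")

vc <- vc_from_exceedance(P)
print(vc)
```

Writing the weights as a vector keeps the five coefficients in one place, so changing a damage ratio does not mean editing the long arithmetic expression.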


Hi Ioanna,
For simplicity assume that the new data frame will be named E:
E <- D[, c("Taxonomy", "IM.type", paste("VC", 1:4, sep = "_"))]
While I haven't tested this, I'm pretty sure I have it correct. Just
extract the columns you want from D and assign that to E.
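One thing to watch for: if D has one row per damage state, the extracted frame repeats each Taxonomy/IM.type block four times. A small sketch of collapsing that with unique() follows; the toy D and its VC numbers are placeholders made up for illustration, not results from the real data.

```r
# Toy stand-in for D after VC_1..VC_4 have been filled in.
# The VC values are placeholders, not real results.
D <- data.frame(
  IM.type      = rep(c("PGA", "Sa"), each = 4),
  Damage.state = rep(paste0("DS", 1:4), times = 2),
  Taxonomy     = rep(c("ER+ETR_H1", "ER+ETR_H2"), each = 4),
  VC_1 = rep(c(0.1, 0.2), each = 4),
  VC_2 = rep(c(0.3, 0.4), each = 4),
  VC_3 = rep(c(0.5, 0.6), each = 4),
  VC_4 = rep(c(0.7, 0.8), each = 4),
  stringsAsFactors = FALSE
)

# Extract the columns of interest by name
E <- D[, c("Taxonomy", "IM.type", paste("VC", 1:4, sep = "_"))]
# Keep one row per Taxonomy/IM.type block
E <- unique(E)
rownames(E) <- NULL
print(E)
```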
Jim


Hello Jim,
I made some changes to the code: essentially, I substituted the four lines for DS1-4 with one. I estimate VC, which in an ideal world should be a matrix with 4 columns, one for every exceedance_probability_1-4, and 2 rows, one for each unique combination of Taxonomy and IM.type. Could you please check the code I sent last and base your solution on that?
Many thanks.


Hi Ioanna,
We're getting somewhere, but there are four unique combinations of
Taxonomy and IM.type:
ER+ETR_H1,PGA
ER+ETR_H2,PGA
ER+ETR_H1,Sa
ER+ETR_H2,Sa
Perhaps you mean that ER+ETR_H1 only occurs with PGA and ER+ETR_H2
only occurs with Sa. I handled that by checking that there were any
rows that corresponded to the condition requested.
Also you want a matrix for each row containing Taxonomy and IM.type in
the output. When I run what I think you are asking, I only get a two
element list, each a vector of values. Maybe this is what you want,
and it could be coerced into matrix format:
D <- data.frame(Ref.No = c(1622, 1623, 1624, 1625, 1626, 1627, 1628, 1629),
 Region = rep(c('South America'), times = 8),
 IM.type = c('PGA', 'PGA', 'PGA', 'PGA', 'Sa', 'Sa', 'Sa', 'Sa'),
 Damage.state = c('DS1', 'DS2', 'DS3', 'DS4','DS1', 'DS2', 'DS3', 'DS4'),
 Taxonomy = c('ER+ETR_H1','ER+ETR_H1','ER+ETR_H1','ER+ETR_H1',
  'ER+ETR_H2','ER+ETR_H2','ER+ETR_H2','ER+ETR_H2'),
 Prob.of.exceedance_1 = c(0,0,0,0,0,0,0,0),
 Prob.of.exceedance_2 = c(0,0,0,0,0,0,0,0),
 Prob.of.exceedance_3 =
  c(0.26,0.001,0.00019,0.000000573,0.04,0.00017,0.000215,0.000472),
 Prob.of.exceedance_4 =
  c(0.72,0.03,0.008,0.000061,0.475,0.0007,0.00435,0.000405),
 stringsAsFactors=FALSE)
# names of the variables used in the calculations
calc_vars <- paste("Prob.of.exceedance",1:4,sep="_")
# get the rows for the four damage states
DS1_rows <- D$Damage.state == "DS1"
DS2_rows <- D$Damage.state == "DS2"
DS3_rows <- D$Damage.state == "DS3"
DS4_rows <- D$Damage.state == "DS4"
# create an empty list
VC <- list()
# set an index variable for VC
VCindex <- 1
# step through all possible values of IM.type and Taxonomy
for(IM in unique(D$IM.type)) {
 for(Tax in unique(D$Taxonomy)) {
  # get a logical vector of the rows to be used in this calculation
  calc_rows <- D$IM.type == IM & D$Taxonomy == Tax
  cat(IM,Tax,calc_rows,"\n")
  # check that there are any such rows in the data frame
  if(sum(calc_rows)) {
   # if so, compute the four values for this block
   VC[[VCindex]] <- 0.0 * (1 - D[calc_rows & DS1_rows,calc_vars]) +
    0.02 * (D[calc_rows & DS1_rows,calc_vars] -
     D[calc_rows & DS2_rows,calc_vars]) +
    0.10 * (D[calc_rows & DS2_rows,calc_vars] -
     D[calc_rows & DS3_rows,calc_vars]) +
    0.43 * (D[calc_rows & DS3_rows,calc_vars] -
     D[calc_rows & DS4_rows,calc_vars]) +
    1.0 * D[calc_rows & DS4_rows,calc_vars]
   # increment the index
   VCindex <- VCindex+1
  }
 }
}
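As for coercing such a list into matrix or data frame form, a rough sketch is below. It assumes each list element is a single-row data frame, as the loop produces. The VC_labels list is a hypothetical addition (it would have to be filled inside the loop alongside VC), and the numbers are placeholders rather than computed results.

```r
calc_vars <- paste("Prob.of.exceedance", 1:4, sep = "_")

# Stand-ins for two list elements as the loop would produce them
# (placeholder values, not real results).
VC <- list(
  setNames(data.frame(0, 0, 0.1, 0.2), calc_vars),
  setNames(data.frame(0, 0, 0.3, 0.4), calc_vars)
)

# Hypothetical companion list recording which block each element belongs to
VC_labels <- list(
  data.frame(IM.type = "PGA", Taxonomy = "ER+ETR_H1", stringsAsFactors = FALSE),
  data.frame(IM.type = "Sa",  Taxonomy = "ER+ETR_H2", stringsAsFactors = FALSE)
)

# Stack the single-row data frames, then bind the labels alongside
VC_values <- do.call(rbind, VC)
VC_df <- cbind(do.call(rbind, VC_labels), VC_values)
names(VC_df)[3:6] <- paste("VC", 1:4, sep = "_")

# Plain numeric matrix, one row per block
VC_mat <- as.matrix(VC_values)
print(VC_df)
```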
I think we'll get there.
Jim


Hello Jim ,
Thank you ever so much for your help. I was truly stuck!
This looks much better, and yes, I can turn them into a matrix, no problem. Indeed, I need only the results for ER+ETR_H1,PGA and ER+ETR_H2,Sa. One minor point: as it stands, VC has 4 values for three cases instead of the aforementioned two. In fact, the third is identical to the first. Could you please optimize?
Thank you very much again,
Best,
ioanna
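A possible way to get exactly one result per existing combination is to loop over the distinct IM.type/Taxonomy pairs found in the data rather than over every pairing of the two. The sketch below is one such approach, not necessarily what was intended in the thread. It uses the sample data (without the Ref.No, Region and IM_* columns), assumes each block has exactly one row per damage state DS1 to DS4, and the names combos, rows and result are made up for illustration.

```r
D <- data.frame(
  IM.type      = rep(c("PGA", "Sa"), each = 4),
  Damage.state = rep(paste0("DS", 1:4), times = 2),
  Taxonomy     = rep(c("ER+ETR_H1", "ER+ETR_H2"), each = 4),
  Prob.of.exceedance_1 = 0,
  Prob.of.exceedance_2 = 0,
  Prob.of.exceedance_3 = c(0.26, 0.001, 0.00019, 0.000000573,
                           0.04, 0.00017, 0.000215, 0.000472),
  Prob.of.exceedance_4 = c(0.72, 0.03, 0.008, 0.000061,
                           0.475, 0.0007, 0.00435, 0.000405),
  stringsAsFactors = FALSE
)
calc_vars <- paste("Prob.of.exceedance", 1:4, sep = "_")
w <- c(0.0, 0.02, 0.10, 0.43, 1.0)  # weights: none, DS1, DS2, DS3, DS4

# Only the combinations that actually occur in D
combos <- unique(D[, c("IM.type", "Taxonomy")])

rows <- lapply(seq_len(nrow(combos)), function(i) {
  blk <- D[D$IM.type == combos$IM.type[i] &
           D$Taxonomy == combos$Taxonomy[i], ]
  blk <- blk[order(blk$Damage.state), ]        # ensure DS1..DS4 order
  P <- as.matrix(blk[, calc_vars])
  vc <- drop(w %*% (rbind(1, P) - rbind(P, 0)))  # weighted band probabilities
  out <- combos[i, ]
  out[paste("VC", 1:4, sep = "_")] <- as.list(unname(vc))
  out
})

result <- do.call(rbind, rows)
rownames(result) <- NULL
print(result)
```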


I'm probably misunderstanding what you want. I get this from the code I sent:
VC
[[1]]
  Prob.of.exceedance_1 Prob.of.exceedance_2 Prob.of.exceedance_3
1                    0                    0          0.005343027
  Prob.of.exceedance_4
1           0.01947477

[[2]]
  Prob.of.exceedance_1 Prob.of.exceedance_2 Prob.of.exceedance_3
5                    0                    0           0.00115359
  Prob.of.exceedance_4
5           0.01122235

Two list elements with four values. Perhaps you want a matrix for each
block of Taxonomy and IM.type that has a row for each element of the
block? This often happens with a remotely specified problem.
Jim

