help

classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

help

shawnbiologist
Need help with R studio. Code is to pull data from todays date plus 5 years 4 months from now. I am missing the last 3 months of data. Can you please help?

Shawn Miller | Aquatic Biologist II | Assessment Section
Environmental Protection | Clean Water
Rachel Carson State Office Building
400 Market Street | Harrisburg, PA 17101
Phone: 717.772.2185 | Fax: 717.772.3249
www.depweb.state.pa.us<https://webmail.state.pa.us/OWA/redir.aspx?C=t4jGxr3_mkC5mWY30vM0D8N-9RdJ8s9IgIGFizoEzsd1aNOJaDwGjjpEh4RqLFX24CIJXV9M2ic.&URL=http%3a%2f%2fwww.depweb.state.pa.us%2f>


        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: help

David Carlson
You need Santa Claus not r-help. You haven't given us a fraction of the information we would need to help. You don't show us your code. You don't tell us where the information is coming from except "today's date." You don't tell us what data you want. You don't seem to know the difference between R and R-Studio.

----------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77843-4352


-----Original Message-----
From: R-help <[hidden email]> On Behalf Of Miller, Shawn
Sent: Thursday, December 13, 2018 11:29 AM
To: [hidden email]
Subject: [R] help

Need help with R studio. Code is to pull data from todays date plus 5 years 4 months from now. I am missing the last 3 months of data. Can you please help?

Shawn Miller | Aquatic Biologist II | Assessment Section
Environmental Protection | Clean Water
Rachel Carson State Office Building
400 Market Street | Harrisburg, PA 17101
Phone: 717.772.2185 | Fax: 717.772.3249
www.depweb.state.pa.us<https://webmail.state.pa.us/OWA/redir.aspx?C=t4jGxr3_mkC5mWY30vM0D8N-9RdJ8s9IgIGFizoEzsd1aNOJaDwGjjpEh4RqLFX24CIJXV9M2ic.&URL=http%3a%2f%2fwww.depweb.state.pa.us%2f>


        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: [External] RE: help

David Carlson
Use Reply-All to keep the thread on the list.

If the most recent WQN SIS Data Download table is 14June2018_WQN_Data_Download, does it contain any data for the months July, August, September, October, November, December? After extracting the data you create the file "Anti-deg_requests/test.xlsx". Have you confirmed that all of the dates you need are present in that file?

----------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77843-4352

-----Original Message-----
From: Miller, Shawn <[hidden email]>
Sent: Thursday, December 13, 2018 1:15 PM
To: David L Carlson <[hidden email]>
Subject: RE: [External] RE: help

setwd("c:/users/shawnmille/Desktop/Anti-deg_requests/")

library(plyr)
library(reshape2)
library(RODBC)
library(dplyr)



## MAKE SURE YOU ARE USING THE 32-bit VERSION OF R (See Tools>Global Options>General in RStudio to set this)
## READING DATA FROM EXISTING DATABASES (EXAMPLE: ACCESS 2010)
chan <- odbcConnectAccess("WQN Datadumps.mdb")

# Show list of tables in database
sqlTables(chan,tableType='TABLE')
# Fetch the most recent WQN SIS Data Download table, this takes a couple minutes
allwqn <- sqlFetch(chan,'14June2018_WQN_Data_Download',stringsAsFactors=FALSE)
names(allwqn)
## subset to the desired fields and values, choose your WQN by modifying command

wqn.specific <- subset(allwqn,MONITORING_POINT_ALIAS_ID =='WQN0735',select=c('DATE_COLLECTED','MONITORING_POINT_ALIAS_ID','QUALITY_ASSURANCE_TYPE_DESC',
                                                                                                                  'TEST_CODE','TEST_SHORT_DESC','READING_INDICATOR_CODE','FINAL_AMOUNT','ABBREVIATION'))  

###VIEW ENTIRE DATASET, this coding can be used to view any created datasets in excel#####
library(xlsx)
write.xlsx(wqn.specific,"c:/users/shawnmille/Desktop/Anti-deg_requests/test.xlsx")

###VIEW ENTIRE DATASET#####

# FOR ACTIVE WQN STATION GET TODAY'S DATE and date of 5 years ago + 4 months or put in last date in command#
today <- Sys.Date()
TODAY.5<-today-1949
## Select data where DATA_COLLECTED >= TODAY.5##
wqn.specific$DATE_COLLECTED <- as.Date(wqn.specific$DATE_COLLECTED ,"%m/%d/%y")
wqn.last5yrs <- subset(wqn.specific,DATE_COLLECTED >= TODAY.5, select=c('DATE_COLLECTED','MONITORING_POINT_ALIAS_ID','QUALITY_ASSURANCE_TYPE_DESC','TEST_CODE','TEST_SHORT_DESC','READING_INDICATOR_CODE','FINAL_AMOUNT','ABBREVIATION'))


#OR FOR INACTIVE WQN STATION ENTER THE LAST DATE OF RECORD BELOW AND GET PAST 5 YEARS OF DATA IN SUBSET#
lastdate <- as.Date("11/01/2010", format = "%m/%d/%Y")
firstdate <- lastdate-1949
wqn.specific$DATE_COLLECTED <- as.Date(wqn.specific$DATE_COLLECTED ,"%m/%d/%y")
wqn.last5yrs <- subset(wqn.specific,DATE_COLLECTED >= firstdate & DATE_COLLECTED <= lastdate, select=c('DATE_COLLECTED','MONITORING_POINT_ALIAS_ID','QUALITY_ASSURANCE_TYPE_DESC','TEST_CODE','TEST_SHORT_DESC','READING_INDICATOR_CODE','FINAL_AMOUNT','ABBREVIATION'))

#### Delete Blanks and QA column #####
wqn.removeblanks <-subset(wqn.last5yrs,QUALITY_ASSURANCE_TYPE_DESC=="Duplicate"|is.na(QUALITY_ASSURANCE_TYPE_DESC),select=c('DATE_COLLECTED','MONITORING_POINT_ALIAS_ID',
                                                                                                                            'TEST_CODE','TEST_SHORT_DESC','READING_INDICATOR_CODE','FINAL_AMOUNT','ABBREVIATION'))
#### Delete Null Results####
wqn.removenull <- subset(wqn.removeblanks,FINAL_AMOUNT!="NA",select=c('DATE_COLLECTED','MONITORING_POINT_ALIAS_ID','TEST_CODE','TEST_SHORT_DESC','READING_INDICATOR_CODE','FINAL_AMOUNT','ABBREVIATION'))

### Remove non-relevant parameters ####

wqn.removeparm <- subset(wqn.removenull,TEST_CODE!= "571"&TEST_CODE!="572"&
                           TEST_CODE!="573"& TEST_CODE!="543"& TEST_CODE!="561"& TEST_CODE!="546"
                         & TEST_CODE!="595"& TEST_CODE!="596"& TEST_CODE!="597"& TEST_CODE!="598"
                         & TEST_CODE!="599"& TEST_CODE!="600"& TEST_CODE!="71946"& TEST_CODE!="71940"
                         & TEST_CODE!="587"& TEST_CODE!="593"& TEST_CODE!="71939"& TEST_CODE!="71937"
                         & TEST_CODE!="574"& TEST_CODE!="549"& TEST_CODE!="99014"& TEST_CODE!="547"
                         & TEST_CODE!="298"& TEST_CODE!="551"& TEST_CODE!="71930"& TEST_CODE!="70508"
                         & TEST_CODE!="564"& TEST_CODE!="709"& TEST_CODE!="111"& TEST_CODE!="592"
                         & TEST_CODE!="MMTECMF"& TEST_CODE!="588"& TEST_CODE!="589"& TEST_CODE!="590"
                         & TEST_CODE!="594"& TEST_CODE!="71936"& TEST_CODE!="71945"& TEST_CODE!="F00061"
                         & TEST_CODE!="71947"& TEST_CODE!="555",
                         select=c('DATE_COLLECTED','MONITORING_POINT_ALIAS_ID','TEST_CODE','TEST_SHORT_DESC','READING_INDICATOR_CODE','FINAL_AMOUNT','ABBREVIATION'))

###  Combine test codes that are measuring the same analyte, format testcode to include first 5 digits###


wqn.removeparm$TEST_CODE<-substr(wqn.removeparm$TEST_CODE,1,5)


######Creates a table (wqn.det) with list of max detection limits for each parameter,FOR EACH TEST_CODE BRING ALL  < TO THAT VALUE############

n.det <- subset(wqn.removeparm,READING_INDICATOR_CODE=='<')
wqn.det<- aggregate(n.det$FINAL_AMOUNT , by=list(n.det$MONITORING_POINT_ALIAS_ID,n.det$TEST_CODE),FUN=max)
colnames(wqn.det) <- c("MONITORING_POINT_ALIAS_ID","TEST_CODE","DET_LIMIT")

wqn.mer<-merge(wqn.removeparm,wqn.det, by=c("MONITORING_POINT_ALIAS_ID","TEST_CODE"),all=TRUE)
wqn.mer$Amount<- ifelse(is.na(wqn.mer$'READING_INDICATOR_CODE'),wqn.mer$FINAL_AMOUNT*1,wqn.mer$DET_LIMIT*1)


######DIVIDE '<' BY HALF #####
wqn.mer$Result <- ifelse(is.na(wqn.mer$'READING_INDICATOR_CODE'),wqn.mer$Amount*1,wqn.mer$Amount/2)

###### AVERAGE AND GROUP BY TO REMOVE DUPLICATES ########

wqn.agg<-aggregate(wqn.mer$Result, by=list(MONITORING_POINT_ALIAS_ID=wqn.mer$'MONITORING_POINT_ALIAS_ID',DATE_COLLECTED=wqn.mer$'DATE_COLLECTED',TEST_CODE=wqn.mer$'TEST_CODE'),data=wqn.mer, FUN="mean",na.rm=TRUE)
colnames(wqn.agg) <- c("MONITORING_POINT_ALIAS_ID","DATE_COLLECTED","TEST_CODE","Result")

###### Calculate Median and 95% CI, alpha/2 for tailed2 (pH)..similar to ci.median function in asbio package, using alpha for funtion "tailed1"(other parameters)###


tailed2<-function(x){
  n <- nrow(as.matrix(x))
  L <- qbinom(.025,n,0.5)
  U <- n-L+1
  order.x <-sort(x)
  lower <- (order.x[L])
  upper <-order.x[n -
                    L + 1]
  median <- median(x)
  coverage <- 1-(2*pbinom(qbinom(.025,n,.5)-1,n,0.5))
  y<-list(c(lower, median, upper,n,coverage))

 
 
 
 
 
}
tailed1<-function(x){
  n <- nrow(as.matrix(x))
  L <- qbinom(.05,n,0.5)
  U <- n-L+1
  order.x <-sort(x)
  lower <- (order.x[L])
  upper <-order.x[n -
                    L+1]
  median <- median(x)
  coverage <- 1-(2*pbinom(qbinom(.05,n,.5)-1,n,0.5))
  y<-list(c(lower, median, upper,n,coverage))
 
 
 
 
}


###Create two data sets, one with all the parameters except pH and one with just pH#####
wqn.ph <- subset(wqn.agg,TEST_CODE=='00403'|TEST_CODE=="F0040",select=c("MONITORING_POINT_ALIAS_ID","DATE_COLLECTED","TEST_CODE","Result"))
wqn.noph <-subset(wqn.agg,TEST_CODE!='00403'& TEST_CODE!="F0040",select=c("MONITORING_POINT_ALIAS_ID","DATE_COLLECTED","TEST_CODE","Result"))                  
                   
###Run tailed1 function on all data except for pH###                  
                   
stats<-aggregate(wqn.noph$Result, by=list(wqn.noph$MONITORING_POINT_ALIAS_ID,wqn.noph$TEST_CODE), FUN=tailed1)
colnames(stats) <- c("MONITORING_POINT_ALIAS_ID","TEST_CODE","x")

stats$lower <-unlist(lapply(stats$x, '[[', 1))
stats$median<-unlist(lapply(stats$x, '[[', 2))
stats$upper <-unlist(lapply(stats$x, '[[', 3))
stats$n <-unlist(lapply(stats$x, '[[', 4))
stats1 <- stats[order(stats$"MONITORING_POINT_ALIAS_ID",stats$"TEST_CODE"),]

###Run tailed2 function on pH (lab and field) data####
ph.stats<-aggregate(wqn.ph$Result, by=list(wqn.ph$MONITORING_POINT_ALIAS_ID,wqn.ph$TEST_CODE), FUN=tailed2)
colnames(ph.stats) <- c("MONITORING_POINT_ALIAS_ID","TEST_CODE","x")

ph.stats$lower <-unlist(lapply(ph.stats$x, '[[', 1))
ph.stats$median<-unlist(lapply(ph.stats$x, '[[', 2))
ph.stats$upper <-unlist(lapply(ph.stats$x, '[[', 3))
ph.stats$n <-unlist(lapply(ph.stats$x, '[[', 4))
ph.stats1 <- ph.stats[order(ph.stats$"MONITORING_POINT_ALIAS_ID",ph.stats$"TEST_CODE"),]

###Combine datasets vertically###
stats2 <- rbind(stats1,ph.stats1)


####PeriodofRecordTable####

library(dplyr)

my.dt<-(format(as.Date(wqn.removeparm$"DATE_COLLECTED"),"%m/%d/%Y"))

my.dt1<-aggregate(my.dt, by=list(wqn.removeparm$MONITORING_POINT_ALIAS_ID,wqn.removeparm$TEST_CODE), FUN=first)
colnames(my.dt1) <- c("MONITORING_POINT_ALIAS_ID","TEST_CODE","FirstDate")

my.dtlast<-aggregate(my.dt, by=list(wqn.removeparm$MONITORING_POINT_ALIAS_ID,wqn.removeparm$TEST_CODE), FUN=last)
colnames(my.dtlast) <- c("MONITORING_POINT_ALIAS_ID","TEST_CODE","LastDate")



####Merging Tables Dates and Stats####################
all.dt <- merge(my.dt1,my.dtlast,by=c("MONITORING_POINT_ALIAS_ID","TEST_CODE"))

dt.stats<-merge(all.dt,stats2,by=c("MONITORING_POINT_ALIAS_ID","TEST_CODE"))


####Calculate Period Of record in Table, Old coding 2 lines: dt.stats$pdofrecord<-(mdy(dt.stats$"LastDate"))-(mdy(dt.stats$"FirstDate"))
###dt.stats$pd<-round(dt.stats$pdofrecord/31563000, digits=1), lubridate is glitchy test new code on this step more######


dt.stats$pd <- round((as.Date(dt.stats$LastDate,format = "%m/%d/%Y")-as.Date(dt.stats$FirstDate,format = "%m/%d/%Y"))/365,digits=1)



####Retrieve Test Descriptions and Merge######
t1 <- subset(dt.stats,select=c('MONITORING_POINT_ALIAS_ID','TEST_CODE','FirstDate','LastDate','pd','lower','median','upper','n'))
t2<-subset(wqn.removeparm,select=c('TEST_CODE','TEST_SHORT_DESC','ABBREVIATION'))
t3 <- unique( t2 )

t4<-merge(t1,t3, by='TEST_CODE',all.t3=TRUE)
######ReorderColumns,sort, rename columns#######
t5 <- t4[ ,c(2,1,10,3,4,5,9,6,7,8,11)]
t6 <- t5[order(t5$"MONITORING_POINT_ALIAS_ID",t5$"TEST_SHORT_DESC"),]
names(t6)

####This does not need to be run. It produces an error. You still get the correct results.
library(plyr)
library(dplyr)
library(reshape2)
final.tab<-rename(t6, c("FirstDate"="FIRST_DATE", "LastDate"="LAST_DATE","pd"="PERIOD_OF_RECORD(yrs)","n"="SAMPLE_SIZE","lower"
             ="L_95_CI","median"="MEDIAN_","upper"="U_95_CI","ABBREVIATION"="UNITS"))

####BRING VALUES UP TO DETECTION LIMITS merge with wqn.det######
less<-function(x){
  sprintf("< %3.2f", x)
}

wqn.final <-merge(t6,wqn.det,by=c("MONITORING_POINT_ALIAS_ID","TEST_CODE"),all=TRUE)
wqn.final$DET_LIMIT[is.na(wqn.final$DET_LIMIT)] <- 0
wqn.final$LOW_95_CL <- ifelse(wqn.final$lower < wqn.final$DET_LIM,less(wqn.final$DET_LIM*1),wqn.final$lower*1)
wqn.final$MEDIAN <- ifelse(wqn.final$median < wqn.final$DET_LIM,less(wqn.final$DET_LIM*1),wqn.final$median*1)
wqn.final$UPP_95_CL <- ifelse(wqn.final$upper < wqn.final$DET_LIM,less(wqn.final$DET_LIM*1),wqn.final$upper*1)

####ONLY REPORT MINIMUM FOR ALKALINITY, PH, DO, TEMP, SPECIFIC cONDUCTANCE######

wqn.final$LOWER_95_CL <- ifelse(wqn.final$TEST_CODE=="00403"|wqn.final$TEST_CODE=="00410"|wqn.final$TEST_CODE=="F0040"|
                                    wqn.final$TEST_CODE=="F0030"|wqn.final$TEST_CODE=="F0043",wqn.final$LOW_95_CL,"NA")

wqn.final$UPPER_95_CL <- ifelse(wqn.final$TEST_CODE=="00410"|wqn.final$TEST_CODE=="F0043"|
                                  wqn.final$TEST_CODE=="F0030","NA",wqn.final$UPP_95_CL)


names(wqn.final)                                  
final.1<-subset(wqn.final, select=c('MONITORING_POINT_ALIAS_ID','TEST_CODE','TEST_SHORT_DESC','FirstDate',
                                    'LastDate', 'pd',"n","LOWER_95_CL",'MEDIAN',
                                    'UPPER_95_CL','ABBREVIATION'))                                    
final.2 <- final.1[order(final.1$"MONITORING_POINT_ALIAS_ID",final.1$"TEST_SHORT_DESC"),]                                    

###VIEW IN EXCEL##
library(xlsx)
write.xlsx(final.2,"c:/users/shawnmille/Desktop/Anti-deg_requests/final.2.xlsx")


-----Original Message-----
From: David L Carlson [mailto:[hidden email]]
Sent: Thursday, December 13, 2018 2:12 PM
To: Miller, Shawn <[hidden email]>; [hidden email]
Subject: [External] RE: help

ATTENTION: This email message is from an external sender. Do not open links or attachments from unknown sources. To report suspicious email, forward the message as an attachment to [hidden email].

You need Santa Claus not r-help. You haven't given us a fraction of the information we would need to help. You don't show us your code. You don't tell us where the information is coming from except "today's date." You don't tell us what data you want. You don't seem to know the difference between R and R-Studio.

----------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77843-4352


-----Original Message-----
From: R-help <[hidden email]> On Behalf Of Miller, Shawn
Sent: Thursday, December 13, 2018 11:29 AM
To: [hidden email]
Subject: [R] help

Need help with R studio. Code is to pull data from todays date plus 5 years 4 months from now. I am missing the last 3 months of data. Can you please help?

Shawn Miller | Aquatic Biologist II | Assessment Section Environmental Protection | Clean Water Rachel Carson State Office Building
400 Market Street | Harrisburg, PA 17101
Phone: 717.772.2185 | Fax: 717.772.3249
https://na01.safelinks.protection.outlook.com/?url=www.depweb.state.pa.us&amp;data=02%7C01%7Cshawnmille%40pa.gov%7Cdfe052e5375647a1c32f08d6612eec13%7C418e284101284dd59b6c47fc5a9a1bde%7C1%7C1%7C636803251493279248&amp;sdata=ScL5NIrFc1A4g9wKVAYxhZ6Hrwnb7qvgjijh2WHEENY%3D&amp;reserved=0<https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwebmail.state.pa.us%2FOWA%2Fredir.aspx%3FC%3Dt4jGxr3_mkC5mWY30vM0D8N-9RdJ8s9IgIGFizoEzsd1aNOJaDwGjjpEh4RqLFX24CIJXV9M2ic.%26URL%3Dhttp%253a%252f%252fwww.depweb.state.pa.us%252f&amp;data=02%7C01%7Cshawnmille%40pa.gov%7Cdfe052e5375647a1c32f08d6612eec13%7C418e284101284dd59b6c47fc5a9a1bde%7C1%7C1%7C636803251493279248&amp;sdata=H6VzxlE9wQCOszIoDOZXOoI5%2BZmsu2qVfP1Y%2BEU50SU%3D&amp;reserved=0>


        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fstat.ethz.ch%2Fmailman%2Flistinfo%2Fr-help&amp;data=02%7C01%7Cshawnmille%40pa.gov%7Cdfe052e5375647a1c32f08d6612eec13%7C418e284101284dd59b6c47fc5a9a1bde%7C1%7C1%7C636803251493279248&amp;sdata=WM4AQs45%2FgKgmoxl4bS%2B5sTSaWYhCt0YhmkVSlwqgeY%3D&amp;reserved=0
PLEASE do read the posting guide https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.R-project.org%2Fposting-guide.html&amp;data=02%7C01%7Cshawnmille%40pa.gov%7Cdfe052e5375647a1c32f08d6612eec13%7C418e284101284dd59b6c47fc5a9a1bde%7C1%7C1%7C636803251493279248&amp;sdata=BSVgu1ome5ob4rGe%2Bck3hcInxJn6tF0N%2F1CBJezIKG4%3D&amp;reserved=0
and provide commented, minimal, self-contained, reproducible code.

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: [External] RE: help

shawnbiologist
Yes I confirmed all the dates I need are in the test.xlsx table. The final.2.xlsx table does not have the dates  for July, August, September.

-----Original Message-----
From: David L Carlson [mailto:[hidden email]]
Sent: Thursday, December 13, 2018 2:33 PM
To: Miller, Shawn <[hidden email]>
Cc: [hidden email]
Subject: RE: [External] RE: help

Use Reply-All to keep the thread on the list.

If the most recent WQN SIS Data Download table is 14June2018_WQN_Data_Download, does it contain any data for the months July, August, September, October, November, December? After extracting the data you create the file "Anti-deg_requests/test.xlsx". Have you confirmed that all of the dates you need are present in that file?

----------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77843-4352

-----Original Message-----
From: Miller, Shawn <[hidden email]>
Sent: Thursday, December 13, 2018 1:15 PM
To: David L Carlson <[hidden email]>
Subject: RE: [External] RE: help

setwd("c:/users/shawnmille/Desktop/Anti-deg_requests/")

library(plyr)
library(reshape2)
library(RODBC)
library(dplyr)



## MAKE SURE YOU ARE USING THE 32-bit VERSION OF R (See Tools>Global Options>General in RStudio to set this) ## READING DATA FROM EXISTING DATABASES (EXAMPLE: ACCESS 2010) chan <- odbcConnectAccess("WQN Datadumps.mdb")

# Show list of tables in database
sqlTables(chan,tableType='TABLE')
# Fetch the most recent WQN SIS Data Download table, this takes a couple minutes allwqn <- sqlFetch(chan,'14June2018_WQN_Data_Download',stringsAsFactors=FALSE)
names(allwqn)
## subset to the desired fields and values, choose your WQN by modifying command

wqn.specific <- subset(allwqn,MONITORING_POINT_ALIAS_ID =='WQN0735',select=c('DATE_COLLECTED','MONITORING_POINT_ALIAS_ID','QUALITY_ASSURANCE_TYPE_DESC',
                                                                                                                  'TEST_CODE','TEST_SHORT_DESC','READING_INDICATOR_CODE','FINAL_AMOUNT','ABBREVIATION'))  

###VIEW ENTIRE DATASET, this coding can be used to view any created datasets in excel#####
library(xlsx)
write.xlsx(wqn.specific,"c:/users/shawnmille/Desktop/Anti-deg_requests/test.xlsx")

###VIEW ENTIRE DATASET#####

# FOR ACTIVE WQN STATION GET TODAY'S DATE and date of 5 years ago + 4 months or put in last date in command# today <- Sys.Date()
TODAY.5<-today-1949
## Select data where DATA_COLLECTED >= TODAY.5## wqn.specific$DATE_COLLECTED <- as.Date(wqn.specific$DATE_COLLECTED ,"%m/%d/%y") wqn.last5yrs <- subset(wqn.specific,DATE_COLLECTED >= TODAY.5, select=c('DATE_COLLECTED','MONITORING_POINT_ALIAS_ID','QUALITY_ASSURANCE_TYPE_DESC','TEST_CODE','TEST_SHORT_DESC','READING_INDICATOR_CODE','FINAL_AMOUNT','ABBREVIATION'))


#OR FOR INACTIVE WQN STATION ENTER THE LAST DATE OF RECORD BELOW AND GET PAST 5 YEARS OF DATA IN SUBSET# lastdate <- as.Date("11/01/2010", format = "%m/%d/%Y") firstdate <- lastdate-1949 wqn.specific$DATE_COLLECTED <- as.Date(wqn.specific$DATE_COLLECTED ,"%m/%d/%y") wqn.last5yrs <- subset(wqn.specific,DATE_COLLECTED >= firstdate & DATE_COLLECTED <= lastdate, select=c('DATE_COLLECTED','MONITORING_POINT_ALIAS_ID','QUALITY_ASSURANCE_TYPE_DESC','TEST_CODE','TEST_SHORT_DESC','READING_INDICATOR_CODE','FINAL_AMOUNT','ABBREVIATION'))

#### Delete Blanks and QA column #####
wqn.removeblanks <-subset(wqn.last5yrs,QUALITY_ASSURANCE_TYPE_DESC=="Duplicate"|is.na(QUALITY_ASSURANCE_TYPE_DESC),select=c('DATE_COLLECTED','MONITORING_POINT_ALIAS_ID',
                                                                                                                            'TEST_CODE','TEST_SHORT_DESC','READING_INDICATOR_CODE','FINAL_AMOUNT','ABBREVIATION'))
#### Delete Null Results####
wqn.removenull <- subset(wqn.removeblanks,FINAL_AMOUNT!="NA",select=c('DATE_COLLECTED','MONITORING_POINT_ALIAS_ID','TEST_CODE','TEST_SHORT_DESC','READING_INDICATOR_CODE','FINAL_AMOUNT','ABBREVIATION'))

### Remove non-relevant parameters ####

wqn.removeparm <- subset(wqn.removenull,TEST_CODE!= "571"&TEST_CODE!="572"&
                           TEST_CODE!="573"& TEST_CODE!="543"& TEST_CODE!="561"& TEST_CODE!="546"
                         & TEST_CODE!="595"& TEST_CODE!="596"& TEST_CODE!="597"& TEST_CODE!="598"
                         & TEST_CODE!="599"& TEST_CODE!="600"& TEST_CODE!="71946"& TEST_CODE!="71940"
                         & TEST_CODE!="587"& TEST_CODE!="593"& TEST_CODE!="71939"& TEST_CODE!="71937"
                         & TEST_CODE!="574"& TEST_CODE!="549"& TEST_CODE!="99014"& TEST_CODE!="547"
                         & TEST_CODE!="298"& TEST_CODE!="551"& TEST_CODE!="71930"& TEST_CODE!="70508"
                         & TEST_CODE!="564"& TEST_CODE!="709"& TEST_CODE!="111"& TEST_CODE!="592"
                         & TEST_CODE!="MMTECMF"& TEST_CODE!="588"& TEST_CODE!="589"& TEST_CODE!="590"
                         & TEST_CODE!="594"& TEST_CODE!="71936"& TEST_CODE!="71945"& TEST_CODE!="F00061"
                         & TEST_CODE!="71947"& TEST_CODE!="555",
                         select=c('DATE_COLLECTED','MONITORING_POINT_ALIAS_ID','TEST_CODE','TEST_SHORT_DESC','READING_INDICATOR_CODE','FINAL_AMOUNT','ABBREVIATION'))

###  Combine test codes that are measuring the same analyte, format testcode to include first 5 digits###


wqn.removeparm$TEST_CODE<-substr(wqn.removeparm$TEST_CODE,1,5)


######Creates a table (wqn.det) with list of max detection limits for each parameter,FOR EACH TEST_CODE BRING ALL  < TO THAT VALUE############

n.det <- subset(wqn.removeparm,READING_INDICATOR_CODE=='<')
wqn.det<- aggregate(n.det$FINAL_AMOUNT , by=list(n.det$MONITORING_POINT_ALIAS_ID,n.det$TEST_CODE),FUN=max)
colnames(wqn.det) <- c("MONITORING_POINT_ALIAS_ID","TEST_CODE","DET_LIMIT")

wqn.mer<-merge(wqn.removeparm,wqn.det, by=c("MONITORING_POINT_ALIAS_ID","TEST_CODE"),all=TRUE)
wqn.mer$Amount<- ifelse(is.na(wqn.mer$'READING_INDICATOR_CODE'),wqn.mer$FINAL_AMOUNT*1,wqn.mer$DET_LIMIT*1)


######DIVIDE '<' BY HALF #####
wqn.mer$Result <- ifelse(is.na(wqn.mer$'READING_INDICATOR_CODE'),wqn.mer$Amount*1,wqn.mer$Amount/2)

###### AVERAGE AND GROUP BY TO REMOVE DUPLICATES ########

wqn.agg<-aggregate(wqn.mer$Result, by=list(MONITORING_POINT_ALIAS_ID=wqn.mer$'MONITORING_POINT_ALIAS_ID',DATE_COLLECTED=wqn.mer$'DATE_COLLECTED',TEST_CODE=wqn.mer$'TEST_CODE'),data=wqn.mer, FUN="mean",na.rm=TRUE)
colnames(wqn.agg) <- c("MONITORING_POINT_ALIAS_ID","DATE_COLLECTED","TEST_CODE","Result")

###### Calculate Median and 95% CI, alpha/2 for tailed2 (pH)..similar to ci.median function in asbio package, using alpha for funtion "tailed1"(other parameters)###


tailed2<-function(x){
  n <- nrow(as.matrix(x))
  L <- qbinom(.025,n,0.5)
  U <- n-L+1
  order.x <-sort(x)
  lower <- (order.x[L])
  upper <-order.x[n -
                    L + 1]
  median <- median(x)
  coverage <- 1-(2*pbinom(qbinom(.025,n,.5)-1,n,0.5))
  y<-list(c(lower, median, upper,n,coverage))

 
 
 
 
 
}
tailed1<-function(x){
  n <- nrow(as.matrix(x))
  L <- qbinom(.05,n,0.5)
  U <- n-L+1
  order.x <-sort(x)
  lower <- (order.x[L])
  upper <-order.x[n -
                    L+1]
  median <- median(x)
  coverage <- 1-(2*pbinom(qbinom(.05,n,.5)-1,n,0.5))
  y<-list(c(lower, median, upper,n,coverage))
 
 
 
 
}


###Create two data sets, one with all the parameters except pH and one with just pH##### wqn.ph <- subset(wqn.agg,TEST_CODE=='00403'|TEST_CODE=="F0040",select=c("MONITORING_POINT_ALIAS_ID","DATE_COLLECTED","TEST_CODE","Result"))
wqn.noph <-subset(wqn.agg,TEST_CODE!='00403'& TEST_CODE!="F0040",select=c("MONITORING_POINT_ALIAS_ID","DATE_COLLECTED","TEST_CODE","Result"))                  
                   
###Run tailed1 function on all data except for pH###                  
                   
stats<-aggregate(wqn.noph$Result, by=list(wqn.noph$MONITORING_POINT_ALIAS_ID,wqn.noph$TEST_CODE), FUN=tailed1)
colnames(stats) <- c("MONITORING_POINT_ALIAS_ID","TEST_CODE","x")

stats$lower <-unlist(lapply(stats$x, '[[', 1)) stats$median<-unlist(lapply(stats$x, '[[', 2)) stats$upper <-unlist(lapply(stats$x, '[[', 3)) stats$n <-unlist(lapply(stats$x, '[[', 4))
stats1 <- stats[order(stats$"MONITORING_POINT_ALIAS_ID",stats$"TEST_CODE"),]

###Run tailed2 function on pH (lab and field) data#### ph.stats<-aggregate(wqn.ph$Result, by=list(wqn.ph$MONITORING_POINT_ALIAS_ID,wqn.ph$TEST_CODE), FUN=tailed2)
colnames(ph.stats) <- c("MONITORING_POINT_ALIAS_ID","TEST_CODE","x")

ph.stats$lower <-unlist(lapply(ph.stats$x, '[[', 1)) ph.stats$median<-unlist(lapply(ph.stats$x, '[[', 2)) ph.stats$upper <-unlist(lapply(ph.stats$x, '[[', 3)) ph.stats$n <-unlist(lapply(ph.stats$x, '[[', 4))
ph.stats1 <- ph.stats[order(ph.stats$"MONITORING_POINT_ALIAS_ID",ph.stats$"TEST_CODE"),]

###Combine datasets vertically###
stats2 <- rbind(stats1,ph.stats1)


####PeriodofRecordTable####

library(dplyr)

my.dt<-(format(as.Date(wqn.removeparm$"DATE_COLLECTED"),"%m/%d/%Y"))

my.dt1<-aggregate(my.dt, by=list(wqn.removeparm$MONITORING_POINT_ALIAS_ID,wqn.removeparm$TEST_CODE), FUN=first)
colnames(my.dt1) <- c("MONITORING_POINT_ALIAS_ID","TEST_CODE","FirstDate")

my.dtlast<-aggregate(my.dt, by=list(wqn.removeparm$MONITORING_POINT_ALIAS_ID,wqn.removeparm$TEST_CODE), FUN=last)
colnames(my.dtlast) <- c("MONITORING_POINT_ALIAS_ID","TEST_CODE","LastDate")



####Merging Tables Dates and Stats#################### all.dt <- merge(my.dt1,my.dtlast,by=c("MONITORING_POINT_ALIAS_ID","TEST_CODE"))

dt.stats<-merge(all.dt,stats2,by=c("MONITORING_POINT_ALIAS_ID","TEST_CODE"))


####Calculate Period Of record in Table, Old coding 2 lines: dt.stats$pdofrecord<-(mdy(dt.stats$"LastDate"))-(mdy(dt.stats$"FirstDate"))
###dt.stats$pd<-round(dt.stats$pdofrecord/31563000, digits=1), lubridate is glitchy test new code on this step more######


dt.stats$pd <- round((as.Date(dt.stats$LastDate,format = "%m/%d/%Y")-as.Date(dt.stats$FirstDate,format = "%m/%d/%Y"))/365,digits=1)



####Retrieve Test Descriptions and Merge######
t1 <- subset(dt.stats,select=c('MONITORING_POINT_ALIAS_ID','TEST_CODE','FirstDate','LastDate','pd','lower','median','upper','n'))
t2<-subset(wqn.removeparm,select=c('TEST_CODE','TEST_SHORT_DESC','ABBREVIATION'))
t3 <- unique( t2 )

t4<-merge(t1,t3, by='TEST_CODE',all.t3=TRUE) ######ReorderColumns,sort, rename columns#######
t5 <- t4[ ,c(2,1,10,3,4,5,9,6,7,8,11)]
t6 <- t5[order(t5$"MONITORING_POINT_ALIAS_ID",t5$"TEST_SHORT_DESC"),]
names(t6)

####This does not need to be run. It produces an error. You still get the correct results.
library(plyr)
library(dplyr)
library(reshape2)
final.tab<-rename(t6, c("FirstDate"="FIRST_DATE", "LastDate"="LAST_DATE","pd"="PERIOD_OF_RECORD(yrs)","n"="SAMPLE_SIZE","lower"
             ="L_95_CI","median"="MEDIAN_","upper"="U_95_CI","ABBREVIATION"="UNITS"))

####BRING VALUES UP TO DETECTION LIMITS merge with wqn.det###### less<-function(x){
  sprintf("< %3.2f", x)
}

wqn.final <-merge(t6,wqn.det,by=c("MONITORING_POINT_ALIAS_ID","TEST_CODE"),all=TRUE)
wqn.final$DET_LIMIT[is.na(wqn.final$DET_LIMIT)] <- 0 wqn.final$LOW_95_CL <- ifelse(wqn.final$lower < wqn.final$DET_LIM,less(wqn.final$DET_LIM*1),wqn.final$lower*1)
wqn.final$MEDIAN <- ifelse(wqn.final$median < wqn.final$DET_LIM,less(wqn.final$DET_LIM*1),wqn.final$median*1)
wqn.final$UPP_95_CL <- ifelse(wqn.final$upper < wqn.final$DET_LIM,less(wqn.final$DET_LIM*1),wqn.final$upper*1)

####ONLY REPORT MINIMUM FOR ALKALINITY, PH, DO, TEMP, SPECIFIC cONDUCTANCE######

wqn.final$LOWER_95_CL <- ifelse(wqn.final$TEST_CODE=="00403"|wqn.final$TEST_CODE=="00410"|wqn.final$TEST_CODE=="F0040"|
                                    wqn.final$TEST_CODE=="F0030"|wqn.final$TEST_CODE=="F0043",wqn.final$LOW_95_CL,"NA")

wqn.final$UPPER_95_CL <- ifelse(wqn.final$TEST_CODE=="00410"|wqn.final$TEST_CODE=="F0043"|
                                  wqn.final$TEST_CODE=="F0030","NA",wqn.final$UPP_95_CL)


names(wqn.final)                                  
final.1<-subset(wqn.final, select=c('MONITORING_POINT_ALIAS_ID','TEST_CODE','TEST_SHORT_DESC','FirstDate',
                                    'LastDate', 'pd',"n","LOWER_95_CL",'MEDIAN',
                                    'UPPER_95_CL','ABBREVIATION'))                                    
final.2 <- final.1[order(final.1$"MONITORING_POINT_ALIAS_ID",final.1$"TEST_SHORT_DESC"),]                                    

###VIEW IN EXCEL##
library(xlsx)
write.xlsx(final.2,"c:/users/shawnmille/Desktop/Anti-deg_requests/final.2.xlsx")


-----Original Message-----
From: David L Carlson [mailto:[hidden email]]
Sent: Thursday, December 13, 2018 2:12 PM
To: Miller, Shawn <[hidden email]>; [hidden email]
Subject: [External] RE: help

ATTENTION: This email message is from an external sender. Do not open links or attachments from unknown sources. To report suspicious email, forward the message as an attachment to [hidden email].

You need Santa Claus not r-help. You haven't given us a fraction of the information we would need to help. You don't show us your code. You don't tell us where the information is coming from except "today's date." You don't tell us what data you want. You don't seem to know the difference between R and R-Studio.

----------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77843-4352


-----Original Message-----
From: R-help <[hidden email]> On Behalf Of Miller, Shawn
Sent: Thursday, December 13, 2018 11:29 AM
To: [hidden email]
Subject: [R] help

Need help with R studio. Code is to pull data from todays date plus 5 years 4 months from now. I am missing the last 3 months of data. Can you please help?

Shawn Miller | Aquatic Biologist II | Assessment Section Environmental Protection | Clean Water Rachel Carson State Office Building
400 Market Street | Harrisburg, PA 17101
Phone: 717.772.2185 | Fax: 717.772.3249
https://na01.safelinks.protection.outlook.com/?url=www.depweb.state.pa.us&amp;data=02%7C01%7Cshawnmille%40pa.gov%7Ce275c28afad54d99b5d608d66131bec5%7C418e284101284dd59b6c47fc5a9a1bde%7C1%7C0%7C636803263627811233&amp;sdata=SetioLDNtYwPWgXTfgp4XNoH2ptEpbMkVCkYBcBFJAE%3D&amp;reserved=0<https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwebmail.state.pa.us%2FOWA%2Fredir.aspx%3FC%3Dt4jGxr3_mkC5mWY30vM0D8N-9RdJ8s9IgIGFizoEzsd1aNOJaDwGjjpEh4RqLFX24CIJXV9M2ic.%26URL%3Dhttp%253a%252f%252fwww.depweb.state.pa.us%252f&amp;data=02%7C01%7Cshawnmille%40pa.gov%7Ce275c28afad54d99b5d608d66131bec5%7C418e284101284dd59b6c47fc5a9a1bde%7C1%7C0%7C636803263627811233&amp;sdata=8mPY9gVwTHUVb6EhpRueVUcHQIyTyd7QYYctDdbwBsM%3D&amp;reserved=0>


        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fstat.ethz.ch%2Fmailman%2Flistinfo%2Fr-help&amp;data=02%7C01%7Cshawnmille%40pa.gov%7Ce275c28afad54d99b5d608d66131bec5%7C418e284101284dd59b6c47fc5a9a1bde%7C1%7C0%7C636803263627811233&amp;sdata=FEb6F5f1w9Y7%2BPPPfHvCkQNVZjNJJjHod%2F0%2BX4IifUw%3D&amp;reserved=0
PLEASE do read the posting guide https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.R-project.org%2Fposting-guide.html&amp;data=02%7C01%7Cshawnmille%40pa.gov%7Ce275c28afad54d99b5d608d66131bec5%7C418e284101284dd59b6c47fc5a9a1bde%7C1%7C0%7C636803263627811233&amp;sdata=ZemT7Y8LV%2FFIi5J46qF1rN48UlM1OxelnJYFhcmvLRc%3D&amp;reserved=0
and provide commented, minimal, self-contained, reproducible code.

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: [External] RE: help

David Carlson
OK. Follow the chain of subsets to figure out where the dates go missing.

Scanning the code I see the following subsets: wqn.last5yrs, wqn.removeblanks, wqn.removenull, wqn.removeparm, n.det, wqn.ph, wqn.noph. Use the range function to find out where the observations get dropped out.

range(wqn.last5yrs$DATE_COLLECTED)

David C

-----Original Message-----
From: Miller, Shawn <[hidden email]>
Sent: Thursday, December 13, 2018 1:42 PM
To: David L Carlson <[hidden email]>
Cc: [hidden email]
Subject: RE: [External] RE: help

Yes I confirmed all the dates I need are in the test.xlsx table. The final.2.xlsx table does not have the dates  for July, August, September.

-----Original Message-----
From: David L Carlson [mailto:[hidden email]]
Sent: Thursday, December 13, 2018 2:33 PM
To: Miller, Shawn <[hidden email]>
Cc: [hidden email]
Subject: RE: [External] RE: help

Use Reply-All to keep the thread on the list.

If the most recent WQN SIS Data Download table is 14June2018_WQN_Data_Download, does it contain any data for the months July, August, September, October, November, December? After extracting the data you create the file "Anti-deg_requests/test.xlsx". Have you confirmed that all of the dates you need are present in that file?

----------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77843-4352

-----Original Message-----
From: Miller, Shawn <[hidden email]>
Sent: Thursday, December 13, 2018 1:15 PM
To: David L Carlson <[hidden email]>
Subject: RE: [External] RE: help

setwd("c:/users/shawnmille/Desktop/Anti-deg_requests/")

library(plyr)
library(reshape2)
library(RODBC)
library(dplyr)



## MAKE SURE YOU ARE USING THE 32-bit VERSION OF R (See Tools>Global Options>General in RStudio to set this) ## READING DATA FROM EXISTING DATABASES (EXAMPLE: ACCESS 2010) chan <- odbcConnectAccess("WQN Datadumps.mdb")

# Show list of tables in database
sqlTables(chan,tableType='TABLE')
# Fetch the most recent WQN SIS Data Download table, this takes a couple minutes allwqn <- sqlFetch(chan,'14June2018_WQN_Data_Download',stringsAsFactors=FALSE)
names(allwqn)
## subset to the desired fields and values, choose your WQN by modifying command

wqn.specific <- subset(allwqn,MONITORING_POINT_ALIAS_ID =='WQN0735',select=c('DATE_COLLECTED','MONITORING_POINT_ALIAS_ID','QUALITY_ASSURANCE_TYPE_DESC',
                                                                                                                  'TEST_CODE','TEST_SHORT_DESC','READING_INDICATOR_CODE','FINAL_AMOUNT','ABBREVIATION'))  

###VIEW ENTIRE DATASET, this coding can be used to view any created datasets in excel#####
library(xlsx)
write.xlsx(wqn.specific,"c:/users/shawnmille/Desktop/Anti-deg_requests/test.xlsx")

###VIEW ENTIRE DATASET#####

# FOR ACTIVE WQN STATION GET TODAY'S DATE and date of 5 years ago + 4 months or put in last date in command# today <- Sys.Date()
TODAY.5<-today-1949
## Select data where DATA_COLLECTED >= TODAY.5## wqn.specific$DATE_COLLECTED <- as.Date(wqn.specific$DATE_COLLECTED ,"%m/%d/%y") wqn.last5yrs <- subset(wqn.specific,DATE_COLLECTED >= TODAY.5, select=c('DATE_COLLECTED','MONITORING_POINT_ALIAS_ID','QUALITY_ASSURANCE_TYPE_DESC','TEST_CODE','TEST_SHORT_DESC','READING_INDICATOR_CODE','FINAL_AMOUNT','ABBREVIATION'))


#OR FOR INACTIVE WQN STATION ENTER THE LAST DATE OF RECORD BELOW AND GET PAST 5 YEARS OF DATA IN SUBSET# lastdate <- as.Date("11/01/2010", format = "%m/%d/%Y") firstdate <- lastdate-1949 wqn.specific$DATE_COLLECTED <- as.Date(wqn.specific$DATE_COLLECTED ,"%m/%d/%y") wqn.last5yrs <- subset(wqn.specific,DATE_COLLECTED >= firstdate & DATE_COLLECTED <= lastdate, select=c('DATE_COLLECTED','MONITORING_POINT_ALIAS_ID','QUALITY_ASSURANCE_TYPE_DESC','TEST_CODE','TEST_SHORT_DESC','READING_INDICATOR_CODE','FINAL_AMOUNT','ABBREVIATION'))

#### Delete Blanks and QA column #####
wqn.removeblanks <-subset(wqn.last5yrs,QUALITY_ASSURANCE_TYPE_DESC=="Duplicate"|is.na(QUALITY_ASSURANCE_TYPE_DESC),select=c('DATE_COLLECTED','MONITORING_POINT_ALIAS_ID',
                                                                                                                            'TEST_CODE','TEST_SHORT_DESC','READING_INDICATOR_CODE','FINAL_AMOUNT','ABBREVIATION'))
#### Delete Null Results####
wqn.removenull <- subset(wqn.removeblanks,FINAL_AMOUNT!="NA",select=c('DATE_COLLECTED','MONITORING_POINT_ALIAS_ID','TEST_CODE','TEST_SHORT_DESC','READING_INDICATOR_CODE','FINAL_AMOUNT','ABBREVIATION'))

### Remove non-relevant parameters ####

wqn.removeparm <- subset(wqn.removenull,TEST_CODE!= "571"&TEST_CODE!="572"&
                           TEST_CODE!="573"& TEST_CODE!="543"& TEST_CODE!="561"& TEST_CODE!="546"
                         & TEST_CODE!="595"& TEST_CODE!="596"& TEST_CODE!="597"& TEST_CODE!="598"
                         & TEST_CODE!="599"& TEST_CODE!="600"& TEST_CODE!="71946"& TEST_CODE!="71940"
                         & TEST_CODE!="587"& TEST_CODE!="593"& TEST_CODE!="71939"& TEST_CODE!="71937"
                         & TEST_CODE!="574"& TEST_CODE!="549"& TEST_CODE!="99014"& TEST_CODE!="547"
                         & TEST_CODE!="298"& TEST_CODE!="551"& TEST_CODE!="71930"& TEST_CODE!="70508"
                         & TEST_CODE!="564"& TEST_CODE!="709"& TEST_CODE!="111"& TEST_CODE!="592"
                         & TEST_CODE!="MMTECMF"& TEST_CODE!="588"& TEST_CODE!="589"& TEST_CODE!="590"
                         & TEST_CODE!="594"& TEST_CODE!="71936"& TEST_CODE!="71945"& TEST_CODE!="F00061"
                         & TEST_CODE!="71947"& TEST_CODE!="555",
                         select=c('DATE_COLLECTED','MONITORING_POINT_ALIAS_ID','TEST_CODE','TEST_SHORT_DESC','READING_INDICATOR_CODE','FINAL_AMOUNT','ABBREVIATION'))

###  Combine test codes that are measuring the same analyte, format testcode to include first 5 digits###


wqn.removeparm$TEST_CODE<-substr(wqn.removeparm$TEST_CODE,1,5)


######Creates a table (wqn.det) with list of max detection limits for each parameter,FOR EACH TEST_CODE BRING ALL  < TO THAT VALUE############

n.det <- subset(wqn.removeparm,READING_INDICATOR_CODE=='<')
wqn.det<- aggregate(n.det$FINAL_AMOUNT , by=list(n.det$MONITORING_POINT_ALIAS_ID,n.det$TEST_CODE),FUN=max)
colnames(wqn.det) <- c("MONITORING_POINT_ALIAS_ID","TEST_CODE","DET_LIMIT")

wqn.mer<-merge(wqn.removeparm,wqn.det, by=c("MONITORING_POINT_ALIAS_ID","TEST_CODE"),all=TRUE)
wqn.mer$Amount<- ifelse(is.na(wqn.mer$'READING_INDICATOR_CODE'),wqn.mer$FINAL_AMOUNT*1,wqn.mer$DET_LIMIT*1)


######DIVIDE '<' BY HALF #####
wqn.mer$Result <- ifelse(is.na(wqn.mer$'READING_INDICATOR_CODE'),wqn.mer$Amount*1,wqn.mer$Amount/2)

###### AVERAGE AND GROUP BY TO REMOVE DUPLICATES ########

wqn.agg<-aggregate(wqn.mer$Result, by=list(MONITORING_POINT_ALIAS_ID=wqn.mer$'MONITORING_POINT_ALIAS_ID',DATE_COLLECTED=wqn.mer$'DATE_COLLECTED',TEST_CODE=wqn.mer$'TEST_CODE'),data=wqn.mer, FUN="mean",na.rm=TRUE)
colnames(wqn.agg) <- c("MONITORING_POINT_ALIAS_ID","DATE_COLLECTED","TEST_CODE","Result")

###### Calculate Median and 95% CI, alpha/2 for tailed2 (pH)..similar to ci.median function in asbio package, using alpha for funtion "tailed1"(other parameters)###


tailed2<-function(x){
  n <- nrow(as.matrix(x))
  L <- qbinom(.025,n,0.5)
  U <- n-L+1
  order.x <-sort(x)
  lower <- (order.x[L])
  upper <-order.x[n -
                    L + 1]
  median <- median(x)
  coverage <- 1-(2*pbinom(qbinom(.025,n,.5)-1,n,0.5))
  y<-list(c(lower, median, upper,n,coverage))

 
 
 
 
 
}
tailed1<-function(x){
  n <- nrow(as.matrix(x))
  L <- qbinom(.05,n,0.5)
  U <- n-L+1
  order.x <-sort(x)
  lower <- (order.x[L])
  upper <-order.x[n -
                    L+1]
  median <- median(x)
  coverage <- 1-(2*pbinom(qbinom(.05,n,.5)-1,n,0.5))
  y<-list(c(lower, median, upper,n,coverage))
 
 
 
 
}


###Create two data sets, one with all the parameters except pH and one with just pH##### wqn.ph <- subset(wqn.agg,TEST_CODE=='00403'|TEST_CODE=="F0040",select=c("MONITORING_POINT_ALIAS_ID","DATE_COLLECTED","TEST_CODE","Result"))
wqn.noph <-subset(wqn.agg,TEST_CODE!='00403'& TEST_CODE!="F0040",select=c("MONITORING_POINT_ALIAS_ID","DATE_COLLECTED","TEST_CODE","Result"))                  
                   
###Run tailed1 function on all data except for pH###                  
                   
stats<-aggregate(wqn.noph$Result, by=list(wqn.noph$MONITORING_POINT_ALIAS_ID,wqn.noph$TEST_CODE), FUN=tailed1)
colnames(stats) <- c("MONITORING_POINT_ALIAS_ID","TEST_CODE","x")

stats$lower <-unlist(lapply(stats$x, '[[', 1)) stats$median<-unlist(lapply(stats$x, '[[', 2)) stats$upper <-unlist(lapply(stats$x, '[[', 3)) stats$n <-unlist(lapply(stats$x, '[[', 4))
stats1 <- stats[order(stats$"MONITORING_POINT_ALIAS_ID",stats$"TEST_CODE"),]

###Run tailed2 function on pH (lab and field) data#### ph.stats<-aggregate(wqn.ph$Result, by=list(wqn.ph$MONITORING_POINT_ALIAS_ID,wqn.ph$TEST_CODE), FUN=tailed2)
colnames(ph.stats) <- c("MONITORING_POINT_ALIAS_ID","TEST_CODE","x")

ph.stats$lower <-unlist(lapply(ph.stats$x, '[[', 1)) ph.stats$median<-unlist(lapply(ph.stats$x, '[[', 2)) ph.stats$upper <-unlist(lapply(ph.stats$x, '[[', 3)) ph.stats$n <-unlist(lapply(ph.stats$x, '[[', 4))
ph.stats1 <- ph.stats[order(ph.stats$"MONITORING_POINT_ALIAS_ID",ph.stats$"TEST_CODE"),]

###Combine datasets vertically###
stats2 <- rbind(stats1,ph.stats1)


####PeriodofRecordTable####

library(dplyr)

my.dt<-(format(as.Date(wqn.removeparm$"DATE_COLLECTED"),"%m/%d/%Y"))

my.dt1<-aggregate(my.dt, by=list(wqn.removeparm$MONITORING_POINT_ALIAS_ID,wqn.removeparm$TEST_CODE), FUN=first)
colnames(my.dt1) <- c("MONITORING_POINT_ALIAS_ID","TEST_CODE","FirstDate")

my.dtlast<-aggregate(my.dt, by=list(wqn.removeparm$MONITORING_POINT_ALIAS_ID,wqn.removeparm$TEST_CODE), FUN=last)
colnames(my.dtlast) <- c("MONITORING_POINT_ALIAS_ID","TEST_CODE","LastDate")



####Merging Tables Dates and Stats#################### all.dt <- merge(my.dt1,my.dtlast,by=c("MONITORING_POINT_ALIAS_ID","TEST_CODE"))

dt.stats<-merge(all.dt,stats2,by=c("MONITORING_POINT_ALIAS_ID","TEST_CODE"))


####Calculate Period Of record in Table, Old coding 2 lines: dt.stats$pdofrecord<-(mdy(dt.stats$"LastDate"))-(mdy(dt.stats$"FirstDate"))
###dt.stats$pd<-round(dt.stats$pdofrecord/31563000, digits=1), lubridate is glitchy test new code on this step more######


dt.stats$pd <- round((as.Date(dt.stats$LastDate,format = "%m/%d/%Y")-as.Date(dt.stats$FirstDate,format = "%m/%d/%Y"))/365,digits=1)



####Retrieve Test Descriptions and Merge######
t1 <- subset(dt.stats,select=c('MONITORING_POINT_ALIAS_ID','TEST_CODE','FirstDate','LastDate','pd','lower','median','upper','n'))
t2<-subset(wqn.removeparm,select=c('TEST_CODE','TEST_SHORT_DESC','ABBREVIATION'))
t3 <- unique( t2 )

t4<-merge(t1,t3, by='TEST_CODE',all.t3=TRUE) ######ReorderColumns,sort, rename columns#######
t5 <- t4[ ,c(2,1,10,3,4,5,9,6,7,8,11)]
t6 <- t5[order(t5$"MONITORING_POINT_ALIAS_ID",t5$"TEST_SHORT_DESC"),]
names(t6)

####This does not need to be run. It produces an error. You still get the correct results.
library(plyr)
library(dplyr)
library(reshape2)
final.tab<-rename(t6, c("FirstDate"="FIRST_DATE", "LastDate"="LAST_DATE","pd"="PERIOD_OF_RECORD(yrs)","n"="SAMPLE_SIZE","lower"
             ="L_95_CI","median"="MEDIAN_","upper"="U_95_CI","ABBREVIATION"="UNITS"))

####BRING VALUES UP TO DETECTION LIMITS merge with wqn.det###### less<-function(x){
  sprintf("< %3.2f", x)
}

wqn.final <-merge(t6,wqn.det,by=c("MONITORING_POINT_ALIAS_ID","TEST_CODE"),all=TRUE)
wqn.final$DET_LIMIT[is.na(wqn.final$DET_LIMIT)] <- 0 wqn.final$LOW_95_CL <- ifelse(wqn.final$lower < wqn.final$DET_LIM,less(wqn.final$DET_LIM*1),wqn.final$lower*1)
wqn.final$MEDIAN <- ifelse(wqn.final$median < wqn.final$DET_LIM,less(wqn.final$DET_LIM*1),wqn.final$median*1)
wqn.final$UPP_95_CL <- ifelse(wqn.final$upper < wqn.final$DET_LIM,less(wqn.final$DET_LIM*1),wqn.final$upper*1)

####ONLY REPORT MINIMUM FOR ALKALINITY, PH, DO, TEMP, SPECIFIC cONDUCTANCE######

wqn.final$LOWER_95_CL <- ifelse(wqn.final$TEST_CODE=="00403"|wqn.final$TEST_CODE=="00410"|wqn.final$TEST_CODE=="F0040"|
                                    wqn.final$TEST_CODE=="F0030"|wqn.final$TEST_CODE=="F0043",wqn.final$LOW_95_CL,"NA")

wqn.final$UPPER_95_CL <- ifelse(wqn.final$TEST_CODE=="00410"|wqn.final$TEST_CODE=="F0043"|
                                  wqn.final$TEST_CODE=="F0030","NA",wqn.final$UPP_95_CL)


names(wqn.final)                                  
final.1<-subset(wqn.final, select=c('MONITORING_POINT_ALIAS_ID','TEST_CODE','TEST_SHORT_DESC','FirstDate',
                                    'LastDate', 'pd',"n","LOWER_95_CL",'MEDIAN',
                                    'UPPER_95_CL','ABBREVIATION'))                                    
final.2 <- final.1[order(final.1$"MONITORING_POINT_ALIAS_ID",final.1$"TEST_SHORT_DESC"),]                                    

###VIEW IN EXCEL##
library(xlsx)
write.xlsx(final.2,"c:/users/shawnmille/Desktop/Anti-deg_requests/final.2.xlsx")


-----Original Message-----
From: David L Carlson [mailto:[hidden email]]
Sent: Thursday, December 13, 2018 2:12 PM
To: Miller, Shawn <[hidden email]>; [hidden email]
Subject: [External] RE: help

ATTENTION: This email message is from an external sender. Do not open links or attachments from unknown sources. To report suspicious email, forward the message as an attachment to [hidden email].

You need Santa Claus not r-help. You haven't given us a fraction of the information we would need to help. You don't show us your code. You don't tell us where the information is coming from except "today's date." You don't tell us what data you want. You don't seem to know the difference between R and R-Studio.

----------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77843-4352


-----Original Message-----
From: R-help <[hidden email]> On Behalf Of Miller, Shawn
Sent: Thursday, December 13, 2018 11:29 AM
To: [hidden email]
Subject: [R] help

Need help with R studio. Code is to pull data from todays date plus 5 years 4 months from now. I am missing the last 3 months of data. Can you please help?

Shawn Miller | Aquatic Biologist II | Assessment Section Environmental Protection | Clean Water Rachel Carson State Office Building
400 Market Street | Harrisburg, PA 17101
Phone: 717.772.2185 | Fax: 717.772.3249
https://na01.safelinks.protection.outlook.com/?url=www.depweb.state.pa.us&amp;data=02%7C01%7Cshawnmille%40pa.gov%7Ce275c28afad54d99b5d608d66131bec5%7C418e284101284dd59b6c47fc5a9a1bde%7C1%7C0%7C636803263627811233&amp;sdata=SetioLDNtYwPWgXTfgp4XNoH2ptEpbMkVCkYBcBFJAE%3D&amp;reserved=0<https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwebmail.state.pa.us%2FOWA%2Fredir.aspx%3FC%3Dt4jGxr3_mkC5mWY30vM0D8N-9RdJ8s9IgIGFizoEzsd1aNOJaDwGjjpEh4RqLFX24CIJXV9M2ic.%26URL%3Dhttp%253a%252f%252fwww.depweb.state.pa.us%252f&amp;data=02%7C01%7Cshawnmille%40pa.gov%7Ce275c28afad54d99b5d608d66131bec5%7C418e284101284dd59b6c47fc5a9a1bde%7C1%7C0%7C636803263627811233&amp;sdata=8mPY9gVwTHUVb6EhpRueVUcHQIyTyd7QYYctDdbwBsM%3D&amp;reserved=0>


        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fstat.ethz.ch%2Fmailman%2Flistinfo%2Fr-help&amp;data=02%7C01%7Cshawnmille%40pa.gov%7Ce275c28afad54d99b5d608d66131bec5%7C418e284101284dd59b6c47fc5a9a1bde%7C1%7C0%7C636803263627811233&amp;sdata=FEb6F5f1w9Y7%2BPPPfHvCkQNVZjNJJjHod%2F0%2BX4IifUw%3D&amp;reserved=0
PLEASE do read the posting guide https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.R-project.org%2Fposting-guide.html&amp;data=02%7C01%7Cshawnmille%40pa.gov%7Ce275c28afad54d99b5d608d66131bec5%7C418e284101284dd59b6c47fc5a9a1bde%7C1%7C0%7C636803263627811233&amp;sdata=ZemT7Y8LV%2FFIi5J46qF1rN48UlM1OxelnJYFhcmvLRc%3D&amp;reserved=0
and provide commented, minimal, self-contained, reproducible code.

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

fortune nomination WAS:Re: help

Berry, Charles
In reply to this post by David Carlson

"You need Santa Claus not r-help."

in response to an unrealistic and poorly posed request for help.

Best,

Chuck

> On Dec 13, 2018, at 11:12 AM, David L Carlson <[hidden email]> wrote:
>
> You need Santa Claus not r-help. You haven't given us a fraction of the information we would need to help. You don't show us your code. You don't tell us where the information is coming from except "today's date." You don't tell us what data you want. You don't seem to know the difference between R and R-Studio.
>
> ----------------------------------------
> David L Carlson
> Department of Anthropology
> Texas A&M University
> College Station, TX 77843-4352
>
>
> -----Original Message-----
> From: R-help <[hidden email]> On Behalf Of Miller, Shawn
> Sent: Thursday, December 13, 2018 11:29 AM
> To: [hidden email]
> Subject: [R] help
>
> Need help with R studio. Code is to pull data from todays date plus 5 years 4 months from now. I am missing the last 3 months of data. Can you please help?
>
> Shawn Miller | Aquatic Biologist II | Assessment Section
> Environmental Protection | Clean Water
> Rachel Carson State Office Building
> 400 Market Street | Harrisburg, PA 17101
> Phone: 717.772.2185 | Fax: 717.772.3249
> www.depweb.state.pa.us<https://webmail.state.pa.us/OWA/redir.aspx?C=t4jGxr3_mkC5mWY30vM0D8N-9RdJ8s9IgIGFizoEzsd1aNOJaDwGjjpEh4RqLFX24CIJXV9M2ic.&URL=http%3a%2f%2fwww.depweb.state.pa.us%2f>
>
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.