Issue with R Loop

BalajiS R User
I have 2 columns, ID and CONTENT (e.g. 5 rows in total).
I would like to extract only the age information from each row (not salary, date, etc.). I am able to extract it from the first 2 rows without any issue, but when the iteration reaches the 3rd row, it breaks with an error.

row1:"Father age-52 Mother agge42 Son age 9 Daugther aage 6  Address is door 23 20002, doc 26-07-1999 pincode 260074"
row2:"Father age-34 Mother agge32 Son Address is door 24 20002, doc 27-07-1999 pincode 260075"
row3:"Little Jonny wnet to the school"
row4:"I love chocolates"
row5:"Sun moon star"

Note: Only the first 2 rows have age|aage|agge information. Rows 3, 4 and 5 don't have any age information, so I expect the output to be "NA", but the code breaks there with the error below.
Error Output:
================
Error in data.frame(di = di, Age = age) :
  arguments imply differing number of rows: 1, 0

Expected Output:
=================
ID            CONTENT
1            "52" "42" "9" "6"
2            "34" "32"
3            NA
4            NA
5            NA

Code:
=====================
library(XML)
library(plyr)
library(stringr)  # provides str_extract() and str_extract_all()
fileUrl <- "C:\\Users\\BA\\Desktop\\XML_files\\103_out.xml"
data_df <- xmlToDataFrame(fileUrl)
df <- data.frame(ID = numeric(), Age = character(), stringsAsFactors = FALSE)
r <- data_df$CONTENT
for (z in seq_len(nrow(data_df))) {
  bb <- grep(".*", r[[z]], value = TRUE)
  # Extract Age: the number following age/aage/agge
  age <- str_extract(str_extract_all(bb, "(?>age|aage|agge).+?\\d+")[[1]], "\\d+")
  # Extract Id
  i <- data_df$ID
  di <- grep("[0-9]", i[[z]], value = TRUE)
  # This is where it breaks: age has length 0 when the row has no match
  df <- rbind(df, data.frame(ID = di, Age = age))
}
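
For reference, here is a minimal sketch of the behaviour I am after, using base R only. It rebuilds the five rows by hand instead of reading the XML file, and extract_ages() is a hypothetical helper name, not something from XML, plyr or stringr. It assumes the age number follows the keyword with at most one separator character in between, and it returns one row per age (long format) rather than the grouped layout shown under Expected Output. The point it illustrates is that an empty match has length 0, so it has to be replaced by NA before calling data.frame().

```r
# Sample data rebuilt from the five rows in the post (stands in for the XML file).
data_df <- data.frame(
  ID = 1:5,
  CONTENT = c(
    "Father age-52 Mother agge42 Son age 9 Daugther aage 6  Address is door 23 20002, doc 26-07-1999 pincode 260074",
    "Father age-34 Mother agge32 Son Address is door 24 20002, doc 27-07-1999 pincode 260075",
    "Little Jonny wnet to the school",
    "I love chocolates",
    "Sun moon star"
  ),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the numbers that directly follow age/aage/agge
# (optionally after one non-alphanumeric separator), or NA when there are none.
extract_ages <- function(txt) {
  pattern <- "(?:age|aage|agge)[^[:alnum:]]?[0-9]+"
  hits <- regmatches(txt, gregexpr(pattern, txt, perl = TRUE))[[1]]
  ages <- gsub("[^0-9]", "", hits)  # keep only the digits of each hit
  if (length(ages) == 0) NA_character_ else ages
}

# Build one small data frame per input row, then stack them.
rows <- lapply(seq_len(nrow(data_df)), function(z) {
  data.frame(
    ID = data_df$ID[z],
    Age = extract_ages(data_df$CONTENT[z]),
    stringsAsFactors = FALSE
  )
})
result <- do.call(rbind, rows)
print(result)
```

Because extract_ages() never returns a zero-length vector, data.frame() always gets at least one Age value per ID, so rows 3, 4 and 5 come out with NA instead of stopping the loop.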