navel-gazing

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

navel-gazing

bbolker
month <- "2010-August"
list <- "r-help"
##list <- "r-sig-ecology"
##list <- "r-sig-mixed-models"
## month <- "2010q3"
n <- 50
baseurl <- "https://stat.ethz.ch/pipermail/"
library(RCurl)
z <- getURL(paste(baseurl,list,"/",month,"/author.html",sep=""))
zz <- strsplit(z,"<LI>")[[1]]
namefun <- function(x) {
  gsub("\\n","",gsub("^.+<I>","",gsub("</I>.+$","",x)))
}

cnames <- sapply(zz[3:(length(zz)-1)],namefun)
rr <- rev(sort(table(cnames)))


library(lattice)
dotplot(~rev(rr[1:n]),xlab="Number of posts")

dotplot(~rev(rr[1:n]),xlab="Number of posts",
        scales=list(x=list(log=10)))

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: navel-gazing

Henrique Dallazuanna
Ben,

I change the line:

z <- getURL(paste(baseurl,list,"/", month,"/author.html",sep=""))

to

z <- getURL(paste(baseurl,list,"/", month,"/author.html",sep=""),
ssl.verifypeer = FALSE)

because don't work for me.

Nice!

On Tue, Aug 17, 2010 at 1:47 PM, Ben Bolker <[hidden email]> wrote:

> month <- "2010-August"
> list <- "r-help"
> ##list <- "r-sig-ecology"
> ##list <- "r-sig-mixed-models"
> ## month <- "2010q3"
> n <- 50
> baseurl <- "https://stat.ethz.ch/pipermail/"
> library(RCurl)
> z <- getURL(paste(baseurl,list,"/",month,"/author.html",sep=""))
> zz <- strsplit(z,"<LI>")[[1]]
> namefun <- function(x) {
>  gsub("\\n","",gsub("^.+<I>","",gsub("</I>.+$","",x)))
> }
>
> cnames <- sapply(zz[3:(length(zz)-1)],namefun)
> rr <- rev(sort(table(cnames)))
>
>
> library(lattice)
> dotplot(~rev(rr[1:n]),xlab="Number of posts")
>
> dotplot(~rev(rr[1:n]),xlab="Number of posts",
>        scales=list(x=list(log=10)))
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O

        [[alternative HTML version deleted]]


______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: navel-gazing

Brian Diggs
Since Peter Dalgaard is splitting his considerable contributions between
"Peter Dalgaard" and "peter dalgaard", I made the following changes
(which shouldn't be a problem unless e e cummings becomes a regular poster):

# from base::chartr documentation
capwords <- function(s, strict = FALSE) {
     cap <- function(s) paste(toupper(substring(s,1,1)),
                   {s <- substring(s,2); if(strict) tolower(s) else s},
                              sep = "", collapse = " " )
     sapply(strsplit(s, split = " "), cap, USE.NAMES = !is.null(names(s)))
}

cnames <- capwords(sapply(zz[3:(length(zz)-1)],namefun))



On 8/17/2010 10:00 AM, Henrique Dallazuanna wrote:

> Ben,
>
> I change the line:
>
> z<- getURL(paste(baseurl,list,"/", month,"/author.html",sep=""))
>
> to
>
> z<- getURL(paste(baseurl,list,"/", month,"/author.html",sep=""),
> ssl.verifypeer = FALSE)
>
> because don't work for me.
>
> Nice!
>
> On Tue, Aug 17, 2010 at 1:47 PM, Ben Bolker<[hidden email]>  wrote:
>
>> month<- "2010-August"
>> list<- "r-help"
>> ##list<- "r-sig-ecology"
>> ##list<- "r-sig-mixed-models"
>> ## month<- "2010q3"
>> n<- 50
>> baseurl<- "https://stat.ethz.ch/pipermail/"
>> library(RCurl)
>> z<- getURL(paste(baseurl,list,"/",month,"/author.html",sep=""))
>> zz<- strsplit(z,"<LI>")[[1]]
>> namefun<- function(x) {
>>   gsub("\\n","",gsub("^.+<I>","",gsub("</I>.+$","",x)))
>> }
>>
>> cnames<- sapply(zz[3:(length(zz)-1)],namefun)
>> rr<- rev(sort(table(cnames)))
>>
>>
>> library(lattice)
>> dotplot(~rev(rr[1:n]),xlab="Number of posts")
>>
>> dotplot(~rev(rr[1:n]),xlab="Number of posts",
>>         scales=list(x=list(log=10)))
>>
>> ______________________________________________
>> [hidden email] mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>

--
Brian Diggs
Senior Research Associate, Department of Surgery, Oregon Health &
Science University

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: navel-gazing

Henrique Dallazuanna
I think that gsub example on help page is more clear:

library(XML)

# could be used the XML package to get the names
cnames <- gsub('\n', '', head(tail(sapply(getNodeSet(htmlParse(z, asText =
TRUE), "//i"), xmlValue), -3), -3))
gsub("(\\w)(\\w*)", "\\U\\1\\L\\2", cnames, perl=TRUE)



On Tue, Aug 17, 2010 at 4:44 PM, Brian Diggs <[hidden email]> wrote:

> Since Peter Dalgaard is splitting his considerable contributions between
> "Peter Dalgaard" and "peter dalgaard", I made the following changes (which
> shouldn't be a problem unless e e cummings becomes a regular poster):
>
> # from base::chartr documentation
> capwords <- function(s, strict = FALSE) {
>    cap <- function(s) paste(toupper(substring(s,1,1)),
>                  {s <- substring(s,2); if(strict) tolower(s) else s},
>                             sep = "", collapse = " " )
>    sapply(strsplit(s, split = " "), cap, USE.NAMES = !is.null(names(s)))
> }
>
> cnames <- capwords(sapply(zz[3:(length(zz)-1)],namefun))
>
>
>
>
> On 8/17/2010 10:00 AM, Henrique Dallazuanna wrote:
>
>> Ben,
>>
>> I change the line:
>>
>> z<- getURL(paste(baseurl,list,"/", month,"/author.html",sep=""))
>>
>> to
>>
>> z<- getURL(paste(baseurl,list,"/", month,"/author.html",sep=""),
>> ssl.verifypeer = FALSE)
>>
>> because don't work for me.
>>
>> Nice!
>>
>> On Tue, Aug 17, 2010 at 1:47 PM, Ben Bolker<[hidden email]>  wrote:
>>
>>  month<- "2010-August"
>>> list<- "r-help"
>>> ##list<- "r-sig-ecology"
>>> ##list<- "r-sig-mixed-models"
>>> ## month<- "2010q3"
>>> n<- 50
>>> baseurl<- "https://stat.ethz.ch/pipermail/"
>>> library(RCurl)
>>> z<- getURL(paste(baseurl,list,"/",month,"/author.html",sep=""))
>>> zz<- strsplit(z,"<LI>")[[1]]
>>> namefun<- function(x) {
>>>  gsub("\\n","",gsub("^.+<I>","",gsub("</I>.+$","",x)))
>>> }
>>>
>>> cnames<- sapply(zz[3:(length(zz)-1)],namefun)
>>> rr<- rev(sort(table(cnames)))
>>>
>>>
>>> library(lattice)
>>> dotplot(~rev(rr[1:n]),xlab="Number of posts")
>>>
>>> dotplot(~rev(rr[1:n]),xlab="Number of posts",
>>>        scales=list(x=list(log=10)))
>>>
>>> ______________________________________________
>>> [hidden email] mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>
>>
>>
> --
> Brian Diggs
> Senior Research Associate, Department of Surgery, Oregon Health & Science
> University
>


--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O

        [[alternative HTML version deleted]]


______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: navel-gazing

Joshua Wiley-2
I have been tinkering around with this for a bit, and I am proud to
share navel gazer 1.0.

If no arguments are passed, it will look up the top 50 authors on the
r-help list, for the given month in the given year. You can also
specify one or more months as a character vector (e.g., "August" or
c("August", "September") ). The same goes for years.  Thanks to some
help from Henrique (although I promise this was not the only reason I
wanted it), it will not try to pull a month beyond the current one.
You can also choose a different list (such as r-devel).  If you have
the time, you can set argument entire = TRUE, which will look up every
month from whatever year(s) you specified (or the current year if you
did not). It will return a named list with each element corresponding
to one month. Also, by default it will create a dotplot in lattice
(though this may be turned off via plot = FALSE). Finally, you can
specify how many authors you want. It defaults to 50.

It is also available at here:
http://gist.github.com/584910

#######################################

navel.gazer <- function(month = NULL, year = NULL, entire = FALSE,
                         list = "r-help", n = 50, plot = TRUE) {
  # Ben Bolker came up with most of the code
  # Henrique Dallazuanna provided an edit to the z <- line of code
  # Brian Diggs provided capwords() to properly count Peter Dalgaard
  # Joshua Wiley adapted all of it to one function


  if(is.null(month)) {
    month <- format(Sys.Date(), format = "%B")
  }
  if(isTRUE(entire)) {
    month <- unique(months(as.Date(1:365, "2000-01-01")))
  }
  if(is.null(year)) {
    year <- format(Sys.Date(), format = "%Y")
  }
  if(length(year) > 1) {
    tmp <- vector(mode = "list", length = length(year))
    for(i in seq_along(year)) {
      tmp[[i]] <- paste(year[i], month, sep = "-")
    }
    times <- unlist(tmp)
  } else {
    times <- paste(year, month, sep = "-")
  }

  require(zoo)

  times <- sort(as.yearmon(times, "%Y-%B"))
  current <- as.yearmon(Sys.Date(), "%Y-%m")
  times <- format(times[times <= current], "%Y-%B")

  # Function to extract the names
  # Originally by Ben Bolker
  namefun <- function(x) {
    gsub("\\n","",gsub("^.+<I>","",gsub("</I>.+$","",x)))
  }

  # Based on a suggestion by Brian Diggs
  # Capitalizes the first letter of each word
  capwords <- function(s, strict = FALSE) {
    cap <- function(s) paste(toupper(substring(s,1,1)),
                         {s <- substring(s,2); if(strict) tolower(s) else s},
                         sep = "", collapse = " " )
    sapply(strsplit(s, split = " "), cap, USE.NAMES = !is.null(names(s)))
  }

  # Collects the author names for the relevant month and list
  # from the R archives
  # Originally by Ben Bolker
  grabber <- function(month, list, n) {
    baseurl <- "https://stat.ethz.ch/pipermail/"
    require(RCurl)
    # z <- getURL(paste(baseurl,list,"/",month,"/author.html",sep=""))
    z <- getURL(paste(baseurl,list,"/", month,"/author.html",sep=""),
                ssl.verifypeer = FALSE)
    zz <- strsplit(z,"<LI>")[[1]]
    cnames <- capwords(sapply(zz[3:(length(zz)-1)],namefun))
    rr <- rev(sort(table(cnames)))
    output <- rr[1:n]
    return(output)
  }

  # Create dot plots of the number of posts
  # lattice dotplot() code primarily by Ben Bolker
  plotter <- function(dat) {
    require(lattice)
    if(length(dat) > 1) {
      old.par <- par(no.readonly = TRUE)
      on.exit(par(old.par))
      par("ask" = TRUE)
    }
    for(i in seq_along(dat)) {
      print(dotplot(~rev(dat[[i]]), xlab = "Number of posts",
                    main = names(dat)[i]))
    }
    invisible()
  }

  numbers <- lapply(times, function(x)
                    {grabber(month = x, list = list, n = n)})
  names(numbers) <- times

  if(plot) {
    plotter(dat = numbers)
  }

  return(numbers)
}

#######################################


Two examples:

navel.gazer(year = 2009, entire = TRUE)
navel.gazer(month = "September", year = 2007)

This basically works out to be a tribute to David Winsemius:
navel.gazer(n = 1, entire = TRUE)

Hope you get a bit of fun out of this.  I certainly enjoyed writing it!

Josh

On Tue, Aug 17, 2010 at 1:10 PM, Henrique Dallazuanna <[hidden email]> wrote:

> I think that gsub example on help page is more clear:
>
> library(XML)
>
> # could be used the XML package to get the names
> cnames <- gsub('\n', '', head(tail(sapply(getNodeSet(htmlParse(z, asText =
> TRUE), "//i"), xmlValue), -3), -3))
> gsub("(\\w)(\\w*)", "\\U\\1\\L\\2", cnames, perl=TRUE)
>
>
>
> On Tue, Aug 17, 2010 at 4:44 PM, Brian Diggs <[hidden email]> wrote:
>
>> Since Peter Dalgaard is splitting his considerable contributions between
>> "Peter Dalgaard" and "peter dalgaard", I made the following changes (which
>> shouldn't be a problem unless e e cummings becomes a regular poster):
>>
>> # from base::chartr documentation
>> capwords <- function(s, strict = FALSE) {
>>    cap <- function(s) paste(toupper(substring(s,1,1)),
>>                  {s <- substring(s,2); if(strict) tolower(s) else s},
>>                             sep = "", collapse = " " )
>>    sapply(strsplit(s, split = " "), cap, USE.NAMES = !is.null(names(s)))
>> }
>>
>> cnames <- capwords(sapply(zz[3:(length(zz)-1)],namefun))
>>
>>
>>
>>
>> On 8/17/2010 10:00 AM, Henrique Dallazuanna wrote:
>>
>>> Ben,
>>>
>>> I change the line:
>>>
>>> z<- getURL(paste(baseurl,list,"/", month,"/author.html",sep=""))
>>>
>>> to
>>>
>>> z<- getURL(paste(baseurl,list,"/", month,"/author.html",sep=""),
>>> ssl.verifypeer = FALSE)
>>>
>>> because don't work for me.
>>>
>>> Nice!
>>>
>>> On Tue, Aug 17, 2010 at 1:47 PM, Ben Bolker<[hidden email]>  wrote:
>>>
>>>  month<- "2010-August"
>>>> list<- "r-help"
>>>> ##list<- "r-sig-ecology"
>>>> ##list<- "r-sig-mixed-models"
>>>> ## month<- "2010q3"
>>>> n<- 50
>>>> baseurl<- "https://stat.ethz.ch/pipermail/"
>>>> library(RCurl)
>>>> z<- getURL(paste(baseurl,list,"/",month,"/author.html",sep=""))
>>>> zz<- strsplit(z,"<LI>")[[1]]
>>>> namefun<- function(x) {
>>>>  gsub("\\n","",gsub("^.+<I>","",gsub("</I>.+$","",x)))
>>>> }
>>>>
>>>> cnames<- sapply(zz[3:(length(zz)-1)],namefun)
>>>> rr<- rev(sort(table(cnames)))
>>>>
>>>>
>>>> library(lattice)
>>>> dotplot(~rev(rr[1:n]),xlab="Number of posts")
>>>>
>>>> dotplot(~rev(rr[1:n]),xlab="Number of posts",
>>>>        scales=list(x=list(log=10)))
>>>>
>>>> ______________________________________________
>>>> [hidden email] mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>>
>>>
>>>
>>>
>> --
>> Brian Diggs
>> Senior Research Associate, Department of Surgery, Oregon Health & Science
>> University
>>
>
>
>
> --
> Henrique Dallazuanna
> Curitiba-Paraná-Brasil
> 25° 25' 40" S 49° 16' 22" O
>
>        [[alternative HTML version deleted]]
>
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>



--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.