% File src/library/base/man/findInterval.Rd
% Part of the R package, https://www.R-project.org
% Copyright 1995-2020 R Core Team
% Distributed under GPL 2 or later

\name{findInterval}
\alias{findInterval}
\title{Find Interval Numbers or Indices}
\usage{
findInterval(x, vec, rightmost.closed = FALSE, all.inside = FALSE,
             left.open = FALSE)
}
\arguments{
  \item{x}{numeric.}
  \item{vec}{numeric, sorted (weakly) increasingly, of length \code{N},
    say.}
  \item{rightmost.closed}{logical; if true, the rightmost bounded interval,
    \code{vec[N-1] .. vec[N]} is treated as \emph{closed}, see below.}
  \item{all.inside}{logical; if true, the returned indices are coerced
    into \code{1,\dots,N-1}, i.e., \code{0} is mapped to \code{1}
    and \code{N} to \code{N-1}.}
  \item{left.open}{logical; if true all the intervals are open at left
    and closed at right; in the description \sQuote{less than or equal to}
    becomes \sQuote{strictly less than}, and
    \code{rightmost.closed} means \sQuote{leftmost is closed}.  This may
    be useful, e.g., in survival analysis computations.}
}
\description{
  Given a vector of non-decreasing values \code{vec}, for each value in
  \code{x} return the highest position in \code{vec} that corresponds to a
  value less than or equal to that \code{x} value, or zero if none are.
  Equivalently, if the values in \code{vec} are taken to be the closed
  left-bounds of contiguous half-open intervals, return which of those
  intervals each value of \code{x} lies in.
}
\details{
  Under the default parameter values the bounds in \code{vec} define
  intervals that are closed on the left and open on the right:
\preformatted{
 Bound:    -Inf     vec[1]     vec[2]  ...  vec[N-1]     vec[N]     +Inf
 Interval:       0    )[    1    )[    ...     )[    N-1   )[    N
}
  Intervals \eqn{0} and \eqn{N} are half-bounded, and interval \eqn{0} is
  implicitly defined.  Interval \eqn{0} does not exist if \code{vec}
  includes \eqn{-\infty}, and interval \eqn{N} does not exist if \code{vec}
  includes \eqn{+\infty}.

  \code{left.open=TRUE} reverses which side of the intervals is open,
  \code{rightmost.closed=TRUE} closes interval \eqn{N-1} on both sides
  (or interval \eqn{1} if \code{left.open=TRUE}), and
  \code{all.inside=TRUE} drops bounds \eqn{1} and \eqn{N}, which merges
  interval \eqn{0} into \eqn{1} and interval \eqn{N} into \eqn{N-1}.

  The internal algorithm uses interval search
  ensuring \eqn{O(n \log N)}{O(n * log(N))} complexity where
  \code{n <- length(x)} (and \code{N <- length(vec)}).  For (almost)
  sorted \code{x}, it will be even faster, basically \eqn{O(n)}.
}
\value{
  vector of length \code{length(x)} with values in \code{0:N} (and
  \code{NA}) where \code{N <- length(vec)}, or values coerced to
  \code{1:(N-1)} if and only if \code{all.inside = TRUE} (equivalently
  coercing all \code{x} values \emph{inside} the intervals).  Note that
  \code{\link{NA}}s are propagated from \code{x}, and \code{\link{Inf}}
  values are allowed in both \code{x} and \code{vec}.
}
\author{Martin Maechler}
\seealso{\code{\link{approx}(*, method = "constant")} which is a
  generalization of \code{findInterval()}, \code{\link{ecdf}} for
  computing the empirical distribution function which is (up to a factor
  of \eqn{n}) also basically the same as \code{findInterval(.)}.
}
\examples{
v <- c(5, 10, 15) # create bins [5,10), [10,15), and [15,+Inf)
x <- c(2, 5, 8, 10, 12, 15, 17)
intervals <- rbind(
  'match(x, v)'=match(x, v),  # x values that are on bounds
  x,
  default=      findInterval(x, v),
  rightmost.cl= findInterval(x, v, rightmost.closed=TRUE),
  left.open=    findInterval(x, v, left.open=TRUE),
  all.in=       findInterval(x, v, all.inside=TRUE)
)
v
intervals

N <- 100
X <- sort(round(stats::rt(N, df = 2), 2))
tt <- c(-100, seq(-2, 2, length.out = 201), +100)
it <- findInterval(tt, X)
tt[it < 1 | it >= N] # only first and last are outside range(X)

##  'left.open = TRUE' means  "mirroring" :
N <- length(v)
stopifnot(identical(
                  findInterval( x,  v,  left.open=TRUE) ,
              N - findInterval(-x, -v[N:1])))
}
\keyword{arith}
\keyword{utilities}