me nugget: Data point locator function

Friday, December 5, 2014

Data point locator function

Here's a little function, ptlocator(), to select data points in an open graphics device. The function first scales the x and y axes so that they carry equal weight, which removes the influence of differing units or ranges. It then calculates the Euclidean distance between each clicked location (obtained with the locator() function) and the x, y coordinates of the plotted data points. The data point with the smallest distance to each click is marked with a filled, colored point, and the function returns the vector indices of the selected points.
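To see why the scaling step matters, here is a minimal, non-interactive sketch of the same nearest-point calculation. The helper name nearest_index() and its standardize argument are hypothetical and only serve this illustration; the toy data are made up so that x spans thousands while y spans less than one unit.

```r
# Hypothetical helper (for illustration only): index of the data point
# closest to a location (px, py), with or without standardizing the axes.
nearest_index <- function(px, py, x, y, standardize = TRUE) {
  if (standardize) {
    # Same centering and scaling that scale() applies by default
    cx <- mean(x); sx <- sd(x)
    cy <- mean(y); sy <- sd(y)
  } else {
    cx <- 0; sx <- 1
    cy <- 0; sy <- 1
  }
  dx <- (px - cx) / sx - (x - cx) / sx
  dy <- (py - cy) / sy - (y - cy) / sy
  which.min(sqrt(dx^2 + dy^2))
}

# Toy data: very different ranges on the two axes
x <- c(0, 1000, 2000)
y <- c(0.0, 0.5, 1.0)

# A simulated "click" at (600, 0.05)
nearest_index(600, 0.05, x, y, standardize = FALSE)  # 2: x range dominates
nearest_index(600, 0.05, x, y, standardize = TRUE)   # 1: matches the visual impression
```

Without standardizing, the distance is driven almost entirely by the x axis, so the second point wins even though the click sits visually next to the first one.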

[NOTE: I just realized that the identify() function is very similar in its usage.]

The function:

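ptlocator <- function(n=1, x, y, col=rgb(1,0,0,0.5), pch=20, ...){
  # Standardize both axes so that units and ranges do not bias the distance
  xsc <- scale(x)
  ysc <- scale(y)
  pos <- rep(NA_integer_, n)
  for(i in seq_len(n)){
    message("choose point ", i)
    pt <- locator(1)
    if(is.null(pt)) break  # locator() returns NULL if no point was selected
    # Put the clicked location on the same standardized scale as the data
    ptxsc <- scale(pt$x, center=attr(xsc,"scaled:center"), scale=attr(xsc,"scaled:scale"))
    ptysc <- scale(pt$y, center=attr(ysc,"scaled:center"), scale=attr(ysc,"scaled:scale"))
    # Index of the data point with the smallest Euclidean distance to the click
    pos.i <- which.min(sqrt((c(ptxsc)-c(xsc))^2 + (c(ptysc)-c(ysc))^2))
    points(x[pos.i], y[pos.i], col=col, pch=pch, ...)
    pos[i] <- pos.i
  }
  pos
}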


To reproduce the example:

set.seed(1)
n <- 200
x <- sort(runif(n, min=0, max=10*pi))
y <- sin(x) + rnorm(n, sd=0.2)
 
# Select 10 points at maxima and minima
op <- par(mar=c(4,4,1,1))
plot(x,y, cex=2)
pos <- ptlocator(10, x, y, col=rgb(1,0.2,0.2,0.75), cex=2)
par(op)
pos
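
For comparison, here is a rough sketch of how the base R identify() function mentioned in the note above could be used for a similar selection. The data and the choice of three points are made up for illustration, and the call is wrapped in interactive() because identify() needs an interactive screen device to register clicks.

```r
set.seed(1)
x <- runif(20)
y <- runif(20)

if (interactive()) {
  plot(x, y, cex = 2)
  # identify() returns the indices of the points nearest to each click;
  # plot = FALSE suppresses the default text labels
  idx <- identify(x, y, n = 3, plot = FALSE)
  # Mark the selected points, in the style of ptlocator()
  points(x[idx], y[idx], col = rgb(1, 0, 0, 0.5), pch = 20, cex = 2)
  idx
}
```

One difference worth noting: identify() ignores clicks that are farther from every point than its tolerance argument allows, whereas ptlocator() always picks the nearest point.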