Saturday, January 25, 2014

Importing bathymetry and coastline data in R


After noticing some frustrating inaccuracies with the high-resolution world coastlines and national boundaries database found in worldHires from the package mapdata (based on CIA World Data Bank II data), I decided to look into other options. Although listed as "depreciated", the data found in NOAAs online "Coastline Extractor" is a big step forward. There seem to be more up-to-date products, but this served my needs for the moment, and I thought I'd pass along the address to other R users. I exported the data in ASCII "Matlab" format, which is basically just a 2-column text file (.dat) with NaN's in the rows that separate line segments.

I've also discovered the bathymetry / topography data from GEBCO. Again, very easy to import into R from the netCDF files.

The above map of the Galapagos Archipelago illustrates the quality of both datasets. It also shows the comparison of coastline accuracy between World Vector Shoreline (1:250,000), world (map package), and worldHires (mapdata package) datasets. Obviously, the low-resolution world data only makes sense for quick plotting at large scales, but the high-resolution data is as much as 1/10° off in some locations. I noticed these errors for the first time when trying to map some data for smaller coastal bays. It drove me crazy trying to figure out where the errors were - in my data locations or the map itself. Bathymetry used in the map was 30 arc-second resolution GEBCO data.

[EDIT: The comparison of coastline data now includes the high resolution data from the rworldmap package.]

A more detailed description export settings:
  • Coastline data (from 'Coastline Extractor') :
    • Coastline database: World Vector Shoreline (1:250,000)
    • Compression method for extracted ASCII data: None
    • Coast Format options: Matlab
    • Coast Preview options: GMT Plot
  • Bathymetry / topography data [link]:
    • General Bathymetric Chart of the Oceans (GEBCO) : GEBCO_08 Grid (30 arc-second resolution)
Here's another example of bathymetry / topography data for the western Pacific (1 minute resolution GEBCO data):



For both maps, I took inspiration for the color palettes from GMT. The rgb color levels of these palettes have got to be documented somwhere, but I gave up looking after a while and managed to hack their levels from color scales contained in .png files [link].

Below is the R code to reproduce the figures.
To reproduce Galapagos coastline example:
###required packages
library(RNetCDF)
library(maps)
library(mapdata)
library(rworldmap)
library(rworldxtra)
 
###Data
#data locations
bathy_fname <- "galapagos_gebco_08_-92_-2_-88_2.nc" # from https://www.bodc.ac.uk/data/online_delivery/gebco/gebco_08_grid/
coast_fname <- "galapagos_18563.dat" # from http://www.ngdc.noaa.gov/mgg/coast/getcoast.html
 
#load bathy data
nc <- open.nc(bathy_fname)
print.nc(nc)
tmp <- read.nc(nc)
z <- array(tmp$z, dim=tmp$dim)
#z[which(z > 0)] <- NaN
z <- z[,rev(seq(ncol(z)))]
xran <- tmp$x_range
yran <- tmp$y_range
zran <- tmp$z_range
lon <- seq(tmp$x[1], tmp$x[2], tmp$spac[1])
lat <- seq(tmp$y[1], tmp$y[2], tmp$spac[1])
rm(tmp)
close.nc(nc)
 
#load coast data
coast <- read.table(coast_fname)
names(coast) <- c("lon", "lat")
 
###Plot
#make palette
ocean.pal <- colorRampPalette(
 c("#000000", "#000413", "#000728", "#002650", "#005E8C",
 "#0096C8", "#45BCBB", "#8AE2AE", "#BCF8B9", "#DBFBDC")
)
 
land.pal <- colorRampPalette(
 c("#467832", "#887438", "#B19D48", "#DBC758", "#FAE769",
 "#FAEB7E", "#FCED93", "#FCF1A7", "#FCF6C1", "#FDFAE0")
)
 
zbreaks <- seq(-8000, 8000, by=100)
cols <-c(ocean.pal(sum(zbreaks<=0)-1), land.pal(sum(zbreaks>0)))
 
#compare coastlines to package 'mapdata' and 'rworldxtra'
LWD <- 1
COL <- c(1:4)
png("coastline_compare2.png", width=7.5, height=6, units="in", res=600)
#quartz(width=7.5, height=6)
layout(matrix(1:2, 1,2), widths=c(6,1.5), heights=c(6))
 
par(mar=c(2,2,1,1), ps=10)
image(lon, lat, z=z, col=cols, breaks=zbreaks, useRaster=TRUE, ylim=c(-1.5,0.5), xlim=c(-92,-90))
lines(coast, col=COL[1], lwd=LWD)
map("world", col=COL[2], ylim=c(-2,2), xlim=c(-93,-88), add=TRUE, lwd=LWD)
map("worldHires", col=COL[3], ylim=c(-2,2), xlim=c(-93,-88), lwd=LWD, add=TRUE)
plot(getMap(resolution='high'), border=COL[4], lwd=LWD, add=TRUE)
 
legend("bottomleft", legend=c("World Vector Shoreline", "maps: world", "mapdata: worldHires", "rworldmap: high res."), lty=1, lwd=LWD, col=COL, bg="white") 
 
par(mar=c(2,0,1,5))
image(x=1, y=zbreaks, z=matrix(zbreaks, 1, length(zbreaks)), col=cols, breaks=zbreaks, useRaster=TRUE, xlab="", ylab="", axes=FALSE)
axis(4, at=seq(-8000, 8000, 1000), las=2)
mtext("[meters]", side=4, line=3)
box()
 
dev.off()
Created by Pretty R at inside-R.org





To reproduce western Pacific map:
###required packages
library(RNetCDF)
 
###Data
#data location
bathy_fname <- "west_pac_gebco_1min_100_-20_170_50.nc" # from https://www.bodc.ac.uk/data/online_delivery/gebco/gebco_08_grid/
 
#load bathy data
nc <- open.nc(bathy_fname)
print.nc(nc)
tmp <- read.nc(nc)
z <- array(tmp$z, dim=tmp$dim)
#z[which(z > 0)] <- NaN
z <- z[,rev(seq(ncol(z)))]
xran <- tmp$x_range
yran <- tmp$y_range
zran <- tmp$z_range
lon <- seq(tmp$x[1], tmp$x[2], tmp$spac[1])
lat <- seq(tmp$y[1], tmp$y[2], tmp$spac[1])
rm(tmp)
close.nc(nc)
 
###Plot
#make palette
ocean.pal <- colorRampPalette(
 c("#000000", "#000209", "#000413", "#00061E", "#000728", "#000932", "#002650", 
 "#00426E", "#005E8C", "#007AAA", "#0096C8", "#22A9C2", "#45BCBB", 
 "#67CFB5", "#8AE2AE", "#ACF6A8", "#BCF8B9", "#CBF9CA", "#DBFBDC", 
 "#EBFDED")
)
 
land.pal <- colorRampPalette(
 c("#336600", "#F3CA89", "#D9A627", 
 "#A49019", "#9F7B0D", "#996600", "#B27676", "#C2B0B0", "#E5E5E5", 
 "#FFFFFF")
)
 
zbreaks <- seq(-11000, 7000, by=10)
cols <-c(ocean.pal(sum(zbreaks<=0)-1), land.pal(sum(zbreaks>0)))
 
#compare coastlines to package 'mapdata'
png("west_pac.png", width=7.5, height=6, units="in", res=200)
#quartz(width=7.5, height=6)
layout(matrix(1:2, 1,2), widths=c(6,1.5), heights=c(6))
 
par(mar=c(2,2,1,1), ps=10)
image(lon, lat, z=z, col=cols, breaks=zbreaks, useRaster=TRUE, ylim=c(-20,50), xlim=c(100,170))
 
par(mar=c(2,0,1,5))
image(x=1, y=zbreaks, z=matrix(zbreaks, 1, length(zbreaks)), col=cols, breaks=zbreaks, useRaster=TRUE, xlab="", ylab="", axes=FALSE)
axis(4, at=seq(-11000, 7000, 1000), las=2)
mtext("[meters]", side=4, line=3)
box()
 
dev.off()
Created by Pretty R at inside-R.org




14 comments:

  1. Looks great! Thanks for sharing.

    ReplyDelete
  2. Thanks for sharing Marc. For up-to-date named country borders you may also want to check out rworldmap (2 resolutions) and rworldxtra (higher resolution). Both on CRAN and derived from Natural Earth data.

    ReplyDelete
    Replies
    1. Excellent Andy! Thanks for the comment. I will definitely check these packages out - would be a time-saver over the procedure presented here.

      Delete
    2. To add in a comparison to your first Galapagos map you could add this to your code.
      library(rworldmap)
      library(rworldxtra)
      plot(getMap(resolution='high'),bg=4,add=TRUE)
      I would be interested in the comparison as generally I don't work with high resolution data.
      All the best,
      Andy

      Delete
    3. Hi Andy - Thanks for the suggestion. The figure now compares the rworldmap high resolution data as well. I find the results to be very similar to the mapdata: worldHires data set - some smaller islands have similarly offset coastlines. Perhaps someone else will be able to comment on the reason for this issue? Cheers and thanks again, Marc

      Delete
  3. Thanks for sharing! I've made use of some of your initial code: http://geotheory.co.uk/blog/2014/02/07/visualising-topography/

    ReplyDelete
    Replies
    1. Nice work geotheory - I like the approach of using quantile levels. Cheers

      Delete
  4. This comment has been removed by the author.

    ReplyDelete
  5. Thanks for your great code!!!

    I have redone your code and added an extra layer (GADM, from raster using the code:
    # To add GADM borders package 'raster' in orange
    # getData('ISO3') #use this to see your country iso name
    pt.border = getData('GADM', country=c('PT'), level=0)
    plot(pt.border, border="orange", add=T)

    Which for my short scale case I foundto be the best one... but I could not access the NOAA data base. I found also one available in 'marmap' package, from NOAA that might be worth to add, but could not make it work in a thinner resolution:

    # library(marmap)
    # getNOAA.bathy(lon1=-10, lon2=-6, lat1=36, lat2=42, resolution = 1) -> bathy
    # plot(bathy, image=TRUE, deep=-6000, shallow=0, step=1000)


    A really good idea!
    :)

    ReplyDelete
  6. Hi there,

    Great maps! I'm trying to reproduce these examples, but it appears that the 2008 GEBCO maps that you used are no longer supported, and the 2014 versions that I'm able to download have completely different data structures. When I download and import a slice, I have the following tmp format:

    > bathy_fname<- "GRIDONE_2D_-135.0_10.0_-90.0_35.0.nc"
    > nc <- open.nc(bathy_fname)
    > print.nc(nc)
    > tmp <- read.nc(nc)
    > names(tmp)
    [1] "elevation" "lat" "lon"

    with tmp$elevation, lat and lon as opposed to tmp$z, z_range, dim, spac, etc.

    I tried adjusting it as follows to account for the new data structure:

    z <- array(tmp$elevation, dim=dim(tmp$elevation))
    #z[which(z > 0)] <- NaN
    z <- z[,rev(seq(ncol(z)))]
    xran <- range(tmp$lon)
    yran <- range(tmp$lat)
    zran <- range(tmp$elevation)

    but I'm afraid I got hung up at:

    lon <- seq(tmp$x[1], tmp$x[2], tmp$spac[1])
    lat <- seq(tmp$y[1], tmp$y[2], tmp$spac[1])

    as I don't have tmp$spac in the new format, and consequently I'm not sure what this line of code is supposed to do.

    I'm very keen to be able to use this code to produce base maps rather than reverting to GMT, which I've used in the past but am very rusty with (and haven't used the latest versions). Any suggestions you can provide would be much appreciated!

    Cheers,
    Josh

    ReplyDelete
    Replies
    1. Hi Josh - Based on your code, it looks like the new format is even easier to use because you already have $lon and $lat and do not need to build these vectors. In other words, lon <- tmp$lon and lat <- tmp$lat.

      Delete
    2. Hi Marc,

      Awesome, thanks for the edit—very simple! For anyone's reference, here's the updated code snippet. Note that z does not need to be reversed anymore:

      bathy_fname<- "GRIDONE_2D_-135.0_10.0_-90.0_35.0.nc"

      nc <- open.nc(bathy_fname)
      print.nc(nc)
      tmp <- read.nc(nc)
      z <- array(tmp$elevation, dim=dim(tmp$elevation))
      z <- z[,seq(ncol(z))]

      xran <- range(tmp$lon)
      yran <- range(tmp$lat)
      zran <- range(tmp$elevation)
      lon <- tmp$lon
      lat <- tmp$lat
      rm(tmp)
      close.nc(nc)

      And the rest stays the same. Great code, thanks for sharing!

      Delete
    3. Thanks for the update Josh! Glad it worked for you.

      Delete
  7. Hi there,

    Great maps! I'm trying to reproduce these examples, but it appears that the 2008 GEBCO maps that you used are no longer supported, and the 2014 versions that I'm able to download have completely different data structures. When I download and import a slice, I have the following tmp format:

    > bathy_fname<- "GRIDONE_2D_-135.0_10.0_-90.0_35.0.nc"
    > nc <- open.nc(bathy_fname)
    > print.nc(nc)
    > tmp <- read.nc(nc)
    > names(tmp)
    [1] "elevation" "lat" "lon"

    with tmp$elevation, lat and lon as opposed to tmp$z, z_range, dim, spac, etc.

    I tried adjusting it as follows to account for the new data structure:

    z <- array(tmp$elevation, dim=dim(tmp$elevation))
    #z[which(z > 0)] <- NaN
    z <- z[,rev(seq(ncol(z)))]
    xran <- range(tmp$lon)
    yran <- range(tmp$lat)
    zran <- range(tmp$elevation)

    but I'm afraid I got hung up at:

    lon <- seq(tmp$x[1], tmp$x[2], tmp$spac[1])
    lat <- seq(tmp$y[1], tmp$y[2], tmp$spac[1])

    as I don't have tmp$spac in the new format, and consequently I'm not sure what this line of code is supposed to do.

    I'm very keen to be able to use this code to produce base maps rather than reverting to GMT, which I've used in the past but am very rusty with (and haven't used the latest versions). Any suggestions you can provide would be much appreciated!

    Cheers,
    Josh

    P.S. sorry if you received a few duplicate comments... google kept prompting me for sign in details and never confirmed that anything was posted/sent.

    ReplyDelete