Following a question that I posted on stackoverflow.com, I recieved the great advice to use the Bioconductor rhdf5 package to work with HDF5 files. The package is not located on CRAN, but can be sourced from the Bioconductor website:
source("http://bioconductor.org/biocLite.R") biocLite("rhdf5")
As an example, I use the package to extract Pathfinder sea surface temperature (SST) data, available in netCDF-4 format (the features of netCDF-4 are a subset of the features of HDF5). This type of file is not readable by the netCDF package ncdf.The result is the above plot of a subarea from one of the daily data sets.
To reproduce the figure, you will need the image.scale and val2col functions found on this blog.
To reproduce example:
###required packages library("rhdf5") library(maps) ###required functions source("image.scale.R") #http://menugget.blogspot.de/2011/08/adding-scale-to-image-plot.html source("val2col.R") # http://menugget.blogspot.de/2011/09/converting-values-to-color-levels.html ###load data #AVHRR Pathfinder Version 5.2 data (link: "http://www.nodc.noaa.gov/SatelliteData/pathfinder4km/") fname <- "20120103133752-NODC-L3C_GHRSST-SSTskin-AVHRR_Pathfinder-PFV5.2_NOAA19_G_2012003_day-v02.0-fv01.0.nc" #(from Pathfinder ftp: "ftp://ftp.nodc.noaa.gov/pub/data.nodc/pathfinder/Version5.2/2012/") #List the content of the HDF5 file. tmp <- h5ls(fname) tmp lon <- h5read(fname, "lon") lat <- h5read(fname, "lat") sst <- h5read(fname, "sea_surface_temperature") dim(sst) #crop to desired area lonRan <- c(-140, -60) latRan <- c(20, 50) lonKeep <- which(lon > lonRan[1] & lon < lonRan[2]) latKeep <- which(lat> latRan[1] & lat< latRan[2]) sst2 <- sst[lonKeep, latKeep, 1] range(sst2) sst2 <- replace(sst2, sst2 == -32768, NaN) range(sst2, na.rm=TRUE) sst2 <- (sst2 + 273.15) * 0.01 # change from deg Kelvin to deg Celsius and scale by 0.01 (p.45 http://data.nodc.noaa.gov/pathfinder/Version5.2/GDS_TechSpecs_v2.0.pdf) range(sst2, na.rm=TRUE) lon2 <- lon[lonKeep] # subset of lon values lat2 <- rev(lat[latKeep]) # subset of lat values , and reverse for increasing values sst2 <- sst2[,rev(seq(latKeep))] # reverse column order (lat) for increasing values ###plot #figure template from "http://menugget.blogspot.de/2013/01/my-template-for-controlling-publication.html" #Layout of plots #1 1 3 #1 2 3 LO <- matrix(c(1,2), nrow=1, ncol=2, byrow=TRUE) LO #double check layout #Resolution, pointsize RESO <- 400 PS <- 10 #Overall units in inches WIDTHS <- c(5,1) #widths of each figure in layout (i.e. column widths) HEIGHTS <- c(3.5) #heights of each figure in layout (i.e. row heights) #Outer margins and calculation of full dimensions OMA <- c(0,0,1,0) #Outer margins c(bottom, left, top, right) HEIGHT <- sum(HEIGHTS) + OMA[1]*PS*1/72 + OMA[3]*PS*1/72 WIDTH <- sum(WIDTHS) + OMA[2]*PS*1/72 + OMA[4]*PS*1/72 #Double check full dimensions WIDTH; HEIGHT #Start plot; open device - run from x11() down to observe behavior; run from pdf() down to produce .pdf png("sst_usa.png", width=WIDTH, height=HEIGHT, units="in", res=RESO) #pdf("sst_usa.pdf", width=WIDTH, height=HEIGHT) #x11(width=WIDTH, height=HEIGHT) par(oma=OMA, ps=PS) #settings before layout layout(LO, heights=HEIGHTS, widths=WIDTHS) #layout.show(max(LO)) # run to see layout; comment out to prevent plotting during .pdf par(cex=1) # layout has the tendency change par()$cex, so this step is important for control #plot 1 par(mar=c(2,2,1,1)) pal <- colorRampPalette(c("blue", "cyan", "yellow", "red")) image(lon2, lat2, sst2, col=pal(100), xlab="", ylab="") map("world", add=TRUE, fill=TRUE, col=8) mtext("Pathfinder SST (2012-01-03)", side=3, line=0.5, font=2) box() #plot 2 par(mar=c(2,0,1,4)) image.scale(sst2, col=pal(100), xlab="", ylab="", axes=FALSE, horiz=FALSE) axis(4) mtext("deg [C°]", side=4, line=2.5) box() dev.off() # closes device
Thanks to this post I've discovered HDF5 and it just transformed one of my pieces of research from impossible to possible.
ReplyDeleteTHANKS!!!
I seem to have a problem reproducing even the example. At the first hdf function call, the response is that 'HDF5: unable to open file
ReplyDeleteError in h5checktypeOrOpenLoc(file, readonly = TRUE) :
Error in h5checktypeOrOpenLoc(). File '~/workspace/r_scratch//ncdf/data/20120103133752-NODC-L3C_GHRSST-SSTskin-AVHRR_Pathfinder-PFV5.2_NOAA19_G_2012003_day-v02.0-fv01.0.nc' is not a valid HDF5 file.'
Any suggestion of where I should start looking for a solution?
I'm using R 3.0.2 on Ubuntu 12.4.3; all packages are up to date.
Thanks!
Is it possible that the double backslash in your file path is the cause of the problem?
DeleteThanks for your reply. I’ve seen the double backslash too late – mistake in copy/pasting the error message, not in my script. The problem was not that the file was not found, but that the file is not valid HDF5 file; and that problem remains.
ReplyDeleteHi Edward - Sorry I can't be of help. I had no trouble opening these netCDF-4 files with the package. Please post a comment if you figure out what the problem was.
DeleteThanks for the post! I read an HDF5 data and try to convert that into raster. After conversion, I add raster extent using xmin(r), xmax(r), ymin(x), ymin(r) When I check the dimensions they are correct but the cell size in not. it shows that the resolution for x and y is different. Could you please show how to read the full data set without clipping and convert that into raster? Thanks!
ReplyDelete