Derived variables under R

Author: Dr. Steve Oncley

Description of some of the dat() routines used by ISFS for data analysis. Content documents a few turbulent flux dat routines.

Introduction

The EOL/ISF Surface Group processes most of our data using R. We have built an extensive library of routines to work with our data using these packages, which are now based on a object-oriented model. We have defined a "dat" object, which has several attributes that are available for our routines to query:

time: the center time of the data point
name: the variable name(s), (we have a standard convention which may or may not specify where the data were taken). There usually are several variables in one "object", measured at the same time.
units: the units of measure of the variable
stations: specifies the measurement site
class: specifies that this object is a "dat" object and may also have the variable type

We also have defined a more primative "nts" object without the station attribute or "dat" class. "dat" objects usually are either read from a NetCDF file or calculated from other "dat" objects. We have developed functions to allow a set of "dat" objects that are derived to be referenced as though they had been read in. For example, dat("T") reads air temperature directly from the NetCDF file, but dat("Q") computes specific humidity from dat("T"), dat("RH"), and dat("P"), where dat("RH") is the relative humidity measured by our sensors and dat("P") is the pressure also measured. dat("Q") is created by calling a routine "dat.Q" which we have written.

A description of some of our "dat.X" functions is given in this document. Throughout this document, code which is only "housekeeping" in nature has been deleted to focus on the calculations themselves.

Parameters

Calculation by the "dat" functions is controlled by a set of parameters set by the "dpar()" function or a point-and-click equivalent "dpopup()". Through these parameters, the time period of interest (usually beginning time and duration) is specified. Additional options can select the measurement location (station and/or height), averaging period (see next section).

Of particular interest to the below discussion is the "robust" parameter. Many of the routines below require several data inputs to implement a full correction to the data (the dat.Q example above requires 3 input measurements). However, if any of these measurements are missing, the output of a "dat" routine necessarily must be NA. For some applications (such as quality control when we are in the field), it is useful to get a "quick-and-dirty" answer whether or not all the measurements are available. "robust=T" tells the dat functions to compute a result based on the fewest inputs. "robust=F" forces the dat functions to perform all data corrections, but may yield fewer results.

Averaging

If "avg" is specified by dpar() or dpopup(), the dat routines will compute a true average of the values from values stored in the NetCDF files. All of our NetCDF files to date were created using 5 minute averages. Thus dat("T") with avg=1800 [seconds] would read data from 6 5-minute blocks and create a 30-minute average.

The averaging process is smart enough to include the "weights" parameter so that if only 100 samples were averaged in one 5 minute block and 300 samples in another, the first would be weighted only 1/3 as much.

If the dpar parameter "simple.avg" or "robust" is "F", the average of higher- order momements (variance, covariance, skewness, etc.) will include the correct variation of the 5-minute average values in the computation.

Air Density

rho.air

Many routines require the density of air, and we usually measure temperature, relative humidity, and pressure, so that this can be computed. If robust=T, only one value (median across time and several sensors) is returned to keep matching simple.

dat.rho.air <- function(x,cache=F,robust=dpar("robust"),...)
{
  # density of (moist) air (kg/m^3)
  R <- 287		# J/kg-K
  p <- dat("P")*100	# pascals
  q <- dat("Q")
  Td <- dat("T")
  Tv <- (273.15 + Td)*(1 + 0.608*q*1e-3)
  x <- p/R/Tv
  if (!is.null(robust) && robust) {
    x <- median(x, na.rm=T)
    attr(x,"dunits") <- "kg/m^3"
    }
  else {
    attr(x,"dunits") <- rep("kg/m^3",ncol(x))
  }
  x
}

rho.dry

Sometimes, we only want the density of dry air. As with rho.air, only one value is returned for robust=T.

dat.rho.dry <- function(x,cache=F,robust=dpar("robust"),...)
{
  # density of dry air (kg/m^3)
  R <- 287		# J/kg-K
  mr <- dat("MR")
  p <- dat("P")*100/(1 + dat("MR")/622)		# P - Pv, pascals
  Td <- dat("T")
  TK <- 273.15 + Td
  x <- p/R/TK		
  if (!is.null(robust) && robust) {
    x <- median(x, na.rm=T)
    attr(x,"dunits") <- "kg/m^3"
    }
  else {
    attr(x,"dunits") <- rep("kg/m^3",ncol(x))
  }
  x
}

Turbulent Fluxes

These start getting complicated very fast due to all the corrections and interdependencies. It is probably necessary to read through everything in this section to get the complete picture.

u*

The sonic anemometer measures three components of wind velocity: horizontal components u and v and the vertical component w. u is commonly parallel to the sonic boom, v is normal to the boom and w is positive away from the surface, with u, v, and w forming a right-hand coordinate system. For the purpose of analyzing turbulence, we use Reynolds decomposition to divide each component into time-average mean values u, v and w and deviations from the mean u', v' and w'. Note that by definition <u'> = <v'> = <w'> = 0, where <x> denotes a time-average of x. Then we can calculate variances of the velocity components, e.g. <u'u'>, and covariances <u'w'> also called a turbulent stress or kinematic momentum flux. (The velocity covariance multiplied by air density would be a true stress or momentum flux.)

The scaling velocity u* is defined as the square root of the negative of the streamwise kinematic momentum flux, since the kinematic momentum flux is generally negative and thus denotes a flux of momentum from the air stream to the underlying surface or a drag on the air stream by that surface. The sonic data is in an arbitrary coordinate system, e.g. instrument or geographic coordinates, so we first rotate the components of the vertical flux of horizontal kinematic momentum, dat("u'w'") and dat("v'w'"), to streamwise coordinates. However, for low wind speeds and short averaging times, the argument of the square root in the numerator of the equation below for x can at times be negative, leading to missing values (na's) for those values of u*. In these cases and if robust=True, we approximate the value of the streamwise momentum flux with ustm, the magnitude of the {<u'w'>, <v'w'>} vector. This is not strictly valid because the cross-stream component of momentum is seldom zero in the real world.

"dat.u*" <- function(x,cache=F,...)
{
  x <- sqrt(-(dat("u'w'")*dat("u") + dat("v'w'")*dat("v"))/sqrt(dat("u")^2 + dat("v")^2))
  ib <- is.na(x)
  ustm <- (dat("u'w'")^2 + dat("v'w'")^2) ^ (0.25)
  if (dpar("robust"))
    x[ib] <- ustm[ib]
  attr(x,"dunits") <- rep("m/s",ncol(x))
  x
}

H

This routine computes the total sensible heat flux, including the heat capacity of moisture in the air parcel. Most of the work in this routine is done by the functions dat.w'h2o', dat.w't', and calc.H, which figure out how to convert the flux of sonic virtual temperature to real temperature, etc.

Missing temperatures (needed to compute the heat capacity) are filled in with the median temperature over the entire time period in the case that robust=T.

dat.H <- function(x,robust=dpar("robust"),...)
{
  # Turbulent sensible heat flux
  Td <- dat("T")
  wq <- dat("w'h2o'")*1e-3
  wt <- dat("w't'")
  x <- calc.H(wt=wt,wq=wq,Td=Td,rho=dat("rho.air"))
  x
}

dat.H.dry

Some people want to know what the sensible heat would be without the water vapor contribution to the specific heat. As before, missing data are filled in if robust=T.

dat.H.dry <- function(x,robust=dpar("robust"),...)
{
  # Turbulent sensible heat flux without contribution from varying Cp 
  Q <- dat("Q")
  x <- calc.H(wt=dat("w't'"),Q=dat("Q"),rho=dat("rho.air"))
  x
}

dat.LE

Given all the nonsense below, this is quite simple, with the real work hidden in dat("w'mr'"). Note that LE normally is defined from the specific humidity flux w'q' times dat("rho"), but the approach we use allows us to deal only with one dat routine.

dat.LE <- function(x,robust=dpar("robust"),...)
{
  # Turbulent latent heat flux
  x <- dat("Lv")*dat("rho.dry")*dat("w'mr'")*1e-3
  attr(x,"dunits") <- rep("W/m^2",ncol(x))
  x
}

w't'

In addition to the wind velocity components, the sonic anemometer also measures the speed-of-sound c which is proportional to the square root of a "sonic virtual temperature" that we denote by the variable tc. dat.w't' corrects the measured sonic virtual temperature flux for the contribution due to the water vapor flux (Schotanus et al., BLM, 1983). However, since our water vapor flux usually is measured by a krypton hygrometer, it is necessary to apply the Webb correction (Webb et al., QJRMS, 1980) and oxygen correction (Campbell App.C) to the w'h2o' measurements. Since these corrections both need w't' itself, these corrections must be calculated simultaneously. Then the correction can be written out explicitly as a function of w'tc' and w'h2o'.

"dat.w't'" <- function(x,robust=dpar("robust"),...)
{
  # Corrections to  assume E measured as  (m/s g/m^3)
  if (!is.null(robust) && robust)
    x <- dat("w'tc'")
  else {
    rho.air <- dat("rho.air")
    Q <- dat("Q")*1e-3
    Td <- dat("T")
    TK <- 273.15 + Td
    E <- dat("w'h2o'")*1e-3     # m/s kg/m^3
    o2corr <- calc.o2corr(dat("P"),TK,rho.air,E)
    factor <- 0.51*(1 + Q*(1/0.622-1))
    x <- (dat("w'tc'") - factor*TK*E/rho.air)/(1 + factor*(Q+o2corr))
    x[is.inf(x)] <- NA
    }
  attr(x,"dunits") <- rep("m/s degK",ncol(x))
  x
}

w'mr'

This is the workhorse for dat.LE which solves the second set of correction equations for w'mr' in terms of w'tc' and w'h2o', as done for w't'.

"dat.w'mr'" <- function(x,cache=F,robust=dpar("robust"),...)
{
  # Assume E measured as  (m/s g/m^3)
  E <- dat("w'h2o'")
  rhod <- dat("rho.dry")
  x <- E/rhod*1e-3
  if (!(!is.null(robust) && robust)) {
    Q <- dat("Q")*1e-3
    MR <- Q/(1-Q)
    Td <- dat("T")
    TK <- 273.15 + Td
    o2corr <- calc.o2corr(dat("P"),TK,dat("rho.air"),E)
    factor <- 0.51*(1 + Q*(1/0.622-1))
    x <- (1 + MR/0.622)*(x + (MR+o2corr/rhod)/TK*dat("w'tc'"))/
	(1 + factor*(Q+o2corr))
    }
  x <- x*1e3		# m/s g/kg
  attr(x,"dunits") <- rep("m/s g/kg",ncol(x))
  x
}

calc.o2corr

This makes the oxygen correction to kryptons (from Campbell, Appendix C and my krypton write-up). I have been told recently that Campbell wrote a later document which reduces the magnitude of this correction.

I have removed the messy logic from this write-up which tests whether kryptons were used (dpar("fh2os")) and skips this routine if not.

calc.o2corr <- function(P,TK,rho.air,E,robust=dpar("robust")) {
  hs <- dpar("fh2os")
  P <- 100*P
  corr <- (4.7e-5)*P/(rho.air*TK)
  corr
}

calc.H

This deals with the water contribution to heat capacity in the dat.H routines as per Stull eq. 10.7.1b or Stull eq. 10.7.1c and Fleagle and Businger (second edition) eq. 6.23.

calc.H <- function(wt,wq,Td,Q,rho.air)
{
  # Turbulent sensible heat flux
  # Constant values from Smithsonian Meteorological Tables:
  Cpd <- 1006	# for dry air, range is 1004.8-1007.4 from -50 to +50 C and
                # 700 to 1100 mb, J/kg-K
  Cpv <- 1857	# for water vapor, value for 0 C, range is 1850 to 1870 
                # from -50 to +50C, J/kg-K
  # No fast water covariance
  if (missing(wq)) {
    Hname <- "H.dry"
    Q <- Q/1000   # kg/kg
    # Stull, eq. 10.7.1b
    Cp <- Cpd*(1 + 0.84*Q)
    x <- Cp * rho.air
    x <- x * wt
  }
  else {
    Hname <- "H"
    x <- rho.air
    # Stull, eq. 10.7.1c or Fleagle&Businger (2nd ed) eq. 6.23, 
    # using Cpd&Cpv directly
    x <- x * (Cpd*wt + Cpv*Td*wq)
  }
  dimnames(x) <- list(NULL,paste(Hname,sfxnames(wt,2),sep=""))
  attr(x,"dunits") <- rep("W/m^2",ncol(x))
  x
}

dat.Lv

I note here that the Latent heat of vaporatizaton of water is a weak function of temperature (varies by 1% in 10 C), but appears to be well fit by a linear fit to data I found on the WWW.

However, to avoid matching problems, I'm only using the first temperature data point, which assumes that all the real temperatures aren't too much different.

dat.Lv <- function(x,cache=F,...)
{
  # Still, just 
  if (!is.null(dpar("robust")) & !dpar("robust")) {
  	x <- 2501000 - 2373*dat("T")[,1]
	x[is.na(x)] <- 2501000 - 2373*median(dat("T"),na.rm=T)
	}
  else {
	x <- 2.5e6
	}
  dimnames(x) <- list(NULL,paste("Lv",sfxnames(x,2),sep=""))
  class(x) <- c("Lv",class(x))
  attr(x,"dunits") <- rep("J/kg",ncol(x))
  x
}

References

Campbell, Appendix C

Fleagle, R. and J.A.Businger, An Introduction to Atmospheric Physics, Second Edition, Academic Press

Schotanus, P., 1983, ... Boundary Layer Meteor.

Smithsonian Meteorological Tables

Stull, R. 1988, An Introduction to Boundary Layer Meteorology, Kluwer Academic, Dordrecht, 666 pp.

Webb, E., 1980, Quart. J. Royal Meteor. Soc.,