This function carries out cluster analysis of HYSPLIT back trajectories. The
function is specifically designed to work with the trajectories imported
using the openair importTraj() function, which provides pre-calculated
back trajectories at specific receptor locations.
Usage
trajCluster(
traj,
method = "Euclid",
n.cluster = 5,
type = "default",
split.after = FALSE,
by.type = FALSE,
crs = 4326,
cols = "Set1",
theme = "default",
plot = TRUE,
...
)
Arguments
- traj
An openair trajectory data frame resulting from the use of importTraj().
- method
Method used to calculate the distance matrix for the back trajectories. There are two methods available: "Euclid" and "Angle".
- n.cluster
Number of clusters to calculate.
- type
Character string(s) defining how data should be split/conditioned before plotting. "default" produces a single panel using the entire dataset. Any other options will split the plot into different panels - a roughly square grid of panels if one type is given, or a 2D matrix of panels if two types are given. type is always passed to cutData(), and can therefore be any of:
  - A built-in type defined in cutData() (e.g., "season", "year", "weekday", etc.). For example, type = "season" will split the plot into four panels, one for each season.
  - The name of a numeric column in mydata, which will be split into n.levels quantiles (defaulting to 4).
  - The name of a character or factor column in mydata, which will be used as-is. Commonly this could be a variable like "site" to ensure data from different monitoring sites are handled and presented separately. It could equally be any arbitrary column created by the user (e.g., whether a nearby possible pollutant source is active or not).
Most openair plotting functions can take two type arguments. If two are given, the first is used for the columns and the second for the rows.
- split.after
For type other than "default", e.g. "season", the clusters can either be calculated for each level of type independently (split.after = FALSE, the default) or extracted after the cluster calculations have been applied to the whole data set (split.after = TRUE).
- by.type
The percentage of the total number of trajectories is given for all data by default. Setting by.type = TRUE will make each panel add up to 100.
- crs
The coordinate reference system to use for plotting. Defaults to 4326, which is the WGS84 geographic coordinate system, the standard, unprojected latitude/longitude system used in GPS, Google Earth, and GIS mapping. Other crs values are available - for example, 27700 will use the OSGB36/British National Grid.
- cols
Colours to use for plotting. Can be a pre-set palette (e.g., "turbo", "viridis", "tol", "Dark2", etc.) or a user-defined vector of R colours (e.g., c("yellow", "green", "blue", "black") - see colours() for a full list) or hex-codes (e.g., c("#30123B", "#9CF649", "#7A0403")). Alternatively, can be a list of arguments to control the colour palette more closely (e.g., palette, direction, alpha, etc.). See openColours() and colourOpts() for more details.
- theme
A string representing an overall plot theme, defaulting to "default". This option makes sweeping changes to non-data plot features such as fonts, colours, line widths, and so on, and may also change default arguments like cols if not set by the user. Can also take a ggplot2::theme() object, which will be used to modify the "default" theme. Pre-set options include:
  - "default", a lattice-inspired theme resembling the traditional openair look, with structured panels and visible gridlines.
  - "dark", a dark-background variant of the default theme, designed for presentations and low-light viewing, using high-contrast text and colour palettes optimised for visibility against dark panels.
  - "modern", a minimalist, contemporary theme inspired by tools such as Plotly and Observable Plot, with reduced visual clutter, horizontal emphasis in gridlines, a clean legend style, and typography suited to dashboards and reports.
  - "soft", a low-contrast, 'editorial' theme with warm background tones, subtle gridlines, and gently desaturated colours, designed for reports and publication-style figures, particularly where a calmer appearance improves readability.
  - "print", a strictly greyscale theme optimised for black-and-white reproduction, with stronger structural elements such as clearer gridlines and axis definitions to ensure good contrast and readability in printed or photocopied outputs.
Please note that if a global theme is set with ggplot2::theme_set() to anything other than the default ggplot2::theme_grey(), the selected openair theme will not be fully applied; instead, only minimal adjustments (such as legend positioning) will be made.
- plot
When openair plots are created they are automatically printed to the active graphics device. plot = FALSE deactivates this behaviour. This may be useful when the plot data is of more interest, or the plot is required to appear later (e.g., later in a Quarto document, or to be saved to a file).
- ...
Passed to
trajPlot().
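The difference between the two by.type settings can be illustrated with a small base R sketch. The data frame below is made up for illustration (it is not output from trajCluster()), and the names assignments, pct_all and pct_panel are hypothetical; the sketch only shows the two ways of normalising cluster counts described under by.type.

```r
# Hypothetical cluster assignments for eight trajectories in two seasons
# (made-up data for illustration only; not produced by trajCluster()).
assignments <- data.frame(
  season  = c("winter", "winter", "winter", "winter", "winter", "winter",
              "summer", "summer"),
  cluster = c("C1", "C1", "C1", "C2", "C2", "C2", "C1", "C2")
)

# Number of trajectories per season (rows) and cluster (columns)
counts <- table(assignments$season, assignments$cluster)

# Analogue of by.type = FALSE: percentages of all trajectories,
# so the whole table sums to 100.
pct_all <- 100 * prop.table(counts)

# Analogue of by.type = TRUE: percentages within each season (panel),
# so each row sums to 100.
pct_panel <- 100 * prop.table(counts, margin = 1)

print(round(pct_all, 1))
print(round(pct_panel, 1))
```

With the default normalisation a sparsely populated panel shows small percentages; normalising within each panel makes the relative importance of clusters comparable between panels.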
Value
An openair object. The data component contains both traj (the original data appended with its cluster) and results (the average trajectory path per cluster, shown in the trajCluster() plot).
Details
Two methods are available for clustering the back trajectories, each based on a different calculation of the distance matrix. The default is the standard Euclidean distance between each pair of trajectories. Also available is an angle-based distance matrix based on Sirois and Bottenheim (1995). The latter method is useful when the direction of the trajectories is of primary interest in the clustering.
The distance matrix calculations are made in C++ for speed. For data sets of up to 1 year both methods should be relatively fast, although method = "Angle" does tend to take much longer to calculate. Further details of these methods are given in the openair manual.
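To make the two distance measures concrete, the following is a minimal base R sketch of a Euclidean and an angle-based distance between one pair of trajectories. It is not the package's C++ implementation. The function names traj_dist_euclid() and traj_dist_angle() are hypothetical, coordinates are treated as planar, both trajectories are assumed to have the same number of points, and the first point is assumed to be the shared origin (the receptor). The angle distance here is the mean angle subtended at the origin by corresponding points, obtained from the law of cosines.

```r
# Euclidean distance between two trajectories with the same number of points
# (illustrative helper, not an openair function).
traj_dist_euclid <- function(x1, y1, x2, y2) {
  sqrt(sum((x1 - x2)^2 + (y1 - y2)^2))
}

# Angle distance: mean angle (radians) at the origin between corresponding
# points of the two trajectories, using the law of cosines.
# Assumption: point 1 is the shared origin and is skipped, because the
# angle is undefined there.
traj_dist_angle <- function(x1, y1, x2, y2) {
  x0 <- x1[1]
  y0 <- y1[1]
  i <- seq_along(x1)[-1]
  a <- (x1[i] - x0)^2 + (y1[i] - y0)^2        # squared distance origin -> traj 1
  b <- (x2[i] - x0)^2 + (y2[i] - y0)^2        # squared distance origin -> traj 2
  c2 <- (x2[i] - x1[i])^2 + (y2[i] - y1[i])^2 # squared distance traj 1 -> traj 2
  cos_theta <- 0.5 * (a + b - c2) / sqrt(a * b)
  cos_theta <- pmin(pmax(cos_theta, -1), 1)   # guard against rounding error
  mean(acos(cos_theta))
}

# Three toy trajectories starting at the origin (0, 0)
north     <- list(x = c(0, 0, 0), y = c(0, 1, 2))
east      <- list(x = c(0, 1, 2), y = c(0, 0, 0))
north_far <- list(x = c(0, 0, 0), y = c(0, 2, 4))

d_euclid_ne <- traj_dist_euclid(north$x, north$y, east$x, east$y)
d_angle_ne  <- traj_dist_angle(north$x, north$y, east$x, east$y)
d_euclid_nn <- traj_dist_euclid(north$x, north$y, north_far$x, north_far$y)
d_angle_nn  <- traj_dist_angle(north$x, north$y, north_far$x, north_far$y)

print(c(d_euclid_ne, d_angle_ne))  # sqrt(10) and pi / 2
print(c(d_euclid_nn, d_angle_nn))  # sqrt(5) and 0
```

The second pair shows why the angle measure suits direction-focused clustering: two trajectories heading the same way but travelling at different speeds are separated by the Euclidean distance, yet have an angle distance of zero.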
References
Sirois, A. and Bottenheim, J.W., 1995. Use of backward trajectories to interpret the 5-year record of PAN and O3 ambient air concentrations at Kejimkujik National Park, Nova Scotia. Journal of Geophysical Research, 100: 2867-2881.
See also
Other trajectory analysis functions:
importTraj(),
trajLevel(),
trajPlot()
Other cluster analysis functions:
polarCluster(),
timeProp()
Examples
if (FALSE) { # \dontrun{
## import trajectories
traj <- importTraj(site = "london", year = 2009)
## calculate clusters
clust <- trajCluster(traj, n.cluster = 5)
head(clust$data$traj) ## note new variable 'cluster'
## use different distance matrix calculation, and calculate by season
traj <- trajCluster(traj, method = "Angle", type = "season", n.cluster = 4)
} # }
