Metadata
The first step when importing air quality data is to consult import_ukaq_meta(). Let’s have a look at the AURN metadata now:
import_ukaq_meta("aurn")
#> # A tibble: 316 × 14
#> source code site site_type latitude longitude start_date end_date zone
#> <chr> <chr> <chr> <chr> <dbl> <dbl> <date> <date> <chr>
#> 1 aurn ABD Aberde… Urban Ba… 57.2 -2.09 1999-09-18 2021-09-20 Nort…
#> 2 aurn ABD9 Aberde… Urban Ba… 57.2 -2.09 2021-10-01 NA Nort…
#> 3 aurn ABD7 Aberde… Urban Tr… 57.1 -2.11 2008-01-01 2024-12-31 Nort…
#> 4 aurn ABD8 Aberde… Urban Tr… 57.1 -2.09 2016-02-09 NA Nort…
#> 5 aurn ARM6 Armagh… Urban Tr… 54.4 -6.65 2009-01-01 NA Nort…
#> 6 aurn AH Aston … Rural Ba… 52.5 -3.03 1986-06-26 NA Nort…
#> 7 aurn ACTH Auchen… Rural Ba… 55.8 -3.24 2006-01-01 NA Cent…
#> 8 aurn AYLA Aylesb… Urban Tr… 51.8 -0.794 2025-06-19 NA Sout…
#> 9 aurn BAAR Ballym… Urban Tr… 54.9 -6.27 2017-04-01 NA Nort…
#> 10 aurn BALM Ballym… Urban Ba… 54.9 -6.25 2010-01-01 NA Nort…
#> # ℹ 306 more rows
#> # ℹ 5 more variables: agglomeration <chr>, zagglom <chr>,
#> # local_authority <chr>, lmam_provider <chr>, lmam_code <chr>
This output can be customised using different function arguments. For example, let’s find sites which measured O3 in 2020.
meta <- import_ukaq_meta("aurn", year = 2020, by_pollutant = TRUE)
meta[meta$pollutant == "O3",]
#> # A tibble: 0 × 16
#> # ℹ 16 variables: source <chr>, code <chr>, site <chr>, site_type <chr>,
#> # latitude <dbl>, longitude <dbl>, pollutant <chr>, start_date <date>,
#> # end_date <date>, ratified_to <date>, zone <chr>, agglomeration <chr>,
#> # zagglom <chr>, local_authority <chr>, lmam_provider <chr>, lmam_code <chr>
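The filter above returns zero rows, which may mean the pollutant column does not store the code as "O3" (the measurement tables in this article use lowercase names such as o3). A case-insensitive comparison avoids depending on the spelling. The sketch below uses a small made-up stand-in for the metadata table, because the real pollutant spellings are an assumption here:

```r
# Hypothetical stand-in for the table returned with by_pollutant = TRUE;
# the pollutant spellings and site/pollutant pairings are invented.
meta_demo <- data.frame(
  code = c("MY1", "MY1", "KC1", "AH"),
  pollutant = c("no2", "o3", "O3", "pm10"),
  stringsAsFactors = FALSE
)

# Compare in a single case so that "o3" and "O3" both match.
o3_sites <- meta_demo[toupper(meta_demo$pollutant) == "O3", ]
print(o3_sites$code)
```

The same toupper() comparison could be applied to the real meta object in place of meta_demo.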
To import data, please make a note of the relevant site codes in the “code” column (and, if appropriate, the “source” network of the data).
Continuous Monitoring
Arguably the most useful data made available by ukaq could be termed ‘continuous monitoring data’ - most commonly hourly data. To access this, you may use import_ukaq_measurements(), which requires two key pieces of information - a site code from the metadata table and a year (or years) to import.
import_ukaq_measurements(c("my1", "kc1"), year = 2024L)
#> # A tibble: 17,568 × 44
#> source date code site o3 no no2 nox so2 co
#> <chr> <dttm> <fct> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 aurn 2024-01-01 00:00:00 MY1 London … 39.7 22.1 30.4 64.3 1.33 0.244
#> 2 aurn 2024-01-01 01:00:00 MY1 London … 23.7 43.0 48.4 114. 2.39 0.373
#> 3 aurn 2024-01-01 02:00:00 MY1 London … 31.3 28.9 41.1 85.5 1.86 0.268
#> 4 aurn 2024-01-01 03:00:00 MY1 London … 34.1 23.0 37.5 72.7 1.60 0.314
#> 5 aurn 2024-01-01 04:00:00 MY1 London … 37.1 23.2 37.9 73.4 1.33 0.256
#> 6 aurn 2024-01-01 05:00:00 MY1 London … 39.5 14.6 28.5 50.9 1.06 0.244
#> 7 aurn 2024-01-01 06:00:00 MY1 London … 35.5 17.7 34.2 61.2 1.06 0.221
#> 8 aurn 2024-01-01 07:00:00 MY1 London … 37.7 13.0 31.2 50.9 1.06 0.186
#> 9 aurn 2024-01-01 08:00:00 MY1 London … 33.1 20.1 37.7 68.7 1.06 0.186
#> 10 aurn 2024-01-01 09:00:00 MY1 London … 32.7 17.0 37.3 63.5 1.06 0.163
#> # ℹ 17,558 more rows
#> # ℹ 34 more variables: pm10 <dbl>, pm2.5 <dbl>, ethane <dbl>, ethene <dbl>,
#> # ethyne <dbl>, propane <dbl>, propene <dbl>, ibutane <dbl>, nbutane <dbl>,
#> # `1butene` <dbl>, t2butene <dbl>, c2butene <dbl>, ipentane <dbl>,
#> # npentane <dbl>, `13bdiene` <dbl>, t2penten <dbl>, `1penten` <dbl>,
#> # `2mepent` <dbl>, isoprene <dbl>, nhexane <dbl>, nheptane <dbl>,
#> # ioctane <dbl>, noctane <dbl>, benzene <dbl>, toluene <dbl>, …
import_ukaq_measurements() is clever enough to work out the specific monitoring network each site is a member of, but sometimes there can be ambiguity. The source argument allows this to be specified. Consider source as defining the pool of one or more networks ukaq will use to align each code with an actual monitoring station. In reality, this should only be an issue for “locally managed” English sites which share site codes with Ricardo-managed sites (e.g., “AD1”, which is both the ‘Aberdeen King Street’ SAQN site and the ‘Adur - Shoreham-by-Sea’ Sussex AQ site). Consider the difference between the two outputs below.
import_ukaq_measurements("ad1", year = 2020L)
#> Warning in import_ukaq_measurements("ad1", year = 2020L): Ambiguous Codes Detected: AD1.
#> Importing sites using following order of preference: aurn, aqe, saqn, waqn, niaqn, lmam
#> # A tibble: 8,784 × 13
#> source date code site no no2 nox pm10 pm2.5 pm1
#> <chr> <dttm> <fct> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 saqn 2020-01-01 00:00:00 AD1 Aberdee… 4.22 17.8 24.3 15.6 13.1 12.4
#> 2 saqn 2020-01-01 01:00:00 AD1 Aberdee… 6.04 19.5 28.8 7.24 5.92 5.33
#> 3 saqn 2020-01-01 02:00:00 AD1 Aberdee… 5.07 21.1 28.9 9.75 8.35 7.93
#> 4 saqn 2020-01-01 03:00:00 AD1 Aberdee… 1.53 9.97 12.3 5.00 3.83 3.37
#> 5 saqn 2020-01-01 04:00:00 AD1 Aberdee… 1.02 9.83 11.4 5.00 3.81 3.32
#> 6 saqn 2020-01-01 05:00:00 AD1 Aberdee… 0.716 5.16 6.25 3.86 2.94 2.59
#> 7 saqn 2020-01-01 06:00:00 AD1 Aberdee… 0.575 5.79 6.67 4.24 3.31 2.97
#> 8 saqn 2020-01-01 07:00:00 AD1 Aberdee… 1.03 7.99 9.57 3.88 3.19 2.96
#> 9 saqn 2020-01-01 08:00:00 AD1 Aberdee… 8.26 23.6 36.2 4.49 3.71 3.48
#> 10 saqn 2020-01-01 09:00:00 AD1 Aberdee… 6.07 26.6 35.9 6.32 5.04 4.80
#> # ℹ 8,774 more rows
#> # ℹ 3 more variables: wd <dbl>, ws <dbl>, temp <dbl>
import_ukaq_measurements("ad1", year = 2020L, source = "lmam")
#> # A tibble: 8,784 × 8
#> source date code site no no2 nox pm10
#> <chr> <dttm> <fct> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 lmam 2020-01-01 00:00:00 AD1 Adur - Shoreham-by-… NA NA NA 39
#> 2 lmam 2020-01-01 01:00:00 AD1 Adur - Shoreham-by-… 23.7 52.2 88.0 38
#> 3 lmam 2020-01-01 02:00:00 AD1 Adur - Shoreham-by-… 12.5 45.9 65.2 36
#> 4 lmam 2020-01-01 03:00:00 AD1 Adur - Shoreham-by-… 12.5 47.2 65.6 34
#> 5 lmam 2020-01-01 04:00:00 AD1 Adur - Shoreham-by-… 26.2 52.4 92.2 38
#> 6 lmam 2020-01-01 05:00:00 AD1 Adur - Shoreham-by-… 6.24 30.8 39.8 37
#> 7 lmam 2020-01-01 06:00:00 AD1 Adur - Shoreham-by-… 7.48 27.9 39.6 37
#> 8 lmam 2020-01-01 07:00:00 AD1 Adur - Shoreham-by-… 15.0 30.6 52.8 34
#> 9 lmam 2020-01-01 08:00:00 AD1 Adur - Shoreham-by-… 15.0 30.6 52.8 38
#> 10 lmam 2020-01-01 09:00:00 AD1 Adur - Shoreham-by-… 36.2 37.1 91.8 33
#> # ℹ 8,774 more rows
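One way to picture the ‘pool of networks’ idea is as a lookup followed by an order of preference. The sketch below is not ukaq’s implementation; the sites table and the resolve_code() helper are hypothetical, and only the preference order and the two AD1 networks are taken from the outputs above.

```r
# Hypothetical lookup of site codes by network; the two AD1 rows mirror
# the outputs above, and resolve_code() is an illustrative helper only.
sites <- data.frame(
  code = c("AD1", "AD1", "MY1"),
  source = c("lmam", "saqn", "aurn"),
  stringsAsFactors = FALSE
)

# Order of preference as printed in the warning message above.
preference <- c("aurn", "aqe", "saqn", "waqn", "niaqn", "lmam")

resolve_code <- function(code, pool = preference) {
  # Candidate stations: matching code, restricted to the pool of networks.
  candidates <- sites[sites$code == toupper(code) & sites$source %in% pool, ]
  # Keep the candidate whose network comes first in the preference order.
  candidates$source[which.min(match(candidates$source, preference))]
}

resolve_code("ad1")                 # every network is in the pool
resolve_code("ad1", pool = "lmam")  # pool restricted to lmam
```

Restricting the pool removes the competing candidate, which is why no ambiguity warning is needed in the second case.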
Data can be augmented with three kinds of extra information to make the functions more useful:
- append_meteorology will add modelled wind speed (ws), wind direction (wd), and air temperature (temp) for networks where it exists. This defaults to TRUE.
- append_quality_flag will add a column (or columns) to indicate whether each pollutant has been ratified. This defaults to FALSE.
- append_metadata will add the metadata columns defined in metadata_columns. This is useful to append information like latitude/longitude for mapping, for example. This defaults to FALSE, with the default metadata columns being site type, latitude and longitude.
To demonstrate, we’ll just grab a couple of pollutants from Marylebone Road ("my1") using the pollutant argument, to keep the output small.
import_ukaq_measurements(
  "my1",
  2020L,
  pollutant = c("no2", "o3"),
  append_metadata = TRUE,
  append_meteorology = FALSE,
  append_quality_flag = TRUE,
  metadata_columns = "zagglom"
)
#> # A tibble: 8,784 × 9
#> code source date site no2 o3 o3_qc no2_qc zagglom
#> <fct> <chr> <dttm> <chr> <dbl> <dbl> <lgl> <lgl> <chr>
#> 1 MY1 aurn 2020-01-01 00:00:00 London Mar… 45.8 1.73 TRUE TRUE Greate…
#> 2 MY1 aurn 2020-01-01 01:00:00 London Mar… 52.6 1.93 TRUE TRUE Greate…
#> 3 MY1 aurn 2020-01-01 02:00:00 London Mar… 44.8 2.00 TRUE TRUE Greate…
#> 4 MY1 aurn 2020-01-01 03:00:00 London Mar… 40.2 2.05 TRUE TRUE Greate…
#> 5 MY1 aurn 2020-01-01 04:00:00 London Mar… 47.3 2.99 TRUE TRUE Greate…
#> 6 MY1 aurn 2020-01-01 05:00:00 London Mar… 40.4 2.89 TRUE TRUE Greate…
#> 7 MY1 aurn 2020-01-01 06:00:00 London Mar… 42.7 2.99 TRUE TRUE Greate…
#> 8 MY1 aurn 2020-01-01 07:00:00 London Mar… 40.7 1.90 TRUE TRUE Greate…
#> 9 MY1 aurn 2020-01-01 08:00:00 London Mar… 42.6 1.85 TRUE TRUE Greate…
#> 10 MY1 aurn 2020-01-01 09:00:00 London Mar… 38.3 1.95 TRUE TRUE Greate…
#> # ℹ 8,774 more rows
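The quality flag columns can be used to screen the data after import. The sketch below works on a made-up table and assumes that TRUE in a ‘_qc’ column means the value has been ratified; that reading of the flag is an assumption rather than something confirmed here.

```r
# Toy table mimicking the no2 / no2_qc columns above; the flag values
# are invented, and TRUE is assumed to mean "ratified".
dat <- data.frame(
  no2 = c(45.8, 52.6, 44.8, 40.2),
  no2_qc = c(TRUE, TRUE, FALSE, NA)
)

# Blank out values not flagged as ratified; a missing flag is treated
# as unratified because %in% never returns NA.
dat$no2_ratified <- ifelse(dat$no2_qc %in% TRUE, dat$no2, NA_real_)

# Mean of the ratified values only.
mean(dat$no2_ratified, na.rm = TRUE)
```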
By default, the data is put into a ‘wide’ format, with different pollutants in different columns. For many applications in R, we may want the data to be in a ‘long’ format. For this data structure, simply set pivot = "long". Note that this will interact with the other ‘append’ arguments - meteorological data and site metadata aren’t pivoted with the pollutants, and there will only be a single quality flag alongside the ‘value’ column.
import_ukaq_measurements(
  "my1",
  2020L,
  pivot = "long"
)
#> # A tibble: 377,712 × 9
#> source date code site wd ws temp pollutant value
#> <chr> <dttm> <fct> <chr> <dbl> <dbl> <dbl> <chr> <dbl>
#> 1 aurn 2020-01-01 00:00:00 MY1 London Ma… 92.7 2.1 2.3 o3 1.73
#> 2 aurn 2020-01-01 01:00:00 MY1 London Ma… 98.3 2.1 1.4 o3 1.93
#> 3 aurn 2020-01-01 02:00:00 MY1 London Ma… 117. 2.3 1 o3 2.00
#> 4 aurn 2020-01-01 03:00:00 MY1 London Ma… 131. 1.8 0.8 o3 2.05
#> 5 aurn 2020-01-01 04:00:00 MY1 London Ma… 109. 1.7 0.8 o3 2.99
#> 6 aurn 2020-01-01 05:00:00 MY1 London Ma… 84.3 1.1 0 o3 2.89
#> 7 aurn 2020-01-01 06:00:00 MY1 London Ma… 86.9 1.2 -0.4 o3 2.99
#> 8 aurn 2020-01-01 07:00:00 MY1 London Ma… 143 1.3 0.9 o3 1.90
#> 9 aurn 2020-01-01 08:00:00 MY1 London Ma… 168. 1.1 1.5 o3 1.85
#> 10 aurn 2020-01-01 09:00:00 MY1 London Ma… 186. 0.6 1.5 o3 1.95
#> # ℹ 377,702 more rows
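The row count above (377,712) is consistent with 8,784 hours multiplied by 43 pollutant columns, with the meteorological columns repeated beside each pollutant. A minimal base R sketch of the same reshaping on a toy table follows; it illustrates the idea and is not how ukaq performs the pivot.

```r
# Toy 'wide' table: two hours, two pollutants, one meteorological column.
wide <- data.frame(
  date = as.POSIXct(c("2020-01-01 00:00:00", "2020-01-01 01:00:00"), tz = "UTC"),
  ws = c(2.1, 2.1),
  no2 = c(45.8, 52.6),
  o3 = c(1.73, 1.93)
)

pollutants <- c("no2", "o3")

# Stack each pollutant column into pollutant/value pairs, repeating the
# non-pollutant columns (date, ws) alongside every pollutant.
long <- do.call(rbind, lapply(pollutants, function(p) {
  data.frame(
    wide[setdiff(names(wide), pollutants)],
    pollutant = p,
    value = wide[[p]]
  )
}))

# Rows in 'wide' multiplied by the number of pollutants.
nrow(long)
```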
Finally, there are other data types available if they are of interest, which can be used with the data_type argument of import_ukaq_measurements(). These include:
- "hourly": Hourly data (the default).
- "daily": Daily average data.
- "15_min": 15-minute average SO2 concentrations.
- "8_hour": 8-hour rolling mean concentrations for O3 and CO.
- "24_hour": 24-hour rolling mean concentrations for particulates.
- "daily_max_8": Daily maximum of the 8-hour rolling mean for O3 and CO.
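To make the rolling statistics concrete, the sketch below computes a trailing 8-hour mean and its daily maximum from a made-up hourly series in base R. It is a naive illustration: the pre-calculated values served by ukaq may apply data-capture rules and window-labelling conventions that are not reproduced here.

```r
# Toy series of 48 hourly values covering two days (made-up numbers).
hours <- seq(as.POSIXct("2020-01-01 00:00:00", tz = "UTC"),
             by = "hour", length.out = 48)
o3 <- rep(c(10, 20, 30, 40), each = 12)

# Trailing 8-hour mean: each value averages the current hour and the
# seven before it. The first seven results are NA (incomplete window).
roll8 <- as.numeric(stats::filter(o3, rep(1 / 8, 8), sides = 1))

# Daily maximum of the rolling mean.
daily_max_8 <- tapply(roll8, as.Date(hours, tz = "UTC"), max, na.rm = TRUE)
daily_max_8
```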
On top of these data types, there are three additional data_type values that can be accessed through other functions, detailed in the sections below.
Monthly & Annual Statistics
When examining entire networks, it may be useful to examine aggregated data. import_ukaq_summaries() allows for "monthly" and "annual" data types to be imported. This function works differently to import_ukaq_measurements() in a few key ways:
- A pre-calculated monthly or annual data capture is also returned with the data.
- code is optional (defaulting to NULL), which will make the function return all data available for the given source and year.
Many of the arguments mentioned in the above section are also available for this function, including pollutant, append_metadata, metadata_columns, and pivot.
import_ukaq_summaries(year = 2024, source = "aurn")
#> # A tibble: 187 × 89
#> source date year code site o3 o3.capture o3.summer.capture
#> <chr> <date> <int> <chr> <chr> <dbl> <dbl> <dbl>
#> 1 aurn 2024-01-01 2024 ABD7 Aberdeen Un… NA NA NA
#> 2 aurn 2024-01-01 2024 ABD8 Aberdeen We… NA NA NA
#> 3 aurn 2024-01-01 2024 ABD9 Aberdeen Er… 56.0 0.878 0.959
#> 4 aurn 2024-01-01 2024 ACTH Auchencorth… 59.2 0.998 0.999
#> 5 aurn 2024-01-01 2024 AH Aston Hill 63.4 0.991 0.991
#> 6 aurn 2024-01-01 2024 ARM6 Armagh Road… NA NA NA
#> 7 aurn 2024-01-01 2024 BAAR Ballymena A… NA NA NA
#> 8 aurn 2024-01-01 2024 BALM Ballymena B… NA NA NA
#> 9 aurn 2024-01-01 2024 BAR3 Barnsley Ga… 56.9 0.292 0.0947
#> 10 aurn 2024-01-01 2024 BBRD Birkenhead … NA NA NA
#> # ℹ 177 more rows
#> # ℹ 81 more variables: o3.daily.max.8hour <dbl>, o3.aot40v <int>,
#> # o3.aot40f <int>, somo35 <dbl>, somo35.capture <dbl>, no <dbl>,
#> # no.capture <dbl>, no2 <dbl>, no2.capture <dbl>, nox <dbl>,
#> # nox.capture <dbl>, so2 <dbl>, so2.capture <dbl>, co <dbl>,
#> # co.capture <dbl>, pm10 <dbl>, pm10.capture <dbl>, pm2.5 <dbl>,
#> # pm2.5.capture <dbl>, gr10 <dbl>, gr10.capture <dbl>, gr2.5 <dbl>, …
import_ukaq_summaries(year = 2024, source = "aurn", pivot = "long")
#> # A tibble: 8,228 × 8
#> source date year code site pollutant mean capture
#> <chr> <date> <int> <chr> <chr> <chr> <dbl> <dbl>
#> 1 aurn 2024-01-01 2024 ABD7 Aberdeen Union Street … o3 NA NA
#> 2 aurn 2024-01-01 2024 ABD8 Aberdeen Wellington Ro… o3 NA NA
#> 3 aurn 2024-01-01 2024 ABD9 Aberdeen Erroll Park o3 56.0 0.878
#> 4 aurn 2024-01-01 2024 ACTH Auchencorth Moss o3 59.2 0.998
#> 5 aurn 2024-01-01 2024 AH Aston Hill o3 63.4 0.991
#> 6 aurn 2024-01-01 2024 ARM6 Armagh Roadside o3 NA NA
#> 7 aurn 2024-01-01 2024 BAAR Ballymena Antrim Road o3 NA NA
#> 8 aurn 2024-01-01 2024 BALM Ballymena Ballykeel o3 NA NA
#> 9 aurn 2024-01-01 2024 BAR3 Barnsley Gawber o3 56.9 0.292
#> 10 aurn 2024-01-01 2024 BBRD Birkenhead Borough Road o3 NA NA
#> # ℹ 8,218 more rows
import_ukaq_summaries("my1", 2020, data_type = "monthly", pivot = "long")
#> # A tibble: 516 × 10
#> source date year month month_label code site pollutant mean capture
#> <chr> <date> <int> <int> <fct> <chr> <chr> <chr> <dbl> <dbl>
#> 1 aurn 2020-01-01 2020 1 Jan MY1 Lond… o3 15.5 0.960
#> 2 aurn 2020-02-01 2020 2 Feb MY1 Lond… o3 21.8 0.978
#> 3 aurn 2020-03-01 2020 3 Mar MY1 Lond… o3 35.4 0.997
#> 4 aurn 2020-04-01 2020 4 Apr MY1 Lond… o3 54.7 0.976
#> 5 aurn 2020-05-01 2020 5 May MY1 Lond… o3 57.1 0.976
#> 6 aurn 2020-06-01 2020 6 Jun MY1 Lond… o3 41.0 0.993
#> 7 aurn 2020-07-01 2020 7 Jul MY1 Lond… o3 27.0 0.978
#> 8 aurn 2020-08-01 2020 8 Aug MY1 Lond… o3 41.8 0.660
#> 9 aurn 2020-09-01 2020 9 Sep MY1 Lond… o3 31.3 0.781
#> 10 aurn 2020-10-01 2020 10 Oct MY1 Lond… o3 18.5 0.933
#> # ℹ 506 more rows
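The capture columns are useful for screening out statistics built on too little data, such as the annual O3 value for BAR3 above with a capture of 0.292. The sketch below treats data capture as the share of non-missing values in a made-up hourly series; how ukaq’s pre-calculated capture is derived is an assumption here, and the threshold is illustrative.

```r
# Toy hourly series for one day with six missing hours (made-up numbers).
no2 <- c(rep(40, 18), rep(NA, 6))

# Data capture taken here as the share of non-missing values.
capture <- mean(!is.na(no2))
capture

# Keep the mean only when capture reaches a chosen threshold.
threshold <- 0.75 # illustrative value, not taken from ukaq
no2_mean <- if (capture >= threshold) mean(no2, na.rm = TRUE) else NA_real_
no2_mean
```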
Daily Air Quality Index (DAQI)
Pre-calculated Daily Air Quality Indices (the ‘DAQI’, see https://uk-air.defra.gov.uk/air-pollution/daqi) are also available through import_ukaq_daqi(). This function is similar to import_ukaq_summaries(), with some additional nuances:
- The default pivot is "long". This is due to the amount of data presented - the daily statistic, the corresponding index, the corresponding band, and the measurement period. Not all of this information is carried to the ‘wide’ format if the alternative is selected.
- pollutant restricts the output to the chosen DAQI pollutants - any combination of NO2, O3, PM10, PM2.5 or SO2.
import_ukaq_daqi("my1", 2020)
#> # A tibble: 1,624 × 8
#> source date code site pollutant concentration poll_index poll_band
#> <chr> <date> <chr> <chr> <chr> <dbl> <int> <fct>
#> 1 aurn 2020-01-01 MY1 London … no2 72.4 2 Low
#> 2 aurn 2020-01-01 MY1 London … o3 6 1 Low
#> 3 aurn 2020-01-01 MY1 London … pm10 39 3 Low
#> 4 aurn 2020-01-01 MY1 London … pm2.5 36 4 Moderate
#> 5 aurn 2020-01-01 MY1 London … so2 9.15 1 Low
#> 6 aurn 2020-01-02 MY1 London … no2 66.2 1 Low
#> 7 aurn 2020-01-02 MY1 London … o3 25 1 Low
#> 8 aurn 2020-01-02 MY1 London … pm10 15 1 Low
#> 9 aurn 2020-01-02 MY1 London … pm2.5 9 1 Low
#> 10 aurn 2020-01-02 MY1 London … so2 6.46 1 Low
#> # ℹ 1,614 more rows
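The output above gives one index per pollutant per day. Under the DAQI scheme the overall index for a site is the highest of the pollutant indices, with bands of Low (1-3), Moderate (4-6), High (7-9) and Very High (10). A minimal sketch of that step, using the first ten rows above typed in by hand:

```r
# First ten rows of the output above, typed in by hand.
daqi <- data.frame(
  date = as.Date(rep(c("2020-01-01", "2020-01-02"), each = 5)),
  pollutant = rep(c("no2", "o3", "pm10", "pm2.5", "so2"), times = 2),
  poll_index = c(2L, 1L, 3L, 4L, 1L, 1L, 1L, 1L, 1L, 1L)
)

# Overall index per day: the highest index across pollutants.
overall <- aggregate(poll_index ~ date, data = daqi, FUN = max)

# Map the 1-10 index onto the four DAQI bands.
overall$poll_band <- cut(
  overall$poll_index,
  breaks = c(0, 3, 6, 9, 10),
  labels = c("Low", "Moderate", "High", "Very High")
)
overall
```

With several sites in the table, adding code to the formula (poll_index ~ date + code) would give one overall index per site and day.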