This function reads raw microclimatic logger data guided by two table parameters:
(i) files_table
is required; it must contain paths pointing to the raw
csv logger files, a specification of the data format (logger type), and the locality name.
(ii) localities_table
is optional, containing the locality id and metadata, e.g. longitude, latitude, elevation...
mc_read_data(
files_table,
localities_table = NULL,
clean = TRUE,
silent = FALSE,
user_data_formats = NULL
)
path to a csv file or a data.frame object (see example) with 3 required columns and a few optional ones:
required columns:
path - path to the source file
locality_id - unique locality id
data_format - data format; see mc_data_formats, names(mc_data_formats)
optional columns:
serial_number - logger serial number. If NA, then myClim tries to detect the serial number from the file name (for TOMST) or the header (for HOBO)
logger_type - type of logger. This defines the individual sensor attributes (measurement heights and physical units) of the logger, which is important when combining data from multiple loggers on a locality. If not provided, myClim tries to detect logger_type from the source data file structure (applicable for HOBO, Dendro, Thermo and TMS), but automatic detection of TMS_L45 is not possible. Pre-defined logger types are: ("Dendro", "HOBO", "Thermo", "TMS", "TMS_L45"). Default sensor heights based on logger type are defined in the table mc_data_heights
date_format - a character vector specifying the custom date format(s) for the lubridate::parse_date_time()
function
(e.g., "%d.%m.%Y %H:%M:%S"). Multiple formats can be defined, both
in CSV and in an R data.frame, using the @
character as a separator (e.g., "%d.%m.%Y %H:%M:%S@%Y.%m.%d %H:%M:%S").
The first matching format is selected for parsing; multiple formats are applicable to a single file.
tz_offset - if the source datetimes are not in UTC, the offset from UTC can be defined here in minutes. Values in this column have the highest priority. If NA, myClim attempts to auto-detect the timezone in the files; if the timezone cannot be detected, UTC is assumed. In the HOBO format the timezone offset can be defined in the header, in which case the function tries to detect the offset automatically. Ignored for the TOMST TMS data format (it is always in UTC)
step - time step of the microclimatic time-series in seconds. When provided, it is used in mc_prep_clean instead of automatic step detection. See details.
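For illustration, a files_table can be assembled directly as an R data.frame; the file paths below are made up, and "TOMST" is used as an example data_format key (check names(mc_data_formats) for the keys available in your installation):

```r
library(myClim)

# Hypothetical source paths; locality ids follow the package example data.
# data_format must be one of names(mc_data_formats).
files_table <- data.frame(
  path        = c("data/94184102_0.csv", "data/94184103_0.csv"),
  locality_id = c("A6W79", "A2E32"),
  data_format = "TOMST"
)

# Read and (by default) clean the series; see clean parameter.
tomst_data <- mc_read_data(files_table)
```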
path to a csv file (e.g. "c:/user/localities.table.csv") or an R data.frame (see example).
The localities table is optional (default NULL).
The locality_id
is the only required column; other columns are optional. Column names corresponding
to the myClim pre-defined locality metadata (elevation, lon_wgs84, lat_wgs84, tz_offset)
are associated with those pre-defined metadata slots; other columns are written into
metadata@user_data
see myClim-package.
required columns:
locality_id - unique locality id
optional columns:
elevation - elevation (in m)
lon_wgs84 - longitude (in decimal degrees)
lat_wgs84 - latitude (in decimal degrees)
tz_offset - locality time zone offset from UTC, applicable for converting time-series from UTC to local time.
... - any other columns are imported to metadata@user_data
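A minimal localities_table can likewise be built as a data.frame; all metadata values below are made up for illustration:

```r
# Hypothetical locality metadata; locality_id is the only required column.
localities_table <- data.frame(
  locality_id = c("A6W79", "A2E32"),
  elevation   = c(255, 530),       # m (made-up values)
  lon_wgs84   = c(17.11, 17.52),   # decimal degrees (made-up values)
  lat_wgs84   = c(49.51, 49.23),   # decimal degrees (made-up values)
  habitat     = c("forest", "meadow")  # extra column -> metadata@user_data
)
```

Any column not matching a pre-defined metadata slot (such as habitat here) is stored in metadata@user_data.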
if TRUE, then mc_prep_clean is called automatically while reading (default TRUE)
if TRUE, then no information is printed to the console (default FALSE)
custom data formats in the form list(key=mc_DataFormat); use when your logger files are not pre-defined in myClim (default NULL).
If a custom data format is defined, its key can be used in the data_format parameter of mc_read_files()
and mc_read_data()
. A custom data format must be defined first and can then be used for reading.
myClim object in Raw-format; see myClim-package
The input tables can be R data.frames or csv files. When loading files_table
and localities_table
from external CSV, they must have a header, and the column separator must be a comma ",".
If you only need to place loggers into the correct localities, files_table
is enough.
If you wish to provide additional locality metadata, you also need localities_table
By default, data are cleaned with the function mc_prep_clean; see the function description.
mc_prep_clean detects gaps in time-series data,
duplicated records, and records in the wrong order. Importantly, mc_prep_clean
also applies the step parameter if provided. The step parameter can be used either
instead of automatic step detection, which can sometimes fail, or to prune
microclimatic data. For example, if you have a 15-minute time series but wish to
keep only one record per hour (without aggregating), you can use the step parameter.
However, if a step is provided but clean = FALSE
, then the step is only stored in the
metadata of the myClim object; the time-series data is not cleaned and the step is not applied.
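The pruning use case above can be sketched as follows; the file path is hypothetical, and the source file is assumed to hold a 15-minute TOMST series:

```r
library(myClim)

# Hypothetical files_table: step = 3600 prunes a 15-min series
# to one record per hour (without aggregating).
files_table <- data.frame(
  path        = "data/94184102_0.csv",  # made-up path
  locality_id = "A6W79",
  data_format = "TOMST",
  step        = 3600                    # seconds; overrides step detection
)

# clean = TRUE (the default) applies the step in mc_prep_clean;
# with clean = FALSE the step would only be stored in metadata.
tomst_data <- mc_read_data(files_table, clean = TRUE)
```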
files_csv <- system.file("extdata", "files_table.csv", package = "myClim")
localities_csv <- system.file("extdata", "localities_table.csv", package = "myClim")
tomst_data <- mc_read_data(files_csv, localities_csv)
#> 4 loggers
#> datetime range: 2020-10-06 09:00:00 - 2020-11-01 13:00:00
#> detected steps: (900s = 15min)
#> locality_id serial_number logger_name start_date end_date
#> 1 A1E05 91184101 Thermo_1 2020-10-28 08:45:00 2020-10-29 09:45:00
#> 2 A1E05 92201058 Dendro_1 2020-10-31 12:00:00 2020-11-01 13:00:00
#> 3 A2E32 94184103 TMS_1 2020-10-16 06:15:00 2020-10-17 07:15:00
#> 4 A6W79 94184102 TMS_1 2020-10-06 09:00:00 2020-10-07 10:00:00
#> step_seconds count_duplicities count_missing count_disordered rounded
#> 1 900 0 0 0 FALSE
#> 2 900 0 0 0 FALSE
#> 3 900 0 0 0 FALSE
#> 4 900 0 0 0 FALSE