Dataset
Classes and functions for defining, finding, and loading data.
Classes:
Dataset(**facets) – Define datasets, find the related files, and load them.
Data:
INHERITED_FACETS – Inherited facets.
Functions:
datasets_to_recipe(datasets[, recipe]) – Create or update a recipe from datasets.
- class esmvalcore.dataset.Dataset(**facets: FacetValue)
Define datasets, find the related files, and load them.
- Parameters:
**facets (FacetValue) – Facets describing the dataset. See esmvalcore.esgf.facets.FACETS for the mapping between the facet names used by ESMValCore and those used on ESGF.
- facets
Facets describing the dataset.
- Type:
Methods:
add_supplementary(**facets) – Add a supplementary dataset.
augment_facets() – Add additional facets.
copy(**facets) – Create a copy.
find_files() – Find files.
from_files() – Create datasets based on the available files.
from_ranges() – Create a list of datasets from short notations.
from_recipe(recipe, session) – Read datasets from a recipe.
load() – Load dataset.
set_facet(key, value[, persist]) – Set facet.
set_version() – Set the 'version' facet based on the available data.
summary([shorten]) – Summarize the content of dataset.
Attributes:
files – The files associated with this dataset.
input_datasets – Get input datasets.
minimal_facets – Return a dictionary with the persistent facets.
session – An esmvalcore.config.Session associated with the dataset.
- add_supplementary(**facets: FacetValue) → None
Add a supplementary dataset.
This is a convenience function that will create a copy of the current dataset, update its facets with the values specified in **facets, and append it to Dataset.supplementaries. For more control over the creation of the supplementary dataset, first create a new Dataset describing the supplementary dataset and then append it to Dataset.supplementaries.
- Parameters:
**facets (FacetValue) – Facets describing the supplementary variable.
- Return type:
None
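The copy, update, and append behaviour described above can be illustrated with a toy stand-in. This is only a sketch: ToyDataset is a hypothetical class written for this example, not the ESMValCore implementation, and the facet values are made up.

```python
class ToyDataset:
    """Hypothetical stand-in for Dataset; holds facets and supplementaries."""

    def __init__(self, **facets):
        self.facets = dict(facets)
        self.supplementaries = []

    def copy(self, **facets):
        # New object with the current facets, overridden by the given ones.
        # Assumption: the toy copy does not carry over supplementaries.
        return ToyDataset(**{**self.facets, **facets})

    def add_supplementary(self, **facets):
        # Convenience path: copy the dataset, update facets, append the copy.
        self.supplementaries.append(self.copy(**facets))


tas = ToyDataset(short_name="tas", mip="Amon", dataset="EC-Earth3")
tas.add_supplementary(short_name="areacella", mip="fx")
print(tas.supplementaries[0].facets)
```

The supplementary dataset keeps every facet of the main dataset except the ones passed explicitly, which is why only the differing facets need to be specified.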
- augment_facets() → None
Add additional facets.
This function will update the dataset with additional facets from various sources.
- Return type:
None
- copy(**facets: FacetValue) → Dataset
Create a copy.
- Parameters:
**facets (FacetValue) – Update these facets in the copy. Note that for supplementary datasets attached to the dataset, the 'short_name' and 'mip' facets will not be updated with these values.
- Returns:
A copy of the dataset.
- Return type:
Dataset
- find_files() → None
Find files.
Look for files and populate the Dataset.files property of the dataset and its supplementary datasets.
- Return type:
None
- from_files() → Iterator[Dataset]
Create datasets based on the available files.
The facet values for local files are retrieved from the directory tree, where the directories represent the facet values. See CMIP data for more information on this kind of file organization.
glob.glob() patterns can be used as facet values to select multiple datasets. If for some of the datasets not all glob patterns can be expanded (e.g. because the required facet values cannot be inferred from the directory names), these datasets will be ignored, unless this happens to be all datasets.
If glob.glob() patterns are used in supplementary variables and multiple matching datasets are found, only the supplementary dataset that has the most facets in common with the main dataset will be attached.
Supplementary datasets will inherit the facet values from the main dataset for those facets listed in INHERITED_FACETS.
This also works for derived variables. The input datasets that are necessary for derivation can be accessed via Dataset.input_datasets.
Examples
See Discovering data for example use cases.
- Yields:
Dataset – Datasets representing the available files.
- Return type:
Iterator[Dataset]
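The rule for choosing among several matching supplementary datasets (keep the one sharing the most facets with the main dataset) can be sketched in plain Python. The helper names and facet values below are hypothetical, facets are modelled as plain dictionaries, and ties are resolved by taking the first candidate, which is an assumption of this sketch rather than documented behaviour.

```python
def shared_facet_count(main: dict, candidate: dict) -> int:
    """Count the facets that have the same value in both dictionaries."""
    return sum(
        1 for key, value in candidate.items()
        if key in main and main[key] == value
    )


def pick_supplementary(main: dict, candidates: list) -> dict:
    """Return the candidate with the most facets in common with main."""
    # max() keeps the first candidate when counts are tied (sketch assumption).
    return max(candidates, key=lambda cand: shared_facet_count(main, cand))


main = {"project": "CMIP6", "dataset": "EC-Earth3", "exp": "historical",
        "ensemble": "r1i1p1f1", "mip": "Amon", "short_name": "tas"}
candidates = [
    {"project": "CMIP6", "dataset": "EC-Earth3", "exp": "piControl",
     "ensemble": "r1i1p1f1", "mip": "fx", "short_name": "areacella"},
    {"project": "CMIP6", "dataset": "EC-Earth3", "exp": "historical",
     "ensemble": "r1i1p1f1", "mip": "fx", "short_name": "areacella"},
]
best = pick_supplementary(main, candidates)
print(best["exp"])
```

Here the second candidate wins because it additionally matches the main dataset on the 'exp' facet.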
- from_ranges() → list[Dataset]
Create a list of datasets from short notations.
This expands the 'ensemble' and 'sub_experiment' facets in the dataset definition if they are ranges.
For example 'ensemble'='r(1:3)i1p1f1' will be expanded to three datasets, with 'ensemble' values 'r1i1p1f1', 'r2i1p1f1', 'r3i1p1f1'.
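To make the short notation concrete, the expansion of a single range can be sketched with a regular expression. The function expand_range below is a hypothetical helper for illustration, not the ESMValCore implementation; it assumes one '(start:end)' range per value with inclusive bounds, as in the example above.

```python
import re

# Hypothetical helper, not part of ESMValCore: matches "(start:end)".
RANGE_PATTERN = re.compile(r"\((\d+):(\d+)\)")


def expand_range(value: str) -> list:
    """Expand the first '(start:end)' range in value, bounds inclusive."""
    match = RANGE_PATTERN.search(value)
    if match is None:
        # No range present: the value describes exactly one dataset.
        return [value]
    start, end = int(match.group(1)), int(match.group(2))
    prefix, suffix = value[:match.start()], value[match.end():]
    return [f"{prefix}{number}{suffix}" for number in range(start, end + 1)]


print(expand_range("r(1:3)i1p1f1"))
# prints ['r1i1p1f1', 'r2i1p1f1', 'r3i1p1f1']
```

Each expanded value would correspond to one dataset in the returned list.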
- static from_recipe(recipe: Path | str | dict, session: Session) → list[Dataset]
Read datasets from a recipe.
- Parameters:
- Returns:
A list of datasets.
- Return type:
list[Dataset]
- property input_datasets: list[Dataset]
Get input datasets.
For non-derived variables (i.e., those with facet derive=False), this will simply return the dataset itself in a list.
For derived variables (i.e., those with facet derive=True), this will return the datasets required for derivation if derivation is necessary, and the dataset itself if derivation is not necessary. Derivation is necessary if the facet force_derivation=True is set or no files for the dataset itself are available.
See also esmvalcore.preprocessor.derive() for example usage.
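The decision described above reduces to a small boolean rule. The function below is a hypothetical sketch of that rule only; the parameter has_files stands in for "files for the dataset itself are available" and is an assumption of this example, not an ESMValCore attribute.

```python
def derivation_is_necessary(derive: bool, force_derivation: bool,
                            has_files: bool) -> bool:
    """Return True when the inputs required for derivation would be used."""
    if not derive:
        # Non-derived variables always resolve to the dataset itself.
        return False
    # Derived variables: derive when forced, or when no files exist.
    return force_derivation or not has_files


# A derived variable whose own files exist and is not forced: no derivation.
print(derivation_is_necessary(derive=True, force_derivation=False, has_files=True))
# prints False
```

When the function returns False, the property would yield the dataset itself in a list; when it returns True, it would yield the datasets required for derivation.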
- load() → Cube
Load dataset.
- Raises:
InputFilesNotFound – When no files were found.
- Returns:
An iris cube with the data corresponding to the dataset.
- Return type:
Cube
- property minimal_facets: Facets
Return a dictionary with the persistent facets.
- property session: Session
An esmvalcore.config.Session associated with the dataset.
- esmvalcore.dataset.INHERITED_FACETS: list[str] = ['dataset', 'domain', 'driver', 'grid', 'project', 'timerange']
Inherited facets.
Supplementary datasets created based on the available files using the Dataset.from_files() method will inherit the values of these facets from the main dataset.
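As an illustration of what inheriting these facets means, the sketch below copies the listed facets from a main dataset's facets onto a supplementary dataset's facets, with both modelled as plain dictionaries. The function inherit_facets and the facet values are hypothetical, and letting inherited values overwrite existing ones is a simplifying assumption of this sketch.

```python
# Copied from the documented value of esmvalcore.dataset.INHERITED_FACETS.
INHERITED_FACETS = ["dataset", "domain", "driver", "grid", "project", "timerange"]


def inherit_facets(main: dict, supplementary: dict) -> dict:
    """Return the supplementary facets with inherited values from main."""
    result = dict(supplementary)
    for facet in INHERITED_FACETS:
        if facet in main:
            # Assumption: the main dataset's value takes precedence.
            result[facet] = main[facet]
    return result


main = {"project": "CMIP6", "dataset": "EC-Earth3", "mip": "Amon",
        "short_name": "tas", "timerange": "1850/2000"}
supplementary = {"short_name": "areacella", "mip": "fx"}
print(inherit_facets(main, supplementary))
```

Facets that are not in the list, such as 'mip' and 'short_name', stay as defined on the supplementary dataset.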
- esmvalcore.dataset.datasets_to_recipe(datasets: Iterable[Dataset], recipe: Path | str | dict[str, Any] | None = None) → dict
Create or update a recipe from datasets.
- Parameters:
datasets (Iterable[Dataset]) – Datasets to use in the recipe.
recipe (Path | str | dict[str, Any] | None) – Recipe to load the datasets from. The value provided here should be either a path to a file, a recipe file that has been loaded using e.g. yaml.safe_load(), or a str that can be loaded using yaml.safe_load().
- Return type:
dict
Examples
See Composing recipes for example use cases.