Configuration Reference
Given here are all configuration settings for zappend
.
Target Outline
append_dim
Type string.
The name of the variadic append dimension.
Defaults to "time"
.
append_step
If set, enforces a step size in the append dimension between two slices or just enforces a direction. Must be one of the following:
-
Arbitrary step size or not applicable. Its value is
null
. -
Monotonically increasing. Its value is
"+"
. -
Monotonically decreasing. Its value is
"-"
. -
Type string. A positive or negative time delta value, such as
12h
,2D
,-1D
. -
Type number. A positive or negative numerical delta value.
Defaults to null
.
fixed_dims
Type object. Specifies the fixed dimensions of the target dataset. Keys are dimension names, values are dimension sizes. The object's values are of type integer.
included_variables
Type array. Specifies the names of variables to be included in the target dataset. Defaults to all variables found in the first contributing dataset. The items of the list are of type string.
excluded_variables
Type array. Specifies the names of individual variables to be excluded from all contributing datasets. The items of the list are of type string.
variables
Type object.
Defines dimensions, encoding, and attributes for variables in the target dataset. Object property names refer to variable names. The special name *
refers to all variables, which is useful for defining common values.
The object's values are of type object.
Variable metadata.
-
dims
: The names of the variable's dimensions in the given order. Each dimension must exist in contributing datasets. The items of the list are of type string. -
encoding
: Variable Zarr storage encoding. Settings given here overwrite the encoding settings of the first contributing dataset.dtype
: Must be one of"int8", "uint8", "int16", "uint16", "int32", "uint32", "int64", "uint64", "float32", "float64"
.chunks
: Must be one of the following:- Type array.
Chunk sizes for each dimension of the variable.
The items of the list must be one of the following:
- Type integer. Dimension is chunked using given size.
- Disable chunking in this dimension.
Its value is
null
.
- Disable chunking in all dimensions.
Its value is
null
.
- Type array.
Chunk sizes for each dimension of the variable.
The items of the list must be one of the following:
fill_value
: Must be one of the following:- Type number.
A number of type and unit of the given storage
dtype
. - Not-a-number. Can be used only if storage
dtype
isfloat32
orfloat64
. Its value is"NaN"
. - No fill value.
Its value is
null
.
- Type number.
A number of type and unit of the given storage
scale_factor
: Scale factor for computing the in-memory value:memory_value = scale_factor * storage_value + add_offset
.add_offset
: Add offset for computing the in-memory value:memory_value = scale_factor * storage_value + add_offset
.units
: Units of the storage data type if memory data type is date/time.calendar
: The calendar to be used if memory data type is date/time.compressor
: Compressor definition. Set tonull
to disable data compression. Allowed parameters depend on the value ofid
. The keyid
is required.id
: Compressor identifier
filters
: List of filters. Set tonull
to not use filters. The items of the list are of type object. Filter definition. Allowed parameters depend on the value ofid
. The keyid
is required.id
: Compressor identifier
-
attrs
: Arbitrary variable metadata attributes.
attrs
Type object.
Arbitrary dataset attributes. If permit_eval
is set to true
, string values may include Python expressions enclosed in {{
and }}
to dynamically compute attribute values; in the expression, the current dataset is named ds
. Refer to the user guide for more information.
attrs_update_mode
The mode used update target attributes from slice attributes. Independently of this setting, extra attributes configured by the attrs
setting will finally be used to update the resulting target attributes.
Must be one of the following:
-
Use attributes from first slice dataset and keep them. Its value is
"keep"
. -
Replace existing attributes by attributes of last slice dataset. Its value is
"replace"
. -
Update existing attributes by attributes of last slice dataset. Its value is
"update"
. -
Ignore attributes from slice datasets. Its value is
"ignore"
.
Defaults to "keep"
.
zarr_version
The Zarr version to be used.
Its value is 2
.
Defaults to 2
.
Data I/O - Target
target_dir
Type string. The URI or local path of the target Zarr dataset. Must specify a directory whose parent directory must exist.
target_storage_options
Type object.
Options for the filesystem given by the URI of target_dir
.
force_new
Type boolean.
Force creation of a new target dataset. An existing target dataset (and its lock) will be permanently deleted before appending of slice datasets begins. WARNING: the deletion cannot be rolled back.
Defaults to false
.
Data I/O - Slices
slice_storage_options
Type object. Options for the filesystem given by the protocol of the URIs of contributing datasets.
slice_engine
Type string.
The name of the engine to be used for opening contributing datasets. Refer to the engine
argument of the function xarray.open_dataset()
.
slice_polling
Defines how to poll for contributing datasets. Must be one of the following:
-
No polling, fail immediately if dataset is not available. Its value is
false
. -
Poll using default values. Its value is
true
. -
Type object. Polling parameters. The keys
interval
,timeout
are required.interval
: Polling interval in seconds. Defaults to2
.timeout
: Polling timeout in seconds. Defaults to60
.
slice_source
Type string.
The fully qualified name of a class or function that receives a slice item as argument(s) and provides the slice dataset. If a class is given, it must be derived from zappend.api.SliceSource
. If the function is a context manager, it must yield an xarray.Dataset
. If a plain function is given, it must return any valid slice item type. Refer to the user guide for more information.
slice_source_kwargs
Type object.
Extra keyword-arguments passed to a configured slice_source
together with each slice item.
persist_mem_slices
Type boolean.
Persist in-memory slices and reopen from a temporary Zarr before appending them to the target dataset. This can prevent expensive re-computation of dask chunks at the cost of additional i/o.
Defaults to false
.
Data I/O - Transactions
temp_dir
Type string. The URI or local path of the directory that will be used to temporarily store rollback information.
temp_storage_options
Type object.
Options for the filesystem given by the protocol of temp_dir
.
disable_rollback
Type boolean.
Disable rolling back dataset changes on failure. Effectively disables transactional dataset modifications, so use this setting with care.
Defaults to false
.
Miscellaneous
version
Configuration schema version. Allows the schema to evolve while still preserving backwards compatibility.
Its value is 1
.
Defaults to 1
.
dry_run
Type boolean.
If true
, log only what would have been done, but don't apply any changes.
Defaults to false
.
permit_eval
Type boolean.
Allow for dynamically computed values in dataset attributes attrs
using the syntax {{ expression }}
. Executing arbitrary Python expressions is a security risk, therefore this must be explicitly enabled. Refer to the user guide for more information.
Defaults to false
.
extra
Type object.
Extra settings. Intended use is by a slice_source
that expects an argument named ctx
to access the extra settings and other configuration.
profiling
Profiling configuration. Allows for runtime profiling of the processing. Must be one of the following:
-
Type boolean. If set, profiling is enabled and output is logged using level
"INFO"
. Otherwise, profiling is disabled. -
Type string. Profile path. Enables profiling and writes a profiling report to given path.
-
Type object. Detailed profiling configuration.
enabled
: Enable or disable profiling.path
: Local file path for profiling output.log_level
: Defaults to"INFO"
. Must be one of"CRITICAL", "ERROR", "WARNING", "INFO", "DEBUG", "NOTSET"
.keys
: Sort output according to the supplied column names. Refer to Stats.sort_stats(*keys). Defaults to["tottime"]
. Must be one of"calls", "cumulative", "cumtime", "file", "filename", "module", "ncalls", "pcalls", "line", "name", "nfl", "stdname", "time", "tottime"
.restrictions
: Used to limit the list down to the significant entries in the profiling report. Refer to Stats.print_stats(*restrictions). The items of the list must be one of the following:- Type integer. Select a count of lines.
- Type number. Select a percentage of lines.
- Type string. Pattern-match the standard name that is printed.
logging
Logging configuration. Must be one of the following:
-
Type boolean. Shortform that enables logging to the console using log level
"INFO"
. -
Shortform that enables logging to the console using the specified log level. Must be one of
"CRITICAL", "ERROR", "WARNING", "INFO", "DEBUG", "NOTSET"
. -
Type object. Detailed logging configuration. For details refer to the dictionary schema of the Python module
logging.config
. The keyversion
is required.version
: Its value is1
.formatters
: Formatter definitions. Each key is a formatter id and each value is an object describing how to configure the corresponding formatter. The object's values are of type object. Formatter configuration.format
: Format string in the givenstyle
. Defaults to"%(message)s"
.datefmt
: Format string in the givenstyle
for the date/time portion. Defaults to"%Y-%m-%d %H:%M:%S,uuu"
.style
: Must be one of"%", "{", "$"
.
filters
: Filter definitions. Each key is a filter id and each value is a dict describing how to configure the corresponding filter. The object's values are of type object. Filter configuration.handlers
: Handler definitions. Each key is a handler id and each value is an object describing how to configure the corresponding handler. The object's values are of type object. Handler configuration. All keys other than the following are passed through as keyword arguments to the handler's constructor. The keyclass
is required.class
: The fully qualified name of the handler class. See logging handlers.level
: Must be one of"CRITICAL", "ERROR", "WARNING", "INFO", "DEBUG", "NOTSET"
.formatter
: The id of the formatter for this handler.filters
: A list of ids of the filters for this logger. The items of the list are of type string.
loggers
: Logger definitions. Each key is a logger name and each value is an object describing how to configure the corresponding logger. The tool's logger has the id'zappend'
. The object's values are of type object. Logger configuration.level
: Must be one of"CRITICAL", "ERROR", "WARNING", "INFO", "DEBUG", "NOTSET"
.propagate
: The propagation setting of the logger.filters
: A list of ids of the filters for this logger. The items of the list are of type string.handlers
: A list of ids of the handlers for this logger. The items of the list are of type string.