merge_into

merge_into(datasets, target_path, allow_expansion=True, overlap='last')

Destructively merge multiple cfdb datasets into an existing target cfdb file.

Unlike combine, merge_into modifies the target dataset in place. Appending or prepending time steps therefore costs O(B) rather than the O(A+B) of combine, where A is the size of the existing target and B is the size of the incoming data. In exchange, merge_into strictly requires that incoming coordinates either match the target exactly, append to it, or prepend to it. Inserting values into the middle of a dataset is not supported and raises an error.
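The coordinate rule above can be sketched in plain Python. This is a toy model, not cfdb's implementation: the function name classify_incoming is hypothetical, coordinates are assumed to be sorted one-dimensional lists, and it assumes that incoming values equal to existing ones count as "matching".

```python
def classify_incoming(existing, incoming):
    """Split incoming coordinate values into (prepended, appended).

    Hypothetical helper for illustration only. Values already present in
    `existing` are treated as matching. A new value that falls between the
    existing minimum and maximum would be a mid-dataset insertion, so it
    raises ValueError.
    """
    if not existing:
        # Nothing to compare against: every incoming value is an append.
        return [], list(incoming)

    existing_set = set(existing)
    lo, hi = existing[0], existing[-1]  # relies on `existing` being sorted
    prepended, appended = [], []
    for value in incoming:
        if value in existing_set:
            continue  # exact match, no expansion needed
        if value > hi:
            appended.append(value)
        elif value < lo:
            prepended.append(value)
        else:
            raise ValueError(
                f"value {value!r} would be inserted into the middle of the coordinate"
            )
    return prepended, appended


# New time steps 4 and 5 extend the end of the coordinate.
prepended, appended = classify_incoming([1, 2, 3], [3, 4, 5])
```

In this model only the prepended and appended values require new writes, which is the intuition behind the O(B) cost: the existing A values are never rewritten.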

Parameters:

Name Type Description Default
datasets list

List of file paths (str/Path) or open Dataset objects to merge.

required
target_path str or Path

Path to the existing target cfdb file. This file will be modified in-place.

required
allow_expansion bool or list

If False, no coordinate can expand (for example, spatial bounds cannot be extended). If True, any coordinate can expand when new values are prepended or appended. If a list of strings (e.g. ['time']), only the named coordinates can expand.

True
overlap str

How to handle data variables when there are overlapping coordinate values:

- 'last': last dataset wins (default, overwrites existing target data)
- 'first': first dataset wins (keeps existing target data, ignores new data)
- 'error': raise ValueError on overlap

Note: overlap='first' and overlap='error' operate at chunk granularity. When coordinate expansion (append/prepend) causes a source chunk to span both existing and newly expanded regions, the entire chunk is skipped ('first') or raises ('error'), even though the expanded portion has no existing data. overlap='last' is unaffected because it writes unconditionally.

'last'
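The allow_expansion and overlap semantics described above can be illustrated with a small self-contained sketch. It is a simplified model under stated assumptions, not cfdb's code: expansion_allowed and write_chunk are hypothetical names, and a "chunk" is modelled as a dict mapping coordinate index to value.

```python
def expansion_allowed(coord_name, allow_expansion):
    """Resolve allow_expansion (bool or list of names) for one coordinate."""
    if isinstance(allow_expansion, bool):
        return allow_expansion
    return coord_name in allow_expansion


def write_chunk(target, chunk, overlap="last"):
    """Write one source chunk into `target`, deciding per chunk, not per cell.

    Hypothetical toy model: `target` and `chunk` are dicts of index -> value.
    Returns True if the chunk was written and False if it was skipped.
    """
    touches_existing = any(index in target for index in chunk)
    if touches_existing and overlap == "first":
        # The whole chunk is skipped, including cells with no existing data.
        return False
    if touches_existing and overlap == "error":
        raise ValueError("source chunk overlaps existing target data")
    # overlap == 'last' (or no overlap at all): write unconditionally.
    target.update(chunk)
    return True


target = {0: "a", 1: "b"}       # existing data at indices 0 and 1
chunk = {1: "B", 2: "C"}        # spans existing index 1 and new index 2
written = write_chunk(target, chunk, overlap="first")
```

Here the chunk is skipped under 'first', so index 2 stays unwritten even though nothing existed there, which is the chunk-granularity effect the note describes. Under these assumptions, aligning source chunks with the boundary between existing and expanded regions would avoid it.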

Returns:

Type Description
Dataset

The modified target dataset (open for reading and writing).