easy_vic_build.tools.dpc_func.dpc_base
Base workflow class for basin/grid data processing.
The class in this module maintains a dependency-aware step graph, executes registered processing steps, and caches basin/grid-level outputs for reuse. Subclasses provide concrete loading or aggregation routines through decorated methods.
Classes
dataProcess_base | Base class for basin/grid data loading pipelines.
- class easy_vic_build.tools.dpc_func.dpc_base.dataProcess_base(load_path: str | None = None, reset_on_load_failure=False, **kwargs)[source]
Bases: ABC
Base class for basin/grid data loading pipelines.
The class manages three internal states:
- _processing_steps: registered step metadata and dependencies.
- _executed_steps: names of completed steps.
- _cache: in-memory data products keyed by save_name.
Subclasses typically declare loading methods decorated by easy_vic_build.tools.decoractors.processing_step().
Initialize the processing object and optionally restore saved state.
- Parameters:
load_path (str, optional) – Path to a serialized processor state (pickle file). If provided, the state will be loaded immediately.
reset_on_load_failure (bool, optional) – If True, reset to a clean state when state loading fails. If False, loading errors raise RuntimeError.
**kwargs (dict) – Extra keyword arguments forwarded to load_state().
- register_processing_step(step_name: str, save_names: str | List[str], data_level: str, func: Callable, dependencies: List[str] | None = None)[source]
Register one processing step in the execution graph.
- Parameters:
step_name (str) – Unique step identifier.
save_names (str or list of str) – Cache key(s) expected to be produced by func.
data_level (str) – Data scope label, usually "basin_level" or "grid_level".
func (Callable) – Step callable with zero arguments. It must return a dict keyed by save_names.
dependencies (list of str, optional) – Steps that must be executed before this step.
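The contract for func (zero arguments, returns a dict keyed by save_names) can be illustrated with a small stand-in that does not import easy_vic_build. The `StepRegistry` class, the `load_dem` step, and the `"dem"` key are hypothetical; only the argument names follow the signature above, and how the real method stores its metadata is an assumption.

```python
from typing import Callable, Dict, List, Optional, Union


class StepRegistry:
    """Hypothetical stand-in that mimics the documented registration arguments."""

    def __init__(self) -> None:
        self._processing_steps: Dict[str, dict] = {}

    def register_processing_step(
        self,
        step_name: str,
        save_names: Union[str, List[str]],
        data_level: str,
        func: Callable[[], dict],
        dependencies: Optional[List[str]] = None,
    ) -> None:
        # Normalise a single key to a list so later code can iterate uniformly.
        if isinstance(save_names, str):
            save_names = [save_names]
        self._processing_steps[step_name] = {
            "save_names": save_names,
            "data_level": data_level,
            "func": func,
            "dependencies": list(dependencies or []),
        }


def load_dem() -> dict:
    # Zero-argument callable returning a dict keyed by its save name.
    return {"dem": [101.0, 98.5, 120.3]}


registry = StepRegistry()
registry.register_processing_step("load_dem", "dem", "grid_level", load_dem)
step = registry._processing_steps["load_dem"]
result = step["func"]()
```

In this sketch the keys of `result` match the registered `save_names`, which is the property the parameter description asks of func.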
- loaddata_pipeline(save_path=None, loaddata_kwargs: Dict[str, Dict[str, Any]] | None = None)[source]
Execute all registered steps in dependency order.
- Parameters:
save_path (str, optional) – Path for persisting state after each successful step.
loaddata_kwargs (dict, optional) – Runtime input dictionary consumed by individual step methods.
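Dependency-ordered execution can be sketched with the standard library's graphlib module. This illustrates the idea only and is not taken from the library's source: the step names are hypothetical, and skipping steps that are already recorded as executed is an assumed behaviour.

```python
from graphlib import TopologicalSorter

# Hypothetical steps: each maps to (dependencies, zero-argument callable).
steps = {
    "load_basin": ([], lambda: {"basin": "basin geometry"}),
    "load_grid": (["load_basin"], lambda: {"grid": "grid geometry"}),
    "load_soil": (["load_grid"], lambda: {"soil": "soil table"}),
}

executed_steps = []
cache = {}

# TopologicalSorter takes {node: predecessors} and yields predecessors first.
graph = {name: deps for name, (deps, _) in steps.items()}
for name in TopologicalSorter(graph).static_order():
    if name in executed_steps:
        continue  # assumed: completed steps are skipped on a rerun
    cache.update(steps[name][1]())
    executed_steps.append(name)
```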
- merge_basin_data() → geopandas.GeoDataFrame[source]
Merge all cached basin-level outputs into one GeoDataFrame.
- Returns:
Basin GeoDataFrame containing original geometry plus all joinable basin-level products in cache.
- Return type:
geopandas.GeoDataFrame
- merge_grid_data() → geopandas.GeoDataFrame[source]
Merge all cached grid-level outputs into one GeoDataFrame.
- Returns:
Grid GeoDataFrame containing original grid fields and appended grid-level variables from cache.
- Return type:
geopandas.GeoDataFrame
- discard_step_name(step_name: str)[source]
Mark one step as not executed.
- Parameters:
step_name (str) – Step name to remove from
_executed_steps.
- get_data_from_cache(save_name: str, default: Any | None = None) → Any[source]
Get cached data and its level by key.
- Parameters:
save_name (str) – Cache key to retrieve.
default (Any, optional) – Value used when the key is not found.
- Returns:
(data, data_level) if key exists; otherwise (default, None).
- Return type:
tuple
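The return convention can be sketched with a plain dictionary. The cache layout below (each entry holding its data and its level) is an assumption made for illustration; the function is a stand-in, not the library's method.

```python
from typing import Any, Optional, Tuple

# Hypothetical cache layout: save_name -> {"data": ..., "data_level": ...}.
cache = {"dem": {"data": [101.0, 98.5], "data_level": "grid_level"}}


def get_data_from_cache(save_name: str, default: Optional[Any] = None) -> Tuple[Any, Optional[str]]:
    entry = cache.get(save_name)
    if entry is None:
        # Missing key: fall back to the caller's default and report no level.
        return default, None
    return entry["data"], entry["data_level"]


data, level = get_data_from_cache("dem")
missing, missing_level = get_data_from_cache("soil", default=[])
```

Because the result is always a two-element tuple, callers can unpack it unconditionally and test the level for None to detect a miss.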
- list_cache() → List[str][source]
List available keys in cache.
- Returns:
Current cache key names.
- Return type:
list of str
- save_data_to_cache(save_name: str, data: Any, data_level: str, step_name: str | None = None) → None[source]
Save data into cache and optionally reopen its step state.
- Parameters:
save_name (str) – Cache key for the data object.
data (Any) – Data object to cache.
data_level (str) – Data scope label, usually "basin_level" or "grid_level".
step_name (str, optional) – Step name to discard from _executed_steps after updating cache.
- clear_data_from_cache(save_names: List[str] | None = None, step_name: str | None = None)[source]
Clear cached entries by key list or clear all entries.
- Parameters:
save_names (list of str, optional) – Keys to remove. If None, all cache entries are removed.
step_name (str, optional) – Step name to discard from _executed_steps while clearing keys.
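The interaction between the cache-mutating methods (save_data_to_cache and clear_data_from_cache) and the executed-step set can be sketched as follows. The functions are stand-ins written for this illustration; the cache layout, the step and key names, and the silent handling of unknown keys are assumptions.

```python
from typing import Any, List, Optional

# Hypothetical state: one cached product and one completed step.
cache = {"dem": {"data": [101.0, 98.5], "data_level": "grid_level"}}
executed_steps = {"load_dem"}


def save_data_to_cache(save_name: str, data: Any, data_level: str,
                       step_name: Optional[str] = None) -> None:
    cache[save_name] = {"data": data, "data_level": data_level}
    if step_name is not None:
        # Discarding the step marks it as not executed.
        executed_steps.discard(step_name)


def clear_data_from_cache(save_names: Optional[List[str]] = None,
                          step_name: Optional[str] = None) -> None:
    if save_names is None:
        cache.clear()  # no key list: drop every entry
    else:
        for name in save_names:
            cache.pop(name, None)  # assumed: unknown keys are ignored
    if step_name is not None:
        executed_steps.discard(step_name)


save_data_to_cache("slope", [0.1, 0.3], "grid_level")
keys_after_save = sorted(cache)
clear_data_from_cache(["dem"], step_name="load_dem")
keys_after_clear = sorted(cache)
```

Passing step_name in either call removes that step from the executed set, so a later pipeline run would treat it as pending again.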
- save_state(save_path: str | None = None) → None[source]
Serialize processor state to a pickle file.
- Parameters:
save_path (str, optional) – Output state path. If omitted, self.load_path is used when set.
- load_state(load_path: str, reset_on_load_failure: bool = False, **kwargs) → dataProcess_base[source]
Load processor state from a pickle file.
- Parameters:
load_path (str) – State file path.
reset_on_load_failure (bool, optional) – Whether to reset to a clean state when loading fails.
**kwargs (dict) – Reserved for compatibility.
- Returns:
Current processor instance.
- Return type:
dataProcess_base
- Raises:
RuntimeError – Raised when loading fails and reset_on_load_failure is False.
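The save/load round trip and the two failure modes can be sketched with the standard pickle module. `MiniProcessor` is a hypothetical stand-in: the serialized layout is an assumption, and only the documented semantics (return the instance, raise RuntimeError or reset on failure) are mirrored.

```python
import os
import pickle
import tempfile


class MiniProcessor:
    """Hypothetical stand-in; only the documented failure semantics are mirrored."""

    def __init__(self):
        self._executed_steps = set()
        self._cache = {}

    def save_state(self, save_path):
        with open(save_path, "wb") as f:
            pickle.dump({"executed": self._executed_steps, "cache": self._cache}, f)

    def load_state(self, load_path, reset_on_load_failure=False):
        try:
            with open(load_path, "rb") as f:
                state = pickle.load(f)
            self._executed_steps = state["executed"]
            self._cache = state["cache"]
        except Exception as exc:
            if not reset_on_load_failure:
                raise RuntimeError(f"could not load state from {load_path}") from exc
            # Requested fallback: start again from a clean state.
            self._executed_steps, self._cache = set(), {}
        return self


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "state.pkl")
    missing_path = os.path.join(tmp, "missing.pkl")

    saved = MiniProcessor()
    saved._executed_steps.add("load_dem")
    saved._cache["dem"] = [101.0, 98.5]
    saved.save_state(path)

    restored = MiniProcessor().load_state(path)
    recovered = MiniProcessor().load_state(missing_path, reset_on_load_failure=True)
    try:
        MiniProcessor().load_state(missing_path)
        raised = False
    except RuntimeError:
        raised = True
```

As with any pickle-based state, a file should only be loaded when it comes from a trusted source, since unpickling can execute arbitrary code.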
- aggregate_grid_to_basins()[source]
Aggregate grid-level variables to basin-level summaries.
Notes
This method is intentionally left for subclasses.
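One shape a subclass implementation might take is a per-basin mean of a grid-level variable. The sketch below uses plain Python rather than GeoDataFrames; the record layout, the basin identifiers, and the choice of the mean as the summary statistic are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical grid records: (basin_id, elevation in metres).
grid_cells = [("A", 100.0), ("A", 120.0), ("B", 300.0)]


def aggregate_grid_to_basins(cells):
    # Group grid values by the basin they fall in, then summarise each group.
    grouped = defaultdict(list)
    for basin_id, value in cells:
        grouped[basin_id].append(value)
    return {basin_id: mean(values) for basin_id, values in grouped.items()}


basin_means = aggregate_grid_to_basins(grid_cells)
```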
- plot(fig=None, ax=None, grid_shp_kwargs={}, grid_shp_point_kwargs={}, basin_shp_kwargs={})[source]
Plot cached basin and grid geometry.
- Parameters:
fig (matplotlib.figure.Figure, optional) – Existing figure object. A new one is created when omitted.
ax (matplotlib.axes.Axes, optional) – Existing axes object. A new one is created when omitted.
grid_shp_kwargs (dict, optional) – Keyword arguments passed to grid_shp.boundary.plot.
grid_shp_point_kwargs (dict, optional) – Keyword arguments passed to grid_shp["point_geometry"].plot.
basin_shp_kwargs (dict, optional) – Keyword arguments passed to basin_shp.plot.
- Returns:
(fig, ax) with rendered basin/grid layout.
- Return type:
tuple