7.4. Vector Dataloaders
7.4.1. Abstract Vector Base Class
The abstract base class for vector dataloaders holds most of the functionality needed to manipulate the data so that it works with the mesh. When creating a new dataloader, the user must define how to open the data files and which methods are required to manipulate the data into a standard format. More details are provided on the abstractVector doc page.
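The division of labour described above can be sketched in plain Python. This is a minimal illustration, not MeshiPhi's actual implementation: the class names `SketchVectorLoader` and `InMemoryLoader` are hypothetical, the data lives in memory rather than in files, and `bounds` is simplified to a `(lat_min, lat_max)` tuple.

```python
from abc import ABC, abstractmethod


class SketchVectorLoader(ABC):
    """Hypothetical base class: logic shared by all loaders lives here."""

    def __init__(self, bounds, file):
        self.file = file
        # The subclass decides how the data is opened and standardised
        self.data = self.import_data(bounds)

    @abstractmethod
    def import_data(self, bounds):
        """Open the data source and return records in a standard format."""


class InMemoryLoader(SketchVectorLoader):
    """Hypothetical concrete loader that only has to supply import_data."""

    def import_data(self, bounds):
        rows = [
            {"lat": -61.0, "long": 5.0, "uC": 0.1, "vC": 0.2},
            {"lat": -40.0, "long": 5.0, "uC": 0.3, "vC": 0.4},
        ]
        # Simplified bounds: keep rows inside the latitude range
        lat_min, lat_max = bounds
        return [r for r in rows if lat_min <= r["lat"] <= lat_max]


loader = InMemoryLoader(bounds=(-65.0, -55.0), file=None)
print(loader.data)
```

Because `import_data` is marked abstract, instantiating the base class directly raises a `TypeError`, so a new loader cannot be created without defining how its data is opened.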
7.4.2. Vector Dataloader Examples
Creating a vector dataloader is almost identical to creating a scalar dataloader. The key differences are that the VectorDataLoader abstract base class must be used, and that data_name is a comma-separated string of the vector component names. For example, a dataloader storing a vector with column names uC and vC will have the attribute self.data_name = 'uC,vC'.
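To illustrate the naming convention, the following minimal sketch uses plain Python with no MeshiPhi imports. It splits a comma-separated data_name into its component names and computes the vector magnitude for a single record. The helpers `split_data_name` and `magnitude` are hypothetical and are not part of MeshiPhi.

```python
import math


def split_data_name(data_name):
    """Split a comma-separated data_name into a list of component names."""
    return [name.strip() for name in data_name.split(",")]


def magnitude(record, data_name):
    """Euclidean norm of the components named in data_name for one record."""
    return math.hypot(*(record[name] for name in split_data_name(data_name)))


data_name = "uC,vC"
record = {"lat": -60.0, "long": 10.0, "uC": 3.0, "vC": 4.0}

components = split_data_name(data_name)
speed = magnitude(record, data_name)
print(components, speed)
```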
Data must be imported and saved as either an xarray.Dataset or a
pandas.DataFrame object. Below is a simple example of how to load a
NetCDF file:
from meshiphi.dataloaders.vector.abstract_vector import VectorDataLoader
import xarray as xr
import logging


class MyDataLoader(VectorDataLoader):
    def import_data(self, bounds):
        logging.debug("Importing my data...")
        # Open Dataset
        logging.debug(f"- Opening file {self.file}")
        data = xr.open_dataset(self.file)
        # Rename coordinate columns to 'lat', 'long', 'time' if they aren't already
        data = data.rename({'lon': 'long'})
        # Limit to initial boundary
        data = self.trim_data(bounds, data=data)
        return data
As with scalar dataloaders, some parameters are constant for a given data source but differ between data sources. Default values for these may be defined either in the dataloader factory or within the dataloader itself. Below is an example of setting default parameters for reprojection of a dataset:
class MyDataLoader(VectorDataLoader):
    def add_default_params(self, params):
        # Add all the regular default params that vector dataloaders have
        params = super().add_default_params(params)  # This line MUST be included
        # Define projection of dataset being imported
        params['in_proj'] = 'EPSG:3412'
        # Define projection required by output
        params['out_proj'] = 'EPSG:4326'  # default is EPSG:4326, so strictly
                                          # speaking this line is not necessary
        # Coordinates in dataset that will be reprojected into long/lat
        params['x_col'] = 'x'  # Becomes 'long'
        params['y_col'] = 'y'  # Becomes 'lat'
        # Return the updated params so the caller receives the defaults
        return params

    def import_data(self, bounds):
        # Open Dataset
        data = xr.open_mfdataset(self.file)
        # Can't easily determine bounds of data in wrong projection, so skipping for now
        return data
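The layering of defaults can be sketched on its own, without MeshiPhi or xarray. The classes `BaseLoader` and `ReprojectedLoader` below are hypothetical stand-ins, and the sketch assumes the base class calls add_default_params during construction. Unlike the example above, which assigns values directly, this sketch uses `dict.setdefault` so that a value supplied by the user is never overwritten; that is a design choice for the illustration, not a statement about how MeshiPhi behaves.

```python
class BaseLoader:
    """Hypothetical stand-in for an abstract dataloader base class."""

    def __init__(self, params):
        # Copy the dict so the caller's params are not mutated
        self.params = self.add_default_params(dict(params))

    def add_default_params(self, params):
        # Defaults shared by every loader; setdefault keeps user values
        params.setdefault("out_proj", "EPSG:4326")
        return params


class ReprojectedLoader(BaseLoader):
    """Hypothetical loader whose source data uses a polar projection."""

    def add_default_params(self, params):
        # Start from the base defaults, then add source-specific ones
        params = super().add_default_params(params)
        params.setdefault("in_proj", "EPSG:3412")
        params.setdefault("x_col", "x")
        params.setdefault("y_col", "y")
        return params


defaults_only = ReprojectedLoader({}).params
overridden = ReprojectedLoader({"in_proj": "EPSG:3031"}).params
print(defaults_only)
print(overridden)
```

Forgetting either the `super()` call or the final `return params` would silently drop defaults, which is why both appear in every override.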
7.4.3. Implemented Vector Dataloaders
- 7.4.3.1. Baltic Currents Dataloader
- 7.4.3.2. DUACS Currents Dataloader
- 7.4.3.3. ERA5 Wave Direction Dataloader
- 7.4.3.4. ERA5 Wind Dataloader
- 7.4.3.5. North Sea Currents Dataloader
- 7.4.3.6. ORAS5 Currents Dataloader
- 7.4.3.7. SOSE Currents Dataloader
- 7.4.3.8. Vector CSV Dataloader
- 7.4.3.9. Vector GRF Dataloader