7.2. Dataloader Factory

The dataloader factory produces dataloader objects from the parameters provided in the config file. The parameters required in the config are defined in the get_dataloader() method of the factory. At a minimum, a name must be provided to select one of the available dataloaders.

7.2.1. Adding New Dataloader to Factory

Two actions must be performed to add a new dataloader to the Factory object. Optionally, a third may be performed to add a default value for a parameter that the dataloader requires. The two mandatory actions are:

  1. Import the dataloader

  2. Add an entry to the dataloader_requirements dictionary

7.2.2. Example

In this example, a new scalar dataloader, myScalarDataloader, has been created and is located at meshiphi/Dataloaders/Scalar/myScalarDataloader.py.

The only parameter required by this dataloader is a file to read data from. ‘files’ is listed as the mandatory parameter because ‘file’ and ‘folder’ are both translated into a list of files, which is stored in params under the key ‘files’:

# Add new import statement for Factory to read
from meshiphi.Dataloaders.Scalar.myScalarDataloader import myScalarDataloader

...

class DataLoaderFactory:
   ...
   @staticmethod
   def get_dataloader(name, bounds, params, min_dp=5):
      ...
      dataloader_requirements = {
         ...
         # Add new dataloaders: name -> (class, list of mandatory params)
         'myscalar':    (myScalarDataloader, ['files']),
         ...
      }
      ...
   ...
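The registration step above follows a common registry pattern: a dictionary maps each name to a class and its mandatory parameters. The following is a minimal, self-contained sketch of that pattern, not the MeshiPhi implementation. The stand-in class MyScalarDataloader, its (bounds, params) constructor, the error messages, and the way min_dp is stored in params are all assumptions made for illustration.

```python
class MyScalarDataloader:
    """Hypothetical stand-in for a real dataloader class."""

    def __init__(self, bounds, params):
        # Assumption: dataloaders are constructed from (bounds, params)
        self.bounds = bounds
        self.params = params


def get_dataloader(name, bounds, params, min_dp=5):
    """Look up a dataloader by name and check its mandatory parameters."""
    # Registry: name -> (class, list of mandatory parameter keys)
    dataloader_requirements = {
        "myscalar": (MyScalarDataloader, ["files"]),
    }

    if name not in dataloader_requirements:
        raise ValueError(f"'{name}' is not a known dataloader")

    loader_cls, required = dataloader_requirements[name]

    # Refuse to build the dataloader if a mandatory parameter is absent
    missing = [key for key in required if key not in params]
    if missing:
        raise ValueError(f"Dataloader '{name}' is missing parameters: {missing}")

    # Assumption: min_dp is passed on alongside the other parameters
    params = {**params, "min_dp": min_dp}
    return loader_cls(bounds, params)


# Usage: bounds is left as None here purely for illustration
loader = get_dataloader("myscalar", None, {"files": ["PATH_TO_DATA_FILE"]})
print(type(loader).__name__, loader.params)
```

Keeping the mandatory keys next to the class in one dictionary means a new dataloader only needs an import and a single new entry, which matches the two actions listed in 7.2.1.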

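To use this dataloader, add an entry to the config.json file used to generate the mesh. The data source can be specified as a single file, a folder, or a list of individual files. The three alternatives are shown together below; the # annotations are for explanation only and must be removed, because JSON does not support comments: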

{
      "loader": "myscalar",
      "params": {
         "file": "PATH_TO_DATA_FILE",  # For a single file
         "folder": "PATH_TO_FOLDER",   # For a folder, must have trailing '/'
         "files":[                     # For a list of individual files
            "PATH_TO_FILE_1",
            "PATH_TO_FILE_2",
            ...
         ]
      }
}
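The annotated entry above combines three alternatives in one snippet. As a sketch, each alternative could be written as its own comment-free JSON entry; the snippet below parses all three with Python's standard json module to confirm that they are well formed. The paths are the placeholders from the example above, and the trailing '/' on the folder path follows the note in that example.

```python
import json

# One entry per way of specifying the data source (placeholder paths)
single_file = """
{
    "loader": "myscalar",
    "params": {"file": "PATH_TO_DATA_FILE"}
}
"""

folder = """
{
    "loader": "myscalar",
    "params": {"folder": "PATH_TO_FOLDER/"}
}
"""

file_list = """
{
    "loader": "myscalar",
    "params": {"files": ["PATH_TO_FILE_1", "PATH_TO_FILE_2"]}
}
"""

# json.loads raises json.JSONDecodeError if an entry is not valid JSON
entries = [json.loads(text) for text in (single_file, folder, file_list)]

for entry in entries:
    print(entry["loader"], sorted(entry["params"]))
```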

7.2.3. Dataloader Factory Object

class meshiphi.dataloaders.factory.DataLoaderFactory

Produces initialised DataLoader objects that can be used by the mesh to quickly retrieve values within a boundary.

static get_dataloader(name, bounds, params, min_dp=5)

Creates appropriate dataloader object based on name

Parameters:
  • name (str) – Name of data source/type. Must be one of the following: ‘scalar_csv’, ‘scalar_grf’, ‘binary_grf’, ‘amsr’, ‘bsose_sic’, ‘bsose_depth’, ‘baltic_sic’, ‘gebco’, ‘icenet’, ‘modis’, ‘thickness’, ‘density’, ‘circle’, ‘square’, ‘gradient’, ‘checkerboard’, ‘vector_csv’, ‘vector_grf’, ‘baltic_currents’, ‘era5_wind’, ‘northsea_currents’, ‘oras5_currents’, ‘sose’, ‘duacs_currents’, ‘era5_wave_height’, ‘era5_wave_direction’

  • bounds (Boundary) – Boundary object with the initial space and time limits of the mesh

  • params (dict) – Dictionary of parameters required by each dataloader

  • min_dp (int) – Minimum number of datapoints required to evaluate the homogeneity condition

Returns:

DataLoader object of correct type, with required params set

Return type:

(Scalar/Vector/LUT DataLoader)

static translate_file_input(params)

Allows flexible file specification in params. Translates ‘file’ or ‘folder’ into ‘files’

Parameters:

params (dict) – Dictionary of parameters written in config
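The translation described above can be illustrated with a short sketch. This is a hypothetical reimplementation written only from the description in this section, not the library's code: the precedence given to an existing ‘files’ key, the sorting of folder contents, the choice to return a copy, and the error raised when no key is present are all assumptions.

```python
import glob
import os


def translate_file_input(params):
    """Return a copy of params with 'file' or 'folder' replaced by 'files'."""
    params = dict(params)  # work on a copy so the caller's dict is untouched

    if "files" in params:
        # Assumption: an explicit list of files is used as-is
        return params

    if "file" in params:
        # A single file becomes a one-element list
        params["files"] = [params.pop("file")]
    elif "folder" in params:
        # Every regular file in the folder is collected, in sorted order
        folder = params.pop("folder")
        pattern = os.path.join(folder, "*")
        params["files"] = sorted(
            path for path in glob.glob(pattern) if os.path.isfile(path)
        )
    else:
        # Assumption: missing file information is treated as an error
        raise ValueError("params must contain 'file', 'folder' or 'files'")

    return params


# Usage with a placeholder path
print(translate_file_input({"file": "PATH_TO_DATA_FILE"}))
```

Normalising every form of input into a single ‘files’ key is what lets the factory list only ‘files’ as the mandatory parameter for a dataloader.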