Dataset Loader
nplinker.loader
DatasetLoader
Load datasets from the working directory with the given configuration.
Loaded data are stored in the data containers (attributes), e.g. self.bgcs, self.gcfs, etc.
Attributes:
- config – A Dynaconf object that contains the configuration settings.
- bgcs (list[BGC]) – A list of BGC objects.
- gcfs (list[GCF]) – A list of GCF objects.
- spectra (list[Spectrum]) – A list of Spectrum objects.
- mfs (list[MolecularFamily]) – A list of MolecularFamily objects.
- mibig_bgcs (list[BGC]) – A list of MIBiG BGC objects.
- mibig_strains_in_use (StrainCollection) – A StrainCollection object that contains the strains in use from MIBiG.
- product_types (list) – A list of product types.
- strains (StrainCollection) – A StrainCollection object that contains all strains.
- class_matches – A ClassMatches object that contains class match info.
- chem_classes – A ChemClassPredictions object that contains chemical class predictions.
Parameters:
- config (Dynaconf) – A Dynaconf object that contains the configuration settings.
Examples:
>>> from nplinker.config import load_config
>>> from nplinker.loader import DatasetLoader
>>> config = load_config("nplinker.toml")
>>> loader = DatasetLoader(config)
>>> loader.load()
See Also
DatasetArranger: Download, generate and/or validate datasets to ensure they are ready for loading.
Source code in src/nplinker/loader.py
EXTRA_CANOPUS_PARAMS_DEFAULT (class-attribute, instance-attribute)
mibig_strains_in_use (instance-attribute)
mibig_strains_in_use: StrainCollection = StrainCollection()
load
load() -> bool
Load all data from data files in the working directory.
See Dataset Loading Pipeline for the detailed steps.
Returns:
- bool – True if all data are loaded successfully.
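Once load() has returned True, the data containers listed under Attributes can be inspected directly. The sketch below is one possible way to get a quick overview of what was loaded. It assumes only what the attribute list states, namely that bgcs, gcfs, spectra, mfs, mibig_bgcs and product_types are plain lists. The helper summarize_loaded_data and the stand-in object are hypothetical and not part of nplinker; the stand-in exists only so the snippet runs without a dataset.

```python
from types import SimpleNamespace

# Names of the list-typed data containers documented for DatasetLoader.
LIST_CONTAINERS = ("bgcs", "gcfs", "spectra", "mfs", "mibig_bgcs", "product_types")


def summarize_loaded_data(loader):
    """Return {container name: number of items} for the list-typed containers.

    Hypothetical helper, not part of nplinker. It only reads attributes, so it
    works with any object that exposes the documented container names.
    """
    return {name: len(getattr(loader, name)) for name in LIST_CONTAINERS}


# Stand-in object for illustration only. With nplinker installed, a real
# DatasetLoader would be passed here after loader.load() has returned True.
fake_loader = SimpleNamespace(
    bgcs=["bgc1", "bgc2"],
    gcfs=["gcf1"],
    spectra=[],
    mfs=[],
    mibig_bgcs=["BGC0000001"],
    product_types=["NRPS", "PKS"],
)

summary = summarize_loaded_data(fake_loader)
print(summary)
```

With a real loader, the call would typically be guarded by the return value, e.g. `if loader.load(): print(summarize_loaded_data(loader))`. The StrainCollection containers (strains, mibig_strains_in_use) are left out of the summary because the text above does not state whether they support len().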