MIBiG
nplinker.genomics.mibig
MibigLoader
Bases: BGCLoaderBase
Parse MIBiG metadata files and return BGC objects.
A MIBiG metadata file (JSON) contains the annotations and metadata for each BGC. See https://mibig.secondarymetabolites.org/download.
The MIBiG accession is used as the BGC id and strain name. The loaded BGC objects have a Strain object as their strain attribute (i.e. BGC.strain).
Parameters:
Examples:
>>> loader = MibigLoader("path/to/mibig/data/dir")
>>> loader.data_dir
'path/to/mibig/data/dir'
>>> loader.get_bgcs()
[BGC('BGC000001', 'NRP'), BGC('BGC000002', 'Polyketide')]
Source code in src/nplinker/genomics/mibig/mibig_loader.py
get_files
parse_data_dir
staticmethod
Parse metadata directory and return paths to all metadata json files.
Parameters:
Returns:
- dict[str, str] – The key is the metadata file name (BGC accession), and the value is the path to the metadata json file.
Source code in src/nplinker/genomics/mibig/mibig_loader.py
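The return shape described for parse_data_dir can be illustrated with a minimal sketch. This is not the NPLinker implementation: the function name `sketch_parse_data_dir` is hypothetical, and it assumes the json files sit directly in the given directory and are named after their BGC accession.

```python
from pathlib import Path


def sketch_parse_data_dir(data_dir: str) -> dict[str, str]:
    """Map each json file name (without extension) to its path.

    Hypothetical stand-in for illustration only. It looks at the top
    level of data_dir and ignores files that do not end in .json.
    """
    return {
        path.stem: str(path)
        for path in sorted(Path(data_dir).glob("*.json"))
    }
```

Under these assumptions, a directory holding `BGC0000001.json` would yield a dictionary with the single key `BGC0000001`.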
get_metadata
get_metadata() -> dict[str, MibigMetadata]
Get MibigMetadata objects.
Returns:
- dict[str, MibigMetadata] – The key is the BGC accession (file name) and the value is the MibigMetadata object.
get_bgcs
Get BGC objects.
The BGC objects use the MIBiG accession as their id and have a Strain object as their strain attribute (i.e. BGC.strain), where the name of the Strain object is also the MIBiG accession.
Returns:
Source code in src/nplinker/genomics/mibig/mibig_loader.py
MibigMetadata
Class to model the BGC metadata/annotations defined in MIBiG.
MIBiG is a specification of BGC metadata that uses JSON schema to represent that metadata. For more details, see https://mibig.secondarymetabolites.org/download.
This class supports MIBiG versions 1.0 to 4.0.
Parameters:
Examples:
Source code in src/nplinker/genomics/mibig/mibig_metadata.py
biosyn_class
property
Get the value of metadata item 'biosyn_class'.
The 'biosyn_class' is the biosynthetic class(es) defined by MIBiG.
Before version 4.0, MIBiG defined 6 major biosynthetic classes: NRP, Polyketide, RiPP, Terpene, Saccharide and Alkaloid.
Starting from version 4.0, MIBiG defines 5 major biosynthetic classes: PKS, NRPS, Ribosomal, Terpene and Saccharide.
The mapping between the old and new classes is as follows:
NRP -> NRPS
Polyketide -> PKS
RiPP -> Ribosomal
Terpene -> Terpene
Saccharide -> Saccharide
Alkaloid -> Other
Note that natural products that do not fit into any of the above biosynthetic classes fall under the category Other.
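The old-to-new class mapping can be written down as a small lookup table. The sketch below copies the six pairs listed for biosyn_class; the names `OLD_TO_NEW_BIOSYN_CLASS` and `to_new_biosyn_class` are hypothetical and not part of NPLinker, and the fallback to Other for unrecognised names is an assumption drawn from the note about the Other category.

```python
# Pre-4.0 class name -> 4.0+ class name, as listed in the biosyn_class mapping.
OLD_TO_NEW_BIOSYN_CLASS = {
    "NRP": "NRPS",
    "Polyketide": "PKS",
    "RiPP": "Ribosomal",
    "Terpene": "Terpene",
    "Saccharide": "Saccharide",
    "Alkaloid": "Other",
}


def to_new_biosyn_class(old_class: str) -> str:
    """Translate a pre-4.0 class name to its 4.0+ equivalent.

    Assumption: names missing from the table are treated as "Other".
    """
    return OLD_TO_NEW_BIOSYN_CLASS.get(old_class, "Other")
```

A table like this may be useful when comparing BGCs loaded from metadata files of different MIBiG versions, since the class names would otherwise not line up.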
download_and_extract_mibig_metadata
download_and_extract_mibig_metadata(
download_root: str | PathLike,
extract_path: str | PathLike,
version: str = "3.1",
)
Download and extract MIBiG metadata json files.
The MIBiG metadata json files are available at https://mibig.secondarymetabolites.org/download.
Note that it does not matter whether the metadata json files are in nested folders in the archive: all json files will be extracted to the same location, i.e. extract_path. Any nested folders will be removed, so extract_path will contain only json files.
Parameters:
- download_root (str | PathLike) – Path to the directory in which to place the downloaded archive.
- extract_path (str | PathLike) – Path to an empty directory where the json files will be extracted. The directory must be empty if it exists. If it doesn't exist, the directory will be created.
- version (str, default: '3.1') – MIBiG version.
Examples:
Source code in src/nplinker/genomics/mibig/mibig_downloader.py
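The flattening behaviour described for extract_path (nested json files moved to the top level, nested folders removed) might look roughly like the sketch below. It is not the code in mibig_downloader.py: `flatten_json_files` is a hypothetical helper, it covers only the step after the archive has been unpacked, and it assumes json file names are unique across folders.

```python
import shutil
from pathlib import Path


def flatten_json_files(extract_path: str) -> None:
    """Move nested .json files to the top of extract_path, then drop subfolders.

    Hypothetical sketch. Assumes no two json files share a name.
    """
    root = Path(extract_path)
    # Collect the files first so that moving them does not disturb the walk.
    for json_file in list(root.rglob("*.json")):
        if json_file.parent != root:
            shutil.move(str(json_file), str(root / json_file.name))
    # Remove the nested folders together with anything left inside them.
    for child in list(root.iterdir()):
        if child.is_dir():
            shutil.rmtree(child)
```

Moving files before deleting folders keeps the sketch simple: once every json file is at the top level, whatever remains in the subfolders can be removed in one pass.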
parse_bgc_metadata_json
Parse MIBiG metadata file and return BGC object.
Note that the MIBiG accession is used as both the BGC id and the strain name. The BGC object has a Strain object as its strain attribute.
Parameters:
Returns:
- BGC – BGC object
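The relationship between accession, BGC id and strain name can be shown with stand-in classes. Everything in this sketch is hypothetical: `SketchBGC`, `SketchStrain` and `sketch_make_bgc` are simplified placeholders rather than NPLinker's BGC and Strain classes, and the field names are assumptions chosen to mirror the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchStrain:
    """Hypothetical stand-in for a strain, identified by name only."""

    name: str


@dataclass
class SketchBGC:
    """Hypothetical stand-in for a BGC with an optional strain."""

    id: str
    biosyn_classes: tuple[str, ...]
    strain: Optional[SketchStrain] = None


def sketch_make_bgc(accession: str, biosyn_classes: tuple[str, ...]) -> SketchBGC:
    """Build a BGC whose id and strain name are both the accession."""
    bgc = SketchBGC(accession, biosyn_classes)
    bgc.strain = SketchStrain(accession)
    return bgc
```

With this shape, `bgc.id` and `bgc.strain.name` always agree, which is the property the note above describes for BGCs loaded from MIBiG.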