MIBiG
nplinker.genomics.mibig
MibigLoader
Bases: BGCLoaderBase
Parse MIBiG metadata files and return BGC objects.
Each MIBiG metadata file (json) contains the annotations and metadata for one BGC. See https://mibig.secondarymetabolites.org/download.
The MIBiG accession is used as the BGC id and the strain name. Each loaded BGC object has a Strain object as its strain attribute (i.e. BGC.strain).
Parameters:
Examples:
>>> loader = MibigLoader("path/to/mibig/data/dir")
>>> loader.data_dir
'path/to/mibig/data/dir'
>>> loader.get_bgcs()
[BGC('BGC000001', 'NRP'), BGC('BGC000002', 'Polyketide')]
Source code in src/nplinker/genomics/mibig/mibig_loader.py
get_files
parse_data_dir
staticmethod
Parse metadata directory and return paths to all metadata json files.
Parameters:
Returns:
- dict[str, str] – The key is the metadata file name (BGC accession), and the value is the path to the metadata json file.
Source code in src/nplinker/genomics/mibig/mibig_loader.py
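The mapping described above (file name as BGC accession, path as value) can be sketched as follows. This is a minimal illustration, not nplinker's implementation: `index_metadata_files` is a hypothetical helper, and it assumes that the json files sit directly in the directory and that each file stem is the accession.

```python
import tempfile
from pathlib import Path


def index_metadata_files(data_dir: str) -> dict[str, str]:
    """Map each json file stem (assumed to be the BGC accession) to its path.

    Hypothetical helper for illustration; only top-level *.json files are considered.
    """
    return {path.stem: str(path) for path in sorted(Path(data_dir).glob("*.json"))}


# Demonstrate on a throwaway directory holding two placeholder metadata files.
with tempfile.TemporaryDirectory() as tmp:
    for accession in ("BGC0000001", "BGC0000002"):
        (Path(tmp) / f"{accession}.json").write_text("{}")
    (Path(tmp) / "README.txt").write_text("not metadata")  # ignored: not a json file
    files = index_metadata_files(tmp)
    print(sorted(files))  # ['BGC0000001', 'BGC0000002']
```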
get_metadata
get_metadata() -> dict[str, MibigMetadata]
Get MibigMetadata objects.
Returns:
- dict[str, MibigMetadata] – The key is the BGC accession (file name) and the value is the MibigMetadata object.
get_bgcs
Get BGC objects.
The BGC objects use the MIBiG accession as their id and have a Strain object as their strain attribute (i.e. BGC.strain), where the name of the Strain object is also the MIBiG accession.
Returns:
Source code in src/nplinker/genomics/mibig/mibig_loader.py
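The relationship between accession, BGC id, and strain name can be illustrated with stand-in classes. The `BGC` and `Strain` dataclasses below are simplified, hypothetical substitutes for the nplinker classes of the same name, and `bgc_from_accession` is likewise an assumed helper used only for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Strain:  # hypothetical stand-in, not nplinker's Strain class
    name: str


@dataclass
class BGC:  # hypothetical stand-in, not nplinker's BGC class
    id: str
    strain: Strain


def bgc_from_accession(accession: str) -> BGC:
    """Use the accession as both the BGC id and the strain name."""
    return BGC(id=accession, strain=Strain(name=accession))


bgc = bgc_from_accession("BGC0000001")
print(bgc.id, bgc.strain.name)  # BGC0000001 BGC0000001
```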
MibigMetadata
Class to model the BGC metadata/annotations defined in MIBiG.
MIBiG is a specification of BGC metadata that uses JSON schema to represent the metadata. For more details, see https://mibig.secondarymetabolites.org/download.
Parameters:
Examples:
Source code in src/nplinker/genomics/mibig/mibig_metadata.py
biosyn_class
property
Get the value of metadata item 'biosyn_class'.
The 'biosyn_class' is biosynthetic class(es), namely the type of natural product or secondary metabolite.
MIBiG defines 6 major biosynthetic classes for natural products: NRP, Polyketide, RiPP, Terpene, Saccharide and Alkaloid. Note that natural products created by other biosynthetic mechanisms fall under the category Other. For more details see the paper.
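As a rough illustration of the class vocabulary described above, the sketch below keeps the six major class names and maps anything else to Other. `MAJOR_CLASSES` and `normalize_biosyn_class` are hypothetical names that are not part of nplinker, and returning a tuple is an assumption made for this example.

```python
# The six major biosynthetic classes named in the description above.
MAJOR_CLASSES = ("NRP", "Polyketide", "RiPP", "Terpene", "Saccharide", "Alkaloid")


def normalize_biosyn_class(labels):
    """Keep major class names unchanged and map every other label to "Other".

    Hypothetical helper for illustration only.
    """
    return tuple(label if label in MAJOR_CLASSES else "Other" for label in labels)


hybrid = normalize_biosyn_class(["NRP", "Polyketide"])  # a BGC may have several classes
unknown = normalize_biosyn_class(["SomeOtherMechanism"])  # made-up label
print(hybrid, unknown)
```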
download_and_extract_mibig_metadata
download_and_extract_mibig_metadata(
    download_root: str | PathLike,
    extract_path: str | PathLike,
    version: str = "3.1",
)
Download and extract MIBiG metadata json files.
Note that it does not matter whether the metadata json files are in nested folders or not in the archive; all json files will be extracted to the same location, i.e. extract_path. Nested folders will be removed if they exist, so extract_path will contain only json files.
Parameters:
- download_root (str | PathLike) – Path to the directory in which to place the downloaded archive.
- extract_path (str | PathLike) – Path to an empty directory where the json files will be extracted. The directory must be empty if it exists. If it doesn't exist, the directory will be created.
- version (str, default: '3.1') – Version of the MIBiG metadata to download. Defaults to "3.1".
Examples:
Source code in src/nplinker/genomics/mibig/mibig_downloader.py
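The flattening behaviour described for download_and_extract_mibig_metadata can be sketched with the standard library alone. This is not the function's actual implementation: `extract_json_flat` is a hypothetical helper, the archive is assumed to be a gzipped tar file, and the download step is left out so the example runs offline on an archive it builds itself.

```python
import io
import tarfile
import tempfile
from pathlib import Path


def extract_json_flat(archive_path: str, extract_path: str) -> list[str]:
    """Extract every .json member of a tar archive directly into extract_path."""
    target = Path(extract_path)
    target.mkdir(parents=True, exist_ok=True)  # create the directory if missing
    if any(target.iterdir()):
        raise ValueError(f"{target} must be empty")
    with tarfile.open(archive_path) as archive:
        for member in archive.getmembers():
            if member.isfile() and member.name.endswith(".json"):
                source = archive.extractfile(member)
                # Drop any folder prefix so all files land in one directory.
                (target / Path(member.name).name).write_bytes(source.read())
    return sorted(p.name for p in target.iterdir())


with tempfile.TemporaryDirectory() as tmp:
    # Build a small archive with one nested and one top-level json file.
    archive_path = Path(tmp) / "metadata.tar.gz"
    with tarfile.open(archive_path, "w:gz") as archive:
        for name in ("nested/dir/BGC0000001.json", "BGC0000002.json"):
            payload = b"{}"
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    extracted = extract_json_flat(str(archive_path), str(Path(tmp) / "out"))
    print(extracted)  # ['BGC0000001.json', 'BGC0000002.json']
```

Writing each member under its base name is what removes the nested folders; two members with the same file name in different folders would overwrite each other in this sketch.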
parse_bgc_metadata_json
Parse MIBiG metadata file and return BGC object.
Note that the MIBiG accession is used as the BGC id and the strain name. The BGC object has a Strain object as its strain attribute.
Parameters:
Returns:
- BGC – BGC object