BigScape
nplinker.genomics.bigscape
BigscapeGCFLoader
Bases: GCFLoaderBase
Data loader for a BiG-SCAPE GCF cluster file.
Attributes:
- cluster_file (str) – Path to the BiG-SCAPE cluster file.
Parameters:
- cluster_file (str | PathLike) – Path to the BiG-SCAPE cluster file. The filename follows the pattern <class>_clustering_c0.xx.tsv.
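To illustrate the filename pattern, the following sketch splits a cluster filename into its class and cutoff. The function name `parse_cluster_filename` and the regular expression are assumptions made for this example and are not part of nplinker. The sketch also assumes the cutoff is written as one digit, a dot, and one or more digits.

```python
import re
from pathlib import PurePath

# Hypothetical pattern for "<class>_clustering_c0.xx.tsv"; not taken from nplinker.
_CLUSTER_FILE_RE = re.compile(
    r"^(?P<bgc_class>.+)_clustering_c(?P<cutoff>\d\.\d+)\.tsv$"
)


def parse_cluster_filename(path):
    """Return (class, cutoff) for a cluster filename, or None if it does not match."""
    match = _CLUSTER_FILE_RE.match(PurePath(path).name)
    if match is None:
        return None
    return match.group("bgc_class"), float(match.group("cutoff"))
```

A check like this can catch a wrong file being passed to the loader before any parsing starts.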
Source code in src/nplinker/genomics/bigscape/bigscape_loader.py
get_gcfs
Get all GCF objects.
Parameters:
- keep_mibig_only (bool, default: False) – True to keep GCFs that contain only MIBiG BGCs.
- keep_singleton (bool, default: False) – True to keep singleton GCFs. A singleton GCF is a GCF that contains only one BGC.
Returns:
Source code in src/nplinker/genomics/bigscape/bigscape_loader.py
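The effect of the two flags can be sketched with plain dictionaries. This is not the loader's implementation: `filter_gcfs` is a hypothetical helper, GCFs are modelled as a mapping from GCF id to a list of BGC ids, and a BGC is assumed to come from MIBiG when its id starts with "BGC".

```python
def filter_gcfs(gcfs, keep_mibig_only=False, keep_singleton=False):
    """Drop singleton and MIBiG-only GCFs unless the matching flag is True.

    gcfs maps a GCF id to a list of BGC ids (a stand-in for real GCF objects).
    """
    kept = {}
    for gcf_id, bgc_ids in gcfs.items():
        # A singleton GCF contains exactly one BGC.
        if not keep_singleton and len(bgc_ids) == 1:
            continue
        # Assumption: MIBiG BGC ids start with "BGC".
        if not keep_mibig_only and all(b.startswith("BGC") for b in bgc_ids):
            continue
        kept[gcf_id] = bgc_ids
    return kept


example = {
    "1": ["BGC0000001"],
    "2": ["BGC0000002", "BGC0000003"],
    "3": ["strainA.region001"],
    "4": ["strainA.region002", "BGC0000004"],
}
default_result = filter_gcfs(example)
with_singletons = filter_gcfs(example, keep_singleton=True)
```

With both flags left at False, only GCF "4" survives. With `keep_singleton=True`, GCF "3" is kept as well, while GCF "1" is still dropped because it contains only a MIBiG BGC.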
BigscapeV2GCFLoader
Bases: GCFLoaderBase
Data loader for a BiG-SCAPE v2 database file.
Attributes:
- db_file – Path to the BiG-SCAPE database file.
Parameters:
Source code in src/nplinker/genomics/bigscape/bigscape_loader.py
get_gcfs
Get all GCF objects.
Parameters:
- keep_mibig_only (bool, default: False) – True to keep GCFs that contain only MIBiG BGCs.
- keep_singleton (bool, default: False) – True to keep singleton GCFs. A singleton GCF is a GCF that contains only one BGC.
Returns:
Source code in src/nplinker/genomics/bigscape/bigscape_loader.py
run_bigscape
run_bigscape(
    antismash_path: str | PathLike,
    output_path: str | PathLike,
    extra_params: str,
    version: Literal[1, 2] = 1,
) -> bool
Runs BiG-SCAPE to cluster BGCs.
The behavior of this function differs slightly depending on the version of BiG-SCAPE that is set to run in the configuration file. In practice, the two versions take different sets of parameters.
The antiSMASH output directory should be a directory that contains GBK files. The directory can contain subdirectories, in which case BiG-SCAPE searches them recursively for GBK files. For example:
example_folder
├── organism_1
│ ├── organism_1.region001.gbk
│ ├── organism_1.region002.gbk
│ ├── organism_1.region003.gbk
│ ├── organism_1.final.gbk <- skipped!
│ └── ...
├── organism_2
│ ├── ...
└── ...
By default, only GBK files with "cluster" or "region" in the filename are accepted. GBK files with "final" in the filename are excluded.
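The default selection rule can be sketched as follows, for example to preview which files would be picked up before starting a run. This is an illustration, not BiG-SCAPE's own code: `is_accepted_gbk` and `find_gbk_files` are hypothetical helpers, and the sketch assumes case-sensitive substring matching on the filename with exclusion taking precedence over inclusion.

```python
from pathlib import Path

# Substrings from the rule described above.
INCLUDE = ("cluster", "region")
EXCLUDE = ("final",)


def is_accepted_gbk(filename, include=INCLUDE, exclude=EXCLUDE):
    """Return True if a GBK filename passes the include/exclude rule."""
    name = Path(filename).name
    if not name.endswith(".gbk"):
        return False
    # Assumption: an excluded substring wins over an included one.
    if any(token in name for token in exclude):
        return False
    return any(token in name for token in include)


def find_gbk_files(root):
    """Recursively collect accepted GBK files under root, sorted by path."""
    return sorted(
        p for p in Path(root).rglob("*.gbk")
        if p.is_file() and is_accepted_gbk(p.name)
    )
```

Applied to the example folder above, `find_gbk_files("example_folder")` would return the three `organism_1.regionNNN.gbk` files shown and skip `organism_1.final.gbk`.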
Parameters:
- antismash_path (str | PathLike) – Path to the antiSMASH output directory.
- output_path (str | PathLike) – Path to the output directory where BiG-SCAPE will write its results.
- extra_params (str) – Additional parameters to pass to BiG-SCAPE.
- version (Literal[1, 2], default: 1) – The version of BiG-SCAPE to run. Must be 1 or 2.
Returns:
- bool – True if BiG-SCAPE ran successfully, False otherwise.
Raises:
- ValueError – If an unexpected BiG-SCAPE version number is specified.
- FileNotFoundError – If the antismash_path does not exist or if the BiG-SCAPE Python script could not be found.
- RuntimeError – If BiG-SCAPE fails to run.
Examples:
>>> from nplinker.genomics.bigscape import run_bigscape
>>> run_bigscape(antismash_path="./antismash", output_path="./output",
... extra_params="--help", version=1)
Source code in src/nplinker/genomics/bigscape/runbigscape.py