BigScape
nplinker.genomics.bigscape
BigscapeGCFLoader
Bases: GCFLoaderBase
Data loader for BiG-SCAPE GCF cluster file.
Attributes:
- cluster_file (str) – Path to the BiG-SCAPE cluster file.
Parameters:
- cluster_file (str | PathLike) – Path to the BiG-SCAPE cluster file. The filename follows the pattern <class>_clustering_c0.xx.tsv.
Source code in src/nplinker/genomics/bigscape/bigscape_loader.py
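As an illustration of the filename pattern above, the following minimal sketch checks whether a path looks like a BiG-SCAPE cluster file before it is handed to the loader. It is not part of nplinker: the helper name parse_cluster_filename is hypothetical, and the regular expression assumes that "xx" stands for the digits of the distance cutoff (for example mix_clustering_c0.30.tsv).

```python
import re
from pathlib import Path

# Assumption: "xx" in the documented pattern is the decimal part of the cutoff.
CLUSTER_FILE_RE = re.compile(
    r"^(?P<bgc_class>.+)_clustering_c(?P<cutoff>0\.\d+)\.tsv$"
)


def parse_cluster_filename(path):
    """Return (bgc_class, cutoff) if the filename matches, else None."""
    match = CLUSTER_FILE_RE.match(Path(path).name)
    if match is None:
        return None
    return match.group("bgc_class"), float(match.group("cutoff"))


# Hypothetical paths, used only to exercise the helper.
print(parse_cluster_filename("bigscape/mix_clustering_c0.30.tsv"))
print(parse_cluster_filename("bigscape/notes.tsv"))
```

A path that passes this check can then be given to BigscapeGCFLoader as cluster_file.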
get_gcfs
Get all GCF objects.
Parameters:
- keep_mibig_only (bool, default: False) – True to keep GCFs that contain only MIBiG BGCs.
- keep_singleton (bool, default: False) – True to keep singleton GCFs. A singleton GCF is a GCF that contains only one BGC.
Returns:
Source code in src/nplinker/genomics/bigscape/bigscape_loader.py
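The effect of the two flags can be sketched with plain Python data structures. This is an illustration of the documented semantics only, not nplinker's implementation: the function filter_gcfs is hypothetical, GCFs are modelled as a mapping from GCF id to a set of BGC ids, and the sketch assumes that an id starting with "BGC" denotes a MIBiG BGC.

```python
def filter_gcfs(gcfs, keep_mibig_only=False, keep_singleton=False):
    """Filter a mapping of GCF id -> set of BGC ids.

    Assumption for this sketch: a BGC id starting with "BGC" is a MIBiG BGC.
    """
    kept = {}
    for gcf_id, bgc_ids in gcfs.items():
        # Drop GCFs with exactly one BGC unless singletons are requested.
        if not keep_singleton and len(bgc_ids) == 1:
            continue
        # Drop GCFs made up solely of MIBiG BGCs unless they are requested.
        if not keep_mibig_only and all(b.startswith("BGC") for b in bgc_ids):
            continue
        kept[gcf_id] = bgc_ids
    return kept


# Hypothetical example data.
example = {
    "gcf_1": {"BGC0000001", "BGC0000002"},         # MIBiG BGCs only
    "gcf_2": {"strain_a.region001"},               # singleton
    "gcf_3": {"BGC0000003", "strain_b.region002"},  # mixed
}
print(sorted(filter_gcfs(example)))
print(sorted(filter_gcfs(example, keep_mibig_only=True, keep_singleton=True)))
```

With both flags left at False, only the mixed, non-singleton GCF remains. Setting both to True keeps every GCF.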
BigscapeV2GCFLoader
Bases: GCFLoaderBase
Data loader for BiG-SCAPE v2 database file.
Attributes:
- db_file – Path to the BiG-SCAPE database file.
Parameters:
Source code in src/nplinker/genomics/bigscape/bigscape_loader.py
get_gcfs
Get all GCF objects.
Parameters:
- keep_mibig_only (bool, default: False) – True to keep GCFs that contain only MIBiG BGCs.
- keep_singleton (bool, default: False) – True to keep singleton GCFs. A singleton GCF is a GCF that contains only one BGC.
Returns:
Source code in src/nplinker/genomics/bigscape/bigscape_loader.py
run_bigscape
run_bigscape(
    antismash_path: str | PathLike,
    output_path: str | PathLike,
    extra_params: str,
    version: Literal["1", "2"] = "1",
) -> bool
Runs BiG-SCAPE to cluster BGCs.
The behavior of this function differs slightly depending on which version of BiG-SCAPE is set to run in the configuration file. The main difference is that the two versions use different sets of parameters.
The antiSMASH output directory should be a directory that contains GBK files. It can contain subdirectories, in which case BiG-SCAPE searches them recursively for GBK files. For example:
example_folder
├── organism_1
│ ├── organism_1.region001.gbk
│ ├── organism_1.region002.gbk
│ ├── organism_1.region003.gbk
│ ├── organism_1.final.gbk <- skipped!
│ └── ...
├── organism_2
│ ├── ...
└── ...
By default, only GBK files with "cluster" or "region" in the filename are accepted. GBK files with "final" in the filename are excluded.
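To preview which files the default rule would pick up from a directory tree like the one above, a minimal sketch is given below. BiG-SCAPE applies its own filtering internally, so this only approximates the rule as described here, and the helper name select_gbk_files is hypothetical.

```python
from pathlib import Path


def select_gbk_files(root):
    """Recursively collect GBK files that match the default rule."""
    selected = []
    for path in sorted(Path(root).rglob("*.gbk")):
        name = path.name
        if "final" in name:
            continue  # e.g. organism_1.final.gbk is skipped
        if "cluster" in name or "region" in name:
            selected.append(path)
    return selected


if __name__ == "__main__":
    # "example_folder" refers to the directory layout shown above.
    for gbk in select_gbk_files("example_folder"):
        print(gbk)
```

Only the filename is inspected, so a parent directory whose name contains "final" or "region" does not affect the selection in this sketch.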
Parameters:
- antismash_path (str | PathLike) – Path to the antiSMASH output directory.
- output_path (str | PathLike) – Path to the output directory where BiG-SCAPE will write its results.
- extra_params (str) – Additional parameters to pass to BiG-SCAPE.
- version (Literal['1', '2'], default: '1') – The version of BiG-SCAPE to run. Must be "1" or "2".
Returns:
- bool – True if BiG-SCAPE ran successfully, False otherwise.
Raises:
- ValueError – If an unexpected BiG-SCAPE version number is specified.
- FileNotFoundError – If the antismash_path does not exist or if the BiG-SCAPE Python script could not be found.
- RuntimeError – If BiG-SCAPE fails to run.
Examples:
>>> from nplinker.genomics.bigscape import run_bigscape
>>> run_bigscape(antismash_path="./antismash", output_path="./output",
... extra_params="--help", version="1")
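The documented exceptions can be handled in one place with a small wrapper. The sketch below is not part of nplinker: run_clustering is a hypothetical helper, and the function to call is passed in as runner so the example can be exercised without BiG-SCAPE installed. It assumes runner accepts the keyword arguments documented above.

```python
import logging

logger = logging.getLogger(__name__)


def run_clustering(runner, antismash_path, output_path, extra_params="", version="1"):
    """Call a run_bigscape-like callable and report any failure as False."""
    try:
        return runner(
            antismash_path=antismash_path,
            output_path=output_path,
            extra_params=extra_params,
            version=version,
        )
    except ValueError as err:
        logger.error("Unsupported BiG-SCAPE version: %s", err)
    except FileNotFoundError as err:
        logger.error("Missing input directory or BiG-SCAPE script: %s", err)
    except RuntimeError as err:
        logger.error("BiG-SCAPE failed: %s", err)
    return False
```

With nplinker installed, the documented function can be passed in directly, for instance run_clustering(run_bigscape, "./antismash", "./output").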
Source code in src/nplinker/genomics/bigscape/runbigscape.py