Utilities
nplinker.strain.utils
load_user_strains
Load user-specified strains from a JSON file.
The JSON file is validated against the schema USER_STRAINS_SCHEMA.
The content of the JSON file could be, for example:
Parameters:
Returns:
Source code in src/nplinker/strain/utils.py
podp_generate_strain_mappings
podp_generate_strain_mappings(
    podp_project_json_file: str | PathLike,
    genome_status_json_file: str | PathLike,
    genome_bgc_mappings_file: str | PathLike,
    gnps_file_mappings_file: str | PathLike,
    output_json_file: str | PathLike,
) -> StrainCollection
Generate the strain mappings JSON file for the PODP pipeline.
To get the strain mappings, the following mappings are combined:
- strain_id <-> original_genome_id <-> resolved_genome_id <-> bgc_id
- strain_id <-> MS_filename <-> spectrum_id
These mappings are extracted from the following files:
- "strain_id <-> original_genome_id" is extracted from podp_project_json_file.
- "original_genome_id <-> resolved_genome_id" is extracted from genome_status_json_file.
- "resolved_genome_id <-> bgc_id" is extracted from genome_bgc_mappings_file.
- "strain_id <-> MS_filename" is extracted from podp_project_json_file.
- "MS_filename <-> spectrum_id" is extracted from gnps_file_mappings_file.
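The chaining described above can be sketched in a few lines. This is only an illustration, not the code in src/nplinker/strain/utils.py: it assumes each mapping is a plain dict from an ID to a set of IDs, and the helper name chain_mappings and all sample IDs are hypothetical.

```python
from __future__ import annotations


def chain_mappings(
    first: dict[str, set[str]],
    second: dict[str, set[str]],
) -> dict[str, set[str]]:
    """Compose two one-to-many mappings: a -> b and b -> c give a -> c.

    Hypothetical helper for illustration; not part of nplinker.
    """
    result: dict[str, set[str]] = {}
    for key, middles in first.items():
        targets: set[str] = set()
        for middle in middles:
            # Intermediate IDs with no entry in `second` contribute nothing.
            targets |= second.get(middle, set())
        if targets:
            # Keys that reach no target are left out of the result.
            result[key] = targets
    return result


# Toy data (assumed shapes, invented IDs).
strain_to_original = {"strainA": {"genome1"}, "strainB": {"genome2"}}
original_to_resolved = {"genome1": {"GCF_1"}, "genome2": {"GCF_2"}}
resolved_to_bgc = {"GCF_1": {"bgc1", "bgc2"}}

# strain_id <-> original_genome_id <-> resolved_genome_id <-> bgc_id
strain_to_bgc = chain_mappings(
    chain_mappings(strain_to_original, original_to_resolved),
    resolved_to_bgc,
)

# strain_id <-> MS_filename <-> spectrum_id
strain_to_ms = {"strainA": {"a.mzXML"}}
ms_to_spectrum = {"a.mzXML": {"spec1", "spec2"}}
strain_to_spectrum = chain_mappings(strain_to_ms, ms_to_spectrum)
```

In this sketch a strain whose genome resolves to no BGC (strainB here) simply drops out of the combined mapping. Whether the real function handles such cases the same way is not stated in this page.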
Parameters:
- podp_project_json_file (str | PathLike) – The path to the PODP project JSON file.
- genome_status_json_file (str | PathLike) – The path to the genome status JSON file.
- genome_bgc_mappings_file (str | PathLike) – The path to the genome BGC mappings JSON file.
- gnps_file_mappings_file (str | PathLike) – The path to the GNPS file mappings file (csv or tsv).
- output_json_file (str | PathLike) – The path to the output JSON file.
Returns:
- StrainCollection – The strain mappings stored in a StrainCollection object.
See Also
- extract_mappings_strain_id_original_genome_id: Extract mappings "strain_id <-> original_genome_id".
- extract_mappings_original_genome_id_resolved_genome_id: Extract mappings "original_genome_id <-> resolved_genome_id".
- extract_mappings_resolved_genome_id_bgc_id: Extract mappings "resolved_genome_id <-> bgc_id".
- get_mappings_strain_id_bgc_id: Get mappings "strain_id <-> bgc_id".
- extract_mappings_strain_id_ms_filename: Extract mappings "strain_id <-> MS_filename".
- extract_mappings_ms_filename_spectrum_id: Extract mappings "MS_filename <-> spectrum_id".
- get_mappings_strain_id_spectrum_id: Get mappings "strain_id <-> spectrum_id".
Source code in src/nplinker/strain/utils.py