Data Models

nplinker.scoring

LinkGraph

LinkGraph()

Class to represent the links between objects in NPLinker.

This class wraps the networkx.Graph class to provide a more user-friendly interface for working with the links.

The links between objects are stored as edges in a graph, while the objects themselves are stored as nodes.

The scoring data for each link (or link data) is stored as the key/value attributes of the edge.
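The storage model described above can be pictured with plain dictionaries. The sketch below is an illustration only, not NPLinker code: `adjacency` and `add_edge` are hypothetical names, strings stand in for GCF and Spectrum objects, and floats stand in for Score objects. It is loosely modelled on how an undirected graph keeps one attribute dictionary per edge.

```python
from collections import defaultdict

# Hypothetical stand-in for the undirected graph that LinkGraph wraps.
# Nodes are the outer keys; each edge has one attribute dict shared by both ends.
adjacency: dict[str, dict[str, dict]] = defaultdict(dict)


def add_edge(u: str, v: str, **attrs: float) -> None:
    """Store an undirected edge u-v together with its key/value attributes."""
    data = adjacency[u].get(v, {})
    data.update(attrs)
    adjacency[u][v] = data
    adjacency[v][u] = data  # same dict object, so both directions stay in sync


# Strings and floats stand in for GCF/Spectrum objects and Score values.
add_edge("gcf1", "spectrum1", metcalf=1.0)
add_edge("gcf1", "spectrum1", rosetta=0.8)  # a second score on the same link

print(adjacency["gcf1"])       # neighbours of gcf1 with their link data
print(adjacency["spectrum1"])  # the same link seen from the other end
```

Because the link data lives on the edge, a link carries one entry per scoring method regardless of which end it is looked up from.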

Examples:

Create a LinkGraph object:

>>> lg = LinkGraph()

Display the empty LinkGraph object:

>>> lg
|    |   Object 1 |   Object 2 |   Metcalf Score |   Rosetta Score |
|----|------------|------------|-----------------|-----------------|

Add a link between a GCF and a Spectrum object:

>>> lg.add_link(gcf, spectrum, metcalf=Score("metcalf", 1.0, {"cutoff": 0.5}))

Display all links in LinkGraph object:

>>> lg
|    |     Object 1 |               Object 2 |   Metcalf Score |   Rosetta Score |
|----|--------------|------------------------|-----------------|-----------------|
|  1 | GCF(id=gcf1) | Spectrum(id=spectrum1) |               1 |               - |

Get all links for a given object:

>>> lg[gcf]
{spectrum: {"metcalf": Score("metcalf", 1.0, {"cutoff": 0.5})}}

Get all links in the LinkGraph:

>>> lg.links
[(gcf, spectrum, {"metcalf": Score("metcalf", 1.0, {"cutoff": 0.5})})]

Check if there is a link between two objects:

>>> lg.has_link(gcf, spectrum)
True

Get the link data between two objects:

>>> lg.get_link_data(gcf, spectrum)
{"metcalf": Score("metcalf", 1.0, {"cutoff": 0.5})}

Source code in src/nplinker/scoring/link_graph.py
def __init__(self) -> None:
    """Initialize a LinkGraph object.

    Examples:
        Create a LinkGraph object:
        >>> lg = LinkGraph()

        Display the empty LinkGraph object:
        >>> lg
        |    |   Object 1 |   Object 2 |   Metcalf Score |   Rosetta Score |
        |----|------------|------------|-----------------|-----------------|

        Add a link between a GCF and a Spectrum object:
        >>> lg.add_link(gcf, spectrum, metcalf=Score("metcalf", 1.0, {"cutoff": 0.5}))

        Display all links in LinkGraph object:
        >>> lg
        |    |     Object 1 |               Object 2 |   Metcalf Score |   Rosetta Score |
        |----|--------------|------------------------|-----------------|-----------------|
        |  1 | GCF(id=gcf1) | Spectrum(id=spectrum1) |               1 |               - |

        Get all links for a given object:
        >>> lg[gcf]
        {spectrum: {"metcalf": Score("metcalf", 1.0, {"cutoff": 0.5})}}

        Get all links in the LinkGraph:
        >>> lg.links
        [(gcf, spectrum, {"metcalf": Score("metcalf", 1.0, {"cutoff": 0.5})})]

        Check if there is a link between two objects:
        >>> lg.has_link(gcf, spectrum)
        True

        Get the link data between two objects:
        >>> lg.get_link_data(gcf, spectrum)
        {"metcalf": Score("metcalf", 1.0, {"cutoff": 0.5})}
    """
    self._g: Graph = Graph()

links property

links: list[LINK]

Get all links.

Returns:

  • list[LINK]

    A list of tuples containing the links between objects.

Examples:

>>> lg.links
[(gcf, spectrum, {"metcalf": Score("metcalf", 1.0, {"cutoff": 0.5})})]

__repr__

__repr__() -> str

Return a string representation of the LinkGraph.

Source code in src/nplinker/scoring/link_graph.py
def __repr__(self) -> str:
    """Return a string representation of the LinkGraph."""
    return self._get_table_repr()

__len__

__len__() -> int

Get the number of objects.

Source code in src/nplinker/scoring/link_graph.py
def __len__(self) -> int:
    """Get the number of objects."""
    return len(self._g)

__getitem__

__getitem__(u: Entity) -> dict[Entity, LINK_DATA]

Get all links for a given object.

Parameters:

  • u (Entity) –

    the given object

Returns:

  • dict[Entity, LINK_DATA]

    A dictionary of links for the given object.

Raises:

  • KeyError

    if the input object is not found in the link graph.
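The source of `__getitem__` returns `{**links}`, which is a shallow copy. The sketch below shows what that means using plain dictionaries only; `inner` and `view` are hypothetical names, and whether the real edge data behaves identically depends on the underlying graph library, so treat this as an assumption to verify rather than documented behaviour.

```python
# Plain dicts stand in for the graph's adjacency data (an assumption).
inner = {"spectrum1": {"metcalf": 1.0}}

view = {**inner}  # same expression as in __getitem__: a shallow copy

# Changing the top level of the copy leaves the original untouched...
view["spectrum2"] = {"metcalf": 0.3}
print("spectrum2" in inner)

# ...but the nested link-data dicts are still the same objects.
view["spectrum1"]["metcalf"] = 9.9
print(inner["spectrum1"]["metcalf"])
```

If the link data must not be altered by callers, copying the nested dictionaries as well (for example with `copy.deepcopy`) is the safer choice.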

Source code in src/nplinker/scoring/link_graph.py
@validate_u
def __getitem__(self, u: Entity) -> dict[Entity, LINK_DATA]:
    """Get all links for a given object.

    Args:
        u: the given object

    Returns:
        A dictionary of links for the given object.

    Raises:
        KeyError: if the input object is not found in the link graph.
    """
    try:
        links = self._g[u]
    except KeyError:
        raise KeyError(f"{u} not found in the link graph.")

    return {**links}  # type: ignore

add_link

add_link(u: Entity, v: Entity, **data: Score) -> None

Add a link between two objects.

The objects u and v must be of different types, i.e. one must be a GCF and the other must be a Spectrum or MolecularFamily.

Parameters:

  • u (Entity) –

    the first object, either a GCF, Spectrum, or MolecularFamily

  • v (Entity) –

    the second object, either a GCF, Spectrum, or MolecularFamily

  • data (Score, default: {} ) –

    keyword arguments. At least one scoring method and its data must be provided. The key must be the name of the scoring method defined in ScoringMethod, and the value is a Score object, e.g. metcalf=Score("metcalf", 1.0, {"cutoff": 0.5}).
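The checks applied to `data` can be tried out in isolation. The sketch below is a minimal stand-in, not the library itself: `ScoringMethod` here has only two assumed members (taken from the score columns shown in the table output), `Score` is a simplified dataclass without validation, and `validate_link_data` and `outcome` are hypothetical helper names.

```python
from dataclasses import dataclass
from enum import Enum


class ScoringMethod(Enum):
    """Hypothetical stand-in; the real enum may define other members."""

    METCALF = "metcalf"
    ROSETTA = "rosetta"

    @classmethod
    def has_value(cls, value: str) -> bool:
        return any(value == member.value for member in cls)


@dataclass
class Score:
    """Simplified stand-in for Score (no validation here)."""

    name: str
    value: float
    parameter: dict


def validate_link_data(**data: Score) -> None:
    """Apply the same three checks as the body of add_link."""
    if not data:
        raise ValueError("At least one scoring method and its data must be provided.")
    for key, value in data.items():
        if not ScoringMethod.has_value(key):
            raise ValueError(f"{key} is not a valid name of scoring method.")
        if not isinstance(value, Score):
            raise TypeError(f"{value} is not a Score object.")


def outcome(**data) -> str:
    """Return 'ok' or the name of the exception raised by the checks."""
    try:
        validate_link_data(**data)
    except (ValueError, TypeError) as exc:
        return type(exc).__name__
    return "ok"


good = Score("metcalf", 1.0, {"cutoff": 0.5})
print(outcome(metcalf=good))  # accepted
print(outcome())              # no scoring data at all
print(outcome(unknown=good))  # key is not a scoring method name
print(outcome(metcalf=1.0))   # value is not a Score object
```

Note that the keyword name is validated separately from the `name` stored inside the Score object; the checks shown do not compare the two.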

Examples:

>>> lg.add_link(gcf, spectrum, metcalf=Score("metcalf", 1.0, {"cutoff": 0.5}))

Source code in src/nplinker/scoring/link_graph.py
@validate_uv
def add_link(
    self,
    u: Entity,
    v: Entity,
    **data: Score,
) -> None:
    """Add a link between two objects.

    The objects `u` and `v` must be different types, i.e. one must be a GCF and the other must be
    a Spectrum or MolecularFamily.

    Args:
        u: the first object, either a GCF, Spectrum, or MolecularFamily
        v: the second object, either a GCF, Spectrum, or MolecularFamily
        data: keyword arguments. At least one scoring method and its data must be provided.
            The key must be the name of the scoring method defined in `ScoringMethod`, and the
            value is a `Score` object, e.g. `metcalf=Score("metcalf", 1.0, {"cutoff": 0.5})`.

    Examples:
        >>> lg.add_link(gcf, spectrum, metcalf=Score("metcalf", 1.0, {"cutoff": 0.5}))
    """
    # validate the data
    if not data:
        raise ValueError("At least one scoring method and its data must be provided.")
    for key, value in data.items():
        if not ScoringMethod.has_value(key):
            raise ValueError(
                f"{key} is not a valid name of scoring method. See `ScoringMethod` for valid names."
            )
        if not isinstance(value, Score):
            raise TypeError(f"{value} is not a Score object.")

    self._g.add_edge(u, v, **data)

has_link

has_link(u: Entity, v: Entity) -> bool

Check if there is a link between two objects.

Parameters:

  • u (Entity) –

    the first object, either a GCF, Spectrum, or MolecularFamily

  • v (Entity) –

    the second object, either a GCF, Spectrum, or MolecularFamily

Returns:

  • bool

    True if there is a link between the two objects, False otherwise

Examples:

>>> lg.has_link(gcf, spectrum)
True

Source code in src/nplinker/scoring/link_graph.py
@validate_uv
def has_link(self, u: Entity, v: Entity) -> bool:
    """Check if there is a link between two objects.

    Args:
        u: the first object, either a GCF, Spectrum, or MolecularFamily
        v: the second object, either a GCF, Spectrum, or MolecularFamily

    Returns:
        True if there is a link between the two objects, False otherwise

    Examples:
        >>> lg.has_link(gcf, spectrum)
        True
    """
    return self._g.has_edge(u, v)

get_link_data

get_link_data(u: Entity, v: Entity) -> LINK_DATA | None

Get the data for a link between two objects.

Parameters:

  • u (Entity) –

    the first object, either a GCF, Spectrum, or MolecularFamily

  • v (Entity) –

    the second object, either a GCF, Spectrum, or MolecularFamily

Returns:

  • LINK_DATA | None

    A dictionary of scoring methods and their data for the link between the two objects, or None if there is no link between the two objects.

Examples:

>>> lg.get_link_data(gcf, spectrum)
{"metcalf": Score("metcalf", 1.0, {"cutoff": 0.5})}

Source code in src/nplinker/scoring/link_graph.py
@validate_uv
def get_link_data(
    self,
    u: Entity,
    v: Entity,
) -> LINK_DATA | None:
    """Get the data for a link between two objects.

    Args:
        u: the first object, either a GCF, Spectrum, or MolecularFamily
        v: the second object, either a GCF, Spectrum, or MolecularFamily

    Returns:
        A dictionary of scoring methods and their data for the link between the two objects, or
        None if there is no link between the two objects.

    Examples:
        >>> lg.get_link_data(gcf, spectrum)
        {"metcalf": Score("metcalf", 1.0, {"cutoff": 0.5})}
    """
    return self._g.get_edge_data(u, v)  # type: ignore
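The lookup methods signal a miss in different ways: `__getitem__` raises KeyError for an unknown object, while `get_link_data` returns None when there is no link. The sketch below contrasts the two on a hypothetical dict-based stand-in (`TinyLinkGraph` is not part of NPLinker; strings and floats replace the real objects and scores, and the type validation decorators are omitted).

```python
from __future__ import annotations


class TinyLinkGraph:
    """Hypothetical dict-based stand-in for LinkGraph, for illustration only."""

    def __init__(self) -> None:
        self._adj: dict[str, dict[str, dict]] = {}

    def add_link(self, u: str, v: str, **data: float) -> None:
        link = self._adj.setdefault(u, {}).setdefault(v, {})
        link.update(data)
        self._adj.setdefault(v, {})[u] = link  # undirected: share the same dict

    def __getitem__(self, u: str) -> dict[str, dict]:
        try:
            return {**self._adj[u]}
        except KeyError:
            raise KeyError(f"{u} not found in the link graph.") from None

    def has_link(self, u: str, v: str) -> bool:
        return v in self._adj.get(u, {})

    def get_link_data(self, u: str, v: str) -> dict | None:
        return self._adj.get(u, {}).get(v)


lg = TinyLinkGraph()
lg.add_link("gcf1", "spectrum1", metcalf=1.0)

# A missing link yields None, so check before using the result.
data = lg.get_link_data("gcf1", "spectrum2")
if data is None:
    print("no link between gcf1 and spectrum2")

# An unknown object raises KeyError instead.
try:
    lg["gcf2"]
except KeyError as exc:
    print(exc)
```

Checking `has_link` first, or testing the result of `get_link_data` against None, avoids indexing into a missing link.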

filter

filter(
    u_nodes: Sequence[Entity],
    v_nodes: Sequence[Entity] = [],
) -> LinkGraph

Return a new LinkGraph object with the filtered links between the given objects.

The new LinkGraph object will only contain the links between u_nodes and v_nodes.

If only one of u_nodes and v_nodes is empty, the new LinkGraph object will contain all links of the objects in the non-empty sequence. If both are empty, an empty LinkGraph object is returned.

Note that not all objects in u_nodes and v_nodes need to be present in the original LinkGraph.

Parameters:

  • u_nodes (Sequence[Entity]) –

    a sequence of objects used as the first object in the links

  • v_nodes (Sequence[Entity], default: [] ) –

    a sequence of objects used as the second object in the links

Returns:

  • LinkGraph

    A new LinkGraph object with the filtered links between the given objects.

Examples:

Filter the links for gcf1 and gcf2:

>>> new_lg = lg.filter([gcf1, gcf2])

Filter the links for spectrum1 and spectrum2:

>>> new_lg = lg.filter([spectrum1, spectrum2])

Filter the links between two lists of objects:

>>> new_lg = lg.filter([gcf1, gcf2], [spectrum1, spectrum2])

Source code in src/nplinker/scoring/link_graph.py
def filter(self, u_nodes: Sequence[Entity], v_nodes: Sequence[Entity] = [], /) -> LinkGraph:
    """Return a new LinkGraph object with the filtered links between the given objects.

    The new LinkGraph object will only contain the links between `u_nodes` and `v_nodes`.

    If `u_nodes` or `v_nodes` is empty, the new LinkGraph object will contain the links for
    the given objects in `v_nodes` or `u_nodes`, respectively. If both are empty, return an
    empty LinkGraph object.

    Note that not all objects in `u_nodes` and `v_nodes` need to be present in the original
    LinkGraph.

    Args:
        u_nodes: a sequence of objects used as the first object in the links
        v_nodes: a sequence of objects used as the second object in the links

    Returns:
        A new LinkGraph object with the filtered links between the given objects.

    Examples:
        Filter the links for `gcf1` and `gcf2`:
        >>> new_lg = lg.filter([gcf1, gcf2])
        Filter the links for `spectrum1` and `spectrum2`:
        >>> new_lg = lg.filter([spectrum1, spectrum2])
        Filter the links between two lists of objects:
        >>> new_lg = lg.filter([gcf1, gcf2], [spectrum1, spectrum2])
    """
    lg = LinkGraph()

    # exchange u_nodes and v_nodes if u_nodes is empty but v_nodes not
    if len(u_nodes) == 0 and len(v_nodes) != 0:
        u_nodes = v_nodes
        v_nodes = []

    if len(v_nodes) == 0:
        for u in u_nodes:
            self._filter_one_node(u, lg)

    for u in u_nodes:
        for v in v_nodes:
            self._filter_two_nodes(u, v, lg)

    return lg
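The branching in `filter` can be followed on a small example. The sketch below mirrors that branching on a plain dictionary of links; it is an approximation under stated assumptions: `filter_links` is a hypothetical function, links are keyed by `frozenset({u, v})`, and the private helpers `_filter_one_node` and `_filter_two_nodes` (not shown in the source above) are assumed to copy all links of one object and the single link between two objects, respectively.

```python
from __future__ import annotations

from collections.abc import Sequence


def filter_links(
    links: dict[frozenset, dict],
    u_nodes: Sequence[str],
    v_nodes: Sequence[str] = (),
) -> dict[frozenset, dict]:
    """Mirror the branching of LinkGraph.filter on a plain dict of links."""
    kept: dict[frozenset, dict] = {}

    # Swap the arguments if only v_nodes was given.
    if len(u_nodes) == 0 and len(v_nodes) != 0:
        u_nodes, v_nodes = v_nodes, ()

    if len(v_nodes) == 0:
        # One-sided filter: keep every link that touches an object in u_nodes.
        for pair, data in links.items():
            if any(u in pair for u in u_nodes):
                kept[pair] = data
    else:
        # Two-sided filter: keep only links joining u_nodes to v_nodes.
        for u in u_nodes:
            for v in v_nodes:
                pair = frozenset((u, v))
                if pair in links:
                    kept[pair] = links[pair]
    return kept


# Three links; strings and floats stand in for the real objects and scores.
links = {
    frozenset(("gcf1", "spectrum1")): {"metcalf": 1.0},
    frozenset(("gcf1", "spectrum2")): {"metcalf": 0.7},
    frozenset(("gcf2", "spectrum1")): {"metcalf": 0.4},
}

print(len(filter_links(links, ["gcf1"])))                         # links of gcf1
print(len(filter_links(links, [], ["spectrum1"])))                # only v_nodes given
print(len(filter_links(links, ["gcf1", "gcf3"], ["spectrum1"])))  # gcf3 is unknown
print(len(filter_links(links, [], [])))                           # both empty
```

Objects that are absent from the original links, such as `gcf3` here, are simply skipped rather than raising an error, which matches the note in the description.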

Score dataclass

Score(name: str, value: float, parameter: dict)

A data class to represent score data.

Attributes:

  • name (str) –

    the name of the scoring method. See ScoringMethod for valid values.

  • value (float) –

    the score value.

  • parameter (dict) –

    the parameters used for the scoring method.

name instance-attribute

name: str

value instance-attribute

value: float

parameter instance-attribute

parameter: dict

__post_init__

__post_init__() -> None

Check if the value of name is valid.

Raises:

  • ValueError

    if the value of name is not valid.

Source code in src/nplinker/scoring/score.py
def __post_init__(self) -> None:
    """Check if the value of `name` is valid.

    Raises:
        ValueError: if the value of `name` is not valid.
    """
    if ScoringMethod.has_value(self.name) is False:
        raise ValueError(
            f"{self.name} is not a valid value. Valid values are: {[e.value for e in ScoringMethod]}"
        )

__getitem__

__getitem__(key)

Source code in src/nplinker/scoring/score.py
def __getitem__(self, key):
    if key in {field.name for field in fields(self)}:
        return getattr(self, key)
    else:
        raise KeyError(f"{key} not found in {self.__class__.__name__}")

__setitem__

__setitem__(key, value)

Source code in src/nplinker/scoring/score.py
def __setitem__(self, key, value):
    # validate the value of `name`
    if key == "name" and ScoringMethod.has_value(value) is False:
        raise ValueError(
            f"{value} is not a valid value. Valid values are: {[e.value for e in ScoringMethod]}"
        )

    if key in {field.name for field in fields(self)}:
        setattr(self, key, value)
    else:
        raise KeyError(f"{key} not found in {self.__class__.__name__}")
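Taken together, the three methods above give Score validated construction plus dictionary-style access to its fields. The sketch below reproduces that behaviour in a self-contained stand-in so it can be run without NPLinker; the two-member `ScoringMethod` enum and the `raised` helper are assumptions made for the example, and the error messages are shortened.

```python
from dataclasses import dataclass, fields
from enum import Enum


class ScoringMethod(Enum):
    """Hypothetical stand-in; the member list is an assumption."""

    METCALF = "metcalf"
    ROSETTA = "rosetta"

    @classmethod
    def has_value(cls, value: str) -> bool:
        return any(value == member.value for member in cls)


@dataclass
class Score:
    """Stand-in that mirrors the validation and item access shown above."""

    name: str
    value: float
    parameter: dict

    def __post_init__(self) -> None:
        if not ScoringMethod.has_value(self.name):
            raise ValueError(f"{self.name} is not a valid value.")

    def __getitem__(self, key):
        if key in {field.name for field in fields(self)}:
            return getattr(self, key)
        raise KeyError(f"{key} not found in {self.__class__.__name__}")

    def __setitem__(self, key, value):
        if key == "name" and not ScoringMethod.has_value(value):
            raise ValueError(f"{value} is not a valid value.")
        if key in {field.name for field in fields(self)}:
            setattr(self, key, value)
        else:
            raise KeyError(f"{key} not found in {self.__class__.__name__}")


def raised(action) -> str:
    """Run a zero-argument callable and report the exception name, if any."""
    try:
        action()
    except (KeyError, ValueError) as exc:
        return type(exc).__name__
    return "no error"


score = Score("metcalf", 1.0, {"cutoff": 0.5})
score["value"] = 2.0                # item assignment updates the field
print(score["value"], score.value)  # both access styles see the change

print(raised(lambda: Score("bogus", 1.0, {})))             # rejected at creation
print(raised(lambda: score.__setitem__("name", "bogus")))  # rejected on item set
print(raised(lambda: score["missing"]))                    # unknown field
```

In this stand-in, plain attribute assignment such as `score.name = "bogus"` is not validated, because `__post_init__` runs only at construction and the check in `__setitem__` applies only to item assignment; the source shown above appears to behave the same way.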