Practical peptide-to-SMILES and SMILES-to-peptide workflows

PepLink: Peptide Sequence ↔ SMILES/SELFIES Conversion

Convert peptide sequences into SMILES or peptide SELFIES, and interpret peptide-like SMILES. Forward workflows support non-canonical amino acids, cyclic or intramolecularly linked peptides, and terminal modifications.

Forward conversion supports broader peptide construction. Reverse parsing is intentionally conservative and focused on standard peptide-like molecules within PepLink v1's documented scope.

[Figure: PepLink schematic showing peptide sequence conversion to SMILES, SELFIES, and structured peptide parsing]

What PepLink does well

A compact tool for peptide representation conversion

Peptide sequence to SMILES or SELFIES

Use aa_seqs_to_smiles(...) to export a peptide sequence to canonical SMILES or peptide SELFIES for downstream chemistry and dataset workflows.

SMILES to peptide-oriented parsing

Use smiles_to_aa_seqs(...) when you need a practical SMILES-to-peptide step for standard peptide-like molecules, including linear peptides and head-to-tail cyclic peptides. It returns structured output and reports an explicit reason when a molecule is unsupported.

Non-canonical amino acid support

PepLink ships with 420 bundled non-canonical amino acid mappings and also supports runtime registration of custom residues for peptide sequence-to-SMILES pipelines.

Cyclic, linked, and terminally modified peptides

Forward generation covers selected cyclic peptide SMILES cases, intramolecular linkage classes, and bundled N- and C-terminal modifications. It does not attempt to cover all of peptide chemistry.

Why PepLink

Built for peptide workflows that exceed canonical linear cases

Many peptide conversion utilities are optimized for canonical linear sequences, one-way export, or narrowly curated amino-acid sets. That makes straightforward peptide-to-SMILES tasks possible, but it leaves gaps when your workflow includes a non-canonical amino acid, a cyclic peptide SMILES target, or terminal modifications that matter to the final structure.

PepLink is useful when you need practical conversion for exactly those cases. Its forward builder is designed around monomer peptides, bundled residue mappings, selected cyclic and intramolecularly linked peptide topologies, and terminal modifications. The reverse parser is deliberately narrower: it focuses on standard amino-acid peptides, with support centered on linear peptides and head-to-tail cyclic peptides, and clearly reports when a molecule falls outside its reliable scope.

That split makes PepLink credible for research software and agent automation. You can rely on the forward direction for richer peptide construction, and you can use reverse parsing when a conservative SMILES-to-peptide interpretation is the safer choice.

Example workflows

Readable APIs for common peptide informatics tasks

Sequence → SMILES

aa_seqs_to_smiles(...)

A realistic forward example with non-canonical residues plus terminal modifications. This is the core peptide sequence-to-SMILES workflow.

from PepLink import aa_seqs_to_smiles

smiles = aa_seqs_to_smiles(
    "RRXXRF",
    unusual_amino_acids=[
        {"position": 3, "name": "1-NAL"},
        {"position": 4, "name": "1-NAL"},
    ],
    n_terminal="ACT",
    c_terminal="AMD",
)
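
The unusual_amino_acids list in the example above can also be derived from the sequence instead of being written by hand. The sketch below is a hypothetical helper, not part of PepLink. It assumes that "X" marks the residues to override and that positions are 1-based, both inferred from the "RRXXRF" example rather than from PepLink's documentation.

```python
def build_unusual_amino_acids(sequence, names, placeholder="X"):
    """Pair each placeholder in `sequence` with a residue name, left to right.

    Hypothetical helper. Positions are 1-based, mirroring the example above
    where the placeholders at positions 3 and 4 of "RRXXRF" map to "1-NAL".
    """
    positions = [i for i, aa in enumerate(sequence, start=1) if aa == placeholder]
    if len(positions) != len(names):
        raise ValueError(
            f"{len(positions)} placeholder(s) in {sequence!r} "
            f"but {len(names)} name(s) given"
        )
    return [{"position": p, "name": n} for p, n in zip(positions, names)]


overrides = build_unusual_amino_acids("RRXXRF", ["1-NAL", "1-NAL"])
print(overrides)
# [{'position': 3, 'name': '1-NAL'}, {'position': 4, 'name': '1-NAL'}]
```

The resulting list has the same shape as the one passed to aa_seqs_to_smiles(...) above, so it could be supplied as unusual_amino_acids=overrides.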

Custom residue registration

register_noncanonical_aa(...)

Extend PepLink with your own non-canonical amino acid mappings directly in Python when bundled residue mappings are not enough for your workflow.

from PepLink import (
    aa_seqs_to_smiles,
    register_noncanonical_aa,
)

register_noncanonical_aa("MyAA", "N[C@@H](CC)C(=O)O")

smiles = aa_seqs_to_smiles(
    "AXA",
    unusual_amino_acids=[{"position": 2, "name": "MyAA"}],
)

SMILES → Peptide interpretation

smiles_to_aa_seqs(...)

Reverse parsing returns a structured PeptideParseResult. It is best suited to standard peptide-like molecules, including linear peptides and head-to-tail cyclic peptides.

from PepLink import smiles_to_aa_seqs

result = smiles_to_aa_seqs(
    "C[C@H](N)C(=O)N[C@@H](CS)C(=O)O"
)

print(result.sequence)            # AC
print(result.is_cyclic)           # False
print(result.cyclization)         # linear
print(result.unsupported_reason)  # None
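
In a pipeline it helps to check unsupported_reason before trusting the other fields. The sketch below is one possible pattern and is not part of PepLink. It assumes only the four attribute names printed above, and that unsupported_reason is None on success, as in that example. A stand-in object replaces a real PeptideParseResult so the snippet runs without PepLink installed.

```python
from types import SimpleNamespace


def summarize_parse(result):
    """Convert a parse result into a plain dict, separating success from
    'outside supported scope'. Hypothetical helper, not a PepLink function."""
    reason = getattr(result, "unsupported_reason", None)
    if reason is not None:
        # Do not read the other fields here: what they hold for an
        # unsupported molecule is not stated on this page.
        return {"ok": False, "reason": reason}
    return {
        "ok": True,
        "sequence": result.sequence,
        "is_cyclic": result.is_cyclic,
        "cyclization": result.cyclization,
    }


# Stand-in with the same attribute names as the example above.
demo = SimpleNamespace(
    sequence="AC", is_cyclic=False, cyclization="linear", unsupported_reason=None
)
print(summarize_parse(demo))
```

With PepLink installed, the same function could be applied directly to the value returned by smiles_to_aa_seqs(...).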

Who it is for

Useful across research, pipelines, and tooling

  • Peptide design workflows that need explicit sequence-to-structure conversion.
  • Cheminformatics preprocessing for peptide datasets, descriptors, and model inputs.
  • Dataset generation where peptide sequence-to-SMILES or peptide SELFIES export must be repeatable.
  • Agent and tool integration where a backend needs deterministic peptide representation conversion.
  • Educational and research prototyping that benefits from a practical Python API instead of a large stack.

Agent and tool integration

A practical backend for automation and AI agents

PepLink is a good fit when peptide conversion needs to sit behind an API, an MCP-compatible tool wrapper, a workflow engine, or a lightweight Python service. The forward API accepts structured fields for residue overrides, unusual amino acids, intrachain bonds, and terminal modifications. The reverse API returns a structured PeptideParseResult instead of only a raw string.

That makes the package useful for AI agents and tool builders that need deterministic peptide conversion, explicit failure handling, and data that can be fed into larger automation chains without brittle text parsing.
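
As a rough illustration of such a wrapper, the sketch below takes a dict payload and returns a JSON string. The handler name, the payload keys, and the broad exception handling are assumptions made for this example. Only the keyword arguments that appear in this page's examples are forwarded, and the converter is injectable so the snippet runs without PepLink installed.

```python
import json


def peptide_to_smiles_tool(payload, convert=None):
    """Hypothetical tool handler: dict payload in, JSON string out."""
    if convert is None:
        # Imported lazily so the wrapper can be tested with a stand-in.
        from PepLink import aa_seqs_to_smiles as convert

    sequence = payload.get("sequence")
    if not isinstance(sequence, str) or not sequence:
        return json.dumps(
            {"ok": False, "error": "payload needs a non-empty 'sequence' string"}
        )

    # Forward only keyword arguments shown in this page's examples.
    allowed = ("unusual_amino_acids", "n_terminal", "c_terminal", "output_format")
    kwargs = {key: payload[key] for key in allowed if key in payload}

    try:
        return json.dumps({"ok": True, "result": convert(sequence, **kwargs)})
    except Exception as exc:
        # Which exceptions PepLink raises is not documented here, so this
        # sketch catches broadly and reports the exception type and message.
        return json.dumps({"ok": False, "error": f"{type(exc).__name__}: {exc}"})


def fake_convert(sequence, **kwargs):
    """Stand-in converter used only for this demonstration."""
    return f"{sequence}|{sorted(kwargs)}"


print(peptide_to_smiles_tool({"sequence": "AC", "n_terminal": "ACT"}, convert=fake_convert))
```

Returning a JSON envelope with an ok flag keeps failures machine-readable, so the caller never has to parse free text.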

Programmatic shape

  • aa_seqs_to_smiles(sequence, ..., output_format="smiles" | "selfies") -> str
  • smiles_to_aa_seqs(text, *, input_format="auto") -> PeptideParseResult
  • register_noncanonical_aa(...) and CSV helpers for custom residue support
  • unsupported_reason gives automation-friendly failure context

Installation

Start with a minimal Python install

pip install PepLink

from PepLink import aa_seqs_to_smiles

smiles = aa_seqs_to_smiles("AC")
print(smiles)

PepLink is a Python package for peptide sequence-to-SMILES conversion, peptide SELFIES export, and conservative peptide-oriented interpretation of SMILES within its current documented scope.

FAQ

Common questions

What kinds of peptides does PepLink target?

PepLink v1 focuses on monomer peptides. Forward generation supports canonical residues, D-forms, bundled non-canonical mappings, selected cyclic or intramolecularly linked peptide definitions, and terminal modifications.

Can I define custom non-canonical amino acids?

Yes. PepLink includes bundled non-canonical amino acid mappings for forward workflows, and it also lets you register your own custom residues programmatically or load them from CSV when needed.
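
PepLink's own CSV helpers are not shown on this page, so the sketch below takes a manual route instead: it reads a CSV with Python's csv module and registers each row through the two-argument register_noncanonical_aa(name, smiles) call used in the example earlier. The function name and the "name" and "smiles" column headers are assumptions for illustration. A recording stand-in replaces the real registry so the snippet runs without PepLink installed.

```python
import csv
import io


def register_from_csv(handle, register=None):
    """Register one residue per CSV row and return the registered names.

    Hypothetical helper; expects 'name' and 'smiles' columns (assumed layout).
    """
    if register is None:
        from PepLink import register_noncanonical_aa as register

    names = []
    for row in csv.DictReader(handle):
        name = (row.get("name") or "").strip()
        smiles = (row.get("smiles") or "").strip()
        if not name or not smiles:
            continue  # skip incomplete rows rather than registering blanks
        register(name, smiles)
        names.append(name)
    return names


# Demo: in-memory CSV plus a dict that records what would be registered.
registered = {}
demo_csv = io.StringIO("name,smiles\nMyAA,N[C@@H](CC)C(=O)O\nBroken,\n")
print(register_from_csv(demo_csv, register=registered.__setitem__))  # ['MyAA']
```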

Can it help with cyclic peptide representations?

Yes, for selected forward-generation cases, including documented cyclic and intramolecular linkage classes. The reverse parser is more conservative and deliberately narrower in scope than the forward builder.

Is it suitable for automation or agent workflows?

Yes. PepLink exposes predictable Python functions, structured reverse outputs, and explicit unsupported reasons, which makes it practical for API wrappers, MCP tools, and AI-agent backends.

Where can I find the source code and documentation?

Start with the GitHub repository for source, README examples, issues, and releases: github.com/DragonDescentZerotsu/PepLink.

Try PepLink

Install the package, explore the examples, and use it in your peptide pipeline

If you need peptide-to-SMILES conversion, peptide SELFIES export, or a conservative SMILES-to-peptide utility for automation, PepLink is ready to evaluate in a lightweight Python workflow.

Citation

Cite PepLink in related research

If you find this project useful, please cite:

@article{leng2025predicting,
  title={Predicting and generating antibiotics against future pathogens with ApexOracle},
  author={Leng, Tianang and Wan, Fangping and Torres, Marcelo Der Torossian and de la Fuente-Nunez, Cesar},
  journal={arXiv preprint arXiv:2507.07862},
  year={2025}
}