🔬 Mapping the Landscape of FDA-Approved Anti-Neoplastic Small Molecules
Over the past few weeks, I’ve been curating and structuring a dataset of FDA-approved anti-neoplastic small molecules. It starts from standardized SMILES and integrates approval metadata, physicochemical descriptors, and structural biology annotations (initially sourced from ChEMBL).
The aim was not simply to compile a list, but to transform it into an explorable chemical and structural landscape.
🧬 1️⃣ Structural Overview
Even visually, you can observe:
- The transition from classic cytotoxic scaffolds
- The rise of kinase inhibitors
- Increasing molecular complexity over decades
- The appearance of macrocycles and larger architectures
This reflects the broader evolution of oncology drug design.
📊 2️⃣ Physicochemical Space
Plotting Molecular Weight vs cLogP reveals:
- Clear clustering of targeted therapies
- Highly polar nucleoside analogues
- Large macrocyclic outliers
- Expansion of “drug-like” space over time
The plot represents three parameters in a single 2D view, which helps contextualize how medicinal chemistry strategies have shifted historically.
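To put numbers behind the "over time" observations, one option is to group compounds by approval decade and compare median Molecular Weight and cLogP. The sketch below is only an illustration: the row layout, the names `decade_of` and `summarize_by_decade`, and every value in `rows` are placeholders I made up, not entries from the dataset.

```python
from collections import defaultdict
from statistics import median

# Placeholder rows: (name, approval_year, molecular_weight, clogp).
# All values are invented for illustration and are NOT real drug data.
rows = [
    ("example_a", 1978, 250.0, -1.2),
    ("example_b", 1985, 310.0, 0.4),
    ("example_c", 2004, 480.0, 3.9),
    ("example_d", 2009, 520.0, 4.4),
    ("example_e", 2018, 610.0, 4.1),
]


def decade_of(year):
    """Return the first year of the decade, e.g. 1985 -> 1980."""
    return (year // 10) * 10


def summarize_by_decade(rows):
    """Group rows by approval decade and report count and medians."""
    grouped = defaultdict(list)
    for _name, year, mw, clogp in rows:
        grouped[decade_of(year)].append((mw, clogp))

    summary = {}
    for decade in sorted(grouped):
        mws = [mw for mw, _ in grouped[decade]]
        clogps = [clogp for _, clogp in grouped[decade]]
        summary[decade] = {
            "n": len(mws),
            "median_mw": median(mws),
            "median_clogp": median(clogps),
        }
    return summary


summary = summarize_by_decade(rows)
for decade, stats in summary.items():
    print(f"{decade}s: n={stats['n']}, "
          f"median MW={stats['median_mw']:.1f}, "
          f"median cLogP={stats['median_clogp']:.2f}")
```

Medians are used rather than means so that a few large macrocyclic outliers do not dominate a decade with few approvals.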
📈 3️⃣ Multi-Dimensional Structural Biology View
Using DataWarrior, I generated a 3D landscape including:
- Molecular Weight
- Number of UniProt-associated targets
- Number of available PDB structures
- Color-coded by approval year
- Marker size reflecting ring count
This creates a 5-dimensional representation in a single interactive view.
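For anyone who wants to reproduce this kind of view, the five dimensions can be held in one flat record per compound and written to a tab-delimited text file, a format DataWarrior can import. This is a minimal sketch under my own assumptions: the `DrugRecord` class, its field names, and `to_tsv` are hypothetical, and the single example row uses benzene purely as a syntactically valid SMILES placeholder with invented counts and year.

```python
import csv
import io
from dataclasses import astuple, dataclass, fields


@dataclass
class DrugRecord:
    name: str
    smiles: str
    molecular_weight: float  # plotted on an axis
    n_uniprot_targets: int   # plotted on an axis
    n_pdb_structures: int    # plotted on an axis
    approval_year: int       # mapped to marker color
    ring_count: int          # mapped to marker size


def to_tsv(records):
    """Serialize records to tab-delimited text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow([f.name for f in fields(DrugRecord)])
    for record in records:
        writer.writerow(astuple(record))
    return buffer.getvalue()


# Placeholder row: benzene is not a drug; counts and year are invented.
records = [DrugRecord("example_a", "c1ccccc1", 78.11, 3, 12, 2000, 1)]
print(to_tsv(records))
```

Keeping one column per visual channel makes it easy to remap, for example, color to ring count and size to approval year without touching the data.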
An important nuance:
A higher number of UniProt entries does not automatically imply promiscuity.
It may reflect:
- Cross-species target annotation
- Extensive structural biology investigation
- Historical research intensity
- Drug repurposing efforts
A natural next step would be to analyze:
• Are these targets orthologous across organisms?
• Do some drugs truly exhibit multi-target pharmacology?
• Can we disentangle biological promiscuity from research bias?
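A rough first pass at the cross-species question could compare the raw number of UniProt entries per drug with the number of distinct genes after collapsing by gene symbol. The sketch below assumes a hypothetical annotation table; the accessions, gene symbols, and the `target_counts` helper are placeholders of mine. Case-insensitive symbol matching is only a crude proxy for orthology, and a real analysis would rely on a dedicated orthology resource.

```python
from collections import defaultdict

# Hypothetical annotations: (drug, uniprot_accession, gene_symbol, organism).
# Accessions and gene symbols are placeholders, not real identifiers.
annotations = [
    ("example_a", "ACC0001", "GENE1", "Homo sapiens"),
    ("example_a", "ACC0002", "Gene1", "Mus musculus"),
    ("example_a", "ACC0003", "GENE1", "Rattus norvegicus"),
    ("example_b", "ACC0004", "GENE2", "Homo sapiens"),
    ("example_b", "ACC0005", "GENE3", "Homo sapiens"),
]


def target_counts(annotations):
    """Count raw UniProt entries and distinct gene symbols per drug."""
    accessions = defaultdict(set)
    genes = defaultdict(set)
    for drug, accession, gene, _organism in annotations:
        accessions[drug].add(accession)
        # Upper-casing merges e.g. human GENE1 with mouse Gene1
        # (an assumption that symbols match across species).
        genes[drug].add(gene.upper())
    return {
        drug: {
            "uniprot_entries": len(accessions[drug]),
            "distinct_genes": len(genes[drug]),
        }
        for drug in accessions
    }


counts = target_counts(annotations)
for drug, c in counts.items():
    print(drug, c)
```

In this toy input, `example_a` has three UniProt entries but one distinct gene (cross-species annotation), while `example_b` has two entries and two genes (a candidate for genuine multi-target pharmacology).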
This is where the dataset becomes biologically interesting — not just chemically descriptive.
Why this matters
Understanding oncology drug evolution across structure, physicochemistry, and target space can support:
- Docking benchmark construction
- Selectivity modeling
- ML feature generation
- Target network analysis
- Drug evolution quantification
This is a foundation for a structured, ML-ready oncology drug landscape.
Questions for the Community
I would greatly appreciate your perspective:
• Are there curated FDA oncology drug lists beyond ChEMBL that you recommend?
(e.g., ChemOncology, DrugBank, FDA Orange Book integrations, others?)
• Would integrating binding affinity data meaningfully improve this landscape?
• Should cross-species target mapping be separated from true multi-target pharmacology?
• Are there structural biology metrics that would add value here?
• What additional dimensions would you want visualized?
I’m especially interested in perspectives that connect structural biology, medicinal chemistry, and computational modeling.
More to come as this evolves.
Evangelos Papadopoulos
Computational Structural Biology | Drug Discovery | Molecular Modeling