Bioinformatics

Papers
(The TQCC of Bioinformatics is 14. The table below lists the papers above that threshold, based on CrossRef citation counts (maximum 250 papers). It covers papers published in the past four years, i.e., from 2022-05-01 to 2026-05-01.)
Article | Citations
DivPro: diverse protein sequence design with direct structure recovery guidance | 2121
RVINN: a flexible modeling for inferring dynamic transcriptional and post-transcriptional regulation using physics-informed neural networks | 1739
CondiS web app: imputation of censored lifetimes for machine learning-based survival analysis | 988
Correction to: GTExVisualizer: a web platform for supporting ageing studies | 603
ProteinLIPs: a web server for identifying highly polar and poorly packed interfaces in proteins | 400
IntegrAlign: a comprehensive tool for multi-immunofluorescence panel integration through image alignment | 336
NOODAI: a webserver for network-oriented multi-omics data analysis and integration pipeline | 207
deTELpy: Python package for high-throughput detection of amino acid substitutions in mass spectrometry datasets | 171
Memory-efficient, accelerated protein interaction inference with blocked, multi-GPU D-SCRIPT | 165
CompareM2 is a genomes-to-report pipeline for comparing microbial genomes | 134
MRDagent: iterative and adaptive parameter optimization for stable ctDNA-based MRD detection in heterogeneous samples | 129
Viral Diseases Explorer: a webtool to identify viral disease information derived from multiple LLMs | 116
Accurate assembly of multiple RNA-seq samples with Aletsch | 101
Mixtum: a graphical tool for two-way admixture analysis in population genetics based on f-statistics | 100
FastDup: a scalable duplicate marking tool using speculation-and-test mechanism | 98
FracFixR: a compositional statistical framework for absolute proportion estimation between fractions in RNA sequencing data | 96
From genes to trajectories: mapping genetic influences on Huntington’s disease progression | 90
getDNB: identifying dynamic network biomarkers of hepatocellular carcinoma from time-varying gene regulations utilizing graph embedding techniques for anomaly detection | 87
MCOAN: multimodal contrastive representation learning for cross-omics adaptive disease regulatory network prediction | 85
Statistical framework to determine indel-length distribution | 80
MAFFIN: metabolomics sample normalization using maximal density fold change with high-quality metabolic features and corrected signal intensities | 79
ATLIGATOR: editing protein interactions with an atlas-based approach | 77
EvoAug-TF: extending evolution-inspired data augmentations for genomic deep learning to TensorFlow | 76
Icolos: a workflow manager for structure-based post-processing of de novo generated small molecules | 74
Increasing confidence in proteomic spectral deconvolution through mass defect | 74
The 2025 ISCB Accomplishments by a Senior Scientist Award—Dr Amos Bairoch | 74
Mocafe: a comprehensive Python library for simulating cancer development with Phase Field Models | 73
HelixGAN a deep-learning methodology for conditional de novo design of α-helix structures | 72
MDCompress: better, faster compression of molecular dynamics simulation trajectories | 71
Reconstructing tumor clonal lineage trees incorporating single-nucleotide variants, copy number alterations and structural variations | 69
ADViSELipidomics: a workflow for analyzing lipidomics data | 68
Group-walk: a rigorous approach to group-wise false discovery rate analysis by target-decoy competition | 67
Estimation of cancer cell fractions and clone trees from multi-region sequencing of tumors | 67
The phers R package: using phenotype risk scores based on electronic health records to study Mendelian disease and rare genetic variants | 65
DeepPerVar: a multi-modal deep learning framework for functional interpretation of genetic variants in personal genome | 65
Inference of 3D genome architecture by modeling overdispersion of Hi-C data | 64
Deep Local Analysis deconstructs protein–protein interfaces and accurately estimates binding affinity changes upon mutation | 62
Response to the letter to the editor: On the feasibility of dynamical analysis of network models of biochemical regulation | 61
Decomposing mosaic tandem repeats accurately from long reads | 60
Fragmentstein—facilitating data reuse for cell-free DNA fragment analysis | 58
RNAsolo: a repository of cleaned PDB-derived RNA 3D structures | 57
Exploring automatic inconsistency detection for literature-based gene ontology annotation | 57
From high-throughput evaluation to wet-lab studies: advancing mutation effect prediction with a retrieval-enhanced model | 57
Random field modeling of multi-trait multi-locus association for detecting methylation quantitative trait loci | 56
MetBP: a software tool for detection of interaction between metal ion–RNA base pairs | 56
CANTATA—prediction of missing links in Boolean networks using genetic programming | 54
Harnessing deep learning for proteome-scale detection of amyloid signaling motifs | 54
FastSCODE: an accelerated SCODE algorithm for inferring gene regulatory networks on manycore processors | 54
TripLexicon: prediction and analysis of gene regulatory RNA–DNA interactions | 52
Evidential meta-model for molecular property prediction | 52
AFragmenter: schema-free, tuneable protein domain segmentation for AlphaFold protein structures | 52
WMDS.net: a network control framework for identifying key players in transcriptome programs | 51
insilicoSV: a flexible grammar-based framework for structural variant simulation and placement | 50
Aclust2.0: a revamped unsupervised R tool for Infinium methylation beadchips data analyses | 50
scSurv: a deep generative model for single-cell survival analysis | 50
Finding low-complexity DNA sequences with longdust | 48
Floria: fast and accurate strain haplotyping in metagenomes | 46
hapCon: estimating contamination of ancient genomes by copying from reference haplotypes | 46
MPBind: a multitask protein binding site predictor using protein language models and equivariant GNNs | 45
skandiver: a divergence-based analysis tool for identifying intercellular mobile genetic elements | 45
The FASTQ+ format and PISA | 45
Oarfish: enhanced probabilistic modeling leads to improved accuracy in long read transcriptome quantification | 43
SA2E: spatial-aware auto-encoder for cell type deconvolution of spatial transcriptomics data | 43
phylobar: an R package for multiresolution compositional barplots in omics studies | 43
Perceiver CPI: a nested cross-attention network for compound–protein interaction prediction | 42
Powerful molecule generation with simple ConvNet | 41
Delineating inter- and intra-antibody repertoire evolution with AntibodyForests | 41
ShortCake: an integrated platform for efficient and reproducible single-cell analysis | 41
MICER: a pre-trained encoder–decoder architecture for molecular image captioning | 41
EXPLANA: a user-friendly workflow for EXPLoratory ANAlysis and feature selection in cross-sectional and longitudinal microbiome studies | 39
hipFG: high-throughput harmonization and integration pipeline for functional genomics data | 38
Prediction and curation of missing biomedical identifier mappings with Biomappings | 38
ViReMaShiny: an interactive application for analysis of viral recombination data | 38
CFAGO: cross-fusion of network and attributes based on attention mechanism for protein function prediction | 38
Estimating sparse regression models in multi-task learning and transfer learning through adaptive penalisation | 38
BrainConnect: processing brain connectivity and spatial transcriptomics data for integrative analysis | 37
Columba: fast approximate pattern matching with optimized search schemes | 37
vaRHC: an R package for semi-automation of variant classification in hereditary cancer genes according to ACMG/AMP and gene-specific ClinGen guidelines | 37
High-sensitivity pattern discovery in large, paired multiomic datasets | 36
Functional lipid analysis via index-based lipidomics profile: a new computational module in LipidOne | 36
Prediction of gene co-expression from chromatin contacts with graph attention network | 36
Hierarchical reinforcement learning for automatic disease diagnosis | 36
Adaptive digital tissue deconvolution | 35
VDJ-Insights: simplifying the annotation of genomic immunoglobulin and T cell receptor regions | 35
Transfer learning for drug–target interaction prediction | 35
Structured Prompt Interrogation and Recursive Extraction of Semantics (SPIRES): a method for populating knowledge bases using zero-shot learning | 35
XSI—a genotype compression tool for compressive genomics in large biobanks | 35
Deciphering high-order structures in spatial transcriptomes with graph-guided Tucker decomposition | 34
ECCB2022: the 21st European Conference on Computational Biology | 34
STAR-GO: improving protein function prediction by learning to hierarchically integrate ontology-informed semantic embeddings | 34
OMEN: network-based driver gene identification using mutual exclusivity | 33
Trustworthy causal biomarker discovery: a multiomics brain imaging genetics-based approach | 33
Omnibus and robust deconvolution scheme for bulk RNA sequencing data integrating multiple single-cell reference sets and prior biological knowledge | 33
Correction of image distortion in large-field ssEM stitching by an unsupervised intermediate-space solving network | 33
Prediction of bacterial protein–compound interactions with only positive samples | 33
SL-Miner: a web server for mining evidence and prioritization of cancer-specific synthetic lethality | 33
Accessible, uniform protein property prediction with a scikit-learn based toolset AIDE | 33
Conformal inference for reliable single cell RNA-seq annotation | 32
LimROTS: a hybrid method integrating empirical Bayes and reproducibility-optimized statistics for robust differential expression analysis | 32
The cell as a token: high-dimensional geometry in language models and cell embeddings | 31
SpecieScan: semi-automated taxonomic identification of bone collagen peptides from MALDI-ToF-MS | 31
mHapTk: a comprehensive toolkit for the analysis of DNA methylation haplotypes | 30
scGrapHiC: deep learning-based graph deconvolution for Hi-C using single cell gene expression | 30
BAV-LLPS: a database of bacterial, archaea, and virus liquid–liquid phase separation proteins | 30
Graph-theoretical prediction of biological modules in quaternary structures of large protein complexes | 30
Generating synthetic genotypes using diffusion models | 29
Microbench: automated metadata management for systems biology benchmarking and reproducibility in Python | 29
Spectral clustering of single-cell multi-omics data on multilayer graphs | 29
Improving biomedical entity linking with generative relevance feedback | 29
Mining literature and pathway data to explore the relations of ketamine with neurotransmitters and gut microbiota using a knowledge-graph | 29
Using semantic search to find publicly available gene-expression datasets | 28
CProMG: controllable protein-oriented molecule generation with desired binding affinity and drug-like properties | 28
Using the UK Biobank as a global reference of worldwide populations: application to measuring ancestry diversity from GWAS summary statistics | 28
AdenPredictor: accurate prediction of the adenylation domain specificity of nonribosomal peptide biosynthetic gene clusters in microbial genomes | 28
Dogme: a nextflow pipeline for reprocessing nanopore RNA and DNA modifications | 28
The minimizer Jaccard estimator is biased and inconsistent | 28
RNA threading with secondary structure and sequence profile | 27
Single-cell mutation calling and phylogenetic tree reconstruction with loss and recurrence | 27
Geometry-complete perceptron networks for 3D molecular graphs | 27
Modified RNAs and predictions with the ViennaRNA Package | 27
Functional characterization of co-phosphorylation networks | 27
Conumee 2.0: enhanced copy-number variation analysis from DNA methylation arrays for humans and mice | 27
PiLSL: pairwise interaction learning-based graph neural network for synthetic lethality prediction in human cancers | 27
Cell type matching across species using protein embeddings and transfer learning | 26
PractiCPP: a deep learning approach tailored for extremely imbalanced datasets in cell-penetrating peptide prediction | 26
Avoiding C-hacking when evaluating survival distribution predictions with discrimination measures | 26
A physics-informed neural SDE network for learning cellular dynamics from time-series scRNA-seq data | 26
Galaxy Helm chart: a standardized method for deploying production Galaxy servers | 25
MIAMI: mutual information-based analysis of multiplex imaging data | 25
The 2024 ISCB Overton Prize Award—Dr Martin Steinegger | 25
HAMPLE: deciphering TF-DNA binding mechanism in different cellular environments by characterizing higher-order nucleotide dependency | 25
An automated multi-modal graph-based pipeline for mouse genetic discovery | 25
2023 ISCB Overton Prize: Jingyi Jessica Li | 25
MSNet-4mC: learning effective multi-scale representations for identifying DNA N4-methylcytosine sites | 25
scHiCPTR: unsupervised pseudotime inference through dual graph refinement for single-cell Hi-C data | 24
Phenotype prediction from single-cell RNA-seq data using attention-based neural networks | 24
Semi-supervised data-integrated feature importance enhances performance and interpretability of biological classification tasks | 24
SpatialRNA: a Python package for easy application of Graph Neural Network models on single-molecule spatial transcriptomics dataset | 24
Prediction of recovery from multiple organ dysfunction syndrome in pediatric sepsis patients | 24
Scbean: a python library for single-cell multi-omics data analysis | 24
MIO: microRNA target analysis system for immuno-oncology | 24
PeakBot: machine-learning-based chromatographic peak picking | 23
An approachable, flexible and practical machine learning workshop for biologists | 23
TaxTriage: an open-source metagenomic sequencing data analysis pipeline enabling putative pathogen detection | 23
Globally Accessible Distributed Data Sharing (GADDS): a decentralized FAIR platform to facilitate data sharing in the life sciences | 23
Prediction of HIV sensitivity to monoclonal antibodies using aminoacid sequences and deep learning | 23
StructuralDPPIV: a novel deep learning model based on atom structure for predicting dipeptidyl peptidase-IV inhibitory peptides | 23
CCC-GPU: a graphics processing unit (GPU)-accelerated nonlinear correlation coefficient for large-scale transcriptomic analyses | 23
Multistage attention-based extraction and fusion of protein sequence and structural features for protein function prediction | 23
Forseti: a mechanistic and predictive model of the splicing status of scRNA-seq reads | 23
Improving dictionary-based named entity recognition with deep learning | 23
Nezzle: an interactive and programmable visualization of biological networks in Python | 22
SPRISS: approximating frequent k-mers by sampling reads, and applications | 22
Foreign RNA spike-ins enable accurate allele-specific expression analysis at scale | 22
DeepLMI: deep feature mining with a globally enhanced graph convolutional network for robust lncRNA–miRNA interaction prediction | 22
2022 ISCB Accomplishments by a Senior Scientist Award: Ron Shamir | 22
Powerful and interpretable control of false discoveries in two-group differential expression studies | 22
The 2025 ISCB Overton Prize Award—Dr James Zou | 22
statgenMPP: an R package implementing an IBD-based mixed model approach for QTL mapping in a wide range of multi-parent populations | 22
Balancing complexity and clarity—towards clinician-ready antibiotic resistance prediction models | 22
A novel pipeline for computerized mouse spermatogenesis staging | 22
Joint inference of cell lineage and mitochondrial evolution from single-cell sequencing data | 21
FishFeats: streamlined quantification of multimodal labeling at the single-cell level in 3D tissues | 21
InterpolatedXY: a two-step strategy to normalize DNA methylation microarray data avoiding sex bias | 21
ReadItAndKeep: rapid decontamination of SARS-CoV-2 sequencing reads | 21
AHoJ: rapid, tailored search and retrieval of apo and holo protein structures for user-defined ligands | 21
LoRA-DR-suite: adapted embeddings predict intrinsic and soft disorder from protein sequences | 21
CellAnn: a comprehensive, super-fast, and user-friendly single-cell annotation web server | 21
Integrating curation into scientific publishing to train AI models | 21
Managing workflow executions with WESkit | 21
Position-Specific Enrichment Ratio Matrix scores predict antibody variant properties from deep sequencing data | 21
Polyphest: fast polyploid phylogeny estimation | 21
DeepProtein: deep learning library and benchmark for protein sequence learning | 21
ARTEMIS integrates autoencoders and Schrödinger Bridges to predict continuous dynamics of gene expression, cell population, and perturbation from time-series single-cell data | 21
HyperGraphs.jl: representing higher-order relationships in Julia | 21
CATH-ddG: towards robust mutation effect prediction on protein–protein interactions out of CATH homologous superfamily | 21
Optimal phylogenetic reconstruction of insertion and deletion events | 21
dsMTL: a computational framework for privacy-preserving, distributed multi-task machine learning | 20
IMPACT: interpretable microbial phenotype analysis via microbial characteristic traits | 20
3D GAN image synthesis and dataset quality assessment for bacterial biofilm | 20
Semantic-enhanced heterogeneous graph learning for identifying ncRNAs associated with drug resistance | 20
ViTAL: Vision TrAnsformer based Low coverage SARS-CoV-2 lineage assignment | 20
CMAtlas: a comprehensive DNA methylation atlas for exploring epigenetic alterations in 34 human cancer types | 20
A unified mediation analysis framework for integrative cancer proteogenomics with clinical outcomes | 20
MixingDTA: improved drug–target affinity prediction by extending mixup with guilt-by-association | 20
SimBu: bias-aware simulation of bulk RNA-seq data with variable cell-type composition | 20
ConceptDrift: leveraging spatial, temporal and semantic evolution of biomedical concepts for hypothesis generation | 20
Somatic mutation effects diffused over microRNA dysregulation | 20
RiboGraph: an interactive visualization system for ribosome profiling data at read length resolution | 20
SNIKT: sequence-independent adapter identification and removal in long-read shotgun sequencing data | 20
Determining epitope specificity of T-cell receptors with transformers | 20
SPEAR: Systematic ProtEin AnnotatoR | 20
Expanding the coverage of spatial proteomics: a machine learning approach | 20
sedimix: a workflow for the analysis of hominin nuclear DNA sequences from sediments | 20
Phylogenetic diversity statistics for all clades in a phylogeny | 20
Pycallingcards: an integrated environment for visualizing, analyzing, and interpreting Calling Cards data | 20
3D Optical Coherence Tomography image processing in BISCAP: characterization of biofilm structure and properties | 19
Graph attention network for link prediction of gene regulations from single-cell RNA-sequencing data | 19
GASTON-Mix: a unified model of spatial gradients and domains using spatial mixture-of-experts | 19
CIBRA identifies genomic alterations with a system-wide impact on tumor biology | 19
RISK: a next-generation tool for biological network annotation and visualization | 19
TEspeX: consensus-specific quantification of transposable element expression preventing biases from exonized fragments | 19
Looking at the BiG picture: incorporating bipartite graphs in drug response prediction | 19
ODGI: understanding pangenome graphs | 19
Biological Random Walks: multi-omics integration for disease gene prioritization | 19
NFTest: automated testing of Nextflow pipelines | 19
Bayesian inference of fitness landscapes via tree-structured branching processes | 19
AttentionPert: accurately modeling multiplexed genetic perturbations with multi-scale effects | 19
G4STAB: a multi-input deep learning model to predict G-quadruplex thermodynamic stability based on sequence and salt concentration | 19
CODEX: COunterfactual Deep learning for the in silico EXploration of cancer cell line perturbations | 19
DDAffinity: predicting the changes in binding affinity of multiple point mutations using protein 3D structure | 19
REUNION: transcription factor binding prediction and regulatory association inference from single-cell multi-omics data | 19
A penalized linear mixed model with generalized method of moments estimators for complex phenotype prediction | 18
Optimal sequencing budget allocation for trajectory reconstruction of single cells | 18
Closing the computational biology ‘knowledge gap’: Spanish Wikipedia as a case study | 18
Cleanifier: contamination removal from microbial sequences using spaced seeds of a human pangenome index | 18
SVJedi-graph: improving the genotyping of close and overlapping structural variants with long reads using a variation graph | 18
Manifold classification of neuron types from microscopic images | 18
A deep learning framework for comprehensive prediction of human RNA G-quadruplex-binding proteins | 18
Hierarchical modelling of microbial communities | 18
PERSEUS: an interactive and intuitive web-based tool for pedigree visualization | 18
Exploiting pretrained biochemical language models for targeted drug design | 18
MolMVC: Enhancing molecular representations for drug-related tasks through multi-view contrastive learning | 18
konnect2prot: a web application to explore the protein properties in a functional protein–protein interaction network | 18
Atomic protein structure refinement using all-atom graph representations and SE(3)-equivariant graph transformer | 18
Efficient algorithms for simulating sequences along a phylogenetic tree | 18
Predicted structural proteome of Sphagnum divinum and proteome-scale annotation | 18
MolCL-SP: a multimodal contrastive learning framework with non-overlapping substructure perturbations for molecular property prediction | 18
Strategies for robust, accurate, and generalizable benchmarking of drug discovery platforms | 18
GRUMB: a genome-resolved metagenomic framework for monitoring urban microbiomes and diagnosing pathogen risk | 18
JBrowse Jupyter: a Python interface to JBrowse 2 | 18
PlasmoFAB: a benchmark to foster machine learning for Plasmodium falciparum protein antigen candidate prediction | 18
HCS—hierarchical algorithm for simulation of omics datasets | 17
TARO: tree-aggregated factor regression for microbiome data integration | 17
NMRpQuant: an automated software for large scale urinary total protein quantification by one-dimensional 1H NMR profiles | 17
FlowDock: Geometric flow matching for generative protein–ligand docking and affinity prediction | 17
Common data model for COVID-19 datasets | 17
chem16S: community-level chemical metrics for exploring genomic adaptation to environments | 17
CALDERA: finding all significant de Bruijn subgraphs for bacterial GWAS | 17
PanTools v3: functional annotation, classification and phylogenomics | 17
Probabilistic pathway-based multimodal factor analysis | 17
CREMSA: compressed indexing of (ultra) large multiple sequence alignments | 17
Assembly and reasoning over semantic mappings at scale for biomedical data integration | 17
M-Ionic: prediction of metal-ion-binding sites from sequence using residue embeddings | 17
Joint registration of multiple point clouds for fast particle fusion in localization microscopy | 17
GAN-based data augmentation for transcriptomics: survey and comparative assessment | 17
HNOXPred: a web tool for the prediction of gas-sensing H-NOX proteins from amino acid sequence | 17
The 2024 ISCB Accomplishments by a Senior Scientist Award—Dr Tandy Warnow | 16
Transcriptome-wide prediction of prostate cancer gene expression from histopathology images using co-expression-based convolutional neural networks | 16
Drug–Protein interaction prediction by correcting the effect of incomplete information in heterogeneous information | 16