Bioinformatics

Papers
(The TQCC of Bioinformatics is 15. The table below lists those papers that are above that threshold based on CrossRef citation counts [max. 250 papers]. The publications cover those that have been published in the past four years, i.e., from 2021-09-01 to 2025-09-01.)
Article | Citations
CondiS web app: imputation of censored lifetimes for machine learning-based survival analysis | 1517
ATLIGATOR: editing protein interactions with an atlas-based approach | 1079
Detecting spatially co-expressed gene clusters with functional coherence by graph-regularized convolutional neural network | 856
LPTD: a novel linear programming-based topology determination method for cryo-EM maps | 742
Mocafe: a comprehensive Python library for simulating cancer development with Phase Field Models | 617
Integrated Genome Browser App Store | 613
PANPROVA: pangenomic prokaryotic evolution of full assemblies | 335
MuWU: Mutant-seq library analysis and annotation | 302
Icolos: a workflow manager for structure-based post-processing of de novo generated small molecules | 291
Idéfix: identifying accidental sample mix-ups in biobanks using polygenic scores | 232
Accurate assembly of multiple RNA-seq samples with Aletsch | 204
Deep Subspace Mutual Learning for cancer subtypes prediction | 203
Completing gene trees without species trees in sub-quadratic time | 182
EvoAug-TF: extending evolution-inspired data augmentations for genomic deep learning to TensorFlow | 177
HelixGAN a deep-learning methodology for conditional de novo design of α-helix structures | 144
Statistical framework to determine indel-length distribution | 135
Correction to: GTExVisualizer: a web platform for supporting ageing studies | 126
deTELpy: Python package for high-throughput detection of amino acid substitutions in mass spectrometry datasets | 120
The 2025 ISCB Accomplishments by a Senior Scientist Award—Dr Amos Bairoch | 118
DivPro: diverse protein sequence design with direct structure recovery guidance | 114
RVINN: a flexible modeling for inferring dynamic transcriptional and post-transcriptional regulation using physics-informed neural networks | 112
MAFFIN: metabolomics sample normalization using maximal density fold change with high-quality metabolic features and corrected signal intensities | 108
Increasing confidence in proteomic spectral deconvolution through mass defect | 107
SimPlot++: a Python application for representing sequence similarity and detecting recombination | 99
monaLisa: an R/Bioconductor package for identifying regulatory motifs | 90
TRANSDIRE: data-driven direct reprogramming by a pioneer factor-guided trans-omics approach | 90
PsiNorm: a scalable normalization for single-cell RNA-seq data | 88
Reconstructing tumor clonal lineage trees incorporating single-nucleotide variants, copy number alterations and structural variations | 88
MRDagent: Iterative and Adaptive Parameter Optimisation for stable ctDNA-Based MRD Detection in Heterogeneous Samples | 87
DeepPerVar: a multi-modal deep learning framework for functional interpretation of genetic variants in personal genome | 84
Prediction of whole-cell transcriptional response with machine learning | 83
VeloViz: RNA velocity-informed embeddings for visualizing cellular trajectories | 82
Estimation of cancer cell fractions and clone trees from multi-region sequencing of tumors | 81
Cross-species prediction of essential genes in insects | 81
bollito: a flexible pipeline for comprehensive single-cell RNA-seq analyses | 81
DeepSVP: integration of genotype and phenotype for structural variant prioritization using deep learning | 81
The FASTQ+ format and PISA | 80
ADViSELipidomics: a workflow for analyzing lipidomics data | 80
Deep Local Analysis deconstructs protein–protein interfaces and accurately estimates binding affinity changes upon mutation | 78
MICER: a pre-trained encoder–decoder architecture for molecular image captioning | 78
skandiver: a divergence-based analysis tool for identifying intercellular mobile genetic elements | 76
ProSynAR: a reference aware read merger | 73
Exploring automatic inconsistency detection for literature-based gene ontology annotation | 71
MAGUS+eHMMs: improved multiple sequence alignment accuracy for fragmentary sequences | 71
Erratum to: GADGETS: a genetic algorithm for detecting epistasis using nuclear families | 70
Aclust2.0: a revamped unsupervised R tool for Infinium methylation beadchips data analyses | 69
Random field modeling of multi-trait multi-locus association for detecting methylation quantitative trait loci | 69
MetBP: a software tool for detection of interaction between metal ion–RNA base pairs | 67
Response to the letter to the editor: On the feasibility of dynamical analysis of network models of biochemical regulation | 66
trfermikit: a tool to discover VNTR-associated deletions | 66
RNAglib: a python package for RNA 2.5 D graphs | 65
Continual knowledge infusion into pre-trained biomedical language models | 65
Group-walk: a rigorous approach to group-wise false discovery rate analysis by target-decoy competition | 64
PyLiger: scalable single-cell multi-omic data integration in Python | 63
GMNN2CD: identification of circRNA–disease associations based on variational inference and graph Markov neural networks | 61
CANTATA—prediction of missing links in Boolean networks using genetic programming | 61
Oarfish: enhanced probabilistic modeling leads to improved accuracy in long read transcriptome quantification | 60
From high-throughput evaluation to wet-lab studies: advancing mutation effect prediction with a retrieval-enhanced model | 60
Harnessing deep learning for proteome-scale detection of amyloid signaling motifs | 59
The phers R package: using phenotype risk scores based on electronic health records to study Mendelian disease and rare genetic variants | 58
Inference of 3D genome architecture by modeling overdispersion of Hi-C data | 58
DRUMMER—rapid detection of RNA modifications through comparative nanopore sequencing | 57
hapCon: estimating contamination of ancient genomes by copying from reference haplotypes | 57
The ENDS of assumptions: an online tool for the epistemic non-parametric drug–response scoring | 57
Decomposing mosaic tandem repeats accurately from long reads | 56
Fragmentstein—facilitating data reuse for cell-free DNA fragment analysis | 56
Evidential meta-model for molecular property prediction | 56
Floria: fast and accurate strain haplotyping in metagenomes | 55
Perceiver CPI: a nested cross-attention network for compound–protein interaction prediction | 54
RNAsolo: a repository of cleaned PDB-derived RNA 3D structures | 54
Deciphering high-order structures in spatial transcriptomes with graph-guided Tucker decomposition | 53
WMDS.net: a network control framework for identifying key players in transcriptome programs | 53
From viral evolution to spatial contagion: a biologically modulated Hawkes model | 52
HDMC: a novel deep learning-based framework for removing batch effects in single-cell RNA-seq data | 52
Powerful molecule generation with simple ConvNet | 52
COVID-19 Spread Mapper: a multi-resolution, unified framework and open-source tool | 52
Prediction of gene co-expression from chromatin contacts with graph attention network | 51
EDTox: an R Shiny application to predict the endocrine disruption potential of compounds | 51
ViReMaShiny: an interactive application for analysis of viral recombination data | 50
BATL: Bayesian annotations for targeted lipidomics | 50
Hierarchical reinforcement learning for automatic disease diagnosis | 50
XSI—a genotype compression tool for compressive genomics in large biobanks | 49
Adaptive digital tissue deconvolution | 49
minoTour, real-time monitoring and analysis for nanopore sequencers | 49
CFAGO: cross-fusion of network and attributes based on attention mechanism for protein function prediction | 48
vaRHC: an R package for semi-automation of variant classification in hereditary cancer genes according to ACMG/AMP and gene-specific ClinGen guidelines | 48
Single-cell RNA sequencing data analysis based on non-uniform ε−neighborhood network | 48
hipFG: high-throughput harmonization and integration pipeline for functional genomics data | 48
Prediction and curation of missing biomedical identifier mappings with Biomappings | 48
LinkExplorer: predicting, explaining and exploring links in large biomedical knowledge graphs | 47
Structured Prompt Interrogation and Recursive Extraction of Semantics (SPIRES): a method for populating knowledge bases using zero-shot learning | 46
DeepTrio: a ternary prediction system for protein–protein interaction using mask multiple parallel convolutional neural networks | 46
BridgeDPI: a novel Graph Neural Network for predicting drug–protein interactions | 45
Erratum to: Assessing heterogeneity in spatial data using the HTA index with applications to spatial transcriptomics and imaging | 44
scPOEM: Robust Co-embedding of Peaks and Genes Revealing Peak-Gene Regulation | 44
High-sensitivity pattern discovery in large, paired multiomic datasets | 44
Transfer learning for drug–target interaction prediction | 44
Cell type matching across species using protein embeddings and transfer learning | 43
Correction of image distortion in large-field ssEM stitching by an unsupervised intermediate-space solving network | 43
ECCB2022: the 21st European Conference on Computational Biology | 43
Prowler: a novel trimming algorithm for Oxford Nanopore sequence data | 43
Microbench: automated metadata management for systems biology benchmarking and reproducibility in Python | 43
Avoiding C-hacking when evaluating survival distribution predictions with discrimination measures | 43
OMEN: network-based driver gene identification using mutual exclusivity | 43
Omnibus and robust deconvolution scheme for bulk RNA sequencing data integrating multiple single-cell reference sets and prior biological knowledge | 42
LipidOne: user-friendly lipidomic data analysis tool for a deeper interpretation in a systems biology scenario | 42
Tightly integrated multiomics-based deep tensor survival model for time-to-event prediction | 42
MS-Decipher: a user-friendly proteome database search software with an emphasis on deciphering the spectra of O-linked glycopeptides | 41
RawHummus: an R Shiny app for automated raw data quality control in metabolomics | 41
Comprehensive comparison of two types of algorithm for circRNA detection from short-read RNA-Seq | 41
The minimizer Jaccard estimator is biased and inconsistent | 41
LOCAN: a python library for analyzing single-molecule localization microscopy data | 41
scGrapHiC: deep learning-based graph deconvolution for Hi-C using single cell gene expression | 41
Functional characterization of co-phosphorylation networks | 41
Generating synthetic genotypes using diffusion models | 40
Trustworthy causal biomarker discovery: a multiomics brain imaging genetics-based approach | 40
CProMG: controllable protein-oriented molecule generation with desired binding affinity and drug-like properties | 40
PractiCPP: a deep learning approach tailored for extremely imbalanced datasets in cell-penetrating peptide prediction | 39
Efficient gradient boosting for prognostic biomarker discovery | 39
Geometry-complete perceptron networks for 3D molecular graphs | 39
RNA threading with secondary structure and sequence profile | 39
Quantifying and correcting slide-to-slide variation in multiplexed immunofluorescence images | 39
SL-Miner: a web server for mining evidence and prioritization of cancer-specific synthetic lethality | 38
Spectral clustering of single-cell multi-omics data on multilayer graphs | 38
Mining literature and pathway data to explore the relations of ketamine with neurotransmitters and gut microbiota using a knowledge-graph | 38
Graph-theoretical prediction of biological modules in quaternary structures of large protein complexes | 38
A physics-informed neural SDE network for learning cellular dynamics from time-series scRNA-seq data | 37
Modified RNAs and predictions with the ViennaRNA Package | 37
Conumee 2.0: enhanced copy-number variation analysis from DNA methylation arrays for humans and mice | 37
Single-cell mutation calling and phylogenetic tree reconstruction with loss and recurrence | 37
PltDB: a blood platelets-based gene expression database for disease investigation | 37
PiLSL: pairwise interaction learning-based graph neural network for synthetic lethality prediction in human cancers | 37
Multi-level attention graph neural network based on co-expression gene modules for disease diagnosis and prognosis | 36
AdenPredictor: accurate prediction of the adenylation domain specificity of nonribosomal peptide biosynthetic gene clusters in microbial genomes | 36
SpecieScan: semi-automated taxonomic identification of bone collagen peptides from MALDI-ToF-MS | 36
tcplfit2: an R-language general purpose concentration–response modeling package | 36
PST-PRNA: prediction of RNA-binding sites using protein surface topography and deep learning | 36
mHapTk: a comprehensive toolkit for the analysis of DNA methylation haplotypes | 36
2023 ISCB Overton Prize: Jingyi Jessica Li | 35
echolocatoR: an automated end-to-end statistical and functional genomic fine-mapping pipeline | 35
The 2024 ISCB Overton Prize Award—Dr Martin Steinegger | 35
Using the UK Biobank as a global reference of worldwide populations: application to measuring ancestry diversity from GWAS summary statistics | 35
HAMPLE: deciphering TF-DNA binding mechanism in different cellular environments by characterizing higher-order nucleotide dependency | 34
Galaxy Helm chart: a standardized method for deploying production Galaxy servers | 33
iSFun: an R package for integrative dimension reduction analysis | 33
Scbean: a python library for single-cell multi-omics data analysis | 33
MSNet-4mC: learning effective multi-scale representations for identifying DNA N4-methylcytosine sites | 33
An approachable, flexible and practical machine learning workshop for biologists | 32
Importance-Penalized Joint Graphical Lasso (IPJGL): differential network inference via GGMs | 32
Computational modeling of mRNA degradation dynamics using deep neural networks | 32
SCONCE: a method for profiling copy number alterations in cancer evolution using single-cell whole genome sequencing | 32
StructuralDPPIV: a novel deep learning model based on atom structure for predicting dipeptidyl peptidase-IV inhibitory peptides | 31
scHiCPTR: unsupervised pseudotime inference through dual graph refinement for single-cell Hi-C data | 31
MIO: microRNA target analysis system for immuno-oncology | 31
scSGL: kernelized signed graph learning for single-cell gene regulatory network inference | 31
MCRL: using a reference library to compress a metagenome into a non-redundant list of sequences, considering viruses as a case study | 31
scanMiR: a biochemically based toolkit for versatile and efficient microRNA target prediction | 31
Globally Accessible Distributed Data Sharing (GADDS): a decentralized FAIR platform to facilitate data sharing in the life sciences | 31
STAAR workflow: a cloud-based workflow for scalable and reproducible rare variant analysis | 30
MIAMI: mutual information-based analysis of multiplex imaging data | 30
Efficient change-points detection for genomic sequences via cumulative segmented regression | 30
GEnView: a gene-centric, phylogeny-based comparative genomics pipeline for bacterial genomes and plasmids | 30
Forseti: a mechanistic and predictive model of the splicing status of scRNA-seq reads | 30
PeakBot: machine-learning-based chromatographic peak picking | 30
statgenMPP: an R package implementing an IBD-based mixed model approach for QTL mapping in a wide range of multi-parent populations | 30
OPUS-X: an open-source toolkit for protein torsion angles, secondary structure, solvent accessibility, contact map predictions and 3D folding | 29
KCOSS: an ultra-fast k-mer counter for assembled genome analysis | 29
Semi-supervised data-integrated feature importance enhances performance and interpretability of biological classification tasks | 29
Prediction of recovery from multiple organ dysfunction syndrome in pediatric sepsis patients | 29
Prediction of HIV sensitivity to monoclonal antibodies using aminoacid sequences and deep learning | 29
AbDiver: a tool to explore the natural antibody landscape to aid therapeutic design | 29
An automated multi-modal graph-based pipeline for mouse genetic discovery | 29
Multistage attention-based extraction and fusion of protein sequence and structural features for protein function prediction | 29
Deep graph representations embed network information for robust disease marker identification | 29
spatialTIME and iTIME: R package and Shiny application for visualization and analysis of immunofluorescence data | 28
SBGNview: towards data analysis, integration and visualization on all pathways | 28
ToxIBTL: prediction of peptide toxicity based on information bottleneck and transfer learning | 28
Phenotype prediction from single-cell RNA-seq data using attention-based neural networks | 28
Multiomix: a cloud-based platform to infer cancer genomic and epigenomic events associated with gene expression modulation | 28
Foreign RNA spike-ins enable accurate allele-specific expression analysis at scale | 27
ELIXIR biovalidator for semantic validation of life science metadata | 27
WGA-LP: a pipeline for whole genome assembly of contaminated reads | 27
2022 ISCB Accomplishments by a Senior Scientist Award: Ron Shamir | 27
Improving dictionary-based named entity recognition with deep learning | 27
GADGETS: a genetic algorithm for detecting epistasis using nuclear families | 27
SplicingFactory—splicing diversity analysis for transcriptome data | 27
Testing microbiome association using integrated quantile regression models | 27
dsMTL: a computational framework for privacy-preserving, distributed multi-task machine learning | 27
Polyphest: fast polyploid phylogeny estimation | 27
SPRISS: approximating frequent k-mers by sampling reads, and applications | 27
Nezzle: an interactive and programmable visualization of biological networks in Python | 27
The 2025 ISCB Overton Prize Award—Dr James Zou | 26
ARTEMIS integrates autoencoders and Schrödinger Bridges to predict continuous dynamics of gene expression, cell population, and perturbation from time-series single-cell data | 26
A novel pipeline for computerized mouse spermatogenesis staging | 26
LoRA-DR-suite: adapted embeddings predict intrinsic and soft disorder from protein sequences | 26
MixingDTA: improved drug–target affinity prediction by extending mixup with guilt-by-association | 26
Learning sparse log-ratios for high-throughput sequencing data | 26
CATH-ddG: towards robust mutation effect prediction on protein–protein interactions out of CATH homologous superfamily | 26
ReadItAndKeep: rapid decontamination of SARS-CoV-2 sequencing reads | 26
InterpolatedXY: a two-step strategy to normalize DNA methylation microarray data avoiding sex bias | 25
ORT: a workflow linking genome-scale metabolic models with reactive transport codes | 25
CellAnn: a comprehensive, super-fast, and user-friendly single-cell annotation web server | 25
PDMDA: predicting deep-level miRNA–disease associations with graph neural networks and sequence features | 25
Powerful and interpretable control of false discoveries in two-group differential expression studies | 25
Optimal phylogenetic reconstruction of insertion and deletion events | 25
Position-Specific Enrichment Ratio Matrix scores predict antibody variant properties from deep sequencing data | 25
AHoJ: rapid, tailored search and retrieval of apo and holo protein structures for user-defined ligands | 25
Joint inference of cell lineage and mitochondrial evolution from single-cell sequencing data | 25
IMPACT: interpretable microbial phenotype analysis via microbial characteristic traits | 25
CIndex: compressed indexes for fast retrieval of FASTQ files | 24
Driver gene detection through Bayesian network integration of mutation and expression profiles | 24
MungeSumstats: a Bioconductor package for the standardization and quality control of many GWAS summary statistics | 24
RAxML Grove: an empirical phylogenetic tree database | 24
ATHENA: analysis of tumor heterogeneity from spatial omics measurements | 24
Phylogenetic diversity statistics for all clades in a phylogeny | 24
Determining epitope specificity of T-cell receptors with transformers | 24
Querying multiple sets of P-values through composed hypothesis testing | 24
HyperGraphs.jl: representing higher-order relationships in Julia | 24
SEPA: signaling entropy-based algorithm to evaluate personalized pathway activation for survival analysis on pan-cancer data | 24
A unified mediation analysis framework for integrative cancer proteogenomics with clinical outcomes | 24
Comparing transmembrane protein structures with ATOLL | 23
Unsupervised construction of computational graphs for gene expression data with explicit structural inductive biases | 23
IntelliPy: a GUI for analyzing IntelliCage data | 23
BSDE: barycenter single-cell differential expression for case–control studies | 23
SNIKT: sequence-independent adapter identification and removal in long-read shotgun sequencing data | 23
Bayesian inference of fitness landscapes via tree-structured branching processes | 23
CODEX: COunterfactual Deep learning for the in silico EXploration of cancer cell line perturbations | 23
SPEAR: Systematic ProtEin AnnotatoR | 23
SimBu: bias-aware simulation of bulk RNA-seq data with variable cell-type composition | 23
TopHap: rapid inference of key phylogenetic structures from common haplotypes in large genome collections with limited diversity | 23
Thermometer: a webserver to predict protein thermal stability | 23
EDGE COVID-19: a web platform to generate submission-ready genomes from SARS-CoV-2 sequencing efforts | 23
Overcoming biases in causal inference of molecular interactions | 23
AttentionPert: accurately modeling multiplexed genetic perturbations with multi-scale effects | 22
Looking at the BiG picture: incorporating bipartite graphs in drug response prediction | 22
NFTest: automated testing of Nextflow pipelines | 22
ViTAL: Vision TrAnsformer based Low coverage SARS-CoV-2 lineage assignment | 22
3D Optical Coherence Tomography image processing in BISCAP: characterization of biofilm structure and properties | 22
Explainable multimodal machine learning model for classifying pregnancy drug safety | 22
Multimodal medical image fusion using adaptive co-occurrence filter-based decomposition optimization model | 22
CIBRA identifies genomic alterations with a system-wide impact on tumor biology | 22
RiboGraph: an interactive visualization system for ribosome profiling data at read length resolution | 22
Expanding the coverage of spatial proteomics: a machine learning approach | 22
DDAffinity: predicting the changes in binding affinity of multiple point mutations using protein 3D structure | 22
REUNION: transcription factor binding prediction and regulatory association inference from single-cell multi-omics data | 22
TEspeX: consensus-specific quantification of transposable element expression preventing biases from exonized fragments | 22
HOMELETTE: a unified interface to homology modelling software | 22
Pycallingcards: an integrated environment for visualizing, analyzing, and interpreting Calling Cards data | 22
Somatic mutation effects diffused over microRNA dysregulation | 22
Graph2MDA: a multi-modal variational graph embedding model for predicting microbe–drug associations | 21