Bioinformatics

Papers
(The median citation count of Bioinformatics is 7. The table below lists the papers whose CrossRef citation counts exceed that threshold [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-06-01 to 2025-06-01.)
Article | Citations
Correction to: GTExVisualizer: a web platform for supporting ageing studies | 1317
Predicting anti-cancer drug response by finding optimal subset of drugs | 861
Accurate assembly of multiple RNA-seq samples with Aletsch | 751
Statistical framework to determine indel-length distribution | 647
OpenPhi: an interface to access Philips iSyntax whole slide images for computational pathology | 517
Detecting spatially co-expressed gene clusters with functional coherence by graph-regularized convolutional neural network | 473
CondiS web app: imputation of censored lifetimes for machine learning-based survival analysis | 278
Completing gene trees without species trees in sub-quadratic time | 272
deTELpy: Python package for high-throughput detection of amino acid substitutions in mass spectrometry datasets | 260
ATLIGATOR: editing protein interactions with an atlas-based approach | 260
LPTD: a novel linear programming-based topology determination method for cryo-EM maps | 206
Mocafe: a comprehensive Python library for simulating cancer development with Phase Field Models | 184
EvoAug-TF: extending evolution-inspired data augmentations for genomic deep learning to TensorFlow | 177
Integrated Genome Browser App Store | 167
PANPROVA: pangenomic prokaryotic evolution of full assemblies | 148
MuWU: Mutant-seq library analysis and annotation | 145
Haplotype-based membership inference from summary genomic data | 131
TRANSDIRE: data-driven direct reprogramming by a pioneer factor-guided trans-omics approach | 130
pKPDB: a protein data bank extension database of pKa and pI theoretical values | 123
Reconstructing tumor clonal lineage trees incorporating single-nucleotide variants, copy number alterations and structural variations | 114
Icolos: a workflow manager for structure-based post-processing of de novo generated small molecules | 113
Idéfix: identifying accidental sample mix-ups in biobanks using polygenic scores | 106
Non-parametric modelling of temporal and spatial counts data from RNA-seq experiments | 103
Practical selection of representative sets of RNA-seq samples using a hierarchical approach | 102
3Cnet: pathogenicity prediction of human variants using multitask learning with evolutionary constraints | 99
DTI-Voodoo: machine learning over interaction networks and ontology-based background knowledge predicts drug–target interactions | 97
MAFFIN: metabolomics sample normalization using maximal density fold change with high-quality metabolic features and corrected signal intensities | 97
Deep Subspace Mutual Learning for cancer subtypes prediction | 95
SimPlot++: a Python application for representing sequence similarity and detecting recombination | 94
monaLisa: an R/Bioconductor package for identifying regulatory motifs | 89
Increasing confidence in proteomic spectral deconvolution through mass defect | 88
PsiNorm: a scalable normalization for single-cell RNA-seq data | 83
HelixGAN a deep-learning methodology for conditional de novo design of α-helix structures | 81
ELIXIR: providing a sustainable infrastructure for life science data at European scale | 80
DeepSVP: integration of genotype and phenotype for structural variant prioritization using deep learning | 79
Refget: standardized access to reference sequences | 79
DeepPerVar: a multi-modal deep learning framework for functional interpretation of genetic variants in personal genome | 78
Prediction of whole-cell transcriptional response with machine learning | 76
DRUMMER—rapid detection of RNA modifications through comparative nanopore sequencing | 76
Inference of 3D genome architecture by modeling overdispersion of Hi-C data | 75
Cross-species prediction of essential genes in insects | 73
Deep Local Analysis deconstructs protein–protein interfaces and accurately estimates binding affinity changes upon mutation | 70
WMDS.net: a network control framework for identifying key players in transcriptome programs | 69
PyLiger: scalable single-cell multi-omic data integration in Python | 67
MAGUS+eHMMs: improved multiple sequence alignment accuracy for fragmentary sequences | 67
CANTATA—prediction of missing links in Boolean networks using genetic programming | 67
ProSynAR: a reference aware read merger | 67
Exploring automatic inconsistency detection for literature-based gene ontology annotation | 65
Erratum to: GADGETS: a genetic algorithm for detecting epistasis using nuclear families | 64
Aclust2.0: a revamped unsupervised R tool for Infinium methylation beadchips data analyses | 64
Random field modeling of multi-trait multi-locus association for detecting methylation quantitative trait loci | 62
skandiver: a divergence-based analysis tool for identifying intercellular mobile genetic elements | 62
MetBP: a software tool for detection of interaction between metal ion–RNA base pairs | 61
Tysserand—fast and accurate reconstruction of spatial networks from bioimages | 60
Fragmentstein—facilitating data reuse for cell-free DNA fragment analysis | 59
Probabilistic identification of bacterial essential genes via insertion density using TraDIS data with Tn5 libraries | 59
trfermikit: a tool to discover VNTR-associated deletions | 58
Evidential meta-model for molecular property prediction | 58
Deconvolution of expression for nascent RNA-sequencing data (DENR) highlights pre-RNA isoform diversity in human cells | 58
Floria: fast and accurate strain haplotyping in metagenomes | 57
Response to the letter to the editor: On the feasibility of dynamical analysis of network models of biochemical regulation | 57
Continual knowledge infusion into pre-trained biomedical language models | 57
The phers R package: using phenotype risk scores based on electronic health records to study Mendelian disease and rare genetic variants | 56
The ENDS of assumptions: an online tool for the epistemic non-parametric drug–response scoring | 56
RNAglib: a python package for RNA 2.5 D graphs | 56
Estimation of cancer cell fractions and clone trees from multi-region sequencing of tumors | 56
Group-walk: a rigorous approach to group-wise false discovery rate analysis by target-decoy competition | 56
RNAsolo: a repository of cleaned PDB-derived RNA 3D structures | 54
VeloViz: RNA velocity-informed embeddings for visualizing cellular trajectories | 54
bollito: a flexible pipeline for comprehensive single-cell RNA-seq analyses | 54
ADViSELipidomics: a workflow for analyzing lipidomics data | 54
Accurate spliced alignment of long RNA sequencing reads | 54
MICER: a pre-trained encoder–decoder architecture for molecular image captioning | 53
GMNN2CD: identification of circRNA–disease associations based on variational inference and graph Markov neural networks | 53
The FASTQ+ format and PISA | 53
hapCon: estimating contamination of ancient genomes by copying from reference haplotypes | 52
Decomposing mosaic tandem repeats accurately from long reads | 52
Perceiver CPI: a nested cross-attention network for compound–protein interaction prediction | 52
Deciphering high-order structures in spatial transcriptomes with graph-guided Tucker decomposition | 52
HDMC: a novel deep learning-based framework for removing batch effects in single-cell RNA-seq data | 51
CFAGO: cross-fusion of network and attributes based on attention mechanism for protein function prediction | 51
VSS: variance-stabilized signals for sequencing-based genomic signals | 50
hipFG: high-throughput harmonization and integration pipeline for functional genomics data | 50
From viral evolution to spatial contagion: a biologically modulated Hawkes model | 50
EpiDope: a deep neural network for linear B-cell epitope prediction | 50
Hierarchical reinforcement learning for automatic disease diagnosis | 50
De novo protein design by an energy function based on series expansion in distance and orientation dependence | 50
vaRHC: an R package for semi-automation of variant classification in hereditary cancer genes according to ACMG/AMP and gene-specific ClinGen guidelines | 49
Powerful molecule generation with simple ConvNet | 49
LinkExplorer: predicting, explaining and exploring links in large biomedical knowledge graphs | 49
EDTox: an R Shiny application to predict the endocrine disruption potential of compounds | 49
Prediction and curation of missing biomedical identifier mappings with Biomappings | 48
ViReMaShiny: an interactive application for analysis of viral recombination data | 48
MoMA-LoopSampler: a web server to exhaustively sample protein loop conformations | 48
2DProts: database of family-wide protein secondary structure diagrams | 47
Prediction of gene co-expression from chromatin contacts with graph attention network | 47
BATL: Bayesian annotations for targeted lipidomics | 46
BridgeDPI: a novel Graph Neural Network for predicting drug–protein interactions | 46
Structured Prompt Interrogation and Recursive Extraction of Semantics (SPIRES): a method for populating knowledge bases using zero-shot learning | 46
Adaptive digital tissue deconvolution | 46
COVID-19 Spread Mapper: a multi-resolution, unified framework and open-source tool | 46
minoTour, real-time monitoring and analysis for nanopore sequencers | 46
DeepTrio: a ternary prediction system for protein–protein interaction using mask multiple parallel convolutional neural networks | 45
rPanglaoDB: an R package to download and merge labeled single-cell RNA-seq data from the PanglaoDB database | 45
Single-cell RNA sequencing data analysis based on non-uniform ε-neighborhood network | 45
Transfer learning for drug–target interaction prediction | 44
Erratum to: Assessing heterogeneity in spatial data using the HTA index with applications to spatial transcriptomics and imaging | 44
XSI—a genotype compression tool for compressive genomics in large biobanks | 44
High-sensitivity pattern discovery in large, paired multiomic datasets | 44
Deep learning-based classification of breast cancer cells using transmembrane receptor dynamics | 42
MS-Decipher: a user-friendly proteome database search software with an emphasis on deciphering the spectra of O-linked glycopeptides | 42
A fast data-driven method for genotype imputation, phasing and local ancestry inference: MendelImpute.jl | 42
LINADMIX: evaluating the effect of ancient admixture events on modern populations | 42
The minimizer Jaccard estimator is biased and inconsistent | 42
Functional characterization of co-phosphorylation networks | 41
RawHummus: an R Shiny app for automated raw data quality control in metabolomics | 41
SL-Miner: a web server for mining evidence and prioritization of cancer-specific synthetic lethality | 41
Geometry-complete perceptron networks for 3D molecular graphs | 40
Comprehensive comparison of two types of algorithm for circRNA detection from short-read RNA-Seq | 40
Mining literature and pathway data to explore the relations of ketamine with neurotransmitters and gut microbiota using a knowledge-graph | 40
ECCB2022: the 21st European Conference on Computational Biology | 40
OMEN: network-based driver gene identification using mutual exclusivity | 39
Graph-theoretical prediction of biological modules in quaternary structures of large protein complexes | 39
CProMG: controllable protein-oriented molecule generation with desired binding affinity and drug-like properties | 39
Microbench: automated metadata management for systems biology benchmarking and reproducibility in Python | 39
Single-cell mutation calling and phylogenetic tree reconstruction with loss and recurrence | 38
Correction of image distortion in large-field ssEM stitching by an unsupervised intermediate-space solving network | 38
LOCAN: a python library for analyzing single-molecule localization microscopy data | 38
PltDB: a blood platelets-based gene expression database for disease investigation | 38
SpecieScan: semi-automated taxonomic identification of bone collagen peptides from MALDI-ToF-MS | 38
Avoiding C-hacking when evaluating survival distribution predictions with discrimination measures | 38
Modified RNAs and predictions with the ViennaRNA Package | 37
Tightly integrated multiomics-based deep tensor survival model for time-to-event prediction | 37
mHapTk: a comprehensive toolkit for the analysis of DNA methylation haplotypes | 37
scGrapHiC: deep learning-based graph deconvolution for Hi-C using single cell gene expression | 37
PractiCPP: a deep learning approach tailored for extremely imbalanced datasets in cell-penetrating peptide prediction | 37
A physics-informed neural SDE network for learning cellular dynamics from time-series scRNA-seq data | 37
PST-PRNA: prediction of RNA-binding sites using protein surface topography and deep learning | 37
RNA threading with secondary structure and sequence profile | 37
Quantifying and correcting slide-to-slide variation in multiplexed immunofluorescence images | 36
Multi-project and Multi-profile joint Non-negative Matrix Factorization for cancer omic datasets | 36
Cell type matching across species using protein embeddings and transfer learning | 36
Spectral clustering of single-cell multi-omics data on multilayer graphs | 36
CCIP: predicting CTCF-mediated chromatin loops with transitivity | 36
Using the UK Biobank as a global reference of worldwide populations: application to measuring ancestry diversity from GWAS summary statistics | 35
Omnibus and robust deconvolution scheme for bulk RNA sequencing data integrating multiple single-cell reference sets and prior biological knowledge | 35
Multi-level attention graph neural network based on co-expression gene modules for disease diagnosis and prognosis | 35
CALLR: a semi-supervised cell-type annotation method for single-cell RNA sequencing data | 35
LipidOne: user-friendly lipidomic data analysis tool for a deeper interpretation in a systems biology scenario | 35
HieRFIT: a hierarchical cell type classification tool for projections from complex single-cell atlas datasets | 34
On the feasibility of deep learning applications using raw mass spectrometry data | 34
AdenPredictor: accurate prediction of the adenylation domain specificity of nonribosomal peptide biosynthetic gene clusters in microbial genomes | 34
PiLSL: pairwise interaction learning-based graph neural network for synthetic lethality prediction in human cancers | 34
Efficient gradient boosting for prognostic biomarker discovery | 34
Conumee 2.0: enhanced copy-number variation analysis from DNA methylation arrays for humans and mice | 34
Prowler: a novel trimming algorithm for Oxford Nanopore sequence data | 34
tcplfit2: an R-language general purpose concentration–response modeling package | 34
ClusTCR: a python interface for rapid clustering of large sets of CDR3 sequences with unknown antigen specificity | 34
Scbean: a python library for single-cell multi-omics data analysis | 33
statgenMPP: an R package implementing an IBD-based mixed model approach for QTL mapping in a wide range of multi-parent populations | 33
2023 ISCB Overton Prize: Jingyi Jessica Li | 33
The 2024 ISCB Overton Prize Award—Dr Martin Steinegger | 33
An approachable, flexible and practical machine learning workshop for biologists | 32
MSNet-4mC: learning effective multi-scale representations for identifying DNA N4-methylcytosine sites | 32
Forseti: a mechanistic and predictive model of the splicing status of scRNA-seq reads | 32
iSFun: an R package for integrative dimension reduction analysis | 32
Globally Accessible Distributed Data Sharing (GADDS): a decentralized FAIR platform to facilitate data sharing in the life sciences | 32
Multiomix: a cloud-based platform to infer cancer genomic and epigenomic events associated with gene expression modulation | 32
STAAR workflow: a cloud-based workflow for scalable and reproducible rare variant analysis | 32
libOmexMeta: enabling semantic annotation of models to support FAIR principles | 32
Importance-Penalized Joint Graphical Lasso (IPJGL): differential network inference via GGMs | 31
HAMPLE: deciphering TF-DNA binding mechanism in different cellular environments by characterizing higher-order nucleotide dependency | 31
Computational modeling of mRNA degradation dynamics using deep neural networks | 31
Galaxy Helm chart: a standardized method for deploying production Galaxy servers | 31
OPUS-X: an open-source toolkit for protein torsion angles, secondary structure, solvent accessibility, contact map predictions and 3D folding | 31
RENANO: a REference-based compressor for NANOpore FASTQ files | 30
MCRL: using a reference library to compress a metagenome into a non-redundant list of sequences, considering viruses as a case study | 30
SCONCE: a method for profiling copy number alterations in cancer evolution using single-cell whole genome sequencing | 30
Efficient change-points detection for genomic sequences via cumulative segmented regression | 30
scSGL: kernelized signed graph learning for single-cell gene regulatory network inference | 30
AbDiver: a tool to explore the natural antibody landscape to aid therapeutic design | 30
MIAMI: mutual information-based analysis of multiplex imaging data | 29
echolocatoR: an automated end-to-end statistical and functional genomic fine-mapping pipeline | 29
Fully unsupervised deep mode of action learning for phenotyping high-content cellular images | 29
spatialTIME and iTIME: R package and Shiny application for visualization and analysis of immunofluorescence data | 29
An automated multi-modal graph-based pipeline for mouse genetic discovery | 29
Deep graph representations embed network information for robust disease marker identification | 29
MIO: microRNA target analysis system for immuno-oncology | 29
CUT&RUNTools 2.0: a pipeline for single-cell and bulk-level CUT&RUN and CUT&Tag data analysis | 28
Phenotype prediction from single-cell RNA-seq data using attention-based neural networks | 28
GEnView: a gene-centric, phylogeny-based comparative genomics pipeline for bacterial genomes and plasmids | 28
scHiCPTR: unsupervised pseudotime inference through dual graph refinement for single-cell Hi-C data | 28
SysMod: the ISCB community for data-driven computational modelling and multi-scale analysis of biological systems | 28
Prediction of HIV sensitivity to monoclonal antibodies using aminoacid sequences and deep learning | 28
ToxIBTL: prediction of peptide toxicity based on information bottleneck and transfer learning | 28
PeakBot: machine-learning-based chromatographic peak picking | 28
Asynchronous parallel Bayesian optimization for AI-driven cloud laboratories | 28
Prediction of recovery from multiple organ dysfunction syndrome in pediatric sepsis patients | 28
StructuralDPPIV: a novel deep learning model based on atom structure for predicting dipeptidyl peptidase-IV inhibitory peptides | 28
KCOSS: an ultra-fast k-mer counter for assembled genome analysis | 28
SplicingFactory—splicing diversity analysis for transcriptome data | 27
Learning embedding features based on multisense-scaled attention architecture to improve the predictive performance of anticancer peptides | 27
Testing microbiome association using integrated quantile regression models | 27
Position-Specific Enrichment Ratio Matrix scores predict antibody variant properties from deep sequencing data | 27
SBGNview: towards data analysis, integration and visualization on all pathways | 27
2022 ISCB Accomplishments by a Senior Scientist Award: Ron Shamir | 27
SPRISS: approximating frequent k-mers by sampling reads, and applications | 27
Improving dictionary-based named entity recognition with deep learning | 27
scanMiR: a biochemically based toolkit for versatile and efficient microRNA target prediction | 27
WGA-LP: a pipeline for whole genome assembly of contaminated reads | 27
SEPA: signaling entropy-based algorithm to evaluate personalized pathway activation for survival analysis on pan-cancer data | 26
Polyphest: fast polyploid phylogeny estimation | 26
S2L-PSIBLAST: a supervised two-layer search framework based on PSI-BLAST for protein remote homology detection | 26
Querying multiple sets of P-values through composed hypothesis testing | 26
ELIXIR biovalidator for semantic validation of life science metadata | 26
IMPACT: interpretable microbial phenotype analysis via microbial characteristic traits | 26
dsMTL: a computational framework for privacy-preserving, distributed multi-task machine learning | 26
HyperGraphs.jl: representing higher-order relationships in Julia | 26
Nezzle: an interactive and programmable visualization of biological networks in Python | 26
ISMB/ECCB 2021 proceedings | 26
GADGETS: a genetic algorithm for detecting epistasis using nuclear families | 25
Joint inference of cell lineage and mitochondrial evolution from single-cell sequencing data | 25
CIndex: compressed indexes for fast retrieval of FASTQ files | 25
InterpolatedXY: a two-step strategy to normalize DNA methylation microarray data avoiding sex bias | 25
ReadItAndKeep: rapid decontamination of SARS-CoV-2 sequencing reads | 25
ORT: a workflow linking genome-scale metabolic models with reactive transport codes | 25
Foreign RNA spike-ins enable accurate allele-specific expression analysis at scale | 25
AQUARIUM: accurate quantification of circular isoforms using model-based strategy | 25
CellAnn: a comprehensive, super-fast, and user-friendly single-cell annotation web server | 25
Optimal phylogenetic reconstruction of insertion and deletion events | 25
Powerful and interpretable control of false discoveries in two-group differential expression studies | 25
Driver gene detection through Bayesian network integration of mutation and expression profiles | 25
ClustENMD: efficient sampling of biomolecular conformational space at atomic resolution | 25
Determining epitope specificity of T-cell receptors with transformers | 24
ATHENA: analysis of tumor heterogeneity from spatial omics measurements | 24
RAxML Grove: an empirical phylogenetic tree database | 24
AHoJ: rapid, tailored search and retrieval of apo and holo protein structures for user-defined ligands | 24
Learning sparse log-ratios for high-throughput sequencing data | 24
efam: an expanded, metaproteome-supported HMM profile database of viral protein families | 24
A unified mediation analysis framework for integrative cancer proteogenomics with clinical outcomes | 24
stPlus: a reference-based method for the accurate enhancement of spatial transcriptomics | 24
A novel pipeline for computerized mouse spermatogenesis staging | 24
PDMDA: predicting deep-level miRNA–disease associations with graph neural networks and sequence features | 24
MungeSumstats: a Bioconductor package for the standardization and quality control of many GWAS summary statistics | 24
SNIKT: sequence-independent adapter identification and removal in long-read shotgun sequencing data | 23
BSDE: barycenter single-cell differential expression for case–control studies | 23
Towards a reproducible interactome: semantic-based detection of redundancies to unify protein–protein interaction databases | 23
A variant selection framework for genome graphs | 23
GdClean: removal of Gadolinium contamination in mass cytometry data | 23
AutoCAT: automated cancer-associated TCRs discovery from TCR-seq data | 23