Bioinformatics

Papers
(The TQCC of Bioinformatics is 15. The table below lists the papers whose CrossRef citation counts are above that threshold [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-06-01 to 2025-06-01.)
Article | Citations
Correction to: GTExVisualizer: a web platform for supporting ageing studies | 1317
Predicting anti-cancer drug response by finding optimal subset of drugs | 861
Accurate assembly of multiple RNA-seq samples with Aletsch | 751
Statistical framework to determine indel-length distribution | 647
OpenPhi: an interface to access Philips iSyntax whole slide images for computational pathology | 517
Detecting spatially co-expressed gene clusters with functional coherence by graph-regularized convolutional neural network | 473
CondiS web app: imputation of censored lifetimes for machine learning-based survival analysis | 278
Completing gene trees without species trees in sub-quadratic time | 272
deTELpy: Python package for high-throughput detection of amino acid substitutions in mass spectrometry datasets | 260
ATLIGATOR: editing protein interactions with an atlas-based approach | 260
LPTD: a novel linear programming-based topology determination method for cryo-EM maps | 206
Mocafe: a comprehensive Python library for simulating cancer development with Phase Field Models | 184
EvoAug-TF: extending evolution-inspired data augmentations for genomic deep learning to TensorFlow | 177
Integrated Genome Browser App Store | 167
PANPROVA: pangenomic prokaryotic evolution of full assemblies | 148
MuWU: Mutant-seq library analysis and annotation | 145
Haplotype-based membership inference from summary genomic data | 131
TRANSDIRE: data-driven direct reprogramming by a pioneer factor-guided trans-omics approach | 130
pKPDB: a protein data bank extension database of pKa and pI theoretical values | 123
Reconstructing tumor clonal lineage trees incorporating single-nucleotide variants, copy number alterations and structural variations | 114
Icolos: a workflow manager for structure-based post-processing of de novo generated small molecules | 113
Idéfix: identifying accidental sample mix-ups in biobanks using polygenic scores | 106
Non-parametric modelling of temporal and spatial counts data from RNA-seq experiments | 103
Practical selection of representative sets of RNA-seq samples using a hierarchical approach | 102
3Cnet: pathogenicity prediction of human variants using multitask learning with evolutionary constraints | 99
DTI-Voodoo: machine learning over interaction networks and ontology-based background knowledge predicts drug–target interactions | 97
MAFFIN: metabolomics sample normalization using maximal density fold change with high-quality metabolic features and corrected signal intensities | 97
Deep Subspace Mutual Learning for cancer subtypes prediction | 95
SimPlot++: a Python application for representing sequence similarity and detecting recombination | 94
monaLisa: an R/Bioconductor package for identifying regulatory motifs | 89
Increasing confidence in proteomic spectral deconvolution through mass defect | 88
PsiNorm: a scalable normalization for single-cell RNA-seq data | 83
HelixGAN a deep-learning methodology for conditional de novo design of α-helix structures | 81
ELIXIR: providing a sustainable infrastructure for life science data at European scale | 80
DeepSVP: integration of genotype and phenotype for structural variant prioritization using deep learning | 79
Refget: standardized access to reference sequences | 79
DeepPerVar: a multi-modal deep learning framework for functional interpretation of genetic variants in personal genome | 78
Prediction of whole-cell transcriptional response with machine learning | 76
DRUMMER—rapid detection of RNA modifications through comparative nanopore sequencing | 76
Inference of 3D genome architecture by modeling overdispersion of Hi-C data | 75
Cross-species prediction of essential genes in insects | 73
Deep Local Analysis deconstructs protein–protein interfaces and accurately estimates binding affinity changes upon mutation | 70
WMDS.net: a network control framework for identifying key players in transcriptome programs | 69
PyLiger: scalable single-cell multi-omic data integration in Python | 67
MAGUS+eHMMs: improved multiple sequence alignment accuracy for fragmentary sequences | 67
CANTATA—prediction of missing links in Boolean networks using genetic programming | 67
ProSynAR: a reference aware read merger | 67
Exploring automatic inconsistency detection for literature-based gene ontology annotation | 65
Erratum to: GADGETS: a genetic algorithm for detecting epistasis using nuclear families | 64
Aclust2.0: a revamped unsupervised R tool for Infinium methylation beadchips data analyses | 64
Random field modeling of multi-trait multi-locus association for detecting methylation quantitative trait loci | 62
skandiver: a divergence-based analysis tool for identifying intercellular mobile genetic elements | 62
MetBP: a software tool for detection of interaction between metal ion–RNA base pairs | 61
Tysserand—fast and accurate reconstruction of spatial networks from bioimages | 60
Fragmentstein—facilitating data reuse for cell-free DNA fragment analysis | 59
Probabilistic identification of bacterial essential genes via insertion density using TraDIS data with Tn5 libraries | 59
Deconvolution of expression for nascent RNA-sequencing data (DENR) highlights pre-RNA isoform diversity in human cells | 58
trfermikit: a tool to discover VNTR-associated deletions | 58
Evidential meta-model for molecular property prediction | 58
Continual knowledge infusion into pre-trained biomedical language models | 57
Floria: fast and accurate strain haplotyping in metagenomes | 57
Response to the letter to the editor: On the feasibility of dynamical analysis of network models of biochemical regulation | 57
Estimation of cancer cell fractions and clone trees from multi-region sequencing of tumors | 56
Group-walk: a rigorous approach to group-wise false discovery rate analysis by target-decoy competition | 56
The phers R package: using phenotype risk scores based on electronic health records to study Mendelian disease and rare genetic variants | 56
The ENDS of assumptions: an online tool for the epistemic non-parametric drug–response scoring | 56
RNAglib: a python package for RNA 2.5 D graphs | 56
ADViSELipidomics: a workflow for analyzing lipidomics data | 54
Accurate spliced alignment of long RNA sequencing reads | 54
RNAsolo: a repository of cleaned PDB-derived RNA 3D structures | 54
VeloViz: RNA velocity-informed embeddings for visualizing cellular trajectories | 54
bollito: a flexible pipeline for comprehensive single-cell RNA-seq analyses | 54
The FASTQ+ format and PISA | 53
MICER: a pre-trained encoder–decoder architecture for molecular image captioning | 53
GMNN2CD: identification of circRNA–disease associations based on variational inference and graph Markov neural networks | 53
hapCon: estimating contamination of ancient genomes by copying from reference haplotypes | 52
Decomposing mosaic tandem repeats accurately from long reads | 52
Perceiver CPI: a nested cross-attention network for compound–protein interaction prediction | 52
Deciphering high-order structures in spatial transcriptomes with graph-guided Tucker decomposition | 52
HDMC: a novel deep learning-based framework for removing batch effects in single-cell RNA-seq data | 51
CFAGO: cross-fusion of network and attributes based on attention mechanism for protein function prediction | 51
De novo protein design by an energy function based on series expansion in distance and orientation dependence | 50
VSS: variance-stabilized signals for sequencing-based genomic signals | 50
hipFG: high-throughput harmonization and integration pipeline for functional genomics data | 50
From viral evolution to spatial contagion: a biologically modulated Hawkes model | 50
EpiDope: a deep neural network for linear B-cell epitope prediction | 50
Hierarchical reinforcement learning for automatic disease diagnosis | 50
EDTox: an R Shiny application to predict the endocrine disruption potential of compounds | 49
Powerful molecule generation with simple ConvNet | 49
vaRHC: an R package for semi-automation of variant classification in hereditary cancer genes according to ACMG/AMP and gene-specific ClinGen guidelines | 49
LinkExplorer: predicting, explaining and exploring links in large biomedical knowledge graphs | 49
MoMA-LoopSampler: a web server to exhaustively sample protein loop conformations | 48
Prediction and curation of missing biomedical identifier mappings with Biomappings | 48
ViReMaShiny: an interactive application for analysis of viral recombination data | 48
2DProts: database of family-wide protein secondary structure diagrams | 47
Prediction of gene co-expression from chromatin contacts with graph attention network | 47
BATL: Bayesian annotations for targeted lipidomics | 46
BridgeDPI: a novel Graph Neural Network for predicting drug–protein interactions | 46
Structured Prompt Interrogation and Recursive Extraction of Semantics (SPIRES): a method for populating knowledge bases using zero-shot learning | 46
Adaptive digital tissue deconvolution | 46
COVID-19 Spread Mapper: a multi-resolution, unified framework and open-source tool | 46
minoTour, real-time monitoring and analysis for nanopore sequencers | 46
DeepTrio: a ternary prediction system for protein–protein interaction using mask multiple parallel convolutional neural networks | 45
rPanglaoDB: an R package to download and merge labeled single-cell RNA-seq data from the PanglaoDB database | 45
Single-cell RNA sequencing data analysis based on non-uniform ε-neighborhood network | 45
Transfer learning for drug–target interaction prediction | 44
Erratum to: Assessing heterogeneity in spatial data using the HTA index with applications to spatial transcriptomics and imaging | 44
XSI—a genotype compression tool for compressive genomics in large biobanks | 44
High-sensitivity pattern discovery in large, paired multiomic datasets | 44
Deep learning-based classification of breast cancer cells using transmembrane receptor dynamics | 42
MS-Decipher: a user-friendly proteome database search software with an emphasis on deciphering the spectra of O-linked glycopeptides | 42
A fast data-driven method for genotype imputation, phasing and local ancestry inference: MendelImpute.jl | 42
LINADMIX: evaluating the effect of ancient admixture events on modern populations | 42
The minimizer Jaccard estimator is biased and inconsistent | 42
Functional characterization of co-phosphorylation networks | 41
RawHummus: an R Shiny app for automated raw data quality control in metabolomics | 41
SL-Miner: a web server for mining evidence and prioritization of cancer-specific synthetic lethality | 41
Geometry-complete perceptron networks for 3D molecular graphs | 40
Comprehensive comparison of two types of algorithm for circRNA detection from short-read RNA-Seq | 40
Mining literature and pathway data to explore the relations of ketamine with neurotransmitters and gut microbiota using a knowledge-graph | 40
ECCB2022: the 21st European Conference on Computational Biology | 40
CProMG: controllable protein-oriented molecule generation with desired binding affinity and drug-like properties | 39
Microbench: automated metadata management for systems biology benchmarking and reproducibility in Python | 39
OMEN: network-based driver gene identification using mutual exclusivity | 39
Graph-theoretical prediction of biological modules in quaternary structures of large protein complexes | 39
Single-cell mutation calling and phylogenetic tree reconstruction with loss and recurrence | 38
Correction of image distortion in large-field ssEM stitching by an unsupervised intermediate-space solving network | 38
LOCAN: a python library for analyzing single-molecule localization microscopy data | 38
PltDB: a blood platelets-based gene expression database for disease investigation | 38
SpecieScan: semi-automated taxonomic identification of bone collagen peptides from MALDI-ToF-MS | 38
Avoiding C-hacking when evaluating survival distribution predictions with discrimination measures | 38
Modified RNAs and predictions with the ViennaRNA Package | 37
Tightly integrated multiomics-based deep tensor survival model for time-to-event prediction | 37
mHapTk: a comprehensive toolkit for the analysis of DNA methylation haplotypes | 37
scGrapHiC: deep learning-based graph deconvolution for Hi-C using single cell gene expression | 37
PractiCPP: a deep learning approach tailored for extremely imbalanced datasets in cell-penetrating peptide prediction | 37
A physics-informed neural SDE network for learning cellular dynamics from time-series scRNA-seq data | 37
PST-PRNA: prediction of RNA-binding sites using protein surface topography and deep learning | 37
RNA threading with secondary structure and sequence profile | 37
Quantifying and correcting slide-to-slide variation in multiplexed immunofluorescence images | 36
Multi-project and Multi-profile joint Non-negative Matrix Factorization for cancer omic datasets | 36
Cell type matching across species using protein embeddings and transfer learning | 36
Spectral clustering of single-cell multi-omics data on multilayer graphs | 36
CCIP: predicting CTCF-mediated chromatin loops with transitivity | 36
Using the UK Biobank as a global reference of worldwide populations: application to measuring ancestry diversity from GWAS summary statistics | 35
Omnibus and robust deconvolution scheme for bulk RNA sequencing data integrating multiple single-cell reference sets and prior biological knowledge | 35
Multi-level attention graph neural network based on co-expression gene modules for disease diagnosis and prognosis | 35
CALLR: a semi-supervised cell-type annotation method for single-cell RNA sequencing data | 35
LipidOne: user-friendly lipidomic data analysis tool for a deeper interpretation in a systems biology scenario | 35
tcplfit2: an R-language general purpose concentration–response modeling package | 34
ClusTCR: a python interface for rapid clustering of large sets of CDR3 sequences with unknown antigen specificity | 34
HieRFIT: a hierarchical cell type classification tool for projections from complex single-cell atlas datasets | 34
On the feasibility of deep learning applications using raw mass spectrometry data | 34
AdenPredictor: accurate prediction of the adenylation domain specificity of nonribosomal peptide biosynthetic gene clusters in microbial genomes | 34
PiLSL: pairwise interaction learning-based graph neural network for synthetic lethality prediction in human cancers | 34
Efficient gradient boosting for prognostic biomarker discovery | 34
Conumee 2.0: enhanced copy-number variation analysis from DNA methylation arrays for humans and mice | 34
Prowler: a novel trimming algorithm for Oxford Nanopore sequence data | 34
Scbean: a python library for single-cell multi-omics data analysis | 33
statgenMPP: an R package implementing an IBD-based mixed model approach for QTL mapping in a wide range of multi-parent populations | 33
2023 ISCB Overton Prize: Jingyi Jessica Li | 33
The 2024 ISCB Overton Prize Award—Dr Martin Steinegger | 33
An approachable, flexible and practical machine learning workshop for biologists | 32
MSNet-4mC: learning effective multi-scale representations for identifying DNA N4-methylcytosine sites | 32
Forseti: a mechanistic and predictive model of the splicing status of scRNA-seq reads | 32
iSFun: an R package for integrative dimension reduction analysis | 32
Globally Accessible Distributed Data Sharing (GADDS): a decentralized FAIR platform to facilitate data sharing in the life sciences | 32
Multiomix: a cloud-based platform to infer cancer genomic and epigenomic events associated with gene expression modulation | 32
STAAR workflow: a cloud-based workflow for scalable and reproducible rare variant analysis | 32
libOmexMeta: enabling semantic annotation of models to support FAIR principles | 32
Importance-Penalized Joint Graphical Lasso (IPJGL): differential network inference via GGMs | 31
HAMPLE: deciphering TF-DNA binding mechanism in different cellular environments by characterizing higher-order nucleotide dependency | 31
Computational modeling of mRNA degradation dynamics using deep neural networks | 31
Galaxy Helm chart: a standardized method for deploying production Galaxy servers | 31
OPUS-X: an open-source toolkit for protein torsion angles, secondary structure, solvent accessibility, contact map predictions and 3D folding | 31
scSGL: kernelized signed graph learning for single-cell gene regulatory network inference | 30
AbDiver: a tool to explore the natural antibody landscape to aid therapeutic design | 30
RENANO: a REference-based compressor for NANOpore FASTQ files | 30
MCRL: using a reference library to compress a metagenome into a non-redundant list of sequences, considering viruses as a case study | 30
SCONCE: a method for profiling copy number alterations in cancer evolution using single-cell whole genome sequencing | 30
Efficient change-points detection for genomic sequences via cumulative segmented regression | 30
MIAMI: mutual information-based analysis of multiplex imaging data | 29
echolocatoR: an automated end-to-end statistical and functional genomic fine-mapping pipeline | 29
Fully unsupervised deep mode of action learning for phenotyping high-content cellular images | 29
spatialTIME and iTIME: R package and Shiny application for visualization and analysis of immunofluorescence data | 29
An automated multi-modal graph-based pipeline for mouse genetic discovery | 29
Deep graph representations embed network information for robust disease marker identification | 29
MIO: microRNA target analysis system for immuno-oncology | 29
CUT&RUNTools 2.0: a pipeline for single-cell and bulk-level CUT&RUN and CUT&Tag data analysis | 28
Phenotype prediction from single-cell RNA-seq data using attention-based neural networks | 28
GEnView: a gene-centric, phylogeny-based comparative genomics pipeline for bacterial genomes and plasmids | 28
scHiCPTR: unsupervised pseudotime inference through dual graph refinement for single-cell Hi-C data | 28
SysMod: the ISCB community for data-driven computational modelling and multi-scale analysis of biological systems | 28
Prediction of HIV sensitivity to monoclonal antibodies using aminoacid sequences and deep learning | 28
ToxIBTL: prediction of peptide toxicity based on information bottleneck and transfer learning | 28
PeakBot: machine-learning-based chromatographic peak picking | 28
Asynchronous parallel Bayesian optimization for AI-driven cloud laboratories | 28
Prediction of recovery from multiple organ dysfunction syndrome in pediatric sepsis patients | 28
StructuralDPPIV: a novel deep learning model based on atom structure for predicting dipeptidyl peptidase-IV inhibitory peptides | 28
KCOSS: an ultra-fast k-mer counter for assembled genome analysis | 28
Improving dictionary-based named entity recognition with deep learning | 27
scanMiR: a biochemically based toolkit for versatile and efficient microRNA target prediction | 27
WGA-LP: a pipeline for whole genome assembly of contaminated reads | 27
SplicingFactory—splicing diversity analysis for transcriptome data | 27
Learning embedding features based on multisense-scaled attention architecture to improve the predictive performance of anticancer peptides | 27
Testing microbiome association using integrated quantile regression models | 27
Position-Specific Enrichment Ratio Matrix scores predict antibody variant properties from deep sequencing data | 27
SBGNview: towards data analysis, integration and visualization on all pathways | 27
2022 ISCB Accomplishments by a Senior Scientist Award: Ron Shamir | 27
SPRISS: approximating frequent k-mers by sampling reads, and applications | 27
ISMB/ECCB 2021 proceedings | 26
SEPA: signaling entropy-based algorithm to evaluate personalized pathway activation for survival analysis on pan-cancer data | 26
Polyphest: fast polyploid phylogeny estimation | 26
S2L-PSIBLAST: a supervised two-layer search framework based on PSI-BLAST for protein remote homology detection | 26
Querying multiple sets of P-values through composed hypothesis testing | 26
ELIXIR biovalidator for semantic validation of life science metadata | 26
IMPACT: interpretable microbial phenotype analysis via microbial characteristic traits | 26
dsMTL: a computational framework for privacy-preserving, distributed multi-task machine learning | 26
HyperGraphs.jl: representing higher-order relationships in Julia | 26
Nezzle: an interactive and programmable visualization of biological networks in Python | 26
GADGETS: a genetic algorithm for detecting epistasis using nuclear families | 25
Joint inference of cell lineage and mitochondrial evolution from single-cell sequencing data | 25
CIndex: compressed indexes for fast retrieval of FASTQ files | 25
InterpolatedXY: a two-step strategy to normalize DNA methylation microarray data avoiding sex bias | 25
ReadItAndKeep: rapid decontamination of SARS-CoV-2 sequencing reads | 25
ORT: a workflow linking genome-scale metabolic models with reactive transport codes | 25
Foreign RNA spike-ins enable accurate allele-specific expression analysis at scale | 25
AQUARIUM: accurate quantification of circular isoforms using model-based strategy | 25
CellAnn: a comprehensive, super-fast, and user-friendly single-cell annotation web server | 25
Optimal phylogenetic reconstruction of insertion and deletion events | 25
Powerful and interpretable control of false discoveries in two-group differential expression studies | 25
Driver gene detection through Bayesian network integration of mutation and expression profiles | 25
ClustENMD: efficient sampling of biomolecular conformational space at atomic resolution | 25
A novel pipeline for computerized mouse spermatogenesis staging | 24
PDMDA: predicting deep-level miRNA–disease associations with graph neural networks and sequence features | 24
MungeSumstats: a Bioconductor package for the standardization and quality control of many GWAS summary statistics | 24
Determining epitope specificity of T-cell receptors with transformers | 24
ATHENA: analysis of tumor heterogeneity from spatial omics measurements | 24
RAxML Grove: an empirical phylogenetic tree database | 24
AHoJ: rapid, tailored search and retrieval of apo and holo protein structures for user-defined ligands | 24
Learning sparse log-ratios for high-throughput sequencing data | 24
efam: an expanded, metaproteome-supported HMM profile database of viral protein families | 24
A unified mediation analysis framework for integrative cancer proteogenomics with clinical outcomes | 24
stPlus: a reference-based method for the accurate enhancement of spatial transcriptomics | 24
AutoCAT: automated cancer-associated TCRs discovery from TCR-seq data | 23
SNIKT: sequence-independent adapter identification and removal in long-read shotgun sequencing data | 23
BSDE: barycenter single-cell differential expression for case–control studies | 23
Towards a reproducible interactome: semantic-based detection of redundancies to unify protein–protein interaction databases | 23
A variant selection framework for genome graphs | 23
GdClean: removal of Gadolinium contamination in mass cytometry data | 23