Bioinformatics

Papers
(The TQCC of Bioinformatics is 14. The table below lists the papers above that threshold, based on CrossRef citation counts, up to a maximum of 250 papers. It covers papers published in the past four years, i.e., from 2021-04-01 to 2025-04-01.)
| Article | Citations |
| --- | --- |
| Corrigendum to: HRIBO: high-throughput analysis of bacterial ribosome profiling data | 1179 |
| 2023 ISCB innovator award: Dana Pe’er | 773 |
| qTeller: a tool for comparative multi-genomic gene expression analysis | 686 |
| EDGE COVID-19: a web platform to generate submission-ready genomes from SARS-CoV-2 sequencing efforts | 587 |
| webSCST: an interactive web application for single-cell RNA-sequencing data and spatial transcriptomic data integration | 554 |
| The 2024 Outstanding Contributions to ISCB Award—Dr Scott Markel | 463 |
| Disease gene prediction with privileged information and heteroscedastic dropout | 389 |
| SPDE: a multi-functional software for sequence processing and data extraction | 265 |
| Embeddings of genomic region sets capture rich biological associations in lower dimensions | 252 |
| SD2: spatially resolved transcriptomics deconvolution through integration of dropout and spatial information | 229 |
| Characterizing domain-specific open educational resources by linking ISCB Communities of Special Interest to Wikipedia | 222 |
| HAMdetector: a Bayesian regression model that integrates information to detect HLA-associated mutations | 180 |
| Biomedical evidence engineering for data-driven discovery | 175 |
| Correction to: GTExVisualizer: a web platform for supporting ageing studies | 156 |
| Pairs and Pairix: a file format and a tool for efficient storage and retrieval for Hi-C read pairs | 147 |
| ISMB 2022 proceedings | 135 |
| Current structure predictors are not learning the physics of protein folding | 134 |
| Improved allele-specific single-cell copy number estimation in low-coverage DNA-sequencing | 127 |
| MitoVisualize: a resource for analysis of variants in human mitochondrial RNAs and DNA | 125 |
| Deep structure integrative representation of multi-omics data for cancer subtyping | 117 |
| CLIN-X: pre-trained language models and a study on cross-task transfer for concept extraction in the clinical domain | 116 |
| Predicting spatially resolved gene expression via tissue morphology using adaptive spatial GNNs | 113 |
| Computing optimal factories in metabolic networks with negative regulation | 106 |
| Calib-RT: an open source python package for peptide retention time calibration in DIA mass spectrometry data | 100 |
| ExplorATE: a new pipeline to explore active transposable elements from RNA-seq data | 100 |
| Looking at the BiG picture: incorporating bipartite graphs in drug response prediction | 95 |
| MultiBaC: an R package to remove batch effects in multi-omic experiments | 94 |
| RiboGraph: an interactive visualization system for ribosome profiling data at read length resolution | 93 |
| Effective knowledge graph embeddings based on multidirectional semantics relations for polypharmacy side effects prediction | 93 |
| SPARSE: a sparse hypergraph neural network for learning multiple types of latent combinations to accurately predict drug–drug interactions | 91 |
| Fine-tuning protein embeddings for functional similarity evaluation | 89 |
| DiffChIPL: a differential peak analysis method for high-throughput sequencing data with biological replicates based on limma | 84 |
| Biological Random Walks: multi-omics integration for disease gene prioritization | 82 |
| On the feasibility of dynamical analysis of network models of biochemical regulation | 80 |
| Clustering single-cell RNA-seq data by rank constrained similarity learning | 79 |
| Gene Tracer: a smart, interactive, voice-controlled Alexa skill For gene information retrieval and browsing, mutation annotation and network visualization | 74 |
| HPOFiller: identifying missing protein–phenotype associations by graph convolutional network | 74 |
| REALGAR: a web app of integrated respiratory omics data | 74 |
| DrugCVar: a platform for evidence-based drug annotation for genetic variants in cancer | 74 |
| SimBu: bias-aware simulation of bulk RNA-seq data with variable cell-type composition | 73 |
| I2b2-etl: Python application for importing electronic health data into the informatics for integrating biology and the bedside platform | 73 |
| MMGraph: a multiple motif predictor based on graph neural network and coexisting probability for ATAC-seq data | 72 |
| massDatabase: utilities for the operation of the public compound and pathway database | 71 |
| ASURAT: functional annotation-driven unsupervised clustering of single-cell transcriptomes | 70 |
| GLIDER: function prediction from GLIDE-based neighborhoods | 70 |
| teff: estimation of Treatment EFFects on transcriptomic data using causal random forest | 68 |
| Deep statistical modelling of nanopore sequencing translocation times reveals latent non-B DNA structures | 67 |
| medna-metadata: an open-source data management system for tracking environmental DNA samples and metadata | 65 |
| Overcoming selection bias in synthetic lethality prediction | 64 |
| Phylogenetic diversity statistics for all clades in a phylogeny | 63 |
| A weighted distance-based approach for deriving consensus tumor evolutionary trees | 63 |
| GAMIBHEAR: whole-genome haplotype reconstruction from Genome Architecture Mapping data | 63 |
| Reconstructing tumor clonal lineage trees incorporating single-nucleotide variants, copy number alterations and structural variations | 61 |
| Haplotype-based membership inference from summary genomic data | 60 |
| GraphLoc: a graph neural network model for predicting protein subcellular localization from immunohistochemistry images | 59 |
| Practical selection of representative sets of RNA-seq samples using a hierarchical approach | 57 |
| SNIKT: sequence-independent adapter identification and removal in long-read shotgun sequencing data | 56 |
| tSFM 1.0: tRNA Structure–Function Mapper | 56 |
| PrISM: precision for integrative structural models | 56 |
| TEspeX: consensus-specific quantification of transposable element expression preventing biases from exonized fragments | 55 |
| scWMC: weighted matrix completion-based imputation of scRNA-seq data via prior subspace information | 55 |
| scAMACE: model-based approach to the joint analysis of single-cell data on chromatin accessibility, gene expression and methylation | 55 |
| Study of real-valued distance prediction for protein structure prediction with deep learning | 54 |
| Systematic replication enables normalization of high-throughput imaging assays | 54 |
| DelaySSAToolkit.jl: stochastic simulation of reaction systems with time delays in Julia | 54 |
| Discovering drug–target interaction knowledge from biomedical literature | 53 |
| wenda_gpu: fast domain adaptation for genomic data | 52 |
| Utilizing image and caption information for biomedical document classification | 52 |
| Optimization of drug–target affinity prediction methods through feature processing schemes | 52 |
| SpaceX: gene co-expression network estimation for spatial transcriptomics | 51 |
| KG4SL: knowledge graph neural network for synthetic lethality prediction in human cancers | 51 |
| Fec: a fast error correction method based on two-rounds overlapping and caching | 51 |
| MNHN-Tree-Tools: a toolbox for tree inference using multi-scale clustering of a set of sequences | 50 |
| Exploring parallel MPI fault tolerance mechanisms for phylogenetic inference with RAxML-NG | 50 |
| CENTRE: a gradient boosting algorithm for Cell-type-specific ENhancer-Target pREdiction | 50 |
| CrepHAN: cross-species prediction of enhancers by using hierarchical attention networks | 49 |
| Rapid T-cell receptor interaction grouping with ting | 49 |
| Predicting anti-cancer drug response by finding optimal subset of drugs | 49 |
| 3Dscript.server: true server-side 3D animation of microscopy images using a natural language-based syntax | 48 |
| TimiRGeN: R/Bioconductor package for time series microRNA–mRNA integration and analysis | 48 |
| PaIntDB: network-based omics integration and visualization using protein–protein interactions in Pseudomonas aeruginosa | 48 |
| SimService: a lightweight library for building simulation services in Python | 47 |
| DREAMM: a web-based server for drugging protein-membrane interfaces as a novel workflow for targeted drug design | 47 |
| Statistical framework to determine indel-length distribution | 47 |
| GdClean: removal of Gadolinium contamination in mass cytometry data | 47 |
| SPOT: a web-tool enabling swift profiling of transcriptomes | 47 |
| ORFLine: a bioinformatic pipeline to prioritize small open reading frames identifies candidate secreted small proteins from lymphocytes | 47 |
| DMIL-IsoFun: predicting isoform function using deep multi-instance learning | 47 |
| CHIT: an allele-specific method for testing the association between molecular quantitative traits and phenotype–genotype interaction | 47 |
| Phytest: quality control for phylogenetic analyses | 47 |
| DeepSec: a deep learning framework for secreted protein discovery in human body fluids | 46 |
| Organism-specific training improves performance of linear B-cell epitope prediction | 46 |
| Joint eQTL mapping and inference of gene regulatory network improves power of detecting both cis- and trans-eQTLs | 45 |
| PyJAMAS: open-source, multimodal segmentation and analysis of microscopy images | 45 |
| A cross-level information transmission network for hierarchical omics data integration and phenotype prediction from a new genotype | 45 |
| ATLIGATOR: editing protein interactions with an atlas-based approach | 44 |
| ConsensuSV—from the whole-genome sequencing data to the complete variant list | 44 |
| AutoCAT: automated cancer-associated TCRs discovery from TCR-seq data | 44 |
| PsiNorm: a scalable normalization for single-cell RNA-seq data | 44 |
| REUNION: transcription factor binding prediction and regulatory association inference from single-cell multi-omics data | 43 |
| Detecting spatially co-expressed gene clusters with functional coherence by graph-regularized convolutional neural network | 43 |
| AWOT and CWOT for genotype and genotype-by-treatment interaction joint analysis in pharmacogenetics GWAS | 43 |
| RCandy: an R package for visualizing homologous recombinations in bacterial genomes | 43 |
| COVID-19 Knowledge Graph from semantic integration of biomedical literature and databases | 43 |
| An empirical study on KDIGO-defined acute kidney injury prediction in the intensive care unit | 43 |
| Accurate large-scale phylogeny-aware alignment using BAli-Phy | 42 |
| HELM-GPT: de novo macrocyclic peptide design using generative pre-trained transformer | 42 |
| CAMML with the Integration of Marker Proteins (ChIMP) | 42 |
| deTELpy: Python package for high-throughput detection of amino acid substitutions in mass spectrometry datasets | 42 |
| MetaSquare: an integrated metadatabase of 16S rRNA gene amplicon for microbiome taxonomic classification | 41 |
| learnMSA2: deep protein multiple alignments with large language and hidden Markov models | 41 |
| GORetriever: reranking protein-description-based GO candidates by literature-driven deep information retrieval for protein function annotation | 41 |
| Comparing transmembrane protein structures with ATOLL | 40 |
| Towards a reproducible interactome: semantic-based detection of redundancies to unify protein–protein interaction databases | 40 |
| PANPROVA: pangenomic prokaryotic evolution of full assemblies | 40 |
| Deciphering associations between gut microbiota and clinical factors using microbial modules | 39 |
| CIBRA identifies genomic alterations with a system-wide impact on tumor biology | 39 |
| TopHap: rapid inference of key phylogenetic structures from common haplotypes in large genome collections with limited diversity | 39 |
| Completing gene trees without species trees in sub-quadratic time | 39 |
| A Bayesian hierarchical model to estimate DNA methylation conservation in colorectal tumors | 39 |
| Integrated Genome Browser App Store | 39 |
| LPTD: a novel linear programming-based topology determination method for cryo-EM maps | 38 |
| fastISM: performant in silico saturation mutagenesis for convolutional neural networks | 38 |
| An incrementally updatable and scalable system for large-scale sequence search using the Bentley–Saxe transformation | 38 |
| Chromosomal imbalances detected via RNA-sequencing in 28 cancers | 38 |
| StructuralVariantAnnotation: a R/Bioconductor foundation for a caller-agnostic structural variant software ecosystem | 37 |
| TMQuery: a database of precomputed template modeling scores for assessment of protein structural similarity | 37 |
| Steer’n’Detect: fast 2D template detection with accurate orientation estimation | 36 |
| TRANSDIRE: data-driven direct reprogramming by a pioneer factor-guided trans-omics approach | 36 |
| Biomarker identification by interpretable maximum mean discrepancy | 36 |
| maplet: an extensible R toolbox for modular and reproducible metabolomics pipelines | 36 |
| StAmP-DB: a platform for structures of polymorphic amyloid fibril cores | 36 |
| BSDE: barycenter single-cell differential expression for case–control studies | 36 |
| Synthetic-to-real: instance segmentation of clinical cluster cells with unlabeled synthetic training | 35 |
| KIMGENS: a novel method to estimate kinship in organisms with mixed haploid diploid genetic systems robust to population structure | 35 |
| Scoring protein sequence alignments using deep learning | 35 |
| Mian: interactive web-based microbiome data table visualization and machine learning platform | 35 |
| Seeding with minimized subsequence | 35 |
| seqgra: principled selection of neural network architectures for genomics prediction tasks | 35 |
| MOJITOO: a fast and universal method for integration of multimodal single-cell data | 34 |
| POIBM: batch correction of heterogeneous RNA-seq datasets through latent sample matching | 34 |
| OMAMO: orthology-based alternative model organism selection | 34 |
| InterARTIC: an interactive web application for whole-genome nanopore sequencing analysis of SARS-CoV-2 and other viruses | 34 |
| MAFFIN: metabolomics sample normalization using maximal density fold change with high-quality metabolic features and corrected signal intensities | 33 |
| SPEAR: Systematic ProtEin AnnotatoR | 33 |
| The DOMINO web-server for active module identification analysis | 33 |
| Deep Subspace Mutual Learning for cancer subtypes prediction | 33 |
| Median and small parsimony problems on RNA trees | 33 |
| Trap spaces of multi-valued networks: definition, computation, and applications | 33 |
| scKINETICS: inference of regulatory velocity with single-cell transcriptomics data | 33 |
| Overcoming biases in causal inference of molecular interactions | 33 |
| RoDiCE: robust differential protein co-expression analysis for cancer complexome | 33 |
| Unsupervised construction of computational graphs for gene expression data with explicit structural inductive biases | 33 |
| Phylogenomic branch length estimation using quartets | 33 |
| 3DPolyS-LE: an accessible simulation framework to model the interplay between chromatin and loop extrusion | 33 |
| CellProfiler Analyst 3.0: accessible data exploration and machine learning for image analysis | 33 |
| SWIMmeR: an R-based software to unveiling crucial nodes in complex biological networks | 32 |
| OpenPhi: an interface to access Philips iSyntax whole slide images for computational pathology | 32 |
| Explainable multimodal machine learning model for classifying pregnancy drug safety | 32 |
| Idéfix: identifying accidental sample mix-ups in biobanks using polygenic scores | 32 |
| DEMETER: efficient simultaneous curation of genome-scale reconstructions guided by experimental data and refined gene annotations | 32 |
| MuWU: Mutant-seq library analysis and annotation | 32 |
| Toward the assessment of predicted inter-residue distance | 32 |
| Multi-instance learning of graph neural networks for aqueous pKa prediction | 32 |
| DTI-Voodoo: machine learning over interaction networks and ontology-based background knowledge predicts drug–target interactions | 32 |
| ShinyArchR.UiO: user-friendly, integrative and open-source tool for visualization of single-cell ATAC-seq data using ArchR | 32 |
| NetControl4BioMed: a web-based platform for controllability analysis of protein–protein interaction networks | 32 |
| TreeAndLeaf: an R/Bioconductor package for graphs and trees with focus on the leaves | 31 |
| Learning locality-sensitive bucketing functions | 31 |
| ACES: Analysis of Conservation with an Extensive list of Species | 31 |
| Clustering spatial transcriptomics data | 30 |
| Accurate assembly of multiple RNA-seq samples with Aletsch | 30 |
| DeepAc4C: a convolutional neural network model with hybrid features composed of physicochemical patterns and distributed representation information for identification of N4-acetylcytidine in mRNA | 30 |
| DIMPL: a bioinformatics pipeline for the discovery of structured noncoding RNA motifs in bacteria | 30 |
| MS1Connect: a mass spectrometry run similarity measure | 30 |
| TPWshiny: an interactive R/Shiny app to explore cell line transcriptional responses to anti-cancer drugs | 30 |
| An algorithm for decoy-free false discovery rate estimation in XL-MS/MS proteomics | 29 |
| A count-based model for delineating cell–cell interactions in spatial transcriptomics data | 29 |
| CODEX: COunterfactual Deep learning for the in silico EXploration of cancer cell line perturbations | 29 |
| SigTools: exploratory visualization for genomic signals | 29 |
| Multimodal medical image fusion using adaptive co-occurrence filter-based decomposition optimization model | 29 |
| On the stability of log-rank test under labeling errors | 29 |
| dsRBPBind: modeling the effect of RNA secondary structure on double-stranded RNA–protein binding | 28 |
| BioCCP.jl: collecting coupons in combinatorial biotechnology | 28 |
| On the relation between input and output distributions of scRNA-seq experiments | 28 |
| Graph2MDA: a multi-modal variational graph embedding model for predicting microbe–drug associations | 28 |
| Conway–Bromage–Lyndon (CBL): an exact, dynamic representation of k-mer sets | 28 |
| AttentionPert: accurately modeling multiplexed genetic perturbations with multi-scale effects | 28 |
| A heuristic algorithm solving the mutual-exclusivity-sorting problem | 28 |
| Predicting protein functions using positive-unlabeled ranking with ontology-based priors | 28 |
| HelixGAN a deep-learning methodology for conditional de novo design of α-helix structures | 28 |
| DULoc: quantitatively unmixing protein subcellular location patterns in immunofluorescence images based on deep learning features | 28 |
| ChromDL: a next-generation regulatory DNA classifier | 27 |
| Metaball skinning of synthetic astroglial morphologies into realistic mesh models for in silico simulations and visual analytics | 27 |
| AMC: accurate mutation clustering from single-cell DNA sequencing data | 27 |
| HOMELETTE: a unified interface to homology modelling software | 27 |
| Efficient permutation-based genome-wide association studies for normal and skewed phenotypic distributions | 27 |
| CACONET: a novel classification framework for microbial correlation networks | 27 |
| MANIEA: a microbial association network inference method based on improved Eclat association rule mining algorithm | 27 |
| Target–Decoy MineR for determining the biological relevance of variables in noisy datasets | 26 |
| VeTra: a tool for trajectory inference based on RNA velocity | 26 |
| NeoFox: annotating neoantigen candidates with neoantigen features | 26 |
| An integrative pipeline for circular RNA quantitative trait locus discovery with application in human T cells | 26 |
| Mocafe: a comprehensive Python library for simulating cancer development with Phase Field Models | 26 |
| Identification of cell-type-specific spatially variable genes accounting for excess zeros | 26 |
| Comparison of structural variants detected by optical mapping with long-read next-generation sequencing | 26 |
| XRRpred: accurate predictor of crystal structure quality from protein sequence | 26 |
| Topology-based sparsification of graph annotations | 26 |
| PAX2GRAPHML: a python library for large-scale regulation network analysis using BioPAX | 26 |
| Top-Down Crawl: a method for the ultra-rapid and motif-free alignment of sequences with associated binding metrics | 26 |
| CondiS web app: imputation of censored lifetimes for machine learning-based survival analysis | 26 |
| Toward comprehensive functional analysis of gene lists weighted by gene essentiality scores | 26 |
| MetaNorm: incorporating meta-analytic priors into normalization of NanoString nCounter data | 25 |
| ASHLEYS: automated quality control for single-cell Strand-seq data | 25 |
| Non-parametric modelling of temporal and spatial counts data from RNA-seq experiments | 25 |
| methyLImp2: faster missing value estimation for DNA methylation data | 25 |
| RápidoPGS: a rapid polygenic score calculator for summary GWAS data without a test dataset | 25 |
| On the stability of log-rank test under labeling errors | 25 |
| IIFDTI: predicting drug–target interactions through interactive and independent features based on attention mechanism | 25 |
| 3D GAN image synthesis and dataset quality assessment for bacterial biofilm | 25 |
| AutoCCS: automated collision cross-section calculation software for ion mobility spectrometry–mass spectrometry | 25 |
| EvoAug-TF: extending evolution-inspired data augmentations for genomic deep learning to TensorFlow | 24 |
| Selection among site-dependent structurally constrained substitution models of protein evolution by approximate Bayesian computation | 24 |
| CTISL: a dynamic stacking multi-class classification approach for identifying cell types from single-cell RNA-seq data | 24 |
| pKPDB: a protein data bank extension database of pKa and pI theoretical values | 24 |
| Isoform function prediction by Gene Ontology embedding | 24 |
| Correction to: GSpace: an exact coalescence simulator of recombining genomes under isolation by distance | 24 |
| GBZ file format for pangenome graphs | 24 |
| GEInter: an R package for robust gene–environment interaction analysis | 24 |
| Improving deep learning-based protein distance prediction in CASP14 | 24 |
| Somatic mutation effects diffused over microRNA dysregulation | 24 |
| pycofitness—Evaluating the fitness landscape of RNA and protein sequences | 24 |
| Statistical approaches for differential expression analysis in metatranscriptomics | 24 |
| Automated exploitation of deep learning for cancer patient stratification across multiple types | 24 |
| Pycallingcards: an integrated environment for visualizing, analyzing, and interpreting Calling Cards data | 24 |
| Icolos: a workflow manager for structure-based post-processing of de novo generated small molecules | 24 |
| 3D Optical Coherence Tomography image processing in BISCAP: characterization of biofilm structure and properties | 24 |
| ViTAL: Vision TrAnsformer based Low coverage SARS-CoV-2 lineage assignment | 24 |
| CLUE: exact maximal reduction of kinetic models by constrained lumping of differential equations | 24 |
| A graph neural network approach for molecule carcinogenicity prediction | 24 |
| Metagenomic functional profiling: to sketch or not to sketch? | 23 |
| DeepMHCII: a novel binding core-aware deep interaction model for accurate MHC-II peptide binding affinity prediction | 23 |
| The topology of data: opportunities for cancer research | 23 |
| IntelliPy: a GUI for analyzing IntelliCage data | 23 |
| GNN-based embedding for clustering scRNA-seq data | 23 |
| Expanding the coverage of spatial proteomics: a machine learning approach | 23 |
| RAPPPID: towards generalizable protein interaction prediction with AWD-LSTM twin networks | 23 |
| 3Cnet: pathogenicity prediction of human variants using multitask learning with evolutionary constraints | 23 |
| DDAffinity: predicting the changes in binding affinity of multiple point mutations using protein 3D structure | 23 |
| On the reliability and the limits of inference of amino acid sequence alignments | 23 |