Bioinformatics

Papers
(The TQCC of Bioinformatics is 13. The table below lists the papers above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2020-05-01 to 2024-05-01.)
Article | Citations
clinker & clustermap.js: automatic generation of gene cluster comparison figures | 577
YaHS: yet another Hi-C scaffolding tool | 479
GraphDTA: predicting drug–target binding affinity with graph neural networks | 363
Analysing high-throughput sequencing data in Python with HTSeq 2.0 | 355
Liftoff: accurate mapping of gene annotations | 339
New strategies to improve minimap2 alignment accuracy | 330
GTDB-Tk v2: memory friendly classification with the genome taxonomy database | 311
DNABERT: pre-trained Bidirectional Encoder Representations from Transformers model for DNA-language in genome | 294
CoV-AbDab: the coronavirus antibody database | 265
LDpred2: better, faster, stronger | 261
STREME: accurate and versatile sequence motif discovery | 249
pyGenomeTracks: reproducible plots for multivariate genomic datasets | 237
CAFE 5 models variation in evolutionary rates among gene families | 230
ProteinBERT: a universal deep-learning model of protein sequence and function | 217
A multimodal deep learning framework for predicting drug–drug interaction events | 199
TransformerCPI: improving compound–protein interaction prediction by sequence-based deep learning with self-attention mechanism and label reversal experiments | 194
CoV-Spectrum: analysis of globally shared SARS-CoV-2 data to identify and characterize new variants | 191
MolTrans: Molecular Interaction Transformer for drug–target interaction prediction | 175
DeepPurpose: a deep learning library for drug–target interaction prediction | 171
fastsimcoal2: demographic inference under complex evolutionary scenarios | 160
Metaviral SPAdes: assembly of viruses from metagenomic data | 148
LocusZoom.js: interactive and embeddable visualization of genetic association study results | 148
Dream: powerful differential expression analysis for repeated measures designs | 139
scVAE: variational auto-encoders for single-cell gene expression data | 133
Nebulosa recovers single-cell gene expression signals by kernel density estimation | 132
glmGamPoi: fitting Gamma-Poisson generalized linear models on single cell count data | 125
DeepCDA: deep cross-domain compound–protein affinity prediction through LSTM and convolutional neural networks | 118
Fast and sensitive taxonomic assignment to metagenomic contigs | 114
UCSC Cell Browser: visualize your single-cell data | 114
Weighted minimizer sampling improves long read mapping | 112
DeepLGP: a novel deep learning method for prioritizing lncRNA target genes | 106
Colour deconvolution: stain unmixing in histological imaging | 104
COVID-19 Docking Server: a meta server for docking small molecules, peptides and antibodies against potential targets of COVID-19 | 103
DeepCDR: a hybrid graph convolutional network for predicting cancer drug response | 100
microbiomeMarker: an R/Bioconductor package for microbiome marker identification and visualization | 98
Unsupervised topological alignment for single-cell multi-omics integration | 96
The VEGA suite of programs: an versatile platform for cheminformatics and drug design projects | 96
IDP-Seq2Seq: identification of intrinsically disordered regions based on sequence to sequence learning | 93
ProDy 2.0: increased scale and scope after 10 years of protein dynamics modelling with Python | 93
dittoSeq: universal user-friendly single-cell and bulk RNA sequencing visualization toolkit | 91
MUFFIN: multi-scale feature fusion for drug–drug interaction prediction | 89
Scirpy: a Scanpy extension for analyzing single-cell T-cell receptor-sequencing data | 88
FlaGs and webFlaGs: discovering novel biology through the analysis of gene neighbourhood conservation | 88
ShinyCell: simple and sharable visualization of single-cell gene expression data | 87
MGIDI: toward an effective multivariate selection in biological experiments | 86
BERT4Bitter: a bidirectional encoder representations from transformers (BERT)-based model for improving the prediction of bitter peptides | 85
ggtranscript: an R package for the visualization and interpretation of transcript isoforms using ggplot2 | 84
PBSIM2: a simulator for long-read sequencers with a novel generative model of quality scores | 83
Information theoretic generalized Robinson–Foulds metrics for comparing phylogenetic trees | 82
DNA Features Viewer: a sequence annotation formatting and plotting library for Python | 82
Predicting human microbe–drug associations via graph convolutional network with conditional random field | 82
TITAN: T-cell receptor specificity prediction with bimodal attention networks | 80
Accurate, scalable cohort variant calls using DeepVariant and GLnexus | 80
Systematic determination of the mitochondrial proportion in human and mice tissues for single-cell RNA-sequencing data quality control | 79
LightBBB: computational prediction model of blood–brain-barrier penetration based on LightGBM | 79
ABlooper: fast accurate antibody CDR loop structure prediction with accuracy estimation | 78
HiSCF: leveraging higher-order structures for clustering analysis in biological networks | 77
Fast gap-affine pairwise alignment using the wavefront algorithm | 77
Efficient toolkit implementing best practices for principal component analysis of population genetic data | 77
plotsr: visualizing structural similarities and rearrangements between multiple genomes | 75
PhyKIT: a broadly applicable UNIX shell toolkit for processing and analyzing phylogenomic data | 73
methylclock: a Bioconductor package to estimate DNA methylation age | 69
POKY: a software suite for multidimensional NMR and 3D structure calculation of biomolecules | 68
SoluProt: prediction of soluble protein expression in Escherichia coli | 68
SumGNN: multi-typed drug interaction prediction via efficient knowledge graph summarization | 68
Protein interaction interface region prediction by geometric deep learning | 68
Unsupervised protein embeddings outperform hand-crafted sequence and structure features at predicting molecular function | 68
Structure-aware protein–protein interaction site prediction using deep graph convolutional network | 67
Make Interactive Complex Heatmaps in R | 66
Webina: an open-source library and web app that runs AutoDock Vina entirely in the web browser | 66
PyMod 3: a complete suite for structural bioinformatics in PyMOL | 66
GraphQA: protein model quality assessment using graph convolutional networks | 65
propeller: testing for differences in cell type proportions in single cell data | 65
DELPHI: accurate deep ensemble model for protein interaction sites prediction | 65
CellProfiler Analyst 3.0: accessible data exploration and machine learning for image analysis | 64
NanoCLUST: a species-level analysis of 16S rRNA nanopore sequencing data | 64
MDeePred: novel multi-channel protein featurization for deep learning-based binding affinity prediction in drug discovery | 64
DeepTE: a computational method for de novo classification of transposons with convolutional neural network | 63
BP4RNAseq: a babysitter package for retrospective and newly generated RNA-seq data analyses using both alignment-based and alignment-free quantification method | 63
Gene regulation inference from single-cell RNA-seq data with linear differential equations and velocity inference | 63
PASSION: an ensemble neural network approach for identifying the binding sites of RBPs on circRNAs | 63
COVID-19 Knowledge Graph: a computable, multi-modal, cause-and-effect knowledge model of COVID-19 pathophysiology | 62
HyperAttentionDTI: improving drug–protein interaction prediction by sequence-based deep learning with attention mechanism | 62
Impact of protein conformational diversity on AlphaFold predictions | 61
Conditional out-of-distribution generation for unpaired data using transfer VAE | 61
iEnhancer-XG: interpretable sequence-based enhancers and their strength predictor | 61
Deuteros 2.0: peptide-level significance testing of data from hydrogen deuterium exchange mass spectrometry | 60
DeepSurf: a surface-based deep learning approach for the prediction of ligand binding sites on proteins | 60
Cellsnp-lite: an efficient tool for genotyping single cells | 60
V-pipe: a computational pipeline for assessing viral genetic diversity from high-throughput data | 60
UCSCXenaShiny: an R/CRAN package for interactive analysis of UCSC Xena data | 59
MOVICS: an R package for multi-omics integration and visualization in cancer subtyping | 58
iCarPS: a computational tool for identifying protein carbonylation sites by novel encoded features | 58
Prediction of antimicrobial resistance based on whole-genome sequencing and machine learning | 57
RNA-SeQC 2: efficient RNA-seq quality control and quantification for large cohorts | 56
COVID-2019-associated overexpressed Prevotella proteins mediated host–pathogen interactions and their role in coronavirus outbreak | 56
ToxIBTL: prediction of peptide toxicity based on information bottleneck and transfer learning | 56
Toward heterogeneous information fusion: bipartite graph convolutional networks for in silico drug repurposing | 56
Extended connectivity interaction features: improving binding affinity prediction through chemical description | 55
DeepGraphGO: graph neural network for large-scale, multispecies protein function prediction | 54
Subtype-GAN: a deep learning approach for integrative cancer subtyping of multi-omics data | 53
TALE: Transformer-based protein function Annotation with joint sequence–Label Embedding | 53
ImmuCellAI-mouse: a tool for comprehensive prediction of mouse immune cell abundance and immune microenvironment depiction | 53
Mutalyzer 2: next generation HGVS nomenclature checker | 52
MBG: Minimizer-based sparse de Bruijn Graph construction | 52
Current structure predictors are not learning the physics of protein folding | 52
SpatialExperiment: infrastructure for spatially-resolved transcriptomics data in R using Bioconductor | 51
SAINT: self-attention augmented inception-inside-inception network improves protein secondary structure prediction | 51
Bacteriophage classification for assembled contigs using graph convolutional network | 51
VPF-Class: taxonomic assignment and host prediction of uncultivated viruses based on viral protein families | 51
Automated inference of Boolean models from molecular interaction maps using CaSQ | 51
lncLocator 2.0: a cell-line-specific subcellular localization predictor for long non-coding RNAs with interpretable deep learning | 51
ViralMSA: massively scalable reference-guided multiple sequence alignment of viral genomes | 51
HunFlair: an easy-to-use tool for state-of-the-art biomedical named entity recognition | 51
eMPRess: a systematic cophylogeny reconciliation tool | 51
Geometric potentials from deep learning improve prediction of CDR H3 loop structures | 50
Humanization of antibodies using a machine learning approach on large-scale repertoire data | 50
EpiDope: a deep neural network for linear B-cell epitope prediction | 49
BWA-MEME: BWA-MEM emulated with a machine learning approach | 49
Evaluating single-cell cluster stability using the Jaccard similarity index | 49
Identification of sub-Golgi protein localization by use of deep representation learning features | 48
AEMDA: inferring miRNA–disease associations based on deep autoencoder | 48
MVGCN: data integration through multi-view graph convolutional network for predicting links in biomedical bipartite networks | 47
stPlus: a reference-based method for the accurate enhancement of spatial transcriptomics | 47
Interfacing Seurat with the R tidy universe | 47
DamageProfiler: fast damage pattern calculation for ancient DNA | 46
Ribbon: intuitive visualization for complex genomic variation | 46
Improved RNA secondary structure and tertiary base-pairing prediction using evolutionary profile, mutational coupling and two-dimensional transfer learning | 46
Cellinker: a platform of ligand–receptor interactions for intercellular communication analysis | 46
DLAB: deep learning methods for structure-based virtual screening of antibodies | 45
SVIM-asm: structural variant detection from haploid and diploid genome assemblies | 45
MobiDB-lite 3.0: fast consensus annotation of intrinsic disorder flavors in proteins | 45
StainedGlass: interactive visualization of massive tandem repeat structures with identity heatmaps | 45
OPUS-TASS: a protein backbone torsion angles and secondary structure predictor based on ensemble neural networks | 44
amPEPpy 1.0: a portable and accurate antimicrobial peptide prediction tool | 44
RCSB Protein Data Bank: improved annotation, search and visualization of membrane protein structures archived in the PDB | 43
Manifold alignment for heterogeneous single-cell multi-omics data integration using Pamona | 43
HierCC: a multi-level clustering scheme for population assignments based on core genome MLST | 43
LinearPartition: linear-time approximation of RNA folding partition function and base-pairing probabilities | 42
GMNN2CD: identification of circRNA–disease associations based on variational inference and graph Markov neural networks | 42
Swarm v3: towards tera-scale amplicon clustering | 42
FL-QSAR: a federated learning-based QSAR prototype for collaborative drug discovery | 42
CoMut: visualizing integrated molecular information with comutation plots | 41
ODGI: understanding pangenome graphs | 41
Highly accurate classification of chest radiographic reports using a deep learning natural language model pre-trained on 3.8 million text reports | 41
mlr3proba: an R package for machine learning in survival analysis | 41
The ortholog conjecture revisited: the value of orthologs and paralogs in function prediction | 41
AMICI: high-performance sensitivity analysis for large ordinary differential equation models | 41
ToxDL: deep learning using primary structure and domain embeddings for assessing protein toxicity | 40
TaxoNN: ensemble of neural networks on stratified microbiome data for disease prediction | 40
Solubility-Weighted Index: fast and accurate prediction of protein solubility | 40
Plotgardener: cultivating precise multi-panel figures in R | 40
TandemTools: mapping long reads and assessing/improving assembly quality in extra-long tandem repeats | 40
Real-time mapping of nanopore raw signals | 40
MultiDTI: drug–target interaction prediction based on multi-modal representation learning to bridge the gap between new chemical entities and known heterogeneous network | 40
Socket2: a program for locating, visualizing and analyzing coiled-coil interfaces in protein structures | 39
Using drug descriptions and molecular structures for drug–drug interaction extraction from literature | 39
UniRule: a unified rule resource for automatic annotation in the UniProt Knowledgebase | 38
MR-Clust: clustering of genetic variants in Mendelian randomization with similar causal estimates | 38
Fijiyama: a registration tool for 3D multimodal time-lapse imaging | 38
ganon: precise metagenomics classification against large and up-to-date sets of reference sequences | 38
Deep graph learning of inter-protein contacts | 37
PROSS 2: a new server for the design of stable and highly expressed protein variants | 37
GPDBN: deep bilinear network integrating both genomic data and pathological images for breast cancer prognosis prediction | 37
MAGUS: Multiple sequence Alignment using Graph clUStering | 37
DTF: Deep Tensor Factorization for predicting anticancer drug synergy | 37
DeepEventMine: end-to-end neural nested event extraction from biomedical texts | 37
orfipy: a fast and flexible tool for extracting ORFs | 37
ASpli: integrative analysis of splicing landscapes through RNA-Seq assays | 37
Tiara: deep learning-based classification system for eukaryotic sequences | 36
SCIM: universal single-cell matching with unpaired feature sets | 36
SHOGUN: a modular, accurate and scalable framework for microbiome quantification | 36
MIB2: metal ion-binding site prediction and modeling server | 36
iPromoter-BnCNN: a novel branched CNN-based predictor for identifying and classifying sigma promoters | 36
MungeSumstats: a Bioconductor package for the standardization and quality control of many GWAS summary statistics | 36
VIDHOP, viral host prediction with deep learning | 36
BACPI: a bi-directional attention neural network for compound–protein interaction and binding affinity prediction | 36
Improved survival analysis by learning shared genomic information from pan-cancer data | 36
Stitching and registering highly multiplexed whole-slide images of tissues and tumors using ASHLAR | 36
FraGAT: a fragment-oriented multi-scale graph attention model for molecular property prediction | 35
QDeep: distance-based protein model quality estimation by residue-level ensemble error classifications using stacked deep residual neural networks | 35
Advanced graph and sequence neural networks for molecular property prediction and drug discovery | 35
Graph neural representational learning of RNA secondary structures for predicting RNA-protein interactions | 35
BERT-Kcr: prediction of lysine crotonylation sites by a transfer learning method with pre-trained BERT models | 35
DeepViral: prediction of novel virus–host interactions from protein sequences and infectious disease phenotypes | 35
Coronavirus3D: 3D structural visualization of COVID-19 genomic divergence | 35
SpacePHARER: sensitive identification of phages from CRISPR spacers in prokaryotic hosts | 35
Generating property-matched decoy molecules using deep learning | 35
monaLisa: an R/Bioconductor package for identifying regulatory motifs | 35
TRTools: a toolkit for genome-wide analysis of tandem repeats | 34
BERN2: an advanced neural biomedical named entity recognition and normalization tool | 34
Ensembling graph attention networks for human microbe–drug association prediction | 34
PhosIDN: an integrated deep neural network for improving protein phosphorylation site prediction by combining sequence and protein–protein interaction information | 34
SOMDE: a scalable method for identifying spatially variable genes with self-organizing map | 33
FBA reveals guanylate kinase as a potential target for antiviral therapies against SARS-CoV-2 | 33
ELIXIR: providing a sustainable infrastructure for life science data at European scale | 33
SimPlot++: a Python application for representing sequence similarity and detecting recombination | 33
Bayesian modeling of spatial molecular profiling data via Gaussian process | 33
TissUUmaps: interactive visualization of large-scale spatial gene expression and tissue morphology data | 33
DNA Chisel, a versatile sequence optimizer | 33
BridgeDPI: a novel Graph Neural Network for predicting drug–protein interactions | 33
synergy: a Python library for calculating, analyzing and visualizing drug combination synergy | 33
Multi-omics data integration by generative adversarial network | 32
REINDEER: efficient indexing of k-mer presence and abundance in sequencing datasets | 32
Deep cross-omics cycle attention model for joint analysis of single-cell multi-omics data | 32
coronaSPAdes: from biosynthetic gene clusters to RNA viral assemblies | 32
Modeling multi-scale data via a network of networks | 32
cytomapper: an R/Bioconductor package for visualization of highly multiplexed imaging data | 32
DisoLipPred: accurate prediction of disordered lipid-binding residues in protein sequences with deep recurrent networks and transfer learning | 31
Deep learning models for RNA secondary structure prediction (probably) do not generalize across families | 31
KG4SL: knowledge graph neural network for synthetic lethality prediction in human cancers | 31
Inference of gene regulatory networks based on nonlinear ordinary differential equations | 31
Adversarial deconfounding autoencoder for learning robust gene expression embeddings | 31
TPpred-ATMV: therapeutic peptide prediction by adaptive multi-view tensor learning model | 31
MHCAttnNet: predicting MHC-peptide bindings for MHC alleles classes I and II using an attention-based deep neural model | 31
Transfer learning via multi-scale convolutional neural layers for human–virus protein–protein interaction prediction | 31
scGate: marker-based purification of cell types from heterogeneous single-cell RNA-seq datasets | 31
A deep dilated convolutional residual network for predicting interchain contacts of protein homodimers | 30
EpiGraphDB: a database and data mining platform for health data science | 30
Pre-training graph neural networks for link prediction in biomedical networks | 30
Identifying signaling genes in spatial single-cell expression data | 30
Integrating multi-OMICS data through sparse canonical correlation analysis for the prediction of complex traits: a comparison study | 30
Inferring cancer progression from Single-Cell Sequencing while allowing mutation losses | 30
sepal: identifying transcript profiles with spatial patterns by diffusion-based modeling | 30
E-MAGMA: an eQTL-informed method to identify risk genes using genome-wide association study summary statistics | 30
STACAS: Sub-Type Anchor Correction for Alignment in Seurat to integrate single-cell RNA-seq data | 30
Learning embedding features based on multisense-scaled attention architecture to improve the predictive performance of anticancer peptides | 30
Node similarity-based graph convolution for link prediction in biological networks | 30
Predicting protein–peptide binding residues via interpretable deep learning | 30
ipDMR: identification of differentially methylated regions with interval P-values | 30
CLNN-loop: a deep learning model to predict CTCF-mediated chromatin loops in the different cell lines and CTCF-binding sites (CBS) pair types | 30
Topsy-Turvy: integrating a global view into sequence-based PPI prediction | 29
The string decomposition problem and its applications to centromere analysis and assembly | 29
BERTMHC: improved MHC–peptide class II interaction prediction with transformer and multiple instance learning | 29
Accurate spliced alignment of long RNA sequencing reads | 29
A mixture model for determining SARS-Cov-2 variant composition in pooled samples | 28
Improved design and analysis of practical minimizers | 28
PecanPy: a fast, efficient and parallelized Python implementation of node2vec | 28
NerLTR-DTA: drug–target binding affinity prediction based on neighbor relationship and learning to rank | 28
Supervised graph co-contrastive learning for drug–target interaction prediction | 28
AITL: Adversarial Inductive Transfer Learning with input and output space adaptation for pharmacogenomics | 28
Integrating genome-scale metabolic modelling and transfer learning for human gene regulatory network reconstruction | 28
DeepTrio: a ternary prediction system for protein–protein interaction using mask multiple parallel convolutional neural networks | 28
Statistical approaches for differential expression analysis in metatranscriptomics | 28
SPOT-Contact-LM: improving single-sequence-based prediction of protein contact map using a transformer language model | 27
DRUMMER—rapid detection of RNA modifications through comparative nanopore sequencing | 27
Cross-dependent graph neural networks for molecular property prediction | 27
CrossTalkeR: analysis and visualization of ligand–receptor networks | 27
Sigflow: an automated and comprehensive pipeline for cancer genome mutational signature analysis | 27
Non-parametric modelling of temporal and spatial counts data from RNA-seq experiments | 27
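
The selection rule stated in the note above the table (citation threshold, publication window, cap of 250 papers) can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Paper` record, the `select_papers` helper, and the sample entries are hypothetical, "above the threshold" is read as strictly greater, the date window is treated as inclusive, and the cap is assumed to apply after sorting by citations. The page does not say how the TQCC itself is computed, so it is taken here as a given number.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Paper:
    """Hypothetical record: one row of the table plus a publication date."""
    title: str
    citations: int
    published: date


def select_papers(papers, threshold, start, end, max_papers=250):
    """Keep papers inside [start, end] with citations above the threshold,
    sorted by citation count (highest first) and capped at max_papers."""
    eligible = [
        p for p in papers
        if start <= p.published <= end   # assumption: inclusive window
        and p.citations > threshold      # assumption: strictly "above"
    ]
    ranked = sorted(eligible, key=lambda p: p.citations, reverse=True)
    return ranked[:max_papers]           # assumption: cap applied after sorting


# Made-up sample data, not taken from the table.
papers = [
    Paper("Paper A", 40, date(2021, 3, 1)),
    Paper("Paper B", 13, date(2022, 6, 1)),    # at the threshold, not above it
    Paper("Paper C", 90, date(2019, 12, 31)),  # outside the window
    Paper("Paper D", 27, date(2023, 1, 15)),
]

selected = select_papers(
    papers, threshold=13, start=date(2020, 5, 1), end=date(2024, 5, 1)
)
print([p.title for p in selected])  # ['Paper A', 'Paper D']
```

Under these assumptions, the cap would explain why the lowest citation count shown in the table (27) sits well above the stated threshold of 13: more than 250 papers clear the threshold, so only the 250 most cited are listed.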