Database-The Journal of Biological Databases and Curation

Papers
(The median citation count of Database-The Journal of Biological Databases and Curation is 2. The table below lists the papers at or above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2022-05-01 to 2026-05-01.)
Article | Citations
CCIDB: a manually curated cell–cell interaction database with cell context information | 65
CBGDA: a manually curated resource for gene–disease associations based on genome-wide CRISPR | 50
Empirical substitution models of protein evolution: database, relationships, and modeling considerations | 37
AVPCD: a plant-derived medicine database of antiviral phytochemicals for cancer, Covid-19, malaria and HIV | 35
GrameneOryza: a comprehensive resource for Oryza genomes, genetic variation, and functional data | 34
FishTEDB 2.0: an update fish transposable element (TE) database with new functions to facilitate TE research | 30
The importance of graph databases and graph learning for clinical applications | 28
BioKC: a collaborative platform for curation and annotation of molecular interactions | 27
CenhANCER: a comprehensive cancer enhancer database for primary tissues and cell lines | 25
Collecting and managing in situ banana genetic resources information (Musa spp.) using online resources and citizen science | 24
OncoCardioDB: a public and curated database of molecular information in onco-cardiology/cardio-oncology | 23
Building resource-efficient community databases using open-source software | 23
Post-composing ontology terms for efficient phenotyping in plant breeding | 20
Pathway-based, reaction-specific annotation of disease variants for elucidation of molecular phenotypes | 17
Integrated data-driven biotechnology research environments | 16
The Sickle Africa Data Coordinating Centre (SADaCC): a data science hub for interdisciplinary sickle cell disease research and training | 16
The overview of the BioRED (Biomedical Relation Extraction Dataset) track at BioCreative VIII | 16
GeniePool: genomic database with corresponding annotated samples based on a cloud data lake architecture | 16
Assessing the performance of generative artificial intelligence in retrieving information against manually curated genetic and genomic data | 15
Genotype and phenotype data standardization, utilization and integration in the big data era for agricultural sciences | 15
DisGeNet: a disease-centric interaction database among diseases and various associated genes | 14
CO-19 PDB 2.0: A Comprehensive COVID-19 Database with Global Auto-Alerts, Statistical Analysis, and Cancer Correlations | 13
HOFE: an interactive forensic entomological database | 13
An approach to making life sciences FAIR—FAIR-DS as a tool for Aspergillus fumigatus | 13
ESOMIR: a curated database of biomarker genes and miRNAs associated with esophageal cancer | 13
AFTM: a database of transmembrane regions in the human proteome predicted by AlphaFold | 13
CobVar—a comprehensive resource of vitamin B12-associated genomic variants | 12
Panorama: a database for the oncogenic evaluation of somatic mutations in pan-cancer | 12
BCEDB: a linear B-cell epitopes database for SARS-CoV-2 | 12
A comprehensive morphological database of hognose Porthidium pitvipers (Viperidae: Crotalinae) | 12
Interactive tools for functional annotation of bacterial genomes | 12
New approaches in developing medicinal herbs databases | 12
Optimized biomedical entity relation extraction method with data augmentation and classification using GPT-4 and Gemini | 11
AbAMPdb: a database of Acinetobacter baumannii specific antimicrobial peptides | 11
PheNormGPT: a framework for extraction and normalization of key medical findings | 11
HoloFood Data Portal: holo-omic datasets for analysing host–microbiota interactions in animal production | 11
Acupuncture indication knowledge bases: meridian entity recognition and classification based on ACUBERT | 11
Localizatome: a database for stress-dependent subcellular localization changes in proteins | 11
TCMToxDB: a comprehensive database for the toxicological analysis of traditional Chinese medicines | 11
CardioHotspots: a database of mutational hotspots for cardiac disorders | 10
Towards discovery: an end-to-end system for uncovering novel biomedical relations | 10
OncoCTMiner: streamlining precision oncology trial matching via molecular profile analysis | 10
Correction to: The overview of the BioRED (Biomedical Relation Extraction Dataset) track at BioCreative VIII | 10
GenDiS3 database: census on the prevalence of protein domain superfamilies of known structure in the entire sequence database | 10
FungiProteomeDB: a database for the molecular weight and isoelectric points of the fungal proteomes | 9
LSD600: the first corpus of biomedical abstracts annotated with lifestyle–disease relations | 9
Visualization and exploration of linked data using virtual reality | 9
Integrated ACMG-approved genes and ICD codes for the translational research and precision medicine | 9
PASS2: update of database of structure-based sequence alignments | 9
Artificial Intelligence-based database for prediction of protein structure and their alterations in ocular diseases | 8
scDrugAtlas: an integrative single-cell drug response database for dissecting tumour heterogeneity in therapeutic efficacy | 8
MoPSeq-DB: a user-friendly web application for genomic data management and analysis of marine mollusc pathogens | 8
LICEDB: light industrial core enzyme database for industrial applications and AI enzyme design | 8
StopKB: a comprehensive knowledgebase for nonsense suppression therapies | 8
Correction to: An interactive web application for exploring systemic lupus erythematosus blood transcriptomic diversity | 8
SingleQ: a comprehensive database of single-cell expression quantitative trait loci (sc-eQTLs) cross human tissues | 8
AFED, a comprehensive resource for Aspergillus flavus gene expression profiling | 7
CysDuF database: annotation and characterization of cysteine residues in domain of unknown function proteins based on cysteine post-translational modifications, their protein microenvironments, bioche | 7
Is metadata of articles about COVID-19 enough for multilabel topic classification task? | 7
SMCVdb: a database of experimental cellular toxicity information for drug candidate molecules | 7
Data sharing and ontology use among agricultural genetics, genomics, and breeding databases and resources of the Agbiodata Consortium | 7
IHM-DB: a curated collection of metagenomics data from the Indian Himalayan Region, and automated pipeline for 16S rRNA amplicon-based analysis (AutoQii2) | 7
Development of marine biodiversity database (BISMaL) to enable estimations past habitat conditions for marine life in the northwestern Pacific | 7
JTIS: enhancing biomedical document-level relation extraction through joint training with intermediate steps | 7
neomerDB: a comprehensive database of neomer biomarkers in cancer | 7
Correction to: CardioHotspots: a database of mutational hotspots for cardiac disorders | 7
Aerial Wildlife Image Repository for animal monitoring with drones in the age of artificial intelligence | 6
Toward clearer recognition and easier usefulness: development of a cross-lingual atherosclerotic cerebrovascular disease ontology | 6
gymnotoa-db: a database and application to optimize functional annotation in gymnosperms | 6
Transverse aortic constriction multi-omics analysis uncovers pathophysiological cardiac molecular mechanisms | 6
The Portuguese Beacon: sharing genomic variant data safely | 6
MANUDB: database and application to retrieve and visualize mammalian NUMTs | 6
PETCH-DB: a Portal for Exploring Tissue-specific and Complex disease-associated 5-Hydroxymethylcytosines | 6
An interactive web application for exploring systemic lupus erythematosus blood transcriptomic diversity | 6
ELiAH: the atlas of E3 ligases in human tissues for targeted protein degradation with reduced off-target effect | 6
SynVectorDB: embedding-based retrieval system for synthetic biology parts | 6
An open-source multi-semantic annotation dataset and automated recognition tool for viral carcinogenesis factors | 6
ImmRNA: a database of RNAs associated with tumor immunity | 6
MOKCa-3D database: functional and structural analysis of missense mutations in cancer | 6
Pipeline to explore information on genome editing using large language models and genome editing meta-database | 6
Protein Sequence Analysis landscape: A Systematic Review of Task Types, Databases, Datasets, Word Embeddings Methods, and Language Models | 6
ForestForward: visualizing and accessing integrated world forest data from the last 50 years | 6
Correction to: The importance of graph databases and graph learning for clinical applications | 5
Standardized pipelines support and facilitate integration of diverse datasets at the Rat Genome Database | 5
CPMKG: a condition-based knowledge graph for precision medicine | 5
NbThermo: a new thermostability database for nanobodies | 5
The state of the human coding gene catalogues | 5
CancerMHL: the database of integrating key DNA methylation, histone modifications and lncRNAs in cancer | 5
Correction to: CardioHotspots: a database of mutational hotspots for cardiac disorders | 5
ROSBASE1.0: a comprehensive database of reactive oxygen species (ROS): categorization of cell organelles, proteins, taxonomy, and diseases based on ROS-related activities | 5
Integrating AI-powered text mining from PubTator into the manual curation workflow at the Comparative Toxicogenomics Database | 5
CAS: enhancing implicit constrained data augmentation with semantic enrichment for biomedical relation extraction and beyond | 5
A novel taxonomic database for eukaryotic mitochondrial cytochrome oxidase subunit I gene (eKOI), with a focus on protists diversity | 5
Filling knowledge gaps in insect conservation by leveraging genetic data from public archives | 5
IsoProDB: an integrated map of human protein isoforms for accelerated research | 5
Transformer-based approach for symptom recognition and multilingual linking | 5
Autophagy3D: a comprehensive autophagy structure database | 4
scBrainMap: a landscape for cell types and associated genetic markers in the brain | 4
Biomedical literature-based clinical phenotype definition discovery using large language models | 4
Standardized naming of microbiome samples in Genomes OnLine Database | 4
AneRBC dataset: a benchmark dataset for computer-aided anemia diagnosis using RBC images | 4
lncHUB2: aggregated and inferred knowledge about human and mouse lncRNAs | 4
GExplore 1.5: a comprehensive Caenorhabditis elegans database for the analysis of gene function with a new user-friendly web interface | 4
scEccDNAdb: an integrated single-cell eccDNA resource for human and mouse | 4
ENCD: a manually curated database of experimentally supported endocrine system disease and lncRNA associations | 4
Peptipedia v2.0: a peptide sequence database and user-friendly web platform. A major update | 4
ARAapp: filling gaps in the ecological knowledge of spiders using an automated and dynamic approach to analyze systematically collected community data | 4
The landscape of microRNA interaction annotation: analysis of three rare disorders as a case study | 4
TRGdb: a universal resource for the exploration of taxonomically restricted genes in bacteria | 4
A review on Gene Ontology evaluations | 4
Helping authors produce FAIR taxonomic data: evaluation of an author-driven phenotype data production prototype | 4
IBDTransDB: a manually curated transcriptomic database for inflammatory bowel disease | 4
PDC: a highly compact file format to store protein 3D coordinates | 4
MantaID: a machine learning–based tool to automate the identification of biological database IDs | 4
A combinatorial approach implementing new database structures to facilitate practical data curation management of QTL, association, correlation and heritability data on trait variants | 4
PDB NextGen Archive: centralizing access to integrated annotations and enriched structural information by the Worldwide Protein Data Bank | 4
LitSumm: large language models for literature summarization of noncoding RNAs | 4
cancercelllines.org—a novel resource for genomic variants in cancer cell lines | 4
PYK-SubstitutionOME: an integrated database containing allosteric coupling, ligand affinity and mutational, structural, pathological, bioinformatic and computational information about pyruvate kinase | 4
PLoV: a comprehensive database of genetic variants leading to pregnancy loss | 4
Mapping of EFO terms from the GWAS catalog data to multiple ontologies at the Rat Genome Database | 4
predicTox: an integrated database of clinical risk frequencies and human gene expression signatures for cardiotoxic drugs | 4
HCoVDB: a comprehensive database encompassing viral genomes, drug targets, and therapeutics of human coronaviruses | 4
RegulaTome: a corpus of typed, directed, and signed relations between biomedical entities in the scientific literature | 4
AgingReG: a curated database of aging regulatory relationships in humans | 4
Probe my Pathway (PmP): a portal to explore the chemical coverage of the human Reactome | 4
Assessing the use of supplementary materials to improve genomic variant discovery | 3
Enhancing biomedical relation extraction through data-centric and preprocessing-robust ensemble learning approach | 3
A comprehensive experimental comparison between federated and centralized learning | 3
CiliaMiner: an integrated database for ciliopathy genes and ciliopathies | 3
A database on the historical and current occurrences of snakes in Eswatini | 3
Expression of Concern: DisGeNet: a disease-centric interaction database among diseases and various associated genes | 3
MineProt: a stand-alone server for structural proteome curation | 3
Biomedical relation extraction with knowledge base–refined weak supervision | 3
CBPDdb: a curated database of compounds derived from Coumarin–Benzothiazole–Pyrazole | 3
ProNet DB: a proteome-wise database for protein surface property representations and RNA-binding profiles | 3
RNA-Chrom: a manually curated analytical database of RNA–chromatin interactome | 3
BEDB: a comprehensive binding energy database for molecular docking and dynamics: insights into Human Metapneumovirus (HMPV) Inhibitors | 3
piOxi database: a web resource of germline and somatic tissue piRNAs identified by chemical oxidation | 3
DrugRepoBank: a comprehensive database and discovery platform for accelerating drug repositioning | 3
Notes on the data quality of bibliographic records from the MEDLINE database | 3
MACSFeD—a database of mosquito acoustic communication and swarming features | 3
VineColD: an integrative database for global historical tracing and real-time monitoring of grapevine cold hardiness | 3
AIMedGraph: a comprehensive multi-relational knowledge graph for precision medicine | 3
MDDOmics: multi-omics resource of major depressive disorder | 3
Dataset of xenobiotics human renal clearance values | 3
CancerPPD2: an updated repository of anticancer peptides and proteins | 3
VariantHunter: a method and tool for fast detection of emerging SARS-CoV-2 variants | 3
The biomedical relationship corpus of the BioRED track at the BioCreative VIII challenge and workshop | 3
RettDb: the Rett syndrome omics database to navigate the Rett syndrome genomic landscape | 3
CaviDB: a database of cavities and their features in the structural and conformational space of proteins | 3
CuPCA: a web server for pan-cancer association analysis of large-scale cuproptosis-related genes | 3
Global Globin Network and adopting genomic variant database requirements for thalassemia | 2
Overview of DrugProt task at BioCreative VII: data and methods for large-scale text mining and knowledge graph generation of heterogenous chemical–protein relations | 2
MASH-GA: a manually curated cross-species transcriptomic database for metabolic-associated steatohepatitis | 2
PlantIntronDB: a database for plant introns that host functional elements | 2
The Immunopeptidomics Ontology (ImPO) | 2
SesamumGDB: a comprehensive platform for Sesamum genetics and genomics analysis | 2
Assessing resource use: a case study with the Human Disease Ontology | 2
PearMODB: a multiomics database for pear (Pyrus) genomics, genetics and breeding study | 2
MBS: a genome browser annotation track for high-confident microRNA binding sites in whole human transcriptome | 2
BDCD: a comprehensive Brain Disease Cell-cell communication Database | 2
BbGSD: Black-boned Sheep Genome SNP Database | 2
The TOXIN knowledge graph: supporting animal-free risk assessment of cosmetics | 2
FoPGDB: a pangenome database of Fusarium oxysporum, a cross-kingdom fungal pathogen | 2
STCDB4ND: a signal transduction classification database for neurological diseases | 2
PotatoBSLnc: a curated repository of potato long noncoding RNAs in response to biotic stress | 2
FooDrugs: a comprehensive food–drug interactions database with text documents and transcriptional data | 2
Automated annotation of scientific texts for ML-based keyphrase extraction and validation | 2
Data set of fraction unbound values in the in vitro incubations for metabolic studies for better prediction of human clearance | 2
FatPlants: a comprehensive information system for lipid-related genes and metabolic pathways in plants | 2
Fisheries data management systems in the NW Mediterranean: from data collection to web visualization | 2
Correction to: A Terpenoids Database with the Chemical Content as A Novel Agronomic Trait | 2
MyxoPortal: a database of myxobacterial genomic features | 2
MiCK: a database of gut microbial genes linked with chemoresistance in cancer patients | 2
DISPEL: database for ascertaining the best medicinal plants to cure human diseases | 2
The landscape of health disparities in the UK Biobank | 2
PPCRKB: a risk factor knowledge base of postoperative pulmonary complications | 2
ePerturbDB: enhancer’s experimental perturbation database | 2
Genome-wide identification of SSR markers from coding regions for endangered Argania spinosa L. skeels and construction of SSR database: AsSSRdb | 2
From library to landscape: integrative annotation workflows for compound libraries in drug repurposing | 2
Automated annotation and validation of human respiratory virus sequences using VADR | 2
Maize Feature Store: A centralized resource to manage and analyze curated maize multi-omics features for machine learning applications | 2
Enhancing statistical analysis of real world data | 2
GMMID: genetically modified mice information database | 2
A dataset of tumour-infiltrating lymphocytes in colorectal cancer patients using limited resources | 2
AI4FoodDB: a database for personalized e-Health nutrition and lifestyle through wearable devices and artificial intelligence | 2
ThermoPCD: a database of molecular dynamics trajectories of antibody–antigen complexes at physiologic and fever-range temperatures | 2
InTxDB: interaction data between gram-negative bacteria secreted effectors and host proteins | 2
A comprehensive database for biological data derived from sewage in five European cities | 2
Best practices for the manual curation of intrinsically disordered proteins in DisProt | 2
CIGAF—a database and interactive platform for insect-associated trichomycete fungi | 2
nhanesA: achieving transparency and reproducibility in NHANES research | 2
GdbMTB: a manually curated genomic database of magnetotactic bacteria | 2
BgDB: a comprehensive genomic resource information system of bitter gourd for accelerated breeding programme | 2
The Genomic SSR Millets Database (GSMDB): enhancing genetic resources for sustainable agriculture | 2
PeptiHub: a curated repository of precisely annotated cancer-related peptides with advanced utilities for peptide exploration and discovery | 2
VarGuideAtlas: a repository of variant interpretation guidelines | 2
BLAB2CancerKD: a knowledge graph database focusing on the association between lactic acid bacteria and cancer, but beyond | 2
The Odonata of China: a data-driven, open-access resource for biodiversity research and conservation | 2