Scientific Data

Papers
(The H4-Index of Scientific Data is 82. The table below lists the papers at or above that threshold, based on CrossRef citation counts [max. 250 papers]. Only papers published in the past four years, i.e., from 2022-05-01 to 2026-05-01, are counted.)
Article | Citations
A database of seed plants on taxonomy, geography and ecology in the Qinling-Daba Mountains and adjacent areas | 2310
Tsunami Runup Survey Data From The Taan Fjord Landslide Event | 878
Linking Research Data with Physically Preserved Research Materials in Chemistry | 841
CreelCat, a Catalog of United States Inland Creel and Angler Survey Data | 773
A dataset of scientific dates from archaeological sites in eastern Africa spanning 5000 BCE to 1800 CE | 608
Development of Gridded Root-Zone Soil Moisture Product for India, 1981–2024 | 556
A long-term ecosystem monitoring dataset from the ICP Integrated Monitoring network: biogeochemical data from 1977–2020 across 14 European countries | 535
Dynamic urban morphology mapping in Chinese cities based on local climate zone approach | 437
Dataset on heavy metal pollution assessment in freshwater ecosystems | 425
A Frontal Ablation Dataset for 49 Tidewater Glaciers in Greenland | 384
A semantic approach to mapping the Provenance Ontology to Basic Formal Ontology | 380
Empowering open data sharing for social good: a privacy-aware approach | 348
A western United States snow reanalysis dataset over the Landsat era from water years 1985 to 2021 | 301
A large-scale dataset of patient summaries for retrieval-based clinical decision support systems | 301
A validated Mandarin Chinese Auditory Emotion Database of Subject-Personal-Pronoun Sentences (MCAE-SPPS) | 299
Chromosome-level genome assembly of the alpine extremophyte Tibetan snow lotus, Saussurea hypsipeta Diels | 278
FIGARO-E3: a high-resolution extended multi-regional input-output database consistent with official statistics | 209
T1DiabetesGranada: a longitudinal multi-modal dataset of type 1 diabetes mellitus | 209
Pennsieve: A Collaborative Platform for Translational Neuroscience and Beyond | 198
Multiclass Dataset for Intelligent Detection of Wind Turbine Blade Defects Using Drone Imagery | 187
An 8-model ensemble of CMIP6-derived ocean surface wave climate | 176
A large EEG dataset for studying cross-session variability in motor imagery brain-computer interface | 174
Occurrence of human infection with Salmonella Typhi in sub-Saharan Africa | 173
Ensemble of CMIP6 derived reference and potential evapotranspiration with radiative and advective components | 159
Globe-LFMC 2.0, an enhanced and updated dataset for live fuel moisture content research | 158
Head model dataset for mixed reality navigation in neurosurgical interventions for intracranial lesions | 155
Dataset on the effects of psychological care on depression and suicide ideation in underrepresented children | 154
A Field-Level Asset Mapping Dataset for England’s Agricultural Sector | 151
Near-complete reference genome assembly of Hoya carnosa | 151
Enrichment of lung cancer computed tomography collections with AI-derived annotations | 150
Chromosome-level assemblies of cultivated water chestnut Trapa bicornis and its wild relative Trapa incisa | 145
The first high-quality chromosome-level genome of Parupeneus biaculeatus using HiFi and Hi-C data | 144
A thermosurvey dataset: Older adults’ experiences and adaptation to urban heat and climate change | 143
Molecular landscape of respiratory infection: A large-scale, multi-centre blood transcriptome dataset | 142
Author Correction: Mobility networks in Greater Mexico City | 133
Author Correction: The Plegma dataset: Domestic appliance-level and aggregate electricity demand with metadata from Greece | 132
Chromosome-level genome assembly of rock carp (Procypris rabaudi) | 130
Author Correction: Database covering the prayer movements which were not available previously | 130
Author Correction: GERDA: The German Election Database | 130
A chromosome-scale assembly of Ormosia boluoensis (Fabaceae) | 128
Chromosome-level genome assembly of the Rhizoctonia solani | 128
A curated dataset of great ape genome diversity | 127
Sea ice records over more than a century at an observatory facing the Okhotsk coast of Hokkaido, Japan | 127
Hydrological model-based streamflow reconstruction for Indian sub-continental river basins, 1951–2021 | 125
Slovak database of speech affected by neurodegenerative diseases | 120
RailFOD23: A dataset for foreign object detection on railroad transmission lines | 119
MarNemaFunDiv: a first comprehensive dataset of functional traits for marine nematodes | 116
BUS-UCLM: Breast ultrasound lesion segmentation dataset | 113
Enhancing radiomics and Deep Learning systems through the standardization of medical imaging workflows | 112
Multimodal Data for the Detection of Freezing of Gait in Parkinson’s Disease | 111
An open dataset for oracle bone character recognition and decipherment | 111
The interplay between brain and behavior during development: A multisite effort to generate and share simulated datasets | 110
PPB-Affinity: Protein-Protein Binding Affinity dataset for AI-based protein drug discovery | 109
A construction waste landfill dataset of two districts in Beijing, China from high resolution satellite images | 107
Very High Resolution Projections over Italy under different CMIP5 IPCC scenarios | 103
Bounding the costs of electric vehicle managed charging—supply curves for scenarios from 2025 to 2050 | 102
A bimodal dataset for diabetes research | 102
Full Field Digital Mammography Dataset from a Population Screening Program | 101
Chromosomal-level genome assembly of Ichthyurus bourgeoisi Gestro using PacBio HiFi and Hi-C sequencing | 101
Machine learning-ready remote sensing data for Maya archaeology | 98
Global Ocean Particulate Organic Phosphorus, Carbon, Oxygen for Respiration, and Nitrogen (GO-POPCORN) | 97
A dataset for deep learning based detection of printed circuit board surface defect | 96
A Global Database of Soil Plant Available Phosphorus | 96
Home monitoring with connected mobile devices for asthma attack prediction with machine learning | 94
EEG Dataset for the Recognition of Different Emotions Induced in Voice-User Interaction | 93
QMugs, quantum mechanical properties of drug-like molecules | 92
A focus groups study on data sharing and research data management | 90
Dataset for studying deformation in 3D patient-specific pulmonary artery anatomies | 90
Scaling up SoccerNet with multi-view spatial localization and re-identification | 90
A Synthetic Dataset for Semantic Segmentation of Waterbodies in Out-of-Distribution Situations | 89
Multi-proteomics and interactome dataset of tick-borne encephalitis virus infected host cells | 88
Coswara: A respiratory sounds and symptoms dataset for remote screening of SARS-CoV-2 infection | 88
Analysis of AlphaMissense data in different protein groups and structural context | 87
A longitudinal cross-country dataset on agricultural productivity and welfare in Sub-Saharan Africa | 86
ML-extendable framework for multiphysics-multiscale simulation workflow and data management using Kadi4Mat | 86
A haplotype-resolved chromosomal-level genome assembly of Oxalis articulata | 85
A comprehensive dataset of riverine levee overtopping events for advancing risk assessment | 85
Correction: Sea lice infestation dataset for wild and farmed salmon populations on the Pacific coast of Canada (2001–2023) | 85
A near-global dataset of dissolved organic carbon concentrations and yields in forested headwater streams | 84
Shotgun metagenomes from productive lakes in an urban region of Sweden | 84
What’s the TEE: Metrics of Temperature Extremes in Europe NUTS Regions (1980-2024) | 83
A dataset of the daily edge of each polynya in the Antarctic | 83
A near-telomere-to-telomere genome assembly of the Chinese soft-shelled turtle (Pelodiscus sinensis) | 82
Spatial and temporal data to study residential heat decarbonisation pathways in England and Wales | 82
NeuMa - the absolute Neuromarketing dataset en route to an holistic understanding of consumer behaviour | 82
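
The page does not define how the H4-Index is computed. A minimal sketch follows, assuming it is the standard h-index (the largest h such that at least h papers have h or more citations) restricted to papers from the four-year window. The function name `h_index` and the sample counts are hypothetical and chosen for illustration only.

```python
def h_index(citations):
    """Return the largest h such that at least h papers have >= h citations.

    Assumption: this is the standard h-index definition; the page itself
    does not state how its H4-Index is derived.
    """
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Hypothetical citation counts, not taken from the table above.
sample = [10, 8, 5, 4, 3]
print(h_index(sample))  # prints 4
```

Under this assumption, the input for the H4-Index would be the citation counts of only those papers published inside the stated four-year window.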