Scientific Data

Papers
(The H4-Index of Scientific Data is 75. The table below lists the papers at or above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-12-01 to 2025-12-01.)
Article | Citations
Author Correction: The Plegma dataset: Domestic appliance-level and aggregate electricity demand with metadata from Greece | 1688
Author Correction: Mobility networks in Greater Mexico City | 681
A database of seed plants on taxonomy, geography and ecology in the Qinling-Daba Mountains and adjacent areas | 675
Identifying Cocoa Flower Visitors: A Deep Learning Dataset | 595
Tsunami Runup Survey Data From The Taan Fjord Landslide Event | 420
Chromosome-level genome assembly of Oriental chestnut gall wasp (Dryocosmus kuriphilus) | 413
Multi-proteomics and interactome dataset of tick-borne encephalitis virus infected host cells | 406
Linking Research Data with Physically Preserved Research Materials in Chemistry | 401
Chromosome-level genome assembly of the Rhizoctonia solani | 362
Occurrence of human infection with Salmonella Typhi in sub-Saharan Africa | 362
EEG Dataset for the Recognition of Different Emotions Induced in Voice-User Interaction | 294
CreelCat, a Catalog of United States Inland Creel and Angler Survey Data | 293
An Enhanced Phenology Dataset for Global Drylands from 2001 to 2019 | 244
In toto light sheet fluorescence microscopy live imaging datasets of Ceratitis capitata embryonic development | 232
A dataset of scientific dates from archaeological sites in eastern Africa spanning 5000 BCE to 1800 CE | 227
A dataset of the daily edge of each polynya in the Antarctic | 223
A daily high-resolution (1 km) human thermal index collection over the North China Plain from 2003 to 2020 | 189
A focus groups study on data sharing and research data management | 180
The Latin American Legislators Dataset | 175
PAVC: The foundation for a Pan-Arctic Vegetation Cover database | 168
OPERAnet, a multimodal activity recognition dataset acquired from radio frequency and vision-based sensors | 157
A global dataset of fossil fungi records from the Cenozoic | 146
A database of steric and electronic properties of heteroaryl substituents | 142
T1DiabetesGranada: a longitudinal multi-modal dataset of type 1 diabetes mellitus | 142
An 8-model ensemble of CMIP6-derived ocean surface wave climate | 141
What’s the TEE: Metrics of Temperature Extremes in Europe NUTS Regions (1980-2024) | 139
Mediterranean marine sediment cores database: unlocking paleoclimatic signals for the last 20,000 years | 126
Dataset on the effects of psychological care on depression and suicide ideation in underrepresented children | 122
Near-complete reference genome assembly of Hoya carnosa | 121
Empowering open data sharing for social good: a privacy-aware approach | 118
A Simulated Comprehensive Photon Flux Shielding Spectra Dataset for Advanced Radiation Safety Assessment | 118
A Field-Level Asset Mapping Dataset for England’s Agricultural Sector | 118
Chromosome-level genome assembly of rock carp (Procypris rabaudi) | 117
Enrichment of lung cancer computed tomography collections with AI-derived annotations | 117
Chromosome-level assemblies of cultivated water chestnut Trapa bicornis and its wild relative Trapa incisa | 116
The first high-quality chromosome-level genome of Parupeneus biaculeatus using HiFi and Hi-C data | 116
A chromosome-scale assembly of Ormosia boluoensis (Fabaceae) | 112
Author Correction: Database covering the prayer movements which were not available previously | 111
A thermosurvey dataset: Older adults’ experiences and adaptation to urban heat and climate change | 110
A Frontal Ablation Dataset for 49 Tidewater Glaciers in Greenland | 109
Unveiling the Spatiotemporal Dynamics of Global Brain Circulation: A Comprehensive Corpus (2000–2024) | 109
Students’ performance dataset for using machine learning technique in physics education research | 107
The Superfund Research Program Analytics Portal: linking environmental chemical exposure to biological phenotypes | 106
ML-extendable framework for multiphysics-multiscale simulation workflow and data management using Kadi4Mat | 105
Chromosome-level haplotype-resolved genome assembly of bread wheat’s wild relative Aegilops mutica | 102
Multi-Domain Indoor Dataset for Visual Place Recognition and Anomaly Detection by Mobile Robots | 100
District-scale surface temperatures generated from high-resolution longitudinal thermal infrared images | 99
An open-access database of nature-based carbon offset project boundaries | 99
Statistical performance indicators and index—a new tool to measure country statistical capacity | 98
NeuMa - the absolute Neuromarketing dataset en route to an holistic understanding of consumer behaviour | 98
Head model dataset for mixed reality navigation in neurosurgical interventions for intracranial lesions | 98
A longitudinal cross-country dataset on agricultural productivity and welfare in Sub-Saharan Africa | 97
Author Correction: Whales from space dataset, an annotated satellite image dataset of whales for training machine learning models | 95
Machine learning-ready remote sensing data for Maya archaeology | 95
Home monitoring with connected mobile devices for asthma attack prediction with machine learning | 95
Optimizing drug combination and mechanism analysis based on risk pathway crosstalk in pan cancer | 91
Slovak database of speech affected by neurodegenerative diseases | 90
Canopy height model and NAIP imagery pairs across CONUS | 87
A neuroimaging dataset during sequential color qualia similarity judgments with and without reports | 85
Chromosome-level genome assembly of the traditional medicinal plant Lindera aggregata | 85
The Carbon Catalogue, carbon footprints of 866 commercial products from 8 industry sectors and 5 continents | 84
Hydrological model-based streamflow reconstruction for Indian sub-continental river basins, 1951–2021 | 83
SDUST2023GRA_MSS: the new global marine gravity anomaly model determined from mean sea surface model | 83
GARD-LENS: A downscaled large ensemble dataset for understanding future climate and its uncertainties | 81
Shotgun metagenomes from productive lakes in an urban region of Sweden | 80
A century-long eddy-resolving simulation of global oceanic large- and mesoscale state | 80
The R package for DICOM to brain imaging data structure conversion | 79
Ultra-deep sequencing data from a liquid biopsy proficiency study demonstrating analytic validity | 79
The interplay between brain and behavior during development: A multisite effort to generate and share simulated datasets | 79
Author Correction: GERDA: The German Election Database | 79
Bioclimatic atlas of the terrestrial Arctic | 79
A semantic approach to mapping the Provenance Ontology to Basic Formal Ontology | 78
A Synthetic Dataset for Semantic Segmentation of Waterbodies in Out-of-Distribution Situations | 78
Generating FAIR research data in experimental tribology | 77
Unified access to up-to-date residue-level annotations from UniProtKB and other biological databases for PDB data | 76
A construction waste landfill dataset of two districts in Beijing, China from high resolution satellite images | 75
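
The threshold used for the list above can be illustrated with a minimal sketch. It assumes the H4-Index follows the usual h-index definition (the largest h such that h papers each have at least h citations), restricted to papers published in the four-year window. The page does not spell out the definition, so that reading, the inclusive window bounds, the function names `h_index` and `h4_index`, and the sample data are all assumptions made for illustration.

```python
from datetime import date


def h_index(citations):
    """Largest h such that h papers each have at least h citations."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def h4_index(papers, window_start, window_end):
    """h-index over papers published inside the window (bounds assumed inclusive).

    `papers` is an iterable of (publication_date, citation_count) pairs.
    """
    in_window = [
        count
        for published, count in papers
        if window_start <= published <= window_end
    ]
    return h_index(in_window)


# Made-up sample data, not taken from the list above.
sample = [
    (date(2022, 5, 1), 10),
    (date(2023, 1, 1), 4),
    (date(2024, 6, 1), 3),
    (date(2025, 2, 2), 1),
    (date(2020, 1, 1), 100),  # outside the window, so ignored
]
result = h4_index(sample, date(2021, 12, 1), date(2025, 12, 1))
print(result)
```

Under this reading, a journal with an index of 75 has at least 75 papers in the window with 75 or more citations each, which is why the last row of the list sits exactly at 75.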