Scientific Data

Papers
(The H4-Index of Scientific Data is 72. The table below lists the papers at or above that threshold, based on CrossRef citation counts [max. 250 papers]. Only papers published in the past four years, i.e., from 2021-08-01 to 2025-08-01, are covered.)
Article | Citations
Shotgun metagenomes from productive lakes in an urban region of Sweden | 1256
A database of seed plants on taxonomy, geography and ecology in the Qinling-Daba Mountains and adjacent areas | 563
Ultra-deep sequencing data from a liquid biopsy proficiency study demonstrating analytic validity | 528
Slovak database of speech affected by neurodegenerative diseases | 419
CreelCat, a Catalog of United States Inland Creel and Angler Survey Data | 354
Unified access to up-to-date residue-level annotations from UniProtKB and other biological databases for PDB data | 344
Directional wave buoy data measured near Campbell Island, New Zealand | 306
RNA-seq of peripheral blood mononuclear cells of congenital generalized lipodystrophy type 2 patients | 303
Author Correction: Mobility networks in Greater Mexico City | 287
Occurrence of human infection with Salmonella Typhi in sub-Saharan Africa | 286
Reinterpretation of prostate cancer pathology by Appl1, Sortilin and Syndecan-1 biomarkers | 279
Author Correction: The Plegma dataset: Domestic appliance-level and aggregate electricity demand with metadata from Greece | 279
A Synthetic Dataset for Semantic Segmentation of Waterbodies in Out-of-Distribution Situations | 242
Linking Research Data with Physically Preserved Research Materials in Chemistry | 234
Empowering open data sharing for social good: a privacy-aware approach | 205
A daily high-resolution (1 km) human thermal index collection over the North China Plain from 2003 to 2020 | 189
The first high-quality chromosome-level genome of Parupeneus biaculeatus using HiFi and Hi-C data | 154
Chromosome-level genome assembly of the Rhizoctonia solani | 144
Author Correction: Whales from space dataset, an annotated satellite image dataset of whales for training machine learning models | 144
The R package for DICOM to brain imaging data structure conversion | 141
Multi-proteomics and interactome dataset of tick-borne encephalitis virus infected host cells | 137
Chromosome-level genome assembly of Oriental chestnut gall wasp (Dryocosmus kuriphilus) | 137
EEG Dataset for the Recognition of Different Emotions Induced in Voice-User Interaction | 129
A global dataset of fossil fungi records from the Cenozoic | 127
Dynamic urban morphology mapping in Chinese cities based on local climate zone approach | 124
The interplay between brain and behavior during development: A multisite effort to generate and share simulated datasets | 124
A dataset of the daily edge of each polynya in the Antarctic | 113
OPERAnet, a multimodal activity recognition dataset acquired from radio frequency and vision-based sensors | 109
A comprehensive genomic and transcriptomic dataset of triple-negative breast cancers | 104
Molecular landscape of respiratory infection: A large-scale, multi-centre blood transcriptome dataset | 104
The landscape of abiotic and biotic stress-responsive splice variants with deep RNA-seq datasets in hot pepper | 103
Statistical performance indicators and index—a new tool to measure country statistical capacity | 103
Multimodal Data for the Detection of Freezing of Gait in Parkinson’s Disease | 102
An Enhanced Phenology Dataset for Global Drylands from 2001 to 2019 | 101
Students’ performance dataset for using machine learning technique in physics education research | 99
A 24-hour population distribution dataset based on mobile phone data from Helsinki Metropolitan Area, Finland | 98
Near-complete reference genome assembly of Hoya carnosa | 98
A large-scale multi-label 12-lead electrocardiogram database with standardized diagnostic statements | 97
Enhancing radiomics and Deep Learning systems through the standardization of medical imaging workflows | 97
PAVC: The foundation for a Pan-Arctic Vegetation Cover database | 94
Exploring the electrophysiology of Parkinson’s disease with magnetoencephalography and deep brain recordings | 94
A Field-Level Asset Mapping Dataset for England’s Agricultural Sector | 92
Ensemble of CMIP6 derived reference and potential evapotranspiration with radiative and advective components | 89
A Cross Spatio-Temporal Pathology-based Lung Nodule Dataset | 89
China’s provincial process CO2 emissions from cement production during 1993–2019 | 88
A construction waste landfill dataset of two districts in Beijing, China from high resolution satellite images | 88
A thermosurvey dataset: Older adults’ experiences and adaptation to urban heat and climate change | 88
Generating FAIR research data in experimental tribology | 87
NeuMa - the absolute Neuromarketing dataset en route to an holistic understanding of consumer behaviour | 87
FIGARO-E3: a high-resolution extended multi-regional input-output database consistent with official statistics | 85
A Frontal Ablation Dataset for 49 Tidewater Glaciers in Greenland | 85
Chromosome-level genome assembly of the traditional medicinal plant Lindera aggregata | 84
An open-access database of nature-based carbon offset project boundaries | 83
A global 1 km resolution daily surface longwave radiation product from MODIS satellite data from 2000–2023 | 82
Chromosome-level assemblies of cultivated water chestnut Trapa bicornis and its wild relative Trapa incisa | 82
Machine learning-ready remote sensing data for Maya archaeology | 81
Head model dataset for mixed reality navigation in neurosurgical interventions for intracranial lesions | 81
Global Ocean Particulate Organic Phosphorus, Carbon, Oxygen for Respiration, and Nitrogen (GO-POPCORN) | 79
The Superfund Research Program Analytics Portal: linking environmental chemical exposure to biological phenotypes | 79
Author Correction: Open-access quantitative MRI data of the spinal cord and reproducibility across participants, sites and manufacturers | 78
A neuroimaging dataset during sequential color qualia similarity judgments with and without reports | 77
PPB-Affinity: Protein-Protein Binding Affinity dataset for AI-based protein drug discovery | 76
Canopy height model and NAIP imagery pairs across CONUS | 76
ML-extendable framework for multiphysics-multiscale simulation workflow and data management using Kadi4Mat | 76
Sea ice records over more than a century at an observatory facing the Okhotsk coast of Hokkaido, Japan | 75
SDUST2023GRA_MSS: the new global marine gravity anomaly model determined from mean sea surface model | 75
Unveiling the Spatiotemporal Dynamics of Global Brain Circulation: A Comprehensive Corpus (2000–2024) | 74
Scaling up SoccerNet with multi-view spatial localization and re-identification | 74
Monitoring non-pharmaceutical public health interventions during the COVID-19 pandemic | 73
MarNemaFunDiv: a first comprehensive dataset of functional traits for marine nematodes | 73
Spatial and temporal data to study residential heat decarbonisation pathways in England and Wales | 72
A dataset of scientific dates from archaeological sites in eastern Africa spanning 5000 BCE to 1800 CE | 72
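
The page does not spell out how the H4-Index is computed. As a minimal sketch, assuming it follows the usual h-index definition applied to papers from the four-year window (the largest h such that h papers each have at least h citations), the value can be checked against the citation counts listed above. The function name `h_index` and the variable `CITATIONS` are illustrative, not part of the site.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each.

    Assumption: this is the standard h-index definition; the source page does
    not state the exact formula behind its "H4-Index".
    """
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Citation counts copied from the table, in listed order (72 papers).
CITATIONS = [
    1256, 563, 528, 419, 354, 344, 306, 303, 287, 286, 279, 279,
    242, 234, 205, 189, 154, 144, 144, 141, 137, 137, 129, 127,
    124, 124, 113, 109, 104, 104, 103, 103, 102, 101, 99, 98,
    98, 97, 97, 94, 94, 92, 89, 89, 88, 88, 88, 87,
    87, 85, 85, 84, 83, 82, 82, 81, 81, 79, 79, 78,
    77, 76, 76, 76, 75, 75, 74, 74, 73, 73, 72, 72,
]

if __name__ == "__main__":
    print(h_index(CITATIONS))  # prints 72
```

Under this assumption the 72 listed papers, each with at least 72 citations, reproduce the stated index of 72. Papers below the threshold are not listed, so the check only confirms that the table is consistent with the stated value.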