Scientific Data

Papers
(The TQCC of Scientific Data is 8. The table below lists the papers whose CrossRef citation counts exceed that threshold, up to a maximum of 250 papers. It covers papers published in the past four years, i.e., from 2021-04-01 to 2025-04-01.)
Article | Citations
A chromosome-level genome assembly of East Asia endemic minnow Zacco platypus | 869
Dataset for developing deep learning models to assess crack width and self-healing progress in concrete | 461
A travelable area boundary dataset for visual navigation of field robots | 404
A catalogue of land-based adaptation and mitigation solutions to tackle climate change | 329
Kinematics, kinetics, and muscle activations during human locomotion over compliant terrains | 300
Author Correction: Microbial Metagenomes Across a Complete Phytoplankton Bloom Cycle: High-Resolution Sampling Every 4 Hours Over 22 Days | 265
PCMMD: A Novel Dataset of Plasma Cells to Support the Diagnosis of Multiple Myeloma | 235
Recovery of nearly 3,000 archaeal genomes from 152 terrestrial geothermal spring metagenomes | 228
A large-scale dataset for Chinese historical document recognition and analysis | 223
The compositional behavior of the human T cell receptor repertoire in ovarian cancer compared to healthy donors | 211
Mapping of 10-km daily diffuse solar radiation across China from reanalysis data and a Machine-Learning method | 208
A dataset on formulation parameters and characteristics of drug-loaded PLGA microparticles | 199
Curated global occurrence dataset of the insect order Zoraptera | 194
Vis-NIR soil spectral library of the Hungarian Soil Degradation Observation System | 191
CESNET-TimeSeries24: Time Series Dataset for Network Traffic Anomaly Detection and Forecasting | 181
A high-quality chromosome-level genome assembly of Pacific whiteleg shrimp (Penaeus vannamei) | 178
A co-registered in-situ and ex-situ dataset from wire arc additive manufacturing process | 172
A standardized lexicon of body odor words crafted from 17 countries | 167
Time-dependent RNA transcriptional profiling of abomasal mucosa in cattle infected with Ostertagia ostertagi | 155
A global dataset of fossil fungi records from the Cenozoic | 148
EEG Dataset for the Recognition of Different Emotions Induced in Voice-User Interaction | 135
Shear modulus reduction and damping ratios curves joined with engineering geological units in Italy | 133
Single-cell assay for transposase-accessible chromatin sequencing of human clear cell renal cell carcinoma | 127
Exploring, walking, and interacting in virtual reality with simulated low vision: a living contextual dataset | 119
A neuroimaging dataset during sequential color qualia similarity judgments with and without reports | 118
Measuring Overwork in China Using Daily High-Resolution Nighttime Satellite Data | 109
XyloDensMap: a georeferenced dataset for the wood density of 110,000 trees from 156 European species in France | 108
Chromosome-level genome assembly and annotation of the prickly nightshade Solanum rostratum Dunal | 108
Human alterations of the global floodplains 1992–2019 | 106
Comprehensive energy demand and usage data for building automation | 97
Multi-environment field trials for wheat yield, stability and breeding progress in Germany | 92
Global daily 1 km land surface precipitation based on cloud cover-informed downscaling | 91
Effects of heat stress on 16S rDNA, metagenome and metabolome in Holstein cows at different growth stages | 90
A database of low-energy atomically precise nanoclusters | 89
A human lower-limb biomechanics and wearable sensors dataset during cyclic and non-cyclic activities | 87
ODFM, an omics data resource from microorganisms associated with fermented foods | 86
Size-fractionated microbiome observed during an eight-month long sampling in Jiaozhou Bay and the Yellow Sea | 86
Annual Impervious Surface Data from 2001–2020 for West African Countries: Ghana, Togo, Benin and Nigeria | 85
De novo transcriptome assembly and gene annotation for the toxic dinoflagellate Dinophysis | 83
A database of chemical absorption in human skin with mechanistic modeling applications | 83
High-quality faba bean reference transcripts generated using PacBio and Illumina RNA-seq data | 82
The transcriptomic footprint of Mytella strigata: de novo transcriptome assembly of a major invasive species | 82
Investigating the Quality of DermaMNIST and Fitzpatrick17k Dermatological Image Datasets | 82
High-fidelity annotated triploid genome of the quarantine root-knot nematode, Meloidogyne enterolobii | 80
Multidimensional dataset for cognitive assessment, sMRI, and rsfMRI in common benign epileptic children | 78
The Bushland, Texas, maize evapotranspiration, growth, and yield dataset Collection | 78
Dynamic urban morphology mapping in Chinese cities based on local climate zone approach | 78
An enhanced rainfall-induced landslide catalogue in Italy | 77
High-speed video recordings of metal powder pneumatic conveying in thin capillary pipes | 77
A time-varying index for agricultural suitability across Europe from 1500–2000 | 76
Chromosome-level genome assembly of tetraploid Chinese cherry (Prunus pseudocerasus) | 74
A high-quality chromosome-scale genome assembly of the Cherokee rose (Rosa laevigata) | 73
Linking Research Data with Physically Preserved Research Materials in Chemistry | 71
A human single-neuron dataset for object recognition | 71
Propithecus verreauxi demography spanning 40 years at Bezà Mahafaly Special Reserve, southwest Madagascar | 70
Australian automotive workers and community leaders interview dataset following 2017 assembly plant closures | 70
An Integrated Database for Exploring Alternative Promoters in Animals | 69
A National Synthetic Populations Dataset for the United States | 69
A synthetic building operation dataset | 68
Genome assembly of Hawaiian flower thrips Thrips hawaiiensis (Thysanoptera: Thripidae) | 68
A knowledge graph for crop diseases and pests in China | 68
A chromosome-level genome assembly of the spider mite Tetranychus piercei McGregor | 67
Monitoring non-pharmaceutical public health interventions during the COVID-19 pandemic | 67
Author Correction: First assessment of underwater sound levels in the Northern Adriatic Sea at the basin scale | 67
Chromosome-level genome assembly of the critically endangered Baer’s pochard (Aythya baeri) | 66
SOIL-WATERGRIDS, mapping dynamic changes in soil moisture and depth of water table from 1970 to 2014 | 65
The Avian Diet Database as a source of quantitative information on bird diets | 65
FastMRI Prostate: A public, biparametric MRI dataset to advance machine learning for prostate cancer imaging | 65
Single-cell RNA-sequencing of virus-specific cellular immune responses in chronic hepatitis B patients | 65
Three-Dimensional Motion Capture Data of a Movement Screen from 183 Athletes | 64
Author Correction: European primary forest database v2.0 | 64
A high-fidelity residential building occupancy detection dataset | 64
RNA-seq of peripheral blood mononuclear cells of congenital generalized lipodystrophy type 2 patients | 64
A high-spatial-resolution dataset of human thermal stress indices over South and East Asia | 64
High-resolution freshwater dissolved calcium and pH data layers for Canada and the United States | 64
Crowd cluster data in the USA for analysis of human response to COVID-19 events and policies | 63
Attributes of the food and physical activity built environments from the Southern Cone of Latin America | 63
Product, building, and infrastructure material stocks dataset for 337 Chinese cities between 1978 and 2020 | 63
The short-term mortality fluctuation data series, monitoring mortality shocks across time and space | 61
Directional wave buoy data measured near Campbell Island, New Zealand | 61
An integrated metagenomic, metabolomic and transcriptomic survey of Populus across genotypes and environments | 61
Whole genome and exome sequencing reference datasets from a multi-center and cross-platform benchmark study | 60
Author Correction: Open-access quantitative MRI data of the spinal cord and reproducibility across participants, sites and manufacturers | 60
The Simrad EK60 echosounder dataset from the Malaspina circumnavigation | 59
Total irrigation by crop in the Continental United States from 2008 to 2020 | 58
De novo transcriptomes of six calanoid copepods (Crustacea): a resource for the discovery of novel genes | 58
Integrated microbiome-metabolome-genome axis data of Laiwu and Lulai pigs | 57
A dataset for measuring the impact of research data and their curation | 57
Unified access to up-to-date residue-level annotations from UniProtKB and other biological databases for PDB data | 56
A construction waste landfill dataset of two districts in Beijing, China from high resolution satellite images | 56
The first chromosome-level genome of the stag beetle Dorcus hopei Saunders, 1854 (Coleoptera: Lucanidae) | 56
Addendum: High-resolution terrestrial climate, bioclimate and vegetation for the last 120,000 years | 56
Towards Gender Harmony Dataset: Gender Beliefs and Gender Stereotypes in 62 Countries | 56
Chromosome-level genome assembly of the threatened resource plant Cinnamomum chago | 55
Haplotype-resolved genome assembly of Coriaria nepalensis a non-legume nitrogen-fixing shrub | 55
Normative volumes and relaxation times at 3T during brain development | 55
gga-miRNOME, a microRNA-sequencing dataset from chick embryonic tissues | 54
Scientific echosounder data provide a predator’s view of Antarctic krill (Euphausia superba) | 53
Mass cytometric and transcriptomic profiling of epithelial-mesenchymal transitions in human mammary cell lines | 53
Author Correction: Brain Data Standards - A method for building data-driven cell-type ontologies | 53
Flora diversity survey and establishment of a plant DNA barcode database of Lomas ecosystems in Peru | 53
ReMIND: The Brain Resection Multimodal Imaging Database | 52
Stable isotope variations of dew under three different climates | 52
A chromosome-level genome assembly of an avivorous bat species (Nyctalus aviator) | 52
Zeo-1, a computational data set of zeolite structures | 52
A worldwide epidemiological database for COVID-19 at fine-grained spatial resolution | 51
Open science resources from the Tara Pacific expedition across coral reef and surface ocean ecosystems | 51
Genome-resolved carbon processing potential of tropical peat microbiomes from an oil palm plantation | 51
Exploring SureChEMBL from a drug discovery perspective | 51
De novo transcriptome assembly and annotation for gene discovery in Salamandra salamandra at the larval stage | 50
A Synthetic Dataset for Semantic Segmentation of Waterbodies in Out-of-Distribution Situations | 49
Author Correction: Mobility networks in Greater Mexico City | 49
Chromosome-level genome assembly of ridgetail white shrimp Exopalaemon carinicauda | 49
SkewDB, a comprehensive database of GC and 10 other skews for over 30,000 chromosomes and plasmids | 49
A single-cell RNA-seq dataset describing macrophages in NSCLC tumor and peritumor tissues | 48
Venom-gland transcriptomics and venom proteomics of the Tibellus oblongus spider | 48
Revised monthly energy generation estimates for 1,500 hydroelectric power plants in the United States | 48
Individual attendance data for over 30 years of international climate change talks | 48
Robotic monitoring of Alpine screes: a dataset from the EU Natura2000 habitat 8110 in the Italian Alps | 48
A dataset of manually annotated filaments from H-alpha observations | 48
Publisher Correction: 1-km resolution rebound surfaces and paleotopography of glaciated North America since the Last Glacial Maximum | 48
The ImSURE phantoms: a digital dataset for radiomic software benchmarking and investigation | 47
District-scale surface temperatures generated from high-resolution longitudinal thermal infrared images | 46
Chromosome-level genome assembly of chub mackerel (Scomber japonicus) from the Indo-Pacific Ocean | 46
Atmospheric new particle formation identifier using longitudinal global particle number size distribution data | 45
Thirty years of volcano geodesy from space at Campi Flegrei caldera (Italy) | 45
Dataset on heavy metal pollution assessment in freshwater ecosystems | 45
T1DiabetesGranada: a longitudinal multi-modal dataset of type 1 diabetes mellitus | 45
The PAD-US-AR dataset: Measuring accessible and recreational parks in the contiguous United States | 44
An fMRI dataset in response to large-scale short natural dynamic facial expression videos | 44
Dataset of Smartphone-Based Finger Tapping Test | 44
Modeling community standards for metadata as templates makes data FAIR | 44
Finnish inventory data of underwater marine biodiversity | 43
Eagle-449: A volumetric, whole-brain compilation of brain atlases for vestibular functional MRI research | 43
High-resolution repeat topography of drifting ice floes in the Arctic Ocean from terrestrial laser scanning | 43
Multi-proteomics and interactome dataset of tick-borne encephalitis virus infected host cells | 43
Optimizing drug combination and mechanism analysis based on risk pathway crosstalk in pan cancer | 43
A transcriptome dataset for gonadectomy-induced changes in rat spinal cord | 43
Aerodynamic characterisation of porous fairings: pressure drop and Laser Doppler Velocimetry measurements | 43
Multi-cohort shotgun metagenomic analysis of oral and gut microbiota overlap in healthy adults | 43
Extended-wavelength diffuse reflectance spectroscopy dataset of animal tissues for bone-related biomedical applications | 42
A Non-Laboratory Gait Dataset of Full Body Kinematics and Egocentric Vision | 42
Chromosome-level genome assembly of the giant ladybug Megalocaria dilatata | 42
Dataset of human-single neuron activity during a Sternberg working memory task | 42
A subnational reproductive, maternal, newborn, child, and adolescent health and development atlas of India | 42
Fluorescence microscopy and correlative brightfield videos of mitochondria and vesicles in H9c2 cardiomyoblasts | 42
Slovak database of speech affected by neurodegenerative diseases | 42
A global dataset of tree hydraulic and structural traits imputed from phylogenetic relationships | 42
Bioclimatic atlas of the terrestrial Arctic | 42
A high-resolution multi-scale industrial water use dataset in China | 42
Transcriptome-wide RNA 5-methylcytosine profiles of human iPSCs and iPSC-derived cardiomyocytes | 41
Chromosome-level genome assembly for three geographical stocks of large yellow croaker (Larimichthys crocea) | 41
Single-Molecule Tracking dataset for histone H3 (hht1) from live and fixed cells of Schizosaccharomyces pombe | 41
AerialWaste dataset for landfill discovery in aerial and satellite images | 41
A literature-derived dataset on risk factors for dry eye disease | 41
A Time-of-Flight and Radar Dataset of a neonatal Thorax Simulator with synchronized Reference Sensor Signals for respiratory Rate Detection | 41
Author Correction: A public dataset of dogs vital signs recorded with ultra wideband radar and reference sensors | 40
3DPatBody: 3D dataset of human bodies of a patagonian population and their anthropometric measurements | 40
Phytoplankton optical fingerprint libraries for development of phytoplankton ocean color satellite products | 39
Chromosome-level genome assembly of the Japanese sawyer beetle Monochamus alternatus | 39
GARD-LENS: A downscaled large ensemble dataset for understanding future climate and its uncertainties | 39
Chromosome-level genome assembly of Phortica okadai, a vector of Thelazia callipaeda | 39
The Three Terms Task - an open benchmark to compare human and artificial semantic representations | 39
Global Gridded Crop Production Dataset at 10 km Resolution from 2010 to 2020 | 39
A global synthesis of high-resolution stable isotope data from benthic foraminifera of the last deglaciation | 39
A biological ocean data reformatting effort | 38
Chromosomal level genome assemblies of two Malus crabapple cultivars Flame and Royalty | 38
A chromosome-level genome assembly of the forestry pest Coronaproctus castanopsis | 38
Dual-modal edible oil impurity dataset for weak feature detection | 38
COVID-19 non-pharmaceutical interventions: data annotation for rapidly changing local policy information | 37
Fluorescent Neuronal Cells v2: multi-task, multi-format annotations for deep learning in microscopy | 37
The Superfund Research Program Analytics Portal: linking environmental chemical exposure to biological phenotypes | 37
A large-scale dataset for end-to-end table recognition in the wild | 37
A multiomics dataset for the study of RNA modifications in human macrophage differentiation and polarisation | 37
Remote Sensing-Based Extension of GRDC Discharge Time Series - A Monthly Product with Uncertainty Estimates | 37
Cancer-Alterome: a literature-mined resource for regulatory events caused by genetic alterations in cancer | 36
Dataset on the effects of psychological care on depression and suicide ideation in underrepresented children | 36
WAVES – The Lucile Packard Children’s Hospital Pediatric Physiological Waveforms Dataset | 36
The bii4africa dataset of faunal and floral population intactness estimates across Africa’s major land uses | 36
Protein interactors of 3-O sulfated heparan sulfates in human MCI and age-matched control cerebrospinal fluid | 36
GEOWEALTH-US: Spatial wealth inequality data for the United States, 1960–2020 | 36
Single-cell integrative analysis reveals consensus cancer cell states and clinical relevance in breast cancer | 36
Chimera: An atlas of regular vines on up to 8 nodes | 35
HYADES - A Global Archive of Annual Maxima Daily Precipitation | 35
Global holiday datasets for understanding seasonal human mobility and population dynamics | 35
A large and diverse brain organoid dataset of 1,400 cross-laboratory images of 64 trackable brain organoids | 35
A unified dataset for the city-scale traffic assignment model in 20 U.S. cities | 35
Native range estimates for red-listed vascular plants | 34
Processing of visual and non-visual naturalistic spatial information in the "parahippocampal place area" | 34
Author Correction: Mapping annual 10-m soybean cropland with spatiotemporal sample migration | 34
A dataset of micro biodiversity in benthic sediment at a global scale | 34
The Carbon Catalogue, carbon footprints of 866 commercial products from 8 industry sectors and 5 continents | 34
Ultra-deep sequencing data from a liquid biopsy proficiency study demonstrating analytic validity | 33
An interprovincial input–output database distinguishing firm ownership in China from 1997 to 2017 | 33
Developing a large-scale dataset of flood fatalities for territories in the Euro-Mediterranean region, FFEM-DB | 33
De novo transcriptomes of cave and surface isopod crustaceans: insights from 11 species across three suborders | 33
A haplotype-resolved genome assembly of Malus domestica ‘Red Fuji’ | 33
Simulated sulfur K-edge X-ray absorption spectroscopy database of lithium thiophosphate solid electrolytes | 33
Author Correction: MASCDB, a database of images, descriptors and microphysical properties of individual snowflakes in free fall | 32
A nineteenth-century urban Ottoman population micro dataset: Data extraction and relational database curation from the 1840s pre-census Bursa population registers | 32
Human displacements, fatalities, and economic damages linked to remotely observed floods | 32
A dataset of human capital-weighted population estimates for 185 countries from 1970 to 2100 | 32
Home monitoring with connected mobile devices for asthma attack prediction with machine learning | 32
Ensemble of CMIP6 derived reference and potential evapotranspiration with radiative and advective components | 32
Motor evoked potentials for multiple sclerosis, a multiyear follow-up dataset | 32
Single cell transcriptome sequencing of stimulated and frozen human peripheral blood mononuclear cells | 32
Caltech Conte Center, a multimodal data resource for exploring social cognition and decision-making | 32
Type B Aortic Dissection CTA Collection with True and False Lumen Expert Annotations for the Development of AI-based Algorithms | 32
A dataset for assessing phytolith data for implementation of the FAIR data principles | 32
NeuMa - the absolute Neuromarketing dataset en route to an holistic understanding of consumer behaviour | 31
Spatial and temporal data to study residential heat decarbonisation pathways in England and Wales | 31
A thermosurvey dataset: Older adults’ experiences and adaptation to urban heat and climate change | 31
Fatigue database of complex metallic alloys | 31
Realization times of energetic modernization measures for buildings based on interviews with craftworkers | 31
Multi sequence average templates for aging and neurodegenerative disease populations | 31
A dataset of high-resolution digital elevation models of the Skeiðarársandur kettle holes, Southern Iceland | 30
Machine-learning ready data on the thermal power consumption of the Mars Express Spacecraft | 30
Generating FAIR research data in experimental tribology | 30
High-resolution transcriptome datasets during embryogenesis of plant-parasitic nematodes | 30
Transcriptome sequencing of seven deep marine invertebrates | 30
A global dataset of the fraction of absorbed photosynthetically active radiation for 1982–2022 | 30
A multi-modal panel dataset to understand the psychological impact of the pandemic | 30
Computed tomography reconstructions of burrow networks for the Opheliid polychaete, Armandia cirrhosa | 30
Making Mathematical Research Data FAIR: Pathways to Improved Data Sharing | 30
A focus groups study on data sharing and research data management | 30
A collection of 3D geomodels of the Los Humeros and Acoculco geothermal systems (Mexico) | 30
Chromosome-level genome assembly of Korean holoparasitic plants, Orobanche coerulescens | 29
Author Correction: Three de novo assembled wild cacao genomes from the Upper Amazon | 29
In toto light sheet fluorescence microscopy live imaging datasets of Ceratitis capitata embryonic development | 29
Clinical trial data sharing: a cross-sectional study of outcomes associated with two U.S. National Institutes of Health models | 29
Chromosome-level genome assembly of Cnidium monnieri, a highly demanded traditional Chinese medicine | 29
A multi-predator trophic database for the California Current Large Marine Ecosystem | 29
Three-dimensional topology dataset of folded radar stratigraphy in northern Greenland | 29
SPRC19: A Database of State Policy Responses to COVID-19 in the United States | 29
Genome-wide identification of accessible chromatin regions by ATAC-seq upon induction of the transcription factor bZIP11 in Arabidopsis | 29
Terabyte-scale supervised 3D training and benchmarking dataset of the mouse kidney | 29
HVSMR-2.0: A 3D cardiovascular MR dataset for whole-heart segmentation in congenital heart disease | 29
PSHG-TISS: A collection of polarization-resolved second harmonic generation microscopy images of fixed tissues | 28
DUO-GAIT: A gait dataset for walking under dual-task and fatigue conditions with inertial measurement units | 28
Machine learning-ready remote sensing data for Maya archaeology | 28
A large-scale fMRI dataset for the visual processing of naturalistic scenes | 28
Profile observations of the Arctic atmospheric boundary layer with the BELUGA tethered balloon during MOSAiC | 28
Scaling up SoccerNet with multi-view spatial localization and re-identification | 28
Metagenome sequencing and 103 microbial genomes from ballast water and sediments | 28
A Multi-Stain Breast Cancer Histological Whole-Slide-Image Data Set from Routine Diagnostics | 27
40 years of forest dynamics and tree demography in an intact tropical forest at M’Baïki in central Africa | 27
A comprehensive genomic and transcriptomic dataset of triple-negative breast cancers | 27
Dataset of the rumen microbiota and epithelial transcriptomics and proteomics in goat affected by solid diets | 27
Author Correction: Chromatic dispersion and thermal coefficients of hygroscopic liquids: 5 glycols and glycerol | 27
Author Correction: An fMRI Dataset for Concept Representation with Semantic Feature Annotations | 27