Journal of Big Data

Papers
(The TQCC of Journal of Big Data is 12. The table below lists the papers at or above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-04-01 to 2025-04-01.)
Article | Citations
DD-KARB: data-driven compliance to quality by rule based benchmarking | 504
Fine grain algorithm parallelization on a hybrid control-flow and dataflow processor | 322
Machine learning-based prediction of elliptical double steel columns under compression loading | 318
SIMER: an accurate and intelligent tool for simulating customizable population data across species in complex scenarios | 265
Sub-spatial prediction of votes integrating socioeconomic, educational, and age strata with machine learning and topological data analysis | 245
Unsupervised label generation for severely imbalanced fraud data | 223
Federated learning-driven IoT system for automated freshness monitoring in resource-constrained vending carts | 204
A scheduling algorithm to maximize storm throughput in heterogeneous cluster | 164
Dual channel and multi-scale adaptive morphological methods for infrared small targets | 161
Efficient pollen grain classification using pre-trained Convolutional Neural Networks: a comprehensive study | 153
Gaussian transformation enhanced semi-supervised learning for sleep stage classification | 143
Supervised contrastive pre-training models for mammography screening | 127
A new dimensionality reduction technique based on the Wavelet Transform for cancer classification | 124
Data analysis for vague contingency data | 101
Towards a folksonomy graph-based context-aware recommender system of annotated books | 98
Apply machine learning techniques to detect malicious network traffic in cloud computing | 93
PCJ Java library as a solution to integrate HPC, Big Data and Artificial Intelligence workloads | 87
A data value metric for quantifying information content and utility | 84
Accuracy improvements for cold-start recommendation problem using indirect relations in social networks | 82
Real-time spatio-temporal event detection on geotagged social media | 80
Dissimilarity space reinforced with manifold learning and latent space modeling for improved pattern classification | 76
Remote patient monitoring and classifying using the internet of things platform combined with cloud computing | 74
An empirical study on the evaluation of the RDF storage systems | 72
Detection of fickle trolls in large-scale online social networks | 71
Machine learning concepts for correlated Big Data privacy | 71
Diabetes emergency cases identification based on a statistical predictive model | 71
Exploring the form of big data products and the supporting systems | 70
Operationalizing and automating Data Governance | 68
Domain-relevance of influence: characterizing variations in online influence across multiple domains on social media | 61
Defining user spectra to classify Ethereum users based on their behavior | 61
Title2Vec: a contextual job title embedding for occupational named entity recognition and other applications | 60
Classification of long-term clinical course of Parkinson’s disease using clustering algorithms on social support registry database | 58
Prognostic stratification based on HIF-1α signaling for evaluating hypoxia status and immune landscape in hepatocellular carcinoma | 55
Free trade as domestic, economic, and strategic issues: a big data analytics approach | 54
Social media analysis of car parking behavior using similarity based clustering | 54
Tabular and latent space synthetic data generation: a literature review | 53
Context-aware prediction of active and passive user engagement: Evidence from a large online social platform | 52
Detecting unregistered users through semi-supervised anomaly detection with similarity datasets | 51
ASENN: attention-based selective embedding neural networks for road distress prediction | 48
New custom rating for improving recommendation system performance | 47
A survey of graph convolutional networks (GCNs) in FPGA-based accelerators | 46
Memetic multilabel feature selection using pruned refinement process | 46
CTGAN-ENN: a tabular GAN-based hybrid sampling method for imbalanced and overlapped data in customer churn prediction | 45
Sentiment analysis of Indonesian datasets based on a hybrid deep-learning strategy | 44
Quality assurance strategies for machine learning applications in big data analytics: an overview | 44
Hemorrhage semantic segmentation in fundus images for the diagnosis of diabetic retinopathy by using a convolutional neural network | 43
Digital social innovation based on Big Data Analytics for health and well-being of society | 43
An enhanced random forest approach using CoClust clustering: MIMIC-III and SMS spam collection application | 42
Estimating the carbon content of oceans using satellite sensor data | 42
Machine learning approach for predicting production delays: a quarry company case study | 41
The differences in gastric cancer epidemiological data between SEER and GBD: a joinpoint and age-period-cohort analysis | 41
RTiSR: a review-driven time interval-aware sequential recommendation method | 40
Education on quality assurance and assessment in teaching quality of high school instructors | 39
Evaluation is key: a survey on evaluation measures for synthetic time series | 39
GB-AFS: graph-based automatic feature selection for multi-class classification via Mean Simplified Silhouette | 39
Identification of tumor antigens and anoikis-based molecular subtypes in the hepatocellular carcinoma immune microenvironment: implications for mRNA vaccine development and precision treatment | 38
Hyperdimensional computing: a framework for stochastic computation and symbolic AI | 38
A fuel consumption-based method for developing local-specific CO2 emission rate database using open-source big data | 38
Early prediction of MODS interventions in the intensive care unit using machine learning | 37
Automated segmentation of choroidal neovascularization on optical coherence tomography angiography images of neovascular age-related macular degeneration patients based on deep learning | 37
An integrated multistage ensemble machine learning model for fraudulent transaction detection | 37
Hybrid beluga whale optimization algorithm with multi-strategy for functions and engineering optimization problems | 36
High-performance computing in healthcare: An automatic literature analysis perspective | 35
Multimodal text-emoji fusion using deep neural networks for text-based emotion detection in online communication | 34
A clustering-based approach for classifying data streams using graph matching | 34
Click-through rate prediction model integrating user interest and multi-head attention mechanism | 34
Skyline query under multidimensional incomplete data based on classification tree | 33
Web crawling based context aware recommender system using optimized deep recurrent neural network | 33
Iterative cleaning and learning of big highly-imbalanced fraud data using unsupervised learning | 33
A universal approach for multi-model schema inference | 33
Cartographies of warfare in the Indian subcontinent: Contextualizing archaeological and historical analysis through big data approaches | 33
Gene selection via improved nuclear reaction optimization algorithm for cancer classification in high-dimensional data | 33
Watch and learn: event-domain term extraction from social networks | 32
A hybrid Hadoop-based sentiment analysis classifier for tweets associated with COVID-19 utilizing two machine learning algorithms: CNN, and fuzzy C4.5 | 32
Algorithm for generating neutrosophic data using accept-reject method | 30
Exploring the state of the art in legal QA systems | 30
Anomaly detection and community detection in networks | 30
Data-driven multinomial random forest: a new random forest variant with strong consistency | 29
Low-level turbulence risk assessment and visualization using temporal rate of change of headwind of an aircraft | 29
Profitability trend prediction in crypto financial markets using Fibonacci technical indicator and hybrid CNN model | 29
Usability enhancement model for unstructured text in big data | 28
De-occlusion and recognition of frontal face images: a comparative study of multiple imputation methods | 28
Automatic identification and classification of pediatric glomerulonephritis on ultrasound images based on deep learning and radiomics | 28
Large language models, social demography, and hegemony: comparing authorship in human and synthetic text | 27
Spatial heterogeneities in acute lower respiratory infections prevalence and determinants across Ethiopian administrative zones | 27
Robust visual tracking using very deep generative model | 26
Breast cancer prediction using gated attentive multimodal deep learning | 26
Deep learning enhancing banking services: a hybrid transaction classification and cash flow prediction approach | 26
Correction to: Arabic text summarization using deep learning approach | 25
A proposed hybrid framework to improve the accuracy of customer churn prediction in telecom industry | 25
Dual-weight decay mechanism and Nelder-Mead simplex boosted RIME algorithm for optimal power flow | 25
Network intrusion detection using data dimensions reduction techniques | 25
Optimizing poultry audio signal classification with deep learning and burn layer fusion | 25
An LSTM and GRU based trading strategy adapted to the Moroccan market | 24
Error and optimism bias regularization | 24
Autoencoder-kNN meta-model based data characterization approach for an automated selection of AI algorithms | 24
Towards algorithmic framing analysis: expanding the scope by using LLMs | 24
The state of metaverse research: a bibliometric visual analysis based on CiteSpace | 23
Poisson logit hurdle model with associated factors of perinatal mortality in Ethiopia | 23
The stability of different aggregation techniques in ensemble feature selection | 23
Designing and evaluating a big data analytics approach for predicting students’ success factors | 23
A literature review on one-class classification and its potential applications in big data | 22
Exploration of issues, challenges and latest developments in autonomous cars | 22
Comparison of algorithms for the recognition of ChatGPT paraphrased texts | 22
Big data quality framework: a holistic approach to continuous quality management | 22
Value-at-risk student prescription trees for price personalization | 22
A novel sensitivity-based method for feature selection | 21
A machine learning-based credit risk prediction engine system using a stacked classifier and a filter-based feature selection method | 21
Unsupervised hyperspectral image segmentation of films: a hierarchical clustering-based approach | 21
Missing values compensation in duplicates detection using hot deck method | 20
An enhanced machine learning framework for accurate diagnosis of tuberculous pleural effusion | 20
Big data processing using hybrid Gaussian mixture model with salp swarm algorithm | 20
Text Data Augmentation for Deep Learning | 20
Correlation-based feature selection of single cell transcriptomics data from multiple sources | 20
An efficient weighted slime mould algorithm for engineering optimization | 19
Long-term survival prediction in patients with acute brain lesions using ensemble machine learning algorithms: a cohort study with combined national health insurance service and its self-run hospital | 19
Text based personality prediction from multiple social media data sources using pre-trained language model and model averaging | 19
Machine learning in biomedical and health big data: a comprehensive survey with empirical and experimental insights | 19
A review on lung disease recognition by acoustic signal analysis with deep learning networks | 19
A survey on deep learning tools dealing with data scarcity: definitions, challenges, solutions, tips, and applications | 19
Data reduction techniques for highly imbalanced medicare Big Data | 19
Comprehensive study of driver behavior monitoring systems using computer vision and machine learning techniques | 19
Comparing traditional news and social media with stock price movements; which comes first, the news or the price change? | 18
The Standard Deviation Score: a novel similarity metric for data analysis | 18
Breast cancer diagnosis with MFF-HistoNet: a multi-modal feature fusion network integrating CNNs and quantum tensor networks | 18
Dataset for unflappable driving: UNFLAPSet | 18
A systematic literature review of neuroimaging coupled with machine learning approaches for diagnosis of attention deficit hyperactivity disorder | 18
Discovering customer segments through interaction behaviors for home appliance business | 17
A review on adversarial–based deep transfer learning mechanical fault diagnosis | 17
Machine learning-based interactive dynamic resilience assessment for complex hydropower systems | 17
A large-scale sentiment analysis of tweets pertaining to the 2020 US presidential election | 17
Introducing the enterprise data marketplace: a platform for democratizing company data | 17
Out-of-distribution- and location-aware PointNets for real-time 3D road user detection without a GPU | 17
Artificial intelligence models for prediction of monthly rainfall without climatic data for meteorological stations in Ethiopia | 17
Adapting security and decentralized knowledge enhancement in federated learning using blockchain technology: literature review | 16
DiabSense: early diagnosis of non-insulin-dependent diabetes mellitus using smartphone-based human activity recognition and diabetic retinopathy analysis with Graph Neural Network | 16
The adaptive community-response (ACR) method for collecting misinformation on social media | 16
The application of adaptive group LASSO imputation method with missing values in personal income compositional data | 16
On the development of an information system for monitoring user opinion and its role for the public | 16
Design, development and performance analysis of cognitive assisting aid with multi sensor fused navigation for visually impaired people | 16
Data-driven prediction of soccer outcomes using enhanced machine and deep learning techniques | 16
Deep reinforcement learning for data-efficient weakly supervised business process anomaly detection | 16
Risk and UCON-based access control model for healthcare big data | 16
RILS-ROLS: robust symbolic regression via iterated local search and ordinary least squares | 15
Detection and prevention of SQLI attacks and developing compressive framework using machine learning and hybrid techniques | 15
Internal dynamics of patent reference networks using the Bray–Curtis dissimilarity measure | 15
VeilGraph: incremental graph stream processing | 15
Distributed fuzzy clustering algorithm for mixed-mode data in Apache SPARK | 15
Fast cluster-based computation of exact betweenness centrality in large graphs | 15
Traffic and road conditions monitoring system using extracted information from Twitter | 14
Expanded graph embedding for joint network alignment and link prediction | 14
Potential for the use of large unstructured data resources by public innovation support institutions | 14
Readers’ affect: predicting and understanding readers’ emotions with deep learning | 14
Practical ANN prediction models for the axial capacity of square CFST columns | 14
Main memory controller with multiple media technologies for big data workloads | 14
Accelerating neural network training with distributed asynchronous and selective optimization (DASO) | 14
Transfer learning approach based on satellite image time series for the crop classification problem | 14
Scalable and space-efficient Robust Matroid Center algorithms | 14
Deep-Eware: spatio-temporal social event detection using a hybrid learning model | 13
A service-categorized security scheme with physical unclonable functions for internet of vehicles | 13
On data efficiency of univariate time series anomaly detection models | 13
Assessing the effects of hyperparameters on knowledge graph embedding quality | 13
Accurate identification of cashmere and wool fibers based on enhanced ShuffleNetV2 and transfer learning | 13
Online listing data and their interaction with market dynamics: evidence from Singapore during COVID-19 | 13
DEMFFA: a multi-strategy modified Fennec Fox algorithm with mixed improved differential evolutionary variation strategies | 13
Chromatin state distribution of residue-specific histone acetylation in early myoblast differentiation | 13
EXABSUM: a new text summarization approach for generating extractive and abstractive summaries | 13
Characterizing patent big data upon IPC: a survey of triadic patent families and PCT applications | 13
A parallelization model for performance characterization of Spark Big Data jobs on Hadoop clusters | 12
Improving lookup and query execution performance in distributed Big Data systems using Cuckoo Filter | 12
Data analysis for sequential contingencies under uncertainty | 12
Introducing Mplots: scaling time series recurrence plots to massive datasets | 12
Using social media for sub-event detection during disasters | 12
A problem-agnostic approach to feature selection and analysis using SHAP | 12
A real-time predicting online tool for detection of people’s emotions from Arabic tweets based on big data platforms | 12
VEDAS: an efficient GPU alternative for store and query of large RDF data sets | 12
MuSe: a multi-level storage scheme for big RDF data using MapReduce | 12
An adaptive hybrid african vultures-aquila optimizer with Xgb-Tree algorithm for fake news detection | 12