Proceedings of the VLDB Endowment

Papers
(The TQCC of Proceedings of the VLDB Endowment is 9. The table below lists the papers above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-08-01 to 2025-08-01.)
Article | Citations
Approximating probabilistic group steiner trees in graphs | 482
IsoBugView | 302
Cardinality Estimation for Having-Clauses | 182
Timestamp as a Service, Not an Oracle | 77
Differentially Private Stream Processing at Scale | 77
QPJVis Demo: Quality-Boost Progressive Join Query Processing System | 72
Spectrum: Speedy and Strictly-Deterministic Smart Contract Transactions for Blockchain Ledgers | 66
Fries | 61
Motiflets | 60
SpaceSaving± | 58
OmniSketch: Efficient Multi-Dimensional High-Velocity Stream Analytics with Arbitrary Predicates | 56
Reliable community search in dynamic networks | 56
DyHealth | 56
Breathing New Life into an Old Tree: Resolving Logging Dilemma of B+-tree on Modern Computational Storage Drives | 53
A Reproducible Tutorial on Reproducibility in Database Systems Research | 52
PerfGuard | 48
Opportunities for Quantum Acceleration of Databases: Optimization of Queries and Transaction Schedules | 48
VeriBench: Analyzing the Performance of Database Systems with Verifiability | 47
DuckDB-wasm | 47
Towards Designing and Learning Piecewise Space-Filling Curves | 46
Neighborhood-Based Hypergraph Core Decomposition | 46
Influential Community Search over Large Heterogeneous Information Networks | 46
Efficient Distributed Transaction Processing in Heterogeneous Networks | 45
Accelerating recommendation system training by leveraging popular choices | 45
G-tran | 44
GaussDB: A Cloud-Native Multi-Primary Database with Compute-Memory-Storage Disaggregation | 43
Algorithm and system co-design for efficient subgraph-based graph representation learning | 43
Solver-In-The-Loop Cluster Resource Management for Database-as-a-Service | 43
DoppelGanger++ in Action: A Database Replay System with Fast Dependency Graph Generation | 42
Demonstrating Waffle: A Self-Driving Grid Index | 42
PARQO: Penalty-Aware Robust Plan Selection in Query Optimization | 42
LION: Fast and High-Resolution Network Kernel Density Visualization | 42
Galvatron | 42
PSFQ: A Blockchain-Based Privacy-Preserving and Verifiable Student Feedback Questionnaire Platform | 41
POEM | 40
Efficient Non-Learning Similar Subtrajectory Search | 39
Exploiting the Power of Equality-Generating Dependencies in Ontological Reasoning | 38
LIDER | 37
Incremental partitioning for efficient spatial data analytics | 37
DPXPlain | 37
Approximate Queries over Concurrent Updates | 37
IsoVista: Black-Box Checking Database Isolation Guarantees | 37
Pre-training summarization models of structured datasets for cardinality estimation | 36
VeLP: Vehicle Loading Plan Learning from Human Behavior in Nationwide Logistics System | 36
HAIChart: Human and AI Paired Visualization System | 36
HyperBlocker: Accelerating Rule-Based Blocking in Entity Resolution Using GPUs | 36
Trie memtables in cassandra | 35
Hardware-Efficient Data Imputation through DBMS Extensibility | 35
DARKER: Efficient Transformer with Data-Driven Attention Mechanism for Time Series | 35
Making CRDTs Not So Eventual | 35
LITS: An Optimized Learned Index for Strings | 35
Seiden: Revisiting Query Processing in Video Database Systems | 34
SUFF: Accelerating Subgraph Matching with Historical Data | 34
Improving matrix-vector multiplication via lossless grammar-compressed matrices | 33
Plush | 33
Databases Unbound: Querying All of the World's Bytes with AI | 32
TsQuality: Measuring Time Series Data Quality in Apache IoTDB | 32
TSB-UAD | 31
SQL Engines Excel at the Execution of Imperative Programs | 31
A demonstration of multi-region CockroachDB | 31
SingleStore-V: An Integrated Vector Database System in SingleStore | 31
Kora: A Cloud-Native Event Streaming Platform for Kafka | 30
Succinct graph representations as distance oracles | 30
Design trade-offs for a robust dynamic hybrid hash join | 29
FS-Real: A Real-World Cross-Device Federated Learning Platform | 29
MLP-Mixer based Masked Autoencoders are Effective, Explainable and Robust for Time Series Anomaly Detection | 29
Enabling SQL-based training data debugging for federated learning | 28
LavaStore: ByteDance's Purpose-Built, High-Performance, Cost-Effective Local Storage Engine for Cloud Services | 27
Win-Win: On Simultaneous Clustering and Imputing over Incomplete Data | 27
OneProvenance: Efficient Extraction of Dynamic Coarse-Grained Provenance from Database Query Event Logs | 27
Scalable Reasoning on Document Stores via Instance-Aware Query Rewriting | 27
Toward Quantity-of-Interest Preserving Lossy Compression for Scientific Data | 26
Petabyte-Scale Row-Level Operations in Data Lakehouses | 26
ContTune: Continuous Tuning by Conservative Bayesian Optimization for Distributed Stream Data Processing Systems | 26
Optimizing machine learning inference queries with correlative proxy models | 26
SparkCAD | 25
Enhancing Accuracy for Super Spreader Identification in High-Speed Data Streams | 25
Oasis: An Optimal Disjoint Segmented Learned Range Filter | 25
Saving Money for Analytical Workloads in the Cloud | 25
KGNav: A Knowledge Graph Navigational Visual Query System | 24
FastMosaic in Action: A New Mosaic Operator for Array DBMSs | 24
From Zero to Hero: Detecting Leaked Data through Synthetic Data Injection and Model Querying | 24
CMixing: An Efficient Coin Mixing Platform to Enhance Anonymity in Cryptocurrency Transactions | 23
Sparcle: Boosting the Accuracy of Data Cleaning Systems through Spatial Awareness | 23
Efficient and Accurate SimRank-Based Similarity Joins: Experiments, Analysis, and Improvement | 23
Navigating Data Repositories: Utilizing Line Charts to Discover Relevant Datasets | 23
Discovering Leitmotifs in Multidimensional Time Series | 23
DINOMO | 22
Less is More: Efficient Time Series Dataset Condensation via Two-Fold Modal Matching | 22
Ember | 22
XDB in Action: Decentralized Cross-Database Query Processing for Black-Box DBMSes | 22
ALECE: An Attention-based Learned Cardinality Estimator for SPJ Queries on Dynamic Workloads | 22
Sphinteract: Resolving Ambiguities in NL2SQL through User Interaction | 22
Serving deep learning models with deduplication from relational databases | 22
CoroGraph: Bridging Cache Efficiency and Work Efficiency for Graph Algorithm Execution | 22
Subgraph matching over graph federation | 21
ETC: Efficient Training of Temporal Graph Neural Networks over Large-Scale Dynamic Graphs | 21
Dealing with Acronyms, Abbreviations, and Typos in Real-World Entity Matching | 21
Hercules against data series similarity search | 21
Biathlon: Harnessing Model Resilience for Accelerating ML Inference Pipelines | 21
Enriching Relations with Additional Attributes for ER | 21
Efficient Approximation of Certain and Possible Answers for Ranking and Window Queries over Uncertain Data | 21
ACTA: Autonomy and Coordination Task Assignment in Spatial Crowdsourcing Platforms | 20
PerMA-bench | 20
Anomaly detection in time series | 20
Dalton | 20
Federated matrix factorization with privacy guarantee | 20
CORE-Sketch: On Exact Computation of Median Absolute Deviation with Limited Space | 20
FARGO: Fast Maximum Inner Product Search via Global Multi-Probing | 20
Expanding Reverse Nearest Neighbors | 20
DAFDiscover: Robust Mining Algorithm for Dynamic Approximate Functional Dependencies on Dirty Data | 19
ResLake: Towards Minimum Job Latency and Balanced Resource Utilization in Geo-Distributed Job Scheduling | 19
A Case for Graphics-Driven Query Processing | 19
Quantifying Point Contributions: A Lightweight Framework for Efficient and Effective Query-Driven Trajectory Simplification | 19
Window Function Expression: Let the Self-Join Enter | 19
Minimum Strongly Connected Subgraph Collection in Dynamic Graphs | 19
AeonG: An Efficient Built-in Temporal Support in Graph Databases | 19
Cloud data systems | 19
QuoteInspector: Gaining Insight about Social Media Discussions | 19
Unleash the Power of Ellipsis: Accuracy-Enhanced Sparse Vector Technique with Exponential Noise | 19
Saturn: An Optimized Data System for Multi-Large-Model Deep Learning Workloads | 19
Starry | 18
TimeCSL: Unsupervised Contrastive Learning of General Shapelets for Explorable Time Series Analysis | 18
CEDA: Learned Cardinality Estimation with Domain Adaptation | 18
Uldp-FL: Federated Learning with Across-Silo User-Level Differential Privacy | 18
Polyglot data management | 18
Demo of QueryBooster: Supporting Middleware-Based SQL Query Rewriting as a Service | 18
Pyneapple-G: Scalable Spatial Grouping Queries | 18
Nuhuo: An Effective Estimation Model for Traffic Speed Histogram Imputation on A Road Network | 18
Selective data acquisition in the wild for model charging | 18
Computing Rule-Based Explanations by Leveraging Counterfactuals | 18
Datamap-Driven Tabular Coreset Selection for Classifier Training | 18
Efficient Fault Tolerance for Recommendation Model Training via Erasure Coding | 18
Ganos Aero: A Cloud-Native System for Big Raster Data Management and Processing | 17
L2chain | 17
Accelerating Maximal Clique Enumeration via Graph Reduction | 17
Combining Small Language Models and Large Language Models for Zero-Shot NL2SQL | 17
To UDFs and Beyond: Demonstration of a Fully Decomposed Data Processor for General Data Wrangling Tasks | 17
Differentially Private Data Generation with Missing Data | 17
TranAD | 17
DPSUR: Accelerating Differentially Private Stochastic Gradient Descent Using Selective Update and Release | 17
Hyper-tune | 17
GENTI: GPU-Powered Walk-Based Subgraph Extraction for Scalable Representation Learning on Dynamic Graphs | 17
A Hierarchical Grouping Algorithm for the Multi-Vehicle Dial-a-Ride Problem | 17
Composable Data Management: An Execution Overview | 17
Resource Management in Aurora Serverless | 17
Fast neural ranking on bipartite graph indices | 17
RCRank: Multimodal Ranking of Root Causes of Slow Queries in Cloud Database Systems | 17
YeSQL | 17
Scalable and Robust Snapshot Isolation for High-Performance Storage Engines | 17
Dynamic Graph Databases with Out-of-Order Updates | 16
Distributed learning of fully connected neural networks using independent subnet training | 16
PIM-Tree | 16
MiCS | 16
Simpler is More: Efficient Top-K Nearest Neighbors Search on Large Road Networks | 16
OceanBase Paetica: A Hybrid Shared-Nothing/Shared-Everything Database for Supporting Single Machine and Distributed Cluster | 16
Tigger: A Database Proxy That Bounces with User-Bypass | 16
LOGER: A Learned Optimizer Towards Generating Efficient and Robust Query Execution Plans | 16
PGE | 16
ELEET: Efficient Learned Query Execution over Text and Tables | 16
TIGER: Training Inductive Graph Neural Network for Large-Scale Knowledge Graph Reasoning | 16
TGL | 16
Skellam mixture mechanism | 15
Language Models Enable Simple Systems for Generating Structured Views of Heterogeneous Data Lakes | 15
Lingua Manga: A Generic Large Language Model Centric System for Data Curation | 15
Efficient kNN Search in Public Transportation Networks | 15
Efficient Algorithms for Pseudoarboricity Computation in Large Static and Dynamic Graphs | 15
Themis: A GPU-Accelerated Relational Query Execution Engine | 15
DataRinse: Semantic Transforms for Data Preparation Based on Code Mining | 15
Fast approximate denial constraint discovery | 15
Vortex: Overcoming Memory Capacity Limitations in GPU-Accelerated Large-Scale Data Analytics | 15
ParChain | 15
DBMS annihilator | 15
ArcheType: A Novel Framework for Open-Source Column Type Annotation Using Large Language Models | 15
Towards distributed bitruss decomposition on bipartite graphs | 15
Efficient Triangle-Connected Truss Community Search in Dynamic Graphs | 15
The case for distributed shared-memory databases with RDMA-enabled memory disaggregation | 15
PRICE: A Pretrained Model for Cross-Database Cardinality Estimation | 15
Task: An Efficient Framework for Instant Error-Tolerant Spatial Keyword Queries on Road Networks | 15
Efficient Discovery of Significant Patterns with Few-Shot Resampling | 15
ELPIS: Graph-Based Similarity Search for Scalable Data Science | 15
xFraud | 15
QTCS: Efficient Query-Centered Temporal Community Search | 15
Generating Succinct Descriptions of Database Schemata for Cost-Efficient Prompting of Large Language Models | 14
MT-teql | 14
ImDiffusion: Imputed Diffusion Models for Multivariate Time Series Anomaly Detection | 14
Designing production-friendly machine learning | 14
Points-of-interest relationship inference with spatial-enriched graph neural networks | 14
FlowWalker: A Memory-Efficient and High-Performance GPU-Based Dynamic Graph Random Walk Framework | 14
ABC | 14
Exploiting Cloud Object Storage for High-Performance Analytics | 14
Reimagining Deep Learning Systems through the Lens of Data Systems | 14
Detecting layout templates in complex multiregion files | 14
POEM: Pattern-Oriented Explanations of Convolutional Neural Networks | 14
ChainDash: An Ad-Hoc Blockchain Data Analytics System | 14
Explaining Differentially Private Query Results with DPXPlain | 14
Incremental Detection of Denial Constraint Violations | 14
OFL-W3: A One-Shot Federated Learning System on Web 3.0 | 14
Sancus | 14
AMRAS | 14
SecretFlow-SCQL: A Secure Collaborative Query Platform | 14
Tiresias | 14
Blink-hash: An Adaptive Hybrid Index for In-Memory Time-Series Databases | 13
Efficient Execution of User-Defined Functions in SQL Queries | 13
NeutronStream: A Dynamic GNN Training Framework with Sliding Window for Graph Streams | 13
WebMILE | 13
Maximum k-Plex Search: An Alternated Reduction-and-Bound Method | 13
Intelligent Agents for Data Exploration | 13
Density Personalized Group Query | 13
An Experimental Evaluation of Anomaly Detection in Time Series | 13
Angel-PTM: A Scalable and Economical Large-Scale Pre-Training System in Tencent | 13
Decentralized crowdsourcing for human intelligence tasks with efficient on-chain cost | 13
No Repetition | 13
SIMformer: Single-Layer Vanilla Transformer Can Learn Free-Space Trajectory Similarity | 13
Optimal Matrix Sketching over Sliding Windows | 13
Efficient Black-Box Checking of Snapshot Isolation in Databases | 13
Troubles with nulls, views from the users | 13
Breaking It Down: An In-Depth Study of Index Advisors | 13
Explaining Dataset Changes for Semantic Data Versioning with Explain-Da-V | 13
Efficient Cost Modeling of Space-Filling Curves | 13
gCore: Exploring Cross-Layer Cohesiveness in Multi-Layer Graphs | 13
Machine Learning for Subgraph Extraction: Methods, Applications and Challenges | 13
Spade: A Real-Time Fraud Detection Framework | 12
Semi-Oblivious Chase Termination for Linear Existential Rules: An Experimental Study | 12
The power of summarization in graph mining and learning | 12
Agile-Ant: Self-Managing Distributed Cache Management for Cost Optimization of Big Data Applications | 12
Pantheon | 12
Accelerating Similarity Search for Elastic Measures: A Study and New Generalization of Lower Bounding Distances | 12
DILI: A Distribution-Driven Learned Index | 12
DBOS | 12
FedTSC | 12
Hu-fu | 12
SPECIAL: SynoPsis AssistEd Secure Collaborative AnaLytics | 12
IncrCP: Decomposing and Orchestrating Incremental Checkpoints for Effective Recommendation Model Training | 12
LANNS | 12
Data and AI Model Markets: Opportunities for Data and Model Sharing, Discovery, and Integration | 12
Mixed Covers of Keys and Functional Dependencies for Maintaining the Integrity of Data under Updates | 12
BICE: Exploring Compact Search Space by Using Bipartite Matching and Cell-Wide Verification | 12
MITra: A Framework for Multi-Instance Graph Traversal | 12
AdaNDV: Adaptive Number of Distinct Value Estimation via Learning to Select and Fuse Estimators | 12
Demonstration of accelerating machine learning inference queries with correlative proxy models | 12
Finding locally densest subgraphs | 12
CatSQL: Towards Real World Natural Language to SQL Applications | 12
Weakly Guided Adaptation for Robust Time Series Forecasting | 12
TOD | 11
A Tutorial on Visual Representations of Relational Queries | 11
DumpKV: Learning Based Lifetime Aware Garbage Collection for Key Value Separation in LSM-Tree | 11
Calibrating Noise for Group Privacy in Subsampled Mechanisms | 11
Fairness Matters | 11
A Memory Guided Transformer for Time Series Forecasting | 11