Proceedings of the VLDB Endowment

Papers
(The median citation count of Proceedings of the VLDB Endowment is 3. The table below lists the papers above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-11-01 to 2025-11-01.)
Article | Citations
IsoBugView | 577
Approximating probabilistic group steiner trees in graphs | 377
Timestamp as a Service, Not an Oracle | 222
Cardinality Estimation for Having-Clauses | 101
DuckDB-wasm | 100
Privacy for Free: Leveraging Local Differential Privacy Perturbed Data from Multiple Services | 93
QPJVis Demo: Quality-Boost Progressive Join Query Processing System | 92
Solver-In-The-Loop Cluster Resource Management for Database-as-a-Service | 87
Efficient Distributed Transaction Processing in Heterogeneous Networks | 75
OmniSketch: Efficient Multi-Dimensional High-Velocity Stream Analytics with Arbitrary Predicates | 73
Algorithm and system co-design for efficient subgraph-based graph representation learning | 72
Reliable community search in dynamic networks | 70
How to Optimize SQL Queries? A Comparison Between Split, Holistic, and Hybrid Approaches | 67
Unraveling the Impact of Window Semantics: Optimizing Join Order for Efficient Stream Processing | 57
Efficient Graph Data Access for Out-of-Memory GPU Streaming Graph Processing | 56
Shifting Transaction Isolation on Graphs: From Systems to Data | 56
Relational Data Models for Genetic VCF data | 54
Accelerating Subgraph Matching through Fine-Grained and Powerful Equivalences | 54
Unify: A System For Unstructured Data Analytics | 53
Cloudy with a Chance of JSON | 53
Breathing New Life into an Old Tree: Resolving Logging Dilemma of B+-tree on Modern Computational Storage Drives | 51
TSB-AutoAD: Towards Automated Solutions for Time-Series Anomaly Detection | 50
PARQO: Penalty-Aware Robust Plan Selection in Query Optimization | 50
VeriBench: Analyzing the Performance of Database Systems with Verifiability | 50
Motiflets | 49
Neighborhood-Based Hypergraph Core Decomposition | 49
Efficient Discovery of Relaxed Functional Dependencies | 48
GaussDB: A Cloud-Native Multi-Primary Database with Compute-Memory-Storage Disaggregation | 48
SkyStore: Cost-Optimized Object Storage Across Regions and Clouds | 48
Spectrum: Speedy and Strictly-Deterministic Smart Contract Transactions for Blockchain Ledgers | 47
Influential Community Search over Large Heterogeneous Information Networks | 47
Galvatron | 47
SpaceSaving± | 47
Opportunities for Quantum Acceleration of Databases: Optimization of Queries and Transaction Schedules | 46
Differentially Private Stream Processing at Scale | 45
A Reproducible Tutorial on Reproducibility in Database Systems Research | 45
Fries | 44
DyHealth | 44
G-tran | 44
LION: Fast and High-Resolution Network Kernel Density Visualization | 43
Towards Designing and Learning Piecewise Space-Filling Curves | 43
POEM | 42
DoppelGanger++ in Action: A Database Replay System with Fast Dependency Graph Generation | 41
HyperBlocker: Accelerating Rule-Based Blocking in Entity Resolution Using GPUs | 41
Hardware-Efficient Data Imputation through DBMS Extensibility | 41
Making CRDTs Not So Eventual | 40
Exploiting the Power of Equality-Generating Dependencies in Ontological Reasoning | 40
Plush | 39
SAIL: A Voyage to Symbolic Approximation Solutions for Time-Series Analysis | 39
Improving matrix-vector multiplication via lossless grammar-compressed matrices | 39
A Comprehensive Survey and Experimental Study of Learning-Based Community Search | 38
Cuckoo Heavy Keeper and the Balancing Act of Maintaining Heavy Hitters in Stream Processing | 37
Demonstrating Waffle: A Self-Driving Grid Index | 36
LIDER | 35
DARKER: Efficient Transformer with Data-Driven Attention Mechanism for Time Series | 35
Efficient Non-Learning Similar Subtrajectory Search | 35
VeLP: Vehicle Loading Plan Learning from Human Behavior in Nationwide Logistics System | 35
LITS: An Optimized Learned Index for Strings | 35
Seiden: Revisiting Query Processing in Video Database Systems | 35
DPXPlain | 34
IsoVista: Black-Box Checking Database Isolation Guarantees | 33
Trie memtables in cassandra | 32
SingleStore-V: An Integrated Vector Database System in SingleStore | 32
TsQuality: Measuring Time Series Data Quality in Apache IoTDB | 32
Approximate Queries over Concurrent Updates | 31
PSFQ: A Blockchain-Based Privacy-Preserving and Verifiable Student Feedback Questionnaire Platform | 31
SUFF: Accelerating Subgraph Matching with Historical Data | 30
SQL Engines Excel at the Execution of Imperative Programs | 30
Hermes: Off-the-Shelf Real-Time Transactional Analytics | 30
Federated Data Distribution Shift Estimation | 30
Pre-training summarization models of structured datasets for cardinality estimation | 29
Incremental partitioning for efficient spatial data analytics | 29
LogLite: Lightweight Plug-and-Play Streaming Log Compression | 29
TSB-UAD | 29
Bonspiel: Low Tail Latency Transactions in Geo-Distributed Databases | 29
Databases Unbound: Querying All of the World's Bytes with AI | 28
Scalable Reasoning on Document Stores via Instance-Aware Query Rewriting | 28
HAIChart: Human and AI Paired Visualization System | 28
MLP-Mixer based Masked Autoencoders are Effective, Explainable and Robust for Time Series Anomaly Detection | 28
Succinct graph representations as distance oracles | 27
Dealing with Acronyms, Abbreviations, and Typos in Real-World Entity Matching | 27
A demonstration of multi-region CockroachDB | 27
Instance-Optimal Acyclic Join Processing Without Regret: Engineering the Yannakakis Algorithm in Column Stores | 27
Saving Money for Analytical Workloads in the Cloud | 26
SparkCAD | 26
GalaxyWeaver: Autonomous Table-to-Graph Conversion and Schema Optimization with Large Language Models | 26
Oasis: An Optimal Disjoint Segmented Learned Range Filter | 26
Simulating a Transactional Server for Multi-Model Systems | 26
A Practical Theory of Generalization in Selectivity Learning | 26
VIDEX: A Disaggregated and Extensible Virtual Index for the Cloud and AI Era | 26
Optimal Sharding for Scalable Blockchains with Deconstructed SMR | 25
Decentralized Actor Scheduling and Reference-Based Storage in Xorbits: A Native Scalable Data Science Engine | 25
OneProvenance: Efficient Extraction of Dynamic Coarse-Grained Provenance from Database Query Event Logs | 25
RICH: Real-Time Identification of Negative Cycles for High-Efficiency Arbitrage | 25
Vive la Différence: Practical Diff Testing of Stateful Applications | 25
FSMDTW: A Fast Index-Free Subsequence Matching Algorithm for Dynamic Time Warping | 25
FastMosaic in Action: A New Mosaic Operator for Array DBMSs | 25
XDB in Action: Decentralized Cross-Database Query Processing for Black-Box DBMSes | 25
FARGO: Fast Maximum Inner Product Search via Global Multi-Probing | 25
HADES: Range-Filtered Private Aggregation on Public Data | 25
KGNav: A Knowledge Graph Navigational Visual Query System | 25
Discovering Leitmotifs in Multidimensional Time Series | 24
ETC: Efficient Training of Temporal Graph Neural Networks over Large-Scale Dynamic Graphs | 24
Design trade-offs for a robust dynamic hybrid hash join | 24
ALECE: An Attention-based Learned Cardinality Estimator for SPJ Queries on Dynamic Workloads | 24
Enriching Relations with Additional Attributes for ER | 24
BURST: Rendering Clustering Techniques Suitable for Evolving Streams | 24
Enabling SQL-based training data debugging for federated learning | 24
Dalton | 24
DINOMO | 23
Efficient Approximation of Certain and Possible Answers for Ranking and Window Queries over Uncertain Data | 23
Less is More: Efficient Time Series Dataset Condensation via Two-Fold Modal Matching | 23
Kora: A Cloud-Native Event Streaming Platform for Kafka | 23
Enhancing Accuracy for Super Spreader Identification in High-Speed Data Streams | 23
Toward Quantity-of-Interest Preserving Lossy Compression for Scientific Data | 23
Sparcle: Boosting the Accuracy of Data Cleaning Systems through Spatial Awareness | 22
LavaStore: ByteDance's Purpose-Built, High-Performance, Cost-Effective Local Storage Engine for Cloud Services | 22
Sphinteract: Resolving Ambiguities in NL2SQL through User Interaction | 22
From Zero to Hero: Detecting Leaked Data through Synthetic Data Injection and Model Querying | 22
ACTA: Autonomy and Coordination Task Assignment in Spatial Crowdsourcing Platforms | 22
Federated matrix factorization with privacy guarantee | 22
Optimizing machine learning inference queries with correlative proxy models | 22
CMixing: An Efficient Coin Mixing Platform to Enhance Anonymity in Cryptocurrency Transactions | 22
Expanding Reverse Nearest Neighbors | 22
PerMA-bench | 21
Subgraph matching over graph federation | 21
ContTune: Continuous Tuning by Conservative Bayesian Optimization for Distributed Stream Data Processing Systems | 21
Biathlon: Harnessing Model Resilience for Accelerating ML Inference Pipelines | 21
Hercules against data series similarity search | 21
Efficient and Accurate SimRank-Based Similarity Joins: Experiments, Analysis, and Improvement | 21
Serving deep learning models with deduplication from relational databases | 21
CoroGraph: Bridging Cache Efficiency and Work Efficiency for Graph Algorithm Execution | 21
Win-Win: On Simultaneous Clustering and Imputing over Incomplete Data | 21
FS-Real: A Real-World Cross-Device Federated Learning Platform | 21
Navigating Data Repositories: Utilizing Line Charts to Discover Relevant Datasets | 21
Anomaly detection in time series | 20
Window Function Expression: Let the Self-Join Enter | 20
Cloud data systems | 20
Ember | 20
QuoteInspector: Gaining Insight about Social Media Discussions | 20
Petabyte-Scale Row-Level Operations in Data Lakehouses | 20
CORE-Sketch: On Exact Computation of Median Absolute Deviation with Limited Space | 20
Datamap-Driven Tabular Coreset Selection for Classifier Training | 20
Pyneapple-G: Scalable Spatial Grouping Queries | 19
Selective data acquisition in the wild for model charging | 19
Quantifying Point Contributions: A Lightweight Framework for Efficient and Effective Query-Driven Trajectory Simplification | 19
A Case for Graphics-Driven Query Processing | 19
From Scale-Up to Scale-Out: PolarDB's Journey to Achieving 2 Billion tpmC | 19
TuskFlow: An Efficient Graph Database for Long-Running Transactions | 19
Saturn: An Optimized Data System for Multi-Large-Model Deep Learning Workloads | 19
Minimum Strongly Connected Subgraph Collection in Dynamic Graphs | 19
Demo of QueryBooster: Supporting Middleware-Based SQL Query Rewriting as a Service | 19
RCRank: Multimodal Ranking of Root Causes of Slow Queries in Cloud Database Systems | 19
Fused Gromov-Wasserstein Alignment for Graph Edit Distance Computation and Beyond | 19
Uldp-FL: Federated Learning with Across-Silo User-Level Differential Privacy | 19
Unleash the Power of Ellipsis: Accuracy-Enhanced Sparse Vector Technique with Exponential Noise | 19
CEDA: Learned Cardinality Estimation with Domain Adaptation | 19
Resource Management in Aurora Serverless | 19
Authenticated Aggregate Queries with Boolean Range Predicates on Blockchains | 19
GQL and SQL/PGQ: Theoretical Models and Expressive Power | 18
ResLake: Towards Minimum Job Latency and Balanced Resource Utilization in Geo-Distributed Job Scheduling | 18
Efficient Fault Tolerance for Recommendation Model Training via Erasure Coding | 18
Simpler is More: Efficient Top-K Nearest Neighbors Search on Large Road Networks | 18
L2chain | 18
TimeCSL: Unsupervised Contrastive Learning of General Shapelets for Explorable Time Series Analysis | 18
Starry | 18
DPSUR: Accelerating Differentially Private Stochastic Gradient Descent Using Selective Update and Release | 18
Falcon: Advancing Asynchronous BFT Consensus for Lower Latency and Enhanced Throughput | 18
On More Efficiently and Versatilely Querying Historical k-Cores | 18
Combining Small Language Models and Large Language Models for Zero-Shot NL2SQL | 18
Polyglot data management | 18
A Hierarchical Grouping Algorithm for the Multi-Vehicle Dial-a-Ride Problem | 18
GENTI: GPU-Powered Walk-Based Subgraph Extraction for Scalable Representation Learning on Dynamic Graphs | 18
Computing Rule-Based Explanations by Leveraging Counterfactuals | 18
Nuhuo: An Effective Estimation Model for Traffic Speed Histogram Imputation on A Road Network | 18
SCompression: Enhancing Database Knob Tuning Efficiency Through Slice-Based OLTP Workload Compression | 18
DAFDiscover: Robust Mining Algorithm for Dynamic Approximate Functional Dependencies on Dirty Data | 18
Accelerating Maximal Clique Enumeration via Graph Reduction | 18
TranAD | 18
AeonG: An Efficient Built-in Temporal Support in Graph Databases | 18
Hyper-tune | 18
Fast neural ranking on bipartite graph indices | 18
TGL | 17
Composable Data Management: An Execution Overview | 17
Efficient Algorithms for Pseudoarboricity Computation in Large Static and Dynamic Graphs | 17
PIM-Tree | 17
The case for distributed shared-memory databases with RDMA-enabled memory disaggregation | 17
Themis: A GPU-Accelerated Relational Query Execution Engine | 17
YeSQL | 17
ELPIS: Graph-Based Similarity Search for Scalable Data Science | 17
Lingua Manga: A Generic Large Language Model Centric System for Data Curation | 17
DBMS annihilator | 17
Efficient Triangle-Connected Truss Community Search in Dynamic Graphs | 17
Efficient Discovery of Significant Patterns with Few-Shot Resampling | 17
PRICE: A Pretrained Model for Cross-Database Cardinality Estimation | 17
Sancus | 17
LOGER: A Learned Optimizer Towards Generating Efficient and Robust Query Execution Plans | 17
Vortex: Overcoming Memory Capacity Limitations in GPU-Accelerated Large-Scale Data Analytics | 17
Scalable and Robust Snapshot Isolation for High-Performance Storage Engines | 17
To UDFs and Beyond: Demonstration of a Fully Decomposed Data Processor for General Data Wrangling Tasks | 16
DataRinse: Semantic Transforms for Data Preparation Based on Code Mining | 16
ELEET: Efficient Learned Query Execution over Text and Tables | 16
MiCS | 16
Tigger: A Database Proxy That Bounces with User-Bypass | 16
KEIGO: Co-Designing Log-Structured Merge Key-Value Stores with a Non-Volatile, Concurrency-Aware Storage Hierarchy | 16
TIGER: Training Inductive Graph Neural Network for Large-Scale Knowledge Graph Reasoning | 16
Differentially Private Data Generation with Missing Data | 16
Anarchy in the Database: A Survey and Evaluation of Database Management System Extensibility | 16
Cents: A Flexible and Cost-Effective Framework for LLM-Based Table Understanding | 16
Efficient kNN Search in Public Transportation Networks | 16
Towards distributed bitruss decomposition on bipartite graphs | 16
Ganos Aero: A Cloud-Native System for Big Raster Data Management and Processing | 16
QTCS: Efficient Query-Centered Temporal Community Search | 15
Machine Learning for Graph Data Management and Query Processing | 15
xFraud | 15
Task: An Efficient Framework for Instant Error-Tolerant Spatial Keyword Queries on Road Networks | 15
Skellam mixture mechanism | 15
Bridging Disciplines in Data Management Research to Solve Complex Data Problems | 15
Fast approximate denial constraint discovery | 15
Dynamic Graph Databases with Out-of-Order Updates | 15
GRewriter: Practical Query Rewriting with Automatic Rule Set Expansion in GaussDB | 15
OceanBase Paetica: A Hybrid Shared-Nothing/Shared-Everything Database for Supporting Single Machine and Distributed Cluster | 15
ArcheType: A Novel Framework for Open-Source Column Type Annotation Using Large Language Models | 15
ImDiffusion: Imputed Diffusion Models for Multivariate Time Series Anomaly Detection | 14
Machine Learning for Subgraph Extraction: Methods, Applications and Challenges | 14
OFL-W3: A One-Shot Federated Learning System on Web 3.0 | 14
Access Control for Information-Theoretically Secure Data | 14
Improving DBMS Scheduling Decisions with Accurate Performance Prediction on Concurrent Queries | 14
Bringing the Operational and Analytical Worlds Together with Lakebase | 14
ABC | 14
FB+-Tree: A Memory-Optimized B+-Tree with Latch-Free Update | 14
Detecting layout templates in complex multiregion files | 14
Language Models Enable Simple Systems for Generating Structured Views of Heterogeneous Data Lakes | 14
Points-of-interest relationship inference with spatial-enriched graph neural networks | 14
Angel-PTM: A Scalable and Economical Large-Scale Pre-Training System in Tencent | 14
Generating Succinct Descriptions of Database Schemata for Cost-Efficient Prompting of Large Language Models | 14
NeutronStream: A Dynamic GNN Training Framework with Sliding Window for Graph Streams | 14
Efficient Execution of User-Defined Functions in SQL Queries | 14
MT-teql | 14
Reimagining Deep Learning Systems through the Lens of Data Systems | 14
PGE | 14
AMRAS | 14
Tiresias | 14
Heta: Distributed Training of Heterogeneous Graph Neural Networks | 14
Troubles with nulls, views from the users | 14
Fair Transaction Processing for Multi-Tenant Databases | 14
Decentralized crowdsourcing for human intelligence tasks with efficient on-chain cost | 14
POEM: Pattern-Oriented Explanations of Convolutional Neural Networks | 14
Streaming Time Series Subsequence Anomaly Detection: A Glance and Focus Approach | 14
Distributed learning of fully connected neural networks using independent subnet training | 14