Proceedings of the VLDB Endowment

Papers
(The TQCC of Proceedings of the VLDB Endowment is 10. The table below lists the papers above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2022-05-01 to 2026-05-01.)
Article	Citations
IsoBugView	485
Cardinality Estimation for Having-Clauses	297
A Reproducible Tutorial on Reproducibility in Database Systems Research	188
QPJVis Demo: Quality-Boost Progressive Join Query Processing System	165
Accelerating Subgraph Matching through Fine-Grained and Powerful Equivalences	123
G-tran	108
Efficient Graph Data Access for Out-of-Memory GPU Streaming Graph Processing	107
Towards Designing and Learning Piecewise Space-Filling Curves	96
Algorithm and system co-design for efficient subgraph-based graph representation learning	90
PARQO: Penalty-Aware Robust Plan Selection in Query Optimization	81
Influential Community Search over Large Heterogeneous Information Networks	75
Efficient Distributed Transaction Processing in Heterogeneous Networks	71
Timestamp as a Service, Not an Oracle	66
How to Optimize SQL Queries? A Comparison Between Split, Holistic, and Hybrid Approaches	65
Breathing New Life into an Old Tree: Resolving Logging Dilemma of B+-tree on Modern Computational Storage Drives	65
Unraveling the Impact of Window Semantics: Optimizing Join Order for Efficient Stream Processing	65
Relational Data Models for Genetic VCF data	63
Shifting Transaction Isolation on Graphs: From Systems to Data	63
Fries	62
Efficient Discovery of Relaxed Functional Dependencies	62
Unify: A System For Unstructured Data Analytics	62
SkyStore: Cost-Optimized Object Storage Across Regions and Clouds	61
ConANN: Conformal Approximate Nearest Neighbor Search	61
Opportunities for Quantum Acceleration of Databases: Optimization of Queries and Transaction Schedules	61
Spectrum: Speedy and Strictly-Deterministic Smart Contract Transactions for Blockchain Ledgers	58
Galvatron	58
Privacy for Free: Leveraging Local Differential Privacy Perturbed Data from Multiple Services	57
Cloudy with a Chance of JSON	57
Reliable community search in dynamic networks	55
GaussDB: A Cloud-Native Multi-Primary Database with Compute-Memory-Storage Disaggregation	54
TSB-AutoAD: Towards Automated Solutions for Time-Series Anomaly Detection	52
DyHealth	51
Solver-In-The-Loop Cluster Resource Management for Database-as-a-Service	50
Motiflets	49
OmniSketch: Efficient Multi-Dimensional High-Velocity Stream Analytics with Arbitrary Predicates	48
Approximating probabilistic group steiner trees in graphs	47
Resilience-Aware Elastic Scaling for Cloud-Native Online DL Training on Multi-Tenant GPU Clusters	47
TRIM: An Efficient Framework for Exact Eccentricity Computation on Large-Scale Graphs	47
DuckDB-wasm	44
VeriBench: Analyzing the Performance of Database Systems with Verifiability	43
Neighborhood-Based Hypergraph Core Decomposition	43
LION: Fast and High-Resolution Network Kernel Density Visualization	42
Differentially Private Stream Processing at Scale	42
POEM	41
DoppelGanger++ in Action: A Database Replay System with Fast Dependency Graph Generation	40
SAIL: A Voyage to Symbolic Approximation Solutions for Time-Series Analysis	39
Demonstrating Waffle: A Self-Driving Grid Index	39
Exploiting the Power of Equality-Generating Dependencies in Ontological Reasoning	39
A Comprehensive Survey and Experimental Study of Learning-Based Community Search	39
TsQuality: Measuring Time Series Data Quality in Apache IoTDB	38
DARKER: Efficient Transformer with Data-Driven Attention Mechanism for Time Series	38
LogLite: Lightweight Plug-and-Play Streaming Log Compression	37
Databases Unbound: Querying All of the World's Bytes with AI	36
PSFQ: A Blockchain-Based Privacy-Preserving and Verifiable Student Feedback Questionnaire Platform	36
DPXPlain	36
SUFF: Accelerating Subgraph Matching with Historical Data	36
Efficient Non-Learning Similar Subtrajectory Search	36
Hermes: Off-the-Shelf Real-Time Transactional Analytics	35
FairDAG: Consensus Fairness over Multi-Proposer Causal Design	35
Trie memtables in cassandra	35
SQL Engines Excel at the Execution of Imperative Programs	35
Bonspiel: Low Tail Latency Transactions in Geo-Distributed Databases	35
VeLP: Vehicle Loading Plan Learning from Human Behavior in Nationwide Logistics System	34
Federated Data Distribution Shift Estimation	34
Cuckoo Heavy Keeper and the Balancing Act of Maintaining Heavy Hitters in Stream Processing	34
bNDCRepair: Cleaning both Data Errors and Inaccurate Constraints on Numerical Sequential Data	33
HyperBlocker: Accelerating Rule-Based Blocking in Entity Resolution Using GPUs	33
Hardware-Efficient Data Imputation through DBMS Extensibility	32
LITS: An Optimized Learned Index for Strings	32
HAIChart: Human and AI Paired Visualization System	32
SingleStore-V: An Integrated Vector Database System in SingleStore	31
Fast Verification of Strong Database Isolation	31
Plush	31
LIDER	31
Making CRDTs Not So Eventual	31
Eureka: Enabling Fine-Grained Access and Range Queries on Compressed Scientific Data via Data-Index Co-Compression	31
Approximate Queries over Concurrent Updates	31
A demonstration of multi-region CockroachDB	30
IsoVista: Black-Box Checking Database Isolation Guarantees	30
Scalable Reasoning on Document Stores via Instance-Aware Query Rewriting	30
Seiden: Revisiting Query Processing in Video Database Systems	30
Improving matrix-vector multiplication via lossless grammar-compressed matrices	30
GalaxyWeaver: Autonomous Table-to-Graph Conversion and Schema Optimization with Large Language Models	29
HADES: Range-Filtered Private Aggregation on Public Data	29
FSMDTW: A Fast Index-Free Subsequence Matching Algorithm for Dynamic Time Warping	29
Simulating a Transactional Server for Multi-Model Systems	29
SparkCAD	29
FastMosaic in Action: A New Mosaic Operator for Array DBMSs	29
Dealing with Acronyms, Abbreviations, and Typos in Real-World Entity Matching	29
A Practical Theory of Generalization in Selectivity Learning	29
Decentralized Actor Scheduling and Reference-Based Storage in Xorbits: A Native Scalable Data Science Engine	29
Vive la Différence: Practical Diff Testing of Stateful Applications	29
Efficient and Accurate SimRank-Based Similarity Joins: Experiments, Analysis, and Improvement	28
KGNav: A Knowledge Graph Navigational Visual Query System	28
ALECE: An Attention-based Learned Cardinality Estimator for SPJ Queries on Dynamic Workloads	28
Enriching Relations with Additional Attributes for ER	28
RICH: Real-Time Identification of Negative Cycles for High-Efficiency Arbitrage	28
BURST: Rendering Clustering Techniques Suitable for Evolving Streams	28
Serving deep learning models with deduplication from relational databases	27
Toward Quantity-of-Interest Preserving Lossy Compression for Scientific Data	27
From Zero to Hero: Detecting Leaked Data through Synthetic Data Injection and Model Querying	27
MLP-Mixer based Masked Autoencoders are Effective, Explainable and Robust for Time Series Anomaly Detection	27
Hercules against data series similarity search	27
Design trade-offs for a robust dynamic hybrid hash join	27
Enhancing Accuracy for Super Spreader Identification in High-Speed Data Streams	27
CoroGraph: Bridging Cache Efficiency and Work Efficiency for Graph Algorithm Execution	27
VIDEX: A Disaggregated and Extensible Virtual Index for the Cloud and AI Era	26
Saving Money for Analytical Workloads in the Cloud	26
Succinct graph representations as distance oracles	26
PerMA-bench	26
Petabyte-Scale Row-Level Operations in Data Lakehouses	26
CMixing: An Efficient Coin Mixing Platform to Enhance Anonymity in Cryptocurrency Transactions	25
Navigating Data Repositories: Utilizing Line Charts to Discover Relevant Datasets	25
Efficient Approximation of Certain and Possible Answers for Ranking and Window Queries over Uncertain Data	25
Win-Win: On Simultaneous Clustering and Imputing over Incomplete Data	25
FS-Real: A Real-World Cross-Device Federated Learning Platform	25
Biathlon: Harnessing Model Resilience for Accelerating ML Inference Pipelines	25
Expanding Reverse Nearest Neighbors	25
TATA: An Efficient Framework for Task Transfer in Query Plan Representation	25
ACTA: Autonomy and Coordination Task Assignment in Spatial Crowdsourcing Platforms	25
OneProvenance: Efficient Extraction of Dynamic Coarse-Grained Provenance from Database Query Event Logs	24
Optimal Sharding for Scalable Blockchains with Deconstructed SMR	24
Sphinteract: Resolving Ambiguities in NL2SQL through User Interaction	24
Optimizing machine learning inference queries with correlative proxy models	24
Kora: A Cloud-Native Event Streaming Platform for Kafka	24
Discovering Leitmotifs in Multidimensional Time Series	24
XDB in Action: Decentralized Cross-Database Query Processing for Black-Box DBMSes	24
Sparcle: Boosting the Accuracy of Data Cleaning Systems through Spatial Awareness	23
Dalton	23
FARGO: Fast Maximum Inner Product Search via Global Multi-Probing	23
Less is More: Efficient Time Series Dataset Condensation via Two-Fold Modal Matching	23
ETC: Efficient Training of Temporal Graph Neural Networks over Large-Scale Dynamic Graphs	23
Instance-Optimal Acyclic Join Processing Without Regret: Engineering the Yannakakis Algorithm in Column Stores	23
ContTune: Continuous Tuning by Conservative Bayesian Optimization for Distributed Stream Data Processing Systems	23
Anomaly detection in time series	23
Oasis: An Optimal Disjoint Segmented Learned Range Filter	23
DINOMO	23
LavaStore: ByteDance's Purpose-Built, High-Performance, Cost-Effective Local Storage Engine for Cloud Services	23
CORE-Sketch: On Exact Computation of Median Absolute Deviation with Limited Space	23
Window Function Expression: Let the Self-Join Enter	22
Fused Gromov-Wasserstein Alignment for Graph Edit Distance Computation and Beyond	22
Demo of QueryBooster: Supporting Middleware-Based SQL Query Rewriting as a Service	22
QuoteInspector: Gaining Insight about Social Media Discussions	22
TuskFlow: An Efficient Graph Database for Long-Running Transactions	22
Cloud data systems	22
Unleash the Power of Ellipsis: Accuracy-Enhanced Sparse Vector Technique with Exponential Noise	22
SCompression: Enhancing Database Knob Tuning Efficiency Through Slice-Based OLTP Workload Compression	22
Hybrid Mixed Integer Linear Programming for Large-Scale Join Order Optimisation	21
DAFDiscover: Robust Mining Algorithm for Dynamic Approximate Functional Dependencies on Dirty Data	21
Efficient Fault Tolerance for Recommendation Model Training via Erasure Coding	21
Falcon: Advancing Asynchronous BFT Consensus for Lower Latency and Enhanced Throughput	21
On More Efficiently and Versatilely Querying Historical k-Cores	21
Accelerating Maximal Clique Enumeration via Graph Reduction	20
Quantifying Point Contributions: A Lightweight Framework for Efficient and Effective Query-Driven Trajectory Simplification	20
Starry	20
Authenticated Aggregate Queries with Boolean Range Predicates on Blockchains	20
CEDA: Learned Cardinality Estimation with Domain Adaptation	20
Uldp-FL: Federated Learning with Across-Silo User-Level Differential Privacy	20
Datamap-Driven Tabular Coreset Selection for Classifier Training	20
Nuhuo: An Effective Estimation Model for Traffic Speed Histogram Imputation on A Road Network	20
L2chain	20
Polyglot data management	20
Saturn: An Optimized Data System for Multi-Large-Model Deep Learning Workloads	20
Pyneapple-G: Scalable Spatial Grouping Queries	20
Minimum Strongly Connected Subgraph Collection in Dynamic Graphs	20
AeonG: An Efficient Built-in Temporal Support in Graph Databases	20
From Scale-Up to Scale-Out: PolarDB's Journey to Achieving 2 Billion tpmC	19
TimeCSL: Unsupervised Contrastive Learning of General Shapelets for Explorable Time Series Analysis	19
Efficient kNN Search in Public Transportation Networks	19
RCRank: Multimodal Ranking of Root Causes of Slow Queries in Cloud Database Systems	19
DPSUR: Accelerating Differentially Private Stochastic Gradient Descent Using Selective Update and Release	19
GQL and SQL/PGQ: Theoretical Models and Expressive Power	19
A Hierarchical Grouping Algorithm for the Multi-Vehicle Dial-a-Ride Problem	19
Combining Small Language Models and Large Language Models for Zero-Shot NL2SQL	19
A Case for Graphics-Driven Query Processing	19
GENTI: GPU-Powered Walk-Based Subgraph Extraction for Scalable Representation Learning on Dynamic Graphs	19
Computing Rule-Based Explanations by Leveraging Counterfactuals	19
Efficient Discovery of Significant Patterns with Few-Shot Resampling	19
ResLake: Towards Minimum Job Latency and Balanced Resource Utilization in Geo-Distributed Job Scheduling	19
Resource Management in Aurora Serverless	19
Machine Learning for Graph Data Management and Query Processing	18
DataRinse: Semantic Transforms for Data Preparation Based on Code Mining	18
Cents: A Flexible and Cost-Effective Framework for LLM-Based Table Understanding	18
KEIGO: Co-Designing Log-Structured Merge Key-Value Stores with a Non-Volatile, Concurrency-Aware Storage Hierarchy	18
Simpler is More: Efficient Top-K Nearest Neighbors Search on Large Road Networks	18
DBMS annihilator	18
Mix & Match: Subgraph Matching for Absolute Coverage	18
Fast approximate denial constraint discovery	18
Anarchy in the Database: A Survey and Evaluation of Database Management System Extensibility	18
GRewriter: Practical Query Rewriting with Automatic Rule Set Expansion in GaussDB	18
Analyzing Near-Network Hardware Acceleration with Co-Processing on DPUs	18
Towards distributed bitruss decomposition on bipartite graphs	18
QStore: Quantization-Aware Compressed Model Storage	18
To UDFs and Beyond: Demonstration of a Fully Decomposed Data Processor for General Data Wrangling Tasks	18
Ganos Aero: A Cloud-Native System for Big Raster Data Management and Processing	18
Bridging Disciplines in Data Management Research to Solve Complex Data Problems	18
Lingua Manga: A Generic Large Language Model Centric System for Data Curation	18
Task: An Efficient Framework for Instant Error-Tolerant Spatial Keyword Queries on Road Networks	18
TUX: Efficient Drop-in Networking for Database Systems	18
Differentially Private Data Generation with Missing Data	17
Efficient Algorithms for Pseudoarboricity Computation in Large Static and Dynamic Graphs	17
Themis: A GPU-Accelerated Relational Query Execution Engine	17
OceanBase Paetica: A Hybrid Shared-Nothing/Shared-Everything Database for Supporting Single Machine and Distributed Cluster	17
Language Models Enable Simple Systems for Generating Structured Views of Heterogeneous Data Lakes	17
ELPIS: Graph-Based Similarity Search for Scalable Data Science	17
ImDiffusion: Imputed Diffusion Models for Multivariate Time Series Anomaly Detection	17
ELEET: Efficient Learned Query Execution over Text and Tables	17
Sancus	17
PRICE: A Pretrained Model for Cross-Database Cardinality Estimation	17
Vodka: Rethink Benchmarking Philosophy in HTAP Systems	17
Dynamic Graph Databases with Out-of-Order Updates	17
Efficient Triangle-Connected Truss Community Search in Dynamic Graphs	17
QTCS: Efficient Query-Centered Temporal Community Search	17
PIM-Tree	17
LOGER: A Learned Optimizer Towards Generating Efficient and Robust Query Execution Plans	17
Composable Data Management: An Execution Overview	17
Skellam mixture mechanism	17
Tigger: A Database Proxy That Bounces with User-Bypass	17
ArcheType: A Novel Framework for Open-Source Column Type Annotation Using Large Language Models	17
Vortex: Overcoming Memory Capacity Limitations in GPU-Accelerated Large-Scale Data Analytics	17
YeSQL	17
TIGER: Training Inductive Graph Neural Network for Large-Scale Knowledge Graph Reasoning	17
Scalable and Robust Snapshot Isolation for High-Performance Storage Engines	17
Machine Learning for Subgraph Extraction: Methods, Applications and Challenges	16
Explaining Dataset Changes for Semantic Data Versioning with Explain-Da-V	16
POEM: Pattern-Oriented Explanations of Convolutional Neural Networks	16
AMRAS	16
MiCS	16
The case for distributed shared-memory databases with RDMA-enabled memory disaggregation	16
Access Control for Information-Theoretically Secure Data	16
Streaming Time Series Subsequence Anomaly Detection: A Glance and Focus Approach	16
ABC	16
Beyond Shortest Paths: Node Fairness in Route Recommendation	16
Incremental Detection of Denial Constraint Violations	16
Reimagining Deep Learning Systems through the Lens of Data Systems	16
OFL-W3: A One-Shot Federated Learning System on Web 3.0	16
MD-MVCC: Multi-Version Concurrency Control for Schema Changes in Azure SQL Database	16
OpenFGL: A Comprehensive Benchmark for Federated Graph Learning	16
ChainDash: An Ad-Hoc Blockchain Data Analytics System	16
Decentralized crowdsourcing for human intelligence tasks with efficient on-chain cost	16
Blink-hash: An Adaptive Hybrid Index for In-Memory Time-Series Databases	16
Improving DBMS Scheduling Decisions with Accurate Performance Prediction on Concurrent Queries	16
Heta: Distributed Training of Heterogeneous Graph Neural Networks	16
FlowWalker: A Memory-Efficient and High-Performance GPU-Based Dynamic Graph Random Walk Framework	15
Towards Principled, Practical Document Database Design	15
Exploiting Cloud Object Storage for High-Performance Analytics	15
Efficient Execution of User-Defined Functions in SQL Queries	15
SecretFlow-SCQL: A Secure Collaborative Query Platform	15
Design and Modular Verification of Distributed Transactions in MongoDB	15
Troubles with nulls, views from the users	15