Proceedings of the VLDB Endowment

Papers
(The TQCC of Proceedings of the VLDB Endowment is 7. The table below lists the papers above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-04-01 to 2025-04-01.)
Article | Citations
SHiFT | 369
An experimental evaluation and guideline for path finding in weighted dynamic network | 250
BICE: Exploring Compact Search Space by Using Bipartite Matching and Cell-Wide Verification | 132
XGNN: Boosting Multi-GPU GNN Training via Global GNN Memory Store | 95
DuckPGQ: Bringing SQL/PGQ to DuckDB | 86
Evolution of a compiling query engine | 67
PRUC | 66
Towards a polyglot framework for factorized ML | 65
Representing Paths in Graph Database Pattern Matching | 63
Can Learned Models Replace Hash Functions? | 62
Semi-Oblivious Chase Termination for Linear Existential Rules: An Experimental Study | 58
A study of database performance sensitivity to experiment settings | 55
High-Dimensional Data Cubes | 55
Accelerating recommendation system training by leveraging popular choices | 52
Flexible rule-based decomposition and metadata independence in modin | 51
An intermediate representation for hybrid database and machine learning workloads | 50
Towards plug-and-play visual graph query interfaces | 49
Privacy-Enhanced Database Synthesis for Benchmark Publishing | 48
Maximum k-Plex Search: An Alternated Reduction-and-Bound Method | 46
SIMformer: Single-Layer Vanilla Transformer Can Learn Free-Space Trajectory Similarity | 46
Chimera: A System Design of Dual Storage and Traversal-Join Unified Query Processing for SQL/PGQ | 43
From Logs to Causal Inference: Diagnosing Large Systems | 41
AutoDI | 40
Towards Designing and Learning Piecewise Space-Filling Curves | 40
LANNS | 39
DynaHB: A Communication-Avoiding Asynchronous Distributed Framework with Hybrid Batches for Dynamic GNN Training | 39
Two Birds With One Stone: Designing a Hybrid Cloud Storage Engine for HTAP | 39
Aleph Filter: To Infinity in Constant Time | 38
Agile-Ant: Self-Managing Distributed Cache Management for Cost Optimization of Big Data Applications | 38
Influential Community Search over Large Heterogeneous Information Networks | 37
Discovering related data at scale | 37
Blueprinting the Cloud: Unifying and Automatically Optimizing Cloud Data Infrastructures with BRAD | 37
Spooky | 37
Texera: A System for Collaborative and Interactive Data Analytics Using Workflows | 37
MITra: A Framework for Multi-Instance Graph Traversal | 37
A Comparative Study and Component Analysis of Query Plan Representation Techniques in ML4DB Studies | 36
Parallel Colorful h-Star Core Maintenance in Dynamic Graphs | 36
Opportunities for Quantum Acceleration of Databases: Optimization of Queries and Transaction Schedules | 36
ZIP: Lazy Imputation during Query Processing | 36
Efficient maximum k-plex computation over large sparse graphs | 35
DeepJoin: Joinable Table Discovery with Pre-Trained Language Models | 35
Reliable community search in dynamic networks | 34
Algorithmic Complexity Attacks on Dynamic Learned Indexes | 34
PerfGuard | 33
Towards distribution-aware query answering in data markets | 33
Fries | 33
Symmetric continuous subgraph matching with bidirectional dynamic programming | 33
Towards General and Efficient Online Tuning for Spark | 33
Frost | 32
Demonstration of accelerating machine learning inference queries with correlative proxy models | 32
Keep CALM and CRDT On | 32
ReMac | 32
Demo of marius | 32
WebMILE | 32
Catch a blowfish alive | 32
EPICGen | 31
Approximating probabilistic group steiner trees in graphs | 31
QO-Insight: Inspecting Steered Query Optimizers | 31
IsoBugView | 31
Towards Efficient Index Construction and Approximate Nearest Neighbor Search in High-Dimensional Spaces | 31
Data and AI Model Markets: Opportunities for Data and Model Sharing, Discovery, and Integration | 30
Pipemizer | 30
Database technology for the masses | 30
Anser: Adaptive Information Sharing Framework of AnalyticDB | 30
Sigma workbook | 30
Procedural extensions of SQL | 29
Wikinegata | 29
KG-Roar: Interactive Datalog-Based Reasoning on Virtual Knowledge Graphs | 29
DuckDB-wasm | 29
ZeroEA: A Zero-Training Entity Alignment Framework via Pre-Trained Language Model | 29
POLAR: Adaptive and Non-invasive Join Order Selection via Plans of Least Resistance | 28
On repairing timestamps for regular interval time series | 28
Density Personalized Group Query | 28
Analysis of influence contribution in social advertising | 28
The LDBC Social Network Benchmark | 28
How Do Categorical Duplicates Affect ML? A New Benchmark and Empirical Analyses | 28
Flexible Resource Allocation for Relational Database-as-a-Service | 28
LES 3 | 28
LightDiC: A Simple Yet Effective Approach for Large-Scale Digraph Representation Learning | 27
Spatial and temporal constrained ranked retrieval over videos | 27
No Repetition | 27
The power of summarization in graph mining and learning | 27
DBOS | 27
ADOPT: Adaptively Optimizing Attribute Orders for Worst-Case Optimal Join Algorithms via Reinforcement Learning | 25
ADF & TransApp: A Transformer-Based Framework for Appliance Detection Using Smart Meter Consumption Series | 25
Timestamp as a Service, Not an Oracle | 25
Low-latency compilation of SQL queries to machine code | 25
OmniSketch: Efficient Multi-Dimensional High-Velocity Stream Analytics with Arbitrary Predicates | 25
Efficient Regular Simple Path Queries under Transitive Restricted Expressions | 25
Hippo | 24
VeriBench: Analyzing the Performance of Database Systems with Verifiability | 24
REmatch: A Novel Regex Engine for Finding All Matches | 24
Transactional Panorama: A Conceptual Framework for User Perception in Analytical Visual Interfaces | 24
Data-Driven Insight Synthesis for Multi-Dimensional Data | 24
Triangular Stability Maximization by Influence Spread over Social Networks | 24
A Blockchain System for Clustered Federated Learning with Peer-to-Peer Knowledge Transfer | 24
Fast detection of denial constraint violations | 23
Pantheon | 23
Demonstration of panda | 23
Effective and Efficient Route Planning Using Historical Trajectories on Road Networks | 23
In-network support for transaction triaging | 23
Motiflets | 22
Generalized supervised meta-blocking | 22
Algorithm and system co-design for efficient subgraph-based graph representation learning | 22
GeCo | 21
Watermarks in stream processing systems | 21
DARLING | 21
RPT | 21
Effective indexing for dynamic structural graph clustering | 21
PARQO: Penalty-Aware Robust Plan Selection in Query Optimization | 20
DILI: A Distribution-Driven Learned Index | 20
Demonstrating ADOPT: Adaptively Optimizing Attribute Orders for Worst-Case Optimal Joins via Reinforcement Learning | 20
Mixed Covers of Keys and Functional Dependencies for Maintaining the Integrity of Data under Updates | 20
Chameleon: A Heterogeneous and Disaggregated Accelerator System for Retrieval-Augmented Language Models | 19
Neighborhood-Preserving Graph Sparsification | 19
Efficient Cost Modeling of Space-Filling Curves | 19
Don't be a tattle-tale | 19
Cardinality Estimation for Having-Clauses | 19
The end of Moore's law and the rise of the data processor | 19
QPJVis Demo: Quality-Boost Progressive Join Query Processing System | 18
HMAB | 18
GaussDB: A Cloud-Native Multi-Primary Database with Compute-Memory-Storage Disaggregation | 18
CatSQL: Towards Real World Natural Language to SQL Applications | 18
MDTP | 18
Towards communication-efficient vertical federated learning training via cache-enabled local updates | 18
Mach: Firefighting Time-Critical Issues in Complex Systems Using High-Frequency Telemetry | 17
TDSQL: Tencent Distributed Database System | 17
RetClean: Retrieval-Based Data Cleaning Using LLMs and Data Lakes | 17
Demonstration of the VeriEQL Equivalence Checker for Complex SQL Queries | 17
Differentially Private Stream Processing at Scale | 17
Spatial Query Optimization With Learning | 17
Transparent Migration from Datastore to Firestore | 16
OptScaler: A Collaborative Framework for Robust Autoscaling in the Cloud | 16
Achieving high throughput and elasticity in a larger-than-memory store | 16
LLM for Data Management | 16
Apache TsFile: An IoT-Native Time Series File Format | 16
ModsNet: Performance-Aware Top-k Model Search Using Exemplar Datasets | 16
Intelligent Agents for Data Exploration | 16
A Reproducible Tutorial on Reproducibility in Database Systems Research | 16
High-Performance Spatial Data Analytics: Systematic R&D for Scale-Out and Scale-Up Solutions from the Past to Now | 16
Spade: A Real-Time Fraud Detection Framework | 16
BigST: Linear Complexity Spatio-Temporal Graph Neural Network for Traffic Forecasting on Large-Scale Road Networks | 15
BP-Tree: Overcoming the Point-Range Operation Tradeoff for In-Memory B-Trees | 15
Managing ML pipelines | 15
Catalyst: Optimizing Cache Management for Large In-memory Key-value Systems | 15
WebArrayDB | 15
Efficient Framework for Operating on Data Sketches | 15
In-network leaderless replication for distributed data stores | 15
Finding locally densest subgraphs | 15
Performance-Based Pricing for Federated Learning via Auction | 15
Epistemic Parity: Reproducibility as an Evaluation Metric for Differential Privacy | 15
SpaceSaving ± | 15
Lindorm TSDB: A Cloud-Native Time-Series Database for Large-Scale Monitoring Systems | 15
Valentine in action | 15
gCore: Exploring Cross-Layer Cohesiveness in Multi-Layer Graphs | 15
AnyOLAP | 15
FedTSC | 15
The art of balance | 14
BASE: Bridging the Gap between Cost and Latency for Query Optimization | 14
Longshot: Indexing Growing Databases Using MPC and Differential Privacy | 14
Optimal Matrix Sketching over Sliding Windows | 14
Online Ridesharing with Meeting Points | 14
DET-LSH: A Locality-Sensitive Hashing Scheme with Dynamic Encoding Tree for Approximate Nearest Neighbor Search | 14
ZKSQL: Verifiable and Efficient Query Evaluation with Zero-Knowledge Proofs | 14
Solver-In-The-Loop Cluster Resource Management for Database-as-a-Service | 14
In-page shadowing and two-version timestamp ordering for mobile DBMSs | 14
Efficient Distributed Transaction Processing in Heterogeneous Networks | 14
Hazelcast jet | 14
Projection-compliant database generation | 14
BYO: A Unified Framework for Benchmarking Large-Scale Graph Containers | 14
Detecting Metadata-Related Logic Bugs in Database Systems via Raw Database Construction | 14
LANCET | 14
Quasi-Stable Coloring for Graph Compression | 14
Efficient Influence Minimization via Node Blocking | 14
Efficient secure and verifiable location-based skyline queries over encrypted data | 14
How divergent is your data? | 14
Learning to be a statistician | 13
Trajectory Similarity Measurement: An Efficiency Perspective | 13
A queueing-theoretic framework for vehicle dispatching in dynamic car-hailing | 13
Napa | 13
CGgraph: An Ultra-Fast Graph Processing System on Modern Commodity CPU-GPU Co-processor | 13
Efficient k-Clique Count Estimation with Accuracy Guarantee | 13
Efficient Maximal Frequent Group Enumeration in Temporal Bipartite Graphs | 13
Towards event prediction in temporal graphs | 13
Breathing New Life into an Old Tree: Resolving Logging Dilemma of B+-tree on Modern Computational Storage Drives | 13
Communication Efficient and Provable Federated Unlearning | 13
Neighborhood-Based Hypergraph Core Decomposition | 13
From BERT to GPT-3 codex | 13
Marigold: Efficient k-Means Clustering in High Dimensions | 13
FusionQuery: On-demand Fusion Queries over Multi-source Heterogeneous Data | 13
Kamino | 13
Hu-Fu | 13
Weakly Guided Adaptation for Robust Time Series Forecasting | 13
Viper | 13
UPLIFT | 13
COMET | 13
Out-of-Order Sliding-Window Aggregation with Efficient Bulk Evictions and Insertions | 13
Accelerating Similarity Search for Elastic Measures: A Study and New Generalization of Lower Bounding Distances | 13
Distributed Shortest Distance Labeling on Large-Scale Graphs | 13
Parallel training of knowledge graph embedding models | 13
Scabbard | 12
On Efficient Approximate Queries over Machine Learning Models | 12
APEX | 12
The FastLanes Compression Layout: Decoding > 100 Billion Integers per Second with Scalar Code | 12
Exathlon | 12
SAND in action | 12
Spectrum: Speedy and Strictly-Deterministic Smart Contract Transactions for Blockchain Ledgers | 12
SmartLite: A DBMS-Based Serving System for DNN Inference in Resource-Constrained Environments | 12
Cloud Analytics Benchmark | 12
TFB: Towards Comprehensive and Fair Benchmarking of Time Series Forecasting Methods | 12
ShadowAQP: Efficient Approximate Group-by and Join Query via Attribute-Oriented Sample Size Allocation and Data Generation | 12
FILM | 12
SDPipe: A Semi-Decentralized Framework for Heterogeneity-Aware Pipeline-parallel Training | 12
MLOS in Action: Bridging the Gap Between Experimentation and Auto-Tuning in the Cloud | 12
Accelerating large scale real-time GNN inference using channel pruning | 12
Are we ready for learned cardinality estimation? | 12
DyHealth | 12
LLM-PBE: Assessing Data Privacy in Large Language Models | 12
Hu-fu | 12
DatAgent | 12
Galvatron | 12
SAND | 12
Netherite | 12
G-tran | 12
Billion-Scale Bipartite Graph Embedding: A Global-Local Induced Approach | 12
FCBench: Cross-Domain Benchmarking of Lossless Compression for Floating-Point Data | 12
The Composable Data Management System Manifesto | 12
MagicScaler: Uncertainty-Aware, Predictive Autoscaling | 11
Exploiting the Power of Equality-Generating Dependencies in Ontological Reasoning | 11
Testing Graph Database Systems via Graph-Aware Metamorphic Relations | 11
CDI-E | 11
A Memory Guided Transformer for Time Series Forecasting | 11
Can Graph Reordering Speed Up Graph Neural Network Training? An Experimental Study | 11
Estimating Single-Node PageRank in Õ(min{d_t, √m} | 11
TOD | 11
Autonomously Computable Information Extraction | 11
Fanglue: An Interactive System for Decision Rule Crafting | 11
Interactive demonstration of SQLCheck | 11
Calibrating Noise for Group Privacy in Subsampled Mechanisms | 11
QueryArtisan: Generating Data Manipulation Codes for Ad-hoc Analysis in Data Lakes | 11
GEDet | 11
LIDER | 11
Unconstrained submodular maximization with modular costs | 11
FACE | 11
ADESIT | 11
Towards crowd-aware indoor path planning | 11
The Cost of Representation by Subset Repairs | 11
Towards scalable online machine learning collaborations with OpenML | 11
A scalable and generic approach to range joins | 11
Butterfly counting on uncertain bipartite graphs | 10