IEEE Computer Architecture Letters

Papers
(The median citation count of IEEE Computer Architecture Letters is 1. The table below lists the papers that meet or exceed that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-06-01 to 2025-06-01.)
Article | Citations
Learned Performance Model for SSD | 30
Speculative Multi-Level Access in LSM Tree-Based KV Store | 29
Accelerating Programmable Bootstrapping Targeting Contemporary GPU Microarchitecture | 17
Characterization and Analysis of Text-to-Image Diffusion Models | 13
Toward Practical 128-Bit General Purpose Microarchitectures | 12
A Characterization of Generative Recommendation Models: Study of Hierarchical Sequential Transduction Unit | 12
2021 Index IEEE Computer Architecture Letters Vol. 20 | 11
SCALES: SCALable and Area-Efficient Systolic Accelerator for Ternary Polynomial Multiplication | 11
Scale-Model Simulation | 11
Decoupled SSD: Reducing Data Movement on NAND-Based Flash SSD | 11
SoCurity: A Design Approach for Enhancing SoC Security | 10
Straw: A Stress-Aware WL-Based Read Reclaim Technique for High-Density NAND Flash-Based SSDs | 10
OASIS: Outlier-Aware KV Cache Clustering for Scaling LLM Inference in CXL Memory Systems | 10
Improving Energy-Efficiency of Capsule Networks on Modern GPUs | 10
RouteReplies: Alleviating Long Latency in Many-Chip-Module GPUs | 9
In-Memory Versioning (IMV) | 9
Reorder Buffer Contention: A Forward Speculative Interference Attack for Speculation Invariant Instructions | 8
Exploring the DIMM PIM Architecture for Accelerating Time Series Analysis | 8
A Flexible Embedding-Aware Near Memory Processing Architecture for Recommendation System | 8
A Case for In-Memory Random Scatter-Gather for Fast Graph Processing | 8
Exploiting Intel Advanced Matrix Extensions (AMX) for Large Language Model Inference | 7
QuArch: A Question-Answering Dataset for AI Agents in Computer Architecture | 7
Security Helper Chiplets: A New Paradigm for Secure Hardware Monitoring | 7
NoHammer: Preventing Row Hammer With Last-Level Cache Management | 6
Accelerating Deep Reinforcement Learning via Phase-Level Parallelism for Robotics Applications | 6
Mitigating Timing-Based NoC Side-Channel Attacks With LLC Remapping | 6
DeMM: A Decoupled Matrix Multiplication Engine Supporting Relaxed Structured Sparsity | 5
Memory-Centric MCM-GPU Architecture | 5
LADIO: Leakage-Aware Direct I/O for I/O-Intensive Workloads | 5
Understanding the Implication of Non-Volatile Memory for Large-Scale Graph Neural Network Training | 5
Managing Prefetchers With Deep Reinforcement Learning | 5
High-Performance Winograd Based Accelerator Architecture for Convolutional Neural Network | 5
Data-Aware Compression of Neural Networks | 5
SparseLeakyNets: Classification Prediction Attack Over Sparsity-Aware Embedded Neural Networks Using Timing Side-Channel Information | 5
Enhancing the Reach and Reliability of Quantum Annealers by Pruning Longer Chains | 4
Adaptive Web Browsing on Mobile Heterogeneous Multi-cores | 4
FPGA-Accelerated Data Preprocessing for Personalized Recommendation Systems | 4
Guard Cache: Creating Noisy Side-Channels | 4
PreGNN: Hardware Acceleration to Take Preprocessing Off the Critical Path in Graph Neural Networks | 4
ZoneBuffer: An Efficient Buffer Management Scheme for ZNS SSDs | 4
Primate: A Framework to Automatically Generate Soft Processors for Network Applications | 4
Characterization and Analysis of Deep Learning for 3D Point Cloud Analytics | 4
Chopping off the Tail: Bounded Non-Determinism for Real-Time Accelerators | 4
A Flexible Hybrid Interconnection Design for High-Performance and Energy-Efficient Chiplet-Based Systems | 4
Fast Performance Prediction for Efficient Distributed DNN Training | 4
SSE: Security Service Engines to Accelerate Enclave Performance in Secure Multicore Processors | 4
Architectural Implications of GNN Aggregation Programming Abstractions | 3
T-CAT: Dynamic Cache Allocation for Tiered Memory Systems With Memory Interleaving | 3
DRAM-CAM: General-Purpose Bit-Serial Exact Pattern Matching | 3
SEMS: Scalable Embedding Memory System for Accelerating Embedding-Based DNNs | 3
Exploring Volatile FPGAs Potential for Accelerating Energy-Harvesting IoT Applications | 3
Direct-Coding DNA With Multilevel Parallelism | 3
Reducing the Silicon Area Overhead of Counter-Based Rowhammer Mitigations | 3
Accelerators & Security: The Socket Approach | 3
A Quantum Computer Trusted Execution Environment | 3
Overcoming Memory Capacity Wall of GPUs With Heterogeneous Memory Stack | 3
HBM3 RAS: Enhancing Resilience at Scale | 2
A Case Study of a DRAM-NVM Hybrid Memory Allocator for Key-Value Stores | 2
Accelerating Page Migrations in Operating Systems With Intel DSA | 2
FPGA-Based AI Smart NICs for Scalable Distributed AI Training Systems | 2
PINSim: A Processing In- and Near-Sensor Simulator to Model Intelligent Vision Sensors | 2
Redundant Array of Independent Memory Devices | 2
Cost-Effective Extension of DRAM-PIM for Group-Wise LLM Quantization | 2
Halis: A Hardware-Software Co-designed Near-Cache Accelerator for Graph Pattern Mining | 2
Amethyst: Reducing Data Center Emissions With Dynamic Autotuning and VM Management | 2
Computational CXL-Memory Solution for Accelerating Memory-Intensive Applications | 2
FullPack: Full Vector Utilization for Sub-Byte Quantized Matrix-Vector Multiplication on General Purpose CPUs | 2
gem5-accel: A Pre-RTL Simulation Toolchain for Accelerator Architecture Validation | 2
A First-Order Model to Assess Computer Architecture Sustainability | 2
Characterization and Implementation of Radar System Applications on a Reconfigurable Dataflow Architecture | 2
Approximate Multiplier Design With LFSR-Based Stochastic Sequence Generators for Edge AI | 2
Characterization and Analysis of the 3D Gaussian Splatting Rendering Pipeline | 2
eDKM: An Efficient and Accurate Train-Time Weight Clustering for Large Language Models | 2
Enhancing DNN Training Efficiency Via Dynamic Asymmetric Architecture | 2
Analyzing and Exploiting Memory Hierarchy Parallelism With MLP Stacks | 2
IntervalSim++: Enhanced Interval Simulation for Unbalanced Processor Designs | 2
Energy-Efficient Bayesian Inference Using Bitstream Computing | 2
R.I.P. Geomean Speedup Use Equal-Work (Or Equal-Time) Harmonic Mean Speedup Instead | 2
Hungarian Qubit Assignment for Optimized Mapping of Quantum Circuits on Multi-Core Architectures | 2
Minimal Counters, Maximum Insight: Simplifying System Performance With HPC Clusters for Optimized Monitoring | 2
Unleashing the Potential of PIM: Accelerating Large Batched Inference of Transformer-Based Generative Models | 2
MQSim-E: An Enterprise SSD Simulator | 1
Characterizing and Understanding HGNNs on GPUs | 1
GPU-Centric Memory Tiering for LLM Serving With NVIDIA Grace Hopper Superchip | 1
An Intermediate Language for General Sparse Format Customization | 1
A Hardware-Friendly Tiled Singular-Value Decomposition-Based Matrix Multiplication for Transformer-Based Models | 1
Architectural Security Regulation | 1
A Case for Hardware Memoization in Server CPUs | 1
SPAM: Streamlined Prefetcher-Aware Multi-Threaded Cache Covert-Channel Attack | 1
Hashing ATD Tags for Low-Overhead Safe Contention Monitoring | 1
GraNDe: Near-Data Processing Architecture With Adaptive Matrix Mapping for Graph Convolutional Networks | 1
Characterizing and Understanding End-to-End Multi-Modal Neural Networks on GPUs | 1
Exploiting Intel AMX Power Gating | 1
Pyramid: Accelerating LLM Inference With Cross-Level Processing-in-Memory | 1
MajorK: Majority Based kmer Matching in Commodity DRAM | 1
MixDiT: Accelerating Image Diffusion Transformer Inference With Mixed-Precision MX Quantization | 1
LSim: Fine-Grained Simulation Framework for Large-Scale Performance Evaluation | 1
Canal: A Flexible Interconnect Generator for Coarse-Grained Reconfigurable Arrays | 1
Modeling Periodic Energy-Harvesting Computing Systems | 1
TeleVM: A Lightweight Virtual Machine for RISC-V Architecture | 1
Intelligent SSD Firmware for Zero-Overhead Journaling | 1
Structured Combinators for Efficient Graph Reduction | 1
Tulip: Turn-Free Low-Power Network-on-Chip | 1
A Data Prefetcher-Based 1000-Core RISC-V Processor for Efficient Processing of Graph Neural Networks | 1
Electra: Eliminating the Ineffectual Computations on Bitmap Compressed Matrices | 1
Characterizing and Understanding Distributed GNN Training on GPUs | 1
Approximate SFQ-based Computing Architecture Modeling with Device-level Guidelines | 1
A Pre-Silicon Approach to Discovering Microarchitectural Vulnerabilities in Security Critical Applications | 1
X-PPR: Post Package Repair for CXL Memory | 1
Balancing Performance Against Cost and Sustainability in Multi-Chip-Module GPUs | 1
Supporting a Virtual Vector Instruction Set on a Commercial Compute-in-SRAM Accelerator | 1
On Variable Strength Quantum ECC | 1
Address Scaling: Architectural Support for Fine-Grained Thread-Safe Metadata Management | 1
Exploiting Direct Memory Operands in GPU Instructions | 1