IEEE Computer Architecture Letters

Papers
(The TQCC of IEEE Computer Architecture Letters is 3. The table below lists the papers above that threshold based on CrossRef citation counts [max. 250 papers], covering publications from the past four years, i.e., from 2020-05-01 to 2024-05-01.)
Article | Citations
DRAMsim3: A Cycle-Accurate, Thermal-Capable DRAM Simulator | 103
SmartSSD: FPGA Accelerated Near-Storage Data Analytics on SSD | 40
RAMBO: Resource Allocation for Microservices Using Bayesian Optimization | 29
GPU-NEST: Characterizing Energy Efficiency of Multi-GPU Inference Servers | 28
pPIM: A Programmable Processor-in-Memory Architecture With Precision-Scaling for Deep Learning | 24
The Entangling Instruction Prefetcher | 17
Lightweight Hardware Implementation of Binary Ring-LWE PQC Accelerator | 17
MultiPIM: A Detailed and Configurable Multi-Stack Processing-In-Memory Simulator | 15
Rebasing Instruction Prefetching: An Industry Perspective | 13
A Cross-Stack Approach Towards Defending Against Cryptojacking | 13
Flexion: A Quantitative Metric for Flexibility in DNN Accelerators | 12
HBM3 RAS: Enhancing Resilience at Scale | 11
Cryogenic PIM: Challenges & Opportunities | 9
Heterogeneity-Aware Scheduling on SoCs for Autonomous Vehicles | 8
STONNE: Enabling Cycle-Level Microarchitectural Simulation for DNN Inference Accelerators | 8
TRiM: Tensor Reduction in Memory | 8
Reorder Buffer Contention: A Forward Speculative Interference Attack for Speculation Invariant Instructions | 8
Characterizing and Understanding End-to-End Multi-Modal Neural Networks on GPUs | 7
A Day In the Life of a Quantum Error | 7
BTB-X: A Storage-Effective BTB Organization | 7
FPGA-Based AI Smart NICs for Scalable Distributed AI Training Systems | 6
Harnessing Pairwise-Correlating Data Prefetching With Runahead Metadata | 6
DRAM-CAM: General-Purpose Bit-Serial Exact Pattern Matching | 6
Understanding the Implication of Non-Volatile Memory for Large-Scale Graph Neural Network Training | 6
Accelerating Concurrent Priority Scheduling Using Adaptive in-Hardware Task Distribution in Multicores | 6
MCsim: An Extensible DRAM Memory Controller Simulator | 6
A Lightweight Memory Access Pattern Obfuscation Framework for NVM | 5
Dagger: Towards Efficient RPCs in Cloud Microservices With Near-Memory Reconfigurable NICs | 5
Deep Partitioned Training From Near-Storage Computing to DNN Accelerators | 5
Computational CXL-Memory Solution for Accelerating Memory-Intensive Applications | 5
Instruction Criticality Based Energy-Efficient Hardware Data Prefetching | 5
LT-PIM: An LUT-Based Processing-in-DRAM Architecture With RowHammer Self-Tracking | 5
GraNDe: Near-Data Processing Architecture With Adaptive Matrix Mapping for Graph Convolutional Networks | 5
A First-Order Model to Assess Computer Architecture Sustainability | 4
WPC: Whole-Picture Workload Characterization Across Intermediate Representation, ISA, and Microarchitecture | 4
Zero-Copying I/O Stack for Low-Latency SSDs | 4
Hardware Acceleration for GCNs via Bidirectional Fusion | 4
DAM: Deadblock Aware Migration Techniques for STT-RAM-Based Hybrid Caches | 4
Adaptive Web Browsing on Mobile Heterogeneous Multi-cores | 4
Characterizing and Understanding HGNNs on GPUs | 4
Managing Prefetchers With Deep Reinforcement Learning | 4
Row-Streaming Dataflow Using a Chaining Buffer and Systolic Array+ Structure | 4
Dynamic Optimization of On-Chip Memories for HLS Targeting Many-Accelerator Platforms | 4
Decoupled SSD: Reducing Data Movement on NAND-Based Flash SSD | 4
Characterization and Implementation of Radar System Applications on a Reconfigurable Dataflow Architecture | 3
PIM-GraphSCC: PIM-Based Graph Processing Using Graph’s Community Structures | 3
Infinity Stream: Enabling Transparent and Automated In-Memory Computing | 3
Unleashing the Potential of PIM: Accelerating Large Batched Inference of Transformer-Based Generative Models | 3
Near-Data Processing in Memory Expander for DNN Acceleration on GPUs | 3
OpenMDS: An Open-Source Shell Generation Framework for High-Performance Design on Xilinx Multi-Die FPGAs | 3
Characterizing and Understanding Distributed GNN Training on GPUs | 3
Last-Level Cache Insertion and Promotion Policy in the Presence of Aggressive Prefetching | 3
Hungarian Qubit Assignment for Optimized Mapping of Quantum Circuits on Multi-Core Architectures | 3
Making a Better Use of Caches for GCN Accelerators with Feature Slicing and Automatic Tile Morphing | 3
Data-Aware Compression of Neural Networks | 3
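As a rough illustration of the thresholding described in the note above, the sketch below filters a list of (title, citations) pairs against a top-quartile citation count. It assumes TQCC denotes the 75th-percentile citation count over the journal's papers in the window (the exact definition is not spelled out in this listing), and the helper names tqcc and papers_above_threshold are purely illustrative.

```python
# Minimal sketch, assuming TQCC = top-quartile (75th-percentile) citation count.
# The listing above includes papers cited exactly 3 times, so papers at the
# threshold are kept here as well (>=). Illustrative only, not OOIR's code.
import math


def tqcc(citation_counts):
    """Return the upper-quartile citation count (nearest-rank method)."""
    ordered = sorted(citation_counts)
    idx = math.ceil(0.75 * len(ordered)) - 1
    return ordered[idx]


def papers_above_threshold(papers, max_rows=250):
    """papers: list of (title, citations); keep rows at or above the TQCC."""
    threshold = tqcc([cites for _, cites in papers])
    kept = [(title, cites) for title, cites in papers if cites >= threshold]
    kept.sort(key=lambda row: row[1], reverse=True)
    return kept[:max_rows], threshold


if __name__ == "__main__":
    sample = [("Paper A", 103), ("Paper B", 8), ("Paper C", 3), ("Paper D", 1)]
    rows, threshold = papers_above_threshold(sample)
    print(f"TQCC threshold: {threshold}")
    for title, cites in rows:
        print(f"{title} | {cites}")
```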