International Journal of Parallel Programming

Papers
(The median citation count of International Journal of Parallel Programming is 0. The table below lists the papers at or above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-06-01 to 2025-06-01.)
Article | Citations
Calculation of Distributed-Order Fractional Derivative on Tensor Cores-Enabled GPU | 9
Accelerating OCaml Programs on FPGA | 9
Special Issue on SAMOS 2022 | 8
Meerkat: A Framework for Dynamic Graph Algorithms on GPUs | 7
Guest Editorial: Special Issue on 2020 IEEE International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS 2020) | 6
Erasure-Coded Hybrid Writes Based on Data Delta | 6
Declarative Data Flow in a Graph-Based Distributed Memory Runtime System | 6
Scaling the Maximum Flow Computation on GPUs | 5
ControlPULP: A RISC-V On-Chip Parallel Power Controller for Many-Core HPC Processors with FPGA-Based Hardware-In-The-Loop Power and Thermal Emulation | 5
SGgraph: A Scalable GPU-Based Edge-Centric Graph Processing Framework | 4
Investigating Methods for ASPmT-Based Design Space Exploration in Evolutionary Product Design | 4
Portable C++ Code that can Look and Feel Like Fortran Code with Yet Another Kernel Launcher (YAKL) | 3
Statistical Analysis Based Intrusion Detection System for Ultra-High-Speed Software Defined Network | 3
A Practical Approach for Employing Tensor Train Decomposition in Edge Devices | 3
Using Machine Learning Hardware to Solve Linear Partial Differential Equations with Finite Difference Methods | 3
A Scalable Similarity Join Algorithm Based on MapReduce and LSH | 3
K*-Means: An Efficient Clustering Algorithm with Adaptive Decision Boundaries | 3
Automatic Heterogeneous Runtime Using Signal Processing Domain-Specific and Parallel Patterns | 2
Optimizing Three-Dimensional Stencil-Operations on Heterogeneous Computing Environments | 2
Design and Performance Evaluation of a Novel High-Speed Hardware Architecture for Keccak Crypto Coprocessor | 2
A Quantitative Study of Locality in GPU Caches for Memory-Divergent Workloads | 2
Self-Adaptive Micro-Batching for Low-Latency GPU-Accelerated Stream Processing | 2
Advancing Interactive Parallelization: iCetus | 2
RMOWOA: A Revamped Multi-Objective Whale Optimization Algorithm for Maximizing the Lifetime of a Network in Wireless Sensor Networks | 2
Generic Exact Combinatorial Search at HPC Scale | 2
High-Level Programming of FPGA-Accelerated Systems with Parallel Patterns | 1
A Fault-Model-Relevant Classification of Consensus Mechanisms for MPI and HPC | 1
Generating Sparse Matrices for Large-Scale Spectral Clustering on a Single GPU | 1
A Profile-Based AI-Assisted Dynamic Scheduling Approach for Heterogeneous Architectures | 1
Accelerating Computation of Steiner Trees on GPUs | 1
Giraph-Based Distributed Algorithms for Coloring Large-Scale Graphs | 1
The Celerity High-level API: C++20 for Accelerator Clusters | 1
CAPIO-CL: The CAPIO Coordination Language | 1
AMAIX In-Depth: A Generic Analytical Model for Deep Learning Accelerators | 1
SMSG: Profiling-Free Parallelism Modeling for Distributed Training of DNN | 1
Retraction Note: QoS and QoE Enhanced Resource Allocation for Wireless Video Sensor Networks Using Hybrid Optimization Algorithm | 1
Yet Another Lock-Free Atom Table Design for Scalable Symbol Management in Prolog | 1
Enhancing the Effectiveness of Inlining in Automatic Parallelization | 1
Larger-Than-Memory Stateful Stream Processing with WindFlow | 1
Fine-Grained Power Modeling of Multicore Processors Using FFNNs | 1
Thread and Data Mapping in Software Transactional Memory: an Overview | 1
A Methodology for Efficient Tile Size Selection for Affine Loop Kernels | 0
GPU-Based Algorithms for Processing the k Nearest-Neighbor Query on Spatial Data Using Partitioning and Concurrent Kernel Execution | 0
Portable Node-Level Parallelism for the PGAS Model | 0
Orchestration Extensions for Interference- and Heterogeneity-Aware Placement for Data-Analytics | 0
pi-par: A Dependently-Typed Parallel Language with Algorithmic Skeletons | 0
Guest Editorial: Special issue on Network and Parallel Computing for Emerging Architectures and Applications | 0
A Hybrid Machine Learning Model for Code Optimization | 0
Partitioning-Aware Performance Modeling of Distributed Graph Processing Tasks | 0
Parallelization of Swarm Intelligence Algorithms: Literature Review | 0
Distributed Calculations with Algorithmic Skeletons for Heterogeneous Computing Environments | 0
Automatic Discovery of Collective Communication Patterns in Parallelized Task Graphs | 0
An Improved/Optimized Practical Non-Blocking PageRank Algorithm for Massive Graphs* | 0
Split’n’Cover: ISO 26262 Hardware Safety Analysis with SystemC | 0
Fast Parallel CPU-GPU Approximate Spectral Clustering for Transcriptomics Data | 0
Fortress Abstractions in X10 Framework | 0
PragFormer: Data-Driven Parallel Source Code Classification with Transformers | 0
DyG-DPCD: A Distributed Parallel Community Detection Algorithm for Large-Scale Dynamic Graphs | 0
Correction: Split’n’Cover: ISO 26,262 Hardware Safety Analysis with SystemC | 0
Efficient High-Level Programming in Plain Java | 0
Assessing Application Efficiency and Performance Portability in Single-Source Programming for Heterogeneous Parallel Systems | 0
Stencil Calculations with Algorithmic Skeletons for Heterogeneous Computing Environments | 0
Hardware-Aware Evolutionary Explainable Filter Pruning for Convolutional Neural Networks | 0
Access Interval Prediction by Partial Matching for Tightly Coupled Memory Systems | 0
FIPLib: An Image Processing Library for FPGAs Using High-Level Synthesis | 0
Energy-Efficient Partial-Duplication Task Mapping Under Multiple DVFS Schemes | 0
Celerity-RSim: Porting Light Propagation Simulation to Accelerator Clusters Using a High-Level API | 0
Retraction Note: Designing a Framework for Communal Software: Based on the Assessment Using Relation Modelling | 0
Intelligent Page Migration on Heterogeneous Memory by Using Transformer | 0
Restoration of Legacy Parallelism: Transforming Pthreads into Farm and Pipeline Patterns | 0
Accelerating Massively Distributed Deep Learning Through Efficient Pseudo-Synchronous Update Method | 0
High Throughput Instruction-Data Level Parallelism Based Arithmetic Hardware Accelerator | 0
Parallelizing RNA-Seq Analysis with BioSkel: A FastFlow Based Prototype | 0
Performance Characterization of Python Runtimes for Multi-device Task Parallel Programming | 0
Interruptible Nodes: Reducing Queueing Costs in Irregular Streaming Dataflow Applications on Wide-SIMD Architectures | 0
Distributed-Memory FastFlow Building Blocks | 0
DRAMSys4.0: An Open-Source Simulation Framework for In-depth DRAM Analyses | 0
GraphTango: A Hybrid Representation Format for Efficient Streaming Graph Updates and Analysis | 0
Code Rejuvenation: From Vector Compiler Intrinsics to Portable Standardized SIMD | 0
Guest Editor’s Note: High-Level Parallel Programming 2021 | 0
A Deterministic Portable Parallel Pseudo-Random Number Generator for Pattern-Based Programming of Heterogeneous Parallel Systems | 0
DSParLib: A C++ Template Library for Distributed Stream Parallelism | 0
LSH SimilarityJoin Pattern in FastFlow | 0