International Journal of Parallel Programming

Papers
(The median citation count of the International Journal of Parallel Programming is 0. The table below lists the papers at or above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-04-01 to 2025-04-01.)
Article | Citations
pi-par: A Dependently-Typed Parallel Language with Algorithmic Skeletons | 36
Using Machine Learning Hardware to Solve Linear Partial Differential Equations with Finite Difference Methods | 13
SMSG: Profiling-Free Parallelism Modeling for Distributed Training of DNN | 13
Intelligent Page Migration on Heterogeneous Memory by Using Transformer | 8
A Comparative Survey of Big Data Computing and HPC: From a Parallel Programming Model to a Cluster Architecture | 8
High-Level Parallel Ant Colony Optimization with Algorithmic Skeletons | 7
Restoration of Legacy Parallelism: Transforming Pthreads into Farm and Pipeline Patterns | 6
Advancing Interactive Parallelization: iCetus | 6
Accelerating OCaml Programs on FPGA | 6
Automatic Heterogeneous Runtime Using Signal Processing Domain-Specific and Parallel Patterns | 6
Enhancing the Effectiveness of Inlining in Automatic Parallelization | 6
Guest Editorial: Special issue on Network and Parallel Computing for Emerging Architectures and Applications | 6
Interruptible Nodes: Reducing Queueing Costs in Irregular Streaming Dataflow Applications on Wide-SIMD Architectures | 5
CCRP: Converging Credit-Based and Reactive Protocols in Datacenters | 5
AMAIX In-Depth: A Generic Analytical Model for Deep Learning Accelerators | 5
Special Issue on SAMOS 2022 | 5
Location-based and Time-aware Service Recommendation in Mobile Edge Computing | 4
Statistical Analysis Based Intrusion Detection System for Ultra-High-Speed Software Defined Network | 4
Calculation of Distributed-Order Fractional Derivative on Tensor Cores-Enabled GPU | 3
Meerkat: A Framework for Dynamic Graph Algorithms on GPUs | 3
Design and Performance Evaluation of a Novel High-Speed Hardware Architecture for Keccak Crypto Coprocessor | 3
Fault-Tolerant and Unicast Performances of the Data Center Network HSDC | 3
Parallel Computation of Discrete Orthogonal Moment on Block Represented Images Using OpenMP | 3
Accelerating DES and AES Algorithms for a Heterogeneous Many-core Processor | 2
SGgraph: A Scalable GPU-Based Edge-Centric Graph Processing Framework | 2
Giraph-Based Distributed Algorithms for Coloring Large-Scale Graphs | 2
Distributed Calculations with Algorithmic Skeletons for Heterogeneous Computing Environments | 2
Thread and Data Mapping in Software Transactional Memory: an Overview | 2
Performance Characterization of Python Runtimes for Multi-device Task Parallel Programming | 2
Scaling the Maximum Flow Computation on GPUs | 2
Fast Parallel CPU-GPU Approximate Spectral Clustering for Transcriptomics Data | 2
FIPLib: An Image Processing Library for FPGAs Using High-Level Synthesis | 2
Orchestration Extensions for Interference- and Heterogeneity-Aware Placement for Data-Analytics | 1
Investigating Methods for ASPmT-Based Design Space Exploration in Evolutionary Product Design | 1
M-DRL: Deep Reinforcement Learning Based Coflow Traffic Scheduler with MLFQ Threshold Adaption | 1
DSParLib: A C++ Template Library for Distributed Stream Parallelism | 1
Fortress Abstractions in X10 Framework | 1
A Fault-Model-Relevant Classification of Consensus Mechanisms for MPI and HPC | 1
Accelerating Computation of Steiner Trees on GPUs | 1
A Methodology for Efficient Tile Size Selection for Affine Loop Kernels | 1
A Configurable Hardware Architecture for Runtime Application of Network Calculus | 1
Retraction Note: Designing a Framework for Communal Software: Based on the Assessment Using Relation Modelling | 1
A Quantitative Study of Locality in GPU Caches for Memory-Divergent Workloads | 0
Portable Node-Level Parallelism for the PGAS Model | 0
Generic Exact Combinatorial Search at HPC Scale | 0
Retraction Note: QoS and QoE Enhanced Resource Allocation for Wireless Video Sensor Networks Using Hybrid Optimization Algorithm | 0
Self-Adaptive Micro-Batching for Low-Latency GPU-Accelerated Stream Processing | 0
Parallelizing RNA-Seq Analysis with BioSkel: A FastFlow Based Prototype | 0
Guest Editorial: Special Issue on 2020 IEEE International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS 2020) | 0
ControlPULP: A RISC-V On-Chip Parallel Power Controller for Many-Core HPC Processors with FPGA-Based Hardware-In-The-Loop Power and Thermal Emulation | 0
Yet Another Lock-Free Atom Table Design for Scalable Symbol Management in Prolog | 0
CAPIO-CL: The CAPIO Coordination Language | 0
Fine-Grained Power Modeling of Multicore Processors Using FFNNs | 0
PragFormer: Data-Driven Parallel Source Code Classification with Transformers | 0
Guest Editor’s Note: High-Level Parallel Programming 2021 | 0
RMOWOA: A Revamped Multi-Objective Whale Optimization Algorithm for Maximizing the Lifetime of a Network in Wireless Sensor Networks | 0
K*-Means: An Efficient Clustering Algorithm with Adaptive Decision Boundaries | 0
Larger-Than-Memory Stateful Stream Processing with WindFlow | 0
A Profile-Based AI-Assisted Dynamic Scheduling Approach for Heterogeneous Architectures | 0
DyG-DPCD: A Distributed Parallel Community Detection Algorithm for Large-Scale Dynamic Graphs | 0
High-Level Programming of FPGA-Accelerated Systems with Parallel Patterns | 0
Optimizing Three-Dimensional Stencil-Operations on Heterogeneous Computing Environments | 0
GPU-Based Algorithms for Processing the k Nearest-Neighbor Query on Spatial Data Using Partitioning and Concurrent Kernel Execution | 0
Access Interval Prediction by Partial Matching for Tightly Coupled Memory Systems | 0
Erasure-Coded Hybrid Writes Based on Data Delta | 0
A Deterministic Portable Parallel Pseudo-Random Number Generator for Pattern-Based Programming of Heterogeneous Parallel Systems | 0
LSH SimilarityJoin Pattern in FastFlow | 0
Hardware-Aware Evolutionary Explainable Filter Pruning for Convolutional Neural Networks | 0
Assessing Application Efficiency and Performance Portability in Single-Source Programming for Heterogeneous Parallel Systems | 0
Parallelization of Swarm Intelligence Algorithms: Literature Review | 0
Automatic Discovery of Collective Communication Patterns in Parallelized Task Graphs | 0
A Parallel Skeleton for Divide-and-conquer Unbalanced and Deep Problems | 0
Celerity-RSim: Porting Light Propagation Simulation to Accelerator Clusters Using a High-Level API | 0
DeeperThings: Fully Distributed CNN Inference on Resource-Constrained Edge Devices | 0
Energy-Efficient Partial-Duplication Task Mapping Under Multiple DVFS Schemes | 0
Stencil Calculations with Algorithmic Skeletons for Heterogeneous Computing Environments | 0
An Improved/Optimized Practical Non-Blocking PageRank Algorithm for Massive Graphs* | 0
DRAMSys4.0: An Open-Source Simulation Framework for In-depth DRAM Analyses | 0
A Hybrid Machine Learning Model for Code Optimization | 0
Efficient High-Level Programming in Plain Java | 0
High Throughput Instruction-Data Level Parallelism Based Arithmetic Hardware Accelerator | 0
Declarative Data Flow in a Graph-Based Distributed Memory Runtime System | 0
SkePU 3: Portable High-Level Programming of Heterogeneous Systems and HPC Clusters | 0
Partitioning-Aware Performance Modeling of Distributed Graph Processing Tasks | 0
On Single-Valuedness in Textually Aligned SPMD Programs | 0
GraphTango: A Hybrid Representation Format for Efficient Streaming Graph Updates and Analysis | 0
Portable C++ Code that can Look and Feel Like Fortran Code with Yet Another Kernel Launcher (YAKL) | 0
The Celerity High-level API: C++20 for Accelerator Clusters | 0
A Scalable Similarity Join Algorithm Based on MapReduce and LSH | 0
Distributed-Memory FastFlow Building Blocks | 0
A Practical Approach for Employing Tensor Train Decomposition in Edge Devices | 0
Accelerating Massively Distributed Deep Learning Through Efficient Pseudo-Synchronous Update Method | 0