OOIR: Observatory of International Research

Papers

The median citation count of International Journal of High Performance Computing Applications is 2. The table below lists the papers at or above that threshold, based on CrossRef citation counts (maximum 250 papers). It covers papers published in the past four years, i.e., from 2022-06-01 to 2026-06-01.
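The selection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the site's actual procedure: the function name `select_papers`, the tuple layout, and the sample records are hypothetical, the median is assumed to be computed over papers inside the date window, and papers equal to the median are assumed to be kept (the table includes entries with 2 citations).

```python
from datetime import date
from statistics import median


def select_papers(papers, start, end, limit=250):
    """Return papers in [start, end] whose citation count is at or above the median.

    `papers` is a list of (title, citations, published) tuples. This layout is
    hypothetical and chosen only for the sketch.
    """
    # Keep only papers published inside the date window.
    in_window = [p for p in papers if start <= p[2] <= end]
    if not in_window:
        return []
    # Assumption: the threshold is the median citation count of those papers.
    threshold = median(p[1] for p in in_window)
    # Assumption: papers equal to the median are included.
    kept = [p for p in in_window if p[1] >= threshold]
    # Most-cited first, capped at the maximum table size.
    kept.sort(key=lambda p: p[1], reverse=True)
    return kept[:limit]


# Made-up records for illustration only; they are not taken from the table.
sample = [
    ("Paper A", 82, date(2023, 1, 10)),
    ("Paper B", 8, date(2024, 5, 2)),
    ("Paper C", 2, date(2025, 3, 15)),
    ("Paper D", 1, date(2025, 9, 1)),
    ("Paper E", 0, date(2026, 2, 20)),
    ("Paper F", 50, date(2021, 12, 31)),  # outside the window, so ignored
]

result = select_papers(sample, date(2022, 6, 1), date(2026, 6, 1))
```

With these made-up records, Paper F falls outside the window, the median of the remaining counts is 2, and Papers A, B, and C are returned in descending order of citations.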

Article	Citations
Dynamic spawning of MPI processes applied to malleability	82
Visualization at exascale: Making it all work with VTK-m	76
Compressed basis GMRES on high-performance graphics processing units	73
chipStar: Making HIP/CUDA applications cross-vendor portable by building on open standards	42
HPL-MxP benchmark: Mixed-precision algorithms, iterative refinement, and scalable data generation	38
HPC I/O innovations in the exascale era	24
Automatizing the creation of specialized high-performance computing containers	24
Running ahead of evolution—AI-based simulation for predicting future high-risk SARS-CoV-2 variants	18
Accelerating atmospheric physics parameterizations using graphics processing units	17
HDF5 in the exascale era: Delivering efficient and scalable parallel I/O for exascale applications	16
Refining HPCToolkit for application performance analysis at exascale	16
Orchestration of materials science workflows for heterogeneous resources at large scale	15
Julia versus C++ Kokkos for performance portable Cartesian CFD solvers on heterogeneous architectures	14
Scalable multilevel Monte Carlo methods exploiting parallel redistribution on coarse levels	14
Modeling, evaluating, and orchestrating heterogeneous environmental leverages for large-scale data center management	14
Direct numerical simulations for hybrid rocket boundary layers: Performance modeling and scaling	13
A tale of two codes: CUDA vs OpenACC for mass-zero constrained dynamics	12
Preparing MPICH for exascale	11
General framework for re-assuring numerical reliability in parallel Krylov solvers: A case of bi-conjugate gradient stabilized methods	11
Ginkgo - A math library designed to accelerate Exascale Computing Project science applications	11
Hypergraph-based locality-enhancing methods for graph operations in Big Data applications	10
GPU-based molecular dynamics of fluid flows: Reaching for turbulence	10
Architecture specific generation of large scale lattice Boltzmann methods for sparse complex geometries	9
Performance of explicit and IMEX MRI multirate methods on complex reactive flow problems within modern parallel adaptive structured grid frameworks	9
Accelerated dynamic data reduction using spatial and temporal properties	8
Special issue introduction	8
Integrating ytopt and libEnsemble to autotune OpenMC	8
A study on the performance of distributed training of data-driven CFD simulations	8
An elastic framework for ensemble-based large-scale data assimilation	8
Technology trends in computing hardware and their impacts on high-performance scientific computing Part II: Memory systems, interconnects, and system integration	8
Massively parallel nodal discontinous Galerkin finite element method simulator for room acoustics	8
Retraction Notice	8
Data-driven analysis to understand GPU hardware resource usage of optimizations	8
Cache blocking of distributed-memory parallel matrix power kernels	7
PeleC: An adaptive mesh refinement solver for compressible reacting flows	7
Special issue: Introduction	6
Preparing the TAU performance system for exascale and beyond	6
Data-driven scalable pipeline using national agent-based models for real-time pandemic response and decision support	6
Technology trends in computing hardware and their impacts on high-performance scientific computing Part I: General-purpose processors and hardware accelerators	6
Fast truncated SVD of sparse and dense matrices on graphics processors	6
Experiences with nested parallelism in task-parallel applications using malleable BLAS on multicore processors	6
Heterogeneous programming using OpenMP and CUDA/HIP for hybrid CPU-GPU scientific applications	6
TransGRU-X – A fusion Seq2Seq network enhanced with multiresolution analysis and gating for forecasting of AI/ML workloads in cloud environments	6
Fair-sharing simulator: Toward fair scheduling in batch computing systems	6
HOPPS: A performance portable spectral difference solver for high-fidelity computational fluid dynamics	5
Clacc: OpenACC for C/C++ in Clang	5
HPC-AI coupling methodology for scientific applications	5
NUMA-aware parallel sparse LU factorization for SPICE-based circuit simulators on ARM multi-core processors	5
Democratizing responsible artificial intelligence for innovation and impact	5
Semi-Lagrangian 4d, 5d, and 6d kinetic plasma simulation on large-scale GPU-equipped supercomputers	5
Sequence length scaling in vision transformers for scientific images on Frontier	5
Accelerating cluster dynamics simulation of fission gas behavior in nuclear fuel on deep computing unit–based heterogeneous architecture supercomputer	5
Understanding power and energy utilization in large scale production physics simulation codes	5
Asynchronous-many-task systems: Challenges and opportunities - Scaling an AMR astrophysics code on exascale machines using Kokkos and HPX	4
PoCL-R: An open standard based heterogeneous offloading layer with server side scalability	4
Feynman and computation: From Los Alamos to quantum computers	4
Advances in ArborX to support exascale applications	4
Bricks: A high-performance portability layer for computations on block-structured grids	4
Abisko: Deep codesign of an architecture for spiking neural networks using novel neuromorphic materials	4
P4IRS: An intermediate representation and compiler for parallel and performance-portable particle simulations	4
An integrated three-dimensional aeromechanical analysis for the prediction of stresses on modern coaxial rotors	3
PaRSEC: Scalability, flexibility, and hybrid architecture support for task-based applications in ECP	3
Guest editors note: Special issue on clusters, clouds, and data for scientific computing	3
Scalable cosmic AI inference using cloud serverless computing	3
Simulation-based machine learning for real-time assessment of side-branch hemodynamics in coronary bifurcation lesions	3
Performance evaluation of mixed-precision Runge–Kutta methods for the solution of partial differential equations	3
UMap: An application-oriented user level memory mapping library	3
Enhancing data locality of the conjugate gradient method for high-order matrix-free finite-element implementations	3
IO-aware Job-Scheduling: Exploiting the Impacts of Workload Characterizations to select the Mapping Strategy	3
Cache-optimized and low-overhead implementations of additive Schwarz methods for high-order FEM multigrid computations	3
Guest editor’s note: Special issue on system-level innovations for performance and fairness at scale: From interconnects to schedulers	3
An HPC benchmark survey and taxonomy for characterization	3
MAGMA: Enabling exascale performance with accelerated BLAS and LAPACK for diverse GPU architectures	3
The ECP ALPINE project: In situ and post hoc visualization infrastructure and analysis capabilities for exascale	3
Black-box statistical prediction of lossy compression ratios for scientific data	3
ECP libraries and tools: An overview	3
Fault-tolerant numerical iterative algorithms at scale	3
Exploiting mesh structure to improve multigrid performance for saddle-point problems	3
Efficient solution of batched band linear systems on GPUs	2
Corrigendum to large-scale direct numerical simulations of turbulence using GPUs and modern Fortran	2
Mixed precision LU factorization on GPU tensor cores: reducing data movement and memory footprint	2
End-to-end GPU acceleration of low-order-refined preconditioning for high-order finite element discretizations	2
Evolution of the SLATE linear algebra library	2
Deep learning foundation and pattern models: Challenges in hydrological time series	2
High-performance conjugate gradient benchmark: A comprehensive survey	2
An implicit barotropic mode solver for MPAS-Ocean using a modern Fortran solver interface	2
Role-shifting threads: Increasing OpenMP malleability to address load imbalance at MPI and OpenMP	2
Detecting interference between applications and improving the scheduling using malleable application clones	2
SWARM: Reimagining scientific workflow management systems in a distributed world	2
Integrating High Performance In-Memory Data Streaming and In-Situ Visualization in Hybrid MPI+OpenMP PIC MC Simulations Towards Exascale	2
Fixed-work versus fixed-time checkpointing on large-scale failure-prone platforms	2