OOIR: Observatory of International Research

Papers

(The median citation count of Machine Vision and Applications is 2. The table below lists the papers at or above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2022-01-01 to 2026-01-01.)

Article	Citations
A method for high dynamic range 3D color modeling of objects through a color camera	70
Class-aware cross-domain target detection based on cityscape in fog	57
Real estate pricing prediction via textual and visual features	53
Development of a robust cascaded architecture for intelligent robot grasping using limited labelled data	43
Text-driven object affordance for guiding grasp-type recognition in multimodal robot teaching	41
ECM: arbitrary style transfer via Enhanced-Channel Module	38
DMU-Net: a dual stream multi-scale U-Net for image splicing forgery localization	37
StyleDemorpher: high-quality face demorphing via StyleGAN2’s latent space	34
Triple attention and global reasoning Siamese networks for visual tracking	32
A hybrid overlapping group sparsity denoising model with fractional-order total variation and non-convex regularizer	28
Non-contact SpO2 monitoring via multi-channel pulse signals from facial videos using machine learning	21
End-to-end unsupervised learning of latent-space clustering for image segmentation via fully dense-UNet and fuzzy C-means loss	20
Medtransnet: advanced gating transformer network for medical image classification	20
Obs-tackle: an obstacle detection system to assist navigation of visually impaired using smartphones	19
Using breast density for hybrid region and pixel-level loss function	19
Global-guided cross-reference network for co-salient object detection	18
Motion-region annotation for complex videos via label propagation across occluders	17
Enforced clustering for zero-to-one-shot texture anomaly detection	17
MSPKD: multi spatial projectors for knowledge distillation in semantic segmentation	16
A stereo vision SLAM with moving vehicles tracking in outdoor environment	16
Editing implicit and explicit representations of radiance fields: a survey	16
LOID: Lane Occlusion Inpainting and Detection for Enhanced Autonomous Driving Systems	16
Innovative surface roughness detection method based on white light interference images	15
Specular Surface Detection with Deep Static Specular Flow and Highlight	14
L-VAE: variational auto-encoder with learnable beta for disentangled representation	14
Ubiquitous vision of transformers for person re-identification	14
A motion direction detecting model for colored images based on the Hassenstein–Reichardt model	14
Alternate guidance network for boundary-aware camouflaged object detection	13
Generalized few-shot learning under large scope by using episode-wise regularizing imprinting	12
Axes-aligned non-linear optimized PnP algorithm	12
Correction: Unsupervised single-shot depth estimation using perceptual reconstruction	12
Generation of realistic synthetic cable images to train deep learning segmentation models	12
Modeling driving task-relevant attention for intelligent vehicles using triplet ranking	12
Discriminant distance template matching for image recognition	12
AFC-Net: adjacent feature complementary for crowded pedestrian detection	11
RPIM-net: residual channel prior-driven interaction multi-scale network for stereo image deraining	11
Real-World super-resolution under the guidance of optimal transport	11
Kernel based local matching network for video object segmentation	11
A multi-modal framework for continuous and isolated hand gesture recognition utilizing movement epenthesis detection	11
Novel Cauchy mixture modeling combined with the Sparse-RCNN architecture for enhanced multi-person pose estimation	11
CGA-Net: channel-wise gated attention network for improved super-resolution in remote sensing imagery	11
Traversing the subspace of adversarial patches	10
Improving knowledge distillation via pseudo-multi-teacher network	10
A dual progressive strategy for long-tailed visual recognition	10
LDNet: low-light image enhancement with joint lighting and denoising	10
3D face parsing based on 2D CPFNet: conformal parameterized face parsing network	10
Two-stage structural information enhancement for source-free domain adaptation	10
Enhanced hyperspectral image reconstruction via parallel 2D/3D convolution with global layer purification and multiscale pooling fusion	10
Twinned attention network for occlusion-aware facial expression recognition	9
SGL-SLAM: a semantic and geometric RGB-D visual SLAM enhanced with line features for dynamic environments	9
Shape related unknown object one-shot learning grasping	9
Generating comprehensive scene graphs with integrated multiple attribute detection	9
Camera-based mapping in search-and-rescue via flying and ground robot teams	9
Benchmarking large and small MLLMs	9
Redundancy-free label space and dual-feature collaboration for multi-label feature selection	9
Shape description losses for medical image segmentation	9
CAMTrack: a combined appearance-motion method for multiple-object tracking	9
Online continual learning with saliency-guided experience replay using tiny episodic memory	9
Adversarial imitation learning-based network for category-level 6D object pose estimation	9
Correction: Real estate pricing prediction via textual and visual features	9
Cross-validation of a semantic segmentation network for natural history collection specimens	8
MÆIDM: multi-scale anomaly embedding inpainting and discrimination for surface anomaly detection	8
Chfnet: a coarse-to-fine hierarchical refinement model for monocular depth estimation	8
Fusing bilinear multi-channel gated vector for fine-grained classification	8
Audio-visual localization based on spatial relative sound order	8
Thin section analysis for ceramic petrography using motion analysis and segmentation techniques	8
EAF-Net: an enhancement and aggregation–feedback network for RGB-T salient object detection	8
IoU-aware feature fusion R-CNN for dense object detection	8
Explainable interactive projections of images	8
X-Align++: cross-modal cross-view alignment for Bird’s-eye-view segmentation	8
GOA-net: generic occlusion aware networks for visual tracking	8
A camera style-invariant learning and channel interaction enhancement fusion network for visible-infrared person re-identification	8
Pakistan sign language recognition: leveraging deep learning models with limited dataset	8
OmniGlasses: an optical aid for stereo vision CNNs to enable omnidirectional image processing	8
Multi-scale convolution underwater image restoration network	8
An anisotropic non-local attention network for image segmentation	7
YG-SLAM: dynamic environment-based geometric constraint point-line fusion visual SLAM system	7
DisRot: boosting the generalization capability of few-shot learning via knowledge distillation and self-supervised learning	7
Identification of facial skin diseases from face phenotypes using FSDNet in uncontrolled environment	7
Clarity method of fog and dust image in fully mechanized mining face	7
Automatic cables segmentation from a substation device based on 3D point cloud	7
Integrating visual-semantic relational reasoning for fake news detection on video platforms	7
Pattern recognition methodologies for pollen grain image classification: a survey	7
Evolution algorithm of parametric active contour model based on Gaussian smoothing filter	7
A comprehensive survey on SLAM and machine learning approaches for indoor autonomous navigation of mobile robots	7
Real-time pedestrian pose estimation, tracking and localization for social distancing	7
Attention-based global context network for driving maneuvers prediction	7
An adaptive interpolation and 3D reconstruction algorithm for underwater images	7
An efficient ground segmentation approach for LiDAR point cloud utilizing adjacent grids	7
Welding splash and arc noise reduction imaging model based on computationally efficient pairwise response serving welding process library	6
Boosting few-shot learning via selective patch embedding by comprehensive sample analysis	6
Quality assessment of synthetic images via spatial distortion recognition	6
VGT-MOT: visibility-guided tracking for online multiple-object tracking	6
Enhanced normal estimation of point clouds via fine-grained geometric information learning	6
A novel multi-feature fusion deep neural network using HOG and VGG-Face for facial expression classification	6
Semi-supervised metric learning incorporating weighted triplet constraint and Riemannian manifold optimization for classification	6
Parametric loss-based super-resolution for scene text recognition	6
Mobgazenet: robust gaze estimation mobile network based on progressive attention mechanisms	6
Actions as points: a simple and efficient detector for skeleton-based temporal action detection	6
PGA6D: 6D pose estimation for grasping and assemblying based on keypoints voting	6
Block-recurrent visual transformer for enhanced human detection in thermal imaging	6
Tree-managed network ensembles for video prediction	6
Environmental factors-aware two-stream GCN for skeleton-based behavior recognition	6
Kinematic calibration of a hexapod robot based on monocular vision	6
Visual-inertial SLAM with line segment merging and efficient feature tracking method	6
Tensor-guided learning for image denoising using anisotropic PDEs	6
ConsInstancy: learning instance representations for semi-supervised panoptic segmentation of concrete aggregate particles	6
Improving change detection using conditional discriminative adversarial regularization	6
Robust semantic segmentation method of urban scenes in snowy environment	6
TFF-temporal fusion framework for advancing video retrieval through long-range dependencies and multi-modal intent	6
Multi-view dynamic reconstruction with cross-view smoothing based on surfel	6
Distortion diminishing with vulnerability filters pruning	6
A dual-path U-Net for pulmonary vessel segmentation method based on lightweight 3D attention	6
Meta-learning enhanced global–local feature fusion for image quality assessment	6
Delaunay walk for fast nearest neighbor: accelerating correspondence matching for ICP	5
A collaborative SLAM method for dual payload-carrying UAVs in denied environments	5
PTDS CenterTrack: pedestrian tracking in dense scenes with re-identification and feature enhancement	5
Local region-learning modules for point cloud classification	5
YOLOMH: you only look once for multi-task driving perception with high efficiency	5
FLAVR: flow-free architecture for fast video frame interpolation	5
Fine-grained 3D vehicle shape manipulation via latent space editing	5
React: recognize every action everywhere all at once	5
Regional filtering distillation for object detection	5
A review of adaptable conventional image processing pipelines and deep learning on limited datasets	5
Residual shuffle attention network for image super-resolution	5
Multiple object tracking using weighted graph convolutional neural networks	5
Cascaded attention-guided multi-granularity feature learning for person re-identification	5
Personvit: large-scale self-supervised vision transformer for person re-identification	5
Toward phytoplankton parasite detection using autoencoders	5
BiTransformer: augmenting semantic context in video captioning via bidirectional decoder	5
Beyond Kalman filters: deep learning-based filters for improved object tracking	5
Guest editorial: special issue on human pose estimation and its applications	5
Text-to-face synthesis based on facial landmarks prediction	5
Logit scaling for out-of-distribution detection	5
Accelerated fixed-point iterations for image deblurring and defiltering	5
Unsupervised single-shot depth estimation using perceptual reconstruction	5
Human pose estimation based on lightweight basicblock	5
Naturally constrained reject option classification	5
Swin transformer with part-level tokenization for occluded person re-identification	5
Self-attention network for few-shot learning based on nearest-neighbor algorithm	4
Removing cloud shadows from ground-based solar imagery	4
SiamCAR-Kal: anti-occlusion tracking algorithm for infrared ground targets based on SiamCAR and Kalman filter	4
Hierarchical contrastive adaptation for cross-domain object detection	4
The general framework for few-shot learning by kernel HyperNetworks	4
A deep Retinex network for underwater low-light image enhancement	4
Pixel representations, sampling, and label correction for semantic part detection	4
Multimodal dance style transfer	4
Gait recognition using free-area transformer networks	4
Real-time 3D reconstruction using point-dependent pose graph optimization framework	4
An image quality assessment method based on edge extraction and singular value for blurriness	4
CCTV-Calib: a toolbox to calibrate surveillance cameras around the globe	4
Structure–texture decomposition-based dehazing of a single image with large sky area	4
Residual feature learning with hierarchical calibration for gaze estimation	4
A robust vehicle tracking in low-altitude UAV videos	4
Ipdm: identity preserving diffusion model for face sketch and photo synthesis	4
ViCap-AD: video caption-based weakly supervised video anomaly detection	4
MVUDA: Unsupervised Domain Adaptation for Multi-view Pedestrian Detection	4
Vision-based power line cables and pylons detection for low flying aircraft	4
Carixray: a periapical X-ray dataset for machine vision-based dental caries recognition	4
Symmetry-induced ambiguity in orientation estimation from RGB images	4
Superpixel-based foreground-preserving image stitching	4
Optimized hand pose estimation CrossInfoNet-based architecture for embedded devices	4
Mitigating adversarial perturbations via weakly supervised object location and regions recombination	4
Spatial-temporal graph-guided global attention network for video-based person re-identification	4
Normalized margin loss for action unit detection	4
Amp: single-shot ultra-wide fisheye-to-cubemap PnP pose estimation	4
A zero-shot anomaly detection method based on learnable text query	4
Supervised contrastive learning with multi-scale interaction and integrity learning for salient object detection	4
Dynamically throttleable neural networks	3
Dynamic focused prototypes distillation for few-shot object detection	3
Wide-baseline multi-camera calibration from a room filled with people	3
CTL-DETR: a landslide detection algorithm for complex terrains	3
Multi-person 3D pose estimation from unlabelled data	3
CMNet: a novel model and design rationale based on comparison studies and synergy of CNN and MetaFormer	3
RCA-IUnet: a residual cross-spatial attention-guided inception U-Net model for tumor segmentation in breast ultrasound imaging	3
Unsupervised domain adaptation by cross-domain consistency learning for CT body composition	3
Bidirectional cascaded multimodal attention for multiple choice visual question answering	3
Contextual Guided Segmentation Framework for Semi-supervised Video Instance Segmentation	3
Addressing the generalization of 3D registration methods with a featureless baseline and an unbiased benchmark	3
Efficient abnormality detection using patch-based 3D convolution with recurrent model	3
Investigating long-term training for remote sensing object detection	3
Multimodal fine-grained grocery product recognition using image and OCR text	3
Virtual home staging and relighting from a single panorama under natural illumination	3
Pixel-wise confidence estimation for segmentation in Bayesian Convolutional Neural Networks	3
SGBGAN: minority class image generation for class-imbalanced datasets	3
Utilizing incremental branches on a one-stage object detection framework to avoid catastrophic forgetting	3
SNFR: salient neighbor decoding and text feature refining for scene text recognition	3
Enhanced keypoint information and pose-weighted re-ID features for multi-person pose estimation and tracking	3
Ising granularity image analysis on VAE–GAN	3
Online camera auto-calibration appliable to road surveillance	3
Human–object interaction detection based on disentangled axial attention transformer	3
Entangled appearance and motion structures network for multi-object tracking and segmentation	3
Interpretable visual transmission lines inspections using pseudo-prototypical part network	3
An Efficient point-in-convex 3D polyhedron test using a projective algorithm with sub-linear expected complexity	3
Foreground enhancement network for object detection in sonar images	3
Exploring the potential of deep learning techniques for analyzing athlete movements in competitive athletics sports	3
Zero-shot action recognition by clustered representation with redundancy-free features	3
Material classification of polishing and convex surface objects based on photon accumulation point spread function (PAPSF) from imaging model of binocular pulsed time-of-flight camera	3
FERGCN: facial expression recognition based on graph convolution network	3
MYFED: a dataset of affective face videos for investigation of emotional facial dynamics as a soft biometric for person identification	3
Knowledge-based hybrid connectionist models for morphologic reasoning	3
A lightweight and generalizable detection enhancement method using segmentation feedback	3
FESAR: SAR ship detection model based on local spatial relationship capture and fused convolutional enhancement	3
Ssman: self-supervised masked adaptive network for 3D human pose estimation	3
FDT − Dr2T: a unified Dense Radiology Report Generation Transformer framework for X-ray images	3
Generating quality grasp rectangle using Pix2Pix GAN for intelligent robot grasping	3
Exploring filter placement in convolutional layer topologies based on ResNet for image classification	3
Consensus similarity learning based on tensor nuclear norm	3
Trusted 3D self-supervised representation learning with cross-modal settings	2
PM-MVS: PatchMatch multi-view stereo	2
Rocnet: 3D robust registration of points clouds using deep learning	2
High-precision calibration of wide-angle fisheye lens with radial distortion projection ellipse constraint (RDPEC)	2
Interpretability of fingerprint presentation attack detection systems: a look at the “representativeness” of samples against never-seen-before attacks	2
Editor’s Note: Special Issue on Advances in Visual Computing	2
Synergizing LiDAR and Augmented Reality for precise real-time interior distance measurements for mobile devices	2
Multi-scene low-light remote physiological measurement database	2
GMC_FM: a grid and multi-density-based method for matching ancient Chinese architectural images	2
Pose is all you need: the pose only group activity recognition system (POGARS)	2
Dyna-MSDepth: multi-scale self-supervised monocular depth estimation network for visual SLAM in dynamic scenes	2
Potential escalator-related injury identification and prevention based on multi-module integrated system for public health	2
Biomimetic oculomotor control with spiking neural networks	2
Unsupervised learning of probabilistic subspaces for multi-spectral and multi-temporal image-based disaster mapping	2
Semantic scene upgrades for trajectory prediction	2
Continuous sign language recognition based on motor attention mechanism and frame-level self-distillation	2
Improving visual odometry pipeline with feedback from forward and backward motion estimates	2
Similarity contrastive estimation for image and video soft contrastive self-supervised learning	2
Calibrating uncertainties in human trajectory forecasting	2
Region gradient-guided diffusion model for underwater image enhancement	2
Self-supervised monocular depth estimation via joint attention and intelligent mask loss	2
Beyond a strong baseline: cross-modality contrastive learning for visible-infrared person re-identification	2
Deep transfer learning algorithms applied to synthetic drawing images as a tool for supporting Alzheimer’s disease prediction	2
Multi-view damage inspection using single-view damage projection	2
That’s BAD: blind anomaly detection by implicit local feature clustering	2
Optimize multiscale feature hybrid-net deep learning approach used for automatic pancreas image segmentation	2
Object Recognition Consistency in Regression for Active Detection	2
Multi-level receptive field feature reuse for multi-focus image fusion	2
Active perception based on deep reinforcement learning for autonomous robotic damage inspection	2
Cmf-transformer: cross-modal fusion transformer for human action recognition	2
A multi-target physiological signal detection method for UWB radar based on Kalman tracking and dual-branch network	2
Discriminative feature learning through feature distance loss	2
Representing dynamic textures based on polarized gradient features	2
Multi-core token mixer: a novel approach for underwater image enhancement	2
ICE-GCN: An interactional channel excitation-enhanced graph convolutional network for skeleton-based action recognition	2
Correction to: Self-attention network for few-shot learning based on nearest-neighbor algorithm	2
Fast re-OBJ: real-time object re-identification in rigid scenes	2
Ellipse detection using the edges extracted by deep learning	2
From explanation to unsupervised segmentation: fusion of multiple explanation maps for vision transformers	2
Visible-infrared person re-identification model based on feature consistency and modal indistinguishability	2
Position Puzzle Network and Augmentation: localizing human keypoints beyond the bounding box	2
High-efficiency automated triaxial robot grasping system for motor rotors using 3D structured light sensor	2