Machine Vision and Applications

Papers
(The median citation count of Machine Vision and Applications is 2. The table below lists the papers at or above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2022-05-01 to 2026-05-01.)
Article | Citations
A method for high dynamic range 3D color modeling of objects through a color camera | 71
Class-aware cross-domain target detection based on cityscape in fog | 68
DMU-Net: a dual stream multi-scale U-Net for image splicing forgery localization | 60
StyleDemorpher: high-quality face demorphing via StyleGAN2’s latent space | 46
Triple attention and global reasoning Siamese networks for visual tracking | 40
Text-driven object affordance for guiding grasp-type recognition in multimodal robot teaching | 35
Development of a robust cascaded architecture for intelligent robot grasping using limited labelled data | 22
Non-contact SpO2 monitoring via multi-channel pulse signals from facial videos using machine learning | 22
A hybrid overlapping group sparsity denoising model with fractional-order total variation and non-convex regularizer | 21
Medtransnet: advanced gating transformer network for medical image classification | 20
ECM: arbitrary style transfer via Enhanced-Channel Module | 19
Obs-tackle: an obstacle detection system to assist navigation of visually impaired using smartphones | 19
Real estate pricing prediction via textual and visual features | 19
End-to-end unsupervised learning of latent-space clustering for image segmentation via fully dense-UNet and fuzzy C-means loss | 19
Using breast density for hybrid region and pixel-level loss function | 18
Innovative surface roughness detection method based on white light interference images | 18
An integration of deep network with random forests framework for image quality assessment in real-time | 18
Global-guided cross-reference network for co-salient object detection | 16
Enhancing hyperspectral image classification: DeepXTE for efficient semantic feature extraction | 16
MSPKD: multi spatial projectors for knowledge distillation in semantic segmentation | 15
Enforced clustering for zero-to-one-shot texture anomaly detection | 15
Motion-region annotation for complex videos via label propagation across occluders | 15
A motion direction detecting model for colored images based on the Hassenstein–Reichardt model | 14
A stereo vision SLAM with moving vehicles tracking in outdoor environment | 14
Editing implicit and explicit representations of radiance fields: a survey | 14
LOID: Lane Occlusion Inpainting and Detection for Enhanced Autonomous Driving Systems | 14
Modeling driving task-relevant attention for intelligent vehicles using triplet ranking | 13
Axes-aligned non-linear optimized PnP algorithm | 13
Specular Surface Detection with Deep Static Specular Flow and Highlight | 13
Generalized few-shot learning under large scope by using episode-wise regularizing imprinting | 13
CGA-Net: channel-wise gated attention network for improved super-resolution in remote sensing imagery | 12
AFC-Net: adjacent feature complementary for crowded pedestrian detection | 12
A multi-modal framework for continuous and isolated hand gesture recognition utilizing movement epenthesis detection | 12
Discriminant distance template matching for image recognition | 12
Ubiquitous vision of transformers for person re-identification | 12
Correction: Unsupervised single-shot depth estimation using perceptual reconstruction | 12
Generation of realistic synthetic cable images to train deep learning segmentation models | 11
Novel Cauchy mixture modeling combined with the Sparse-RCNN architecture for enhanced multi-person pose estimation | 11
Alternate guidance network for boundary-aware camouflaged object detection | 11
Kernel based local matching network for video object segmentation | 11
Multi-feature fusion network based on wavelet transform and multi-scale cross-response for hyperspectral image classification | 11
L-VAE: variational auto-encoder with learnable beta for disentangled representation | 11
Two-stage structural information enhancement for source-free domain adaptation | 10
Enhanced hyperspectral image reconstruction via parallel 2D/3D convolution with global layer purification and multiscale pooling fusion | 10
RPIM-net: residual channel prior-driven interaction multi-scale network for stereo image deraining | 10
Redundancy-free label space and dual-feature collaboration for multi-label feature selection | 10
Traversing the subspace of adversarial patches | 10
Improving knowledge distillation via pseudo-multi-teacher network | 10
LDNet: low-light image enhancement with joint lighting and denoising | 10
A general two-stage framework of tensor low-rank representation for enhanced image denoising and clustering | 10
A dual progressive strategy for long-tailed visual recognition | 10
3D face parsing based on 2D CPFNet: conformal parameterized face parsing network | 10
Online continual learning with saliency-guided experience replay using tiny episodic memory | 10
SGL-SLAM: a semantic and geometric RGB-D visual SLAM enhanced with line features for dynamic environments | 10
CAMTrack: a combined appearance-motion method for multiple-object tracking | 10
Adversarial imitation learning-based network for category-level 6D object pose estimation | 9
Camera-based mapping in search-and-rescue via flying and ground robot teams | 9
OmniGlasses: an optical aid for stereo vision CNNs to enable omnidirectional image processing | 9
IoU-aware feature fusion R-CNN for dense object detection | 9
Explainable interactive projections of images | 9
Benchmarking large and small MLLMs | 9
Shape related unknown object one-shot learning grasping | 9
X-Align++: cross-modal cross-view alignment for Bird’s-eye-view segmentation | 9
Multi-scale convolution underwater image restoration network | 9
Chfnet: a coarse-to-fine hierarchical refinement model for monocular depth estimation | 9
GOA-net: generic occlusion aware networks for visual tracking | 9
Audio-visual localization based on spatial relative sound order | 9
Generating comprehensive scene graphs with integrated multiple attribute detection | 9
Shape description losses for medical image segmentation | 9
Thin section analysis for ceramic petrography using motion analysis and segmentation techniques | 9
Fusing bilinear multi-channel gated vector for fine-grained classification | 9
Correction: Real estate pricing prediction via textual and visual features | 9
Twinned attention network for occlusion-aware facial expression recognition | 9
Overcoming occlusions in AR, via multi-view, real-time 3D human pose estimation | 8
A camera style-invariant learning and channel interaction enhancement fusion network for visible-infrared person re-identification | 8
An efficient ground segmentation approach for LiDAR point cloud utilizing adjacent grids | 8
Real-time pedestrian pose estimation, tracking and localization for social distancing | 8
MÆIDM: multi-scale anomaly embedding inpainting and discrimination for surface anomaly detection | 8
FOCUS: Frequency-Optimized Conditioning of diffUSion models for mitigating catastrophic forgetting during test-time adaptation | 8
Pakistan sign language recognition: leveraging deep learning models with limited dataset | 8
DisRot: boosting the generalization capability of few-shot learning via knowledge distillation and self-supervised learning | 8
EAF-Net: an enhancement and aggregation–feedback network for RGB-T salient object detection | 8
A comprehensive survey on SLAM and machine learning approaches for indoor autonomous navigation of mobile robots | 8
Automatic cables segmentation from a substation device based on 3D point cloud | 8
Attention-based global context network for driving maneuvers prediction | 8
Meta-learning enhanced global–local feature fusion for image quality assessment | 7
Integrating visual-semantic relational reasoning for fake news detection on video platforms | 7
Evolving brain tumor segmentation: differential evolution-optimized ensemble deep learning for multi-modal MRI analysis | 7
Visual-inertial SLAM with line segment merging and efficient feature tracking method | 7
Parametric loss-based super-resolution for scene text recognition | 7
Evolution algorithm of parametric active contour model based on Gaussian smoothing filter | 7
An adaptive interpolation and 3D reconstruction algorithm for underwater images | 7
Tensor-guided learning for image denoising using anisotropic PDEs | 7
Mobgazenet: robust gaze estimation mobile network based on progressive attention mechanisms | 7
Distortion diminishing with vulnerability filters pruning | 7
YG-SLAM: dynamic environment-based geometric constraint point-line fusion visual SLAM system | 7
Cross-dataset video deepfake detection using Transformer and CNN architectures | 7
Kinematic calibration of a hexapod robot based on monocular vision | 7
Actions as points: a simple and efficient detector for skeleton-based temporal action detection | 7
Improving change detection using conditional discriminative adversarial regularization | 6
ConsInstancy: learning instance representations for semi-supervised panoptic segmentation of concrete aggregate particles | 6
VGT-MOT: visibility-guided tracking for online multiple-object tracking | 6
Regional filtering distillation for object detection | 6
Guest editorial: special issue on human pose estimation and its applications | 6
PTDS CenterTrack: pedestrian tracking in dense scenes with re-identification and feature enhancement | 6
Boosting few-shot learning via selective patch embedding by comprehensive sample analysis | 6
PGA6D: 6D pose estimation for grasping and assemblying based on keypoints voting | 6
A dual-path U-Net for pulmonary vessel segmentation method based on lightweight 3D attention | 6
Environmental factors-aware two-stream GCN for skeleton-based behavior recognition | 6
Quality assessment of synthetic images via spatial distortion recognition | 6
Text-to-face synthesis based on facial landmarks prediction | 6
Multiple object tracking using weighted graph convolutional neural networks | 6
Welding splash and arc noise reduction imaging model based on computationally efficient pairwise response serving welding process library | 6
Multi-view dynamic reconstruction with cross-view smoothing based on surfel | 6
A novel multi-feature fusion deep neural network using HOG and VGG-Face for facial expression classification | 6
Robust semantic segmentation method of urban scenes in snowy environment | 6
Tree-managed network ensembles for video prediction | 6
Accelerated fixed-point iterations for image deblurring and defiltering | 6
Logit scaling for out-of-distribution detection | 6
Enhanced normal estimation of point clouds via fine-grained geometric information learning | 6
Semi-supervised metric learning incorporating weighted triplet constraint and Riemannian manifold optimization for classification | 6
Human pose estimation based on lightweight basicblock | 5
A collaborative SLAM method for dual payload-carrying UAVs in denied environments | 5
A review of adaptable conventional image processing pipelines and deep learning on limited datasets | 5
Cascaded attention-guided multi-granularity feature learning for person re-identification | 5
React: recognize every action everywhere all at once | 5
Residual feature learning with hierarchical calibration for gaze estimation | 5
FLAVR: flow-free architecture for fast video frame interpolation | 5
YOLOMH: you only look once for multi-task driving perception with high efficiency | 5
Toward phytoplankton parasite detection using autoencoders | 5
TFF-temporal fusion framework for advancing video retrieval through long-range dependencies and multi-modal intent | 5
Diffusion-leveraged GAN dehazing driven by classification: a two-stage framework for real-world monitoring imagery | 5
Fine-grained 3D vehicle shape manipulation via latent space editing | 5
Self-attention network for few-shot learning based on nearest-neighbor algorithm | 5
Personvit: large-scale self-supervised vision transformer for person re-identification | 5
Edge-aware dual path network for medical image classification | 5
Swin transformer with part-level tokenization for occluded person re-identification | 5
Naturally constrained reject option classification | 5
Local region-learning modules for point cloud classification | 5
Residual shuffle attention network for image super-resolution | 5
BiTransformer: augmenting semantic context in video captioning via bidirectional decoder | 5
Block-recurrent visual transformer for enhanced human detection in thermal imaging | 5
Optimized hand pose estimation CrossInfoNet-based architecture for embedded devices | 5
Unsupervised single-shot depth estimation using perceptual reconstruction | 5
Beyond Kalman filters: deep learning-based filters for improved object tracking | 5
The general framework for few-shot learning by kernel HyperNetworks | 4
Supervised contrastive learning with multi-scale interaction and integrity learning for salient object detection | 4
Superpixel-based foreground-preserving image stitching | 4
Carixray: a periapical X-ray dataset for machine vision-based dental caries recognition | 4
Removing cloud shadows from ground-based solar imagery | 4
A zero-shot anomaly detection method based on learnable text query | 4
SiamCAR-Kal: anti-occlusion tracking algorithm for infrared ground targets based on SiamCAR and Kalman filter | 4
Vision-based power line cables and pylons detection for low flying aircraft | 4
Enhanced keypoint information and pose-weighted re-ID features for multi-person pose estimation and tracking | 4
Online camera auto-calibration appliable to road surveillance | 4
Entangled appearance and motion structures network for multi-object tracking and segmentation | 4
ViCap-AD: video caption-based weakly supervised video anomaly detection | 4
Pixel representations, sampling, and label correction for semantic part detection | 4
Amp: single-shot ultra-wide fisheye-to-cubemap PnP pose estimation | 4
An image quality assessment method based on edge extraction and singular value for blurriness | 4
Token adaptation via side graph convolution for efficient fine-tuning of 3D point cloud transformers | 4
Structure–texture decomposition-based dehazing of a single image with large sky area | 4
CCTV-Calib: a toolbox to calibrate surveillance cameras around the globe | 4
LS-Occ:light specific-target-focus vision-based 3D occupancy prediction with adaptive combined head | 4
FESAR: SAR ship detection model based on local spatial relationship capture and fused convolutional enhancement | 4
A lightweight and generalizable detection enhancement method using segmentation feedback | 4
MVUDA: Unsupervised Domain Adaptation for Multi-view Pedestrian Detection | 4
A robust vehicle tracking in low-altitude UAV videos | 4
Multimodal dance style transfer | 4
WIFE-Net: widely integrated follicle extraction network | 4
Symmetry-induced ambiguity in orientation estimation from RGB images | 4
Gait recognition using free-area transformer networks | 4
A deep Retinex network for underwater low-light image enhancement | 4
Hierarchical contrastive adaptation for cross-domain object detection | 4
Ssman: self-supervised masked adaptive network for 3D human pose estimation | 4
Normalized margin loss for action unit detection | 4
Spatial-temporal graph-guided global attention network for video-based person re-identification | 4
Region gradient-guided diffusion model for underwater image enhancement | 3
Interpretability of fingerprint presentation attack detection systems: a look at the “representativeness” of samples against never-seen-before attacks | 3
Multi-person 3D pose estimation from unlabelled data | 3
CTL-DETR: a landslide detection algorithm for complex terrains | 3
Investigating long-term training for remote sensing object detection | 3
Pixel-wise confidence estimation for segmentation in Bayesian Convolutional Neural Networks | 3
Addressing the generalization of 3D registration methods with a featureless baseline and an unbiased benchmark | 3
CMNet: a novel model and design rationale based on comparison studies and synergy of CNN and MetaFormer | 3
Unsupervised domain adaptation by cross-domain consistency learning for CT body composition | 3
FDT − Dr2T: a unified Dense Radiology Report Generation Transformer framework for X-ray images | 3
An Efficient point-in-convex 3D polyhedron test using a projective algorithm with sub-linear expected complexity | 3
Similarity contrastive estimation for image and video soft contrastive self-supervised learning | 3
Dynamically throttleable neural networks | 3
Cmf-transformer: cross-modal fusion transformer for human action recognition | 3
Discriminative feature learning through feature distance loss | 3
Time-constrained adversarial attacks for video recognition models: temporally sparse but effective perturbations | 3
Dynamic focused prototypes distillation for few-shot object detection | 3
Exploring the potential of deep learning techniques for analyzing athlete movements in competitive athletics sports | 3
Wide-baseline multi-camera calibration from a room filled with people | 3
SGBGAN: minority class image generation for class-imbalanced datasets | 3
SNFR: salient neighbor decoding and text feature refining for scene text recognition | 3
Bidirectional cascaded multimodal attention for multiple choice visual question answering | 3
MYFED: a dataset of affective face videos for investigation of emotional facial dynamics as a soft biometric for person identification | 3
Material classification of polishing and convex surface objects based on photon accumulation point spread function (PAPSF) from imaging model of binocular pulsed time-of-flight camera | 3
Continuous sign language recognition based on motor attention mechanism and frame-level self-distillation | 3
Human–object interaction detection based on disentangled axial attention transformer | 3
Multimodal fine-grained grocery product recognition using image and OCR text | 3
Efficient abnormality detection using patch-based 3D convolution with recurrent model | 3
Motioninsights: real-time object tracking in streaming video | 3
Exploring filter placement in convolutional layer topologies based on ResNet for image classification | 3
Zero-shot action recognition by clustered representation with redundancy-free features | 3
Consensus similarity learning based on tensor nuclear norm | 3
Virtual home staging and relighting from a single panorama under natural illumination | 3
Ising granularity image analysis on VAE–GAN | 3
Knowledge-based hybrid connectionist models for morphologic reasoning | 3
Ipdm: identity preserving diffusion model for face sketch and photo synthesis | 3
Generating quality grasp rectangle using Pix2Pix GAN for intelligent robot grasping | 3
Interpretable visual transmission lines inspections using pseudo-prototypical part network | 3
Rocnet: 3D robust registration of points clouds using deep learning | 3
Object Recognition Consistency in Regression for Active Detection | 2
Foreground enhancement network for object detection in sonar images | 2
Beyond a strong baseline: cross-modality contrastive learning for visible-infrared person re-identification | 2
Optimize multiscale feature hybrid-net deep learning approach used for automatic pancreas image segmentation | 2
Closing the gap in domain adaptation for semantic segmentation: a time-aware method | 2
A deep learning framework for finding illicit images/videos of children | 2
Yolov7-pcbam: enhancing steel surface defect detection via partial convolution and attention mechanism | 2
Graph convolutional networks and LSTM for first-person multimodal hand action recognition | 2
A novel method for 3D knee anatomical landmark localization by combining global and local features | 2
Deep 6-DoF camera relocalization in variable and dynamic scenes by multitask learning | 2
ICE-GCN: An interactional channel excitation-enhanced graph convolutional network for skeleton-based action recognition | 2
Biomimetic oculomotor control with spiking neural networks | 2
Improved deep depth estimation for environments with sparse visual cues | 2
Trusted 3D self-supervised representation learning with cross-modal settings | 2
Pose is all you need: the pose only group activity recognition system (POGARS) | 2
From explanation to unsupervised segmentation: fusion of multiple explanation maps for vision transformers | 2
PerSnake: a real-time pedestrian instance segmentation network using contour representation | 2
TARG-YOLO: an efficient small target detection framework for UAV | 2
A pothole can be seen with two eyes: an ensemble approach to pothole detection | 2
Study on defect detection of metal castings based on supervised enhancement and attention distillation | 2
On the generalizability of iterative patch selection for memory-efficient high-resolution image classification | 2
PM-MVS: PatchMatch multi-view stereo | 2
Representing dynamic textures based on polarized gradient features | 2
A multi-target physiological signal detection method for UWB radar based on Kalman tracking and dual-branch network | 2
Visible-infrared person re-identification model based on feature consistency and modal indistinguishability | 2
Fast re-OBJ: real-time object re-identification in rigid scenes | 2
Hyperspectral image dynamic range reconstruction using deep neural network-based denoising methods | 2
Simultaneous tracking of objects with loose context constraints from multiple views: human–human interaction paradigm | 2
Editor’s Note: Special Issue from Winter Conference on Applications of Computer Vision - WACV 2023 | 2
MSPhys: multiscale fusing-based diffusion model for remote physiological measurement | 2
Correction to: Self-attention network for few-shot learning based on nearest-neighbor algorithm | 2
Speech-aided facial video super resolution with accurate lip motion and enhanced frequency details | 2
Calibrating uncertainties in human trajectory forecasting | 2
Ellipse detection using the edges extracted by deep learning | 2