|
1 | 1 | [](https://github.com/isLinXu/paper-list)<p align="center"><h1 align="center"><br><ins>Paper-List-DAILY</ins><br>Automatically Update Papers Daily in list</h1></p>
|
2 | | -## Updated on 2025.09.02
| 2 | +## Updated on 2025.09.03
3 | 3 |
|
4 | 4 | <details>
|
5 | 5 | <summary>Table of Contents</summary>
|
|
2248 | 2248 | |**2024-02-27**|**Scaling Supervised Local Learning with Augmented Auxiliary Networks**|Chenxiang Ma et.al.|[2402.17318](http://arxiv.org/abs/2402.17318)|**[link](https://github.com/chenxiangma/auglocal)**|
|
2249 | 2249 | |**2024-02-26**|**Offline Writer Identification Using Convolutional Neural Network Activation Features**|Vincent Christlein et.al.|[2402.17029](http://arxiv.org/abs/2402.17029)|null|
|
2250 | 2250 |
|
2251 | | -<p align=right>(<a href=#updated-on-20250902>back to top</a>)</p>
| 2251 | +<p align=right>(<a href=#updated-on-20250903>back to top</a>)</p>
2252 | 2252 |
|
2253 | 2253 | ## Object Detection
|
2254 | 2254 |
|
|
4650 | 4650 | |**2024-02-27**|**A Vanilla Multi-Task Framework for Dense Visual Prediction Solution to 1st VCL Challenge -- Multi-Task Robustness Track**|Zehui Chen et.al.|[2402.17319](http://arxiv.org/abs/2402.17319)|null|
|
4651 | 4651 | |**2024-02-27**|**Probing Multimodal Large Language Models for Global and Local Semantic Representation**|Mingxu Tao et.al.|[2402.17304](http://arxiv.org/abs/2402.17304)|null|
|
4652 | 4652 |
|
4653 | | -<p align=right>(<a href=#updated-on-20250902>back to top</a>)</p>
| 4653 | +<p align=right>(<a href=#updated-on-20250903>back to top</a>)</p>
4654 | 4654 |
|
4655 | 4655 | ## Semantic Segmentation
|
4656 | 4656 |
|
|
6610 | 6610 | |**2024-02-27**|**Mitigating Distributional Shift in Semantic Segmentation via Uncertainty Estimation from Unlabelled Data**|David S. W. Williams et.al.|[2402.17653](http://arxiv.org/abs/2402.17653)|null|
|
6611 | 6611 | |**2024-02-27**|**Masked Gamma-SSL: Learning Uncertainty Estimation via Masked Image Modeling**|David S. W. Williams et.al.|[2402.17622](http://arxiv.org/abs/2402.17622)|null|
|
6612 | 6612 |
|
6613 | | -<p align=right>(<a href=#updated-on-20250902>back to top</a>)</p>
| 6613 | +<p align=right>(<a href=#updated-on-20250903>back to top</a>)</p>
6614 | 6614 |
|
6615 | 6615 | ## Object Tracking
|
6616 | 6616 |
|
|
7138 | 7138 | |**2024-02-24**|**Multi-Object Tracking by Hierarchical Visual Representations**|Jinkun Cao et.al.|[2402.15895](http://arxiv.org/abs/2402.15895)|null|
|
7139 | 7139 | |**2024-02-24**|**Detection Is Tracking: Point Cloud Multi-Sweep Deep Learning Models Revisited**|Lingji Chen et.al.|[2402.15756](http://arxiv.org/abs/2402.15756)|null|
|
7140 | 7140 |
|
7141 | | -<p align=right>(<a href=#updated-on-20250902>back to top</a>)</p>
| 7141 | +<p align=right>(<a href=#updated-on-20250903>back to top</a>)</p>
7142 | 7142 |
|
7143 | 7143 | ## Action Recognition
|
7144 | 7144 |
|
|
7929 | 7929 | |**2024-02-13**|**Vision-Based Hand Gesture Customization from a Single Demonstration**|Soroush Shahi et.al.|[2402.08420](http://arxiv.org/abs/2402.08420)|null|
|
7930 | 7930 | |**2024-02-12**|**PBADet: A One-Stage Anchor-Free Approach for Part-Body Association**|Zhongpai Gao et.al.|[2402.07814](http://arxiv.org/abs/2402.07814)|null|
|
7931 | 7931 |
|
7932 | | -<p align=right>(<a href=#updated-on-20250902>back to top</a>)</p>
| 7932 | +<p align=right>(<a href=#updated-on-20250903>back to top</a>)</p>
7933 | 7933 |
|
7934 | 7934 | ## Pose Estimation
|
7935 | 7935 |
|
|
9110 | 9110 | |**2024-02-26**|**DreamUp3D: Object-Centric Generative Models for Single-View 3D Scene Understanding and Real-to-Sim Transfer**|Yizhe Wu et.al.|[2402.16308](http://arxiv.org/abs/2402.16308)|null|
|
9111 | 9111 | |**2024-02-25**|**XAI-based gait analysis of patients walking with Knee-Ankle-Foot orthosis using video cameras**|Arnav Mishra et.al.|[2402.16175](http://arxiv.org/abs/2402.16175)|null|
|
9112 | 9112 |
|
9113 | | -<p align=right>(<a href=#updated-on-20250902>back to top</a>)</p>
| 9113 | +<p align=right>(<a href=#updated-on-20250903>back to top</a>)</p>
9114 | 9114 |
|
9115 | 9115 | ## Image Generation
|
9116 | 9116 |
|
|
12077 | 12077 | |**2024-02-28**|**Breaking the Black-Box: Confidence-Guided Model Inversion Attack for Distribution Shift**|Xinhao Liu et.al.|[2402.18027](http://arxiv.org/abs/2402.18027)|null|
|
12078 | 12078 | |**2024-02-27**|**CustomSketching: Sketch Concept Extraction for Sketch-based Image Synthesis and Editing**|Chufeng Xiao et.al.|[2402.17624](http://arxiv.org/abs/2402.17624)|null|
|
12079 | 12079 |
|
12080 | | -<p align=right>(<a href=#updated-on-20250902>back to top</a>)</p>
| 12080 | +<p align=right>(<a href=#updated-on-20250903>back to top</a>)</p>
12081 | 12081 |
|
12082 | 12082 | ## LLM
|
12083 | 12083 |
|
|
15450 | 15450 | |**2024-02-28**|**Unsupervised Cross-Domain Image Retrieval via Prototypical Optimal Transport**|Bin Li et.al.|[2402.18411](http://arxiv.org/abs/2402.18411)|**[link](https://github.com/hcvlab/protoot)**|
|
15451 | 15451 | |**2024-02-28**|**A Cognitive Evaluation Benchmark of Image Reasoning and Description for Large Vision Language Models**|Xiujie Song et.al.|[2402.18409](http://arxiv.org/abs/2402.18409)|null|
|
15452 | 15452 |
|
15453 | | -<p align=right>(<a href=#updated-on-20250902>back to top</a>)</p>
| 15453 | +<p align=right>(<a href=#updated-on-20250903>back to top</a>)</p>
15454 | 15454 |
|
15455 | 15455 | ## Scene Understanding
|
15456 | 15456 |
|
|
16316 | 16316 | |**2024-02-21**|**Class-Aware Mask-Guided Feature Refinement for Scene Text Recognition**|Mingkun Yang et.al.|[2402.13643](http://arxiv.org/abs/2402.13643)|**[link](https://github.com/melosy/cam)**|
|
16317 | 16317 | |**2024-02-25**|**DriveVLM: The Convergence of Autonomous Driving and Large Vision-Language Models**|Xiaoyu Tian et.al.|[2402.12289](http://arxiv.org/abs/2402.12289)|null|
|
16318 | 16318 |
|
16319 | | -<p align=right>(<a href=#updated-on-20250902>back to top</a>)</p>
| 16319 | +<p align=right>(<a href=#updated-on-20250903>back to top</a>)</p>
16320 | 16320 |
|
16321 | 16321 | ## Depth Estimation
|
16322 | 16322 |
|
|
17065 | 17065 | |**2024-02-19**|**An Endoscopic Chisel: Intraoperative Imaging Carves 3D Anatomical Models**|Jan Emily Mangulabnan et.al.|[2402.11840](http://arxiv.org/abs/2402.11840)|null|
|
17066 | 17066 | |**2024-02-19**|**Unveiling the Depths: A Multi-Modal Fusion Framework for Challenging Scenarios**|Jialei Xu et.al.|[2402.11826](http://arxiv.org/abs/2402.11826)|null|
|
17067 | 17067 |
|
17068 | | -<p align=right>(<a href=#updated-on-20250902>back to top</a>)</p>
| 17068 | +<p align=right>(<a href=#updated-on-20250903>back to top</a>)</p>
17069 | 17069 |
|
17070 | 17070 | ## Audio Processing
|
17071 | 17071 |
|
|
18717 | 18717 | |**2024-02-26**|**Effect of utterance duration and phonetic content on speaker identification using second-order statistical methods**|Ivan Magrin-Chagnolleau et.al.|[2402.16429](http://arxiv.org/abs/2402.16429)|null|
|
18718 | 18718 | |**2024-02-24**|**ArEEG_Chars: Dataset for Envisioned Speech Recognition using EEG for Arabic Characters**|Hazem Darwish et.al.|[2402.15733](http://arxiv.org/abs/2402.15733)|null|
|
18719 | 18719 |
|
18720 | | -<p align=right>(<a href=#updated-on-20250902>back to top</a>)</p>
| 18720 | +<p align=right>(<a href=#updated-on-20250903>back to top</a>)</p>
18721 | 18721 |
|
18722 | 18722 | ## Multimodal
|
18723 | 18723 |
|
|
19268 | 19268 | |**2024-02-19**|**Multimodal Emotion Recognition from Raw Audio with Sinc-convolution**|Xiaohui Zhang et.al.|[2402.11954](http://arxiv.org/abs/2402.11954)|null|
|
19269 | 19269 | |**2024-02-18**|**Efficient Multimodal Learning from Data-centric Perspective**|Muyang He et.al.|[2402.11530](http://arxiv.org/abs/2402.11530)|**[link](https://github.com/baai-dcai/bunny)**|
|
19270 | 19270 |
|
19271 | | -<p align=right>(<a href=#updated-on-20250902>back to top</a>)</p>
| 19271 | +<p align=right>(<a href=#updated-on-20250903>back to top</a>)</p>
19272 | 19272 |
|
19273 | 19273 | ## Anomaly Detection
|
19274 | 19274 |
|
|
21349 | 21349 | |**2024-02-25**|**An Adversarial Robustness Benchmark for Enterprise Network Intrusion Detection**|João Vitorino et.al.|[2402.16912](http://arxiv.org/abs/2402.16912)|null|
|
21350 | 21350 | |**2024-02-26**|**Uncertainty Quantification in Anomaly Detection with Cross-Conformal $p$ -Values**|Oliver Hennhöfer et.al.|[2402.16388](http://arxiv.org/abs/2402.16388)|null|
|
21351 | 21351 |
|
21352 | | -<p align=right>(<a href=#updated-on-20250902>back to top</a>)</p>
| 21352 | +<p align=right>(<a href=#updated-on-20250903>back to top</a>)</p>
21353 | 21353 |
|
21354 | 21354 | ## Transfer Learning
|
21355 | 21355 |
|
|
24114 | 24114 | |**2024-02-28**|**OpenMEDLab: An Open-source Platform for Multi-modality Foundation Models in Medicine**|Xiaosong Wang et.al.|[2402.18028](http://arxiv.org/abs/2402.18028)|null|
|
24115 | 24115 | |**2024-02-28**|**Collaborative decoding of critical tokens for boosting factuality of large language models**|Lifeng Jin et.al.|[2402.17982](http://arxiv.org/abs/2402.17982)|null|
|
24116 | 24116 |
|
24117 | | -<p align=right>(<a href=#updated-on-20250902>back to top</a>)</p>
| 24117 | +<p align=right>(<a href=#updated-on-20250903>back to top</a>)</p>
24118 | 24118 |
|
24119 | 24119 | ## Optical Flow
|
24120 | 24120 |
|
|
24655 | 24655 | |**2024-02-12**|**A Flow-based Credibility Metric for Safety-critical Pedestrian Detection**|Maria Lyssenko et.al.|[2402.07642](http://arxiv.org/abs/2402.07642)|null|
|
24656 | 24656 | |**2024-02-09**|**Image-based Deep Learning for the time-dependent prediction of fresh concrete properties**|Max Meyer et.al.|[2402.06611](http://arxiv.org/abs/2402.06611)|null|
|
24657 | 24657 |
|
24658 | | -<p align=right>(<a href=#updated-on-20250902>back to top</a>)</p>
| 24658 | +<p align=right>(<a href=#updated-on-20250903>back to top</a>)</p>
24659 | 24659 |
|
24660 | 24660 | ## Reinforcement Learning
|
24661 | 24661 |
|
|
27980 | 27980 | |**2024-02-28**|**Is Crowdsourcing Breaking Your Bank? Cost-Effective Fine-Tuning of Pre-trained Language Models with Proximal Policy Optimization**|Shuo Yang et.al.|[2402.18284](http://arxiv.org/abs/2402.18284)|null|
|
27981 | 27981 | |**2024-02-28**|**Reinforcement Learning and Graph Neural Networks for Probabilistic Risk Assessment**|Joachim Grimstad et.al.|[2402.18246](http://arxiv.org/abs/2402.18246)|null|
|
27982 | 27982 |
|
27983 | | -<p align=right>(<a href=#updated-on-20250902>back to top</a>)</p>
| 27983 | +<p align=right>(<a href=#updated-on-20250903>back to top</a>)</p>
27984 | 27984 |
|
27985 | 27985 | ## Graph Neural Networks
|
27986 | 27986 |
|
|
30732 | 30732 | |**2024-02-27**|**Using Graph Neural Networks to Predict Local Culture**|Thiago H Silva et.al.|[2402.17905](http://arxiv.org/abs/2402.17905)|null|
|
30733 | 30733 | |**2024-02-27**|**Learning Topological Representations with Bidirectional Graph Attention Network for Solving Job Shop Scheduling Problem**|Cong Zhang et.al.|[2402.17606](http://arxiv.org/abs/2402.17606)|null|
|
30734 | 30734 |
|
30735 | | -<p align=right>(<a href=#updated-on-20250902>back to top</a>)</p>
| 30735 | +<p align=right>(<a href=#updated-on-20250903>back to top</a>)</p>
30736 | 30736 |
|
30737 | 30737 | [](https://github.com/isLinXu/paper-list)
|