Curriculum Vitae
Education
University College London (QS 9), 09/2024 - 09/2025
MSc in Computer Graphics, Vision and Imaging
Compulsory modules: Machine Vision, Image Processing, Computer Graphics, Machine Learning for Visual Computing
Optional modules: Inverse Problems in Imaging, Acquisition and Processing of 3D Geometry, Numerical Optimisation, Virtual Environments
Hefei University of Technology (Project 211), 09/2020 - 06/2024
BEng in Computer Science and Technology
Score: 90.10% (3.85/4.0); 5/152
Publications
Motion Matters: Motion-guided Modulation Network for Skeleton-based Micro-Action Recognition
Jihao Gu, Kun Li, Fei Wang, Yanyan Wei, Zhiliang Wu, Hehe Fan, Meng Wang
MM-Gesture: Towards Precise Micro-Gesture Recognition through Multimodal Fusion
🏆 Champion of the Micro-gesture Classification sub-challenge at MiGA@IJCAI2025
Jihao Gu, Fei Wang, Kun Li, Yanyan Wei, Zhiliang Wu, Dan Guo
Performance Analysis of Traditional VQA Models Under Limited Computational Resources
Jihao Gu
Peer Review
Conference
Reviewer for The 2025 International Joint Conference on Neural Networks (IJCNN’25), 03/2025
Reviewer for The 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR’24), 02/2024
Journal
Reviewer for IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 08/2025
Reviewer for Engineering Applications of Artificial Intelligence (EAAI), 04/2025
Reviewer for Intelligent Data Analysis (IDA), 04/2025
Patents
Blind Travel Obstacle Avoidance Assistance System V1.0 [2023SR0517944]
Outdoor Visual Impairment Assisting Method Based on Deep Intelligent Interaction [CN114724053A]
A Semantic Segmentation-Based Preferential Direction Deviation Early Warning System and Method [CN114723946A]
Route Planning Method for Visually Impaired People Based on Weighted Undirected Graph [CN116448130A]
A Collision Warning Method Based on Image Target Detection and Visual Depth Estimation [CN116403146A]
Projects
Prompt-guided Anomaly Detection in Videos (postgraduate thesis), 03/2025 - present
[thesis] [code]
Jihao Gu, Kaan Akşit (Advisor)
Video Anomaly Detection (VAD) aims to identify anomalous events within video segments. In scenarios such as surveillance or industrial process monitoring, anomaly detection is of critical importance. However, most existing approaches are semi-automated and rely on human intervention to assess anomalies. Traditional VAD methods typically focus only on determining whether a video contains anomalies or locating anomalous frames, with output limited to binary decisions (“normal” or “anomalous”). We propose Prompt-guided Anomaly Detection in Videos (PGADV), a framework built upon Large Vision-Language Models (LVLMs). PGADV introduces an Anomaly Decoder that performs pixel-wise visual-textual feature alignment to generate fine-grained anomaly heatmaps. Furthermore, we design a Region-aware Prompt Learner that transforms the heatmaps into learnable textual embeddings, guiding the LVLM to accurately identify and localize anomalous events in videos. This significantly enhances both the granularity and interactivity of anomaly detection.
Enhancing the Baseline Performance of OrienterNet for Visual Localization, 01/2024 - 06/2024
Jihao Gu, Yan Da (Advisor)
This project aims to improve OrienterNet, a neural network-based visual localization method that achieves accurate localization using public 2D maps (e.g., planar maps). The original approach matches camera-captured images against public maps, addressing localization in GPS-denied scenarios, especially indoor and complex urban environments.
End-to-End Sign Language Recognition using Transformers, 12/2022 - 10/2023
Jihao Gu, LMC-VUT (Lab)
Enhanced a Transformer-based model for sign language recognition, achieving improved translation accuracy and reduced word error rate.
Navigation System for Visually Impaired People Based on Visual Ambient Intelligence (undergraduate thesis), 06/2021 - 06/2024
Jihao Gu, Guo Dan (Advisor)
This thesis proposes a visually aware navigation system to assist visually impaired individuals in outdoor travel. By integrating computer vision, artificial intelligence, and cloud computing technologies, the system captures environmental information using a binocular camera and processes the data through advanced algorithms such as object detection and semantic segmentation. The primary contributions of this thesis are as follows: (1) the design and implementation of a navigation system based on visual environmental perception; (2) the proposal of a deep intelligent interaction-based outdoor assistance method tailored for visually impaired individuals; (3) the development of an offset warning system leveraging semantic segmentation techniques; (4) the introduction of a collision warning method utilizing image-based object detection and visual depth estimation; (5) the formulation of a route planning method founded on weighted undirected graph principles; and (6) the construction of a road image dataset specifically for the Hefei University of Technology campus.
Awards
Outstanding Graduate of Hefei University of Technology, Class of 2024
Outstanding Graduation Thesis (Design), Class of 2024, Hefei University of Technology
“Three Good Students” at School Level, Academic Years 2022 & 2023
First-Class Scholarship, Academic Year 2023
Second-Class Scholarship, Academic Years 2021 & 2022
Work Experience
Shenzhen Boshengteng Technology Co., Ltd.
Embedded Development Assistant, 07/2023 - 08/2023
Conducted comprehensive research on AI chip design, focusing on embedded neural network design, RISC-V architecture, Verilog programming, and power optimization.
Contributed to the design and optimization of an AI chip processing unit, involving architecture selection, instruction set optimization, and module creation using Verilog.
Shenzhen Boshengteng Technology Co., Ltd.
Algorithm Development Assistant, 03/2023 - 06/2023
Participated in APA and HPP projects for advanced driving technology, contributing to data annotation, model testing, and optimization of parking and object detection algorithms.
Developed technical documentation for perception algorithms, detailing design, implementation, and evaluation.
Beijing Tuosida Technology Development Co., Ltd.
Python Development Engineer Assistant, 01/2023 - 02/2023
Developed machine learning models using PyTorch and TensorFlow, including data preprocessing, feature engineering, and model evaluation.
Documented system requirements, design specifications, and test protocols.