About
Dang-Khanh Nguyen
I am a software engineer and a philomath. Check my coding activities at
.
My research record can be found at
or
.
Education
- 2022-2023
- Master's student, Department of AI Convergence, Chonnam National University (Gwangju, South Korea)
- Graduated in Feb 2024.
- 2014-2018
- Bachelor of Engineering, Electrical and Electronic Engineering, Ho Chi Minh University of Technology (HCMC, Vietnam)
- Honor Program of Telecommunication Engineering.
Experience
- Senior Software Engineer at Aim Future
- 2024 - Present
- Design and implement an SDK that compiles well-known computer vision models (from various frameworks: Keras, TensorFlow, Torch) into hardware-specific runtime instructions.
- Optimize DNN operations for the target hardware so that they compute efficiently on the AI accelerator. Decompose these operations into smaller elements that can be executed in parallel, hiding memory latency and maximizing arithmetic intensity.
- Exploit the capabilities of our AI accelerator to provide faster deep neural network inference.
- Researcher at Smart Computing Lab
- 2022 - 2023
- My research topics are video emotion recognition, multimodal learning, and multimodal fusion.
- Investigate and implement machine learning and deep learning models using the PyTorch framework.
- My publications and the competitions I joined are listed in the next section.
- Software Engineer at Viettel High Technology Company
- 2020 - 2021
- Study RFC documents to implement network protocols. Develop L2/L3 protocols for network devices (switches/routers) in C.
- Some protocols that I have developed and tested: OMCI, SNMP.
- Use GDB for debugging and Git for version control of the team’s source code.
- Use multithreading and inter-process communication to improve response time between devices.
- Simulate the behavior of intermediary network devices for boundary-value analysis and stress testing of gateway devices.
- Set up network topologies to evaluate the behavior of the protocols on prototype devices.
- Software Engineer at Renesas Design Vietnam
- 2018 - 2019
- Develop IP modules for virtual MCUs/SoCs using C++ and SystemC.
- Some modules that I have developed and tested: interrupt controller, random number generator.
- Embed Python APIs in C++ modules. Use Python to create test scripts and a unit test environment.
- Maintain code coverage above 95% using gcov. Track and resolve 100% of memory leak issues with Valgrind.
- Work in a Linux environment, using Makefiles to control the build process.
Publications
- Multiple Facial Reaction Generation Using Gaussian Mixture of Models and Multimodal Bottleneck Transformer
- In Proceedings of the IEEE 18th International Conference on Automatic Face and Gesture Recognition (2024)
- The paper introduces my solution to the Second REACT Challenge@IEEE FG24. We placed in the top 3 in Facial Reaction Generation.
- Our proposed approach uses the Multimodal Bottleneck Token mechanism to capture interactions between the speaker's acoustic and visual features, and the Variational Autoencoder framework to generate latent representations of multiple listener reactions. Additionally, we employ Gaussian Mixture Models to enhance the generative capabilities of the autoencoder.
- Source code here.
- A Transformer-based Approach to Video Frame-level Prediction in Affective Behavior Analysis In-the-wild
- In Proceedings of the 11th International Conference on Big Data Applications and Services
- The paper introduces my solution to the CVPR 2023: 5th Workshop and Competition on Affective Behavior Analysis in-the-wild. We placed in the top 9 in the Multi-task Learning Challenge.
- We used a pre-trained EfficientNet to extract facial spatial features and a Transformer to extract temporal features. The sequence of embeddings is then used to generate frame-wise emotional predictions for a video.
- Source code here.
- Multimodal Transformer for Automatic Depression Estimation System
- In Proceedings of the 29th International Workshop on Frontiers of Computer Vision
- We review the transformer-based fusion methods of Xu for depression recognition.
- We introduce a new fusion transformer that performs well on the D-Vlog benchmark.
- Source code here.
- Analyzing Context and Speaker Memory using Pretrained Language Model for Emotion Recognition in Korean Conversation task
- In Proceedings of the 10th International Conference on Big Data Applications and Services
- We won 4th Prize in the 4th Emotion Recognition in Korean Conversation Competition.
- We use a pre-trained language model to analyze the conversation context and the speakers’ memory for the emotion recognition in conversation (ERC) task.
- Source code here.
- Fine-tuning Wav2vec for Vocal-burst Emotion Recognition
- In Proceedings of the ACII Affective Vocal Bursts Workshop and Competition 2022 (A-VB)
- Affective Behavior Analysis using Action Unit Relation Graph and Multi-task Cross Attention
- In Proceedings of the ECCV Workshop (2022)
- The paper introduces my solution to the 4th Workshop and Competition on Affective Behavior Analysis in-the-wild. We placed in the top 4 in the Multi-task Learning Challenge.
- We propose a 3-head EfficientNet to address three affective tasks: emotion recognition, valence-arousal estimation, and action unit detection.
- Source code here.