Dang-Khanh Nguyen

I am a software engineer and a philomath. Check my coding activities at .
My research record can be found at or .

Education

2022-2023

Master's Student, Department of AI Convergence, Chonnam National University (Gwangju, South Korea)

2014-2018

Bachelor of Engineering, Electrical and Electronic Engineering, Ho Chi Minh City University of Technology (HCMC, Vietnam)

Senior Software Engineer at Aim Future

2024 - Present

Design and implement an SDK that compiles well-known computer vision models (from various frameworks: Keras, TensorFlow, Torch) into hardware-specific runtime instructions.
Optimize DNN operations for the target hardware so that they run efficiently on the AI accelerator. Decompose these operations into smaller units that can be executed in parallel, hiding memory latency and maximizing arithmetic intensity.
Exploit the power of our AI accelerator to provide faster implementations of deep neural network inference.

2022 - 2023

My research topics are video emotion recognition, multimodal learning, and multimodal fusion.
Investigate and implement machine learning and deep learning models using the PyTorch framework.
My publications and the competitions I joined are listed in the next section.

2020 - 2021

Investigate RFC documents to implement network protocols. Develop L2/L3 protocols for network devices (switches/routers) using the C programming language.
Some protocols that I have developed and tested: OMCI, SNMP.
Use GDB for debugging and Git for version control of the team's source code.
Utilize multithreading and inter-process communication to improve response time between devices.
Simulate the behavior of an intermediary network device for boundary-value analysis and stress testing of a gateway device.
Set up network topologies to evaluate the behavior of the protocols on prototype devices.

Software Engineer at Renesas Design Vietnam

2018 - 2019

Develop IP modules for virtual MCU/SoC using C++ programming language and SystemC.
Some modules that I have developed and tested: Interrupt controller, Random Number Generator.
Embed Python APIs in C++ modules. Use Python to create test scripts and a unit test environment.
Guarantee code coverage >95% using GCOV. Track and resolve 100% of memory leak issues with Valgrind.
Work in a Linux environment and use Makefiles to control the build process.

The paper introduces my solution to the Second REACT Challenge@IEEE FG24. We ranked in the top 3 in Facial Reaction Generation.
Our proposed approach incorporates the Multimodal Bottleneck Token mechanism to capture interactions between acoustic and visual speaker features and utilizes the Variational Autoencoder framework to generate latent representations of multiple listener reactions. Additionally, we employ Gaussian Mixture Models to enhance the generative capabilities of the autoencoder.
Source code here.

The paper introduces my solution to the CVPR 2023 5th Workshop and Competition on Affective Behavior Analysis in-the-wild. We ranked in the top 9 in the Multi-task Learning Challenge.
We used a pre-trained EfficientNet to extract facial spatial features and a Transformer to extract temporal features. The sequence of embeddings is then used to generate frame-wise emotional predictions for a video.
Source code here.

We reviewed the transformer-based fusion methods of Xu for depression recognition.
We introduced a new fusion transformer and achieved good performance on the D-Vlog benchmark.
Source code here.

We won 4th Prize in the 4th Emotion Recognition in Korean Conversation Competition.
We use a pre-trained language model to analyze the conversation context and the speakers' memory to handle the ERC task.
Source code here.

We use a pre-trained Wav2vec model for emotion recognition from vocal bursts. We ranked in the top 2 in the A-VB Culture task.
The competition is organized by HumeAI to explore an under-studied indicator of emotion: non-verbal human sounds.
Source code here.

In Proceedings of ECCV Workshop (2022)

The paper introduces my solution to the 4th Workshop and Competition on Affective Behavior Analysis in-the-wild. We ranked in the top 4 in the Multi-task Learning Challenge.
We propose a 3-head EfficientNet to address 3 affective tasks: emotion recognition, valence-arousal estimation, and action unit detection.
Source code here.