Changho Choi

Computer Science undergraduate at Korea University specializing in computer vision and generative models, with 6+ years of applied machine learning experience in vision and audio. Actively seeking PhD opportunities in computer vision and general machine learning.

From Mar 2025 to Dec 2025, I was an Undergraduate Researcher at the Vision & AI Lab under Prof. Jinkyu Kim, conducting research on causal sequential visual encoders. Before that, from Oct 2024 to Dec 2024, I worked as an AI Researcher at OptimizerAI, where I contributed to research and development for the Text-to-SFX Model v2. From Jun 2023 to Oct 2024, I was an AI Engineer at Pion Corporation, developing deep learning models for optimal product image recommendation. From Mar 2019 to May 2023, I was an AI Scientist at Maum.AI, where I worked on a range of computer vision tasks through extensive literature review, use of open-source resources, and paper implementation.

I am an undergraduate at Korea University majoring in Computer Science and Engineering, with graduation expected in August 2026. I graduated from Gyeonggi Science High School in Feb 2017.

Email / CV / GitHub / Google Scholar / LinkedIn / X

Publications

I'm interested in computer vision, machine learning, optimization, graphics and robotics.

MambaEye: A Size-Agnostic Visual Encoder with Causal Sequential Processing

Changho Choi, Minho Kim, Jinkyu Kim
CVPR Findings, 2026
arxiv / code

A causal sequential visual encoder that is agnostic to input size, much like human vision, using a pure Mamba2 backbone. With a strictly unidirectional approach and relative move embeddings, MambaEye ensures translation invariance and adapts to arbitrary resolutions.

SAOInstruct: Free-form Audio Editing using Natural Language Instructions

Michael Ungersböck, Florian Grötschla, Luca A. Lanzendörfer, June Young Yi, Changho Choi, Roger Wattenhofer
NeurIPS, 2025
arxiv / code / website

A model based on Stable Audio Open for flexible, free-form natural language audio editing. Trained on a novel dataset of audio editing triplets, it generalizes to real-world audio and unseen instructions.

LatentSwap: An Efficient Latent Code Mapping Framework for Face Swapping

Changho Choi, Minho Kim, Junhyeok Lee, Hyoung-Kyu Song, Younggeun Kim, Seungryong Kim
arXiv preprint, 2024
arxiv / code

A lightweight face swapping framework that generates latent codes for pre-trained generators without external datasets. It features a fast training process with a simple three-term loss, producing high-resolution results comparable to state-of-the-art models.

Work & Research Experience

Beyond the research and publications above, here is a sampling of my past work.

Vision & AI Lab, Korea University
Undergraduate Researcher, 2025-03 ~ 2025-12
Conducted research on causal sequential visual encoders under Prof. Jinkyu Kim. Proposed MambaEye, a size-agnostic visual encoder that treats image recognition as a causal, sequential process, mimicking human saccadic vision.

OptimizerAI
AI Researcher, 2024-10 ~ 2024-12
Contributed to research and development for the Text-to-SFX Model v2, which achieved state-of-the-art human preference win rates.

Pion Corporation (vcat.ai)
AI Engineer, 2023-06 ~ 2024-10
Developed deep learning models for optimal product image recommendation and online image clustering using CLIP and Triton Inference Server.

Maum.AI
AI Scientist, 2019-03 ~ 2023-05
Led the computer vision research team and worked on various computer vision tasks, including super resolution and face swapping, using open-source resources and paper implementations.