Multimodal AI Lab: http://mmai.ewha.ac.kr/
Jiyoung Lee is an assistant professor in the Department of AI at Ewha Womans University. Before joining Ewha Womans University, she was a research scientist at NAVER AI Lab from Dec. 2021 to Feb. 2025. She received her Ph.D. from Yonsei University, advised by Prof. Kwanghoon Sohn. She interned at Adobe Research in 2021, working with Justin Salamon and Dingzeyu Li, and collaborated with Daniel McDuff and Yale Song at Microsoft Research in 2020.
She is broadly interested in multimodal learning and computer vision. Her main interests include, but are not limited to, audio-visual and vision-language models, generative AI, and video understanding.
[Conference presentation] Bootstrap your own views: Masked ego-exo modeling for fine-grained view-invariant video representations. IEEE/CVF Conference on Computer Vision and Pattern Recognition, USA, 2025-06-13. Proceedings of the Computer Vision and Pattern Recognition Conference, 2025
[Conference presentation] Read, watch and scream! sound generation from text and video. AAAI Conference on Artificial Intelligence, USA, 2025-02-27. Proceedings of the AAAI Conference on Artificial Intelligence, 2025
[Conference presentation] Bridging vision and language spaces with assignment prediction. The Twelfth International Conference on Learning Representations, Austria, 2024-05-07. The Twelfth International Conference on Learning Representations, 2024
[Conference presentation] Let 2D Diffusion Model Know 3D-Consistency for Robust Text-to-3D Generation. The Twelfth International Conference on Learning Representations, Austria, 2024-05-07. The Twelfth International Conference on Learning Representations, 2024
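[Intellectual property] Training-free technique for preventing speaker generation in zero-shot text-to-speech. Domestic (Korea): patent, application filed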