
Jongha Kim

M.S. & Ph.D. integrated student in the MLV Lab, advised by Prof. Hyunwoo J. Kim.

Department of Computer Science and Engineering at Korea University, Seoul, Republic of Korea.

My goal is to develop personalized and reliable multimodal AI agents. My research focuses on personalizing Multimodal Large Language Models, exploring post-training methods (e.g., DPO), retrieval-based systems (e.g., RAG), and ways to leverage structured representations that complement these models. For more information, please see my CV.

If you are interested in collaboration or have opportunities that align with my expertise, please feel free to reach out via email.

selected publications [full list]

(*) denotes equal contribution

  1. WACV
    Relevance-aware Multi-context Contrastive Decoding for Retrieval-augmented Visual Question Answering
    Jongha Kim, Byungoh Ko, Jeehye Na, Jinsung Yoon, and Hyunwoo J. Kim
    In IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2026)
  2. AAAI
    TabFlash: Efficient Table Understanding with Progressive Question Conditioning and Token Focusing
    Jongha Kim, Minseong Bae, Sanghyeok Lee, Jinsung Yoon, and Hyunwoo J. Kim
    In AAAI Conference on Artificial Intelligence (AAAI 2026)
  3. InfoScience
    Improved Query Specialization for Transformer-based Visual Relationship Detection
    Jongha Kim, Jihwan Park, Jinyoung Park, Jinyoung Kim, Sehyung Kim, and Hyunwoo J. Kim
    In Information Sciences (2026)
  4. AAAI
    VidChain: Chain-of-Tasks with Metric-based Direct Preference Optimization for Dense Video Captioning
    Ji Soo Lee*, Jongha Kim*, Jeehye Na, Jinyoung Park, and Hyunwoo J. Kim
    In AAAI Conference on Artificial Intelligence (AAAI 2025)
  5. CVPR
    Groupwise Query Specialization and Quality-Aware Multi-Assignment for Transformer-based Visual Relationship Detection
    Jongha Kim*, Jihwan Park*, Jinyoung Park*, Jinyoung Kim, Sehyung Kim, and Hyunwoo J. Kim
    In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2024)