8/1 Friday (Online) |
8/3 Sunday |
8/4 Monday |
8/5 Tuesday |
8/6 Wednesday | |||
09:00 |
Oral 3 (Chair: Prof. 오태현/KAIST) ■ 9:00 - 10:00 |
Mentoring (Work & Life) ■ 9:00 - 10:00 | |||||
09:30 |
Opening ■ 9:30 - 10:00 | ||||||
10:00 |
Oral 1 (Chair: Prof. 김현우/KAIST) ■ 10:00 - 11:00 |
Oral 4 (Chair: Prof. 이승용/POSTECH) ■ 10:00 - 11:00 |
Oral 6 (Chair: Prof. 윤국진/KAIST) ■ 10:00 - 11:00 | ||||
10:30 |
Workshop 1 ■ 10:30 - 12:00 |
||||||
11:00 |
Keynote 1: Shree Nayar, Columbia University (Chair: Prof. 박인규/인하대) ■ 11:00 - 12:00 |
Keynote 2: Michal Irani, Weizmann Institute of Science (Chair: Prof. 심현정/KAIST) ■ 11:00 - 12:00 |
Oral 7 (Chair: Prof. 임성훈/DGIST) ■ 11:00 - 12:00 | ||||
11:30 |
|||||||
12:00 |
Lunch (Not provided) ■ 12:00 - 13:30 |
Lunch (Not provided) ■ 12:00 - 13:30 |
Lunch (Not provided) ■ 12:00 - 13:30 |
Lunch (Not provided) ■ 12:00 - 13:30 | |||
12:30 |
|||||||
13:00 |
|||||||
13:30 |
Tutorial 1 ■ 13:30 - 14:45 |
Oral 2 (Chair: Prof. 박종일/한양대) ■ 13:30 - 14:30 |
Oral 5 (Chair: Prof. 서용덕/서강대) ■ 13:30 - 14:30 |
Oral 8 (Chair: Prof. 최종현/서울대) ■ 13:30 - 14:30 | |||
14:00 |
|||||||
14:30 |
Industry 1 (Chair: Prof. 김원준/건국대) ■ 14:30 - 15:30 |
Industry 2 (Chair: Prof. 한보형/서울대) ■ 14:30 - 15:30 |
Poster 3 ■ 14:30 - 16:00 | ||||
15:00 |
Tutorial 2 ■ 15:00 - 16:15 |
||||||
15:30 |
Poster 1 ■ 15:30 - 17:00 |
Poster 2 ■ 15:30 - 17:00 | |||||
16:00 |
KCCV Program Committee ■ 16:00 - 17:00 |
Keynote 3: Gul Varol, École des Ponts ParisTech (Chair: Prof. 심현정/KAIST) ■ 16:00 - 17:00 | |||||
16:30 |
Tutorial 3 ■ 16:30 - 17:45 | KCVS Board Meeting (Room 320) ■ 16:30 - 15:30 | |||||
17:00 |
Doctoral Colloquium ■ 17:00 - 18:30 |
KCVS General Meeting ■ 17:00 - 18:00 |
Closing Remarks ■ 17:00 - 17:30 | ||||
17:30 |
|||||||
18:00 |
Workshop.1 | 10:30 ~ 11:00 | AI Research in Industry |
|
Workshop.2 | 11:00 ~ 11:30 | In Search of Fragments of Truth |
|
Workshop.3 | 11:30 ~ 12:00 | Paper Counts and Citation Counts: What Comes Next? | 오성준 (Univ. of Tübingen) |
Tutorial.1 | 13:30 ~ 15:00 | Identity-preserving Distillation Sampling by Fixed-Point Iterator (CVPR 2025) /Dual Recursive Feedback on Generation and Appearance Latents for Pose-Robust Text-to-Image Diffusion (ICCV 2025) | 진경환 (고려대) |
Tutorial.2 | 15:00 ~ 16:30 | Self-supervised Monocular Depth Estimation Robust to Reflective Surface Leveraged by Triplet Mining (ICLR 2025) | 임성훈 (DGIST) |
Tutorial.3 | 16:30 ~ 17:45 | Pseudo-RIS: Distinctive Pseudo-Supervision Generation for Referring Image Segmentation (ECCV 2024) | 손진희 (GIST) |
O1.1 | 10:00 ~ 10:20 | EditSplat: Multi-View Fusion and Attention-Guided Optimization for View-Consistent 3D Scene Editing with 3D Gaussian Splatting | 김상필(고려대) |
O1.2 | 10:20 ~ 10:40 | Latent space Super-Resolution for Higher-Resolution Image Generation with Diffusion Models | 김선주(연세대) |
O1.3 | 10:40 ~ 11:00 | A Simple yet Universal Framework for Depth Completion | 전해곤(GIST) |
Keynote 1 | 11:00 ~ 12:00 | Computational Imaging and Future Cameras | Shree Nayar(Columbia University) |
O2.1 | 13:30 ~ 13:50 | MASH-VLM: Mitigating Action-Scene Hallucination in Video-LLMs through Disentangled Spatial-Temporal Representations | 최진우(경희대) |
O2.2 | 13:50 ~ 14:10 | Neural cover selection for steganography | 김혜지(University of Texas at Austin) |
O2.3 | 14:10 ~ 14:30 | Prompt-Driven Contrastive Learning for Transferable Adversarial Attacks | 윤국진(KAIST) |
I1.1 | 14:30 ~ 14:50 | TBD | DeltaX |
I1.2 | 14:50 ~ 15:10 | TBD | Lunit Inc. |
I1.3 | 15:10 ~ 15:30 | TBD | Nexdata |
Poster No.
| Time | Paper Name | Advisor | Presenter |
1 | 15:30 ~ 17:00 | GRAE-3DMOT: Geometry Relation-Aware Encoder for Online 3D Multi-Object Tracking | 고영준 | 김현섭 |
2 | Classification Matters: Improving Video Action Detection with Class-Specific Attention | 곽수하 | 이진성 | |
3 | Learning Audio-guided Video Representation with Gated Attention for Video-Text Retrieval | 곽수하 | 정보승 | |
4 | Towards Generalizable Scene Change Detection | 김의환 | 김재우 | |
5 | VideoMamba: Spatio-Temporal Selective State Space Model | 김창익 | 김희선 | |
6 | REWIND: Real-Time Egocentric Whole-Body Motion Diffusion with Exemplar-Based Identity Conditioning | 김창익 | 이지현 | |
7 | Diffusion Prior-Based Amortized Variational Inference for Noisy Inverse Problems | 김현우 | 이소진 | |
8 | Deblurring 3D Gaussian Splatting | 박은병 | 이병현 | |
9 | Event Ellipsometer: Event-based Mueller-Matrix Video Imaging | 백승환 | 문윤성 | |
10 | BAM-DETR: Boundary-Aligned Moment Detection Transformer for Temporal Sentence Grounding in Videos | 변혜란 | 이필현 | |
11 | Random Conditioning for Diffusion Model Compression with Distillation | 서홍석 | 박세환 | |
12 | SyncTweedies: A General Generative Framework Based on Synchronized Diffusions | 성민혁 | 여경민 | |
13 | Bootstrap Your Own Views: Masked Ego-Exo Modeling for Fine-grained View-invariant Video Representations | 손광훈 | 박정인 | |
14 | Rethinking Data Augmentation for Robust LiDAR Semantic Segmentation in Adverse Weather | 심현정 | 박준성 | |
15 | Nearly Zero-Cost Protection Against Mimicry by Personalized Diffusion Models | 안남혁 | 안남혁 | |
16 | TCFG: Tangential Damping Classifier-free Guidance | 어영정 | 어영정 | |
17 | Dr. Splat: Directly Referring 3D Gaussian Splatting via Direct Language Embedding Registration | 오태현 | 김준성 | |
18 | Do Your Best and Get Enough Rest for Continual Learning | 유종빈 | 강한결 | |
19 | Prompt-Driven Contrastive Learning for Transferable Adversarial Attacks | 윤국진 | 양훈민 | |
20 | WISH: Weakly Supervised Instance Segmentation using Heterogeneous Labels | 윤국진 | 권혁준 | |
21 | OmniSplat: Taming Feed-Forward 3D Gaussian Splatting for Omnidirectional Images with Editable Capabilities | 이경무 | 이수영 | |
22 | RAD: Region-Aware Diffusion Models for Image Inpainting | 이민식 | 김소라 | |
23 | Style-Editor: Text-driven object-centric style editing | 임성훈 | 박지훈 | |
24 | 6DoF Head Pose Estimation through Explicit Bidirectional Interaction with Face Geometry | 창주용 | 전성호 | |
25 | A Simple yet Universal Framework for Depth Completion | 전해곤 | 박진휘 | |
26 | Continuous Locomotive Crowd Behavior Generation | 전해곤 | 배인환 | |
27 | Black Hole-Driven Identity Absorbing in Diffusion Models | 정순기 | 사허야르 무하마드 | |
28 | Probabilistic Weather Forecasting with Deterministic Guidance-based Diffusion Model | 조동현 | 조동현 | |
29 | Video Summarization with Large Language Models | 조민수 | 이민정 | |
30 | Leveraging 3D Geometric Priors in 2D Rotation Symmetry Detection | 조민수 | 서아현 | |
31 | Towards Lossless Implicit Neural Representation via Bit Plane Decomposition | 진경환 | 한우경 | |
32 | Identity-preserving Distillation Sampling by Fixed-Point Iterator | 진경환 | 김선화 | |
33 | FIFO-Diffusion: Generating Infinite Videos from Text without Training | 한보형 | 강준오 | |
34 | Subnet-Aware Dynamic Supernet Training for Neural Architecture Search | 함범섭 | 전제민 | |
35 | Temporal Alignment-Free Video Matching for Few-shot Action Recognition | 허재필 | 이수빈 | |
36 | Question-Aware Gaussian Experts for Audio-Visual Question Answering | 홍성은 | 김홍엽 | |
37 | Your Large Vision-Language Model Only Needs A Few Attention Heads For Visual Grounding | 황성재 | 강세일 | |
38 | Cross-View Completion Models are Zero-shot Correspondence Estimators | 김승룡 | 안홍규 |
Doctoral Colloquium | 17:00 ~ 18:00 | TBD |
O3.1 | 9:00 ~ 9:20 | A New Multi-Source Light Detection Benchmark and Semi-Supervised Focal Light Detection | 배승환(인하대) |
O3.2 | 9:20 ~ 9:40 | GOAL: Global-local Object Alignment Learning | 엄찬호(중앙대) |
O3.3 | 9:40 ~ 10:00 | Generative Densification: Learning to Densify Gaussians for High-Fidelity Generalizable 3D Reconstruction | 박은병(연세대) |
O4.1 | 10:00 ~ 10:20 | RGBD GS-ICP SLAM | 유현우(성균관대) |
O4.2 | 10:20 ~ 10:40 | Chameleon: A Data-Efficient Generalist for Dense Visual Prediction in the Wild | 홍승훈(KAIST) |
O4.3 | 10:40 ~ 11:00 | Learning Audio-guided Video Representation with Gated Attention for Video-Text Retrieval | 곽수하(POSTECH) |
Keynote 2 | 11:00 ~ 12:00 | Reading Minds & Machines | Michal Irani(Weizmann Institute of Science) |
O5.1 | 13:30 ~ 13:50 | Beyond the Contact: Discovering Comprehensive Affordance for 3D Objects from Pre-trained 2D Diffusion Models | 주한별(서울대) |
O5.2 | 13:50 ~ 14:10 | Data-free Universal Adversarial Perturbation with Pseudo-semantic Prior | 손진희(GIST) |
O5.3 | 14:10 ~ 14:30 | NL2Contact: Natural Language Guided 3D Hand-Object Contact Modeling with Diffusion Model | 장형진(University of Birmingham) |
I2.1 | 14:30 ~ 14:50 | TBD | POSCO DX |
I2.2 | 14:50 ~ 15:10 | TBD | Qualcomm |
I2.3 | 15:10 ~ 15:30 | TBD | SK magic |
Poster No.
| Time | Paper Name | Advisor | Presenter |
1 | 15:30 ~ 17:00 | Efficient Neural Video Representation with Temporally Coherent Modulation | 오도관 | 신승준 |
2 | Improving Sound Source Localization with Joint Slot Attention on Image and Audio | 곽수하 | 김인호 | |
3 | Bootstrapping Top-down Information for Self-modulating Slot Attention | 곽수하 | 김동원 | |
4 | Online Temporal Action Localization with Memory-Augmented Transformer | 곽수하 | 송영길 | |
5 | LabelDistill: Label-guided Cross-modal Knowledge Distillation for Camera-based 3D Object Detection | 김동석 | 김산민 | |
6 | VerbDiff: Text-Only Diffusion Models with Enhanced Interaction Awareness | 김동진 | 차승주 | |
7 | BiM-VFI: Bidirectional Motion Field-Guided Frame Interpolation for Video with Non-uniform Motions | 김문철 | 서원용 | |
8 | SplineGS: Robust Motion-Adaptive Spline for Real-Time Dynamic 3D Gaussians from Monocular Video | 김문철 | 박종민 | |
9 | Enhanced Motion Forecasting with Visual Relation Reasoning | 김상필 | 백하닮 | |
10 | Latent space Super-Resolution for Higher-Resolution Image Generation with Diffusion Models | 김선주 | 정진호 | |
11 | ControlFace: Harnessing Facial Parametric Control for Face Rigging | 김승룡 | 양진이 | |
12 | DIAL: Dense Image-text ALignment for Weakly Supervised Semantic Segmentation | 김영빈 | 권준형 | |
13 | DropGaussian: Structural Regularization for Sparse-view Gaussian Splatting | 김원준 | 박현우 | |
14 | MonoWAD: Weather-Adaptive Diffusion Model for Robust Monocular 3D Object Detection | 김정욱 | 정준영 | |
15 | Parameter Efficient Mamba Tuning via Projector-targeted Diagonal-centric Linear Transformation | 김창익 | 함석일 | |
16 | Expressive Whole-Body 3D Gaussian Avatar | 문경식 | 문경식 | |
17 | TADFormer: Task-Adaptive Dynamic TransFormer for Efficient Multi-Task Learning | 민동보 | 백승민 | |
18 | SelfSplat: Pose-Free and 3D Prior-Free Generalizable 3D Gaussian Splatting | 박은병 | 유지상 | |
19 | Link to the Past: Temporal Propagation for Fast 3D Human Reconstruction from Monocular Video | 박인규 | Matthew Marchellus | |
20 | A New Multi-Source Light Detection Benchmark and Semi-Supervised Focal Light Detection | 배승환 | 배승환 | |
21 | GrounDiT: Grounding Diffusion Transformers via Noisy Patch Transplantation | 성민혁 | 이유승 | |
22 | Fine-Grained Image-Text Correspondence with Cost Aggregation for Open-Vocabulary Part Segmentation | 심현정 | 최지호 | |
23 | Memory-Efficient Fine-Tuning for Quantized Diffusion Model | 심현정 | 임서현 | |
24 | Optical-Flow Guided Prompt Optimization for Coherent Video Generation | 예종철 | 김제민 | |
25 | Dense-SfM: Structure from Motion with Dense Consistent Matching | 유승주 | 유승주 | |
26 | Multi-modal Knowledge Distillation-based Human Trajectory Forecasting | 윤국진 | 정재우 | |
27 | TALoS: Enhancing Semantic Scene Completion via Test-time Adaptation on the Line of Sight | 윤국진 | 김지훈 | |
28 | NL2Contact: Natural Language Guided 3D Hand-Object Contact Modeling with Diffusion Model | 장형진 | 장형진 | |
29 | PersonaBooth: Personalized Text-to-Motion Generation | 김보은 | 김보은 | |
30 | Fully Explicit Dynamic Gaussian Splatting | 전해곤 | 이준오 | |
31 | Mosaic3D: Foundation Dataset and Model for Open-Vocabulary 3D Segmentation | 조민수 | 이준하 | |
32 | Mask2Map: Vectorized HD Map Construction Using Bird's Eye View Segmentation Masks | 최준원 | 김정호 | |
33 | DEVIAS: Learning Disentangled Video Representations of Action and Scene | 최진우 | 안지오 | |
34 | Hierarchical Visual Feature Aggregation for OCR-Free Document Understanding | 한보형 | 최진영 | |
35 | Auto-Encoded Supervision for Perceptual Image Super-Resolution | 허재필 | 이민규 | |
36 | HYPE: Hyperbolic Entailment Filtering for Underspecified Images and Texts | 윤상두 | 윤상두 | |
37 | Tripartite Weight-Space Ensemble for Few-Shot Class-Incremental Learning | 이준태, 윤성락 | TBD |
O6.1 | 10:00 ~ 10:20 | LC-Mamba: Local and Continuous Mamba with Shifted Windows for Frame Interpolation | 이채은(한양대) |
O6.2 | 10:20 ~ 10:40 | DynScene: Scalable Generation of Dynamic Robotic Manipulation Scenes for Embodied AI | 김희원(숭실대) |
O6.3 | 10:40 ~ 11:00 | Bi-directional Contextual Attention for 3D Dense Captioning | 김건희(서울대) |
O7.1 | 11:00 ~ 11:20 | TADFormer: Task-Adaptive Dynamic TransFormer for Efficient Multi-Task Learning | 민동보(이화여대) |
O7.2 | 11:20 ~ 11:40 | Identity-preserving Distillation Sampling by Fixed-Point Iterator | 차은주(숙명여대) |
O7.3 | 11:40 ~ 12:00 | BF-STVSR: B-Splines and Fourier—Best Friends for High Fidelity Spatial-Temporal Video Super-Resolution | 유재준(UNIST) |
O8.1 | 13:30 ~ 13:50 | Random Conditioning for Diffusion Model Compression with Distillation | 서홍석(고려대) |
O8.2 | 13:50 ~ 14:10 | Mosaic3D: Foundation Dataset and Model for Open-Vocabulary 3D Segmentation | 조민수(POSTECH) |
O8.3 | 14:10 ~ 14:30 | Model Stock: All we need is just a few fine-tuned models | 윤상두(NAVER) |
Poster No.
| Time | Paper Name | Advisor | Presenter |
1 | 14:30 ~ 16:00 | ContactField: Implicit Field Representation for Multi-Person Interaction Geometry | 임화섭 | 유택근 |
2 | Seurat: From Moving Points to Depth | 이준영 | 신희성 | |
3 | Embedding-Free Transformer with Inference Spatial Reduction for Efficient Semantic Segmentation | 강석주 | 유현우 | |
4 | Parrot: Pareto-optimal Multi-Reward Reinforcement Learning Framework for Text-to-Image Generation | Feng Yang | 오경록 | |
5 | Decomposition of Neural Discrete Representations for Large-Scale 3D Mapping | 김은태 | 박민성 | |
6 | VisionTrap: Vision-Augmented Trajectory Prediction Guided by Textual Descriptions | 김진규 | 문석하 | |
7 | DynScene: Scalable Generation of Dynamic Robotic Manipulation Scenes for Embodied AI | 김희원 | 이상민, 박성용 | |
8 | SALAD: Skeleton-aware Latent Diffusion for Text-driven Motion Generation and Editing | 노준용 | 홍석현 | |
9 | ESC: Erasing Space Concept for Knowledge Deletion | 박경문, 황효석 | 이태영 | |
10 | ShowMak3r: Compositional TV Show Reconstruction | 박재식 | 김상민 | |
11 | Narrating the Video: Boosting Text-Video Retrieval via Comprehensive Utilization of Frame-Level Captions | 박상효, 박혜영 | 허찬 | |
12 | Pixel-aligned RGB-NIR Stereo Imaging and Dataset for Robot Vision | 백승환 | 김진녕 | |
13 | Towards Certifiably Robust Face Recognition | 서재홍 | 백승훈 | |
14 | VideoHandles: Editing 3D Object Compositions in Videos Using Video Generative Priors | 성민혁 | 구주일 | |
15 | Learning Representation for Multitask Learning Through Self-supervised Auxiliary Learning | 손영두 | 신석원 | |
16 | Data-free Universal Adversarial Perturbation with Pseudo-semantic Prior | 손진희 | 이찬희 | |
17 | Two is Better than One: an Efficient Ensemble Defense for Robust and Compact Model | 송병철 | 정유진 | |
18 | No Thing, Nothing: Highlighting Safety-Critical Classes for Robust LiDAR Semantic Segmentation | 심현정 | 박준성 | |
19 | GOAL: Global-local Object Alignment Learning | 엄찬호 | 최현규 | |
20 | Minority-Focused Text-to-Image Generation via Prompt Optimization | 예종철 | 음수빈 | |
21 | BF-STVSR: B-Splines and Fourier—Best Friends for High Fidelity Spatial-Temporal Video Super-Resolution | 유재준, 진경환 | 김현진 | |
22 | Syn-to-Real Domain Adaptation for Point Cloud Completion via Part-based Approach | 윤국진 | 김지훈 | |
23 | On-the-fly Category Discovery for LiDAR Semantic Segmentation | 윤국진 | 김현성 | |
24 | Posture-Informed Muscular Force Learning for Robust Hand Pressure Estimation | 윤상호 | 정한석 | |
25 | GeoAvatar: Geometrically-Consistent Multi-Person Avatar Reconstruction from Sparse Multi-View Videos | 이주호 | 이주호 | |
26 | Blind Image Deblurring with Noise-Robust Kernel Estimation | 장무석 | 이찬석 | |
27 | Is 'Right' Right? Enhancing Object Orientation Understanding in Multimodal Large Language Models through Egocentric Instruction Tuning | 장부루, 김범수 | 김은태 | |
28 | Kinetic Typography Diffusion Model | 전해곤 | 박선미 | |
29 | Amnesia as a Catalyst for Enhancing Black Box Pixel Attacks in Image Classification and Object Detection | 정재훈 | 송동수 | |
30 | Locality-Aware Interaction for Zero-Shot Human-Object Interaction Detection | 조민수 | 김상현 | |
31 | Dynamic Pseudo Labeling via Gradient Cutting for High-Low Entropy Exploration | 조성인 | 전주현 | |
32 | HUSH: Holistic Panoramic 3D Scene Understanding using Spherical Harmonics | 주경돈 | 이종성 | |
33 | ReALFRED: An Embodied Instruction Following Benchmark in Photo-Realistic Environments | 최종현 | 김병휘 | |
34 | Resilient Sensor Fusion under Adverse Sensor Failures via Multi-Modal Expert Fusion | 최준원 | 박건율 | |
35 | Integration of Global and Local Representations for Fine-grained Cross-modal Alignment | 한경식 | 진승완 | |
36 | Enhanced OoD Detection through Cross-Modal Alignment of Multi-Modal Representations | 황상흠 | 김정현 | |
37 | I2-SLAM: Inverting Imaging Process for Robust Photorealistic Dense SLAM | 김영민 | 배광탁 | |
38 | Model Stock: All We Need Is Just a Few Fine-Tuned Models | 윤상두, 한동윤 | 한동윤 |
Keynote 3 | 16:00 ~ 17:00 | TBD | Gul Varol(École des Ponts ParisTech) |