Norio Kosaka (小坂 紀夫)
email: kosakaboat[at]gmail.com
CV | Scholar | Github | LinkedIn
Norio completed an MSc in Mathematics at Birkbeck, University of London, with a thesis
titled "Overview of Riemann surfaces", under the supervision of
Prof. Ben Fairbairn.
He also completed an MSc in Machine Learning with Distinction at Royal Holloway, University of
London, focusing on the intersection of Reinforcement Learning and Robotics in his Master's
thesis, under the guidance of Prof. Chris
Watkins.
Research interests: Mathematics (Riemann surfaces, Hyperbolic geometry, and Algebraic
geometry) and Machine Learning / Reinforcement Learning
Direct Camera Calibration from Vanishing Points via Polynomial Solvers
Norio Kosaka
ICCV 2025 Workshop on Camera Calibration and Pose Estimation (CALIPOSE), 2025
Summary
We replace fragile stratified calibration with a direct polynomial solver using homotopy continuation, achieving robust intrinsics recovery from vanishing points (100% synthetic success; strong real-data performance) without any training.
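For intuition, below is a minimal NumPy sketch of the classical constraint v_i^T ω v_j = 0 between the image of the absolute conic and pairs of orthogonal vanishing points, assuming zero skew and unit aspect ratio; it uses the linear substitution w = f^2 + cx^2 + cy^2 rather than the paper's homotopy-continuation polynomial solver, and all numbers are illustrative.

```python
import numpy as np

def calibrate_from_orthogonal_vps(v1, v2, v3):
    """Recover focal length f and principal point (cx, cy) from three mutually
    orthogonal vanishing points in homogeneous pixel coordinates, assuming
    zero skew and unit aspect ratio."""
    A, b = [], []
    for vi, vj in [(v1, v2), (v1, v3), (v2, v3)]:
        xi, yi = vi[0] / vi[2], vi[1] / vi[2]
        xj, yj = vj[0] / vj[2], vj[1] / vj[2]
        # v_i^T omega v_j = 0 expands to
        # xi*xj + yi*yj - cx*(xi+xj) - cy*(yi+yj) + (f^2 + cx^2 + cy^2) = 0;
        # substituting w = f^2 + cx^2 + cy^2 makes it linear in (cx, cy, w).
        A.append([xi + xj, yi + yj, -1.0])
        b.append(xi * xj + yi * yj)
    cx, cy, w = np.linalg.solve(np.array(A), np.array(b))
    return np.sqrt(w - cx**2 - cy**2), (cx, cy)

# Quick synthetic check: the vanishing point of world axis i is K @ R[:, i].
rng = np.random.default_rng(0)
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
R, _ = np.linalg.qr(rng.standard_normal((3, 3)))   # orthonormal columns
print(calibrate_from_orthogonal_vps(*(K @ R).T))   # approx. 800.0, (320.0, 240.0)
```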
Mitigating Suboptimality of Deterministic Policy Gradients in Complex Q-functions
Ayush Jain, Norio Kosaka, Xinhu Li, Kyung-Min Kim, Erdem Bıyık, Joseph J. Lim
Reinforcement Learning Conference (RLC), 2025
Outstanding Paper Award on Empirical Reinforcement Learning Research
Summary
We introduce SAVO, a successive-actor method that prunes low-value regions in complex Q-landscapes so TD3 avoids local optima, delivering higher returns and better sample efficiency across MuJoCo, Adroit, RecSim and gridworld.
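Below is a minimal PyTorch sketch of only the greedy action-selection step over several actors' proposals (network sizes and names are illustrative); SAVO additionally trains each successive actor on a surrogate Q-function in which lower-value regions are pruned, which is not shown here.

```python
import torch
import torch.nn as nn

state_dim, action_dim, n_actors = 8, 2, 3   # illustrative sizes

def mlp(in_dim, out_dim):
    return nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, out_dim))

actors = [nn.Sequential(mlp(state_dim, action_dim), nn.Tanh()) for _ in range(n_actors)]
critic = mlp(state_dim + action_dim, 1)     # Q(s, a)

@torch.no_grad()
def select_action(state):
    # Evaluate every actor's proposal under the critic and act greedily.
    proposals = torch.stack([actor(state) for actor in actors])          # (k, action_dim)
    q = critic(torch.cat([state.expand(n_actors, -1), proposals], -1))   # (k, 1)
    return proposals[q.argmax()]

action = select_action(torch.randn(state_dim))
```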
Enhancing Actor-Critic Decision-Making with Afterstate Models for Continuous Control
Norio Kosaka
ICML Workshop: Aligning Reinforcement Learning Experimentalists and Theorists, 2024
Summary
We make critics evaluate the predicted afterstate (next state) rather than raw actions, simplifying value estimation; plugged into DDPG and SAC, this yields faster, more stable learning on MuJoCo, PaintGym and RecSim.
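A minimal PyTorch sketch of the core idea, assuming a learned dynamics model that predicts the afterstate (all sizes and names are illustrative, and the full DDPG/SAC integration is omitted):

```python
import torch
import torch.nn as nn

state_dim, action_dim = 8, 2  # illustrative sizes

dynamics = nn.Sequential(nn.Linear(state_dim + action_dim, 64), nn.ReLU(),
                         nn.Linear(64, state_dim))           # predicts the afterstate
value = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                      nn.Linear(64, 1))                      # values states, not (s, a) pairs
actor = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                      nn.Linear(64, action_dim), nn.Tanh())

def actor_loss(state):
    # Score the actor's action by the value of the predicted afterstate,
    # rather than by a Q(s, a) critic over raw actions.
    afterstate = dynamics(torch.cat([state, actor(state)], dim=-1))
    return -value(afterstate).mean()

loss = actor_loss(torch.randn(16, state_dim))
loss.backward()
```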
Direct Preference-based Policy Optimization without Reward Modeling
Gaon An*, Junhyeok Lee*, Xingdong Zuo, Norio Kosaka, Kyung-Min Kim, Hyun Oh Song
Neural Information Processing Systems (NeurIPS), 2023
Summary
DPPO learns policies directly from preference labels (no reward model), improving robustness and efficiency on D4RL/Adroit/Kitchen and transferring to RLHF for LLMs.
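As a rough illustration of learning from preferences without a reward model, the sketch below applies a Bradley-Terry-style logistic loss to segment log-likelihoods under a Gaussian policy; this is a generic stand-in with assumed shapes and names, not DPPO's actual objective.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

state_dim, action_dim = 8, 2  # illustrative sizes
policy = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, action_dim))

def segment_log_prob(states, actions):
    # Sum of Gaussian log-probabilities of the taken actions along a segment.
    mean = policy(states)
    return torch.distributions.Normal(mean, 1.0).log_prob(actions).sum()

def preference_loss(seg_preferred, seg_rejected):
    # Push the policy to assign higher likelihood to the preferred segment.
    lp_w = segment_log_prob(*seg_preferred)
    lp_l = segment_log_prob(*seg_rejected)
    return -F.logsigmoid(lp_w - lp_l)

seg_a = (torch.randn(10, state_dim), torch.randn(10, action_dim))
seg_b = (torch.randn(10, state_dim), torch.randn(10, action_dim))
loss = preference_loss(seg_a, seg_b)
loss.backward()
```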
Know Your Action Set: Learning Action Relations for Reinforcement Learning
Ayush Jain*, Norio Kosaka*, Kyung-Min Kim, Joseph J. Lim
International Conference on Learning Representations (ICLR) 2022, Apr. 2022
Summary
AGILE uses a graph-attention policy to model dependencies between available actions, boosting performance and generalisation in tool-use and recommender settings.
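A minimal PyTorch sketch in which ordinary self-attention stands in for AGILE's graph-attention module: each available action attends to the others before being scored together with the state (sizes and names are illustrative).

```python
import torch
import torch.nn as nn

state_dim, act_emb_dim = 8, 16  # illustrative sizes
attn = nn.MultiheadAttention(embed_dim=act_emb_dim, num_heads=2, batch_first=True)
scorer = nn.Sequential(nn.Linear(state_dim + act_emb_dim, 64), nn.ReLU(), nn.Linear(64, 1))

def action_logits(state, action_embs):
    # Relate the available actions to one another, then score each one
    # jointly with the state representation.
    relational, _ = attn(action_embs, action_embs, action_embs)   # (1, n_actions, emb)
    n = relational.shape[1]
    state_rep = state.unsqueeze(1).expand(-1, n, -1)              # (1, n_actions, state_dim)
    return scorer(torch.cat([state_rep, relational], dim=-1)).squeeze(-1)

logits = action_logits(torch.randn(1, state_dim), torch.randn(1, 5, act_emb_dim))
```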
PlaNet of the Bayesians: Reconsidering and Improving Deep Planning Network by Incorporating Bayesian Inference
Masashi Okada, Norio Kosaka, Tadahiro Taniguchi
International Conference on Intelligent Robots and Systems (IROS) 2020, USA, Oct. 2020
Summary
We add Bayesian model and action uncertainty (ensembles + probabilistic MPC) to PlaNet, improving planning robustness and sample efficiency on DeepMind Control Suite.
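A minimal PyTorch sketch of uncertainty-aware planning with a small dynamics ensemble and a cross-entropy-method loop that averages candidate returns over ensemble members; it plans in raw state space with illustrative sizes and names, unlike PlaNet's latent-space model, and omits the paper's variational inference.

```python
import torch
import torch.nn as nn

state_dim, action_dim, ensemble_size, horizon = 8, 2, 5, 12  # illustrative sizes
ensemble = [nn.Sequential(nn.Linear(state_dim + action_dim, 64), nn.ReLU(),
                          nn.Linear(64, state_dim + 1))       # next state + reward
            for _ in range(ensemble_size)]

@torch.no_grad()
def plan(state, n_candidates=64, n_iters=5, top_k=8):
    # Cross-entropy method over action sequences, scored under every
    # ensemble member so the plan accounts for model disagreement.
    mean = torch.zeros(horizon, action_dim)
    std = torch.ones(horizon, action_dim)
    for _ in range(n_iters):
        actions = mean + std * torch.randn(n_candidates, horizon, action_dim)
        returns = torch.zeros(n_candidates)
        for model in ensemble:                      # average return over models
            s = state.expand(n_candidates, -1)
            for t in range(horizon):
                out = model(torch.cat([s, actions[:, t]], dim=-1))
                s, r = out[:, :-1], out[:, -1]
                returns += r / ensemble_size
        elite = actions[returns.topk(top_k).indices]
        mean, std = elite.mean(0), elite.std(0)
    return mean[0]                                  # first action of the refined plan

action = plan(torch.randn(1, state_dim))
```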
Has it explored enough
Norio Kosaka
Master's Thesis at Royal Holloway, University of London, Sep. 2019
Summary
An ablation-driven study of DDPG exploration (noise processes, BatchNorm, on/off-policy) vs SAC/GNN policies on MuJoCo, Centipede and gridworld, explaining when and why DDPG fails to explore.
Hindsight Experience Replay on ROS
Norio Kosaka
ROSDevCon19, Jun. 2019
DIY SLAM 3-Wheeled Car
Norio Kosaka
Individual Project, Mar. 2019
Extracurricular Activities
- Robotics Club, Rakuten, Tokyo, Japan
- University Official Rowing Club, Waseda University, Tokyo, Japan
  - Achieved 5th place in Men’s 8+ at the 34th All-Japan Lightweight Championship (Japan Cup) (Watch Video) (Results Table)
  - 1st place in the 2013 Waseda-Keio Regatta (Watch Video); 2nd place in the 2014 Waseda-Keio Regatta