Norio Kosaka (小坂 紀夫)
email: kosakaboat[at]gmail.com
CV | Scholar | Github | LinkedIn
Norio completed an MSc in Mathematics at Birkbeck, University of London, with a thesis
titled "Overview of Riemann surfaces", under the supervision of
Prof. Ben Fairbairn.
He also completed an MSc in Machine Learning with Distinction at Royal Holloway, University of
London, focusing on the intersection of Reinforcement Learning and Robotics in his Master's
thesis, under the guidance of Prof. Chris
Watkins.
Research interests: Mathematics (Riemann surfaces, Hyperbolic geometry, and Algebraic
geometry) and Machine Learning / Reinforcement Learning
Direct Camera Calibration from Vanishing Points via Polynomial Solvers
Norio Kosaka
ICCV 2025 Workshop on Camera Calibration and Pose Estimation (CALIPOSE), 2025
Summary
We replace fragile stratified calibration with a direct polynomial solver using homotopy continuation, achieving robust intrinsics recovery from vanishing points (100% synthetic success; strong real-data performance) without any training.
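For intuition, below is a minimal NumPy sketch of the classical constraint v_i^T ω v_j = 0 between the image of the absolute conic and pairs of orthogonal vanishing points, assuming zero skew and unit aspect ratio; it uses the linear substitution w = f^2 + cx^2 + cy^2 rather than the paper's homotopy-continuation polynomial solver, and all numbers are illustrative.

```python
import numpy as np

def calibrate_from_orthogonal_vps(v1, v2, v3):
    """Recover focal length f and principal point (cx, cy) from three mutually
    orthogonal vanishing points in homogeneous pixel coordinates, assuming
    zero skew and unit aspect ratio."""
    A, b = [], []
    for vi, vj in [(v1, v2), (v1, v3), (v2, v3)]:
        xi, yi = vi[0] / vi[2], vi[1] / vi[2]
        xj, yj = vj[0] / vj[2], vj[1] / vj[2]
        # v_i^T omega v_j = 0 expands to
        # xi*xj + yi*yj - cx*(xi+xj) - cy*(yi+yj) + (f^2 + cx^2 + cy^2) = 0;
        # substituting w = f^2 + cx^2 + cy^2 makes it linear in (cx, cy, w).
        A.append([xi + xj, yi + yj, -1.0])
        b.append(xi * xj + yi * yj)
    cx, cy, w = np.linalg.solve(np.array(A), np.array(b))
    return np.sqrt(w - cx**2 - cy**2), (cx, cy)

# Quick synthetic check: the vanishing point of world axis i is K @ R[:, i].
rng = np.random.default_rng(0)
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
R, _ = np.linalg.qr(rng.standard_normal((3, 3)))   # orthonormal columns
print(calibrate_from_orthogonal_vps(*(K @ R).T))   # approx. 800.0, (320.0, 240.0)
```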
Mitigating Suboptimality of Deterministic Policy Gradients in Complex Q-functions
Ayush Jain, Norio Kosaka, Xinhu Li, Kyung-Min Kim, Erdem Bıyık, Joseph J. Lim
Reinforcement Learning Conference (RLC), 2025
Outstanding Paper Award on Empirical Reinforcement Learning Research
Summary
We introduce SAVO, a successive-actor method that prunes low-value regions in complex Q-landscapes so TD3 avoids local optima, delivering higher returns and better sample efficiency across MuJoCo, Adroit, RecSim and gridworld.
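Below is a minimal PyTorch sketch of only the greedy action-selection step over several actors' proposals (network sizes and names are illustrative); SAVO additionally trains each successive actor on a surrogate Q-function in which lower-value regions are pruned, which is not shown here.

```python
import torch
import torch.nn as nn

state_dim, action_dim, n_actors = 8, 2, 3   # illustrative sizes

def mlp(in_dim, out_dim):
    return nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, out_dim))

actors = [nn.Sequential(mlp(state_dim, action_dim), nn.Tanh()) for _ in range(n_actors)]
critic = mlp(state_dim + action_dim, 1)     # Q(s, a)

@torch.no_grad()
def select_action(state):
    # Evaluate every actor's proposal under the critic and act greedily.
    proposals = torch.stack([actor(state) for actor in actors])          # (k, action_dim)
    q = critic(torch.cat([state.expand(n_actors, -1), proposals], -1))   # (k, 1)
    return proposals[q.argmax()]

action = select_action(torch.randn(state_dim))
```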
Enhancing Actor-Critic Decision-Making with Afterstate Models for Continuous Control
Norio Kosaka
ICML Workshop: Aligning Reinforcement Learning Experimentalists and Theorists, 2024
Summary
We make critics evaluate the predicted afterstate (next state) rather than raw actions, simplifying value estimation; plugged into DDPG and SAC, this yields faster, more stable learning on MuJoCo, PaintGym and RecSim.
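A minimal PyTorch sketch of the core idea, assuming a learned dynamics model that predicts the afterstate (all sizes and names are illustrative, and the full DDPG/SAC integration is omitted):

```python
import torch
import torch.nn as nn

state_dim, action_dim = 8, 2  # illustrative sizes

dynamics = nn.Sequential(nn.Linear(state_dim + action_dim, 64), nn.ReLU(),
                         nn.Linear(64, state_dim))           # predicts the afterstate
value = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                      nn.Linear(64, 1))                      # values states, not (s, a) pairs
actor = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                      nn.Linear(64, action_dim), nn.Tanh())

def actor_loss(state):
    # Score the actor's action by the value of the predicted afterstate,
    # rather than by a Q(s, a) critic over raw actions.
    afterstate = dynamics(torch.cat([state, actor(state)], dim=-1))
    return -value(afterstate).mean()

loss = actor_loss(torch.randn(16, state_dim))
loss.backward()
```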
Direct Preference-based Policy Optimization without Reward Modeling
Gaon An*, Junhyeok Lee*, Xingdong Zuo, Norio Kosaka, Kyung-Min Kim, Hyun Oh Song
Neural Information Processing Systems (NeurIPS), 2023
Summary
DPPO learns policies directly from preference labels (no reward model), improving robustness and efficiency on D4RL/Adroit/Kitchen and transferring to RLHF for LLMs.
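As a rough illustration of learning from preferences without a reward model, the sketch below applies a Bradley-Terry-style logistic loss to segment log-likelihoods under a Gaussian policy; this is a generic stand-in with assumed shapes and names, not DPPO's actual objective.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

state_dim, action_dim = 8, 2  # illustrative sizes
policy = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, action_dim))

def segment_log_prob(states, actions):
    # Sum of Gaussian log-probabilities of the taken actions along a segment.
    mean = policy(states)
    return torch.distributions.Normal(mean, 1.0).log_prob(actions).sum()

def preference_loss(seg_preferred, seg_rejected):
    # Push the policy to assign higher likelihood to the preferred segment.
    lp_w = segment_log_prob(*seg_preferred)
    lp_l = segment_log_prob(*seg_rejected)
    return -F.logsigmoid(lp_w - lp_l)

seg_a = (torch.randn(10, state_dim), torch.randn(10, action_dim))
seg_b = (torch.randn(10, state_dim), torch.randn(10, action_dim))
loss = preference_loss(seg_a, seg_b)
loss.backward()
```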
Know Your Action Set: Learning Action Relations for Reinforcement Learning
Ayush Jain*, Norio Kosaka*, Kyung-Min Kim, Joseph J. Lim
International Conference on Learning Representations (ICLR) 2022, Apr. 2022
Summary
AGILE uses a graph-attention policy to model dependencies between available actions, boosting performance and generalisation in tool-use and recommender settings.
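A minimal PyTorch sketch in which ordinary self-attention stands in for AGILE's graph-attention module: each available action attends to the others before being scored together with the state (sizes and names are illustrative).

```python
import torch
import torch.nn as nn

state_dim, act_emb_dim = 8, 16  # illustrative sizes
attn = nn.MultiheadAttention(embed_dim=act_emb_dim, num_heads=2, batch_first=True)
scorer = nn.Sequential(nn.Linear(state_dim + act_emb_dim, 64), nn.ReLU(), nn.Linear(64, 1))

def action_logits(state, action_embs):
    # Relate the available actions to one another, then score each one
    # jointly with the state representation.
    relational, _ = attn(action_embs, action_embs, action_embs)   # (1, n_actions, emb)
    n = relational.shape[1]
    state_rep = state.unsqueeze(1).expand(-1, n, -1)              # (1, n_actions, state_dim)
    return scorer(torch.cat([state_rep, relational], dim=-1)).squeeze(-1)

logits = action_logits(torch.randn(1, state_dim), torch.randn(1, 5, act_emb_dim))
```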
PlaNet of the Bayesians: Reconsidering and Improving Deep Planning Network by Incorporating Bayesian Inference
Masashi Okada, Norio Kosaka, Tadahiro Taniguchi
International Conference on Intelligent Robots and Systems (IROS) 2020, USA, Oct. 2020
Summary
We add Bayesian model and action uncertainty (ensembles + probabilistic MPC) to PlaNet, improving planning robustness and sample efficiency on DeepMind Control Suite.
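A minimal PyTorch sketch of uncertainty-aware planning with a small dynamics ensemble and a cross-entropy-method loop that averages candidate returns over ensemble members; it plans in raw state space with illustrative sizes and names, unlike PlaNet's latent-space model, and omits the paper's variational inference.

```python
import torch
import torch.nn as nn

state_dim, action_dim, ensemble_size, horizon = 8, 2, 5, 12  # illustrative sizes
ensemble = [nn.Sequential(nn.Linear(state_dim + action_dim, 64), nn.ReLU(),
                          nn.Linear(64, state_dim + 1))       # next state + reward
            for _ in range(ensemble_size)]

@torch.no_grad()
def plan(state, n_candidates=64, n_iters=5, top_k=8):
    # Cross-entropy method over action sequences, scored under every
    # ensemble member so the plan accounts for model disagreement.
    mean = torch.zeros(horizon, action_dim)
    std = torch.ones(horizon, action_dim)
    for _ in range(n_iters):
        actions = mean + std * torch.randn(n_candidates, horizon, action_dim)
        returns = torch.zeros(n_candidates)
        for model in ensemble:                      # average return over models
            s = state.expand(n_candidates, -1)
            for t in range(horizon):
                out = model(torch.cat([s, actions[:, t]], dim=-1))
                s, r = out[:, :-1], out[:, -1]
                returns += r / ensemble_size
        elite = actions[returns.topk(top_k).indices]
        mean, std = elite.mean(0), elite.std(0)
    return mean[0]                                  # first action of the refined plan

action = plan(torch.randn(1, state_dim))
```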
Has it explored enough
Norio Kosaka
Master's Thesis at Royal Holloway, University of London, Sep. 2019
Summary
An ablation-driven study of DDPG exploration (noise processes, BatchNorm, on/off-policy) vs SAC/GNN policies on MuJoCo, Centipede and gridworld, explaining when and why DDPG fails to explore.
Hindsight Experience Replay on ROS
Norio Kosaka
ROSDevCon19, Jun. 2019
DIY SLAM 3-Wheeled Car
Norio Kosaka
Individual Project, Mar. 2019
Extracurricular Activities
- Robotics Club, Rakuten, Tokyo, Japan
- University Official Rowing Club, Waseda University, Tokyo, Japan
  - Achieved 5th place in Men’s 8+ at the 34th All-Japan Lightweight Championship (Japan Cup) (Watch Video) (Results Table)
  - 1st place in the 2013 Waseda-Keio Regatta (Watch Video); 2nd place in the 2014 Waseda-Keio Regatta