FingerVision

FingerVision is a vision-based tactile sensor for robot fingers. It is multimodal (contact force distribution, and proximity vision covering slip, deformation, and object pose estimation), easy to manufacture, physically robust, and cheap (the material cost is $50). →Project website

Learning from Demonstration of Pouring

This is a case study of "pouring" under the learning from demonstration (LfD) framework. The goal of this work is to explore how to represent, plan, and learn complex tasks that have many variations.

Learning Strategy Fusion

Learning Strategy (LS) Fusion is a method for fusing learning strategies (LSs) within a reinforcement learning framework. Normally, a suitable LS has to be chosen separately for each task. The proposed method automates this selection by fusing LSs.

SkyAI: Highly Modularized Reinforcement Learning Library

SkyAI is an open source software library of reinforcement learning methods. Its distinctive feature is a modular architecture that provides execution speed high enough for real robot systems together with high flexibility in designing learning systems.

DCOB: Action Space for Motion Learning of Large DoF Robots

DCOB is a method for generating a discrete action space from a set of basis functions that are given to approximate a value function. Its distinctive feature is its applicability to large domains.

Learning Humanoid Locomotion

This research applies our reinforcement learning methods to learning locomotion on a human-size humanoid robot. It studies a new learning scheme in which a primitive balancing controller is embedded in the robot during learning.

Model Decomposition

We proposed a method to decompose a dynamics model into task-specific and task-invariant elements. This method makes it possible to transfer the dynamics model of one task to other tasks.

DQL+: Reusing Memory of Dangerous Actions

DQL+ is a method that decomposes an action value function into two parts, one of which encodes the memory of dangerous actions (e.g. actions that cause the robot to fall down). This memory of dangerous actions is considered to be invariant to the task, so DQL+ allows it to be shared between tasks.
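The idea of splitting an action value function so that the danger-related part can be reused may be easier to see in a small tabular sketch. The code below is only an illustration and not the DQL+ algorithm itself: it assumes an additive split into a task table and a danger table, a separate penalty signal for dangerous outcomes, and a plain Q-learning style update. The class `DecomposedQ` and all of its parameter names are hypothetical.

```python
from collections import defaultdict


class DecomposedQ:
    """Tabular action values split into a task part and a danger part.

    Assumption (not from the source): Q(s, a) = Q_task(s, a) + Q_danger(s, a),
    where Q_danger learns only from a penalty given for dangerous outcomes.
    """

    def __init__(self, danger=None, alpha=0.5, gamma=0.9):
        self.task = defaultdict(float)
        # Passing in an existing danger table shares that memory between tasks.
        self.danger = danger if danger is not None else defaultdict(float)
        self.alpha = alpha
        self.gamma = gamma

    def value(self, state, action):
        """Combined action value used for action selection."""
        return self.task[(state, action)] + self.danger[(state, action)]

    def update(self, state, action, task_reward, danger_penalty,
               next_state, actions, done):
        """Update both tables from one transition."""
        if done:
            next_task = next_danger = 0.0
        else:
            # Both parts bootstrap from the action that is greedy
            # with respect to the combined value.
            best = max(actions, key=lambda a: self.value(next_state, a))
            next_task = self.task[(next_state, best)]
            next_danger = self.danger[(next_state, best)]
        key = (state, action)
        self.task[key] += self.alpha * (
            task_reward + self.gamma * next_task - self.task[key])
        self.danger[key] += self.alpha * (
            danger_penalty + self.gamma * next_danger - self.danger[key])


# Task A: the robot falls after taking "step" at "edge".
agent_a = DecomposedQ()
agent_a.update("edge", "step", task_reward=0.0, danger_penalty=-1.0,
               next_state=None, actions=["step", "stay"], done=True)

# Task B starts with an empty task table but reuses the danger memory.
agent_b = DecomposedQ(danger=agent_a.danger)
```

In this sketch `agent_b` already values "step" at "edge" negatively before it has gathered any experience of its own, which is the kind of reuse the description refers to.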