ML-Agents a.k.a. the Unity Machine Learning Agents Toolkit is an open-source project that enables games and simulations to serve as environments for training intelligent agents.
Since it requires a good understanding of the Unity environment, the classic Roll-a-Ball tutorial is a great first hands-on.
This document is intended as a guide to make smarter (read “add ML-Agents to”) any Unity scene and to train your agent in a basic setup. As a running example, we will use the scenario of a robotic arm that has to learn how to touch a target object while avoiding smashing against a table or knocking the object away. This scenario is strongly inspired by the Articulation Robot Demo. The linked example is based on ML-Agents Release 1. However, to be relevant as long as possible, the following guide is based on release 18 (Jun 10, 2021).

Articulation Robot Training Scene#

Set up the environment#

Please follow the instructions here to install ML-Agents on your system.
The best way to get started is to go through the Getting Started page. This document offers an overview of ML-Agents using the 3DBall example.

To test your installation:

  • In an empty project (with the right ML-Agents package installed): drag & drop the folder \ml-agents\Project\Assets\ML-Agents into your Assets folder (ml-agents is the project folder that you cloned from the Unity repository; be sure to check out the branch of your release to get compatible examples)

  • In the Unity editor, navigate to \Assets\ML-Agents\Examples\3DBall\Scenes and open the scene 3DBall

  • Run the scene in Default or Inference Only mode. You can change this setting in the Behavior Type drop-down menu of the Behavior Parameters component of your agent (see figure below). If this works, it will show the behavior of the already trained agent.

  • Retrain an example. Launch the training from the console (see the detailed instructions here), then, in Default mode, click the play button. When the training is completed, check the command line to get the path to the saved model. Copy the saved model into Unity (drag & drop) and assign it to the Model field of the Agent’s Behavior Parameters component.


Behavior Parameters component#

If all is working, congrats: you have done half of the work. The next step is to actually use ml-agents in your project.


Always check that all the packages that you are using are the correct version for your release


We had a great deal of problems working with Visual Studio Code. An alternative that worked well for us is JetBrains’ Rider. If this is your choice, follow the easy steps here. Independently of the IDE you choose, similar steps are usually required.

Develop (Ml-Agents)#

The following steps will guide you through the main steps required to use ml-agents in a scene.
  1. Set up the scene. Typically, you will have at least one agent (e.g., a robot arm), optionally a target (e.g., a cube to touch), and the environment (e.g., a table on which the robot and the target sit). All these elements are regular Unity game objects. It is a good idea to have an empty game object as a common parent: this will later allow you to create a prefab and duplicate your training areas.

  2. Add the ML-Agents package to the scene. The easiest way is usually via the menu Window | Package Manager. For this guide, we used release 18 (i.e., package 2.1.0-exp.1).

  3. Create the agent. Practically, this means creating a C# script and attaching it to the Agent game object. The content of the script strongly depends on what you want to accomplish. However, you will commonly find the following methods (you will find more information in the steps below):

    1. void Start(). The regular Unity Start() method.

    2. public override void OnEpisodeBegin(). Here, you will reset the initial state of a learning episode (e.g., place the target object at a random position on the table and reset the agent’s position).

    3. public override void CollectObservations(VectorSensor sensor). Here, you will manage the observations available to your agent.

    4. public override void OnActionReceived(ActionBuffers vectorAction). Here, you will manage the possible actions available to your agent. Usually, this is also where you define some rewards.

    5. public override void Heuristic(in ActionBuffers actionsOut). This is usually used for testing: it allows you to gain manual control of your agent.

  4. Add the component Decision Requester to the Agent. This component requests a decision at a fixed interval; each decision is an opportunity for the agent to take actions.

  5. Set up the Actions in the Behavior Parameters component. Discrete Actions generate integers, Continuous Actions generate floats; it is possible to use a combination of both. During training, these values are sampled by the policy (essentially at random at first). They are passed to the OnActionReceived() function and will be used to change the state of the agent (e.g., move it in the environment).

    • Configure the type of action (Discrete vs. Continuous) by defining (for each type) the number of dimensions (i.e., the number of rotating joints that can move simultaneously).

    • For Discrete Actions only, define the branch sizes (i.e., the number of possible values for each dimension; e.g., 360 values for a rotation in degrees, or 100 values for an x or y coordinate).

  6. Set up the Actions in the OnActionReceived() function. The actions received (according to the configuration of the previous point) should modify the state of the agent (e.g., move it).

  7. Set up the Observations in the Behavior Parameters component. Set the Vector Observation Space Size according to the variables observed by the Agent (e.g., set Space Size = 6 if your agent should observe/know the position in space (x, y, z coordinates) of the agent and of the target).

  8. Set up the Observations in the CollectObservations(VectorSensor sensor) function. Add the observations of interest to sensor (e.g., the position of the agent, the position of the target, the distance between the two objects, etc.). Use as few observations as possible and make sure that they are as relevant as possible to the goals that you want to achieve.

  9. Set the rewards. They can be positive or negative. This is typically done in OnActionReceived(), following the actions, or whenever some events are detected (e.g., in OnTriggerEnter()).
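The observation, action, and reward steps above can be sketched in a single Agent subclass. This is a minimal illustration, not the guide's actual robot-arm code: the `target` transform, the movement logic, and the reward values (1 for reaching the target, a small per-step penalty) are assumptions for a simple "move toward a cube" task. The overridden methods and the `VectorSensor`/`ActionBuffers` APIs are the real ones from ML-Agents Release 18 (package 2.1.0-exp.1).

```csharp
using Unity.MLAgents;
using Unity.MLAgents.Actuators;
using Unity.MLAgents.Sensors;
using UnityEngine;

// Illustrative sketch: an agent that learns to reach a target.
public class ReachTargetAgent : Agent
{
    public Transform target;     // assign in the Inspector (illustrative)
    public float moveSpeed = 2f;

    public override void CollectObservations(VectorSensor sensor)
    {
        // Two Vector3s = 6 floats: must match Space Size = 6
        // in the Behavior Parameters component.
        sensor.AddObservation(transform.localPosition);
        sensor.AddObservation(target.localPosition);
    }

    public override void OnActionReceived(ActionBuffers actions)
    {
        // Two continuous actions in [-1, 1], interpreted as x/z movement.
        float moveX = actions.ContinuousActions[0];
        float moveZ = actions.ContinuousActions[1];
        transform.localPosition +=
            new Vector3(moveX, 0f, moveZ) * moveSpeed * Time.deltaTime;

        // Rewards (assumed values): success bonus plus a small time penalty.
        float distance = Vector3.Distance(transform.localPosition,
                                          target.localPosition);
        if (distance < 0.5f)
        {
            SetReward(1f);      // reached the target
            EndEpisode();
        }
        else
        {
            AddReward(-0.001f); // encourage reaching the target quickly
        }
    }
}
```

With this sketch, the Behavior Parameters component would be configured with Continuous Actions = 2 and Vector Observation Space Size = 6.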


For more information, examples, and good practices about Actions, Observations, and Rewards, check the official Agents documentation.

  1. Add (override) OnEpisodeBegin(). The goal is to fix the initial conditions/reset everything at the beginning of each episode (e.g., randomize the position of the target on the table).

  2. Override the Heuristic() function. This allows you to manually control the Agent by mapping inputs from the user to the actions previously defined (in other terms, this function puts the input into the discrete/continuous action vector used by OnActionReceived()).
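These last two steps can be sketched as follows. Again, this is an illustration under assumptions: the `target` transform, the ±4-unit table bounds, and the use of Unity's default "Horizontal"/"Vertical" input axes are all placeholders to adapt to your scene. The override signatures are the real ML-Agents Release 18 ones.

```csharp
using Unity.MLAgents;
using Unity.MLAgents.Actuators;
using UnityEngine;

// Minimal sketch: episode reset plus manual (heuristic) control
// for an agent with two continuous actions.
public class ManualControlAgent : Agent
{
    public Transform target;    // assign in the Inspector (illustrative)

    public override void OnEpisodeBegin()
    {
        // Reset the agent and randomize the target position
        // (assumed table bounds of ±4 units).
        transform.localPosition = Vector3.zero;
        target.localPosition = new Vector3(Random.Range(-4f, 4f),
                                           0.5f,
                                           Random.Range(-4f, 4f));
    }

    public override void Heuristic(in ActionBuffers actionsOut)
    {
        // Map keyboard input into the continuous action vector that
        // OnActionReceived() will consume. To drive the agent yourself,
        // set Behavior Type to "Heuristic Only".
        var continuous = actionsOut.ContinuousActions;
        continuous[0] = Input.GetAxis("Horizontal");
        continuous[1] = Input.GetAxis("Vertical");
    }
}
```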

That’s it! Of course, the content of these functions and configurations strongly depends on your task. But by now you should have all you need to start the training.


The training phase can be summarized in the following steps:
  1. Create a configuration file. As a starting point, you can get inspired by the file in the example: ml-agents\config\ppo\3DBall.yaml.

  2. Launch the training, first in the console, then by clicking the play button in the Unity editor (the same steps you used to test your installation). In the console, you will have something like mlagents-learn articulations-robot-demo\ur3_config.yml --run-id=RoboArm --force.

  3. Check the progress made by your agent using the command (in another console) tensorboard --logdir results --port 6006

  4. Finally, when the training is completed, check the command line to get the path to the saved model. Copy the saved model into Unity (drag & drop) and assign it to the Model field of the Agent’s Behavior Parameters component.
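The configuration file from step 1 can be sketched as below, loosely following the structure of ml-agents\config\ppo\3DBall.yaml. The behavior name (here RoboArm, an assumption) must match the Behavior Name field in the Behavior Parameters component; the hyperparameter values are only a starting point to tune for your task.

```yaml
behaviors:
  RoboArm:                  # must match the Behavior Name set in Unity
    trainer_type: ppo
    hyperparameters:
      batch_size: 64
      buffer_size: 12000
      learning_rate: 3.0e-4
    network_settings:
      hidden_units: 128
      num_layers: 2
    reward_signals:
      extrinsic:
        gamma: 0.99
        strength: 1.0
    max_steps: 500000       # total training steps before the run completes
    time_horizon: 1000
    summary_freq: 12000     # how often statistics appear in TensorBoard
```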


If you want to resume the training (if interrupted before reaching max_steps): run mlagents-learn articulations-robot-demo\ur3_config.yml --run-id=RoboArm --resume


If you want to resume the training after it has completed (i.e., after reaching max_steps), you need to increase the max_steps parameter in the configuration yml file and resume the training (see the note above)