Distributed ROS system

A distributed ROS system is a system that runs on several machines, with one master node.

Introduction

Most ROS systems run fine on a single machine. However, in some configurations it is beneficial to run a distributed system:

the sensors are physically located in different places, with no wired connection in between
the required computing power is too big for one single computer
...

ROS supports this. The procedure is quite simple, but it has security implications that must be considered before applying it.

How to do it

The procedure is divided into two steps. It applies only to ROS systems where the two (or more) machines are in the same subnet; otherwise, more steps are required to get it working.

The first thing to know is that ROS uses ephemeral ports to communicate between the two machines. An ephemeral port is a short-lived port number used by a transport protocol such as TCP. ROS picks ports from the ephemeral range to open connections between the two machines. Since they are short-lived, the chosen ports change often.
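To see which ephemeral range a machine currently uses, the kernel setting can be read without root privileges. The sketch below assumes a Linux machine; `port_range_size` is a hypothetical helper name introduced for illustration, not part of ROS.

```shell
#!/bin/bash
# Count the ports in a range given as "LOW HIGH"
# (the same format the kernel uses for ip_local_port_range).
port_range_size() {
    set -- $1                 # split "LOW HIGH" on whitespace into $1 and $2
    echo $(( $2 - $1 + 1 ))
}

# File where Linux exposes the current range (reading it needs no sudo).
range_file=/proc/sys/net/ipv4/ip_local_port_range

if [ -r "$range_file" ]; then
    current_range=$(cat "$range_file")
    echo "current ephemeral range: $current_range"
    echo "ports available: $(port_range_size "$current_range")"
fi
```

For the range used later in this page, `port_range_size "58000 60999"` prints 3000.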

Warning

The range of ephemeral ports used by ROS must not be blocked by the firewall, otherwise the two computers will not be able to communicate with each other.

This is not a big deal in a protected and private network, but it might become one in a public network.

The procedure below is explained for two computers:

running in the same subnet, knowing each other
running Ubuntu 18.04 LTS
running ROS Melodic Morenia

Main computer (contains the ROS master)

The first thing to do is to reduce the number of ephemeral ports used. By default, ROS may use any port in the system's ephemeral range. To narrow it:

sudo sysctl -w net.ipv4.ip_local_port_range="58000 60999"

Note

Be aware that this also reduces the ephemeral ports available to all other applications!

Once this is done, these ports have to be opened in the firewall (see our doc about UFW for more information). Run:

sudo ufw allow 58000:60999/tcp
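The sysctl command and the UFW rule must describe the same range, only in two different notations. A minimal sketch of keeping both in sync from a single pair of values is shown here; `PORT_LOW`, `PORT_HIGH`, `sysctl_range` and `ufw_port_spec` are hypothetical names chosen for illustration. The commands are printed rather than executed so that they can be reviewed first.

```shell
#!/bin/bash
# Hypothetical variables holding the chosen range in one place.
PORT_LOW=58000
PORT_HIGH=60999

# Format the range the way sysctl expects it: "LOW HIGH".
sysctl_range() {
    echo "$1 $2"
}

# Format the range the way ufw expects it: "LOW:HIGH/tcp".
ufw_port_spec() {
    echo "$1:$2/tcp"
}

# Print the two commands instead of running them.
echo "sudo sysctl -w net.ipv4.ip_local_port_range=\"$(sysctl_range "$PORT_LOW" "$PORT_HIGH")\""
echo "sudo ufw allow $(ufw_port_spec "$PORT_LOW" "$PORT_HIGH")"
```

After the rule has been applied, `sudo ufw status` should list the port range among the allowed entries.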

Important

While UFW keeps these ports open across reboots, the restricted ephemeral port range is reset at each reboot.

This is why it is best to include the first command in the script that launches the program.
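Because the range has to be reapplied after every reboot, a launch script may end up calling sysctl (and asking for the sudo password) even when the range is already correct. The sketch below, assuming a Linux machine, compares the current range with the wanted one first; `WANTED_RANGE` and `same_range` are hypothetical names, and the script only reports what it would do.

```shell
#!/bin/bash
WANTED_RANGE="58000 60999"
RANGE_FILE=/proc/sys/net/ipv4/ip_local_port_range

# Succeed when two "LOW HIGH" strings name the same range, ignoring
# whitespace differences (the kernel separates the values with a tab).
same_range() {
    [ "$(echo $1)" = "$(echo $2)" ]
}

if [ -r "$RANGE_FILE" ] && same_range "$(cat "$RANGE_FILE")" "$WANTED_RANGE"; then
    echo "ephemeral range already set to $WANTED_RANGE"
else
    echo "range differs; run: sudo sysctl -w net.ipv4.ip_local_port_range=\"$WANTED_RANGE\""
fi
```

In a real launch script, the second `echo` would be replaced by the sysctl command itself.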

Finally, the ROS_MASTER_URI must be exported. This variable describes where the master node is running.

   export ROS_MASTER_URI=http://<main computer IP address>:<port number of the master>

   # for example
   export ROS_MASTER_URI=http://10.166.28.21:11311
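A mistyped address or port in ROS_MASTER_URI is an easy mistake to make. As a small sanity check, the URI can be split into its host and port. This is only a sketch that assumes the `http://HOST:PORT` form shown above; `master_host` and `master_port` are hypothetical helper names, and the fallback address is the example one from this page.

```shell
#!/bin/bash
# Extract HOST from a URI of the form http://HOST:PORT
master_host() {
    hostport=${1#http://}     # drop the scheme
    hostport=${hostport%/}    # drop a trailing slash, if any
    echo "${hostport%:*}"     # keep everything before the last colon
}

# Extract PORT from a URI of the form http://HOST:PORT
master_port() {
    hostport=${1#http://}
    hostport=${hostport%/}
    echo "${hostport##*:}"    # keep everything after the last colon
}

# Use the exported variable if present, else the example value.
uri=${ROS_MASTER_URI:-http://10.166.28.21:11311}
echo "master host: $(master_host "$uri")"
echo "master port: $(master_port "$uri")"
```

With host and port separated, a reachability check such as `nc -z <host> <port>` (assuming netcat is installed) can be run from the second computer before launching anything. Note that port 11311 lies outside the 58000-60999 range, so the firewall on the main computer may need a separate rule for it.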

In summary, it is recommended to put steps 1 and 3 and the launch command in a bash file, such as this one:

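#!/bin/bash
# Restrict the ephemeral port range (this setting is reset at each reboot)
sudo sysctl -w net.ipv4.ip_local_port_range="58000 60999"
# Tell ROS where the master is running
export ROS_MASTER_URI=http://10.166.28.21:11311
# Start the nodes of the main computer
roslaunch main.launch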

Second computer

The same steps as for the main computer must be repeated:

limit the range of the ephemeral ports
open the UFW for these ports
export the ROS_MASTER_URI

It is best to do this in a bash file, such as:

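#!/bin/bash
# Restrict the ephemeral port range (this setting is reset at each reboot)
sudo sysctl -w net.ipv4.ip_local_port_range="58000 60999"
# Point to the master running on the main computer
export ROS_MASTER_URI=http://10.166.28.21:11311
# Start the nodes of the second computer
roslaunch distributed.launch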