Language overview in ROS repositories

Denis Zatyagov
Nov 19, 2021

The discussion about languages in robotics lacks quantitative data. This article is an attempt to fill this gap by analyzing 927 ROS Melodic repositories.

The analysis identified 53 languages, with C++ as the most popular. It is used both extensively and often, appearing in more than 600 repositories.

Python is also popular, though it is used less frequently: it appears in 400 repositories. A robotics software engineer should therefore be fluent in both languages to be able to reuse any existing code if needed.

This overview is purely descriptive: no conclusion about functionality can be drawn without further analysis of the code. As future work, I am considering mining functional and logical patterns in order to discover architecture tactics for robotics software development.

Motivation

Reading robotics-related chats and forums, I often come across discussions like “what is better for robotics: C++ or Python?”. The correct answer obviously depends on the task at hand; for details, please refer to the articles on The Robotics Back-End [1] and by The Construct CEO Ricardo Tellez [2]. Although the articles give a fair view of the problem, I found only qualitative data there. So I decided to look at the contents of the repositories and dig up some quantitative data for the discussion.

Methodology

Repositories released for ROS Melodic were chosen for the analysis because it is the most popular ROS distribution (41% of total packages downloaded as of April 1, 2021 [3]). Note that repositories, not packages, were the subject of the analysis; a single repository may contain more than one package. The list of repositories was extracted from the “ROS Index” service [4] on July 19, 2021. All the repositories were hosted on GitHub, so its built-in language indicator was used to obtain statistics on languages [5]. I collected the data from ROS Index and GitHub using two custom scripts written in Python. Data was collected on August 16, 2021.
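The collection scripts are not reproduced in this article. As a rough sketch only, per-repository language statistics could be obtained as shown below, assuming GitHub's REST endpoint for listing repository languages, which returns the number of bytes of code per language. The function names and the sample byte counts are hypothetical, and the original scripts may have worked differently.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def languages_url(owner: str, repo: str) -> str:
    """Build the URL of GitHub's 'list repository languages' endpoint."""
    return f"{API_ROOT}/repos/{owner}/{repo}/languages"


def fetch_language_bytes(owner: str, repo: str) -> dict:
    """Return {language: bytes of code} for one repository (needs network access)."""
    request = urllib.request.Request(
        languages_url(owner, repo),
        headers={"Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def to_shares(language_bytes: dict) -> dict:
    """Convert byte counts into percentage shares of the repository."""
    total = sum(language_bytes.values())
    if total == 0:
        return {}
    return {lang: 100.0 * size / total for lang, size in language_bytes.items()}


if __name__ == "__main__":
    # Hypothetical byte counts, shaped like the endpoint's JSON response.
    sample = {"C++": 7000, "Python": 2000, "CMake": 1000}
    print(to_shares(sample))
```

Unauthenticated requests to the GitHub API are rate-limited, so a run over several hundred repositories would most likely need an access token.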

927 repositories in total were analyzed. The full list can be found here.

Disclaimer: some languages mentioned in this publication cannot be considered programming languages (CMake, for example). However, I decided to keep everything that GitHub recognizes as a language to show the situation as it is.

Findings

53 languages were identified but only a few are really used

53 languages were identified in total. There is also an “other” category, which contains code that GitHub's built-in indicator could not recognize.

Besides common programming languages (C++, Python, Java), the list includes interactive languages (e.g. Shell) and their wrappers (e.g. Makefile, Dockerfile, Batchfile), markup and style sheet languages (e.g. HTML, CSS), and build systems (CMake).

The most frequently used: CMake (as expected) in almost every repository

The vast majority of the identified languages are not popular: 40 languages are found in fewer than 10 repositories each, and 21 are seen just once.

Only 5 languages (CMake, C++, Python, Shell, C) occur in more than a hundred repositories.

Each ROS repository contains one or more packages. ROS manages packages with the catkin build system, and catkin in turn uses CMake. It is therefore no surprise that CMake is present in almost every repository.

As for the programming languages, it is also reasonable to see the two most popular ones: C++ and Python. Notably, C++ is used more frequently: in 7 out of 10 repositories, while Python is used in 5 out of 10.

One in seven repositories contains shell scripts, and code in classic C occurs slightly less often.

The rest of the languages can be considered exotic.
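Counts like the ones above are simple to derive once each repository is described by its language shares. The sketch below illustrates the counting on a tiny made-up data set; the repository names, the shares, and the function names are hypothetical and are not part of the collected data.

```python
from collections import Counter

# Hypothetical input: repository name -> {language: share in %}.
repo_languages = {
    "repo_a": {"C++": 80.0, "CMake": 15.0, "Python": 5.0},
    "repo_b": {"Python": 92.0, "CMake": 8.0},
    "repo_c": {"CMake": 100.0},
    "repo_d": {"C++": 60.0, "C": 30.0, "CMake": 9.0, "Shell": 1.0},
}


def repos_per_language(data):
    """Count in how many repositories each language occurs."""
    counts = Counter()
    for shares in data.values():
        counts.update(shares.keys())
    return counts


def rare_languages(counts, threshold):
    """Return the languages found in fewer than `threshold` repositories."""
    return sorted(lang for lang, n in counts.items() if n < threshold)


counts = repos_per_language(repo_languages)
for lang, n in counts.most_common():
    print(f"{lang}: {n} of {len(repo_languages)} repositories")
```

Calling `rare_languages(counts, 10)` on the real data would correspond to the "fewer than 10 repositories" group mentioned above.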

Table 1. The ten most used languages in ROS Melodic with total shares. The entire table can be found here.

The most extensively used: more than half of the code in over 500 repositories is written in C++

In most repositories the share of CMake is less than 10%, which is reasonable: it provides only service instructions for building packages. However, one in ten repositories consists of CMake only, which means it has no functional code of its own and uses functionality from other packages.

C++ code usually makes up more than 70% of a repository's codebase, providing the main functionality.

Python is usually used either for a minor (less than 10%) or a major part (more than 90%) in a repository codebase.

Figure 1. Number of repositories containing a language, by the language's share.
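A distribution like the one in Figure 1 can be built by grouping the share of a language in each repository into 10% bins. The following is a minimal sketch under that assumption; the bin width, the sample data, and the function names are illustrative, not taken from the original analysis.

```python
from collections import Counter

# Hypothetical input: repository name -> {language: share in %}.
repo_languages = {
    "repo_a": {"C++": 80.0, "CMake": 15.0, "Python": 5.0},
    "repo_b": {"Python": 92.0, "CMake": 8.0},
    "repo_c": {"CMake": 100.0},
    "repo_d": {"C++": 60.0, "C": 30.0, "CMake": 9.0, "Shell": 1.0},
}


def share_bin(share):
    """Map a share in percent to the lower edge of its 10% bin."""
    # A share of exactly 100% is placed in the last bin (90-100).
    return min(int(share // 10) * 10, 90)


def share_histogram(data, language):
    """Count repositories per 10% bin for one language."""
    histogram = Counter()
    for shares in data.values():
        if language in shares:
            histogram[share_bin(shares[language])] += 1
    return dict(sorted(histogram.items()))


for language in ("CMake", "C++", "Python"):
    print(language, share_histogram(repo_languages, language))
```

In this toy data set Python lands only in the lowest and the highest bin, which mirrors the "either minor or major" pattern described above.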

89% of repositories use more than one language

9 out of 10 repositories are written in two or more languages.

One-third of the repositories contain two languages. The most common combination is C++ and CMake (more than half of the 320 repositories).

A bit less than one-third contain three languages, and the most common combination is C++, Python, and CMake (almost half of the 270 repositories).

One-quarter of all repositories contain 4 or more languages (“Other” language was always considered as one).

Figure 2. Number of languages used in a repository.
Table 2. The five most common combinations of languages in the 320 repositories written in two languages.
Table 3. The five most common combinations of languages in the 270 repositories written in three languages.
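The figures in this section come down to two counts: how many languages each repository contains, and which sets of languages occur together most often. For example, 829 multi-language repositories out of 927 gives 829 / 927 ≈ 89%. A minimal sketch of both counts on made-up data follows; the repository names and function names are hypothetical.

```python
from collections import Counter

# Hypothetical input: repository name -> set of languages found in it.
repo_languages = {
    "repo_a": {"C++", "CMake"},
    "repo_b": {"C++", "CMake"},
    "repo_c": {"CMake", "Python"},
    "repo_d": {"C++", "CMake", "Python"},
    "repo_e": {"CMake"},
}


def language_count_distribution(data):
    """Map number of languages -> number of repositories with that many."""
    return Counter(len(languages) for languages in data.values())


def common_combinations(data, size, top=5):
    """Return the most common language combinations of a given size."""
    combos = Counter(
        tuple(sorted(languages))
        for languages in data.values()
        if len(languages) == size
    )
    return combos.most_common(top)


distribution = language_count_distribution(repo_languages)
multi = sum(n for size, n in distribution.items() if size >= 2)
print(f"{100.0 * multi / len(repo_languages):.0f}% use more than one language")
print(common_combinations(repo_languages, 2))
```

Sorting each combination before counting makes the order in which languages are listed irrelevant, so "C++ and CMake" and "CMake and C++" are treated as the same combination.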

Primary and secondary languages: C++/Python and CMake/Python

This section is based on the 829 repositories that contain two or more languages.

The most used (i.e. primary) language is, as expected, C++ (67.4%). Python is the primary language about a third as often (21.6%).

The second most used (i.e. secondary) language is CMake (more than half of the repositories) and, less often, Python (20.7%). The situation is almost the same for the third language.

Table 4. The five languages used most often as the first (primary), second, and third (secondary) language in a repository.
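The ranking behind Table 4 can be reproduced by sorting the languages of each repository by share and counting which language ends up at each position. The sketch below assumes this straightforward approach; the sample data and function names are hypothetical.

```python
from collections import Counter

# Hypothetical input: repository name -> {language: share in %}.
repo_languages = {
    "repo_a": {"C++": 80.0, "CMake": 15.0, "Python": 5.0},
    "repo_b": {"Python": 92.0, "CMake": 8.0},
    "repo_c": {"C++": 70.0, "Python": 20.0, "CMake": 10.0},
    "repo_d": {"CMake": 100.0},  # single-language repository, excluded below
    "repo_e": {"C++": 55.0, "CMake": 45.0},
}


def ranked_languages(shares):
    """Return the languages ordered from the largest to the smallest share."""
    return sorted(shares, key=lambda lang: shares[lang], reverse=True)


def rank_frequencies(data, rank):
    """Percentage of multi-language repositories per language at `rank` (0 = primary)."""
    multi = [shares for shares in data.values() if len(shares) >= 2]
    counts = Counter()
    for shares in multi:
        ordered = ranked_languages(shares)
        if rank < len(ordered):
            counts[ordered[rank]] += 1
    return {lang: 100.0 * n / len(multi) for lang, n in counts.items()}


print("primary:", rank_frequencies(repo_languages, 0))
print("secondary:", rank_frequencies(repo_languages, 1))
```

Percentages are taken relative to all multi-language repositories, so the values for the third position do not have to add up to 100% when some repositories contain only two languages.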

Conclusions

  1. At first glance, a lot of languages are used in ROS repositories.
  2. 90% of repositories use CMake, which is expected because catkin, the ROS build system, relies on it.
  3. As a consequence, 90% of repositories are written in two or more languages.
  4. C++ is the main programming language in ROS.
  5. Nevertheless, Python is used often both as a primary and secondary language.
  6. A robotics software engineer should be able to program both in C++ and Python in order to re-use existing unique code.
  7. It is impossible to say what functionality each language provides without further investigation and code inspection.

References

[1] When To Use Python vs C++ in Robotics? https://roboticsbackend.com/when-to-use-python-vs-c-in-robotics/

[2] Learn ROS with C++ or Python? https://www.theconstructsim.com/learn-ros-python-or-cpp/

[3] ROS Distro Usage by packages.ros.org traffic https://metrics.ros.org/packages_rosdistro.html

[4] ROS Index https://index.ros.org/repos/#melodic

[5] About repository languages https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-repository-languages
