Imitation Learning for a Snake Robot

Muhammad Kamil; Shahmir Ali; M. Noman Saeed; Muhammad Aqib Jamal; Anayat Ullah

Authors

Muhammad Kamil, Department of Electronic Engineering, Balochistan University of IT, Engineering and Management Sciences, Quetta, Pakistan
Shahmir Ali, Control, Automotive and Robotics Lab, National Centre of Robotics and Automation, Rawalpindi, Pakistan
M. Noman Saeed, Control, Automotive and Robotics Lab, National Centre of Robotics and Automation, Rawalpindi, Pakistan
Muhammad Aqib Jamal, Department of Electronic Engineering, Balochistan University of IT, Engineering and Management Sciences, Quetta, Pakistan
Anayat Ullah

Keywords:

Imitation Learning, Supervised Learning, Depth Images, CNN-LSTM, Snake Robot

Abstract

Introduction/Importance of Study: The emergence of learning-based methods, particularly imitation learning (IL), provides an effective alternative to traditional control strategies for complex robotic systems. IL enables robots to learn control policies directly from expert demonstrations, eliminating the need for explicit modeling of highly nonlinear dynamics.

Novelty statement: This study proposes an imitation learning framework for a snake robot based on a hybrid CNN–LSTM architecture, designed to capture both spatial and temporal dependencies inherent in locomotion tasks.
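
As a rough illustration of the hybrid architecture named above, the following PyTorch sketch applies a small CNN to every depth frame in a sequence and passes the per-frame features to an LSTM, whose last output is classified into a motion class. The class name `CnnLstmPolicy`, the layer sizes, the four-class output, the single-channel input, and the 64×64 resolution are assumptions made for illustration; the abstract does not specify them.

```python
import torch
from torch import nn


class CnnLstmPolicy(nn.Module):
    """Hypothetical CNN-LSTM classifier over sequences of depth frames."""

    def __init__(self, num_classes: int = 4, feat_dim: int = 64, hidden_dim: int = 128):
        super().__init__()
        # Per-frame spatial encoder (single-channel depth input is an assumption).
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=5, stride=2, padding=2),
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # -> (N, 32, 1, 1) for any input size
            nn.Flatten(),             # -> (N, 32)
            nn.Linear(32, feat_dim),
            nn.ReLU(),
        )
        # Temporal model over the sequence of per-frame features.
        self.lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, num_classes)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, time, channels, height, width)
        b, t, c, h, w = frames.shape
        feats = self.encoder(frames.reshape(b * t, c, h, w))  # (b*t, feat_dim)
        feats = feats.reshape(b, t, -1)                        # (b, t, feat_dim)
        out, _ = self.lstm(feats)                              # (b, t, hidden_dim)
        return self.head(out[:, -1])                           # logits from the last time step


model = CnnLstmPolicy()
logits = model(torch.zeros(2, 8, 1, 64, 64))  # 2 sequences of 8 depth frames each
```

Folding the time axis into the batch axis lets one CNN process all frames in a single pass; the LSTM then sees one feature vector per time step.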

Material and Method: A simulated snake robot was developed in a ROS1–Gazebo environment. RGB-D images, motor commands, directional labels, and timestamps were collected and stored in CSV format. Deep learning models were implemented in PyTorch to learn a mapping from sensory inputs to motor control actions.

Result and Discussion: The baseline CNN model exceeded 90% training accuracy but reached only about 19–20% test accuracy. Its training loss decreased while its validation loss increased across 100–140 epochs, indicating severe overfitting and poor generalization. In contrast, the proposed CNN–LSTM model achieved a test accuracy of 96.37% with a macro F1-score of 0.89. It converged within 10–15 epochs, with training and validation accuracies reaching approximately 99% and 95–96%, respectively. The 95% confidence interval (96.30%–96.44%) was narrow, indicating a statistically reliable estimate. Confusion matrix analysis confirmed strong class-wise performance: rectilinear forward motion was classified with near-perfect accuracy (~99%), and the other motion classes were classified with minimal error.
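
The reported accuracy and macro F1-score measure different things: accuracy weights every sample equally, whereas macro F1 weights every class equally, so a rare class that is classified less well lowers macro F1 more than it lowers accuracy. The sketch below computes both from a confusion matrix and adds a normal-approximation (Wald) confidence interval for an accuracy estimate. The 3×3 matrix and the sample size are made-up illustrative values, not the paper's data, and the abstract does not state which interval method the authors used.

```python
import math


def accuracy_and_macro_f1(cm):
    """cm[i][j] = number of samples with true class i predicted as class j."""
    n_classes = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n_classes))
    f1_scores = []
    for k in range(n_classes):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                               # true k, predicted otherwise
        fp = sum(cm[i][k] for i in range(n_classes)) - tp  # predicted k, true otherwise
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return correct / total, sum(f1_scores) / n_classes


def wald_interval(p, n, z=1.96):
    """Normal-approximation interval for a proportion p estimated from n samples."""
    half_width = z * math.sqrt(p * (1.0 - p) / n)
    return p - half_width, p + half_width


# Hypothetical, imbalanced confusion matrix (one dominant class, two rare ones).
cm = [
    [950, 5, 5],
    [10, 30, 0],
    [10, 0, 30],
]
acc, macro_f1 = accuracy_and_macro_f1(cm)
low, high = wald_interval(acc, n=sum(sum(row) for row in cm))
```

In this toy example the dominant class keeps accuracy high while the two rare classes pull macro F1 well below it.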

Concluding Remarks: The results demonstrate that integrating temporal memory with spatial feature extraction significantly enhances imitation learning performance for snake robot control. Deployed on an NVIDIA Tesla A100 GPU (32 GB), the lightweight CNN–LSTM model achieves an estimated inference latency of 2–10 ms per input sequence, indicating its suitability for real-time robotic control in simulated environments.
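
A per-sequence latency figure of this kind is typically obtained by timing repeated forward passes after a short warm-up. The helper below is a generic sketch of that procedure, not the authors' measurement code; `measure_latency_ms` and the stand-in `dummy_policy` are hypothetical names. For a real GPU model the device should be synchronized (for example with `torch.cuda.synchronize()`) before each clock reading, because asynchronous kernel launches otherwise make the timing too optimistic.

```python
import statistics
import time


def measure_latency_ms(fn, sample, warmup=5, runs=50):
    """Return (median, max) wall-clock latency of fn(sample) in milliseconds."""
    for _ in range(warmup):
        fn(sample)  # warm-up calls are not timed
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(sample)
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings), max(timings)


def dummy_policy(seq):
    """Stand-in for a model forward pass (hypothetical)."""
    return sum(seq) / len(seq)


median_ms, worst_ms = measure_latency_ms(dummy_policy, list(range(1000)))
```

Reporting the median alongside the worst case is useful for control loops, where occasional slow passes matter as much as the typical one.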
