Low Compute End-to-End Robot Navigation from Depth Images Using Deep Reinforcement Learning

Shahrukh Hussain; Abdul Wahid; Anayatullah

doi:10.33411/IJIST/1816

Authors

Shahrukh Hussain (Control, Automation and Robotics Lab, National Centre of Robotics and Automation, Rawalpindi, Pakistan)
Abdul Wahid (Control, Automation and Robotics Lab, National Centre of Robotics and Automation, Rawalpindi, Pakistan)
Anayatullah (Balochistan University of Information Technology, Engineering and Management Sciences, Quetta, Pakistan)

Keywords:

Autonomous Navigation, Deep Reinforcement Learning, Depth Images, Convolutional Neural Networks, Mobile Robots

Abstract

Effective, compute-efficient navigation is essential for mobile robots that operate autonomously in complex, crowded environments such as airports and shopping malls. Sensors mounted on a mobile robot play a vital role in decision-making by providing information about its surroundings. Among them, depth cameras are visual sensors that provide rich depth information, enabling the robot to perceive the three-dimensional structure of the environment and supporting robust obstacle avoidance and navigation. In this paper, we aim to achieve autonomous navigation of a differential-drive robot using only depth information, by employing a simple Convolutional Neural Network (CNN) architecture. The CNN interprets the depth images captured by the depth camera and generates the corresponding actions for the robot, while remaining computationally efficient enough for the limited computing resources of mobile robots. We train the CNN using Deep Reinforcement Learning (DRL) with a curriculum learning paradigm in two Gazebo robotic simulation environments of increasing complexity. This approach improves the model’s generalization and its adaptability to unobserved environments. The trained model is then evaluated by deploying it in an unseen simulation environment. Results show that the agent converged in 1,100 episodes in the primary environment. Furthermore, to demonstrate the model’s adaptability, it was deployed in a real-life laboratory environment, where it achieved a 70% success rate after training for 1,000 episodes.
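
The curriculum idea described in the abstract (training first in a simpler environment, then in a more complex one) can be illustrated with a minimal sketch. The class name, the stage names, the rolling window and the success threshold below are all assumptions made for illustration; the abstract does not state how or when the switch between the two Gazebo environments is made.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Curriculum:
    """Advance to the next training stage once recent episodes succeed often enough.

    Hypothetical helper: the switching rule is an assumption, not the paper's method.
    """
    stages: list            # stage names, easiest first (hypothetical names)
    window: int = 50        # number of recent episodes considered (assumed)
    threshold: float = 0.8  # success rate required to advance (assumed)
    index: int = 0
    recent: deque = field(default_factory=deque)

    @property
    def stage(self):
        return self.stages[self.index]

    def record(self, success: bool) -> str:
        """Log one episode outcome and return the stage to use for the next episode."""
        self.recent.append(bool(success))
        if len(self.recent) > self.window:
            self.recent.popleft()  # keep only the most recent `window` outcomes
        full = len(self.recent) == self.window
        rate = sum(self.recent) / len(self.recent)
        is_last = self.index == len(self.stages) - 1
        if full and rate >= self.threshold and not is_last:
            self.index += 1
            self.recent.clear()  # restart the statistics for the harder stage
        return self.stage
```

In a training loop, `record` would be called once per episode with a flag for whether the robot reached its goal, and the returned stage name would select which simulation world to reset into. Clearing the window on promotion is a design choice that keeps the easier stage's successes from inflating the statistics of the harder one.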
