Intelligent Control of Bionic Fish Tracking Based on PPO Algorithm
    Abstract:

    Bionic fish have broad prospects for engineering applications. The first problem to solve in controlling a bionic fish is path tracking. However, existing fish control methods that rely on computational fluid dynamics (CFD) and traditional control algorithms suffer from high training-data acquisition costs and unstable control. This study proposes an intelligent control method for bionic fish tracking based on the proximal policy optimization (PPO) algorithm. A surrogate model replaces CFD for generating training data, which improves data-generation efficiency. The PPO algorithm is introduced to speed up learning of the policy model and to make better use of the training data. A speed parameter is introduced to address the fish's inability to track smoothly through sharp turns. Experiments show that the proposed method converges faster and controls more stably across a variety of paths, and it offers guidance for the intelligent control of bionic robotic fish.
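
    The abstract does not say how PPO is configured for the fish controller, so the following is only a minimal sketch of the generic clipped surrogate objective from the original PPO formulation (Schulman et al., 2017), not the authors' implementation. The function name `ppo_clip_objective`, the default `clip_eps=0.2`, and the sample values are illustrative assumptions. Each ratio is the probability of an action under the new policy divided by its probability under the old policy, and each advantage estimates how much better that action was than average.

```python
def ppo_clip_objective(ratios, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (to be maximized).

    ratios     -- new-policy / old-policy action probabilities, one per sample
    advantages -- advantage estimates, one per sample
    clip_eps   -- clipping range; 0.2 is an assumed, commonly used default
    """
    if len(ratios) != len(advantages):
        raise ValueError("ratios and advantages must have the same length")
    if not ratios:
        raise ValueError("at least one sample is required")

    total = 0.0
    for ratio, advantage in zip(ratios, advantages):
        # Restrict the ratio to [1 - eps, 1 + eps].
        clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Taking the smaller of the two terms gives a pessimistic bound, so
        # the policy gains nothing from moving far away from the old policy.
        total += min(ratio * advantage, clipped_ratio * advantage)
    return total / len(ratios)


# Hypothetical sample values, for illustration only.
objective = ppo_clip_objective([1.5, 0.5], [1.0, -1.0])
```

    With a positive advantage the objective stops rewarding ratios above 1 + clip_eps, and with a negative advantage it stops rewarding ratios below 1 - clip_eps. This bound on each policy update is what allows the same batch of training data to be reused for several gradient steps.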

Get Citation

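Li Yunfei, Yan Lang, Zhang Laiping, Deng Xiaogang, Zou Shufan. Intelligent Control of Bionic Fish Tracking Based on PPO Algorithm. Computer Systems & Applications, 2023, 32(9): 230-238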

History
  • Received: February 22, 2023
  • Revised: March 22, 2023
  • Online: July 14, 2023