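SD-WAN Failure Recovery Based on Congestion and Memory Awareness

Funding:

National Key Research and Development Program of China (2019YFB1804003); Key-Area Research and Development Program of Guangdong Province (2019B010137003); Science and Technology Foundation of Guangdong Province (2016B030305006, 2018A07071702); Science and Technology Foundation of Guangzhou (201804010314)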
    Abstract:

    In software-defined wide area networks (SD-WANs), link failures can cause substantial packet loss and, in severe cases, paralyze part of the network. Existing traffic engineering approaches speed up failure recovery by installing backup paths on the data plane in advance. With limited resources, however, they struggle to adapt to the full range of failure scenarios, so network performance degrades after recovery. To preserve network performance after failure recovery and reduce the consumption of backup resources, this study proposes a proactive failure recovery scheme based on congestion and memory awareness (CAMA). CAMA not only redirects the affected data flows quickly but also balances load to avoid potential link congestion after recovery. Experimental results show that, compared with existing schemes, CAMA uses backup resources effectively, performs well in load balancing, and needs only a small number of backup rules to cover all single-link failure scenarios.
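To make the idea of congestion- and memory-aware backup selection concrete, the following is a minimal Python sketch. It is not the CAMA algorithm from the paper, whose details are not given in this abstract; it is a simple greedy heuristic under stated assumptions. The function name `pick_backup_paths`, the data layout, the "one backup rule per forwarding switch" cost model, and the toy topology are all hypothetical. For each flow affected by a failed link, the sketch picks, among precomputed candidate backup paths, the one that keeps the worst link utilization lowest while respecting each switch's remaining rule budget.

```python
def pick_backup_paths(flows, candidates, load, capacity, rule_budget):
    """Greedily assign one backup path to each affected flow (illustrative only).

    flows:       dict flow_id -> traffic demand
    candidates:  dict flow_id -> list of candidate paths (lists of nodes)
    load:        dict (u, v) -> current link load, updated in place
    capacity:    dict (u, v) -> link capacity
    rule_budget: dict node -> backup rules that still fit, updated in place
    """
    def links(path):
        return list(zip(path, path[1:]))

    def worst_utilization(path, demand):
        # Highest utilization on the path if this flow were rerouted onto it.
        return max((load[l] + demand) / capacity[l] for l in links(path))

    def fits_memory(path):
        # Assumption: one backup rule per forwarding switch (destination excluded).
        return all(rule_budget[n] >= 1 for n in path[:-1])

    chosen = {}
    # Place the largest flows first so they get the least loaded paths.
    for fid in sorted(flows, key=flows.get, reverse=True):
        demand = flows[fid]
        feasible = [p for p in candidates[fid] if fits_memory(p)]
        if not feasible:
            chosen[fid] = None  # no backup path fits the rule budgets
            continue
        best = min(feasible, key=lambda p: worst_utilization(p, demand))
        for l in links(best):
            load[l] += demand
        for n in best[:-1]:
            rule_budget[n] -= 1
        chosen[fid] = best
    return chosen


# Toy scenario: link A-D has failed; two flows from A to D need backup paths.
flows = {"f1": 4, "f2": 3}
candidates = {fid: [["A", "B", "D"], ["A", "C", "D"]] for fid in flows}
load = {("A", "B"): 2, ("B", "D"): 2, ("A", "C"): 5, ("C", "D"): 1}
capacity = {link: 10 for link in load}
rule_budget = {"A": 2, "B": 1, "C": 1}

chosen = pick_backup_paths(flows, candidates, load, capacity, rule_budget)
print(chosen)
```

In this toy run the larger flow takes the lightly loaded path through B, which exhausts B's rule budget, so the second flow is steered through C. Both the congestion criterion and the memory criterion therefore influence the outcome.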

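Cite this article

Zhuang Jie, Zhang Qizhi, Zheng Weiping, Zhao Gansen. SD-WAN failure recovery based on congestion and memory awareness. Computer Systems & Applications, 2023, 32(9): 106-114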

History
  • Received: 2023-02-22
  • Last revised: 2023-03-22
  • Published online: 2023-07-17