Hi Peter and all,

Kevin (on cc) and I have been working on Lustre support for Arm for some time.

We would like to give the community a status and progress report and to list our work plan for the next year.

Please review our work plan and share any comments or suggestions. Thanks.

Status and Progress

====================

Release

  • No Arm packages are built for the official community release yet.

Build

  • Verified that the Lustre and OpenZFS builds and a multi-node setup all work on Arm64 CentOS 8.

  • Lbuild script support for Arm is under review: LU-15293.

CI

  • No Arm server-side CI support yet.

  • Testing of an Arm client against an x86_64 server is already in the CI gate.

Arm server-side testing on a local setup

  • Ran a full ldiskfs test with all test suites.

    • Because of the multi-MDT crash issue, some multi-MDT tests were not run.

    • Many new test failures appeared; see the Google sheet for details.

  • The full OpenZFS test has not been run yet, but we have heard that it should be more stable than ldiskfs.

Bugfix

Reference:

James Simmons’ Lustre Arm update: https://connect.linaro.org/resources/san19/san19-224/


Work Plan

==========

  • Lustre server-side critical bug fixes, target 2022-06

    • Lustre multiple-MDT kernel oops when striping: LU-15364

    • Lustre hangs at sanity test 807

    • Lustre conf-sanity test 44 kernel crash

    • Lustre conf-sanity test 58 kernel crash

    • Lustre conf-sanity test 78 kernel crash

    • Lustre conf-sanity test 79 crash

    • Lustre sanity-pcc test 7a hangs the cluster

  • Lustre server-side non-critical bug fixes, target 2022-12

    • Lustre sanity failures: 33 cases

    • Lustre server replay-single: 1 case

    • Lustre sanity-flr 200 cases fix: 1 case

    • Lustre sanity-hsm failures: 25 cases

    • Lustre lustre-rsync-test failures: 3 cases

    • Lustre recovery-small/sanity-scrub: 2 cases

    • Lustre sanityn failures: 12 cases

    • Lustre sanity-lfsck failures: 3 cases

    • Lustre sanity-sec failures: 7 cases

    • Lustre sanity-lnet failures: 2 cases

  • Continuously add more test suites to the Arm client CI ??

    • Once a test suite passes fully on Arm, add it to CI.

  • Server CI support for Arm on CentOS 8 ??

    • Ideally, Arm server CI would land together with the Arm server-side fix patches and ensure that patches merged in the future do not introduce regressions on Arm.

    • As the test infrastructure is not open source and is maintained by Whamcloud, it might need Whamcloud to set it up ??

  • Other future work

    • Test other distros such as Ubuntu, SUSE, etc.

    • Test an x86 client with an Arm64 server

    • Basic optimizations: CRC/AES

    • All-flash optimization


Best Regards,

Xinliang