linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH v3 00/61] Introduce parallel fsck to e2fsck pass1
@ 2020-11-18 15:38 Saranya Muruganandam
  2020-11-18 15:38 ` [RFC PATCH v3 01/61] e2fsck: add -m option for multithread Saranya Muruganandam
                   ` (61 more replies)
  0 siblings, 62 replies; 84+ messages in thread
From: Saranya Muruganandam @ 2020-11-18 15:38 UTC (permalink / raw)
  To: linux-ext4, tytso; +Cc: adilger.kernel, Saranya Muruganandam

Currently it has been popular that single disk could be more than TiB,
etc 16Tib with only one single disk, with this trend, one single
filesystem could be larger and larger and easily reach PiB with LUN system.

The journal filesystem like ext4 need be offline to do regular
check and repair from time to time, however the problem is e2fsck
still do this using single thread, this could be challenging at scale
for two reasons:

1) even with readahead, IO speed still limits several tens MiB per second.
2) could not utilize CPU cores.

It could be challenging to try multh-threads for all phase of e2fsck, but as
first step, we might try this for most time-consuming pass1, according to
our benchmarking it cost of 80% time for whole e2fck phase.

Pass1 is trying to scanning all valid inode of filesystem and check it one by
one, and the patchset idea is trying to split these to different threads and
trying to do this at the same time, we try to merge these inodes and corresponding
inode's extent information after threads finish.

To simplify complexity and make it less error-prone, the fix is still serialized,
since most of time there will be only minor errors for filesystem, what's important
for us is parallel reading and checking.

Here is a benchmarking on our Lustre filesystem with 1.2 PiB OSD ext4 based
filesystem:

DDN SFA18KE StorageServer
DCR(DeClustering RAID) with 162 x HGST 10TB NL-SAS
Tested Server
A Virtual Machine running on SFA18KE
8 x CPU cores (Xeon(R) Gold 6140)
150GB memory
CentoOS7.7 (Lustre patched kernel)

Created 600 Million x 32K byte files.

Without Patch		With Patch  thr=64
pass1: 13079.66		488.57 seconds
Total: 15673.33		3188.42

We have 5x total time reduction of total time which is very inspiring.

I've tested the whole patch series using 'make test' of e2fsck itself, and i
manually set default threads to 4 which still pass almost of test suite,
failure cases are below:

f_h_badroot f_multithread f_multithread_logfile f_multithread_no f_multithread_ok

h_h_badroot failed because out of order checking output, and others are because
of extra multiple threads log output.

The other test i've done to verify is:
1) filled a filesystem with different features enabled(encryption, shared xattr)
2) copied the above image out using e2image.
3) using e2fuzz to randomly corrupt above image, and then have another copy of
corrupted image.
4) running e2fsck with single thread(without patches) and multiple threads with
corrupted image respectively.
5) using valgrind to check any memory leak.
6) compare single thread and multiple threads fix results by mounting two images
back and compare the directores and files.(size and name match)

Any review or comments are welcomed!

Thanks you very much!

changelog v2->v3:
1) Parallelize rw_bitmaps
2) Configuration to turn off parallel fsck.(default=on)
3) Add f_multithread_ok
4) Add annotations to e2fsck_struct instead of reshuffle
5) Fix memory leaks
6) Some more extra fixes

Andreas Dilger (2):
  e2fsck: fix f_multithread_ok test
  e2fsck: misc cleanups for pfsck

Li Xi (18):
  e2fsck: add -m option for multithread
  e2fsck: copy context when using multi-thread fsck
  e2fsck: copy fs when using multi-thread fsck
  e2fsck: add assert when copying context
  e2fsck: copy bitmaps when copying context
  e2fsck: open io-channel when copying fs
  e2fsck: create logs for mult-threads
  e2fsck: optionally configure one pfsck thread
  e2fsck: add start/end group for thread
  e2fsck: split groups to different threads
  e2fsck: print thread log properly
  e2fsck: do not change global variables
  e2fsck: optimize the inserting of dir_info_db
  e2fsck: merge dir_info after thread finishes
  e2fsck: merge icounts after thread finishes
  e2fsck: merge dblist after thread finishes
  e2fsck: add debug codes for multiple threads
  e2fsck: merge fs flags when threads finish

Saranya Muruganandam (2):
  e2fsck: propagate number of threads
  e2fsck: Annotating fields in e2fsck_struct

Wang Shilong (39):
  e2fsck: clear icache when using multi-thread fsck
  e2fsck: copy badblocks when copying fs
  e2fsck: merge bitmaps after thread completes
  e2fsck: rbtree bitmap for dir
  e2fsck: merge badblocks after thread finishes
  e2fsck: merge counts after threads finish
  e2fsck: merge dx_dir_info after threads finish
  e2fsck: merge dirs_to_hash when threads finish
  e2fsck: merge context flags properly
  e2fsck: merge quota context after threads finish
  e2fsck: serialize fix operations
  e2fsck: move some fixes out of parallel pthreads
  e2fsck: split and merge invalid bitmaps
  e2fsck: merge EA blocks properly
  e2fsck: kickoff mutex lock for block found map
  e2fsck: allow admin specify number of threads
  e2fsck: adjust number of threads
  e2fsck: fix readahead for pfsck of pass1
  e2fsck: merge options after threads finish
  e2fsck: reset lost_and_found after threads finish
  e2fsck: merge extent depth count after threads finish
  e2fsck: simplify e2fsck context merging codes
  e2fsck: set E2F_FLAG_ALLOC_OK after threads
  e2fsck: wait fix thread finish before checking
  e2fsck: cleanup e2fsck_pass1_thread_join()
  e2fsck: avoid too much memory allocation for pfsck
  e2fsck: make default smallest RA size to 1M
  ext2fs: parallel bitmap loading
  e2fsck: update mmp block in one thread
  e2fsck: reset @inodes_to_rebuild if restart
  e2fsck: fix build for make rpm
  e2fsck: move ext2fs_get_avg_group to rw_bitmaps.c
  configure: enable pfsck by default
  test: add pfsck test
  e2fsck: fix race in ext2fs_read_bitmaps()
  e2fsck: fix readahead for pass1 without pfsck
  e2fsck: fix memory leaks with pfsck enabled
  ext2fs: fix to set tail flags with pfsck enabled
  e2fsck: update mmp block race

 MCONFIG.in                              |    1 +
 configure                               |   90 +-
 configure.ac                            |   26 +
 e2fsck/Makefile.in                      |    9 +-
 e2fsck/dirinfo.c                        |  238 ++-
 e2fsck/dx_dirinfo.c                     |   64 +
 e2fsck/e2fsck.8.in                      |    8 +-
 e2fsck/e2fsck.c                         |   11 +
 e2fsck/e2fsck.h                         |  102 +-
 e2fsck/logfile.c                        |   13 +-
 e2fsck/pass1.c                          | 1766 ++++++++++++++++++++---
 e2fsck/problem.c                        |   11 +
 e2fsck/problem.h                        |    3 +
 e2fsck/readahead.c                      |    4 +
 e2fsck/unix.c                           |   59 +-
 e2fsck/util.c                           |  193 ++-
 lib/config.h.in                         |    3 +
 lib/ext2fs/badblocks.c                  |   85 +-
 lib/ext2fs/bitmaps.c                    |   10 +
 lib/ext2fs/bitops.h                     |    2 +
 lib/ext2fs/blkmap64_rb.c                |   65 +
 lib/ext2fs/bmap64.h                     |    4 +
 lib/ext2fs/dblist.c                     |   38 +
 lib/ext2fs/ext2_err.et.in               |    3 +
 lib/ext2fs/ext2_io.h                    |    2 +
 lib/ext2fs/ext2fs.h                     |   19 +-
 lib/ext2fs/ext2fsP.h                    |    1 -
 lib/ext2fs/gen_bitmap64.c               |   62 +
 lib/ext2fs/icount.c                     |  107 ++
 lib/ext2fs/openfs.c                     |   48 +-
 lib/ext2fs/rw_bitmaps.c                 |  301 +++-
 lib/ext2fs/undo_io.c                    |   19 +
 lib/ext2fs/unix_io.c                    |   24 +-
 lib/support/mkquota.c                   |   39 +
 lib/support/quotaio.h                   |    3 +
 tests/f_itable_collision/expect.1       |    3 -
 tests/f_multithread/expect.1            |   25 +
 tests/f_multithread/expect.2            |    7 +
 tests/f_multithread/image.gz            |    1 +
 tests/f_multithread/name                |    1 +
 tests/f_multithread/script              |    4 +
 tests/f_multithread_completion/expect.1 |    2 +
 tests/f_multithread_completion/expect.2 |   23 +
 tests/f_multithread_completion/image.gz |    1 +
 tests/f_multithread_completion/name     |    1 +
 tests/f_multithread_completion/script   |    4 +
 tests/f_multithread_logfile/expect.1    |   25 +
 tests/f_multithread_logfile/image.gz    |    1 +
 tests/f_multithread_logfile/name        |    1 +
 tests/f_multithread_logfile/script      |   32 +
 tests/f_multithread_no/expect.1         |   26 +
 tests/f_multithread_no/expect.2         |   23 +
 tests/f_multithread_no/image.gz         |    1 +
 tests/f_multithread_no/name             |    1 +
 tests/f_multithread_no/script           |    4 +
 tests/f_multithread_ok/expect.1         |    8 +
 tests/f_multithread_ok/image.gz         |  Bin 0 -> 796311 bytes
 tests/f_multithread_ok/name             |    1 +
 tests/f_multithread_ok/script           |   21 +
 tests/f_multithread_preen/expect.1      |   11 +
 tests/f_multithread_preen/expect.2      |   23 +
 tests/f_multithread_preen/image.gz      |    1 +
 tests/f_multithread_preen/name          |    1 +
 tests/f_multithread_preen/script        |    4 +
 tests/f_multithread_yes/expect.1        |    2 +
 tests/f_multithread_yes/expect.2        |   23 +
 tests/f_multithread_yes/image.gz        |    1 +
 tests/f_multithread_yes/name            |    1 +
 tests/f_multithread_yes/script          |    4 +
 tests/test_one.in                       |    8 +
 70 files changed, 3335 insertions(+), 393 deletions(-)
 create mode 100644 tests/f_multithread/expect.1
 create mode 100644 tests/f_multithread/expect.2
 create mode 120000 tests/f_multithread/image.gz
 create mode 100644 tests/f_multithread/name
 create mode 100644 tests/f_multithread/script
 create mode 100644 tests/f_multithread_completion/expect.1
 create mode 100644 tests/f_multithread_completion/expect.2
 create mode 120000 tests/f_multithread_completion/image.gz
 create mode 100644 tests/f_multithread_completion/name
 create mode 100644 tests/f_multithread_completion/script
 create mode 100644 tests/f_multithread_logfile/expect.1
 create mode 120000 tests/f_multithread_logfile/image.gz
 create mode 100644 tests/f_multithread_logfile/name
 create mode 100644 tests/f_multithread_logfile/script
 create mode 100644 tests/f_multithread_no/expect.1
 create mode 100644 tests/f_multithread_no/expect.2
 create mode 120000 tests/f_multithread_no/image.gz
 create mode 100644 tests/f_multithread_no/name
 create mode 100644 tests/f_multithread_no/script
 create mode 100644 tests/f_multithread_ok/expect.1
 create mode 100644 tests/f_multithread_ok/image.gz
 create mode 100644 tests/f_multithread_ok/name
 create mode 100644 tests/f_multithread_ok/script
 create mode 100644 tests/f_multithread_preen/expect.1
 create mode 100644 tests/f_multithread_preen/expect.2
 create mode 120000 tests/f_multithread_preen/image.gz
 create mode 100644 tests/f_multithread_preen/name
 create mode 100644 tests/f_multithread_preen/script
 create mode 100644 tests/f_multithread_yes/expect.1
 create mode 100644 tests/f_multithread_yes/expect.2
 create mode 120000 tests/f_multithread_yes/image.gz
 create mode 100644 tests/f_multithread_yes/name
 create mode 100644 tests/f_multithread_yes/script

-- 
2.29.2.299.gdc1121823c-goog


^ permalink raw reply	[flat|nested] 84+ messages in thread

end of thread, other threads:[~2020-12-18  1:30 UTC | newest]

Thread overview: 84+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-18 15:38 [RFC PATCH v3 00/61] Introduce parallel fsck to e2fsck pass1 Saranya Muruganandam
2020-11-18 15:38 ` [RFC PATCH v3 01/61] e2fsck: add -m option for multithread Saranya Muruganandam
2020-11-23 19:53   ` harshad shirwadkar
2020-11-23 21:28   ` Theodore Y. Ts'o
2020-11-18 15:38 ` [RFC PATCH v3 02/61] e2fsck: copy context when using multi-thread fsck Saranya Muruganandam
2020-11-23 19:55   ` harshad shirwadkar
2020-11-23 21:38   ` Theodore Y. Ts'o
2020-12-17 23:56   ` Darrick J. Wong
2020-12-18  1:13     ` Wang Shilong
2020-12-18  1:27       ` Darrick J. Wong
2020-11-18 15:38 ` [RFC PATCH v3 03/61] e2fsck: copy fs " Saranya Muruganandam
2020-11-23 22:12   ` Theodore Y. Ts'o
2020-11-18 15:38 ` [RFC PATCH v3 04/61] e2fsck: clear icache " Saranya Muruganandam
2020-11-23 22:27   ` Theodore Y. Ts'o
2020-11-18 15:38 ` [RFC PATCH v3 05/61] e2fsck: add assert when copying context Saranya Muruganandam
2020-11-18 15:38 ` [RFC PATCH v3 06/61] e2fsck: copy bitmaps " Saranya Muruganandam
2020-11-18 15:38 ` [RFC PATCH v3 07/61] e2fsck: copy badblocks when copying fs Saranya Muruganandam
2020-11-18 15:38 ` [RFC PATCH v3 08/61] e2fsck: open io-channel " Saranya Muruganandam
2020-11-23 22:38   ` Theodore Y. Ts'o
2020-11-24 14:17     ` Theodore Y. Ts'o
2020-11-18 15:38 ` [RFC PATCH v3 09/61] e2fsck: create logs for mult-threads Saranya Muruganandam
2020-11-23 23:05   ` Theodore Y. Ts'o
2020-11-18 15:38 ` [RFC PATCH v3 10/61] e2fsck: optionally configure one pfsck thread Saranya Muruganandam
2020-11-23 23:16   ` Theodore Y. Ts'o
2020-11-18 15:38 ` [RFC PATCH v3 11/61] e2fsck: add start/end group for thread Saranya Muruganandam
2020-11-18 15:38 ` [RFC PATCH v3 12/61] e2fsck: split groups to different threads Saranya Muruganandam
2020-11-18 15:38 ` [RFC PATCH v3 13/61] e2fsck: print thread log properly Saranya Muruganandam
2020-11-23 23:40   ` Theodore Y. Ts'o
2020-11-18 15:39 ` [RFC PATCH v3 14/61] e2fsck: merge bitmaps after thread completes Saranya Muruganandam
2020-11-24  2:00   ` Theodore Y. Ts'o
2020-11-18 15:39 ` [RFC PATCH v3 15/61] e2fsck: do not change global variables Saranya Muruganandam
2020-11-18 15:39 ` [RFC PATCH v3 16/61] e2fsck: optimize the inserting of dir_info_db Saranya Muruganandam
2020-11-18 15:39 ` [RFC PATCH v3 17/61] e2fsck: merge dir_info after thread finishes Saranya Muruganandam
2020-11-18 15:39 ` [RFC PATCH v3 18/61] e2fsck: rbtree bitmap for dir Saranya Muruganandam
2020-11-18 15:39 ` [RFC PATCH v3 19/61] e2fsck: merge badblocks after thread finishes Saranya Muruganandam
2020-11-18 15:39 ` [RFC PATCH v3 20/61] e2fsck: merge icounts " Saranya Muruganandam
2020-11-18 15:39 ` [RFC PATCH v3 21/61] e2fsck: merge dblist " Saranya Muruganandam
2020-11-18 15:39 ` [RFC PATCH v3 22/61] e2fsck: add debug codes for multiple threads Saranya Muruganandam
2020-11-18 15:39 ` [RFC PATCH v3 23/61] e2fsck: merge counts after threads finish Saranya Muruganandam
2020-11-18 15:39 ` [RFC PATCH v3 24/61] e2fsck: merge fs flags when " Saranya Muruganandam
2020-11-18 15:39 ` [RFC PATCH v3 25/61] e2fsck: merge dx_dir_info after " Saranya Muruganandam
2020-11-18 15:39 ` [RFC PATCH v3 26/61] e2fsck: merge dirs_to_hash when " Saranya Muruganandam
2020-11-18 15:39 ` [RFC PATCH v3 27/61] e2fsck: merge context flags properly Saranya Muruganandam
2020-11-18 15:39 ` [RFC PATCH v3 28/61] e2fsck: merge quota context after threads finish Saranya Muruganandam
2020-11-18 15:39 ` [RFC PATCH v3 29/61] e2fsck: serialize fix operations Saranya Muruganandam
2020-11-18 15:39 ` [RFC PATCH v3 30/61] e2fsck: move some fixes out of parallel pthreads Saranya Muruganandam
2020-11-18 15:39 ` [RFC PATCH v3 31/61] e2fsck: split and merge invalid bitmaps Saranya Muruganandam
2020-12-18  0:05   ` Darrick J. Wong
2020-12-18  1:19     ` Wang Shilong
2020-11-18 15:39 ` [RFC PATCH v3 32/61] e2fsck: merge EA blocks properly Saranya Muruganandam
2020-11-18 15:39 ` [RFC PATCH v3 33/61] e2fsck: kickoff mutex lock for block found map Saranya Muruganandam
2020-11-18 15:39 ` [RFC PATCH v3 34/61] e2fsck: allow admin specify number of threads Saranya Muruganandam
2020-11-18 15:39 ` [RFC PATCH v3 35/61] e2fsck: adjust " Saranya Muruganandam
2020-11-18 15:39 ` [RFC PATCH v3 36/61] e2fsck: fix readahead for pfsck of pass1 Saranya Muruganandam
2020-11-18 15:39 ` [RFC PATCH v3 37/61] e2fsck: merge options after threads finish Saranya Muruganandam
2020-12-17 23:30   ` Darrick J. Wong
2020-11-18 15:39 ` [RFC PATCH v3 38/61] e2fsck: reset lost_and_found " Saranya Muruganandam
2020-11-18 15:39 ` [RFC PATCH v3 39/61] e2fsck: merge extent depth count " Saranya Muruganandam
2020-11-18 15:39 ` [RFC PATCH v3 40/61] e2fsck: simplify e2fsck context merging codes Saranya Muruganandam
2020-11-18 15:39 ` [RFC PATCH v3 41/61] e2fsck: set E2F_FLAG_ALLOC_OK after threads Saranya Muruganandam
2020-11-18 15:39 ` [RFC PATCH v3 42/61] e2fsck: wait fix thread finish before checking Saranya Muruganandam
2020-11-18 15:39 ` [RFC PATCH v3 43/61] e2fsck: cleanup e2fsck_pass1_thread_join() Saranya Muruganandam
2020-11-18 15:39 ` [RFC PATCH v3 44/61] e2fsck: avoid too much memory allocation for pfsck Saranya Muruganandam
2020-11-18 15:39 ` [RFC PATCH v3 45/61] e2fsck: make default smallest RA size to 1M Saranya Muruganandam
2020-11-18 15:39 ` [RFC PATCH v3 46/61] ext2fs: parallel bitmap loading Saranya Muruganandam
2020-11-24  2:44   ` Theodore Y. Ts'o
2020-11-18 15:39 ` [RFC PATCH v3 47/61] e2fsck: update mmp block in one thread Saranya Muruganandam
2020-11-18 15:39 ` [RFC PATCH v3 48/61] e2fsck: reset @inodes_to_rebuild if restart Saranya Muruganandam
2020-11-18 15:39 ` [RFC PATCH v3 49/61] e2fsck: fix build for make rpm Saranya Muruganandam
2020-11-18 15:39 ` [RFC PATCH v3 50/61] e2fsck: move ext2fs_get_avg_group to rw_bitmaps.c Saranya Muruganandam
2020-11-24  2:12   ` Theodore Y. Ts'o
2020-11-18 15:39 ` [RFC PATCH v3 51/61] configure: enable pfsck by default Saranya Muruganandam
2020-11-18 15:39 ` [RFC PATCH v3 53/61] e2fsck: fix f_multithread_ok test Saranya Muruganandam
2020-11-18 15:39 ` [RFC PATCH v3 54/61] e2fsck: fix race in ext2fs_read_bitmaps() Saranya Muruganandam
2020-11-18 15:39 ` [RFC PATCH v3 55/61] e2fsck: fix readahead for pass1 without pfsck Saranya Muruganandam
2020-11-18 15:39 ` [RFC PATCH v3 56/61] e2fsck: fix memory leaks with pfsck enabled Saranya Muruganandam
2020-11-18 15:39 ` [RFC PATCH v3 57/61] ext2fs: fix to set tail flags " Saranya Muruganandam
2020-11-18 15:39 ` [RFC PATCH v3 58/61] e2fsck: misc cleanups for pfsck Saranya Muruganandam
2020-11-18 15:39 ` [RFC PATCH v3 59/61] e2fsck: update mmp block race Saranya Muruganandam
2020-11-18 15:39 ` [RFC PATCH v3 60/61] e2fsck: propagate number of threads Saranya Muruganandam
2020-11-24  3:56   ` Theodore Y. Ts'o
2020-11-18 15:39 ` [RFC PATCH v3 61/61] e2fsck: Annotating fields in e2fsck_struct Saranya Muruganandam
2020-11-19 15:58 ` [RFC PATCH v3 00/61] Introduce parallel fsck to e2fsck pass1 Theodore Y. Ts'o
2020-11-23 21:25 ` Theodore Y. Ts'o

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).