linux-ext4.vger.kernel.org archive mirror
* [RFC PATCH 00/46] introduce parallel fsck to e2fsck pass1
@ 2020-04-08 10:44 Wang Shilong
From: Wang Shilong @ 2020-04-08 10:44 UTC
  To: linux-ext4; +Cc: lixi, adilger, sihara, Wang Shilong

From: Wang Shilong <wshilong@ddn.com>

It is now common for a single disk to hold many TiB (e.g. 16 TiB on one
disk). With this trend, a single filesystem keeps growing and can easily
reach PiB scale on a LUN system.

A journaling filesystem like ext4 needs to be taken offline for a regular
check and repair from time to time. However, e2fsck still does this in a
single thread, which is challenging at scale for two reasons:

1) Even with readahead, I/O speed is still limited to several tens of MiB
   per second.
2) It cannot utilize multiple CPU cores.

It could be challenging to make all phases of e2fsck multi-threaded, but as
a first step we can try it for pass1, the most time-consuming pass.
According to our benchmarking, pass1 accounts for about 80% of the total
e2fsck time.

Pass1 scans all valid inodes of the filesystem and checks them one by one.
The idea of this patchset is to split the inodes among different threads and
check them at the same time, then merge the inodes and the corresponding
extent information after the threads finish.
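
The split-and-merge idea can be sketched roughly as below. This is only an
illustration in Python, not the e2fsck code: split_groups(), scan_range()
and parallel_pass1() are hypothetical names, the inode table is a plain dict
mapping inode number to "file" or "dir", and inode numbers are 0-based to
keep the arithmetic simple.

```python
from concurrent.futures import ThreadPoolExecutor


def split_groups(num_groups, num_threads):
    """Return contiguous [start, end) block-group ranges, one per thread."""
    num_threads = max(1, min(num_threads, num_groups))
    base, extra = divmod(num_groups, num_threads)
    ranges, start = [], 0
    for i in range(num_threads):
        # The first `extra` threads take one more group each.
        end = start + base + (1 if i < extra else 0)
        ranges.append((start, end))
        start = end
    return ranges


def scan_range(inode_table, inodes_per_group, group_range):
    """Hypothetical per-thread scan: collect in-use inodes, count dirs."""
    start, end = group_range
    used, dirs = set(), 0
    for ino in range(start * inodes_per_group, end * inodes_per_group):
        kind = inode_table.get(ino)
        if kind is None:          # inode not in use
            continue
        used.add(ino)
        if kind == "dir":
            dirs += 1
    return used, dirs


def parallel_pass1(inode_table, num_groups, inodes_per_group, num_threads):
    """Scan group ranges in parallel, then merge the per-thread results."""
    ranges = split_groups(num_groups, num_threads)
    with ThreadPoolExecutor(max_workers=len(ranges)) as pool:
        results = list(pool.map(
            lambda r: scan_range(inode_table, inodes_per_group, r), ranges))
    merged_used, merged_dirs = set(), 0
    for used, dirs in results:    # merge step, done after all threads finish
        merged_used |= used
        merged_dirs += dirs
    return merged_used, merged_dirs
```

Each thread only touches its own private result, so no locking is needed
during the scan; all shared state is built in the merge step.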

To reduce complexity and make this less error-prone, fixes are still
serialized. Most of the time a filesystem has only minor errors, so what
matters to us is parallel reading and checking.
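
A minimal sketch of "parallel check, serialized fix" might look like this.
It is a Python illustration under assumptions, not the patch code: fix_lock,
check_inodes() and run_checks() are hypothetical names, and an inode is
treated as broken when its link count is negative.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# One global lock: only one thread may be on the fix path at a time.
fix_lock = threading.Lock()


def check_inodes(inodes, fixes):
    """Check (ino, links) pairs without locking; lock only to record a fix."""
    problems = 0
    for ino, links in inodes:
        if links >= 0:            # common case: nothing to fix, no lock taken
            continue
        problems += 1
        with fix_lock:            # rare case: fixes are serialized
            fixes.append(ino)
    return problems


def run_checks(chunks):
    """Run one checker per chunk; return (problem count, sorted fixed inodes)."""
    fixes = []
    with ThreadPoolExecutor(max_workers=max(1, len(chunks))) as pool:
        counts = list(pool.map(lambda c: check_inodes(c, fixes), chunks))
    return sum(counts), sorted(fixes)
```

Since the lock is only taken when a problem is found, a mostly clean
filesystem pays almost nothing for the serialization.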

Here is a benchmark on our Lustre filesystem with a 1.2 PiB ext4-based OSD
filesystem:

Storage server: DDN SFA18KE
  DCR (DeClustering RAID) with 162 x HGST 10TB NL-SAS
Test server: a virtual machine running on the SFA18KE
  8 x CPU cores (Xeon(R) Gold 6140)
  150GB memory
  CentOS 7.7 (Lustre patched kernel)

Created 600 million 32K-byte files.

                Without Patch   With Patch (thr=64)
pass1 (seconds) 13079.66        488.57
Total (seconds) 15673.33        3188.42

This is about a 5x reduction in total time, which is very encouraging.
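
For reference, the ratios behind these numbers can be recomputed with a few
lines of Python. The timings are the ones from the table above; the Amdahl
bound is my own addition and simply assumes that everything outside pass1
stays serial.

```python
# Timings in seconds, taken from the benchmark table.
PASS1_BEFORE, PASS1_AFTER = 13079.66, 488.57
TOTAL_BEFORE, TOTAL_AFTER = 15673.33, 3188.42

# Share of the unpatched run spent in pass1.
pass1_share = PASS1_BEFORE / TOTAL_BEFORE

# Speedup of pass1 alone and of the whole run.
pass1_speedup = PASS1_BEFORE / PASS1_AFTER
total_speedup = TOTAL_BEFORE / TOTAL_AFTER

# Amdahl's law: best possible total speedup if only pass1 is parallelized
# and pass1 became infinitely fast.
amdahl_limit = 1.0 / (1.0 - pass1_share)

print(f"pass1 share of total time: {pass1_share:.1%}")
print(f"pass1 speedup:             {pass1_speedup:.1f}x")
print(f"total speedup:             {total_speedup:.1f}x")
print(f"upper bound (Amdahl):      {amdahl_limit:.1f}x")
```

The total speedup is already close to that bound, so further gains would
have to come from the passes other than pass1.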

I've tested the whole patch series using e2fsck's own 'make test'. I also
manually set the default number of threads to 4, and almost all of the test
suite still passes. The failing cases are:

f_h_badroot f_multithread f_multithread_logfile f_multithread_no 

f_h_badroot failed because the checking output is out of order; the others
failed because of the extra log output from multiple threads.

So the whole series is reasonably stable. If you are interested in testing
it on different platforms, I've pushed it to GitHub:

https://github.com/wangshilong/e2fsprogs pfsck_pass1_v1

It is definitely at an early stage, but I'd like to send it out for early
review, comments and testing.

Thank you very much!
Shilong

Li Dongyang (1):
  libext2fs: optimize ext2fs_convert_subcluster_bitmap()

Li Xi (25):
  e2fsck: add -m option for multithread
  e2fsck: copy context when using multi-thread fsck
  e2fsck: copy fs when using multi-thread fsck
  e2fsck: copy dblist when using multi-thread fsck
  e2fsck: clear icache when using multi-thread fsck
  e2fsck: add assert when copying context
  e2fsck: copy bitmaps when copying context
  e2fsck: copy badblocks when copying fs
  e2fsck: open io-channel when copying fs
  e2fsck: create logs for mult-threads
  e2fsck: create one thread to fsck
  e2fsck: add start/end group for thread
  e2fsck: split groups to different threads
  e2fsck: print thread log properly
  e2fsck: merge bitmaps after thread completes
  e2fsck: do not change global variables
  e2fsck: optimize the inserting of dir_info_db
  e2fsck: merge dir_info after thread finishes
  e2fsck: rbtree bitmap for dir
  e2fsck: merge badblocks after thread finishes
  e2fsck: merge icounts after thread finishes
  e2fsck: merge dblist after thread finishes
  e2fsck: add debug codes for multiple threds
  e2fsck: merge counts when threads finish
  LU-8465 e2fsck: merge fs flags when threads finish

Wang Shilong (20):
  e2fsck: cleanup struct e2fsck_struct
  e2fsck: merge dx_dir_info
  e2fsck: make threads splitting aware of flex_bg
  e2fsck: merge dirs_to_hash when threads finish
  e2fsck: merge context flags properly
  e2fsck: split and merge quota context
  e2fsck: serialize fix operations
  e2fsck: move some fixes out of parallel pthreads
  e2fsck: split and merge invalid bitmaps
  e2fsck: fix to protect EA checking
  e2fsck: allow admin specify number of threads
  e2fsck: kickoff mutex lock for block found map
  e2fsck: fix readahead for pfsck of pass1
  e2fsck: kick off ea mutex lock from pfsck
  e2fsck: merge encrypted_files after threads finish
  e2fsck: merge inode_bad_map after threads finish
  e2fsck: simplify e2fsck context merging codes
  e2fsck: merge options after threads finish
  e2fsck: reset lost_and_found after threads finish
  LU-8465 e2fsck: merge extent depth count after threads finish

 configure.ac                            |    6 +
 e2fsck/dirinfo.c                        |  220 ++-
 e2fsck/dx_dirinfo.c                     |   67 +
 e2fsck/e2fsck.h                         |  336 +++--
 e2fsck/encrypted_files.c                |  175 ++-
 e2fsck/logfile.c                        |   12 +-
 e2fsck/pass1.c                          | 1692 ++++++++++++++++++++---
 e2fsck/problem.c                        |    9 +
 e2fsck/problem.h                        |    3 +
 e2fsck/unix.c                           |   33 +-
 e2fsck/util.c                           |   56 +-
 lib/ext2fs/badblocks.c                  |   75 +-
 lib/ext2fs/bitmaps.c                    |    8 +
 lib/ext2fs/bitops.h                     |    2 +
 lib/ext2fs/blkmap64_rb.c                |   51 +
 lib/ext2fs/bmap64.h                     |    3 +
 lib/ext2fs/dblist.c                     |   36 +
 lib/ext2fs/ext2_err.et.in               |    3 +
 lib/ext2fs/ext2_io.h                    |    2 +
 lib/ext2fs/ext2fs.h                     |   11 +
 lib/ext2fs/ext2fsP.h                    |    1 -
 lib/ext2fs/gen_bitmap64.c               |   89 +-
 lib/ext2fs/icount.c                     |  102 ++
 lib/ext2fs/undo_io.c                    |   19 +
 lib/ext2fs/unix_io.c                    |   24 +-
 lib/support/mkquota.c                   |   19 +
 lib/support/quotaio.h                   |    2 +
 tests/f_itable_collision/expect.1       |    3 -
 tests/f_multithread/expect.1            |   25 +
 tests/f_multithread/expect.2            |    7 +
 tests/f_multithread/image.gz            |    1 +
 tests/f_multithread/name                |    1 +
 tests/f_multithread/script              |    4 +
 tests/f_multithread_completion/expect.1 |    2 +
 tests/f_multithread_completion/expect.2 |   23 +
 tests/f_multithread_completion/image.gz |    1 +
 tests/f_multithread_completion/name     |    1 +
 tests/f_multithread_completion/script   |    4 +
 tests/f_multithread_logfile/expect.1    |   25 +
 tests/f_multithread_logfile/image.gz    |    1 +
 tests/f_multithread_logfile/name        |    1 +
 tests/f_multithread_logfile/script      |   32 +
 tests/f_multithread_no/expect.1         |   26 +
 tests/f_multithread_no/expect.2         |   23 +
 tests/f_multithread_no/image.gz         |    1 +
 tests/f_multithread_no/name             |    1 +
 tests/f_multithread_no/script           |    4 +
 tests/f_multithread_preen/expect.1      |   11 +
 tests/f_multithread_preen/expect.2      |   23 +
 tests/f_multithread_preen/image.gz      |    1 +
 tests/f_multithread_preen/name          |    1 +
 tests/f_multithread_preen/script        |    4 +
 tests/f_multithread_yes/expect.1        |    2 +
 tests/f_multithread_yes/expect.2        |   23 +
 tests/f_multithread_yes/image.gz        |    1 +
 tests/f_multithread_yes/name            |    1 +
 tests/f_multithread_yes/script          |    4 +
 57 files changed, 2858 insertions(+), 455 deletions(-)
 create mode 100644 tests/f_multithread/expect.1
 create mode 100644 tests/f_multithread/expect.2
 create mode 120000 tests/f_multithread/image.gz
 create mode 100644 tests/f_multithread/name
 create mode 100644 tests/f_multithread/script
 create mode 100644 tests/f_multithread_completion/expect.1
 create mode 100644 tests/f_multithread_completion/expect.2
 create mode 120000 tests/f_multithread_completion/image.gz
 create mode 100644 tests/f_multithread_completion/name
 create mode 100644 tests/f_multithread_completion/script
 create mode 100644 tests/f_multithread_logfile/expect.1
 create mode 120000 tests/f_multithread_logfile/image.gz
 create mode 100644 tests/f_multithread_logfile/name
 create mode 100644 tests/f_multithread_logfile/script
 create mode 100644 tests/f_multithread_no/expect.1
 create mode 100644 tests/f_multithread_no/expect.2
 create mode 120000 tests/f_multithread_no/image.gz
 create mode 100644 tests/f_multithread_no/name
 create mode 100644 tests/f_multithread_no/script
 create mode 100644 tests/f_multithread_preen/expect.1
 create mode 100644 tests/f_multithread_preen/expect.2
 create mode 120000 tests/f_multithread_preen/image.gz
 create mode 100644 tests/f_multithread_preen/name
 create mode 100644 tests/f_multithread_preen/script
 create mode 100644 tests/f_multithread_yes/expect.1
 create mode 100644 tests/f_multithread_yes/expect.2
 create mode 120000 tests/f_multithread_yes/image.gz
 create mode 100644 tests/f_multithread_yes/name
 create mode 100644 tests/f_multithread_yes/script

-- 
2.25.2

