linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] Revert "block: simplify set_init_blocksize" to regain lost performance
@ 2021-01-26 19:59 Maxim Mikityanskiy
  2021-01-27  4:23 ` Bart Van Assche
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Maxim Mikityanskiy @ 2021-01-26 19:59 UTC (permalink / raw)
  To: Jens Axboe, Alexander Viro, Christoph Hellwig
  Cc: linux-block, linux-fsdevel, linux-kernel, Maxim Mikityanskiy

The cited commit introduced a serious regression with SATA write speed,
as found by bisecting. This patch reverts this commit, which restores
write speed back to the values observed before this commit.

The performance tests were done on a Helios4 NAS (2nd batch) with 4 HDDs
(WD8003FFBX) using dd (bs=1M count=2000). "Direct" is a test with a
single HDD, the rest are different RAID levels built over the first
partitions of 4 HDDs. Test results are in MB/s, R is read, W is write.

                | Direct | RAID0 | RAID10 f2 | RAID10 n2 | RAID6
----------------+--------+-------+-----------+-----------+--------
9011495c9466    | R:256  | R:313 | R:276     | R:313     | R:323
(before faulty) | W:254  | W:253 | W:195     | W:204     | W:117
----------------+--------+-------+-----------+-----------+--------
5ff9f19231a0    | R:257  | R:398 | R:312     | R:344     | R:391
(faulty commit) | W:154  | W:122 | W:67.7    | W:66.6    | W:67.2
----------------+--------+-------+-----------+-----------+--------
5.10.10         | R:256  | R:401 | R:312     | R:356     | R:375
unpatched       | W:149  | W:123 | W:64      | W:64.1    | W:61.5
----------------+--------+-------+-----------+-----------+--------
5.10.10         | R:255  | R:396 | R:312     | R:340     | R:393
patched         | W:247  | W:274 | W:220     | W:225     | W:121

Applying this patch doesn't hurt read performance, while improves the
write speed by 1.5x - 3.5x (more impact on RAID tests). The write speed
is restored back to the state before the faulty commit, and even a bit
higher in RAID tests (which aren't HDD-bound on this device) - that is
likely related to other optimizations done between the faulty commit and
5.10.10 which also improved the read speed.

Signed-off-by: Maxim Mikityanskiy <maxtram95@gmail.com>
Fixes: 5ff9f19231a0 ("block: simplify set_init_blocksize")
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@kernel.dk>
---
 fs/block_dev.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 3b8963e228a1..235b5042672e 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -130,7 +130,15 @@ EXPORT_SYMBOL(truncate_bdev_range);
 
 static void set_init_blocksize(struct block_device *bdev)
 {
-	bdev->bd_inode->i_blkbits = blksize_bits(bdev_logical_block_size(bdev));
+	unsigned int bsize = bdev_logical_block_size(bdev);
+	loff_t size = i_size_read(bdev->bd_inode);
+
+	while (bsize < PAGE_SIZE) {
+		if (size & bsize)
+			break;
+		bsize <<= 1;
+	}
+	bdev->bd_inode->i_blkbits = blksize_bits(bsize);
 }
 
 int set_blocksize(struct block_device *bdev, int size)
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] Revert "block: simplify set_init_blocksize" to regain lost performance
  2021-01-26 19:59 [PATCH] Revert "block: simplify set_init_blocksize" to regain lost performance Maxim Mikityanskiy
@ 2021-01-27  4:23 ` Bart Van Assche
  2021-01-27  7:44   ` Maxim Mikityanskiy
  2021-01-27 16:12 ` Christoph Hellwig
  2021-01-27 16:15 ` Jens Axboe
  2 siblings, 1 reply; 6+ messages in thread
From: Bart Van Assche @ 2021-01-27  4:23 UTC (permalink / raw)
  To: Maxim Mikityanskiy, Jens Axboe, Alexander Viro, Christoph Hellwig
  Cc: linux-block, linux-fsdevel, linux-kernel

On 1/26/21 11:59 AM, Maxim Mikityanskiy wrote:
> The cited commit introduced a serious regression with SATA write speed,
> as found by bisecting. This patch reverts this commit, which restores
> write speed back to the values observed before this commit.
> 
> The performance tests were done on a Helios4 NAS (2nd batch) with 4 HDDs
> (WD8003FFBX) using dd (bs=1M count=2000). "Direct" is a test with a
> single HDD, the rest are different RAID levels built over the first
> partitions of 4 HDDs. Test results are in MB/s, R is read, W is write.
> 
>                 | Direct | RAID0 | RAID10 f2 | RAID10 n2 | RAID6
> ----------------+--------+-------+-----------+-----------+--------
> 9011495c9466    | R:256  | R:313 | R:276     | R:313     | R:323
> (before faulty) | W:254  | W:253 | W:195     | W:204     | W:117
> ----------------+--------+-------+-----------+-----------+--------
> 5ff9f19231a0    | R:257  | R:398 | R:312     | R:344     | R:391
> (faulty commit) | W:154  | W:122 | W:67.7    | W:66.6    | W:67.2
> ----------------+--------+-------+-----------+-----------+--------
> 5.10.10         | R:256  | R:401 | R:312     | R:356     | R:375
> unpatched       | W:149  | W:123 | W:64      | W:64.1    | W:61.5
> ----------------+--------+-------+-----------+-----------+--------
> 5.10.10         | R:255  | R:396 | R:312     | R:340     | R:393
> patched         | W:247  | W:274 | W:220     | W:225     | W:121
> 
> Applying this patch doesn't hurt read performance, while improves the
> write speed by 1.5x - 3.5x (more impact on RAID tests). The write speed
> is restored back to the state before the faulty commit, and even a bit
> higher in RAID tests (which aren't HDD-bound on this device) - that is
> likely related to other optimizations done between the faulty commit and
> 5.10.10 which also improved the read speed.
> 
> Signed-off-by: Maxim Mikityanskiy <maxtram95@gmail.com>
> Fixes: 5ff9f19231a0 ("block: simplify set_init_blocksize")
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Jens Axboe <axboe@kernel.dk>
> ---
>  fs/block_dev.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/block_dev.c b/fs/block_dev.c
> index 3b8963e228a1..235b5042672e 100644
> --- a/fs/block_dev.c
> +++ b/fs/block_dev.c
> @@ -130,7 +130,15 @@ EXPORT_SYMBOL(truncate_bdev_range);
>  
>  static void set_init_blocksize(struct block_device *bdev)
>  {
> -	bdev->bd_inode->i_blkbits = blksize_bits(bdev_logical_block_size(bdev));
> +	unsigned int bsize = bdev_logical_block_size(bdev);
> +	loff_t size = i_size_read(bdev->bd_inode);
> +
> +	while (bsize < PAGE_SIZE) {
> +		if (size & bsize)
> +			break;
> +		bsize <<= 1;
> +	}
> +	bdev->bd_inode->i_blkbits = blksize_bits(bsize);
>  }
>  
>  int set_blocksize(struct block_device *bdev, int size)

How can this patch affect write speed? I haven't found any calls of
set_init_blocksize() in the I/O path. Did I perhaps overlook something?

Bart.



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Revert "block: simplify set_init_blocksize" to regain lost performance
  2021-01-27  4:23 ` Bart Van Assche
@ 2021-01-27  7:44   ` Maxim Mikityanskiy
  2021-01-27 15:18     ` Ming Lei
  0 siblings, 1 reply; 6+ messages in thread
From: Maxim Mikityanskiy @ 2021-01-27  7:44 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, Alexander Viro, Christoph Hellwig, linux-block,
	linux-fsdevel, Linux Kernel Mailing List

On Wed, Jan 27, 2021 at 6:23 AM Bart Van Assche <bvanassche@acm.org> wrote:
>
> On 1/26/21 11:59 AM, Maxim Mikityanskiy wrote:
> > The cited commit introduced a serious regression with SATA write speed,
> > as found by bisecting. This patch reverts this commit, which restores
> > write speed back to the values observed before this commit.
> >
> > The performance tests were done on a Helios4 NAS (2nd batch) with 4 HDDs
> > (WD8003FFBX) using dd (bs=1M count=2000). "Direct" is a test with a
> > single HDD, the rest are different RAID levels built over the first
> > partitions of 4 HDDs. Test results are in MB/s, R is read, W is write.
> >
> >                 | Direct | RAID0 | RAID10 f2 | RAID10 n2 | RAID6
> > ----------------+--------+-------+-----------+-----------+--------
> > 9011495c9466    | R:256  | R:313 | R:276     | R:313     | R:323
> > (before faulty) | W:254  | W:253 | W:195     | W:204     | W:117
> > ----------------+--------+-------+-----------+-----------+--------
> > 5ff9f19231a0    | R:257  | R:398 | R:312     | R:344     | R:391
> > (faulty commit) | W:154  | W:122 | W:67.7    | W:66.6    | W:67.2
> > ----------------+--------+-------+-----------+-----------+--------
> > 5.10.10         | R:256  | R:401 | R:312     | R:356     | R:375
> > unpatched       | W:149  | W:123 | W:64      | W:64.1    | W:61.5
> > ----------------+--------+-------+-----------+-----------+--------
> > 5.10.10         | R:255  | R:396 | R:312     | R:340     | R:393
> > patched         | W:247  | W:274 | W:220     | W:225     | W:121
> >
> > Applying this patch doesn't hurt read performance, while improves the
> > write speed by 1.5x - 3.5x (more impact on RAID tests). The write speed
> > is restored back to the state before the faulty commit, and even a bit
> > higher in RAID tests (which aren't HDD-bound on this device) - that is
> > likely related to other optimizations done between the faulty commit and
> > 5.10.10 which also improved the read speed.
> >
> > Signed-off-by: Maxim Mikityanskiy <maxtram95@gmail.com>
> > Fixes: 5ff9f19231a0 ("block: simplify set_init_blocksize")
> > Cc: Christoph Hellwig <hch@lst.de>
> > Cc: Jens Axboe <axboe@kernel.dk>
> > ---
> >  fs/block_dev.c | 10 +++++++++-
> >  1 file changed, 9 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/block_dev.c b/fs/block_dev.c
> > index 3b8963e228a1..235b5042672e 100644
> > --- a/fs/block_dev.c
> > +++ b/fs/block_dev.c
> > @@ -130,7 +130,15 @@ EXPORT_SYMBOL(truncate_bdev_range);
> >
> >  static void set_init_blocksize(struct block_device *bdev)
> >  {
> > -     bdev->bd_inode->i_blkbits = blksize_bits(bdev_logical_block_size(bdev));
> > +     unsigned int bsize = bdev_logical_block_size(bdev);
> > +     loff_t size = i_size_read(bdev->bd_inode);
> > +
> > +     while (bsize < PAGE_SIZE) {
> > +             if (size & bsize)
> > +                     break;
> > +             bsize <<= 1;
> > +     }
> > +     bdev->bd_inode->i_blkbits = blksize_bits(bsize);
> >  }
> >
> >  int set_blocksize(struct block_device *bdev, int size)
>
> How can this patch affect write speed? I haven't found any calls of
> set_init_blocksize() in the I/O path. Did I perhaps overlook something?

I don't know the exact mechanism how this change affects the speed,
I'm not an expert in the block device subsystem (I'm a networking
guy). This commit was found by git bisect, and my performance test
confirmed that reverting it fixes the bug.

It looks to me as this function sets the block size as part of control
flow, and this size is used later in the fast path, and the commit
that removed the loop decreased this block size.

> Bart.
>
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Revert "block: simplify set_init_blocksize" to regain lost performance
  2021-01-27  7:44   ` Maxim Mikityanskiy
@ 2021-01-27 15:18     ` Ming Lei
  0 siblings, 0 replies; 6+ messages in thread
From: Ming Lei @ 2021-01-27 15:18 UTC (permalink / raw)
  To: Maxim Mikityanskiy
  Cc: Bart Van Assche, Jens Axboe, Alexander Viro, Christoph Hellwig,
	linux-block, linux-fsdevel, Linux Kernel Mailing List

On Wed, Jan 27, 2021 at 09:44:50AM +0200, Maxim Mikityanskiy wrote:
> On Wed, Jan 27, 2021 at 6:23 AM Bart Van Assche <bvanassche@acm.org> wrote:
> >
> > On 1/26/21 11:59 AM, Maxim Mikityanskiy wrote:
> > > The cited commit introduced a serious regression with SATA write speed,
> > > as found by bisecting. This patch reverts this commit, which restores
> > > write speed back to the values observed before this commit.
> > >
> > > The performance tests were done on a Helios4 NAS (2nd batch) with 4 HDDs
> > > (WD8003FFBX) using dd (bs=1M count=2000). "Direct" is a test with a
> > > single HDD, the rest are different RAID levels built over the first
> > > partitions of 4 HDDs. Test results are in MB/s, R is read, W is write.
> > >
> > >                 | Direct | RAID0 | RAID10 f2 | RAID10 n2 | RAID6
> > > ----------------+--------+-------+-----------+-----------+--------
> > > 9011495c9466    | R:256  | R:313 | R:276     | R:313     | R:323
> > > (before faulty) | W:254  | W:253 | W:195     | W:204     | W:117
> > > ----------------+--------+-------+-----------+-----------+--------
> > > 5ff9f19231a0    | R:257  | R:398 | R:312     | R:344     | R:391
> > > (faulty commit) | W:154  | W:122 | W:67.7    | W:66.6    | W:67.2
> > > ----------------+--------+-------+-----------+-----------+--------
> > > 5.10.10         | R:256  | R:401 | R:312     | R:356     | R:375
> > > unpatched       | W:149  | W:123 | W:64      | W:64.1    | W:61.5
> > > ----------------+--------+-------+-----------+-----------+--------
> > > 5.10.10         | R:255  | R:396 | R:312     | R:340     | R:393
> > > patched         | W:247  | W:274 | W:220     | W:225     | W:121
> > >
> > > Applying this patch doesn't hurt read performance, while improves the
> > > write speed by 1.5x - 3.5x (more impact on RAID tests). The write speed
> > > is restored back to the state before the faulty commit, and even a bit
> > > higher in RAID tests (which aren't HDD-bound on this device) - that is
> > > likely related to other optimizations done between the faulty commit and
> > > 5.10.10 which also improved the read speed.
> > >
> > > Signed-off-by: Maxim Mikityanskiy <maxtram95@gmail.com>
> > > Fixes: 5ff9f19231a0 ("block: simplify set_init_blocksize")
> > > Cc: Christoph Hellwig <hch@lst.de>
> > > Cc: Jens Axboe <axboe@kernel.dk>
> > > ---
> > >  fs/block_dev.c | 10 +++++++++-
> > >  1 file changed, 9 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/fs/block_dev.c b/fs/block_dev.c
> > > index 3b8963e228a1..235b5042672e 100644
> > > --- a/fs/block_dev.c
> > > +++ b/fs/block_dev.c
> > > @@ -130,7 +130,15 @@ EXPORT_SYMBOL(truncate_bdev_range);
> > >
> > >  static void set_init_blocksize(struct block_device *bdev)
> > >  {
> > > -     bdev->bd_inode->i_blkbits = blksize_bits(bdev_logical_block_size(bdev));
> > > +     unsigned int bsize = bdev_logical_block_size(bdev);
> > > +     loff_t size = i_size_read(bdev->bd_inode);
> > > +
> > > +     while (bsize < PAGE_SIZE) {
> > > +             if (size & bsize)
> > > +                     break;
> > > +             bsize <<= 1;
> > > +     }
> > > +     bdev->bd_inode->i_blkbits = blksize_bits(bsize);
> > >  }
> > >
> > >  int set_blocksize(struct block_device *bdev, int size)
> >
> > How can this patch affect write speed? I haven't found any calls of
> > set_init_blocksize() in the I/O path. Did I perhaps overlook something?
> 
> I don't know the exact mechanism how this change affects the speed,
> I'm not an expert in the block device subsystem (I'm a networking
> guy). This commit was found by git bisect, and my performance test
> confirmed that reverting it fixes the bug.
> 
> It looks to me as this function sets the block size as part of control
> flow, and this size is used later in the fast path, and the commit
> that removed the loop decreased this block size.

Right, the issue is stupid __block_write_full_page() which submits single bio
for each buffer head. And I have tried to improve the situation by merging
BHs into single bio, see below patch:

	https://lore.kernel.org/linux-block/20201230000815.3448707-1-ming.lei@redhat.com/

The above patch should improve perf for your test case.

-- 
Ming


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Revert "block: simplify set_init_blocksize" to regain lost performance
  2021-01-26 19:59 [PATCH] Revert "block: simplify set_init_blocksize" to regain lost performance Maxim Mikityanskiy
  2021-01-27  4:23 ` Bart Van Assche
@ 2021-01-27 16:12 ` Christoph Hellwig
  2021-01-27 16:15 ` Jens Axboe
  2 siblings, 0 replies; 6+ messages in thread
From: Christoph Hellwig @ 2021-01-27 16:12 UTC (permalink / raw)
  To: Maxim Mikityanskiy
  Cc: Jens Axboe, Alexander Viro, Christoph Hellwig, linux-block,
	linux-fsdevel, linux-kernel

While this code is gross, I think we need to add it back for now:

Acked-by: Christoph Hellwig <hch@lst.de>

I'll put converting the block device buffered I/O path to iomap or
an iomap lookalike on the backburner to fix this..

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Revert "block: simplify set_init_blocksize" to regain lost performance
  2021-01-26 19:59 [PATCH] Revert "block: simplify set_init_blocksize" to regain lost performance Maxim Mikityanskiy
  2021-01-27  4:23 ` Bart Van Assche
  2021-01-27 16:12 ` Christoph Hellwig
@ 2021-01-27 16:15 ` Jens Axboe
  2 siblings, 0 replies; 6+ messages in thread
From: Jens Axboe @ 2021-01-27 16:15 UTC (permalink / raw)
  To: Maxim Mikityanskiy, Alexander Viro, Christoph Hellwig
  Cc: linux-block, linux-fsdevel, linux-kernel

On 1/26/21 12:59 PM, Maxim Mikityanskiy wrote:
> The cited commit introduced a serious regression with SATA write speed,
> as found by bisecting. This patch reverts this commit, which restores
> write speed back to the values observed before this commit.
> 
> The performance tests were done on a Helios4 NAS (2nd batch) with 4 HDDs
> (WD8003FFBX) using dd (bs=1M count=2000). "Direct" is a test with a
> single HDD, the rest are different RAID levels built over the first
> partitions of 4 HDDs. Test results are in MB/s, R is read, W is write.
> 
>                 | Direct | RAID0 | RAID10 f2 | RAID10 n2 | RAID6
> ----------------+--------+-------+-----------+-----------+--------
> 9011495c9466    | R:256  | R:313 | R:276     | R:313     | R:323
> (before faulty) | W:254  | W:253 | W:195     | W:204     | W:117
> ----------------+--------+-------+-----------+-----------+--------
> 5ff9f19231a0    | R:257  | R:398 | R:312     | R:344     | R:391
> (faulty commit) | W:154  | W:122 | W:67.7    | W:66.6    | W:67.2
> ----------------+--------+-------+-----------+-----------+--------
> 5.10.10         | R:256  | R:401 | R:312     | R:356     | R:375
> unpatched       | W:149  | W:123 | W:64      | W:64.1    | W:61.5
> ----------------+--------+-------+-----------+-----------+--------
> 5.10.10         | R:255  | R:396 | R:312     | R:340     | R:393
> patched         | W:247  | W:274 | W:220     | W:225     | W:121
> 
> Applying this patch doesn't hurt read performance, while improves the
> write speed by 1.5x - 3.5x (more impact on RAID tests). The write speed
> is restored back to the state before the faulty commit, and even a bit
> higher in RAID tests (which aren't HDD-bound on this device) - that is
> likely related to other optimizations done between the faulty commit and
> 5.10.10 which also improved the read speed.

Can't argue with these numbers, and while this should probably get
fixed up instead, let's leave that for future kernels. I'll apply this
for 5.11, thanks.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-01-27 16:17 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-26 19:59 [PATCH] Revert "block: simplify set_init_blocksize" to regain lost performance Maxim Mikityanskiy
2021-01-27  4:23 ` Bart Van Assche
2021-01-27  7:44   ` Maxim Mikityanskiy
2021-01-27 15:18     ` Ming Lei
2021-01-27 16:12 ` Christoph Hellwig
2021-01-27 16:15 ` Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).