* [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs()
@ 2023-03-10 6:07 Yangtao Li
2023-03-10 6:17 ` Eric Biggers
` (3 more replies)
0 siblings, 4 replies; 22+ messages in thread
From: Yangtao Li @ 2023-03-10 6:07 UTC (permalink / raw)
To: tytso, adilger.kernel; +Cc: linux-ext4, linux-kernel, viro, Yangtao Li
Just for better readability, no code logic change.
Signed-off-by: Yangtao Li <frank.li@vivo.com>
---
fs/ext4/inode.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index d251d705c276..d121cde74522 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2218,8 +2218,7 @@ static int mpage_process_page_bufs(struct mpage_da_data *mpd,
{
struct inode *inode = mpd->inode;
int err;
- ext4_lblk_t blocks = (i_size_read(inode) + i_blocksize(inode) - 1)
- >> inode->i_blkbits;
+ ext4_lblk_t blocks = DIV_ROUND_UP(i_size_read(inode), i_blocksize(inode));
if (ext4_verity_in_progress(inode))
blocks = EXT_MAX_BLOCKS;
--
2.25.1
^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs()
2023-03-10 6:07 [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs() Yangtao Li
@ 2023-03-10 6:17 ` Eric Biggers
2023-03-10 6:27 ` Yangtao Li
2023-03-10 6:37 ` Al Viro
2023-03-10 19:07 ` kernel test robot
` (2 subsequent siblings)
3 siblings, 2 replies; 22+ messages in thread
From: Eric Biggers @ 2023-03-10 6:17 UTC (permalink / raw)
To: Yangtao Li; +Cc: tytso, adilger.kernel, linux-ext4, linux-kernel, viro
On Fri, Mar 10, 2023 at 02:07:34PM +0800, Yangtao Li wrote:
> Just for better readability, no code logic change.
>
> Signed-off-by: Yangtao Li <frank.li@vivo.com>
> ---
> fs/ext4/inode.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index d251d705c276..d121cde74522 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -2218,8 +2218,7 @@ static int mpage_process_page_bufs(struct mpage_da_data *mpd,
> {
> struct inode *inode = mpd->inode;
> int err;
> - ext4_lblk_t blocks = (i_size_read(inode) + i_blocksize(inode) - 1)
> - >> inode->i_blkbits;
> + ext4_lblk_t blocks = DIV_ROUND_UP(i_size_read(inode), i_blocksize(inode));
>
Please don't do this. This makes the code compile down to a division, which is
far less efficient. I've verified this by checking the assembly generated.
- Eric
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs()
2023-03-10 6:17 ` Eric Biggers
@ 2023-03-10 6:27 ` Yangtao Li
2023-03-10 6:37 ` Al Viro
1 sibling, 0 replies; 22+ messages in thread
From: Yangtao Li @ 2023-03-10 6:27 UTC (permalink / raw)
To: tytso, adilger.kernel
Cc: hsiangkao, linux-ext4, linux-erofs, linux-kernel, viro
> Please don't do this. This makes the code compile down to a division, which is
> far less efficient. I've verified this by checking the assembly generated.
How much is the performance impact? So should the following be modified as shift operations as well?
fs/erofs/namei.c:92: int head = 0, back = DIV_ROUND_UP(dir->i_size, EROFS_BLKSIZ) - 1;
fs/erofs/zmap.c:252: const unsigned int totalidx = DIV_ROUND_UP(inode->i_size, EROFS_BLKSIZ);
fs/erofs/decompressor.c:14:#define LZ4_MAX_DISTANCE_PAGES (DIV_ROUND_UP(LZ4_DISTANCE_MAX, PAGE_SIZE) + 1)
fs/erofs/decompressor.c:56: DIV_ROUND_UP(distance, PAGE_SIZE) + 1 :
fs/erofs/decompressor.c:70: unsigned long bounced[DIV_ROUND_UP(LZ4_MAX_DISTANCE_PAGES,
fs/erofs/data.c:84: nblocks = DIV_ROUND_UP(inode->i_size, EROFS_BLKSIZ);
Thx,
Yangtao
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs()
@ 2023-03-10 6:27 ` Yangtao Li
0 siblings, 0 replies; 22+ messages in thread
From: Yangtao Li @ 2023-03-10 6:27 UTC (permalink / raw)
To: tytso, adilger.kernel
Cc: linux-ext4, linux-kernel, viro, hsiangkao, linux-erofs
> Please don't do this. This makes the code compile down to a division, which is
> far less efficient. I've verified this by checking the assembly generated.
How much is the performance impact? So should the following be modified as shift operations as well?
fs/erofs/namei.c:92: int head = 0, back = DIV_ROUND_UP(dir->i_size, EROFS_BLKSIZ) - 1;
fs/erofs/zmap.c:252: const unsigned int totalidx = DIV_ROUND_UP(inode->i_size, EROFS_BLKSIZ);
fs/erofs/decompressor.c:14:#define LZ4_MAX_DISTANCE_PAGES (DIV_ROUND_UP(LZ4_DISTANCE_MAX, PAGE_SIZE) + 1)
fs/erofs/decompressor.c:56: DIV_ROUND_UP(distance, PAGE_SIZE) + 1 :
fs/erofs/decompressor.c:70: unsigned long bounced[DIV_ROUND_UP(LZ4_MAX_DISTANCE_PAGES,
fs/erofs/data.c:84: nblocks = DIV_ROUND_UP(inode->i_size, EROFS_BLKSIZ);
Thx,
Yangtao
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs()
2023-03-10 6:27 ` Yangtao Li
@ 2023-03-10 6:35 ` Gao Xiang
-1 siblings, 0 replies; 22+ messages in thread
From: Gao Xiang @ 2023-03-10 6:35 UTC (permalink / raw)
To: Yangtao Li, tytso, adilger.kernel
Cc: linux-ext4, linux-kernel, viro, linux-erofs
On 2023/3/10 14:27, Yangtao Li wrote:
>> Please don't do this. This makes the code compile down to a division, which is
>> far less efficient. I've verified this by checking the assembly generated.
>
> How much is the performance impact? So should the following be modified as shift operations as well?
>
> fs/erofs/namei.c:92: int head = 0, back = DIV_ROUND_UP(dir->i_size, EROFS_BLKSIZ) - 1;
> fs/erofs/zmap.c:252: const unsigned int totalidx = DIV_ROUND_UP(inode->i_size, EROFS_BLKSIZ);
> fs/erofs/decompressor.c:14:#define LZ4_MAX_DISTANCE_PAGES (DIV_ROUND_UP(LZ4_DISTANCE_MAX, PAGE_SIZE) + 1)
> fs/erofs/decompressor.c:56: DIV_ROUND_UP(distance, PAGE_SIZE) + 1 :
> fs/erofs/decompressor.c:70: unsigned long bounced[DIV_ROUND_UP(LZ4_MAX_DISTANCE_PAGES,
> fs/erofs/data.c:84: nblocks = DIV_ROUND_UP(inode->i_size, EROFS_BLKSIZ);
Please stop taking EROFS as example on ext4 patches
and they will all be changed due to subpage support.
>
> Thx,
> Yangtao
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs()
@ 2023-03-10 6:35 ` Gao Xiang
0 siblings, 0 replies; 22+ messages in thread
From: Gao Xiang @ 2023-03-10 6:35 UTC (permalink / raw)
To: Yangtao Li, tytso, adilger.kernel
Cc: linux-ext4, linux-erofs, linux-kernel, viro
On 2023/3/10 14:27, Yangtao Li wrote:
>> Please don't do this. This makes the code compile down to a division, which is
>> far less efficient. I've verified this by checking the assembly generated.
>
> How much is the performance impact? So should the following be modified as shift operations as well?
>
> fs/erofs/namei.c:92: int head = 0, back = DIV_ROUND_UP(dir->i_size, EROFS_BLKSIZ) - 1;
> fs/erofs/zmap.c:252: const unsigned int totalidx = DIV_ROUND_UP(inode->i_size, EROFS_BLKSIZ);
> fs/erofs/decompressor.c:14:#define LZ4_MAX_DISTANCE_PAGES (DIV_ROUND_UP(LZ4_DISTANCE_MAX, PAGE_SIZE) + 1)
> fs/erofs/decompressor.c:56: DIV_ROUND_UP(distance, PAGE_SIZE) + 1 :
> fs/erofs/decompressor.c:70: unsigned long bounced[DIV_ROUND_UP(LZ4_MAX_DISTANCE_PAGES,
> fs/erofs/data.c:84: nblocks = DIV_ROUND_UP(inode->i_size, EROFS_BLKSIZ);
Please stop taking EROFS as example on ext4 patches
and they will all be changed due to subpage support.
>
> Thx,
> Yangtao
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs()
2023-03-10 6:35 ` Gao Xiang
@ 2023-03-10 6:37 ` Gao Xiang
-1 siblings, 0 replies; 22+ messages in thread
From: Gao Xiang @ 2023-03-10 6:37 UTC (permalink / raw)
To: Yangtao Li, tytso, adilger.kernel
Cc: linux-ext4, linux-kernel, viro, linux-erofs
On 2023/3/10 14:35, Gao Xiang wrote:
>
>
> On 2023/3/10 14:27, Yangtao Li wrote:
>>> Please don't do this. This makes the code compile down to a division, which is
>>> far less efficient. I've verified this by checking the assembly generated.
>>
>> How much is the performance impact? So should the following be modified as shift operations as well?
>>
>> fs/erofs/namei.c:92: int head = 0, back = DIV_ROUND_UP(dir->i_size, EROFS_BLKSIZ) - 1;
>> fs/erofs/zmap.c:252: const unsigned int totalidx = DIV_ROUND_UP(inode->i_size, EROFS_BLKSIZ);
>> fs/erofs/decompressor.c:14:#define LZ4_MAX_DISTANCE_PAGES (DIV_ROUND_UP(LZ4_DISTANCE_MAX, PAGE_SIZE) + 1)
>> fs/erofs/decompressor.c:56: DIV_ROUND_UP(distance, PAGE_SIZE) + 1 :
>> fs/erofs/decompressor.c:70: unsigned long bounced[DIV_ROUND_UP(LZ4_MAX_DISTANCE_PAGES,
>> fs/erofs/data.c:84: nblocks = DIV_ROUND_UP(inode->i_size, EROFS_BLKSIZ);
>
> Please stop taking EROFS as example on ext4 patches
> and they will all be changed due to subpage support.
Here EROFS_BLKSIZ == PAGE_SIZE is a constant, so no difference
to use shift or division.
>
>>
>> Thx,
>> Yangtao
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs()
@ 2023-03-10 6:37 ` Gao Xiang
0 siblings, 0 replies; 22+ messages in thread
From: Gao Xiang @ 2023-03-10 6:37 UTC (permalink / raw)
To: Yangtao Li, tytso, adilger.kernel
Cc: linux-ext4, linux-erofs, linux-kernel, viro
On 2023/3/10 14:35, Gao Xiang wrote:
>
>
> On 2023/3/10 14:27, Yangtao Li wrote:
>>> Please don't do this. This makes the code compile down to a division, which is
>>> far less efficient. I've verified this by checking the assembly generated.
>>
>> How much is the performance impact? So should the following be modified as shift operations as well?
>>
>> fs/erofs/namei.c:92: int head = 0, back = DIV_ROUND_UP(dir->i_size, EROFS_BLKSIZ) - 1;
>> fs/erofs/zmap.c:252: const unsigned int totalidx = DIV_ROUND_UP(inode->i_size, EROFS_BLKSIZ);
>> fs/erofs/decompressor.c:14:#define LZ4_MAX_DISTANCE_PAGES (DIV_ROUND_UP(LZ4_DISTANCE_MAX, PAGE_SIZE) + 1)
>> fs/erofs/decompressor.c:56: DIV_ROUND_UP(distance, PAGE_SIZE) + 1 :
>> fs/erofs/decompressor.c:70: unsigned long bounced[DIV_ROUND_UP(LZ4_MAX_DISTANCE_PAGES,
>> fs/erofs/data.c:84: nblocks = DIV_ROUND_UP(inode->i_size, EROFS_BLKSIZ);
>
> Please stop taking EROFS as example on ext4 patches
> and they will all be changed due to subpage support.
Here EROFS_BLKSIZ == PAGE_SIZE is a constant, so no difference
to use shift or division.
>
>>
>> Thx,
>> Yangtao
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs()
2023-03-10 6:17 ` Eric Biggers
2023-03-10 6:27 ` Yangtao Li
@ 2023-03-10 6:37 ` Al Viro
2023-03-10 6:43 ` Al Viro
2023-03-10 6:43 ` Eric Biggers
1 sibling, 2 replies; 22+ messages in thread
From: Al Viro @ 2023-03-10 6:37 UTC (permalink / raw)
To: Eric Biggers; +Cc: Yangtao Li, tytso, adilger.kernel, linux-ext4, linux-kernel
On Thu, Mar 09, 2023 at 10:17:16PM -0800, Eric Biggers wrote:
> On Fri, Mar 10, 2023 at 02:07:34PM +0800, Yangtao Li wrote:
> > Just for better readability, no code logic change.
> >
> > Signed-off-by: Yangtao Li <frank.li@vivo.com>
> > ---
> > fs/ext4/inode.c | 3 +--
> > 1 file changed, 1 insertion(+), 2 deletions(-)
> >
> > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> > index d251d705c276..d121cde74522 100644
> > --- a/fs/ext4/inode.c
> > +++ b/fs/ext4/inode.c
> > @@ -2218,8 +2218,7 @@ static int mpage_process_page_bufs(struct mpage_da_data *mpd,
> > {
> > struct inode *inode = mpd->inode;
> > int err;
> > - ext4_lblk_t blocks = (i_size_read(inode) + i_blocksize(inode) - 1)
> > - >> inode->i_blkbits;
> > + ext4_lblk_t blocks = DIV_ROUND_UP(i_size_read(inode), i_blocksize(inode));
> >
>
> Please don't do this. This makes the code compile down to a division, which is
> far less efficient. I've verified this by checking the assembly generated.
Which compiler is doing that?
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs()
2023-03-10 6:27 ` Yangtao Li
@ 2023-03-10 6:41 ` Eric Biggers
-1 siblings, 0 replies; 22+ messages in thread
From: Eric Biggers @ 2023-03-10 6:41 UTC (permalink / raw)
To: Yangtao Li
Cc: tytso, adilger.kernel, hsiangkao, linux-ext4, linux-erofs,
linux-kernel, viro
On Fri, Mar 10, 2023 at 02:27:38PM +0800, Yangtao Li wrote:
> > Please don't do this. This makes the code compile down to a division, which is
> > far less efficient. I've verified this by checking the assembly generated.
>
> How much is the performance impact? So should the following be modified as shift operations as well?
>
> fs/erofs/namei.c:92: int head = 0, back = DIV_ROUND_UP(dir->i_size, EROFS_BLKSIZ) - 1;
> fs/erofs/zmap.c:252: const unsigned int totalidx = DIV_ROUND_UP(inode->i_size, EROFS_BLKSIZ);
> fs/erofs/decompressor.c:14:#define LZ4_MAX_DISTANCE_PAGES (DIV_ROUND_UP(LZ4_DISTANCE_MAX, PAGE_SIZE) + 1)
> fs/erofs/decompressor.c:56: DIV_ROUND_UP(distance, PAGE_SIZE) + 1 :
> fs/erofs/decompressor.c:70: unsigned long bounced[DIV_ROUND_UP(LZ4_MAX_DISTANCE_PAGES,
> fs/erofs/data.c:84: nblocks = DIV_ROUND_UP(inode->i_size, EROFS_BLKSIZ);
>
No, compilers can handle division by a power-of-2 constant just fine. It's just
division by a non-constant that is inefficient.
- Eric
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs()
@ 2023-03-10 6:41 ` Eric Biggers
0 siblings, 0 replies; 22+ messages in thread
From: Eric Biggers @ 2023-03-10 6:41 UTC (permalink / raw)
To: Yangtao Li
Cc: tytso, linux-kernel, adilger.kernel, viro, hsiangkao, linux-ext4,
linux-erofs
On Fri, Mar 10, 2023 at 02:27:38PM +0800, Yangtao Li wrote:
> > Please don't do this. This makes the code compile down to a division, which is
> > far less efficient. I've verified this by checking the assembly generated.
>
> How much is the performance impact? So should the following be modified as shift operations as well?
>
> fs/erofs/namei.c:92: int head = 0, back = DIV_ROUND_UP(dir->i_size, EROFS_BLKSIZ) - 1;
> fs/erofs/zmap.c:252: const unsigned int totalidx = DIV_ROUND_UP(inode->i_size, EROFS_BLKSIZ);
> fs/erofs/decompressor.c:14:#define LZ4_MAX_DISTANCE_PAGES (DIV_ROUND_UP(LZ4_DISTANCE_MAX, PAGE_SIZE) + 1)
> fs/erofs/decompressor.c:56: DIV_ROUND_UP(distance, PAGE_SIZE) + 1 :
> fs/erofs/decompressor.c:70: unsigned long bounced[DIV_ROUND_UP(LZ4_MAX_DISTANCE_PAGES,
> fs/erofs/data.c:84: nblocks = DIV_ROUND_UP(inode->i_size, EROFS_BLKSIZ);
>
No, compilers can handle division by a power-of-2 constant just fine. It's just
division by a non-constant that is inefficient.
- Eric
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs()
2023-03-10 6:37 ` Al Viro
@ 2023-03-10 6:43 ` Al Viro
2023-03-10 6:43 ` Eric Biggers
1 sibling, 0 replies; 22+ messages in thread
From: Al Viro @ 2023-03-10 6:43 UTC (permalink / raw)
To: Eric Biggers; +Cc: Yangtao Li, tytso, adilger.kernel, linux-ext4, linux-kernel
On Fri, Mar 10, 2023 at 06:37:29AM +0000, Al Viro wrote:
> On Thu, Mar 09, 2023 at 10:17:16PM -0800, Eric Biggers wrote:
> > On Fri, Mar 10, 2023 at 02:07:34PM +0800, Yangtao Li wrote:
> > > Just for better readability, no code logic change.
> > >
> > > Signed-off-by: Yangtao Li <frank.li@vivo.com>
> > > ---
> > > fs/ext4/inode.c | 3 +--
> > > 1 file changed, 1 insertion(+), 2 deletions(-)
> > >
> > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> > > index d251d705c276..d121cde74522 100644
> > > --- a/fs/ext4/inode.c
> > > +++ b/fs/ext4/inode.c
> > > @@ -2218,8 +2218,7 @@ static int mpage_process_page_bufs(struct mpage_da_data *mpd,
> > > {
> > > struct inode *inode = mpd->inode;
> > > int err;
> > > - ext4_lblk_t blocks = (i_size_read(inode) + i_blocksize(inode) - 1)
> > > - >> inode->i_blkbits;
> > > + ext4_lblk_t blocks = DIV_ROUND_UP(i_size_read(inode), i_blocksize(inode));
> > >
> >
> > Please don't do this. This makes the code compile down to a division, which is
> > far less efficient. I've verified this by checking the assembly generated.
>
> Which compiler is doing that?
While we are at it, replace
return (1 << node->i_blkbits);
with
return (1u << node->i_blkbits);
and see if that changes the things.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs()
2023-03-10 6:37 ` Al Viro
2023-03-10 6:43 ` Al Viro
@ 2023-03-10 6:43 ` Eric Biggers
2023-03-10 6:46 ` Al Viro
1 sibling, 1 reply; 22+ messages in thread
From: Eric Biggers @ 2023-03-10 6:43 UTC (permalink / raw)
To: Al Viro; +Cc: Yangtao Li, tytso, adilger.kernel, linux-ext4, linux-kernel
On Fri, Mar 10, 2023 at 06:37:29AM +0000, Al Viro wrote:
> On Thu, Mar 09, 2023 at 10:17:16PM -0800, Eric Biggers wrote:
> > On Fri, Mar 10, 2023 at 02:07:34PM +0800, Yangtao Li wrote:
> > > Just for better readability, no code logic change.
> > >
> > > Signed-off-by: Yangtao Li <frank.li@vivo.com>
> > > ---
> > > fs/ext4/inode.c | 3 +--
> > > 1 file changed, 1 insertion(+), 2 deletions(-)
> > >
> > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> > > index d251d705c276..d121cde74522 100644
> > > --- a/fs/ext4/inode.c
> > > +++ b/fs/ext4/inode.c
> > > @@ -2218,8 +2218,7 @@ static int mpage_process_page_bufs(struct mpage_da_data *mpd,
> > > {
> > > struct inode *inode = mpd->inode;
> > > int err;
> > > - ext4_lblk_t blocks = (i_size_read(inode) + i_blocksize(inode) - 1)
> > > - >> inode->i_blkbits;
> > > + ext4_lblk_t blocks = DIV_ROUND_UP(i_size_read(inode), i_blocksize(inode));
> > >
> >
> > Please don't do this. This makes the code compile down to a division, which is
> > far less efficient. I've verified this by checking the assembly generated.
>
> Which compiler is doing that?
$ gcc --version
gcc (GCC) 12.2.1 20230201
i_blocksize(inode) is not a constant, so this should not be particularly
surprising. One might hope that a / (1 << b) would be optimized into a >> b,
but that doesn't seem to happen.
- Eric
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs()
2023-03-10 6:43 ` Eric Biggers
@ 2023-03-10 6:46 ` Al Viro
2023-03-10 6:54 ` Eric Biggers
0 siblings, 1 reply; 22+ messages in thread
From: Al Viro @ 2023-03-10 6:46 UTC (permalink / raw)
To: Eric Biggers; +Cc: Yangtao Li, tytso, adilger.kernel, linux-ext4, linux-kernel
On Thu, Mar 09, 2023 at 10:43:55PM -0800, Eric Biggers wrote:
> On Fri, Mar 10, 2023 at 06:37:29AM +0000, Al Viro wrote:
> > On Thu, Mar 09, 2023 at 10:17:16PM -0800, Eric Biggers wrote:
> > > On Fri, Mar 10, 2023 at 02:07:34PM +0800, Yangtao Li wrote:
> > > > Just for better readability, no code logic change.
> > > >
> > > > Signed-off-by: Yangtao Li <frank.li@vivo.com>
> > > > ---
> > > > fs/ext4/inode.c | 3 +--
> > > > 1 file changed, 1 insertion(+), 2 deletions(-)
> > > >
> > > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> > > > index d251d705c276..d121cde74522 100644
> > > > --- a/fs/ext4/inode.c
> > > > +++ b/fs/ext4/inode.c
> > > > @@ -2218,8 +2218,7 @@ static int mpage_process_page_bufs(struct mpage_da_data *mpd,
> > > > {
> > > > struct inode *inode = mpd->inode;
> > > > int err;
> > > > - ext4_lblk_t blocks = (i_size_read(inode) + i_blocksize(inode) - 1)
> > > > - >> inode->i_blkbits;
> > > > + ext4_lblk_t blocks = DIV_ROUND_UP(i_size_read(inode), i_blocksize(inode));
> > > >
> > >
> > > Please don't do this. This makes the code compile down to a division, which is
> > > far less efficient. I've verified this by checking the assembly generated.
> >
> > Which compiler is doing that?
>
> $ gcc --version
> gcc (GCC) 12.2.1 20230201
>
> i_blocksize(inode) is not a constant, so this should not be particularly
> surprising. One might hope that a / (1 << b) would be optimized into a >> b,
> but that doesn't seem to happen.
It really ought to be a / (1u << b), though...
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs()
2023-03-10 6:46 ` Al Viro
@ 2023-03-10 6:54 ` Eric Biggers
2023-03-10 7:05 ` Al Viro
0 siblings, 1 reply; 22+ messages in thread
From: Eric Biggers @ 2023-03-10 6:54 UTC (permalink / raw)
To: Al Viro; +Cc: Yangtao Li, tytso, adilger.kernel, linux-ext4, linux-kernel
On Fri, Mar 10, 2023 at 06:46:12AM +0000, Al Viro wrote:
> On Thu, Mar 09, 2023 at 10:43:55PM -0800, Eric Biggers wrote:
> > On Fri, Mar 10, 2023 at 06:37:29AM +0000, Al Viro wrote:
> > > On Thu, Mar 09, 2023 at 10:17:16PM -0800, Eric Biggers wrote:
> > > > On Fri, Mar 10, 2023 at 02:07:34PM +0800, Yangtao Li wrote:
> > > > > Just for better readability, no code logic change.
> > > > >
> > > > > Signed-off-by: Yangtao Li <frank.li@vivo.com>
> > > > > ---
> > > > > fs/ext4/inode.c | 3 +--
> > > > > 1 file changed, 1 insertion(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> > > > > index d251d705c276..d121cde74522 100644
> > > > > --- a/fs/ext4/inode.c
> > > > > +++ b/fs/ext4/inode.c
> > > > > @@ -2218,8 +2218,7 @@ static int mpage_process_page_bufs(struct mpage_da_data *mpd,
> > > > > {
> > > > > struct inode *inode = mpd->inode;
> > > > > int err;
> > > > > - ext4_lblk_t blocks = (i_size_read(inode) + i_blocksize(inode) - 1)
> > > > > - >> inode->i_blkbits;
> > > > > + ext4_lblk_t blocks = DIV_ROUND_UP(i_size_read(inode), i_blocksize(inode));
> > > > >
> > > >
> > > > Please don't do this. This makes the code compile down to a division, which is
> > > > far less efficient. I've verified this by checking the assembly generated.
> > >
> > > Which compiler is doing that?
> >
> > $ gcc --version
> > gcc (GCC) 12.2.1 20230201
> >
> > i_blocksize(inode) is not a constant, so this should not be particularly
> > surprising. One might hope that a / (1 << b) would be optimized into a >> b,
> > but that doesn't seem to happen.
>
> It really ought to be a / (1u << b), though...
Sure, that does better:
uint64_t f(uint64_t a, int b)
{
return a / (1U << b);
}
gcc:
0000000000000000 <f>:
0: 48 89 f8 mov %rdi,%rax
3: 89 f1 mov %esi,%ecx
5: 48 d3 e8 shr %cl,%rax
8: c3 ret
clang:
0000000000000000 <f>:
0: 89 f1 mov %esi,%ecx
2: 48 89 f8 mov %rdi,%rax
5: 48 d3 e8 shr %cl,%rax
8: c3 ret
But with a signed dividend (which is the case here) it gets messed up:
int64_t f(int64_t a, int b)
{
return a / (1U << b);
}
gcc:
0000000000000000 <f>:
0: 48 89 f8 mov %rdi,%rax
3: 89 f1 mov %esi,%ecx
5: bf 01 00 00 00 mov $0x1,%edi
a: d3 e7 shl %cl,%edi
c: 48 99 cqto
e: 48 f7 ff idiv %rdi
11: c3 ret
clang:
0000000000000000 <f>:
0: 89 f1 mov %esi,%ecx
2: be 01 00 00 00 mov $0x1,%esi
7: d3 e6 shl %cl,%esi
9: 48 89 f8 mov %rdi,%rax
c: 48 89 f9 mov %rdi,%rcx
f: 48 c1 e9 20 shr $0x20,%rcx
13: 74 06 je 1b <f+0x1b>
15: 48 99 cqto
17: 48 f7 fe idiv %rsi
1a: c3 ret
1b: 31 d2 xor %edx,%edx
1d: f7 f6 div %esi
1f: c3 ret
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs()
2023-03-10 6:54 ` Eric Biggers
@ 2023-03-10 7:05 ` Al Viro
2023-03-10 7:12 ` Al Viro
0 siblings, 1 reply; 22+ messages in thread
From: Al Viro @ 2023-03-10 7:05 UTC (permalink / raw)
To: Eric Biggers; +Cc: Yangtao Li, tytso, adilger.kernel, linux-ext4, linux-kernel
On Thu, Mar 09, 2023 at 10:54:38PM -0800, Eric Biggers wrote:
> On Fri, Mar 10, 2023 at 06:46:12AM +0000, Al Viro wrote:
> > On Thu, Mar 09, 2023 at 10:43:55PM -0800, Eric Biggers wrote:
> > > On Fri, Mar 10, 2023 at 06:37:29AM +0000, Al Viro wrote:
> > > > On Thu, Mar 09, 2023 at 10:17:16PM -0800, Eric Biggers wrote:
> > > > > On Fri, Mar 10, 2023 at 02:07:34PM +0800, Yangtao Li wrote:
> > > > > > Just for better readability, no code logic change.
> > > > > >
> > > > > > Signed-off-by: Yangtao Li <frank.li@vivo.com>
> > > > > > ---
> > > > > > fs/ext4/inode.c | 3 +--
> > > > > > 1 file changed, 1 insertion(+), 2 deletions(-)
> > > > > >
> > > > > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> > > > > > index d251d705c276..d121cde74522 100644
> > > > > > --- a/fs/ext4/inode.c
> > > > > > +++ b/fs/ext4/inode.c
> > > > > > @@ -2218,8 +2218,7 @@ static int mpage_process_page_bufs(struct mpage_da_data *mpd,
> > > > > > {
> > > > > > struct inode *inode = mpd->inode;
> > > > > > int err;
> > > > > > - ext4_lblk_t blocks = (i_size_read(inode) + i_blocksize(inode) - 1)
> > > > > > - >> inode->i_blkbits;
> > > > > > + ext4_lblk_t blocks = DIV_ROUND_UP(i_size_read(inode), i_blocksize(inode));
> > > > > >
> > > > >
> > > > > Please don't do this. This makes the code compile down to a division, which is
> > > > > far less efficient. I've verified this by checking the assembly generated.
> > > >
> > > > Which compiler is doing that?
> > >
> > > $ gcc --version
> > > gcc (GCC) 12.2.1 20230201
> > >
> > > i_blocksize(inode) is not a constant, so this should not be particularly
> > > surprising. One might hope that a / (1 << b) would be optimized into a >> b,
> > > but that doesn't seem to happen.
> >
> > It really ought to be a / (1u << b), though...
>
> Sure, that does better:
>
> uint64_t f(uint64_t a, int b)
> {
> return a / (1U << b);
> }
>
> gcc:
> 0000000000000000 <f>:
> 0: 48 89 f8 mov %rdi,%rax
> 3: 89 f1 mov %esi,%ecx
> 5: 48 d3 e8 shr %cl,%rax
> 8: c3 ret
>
> clang:
> 0000000000000000 <f>:
> 0: 89 f1 mov %esi,%ecx
> 2: 48 89 f8 mov %rdi,%rax
> 5: 48 d3 e8 shr %cl,%rax
> 8: c3 ret
>
> But with a signed dividend (which is the case here) it gets messed up:
>
> int64_t f(int64_t a, int b)
> {
> return a / (1U << b);
> }
*ow*
And i_size_read() is long long ;-/ Right.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs()
2023-03-10 7:05 ` Al Viro
@ 2023-03-10 7:12 ` Al Viro
2023-03-10 21:47 ` Eric Biggers
0 siblings, 1 reply; 22+ messages in thread
From: Al Viro @ 2023-03-10 7:12 UTC (permalink / raw)
To: Eric Biggers; +Cc: Yangtao Li, tytso, adilger.kernel, linux-ext4, linux-kernel
On Fri, Mar 10, 2023 at 07:05:27AM +0000, Al Viro wrote:
> On Thu, Mar 09, 2023 at 10:54:38PM -0800, Eric Biggers wrote:
> > On Fri, Mar 10, 2023 at 06:46:12AM +0000, Al Viro wrote:
> > > On Thu, Mar 09, 2023 at 10:43:55PM -0800, Eric Biggers wrote:
> > > > On Fri, Mar 10, 2023 at 06:37:29AM +0000, Al Viro wrote:
> > > > > On Thu, Mar 09, 2023 at 10:17:16PM -0800, Eric Biggers wrote:
> > > > > > On Fri, Mar 10, 2023 at 02:07:34PM +0800, Yangtao Li wrote:
> > > > > > > Just for better readability, no code logic change.
> > > > > > >
> > > > > > > Signed-off-by: Yangtao Li <frank.li@vivo.com>
> > > > > > > ---
> > > > > > > fs/ext4/inode.c | 3 +--
> > > > > > > 1 file changed, 1 insertion(+), 2 deletions(-)
> > > > > > >
> > > > > > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> > > > > > > index d251d705c276..d121cde74522 100644
> > > > > > > --- a/fs/ext4/inode.c
> > > > > > > +++ b/fs/ext4/inode.c
> > > > > > > @@ -2218,8 +2218,7 @@ static int mpage_process_page_bufs(struct mpage_da_data *mpd,
> > > > > > > {
> > > > > > > struct inode *inode = mpd->inode;
> > > > > > > int err;
> > > > > > > - ext4_lblk_t blocks = (i_size_read(inode) + i_blocksize(inode) - 1)
> > > > > > > - >> inode->i_blkbits;
> > > > > > > + ext4_lblk_t blocks = DIV_ROUND_UP(i_size_read(inode), i_blocksize(inode));
> > > > > > >
> > > > > >
> > > > > > Please don't do this. This makes the code compile down to a division, which is
> > > > > > far less efficient. I've verified this by checking the assembly generated.
> > > > >
> > > > > Which compiler is doing that?
> > > >
> > > > $ gcc --version
> > > > gcc (GCC) 12.2.1 20230201
> > > >
> > > > i_blocksize(inode) is not a constant, so this should not be particularly
> > > > surprising. One might hope that a / (1 << b) would be optimized into a >> b,
> > > > but that doesn't seem to happen.
> > >
> > > It really ought to be a / (1u << b), though...
> >
> > Sure, that does better:
> >
> > uint64_t f(uint64_t a, int b)
> > {
> > return a / (1U << b);
> > }
> >
> > gcc:
> > 0000000000000000 <f>:
> > 0: 48 89 f8 mov %rdi,%rax
> > 3: 89 f1 mov %esi,%ecx
> > 5: 48 d3 e8 shr %cl,%rax
> > 8: c3 ret
> >
> > clang:
> > 0000000000000000 <f>:
> > 0: 89 f1 mov %esi,%ecx
> > 2: 48 89 f8 mov %rdi,%rax
> > 5: 48 d3 e8 shr %cl,%rax
> > 8: c3 ret
> >
> > But with a signed dividend (which is the case here) it gets messed up:
> >
> > int64_t f(int64_t a, int b)
> > {
> > return a / (1U << b);
> > }
>
> *ow*
>
> And i_size_read() is long long ;-/ Right.
Out of curiosity (and that's already too brittle for practical use) -
does DIV_ROUND_UP_ULL() do any better on full example?
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs()
2023-03-10 6:07 [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs() Yangtao Li
2023-03-10 6:17 ` Eric Biggers
@ 2023-03-10 19:07 ` kernel test robot
2023-03-10 19:38 ` kernel test robot
2023-03-10 23:21 ` Theodore Ts'o
3 siblings, 0 replies; 22+ messages in thread
From: kernel test robot @ 2023-03-10 19:07 UTC (permalink / raw)
To: Yangtao Li, tytso, adilger.kernel
Cc: oe-kbuild-all, linux-ext4, linux-kernel, viro, Yangtao Li
Hi Yangtao,
I love your patch! Yet something to improve:
[auto build test ERROR on tytso-ext4/dev]
[also build test ERROR on linus/master v6.3-rc1 next-20230310]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Yangtao-Li/ext4-convert-to-DIV_ROUND_UP-in-mpage_process_page_bufs/20230310-140903
base: https://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git dev
patch link: https://lore.kernel.org/r/20230310060734.8780-1-frank.li%40vivo.com
patch subject: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs()
config: i386-defconfig (https://download.01.org/0day-ci/archive/20230311/202303110211.LXeNm5uw-lkp@intel.com/config)
compiler: gcc-11 (Debian 11.3.0-8) 11.3.0
reproduce (this is a W=1 build):
# https://github.com/intel-lab-lkp/linux/commit/f4d2db5f59592a5688be6e4d2d3dd6f3f94d4f96
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Yangtao-Li/ext4-convert-to-DIV_ROUND_UP-in-mpage_process_page_bufs/20230310-140903
git checkout f4d2db5f59592a5688be6e4d2d3dd6f3f94d4f96
# save the config file
mkdir build_dir && cp config build_dir/.config
make W=1 O=build_dir ARCH=i386 olddefconfig
make W=1 O=build_dir ARCH=i386 SHELL=/bin/bash
If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202303110211.LXeNm5uw-lkp@intel.com/
All errors (new ones prefixed by >>):
ld: fs/ext4/inode.o: in function `mpage_process_page_bufs':
>> inode.c:(.text+0xbda): undefined reference to `__divdi3'
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs()
2023-03-10 6:07 [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs() Yangtao Li
2023-03-10 6:17 ` Eric Biggers
2023-03-10 19:07 ` kernel test robot
@ 2023-03-10 19:38 ` kernel test robot
2023-03-10 23:21 ` Theodore Ts'o
3 siblings, 0 replies; 22+ messages in thread
From: kernel test robot @ 2023-03-10 19:38 UTC (permalink / raw)
To: Yangtao Li, tytso, adilger.kernel
Cc: oe-kbuild-all, linux-ext4, linux-kernel, viro, Yangtao Li
Hi Yangtao,
I love your patch! Yet something to improve:
[auto build test ERROR on tytso-ext4/dev]
[also build test ERROR on linus/master v6.3-rc1 next-20230310]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Yangtao-Li/ext4-convert-to-DIV_ROUND_UP-in-mpage_process_page_bufs/20230310-140903
base: https://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git dev
patch link: https://lore.kernel.org/r/20230310060734.8780-1-frank.li%40vivo.com
patch subject: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs()
config: i386-debian-10.3 (https://download.01.org/0day-ci/archive/20230311/202303110358.NxL6UI32-lkp@intel.com/config)
compiler: gcc-11 (Debian 11.3.0-8) 11.3.0
reproduce (this is a W=1 build):
# https://github.com/intel-lab-lkp/linux/commit/f4d2db5f59592a5688be6e4d2d3dd6f3f94d4f96
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Yangtao-Li/ext4-convert-to-DIV_ROUND_UP-in-mpage_process_page_bufs/20230310-140903
git checkout f4d2db5f59592a5688be6e4d2d3dd6f3f94d4f96
# save the config file
mkdir build_dir && cp config build_dir/.config
make W=1 O=build_dir ARCH=i386 olddefconfig
make W=1 O=build_dir ARCH=i386 SHELL=/bin/bash
If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202303110358.NxL6UI32-lkp@intel.com/
All errors (new ones prefixed by >>, old ones prefixed by <<):
>> ERROR: modpost: "__divdi3" [fs/ext4/ext4.ko] undefined!
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs()
2023-03-10 7:12 ` Al Viro
@ 2023-03-10 21:47 ` Eric Biggers
0 siblings, 0 replies; 22+ messages in thread
From: Eric Biggers @ 2023-03-10 21:47 UTC (permalink / raw)
To: Al Viro; +Cc: Yangtao Li, tytso, adilger.kernel, linux-ext4, linux-kernel
On Fri, Mar 10, 2023 at 07:12:31AM +0000, Al Viro wrote:
>
> Out of curiosity (and that's already too brittle for practical use) -
> does DIV_ROUND_UP_ULL() do any better on full example?
'DIV_ROUND_UP_ULL(i_size_read(inode), i_blocksize(inode))' works properly with
clang but not gcc. If i_blocksize() is changed to do '1U << inode->i_blkbits'
instead of '1 << inode->i_blkbits', it works with gcc too.
- Eric
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs()
2023-03-10 6:07 [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs() Yangtao Li
` (2 preceding siblings ...)
2023-03-10 19:38 ` kernel test robot
@ 2023-03-10 23:21 ` Theodore Ts'o
2023-03-13 9:25 ` David Laight
3 siblings, 1 reply; 22+ messages in thread
From: Theodore Ts'o @ 2023-03-10 23:21 UTC (permalink / raw)
To: Yangtao Li; +Cc: adilger.kernel, linux-ext4, linux-kernel, viro
On Fri, Mar 10, 2023 at 02:07:34PM +0800, Yangtao Li wrote:
> Just for better readability, no code logic change.
All aside from the arguments over performance, I'm not at *all*
convinced by the "it's more readable" argument.
So yeah, let's not. We have i_blkbits for a reason, and it's because
shifting right is just simpler and easier.
BTW, doing a 64-bit division on a 32-bit platforms causes compile
failures, which was the cause of the test bot complaint:
ld: fs/ext4/inode.o: in function `mpage_process_page_bufs':
>> inode.c:(.text+0xbda): undefined reference to `__divdi3'
On 32-bit platforms --- i386 in particular --- the 64-bit division
results in an out-of-line call to a helper function that is not
supplied in the kernel compilation environment, so not only is it
slower, it Just Doesn't Work.
- Ted
^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs()
2023-03-10 23:21 ` Theodore Ts'o
@ 2023-03-13 9:25 ` David Laight
0 siblings, 0 replies; 22+ messages in thread
From: David Laight @ 2023-03-13 9:25 UTC (permalink / raw)
To: 'Theodore Ts'o', Yangtao Li
Cc: adilger.kernel, linux-ext4, linux-kernel, viro
...
> On 32-bit platforms --- i386 in particular --- the 64-bit division
> results in an out-of-line call to a helper function that is not
> supplied in the kernel compilation environment, so not only is it
> slower, it Just Doesn't Work.
Even on some 64-bit systems a 64bit divide can be significantly
slower than a 32-bit divide - even with the same arguments.
IIRC Intel x86-64 a 64-bit divide (128-bit dividend) is about
twice the clocks of a 32-bit one. On AMD cpu they are much the same.
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread, other threads:[~2023-03-13 9:25 UTC | newest]
Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-10 6:07 [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs() Yangtao Li
2023-03-10 6:17 ` Eric Biggers
2023-03-10 6:27 ` Yangtao Li
2023-03-10 6:27 ` Yangtao Li
2023-03-10 6:35 ` Gao Xiang
2023-03-10 6:35 ` Gao Xiang
2023-03-10 6:37 ` Gao Xiang
2023-03-10 6:37 ` Gao Xiang
2023-03-10 6:41 ` Eric Biggers
2023-03-10 6:41 ` Eric Biggers
2023-03-10 6:37 ` Al Viro
2023-03-10 6:43 ` Al Viro
2023-03-10 6:43 ` Eric Biggers
2023-03-10 6:46 ` Al Viro
2023-03-10 6:54 ` Eric Biggers
2023-03-10 7:05 ` Al Viro
2023-03-10 7:12 ` Al Viro
2023-03-10 21:47 ` Eric Biggers
2023-03-10 19:07 ` kernel test robot
2023-03-10 19:38 ` kernel test robot
2023-03-10 23:21 ` Theodore Ts'o
2023-03-13 9:25 ` David Laight
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.