* [PATCH v2] ext4: fix a data race in EXT4_I(inode)->i_disksize
@ 2020-02-07 14:29 Qian Cai
2020-02-07 15:12 ` Marco Elver
2020-02-20 4:16 ` Theodore Y. Ts'o
0 siblings, 2 replies; 6+ messages in thread
From: Qian Cai @ 2020-02-07 14:29 UTC (permalink / raw)
To: tytso; +Cc: adilger.kernel, elver, linux-ext4, linux-kernel, Qian Cai
EXT4_I(inode)->i_disksize could be accessed concurrently as noticed by
KCSAN,
BUG: KCSAN: data-race in ext4_write_end [ext4] / ext4_writepages [ext4]
write to 0xffff91c6713b00f8 of 8 bytes by task 49268 on cpu 127:
ext4_write_end+0x4e3/0x750 [ext4]
ext4_update_i_disksize at fs/ext4/ext4.h:3032
(inlined by) ext4_update_inode_size at fs/ext4/ext4.h:3046
(inlined by) ext4_write_end at fs/ext4/inode.c:1287
generic_perform_write+0x208/0x2a0
ext4_buffered_write_iter+0x11f/0x210 [ext4]
ext4_file_write_iter+0xce/0x9e0 [ext4]
new_sync_write+0x29c/0x3b0
__vfs_write+0x92/0xa0
vfs_write+0x103/0x260
ksys_write+0x9d/0x130
__x64_sys_write+0x4c/0x60
do_syscall_64+0x91/0xb47
entry_SYSCALL_64_after_hwframe+0x49/0xbe
read to 0xffff91c6713b00f8 of 8 bytes by task 24872 on cpu 37:
ext4_writepages+0x10ac/0x1d00 [ext4]
mpage_map_and_submit_extent at fs/ext4/inode.c:2468
(inlined by) ext4_writepages at fs/ext4/inode.c:2772
do_writepages+0x5e/0x130
__writeback_single_inode+0xeb/0xb20
writeback_sb_inodes+0x429/0x900
__writeback_inodes_wb+0xc4/0x150
wb_writeback+0x4bd/0x870
wb_workfn+0x6b4/0x960
process_one_work+0x54c/0xbe0
worker_thread+0x80/0x650
kthread+0x1e0/0x200
ret_from_fork+0x27/0x50
Reported by Kernel Concurrency Sanitizer on:
CPU: 37 PID: 24872 Comm: kworker/u261:2 Tainted: G W O L 5.5.0-next-20200204+ #5
Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019
Workqueue: writeback wb_workfn (flush-7:0)
Since only the read is lockless (it runs outside of "i_data_sem"),
load tearing could introduce a logic bug. Fix it by adding READ_ONCE()
for the read and WRITE_ONCE() for the write.
Signed-off-by: Qian Cai <cai@lca.pw>
---
v2: also add WRITE_ONCE(), which is recommended even when fixing load tearing.
fs/ext4/ext4.h | 2 +-
fs/ext4/inode.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 9a2ee2428ecc..8329ccc82fa9 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -3029,7 +3029,7 @@ static inline void ext4_update_i_disksize(struct inode *inode, loff_t newsize)
 		     !inode_is_locked(inode));
 	down_write(&EXT4_I(inode)->i_data_sem);
 	if (newsize > EXT4_I(inode)->i_disksize)
-		EXT4_I(inode)->i_disksize = newsize;
+		WRITE_ONCE(EXT4_I(inode)->i_disksize, newsize);
 	up_write(&EXT4_I(inode)->i_data_sem);
 }
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 3313168b680f..6f9862bf63f1 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2465,7 +2465,7 @@ static int mpage_map_and_submit_extent(handle_t *handle,
 	 * truncate are avoided by checking i_size under i_data_sem.
 	 */
 	disksize = ((loff_t)mpd->first_page) << PAGE_SHIFT;
-	if (disksize > EXT4_I(inode)->i_disksize) {
+	if (disksize > READ_ONCE(EXT4_I(inode)->i_disksize)) {
 		int err2;
 		loff_t i_size;
--
1.8.3.1
* Re: [PATCH v2] ext4: fix a data race in EXT4_I(inode)->i_disksize
2020-02-07 14:29 [PATCH v2] ext4: fix a data race in EXT4_I(inode)->i_disksize Qian Cai
@ 2020-02-07 15:12 ` Marco Elver
2020-02-07 15:25 ` Qian Cai
2020-02-07 15:38 ` Qian Cai
2020-02-20 4:16 ` Theodore Y. Ts'o
1 sibling, 2 replies; 6+ messages in thread
From: Marco Elver @ 2020-02-07 15:12 UTC (permalink / raw)
To: Qian Cai; +Cc: Theodore Ts'o, Andreas Dilger, linux-ext4, LKML
On Fri, 7 Feb 2020 at 15:29, Qian Cai <cai@lca.pw> wrote:
>
> EXT4_I(inode)->i_disksize could be accessed concurrently as noticed by
> KCSAN,
>
> BUG: KCSAN: data-race in ext4_write_end [ext4] / ext4_writepages [ext4]
>
> write to 0xffff91c6713b00f8 of 8 bytes by task 49268 on cpu 127:
> ext4_write_end+0x4e3/0x750 [ext4]
> ext4_update_i_disksize at fs/ext4/ext4.h:3032
> (inlined by) ext4_update_inode_size at fs/ext4/ext4.h:3046
> (inlined by) ext4_write_end at fs/ext4/inode.c:1287
> generic_perform_write+0x208/0x2a0
> ext4_buffered_write_iter+0x11f/0x210 [ext4]
> ext4_file_write_iter+0xce/0x9e0 [ext4]
> new_sync_write+0x29c/0x3b0
> __vfs_write+0x92/0xa0
> vfs_write+0x103/0x260
> ksys_write+0x9d/0x130
> __x64_sys_write+0x4c/0x60
> do_syscall_64+0x91/0xb47
> entry_SYSCALL_64_after_hwframe+0x49/0xbe
>
> read to 0xffff91c6713b00f8 of 8 bytes by task 24872 on cpu 37:
> ext4_writepages+0x10ac/0x1d00 [ext4]
> mpage_map_and_submit_extent at fs/ext4/inode.c:2468
> (inlined by) ext4_writepages at fs/ext4/inode.c:2772
> do_writepages+0x5e/0x130
> __writeback_single_inode+0xeb/0xb20
> writeback_sb_inodes+0x429/0x900
> __writeback_inodes_wb+0xc4/0x150
> wb_writeback+0x4bd/0x870
> wb_workfn+0x6b4/0x960
> process_one_work+0x54c/0xbe0
> worker_thread+0x80/0x650
> kthread+0x1e0/0x200
> ret_from_fork+0x27/0x50
>
> Reported by Kernel Concurrency Sanitizer on:
> CPU: 37 PID: 24872 Comm: kworker/u261:2 Tainted: G W O L 5.5.0-next-20200204+ #5
> Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019
> Workqueue: writeback wb_workfn (flush-7:0)
>
> Since only the read is operating as lockless (outside of the
> "i_data_sem"), load tearing could introduce a logic bug. Fix it by
> adding READ_ONCE() for the read and WRITE_ONCE() for the write.
>
> Signed-off-by: Qian Cai <cai@lca.pw>
> ---
>
> v2: also add WRITE_ONCE() which is recommended even for fixing load tearing.
Just a note: I keep seeing 'load tearing' mentioned as the only reason:
- The WRITE_ONCE() avoids store tearing (and other optimizations).
- We're not only interested in avoiding load/store tearing. There
are plenty of other compiler optimizations that can break concurrent
code: https://lwn.net/Articles/793253/
Thanks,
-- Marco
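For readers following along, here is a minimal userspace sketch of the pattern the patch applies: a marked store on the writer side and a single marked load on the lockless reader side. The macros READ_ONCE_SKETCH() and WRITE_ONCE_SKETCH(), the struct, and both helper functions are hypothetical stand-ins written for this illustration, not the kernel's actual definitions. They assume a GNU-compatible compiler for __typeof__, and the writer is assumed to already hold whatever lock serializes updates.

```c
/* Simplified stand-ins: force exactly one access through a volatile lvalue. */
#define READ_ONCE_SKETCH(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE_SKETCH(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical stand-in for the in-memory inode info that holds i_disksize. */
struct inode_info_sketch {
	long long i_disksize;
};

/*
 * Writer side: only ever grows the size. The caller is assumed to hold
 * the lock that serializes writers, so the plain read here is safe.
 */
static void update_disksize_sketch(struct inode_info_sketch *ei,
				   long long newsize)
{
	if (newsize > ei->i_disksize)
		WRITE_ONCE_SKETCH(ei->i_disksize, newsize);
}

/*
 * Lockless reader side: one marked load, so the compiler may not split,
 * repeat, or fuse the access.
 */
static int needs_update_sketch(struct inode_info_sketch *ei,
			       long long disksize)
{
	return disksize > READ_ONCE_SKETCH(ei->i_disksize);
}
```

The volatile access only constrains the compiler. It does not add any memory ordering between CPUs, so it is not a replacement for the lock on the writer side.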
* Re: [PATCH v2] ext4: fix a data race in EXT4_I(inode)->i_disksize
2020-02-07 15:12 ` Marco Elver
@ 2020-02-07 15:25 ` Qian Cai
2020-02-07 15:38 ` Qian Cai
1 sibling, 0 replies; 6+ messages in thread
From: Qian Cai @ 2020-02-07 15:25 UTC (permalink / raw)
To: Marco Elver; +Cc: Theodore Ts'o, Andreas Dilger, linux-ext4, LKML
On Fri, 2020-02-07 at 16:12 +0100, Marco Elver wrote:
> On Fri, 7 Feb 2020 at 15:29, Qian Cai <cai@lca.pw> wrote:
> >
> > EXT4_I(inode)->i_disksize could be accessed concurrently as noticed by
> > KCSAN,
> >
> > BUG: KCSAN: data-race in ext4_write_end [ext4] / ext4_writepages [ext4]
> >
> > write to 0xffff91c6713b00f8 of 8 bytes by task 49268 on cpu 127:
> > ext4_write_end+0x4e3/0x750 [ext4]
> > ext4_update_i_disksize at fs/ext4/ext4.h:3032
> > (inlined by) ext4_update_inode_size at fs/ext4/ext4.h:3046
> > (inlined by) ext4_write_end at fs/ext4/inode.c:1287
> > generic_perform_write+0x208/0x2a0
> > ext4_buffered_write_iter+0x11f/0x210 [ext4]
> > ext4_file_write_iter+0xce/0x9e0 [ext4]
> > new_sync_write+0x29c/0x3b0
> > __vfs_write+0x92/0xa0
> > vfs_write+0x103/0x260
> > ksys_write+0x9d/0x130
> > __x64_sys_write+0x4c/0x60
> > do_syscall_64+0x91/0xb47
> > entry_SYSCALL_64_after_hwframe+0x49/0xbe
> >
> > read to 0xffff91c6713b00f8 of 8 bytes by task 24872 on cpu 37:
> > ext4_writepages+0x10ac/0x1d00 [ext4]
> > mpage_map_and_submit_extent at fs/ext4/inode.c:2468
> > (inlined by) ext4_writepages at fs/ext4/inode.c:2772
> > do_writepages+0x5e/0x130
> > __writeback_single_inode+0xeb/0xb20
> > writeback_sb_inodes+0x429/0x900
> > __writeback_inodes_wb+0xc4/0x150
> > wb_writeback+0x4bd/0x870
> > wb_workfn+0x6b4/0x960
> > process_one_work+0x54c/0xbe0
> > worker_thread+0x80/0x650
> > kthread+0x1e0/0x200
> > ret_from_fork+0x27/0x50
> >
> > Reported by Kernel Concurrency Sanitizer on:
> > CPU: 37 PID: 24872 Comm: kworker/u261:2 Tainted: G W O L 5.5.0-next-20200204+ #5
> > Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019
> > Workqueue: writeback wb_workfn (flush-7:0)
> >
> > Since only the read is operating as lockless (outside of the
> > "i_data_sem"), load tearing could introduce a logic bug. Fix it by
> > adding READ_ONCE() for the read and WRITE_ONCE() for the write.
> >
> > Signed-off-by: Qian Cai <cai@lca.pw>
> > ---
> >
> > v2: also add WRITE_ONCE() which is recommended even for fixing load tearing.
>
> Just a note: I keep seeing 'load tearing' mentioned as the only reason:
>
> - The WRITE_ONCE avoids store-tearing (and other optimizations).
In general, yes, but in this case store tearing can't happen because the
concurrent writers are serialized by "i_data_sem", i.e.,
	down_write(&EXT4_I(inode)->i_data_sem);
>
> - We're not only interested in avoiding load/store tearing. There
> are plenty other compiler optimizations that can break concurrent
> code: https://lwn.net/Articles/793253/
* Re: [PATCH v2] ext4: fix a data race in EXT4_I(inode)->i_disksize
2020-02-07 15:12 ` Marco Elver
2020-02-07 15:25 ` Qian Cai
@ 2020-02-07 15:38 ` Qian Cai
2020-02-07 16:08 ` Marco Elver
1 sibling, 1 reply; 6+ messages in thread
From: Qian Cai @ 2020-02-07 15:38 UTC (permalink / raw)
To: Marco Elver; +Cc: Theodore Ts'o, Andreas Dilger, linux-ext4, LKML
On Fri, 2020-02-07 at 16:12 +0100, Marco Elver wrote:
> On Fri, 7 Feb 2020 at 15:29, Qian Cai <cai@lca.pw> wrote:
> >
> > EXT4_I(inode)->i_disksize could be accessed concurrently as noticed by
> > KCSAN,
> >
> > BUG: KCSAN: data-race in ext4_write_end [ext4] / ext4_writepages [ext4]
> >
> > write to 0xffff91c6713b00f8 of 8 bytes by task 49268 on cpu 127:
> > ext4_write_end+0x4e3/0x750 [ext4]
> > ext4_update_i_disksize at fs/ext4/ext4.h:3032
> > (inlined by) ext4_update_inode_size at fs/ext4/ext4.h:3046
> > (inlined by) ext4_write_end at fs/ext4/inode.c:1287
> > generic_perform_write+0x208/0x2a0
> > ext4_buffered_write_iter+0x11f/0x210 [ext4]
> > ext4_file_write_iter+0xce/0x9e0 [ext4]
> > new_sync_write+0x29c/0x3b0
> > __vfs_write+0x92/0xa0
> > vfs_write+0x103/0x260
> > ksys_write+0x9d/0x130
> > __x64_sys_write+0x4c/0x60
> > do_syscall_64+0x91/0xb47
> > entry_SYSCALL_64_after_hwframe+0x49/0xbe
> >
> > read to 0xffff91c6713b00f8 of 8 bytes by task 24872 on cpu 37:
> > ext4_writepages+0x10ac/0x1d00 [ext4]
> > mpage_map_and_submit_extent at fs/ext4/inode.c:2468
> > (inlined by) ext4_writepages at fs/ext4/inode.c:2772
> > do_writepages+0x5e/0x130
> > __writeback_single_inode+0xeb/0xb20
> > writeback_sb_inodes+0x429/0x900
> > __writeback_inodes_wb+0xc4/0x150
> > wb_writeback+0x4bd/0x870
> > wb_workfn+0x6b4/0x960
> > process_one_work+0x54c/0xbe0
> > worker_thread+0x80/0x650
> > kthread+0x1e0/0x200
> > ret_from_fork+0x27/0x50
> >
> > Reported by Kernel Concurrency Sanitizer on:
> > CPU: 37 PID: 24872 Comm: kworker/u261:2 Tainted: G W O L 5.5.0-next-20200204+ #5
> > Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019
> > Workqueue: writeback wb_workfn (flush-7:0)
> >
> > Since only the read is operating as lockless (outside of the
> > "i_data_sem"), load tearing could introduce a logic bug. Fix it by
> > adding READ_ONCE() for the read and WRITE_ONCE() for the write.
> >
> > Signed-off-by: Qian Cai <cai@lca.pw>
> > ---
> >
> > v2: also add WRITE_ONCE() which is recommended even for fixing load tearing.
>
> Just a note: I keep seeing 'load tearing' mentioned as the only reason:
>
> - The WRITE_ONCE avoids store-tearing (and other optimizations).
>
> - We're not only interested in avoiding load/store tearing. There
> are plenty other compiler optimizations that can break concurrent
> code: https://lwn.net/Articles/793253/
I also gathered from that article that store tearing is strictly about
multiple concurrent writers. However, without the WRITE_ONCE() here, the
compiler could still emit 2 store instructions, so

CPU0:           CPU1:
store #1
                read
store #2

which was not mentioned in that article. I called that load tearing as well,
but maybe you would call it store tearing. Do I understand correctly?
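The interleaving above can be replayed deterministically in userspace. This is only a single-threaded sketch under assumptions: the variable disksize_sketch and all three functions are hypothetical names invented here, and the split into low and high 32-bit halves is one possible way a compiler could break up an unmarked 64-bit store, not something the report shows actually happened.

```c
#include <stdint.h>

/* Hypothetical stand-in for the 64-bit i_disksize field. */
static uint64_t disksize_sketch;

/* First half of a split store: update only the low 32 bits. */
static void store_low_half(uint64_t newval)
{
	disksize_sketch = (disksize_sketch & 0xffffffff00000000ULL) |
			  (newval & 0x00000000ffffffffULL);
}

/* Second half of a split store: update only the high 32 bits. */
static void store_high_half(uint64_t newval)
{
	disksize_sketch = (disksize_sketch & 0x00000000ffffffffULL) |
			  (newval & 0xffffffff00000000ULL);
}

/*
 * Replay "store #1, read, store #2" and return what the reader
 * observed between the two stores.
 */
static uint64_t observe_between_split_stores(uint64_t oldval, uint64_t newval)
{
	uint64_t seen;

	disksize_sketch = oldval;
	store_low_half(newval);		/* CPU0: store #1 */
	seen = disksize_sketch;		/* CPU1: read */
	store_high_half(newval);	/* CPU0: store #2 */
	return seen;
}
```

Growing the size from 0xffffffff to 0x100000000 makes the reader see 0 in this replay, a value that is neither the old nor the new size, even though only one writer is active.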
* Re: [PATCH v2] ext4: fix a data race in EXT4_I(inode)->i_disksize
2020-02-07 15:38 ` Qian Cai
@ 2020-02-07 16:08 ` Marco Elver
0 siblings, 0 replies; 6+ messages in thread
From: Marco Elver @ 2020-02-07 16:08 UTC (permalink / raw)
To: Qian Cai; +Cc: Theodore Ts'o, Andreas Dilger, linux-ext4, LKML
On Fri, 7 Feb 2020 at 16:38, Qian Cai <cai@lca.pw> wrote:
>
> On Fri, 2020-02-07 at 16:12 +0100, Marco Elver wrote:
> > On Fri, 7 Feb 2020 at 15:29, Qian Cai <cai@lca.pw> wrote:
> > >
> > > EXT4_I(inode)->i_disksize could be accessed concurrently as noticed by
> > > KCSAN,
> > >
> > > BUG: KCSAN: data-race in ext4_write_end [ext4] / ext4_writepages [ext4]
> > >
> > > write to 0xffff91c6713b00f8 of 8 bytes by task 49268 on cpu 127:
> > > ext4_write_end+0x4e3/0x750 [ext4]
> > > ext4_update_i_disksize at fs/ext4/ext4.h:3032
> > > (inlined by) ext4_update_inode_size at fs/ext4/ext4.h:3046
> > > (inlined by) ext4_write_end at fs/ext4/inode.c:1287
> > > generic_perform_write+0x208/0x2a0
> > > ext4_buffered_write_iter+0x11f/0x210 [ext4]
> > > ext4_file_write_iter+0xce/0x9e0 [ext4]
> > > new_sync_write+0x29c/0x3b0
> > > __vfs_write+0x92/0xa0
> > > vfs_write+0x103/0x260
> > > ksys_write+0x9d/0x130
> > > __x64_sys_write+0x4c/0x60
> > > do_syscall_64+0x91/0xb47
> > > entry_SYSCALL_64_after_hwframe+0x49/0xbe
> > >
> > > read to 0xffff91c6713b00f8 of 8 bytes by task 24872 on cpu 37:
> > > ext4_writepages+0x10ac/0x1d00 [ext4]
> > > mpage_map_and_submit_extent at fs/ext4/inode.c:2468
> > > (inlined by) ext4_writepages at fs/ext4/inode.c:2772
> > > do_writepages+0x5e/0x130
> > > __writeback_single_inode+0xeb/0xb20
> > > writeback_sb_inodes+0x429/0x900
> > > __writeback_inodes_wb+0xc4/0x150
> > > wb_writeback+0x4bd/0x870
> > > wb_workfn+0x6b4/0x960
> > > process_one_work+0x54c/0xbe0
> > > worker_thread+0x80/0x650
> > > kthread+0x1e0/0x200
> > > ret_from_fork+0x27/0x50
> > >
> > > Reported by Kernel Concurrency Sanitizer on:
> > > CPU: 37 PID: 24872 Comm: kworker/u261:2 Tainted: G W O L 5.5.0-next-20200204+ #5
> > > Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019
> > > Workqueue: writeback wb_workfn (flush-7:0)
> > >
> > > Since only the read is operating as lockless (outside of the
> > > "i_data_sem"), load tearing could introduce a logic bug. Fix it by
> > > adding READ_ONCE() for the read and WRITE_ONCE() for the write.
> > >
> > > Signed-off-by: Qian Cai <cai@lca.pw>
> > > ---
> > >
> > > v2: also add WRITE_ONCE() which is recommended even for fixing load tearing.
> >
> > Just a note: I keep seeing 'load tearing' mentioned as the only reason:
> >
> > - The WRITE_ONCE avoids store-tearing (and other optimizations).
> >
> > - We're not only interested in avoiding load/store tearing. There
> > are plenty other compiler optimizations that can break concurrent
> > code: https://lwn.net/Articles/793253/
>
> I also realized that from that article, store tearing is strictly from multiple
> concurrent writers. However, in the sense of without the WRITE_ONCE() here,
> compilers could still have 2 store instructions, so
>
> CPU0:           CPU1:
> store #1
>                 read
> store #2
>
> which was not mentioned in that article. I called it also load tearing, but
> maybe you will call that store tearing. Do I understand correctly?
The effect is the same, so yes. If the writer side splits the
write while there is a concurrent load, the observed value will appear
"torn". The same happens if the reader side splits the read (the more
obvious case).
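The reader-side case can be replayed the same way. Again this is a single-threaded sketch under assumptions: size_sketch and split_load_with_store_between() are hypothetical names, and loading the low half before the high half is just one possible split of an unmarked 64-bit load.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit size field shared with a writer. */
static uint64_t size_sketch;

/*
 * Replay "load low half, full store, load high half" and return the
 * value the reader assembles from its two partial loads.
 */
static uint64_t split_load_with_store_between(uint64_t oldval, uint64_t newval)
{
	uint32_t lo, hi;

	size_sketch = oldval;
	lo = (uint32_t)size_sketch;		/* reader: load #1 (low half) */
	size_sketch = newval;			/* writer: one full-width store */
	hi = (uint32_t)(size_sketch >> 32);	/* reader: load #2 (high half) */

	return ((uint64_t)hi << 32) | lo;
}
```

With the same old and new sizes as before (0xffffffff and 0x100000000), the reader assembles 0x1ffffffff, which is larger than anything the writer ever stored. A single marked load on the reader side rules this out.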
* Re: [PATCH v2] ext4: fix a data race in EXT4_I(inode)->i_disksize
2020-02-07 14:29 [PATCH v2] ext4: fix a data race in EXT4_I(inode)->i_disksize Qian Cai
2020-02-07 15:12 ` Marco Elver
@ 2020-02-20 4:16 ` Theodore Y. Ts'o
1 sibling, 0 replies; 6+ messages in thread
From: Theodore Y. Ts'o @ 2020-02-20 4:16 UTC (permalink / raw)
To: Qian Cai; +Cc: adilger.kernel, elver, linux-ext4, linux-kernel
On Fri, Feb 07, 2020 at 09:29:11AM -0500, Qian Cai wrote:
> EXT4_I(inode)->i_disksize could be accessed concurrently as noticed by
> KCSAN,
>
> BUG: KCSAN: data-race in ext4_write_end [ext4] / ext4_writepages [ext4]
>
> write to 0xffff91c6713b00f8 of 8 bytes by task 49268 on cpu 127:
> ext4_write_end+0x4e3/0x750 [ext4]
> ext4_update_i_disksize at fs/ext4/ext4.h:3032
> (inlined by) ext4_update_inode_size at fs/ext4/ext4.h:3046
> (inlined by) ext4_write_end at fs/ext4/inode.c:1287
> generic_perform_write+0x208/0x2a0
> ext4_buffered_write_iter+0x11f/0x210 [ext4]
> ext4_file_write_iter+0xce/0x9e0 [ext4]
> new_sync_write+0x29c/0x3b0
> __vfs_write+0x92/0xa0
> vfs_write+0x103/0x260
> ksys_write+0x9d/0x130
> __x64_sys_write+0x4c/0x60
> do_syscall_64+0x91/0xb47
> entry_SYSCALL_64_after_hwframe+0x49/0xbe
>
> read to 0xffff91c6713b00f8 of 8 bytes by task 24872 on cpu 37:
> ext4_writepages+0x10ac/0x1d00 [ext4]
> mpage_map_and_submit_extent at fs/ext4/inode.c:2468
> (inlined by) ext4_writepages at fs/ext4/inode.c:2772
> do_writepages+0x5e/0x130
> __writeback_single_inode+0xeb/0xb20
> writeback_sb_inodes+0x429/0x900
> __writeback_inodes_wb+0xc4/0x150
> wb_writeback+0x4bd/0x870
> wb_workfn+0x6b4/0x960
> process_one_work+0x54c/0xbe0
> worker_thread+0x80/0x650
> kthread+0x1e0/0x200
> ret_from_fork+0x27/0x50
>
> Reported by Kernel Concurrency Sanitizer on:
> CPU: 37 PID: 24872 Comm: kworker/u261:2 Tainted: G W O L 5.5.0-next-20200204+ #5
> Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019
> Workqueue: writeback wb_workfn (flush-7:0)
>
> Since only the read is operating as lockless (outside of the
> "i_data_sem"), load tearing could introduce a logic bug. Fix it by
> adding READ_ONCE() for the read and WRITE_ONCE() for the write.
>
> Signed-off-by: Qian Cai <cai@lca.pw>
Thanks, applied.
- Ted