* [PATCH] fs: ratelimit __find_get_block_slow() failure message. @ 2019-01-11 10:10 Tetsuo Handa 2019-01-11 10:19 ` Dmitry Vyukov 2019-01-11 11:03 ` Jan Kara 0 siblings, 2 replies; 31+ messages in thread From: Tetsuo Handa @ 2019-01-11 10:10 UTC (permalink / raw) To: Jan Kara, Alexander Viro; +Cc: Dmitry Vyukov, linux-fsdevel, Tetsuo Handa When something lets __find_get_block_slow() hit the all_mapped path, it calls printk() 100+ times per second. But there is no need to print the same message at such high frequency; it is just asking for a stall warning, or at least bloating log files. [ 399.866302][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8 [ 399.873324][T15342] b_state=0x00000029, b_size=512 [ 399.878403][T15342] device loop0 blocksize: 4096 [ 399.883296][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8 [ 399.890400][T15342] b_state=0x00000029, b_size=512 [ 399.895595][T15342] device loop0 blocksize: 4096 [ 399.900556][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8 [ 399.907471][T15342] b_state=0x00000029, b_size=512 [ 399.912506][T15342] device loop0 blocksize: 4096 This patch reduces the frequency to approximately once per second, and also concatenates the three lines into one. It uses plain time_in_range() rather than __ratelimit(), since there is no need to allocate space for debug objects in the ratelimit_state.lock field. [ 399.866302][T15342] __find_get_block_slow() failed. 
block=1, b_blocknr=8, b_state=0x00000029, b_size=512, device loop0 blocksize: 4096 Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> --- fs/buffer.c | 17 ++++++++--------- 1 file changed, 8 insertions(+), 9 deletions(-) diff --git a/fs/buffer.c b/fs/buffer.c index 52d024b..3fc9167 100644 --- a/fs/buffer.c +++ b/fs/buffer.c @@ -200,6 +200,7 @@ void end_buffer_write_sync(struct buffer_head *bh, int uptodate) struct buffer_head *head; struct page *page; int all_mapped = 1; + static unsigned long last; index = block >> (PAGE_SHIFT - bd_inode->i_blkbits); page = find_get_page_flags(bd_mapping, index, FGP_ACCESSED); @@ -227,15 +228,13 @@ void end_buffer_write_sync(struct buffer_head *bh, int uptodate) * file io on the block device and getblk. It gets dealt with * elsewhere, don't buffer_error if we had some unmapped buffers */ - if (all_mapped) { - printk("__find_get_block_slow() failed. " - "block=%llu, b_blocknr=%llu\n", - (unsigned long long)block, - (unsigned long long)bh->b_blocknr); - printk("b_state=0x%08lx, b_size=%zu\n", - bh->b_state, bh->b_size); - printk("device %pg blocksize: %d\n", bdev, - 1 << bd_inode->i_blkbits); + if (all_mapped && !time_in_range(jiffies, last, last + HZ)) { + printk("__find_get_block_slow() failed. block=%llu, b_blocknr=%llu, b_state=0x%08lx, b_size=%zu, device %pg blocksize: %d\n", + (unsigned long long)block, + (unsigned long long)bh->b_blocknr, + bh->b_state, bh->b_size, bdev, + 1 << bd_inode->i_blkbits); + last = jiffies; } out_unlock: spin_unlock(&bd_mapping->private_lock); -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 31+ messages in thread
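For reference, the once-per-HZ gate the patch adds can be sketched in plain userspace C. Here `jiffies`, `HZ`, and `time_in_range()` are simplified stand-ins for the kernel's versions (the real jiffies is a wrapping tick counter driven by the timer interrupt), and `should_print()` models the patched `all_mapped` branch:

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins for kernel facilities (assumed, simplified). */
#define HZ 100                       /* ticks per second */
static unsigned long jiffies;        /* advanced by the caller in this sketch */

/* Same definition as the kernel's time_in_range(a, b, c):
 * true if a is in [b, c], robust against jiffies wrap-around. */
static bool time_in_range(unsigned long a, unsigned long b, unsigned long c)
{
	return (long)(a - b) >= 0 && (long)(a - c) <= 0;
}

/* Models the patched all_mapped branch: report at most once per HZ ticks. */
static bool should_print(void)
{
	static unsigned long last;

	if (time_in_range(jiffies, last, last + HZ))
		return false;        /* still inside the quiet window */
	last = jiffies;              /* open a new window */
	return true;
}
```

Note that, as in the patch, the window is inclusive: a caller at jiffies == last + HZ is still suppressed, so the effective rate is slightly below once per second.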
* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message. 2019-01-11 10:10 [PATCH] fs: ratelimit __find_get_block_slow() failure message Tetsuo Handa @ 2019-01-11 10:19 ` Dmitry Vyukov [not found] ` <04c6d87c-fc26-b994-3b34-947414984abe@i-love.sakura.ne.jp> 2019-01-11 11:03 ` Jan Kara 1 sibling, 1 reply; 31+ messages in thread From: Dmitry Vyukov @ 2019-01-11 10:19 UTC (permalink / raw) To: Tetsuo Handa; +Cc: Jan Kara, Alexander Viro, linux-fsdevel On Fri, Jan 11, 2019 at 11:10 AM Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> wrote: > > When something let __find_get_block_slow() hit all_mapped path, it calls > printk() for 100+ times per a second. But there is no need to print same > message with such high frequency; it is just asking for stall warning, or > at least bloating log files. > > [ 399.866302][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8 > [ 399.873324][T15342] b_state=0x00000029, b_size=512 > [ 399.878403][T15342] device loop0 blocksize: 4096 > [ 399.883296][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8 > [ 399.890400][T15342] b_state=0x00000029, b_size=512 > [ 399.895595][T15342] device loop0 blocksize: 4096 > [ 399.900556][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8 > [ 399.907471][T15342] b_state=0x00000029, b_size=512 > [ 399.912506][T15342] device loop0 blocksize: 4096 > > This patch reduces frequency to approximately once per a second, in > addition to concatenating three lines into one. This patch uses plain > time_in_range() than __ratelimit(), for there is no need to allocate > space for debug objects in ratelimit_state.lock field. > > [ 399.866302][T15342] __find_get_block_slow() failed. 
block=1, b_blocknr=8, b_state=0x00000029, b_size=512, device loop0 blocksize: 4096 > > Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> > --- > fs/buffer.c | 17 ++++++++--------- > 1 file changed, 8 insertions(+), 9 deletions(-) > > diff --git a/fs/buffer.c b/fs/buffer.c > index 52d024b..3fc9167 100644 > --- a/fs/buffer.c > +++ b/fs/buffer.c > @@ -200,6 +200,7 @@ void end_buffer_write_sync(struct buffer_head *bh, int uptodate) > struct buffer_head *head; > struct page *page; > int all_mapped = 1; > + static unsigned long last; /\/\/\/\/\/\ We will have to re-fix this when we start testing with KTSAN, data race detector. > index = block >> (PAGE_SHIFT - bd_inode->i_blkbits); > page = find_get_page_flags(bd_mapping, index, FGP_ACCESSED); > @@ -227,15 +228,13 @@ void end_buffer_write_sync(struct buffer_head *bh, int uptodate) > * file io on the block device and getblk. It gets dealt with > * elsewhere, don't buffer_error if we had some unmapped buffers > */ > - if (all_mapped) { > - printk("__find_get_block_slow() failed. " > - "block=%llu, b_blocknr=%llu\n", > - (unsigned long long)block, > - (unsigned long long)bh->b_blocknr); > - printk("b_state=0x%08lx, b_size=%zu\n", > - bh->b_state, bh->b_size); > - printk("device %pg blocksize: %d\n", bdev, > - 1 << bd_inode->i_blkbits); > + if (all_mapped && !time_in_range(jiffies, last, last + HZ)) { > + printk("__find_get_block_slow() failed. block=%llu, b_blocknr=%llu, b_state=0x%08lx, b_size=%zu, device %pg blocksize: %d\n", > + (unsigned long long)block, > + (unsigned long long)bh->b_blocknr, > + bh->b_state, bh->b_size, bdev, > + 1 << bd_inode->i_blkbits); > + last = jiffies; > } > out_unlock: > spin_unlock(&bd_mapping->private_lock); > -- > 1.8.3.1 ^ permalink raw reply [flat|nested] 31+ messages in thread
[parent not found: <04c6d87c-fc26-b994-3b34-947414984abe@i-love.sakura.ne.jp>]
* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message. [not found] ` <04c6d87c-fc26-b994-3b34-947414984abe@i-love.sakura.ne.jp> @ 2019-01-11 10:40 ` Dmitry Vyukov 2019-01-11 10:48 ` Dmitry Vyukov 2019-01-11 10:51 ` Tetsuo Handa 0 siblings, 2 replies; 31+ messages in thread From: Dmitry Vyukov @ 2019-01-11 10:40 UTC (permalink / raw) To: Tetsuo Handa, Jan Kara, Al Viro, linux-fsdevel On Fri, Jan 11, 2019 at 11:28 AM Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> wrote: > > On 2019/01/11 19:19, Dmitry Vyukov wrote: > >> @@ -200,6 +200,7 @@ void end_buffer_write_sync(struct buffer_head *bh, int uptodate) > >> struct buffer_head *head; > >> struct page *page; > >> int all_mapped = 1; > >> + static unsigned long last; > > > > /\/\/\/\/\/\ > > > > We will have to re-fix this when we start testing with KTSAN, data > > race detector. > > What's wrong? This race is harmless. How did you arrive at the conclusion that it is harmless? There is only one relevant standard covering this, the C language standard, and it is very clear on this -- this is Undefined Behavior, which is the same as, for example, reading/writing random pointers. Check out this post on how any race that you might think is benign can be badly miscompiled and lead to arbitrary program behavior: https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message. 2019-01-11 10:40 ` Dmitry Vyukov @ 2019-01-11 10:48 ` Dmitry Vyukov 2019-01-11 12:46 ` Tetsuo Handa 2019-01-11 10:51 ` Tetsuo Handa 1 sibling, 1 reply; 31+ messages in thread From: Dmitry Vyukov @ 2019-01-11 10:48 UTC (permalink / raw) To: Tetsuo Handa, Jan Kara, Al Viro, linux-fsdevel On Fri, Jan 11, 2019 at 11:40 AM Dmitry Vyukov <dvyukov@google.com> wrote: > > On Fri, Jan 11, 2019 at 11:28 AM Tetsuo Handa > <penguin-kernel@i-love.sakura.ne.jp> wrote: > > > > On 2019/01/11 19:19, Dmitry Vyukov wrote: > > >> @@ -200,6 +200,7 @@ void end_buffer_write_sync(struct buffer_head *bh, int uptodate) > > >> struct buffer_head *head; > > >> struct page *page; > > >> int all_mapped = 1; > > >> + static unsigned long last; > > > > > > /\/\/\/\/\/\ > > > > > > We will have to re-fix this when we start testing with KTSAN, data > > > race detector. > > > > What's wrong? This race is harmless. > > How did you arrive to the conclusion that it is harmless? > There is only one relevant standard covering this, which is the C > language standard, and it is very clear on this -- this has Undefined > Behavior, that is the same as, for example, reading/writing random > pointers. > > Check out this on how any race that you might think is benign can be > badly miscompiled and lead to arbitrary program behavior: > https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong Also there is no other practical definition of data race for automatic data race detectors than: two conflicting non-atomic concurrent accesses. Which this code is. Which means that if we continue writing such code we are not getting data race detection and don't detect thousands of races in kernel code that one may consider more harmful than this one the easy way. 
And instead we will spend large amounts of time fixing some of them the hard way, and leave the rest as just too hard to debug, letting the kernel continue crashing from time to time (I believe a portion of currently open syzbot bugs that developers just left as "I don't see how this can happen" are due to such races). ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message. 2019-01-11 10:48 ` Dmitry Vyukov @ 2019-01-11 12:46 ` Tetsuo Handa 2019-01-16 9:47 ` Dmitry Vyukov 0 siblings, 1 reply; 31+ messages in thread From: Tetsuo Handa @ 2019-01-11 12:46 UTC (permalink / raw) To: Dmitry Vyukov; +Cc: Jan Kara, Al Viro, linux-fsdevel On 2019/01/11 19:48, Dmitry Vyukov wrote: >> How did you arrive to the conclusion that it is harmless? >> There is only one relevant standard covering this, which is the C >> language standard, and it is very clear on this -- this has Undefined >> Behavior, that is the same as, for example, reading/writing random >> pointers. >> >> Check out this on how any race that you might think is benign can be >> badly miscompiled and lead to arbitrary program behavior: >> https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong > > Also there is no other practical definition of data race for automatic > data race detectors than: two conflicting non-atomic concurrent > accesses. Which this code is. Which means that if we continue writing > such code we are not getting data race detection and don't detect > thousands of races in kernel code that one may consider more harmful > than this one the easy way. And instead will spent large amounts of > time to fix some of then the hard way, and leave the rest as just too > hard to debug so let the kernel continue crashing from time to time (I > believe a portion of currently open syzbot bugs that developers just > left as "I don't see how this can happen" are due to such races). > I still don't follow. A read/write of sizeof(long) bytes at a naturally aligned address is atomic, isn't it? I'm not using increments etc. Therefore, in the worst case, some threads see an outdated value. But an outdated value affects only the time_in_range() test, which does not cause severe problems like a crash. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message. 2019-01-11 12:46 ` Tetsuo Handa @ 2019-01-16 9:47 ` Dmitry Vyukov 2019-01-16 10:43 ` Jan Kara 0 siblings, 1 reply; 31+ messages in thread From: Dmitry Vyukov @ 2019-01-16 9:47 UTC (permalink / raw) To: Tetsuo Handa; +Cc: Jan Kara, Al Viro, linux-fsdevel On Fri, Jan 11, 2019 at 1:46 PM Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> wrote: > > On 2019/01/11 19:48, Dmitry Vyukov wrote: > >> How did you arrive to the conclusion that it is harmless? > >> There is only one relevant standard covering this, which is the C > >> language standard, and it is very clear on this -- this has Undefined > >> Behavior, that is the same as, for example, reading/writing random > >> pointers. > >> > >> Check out this on how any race that you might think is benign can be > >> badly miscompiled and lead to arbitrary program behavior: > >> https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong > > > > Also there is no other practical definition of data race for automatic > > data race detectors than: two conflicting non-atomic concurrent > > accesses. Which this code is. Which means that if we continue writing > > such code we are not getting data race detection and don't detect > > thousands of races in kernel code that one may consider more harmful > > than this one the easy way. And instead will spent large amounts of > > time to fix some of then the hard way, and leave the rest as just too > > hard to debug so let the kernel continue crashing from time to time (I > > believe a portion of currently open syzbot bugs that developers just > > left as "I don't see how this can happen" are due to such races). > > > > I still cannot catch. Read/write of sizeof(long) bytes at naturally > aligned address is atomic, isn't it? Nobody guarantees this. According to C non-atomic conflicting reads/writes of sizeof(long) cause undefined behavior of the whole program. 
> I'm not using increments etc. > Therefore, in the worst case, some threads see outdated value. But > outdated values affect only time_in_range() test, which does not cause > severe problems like crash. > ^ permalink raw reply [flat|nested] 31+ messages in thread
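For completeness: the usual way to keep an intentionally racy access like `last` while remaining acceptable to race detectors is to annotate it, in the kernel with READ_ONCE()/WRITE_ONCE(). A userspace approximation of that annotated pattern using C11 relaxed atomics (a sketch of the idiom, not code from this thread):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define HZ 100  /* assumed tick rate for this sketch */

/* Relaxed atomic load/store are the closest C11 analogue of the kernel's
 * READ_ONCE()/WRITE_ONCE(): no locks, no ordering, but no data race either. */
static _Atomic unsigned long last;

/* Kernel-style time_in_range(): true if a is in [b, c], wrap-safe. */
static bool time_in_range(unsigned long a, unsigned long b, unsigned long c)
{
	return (long)(a - b) >= 0 && (long)(a - c) <= 0;
}

/* Rate gate with no undefined behavior: concurrent callers may still both
 * print near a window boundary, which is acceptable for a log ratelimit. */
static bool should_print(unsigned long now)
{
	unsigned long snap = atomic_load_explicit(&last, memory_order_relaxed);

	if (time_in_range(now, snap, snap + HZ))
		return false;
	atomic_store_explicit(&last, now, memory_order_relaxed);
	return true;
}
```

The behavior is the same as the plain version in the patch; the only change is that the read and write of `last` are now explicit, so a detector can tell a deliberate lossy race from an accidental one.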
* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message. 2019-01-16 9:47 ` Dmitry Vyukov @ 2019-01-16 10:43 ` Jan Kara 2019-01-16 11:03 ` Dmitry Vyukov 2019-01-17 14:11 ` Tetsuo Handa 0 siblings, 2 replies; 31+ messages in thread From: Jan Kara @ 2019-01-16 10:43 UTC (permalink / raw) To: Dmitry Vyukov; +Cc: Tetsuo Handa, Jan Kara, Al Viro, linux-fsdevel On Wed 16-01-19 10:47:56, Dmitry Vyukov wrote: > On Fri, Jan 11, 2019 at 1:46 PM Tetsuo Handa > <penguin-kernel@i-love.sakura.ne.jp> wrote: > > > > On 2019/01/11 19:48, Dmitry Vyukov wrote: > > >> How did you arrive to the conclusion that it is harmless? > > >> There is only one relevant standard covering this, which is the C > > >> language standard, and it is very clear on this -- this has Undefined > > >> Behavior, that is the same as, for example, reading/writing random > > >> pointers. > > >> > > >> Check out this on how any race that you might think is benign can be > > >> badly miscompiled and lead to arbitrary program behavior: > > >> https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong > > > > > > Also there is no other practical definition of data race for automatic > > > data race detectors than: two conflicting non-atomic concurrent > > > accesses. Which this code is. Which means that if we continue writing > > > such code we are not getting data race detection and don't detect > > > thousands of races in kernel code that one may consider more harmful > > > than this one the easy way. And instead will spent large amounts of > > > time to fix some of then the hard way, and leave the rest as just too > > > hard to debug so let the kernel continue crashing from time to time (I > > > believe a portion of currently open syzbot bugs that developers just > > > left as "I don't see how this can happen" are due to such races). > > > > > > > I still cannot catch. Read/write of sizeof(long) bytes at naturally > > aligned address is atomic, isn't it? 
> > Nobody guarantees this. According to C non-atomic conflicting > reads/writes of sizeof(long) cause undefined behavior of the whole > program. Yes, but to be fair the kernel has always relied on long accesses to be atomic pretty heavily so that it is now de-facto standard for the kernel AFAICT. I understand this makes life for static checkers hard but such is reality. But in this particular case I agree with you that special logic is not really warranted. Honza -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message. 2019-01-16 10:43 ` Jan Kara @ 2019-01-16 11:03 ` Dmitry Vyukov 2019-01-16 11:48 ` Dmitry Vyukov 2019-01-16 11:56 ` [PATCH] fs: ratelimit __find_get_block_slow() failure message Jan Kara 2019-01-17 14:11 ` Tetsuo Handa 1 sibling, 2 replies; 31+ messages in thread From: Dmitry Vyukov @ 2019-01-16 11:03 UTC (permalink / raw) To: Jan Kara; +Cc: Tetsuo Handa, Al Viro, linux-fsdevel On Wed, Jan 16, 2019 at 11:43 AM Jan Kara <jack@suse.cz> wrote: > > On Wed 16-01-19 10:47:56, Dmitry Vyukov wrote: > > On Fri, Jan 11, 2019 at 1:46 PM Tetsuo Handa > > <penguin-kernel@i-love.sakura.ne.jp> wrote: > > > > > > On 2019/01/11 19:48, Dmitry Vyukov wrote: > > > >> How did you arrive to the conclusion that it is harmless? > > > >> There is only one relevant standard covering this, which is the C > > > >> language standard, and it is very clear on this -- this has Undefined > > > >> Behavior, that is the same as, for example, reading/writing random > > > >> pointers. > > > >> > > > >> Check out this on how any race that you might think is benign can be > > > >> badly miscompiled and lead to arbitrary program behavior: > > > >> https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong > > > > > > > > Also there is no other practical definition of data race for automatic > > > > data race detectors than: two conflicting non-atomic concurrent > > > > accesses. Which this code is. Which means that if we continue writing > > > > such code we are not getting data race detection and don't detect > > > > thousands of races in kernel code that one may consider more harmful > > > > than this one the easy way. 
And instead will spent large amounts of > > > > time to fix some of then the hard way, and leave the rest as just too > > > > hard to debug so let the kernel continue crashing from time to time (I > > > > believe a portion of currently open syzbot bugs that developers just > > > > left as "I don't see how this can happen" are due to such races). > > > > > > > > > > I still cannot catch. Read/write of sizeof(long) bytes at naturally > > > aligned address is atomic, isn't it? > > > > Nobody guarantees this. According to C non-atomic conflicting > > reads/writes of sizeof(long) cause undefined behavior of the whole > > program. > > Yes, but to be fair the kernel has always relied on long accesses to be > atomic pretty heavily so that it is now de-facto standard for the kernel > AFAICT. I understand this makes life for static checkers hard but such is > reality. Yes, but nobody ever defined what "a long access" means. And if you see a function that accepts a long argument and stores it into a long field, no, it does not qualify. I bet this will come as a surprise to lots of developers. Check out this fix and try to extrapolate how this "a function stores a long into a long field, and it leads to a serious security bug" can actually be applied to a whole lot of places after inlining (or when somebody just slightly shuffles code in a way that looks totally safe) that also kinda look safe and atomic: https://lore.kernel.org/patchwork/patch/599779/ So where is the boundary between "a long access" that is atomic and one that is not necessarily atomic? ^ permalink raw reply [flat|nested] 31+ messages in thread
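The worry behind "a long store is not necessarily atomic" can be made concrete. Plain C allows a compiler (or a 32-bit target) to split a 64-bit store into two narrower stores, and a reader that runs between the halves observes a value that is neither the old nor the new one. The sketch below performs the split by hand purely to exhibit that intermediate state; it simulates what a legal lowering could expose, it is not actual compiler output:

```c
#include <assert.h>
#include <stdint.h>

/* A 64-bit variable viewed as two 32-bit halves, the way a compiler might
 * legally lower a plain (non-atomic, non-volatile) 64-bit store. */
union split {
	uint64_t u64;
	uint32_t u32[2];
};

/* Write only the first half of val and return what a concurrent reader
 * would see at that instant: a torn value. */
static uint64_t store_first_half(union split *s, uint64_t val)
{
	union split tmp = { .u64 = val };

	s->u32[0] = tmp.u32[0];
	return s->u64;
}

/* Complete the store by writing the second half. */
static void store_second_half(union split *s, uint64_t val)
{
	union split tmp = { .u64 = val };

	s->u32[1] = tmp.u32[1];
}
```

On either endianness the value returned by store_first_half() is neither the old value nor the full new value; an explicitly atomic store (or READ_ONCE/WRITE_ONCE annotation in the kernel) is what rules this intermediate state out.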
* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message. 2019-01-16 11:03 ` Dmitry Vyukov @ 2019-01-16 11:48 ` Dmitry Vyukov 2019-01-16 16:28 ` Greg Kroah-Hartman 2019-01-16 11:56 ` [PATCH] fs: ratelimit __find_get_block_slow() failure message Jan Kara 1 sibling, 1 reply; 31+ messages in thread From: Dmitry Vyukov @ 2019-01-16 11:48 UTC (permalink / raw) To: Jan Kara, Linus Torvalds, Greg Kroah-Hartman, Kees Cook Cc: Tetsuo Handa, Al Viro, linux-fsdevel, Kostya Serebryany On Wed, Jan 16, 2019 at 12:03 PM Dmitry Vyukov <dvyukov@google.com> wrote: > > On Wed, Jan 16, 2019 at 11:43 AM Jan Kara <jack@suse.cz> wrote: > > > > On Wed 16-01-19 10:47:56, Dmitry Vyukov wrote: > > > On Fri, Jan 11, 2019 at 1:46 PM Tetsuo Handa > > > <penguin-kernel@i-love.sakura.ne.jp> wrote: > > > > > > > > On 2019/01/11 19:48, Dmitry Vyukov wrote: > > > > >> How did you arrive to the conclusion that it is harmless? > > > > >> There is only one relevant standard covering this, which is the C > > > > >> language standard, and it is very clear on this -- this has Undefined > > > > >> Behavior, that is the same as, for example, reading/writing random > > > > >> pointers. > > > > >> > > > > >> Check out this on how any race that you might think is benign can be > > > > >> badly miscompiled and lead to arbitrary program behavior: > > > > >> https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong > > > > > > > > > > Also there is no other practical definition of data race for automatic > > > > > data race detectors than: two conflicting non-atomic concurrent > > > > > accesses. Which this code is. Which means that if we continue writing > > > > > such code we are not getting data race detection and don't detect > > > > > thousands of races in kernel code that one may consider more harmful > > > > > than this one the easy way. 
And instead will spent large amounts of > > > > > time to fix some of then the hard way, and leave the rest as just too > > > > > hard to debug so let the kernel continue crashing from time to time (I > > > > > believe a portion of currently open syzbot bugs that developers just > > > > > left as "I don't see how this can happen" are due to such races). > > > > > > > > > > > > > I still cannot catch. Read/write of sizeof(long) bytes at naturally > > > > aligned address is atomic, isn't it? > > > > > > Nobody guarantees this. According to C non-atomic conflicting > > > reads/writes of sizeof(long) cause undefined behavior of the whole > > > program. > > > > Yes, but to be fair the kernel has always relied on long accesses to be > > atomic pretty heavily so that it is now de-facto standard for the kernel > > AFAICT. I understand this makes life for static checkers hard but such is > > reality. > > Yes, but nobody never defined what "a long access" means. And if you > see a function that accepts a long argument and stores it into a long > field, no, it does not qualify. I bet this will come at surprise to > lots of developers. > Check out this fix and try to extrapolate how this "function stores > long into a long leads to a serious security bug" can actually be > applied to whole lot of places after inlining (or when somebody just > slightly shuffles code in a way that looks totally safe) that also > kinda look safe and atomic: > https://lore.kernel.org/patchwork/patch/599779/ > So where is the boundary between "a long access" that is atomic and > the one that is not necessary atomic? +Linus, Greg, Kees I wanted to provide a hash/link to this commit but, wait, you want to say that this patch for a security bug was mailed, recorded by patchwork, acked by the subsystem developer, and then dropped on the floor for 3+ years? Doh! https://lore.kernel.org/patchwork/patch/599779/ There are known ways to make this a non-issue. 
Like open pull requests on github: https://github.com/google/syzkaller/pulls or some projects even run their own dashboard for this: https://dev.golang.org/reviews because this is important. Especially for new contributors, drive-by improvements, good samaritan fixes, etc. Another example: a bug-fixing patch was lost for 2 years: "Two years ago ;) I don't understand why there were ignored" https://www.spinics.net/lists/linux-mm/msg161351.html Another example: a patch is applied to a subsystem tree and then lost for 6 months: https://patchwork.kernel.org/patch/10339089/ ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message. 2019-01-16 11:48 ` Dmitry Vyukov @ 2019-01-16 16:28 ` Greg Kroah-Hartman 2019-01-17 13:18 ` Dmitry Vyukov 0 siblings, 1 reply; 31+ messages in thread From: Greg Kroah-Hartman @ 2019-01-16 16:28 UTC (permalink / raw) To: Dmitry Vyukov Cc: Jan Kara, Linus Torvalds, Kees Cook, Tetsuo Handa, Al Viro, linux-fsdevel, Kostya Serebryany On Wed, Jan 16, 2019 at 12:48:41PM +0100, Dmitry Vyukov wrote: > On Wed, Jan 16, 2019 at 12:03 PM Dmitry Vyukov <dvyukov@google.com> wrote: > > > > On Wed, Jan 16, 2019 at 11:43 AM Jan Kara <jack@suse.cz> wrote: > > > > > > On Wed 16-01-19 10:47:56, Dmitry Vyukov wrote: > > > > On Fri, Jan 11, 2019 at 1:46 PM Tetsuo Handa > > > > <penguin-kernel@i-love.sakura.ne.jp> wrote: > > > > > > > > > > On 2019/01/11 19:48, Dmitry Vyukov wrote: > > > > > >> How did you arrive to the conclusion that it is harmless? > > > > > >> There is only one relevant standard covering this, which is the C > > > > > >> language standard, and it is very clear on this -- this has Undefined > > > > > >> Behavior, that is the same as, for example, reading/writing random > > > > > >> pointers. > > > > > >> > > > > > >> Check out this on how any race that you might think is benign can be > > > > > >> badly miscompiled and lead to arbitrary program behavior: > > > > > >> https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong > > > > > > > > > > > > Also there is no other practical definition of data race for automatic > > > > > > data race detectors than: two conflicting non-atomic concurrent > > > > > > accesses. Which this code is. Which means that if we continue writing > > > > > > such code we are not getting data race detection and don't detect > > > > > > thousands of races in kernel code that one may consider more harmful > > > > > > than this one the easy way. 
And instead will spent large amounts of > > > > > > time to fix some of then the hard way, and leave the rest as just too > > > > > > hard to debug so let the kernel continue crashing from time to time (I > > > > > > believe a portion of currently open syzbot bugs that developers just > > > > > > left as "I don't see how this can happen" are due to such races). > > > > > > > > > > > > > > > > I still cannot catch. Read/write of sizeof(long) bytes at naturally > > > > > aligned address is atomic, isn't it? > > > > > > > > Nobody guarantees this. According to C non-atomic conflicting > > > > reads/writes of sizeof(long) cause undefined behavior of the whole > > > > program. > > > > > > Yes, but to be fair the kernel has always relied on long accesses to be > > > atomic pretty heavily so that it is now de-facto standard for the kernel > > > AFAICT. I understand this makes life for static checkers hard but such is > > > reality. > > > > Yes, but nobody never defined what "a long access" means. And if you > > see a function that accepts a long argument and stores it into a long > > field, no, it does not qualify. I bet this will come at surprise to > > lots of developers. > > Check out this fix and try to extrapolate how this "function stores > > long into a long leads to a serious security bug" can actually be > > applied to whole lot of places after inlining (or when somebody just > > slightly shuffles code in a way that looks totally safe) that also > > kinda look safe and atomic: > > https://lore.kernel.org/patchwork/patch/599779/ > > So where is the boundary between "a long access" that is atomic and > > the one that is not necessary atomic? > > > +Linus, Greg, Kees > > I wanted to provide a hash/link to this commit but, wait, you want to > say that this patch for a security bugs was mailed, recorded by > patchwork, acked by subsystem developer and then dropped on the floor > for 3+ years? Doh! 
> > https://lore.kernel.org/patchwork/patch/599779/ > > There are known ways how to make this not a thing at all. Like open > pull requests on github: > https://github.com/google/syzkaller/pulls > or, some projects even do own dashboard for this: > https://dev.golang.org/reviews > because this is important. Especially for new contributors, drive-by > improvements, good samaritan fixes, etc. > > Another example: a bug-fixing patch was lost for 2 years: > "Two years ago ;) I don't understand why there were ignored" > https://www.spinics.net/lists/linux-mm/msg161351.html > > Another example: a patch is applied to a subsystem tree and then lost > for 6 months: > https://patchwork.kernel.org/patch/10339089/ I don't understand the issue here. Are you saying that sometimes patches that have been submitted get dropped? Yes, that's known, it is up to the submitter to verify and ensure that the patch is applied. Given our rate of change and the large workload that some maintainers have, this is the best that we can do at the moment. Putting it all in a github dashboard would not scale in the least (other projects smaller than us have tried and ended up stopping from doing that as it fails horribly). Yes, we can always do better, but remember that the submitter needs to take the time to ensure that their patches are applied. Heck, I have patches submitted months ago that I know the maintainers ignored, and I need to remember to send them again. We put the burden of development on the thing that scales, the developer themselves, not the maintainer here. It's the best that we know of how to do at the moment, and we are always trying to do better. Examples of this are where some subsystems are now getting multiple maintainers to handle the workload, and that's helping a lot. That doesn't work for all subsystems as not all subsystems can even find more than one maintainer who is willing to look at the patches. 
Please, resubmit your mount patch again, that's a crazy bug :) thanks, greg k-h ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message. 2019-01-16 16:28 ` Greg Kroah-Hartman @ 2019-01-17 13:18 ` Dmitry Vyukov 2019-01-21 8:37 ` Jan Kara 0 siblings, 1 reply; 31+ messages in thread From: Dmitry Vyukov @ 2019-01-17 13:18 UTC (permalink / raw) To: Greg Kroah-Hartman Cc: Jan Kara, Linus Torvalds, Kees Cook, Tetsuo Handa, Al Viro, linux-fsdevel, Kostya Serebryany On Wed, Jan 16, 2019 at 5:28 PM Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote: > > On Wed, Jan 16, 2019 at 12:48:41PM +0100, Dmitry Vyukov wrote: > > On Wed, Jan 16, 2019 at 12:03 PM Dmitry Vyukov <dvyukov@google.com> wrote: > > > > > > On Wed, Jan 16, 2019 at 11:43 AM Jan Kara <jack@suse.cz> wrote: > > > > > > > > On Wed 16-01-19 10:47:56, Dmitry Vyukov wrote: > > > > > On Fri, Jan 11, 2019 at 1:46 PM Tetsuo Handa > > > > > <penguin-kernel@i-love.sakura.ne.jp> wrote: > > > > > > > > > > > > On 2019/01/11 19:48, Dmitry Vyukov wrote: > > > > > > >> How did you arrive to the conclusion that it is harmless? > > > > > > >> There is only one relevant standard covering this, which is the C > > > > > > >> language standard, and it is very clear on this -- this has Undefined > > > > > > >> Behavior, that is the same as, for example, reading/writing random > > > > > > >> pointers. > > > > > > >> > > > > > > >> Check out this on how any race that you might think is benign can be > > > > > > >> badly miscompiled and lead to arbitrary program behavior: > > > > > > >> https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong > > > > > > > > > > > > > > Also there is no other practical definition of data race for automatic > > > > > > > data race detectors than: two conflicting non-atomic concurrent > > > > > > > accesses. Which this code is. 
Which means that if we continue writing > > > > > > > such code we are not getting data race detection and don't detect > > > > > > > thousands of races in kernel code that one may consider more harmful > > > > > > > than this one the easy way. And instead will spent large amounts of > > > > > > > time to fix some of then the hard way, and leave the rest as just too > > > > > > > hard to debug so let the kernel continue crashing from time to time (I > > > > > > > believe a portion of currently open syzbot bugs that developers just > > > > > > > left as "I don't see how this can happen" are due to such races). > > > > > > > > > > > > > > > > > > > I still cannot catch. Read/write of sizeof(long) bytes at naturally > > > > > > aligned address is atomic, isn't it? > > > > > > > > > > Nobody guarantees this. According to C non-atomic conflicting > > > > > reads/writes of sizeof(long) cause undefined behavior of the whole > > > > > program. > > > > > > > > Yes, but to be fair the kernel has always relied on long accesses to be > > > > atomic pretty heavily so that it is now de-facto standard for the kernel > > > > AFAICT. I understand this makes life for static checkers hard but such is > > > > reality. > > > > > > Yes, but nobody never defined what "a long access" means. And if you > > > see a function that accepts a long argument and stores it into a long > > > field, no, it does not qualify. I bet this will come at surprise to > > > lots of developers. > > > Check out this fix and try to extrapolate how this "function stores > > > long into a long leads to a serious security bug" can actually be > > > applied to whole lot of places after inlining (or when somebody just > > > slightly shuffles code in a way that looks totally safe) that also > > > kinda look safe and atomic: > > > https://lore.kernel.org/patchwork/patch/599779/ > > > So where is the boundary between "a long access" that is atomic and > > > the one that is not necessary atomic? 
> > > > > > +Linus, Greg, Kees > > > > I wanted to provide a hash/link to this commit but, wait, you want to > > say that this patch for a security bugs was mailed, recorded by > > patchwork, acked by subsystem developer and then dropped on the floor > > for 3+ years? Doh! > > > > https://lore.kernel.org/patchwork/patch/599779/ > > > > There are known ways how to make this not a thing at all. Like open > > pull requests on github: > > https://github.com/google/syzkaller/pulls > > or, some projects even do own dashboard for this: > > https://dev.golang.org/reviews > > because this is important. Especially for new contributors, drive-by > > improvements, good samaritan fixes, etc. > > > > Another example: a bug-fixing patch was lost for 2 years: > > "Two years ago ;) I don't understand why there were ignored" > > https://www.spinics.net/lists/linux-mm/msg161351.html > > > > Another example: a patch is applied to a subsystem tree and then lost > > for 6 months: > > https://patchwork.kernel.org/patch/10339089/ > > I don't understand the issue here. Are you saying that sometimes > patches that have been submitted get dropped? Yes, that's known, it is > up to the submitter to verify and ensure that the patch is applied. > Given our rate of change and the large workload that some maintainers > have, this is the best that we can do at the moment. > > Putting it all in a github dashboard would not scale in the least (other > projects smaller than us have tried and ended up stopping from doing > that as it fails horribly). > > Yes, we can always do better, but remember that the submitter needs to > take the time to ensure that their patches are applied. Heck, I have > patches submitted months ago that I know the maintainers ignored, and I > need to remember to send them again. We put the burden of development > on the thing that scales, the developer themselves, not the maintainer > here. 
> > It's the best that we know of how to do at the moment, and we are always > trying to do better. Examples of this are where some subsystems are now > getting multiple maintainers to handle the workload, and that's helping > a lot. That doesn't work for all subsystems as not all subsystems can > even find more than one maintainer who is willing to look at the > patches. The issue here is that patches are lost and "up to the submitter" is not fully working. It may be working reasonably well when a developer has an official assignment at work to do thing X, and then they can't miss/forget about "is thing X merged yet". But it fails for new contributors, drive-by improvements, good samaritan fixes, etc. We need those no less than the first category (maybe more). Machines are always better than humans at such scrupulous tracking work, so if humans can do it, machines will do even better. The dashboard definitely needs to be sharded in multiple dimensions, e.g. "per subsystem", "per assigned reviewer", and even "per author". Because, e.g., how many of mine are lost? Only this one, or more? How many of yours are lost? Do you know? I am sure this is doable and beneficial. I don't know why other projects failed with this; maybe that's something with github. But there are also codebases that are 100x larger than the kernel, absorb the amount of change the kernel receives in a year in less than a week, and nothing gets lost, thanks to scalable processes and automation. > Please, resubmit your mount patch again, that's a crazy bug :) That's the problem. It now requires way more additional work and it's even unclear whether the problem is still there at all; the code has radically changed. It could have been merged back then as is with zero additional work. I could have updated it a week after the original submission. 
But now that's completely paged out: I am looking at that file and I don't recognize anything, have no idea how the patch should be updated, no idea what tree such a patch should be based on, etc. Also a good question is how many of my other patches were lost that I now have no idea about? I discovered this one by pure accident. To make things more constructive: say, if somebody offers to build such a system for the kernel, in accordance with kernel-specific requirements (it would also enable presubmit testing, recording the base commit, not losing review comments, and other nice things), would you, Linus (not sure who is in charge of such decisions) be willing to integrate it into the official kernel development process so that everybody uses it for all kernel changes? ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message. 2019-01-17 13:18 ` Dmitry Vyukov @ 2019-01-21 8:37 ` Jan Kara 2019-01-21 10:33 ` Tetsuo Handa 2019-01-22 15:27 ` Kernel development process (was: [PATCH] fs: ratelimit __find_get_block_slow() failure message.) Dmitry Vyukov 0 siblings, 2 replies; 31+ messages in thread From: Jan Kara @ 2019-01-21 8:37 UTC (permalink / raw) To: Dmitry Vyukov Cc: Greg Kroah-Hartman, Jan Kara, Linus Torvalds, Kees Cook, Tetsuo Handa, Al Viro, linux-fsdevel, Kostya Serebryany On Thu 17-01-19 14:18:56, Dmitry Vyukov wrote: > On Wed, Jan 16, 2019 at 5:28 PM Greg Kroah-Hartman > <gregkh@linuxfoundation.org> wrote: > > > > On Wed, Jan 16, 2019 at 12:48:41PM +0100, Dmitry Vyukov wrote: > > > On Wed, Jan 16, 2019 at 12:03 PM Dmitry Vyukov <dvyukov@google.com> wrote: > > > I wanted to provide a hash/link to this commit but, wait, you want to > > > say that this patch for a security bugs was mailed, recorded by > > > patchwork, acked by subsystem developer and then dropped on the floor > > > for 3+ years? Doh! > > > > > > https://lore.kernel.org/patchwork/patch/599779/ > > > > > > There are known ways how to make this not a thing at all. Like open > > > pull requests on github: > > > https://github.com/google/syzkaller/pulls > > > or, some projects even do own dashboard for this: > > > https://dev.golang.org/reviews > > > because this is important. Especially for new contributors, drive-by > > > improvements, good samaritan fixes, etc. > > > > > > Another example: a bug-fixing patch was lost for 2 years: > > > "Two years ago ;) I don't understand why there were ignored" > > > https://www.spinics.net/lists/linux-mm/msg161351.html > > > > > > Another example: a patch is applied to a subsystem tree and then lost > > > for 6 months: > > > https://patchwork.kernel.org/patch/10339089/ > > > > I don't understand the issue here. Are you saying that sometimes > > patches that have been submitted get dropped? 
Yes, that's known, it is > > up to the submitter to verify and ensure that the patch is applied. > > Given our rate of change and the large workload that some maintainers > > have, this is the best that we can do at the moment. > > > > Putting it all in a github dashboard would not scale in the least (other > > projects smaller than us have tried and ended up stopping from doing > > that as it fails horribly). > > > > Yes, we can always do better, but remember that the submitter needs to > > take the time to ensure that their patches are applied. Heck, I have > > patches submitted months ago that I know the maintainers ignored, and I > > need to remember to send them again. We put the burden of development > > on the thing that scales, the developer themselves, not the maintainer > > here. > > > > It's the best that we know of how to do at the moment, and we are always > > trying to do better. Examples of this are where some subsystems are now > > getting multiple maintainers to handle the workload, and that's helping > > a lot. That doesn't work for all subsystems as not all subsystems can > > even find more than one maintainer who is willing to look at the > > patches. > > The issue here is that patches are lost and "up to the submitter" is > not fully working. > It may be working reasonably well when a developer has an official > assignment at work to do thing X, and then they can't miss/forget > about "is thing X merged yet". But it fails for new contributors, > drive-by improvements, good samaritan fixes, etc. Things that we need > no less than the first category (maybe more). > Machines are always better than humans at such scrupulous tracking > work. So if humans can do it, machines will do even better. > The dashboard definitely needs to be sharded in multiple dimensions. > E.g. "per subsystem", "per assigned reviewer", and even "per author". > Because e.g. how may mine are lost? Only this one or more? How many > yours are lost? Do you know? 
> I am sure this is doable and beneficial. I don't know why other > projects failed with this, maybe that's something with github. But > there are also codebases that are 100x larger than kernel and do > amount of changes kernel receives in a year in less than a week and > nothing gets lots thanks to scalable processes and automation. Out of curiosity which ones? > > Please, resubmit your mount patch again, that's a crazy bug :) > > That's the problem. It now requires way more additional work and it's > even unclear if the problem is still there or not, the code has > radically changed. It could have been merged back then as is with 0 > additional work. I could have been updated it a week after original > submission. But now that's completely paged out, I am looking at that > file and I don't recognize anything, no idea how the patch should be > updated, I now have no idea what tree such patch should be based on, > etc. Yeah, it's painful at times... I know it as well. > Also a good question is how many other my patches were lost that I now > have no idea about? I discovered this one by pure accident. Well, I do keep track of my own submitted patches in my git tree and occasionally sweep through it and resubmit / ping about lost ones. But yes, it requires some motivation and self-discipline which is not always present for drive-by contributions. > To make things more constructive: say, if somebody offers to build > such a system for kernel, in accordance with kernel specific > requirements (would also enable presubmit testing, recording base > commit, not losing review comments and other nice things), would you, > Linus (not sure who is in change of such decisions) be willing to > integrate it into the official kernel development process so that > everybody use it for all kernel changes? Well, we do have patchwork and some subsystems use it. It doesn't have pre-submit testing or other fancy features but it is good enough so that patches don't get lost. 
And Konstantin (kernel.org admin) did quite some good work recently with automating lots of tedious tasks with patchwork. So at this point I think it's more about the fact that some maintainers prefer to work differently rather than the lack of tooling as such. Honza -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message. 2019-01-21 8:37 ` Jan Kara @ 2019-01-21 10:33 ` Tetsuo Handa 2019-01-21 10:48 ` Greg Kroah-Hartman 2019-01-22 15:27 ` Kernel development process (was: [PATCH] fs: ratelimit __find_get_block_slow() failure message.) Dmitry Vyukov 1 sibling, 1 reply; 31+ messages in thread From: Tetsuo Handa @ 2019-01-21 10:33 UTC (permalink / raw) To: Jan Kara, Dmitry Vyukov Cc: Greg Kroah-Hartman, Linus Torvalds, Kees Cook, Al Viro, linux-fsdevel, Kostya Serebryany, Andrew Morton On 2019/01/21 17:37, Jan Kara wrote: > On Thu 17-01-19 14:18:56, Dmitry Vyukov wrote: >> Also a good question is how many other my patches were lost that I now >> have no idea about? I discovered this one by pure accident. > > Well, I do keep track of my own submitted patches in my git tree and > occasionally sweep through it and resubmit / ping about lost ones. But yes, > it requires some motivation and self-discipline which is not always present > for drive-by contributions. I use quilt. quilt is simple and easier for me than git. You can maintain a file named "series" and do "quilt push -a" && "quilt pop -a" for automatic checking. By the way, isn't Andrew Morton still using quilt? ^ permalink raw reply [flat|nested] 31+ messages in thread
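Editorial aside: quilt itself may not be installed everywhere, but the check Tetsuo gets from "quilt push -a" && "quilt pop -a" — walk a `series` file and make sure every outstanding patch still applies before it silently goes stale — can be approximated with plain git. The sketch below is hedged: the `patches/` directory, the series file, and the patch name are all hypothetical, and `git apply --check` stands in for quilt's push/pop dry-run.

```shell
#!/bin/sh
# Sketch: keep outstanding patches in patches/ with a quilt-style
# "series" file, and periodically verify each one still applies.
set -eu

work=$(mktemp -d)
cd "$work"

# A toy repository standing in for a kernel tree.
git init -q tree
cd tree
printf 'hello\n' > file.txt
git add file.txt
git -c user.name=me -c user.email=me@example.com commit -qm 'initial'

# One outstanding patch, tracked quilt-style in a series file.
mkdir -p ../patches
printf 'hello world\n' > file.txt
git diff > ../patches/0001-example.patch
git checkout -- file.txt
echo 0001-example.patch > ../patches/series

# The "quilt push -a && quilt pop -a" equivalent: dry-run every patch
# in order; a failure means the patch needs a rebase or a resend.
while read -r p; do
    if git apply --check "../patches/$p" 2>/dev/null; then
        echo "OK:    $p still applies"
    else
        echo "STALE: $p no longer applies, rebase and resubmit"
    fi
done < ../patches/series
```

Running this periodically (e.g. from cron) is a machine doing exactly the "scrupulous tracking work" discussed above, without waiting on anyone's self-discipline.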
* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message. 2019-01-21 10:33 ` Tetsuo Handa @ 2019-01-21 10:48 ` Greg Kroah-Hartman 0 siblings, 0 replies; 31+ messages in thread From: Greg Kroah-Hartman @ 2019-01-21 10:48 UTC (permalink / raw) To: Tetsuo Handa Cc: Jan Kara, Dmitry Vyukov, Linus Torvalds, Kees Cook, Al Viro, linux-fsdevel, Kostya Serebryany, Andrew Morton On Mon, Jan 21, 2019 at 07:33:13PM +0900, Tetsuo Handa wrote: > On 2019/01/21 17:37, Jan Kara wrote: > > On Thu 17-01-19 14:18:56, Dmitry Vyukov wrote: > >> Also a good question is how many other my patches were lost that I now > >> have no idea about? I discovered this one by pure accident. > > > > Well, I do keep track of my own submitted patches in my git tree and > > occasionally sweep through it and resubmit / ping about lost ones. But yes, > > it requires some motivation and self-discipline which is not always present > > for drive-by contributions. > > I use quilt. quilt is simple and easier for me than git. You can maintain a > file named "series" and do "quilt push -a" && "quilt pop -a" for automatic > checking. By the way, isn't Andrew Morton still using quilt? > I think Andrew's scripts pre-date quilt, they are what got turned into quilt. I use quilt daily for all stable kernel work, and for my personal kernel development as well. thanks, greg k-h ^ permalink raw reply [flat|nested] 31+ messages in thread
* Kernel development process (was: [PATCH] fs: ratelimit __find_get_block_slow() failure message.) 2019-01-21 8:37 ` Jan Kara 2019-01-21 10:33 ` Tetsuo Handa @ 2019-01-22 15:27 ` Dmitry Vyukov 2019-01-22 17:15 ` Jan Kara 1 sibling, 1 reply; 31+ messages in thread From: Dmitry Vyukov @ 2019-01-22 15:27 UTC (permalink / raw) To: Jan Kara Cc: Greg Kroah-Hartman, Linus Torvalds, Kees Cook, Tetsuo Handa, Al Viro, linux-fsdevel, Kostya Serebryany, LKML, Daniel Vetter [-- Attachment #1: Type: text/plain, Size: 7993 bytes --] On Mon, Jan 21, 2019 at 9:37 AM Jan Kara <jack@suse.cz> wrote: > > On Thu 17-01-19 14:18:56, Dmitry Vyukov wrote: > > On Wed, Jan 16, 2019 at 5:28 PM Greg Kroah-Hartman > > <gregkh@linuxfoundation.org> wrote: > > > > > > On Wed, Jan 16, 2019 at 12:48:41PM +0100, Dmitry Vyukov wrote: > > > > On Wed, Jan 16, 2019 at 12:03 PM Dmitry Vyukov <dvyukov@google.com> wrote: > > > > I wanted to provide a hash/link to this commit but, wait, you want to > > > > say that this patch for a security bugs was mailed, recorded by > > > > patchwork, acked by subsystem developer and then dropped on the floor > > > > for 3+ years? Doh! > > > > > > > > https://lore.kernel.org/patchwork/patch/599779/ > > > > > > > > There are known ways how to make this not a thing at all. Like open > > > > pull requests on github: > > > > https://github.com/google/syzkaller/pulls > > > > or, some projects even do own dashboard for this: > > > > https://dev.golang.org/reviews > > > > because this is important. Especially for new contributors, drive-by > > > > improvements, good samaritan fixes, etc. 
> > > > > > > > Another example: a bug-fixing patch was lost for 2 years: > > > > "Two years ago ;) I don't understand why there were ignored" > > > > https://www.spinics.net/lists/linux-mm/msg161351.html > > > > > > > > Another example: a patch is applied to a subsystem tree and then lost > > > > for 6 months: > > > > https://patchwork.kernel.org/patch/10339089/ > > > > > > I don't understand the issue here. Are you saying that sometimes > > > patches that have been submitted get dropped? Yes, that's known, it is > > > up to the submitter to verify and ensure that the patch is applied. > > > Given our rate of change and the large workload that some maintainers > > > have, this is the best that we can do at the moment. > > > > > > Putting it all in a github dashboard would not scale in the least (other > > > projects smaller than us have tried and ended up stopping from doing > > > that as it fails horribly). > > > > > > Yes, we can always do better, but remember that the submitter needs to > > > take the time to ensure that their patches are applied. Heck, I have > > > patches submitted months ago that I know the maintainers ignored, and I > > > need to remember to send them again. We put the burden of development > > > on the thing that scales, the developer themselves, not the maintainer > > > here. > > > > > > It's the best that we know of how to do at the moment, and we are always > > > trying to do better. Examples of this are where some subsystems are now > > > getting multiple maintainers to handle the workload, and that's helping > > > a lot. That doesn't work for all subsystems as not all subsystems can > > > even find more than one maintainer who is willing to look at the > > > patches. > > > > The issue here is that patches are lost and "up to the submitter" is > > not fully working. 
> > It may be working reasonably well when a developer has an official > > assignment at work to do thing X, and then they can't miss/forget > > about "is thing X merged yet". But it fails for new contributors, > > drive-by improvements, good samaritan fixes, etc. Things that we need > > no less than the first category (maybe more). > > Machines are always better than humans at such scrupulous tracking > > work. So if humans can do it, machines will do even better. > > The dashboard definitely needs to be sharded in multiple dimensions. > > E.g. "per subsystem", "per assigned reviewer", and even "per author". > > Because e.g. how may mine are lost? Only this one or more? How many > > yours are lost? Do you know? > > I am sure this is doable and beneficial. I don't know why other > > projects failed with this, maybe that's something with github. But > > there are also codebases that are 100x larger than kernel and do > > amount of changes kernel receives in a year in less than a week and > > nothing gets lots thanks to scalable processes and automation. > > Out of curiosity which ones? I mean in particular the Google codebase [1], but I think the Facebook [2], Chromium [3], Rust [4], and Go processes share many of the same principles. The overall idea is process unification and automation, building more complex functions on top of lower-level ones. This makes it possible to move very fast at very large scale while preserving very high code quality (as required by, and proven by, continuous delivery). 
> > I feel that perhaps I failed to explain the larger picture, assuming that it's common knowledge, but perhaps it's not, so I drew this one-page diagram of how functions build on top of functions and how it all fits together: > > https://docs.google.com/presentation/d/e/2PACX-1vRq2SdmiP-wqUb3Xo2drgn48bw2HbyGqFPP-ebfTfn6eNZkHSRwKZKRBAT6K3E3Ra9IJ218ZqRxvmfG/pub (also attached if you prefer a download) The goal is not to say that this is the only true way of doing things or that we need all of this, but to show that higher-level nice things can't be built without a proper lower-level foundation. We all agree on a few lowest-level things (like git and C), which is good and already brings tremendous benefits. But it really feels to me that at the current kernel scale and fundamentality we need the next layer of common building blocks in the process: things like change tracking (in particular, patches that can be reliably applied) and tests (that are easy to add, discover, and run locally and on CI). And to really work as a foundation these things need to be agreed on as being "the solution" (e.g. "all kernel changes go through patchwork") rather than "being allowed to be used by fragmented groups if they want". [1] https://cacm.acm.org/magazines/2016/7/204032-why-google-stores-billions-of-lines-of-code-in-a-single-repository/fulltext [2] https://framethink.wordpress.com/2011/01/17/how-facebook-ships-code/ [3] https://www.youtube.com/watch?v=dIageYT0Vgg [4] https://www.chromium.org/developers > > > Please, resubmit your mount patch again, that's a crazy bug :) > > That's the problem. It now requires way more additional work and it's > > even unclear if the problem is still there or not, the code has > > radically changed. It could have been merged back then as is with 0 > > additional work. I could have been updated it a week after original > > submission. 
But now that's completely paged out, I am looking at that > > file and I don't recognize anything, no idea how the patch should be > > updated, I now have no idea what tree such patch should be based on, > > etc. > > Yeah, it's painful at times... I know it as well. > > > Also a good question is how many other my patches were lost that I now > > have no idea about? I discovered this one by pure accident. > > Well, I do keep track of my own submitted patches in my git tree and > occasionally sweep through it and resubmit / ping about lost ones. But yes, > it requires some motivation and self-discipline which is not always present > for drive-by contributions. > > > To make things more constructive: say, if somebody offers to build > > such a system for kernel, in accordance with kernel specific > > requirements (would also enable presubmit testing, recording base > > commit, not losing review comments and other nice things), would you, > > Linus (not sure who is in change of such decisions) be willing to > > integrate it into the official kernel development process so that > > everybody use it for all kernel changes? > > Well, we do have patchwork and some subsystems use it. It doesn't have > pre-submit testing or other fancy features but it is good enough so that > patches don't get lost. And Konstantin (kernel.org admin) did quite some > good work recently with automating lots of tedious tasks with patchwork. So > at this point I think it's more about the fact that some maintainers prefer > to work differently rather than the lack of tooling as such. > > Honza > -- > Jan Kara <jack@suse.com> > SUSE Labs, CR [-- Attachment #2: devtools.pdf --] [-- Type: application/pdf, Size: 62218 bytes --] ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Kernel development process (was: [PATCH] fs: ratelimit __find_get_block_slow() failure message.) 2019-01-22 15:27 ` Kernel development process (was: [PATCH] fs: ratelimit __find_get_block_slow() failure message.) Dmitry Vyukov @ 2019-01-22 17:15 ` Jan Kara 0 siblings, 0 replies; 31+ messages in thread From: Jan Kara @ 2019-01-22 17:15 UTC (permalink / raw) To: Dmitry Vyukov Cc: Jan Kara, Greg Kroah-Hartman, Linus Torvalds, Kees Cook, Tetsuo Handa, Al Viro, linux-fsdevel, Kostya Serebryany, LKML, Daniel Vetter On Tue 22-01-19 16:27:53, Dmitry Vyukov wrote: > On Mon, Jan 21, 2019 at 9:37 AM Jan Kara <jack@suse.cz> wrote: > > > > On Thu 17-01-19 14:18:56, Dmitry Vyukov wrote: > > > On Wed, Jan 16, 2019 at 5:28 PM Greg Kroah-Hartman > > > <gregkh@linuxfoundation.org> wrote: > > > > > > > > On Wed, Jan 16, 2019 at 12:48:41PM +0100, Dmitry Vyukov wrote: > > > > > On Wed, Jan 16, 2019 at 12:03 PM Dmitry Vyukov <dvyukov@google.com> wrote: > > > > > I wanted to provide a hash/link to this commit but, wait, you want to > > > > > say that this patch for a security bugs was mailed, recorded by > > > > > patchwork, acked by subsystem developer and then dropped on the floor > > > > > for 3+ years? Doh! > > > > > > > > > > https://lore.kernel.org/patchwork/patch/599779/ > > > > > > > > > > There are known ways how to make this not a thing at all. Like open > > > > > pull requests on github: > > > > > https://github.com/google/syzkaller/pulls > > > > > or, some projects even do own dashboard for this: > > > > > https://dev.golang.org/reviews > > > > > because this is important. Especially for new contributors, drive-by > > > > > improvements, good samaritan fixes, etc. 
> > > > > > > > > > Another example: a bug-fixing patch was lost for 2 years: > > > > > "Two years ago ;) I don't understand why there were ignored" > > > > > https://www.spinics.net/lists/linux-mm/msg161351.html > > > > > > > > > > Another example: a patch is applied to a subsystem tree and then lost > > > > > for 6 months: > > > > > https://patchwork.kernel.org/patch/10339089/ > > > > > > > > I don't understand the issue here. Are you saying that sometimes > > > > patches that have been submitted get dropped? Yes, that's known, it is > > > > up to the submitter to verify and ensure that the patch is applied. > > > > Given our rate of change and the large workload that some maintainers > > > > have, this is the best that we can do at the moment. > > > > > > > > Putting it all in a github dashboard would not scale in the least (other > > > > projects smaller than us have tried and ended up stopping from doing > > > > that as it fails horribly). > > > > > > > > Yes, we can always do better, but remember that the submitter needs to > > > > take the time to ensure that their patches are applied. Heck, I have > > > > patches submitted months ago that I know the maintainers ignored, and I > > > > need to remember to send them again. We put the burden of development > > > > on the thing that scales, the developer themselves, not the maintainer > > > > here. > > > > > > > > It's the best that we know of how to do at the moment, and we are always > > > > trying to do better. Examples of this are where some subsystems are now > > > > getting multiple maintainers to handle the workload, and that's helping > > > > a lot. That doesn't work for all subsystems as not all subsystems can > > > > even find more than one maintainer who is willing to look at the > > > > patches. > > > > > > The issue here is that patches are lost and "up to the submitter" is > > > not fully working. 
> > > It may be working reasonably well when a developer has an official > > > assignment at work to do thing X, and then they can't miss/forget > > > about "is thing X merged yet". But it fails for new contributors, > > > drive-by improvements, good samaritan fixes, etc. Things that we need > > > no less than the first category (maybe more). > > > Machines are always better than humans at such scrupulous tracking > > > work. So if humans can do it, machines will do even better. > > > The dashboard definitely needs to be sharded in multiple dimensions. > > > E.g. "per subsystem", "per assigned reviewer", and even "per author". > > > Because e.g. how may mine are lost? Only this one or more? How many > > > yours are lost? Do you know? > > > I am sure this is doable and beneficial. I don't know why other > > > projects failed with this, maybe that's something with github. But > > > there are also codebases that are 100x larger than kernel and do > > > amount of changes kernel receives in a year in less than a week and > > > nothing gets lots thanks to scalable processes and automation. > > > > Out of curiosity which ones? > > I mean in particular Google codebase [1] but I think Facebook [2], > Chromium [3], Rust [4], Go processes share lots of the same > principles. Overall idea is process unification and automation and > building more complex functions on top of lower-level functions. This > allows to move very fast at very large scale and at the same time > preserving very high code quality (as required by and proven by > continuous delivery). 
> > I feel that perhaps I failed to explain the larger picture assuming > that it's common knowledge, but perhaps it's not, so I draw this > 1-pager diagram how functions build on top of functions and all fit > together: > > https://docs.google.com/presentation/d/e/2PACX-1vRq2SdmiP-wqUb3Xo2drgn48bw2HbyGqFPP-ebfTfn6eNZkHSRwKZKRBAT6K3E3Ra9IJ218ZqRxvmfG/pub > (also attached if you prefer a download) Thanks for drawing this and for the references! I know these things in principle but the image certainly helps in knowing what you are talking about exactly. > The goal is not to say that this is the only true way of doing things > or that we need all of this, but to show that higher-level nice things > can't be built without proper lower-level foundation. We all agree on > few lowest level things (like git and C), which is good and already > brings tremendous benefits. But it really feels to me that at the > current kernel scale and fundamentality we need the next layer of > common building blocks in the process: things like change tracking (in > particular, patches that can be reliably applied) and tests (that are > easy to add, discoverer, run locally and on CI). And to really work as > foundation these things need to be agreed on as being "the solution" > (e.g. "all kernel changes go through patchwork") rather then "being > allowed to be used by fragmented groups if they want". Understood. I guess eventually we may get to something like that but at least as far as I can observe current efforts, trying to change something in the kernel development process is like herding cats. You need to offer a big enough bowl of cream ;). 
> [1] https://cacm.acm.org/magazines/2016/7/204032-why-google-stores-billions-of-lines-of-code-in-a-single-repository/fulltext > [2] https://framethink.wordpress.com/2011/01/17/how-facebook-ships-code/ > [3] https://www.youtube.com/watch?v=dIageYT0Vgg > [4] https://www.chromium.org/developers Honza -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message. 2019-01-16 11:03 ` Dmitry Vyukov 2019-01-16 11:48 ` Dmitry Vyukov @ 2019-01-16 11:56 ` Jan Kara 2019-01-16 12:37 ` Dmitry Vyukov 1 sibling, 1 reply; 31+ messages in thread From: Jan Kara @ 2019-01-16 11:56 UTC (permalink / raw) To: Dmitry Vyukov; +Cc: Jan Kara, Tetsuo Handa, Al Viro, linux-fsdevel On Wed 16-01-19 12:03:27, Dmitry Vyukov wrote: > On Wed, Jan 16, 2019 at 11:43 AM Jan Kara <jack@suse.cz> wrote: > > > > On Wed 16-01-19 10:47:56, Dmitry Vyukov wrote: > > > On Fri, Jan 11, 2019 at 1:46 PM Tetsuo Handa > > > <penguin-kernel@i-love.sakura.ne.jp> wrote: > > > > > > > > On 2019/01/11 19:48, Dmitry Vyukov wrote: > > > > >> How did you arrive to the conclusion that it is harmless? > > > > >> There is only one relevant standard covering this, which is the C > > > > >> language standard, and it is very clear on this -- this has Undefined > > > > >> Behavior, that is the same as, for example, reading/writing random > > > > >> pointers. > > > > >> > > > > >> Check out this on how any race that you might think is benign can be > > > > >> badly miscompiled and lead to arbitrary program behavior: > > > > >> https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong > > > > > > > > > > Also there is no other practical definition of data race for automatic > > > > > data race detectors than: two conflicting non-atomic concurrent > > > > > accesses. Which this code is. Which means that if we continue writing > > > > > such code we are not getting data race detection and don't detect > > > > > thousands of races in kernel code that one may consider more harmful > > > > > than this one the easy way. 
And instead will spent large amounts of > > > > > time to fix some of then the hard way, and leave the rest as just too > > > > > hard to debug so let the kernel continue crashing from time to time (I > > > > > believe a portion of currently open syzbot bugs that developers just > > > > > left as "I don't see how this can happen" are due to such races). > > > > > > > > > > > > > I still cannot catch. Read/write of sizeof(long) bytes at naturally > > > > aligned address is atomic, isn't it? > > > > > > Nobody guarantees this. According to C non-atomic conflicting > > > reads/writes of sizeof(long) cause undefined behavior of the whole > > > program. > > > > Yes, but to be fair the kernel has always relied on long accesses to be > > atomic pretty heavily so that it is now de-facto standard for the kernel > > AFAICT. I understand this makes life for static checkers hard but such is > > reality. > > Yes, but nobody never defined what "a long access" means. And if you > see a function that accepts a long argument and stores it into a long > field, no, it does not qualify. I bet this will come at surprise to > lots of developers. Yes, inlining and other optimizations can screw you. > Check out this fix and try to extrapolate how this "function stores > long into a long leads to a serious security bug" can actually be > applied to whole lot of places after inlining (or when somebody just > slightly shuffles code in a way that looks totally safe) that also > kinda look safe and atomic: > https://lore.kernel.org/patchwork/patch/599779/ > So where is the boundary between "a long access" that is atomic and > the one that is not necessary atomic? So I tend to rely on "long access being atomic" for opaque values (no flags, no counters, ...). Just value that gets fetched from some global variable / other data structure, stored, read, and possibly compared for equality. 
I agree the compiler could still screw you if it could infer how that value got initially created and try to be clever about it... Honza -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message. 2019-01-16 11:56 ` [PATCH] fs: ratelimit __find_get_block_slow() failure message Jan Kara @ 2019-01-16 12:37 ` Dmitry Vyukov 2019-01-16 12:37 ` Dmitry Vyukov 2019-01-16 14:51 ` Jan Kara 0 siblings, 2 replies; 31+ messages in thread From: Dmitry Vyukov @ 2019-01-16 12:37 UTC (permalink / raw) To: Jan Kara Cc: Tetsuo Handa, Al Viro, linux-fsdevel, LKML, Paul E. McKenney, Alan Stern, Andrea Parri On Wed, Jan 16, 2019 at 12:56 PM Jan Kara <jack@suse.cz> wrote: > > On Wed 16-01-19 12:03:27, Dmitry Vyukov wrote: > > On Wed, Jan 16, 2019 at 11:43 AM Jan Kara <jack@suse.cz> wrote: > > > > > > On Wed 16-01-19 10:47:56, Dmitry Vyukov wrote: > > > > On Fri, Jan 11, 2019 at 1:46 PM Tetsuo Handa > > > > <penguin-kernel@i-love.sakura.ne.jp> wrote: > > > > > > > > > > On 2019/01/11 19:48, Dmitry Vyukov wrote: > > > > > >> How did you arrive to the conclusion that it is harmless? > > > > > >> There is only one relevant standard covering this, which is the C > > > > > >> language standard, and it is very clear on this -- this has Undefined > > > > > >> Behavior, that is the same as, for example, reading/writing random > > > > > >> pointers. > > > > > >> > > > > > >> Check out this on how any race that you might think is benign can be > > > > > >> badly miscompiled and lead to arbitrary program behavior: > > > > > >> https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong > > > > > > > > > > > > Also there is no other practical definition of data race for automatic > > > > > > data race detectors than: two conflicting non-atomic concurrent > > > > > > accesses. Which this code is. Which means that if we continue writing > > > > > > such code we are not getting data race detection and don't detect > > > > > > thousands of races in kernel code that one may consider more harmful > > > > > > than this one the easy way. 
And instead will spent large amounts of > > > > > > time to fix some of then the hard way, and leave the rest as just too > > > > > > hard to debug so let the kernel continue crashing from time to time (I > > > > > > believe a portion of currently open syzbot bugs that developers just > > > > > > left as "I don't see how this can happen" are due to such races). > > > > > > > > > > > > > > > > I still cannot catch. Read/write of sizeof(long) bytes at naturally > > > > > aligned address is atomic, isn't it? > > > > > > > > Nobody guarantees this. According to C non-atomic conflicting > > > > reads/writes of sizeof(long) cause undefined behavior of the whole > > > > program. > > > > > > Yes, but to be fair the kernel has always relied on long accesses to be > > > atomic pretty heavily so that it is now de-facto standard for the kernel > > > AFAICT. I understand this makes life for static checkers hard but such is > > > reality. > > > > Yes, but nobody never defined what "a long access" means. And if you > > see a function that accepts a long argument and stores it into a long > > field, no, it does not qualify. I bet this will come at surprise to > > lots of developers. > > Yes, inlining and other optimizations can screw you. > > > Check out this fix and try to extrapolate how this "function stores > > long into a long leads to a serious security bug" can actually be > > applied to whole lot of places after inlining (or when somebody just > > slightly shuffles code in a way that looks totally safe) that also > > kinda look safe and atomic: > > https://lore.kernel.org/patchwork/patch/599779/ > > So where is the boundary between "a long access" that is atomic and > > the one that is not necessary atomic? > > So I tend to rely on "long access being atomic" for opaque values (no > flags, no counters, ...). Just value that gets fetched from some global > variable / other data structure, stored, read, and possibly compared for > equality. 
> I agree the compiler could still screw you if it could infer how > that value got initially created and try to be clever about it... So can you, or somebody else, define a set of rules that we can use to discriminate each particular case? How can we avoid that "the compiler could still screw you"? Inlining is always enabled, right, so one needs to take into account everything that can possibly be inlined. Now or in the future. And also link-time code generation; if we don't use it we are dropping 10% of performance on the floor. Also, ensuring that the code works when it's first submitted is the smaller part of the problem. It's ensuring that it continues to work in the future that's more important. Try to imagine what amount of burden this puts onto all developers who touch any kernel code in the future. Basically, if you slightly alter local logic in a function that does not do any loads/stores, you can screw multiple "proofs" that long accesses are atomic. Or you just move a function from a .c file to a .h. I can bet nobody re-proves all the "long accesses are atomic" claims around the changed code during code reviews, so these things break over time. Or, even if only comparisons are involved (which you mentioned as "safe"), I see how that can actually affect the compilation process. Say we are in the branch where 2 variables compare equal; now, since no concurrency is involved from the compiler's point of view, it can, say, discard one variable and then re-load it from the other variable's location, and now the other variable has a value that the other one must never have. I don't have a full scenario, but that's exactly the point. One will never see all the possibilities. It all becomes a super slippery slope very quickly. And we do want the compiler to generate as fast code as possible and do all these optimizations. And it's not that there are big objective reasons to not just mark all concurrent accesses and stop spending large amounts of time on these "proofs". 
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message. 2019-01-16 12:37 ` Dmitry Vyukov 2019-01-16 12:37 ` Dmitry Vyukov @ 2019-01-16 14:51 ` Jan Kara 2019-01-16 14:51 ` Jan Kara 2019-01-16 15:33 ` Dmitry Vyukov 1 sibling, 2 replies; 31+ messages in thread From: Jan Kara @ 2019-01-16 14:51 UTC (permalink / raw) To: Dmitry Vyukov Cc: Jan Kara, Tetsuo Handa, Al Viro, linux-fsdevel, LKML, Paul E. McKenney, Alan Stern, Andrea Parri On Wed 16-01-19 13:37:22, Dmitry Vyukov wrote: > On Wed, Jan 16, 2019 at 12:56 PM Jan Kara <jack@suse.cz> wrote: > > > > On Wed 16-01-19 12:03:27, Dmitry Vyukov wrote: > > > On Wed, Jan 16, 2019 at 11:43 AM Jan Kara <jack@suse.cz> wrote: > > > > > > > > On Wed 16-01-19 10:47:56, Dmitry Vyukov wrote: > > > > > On Fri, Jan 11, 2019 at 1:46 PM Tetsuo Handa > > > > > <penguin-kernel@i-love.sakura.ne.jp> wrote: > > > > > > > > > > > > On 2019/01/11 19:48, Dmitry Vyukov wrote: > > > > > > >> How did you arrive to the conclusion that it is harmless? > > > > > > >> There is only one relevant standard covering this, which is the C > > > > > > >> language standard, and it is very clear on this -- this has Undefined > > > > > > >> Behavior, that is the same as, for example, reading/writing random > > > > > > >> pointers. > > > > > > >> > > > > > > >> Check out this on how any race that you might think is benign can be > > > > > > >> badly miscompiled and lead to arbitrary program behavior: > > > > > > >> https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong > > > > > > > > > > > > > > Also there is no other practical definition of data race for automatic > > > > > > > data race detectors than: two conflicting non-atomic concurrent > > > > > > > accesses. Which this code is. 
Which means that if we continue writing > > > > > > > such code we are not getting data race detection and don't detect > > > > > > > thousands of races in kernel code that one may consider more harmful > > > > > > > than this one the easy way. And instead will spent large amounts of > > > > > > > time to fix some of then the hard way, and leave the rest as just too > > > > > > > hard to debug so let the kernel continue crashing from time to time (I > > > > > > > believe a portion of currently open syzbot bugs that developers just > > > > > > > left as "I don't see how this can happen" are due to such races). > > > > > > > > > > > > > > > > > > > I still cannot catch. Read/write of sizeof(long) bytes at naturally > > > > > > aligned address is atomic, isn't it? > > > > > > > > > > Nobody guarantees this. According to C non-atomic conflicting > > > > > reads/writes of sizeof(long) cause undefined behavior of the whole > > > > > program. > > > > > > > > Yes, but to be fair the kernel has always relied on long accesses to be > > > > atomic pretty heavily so that it is now de-facto standard for the kernel > > > > AFAICT. I understand this makes life for static checkers hard but such is > > > > reality. > > > > > > Yes, but nobody never defined what "a long access" means. And if you > > > see a function that accepts a long argument and stores it into a long > > > field, no, it does not qualify. I bet this will come at surprise to > > > lots of developers. > > > > Yes, inlining and other optimizations can screw you. 
> > > > > Check out this fix and try to extrapolate how this "function stores > > > long into a long leads to a serious security bug" can actually be > > > applied to whole lot of places after inlining (or when somebody just > > > slightly shuffles code in a way that looks totally safe) that also > > > kinda look safe and atomic: > > > https://lore.kernel.org/patchwork/patch/599779/ > > > So where is the boundary between "a long access" that is atomic and > > > the one that is not necessary atomic? > > > > So I tend to rely on "long access being atomic" for opaque values (no > > flags, no counters, ...). Just value that gets fetched from some global > > variable / other data structure, stored, read, and possibly compared for > > equality. I agree the compiler could still screw you if it could infer how > > that value got initially created and try to be clever about it... > > So can you, or somebody else, define a set of rules that we can use to > discriminate each particular case? How can we avoid that "the compiler > could still screw you"? > > Inlining is always enabled, right, so one needs to take into account > everything that's possibly can be inlined. Now or in future. And also > link-time-code generation, if we don't use it we are dropping 10% of > performance on the floor. > Also, ensuring that the code works when it's first submitted is the > smaller part of the problem. It's ensuring that it continues to work > in future what's more important. Try to imagine what amount of burden > this puts onto all developers who touch any kernel code in future. > Basically if you slightly alter local logic in a function that does > not do any loads/stores, you can screw multiple "proofs" that long > accesses are atomic. Or, you just move a function from .c file to .h. > I can bet nobody re-proofs all "long accesses are atomic" around the > changed code during code reviews, so these things break over time. 
> Or, even if only comparisons are involved (that you mentioned as > "safe") I see how that can actually affect compilation process. Say, > we are in the branch where 2 variables compare equal, now since no > concurrency is involved from compiler point of view, it can, say, > discard one variable and then re-load it from the other variable's > location, and say not the other variable has value that the other one > must never have. I don't have a full scenario, but that's exactly the > point. One will never see all possibilities. > > It all becomes super slippery slope very quickly. And we do want > compiler to generate as fast code as possible and do all these > optimizations. And it's not that there are big objective reasons to > not just mark all concurrent accesses and stop spending large amounts > of time on these "proofs". I guess you've convinced me that somehow marking such accesses is desirable. So is using atomic_long_t and atomic_long_set() / atomic_long_read() for manipulation instead what you suggest? Honza -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 31+ messages in thread
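[Editor's note: a minimal userspace sketch of the marking Jan asks about, using C11 <stdatomic.h> as a stand-in for the kernel-only atomic_long_t with atomic_long_read()/atomic_long_set(). The ratelimit_ok() helper and its names are hypothetical, modeled loosely on the `last` jiffies check in the patch, not actual kernel code. The point is only that each access is a single, marked, untorn load or store:]

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the 'last' timestamp in the patch.
 * C11 'atomic_long' plays the role the kernel's atomic_long_t would. */
static atomic_long last_logged;

static bool ratelimit_ok(long now, long interval)
{
    /* One marked load: the compiler may not tear it or refetch it.
     * Relaxed ordering suffices; nothing is ordered against this value. */
    long snap = atomic_load_explicit(&last_logged, memory_order_relaxed);

    if (now - snap < interval)
        return false;   /* within the quiet window: suppress the message */

    /* One marked store: concurrent readers see old or new, never garbage. */
    atomic_store_explicit(&last_logged, now, memory_order_relaxed);
    return true;
}
```

[A racing caller can at worst print one extra message, which is acceptable for a ratelimit; what the marking buys is that the accesses are untorn and visible to race detectors.]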
* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message. 2019-01-16 14:51 ` Jan Kara 2019-01-16 14:51 ` Jan Kara @ 2019-01-16 15:33 ` Dmitry Vyukov 2019-01-16 15:33 ` Dmitry Vyukov 2019-01-16 16:15 ` Paul E. McKenney 1 sibling, 2 replies; 31+ messages in thread From: Dmitry Vyukov @ 2019-01-16 15:33 UTC (permalink / raw) To: Jan Kara Cc: Tetsuo Handa, Al Viro, linux-fsdevel, LKML, Paul E. McKenney, Alan Stern, Andrea Parri On Wed, Jan 16, 2019 at 3:51 PM Jan Kara <jack@suse.cz> wrote: > > On Wed 16-01-19 13:37:22, Dmitry Vyukov wrote: > > On Wed, Jan 16, 2019 at 12:56 PM Jan Kara <jack@suse.cz> wrote: > > > > > > On Wed 16-01-19 12:03:27, Dmitry Vyukov wrote: > > > > On Wed, Jan 16, 2019 at 11:43 AM Jan Kara <jack@suse.cz> wrote: > > > > > > > > > > On Wed 16-01-19 10:47:56, Dmitry Vyukov wrote: > > > > > > On Fri, Jan 11, 2019 at 1:46 PM Tetsuo Handa > > > > > > <penguin-kernel@i-love.sakura.ne.jp> wrote: > > > > > > > > > > > > > > On 2019/01/11 19:48, Dmitry Vyukov wrote: > > > > > > > >> How did you arrive to the conclusion that it is harmless? > > > > > > > >> There is only one relevant standard covering this, which is the C > > > > > > > >> language standard, and it is very clear on this -- this has Undefined > > > > > > > >> Behavior, that is the same as, for example, reading/writing random > > > > > > > >> pointers. > > > > > > > >> > > > > > > > >> Check out this on how any race that you might think is benign can be > > > > > > > >> badly miscompiled and lead to arbitrary program behavior: > > > > > > > >> https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong > > > > > > > > > > > > > > > > Also there is no other practical definition of data race for automatic > > > > > > > > data race detectors than: two conflicting non-atomic concurrent > > > > > > > > accesses. Which this code is. 
Which means that if we continue writing > > > > > > > > such code we are not getting data race detection and don't detect > > > > > > > > thousands of races in kernel code that one may consider more harmful > > > > > > > > than this one the easy way. And instead will spent large amounts of > > > > > > > > time to fix some of then the hard way, and leave the rest as just too > > > > > > > > hard to debug so let the kernel continue crashing from time to time (I > > > > > > > > believe a portion of currently open syzbot bugs that developers just > > > > > > > > left as "I don't see how this can happen" are due to such races). > > > > > > > > > > > > > > > > > > > > > > I still cannot catch. Read/write of sizeof(long) bytes at naturally > > > > > > > aligned address is atomic, isn't it? > > > > > > > > > > > > Nobody guarantees this. According to C non-atomic conflicting > > > > > > reads/writes of sizeof(long) cause undefined behavior of the whole > > > > > > program. > > > > > > > > > > Yes, but to be fair the kernel has always relied on long accesses to be > > > > > atomic pretty heavily so that it is now de-facto standard for the kernel > > > > > AFAICT. I understand this makes life for static checkers hard but such is > > > > > reality. > > > > > > > > Yes, but nobody never defined what "a long access" means. And if you > > > > see a function that accepts a long argument and stores it into a long > > > > field, no, it does not qualify. I bet this will come at surprise to > > > > lots of developers. > > > > > > Yes, inlining and other optimizations can screw you. 
> > > > > > > Check out this fix and try to extrapolate how this "function stores > > > > long into a long leads to a serious security bug" can actually be > > > > applied to whole lot of places after inlining (or when somebody just > > > > slightly shuffles code in a way that looks totally safe) that also > > > > kinda look safe and atomic: > > > > https://lore.kernel.org/patchwork/patch/599779/ > > > > So where is the boundary between "a long access" that is atomic and > > > > the one that is not necessary atomic? > > > > > > So I tend to rely on "long access being atomic" for opaque values (no > > > flags, no counters, ...). Just value that gets fetched from some global > > > variable / other data structure, stored, read, and possibly compared for > > > equality. I agree the compiler could still screw you if it could infer how > > > that value got initially created and try to be clever about it... > > > > So can you, or somebody else, define a set of rules that we can use to > > discriminate each particular case? How can we avoid that "the compiler > > could still screw you"? > > > > Inlining is always enabled, right, so one needs to take into account > > everything that's possibly can be inlined. Now or in future. And also > > link-time-code generation, if we don't use it we are dropping 10% of > > performance on the floor. > > Also, ensuring that the code works when it's first submitted is the > > smaller part of the problem. It's ensuring that it continues to work > > in future what's more important. Try to imagine what amount of burden > > this puts onto all developers who touch any kernel code in future. > > Basically if you slightly alter local logic in a function that does > > not do any loads/stores, you can screw multiple "proofs" that long > > accesses are atomic. Or, you just move a function from .c file to .h. 
> > I can bet nobody re-proofs all "long accesses are atomic" around the > > changed code during code reviews, so these things break over time. > > Or, even if only comparisons are involved (that you mentioned as > > "safe") I see how that can actually affect compilation process. Say, > > we are in the branch where 2 variables compare equal, now since no > > concurrency is involved from compiler point of view, it can, say, > > discard one variable and then re-load it from the other variable's > > location, and say not the other variable has value that the other one > > must never have. I don't have a full scenario, but that's exactly the > > point. One will never see all possibilities. > > > > It all becomes super slippery slope very quickly. And we do want > > compiler to generate as fast code as possible and do all these > > optimizations. And it's not that there are big objective reasons to > > not just mark all concurrent accesses and stop spending large amounts > > of time on these "proofs". > > I guess you've convinced me that somehow marking such accesses is > desirable. So is using atomic_long_t and atomic_long_set() > / atomic_long_read() for manipulation instead what you suggest? +Paul, Alan, Andrea, who may better know the long-term direction. I think the closest analog of "plain long read/write that is intended to be atomic" is using READ_ONCE/WRITE_ONCE with a normal non-atomic type. It achieves the goal of just reading/writing the value once without any tricks, but also makes such accesses visible and searchable for humans and tools, and amenable to potential future refactoring to something else. ^ permalink raw reply [flat|nested] 31+ messages in thread
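[Editor's note: Dmitry's READ_ONCE/WRITE_ONCE suggestion, sketched against the ratelimit check from the patch. The macros below are simplified userspace approximations of the kernel's (which live in <linux/compiler.h>); for word-sized types they boil down to a volatile access, which forces exactly one untorn load or store and forbids caching or refetching. should_print() is a hypothetical name for illustration, not the patch's actual code:]

```c
#include <stdbool.h>

/* Simplified stand-ins for the kernel macros: a volatile-qualified
 * access the compiler must perform exactly once, without tearing. */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

static unsigned long last;   /* plain type; marked at each access site */

static bool should_print(unsigned long now, unsigned long hz)
{
    unsigned long snap = READ_ONCE(last);   /* one marked, searchable load */

    if (now - snap < hz)
        return false;                       /* still inside the quiet second */

    WRITE_ONCE(last, now);                  /* one marked, searchable store */
    return true;
}
```

[Unlike switching the field to atomic_long_t, this keeps the declaration a plain unsigned long and marks only the concurrent access sites, which is what makes the races greppable for tools and reviewers.]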
* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message. 2019-01-16 15:33 ` Dmitry Vyukov @ 2019-01-16 15:33 ` Dmitry Vyukov 2019-01-16 16:15 ` Paul E. McKenney 1 sibling, 0 replies; 31+ messages in thread From: Dmitry Vyukov @ 2019-01-16 15:33 UTC (permalink / raw) To: Jan Kara Cc: Tetsuo Handa, Al Viro, linux-fsdevel, LKML, Paul E. McKenney, Alan Stern, Andrea Parri On Wed, Jan 16, 2019 at 3:51 PM Jan Kara <jack@suse.cz> wrote: > > On Wed 16-01-19 13:37:22, Dmitry Vyukov wrote: > > On Wed, Jan 16, 2019 at 12:56 PM Jan Kara <jack@suse.cz> wrote: > > > > > > On Wed 16-01-19 12:03:27, Dmitry Vyukov wrote: > > > > On Wed, Jan 16, 2019 at 11:43 AM Jan Kara <jack@suse.cz> wrote: > > > > > > > > > > On Wed 16-01-19 10:47:56, Dmitry Vyukov wrote: > > > > > > On Fri, Jan 11, 2019 at 1:46 PM Tetsuo Handa > > > > > > <penguin-kernel@i-love.sakura.ne.jp> wrote: > > > > > > > > > > > > > > On 2019/01/11 19:48, Dmitry Vyukov wrote: > > > > > > > >> How did you arrive to the conclusion that it is harmless? > > > > > > > >> There is only one relevant standard covering this, which is the C > > > > > > > >> language standard, and it is very clear on this -- this has Undefined > > > > > > > >> Behavior, that is the same as, for example, reading/writing random > > > > > > > >> pointers. > > > > > > > >> > > > > > > > >> Check out this on how any race that you might think is benign can be > > > > > > > >> badly miscompiled and lead to arbitrary program behavior: > > > > > > > >> https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong > > > > > > > > > > > > > > > > Also there is no other practical definition of data race for automatic > > > > > > > > data race detectors than: two conflicting non-atomic concurrent > > > > > > > > accesses. Which this code is. 
Which means that if we continue writing > > > > > > > > such code we are not getting data race detection and don't detect > > > > > > > > thousands of races in kernel code that one may consider more harmful > > > > > > > > than this one the easy way. And instead will spent large amounts of > > > > > > > > time to fix some of then the hard way, and leave the rest as just too > > > > > > > > hard to debug so let the kernel continue crashing from time to time (I > > > > > > > > believe a portion of currently open syzbot bugs that developers just > > > > > > > > left as "I don't see how this can happen" are due to such races). > > > > > > > > > > > > > > > > > > > > > > I still cannot catch. Read/write of sizeof(long) bytes at naturally > > > > > > > aligned address is atomic, isn't it? > > > > > > > > > > > > Nobody guarantees this. According to C non-atomic conflicting > > > > > > reads/writes of sizeof(long) cause undefined behavior of the whole > > > > > > program. > > > > > > > > > > Yes, but to be fair the kernel has always relied on long accesses to be > > > > > atomic pretty heavily so that it is now de-facto standard for the kernel > > > > > AFAICT. I understand this makes life for static checkers hard but such is > > > > > reality. > > > > > > > > Yes, but nobody never defined what "a long access" means. And if you > > > > see a function that accepts a long argument and stores it into a long > > > > field, no, it does not qualify. I bet this will come at surprise to > > > > lots of developers. > > > > > > Yes, inlining and other optimizations can screw you. 
> > > > > > > Check out this fix and try to extrapolate how this "function stores > > > > long into a long leads to a serious security bug" can actually be > > > > applied to whole lot of places after inlining (or when somebody just > > > > slightly shuffles code in a way that looks totally safe) that also > > > > kinda look safe and atomic: > > > > https://lore.kernel.org/patchwork/patch/599779/ > > > > So where is the boundary between "a long access" that is atomic and > > > > the one that is not necessary atomic? > > > > > > So I tend to rely on "long access being atomic" for opaque values (no > > > flags, no counters, ...). Just value that gets fetched from some global > > > variable / other data structure, stored, read, and possibly compared for > > > equality. I agree the compiler could still screw you if it could infer how > > > that value got initially created and try to be clever about it... > > > > So can you, or somebody else, define a set of rules that we can use to > > discriminate each particular case? How can we avoid that "the compiler > > could still screw you"? > > > > Inlining is always enabled, right, so one needs to take into account > > everything that's possibly can be inlined. Now or in future. And also > > link-time-code generation, if we don't use it we are dropping 10% of > > performance on the floor. > > Also, ensuring that the code works when it's first submitted is the > > smaller part of the problem. It's ensuring that it continues to work > > in future what's more important. Try to imagine what amount of burden > > this puts onto all developers who touch any kernel code in future. > > Basically if you slightly alter local logic in a function that does > > not do any loads/stores, you can screw multiple "proofs" that long > > accesses are atomic. Or, you just move a function from .c file to .h. 
> > I can bet nobody re-proofs all "long accesses are atomic" around the > > changed code during code reviews, so these things break over time. > > Or, even if only comparisons are involved (that you mentioned as > > "safe") I see how that can actually affect compilation process. Say, > > we are in the branch where 2 variables compare equal, now since no > > concurrency is involved from compiler point of view, it can, say, > > discard one variable and then re-load it from the other variable's > > location, and say not the other variable has value that the other one > > must never have. I don't have a full scenario, but that's exactly the > > point. One will never see all possibilities. > > > > It all becomes super slippery slope very quickly. And we do want > > compiler to generate as fast code as possible and do all these > > optimizations. And it's not that there are big objective reasons to > > not just mark all concurrent accesses and stop spending large amounts > > of time on these "proofs". > > I guess you've convinced me that somehow marking such accesses is > desirable. So is using atomic_long_t and atomic_long_set() > / atomic_long_read() for manipulation instead what you suggest? +Paul, Alan, Andrea, who may better know the long-term direction. I think the closer analog of "plain long read/write that is intended to be atomic" is using READ_ONCE/WRITE_ONCE with a normal non-atomic type. It should achieve the goal of just read/write it once without any tricks, but also makes them visible/searchable for humans and tools and potential future refactoring to something else. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message. 2019-01-16 15:33 ` Dmitry Vyukov 2019-01-16 15:33 ` Dmitry Vyukov @ 2019-01-16 16:15 ` Paul E. McKenney 2019-01-16 16:15 ` Paul E. McKenney 1 sibling, 1 reply; 31+ messages in thread From: Paul E. McKenney @ 2019-01-16 16:15 UTC (permalink / raw) To: Dmitry Vyukov Cc: Jan Kara, Tetsuo Handa, Al Viro, linux-fsdevel, LKML, Alan Stern, Andrea Parri On Wed, Jan 16, 2019 at 04:33:05PM +0100, Dmitry Vyukov wrote: > On Wed, Jan 16, 2019 at 3:51 PM Jan Kara <jack@suse.cz> wrote: > > > > On Wed 16-01-19 13:37:22, Dmitry Vyukov wrote: > > > On Wed, Jan 16, 2019 at 12:56 PM Jan Kara <jack@suse.cz> wrote: > > > > > > > > On Wed 16-01-19 12:03:27, Dmitry Vyukov wrote: > > > > > On Wed, Jan 16, 2019 at 11:43 AM Jan Kara <jack@suse.cz> wrote: > > > > > > > > > > > > On Wed 16-01-19 10:47:56, Dmitry Vyukov wrote: > > > > > > > On Fri, Jan 11, 2019 at 1:46 PM Tetsuo Handa > > > > > > > <penguin-kernel@i-love.sakura.ne.jp> wrote: > > > > > > > > > > > > > > > > On 2019/01/11 19:48, Dmitry Vyukov wrote: > > > > > > > > >> How did you arrive to the conclusion that it is harmless? > > > > > > > > >> There is only one relevant standard covering this, which is the C > > > > > > > > >> language standard, and it is very clear on this -- this has Undefined > > > > > > > > >> Behavior, that is the same as, for example, reading/writing random > > > > > > > > >> pointers. > > > > > > > > >> > > > > > > > > >> Check out this on how any race that you might think is benign can be > > > > > > > > >> badly miscompiled and lead to arbitrary program behavior: > > > > > > > > >> https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong > > > > > > > > > > > > > > > > > > Also there is no other practical definition of data race for automatic > > > > > > > > > data race detectors than: two conflicting non-atomic concurrent > > > > > > > > > accesses. Which this code is. 
Which means that if we continue writing > > > > > > > > > such code we are not getting data race detection and don't detect > > > > > > > > > thousands of races in kernel code that one may consider more harmful > > > > > > > > > than this one the easy way. And instead will spent large amounts of > > > > > > > > > time to fix some of then the hard way, and leave the rest as just too > > > > > > > > > hard to debug so let the kernel continue crashing from time to time (I > > > > > > > > > believe a portion of currently open syzbot bugs that developers just > > > > > > > > > left as "I don't see how this can happen" are due to such races). > > > > > > > > > > > > > > > > > > > > > > > > > I still cannot catch. Read/write of sizeof(long) bytes at naturally > > > > > > > > aligned address is atomic, isn't it? > > > > > > > > > > > > > > Nobody guarantees this. According to C non-atomic conflicting > > > > > > > reads/writes of sizeof(long) cause undefined behavior of the whole > > > > > > > program. > > > > > > > > > > > > Yes, but to be fair the kernel has always relied on long accesses to be > > > > > > atomic pretty heavily so that it is now de-facto standard for the kernel > > > > > > AFAICT. I understand this makes life for static checkers hard but such is > > > > > > reality. > > > > > > > > > > Yes, but nobody never defined what "a long access" means. And if you > > > > > see a function that accepts a long argument and stores it into a long > > > > > field, no, it does not qualify. I bet this will come at surprise to > > > > > lots of developers. > > > > > > > > Yes, inlining and other optimizations can screw you. 
> > > > > > > > > Check out this fix and try to extrapolate how this "function stores > > > > > long into a long leads to a serious security bug" can actually be > > > > > applied to whole lot of places after inlining (or when somebody just > > > > > slightly shuffles code in a way that looks totally safe) that also > > > > > kinda look safe and atomic: > > > > > https://lore.kernel.org/patchwork/patch/599779/ > > > > > So where is the boundary between "a long access" that is atomic and > > > > > the one that is not necessary atomic? > > > > > > > > So I tend to rely on "long access being atomic" for opaque values (no > > > > flags, no counters, ...). Just value that gets fetched from some global > > > > variable / other data structure, stored, read, and possibly compared for > > > > equality. I agree the compiler could still screw you if it could infer how > > > > that value got initially created and try to be clever about it... > > > > > > So can you, or somebody else, define a set of rules that we can use to > > > discriminate each particular case? How can we avoid that "the compiler > > > could still screw you"? > > > > > > Inlining is always enabled, right, so one needs to take into account > > > everything that's possibly can be inlined. Now or in future. And also > > > link-time-code generation, if we don't use it we are dropping 10% of > > > performance on the floor. > > > Also, ensuring that the code works when it's first submitted is the > > > smaller part of the problem. It's ensuring that it continues to work > > > in future what's more important. Try to imagine what amount of burden > > > this puts onto all developers who touch any kernel code in future. > > > Basically if you slightly alter local logic in a function that does > > > not do any loads/stores, you can screw multiple "proofs" that long > > > accesses are atomic. Or, you just move a function from .c file to .h. 
> > > I can bet nobody re-proofs all "long accesses are atomic" around the > > > changed code during code reviews, so these things break over time. > > > Or, even if only comparisons are involved (that you mentioned as > > > "safe") I see how that can actually affect compilation process. Say, > > > we are in the branch where 2 variables compare equal, now since no > > > concurrency is involved from compiler point of view, it can, say, > > > discard one variable and then re-load it from the other variable's > > > location, and say not the other variable has value that the other one > > > must never have. I don't have a full scenario, but that's exactly the > > > point. One will never see all possibilities. > > > > > > It all becomes super slippery slope very quickly. And we do want > > > compiler to generate as fast code as possible and do all these > > > optimizations. And it's not that there are big objective reasons to > > > not just mark all concurrent accesses and stop spending large amounts > > > of time on these "proofs". > > > > I guess you've convinced me that somehow marking such accesses is > > desirable. So is using atomic_long_t and atomic_long_set() > > / atomic_long_read() for manipulation instead what you suggest? > > +Paul, Alan, Andrea, who may better know the long-term direction. > I think the closer analog of "plain long read/write that is intended > to be atomic" is using READ_ONCE/WRITE_ONCE with a normal non-atomic > type. It should achieve the goal of just read/write it once without > any tricks, but also makes them visible/searchable for humans and > tools and potential future refactoring to something else. Agreed, use of READ_ONCE() and WRITE_ONCE() should suffice. The atomic_long_set() and atomic_long_read() would also work, but are a bit more typing. ;-) Thanx, Paul ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message. 2019-01-16 10:43 ` Jan Kara 2019-01-16 11:03 ` Dmitry Vyukov @ 2019-01-17 14:11 ` Tetsuo Handa 2019-01-18 15:30 ` Dmitry Vyukov 1 sibling, 1 reply; 31+ messages in thread From: Tetsuo Handa @ 2019-01-17 14:11 UTC (permalink / raw) To: Jan Kara, Dmitry Vyukov; +Cc: Al Viro, linux-fsdevel On 2019/01/16 19:43, Jan Kara wrote: > On Wed 16-01-19 10:47:56, Dmitry Vyukov wrote: >>> I still cannot catch. Read/write of sizeof(long) bytes at naturally >>> aligned address is atomic, isn't it? >> >> Nobody guarantees this. According to C non-atomic conflicting >> reads/writes of sizeof(long) cause undefined behavior of the whole >> program. > > Yes, but to be fair the kernel has always relied on long accesses to be > atomic pretty heavily so that it is now de-facto standard for the kernel > AFAICT. I understand this makes life for static checkers hard but such is > reality. Regarding "load a long word from naturally aligned address" and "store a long word to naturally aligned address", they are required to be atomic in Linux kernel, (for it is described as "On all systems running Linux, loads from and stores to pointers are atomic, that is, if a store to a pointer occurs at the same time as a load from that same pointer, the load will return either the initial value or the value stored, never some bitwise mashup of the two." at https://lwn.net/Articles/262464/ ) aren't they? And I know we need to use READ_ONCE()/WRITE_ONCE() etc. when we need to guarantee that reordering/reloading is not acceptable. But what damage can cause for this specific case? It does only "compare two long variables" in addition to "load a long word from naturally aligned address" and "store a long word to naturally aligned address". Reordering/reloading these variables won't cause severe problems. > > But in this particular case I agree with you that special logic is not > really warranted. 
Anyway, do we apply V2 patch at https://marc.info/?i=cd9c47c9-1dc6-6949-a138-703f6a10bc9b@i-love.sakura.ne.jp ? ^ permalink raw reply [flat|nested] 31+ messages in thread
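As background on the time_in_range() call in the original patch: the jiffies comparison helpers are written so that they stay correct when the tick counter wraps, which a naive integer comparison does not. A userspace sketch of those macros (the real definitions are in include/linux/jiffies.h) makes the trick visible:

```c
#include <assert.h>
#include <limits.h>

/*
 * Wraparound-safe time comparisons, as in <linux/jiffies.h>:
 * the subtraction is done in unsigned arithmetic (well defined on
 * overflow) and the result reinterpreted as signed, so "a - b >= 0"
 * means "a is at or after b" even across a counter wrap.
 */
#define time_after_eq(a, b)    ((long)((a) - (b)) >= 0)
#define time_before_eq(a, b)   time_after_eq(b, a)
#define time_in_range(a, b, c) (time_after_eq(a, b) && time_before_eq(a, c))
```

With last = ULONG_MAX - 10 and a wrapped "now" of 5, time_in_range() still reports that less than HZ ticks have elapsed, whereas the naive b <= a && a <= c comparison would not.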
* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message. 2019-01-17 14:11 ` Tetsuo Handa @ 2019-01-18 15:30 ` Dmitry Vyukov 0 siblings, 0 replies; 31+ messages in thread From: Dmitry Vyukov @ 2019-01-18 15:30 UTC (permalink / raw) To: Tetsuo Handa; +Cc: Jan Kara, Al Viro, linux-fsdevel On Thu, Jan 17, 2019 at 3:11 PM Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> wrote: > > On 2019/01/16 19:43, Jan Kara wrote: > > On Wed 16-01-19 10:47:56, Dmitry Vyukov wrote: > >>> I still cannot catch. Read/write of sizeof(long) bytes at naturally > >>> aligned address is atomic, isn't it? > >> > >> Nobody guarantees this. According to C non-atomic conflicting > >> reads/writes of sizeof(long) cause undefined behavior of the whole > >> program. > > > > Yes, but to be fair the kernel has always relied on long accesses to be > > atomic pretty heavily so that it is now de-facto standard for the kernel > > AFAICT. I understand this makes life for static checkers hard but such is > > reality. > > Regarding "load a long word from naturally aligned address" and > "store a long word to naturally aligned address", they are required to be atomic > in Linux kernel, (for it is described as "On all systems running Linux, loads from > and stores to pointers are atomic, that is, if a store to a pointer occurs at the > same time as a load from that same pointer, the load will return either the initial > value or the value stored, never some bitwise mashup of the two." at > https://lwn.net/Articles/262464/ ) aren't they? Well, the description of the patch explains how/why a compiler can make a C pointer store non-atomic. Also you can see that list_for_each_entry_rcu uses READ_ONCE: https://elixir.bootlin.com/linux/v5.0-rc2/source/include/linux/rculist.h#L278 > And I know we need to use READ_ONCE()/WRITE_ONCE() etc. when we need to guarantee > that reordering/reloading is not acceptable. READ/WRITE_ONCE don't have anything to do with ordering. 
They only provide atomicity and visibility. That's various acquire/release/barriers that provide ordering guarantees. > But what damage can cause for this > specific case? Any. > It does only "compare two long variables" in addition to "load a > long word from naturally aligned address" and "store a long word to naturally > aligned address". Reordering/reloading these variables won't cause severe problems. > > > > > But in this particular case I agree with you that special logic is not > > really warranted. > > Anyway, do we apply V2 patch at > https://marc.info/?i=cd9c47c9-1dc6-6949-a138-703f6a10bc9b@i-love.sakura.ne.jp ? ^ permalink raw reply [flat|nested] 31+ messages in thread
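The atomicity-versus-ordering distinction Dmitry draws maps closely onto the C11 memory model, which may be a useful reference point: relaxed atomics correspond roughly to READ_ONCE()/WRITE_ONCE() (atomicity and visibility, no ordering), while release/acquire correspond roughly to the kernel's smp_store_release()/smp_load_acquire(). A single-threaded sketch of the classic message-passing pattern (demonstrating the API shape only; real concurrent behavior of course needs threads):

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_int data;
static atomic_int ready;

static void producer(void)
{
        /* Relaxed store: atomic, but imposes no ordering by itself. */
        atomic_store_explicit(&data, 42, memory_order_relaxed);
        /* Release store: all stores above become visible to any
         * thread that observes ready == 1 with an acquire load. */
        atomic_store_explicit(&ready, 1, memory_order_release);
}

static int consumer(void)
{
        /* Acquire load pairs with the release store above. */
        if (atomic_load_explicit(&ready, memory_order_acquire))
                return atomic_load_explicit(&data, memory_order_relaxed);
        return -1;      /* producer has not published yet */
}
```

If both stores in producer() were relaxed, the consumer could in principle observe ready == 1 but a stale data: the ONCE-style accesses alone would not forbid that reordering.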
* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message. 2019-01-11 10:40 ` Dmitry Vyukov 2019-01-11 10:48 ` Dmitry Vyukov @ 2019-01-11 10:51 ` Tetsuo Handa 1 sibling, 0 replies; 31+ messages in thread From: Tetsuo Handa @ 2019-01-11 10:51 UTC (permalink / raw) To: Dmitry Vyukov; +Cc: Jan Kara, Al Viro, linux-fsdevel On 2019/01/11 19:40, Dmitry Vyukov wrote: > On Fri, Jan 11, 2019 at 11:28 AM Tetsuo Handa > <penguin-kernel@i-love.sakura.ne.jp> wrote: >> >> On 2019/01/11 19:19, Dmitry Vyukov wrote: >>>> @@ -200,6 +200,7 @@ void end_buffer_write_sync(struct buffer_head *bh, int uptodate) >>>> struct buffer_head *head; >>>> struct page *page; >>>> int all_mapped = 1; >>>> + static unsigned long last; >>> >>> /\/\/\/\/\/\ >>> >>> We will have to re-fix this when we start testing with KTSAN, data >>> race detector. >> >> What's wrong? This race is harmless. > > How did you arrive to the conclusion that it is harmless? > There is only one relevant standard covering this, which is the C > language standard, and it is very clear on this -- this has Undefined > Behavior, that is the same as, for example, reading/writing random > pointers. > > Check out this on how any race that you might think is benign can be > badly miscompiled and lead to arbitrary program behavior: > https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong > This is a "static" variable. In the Linux kernel's world, "static" variable is automatically initialized to 0. I just want to avoid bloated logs like https://syzkaller.appspot.com/text?tag=CrashLog&x=11f4a5e8c00000 . ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message. 2019-01-11 10:10 [PATCH] fs: ratelimit __find_get_block_slow() failure message Tetsuo Handa 2019-01-11 10:19 ` Dmitry Vyukov @ 2019-01-11 11:03 ` Jan Kara 2019-01-11 11:37 ` [PATCH v2] " Tetsuo Handa 1 sibling, 1 reply; 31+ messages in thread From: Jan Kara @ 2019-01-11 11:03 UTC (permalink / raw) To: Tetsuo Handa; +Cc: Jan Kara, Alexander Viro, Dmitry Vyukov, linux-fsdevel On Fri 11-01-19 19:10:33, Tetsuo Handa wrote: > When something let __find_get_block_slow() hit all_mapped path, it calls > printk() for 100+ times per a second. But there is no need to print same > message with such high frequency; it is just asking for stall warning, or > at least bloating log files. > > [ 399.866302][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8 > [ 399.873324][T15342] b_state=0x00000029, b_size=512 > [ 399.878403][T15342] device loop0 blocksize: 4096 > [ 399.883296][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8 > [ 399.890400][T15342] b_state=0x00000029, b_size=512 > [ 399.895595][T15342] device loop0 blocksize: 4096 > [ 399.900556][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8 > [ 399.907471][T15342] b_state=0x00000029, b_size=512 > [ 399.912506][T15342] device loop0 blocksize: 4096 > > This patch reduces frequency to approximately once per a second, in > addition to concatenating three lines into one. This patch uses plain > time_in_range() than __ratelimit(), for there is no need to allocate > space for debug objects in ratelimit_state.lock field. > > [ 399.866302][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8, b_state=0x00000029, b_size=512, device loop0 blocksize: 4096 > > Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> I agree with collapsing three printks into one. I just don't think that those couple bytes saved do warrant not using printk_ratelimited(). 
Those bytes are not worth people's time looking into the code and trying to understand why you did that special thing (let alone if they start poking into possibility of data races like Dmitry pointed out). Honza > --- > fs/buffer.c | 17 ++++++++--------- > 1 file changed, 8 insertions(+), 9 deletions(-) > > diff --git a/fs/buffer.c b/fs/buffer.c > index 52d024b..3fc9167 100644 > --- a/fs/buffer.c > +++ b/fs/buffer.c > @@ -200,6 +200,7 @@ void end_buffer_write_sync(struct buffer_head *bh, int uptodate) > struct buffer_head *head; > struct page *page; > int all_mapped = 1; > + static unsigned long last; > > index = block >> (PAGE_SHIFT - bd_inode->i_blkbits); > page = find_get_page_flags(bd_mapping, index, FGP_ACCESSED); > @@ -227,15 +228,13 @@ void end_buffer_write_sync(struct buffer_head *bh, int uptodate) > * file io on the block device and getblk. It gets dealt with > * elsewhere, don't buffer_error if we had some unmapped buffers > */ > - if (all_mapped) { > - printk("__find_get_block_slow() failed. " > - "block=%llu, b_blocknr=%llu\n", > - (unsigned long long)block, > - (unsigned long long)bh->b_blocknr); > - printk("b_state=0x%08lx, b_size=%zu\n", > - bh->b_state, bh->b_size); > - printk("device %pg blocksize: %d\n", bdev, > - 1 << bd_inode->i_blkbits); > + if (all_mapped && !time_in_range(jiffies, last, last + HZ)) { > + printk("__find_get_block_slow() failed. block=%llu, b_blocknr=%llu, b_state=0x%08lx, b_size=%zu, device %pg blocksize: %d\n", > + (unsigned long long)block, > + (unsigned long long)bh->b_blocknr, > + bh->b_state, bh->b_size, bdev, > + 1 << bd_inode->i_blkbits); > + last = jiffies; > } > out_unlock: > spin_unlock(&bd_mapping->private_lock); > -- > 1.8.3.1 -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH v2] fs: ratelimit __find_get_block_slow() failure message.
  2019-01-11 11:03 ` Jan Kara
@ 2019-01-11 11:37 ` Tetsuo Handa
  2019-01-21  8:57 ` Jan Kara
  0 siblings, 1 reply; 31+ messages in thread

From: Tetsuo Handa @ 2019-01-11 11:37 UTC (permalink / raw)
To: Jan Kara; +Cc: Alexander Viro, Dmitry Vyukov, linux-fsdevel

When something lets __find_get_block_slow() hit the all_mapped path, it
calls printk() 100+ times per second. But there is no need to print the
same message at such a high frequency; it just invites a stall warning,
or at least bloats log files.

[  399.866302][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8
[  399.873324][T15342] b_state=0x00000029, b_size=512
[  399.878403][T15342] device loop0 blocksize: 4096
[  399.883296][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8
[  399.890400][T15342] b_state=0x00000029, b_size=512
[  399.895595][T15342] device loop0 blocksize: 4096
[  399.900556][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8
[  399.907471][T15342] b_state=0x00000029, b_size=512
[  399.912506][T15342] device loop0 blocksize: 4096

This patch reduces the frequency to at most once per second, in addition
to concatenating the three lines into one.

[  399.866302][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8, b_state=0x00000029, b_size=512, device loop0 blocksize: 4096

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
 fs/buffer.c | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index 52d024b..4be8083 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -200,6 +200,7 @@ void end_buffer_write_sync(struct buffer_head *bh, int uptodate)
 	struct buffer_head *head;
 	struct page *page;
 	int all_mapped = 1;
+	static DEFINE_RATELIMIT_STATE(last_warned, HZ, 1);
 
 	index = block >> (PAGE_SHIFT - bd_inode->i_blkbits);
 	page = find_get_page_flags(bd_mapping, index, FGP_ACCESSED);
@@ -227,16 +228,13 @@ void end_buffer_write_sync(struct buffer_head *bh, int uptodate)
 	 * file io on the block device and getblk. It gets dealt with
 	 * elsewhere, don't buffer_error if we had some unmapped buffers
 	 */
-	if (all_mapped) {
-		printk("__find_get_block_slow() failed. "
-			"block=%llu, b_blocknr=%llu\n",
-			(unsigned long long)block,
-			(unsigned long long)bh->b_blocknr);
-		printk("b_state=0x%08lx, b_size=%zu\n",
-			bh->b_state, bh->b_size);
-		printk("device %pg blocksize: %d\n", bdev,
-			1 << bd_inode->i_blkbits);
-	}
+	ratelimit_set_flags(&last_warned, RATELIMIT_MSG_ON_RELEASE);
+	if (all_mapped && __ratelimit(&last_warned))
+		printk("__find_get_block_slow() failed. block=%llu, b_blocknr=%llu, b_state=0x%08lx, b_size=%zu, device %pg blocksize: %d\n",
+			(unsigned long long)block,
+			(unsigned long long)bh->b_blocknr,
+			bh->b_state, bh->b_size, bdev,
+			1 << bd_inode->i_blkbits);
 out_unlock:
 	spin_unlock(&bd_mapping->private_lock);
 	put_page(page);
--
1.8.3.1
* Re: [PATCH v2] fs: ratelimit __find_get_block_slow() failure message.
  2019-01-11 11:37 ` [PATCH v2] " Tetsuo Handa
@ 2019-01-21  8:57 ` Jan Kara
  0 siblings, 0 replies; 31+ messages in thread

From: Jan Kara @ 2019-01-21 8:57 UTC (permalink / raw)
To: Tetsuo Handa; +Cc: Jan Kara, Alexander Viro, Dmitry Vyukov, linux-fsdevel

On Fri 11-01-19 20:37:42, Tetsuo Handa wrote:
> When something let __find_get_block_slow() hit all_mapped path, it calls
> printk() for 100+ times per a second. But there is no need to print same
> message with such high frequency; it is just asking for stall warning, or
> at least bloating log files.
>
> [  399.866302][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8
> [  399.873324][T15342] b_state=0x00000029, b_size=512
> [  399.878403][T15342] device loop0 blocksize: 4096
> [  399.883296][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8
> [  399.890400][T15342] b_state=0x00000029, b_size=512
> [  399.895595][T15342] device loop0 blocksize: 4096
> [  399.900556][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8
> [  399.907471][T15342] b_state=0x00000029, b_size=512
> [  399.912506][T15342] device loop0 blocksize: 4096
>
> This patch reduces frequency to up to once per a second, in addition to
> concatenating three lines into one.
>
> [  399.866302][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8, b_state=0x00000029, b_size=512, device loop0 blocksize: 4096
>
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> ---
>  fs/buffer.c | 18 ++++++++----------
>  1 file changed, 8 insertions(+), 10 deletions(-)
>
> diff --git a/fs/buffer.c b/fs/buffer.c
> index 52d024b..4be8083 100644
> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -200,6 +200,7 @@ void end_buffer_write_sync(struct buffer_head *bh, int uptodate)
> 	struct buffer_head *head;
> 	struct page *page;
> 	int all_mapped = 1;
> +	static DEFINE_RATELIMIT_STATE(last_warned, HZ, 1);
>
> 	index = block >> (PAGE_SHIFT - bd_inode->i_blkbits);
> 	page = find_get_page_flags(bd_mapping, index, FGP_ACCESSED);
> @@ -227,16 +228,13 @@ void end_buffer_write_sync(struct buffer_head *bh, int uptodate)
> 	 * file io on the block device and getblk. It gets dealt with
> 	 * elsewhere, don't buffer_error if we had some unmapped buffers
> 	 */
> -	if (all_mapped) {
> -		printk("__find_get_block_slow() failed. "
> -			"block=%llu, b_blocknr=%llu\n",
> -			(unsigned long long)block,
> -			(unsigned long long)bh->b_blocknr);
> -		printk("b_state=0x%08lx, b_size=%zu\n",
> -			bh->b_state, bh->b_size);
> -		printk("device %pg blocksize: %d\n", bdev,
> -			1 << bd_inode->i_blkbits);
> -	}
> +	ratelimit_set_flags(&last_warned, RATELIMIT_MSG_ON_RELEASE);
> +	if (all_mapped && __ratelimit(&last_warned))
> +		printk("__find_get_block_slow() failed. block=%llu, b_blocknr=%llu, b_state=0x%08lx, b_size=%zu, device %pg blocksize: %d\n",

I would wrap this like:

	printk("__find_get_block_slow() failed. block=%llu, "
	       "b_blocknr=%llu, b_state=0x%08lx, b_size=%zu, "
	       "device %pg blocksize: %d\n",

as a 140-character line seems really too much. Also I'd embed the 'if'
body in { }. I know it's not necessary but it makes the code more
readable when the printk has to occupy multiple lines.

Otherwise the patch looks good, so feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

WRT merging, you might want to send the fix to Jens Axboe since he's
occasionally merging fs/buffer.c patches.

								Honza

> +		(unsigned long long)block,
> +		(unsigned long long)bh->b_blocknr,
> +		bh->b_state, bh->b_size, bdev,
> +		1 << bd_inode->i_blkbits);
>  out_unlock:
> 	spin_unlock(&bd_mapping->private_lock);
> 	put_page(page);
> --
> 1.8.3.1

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
Thread overview: 31+ messages (end of thread; newest: ~2019-01-22 17:16 UTC)

2019-01-11 10:10 [PATCH] fs: ratelimit __find_get_block_slow() failure message. Tetsuo Handa
2019-01-11 10:19 ` Dmitry Vyukov
     [not found] ` <04c6d87c-fc26-b994-3b34-947414984abe@i-love.sakura.ne.jp>
2019-01-11 10:40 ` Dmitry Vyukov
2019-01-11 10:48 ` Dmitry Vyukov
2019-01-11 12:46 ` Tetsuo Handa
2019-01-16  9:47 ` Dmitry Vyukov
2019-01-16 10:43 ` Jan Kara
2019-01-16 11:03 ` Dmitry Vyukov
2019-01-16 11:48 ` Dmitry Vyukov
2019-01-16 16:28 ` Greg Kroah-Hartman
2019-01-17 13:18 ` Dmitry Vyukov
2019-01-21  8:37 ` Jan Kara
2019-01-21 10:33 ` Tetsuo Handa
2019-01-21 10:48 ` Greg Kroah-Hartman
2019-01-22 15:27 ` Kernel development process (was: [PATCH] fs: ratelimit __find_get_block_slow() failure message.) Dmitry Vyukov
2019-01-22 17:15 ` Jan Kara
2019-01-16 11:56 ` [PATCH] fs: ratelimit __find_get_block_slow() failure message Jan Kara
2019-01-16 12:37 ` Dmitry Vyukov
2019-01-16 12:37 ` Dmitry Vyukov
2019-01-16 14:51 ` Jan Kara
2019-01-16 14:51 ` Jan Kara
2019-01-16 15:33 ` Dmitry Vyukov
2019-01-16 15:33 ` Dmitry Vyukov
2019-01-16 16:15 ` Paul E. McKenney
2019-01-16 16:15 ` Paul E. McKenney
2019-01-17 14:11 ` Tetsuo Handa
2019-01-18 15:30 ` Dmitry Vyukov
2019-01-11 10:51 ` Tetsuo Handa
2019-01-11 11:03 ` Jan Kara
2019-01-11 11:37 ` [PATCH v2] " Tetsuo Handa
2019-01-21  8:57 ` Jan Kara