linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] fs: ratelimit __find_get_block_slow() failure message.
@ 2019-01-11 10:10 Tetsuo Handa
  2019-01-11 10:19 ` Dmitry Vyukov
  2019-01-11 11:03 ` Jan Kara
  0 siblings, 2 replies; 31+ messages in thread
From: Tetsuo Handa @ 2019-01-11 10:10 UTC (permalink / raw)
  To: Jan Kara, Alexander Viro; +Cc: Dmitry Vyukov, linux-fsdevel, Tetsuo Handa

When something let __find_get_block_slow() hit all_mapped path, it calls
printk() for 100+ times per a second. But there is no need to print same
message with such high frequency; it is just asking for stall warning, or
at least bloating log files.

  [  399.866302][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8
  [  399.873324][T15342] b_state=0x00000029, b_size=512
  [  399.878403][T15342] device loop0 blocksize: 4096
  [  399.883296][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8
  [  399.890400][T15342] b_state=0x00000029, b_size=512
  [  399.895595][T15342] device loop0 blocksize: 4096
  [  399.900556][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8
  [  399.907471][T15342] b_state=0x00000029, b_size=512
  [  399.912506][T15342] device loop0 blocksize: 4096

This patch reduces frequency to approximately once per a second, in
addition to concatenating three lines into one. This patch uses plain
time_in_range() than __ratelimit(), for there is no need to allocate
space for debug objects in ratelimit_state.lock field.

  [  399.866302][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8, b_state=0x00000029, b_size=512, device loop0 blocksize: 4096

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
 fs/buffer.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index 52d024b..3fc9167 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -200,6 +200,7 @@ void end_buffer_write_sync(struct buffer_head *bh, int uptodate)
 	struct buffer_head *head;
 	struct page *page;
 	int all_mapped = 1;
+	static unsigned long last;
 
 	index = block >> (PAGE_SHIFT - bd_inode->i_blkbits);
 	page = find_get_page_flags(bd_mapping, index, FGP_ACCESSED);
@@ -227,15 +228,13 @@ void end_buffer_write_sync(struct buffer_head *bh, int uptodate)
 	 * file io on the block device and getblk.  It gets dealt with
 	 * elsewhere, don't buffer_error if we had some unmapped buffers
 	 */
-	if (all_mapped) {
-		printk("__find_get_block_slow() failed. "
-			"block=%llu, b_blocknr=%llu\n",
-			(unsigned long long)block,
-			(unsigned long long)bh->b_blocknr);
-		printk("b_state=0x%08lx, b_size=%zu\n",
-			bh->b_state, bh->b_size);
-		printk("device %pg blocksize: %d\n", bdev,
-			1 << bd_inode->i_blkbits);
+	if (all_mapped && !time_in_range(jiffies, last, last + HZ)) {
+		printk("__find_get_block_slow() failed. block=%llu, b_blocknr=%llu, b_state=0x%08lx, b_size=%zu, device %pg blocksize: %d\n",
+		       (unsigned long long)block,
+		       (unsigned long long)bh->b_blocknr,
+		       bh->b_state, bh->b_size, bdev,
+		       1 << bd_inode->i_blkbits);
+		last = jiffies;
 	}
 out_unlock:
 	spin_unlock(&bd_mapping->private_lock);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message.
  2019-01-11 10:10 [PATCH] fs: ratelimit __find_get_block_slow() failure message Tetsuo Handa
@ 2019-01-11 10:19 ` Dmitry Vyukov
       [not found]   ` <04c6d87c-fc26-b994-3b34-947414984abe@i-love.sakura.ne.jp>
  2019-01-11 11:03 ` Jan Kara
  1 sibling, 1 reply; 31+ messages in thread
From: Dmitry Vyukov @ 2019-01-11 10:19 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: Jan Kara, Alexander Viro, linux-fsdevel

On Fri, Jan 11, 2019 at 11:10 AM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> When something let __find_get_block_slow() hit all_mapped path, it calls
> printk() for 100+ times per a second. But there is no need to print same
> message with such high frequency; it is just asking for stall warning, or
> at least bloating log files.
>
>   [  399.866302][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8
>   [  399.873324][T15342] b_state=0x00000029, b_size=512
>   [  399.878403][T15342] device loop0 blocksize: 4096
>   [  399.883296][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8
>   [  399.890400][T15342] b_state=0x00000029, b_size=512
>   [  399.895595][T15342] device loop0 blocksize: 4096
>   [  399.900556][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8
>   [  399.907471][T15342] b_state=0x00000029, b_size=512
>   [  399.912506][T15342] device loop0 blocksize: 4096
>
> This patch reduces frequency to approximately once per a second, in
> addition to concatenating three lines into one. This patch uses plain
> time_in_range() than __ratelimit(), for there is no need to allocate
> space for debug objects in ratelimit_state.lock field.
>
>   [  399.866302][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8, b_state=0x00000029, b_size=512, device loop0 blocksize: 4096
>
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> ---
>  fs/buffer.c | 17 ++++++++---------
>  1 file changed, 8 insertions(+), 9 deletions(-)
>
> diff --git a/fs/buffer.c b/fs/buffer.c
> index 52d024b..3fc9167 100644
> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -200,6 +200,7 @@ void end_buffer_write_sync(struct buffer_head *bh, int uptodate)
>         struct buffer_head *head;
>         struct page *page;
>         int all_mapped = 1;
> +       static unsigned long last;

/\/\/\/\/\/\

We will have to re-fix this when we start testing with KTSAN, data
race detector.

>         index = block >> (PAGE_SHIFT - bd_inode->i_blkbits);
>         page = find_get_page_flags(bd_mapping, index, FGP_ACCESSED);
> @@ -227,15 +228,13 @@ void end_buffer_write_sync(struct buffer_head *bh, int uptodate)
>          * file io on the block device and getblk.  It gets dealt with
>          * elsewhere, don't buffer_error if we had some unmapped buffers
>          */
> -       if (all_mapped) {
> -               printk("__find_get_block_slow() failed. "
> -                       "block=%llu, b_blocknr=%llu\n",
> -                       (unsigned long long)block,
> -                       (unsigned long long)bh->b_blocknr);
> -               printk("b_state=0x%08lx, b_size=%zu\n",
> -                       bh->b_state, bh->b_size);
> -               printk("device %pg blocksize: %d\n", bdev,
> -                       1 << bd_inode->i_blkbits);
> +       if (all_mapped && !time_in_range(jiffies, last, last + HZ)) {
> +               printk("__find_get_block_slow() failed. block=%llu, b_blocknr=%llu, b_state=0x%08lx, b_size=%zu, device %pg blocksize: %d\n",
> +                      (unsigned long long)block,
> +                      (unsigned long long)bh->b_blocknr,
> +                      bh->b_state, bh->b_size, bdev,
> +                      1 << bd_inode->i_blkbits);
> +               last = jiffies;
>         }
>  out_unlock:
>         spin_unlock(&bd_mapping->private_lock);
> --
> 1.8.3.1

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message.
       [not found]   ` <04c6d87c-fc26-b994-3b34-947414984abe@i-love.sakura.ne.jp>
@ 2019-01-11 10:40     ` Dmitry Vyukov
  2019-01-11 10:48       ` Dmitry Vyukov
  2019-01-11 10:51       ` Tetsuo Handa
  0 siblings, 2 replies; 31+ messages in thread
From: Dmitry Vyukov @ 2019-01-11 10:40 UTC (permalink / raw)
  To: Tetsuo Handa, Jan Kara, Al Viro, linux-fsdevel

On Fri, Jan 11, 2019 at 11:28 AM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> On 2019/01/11 19:19, Dmitry Vyukov wrote:
> >> @@ -200,6 +200,7 @@ void end_buffer_write_sync(struct buffer_head *bh, int uptodate)
> >>         struct buffer_head *head;
> >>         struct page *page;
> >>         int all_mapped = 1;
> >> +       static unsigned long last;
> >
> > /\/\/\/\/\/\
> >
> > We will have to re-fix this when we start testing with KTSAN, data
> > race detector.
>
> What's wrong? This race is harmless.

How did you arrive to the conclusion that it is harmless?
There is only one relevant standard covering this, which is the C
language standard, and it is very clear on this -- this has Undefined
Behavior, that is the same as, for example, reading/writing random
pointers.

Check out this on how any race that you might think is benign can be
badly miscompiled and lead to arbitrary program behavior:
https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message.
  2019-01-11 10:40     ` Dmitry Vyukov
@ 2019-01-11 10:48       ` Dmitry Vyukov
  2019-01-11 12:46         ` Tetsuo Handa
  2019-01-11 10:51       ` Tetsuo Handa
  1 sibling, 1 reply; 31+ messages in thread
From: Dmitry Vyukov @ 2019-01-11 10:48 UTC (permalink / raw)
  To: Tetsuo Handa, Jan Kara, Al Viro, linux-fsdevel

On Fri, Jan 11, 2019 at 11:40 AM Dmitry Vyukov <dvyukov@google.com> wrote:
>
> On Fri, Jan 11, 2019 at 11:28 AM Tetsuo Handa
> <penguin-kernel@i-love.sakura.ne.jp> wrote:
> >
> > On 2019/01/11 19:19, Dmitry Vyukov wrote:
> > >> @@ -200,6 +200,7 @@ void end_buffer_write_sync(struct buffer_head *bh, int uptodate)
> > >>         struct buffer_head *head;
> > >>         struct page *page;
> > >>         int all_mapped = 1;
> > >> +       static unsigned long last;
> > >
> > > /\/\/\/\/\/\
> > >
> > > We will have to re-fix this when we start testing with KTSAN, data
> > > race detector.
> >
> > What's wrong? This race is harmless.
>
> How did you arrive to the conclusion that it is harmless?
> There is only one relevant standard covering this, which is the C
> language standard, and it is very clear on this -- this has Undefined
> Behavior, that is the same as, for example, reading/writing random
> pointers.
>
> Check out this on how any race that you might think is benign can be
> badly miscompiled and lead to arbitrary program behavior:
> https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong

Also there is no other practical definition of data race for automatic
data race detectors than: two conflicting non-atomic concurrent
accesses. Which this code is. Which means that if we continue writing
such code we are not getting data race detection and don't detect
thousands of races in kernel code that one may consider more harmful
than this one the easy way. And instead will spent large amounts of
time to fix some of then the hard way, and leave the rest as just too
hard to debug so let the kernel continue crashing from time to time (I
believe a portion of currently open syzbot bugs that developers just
left as "I don't see how this can happen" are due to such races).

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message.
  2019-01-11 10:40     ` Dmitry Vyukov
  2019-01-11 10:48       ` Dmitry Vyukov
@ 2019-01-11 10:51       ` Tetsuo Handa
  1 sibling, 0 replies; 31+ messages in thread
From: Tetsuo Handa @ 2019-01-11 10:51 UTC (permalink / raw)
  To: Dmitry Vyukov; +Cc: Jan Kara, Al Viro, linux-fsdevel

On 2019/01/11 19:40, Dmitry Vyukov wrote:
> On Fri, Jan 11, 2019 at 11:28 AM Tetsuo Handa
> <penguin-kernel@i-love.sakura.ne.jp> wrote:
>>
>> On 2019/01/11 19:19, Dmitry Vyukov wrote:
>>>> @@ -200,6 +200,7 @@ void end_buffer_write_sync(struct buffer_head *bh, int uptodate)
>>>>         struct buffer_head *head;
>>>>         struct page *page;
>>>>         int all_mapped = 1;
>>>> +       static unsigned long last;
>>>
>>> /\/\/\/\/\/\
>>>
>>> We will have to re-fix this when we start testing with KTSAN, data
>>> race detector.
>>
>> What's wrong? This race is harmless.
> 
> How did you arrive to the conclusion that it is harmless?
> There is only one relevant standard covering this, which is the C
> language standard, and it is very clear on this -- this has Undefined
> Behavior, that is the same as, for example, reading/writing random
> pointers.
> 
> Check out this on how any race that you might think is benign can be
> badly miscompiled and lead to arbitrary program behavior:
> https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong
> 

This is a "static" variable. In the Linux kernel's world,
"static" variable is automatically initialized to 0.

I just want to avoid bloated logs like https://syzkaller.appspot.com/text?tag=CrashLog&x=11f4a5e8c00000 .

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message.
  2019-01-11 10:10 [PATCH] fs: ratelimit __find_get_block_slow() failure message Tetsuo Handa
  2019-01-11 10:19 ` Dmitry Vyukov
@ 2019-01-11 11:03 ` Jan Kara
  2019-01-11 11:37   ` [PATCH v2] " Tetsuo Handa
  1 sibling, 1 reply; 31+ messages in thread
From: Jan Kara @ 2019-01-11 11:03 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: Jan Kara, Alexander Viro, Dmitry Vyukov, linux-fsdevel

On Fri 11-01-19 19:10:33, Tetsuo Handa wrote:
> When something let __find_get_block_slow() hit all_mapped path, it calls
> printk() for 100+ times per a second. But there is no need to print same
> message with such high frequency; it is just asking for stall warning, or
> at least bloating log files.
> 
>   [  399.866302][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8
>   [  399.873324][T15342] b_state=0x00000029, b_size=512
>   [  399.878403][T15342] device loop0 blocksize: 4096
>   [  399.883296][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8
>   [  399.890400][T15342] b_state=0x00000029, b_size=512
>   [  399.895595][T15342] device loop0 blocksize: 4096
>   [  399.900556][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8
>   [  399.907471][T15342] b_state=0x00000029, b_size=512
>   [  399.912506][T15342] device loop0 blocksize: 4096
> 
> This patch reduces frequency to approximately once per a second, in
> addition to concatenating three lines into one. This patch uses plain
> time_in_range() than __ratelimit(), for there is no need to allocate
> space for debug objects in ratelimit_state.lock field.
> 
>   [  399.866302][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8, b_state=0x00000029, b_size=512, device loop0 blocksize: 4096
> 
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>

I agree with collapsing three printks into one. I just don't think that
those couple bytes saved do warrant not using printk_ratelimited(). Those
bytes are not worth people's time looking into the code and trying to
understand why you did that special thing (let alone if they start poking
into possibility of data races like Dmitry pointed out).

								Honza

> ---
>  fs/buffer.c | 17 ++++++++---------
>  1 file changed, 8 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/buffer.c b/fs/buffer.c
> index 52d024b..3fc9167 100644
> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -200,6 +200,7 @@ void end_buffer_write_sync(struct buffer_head *bh, int uptodate)
>  	struct buffer_head *head;
>  	struct page *page;
>  	int all_mapped = 1;
> +	static unsigned long last;
>  
>  	index = block >> (PAGE_SHIFT - bd_inode->i_blkbits);
>  	page = find_get_page_flags(bd_mapping, index, FGP_ACCESSED);
> @@ -227,15 +228,13 @@ void end_buffer_write_sync(struct buffer_head *bh, int uptodate)
>  	 * file io on the block device and getblk.  It gets dealt with
>  	 * elsewhere, don't buffer_error if we had some unmapped buffers
>  	 */
> -	if (all_mapped) {
> -		printk("__find_get_block_slow() failed. "
> -			"block=%llu, b_blocknr=%llu\n",
> -			(unsigned long long)block,
> -			(unsigned long long)bh->b_blocknr);
> -		printk("b_state=0x%08lx, b_size=%zu\n",
> -			bh->b_state, bh->b_size);
> -		printk("device %pg blocksize: %d\n", bdev,
> -			1 << bd_inode->i_blkbits);
> +	if (all_mapped && !time_in_range(jiffies, last, last + HZ)) {
> +		printk("__find_get_block_slow() failed. block=%llu, b_blocknr=%llu, b_state=0x%08lx, b_size=%zu, device %pg blocksize: %d\n",
> +		       (unsigned long long)block,
> +		       (unsigned long long)bh->b_blocknr,
> +		       bh->b_state, bh->b_size, bdev,
> +		       1 << bd_inode->i_blkbits);
> +		last = jiffies;
>  	}
>  out_unlock:
>  	spin_unlock(&bd_mapping->private_lock);
> -- 
> 1.8.3.1
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v2] fs: ratelimit __find_get_block_slow() failure message.
  2019-01-11 11:03 ` Jan Kara
@ 2019-01-11 11:37   ` Tetsuo Handa
  2019-01-21  8:57     ` Jan Kara
  0 siblings, 1 reply; 31+ messages in thread
From: Tetsuo Handa @ 2019-01-11 11:37 UTC (permalink / raw)
  To: Jan Kara; +Cc: Alexander Viro, Dmitry Vyukov, linux-fsdevel

When something let __find_get_block_slow() hit all_mapped path, it calls
printk() for 100+ times per a second. But there is no need to print same
message with such high frequency; it is just asking for stall warning, or
at least bloating log files.

  [  399.866302][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8
  [  399.873324][T15342] b_state=0x00000029, b_size=512
  [  399.878403][T15342] device loop0 blocksize: 4096
  [  399.883296][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8
  [  399.890400][T15342] b_state=0x00000029, b_size=512
  [  399.895595][T15342] device loop0 blocksize: 4096
  [  399.900556][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8
  [  399.907471][T15342] b_state=0x00000029, b_size=512
  [  399.912506][T15342] device loop0 blocksize: 4096

This patch reduces frequency to up to once per a second, in addition to
concatenating three lines into one.

  [  399.866302][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8, b_state=0x00000029, b_size=512, device loop0 blocksize: 4096

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
 fs/buffer.c | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index 52d024b..4be8083 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -200,6 +200,7 @@ void end_buffer_write_sync(struct buffer_head *bh, int uptodate)
 	struct buffer_head *head;
 	struct page *page;
 	int all_mapped = 1;
+	static DEFINE_RATELIMIT_STATE(last_warned, HZ, 1);
 
 	index = block >> (PAGE_SHIFT - bd_inode->i_blkbits);
 	page = find_get_page_flags(bd_mapping, index, FGP_ACCESSED);
@@ -227,16 +228,13 @@ void end_buffer_write_sync(struct buffer_head *bh, int uptodate)
 	 * file io on the block device and getblk.  It gets dealt with
 	 * elsewhere, don't buffer_error if we had some unmapped buffers
 	 */
-	if (all_mapped) {
-		printk("__find_get_block_slow() failed. "
-			"block=%llu, b_blocknr=%llu\n",
-			(unsigned long long)block,
-			(unsigned long long)bh->b_blocknr);
-		printk("b_state=0x%08lx, b_size=%zu\n",
-			bh->b_state, bh->b_size);
-		printk("device %pg blocksize: %d\n", bdev,
-			1 << bd_inode->i_blkbits);
-	}
+	ratelimit_set_flags(&last_warned, RATELIMIT_MSG_ON_RELEASE);
+	if (all_mapped && __ratelimit(&last_warned))
+		printk("__find_get_block_slow() failed. block=%llu, b_blocknr=%llu, b_state=0x%08lx, b_size=%zu, device %pg blocksize: %d\n",
+		       (unsigned long long)block,
+		       (unsigned long long)bh->b_blocknr,
+		       bh->b_state, bh->b_size, bdev,
+		       1 << bd_inode->i_blkbits);
 out_unlock:
 	spin_unlock(&bd_mapping->private_lock);
 	put_page(page);
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message.
  2019-01-11 10:48       ` Dmitry Vyukov
@ 2019-01-11 12:46         ` Tetsuo Handa
  2019-01-16  9:47           ` Dmitry Vyukov
  0 siblings, 1 reply; 31+ messages in thread
From: Tetsuo Handa @ 2019-01-11 12:46 UTC (permalink / raw)
  To: Dmitry Vyukov; +Cc: Jan Kara, Al Viro, linux-fsdevel

On 2019/01/11 19:48, Dmitry Vyukov wrote:
>> How did you arrive to the conclusion that it is harmless?
>> There is only one relevant standard covering this, which is the C
>> language standard, and it is very clear on this -- this has Undefined
>> Behavior, that is the same as, for example, reading/writing random
>> pointers.
>>
>> Check out this on how any race that you might think is benign can be
>> badly miscompiled and lead to arbitrary program behavior:
>> https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong
> 
> Also there is no other practical definition of data race for automatic
> data race detectors than: two conflicting non-atomic concurrent
> accesses. Which this code is. Which means that if we continue writing
> such code we are not getting data race detection and don't detect
> thousands of races in kernel code that one may consider more harmful
> than this one the easy way. And instead will spent large amounts of
> time to fix some of then the hard way, and leave the rest as just too
> hard to debug so let the kernel continue crashing from time to time (I
> believe a portion of currently open syzbot bugs that developers just
> left as "I don't see how this can happen" are due to such races).
> 

I still cannot catch. Read/write of sizeof(long) bytes at naturally
aligned address is atomic, isn't it? I'm not using increments etc.
Therefore, in the worst case, some threads see outdated value. But
outdated values affect only time_in_range() test, which does not cause
severe problems like crash.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message.
  2019-01-11 12:46         ` Tetsuo Handa
@ 2019-01-16  9:47           ` Dmitry Vyukov
  2019-01-16 10:43             ` Jan Kara
  0 siblings, 1 reply; 31+ messages in thread
From: Dmitry Vyukov @ 2019-01-16  9:47 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: Jan Kara, Al Viro, linux-fsdevel

On Fri, Jan 11, 2019 at 1:46 PM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> On 2019/01/11 19:48, Dmitry Vyukov wrote:
> >> How did you arrive to the conclusion that it is harmless?
> >> There is only one relevant standard covering this, which is the C
> >> language standard, and it is very clear on this -- this has Undefined
> >> Behavior, that is the same as, for example, reading/writing random
> >> pointers.
> >>
> >> Check out this on how any race that you might think is benign can be
> >> badly miscompiled and lead to arbitrary program behavior:
> >> https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong
> >
> > Also there is no other practical definition of data race for automatic
> > data race detectors than: two conflicting non-atomic concurrent
> > accesses. Which this code is. Which means that if we continue writing
> > such code we are not getting data race detection and don't detect
> > thousands of races in kernel code that one may consider more harmful
> > than this one the easy way. And instead will spent large amounts of
> > time to fix some of then the hard way, and leave the rest as just too
> > hard to debug so let the kernel continue crashing from time to time (I
> > believe a portion of currently open syzbot bugs that developers just
> > left as "I don't see how this can happen" are due to such races).
> >
>
> I still cannot catch. Read/write of sizeof(long) bytes at naturally
> aligned address is atomic, isn't it?

Nobody guarantees this. According to C non-atomic conflicting
reads/writes of sizeof(long) cause undefined behavior of the whole
program.

> I'm not using increments etc.
> Therefore, in the worst case, some threads see outdated value. But
> outdated values affect only time_in_range() test, which does not cause
> severe problems like crash.
>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message.
  2019-01-16  9:47           ` Dmitry Vyukov
@ 2019-01-16 10:43             ` Jan Kara
  2019-01-16 11:03               ` Dmitry Vyukov
  2019-01-17 14:11               ` Tetsuo Handa
  0 siblings, 2 replies; 31+ messages in thread
From: Jan Kara @ 2019-01-16 10:43 UTC (permalink / raw)
  To: Dmitry Vyukov; +Cc: Tetsuo Handa, Jan Kara, Al Viro, linux-fsdevel

On Wed 16-01-19 10:47:56, Dmitry Vyukov wrote:
> On Fri, Jan 11, 2019 at 1:46 PM Tetsuo Handa
> <penguin-kernel@i-love.sakura.ne.jp> wrote:
> >
> > On 2019/01/11 19:48, Dmitry Vyukov wrote:
> > >> How did you arrive to the conclusion that it is harmless?
> > >> There is only one relevant standard covering this, which is the C
> > >> language standard, and it is very clear on this -- this has Undefined
> > >> Behavior, that is the same as, for example, reading/writing random
> > >> pointers.
> > >>
> > >> Check out this on how any race that you might think is benign can be
> > >> badly miscompiled and lead to arbitrary program behavior:
> > >> https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong
> > >
> > > Also there is no other practical definition of data race for automatic
> > > data race detectors than: two conflicting non-atomic concurrent
> > > accesses. Which this code is. Which means that if we continue writing
> > > such code we are not getting data race detection and don't detect
> > > thousands of races in kernel code that one may consider more harmful
> > > than this one the easy way. And instead will spent large amounts of
> > > time to fix some of then the hard way, and leave the rest as just too
> > > hard to debug so let the kernel continue crashing from time to time (I
> > > believe a portion of currently open syzbot bugs that developers just
> > > left as "I don't see how this can happen" are due to such races).
> > >
> >
> > I still cannot catch. Read/write of sizeof(long) bytes at naturally
> > aligned address is atomic, isn't it?
> 
> Nobody guarantees this. According to C non-atomic conflicting
> reads/writes of sizeof(long) cause undefined behavior of the whole
> program.

Yes, but to be fair the kernel has always relied on long accesses to be
atomic pretty heavily so that it is now de-facto standard for the kernel
AFAICT. I understand this makes life for static checkers hard but such is
reality.

But in this particular case I agree with you that special logic is not
really warranted.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message.
  2019-01-16 10:43             ` Jan Kara
@ 2019-01-16 11:03               ` Dmitry Vyukov
  2019-01-16 11:48                 ` Dmitry Vyukov
  2019-01-16 11:56                 ` [PATCH] fs: ratelimit __find_get_block_slow() failure message Jan Kara
  2019-01-17 14:11               ` Tetsuo Handa
  1 sibling, 2 replies; 31+ messages in thread
From: Dmitry Vyukov @ 2019-01-16 11:03 UTC (permalink / raw)
  To: Jan Kara; +Cc: Tetsuo Handa, Al Viro, linux-fsdevel

On Wed, Jan 16, 2019 at 11:43 AM Jan Kara <jack@suse.cz> wrote:
>
> On Wed 16-01-19 10:47:56, Dmitry Vyukov wrote:
> > On Fri, Jan 11, 2019 at 1:46 PM Tetsuo Handa
> > <penguin-kernel@i-love.sakura.ne.jp> wrote:
> > >
> > > On 2019/01/11 19:48, Dmitry Vyukov wrote:
> > > >> How did you arrive to the conclusion that it is harmless?
> > > >> There is only one relevant standard covering this, which is the C
> > > >> language standard, and it is very clear on this -- this has Undefined
> > > >> Behavior, that is the same as, for example, reading/writing random
> > > >> pointers.
> > > >>
> > > >> Check out this on how any race that you might think is benign can be
> > > >> badly miscompiled and lead to arbitrary program behavior:
> > > >> https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong
> > > >
> > > > Also there is no other practical definition of data race for automatic
> > > > data race detectors than: two conflicting non-atomic concurrent
> > > > accesses. Which this code is. Which means that if we continue writing
> > > > such code we are not getting data race detection and don't detect
> > > > thousands of races in kernel code that one may consider more harmful
> > > > than this one the easy way. And instead will spent large amounts of
> > > > time to fix some of then the hard way, and leave the rest as just too
> > > > hard to debug so let the kernel continue crashing from time to time (I
> > > > believe a portion of currently open syzbot bugs that developers just
> > > > left as "I don't see how this can happen" are due to such races).
> > > >
> > >
> > > I still cannot catch. Read/write of sizeof(long) bytes at naturally
> > > aligned address is atomic, isn't it?
> >
> > Nobody guarantees this. According to C non-atomic conflicting
> > reads/writes of sizeof(long) cause undefined behavior of the whole
> > program.
>
> Yes, but to be fair the kernel has always relied on long accesses to be
> atomic pretty heavily so that it is now de-facto standard for the kernel
> AFAICT. I understand this makes life for static checkers hard but such is
> reality.

Yes, but nobody never defined what "a long access" means. And if you
see a function that accepts a long argument and stores it into a long
field, no, it does not qualify. I bet this will come at surprise to
lots of developers.
Check out this fix and try to extrapolate how this "function stores
long into a long leads to a serious security bug" can actually be
applied to whole lot of places after inlining (or when somebody just
slightly shuffles code in a way that looks totally safe) that also
kinda look safe and atomic:
https://lore.kernel.org/patchwork/patch/599779/
So where is the boundary between "a long access" that is atomic and
the one that is not necessary atomic?

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message.
  2019-01-16 11:03               ` Dmitry Vyukov
@ 2019-01-16 11:48                 ` Dmitry Vyukov
  2019-01-16 16:28                   ` Greg Kroah-Hartman
  2019-01-16 11:56                 ` [PATCH] fs: ratelimit __find_get_block_slow() failure message Jan Kara
  1 sibling, 1 reply; 31+ messages in thread
From: Dmitry Vyukov @ 2019-01-16 11:48 UTC (permalink / raw)
  To: Jan Kara, Linus Torvalds, Greg Kroah-Hartman, Kees Cook
  Cc: Tetsuo Handa, Al Viro, linux-fsdevel, Kostya Serebryany

On Wed, Jan 16, 2019 at 12:03 PM Dmitry Vyukov <dvyukov@google.com> wrote:
>
> On Wed, Jan 16, 2019 at 11:43 AM Jan Kara <jack@suse.cz> wrote:
> >
> > On Wed 16-01-19 10:47:56, Dmitry Vyukov wrote:
> > > On Fri, Jan 11, 2019 at 1:46 PM Tetsuo Handa
> > > <penguin-kernel@i-love.sakura.ne.jp> wrote:
> > > >
> > > > On 2019/01/11 19:48, Dmitry Vyukov wrote:
> > > > >> How did you arrive to the conclusion that it is harmless?
> > > > >> There is only one relevant standard covering this, which is the C
> > > > >> language standard, and it is very clear on this -- this has Undefined
> > > > >> Behavior, that is the same as, for example, reading/writing random
> > > > >> pointers.
> > > > >>
> > > > >> Check out this on how any race that you might think is benign can be
> > > > >> badly miscompiled and lead to arbitrary program behavior:
> > > > >> https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong
> > > > >
> > > > > Also there is no other practical definition of data race for automatic
> > > > > data race detectors than: two conflicting non-atomic concurrent
> > > > > accesses. Which this code is. Which means that if we continue writing
> > > > > such code we are not getting data race detection and don't detect
> > > > > thousands of races in kernel code that one may consider more harmful
> > > > > than this one the easy way. And instead will spent large amounts of
> > > > > time to fix some of then the hard way, and leave the rest as just too
> > > > > hard to debug so let the kernel continue crashing from time to time (I
> > > > > believe a portion of currently open syzbot bugs that developers just
> > > > > left as "I don't see how this can happen" are due to such races).
> > > > >
> > > >
> > > > I still cannot catch. Read/write of sizeof(long) bytes at naturally
> > > > aligned address is atomic, isn't it?
> > >
> > > Nobody guarantees this. According to C non-atomic conflicting
> > > reads/writes of sizeof(long) cause undefined behavior of the whole
> > > program.
> >
> > Yes, but to be fair the kernel has always relied on long accesses to be
> > atomic pretty heavily so that it is now de-facto standard for the kernel
> > AFAICT. I understand this makes life for static checkers hard but such is
> > reality.
>
> Yes, but nobody never defined what "a long access" means. And if you
> see a function that accepts a long argument and stores it into a long
> field, no, it does not qualify. I bet this will come at surprise to
> lots of developers.
> Check out this fix and try to extrapolate how this "function stores
> long into a long leads to a serious security bug" can actually be
> applied to whole lot of places after inlining (or when somebody just
> slightly shuffles code in a way that looks totally safe) that also
> kinda look safe and atomic:
> https://lore.kernel.org/patchwork/patch/599779/
> So where is the boundary between "a long access" that is atomic and
> the one that is not necessary atomic?


+Linus, Greg, Kees

I wanted to provide a hash/link to this commit but, wait, you want to
say that this patch for a security bugs was mailed, recorded by
patchwork, acked by subsystem developer and then dropped on the floor
for 3+ years? Doh!

https://lore.kernel.org/patchwork/patch/599779/

There are known ways how to make this not a thing at all. Like open
pull requests on github:
https://github.com/google/syzkaller/pulls
or, some projects even do own dashboard for this:
https://dev.golang.org/reviews
because this is important. Especially for new contributors, drive-by
improvements, good samaritan fixes, etc.

Another example: a bug-fixing patch was lost for 2 years:
"Two years ago ;) I don't understand why there were ignored"
https://www.spinics.net/lists/linux-mm/msg161351.html

Another example: a patch is applied to a subsystem tree and then lost
for 6 months:
https://patchwork.kernel.org/patch/10339089/

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message.
  2019-01-16 11:03               ` Dmitry Vyukov
  2019-01-16 11:48                 ` Dmitry Vyukov
@ 2019-01-16 11:56                 ` Jan Kara
  2019-01-16 12:37                   ` Dmitry Vyukov
  1 sibling, 1 reply; 31+ messages in thread
From: Jan Kara @ 2019-01-16 11:56 UTC (permalink / raw)
  To: Dmitry Vyukov; +Cc: Jan Kara, Tetsuo Handa, Al Viro, linux-fsdevel

On Wed 16-01-19 12:03:27, Dmitry Vyukov wrote:
> On Wed, Jan 16, 2019 at 11:43 AM Jan Kara <jack@suse.cz> wrote:
> >
> > On Wed 16-01-19 10:47:56, Dmitry Vyukov wrote:
> > > On Fri, Jan 11, 2019 at 1:46 PM Tetsuo Handa
> > > <penguin-kernel@i-love.sakura.ne.jp> wrote:
> > > >
> > > > On 2019/01/11 19:48, Dmitry Vyukov wrote:
> > > > >> How did you arrive to the conclusion that it is harmless?
> > > > >> There is only one relevant standard covering this, which is the C
> > > > >> language standard, and it is very clear on this -- this has Undefined
> > > > >> Behavior, that is the same as, for example, reading/writing random
> > > > >> pointers.
> > > > >>
> > > > >> Check out this on how any race that you might think is benign can be
> > > > >> badly miscompiled and lead to arbitrary program behavior:
> > > > >> https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong
> > > > >
> > > > > Also there is no other practical definition of data race for automatic
> > > > > data race detectors than: two conflicting non-atomic concurrent
> > > > > accesses. Which this code is. Which means that if we continue writing
> > > > > such code we are not getting data race detection and don't detect
> > > > > thousands of races in kernel code that one may consider more harmful
> > > > > than this one the easy way. And instead will spent large amounts of
> > > > > time to fix some of then the hard way, and leave the rest as just too
> > > > > hard to debug so let the kernel continue crashing from time to time (I
> > > > > believe a portion of currently open syzbot bugs that developers just
> > > > > left as "I don't see how this can happen" are due to such races).
> > > > >
> > > >
> > > > I still cannot catch. Read/write of sizeof(long) bytes at naturally
> > > > aligned address is atomic, isn't it?
> > >
> > > Nobody guarantees this. According to C non-atomic conflicting
> > > reads/writes of sizeof(long) cause undefined behavior of the whole
> > > program.
> >
> > Yes, but to be fair the kernel has always relied on long accesses to be
> > atomic pretty heavily so that it is now de-facto standard for the kernel
> > AFAICT. I understand this makes life for static checkers hard but such is
> > reality.
> 
> Yes, but nobody never defined what "a long access" means. And if you
> see a function that accepts a long argument and stores it into a long
> field, no, it does not qualify. I bet this will come at surprise to
> lots of developers.

Yes, inlining and other optimizations can screw you.

> Check out this fix and try to extrapolate how this "function stores
> long into a long leads to a serious security bug" can actually be
> applied to whole lot of places after inlining (or when somebody just
> slightly shuffles code in a way that looks totally safe) that also
> kinda look safe and atomic:
> https://lore.kernel.org/patchwork/patch/599779/
> So where is the boundary between "a long access" that is atomic and
> the one that is not necessary atomic?

So I tend to rely on "long access being atomic" for opaque values (no
flags, no counters, ...). Just value that gets fetched from some global
variable / other data structure, stored, read, and possibly compared for
equality. I agree the compiler could still screw you if it could infer how
that value got initially created and try to be clever about it...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message.
  2019-01-16 11:56                 ` [PATCH] fs: ratelimit __find_get_block_slow() failure message Jan Kara
@ 2019-01-16 12:37                   ` Dmitry Vyukov
  2019-01-16 12:37                     ` Dmitry Vyukov
  2019-01-16 14:51                     ` Jan Kara
  0 siblings, 2 replies; 31+ messages in thread
From: Dmitry Vyukov @ 2019-01-16 12:37 UTC (permalink / raw)
  To: Jan Kara
  Cc: Tetsuo Handa, Al Viro, linux-fsdevel, LKML, Paul E. McKenney,
	Alan Stern, Andrea Parri

On Wed, Jan 16, 2019 at 12:56 PM Jan Kara <jack@suse.cz> wrote:
>
> On Wed 16-01-19 12:03:27, Dmitry Vyukov wrote:
> > On Wed, Jan 16, 2019 at 11:43 AM Jan Kara <jack@suse.cz> wrote:
> > >
> > > On Wed 16-01-19 10:47:56, Dmitry Vyukov wrote:
> > > > On Fri, Jan 11, 2019 at 1:46 PM Tetsuo Handa
> > > > <penguin-kernel@i-love.sakura.ne.jp> wrote:
> > > > >
> > > > > On 2019/01/11 19:48, Dmitry Vyukov wrote:
> > > > > >> How did you arrive to the conclusion that it is harmless?
> > > > > >> There is only one relevant standard covering this, which is the C
> > > > > >> language standard, and it is very clear on this -- this has Undefined
> > > > > >> Behavior, that is the same as, for example, reading/writing random
> > > > > >> pointers.
> > > > > >>
> > > > > >> Check out this on how any race that you might think is benign can be
> > > > > >> badly miscompiled and lead to arbitrary program behavior:
> > > > > >> https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong
> > > > > >
> > > > > > Also there is no other practical definition of data race for automatic
> > > > > > data race detectors than: two conflicting non-atomic concurrent
> > > > > > accesses. Which this code is. Which means that if we continue writing
> > > > > > such code we are not getting data race detection and don't detect
> > > > > > thousands of races in kernel code that one may consider more harmful
> > > > > > than this one the easy way. And instead will spent large amounts of
> > > > > > time to fix some of then the hard way, and leave the rest as just too
> > > > > > hard to debug so let the kernel continue crashing from time to time (I
> > > > > > believe a portion of currently open syzbot bugs that developers just
> > > > > > left as "I don't see how this can happen" are due to such races).
> > > > > >
> > > > >
> > > > > I still cannot catch. Read/write of sizeof(long) bytes at naturally
> > > > > aligned address is atomic, isn't it?
> > > >
> > > > Nobody guarantees this. According to C non-atomic conflicting
> > > > reads/writes of sizeof(long) cause undefined behavior of the whole
> > > > program.
> > >
> > > Yes, but to be fair the kernel has always relied on long accesses to be
> > > atomic pretty heavily so that it is now de-facto standard for the kernel
> > > AFAICT. I understand this makes life for static checkers hard but such is
> > > reality.
> >
> > Yes, but nobody never defined what "a long access" means. And if you
> > see a function that accepts a long argument and stores it into a long
> > field, no, it does not qualify. I bet this will come at surprise to
> > lots of developers.
>
> Yes, inlining and other optimizations can screw you.
>
> > Check out this fix and try to extrapolate how this "function stores
> > long into a long leads to a serious security bug" can actually be
> > applied to whole lot of places after inlining (or when somebody just
> > slightly shuffles code in a way that looks totally safe) that also
> > kinda look safe and atomic:
> > https://lore.kernel.org/patchwork/patch/599779/
> > So where is the boundary between "a long access" that is atomic and
> > the one that is not necessary atomic?
>
> So I tend to rely on "long access being atomic" for opaque values (no
> flags, no counters, ...). Just value that gets fetched from some global
> variable / other data structure, stored, read, and possibly compared for
> equality. I agree the compiler could still screw you if it could infer how
> that value got initially created and try to be clever about it...

So can you, or somebody else, define a set of rules that we can use to
discriminate each particular case? How can we avoid that "the compiler
could still screw you"?

Inlining is always enabled, right, so one needs to take into account
everything that's possibly can be inlined. Now or in future. And also
link-time-code generation, if we don't use it we are dropping 10% of
performance on the floor.
Also, ensuring that the code works when it's first submitted is the
smaller part of the problem. It's ensuring that it continues to work
in future what's more important. Try to imagine what amount of burden
this puts onto all developers who touch any kernel code in future.
Basically if you slightly alter local logic in a function that does
not do any loads/stores, you can screw multiple "proofs" that long
accesses are atomic. Or, you just move a function from .c file to .h.
I can bet nobody re-proofs all "long accesses are atomic" around the
changed code during code reviews, so these things break over time.
Or, even if only comparisons are involved (that you mentioned as
"safe") I see how that can actually affect compilation process. Say,
we are in the branch where 2 variables compare equal, now since no
concurrency is involved from compiler point of view, it can, say,
discard one variable and then re-load it from the other variable's
location, and say not the other variable has value that the other one
must never have. I don't have a full scenario, but that's exactly the
point. One will never see all possibilities.

It all becomes super slippery slope very quickly. And we do want
compiler to generate as fast code as possible and do all these
optimizations. And it's not that there are big objective reasons to
not just mark all concurrent accesses and stop spending large amounts
of time on these "proofs".

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message.
  2019-01-16 12:37                   ` Dmitry Vyukov
@ 2019-01-16 12:37                     ` Dmitry Vyukov
  2019-01-16 14:51                     ` Jan Kara
  1 sibling, 0 replies; 31+ messages in thread
From: Dmitry Vyukov @ 2019-01-16 12:37 UTC (permalink / raw)
  To: Jan Kara
  Cc: Tetsuo Handa, Al Viro, linux-fsdevel, LKML, Paul E. McKenney,
	Alan Stern, Andrea Parri

On Wed, Jan 16, 2019 at 12:56 PM Jan Kara <jack@suse.cz> wrote:
>
> On Wed 16-01-19 12:03:27, Dmitry Vyukov wrote:
> > On Wed, Jan 16, 2019 at 11:43 AM Jan Kara <jack@suse.cz> wrote:
> > >
> > > On Wed 16-01-19 10:47:56, Dmitry Vyukov wrote:
> > > > On Fri, Jan 11, 2019 at 1:46 PM Tetsuo Handa
> > > > <penguin-kernel@i-love.sakura.ne.jp> wrote:
> > > > >
> > > > > On 2019/01/11 19:48, Dmitry Vyukov wrote:
> > > > > >> How did you arrive to the conclusion that it is harmless?
> > > > > >> There is only one relevant standard covering this, which is the C
> > > > > >> language standard, and it is very clear on this -- this has Undefined
> > > > > >> Behavior, that is the same as, for example, reading/writing random
> > > > > >> pointers.
> > > > > >>
> > > > > >> Check out this on how any race that you might think is benign can be
> > > > > >> badly miscompiled and lead to arbitrary program behavior:
> > > > > >> https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong
> > > > > >
> > > > > > Also there is no other practical definition of data race for automatic
> > > > > > data race detectors than: two conflicting non-atomic concurrent
> > > > > > accesses. Which this code is. Which means that if we continue writing
> > > > > > such code we are not getting data race detection and don't detect
> > > > > > thousands of races in kernel code that one may consider more harmful
> > > > > > than this one the easy way. And instead will spent large amounts of
> > > > > > time to fix some of then the hard way, and leave the rest as just too
> > > > > > hard to debug so let the kernel continue crashing from time to time (I
> > > > > > believe a portion of currently open syzbot bugs that developers just
> > > > > > left as "I don't see how this can happen" are due to such races).
> > > > > >
> > > > >
> > > > > I still cannot catch. Read/write of sizeof(long) bytes at naturally
> > > > > aligned address is atomic, isn't it?
> > > >
> > > > Nobody guarantees this. According to C non-atomic conflicting
> > > > reads/writes of sizeof(long) cause undefined behavior of the whole
> > > > program.
> > >
> > > Yes, but to be fair the kernel has always relied on long accesses to be
> > > atomic pretty heavily so that it is now de-facto standard for the kernel
> > > AFAICT. I understand this makes life for static checkers hard but such is
> > > reality.
> >
> > Yes, but nobody never defined what "a long access" means. And if you
> > see a function that accepts a long argument and stores it into a long
> > field, no, it does not qualify. I bet this will come at surprise to
> > lots of developers.
>
> Yes, inlining and other optimizations can screw you.
>
> > Check out this fix and try to extrapolate how this "function stores
> > long into a long leads to a serious security bug" can actually be
> > applied to whole lot of places after inlining (or when somebody just
> > slightly shuffles code in a way that looks totally safe) that also
> > kinda look safe and atomic:
> > https://lore.kernel.org/patchwork/patch/599779/
> > So where is the boundary between "a long access" that is atomic and
> > the one that is not necessary atomic?
>
> So I tend to rely on "long access being atomic" for opaque values (no
> flags, no counters, ...). Just value that gets fetched from some global
> variable / other data structure, stored, read, and possibly compared for
> equality. I agree the compiler could still screw you if it could infer how
> that value got initially created and try to be clever about it...

So can you, or somebody else, define a set of rules that we can use to
discriminate each particular case? How can we avoid that "the compiler
could still screw you"?

Inlining is always enabled, right, so one needs to take into account
everything that's possibly can be inlined. Now or in future. And also
link-time-code generation, if we don't use it we are dropping 10% of
performance on the floor.
Also, ensuring that the code works when it's first submitted is the
smaller part of the problem. It's ensuring that it continues to work
in future what's more important. Try to imagine what amount of burden
this puts onto all developers who touch any kernel code in future.
Basically if you slightly alter local logic in a function that does
not do any loads/stores, you can screw multiple "proofs" that long
accesses are atomic. Or, you just move a function from .c file to .h.
I can bet nobody re-proofs all "long accesses are atomic" around the
changed code during code reviews, so these things break over time.
Or, even if only comparisons are involved (that you mentioned as
"safe") I see how that can actually affect compilation process. Say,
we are in the branch where 2 variables compare equal, now since no
concurrency is involved from compiler point of view, it can, say,
discard one variable and then re-load it from the other variable's
location, and say not the other variable has value that the other one
must never have. I don't have a full scenario, but that's exactly the
point. One will never see all possibilities.

It all becomes super slippery slope very quickly. And we do want
compiler to generate as fast code as possible and do all these
optimizations. And it's not that there are big objective reasons to
not just mark all concurrent accesses and stop spending large amounts
of time on these "proofs".

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message.
  2019-01-16 12:37                   ` Dmitry Vyukov
  2019-01-16 12:37                     ` Dmitry Vyukov
@ 2019-01-16 14:51                     ` Jan Kara
  2019-01-16 14:51                       ` Jan Kara
  2019-01-16 15:33                       ` Dmitry Vyukov
  1 sibling, 2 replies; 31+ messages in thread
From: Jan Kara @ 2019-01-16 14:51 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Jan Kara, Tetsuo Handa, Al Viro, linux-fsdevel, LKML,
	Paul E. McKenney, Alan Stern, Andrea Parri

On Wed 16-01-19 13:37:22, Dmitry Vyukov wrote:
> On Wed, Jan 16, 2019 at 12:56 PM Jan Kara <jack@suse.cz> wrote:
> >
> > On Wed 16-01-19 12:03:27, Dmitry Vyukov wrote:
> > > On Wed, Jan 16, 2019 at 11:43 AM Jan Kara <jack@suse.cz> wrote:
> > > >
> > > > On Wed 16-01-19 10:47:56, Dmitry Vyukov wrote:
> > > > > On Fri, Jan 11, 2019 at 1:46 PM Tetsuo Handa
> > > > > <penguin-kernel@i-love.sakura.ne.jp> wrote:
> > > > > >
> > > > > > On 2019/01/11 19:48, Dmitry Vyukov wrote:
> > > > > > >> How did you arrive to the conclusion that it is harmless?
> > > > > > >> There is only one relevant standard covering this, which is the C
> > > > > > >> language standard, and it is very clear on this -- this has Undefined
> > > > > > >> Behavior, that is the same as, for example, reading/writing random
> > > > > > >> pointers.
> > > > > > >>
> > > > > > >> Check out this on how any race that you might think is benign can be
> > > > > > >> badly miscompiled and lead to arbitrary program behavior:
> > > > > > >> https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong
> > > > > > >
> > > > > > > Also there is no other practical definition of data race for automatic
> > > > > > > data race detectors than: two conflicting non-atomic concurrent
> > > > > > > accesses. Which this code is. Which means that if we continue writing
> > > > > > > such code we are not getting data race detection and don't detect
> > > > > > > thousands of races in kernel code that one may consider more harmful
> > > > > > > than this one the easy way. And instead will spent large amounts of
> > > > > > > time to fix some of then the hard way, and leave the rest as just too
> > > > > > > hard to debug so let the kernel continue crashing from time to time (I
> > > > > > > believe a portion of currently open syzbot bugs that developers just
> > > > > > > left as "I don't see how this can happen" are due to such races).
> > > > > > >
> > > > > >
> > > > > > I still cannot catch. Read/write of sizeof(long) bytes at naturally
> > > > > > aligned address is atomic, isn't it?
> > > > >
> > > > > Nobody guarantees this. According to C non-atomic conflicting
> > > > > reads/writes of sizeof(long) cause undefined behavior of the whole
> > > > > program.
> > > >
> > > > Yes, but to be fair the kernel has always relied on long accesses to be
> > > > atomic pretty heavily so that it is now de-facto standard for the kernel
> > > > AFAICT. I understand this makes life for static checkers hard but such is
> > > > reality.
> > >
> > > Yes, but nobody never defined what "a long access" means. And if you
> > > see a function that accepts a long argument and stores it into a long
> > > field, no, it does not qualify. I bet this will come at surprise to
> > > lots of developers.
> >
> > Yes, inlining and other optimizations can screw you.
> >
> > > Check out this fix and try to extrapolate how this "function stores
> > > long into a long leads to a serious security bug" can actually be
> > > applied to whole lot of places after inlining (or when somebody just
> > > slightly shuffles code in a way that looks totally safe) that also
> > > kinda look safe and atomic:
> > > https://lore.kernel.org/patchwork/patch/599779/
> > > So where is the boundary between "a long access" that is atomic and
> > > the one that is not necessary atomic?
> >
> > So I tend to rely on "long access being atomic" for opaque values (no
> > flags, no counters, ...). Just value that gets fetched from some global
> > variable / other data structure, stored, read, and possibly compared for
> > equality. I agree the compiler could still screw you if it could infer how
> > that value got initially created and try to be clever about it...
> 
> So can you, or somebody else, define a set of rules that we can use to
> discriminate each particular case? How can we avoid that "the compiler
> could still screw you"?
> 
> Inlining is always enabled, right, so one needs to take into account
> everything that's possibly can be inlined. Now or in future. And also
> link-time-code generation, if we don't use it we are dropping 10% of
> performance on the floor.
> Also, ensuring that the code works when it's first submitted is the
> smaller part of the problem. It's ensuring that it continues to work
> in future what's more important. Try to imagine what amount of burden
> this puts onto all developers who touch any kernel code in future.
> Basically if you slightly alter local logic in a function that does
> not do any loads/stores, you can screw multiple "proofs" that long
> accesses are atomic. Or, you just move a function from .c file to .h.
> I can bet nobody re-proofs all "long accesses are atomic" around the
> changed code during code reviews, so these things break over time.
> Or, even if only comparisons are involved (that you mentioned as
> "safe") I see how that can actually affect compilation process. Say,
> we are in the branch where 2 variables compare equal, now since no
> concurrency is involved from compiler point of view, it can, say,
> discard one variable and then re-load it from the other variable's
> location, and say not the other variable has value that the other one
> must never have. I don't have a full scenario, but that's exactly the
> point. One will never see all possibilities.
> 
> It all becomes super slippery slope very quickly. And we do want
> compiler to generate as fast code as possible and do all these
> optimizations. And it's not that there are big objective reasons to
> not just mark all concurrent accesses and stop spending large amounts
> of time on these "proofs".

I guess you've convinced me that somehow marking such accesses is
desirable. So is using atomic_long_t and atomic_long_set()
/ atomic_long_read() for manipulation instead what you suggest?

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message.
  2019-01-16 14:51                     ` Jan Kara
@ 2019-01-16 14:51                       ` Jan Kara
  2019-01-16 15:33                       ` Dmitry Vyukov
  1 sibling, 0 replies; 31+ messages in thread
From: Jan Kara @ 2019-01-16 14:51 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Jan Kara, Tetsuo Handa, Al Viro, linux-fsdevel, LKML,
	Paul E. McKenney, Alan Stern, Andrea Parri

On Wed 16-01-19 13:37:22, Dmitry Vyukov wrote:
> On Wed, Jan 16, 2019 at 12:56 PM Jan Kara <jack@suse.cz> wrote:
> >
> > On Wed 16-01-19 12:03:27, Dmitry Vyukov wrote:
> > > On Wed, Jan 16, 2019 at 11:43 AM Jan Kara <jack@suse.cz> wrote:
> > > >
> > > > On Wed 16-01-19 10:47:56, Dmitry Vyukov wrote:
> > > > > On Fri, Jan 11, 2019 at 1:46 PM Tetsuo Handa
> > > > > <penguin-kernel@i-love.sakura.ne.jp> wrote:
> > > > > >
> > > > > > On 2019/01/11 19:48, Dmitry Vyukov wrote:
> > > > > > >> How did you arrive to the conclusion that it is harmless?
> > > > > > >> There is only one relevant standard covering this, which is the C
> > > > > > >> language standard, and it is very clear on this -- this has Undefined
> > > > > > >> Behavior, that is the same as, for example, reading/writing random
> > > > > > >> pointers.
> > > > > > >>
> > > > > > >> Check out this on how any race that you might think is benign can be
> > > > > > >> badly miscompiled and lead to arbitrary program behavior:
> > > > > > >> https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong
> > > > > > >
> > > > > > > Also there is no other practical definition of data race for automatic
> > > > > > > data race detectors than: two conflicting non-atomic concurrent
> > > > > > > accesses. Which this code is. Which means that if we continue writing
> > > > > > > such code we are not getting data race detection and don't detect
> > > > > > > thousands of races in kernel code that one may consider more harmful
> > > > > > > than this one the easy way. And instead will spent large amounts of
> > > > > > > time to fix some of then the hard way, and leave the rest as just too
> > > > > > > hard to debug so let the kernel continue crashing from time to time (I
> > > > > > > believe a portion of currently open syzbot bugs that developers just
> > > > > > > left as "I don't see how this can happen" are due to such races).
> > > > > > >
> > > > > >
> > > > > > I still cannot catch. Read/write of sizeof(long) bytes at naturally
> > > > > > aligned address is atomic, isn't it?
> > > > >
> > > > > Nobody guarantees this. According to C non-atomic conflicting
> > > > > reads/writes of sizeof(long) cause undefined behavior of the whole
> > > > > program.
> > > >
> > > > Yes, but to be fair the kernel has always relied on long accesses to be
> > > > atomic pretty heavily so that it is now de-facto standard for the kernel
> > > > AFAICT. I understand this makes life for static checkers hard but such is
> > > > reality.
> > >
> > > Yes, but nobody never defined what "a long access" means. And if you
> > > see a function that accepts a long argument and stores it into a long
> > > field, no, it does not qualify. I bet this will come at surprise to
> > > lots of developers.
> >
> > Yes, inlining and other optimizations can screw you.
> >
> > > Check out this fix and try to extrapolate how this "function stores
> > > long into a long leads to a serious security bug" can actually be
> > > applied to whole lot of places after inlining (or when somebody just
> > > slightly shuffles code in a way that looks totally safe) that also
> > > kinda look safe and atomic:
> > > https://lore.kernel.org/patchwork/patch/599779/
> > > So where is the boundary between "a long access" that is atomic and
> > > the one that is not necessary atomic?
> >
> > So I tend to rely on "long access being atomic" for opaque values (no
> > flags, no counters, ...). Just value that gets fetched from some global
> > variable / other data structure, stored, read, and possibly compared for
> > equality. I agree the compiler could still screw you if it could infer how
> > that value got initially created and try to be clever about it...
> 
> So can you, or somebody else, define a set of rules that we can use to
> discriminate each particular case? How can we avoid that "the compiler
> could still screw you"?
> 
> Inlining is always enabled, right, so one needs to take into account
> everything that's possibly can be inlined. Now or in future. And also
> link-time-code generation, if we don't use it we are dropping 10% of
> performance on the floor.
> Also, ensuring that the code works when it's first submitted is the
> smaller part of the problem. It's ensuring that it continues to work
> in future what's more important. Try to imagine what amount of burden
> this puts onto all developers who touch any kernel code in future.
> Basically if you slightly alter local logic in a function that does
> not do any loads/stores, you can screw multiple "proofs" that long
> accesses are atomic. Or, you just move a function from .c file to .h.
> I can bet nobody re-proofs all "long accesses are atomic" around the
> changed code during code reviews, so these things break over time.
> Or, even if only comparisons are involved (that you mentioned as
> "safe") I see how that can actually affect compilation process. Say,
> we are in the branch where 2 variables compare equal, now since no
> concurrency is involved from compiler point of view, it can, say,
> discard one variable and then re-load it from the other variable's
> location, and say not the other variable has value that the other one
> must never have. I don't have a full scenario, but that's exactly the
> point. One will never see all possibilities.
> 
> It all becomes super slippery slope very quickly. And we do want
> compiler to generate as fast code as possible and do all these
> optimizations. And it's not that there are big objective reasons to
> not just mark all concurrent accesses and stop spending large amounts
> of time on these "proofs".

I guess you've convinced me that somehow marking such accesses is
desirable. So is using atomic_long_t and atomic_long_set()
/ atomic_long_read() for manipulation instead what you suggest?

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message.
  2019-01-16 14:51                     ` Jan Kara
  2019-01-16 14:51                       ` Jan Kara
@ 2019-01-16 15:33                       ` Dmitry Vyukov
  2019-01-16 15:33                         ` Dmitry Vyukov
  2019-01-16 16:15                         ` Paul E. McKenney
  1 sibling, 2 replies; 31+ messages in thread
From: Dmitry Vyukov @ 2019-01-16 15:33 UTC (permalink / raw)
  To: Jan Kara
  Cc: Tetsuo Handa, Al Viro, linux-fsdevel, LKML, Paul E. McKenney,
	Alan Stern, Andrea Parri

On Wed, Jan 16, 2019 at 3:51 PM Jan Kara <jack@suse.cz> wrote:
>
> On Wed 16-01-19 13:37:22, Dmitry Vyukov wrote:
> > On Wed, Jan 16, 2019 at 12:56 PM Jan Kara <jack@suse.cz> wrote:
> > >
> > > On Wed 16-01-19 12:03:27, Dmitry Vyukov wrote:
> > > > On Wed, Jan 16, 2019 at 11:43 AM Jan Kara <jack@suse.cz> wrote:
> > > > >
> > > > > On Wed 16-01-19 10:47:56, Dmitry Vyukov wrote:
> > > > > > On Fri, Jan 11, 2019 at 1:46 PM Tetsuo Handa
> > > > > > <penguin-kernel@i-love.sakura.ne.jp> wrote:
> > > > > > >
> > > > > > > On 2019/01/11 19:48, Dmitry Vyukov wrote:
> > > > > > > >> How did you arrive to the conclusion that it is harmless?
> > > > > > > >> There is only one relevant standard covering this, which is the C
> > > > > > > >> language standard, and it is very clear on this -- this has Undefined
> > > > > > > >> Behavior, that is the same as, for example, reading/writing random
> > > > > > > >> pointers.
> > > > > > > >>
> > > > > > > >> Check out this on how any race that you might think is benign can be
> > > > > > > >> badly miscompiled and lead to arbitrary program behavior:
> > > > > > > >> https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong
> > > > > > > >
> > > > > > > > Also there is no other practical definition of data race for automatic
> > > > > > > > data race detectors than: two conflicting non-atomic concurrent
> > > > > > > > accesses. Which this code is. Which means that if we continue writing
> > > > > > > > such code we are not getting data race detection and don't detect
> > > > > > > > thousands of races in kernel code that one may consider more harmful
> > > > > > > > than this one the easy way. And instead will spent large amounts of
> > > > > > > > time to fix some of then the hard way, and leave the rest as just too
> > > > > > > > hard to debug so let the kernel continue crashing from time to time (I
> > > > > > > > believe a portion of currently open syzbot bugs that developers just
> > > > > > > > left as "I don't see how this can happen" are due to such races).
> > > > > > > >
> > > > > > >
> > > > > > > I still cannot catch. Read/write of sizeof(long) bytes at naturally
> > > > > > > aligned address is atomic, isn't it?
> > > > > >
> > > > > > Nobody guarantees this. According to C non-atomic conflicting
> > > > > > reads/writes of sizeof(long) cause undefined behavior of the whole
> > > > > > program.
> > > > >
> > > > > Yes, but to be fair the kernel has always relied on long accesses to be
> > > > > atomic pretty heavily so that it is now de-facto standard for the kernel
> > > > > AFAICT. I understand this makes life for static checkers hard but such is
> > > > > reality.
> > > >
> > > > Yes, but nobody never defined what "a long access" means. And if you
> > > > see a function that accepts a long argument and stores it into a long
> > > > field, no, it does not qualify. I bet this will come at surprise to
> > > > lots of developers.
> > >
> > > Yes, inlining and other optimizations can screw you.
> > >
> > > > Check out this fix and try to extrapolate how this "function stores
> > > > long into a long leads to a serious security bug" can actually be
> > > > applied to whole lot of places after inlining (or when somebody just
> > > > slightly shuffles code in a way that looks totally safe) that also
> > > > kinda look safe and atomic:
> > > > https://lore.kernel.org/patchwork/patch/599779/
> > > > So where is the boundary between "a long access" that is atomic and
> > > > the one that is not necessary atomic?
> > >
> > > So I tend to rely on "long access being atomic" for opaque values (no
> > > flags, no counters, ...). Just value that gets fetched from some global
> > > variable / other data structure, stored, read, and possibly compared for
> > > equality. I agree the compiler could still screw you if it could infer how
> > > that value got initially created and try to be clever about it...
> >
> > So can you, or somebody else, define a set of rules that we can use to
> > discriminate each particular case? How can we avoid that "the compiler
> > could still screw you"?
> >
> > Inlining is always enabled, right, so one needs to take into account
> > everything that's possibly can be inlined. Now or in future. And also
> > link-time-code generation, if we don't use it we are dropping 10% of
> > performance on the floor.
> > Also, ensuring that the code works when it's first submitted is the
> > smaller part of the problem. It's ensuring that it continues to work
> > in future what's more important. Try to imagine what amount of burden
> > this puts onto all developers who touch any kernel code in future.
> > Basically if you slightly alter local logic in a function that does
> > not do any loads/stores, you can screw multiple "proofs" that long
> > accesses are atomic. Or, you just move a function from .c file to .h.
> > I can bet nobody re-proofs all "long accesses are atomic" around the
> > changed code during code reviews, so these things break over time.
> > Or, even if only comparisons are involved (that you mentioned as
> > "safe") I see how that can actually affect compilation process. Say,
> > we are in the branch where 2 variables compare equal, now since no
> > concurrency is involved from compiler point of view, it can, say,
> > discard one variable and then re-load it from the other variable's
> > location, and say not the other variable has value that the other one
> > must never have. I don't have a full scenario, but that's exactly the
> > point. One will never see all possibilities.
> >
> > It all becomes super slippery slope very quickly. And we do want
> > compiler to generate as fast code as possible and do all these
> > optimizations. And it's not that there are big objective reasons to
> > not just mark all concurrent accesses and stop spending large amounts
> > of time on these "proofs".
>
> I guess you've convinced me that somehow marking such accesses is
> desirable. So is using atomic_long_t and atomic_long_set()
> / atomic_long_read() for manipulation instead what you suggest?

+Paul, Alan, Andrea, who may better know the long-term direction.
I think the closer analog of "plain long read/write that is intended
to be atomic" is using READ_ONCE/WRITE_ONCE with a normal non-atomic
type. It should achieve the goal of just read/write it once without
any tricks, but also makes them visible/searchable for humans and
tools and potential future refactoring to something else.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message.
  2019-01-16 15:33                       ` Dmitry Vyukov
@ 2019-01-16 15:33                         ` Dmitry Vyukov
  2019-01-16 16:15                         ` Paul E. McKenney
  1 sibling, 0 replies; 31+ messages in thread
From: Dmitry Vyukov @ 2019-01-16 15:33 UTC (permalink / raw)
  To: Jan Kara
  Cc: Tetsuo Handa, Al Viro, linux-fsdevel, LKML, Paul E. McKenney,
	Alan Stern, Andrea Parri

On Wed, Jan 16, 2019 at 3:51 PM Jan Kara <jack@suse.cz> wrote:
>
> On Wed 16-01-19 13:37:22, Dmitry Vyukov wrote:
> > On Wed, Jan 16, 2019 at 12:56 PM Jan Kara <jack@suse.cz> wrote:
> > >
> > > On Wed 16-01-19 12:03:27, Dmitry Vyukov wrote:
> > > > On Wed, Jan 16, 2019 at 11:43 AM Jan Kara <jack@suse.cz> wrote:
> > > > >
> > > > > On Wed 16-01-19 10:47:56, Dmitry Vyukov wrote:
> > > > > > On Fri, Jan 11, 2019 at 1:46 PM Tetsuo Handa
> > > > > > <penguin-kernel@i-love.sakura.ne.jp> wrote:
> > > > > > >
> > > > > > > On 2019/01/11 19:48, Dmitry Vyukov wrote:
> > > > > > > >> How did you arrive to the conclusion that it is harmless?
> > > > > > > >> There is only one relevant standard covering this, which is the C
> > > > > > > >> language standard, and it is very clear on this -- this has Undefined
> > > > > > > >> Behavior, that is the same as, for example, reading/writing random
> > > > > > > >> pointers.
> > > > > > > >>
> > > > > > > >> Check out this on how any race that you might think is benign can be
> > > > > > > >> badly miscompiled and lead to arbitrary program behavior:
> > > > > > > >> https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong
> > > > > > > >
> > > > > > > > Also there is no other practical definition of data race for automatic
> > > > > > > > data race detectors than: two conflicting non-atomic concurrent
> > > > > > > > accesses. Which this code is. Which means that if we continue writing
> > > > > > > > such code we are not getting data race detection and don't detect
> > > > > > > > thousands of races in kernel code that one may consider more harmful
> > > > > > > > than this one the easy way. And instead will spent large amounts of
> > > > > > > > time to fix some of then the hard way, and leave the rest as just too
> > > > > > > > hard to debug so let the kernel continue crashing from time to time (I
> > > > > > > > believe a portion of currently open syzbot bugs that developers just
> > > > > > > > left as "I don't see how this can happen" are due to such races).
> > > > > > > >
> > > > > > >
> > > > > > > I still cannot catch. Read/write of sizeof(long) bytes at naturally
> > > > > > > aligned address is atomic, isn't it?
> > > > > >
> > > > > > Nobody guarantees this. According to C non-atomic conflicting
> > > > > > reads/writes of sizeof(long) cause undefined behavior of the whole
> > > > > > program.
> > > > >
> > > > > Yes, but to be fair the kernel has always relied on long accesses to be
> > > > > atomic pretty heavily so that it is now de-facto standard for the kernel
> > > > > AFAICT. I understand this makes life for static checkers hard but such is
> > > > > reality.
> > > >
> > > > Yes, but nobody never defined what "a long access" means. And if you
> > > > see a function that accepts a long argument and stores it into a long
> > > > field, no, it does not qualify. I bet this will come at surprise to
> > > > lots of developers.
> > >
> > > Yes, inlining and other optimizations can screw you.
> > >
> > > > Check out this fix and try to extrapolate how this "function stores
> > > > long into a long leads to a serious security bug" can actually be
> > > > applied to whole lot of places after inlining (or when somebody just
> > > > slightly shuffles code in a way that looks totally safe) that also
> > > > kinda look safe and atomic:
> > > > https://lore.kernel.org/patchwork/patch/599779/
> > > > So where is the boundary between "a long access" that is atomic and
> > > > the one that is not necessary atomic?
> > >
> > > So I tend to rely on "long access being atomic" for opaque values (no
> > > flags, no counters, ...). Just value that gets fetched from some global
> > > variable / other data structure, stored, read, and possibly compared for
> > > equality. I agree the compiler could still screw you if it could infer how
> > > that value got initially created and try to be clever about it...
> >
> > So can you, or somebody else, define a set of rules that we can use to
> > discriminate each particular case? How can we avoid that "the compiler
> > could still screw you"?
> >
> > Inlining is always enabled, right, so one needs to take into account
> > everything that's possibly can be inlined. Now or in future. And also
> > link-time-code generation, if we don't use it we are dropping 10% of
> > performance on the floor.
> > Also, ensuring that the code works when it's first submitted is the
> > smaller part of the problem. It's ensuring that it continues to work
> > in future what's more important. Try to imagine what amount of burden
> > this puts onto all developers who touch any kernel code in future.
> > Basically if you slightly alter local logic in a function that does
> > not do any loads/stores, you can screw multiple "proofs" that long
> > accesses are atomic. Or, you just move a function from .c file to .h.
> > I can bet nobody re-proofs all "long accesses are atomic" around the
> > changed code during code reviews, so these things break over time.
> > Or, even if only comparisons are involved (that you mentioned as
> > "safe") I see how that can actually affect compilation process. Say,
> > we are in the branch where 2 variables compare equal, now since no
> > concurrency is involved from compiler point of view, it can, say,
> > discard one variable and then re-load it from the other variable's
> > location, and say not the other variable has value that the other one
> > must never have. I don't have a full scenario, but that's exactly the
> > point. One will never see all possibilities.
> >
> > It all becomes super slippery slope very quickly. And we do want
> > compiler to generate as fast code as possible and do all these
> > optimizations. And it's not that there are big objective reasons to
> > not just mark all concurrent accesses and stop spending large amounts
> > of time on these "proofs".
>
> I guess you've convinced me that somehow marking such accesses is
> desirable. So is using atomic_long_t and atomic_long_set()
> / atomic_long_read() for manipulation instead what you suggest?

+Paul, Alan, Andrea, who may better know the long-term direction.
I think the closer analog of "plain long read/write that is intended
to be atomic" is using READ_ONCE/WRITE_ONCE with a normal non-atomic
type. It should achieve the goal of just read/write it once without
any tricks, but also makes them visible/searchable for humans and
tools and potential future refactoring to something else.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message.
  2019-01-16 15:33                       ` Dmitry Vyukov
  2019-01-16 15:33                         ` Dmitry Vyukov
@ 2019-01-16 16:15                         ` Paul E. McKenney
  2019-01-16 16:15                           ` Paul E. McKenney
  1 sibling, 1 reply; 31+ messages in thread
From: Paul E. McKenney @ 2019-01-16 16:15 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Jan Kara, Tetsuo Handa, Al Viro, linux-fsdevel, LKML, Alan Stern,
	Andrea Parri

On Wed, Jan 16, 2019 at 04:33:05PM +0100, Dmitry Vyukov wrote:
> On Wed, Jan 16, 2019 at 3:51 PM Jan Kara <jack@suse.cz> wrote:
> >
> > On Wed 16-01-19 13:37:22, Dmitry Vyukov wrote:
> > > On Wed, Jan 16, 2019 at 12:56 PM Jan Kara <jack@suse.cz> wrote:
> > > >
> > > > On Wed 16-01-19 12:03:27, Dmitry Vyukov wrote:
> > > > > On Wed, Jan 16, 2019 at 11:43 AM Jan Kara <jack@suse.cz> wrote:
> > > > > >
> > > > > > On Wed 16-01-19 10:47:56, Dmitry Vyukov wrote:
> > > > > > > On Fri, Jan 11, 2019 at 1:46 PM Tetsuo Handa
> > > > > > > <penguin-kernel@i-love.sakura.ne.jp> wrote:
> > > > > > > >
> > > > > > > > On 2019/01/11 19:48, Dmitry Vyukov wrote:
> > > > > > > > >> How did you arrive to the conclusion that it is harmless?
> > > > > > > > >> There is only one relevant standard covering this, which is the C
> > > > > > > > >> language standard, and it is very clear on this -- this has Undefined
> > > > > > > > >> Behavior, that is the same as, for example, reading/writing random
> > > > > > > > >> pointers.
> > > > > > > > >>
> > > > > > > > >> Check out this on how any race that you might think is benign can be
> > > > > > > > >> badly miscompiled and lead to arbitrary program behavior:
> > > > > > > > >> https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong
> > > > > > > > >
> > > > > > > > > Also there is no other practical definition of data race for automatic
> > > > > > > > > data race detectors than: two conflicting non-atomic concurrent
> > > > > > > > > accesses. Which this code is. Which means that if we continue writing
> > > > > > > > > such code we are not getting data race detection and don't detect
> > > > > > > > > thousands of races in kernel code that one may consider more harmful
> > > > > > > > > than this one the easy way. And instead will spent large amounts of
> > > > > > > > > time to fix some of then the hard way, and leave the rest as just too
> > > > > > > > > hard to debug so let the kernel continue crashing from time to time (I
> > > > > > > > > believe a portion of currently open syzbot bugs that developers just
> > > > > > > > > left as "I don't see how this can happen" are due to such races).
> > > > > > > > >
> > > > > > > >
> > > > > > > > I still cannot catch. Read/write of sizeof(long) bytes at naturally
> > > > > > > > aligned address is atomic, isn't it?
> > > > > > >
> > > > > > > Nobody guarantees this. According to C non-atomic conflicting
> > > > > > > reads/writes of sizeof(long) cause undefined behavior of the whole
> > > > > > > program.
> > > > > >
> > > > > > Yes, but to be fair the kernel has always relied on long accesses to be
> > > > > > atomic pretty heavily so that it is now de-facto standard for the kernel
> > > > > > AFAICT. I understand this makes life for static checkers hard but such is
> > > > > > reality.
> > > > >
> > > > > Yes, but nobody never defined what "a long access" means. And if you
> > > > > see a function that accepts a long argument and stores it into a long
> > > > > field, no, it does not qualify. I bet this will come at surprise to
> > > > > lots of developers.
> > > >
> > > > Yes, inlining and other optimizations can screw you.
> > > >
> > > > > Check out this fix and try to extrapolate how this "function stores
> > > > > long into a long leads to a serious security bug" can actually be
> > > > > applied to whole lot of places after inlining (or when somebody just
> > > > > slightly shuffles code in a way that looks totally safe) that also
> > > > > kinda look safe and atomic:
> > > > > https://lore.kernel.org/patchwork/patch/599779/
> > > > > So where is the boundary between "a long access" that is atomic and
> > > > > the one that is not necessary atomic?
> > > >
> > > > So I tend to rely on "long access being atomic" for opaque values (no
> > > > flags, no counters, ...). Just value that gets fetched from some global
> > > > variable / other data structure, stored, read, and possibly compared for
> > > > equality. I agree the compiler could still screw you if it could infer how
> > > > that value got initially created and try to be clever about it...
> > >
> > > So can you, or somebody else, define a set of rules that we can use to
> > > discriminate each particular case? How can we avoid that "the compiler
> > > could still screw you"?
> > >
> > > Inlining is always enabled, right, so one needs to take into account
> > > everything that's possibly can be inlined. Now or in future. And also
> > > link-time-code generation, if we don't use it we are dropping 10% of
> > > performance on the floor.
> > > Also, ensuring that the code works when it's first submitted is the
> > > smaller part of the problem. It's ensuring that it continues to work
> > > in future what's more important. Try to imagine what amount of burden
> > > this puts onto all developers who touch any kernel code in future.
> > > Basically if you slightly alter local logic in a function that does
> > > not do any loads/stores, you can screw multiple "proofs" that long
> > > accesses are atomic. Or, you just move a function from .c file to .h.
> > > I can bet nobody re-proofs all "long accesses are atomic" around the
> > > changed code during code reviews, so these things break over time.
> > > Or, even if only comparisons are involved (that you mentioned as
> > > "safe") I see how that can actually affect compilation process. Say,
> > > we are in the branch where 2 variables compare equal, now since no
> > > concurrency is involved from compiler point of view, it can, say,
> > > discard one variable and then re-load it from the other variable's
> > > location, and say not the other variable has value that the other one
> > > must never have. I don't have a full scenario, but that's exactly the
> > > point. One will never see all possibilities.
> > >
> > > It all becomes super slippery slope very quickly. And we do want
> > > compiler to generate as fast code as possible and do all these
> > > optimizations. And it's not that there are big objective reasons to
> > > not just mark all concurrent accesses and stop spending large amounts
> > > of time on these "proofs".
> >
> > I guess you've convinced me that somehow marking such accesses is
> > desirable. So is using atomic_long_t and atomic_long_set()
> > / atomic_long_read() for manipulation instead what you suggest?
> 
> +Paul, Alan, Andrea, who may better know the long-term direction.
> I think the closer analog of "plain long read/write that is intended
> to be atomic" is using READ_ONCE/WRITE_ONCE with a normal non-atomic
> type. It should achieve the goal of just read/write it once without
> any tricks, but also makes them visible/searchable for humans and
> tools and potential future refactoring to something else.

Agreed, use of READ_ONCE() and WRITE_ONCE() should suffice.

The atomic_long_set() and atomic_long_read() would also work, but are
a bit more typing.  ;-)

							Thanx, Paul

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message.
  2019-01-16 16:15                         ` Paul E. McKenney
@ 2019-01-16 16:15                           ` Paul E. McKenney
  0 siblings, 0 replies; 31+ messages in thread
From: Paul E. McKenney @ 2019-01-16 16:15 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Jan Kara, Tetsuo Handa, Al Viro, linux-fsdevel, LKML, Alan Stern,
	Andrea Parri

On Wed, Jan 16, 2019 at 04:33:05PM +0100, Dmitry Vyukov wrote:
> On Wed, Jan 16, 2019 at 3:51 PM Jan Kara <jack@suse.cz> wrote:
> >
> > On Wed 16-01-19 13:37:22, Dmitry Vyukov wrote:
> > > On Wed, Jan 16, 2019 at 12:56 PM Jan Kara <jack@suse.cz> wrote:
> > > >
> > > > On Wed 16-01-19 12:03:27, Dmitry Vyukov wrote:
> > > > > On Wed, Jan 16, 2019 at 11:43 AM Jan Kara <jack@suse.cz> wrote:
> > > > > >
> > > > > > On Wed 16-01-19 10:47:56, Dmitry Vyukov wrote:
> > > > > > > On Fri, Jan 11, 2019 at 1:46 PM Tetsuo Handa
> > > > > > > <penguin-kernel@i-love.sakura.ne.jp> wrote:
> > > > > > > >
> > > > > > > > On 2019/01/11 19:48, Dmitry Vyukov wrote:
> > > > > > > > >> How did you arrive to the conclusion that it is harmless?
> > > > > > > > >> There is only one relevant standard covering this, which is the C
> > > > > > > > >> language standard, and it is very clear on this -- this has Undefined
> > > > > > > > >> Behavior, that is the same as, for example, reading/writing random
> > > > > > > > >> pointers.
> > > > > > > > >>
> > > > > > > > >> Check out this on how any race that you might think is benign can be
> > > > > > > > >> badly miscompiled and lead to arbitrary program behavior:
> > > > > > > > >> https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong
> > > > > > > > >
> > > > > > > > > Also there is no other practical definition of data race for automatic
> > > > > > > > > data race detectors than: two conflicting non-atomic concurrent
> > > > > > > > > accesses. Which this code is. Which means that if we continue writing
> > > > > > > > > such code we are not getting data race detection and don't detect
> > > > > > > > > thousands of races in kernel code that one may consider more harmful
> > > > > > > > > than this one the easy way. And instead will spent large amounts of
> > > > > > > > > time to fix some of then the hard way, and leave the rest as just too
> > > > > > > > > hard to debug so let the kernel continue crashing from time to time (I
> > > > > > > > > believe a portion of currently open syzbot bugs that developers just
> > > > > > > > > left as "I don't see how this can happen" are due to such races).
> > > > > > > > >
> > > > > > > >
> > > > > > > > I still cannot catch. Read/write of sizeof(long) bytes at naturally
> > > > > > > > aligned address is atomic, isn't it?
> > > > > > >
> > > > > > > Nobody guarantees this. According to C non-atomic conflicting
> > > > > > > reads/writes of sizeof(long) cause undefined behavior of the whole
> > > > > > > program.
> > > > > >
> > > > > > Yes, but to be fair the kernel has always relied on long accesses to be
> > > > > > atomic pretty heavily so that it is now de-facto standard for the kernel
> > > > > > AFAICT. I understand this makes life for static checkers hard but such is
> > > > > > reality.
> > > > >
> > > > > Yes, but nobody never defined what "a long access" means. And if you
> > > > > see a function that accepts a long argument and stores it into a long
> > > > > field, no, it does not qualify. I bet this will come at surprise to
> > > > > lots of developers.
> > > >
> > > > Yes, inlining and other optimizations can screw you.
> > > >
> > > > > Check out this fix and try to extrapolate how this "function stores
> > > > > long into a long leads to a serious security bug" can actually be
> > > > > applied to whole lot of places after inlining (or when somebody just
> > > > > slightly shuffles code in a way that looks totally safe) that also
> > > > > kinda look safe and atomic:
> > > > > https://lore.kernel.org/patchwork/patch/599779/
> > > > > So where is the boundary between "a long access" that is atomic and
> > > > > the one that is not necessary atomic?
> > > >
> > > > So I tend to rely on "long access being atomic" for opaque values (no
> > > > flags, no counters, ...). Just value that gets fetched from some global
> > > > variable / other data structure, stored, read, and possibly compared for
> > > > equality. I agree the compiler could still screw you if it could infer how
> > > > that value got initially created and try to be clever about it...
> > >
> > > So can you, or somebody else, define a set of rules that we can use to
> > > discriminate each particular case? How can we avoid that "the compiler
> > > could still screw you"?
> > >
> > > Inlining is always enabled, right, so one needs to take into account
> > > everything that's possibly can be inlined. Now or in future. And also
> > > link-time-code generation, if we don't use it we are dropping 10% of
> > > performance on the floor.
> > > Also, ensuring that the code works when it's first submitted is the
> > > smaller part of the problem. It's ensuring that it continues to work
> > > in future what's more important. Try to imagine what amount of burden
> > > this puts onto all developers who touch any kernel code in future.
> > > Basically if you slightly alter local logic in a function that does
> > > not do any loads/stores, you can screw multiple "proofs" that long
> > > accesses are atomic. Or, you just move a function from .c file to .h.
> > > I can bet nobody re-proofs all "long accesses are atomic" around the
> > > changed code during code reviews, so these things break over time.
> > > Or, even if only comparisons are involved (that you mentioned as
> > > "safe") I see how that can actually affect compilation process. Say,
> > > we are in the branch where 2 variables compare equal, now since no
> > > concurrency is involved from compiler point of view, it can, say,
> > > discard one variable and then re-load it from the other variable's
> > > location, and say not the other variable has value that the other one
> > > must never have. I don't have a full scenario, but that's exactly the
> > > point. One will never see all possibilities.
> > >
> > > It all becomes super slippery slope very quickly. And we do want
> > > compiler to generate as fast code as possible and do all these
> > > optimizations. And it's not that there are big objective reasons to
> > > not just mark all concurrent accesses and stop spending large amounts
> > > of time on these "proofs".
> >
> > I guess you've convinced me that somehow marking such accesses is
> > desirable. So is using atomic_long_t and atomic_long_set()
> > / atomic_long_read() for manipulation instead what you suggest?
> 
> +Paul, Alan, Andrea, who may better know the long-term direction.
> I think the closer analog of "plain long read/write that is intended
> to be atomic" is using READ_ONCE/WRITE_ONCE with a normal non-atomic
> type. It should achieve the goal of just read/write it once without
> any tricks, but also makes them visible/searchable for humans and
> tools and potential future refactoring to something else.

Agreed, use of READ_ONCE() and WRITE_ONCE() should suffice.

The atomic_long_set() and atomic_long_read() would also work, but are
a bit more typing.  ;-)

							Thanx, Paul


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message.
  2019-01-16 11:48                 ` Dmitry Vyukov
@ 2019-01-16 16:28                   ` Greg Kroah-Hartman
  2019-01-17 13:18                     ` Dmitry Vyukov
  0 siblings, 1 reply; 31+ messages in thread
From: Greg Kroah-Hartman @ 2019-01-16 16:28 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Jan Kara, Linus Torvalds, Kees Cook, Tetsuo Handa, Al Viro,
	linux-fsdevel, Kostya Serebryany

On Wed, Jan 16, 2019 at 12:48:41PM +0100, Dmitry Vyukov wrote:
> On Wed, Jan 16, 2019 at 12:03 PM Dmitry Vyukov <dvyukov@google.com> wrote:
> >
> > On Wed, Jan 16, 2019 at 11:43 AM Jan Kara <jack@suse.cz> wrote:
> > >
> > > On Wed 16-01-19 10:47:56, Dmitry Vyukov wrote:
> > > > On Fri, Jan 11, 2019 at 1:46 PM Tetsuo Handa
> > > > <penguin-kernel@i-love.sakura.ne.jp> wrote:
> > > > >
> > > > > On 2019/01/11 19:48, Dmitry Vyukov wrote:
> > > > > >> How did you arrive to the conclusion that it is harmless?
> > > > > >> There is only one relevant standard covering this, which is the C
> > > > > >> language standard, and it is very clear on this -- this has Undefined
> > > > > >> Behavior, that is the same as, for example, reading/writing random
> > > > > >> pointers.
> > > > > >>
> > > > > >> Check out this on how any race that you might think is benign can be
> > > > > >> badly miscompiled and lead to arbitrary program behavior:
> > > > > >> https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong
> > > > > >
> > > > > > Also there is no other practical definition of data race for automatic
> > > > > > data race detectors than: two conflicting non-atomic concurrent
> > > > > > accesses. Which this code is. Which means that if we continue writing
> > > > > > such code we are not getting data race detection and don't detect
> > > > > > thousands of races in kernel code that one may consider more harmful
> > > > > > than this one the easy way. And instead will spent large amounts of
> > > > > > time to fix some of then the hard way, and leave the rest as just too
> > > > > > hard to debug so let the kernel continue crashing from time to time (I
> > > > > > believe a portion of currently open syzbot bugs that developers just
> > > > > > left as "I don't see how this can happen" are due to such races).
> > > > > >
> > > > >
> > > > > I still cannot catch. Read/write of sizeof(long) bytes at naturally
> > > > > aligned address is atomic, isn't it?
> > > >
> > > > Nobody guarantees this. According to C non-atomic conflicting
> > > > reads/writes of sizeof(long) cause undefined behavior of the whole
> > > > program.
> > >
> > > Yes, but to be fair the kernel has always relied on long accesses to be
> > > atomic pretty heavily so that it is now de-facto standard for the kernel
> > > AFAICT. I understand this makes life for static checkers hard but such is
> > > reality.
> >
> > Yes, but nobody never defined what "a long access" means. And if you
> > see a function that accepts a long argument and stores it into a long
> > field, no, it does not qualify. I bet this will come at surprise to
> > lots of developers.
> > Check out this fix and try to extrapolate how this "function stores
> > long into a long leads to a serious security bug" can actually be
> > applied to whole lot of places after inlining (or when somebody just
> > slightly shuffles code in a way that looks totally safe) that also
> > kinda look safe and atomic:
> > https://lore.kernel.org/patchwork/patch/599779/
> > So where is the boundary between "a long access" that is atomic and
> > the one that is not necessary atomic?
> 
> 
> +Linus, Greg, Kees
> 
> I wanted to provide a hash/link to this commit but, wait, you want to
> say that this patch for a security bugs was mailed, recorded by
> patchwork, acked by subsystem developer and then dropped on the floor
> for 3+ years? Doh!
> 
> https://lore.kernel.org/patchwork/patch/599779/
> 
> There are known ways how to make this not a thing at all. Like open
> pull requests on github:
> https://github.com/google/syzkaller/pulls
> or, some projects even do own dashboard for this:
> https://dev.golang.org/reviews
> because this is important. Especially for new contributors, drive-by
> improvements, good samaritan fixes, etc.
> 
> Another example: a bug-fixing patch was lost for 2 years:
> "Two years ago ;) I don't understand why there were ignored"
> https://www.spinics.net/lists/linux-mm/msg161351.html
> 
> Another example: a patch is applied to a subsystem tree and then lost
> for 6 months:
> https://patchwork.kernel.org/patch/10339089/

I don't understand the issue here.  Are you saying that sometimes
patches that have been submitted get dropped?  Yes, that's known, it is
up to the submitter to verify and ensure that the patch is applied.
Given our rate of change and the large workload that some maintainers
have, this is the best that we can do at the moment.

Putting it all in a github dashboard would not scale in the least (other
projects smaller than us have tried and ended up stopping from doing
that as it fails horribly).

Yes, we can always do better, but remember that the submitter needs to
take the time to ensure that their patches are applied.  Heck, I have
patches submitted months ago that I know the maintainers ignored, and I
need to remember to send them again.  We put the burden of development
on the thing that scales, the developer themselves, not the maintainer
here.

It's the best that we know of how to do at the moment, and we are always
trying to do better.  Examples of this are where some subsystems are now
getting multiple maintainers to handle the workload, and that's helping
a lot.  That doesn't work for all subsystems as not all subsystems can
even find more than one maintainer who is willing to look at the
patches.

Please, resubmit your mount patch again, that's a crazy bug :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message.
  2019-01-16 16:28                   ` Greg Kroah-Hartman
@ 2019-01-17 13:18                     ` Dmitry Vyukov
  2019-01-21  8:37                       ` Jan Kara
  0 siblings, 1 reply; 31+ messages in thread
From: Dmitry Vyukov @ 2019-01-17 13:18 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Jan Kara, Linus Torvalds, Kees Cook, Tetsuo Handa, Al Viro,
	linux-fsdevel, Kostya Serebryany

On Wed, Jan 16, 2019 at 5:28 PM Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> On Wed, Jan 16, 2019 at 12:48:41PM +0100, Dmitry Vyukov wrote:
> > On Wed, Jan 16, 2019 at 12:03 PM Dmitry Vyukov <dvyukov@google.com> wrote:
> > >
> > > On Wed, Jan 16, 2019 at 11:43 AM Jan Kara <jack@suse.cz> wrote:
> > > >
> > > > On Wed 16-01-19 10:47:56, Dmitry Vyukov wrote:
> > > > > On Fri, Jan 11, 2019 at 1:46 PM Tetsuo Handa
> > > > > <penguin-kernel@i-love.sakura.ne.jp> wrote:
> > > > > >
> > > > > > On 2019/01/11 19:48, Dmitry Vyukov wrote:
> > > > > > >> How did you arrive to the conclusion that it is harmless?
> > > > > > >> There is only one relevant standard covering this, which is the C
> > > > > > >> language standard, and it is very clear on this -- this has Undefined
> > > > > > >> Behavior, that is the same as, for example, reading/writing random
> > > > > > >> pointers.
> > > > > > >>
> > > > > > >> Check out this on how any race that you might think is benign can be
> > > > > > >> badly miscompiled and lead to arbitrary program behavior:
> > > > > > >> https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong
> > > > > > >
> > > > > > > Also there is no other practical definition of data race for automatic
> > > > > > > data race detectors than: two conflicting non-atomic concurrent
> > > > > > > accesses. Which this code is. Which means that if we continue writing
> > > > > > > such code we are not getting data race detection and don't detect
> > > > > > > thousands of races in kernel code that one may consider more harmful
> > > > > > > than this one the easy way. And instead will spent large amounts of
> > > > > > > time to fix some of then the hard way, and leave the rest as just too
> > > > > > > hard to debug so let the kernel continue crashing from time to time (I
> > > > > > > believe a portion of currently open syzbot bugs that developers just
> > > > > > > left as "I don't see how this can happen" are due to such races).
> > > > > > >
> > > > > >
> > > > > > I still cannot catch. Read/write of sizeof(long) bytes at naturally
> > > > > > aligned address is atomic, isn't it?
> > > > >
> > > > > Nobody guarantees this. According to C non-atomic conflicting
> > > > > reads/writes of sizeof(long) cause undefined behavior of the whole
> > > > > program.
> > > >
> > > > Yes, but to be fair the kernel has always relied on long accesses to be
> > > > atomic pretty heavily so that it is now de-facto standard for the kernel
> > > > AFAICT. I understand this makes life for static checkers hard but such is
> > > > reality.
> > >
> > > Yes, but nobody never defined what "a long access" means. And if you
> > > see a function that accepts a long argument and stores it into a long
> > > field, no, it does not qualify. I bet this will come at surprise to
> > > lots of developers.
> > > Check out this fix and try to extrapolate how this "function stores
> > > long into a long leads to a serious security bug" can actually be
> > > applied to whole lot of places after inlining (or when somebody just
> > > slightly shuffles code in a way that looks totally safe) that also
> > > kinda look safe and atomic:
> > > https://lore.kernel.org/patchwork/patch/599779/
> > > So where is the boundary between "a long access" that is atomic and
> > > the one that is not necessary atomic?
> >
> >
> > +Linus, Greg, Kees
> >
> > I wanted to provide a hash/link to this commit but, wait, you want to
> > say that this patch for a security bugs was mailed, recorded by
> > patchwork, acked by subsystem developer and then dropped on the floor
> > for 3+ years? Doh!
> >
> > https://lore.kernel.org/patchwork/patch/599779/
> >
> > There are known ways how to make this not a thing at all. Like open
> > pull requests on github:
> > https://github.com/google/syzkaller/pulls
> > or, some projects even do own dashboard for this:
> > https://dev.golang.org/reviews
> > because this is important. Especially for new contributors, drive-by
> > improvements, good samaritan fixes, etc.
> >
> > Another example: a bug-fixing patch was lost for 2 years:
> > "Two years ago ;) I don't understand why there were ignored"
> > https://www.spinics.net/lists/linux-mm/msg161351.html
> >
> > Another example: a patch is applied to a subsystem tree and then lost
> > for 6 months:
> > https://patchwork.kernel.org/patch/10339089/
>
> I don't understand the issue here.  Are you saying that sometimes
> patches that have been submitted get dropped?  Yes, that's known, it is
> up to the submitter to verify and ensure that the patch is applied.
> Given our rate of change and the large workload that some maintainers
> have, this is the best that we can do at the moment.
>
> Putting it all in a github dashboard would not scale in the least (other
> projects smaller than us have tried and ended up stopping from doing
> that as it fails horribly).
>
> Yes, we can always do better, but remember that the submitter needs to
> take the time to ensure that their patches are applied.  Heck, I have
> patches submitted months ago that I know the maintainers ignored, and I
> need to remember to send them again.  We put the burden of development
> on the thing that scales, the developer themselves, not the maintainer
> here.
>
> It's the best that we know of how to do at the moment, and we are always
> trying to do better.  Examples of this are where some subsystems are now
> getting multiple maintainers to handle the workload, and that's helping
> a lot.  That doesn't work for all subsystems as not all subsystems can
> even find more than one maintainer who is willing to look at the
> patches.

The issue here is that patches are lost and "up to the submitter" is
not fully working.
It may be working reasonably well when a developer has an official
assignment at work to do thing X, and then they can't miss/forget
about "is thing X merged yet". But it fails for new contributors,
drive-by improvements, good samaritan fixes, etc. Things that we need
no less than the first category (maybe more).
Machines are always better than humans at such scrupulous tracking
work. So if humans can do it, machines will do even better.
The dashboard definitely needs to be sharded in multiple dimensions.
E.g. "per subsystem", "per assigned reviewer", and even "per author".
Because e.g. how may mine are lost? Only this one or more? How many
yours are lost? Do you know?
I am sure this is doable and beneficial. I don't know why other
projects failed with this, maybe that's something with github. But
there are also codebases that are 100x larger than kernel and do
amount of changes kernel receives in a year in less than a week and
nothing gets lots thanks to scalable processes and automation.

> Please, resubmit your mount patch again, that's a crazy bug :)

That's the problem. It now requires way more additional work and it's
even unclear if the problem is still there or not, the code has
radically changed. It could have been merged back then as is with 0
additional work. I could have been updated it a week after original
submission. But now that's completely paged out, I am looking at that
file and I don't recognize anything, no idea how the patch should be
updated, I now have no idea what tree such patch should be based on,
etc.
Also a good question is how many other my patches were lost that I now
have no idea about? I discovered this one by pure accident.

To make things more constructive: say, if somebody offers to build
such a system for kernel, in accordance with kernel specific
requirements (would also enable presubmit testing, recording base
commit, not losing review comments and other nice things), would you,
Linus (not sure who is in change of such decisions) be willing to
integrate it into the official kernel development process so that
everybody use it for all kernel changes?

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message.
  2019-01-16 10:43             ` Jan Kara
  2019-01-16 11:03               ` Dmitry Vyukov
@ 2019-01-17 14:11               ` Tetsuo Handa
  2019-01-18 15:30                 ` Dmitry Vyukov
  1 sibling, 1 reply; 31+ messages in thread
From: Tetsuo Handa @ 2019-01-17 14:11 UTC (permalink / raw)
  To: Jan Kara, Dmitry Vyukov; +Cc: Al Viro, linux-fsdevel

On 2019/01/16 19:43, Jan Kara wrote:
> On Wed 16-01-19 10:47:56, Dmitry Vyukov wrote:
>>> I still cannot catch. Read/write of sizeof(long) bytes at naturally
>>> aligned address is atomic, isn't it?
>>
>> Nobody guarantees this. According to C non-atomic conflicting
>> reads/writes of sizeof(long) cause undefined behavior of the whole
>> program.
> 
> Yes, but to be fair the kernel has always relied on long accesses to be
> atomic pretty heavily so that it is now de-facto standard for the kernel
> AFAICT. I understand this makes life for static checkers hard but such is
> reality.

Regarding "load a long word from naturally aligned address" and 
"store a long word to naturally aligned address", they are required to be atomic
in Linux kernel, (for it is described as "On all systems running Linux, loads from
and stores to pointers are atomic, that is, if a store to a pointer occurs at the
same time as a load from that same pointer, the load will return either the initial
value or the value stored, never some bitwise mashup of the two." at
https://lwn.net/Articles/262464/ ) aren't they?

And I know we need to use READ_ONCE()/WRITE_ONCE() etc. when we need to guarantee
that reordering/reloading is not acceptable. But what damage can cause for this
specific case? It does only "compare two long variables" in addition to "load a
long word from naturally aligned address" and "store a long word to naturally
aligned address". Reordering/reloading these variables won't cause severe problems.

> 
> But in this particular case I agree with you that special logic is not
> really warranted.

Anyway, do we apply V2 patch at
https://marc.info/?i=cd9c47c9-1dc6-6949-a138-703f6a10bc9b@i-love.sakura.ne.jp ?

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message.
  2019-01-17 14:11               ` Tetsuo Handa
@ 2019-01-18 15:30                 ` Dmitry Vyukov
  0 siblings, 0 replies; 31+ messages in thread
From: Dmitry Vyukov @ 2019-01-18 15:30 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: Jan Kara, Al Viro, linux-fsdevel

On Thu, Jan 17, 2019 at 3:11 PM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> On 2019/01/16 19:43, Jan Kara wrote:
> > On Wed 16-01-19 10:47:56, Dmitry Vyukov wrote:
> >>> I still cannot catch. Read/write of sizeof(long) bytes at naturally
> >>> aligned address is atomic, isn't it?
> >>
> >> Nobody guarantees this. According to C non-atomic conflicting
> >> reads/writes of sizeof(long) cause undefined behavior of the whole
> >> program.
> >
> > Yes, but to be fair the kernel has always relied on long accesses to be
> > atomic pretty heavily so that it is now de-facto standard for the kernel
> > AFAICT. I understand this makes life for static checkers hard but such is
> > reality.
>
> Regarding "load a long word from naturally aligned address" and
> "store a long word to naturally aligned address", they are required to be atomic
> in Linux kernel, (for it is described as "On all systems running Linux, loads from
> and stores to pointers are atomic, that is, if a store to a pointer occurs at the
> same time as a load from that same pointer, the load will return either the initial
> value or the value stored, never some bitwise mashup of the two." at
> https://lwn.net/Articles/262464/ ) aren't they?

Well, the description of the patch explains how/why a compiler can
make a C pointer store non-atomic. Also you can see that
list_for_each_entry_rcu uses READ_ONCE:
https://elixir.bootlin.com/linux/v5.0-rc2/source/include/linux/rculist.h#L278


> And I know we need to use READ_ONCE()/WRITE_ONCE() etc. when we need to guarantee
> that reordering/reloading is not acceptable.

READ/WRITE_ONCE don't have anything to do with ordering. They only
provide atomicity and visibility.
That's various acquire/release/barriers that provide ordering guarantees.

> But what damage can cause for this
> specific case?

Any.

> It does only "compare two long variables" in addition to "load a
> long word from naturally aligned address" and "store a long word to naturally
> aligned address". Reordering/reloading these variables won't cause severe problems.
>
> >
> > But in this particular case I agree with you that special logic is not
> > really warranted.
>
> Anyway, do we apply V2 patch at
> https://marc.info/?i=cd9c47c9-1dc6-6949-a138-703f6a10bc9b@i-love.sakura.ne.jp ?

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message.
  2019-01-17 13:18                     ` Dmitry Vyukov
@ 2019-01-21  8:37                       ` Jan Kara
  2019-01-21 10:33                         ` Tetsuo Handa
  2019-01-22 15:27                         ` Kernel development process (was: [PATCH] fs: ratelimit __find_get_block_slow() failure message.) Dmitry Vyukov
  0 siblings, 2 replies; 31+ messages in thread
From: Jan Kara @ 2019-01-21  8:37 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Greg Kroah-Hartman, Jan Kara, Linus Torvalds, Kees Cook,
	Tetsuo Handa, Al Viro, linux-fsdevel, Kostya Serebryany

On Thu 17-01-19 14:18:56, Dmitry Vyukov wrote:
> On Wed, Jan 16, 2019 at 5:28 PM Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
> >
> > On Wed, Jan 16, 2019 at 12:48:41PM +0100, Dmitry Vyukov wrote:
> > > On Wed, Jan 16, 2019 at 12:03 PM Dmitry Vyukov <dvyukov@google.com> wrote:
> > > I wanted to provide a hash/link to this commit but, wait, you want to
> > > say that this patch for a security bugs was mailed, recorded by
> > > patchwork, acked by subsystem developer and then dropped on the floor
> > > for 3+ years? Doh!
> > >
> > > https://lore.kernel.org/patchwork/patch/599779/
> > >
> > > There are known ways how to make this not a thing at all. Like open
> > > pull requests on github:
> > > https://github.com/google/syzkaller/pulls
> > > or, some projects even do own dashboard for this:
> > > https://dev.golang.org/reviews
> > > because this is important. Especially for new contributors, drive-by
> > > improvements, good samaritan fixes, etc.
> > >
> > > Another example: a bug-fixing patch was lost for 2 years:
> > > "Two years ago ;) I don't understand why there were ignored"
> > > https://www.spinics.net/lists/linux-mm/msg161351.html
> > >
> > > Another example: a patch is applied to a subsystem tree and then lost
> > > for 6 months:
> > > https://patchwork.kernel.org/patch/10339089/
> >
> > I don't understand the issue here.  Are you saying that sometimes
> > patches that have been submitted get dropped?  Yes, that's known, it is
> > up to the submitter to verify and ensure that the patch is applied.
> > Given our rate of change and the large workload that some maintainers
> > have, this is the best that we can do at the moment.
> >
> > Putting it all in a github dashboard would not scale in the least (other
> > projects smaller than us have tried and ended up stopping from doing
> > that as it fails horribly).
> >
> > Yes, we can always do better, but remember that the submitter needs to
> > take the time to ensure that their patches are applied.  Heck, I have
> > patches submitted months ago that I know the maintainers ignored, and I
> > need to remember to send them again.  We put the burden of development
> > on the thing that scales, the developer themselves, not the maintainer
> > here.
> >
> > It's the best that we know of how to do at the moment, and we are always
> > trying to do better.  Examples of this are where some subsystems are now
> > getting multiple maintainers to handle the workload, and that's helping
> > a lot.  That doesn't work for all subsystems as not all subsystems can
> > even find more than one maintainer who is willing to look at the
> > patches.
> 
> The issue here is that patches are lost and "up to the submitter" is
> not fully working.
> It may be working reasonably well when a developer has an official
> assignment at work to do thing X, and then they can't miss/forget
> about "is thing X merged yet". But it fails for new contributors,
> drive-by improvements, good samaritan fixes, etc. Things that we need
> no less than the first category (maybe more).
> Machines are always better than humans at such scrupulous tracking
> work. So if humans can do it, machines will do even better.
> The dashboard definitely needs to be sharded in multiple dimensions.
> E.g. "per subsystem", "per assigned reviewer", and even "per author".
> Because e.g. how may mine are lost? Only this one or more? How many
> yours are lost? Do you know?
> I am sure this is doable and beneficial. I don't know why other
> projects failed with this, maybe that's something with github. But
> there are also codebases that are 100x larger than kernel and do
> amount of changes kernel receives in a year in less than a week and
> nothing gets lots thanks to scalable processes and automation.

Out of curiosity which ones?

> > Please, resubmit your mount patch again, that's a crazy bug :)
> 
> That's the problem. It now requires way more additional work and it's
> even unclear if the problem is still there or not, the code has
> radically changed. It could have been merged back then as is with 0
> additional work. I could have been updated it a week after original
> submission. But now that's completely paged out, I am looking at that
> file and I don't recognize anything, no idea how the patch should be
> updated, I now have no idea what tree such patch should be based on,
> etc.

Yeah, it's painful at times... I know it as well.

> Also a good question is how many other my patches were lost that I now
> have no idea about? I discovered this one by pure accident.

Well, I do keep track of my own submitted patches in my git tree and
occasionally sweep through it and resubmit / ping about lost ones. But yes,
it requires some motivation and self-discipline which is not always present
for drive-by contributions.

> To make things more constructive: say, if somebody offers to build
> such a system for kernel, in accordance with kernel specific
> requirements (would also enable presubmit testing, recording base
> commit, not losing review comments and other nice things), would you,
> Linus (not sure who is in change of such decisions) be willing to
> integrate it into the official kernel development process so that
> everybody use it for all kernel changes?

Well, we do have patchwork and some subsystems use it. It doesn't have
pre-submit testing or other fancy features but it is good enough so that
patches don't get lost. And Konstantin (kernel.org admin) did quite some
good work recently with automating lots of tedious tasks with patchwork. So
at this point I think it's more about the fact that some maintainers prefer
to work differently rather than the lack of tooling as such.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2] fs: ratelimit __find_get_block_slow() failure message.
  2019-01-11 11:37   ` [PATCH v2] " Tetsuo Handa
@ 2019-01-21  8:57     ` Jan Kara
  0 siblings, 0 replies; 31+ messages in thread
From: Jan Kara @ 2019-01-21  8:57 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: Jan Kara, Alexander Viro, Dmitry Vyukov, linux-fsdevel

On Fri 11-01-19 20:37:42, Tetsuo Handa wrote:
> When something let __find_get_block_slow() hit all_mapped path, it calls
> printk() for 100+ times per a second. But there is no need to print same
> message with such high frequency; it is just asking for stall warning, or
> at least bloating log files.
> 
>   [  399.866302][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8
>   [  399.873324][T15342] b_state=0x00000029, b_size=512
>   [  399.878403][T15342] device loop0 blocksize: 4096
>   [  399.883296][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8
>   [  399.890400][T15342] b_state=0x00000029, b_size=512
>   [  399.895595][T15342] device loop0 blocksize: 4096
>   [  399.900556][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8
>   [  399.907471][T15342] b_state=0x00000029, b_size=512
>   [  399.912506][T15342] device loop0 blocksize: 4096
> 
> This patch reduces frequency to up to once per a second, in addition to
> concatenating three lines into one.
> 
>   [  399.866302][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8, b_state=0x00000029, b_size=512, device loop0 blocksize: 4096
> 
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> ---
>  fs/buffer.c | 18 ++++++++----------
>  1 file changed, 8 insertions(+), 10 deletions(-)
> 
> diff --git a/fs/buffer.c b/fs/buffer.c
> index 52d024b..4be8083 100644
> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -200,6 +200,7 @@ void end_buffer_write_sync(struct buffer_head *bh, int uptodate)
>  	struct buffer_head *head;
>  	struct page *page;
>  	int all_mapped = 1;
> +	static DEFINE_RATELIMIT_STATE(last_warned, HZ, 1);
>  
>  	index = block >> (PAGE_SHIFT - bd_inode->i_blkbits);
>  	page = find_get_page_flags(bd_mapping, index, FGP_ACCESSED);
> @@ -227,16 +228,13 @@ void end_buffer_write_sync(struct buffer_head *bh, int uptodate)
>  	 * file io on the block device and getblk.  It gets dealt with
>  	 * elsewhere, don't buffer_error if we had some unmapped buffers
>  	 */
> -	if (all_mapped) {
> -		printk("__find_get_block_slow() failed. "
> -			"block=%llu, b_blocknr=%llu\n",
> -			(unsigned long long)block,
> -			(unsigned long long)bh->b_blocknr);
> -		printk("b_state=0x%08lx, b_size=%zu\n",
> -			bh->b_state, bh->b_size);
> -		printk("device %pg blocksize: %d\n", bdev,
> -			1 << bd_inode->i_blkbits);
> -	}
> +	ratelimit_set_flags(&last_warned, RATELIMIT_MSG_ON_RELEASE);
> +	if (all_mapped && __ratelimit(&last_warned))
> +		printk("__find_get_block_slow() failed. block=%llu, b_blocknr=%llu, b_state=0x%08lx, b_size=%zu, device %pg blocksize: %d\n",

I would wrap this like:

		printk("__find_get_block_slow() failed. block=%llu, "
		       "b_blocknr=%llu, b_state=0x%08lx, b_size=%zu, "
		       "device %pg blocksize: %d\n",

as 140 chars long line seems really too much. Also I'd embed the 'if body'
in { }. I know it's not necessary but it makes the code more readable when
printk has to occupy multiple lines.

Otherwise the patch looks good so feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

WRT merging, you might want to send the fix to Jens Axboe since he's
occasionally merging fs/buffer.c patches.

								Honza

> +		       (unsigned long long)block,
> +		       (unsigned long long)bh->b_blocknr,
> +		       bh->b_state, bh->b_size, bdev,
> +		       1 << bd_inode->i_blkbits);
>  out_unlock:
>  	spin_unlock(&bd_mapping->private_lock);
>  	put_page(page);
> -- 
> 1.8.3.1
> 
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message.
  2019-01-21  8:37                       ` Jan Kara
@ 2019-01-21 10:33                         ` Tetsuo Handa
  2019-01-21 10:48                           ` Greg Kroah-Hartman
  2019-01-22 15:27                         ` Kernel development process (was: [PATCH] fs: ratelimit __find_get_block_slow() failure message.) Dmitry Vyukov
  1 sibling, 1 reply; 31+ messages in thread
From: Tetsuo Handa @ 2019-01-21 10:33 UTC (permalink / raw)
  To: Jan Kara, Dmitry Vyukov
  Cc: Greg Kroah-Hartman, Linus Torvalds, Kees Cook, Al Viro,
	linux-fsdevel, Kostya Serebryany, Andrew Morton

On 2019/01/21 17:37, Jan Kara wrote:
> On Thu 17-01-19 14:18:56, Dmitry Vyukov wrote:
>> Also a good question is how many other my patches were lost that I now
>> have no idea about? I discovered this one by pure accident.
> 
> Well, I do keep track of my own submitted patches in my git tree and
> occasionally sweep through it and resubmit / ping about lost ones. But yes,
> it requires some motivation and self-discipline which is not always present
> for drive-by contributions.

I use quilt. quilt is simple and easier for me than git. You can maintain a
file named "series" and do "quilt push -a" && "quilt pop -a" for automatic
checking. By the way, isn't Andrew Morton still using quilt?


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message.
  2019-01-21 10:33                         ` Tetsuo Handa
@ 2019-01-21 10:48                           ` Greg Kroah-Hartman
  0 siblings, 0 replies; 31+ messages in thread
From: Greg Kroah-Hartman @ 2019-01-21 10:48 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Jan Kara, Dmitry Vyukov, Linus Torvalds, Kees Cook, Al Viro,
	linux-fsdevel, Kostya Serebryany, Andrew Morton

On Mon, Jan 21, 2019 at 07:33:13PM +0900, Tetsuo Handa wrote:
> On 2019/01/21 17:37, Jan Kara wrote:
> > On Thu 17-01-19 14:18:56, Dmitry Vyukov wrote:
> >> Also a good question is how many other my patches were lost that I now
> >> have no idea about? I discovered this one by pure accident.
> > 
> > Well, I do keep track of my own submitted patches in my git tree and
> > occasionally sweep through it and resubmit / ping about lost ones. But yes,
> > it requires some motivation and self-discipline which is not always present
> > for drive-by contributions.
> 
> I use quilt. quilt is simple and easier for me than git. You can maintain a
> file named "series" and do "quilt push -a" && "quilt pop -a" for automatic
> checking. By the way, isn't Andrew Morton still using quilt?
> 

I think Andrew's scripts pre-date quilt, they are what got turned into
quilt.

I use quilt daily for all stable kernel work, and for my personal kernel
development as well.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Kernel development process (was: [PATCH] fs: ratelimit __find_get_block_slow() failure message.)
  2019-01-21  8:37                       ` Jan Kara
  2019-01-21 10:33                         ` Tetsuo Handa
@ 2019-01-22 15:27                         ` Dmitry Vyukov
  2019-01-22 17:15                           ` Jan Kara
  1 sibling, 1 reply; 31+ messages in thread
From: Dmitry Vyukov @ 2019-01-22 15:27 UTC (permalink / raw)
  To: Jan Kara
  Cc: Greg Kroah-Hartman, Linus Torvalds, Kees Cook, Tetsuo Handa,
	Al Viro, linux-fsdevel, Kostya Serebryany, LKML, Daniel Vetter

[-- Attachment #1: Type: text/plain, Size: 7993 bytes --]

On Mon, Jan 21, 2019 at 9:37 AM Jan Kara <jack@suse.cz> wrote:
>
> On Thu 17-01-19 14:18:56, Dmitry Vyukov wrote:
> > On Wed, Jan 16, 2019 at 5:28 PM Greg Kroah-Hartman
> > <gregkh@linuxfoundation.org> wrote:
> > >
> > > On Wed, Jan 16, 2019 at 12:48:41PM +0100, Dmitry Vyukov wrote:
> > > > On Wed, Jan 16, 2019 at 12:03 PM Dmitry Vyukov <dvyukov@google.com> wrote:
> > > > I wanted to provide a hash/link to this commit but, wait, you want to
> > > > say that this patch for a security bugs was mailed, recorded by
> > > > patchwork, acked by subsystem developer and then dropped on the floor
> > > > for 3+ years? Doh!
> > > >
> > > > https://lore.kernel.org/patchwork/patch/599779/
> > > >
> > > > There are known ways how to make this not a thing at all. Like open
> > > > pull requests on github:
> > > > https://github.com/google/syzkaller/pulls
> > > > or, some projects even do own dashboard for this:
> > > > https://dev.golang.org/reviews
> > > > because this is important. Especially for new contributors, drive-by
> > > > improvements, good samaritan fixes, etc.
> > > >
> > > > Another example: a bug-fixing patch was lost for 2 years:
> > > > "Two years ago ;) I don't understand why there were ignored"
> > > > https://www.spinics.net/lists/linux-mm/msg161351.html
> > > >
> > > > Another example: a patch is applied to a subsystem tree and then lost
> > > > for 6 months:
> > > > https://patchwork.kernel.org/patch/10339089/
> > >
> > > I don't understand the issue here.  Are you saying that sometimes
> > > patches that have been submitted get dropped?  Yes, that's known, it is
> > > up to the submitter to verify and ensure that the patch is applied.
> > > Given our rate of change and the large workload that some maintainers
> > > have, this is the best that we can do at the moment.
> > >
> > > Putting it all in a github dashboard would not scale in the least (other
> > > projects smaller than us have tried and ended up stopping from doing
> > > that as it fails horribly).
> > >
> > > Yes, we can always do better, but remember that the submitter needs to
> > > take the time to ensure that their patches are applied.  Heck, I have
> > > patches submitted months ago that I know the maintainers ignored, and I
> > > need to remember to send them again.  We put the burden of development
> > > on the thing that scales, the developer themselves, not the maintainer
> > > here.
> > >
> > > It's the best that we know of how to do at the moment, and we are always
> > > trying to do better.  Examples of this are where some subsystems are now
> > > getting multiple maintainers to handle the workload, and that's helping
> > > a lot.  That doesn't work for all subsystems as not all subsystems can
> > > even find more than one maintainer who is willing to look at the
> > > patches.
> >
> > The issue here is that patches are lost and "up to the submitter" is
> > not fully working.
> > It may be working reasonably well when a developer has an official
> > assignment at work to do thing X, and then they can't miss/forget
> > about "is thing X merged yet". But it fails for new contributors,
> > drive-by improvements, good samaritan fixes, etc. Things that we need
> > no less than the first category (maybe more).
> > Machines are always better than humans at such scrupulous tracking
> > work. So if humans can do it, machines will do even better.
> > The dashboard definitely needs to be sharded in multiple dimensions.
> > E.g. "per subsystem", "per assigned reviewer", and even "per author".
> > Because e.g. how may mine are lost? Only this one or more? How many
> > yours are lost? Do you know?
> > I am sure this is doable and beneficial. I don't know why other
> > projects failed with this, maybe that's something with github. But
> > there are also codebases that are 100x larger than kernel and do
> > amount of changes kernel receives in a year in less than a week and
> > nothing gets lots thanks to scalable processes and automation.
>
> Out of curiosity which ones?

I mean in particular Google codebase [1] but I think Facebook [2],
Chromium [3], Rust [4], Go processes share lots of the same
principles. Overall idea is process unification and automation and
building more complex functions on top of lower-level functions. This
allows to move very fast at very large scale and at the same time
preserving very high code quality (as required by and proven by
continuous delivery).

I feel that perhaps I failed to explain the larger picture assuming
that it's common knowledge, but perhaps it's not, so I draw this
1-pager diagram how functions build on top of functions and all fit
together:

https://docs.google.com/presentation/d/e/2PACX-1vRq2SdmiP-wqUb3Xo2drgn48bw2HbyGqFPP-ebfTfn6eNZkHSRwKZKRBAT6K3E3Ra9IJ218ZqRxvmfG/pub
(also attached if you prefer a download)

The goal is not to say that this is the only true way of doing things
or that we need all of this, but to show that higher-level nice things
can't be built without proper lower-level foundation. We all agree on
few lowest level things (like git and C), which is good and already
brings tremendous benefits. But it really feels to me that at the
current kernel scale and fundamentality we need the next layer of
common building blocks in the process: things like change tracking (in
particular, patches that can be reliably applied) and tests (that are
easy to add, discoverer, run locally and on CI). And to really work as
foundation these things need to be agreed on as being "the solution"
(e.g. "all kernel changes go through patchwork") rather then "being
allowed to be used by fragmented groups if they want".

[1] https://cacm.acm.org/magazines/2016/7/204032-why-google-stores-billions-of-lines-of-code-in-a-single-repository/fulltext
[2] https://framethink.wordpress.com/2011/01/17/how-facebook-ships-code/
[3] https://www.youtube.com/watch?v=dIageYT0Vgg
[4] https://www.chromium.org/developers

> > > Please, resubmit your mount patch again, that's a crazy bug :)
> >
> > That's the problem. It now requires way more additional work and it's
> > even unclear if the problem is still there or not, the code has
> > radically changed. It could have been merged back then as is with 0
> > additional work. I could have been updated it a week after original
> > submission. But now that's completely paged out, I am looking at that
> > file and I don't recognize anything, no idea how the patch should be
> > updated, I now have no idea what tree such patch should be based on,
> > etc.
>
> Yeah, it's painful at times... I know it as well.
>
> > Also a good question is how many other my patches were lost that I now
> > have no idea about? I discovered this one by pure accident.
>
> Well, I do keep track of my own submitted patches in my git tree and
> occasionally sweep through it and resubmit / ping about lost ones. But yes,
> it requires some motivation and self-discipline which is not always present
> for drive-by contributions.
>
> > To make things more constructive: say, if somebody offers to build
> > such a system for kernel, in accordance with kernel specific
> > requirements (would also enable presubmit testing, recording base
> > commit, not losing review comments and other nice things), would you,
> > Linus (not sure who is in change of such decisions) be willing to
> > integrate it into the official kernel development process so that
> > everybody use it for all kernel changes?
>
> Well, we do have patchwork and some subsystems use it. It doesn't have
> pre-submit testing or other fancy features but it is good enough so that
> patches don't get lost. And Konstantin (kernel.org admin) did quite some
> good work recently with automating lots of tedious tasks with patchwork. So
> at this point I think it's more about the fact that some maintainers prefer
> to work differently rather than the lack of tooling as such.
>
>                                                                 Honza
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR

[-- Attachment #2: devtools.pdf --]
[-- Type: application/pdf, Size: 62218 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Kernel development process (was: [PATCH] fs: ratelimit __find_get_block_slow() failure message.)
  2019-01-22 15:27                         ` Kernel development process (was: [PATCH] fs: ratelimit __find_get_block_slow() failure message.) Dmitry Vyukov
@ 2019-01-22 17:15                           ` Jan Kara
  0 siblings, 0 replies; 31+ messages in thread
From: Jan Kara @ 2019-01-22 17:15 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Jan Kara, Greg Kroah-Hartman, Linus Torvalds, Kees Cook,
	Tetsuo Handa, Al Viro, linux-fsdevel, Kostya Serebryany, LKML,
	Daniel Vetter

On Tue 22-01-19 16:27:53, Dmitry Vyukov wrote:
> On Mon, Jan 21, 2019 at 9:37 AM Jan Kara <jack@suse.cz> wrote:
> >
> > On Thu 17-01-19 14:18:56, Dmitry Vyukov wrote:
> > > On Wed, Jan 16, 2019 at 5:28 PM Greg Kroah-Hartman
> > > <gregkh@linuxfoundation.org> wrote:
> > > >
> > > > On Wed, Jan 16, 2019 at 12:48:41PM +0100, Dmitry Vyukov wrote:
> > > > > On Wed, Jan 16, 2019 at 12:03 PM Dmitry Vyukov <dvyukov@google.com> wrote:
> > > > > I wanted to provide a hash/link to this commit but, wait, you want to
> > > > > say that this patch for a security bugs was mailed, recorded by
> > > > > patchwork, acked by subsystem developer and then dropped on the floor
> > > > > for 3+ years? Doh!
> > > > >
> > > > > https://lore.kernel.org/patchwork/patch/599779/
> > > > >
> > > > > There are known ways how to make this not a thing at all. Like open
> > > > > pull requests on github:
> > > > > https://github.com/google/syzkaller/pulls
> > > > > or, some projects even do own dashboard for this:
> > > > > https://dev.golang.org/reviews
> > > > > because this is important. Especially for new contributors, drive-by
> > > > > improvements, good samaritan fixes, etc.
> > > > >
> > > > > Another example: a bug-fixing patch was lost for 2 years:
> > > > > "Two years ago ;) I don't understand why there were ignored"
> > > > > https://www.spinics.net/lists/linux-mm/msg161351.html
> > > > >
> > > > > Another example: a patch is applied to a subsystem tree and then lost
> > > > > for 6 months:
> > > > > https://patchwork.kernel.org/patch/10339089/
> > > >
> > > > I don't understand the issue here.  Are you saying that sometimes
> > > > patches that have been submitted get dropped?  Yes, that's known, it is
> > > > up to the submitter to verify and ensure that the patch is applied.
> > > > Given our rate of change and the large workload that some maintainers
> > > > have, this is the best that we can do at the moment.
> > > >
> > > > Putting it all in a github dashboard would not scale in the least (other
> > > > projects smaller than us have tried and ended up stopping from doing
> > > > that as it fails horribly).
> > > >
> > > > Yes, we can always do better, but remember that the submitter needs to
> > > > take the time to ensure that their patches are applied.  Heck, I have
> > > > patches submitted months ago that I know the maintainers ignored, and I
> > > > need to remember to send them again.  We put the burden of development
> > > > on the thing that scales, the developer themselves, not the maintainer
> > > > here.
> > > >
> > > > It's the best that we know of how to do at the moment, and we are always
> > > > trying to do better.  Examples of this are where some subsystems are now
> > > > getting multiple maintainers to handle the workload, and that's helping
> > > > a lot.  That doesn't work for all subsystems as not all subsystems can
> > > > even find more than one maintainer who is willing to look at the
> > > > patches.
> > >
> > > The issue here is that patches are lost and "up to the submitter" is
> > > not fully working.
> > > It may be working reasonably well when a developer has an official
> > > assignment at work to do thing X, and then they can't miss/forget
> > > about "is thing X merged yet". But it fails for new contributors,
> > > drive-by improvements, good samaritan fixes, etc. Things that we need
> > > no less than the first category (maybe more).
> > > Machines are always better than humans at such scrupulous tracking
> > > work. So if humans can do it, machines will do even better.
> > > The dashboard definitely needs to be sharded in multiple dimensions.
> > > E.g. "per subsystem", "per assigned reviewer", and even "per author".
> > > Because e.g. how may mine are lost? Only this one or more? How many
> > > yours are lost? Do you know?
> > > I am sure this is doable and beneficial. I don't know why other
> > > projects failed with this, maybe that's something with github. But
> > > there are also codebases that are 100x larger than kernel and do
> > > amount of changes kernel receives in a year in less than a week and
> > > nothing gets lots thanks to scalable processes and automation.
> >
> > Out of curiosity which ones?
> 
> I mean in particular Google codebase [1] but I think Facebook [2],
> Chromium [3], Rust [4], Go processes share lots of the same
> principles. Overall idea is process unification and automation and
> building more complex functions on top of lower-level functions. This
> allows to move very fast at very large scale and at the same time
> preserving very high code quality (as required by and proven by
> continuous delivery).
> 
> I feel that perhaps I failed to explain the larger picture assuming
> that it's common knowledge, but perhaps it's not, so I draw this
> 1-pager diagram how functions build on top of functions and all fit
> together:
> 
> https://docs.google.com/presentation/d/e/2PACX-1vRq2SdmiP-wqUb3Xo2drgn48bw2HbyGqFPP-ebfTfn6eNZkHSRwKZKRBAT6K3E3Ra9IJ218ZqRxvmfG/pub
> (also attached if you prefer a download)

Thanks for drawing this and for the references! I know these things in
principle but the image certainly helps in knowing what your are talking
about exactly.

> The goal is not to say that this is the only true way of doing things
> or that we need all of this, but to show that higher-level nice things
> can't be built without proper lower-level foundation. We all agree on
> few lowest level things (like git and C), which is good and already
> brings tremendous benefits. But it really feels to me that at the
> current kernel scale and fundamentality we need the next layer of
> common building blocks in the process: things like change tracking (in
> particular, patches that can be reliably applied) and tests (that are
> easy to add, discoverer, run locally and on CI). And to really work as
> foundation these things need to be agreed on as being "the solution"
> (e.g. "all kernel changes go through patchwork") rather then "being
> allowed to be used by fragmented groups if they want".

Understood. I guess eventually we may get to something like that but at
least as far as I can observe current efforts, trying to change something
in the kernel development process is like herding cats. You need to offer
big enough bowl of cream ;).

> [1] https://cacm.acm.org/magazines/2016/7/204032-why-google-stores-billions-of-lines-of-code-in-a-single-repository/fulltext
> [2] https://framethink.wordpress.com/2011/01/17/how-facebook-ships-code/
> [3] https://www.youtube.com/watch?v=dIageYT0Vgg
> [4] https://www.chromium.org/developers

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2019-01-22 17:16 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-11 10:10 [PATCH] fs: ratelimit __find_get_block_slow() failure message Tetsuo Handa
2019-01-11 10:19 ` Dmitry Vyukov
     [not found]   ` <04c6d87c-fc26-b994-3b34-947414984abe@i-love.sakura.ne.jp>
2019-01-11 10:40     ` Dmitry Vyukov
2019-01-11 10:48       ` Dmitry Vyukov
2019-01-11 12:46         ` Tetsuo Handa
2019-01-16  9:47           ` Dmitry Vyukov
2019-01-16 10:43             ` Jan Kara
2019-01-16 11:03               ` Dmitry Vyukov
2019-01-16 11:48                 ` Dmitry Vyukov
2019-01-16 16:28                   ` Greg Kroah-Hartman
2019-01-17 13:18                     ` Dmitry Vyukov
2019-01-21  8:37                       ` Jan Kara
2019-01-21 10:33                         ` Tetsuo Handa
2019-01-21 10:48                           ` Greg Kroah-Hartman
2019-01-22 15:27                         ` Kernel development process (was: [PATCH] fs: ratelimit __find_get_block_slow() failure message.) Dmitry Vyukov
2019-01-22 17:15                           ` Jan Kara
2019-01-16 11:56                 ` [PATCH] fs: ratelimit __find_get_block_slow() failure message Jan Kara
2019-01-16 12:37                   ` Dmitry Vyukov
2019-01-16 12:37                     ` Dmitry Vyukov
2019-01-16 14:51                     ` Jan Kara
2019-01-16 14:51                       ` Jan Kara
2019-01-16 15:33                       ` Dmitry Vyukov
2019-01-16 15:33                         ` Dmitry Vyukov
2019-01-16 16:15                         ` Paul E. McKenney
2019-01-16 16:15                           ` Paul E. McKenney
2019-01-17 14:11               ` Tetsuo Handa
2019-01-18 15:30                 ` Dmitry Vyukov
2019-01-11 10:51       ` Tetsuo Handa
2019-01-11 11:03 ` Jan Kara
2019-01-11 11:37   ` [PATCH v2] " Tetsuo Handa
2019-01-21  8:57     ` Jan Kara

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).