linux-trace-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: contact me <www@aegistudio.net>
To: "Steven Rostedt" <rostedt@goodmis.org>
Cc: "Linus Torvalds" <torvalds@linuxfoundation.org>,
	"Ingo Molnar" <mingo@redhat.com>,
	"linux-trace-devel" <linux-trace-devel@vger.kernel.org>,
	"Security Officers" <security@kernel.org>
Subject: Re: [PATCH 1/1] tracing: fix bug in rb_per_cpu_empty() that might cause deadloop.
Date: Wed, 21 Jul 2021 10:13:01 +0800	[thread overview]
Message-ID: <17ac6d675f2.129c41f20168460.6720846023977971963@aegistudio.net> (raw)
In-Reply-To: <20210720210855.6fdf1190@oasis.local.home>

On Wed, 21 Jul 2021 09:08:55 +0800 Steven Rostedt <rostedt@goodmis.org> wrote:
 > On Tue, 20 Jul 2021 11:35:56 -0700 
 > Linus Torvalds <torvalds@linuxfoundation.org> wrote: 
 >  
 > > On Tue, Jul 20, 2021 at 2:04 AM Haoran Luo <www@aegistudio.net> wrote: 
 > > > 
 > > >         return reader->read == rb_page_commit(reader) && 
 > > >                 (commit == reader || 
 > > >                  (commit == head && 
 > > > -                 head->read == rb_page_commit(commit))); 
 > > > +                 (rb_page_commit(commit) == 0 || 
 > > > +                  head->read == rb_page_commit(commit)))); 
 > > >  } 
 > > 
 > > Can we please re-write that whole thing to be legible? 
 > > 
 > > That conditional was complicated before. It's worse now. 
 >  
 > I agree. I always cringe when I write code like this. For some reason, 
 > I write code I dislike over adding multiple returns. Not sure why I do 
 > that. Probably some childhood trauma. 
 >  
 > I'll add the comments below, as it looks correct. 
 >  
 > > 
 > > For example, just splitting it up would help: 
 > > 
 >  
 >     /* 
 >      * If what has been read on the reader page is not equal 
 >      * to the content on the reader page, then there's more 
 >      * data (not empty). 
 >      */ 
 >  
 > >         if (reader->read != rb_page_commit(reader)) 
 > >                 return false; 
 > > 
 >  
 >     /* 
 >      * The last thing written is on the reader page, and the above 
 >      * statement already shows everything written was read. 
 >      * (empty buffer) 
 >      */ 
 >  
 > >         if (commit == reader) 
 > >                 return true; 
 > > 
 >  
 >     /* 
 >      * The last page written does not equal the last page read, 
 >      * thus there's more data (not empty). 
 >      */ 
 >  
 > >         if (commit != head) 
 > >                 return false; 
 > > 
 >  
 > So, looking at this, I see that the bug is not here. head->read should 
 > never be non-zero. The issue is that we missed resetting "read" to zero 
 > before adding it back to the ring buffer. 
 >  
 >  
 > >         return !rb_page_commit(commit) || 
 > >                 commit->read == rb_page_commit(commit); 
 >  
 > Really, the original code worked because head->read is expected to be 
 > zero. And looking at the code more, it should even work with: 
 >  
 >     /* 
 >      * With the commit == head page (tail == head), and nothing on 
 >      * the commit page there's nothing in the buffer. 
 >      */ 
 >     if (!rb_page_commit(commit)) 
 >         return true; 
 >  
 > > 
 > > but adding comments to each conditional case would also be good. 
 > > 
 > > Note: I checked "head->read" to "commit->read" in that last 
 > > conditional, because we had _just_ verified that "commit" and "head" 
 > > are the same, and it seems to be more logical to check "commit->read" 
 > > with "rb_page_commit(commit)". 
 >  
 > I believe I kept it as "head" to re-enforce that the two are the same. 
 >  
 >  
 > > 
 > > (And somebody should check that I split it right - I'm just doing this 
 > > in the MUA without knowing the code: I hate conditionals that are 
 > > cross multiple lines like this, and I hate them particularly when they 
 > > turn out to be buggy, so when you fix a complex conditional, that's a 
 > > BIG sign to me that the problem was that the conditional was too 
 > > complex to begin with). 
 > > 
 > > Btw, that "rb_page_commit()" forces a new atomic read each time, so 
 > > the "rb_page_commit(commit)" value should probably be cached as a 
 > > variable before it's used twice. 
 > > 
 >  
 > Well, the double read of rb_page_commit() is not needed as the bug is 
 > that read is non zero. And I'm not sure this patch even fixes the bug, 
 > because read is not updated when commit is, thus read is some random 
 > number that could cause this to return a false positive. "read" needed 
 > to be reset when the page was added back to the ring buffer, and it 
 > appears that it was missed (ever though all the other fields were reset). 
 >  
 > Although this patch is wrong, I'm not against a clean up patch to make 
 > the code easier to read. 
 >  
 > I think the real fix is below (not even compiled tested). 
 >  
 > -- Steve 
 >  
 > diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c 
 > index d1463eac11a3..0b9ce927e95f 100644 
 > --- a/kernel/trace/ring_buffer.c 
 > +++ b/kernel/trace/ring_buffer.c 
 > @@ -4445,6 +4445,7 @@ rb_get_reader_page(struct ring_buffer_per_cpu *cpu_buffer) 
 >      local_set(&cpu_buffer->reader_page->write, 0); 
 >      local_set(&cpu_buffer->reader_page->entries, 0); 
 >      local_set(&cpu_buffer->reader_page->page->commit, 0); 
 > +    cpu_buffer->reader_page->read = 0; 
 >      cpu_buffer->reader_page->real_end = 0; 
 >  
 >  spin: 
 > 

I'll have a try and tell you the result. But I've seen the code written on

https://github.com/torvalds/linux/blob/2734d6c1b1a089fb593ef6a23d4b70903526fe0c/kernel/trace/ring_buffer.c#L4513
	cpu_buffer->reader_page->read = 0;

I though the algorithm is to reset the reader_page->read when it swaps with the
head page. And following the original algorithm the read field of a buffer page 
should be either 0 or its original value when the content to read exhaust. So I
added an extra conditional to ensure that.

  reply	other threads:[~2021-07-21  2:14 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-20  8:48 [PATCH 1/1] tracing: fix bug in rb_per_cpu_empty() that might cause deadloop Haoran Luo
2021-07-20 18:35 ` Linus Torvalds
2021-07-20 22:36   ` contact me
2021-07-21  1:10     ` Steven Rostedt
2021-07-21  1:08   ` Steven Rostedt
2021-07-21  2:13     ` contact me [this message]
2021-07-21  2:48       ` Steven Rostedt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=17ac6d675f2.129c41f20168460.6720846023977971963@aegistudio.net \
    --to=www@aegistudio.net \
    --cc=linux-trace-devel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=rostedt@goodmis.org \
    --cc=security@kernel.org \
    --cc=torvalds@linuxfoundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).