All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jiri Kosina <jkosina@suse.cz>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Martin Jambor <mjambor@suse.cz>, Petr Mladek <pmladek@suse.cz>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	"gcc@gcc.gnu.org" <gcc@gcc.gnu.org>
Subject: Re: [PATCH] tell gcc optimizer to never introduce new data races
Date: Tue, 10 Jun 2014 20:10:54 +0200 (CEST)	[thread overview]
Message-ID: <alpine.LNX.2.00.1406101959580.1321@pobox.suse.cz> (raw)
In-Reply-To: <CA+55aFyY7_=3xt-0KE=-fmb+kLurxkBmh8TZWgDZMi_RVpKBpg@mail.gmail.com>

On Tue, 10 Jun 2014, Linus Torvalds wrote:

> > We have been chasing a memory corruption bug, which turned out to be
> > caused by very old gcc (4.3.4), which happily turned conditional load into
> > a non-conditional one, and that broke correctness (the condition was met
> > only if lock was held) and corrupted memory.
> 
> Just out of interest, can you point to the particular kernel code that
> caused this? I think that's more interesting than the example program
> you show - which I'm sure is really nice for gcc developers as an
> example, but from a kernel standpoint I think it's more important to
> show the particular problems this caused for the kernel?

Well, as I said, that was with gcc 4.3.4, and GCC people expressed 
themselves that that was a slightly different optimization. It made me 
nervous enough though to ask whether it's absolutely positively not going 
to happen with newer gccs, and the code snippet quoted in the original 
mail came back as a response.

The code in question was out-of-tree printk-in-NMI (yeah, surprise 
suprise, once again) patch written by Petr Mladek, let me quote his 
comment from our internal bugzilla:

===
I have spent few days investigating inconsistent state of kernel ring buffer.
It went out that it was caused by speculative store generated by 
gcc-4.3.4.

The problem is in assembly generated for make_free_space(). The functions is
called the following way:

+ vprintk_emit();
    + log = MAIN_LOG; // with logbuf_lock
       or
       log = NMI_LOG; // with nmi_logbuf_lock
       cont_add(log, ...);
        + cont_flush(log, ...);
            + log_store(log, ...);
                  + log_make_free_space(log, ...);

If called with log = NMI_LOG then only nmi_log_* global variables are safe to
modify but the generated code does store also into (main_)log_* global 
variables:


<log_make_free_space>:
       55                      push   %rbp
       89 f6                   mov    %esi,%esi

       48 8b 05 03 99 51 01    mov    0x1519903(%rip),%rax       # ffffffff82620868 <nmi_log_next_id>
       44 8b 1d ec 98 51 01    mov    0x15198ec(%rip),%r11d      # ffffffff82620858 <log_next_idx>
       8b 35 36 60 14 01       mov    0x1146036(%rip),%esi       # ffffffff8224cfa8 <log_buf_len>
       44 8b 35 33 60 14 01    mov    0x1146033(%rip),%r14d      # ffffffff8224cfac <nmi_log_buf_len>
       4c 8b 2d d0 98 51 01    mov    0x15198d0(%rip),%r13       # ffffffff82620850 <log_next_seq>
       4c 8b 25 11 61 14 01    mov    0x1146111(%rip),%r12       # ffffffff8224d098 <log_buf>
       49 89 c2                mov    %rax,%r10
       48 21 c2                and    %rax,%rdx
       48 8b 1d 0c 99 55 01    mov    0x155990c(%rip),%rbx       # ffffffff826608a0 <nmi_log_buf>
       49 c1 ea 20             shr    $0x20,%r10
       48 89 55 d0             mov    %rdx,-0x30(%rbp)
       44 29 de                sub    %r11d,%esi
       45 29 d6                sub    %r10d,%r14d
       4c 8b 0d 97 98 51 01    mov    0x1519897(%rip),%r9	# ffffffff82620840 <log_first_seq>
       eb 7e                   jmp    ffffffff81107029	<log_make_free_space+0xe9>
[...]
       85 ff                   test   %edi,%edi                  # edi = 1 for NMI_LOG
       4c 89 e8                mov    %r13,%rax
       4c 89 ca                mov    %r9,%rdx
       74 0a                   je     ffffffff8110703d	<log_make_free_space+0xfd>
       8b 15 27 98 51 01       mov    0x1519827(%rip),%edx       # ffffffff82620860 <nmi_log_first_id>
       48 8b 45 d0             mov    -0x30(%rbp),%rax
       48 39 c2                cmp    %rax,%rdx                  # end of loop 
       0f 84 da 00 00 00       je     ffffffff81107120 <log_make_free_space+0x1e0>
[...]
       85 ff                   test   %edi,%edi                  # edi = 1 for NMI_LOG
       4c 89 0d 17 97 51 01    mov    %r9,0x1519717(%rip)        # ffffffff82620840 <log_first_seq>
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^
                               KABOOOM
       74 35                   je     ffffffff81107160		 <log_make_free_space+0x220>


It stores log_first_seq when edi == NMI_LOG. This instructions are used also
when edi == MAIN_LOG but the store is done speculatively before the condition
is decided.  It is unsafe because we do not have "logbuf_lock" in NMI context
and some other process migh modify "log_first_seq" in parallel.
===

I believe that the best course of action is both

- building kernel (and anything multi-threaded, I guess) with that 
  optimization turned off
- persuade gcc folks to change the default for future releases

Thanks,

-- 
Jiri Kosina
SUSE Labs

  parent reply	other threads:[~2014-06-10 18:10 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-06-10 13:23 [PATCH] tell gcc optimizer to never introduce new data races Jiri Kosina
2014-06-10 14:53 ` Peter Zijlstra
2014-06-10 15:04   ` Marek Polacek
2014-06-10 15:13     ` Peter Zijlstra
2014-06-10 15:17       ` Jakub Jelinek
2014-06-10 17:46 ` Linus Torvalds
2014-06-10 18:04   ` Steven Noonan
2014-06-10 18:54     ` Richard Biener
2014-06-10 18:10   ` Jiri Kosina [this message]
2014-06-16 10:29 ` Dan Carpenter
2014-06-16 10:52   ` Andreas Schwab
2014-06-16 11:20     ` Jiri Kosina
2014-06-16 14:37     ` Mark Brown
2014-06-17  7:58   ` Jiri Kosina
2014-08-20 23:27 ` Jiri Kosina

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.LNX.2.00.1406101959580.1321@pobox.suse.cz \
    --to=jkosina@suse.cz \
    --cc=akpm@linux-foundation.org \
    --cc=gcc@gcc.gnu.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mjambor@suse.cz \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=peterz@infradead.org \
    --cc=pmladek@suse.cz \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.