From: Thomas Gleixner <tglx@linutronix.de>
To: Chen Gang <gang.chen@asianux.com>
Cc: Tejun Heo <tj@kernel.org>, Oleg Nesterov <oleg@redhat.com>,
	laijs@cn.fujitsu.com, Andrew Morton <akpm@linux-foundation.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] kernel/kthread.c: need spin_lock_irq() for 'worker' before main looping, since it can "WARN_ON(worker->task)".
Date: Thu, 20 Jun 2013 10:28:49 +0200 (CEST)
Message-ID: <alpine.DEB.2.02.1306200951090.4013@ionos.tec.linutronix.de>
In-Reply-To: <51C2B157.40806@asianux.com>

On Thu, 20 Jun 2013, Chen Gang wrote:

> On 06/20/2013 03:02 PM, Thomas Gleixner wrote:
> > On Thu, 20 Jun 2013, Chen Gang wrote:
> > 
> >> > On 06/19/2013 11:52 PM, Tejun Heo wrote:
> >>> > > On Wed, Jun 19, 2013 at 06:17:36PM +0800, Chen Gang wrote:
> >>>>> > >> > Hmm... can 'worker->task' has chance to be not NULL before set 'current'
> >>>>> > >> > to it ?
> >>> > > Yes, if the caller screws up and try to attach more than one workers
> >>> > > to the kthread_worker, which has some possibility of happening as
> >>> > > kthread_worker allows both attaching and detaching a worker.
> >>> > > 
> >> > 
> >> > If we detect the bugs, and still want to use WARN_ON() to report warning
> >> > and continue running, we need be sure of keeping the related things no
> >> > touch (at least not lead to worse).
> >> > 
> >> > If we can not be sure of keeping the related things no touch:
> >> >   if it is a kernel bug, better use BUG_ON() instead of,
> >> >   if it is a user mode bug, better to return failure with error code and
> >> > print related information.
> > Wrong. BUG_ON() is only for cases where the kernel CANNOT continue at
> > all. WARN_ON() prints the very same information, but allows to
> > continue.
> > 
> 
> In fact, BUG_ON() and WARN_ON() has various implementations in different
> architectures, and also can be configured by user.

And how is that relevant? 
 
> Even some of 'crazy users' (e.g. randconfig), can make BUG_ON() and
> WARN_ON() 'empty' (include/asm-generic/bug.h).

That does not matter at all.
 
> In my experience (mainly for servers), when find a kernel bug, it will
> stop and report bug, that will let coredump analysing (or KDB trap) much
> easier.

And your core dump will help you in what way? The code which
misbehaved is no longer executing. The problem is detected after the
fact and therefore your coredump will just tell you that worker->task
is not NULL.
 
> >> > BUG_ON() will stop current working flow and report kernel bug in details.
> > There is no reason to crash the machine completely. The kernel can
> > continue and the WARN_ON reports the bug with the same details.

Linus said about BUG_ON():

  Adding BUG_ON()'s just makes things much much much worse. There is
  *never* a reason to add a BUG_ON().
  
  BUG_ON() makes it almost impossible to debug something, because you
  just killed the machine. So using BUG_ON() for "please notice this"
  is stupid as hell, because the most common end result is: "Oh, the
  machine just hung with no messages".

And he is right about that. 
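
To make the difference concrete, here is a minimal userspace sketch. It
is not the kernel's implementation: SKETCH_WARN_ON(), SKETCH_BUG_ON(),
struct sketch_worker, sketch_attach() and sketch_demo() are hypothetical
names made up for illustration. The warning variant reports and returns
to the caller, while the bug variant reports and terminates.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel's macros. */
static int sketch_warn_count;

/* Report the condition, count it, and hand control back to the caller. */
static int sketch_warn(int cond, const char *expr, const char *file, int line)
{
	if (cond) {
		sketch_warn_count++;
		fprintf(stderr, "WARNING: %s at %s:%d\n", expr, file, line);
	}
	return cond;
}
#define SKETCH_WARN_ON(c) sketch_warn(!!(c), #c, __FILE__, __LINE__)

/* Report the condition and stop; nothing after this point runs. */
#define SKETCH_BUG_ON(c) \
	do { \
		if (c) { \
			fprintf(stderr, "BUG: %s at %s:%d\n", #c, __FILE__, __LINE__); \
			abort(); \
		} \
	} while (0)

/* Assumed stand-in for the worker; only the field under discussion. */
struct sketch_worker {
	void *task;
};

/* Attach a task; a second attach is reported but execution goes on. */
static int sketch_attach(struct sketch_worker *w, void *task)
{
	int warned = SKETCH_WARN_ON(w->task != NULL);

	w->task = task;
	return warned;
}

/* Attach twice and return how many warnings were raised on the way. */
static int sketch_demo(void)
{
	struct sketch_worker w = { NULL };
	int first, second;

	sketch_warn_count = 0;
	sketch_attach(&w, &first);
	sketch_attach(&w, &second);	/* misuse: worker already has a task */
	assert(w.task == &second);	/* we got here, so we kept running */
	return sketch_warn_count;
}
```

Swapping SKETCH_WARN_ON() for SKETCH_BUG_ON() in sketch_attach() would
make the second attach abort the process, so the caller would never see
the state after the misuse.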
 
> If so (we still prefer to use WARN_ON), we'd better to let it in lock
> protected.

No, because the lock is not protecting anything in that case. If some
other code misbehaves and sets worker->task, then the lock does not
prevent this and taking the lock is not making the WARN_ON any more
reliable. So why the heck should we take it?
 
> At least when we still have to continue, try not to lead things worse.

And what's going to be better if we take the lock? Nothing, because
the lock CANNOT protect the check.
 
> It will provide much help for coredump analysing (or KDB trap).
> 
> In fact, for coredump analysers, for every real world coredump, they
> have to assume the system has already continued blindly, and then die.

Core dump analysers cannot analyse dynamic race conditions and neither
can KDB. 

So what do you gain from crashing the kernel? Exactly NOTHING.

Thanks,

	tglx

Thread overview: 10+ messages
2013-06-19  4:03 [PATCH] kernel/kthread.c: need spin_lock_irq() for 'worker' before main looping, since it can "WARN_ON(worker->task)" Chen Gang
2013-06-19  8:41 ` Tejun Heo
2013-06-19 10:17   ` Chen Gang
2013-06-19 15:52     ` Tejun Heo
2013-06-20  1:53       ` Chen Gang
2013-06-20  7:02         ` Thomas Gleixner
2013-06-20  7:37           ` Chen Gang
2013-06-20  8:28             ` Thomas Gleixner [this message]
2013-06-20  9:36               ` Chen Gang
2013-06-19  8:43 ` Thomas Gleixner
