linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: sergey.senozhatsky.work@gmail.com, akpm@linux-foundation.org,
	jack@suse.com, pmladek@suse.com, tj@kernel.org,
	linux-kernel@vger.kernel.org, sergey.senozhatsky@gmail.com,
	jack@suse.cz
Subject: Re: [RFC][PATCH v2 1/2] printk: Make printk() completely async
Date: Sun, 6 Mar 2016 22:27:03 +0900	[thread overview]
Message-ID: <20160306132703.GA927@swordfish> (raw)
In-Reply-To: <201603062006.IJD17667.OOQFLtMVHOFSJF@I-love.SAKURA.ne.jp>

On (03/06/16 20:06), Tetsuo Handa wrote:
[..]
> > do you mean a new worker allocation delay and a MAYDAY timer delay?
> > 
> 
> I don't know what MAYDAY is. I'm talking about a situation where printing_work
> work item is not processed (i.e. printing_work_func() is not called) until
> current work item calls schedule_timeout_*().
> 
> We had a problem that since vmstat_work work item was using system_wq,
> vmstat_work work item was not processed (i.e. vmstat_update() was not called) if
> kworker was looping inside memory allocator without calling schedule_timeout_*()
> due to disk_events_workfn() doing GFP_NOIO allocation).

hm, just for note, none of system-wide wqs seem to have a ->rescuer thread
(WQ_MEM_RECLAIM).

[..]
> Even if you use printk_wq with WQ_MEM_RECLAIM for printing_work work item,
> printing_work_func() will not be called until current work item calls
> schedule_timeout_*(). That will be an undesirable random delay. If you use
> a dedicated kernel thread rather than a dedicated workqueue with WQ_MEM_RECLAIM,
> we can avoid this random delay.

hm. yes, seems that it may take some time until workqueue wakeup() a ->rescuer thread.
need to look more.

[..]
> > 	console_lock();
> > 	for (...) {
> > 		do_foo() {
> > 			...
> > 				pr_err(" ... foo message ...\n");
> > 			...
> > 		}
> > 	}
> > 	console_unlock();
> > 
> > then yes, nothing will be printed until that process executes console_unlock(),
> > because it's console_unlock() that pushes the messages to console drivers.
> 
> Yes, I meant a sequence like
> 
>   console_lock();
>   ptr = kmalloc(GFP_KERNEL);
>   kfree(ptr);
>   console_unlock();
> 
> and kmalloc() prints OOM killer messages rather than failing that allocation.
> Are we sure that there is no such usage?

such usage is quite possible.

problems that I have with console_lock()/console_unlock() is that
these functions serve a double purpose: exclusive printk() lock and a
console_drivers list lock.

**** I haven't really thought about it yet, but I want to split it. ****

console_lock()/console_unlock() can be executed by user space processes
(inside system calls). for example:

SyS_open()
 ...
  chrdev_open()
   tty_open()
    console_device()
     console_lock()
     console_unlock()
      for (;;) {
       call_console_drivers()
      }

or doing `cat /proc/consoles`

SyS_read()
 vfs_read()
  proc_reg_read()
   seq_read()
    c_stop()
     console_unlock()
      for (;;) {
       call_console_drivers()
      }

which can introduce two nasty problems:

1) console_lock() may put user space process in TASK_UNINTERRUPTIBLE for
   unknown period of time -- until current console_sem owner will not
   finish print loop in console_unlock(). no signals, heartbeats, etc.
   will be processed by this user space process.

2) user space process may have to spend an unknown period of time in
   console_unlock() later, pushing "3rd party" kernel messages to
   console drivers. again, not really good.

(kthreads can suffer here too, sure).


in the examples above, a process just wanted to iterate the console_drivers
list in read access mode. so, for instance, in

struct tty_driver *console_device(int *index)
{
        struct console *c;
        struct tty_driver *driver = NULL;

        console_lock();
        for_each_console(c) {
                if (!c->device)
                        continue;
                driver = c->device(c, index);
                if (driver)
                        break;
        }
        console_unlock();
        return driver;
}

instead of console_lock()/console_unlock()->call_console_drivers()
it could do (very schematically)

	read_lock_console();
          for_each_console(c) {
                  if (!c->device)
                          continue;
                  driver = c->device(c, index);
                  if (driver)
                          break;
          }
	read_unlock_console();

and in functions that modify the list, the lock can be acquired in
write mode. example,

int unregister_console(struct console *console)
{
        write_lock_console();
...
                for (a=console_drivers->next, b=console_drivers ;
                     a; b=a, a=b->next) {
                        if (a == console) {
                                b->next = a->next;
                                res = 0;
                                break;
                        }
                }
        }
...
        if (console_drivers != NULL && console->flags & CON_CONSDEV)
                console_drivers->flags |= CON_CONSDEV;

        console->flags &= ~CON_ENABLED;
        write_unlock_unlock();



printk(), thus, will take its own "another" exclusive lock, to guarantee
that only one process can call_console_drivers(), and it will take the
console_drivers list lock in read mode. so other process(-es) that also
want to access console_drivers list in read mode will not wait in
TASK_UNINTERRUPTIBLE.

of course, this means that console_unlock() now does not print
anything to console drivers. it's printk() duty to do this.

and the last statement can be very hard to sell; because I don't know
for sure if there is a code in the kernel that depends on the fact that
console_lock() stops printk() and that console_unlock() prints all of
the printk messages.


that "lock split" also can fix another *theoretical* scenario: suppose,
that for some reason someone's setup has both a) huge number of printk()
calls and b) a relatively huge number of console_lock()/console_unlock()
calls, all happening simultaneously. while async printk helps in (a),
case (b) "detours" the printk async code; and if cpus at lest some of the
cpus that do console_lock()/console_unlock() also disable IRQs or
preemption, then lockups and all the bad things are very much likely.


just an idea.

	-ss

  reply	other threads:[~2016-03-06 13:25 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-03-05 10:55 [RFC][PATCH v2 0/2] printk: Make printk() completely async Sergey Senozhatsky
2016-03-05 10:55 ` [RFC][PATCH v2 1/2] " Sergey Senozhatsky
2016-03-06  6:32   ` Sergey Senozhatsky
2016-03-06  7:18     ` Tetsuo Handa
2016-03-06  9:35       ` Sergey Senozhatsky
2016-03-06 11:06         ` Tetsuo Handa
2016-03-06 13:27           ` Sergey Senozhatsky [this message]
2016-03-06 14:54             ` Tetsuo Handa
2016-03-07  8:22             ` Jan Kara
2016-03-07 10:12               ` Sergey Senozhatsky
2016-03-07 10:52                 ` Jan Kara
2016-03-07 12:16                   ` Jan Kara
2016-03-07 12:37                     ` Tetsuo Handa
2016-03-07 15:10                     ` Sergey Senozhatsky
2016-03-07 15:49                       ` Tejun Heo
2016-03-08 10:21                         ` Sergey Senozhatsky
2016-03-11 17:22                           ` Tejun Heo
2016-03-12  5:01                             ` Sergey Senozhatsky
2016-03-09  6:09                     ` Sergey Senozhatsky
2016-03-10  9:27                       ` Jan Kara
2016-03-10 15:48                         ` Sergey Senozhatsky
2016-03-10  9:53                       ` Petr Mladek
2016-03-10 16:26                         ` Sergey Senozhatsky
2016-03-07 14:40                   ` Sergey Senozhatsky
2016-03-07 11:10                 ` Tetsuo Handa
2016-03-07 14:36                   ` Sergey Senozhatsky
2016-03-07 15:42               ` Tejun Heo
2016-03-05 10:55 ` [RFC][PATCH v2 2/2] printk: Skip messages on oops Sergey Senozhatsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160306132703.GA927@swordfish \
    --to=sergey.senozhatsky.work@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=jack@suse.com \
    --cc=jack@suse.cz \
    --cc=linux-kernel@vger.kernel.org \
    --cc=penguin-kernel@I-love.SAKURA.ne.jp \
    --cc=pmladek@suse.com \
    --cc=sergey.senozhatsky@gmail.com \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).