From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751420AbcCFLGT (ORCPT ); Sun, 6 Mar 2016 06:06:19 -0500 Received: from www262.sakura.ne.jp ([202.181.97.72]:41317 "EHLO www262.sakura.ne.jp" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751203AbcCFLGI (ORCPT ); Sun, 6 Mar 2016 06:06:08 -0500 To: sergey.senozhatsky.work@gmail.com Cc: akpm@linux-foundation.org, jack@suse.com, pmladek@suse.com, tj@kernel.org, linux-kernel@vger.kernel.org, sergey.senozhatsky@gmail.com, jack@suse.cz Subject: Re: [RFC][PATCH v2 1/2] printk: Make printk() completely async From: Tetsuo Handa References: <1457175338-1665-1-git-send-email-sergey.senozhatsky@gmail.com> <1457175338-1665-2-git-send-email-sergey.senozhatsky@gmail.com> <20160306063251.GA493@swordfish> <201603061618.GED43232.MtOQOFSLOFHJFV@I-love.SAKURA.ne.jp> <20160306093530.GA26055@swordfish> In-Reply-To: <20160306093530.GA26055@swordfish> Message-Id: <201603062006.IJD17667.OOQFLtMVHOFSJF@I-love.SAKURA.ne.jp> X-Mailer: Winbiff [Version 2.51 PL2] X-Accept-Language: ja,en,zh Date: Sun, 6 Mar 2016 20:06:03 +0900 Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Sergey Senozhatsky wrote: > On (03/06/16 16:18), Tetsuo Handa wrote: > > Moreover, I don't like use of a workqueue even if printk_wq was allocated > > with WQ_MEM_RECLAIM bit. As you can see in the discussion of the OOM reaper, > > the OOM reaper chose a dedicated kernel thread rather than a workqueue > > ( http://lkml.kernel.org/r/1454505240-23446-2-git-send-email-mhocko@kernel.org ). > > > > Blocking actual printing until ongoing workqueue item calls schedule_timeout_*() > > is not nice (see commit 373ccbe59270 and 564e81a57f97). > > do you mean a new worker allocation delay and a MAYDAY timer delay? > I don't know what MAYDAY is. I'm talking about a situation where printing_work work item is not processed (i.e. printing_work_func() is not called) until current work item calls schedule_timeout_*(). We had a problem that since vmstat_work work item was using system_wq, vmstat_work work item was not processed (i.e. vmstat_update() was not called) if kworker was looping inside memory allocator without calling schedule_timeout_*() due to disk_events_workfn() doing GFP_NOIO allocation). ---------- [ 271.579276] MemAlloc: kworker/0:56(7399) gfp=0x2400000 order=0 delay=129294 [ 271.581237] ffff88007c78fa08 ffff8800778f8c80 ffff88007c790000 ffff8800778f8c80 [ 271.583329] 0000000002400000 0000000000000000 ffff8800778f8c80 ffff88007c78fa20 [ 271.585391] ffffffff8162aa9d 0000000000000001 ffff88007c78fa30 ffffffff8162aac7 [ 271.587463] Call Trace: [ 271.588512] [] preempt_schedule_common+0x18/0x2b [ 271.590243] [] _cond_resched+0x17/0x20 [ 271.591830] [] __alloc_pages_nodemask+0x64e/0xcc0 [ 271.593561] [] ? __kmalloc+0x22/0x190 [ 271.595119] [] alloc_pages_current+0x87/0x110 [ 271.596778] [] bio_copy_kern+0xc4/0x180 [ 271.598342] [] ? wait_woken+0x80/0x80 [ 271.599878] [] blk_rq_map_kern+0x70/0x130 [ 271.601481] [] ? blk_get_request+0x75/0xe0 [ 271.603100] [] scsi_execute+0x12d/0x160 [ 271.604657] [] scsi_execute_req_flags+0x84/0xf0 [ 271.606339] [] sr_check_events+0xb2/0x2a0 [sr_mod] [ 271.608141] [] ? set_next_entity+0x6c/0x6a0 [ 271.609830] [] cdrom_check_events+0x13/0x30 [cdrom] [ 271.611610] [] sr_block_check_events+0x25/0x30 [sr_mod] [ 271.613429] [] disk_check_events+0x5b/0x150 [ 271.615065] [] disk_events_workfn+0x11/0x20 [ 271.616699] [] process_one_work+0x135/0x310 [ 271.618321] [] worker_thread+0x11b/0x4a0 [ 271.620018] [] ? process_one_work+0x310/0x310 [ 271.622022] [] kthread+0xd3/0xf0 [ 271.623533] [] ? kthread_create_on_node+0x1a0/0x1a0 [ 271.625487] [] ret_from_fork+0x3f/0x70 [ 271.627175] [] ? kthread_create_on_node+0x1a0/0x1a0 ---------- Even if you use printk_wq with WQ_MEM_RECLAIM for printing_work work item, printing_work_func() will not be called until current work item calls schedule_timeout_*(). That will be an undesirable random delay. If you use a dedicated kernel thread rather than a dedicated workqueue with WQ_MEM_RECLAIM, we can avoid this random delay. > > > +static void printing_work_func(struct work_struct *work) > > > +{ > > > + console_lock(); > > > + console_unlock(); > > > +} > > > > Is this safe? If somebody invokes the OOM killer between console_lock() > > and console_unlock(), won't this cause OOM killer messages not printed? > > if you mean something like > > console_lock(); > for (...) { > do_foo() { > ... > pr_err(" ... foo message ...\n"); > ... > } > } > console_unlock(); > > then yes, nothing will be printed until that process executes console_unlock(), > because it's console_unlock() that pushes the messages to console drivers. Yes, I meant a sequence like console_lock(); ptr = kmalloc(GFP_KERNEL); kfree(ptr); console_unlock(); and kmalloc() prints OOM killer messages rather than failing that allocation. Are we sure that there is no such usage?