From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752324AbeAQHeU (ORCPT + 1 other); Wed, 17 Jan 2018 02:34:20 -0500 Received: from LGEAMRELO12.lge.com ([156.147.23.52]:55119 "EHLO lgeamrelo12.lge.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752226AbeAQHeS (ORCPT ); Wed, 17 Jan 2018 02:34:18 -0500 X-Original-SENDERIP: 156.147.1.127 X-Original-MAILFROM: byungchul.park@lge.com X-Original-SENDERIP: 10.177.222.184 X-Original-MAILFROM: byungchul.park@lge.com Subject: Re: [PATCH v5 1/2] printk: Add console owner and waiter logic to load balance console writes From: Byungchul Park To: Petr Mladek , Steven Rostedt , Sergey Senozhatsky Cc: akpm@linux-foundation.org, linux-mm@kvack.org, Cong Wang , Dave Hansen , Johannes Weiner , Mel Gorman , Michal Hocko , Vlastimil Babka , Peter Zijlstra , Linus Torvalds , Jan Kara , Mathieu Desnoyers , Tetsuo Handa , rostedt@home.goodmis.org, Sergey Senozhatsky , Tejun Heo , Pavel Machek , linux-kernel@vger.kernel.org, kernel-team@lge.com References: <20180110132418.7080-1-pmladek@suse.com> <20180110132418.7080-2-pmladek@suse.com> Message-ID: <9f0ef69d-49e7-abf1-2f61-5f0f44ffcf7b@lge.com> Date: Wed, 17 Jan 2018 16:34:14 +0900 User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.5.2 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Return-Path: On 1/17/2018 11:19 AM, Byungchul Park wrote: > On 1/10/2018 10:24 PM, Petr Mladek wrote: >> From: Steven Rostedt >> >> From: Steven Rostedt (VMware) >> >> This patch implements what I discussed in Kernel Summit. I added >> lockdep annotation (hopefully correctly), and it hasn't had any splats >> (since I fixed some bugs in the first iterations). It did catch >> problems when I had the owner covering too much. But now that the owner >> is only set when actively calling the consoles, lockdep has stayed >> quiet. >> >> Here's the design again: >> >> I added a "console_owner" which is set to a task that is actively >> writing to the consoles. It is *not* the same as the owner of the >> console_lock. It is only set when doing the calls to the console >> functions. It is protected by a console_owner_lock which is a raw spin >> lock. >> >> There is a console_waiter. This is set when there is an active console >> owner that is not current, and waiter is not set. This too is protected >> by console_owner_lock. >> >> In printk() when it tries to write to the consoles, we have: >> >>     if (console_trylock()) >>         console_unlock(); >> >> Now I added an else, which will check if there is an active owner, and >> no current waiter. If that is the case, then console_waiter is set, and >> the task goes into a spin until it is no longer set. >> >> When the active console owner finishes writing the current message to >> the consoles, it grabs the console_owner_lock and sees if there is a >> waiter, and clears console_owner. >> >> If there is a waiter, then it breaks out of the loop, clears the waiter >> flag (because that will release the waiter from its spin), and exits. >> Note, it does *not* release the console semaphore. Because it is a >> semaphore, there is no owner. Another task may release it. This means >> that the waiter is guaranteed to be the new console owner! Which it >> becomes. >> >> Then the waiter calls console_unlock() and continues to write to the >> consoles. >> >> If another task comes along and does a printk() it too can become the >> new waiter, and we wash rinse and repeat! >> >> By Petr Mladek about possible new deadlocks: >> >> The thing is that we move console_sem only to printk() call >> that normally calls console_unlock() as well. It means that >> the transferred owner should not bring new type of dependencies. >> As Steven said somewhere: "If there is a deadlock, it was >> there even before." >> >> We could look at it from this side. The possible deadlock would >> look like: >> >> CPU0                            CPU1 >> >> console_unlock() >> >>    console_owner = current; >> >>                 spin_lockA() >>                   printk() >>                     spin = true; >>                     while (...) >> >>      call_console_drivers() >>        spin_lockA() >> >> This would be a deadlock. CPU0 would wait for the lock A. >> While CPU1 would own the lockA and would wait for CPU0 >> to finish calling the console drivers and pass the console_sem >> owner. >> >> But if the above is true than the following scenario was >> already possible before: >> >> CPU0 >> >> spin_lockA() >>    printk() >>      console_unlock() >>        call_console_drivers() >>     spin_lockA() >> >> By other words, this deadlock was there even before. Such >> deadlocks are prevented by using printk_deferred() in >> the sections guarded by the lock A. > > Hello, > > I didn't see what you did, at the last version. You were > tring to transfer the semaphore owner and make it taken > over. I see. > > But, what I mentioned last time is still valid. See below. > >> Signed-off-by: Steven Rostedt (VMware) >> [pmladek@suse.com: Commit message about possible deadlocks] >> --- >>   kernel/printk/printk.c | 108 >> ++++++++++++++++++++++++++++++++++++++++++++++++- >>   1 file changed, 107 insertions(+), 1 deletion(-) >> >> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c >> index b9006617710f..7e6459abba43 100644 >> --- a/kernel/printk/printk.c >> +++ b/kernel/printk/printk.c >> @@ -86,8 +86,15 @@ EXPORT_SYMBOL_GPL(console_drivers); >>   static struct lockdep_map console_lock_dep_map = { >>       .name = "console_lock" >>   }; >> +static struct lockdep_map console_owner_dep_map = { >> +    .name = "console_owner" >> +}; >>   #endif >> +static DEFINE_RAW_SPINLOCK(console_owner_lock); >> +static struct task_struct *console_owner; >> +static bool console_waiter; >> + >>   enum devkmsg_log_bits { >>       __DEVKMSG_LOG_BIT_ON = 0, >>       __DEVKMSG_LOG_BIT_OFF, >> @@ -1753,8 +1760,56 @@ asmlinkage int vprintk_emit(int facility, int >> level, >>            * semaphore.  The release will print out buffers and wake up >>            * /dev/kmsg and syslog() users. >>            */ >> -        if (console_trylock()) >> +        if (console_trylock()) { >>               console_unlock(); >> +        } else { >> +            struct task_struct *owner = NULL; >> +            bool waiter; >> +            bool spin = false; >> + >> +            printk_safe_enter_irqsave(flags); >> + >> +            raw_spin_lock(&console_owner_lock); >> +            owner = READ_ONCE(console_owner); >> +            waiter = READ_ONCE(console_waiter); >> +            if (!waiter && owner && owner != current) { >> +                WRITE_ONCE(console_waiter, true); >> +                spin = true; >> +            } >> +            raw_spin_unlock(&console_owner_lock); >> + >> +            /* >> +             * If there is an active printk() writing to the >> +             * consoles, instead of having it write our data too, >> +             * see if we can offload that load from the active >> +             * printer, and do some printing ourselves. >> +             * Go into a spin only if there isn't already a waiter >> +             * spinning, and there is an active printer, and >> +             * that active printer isn't us (recursive printk?). >> +             */ >> +            if (spin) { >> +                /* We spin waiting for the owner to release us */ >> +                spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_); >> +                /* Owner will clear console_waiter on hand off */ >> +                while (READ_ONCE(console_waiter)) >> +                    cpu_relax(); >> + >> +                spin_release(&console_owner_dep_map, 1, _THIS_IP_); > > Why don't you move this over "while (READ_ONCE(console_waiter))" and > right after acquire()? > > As I said last time, only acquisitions between acquire() and release() > are meaningful. Are you taking care of acquisitions within cpu_relax()? > If so, leave it. In addition, this way would be correct if you intended to use cross-lock's map here, assuming cross-release alive.. But anyway this is just a typical acquire/release pair so we don't usually use the pair in this way. -- Thanks, Byungchul