From mboxrd@z Thu Jan  1 00:00:00 1970
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1752324AbeAQHeU (ORCPT <rfc822;ralf@linux-mips.org> + 1 other);
        Wed, 17 Jan 2018 02:34:20 -0500
Received: from LGEAMRELO12.lge.com ([156.147.23.52]:55119 "EHLO
        lgeamrelo12.lge.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1752226AbeAQHeS (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Wed, 17 Jan 2018 02:34:18 -0500
X-Original-SENDERIP: 156.147.1.127
X-Original-MAILFROM: byungchul.park@lge.com
X-Original-SENDERIP: 10.177.222.184
X-Original-MAILFROM: byungchul.park@lge.com
Subject: Re: [PATCH v5 1/2] printk: Add console owner and waiter logic to load
 balance console writes
From: Byungchul Park <byungchul.park@lge.com>
To: Petr Mladek <pmladek@suse.com>,
        Steven Rostedt <rostedt@goodmis.org>,
        Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: akpm@linux-foundation.org, linux-mm@kvack.org,
        Cong Wang <xiyou.wangcong@gmail.com>,
        Dave Hansen <dave.hansen@intel.com>,
        Johannes Weiner <hannes@cmpxchg.org>,
        Mel Gorman <mgorman@suse.de>, Michal Hocko <mhocko@kernel.org>,
        Vlastimil Babka <vbabka@suse.cz>,
        Peter Zijlstra <peterz@infradead.org>,
        Linus Torvalds <torvalds@linux-foundation.org>,
        Jan Kara <jack@suse.cz>,
        Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
        Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
        rostedt@home.goodmis.org,
        Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>,
        Tejun Heo <tj@kernel.org>, Pavel Machek <pavel@ucw.cz>,
        linux-kernel@vger.kernel.org, kernel-team@lge.com
References: <20180110132418.7080-1-pmladek@suse.com>
 <20180110132418.7080-2-pmladek@suse.com>
 <f4ea1404-404d-11d2-550c-7367add3f5fa@lge.com>
Message-ID: <9f0ef69d-49e7-abf1-2f61-5f0f44ffcf7b@lge.com>
Date: Wed, 17 Jan 2018 16:34:14 +0900
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101
 Thunderbird/52.5.2
MIME-Version: 1.0
In-Reply-To: <f4ea1404-404d-11d2-550c-7367add3f5fa@lge.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 8bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org
Return-Path: <linux-kernel-owner@vger.kernel.org>

On 1/17/2018 11:19 AM, Byungchul Park wrote:
> On 1/10/2018 10:24 PM, Petr Mladek wrote:
>> From: Steven Rostedt <rostedt@goodmis.org>
>>
>> From: Steven Rostedt (VMware) <rostedt@goodmis.org>
>>
>> This patch implements what I discussed in Kernel Summit. I added
>> lockdep annotation (hopefully correctly), and it hasn't had any splats
>> (since I fixed some bugs in the first iterations). It did catch
>> problems when I had the owner covering too much. But now that the owner
>> is only set when actively calling the consoles, lockdep has stayed
>> quiet.
>>
>> Here's the design again:
>>
>> I added a "console_owner" which is set to a task that is actively
>> writing to the consoles. It is *not* the same as the owner of the
>> console_lock. It is only set when doing the calls to the console
>> functions. It is protected by a console_owner_lock which is a raw spin
>> lock.
>>
>> There is a console_waiter. This is set when there is an active console
>> owner that is not current, and waiter is not set. This too is protected
>> by console_owner_lock.
>>
>> In printk() when it tries to write to the consoles, we have:
>>
>>     if (console_trylock())
>>         console_unlock();
>>
>> Now I added an else, which will check if there is an active owner, and
>> no current waiter. If that is the case, then console_waiter is set, and
>> the task goes into a spin until it is no longer set.
>>
>> When the active console owner finishes writing the current message to
>> the consoles, it grabs the console_owner_lock and sees if there is a
>> waiter, and clears console_owner.
>>
>> If there is a waiter, then it breaks out of the loop, clears the waiter
>> flag (because that will release the waiter from its spin), and exits.
>> Note, it does *not* release the console semaphore. Because it is a
>> semaphore, there is no owner. Another task may release it. This means
>> that the waiter is guaranteed to be the new console owner! Which it
>> becomes.
>>
>> Then the waiter calls console_unlock() and continues to write to the
>> consoles.
>>
>> If another task comes along and does a printk() it too can become the
>> new waiter, and we wash rinse and repeat!
>>
>> By Petr Mladek about possible new deadlocks:
>>
>> The thing is that we move console_sem only to printk() call
>> that normally calls console_unlock() as well. It means that
>> the transferred owner should not bring new type of dependencies.
>> As Steven said somewhere: "If there is a deadlock, it was
>> there even before."
>>
>> We could look at it from this side. The possible deadlock would
>> look like:
>>
>> CPU0                            CPU1
>>
>> console_unlock()
>>
>>    console_owner = current;
>>
>>                 spin_lockA()
>>                   printk()
>>                     spin = true;
>>                     while (...)
>>
>>      call_console_drivers()
>>        spin_lockA()
>>
>> This would be a deadlock. CPU0 would wait for the lock A.
>> While CPU1 would own the lockA and would wait for CPU0
>> to finish calling the console drivers and pass the console_sem
>> owner.
>>
>> But if the above is true than the following scenario was
>> already possible before:
>>
>> CPU0
>>
>> spin_lockA()
>>    printk()
>>      console_unlock()
>>        call_console_drivers()
>>     spin_lockA()
>>
>> By other words, this deadlock was there even before. Such
>> deadlocks are prevented by using printk_deferred() in
>> the sections guarded by the lock A.
> 
> Hello,
> 
> I didn't see what you did, at the last version. You were
> tring to transfer the semaphore owner and make it taken
> over. I see.
> 
> But, what I mentioned last time is still valid. See below.
> 
>> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
>> [pmladek@suse.com: Commit message about possible deadlocks]
>> ---
>>   kernel/printk/printk.c | 108 
>> ++++++++++++++++++++++++++++++++++++++++++++++++-
>>   1 file changed, 107 insertions(+), 1 deletion(-)
>>
>> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
>> index b9006617710f..7e6459abba43 100644
>> --- a/kernel/printk/printk.c
>> +++ b/kernel/printk/printk.c
>> @@ -86,8 +86,15 @@ EXPORT_SYMBOL_GPL(console_drivers);
>>   static struct lockdep_map console_lock_dep_map = {
>>       .name = "console_lock"
>>   };
>> +static struct lockdep_map console_owner_dep_map = {
>> +    .name = "console_owner"
>> +};
>>   #endif
>> +static DEFINE_RAW_SPINLOCK(console_owner_lock);
>> +static struct task_struct *console_owner;
>> +static bool console_waiter;
>> +
>>   enum devkmsg_log_bits {
>>       __DEVKMSG_LOG_BIT_ON = 0,
>>       __DEVKMSG_LOG_BIT_OFF,
>> @@ -1753,8 +1760,56 @@ asmlinkage int vprintk_emit(int facility, int 
>> level,
>>            * semaphore.  The release will print out buffers and wake up
>>            * /dev/kmsg and syslog() users.
>>            */
>> -        if (console_trylock())
>> +        if (console_trylock()) {
>>               console_unlock();
>> +        } else {
>> +            struct task_struct *owner = NULL;
>> +            bool waiter;
>> +            bool spin = false;
>> +
>> +            printk_safe_enter_irqsave(flags);
>> +
>> +            raw_spin_lock(&console_owner_lock);
>> +            owner = READ_ONCE(console_owner);
>> +            waiter = READ_ONCE(console_waiter);
>> +            if (!waiter && owner && owner != current) {
>> +                WRITE_ONCE(console_waiter, true);
>> +                spin = true;
>> +            }
>> +            raw_spin_unlock(&console_owner_lock);
>> +
>> +            /*
>> +             * If there is an active printk() writing to the
>> +             * consoles, instead of having it write our data too,
>> +             * see if we can offload that load from the active
>> +             * printer, and do some printing ourselves.
>> +             * Go into a spin only if there isn't already a waiter
>> +             * spinning, and there is an active printer, and
>> +             * that active printer isn't us (recursive printk?).
>> +             */
>> +            if (spin) {
>> +                /* We spin waiting for the owner to release us */
>> +                spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_);
>> +                /* Owner will clear console_waiter on hand off */
>> +                while (READ_ONCE(console_waiter))
>> +                    cpu_relax();
>> +
>> +                spin_release(&console_owner_dep_map, 1, _THIS_IP_);
> 
> Why don't you move this over "while (READ_ONCE(console_waiter))" and
> right after acquire()?
> 
> As I said last time, only acquisitions between acquire() and release()
> are meaningful. Are you taking care of acquisitions within cpu_relax()?
> If so, leave it.

In addition, this way would be correct if you intended to use
cross-lock's map here, assuming cross-release alive..

But anyway this is just a typical acquire/release pair so we
don't usually use the pair in this way.

-- 
Thanks,
Byungchul