All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4] printk: Add console owner and waiter logic to load balance console writes
@ 2017-11-08 15:27 ` Steven Rostedt
  0 siblings, 0 replies; 47+ messages in thread
From: Steven Rostedt @ 2017-11-08 15:27 UTC (permalink / raw)
  To: LKML
  Cc: akpm, linux-mm, Cong Wang, Dave Hansen, Johannes Weiner,
	Mel Gorman, Michal Hocko, Petr Mladek, Sergey Senozhatsky,
	Vlastimil Babka, yuwang.yuwang, Peter Zijlstra, Linus Torvalds,
	Jan Kara, Mathieu Desnoyers, Tetsuo Handa, rostedt

[ claws-mail is really pissing me off. It did it again, after I
  manually fixed all the addresses. This time, I'm going to do things
  slightly different. Sorry for all the spam :-( ]

From: Steven Rostedt (VMware) <rostedt@goodmis.org>

This patch implements what I discussed in Kernel Summit. I added
lockdep annotation (hopefully correctly), and it hasn't had any splats
(since I fixed some bugs in the first iterations). It did catch
problems when I had the owner covering too much. But now that the owner
is only set when actively calling the consoles, lockdep has stayed
quiet.
 
Here's the design again:

I added a "console_owner" which is set to a task that is actively
writing to the consoles. It is *not* the same an the owner of the
console_lock. It is only set when doing the calls to the console
functions. It is protected by a console_owner_lock which is a raw spin
lock.

There is a console_waiter. This is set when there is an active console
owner that is not current, and waiter is not set. This too is protected
by console_owner_lock.

In printk() when it tries to write to the consoles, we have:

	if (console_trylock())
		console_unlock();

Now I added an else, which will check if there is an active owner, and
no current waiter. If that is the case, then console_waiter is set, and
the task goes into a spin until it is no longer set.

When the active console owner finishes writing the current message to
the consoles, it grabs the console_owner_lock and sees if there is a
waiter, and clears console_owner.

If there is a waiter, then it breaks out of the loop, clears the waiter
flag (because that will release the waiter from its spin), and exits.
Note, it does *not* release the console semaphore. Because it is a
semaphore, there is no owner. Another task may release it. This means
that the waiter is guaranteed to be the new console owner! Which it
becomes.

Then the waiter calls console_unlock() and continues to write to the
consoles.

If another task comes along and does a printk() it too can become the
new waiter, and we wash rinse and repeat!

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
Changes from v3:

	Fixed while loop on console_waiter (Thanks Vlastimil!)

	Moved console_owner out of logbuf_lock taking (reported by
	Tetsuo Handa)

Changes from v2:

  - Added back some READ/WRITE_ONCE() just to be on the safe side
Index: linux-trace.git/kernel/printk/printk.c
===================================================================
--- linux-trace.git.orig/kernel/printk/printk.c
+++ linux-trace.git/kernel/printk/printk.c
@@ -86,8 +86,15 @@ EXPORT_SYMBOL_GPL(console_drivers);
 static struct lockdep_map console_lock_dep_map = {
 	.name = "console_lock"
 };
+static struct lockdep_map console_owner_dep_map = {
+	.name = "console_owner"
+};
 #endif
 
+static DEFINE_RAW_SPINLOCK(console_owner_lock);
+static struct task_struct *console_owner;
+static bool console_waiter;
+
 enum devkmsg_log_bits {
 	__DEVKMSG_LOG_BIT_ON = 0,
 	__DEVKMSG_LOG_BIT_OFF,
@@ -1753,8 +1760,56 @@ asmlinkage int vprintk_emit(int facility
 		 * semaphore.  The release will print out buffers and wake up
 		 * /dev/kmsg and syslog() users.
 		 */
-		if (console_trylock())
+		if (console_trylock()) {
 			console_unlock();
+		} else {
+			struct task_struct *owner = NULL;
+			bool waiter;
+			bool spin = false;
+
+			printk_safe_enter_irqsave(flags);
+
+			raw_spin_lock(&console_owner_lock);
+			owner = READ_ONCE(console_owner);
+			waiter = READ_ONCE(console_waiter);
+			if (!waiter && owner && owner != current) {
+				WRITE_ONCE(console_waiter, true);
+				spin = true;
+			}
+			raw_spin_unlock(&console_owner_lock);
+
+			/*
+			 * If there is an active printk() writing to the
+			 * consoles, instead of having it write our data too,
+			 * see if we can offload that load from the active
+			 * printer, and do some printing ourselves.
+			 * Go into a spin only if there isn't already a waiter
+			 * spinning, and there is an active printer, and
+			 * that active printer isn't us (recursive printk?).
+			 */
+			if (spin) {
+				/* We spin waiting for the owner to release us */
+				spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_);
+				/* Owner will clear console_waiter on hand off */
+				while (READ_ONCE(console_waiter))
+					cpu_relax();
+
+				spin_release(&console_owner_dep_map, 1, _THIS_IP_);
+				printk_safe_exit_irqrestore(flags);
+
+				/*
+				 * The owner passed the console lock to us.
+				 * Since we did not spin on console lock, annotate
+				 * this as a trylock. Otherwise lockdep will
+				 * complain.
+				 */
+				mutex_acquire(&console_lock_dep_map, 0, 1, _THIS_IP_);
+				console_unlock();
+				printk_safe_enter_irqsave(flags);
+			}
+			printk_safe_exit_irqrestore(flags);
+
+		}
 	}
 
 	return printed_len;
@@ -2141,6 +2196,7 @@ void console_unlock(void)
 	static u64 seen_seq;
 	unsigned long flags;
 	bool wake_klogd = false;
+	bool waiter = false;
 	bool do_cond_resched, retry;
 
 	if (console_suspended) {
@@ -2229,14 +2285,64 @@ skip:
 		console_seq++;
 		raw_spin_unlock(&logbuf_lock);
 
+		/*
+		 * While actively printing out messages, if another printk()
+		 * were to occur on another CPU, it may wait for this one to
+		 * finish. This task can not be preempted if there is a
+		 * waiter waiting to take over.
+		 */
+		raw_spin_lock(&console_owner_lock);
+		console_owner = current;
+		raw_spin_unlock(&console_owner_lock);
+
+		/* The waiter may spin on us after setting console_owner */
+		spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_);
+
 		stop_critical_timings();	/* don't trace print latency */
 		call_console_drivers(ext_text, ext_len, text, len);
 		start_critical_timings();
+
+		raw_spin_lock(&console_owner_lock);
+		waiter = READ_ONCE(console_waiter);
+		console_owner = NULL;
+		raw_spin_unlock(&console_owner_lock);
+
+		/*
+		 * If there is a waiter waiting for us, then pass the
+		 * rest of the work load over to that waiter.
+		 */
+		if (waiter)
+			break;
+
+		/* There was no waiter, and nothing will spin on us here */
+		spin_release(&console_owner_dep_map, 1, _THIS_IP_);
+
 		printk_safe_exit_irqrestore(flags);
 
 		if (do_cond_resched)
 			cond_resched();
 	}
+
+	/*
+	 * If there is an active waiter waiting on the console_lock.
+	 * Pass off the printing to the waiter, and the waiter
+	 * will continue printing on its CPU, and when all writing
+	 * has finished, the last printer will wake up klogd.
+	 */
+	if (waiter) {
+		WRITE_ONCE(console_waiter, false);
+		/* The waiter is now free to continue */
+		spin_release(&console_owner_dep_map, 1, _THIS_IP_);
+		/*
+		 * Hand off console_lock to waiter. The waiter will perform
+		 * the up(). After this, the waiter is the console_lock owner.
+		 */
+		mutex_release(&console_lock_dep_map, 1, _THIS_IP_);
+		printk_safe_exit_irqrestore(flags);
+		/* Note, if waiter is set, logbuf_lock is not held */
+		return;
+	}
+
 	console_locked = 0;
 
 	/* Release the exclusive_console once it is used */

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v4] printk: Add console owner and waiter logic to load balance console writes
@ 2017-11-08 15:27 ` Steven Rostedt
  0 siblings, 0 replies; 47+ messages in thread
From: Steven Rostedt @ 2017-11-08 15:27 UTC (permalink / raw)
  To: LKML
  Cc: akpm, linux-mm, Cong Wang, Dave Hansen, Johannes Weiner,
	Mel Gorman, Michal Hocko, Petr Mladek, Sergey Senozhatsky,
	Vlastimil Babka, yuwang.yuwang, Peter Zijlstra, Linus Torvalds,
	Jan Kara, Mathieu Desnoyers, Tetsuo Handa, rostedt

[ claws-mail is really pissing me off. It did it again, after I
  manually fixed all the addresses. This time, I'm going to do things
  slightly different. Sorry for all the spam :-( ]

From: Steven Rostedt (VMware) <rostedt@goodmis.org>

This patch implements what I discussed in Kernel Summit. I added
lockdep annotation (hopefully correctly), and it hasn't had any splats
(since I fixed some bugs in the first iterations). It did catch
problems when I had the owner covering too much. But now that the owner
is only set when actively calling the consoles, lockdep has stayed
quiet.
 
Here's the design again:

I added a "console_owner" which is set to a task that is actively
writing to the consoles. It is *not* the same an the owner of the
console_lock. It is only set when doing the calls to the console
functions. It is protected by a console_owner_lock which is a raw spin
lock.

There is a console_waiter. This is set when there is an active console
owner that is not current, and waiter is not set. This too is protected
by console_owner_lock.

In printk() when it tries to write to the consoles, we have:

	if (console_trylock())
		console_unlock();

Now I added an else, which will check if there is an active owner, and
no current waiter. If that is the case, then console_waiter is set, and
the task goes into a spin until it is no longer set.

When the active console owner finishes writing the current message to
the consoles, it grabs the console_owner_lock and sees if there is a
waiter, and clears console_owner.

If there is a waiter, then it breaks out of the loop, clears the waiter
flag (because that will release the waiter from its spin), and exits.
Note, it does *not* release the console semaphore. Because it is a
semaphore, there is no owner. Another task may release it. This means
that the waiter is guaranteed to be the new console owner! Which it
becomes.

Then the waiter calls console_unlock() and continues to write to the
consoles.

If another task comes along and does a printk() it too can become the
new waiter, and we wash rinse and repeat!

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
Changes from v3:

	Fixed while loop on console_waiter (Thanks Vlastimil!)

	Moved console_owner out of logbuf_lock taking (reported by
	Tetsuo Handa)

Changes from v2:

  - Added back some READ/WRITE_ONCE() just to be on the safe side
Index: linux-trace.git/kernel/printk/printk.c
===================================================================
--- linux-trace.git.orig/kernel/printk/printk.c
+++ linux-trace.git/kernel/printk/printk.c
@@ -86,8 +86,15 @@ EXPORT_SYMBOL_GPL(console_drivers);
 static struct lockdep_map console_lock_dep_map = {
 	.name = "console_lock"
 };
+static struct lockdep_map console_owner_dep_map = {
+	.name = "console_owner"
+};
 #endif
 
+static DEFINE_RAW_SPINLOCK(console_owner_lock);
+static struct task_struct *console_owner;
+static bool console_waiter;
+
 enum devkmsg_log_bits {
 	__DEVKMSG_LOG_BIT_ON = 0,
 	__DEVKMSG_LOG_BIT_OFF,
@@ -1753,8 +1760,56 @@ asmlinkage int vprintk_emit(int facility
 		 * semaphore.  The release will print out buffers and wake up
 		 * /dev/kmsg and syslog() users.
 		 */
-		if (console_trylock())
+		if (console_trylock()) {
 			console_unlock();
+		} else {
+			struct task_struct *owner = NULL;
+			bool waiter;
+			bool spin = false;
+
+			printk_safe_enter_irqsave(flags);
+
+			raw_spin_lock(&console_owner_lock);
+			owner = READ_ONCE(console_owner);
+			waiter = READ_ONCE(console_waiter);
+			if (!waiter && owner && owner != current) {
+				WRITE_ONCE(console_waiter, true);
+				spin = true;
+			}
+			raw_spin_unlock(&console_owner_lock);
+
+			/*
+			 * If there is an active printk() writing to the
+			 * consoles, instead of having it write our data too,
+			 * see if we can offload that load from the active
+			 * printer, and do some printing ourselves.
+			 * Go into a spin only if there isn't already a waiter
+			 * spinning, and there is an active printer, and
+			 * that active printer isn't us (recursive printk?).
+			 */
+			if (spin) {
+				/* We spin waiting for the owner to release us */
+				spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_);
+				/* Owner will clear console_waiter on hand off */
+				while (READ_ONCE(console_waiter))
+					cpu_relax();
+
+				spin_release(&console_owner_dep_map, 1, _THIS_IP_);
+				printk_safe_exit_irqrestore(flags);
+
+				/*
+				 * The owner passed the console lock to us.
+				 * Since we did not spin on console lock, annotate
+				 * this as a trylock. Otherwise lockdep will
+				 * complain.
+				 */
+				mutex_acquire(&console_lock_dep_map, 0, 1, _THIS_IP_);
+				console_unlock();
+				printk_safe_enter_irqsave(flags);
+			}
+			printk_safe_exit_irqrestore(flags);
+
+		}
 	}
 
 	return printed_len;
@@ -2141,6 +2196,7 @@ void console_unlock(void)
 	static u64 seen_seq;
 	unsigned long flags;
 	bool wake_klogd = false;
+	bool waiter = false;
 	bool do_cond_resched, retry;
 
 	if (console_suspended) {
@@ -2229,14 +2285,64 @@ skip:
 		console_seq++;
 		raw_spin_unlock(&logbuf_lock);
 
+		/*
+		 * While actively printing out messages, if another printk()
+		 * were to occur on another CPU, it may wait for this one to
+		 * finish. This task can not be preempted if there is a
+		 * waiter waiting to take over.
+		 */
+		raw_spin_lock(&console_owner_lock);
+		console_owner = current;
+		raw_spin_unlock(&console_owner_lock);
+
+		/* The waiter may spin on us after setting console_owner */
+		spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_);
+
 		stop_critical_timings();	/* don't trace print latency */
 		call_console_drivers(ext_text, ext_len, text, len);
 		start_critical_timings();
+
+		raw_spin_lock(&console_owner_lock);
+		waiter = READ_ONCE(console_waiter);
+		console_owner = NULL;
+		raw_spin_unlock(&console_owner_lock);
+
+		/*
+		 * If there is a waiter waiting for us, then pass the
+		 * rest of the work load over to that waiter.
+		 */
+		if (waiter)
+			break;
+
+		/* There was no waiter, and nothing will spin on us here */
+		spin_release(&console_owner_dep_map, 1, _THIS_IP_);
+
 		printk_safe_exit_irqrestore(flags);
 
 		if (do_cond_resched)
 			cond_resched();
 	}
+
+	/*
+	 * If there is an active waiter waiting on the console_lock.
+	 * Pass off the printing to the waiter, and the waiter
+	 * will continue printing on its CPU, and when all writing
+	 * has finished, the last printer will wake up klogd.
+	 */
+	if (waiter) {
+		WRITE_ONCE(console_waiter, false);
+		/* The waiter is now free to continue */
+		spin_release(&console_owner_dep_map, 1, _THIS_IP_);
+		/*
+		 * Hand off console_lock to waiter. The waiter will perform
+		 * the up(). After this, the waiter is the console_lock owner.
+		 */
+		mutex_release(&console_lock_dep_map, 1, _THIS_IP_);
+		printk_safe_exit_irqrestore(flags);
+		/* Note, if waiter is set, logbuf_lock is not held */
+		return;
+	}
+
 	console_locked = 0;
 
 	/* Release the exclusive_console once it is used */

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] printk: Add console owner and waiter logic to load balance console writes
  2017-11-08 15:27 ` Steven Rostedt
@ 2017-11-09 10:12   ` Michal Hocko
  -1 siblings, 0 replies; 47+ messages in thread
From: Michal Hocko @ 2017-11-09 10:12 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, akpm, linux-mm, Cong Wang, Dave Hansen, Johannes Weiner,
	Mel Gorman, Petr Mladek, Sergey Senozhatsky, Vlastimil Babka,
	Peter Zijlstra, Linus Torvalds, Jan Kara, Mathieu Desnoyers,
	Tetsuo Handa, rostedt

Hi,
assuming that this passes warn stall torturing by Tetsuo, do you think
we can drop http://lkml.kernel.org/r/1509017339-4802-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
from the mmotm tree?

On Wed 08-11-17 10:27:23, Steven Rostedt wrote:
> [ claws-mail is really pissing me off. It did it again, after I
>   manually fixed all the addresses. This time, I'm going to do things
>   slightly different. Sorry for all the spam :-( ]
> 
> From: Steven Rostedt (VMware) <rostedt@goodmis.org>
> 
> This patch implements what I discussed in Kernel Summit. I added
> lockdep annotation (hopefully correctly), and it hasn't had any splats
> (since I fixed some bugs in the first iterations). It did catch
> problems when I had the owner covering too much. But now that the owner
> is only set when actively calling the consoles, lockdep has stayed
> quiet.
>  
> Here's the design again:
> 
> I added a "console_owner" which is set to a task that is actively
> writing to the consoles. It is *not* the same an the owner of the
> console_lock. It is only set when doing the calls to the console
> functions. It is protected by a console_owner_lock which is a raw spin
> lock.
> 
> There is a console_waiter. This is set when there is an active console
> owner that is not current, and waiter is not set. This too is protected
> by console_owner_lock.
> 
> In printk() when it tries to write to the consoles, we have:
> 
> 	if (console_trylock())
> 		console_unlock();
> 
> Now I added an else, which will check if there is an active owner, and
> no current waiter. If that is the case, then console_waiter is set, and
> the task goes into a spin until it is no longer set.
> 
> When the active console owner finishes writing the current message to
> the consoles, it grabs the console_owner_lock and sees if there is a
> waiter, and clears console_owner.
> 
> If there is a waiter, then it breaks out of the loop, clears the waiter
> flag (because that will release the waiter from its spin), and exits.
> Note, it does *not* release the console semaphore. Because it is a
> semaphore, there is no owner. Another task may release it. This means
> that the waiter is guaranteed to be the new console owner! Which it
> becomes.
> 
> Then the waiter calls console_unlock() and continues to write to the
> consoles.
> 
> If another task comes along and does a printk() it too can become the
> new waiter, and we wash rinse and repeat!
> 
> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> ---
> Changes from v3:
> 
> 	Fixed while loop on console_waiter (Thanks Vlastimil!)
> 
> 	Moved console_owner out of logbuf_lock taking (reported by
> 	Tetsuo Handa)
> 
> Changes from v2:
> 
>   - Added back some READ/WRITE_ONCE() just to be on the safe side
> Index: linux-trace.git/kernel/printk/printk.c
> ===================================================================
> --- linux-trace.git.orig/kernel/printk/printk.c
> +++ linux-trace.git/kernel/printk/printk.c
> @@ -86,8 +86,15 @@ EXPORT_SYMBOL_GPL(console_drivers);
>  static struct lockdep_map console_lock_dep_map = {
>  	.name = "console_lock"
>  };
> +static struct lockdep_map console_owner_dep_map = {
> +	.name = "console_owner"
> +};
>  #endif
>  
> +static DEFINE_RAW_SPINLOCK(console_owner_lock);
> +static struct task_struct *console_owner;
> +static bool console_waiter;
> +
>  enum devkmsg_log_bits {
>  	__DEVKMSG_LOG_BIT_ON = 0,
>  	__DEVKMSG_LOG_BIT_OFF,
> @@ -1753,8 +1760,56 @@ asmlinkage int vprintk_emit(int facility
>  		 * semaphore.  The release will print out buffers and wake up
>  		 * /dev/kmsg and syslog() users.
>  		 */
> -		if (console_trylock())
> +		if (console_trylock()) {
>  			console_unlock();
> +		} else {
> +			struct task_struct *owner = NULL;
> +			bool waiter;
> +			bool spin = false;
> +
> +			printk_safe_enter_irqsave(flags);
> +
> +			raw_spin_lock(&console_owner_lock);
> +			owner = READ_ONCE(console_owner);
> +			waiter = READ_ONCE(console_waiter);
> +			if (!waiter && owner && owner != current) {
> +				WRITE_ONCE(console_waiter, true);
> +				spin = true;
> +			}
> +			raw_spin_unlock(&console_owner_lock);
> +
> +			/*
> +			 * If there is an active printk() writing to the
> +			 * consoles, instead of having it write our data too,
> +			 * see if we can offload that load from the active
> +			 * printer, and do some printing ourselves.
> +			 * Go into a spin only if there isn't already a waiter
> +			 * spinning, and there is an active printer, and
> +			 * that active printer isn't us (recursive printk?).
> +			 */
> +			if (spin) {
> +				/* We spin waiting for the owner to release us */
> +				spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_);
> +				/* Owner will clear console_waiter on hand off */
> +				while (READ_ONCE(console_waiter))
> +					cpu_relax();
> +
> +				spin_release(&console_owner_dep_map, 1, _THIS_IP_);
> +				printk_safe_exit_irqrestore(flags);
> +
> +				/*
> +				 * The owner passed the console lock to us.
> +				 * Since we did not spin on console lock, annotate
> +				 * this as a trylock. Otherwise lockdep will
> +				 * complain.
> +				 */
> +				mutex_acquire(&console_lock_dep_map, 0, 1, _THIS_IP_);
> +				console_unlock();
> +				printk_safe_enter_irqsave(flags);
> +			}
> +			printk_safe_exit_irqrestore(flags);
> +
> +		}
>  	}
>  
>  	return printed_len;
> @@ -2141,6 +2196,7 @@ void console_unlock(void)
>  	static u64 seen_seq;
>  	unsigned long flags;
>  	bool wake_klogd = false;
> +	bool waiter = false;
>  	bool do_cond_resched, retry;
>  
>  	if (console_suspended) {
> @@ -2229,14 +2285,64 @@ skip:
>  		console_seq++;
>  		raw_spin_unlock(&logbuf_lock);
>  
> +		/*
> +		 * While actively printing out messages, if another printk()
> +		 * were to occur on another CPU, it may wait for this one to
> +		 * finish. This task can not be preempted if there is a
> +		 * waiter waiting to take over.
> +		 */
> +		raw_spin_lock(&console_owner_lock);
> +		console_owner = current;
> +		raw_spin_unlock(&console_owner_lock);
> +
> +		/* The waiter may spin on us after setting console_owner */
> +		spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_);
> +
>  		stop_critical_timings();	/* don't trace print latency */
>  		call_console_drivers(ext_text, ext_len, text, len);
>  		start_critical_timings();
> +
> +		raw_spin_lock(&console_owner_lock);
> +		waiter = READ_ONCE(console_waiter);
> +		console_owner = NULL;
> +		raw_spin_unlock(&console_owner_lock);
> +
> +		/*
> +		 * If there is a waiter waiting for us, then pass the
> +		 * rest of the work load over to that waiter.
> +		 */
> +		if (waiter)
> +			break;
> +
> +		/* There was no waiter, and nothing will spin on us here */
> +		spin_release(&console_owner_dep_map, 1, _THIS_IP_);
> +
>  		printk_safe_exit_irqrestore(flags);
>  
>  		if (do_cond_resched)
>  			cond_resched();
>  	}
> +
> +	/*
> +	 * If there is an active waiter waiting on the console_lock.
> +	 * Pass off the printing to the waiter, and the waiter
> +	 * will continue printing on its CPU, and when all writing
> +	 * has finished, the last printer will wake up klogd.
> +	 */
> +	if (waiter) {
> +		WRITE_ONCE(console_waiter, false);
> +		/* The waiter is now free to continue */
> +		spin_release(&console_owner_dep_map, 1, _THIS_IP_);
> +		/*
> +		 * Hand off console_lock to waiter. The waiter will perform
> +		 * the up(). After this, the waiter is the console_lock owner.
> +		 */
> +		mutex_release(&console_lock_dep_map, 1, _THIS_IP_);
> +		printk_safe_exit_irqrestore(flags);
> +		/* Note, if waiter is set, logbuf_lock is not held */
> +		return;
> +	}
> +
>  	console_locked = 0;
>  
>  	/* Release the exclusive_console once it is used */

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] printk: Add console owner and waiter logic to load balance console writes
@ 2017-11-09 10:12   ` Michal Hocko
  0 siblings, 0 replies; 47+ messages in thread
From: Michal Hocko @ 2017-11-09 10:12 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, akpm, linux-mm, Cong Wang, Dave Hansen, Johannes Weiner,
	Mel Gorman, Petr Mladek, Sergey Senozhatsky, Vlastimil Babka,
	Peter Zijlstra, Linus Torvalds, Jan Kara, Mathieu Desnoyers,
	Tetsuo Handa, rostedt

Hi,
assuming that this passes warn stall torturing by Tetsuo, do you think
we can drop http://lkml.kernel.org/r/1509017339-4802-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
from the mmotm tree?

On Wed 08-11-17 10:27:23, Steven Rostedt wrote:
> [ claws-mail is really pissing me off. It did it again, after I
>   manually fixed all the addresses. This time, I'm going to do things
>   slightly different. Sorry for all the spam :-( ]
> 
> From: Steven Rostedt (VMware) <rostedt@goodmis.org>
> 
> This patch implements what I discussed in Kernel Summit. I added
> lockdep annotation (hopefully correctly), and it hasn't had any splats
> (since I fixed some bugs in the first iterations). It did catch
> problems when I had the owner covering too much. But now that the owner
> is only set when actively calling the consoles, lockdep has stayed
> quiet.
>  
> Here's the design again:
> 
> I added a "console_owner" which is set to a task that is actively
> writing to the consoles. It is *not* the same an the owner of the
> console_lock. It is only set when doing the calls to the console
> functions. It is protected by a console_owner_lock which is a raw spin
> lock.
> 
> There is a console_waiter. This is set when there is an active console
> owner that is not current, and waiter is not set. This too is protected
> by console_owner_lock.
> 
> In printk() when it tries to write to the consoles, we have:
> 
> 	if (console_trylock())
> 		console_unlock();
> 
> Now I added an else, which will check if there is an active owner, and
> no current waiter. If that is the case, then console_waiter is set, and
> the task goes into a spin until it is no longer set.
> 
> When the active console owner finishes writing the current message to
> the consoles, it grabs the console_owner_lock and sees if there is a
> waiter, and clears console_owner.
> 
> If there is a waiter, then it breaks out of the loop, clears the waiter
> flag (because that will release the waiter from its spin), and exits.
> Note, it does *not* release the console semaphore. Because it is a
> semaphore, there is no owner. Another task may release it. This means
> that the waiter is guaranteed to be the new console owner! Which it
> becomes.
> 
> Then the waiter calls console_unlock() and continues to write to the
> consoles.
> 
> If another task comes along and does a printk() it too can become the
> new waiter, and we wash rinse and repeat!
> 
> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> ---
> Changes from v3:
> 
> 	Fixed while loop on console_waiter (Thanks Vlastimil!)
> 
> 	Moved console_owner out of logbuf_lock taking (reported by
> 	Tetsuo Handa)
> 
> Changes from v2:
> 
>   - Added back some READ/WRITE_ONCE() just to be on the safe side
> Index: linux-trace.git/kernel/printk/printk.c
> ===================================================================
> --- linux-trace.git.orig/kernel/printk/printk.c
> +++ linux-trace.git/kernel/printk/printk.c
> @@ -86,8 +86,15 @@ EXPORT_SYMBOL_GPL(console_drivers);
>  static struct lockdep_map console_lock_dep_map = {
>  	.name = "console_lock"
>  };
> +static struct lockdep_map console_owner_dep_map = {
> +	.name = "console_owner"
> +};
>  #endif
>  
> +static DEFINE_RAW_SPINLOCK(console_owner_lock);
> +static struct task_struct *console_owner;
> +static bool console_waiter;
> +
>  enum devkmsg_log_bits {
>  	__DEVKMSG_LOG_BIT_ON = 0,
>  	__DEVKMSG_LOG_BIT_OFF,
> @@ -1753,8 +1760,56 @@ asmlinkage int vprintk_emit(int facility
>  		 * semaphore.  The release will print out buffers and wake up
>  		 * /dev/kmsg and syslog() users.
>  		 */
> -		if (console_trylock())
> +		if (console_trylock()) {
>  			console_unlock();
> +		} else {
> +			struct task_struct *owner = NULL;
> +			bool waiter;
> +			bool spin = false;
> +
> +			printk_safe_enter_irqsave(flags);
> +
> +			raw_spin_lock(&console_owner_lock);
> +			owner = READ_ONCE(console_owner);
> +			waiter = READ_ONCE(console_waiter);
> +			if (!waiter && owner && owner != current) {
> +				WRITE_ONCE(console_waiter, true);
> +				spin = true;
> +			}
> +			raw_spin_unlock(&console_owner_lock);
> +
> +			/*
> +			 * If there is an active printk() writing to the
> +			 * consoles, instead of having it write our data too,
> +			 * see if we can offload that load from the active
> +			 * printer, and do some printing ourselves.
> +			 * Go into a spin only if there isn't already a waiter
> +			 * spinning, and there is an active printer, and
> +			 * that active printer isn't us (recursive printk?).
> +			 */
> +			if (spin) {
> +				/* We spin waiting for the owner to release us */
> +				spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_);
> +				/* Owner will clear console_waiter on hand off */
> +				while (READ_ONCE(console_waiter))
> +					cpu_relax();
> +
> +				spin_release(&console_owner_dep_map, 1, _THIS_IP_);
> +				printk_safe_exit_irqrestore(flags);
> +
> +				/*
> +				 * The owner passed the console lock to us.
> +				 * Since we did not spin on console lock, annotate
> +				 * this as a trylock. Otherwise lockdep will
> +				 * complain.
> +				 */
> +				mutex_acquire(&console_lock_dep_map, 0, 1, _THIS_IP_);
> +				console_unlock();
> +				printk_safe_enter_irqsave(flags);
> +			}
> +			printk_safe_exit_irqrestore(flags);
> +
> +		}
>  	}
>  
>  	return printed_len;
> @@ -2141,6 +2196,7 @@ void console_unlock(void)
>  	static u64 seen_seq;
>  	unsigned long flags;
>  	bool wake_klogd = false;
> +	bool waiter = false;
>  	bool do_cond_resched, retry;
>  
>  	if (console_suspended) {
> @@ -2229,14 +2285,64 @@ skip:
>  		console_seq++;
>  		raw_spin_unlock(&logbuf_lock);
>  
> +		/*
> +		 * While actively printing out messages, if another printk()
> +		 * were to occur on another CPU, it may wait for this one to
> +		 * finish. This task can not be preempted if there is a
> +		 * waiter waiting to take over.
> +		 */
> +		raw_spin_lock(&console_owner_lock);
> +		console_owner = current;
> +		raw_spin_unlock(&console_owner_lock);
> +
> +		/* The waiter may spin on us after setting console_owner */
> +		spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_);
> +
>  		stop_critical_timings();	/* don't trace print latency */
>  		call_console_drivers(ext_text, ext_len, text, len);
>  		start_critical_timings();
> +
> +		raw_spin_lock(&console_owner_lock);
> +		waiter = READ_ONCE(console_waiter);
> +		console_owner = NULL;
> +		raw_spin_unlock(&console_owner_lock);
> +
> +		/*
> +		 * If there is a waiter waiting for us, then pass the
> +		 * rest of the work load over to that waiter.
> +		 */
> +		if (waiter)
> +			break;
> +
> +		/* There was no waiter, and nothing will spin on us here */
> +		spin_release(&console_owner_dep_map, 1, _THIS_IP_);
> +
>  		printk_safe_exit_irqrestore(flags);
>  
>  		if (do_cond_resched)
>  			cond_resched();
>  	}
> +
> +	/*
> +	 * If there is an active waiter waiting on the console_lock.
> +	 * Pass off the printing to the waiter, and the waiter
> +	 * will continue printing on its CPU, and when all writing
> +	 * has finished, the last printer will wake up klogd.
> +	 */
> +	if (waiter) {
> +		WRITE_ONCE(console_waiter, false);
> +		/* The waiter is now free to continue */
> +		spin_release(&console_owner_dep_map, 1, _THIS_IP_);
> +		/*
> +		 * Hand off console_lock to waiter. The waiter will perform
> +		 * the up(). After this, the waiter is the console_lock owner.
> +		 */
> +		mutex_release(&console_lock_dep_map, 1, _THIS_IP_);
> +		printk_safe_exit_irqrestore(flags);
> +		/* Note, if waiter is set, logbuf_lock is not held */
> +		return;
> +	}
> +
>  	console_locked = 0;
>  
>  	/* Release the exclusive_console once it is used */

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] printk: Add console owner and waiter logic to loadbalance console writes
  2017-11-09 10:12   ` Michal Hocko
@ 2017-11-09 10:22     ` Tetsuo Handa
  -1 siblings, 0 replies; 47+ messages in thread
From: Tetsuo Handa @ 2017-11-09 10:22 UTC (permalink / raw)
  To: mhocko, rostedt
  Cc: linux-kernel, akpm, linux-mm, xiyou.wangcong, dave.hansen,
	hannes, mgorman, pmladek, sergey.senozhatsky, vbabka, peterz,
	torvalds, jack, mathieu.desnoyers, rostedt

Michal Hocko wrote:
> Hi,
> assuming that this passes warn stall torturing by Tetsuo, do you think
> we can drop http://lkml.kernel.org/r/1509017339-4802-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
> from the mmotm tree?

I don't think so.

The rule that "do not try to printk() faster than the kernel can write to
consoles" will remain no matter how printk() changes. Unless asynchronous
approach like https://lwn.net/Articles/723447/ is used, I think we can't
obtain useful information.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] printk: Add console owner and waiter logic to loadbalance console writes
@ 2017-11-09 10:22     ` Tetsuo Handa
  0 siblings, 0 replies; 47+ messages in thread
From: Tetsuo Handa @ 2017-11-09 10:22 UTC (permalink / raw)
  To: mhocko, rostedt
  Cc: linux-kernel, akpm, linux-mm, xiyou.wangcong, dave.hansen,
	hannes, mgorman, pmladek, sergey.senozhatsky, vbabka, peterz,
	torvalds, jack, mathieu.desnoyers, rostedt

Michal Hocko wrote:
> Hi,
> assuming that this passes warn stall torturing by Tetsuo, do you think
> we can drop http://lkml.kernel.org/r/1509017339-4802-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
> from the mmotm tree?

I don't think so.

The rule that "do not try to printk() faster than the kernel can write to
consoles" will remain no matter how printk() changes. Unless asynchronous
approach like https://lwn.net/Articles/723447/ is used, I think we can't
obtain useful information.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] printk: Add console owner and waiter logic to loadbalance console writes
  2017-11-09 10:22     ` Tetsuo Handa
@ 2017-11-09 10:26       ` Michal Hocko
  -1 siblings, 0 replies; 47+ messages in thread
From: Michal Hocko @ 2017-11-09 10:26 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: rostedt, linux-kernel, akpm, linux-mm, xiyou.wangcong,
	dave.hansen, hannes, mgorman, pmladek, sergey.senozhatsky,
	vbabka, peterz, torvalds, jack, mathieu.desnoyers, rostedt

On Thu 09-11-17 19:22:58, Tetsuo Handa wrote:
> Michal Hocko wrote:
> > Hi,
> > assuming that this passes warn stall torturing by Tetsuo, do you think
> > we can drop http://lkml.kernel.org/r/1509017339-4802-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
> > from the mmotm tree?
> 
> I don't think so.
> 
> The rule that "do not try to printk() faster than the kernel can write to
> consoles" will remain no matter how printk() changes. Unless asynchronous
> approach like https://lwn.net/Articles/723447/ is used, I think we can't
> obtain useful information.

Does that mean that the patch doesn't pass your test?

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] printk: Add console owner and waiter logic to loadbalance console writes
@ 2017-11-09 10:26       ` Michal Hocko
  0 siblings, 0 replies; 47+ messages in thread
From: Michal Hocko @ 2017-11-09 10:26 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: rostedt, linux-kernel, akpm, linux-mm, xiyou.wangcong,
	dave.hansen, hannes, mgorman, pmladek, sergey.senozhatsky,
	vbabka, peterz, torvalds, jack, mathieu.desnoyers, rostedt

On Thu 09-11-17 19:22:58, Tetsuo Handa wrote:
> Michal Hocko wrote:
> > Hi,
> > assuming that this passes warn stall torturing by Tetsuo, do you think
> > we can drop http://lkml.kernel.org/r/1509017339-4802-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
> > from the mmotm tree?
> 
> I don't think so.
> 
> The rule that "do not try to printk() faster than the kernel can write to
> consoles" will remain no matter how printk() changes. Unless asynchronous
> approach like https://lwn.net/Articles/723447/ is used, I think we can't
> obtain useful information.

Does that mean that the patch doesn't pass your test?

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] printk: Add console owner and waiter logic to loadbalance console writes
  2017-11-09 10:26       ` Michal Hocko
@ 2017-11-09 11:03         ` Tetsuo Handa
  -1 siblings, 0 replies; 47+ messages in thread
From: Tetsuo Handa @ 2017-11-09 11:03 UTC (permalink / raw)
  To: mhocko
  Cc: rostedt, linux-kernel, akpm, linux-mm, xiyou.wangcong,
	dave.hansen, hannes, mgorman, pmladek, sergey.senozhatsky,
	vbabka, peterz, torvalds, jack, mathieu.desnoyers, rostedt

Michal Hocko wrote:
> On Thu 09-11-17 19:22:58, Tetsuo Handa wrote:
> > Michal Hocko wrote:
> > > Hi,
> > > assuming that this passes warn stall torturing by Tetsuo, do you think
> > > we can drop http://lkml.kernel.org/r/1509017339-4802-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
> > > from the mmotm tree?
> > 
> > I don't think so.
> > 
> > The rule that "do not try to printk() faster than the kernel can write to
> > consoles" will remain no matter how printk() changes. Unless asynchronous
> > approach like https://lwn.net/Articles/723447/ is used, I think we can't
> > obtain useful information.
> 
> Does that mean that the patch doesn't pass your test?
> 

Test is irrelevant. See the changelog.

  Synchronous approach is prone to unexpected results (e.g. too late [1], too
  frequent [2], overlooked [3]). As far as I know, warn_alloc() never helped
  with providing information other than "something is going wrong".
  I want to consider asynchronous approach which can obtain information
  during stalls with possibly relevant threads (e.g. the owner of oom_lock
  and kswapd-like threads) and serve as a trigger for actions (e.g. turn
  on/off tracepoints, ask libvirt daemon to take a memory dump of stalling
  KVM guest for diagnostic purpose).

  [1] https://bugzilla.kernel.org/show_bug.cgi?id=192981
  [2] http://lkml.kernel.org/r/CAM_iQpWuPVGc2ky8M-9yukECtS+zKjiDasNymX7rMcBjBFyM_A@mail.gmail.com
  [3] commit db73ee0d46379922 ("mm, vmscan: do not loop on too_many_isolated for ever")

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] printk: Add console owner and waiter logic to loadbalance console writes
@ 2017-11-09 11:03         ` Tetsuo Handa
  0 siblings, 0 replies; 47+ messages in thread
From: Tetsuo Handa @ 2017-11-09 11:03 UTC (permalink / raw)
  To: mhocko
  Cc: rostedt, linux-kernel, akpm, linux-mm, xiyou.wangcong,
	dave.hansen, hannes, mgorman, pmladek, sergey.senozhatsky,
	vbabka, peterz, torvalds, jack, mathieu.desnoyers, rostedt

Michal Hocko wrote:
> On Thu 09-11-17 19:22:58, Tetsuo Handa wrote:
> > Michal Hocko wrote:
> > > Hi,
> > > assuming that this passes warn stall torturing by Tetsuo, do you think
> > > we can drop http://lkml.kernel.org/r/1509017339-4802-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
> > > from the mmotm tree?
> > 
> > I don't think so.
> > 
> > The rule that "do not try to printk() faster than the kernel can write to
> > consoles" will remain no matter how printk() changes. Unless asynchronous
> > approach like https://lwn.net/Articles/723447/ is used, I think we can't
> > obtain useful information.
> 
> Does that mean that the patch doesn't pass your test?
> 

Test is irrelevant. See the changelog.

  Synchronous approach is prone to unexpected results (e.g. too late [1], too
  frequent [2], overlooked [3]). As far as I know, warn_alloc() never helped
  with providing information other than "something is going wrong".
  I want to consider asynchronous approach which can obtain information
  during stalls with possibly relevant threads (e.g. the owner of oom_lock
  and kswapd-like threads) and serve as a trigger for actions (e.g. turn
  on/off tracepoints, ask libvirt daemon to take a memory dump of stalling
  KVM guest for diagnostic purpose).

  [1] https://bugzilla.kernel.org/show_bug.cgi?id=192981
  [2] http://lkml.kernel.org/r/CAM_iQpWuPVGc2ky8M-9yukECtS+zKjiDasNymX7rMcBjBFyM_A@mail.gmail.com
  [3] commit db73ee0d46379922 ("mm, vmscan: do not loop on too_many_isolated for ever")

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] printk: Add console owner and waiter logic to loadbalance console writes
  2017-11-09 11:03         ` Tetsuo Handa
@ 2017-11-09 11:31           ` Michal Hocko
  -1 siblings, 0 replies; 47+ messages in thread
From: Michal Hocko @ 2017-11-09 11:31 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: rostedt, linux-kernel, akpm, linux-mm, xiyou.wangcong,
	dave.hansen, hannes, mgorman, pmladek, sergey.senozhatsky,
	vbabka, peterz, torvalds, jack, mathieu.desnoyers, rostedt

On Thu 09-11-17 20:03:30, Tetsuo Handa wrote:
> Michal Hocko wrote:
> > On Thu 09-11-17 19:22:58, Tetsuo Handa wrote:
> > > Michal Hocko wrote:
> > > > Hi,
> > > > assuming that this passes warn stall torturing by Tetsuo, do you think
> > > > we can drop http://lkml.kernel.org/r/1509017339-4802-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
> > > > from the mmotm tree?
> > > 
> > > I don't think so.
> > > 
> > > The rule that "do not try to printk() faster than the kernel can write to
> > > consoles" will remain no matter how printk() changes. Unless asynchronous
> > > approach like https://lwn.net/Articles/723447/ is used, I think we can't
> > > obtain useful information.
> > 
> > Does that mean that the patch doesn't pass your test?
> > 
> 
> Test is irrelevant. See the changelog.
> 
>   Synchronous approach is prone to unexpected results (e.g. too late [1], too
>   frequent [2], overlooked [3]). As far as I know, warn_alloc() never helped
>   with providing information other than "something is going wrong".
>   I want to consider asynchronous approach which can obtain information
>   during stalls with possibly relevant threads (e.g. the owner of oom_lock
>   and kswapd-like threads) and serve as a trigger for actions (e.g. turn
>   on/off tracepoints, ask libvirt daemon to take a memory dump of stalling
>   KVM guest for diagnostic purpose).
> 
>   [1] https://bugzilla.kernel.org/show_bug.cgi?id=192981
>   [2] http://lkml.kernel.org/r/CAM_iQpWuPVGc2ky8M-9yukECtS+zKjiDasNymX7rMcBjBFyM_A@mail.gmail.com
>   [3] commit db73ee0d46379922 ("mm, vmscan: do not loop on too_many_isolated for ever")

So you want to keep the warning out of the kernel even though the
problems you are seeing are gone just to allow for an async approach
nobody is very fond of? That is a very dubious approach.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] printk: Add console owner and waiter logic to loadbalance console writes
@ 2017-11-09 11:31           ` Michal Hocko
  0 siblings, 0 replies; 47+ messages in thread
From: Michal Hocko @ 2017-11-09 11:31 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: rostedt, linux-kernel, akpm, linux-mm, xiyou.wangcong,
	dave.hansen, hannes, mgorman, pmladek, sergey.senozhatsky,
	vbabka, peterz, torvalds, jack, mathieu.desnoyers, rostedt

On Thu 09-11-17 20:03:30, Tetsuo Handa wrote:
> Michal Hocko wrote:
> > On Thu 09-11-17 19:22:58, Tetsuo Handa wrote:
> > > Michal Hocko wrote:
> > > > Hi,
> > > > assuming that this passes warn stall torturing by Tetsuo, do you think
> > > > we can drop http://lkml.kernel.org/r/1509017339-4802-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
> > > > from the mmotm tree?
> > > 
> > > I don't think so.
> > > 
> > > The rule that "do not try to printk() faster than the kernel can write to
> > > consoles" will remain no matter how printk() changes. Unless asynchronous
> > > approach like https://lwn.net/Articles/723447/ is used, I think we can't
> > > obtain useful information.
> > 
> > Does that mean that the patch doesn't pass your test?
> > 
> 
> Test is irrelevant. See the changelog.
> 
>   Synchronous approach is prone to unexpected results (e.g. too late [1], too
>   frequent [2], overlooked [3]). As far as I know, warn_alloc() never helped
>   with providing information other than "something is going wrong".
>   I want to consider asynchronous approach which can obtain information
>   during stalls with possibly relevant threads (e.g. the owner of oom_lock
>   and kswapd-like threads) and serve as a trigger for actions (e.g. turn
>   on/off tracepoints, ask libvirt daemon to take a memory dump of stalling
>   KVM guest for diagnostic purpose).
> 
>   [1] https://bugzilla.kernel.org/show_bug.cgi?id=192981
>   [2] http://lkml.kernel.org/r/CAM_iQpWuPVGc2ky8M-9yukECtS+zKjiDasNymX7rMcBjBFyM_A@mail.gmail.com
>   [3] commit db73ee0d46379922 ("mm, vmscan: do not loop on too_many_isolated for ever")

So you want to keep the warning out of the kernel even though the
problems you are seeing are gone just to allow for an async approach
nobody is very fond of? That is a very dubious approach.
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] printk: Add console owner and waiter logic to loadbalance console writes
  2017-11-09 11:31           ` Michal Hocko
@ 2017-11-09 12:07             ` Tetsuo Handa
  -1 siblings, 0 replies; 47+ messages in thread
From: Tetsuo Handa @ 2017-11-09 12:07 UTC (permalink / raw)
  To: mhocko
  Cc: rostedt, linux-kernel, akpm, linux-mm, xiyou.wangcong,
	dave.hansen, hannes, mgorman, pmladek, sergey.senozhatsky,
	vbabka, peterz, torvalds, jack, mathieu.desnoyers, rostedt

Michal Hocko wrote:
> On Thu 09-11-17 20:03:30, Tetsuo Handa wrote:
> > Michal Hocko wrote:
> > > On Thu 09-11-17 19:22:58, Tetsuo Handa wrote:
> > > > Michal Hocko wrote:
> > > > > Hi,
> > > > > assuming that this passes warn stall torturing by Tetsuo, do you think
> > > > > we can drop http://lkml.kernel.org/r/1509017339-4802-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
> > > > > from the mmotm tree?
> > > > 
> > > > I don't think so.
> > > > 
> > > > The rule that "do not try to printk() faster than the kernel can write to
> > > > consoles" will remain no matter how printk() changes. Unless asynchronous
> > > > approach like https://lwn.net/Articles/723447/ is used, I think we can't
> > > > obtain useful information.
> > > 
> > > Does that mean that the patch doesn't pass your test?
> > > 
> > 
> > Test is irrelevant. See the changelog.
> > 
> >   Synchronous approach is prone to unexpected results (e.g. too late [1], too
> >   frequent [2], overlooked [3]). As far as I know, warn_alloc() never helped
> >   with providing information other than "something is going wrong".
> >   I want to consider asynchronous approach which can obtain information
> >   during stalls with possibly relevant threads (e.g. the owner of oom_lock
> >   and kswapd-like threads) and serve as a trigger for actions (e.g. turn
> >   on/off tracepoints, ask libvirt daemon to take a memory dump of stalling
> >   KVM guest for diagnostic purpose).
> > 
> >   [1] https://bugzilla.kernel.org/show_bug.cgi?id=192981
> >   [2] http://lkml.kernel.org/r/CAM_iQpWuPVGc2ky8M-9yukECtS+zKjiDasNymX7rMcBjBFyM_A@mail.gmail.com
> >   [3] commit db73ee0d46379922 ("mm, vmscan: do not loop on too_many_isolated for ever")
> 
> So you want to keep the warning out of the kernel even though the
> problems you are seeing are gone just to allow for an async approach
> nobody is very fond of? That is a very dubious approach.

You are assuming that there are no more bugs which will be caught by
an async approach. That is seriously wrong. [3] is just an example.
http://lkml.kernel.org/r/CABXGCsOzaorL0wKZFYRFKR7RSnUL+7=vspE36sFTENoimsJGSw@mail.gmail.com
is an example where async approach will help. For example, turn various tracepoints on
if stall lasted for 5 seconds and then turn them off when stall disappeared.
It is very unfortunate that we still do not have such trigger.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] printk: Add console owner and waiter logic to loadbalance console writes
@ 2017-11-09 12:07             ` Tetsuo Handa
  0 siblings, 0 replies; 47+ messages in thread
From: Tetsuo Handa @ 2017-11-09 12:07 UTC (permalink / raw)
  To: mhocko
  Cc: rostedt, linux-kernel, akpm, linux-mm, xiyou.wangcong,
	dave.hansen, hannes, mgorman, pmladek, sergey.senozhatsky,
	vbabka, peterz, torvalds, jack, mathieu.desnoyers, rostedt

Michal Hocko wrote:
> On Thu 09-11-17 20:03:30, Tetsuo Handa wrote:
> > Michal Hocko wrote:
> > > On Thu 09-11-17 19:22:58, Tetsuo Handa wrote:
> > > > Michal Hocko wrote:
> > > > > Hi,
> > > > > assuming that this passes warn stall torturing by Tetsuo, do you think
> > > > > we can drop http://lkml.kernel.org/r/1509017339-4802-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
> > > > > from the mmotm tree?
> > > > 
> > > > I don't think so.
> > > > 
> > > > The rule that "do not try to printk() faster than the kernel can write to
> > > > consoles" will remain no matter how printk() changes. Unless asynchronous
> > > > approach like https://lwn.net/Articles/723447/ is used, I think we can't
> > > > obtain useful information.
> > > 
> > > Does that mean that the patch doesn't pass your test?
> > > 
> > 
> > Test is irrelevant. See the changelog.
> > 
> >   Synchronous approach is prone to unexpected results (e.g. too late [1], too
> >   frequent [2], overlooked [3]). As far as I know, warn_alloc() never helped
> >   with providing information other than "something is going wrong".
> >   I want to consider asynchronous approach which can obtain information
> >   during stalls with possibly relevant threads (e.g. the owner of oom_lock
> >   and kswapd-like threads) and serve as a trigger for actions (e.g. turn
> >   on/off tracepoints, ask libvirt daemon to take a memory dump of stalling
> >   KVM guest for diagnostic purpose).
> > 
> >   [1] https://bugzilla.kernel.org/show_bug.cgi?id=192981
> >   [2] http://lkml.kernel.org/r/CAM_iQpWuPVGc2ky8M-9yukECtS+zKjiDasNymX7rMcBjBFyM_A@mail.gmail.com
> >   [3] commit db73ee0d46379922 ("mm, vmscan: do not loop on too_many_isolated for ever")
> 
> So you want to keep the warning out of the kernel even though the
> problems you are seeing are gone just to allow for an async approach
> nobody is very fond of? That is a very dubious approach.

You are assuming that there are no more bugs which will be caught by
an async approach. That is seriously wrong. [3] is just an example.
http://lkml.kernel.org/r/CABXGCsOzaorL0wKZFYRFKR7RSnUL+7=vspE36sFTENoimsJGSw@mail.gmail.com
is an example where async approach will help. For example, turn various tracepoints on
if stall lasted for 5 seconds and then turn them off when stall disappeared.
It is very unfortunate that we still do not have such trigger.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] printk: Add console owner and waiter logic to load balance console writes
  2017-11-08 15:27 ` Steven Rostedt
@ 2017-11-24 15:54   ` Petr Mladek
  -1 siblings, 0 replies; 47+ messages in thread
From: Petr Mladek @ 2017-11-24 15:54 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, akpm, linux-mm, Cong Wang, Dave Hansen, Johannes Weiner,
	Mel Gorman, Michal Hocko, Sergey Senozhatsky, Vlastimil Babka,
	Peter Zijlstra, Linus Torvalds, Jan Kara, Mathieu Desnoyers,
	Tetsuo Handa, rostedt, Byungchul Park

On Wed 2017-11-08 10:27:23, Steven Rostedt wrote:
> Here's the design again:
> 
> I added a "console_owner" which is set to a task that is actively
> writing to the consoles. It is *not* the same an the owner of the
> console_lock. It is only set when doing the calls to the console
> functions. It is protected by a console_owner_lock which is a raw spin
> lock.
> 
> There is a console_waiter. This is set when there is an active console
> owner that is not current, and waiter is not set. This too is protected
> by console_owner_lock.
> 
> In printk() when it tries to write to the consoles, we have:
> 
> 	if (console_trylock())
> 		console_unlock();
> 
> Now I added an else, which will check if there is an active owner, and
> no current waiter. If that is the case, then console_waiter is set, and
> the task goes into a spin until it is no longer set.
> 
> When the active console owner finishes writing the current message to
> the consoles, it grabs the console_owner_lock and sees if there is a
> waiter, and clears console_owner.
> 
> If there is a waiter, then it breaks out of the loop, clears the waiter
> flag (because that will release the waiter from its spin), and exits.
> Note, it does *not* release the console semaphore. Because it is a
> semaphore, there is no owner.

This is very nice trick how to avoid steeling the lock by
a waiter that might sleep.


> Another task may release it. This means
> that the waiter is guaranteed to be the new console owner! Which it
> becomes.
> 
> Then the waiter calls console_unlock() and continues to write to the
> consoles.
> 
> If another task comes along and does a printk() it too can become the
> new waiter, and we wash rinse and repeat!
> 
> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

> Index: linux-trace.git/kernel/printk/printk.c
> ===================================================================
> --- linux-trace.git.orig/kernel/printk/printk.c
> +++ linux-trace.git/kernel/printk/printk.c
> @@ -86,8 +86,15 @@ EXPORT_SYMBOL_GPL(console_drivers);
>  static struct lockdep_map console_lock_dep_map = {
>  	.name = "console_lock"
>  };
> +static struct lockdep_map console_owner_dep_map = {
> +	.name = "console_owner"
> +};
>  #endif
>  
> +static DEFINE_RAW_SPINLOCK(console_owner_lock);
> +static struct task_struct *console_owner;
> +static bool console_waiter;
> +
>  enum devkmsg_log_bits {
>  	__DEVKMSG_LOG_BIT_ON = 0,
>  	__DEVKMSG_LOG_BIT_OFF,
> @@ -1753,8 +1760,56 @@ asmlinkage int vprintk_emit(int facility
>  		 * semaphore.  The release will print out buffers and wake up
>  		 * /dev/kmsg and syslog() users.
>  		 */
> -		if (console_trylock())
> +		if (console_trylock()) {
>  			console_unlock();
> +		} else {
> +			struct task_struct *owner = NULL;
> +			bool waiter;
> +			bool spin = false;
> +
> +			printk_safe_enter_irqsave(flags);
> +
> +			raw_spin_lock(&console_owner_lock);
> +			owner = READ_ONCE(console_owner);
> +			waiter = READ_ONCE(console_waiter);
> +			if (!waiter && owner && owner != current) {
> +				WRITE_ONCE(console_waiter, true);
> +				spin = true;
> +			}
> +			raw_spin_unlock(&console_owner_lock);
> +
> +			/*
> +			 * If there is an active printk() writing to the
> +			 * consoles, instead of having it write our data too,
> +			 * see if we can offload that load from the active
> +			 * printer, and do some printing ourselves.
> +			 * Go into a spin only if there isn't already a waiter
> +			 * spinning, and there is an active printer, and
> +			 * that active printer isn't us (recursive printk?).
> +			 */
> +			if (spin) {
> +				/* We spin waiting for the owner to release us */
> +				spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_);
> +				/* Owner will clear console_waiter on hand off */
> +				while (READ_ONCE(console_waiter))
> +					cpu_relax();
> +
> +				spin_release(&console_owner_dep_map, 1, _THIS_IP_);
> +				printk_safe_exit_irqrestore(flags);
> +
> +				/*
> +				 * The owner passed the console lock to us.
> +				 * Since we did not spin on console lock, annotate
> +				 * this as a trylock. Otherwise lockdep will
> +				 * complain.
> +				 */
> +				mutex_acquire(&console_lock_dep_map, 0, 1, _THIS_IP_);

I am not sure that this correctly imitates the real lock
dependency. The trylock flag means that we are able to skip
this section when the lock is taken elsewhere. But it is not
the whole truth. In fact, we are blocked in this code path
when console_sem is taken by another process.

The use of console_owner_lock is not enough. The other
console_sem calls a lot of code outside the section
guarded by console_owner_lock.

I think that we have actually entered the cross-release bunch
of problems, see https://lwn.net/Articles/709849/

IMHO, we need to use struct lockdep_map_cross for
console_lock_dep_map. Also we need to put somewhere
lock_commit_crosslock().

I am going to play with it. Also I add Byungchul Park into CC.
This is why I keep most of the context in this reply (I am sorry
for it).


> +				console_unlock();
> +				printk_safe_enter_irqsave(flags);
> +			}
> +			printk_safe_exit_irqrestore(flags);
> +
> +		}
>  	}
>  
>  	return printed_len;
> @@ -2141,6 +2196,7 @@ void console_unlock(void)
>  	static u64 seen_seq;
>  	unsigned long flags;
>  	bool wake_klogd = false;
> +	bool waiter = false;
>  	bool do_cond_resched, retry;
>  
>  	if (console_suspended) {
> @@ -2229,14 +2285,64 @@ skip:
>  		console_seq++;
>  		raw_spin_unlock(&logbuf_lock);
>  
> +		/*
> +		 * While actively printing out messages, if another printk()
> +		 * were to occur on another CPU, it may wait for this one to
> +		 * finish. This task can not be preempted if there is a
> +		 * waiter waiting to take over.
> +		 */
> +		raw_spin_lock(&console_owner_lock);
> +		console_owner = current;
> +		raw_spin_unlock(&console_owner_lock);
> +
> +		/* The waiter may spin on us after setting console_owner */
> +		spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_);
> +
>  		stop_critical_timings();	/* don't trace print latency */
>  		call_console_drivers(ext_text, ext_len, text, len);
>  		start_critical_timings();
> +
> +		raw_spin_lock(&console_owner_lock);
> +		waiter = READ_ONCE(console_waiter);
> +		console_owner = NULL;
> +		raw_spin_unlock(&console_owner_lock);
> +
> +		/*
> +		 * If there is a waiter waiting for us, then pass the
> +		 * rest of the work load over to that waiter.
> +		 */
> +		if (waiter)
> +			break;
> +
> +		/* There was no waiter, and nothing will spin on us here */
> +		spin_release(&console_owner_dep_map, 1, _THIS_IP_);
> +
>  		printk_safe_exit_irqrestore(flags);
>  
>  		if (do_cond_resched)
>  			cond_resched();
>  	}
> +
> +	/*
> +	 * If there is an active waiter waiting on the console_lock.
> +	 * Pass off the printing to the waiter, and the waiter
> +	 * will continue printing on its CPU, and when all writing
> +	 * has finished, the last printer will wake up klogd.
> +	 */
> +	if (waiter) {
> +		WRITE_ONCE(console_waiter, false);
> +		/* The waiter is now free to continue */
> +		spin_release(&console_owner_dep_map, 1, _THIS_IP_);
> +		/*
> +		 * Hand off console_lock to waiter. The waiter will perform
> +		 * the up(). After this, the waiter is the console_lock owner.
> +		 */
> +		mutex_release(&console_lock_dep_map, 1, _THIS_IP_);
> +		printk_safe_exit_irqrestore(flags);
> +		/* Note, if waiter is set, logbuf_lock is not held */
> +		return;
> +	}
> +
>  	console_locked = 0;
>  
>  	/* Release the exclusive_console once it is used */

I am most concerned about the possible deadlock at the moment.

Otherwise, I agree that this solution should help to avoid softlockup
in many situations. It should not make things worse. Only more
testing would show if it would be enough for the real life problems.

One plus is that this solution does not need special hacks in
the critical paths like panic, suspend, halt, kexec. Well, these
paths still would benefit from a more synchronous mode.

The implementation is much smaller than I have expected.
Well, I hear some alarming bells. The fact that several
people, including Tejun and Jack, reported "invalid"
problems means that this cake was easy to eat.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] printk: Add console owner and waiter logic to load balance console writes
@ 2017-11-24 15:54   ` Petr Mladek
  0 siblings, 0 replies; 47+ messages in thread
From: Petr Mladek @ 2017-11-24 15:54 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, akpm, linux-mm, Cong Wang, Dave Hansen, Johannes Weiner,
	Mel Gorman, Michal Hocko, Sergey Senozhatsky, Vlastimil Babka,
	Peter Zijlstra, Linus Torvalds, Jan Kara, Mathieu Desnoyers,
	Tetsuo Handa, rostedt, Byungchul Park

On Wed 2017-11-08 10:27:23, Steven Rostedt wrote:
> Here's the design again:
> 
> I added a "console_owner" which is set to a task that is actively
> writing to the consoles. It is *not* the same an the owner of the
> console_lock. It is only set when doing the calls to the console
> functions. It is protected by a console_owner_lock which is a raw spin
> lock.
> 
> There is a console_waiter. This is set when there is an active console
> owner that is not current, and waiter is not set. This too is protected
> by console_owner_lock.
> 
> In printk() when it tries to write to the consoles, we have:
> 
> 	if (console_trylock())
> 		console_unlock();
> 
> Now I added an else, which will check if there is an active owner, and
> no current waiter. If that is the case, then console_waiter is set, and
> the task goes into a spin until it is no longer set.
> 
> When the active console owner finishes writing the current message to
> the consoles, it grabs the console_owner_lock and sees if there is a
> waiter, and clears console_owner.
> 
> If there is a waiter, then it breaks out of the loop, clears the waiter
> flag (because that will release the waiter from its spin), and exits.
> Note, it does *not* release the console semaphore. Because it is a
> semaphore, there is no owner.

This is very nice trick how to avoid steeling the lock by
a waiter that might sleep.


> Another task may release it. This means
> that the waiter is guaranteed to be the new console owner! Which it
> becomes.
> 
> Then the waiter calls console_unlock() and continues to write to the
> consoles.
> 
> If another task comes along and does a printk() it too can become the
> new waiter, and we wash rinse and repeat!
> 
> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

> Index: linux-trace.git/kernel/printk/printk.c
> ===================================================================
> --- linux-trace.git.orig/kernel/printk/printk.c
> +++ linux-trace.git/kernel/printk/printk.c
> @@ -86,8 +86,15 @@ EXPORT_SYMBOL_GPL(console_drivers);
>  static struct lockdep_map console_lock_dep_map = {
>  	.name = "console_lock"
>  };
> +static struct lockdep_map console_owner_dep_map = {
> +	.name = "console_owner"
> +};
>  #endif
>  
> +static DEFINE_RAW_SPINLOCK(console_owner_lock);
> +static struct task_struct *console_owner;
> +static bool console_waiter;
> +
>  enum devkmsg_log_bits {
>  	__DEVKMSG_LOG_BIT_ON = 0,
>  	__DEVKMSG_LOG_BIT_OFF,
> @@ -1753,8 +1760,56 @@ asmlinkage int vprintk_emit(int facility
>  		 * semaphore.  The release will print out buffers and wake up
>  		 * /dev/kmsg and syslog() users.
>  		 */
> -		if (console_trylock())
> +		if (console_trylock()) {
>  			console_unlock();
> +		} else {
> +			struct task_struct *owner = NULL;
> +			bool waiter;
> +			bool spin = false;
> +
> +			printk_safe_enter_irqsave(flags);
> +
> +			raw_spin_lock(&console_owner_lock);
> +			owner = READ_ONCE(console_owner);
> +			waiter = READ_ONCE(console_waiter);
> +			if (!waiter && owner && owner != current) {
> +				WRITE_ONCE(console_waiter, true);
> +				spin = true;
> +			}
> +			raw_spin_unlock(&console_owner_lock);
> +
> +			/*
> +			 * If there is an active printk() writing to the
> +			 * consoles, instead of having it write our data too,
> +			 * see if we can offload that load from the active
> +			 * printer, and do some printing ourselves.
> +			 * Go into a spin only if there isn't already a waiter
> +			 * spinning, and there is an active printer, and
> +			 * that active printer isn't us (recursive printk?).
> +			 */
> +			if (spin) {
> +				/* We spin waiting for the owner to release us */
> +				spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_);
> +				/* Owner will clear console_waiter on hand off */
> +				while (READ_ONCE(console_waiter))
> +					cpu_relax();
> +
> +				spin_release(&console_owner_dep_map, 1, _THIS_IP_);
> +				printk_safe_exit_irqrestore(flags);
> +
> +				/*
> +				 * The owner passed the console lock to us.
> +				 * Since we did not spin on console lock, annotate
> +				 * this as a trylock. Otherwise lockdep will
> +				 * complain.
> +				 */
> +				mutex_acquire(&console_lock_dep_map, 0, 1, _THIS_IP_);

I am not sure that this correctly imitates the real lock
dependency. The trylock flag means that we are able to skip
this section when the lock is taken elsewhere. But it is not
the whole truth. In fact, we are blocked in this code path
when console_sem is taken by another process.

The use of console_owner_lock is not enough. The other
console_sem calls a lot of code outside the section
guarded by console_owner_lock.

I think that we have actually entered the cross-release bunch
of problems, see https://lwn.net/Articles/709849/

IMHO, we need to use struct lockdep_map_cross for
console_lock_dep_map. Also we need to put somewhere
lock_commit_crosslock().

I am going to play with it. Also I add Byungchul Park into CC.
This is why I keep most of the context in this reply (I am sorry
for it).


> +				console_unlock();
> +				printk_safe_enter_irqsave(flags);
> +			}
> +			printk_safe_exit_irqrestore(flags);
> +
> +		}
>  	}
>  
>  	return printed_len;
> @@ -2141,6 +2196,7 @@ void console_unlock(void)
>  	static u64 seen_seq;
>  	unsigned long flags;
>  	bool wake_klogd = false;
> +	bool waiter = false;
>  	bool do_cond_resched, retry;
>  
>  	if (console_suspended) {
> @@ -2229,14 +2285,64 @@ skip:
>  		console_seq++;
>  		raw_spin_unlock(&logbuf_lock);
>  
> +		/*
> +		 * While actively printing out messages, if another printk()
> +		 * were to occur on another CPU, it may wait for this one to
> +		 * finish. This task can not be preempted if there is a
> +		 * waiter waiting to take over.
> +		 */
> +		raw_spin_lock(&console_owner_lock);
> +		console_owner = current;
> +		raw_spin_unlock(&console_owner_lock);
> +
> +		/* The waiter may spin on us after setting console_owner */
> +		spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_);
> +
>  		stop_critical_timings();	/* don't trace print latency */
>  		call_console_drivers(ext_text, ext_len, text, len);
>  		start_critical_timings();
> +
> +		raw_spin_lock(&console_owner_lock);
> +		waiter = READ_ONCE(console_waiter);
> +		console_owner = NULL;
> +		raw_spin_unlock(&console_owner_lock);
> +
> +		/*
> +		 * If there is a waiter waiting for us, then pass the
> +		 * rest of the work load over to that waiter.
> +		 */
> +		if (waiter)
> +			break;
> +
> +		/* There was no waiter, and nothing will spin on us here */
> +		spin_release(&console_owner_dep_map, 1, _THIS_IP_);
> +
>  		printk_safe_exit_irqrestore(flags);
>  
>  		if (do_cond_resched)
>  			cond_resched();
>  	}
> +
> +	/*
> +	 * If there is an active waiter waiting on the console_lock.
> +	 * Pass off the printing to the waiter, and the waiter
> +	 * will continue printing on its CPU, and when all writing
> +	 * has finished, the last printer will wake up klogd.
> +	 */
> +	if (waiter) {
> +		WRITE_ONCE(console_waiter, false);
> +		/* The waiter is now free to continue */
> +		spin_release(&console_owner_dep_map, 1, _THIS_IP_);
> +		/*
> +		 * Hand off console_lock to waiter. The waiter will perform
> +		 * the up(). After this, the waiter is the console_lock owner.
> +		 */
> +		mutex_release(&console_lock_dep_map, 1, _THIS_IP_);
> +		printk_safe_exit_irqrestore(flags);
> +		/* Note, if waiter is set, logbuf_lock is not held */
> +		return;
> +	}
> +
>  	console_locked = 0;
>  
>  	/* Release the exclusive_console once it is used */

I am most concerned about the possible deadlock at the moment.

Otherwise, I agree that this solution should help to avoid softlockup
in many situations. It should not make things worse. Only more
testing would show if it would be enough for the real life problems.

One plus is that this solution does not need special hacks in
the critical paths like panic, suspend, halt, kexec. Well, these
paths still would benefit from a more synchronous mode.

The implementation is much smaller than I have expected.
Well, I hear some alarming bells. The fact that several
people, including Tejun and Jack, reported "invalid"
problems means that this cake was easy to eat.

Best Regards,
Petr

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] printk: Add console owner and waiter logic to load balance console writes
  2017-11-24 15:54   ` Petr Mladek
@ 2017-11-24 15:58     ` Petr Mladek
  -1 siblings, 0 replies; 47+ messages in thread
From: Petr Mladek @ 2017-11-24 15:58 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, akpm, linux-mm, Cong Wang, Dave Hansen, Johannes Weiner,
	Mel Gorman, Michal Hocko, Sergey Senozhatsky, Vlastimil Babka,
	Peter Zijlstra, Linus Torvalds, Jan Kara, Mathieu Desnoyers,
	Tetsuo Handa, rostedt, Byungchul Park

On Fri 2017-11-24 16:54:16, Petr Mladek wrote:
> On Wed 2017-11-08 10:27:23, Steven Rostedt wrote:
> > If there is a waiter, then it breaks out of the loop, clears the waiter
> > flag (because that will release the waiter from its spin), and exits.
> > Note, it does *not* release the console semaphore. Because it is a
> > semaphore, there is no owner.
> 
> > Index: linux-trace.git/kernel/printk/printk.c
> > ===================================================================
> > --- linux-trace.git.orig/kernel/printk/printk.c
> > +++ linux-trace.git/kernel/printk/printk.c
> > @@ -86,8 +86,15 @@ EXPORT_SYMBOL_GPL(console_drivers);
> >  static struct lockdep_map console_lock_dep_map = {
> >  	.name = "console_lock"
> >  };
> > +static struct lockdep_map console_owner_dep_map = {
> > +	.name = "console_owner"
> > +};
> >  #endif
> >  
> > +static DEFINE_RAW_SPINLOCK(console_owner_lock);
> > +static struct task_struct *console_owner;
> > +static bool console_waiter;
> > +
> >  enum devkmsg_log_bits {
> >  	__DEVKMSG_LOG_BIT_ON = 0,
> >  	__DEVKMSG_LOG_BIT_OFF,
> > @@ -1753,8 +1760,56 @@ asmlinkage int vprintk_emit(int facility
> >  		 * semaphore.  The release will print out buffers and wake up
> >  		 * /dev/kmsg and syslog() users.
> >  		 */
> > -		if (console_trylock())
> > +		if (console_trylock()) {
> >  			console_unlock();
> > +		} else {
> > +			struct task_struct *owner = NULL;
> > +			bool waiter;
> > +			bool spin = false;
> > +
> > +			printk_safe_enter_irqsave(flags);
> > +
> > +			raw_spin_lock(&console_owner_lock);
> > +			owner = READ_ONCE(console_owner);
> > +			waiter = READ_ONCE(console_waiter);
> > +			if (!waiter && owner && owner != current) {
> > +				WRITE_ONCE(console_waiter, true);
> > +				spin = true;
> > +			}
> > +			raw_spin_unlock(&console_owner_lock);
> > +
> > +			/*
> > +			 * If there is an active printk() writing to the
> > +			 * consoles, instead of having it write our data too,
> > +			 * see if we can offload that load from the active
> > +			 * printer, and do some printing ourselves.
> > +			 * Go into a spin only if there isn't already a waiter
> > +			 * spinning, and there is an active printer, and
> > +			 * that active printer isn't us (recursive printk?).
> > +			 */
> > +			if (spin) {
> > +				/* We spin waiting for the owner to release us */
> > +				spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_);
> > +				/* Owner will clear console_waiter on hand off */
> > +				while (READ_ONCE(console_waiter))
> > +					cpu_relax();
> > +
> > +				spin_release(&console_owner_dep_map, 1, _THIS_IP_);
> > +				printk_safe_exit_irqrestore(flags);
> > +
> > +				/*
> > +				 * The owner passed the console lock to us.
> > +				 * Since we did not spin on console lock, annotate
> > +				 * this as a trylock. Otherwise lockdep will
> > +				 * complain.
> > +				 */
> > +				mutex_acquire(&console_lock_dep_map, 0, 1, _THIS_IP_);
> 
> I am not sure that this correctly imitates the real lock
> dependency. The trylock flag means that we are able to skip
> this section when the lock is taken elsewhere. But it is not
> the whole truth. In fact, we are blocked in this code path
> when console_sem is taken by another process.
> 
> The use of console_owner_lock is not enough. The other
> console_sem calls a lot of code outside the section
> guarded by console_owner_lock.
> 
> I think that we have actually entered the cross-release bunch
> of problems, see https://lwn.net/Articles/709849/
> 
> IMHO, we need to use struct lockdep_map_cross for
> console_lock_dep_map. Also we need to put somewhere
> lock_commit_crosslock().
> 
> I am going to play with it. Also I add Byungchul Park into CC.
> This is why I keep most of the context in this reply (I am sorry
> for it).

See my first attempt below. I do not get any lockdep
warning but it is possible that I just did it wrong.


>From 0345785d4767f853ff2d733b565084c3339f9fe0 Mon Sep 17 00:00:00 2001
From: Petr Mladek <pmladek@suse.com>
Date: Fri, 24 Nov 2017 16:50:25 +0100
Subject: [PATCH] printk: Try to describe real console_sem dependecies using
 the crosslock feature

console_sem might be newly taken and released by different
processes. This is an attempt to check the crossrelease
dependencies.
---
 kernel/printk/printk.c | 25 +++++++++----------------
 1 file changed, 9 insertions(+), 16 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 040fb948924e..bda25feae0d5 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -83,9 +83,9 @@ struct console *console_drivers;
 EXPORT_SYMBOL_GPL(console_drivers);
 
 #ifdef CONFIG_LOCKDEP
-static struct lockdep_map console_lock_dep_map = {
-	.name = "console_lock"
-};
+static struct lockdep_map_cross console_lock_dep_map =
+	STATIC_CROSS_LOCKDEP_MAP_INIT("console_lock", &console_sem);
+
 static struct lockdep_map console_owner_dep_map = {
 	.name = "console_owner"
 };
@@ -218,7 +218,7 @@ static int nr_ext_console_drivers;
  */
 #define down_console_sem() do { \
 	down(&console_sem);\
-	mutex_acquire(&console_lock_dep_map, 0, 0, _RET_IP_);\
+	mutex_acquire((struct lockdep_map *)&console_lock_dep_map, 0, 0, _RET_IP_);\
 } while (0)
 
 static int __down_trylock_console_sem(unsigned long ip)
@@ -237,7 +237,7 @@ static int __down_trylock_console_sem(unsigned long ip)
 
 	if (lock_failed)
 		return 1;
-	mutex_acquire(&console_lock_dep_map, 0, 1, ip);
+	mutex_acquire((struct lockdep_map *)&console_lock_dep_map, 0, 1, ip);
 	return 0;
 }
 #define down_trylock_console_sem() __down_trylock_console_sem(_RET_IP_)
@@ -246,7 +246,7 @@ static void __up_console_sem(unsigned long ip)
 {
 	unsigned long flags;
 
-	mutex_release(&console_lock_dep_map, 1, ip);
+	mutex_release((struct lockdep_map *)&console_lock_dep_map, 1, ip);
 
 	printk_safe_enter_irqsave(flags);
 	up(&console_sem);
@@ -1797,13 +1797,6 @@ asmlinkage int vprintk_emit(int facility, int level,
 				spin_release(&console_owner_dep_map, 1, _THIS_IP_);
 				printk_safe_exit_irqrestore(flags);
 
-				/*
-				 * The owner passed the console lock to us.
-				 * Since we did not spin on console lock, annotate
-				 * this as a trylock. Otherwise lockdep will
-				 * complain.
-				 */
-				mutex_acquire(&console_lock_dep_map, 0, 1, _THIS_IP_);
 				console_unlock();
 				printk_safe_enter_irqsave(flags);
 			}
@@ -2334,10 +2327,10 @@ void console_unlock(void)
 		/* The waiter is now free to continue */
 		spin_release(&console_owner_dep_map, 1, _THIS_IP_);
 		/*
-		 * Hand off console_lock to waiter. The waiter will perform
-		 * the up(). After this, the waiter is the console_lock owner.
+		 * Hand off console_lock to waiter. After this, the waiter
+		 * is the console_lock owner.
 		 */
-		mutex_release(&console_lock_dep_map, 1, _THIS_IP_);
+		lock_commit_crosslock((struct lockdep_map *)&console_lock_dep_map);
 		printk_safe_exit_irqrestore(flags);
 		/* Note, if waiter is set, logbuf_lock is not held */
 		return;
-- 
2.13.6

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] printk: Add console owner and waiter logic to load balance console writes
@ 2017-11-24 15:58     ` Petr Mladek
  0 siblings, 0 replies; 47+ messages in thread
From: Petr Mladek @ 2017-11-24 15:58 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, akpm, linux-mm, Cong Wang, Dave Hansen, Johannes Weiner,
	Mel Gorman, Michal Hocko, Sergey Senozhatsky, Vlastimil Babka,
	Peter Zijlstra, Linus Torvalds, Jan Kara, Mathieu Desnoyers,
	Tetsuo Handa, rostedt, Byungchul Park

On Fri 2017-11-24 16:54:16, Petr Mladek wrote:
> On Wed 2017-11-08 10:27:23, Steven Rostedt wrote:
> > If there is a waiter, then it breaks out of the loop, clears the waiter
> > flag (because that will release the waiter from its spin), and exits.
> > Note, it does *not* release the console semaphore. Because it is a
> > semaphore, there is no owner.
> 
> > Index: linux-trace.git/kernel/printk/printk.c
> > ===================================================================
> > --- linux-trace.git.orig/kernel/printk/printk.c
> > +++ linux-trace.git/kernel/printk/printk.c
> > @@ -86,8 +86,15 @@ EXPORT_SYMBOL_GPL(console_drivers);
> >  static struct lockdep_map console_lock_dep_map = {
> >  	.name = "console_lock"
> >  };
> > +static struct lockdep_map console_owner_dep_map = {
> > +	.name = "console_owner"
> > +};
> >  #endif
> >  
> > +static DEFINE_RAW_SPINLOCK(console_owner_lock);
> > +static struct task_struct *console_owner;
> > +static bool console_waiter;
> > +
> >  enum devkmsg_log_bits {
> >  	__DEVKMSG_LOG_BIT_ON = 0,
> >  	__DEVKMSG_LOG_BIT_OFF,
> > @@ -1753,8 +1760,56 @@ asmlinkage int vprintk_emit(int facility
> >  		 * semaphore.  The release will print out buffers and wake up
> >  		 * /dev/kmsg and syslog() users.
> >  		 */
> > -		if (console_trylock())
> > +		if (console_trylock()) {
> >  			console_unlock();
> > +		} else {
> > +			struct task_struct *owner = NULL;
> > +			bool waiter;
> > +			bool spin = false;
> > +
> > +			printk_safe_enter_irqsave(flags);
> > +
> > +			raw_spin_lock(&console_owner_lock);
> > +			owner = READ_ONCE(console_owner);
> > +			waiter = READ_ONCE(console_waiter);
> > +			if (!waiter && owner && owner != current) {
> > +				WRITE_ONCE(console_waiter, true);
> > +				spin = true;
> > +			}
> > +			raw_spin_unlock(&console_owner_lock);
> > +
> > +			/*
> > +			 * If there is an active printk() writing to the
> > +			 * consoles, instead of having it write our data too,
> > +			 * see if we can offload that load from the active
> > +			 * printer, and do some printing ourselves.
> > +			 * Go into a spin only if there isn't already a waiter
> > +			 * spinning, and there is an active printer, and
> > +			 * that active printer isn't us (recursive printk?).
> > +			 */
> > +			if (spin) {
> > +				/* We spin waiting for the owner to release us */
> > +				spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_);
> > +				/* Owner will clear console_waiter on hand off */
> > +				while (READ_ONCE(console_waiter))
> > +					cpu_relax();
> > +
> > +				spin_release(&console_owner_dep_map, 1, _THIS_IP_);
> > +				printk_safe_exit_irqrestore(flags);
> > +
> > +				/*
> > +				 * The owner passed the console lock to us.
> > +				 * Since we did not spin on console lock, annotate
> > +				 * this as a trylock. Otherwise lockdep will
> > +				 * complain.
> > +				 */
> > +				mutex_acquire(&console_lock_dep_map, 0, 1, _THIS_IP_);
> 
> I am not sure that this correctly imitates the real lock
> dependency. The trylock flag means that we are able to skip
> this section when the lock is taken elsewhere. But it is not
> the whole truth. In fact, we are blocked in this code path
> when console_sem is taken by another process.
> 
> The use of console_owner_lock is not enough. The other
> console_sem calls a lot of code outside the section
> guarded by console_owner_lock.
> 
> I think that we have actually entered the cross-release bunch
> of problems, see https://lwn.net/Articles/709849/
> 
> IMHO, we need to use struct lockdep_map_cross for
> console_lock_dep_map. Also we need to put somewhere
> lock_commit_crosslock().
> 
> I am going to play with it. Also I add Byungchul Park into CC.
> This is why I keep most of the context in this reply (I am sorry
> for it).

See my first attempt below. I do not get any lockdep
warning but it is possible that I just did it wrong.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] printk: Add console owner and waiter logic to load balance console writes
  2017-11-08 15:27 ` Steven Rostedt
@ 2017-11-27  8:48   ` Byungchul Park
  -1 siblings, 0 replies; 47+ messages in thread
From: Byungchul Park @ 2017-11-27  8:48 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, akpm, linux-mm, Cong Wang, Dave Hansen, Johannes Weiner,
	Mel Gorman, Michal Hocko, Petr Mladek, Sergey Senozhatsky,
	Vlastimil Babka, yuwang.yuwang, Peter Zijlstra, Linus Torvalds,
	Jan Kara, Mathieu Desnoyers, Tetsuo Handa, rostedt, kernel-team

On Wed, Nov 08, 2017 at 10:27:23AM -0500, Steven Rostedt wrote:
> --- linux-trace.git.orig/kernel/printk/printk.c
> +++ linux-trace.git/kernel/printk/printk.c
> @@ -86,8 +86,15 @@ EXPORT_SYMBOL_GPL(console_drivers);
>  static struct lockdep_map console_lock_dep_map = {
>  	.name = "console_lock"
>  };
> +static struct lockdep_map console_owner_dep_map = {
> +	.name = "console_owner"
> +};
>  #endif
>  
> +static DEFINE_RAW_SPINLOCK(console_owner_lock);
> +static struct task_struct *console_owner;
> +static bool console_waiter;
> +
>  enum devkmsg_log_bits {
>  	__DEVKMSG_LOG_BIT_ON = 0,
>  	__DEVKMSG_LOG_BIT_OFF,
> @@ -1753,8 +1760,56 @@ asmlinkage int vprintk_emit(int facility
>  		 * semaphore.  The release will print out buffers and wake up
>  		 * /dev/kmsg and syslog() users.
>  		 */
> -		if (console_trylock())
> +		if (console_trylock()) {
>  			console_unlock();
> +		} else {
> +			struct task_struct *owner = NULL;
> +			bool waiter;
> +			bool spin = false;
> +
> +			printk_safe_enter_irqsave(flags);
> +
> +			raw_spin_lock(&console_owner_lock);
> +			owner = READ_ONCE(console_owner);
> +			waiter = READ_ONCE(console_waiter);
> +			if (!waiter && owner && owner != current) {
> +				WRITE_ONCE(console_waiter, true);
> +				spin = true;
> +			}
> +			raw_spin_unlock(&console_owner_lock);
> +
> +			/*
> +			 * If there is an active printk() writing to the
> +			 * consoles, instead of having it write our data too,
> +			 * see if we can offload that load from the active
> +			 * printer, and do some printing ourselves.
> +			 * Go into a spin only if there isn't already a waiter
> +			 * spinning, and there is an active printer, and
> +			 * that active printer isn't us (recursive printk?).
> +			 */
> +			if (spin) {
> +				/* We spin waiting for the owner to release us */
> +				spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_);

Hello Steven,

I think it would be better to use cross-release stuff here, because the
waiter waits for an event which happens in another context.

> +				/* Owner will clear console_waiter on hand off */
> +				while (READ_ONCE(console_waiter))
> +					cpu_relax();
> +
> +				spin_release(&console_owner_dep_map, 1, _THIS_IP_);
> +				printk_safe_exit_irqrestore(flags);
> +
> +				/*
> +				 * The owner passed the console lock to us.
> +				 * Since we did not spin on console lock, annotate
> +				 * this as a trylock. Otherwise lockdep will
> +				 * complain.
> +				 */
> +				mutex_acquire(&console_lock_dep_map, 0, 1, _THIS_IP_);

I'm afraid if it's ok even not to lock(or trylock) actually here. Is there
any problem if you call console_trylock() instead of mutex_acquire() here?

> +				console_unlock();
> +				printk_safe_enter_irqsave(flags);
> +			}
> +			printk_safe_exit_irqrestore(flags);
> +
> +		}
>  	}
>  
>  	return printed_len;
> @@ -2141,6 +2196,7 @@ void console_unlock(void)
>  	static u64 seen_seq;
>  	unsigned long flags;
>  	bool wake_klogd = false;
> +	bool waiter = false;
>  	bool do_cond_resched, retry;
>  
>  	if (console_suspended) {
> @@ -2229,14 +2285,64 @@ skip:
>  		console_seq++;
>  		raw_spin_unlock(&logbuf_lock);
>  
> +		/*
> +		 * While actively printing out messages, if another printk()
> +		 * were to occur on another CPU, it may wait for this one to
> +		 * finish. This task can not be preempted if there is a
> +		 * waiter waiting to take over.
> +		 */
> +		raw_spin_lock(&console_owner_lock);
> +		console_owner = current;
> +		raw_spin_unlock(&console_owner_lock);
> +
> +		/* The waiter may spin on us after setting console_owner */
> +		spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_);

If you want to do this speculatively here, I think it would be better to
use a read recursive acquisition. I think spin_acquire() is too stong
for that purpose - I also mentioned it on workqueue flush code. Don't
you think so?

> +
>  		stop_critical_timings();	/* don't trace print latency */
>  		call_console_drivers(ext_text, ext_len, text, len);
>  		start_critical_timings();
> +
> +		raw_spin_lock(&console_owner_lock);
> +		waiter = READ_ONCE(console_waiter);
> +		console_owner = NULL;
> +		raw_spin_unlock(&console_owner_lock);
> +
> +		/*
> +		 * If there is a waiter waiting for us, then pass the
> +		 * rest of the work load over to that waiter.
> +		 */
> +		if (waiter)
> +			break;
> +
> +		/* There was no waiter, and nothing will spin on us here */
> +		spin_release(&console_owner_dep_map, 1, _THIS_IP_);

I think this release() can be moved up over 'if (waiter)' because only
waiters within the region between acquire() and release() are meaningful.

> +
>  		printk_safe_exit_irqrestore(flags);
>  
>  		if (do_cond_resched)
>  			cond_resched();
>  	}
> +
> +	/*
> +	 * If there is an active waiter waiting on the console_lock.
> +	 * Pass off the printing to the waiter, and the waiter
> +	 * will continue printing on its CPU, and when all writing
> +	 * has finished, the last printer will wake up klogd.
> +	 */
> +	if (waiter) {
> +		WRITE_ONCE(console_waiter, false);
> +		/* The waiter is now free to continue */
> +		spin_release(&console_owner_dep_map, 1, _THIS_IP_);

So this can be removed.

Thanks,
Byungchul

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] printk: Add console owner and waiter logic to load balance console writes
@ 2017-11-27  8:48   ` Byungchul Park
  0 siblings, 0 replies; 47+ messages in thread
From: Byungchul Park @ 2017-11-27  8:48 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, akpm, linux-mm, Cong Wang, Dave Hansen, Johannes Weiner,
	Mel Gorman, Michal Hocko, Petr Mladek, Sergey Senozhatsky,
	Vlastimil Babka, yuwang.yuwang, Peter Zijlstra, Linus Torvalds,
	Jan Kara, Mathieu Desnoyers, Tetsuo Handa, rostedt, kernel-team

On Wed, Nov 08, 2017 at 10:27:23AM -0500, Steven Rostedt wrote:
> --- linux-trace.git.orig/kernel/printk/printk.c
> +++ linux-trace.git/kernel/printk/printk.c
> @@ -86,8 +86,15 @@ EXPORT_SYMBOL_GPL(console_drivers);
>  static struct lockdep_map console_lock_dep_map = {
>  	.name = "console_lock"
>  };
> +static struct lockdep_map console_owner_dep_map = {
> +	.name = "console_owner"
> +};
>  #endif
>  
> +static DEFINE_RAW_SPINLOCK(console_owner_lock);
> +static struct task_struct *console_owner;
> +static bool console_waiter;
> +
>  enum devkmsg_log_bits {
>  	__DEVKMSG_LOG_BIT_ON = 0,
>  	__DEVKMSG_LOG_BIT_OFF,
> @@ -1753,8 +1760,56 @@ asmlinkage int vprintk_emit(int facility
>  		 * semaphore.  The release will print out buffers and wake up
>  		 * /dev/kmsg and syslog() users.
>  		 */
> -		if (console_trylock())
> +		if (console_trylock()) {
>  			console_unlock();
> +		} else {
> +			struct task_struct *owner = NULL;
> +			bool waiter;
> +			bool spin = false;
> +
> +			printk_safe_enter_irqsave(flags);
> +
> +			raw_spin_lock(&console_owner_lock);
> +			owner = READ_ONCE(console_owner);
> +			waiter = READ_ONCE(console_waiter);
> +			if (!waiter && owner && owner != current) {
> +				WRITE_ONCE(console_waiter, true);
> +				spin = true;
> +			}
> +			raw_spin_unlock(&console_owner_lock);
> +
> +			/*
> +			 * If there is an active printk() writing to the
> +			 * consoles, instead of having it write our data too,
> +			 * see if we can offload that load from the active
> +			 * printer, and do some printing ourselves.
> +			 * Go into a spin only if there isn't already a waiter
> +			 * spinning, and there is an active printer, and
> +			 * that active printer isn't us (recursive printk?).
> +			 */
> +			if (spin) {
> +				/* We spin waiting for the owner to release us */
> +				spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_);

Hello Steven,

I think it would be better to use cross-release stuff here, because the
waiter waits for an event which happens in another context.

> +				/* Owner will clear console_waiter on hand off */
> +				while (READ_ONCE(console_waiter))
> +					cpu_relax();
> +
> +				spin_release(&console_owner_dep_map, 1, _THIS_IP_);
> +				printk_safe_exit_irqrestore(flags);
> +
> +				/*
> +				 * The owner passed the console lock to us.
> +				 * Since we did not spin on console lock, annotate
> +				 * this as a trylock. Otherwise lockdep will
> +				 * complain.
> +				 */
> +				mutex_acquire(&console_lock_dep_map, 0, 1, _THIS_IP_);

I'm afraid if it's ok even not to lock(or trylock) actually here. Is there
any problem if you call console_trylock() instead of mutex_acquire() here?

> +				console_unlock();
> +				printk_safe_enter_irqsave(flags);
> +			}
> +			printk_safe_exit_irqrestore(flags);
> +
> +		}
>  	}
>  
>  	return printed_len;
> @@ -2141,6 +2196,7 @@ void console_unlock(void)
>  	static u64 seen_seq;
>  	unsigned long flags;
>  	bool wake_klogd = false;
> +	bool waiter = false;
>  	bool do_cond_resched, retry;
>  
>  	if (console_suspended) {
> @@ -2229,14 +2285,64 @@ skip:
>  		console_seq++;
>  		raw_spin_unlock(&logbuf_lock);
>  
> +		/*
> +		 * While actively printing out messages, if another printk()
> +		 * were to occur on another CPU, it may wait for this one to
> +		 * finish. This task can not be preempted if there is a
> +		 * waiter waiting to take over.
> +		 */
> +		raw_spin_lock(&console_owner_lock);
> +		console_owner = current;
> +		raw_spin_unlock(&console_owner_lock);
> +
> +		/* The waiter may spin on us after setting console_owner */
> +		spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_);

If you want to do this speculatively here, I think it would be better to
use a read recursive acquisition. I think spin_acquire() is too stong
for that purpose - I also mentioned it on workqueue flush code. Don't
you think so?

> +
>  		stop_critical_timings();	/* don't trace print latency */
>  		call_console_drivers(ext_text, ext_len, text, len);
>  		start_critical_timings();
> +
> +		raw_spin_lock(&console_owner_lock);
> +		waiter = READ_ONCE(console_waiter);
> +		console_owner = NULL;
> +		raw_spin_unlock(&console_owner_lock);
> +
> +		/*
> +		 * If there is a waiter waiting for us, then pass the
> +		 * rest of the work load over to that waiter.
> +		 */
> +		if (waiter)
> +			break;
> +
> +		/* There was no waiter, and nothing will spin on us here */
> +		spin_release(&console_owner_dep_map, 1, _THIS_IP_);

I think this release() can be moved up over 'if (waiter)' because only
waiters within the region between acquire() and release() are meaningful.

> +
>  		printk_safe_exit_irqrestore(flags);
>  
>  		if (do_cond_resched)
>  			cond_resched();
>  	}
> +
> +	/*
> +	 * If there is an active waiter waiting on the console_lock.
> +	 * Pass off the printing to the waiter, and the waiter
> +	 * will continue printing on its CPU, and when all writing
> +	 * has finished, the last printer will wake up klogd.
> +	 */
> +	if (waiter) {
> +		WRITE_ONCE(console_waiter, false);
> +		/* The waiter is now free to continue */
> +		spin_release(&console_owner_dep_map, 1, _THIS_IP_);

So this can be removed.

Thanks,
Byungchul

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] printk: Add console owner and waiter logic to load balance console writes
  2017-11-24 15:58     ` Petr Mladek
@ 2017-11-27  8:53       ` Byungchul Park
  -1 siblings, 0 replies; 47+ messages in thread
From: Byungchul Park @ 2017-11-27  8:53 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Steven Rostedt, LKML, akpm, linux-mm, Cong Wang, Dave Hansen,
	Johannes Weiner, Mel Gorman, Michal Hocko, Sergey Senozhatsky,
	Vlastimil Babka, Peter Zijlstra, Linus Torvalds, Jan Kara,
	Mathieu Desnoyers, Tetsuo Handa, rostedt, kernel-team

On Fri, Nov 24, 2017 at 04:58:16PM +0100, Petr Mladek wrote:
> On Fri 2017-11-24 16:54:16, Petr Mladek wrote:
> > On Wed 2017-11-08 10:27:23, Steven Rostedt wrote:
> > > If there is a waiter, then it breaks out of the loop, clears the waiter
> > > flag (because that will release the waiter from its spin), and exits.
> > > Note, it does *not* release the console semaphore. Because it is a
> > > semaphore, there is no owner.

Hello Petr,

Thank you for adding me into this thread.

You seem to change console_lock_dep_map to cross-release stuff. I will
add my opinion after reviewing it :)

Thanks,
Byungchul

> > > Index: linux-trace.git/kernel/printk/printk.c
> > > ===================================================================
> > > --- linux-trace.git.orig/kernel/printk/printk.c
> > > +++ linux-trace.git/kernel/printk/printk.c
> > > @@ -86,8 +86,15 @@ EXPORT_SYMBOL_GPL(console_drivers);
> > >  static struct lockdep_map console_lock_dep_map = {
> > >  	.name = "console_lock"
> > >  };
> > > +static struct lockdep_map console_owner_dep_map = {
> > > +	.name = "console_owner"
> > > +};
> > >  #endif
> > >  
> > > +static DEFINE_RAW_SPINLOCK(console_owner_lock);
> > > +static struct task_struct *console_owner;
> > > +static bool console_waiter;
> > > +
> > >  enum devkmsg_log_bits {
> > >  	__DEVKMSG_LOG_BIT_ON = 0,
> > >  	__DEVKMSG_LOG_BIT_OFF,
> > > @@ -1753,8 +1760,56 @@ asmlinkage int vprintk_emit(int facility
> > >  		 * semaphore.  The release will print out buffers and wake up
> > >  		 * /dev/kmsg and syslog() users.
> > >  		 */
> > > -		if (console_trylock())
> > > +		if (console_trylock()) {
> > >  			console_unlock();
> > > +		} else {
> > > +			struct task_struct *owner = NULL;
> > > +			bool waiter;
> > > +			bool spin = false;
> > > +
> > > +			printk_safe_enter_irqsave(flags);
> > > +
> > > +			raw_spin_lock(&console_owner_lock);
> > > +			owner = READ_ONCE(console_owner);
> > > +			waiter = READ_ONCE(console_waiter);
> > > +			if (!waiter && owner && owner != current) {
> > > +				WRITE_ONCE(console_waiter, true);
> > > +				spin = true;
> > > +			}
> > > +			raw_spin_unlock(&console_owner_lock);
> > > +
> > > +			/*
> > > +			 * If there is an active printk() writing to the
> > > +			 * consoles, instead of having it write our data too,
> > > +			 * see if we can offload that load from the active
> > > +			 * printer, and do some printing ourselves.
> > > +			 * Go into a spin only if there isn't already a waiter
> > > +			 * spinning, and there is an active printer, and
> > > +			 * that active printer isn't us (recursive printk?).
> > > +			 */
> > > +			if (spin) {
> > > +				/* We spin waiting for the owner to release us */
> > > +				spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_);
> > > +				/* Owner will clear console_waiter on hand off */
> > > +				while (READ_ONCE(console_waiter))
> > > +					cpu_relax();
> > > +
> > > +				spin_release(&console_owner_dep_map, 1, _THIS_IP_);
> > > +				printk_safe_exit_irqrestore(flags);
> > > +
> > > +				/*
> > > +				 * The owner passed the console lock to us.
> > > +				 * Since we did not spin on console lock, annotate
> > > +				 * this as a trylock. Otherwise lockdep will
> > > +				 * complain.
> > > +				 */
> > > +				mutex_acquire(&console_lock_dep_map, 0, 1, _THIS_IP_);
> > 
> > I am not sure that this correctly imitates the real lock
> > dependency. The trylock flag means that we are able to skip
> > this section when the lock is taken elsewhere. But it is not
> > the whole truth. In fact, we are blocked in this code path
> > when console_sem is taken by another process.
> > 
> > The use of console_owner_lock is not enough. The other
> > console_sem calls a lot of code outside the section
> > guarded by console_owner_lock.
> > 
> > I think that we have actually entered the cross-release bunch
> > of problems, see https://lwn.net/Articles/709849/
> > 
> > IMHO, we need to use struct lockdep_map_cross for
> > console_lock_dep_map. Also we need to put somewhere
> > lock_commit_crosslock().
> > 
> > I am going to play with it. Also I add Byungchul Park into CC.
> > This is why I keep most of the context in this reply (I am sorry
> > for it).
> 
> See my first attempt below. I do not get any lockdep
> warning but it is possible that I just did it wrong.
> 
> 
> >From 0345785d4767f853ff2d733b565084c3339f9fe0 Mon Sep 17 00:00:00 2001
> From: Petr Mladek <pmladek@suse.com>
> Date: Fri, 24 Nov 2017 16:50:25 +0100
> Subject: [PATCH] printk: Try to describe real console_sem dependecies using
>  the crosslock feature
> 
> console_sem might be newly taken and released by different
> processes. This is an attempt to check the crossrelease
> dependencies.
> ---
>  kernel/printk/printk.c | 25 +++++++++----------------
>  1 file changed, 9 insertions(+), 16 deletions(-)
> 
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 040fb948924e..bda25feae0d5 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -83,9 +83,9 @@ struct console *console_drivers;
>  EXPORT_SYMBOL_GPL(console_drivers);
>  
>  #ifdef CONFIG_LOCKDEP
> -static struct lockdep_map console_lock_dep_map = {
> -	.name = "console_lock"
> -};
> +static struct lockdep_map_cross console_lock_dep_map =
> +	STATIC_CROSS_LOCKDEP_MAP_INIT("console_lock", &console_sem);
> +
>  static struct lockdep_map console_owner_dep_map = {
>  	.name = "console_owner"
>  };
> @@ -218,7 +218,7 @@ static int nr_ext_console_drivers;
>   */
>  #define down_console_sem() do { \
>  	down(&console_sem);\
> -	mutex_acquire(&console_lock_dep_map, 0, 0, _RET_IP_);\
> +	mutex_acquire((struct lockdep_map *)&console_lock_dep_map, 0, 0, _RET_IP_);\
>  } while (0)
>  
>  static int __down_trylock_console_sem(unsigned long ip)
> @@ -237,7 +237,7 @@ static int __down_trylock_console_sem(unsigned long ip)
>  
>  	if (lock_failed)
>  		return 1;
> -	mutex_acquire(&console_lock_dep_map, 0, 1, ip);
> +	mutex_acquire((struct lockdep_map *)&console_lock_dep_map, 0, 1, ip);
>  	return 0;
>  }
>  #define down_trylock_console_sem() __down_trylock_console_sem(_RET_IP_)
> @@ -246,7 +246,7 @@ static void __up_console_sem(unsigned long ip)
>  {
>  	unsigned long flags;
>  
> -	mutex_release(&console_lock_dep_map, 1, ip);
> +	mutex_release((struct lockdep_map *)&console_lock_dep_map, 1, ip);
>  
>  	printk_safe_enter_irqsave(flags);
>  	up(&console_sem);
> @@ -1797,13 +1797,6 @@ asmlinkage int vprintk_emit(int facility, int level,
>  				spin_release(&console_owner_dep_map, 1, _THIS_IP_);
>  				printk_safe_exit_irqrestore(flags);
>  
> -				/*
> -				 * The owner passed the console lock to us.
> -				 * Since we did not spin on console lock, annotate
> -				 * this as a trylock. Otherwise lockdep will
> -				 * complain.
> -				 */
> -				mutex_acquire(&console_lock_dep_map, 0, 1, _THIS_IP_);
>  				console_unlock();
>  				printk_safe_enter_irqsave(flags);
>  			}
> @@ -2334,10 +2327,10 @@ void console_unlock(void)
>  		/* The waiter is now free to continue */
>  		spin_release(&console_owner_dep_map, 1, _THIS_IP_);
>  		/*
> -		 * Hand off console_lock to waiter. The waiter will perform
> -		 * the up(). After this, the waiter is the console_lock owner.
> +		 * Hand off console_lock to waiter. After this, the waiter
> +		 * is the console_lock owner.
>  		 */
> -		mutex_release(&console_lock_dep_map, 1, _THIS_IP_);
> +		lock_commit_crosslock((struct lockdep_map *)&console_lock_dep_map);
>  		printk_safe_exit_irqrestore(flags);
>  		/* Note, if waiter is set, logbuf_lock is not held */
>  		return;
> -- 
> 2.13.6

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] printk: Add console owner and waiter logic to load balance console writes
@ 2017-11-27  8:53       ` Byungchul Park
  0 siblings, 0 replies; 47+ messages in thread
From: Byungchul Park @ 2017-11-27  8:53 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Steven Rostedt, LKML, akpm, linux-mm, Cong Wang, Dave Hansen,
	Johannes Weiner, Mel Gorman, Michal Hocko, Sergey Senozhatsky,
	Vlastimil Babka, Peter Zijlstra, Linus Torvalds, Jan Kara,
	Mathieu Desnoyers, Tetsuo Handa, rostedt, kernel-team

On Fri, Nov 24, 2017 at 04:58:16PM +0100, Petr Mladek wrote:
> On Fri 2017-11-24 16:54:16, Petr Mladek wrote:
> > On Wed 2017-11-08 10:27:23, Steven Rostedt wrote:
> > > If there is a waiter, then it breaks out of the loop, clears the waiter
> > > flag (because that will release the waiter from its spin), and exits.
> > > Note, it does *not* release the console semaphore. Because it is a
> > > semaphore, there is no owner.

Hello Petr,

Thank you for adding me into this thread.

You seem to change console_lock_dep_map to cross-release stuff. I will
add my opinion after reviewing it :)

Thanks,
Byungchul

> > > Index: linux-trace.git/kernel/printk/printk.c
> > > ===================================================================
> > > --- linux-trace.git.orig/kernel/printk/printk.c
> > > +++ linux-trace.git/kernel/printk/printk.c
> > > @@ -86,8 +86,15 @@ EXPORT_SYMBOL_GPL(console_drivers);
> > >  static struct lockdep_map console_lock_dep_map = {
> > >  	.name = "console_lock"
> > >  };
> > > +static struct lockdep_map console_owner_dep_map = {
> > > +	.name = "console_owner"
> > > +};
> > >  #endif
> > >  
> > > +static DEFINE_RAW_SPINLOCK(console_owner_lock);
> > > +static struct task_struct *console_owner;
> > > +static bool console_waiter;
> > > +
> > >  enum devkmsg_log_bits {
> > >  	__DEVKMSG_LOG_BIT_ON = 0,
> > >  	__DEVKMSG_LOG_BIT_OFF,
> > > @@ -1753,8 +1760,56 @@ asmlinkage int vprintk_emit(int facility
> > >  		 * semaphore.  The release will print out buffers and wake up
> > >  		 * /dev/kmsg and syslog() users.
> > >  		 */
> > > -		if (console_trylock())
> > > +		if (console_trylock()) {
> > >  			console_unlock();
> > > +		} else {
> > > +			struct task_struct *owner = NULL;
> > > +			bool waiter;
> > > +			bool spin = false;
> > > +
> > > +			printk_safe_enter_irqsave(flags);
> > > +
> > > +			raw_spin_lock(&console_owner_lock);
> > > +			owner = READ_ONCE(console_owner);
> > > +			waiter = READ_ONCE(console_waiter);
> > > +			if (!waiter && owner && owner != current) {
> > > +				WRITE_ONCE(console_waiter, true);
> > > +				spin = true;
> > > +			}
> > > +			raw_spin_unlock(&console_owner_lock);
> > > +
> > > +			/*
> > > +			 * If there is an active printk() writing to the
> > > +			 * consoles, instead of having it write our data too,
> > > +			 * see if we can offload that load from the active
> > > +			 * printer, and do some printing ourselves.
> > > +			 * Go into a spin only if there isn't already a waiter
> > > +			 * spinning, and there is an active printer, and
> > > +			 * that active printer isn't us (recursive printk?).
> > > +			 */
> > > +			if (spin) {
> > > +				/* We spin waiting for the owner to release us */
> > > +				spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_);
> > > +				/* Owner will clear console_waiter on hand off */
> > > +				while (READ_ONCE(console_waiter))
> > > +					cpu_relax();
> > > +
> > > +				spin_release(&console_owner_dep_map, 1, _THIS_IP_);
> > > +				printk_safe_exit_irqrestore(flags);
> > > +
> > > +				/*
> > > +				 * The owner passed the console lock to us.
> > > +				 * Since we did not spin on console lock, annotate
> > > +				 * this as a trylock. Otherwise lockdep will
> > > +				 * complain.
> > > +				 */
> > > +				mutex_acquire(&console_lock_dep_map, 0, 1, _THIS_IP_);
> > 
> > I am not sure that this correctly imitates the real lock
> > dependency. The trylock flag means that we are able to skip
> > this section when the lock is taken elsewhere. But it is not
> > the whole truth. In fact, we are blocked in this code path
> > when console_sem is taken by another process.
> > 
> > The use of console_owner_lock is not enough. The other
> > console_sem calls a lot of code outside the section
> > guarded by console_owner_lock.
> > 
> > I think that we have actually entered the cross-release bunch
> > of problems, see https://lwn.net/Articles/709849/
> > 
> > IMHO, we need to use struct lockdep_map_cross for
> > console_lock_dep_map. Also we need to put somewhere
> > lock_commit_crosslock().
> > 
> > I am going to play with it. Also I add Byungchul Park into CC.
> > This is why I keep most of the context in this reply (I am sorry
> > for it).
> 
> See my first attempt below. I do not get any lockdep
> warning but it is possible that I just did it wrong.
> 
> 
> >From 0345785d4767f853ff2d733b565084c3339f9fe0 Mon Sep 17 00:00:00 2001
> From: Petr Mladek <pmladek@suse.com>
> Date: Fri, 24 Nov 2017 16:50:25 +0100
> Subject: [PATCH] printk: Try to describe real console_sem dependecies using
>  the crosslock feature
> 
> console_sem might be newly taken and released by different
> processes. This is an attempt to check the crossrelease
> dependencies.
> ---
>  kernel/printk/printk.c | 25 +++++++++----------------
>  1 file changed, 9 insertions(+), 16 deletions(-)
> 
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 040fb948924e..bda25feae0d5 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -83,9 +83,9 @@ struct console *console_drivers;
>  EXPORT_SYMBOL_GPL(console_drivers);
>  
>  #ifdef CONFIG_LOCKDEP
> -static struct lockdep_map console_lock_dep_map = {
> -	.name = "console_lock"
> -};
> +static struct lockdep_map_cross console_lock_dep_map =
> +	STATIC_CROSS_LOCKDEP_MAP_INIT("console_lock", &console_sem);
> +
>  static struct lockdep_map console_owner_dep_map = {
>  	.name = "console_owner"
>  };
> @@ -218,7 +218,7 @@ static int nr_ext_console_drivers;
>   */
>  #define down_console_sem() do { \
>  	down(&console_sem);\
> -	mutex_acquire(&console_lock_dep_map, 0, 0, _RET_IP_);\
> +	mutex_acquire((struct lockdep_map *)&console_lock_dep_map, 0, 0, _RET_IP_);\
>  } while (0)
>  
>  static int __down_trylock_console_sem(unsigned long ip)
> @@ -237,7 +237,7 @@ static int __down_trylock_console_sem(unsigned long ip)
>  
>  	if (lock_failed)
>  		return 1;
> -	mutex_acquire(&console_lock_dep_map, 0, 1, ip);
> +	mutex_acquire((struct lockdep_map *)&console_lock_dep_map, 0, 1, ip);
>  	return 0;
>  }
>  #define down_trylock_console_sem() __down_trylock_console_sem(_RET_IP_)
> @@ -246,7 +246,7 @@ static void __up_console_sem(unsigned long ip)
>  {
>  	unsigned long flags;
>  
> -	mutex_release(&console_lock_dep_map, 1, ip);
> +	mutex_release((struct lockdep_map *)&console_lock_dep_map, 1, ip);
>  
>  	printk_safe_enter_irqsave(flags);
>  	up(&console_sem);
> @@ -1797,13 +1797,6 @@ asmlinkage int vprintk_emit(int facility, int level,
>  				spin_release(&console_owner_dep_map, 1, _THIS_IP_);
>  				printk_safe_exit_irqrestore(flags);
>  
> -				/*
> -				 * The owner passed the console lock to us.
> -				 * Since we did not spin on console lock, annotate
> -				 * this as a trylock. Otherwise lockdep will
> -				 * complain.
> -				 */
> -				mutex_acquire(&console_lock_dep_map, 0, 1, _THIS_IP_);
>  				console_unlock();
>  				printk_safe_enter_irqsave(flags);
>  			}
> @@ -2334,10 +2327,10 @@ void console_unlock(void)
>  		/* The waiter is now free to continue */
>  		spin_release(&console_owner_dep_map, 1, _THIS_IP_);
>  		/*
> -		 * Hand off console_lock to waiter. The waiter will perform
> -		 * the up(). After this, the waiter is the console_lock owner.
> +		 * Hand off console_lock to waiter. After this, the waiter
> +		 * is the console_lock owner.
>  		 */
> -		mutex_release(&console_lock_dep_map, 1, _THIS_IP_);
> +		lock_commit_crosslock((struct lockdep_map *)&console_lock_dep_map);
>  		printk_safe_exit_irqrestore(flags);
>  		/* Note, if waiter is set, logbuf_lock is not held */
>  		return;
> -- 
> 2.13.6

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] printk: Add console owner and waiter logic to load balance console writes
  2017-11-24 15:58     ` Petr Mladek
@ 2017-11-28  1:42       ` Byungchul Park
  -1 siblings, 0 replies; 47+ messages in thread
From: Byungchul Park @ 2017-11-28  1:42 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Steven Rostedt, LKML, akpm, linux-mm, Cong Wang, Dave Hansen,
	Johannes Weiner, Mel Gorman, Michal Hocko, Sergey Senozhatsky,
	Vlastimil Babka, Peter Zijlstra, Linus Torvalds, Jan Kara,
	Mathieu Desnoyers, Tetsuo Handa, rostedt, kernel-team

On Fri, Nov 24, 2017 at 04:58:16PM +0100, Petr Mladek wrote:
> @@ -1797,13 +1797,6 @@ asmlinkage int vprintk_emit(int facility, int level,
>  				spin_release(&console_owner_dep_map, 1, _THIS_IP_);
>  				printk_safe_exit_irqrestore(flags);
>  
> -				/*
> -				 * The owner passed the console lock to us.
> -				 * Since we did not spin on console lock, annotate
> -				 * this as a trylock. Otherwise lockdep will
> -				 * complain.
> -				 */
> -				mutex_acquire(&console_lock_dep_map, 0, 1, _THIS_IP_);

Hello Petr,

IMHO, it would get unbalanced if you only remove this mutex_acquire().

>  				console_unlock();
>  				printk_safe_enter_irqsave(flags);
>  			}
> @@ -2334,10 +2327,10 @@ void console_unlock(void)
>  		/* The waiter is now free to continue */
>  		spin_release(&console_owner_dep_map, 1, _THIS_IP_);
>  		/*
> -		 * Hand off console_lock to waiter. The waiter will perform
> -		 * the up(). After this, the waiter is the console_lock owner.
> +		 * Hand off console_lock to waiter. After this, the waiter
> +		 * is the console_lock owner.
>  		 */
> -		mutex_release(&console_lock_dep_map, 1, _THIS_IP_);

IMHO, this release() should be moved to somewhere properly.

> +		lock_commit_crosslock((struct lockdep_map *)&console_lock_dep_map);
>  		printk_safe_exit_irqrestore(flags);
>  		/* Note, if waiter is set, logbuf_lock is not held */
>  		return;

However, now that cross-release was introduces, lockdep can be applied
to semaphore operations. Actually, I have a plan to do that. I think it
would be better to make semaphore tracked with lockdep and remove all
these manual acquire() and release() here. What do you think about it?

Thanks,
Byungchul

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] printk: Add console owner and waiter logic to load balance console writes
@ 2017-11-28  1:42       ` Byungchul Park
  0 siblings, 0 replies; 47+ messages in thread
From: Byungchul Park @ 2017-11-28  1:42 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Steven Rostedt, LKML, akpm, linux-mm, Cong Wang, Dave Hansen,
	Johannes Weiner, Mel Gorman, Michal Hocko, Sergey Senozhatsky,
	Vlastimil Babka, Peter Zijlstra, Linus Torvalds, Jan Kara,
	Mathieu Desnoyers, Tetsuo Handa, rostedt, kernel-team

On Fri, Nov 24, 2017 at 04:58:16PM +0100, Petr Mladek wrote:
> @@ -1797,13 +1797,6 @@ asmlinkage int vprintk_emit(int facility, int level,
>  				spin_release(&console_owner_dep_map, 1, _THIS_IP_);
>  				printk_safe_exit_irqrestore(flags);
>  
> -				/*
> -				 * The owner passed the console lock to us.
> -				 * Since we did not spin on console lock, annotate
> -				 * this as a trylock. Otherwise lockdep will
> -				 * complain.
> -				 */
> -				mutex_acquire(&console_lock_dep_map, 0, 1, _THIS_IP_);

Hello Petr,

IMHO, it would get unbalanced if you only remove this mutex_acquire().

>  				console_unlock();
>  				printk_safe_enter_irqsave(flags);
>  			}
> @@ -2334,10 +2327,10 @@ void console_unlock(void)
>  		/* The waiter is now free to continue */
>  		spin_release(&console_owner_dep_map, 1, _THIS_IP_);
>  		/*
> -		 * Hand off console_lock to waiter. The waiter will perform
> -		 * the up(). After this, the waiter is the console_lock owner.
> +		 * Hand off console_lock to waiter. After this, the waiter
> +		 * is the console_lock owner.
>  		 */
> -		mutex_release(&console_lock_dep_map, 1, _THIS_IP_);

IMHO, this release() should be moved to somewhere properly.

> +		lock_commit_crosslock((struct lockdep_map *)&console_lock_dep_map);
>  		printk_safe_exit_irqrestore(flags);
>  		/* Note, if waiter is set, logbuf_lock is not held */
>  		return;

However, now that cross-release was introduces, lockdep can be applied
to semaphore operations. Actually, I have a plan to do that. I think it
would be better to make semaphore tracked with lockdep and remove all
these manual acquire() and release() here. What do you think about it?

Thanks,
Byungchul

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] printk: Add console owner and waiter logic to load balance console writes
  2017-11-27  8:48   ` Byungchul Park
@ 2017-11-28  6:23     ` Sergey Senozhatsky
  -1 siblings, 0 replies; 47+ messages in thread
From: Sergey Senozhatsky @ 2017-11-28  6:23 UTC (permalink / raw)
  To: Byungchul Park
  Cc: Steven Rostedt, LKML, akpm, linux-mm, Cong Wang, Dave Hansen,
	Johannes Weiner, Mel Gorman, Michal Hocko, Petr Mladek,
	Sergey Senozhatsky, Vlastimil Babka, yuwang.yuwang,
	Peter Zijlstra, Linus Torvalds, Jan Kara, Mathieu Desnoyers,
	Tetsuo Handa, rostedt, kernel-team

Hi,

On (11/27/17 17:48), Byungchul Park wrote:
[..]
> > +				/* Owner will clear console_waiter on hand off */
> > +				while (READ_ONCE(console_waiter))
> > +					cpu_relax();
> > +
> > +				spin_release(&console_owner_dep_map, 1, _THIS_IP_);
> > +				printk_safe_exit_irqrestore(flags);
> > +
> > +				/*
> > +				 * The owner passed the console lock to us.
> > +				 * Since we did not spin on console lock, annotate
> > +				 * this as a trylock. Otherwise lockdep will
> > +				 * complain.
> > +				 */
> > +				mutex_acquire(&console_lock_dep_map, 0, 1, _THIS_IP_);
> 
> I'm afraid if it's ok even not to lock(or trylock) actually here. Is there
> any problem if you call console_trylock() instead of mutex_acquire() here?

console_trylock() will not work. console_trylock() implies that the
current printing process does up(), which a) opens a race with possible
console_lock()/console_trylock() from foreign CPUs, and b) additionally
wakes up a task from console_sem wait list [if there was one]. so chances
are some other CPU potentially can acquire the lock ahead of waiter,
forcing the waiter to continue spinning on console_sem. and the bad news
are a) it's spinning with local IRQs disabled and b) that another CPU,
which has acquired the console_sem, can schedule under console_sem.


anyway, we are not going to merge this patch. we already have discussed
that in V3 thread:

https://marc.info/?l=linux-kernel&m=151019815721161&w=2
https://marc.info/?l=linux-kernel&m=151020275921953&w=2
https://marc.info/?l=linux-kernel&m=151020404622181&w=2
https://marc.info/?l=linux-kernel&m=151020565222469&w=2


I took some parts of the Steven's patch set, tho, and backported them
to the current printk_kthread series.

	-ss

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] printk: Add console owner and waiter logic to load balance console writes
@ 2017-11-28  6:23     ` Sergey Senozhatsky
  0 siblings, 0 replies; 47+ messages in thread
From: Sergey Senozhatsky @ 2017-11-28  6:23 UTC (permalink / raw)
  To: Byungchul Park
  Cc: Steven Rostedt, LKML, akpm, linux-mm, Cong Wang, Dave Hansen,
	Johannes Weiner, Mel Gorman, Michal Hocko, Petr Mladek,
	Sergey Senozhatsky, Vlastimil Babka, yuwang.yuwang,
	Peter Zijlstra, Linus Torvalds, Jan Kara, Mathieu Desnoyers,
	Tetsuo Handa, rostedt, kernel-team

Hi,

On (11/27/17 17:48), Byungchul Park wrote:
[..]
> > +				/* Owner will clear console_waiter on hand off */
> > +				while (READ_ONCE(console_waiter))
> > +					cpu_relax();
> > +
> > +				spin_release(&console_owner_dep_map, 1, _THIS_IP_);
> > +				printk_safe_exit_irqrestore(flags);
> > +
> > +				/*
> > +				 * The owner passed the console lock to us.
> > +				 * Since we did not spin on console lock, annotate
> > +				 * this as a trylock. Otherwise lockdep will
> > +				 * complain.
> > +				 */
> > +				mutex_acquire(&console_lock_dep_map, 0, 1, _THIS_IP_);
> 
> I'm afraid if it's ok even not to lock(or trylock) actually here. Is there
> any problem if you call console_trylock() instead of mutex_acquire() here?

console_trylock() will not work. console_trylock() implies that the
current printing process does up(), which a) opens a race with possible
console_lock()/console_trylock() from foreign CPUs, and b) additionally
wakes up a task from console_sem wait list [if there was one]. so chances
are some other CPU potentially can acquire the lock ahead of waiter,
forcing the waiter to continue spinning on console_sem. and the bad news
are a) it's spinning with local IRQs disabled and b) that another CPU,
which has acquired the console_sem, can schedule under console_sem.


anyway, we are not going to merge this patch. we already have discussed
that in V3 thread:

https://marc.info/?l=linux-kernel&m=151019815721161&w=2
https://marc.info/?l=linux-kernel&m=151020275921953&w=2
https://marc.info/?l=linux-kernel&m=151020404622181&w=2
https://marc.info/?l=linux-kernel&m=151020565222469&w=2


I took some parts of the Steven's patch set, tho, and backported them
to the current printk_kthread series.

	-ss

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] printk: Add console owner and waiter logic to load balance console writes
  2017-11-28  1:42       ` Byungchul Park
@ 2017-12-08 14:00         ` Petr Mladek
  -1 siblings, 0 replies; 47+ messages in thread
From: Petr Mladek @ 2017-12-08 14:00 UTC (permalink / raw)
  To: Byungchul Park
  Cc: Steven Rostedt, LKML, akpm, linux-mm, Cong Wang, Dave Hansen,
	Johannes Weiner, Mel Gorman, Michal Hocko, Sergey Senozhatsky,
	Vlastimil Babka, Peter Zijlstra, Linus Torvalds, Jan Kara,
	Mathieu Desnoyers, Tetsuo Handa, rostedt, kernel-team

Hello,

thanks a lot for help. I am sorry for the late response. I wanted to
handle this mail with a clean head.

On Tue 2017-11-28 10:42:29, Byungchul Park wrote:
> On Fri, Nov 24, 2017 at 04:58:16PM +0100, Petr Mladek wrote:
> > @@ -1797,13 +1797,6 @@ asmlinkage int vprintk_emit(int facility, int level,
> >  				spin_release(&console_owner_dep_map, 1, _THIS_IP_);
> >  				printk_safe_exit_irqrestore(flags);
> >  
> > -				/*
> > -				 * The owner passed the console lock to us.
> > -				 * Since we did not spin on console lock, annotate
> > -				 * this as a trylock. Otherwise lockdep will
> > -				 * complain.
> > -				 */
> > -				mutex_acquire(&console_lock_dep_map, 0, 1, _THIS_IP_);
> 
> Hello Petr,
> 
> IMHO, it would get unbalanced if you only remove this mutex_acquire().
> 
> >  				console_unlock();
> >  				printk_safe_enter_irqsave(flags);
> >  			}
> > @@ -2334,10 +2327,10 @@ void console_unlock(void)
> >  		/* The waiter is now free to continue */
> >  		spin_release(&console_owner_dep_map, 1, _THIS_IP_);
> >  		/*
> > -		 * Hand off console_lock to waiter. The waiter will perform
> > -		 * the up(). After this, the waiter is the console_lock owner.
> > +		 * Hand off console_lock to waiter. After this, the waiter
> > +		 * is the console_lock owner.
> >  		 */
> > -		mutex_release(&console_lock_dep_map, 1, _THIS_IP_);
> 
> IMHO, this release() should be moved to somewhere properly.
> 
> > +		lock_commit_crosslock((struct lockdep_map *)&console_lock_dep_map);
> >  		printk_safe_exit_irqrestore(flags);
> >  		/* Note, if waiter is set, logbuf_lock is not held */
> >  		return;
> 
> However, now that cross-release was introduces, lockdep can be applied
> to semaphore operations. Actually, I have a plan to do that. I think it
> would be better to make semaphore tracked with lockdep and remove all
> these manual acquire() and release() here. What do you think about it?

IMHO, it would be great to add lockdep annotations into semaphore
operations.

Well, I am not sure if this would be enough in this case. I think
that the locking dependency in this Steven's patch is special.
The semaphore is passed from one owner to another one without
unlocking. Both sides wait for each other using a busy loop.

The busy loop/waiting is activated only when the current owner
is not sleeping to avoid softlockup. I think that it is
a kind of conditional cross-release or something even
more special.

Sigh, I wish I was able to clean my head even more to be
able to think about this.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] printk: Add console owner and waiter logic to load balance console writes
@ 2017-12-08 14:00         ` Petr Mladek
  0 siblings, 0 replies; 47+ messages in thread
From: Petr Mladek @ 2017-12-08 14:00 UTC (permalink / raw)
  To: Byungchul Park
  Cc: Steven Rostedt, LKML, akpm, linux-mm, Cong Wang, Dave Hansen,
	Johannes Weiner, Mel Gorman, Michal Hocko, Sergey Senozhatsky,
	Vlastimil Babka, Peter Zijlstra, Linus Torvalds, Jan Kara,
	Mathieu Desnoyers, Tetsuo Handa, rostedt, kernel-team

Hello,

thanks a lot for help. I am sorry for the late response. I wanted to
handle this mail with a clean head.

On Tue 2017-11-28 10:42:29, Byungchul Park wrote:
> On Fri, Nov 24, 2017 at 04:58:16PM +0100, Petr Mladek wrote:
> > @@ -1797,13 +1797,6 @@ asmlinkage int vprintk_emit(int facility, int level,
> >  				spin_release(&console_owner_dep_map, 1, _THIS_IP_);
> >  				printk_safe_exit_irqrestore(flags);
> >  
> > -				/*
> > -				 * The owner passed the console lock to us.
> > -				 * Since we did not spin on console lock, annotate
> > -				 * this as a trylock. Otherwise lockdep will
> > -				 * complain.
> > -				 */
> > -				mutex_acquire(&console_lock_dep_map, 0, 1, _THIS_IP_);
> 
> Hello Petr,
> 
> IMHO, it would get unbalanced if you only remove this mutex_acquire().
> 
> >  				console_unlock();
> >  				printk_safe_enter_irqsave(flags);
> >  			}
> > @@ -2334,10 +2327,10 @@ void console_unlock(void)
> >  		/* The waiter is now free to continue */
> >  		spin_release(&console_owner_dep_map, 1, _THIS_IP_);
> >  		/*
> > -		 * Hand off console_lock to waiter. The waiter will perform
> > -		 * the up(). After this, the waiter is the console_lock owner.
> > +		 * Hand off console_lock to waiter. After this, the waiter
> > +		 * is the console_lock owner.
> >  		 */
> > -		mutex_release(&console_lock_dep_map, 1, _THIS_IP_);
> 
> IMHO, this release() should be moved to somewhere properly.
> 
> > +		lock_commit_crosslock((struct lockdep_map *)&console_lock_dep_map);
> >  		printk_safe_exit_irqrestore(flags);
> >  		/* Note, if waiter is set, logbuf_lock is not held */
> >  		return;
> 
> However, now that cross-release was introduces, lockdep can be applied
> to semaphore operations. Actually, I have a plan to do that. I think it
> would be better to make semaphore tracked with lockdep and remove all
> these manual acquire() and release() here. What do you think about it?

IMHO, it would be great to add lockdep annotations into semaphore
operations.

Well, I am not sure if this would be enough in this case. I think
that the locking dependency in this Steven's patch is special.
The semaphore is passed from one owner to another one without
unlocking. Both sides wait for each other using a busy loop.

The busy loop/waiting is activated only when the current owner
is not sleeping to avoid softlockup. I think that it is
a kind of conditional cross-release or something even
more special.

Sigh, I wish I was able to clean my head even more to be
able to think about this.

Best Regards,
Petr

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] printk: Add console owner and waiter logic to load balance console writes
  2017-12-08 14:00         ` Petr Mladek
@ 2017-12-12  5:39           ` Sergey Senozhatsky
  -1 siblings, 0 replies; 47+ messages in thread
From: Sergey Senozhatsky @ 2017-12-12  5:39 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Byungchul Park, Steven Rostedt, LKML, akpm, linux-mm, Cong Wang,
	Dave Hansen, Johannes Weiner, Mel Gorman, Michal Hocko,
	Sergey Senozhatsky, Vlastimil Babka, Peter Zijlstra,
	Linus Torvalds, Jan Kara, Mathieu Desnoyers, Tetsuo Handa,
	rostedt, kernel-team

Hello,

On (12/08/17 15:00), Petr Mladek wrote:
[..]
> > However, now that cross-release was introduces, lockdep can be applied
> > to semaphore operations. Actually, I have a plan to do that. I think it
> > would be better to make semaphore tracked with lockdep and remove all
> > these manual acquire() and release() here. What do you think about it?
> 
> IMHO, it would be great to add lockdep annotations into semaphore
> operations.

certain types of locks have no guaranteed lock-unlock ordering.
e.g. readers-writer locks, semaphores, etc.

for readers-writer lock we can easily have

CPU0		CPU1		CPU2		CPU3		CPU4
read_lock
		write_lock
		// sleep because
		// of CPU0
								read_lock
read_unlock			read_lock
				read_unlock	read_lock
						read_unlock
								read_unlock
								// wake up CPU1

so for CPU1 the lock was "locked" by CPU0 and "unlocked" by CPU4.

semaphore not necessarily has the mutual-exclusion property, because
its ->count is not required to be set to 1. in printk we use semaphore
with ->count == 1, but that's just an accident.

	-ss


p.s.
frankly, I don't see any "locking issues" in Steven's patch.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] printk: Add console owner and waiter logic to load balance console writes
@ 2017-12-12  5:39           ` Sergey Senozhatsky
  0 siblings, 0 replies; 47+ messages in thread
From: Sergey Senozhatsky @ 2017-12-12  5:39 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Byungchul Park, Steven Rostedt, LKML, akpm, linux-mm, Cong Wang,
	Dave Hansen, Johannes Weiner, Mel Gorman, Michal Hocko,
	Sergey Senozhatsky, Vlastimil Babka, Peter Zijlstra,
	Linus Torvalds, Jan Kara, Mathieu Desnoyers, Tetsuo Handa,
	rostedt, kernel-team

Hello,

On (12/08/17 15:00), Petr Mladek wrote:
[..]
> > However, now that cross-release was introduces, lockdep can be applied
> > to semaphore operations. Actually, I have a plan to do that. I think it
> > would be better to make semaphore tracked with lockdep and remove all
> > these manual acquire() and release() here. What do you think about it?
> 
> IMHO, it would be great to add lockdep annotations into semaphore
> operations.

certain types of locks have no guaranteed lock-unlock ordering.
e.g. readers-writer locks, semaphores, etc.

for readers-writer lock we can easily have

CPU0		CPU1		CPU2		CPU3		CPU4
read_lock
		write_lock
		// sleep because
		// of CPU0
								read_lock
read_unlock			read_lock
				read_unlock	read_lock
						read_unlock
								read_unlock
								// wake up CPU1

so for CPU1 the lock was "locked" by CPU0 and "unlocked" by CPU4.

semaphore not necessarily has the mutual-exclusion property, because
its ->count is not required to be set to 1. in printk we use semaphore
with ->count == 1, but that's just an accident.

	-ss


p.s.
frankly, I don't see any "locking issues" in Steven's patch.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] printk: Add console owner and waiter logic to load balance console writes
  2017-12-12  5:39           ` Sergey Senozhatsky
@ 2017-12-12 19:27             ` Steven Rostedt
  -1 siblings, 0 replies; 47+ messages in thread
From: Steven Rostedt @ 2017-12-12 19:27 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Petr Mladek, Byungchul Park, LKML, akpm, linux-mm, Cong Wang,
	Dave Hansen, Johannes Weiner, Mel Gorman, Michal Hocko,
	Sergey Senozhatsky, Vlastimil Babka, Peter Zijlstra,
	Linus Torvalds, Jan Kara, Mathieu Desnoyers, Tetsuo Handa,
	rostedt, kernel-team

On Tue, 12 Dec 2017 14:39:21 +0900
Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> wrote:

> p.s.
> frankly, I don't see any "locking issues" in Steven's patch.

Should I push out another revision of mine?

-- Steve

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] printk: Add console owner and waiter logic to load balance console writes
@ 2017-12-12 19:27             ` Steven Rostedt
  0 siblings, 0 replies; 47+ messages in thread
From: Steven Rostedt @ 2017-12-12 19:27 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Petr Mladek, Byungchul Park, LKML, akpm, linux-mm, Cong Wang,
	Dave Hansen, Johannes Weiner, Mel Gorman, Michal Hocko,
	Sergey Senozhatsky, Vlastimil Babka, Peter Zijlstra,
	Linus Torvalds, Jan Kara, Mathieu Desnoyers, Tetsuo Handa,
	rostedt, kernel-team

On Tue, 12 Dec 2017 14:39:21 +0900
Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> wrote:

> p.s.
> frankly, I don't see any "locking issues" in Steven's patch.

Should I push out another revision of mine?

-- Steve

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] printk: Add console owner and waiter logic to load balance console writes
  2017-12-12 19:27             ` Steven Rostedt
@ 2017-12-13  1:50               ` Sergey Senozhatsky
  -1 siblings, 0 replies; 47+ messages in thread
From: Sergey Senozhatsky @ 2017-12-13  1:50 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Sergey Senozhatsky, Petr Mladek, Byungchul Park, LKML, akpm,
	linux-mm, Cong Wang, Dave Hansen, Johannes Weiner, Mel Gorman,
	Michal Hocko, Sergey Senozhatsky, Vlastimil Babka,
	Peter Zijlstra, Linus Torvalds, Jan Kara, Mathieu Desnoyers,
	Tetsuo Handa, rostedt, kernel-team

On (12/12/17 14:27), Steven Rostedt wrote:
> > p.s.
> > frankly, I don't see any "locking issues" in Steven's patch.
> 
> Should I push out another revision of mine?

well, up to you :)

I've picked up some bits of your console-owner patch and it's
part of printk-kthread patch set [as of now]:

lkml.kernel.org/r/20171204134825.7822-13-sergey.senozhatsky@gmail.com


the series:
lkml.kernel.org/r/20171204134825.7822-1-sergey.senozhatsky@gmail.com

	-ss

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] printk: Add console owner and waiter logic to load balance console writes
@ 2017-12-13  1:50               ` Sergey Senozhatsky
  0 siblings, 0 replies; 47+ messages in thread
From: Sergey Senozhatsky @ 2017-12-13  1:50 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Sergey Senozhatsky, Petr Mladek, Byungchul Park, LKML, akpm,
	linux-mm, Cong Wang, Dave Hansen, Johannes Weiner, Mel Gorman,
	Michal Hocko, Sergey Senozhatsky, Vlastimil Babka,
	Peter Zijlstra, Linus Torvalds, Jan Kara, Mathieu Desnoyers,
	Tetsuo Handa, rostedt, kernel-team

On (12/12/17 14:27), Steven Rostedt wrote:
> > p.s.
> > frankly, I don't see any "locking issues" in Steven's patch.
> 
> Should I push out another revision of mine?

well, up to you :)

I've picked up some bits of your console-owner patch and it's
part of printk-kthread patch set [as of now]:

lkml.kernel.org/r/20171204134825.7822-13-sergey.senozhatsky@gmail.com


the series:
lkml.kernel.org/r/20171204134825.7822-1-sergey.senozhatsky@gmail.com

	-ss

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] printk: Add console owner and waiter logic to load balance console writes
  2017-11-24 15:58     ` Petr Mladek
@ 2017-12-14 13:51       ` Petr Mladek
  -1 siblings, 0 replies; 47+ messages in thread
From: Petr Mladek @ 2017-12-14 13:51 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, akpm, linux-mm, Cong Wang, Dave Hansen, Johannes Weiner,
	Mel Gorman, Michal Hocko, Sergey Senozhatsky, Vlastimil Babka,
	Peter Zijlstra, Linus Torvalds, Jan Kara, Mathieu Desnoyers,
	Tetsuo Handa, rostedt, Byungchul Park

On Fri 2017-11-24 16:58:16, Petr Mladek wrote:
> On Fri 2017-11-24 16:54:16, Petr Mladek wrote:
> > On Wed 2017-11-08 10:27:23, Steven Rostedt wrote:
> > > If there is a waiter, then it breaks out of the loop, clears the waiter
> > > flag (because that will release the waiter from its spin), and exits.
> > > Note, it does *not* release the console semaphore. Because it is a
> > > semaphore, there is no owner.
> > 
> > > Index: linux-trace.git/kernel/printk/printk.c
> > > ===================================================================
> > > --- linux-trace.git.orig/kernel/printk/printk.c
> > > +++ linux-trace.git/kernel/printk/printk.c
> > > @@ -86,8 +86,15 @@ EXPORT_SYMBOL_GPL(console_drivers);
> > >  static struct lockdep_map console_lock_dep_map = {
> > >  	.name = "console_lock"
> > >  };
> > > +static struct lockdep_map console_owner_dep_map = {
> > > +	.name = "console_owner"
> > > +};
> > >  #endif
> > >  
> > > +static DEFINE_RAW_SPINLOCK(console_owner_lock);
> > > +static struct task_struct *console_owner;
> > > +static bool console_waiter;
> > > +
> > >  enum devkmsg_log_bits {
> > >  	__DEVKMSG_LOG_BIT_ON = 0,
> > >  	__DEVKMSG_LOG_BIT_OFF,
> > > @@ -1753,8 +1760,56 @@ asmlinkage int vprintk_emit(int facility
> > >  		 * semaphore.  The release will print out buffers and wake up
> > >  		 * /dev/kmsg and syslog() users.
> > >  		 */
> > > -		if (console_trylock())
> > > +		if (console_trylock()) {
> > >  			console_unlock();
> > > +		} else {
> > > +			struct task_struct *owner = NULL;
> > > +			bool waiter;
> > > +			bool spin = false;
> > > +
> > > +			printk_safe_enter_irqsave(flags);
> > > +
> > > +			raw_spin_lock(&console_owner_lock);
> > > +			owner = READ_ONCE(console_owner);
> > > +			waiter = READ_ONCE(console_waiter);
> > > +			if (!waiter && owner && owner != current) {
> > > +				WRITE_ONCE(console_waiter, true);
> > > +				spin = true;
> > > +			}
> > > +			raw_spin_unlock(&console_owner_lock);
> > > +
> > > +			/*
> > > +			 * If there is an active printk() writing to the
> > > +			 * consoles, instead of having it write our data too,
> > > +			 * see if we can offload that load from the active
> > > +			 * printer, and do some printing ourselves.
> > > +			 * Go into a spin only if there isn't already a waiter
> > > +			 * spinning, and there is an active printer, and
> > > +			 * that active printer isn't us (recursive printk?).
> > > +			 */
> > > +			if (spin) {
> > > +				/* We spin waiting for the owner to release us */
> > > +				spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_);
> > > +				/* Owner will clear console_waiter on hand off */
> > > +				while (READ_ONCE(console_waiter))
> > > +					cpu_relax();
> > > +
> > > +				spin_release(&console_owner_dep_map, 1, _THIS_IP_);
> > > +				printk_safe_exit_irqrestore(flags);
> > > +
> > > +				/*
> > > +				 * The owner passed the console lock to us.
> > > +				 * Since we did not spin on console lock, annotate
> > > +				 * this as a trylock. Otherwise lockdep will
> > > +				 * complain.
> > > +				 */
> > > +				mutex_acquire(&console_lock_dep_map, 0, 1, _THIS_IP_);
> > 
> > I am not sure that this correctly imitates the real lock
> > dependency. The trylock flag means that we are able to skip
> > this section when the lock is taken elsewhere. But it is not
> > the whole truth. In fact, we are blocked in this code path
> > when console_sem is taken by another process.
> > 
> > The use of console_owner_lock is not enough. The other
> > console_sem calls a lot of code outside the section
> > guarded by console_owner_lock.

Ah, I confused here console_owner_lock and console_owner_dep_map.
The custom map covers all the code where console_owner is set.
It might be enough to catch a potential bug after all.


> > I think that we have actually entered the cross-release bunch
> > of problems, see https://lwn.net/Articles/709849/

Also I think that we do not need the cross-release stuff after all.
The thing is that we move console_sem only to printk() call
that normally calls console_unlock() as well. It means that
the transferred owner should not bring new type of dependencies.
As Steven said somewhere: "If there is a deadlock, it was
there even before."

We could look at it from this side. The possible deadlock would
look like:

CPU0				CPU1

console_unlock()

  console_owner = current;

				spin_lockA()
				  printk()
				    spin = true;
				    while (...)

    call_console_drivers()
      spin_lockA()

This would be a deadlock. CPU0 would wait for the lock A.
While CPU1 would own the lockA and would wait for CPU0
to finish calling the console drivers and pass the console_sem
owner.

But if the above is true than the following scenario was
already possible before:

CPU0

spin_lockA()
  printk()
    console_unlock()
      call_console_drivers()
	spin_lockA()

By other words, this deadlock was there even before. Such
deadlocks are prevented by using printk_deferred() in
the sections guarded by the lock A.

I am sorry for the noise and that it took me so long to
get over this. Well, nobody said that there was something
wrong with my fears and why. I hope that I did not simplified
it too much this time ;-)

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] printk: Add console owner and waiter logic to load balance console writes
@ 2017-12-14 13:51       ` Petr Mladek
  0 siblings, 0 replies; 47+ messages in thread
From: Petr Mladek @ 2017-12-14 13:51 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, akpm, linux-mm, Cong Wang, Dave Hansen, Johannes Weiner,
	Mel Gorman, Michal Hocko, Sergey Senozhatsky, Vlastimil Babka,
	Peter Zijlstra, Linus Torvalds, Jan Kara, Mathieu Desnoyers,
	Tetsuo Handa, rostedt, Byungchul Park

On Fri 2017-11-24 16:58:16, Petr Mladek wrote:
> On Fri 2017-11-24 16:54:16, Petr Mladek wrote:
> > On Wed 2017-11-08 10:27:23, Steven Rostedt wrote:
> > > If there is a waiter, then it breaks out of the loop, clears the waiter
> > > flag (because that will release the waiter from its spin), and exits.
> > > Note, it does *not* release the console semaphore. Because it is a
> > > semaphore, there is no owner.
> > 
> > > Index: linux-trace.git/kernel/printk/printk.c
> > > ===================================================================
> > > --- linux-trace.git.orig/kernel/printk/printk.c
> > > +++ linux-trace.git/kernel/printk/printk.c
> > > @@ -86,8 +86,15 @@ EXPORT_SYMBOL_GPL(console_drivers);
> > >  static struct lockdep_map console_lock_dep_map = {
> > >  	.name = "console_lock"
> > >  };
> > > +static struct lockdep_map console_owner_dep_map = {
> > > +	.name = "console_owner"
> > > +};
> > >  #endif
> > >  
> > > +static DEFINE_RAW_SPINLOCK(console_owner_lock);
> > > +static struct task_struct *console_owner;
> > > +static bool console_waiter;
> > > +
> > >  enum devkmsg_log_bits {
> > >  	__DEVKMSG_LOG_BIT_ON = 0,
> > >  	__DEVKMSG_LOG_BIT_OFF,
> > > @@ -1753,8 +1760,56 @@ asmlinkage int vprintk_emit(int facility
> > >  		 * semaphore.  The release will print out buffers and wake up
> > >  		 * /dev/kmsg and syslog() users.
> > >  		 */
> > > -		if (console_trylock())
> > > +		if (console_trylock()) {
> > >  			console_unlock();
> > > +		} else {
> > > +			struct task_struct *owner = NULL;
> > > +			bool waiter;
> > > +			bool spin = false;
> > > +
> > > +			printk_safe_enter_irqsave(flags);
> > > +
> > > +			raw_spin_lock(&console_owner_lock);
> > > +			owner = READ_ONCE(console_owner);
> > > +			waiter = READ_ONCE(console_waiter);
> > > +			if (!waiter && owner && owner != current) {
> > > +				WRITE_ONCE(console_waiter, true);
> > > +				spin = true;
> > > +			}
> > > +			raw_spin_unlock(&console_owner_lock);
> > > +
> > > +			/*
> > > +			 * If there is an active printk() writing to the
> > > +			 * consoles, instead of having it write our data too,
> > > +			 * see if we can offload that load from the active
> > > +			 * printer, and do some printing ourselves.
> > > +			 * Go into a spin only if there isn't already a waiter
> > > +			 * spinning, and there is an active printer, and
> > > +			 * that active printer isn't us (recursive printk?).
> > > +			 */
> > > +			if (spin) {
> > > +				/* We spin waiting for the owner to release us */
> > > +				spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_);
> > > +				/* Owner will clear console_waiter on hand off */
> > > +				while (READ_ONCE(console_waiter))
> > > +					cpu_relax();
> > > +
> > > +				spin_release(&console_owner_dep_map, 1, _THIS_IP_);
> > > +				printk_safe_exit_irqrestore(flags);
> > > +
> > > +				/*
> > > +				 * The owner passed the console lock to us.
> > > +				 * Since we did not spin on console lock, annotate
> > > +				 * this as a trylock. Otherwise lockdep will
> > > +				 * complain.
> > > +				 */
> > > +				mutex_acquire(&console_lock_dep_map, 0, 1, _THIS_IP_);
> > 
> > I am not sure that this correctly imitates the real lock
> > dependency. The trylock flag means that we are able to skip
> > this section when the lock is taken elsewhere. But it is not
> > the whole truth. In fact, we are blocked in this code path
> > when console_sem is taken by another process.
> > 
> > The use of console_owner_lock is not enough. The other
> > console_sem calls a lot of code outside the section
> > guarded by console_owner_lock.

Ah, I confused here console_owner_lock and console_owner_dep_map.
The custom map covers all the code where console_owner is set.
It might be enough to catch a potential bug after all.


> > I think that we have actually entered the cross-release bunch
> > of problems, see https://lwn.net/Articles/709849/

Also I think that we do not need the cross-release stuff after all.
The thing is that we move console_sem only to printk() call
that normally calls console_unlock() as well. It means that
the transferred owner should not bring new type of dependencies.
As Steven said somewhere: "If there is a deadlock, it was
there even before."

We could look at it from this side. The possible deadlock would
look like:

CPU0				CPU1

console_unlock()

  console_owner = current;

				spin_lockA()
				  printk()
				    spin = true;
				    while (...)

    call_console_drivers()
      spin_lockA()

This would be a deadlock. CPU0 would wait for the lock A.
While CPU1 would own the lockA and would wait for CPU0
to finish calling the console drivers and pass the console_sem
owner.

But if the above is true than the following scenario was
already possible before:

CPU0

spin_lockA()
  printk()
    console_unlock()
      call_console_drivers()
	spin_lockA()

By other words, this deadlock was there even before. Such
deadlocks are prevented by using printk_deferred() in
the sections guarded by the lock A.

I am sorry for the noise and that it took me so long to
get over this. Well, nobody said that there was something
wrong with my fears and why. I hope that I did not simplified
it too much this time ;-)

Best Regards,
Petr

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] printk: Add console owner and waiter logic to load balance console writes
  2017-12-12 19:27             ` Steven Rostedt
@ 2017-12-14 14:34               ` Petr Mladek
  -1 siblings, 0 replies; 47+ messages in thread
From: Petr Mladek @ 2017-12-14 14:34 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Sergey Senozhatsky, Byungchul Park, LKML, akpm, linux-mm,
	Cong Wang, Dave Hansen, Johannes Weiner, Mel Gorman,
	Michal Hocko, Sergey Senozhatsky, Vlastimil Babka,
	Peter Zijlstra, Linus Torvalds, Jan Kara, Mathieu Desnoyers,
	Tetsuo Handa, rostedt, kernel-team

On Tue 2017-12-12 14:27:10, Steven Rostedt wrote:
> On Tue, 12 Dec 2017 14:39:21 +0900
> Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> wrote:
> 
> > p.s.
> > frankly, I don't see any "locking issues" in Steven's patch.
> 
> Should I push out another revision of mine?

I am going to to give some more testing v4 within next few days.
If it works well, I think that it would need just some cosmetic
changes.

For example, it would be nice to somehow encapsulate
the handshake-related code into few helpers. I believe that
it might help us to understand and maintain it. Both
vprintk_emit() and console_unlock() were too long already
before.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] printk: Add console owner and waiter logic to load balance console writes
@ 2017-12-14 14:34               ` Petr Mladek
  0 siblings, 0 replies; 47+ messages in thread
From: Petr Mladek @ 2017-12-14 14:34 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Sergey Senozhatsky, Byungchul Park, LKML, akpm, linux-mm,
	Cong Wang, Dave Hansen, Johannes Weiner, Mel Gorman,
	Michal Hocko, Sergey Senozhatsky, Vlastimil Babka,
	Peter Zijlstra, Linus Torvalds, Jan Kara, Mathieu Desnoyers,
	Tetsuo Handa, rostedt, kernel-team

On Tue 2017-12-12 14:27:10, Steven Rostedt wrote:
> On Tue, 12 Dec 2017 14:39:21 +0900
> Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> wrote:
> 
> > p.s.
> > frankly, I don't see any "locking issues" in Steven's patch.
> 
> Should I push out another revision of mine?

I am going to to give some more testing v4 within next few days.
If it works well, I think that it would need just some cosmetic
changes.

For example, it would be nice to somehow encapsulate
the handshake-related code into few helpers. I believe that
it might help us to understand and maintain it. Both
vprintk_emit() and console_unlock() were too long already
before.

Best Regards,
Petr

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] printk: Add console owner and waiter logic to load balance console writes
  2017-11-08 15:27 ` Steven Rostedt
@ 2017-12-22 10:31   ` Petr Mladek
  -1 siblings, 0 replies; 47+ messages in thread
From: Petr Mladek @ 2017-12-22 10:31 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, akpm, linux-mm, Cong Wang, Dave Hansen, Johannes Weiner,
	Mel Gorman, Michal Hocko, Sergey Senozhatsky, Vlastimil Babka,
	Peter Zijlstra, Linus Torvalds, Jan Kara, Mathieu Desnoyers,
	Tetsuo Handa, rostedt

On Wed 2017-11-08 10:27:23, Steven Rostedt wrote:
> [ claws-mail is really pissing me off. It did it again, after I
>   manually fixed all the addresses. This time, I'm going to do things
>   slightly different. Sorry for all the spam :-( ]
> 
> From: Steven Rostedt (VMware) <rostedt@goodmis.org>
> 
> This patch implements what I discussed in Kernel Summit. I added
> lockdep annotation (hopefully correctly), and it hasn't had any splats
> (since I fixed some bugs in the first iterations). It did catch
> problems when I had the owner covering too much. But now that the owner
> is only set when actively calling the consoles, lockdep has stayed
> quiet.

> Index: linux-trace.git/kernel/printk/printk.c
> ===================================================================
> --- linux-trace.git.orig/kernel/printk/printk.c
> +++ linux-trace.git/kernel/printk/printk.c
> @@ -2141,6 +2196,7 @@ void console_unlock(void)
>  	static u64 seen_seq;
>  	unsigned long flags;
>  	bool wake_klogd = false;
> +	bool waiter = false;
>  	bool do_cond_resched, retry;
>  
>  	if (console_suspended) {
> @@ -2229,14 +2285,64 @@ skip:
>  		console_seq++;
>  		raw_spin_unlock(&logbuf_lock);
>  
> +		/*
> +		 * While actively printing out messages, if another printk()
> +		 * were to occur on another CPU, it may wait for this one to
> +		 * finish. This task can not be preempted if there is a
> +		 * waiter waiting to take over.
> +		 */
> +		raw_spin_lock(&console_owner_lock);
> +		console_owner = current;
> +		raw_spin_unlock(&console_owner_lock);

One idea. We could do the above only when "do_cond_resched" is false.
I mean that we could allow stealing the console duty only from
atomic context.

If I get it correctly, this variable is always true in schedulable
context.

> +
> +		/* The waiter may spin on us after setting console_owner */
> +		spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_);
> +
>  		stop_critical_timings();	/* don't trace print latency */
>  		call_console_drivers(ext_text, ext_len, text, len);
>  		start_critical_timings();
> +
> +		raw_spin_lock(&console_owner_lock);
> +		waiter = READ_ONCE(console_waiter);
> +		console_owner = NULL;
> +		raw_spin_unlock(&console_owner_lock);
> +
> +		/*
> +		 * If there is a waiter waiting for us, then pass the
> +		 * rest of the work load over to that waiter.
> +		 */
> +		if (waiter)
> +			break;
> +
> +		/* There was no waiter, and nothing will spin on us here */
> +		spin_release(&console_owner_dep_map, 1, _THIS_IP_);
> +
>  		printk_safe_exit_irqrestore(flags);
>  
>  		if (do_cond_resched)
>  			cond_resched();

On the contrary, we could allow steeling the console semaphore
when sleeping here. It would allow to get the messages out
faster. It might help to move the duty to someone who is
actually producing many messages or even the panic() caller.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] printk: Add console owner and waiter logic to load balance console writes
@ 2017-12-22 10:31   ` Petr Mladek
  0 siblings, 0 replies; 47+ messages in thread
From: Petr Mladek @ 2017-12-22 10:31 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, akpm, linux-mm, Cong Wang, Dave Hansen, Johannes Weiner,
	Mel Gorman, Michal Hocko, Sergey Senozhatsky, Vlastimil Babka,
	Peter Zijlstra, Linus Torvalds, Jan Kara, Mathieu Desnoyers,
	Tetsuo Handa, rostedt

On Wed 2017-11-08 10:27:23, Steven Rostedt wrote:
> [ claws-mail is really pissing me off. It did it again, after I
>   manually fixed all the addresses. This time, I'm going to do things
>   slightly different. Sorry for all the spam :-( ]
> 
> From: Steven Rostedt (VMware) <rostedt@goodmis.org>
> 
> This patch implements what I discussed in Kernel Summit. I added
> lockdep annotation (hopefully correctly), and it hasn't had any splats
> (since I fixed some bugs in the first iterations). It did catch
> problems when I had the owner covering too much. But now that the owner
> is only set when actively calling the consoles, lockdep has stayed
> quiet.

> Index: linux-trace.git/kernel/printk/printk.c
> ===================================================================
> --- linux-trace.git.orig/kernel/printk/printk.c
> +++ linux-trace.git/kernel/printk/printk.c
> @@ -2141,6 +2196,7 @@ void console_unlock(void)
>  	static u64 seen_seq;
>  	unsigned long flags;
>  	bool wake_klogd = false;
> +	bool waiter = false;
>  	bool do_cond_resched, retry;
>  
>  	if (console_suspended) {
> @@ -2229,14 +2285,64 @@ skip:
>  		console_seq++;
>  		raw_spin_unlock(&logbuf_lock);
>  
> +		/*
> +		 * While actively printing out messages, if another printk()
> +		 * were to occur on another CPU, it may wait for this one to
> +		 * finish. This task can not be preempted if there is a
> +		 * waiter waiting to take over.
> +		 */
> +		raw_spin_lock(&console_owner_lock);
> +		console_owner = current;
> +		raw_spin_unlock(&console_owner_lock);

One idea. We could do the above only when "do_cond_resched" is false.
I mean that we could allow stealing the console duty only from
atomic context.

If I get it correctly, this variable is always true in schedulable
context.

> +
> +		/* The waiter may spin on us after setting console_owner */
> +		spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_);
> +
>  		stop_critical_timings();	/* don't trace print latency */
>  		call_console_drivers(ext_text, ext_len, text, len);
>  		start_critical_timings();
> +
> +		raw_spin_lock(&console_owner_lock);
> +		waiter = READ_ONCE(console_waiter);
> +		console_owner = NULL;
> +		raw_spin_unlock(&console_owner_lock);
> +
> +		/*
> +		 * If there is a waiter waiting for us, then pass the
> +		 * rest of the work load over to that waiter.
> +		 */
> +		if (waiter)
> +			break;
> +
> +		/* There was no waiter, and nothing will spin on us here */
> +		spin_release(&console_owner_dep_map, 1, _THIS_IP_);
> +
>  		printk_safe_exit_irqrestore(flags);
>  
>  		if (do_cond_resched)
>  			cond_resched();

On the contrary, we could allow steeling the console semaphore
when sleeping here. It would allow to get the messages out
faster. It might help to move the duty to someone who is
actually producing many messages or even the panic() caller.

Best Regards,
Petr

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] printk: Add console owner and waiter logic to load balance console writes
  2017-12-22 10:31   ` Petr Mladek
@ 2017-12-22 12:44     ` Steven Rostedt
  -1 siblings, 0 replies; 47+ messages in thread
From: Steven Rostedt @ 2017-12-22 12:44 UTC (permalink / raw)
  To: Petr Mladek
  Cc: LKML, akpm, linux-mm, Cong Wang, Dave Hansen, Johannes Weiner,
	Mel Gorman, Michal Hocko, Sergey Senozhatsky, Vlastimil Babka,
	Peter Zijlstra, Linus Torvalds, Jan Kara, Mathieu Desnoyers,
	Tetsuo Handa, rostedt

On Fri, 22 Dec 2017 11:31:31 +0100
Petr Mladek <pmladek@suse.com> wrote:

> > Index: linux-trace.git/kernel/printk/printk.c
> > ===================================================================
> > --- linux-trace.git.orig/kernel/printk/printk.c
> > +++ linux-trace.git/kernel/printk/printk.c
> > @@ -2141,6 +2196,7 @@ void console_unlock(void)
> >  	static u64 seen_seq;
> >  	unsigned long flags;
> >  	bool wake_klogd = false;
> > +	bool waiter = false;
> >  	bool do_cond_resched, retry;
> >  
> >  	if (console_suspended) {
> > @@ -2229,14 +2285,64 @@ skip:
> >  		console_seq++;
> >  		raw_spin_unlock(&logbuf_lock);
> >  
> > +		/*
> > +		 * While actively printing out messages, if another printk()
> > +		 * were to occur on another CPU, it may wait for this one to
> > +		 * finish. This task can not be preempted if there is a
> > +		 * waiter waiting to take over.
> > +		 */
> > +		raw_spin_lock(&console_owner_lock);
> > +		console_owner = current;
> > +		raw_spin_unlock(&console_owner_lock);  
> 
> One idea. We could do the above only when "do_cond_resched" is false.
> I mean that we could allow stealing the console duty only from
> atomic context.

I'd like to hold off before making a change like that. I thought about
it, but by saying "atomic" is more important than "non-atomic" can also
lead to problems. Once you don't allow stealing, you just changed
printk to be unbounded again. Maybe that's not an issue. But I'd rather
add that as an enhancement in case. I could make this a patch series,
and we can build cases like this up.

> 
> If I get it correctly, this variable is always true in schedulable
> context.
> 
> > +
> > +		/* The waiter may spin on us after setting console_owner */
> > +		spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_);
> > +
> >  		stop_critical_timings();	/* don't trace print latency */
> >  		call_console_drivers(ext_text, ext_len, text, len);
> >  		start_critical_timings();
> > +
> > +		raw_spin_lock(&console_owner_lock);
> > +		waiter = READ_ONCE(console_waiter);
> > +		console_owner = NULL;
> > +		raw_spin_unlock(&console_owner_lock);
> > +
> > +		/*
> > +		 * If there is a waiter waiting for us, then pass the
> > +		 * rest of the work load over to that waiter.
> > +		 */
> > +		if (waiter)
> > +			break;
> > +
> > +		/* There was no waiter, and nothing will spin on us here */
> > +		spin_release(&console_owner_dep_map, 1, _THIS_IP_);
> > +
> >  		printk_safe_exit_irqrestore(flags);
> >  
> >  		if (do_cond_resched)
> >  			cond_resched();  
> 
> On the contrary, we could allow steeling the console semaphore
> when sleeping here. It would allow to get the messages out
> faster. It might help to move the duty to someone who is
> actually producing many messages or even the panic() caller.

Good point. I'll add a patch that adds that feature too.

Thanks!

-- Steve

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] printk: Add console owner and waiter logic to load balance console writes
@ 2017-12-22 12:44     ` Steven Rostedt
  0 siblings, 0 replies; 47+ messages in thread
From: Steven Rostedt @ 2017-12-22 12:44 UTC (permalink / raw)
  To: Petr Mladek
  Cc: LKML, akpm, linux-mm, Cong Wang, Dave Hansen, Johannes Weiner,
	Mel Gorman, Michal Hocko, Sergey Senozhatsky, Vlastimil Babka,
	Peter Zijlstra, Linus Torvalds, Jan Kara, Mathieu Desnoyers,
	Tetsuo Handa, rostedt

On Fri, 22 Dec 2017 11:31:31 +0100
Petr Mladek <pmladek@suse.com> wrote:

> > Index: linux-trace.git/kernel/printk/printk.c
> > ===================================================================
> > --- linux-trace.git.orig/kernel/printk/printk.c
> > +++ linux-trace.git/kernel/printk/printk.c
> > @@ -2141,6 +2196,7 @@ void console_unlock(void)
> >  	static u64 seen_seq;
> >  	unsigned long flags;
> >  	bool wake_klogd = false;
> > +	bool waiter = false;
> >  	bool do_cond_resched, retry;
> >  
> >  	if (console_suspended) {
> > @@ -2229,14 +2285,64 @@ skip:
> >  		console_seq++;
> >  		raw_spin_unlock(&logbuf_lock);
> >  
> > +		/*
> > +		 * While actively printing out messages, if another printk()
> > +		 * were to occur on another CPU, it may wait for this one to
> > +		 * finish. This task can not be preempted if there is a
> > +		 * waiter waiting to take over.
> > +		 */
> > +		raw_spin_lock(&console_owner_lock);
> > +		console_owner = current;
> > +		raw_spin_unlock(&console_owner_lock);  
> 
> One idea. We could do the above only when "do_cond_resched" is false.
> I mean that we could allow stealing the console duty only from
> atomic context.

I'd like to hold off before making a change like that. I thought about
it, but by saying "atomic" is more important than "non-atomic" can also
lead to problems. Once you don't allow stealing, you just changed
printk to be unbounded again. Maybe that's not an issue. But I'd rather
add that as an enhancement in case. I could make this a patch series,
and we can build cases like this up.

> 
> If I get it correctly, this variable is always true in schedulable
> context.
> 
> > +
> > +		/* The waiter may spin on us after setting console_owner */
> > +		spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_);
> > +
> >  		stop_critical_timings();	/* don't trace print latency */
> >  		call_console_drivers(ext_text, ext_len, text, len);
> >  		start_critical_timings();
> > +
> > +		raw_spin_lock(&console_owner_lock);
> > +		waiter = READ_ONCE(console_waiter);
> > +		console_owner = NULL;
> > +		raw_spin_unlock(&console_owner_lock);
> > +
> > +		/*
> > +		 * If there is a waiter waiting for us, then pass the
> > +		 * rest of the work load over to that waiter.
> > +		 */
> > +		if (waiter)
> > +			break;
> > +
> > +		/* There was no waiter, and nothing will spin on us here */
> > +		spin_release(&console_owner_dep_map, 1, _THIS_IP_);
> > +
> >  		printk_safe_exit_irqrestore(flags);
> >  
> >  		if (do_cond_resched)
> >  			cond_resched();  
> 
> On the contrary, we could allow steeling the console semaphore
> when sleeping here. It would allow to get the messages out
> faster. It might help to move the duty to someone who is
> actually producing many messages or even the panic() caller.

Good point. I'll add a patch that adds that feature too.

Thanks!

-- Steve

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] printk: Add console owner and waiter logic to load balance console writes
  2017-12-22 12:44     ` Steven Rostedt
@ 2018-01-10 12:50       ` Petr Mladek
  -1 siblings, 0 replies; 47+ messages in thread
From: Petr Mladek @ 2018-01-10 12:50 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, akpm, linux-mm, Cong Wang, Dave Hansen, Johannes Weiner,
	Mel Gorman, Michal Hocko, Sergey Senozhatsky, Vlastimil Babka,
	Peter Zijlstra, Linus Torvalds, Jan Kara, Mathieu Desnoyers,
	Tetsuo Handa, rostedt

On Fri 2017-12-22 07:44:26, Steven Rostedt wrote:
> On Fri, 22 Dec 2017 11:31:31 +0100
> Petr Mladek <pmladek@suse.com> wrote:
> 
> > > Index: linux-trace.git/kernel/printk/printk.c
> > > ===================================================================
> > > --- linux-trace.git.orig/kernel/printk/printk.c
> > > +++ linux-trace.git/kernel/printk/printk.c
> > > @@ -2141,6 +2196,7 @@ void console_unlock(void)
> > >  	static u64 seen_seq;
> > >  	unsigned long flags;
> > >  	bool wake_klogd = false;
> > > +	bool waiter = false;
> > >  	bool do_cond_resched, retry;
> > >  
> > >  	if (console_suspended) {
> > > @@ -2229,14 +2285,64 @@ skip:
> > >  		console_seq++;
> > >  		raw_spin_unlock(&logbuf_lock);
> > >  
> > > +		/*
> > > +		 * While actively printing out messages, if another printk()
> > > +		 * were to occur on another CPU, it may wait for this one to
> > > +		 * finish. This task can not be preempted if there is a
> > > +		 * waiter waiting to take over.
> > > +		 */
> > > +		raw_spin_lock(&console_owner_lock);
> > > +		console_owner = current;
> > > +		raw_spin_unlock(&console_owner_lock);  
> > 
> > One idea. We could do the above only when "do_cond_resched" is false.
> > I mean that we could allow stealing the console duty only from
> > atomic context.
> 
> I'd like to hold off before making a change like that. I thought about
> it, but by saying "atomic" is more important than "non-atomic" can also
> lead to problems. Once you don't allow stealing, you just changed
> printk to be unbounded again. Maybe that's not an issue. But I'd rather
> add that as an enhancement in case. I could make this a patch series,
> and we can build cases like this up.

I see the point. It might reduce the chance of the handshake and
load balancing. Let's avoid it for now.

> > 
> > If I get it correctly, this variable is always true in schedulable
> > context.
> > 
> > > +
> > > +		/* The waiter may spin on us after setting console_owner */
> > > +		spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_);
> > > +
> > >  		stop_critical_timings();	/* don't trace print latency */
> > >  		call_console_drivers(ext_text, ext_len, text, len);
> > >  		start_critical_timings();
> > > +
> > > +		raw_spin_lock(&console_owner_lock);
> > > +		waiter = READ_ONCE(console_waiter);
> > > +		console_owner = NULL;
> > > +		raw_spin_unlock(&console_owner_lock);
> > > +
> > > +		/*
> > > +		 * If there is a waiter waiting for us, then pass the
> > > +		 * rest of the work load over to that waiter.
> > > +		 */
> > > +		if (waiter)
> > > +			break;
> > > +
> > > +		/* There was no waiter, and nothing will spin on us here */
> > > +		spin_release(&console_owner_dep_map, 1, _THIS_IP_);
> > > +
> > >  		printk_safe_exit_irqrestore(flags);
> > >  
> > >  		if (do_cond_resched)
> > >  			cond_resched();  
> > 
> > On the contrary, we could allow steeling the console semaphore
> > when sleeping here. It would allow to get the messages out
> > faster. It might help to move the duty to someone who is
> > actually producing many messages or even the panic() caller.
> 
> Good point. I'll add a patch that adds that feature too.

Ah, it would require a bit different logic. The waiter would need
to take the lock immediately. The current/owner would need to detect
that she lost the lock once she wakes up. It would make sense
to support yet another passing of the lock even before the first
owner wakes up.

All this is doable but not that easy. We might and should try
it later only if the current solution is not enough.

Best Regards,
Petr

PS: I am about to sent a clean up on top of Steven's patch.
I offered this to Steven before Christmas. Unfortunately
it got delayed by the holidays, another urgent work and sickness.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] printk: Add console owner and waiter logic to load balance console writes
@ 2018-01-10 12:50       ` Petr Mladek
  0 siblings, 0 replies; 47+ messages in thread
From: Petr Mladek @ 2018-01-10 12:50 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, akpm, linux-mm, Cong Wang, Dave Hansen, Johannes Weiner,
	Mel Gorman, Michal Hocko, Sergey Senozhatsky, Vlastimil Babka,
	Peter Zijlstra, Linus Torvalds, Jan Kara, Mathieu Desnoyers,
	Tetsuo Handa, rostedt

On Fri 2017-12-22 07:44:26, Steven Rostedt wrote:
> On Fri, 22 Dec 2017 11:31:31 +0100
> Petr Mladek <pmladek@suse.com> wrote:
> 
> > > Index: linux-trace.git/kernel/printk/printk.c
> > > ===================================================================
> > > --- linux-trace.git.orig/kernel/printk/printk.c
> > > +++ linux-trace.git/kernel/printk/printk.c
> > > @@ -2141,6 +2196,7 @@ void console_unlock(void)
> > >  	static u64 seen_seq;
> > >  	unsigned long flags;
> > >  	bool wake_klogd = false;
> > > +	bool waiter = false;
> > >  	bool do_cond_resched, retry;
> > >  
> > >  	if (console_suspended) {
> > > @@ -2229,14 +2285,64 @@ skip:
> > >  		console_seq++;
> > >  		raw_spin_unlock(&logbuf_lock);
> > >  
> > > +		/*
> > > +		 * While actively printing out messages, if another printk()
> > > +		 * were to occur on another CPU, it may wait for this one to
> > > +		 * finish. This task can not be preempted if there is a
> > > +		 * waiter waiting to take over.
> > > +		 */
> > > +		raw_spin_lock(&console_owner_lock);
> > > +		console_owner = current;
> > > +		raw_spin_unlock(&console_owner_lock);  
> > 
> > One idea. We could do the above only when "do_cond_resched" is false.
> > I mean that we could allow stealing the console duty only from
> > atomic context.
> 
> I'd like to hold off before making a change like that. I thought about
> it, but by saying "atomic" is more important than "non-atomic" can also
> lead to problems. Once you don't allow stealing, you just changed
> printk to be unbounded again. Maybe that's not an issue. But I'd rather
> add that as an enhancement in case. I could make this a patch series,
> and we can build cases like this up.

I see the point. It might reduce the chance of the handshake and
load balancing. Let's avoid it for now.

> > 
> > If I get it correctly, this variable is always true in schedulable
> > context.
> > 
> > > +
> > > +		/* The waiter may spin on us after setting console_owner */
> > > +		spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_);
> > > +
> > >  		stop_critical_timings();	/* don't trace print latency */
> > >  		call_console_drivers(ext_text, ext_len, text, len);
> > >  		start_critical_timings();
> > > +
> > > +		raw_spin_lock(&console_owner_lock);
> > > +		waiter = READ_ONCE(console_waiter);
> > > +		console_owner = NULL;
> > > +		raw_spin_unlock(&console_owner_lock);
> > > +
> > > +		/*
> > > +		 * If there is a waiter waiting for us, then pass the
> > > +		 * rest of the work load over to that waiter.
> > > +		 */
> > > +		if (waiter)
> > > +			break;
> > > +
> > > +		/* There was no waiter, and nothing will spin on us here */
> > > +		spin_release(&console_owner_dep_map, 1, _THIS_IP_);
> > > +
> > >  		printk_safe_exit_irqrestore(flags);
> > >  
> > >  		if (do_cond_resched)
> > >  			cond_resched();  
> > 
> > On the contrary, we could allow steeling the console semaphore
> > when sleeping here. It would allow to get the messages out
> > faster. It might help to move the duty to someone who is
> > actually producing many messages or even the panic() caller.
> 
> Good point. I'll add a patch that adds that feature too.

Ah, it would require a bit different logic. The waiter would need
to take the lock immediately. The current/owner would need to detect
that she lost the lock once she wakes up. It would make sense
to support yet another passing of the lock even before the first
owner wakes up.

All this is doable but not that easy. We might and should try
it later only if the current solution is not enough.

Best Regards,
Petr

PS: I am about to sent a clean up on top of Steven's patch.
I offered this to Steven before Christmas. Unfortunately
it got delayed by the holidays, another urgent work and sickness.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v4] printk: Add console owner and waiter logic to load   balance console writes
@ 2017-11-08 15:13 Steven Rostedt
  0 siblings, 0 replies; 47+ messages in thread
From: Steven Rostedt @ 2017-11-08 15:13 UTC (permalink / raw)
  To: LKML
  Cc: akpm, linux-mm, Cong Wang, Dave Hansen, Johannes Weiner,
	Mel Gorman, Michal Hocko, Petr Mladek, Sergey Senozhatsky,
	Vlastimil Babka,
	yuwang.yuwang"   <yuwang.yuwang@alibaba-inc.com>,
	Peter Zijlstra  <peterz@infradead.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Jan  Kara <jack@suse.cz>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	 Tetsuo Handa  <penguin-kernel@I-love.SAKURA.ne.jp>,
	 rostedt@home.goodmis.org

[ Sending again, because claws-mail screwed up some of the addresses
  when I did a cut-and-paste causing several bounces. ]

From: Steven Rostedt (VMware) <rostedt@goodmis.org>

This patch implements what I discussed in Kernel Summit. I added
lockdep annotation (hopefully correctly), and it hasn't had any splats
(since I fixed some bugs in the first iterations). It did catch
problems when I had the owner covering too much. But now that the owner
is only set when actively calling the consoles, lockdep has stayed
quiet.
 
Here's the design again:

I added a "console_owner" which is set to a task that is actively
writing to the consoles. It is *not* the same an the owner of the
console_lock. It is only set when doing the calls to the console
functions. It is protected by a console_owner_lock which is a raw spin
lock.

There is a console_waiter. This is set when there is an active console
owner that is not current, and waiter is not set. This too is protected
by console_owner_lock.

In printk() when it tries to write to the consoles, we have:

	if (console_trylock())
		console_unlock();

Now I added an else, which will check if there is an active owner, and
no current waiter. If that is the case, then console_waiter is set, and
the task goes into a spin until it is no longer set.

When the active console owner finishes writing the current message to
the consoles, it grabs the console_owner_lock and sees if there is a
waiter, and clears console_owner.

If there is a waiter, then it breaks out of the loop, clears the waiter
flag (because that will release the waiter from its spin), and exits.
Note, it does *not* release the console semaphore. Because it is a
semaphore, there is no owner. Another task may release it. This means
that the waiter is guaranteed to be the new console owner! Which it
becomes.

Then the waiter calls console_unlock() and continues to write to the
consoles.

If another task comes along and does a printk() it too can become the
new waiter, and we wash rinse and repeat!

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
Changes from v3:

	Fixed while loop on console_waiter (Thanks Vlastimil!)

	Moved console_owner out of logbuf_lock taking (reported by
	Tetsuo Handa)

Changes from v2:

  - Added back some READ/WRITE_ONCE() just to be on the safe side

Index: linux-trace.git/kernel/printk/printk.c
===================================================================
--- linux-trace.git.orig/kernel/printk/printk.c
+++ linux-trace.git/kernel/printk/printk.c
@@ -86,8 +86,15 @@ EXPORT_SYMBOL_GPL(console_drivers);
 static struct lockdep_map console_lock_dep_map = {
 	.name = "console_lock"
 };
+static struct lockdep_map console_owner_dep_map = {
+	.name = "console_owner"
+};
 #endif
 
+static DEFINE_RAW_SPINLOCK(console_owner_lock);
+static struct task_struct *console_owner;
+static bool console_waiter;
+
 enum devkmsg_log_bits {
 	__DEVKMSG_LOG_BIT_ON = 0,
 	__DEVKMSG_LOG_BIT_OFF,
@@ -1753,8 +1760,56 @@ asmlinkage int vprintk_emit(int facility
 		 * semaphore.  The release will print out buffers and
wake up
 		 * /dev/kmsg and syslog() users.
 		 */
-		if (console_trylock())
+		if (console_trylock()) {
 			console_unlock();
+		} else {
+			struct task_struct *owner = NULL;
+			bool waiter;
+			bool spin = false;
+
+			printk_safe_enter_irqsave(flags);
+
+			raw_spin_lock(&console_owner_lock);
+			owner = READ_ONCE(console_owner);
+			waiter = READ_ONCE(console_waiter);
+			if (!waiter && owner && owner != current) {
+				WRITE_ONCE(console_waiter, true);
+				spin = true;
+			}
+			raw_spin_unlock(&console_owner_lock);
+
+			/*
+			 * If there is an active printk() writing to
the
+			 * consoles, instead of having it write our
data too,
+			 * see if we can offload that load from the
active
+			 * printer, and do some printing ourselves.
+			 * Go into a spin only if there isn't already
a waiter
+			 * spinning, and there is an active printer,
and
+			 * that active printer isn't us (recursive
printk?).
+			 */
+			if (spin) {
+				/* We spin waiting for the owner to
release us */
+				spin_acquire(&console_owner_dep_map,
0, 0, _THIS_IP_);
+				/* Owner will clear console_waiter on
hand off */
+				while (READ_ONCE(console_waiter))
+					cpu_relax();
+
+				spin_release(&console_owner_dep_map,
1, _THIS_IP_);
+				printk_safe_exit_irqrestore(flags);
+
+				/*
+				 * The owner passed the console lock
to us.
+				 * Since we did not spin on console
lock, annotate
+				 * this as a trylock. Otherwise
lockdep will
+				 * complain.
+				 */
+				mutex_acquire(&console_lock_dep_map,
0, 1, _THIS_IP_);
+				console_unlock();
+				printk_safe_enter_irqsave(flags);
+			}
+			printk_safe_exit_irqrestore(flags);
+
+		}
 	}
 
 	return printed_len;
@@ -2141,6 +2196,7 @@ void console_unlock(void)
 	static u64 seen_seq;
 	unsigned long flags;
 	bool wake_klogd = false;
+	bool waiter = false;
 	bool do_cond_resched, retry;
 
 	if (console_suspended) {
@@ -2229,14 +2285,64 @@ skip:
 		console_seq++;
 		raw_spin_unlock(&logbuf_lock);
 
+		/*
+		 * While actively printing out messages, if another
printk()
+		 * were to occur on another CPU, it may wait for this
one to
+		 * finish. This task can not be preempted if there is a
+		 * waiter waiting to take over.
+		 */
+		raw_spin_lock(&console_owner_lock);
+		console_owner = current;
+		raw_spin_unlock(&console_owner_lock);
+
+		/* The waiter may spin on us after setting
console_owner */
+		spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_);
+
 		stop_critical_timings();	/* don't trace print
latency */ call_console_drivers(ext_text, ext_len, text, len);
 		start_critical_timings();
+
+		raw_spin_lock(&console_owner_lock);
+		waiter = READ_ONCE(console_waiter);
+		console_owner = NULL;
+		raw_spin_unlock(&console_owner_lock);
+
+		/*
+		 * If there is a waiter waiting for us, then pass the
+		 * rest of the work load over to that waiter.
+		 */
+		if (waiter)
+			break;
+
+		/* There was no waiter, and nothing will spin on us
here */
+		spin_release(&console_owner_dep_map, 1, _THIS_IP_);
+
 		printk_safe_exit_irqrestore(flags);
 
 		if (do_cond_resched)
 			cond_resched();
 	}
+
+	/*
+	 * If there is an active waiter waiting on the console_lock.
+	 * Pass off the printing to the waiter, and the waiter
+	 * will continue printing on its CPU, and when all writing
+	 * has finished, the last printer will wake up klogd.
+	 */
+	if (waiter) {
+		WRITE_ONCE(console_waiter, false);
+		/* The waiter is now free to continue */
+		spin_release(&console_owner_dep_map, 1, _THIS_IP_);
+		/*
+		 * Hand off console_lock to waiter. The waiter will
perform
+		 * the up(). After this, the waiter is the
console_lock owner.
+		 */
+		mutex_release(&console_lock_dep_map, 1, _THIS_IP_);
+		printk_safe_exit_irqrestore(flags);
+		/* Note, if waiter is set, logbuf_lock is not held */
+		return;
+	}
+
 	console_locked = 0;
 
 	/* Release the exclusive_console once it is used */

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] printk: Add console owner and waiter logic to load balance console writes
  2017-11-08 15:03 Steven Rostedt
@ 2017-11-08 15:10 ` Steven Rostedt
  0 siblings, 0 replies; 47+ messages in thread
From: Steven Rostedt @ 2017-11-08 15:10 UTC (permalink / raw)
  To: LKML
  Cc: akpm, linux-mm, Cong Wang, Dave Hansen, Johannes Weiner,
	Mel Gorman, Michal Hocko, Petr Mladek, Sergey Senozhatsky,
	Vlastimil Babka,
	yuwang.yuwang"    <yuwang.yuwang@alibaba-inc.com>,
	Peter Zijlstra  <peterz@infradead.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Jan  Kara <jack@suse.cz>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	 Tetsuo Handa   <penguin-kernel@I-love.SAKURA.ne.jp>,
	 rostedt@home.goodmis.org

On Wed, 8 Nov 2017 10:03:21 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> +		/*
> +		 * If there is a waiter waiting for us, then pass the
> +		 * rest of the work load over to that waiter.
> +		 */
> +		if (waiter)
> +			break;
> +

And if we are worried about flushing on crashes, we could change this
to:

	if (waiter) {
		if (oops_in_progress)
			waiter = false;
		else
			break;
	}

And keep whoever is printing printing if a crash is happening.

-- Steve

			

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v4] printk: Add console owner and waiter logic to load   balance console writes
@ 2017-11-08 15:03 Steven Rostedt
  2017-11-08 15:10 ` Steven Rostedt
  0 siblings, 1 reply; 47+ messages in thread
From: Steven Rostedt @ 2017-11-08 15:03 UTC (permalink / raw)
  To: LKML
  Cc: akpm, linux-mm, Cong Wang, Dave Hansen, Johannes Weiner,
	Mel Gorman, Michal Hocko, Petr Mladek, Sergey Senozhatsky,
	Vlastimil Babka,
	yuwang.yuwang"    <yuwang.yuwang@alibaba-inc.com>,
	Peter Zijlstra <peterz@infradead.org>,
	 Linus Torvalds <torvalds@linux-foundation.org>,
	Jan Kara <jack@suse.cz>,
	 Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Tetsuo Handa   <penguin-kernel@I-love.SAKURA.ne.jp>,
	rostedt@home.goodmis.org

From: Steven Rostedt (VMware) <rostedt@goodmis.org>

This patch implements what I discussed in Kernel Summit. I added
lockdep annotation (hopefully correctly), and it hasn't had any splats
(since I fixed some bugs in the first iterations). It did catch
problems when I had the owner covering too much. But now that the owner
is only set when actively calling the consoles, lockdep has stayed
quiet.
 
Here's the design again:

I added a "console_owner" which is set to a task that is actively
writing to the consoles. It is *not* the same an the owner of the
console_lock. It is only set when doing the calls to the console
functions. It is protected by a console_owner_lock which is a raw spin
lock.

There is a console_waiter. This is set when there is an active console
owner that is not current, and waiter is not set. This too is protected
by console_owner_lock.

In printk() when it tries to write to the consoles, we have:

	if (console_trylock())
		console_unlock();

Now I added an else, which will check if there is an active owner, and
no current waiter. If that is the case, then console_waiter is set, and
the task goes into a spin until it is no longer set.

When the active console owner finishes writing the current message to
the consoles, it grabs the console_owner_lock and sees if there is a
waiter, and clears console_owner.

If there is a waiter, then it breaks out of the loop, clears the waiter
flag (because that will release the waiter from its spin), and exits.
Note, it does *not* release the console semaphore. Because it is a
semaphore, there is no owner. Another task may release it. This means
that the waiter is guaranteed to be the new console owner! Which it
becomes.

Then the waiter calls console_unlock() and continues to write to the
consoles.

If another task comes along and does a printk() it too can become the
new waiter, and we wash rinse and repeat!

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
Changes from v3:

	Fixed while loop on console_waiter (Thanks Vlastimil!)

	Moved console_owner out of logbuf_lock taking (reported by Tetsuo Handa)

Changes from v2:

  - Added back some READ/WRITE_ONCE() just to be on the safe side

Index: linux-trace.git/kernel/printk/printk.c
===================================================================
--- linux-trace.git.orig/kernel/printk/printk.c
+++ linux-trace.git/kernel/printk/printk.c
@@ -86,8 +86,15 @@ EXPORT_SYMBOL_GPL(console_drivers);
 static struct lockdep_map console_lock_dep_map = {
 	.name = "console_lock"
 };
+static struct lockdep_map console_owner_dep_map = {
+	.name = "console_owner"
+};
 #endif
 
+static DEFINE_RAW_SPINLOCK(console_owner_lock);
+static struct task_struct *console_owner;
+static bool console_waiter;
+
 enum devkmsg_log_bits {
 	__DEVKMSG_LOG_BIT_ON = 0,
 	__DEVKMSG_LOG_BIT_OFF,
@@ -1753,8 +1760,56 @@ asmlinkage int vprintk_emit(int facility
 		 * semaphore.  The release will print out buffers and wake up
 		 * /dev/kmsg and syslog() users.
 		 */
-		if (console_trylock())
+		if (console_trylock()) {
 			console_unlock();
+		} else {
+			struct task_struct *owner = NULL;
+			bool waiter;
+			bool spin = false;
+
+			printk_safe_enter_irqsave(flags);
+
+			raw_spin_lock(&console_owner_lock);
+			owner = READ_ONCE(console_owner);
+			waiter = READ_ONCE(console_waiter);
+			if (!waiter && owner && owner != current) {
+				WRITE_ONCE(console_waiter, true);
+				spin = true;
+			}
+			raw_spin_unlock(&console_owner_lock);
+
+			/*
+			 * If there is an active printk() writing to the
+			 * consoles, instead of having it write our data too,
+			 * see if we can offload that load from the active
+			 * printer, and do some printing ourselves.
+			 * Go into a spin only if there isn't already a waiter
+			 * spinning, and there is an active printer, and
+			 * that active printer isn't us (recursive printk?).
+			 */
+			if (spin) {
+				/* We spin waiting for the owner to release us */
+				spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_);
+				/* Owner will clear console_waiter on hand off */
+				while (READ_ONCE(console_waiter))
+					cpu_relax();
+
+				spin_release(&console_owner_dep_map, 1, _THIS_IP_);
+				printk_safe_exit_irqrestore(flags);
+
+				/*
+				 * The owner passed the console lock to us.
+				 * Since we did not spin on console lock, annotate
+				 * this as a trylock. Otherwise lockdep will
+				 * complain.
+				 */
+				mutex_acquire(&console_lock_dep_map, 0, 1, _THIS_IP_);
+				console_unlock();
+				printk_safe_enter_irqsave(flags);
+			}
+			printk_safe_exit_irqrestore(flags);
+
+		}
 	}
 
 	return printed_len;
@@ -2141,6 +2196,7 @@ void console_unlock(void)
 	static u64 seen_seq;
 	unsigned long flags;
 	bool wake_klogd = false;
+	bool waiter = false;
 	bool do_cond_resched, retry;
 
 	if (console_suspended) {
@@ -2229,14 +2285,64 @@ skip:
 		console_seq++;
 		raw_spin_unlock(&logbuf_lock);
 
+		/*
+		 * While actively printing out messages, if another printk()
+		 * were to occur on another CPU, it may wait for this one to
+		 * finish. This task can not be preempted if there is a
+		 * waiter waiting to take over.
+		 */
+		raw_spin_lock(&console_owner_lock);
+		console_owner = current;
+		raw_spin_unlock(&console_owner_lock);
+
+		/* The waiter may spin on us after setting console_owner */
+		spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_);
+
 		stop_critical_timings();	/* don't trace print latency */
 		call_console_drivers(ext_text, ext_len, text, len);
 		start_critical_timings();
+
+		raw_spin_lock(&console_owner_lock);
+		waiter = READ_ONCE(console_waiter);
+		console_owner = NULL;
+		raw_spin_unlock(&console_owner_lock);
+
+		/*
+		 * If there is a waiter waiting for us, then pass the
+		 * rest of the work load over to that waiter.
+		 */
+		if (waiter)
+			break;
+
+		/* There was no waiter, and nothing will spin on us here */
+		spin_release(&console_owner_dep_map, 1, _THIS_IP_);
+
 		printk_safe_exit_irqrestore(flags);
 
 		if (do_cond_resched)
 			cond_resched();
 	}
+
+	/*
+	 * If there is an active waiter waiting on the console_lock.
+	 * Pass off the printing to the waiter, and the waiter
+	 * will continue printing on its CPU, and when all writing
+	 * has finished, the last printer will wake up klogd.
+	 */
+	if (waiter) {
+		WRITE_ONCE(console_waiter, false);
+		/* The waiter is now free to continue */
+		spin_release(&console_owner_dep_map, 1, _THIS_IP_);
+		/*
+		 * Hand off console_lock to waiter. The waiter will perform
+		 * the up(). After this, the waiter is the console_lock owner.
+		 */
+		mutex_release(&console_lock_dep_map, 1, _THIS_IP_);
+		printk_safe_exit_irqrestore(flags);
+		/* Note, if waiter is set, logbuf_lock is not held */
+		return;
+	}
+
 	console_locked = 0;
 
 	/* Release the exclusive_console once it is used */

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 47+ messages in thread

end of thread, other threads:[~2018-01-10 12:50 UTC | newest]

Thread overview: 47+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-08 15:27 [PATCH v4] printk: Add console owner and waiter logic to load balance console writes Steven Rostedt
2017-11-08 15:27 ` Steven Rostedt
2017-11-09 10:12 ` Michal Hocko
2017-11-09 10:12   ` Michal Hocko
2017-11-09 10:22   ` [PATCH v4] printk: Add console owner and waiter logic to loadbalance " Tetsuo Handa
2017-11-09 10:22     ` Tetsuo Handa
2017-11-09 10:26     ` Michal Hocko
2017-11-09 10:26       ` Michal Hocko
2017-11-09 11:03       ` Tetsuo Handa
2017-11-09 11:03         ` Tetsuo Handa
2017-11-09 11:31         ` Michal Hocko
2017-11-09 11:31           ` Michal Hocko
2017-11-09 12:07           ` Tetsuo Handa
2017-11-09 12:07             ` Tetsuo Handa
2017-11-24 15:54 ` [PATCH v4] printk: Add console owner and waiter logic to load balance " Petr Mladek
2017-11-24 15:54   ` Petr Mladek
2017-11-24 15:58   ` Petr Mladek
2017-11-24 15:58     ` Petr Mladek
2017-11-27  8:53     ` Byungchul Park
2017-11-27  8:53       ` Byungchul Park
2017-11-28  1:42     ` Byungchul Park
2017-11-28  1:42       ` Byungchul Park
2017-12-08 14:00       ` Petr Mladek
2017-12-08 14:00         ` Petr Mladek
2017-12-12  5:39         ` Sergey Senozhatsky
2017-12-12  5:39           ` Sergey Senozhatsky
2017-12-12 19:27           ` Steven Rostedt
2017-12-12 19:27             ` Steven Rostedt
2017-12-13  1:50             ` Sergey Senozhatsky
2017-12-13  1:50               ` Sergey Senozhatsky
2017-12-14 14:34             ` Petr Mladek
2017-12-14 14:34               ` Petr Mladek
2017-12-14 13:51     ` Petr Mladek
2017-12-14 13:51       ` Petr Mladek
2017-11-27  8:48 ` Byungchul Park
2017-11-27  8:48   ` Byungchul Park
2017-11-28  6:23   ` Sergey Senozhatsky
2017-11-28  6:23     ` Sergey Senozhatsky
2017-12-22 10:31 ` Petr Mladek
2017-12-22 10:31   ` Petr Mladek
2017-12-22 12:44   ` Steven Rostedt
2017-12-22 12:44     ` Steven Rostedt
2018-01-10 12:50     ` Petr Mladek
2018-01-10 12:50       ` Petr Mladek
  -- strict thread matches above, loose matches on Subject: below --
2017-11-08 15:13 Steven Rostedt
2017-11-08 15:03 Steven Rostedt
2017-11-08 15:10 ` Steven Rostedt

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.