linux-kernel.vger.kernel.org archive mirror
* [PATCH v1 0/3] reduce tty latency
@ 2019-01-10 10:12 Oleksij Rempel
  2019-01-10 10:12 ` [PATCH v1 1/3] drivers/tty: refactor functions for flushing/queuing work Oleksij Rempel
                   ` (3 more replies)
  0 siblings, 4 replies; 17+ messages in thread
From: Oleksij Rempel @ 2019-01-10 10:12 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Jiri Slaby
  Cc: Oleksij Rempel, Pengutronix Kernel Team, linux-kernel

This patch set reduces latency on the tty path.
For testing I used hackbench running on all cores of a 4-core system and a
high-priority application sending and receiving packets over a tty interface
with a loopback adapter.

Results without these patches:
        latency histogram:
            0 ... <     250 usec : 1933104 transmissions
          250 ... <     500 usec : 21339 transmissions
          500 ... <     750 usec : 8952 transmissions
          750 ... <    1000 usec : 6226 transmissions
         1000 ... <    1500 usec : 7688 transmissions
         1500 ... <    2000 usec : 5236 transmissions
         2000 ... <    5000 usec : 11724 transmissions
         5000 ... <   10000 usec : 3588 transmissions
        10000 ... <   50000 usec : 2123 transmissions
        50000 ... < 1000000 usec : 20 transmissions
                 >= 1000000 usec : 0 transmissions

Test results after these patches:
        min latency: 0 sec : 75 usec
        max latency: 0 sec : 125 usec
        average latency: 81 usec
        latency measure cycles overall: 79000000
        latency histogram:
            0 ... <     250 usec : 79000000 transmissions
          250 ... <     500 usec : 0 transmissions
          500 ... <     750 usec : 0 transmissions
          750 ... <    1000 usec : 0 transmissions
         1000 ... <    1500 usec : 0 transmissions
         1500 ... <    2000 usec : 0 transmissions
         2000 ... <    5000 usec : 0 transmissions
         5000 ... <   10000 usec : 0 transmissions
        10000 ... <   50000 usec : 0 transmissions
        50000 ... < 1000000 usec : 0 transmissions
                 >= 1000000 usec : 0 transmissions
        average no. of read calls to assemble the packet: 1 

Using the real-time patches makes no difference for this test case.
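
For reference, a minimal sketch of such a round-trip measurement loop
(illustrative only, not the actual test tool used for the numbers above;
the device path, packet size and RT priority are assumptions):

#include <fcntl.h>
#include <sched.h>
#include <termios.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
	struct sched_param sp = { .sched_priority = 50 };	/* assumed test prio */
	unsigned char pkt[32] = { 0 };				/* assumed packet size */
	struct termios tio;
	int fd, i;

	sched_setscheduler(0, SCHED_FIFO, &sp);		/* high-priority test task */

	fd = open("/dev/ttyUSB0", O_RDWR | O_NOCTTY);	/* assumed loopback port */
	if (fd < 0)
		return 1;
	tcgetattr(fd, &tio);
	cfmakeraw(&tio);				/* raw mode, blocking reads */
	tcsetattr(fd, TCSANOW, &tio);

	for (i = 0; i < 1000000; i++) {
		struct timespec t0, t1;
		ssize_t done = 0, n;
		long us;

		clock_gettime(CLOCK_MONOTONIC, &t0);
		write(fd, pkt, sizeof(pkt));
		while (done < (ssize_t)sizeof(pkt)) {	/* read the packet back */
			n = read(fd, pkt + done, sizeof(pkt) - done);
			if (n <= 0)
				break;
			done += n;
		}
		clock_gettime(CLOCK_MONOTONIC, &t1);
		us = (t1.tv_sec - t0.tv_sec) * 1000000L +
		     (t1.tv_nsec - t0.tv_nsec) / 1000;
		(void)us;	/* bucket 'us' into the latency histogram here */
	}
	close(fd);
	return 0;
}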

Oleksij Rempel (1):
  drivers/tty: increase priority for tty_buffer_worker

Steven Walter (2):
  drivers/tty: refactor functions for flushing/queuing work
  drivers/tty: convert tty_port to use kthread_worker

 drivers/tty/tty_buffer.c | 39 +++++++++++++++++++++++++++++++--------
 drivers/tty/tty_io.c     |  1 +
 include/linux/tty.h      |  8 +++++---
 3 files changed, 37 insertions(+), 11 deletions(-)

-- 
2.19.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v1 1/3] drivers/tty: refactor functions for flushing/queuing work
  2019-01-10 10:12 [PATCH v1 0/3] reduce tty latency Oleksij Rempel
@ 2019-01-10 10:12 ` Oleksij Rempel
  2019-03-11  8:16   ` Alexander Sverdlin
  2019-01-10 10:12 ` [PATCH v1 2/3] drivers/tty: convert tty_port to use kthread_worker Oleksij Rempel
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 17+ messages in thread
From: Oleksij Rempel @ 2019-01-10 10:12 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Jiri Slaby
  Cc: Steven Walter, Oleksij Rempel, Pengutronix Kernel Team, linux-kernel

From: Steven Walter <stevenrwalter@gmail.com>

Preparation for converting to kthread_worker

Signed-off-by: Steven Walter <stevenrwalter@gmail.com>
Tested-by: Oleksij Rempel <o.rempel@pengutronix.de>
---
 drivers/tty/tty_buffer.c | 12 +++++++++---
 include/linux/tty.h      |  1 +
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/tty/tty_buffer.c b/drivers/tty/tty_buffer.c
index 77070c2d1240..e0090e65d83a 100644
--- a/drivers/tty/tty_buffer.c
+++ b/drivers/tty/tty_buffer.c
@@ -72,7 +72,7 @@ void tty_buffer_unlock_exclusive(struct tty_port *port)
 	atomic_dec(&buf->priority);
 	mutex_unlock(&buf->lock);
 	if (restart)
-		queue_work(system_unbound_wq, &buf->work);
+		tty_buffer_queue_work(port);
 }
 EXPORT_SYMBOL_GPL(tty_buffer_unlock_exclusive);
 
@@ -410,7 +410,7 @@ void tty_schedule_flip(struct tty_port *port)
 	 * flush_to_ldisc() sees buffer data.
 	 */
 	smp_store_release(&buf->tail->commit, buf->tail->used);
-	queue_work(system_unbound_wq, &buf->work);
+	tty_buffer_queue_work(port);
 }
 EXPORT_SYMBOL(tty_schedule_flip);
 
@@ -557,6 +557,12 @@ void tty_flip_buffer_push(struct tty_port *port)
 }
 EXPORT_SYMBOL(tty_flip_buffer_push);
 
+bool tty_buffer_queue_work(struct tty_port *port)
+{
+	struct tty_bufhead *buf = &port->buf;
+	return schedule_work(&buf->work);
+}
+
 /**
  *	tty_buffer_init		-	prepare a tty buffer structure
  *	@tty: tty to initialise
@@ -605,7 +611,7 @@ void tty_buffer_set_lock_subclass(struct tty_port *port)
 
 bool tty_buffer_restart_work(struct tty_port *port)
 {
-	return queue_work(system_unbound_wq, &port->buf.work);
+	return tty_buffer_queue_work(port);
 }
 
 bool tty_buffer_cancel_work(struct tty_port *port)
diff --git a/include/linux/tty.h b/include/linux/tty.h
index bfa4e2ee94a9..226a9eff0766 100644
--- a/include/linux/tty.h
+++ b/include/linux/tty.h
@@ -507,6 +507,7 @@ extern void session_clear_tty(struct pid *session);
 extern void no_tty(void);
 extern void tty_buffer_free_all(struct tty_port *port);
 extern void tty_buffer_flush(struct tty_struct *tty, struct tty_ldisc *ld);
+extern bool tty_buffer_queue_work(struct tty_port *port);
 extern void tty_buffer_init(struct tty_port *port);
 extern void tty_buffer_set_lock_subclass(struct tty_port *port);
 extern bool tty_buffer_restart_work(struct tty_port *port);
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v1 2/3] drivers/tty: convert tty_port to use kthread_worker
  2019-01-10 10:12 [PATCH v1 0/3] reduce tty latency Oleksij Rempel
  2019-01-10 10:12 ` [PATCH v1 1/3] drivers/tty: refactor functions for flushing/queuing work Oleksij Rempel
@ 2019-01-10 10:12 ` Oleksij Rempel
  2019-03-11  8:23   ` Alexander Sverdlin
  2019-01-10 10:12 ` [PATCH v1 3/3] drivers/tty: increase priority for tty_buffer_worker Oleksij Rempel
  2019-01-10 13:51 ` [PATCH v1 0/3] reduce tty latency Greg Kroah-Hartman
  3 siblings, 1 reply; 17+ messages in thread
From: Oleksij Rempel @ 2019-01-10 10:12 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Jiri Slaby
  Cc: Steven Walter, Oleksij Rempel, Pengutronix Kernel Team, linux-kernel

From: Steven Walter <stevenrwalter@gmail.com>

Use kthread_worker instead of workqueues.  For now there is only a
single worker, but the intention is to bring back the "low_latency"
tty option, along with a second high-priority kthread worker.

Even without a second worker, this patch makes it possible to give a
higher priority to tty processing by modifying the priority of the
corresponding kthread.
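
As a rough illustration of where this could go (not part of this patch;
the per-port low_latency flag and the second worker below are assumptions,
shown only as a sketch), the per-port selection might look like:

static DEFINE_KTHREAD_WORKER(tty_buffer_worker);
static DEFINE_KTHREAD_WORKER(tty_buffer_worker_rt);	/* would run at RT priority */

bool tty_buffer_queue_work(struct tty_port *port)
{
	/* hypothetical: route "low_latency" ports to the high-priority worker */
	struct kthread_worker *worker = port->low_latency ?
			&tty_buffer_worker_rt : &tty_buffer_worker;

	return kthread_queue_work(worker, &port->buf.work);
}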

Signed-off-by: Steven Walter <stevenrwalter@gmail.com>
Tested-by: Oleksij Rempel <o.rempel@pengutronix.de>
---
 drivers/tty/tty_buffer.c | 21 ++++++++++++++-------
 drivers/tty/tty_io.c     |  1 +
 include/linux/tty.h      |  7 ++++---
 3 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/drivers/tty/tty_buffer.c b/drivers/tty/tty_buffer.c
index e0090e65d83a..18bd7f48a319 100644
--- a/drivers/tty/tty_buffer.c
+++ b/drivers/tty/tty_buffer.c
@@ -4,6 +4,7 @@
  */
 
 #include <linux/types.h>
+#include <linux/kthread.h>
 #include <linux/errno.h>
 #include <linux/tty.h>
 #include <linux/tty_driver.h>
@@ -497,7 +498,7 @@ receive_buf(struct tty_port *port, struct tty_buffer *head, int count)
  *		 'consumer'
  */
 
-static void flush_to_ldisc(struct work_struct *work)
+static void flush_to_ldisc(struct kthread_work *work)
 {
 	struct tty_port *port = container_of(work, struct tty_port, buf.work);
 	struct tty_bufhead *buf = &port->buf;
@@ -557,10 +558,16 @@ void tty_flip_buffer_push(struct tty_port *port)
 }
 EXPORT_SYMBOL(tty_flip_buffer_push);
 
+static DEFINE_KTHREAD_WORKER(tty_buffer_worker);
+
 bool tty_buffer_queue_work(struct tty_port *port)
 {
-	struct tty_bufhead *buf = &port->buf;
-	return schedule_work(&buf->work);
+	return kthread_queue_work(&tty_buffer_worker, &port->buf.work);
+}
+
+void tty_buffer_init_kthread(void)
+{
+	kthread_run(kthread_worker_fn, &tty_buffer_worker, "tty");
 }
 
 /**
@@ -582,7 +589,7 @@ void tty_buffer_init(struct tty_port *port)
 	init_llist_head(&buf->free);
 	atomic_set(&buf->mem_used, 0);
 	atomic_set(&buf->priority, 0);
-	INIT_WORK(&buf->work, flush_to_ldisc);
+	kthread_init_work(&buf->work, flush_to_ldisc);
 	buf->mem_limit = TTYB_DEFAULT_MEM_LIMIT;
 }
 
@@ -614,12 +621,12 @@ bool tty_buffer_restart_work(struct tty_port *port)
 	return tty_buffer_queue_work(port);
 }
 
-bool tty_buffer_cancel_work(struct tty_port *port)
+void tty_buffer_cancel_work(struct tty_port *port)
 {
-	return cancel_work_sync(&port->buf.work);
+	tty_buffer_flush_work(port);
 }
 
 void tty_buffer_flush_work(struct tty_port *port)
 {
-	flush_work(&port->buf.work);
+	kthread_flush_work(&port->buf.work);
 }
diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index bfe9ad85b362..5fd7cecbe4e7 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -3476,6 +3476,7 @@ void console_sysfs_notify(void)
  */
 int __init tty_init(void)
 {
+	tty_buffer_init_kthread();
 	cdev_init(&tty_cdev, &tty_fops);
 	if (cdev_add(&tty_cdev, MKDEV(TTYAUX_MAJOR, 0), 1) ||
 	    register_chrdev_region(MKDEV(TTYAUX_MAJOR, 0), 1, "/dev/tty") < 0)
diff --git a/include/linux/tty.h b/include/linux/tty.h
index 226a9eff0766..924c1093967e 100644
--- a/include/linux/tty.h
+++ b/include/linux/tty.h
@@ -3,9 +3,9 @@
 #define _LINUX_TTY_H
 
 #include <linux/fs.h>
+#include <linux/kthread.h>
 #include <linux/major.h>
 #include <linux/termios.h>
-#include <linux/workqueue.h>
 #include <linux/tty_driver.h>
 #include <linux/tty_ldisc.h>
 #include <linux/mutex.h>
@@ -84,7 +84,7 @@ static inline char *flag_buf_ptr(struct tty_buffer *b, int ofs)
 
 struct tty_bufhead {
 	struct tty_buffer *head;	/* Queue head */
-	struct work_struct work;
+	struct kthread_work work;
 	struct mutex	   lock;
 	atomic_t	   priority;
 	struct tty_buffer sentinel;
@@ -510,8 +510,9 @@ extern void tty_buffer_flush(struct tty_struct *tty, struct tty_ldisc *ld);
 extern bool tty_buffer_queue_work(struct tty_port *port);
 extern void tty_buffer_init(struct tty_port *port);
 extern void tty_buffer_set_lock_subclass(struct tty_port *port);
+extern void tty_buffer_init_kthread(void);
 extern bool tty_buffer_restart_work(struct tty_port *port);
-extern bool tty_buffer_cancel_work(struct tty_port *port);
+extern void tty_buffer_cancel_work(struct tty_port *port);
 extern void tty_buffer_flush_work(struct tty_port *port);
 extern speed_t tty_termios_baud_rate(struct ktermios *termios);
 extern speed_t tty_termios_input_baud_rate(struct ktermios *termios);
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v1 3/3] drivers/tty: increase priority for tty_buffer_worker
  2019-01-10 10:12 [PATCH v1 0/3] reduce tty latency Oleksij Rempel
  2019-01-10 10:12 ` [PATCH v1 1/3] drivers/tty: refactor functions for flushing/queuing work Oleksij Rempel
  2019-01-10 10:12 ` [PATCH v1 2/3] drivers/tty: convert tty_port to use kthread_worker Oleksij Rempel
@ 2019-01-10 10:12 ` Oleksij Rempel
  2019-01-10 12:54   ` Linus Torvalds
  2019-03-11  8:24   ` Alexander Sverdlin
  2019-01-10 13:51 ` [PATCH v1 0/3] reduce tty latency Greg Kroah-Hartman
  3 siblings, 2 replies; 17+ messages in thread
From: Oleksij Rempel @ 2019-01-10 10:12 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Jiri Slaby
  Cc: Oleksij Rempel, Pengutronix Kernel Team, linux-kernel

sched_priority = 1 is enough to dramatically reduce latency under heavy
system load produced by tasks with the default user-space priority.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
---
 drivers/tty/tty_buffer.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/tty/tty_buffer.c b/drivers/tty/tty_buffer.c
index 18bd7f48a319..7cf42f6570a0 100644
--- a/drivers/tty/tty_buffer.c
+++ b/drivers/tty/tty_buffer.c
@@ -13,11 +13,13 @@
 #include <linux/string.h>
 #include <linux/slab.h>
 #include <linux/sched.h>
+#include <linux/sched/rt.h>
 #include <linux/wait.h>
 #include <linux/bitops.h>
 #include <linux/delay.h>
 #include <linux/module.h>
 #include <linux/ratelimit.h>
+#include <uapi/linux/sched/types.h>
 
 
 #define MIN_TTYB_SIZE	256
@@ -567,7 +569,15 @@ bool tty_buffer_queue_work(struct tty_port *port)
 
 void tty_buffer_init_kthread(void)
 {
-	kthread_run(kthread_worker_fn, &tty_buffer_worker, "tty");
+	struct sched_param param = { .sched_priority = 1 };
+	struct task_struct *kworker_task;
+
+	kworker_task = kthread_run(kthread_worker_fn, &tty_buffer_worker, "tty");
+	if (IS_ERR(kworker_task)) {
+		pr_err("failed to create message pump task\n");
+		return;
+	}
+	sched_setscheduler(kworker_task, SCHED_FIFO, &param);
 }
 
 /**
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v1 3/3] drivers/tty: increase priority for tty_buffer_worker
  2019-01-10 10:12 ` [PATCH v1 3/3] drivers/tty: increase priority for tty_buffer_worker Oleksij Rempel
@ 2019-01-10 12:54   ` Linus Torvalds
  2019-01-10 15:19     ` Oleksij Rempel
  2019-03-11  8:24   ` Alexander Sverdlin
  1 sibling, 1 reply; 17+ messages in thread
From: Linus Torvalds @ 2019-01-10 12:54 UTC (permalink / raw)
  To: Oleksij Rempel
  Cc: Greg Kroah-Hartman, Jiri Slaby, Pengutronix Kernel Team, lkml

On Thu, Jan 10, 2019 at 2:12 AM Oleksij Rempel <o.rempel@pengutronix.de> wrote:
>
> sched_priority = 1 is enough to dramatically reduce latency under heavy
> system load produced by tasks with the default user-space priority.

.. and is this perhaps a way for a user to then make the system spend
inordinate amounts of time in the tty layer, and hurting other people?
I'm thinking threads using pty's etc as a way to make the system
unresponsive.

We have *never* had good results with random priority modifications.
People used to do this for the X server, and it helped in very
specific cases, and hurt enormously in others.

Why would anybody use a tty interface with a loopback adapter and
care about latency?

I can kind of see why you want to do this from a theoretical point,
but from a *practical* point of view it seems pointless. Why not use
more appropriate models like networking or pipes etc. IOW, I think you
should describe what you *really* are doing much more.

"hackbench with a loopback serial adapter" really doesn't sound like
something that should worry a lot of people.

My gut feel is that if somebody still cares deeply about serial line
latency, they should look at trying to see if they can do some of the
work directly without the bounce to the workqueue. We use workqueues
for a reason, but it's possible that some of it could be avoided at
least in special cases... And yours sounds like a special case.

                Linus

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v1 0/3] reduce tty latency
  2019-01-10 10:12 [PATCH v1 0/3] reduce tty latency Oleksij Rempel
                   ` (2 preceding siblings ...)
  2019-01-10 10:12 ` [PATCH v1 3/3] drivers/tty: increase priority for tty_buffer_worker Oleksij Rempel
@ 2019-01-10 13:51 ` Greg Kroah-Hartman
  3 siblings, 0 replies; 17+ messages in thread
From: Greg Kroah-Hartman @ 2019-01-10 13:51 UTC (permalink / raw)
  To: Oleksij Rempel; +Cc: Jiri Slaby, Pengutronix Kernel Team, linux-kernel

On Thu, Jan 10, 2019 at 11:12:29AM +0100, Oleksij Rempel wrote:
> This patch set reduces latency on the tty path.
> For testing I used hackbench running on all cores of a 4-core system and a
> high-priority application sending and receiving packets over a tty interface
> with a loopback adapter.

Odd, all of these ended up in my spam folder, you might want to check
your server email settings...

> Results without these patches:
>         latency histogram:
>             0 ... <     250 usec : 1933104 transmissions
>           250 ... <     500 usec : 21339 transmissions
>           500 ... <     750 usec : 8952 transmissions
>           750 ... <    1000 usec : 6226 transmissions
>          1000 ... <    1500 usec : 7688 transmissions
>          1500 ... <    2000 usec : 5236 transmissions
>          2000 ... <    5000 usec : 11724 transmissions
>          5000 ... <   10000 usec : 3588 transmissions
>         10000 ... <   50000 usec : 2123 transmissions
>         50000 ... < 1000000 usec : 20 transmissions
>                  >= 1000000 usec : 0 transmissions
> 
> Test results after these patches:
>         min latency: 0 sec : 75 usec
>         max latency: 0 sec : 125 usec
>         average latency: 81 usec
>         latency measure cycles overall: 79000000
>         latency histogram:
>             0 ... <     250 usec : 79000000 transmissions
>           250 ... <     500 usec : 0 transmissions
>           500 ... <     750 usec : 0 transmissions
>           750 ... <    1000 usec : 0 transmissions
>          1000 ... <    1500 usec : 0 transmissions
>          1500 ... <    2000 usec : 0 transmissions
>          2000 ... <    5000 usec : 0 transmissions
>          5000 ... <   10000 usec : 0 transmissions
>         10000 ... <   50000 usec : 0 transmissions
>         50000 ... < 1000000 usec : 0 transmissions
>                  >= 1000000 usec : 0 transmissions
>         average no. of read calls to assemble the packet: 1 

Like Linus said, who runs a real-world system that cares about this
latency measurement?

Yes, it might be fun for odd benchmarks to show the value of one RT
patchset/OS vs. another one, but this change can cause real issues in
real systems that do real, non-serial-loopback work.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v1 3/3] drivers/tty: increase priority for tty_buffer_worker
  2019-01-10 12:54   ` Linus Torvalds
@ 2019-01-10 15:19     ` Oleksij Rempel
  2019-01-10 16:30       ` Greg Kroah-Hartman
  2019-01-10 16:59       ` Linus Torvalds
  0 siblings, 2 replies; 17+ messages in thread
From: Oleksij Rempel @ 2019-01-10 15:19 UTC (permalink / raw)
  To: Linus Torvalds, Greg Kroah-Hartman
  Cc: Jiri Slaby, Pengutronix Kernel Team, lkml

On Thu, Jan 10, 2019 at 04:54:53AM -0800, Linus Torvalds wrote:
> On Thu, Jan 10, 2019 at 2:12 AM Oleksij Rempel <o.rempel@pengutronix.de> wrote:
> >
> > sched_priority = 1 is enough to dramatically reduce latency under heavy
> > system load produced by tasks with the default user-space priority.
> 
> .. and is this perhaps a way for a user to then make the system spend
> inordinate amounts of time in the tty layer, and hurting other people?
> I'm thinking threads using pty's etc as a way to make the system
> unresponsive.
> 
> We have *never* had good results with random priority modifications.
> People used to do this for the X server, and it helped in very
> specific cases, and hurt enormously in others.
> 
> Why would anybody use a tty interface with a loopback adapter and
> care about latency?
> 
> I can kind of see why you want to do this from a theoretical point,
> but from a *practical* point of view it seems pointless. Why not use
> more appropriate models like networking or pipes etc. IOW, I think you
> should describe what you *really* are doing much more.
> 
> "hackbench with a loopback serial adapter" really doesn't sound like
> something that should worry a lot of people.

Yes, you are right.

> My gut feel is that if somebody still cares deeply about serial line
> latency, they should look at trying to see if they can do some of the
> work directly without the bounce to the workqueue. We use workqueues
> for a reason, but it's possible that some of it could be avoided at
> least in special cases... And yours sounds like a special case.

It is for an industrial low-latency RS-422-based application. The loopback
test is just an easy way to test/reproduce it without additional hardware.

What is a good, mainlineable way to implement it?

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v1 3/3] drivers/tty: increase priority for tty_buffer_worker
  2019-01-10 15:19     ` Oleksij Rempel
@ 2019-01-10 16:30       ` Greg Kroah-Hartman
  2019-01-28  8:05         ` Oleksij Rempel
  2019-01-10 16:59       ` Linus Torvalds
  1 sibling, 1 reply; 17+ messages in thread
From: Greg Kroah-Hartman @ 2019-01-10 16:30 UTC (permalink / raw)
  To: Oleksij Rempel; +Cc: Linus Torvalds, Jiri Slaby, Pengutronix Kernel Team, lkml

On Thu, Jan 10, 2019 at 04:19:53PM +0100, Oleksij Rempel wrote:
> > My gut feel is that if somebody still cares deeply about serial line
> > latency, they should look at trying to see if they can do some of the
> > work directly without the bounce to the workqueue. We use workqueues
> > for a reason, but it's possible that some of it could be avoided at
> > least in special cases... And yours sounds like a special case.
> 
> It is for an industrial low-latency RS-422-based application. The loopback
> test is just an easy way to test/reproduce it without additional hardware.
> 
> What is a good, mainlineable way to implement it?

What is the real problem your systems are having?  Are they serial-port
limited?  Is latency a big issue?  Trying to tune for a fake workload
isn't the best way to solve anything :)

thanks,
greg k-h

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v1 3/3] drivers/tty: increase priority for tty_buffer_worker
  2019-01-10 15:19     ` Oleksij Rempel
  2019-01-10 16:30       ` Greg Kroah-Hartman
@ 2019-01-10 16:59       ` Linus Torvalds
  1 sibling, 0 replies; 17+ messages in thread
From: Linus Torvalds @ 2019-01-10 16:59 UTC (permalink / raw)
  To: Oleksij Rempel
  Cc: Greg Kroah-Hartman, Jiri Slaby, Pengutronix Kernel Team, lkml

On Thu, Jan 10, 2019 at 7:19 AM Oleksij Rempel <o.rempel@pengutronix.de> wrote:
>
> It is for an industrial low-latency RS-422-based application. The loopback
> test is just an easy way to test/reproduce it without additional hardware.
>
> What is a good, mainlineable way to implement it?

I can easily see that for that specific use it might be the right
thing and the easiest way to give better latency guarantees. I just
don't think it's generally applicable, because of the reasons stated
(ie "giving one thread much higher priority can have some detrimental
effects on other cases").

And obviously for people who do *not* want this, the extra kthreads
for just tty flushing are just ugly overhead. We've tried to no longer
have a lot of special threads. At least yours isn't percpu..

I wonder if  you could get at least similar behavior by:

 - allocating your own workqueue

      tty_wq = alloc_workqueue("tty", WQ_HIGHPRI, 0);

 - and then instead of the current

     queue_work(system_unbound_wq, &buf->work);

   you'd queue it onto that WQ_HIGHPRI workqueue.

NOTE! The advantage of the above is that it should be much easier to
make conditional. It could be a small boot-time option to say "create
your own tty high-priority workqueue", and it would be off by default
- and the tty buffer code would just use the default
"system_unbound_wq" normally, but then with the special option to use
that HIGHPRI workqueue.

See what I'm saying? It shouldn't be all that different from what you
do in your patch-series, but at least this way it's an easy
conditional, and people who don't have your kind of special latency
issues would never see it (and the code would be maintainable).
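
A minimal sketch of that shape (the opt-in switch, names and init hook
below are made up for illustration; this is a sketch, not a tested patch):

/* hypothetical opt-in, e.g. wired up to a boot parameter and tty_init() */
static bool tty_highpri_wq_enable;
static struct workqueue_struct *tty_highpri_wq;

static void tty_buffer_init_wq(void)
{
	if (tty_highpri_wq_enable)
		tty_highpri_wq = alloc_workqueue("tty", WQ_HIGHPRI | WQ_UNBOUND, 0);
}

bool tty_buffer_queue_work(struct tty_port *port)
{
	/* keep the existing work_struct; fall back to the default when off */
	struct workqueue_struct *wq = tty_highpri_wq ?: system_unbound_wq;

	return queue_work(wq, &port->buf.work);
}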

And do explain in the commit messages what your real load is (the
loopback case can still be interesting to show numbers in a controlled
environment, but it's not very compelling from a "this is the _reason_
for the change", if you see what I mean).

How does that sound? I think it would be much more likely to be
acceptable, and it doesn't sound like a big change.

But I may have missed something.

             Linus

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v1 3/3] drivers/tty: increase priority for tty_buffer_worker
  2019-01-10 16:30       ` Greg Kroah-Hartman
@ 2019-01-28  8:05         ` Oleksij Rempel
  2019-01-28  8:23           ` Greg Kroah-Hartman
  0 siblings, 1 reply; 17+ messages in thread
From: Oleksij Rempel @ 2019-01-28  8:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Linus Torvalds, Jiri Slaby, Pengutronix Kernel Team, lkml



On 10.01.19 17:30, Greg Kroah-Hartman wrote:
> On Thu, Jan 10, 2019 at 04:19:53PM +0100, Oleksij Rempel wrote:
>>> My gut feel is that if somebody still cares deeply about serial line
>>> latency, they should look at trying to see if they can do some of the
>>> work directly without the bounce to the workqueue. We use workqueues
>>> for a reason, but it's possible that some of it could be avoided at
>>> least in special cases... And yours sounds like a special case.
>>
>> It is for an industrial low-latency RS-422-based application. The loopback
>> test is just an easy way to test/reproduce it without additional hardware.
>>
>> What is a good, mainlineable way to implement it?
> 
> What is the real problem your systems are having?  Are they serial-port
> limited?  Is latency a big issue?  Trying to tune for a fake workload
> isn't the best way to solve anything :)

The system in question is a high-power laser cutter with live image-based inspection
and adjustment of the cutting process. In this setup the RS422 interface is used to
control parameters of the laser cutting unit in a tight control loop with the camera.
This loop needs to operate at 1000 Hz.

The xy-stage moves at a speed of approx. 60 m/min, i.e. within 1 ms it
moves about 1 mm. For a high-precision control process a jitter of ±500 us (±0.5 mm)
is unacceptable.

Kind regards,
Oleksij Rempel

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v1 3/3] drivers/tty: increase priority for tty_buffer_worker
  2019-01-28  8:05         ` Oleksij Rempel
@ 2019-01-28  8:23           ` Greg Kroah-Hartman
  2019-01-28  9:22             ` Oleksij Rempel
  0 siblings, 1 reply; 17+ messages in thread
From: Greg Kroah-Hartman @ 2019-01-28  8:23 UTC (permalink / raw)
  To: Oleksij Rempel; +Cc: Linus Torvalds, Jiri Slaby, Pengutronix Kernel Team, lkml

On Mon, Jan 28, 2019 at 09:05:30AM +0100, Oleksij Rempel wrote:
> 
> 
> On 10.01.19 17:30, Greg Kroah-Hartman wrote:
> > On Thu, Jan 10, 2019 at 04:19:53PM +0100, Oleksij Rempel wrote:
> > > > My gut feel is that if somebody still cares deeply about serial line
> > > > latency, they should look at trying to see if they can do some of the
> > > > work directly without the bounce to the workqueue. We use workqueues
> > > > for a reason, but it's possible that some of it could be avoided at
> > > > least in special cases... And yours sounds like a special case.
> > > 
> > > It is for an industrial low-latency RS-422-based application. The loopback
> > > test is just an easy way to test/reproduce it without additional hardware.
> > > 
> > > What is a good, mainlineable way to implement it?
> > 
> > What is the real problem your systems are having?  Are they serial-port
> > limited?  Is latency a big issue?  Trying to tune for a fake workload
> > isn't the best way to solve anything :)
> 
> The system in question is a high-power laser cutter with live image-based inspection
> and adjustment of the cutting process. In this setup the RS422 interface is used to
> control parameters of the laser cutting unit in a tight control loop with the camera.
> This loop needs to operate at 1000 Hz.
> 
> The xy-stage moves at a speed of approx. 60 m/min, i.e. within 1 ms it
> moves about 1 mm. For a high-precision control process a jitter of ±500 us (±0.5 mm)
> is unacceptable.

Are you using the rt kernel patch for this type of thing?  That should
bound your jitter at a much more deterministic level.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v1 3/3] drivers/tty: increase priority for tty_buffer_worker
  2019-01-28  8:23           ` Greg Kroah-Hartman
@ 2019-01-28  9:22             ` Oleksij Rempel
  2019-01-28 20:03               ` Linus Torvalds
  0 siblings, 1 reply; 17+ messages in thread
From: Oleksij Rempel @ 2019-01-28  9:22 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Linus Torvalds, Jiri Slaby, Pengutronix Kernel Team, lkml



On 28.01.19 09:23, Greg Kroah-Hartman wrote:
> On Mon, Jan 28, 2019 at 09:05:30AM +0100, Oleksij Rempel wrote:
>>
>>
>> On 10.01.19 17:30, Greg Kroah-Hartman wrote:
>>> On Thu, Jan 10, 2019 at 04:19:53PM +0100, Oleksij Rempel wrote:
>>>>> My gut feel is that if somebody still cares deeply about serial line
>>>>> latency, they should look at trying to see if they can do some of the
>>>>> work directly without the bounce to the workqueue. We use workqueues
>>>>> for a reason, but it's possible that some of it could be avoided at
>>>>> least in special cases... And yours sounds like a special case.
>>>>
>>>> It is for an industrial low-latency RS-422-based application. The loopback
>>>> test is just an easy way to test/reproduce it without additional hardware.
>>>>
>>>> What is a good, mainlineable way to implement it?
>>>
>>> What is the real problem your systems are having?  Are they serial-port
>>> limited?  Is latency a big issue?  Trying to tune for a fake workload
>>> isn't the best way to solve anything :)
>>
>> The system in question is a high-power laser cutter with live image-based inspection
>> and adjustment of the cutting process. In this setup the RS422 interface is used to
>> control parameters of the laser cutting unit in a tight control loop with the camera.
>> This loop needs to operate at 1000 Hz.
>>
>> The xy-stage moves at a speed of approx. 60 m/min, i.e. within 1 ms it
>> moves about 1 mm. For a high-precision control process a jitter of ±500 us (±0.5 mm)
>> is unacceptable.
> 
> Are you using the rt kernel patch for this type of thing?  That should
> bound your jitter at a much more deterministic level.

Yes, I tested it with different linux-rt versions, with mostly similar results:
         kernel 4.8.15-rt10+
         latency histogram:
             0 ... <     250 usec : 1933104 transmissions
           250 ... <     500 usec : 21339 transmissions
           500 ... <     750 usec : 8952 transmissions
           750 ... <    1000 usec : 6226 transmissions
          1000 ... <    1500 usec : 7688 transmissions
          1500 ... <    2000 usec : 5236 transmissions
          2000 ... <    5000 usec : 11724 transmissions
          5000 ... <   10000 usec : 3588 transmissions
         10000 ... <   50000 usec : 2123 transmissions
         50000 ... < 1000000 usec : 20 transmissions
                  >= 1000000 usec : 0 transmissions

         kernel 4.9.0-rt1+
         latency histogram:
             0 ... <     250 usec : 1950222 transmissions
           250 ... <     500 usec : 15041 transmissions
           500 ... <     750 usec : 5968 transmissions
           750 ... <    1000 usec : 4437 transmissions
          1000 ... <    1500 usec : 6022 transmissions
          1500 ... <    2000 usec : 4185 transmissions
          2000 ... <    5000 usec : 9864 transmissions
          5000 ... <   10000 usec : 2773 transmissions
         10000 ... <   50000 usec : 1462 transmissions
         50000 ... < 1000000 usec : 26 transmissions
                  >= 1000000 usec : 0 transmissions

         4.19.10-rt8
         latency histogram:
             0 ... <     250 usec : 1906861 transmissions
           250 ... <     500 usec : 35271 transmissions
           500 ... <     750 usec : 13103 transmissions
           750 ... <    1000 usec : 9084 transmissions
          1000 ... <    1500 usec : 9434 transmissions
          1500 ... <    2000 usec : 5644 transmissions
          2000 ... <    5000 usec : 12737 transmissions
          5000 ... <   10000 usec : 4511 transmissions
         10000 ... <   50000 usec : 3201 transmissions
         50000 ... < 1000000 usec : 154 transmissions
                  >= 1000000 usec : 0 transmissions


         Without extra CPU load, the result on kernel 4.19.10-rt8 is:
         latency histogram:
             0 ... <     250 usec : 1999992 transmissions
           250 ... <     500 usec : 8 transmissions
           500 ... <     750 usec : 0 transmissions
           750 ... <    1000 usec : 0 transmissions
          1000 ... <    1500 usec : 0 transmissions
          1500 ... <    2000 usec : 0 transmissions
          2000 ... <    5000 usec : 0 transmissions
          5000 ... <   10000 usec : 0 transmissions
         10000 ... <   50000 usec : 0 transmissions
         50000 ... < 1000000 usec : 0 transmissions
                  >= 1000000 usec : 0 transmissions
=============================================================


Test results with the same load, but with the kworker replaced by a kthread assigned an RT priority:

         min latency: 0 sec : 75 usec
         max latency: 0 sec : 125 usec
         average latency: 81 usec
         latency measure cycles overall: 79000000
         latency histogram:
             0 ... <     250 usec : 79000000 transmissions
           250 ... <     500 usec : 0 transmissions
           500 ... <     750 usec : 0 transmissions
           750 ... <    1000 usec : 0 transmissions
          1000 ... <    1500 usec : 0 transmissions
          1500 ... <    2000 usec : 0 transmissions
          2000 ... <    5000 usec : 0 transmissions
          5000 ... <   10000 usec : 0 transmissions
         10000 ... <   50000 usec : 0 transmissions
         50000 ... < 1000000 usec : 0 transmissions
                  >= 1000000 usec : 0 transmissions



Kind regards,
Oleksij Rempel

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v1 3/3] drivers/tty: increase priority for tty_buffer_worker
  2019-01-28  9:22             ` Oleksij Rempel
@ 2019-01-28 20:03               ` Linus Torvalds
  2019-01-28 20:13                 ` Linus Torvalds
  0 siblings, 1 reply; 17+ messages in thread
From: Linus Torvalds @ 2019-01-28 20:03 UTC (permalink / raw)
  To: Oleksij Rempel
  Cc: Greg Kroah-Hartman, Jiri Slaby, Pengutronix Kernel Team, lkml

On Mon, Jan 28, 2019 at 1:22 AM Oleksij Rempel <o.rempel@pengutronix.de> wrote:
>
> Yes, I tested it with different linux-rt version with mostly similar results:

Hmm. It strikes me that you use very carefully timed serial *writes*
to control the laser cutter, but the flip buffer handling is mostly a
latency issue on the *read* side, isn't it?

Are you sure you are testing the right thing? Because your loopback
test is testing the latency not of writes, but of writes _and_ reads.

I'm wondering if we could/should try to simply avoid the workqueue
entirely if we could do the work in process context.

That's harder to do for reads - because incoming characters happen in
interrupt context, but shouldn't be all that hard to do for writes.

In fact, I thought we already did writes without any tty buffer
flipping at all, and that your patch series shouldn't actually affect
any write latency, but I've happily not had to work much with the
tty/serial layer in years..

Do you actually have read latency issues? Or is this whole series
perhaps an artificial effect of the benchmark you use?

                Linus

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v1 3/3] drivers/tty: increase priority for tty_buffer_worker
  2019-01-28 20:03               ` Linus Torvalds
@ 2019-01-28 20:13                 ` Linus Torvalds
  0 siblings, 0 replies; 17+ messages in thread
From: Linus Torvalds @ 2019-01-28 20:13 UTC (permalink / raw)
  To: Oleksij Rempel
  Cc: Greg Kroah-Hartman, Jiri Slaby, Pengutronix Kernel Team, lkml

On Mon, Jan 28, 2019 at 12:03 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> That's harder to do for reads - because incoming characters happen in
> interrupt context, but shouldn't be all that hard to do for writes.

Side note: the reason I mention this part is that "harder" may not
mean "impossible".

In particular, I wonder if we could do the tty buffer flipping in the
reader context too. Currently, what happens is that when we receive
characters, we schedule things for flipping with the workqueues. *BUT*
we could also just wake up any pending readers directly, and maybe
have the *readers* do the flip if they wake up before the workqueue.

And that would allow you to do real-time serial work simply by marking
the process *you* care about as RT, and not worry so much about the
workqueue threads at all. The workqueue threads would be fallbacks for
when there isn't an active reader at all.

I dunno. A bit handwavy, I know, but it sounds like if you care about
the read latency, that would be a better model entirely (skipping the
technically unnecessary kernel workqueue entirely).

              Linus

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v1 1/3] drivers/tty: refactor functions for flushing/queuing work
  2019-01-10 10:12 ` [PATCH v1 1/3] drivers/tty: refactor functions for flushing/queuing work Oleksij Rempel
@ 2019-03-11  8:16   ` Alexander Sverdlin
  0 siblings, 0 replies; 17+ messages in thread
From: Alexander Sverdlin @ 2019-03-11  8:16 UTC (permalink / raw)
  To: Oleksij Rempel
  Cc: Greg Kroah-Hartman, Jiri Slaby, Steven Walter,
	Pengutronix Kernel Team, linux-kernel

Hello Oleksij,

On Thu, 10 Jan 2019 11:12:30 +0100
Oleksij Rempel <o.rempel@pengutronix.de> wrote:

> From: Steven Walter <stevenrwalter@gmail.com>
> 
> Preparation for converting to kthread_worker
> 
> Signed-off-by: Steven Walter <stevenrwalter@gmail.com>
> Tested-by: Oleksij Rempel <o.rempel@pengutronix.de>

Tested-by: Alexander Sverdlin <alexander.sverdlin@gmail.com>

> ---
>  drivers/tty/tty_buffer.c | 12 +++++++++---
>  include/linux/tty.h      |  1 +
>  2 files changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/tty/tty_buffer.c b/drivers/tty/tty_buffer.c
> index 77070c2d1240..e0090e65d83a 100644
> --- a/drivers/tty/tty_buffer.c
> +++ b/drivers/tty/tty_buffer.c
> @@ -72,7 +72,7 @@ void tty_buffer_unlock_exclusive(struct tty_port *port)
>  	atomic_dec(&buf->priority);
>  	mutex_unlock(&buf->lock);
>  	if (restart)
> -		queue_work(system_unbound_wq, &buf->work);
> +		tty_buffer_queue_work(port);
>  }
>  EXPORT_SYMBOL_GPL(tty_buffer_unlock_exclusive);
>  
> @@ -410,7 +410,7 @@ void tty_schedule_flip(struct tty_port *port)
>  	 * flush_to_ldisc() sees buffer data.
>  	 */
>  	smp_store_release(&buf->tail->commit, buf->tail->used);
> -	queue_work(system_unbound_wq, &buf->work);
> +	tty_buffer_queue_work(port);
>  }
>  EXPORT_SYMBOL(tty_schedule_flip);
>  
> @@ -557,6 +557,12 @@ void tty_flip_buffer_push(struct tty_port *port)
>  }
>  EXPORT_SYMBOL(tty_flip_buffer_push);
>  
> +bool tty_buffer_queue_work(struct tty_port *port)
> +{
> +	struct tty_bufhead *buf = &port->buf;
> +	return schedule_work(&buf->work);
> +}
> +
>  /**
>   *	tty_buffer_init		-	prepare a tty buffer structure
>   *	@tty: tty to initialise
> @@ -605,7 +611,7 @@ void tty_buffer_set_lock_subclass(struct tty_port *port)
>  
>  bool tty_buffer_restart_work(struct tty_port *port)
>  {
> -	return queue_work(system_unbound_wq, &port->buf.work);
> +	return tty_buffer_queue_work(port);
>  }
>  
>  bool tty_buffer_cancel_work(struct tty_port *port)
> diff --git a/include/linux/tty.h b/include/linux/tty.h
> index bfa4e2ee94a9..226a9eff0766 100644
> --- a/include/linux/tty.h
> +++ b/include/linux/tty.h
> @@ -507,6 +507,7 @@ extern void session_clear_tty(struct pid *session);
>  extern void no_tty(void);
>  extern void tty_buffer_free_all(struct tty_port *port);
>  extern void tty_buffer_flush(struct tty_struct *tty, struct tty_ldisc *ld);
> +extern bool tty_buffer_queue_work(struct tty_port *port);
>  extern void tty_buffer_init(struct tty_port *port);
>  extern void tty_buffer_set_lock_subclass(struct tty_port *port);
>  extern bool tty_buffer_restart_work(struct tty_port *port);


-- 
Alexander Sverdlin.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v1 2/3] drivers/tty: convert tty_port to use kthread_worker
  2019-01-10 10:12 ` [PATCH v1 2/3] drivers/tty: convert tty_port to use kthread_worker Oleksij Rempel
@ 2019-03-11  8:23   ` Alexander Sverdlin
  0 siblings, 0 replies; 17+ messages in thread
From: Alexander Sverdlin @ 2019-03-11  8:23 UTC (permalink / raw)
  To: Oleksij Rempel
  Cc: Greg Kroah-Hartman, Jiri Slaby, Steven Walter,
	Pengutronix Kernel Team, linux-kernel

Hello Oleksij,

On Thu, 10 Jan 2019 11:12:31 +0100
Oleksij Rempel <o.rempel@pengutronix.de> wrote:

> From: Steven Walter <stevenrwalter@gmail.com>
> 
> Use kthread_worker instead of workqueues.  For now there is only a
> single worker, but the intention is to bring back the "low_latency"
> tty option, along with a second high-priority kthread worker.
> 
> Even without a second worker, this patch makes it possible to give a
> higher priority to tty processing by modifying the priority of the
> corresponding kthread.
> 
> Signed-off-by: Steven Walter <stevenrwalter@gmail.com>
> Tested-by: Oleksij Rempel <o.rempel@pengutronix.de>

Tested-by: Alexander Sverdlin <alexander.sverdlin@gmail.com>

In general I agree with Linus that a thread with some random
priority is suboptimal, and we actually should move this
work into the context of the receiving thread, to properly
inherit its priority. But this would be quite an amount of rework.

In the meantime this patch series restores the "low_latency"
tty option. Thanks for your efforts!

> ---
>  drivers/tty/tty_buffer.c | 21 ++++++++++++++-------
>  drivers/tty/tty_io.c     |  1 +
>  include/linux/tty.h      |  7 ++++---
>  3 files changed, 19 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/tty/tty_buffer.c b/drivers/tty/tty_buffer.c
> index e0090e65d83a..18bd7f48a319 100644
> --- a/drivers/tty/tty_buffer.c
> +++ b/drivers/tty/tty_buffer.c
> @@ -4,6 +4,7 @@
>   */
>  
>  #include <linux/types.h>
> +#include <linux/kthread.h>
>  #include <linux/errno.h>
>  #include <linux/tty.h>
>  #include <linux/tty_driver.h>
> @@ -497,7 +498,7 @@ receive_buf(struct tty_port *port, struct tty_buffer *head, int count)
>   *		 'consumer'
>   */
>  
> -static void flush_to_ldisc(struct work_struct *work)
> +static void flush_to_ldisc(struct kthread_work *work)
>  {
>  	struct tty_port *port = container_of(work, struct tty_port, buf.work);
>  	struct tty_bufhead *buf = &port->buf;
> @@ -557,10 +558,16 @@ void tty_flip_buffer_push(struct tty_port *port)
>  }
>  EXPORT_SYMBOL(tty_flip_buffer_push);
>  
> +static DEFINE_KTHREAD_WORKER(tty_buffer_worker);
> +
>  bool tty_buffer_queue_work(struct tty_port *port)
>  {
> -	struct tty_bufhead *buf = &port->buf;
> -	return schedule_work(&buf->work);
> +	return kthread_queue_work(&tty_buffer_worker, &port->buf.work);
> +}
> +
> +void tty_buffer_init_kthread(void)
> +{
> +	kthread_run(kthread_worker_fn, &tty_buffer_worker, "tty");
>  }
>  
>  /**
> @@ -582,7 +589,7 @@ void tty_buffer_init(struct tty_port *port)
>  	init_llist_head(&buf->free);
>  	atomic_set(&buf->mem_used, 0);
>  	atomic_set(&buf->priority, 0);
> -	INIT_WORK(&buf->work, flush_to_ldisc);
> +	kthread_init_work(&buf->work, flush_to_ldisc);
>  	buf->mem_limit = TTYB_DEFAULT_MEM_LIMIT;
>  }
>  
> @@ -614,12 +621,12 @@ bool tty_buffer_restart_work(struct tty_port *port)
>  	return tty_buffer_queue_work(port);
>  }
>  
> -bool tty_buffer_cancel_work(struct tty_port *port)
> +void tty_buffer_cancel_work(struct tty_port *port)
>  {
> -	return cancel_work_sync(&port->buf.work);
> +	tty_buffer_flush_work(port);
>  }
>  
>  void tty_buffer_flush_work(struct tty_port *port)
>  {
> -	flush_work(&port->buf.work);
> +	kthread_flush_work(&port->buf.work);
>  }
> diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
> index bfe9ad85b362..5fd7cecbe4e7 100644
> --- a/drivers/tty/tty_io.c
> +++ b/drivers/tty/tty_io.c
> @@ -3476,6 +3476,7 @@ void console_sysfs_notify(void)
>   */
>  int __init tty_init(void)
>  {
> +	tty_buffer_init_kthread();
>  	cdev_init(&tty_cdev, &tty_fops);
>  	if (cdev_add(&tty_cdev, MKDEV(TTYAUX_MAJOR, 0), 1) ||
>  	    register_chrdev_region(MKDEV(TTYAUX_MAJOR, 0), 1, "/dev/tty") < 0)
> diff --git a/include/linux/tty.h b/include/linux/tty.h
> index 226a9eff0766..924c1093967e 100644
> --- a/include/linux/tty.h
> +++ b/include/linux/tty.h
> @@ -3,9 +3,9 @@
>  #define _LINUX_TTY_H
>  
>  #include <linux/fs.h>
> +#include <linux/kthread.h>
>  #include <linux/major.h>
>  #include <linux/termios.h>
> -#include <linux/workqueue.h>
>  #include <linux/tty_driver.h>
>  #include <linux/tty_ldisc.h>
>  #include <linux/mutex.h>
> @@ -84,7 +84,7 @@ static inline char *flag_buf_ptr(struct tty_buffer *b, int ofs)
>  
>  struct tty_bufhead {
>  	struct tty_buffer *head;	/* Queue head */
> -	struct work_struct work;
> +	struct kthread_work work;
>  	struct mutex	   lock;
>  	atomic_t	   priority;
>  	struct tty_buffer sentinel;
> @@ -510,8 +510,9 @@ extern void tty_buffer_flush(struct tty_struct *tty, struct tty_ldisc *ld);
>  extern bool tty_buffer_queue_work(struct tty_port *port);
>  extern void tty_buffer_init(struct tty_port *port);
>  extern void tty_buffer_set_lock_subclass(struct tty_port *port);
> +extern void tty_buffer_init_kthread(void);
>  extern bool tty_buffer_restart_work(struct tty_port *port);
> -extern bool tty_buffer_cancel_work(struct tty_port *port);
> +extern void tty_buffer_cancel_work(struct tty_port *port);
>  extern void tty_buffer_flush_work(struct tty_port *port);
>  extern speed_t tty_termios_baud_rate(struct ktermios *termios);
>  extern speed_t tty_termios_input_baud_rate(struct ktermios *termios);


-- 
Alexander Sverdlin.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v1 3/3] drivers/tty: increase priority for tty_buffer_worker
  2019-01-10 10:12 ` [PATCH v1 3/3] drivers/tty: increase priority for tty_buffer_worker Oleksij Rempel
  2019-01-10 12:54   ` Linus Torvalds
@ 2019-03-11  8:24   ` Alexander Sverdlin
  1 sibling, 0 replies; 17+ messages in thread
From: Alexander Sverdlin @ 2019-03-11  8:24 UTC (permalink / raw)
  To: Oleksij Rempel
  Cc: Greg Kroah-Hartman, Jiri Slaby, Pengutronix Kernel Team, linux-kernel

Hello Oleksij,

On Thu, 10 Jan 2019 11:12:32 +0100
Oleksij Rempel <o.rempel@pengutronix.de> wrote:

> sched_priority = 1 is enough to dramatically reduce latency under heavy
> system load produced by tasks with the default user-space priority.
> 
> Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>

Tested-by: Alexander Sverdlin <alexander.sverdlin@gmail.com>

> ---
>  drivers/tty/tty_buffer.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/tty/tty_buffer.c b/drivers/tty/tty_buffer.c
> index 18bd7f48a319..7cf42f6570a0 100644
> --- a/drivers/tty/tty_buffer.c
> +++ b/drivers/tty/tty_buffer.c
> @@ -13,11 +13,13 @@
>  #include <linux/string.h>
>  #include <linux/slab.h>
>  #include <linux/sched.h>
> +#include <linux/sched/rt.h>
>  #include <linux/wait.h>
>  #include <linux/bitops.h>
>  #include <linux/delay.h>
>  #include <linux/module.h>
>  #include <linux/ratelimit.h>
> +#include <uapi/linux/sched/types.h>
>  
>  
>  #define MIN_TTYB_SIZE	256
> @@ -567,7 +569,15 @@ bool tty_buffer_queue_work(struct tty_port *port)
>  
>  void tty_buffer_init_kthread(void)
>  {
> -	kthread_run(kthread_worker_fn, &tty_buffer_worker, "tty");
> +	struct sched_param param = { .sched_priority = 1 };
> +	struct task_struct *kworker_task;
> +
> +	kworker_task = kthread_run(kthread_worker_fn, &tty_buffer_worker, "tty");
> +	if (IS_ERR(kworker_task)) {
> +		pr_err("failed to create message pump task\n");
> +		return;
> +	}
> +	sched_setscheduler(kworker_task, SCHED_FIFO, &param);
>  }
>  
>  /**


-- 
Alexander Sverdlin.

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2019-03-11  8:24 UTC | newest]

Thread overview: 17+ messages
2019-01-10 10:12 [PATCH v1 0/3] reduce tty latency Oleksij Rempel
2019-01-10 10:12 ` [PATCH v1 1/3] drivers/tty: refactor functions for flushing/queuing work Oleksij Rempel
2019-03-11  8:16   ` Alexander Sverdlin
2019-01-10 10:12 ` [PATCH v1 2/3] drivers/tty: convert tty_port to use kthread_worker Oleksij Rempel
2019-03-11  8:23   ` Alexander Sverdlin
2019-01-10 10:12 ` [PATCH v1 3/3] drivers/tty: increase priority for tty_buffer_worker Oleksij Rempel
2019-01-10 12:54   ` Linus Torvalds
2019-01-10 15:19     ` Oleksij Rempel
2019-01-10 16:30       ` Greg Kroah-Hartman
2019-01-28  8:05         ` Oleksij Rempel
2019-01-28  8:23           ` Greg Kroah-Hartman
2019-01-28  9:22             ` Oleksij Rempel
2019-01-28 20:03               ` Linus Torvalds
2019-01-28 20:13                 ` Linus Torvalds
2019-01-10 16:59       ` Linus Torvalds
2019-03-11  8:24   ` Alexander Sverdlin
2019-01-10 13:51 ` [PATCH v1 0/3] reduce tty latency Greg Kroah-Hartman
