Sorry for the delay, yesterday was rough.

On Thu, Aug 01, 2019 at 11:20:20AM +0200, Greg Kroah-Hartman wrote:
> On Thu, Aug 01, 2019 at 01:23:59AM +0300, Arseny Maslennikov wrote:
> > On Tue, Jul 30, 2019 at 06:19:40PM +0200, Greg Kroah-Hartman wrote:
> > > On Tue, Jun 25, 2019 at 07:11:53PM +0300, Arseny Maslennikov wrote:
> > > > If the three termios local flags isig, icanon, iexten are enabled
> > > > and the local flag nokerninfo is disabled for a tty governed
> > > > by the n_tty line discipline, then on receiving the keyboard status
> > > > character n_tty will generate a status message and write it out to
> > > > the tty before sending SIGINFO to the tty's foreground process group.
> > > > 
> > > > <...>
> > > 
> > > Why is this really all needed as we have the SysRq handlers that report
> > > all of this today?
> > 
> > Different use-cases have different needs; SysRq is targeted at a different
> > audience; see below.
> > 
> > > > The "most interesting" process is chosen as follows:
> > > >  - runnables over everything
> > > >  - uninterruptibles over everything else
> > > >  - among 2 runnables pick the biggest utime + stime
> > > >  - any unresolved ties are decided in favour of greatest PID.
> > > 
> > > This does not feel like something that the tty core code should be doing
> > > at all.
> > 
> > Yes, this selection part is quite clumsy. In defense of it, one could
> > argue that we already have the whole n_tty implemented in kernel-space.
> 
> One could argue that :)
> 
> > One way we could get rid of this is to display a summarized statistic
> > for the whole pgrp: pgid, oldest real time, cumulative utime and stime,
> > cumulative memory usage. Would this be more acceptable? Are there any
> > other ideas?
> 
> Given that I really think you are just making something up here that no
> one really is needing for their workflow, but would just be "cool to
> have", I say do whatever you think is right.
> 
> And there is nothing wrong with "cool to have" things, I'm not trying to
> dismiss this, it's just that all new features come with the "you must
> support this until the end of time" requirement that we must balance it
> with.
> 

To be fair, the kerninfo line itself is intended for the user's eyes; it
can't even be programmatically read by anyone in a Linux system except
pty masters, so I don't feel it has to be particularly stable, since
changing its contents won't break anyone.
So IMO we could still tweak something about it after we merge this.
Of course this cannot be said about the signal part of the patch series,
which interfaces with processes and is a proper part of the outward
kernel API, thus subject to the kernel's policy.

> > > > While the kerninfo line is not very useful for debugging the kernel
> > > > itself, since we have much more powerful debugging tools, it still gives
> > > > the user behind the terminal some meaningful feedback to a VSTATUS that
> > > > works even if no processes respond.
> > > 
> > > That's what SysRq is for.  If there's a specific set of values that we
> > > don't currently report in that facility, why not just add the
> > > information there?  It's much simpler and "safer" that way.
> > 
> > SysRq is intended for the person either administrating the system to be used in
> > emergency (e.g. f for the oom kill, the famous s-u-b combo also comes to
> > mind) or debugging the kernel, and it indeed does a much better job for
> > those purposes.  In both use-cases mentioned the person has access to
> > the system console, where the sysrq button handlers produce all their
> > output, if any, and to either a physical keyboard / serial console or to
> > /proc/sysrq-trigger, whose mode is 0200 (writable by uid 0 only).
> > 
> > The use-case for this is different: the ^T-line as proposed by this
> > patch is for the user that interacts with a system through a terminal, who
> > wants to be informed not about the whole system (sort of what SysRq-t
> > tells you), but about what they run on that particular tty.
> 
> Ok, fair enough, although if you just add a new sysrq option for "what
> is running on this tty", would that help resolve this?
> 

I see at least two problems here:
1) sysrq is not accessible by unprivileged users (and here goes the
   whole idea of sysrq being designed for different purposes, etc.)
2) Even if we fix it and somehow allow unpriv users to do the equivalent
   of `echo $letter > /proc/sysrq-trigger' for a particular letter,
   there still are UI issues.  (yes, UI issues are better dealt with
   outside the kernel, but that's not the point) It still has to dump
   info to the particular tty (let's remember, console is unaccessible
   and shared to the whole system), that is a novelty for sysrq. There
   still is the open question about how to make a VSTATUS press trigger
   that sysrq button in addition to signalling fg pgrp without shoving
   this bit into every other userspace program.

To sum up, this looks worse to me than integrating with n_tty.

In general, the main appeal of the feature is in it being activated by a
single keypress handled by the line discipline, from any tty.

> > This is much less about "why does my system/kernel seem to hang?" or
> > exposing low-level internals (registers, hrtimers, locks, ...), and more
> > about "is my SSH terminal session unresponsive?" and "I ran a command,
> > it doesn't finish, how's it doing?".
> > e.g. A user might want to know if their SSH connection is alive without
> > interrupting anything, while having no access both to SysRq and console,
> > and no one in fg pgrp actually handles SIGINFO.
> 
> If you have access to a tty, you should have access to sysrq, right?
> 

No.
SSH gives you access to a pty, which doesn't seem to handle BREAK; the
BREAK SysRq activation method does not work there.
As Pavel correctly noted, if the user is unprivileged, they can't use
/proc/sysrq-trigger as well.

> > SysRq is system-wide, whereas this is per-terminal and only cares about
> > one tty which the status char is pressed at and its foreground pgrp
> > (most likely it's the foreground shell job).
> > 
> > I hope this is clear enough.
> 
> It is, yes.  My big objection is the crazy code I point out above, as
> well as the "create a totally new interface when we might be able to use
> an existing one" that you need to convince me is really required :)
> 

The signal part (patches [1-4]/7) is not really new, since we've had
Unix-like signals for ages.
(You've probably meant the [5-7]/7, but there's no such thing as a too
detailed explanation).
The compelling reasons for a new signal are:
- it provides asynchronous process notification, as opposed to e. g. a
  POLLIN on a file descriptor, which is not noticed by the recipient
  until they poll(2);
- to make use of the mechanism, an existing app does not have to be
  rewritten in a certain way — a signal handler just gets called, if the
  signal is not blocked — so it's easy to use from applications.
- it aligns well with the current practice of the line discipline
  relaying user requests to processes via signals.

As for the line: before I started writing the patch I thought a lot
about how we could move the status line logic away from n_tty and, when
^T is pressed, have it notify someone in userspace, who would look up
all the pieces in /proc and write() the line to the tty. Obviously this
approach should be much more flexible and easier to maintain from the
kernel perspective.  One candidate for this could actually be the tty's
session leader (most often that's the interactive shell).

To be convenient, the feature needs to have at least the following
qualities:
- the status line has to come first, since it's the only invariant; this
  way it's easy to visually find if needed and just as easy to ignore
  otherwise;
- the reaction to VSTATUS has to be as responsive as it can reasonably
  be and as the system load allows. (btw, SysRq does well in this
  regard, but the drawbacks outweigh this, see above)
The only way we can secure the first point is to signal a process, wait
for it to get scheduled, wait a little more for it to compose and send
output, and only then to signal the pgrp and be done. This is
potentially slow and might lock up the tty's input receive queue for
sizable amounts of time. What if the session leader opts-in to do this
and then ignores this kind of request / never responds altogether? This
gets messy fast. So we can't guarantee the second point.
If we focus on responsiveness, we can't wait for processes at all, so we
ping them and be done with it. This means no output ordering guarantees,
it all gets interleaved and, consequently, less readable and less useful
overall.
Further tweaking the protocol gives the risk of making it too complicated.

So I gave up on the idea and went with this patch, which ticks both boxes
mentioned above. Of course, I'd love to have an ideal solution, and a
pony too. :)

(Or perhaps I'm thinking inside the box and missing a clue and someone
reading this does not. This is a public list, anyway)