* [RFC PATCH 0/8] signals: Support more than 64 signals
@ 2022-01-03 18:19 ` Walt Drummond
  0 siblings, 0 replies; 57+ messages in thread
From: Walt Drummond @ 2022-01-03 18:19 UTC
  To: aacraid, viro, anna.schumaker, arnd, bsegall, bp, chuck.lever,
	bristot, dave.hansen, dwmw2, dietmar.eggemann, dinguyen, geert,
	gregkh, hpa, idryomov, mingo, yzaikin, ink, jejb, jmorris,
	bfields, jlayton, jirislaby, john.johansen, juri.lelli, keescook,
	mcgrof, martin.petersen, mattst88, mgorman, oleg, pbonzini,
	peterz, rth, richard, serge, rostedt, tglx, trond.myklebust,
	vincent.guittot, x86
  Cc: linux-kernel, Walt Drummond, ceph-devel, kvm, linux-alpha,
	linux-arch, linux-fsdevel, linux-m68k, linux-mtd, linux-nfs,
	linux-scsi, linux-security-module

This patch set expands the number of signals in Linux beyond the
current cap of 64.  It sets a new cap at the somewhat arbitrary limit
of 1024 signals, both because it’s what glibc and musl support and
because many architectures pad sigset_t or ucontext_t in the kernel to
this cap.  This limit is not fixed and can be further expanded within
reason.

Despite best efforts, there is some non-zero potential that this could
break user space; I'd appreciate any comments, review and/or pointers
to areas of concern.

Basically, these changes entail:

 - Make all system calls that accept sigset_t honor the existing
   sigsetsize parameter for values between 8 and 128 bytes, and return
   sigsetsize bytes to user space.

 - Add AT_SIGSET_SZ to the aux vector to signal to user space the
   maximum size sigset_t the kernel can accept.

 - Remove the sigmask() macro except in compatibility cases, and
   change sigaddset()/sigdelset()/etc. to accept a comma-separated
   list of signal numbers.

 - Change the _NSIG_WORDS calculation to round up when needed on
   generic and x86.

 - Place the complete sigmask in the real time signal frame (x86_64,
   x32 and ia32).

 - Various fixes where sigset_t size is assumed.

 - Add BSD SIGINFO (and VSTATUS) as a test.
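
To illustrate the _NSIG_WORDS item in the list above: a truncating
division drops the partial trailing word whenever the signal count is
not a multiple of the word size.  The following is only a minimal
user-space sketch, assuming 64-bit words; the helper names
(nsig_words_trunc, nsig_words_roundup, sigset_bytes) are hypothetical
and are not taken from the patches.

```c
#include <stddef.h>

/* Bits per sigset word; 64 is assumed here for illustration. */
#define BITS_PER_WORD 64u

/* Truncating division: loses a partial trailing word. */
static unsigned int nsig_words_trunc(unsigned int nsig)
{
	return nsig / BITS_PER_WORD;
}

/* Round-up division: always leaves room for every signal bit. */
static unsigned int nsig_words_roundup(unsigned int nsig)
{
	return (nsig + BITS_PER_WORD - 1) / BITS_PER_WORD;
}

/* Bytes needed to hold nsig signal bits in whole words. */
static size_t sigset_bytes(unsigned int nsig)
{
	return (size_t)nsig_words_roundup(nsig) * (BITS_PER_WORD / 8);
}
```

With 64 signals both forms give one word.  With 65 signals the
truncating form still gives one word while the round-up form gives
two.  1024 signals need 16 words, which is the 128-byte upper bound
on sigsetsize mentioned above.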

The changes that have the most risk of breaking user space are the
ones that put more than 8 bytes of sigset_t in the real time signal
stack frame (Patches 2 & 6), and I should note that an earlier and
incomplete version of patch 2 was NAK’ed by Al in
https://lore.kernel.org/lkml/20201119221132.1515696-1-walt@drummond.us/.

As far as I have been able to determine, this patchset, and
specifically changing the size of sigset_t, does not break user space.

The two uses of sigset_t that pose the most user space risk are 1) as
a member of ucontext_t passed as a parameter to the signal handler and
2) when user space performs manual inspection of the real-time signal
stack frame.

In case (1), user space has definitions of both sigset_t and ucontext_t
that are independent of, and may differ from, the kernel (eg, sigset_t
in uclibc-ng is 16 bytes, musl is 128 bytes, glibc is 128 bytes on all
architectures except Arc, etc.).  User space will interpret the data
on the signal stack through these definitions, and extensions to
sigset_t will be opaque.  Other non-C runtimes are similarly
independent from kernel sigset_t and ucontext_t and derive their
definition of sigset_t from libc either directly or indirectly, and do
not manually inspect the signal stack (specifically OpenJDK, Golang,
Python3, Rust and Perl).

The only instances I found of case (2), manually inspecting the signal
stack frame, are in stack unwinders/backtracers (GDB, GCC, libunwind)
and in GDB when recording program execution, and only on the i386,
x86_64, s390 and powerpc architectures.  GDB, GCC and libunwind all
behave consistently with and without this patchset.

GDB's execution recording is somewhat more complicated.  It uses
internally defined architecture specific constants to represent the
total size of the signal frame, and will save that entire frame for
later use.  I cannot confirm that the values for powerpc and s390 are
correct, but for this purpose it doesn't matter as these architectures
explicitly pad for an expanded uc_sigmask.  I can, however, confirm
that the values for i386 and x86_64 are not correct, and that GDB is
recording an incorrect amount of stack data.  This doesn’t appear to
be an issue; while I cannot build a test case on x86_64 due to a known
bug[1], a basic test on i386 shows that the stack is correctly being
recorded, and forward and reverse replay seem to work just fine
across signal handlers.

There are other cases to consider if the number of signals and
therefore the size of sigset_t changes:

Impact on struct rt_sigframe member elements

  The placement of ucontext_t in struct rt_sigframe has the potential
  to move following member elements in ways that could break user
  space if user space relied on the offsets of these elements.
  However, a review shows that any elements in rt_sigframe after
  ucontext_t.uc_sigmask are either (1) unused or only used by the
  kernel or (2) fall into the x86_64/i386 floating point state case
  above.

Kernel has new signals, user space does not

  Any new bits in ucontext.uc_sigmask placed on the signal stack are
  opaque to user space (except in cases where user space already has a
  larger sigset_t, as in glibc).

  There are no changes to the real-time signals system call semantics,
  as the kernel will honor the hard-coded sigsetsize value of 8 in
  libc and behave as it has before these changes.

  Signal numbers larger than 64 cannot be blocked or caught until user
  space is updated; however, their default action will work as
  expected.  This can cause one problem: a parent process that uses
  the signal number a child exited with as an index into an array
  without bounds checking can crash.  I’ve seen exactly one
  instance of this in tcsh, and it is, I think, a bug in tcsh.
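
  The array-indexing hazard described above can be avoided with a
  bounds check before the lookup.  The following is a minimal sketch
  only; the table contents and the names sig_names and
  term_signal_name are hypothetical and are not taken from tcsh or
  from this series.

```c
#include <stddef.h>
#include <sys/wait.h>

/* Hypothetical, deliberately sparse table of signal names. */
static const char *const sig_names[] = {
	[1] = "HUP", [2] = "INT", [9] = "KILL", [15] = "TERM",
};
#define N_SIG_NAMES (sizeof(sig_names) / sizeof(sig_names[0]))

/*
 * Given a status from wait(2)/waitpid(2), return a name for the signal
 * that terminated the child.  Returns NULL if the child was not killed
 * by a signal, and "unknown" if the number falls outside the table or
 * has no entry, instead of indexing past the end of the array.
 */
static const char *term_signal_name(int status)
{
	int sig;

	if (!WIFSIGNALED(status))
		return NULL;

	sig = WTERMSIG(status);
	if (sig < 0 || (size_t)sig >= N_SIG_NAMES || sig_names[sig] == NULL)
		return "unknown";

	return sig_names[sig];
}
```

  A parent written this way keeps working when a child dies from a
  signal number it has never heard of; it simply reports the signal as
  unknown.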

User space has new signals, kernel does not

  User space attempting to use a signal number not supported by the
  kernel in system calls (eg, sigaction()) or other libc functions (eg,
  sigaddset()) will result in EINVAL, as expected.

  User space needs to know how to set the sigsetsize parameter to the
  real-time signal system calls, and it can use getauxval(AT_SIGSET_SZ)
  to determine this.  If it returns zero, sigsetsize must be 8;
  otherwise the kernel will accept a sigsetsize between 8 and the
  returned value.
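
  The negotiation described above might look roughly like the sketch
  below in a libc or runtime.  This is an illustration under stated
  assumptions, not code from the series: AT_SIGSET_SZ is not in
  released headers, so the numeric value defined here is a placeholder
  only, and choose_sigsetsize and probe_sigsetsize are hypothetical
  names.

```c
#include <stddef.h>
#include <sys/auxv.h>

/*
 * AT_SIGSET_SZ is introduced by this series.  The value below is a
 * placeholder for illustration only; the real value comes from the
 * patched include/uapi/linux/auxvec.h.
 */
#ifndef AT_SIGSET_SZ
#define AT_SIGSET_SZ 0x5353 /* hypothetical, NOT the real value */
#endif

/* The only sigsetsize an unpatched kernel accepts. */
#define LEGACY_SIGSETSIZE 8

/*
 * Pick the sigsetsize to pass to the real-time signal system calls.
 * kernel_max is the result of getauxval(AT_SIGSET_SZ); libc_size is the
 * size of user space's own sigset_t.
 */
static size_t choose_sigsetsize(unsigned long kernel_max, size_t libc_size)
{
	if (kernel_max == 0)	/* entry absent: old kernel */
		return LEGACY_SIGSETSIZE;
	if (libc_size < LEGACY_SIGSETSIZE)
		return LEGACY_SIGSETSIZE;
	return libc_size < kernel_max ? libc_size : (size_t)kernel_max;
}

/* Query the aux vector and choose a size for this process. */
static size_t probe_sigsetsize(size_t libc_size)
{
	return choose_sigsetsize(getauxval(AT_SIGSET_SZ), libc_size);
}
```

  getauxval() returns zero for an entry the kernel did not supply, so
  the same code falls back to 8 on kernels without this series.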

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=23188

Walt Drummond (8):
  signals: Make the real-time signal system calls accept different sized
    sigset_t from user space.
  signals: Put the full signal mask on the signal stack for x86_64, X32
    and ia32 compatibility mode
  signals: Use a helper function to test if a signal is a real-time
    signal.
  signals: Remove sigmask() macro
  signals: Better support cases where _NSIG_WORDS is greater than 2
  signals: Round up _NSIG_WORDS
  signals: Add signal debugging
  signals: Support BSD VSTATUS, KERNINFO and SIGINFO

 arch/alpha/kernel/signal.c          |   4 +-
 arch/m68k/include/asm/signal.h      |   6 +-
 arch/nios2/kernel/signal.c          |   2 -
 arch/x86/ia32/ia32_signal.c         |   5 +-
 arch/x86/include/asm/sighandling.h  |  34 +++
 arch/x86/include/asm/signal.h       |  10 +-
 arch/x86/include/uapi/asm/signal.h  |   4 +-
 arch/x86/kernel/signal.c            |  11 +-
 drivers/scsi/dpti.h                 |   2 -
 drivers/tty/Makefile                |   2 +-
 drivers/tty/n_tty.c                 |  21 ++
 drivers/tty/tty_io.c                |  10 +-
 drivers/tty/tty_ioctl.c             |   4 +
 drivers/tty/tty_status.c            | 135 ++++++++++
 fs/binfmt_elf.c                     |   1 +
 fs/binfmt_elf_fdpic.c               |   1 +
 fs/ceph/addr.c                      |   2 +-
 fs/jffs2/background.c               |   2 +-
 fs/lockd/svc.c                      |   1 -
 fs/proc/array.c                     |  32 +--
 fs/proc/base.c                      |  48 ++++
 fs/signalfd.c                       |  26 +-
 include/asm-generic/termios.h       |   4 +-
 include/linux/compat.h              |  98 ++++++-
 include/linux/sched.h               |  52 +++-
 include/linux/signal.h              | 389 ++++++++++++++++++++--------
 include/linux/tty.h                 |   8 +
 include/uapi/asm-generic/ioctls.h   |   2 +
 include/uapi/asm-generic/signal.h   |   8 +-
 include/uapi/asm-generic/termbits.h |  34 +--
 include/uapi/linux/auxvec.h         |   1 +
 kernel/compat.c                     |  30 +--
 kernel/fork.c                       |   2 +-
 kernel/ptrace.c                     |  18 +-
 kernel/signal.c                     | 288 ++++++++++----------
 kernel/sysctl.c                     |  41 +++
 kernel/time/posix-timers.c          |   3 +-
 lib/Kconfig.debug                   |  10 +
 security/apparmor/ipc.c             |   4 +-
 virt/kvm/kvm_main.c                 |  18 +-
 40 files changed, 974 insertions(+), 399 deletions(-)
 create mode 100644 drivers/tty/tty_status.c

-- 
2.30.2





* [RFC PATCH 1/8] signals: Make the real-time signal system calls accept different sized sigset_t from user space.
  2022-01-03 18:19 ` Walt Drummond
@ 2022-01-03 18:19 ` Walt Drummond
  -1 siblings, 0 replies; 57+ messages in thread
From: Walt Drummond @ 2022-01-03 18:19 UTC
  To: Alexander Viro, Oleg Nesterov, Paolo Bonzini
  Cc: linux-kernel, Walt Drummond, linux-fsdevel, kvm

The real-time signals API provides a mechanism for user space to tell
the kernel how many bytes it has in sigset_t. Make these system calls
use that mechanism and accept differently sized sigset_t.

Add a value to the auxvec to inform user space of the maximum size
sigset_t the kernel can accept.
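
As a rough user-space model of the copy-out behaviour that
copy_compat_sigset_to_user() implements in the diff below (validate
the size, copy the smaller of the two sizes, zero the remainder), under
stated assumptions: the 8..128 byte bounds are taken from the cover
letter, the definition of valid_sigsetsize() is not shown in this
excerpt, and the model_* names are hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Bounds assumed from the cover letter (8 to 128 bytes). */
#define MODEL_SIGSETSIZE_MIN 8u
#define MODEL_SIGSETSIZE_MAX 128u

/* Hypothetical stand-in for valid_sigsetsize(). */
static int model_valid_sigsetsize(size_t size)
{
	return size >= MODEL_SIGSETSIZE_MIN && size <= MODEL_SIGSETSIZE_MAX;
}

/*
 * Copy a kernel-side mask of kernel_size bytes into a caller buffer of
 * user_size bytes.  Only min(user_size, kernel_size) bytes are copied;
 * any remaining bytes of the caller's buffer are zeroed so that a
 * larger user-space sigset_t never sees stale data.
 */
static int model_copy_sigset_out(void *user_buf, size_t user_size,
				 const void *kernel_set, size_t kernel_size)
{
	size_t copybytes;

	if (!model_valid_sigsetsize(user_size))
		return -EINVAL;

	copybytes = user_size < kernel_size ? user_size : kernel_size;
	memcpy(user_buf, kernel_set, copybytes);

	if (user_size > copybytes)
		memset((char *)user_buf + copybytes, 0, user_size - copybytes);

	return 0;
}
```

The real helpers use copy_to_user() and clear_user() and return
-EFAULT on a faulting address, which this model does not attempt to
reproduce.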

Signed-off-by: Walt Drummond <walt@drummond.us>
---
 fs/binfmt_elf.c             |   1 +
 fs/binfmt_elf_fdpic.c       |   1 +
 fs/signalfd.c               |  24 +++---
 include/linux/compat.h      |  98 +++++++++++++++++++++---
 include/linux/signal.h      |  62 +++++++++++++++
 include/uapi/linux/auxvec.h |   1 +
 kernel/compat.c             |  24 ------
 kernel/ptrace.c             |  16 ++--
 kernel/signal.c             | 147 +++++++++++++++++++-----------------
 virt/kvm/kvm_main.c         |  16 ++--
 10 files changed, 257 insertions(+), 133 deletions(-)

diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index a813b70f594e..7133515fd386 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -274,6 +274,7 @@ create_elf_tables(struct linux_binprm *bprm, const struct elfhdr *exec,
 #ifdef ELF_HWCAP2
 	NEW_AUX_ENT(AT_HWCAP2, ELF_HWCAP2);
 #endif
+	NEW_AUX_ENT(AT_SIGSET_SZ, SIGSETSIZE_MAX);
 	NEW_AUX_ENT(AT_EXECFN, bprm->exec);
 	if (k_platform) {
 		NEW_AUX_ENT(AT_PLATFORM,
diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c
index 6d8fd6030cbb..09249dc4364b 100644
--- a/fs/binfmt_elf_fdpic.c
+++ b/fs/binfmt_elf_fdpic.c
@@ -659,6 +659,7 @@ static int create_elf_fdpic_tables(struct linux_binprm *bprm,
 	NEW_AUX_ENT(AT_EGID,	(elf_addr_t) from_kgid_munged(cred->user_ns, cred->egid));
 	NEW_AUX_ENT(AT_SECURE,	bprm->secureexec);
 	NEW_AUX_ENT(AT_EXECFN,	bprm->exec);
+	NEW_AUX_ENT(AT_SIGSET_SZ, SIGSETSIZE_MAX);
 
 #ifdef ARCH_DLINFO
 	nr = 0;
diff --git a/fs/signalfd.c b/fs/signalfd.c
index 040e1cf90528..12fdc282e299 100644
--- a/fs/signalfd.c
+++ b/fs/signalfd.c
@@ -311,24 +311,24 @@ static int do_signalfd4(int ufd, sigset_t *mask, int flags)
 SYSCALL_DEFINE4(signalfd4, int, ufd, sigset_t __user *, user_mask,
 		size_t, sizemask, int, flags)
 {
+	int ret;
 	sigset_t mask;
 
-	if (sizemask != sizeof(sigset_t))
-		return -EINVAL;
-	if (copy_from_user(&mask, user_mask, sizeof(mask)))
-		return -EFAULT;
+	ret = copy_sigset_from_user(&mask, user_mask, sizemask);
+	if (ret)
+		return ret;
 	return do_signalfd4(ufd, &mask, flags);
 }
 
 SYSCALL_DEFINE3(signalfd, int, ufd, sigset_t __user *, user_mask,
 		size_t, sizemask)
 {
+	int ret;
 	sigset_t mask;
 
-	if (sizemask != sizeof(sigset_t))
-		return -EINVAL;
-	if (copy_from_user(&mask, user_mask, sizeof(mask)))
-		return -EFAULT;
+	ret = copy_sigset_from_user(&mask, user_mask, sizemask);
+	if (ret)
+		return ret;
 	return do_signalfd4(ufd, &mask, 0);
 }
 
@@ -338,11 +338,11 @@ static long do_compat_signalfd4(int ufd,
 			compat_size_t sigsetsize, int flags)
 {
 	sigset_t mask;
+	int ret;
 
-	if (sigsetsize != sizeof(compat_sigset_t))
-		return -EINVAL;
-	if (get_compat_sigset(&mask, user_mask))
-		return -EFAULT;
+	ret = copy_compat_sigset_from_user(&mask, user_mask, sigsetsize);
+	if (ret)
+		return ret;
 	return do_signalfd4(ufd, &mask, flags);
 }
 
diff --git a/include/linux/compat.h b/include/linux/compat.h
index 1c758b0e0359..ecdbff1d2218 100644
--- a/include/linux/compat.h
+++ b/include/linux/compat.h
@@ -407,33 +407,109 @@ int __copy_siginfo_to_user32(struct compat_siginfo __user *to,
 int get_compat_sigevent(struct sigevent *event,
 		const struct compat_sigevent __user *u_event);
 
-extern int get_compat_sigset(sigset_t *set, const compat_sigset_t __user *compat);
-
 /*
  * Defined inline such that size can be compile time constant, which avoids
  * CONFIG_HARDENED_USERCOPY complaining about copies from task_struct
  */
 static inline int
-put_compat_sigset(compat_sigset_t __user *compat, const sigset_t *set,
-		  unsigned int size)
+copy_compat_sigset_to_user(compat_sigset_t __user *compat, const sigset_t *set,
+			   size_t sigsetsize)
 {
-	/* size <= sizeof(compat_sigset_t) <= sizeof(sigset_t) */
+	size_t copybytes;
 #if defined(__BIG_ENDIAN) && defined(CONFIG_64BIT)
 	compat_sigset_t v;
+	int i;
+#endif
+
+	if (!valid_sigsetsize(sigsetsize))
+		return -EINVAL;
+
+	copybytes = min(sizeof(compat_sigset_t), sigsetsize);
+
+#if defined(__BIG_ENDIAN) && defined(CONFIG_64BIT)
+	switch (_NSIG_WORDS) {
+	default:
+		for (i = 0; i < _NSIG_WORDS; i++) {
+			v.sig[(i * 2)]     = set->sig[i];
+			v.sig[(i * 2) + 1] = set->sig[i] >> 32;
+		}
+		break;
+	case 4:
+		v.sig[7] = (set->sig[3] >> 32);
+		v.sig[6] =  set->sig[3];
+		fallthrough;
+	case 3:
+		v.sig[5] = (set->sig[2] >> 32);
+		v.sig[4] =  set->sig[2];
+		fallthrough;
+	case 2:
+		v.sig[3] = (set->sig[1] >> 32);
+		v.sig[2] =  set->sig[1];
+		fallthrough;
+	case 1:
+		v.sig[1] = (set->sig[0] >> 32);
+		v.sig[0] =  set->sig[0];
+	}
+	if (copy_to_user(compat, &v, copybytes))
+		return -EFAULT;
+#else
+	if (copy_to_user(compat, set, copybytes))
+		return -EFAULT;
+#endif
+	/* Zero any unused part of mask */
+	if (sigsetsize > sizeof(compat_sigset_t)) {
+		if (clear_user((char __user *)compat + copybytes,
+			       sigsetsize - sizeof(compat_sigset_t)))
+			return -EFAULT;
+	}
+
+	return 0;
+}
+#define put_compat_sigset(compat, set, size)		\
+	copy_compat_sigset_to_user((compat), (set), (size))
+
+static inline int
+copy_compat_sigset_from_user(sigset_t *set,
+			     const compat_sigset_t __user *compat, size_t size)
+{
+#if defined(__BIG_ENDIAN) && defined(CONFIG_64BIT)
+	compat_sigset_t v;
+	int i;
+#endif
+
+	if (!valid_sigsetsize(size))
+		return -EINVAL;
+
+#if defined(__BIG_ENDIAN) && defined(CONFIG_64BIT)
+	if (copy_from_user(&v, compat, min(sizeof(compat_sigset_t), size)))
+		return -EFAULT;
 	switch (_NSIG_WORDS) {
-	case 4: v.sig[7] = (set->sig[3] >> 32); v.sig[6] = set->sig[3];
+	default:
+		for (i = 0; i < _NSIG_WORDS; i++) {
+			set->sig[i] =    v.sig[(i * 2)] |
+				(((long) v.sig[(i * 2) + 1]) << 32);
+		}
+		break;
+	case 4:
+		set->sig[3] = v.sig[6] | (((long)v.sig[7]) << 32);
 		fallthrough;
-	case 3: v.sig[5] = (set->sig[2] >> 32); v.sig[4] = set->sig[2];
+	case 3:
+		set->sig[2] = v.sig[4] | (((long)v.sig[5]) << 32);
 		fallthrough;
-	case 2: v.sig[3] = (set->sig[1] >> 32); v.sig[2] = set->sig[1];
+	case 2:
+		set->sig[1] = v.sig[2] | (((long)v.sig[3]) << 32);
 		fallthrough;
-	case 1: v.sig[1] = (set->sig[0] >> 32); v.sig[0] = set->sig[0];
+	case 1:
+		set->sig[0] = v.sig[0] | (((long)v.sig[1]) << 32);
 	}
-	return copy_to_user(compat, &v, size) ? -EFAULT : 0;
 #else
-	return copy_to_user(compat, set, size) ? -EFAULT : 0;
+	if (copy_from_user(set, compat, min(sizeof(compat_sigset_t), size)))
+		return -EFAULT;
 #endif
+	return 0;
 }
+#define get_compat_sigset(set, compat)					\
+	copy_compat_sigset_from_user((set), (compat), sizeof(compat_sigset_t))
 
 #ifdef CONFIG_CPU_BIG_ENDIAN
 #define unsafe_put_compat_sigset(compat, set, label) do {		\
diff --git a/include/linux/signal.h b/include/linux/signal.h
index 3f96a6374e4f..c66d4f520228 100644
--- a/include/linux/signal.h
+++ b/include/linux/signal.h
@@ -5,6 +5,7 @@
 #include <linux/bug.h>
 #include <linux/signal_types.h>
 #include <linux/string.h>
+#include <linux/uaccess.h>
 
 struct task_struct;
 
@@ -260,6 +261,67 @@ static inline void siginitsetinv(sigset_t *set, unsigned long mask)
 
 #endif /* __HAVE_ARCH_SIG_SETOPS */
 
+/* Safely copy a sigset_t from user space handling any differences in
+ * size between user space and kernel sigset_t.  We don't use
+ * copy_struct_from_user() here as we can't ensure that in the case
+ * where sigsetsize > sizeof(sigset_t), the unused bytes are zeroed.
+ *
+ * SIGSETSIZE_MIN *must* be 8 bytes and cannot change.
+ *
+ * SIGSETSIZE_MAX shouldn't be too small, nor should it be too large.
+ * We've somewhat randomly picked 128 bytes to keep this sync'ed with
+ * glibc and musl; this can be changed as needed.
+ */
+
+#define SIGSETSIZE_MIN 8
+#define SIGSETSIZE_MAX 128
+
+static inline int valid_sigsetsize(size_t sigsetsize)
+{
+	return  sigsetsize >= SIGSETSIZE_MIN &&
+		sigsetsize <= SIGSETSIZE_MAX;
+}
+
+static inline int copy_sigset_from_user(sigset_t *kmask,
+					const sigset_t __user *umask,
+					size_t sigsetsize)
+{
+	if (!valid_sigsetsize(sigsetsize))
+		return -EINVAL;
+
+	if (kmask == NULL)
+		return -EFAULT;
+
+	sigemptyset(kmask);
+
+	if (copy_from_user(kmask, umask, min(sizeof(sigset_t), sigsetsize)))
+		return -EFAULT;
+
+	return 0;
+}
+
+static inline int copy_sigset_to_user(sigset_t __user *umask,
+				      sigset_t *kmask,
+				      size_t sigsetsize)
+{
+	size_t copybytes;
+
+	if (!valid_sigsetsize(sigsetsize))
+		return -EINVAL;
+
+	copybytes = min(sizeof(sigset_t), sigsetsize);
+	if (copy_to_user(umask, kmask, copybytes))
+		return -EFAULT;
+
+	/* Zero unused parts of umask */
+	if (sigsetsize > copybytes) {
+		if (clear_user((char __user *)umask + copybytes,
+			       sigsetsize - copybytes))
+			return -EFAULT;
+	}
+	return 0;
+}
+
 static inline void init_sigpending(struct sigpending *sig)
 {
 	sigemptyset(&sig->signal);
diff --git a/include/uapi/linux/auxvec.h b/include/uapi/linux/auxvec.h
index c7e502bf5a6f..752184abf620 100644
--- a/include/uapi/linux/auxvec.h
+++ b/include/uapi/linux/auxvec.h
@@ -30,6 +30,7 @@
 				 * differ from AT_PLATFORM. */
 #define AT_RANDOM 25	/* address of 16 random bytes */
 #define AT_HWCAP2 26	/* extension of AT_HWCAP */
+#define AT_SIGSET_SZ 27	/* sizeof(sigset_t) */
 
 #define AT_EXECFN  31	/* filename of program */
 
diff --git a/kernel/compat.c b/kernel/compat.c
index 55551989d9da..cc2438f4070c 100644
--- a/kernel/compat.c
+++ b/kernel/compat.c
@@ -245,27 +245,3 @@ long compat_put_bitmap(compat_ulong_t __user *umask, unsigned long *mask,
 	user_write_access_end();
 	return -EFAULT;
 }
-
-int
-get_compat_sigset(sigset_t *set, const compat_sigset_t __user *compat)
-{
-#ifdef __BIG_ENDIAN
-	compat_sigset_t v;
-	if (copy_from_user(&v, compat, sizeof(compat_sigset_t)))
-		return -EFAULT;
-	switch (_NSIG_WORDS) {
-	case 4: set->sig[3] = v.sig[6] | (((long)v.sig[7]) << 32 );
-		fallthrough;
-	case 3: set->sig[2] = v.sig[4] | (((long)v.sig[5]) << 32 );
-		fallthrough;
-	case 2: set->sig[1] = v.sig[2] | (((long)v.sig[3]) << 32 );
-		fallthrough;
-	case 1: set->sig[0] = v.sig[0] | (((long)v.sig[1]) << 32 );
-	}
-#else
-	if (copy_from_user(set, compat, sizeof(compat_sigset_t)))
-		return -EFAULT;
-#endif
-	return 0;
-}
-EXPORT_SYMBOL_GPL(get_compat_sigset);
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index f8589bf8d7dc..2f7ee345a629 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -1074,8 +1074,9 @@ int ptrace_request(struct task_struct *child, long request,
 
 	case PTRACE_GETSIGMASK: {
 		sigset_t *mask;
+		size_t sigsetsize = (size_t) addr;
 
-		if (addr != sizeof(sigset_t)) {
+		if (!valid_sigsetsize(sigsetsize)) {
 			ret = -EINVAL;
 			break;
 		}
@@ -1085,7 +1086,7 @@ int ptrace_request(struct task_struct *child, long request,
 		else
 			mask = &child->blocked;
 
-		if (copy_to_user(datavp, mask, sizeof(sigset_t)))
+		if (copy_sigset_to_user(datavp, mask, sigsetsize))
 			ret = -EFAULT;
 		else
 			ret = 0;
@@ -1095,16 +1096,11 @@ int ptrace_request(struct task_struct *child, long request,
 
 	case PTRACE_SETSIGMASK: {
 		sigset_t new_set;
+		size_t sigsetsize = (size_t) addr;
 
-		if (addr != sizeof(sigset_t)) {
-			ret = -EINVAL;
-			break;
-		}
-
-		if (copy_from_user(&new_set, datavp, sizeof(sigset_t))) {
-			ret = -EFAULT;
+		ret = copy_sigset_from_user(&new_set, datavp, sigsetsize);
+		if (ret)
 			break;
-		}
 
 		sigdelsetmask(&new_set, sigmask(SIGKILL)|sigmask(SIGSTOP));
 
diff --git a/kernel/signal.c b/kernel/signal.c
index 487bf4f5dadf..94b1828ae973 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -3091,13 +3091,14 @@ EXPORT_SYMBOL(sigprocmask);
 int set_user_sigmask(const sigset_t __user *umask, size_t sigsetsize)
 {
 	sigset_t kmask;
+	int ret;
 
 	if (!umask)
 		return 0;
-	if (sigsetsize != sizeof(sigset_t))
-		return -EINVAL;
-	if (copy_from_user(&kmask, umask, sizeof(sigset_t)))
-		return -EFAULT;
+
+	ret = copy_sigset_from_user(&kmask, umask, sigsetsize);
+	if (ret)
+		return ret;
 
 	set_restore_sigmask();
 	current->saved_sigmask = current->blocked;
@@ -3111,13 +3112,14 @@ int set_compat_user_sigmask(const compat_sigset_t __user *umask,
 			    size_t sigsetsize)
 {
 	sigset_t kmask;
+	int ret;
 
 	if (!umask)
 		return 0;
-	if (sigsetsize != sizeof(compat_sigset_t))
-		return -EINVAL;
-	if (get_compat_sigset(&kmask, umask))
-		return -EFAULT;
+
+	ret = copy_compat_sigset_from_user(&kmask, umask, sigsetsize);
+	if (ret)
+		return ret;
 
 	set_restore_sigmask();
 	current->saved_sigmask = current->blocked;
@@ -3140,14 +3142,13 @@ SYSCALL_DEFINE4(rt_sigprocmask, int, how, sigset_t __user *, nset,
 	sigset_t old_set, new_set;
 	int error;
 
-	/* XXX: Don't preclude handling different sized sigset_t's.  */
-	if (sigsetsize != sizeof(sigset_t))
+	if (!valid_sigsetsize(sigsetsize))
 		return -EINVAL;
 
 	old_set = current->blocked;
 
 	if (nset) {
-		if (copy_from_user(&new_set, nset, sizeof(sigset_t)))
+		if (copy_sigset_from_user(&new_set, nset, sigsetsize))
 			return -EFAULT;
 		sigdelsetmask(&new_set, sigmask(SIGKILL)|sigmask(SIGSTOP));
 
@@ -3157,7 +3158,7 @@ SYSCALL_DEFINE4(rt_sigprocmask, int, how, sigset_t __user *, nset,
 	}
 
 	if (oset) {
-		if (copy_to_user(oset, &old_set, sizeof(sigset_t)))
+		if (copy_sigset_to_user(oset, &old_set, sigsetsize))
 			return -EFAULT;
 	}
 
@@ -3168,16 +3169,16 @@ SYSCALL_DEFINE4(rt_sigprocmask, int, how, sigset_t __user *, nset,
 COMPAT_SYSCALL_DEFINE4(rt_sigprocmask, int, how, compat_sigset_t __user *, nset,
 		compat_sigset_t __user *, oset, compat_size_t, sigsetsize)
 {
-	sigset_t old_set = current->blocked;
+	sigset_t old_set, new_set;
+	int error;
 
-	/* XXX: Don't preclude handling different sized sigset_t's.  */
-	if (sigsetsize != sizeof(sigset_t))
+	if (!valid_sigsetsize(sigsetsize))
 		return -EINVAL;
 
+	old_set = current->blocked;
+
 	if (nset) {
-		sigset_t new_set;
-		int error;
-		if (get_compat_sigset(&new_set, nset))
+		if (copy_compat_sigset_from_user(&new_set, nset, sigsetsize))
 			return -EFAULT;
 		sigdelsetmask(&new_set, sigmask(SIGKILL)|sigmask(SIGSTOP));
 
@@ -3185,7 +3186,12 @@ COMPAT_SYSCALL_DEFINE4(rt_sigprocmask, int, how, compat_sigset_t __user *, nset,
 		if (error)
 			return error;
 	}
-	return oset ? put_compat_sigset(oset, &old_set, sizeof(*oset)) : 0;
+	if (oset) {
+		if (copy_compat_sigset_to_user(oset, &old_set, sigsetsize))
+			return -EFAULT;
+	}
+
+	return 0;
 }
 #endif
 
@@ -3210,12 +3216,12 @@ SYSCALL_DEFINE2(rt_sigpending, sigset_t __user *, uset, size_t, sigsetsize)
 {
 	sigset_t set;
 
-	if (sigsetsize > sizeof(*uset))
+	if (!valid_sigsetsize(sigsetsize))
 		return -EINVAL;
 
 	do_sigpending(&set);
 
-	if (copy_to_user(uset, &set, sigsetsize))
+	if (copy_sigset_to_user(uset, &set, sigsetsize))
 		return -EFAULT;
 
 	return 0;
@@ -3227,12 +3233,15 @@ COMPAT_SYSCALL_DEFINE2(rt_sigpending, compat_sigset_t __user *, uset,
 {
 	sigset_t set;
 
-	if (sigsetsize > sizeof(*uset))
+	if (!valid_sigsetsize(sigsetsize))
 		return -EINVAL;
 
 	do_sigpending(&set);
 
-	return put_compat_sigset(uset, &set, sigsetsize);
+	if (copy_compat_sigset_to_user(uset, &set, sigsetsize))
+		return -EFAULT;
+
+	return 0;
 }
 #endif
 
@@ -3627,12 +3636,9 @@ SYSCALL_DEFINE4(rt_sigtimedwait, const sigset_t __user *, uthese,
 	kernel_siginfo_t info;
 	int ret;
 
-	/* XXX: Don't preclude handling different sized sigset_t's.  */
-	if (sigsetsize != sizeof(sigset_t))
-		return -EINVAL;
-
-	if (copy_from_user(&these, uthese, sizeof(these)))
-		return -EFAULT;
+	ret = copy_sigset_from_user(&these, uthese, sigsetsize);
+	if (ret)
+		return ret;
 
 	if (uts) {
 		if (get_timespec64(&ts, uts))
@@ -3660,11 +3666,9 @@ SYSCALL_DEFINE4(rt_sigtimedwait_time32, const sigset_t __user *, uthese,
 	kernel_siginfo_t info;
 	int ret;
 
-	if (sigsetsize != sizeof(sigset_t))
-		return -EINVAL;
-
-	if (copy_from_user(&these, uthese, sizeof(these)))
-		return -EFAULT;
+	ret = copy_sigset_from_user(&these, uthese, sigsetsize);
+	if (ret)
+		return ret;
 
 	if (uts) {
 		if (get_old_timespec32(&ts, uts))
@@ -3692,11 +3696,9 @@ COMPAT_SYSCALL_DEFINE4(rt_sigtimedwait_time64, compat_sigset_t __user *, uthese,
 	kernel_siginfo_t info;
 	long ret;
 
-	if (sigsetsize != sizeof(sigset_t))
-		return -EINVAL;
-
-	if (get_compat_sigset(&s, uthese))
-		return -EFAULT;
+	ret = copy_compat_sigset_from_user(&s, uthese, sigsetsize);
+	if (ret)
+		return ret;
 
 	if (uts) {
 		if (get_timespec64(&t, uts))
@@ -3723,11 +3725,9 @@ COMPAT_SYSCALL_DEFINE4(rt_sigtimedwait_time32, compat_sigset_t __user *, uthese,
 	kernel_siginfo_t info;
 	long ret;
 
-	if (sigsetsize != sizeof(sigset_t))
-		return -EINVAL;
-
-	if (get_compat_sigset(&s, uthese))
-		return -EFAULT;
+	ret = copy_compat_sigset_from_user(&s, uthese, sigsetsize);
+	if (ret)
+		return ret;
 
 	if (uts) {
 		if (get_old_timespec32(&t, uts))
@@ -4370,21 +4370,36 @@ SYSCALL_DEFINE4(rt_sigaction, int, sig,
 		size_t, sigsetsize)
 {
 	struct k_sigaction new_sa, old_sa;
+	size_t sa_len = sizeof(struct sigaction) - sizeof(sigset_t);
 	int ret;
 
-	/* XXX: Don't preclude handling different sized sigset_t's.  */
-	if (sigsetsize != sizeof(sigset_t))
-		return -EINVAL;
+	/* struct sigaction contains a sigset_t; handle cases where
+	 * user and kernel sizes of sigset_t differ.
+	 */
 
-	if (act && copy_from_user(&new_sa.sa, act, sizeof(new_sa.sa)))
-		return -EFAULT;
+	memset(&new_sa.sa, 0, sizeof(struct sigaction));
+
+	if (act) {
+		if (copy_from_user(&new_sa.sa, act, sa_len))
+			return -EFAULT;
+		ret = copy_sigset_from_user(&new_sa.sa.sa_mask, &act->sa_mask,
+					    sigsetsize);
+		if (ret)
+			return ret;
+	}
 
 	ret = do_sigaction(sig, act ? &new_sa : NULL, oact ? &old_sa : NULL);
 	if (ret)
 		return ret;
 
-	if (oact && copy_to_user(oact, &old_sa.sa, sizeof(old_sa.sa)))
-		return -EFAULT;
+	if (oact) {
+		if (copy_to_user(oact, &old_sa.sa, sa_len))
+			return -EFAULT;
+		ret = copy_sigset_to_user(&oact->sa_mask, &old_sa.sa.sa_mask,
+					  sigsetsize);
+		if (ret)
+			return ret;
+	}
 
 	return 0;
 }
@@ -4400,8 +4415,7 @@ COMPAT_SYSCALL_DEFINE4(rt_sigaction, int, sig,
 #endif
 	int ret;
 
-	/* XXX: Don't preclude handling different sized sigset_t's.  */
-	if (sigsetsize != sizeof(compat_sigset_t))
+	if (!valid_sigsetsize(sigsetsize))
 		return -EINVAL;
 
 	if (act) {
@@ -4412,7 +4426,8 @@ COMPAT_SYSCALL_DEFINE4(rt_sigaction, int, sig,
 		ret |= get_user(restorer, &act->sa_restorer);
 		new_ka.sa.sa_restorer = compat_ptr(restorer);
 #endif
-		ret |= get_compat_sigset(&new_ka.sa.sa_mask, &act->sa_mask);
+		ret |= copy_compat_sigset_from_user(&new_ka.sa.sa_mask,
+						    &act->sa_mask, sigsetsize);
 		ret |= get_user(new_ka.sa.sa_flags, &act->sa_flags);
 		if (ret)
 			return -EFAULT;
@@ -4422,8 +4437,8 @@ COMPAT_SYSCALL_DEFINE4(rt_sigaction, int, sig,
 	if (!ret && oact) {
 		ret = put_user(ptr_to_compat(old_ka.sa.sa_handler), 
 			       &oact->sa_handler);
-		ret |= put_compat_sigset(&oact->sa_mask, &old_ka.sa.sa_mask,
-					 sizeof(oact->sa_mask));
+		ret |= copy_compat_sigset_to_user(&oact->sa_mask,
+						  &old_ka.sa.sa_mask, sigsetsize);
 		ret |= put_user(old_ka.sa.sa_flags, &oact->sa_flags);
 #ifdef __ARCH_HAS_SA_RESTORER
 		ret |= put_user(ptr_to_compat(old_ka.sa.sa_restorer),
@@ -4590,13 +4605,11 @@ static int sigsuspend(sigset_t *set)
 SYSCALL_DEFINE2(rt_sigsuspend, sigset_t __user *, unewset, size_t, sigsetsize)
 {
 	sigset_t newset;
+	int ret;
 
-	/* XXX: Don't preclude handling different sized sigset_t's.  */
-	if (sigsetsize != sizeof(sigset_t))
-		return -EINVAL;
-
-	if (copy_from_user(&newset, unewset, sizeof(newset)))
-		return -EFAULT;
+	ret = copy_sigset_from_user(&newset, unewset, sigsetsize);
+	if (ret)
+		return ret;
 	return sigsuspend(&newset);
 }
  
@@ -4604,13 +4617,11 @@ SYSCALL_DEFINE2(rt_sigsuspend, sigset_t __user *, unewset, size_t, sigsetsize)
 COMPAT_SYSCALL_DEFINE2(rt_sigsuspend, compat_sigset_t __user *, unewset, compat_size_t, sigsetsize)
 {
 	sigset_t newset;
+	int ret;
 
-	/* XXX: Don't preclude handling different sized sigset_t's.  */
-	if (sigsetsize != sizeof(sigset_t))
-		return -EINVAL;
-
-	if (get_compat_sigset(&newset, unewset))
-		return -EFAULT;
+	ret = copy_compat_sigset_from_user(&newset, unewset, sigsetsize);
+	if (ret)
+		return ret;
 	return sigsuspend(&newset);
 }
 #endif
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 7851f3a1b5f7..c8b3645c9a7d 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3891,8 +3891,10 @@ static long kvm_vcpu_ioctl(struct file *filp,
 			if (copy_from_user(&kvm_sigmask, argp,
 					   sizeof(kvm_sigmask)))
 				goto out;
-			r = -EINVAL;
-			if (kvm_sigmask.len != sizeof(sigset))
+			r = copy_sigset_from_user(&sigset,
+				       (sigset_t __user *) &sigmask_arg->sigset,
+				       kvm_sigmask.len);
+			if (r)
 				goto out;
 			r = -EFAULT;
 			if (copy_from_user(&sigset, sigmask_arg->sigset,
@@ -3963,12 +3965,10 @@ static long kvm_vcpu_compat_ioctl(struct file *filp,
 			if (copy_from_user(&kvm_sigmask, argp,
 					   sizeof(kvm_sigmask)))
 				goto out;
-			r = -EINVAL;
-			if (kvm_sigmask.len != sizeof(compat_sigset_t))
-				goto out;
-			r = -EFAULT;
-			if (get_compat_sigset(&sigset,
-					      (compat_sigset_t __user *)sigmask_arg->sigset))
+			r = copy_compat_sigset_from_user(&sigset,
+				(compat_sigset_t __user *) &sigmask_arg->sigset,
+				kvm_sigmask.len);
+			if (r)
 				goto out;
 			r = kvm_vcpu_ioctl_set_sigmask(vcpu, &sigset);
 		} else
-- 
2.30.2



* [RFC PATCH 2/8] signals: Put the full signal mask on the signal stack for x86_64, X32 and ia32 compatibility mode
  2022-01-03 18:19 ` Walt Drummond
                   ` (2 preceding siblings ...)
  (?)
@ 2022-01-03 18:19 ` Walt Drummond
  -1 siblings, 0 replies; 57+ messages in thread
From: Walt Drummond @ 2022-01-03 18:19 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin
  Cc: linux-kernel, Walt Drummond

Put the complete sigset_t in the real-time signal stack frame for
x86_64, x32 and ia32 compatibility mode on x86.

Signed-off-by: Walt Drummond <walt@drummond.us>
---
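
Note for reviewers: a user-space model of why storing a single 64-bit
word in the frame stops being enough once sigset_t spans more than one
word.  This is a sketch under assumptions, not the kernel code: the
model_* names and the two-word mask are made up for illustration, and
plain assignment/memcpy() stand in for unsafe_put_user() and the copy
loop in the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: a hypothetical mask of 2 x 64 bits (128 signals). */
#define MODEL_NSIG_WORDS 2
#define MODEL_NSIG_BPW   64

struct model_sigset {
	uint64_t sig[MODEL_NSIG_WORDS];
};

static void model_block(struct model_sigset *set, int sig)
{
	unsigned int bit = (unsigned int)sig - 1;

	assert(sig >= 1 && sig <= MODEL_NSIG_WORDS * MODEL_NSIG_BPW);
	set->sig[bit / MODEL_NSIG_BPW] |= UINT64_C(1) << (bit % MODEL_NSIG_BPW);
}

static int model_is_blocked(const struct model_sigset *set, int sig)
{
	unsigned int bit = (unsigned int)sig - 1;

	assert(sig >= 1 && sig <= MODEL_NSIG_WORDS * MODEL_NSIG_BPW);
	return (set->sig[bit / MODEL_NSIG_BPW] >> (bit % MODEL_NSIG_BPW)) & 1;
}

/* Old behaviour: only the first 64-bit word reaches the frame. */
static void model_put_first_word(struct model_sigset *frame_mask,
				 const struct model_sigset *set)
{
	frame_mask->sig[0] = set->sig[0];
}

/* New behaviour: the whole mask is copied into the frame. */
static void model_put_full_mask(struct model_sigset *frame_mask,
				const struct model_sigset *set)
{
	memcpy(frame_mask, set, sizeof(*set));
}
```

With signal 70 blocked, a frame filled by model_put_first_word() loses
that bit, so a sigreturn restoring from it would silently unblock the
signal; model_put_full_mask() preserves it.
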
 arch/x86/ia32/ia32_signal.c        |  5 +++--
 arch/x86/include/asm/sighandling.h | 34 ++++++++++++++++++++++++++++++
 arch/x86/kernel/signal.c           | 11 +++-------
 3 files changed, 40 insertions(+), 10 deletions(-)

diff --git a/arch/x86/ia32/ia32_signal.c b/arch/x86/ia32/ia32_signal.c
index 5e3d9b7fd5fb..03a0ecd8c7f3 100644
--- a/arch/x86/ia32/ia32_signal.c
+++ b/arch/x86/ia32/ia32_signal.c
@@ -130,7 +130,8 @@ COMPAT_SYSCALL_DEFINE0(rt_sigreturn)
 
 	if (!access_ok(frame, sizeof(*frame)))
 		goto badframe;
-	if (__get_user(set.sig[0], (__u64 __user *)&frame->uc.uc_sigmask))
+	if (copy_from_user(&set, &frame->uc.uc_sigmask,
+			   sizeof(frame->uc.uc_sigmask)))
 		goto badframe;
 
 	set_current_blocked(&set);
@@ -347,7 +348,7 @@ int ia32_setup_rt_frame(int sig, struct ksignal *ksig,
 	 */
 	unsafe_put_user(*((u64 *)&code), (u64 __user *)frame->retcode, Efault);
 	unsafe_put_sigcontext32(&frame->uc.uc_mcontext, fp, regs, set, Efault);
-	unsafe_put_user(*(__u64 *)set, (__u64 __user *)&frame->uc.uc_sigmask, Efault);
+	unsafe_put_compat_sigmask(set, frame, Efault);
 	user_access_end();
 
 	if (__copy_siginfo_to_user32(&frame->info, &ksig->info))
diff --git a/arch/x86/include/asm/sighandling.h b/arch/x86/include/asm/sighandling.h
index 65e667279e0f..e247bea06a17 100644
--- a/arch/x86/include/asm/sighandling.h
+++ b/arch/x86/include/asm/sighandling.h
@@ -15,4 +15,38 @@
 
 void signal_fault(struct pt_regs *regs, void __user *frame, char *where);
 
+static inline int
+__unsafe_put_sigmask(char *set, char __user *fp, size_t size)
+{
+	char *src;
+	char __user *dst;
+	size_t len;
+
+	len = size;
+	src = set;
+	dst = fp;
+	unsafe_copy_loop(dst, src, len, unsigned long, Efault);
+
+	return 0;
+Efault:
+	return -EFAULT;
+}
+
+#define unsafe_put_sigmask(set, frame, label)				\
+do {									\
+	if (__unsafe_put_sigmask((char *) set,				\
+				 (char __user *) &(frame)->uc.uc_sigmask, \
+				 sizeof(sigset_t)))			\
+		goto label;						\
+} while (0)
+
+#define unsafe_put_compat_sigmask(set, frame, label)			\
+do {									\
+	if (__unsafe_put_sigmask((char *) set,				\
+				 (char __user *) &(frame)->uc.uc_sigmask, \
+				 sizeof(compat_sigset_t)))		\
+		goto label;						\
+} while (0)
+
+
 #endif /* _ASM_X86_SIGHANDLING_H */
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index f4d21e470083..bb5f3f39c412 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -203,11 +203,6 @@ do {									\
 		goto label;						\
 } while(0);
 
-#define unsafe_put_sigmask(set, frame, label) \
-	unsafe_put_user(*(__u64 *)(set), \
-			(__u64 __user *)&(frame)->uc.uc_sigmask, \
-			label)
-
 /*
  * Set up a signal frame.
  */
@@ -587,7 +582,7 @@ static int x32_setup_rt_frame(struct ksignal *ksig,
 	restorer = ksig->ka.sa.sa_restorer;
 	unsafe_put_user(restorer, (unsigned long __user *)&frame->pretcode, Efault);
 	unsafe_put_sigcontext(&frame->uc.uc_mcontext, fp, regs, set, Efault);
-	unsafe_put_sigmask(set, frame, Efault);
+	unsafe_put_compat_sigmask(set, frame, Efault);
 	user_access_end();
 
 	if (ksig->ka.sa.sa_flags & SA_SIGINFO) {
@@ -664,7 +659,7 @@ SYSCALL_DEFINE0(rt_sigreturn)
 	frame = (struct rt_sigframe __user *)(regs->sp - sizeof(long));
 	if (!access_ok(frame, sizeof(*frame)))
 		goto badframe;
-	if (__get_user(*(__u64 *)&set, (__u64 __user *)&frame->uc.uc_sigmask))
+	if (copy_from_user(&set, &frame->uc.uc_sigmask, sizeof(sigset_t)))
 		goto badframe;
 	if (__get_user(uc_flags, &frame->uc.uc_flags))
 		goto badframe;
@@ -922,7 +917,7 @@ COMPAT_SYSCALL_DEFINE0(x32_rt_sigreturn)
 
 	if (!access_ok(frame, sizeof(*frame)))
 		goto badframe;
-	if (__get_user(set.sig[0], (__u64 __user *)&frame->uc.uc_sigmask))
+	if (copy_from_user(&set, &frame->uc.uc_sigmask, sizeof(compat_sigset_t)))
 		goto badframe;
 	if (__get_user(uc_flags, &frame->uc.uc_flags))
 		goto badframe;
-- 
2.30.2



* [RFC PATCH 3/8] signals: Use a helper function to test if a signal is a real-time signal.
  2022-01-03 18:19 ` Walt Drummond
                   ` (3 preceding siblings ...)
  (?)
@ 2022-01-03 18:19 ` Walt Drummond
  -1 siblings, 0 replies; 57+ messages in thread
From: Walt Drummond @ 2022-01-03 18:19 UTC (permalink / raw)
  To: Thomas Gleixner, John Johansen, James Morris, Serge E. Hallyn
  Cc: linux-kernel, Walt Drummond, linux-security-module

Rather than testing against SIGRTMIN/SIGRTMAX directly, use this
helper to determine if a signal is a real-time signal.

Signed-off-by: Walt Drummond <walt@drummond.us>
---
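
Note for reviewers: a user-space sketch of the difference between the
open-coded test and the helper.  The constants are assumptions that
mirror the x86 kernel values (SIGRTMIN 32, SIGRTMAX 64), and the
model_* names are hypothetical.  It also assumes that signal numbers
above SIGRTMAX, once they exist, are not real-time signals, which is
how the helper in this patch reads.

```c
#include <assert.h>

/* Assumed values, mirroring x86 kernel headers. */
#define MODEL_SIGRTMIN 32
#define MODEL_SIGRTMAX 64

/* Open-coded test used before this patch: anything >= SIGRTMIN. */
static int model_old_rt_test(unsigned long sig)
{
	return sig >= MODEL_SIGRTMIN;
}

/* Range test equivalent to realtime_signal() in this patch. */
static int model_realtime_signal(unsigned long sig)
{
	return (sig >= MODEL_SIGRTMIN) && (sig <= MODEL_SIGRTMAX);
}
```

The two agree for signals 1..64 and only diverge for numbers above
SIGRTMAX, which is the range the rest of the series introduces.
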
 include/linux/signal.h     | 8 ++++++++
 kernel/signal.c            | 6 +++---
 kernel/time/posix-timers.c | 3 ++-
 security/apparmor/ipc.c    | 4 ++--
 4 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/include/linux/signal.h b/include/linux/signal.h
index c66d4f520228..a730f3d4615e 100644
--- a/include/linux/signal.h
+++ b/include/linux/signal.h
@@ -53,6 +53,14 @@ enum siginfo_layout {
 
 enum siginfo_layout siginfo_layout(unsigned sig, int si_code);
 
+/* Test if 'sig' is a realtime signal.  Use this instead of testing
+ * SIGRTMIN/SIGRTMAX directly.
+ */
+static inline int realtime_signal(unsigned long sig)
+{
+	return (sig >= SIGRTMIN) && (sig <= SIGRTMAX);
+}
+
 /*
  * Define some primitives to manipulate sigset_t.
  */
diff --git a/kernel/signal.c b/kernel/signal.c
index 94b1828ae973..a2f0e38ba934 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1065,7 +1065,7 @@ static void complete_signal(int sig, struct task_struct *p, enum pid_type type)
 
 static inline bool legacy_queue(struct sigpending *signals, int sig)
 {
-	return (sig < SIGRTMIN) && sigismember(&signals->signal, sig);
+	return !realtime_signal(sig) && sigismember(&signals->signal, sig);
 }
 
 static int __send_signal(int sig, struct kernel_siginfo *info, struct task_struct *t,
@@ -1108,7 +1108,7 @@ static int __send_signal(int sig, struct kernel_siginfo *info, struct task_struc
 	 * make sure at least one signal gets delivered and don't
 	 * pass on the info struct.
 	 */
-	if (sig < SIGRTMIN)
+	if (!realtime_signal(sig))
 		override_rlimit = (is_si_special(info) || info->si_code >= 0);
 	else
 		override_rlimit = 0;
@@ -1144,7 +1144,7 @@ static int __send_signal(int sig, struct kernel_siginfo *info, struct task_struc
 			break;
 		}
 	} else if (!is_si_special(info) &&
-		   sig >= SIGRTMIN && info->si_code != SI_USER) {
+		   realtime_signal(sig) && info->si_code != SI_USER) {
 		/*
 		 * Queue overflow, abort.  We may abort if the
 		 * signal was rt and sent by user using something
diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c
index 1cd10b102c51..6afb98eadd1d 100644
--- a/kernel/time/posix-timers.c
+++ b/kernel/time/posix-timers.c
@@ -442,7 +442,8 @@ static struct pid *good_sigevent(sigevent_t * event)
 		fallthrough;
 	case SIGEV_SIGNAL:
 	case SIGEV_THREAD:
-		if (event->sigev_signo <= 0 || event->sigev_signo > SIGRTMAX)
+		/* Signal 0 is a valid signal, just not here. */
+		if (!valid_signal(event->sigev_signo) || event->sigev_signo == 0)
 			return NULL;
 		fallthrough;
 	case SIGEV_NONE:
diff --git a/security/apparmor/ipc.c b/security/apparmor/ipc.c
index fe36d112aad9..8149b989b665 100644
--- a/security/apparmor/ipc.c
+++ b/security/apparmor/ipc.c
@@ -130,9 +130,9 @@ int aa_may_ptrace(struct aa_label *tracer, struct aa_label *tracee,
 
 static inline int map_signal_num(int sig)
 {
-	if (sig > SIGRTMAX)
+	if (!valid_signal(sig))
 		return SIGUNKNOWN;
-	else if (sig >= SIGRTMIN)
+	else if (realtime_signal(sig))
 		return sig - SIGRTMIN + SIGRT_BASE;
 	else if (sig < MAXMAPPED_SIG)
 		return sig_map[sig];
-- 
2.30.2



* [RFC PATCH 4/8] signals: Remove sigmask() macro
  2022-01-03 18:19 ` Walt Drummond
  (?)
@ 2022-01-03 18:19   ` Walt Drummond
  -1 siblings, 0 replies; 57+ messages in thread
From: Walt Drummond @ 2022-01-03 18:19 UTC (permalink / raw)
  To: Richard Henderson, Ivan Kokshaysky, Matt Turner,
	Geert Uytterhoeven, Dinh Nguyen, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	Adaptec OEM Raid Solutions, James E.J. Bottomley,
	Martin K. Petersen, Jeff Layton, Ilya Dryomov, David Woodhouse,
	Richard Weinberger, Trond Myklebust, Anna Schumaker,
	J. Bruce Fields, Chuck Lever, Alexander Viro, Oleg Nesterov,
	Paolo Bonzini
  Cc: linux-kernel, Walt Drummond, linux-alpha, linux-m68k, linux-scsi,
	ceph-devel, linux-mtd, linux-nfs, linux-fsdevel, kvm

The sigmask() macro can't support signal numbers larger than 64.

Remove the general usage of sigmask() and bit masks as input into the
functions that manipulate or accept sigset_t, with the exception of
compatibility cases. Use a comma-separated list of signal numbers as
input to sigaddset()/sigdelset()/... instead.

Signed-off-by: Walt Drummond <walt@drummond.us>
---
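
Note for reviewers: a user-space sketch of the idea.  A sigmask()-style
value is a single unsigned long, so on a 64-bit build it can only name
signals 1..64; indexing into the word array by signal number has no
such limit.  The variadic macro below is one possible way to accept a
comma-separated list and is an assumption for illustration only, not
necessarily how this patch implements sigaddset(); all model_* names
are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: a hypothetical 128-signal mask. */
#define MODEL_NSIG       128
#define MODEL_NSIG_BPW   64
#define MODEL_NSIG_WORDS (MODEL_NSIG / MODEL_NSIG_BPW)

struct model_sigset {
	uint64_t sig[MODEL_NSIG_WORDS];
};

/* Set one bit, selecting the word from the signal number. */
static void model_sigset_add(struct model_sigset *set, int sig)
{
	unsigned int bit = (unsigned int)sig - 1;

	assert(sig >= 1 && sig <= MODEL_NSIG);
	set->sig[bit / MODEL_NSIG_BPW] |= UINT64_C(1) << (bit % MODEL_NSIG_BPW);
}

static int model_sigset_ismember(const struct model_sigset *set, int sig)
{
	unsigned int bit = (unsigned int)sig - 1;

	assert(sig >= 1 && sig <= MODEL_NSIG);
	return (set->sig[bit / MODEL_NSIG_BPW] >> (bit % MODEL_NSIG_BPW)) & 1;
}

/* Add every signal number in an array to the set. */
static void model_sigaddset_list(struct model_sigset *set,
				 const int *sigs, size_t count)
{
	size_t i;

	for (i = 0; i < count; i++)
		model_sigset_add(set, sigs[i]);
}

/* Comma-separated front end: model_sigaddset(&set, 9, 19, 100); */
#define model_sigaddset(set, ...)					\
	model_sigaddset_list((set), (const int[]){ __VA_ARGS__ },	\
		sizeof((const int[]){ __VA_ARGS__ }) / sizeof(int))
```

Callers then write model_sigaddset(&set, 9, 19) where they previously
OR-ed sigmask(9) | sigmask(19), and signal 100 needs no special case.
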
 arch/alpha/kernel/signal.c     |   4 +-
 arch/m68k/include/asm/signal.h |   6 +-
 arch/nios2/kernel/signal.c     |   2 -
 arch/x86/include/asm/signal.h  |   6 +-
 drivers/scsi/dpti.h            |   2 -
 fs/ceph/addr.c                 |   2 +-
 fs/jffs2/background.c          |   2 +-
 fs/lockd/svc.c                 |   1 -
 fs/signalfd.c                  |   2 +-
 include/linux/signal.h         | 254 +++++++++++++++++++++------------
 kernel/compat.c                |   6 +-
 kernel/fork.c                  |   2 +-
 kernel/ptrace.c                |   2 +-
 kernel/signal.c                | 115 +++++++--------
 virt/kvm/kvm_main.c            |   2 +-
 15 files changed, 238 insertions(+), 170 deletions(-)

diff --git a/arch/alpha/kernel/signal.c b/arch/alpha/kernel/signal.c
index bc077babafab..cae533594248 100644
--- a/arch/alpha/kernel/signal.c
+++ b/arch/alpha/kernel/signal.c
@@ -33,7 +33,7 @@
 
 #define DEBUG_SIG 0
 
-#define _BLOCKABLE (~(sigmask(SIGKILL) | sigmask(SIGSTOP)))
+#define _BLOCKABLE (~(compat_sigmask(SIGKILL) | compat_sigmask(SIGSTOP)))
 
 asmlinkage void ret_from_sys_call(void);
 
@@ -47,7 +47,7 @@ SYSCALL_DEFINE2(osf_sigprocmask, int, how, unsigned long, newmask)
 	sigset_t mask;
 	unsigned long res;
 
-	siginitset(&mask, newmask & _BLOCKABLE);
+	compat_siginitset(&mask, newmask & _BLOCKABLE);
 	res = sigprocmask(how, &mask, &oldmask);
 	if (!res) {
 		force_successful_syscall_return();
diff --git a/arch/m68k/include/asm/signal.h b/arch/m68k/include/asm/signal.h
index 8af85c38d377..464ff863c958 100644
--- a/arch/m68k/include/asm/signal.h
+++ b/arch/m68k/include/asm/signal.h
@@ -24,7 +24,7 @@ typedef struct {
 #ifndef CONFIG_CPU_HAS_NO_BITFIELDS
 #define __HAVE_ARCH_SIG_BITOPS
 
-static inline void sigaddset(sigset_t *set, int _sig)
+static inline void sigset_add(sigset_t *set, int _sig)
 {
 	asm ("bfset %0{%1,#1}"
 		: "+o" (*set)
@@ -32,7 +32,7 @@ static inline void sigaddset(sigset_t *set, int _sig)
 		: "cc");
 }
 
-static inline void sigdelset(sigset_t *set, int _sig)
+static inline void sigset_del(sigset_t *set, int _sig)
 {
 	asm ("bfclr %0{%1,#1}"
 		: "+o" (*set)
@@ -56,7 +56,7 @@ static inline int __gen_sigismember(sigset_t *set, int _sig)
 	return ret;
 }
 
-#define sigismember(set,sig)			\
+#define sigset_ismember(set, sig)		\
 	(__builtin_constant_p(sig) ?		\
 	 __const_sigismember(set,sig) :		\
 	 __gen_sigismember(set,sig))
diff --git a/arch/nios2/kernel/signal.c b/arch/nios2/kernel/signal.c
index 2009ae2d3c3b..c9db511a6989 100644
--- a/arch/nios2/kernel/signal.c
+++ b/arch/nios2/kernel/signal.c
@@ -20,8 +20,6 @@
 #include <asm/ucontext.h>
 #include <asm/cacheflush.h>
 
-#define _BLOCKABLE (~(sigmask(SIGKILL) | sigmask(SIGSTOP)))
-
 /*
  * Do a signal return; undo the signal stack.
  *
diff --git a/arch/x86/include/asm/signal.h b/arch/x86/include/asm/signal.h
index 2dfb5fea13af..9bac7c6e524c 100644
--- a/arch/x86/include/asm/signal.h
+++ b/arch/x86/include/asm/signal.h
@@ -46,7 +46,7 @@ typedef sigset_t compat_sigset_t;
 
 #define __HAVE_ARCH_SIG_BITOPS
 
-#define sigaddset(set,sig)		    \
+#define sigset_add(set, sig)		    \
 	(__builtin_constant_p(sig)	    \
 	 ? __const_sigaddset((set), (sig))  \
 	 : __gen_sigaddset((set), (sig)))
@@ -62,7 +62,7 @@ static inline void __const_sigaddset(sigset_t *set, int _sig)
 	set->sig[sig / _NSIG_BPW] |= 1 << (sig % _NSIG_BPW);
 }
 
-#define sigdelset(set, sig)		    \
+#define sigset_del(set, sig)		    \
 	(__builtin_constant_p(sig)	    \
 	 ? __const_sigdelset((set), (sig))  \
 	 : __gen_sigdelset((set), (sig)))
@@ -93,7 +93,7 @@ static inline int __gen_sigismember(sigset_t *set, int _sig)
 	return ret;
 }
 
-#define sigismember(set, sig)			\
+#define sigset_ismember(set, sig)		\
 	(__builtin_constant_p(sig)		\
 	 ? __const_sigismember((set), (sig))	\
 	 : __gen_sigismember((set), (sig)))
diff --git a/drivers/scsi/dpti.h b/drivers/scsi/dpti.h
index 8a079e8d7f65..cfcbb7d98fc0 100644
--- a/drivers/scsi/dpti.h
+++ b/drivers/scsi/dpti.h
@@ -96,8 +96,6 @@ static int adpt_device_reset(struct scsi_cmnd* cmd);
 #define PINFO(fmt, args...) printk(KERN_INFO fmt, ##args)
 #define PCRIT(fmt, args...) printk(KERN_CRIT fmt, ##args)
 
-#define SHUTDOWN_SIGS	(sigmask(SIGKILL)|sigmask(SIGINT)|sigmask(SIGTERM))
-
 // Command timeouts
 #define FOREVER			(0)
 #define TMOUT_INQUIRY 		(20)
diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 99b80b5c7a93..238b5ce5ef64 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -1333,7 +1333,7 @@ const struct address_space_operations ceph_aops = {
 static void ceph_block_sigs(sigset_t *oldset)
 {
 	sigset_t mask;
-	siginitsetinv(&mask, sigmask(SIGKILL));
+	siginitsetinv(&mask, SIGKILL);
 	sigprocmask(SIG_BLOCK, &mask, oldset);
 }
 
diff --git a/fs/jffs2/background.c b/fs/jffs2/background.c
index 2b4d5013dc5d..bb84a8b2373c 100644
--- a/fs/jffs2/background.c
+++ b/fs/jffs2/background.c
@@ -77,7 +77,7 @@ static int jffs2_garbage_collect_thread(void *_c)
 	struct jffs2_sb_info *c = _c;
 	sigset_t hupmask;
 
-	siginitset(&hupmask, sigmask(SIGHUP));
+	siginitset(&hupmask, SIGHUP);
 	allow_signal(SIGKILL);
 	allow_signal(SIGSTOP);
 	allow_signal(SIGHUP);
diff --git a/fs/lockd/svc.c b/fs/lockd/svc.c
index b632be3ad57b..3c8b56c094d0 100644
--- a/fs/lockd/svc.c
+++ b/fs/lockd/svc.c
@@ -45,7 +45,6 @@
 
 #define NLMDBG_FACILITY		NLMDBG_SVC
 #define LOCKD_BUFSIZE		(1024 + NLMSVC_XDRSIZE)
-#define ALLOWED_SIGS		(sigmask(SIGKILL))
 
 static struct svc_program	nlmsvc_program;
 
diff --git a/fs/signalfd.c b/fs/signalfd.c
index 12fdc282e299..ed024d5aad2a 100644
--- a/fs/signalfd.c
+++ b/fs/signalfd.c
@@ -270,7 +270,7 @@ static int do_signalfd4(int ufd, sigset_t *mask, int flags)
 	if (flags & ~(SFD_CLOEXEC | SFD_NONBLOCK))
 		return -EINVAL;
 
-	sigdelsetmask(mask, sigmask(SIGKILL) | sigmask(SIGSTOP));
+	sigdelset(mask, SIGKILL, SIGSTOP);
 	signotset(mask);
 
 	if (ufd == -1) {
diff --git a/include/linux/signal.h b/include/linux/signal.h
index a730f3d4615e..eaf7991fffee 100644
--- a/include/linux/signal.h
+++ b/include/linux/signal.h
@@ -53,6 +53,12 @@ enum siginfo_layout {
 
 enum siginfo_layout siginfo_layout(unsigned sig, int si_code);
 
+/* Test if 'sig' is valid signal. Use this instead of testing _NSIG directly */
+static inline int valid_signal(unsigned long sig)
+{
+	return sig <= _NSIG ? 1 : 0;
+}
+
 /* Test if 'sig' is a realtime signal.  Use this instead of testing
  * SIGRTMIN/SIGRTMAX directly.
  */
@@ -62,15 +68,20 @@ static inline int realtime_signal(unsigned long sig)
 }
 
 /*
- * Define some primitives to manipulate sigset_t.
+ * Define some primitives to manipulate individual bits in sigset_t.
+ * Don't use these directly.  Architectures can define their own
+ * versions (see arch/x86/include/asm/signal.h)
  */
 
 #ifndef __HAVE_ARCH_SIG_BITOPS
-#include <linux/bitops.h>
+#define sigset_add(set, sig)       __sigset_add(set, sig)
+#define sigset_del(set, sig)       __sigset_del(set, sig)
+#define sigset_ismember(set, sig)  __sigset_ismember(set, sig)
+#endif
 
 /* We don't use <linux/bitops.h> for these because there is no need to
    be atomic.  */
-static inline void sigaddset(sigset_t *set, int _sig)
+static inline void __sigset_add(sigset_t *set, int _sig)
 {
 	unsigned long sig = _sig - 1;
 	if (_NSIG_WORDS == 1)
@@ -79,7 +90,7 @@ static inline void sigaddset(sigset_t *set, int _sig)
 		set->sig[sig / _NSIG_BPW] |= 1UL << (sig % _NSIG_BPW);
 }
 
-static inline void sigdelset(sigset_t *set, int _sig)
+static inline void __sigset_del(sigset_t *set, int _sig)
 {
 	unsigned long sig = _sig - 1;
 	if (_NSIG_WORDS == 1)
@@ -88,33 +99,72 @@ static inline void sigdelset(sigset_t *set, int _sig)
 		set->sig[sig / _NSIG_BPW] &= ~(1UL << (sig % _NSIG_BPW));
 }
 
-static inline int sigismember(sigset_t *set, int _sig)
+static inline int __sigset_ismember(sigset_t *set, int _sig)
 {
 	unsigned long sig = _sig - 1;
 	if (_NSIG_WORDS == 1)
-		return 1 & (set->sig[0] >> sig);
+		return 1UL & (set->sig[0] >> sig);
 	else
-		return 1 & (set->sig[sig / _NSIG_BPW] >> (sig % _NSIG_BPW));
+		return 1UL & (set->sig[sig / _NSIG_BPW] >> (sig % _NSIG_BPW));
 }
 
-#endif /* __HAVE_ARCH_SIG_BITOPS */
+/* Some primitives for setting/deleting signals from sigset_t.  Use these. */
 
-static inline int sigisemptyset(sigset_t *set)
+#define NUM_INTARGS(...) (sizeof((int[]){__VA_ARGS__})/sizeof(int))
+
+#define sigdelset(x, ...) __sigdelset((x), NUM_INTARGS(__VA_ARGS__),	\
+				      __VA_ARGS__)
+static inline void __sigdelset(sigset_t *set, int count, ...)
 {
-	switch (_NSIG_WORDS) {
-	case 4:
-		return (set->sig[3] | set->sig[2] |
-			set->sig[1] | set->sig[0]) == 0;
-	case 2:
-		return (set->sig[1] | set->sig[0]) == 0;
-	case 1:
-		return set->sig[0] == 0;
-	default:
-		BUILD_BUG();
-		return 0;
+	va_list ap;
+	int sig;
+
+	va_start(ap, count);
+	while (count > 0) {
+		sig = va_arg(ap, int);
+		if (valid_signal(sig) && sig != 0)
+			sigset_del(set, sig);
+		count--;
 	}
+	va_end(ap);
+}
+
+#define sigaddset(x, ...) __sigaddset((x), NUM_INTARGS(__VA_ARGS__),	\
+				      __VA_ARGS__)
+static inline void __sigaddset(sigset_t *set, int count, ...)
+{
+	va_list ap;
+	int sig;
+
+	va_start(ap, count);
+	while (count > 0) {
+		sig = va_arg(ap, int);
+		if (valid_signal(sig) && sig != 0)
+			sigset_add(set, sig);
+		count--;
+	}
+	va_end(ap);
+}
+
+static inline int sigismember(sigset_t *set, int sig)
+{
+	if (!valid_signal(sig) || sig == 0)
+		return 0;
+	return sigset_ismember(set, sig);
 }
 
+#define siginitset(set, ...)			\
+do {						\
+	sigemptyset((set));			\
+	sigaddset((set), __VA_ARGS__);		\
+} while (0)
+
+#define siginitsetinv(set, ...)			\
+do {					        \
+	sigfillset((set));			\
+	sigdelset((set), __VA_ARGS__);		\
+} while (0)
+
 static inline int sigequalsets(const sigset_t *set1, const sigset_t *set2)
 {
 	switch (_NSIG_WORDS) {
@@ -128,11 +178,18 @@ static inline int sigequalsets(const sigset_t *set1, const sigset_t *set2)
 			(set1->sig[0] == set2->sig[0]);
 	case 1:
 		return	set1->sig[0] == set2->sig[0];
+	default:
+		return memcmp(set1, set2, sizeof(sigset_t)) == 0;
 	}
 	return 0;
 }
 
-#define sigmask(sig)	(1UL << ((sig) - 1))
+static inline int sigisemptyset(sigset_t *set)
+{
+	sigset_t empty = {0};
+
+	return sigequalsets(set, &empty);
+}
 
 #ifndef __HAVE_ARCH_SIG_SETOPS
 #include <linux/string.h>
@@ -141,6 +198,7 @@ static inline int sigequalsets(const sigset_t *set1, const sigset_t *set2)
 static inline void name(sigset_t *r, const sigset_t *a, const sigset_t *b) \
 {									\
 	unsigned long a0, a1, a2, a3, b0, b1, b2, b3;			\
+	int i;								\
 									\
 	switch (_NSIG_WORDS) {						\
 	case 4:								\
@@ -158,7 +216,9 @@ static inline void name(sigset_t *r, const sigset_t *a, const sigset_t *b) \
 		r->sig[0] = op(a0, b0);					\
 		break;							\
 	default:							\
-		BUILD_BUG();						\
+		for (i = 0; i < _NSIG_WORDS; i++)			\
+			r->sig[i] = op(a->sig[i], b->sig[i]);		\
+		break;							\
 	}								\
 }
 
@@ -179,6 +239,8 @@ _SIG_SET_BINOP(sigandnsets, _sig_andn)
 #define _SIG_SET_OP(name, op)						\
 static inline void name(sigset_t *set)					\
 {									\
+	int i;								\
+									\
 	switch (_NSIG_WORDS) {						\
 	case 4:	set->sig[3] = op(set->sig[3]);				\
 		set->sig[2] = op(set->sig[2]);				\
@@ -188,7 +250,9 @@ static inline void name(sigset_t *set)					\
 	case 1:	set->sig[0] = op(set->sig[0]);				\
 		    break;						\
 	default:							\
-		BUILD_BUG();						\
+		for (i = 0; i < _NSIG_WORDS; i++)			\
+			set->sig[i] = op(set->sig[i]);			\
+		break;							\
 	}								\
 }
 
@@ -224,24 +288,13 @@ static inline void sigfillset(sigset_t *set)
 	}
 }
 
-/* Some extensions for manipulating the low 32 signals in particular.  */
+#endif /* __HAVE_ARCH_SIG_SETOPS */
 
-static inline void sigaddsetmask(sigset_t *set, unsigned long mask)
-{
-	set->sig[0] |= mask;
-}
+/* Primitives for handling the compat (first long) sigset_t */
 
-static inline void sigdelsetmask(sigset_t *set, unsigned long mask)
-{
-	set->sig[0] &= ~mask;
-}
+#define compat_sigmask(sig)       (1UL << ((sig) - 1))
 
-static inline int sigtestsetmask(sigset_t *set, unsigned long mask)
-{
-	return (set->sig[0] & mask) != 0;
-}
-
-static inline void siginitset(sigset_t *set, unsigned long mask)
+static inline void compat_siginitset(sigset_t *set, unsigned long mask)
 {
 	set->sig[0] = mask;
 	switch (_NSIG_WORDS) {
@@ -254,7 +307,7 @@ static inline void siginitset(sigset_t *set, unsigned long mask)
 	}
 }
 
-static inline void siginitsetinv(sigset_t *set, unsigned long mask)
+static inline void compat_siginitsetinv(sigset_t *set, unsigned long mask)
 {
 	set->sig[0] = ~mask;
 	switch (_NSIG_WORDS) {
@@ -267,7 +320,21 @@ static inline void siginitsetinv(sigset_t *set, unsigned long mask)
 	}
 }
 
-#endif /* __HAVE_ARCH_SIG_SETOPS */
+static inline void compat_sigaddsetmask(sigset_t *set, unsigned long mask)
+{
+	set->sig[0] |= mask;
+}
+
+static inline void compat_sigdelsetmask(sigset_t *set, unsigned long mask)
+{
+	set->sig[0] &= ~mask;
+}
+
+static inline int compat_sigtestsetmask(sigset_t *set, unsigned long mask)
+{
+	return (set->sig[0] & mask) != 0;
+}
+
 
 /* Safely copy a sigset_t from user space handling any differences in
  * size between user space and kernel sigset_t.  We don't use
@@ -338,12 +405,6 @@ static inline void init_sigpending(struct sigpending *sig)
 
 extern void flush_sigqueue(struct sigpending *queue);
 
-/* Test if 'sig' is valid signal. Use this instead of testing _NSIG directly */
-static inline int valid_signal(unsigned long sig)
-{
-	return sig <= _NSIG ? 1 : 0;
-}
-
 struct timespec;
 struct pt_regs;
 enum pid_type;
@@ -470,55 +531,72 @@ extern bool unhandled_signal(struct task_struct *tsk, int sig);
  * default action of stopping the process may happen later or never.
  */
 
+static inline int sig_kernel_stop(unsigned long sig)
+{
+	return	sig == SIGSTOP ||
+		sig == SIGTSTP ||
+		sig == SIGTTIN ||
+		sig == SIGTTOU;
+}
+
+static inline int sig_kernel_ignore(unsigned long sig)
+{
+	return	sig == SIGCONT	||
+		sig == SIGCHLD	||
+		sig == SIGWINCH ||
+		sig == SIGURG;
+}
+
+static inline int sig_kernel_only(unsigned long sig)
+{
+	return	sig == SIGKILL ||
+		sig == SIGSTOP;
+}
+
+static inline int sig_kernel_coredump(unsigned long sig)
+{
+	return	sig == SIGQUIT ||
+		sig == SIGILL  ||
+		sig == SIGTRAP ||
+		sig == SIGABRT ||
+		sig == SIGFPE  ||
+		sig == SIGSEGV ||
+		sig == SIGBUS  ||
+		sig == SIGSYS  ||
+		sig == SIGXCPU ||
 #ifdef SIGEMT
-#define SIGEMT_MASK	rt_sigmask(SIGEMT)
-#else
-#define SIGEMT_MASK	0
+		sig == SIGEMT  ||
 #endif
+		sig == SIGXFSZ;
+}
 
-#if SIGRTMIN > BITS_PER_LONG
-#define rt_sigmask(sig)	(1ULL << ((sig)-1))
-#else
-#define rt_sigmask(sig)	sigmask(sig)
+static inline int sig_specific_sicodes(unsigned long sig)
+{
+	return	sig == SIGILL  ||
+		sig == SIGFPE  ||
+		sig == SIGSEGV ||
+		sig == SIGBUS  ||
+		sig == SIGTRAP ||
+		sig == SIGCHLD ||
+		sig == SIGPOLL ||
+#ifdef SIGEMT
+		sig == SIGEMT  ||
 #endif
+		sig == SIGSYS;
+}
 
-#define siginmask(sig, mask) \
-	((sig) > 0 && (sig) < SIGRTMIN && (rt_sigmask(sig) & (mask)))
-
-#define SIG_KERNEL_ONLY_MASK (\
-	rt_sigmask(SIGKILL)   |  rt_sigmask(SIGSTOP))
-
-#define SIG_KERNEL_STOP_MASK (\
-	rt_sigmask(SIGSTOP)   |  rt_sigmask(SIGTSTP)   | \
-	rt_sigmask(SIGTTIN)   |  rt_sigmask(SIGTTOU)   )
-
-#define SIG_KERNEL_COREDUMP_MASK (\
-        rt_sigmask(SIGQUIT)   |  rt_sigmask(SIGILL)    | \
-	rt_sigmask(SIGTRAP)   |  rt_sigmask(SIGABRT)   | \
-        rt_sigmask(SIGFPE)    |  rt_sigmask(SIGSEGV)   | \
-	rt_sigmask(SIGBUS)    |  rt_sigmask(SIGSYS)    | \
-        rt_sigmask(SIGXCPU)   |  rt_sigmask(SIGXFSZ)   | \
-	SIGEMT_MASK				       )
-
-#define SIG_KERNEL_IGNORE_MASK (\
-        rt_sigmask(SIGCONT)   |  rt_sigmask(SIGCHLD)   | \
-	rt_sigmask(SIGWINCH)  |  rt_sigmask(SIGURG)    )
-
-#define SIG_SPECIFIC_SICODES_MASK (\
-	rt_sigmask(SIGILL)    |  rt_sigmask(SIGFPE)    | \
-	rt_sigmask(SIGSEGV)   |  rt_sigmask(SIGBUS)    | \
-	rt_sigmask(SIGTRAP)   |  rt_sigmask(SIGCHLD)   | \
-	rt_sigmask(SIGPOLL)   |  rt_sigmask(SIGSYS)    | \
-	SIGEMT_MASK                                    )
-
-#define sig_kernel_only(sig)		siginmask(sig, SIG_KERNEL_ONLY_MASK)
-#define sig_kernel_coredump(sig)	siginmask(sig, SIG_KERNEL_COREDUMP_MASK)
-#define sig_kernel_ignore(sig)		siginmask(sig, SIG_KERNEL_IGNORE_MASK)
-#define sig_kernel_stop(sig)		siginmask(sig, SIG_KERNEL_STOP_MASK)
-#define sig_specific_sicodes(sig)	siginmask(sig, SIG_SPECIFIC_SICODES_MASK)
+static inline int synchronous_signal(unsigned long sig)
+{
+	return	sig == SIGSEGV ||
+		sig == SIGBUS  ||
+		sig == SIGILL  ||
+		sig == SIGTRAP ||
+		sig == SIGFPE  ||
+		sig == SIGSYS;
+}
 
 #define sig_fatal(t, signr) \
-	(!siginmask(signr, SIG_KERNEL_IGNORE_MASK|SIG_KERNEL_STOP_MASK) && \
+	(!(sig_kernel_ignore(signr) ||	sig_kernel_stop(signr)) &&	\
 	 (t)->sighand->action[(signr)-1].sa.sa_handler == SIG_DFL)
 
 void signals_init(void);
diff --git a/kernel/compat.c b/kernel/compat.c
index cc2438f4070c..26ffd271444c 100644
--- a/kernel/compat.c
+++ b/kernel/compat.c
@@ -49,16 +49,16 @@ COMPAT_SYSCALL_DEFINE3(sigprocmask, int, how,
 	if (nset) {
 		if (get_user(new_set, nset))
 			return -EFAULT;
-		new_set &= ~(sigmask(SIGKILL) | sigmask(SIGSTOP));
+		new_set &= ~(compat_sigmask(SIGKILL) | compat_sigmask(SIGSTOP));
 
 		new_blocked = current->blocked;
 
 		switch (how) {
 		case SIG_BLOCK:
-			sigaddsetmask(&new_blocked, new_set);
+			compat_sigaddsetmask(&new_blocked, new_set);
 			break;
 		case SIG_UNBLOCK:
-			sigdelsetmask(&new_blocked, new_set);
+			compat_sigdelsetmask(&new_blocked, new_set);
 			break;
 		case SIG_SETMASK:
 			compat_sig_setmask(&new_blocked, new_set);
diff --git a/kernel/fork.c b/kernel/fork.c
index 38681ad44c76..8b07f0090b82 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2032,7 +2032,7 @@ static __latent_entropy struct task_struct *copy_process(
 		 * fatal or STOP
 		 */
 		p->flags |= PF_IO_WORKER;
-		siginitsetinv(&p->blocked, sigmask(SIGKILL)|sigmask(SIGSTOP));
+		siginitsetinv(&p->blocked, SIGKILL, SIGSTOP);
 	}
 
 	/*
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 2f7ee345a629..200b99d39878 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -1102,7 +1102,7 @@ int ptrace_request(struct task_struct *child, long request,
 		if (ret)
 			break;
 
-		sigdelsetmask(&new_set, sigmask(SIGKILL)|sigmask(SIGSTOP));
+		sigdelset(&new_set, SIGKILL, SIGSTOP);
 
 		/*
 		 * Every thread does recalc_sigpending() after resume, so
diff --git a/kernel/signal.c b/kernel/signal.c
index a2f0e38ba934..9421f1112b20 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -64,6 +64,9 @@ static struct kmem_cache *sigqueue_cachep;
 
 int print_fatal_signals __read_mostly;
 
+sigset_t signal_stop_mask;
+sigset_t signal_synchronous_mask;
+
 static void __user *sig_handler(struct task_struct *t, int sig)
 {
 	return t->sighand->action[sig - 1].sa.sa_handler;
@@ -199,55 +202,26 @@ void calculate_sigpending(void)
 }
 
 /* Given the mask, find the first available signal that should be serviced. */
-
-#define SYNCHRONOUS_MASK \
-	(sigmask(SIGSEGV) | sigmask(SIGBUS) | sigmask(SIGILL) | \
-	 sigmask(SIGTRAP) | sigmask(SIGFPE) | sigmask(SIGSYS))
-
 int next_signal(struct sigpending *pending, sigset_t *mask)
 {
-	unsigned long i, *s, *m, x;
-	int sig = 0;
+	int i, sig;
+	sigset_t pend, s;
 
-	s = pending->signal.sig;
-	m = mask->sig;
+	sigandnsets(&pend, &pending->signal, mask);
 
-	/*
-	 * Handle the first word specially: it contains the
-	 * synchronous signals that need to be dequeued first.
-	 */
-	x = *s &~ *m;
-	if (x) {
-		if (x & SYNCHRONOUS_MASK)
-			x &= SYNCHRONOUS_MASK;
-		sig = ffz(~x) + 1;
-		return sig;
-	}
+	/* Handle synchronous signals first */
+	sigandsets(&s, &pend, &signal_synchronous_mask);
+	if (!sigisemptyset(&s))
+		pend = s;
 
-	switch (_NSIG_WORDS) {
-	default:
-		for (i = 1; i < _NSIG_WORDS; ++i) {
-			x = *++s &~ *++m;
-			if (!x)
-				continue;
-			sig = ffz(~x) + i*_NSIG_BPW + 1;
-			break;
+	for (i = 0; i < _NSIG_WORDS; i++) {
+		if (pend.sig[i] != 0) {
+			sig = ffz(~pend.sig[i]) + i*_NSIG_BPW + 1;
+			return sig;
 		}
-		break;
-
-	case 2:
-		x = s[1] &~ m[1];
-		if (!x)
-			break;
-		sig = ffz(~x) + _NSIG_BPW + 1;
-		break;
-
-	case 1:
-		/* Nothing to do */
-		break;
 	}
 
-	return sig;
+	return 0;
 }
 
 static inline void print_dropped_signal(int sig)
@@ -709,11 +683,14 @@ static int dequeue_synchronous_signal(kernel_siginfo_t *info)
 	struct task_struct *tsk = current;
 	struct sigpending *pending = &tsk->pending;
 	struct sigqueue *q, *sync = NULL;
+	sigset_t s;
 
 	/*
 	 * Might a synchronous signal be in the queue?
 	 */
-	if (!((pending->signal.sig[0] & ~tsk->blocked.sig[0]) & SYNCHRONOUS_MASK))
+	sigandnsets(&s, &pending->signal, &tsk->blocked);
+	sigandsets(&s, &s, &signal_synchronous_mask);
+	if (sigisemptyset(&s))
 		return 0;
 
 	/*
@@ -722,7 +699,7 @@ static int dequeue_synchronous_signal(kernel_siginfo_t *info)
 	list_for_each_entry(q, &pending->list, list) {
 		/* Synchronous signals have a positive si_code */
 		if ((q->info.si_code > SI_USER) &&
-		    (sigmask(q->info.si_signo) & SYNCHRONOUS_MASK)) {
+		    synchronous_signal(q->info.si_signo)) {
 			sync = q;
 			goto next;
 		}
@@ -795,6 +772,25 @@ static void flush_sigqueue_mask(sigset_t *mask, struct sigpending *s)
 	}
 }
 
+#define flush_sigqueue_sig(x, ...) __flush_sigqueue_sig((x),		\
+					NUM_INTARGS(__VA_ARGS__), __VA_ARGS__)
+static void __flush_sigqueue_sig(struct sigpending *s, int count, ...)
+{
+	va_list ap;
+	sigset_t mask;
+	int sig;
+
+	sigemptyset(&mask);
+	va_start(ap, count);
+	while (count > 0) {
+		sig = va_arg(ap, int);
+		if (valid_signal(sig) && sig != 0)
+			sigset_add(&mask, sig);
+		count--;
+	}
+	flush_sigqueue_mask(&mask, s);
+}
+
 static inline int is_si_special(const struct kernel_siginfo *info)
 {
 	return info <= SEND_SIG_PRIV;
@@ -913,8 +909,7 @@ static bool prepare_signal(int sig, struct task_struct *p, bool force)
 		/*
 		 * This is a stop signal.  Remove SIGCONT from all queues.
 		 */
-		siginitset(&flush, sigmask(SIGCONT));
-		flush_sigqueue_mask(&flush, &signal->shared_pending);
+		flush_sigqueue_sig(&signal->shared_pending, SIGCONT);
 		for_each_thread(p, t)
 			flush_sigqueue_mask(&flush, &t->pending);
 	} else if (sig == SIGCONT) {
@@ -922,10 +917,9 @@ static bool prepare_signal(int sig, struct task_struct *p, bool force)
 		/*
 		 * Remove all stop signals from all queues, wake all threads.
 		 */
-		siginitset(&flush, SIG_KERNEL_STOP_MASK);
-		flush_sigqueue_mask(&flush, &signal->shared_pending);
+		flush_sigqueue_mask(&signal_stop_mask, &signal->shared_pending);
 		for_each_thread(p, t) {
-			flush_sigqueue_mask(&flush, &t->pending);
+			flush_sigqueue_mask(&signal_stop_mask, &t->pending);
 			task_clear_jobctl_pending(t, JOBCTL_STOP_PENDING);
 			if (likely(!(t->ptrace & PT_SEIZED)))
 				wake_up_state(t, __TASK_STOPPED);
@@ -1172,7 +1166,7 @@ static int __send_signal(int sig, struct kernel_siginfo *info, struct task_struc
 			sigset_t *signal = &delayed->signal;
 			/* Can't queue both a stop and a continue signal */
 			if (sig == SIGCONT)
-				sigdelsetmask(signal, SIG_KERNEL_STOP_MASK);
+				sigandnsets(signal, signal, &signal_stop_mask);
 			else if (sig_kernel_stop(sig))
 				sigdelset(signal, SIGCONT);
 			sigaddset(signal, sig);
@@ -3023,7 +3017,7 @@ static void __set_task_blocked(struct task_struct *tsk, const sigset_t *newset)
  */
 void set_current_blocked(sigset_t *newset)
 {
-	sigdelsetmask(newset, sigmask(SIGKILL) | sigmask(SIGSTOP));
+	sigdelset(newset, SIGKILL, SIGSTOP);
 	__set_current_blocked(newset);
 }
 
@@ -3150,7 +3144,7 @@ SYSCALL_DEFINE4(rt_sigprocmask, int, how, sigset_t __user *, nset,
 	if (nset) {
 		if (copy_sigset_from_user(&new_set, nset, sigsetsize))
 			return -EFAULT;
-		sigdelsetmask(&new_set, sigmask(SIGKILL)|sigmask(SIGSTOP));
+		sigdelset(&new_set, SIGKILL, SIGSTOP);
 
 		error = sigprocmask(how, &new_set, NULL);
 		if (error)
@@ -3180,7 +3174,7 @@ COMPAT_SYSCALL_DEFINE4(rt_sigprocmask, int, how, compat_sigset_t __user *, nset,
 	if (nset) {
 		if (copy_compat_sigset_from_user(&new_set, nset, sigsetsize))
 			return -EFAULT;
-		sigdelsetmask(&new_set, sigmask(SIGKILL)|sigmask(SIGSTOP));
+		sigdelset(&new_set, SIGKILL, SIGSTOP);
 
 		error = sigprocmask(how, &new_set, NULL);
 		if (error)
@@ -3586,7 +3580,7 @@ static int do_sigtimedwait(const sigset_t *which, kernel_siginfo_t *info,
 	/*
 	 * Invert the set of allowed signals to get those we want to block.
 	 */
-	sigdelsetmask(&mask, sigmask(SIGKILL) | sigmask(SIGSTOP));
+	sigdelset(&mask, SIGKILL, SIGSTOP);
 	signotset(&mask);
 
 	spin_lock_irq(&tsk->sighand->siglock);
@@ -4111,8 +4105,7 @@ int do_sigaction(int sig, struct k_sigaction *act, struct k_sigaction *oact)
 	sigaction_compat_abi(act, oact);
 
 	if (act) {
-		sigdelsetmask(&act->sa.sa_mask,
-			      sigmask(SIGKILL) | sigmask(SIGSTOP));
+		sigdelset(&act->sa.sa_mask, SIGKILL, SIGSTOP);
 		*k = *act;
 		/*
 		 * POSIX 3.3.1.3:
@@ -4126,9 +4119,7 @@ int do_sigaction(int sig, struct k_sigaction *act, struct k_sigaction *oact)
 		 *   be discarded, whether or not it is blocked"
 		 */
 		if (sig_handler_ignored(sig_handler(p, sig), sig)) {
-			sigemptyset(&mask);
-			sigaddset(&mask, sig);
-			flush_sigqueue_mask(&mask, &p->signal->shared_pending);
+			flush_sigqueue_sig(&p->signal->shared_pending, sig);
 			for_each_thread(p, t)
 				flush_sigqueue_mask(&mask, &t->pending);
 		}
@@ -4332,10 +4323,10 @@ SYSCALL_DEFINE3(sigprocmask, int, how, old_sigset_t __user *, nset,
 
 		switch (how) {
 		case SIG_BLOCK:
-			sigaddsetmask(&new_blocked, new_set);
+			compat_sigaddsetmask(&new_blocked, new_set);
 			break;
 		case SIG_UNBLOCK:
-			sigdelsetmask(&new_blocked, new_set);
+			compat_sigdelsetmask(&new_blocked, new_set);
 			break;
 		case SIG_SETMASK:
 			new_blocked.sig[0] = new_set;
@@ -4724,6 +4715,10 @@ void __init signals_init(void)
 {
 	siginfo_buildtime_checks();
 
+	sigaddset(&signal_stop_mask, SIGSTOP, SIGTSTP, SIGTTIN, SIGTTOU);
+	sigaddset(&signal_synchronous_mask, SIGSEGV, SIGBUS, SIGILL, SIGTRAP,
+		 SIGFPE, SIGSYS);
+
 	sigqueue_cachep = KMEM_CACHE(sigqueue, SLAB_PANIC | SLAB_ACCOUNT);
 }
 
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index c8b3645c9a7d..ab6ba4ec661b 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3684,7 +3684,7 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id)
 static int kvm_vcpu_ioctl_set_sigmask(struct kvm_vcpu *vcpu, sigset_t *sigset)
 {
 	if (sigset) {
-		sigdelsetmask(sigset, sigmask(SIGKILL)|sigmask(SIGSTOP));
+		sigdelset(sigset, SIGKILL, SIGSTOP);
 		vcpu->sigset_active = 1;
 		vcpu->sigset = *sigset;
 	} else
-- 
2.30.2


* [RFC PATCH 4/8] signals: Remove sigmask() macro
@ 2022-01-03 18:19   ` Walt Drummond
  0 siblings, 0 replies; 57+ messages in thread
From: Walt Drummond @ 2022-01-03 18:19 UTC (permalink / raw)
  To: Richard Henderson, Ivan Kokshaysky, Matt Turner,
	Geert Uytterhoeven, Dinh Nguyen, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	Adaptec OEM Raid Solutions, James E.J. Bottomley,
	Martin K. Petersen, Jeff Layton, Ilya Dryomov, David Woodhouse,
	Richard Weinberger, Trond Myklebust, Anna Schumaker,
	J. Bruce Fields, Chuck Lever, Alexander Viro, Oleg Nesterov,
	Paolo Bonzini
  Cc: linux-kernel, Walt Drummond, linux-alpha, linux-m68k, linux-scsi,
	ceph-devel, linux-mtd, linux-nfs, linux-fsdevel, kvm

The sigmask() macro can't support signal numbers larger than 64.
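
As a rough illustration of that limit, here is a minimal user-space
sketch (not kernel code). Only the sigmask() definition mirrors the
macro being removed; demo_sigset_t, DEMO_NSIG, DEMO_BPW and the demo_*
helpers are hypothetical stand-ins for sigset_t, _NSIG and _NSIG_BPW.
A single unsigned long has at most 64 bits, so a signal number above
that has no bit to occupy, while a multi-word set selects the word
first and then the bit within it.

```c
#include <assert.h>
#include <limits.h>

/* Mirrors the macro removed by this patch: one bit in a single word. */
#define sigmask(sig)	(1UL << ((sig) - 1))

/* Hypothetical stand-ins for the kernel's _NSIG, _NSIG_BPW and sigset_t. */
#define DEMO_NSIG	1024
#define DEMO_BPW	(sizeof(unsigned long) * CHAR_BIT)
#define DEMO_WORDS	((DEMO_NSIG + DEMO_BPW - 1) / DEMO_BPW)

typedef struct {
	unsigned long sig[DEMO_WORDS];
} demo_sigset_t;

/* Pick the word first, then the bit inside it. */
static void demo_sigset_add(demo_sigset_t *set, int _sig)
{
	unsigned long sig = _sig - 1;

	set->sig[sig / DEMO_BPW] |= 1UL << (sig % DEMO_BPW);
}

static int demo_sigset_ismember(const demo_sigset_t *set, int _sig)
{
	unsigned long sig = _sig - 1;

	return 1UL & (set->sig[sig / DEMO_BPW] >> (sig % DEMO_BPW));
}

/* Exercise both representations; returns 1 if every check holds. */
int demo_check(void)
{
	demo_sigset_t set = { { 0 } };

	/* Inside the first word the two forms agree. */
	demo_sigset_add(&set, 9);
	assert(set.sig[0] == sigmask(9));

	/* Signal 100 does not fit in one word; it lands in a later one. */
	demo_sigset_add(&set, 100);
	assert(set.sig[(100 - 1) / DEMO_BPW] != 0);
	assert(demo_sigset_ismember(&set, 100));
	assert(!demo_sigset_ismember(&set, 101));
	return 1;
}
```

Evaluating sigmask(100) itself would shift by more than the width of
unsigned long, which is undefined in C, so the sketch never does so.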

Remove the general usage of sigmask() and bit masks as input to the
functions that manipulate or accept sigset_t, with the exception of
compatibility cases. Use a comma-separated list of signal numbers as
input to sigaddset()/sigdelset()/... instead.
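
The comma-separated form can be sketched in user space roughly as
follows. NUM_INTARGS() is copied from this patch; everything prefixed
demo_ or DEMO_ is a hypothetical stand-in, and the two signal numbers
are the x86 Linux values, used here only for illustration. The macro
counts its arguments by taking the size of an unnamed int array and
passes that count to a variadic helper.

```c
#include <assert.h>
#include <limits.h>
#include <stdarg.h>

/* Same counting trick as the patch: size of an unnamed int array. */
#define NUM_INTARGS(...) (sizeof((int[]){__VA_ARGS__}) / sizeof(int))

/* Hypothetical stand-ins; signal numbers are the x86 Linux values. */
#define DEMO_NSIG	1024
#define DEMO_BPW	(sizeof(unsigned long) * CHAR_BIT)
#define DEMO_WORDS	((DEMO_NSIG + DEMO_BPW - 1) / DEMO_BPW)
#define DEMO_SIGKILL	9
#define DEMO_SIGSTOP	19

typedef struct {
	unsigned long sig[DEMO_WORDS];
} demo_sigset_t;

/* Clear 'count' signals passed as ints; out-of-range values are skipped. */
static void demo_sigdelset_n(demo_sigset_t *set, int count, ...)
{
	va_list ap;

	va_start(ap, count);
	while (count-- > 0) {
		int sig = va_arg(ap, int);

		if (sig >= 1 && sig <= DEMO_NSIG)
			set->sig[(sig - 1) / DEMO_BPW] &=
				~(1UL << ((sig - 1) % DEMO_BPW));
	}
	va_end(ap);
}

#define demo_sigdelset(x, ...) \
	demo_sigdelset_n((x), NUM_INTARGS(__VA_ARGS__), __VA_ARGS__)

static int demo_sigismember(const demo_sigset_t *set, int sig)
{
	return 1UL & (set->sig[(sig - 1) / DEMO_BPW] >>
		      ((sig - 1) % DEMO_BPW));
}

/* Fill a set, then drop two signals the way siginitsetinv() would. */
int demo_variadic_check(void)
{
	demo_sigset_t set;
	unsigned int i;

	for (i = 0; i < DEMO_WORDS; i++)
		set.sig[i] = ~0UL;

	assert(NUM_INTARGS(DEMO_SIGKILL, DEMO_SIGSTOP) == 2);
	demo_sigdelset(&set, DEMO_SIGKILL, DEMO_SIGSTOP);

	assert(!demo_sigismember(&set, DEMO_SIGKILL));
	assert(!demo_sigismember(&set, DEMO_SIGSTOP));
	assert(demo_sigismember(&set, 100));
	return 1;
}
```

Two properties of this form are worth noting. The operand of sizeof
is not evaluated, so each signal expression runs only once, in the
call itself. An empty list does not compile, since the expansion
leaves a trailing comma in the call.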

Signed-off-by: Walt Drummond <walt@drummond.us>
---
 arch/alpha/kernel/signal.c     |   4 +-
 arch/m68k/include/asm/signal.h |   6 +-
 arch/nios2/kernel/signal.c     |   2 -
 arch/x86/include/asm/signal.h  |   6 +-
 drivers/scsi/dpti.h            |   2 -
 fs/ceph/addr.c                 |   2 +-
 fs/jffs2/background.c          |   2 +-
 fs/lockd/svc.c                 |   1 -
 fs/signalfd.c                  |   2 +-
 include/linux/signal.h         | 254 +++++++++++++++++++++------------
 kernel/compat.c                |   6 +-
 kernel/fork.c                  |   2 +-
 kernel/ptrace.c                |   2 +-
 kernel/signal.c                | 115 +++++++--------
 virt/kvm/kvm_main.c            |   2 +-
 15 files changed, 238 insertions(+), 170 deletions(-)

diff --git a/arch/alpha/kernel/signal.c b/arch/alpha/kernel/signal.c
index bc077babafab..cae533594248 100644
--- a/arch/alpha/kernel/signal.c
+++ b/arch/alpha/kernel/signal.c
@@ -33,7 +33,7 @@
 
 #define DEBUG_SIG 0
 
-#define _BLOCKABLE (~(sigmask(SIGKILL) | sigmask(SIGSTOP)))
+#define _BLOCKABLE (~(compat_sigmask(SIGKILL) | compat_sigmask(SIGSTOP)))
 
 asmlinkage void ret_from_sys_call(void);
 
@@ -47,7 +47,7 @@ SYSCALL_DEFINE2(osf_sigprocmask, int, how, unsigned long, newmask)
 	sigset_t mask;
 	unsigned long res;
 
-	siginitset(&mask, newmask & _BLOCKABLE);
+	compat_siginitset(&mask, newmask & _BLOCKABLE);
 	res = sigprocmask(how, &mask, &oldmask);
 	if (!res) {
 		force_successful_syscall_return();
diff --git a/arch/m68k/include/asm/signal.h b/arch/m68k/include/asm/signal.h
index 8af85c38d377..464ff863c958 100644
--- a/arch/m68k/include/asm/signal.h
+++ b/arch/m68k/include/asm/signal.h
@@ -24,7 +24,7 @@ typedef struct {
 #ifndef CONFIG_CPU_HAS_NO_BITFIELDS
 #define __HAVE_ARCH_SIG_BITOPS
 
-static inline void sigaddset(sigset_t *set, int _sig)
+static inline void sigset_add(sigset_t *set, int _sig)
 {
 	asm ("bfset %0{%1,#1}"
 		: "+o" (*set)
@@ -32,7 +32,7 @@ static inline void sigaddset(sigset_t *set, int _sig)
 		: "cc");
 }
 
-static inline void sigdelset(sigset_t *set, int _sig)
+static inline void sigset_del(sigset_t *set, int _sig)
 {
 	asm ("bfclr %0{%1,#1}"
 		: "+o" (*set)
@@ -56,7 +56,7 @@ static inline int __gen_sigismember(sigset_t *set, int _sig)
 	return ret;
 }
 
-#define sigismember(set,sig)			\
+#define sigset_ismember(set, sig)		\
 	(__builtin_constant_p(sig) ?		\
 	 __const_sigismember(set,sig) :		\
 	 __gen_sigismember(set,sig))
diff --git a/arch/nios2/kernel/signal.c b/arch/nios2/kernel/signal.c
index 2009ae2d3c3b..c9db511a6989 100644
--- a/arch/nios2/kernel/signal.c
+++ b/arch/nios2/kernel/signal.c
@@ -20,8 +20,6 @@
 #include <asm/ucontext.h>
 #include <asm/cacheflush.h>
 
-#define _BLOCKABLE (~(sigmask(SIGKILL) | sigmask(SIGSTOP)))
-
 /*
  * Do a signal return; undo the signal stack.
  *
diff --git a/arch/x86/include/asm/signal.h b/arch/x86/include/asm/signal.h
index 2dfb5fea13af..9bac7c6e524c 100644
--- a/arch/x86/include/asm/signal.h
+++ b/arch/x86/include/asm/signal.h
@@ -46,7 +46,7 @@ typedef sigset_t compat_sigset_t;
 
 #define __HAVE_ARCH_SIG_BITOPS
 
-#define sigaddset(set,sig)		    \
+#define sigset_add(set, sig)		    \
 	(__builtin_constant_p(sig)	    \
 	 ? __const_sigaddset((set), (sig))  \
 	 : __gen_sigaddset((set), (sig)))
@@ -62,7 +62,7 @@ static inline void __const_sigaddset(sigset_t *set, int _sig)
 	set->sig[sig / _NSIG_BPW] |= 1 << (sig % _NSIG_BPW);
 }
 
-#define sigdelset(set, sig)		    \
+#define sigset_del(set, sig)		    \
 	(__builtin_constant_p(sig)	    \
 	 ? __const_sigdelset((set), (sig))  \
 	 : __gen_sigdelset((set), (sig)))
@@ -93,7 +93,7 @@ static inline int __gen_sigismember(sigset_t *set, int _sig)
 	return ret;
 }
 
-#define sigismember(set, sig)			\
+#define sigset_ismember(set, sig)		\
 	(__builtin_constant_p(sig)		\
 	 ? __const_sigismember((set), (sig))	\
 	 : __gen_sigismember((set), (sig)))
diff --git a/drivers/scsi/dpti.h b/drivers/scsi/dpti.h
index 8a079e8d7f65..cfcbb7d98fc0 100644
--- a/drivers/scsi/dpti.h
+++ b/drivers/scsi/dpti.h
@@ -96,8 +96,6 @@ static int adpt_device_reset(struct scsi_cmnd* cmd);
 #define PINFO(fmt, args...) printk(KERN_INFO fmt, ##args)
 #define PCRIT(fmt, args...) printk(KERN_CRIT fmt, ##args)
 
-#define SHUTDOWN_SIGS	(sigmask(SIGKILL)|sigmask(SIGINT)|sigmask(SIGTERM))
-
 // Command timeouts
 #define FOREVER			(0)
 #define TMOUT_INQUIRY 		(20)
diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 99b80b5c7a93..238b5ce5ef64 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -1333,7 +1333,7 @@ const struct address_space_operations ceph_aops = {
 static void ceph_block_sigs(sigset_t *oldset)
 {
 	sigset_t mask;
-	siginitsetinv(&mask, sigmask(SIGKILL));
+	siginitsetinv(&mask, SIGKILL);
 	sigprocmask(SIG_BLOCK, &mask, oldset);
 }
 
diff --git a/fs/jffs2/background.c b/fs/jffs2/background.c
index 2b4d5013dc5d..bb84a8b2373c 100644
--- a/fs/jffs2/background.c
+++ b/fs/jffs2/background.c
@@ -77,7 +77,7 @@ static int jffs2_garbage_collect_thread(void *_c)
 	struct jffs2_sb_info *c = _c;
 	sigset_t hupmask;
 
-	siginitset(&hupmask, sigmask(SIGHUP));
+	siginitset(&hupmask, SIGHUP);
 	allow_signal(SIGKILL);
 	allow_signal(SIGSTOP);
 	allow_signal(SIGHUP);
diff --git a/fs/lockd/svc.c b/fs/lockd/svc.c
index b632be3ad57b..3c8b56c094d0 100644
--- a/fs/lockd/svc.c
+++ b/fs/lockd/svc.c
@@ -45,7 +45,6 @@
 
 #define NLMDBG_FACILITY		NLMDBG_SVC
 #define LOCKD_BUFSIZE		(1024 + NLMSVC_XDRSIZE)
-#define ALLOWED_SIGS		(sigmask(SIGKILL))
 
 static struct svc_program	nlmsvc_program;
 
diff --git a/fs/signalfd.c b/fs/signalfd.c
index 12fdc282e299..ed024d5aad2a 100644
--- a/fs/signalfd.c
+++ b/fs/signalfd.c
@@ -270,7 +270,7 @@ static int do_signalfd4(int ufd, sigset_t *mask, int flags)
 	if (flags & ~(SFD_CLOEXEC | SFD_NONBLOCK))
 		return -EINVAL;
 
-	sigdelsetmask(mask, sigmask(SIGKILL) | sigmask(SIGSTOP));
+	sigdelset(mask, SIGKILL, SIGSTOP);
 	signotset(mask);
 
 	if (ufd == -1) {
diff --git a/include/linux/signal.h b/include/linux/signal.h
index a730f3d4615e..eaf7991fffee 100644
--- a/include/linux/signal.h
+++ b/include/linux/signal.h
@@ -53,6 +53,12 @@ enum siginfo_layout {
 
 enum siginfo_layout siginfo_layout(unsigned sig, int si_code);
 
+/* Test if 'sig' is valid signal. Use this instead of testing _NSIG directly */
+static inline int valid_signal(unsigned long sig)
+{
+	return sig <= _NSIG ? 1 : 0;
+}
+
 /* Test if 'sig' is a realtime signal.  Use this instead of testing
  * SIGRTMIN/SIGRTMAX directly.
  */
@@ -62,15 +68,20 @@ static inline int realtime_signal(unsigned long sig)
 }
 
 /*
- * Define some primitives to manipulate sigset_t.
+ * Define some primitives to manipulate individual bits in sigset_t.
+ * Don't use these directly.  Architectures can define their own
+ * versions (see arch/x86/include/asm/signal.h)
  */
 
 #ifndef __HAVE_ARCH_SIG_BITOPS
-#include <linux/bitops.h>
+#define sigset_add(set, sig)       __sigset_add(set, sig)
+#define sigset_del(set, sig)       __sigset_del(set, sig)
+#define sigset_ismember(set, sig)  __sigset_ismember(set, sig)
+#endif
 
 /* We don't use <linux/bitops.h> for these because there is no need to
    be atomic.  */
-static inline void sigaddset(sigset_t *set, int _sig)
+static inline void __sigset_add(sigset_t *set, int _sig)
 {
 	unsigned long sig = _sig - 1;
 	if (_NSIG_WORDS == 1)
@@ -79,7 +90,7 @@ static inline void sigaddset(sigset_t *set, int _sig)
 		set->sig[sig / _NSIG_BPW] |= 1UL << (sig % _NSIG_BPW);
 }
 
-static inline void sigdelset(sigset_t *set, int _sig)
+static inline void __sigset_del(sigset_t *set, int _sig)
 {
 	unsigned long sig = _sig - 1;
 	if (_NSIG_WORDS == 1)
@@ -88,33 +99,72 @@ static inline void sigdelset(sigset_t *set, int _sig)
 		set->sig[sig / _NSIG_BPW] &= ~(1UL << (sig % _NSIG_BPW));
 }
 
-static inline int sigismember(sigset_t *set, int _sig)
+static inline int __sigset_ismember(sigset_t *set, int _sig)
 {
 	unsigned long sig = _sig - 1;
 	if (_NSIG_WORDS == 1)
-		return 1 & (set->sig[0] >> sig);
+		return 1UL & (set->sig[0] >> sig);
 	else
-		return 1 & (set->sig[sig / _NSIG_BPW] >> (sig % _NSIG_BPW));
+		return 1UL & (set->sig[sig / _NSIG_BPW] >> (sig % _NSIG_BPW));
 }
 
-#endif /* __HAVE_ARCH_SIG_BITOPS */
+/* Some primitives for setting/deleting signals from sigset_t.  Use these. */
 
-static inline int sigisemptyset(sigset_t *set)
+#define NUM_INTARGS(...) (sizeof((int[]){__VA_ARGS__})/sizeof(int))
+
+#define sigdelset(x, ...) __sigdelset((x), NUM_INTARGS(__VA_ARGS__),	\
+				      __VA_ARGS__)
+static inline void __sigdelset(sigset_t *set, int count, ...)
 {
-	switch (_NSIG_WORDS) {
-	case 4:
-		return (set->sig[3] | set->sig[2] |
-			set->sig[1] | set->sig[0]) == 0;
-	case 2:
-		return (set->sig[1] | set->sig[0]) == 0;
-	case 1:
-		return set->sig[0] == 0;
-	default:
-		BUILD_BUG();
-		return 0;
+	va_list ap;
+	int sig;
+
+	va_start(ap, count);
+	while (count > 0) {
+		sig = va_arg(ap, int);
+		if (valid_signal(sig) && sig != 0)
+			sigset_del(set, sig);
+		count--;
 	}
+	va_end(ap);
+}
+
+#define sigaddset(x, ...) __sigaddset((x), NUM_INTARGS(__VA_ARGS__),	\
+				      __VA_ARGS__)
+static inline void __sigaddset(sigset_t *set, int count, ...)
+{
+	va_list ap;
+	int sig;
+
+	va_start(ap, count);
+	while (count > 0) {
+		sig = va_arg(ap, int);
+		if (valid_signal(sig) && sig != 0)
+			sigset_add(set, sig);
+		count--;
+	}
+	va_end(ap);
+}
+
+static inline int sigismember(sigset_t *set, int sig)
+{
+	if (!valid_signal(sig) || sig == 0)
+		return 0;
+	return sigset_ismember(set, sig);
 }
 
+#define siginitset(set, ...)			\
+do {						\
+	sigemptyset((set));			\
+	sigaddset((set), __VA_ARGS__);		\
+} while (0)
+
+#define siginitsetinv(set, ...)			\
+do {					        \
+	sigfillset((set));			\
+	sigdelset((set), __VA_ARGS__);		\
+} while (0)
+
 static inline int sigequalsets(const sigset_t *set1, const sigset_t *set2)
 {
 	switch (_NSIG_WORDS) {
@@ -128,11 +178,18 @@ static inline int sigequalsets(const sigset_t *set1, const sigset_t *set2)
 			(set1->sig[0] == set2->sig[0]);
 	case 1:
 		return	set1->sig[0] == set2->sig[0];
+	default:
+		return memcmp(set1, set2, sizeof(sigset_t)) == 0;
 	}
 	return 0;
 }
 
-#define sigmask(sig)	(1UL << ((sig) - 1))
+static inline int sigisemptyset(sigset_t *set)
+{
+	sigset_t empty = {0};
+
+	return sigequalsets(set, &empty);
+}
 
 #ifndef __HAVE_ARCH_SIG_SETOPS
 #include <linux/string.h>
@@ -141,6 +198,7 @@ static inline int sigequalsets(const sigset_t *set1, const sigset_t *set2)
 static inline void name(sigset_t *r, const sigset_t *a, const sigset_t *b) \
 {									\
 	unsigned long a0, a1, a2, a3, b0, b1, b2, b3;			\
+	int i;								\
 									\
 	switch (_NSIG_WORDS) {						\
 	case 4:								\
@@ -158,7 +216,9 @@ static inline void name(sigset_t *r, const sigset_t *a, const sigset_t *b) \
 		r->sig[0] = op(a0, b0);					\
 		break;							\
 	default:							\
-		BUILD_BUG();						\
+		for (i = 0; i < _NSIG_WORDS; i++)			\
+			r->sig[i] = op(a->sig[i], b->sig[i]);		\
+		break;							\
 	}								\
 }
 
@@ -179,6 +239,8 @@ _SIG_SET_BINOP(sigandnsets, _sig_andn)
 #define _SIG_SET_OP(name, op)						\
 static inline void name(sigset_t *set)					\
 {									\
+	int i;								\
+									\
 	switch (_NSIG_WORDS) {						\
 	case 4:	set->sig[3] = op(set->sig[3]);				\
 		set->sig[2] = op(set->sig[2]);				\
@@ -188,7 +250,9 @@ static inline void name(sigset_t *set)					\
 	case 1:	set->sig[0] = op(set->sig[0]);				\
 		    break;						\
 	default:							\
-		BUILD_BUG();						\
+		for (i = 0; i < _NSIG_WORDS; i++)			\
+			set->sig[i] = op(set->sig[i]);			\
+		break;							\
 	}								\
 }
 
@@ -224,24 +288,13 @@ static inline void sigfillset(sigset_t *set)
 	}
 }
 
-/* Some extensions for manipulating the low 32 signals in particular.  */
+#endif /* __HAVE_ARCH_SIG_SETOPS */
 
-static inline void sigaddsetmask(sigset_t *set, unsigned long mask)
-{
-	set->sig[0] |= mask;
-}
+/* Primitives for handling the compat (first long) sigset_t */
 
-static inline void sigdelsetmask(sigset_t *set, unsigned long mask)
-{
-	set->sig[0] &= ~mask;
-}
+#define compat_sigmask(sig)       (1UL << ((sig) - 1))
 
-static inline int sigtestsetmask(sigset_t *set, unsigned long mask)
-{
-	return (set->sig[0] & mask) != 0;
-}
-
-static inline void siginitset(sigset_t *set, unsigned long mask)
+static inline void compat_siginitset(sigset_t *set, unsigned long mask)
 {
 	set->sig[0] = mask;
 	switch (_NSIG_WORDS) {
@@ -254,7 +307,7 @@ static inline void siginitset(sigset_t *set, unsigned long mask)
 	}
 }
 
-static inline void siginitsetinv(sigset_t *set, unsigned long mask)
+static inline void compat_siginitsetinv(sigset_t *set, unsigned long mask)
 {
 	set->sig[0] = ~mask;
 	switch (_NSIG_WORDS) {
@@ -267,7 +320,21 @@ static inline void siginitsetinv(sigset_t *set, unsigned long mask)
 	}
 }
 
-#endif /* __HAVE_ARCH_SIG_SETOPS */
+static inline void compat_sigaddsetmask(sigset_t *set, unsigned long mask)
+{
+	set->sig[0] |= mask;
+}
+
+static inline void compat_sigdelsetmask(sigset_t *set, unsigned long mask)
+{
+	set->sig[0] &= ~mask;
+}
+
+static inline int compat_sigtestsetmask(sigset_t *set, unsigned long mask)
+{
+	return (set->sig[0] & mask) != 0;
+}
+
 
 /* Safely copy a sigset_t from user space handling any differences in
  * size between user space and kernel sigset_t.  We don't use
@@ -338,12 +405,6 @@ static inline void init_sigpending(struct sigpending *sig)
 
 extern void flush_sigqueue(struct sigpending *queue);
 
-/* Test if 'sig' is valid signal. Use this instead of testing _NSIG directly */
-static inline int valid_signal(unsigned long sig)
-{
-	return sig <= _NSIG ? 1 : 0;
-}
-
 struct timespec;
 struct pt_regs;
 enum pid_type;
@@ -470,55 +531,72 @@ extern bool unhandled_signal(struct task_struct *tsk, int sig);
  * default action of stopping the process may happen later or never.
  */
 
+static inline int sig_kernel_stop(unsigned long sig)
+{
+	return	sig == SIGSTOP ||
+		sig == SIGTSTP ||
+		sig == SIGTTIN ||
+		sig == SIGTTOU;
+}
+
+static inline int sig_kernel_ignore(unsigned long sig)
+{
+	return	sig == SIGCONT	||
+		sig == SIGCHLD	||
+		sig == SIGWINCH ||
+		sig == SIGURG;
+}
+
+static inline int sig_kernel_only(unsigned long sig)
+{
+	return	sig == SIGKILL ||
+		sig == SIGSTOP;
+}
+
+static inline int sig_kernel_coredump(unsigned long sig)
+{
+	return	sig == SIGQUIT ||
+		sig == SIGILL  ||
+		sig == SIGTRAP ||
+		sig == SIGABRT ||
+		sig == SIGFPE  ||
+		sig == SIGSEGV ||
+		sig == SIGBUS  ||
+		sig == SIGSYS  ||
+		sig == SIGXCPU ||
 #ifdef SIGEMT
-#define SIGEMT_MASK	rt_sigmask(SIGEMT)
-#else
-#define SIGEMT_MASK	0
+		sig == SIGEMT  ||
 #endif
+		sig == SIGXFSZ;
+}
 
-#if SIGRTMIN > BITS_PER_LONG
-#define rt_sigmask(sig)	(1ULL << ((sig)-1))
-#else
-#define rt_sigmask(sig)	sigmask(sig)
+static inline int sig_specific_sicodes(unsigned long sig)
+{
+	return	sig == SIGILL  ||
+		sig == SIGFPE  ||
+		sig == SIGSEGV ||
+		sig == SIGBUS  ||
+		sig == SIGTRAP ||
+		sig == SIGCHLD ||
+		sig == SIGPOLL ||
+#ifdef SIGEMT
+		sig == SIGEMT  ||
 #endif
+		sig == SIGSYS;
+}
 
-#define siginmask(sig, mask) \
-	((sig) > 0 && (sig) < SIGRTMIN && (rt_sigmask(sig) & (mask)))
-
-#define SIG_KERNEL_ONLY_MASK (\
-	rt_sigmask(SIGKILL)   |  rt_sigmask(SIGSTOP))
-
-#define SIG_KERNEL_STOP_MASK (\
-	rt_sigmask(SIGSTOP)   |  rt_sigmask(SIGTSTP)   | \
-	rt_sigmask(SIGTTIN)   |  rt_sigmask(SIGTTOU)   )
-
-#define SIG_KERNEL_COREDUMP_MASK (\
-        rt_sigmask(SIGQUIT)   |  rt_sigmask(SIGILL)    | \
-	rt_sigmask(SIGTRAP)   |  rt_sigmask(SIGABRT)   | \
-        rt_sigmask(SIGFPE)    |  rt_sigmask(SIGSEGV)   | \
-	rt_sigmask(SIGBUS)    |  rt_sigmask(SIGSYS)    | \
-        rt_sigmask(SIGXCPU)   |  rt_sigmask(SIGXFSZ)   | \
-	SIGEMT_MASK				       )
-
-#define SIG_KERNEL_IGNORE_MASK (\
-        rt_sigmask(SIGCONT)   |  rt_sigmask(SIGCHLD)   | \
-	rt_sigmask(SIGWINCH)  |  rt_sigmask(SIGURG)    )
-
-#define SIG_SPECIFIC_SICODES_MASK (\
-	rt_sigmask(SIGILL)    |  rt_sigmask(SIGFPE)    | \
-	rt_sigmask(SIGSEGV)   |  rt_sigmask(SIGBUS)    | \
-	rt_sigmask(SIGTRAP)   |  rt_sigmask(SIGCHLD)   | \
-	rt_sigmask(SIGPOLL)   |  rt_sigmask(SIGSYS)    | \
-	SIGEMT_MASK                                    )
-
-#define sig_kernel_only(sig)		siginmask(sig, SIG_KERNEL_ONLY_MASK)
-#define sig_kernel_coredump(sig)	siginmask(sig, SIG_KERNEL_COREDUMP_MASK)
-#define sig_kernel_ignore(sig)		siginmask(sig, SIG_KERNEL_IGNORE_MASK)
-#define sig_kernel_stop(sig)		siginmask(sig, SIG_KERNEL_STOP_MASK)
-#define sig_specific_sicodes(sig)	siginmask(sig, SIG_SPECIFIC_SICODES_MASK)
+static inline int synchronous_signal(unsigned long sig)
+{
+	return	sig == SIGSEGV ||
+		sig == SIGBUS  ||
+		sig == SIGILL  ||
+		sig == SIGTRAP ||
+		sig == SIGFPE  ||
+		sig == SIGSYS;
+}
 
 #define sig_fatal(t, signr) \
-	(!siginmask(signr, SIG_KERNEL_IGNORE_MASK|SIG_KERNEL_STOP_MASK) && \
+	(!(sig_kernel_ignore(signr) ||	sig_kernel_stop(signr)) &&	\
 	 (t)->sighand->action[(signr)-1].sa.sa_handler == SIG_DFL)
 
 void signals_init(void);
diff --git a/kernel/compat.c b/kernel/compat.c
index cc2438f4070c..26ffd271444c 100644
--- a/kernel/compat.c
+++ b/kernel/compat.c
@@ -49,16 +49,16 @@ COMPAT_SYSCALL_DEFINE3(sigprocmask, int, how,
 	if (nset) {
 		if (get_user(new_set, nset))
 			return -EFAULT;
-		new_set &= ~(sigmask(SIGKILL) | sigmask(SIGSTOP));
+		new_set &= ~(compat_sigmask(SIGKILL) | compat_sigmask(SIGSTOP));
 
 		new_blocked = current->blocked;
 
 		switch (how) {
 		case SIG_BLOCK:
-			sigaddsetmask(&new_blocked, new_set);
+			compat_sigaddsetmask(&new_blocked, new_set);
 			break;
 		case SIG_UNBLOCK:
-			sigdelsetmask(&new_blocked, new_set);
+			compat_sigdelsetmask(&new_blocked, new_set);
 			break;
 		case SIG_SETMASK:
 			compat_sig_setmask(&new_blocked, new_set);
diff --git a/kernel/fork.c b/kernel/fork.c
index 38681ad44c76..8b07f0090b82 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2032,7 +2032,7 @@ static __latent_entropy struct task_struct *copy_process(
 		 * fatal or STOP
 		 */
 		p->flags |= PF_IO_WORKER;
-		siginitsetinv(&p->blocked, sigmask(SIGKILL)|sigmask(SIGSTOP));
+		siginitsetinv(&p->blocked, SIGKILL, SIGSTOP);
 	}
 
 	/*
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 2f7ee345a629..200b99d39878 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -1102,7 +1102,7 @@ int ptrace_request(struct task_struct *child, long request,
 		if (ret)
 			break;
 
-		sigdelsetmask(&new_set, sigmask(SIGKILL)|sigmask(SIGSTOP));
+		sigdelset(&new_set, SIGKILL, SIGSTOP);
 
 		/*
 		 * Every thread does recalc_sigpending() after resume, so
diff --git a/kernel/signal.c b/kernel/signal.c
index a2f0e38ba934..9421f1112b20 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -64,6 +64,9 @@ static struct kmem_cache *sigqueue_cachep;
 
 int print_fatal_signals __read_mostly;
 
+sigset_t signal_stop_mask;
+sigset_t signal_synchronous_mask;
+
 static void __user *sig_handler(struct task_struct *t, int sig)
 {
 	return t->sighand->action[sig - 1].sa.sa_handler;
@@ -199,55 +202,26 @@ void calculate_sigpending(void)
 }
 
 /* Given the mask, find the first available signal that should be serviced. */
-
-#define SYNCHRONOUS_MASK \
-	(sigmask(SIGSEGV) | sigmask(SIGBUS) | sigmask(SIGILL) | \
-	 sigmask(SIGTRAP) | sigmask(SIGFPE) | sigmask(SIGSYS))
-
 int next_signal(struct sigpending *pending, sigset_t *mask)
 {
-	unsigned long i, *s, *m, x;
-	int sig = 0;
+	int i, sig;
+	sigset_t pend, s;
 
-	s = pending->signal.sig;
-	m = mask->sig;
+	sigandnsets(&pend, &pending->signal, mask);
 
-	/*
-	 * Handle the first word specially: it contains the
-	 * synchronous signals that need to be dequeued first.
-	 */
-	x = *s &~ *m;
-	if (x) {
-		if (x & SYNCHRONOUS_MASK)
-			x &= SYNCHRONOUS_MASK;
-		sig = ffz(~x) + 1;
-		return sig;
-	}
+	/* Handle synchronous signals first */
+	sigandsets(&s, &pend, &signal_synchronous_mask);
+	if (!sigisemptyset(&s))
+		pend = s;
 
-	switch (_NSIG_WORDS) {
-	default:
-		for (i = 1; i < _NSIG_WORDS; ++i) {
-			x = *++s &~ *++m;
-			if (!x)
-				continue;
-			sig = ffz(~x) + i*_NSIG_BPW + 1;
-			break;
+	for (i = 0; i < _NSIG_WORDS; i++) {
+		if (pend.sig[i] != 0) {
+			sig = ffz(~pend.sig[i]) + i*_NSIG_BPW + 1;
+			return sig;
 		}
-		break;
-
-	case 2:
-		x = s[1] &~ m[1];
-		if (!x)
-			break;
-		sig = ffz(~x) + _NSIG_BPW + 1;
-		break;
-
-	case 1:
-		/* Nothing to do */
-		break;
 	}
 
-	return sig;
+	return 0;
 }
 
 static inline void print_dropped_signal(int sig)
@@ -709,11 +683,14 @@ static int dequeue_synchronous_signal(kernel_siginfo_t *info)
 	struct task_struct *tsk = current;
 	struct sigpending *pending = &tsk->pending;
 	struct sigqueue *q, *sync = NULL;
+	sigset_t s;
 
 	/*
 	 * Might a synchronous signal be in the queue?
 	 */
-	if (!((pending->signal.sig[0] & ~tsk->blocked.sig[0]) & SYNCHRONOUS_MASK))
+	sigandnsets(&s, &pending->signal, &tsk->blocked);
+	sigandsets(&s, &s, &signal_synchronous_mask);
+	if (sigisemptyset(&s))
 		return 0;
 
 	/*
@@ -722,7 +699,7 @@ static int dequeue_synchronous_signal(kernel_siginfo_t *info)
 	list_for_each_entry(q, &pending->list, list) {
 		/* Synchronous signals have a positive si_code */
 		if ((q->info.si_code > SI_USER) &&
-		    (sigmask(q->info.si_signo) & SYNCHRONOUS_MASK)) {
+		    synchronous_signal(q->info.si_signo)) {
 			sync = q;
 			goto next;
 		}
@@ -795,6 +772,25 @@ static void flush_sigqueue_mask(sigset_t *mask, struct sigpending *s)
 	}
 }
 
+#define flush_sigqueue_sig(x, ...) __flush_sigqueue_sig((x),		\
+					NUM_INTARGS(__VA_ARGS__), __VA_ARGS__)
+static void __flush_sigqueue_sig(struct sigpending *s, int count, ...)
+{
+	va_list ap;
+	sigset_t mask;
+	int sig;
+
+	sigemptyset(&mask);
+	va_start(ap, count);
+	while (count > 0) {
+		sig = va_arg(ap, int);
+		if (valid_signal(sig) && sig != 0)
+			sigset_add(&mask, sig);
+		count--;
+	}
+	flush_sigqueue_mask(&mask, s);
+}
+
 static inline int is_si_special(const struct kernel_siginfo *info)
 {
 	return info <= SEND_SIG_PRIV;
@@ -913,8 +909,7 @@ static bool prepare_signal(int sig, struct task_struct *p, bool force)
 		/*
 		 * This is a stop signal.  Remove SIGCONT from all queues.
 		 */
-		siginitset(&flush, sigmask(SIGCONT));
-		flush_sigqueue_mask(&flush, &signal->shared_pending);
+		flush_sigqueue_sig(&signal->shared_pending, SIGCONT);
 		for_each_thread(p, t)
 			flush_sigqueue_mask(&flush, &t->pending);
 	} else if (sig == SIGCONT) {
@@ -922,10 +917,9 @@ static bool prepare_signal(int sig, struct task_struct *p, bool force)
 		/*
 		 * Remove all stop signals from all queues, wake all threads.
 		 */
-		siginitset(&flush, SIG_KERNEL_STOP_MASK);
-		flush_sigqueue_mask(&flush, &signal->shared_pending);
+		flush_sigqueue_mask(&signal_stop_mask, &signal->shared_pending);
 		for_each_thread(p, t) {
-			flush_sigqueue_mask(&flush, &t->pending);
+			flush_sigqueue_mask(&signal_stop_mask, &t->pending);
 			task_clear_jobctl_pending(t, JOBCTL_STOP_PENDING);
 			if (likely(!(t->ptrace & PT_SEIZED)))
 				wake_up_state(t, __TASK_STOPPED);
@@ -1172,7 +1166,7 @@ static int __send_signal(int sig, struct kernel_siginfo *info, struct task_struc
 			sigset_t *signal = &delayed->signal;
 			/* Can't queue both a stop and a continue signal */
 			if (sig == SIGCONT)
-				sigdelsetmask(signal, SIG_KERNEL_STOP_MASK);
+				sigandnsets(signal, signal, &signal_stop_mask);
 			else if (sig_kernel_stop(sig))
 				sigdelset(signal, SIGCONT);
 			sigaddset(signal, sig);
@@ -3023,7 +3017,7 @@ static void __set_task_blocked(struct task_struct *tsk, const sigset_t *newset)
  */
 void set_current_blocked(sigset_t *newset)
 {
-	sigdelsetmask(newset, sigmask(SIGKILL) | sigmask(SIGSTOP));
+	sigdelset(newset, SIGKILL, SIGSTOP);
 	__set_current_blocked(newset);
 }
 
@@ -3150,7 +3144,7 @@ SYSCALL_DEFINE4(rt_sigprocmask, int, how, sigset_t __user *, nset,
 	if (nset) {
 		if (copy_sigset_from_user(&new_set, nset, sigsetsize))
 			return -EFAULT;
-		sigdelsetmask(&new_set, sigmask(SIGKILL)|sigmask(SIGSTOP));
+		sigdelset(&new_set, SIGKILL, SIGSTOP);
 
 		error = sigprocmask(how, &new_set, NULL);
 		if (error)
@@ -3180,7 +3174,7 @@ COMPAT_SYSCALL_DEFINE4(rt_sigprocmask, int, how, compat_sigset_t __user *, nset,
 	if (nset) {
 		if (copy_compat_sigset_from_user(&new_set, nset, sigsetsize))
 			return -EFAULT;
-		sigdelsetmask(&new_set, sigmask(SIGKILL)|sigmask(SIGSTOP));
+		sigdelset(&new_set, SIGKILL, SIGSTOP);
 
 		error = sigprocmask(how, &new_set, NULL);
 		if (error)
@@ -3586,7 +3580,7 @@ static int do_sigtimedwait(const sigset_t *which, kernel_siginfo_t *info,
 	/*
 	 * Invert the set of allowed signals to get those we want to block.
 	 */
-	sigdelsetmask(&mask, sigmask(SIGKILL) | sigmask(SIGSTOP));
+	sigdelset(&mask, SIGKILL, SIGSTOP);
 	signotset(&mask);
 
 	spin_lock_irq(&tsk->sighand->siglock);
@@ -4111,8 +4105,7 @@ int do_sigaction(int sig, struct k_sigaction *act, struct k_sigaction *oact)
 	sigaction_compat_abi(act, oact);
 
 	if (act) {
-		sigdelsetmask(&act->sa.sa_mask,
-			      sigmask(SIGKILL) | sigmask(SIGSTOP));
+		sigdelset(&act->sa.sa_mask, SIGKILL, SIGSTOP);
 		*k = *act;
 		/*
 		 * POSIX 3.3.1.3:
@@ -4126,9 +4119,7 @@ int do_sigaction(int sig, struct k_sigaction *act, struct k_sigaction *oact)
 		 *   be discarded, whether or not it is blocked"
 		 */
 		if (sig_handler_ignored(sig_handler(p, sig), sig)) {
-			sigemptyset(&mask);
-			sigaddset(&mask, sig);
-			flush_sigqueue_mask(&mask, &p->signal->shared_pending);
+			flush_sigqueue_sig(&p->signal->shared_pending, sig);
 			for_each_thread(p, t)
 				flush_sigqueue_mask(&mask, &t->pending);
 		}
@@ -4332,10 +4323,10 @@ SYSCALL_DEFINE3(sigprocmask, int, how, old_sigset_t __user *, nset,
 
 		switch (how) {
 		case SIG_BLOCK:
-			sigaddsetmask(&new_blocked, new_set);
+			compat_sigaddsetmask(&new_blocked, new_set);
 			break;
 		case SIG_UNBLOCK:
-			sigdelsetmask(&new_blocked, new_set);
+			compat_sigdelsetmask(&new_blocked, new_set);
 			break;
 		case SIG_SETMASK:
 			new_blocked.sig[0] = new_set;
@@ -4724,6 +4715,10 @@ void __init signals_init(void)
 {
 	siginfo_buildtime_checks();
 
+	sigaddset(&signal_stop_mask, SIGSTOP, SIGTSTP, SIGTTIN, SIGTTOU);
+	sigaddset(&signal_synchronous_mask, SIGSEGV, SIGBUS, SIGILL, SIGTRAP,
+		 SIGFPE, SIGSYS);
+
 	sigqueue_cachep = KMEM_CACHE(sigqueue, SLAB_PANIC | SLAB_ACCOUNT);
 }
 
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index c8b3645c9a7d..ab6ba4ec661b 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3684,7 +3684,7 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id)
 static int kvm_vcpu_ioctl_set_sigmask(struct kvm_vcpu *vcpu, sigset_t *sigset)
 {
 	if (sigset) {
-		sigdelsetmask(sigset, sigmask(SIGKILL)|sigmask(SIGSTOP));
+		sigdelset(sigset, SIGKILL, SIGSTOP);
 		vcpu->sigset_active = 1;
 		vcpu->sigset = *sigset;
 	} else
-- 
2.30.2


* [RFC PATCH 4/8] signals: Remove sigmask() macro
@ 2022-01-03 18:19   ` Walt Drummond
  0 siblings, 0 replies; 57+ messages in thread
From: Walt Drummond @ 2022-01-03 18:19 UTC (permalink / raw)
  To: Richard Henderson, Ivan Kokshaysky, Matt Turner,
	Geert Uytterhoeven, Dinh Nguyen, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	Adaptec OEM Raid Solutions, James E.J. Bottomley,
	Martin K. Petersen, Jeff Layton, Ilya Dryomov, David Woodhouse,
	Richard Weinberger, Trond Myklebust, Anna Schumaker
  Cc: linux-kernel, Walt Drummond, linux-alpha, linux-m68k, linux-scsi,
	ceph-devel, linux-mtd, linux-nfs, linux-fsdevel, kvm

The sigmask() macro can't support signal numbers larger than 64.
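
To illustrate the limit, here is a minimal user-space sketch, not
kernel code.  All demo_* and DEMO_* names are hypothetical and exist
only for this example; DEMO_NSIG is assumed to be 1024 to match the
cap proposed in the cover letter.  It contrasts a single-word mask
with word-array indexing in the style of sigset_t:

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical names for illustration; not the kernel's definitions. */
#define DEMO_NSIG   1024
#define DEMO_BPW    (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_WORDS  ((DEMO_NSIG + DEMO_BPW - 1) / DEMO_BPW)

struct demo_sigset {
	unsigned long sig[DEMO_WORDS];
};

/* A single-word mask such as 1UL << (sig - 1) only has a bit for these. */
static int demo_fits_single_word(int sig)
{
	return sig >= 1 && (size_t)sig <= DEMO_BPW;
}

/*
 * Word-array form: pick the word, then the bit inside that word.
 * The caller must pass a signal number in 1..DEMO_NSIG.
 */
static void demo_add(struct demo_sigset *set, int sig)
{
	unsigned long bit = (unsigned long)sig - 1;

	set->sig[bit / DEMO_BPW] |= 1UL << (bit % DEMO_BPW);
}

static int demo_ismember(const struct demo_sigset *set, int sig)
{
	unsigned long bit = (unsigned long)sig - 1;

	return (int)(1UL & (set->sig[bit / DEMO_BPW] >> (bit % DEMO_BPW)));
}
```

With a 64-bit unsigned long, demo_fits_single_word(65) is false, while
demo_add(&set, 65) sets bit 0 of set.sig[1].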

Remove the general usage of sigmask() and bit masks as input into the
functions that manipulate or accept sigset_t, with the exception of
compatibility cases. Use a comma-separated list of signal numbers as
input to sigaddset()/sigdelset()/... instead.
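
The comma-separated form can be sketched in user space roughly as
follows.  This is an assumption-laden illustration, not the code in
this patch: the demo_* and DEMO_* names are hypothetical, and only the
compound-literal counting trick mirrors NUM_INTARGS() from the diff
below.

```c
#include <limits.h>
#include <signal.h>
#include <stdarg.h>

/* Hypothetical names for illustration; not the kernel's definitions. */
#define DEMO_NSIG   1024
#define DEMO_BPW    ((int)(sizeof(unsigned long) * CHAR_BIT))
#define DEMO_WORDS  ((DEMO_NSIG + DEMO_BPW - 1) / DEMO_BPW)

struct demo_sigset {
	unsigned long sig[DEMO_WORDS];
};

/*
 * Count the int arguments by measuring an int[] compound literal.
 * The literal sits inside sizeof, so the arguments are not evaluated here.
 */
#define DEMO_NUM_INTARGS(...) \
	((int)(sizeof((int[]){__VA_ARGS__}) / sizeof(int)))

/* Front end: callers write demo_sigaddset(&set, SIGKILL, SIGSTOP). */
#define demo_sigaddset(set, ...) \
	demo_sigaddset_n((set), DEMO_NUM_INTARGS(__VA_ARGS__), __VA_ARGS__)

static void demo_sigaddset_n(struct demo_sigset *set, int count, ...)
{
	va_list ap;

	va_start(ap, count);
	while (count-- > 0) {
		int sig = va_arg(ap, int);

		/* Silently skip out-of-range numbers, including 0. */
		if (sig >= 1 && sig <= DEMO_NSIG) {
			unsigned long bit = (unsigned long)sig - 1;

			set->sig[bit / DEMO_BPW] |= 1UL << (bit % DEMO_BPW);
		}
	}
	va_end(ap);
}

static int demo_sigismember(const struct demo_sigset *set, int sig)
{
	unsigned long bit;

	if (sig < 1 || sig > DEMO_NSIG)
		return 0;
	bit = (unsigned long)sig - 1;
	return (int)(1UL & (set->sig[bit / DEMO_BPW] >> (bit % DEMO_BPW)));
}
```

A caller would then write demo_sigaddset(&set, SIGKILL, SIGSTOP, 100)
where it previously OR-ed sigmask() values together.  Signal 100 needs
no special handling because the count is computed at compile time and
each number is mapped to its own word.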

Signed-off-by: Walt Drummond <walt@drummond.us>
---
 arch/alpha/kernel/signal.c     |   4 +-
 arch/m68k/include/asm/signal.h |   6 +-
 arch/nios2/kernel/signal.c     |   2 -
 arch/x86/include/asm/signal.h  |   6 +-
 drivers/scsi/dpti.h            |   2 -
 fs/ceph/addr.c                 |   2 +-
 fs/jffs2/background.c          |   2 +-
 fs/lockd/svc.c                 |   1 -
 fs/signalfd.c                  |   2 +-
 include/linux/signal.h         | 254 +++++++++++++++++++++------------
 kernel/compat.c                |   6 +-
 kernel/fork.c                  |   2 +-
 kernel/ptrace.c                |   2 +-
 kernel/signal.c                | 115 +++++++--------
 virt/kvm/kvm_main.c            |   2 +-
 15 files changed, 238 insertions(+), 170 deletions(-)

diff --git a/arch/alpha/kernel/signal.c b/arch/alpha/kernel/signal.c
index bc077babafab..cae533594248 100644
--- a/arch/alpha/kernel/signal.c
+++ b/arch/alpha/kernel/signal.c
@@ -33,7 +33,7 @@
 
 #define DEBUG_SIG 0
 
-#define _BLOCKABLE (~(sigmask(SIGKILL) | sigmask(SIGSTOP)))
+#define _BLOCKABLE (~(compat_sigmask(SIGKILL) | compat_sigmask(SIGSTOP)))
 
 asmlinkage void ret_from_sys_call(void);
 
@@ -47,7 +47,7 @@ SYSCALL_DEFINE2(osf_sigprocmask, int, how, unsigned long, newmask)
 	sigset_t mask;
 	unsigned long res;
 
-	siginitset(&mask, newmask & _BLOCKABLE);
+	compat_siginitset(&mask, newmask & _BLOCKABLE);
 	res = sigprocmask(how, &mask, &oldmask);
 	if (!res) {
 		force_successful_syscall_return();
diff --git a/arch/m68k/include/asm/signal.h b/arch/m68k/include/asm/signal.h
index 8af85c38d377..464ff863c958 100644
--- a/arch/m68k/include/asm/signal.h
+++ b/arch/m68k/include/asm/signal.h
@@ -24,7 +24,7 @@ typedef struct {
 #ifndef CONFIG_CPU_HAS_NO_BITFIELDS
 #define __HAVE_ARCH_SIG_BITOPS
 
-static inline void sigaddset(sigset_t *set, int _sig)
+static inline void sigset_add(sigset_t *set, int _sig)
 {
 	asm ("bfset %0{%1,#1}"
 		: "+o" (*set)
@@ -32,7 +32,7 @@ static inline void sigaddset(sigset_t *set, int _sig)
 		: "cc");
 }
 
-static inline void sigdelset(sigset_t *set, int _sig)
+static inline void sigset_del(sigset_t *set, int _sig)
 {
 	asm ("bfclr %0{%1,#1}"
 		: "+o" (*set)
@@ -56,7 +56,7 @@ static inline int __gen_sigismember(sigset_t *set, int _sig)
 	return ret;
 }
 
-#define sigismember(set,sig)			\
+#define sigset_ismember(set, sig)		\
 	(__builtin_constant_p(sig) ?		\
 	 __const_sigismember(set,sig) :		\
 	 __gen_sigismember(set,sig))
diff --git a/arch/nios2/kernel/signal.c b/arch/nios2/kernel/signal.c
index 2009ae2d3c3b..c9db511a6989 100644
--- a/arch/nios2/kernel/signal.c
+++ b/arch/nios2/kernel/signal.c
@@ -20,8 +20,6 @@
 #include <asm/ucontext.h>
 #include <asm/cacheflush.h>
 
-#define _BLOCKABLE (~(sigmask(SIGKILL) | sigmask(SIGSTOP)))
-
 /*
  * Do a signal return; undo the signal stack.
  *
diff --git a/arch/x86/include/asm/signal.h b/arch/x86/include/asm/signal.h
index 2dfb5fea13af..9bac7c6e524c 100644
--- a/arch/x86/include/asm/signal.h
+++ b/arch/x86/include/asm/signal.h
@@ -46,7 +46,7 @@ typedef sigset_t compat_sigset_t;
 
 #define __HAVE_ARCH_SIG_BITOPS
 
-#define sigaddset(set,sig)		    \
+#define sigset_add(set, sig)		    \
 	(__builtin_constant_p(sig)	    \
 	 ? __const_sigaddset((set), (sig))  \
 	 : __gen_sigaddset((set), (sig)))
@@ -62,7 +62,7 @@ static inline void __const_sigaddset(sigset_t *set, int _sig)
 	set->sig[sig / _NSIG_BPW] |= 1 << (sig % _NSIG_BPW);
 }
 
-#define sigdelset(set, sig)		    \
+#define sigset_del(set, sig)		    \
 	(__builtin_constant_p(sig)	    \
 	 ? __const_sigdelset((set), (sig))  \
 	 : __gen_sigdelset((set), (sig)))
@@ -93,7 +93,7 @@ static inline int __gen_sigismember(sigset_t *set, int _sig)
 	return ret;
 }
 
-#define sigismember(set, sig)			\
+#define sigset_ismember(set, sig)		\
 	(__builtin_constant_p(sig)		\
 	 ? __const_sigismember((set), (sig))	\
 	 : __gen_sigismember((set), (sig)))
diff --git a/drivers/scsi/dpti.h b/drivers/scsi/dpti.h
index 8a079e8d7f65..cfcbb7d98fc0 100644
--- a/drivers/scsi/dpti.h
+++ b/drivers/scsi/dpti.h
@@ -96,8 +96,6 @@ static int adpt_device_reset(struct scsi_cmnd* cmd);
 #define PINFO(fmt, args...) printk(KERN_INFO fmt, ##args)
 #define PCRIT(fmt, args...) printk(KERN_CRIT fmt, ##args)
 
-#define SHUTDOWN_SIGS	(sigmask(SIGKILL)|sigmask(SIGINT)|sigmask(SIGTERM))
-
 // Command timeouts
 #define FOREVER			(0)
 #define TMOUT_INQUIRY 		(20)
diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 99b80b5c7a93..238b5ce5ef64 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -1333,7 +1333,7 @@ const struct address_space_operations ceph_aops = {
 static void ceph_block_sigs(sigset_t *oldset)
 {
 	sigset_t mask;
-	siginitsetinv(&mask, sigmask(SIGKILL));
+	siginitsetinv(&mask, SIGKILL);
 	sigprocmask(SIG_BLOCK, &mask, oldset);
 }
 
diff --git a/fs/jffs2/background.c b/fs/jffs2/background.c
index 2b4d5013dc5d..bb84a8b2373c 100644
--- a/fs/jffs2/background.c
+++ b/fs/jffs2/background.c
@@ -77,7 +77,7 @@ static int jffs2_garbage_collect_thread(void *_c)
 	struct jffs2_sb_info *c = _c;
 	sigset_t hupmask;
 
-	siginitset(&hupmask, sigmask(SIGHUP));
+	siginitset(&hupmask, SIGHUP);
 	allow_signal(SIGKILL);
 	allow_signal(SIGSTOP);
 	allow_signal(SIGHUP);
diff --git a/fs/lockd/svc.c b/fs/lockd/svc.c
index b632be3ad57b..3c8b56c094d0 100644
--- a/fs/lockd/svc.c
+++ b/fs/lockd/svc.c
@@ -45,7 +45,6 @@
 
 #define NLMDBG_FACILITY		NLMDBG_SVC
 #define LOCKD_BUFSIZE		(1024 + NLMSVC_XDRSIZE)
-#define ALLOWED_SIGS		(sigmask(SIGKILL))
 
 static struct svc_program	nlmsvc_program;
 
diff --git a/fs/signalfd.c b/fs/signalfd.c
index 12fdc282e299..ed024d5aad2a 100644
--- a/fs/signalfd.c
+++ b/fs/signalfd.c
@@ -270,7 +270,7 @@ static int do_signalfd4(int ufd, sigset_t *mask, int flags)
 	if (flags & ~(SFD_CLOEXEC | SFD_NONBLOCK))
 		return -EINVAL;
 
-	sigdelsetmask(mask, sigmask(SIGKILL) | sigmask(SIGSTOP));
+	sigdelset(mask, SIGKILL, SIGSTOP);
 	signotset(mask);
 
 	if (ufd == -1) {
diff --git a/include/linux/signal.h b/include/linux/signal.h
index a730f3d4615e..eaf7991fffee 100644
--- a/include/linux/signal.h
+++ b/include/linux/signal.h
@@ -53,6 +53,12 @@ enum siginfo_layout {
 
 enum siginfo_layout siginfo_layout(unsigned sig, int si_code);
 
+/* Test if 'sig' is valid signal. Use this instead of testing _NSIG directly */
+static inline int valid_signal(unsigned long sig)
+{
+	return sig <= _NSIG ? 1 : 0;
+}
+
 /* Test if 'sig' is a realtime signal.  Use this instead of testing
  * SIGRTMIN/SIGRTMAX directly.
  */
@@ -62,15 +68,20 @@ static inline int realtime_signal(unsigned long sig)
 }
 
 /*
- * Define some primitives to manipulate sigset_t.
+ * Define some primitives to manipulate individual bits in sigset_t.
+ * Don't use these directly.  Architectures can define their own
+ * versions (see arch/x86/include/asm/signal.h)
  */
 
 #ifndef __HAVE_ARCH_SIG_BITOPS
-#include <linux/bitops.h>
+#define sigset_add(set, sig)       __sigset_add(set, sig)
+#define sigset_del(set, sig)       __sigset_del(set, sig)
+#define sigset_ismember(set, sig)  __sigset_ismember(set, sig)
+#endif
 
 /* We don't use <linux/bitops.h> for these because there is no need to
    be atomic.  */
-static inline void sigaddset(sigset_t *set, int _sig)
+static inline void __sigset_add(sigset_t *set, int _sig)
 {
 	unsigned long sig = _sig - 1;
 	if (_NSIG_WORDS == 1)
@@ -79,7 +90,7 @@ static inline void sigaddset(sigset_t *set, int _sig)
 		set->sig[sig / _NSIG_BPW] |= 1UL << (sig % _NSIG_BPW);
 }
 
-static inline void sigdelset(sigset_t *set, int _sig)
+static inline void __sigset_del(sigset_t *set, int _sig)
 {
 	unsigned long sig = _sig - 1;
 	if (_NSIG_WORDS == 1)
@@ -88,33 +99,72 @@ static inline void sigdelset(sigset_t *set, int _sig)
 		set->sig[sig / _NSIG_BPW] &= ~(1UL << (sig % _NSIG_BPW));
 }
 
-static inline int sigismember(sigset_t *set, int _sig)
+static inline int __sigset_ismember(sigset_t *set, int _sig)
 {
 	unsigned long sig = _sig - 1;
 	if (_NSIG_WORDS == 1)
-		return 1 & (set->sig[0] >> sig);
+		return 1UL & (set->sig[0] >> sig);
 	else
-		return 1 & (set->sig[sig / _NSIG_BPW] >> (sig % _NSIG_BPW));
+		return 1UL & (set->sig[sig / _NSIG_BPW] >> (sig % _NSIG_BPW));
 }
 
-#endif /* __HAVE_ARCH_SIG_BITOPS */
+/* Some primitives for setting/deleting signals from sigset_t.  Use these. */
 
-static inline int sigisemptyset(sigset_t *set)
+#define NUM_INTARGS(...) (sizeof((int[]){__VA_ARGS__})/sizeof(int))
+
+#define sigdelset(x, ...) __sigdelset((x), NUM_INTARGS(__VA_ARGS__),	\
+				      __VA_ARGS__)
+static inline void __sigdelset(sigset_t *set, int count, ...)
 {
-	switch (_NSIG_WORDS) {
-	case 4:
-		return (set->sig[3] | set->sig[2] |
-			set->sig[1] | set->sig[0]) == 0;
-	case 2:
-		return (set->sig[1] | set->sig[0]) == 0;
-	case 1:
-		return set->sig[0] == 0;
-	default:
-		BUILD_BUG();
-		return 0;
+	va_list ap;
+	int sig;
+
+	va_start(ap, count);
+	while (count > 0) {
+		sig = va_arg(ap, int);
+		if (valid_signal(sig) && sig != 0)
+			sigset_del(set, sig);
+		count--;
 	}
+	va_end(ap);
+}
+
+#define sigaddset(x, ...) __sigaddset((x), NUM_INTARGS(__VA_ARGS__),	\
+				      __VA_ARGS__)
+static inline void __sigaddset(sigset_t *set, int count, ...)
+{
+	va_list ap;
+	int sig;
+
+	va_start(ap, count);
+	while (count > 0) {
+		sig = va_arg(ap, int);
+		if (valid_signal(sig) && sig != 0)
+			sigset_add(set, sig);
+		count--;
+	}
+	va_end(ap);
+}
+
+static inline int sigismember(sigset_t *set, int sig)
+{
+	if (!valid_signal(sig) || sig == 0)
+		return 0;
+	return sigset_ismember(set, sig);
 }
 
+#define siginitset(set, ...)			\
+do {						\
+	sigemptyset((set));			\
+	sigaddset((set), __VA_ARGS__);		\
+} while (0)
+
+#define siginitsetinv(set, ...)			\
+do {					        \
+	sigfillset((set));			\
+	sigdelset((set), __VA_ARGS__);		\
+} while (0)
+
 static inline int sigequalsets(const sigset_t *set1, const sigset_t *set2)
 {
 	switch (_NSIG_WORDS) {
@@ -128,11 +178,18 @@ static inline int sigequalsets(const sigset_t *set1, const sigset_t *set2)
 			(set1->sig[0] == set2->sig[0]);
 	case 1:
 		return	set1->sig[0] == set2->sig[0];
+	default:
+		return memcmp(set1, set2, sizeof(sigset_t)) == 0;
 	}
 	return 0;
 }
 
-#define sigmask(sig)	(1UL << ((sig) - 1))
+static inline int sigisemptyset(sigset_t *set)
+{
+	sigset_t empty = {0};
+
+	return sigequalsets(set, &empty);
+}
 
 #ifndef __HAVE_ARCH_SIG_SETOPS
 #include <linux/string.h>
@@ -141,6 +198,7 @@ static inline int sigequalsets(const sigset_t *set1, const sigset_t *set2)
 static inline void name(sigset_t *r, const sigset_t *a, const sigset_t *b) \
 {									\
 	unsigned long a0, a1, a2, a3, b0, b1, b2, b3;			\
+	int i;								\
 									\
 	switch (_NSIG_WORDS) {						\
 	case 4:								\
@@ -158,7 +216,9 @@ static inline void name(sigset_t *r, const sigset_t *a, const sigset_t *b) \
 		r->sig[0] = op(a0, b0);					\
 		break;							\
 	default:							\
-		BUILD_BUG();						\
+		for (i = 0; i < _NSIG_WORDS; i++)			\
+			r->sig[i] = op(a->sig[i], b->sig[i]);		\
+		break;							\
 	}								\
 }
 
@@ -179,6 +239,8 @@ _SIG_SET_BINOP(sigandnsets, _sig_andn)
 #define _SIG_SET_OP(name, op)						\
 static inline void name(sigset_t *set)					\
 {									\
+	int i;								\
+									\
 	switch (_NSIG_WORDS) {						\
 	case 4:	set->sig[3] = op(set->sig[3]);				\
 		set->sig[2] = op(set->sig[2]);				\
@@ -188,7 +250,9 @@ static inline void name(sigset_t *set)					\
 	case 1:	set->sig[0] = op(set->sig[0]);				\
 		    break;						\
 	default:							\
-		BUILD_BUG();						\
+		for (i = 0; i < _NSIG_WORDS; i++)			\
+			set->sig[i] = op(set->sig[i]);			\
+		break;							\
 	}								\
 }
 
@@ -224,24 +288,13 @@ static inline void sigfillset(sigset_t *set)
 	}
 }
 
-/* Some extensions for manipulating the low 32 signals in particular.  */
+#endif /* __HAVE_ARCH_SIG_SETOPS */
 
-static inline void sigaddsetmask(sigset_t *set, unsigned long mask)
-{
-	set->sig[0] |= mask;
-}
+/* Primitives for handling the compat (first long) sigset_t */
 
-static inline void sigdelsetmask(sigset_t *set, unsigned long mask)
-{
-	set->sig[0] &= ~mask;
-}
+#define compat_sigmask(sig)       (1UL << ((sig) - 1))
 
-static inline int sigtestsetmask(sigset_t *set, unsigned long mask)
-{
-	return (set->sig[0] & mask) != 0;
-}
-
-static inline void siginitset(sigset_t *set, unsigned long mask)
+static inline void compat_siginitset(sigset_t *set, unsigned long mask)
 {
 	set->sig[0] = mask;
 	switch (_NSIG_WORDS) {
@@ -254,7 +307,7 @@ static inline void siginitset(sigset_t *set, unsigned long mask)
 	}
 }
 
-static inline void siginitsetinv(sigset_t *set, unsigned long mask)
+static inline void compat_siginitsetinv(sigset_t *set, unsigned long mask)
 {
 	set->sig[0] = ~mask;
 	switch (_NSIG_WORDS) {
@@ -267,7 +320,21 @@ static inline void siginitsetinv(sigset_t *set, unsigned long mask)
 	}
 }
 
-#endif /* __HAVE_ARCH_SIG_SETOPS */
+static inline void compat_sigaddsetmask(sigset_t *set, unsigned long mask)
+{
+	set->sig[0] |= mask;
+}
+
+static inline void compat_sigdelsetmask(sigset_t *set, unsigned long mask)
+{
+	set->sig[0] &= ~mask;
+}
+
+static inline int compat_sigtestsetmask(sigset_t *set, unsigned long mask)
+{
+	return (set->sig[0] & mask) != 0;
+}
+
 
 /* Safely copy a sigset_t from user space handling any differences in
  * size between user space and kernel sigset_t.  We don't use
@@ -338,12 +405,6 @@ static inline void init_sigpending(struct sigpending *sig)
 
 extern void flush_sigqueue(struct sigpending *queue);
 
-/* Test if 'sig' is valid signal. Use this instead of testing _NSIG directly */
-static inline int valid_signal(unsigned long sig)
-{
-	return sig <= _NSIG ? 1 : 0;
-}
-
 struct timespec;
 struct pt_regs;
 enum pid_type;
@@ -470,55 +531,72 @@ extern bool unhandled_signal(struct task_struct *tsk, int sig);
  * default action of stopping the process may happen later or never.
  */
 
+static inline int sig_kernel_stop(unsigned long sig)
+{
+	return	sig == SIGSTOP ||
+		sig == SIGTSTP ||
+		sig == SIGTTIN ||
+		sig == SIGTTOU;
+}
+
+static inline int sig_kernel_ignore(unsigned long sig)
+{
+	return	sig == SIGCONT	||
+		sig == SIGCHLD	||
+		sig == SIGWINCH ||
+		sig == SIGURG;
+}
+
+static inline int sig_kernel_only(unsigned long sig)
+{
+	return	sig == SIGKILL ||
+		sig == SIGSTOP;
+}
+
+static inline int sig_kernel_coredump(unsigned long sig)
+{
+	return	sig == SIGQUIT ||
+		sig == SIGILL  ||
+		sig == SIGTRAP ||
+		sig == SIGABRT ||
+		sig == SIGFPE  ||
+		sig == SIGSEGV ||
+		sig == SIGBUS  ||
+		sig == SIGSYS  ||
+		sig == SIGXCPU ||
 #ifdef SIGEMT
-#define SIGEMT_MASK	rt_sigmask(SIGEMT)
-#else
-#define SIGEMT_MASK	0
+		sig == SIGEMT  ||
 #endif
+		sig == SIGXFSZ;
+}
 
-#if SIGRTMIN > BITS_PER_LONG
-#define rt_sigmask(sig)	(1ULL << ((sig)-1))
-#else
-#define rt_sigmask(sig)	sigmask(sig)
+static inline int sig_specific_sicodes(unsigned long sig)
+{
+	return	sig == SIGILL  ||
+		sig == SIGFPE  ||
+		sig == SIGSEGV ||
+		sig == SIGBUS  ||
+		sig == SIGTRAP ||
+		sig == SIGCHLD ||
+		sig == SIGPOLL ||
+#ifdef SIGEMT
+		sig == SIGEMT  ||
 #endif
+		sig == SIGSYS;
+}
 
-#define siginmask(sig, mask) \
-	((sig) > 0 && (sig) < SIGRTMIN && (rt_sigmask(sig) & (mask)))
-
-#define SIG_KERNEL_ONLY_MASK (\
-	rt_sigmask(SIGKILL)   |  rt_sigmask(SIGSTOP))
-
-#define SIG_KERNEL_STOP_MASK (\
-	rt_sigmask(SIGSTOP)   |  rt_sigmask(SIGTSTP)   | \
-	rt_sigmask(SIGTTIN)   |  rt_sigmask(SIGTTOU)   )
-
-#define SIG_KERNEL_COREDUMP_MASK (\
-        rt_sigmask(SIGQUIT)   |  rt_sigmask(SIGILL)    | \
-	rt_sigmask(SIGTRAP)   |  rt_sigmask(SIGABRT)   | \
-        rt_sigmask(SIGFPE)    |  rt_sigmask(SIGSEGV)   | \
-	rt_sigmask(SIGBUS)    |  rt_sigmask(SIGSYS)    | \
-        rt_sigmask(SIGXCPU)   |  rt_sigmask(SIGXFSZ)   | \
-	SIGEMT_MASK				       )
-
-#define SIG_KERNEL_IGNORE_MASK (\
-        rt_sigmask(SIGCONT)   |  rt_sigmask(SIGCHLD)   | \
-	rt_sigmask(SIGWINCH)  |  rt_sigmask(SIGURG)    )
-
-#define SIG_SPECIFIC_SICODES_MASK (\
-	rt_sigmask(SIGILL)    |  rt_sigmask(SIGFPE)    | \
-	rt_sigmask(SIGSEGV)   |  rt_sigmask(SIGBUS)    | \
-	rt_sigmask(SIGTRAP)   |  rt_sigmask(SIGCHLD)   | \
-	rt_sigmask(SIGPOLL)   |  rt_sigmask(SIGSYS)    | \
-	SIGEMT_MASK                                    )
-
-#define sig_kernel_only(sig)		siginmask(sig, SIG_KERNEL_ONLY_MASK)
-#define sig_kernel_coredump(sig)	siginmask(sig, SIG_KERNEL_COREDUMP_MASK)
-#define sig_kernel_ignore(sig)		siginmask(sig, SIG_KERNEL_IGNORE_MASK)
-#define sig_kernel_stop(sig)		siginmask(sig, SIG_KERNEL_STOP_MASK)
-#define sig_specific_sicodes(sig)	siginmask(sig, SIG_SPECIFIC_SICODES_MASK)
+static inline int synchronous_signal(unsigned long sig)
+{
+	return	sig == SIGSEGV ||
+		sig == SIGBUS  ||
+		sig == SIGILL  ||
+		sig == SIGTRAP ||
+		sig == SIGFPE  ||
+		sig == SIGSYS;
+}
 
 #define sig_fatal(t, signr) \
-	(!siginmask(signr, SIG_KERNEL_IGNORE_MASK|SIG_KERNEL_STOP_MASK) && \
+	(!(sig_kernel_ignore(signr) ||	sig_kernel_stop(signr)) &&	\
 	 (t)->sighand->action[(signr)-1].sa.sa_handler == SIG_DFL)
 
 void signals_init(void);
diff --git a/kernel/compat.c b/kernel/compat.c
index cc2438f4070c..26ffd271444c 100644
--- a/kernel/compat.c
+++ b/kernel/compat.c
@@ -49,16 +49,16 @@ COMPAT_SYSCALL_DEFINE3(sigprocmask, int, how,
 	if (nset) {
 		if (get_user(new_set, nset))
 			return -EFAULT;
-		new_set &= ~(sigmask(SIGKILL) | sigmask(SIGSTOP));
+		new_set &= ~(compat_sigmask(SIGKILL) | compat_sigmask(SIGSTOP));
 
 		new_blocked = current->blocked;
 
 		switch (how) {
 		case SIG_BLOCK:
-			sigaddsetmask(&new_blocked, new_set);
+			compat_sigaddsetmask(&new_blocked, new_set);
 			break;
 		case SIG_UNBLOCK:
-			sigdelsetmask(&new_blocked, new_set);
+			compat_sigdelsetmask(&new_blocked, new_set);
 			break;
 		case SIG_SETMASK:
 			compat_sig_setmask(&new_blocked, new_set);
diff --git a/kernel/fork.c b/kernel/fork.c
index 38681ad44c76..8b07f0090b82 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2032,7 +2032,7 @@ static __latent_entropy struct task_struct *copy_process(
 		 * fatal or STOP
 		 */
 		p->flags |= PF_IO_WORKER;
-		siginitsetinv(&p->blocked, sigmask(SIGKILL)|sigmask(SIGSTOP));
+		siginitsetinv(&p->blocked, SIGKILL, SIGSTOP);
 	}
 
 	/*
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 2f7ee345a629..200b99d39878 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -1102,7 +1102,7 @@ int ptrace_request(struct task_struct *child, long request,
 		if (ret)
 			break;
 
-		sigdelsetmask(&new_set, sigmask(SIGKILL)|sigmask(SIGSTOP));
+		sigdelset(&new_set, SIGKILL, SIGSTOP);
 
 		/*
 		 * Every thread does recalc_sigpending() after resume, so
diff --git a/kernel/signal.c b/kernel/signal.c
index a2f0e38ba934..9421f1112b20 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -64,6 +64,9 @@ static struct kmem_cache *sigqueue_cachep;
 
 int print_fatal_signals __read_mostly;
 
+sigset_t signal_stop_mask;
+sigset_t signal_synchronous_mask;
+
 static void __user *sig_handler(struct task_struct *t, int sig)
 {
 	return t->sighand->action[sig - 1].sa.sa_handler;
@@ -199,55 +202,26 @@ void calculate_sigpending(void)
 }
 
 /* Given the mask, find the first available signal that should be serviced. */
-
-#define SYNCHRONOUS_MASK \
-	(sigmask(SIGSEGV) | sigmask(SIGBUS) | sigmask(SIGILL) | \
-	 sigmask(SIGTRAP) | sigmask(SIGFPE) | sigmask(SIGSYS))
-
 int next_signal(struct sigpending *pending, sigset_t *mask)
 {
-	unsigned long i, *s, *m, x;
-	int sig = 0;
+	int i, sig;
+	sigset_t pend, s;
 
-	s = pending->signal.sig;
-	m = mask->sig;
+	sigandnsets(&pend, &pending->signal, mask);
 
-	/*
-	 * Handle the first word specially: it contains the
-	 * synchronous signals that need to be dequeued first.
-	 */
-	x = *s &~ *m;
-	if (x) {
-		if (x & SYNCHRONOUS_MASK)
-			x &= SYNCHRONOUS_MASK;
-		sig = ffz(~x) + 1;
-		return sig;
-	}
+	/* Handle synchronous signals first */
+	sigandsets(&s, &pend, &signal_synchronous_mask);
+	if (!sigisemptyset(&s))
+		pend = s;
 
-	switch (_NSIG_WORDS) {
-	default:
-		for (i = 1; i < _NSIG_WORDS; ++i) {
-			x = *++s &~ *++m;
-			if (!x)
-				continue;
-			sig = ffz(~x) + i*_NSIG_BPW + 1;
-			break;
+	for (i = 0; i < _NSIG_WORDS; i++) {
+		if (pend.sig[i] != 0) {
+			sig = ffz(~pend.sig[i]) + i*_NSIG_BPW + 1;
+			return sig;
 		}
-		break;
-
-	case 2:
-		x = s[1] &~ m[1];
-		if (!x)
-			break;
-		sig = ffz(~x) + _NSIG_BPW + 1;
-		break;
-
-	case 1:
-		/* Nothing to do */
-		break;
 	}
 
-	return sig;
+	return 0;
 }
 
 static inline void print_dropped_signal(int sig)
@@ -709,11 +683,14 @@ static int dequeue_synchronous_signal(kernel_siginfo_t *info)
 	struct task_struct *tsk = current;
 	struct sigpending *pending = &tsk->pending;
 	struct sigqueue *q, *sync = NULL;
+	sigset_t s;
 
 	/*
 	 * Might a synchronous signal be in the queue?
 	 */
-	if (!((pending->signal.sig[0] & ~tsk->blocked.sig[0]) & SYNCHRONOUS_MASK))
+	sigandnsets(&s, &pending->signal, &tsk->blocked);
+	sigandsets(&s, &s, &signal_synchronous_mask);
+	if (sigisemptyset(&s))
 		return 0;
 
 	/*
@@ -722,7 +699,7 @@ static int dequeue_synchronous_signal(kernel_siginfo_t *info)
 	list_for_each_entry(q, &pending->list, list) {
 		/* Synchronous signals have a positive si_code */
 		if ((q->info.si_code > SI_USER) &&
-		    (sigmask(q->info.si_signo) & SYNCHRONOUS_MASK)) {
+		    synchronous_signal(q->info.si_signo)) {
 			sync = q;
 			goto next;
 		}
@@ -795,6 +772,25 @@ static void flush_sigqueue_mask(sigset_t *mask, struct sigpending *s)
 	}
 }
 
+#define flush_sigqueue_sig(x, ...) __flush_sigqueue_sig((x),		\
+					NUM_INTARGS(__VA_ARGS__), __VA_ARGS__)
+static void __flush_sigqueue_sig(struct sigpending *s, int count, ...)
+{
+	va_list ap;
+	sigset_t mask;
+	int sig;
+
+	sigemptyset(&mask);
+	va_start(ap, count);
+	while (count > 0) {
+		sig = va_arg(ap, int);
+		if (valid_signal(sig) && sig != 0)
+			sigset_add(&mask, sig);
+		count--;
+	}
+	flush_sigqueue_mask(&mask, s);
+}
+
 static inline int is_si_special(const struct kernel_siginfo *info)
 {
 	return info <= SEND_SIG_PRIV;
@@ -913,8 +909,7 @@ static bool prepare_signal(int sig, struct task_struct *p, bool force)
 		/*
 		 * This is a stop signal.  Remove SIGCONT from all queues.
 		 */
-		siginitset(&flush, sigmask(SIGCONT));
-		flush_sigqueue_mask(&flush, &signal->shared_pending);
+		flush_sigqueue_sig(&signal->shared_pending, SIGCONT);
 		for_each_thread(p, t)
 			flush_sigqueue_mask(&flush, &t->pending);
 	} else if (sig == SIGCONT) {
@@ -922,10 +917,9 @@ static bool prepare_signal(int sig, struct task_struct *p, bool force)
 		/*
 		 * Remove all stop signals from all queues, wake all threads.
 		 */
-		siginitset(&flush, SIG_KERNEL_STOP_MASK);
-		flush_sigqueue_mask(&flush, &signal->shared_pending);
+		flush_sigqueue_mask(&signal_stop_mask, &signal->shared_pending);
 		for_each_thread(p, t) {
-			flush_sigqueue_mask(&flush, &t->pending);
+			flush_sigqueue_mask(&signal_stop_mask, &t->pending);
 			task_clear_jobctl_pending(t, JOBCTL_STOP_PENDING);
 			if (likely(!(t->ptrace & PT_SEIZED)))
 				wake_up_state(t, __TASK_STOPPED);
@@ -1172,7 +1166,7 @@ static int __send_signal(int sig, struct kernel_siginfo *info, struct task_struc
 			sigset_t *signal = &delayed->signal;
 			/* Can't queue both a stop and a continue signal */
 			if (sig == SIGCONT)
-				sigdelsetmask(signal, SIG_KERNEL_STOP_MASK);
+				sigandnsets(signal, signal, &signal_stop_mask);
 			else if (sig_kernel_stop(sig))
 				sigdelset(signal, SIGCONT);
 			sigaddset(signal, sig);
@@ -3023,7 +3017,7 @@ static void __set_task_blocked(struct task_struct *tsk, const sigset_t *newset)
  */
 void set_current_blocked(sigset_t *newset)
 {
-	sigdelsetmask(newset, sigmask(SIGKILL) | sigmask(SIGSTOP));
+	sigdelset(newset, SIGKILL, SIGSTOP);
 	__set_current_blocked(newset);
 }
 
@@ -3150,7 +3144,7 @@ SYSCALL_DEFINE4(rt_sigprocmask, int, how, sigset_t __user *, nset,
 	if (nset) {
 		if (copy_sigset_from_user(&new_set, nset, sigsetsize))
 			return -EFAULT;
-		sigdelsetmask(&new_set, sigmask(SIGKILL)|sigmask(SIGSTOP));
+		sigdelset(&new_set, SIGKILL, SIGSTOP);
 
 		error = sigprocmask(how, &new_set, NULL);
 		if (error)
@@ -3180,7 +3174,7 @@ COMPAT_SYSCALL_DEFINE4(rt_sigprocmask, int, how, compat_sigset_t __user *, nset,
 	if (nset) {
 		if (copy_compat_sigset_from_user(&new_set, nset, sigsetsize))
 			return -EFAULT;
-		sigdelsetmask(&new_set, sigmask(SIGKILL)|sigmask(SIGSTOP));
+		sigdelset(&new_set, SIGKILL, SIGSTOP);
 
 		error = sigprocmask(how, &new_set, NULL);
 		if (error)
@@ -3586,7 +3580,7 @@ static int do_sigtimedwait(const sigset_t *which, kernel_siginfo_t *info,
 	/*
 	 * Invert the set of allowed signals to get those we want to block.
 	 */
-	sigdelsetmask(&mask, sigmask(SIGKILL) | sigmask(SIGSTOP));
+	sigdelset(&mask, SIGKILL, SIGSTOP);
 	signotset(&mask);
 
 	spin_lock_irq(&tsk->sighand->siglock);
@@ -4111,8 +4105,7 @@ int do_sigaction(int sig, struct k_sigaction *act, struct k_sigaction *oact)
 	sigaction_compat_abi(act, oact);
 
 	if (act) {
-		sigdelsetmask(&act->sa.sa_mask,
-			      sigmask(SIGKILL) | sigmask(SIGSTOP));
+		sigdelset(&act->sa.sa_mask, SIGKILL, SIGSTOP);
 		*k = *act;
 		/*
 		 * POSIX 3.3.1.3:
@@ -4126,9 +4119,7 @@ int do_sigaction(int sig, struct k_sigaction *act, struct k_sigaction *oact)
 		 *   be discarded, whether or not it is blocked"
 		 */
 		if (sig_handler_ignored(sig_handler(p, sig), sig)) {
-			sigemptyset(&mask);
-			sigaddset(&mask, sig);
-			flush_sigqueue_mask(&mask, &p->signal->shared_pending);
+			flush_sigqueue_sig(&p->signal->shared_pending, sig);
 			for_each_thread(p, t)
 				flush_sigqueue_mask(&mask, &t->pending);
 		}
@@ -4332,10 +4323,10 @@ SYSCALL_DEFINE3(sigprocmask, int, how, old_sigset_t __user *, nset,
 
 		switch (how) {
 		case SIG_BLOCK:
-			sigaddsetmask(&new_blocked, new_set);
+			compat_sigaddsetmask(&new_blocked, new_set);
 			break;
 		case SIG_UNBLOCK:
-			sigdelsetmask(&new_blocked, new_set);
+			compat_sigdelsetmask(&new_blocked, new_set);
 			break;
 		case SIG_SETMASK:
 			new_blocked.sig[0] = new_set;
@@ -4724,6 +4715,10 @@ void __init signals_init(void)
 {
 	siginfo_buildtime_checks();
 
+	sigaddset(&signal_stop_mask, SIGSTOP, SIGTSTP, SIGTTIN, SIGTTOU);
+	sigaddset(&signal_synchronous_mask, SIGSEGV, SIGBUS, SIGILL, SIGTRAP,
+		 SIGFPE, SIGSYS);
+
 	sigqueue_cachep = KMEM_CACHE(sigqueue, SLAB_PANIC | SLAB_ACCOUNT);
 }
 
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index c8b3645c9a7d..ab6ba4ec661b 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3684,7 +3684,7 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id)
 static int kvm_vcpu_ioctl_set_sigmask(struct kvm_vcpu *vcpu, sigset_t *sigset)
 {
 	if (sigset) {
-		sigdelsetmask(sigset, sigmask(SIGKILL)|sigmask(SIGSTOP));
+		sigdelset(sigset, SIGKILL, SIGSTOP);
 		vcpu->sigset_active = 1;
 		vcpu->sigset = *sigset;
 	} else
-- 
2.30.2



* [RFC PATCH 5/8] signals: Better support cases where _NSIG_WORDS is greater than 2
  2022-01-03 18:19 ` Walt Drummond
                   ` (5 preceding siblings ...)
  (?)
@ 2022-01-03 18:19 ` Walt Drummond
  -1 siblings, 0 replies; 57+ messages in thread
From: Walt Drummond @ 2022-01-03 18:19 UTC (permalink / raw)
  Cc: linux-kernel, Walt Drummond, linux-fsdevel

Directly handle the now more common cases where _NSIG_WORDS could be 3
or 4.
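
The unrolling pattern used in this patch can be sketched outside the
kernel as follows.  This is only an illustration under assumptions:
demo_sigset, DEMO_MAX_WORDS and demo_sigandsets() are hypothetical names,
and the word count is a runtime argument here, whereas in the kernel
_NSIG_WORDS is a compile-time constant so the compiler keeps only the
matching branch.

```c
#include <stddef.h>

/* Hypothetical upper bound: 1024 signals / 64 bits per word. */
#define DEMO_MAX_WORDS 16

struct demo_sigset {
	unsigned long sig[DEMO_MAX_WORDS];
};

/*
 * r = a & b over the first nwords words.  Word counts 1..4 are
 * unrolled through fallthrough cases; any other count uses the loop.
 */
static inline void demo_sigandsets(struct demo_sigset *r,
				   const struct demo_sigset *a,
				   const struct demo_sigset *b,
				   size_t nwords)
{
	size_t i;

	switch (nwords) {
	default:
		for (i = 0; i < nwords && i < DEMO_MAX_WORDS; i++)
			r->sig[i] = a->sig[i] & b->sig[i];
		break;
	case 4:
		r->sig[3] = a->sig[3] & b->sig[3];
		/* fall through */
	case 3:
		r->sig[2] = a->sig[2] & b->sig[2];
		/* fall through */
	case 2:
		r->sig[1] = a->sig[1] & b->sig[1];
		/* fall through */
	case 1:
		r->sig[0] = a->sig[0] & b->sig[0];
		break;
	}
}
```

Calling demo_sigandsets() with nwords == 3 enters at "case 3" and touches
words 2, 1 and 0 only, which is the case the old 4/2/1 switch had no
direct branch for.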

Signed-off-by: Walt Drummond <walt@drummond.us>
---
 fs/proc/array.c        |  3 +-
 include/linux/signal.h | 74 +++++++++++++++++++++++++++---------------
 kernel/signal.c        |  5 +++
 3 files changed, 55 insertions(+), 27 deletions(-)

diff --git a/fs/proc/array.c b/fs/proc/array.c
index 49be8c8ef555..f37c03077b58 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -223,7 +223,8 @@ void render_sigset_t(struct seq_file *m, const char *header,
 
 	seq_puts(m, header);
 
-	i = _NSIG;
+	/* Round up when _NSIG isn't a multiple of 4 */
+	i = (_NSIG + 3) & ~0x03;
 	do {
 		int x = 0;
 
diff --git a/include/linux/signal.h b/include/linux/signal.h
index eaf7991fffee..4084b765a6cc 100644
--- a/include/linux/signal.h
+++ b/include/linux/signal.h
@@ -168,18 +168,22 @@ do {					        \
 static inline int sigequalsets(const sigset_t *set1, const sigset_t *set2)
 {
 	switch (_NSIG_WORDS) {
+	default:
+		return memcmp(set1, set2, sizeof(sigset_t)) == 0;
 	case 4:
 		return	(set1->sig[3] == set2->sig[3]) &&
 			(set1->sig[2] == set2->sig[2]) &&
 			(set1->sig[1] == set2->sig[1]) &&
 			(set1->sig[0] == set2->sig[0]);
+	case 3:
+		return  (set1->sig[2] == set2->sig[2]) &&
+			(set1->sig[1] == set2->sig[1]) &&
+			(set1->sig[0] == set2->sig[0]);
 	case 2:
 		return	(set1->sig[1] == set2->sig[1]) &&
 			(set1->sig[0] == set2->sig[0]);
 	case 1:
 		return	set1->sig[0] == set2->sig[0];
-	default:
-		return memcmp(set1, set2, sizeof(sigset_t)) == 0;
 	}
 	return 0;
 }
@@ -197,27 +201,24 @@ static inline int sigisemptyset(sigset_t *set)
 #define _SIG_SET_BINOP(name, op)					\
 static inline void name(sigset_t *r, const sigset_t *a, const sigset_t *b) \
 {									\
-	unsigned long a0, a1, a2, a3, b0, b1, b2, b3;			\
 	int i;								\
 									\
 	switch (_NSIG_WORDS) {						\
+	default:							\
+		for (i = 0; i < _NSIG_WORDS; i++)			\
+			r->sig[i] = op(a->sig[i], b->sig[i]);		\
+		break;							\
 	case 4:								\
-		a3 = a->sig[3]; a2 = a->sig[2];				\
-		b3 = b->sig[3]; b2 = b->sig[2];				\
-		r->sig[3] = op(a3, b3);					\
-		r->sig[2] = op(a2, b2);					\
+		r->sig[3] = op(a->sig[3], b->sig[3]);			\
+		fallthrough;						\
+	case 3:								\
+		r->sig[2] = op(a->sig[2], b->sig[2]);			\
 		fallthrough;						\
 	case 2:								\
-		a1 = a->sig[1]; b1 = b->sig[1];				\
-		r->sig[1] = op(a1, b1);					\
+		r->sig[1] = op(a->sig[1], b->sig[1]);			\
 		fallthrough;						\
 	case 1:								\
-		a0 = a->sig[0]; b0 = b->sig[0];				\
-		r->sig[0] = op(a0, b0);					\
-		break;							\
-	default:							\
-		for (i = 0; i < _NSIG_WORDS; i++)			\
-			r->sig[i] = op(a->sig[i], b->sig[i]);		\
+		r->sig[0] = op(a->sig[0], b->sig[0]);			\
 		break;							\
 	}								\
 }
@@ -242,17 +243,22 @@ static inline void name(sigset_t *set)					\
 	int i;								\
 									\
 	switch (_NSIG_WORDS) {						\
-	case 4:	set->sig[3] = op(set->sig[3]);				\
-		set->sig[2] = op(set->sig[2]);				\
-		fallthrough;						\
-	case 2:	set->sig[1] = op(set->sig[1]);				\
-		fallthrough;						\
-	case 1:	set->sig[0] = op(set->sig[0]);				\
-		    break;						\
 	default:							\
 		for (i = 0; i < _NSIG_WORDS; i++)			\
 			set->sig[i] = op(set->sig[i]);			\
 		break;							\
+	case 4:								\
+		set->sig[3] = op(set->sig[3]);				\
+		fallthrough;						\
+	case 3:								\
+		set->sig[2] = op(set->sig[2]);				\
+		fallthrough;						\
+	case 2:								\
+		set->sig[1] = op(set->sig[1]);				\
+		fallthrough;						\
+	case 1:								\
+		set->sig[0] = op(set->sig[0]);				\
+		break;							\
 	}								\
 }
 
@@ -268,9 +274,17 @@ static inline void sigemptyset(sigset_t *set)
 	default:
 		memset(set, 0, sizeof(sigset_t));
 		break;
-	case 2: set->sig[1] = 0;
+	case 4:
+		set->sig[3] = 0;
+		fallthrough;
+	case 3:
+		set->sig[2] = 0;
 		fallthrough;
-	case 1:	set->sig[0] = 0;
+	case 2:
+		set->sig[1] = 0;
+		fallthrough;
+	case 1:
+		set->sig[0] = 0;
 		break;
 	}
 }
@@ -281,9 +295,17 @@ static inline void sigfillset(sigset_t *set)
 	default:
 		memset(set, -1, sizeof(sigset_t));
 		break;
-	case 2: set->sig[1] = -1;
+	case 4:
+		set->sig[3] = -1;
 		fallthrough;
-	case 1:	set->sig[0] = -1;
+	case 3:
+		set->sig[2] = -1;
+		fallthrough;
+	case 2:
+		set->sig[1] = -1;
+		fallthrough;
+	case 1:
+		set->sig[0] = -1;
 		break;
 	}
 }
diff --git a/kernel/signal.c b/kernel/signal.c
index 9421f1112b20..9c846a017201 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -143,6 +143,11 @@ static inline bool has_pending_signals(sigset_t *signal, sigset_t *blocked)
 		ready |= signal->sig[0] &~ blocked->sig[0];
 		break;
 
+	case 3: ready  = signal->sig[2] &~ blocked->sig[2];
+		ready |= signal->sig[1] &~ blocked->sig[1];
+		ready |= signal->sig[0] &~ blocked->sig[0];
+		break;
+
 	case 2: ready  = signal->sig[1] &~ blocked->sig[1];
 		ready |= signal->sig[0] &~ blocked->sig[0];
 		break;
-- 
2.30.2



* [RFC PATCH 6/8] signals: Round up _NSIG_WORDS
  2022-01-03 18:19 ` Walt Drummond
                   ` (6 preceding siblings ...)
  (?)
@ 2022-01-03 18:19 ` Walt Drummond
  -1 siblings, 0 replies; 57+ messages in thread
From: Walt Drummond @ 2022-01-03 18:19 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Arnd Bergmann
  Cc: linux-kernel, Walt Drummond, linux-arch

When needed, round _NSIG_WORDS up for generic and x86 architectures.
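
To see why the rounding matters, here is a small sketch of the two
calculations.  The DEMO_* macros and demo_sig_fits() are hypothetical
helpers for illustration; only the two formulas are taken from the diff
below.

```c
/* Old calculation: truncating division. */
#define DEMO_WORDS_TRUNC(nsig, bpw)	((nsig) / (bpw))
/* New calculation: round up to a whole number of words. */
#define DEMO_WORDS_ROUNDUP(nsig, bpw)	(((nsig) + (bpw) - 1) / (bpw))

/* Nonzero if the bit for signal 'sig' (1-based) fits in 'words' words. */
static inline int demo_sig_fits(unsigned int sig, unsigned int words,
				unsigned int bpw)
{
	return sig >= 1 && (sig - 1) / bpw < words;
}
```

With _NSIG = 65 and 64-bit words, the truncating form yields one word, so
signal 65 has no bit to live in; the rounded form yields two words.  For
_NSIG = 64 both forms give one word, so existing configurations are
unchanged.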

Signed-off-by: Walt Drummond <walt@drummond.us>
---
 arch/x86/include/asm/signal.h     | 2 +-
 include/uapi/asm-generic/signal.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/signal.h b/arch/x86/include/asm/signal.h
index 9bac7c6e524c..d8e2efe6cd46 100644
--- a/arch/x86/include/asm/signal.h
+++ b/arch/x86/include/asm/signal.h
@@ -16,7 +16,7 @@
 # define _NSIG_BPW	64
 #endif
 
-#define _NSIG_WORDS	(_NSIG / _NSIG_BPW)
+#define _NSIG_WORDS	((_NSIG + _NSIG_BPW - 1) / _NSIG_BPW)
 
 typedef unsigned long old_sigset_t;		/* at least 32 bits */
 
diff --git a/include/uapi/asm-generic/signal.h b/include/uapi/asm-generic/signal.h
index f634822906e4..3c4cc9b8378e 100644
--- a/include/uapi/asm-generic/signal.h
+++ b/include/uapi/asm-generic/signal.h
@@ -6,7 +6,7 @@
 
 #define _NSIG		64
 #define _NSIG_BPW	__BITS_PER_LONG
-#define _NSIG_WORDS	(_NSIG / _NSIG_BPW)
+#define _NSIG_WORDS	((_NSIG + _NSIG_BPW - 1) / _NSIG_BPW)
 
 #define SIGHUP		 1
 #define SIGINT		 2
-- 
2.30.2



* [RFC PATCH 7/8] signals: Add signal debugging
  2022-01-03 18:19 ` Walt Drummond
                   ` (7 preceding siblings ...)
  (?)
@ 2022-01-03 18:19 ` Walt Drummond
  -1 siblings, 0 replies; 57+ messages in thread
From: Walt Drummond @ 2022-01-03 18:19 UTC (permalink / raw)
  To: Luis Chamberlain, Kees Cook, Iurii Zaikin
  Cc: linux-kernel, Walt Drummond, linux-fsdevel

Add CONFIG_SIGNALS_DEBUG, which provides /proc/sys/kernel/sigset_size,
/proc/sys/kernel/compat_sigset_size (if CONFIG_COMPAT is enabled),
/proc/sys/kernel/max_signal and /proc/sys/kernel/sigrtmax to report the
sigset sizes, the maximum signal number (_NSIG) and the value of SIGRTMAX
respectively.  This also adds /proc/<pid>/signal; writing a decimal
signal number to it sends that signal to <pid> without going through libc.
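
A user-space sketch of driving /proc/<pid>/signal might look like the
following.  It assumes a kernel built with this patch and
CONFIG_SIGNALS_DEBUG; demo_format_signal() and demo_send_signal() are
hypothetical helper names.  The length handling reflects the check in
proc_signal_write() below, which rejects writes shorter than 2 or longer
than 4 bytes, so a single digit is padded with a newline and a four-digit
number is written without one.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Format 'sig' as decimal text suitable for one write() to
 * /proc/<pid>/signal.  Returns the number of bytes to write (2..4),
 * or -1 if the number cannot be expressed within that window.
 */
int demo_format_signal(char *buf, size_t buflen, unsigned int sig)
{
	int len;

	if (buflen < 5)
		return -1;
	len = snprintf(buf, buflen, "%u", sig);
	if (len < 1 || len > 4)
		return -1;
	if (len == 1) {
		/* One byte would be rejected; pad with a newline. */
		buf[1] = '\n';
		buf[2] = '\0';
		len = 2;
	}
	return len;
}

/* Send 'sig' to 'pid' through procfs.  Returns 0 on success, -1 on error. */
int demo_send_signal(pid_t pid, unsigned int sig)
{
	char path[64], buf[8];
	int fd, len;
	ssize_t n;

	len = demo_format_signal(buf, sizeof(buf), sig);
	if (len < 0)
		return -1;
	snprintf(path, sizeof(path), "/proc/%ld/signal", (long)pid);
	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;	/* e.g. kernel lacks CONFIG_SIGNALS_DEBUG */
	n = write(fd, buf, (size_t)len);
	close(fd);
	return n == len ? 0 : -1;
}
```

Because the write bypasses the C library's kill() wrapper, this path can
exercise signal numbers that the installed libc does not yet know about.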

Signed-off-by: Walt Drummond <walt@drummond.us>
---
 fs/proc/base.c         | 48 ++++++++++++++++++++++++++++++++++++++++++
 include/linux/signal.h |  1 +
 kernel/signal.c        | 15 ++++++++-----
 kernel/sysctl.c        | 41 ++++++++++++++++++++++++++++++++++++
 lib/Kconfig.debug      | 10 +++++++++
 5 files changed, 110 insertions(+), 5 deletions(-)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index 533d5836eb9a..75184abf9af1 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -3165,6 +3165,51 @@ static int proc_stack_depth(struct seq_file *m, struct pid_namespace *ns,
 }
 #endif /* CONFIG_STACKLEAK_METRICS */
 
+#ifdef CONFIG_SIGNALS_DEBUG
+static ssize_t proc_signal_write(struct file *file, const char __user *buf,
+				   size_t count, loff_t *ppos)
+{
+	struct inode *inode = file_inode(file);
+	struct task_struct *task = get_proc_task(inode);
+	int ret;
+	pid_t pid;
+	unsigned long sig = (unsigned long) -1;
+
+	if (!task)
+		return -ESRCH;
+	if (*ppos != 0)
+		/* No partial writes. */
+		return -EINVAL;
+
+	if (count > 4 || count <= 1)
+		return -EINVAL;
+
+	ret = kstrtoul_from_user(buf, count, 10, &sig);
+	if (ret != 0)
+		return -EINVAL;
+
+	if (!valid_signal(sig))
+		return -EINVAL;
+	if (sig == 0)
+		return count;
+
+	pid = pid_vnr(get_task_pid(task, PIDTYPE_PID));
+	if (pid == 0)
+		return -EINVAL;
+
+	ret = do_sys_kill(pid, sig);
+	if (ret)
+		return ret;
+
+	return count;
+}
+
+static const struct file_operations proc_signal_operations = {
+	.write		= proc_signal_write,
+	.llseek		= noop_llseek,
+};
+#endif	/* CONFIG_SIGNALS_DEBUG */
+
 /*
  * Thread groups
  */
@@ -3281,6 +3326,9 @@ static const struct pid_entry tgid_base_stuff[] = {
 #ifdef CONFIG_SECCOMP_CACHE_DEBUG
 	ONE("seccomp_cache", S_IRUSR, proc_pid_seccomp_cache),
 #endif
+#ifdef CONFIG_SIGNALS_DEBUG
+	REG("signal",  S_IWUSR, proc_signal_operations),
+#endif
 };
 
 static int proc_tgid_base_readdir(struct file *file, struct dir_context *ctx)
diff --git a/include/linux/signal.h b/include/linux/signal.h
index 4084b765a6cc..b77f9472a37c 100644
--- a/include/linux/signal.h
+++ b/include/linux/signal.h
@@ -446,6 +446,7 @@ extern bool get_signal(struct ksignal *ksig);
 extern void signal_setup_done(int failed, struct ksignal *ksig, int stepping);
 extern void exit_signals(struct task_struct *tsk);
 extern void kernel_sigaction(int, __sighandler_t);
+extern int do_sys_kill(pid_t pid, int sig);
 
 #define SIG_KTHREAD ((__force __sighandler_t)2)
 #define SIG_KTHREAD_KERNEL ((__force __sighandler_t)3)
diff --git a/kernel/signal.c b/kernel/signal.c
index 9c846a017201..1ed392df55fb 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -3755,6 +3755,15 @@ static inline void prepare_kill_siginfo(int sig, struct kernel_siginfo *info)
 	info->si_uid = from_kuid_munged(current_user_ns(), current_uid());
 }
 
+int do_sys_kill(pid_t pid, int sig)
+{
+	struct kernel_siginfo info;
+
+	prepare_kill_siginfo(sig, &info);
+
+	return kill_something_info(sig, &info, pid);
+}
+
 /**
  *  sys_kill - send a signal to a process
  *  @pid: the PID of the process
@@ -3762,11 +3771,7 @@ static inline void prepare_kill_siginfo(int sig, struct kernel_siginfo *info)
  */
 SYSCALL_DEFINE2(kill, pid_t, pid, int, sig)
 {
-	struct kernel_siginfo info;
-
-	prepare_kill_siginfo(sig, &info);
-
-	return kill_something_info(sig, &info, pid);
+	return do_sys_kill(pid, sig);
 }
 
 /*
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 083be6af29d7..0d7e1d16b75b 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -139,6 +139,15 @@ static int minolduid;
 static int ngroups_max = NGROUPS_MAX;
 static const int cap_last_cap = CAP_LAST_CAP;
 
+#ifdef CONFIG_SIGNALS_DEBUG
+static int max_signal = _NSIG;
+static int sigrtmax = SIGRTMAX;
+static int sigset_size = sizeof(sigset_t);
+# ifdef CONFIG_COMPAT
+static int compat_sigset_size = sizeof(compat_sigset_t);
+# endif
+#endif
+
 /*
  * This is needed for proc_doulongvec_minmax of sysctl_hung_task_timeout_secs
  * and hung_task_check_interval_secs
@@ -2717,6 +2726,38 @@ static struct ctl_table kern_table[] = {
 		.extra1		= SYSCTL_ZERO,
 		.extra2		= SYSCTL_ONE,
 	},
+#endif
+#ifdef CONFIG_SIGNALS_DEBUG
+	{
+		.procname	= "max_signal",
+		.data		= &max_signal,
+		.maxlen		= sizeof(int),
+		.mode		= 0444,
+		.proc_handler	= proc_dointvec,
+	},
+	{
+		.procname	= "sigrtmax",
+		.data		= &sigrtmax,
+		.maxlen		= sizeof(int),
+		.mode		= 0444,
+		.proc_handler	= proc_dointvec,
+	},
+	{
+		.procname	= "sigset_size",
+		.data		= &sigset_size,
+		.maxlen		= sizeof(int),
+		.mode		= 0444,
+		.proc_handler	= proc_dointvec,
+	},
+# ifdef CONFIG_COMPAT
+	{
+		.procname	= "compat_sigset_size",
+		.data		= &compat_sigset_size,
+		.maxlen		= sizeof(int),
+		.mode		= 0444,
+		.proc_handler	= proc_dointvec,
+	},
+# endif
 #endif
 	{ }
 };
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 2a9b6dcdac4f..c433356c1070 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -2639,4 +2639,14 @@ endmenu # "Kernel Testing and Coverage"
 
 source "Documentation/Kconfig"
 
+config SIGNALS_DEBUG
+       bool "Enable basic signals debugging"
+       default n
+       help
+	 Provides several files in /proc to aid in debugging changes to
+	 the signals code: /proc/sys/kernel/max_signal,
+	 /proc/sys/kernel/sigrtmax and /proc/sys/kernel/sigset_size.
+	 Also adds /proc/<pid>/signal to allow sending a signal number
+	 to <pid> without going through libc.
+
 endmenu # Kernel hacking
-- 
2.30.2



* [RFC PATCH 8/8] signals: Support BSD VSTATUS, KERNINFO and SIGINFO
  2022-01-03 18:19 ` Walt Drummond
                   ` (8 preceding siblings ...)
  (?)
@ 2022-01-03 18:19 ` Walt Drummond
  2022-01-04  7:27   ` Greg Kroah-Hartman
                     ` (2 more replies)
  -1 siblings, 3 replies; 57+ messages in thread
From: Walt Drummond @ 2022-01-03 18:19 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Greg Kroah-Hartman, Jiri Slaby, Arnd Bergmann,
	Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira
  Cc: linux-kernel, Walt Drummond, linux-fsdevel, linux-arch

Support TTY VSTATUS character, NOKERNINFO local control bit and the
signal SIGINFO, all as in 4.3BSD.

Signed-off-by: Walt Drummond <walt@drummond.us>
---
 arch/x86/include/asm/signal.h       |   2 +-
 arch/x86/include/uapi/asm/signal.h  |   4 +-
 drivers/tty/Makefile                |   2 +-
 drivers/tty/n_tty.c                 |  21 +++++
 drivers/tty/tty_io.c                |  10 ++-
 drivers/tty/tty_ioctl.c             |   4 +
 drivers/tty/tty_status.c            | 135 ++++++++++++++++++++++++++++
 fs/proc/array.c                     |  29 +-----
 include/asm-generic/termios.h       |   4 +-
 include/linux/sched.h               |  52 ++++++++++-
 include/linux/signal.h              |   4 +
 include/linux/tty.h                 |   8 ++
 include/uapi/asm-generic/ioctls.h   |   2 +
 include/uapi/asm-generic/signal.h   |   6 +-
 include/uapi/asm-generic/termbits.h |  34 +++----
 15 files changed, 264 insertions(+), 53 deletions(-)
 create mode 100644 drivers/tty/tty_status.c

diff --git a/arch/x86/include/asm/signal.h b/arch/x86/include/asm/signal.h
index d8e2efe6cd46..0a01877c11ab 100644
--- a/arch/x86/include/asm/signal.h
+++ b/arch/x86/include/asm/signal.h
@@ -8,7 +8,7 @@
 /* Most things should be clean enough to redefine this at will, if care
    is taken to make libc match.  */
 
-#define _NSIG		64
+#define _NSIG		65
 
 #ifdef __i386__
 # define _NSIG_BPW	32
diff --git a/arch/x86/include/uapi/asm/signal.h b/arch/x86/include/uapi/asm/signal.h
index 164a22a72984..60dca62d3dcf 100644
--- a/arch/x86/include/uapi/asm/signal.h
+++ b/arch/x86/include/uapi/asm/signal.h
@@ -60,7 +60,9 @@ typedef unsigned long sigset_t;
 
 /* These should not be considered constants from userland.  */
 #define SIGRTMIN	32
-#define SIGRTMAX	_NSIG
+#define SIGRTMAX	64
+
+#define SIGINFO		65
 
 #define SA_RESTORER	0x04000000
 
diff --git a/drivers/tty/Makefile b/drivers/tty/Makefile
index a2bd75fbaaa4..d50ba690bb87 100644
--- a/drivers/tty/Makefile
+++ b/drivers/tty/Makefile
@@ -2,7 +2,7 @@
 obj-$(CONFIG_TTY)		+= tty_io.o n_tty.o tty_ioctl.o tty_ldisc.o \
 				   tty_buffer.o tty_port.o tty_mutex.o \
 				   tty_ldsem.o tty_baudrate.o tty_jobctrl.o \
-				   n_null.o
+				   n_null.o tty_status.o
 obj-$(CONFIG_LEGACY_PTYS)	+= pty.o
 obj-$(CONFIG_UNIX98_PTYS)	+= pty.o
 obj-$(CONFIG_AUDIT)		+= tty_audit.o
diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
index 0ec93f1a61f5..b510e01289fd 100644
--- a/drivers/tty/n_tty.c
+++ b/drivers/tty/n_tty.c
@@ -1334,6 +1334,24 @@ static void n_tty_receive_char_special(struct tty_struct *tty, unsigned char c)
 			commit_echoes(tty);
 			return;
 		}
+#ifdef VSTATUS
+		if (c == STATUS_CHAR(tty)) {
+			/* Do the status message first and then send
+			 * the signal, otherwise signal delivery can
+			 * change the process state making the status
+			 * message misleading.  Also, use __isig() and
+			 * not isig(), as if we flush the tty we can
+			 * lose parts of the message.
+			 */
+
+			if (!L_NOKERNINFO(tty))
+				tty_status(tty);
+# if defined(SIGINFO) && SIGINFO != SIGPWR
+			__isig(SIGINFO, tty);
+# endif
+			return;
+		}
+#endif	/* VSTATUS */
 		if (c == '\n') {
 			if (L_ECHO(tty) || L_ECHONL(tty)) {
 				echo_char_raw('\n', ldata);
@@ -1763,6 +1781,9 @@ static void n_tty_set_termios(struct tty_struct *tty, struct ktermios *old)
 			set_bit(EOF_CHAR(tty), ldata->char_map);
 			set_bit('\n', ldata->char_map);
 			set_bit(EOL_CHAR(tty), ldata->char_map);
+#ifdef VSTATUS
+			set_bit(STATUS_CHAR(tty), ldata->char_map);
+#endif
 			if (L_IEXTEN(tty)) {
 				set_bit(WERASE_CHAR(tty), ldata->char_map);
 				set_bit(LNEXT_CHAR(tty), ldata->char_map);
diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index 6616d4a0d41d..8e488ecba330 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -120,18 +120,26 @@
 #define TTY_PARANOIA_CHECK 1
 #define CHECK_TTY_COUNT 1
 
+/* Less ugly than an ifdef in the middle of the initializer below, maybe? */
+#ifdef NOKERNINFO
+# define __NOKERNINFO NOKERNINFO
+#else
+# define __NOKERNINFO 0
+#endif
+
 struct ktermios tty_std_termios = {	/* for the benefit of tty drivers  */
 	.c_iflag = ICRNL | IXON,
 	.c_oflag = OPOST | ONLCR,
 	.c_cflag = B38400 | CS8 | CREAD | HUPCL,
 	.c_lflag = ISIG | ICANON | ECHO | ECHOE | ECHOK |
-		   ECHOCTL | ECHOKE | IEXTEN,
+		   ECHOCTL | ECHOKE | IEXTEN | __NOKERNINFO,
 	.c_cc = INIT_C_CC,
 	.c_ispeed = 38400,
 	.c_ospeed = 38400,
 	/* .c_line = N_TTY, */
 };
 EXPORT_SYMBOL(tty_std_termios);
+#undef __NOKERNINFO
 
 /* This list gets poked at by procfs and various bits of boot up code. This
  * could do with some rationalisation such as pulling the tty proc function
diff --git a/drivers/tty/tty_ioctl.c b/drivers/tty/tty_ioctl.c
index 507a25d692bb..b250eabca1ba 100644
--- a/drivers/tty/tty_ioctl.c
+++ b/drivers/tty/tty_ioctl.c
@@ -809,6 +809,10 @@ int tty_mode_ioctl(struct tty_struct *tty, struct file *file,
 		if (get_user(arg, (unsigned int __user *) arg))
 			return -EFAULT;
 		return tty_change_softcar(real_tty, arg);
+#ifdef TIOCSTAT
+	case TIOCSTAT:
+		return tty_status(real_tty);
+#endif
 	default:
 		return -ENOIOCTLCMD;
 	}
diff --git a/drivers/tty/tty_status.c b/drivers/tty/tty_status.c
new file mode 100644
index 000000000000..a9600f5bd48c
--- /dev/null
+++ b/drivers/tty/tty_status.c
@@ -0,0 +1,135 @@
+// SPDX-License-Identifier: GPL-1.0+
+/*
+ * tty_status.c --- implements VSTATUS and TIOCSTAT from BSD4.3/4.4
+ *
+ */
+
+#include <linux/sched.h>
+#include <linux/mm.h>
+#include <linux/tty.h>
+#include <linux/sched/cputime.h>
+#include <linux/sched/loadavg.h>
+#include <linux/pid.h>
+#include <linux/slab.h>
+#include <linux/math64.h>
+
+#define MSGLEN (160 + TASK_COMM_LEN)
+
+inline unsigned long getRSSk(struct mm_struct *mm)
+{
+	if (mm == NULL)
+		return 0;
+	return get_mm_rss(mm) * PAGE_SIZE / 1024;
+}
+
+inline long nstoms(long l)
+{
+	l /= NSEC_PER_MSEC * 10;
+	if (l < 10)
+		l *= 10;
+	return l;
+}
+
+inline struct task_struct *compare(struct task_struct *new,
+				   struct task_struct *old)
+{
+	unsigned int ostate, nstate;
+
+	if (old == NULL)
+		return new;
+
+	ostate = task_state_index(old);
+	nstate = task_state_index(new);
+
+	if (ostate == nstate) {
+		if (old->start_time > new->start_time)
+			return old;
+		return new;
+	}
+
+	if (ostate < nstate)
+		return old;
+
+	return new;
+}
+
+struct task_struct *pick_process(struct pid *pgrp)
+{
+	struct task_struct *p, *winner = NULL;
+
+	read_lock(&tasklist_lock);
+	do_each_pid_task(pgrp, PIDTYPE_PGID, p) {
+		winner = compare(p, winner);
+	} while_each_pid_task(pgrp, PIDTYPE_PGID, p);
+	read_unlock(&tasklist_lock);
+
+	return winner;
+}
+
+int tty_status(struct tty_struct *tty)
+{
+	char tname[TASK_COMM_LEN];
+	unsigned long loadavg[3];
+	uint64_t pcpu, cputime, wallclock;
+	struct task_struct *p;
+	struct rusage rusage;
+	struct timespec64 utime, stime, rtime;
+	char msg[MSGLEN] = {0};
+	int len = 0;
+
+	if (tty == NULL)
+		return -ENOTTY;
+
+	get_avenrun(loadavg, FIXED_1/200, 0);
+	len += scnprintf((char *)&msg[len], MSGLEN - len, "load: %lu.%02lu  ",
+		       LOAD_INT(loadavg[0]), LOAD_FRAC(loadavg[0]));
+
+	if (tty->ctrl.session == NULL) {
+		len += scnprintf((char *)&msg[len], MSGLEN - len,
+				 "not a controlling terminal");
+		goto print;
+	}
+
+	if (tty->ctrl.pgrp == NULL) {
+		len += scnprintf((char *)&msg[len], MSGLEN - len,
+				 "no foreground process group");
+		goto print;
+	}
+
+	p = pick_process(tty->ctrl.pgrp);
+	if (p == NULL) {
+		len += scnprintf((char *)&msg[len], MSGLEN - len,
+				 "empty foreground process group");
+		goto print;
+	}
+
+	get_task_comm(tname, p);
+	getrusage(p, RUSAGE_BOTH, &rusage);
+	wallclock = ktime_get_ns() - p->start_time;
+
+	utime.tv_sec = rusage.ru_utime.tv_sec;
+	utime.tv_nsec = rusage.ru_utime.tv_usec * NSEC_PER_USEC;
+	stime.tv_sec = rusage.ru_stime.tv_sec;
+	stime.tv_nsec = rusage.ru_stime.tv_usec * NSEC_PER_USEC;
+	rtime = ns_to_timespec64(wallclock);
+
+	cputime = timespec64_to_ns(&utime) + timespec64_to_ns(&stime);
+	pcpu = div64_u64(cputime * 100, wallclock);
+
+	len += scnprintf((char *)&msg[len], MSGLEN - len,
+			 /* task, PID, task state */
+			 "cmd: %s %d [%s] "
+			 /* rtime,    utime,      stime,      %cpu,  rss */
+			 "%llu.%02lur %llu.%02luu %llu.%02lus %llu%% %luk",
+			 tname,	task_pid_vnr(p), (char *)get_task_state_name(p),
+			 rtime.tv_sec, nstoms(rtime.tv_nsec),
+			 utime.tv_sec, nstoms(utime.tv_nsec),
+			 stime.tv_sec, nstoms(stime.tv_nsec),
+			 pcpu, getRSSk(p->mm));
+
+print:
+	len += scnprintf((char *)&msg[len], MSGLEN - len, "\r\n");
+	tty_write_message(tty, msg);
+
+	return 0;
+}
diff --git a/fs/proc/array.c b/fs/proc/array.c
index f37c03077b58..eb14306cdde2 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -62,6 +62,7 @@
 #include <linux/tty.h>
 #include <linux/string.h>
 #include <linux/mman.h>
+#include <linux/sched.h>
 #include <linux/sched/mm.h>
 #include <linux/sched/numa_balancing.h>
 #include <linux/sched/task_stack.h>
@@ -111,34 +112,6 @@ void proc_task_name(struct seq_file *m, struct task_struct *p, bool escape)
 		seq_printf(m, "%.64s", tcomm);
 }
 
-/*
- * The task state array is a strange "bitmap" of
- * reasons to sleep. Thus "running" is zero, and
- * you can test for combinations of others with
- * simple bit tests.
- */
-static const char * const task_state_array[] = {
-
-	/* states in TASK_REPORT: */
-	"R (running)",		/* 0x00 */
-	"S (sleeping)",		/* 0x01 */
-	"D (disk sleep)",	/* 0x02 */
-	"T (stopped)",		/* 0x04 */
-	"t (tracing stop)",	/* 0x08 */
-	"X (dead)",		/* 0x10 */
-	"Z (zombie)",		/* 0x20 */
-	"P (parked)",		/* 0x40 */
-
-	/* states beyond TASK_REPORT: */
-	"I (idle)",		/* 0x80 */
-};
-
-static inline const char *get_task_state(struct task_struct *tsk)
-{
-	BUILD_BUG_ON(1 + ilog2(TASK_REPORT_MAX) != ARRAY_SIZE(task_state_array));
-	return task_state_array[task_state_index(tsk)];
-}
-
 static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
 				struct pid *pid, struct task_struct *p)
 {
diff --git a/include/asm-generic/termios.h b/include/asm-generic/termios.h
index b1398d0d4a1d..9b080e1a82d4 100644
--- a/include/asm-generic/termios.h
+++ b/include/asm-generic/termios.h
@@ -10,9 +10,9 @@
 	eof=^D		vtime=\0	vmin=\1		sxtc=\0
 	start=^Q	stop=^S		susp=^Z		eol=\0
 	reprint=^R	discard=^U	werase=^W	lnext=^V
-	eol2=\0
+	eol2=\0         status=^T
 */
-#define INIT_C_CC "\003\034\177\025\004\0\1\0\021\023\032\0\022\017\027\026\0"
+#define INIT_C_CC "\003\034\177\025\004\0\1\0\021\023\032\0\022\017\027\026\0\024"
 
 /*
  * Translate a "termio" structure into a "termios". Ugh.
diff --git a/include/linux/sched.h b/include/linux/sched.h
index c1a927ddec64..2171074ec8f5 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -70,7 +70,7 @@ struct task_group;
 
 /*
  * Task state bitmask. NOTE! These bits are also
- * encoded in fs/proc/array.c: get_task_state().
+ * encoded in get_task_state().
  *
  * We have two separate sets of flags: task->state
  * is about runnability, while task->exit_state are
@@ -1643,6 +1643,56 @@ static inline char task_state_to_char(struct task_struct *tsk)
 	return task_index_to_char(task_state_index(tsk));
 }
 
+static inline const char *get_task_state_name(struct task_struct *tsk)
+{
+	static const char * const task_state_array[] = {
+
+		/* states in TASK_REPORT: */
+		"running",		/* 0x00 */
+		"sleeping",		/* 0x01 */
+		"disk sleep",		/* 0x02 */
+		"stopped",		/* 0x04 */
+		"tracing stop",		/* 0x08 */
+		"dead",			/* 0x10 */
+		"zombie",		/* 0x20 */
+		"parked",		/* 0x40 */
+
+		/* states beyond TASK_REPORT: */
+		"idle",			/* 0x80 */
+	};
+
+	BUILD_BUG_ON(1 + ilog2(TASK_REPORT_MAX) != ARRAY_SIZE(task_state_array));
+	return task_state_array[task_state_index(tsk)];
+}
+
+static inline const char *get_task_state(struct task_struct *tsk)
+{
+	/*
+	 * The task state array is a strange "bitmap" of
+	 * reasons to sleep. Thus "running" is zero, and
+	 * you can test for combinations of others with
+	 * simple bit tests.
+	 */
+	static const char * const task_state_array[] = {
+
+		/* states in TASK_REPORT: */
+		"R (running)",		/* 0x00 */
+		"S (sleeping)",		/* 0x01 */
+		"D (disk sleep)",	/* 0x02 */
+		"T (stopped)",		/* 0x04 */
+		"t (tracing stop)",	/* 0x08 */
+		"X (dead)",		/* 0x10 */
+		"Z (zombie)",		/* 0x20 */
+		"P (parked)",		/* 0x40 */
+
+		/* states beyond TASK_REPORT: */
+		"I (idle)",		/* 0x80 */
+	};
+
+	BUILD_BUG_ON(1 + ilog2(TASK_REPORT_MAX) != ARRAY_SIZE(task_state_array));
+	return task_state_array[task_state_index(tsk)];
+}
+
 /**
  * is_global_init - check if a task structure is init. Since init
  * is free to have sub-threads we need to check tgid.
diff --git a/include/linux/signal.h b/include/linux/signal.h
index b77f9472a37c..76bda1a20578 100644
--- a/include/linux/signal.h
+++ b/include/linux/signal.h
@@ -541,6 +541,7 @@ extern bool unhandled_signal(struct task_struct *tsk, int sig);
  *	|  non-POSIX signal  |  default action  |
  *	+--------------------+------------------+
  *	|  SIGEMT            |  coredump	|
+ *	|  SIGINFO	     |	ignore		|
  *	+--------------------+------------------+
  *
  * (+) For SIGKILL and SIGSTOP the action is "always", not just "default".
@@ -567,6 +568,9 @@ static inline int sig_kernel_ignore(unsigned long sig)
 	return	sig == SIGCONT	||
 		sig == SIGCHLD	||
 		sig == SIGWINCH ||
+#if defined(SIGINFO) && SIGINFO != SIGPWR
+		sig == SIGINFO  ||
+#endif
 		sig == SIGURG;
 }
 
diff --git a/include/linux/tty.h b/include/linux/tty.h
index 168e57e40bbb..943d85aa471c 100644
--- a/include/linux/tty.h
+++ b/include/linux/tty.h
@@ -49,6 +49,9 @@
 #define WERASE_CHAR(tty) ((tty)->termios.c_cc[VWERASE])
 #define LNEXT_CHAR(tty)	((tty)->termios.c_cc[VLNEXT])
 #define EOL2_CHAR(tty) ((tty)->termios.c_cc[VEOL2])
+#ifdef VSTATUS
+#define STATUS_CHAR(tty) ((tty)->termios.c_cc[VSTATUS])
+#endif
 
 #define _I_FLAG(tty, f)	((tty)->termios.c_iflag & (f))
 #define _O_FLAG(tty, f)	((tty)->termios.c_oflag & (f))
@@ -114,6 +117,9 @@
 #define L_PENDIN(tty)	_L_FLAG((tty), PENDIN)
 #define L_IEXTEN(tty)	_L_FLAG((tty), IEXTEN)
 #define L_EXTPROC(tty)	_L_FLAG((tty), EXTPROC)
+#ifdef NOKERNINFO
+#define L_NOKERNINFO(tty) _L_FLAG((tty), NOKERNINFO)
+#endif
 
 struct device;
 struct signal_struct;
@@ -428,4 +434,6 @@ extern void tty_lock_slave(struct tty_struct *tty);
 extern void tty_unlock_slave(struct tty_struct *tty);
 extern void tty_set_lock_subclass(struct tty_struct *tty);
 
+extern int tty_status(struct tty_struct *tty);
+
 #endif
diff --git a/include/uapi/asm-generic/ioctls.h b/include/uapi/asm-generic/ioctls.h
index cdc9f4ca8c27..baa2b8d42679 100644
--- a/include/uapi/asm-generic/ioctls.h
+++ b/include/uapi/asm-generic/ioctls.h
@@ -97,6 +97,8 @@
 
 #define TIOCMIWAIT	0x545C	/* wait for a change on serial input line(s) */
 #define TIOCGICOUNT	0x545D	/* read serial port inline interrupt counts */
+/* Some architectures use 0x545E for FIOQSIZE */
+#define TIOCSTAT        0x545F	/* display process group stats on tty */
 
 /*
  * Some arches already define FIOQSIZE due to a historical
diff --git a/include/uapi/asm-generic/signal.h b/include/uapi/asm-generic/signal.h
index 3c4cc9b8378e..0b771eb1db94 100644
--- a/include/uapi/asm-generic/signal.h
+++ b/include/uapi/asm-generic/signal.h
@@ -4,7 +4,7 @@
 
 #include <linux/types.h>
 
-#define _NSIG		64
+#define _NSIG		65
 #define _NSIG_BPW	__BITS_PER_LONG
 #define _NSIG_WORDS	((_NSIG + _NSIG_BPW - 1) / _NSIG_BPW)
 
@@ -49,9 +49,11 @@
 /* These should not be considered constants from userland.  */
 #define SIGRTMIN	32
 #ifndef SIGRTMAX
-#define SIGRTMAX	_NSIG
+#define SIGRTMAX	64
 #endif
 
+#define SIGINFO		65
+
 #if !defined MINSIGSTKSZ || !defined SIGSTKSZ
 #define MINSIGSTKSZ	2048
 #define SIGSTKSZ	8192
diff --git a/include/uapi/asm-generic/termbits.h b/include/uapi/asm-generic/termbits.h
index 2fbaf9ae89dd..cb4e9c6d629f 100644
--- a/include/uapi/asm-generic/termbits.h
+++ b/include/uapi/asm-generic/termbits.h
@@ -58,6 +58,7 @@ struct ktermios {
 #define VWERASE 14
 #define VLNEXT 15
 #define VEOL2 16
+#define VSTATUS 17
 
 /* c_iflag bits */
 #define IGNBRK	0000001
@@ -164,22 +165,23 @@ struct ktermios {
 #define IBSHIFT	  16		/* Shift from CBAUD to CIBAUD */
 
 /* c_lflag bits */
-#define ISIG	0000001
-#define ICANON	0000002
-#define XCASE	0000004
-#define ECHO	0000010
-#define ECHOE	0000020
-#define ECHOK	0000040
-#define ECHONL	0000100
-#define NOFLSH	0000200
-#define TOSTOP	0000400
-#define ECHOCTL	0001000
-#define ECHOPRT	0002000
-#define ECHOKE	0004000
-#define FLUSHO	0010000
-#define PENDIN	0040000
-#define IEXTEN	0100000
-#define EXTPROC	0200000
+#define ISIG	   0000001
+#define ICANON	   0000002
+#define XCASE	   0000004
+#define ECHO	   0000010
+#define ECHOE	   0000020
+#define ECHOK	   0000040
+#define ECHONL	   0000100
+#define NOFLSH	   0000200
+#define TOSTOP	   0000400
+#define ECHOCTL	   0001000
+#define ECHOPRT	   0002000
+#define ECHOKE	   0004000
+#define FLUSHO	   0010000
+#define PENDIN	   0040000
+#define IEXTEN	   0100000
+#define EXTPROC	   0200000
+#define NOKERNINFO 0400000
 
 /* tcflow() and TCXONC use these */
 #define	TCOOFF		0
-- 
2.30.2
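
Note on the status-line time format: tty_status() in the patch above prints
each time as whole seconds plus a two-digit fraction ("%llu.%02lu").  The
following is a minimal user-space sketch of that split, not code from the
patch; format_elapsed() and NSEC_PER_CSEC are hypothetical names.  It takes
the fraction as whole hundredths of a second and leaves the zero padding to
%02lu.  On a plain reading of the patch, nstoms() additionally multiplies
values below 10 by 10, which would print 0.05 s as ".50"; the sketch leaves
that step out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define NSEC_PER_SEC  1000000000ULL
#define NSEC_PER_CSEC 10000000ULL	/* one hundredth of a second */

/*
 * Hypothetical helper: format a nanosecond count as "S.HH", where HH is
 * the number of whole hundredths of a second (truncated, not rounded).
 * Returns what snprintf() returns.
 */
static int format_elapsed(char *buf, size_t len, uint64_t ns)
{
	unsigned long long sec;
	unsigned long csec;

	assert(buf != NULL && len > 0);

	sec = ns / NSEC_PER_SEC;
	csec = (unsigned long)((ns % NSEC_PER_SEC) / NSEC_PER_CSEC);

	/* %02lu pads 5 to "05", so no manual scaling is needed. */
	return snprintf(buf, len, "%llu.%02lu", sec, csec);
}
```

With this helper, format_elapsed(buf, sizeof(buf), 1050000000ULL) stores
"1.05" in buf.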


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 0/8] signals: Support more than 64 signals
  2022-01-03 18:19 ` Walt Drummond
  (?)
@ 2022-01-03 18:48   ` Al Viro
  -1 siblings, 0 replies; 57+ messages in thread
From: Al Viro @ 2022-01-03 18:48 UTC (permalink / raw)
  To: Walt Drummond
  Cc: aacraid, anna.schumaker, arnd, bsegall, bp, chuck.lever, bristot,
	dave.hansen, dwmw2, dietmar.eggemann, dinguyen, geert, gregkh,
	hpa, idryomov, mingo, yzaikin, ink, jejb, jmorris, bfields,
	jlayton, jirislaby, john.johansen, juri.lelli, keescook, mcgrof,
	martin.petersen, mattst88, mgorman, oleg, pbonzini, peterz, rth,
	richard, serge, rostedt, tglx, trond.myklebust, vincent.guittot,
	x86, linux-kernel, ceph-devel, kvm, linux-alpha, linux-arch,
	linux-fsdevel, linux-m68k, linux-mtd, linux-nfs, linux-scsi,
	linux-security-module

On Mon, Jan 03, 2022 at 10:19:48AM -0800, Walt Drummond wrote:
> This patch set expands the number of signals in Linux beyond the
> current cap of 64.  It sets a new cap at the somewhat arbitrary limit
> of 1024 signals, both because it’s what GLibc and MUSL support and
> because many architectures pad sigset_t or ucontext_t in the kernel to
> this cap.  This limit is not fixed and can be further expanded within
> reason.

Could you explain the point of the entire exercise?  Why do we need more
rt signals in the first place?

glibc has quite a bit of utterly pointless future-proofing.  So "they
allow more" is not a good reason - not without a plausible use-case,
at least.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 0/8] signals: Support more than 64 signals
  2022-01-03 18:48   ` Al Viro
  (?)
@ 2022-01-04  1:00     ` Walt Drummond
  -1 siblings, 0 replies; 57+ messages in thread
From: Walt Drummond @ 2022-01-04  1:00 UTC (permalink / raw)
  To: Al Viro
  Cc: aacraid, anna.schumaker, arnd, bsegall, bp, chuck.lever, bristot,
	dave.hansen, dwmw2, dietmar.eggemann, dinguyen, geert, gregkh,
	hpa, idryomov, mingo, yzaikin, ink, jejb, jmorris, bfields,
	jlayton, jirislaby, john.johansen, juri.lelli, keescook, mcgrof,
	martin.petersen, mattst88, mgorman, oleg, pbonzini, peterz, rth,
	richard, serge, rostedt, tglx, trond.myklebust, vincent.guittot,
	x86, linux-kernel, ceph-devel, kvm, linux-alpha, linux-arch,
	linux-fsdevel, linux-m68k, linux-mtd, linux-nfs, linux-scsi,
	linux-security-module

I simply wanted SIGINFO and VSTATUS, and that necessitated this. If
the limit of 1024 rt signals is an issue, that's an extremely simple
change to make.



On Mon, Jan 3, 2022 at 10:48 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> On Mon, Jan 03, 2022 at 10:19:48AM -0800, Walt Drummond wrote:
> > This patch set expands the number of signals in Linux beyond the
> > current cap of 64.  It sets a new cap at the somewhat arbitrary limit
> > of 1024 signals, both because it’s what GLibc and MUSL support and
> > because many architectures pad sigset_t or ucontext_t in the kernel to
> > this cap.  This limit is not fixed and can be further expanded within
> > reason.
>
> Could you explain the point of the entire exercise?  Why do we need more
> rt signals in the first place?
>
> glibc has quite a bit of utterly pointless future-proofing.  So "they
> allow more" is not a good reason - not without a plausible use-case,
> at least.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 0/8] signals: Support more than 64 signals
  2022-01-04  1:00     ` Walt Drummond
  (?)
@ 2022-01-04  1:16       ` Al Viro
  -1 siblings, 0 replies; 57+ messages in thread
From: Al Viro @ 2022-01-04  1:16 UTC (permalink / raw)
  To: Walt Drummond
  Cc: aacraid, anna.schumaker, arnd, bsegall, bp, chuck.lever, bristot,
	dave.hansen, dwmw2, dietmar.eggemann, dinguyen, geert, gregkh,
	hpa, idryomov, mingo, yzaikin, ink, jejb, jmorris, bfields,
	jlayton, jirislaby, john.johansen, juri.lelli, keescook, mcgrof,
	martin.petersen, mattst88, mgorman, oleg, pbonzini, peterz, rth,
	richard, serge, rostedt, tglx, trond.myklebust, vincent.guittot,
	x86, linux-kernel, ceph-devel, kvm, linux-alpha, linux-arch,
	linux-fsdevel, linux-m68k, linux-mtd, linux-nfs, linux-scsi,
	linux-security-module

On Mon, Jan 03, 2022 at 05:00:58PM -0800, Walt Drummond wrote:
> I simply wanted SIGINFO and VSTATUS, and that necessitated this.

Elaborate, please.  What exactly requires more than 32 rt signals?

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 0/8] signals: Support more than 64 signals
  2022-01-04  1:16       ` Al Viro
  (?)
@ 2022-01-04  1:49         ` Al Viro
  -1 siblings, 0 replies; 57+ messages in thread
From: Al Viro @ 2022-01-04  1:49 UTC (permalink / raw)
  To: Walt Drummond
  Cc: aacraid, anna.schumaker, arnd, bsegall, bp, chuck.lever, bristot,
	dave.hansen, dwmw2, dietmar.eggemann, dinguyen, geert, gregkh,
	hpa, idryomov, mingo, yzaikin, ink, jejb, jmorris, bfields,
	jlayton, jirislaby, john.johansen, juri.lelli, keescook, mcgrof,
	martin.petersen, mattst88, mgorman, oleg, pbonzini, peterz, rth,
	richard, serge, rostedt, tglx, trond.myklebust, vincent.guittot,
	x86, linux-kernel, ceph-devel, kvm, linux-alpha, linux-arch,
	linux-fsdevel, linux-m68k, linux-mtd, linux-nfs, linux-scsi,
	linux-security-module

On Tue, Jan 04, 2022 at 01:16:17AM +0000, Al Viro wrote:
> On Mon, Jan 03, 2022 at 05:00:58PM -0800, Walt Drummond wrote:
> > I simply wanted SIGINFO and VSTATUS, and that necessitated this.
> 
> Elaborate, please.  What exactly requires more than 32 rt signals?

More to the point, which system had SIGINFO >= SIGRTMIN?  Or signals
with numbers greater than SIGRTMAX, for that matter?

I really don't get it...
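
For reference, the arithmetic behind the question: patch 8/8 defines SIGINFO
as 65, one above SIGRTMAX (64), and raises _NSIG from 64 to 65.  The sketch
below is a user-space illustration of the _NSIG_WORDS round-up used in the
series; nsig_words() and sigset_bytes() are hypothetical helpers, not kernel
functions.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Bits in one unsigned long, the unit a kernel sigset is built from. */
#define BITS_PER_LONG_SKETCH (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical helper mirroring
 *   _NSIG_WORDS = (_NSIG + _NSIG_BPW - 1) / _NSIG_BPW
 * i.e. the number of longs needed to hold nsig signal bits, rounded up.
 */
static size_t nsig_words(size_t nsig)
{
	assert(nsig > 0);
	return (nsig + BITS_PER_LONG_SKETCH - 1) / BITS_PER_LONG_SKETCH;
}

/* Size in bytes of a sigset that can represent nsig signals. */
static size_t sigset_bytes(size_t nsig)
{
	return nsig_words(nsig) * sizeof(unsigned long);
}
```

With a 64-bit long, nsig_words(64) is 1 and nsig_words(65) is 2, so a single
extra signal number already needs a second word.  sigset_bytes(64) is 8 and
sigset_bytes(1024) is 128, which matches the sigsetsize range of 8 to 128
given in the cover letter.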

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 8/8] signals: Support BSD VSTATUS, KERNINFO and SIGINFO
  2022-01-03 18:19 ` [RFC PATCH 8/8] signals: Support BSD VSTATUS, KERNINFO and SIGINFO Walt Drummond
@ 2022-01-04  7:27   ` Greg Kroah-Hartman
  2022-01-07 21:48   ` Arseny Maslennikov
  2022-01-08 14:38   ` Arseny Maslennikov
  2 siblings, 0 replies; 57+ messages in thread
From: Greg Kroah-Hartman @ 2022-01-04  7:27 UTC (permalink / raw)
  To: Walt Drummond
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Jiri Slaby, Arnd Bergmann, Peter Zijlstra,
	Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
	Ben Segall, Mel Gorman, Daniel Bristot de Oliveira, linux-kernel,
	linux-fsdevel, linux-arch

On Mon, Jan 03, 2022 at 10:19:56AM -0800, Walt Drummond wrote:
> Support TTY VSTATUS character, NOKERNINFO local control bit and the
> signal SIGINFO, all as in 4.3BSD.

I am sorry, but this changelog text does not make any sense to me at
all.  It needs to be much more detailed and explain why you are doing
this and what exactly it is doing as I have no idea.

Also, you seem to be adding new user/kernel apis here with no
documentation that I can see, nor any tests.  So how is anyone supposed
to use this?

And finally:

> --- /dev/null
> +++ b/drivers/tty/tty_status.c
> @@ -0,0 +1,135 @@
> +// SPDX-License-Identifier: GPL-1.0+

Please no, you know better than that, and the checkpatch tool should
have warned you.


> +/*
> + * tty_status.c --- implements VSTATUS and TIOCSTAT from BSD4.3/4.4
> + *
> + */
> +
> +#include <linux/sched.h>
> +#include <linux/mm.h>
> +#include <linux/tty.h>
> +#include <linux/sched/cputime.h>
> +#include <linux/sched/loadavg.h>
> +#include <linux/pid.h>
> +#include <linux/slab.h>
> +#include <linux/math64.h>
> +
> +#define MSGLEN (160 + TASK_COMM_LEN)
> +
> +inline unsigned long getRSSk(struct mm_struct *mm)
> +{
> +	if (mm == NULL)
> +		return 0;
> +	return get_mm_rss(mm) * PAGE_SIZE / 1024;
> +}
> +
> +inline long nstoms(long l)
> +{
> +	l /= NSEC_PER_MSEC * 10;
> +	if (l < 10)
> +		l *= 10;
> +	return l;
> +}
> +
> +inline struct task_struct *compare(struct task_struct *new,
> +				   struct task_struct *old)
> +{
> +	unsigned int ostate, nstate;
> +
> +	if (old == NULL)
> +		return new;
> +
> +	ostate = task_state_index(old);
> +	nstate = task_state_index(new);
> +
> +	if (ostate == nstate) {
> +		if (old->start_time > new->start_time)
> +			return old;
> +		return new;
> +	}
> +
> +	if (ostate < nstate)
> +		return old;
> +
> +	return new;
> +}
> +
> +struct task_struct *pick_process(struct pid *pgrp)

Also, always run sparse on your changes, you have loads of new global
functions for no reason.

thanks,

greg k-h
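
On the question of how user space would use the new ioctl: the patch ships
no documentation, so the following is only a guess at the intended usage,
written as a user-space sketch.  TIOCSTAT is not in mainline Linux headers;
the fallback value 0x545F is the one the RFC adds to asm-generic/ioctls.h.
request_tty_status() and request_tty_status_path() are hypothetical wrapper
names.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Assumption: value taken from the RFC patch, not from a released kernel. */
#ifndef TIOCSTAT
#define TIOCSTAT 0x545F
#endif

/*
 * Ask the kernel to write its status line to the terminal behind fd.
 * Returns 0 on success or a negative errno value.
 */
static int request_tty_status(int fd)
{
	if (ioctl(fd, TIOCSTAT) < 0)
		return -errno;
	return 0;
}

/* Same as above, but opens the terminal device named by path first. */
static int request_tty_status_path(const char *path)
{
	int fd, ret;

	assert(path != NULL);

	fd = open(path, O_RDWR | O_NOCTTY);
	if (fd < 0)
		return -errno;

	ret = request_tty_status(fd);
	close(fd);
	return ret;
}
```

A caller would typically pass STDIN_FILENO or the path of its controlling
terminal.  On a kernel without the patch the call is expected to fail with
an error rather than print anything.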

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 0/8] signals: Support more than 64 signals
  2022-01-03 18:19 ` Walt Drummond
  (?)
@ 2022-01-04 18:00   ` Eric W. Biederman
  -1 siblings, 0 replies; 57+ messages in thread
From: Eric W. Biederman @ 2022-01-04 18:00 UTC (permalink / raw)
  To: Walt Drummond
  Cc: aacraid, viro, anna.schumaker, arnd, bsegall, bp, chuck.lever,
	bristot, dave.hansen, dwmw2, dietmar.eggemann, dinguyen, geert,
	gregkh, hpa, idryomov, mingo, yzaikin, ink, jejb, jmorris,
	bfields, jlayton, jirislaby, john.johansen, juri.lelli, keescook,
	mcgrof, martin.petersen, mattst88, mgorman, oleg, pbonzini,
	peterz, rth, richard, serge, rostedt, tglx, trond.myklebust,
	vincent.guittot, x86, linux-kernel, ceph-devel, kvm, linux-alpha,
	linux-arch, linux-fsdevel, linux-m68k, linux-mtd, linux-nfs,
	linux-scsi, linux-security-module

Walt Drummond <walt@drummond.us> writes:

> This patch set expands the number of signals in Linux beyond the
> current cap of 64.  It sets a new cap at the somewhat arbitrary limit
> of 1024 signals, both because it’s what GLibc and MUSL support and
> because many architectures pad sigset_t or ucontext_t in the kernel to
> this cap.  This limit is not fixed and can be further expanded within
> reason.

Ahhhh!!

Please let's not expand the number of signals supported if there is any
alternative.  Signals only really make sense for supporting existing
interfaces.  For new applications there is almost always something
better.

In the last discussion of adding SIGINFO
https://lore.kernel.org/lkml/20190625161153.29811-1-ar@cs.msu.ru/ the
approach examined was to fix SIGPWR to be ignored by default and to
define SIGINFO as SIGPWR.

I dug through the previous conversations and there is a little debate
about what makes sense for SIGPWR to do by default.  Alan Cox remembered
SIGPWR was sent when the power was restored, so ignoring SIGPWR by
default made sense.  Ted Tso pointed out a different scenario where it
was reasonable for SIGPWR to be a terminating signal.

So far no one has actually found any applications that will regress if
SIGPWR becomes ignored by default.  Furthermore, on Linux SIGPWR is only
defined to be sent to init, and init ignores all signals by default so
in practice SIGPWR is ignored by the only process that receives it
currently.

I am persuaded at least enough that I could see adding a patch to
linux-next and then sending it to Linus, where it could be reverted if
anything broke.

Where I saw the last conversation falter was in making a persuasive
case of why SIGINFO was interesting to add.  Given a world of ssh
connections I expect a persuasive case can be made.  Especially if there
are a handful of utilities where it is already implemented that just
need to be built with SIGINFO defined.

>  - Add BSD SIGINFO (and VSTATUS) as a test.

If your actual point is not to implement SIGINFO and you really have
another use case for expanding sigset_t please make it clear.

Without seeing the persuasive case for more signals I have to say that
adding more signals to the kernel sounds like a bad idea.

Eric





^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 0/8] signals: Support more than 64 signals
  2022-01-04 18:00   ` Eric W. Biederman
  (?)
@ 2022-01-04 20:52     ` Theodore Ts'o
  -1 siblings, 0 replies; 57+ messages in thread
From: Theodore Ts'o @ 2022-01-04 20:52 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Walt Drummond, aacraid, viro, anna.schumaker, arnd, bsegall, bp,
	chuck.lever, bristot, dave.hansen, dwmw2, dietmar.eggemann,
	dinguyen, geert, gregkh, hpa, idryomov, mingo, yzaikin, ink,
	jejb, jmorris, bfields, jlayton, jirislaby, john.johansen,
	juri.lelli, keescook, mcgrof, martin.petersen, mattst88, mgorman,
	oleg, pbonzini, peterz, rth, richard, serge, rostedt, tglx,
	trond.myklebust, vincent.guittot, x86, linux-kernel, ceph-devel,
	kvm, linux-alpha, linux-arch, linux-fsdevel, linux-m68k,
	linux-mtd, linux-nfs, linux-scsi, linux-security-module

On Tue, Jan 04, 2022 at 12:00:34PM -0600, Eric W. Biederman wrote:
> I dug through the previous conversations and there is a little debate
> about what makes sense for SIGPWR to do by default.  Alan Cox remembered
> SIGPWR was sent when the power was restored, so ignoring SIGPWR by
> default made sense.  Ted Tso pointed out a different scenario where it
> was reasonable for SIGPWR to be a terminating signal.
> 
> So far no one has actually found any applications that will regress if
> SIGPWR becomes ignored by default.  Furthermore on linux SIGPWR is only
> defined to be sent to init, and init ignores all signals by default so
> in practice SIGPWR is ignored by the only process that receives it
> currently.

As it turns out, systemd does *not* ignore SIGPWR.  Instead, it will
initiate the sigpwr target.  From the systemd.special man page:

       sigpwr.target
           A special target that is started when systemd receives the
           SIGPWR process signal, which is normally sent by the kernel
           or UPS daemons when power fails.

And child processes of systemd are not ignoring SIGPWR.  Instead, they
are getting terminated.

<tytso@cwcc>
41% /bin/sleep 50 &
[1] 180671
<tytso@cwcc>
42% kill -PWR 180671
[1]+  Power failure           /bin/sleep 50

> Where I saw the last conversation falter was in making a persuasive
> case of why SIGINFO was interesting to add.  Given a world of ssh
> connections I expect a persuasive case can be made.  Especially if there
> are a handful of utilities where it is already implemented that just
> need to be built with SIGINFO defined.

One thing that's perhaps worth disentangling is the value of
supporting VSTATUS --- which is a control character much like VINTR
(^C) or VQUIT (control backslash) which is set via the c_cc[] array in
termios structure.  Quoting from the termios man page:

       VSTATUS
              (not in POSIX; not supported under Linux; status
              request: 024, DC4, Ctrl-T).  Status character (STATUS).
              Display status information at terminal, including state
              of foreground process and amount of CPU time it has
              consumed.  Also sends a SIGINFO signal (not supported on
              Linux) to the foreground process group.
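
For readers who have not looked at c_cc[] before, a rough sketch of how
a program reads these control characters follows.  It is an
illustration under assumptions, not code from the patch set:
tty_control_char() and print_control_chars() are hypothetical helper
names, and the VSTATUS branch only compiles where <termios.h> defines
that index.

```c
#include <stdio.h>
#include <termios.h>
#include <unistd.h>

/*
 * Returns the control character stored in slot idx of c_cc[] for the
 * terminal open on fd, or -1 if idx is out of range or fd is not a
 * terminal.
 */
static int tty_control_char(int fd, int idx)
{
	struct termios tio;

	if (idx < 0 || idx >= NCCS)
		return -1;
	if (tcgetattr(fd, &tio) < 0)
		return -1;
	return tio.c_cc[idx];
}

/* Prints one control character in octal, or "n/a" if unavailable. */
static void print_control_char(const char *name, int value)
{
	if (value < 0)
		printf("%-8s n/a\n", name);
	else
		printf("%-8s %#o\n", name, (unsigned int)value);
}

/* Prints the interrupt, quit and (where defined) status characters. */
static void print_control_chars(int fd)
{
	print_control_char("VINTR", tty_control_char(fd, VINTR));
	print_control_char("VQUIT", tty_control_char(fd, VQUIT));
#ifdef VSTATUS
	print_control_char("VSTATUS", tty_control_char(fd, VSTATUS));
#else
	puts("VSTATUS  not defined by this platform's <termios.h>");
#endif
}
```

Calling print_control_chars(STDIN_FILENO) from an interactive shell
shows the current bindings; changing one is a matter of updating the
slot and passing the structure back with tcsetattr().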

The basic idea is that when you type C-t, you can find out information
about the currently running process.  This is a feature that
originally comes from the PDP-10's TENEX operating system, and it is
supported today on FreeBSD and Mac OS.  For example, it might display
something like this:

load: 2.39  cmd: ping 5374 running 0.00u 0.00s

The reason why SIGINFO is sent to the foreground process group is that
it gives the process an opportunity to print application-specific
information about the currently running process.  For example, maybe the C
compiler could print something like "parsing 2042 of 5000 header
files", or some such.  :-)

There are people who wish that Linux supported Control-T / VSTATUS.
For example, just last week on TUHS, the Unix greybeards list, there
were two heartfelt wishes for Control-T support from two such
greybeards:

    "It's my biggest annoyance with Linux that it doesn't [support
    control-t]"
    - https://minnie.tuhs.org/pipermail/tuhs/2021-December/024849.html

    "I personally can't stand using Linux, even casually for a very
     short sys-admin task, because of this missing feature"
    - https://minnie.tuhs.org/pipermail/tuhs/2021-December/024898.html

I claim, though, that we could implement VSTATUS without implementing
the SIGINFO part of the feature.  Precious few applications *ever*
implemented SIGINFO signal handlers so they could give status
information, and it's the hard part, since we don't have any spare signals
left.  If we were to repurpose some lesser-used signal, whether it be
SIGPWR, SIGLOST, or SIGSTKFLT, the danger is that some userspace
program might still depend on it (such as a UPS monitoring program
which wants to trigger power fail handling, or a userspace NFSv4
process that wants to signal that it was unable to recover a file's
file lock after a server reboot), and if we try to take over the
signal assignment, it's possible that we might get surprised.
Furthermore, all of the
possibly unused signals that we might try to reclaim terminate the
process by default, and SIGINFO *has* to have a default signal
handling action of Ignore, since otherwise typing Control-T will end
up killing the current foreground application.

Personally, I don't care all that much about VSTATUS support --- I
used it when I was in university, but honestly, I've never missed it.
But if there is someone who wants to try to implement VSTATUS, and
make some Unix greybeards happy, and maybe even switch from FreeBSD to
Linux as a result, go wild.  I'm not convinced, though, that adding
the SIGINFO part of the support is worth the effort.

Not only do almost no programs implement SIGINFO support, but a lot of
CPU-bound programs where this might actually be useful end up running a
large number of processes in parallel.  Take the "parsing 2042 of 5000
header files" example I gave above.  Consider what would happen if gcc
implemented support for SIGINFO, but the user was running a "make -j
16" and typed Control-T.   The result would be chaos!

So if you really miss Control-T, and it's the only thing holding back
a few FreeBSD users from Linux, I don't see the problem with
implementing that part of the feature.  Why not just do the easy part
of the feature, which is perhaps 5% of the work and might provide 99%
of the benefit (at least for those people who care)?

> Without seeing the persuasive case for more signals I have to say that
> adding more signals to the kernel sounds like a bad idea.

Concur, 100%.

						- Ted

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 0/8] signals: Support more than 64 signals
  2022-01-04 20:52     ` Theodore Ts'o
  (?)
@ 2022-01-04 21:33       ` Walt Drummond
  -1 siblings, 0 replies; 57+ messages in thread
From: Walt Drummond @ 2022-01-04 21:33 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Eric W. Biederman, aacraid, viro, anna.schumaker, arnd, bsegall,
	bp, chuck.lever, bristot, dave.hansen, dwmw2, dietmar.eggemann,
	dinguyen, geert, gregkh, hpa, idryomov, mingo, yzaikin, ink,
	jejb, jmorris, bfields, jlayton, jirislaby, john.johansen,
	juri.lelli, keescook, mcgrof, martin.petersen, mattst88, mgorman,
	oleg, pbonzini, peterz, rth, richard, serge, rostedt, tglx,
	trond.myklebust, vincent.guittot, x86, linux-kernel, ceph-devel,
	kvm, linux-alpha, linux-arch, linux-fsdevel, linux-m68k,
	linux-mtd, linux-nfs, linux-scsi, linux-security-module

Fair enough.  I'll abandon the signals part of this and just send out
the VSTATUS/Control-T part, after I address some comments from Greg.

Thanks.

On Tue, Jan 4, 2022 at 12:52 PM Theodore Ts'o <tytso@mit.edu> wrote:
>
> On Tue, Jan 04, 2022 at 12:00:34PM -0600, Eric W. Biederman wrote:
> > I dug through the previous conversations and there is a little debate
> > about what makes sense for SIGPWR to do by default.  Alan Cox remembered
> > SIGPWR was sent when the power was restored, so ignoring SIGPWR by
> > default made sense.  Ted Tso pointed out a different scenario where it
> > was reasonable for SIGPWR to be a terminating signal.
> >
> > So far no one has actually found any applications that will regress if
> > SIGPWR becomes ignored by default.  Furthermore on linux SIGPWR is only
> > defined to be sent to init, and init ignores all signals by default so
> > in practice SIGPWR is ignored by the only process that receives it
> > currently.
>
> As it turns out, systemd does *not* ignore SIGPWR.  Instead, it will
> initiate the sigpwr target.  From the systemd.special man page:
>
>        sigpwr.target
>            A special target that is started when systemd receives the
>            SIGPWR process signal, which is normally sent by the kernel
>            or UPS daemons when power fails.
>
> And child processes of systemd are not ignoring SIGPWR.  Instead, they
> are getting terminated.
>
> <tytso@cwcc>
> 41% /bin/sleep 50 &
> [1] 180671
> <tytso@cwcc>
> 42% kill -PWR 180671
> [1]+  Power failure           /bin/sleep 50
>
> > Where I saw the last conversation falter was in making a persuasive
> > case of why SIGINFO was interesting to add.  Given a world of ssh
> > connections I expect a persuasive case can be made.  Especially if there
> > are a handful of utilities where it is already implemented that just
> > need to be built with SIGINFO defined.
>
> One thing that's perhaps worth disentangling is the value of
> supporting VSTATUS --- which is a control character much like VINTR
> (^C) or VQUIT (control backslash) which is set via the c_cc[] array in
> termios structure.  Quoting from the termios man page:
>
>        VSTATUS
>               (not in POSIX; not supported under Linux; status
>               request: 024, DC4, Ctrl-T).  Status character (STATUS).
>               Display status information at terminal, including state
>               of foreground process and amount of CPU time it has
>               consumed.  Also sends a SIGINFO signal (not supported on
>               Linux) to the foreground process group.
>
> The basic idea is that when you type C-t, you can find out information
> about the currently running process.  This is a feature that
> originally comes from TOPS-10's TENEX operating system, and it is
> supported today on FreeBSD and Mac OS.  For example, it might display
> something like this:
>
> load: 2.39  cmd: ping 5374 running 0.00u 0.00s
>
> The reason why SIGINFO is sent to the foreground process group is that
> it gives the process an opportunity to print application-specific
> information about the currently running process.  For example, maybe
> the C compiler could print something like "parsing 2042 of 5000 header
> files", or some such.  :-)
>
> There are people who wish that Linux supported Control-T / VSTATUS.
> For example, just last week, on TUHS, the Unix greybeards list, there
> were two such heartfelt wishes for Control-T support from two such
> greybeards:
>
>     "It's my biggest annoyance with Linux that it doesn't [support
>     control-t]"
>     - https://minnie.tuhs.org/pipermail/tuhs/2021-December/024849.html
>
>     "I personally can't stand using Linux, even casually for a very
>      short sys-admin task, because of this missing feature"
>     - https://minnie.tuhs.org/pipermail/tuhs/2021-December/024898.html
>
> I claim, though, that we could implement VSTATUS without implementing
> the SIGINFO part of the feature.  Precious few applications *ever*
> implemented SIGINFO signal handlers so they could give status
> information, and it's the hard part, since we don't have any spare
> signals left.  If we were to repurpose some lesser used signal, whether
> it be SIGPWR, SIGLOST, or SIGSTKFLT, the danger is that there might be
> some userspace program that still uses it (such as a UPS monitoring
> program which wants to trigger power fail handling, or a userspace
> NFSv4 process that wants to signal that it was unable to recover a
> file's file lock after a server reboot), and if we try to take over
> the signal assignment, it's possible that we might get surprised.
> Furthermore, all of the
> possibly unused signals that we might try to reclaim terminate the
> process by default, and SIGINFO *has* to have a default signal
> handling action of Ignore, since otherwise typing Control-T will end
> up killing the current foreground application.
>
> Personally, I don't care all that much about VSTATUS support --- I
> used it when I was in university, but honestly, I've never missed it.
> But if there is someone who wants to try to implement VSTATUS, and
> make some Unix greybeards happy, and maybe even switch from FreeBSD to
> Linux as a result, go wild.  I'm not convinced, though, that adding
> the SIGINFO part of the support is worth the effort.
>
> Not only do almost no programs implement SIGINFO support, a lot of
> CPU-bound programs where this might actually be useful end up running a
> large number of processes in parallel.  Take the "parsing 2042 of 5000
> header files" example I gave above.  Consider what would happen if gcc
> implemented support for SIGINFO, but the user was running a "make -j
> 16" and typed Control-T.   The result would be chaos!
>
> So if you really miss Control-T, and it's the only thing holding back
> a few FreeBSD users from Linux, I don't see the problem with
> implementing that part of the feature.  Why not just do the easy part
> of the feature, which is perhaps 5% of the work and might provide 99%
> of the benefit (at least for those people who care)?
>
> > Without seeing the persuasive case for more signals I have to say that
> > adding more signals to the kernel sounds like a bad idea.
>
> Concur, 100%.
>
>                                                 - Ted

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 0/8] signals: Support more than 64 signals
  2022-01-04 20:52     ` Theodore Ts'o
  (?)
@ 2022-01-04 22:05       ` Eric W. Biederman
  -1 siblings, 0 replies; 57+ messages in thread
From: Eric W. Biederman @ 2022-01-04 22:05 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Walt Drummond, aacraid, viro, anna.schumaker, arnd, bsegall, bp,
	chuck.lever, bristot, dave.hansen, dwmw2, dietmar.eggemann,
	dinguyen, geert, gregkh, hpa, idryomov, mingo, yzaikin, ink,
	jejb, jmorris, bfields, jlayton, jirislaby, john.johansen,
	juri.lelli, keescook, mcgrof, martin.petersen, mattst88, mgorman,
	oleg, pbonzini, peterz, rth, richard, serge, rostedt, tglx,
	trond.myklebust, vincent.guittot, x86, linux-kernel, ceph-devel,
	kvm, linux-alpha, linux-arch, linux-fsdevel, linux-m68k,
	linux-mtd, linux-nfs, linux-scsi, linux-security-module

"Theodore Ts'o" <tytso@mit.edu> writes:

> On Tue, Jan 04, 2022 at 12:00:34PM -0600, Eric W. Biederman wrote:
>> I dug through the previous conversations and there is a little debate
>> about what makes sense for SIGPWR to do by default.  Alan Cox remembered
>> SIGPWR was sent when the power was restored, so ignoring SIGPWR by
>> default made sense.  Ted Tso pointed out a different scenario where it
>> was reasonable for SIGPWR to be a terminating signal.
>> 
>> So far no one has actually found any applications that will regress if
>> SIGPWR becomes ignored by default.  Furthermore on linux SIGPWR is only
>> defined to be sent to init, and init ignores all signals by default so
>> in practice SIGPWR is ignored by the only process that receives it
>> currently.
>
> As it turns out, systemd does *not* ignore SIGPWR.  Instead, it will
> initiate the sigpwr target.  From the systemd.special man page:
>
>        sigpwr.target
>            A special target that is started when systemd receives the
>            SIGPWR process signal, which is normally sent by the kernel
>            or UPS daemons when power fails.
>
> And child processes of systemd are not ignoring SIGPWR.  Instead, they
> are getting terminated.
>
> <tytso@cwcc>
> 41% /bin/sleep 50 &
> [1] 180671
> <tytso@cwcc>
> 42% kill -PWR 180671
> [1]+  Power failure           /bin/sleep 50


That is all as expected, and does not demonstrate that a regression would
happen if SIGPWR were to treat SIG_DFL as SIG_IGN, as SIGWINCH, SIGCONT,
SIGCHLD, SIGURG do.  It does show there is the possibility of problems.

The practical question is: does anything send SIGPWR to anything besides
init and expect the process to handle SIGPWR or terminate?

Possibly easier to implement (if people desire) is to simply send
SIGCONT with an si_code that indicates someone pressed the VSTATUS
key.  We have a per-signal 32-bit si_code space, so that should
be comparatively easy.

> I claim, though, that we could implement VSTATUS without implementing
> the SIGINFO part of the feature.

I agree that is the place to start.  And if we aren't going to use
SIGINFO perhaps we could have an equally good notification method
if anyone wants one.  Say, call an ioctl and get an fd that can
be read when a VSTATUS request comes in.

SIGINFO vs SIGCONT vs a fd vs something else is something we can sort
out when people get interested in modifying userspace.

Eric

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 0/8] signals: Support more than 64 signals
  2022-01-04 22:05       ` Eric W. Biederman
  (?)
@ 2022-01-04 22:23         ` Theodore Ts'o
  -1 siblings, 0 replies; 57+ messages in thread
From: Theodore Ts'o @ 2022-01-04 22:23 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Walt Drummond, aacraid, viro, anna.schumaker, arnd, bsegall, bp,
	chuck.lever, bristot, dave.hansen, dwmw2, dietmar.eggemann,
	dinguyen, geert, gregkh, hpa, idryomov, mingo, yzaikin, ink,
	jejb, jmorris, bfields, jlayton, jirislaby, john.johansen,
	juri.lelli, keescook, mcgrof, martin.petersen, mattst88, mgorman,
	oleg, pbonzini, peterz, rth, richard, serge, rostedt, tglx,
	trond.myklebust, vincent.guittot, x86, linux-kernel, ceph-devel,
	kvm, linux-alpha, linux-arch, linux-fsdevel, linux-m68k,
	linux-mtd, linux-nfs, linux-scsi, linux-security-module

On Tue, Jan 04, 2022 at 04:05:26PM -0600, Eric W. Biederman wrote:
> 
> That is all as expected, and does not demonstrate a regression would
> happen if SIGPWR were to treat SIG_DFL as SIG_IGN, as SIGWINCH, SIGCONT,
> SIGCHLD, SIGURG do.  It does show there is the possibility of problems.
> 
> The practical question is does anything send SIGPWR to anything besides
> init, and expect the process to handle SIGPWR or terminate?

So if I *cared* about SIGINFO, what I'd do is ask the systemd
developers and users list if there are any users of the sigpwr.target
feature that they know of.  And I'd also download all of the open
source UPS monitoring applications (and perhaps documentation of
closed-source UPS applications, such as, for example, APC's program) and
see if any of them are trying to send the SIGPWR signal.

I don't personally think it's worth the effort to do that research,
but maybe other people care enough to do the work.

> > I claim, though, that we could implement VSTATUS without implementing
> > the SIGINFO part of the feature.
> 
> I agree that is the place to start.  And if we aren't going to use
> SIGINFO perhaps we could have an equally good notification method
> if anyone wants one.  Say call an ioctl and get an fd that can
> be read when a VSTATUS request comes in.
> 
> SIGINFO vs SIGCONT vs a fd vs something else is something we can sort
> out when people get interested in modifying userspace.


Once VSTATUS support lands in the kernel, we can wait and see if there
is anyone who shows up wanting the SIGINFO functionality.  Certainly
we have no shortage of userspace notification interfaces in Linux.  :-)

                                                - Ted

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 0/8] signals: Support more than 64 signals
  2022-01-04 22:23         ` Theodore Ts'o
  (?)
@ 2022-01-04 22:31           ` Walt Drummond
  -1 siblings, 0 replies; 57+ messages in thread
From: Walt Drummond @ 2022-01-04 22:31 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Eric W. Biederman, aacraid, viro, anna.schumaker, arnd, bsegall,
	bp, chuck.lever, bristot, dave.hansen, dwmw2, dietmar.eggemann,
	dinguyen, geert, gregkh, hpa, idryomov, mingo, yzaikin, ink,
	jejb, jmorris, bfields, jlayton, jirislaby, john.johansen,
	juri.lelli, keescook, mcgrof, martin.petersen, mattst88, mgorman,
	oleg, pbonzini, peterz, rth, richard, serge, rostedt, tglx,
	trond.myklebust, vincent.guittot, x86, linux-kernel, ceph-devel,
	kvm, linux-alpha, linux-arch, linux-fsdevel, linux-m68k,
	linux-mtd, linux-nfs, linux-scsi, linux-security-module

The only standard tools that support SIGINFO are sleep, dd and ping
(and kill, for obvious reasons), so it's not like there's a vast hole
in the tooling, nor is there a large legacy software base
just waiting for SIGINFO to appear.  So while I very much enjoyed
figuring out how to make SIGINFO work ...

I'll have the VSTATUS patch out in a little bit.

I also think there might be some merit in consolidating the 10
'sigsetsize != sizeof(sigset_t)' checks into a macro and adding comments
that wave people off from trying to do what I did.  If that would be
useful, I'm happy to provide the patch.
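
To make the consolidation idea concrete, here is a rough userspace
sketch, not the actual kernel code.  The names mock_kernel_sigset_t,
valid_sigsetsize() and mock_rt_sigprocmask() are made up for
illustration; the mock type assumes a 64-signal, 8-byte set as on
x86_64.  The point is only that the size rule and its warning comment
live in one place.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Stand-in for the kernel's sigset_t: 64 signals in 8 bytes, as on
 * x86_64.  This is a userspace mock, not the kernel definition.
 */
typedef struct {
	unsigned long long sig[1];
} mock_kernel_sigset_t;

/*
 * Hypothetical helper holding the one copy of the rule.  User space
 * passes sigsetsize, but only the exact kernel size is accepted.
 * Do not relax this check without auditing every caller.
 */
static inline bool valid_sigsetsize(size_t sigsetsize)
{
	return sigsetsize == sizeof(mock_kernel_sigset_t);
}

/* Sketch of a syscall-like entry point that uses the helper. */
static int mock_rt_sigprocmask(const mock_kernel_sigset_t *set,
			       size_t sigsetsize)
{
	if (!valid_sigsetsize(sigsetsize))
		return -EINVAL;
	(void)set;	/* the real work is elided in this sketch */
	return 0;
}
```

With something like this, a caller passing a 128-byte libc-sized set
gets -EINVAL from the same check everywhere.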

On Tue, Jan 4, 2022 at 2:23 PM Theodore Ts'o <tytso@mit.edu> wrote:
>
> On Tue, Jan 04, 2022 at 04:05:26PM -0600, Eric W. Biederman wrote:
> >
> > That is all as expected, and does not demonstrate a regression would
> > happen if SIGPWR were to treat SIG_DFL as SIG_IGN, as SIGWINCH, SIGCONT,
> > SIGCHLD, SIGURG do.  It does show there is the possibility of problems.
> >
> > The practical question is does anything send SIGPWR to anything besides
> > init, and expect the process to handle SIGPWR or terminate?
>
> So if I *cared* about SIGINFO, what I'd do is ask the systemd
> developers and users list if there are any users of the sigpwr.target
> feature that they know of.  And I'd also download all of the open
> source UPS monitoring applications (and perhaps documentation of
> closed-source UPS applications, such as for example APC's program) and
> see if any of them are trying to send the SIGPWR signal.
>
> I don't personally think it's worth the effort to do that research,
> but maybe other people care enough to do the work.
>
> > > I claim, though, that we could implement VSTATUS without implementing
> > > the SIGINFO part of the feature.
> >
> > I agree that is the place to start.  And if we aren't going to use
> > SIGINFO perhaps we could have an equally good notification method
> > if anyone wants one.  Say call an ioctl and get an fd that can
> > be read when a VSTATUS request comes in.
> >
> > SIGINFO vs SIGCONT vs a fd vs something else is something we can sort
> > out when people get interested in modifying userspace.
>
>
> Once VSTATUS support lands in the kernel, we can wait and see if there
> is anyone who shows up wanting the SIGINFO functionality.  Certainly
> we have no shortage of userspace notification interfaces in Linux.  :-)
>
>                                               - Ted

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 0/8] signals: Support more than 64 signals
  2022-01-04 20:52     ` Theodore Ts'o
  (?)
@ 2022-01-07 19:19       ` Arseny Maslennikov
  -1 siblings, 0 replies; 57+ messages in thread
From: Arseny Maslennikov @ 2022-01-07 19:19 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Eric W. Biederman, Walt Drummond, aacraid, viro, anna.schumaker,
	arnd, bsegall, bp, chuck.lever, bristot, dave.hansen, dwmw2,
	dietmar.eggemann, dinguyen, geert, gregkh, hpa, idryomov, mingo,
	yzaikin, ink, jejb, jmorris, bfields, jlayton, jirislaby,
	john.johansen, juri.lelli, keescook, mcgrof, martin.petersen,
	mattst88, mgorman, oleg, pbonzini, peterz, rth, richard, serge,
	rostedt, tglx, trond.myklebust, vincent.guittot, x86,
	linux-kernel, ceph-devel, kvm, linux-alpha, linux-arch,
	linux-fsdevel, linux-m68k, linux-mtd, linux-nfs, linux-scsi,
	linux-security-module


[-- Attachment #1.1: Type: text/plain, Size: 8254 bytes --]

I generally agree with Ted's suggestion that we could merge the
easy-to-design part — the VSTATUS+kerninfo — first and deal with the
SIGINFO part later. The only concern I have here is that the "later"
part might never practically arrive... :)

Still, some notes on the SIGINFO/userspace-status
part:

On Tue, Jan 04, 2022 at 03:52:28PM -0500, Theodore Ts'o wrote:
> On Tue, Jan 04, 2022 at 12:00:34PM -0600, Eric W. Biederman wrote:
> > I dug through the previous conversations and there is a little debate
> > about what makes sense for SIGPWR to do by default.  Alan Cox remembered
> > SIGPWR was sent when the power was restored, so ignoring SIGPWR by
> > default made sense.  Ted Tso pointed out a different scenario where it
> > was reasonable for SIGPWR to be a terminating signal.
> > 
> > So far no one has actually found any applications that will regress if
> > SIGPWR becomes ignored by default.

Some folks from linux-api@ claimed otherwise, but unfortunately didn't elaborate.

> > Furthermore on linux SIGPWR is only
> > defined to be sent to init, and init ignores all signals by default so
> > in practice SIGPWR is ignored by the only process that receives it
> > currently.
> 
> As it turns out, systemd does *not* ignore SIGPWR.  Instead, it will
> initiate the sigpwr target.  From the systemd.special man page:
> 
>        sigpwr.target
>            A special target that is started when systemd receives the
>            SIGPWR process signal, which is normally sent by the kernel
>            or UPS daemons when power fails.

Not sure what you had in mind, but in case you're suggesting that systemd
has to drop the sigpwr.target semantics: it doesn't, and we don't need to
ask systemd to do so.

To introduce SIGINFO == SIGPWR to the kernel, the only "breaking" change
we have to make is to the default disposition of SIGPWR, i.e. the
behaviour when the signal is set to SIG_DFL. If a process (including PID
1) installs its own signal handler for SIGPWR to do something when PWR
is received (or blocks the signal and handles it via signalfd
notifications), then the default disposition does not matter at all, as
Eric notes further in this thread.

From a quick glance at the systemd code, pid1's main() function calls
manager_new(), which calls manager_setup_signals(); this function, in turn,
blocks a set of signals, including PWR, and sets up a signalfd(2) on
that set. No changes have to be made in systemd, and there is no need to
remove the sigpwr.target semantics.

The target activation does not send SIGPWR to anyone; it results in
systemd services being started and possibly stopped. The exact
consequences are out of scope for systemd.

There could be another concern: a VSTATUS keypress could result in
SIGINFO == SIGPWR being sent to pid1. In a correct implementation this
will not ever happen, because a sane PID 1 does not have (and never
acquires) a controlling terminal.

> And child processes of systemd are not ignoring SIGPWR.  Instead, they
> are getting terminated.
> 
> <tytso@cwcc>
> 41% /bin/sleep 50 &
> [1] 180671
> <tytso@cwcc>
> 42% kill -PWR 180671
> [1]+  Power failure           /bin/sleep 50

Any surprises we might get with the SIGINFO == SIGPWR approach stem
from here, not from the sigpwr.target.

> > Where I saw the last conversation falter was in making a persuasive
> > case of why SIGINFO was interesting to add.  Given a world of ssh
> > connections I expect a persuasive case can be made.  Especially if there
> > are a handful of utilities where it is already implemented that just
> > need to be built with SIGINFO defined.
> 
> One thing that's perhaps worth disentangling is the value of
> supporting VSTATUS --- which is a control character much like VINTR
> (^C) or VQUIT (control backslash) which is set via the c_cc[] array in
> termios structure.  Quoting from the termios man page:
> 
>        VSTATUS
>               (not in POSIX; not supported under Linux; status
>               request: 024, DC4, Ctrl-T).  Status character (STATUS).
>               Display status information at terminal, including state
>               of foreground process and amount of CPU time it has
>               consumed.  Also sends a SIGINFO signal (not supported on
>               Linux) to the foreground process group.
> 
> The basic idea is that when you type C-t, you can find out information
> about the currently running process.  This is a feature that
> originally comes from TOPS-10's TENEX operating system, and it is
> supported today on FreeBSD and Mac OS.  For example, it might display
> something like this:
> 
> load: 2.39  cmd: ping 5374 running 0.00u 0.00s
> 
> The reason why SIGINFO is sent to the foreground process group is that
> it gives the process an opportunity to print application specific
> information about the currently running process.  For example, maybe the C
> compiler could print something like "parsing 2042 of 5000 header
> files", or some such.  :-)
> 
> There are people who wish that Linux supported Control-T / VSTATUS,
> for example, just last week, on TUHS, the Unix greybeards list, there
> were two such heartfelt wishes for Control-T support from two such
> greybeards:
> 
>     "It's my biggest annoyance with Linux that it doesn't [support
>     control-t]"
>     - https://minnie.tuhs.org/pipermail/tuhs/2021-December/024849.html
> 
>     "I personally can't stand using Linux, even casually for a very
>      short sys-admin task, because of this missing feature"
>     - https://minnie.tuhs.org/pipermail/tuhs/2021-December/024898.html
> 
> I claim, though, that we could implement VSTATUS without implementing
> the SIGINFO part of the feature.  Precious few applications *ever*
> implemented SIGINFO signal handlers so they could give status
> information, it's the hard one, since we don't have any spare signals
> left.  If we were to repurpose some lesser used signal, whether it be
> SIGPWR, SIGLOST, or SIGSTKFLT, the danger is that there might be some
> userspace program (such as a UPS monitoring program which wants to
> trigger power fail handling, or a userspace NFSv4 process that wants
> to signal that it was unable to recover a file's file lock after a
> server reboot), and if we try to take over the signal assignment, it's
> possible that we might get surprised.  Furthermore, all of the
> possibly unused signals that we might try to reclaim terminate the
> process by default, and SIGINFO *has* to have a default signal
> handling action of Ignore, since otherwise typing Control-T will end
> up killing the current foreground application.
> 
> Personally, I don't care all that much about VSTATUS support --- I
> used it when I was in university, but honestly, I've never missed it.
> But if there is someone who wants to try to implement VSTATUS, and
> make some Unix greybeards happy, and maybe even switch from FreeBSD to
> Linux as a result, go wild.  I'm not convinced, though, that adding
> the SIGINFO part of the support is worth the effort.
> 
> Not only do almost no programs implement SIGINFO support, a lot of CPU

To be fair, many programs are a lot younger than 4.3BSD, and with the
current ubiquity of Linux without VSTATUS, it's kind of a chicken-egg
problem. :)

> bound programs where this might be actually useful, end up running a
> large number of processes in parallel.  Take the "parsing 2042 of 5000
> header files" example I gave above.  Consider what would happen if gcc
> implemented support for SIGINFO, but the user was running a "make -j
> 16" and typed Control-T.   The result would be chaos!
> 
> So if you really miss Control-T, and it's the only thing holding back
> a few FreeBSD users from Linux, I don't see the problem with
> implementing that part of the feature.  Why not just do the easy part
> of the feature which is perhaps 5% of the work, and might provide 99%
> of the benefit (at least for those people who care).
> 
> > Without seeing the persuasive case for more signals I have to say that
> > adding more signals to the kernel sounds like a bad idea.
> 
> Concur, 100%.
> 
> 						- Ted

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 0/8] signals: Support more than 64 signals
@ 2022-01-07 19:19       ` Arseny Maslennikov
  0 siblings, 0 replies; 57+ messages in thread
From: Arseny Maslennikov @ 2022-01-07 19:19 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Eric W. Biederman, Walt Drummond, aacraid, viro, anna.schumaker,
	arnd, bsegall, bp, chuck.lever, bristot, dave.hansen, dwmw2,
	dietmar.eggemann, dinguyen, geert, gregkh, hpa, idryomov, mingo,
	yzaikin, ink, jejb, jmorris, bfields, jlayton, jirislaby,
	john.johansen, juri.lelli, keescook, mcgrof, martin.petersen,
	mattst88, mgorman, oleg, pbonzini

[-- Attachment #1: Type: text/plain, Size: 8254 bytes --]

I generally agree with Ted's suggestion that we could merge the
easy-to-design part — the VSTATUS+kerninfo — first and deal with the
SIGINFO part later. The only concern I have here is that the "later"
part might never practically arrive... :)

Still, some notes on the SIGINFO/userspace-status part:

On Tue, Jan 04, 2022 at 03:52:28PM -0500, Theodore Ts'o wrote:
> On Tue, Jan 04, 2022 at 12:00:34PM -0600, Eric W. Biederman wrote:
> > I dug through the previous conversations and there is a little debate
> > about what makes sense for SIGPWR to do by default.  Alan Cox remembered
> > SIGPWR was sent when the power was restored, so ignoring SIGPWR by
> > default made sense.  Ted Tso pointed out a different scenario where it
> > was reasonable for SIGPWR to be a terminating signal.
> > 
> > So far no one has actually found any applications that will regress if
> > SIGPWR becomes ignored by default.

Some folks from linux-api@ claimed otherwise, but unfortunately didn't elaborate.

> > Furthermore on linux SIGPWR is only
> > defined to be sent to init, and init ignores all signals by default so
> > in practice SIGPWR is ignored by the only process that receives it
> > currently.
> 
> As it turns out, systemd does *not* ignore SIGPWR.  Instead, it will
> initiate the sigpwr target.  From the systemd.special man page:
> 
>        sigpwr.target
>            A special target that is started when systemd receives the
>            SIGPWR process signal, which is normally sent by the kernel
>            or UPS daemons when power fails.

Not sure what you had in mind; in case you are suggesting that systemd
would have to drop the sigpwr.target semantics: it would not.

To introduce SIGINFO == SIGPWR to the kernel, the only "breaking" change
we have to make is to the default disposition of SIGPWR, i.e. the
behaviour when the signal is set to SIG_DFL. If a process (including PID
1) installs its own signal handler for SIGPWR to do something when PWR
is received (or blocks the signal and handles it via signalfd
notifications), then the default disposition does not matter at all, as
Eric notes further in this thread.
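
To illustrate the point about dispositions, here is a minimal userspace
sketch (not taken from any existing program; the function and variable
names are made up for this example). It assumes a Linux system where
SIGPWR is defined. Once a handler is installed, the handler runs and the
default action is never consulted:

```c
#define _GNU_SOURCE
#include <signal.h>
#include <string.h>

/* Set by the handler; sig_atomic_t keeps the store async-signal-safe. */
static volatile sig_atomic_t got_sigpwr;

static void on_sigpwr(int sig)
{
	(void)sig;
	got_sigpwr = 1;
}

/*
 * Install a handler for SIGPWR, raise the signal in this process and
 * report whether the handler ran.
 * Returns 1 if it ran, 0 if it did not, -1 on error.
 */
int sigpwr_handler_demo(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_sigpwr;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGPWR, &sa, NULL) != 0)
		return -1;

	got_sigpwr = 0;
	/* With a handler installed, SIG_DFL no longer applies. */
	if (raise(SIGPWR) != 0)
		return -1;
	return got_sigpwr;
}
```

A process set up like this behaves the same whether the kernel's default
for SIGPWR is "terminate" or "ignore".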

From a quick glance at the systemd code, pid1's main() function calls
manager_new(), which calls manager_setup_signals(); this function, in
turn, blocks a set of signals, including PWR, and sets up a signalfd(2)
on that set. No changes have to be made in systemd, and there is no need
to remove the sigpwr.target semantics.
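
For readers unfamiliar with the block-plus-signalfd pattern, a rough
sketch follows. This is not systemd's code; sigpwr_via_signalfd() is a
hypothetical helper written for this mail, and it assumes Linux with
signalfd(2) available:

```c
#define _GNU_SOURCE
#include <signal.h>
#include <sys/signalfd.h>
#include <unistd.h>

/*
 * Block SIGPWR, route it to a signalfd, raise it once and return the
 * signal number read back from the descriptor (or -1 on error).
 */
int sigpwr_via_signalfd(void)
{
	struct signalfd_siginfo si;
	sigset_t mask;
	int fd, ret = -1;

	sigemptyset(&mask);
	sigaddset(&mask, SIGPWR);

	/* While blocked, the signal stays pending and no action is taken. */
	if (sigprocmask(SIG_BLOCK, &mask, NULL) != 0)
		return -1;

	fd = signalfd(-1, &mask, SFD_CLOEXEC);
	if (fd < 0)
		return -1;

	/* The pending signal is consumed by read(), not by a handler. */
	if (raise(SIGPWR) == 0 &&
	    read(fd, &si, sizeof(si)) == (ssize_t)sizeof(si))
		ret = (int)si.ssi_signo;

	close(fd);
	return ret;
}
```

Since the signal is never delivered in the usual sense, its default
disposition plays no role for such a process.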

The target activation does not send SIGPWR to anyone; it results in
systemd services being started and possibly stopped; the exact
consequences are out of scope for systemd.

There could be another concern: a VSTATUS keypress could result in
SIGINFO == SIGPWR being sent to pid1. In a correct implementation this
will never happen, because a sane PID 1 does not have (and never
acquires) a controlling terminal.

> And child processes of systemd are not ignoring SIGPWR.  Instead, they
> are getting terminated.
> 
> <tytso@cwcc>
> 41% /bin/sleep 50 &
> [1] 180671
> <tytso@cwcc>
> 42% kill -PWR 180671
> [1]+  Power failure           /bin/sleep 50

Any surprises we might get from the SIGINFO == SIGPWR approach stem
from here, not from sigpwr.target.
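
The shell session above can be reproduced programmatically. The sketch
below is only an illustration (sigpwr_default_action() is a made-up
name) and assumes a current Linux kernel, where the default action for
SIGPWR is to terminate the process:

```c
#define _GNU_SOURCE
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that leaves SIGPWR at SIG_DFL, send it SIGPWR and return
 * the signal that terminated it (0 if it exited normally, -1 on error).
 */
int sigpwr_default_action(void)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		signal(SIGPWR, SIG_DFL);
		for (;;)
			pause();	/* wait to be signalled */
	}

	if (kill(pid, SIGPWR) != 0 || waitpid(pid, &status, 0) != pid)
		return -1;
	return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

If the default disposition were changed to "ignore", the child here
would keep sleeping instead of being killed; any program relying on the
current behaviour is where a regression would show up.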

> > Where I saw the last conversation falter was in making a persuasive
> > case of why SIGINFO was interesting to add.  Given a world of ssh
> > connections I expect a persuasive case can be made.  Especially if there
> > are a handful of utilities where it is already implemented that just
> > need to be built with SIGINFO defined.
> 
> One thing that's perhaps worth disentangling is the value of
> supporting VSTATUS --- which is a control character much like VINTR
> (^C) or VQUIT (control backslash) which is set via the c_cc[] array in
> termios structure.  Quoting from the termios man page:
> 
>        VSTATUS
>               (not in POSIX; not supported under Linux; status
>               request: 024, DC4, Ctrl-T).  Status character (STATUS).
>               Display status information at terminal, including state
>               of foreground process and amount of CPU time it has
>               consumed.  Also sends a SIGINFO signal (not supported on
>               Linux) to the foreground process group.
> 
> The basic idea is that when you type C-t, you can find out information
> about the currently running process.  This is a feature that
> originally comes from TOPS-10's TENEX operating system, and it is
> supported today on FreeBSD and Mac OS.  For example, it might display
> something like this:
> 
> load: 2.39  cmd: ping 5374 running 0.00u 0.00s
> 
> The reason why SIGINFO is sent to the foreground process group is that
> it gives the process an opportunity to print application-specific
> information about the currently running process.  For example, maybe the C
> compiler could print something like "parsing 2042 of 5000 header
> files", or some such.  :-)
> 
> There are people who wish that Linux supported Control-T / VSTATUS,
> for example, just last week, on TUHS, the Unix greybeards list, there
> were two such heartfelt wishes for Control-T support from two such
> greybeards:
> 
>     "It's my biggest annoyance with Linux that it doesn't [support
>     control-t]"
>     - https://minnie.tuhs.org/pipermail/tuhs/2021-December/024849.html
> 
>     "I personally can't stand using Linux, even casually for a very
>      short sys-admin task, because of this missing feature"
>     - https://minnie.tuhs.org/pipermail/tuhs/2021-December/024898.html
> 
> I claim, though, that we could implement VSTATUS without implementing
> the SIGINFO part of the feature.  Precious few applications *ever*
> implemented SIGINFO signal handlers so they could give status
> information, and it's the hard part, since we don't have any spare
> signals left.  If we were to repurpose some lesser used signal, whether it be
> SIGPWR, SIGLOST, or SIGSTKFLT, the danger is that there might be some
> userspace program (such as a UPS monitoring program which wants to
> trigger power fail handling, or a userspace NFSv4 process that wants
> to signal that it was unable to recover a file's file lock after a
> server reboot), and if we try to take over the signal assignment, it's
> possible that we might get surprised.  Furthermore, all of the
> possibly unused signals that we might try to reclaim terminate the
> process by default, and SIGINFO *has* to have a default signal
> handling action of Ignore, since otherwise typing Control-T will end
> up killing the current foreground application.
> 
> Personally, I don't care all that much about VSTATUS support --- I
> used it when I was in university, but honestly, I've never missed it.
> But if there is someone who wants to try to implement VSTATUS, and
> make some Unix greybeards happy, and maybe even switch from FreeBSD to
> Linux as a result, go wild.  I'm not convinced, though, that adding
> the SIGINFO part of the support is worth the effort.
> 
> Not only do almost no programs implement SIGINFO support, a lot of CPU

To be fair, many programs are a lot younger than 4.3BSD, and with the
current ubiquity of Linux without VSTATUS, it's kind of a chicken-egg
problem. :)

> bound programs where this might be actually useful, end up running a
> large number of processes in parallel.  Take the "parsing 2042 of 5000
> header files" example I gave above.  Consider what would happen if gcc
> implemented support for SIGINFO, but the user was running a "make -j
> 16" and typed Control-T.   The result would be chaos!
> 
> So if you really miss Control-T, and it's the only thing holding back
> a few FreeBSD users from Linux, I don't see the problem with
> implementing that part of the feature.  Why not just do the easy part
> of the feature which is perhaps 5% of the work, and might provide 99%
> of the benefit (at least for those people who care).
> 
> > Without seeing the persuasive case for more signals I have to say that
> > adding more signals to the kernel sounds like a bad idea.
> 
> Concur, 100%.
> 
> 						- Ted

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 0/8] signals: Support more than 64 signals
  2022-01-04 22:31           ` Walt Drummond
  (?)
@ 2022-01-07 19:29             ` Arseny Maslennikov
  -1 siblings, 0 replies; 57+ messages in thread
From: Arseny Maslennikov @ 2022-01-07 19:29 UTC (permalink / raw)
  To: Walt Drummond
  Cc: Theodore Ts'o, Eric W. Biederman, aacraid, viro,
	anna.schumaker, arnd, bsegall, bp, chuck.lever, bristot,
	dave.hansen, dwmw2, dietmar.eggemann, dinguyen, geert, gregkh,
	hpa, idryomov, mingo, yzaikin, ink, jejb, jmorris, bfields,
	jlayton, jirislaby, john.johansen, juri.lelli, keescook, mcgrof,
	martin.petersen, mattst88, mgorman, oleg, pbonzini, peterz, rth,
	richard, serge, rostedt, tglx, trond.myklebust, vincent.guittot,
	x86, linux-kernel, ceph-devel, kvm, linux-alpha, linux-arch,
	linux-fsdevel, linux-m68k, linux-mtd, linux-nfs, linux-scsi,
	linux-security-module

[-- Attachment #1: Type: text/plain, Size: 852 bytes --]

On Tue, Jan 04, 2022 at 02:31:44PM -0800, Walt Drummond wrote:
> The only standard tools that support SIGINFO are sleep, dd and ping,
> (and kill, for obvious reasons) so it's not like there's a vast hole
> in the tooling or something, nor is there a large legacy software base
> just waiting for SIGINFO to appear.   So while I very much enjoyed
> figuring out how to make SIGINFO work ...

As far as I recall, GNU make on *BSD does support SIGINFO (not a
standard tool, but obviously an established one).

The developers of strace have expressed interest in SIGINFO support
to print tracer status messages (unfortunately, not on a public list).
Computational software could use this instead of spamming stderr with
progress output when run interactively on a terminal, as it frequently
is. There is a user base; it's just not very vocal on kernel lists. :)
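
As a rough sketch of what such on-demand progress reporting could look
like in a long-running program: the code below is an assumption-laden
example, not code from any of the tools mentioned. Linux has no SIGINFO
today, so SIGUSR1 is used as a stand-in, and run_with_status() and its
poke_at parameter are hypothetical names that exist only so the example
can trigger itself.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stdio.h>
#include <string.h>

/* Linux has no SIGINFO; SIGUSR1 is a stand-in chosen for this sketch. */
#ifdef SIGINFO
# define STATUS_SIGNAL SIGINFO
#else
# define STATUS_SIGNAL SIGUSR1
#endif

static volatile sig_atomic_t status_requested;

static void on_status(int sig)
{
	(void)sig;
	status_requested = 1;	/* only set a flag; report from the loop */
}

/*
 * Process `total` work items and print one progress line to stderr each
 * time a status request has arrived.  `poke_at` raises the signal at
 * that item to simulate a Ctrl-T; pass -1 to disable.
 * Returns the number of progress lines printed, or -1 on error.
 */
int run_with_status(int total, int poke_at)
{
	struct sigaction sa;
	int i, reports = 0;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_status;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = SA_RESTART;
	if (sigaction(STATUS_SIGNAL, &sa, NULL) != 0)
		return -1;

	for (i = 0; i < total; i++) {
		if (i == poke_at)
			raise(STATUS_SIGNAL);
		/* ... real work for item i would go here ... */
		if (status_requested) {
			status_requested = 0;
			fprintf(stderr, "processed %d of %d items\n",
				i + 1, total);
			reports++;
		}
	}
	return reports;
}
```

The program stays silent unless asked, which is the behaviour people
seem to want from Ctrl-T.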

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 8/8] signals: Support BSD VSTATUS, KERNINFO and SIGINFO
  2022-01-03 18:19 ` [RFC PATCH 8/8] signals: Support BSD VSTATUS, KERNINFO and SIGINFO Walt Drummond
  2022-01-04  7:27   ` Greg Kroah-Hartman
@ 2022-01-07 21:48   ` Arseny Maslennikov
  2022-01-07 21:52     ` Walt Drummond
  2022-01-08 14:38   ` Arseny Maslennikov
  2 siblings, 1 reply; 57+ messages in thread
From: Arseny Maslennikov @ 2022-01-07 21:48 UTC (permalink / raw)
  To: Walt Drummond
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Greg Kroah-Hartman, Jiri Slaby, Arnd Bergmann,
	Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, linux-kernel, linux-fsdevel,
	linux-arch

[-- Attachment #1: Type: text/plain, Size: 19564 bytes --]

On Mon, Jan 03, 2022 at 10:19:56AM -0800, Walt Drummond wrote:
> Support TTY VSTATUS character, NOKERNINFO local control bit and the
> signal SIGINFO, all as in 4.3BSD.
> 
> Signed-off-by: Walt Drummond <walt@drummond.us>
> ---
>  arch/x86/include/asm/signal.h       |   2 +-
>  arch/x86/include/uapi/asm/signal.h  |   4 +-
>  drivers/tty/Makefile                |   2 +-
>  drivers/tty/n_tty.c                 |  21 +++++
>  drivers/tty/tty_io.c                |  10 ++-
>  drivers/tty/tty_ioctl.c             |   4 +
>  drivers/tty/tty_status.c            | 135 ++++++++++++++++++++++++++++
>  fs/proc/array.c                     |  29 +-----
>  include/asm-generic/termios.h       |   4 +-
>  include/linux/sched.h               |  52 ++++++++++-
>  include/linux/signal.h              |   4 +
>  include/linux/tty.h                 |   8 ++
>  include/uapi/asm-generic/ioctls.h   |   2 +
>  include/uapi/asm-generic/signal.h   |   6 +-
>  include/uapi/asm-generic/termbits.h |  34 +++----
>  15 files changed, 264 insertions(+), 53 deletions(-)
>  create mode 100644 drivers/tty/tty_status.c
> 
> diff --git a/arch/x86/include/asm/signal.h b/arch/x86/include/asm/signal.h
> index d8e2efe6cd46..0a01877c11ab 100644
> --- a/arch/x86/include/asm/signal.h
> +++ b/arch/x86/include/asm/signal.h
> @@ -8,7 +8,7 @@
>  /* Most things should be clean enough to redefine this at will, if care
>     is taken to make libc match.  */
>  
> -#define _NSIG		64
> +#define _NSIG		65
>  
>  #ifdef __i386__
>  # define _NSIG_BPW	32
> diff --git a/arch/x86/include/uapi/asm/signal.h b/arch/x86/include/uapi/asm/signal.h
> index 164a22a72984..60dca62d3dcf 100644
> --- a/arch/x86/include/uapi/asm/signal.h
> +++ b/arch/x86/include/uapi/asm/signal.h
> @@ -60,7 +60,9 @@ typedef unsigned long sigset_t;
>  
>  /* These should not be considered constants from userland.  */
>  #define SIGRTMIN	32
> -#define SIGRTMAX	_NSIG
> +#define SIGRTMAX	64
> +
> +#define SIGINFO		65
>  
>  #define SA_RESTORER	0x04000000
>  
> diff --git a/drivers/tty/Makefile b/drivers/tty/Makefile
> index a2bd75fbaaa4..d50ba690bb87 100644
> --- a/drivers/tty/Makefile
> +++ b/drivers/tty/Makefile
> @@ -2,7 +2,7 @@
>  obj-$(CONFIG_TTY)		+= tty_io.o n_tty.o tty_ioctl.o tty_ldisc.o \
>  				   tty_buffer.o tty_port.o tty_mutex.o \
>  				   tty_ldsem.o tty_baudrate.o tty_jobctrl.o \
> -				   n_null.o
> +				   n_null.o tty_status.o
>  obj-$(CONFIG_LEGACY_PTYS)	+= pty.o
>  obj-$(CONFIG_UNIX98_PTYS)	+= pty.o
>  obj-$(CONFIG_AUDIT)		+= tty_audit.o
> diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
> index 0ec93f1a61f5..b510e01289fd 100644
> --- a/drivers/tty/n_tty.c
> +++ b/drivers/tty/n_tty.c
> @@ -1334,6 +1334,24 @@ static void n_tty_receive_char_special(struct tty_struct *tty, unsigned char c)
>  			commit_echoes(tty);
>  			return;
>  		}
> +#ifdef VSTATUS
> +		if (c == STATUS_CHAR(tty)) {
> +			/* Do the status message first and then send
> +			 * the signal, otherwise signal delivery can
> +			 * change the process state making the status
> +			 * message misleading.  Also, use __isig() and
> +			 * not sig(), as if we flush the tty we can
> +			 * lose parts of the message.

...as well as any input pending in canonical mode's built-in line
editor.

> +			 */
> +
> +			if (!L_NOKERNINFO(tty))
> +				tty_status(tty);
> +# if defined(SIGINFO) && SIGINFO != SIGPWR
> +			__isig(SIGINFO, tty);
> +# endif
> +			return;
> +		}
> +#endif	/* VSTATUS */
>  		if (c == '\n') {
>  			if (L_ECHO(tty) || L_ECHONL(tty)) {
>  				echo_char_raw('\n', ldata);
> @@ -1763,6 +1781,9 @@ static void n_tty_set_termios(struct tty_struct *tty, struct ktermios *old)
>  			set_bit(EOF_CHAR(tty), ldata->char_map);
>  			set_bit('\n', ldata->char_map);
>  			set_bit(EOL_CHAR(tty), ldata->char_map);
> +#ifdef VSTATUS
> +			set_bit(STATUS_CHAR(tty), ldata->char_map);
> +#endif
>  			if (L_IEXTEN(tty)) {
>  				set_bit(WERASE_CHAR(tty), ldata->char_map);
>  				set_bit(LNEXT_CHAR(tty), ldata->char_map);
> diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
> index 6616d4a0d41d..8e488ecba330 100644
> --- a/drivers/tty/tty_io.c
> +++ b/drivers/tty/tty_io.c
> @@ -120,18 +120,26 @@
>  #define TTY_PARANOIA_CHECK 1
>  #define CHECK_TTY_COUNT 1
>  
> +/* Less ugly than an ifdef in the middle of the initalizer below, maybe? */
> +#ifdef NOKERNINFO
> +# define __NOKERNINFO NOKERNINFO
> +#else
> +# define __NOKERNINFO 0
> +#endif
> +
>  struct ktermios tty_std_termios = {	/* for the benefit of tty drivers  */
>  	.c_iflag = ICRNL | IXON,
>  	.c_oflag = OPOST | ONLCR,
>  	.c_cflag = B38400 | CS8 | CREAD | HUPCL,
>  	.c_lflag = ISIG | ICANON | ECHO | ECHOE | ECHOK |
> -		   ECHOCTL | ECHOKE | IEXTEN,
> +		   ECHOCTL | ECHOKE | IEXTEN | __NOKERNINFO,
>  	.c_cc = INIT_C_CC,
>  	.c_ispeed = 38400,
>  	.c_ospeed = 38400,
>  	/* .c_line = N_TTY, */
>  };
>  EXPORT_SYMBOL(tty_std_termios);
> +#undef __NOKERNINFO
>  
>  /* This list gets poked at by procfs and various bits of boot up code. This
>   * could do with some rationalisation such as pulling the tty proc function
> diff --git a/drivers/tty/tty_ioctl.c b/drivers/tty/tty_ioctl.c
> index 507a25d692bb..b250eabca1ba 100644
> --- a/drivers/tty/tty_ioctl.c
> +++ b/drivers/tty/tty_ioctl.c
> @@ -809,6 +809,10 @@ int tty_mode_ioctl(struct tty_struct *tty, struct file *file,
>  		if (get_user(arg, (unsigned int __user *) arg))
>  			return -EFAULT;
>  		return tty_change_softcar(real_tty, arg);
> +#ifdef TIOCSTAT
> +	case TIOCSTAT:
> +		return tty_status(real_tty);
> +#endif
>  	default:
>  		return -ENOIOCTLCMD;
>  	}
> diff --git a/drivers/tty/tty_status.c b/drivers/tty/tty_status.c
> new file mode 100644

Nitpick: the new functionality is part of n_tty and not the generic tty
subsystem, so "tty_status.c" is a misleading name for the new file;
something like "n_tty_status.c" would fit better. It has no use in the
various modem drivers, for example.
Likewise for the tty_status() function.

> index 000000000000..a9600f5bd48c
> --- /dev/null
> +++ b/drivers/tty/tty_status.c
> @@ -0,0 +1,135 @@
> +// SPDX-License-Identifier: GPL-1.0+
> +/*
> + * tty_status.c --- implements VSTATUS and TIOCSTAT from BSD4.3/4.4
> + *
> + */
> +
> +#include <linux/sched.h>
> +#include <linux/mm.h>
> +#include <linux/tty.h>
> +#include <linux/sched/cputime.h>
> +#include <linux/sched/loadavg.h>
> +#include <linux/pid.h>
> +#include <linux/slab.h>
> +#include <linux/math64.h>
> +
> +#define MSGLEN (160 + TASK_COMM_LEN)
> +
> +inline unsigned long getRSSk(struct mm_struct *mm)
> +{
> +	if (mm == NULL)
> +		return 0;
> +	return get_mm_rss(mm) * PAGE_SIZE / 1024;
> +}
> +
> +inline long nstoms(long l)
> +{
> +	l /= NSEC_PER_MSEC * 10;
> +	if (l < 10)
> +		l *= 10;
> +	return l;
> +}
> +
> +inline struct task_struct *compare(struct task_struct *new,
> +				   struct task_struct *old)
> +{
> +	unsigned int ostate, nstate;
> +
> +	if (old == NULL)
> +		return new;
> +
> +	ostate = task_state_index(old);
> +	nstate = task_state_index(new);
> +
> +	if (ostate == nstate) {
> +		if (old->start_time > new->start_time)
> +			return old;
> +		return new;
> +	}
> +
> +	if (ostate < nstate)
> +		return old;
> +
> +	return new;
> +}
> +
> +struct task_struct *pick_process(struct pid *pgrp)
> +{
> +	struct task_struct *p, *winner = NULL;
> +
> +	read_lock(&tasklist_lock);
> +	do_each_pid_task(pgrp, PIDTYPE_PGID, p) {
> +		winner = compare(p, winner);
> +	} while_each_pid_task(pgrp, PIDTYPE_PGID, p);
> +	read_unlock(&tasklist_lock);
> +
> +	return winner;
> +}
> +
> +int tty_status(struct tty_struct *tty)
> +{
> +	char tname[TASK_COMM_LEN];
> +	unsigned long loadavg[3];
> +	uint64_t pcpu, cputime, wallclock;
> +	struct task_struct *p;
> +	struct rusage rusage;
> +	struct timespec64 utime, stime, rtime;
> +	char msg[MSGLEN] = {0};
> +	int len = 0;
> +
> +	if (tty == NULL)
> +		return -ENOTTY;
> +
> +	get_avenrun(loadavg, FIXED_1/200, 0);
> +	len += scnprintf((char *)&msg[len], MSGLEN - len, "load: %lu.%02lu  ",
> +		       LOAD_INT(loadavg[0]), LOAD_FRAC(loadavg[0]));
> +
> +	if (tty->ctrl.session == NULL) {
> +		len += scnprintf((char *)&msg[len], MSGLEN - len,
> +				 "not a controlling terminal");
> +		goto print;
> +	}
> +
> +	if (tty->ctrl.pgrp == NULL) {
> +		len += scnprintf((char *)&msg[len], MSGLEN - len,
> +				 "no foreground process group");
> +		goto print;
> +	}
> +
> +	p = pick_process(tty->ctrl.pgrp);
> +	if (p == NULL) {
> +		len += scnprintf((char *)&msg[len], MSGLEN - len,
> +				 "empty foreground process group");
> +		goto print;
> +	}
> +
> +	get_task_comm(tname, p);
> +	getrusage(p, RUSAGE_BOTH, &rusage);
> +	wallclock = ktime_get_ns() - p->start_time;
> +
> +	utime.tv_sec = rusage.ru_utime.tv_sec;
> +	utime.tv_nsec = rusage.ru_utime.tv_usec * NSEC_PER_USEC;
> +	stime.tv_sec = rusage.ru_stime.tv_sec;
> +	stime.tv_nsec = rusage.ru_stime.tv_usec * NSEC_PER_USEC;
> +	rtime = ns_to_timespec64(wallclock);
> +
> +	cputime = timespec64_to_ns(&utime) + timespec64_to_ns(&stime);
> +	pcpu = div64_u64(cputime * 100, wallclock);
> +
> +	len += scnprintf((char *)&msg[len], MSGLEN - len,
> +			 /* task, PID, task state */
> +			 "cmd: %s %d [%s] "
> +			 /* rtime,    utime,      stime,      %cpu,  rss */
> +			 "%llu.%02lur %llu.%02luu %llu.%02lus %llu%% %luk",
> +			 tname,	task_pid_vnr(p), (char *)get_task_state_name(p),
> +			 rtime.tv_sec, nstoms(rtime.tv_nsec),
> +			 utime.tv_sec, nstoms(utime.tv_nsec),
> +			 stime.tv_sec, nstoms(stime.tv_nsec),
> +			 pcpu, getRSSk(p->mm));
> +
> +print:
> +	len += scnprintf((char *)&msg[len], MSGLEN - len, "\r\n");
> +	tty_write_message(tty, msg);

tty_write_message() is quite risky to use; while writing my
implementation a couple of years ago I found it easy to accidentally
set up deadlocks with this interface, in particular if the function is
called from the tty character receive path.
I hope you're testing the functionality with CONFIG_PROVE_LOCKING enabled.

> +
> +	return 0;
> +}
> diff --git a/fs/proc/array.c b/fs/proc/array.c
> index f37c03077b58..eb14306cdde2 100644
> --- a/fs/proc/array.c
> +++ b/fs/proc/array.c
> @@ -62,6 +62,7 @@
>  #include <linux/tty.h>
>  #include <linux/string.h>
>  #include <linux/mman.h>
> +#include <linux/sched.h>
>  #include <linux/sched/mm.h>
>  #include <linux/sched/numa_balancing.h>
>  #include <linux/sched/task_stack.h>
> @@ -111,34 +112,6 @@ void proc_task_name(struct seq_file *m, struct task_struct *p, bool escape)
>  		seq_printf(m, "%.64s", tcomm);
>  }
>  
> -/*
> - * The task state array is a strange "bitmap" of
> - * reasons to sleep. Thus "running" is zero, and
> - * you can test for combinations of others with
> - * simple bit tests.
> - */
> -static const char * const task_state_array[] = {
> -
> -	/* states in TASK_REPORT: */
> -	"R (running)",		/* 0x00 */
> -	"S (sleeping)",		/* 0x01 */
> -	"D (disk sleep)",	/* 0x02 */
> -	"T (stopped)",		/* 0x04 */
> -	"t (tracing stop)",	/* 0x08 */
> -	"X (dead)",		/* 0x10 */
> -	"Z (zombie)",		/* 0x20 */
> -	"P (parked)",		/* 0x40 */
> -
> -	/* states beyond TASK_REPORT: */
> -	"I (idle)",		/* 0x80 */
> -};
> -
> -static inline const char *get_task_state(struct task_struct *tsk)
> -{
> -	BUILD_BUG_ON(1 + ilog2(TASK_REPORT_MAX) != ARRAY_SIZE(task_state_array));
> -	return task_state_array[task_state_index(tsk)];
> -}
> -
>  static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
>  				struct pid *pid, struct task_struct *p)
>  {
> diff --git a/include/asm-generic/termios.h b/include/asm-generic/termios.h
> index b1398d0d4a1d..9b080e1a82d4 100644
> --- a/include/asm-generic/termios.h
> +++ b/include/asm-generic/termios.h
> @@ -10,9 +10,9 @@
>  	eof=^D		vtime=\0	vmin=\1		sxtc=\0
>  	start=^Q	stop=^S		susp=^Z		eol=\0
>  	reprint=^R	discard=^U	werase=^W	lnext=^V
> -	eol2=\0
> +	eol2=\0         status=^T
>  */
> -#define INIT_C_CC "\003\034\177\025\004\0\1\0\021\023\032\0\022\017\027\026\0"
> +#define INIT_C_CC "\003\034\177\025\004\0\1\0\021\023\032\0\022\017\027\026\0\024"
>  
>  /*
>   * Translate a "termio" structure into a "termios". Ugh.
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index c1a927ddec64..2171074ec8f5 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -70,7 +70,7 @@ struct task_group;
>  
>  /*
>   * Task state bitmask. NOTE! These bits are also
> - * encoded in fs/proc/array.c: get_task_state().
> + * encoded in get_task_state().
>   *
>   * We have two separate sets of flags: task->state
>   * is about runnability, while task->exit_state are
> @@ -1643,6 +1643,56 @@ static inline char task_state_to_char(struct task_struct *tsk)
>  	return task_index_to_char(task_state_index(tsk));
>  }
>  
> +static inline const char *get_task_state_name(struct task_struct *tsk)
> +{
> +	static const char * const task_state_array[] = {
> +
> +		/* states in TASK_REPORT: */
> +		"running",		/* 0x00 */
> +		"sleeping",		/* 0x01 */
> +		"disk sleep",		/* 0x02 */
> +		"stopped",		/* 0x04 */
> +		"tracing stop",		/* 0x08 */
> +		"dead",			/* 0x10 */
> +		"zombie",		/* 0x20 */
> +		"parked",		/* 0x40 */
> +
> +		/* states beyond TASK_REPORT: */
> +		"idle",			/* 0x80 */
> +	};
> +
> +	BUILD_BUG_ON(1 + ilog2(TASK_REPORT_MAX) != ARRAY_SIZE(task_state_array));
> +	return task_state_array[task_state_index(tsk)];
> +}
> +
> +static inline const char *get_task_state(struct task_struct *tsk)
> +{
> +	/*
> +	 * The task state array is a strange "bitmap" of
> +	 * reasons to sleep. Thus "running" is zero, and
> +	 * you can test for combinations of others with
> +	 * simple bit tests.
> +	 */
> +	static const char * const task_state_array[] = {
> +
> +		/* states in TASK_REPORT: */
> +		"R (running)",		/* 0x00 */
> +		"S (sleeping)",		/* 0x01 */
> +		"D (disk sleep)",	/* 0x02 */
> +		"T (stopped)",		/* 0x04 */
> +		"t (tracing stop)",	/* 0x08 */
> +		"X (dead)",		/* 0x10 */
> +		"Z (zombie)",		/* 0x20 */
> +		"P (parked)",		/* 0x40 */
> +
> +		/* states beyond TASK_REPORT: */
> +		"I (idle)",		/* 0x80 */
> +	};
> +
> +	BUILD_BUG_ON(1 + ilog2(TASK_REPORT_MAX) != ARRAY_SIZE(task_state_array));
> +	return task_state_array[task_state_index(tsk)];
> +}
> +
>  /**
>   * is_global_init - check if a task structure is init. Since init
>   * is free to have sub-threads we need to check tgid.
> diff --git a/include/linux/signal.h b/include/linux/signal.h
> index b77f9472a37c..76bda1a20578 100644
> --- a/include/linux/signal.h
> +++ b/include/linux/signal.h
> @@ -541,6 +541,7 @@ extern bool unhandled_signal(struct task_struct *tsk, int sig);
>   *	|  non-POSIX signal  |  default action  |
>   *	+--------------------+------------------+
>   *	|  SIGEMT            |  coredump	|
> + *	|  SIGINFO	     |	ignore		|
>   *	+--------------------+------------------+
>   *
>   * (+) For SIGKILL and SIGSTOP the action is "always", not just "default".
> @@ -567,6 +568,9 @@ static inline int sig_kernel_ignore(unsigned long sig)
>  	return	sig == SIGCONT	||
>  		sig == SIGCHLD	||
>  		sig == SIGWINCH ||
> +#if defined(SIGINFO) && SIGINFO != SIGPWR
> +		sig == SIGINFO  ||
> +#endif
>  		sig == SIGURG;
>  }
>  
> diff --git a/include/linux/tty.h b/include/linux/tty.h
> index 168e57e40bbb..943d85aa471c 100644
> --- a/include/linux/tty.h
> +++ b/include/linux/tty.h
> @@ -49,6 +49,9 @@
>  #define WERASE_CHAR(tty) ((tty)->termios.c_cc[VWERASE])
>  #define LNEXT_CHAR(tty)	((tty)->termios.c_cc[VLNEXT])
>  #define EOL2_CHAR(tty) ((tty)->termios.c_cc[VEOL2])
> +#ifdef VSTATUS
> +#define STATUS_CHAR(tty) ((tty)->termios.c_cc[VSTATUS])
> +#endif
>  
>  #define _I_FLAG(tty, f)	((tty)->termios.c_iflag & (f))
>  #define _O_FLAG(tty, f)	((tty)->termios.c_oflag & (f))
> @@ -114,6 +117,9 @@
>  #define L_PENDIN(tty)	_L_FLAG((tty), PENDIN)
>  #define L_IEXTEN(tty)	_L_FLAG((tty), IEXTEN)
>  #define L_EXTPROC(tty)	_L_FLAG((tty), EXTPROC)
> +#ifdef NOKERNINFO
> +#define L_NOKERNINFO(tty) _L_FLAG((tty), NOKERNINFO)
> +#endif
>  
>  struct device;
>  struct signal_struct;
> @@ -428,4 +434,6 @@ extern void tty_lock_slave(struct tty_struct *tty);
>  extern void tty_unlock_slave(struct tty_struct *tty);
>  extern void tty_set_lock_subclass(struct tty_struct *tty);
>  
> +extern int tty_status(struct tty_struct *tty);
> +
>  #endif
> diff --git a/include/uapi/asm-generic/ioctls.h b/include/uapi/asm-generic/ioctls.h
> index cdc9f4ca8c27..baa2b8d42679 100644
> --- a/include/uapi/asm-generic/ioctls.h
> +++ b/include/uapi/asm-generic/ioctls.h
> @@ -97,6 +97,8 @@
>  
>  #define TIOCMIWAIT	0x545C	/* wait for a change on serial input line(s) */
>  #define TIOCGICOUNT	0x545D	/* read serial port inline interrupt counts */
> +/* Some architectures use 0x545E for FIOQSIZE */
> +#define TIOCSTAT        0x545F	/* display process group stats on tty */
>  
>  /*
>   * Some arches already define FIOQSIZE due to a historical
> diff --git a/include/uapi/asm-generic/signal.h b/include/uapi/asm-generic/signal.h
> index 3c4cc9b8378e..0b771eb1db94 100644
> --- a/include/uapi/asm-generic/signal.h
> +++ b/include/uapi/asm-generic/signal.h
> @@ -4,7 +4,7 @@
>  
>  #include <linux/types.h>
>  
> -#define _NSIG		64
> +#define _NSIG		65
>  #define _NSIG_BPW	__BITS_PER_LONG
>  #define _NSIG_WORDS	((_NSIG + _NSIG_BPW - 1) / _NSIG_BPW)
>  
> @@ -49,9 +49,11 @@
>  /* These should not be considered constants from userland.  */
>  #define SIGRTMIN	32
>  #ifndef SIGRTMAX
> -#define SIGRTMAX	_NSIG
> +#define SIGRTMAX	64
>  #endif
>  
> +#define SIGINFO		65
> +
>  #if !defined MINSIGSTKSZ || !defined SIGSTKSZ
>  #define MINSIGSTKSZ	2048
>  #define SIGSTKSZ	8192
> diff --git a/include/uapi/asm-generic/termbits.h b/include/uapi/asm-generic/termbits.h
> index 2fbaf9ae89dd..cb4e9c6d629f 100644
> --- a/include/uapi/asm-generic/termbits.h
> +++ b/include/uapi/asm-generic/termbits.h
> @@ -58,6 +58,7 @@ struct ktermios {
>  #define VWERASE 14
>  #define VLNEXT 15
>  #define VEOL2 16
> +#define VSTATUS 17
>  
>  /* c_iflag bits */
>  #define IGNBRK	0000001
> @@ -164,22 +165,23 @@ struct ktermios {
>  #define IBSHIFT	  16		/* Shift from CBAUD to CIBAUD */
>  
>  /* c_lflag bits */
> -#define ISIG	0000001
> -#define ICANON	0000002
> -#define XCASE	0000004
> -#define ECHO	0000010
> -#define ECHOE	0000020
> -#define ECHOK	0000040
> -#define ECHONL	0000100
> -#define NOFLSH	0000200
> -#define TOSTOP	0000400
> -#define ECHOCTL	0001000
> -#define ECHOPRT	0002000
> -#define ECHOKE	0004000
> -#define FLUSHO	0010000
> -#define PENDIN	0040000
> -#define IEXTEN	0100000
> -#define EXTPROC	0200000
> +#define ISIG	   0000001
> +#define ICANON	   0000002
> +#define XCASE	   0000004
> +#define ECHO	   0000010
> +#define ECHOE	   0000020
> +#define ECHOK	   0000040
> +#define ECHONL	   0000100
> +#define NOFLSH	   0000200
> +#define TOSTOP	   0000400
> +#define ECHOCTL	   0001000
> +#define ECHOPRT	   0002000
> +#define ECHOKE	   0004000
> +#define FLUSHO	   0010000
> +#define PENDIN	   0040000
> +#define IEXTEN	   0100000
> +#define EXTPROC	   0200000
> +#define NOKERNINFO 0400000
>  
>  /* tcflow() and TCXONC use these */
>  #define	TCOOFF		0
> -- 
> 2.30.2
> 
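
The quoted asm-generic/signal.h hunk raises _NSIG from 64 to 65 and leaves the rounding-up _NSIG_WORDS definition in place. A minimal sketch of that arithmetic follows, assuming only the formula shown in the hunk; the helper name nsig_words() is hypothetical and is not part of the patch.

```c
#include <assert.h>

/*
 * Hypothetical helper mirroring the macro in the quoted hunk:
 *   _NSIG_WORDS = (_NSIG + _NSIG_BPW - 1) / _NSIG_BPW
 * It returns how many machine words are needed to hold one bit
 * per signal, rounding up when nsig is not a multiple of the
 * word size.
 */
static unsigned int nsig_words(unsigned int nsig, unsigned int bits_per_word)
{
	assert(bits_per_word != 0);
	return (nsig + bits_per_word - 1) / bits_per_word;
}
```

With 64-bit words, 64 signals fit in one word, while the 65th signal (SIGINFO in this patch) needs a second one; with 32-bit words the count goes from two to three.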


* Re: [RFC PATCH 8/8] signals: Support BSD VSTATUS, KERNINFO and SIGINFO
  2022-01-07 21:48   ` Arseny Maslennikov
@ 2022-01-07 21:52     ` Walt Drummond
  2022-01-07 22:39       ` Arseny Maslennikov
  0 siblings, 1 reply; 57+ messages in thread
From: Walt Drummond @ 2022-01-07 21:52 UTC (permalink / raw)
  To: Arseny Maslennikov
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Greg Kroah-Hartman, Jiri Slaby, Arnd Bergmann,
	Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, linux-kernel, linux-fsdevel,
	linux-arch

On Fri, Jan 7, 2022 at 1:48 PM Arseny Maslennikov <ar@cs.msu.ru> wrote:
>
> On Mon, Jan 03, 2022 at 10:19:56AM -0800, Walt Drummond wrote:
> > Support TTY VSTATUS character, NOKERNINFO local control bit and the
> > signal SIGINFO, all as in 4.3BSD.
> >
> > Signed-off-by: Walt Drummond <walt@drummond.us>
> > ---
> >  arch/x86/include/asm/signal.h       |   2 +-
> >  arch/x86/include/uapi/asm/signal.h  |   4 +-
> >  drivers/tty/Makefile                |   2 +-
> >  drivers/tty/n_tty.c                 |  21 +++++
> >  drivers/tty/tty_io.c                |  10 ++-
> >  drivers/tty/tty_ioctl.c             |   4 +
> >  drivers/tty/tty_status.c            | 135 ++++++++++++++++++++++++++++
> >  fs/proc/array.c                     |  29 +-----
> >  include/asm-generic/termios.h       |   4 +-
> >  include/linux/sched.h               |  52 ++++++++++-
> >  include/linux/signal.h              |   4 +
> >  include/linux/tty.h                 |   8 ++
> >  include/uapi/asm-generic/ioctls.h   |   2 +
> >  include/uapi/asm-generic/signal.h   |   6 +-
> >  include/uapi/asm-generic/termbits.h |  34 +++----
> >  15 files changed, 264 insertions(+), 53 deletions(-)
> >  create mode 100644 drivers/tty/tty_status.c
> >
> > diff --git a/arch/x86/include/asm/signal.h b/arch/x86/include/asm/signal.h
> > index d8e2efe6cd46..0a01877c11ab 100644
> > --- a/arch/x86/include/asm/signal.h
> > +++ b/arch/x86/include/asm/signal.h
> > @@ -8,7 +8,7 @@
> >  /* Most things should be clean enough to redefine this at will, if care
> >     is taken to make libc match.  */
> >
> > -#define _NSIG                64
> > +#define _NSIG                65
> >
> >  #ifdef __i386__
> >  # define _NSIG_BPW   32
> > diff --git a/arch/x86/include/uapi/asm/signal.h b/arch/x86/include/uapi/asm/signal.h
> > index 164a22a72984..60dca62d3dcf 100644
> > --- a/arch/x86/include/uapi/asm/signal.h
> > +++ b/arch/x86/include/uapi/asm/signal.h
> > @@ -60,7 +60,9 @@ typedef unsigned long sigset_t;
> >
> >  /* These should not be considered constants from userland.  */
> >  #define SIGRTMIN     32
> > -#define SIGRTMAX     _NSIG
> > +#define SIGRTMAX     64
> > +
> > +#define SIGINFO              65
> >
> >  #define SA_RESTORER  0x04000000
> >
> > diff --git a/drivers/tty/Makefile b/drivers/tty/Makefile
> > index a2bd75fbaaa4..d50ba690bb87 100644
> > --- a/drivers/tty/Makefile
> > +++ b/drivers/tty/Makefile
> > @@ -2,7 +2,7 @@
> >  obj-$(CONFIG_TTY)            += tty_io.o n_tty.o tty_ioctl.o tty_ldisc.o \
> >                                  tty_buffer.o tty_port.o tty_mutex.o \
> >                                  tty_ldsem.o tty_baudrate.o tty_jobctrl.o \
> > -                                n_null.o
> > +                                n_null.o tty_status.o
> >  obj-$(CONFIG_LEGACY_PTYS)    += pty.o
> >  obj-$(CONFIG_UNIX98_PTYS)    += pty.o
> >  obj-$(CONFIG_AUDIT)          += tty_audit.o
> > diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
> > index 0ec93f1a61f5..b510e01289fd 100644
> > --- a/drivers/tty/n_tty.c
> > +++ b/drivers/tty/n_tty.c
> > @@ -1334,6 +1334,24 @@ static void n_tty_receive_char_special(struct tty_struct *tty, unsigned char c)
> >                       commit_echoes(tty);
> >                       return;
> >               }
> > +#ifdef VSTATUS
> > +             if (c == STATUS_CHAR(tty)) {
> > +                     /* Do the status message first and then send
> > +                      * the signal, otherwise signal delivery can
> > +                      * change the process state making the status
> > +                      * message misleading.  Also, use __isig() and
> > +                      * not sig(), as if we flush the tty we can
> > +                      * lose parts of the message.
>
> ...As well as the character input in the canonical mode's built-in line
> editor.
>

Yes, good catch.  But this is not going to be in the next version of the patch.

> > +                      */
> > +
> > +                     if (!L_NOKERNINFO(tty))
> > +                             tty_status(tty);
> > +# if defined(SIGINFO) && SIGINFO != SIGPWR
> > +                     __isig(SIGINFO, tty);
> > +# endif
> > +                     return;
> > +             }
> > +#endif       /* VSTATUS */
> >               if (c == '\n') {
> >                       if (L_ECHO(tty) || L_ECHONL(tty)) {
> >                               echo_char_raw('\n', ldata);
> > @@ -1763,6 +1781,9 @@ static void n_tty_set_termios(struct tty_struct *tty, struct ktermios *old)
> >                       set_bit(EOF_CHAR(tty), ldata->char_map);
> >                       set_bit('\n', ldata->char_map);
> >                       set_bit(EOL_CHAR(tty), ldata->char_map);
> > +#ifdef VSTATUS
> > +                     set_bit(STATUS_CHAR(tty), ldata->char_map);
> > +#endif
> >                       if (L_IEXTEN(tty)) {
> >                               set_bit(WERASE_CHAR(tty), ldata->char_map);
> >                               set_bit(LNEXT_CHAR(tty), ldata->char_map);
> > diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
> > index 6616d4a0d41d..8e488ecba330 100644
> > --- a/drivers/tty/tty_io.c
> > +++ b/drivers/tty/tty_io.c
> > @@ -120,18 +120,26 @@
> >  #define TTY_PARANOIA_CHECK 1
> >  #define CHECK_TTY_COUNT 1
> >
> > +/* Less ugly than an ifdef in the middle of the initalizer below, maybe? */
> > +#ifdef NOKERNINFO
> > +# define __NOKERNINFO NOKERNINFO
> > +#else
> > +# define __NOKERNINFO 0
> > +#endif
> > +
> >  struct ktermios tty_std_termios = {  /* for the benefit of tty drivers  */
> >       .c_iflag = ICRNL | IXON,
> >       .c_oflag = OPOST | ONLCR,
> >       .c_cflag = B38400 | CS8 | CREAD | HUPCL,
> >       .c_lflag = ISIG | ICANON | ECHO | ECHOE | ECHOK |
> > -                ECHOCTL | ECHOKE | IEXTEN,
> > +                ECHOCTL | ECHOKE | IEXTEN | __NOKERNINFO,
> >       .c_cc = INIT_C_CC,
> >       .c_ispeed = 38400,
> >       .c_ospeed = 38400,
> >       /* .c_line = N_TTY, */
> >  };
> >  EXPORT_SYMBOL(tty_std_termios);
> > +#undef __NOKERNINFO
> >
> >  /* This list gets poked at by procfs and various bits of boot up code. This
> >   * could do with some rationalisation such as pulling the tty proc function
> > diff --git a/drivers/tty/tty_ioctl.c b/drivers/tty/tty_ioctl.c
> > index 507a25d692bb..b250eabca1ba 100644
> > --- a/drivers/tty/tty_ioctl.c
> > +++ b/drivers/tty/tty_ioctl.c
> > @@ -809,6 +809,10 @@ int tty_mode_ioctl(struct tty_struct *tty, struct file *file,
> >               if (get_user(arg, (unsigned int __user *) arg))
> >                       return -EFAULT;
> >               return tty_change_softcar(real_tty, arg);
> > +#ifdef TIOCSTAT
> > +     case TIOCSTAT:
> > +             return tty_status(real_tty);
> > +#endif
> >       default:
> >               return -ENOIOCTLCMD;
> >       }
> > diff --git a/drivers/tty/tty_status.c b/drivers/tty/tty_status.c
> > new file mode 100644
>
> Nitpick: the new functionality is part of n_tty and not the generic tty
> subsystem, so "tty_status.c" is a misleading name for the new file,
> unlike e. g. "n_tty_status.c". It has no use in the various modem
> drivers, for example.
> Likewise for the tty_status() function.

ACK, will do.

>
> > index 000000000000..a9600f5bd48c
> > --- /dev/null
> > +++ b/drivers/tty/tty_status.c
> > @@ -0,0 +1,135 @@
> > +// SPDX-License-Identifier: GPL-1.0+
> > +/*
> > + * tty_status.c --- implements VSTATUS and TIOCSTAT from BSD4.3/4.4
> > + *
> > + */
> > +
> > +#include <linux/sched.h>
> > +#include <linux/mm.h>
> > +#include <linux/tty.h>
> > +#include <linux/sched/cputime.h>
> > +#include <linux/sched/loadavg.h>
> > +#include <linux/pid.h>
> > +#include <linux/slab.h>
> > +#include <linux/math64.h>
> > +
> > +#define MSGLEN (160 + TASK_COMM_LEN)
> > +
> > +inline unsigned long getRSSk(struct mm_struct *mm)
> > +{
> > +     if (mm == NULL)
> > +             return 0;
> > +     return get_mm_rss(mm) * PAGE_SIZE / 1024;
> > +}
> > +
> > +inline long nstoms(long l)
> > +{
> > +     l /= NSEC_PER_MSEC * 10;
> > +     if (l < 10)
> > +             l *= 10;
> > +     return l;
> > +}
> > +
> > +inline struct task_struct *compare(struct task_struct *new,
> > +                                struct task_struct *old)
> > +{
> > +     unsigned int ostate, nstate;
> > +
> > +     if (old == NULL)
> > +             return new;
> > +
> > +     ostate = task_state_index(old);
> > +     nstate = task_state_index(new);
> > +
> > +     if (ostate == nstate) {
> > +             if (old->start_time > new->start_time)
> > +                     return old;
> > +             return new;
> > +     }
> > +
> > +     if (ostate < nstate)
> > +             return old;
> > +
> > +     return new;
> > +}
> > +
> > +struct task_struct *pick_process(struct pid *pgrp)
> > +{
> > +     struct task_struct *p, *winner = NULL;
> > +
> > +     read_lock(&tasklist_lock);
> > +     do_each_pid_task(pgrp, PIDTYPE_PGID, p) {
> > +             winner = compare(p, winner);
> > +     } while_each_pid_task(pgrp, PIDTYPE_PGID, p);
> > +     read_unlock(&tasklist_lock);
> > +
> > +     return winner;
> > +}
> > +
> > +int tty_status(struct tty_struct *tty)
> > +{
> > +     char tname[TASK_COMM_LEN];
> > +     unsigned long loadavg[3];
> > +     uint64_t pcpu, cputime, wallclock;
> > +     struct task_struct *p;
> > +     struct rusage rusage;
> > +     struct timespec64 utime, stime, rtime;
> > +     char msg[MSGLEN] = {0};
> > +     int len = 0;
> > +
> > +     if (tty == NULL)
> > +             return -ENOTTY;
> > +
> > +     get_avenrun(loadavg, FIXED_1/200, 0);
> > +     len += scnprintf((char *)&msg[len], MSGLEN - len, "load: %lu.%02lu  ",
> > +                    LOAD_INT(loadavg[0]), LOAD_FRAC(loadavg[0]));
> > +
> > +     if (tty->ctrl.session == NULL) {
> > +             len += scnprintf((char *)&msg[len], MSGLEN - len,
> > +                              "not a controlling terminal");
> > +             goto print;
> > +     }
> > +
> > +     if (tty->ctrl.pgrp == NULL) {
> > +             len += scnprintf((char *)&msg[len], MSGLEN - len,
> > +                              "no foreground process group");
> > +             goto print;
> > +     }
> > +
> > +     p = pick_process(tty->ctrl.pgrp);
> > +     if (p == NULL) {
> > +             len += scnprintf((char *)&msg[len], MSGLEN - len,
> > +                              "empty foreground process group");
> > +             goto print;
> > +     }
> > +
> > +     get_task_comm(tname, p);
> > +     getrusage(p, RUSAGE_BOTH, &rusage);
> > +     wallclock = ktime_get_ns() - p->start_time;
> > +
> > +     utime.tv_sec = rusage.ru_utime.tv_sec;
> > +     utime.tv_nsec = rusage.ru_utime.tv_usec * NSEC_PER_USEC;
> > +     stime.tv_sec = rusage.ru_stime.tv_sec;
> > +     stime.tv_nsec = rusage.ru_stime.tv_usec * NSEC_PER_USEC;
> > +     rtime = ns_to_timespec64(wallclock);
> > +
> > +     cputime = timespec64_to_ns(&utime) + timespec64_to_ns(&stime);
> > +     pcpu = div64_u64(cputime * 100, wallclock);
> > +
> > +     len += scnprintf((char *)&msg[len], MSGLEN - len,
> > +                      /* task, PID, task state */
> > +                      "cmd: %s %d [%s] "
> > +                      /* rtime,    utime,      stime,      %cpu,  rss */
> > +                      "%llu.%02lur %llu.%02luu %llu.%02lus %llu%% %luk",
> > +                      tname, task_pid_vnr(p), (char *)get_task_state_name(p),
> > +                      rtime.tv_sec, nstoms(rtime.tv_nsec),
> > +                      utime.tv_sec, nstoms(utime.tv_nsec),
> > +                      stime.tv_sec, nstoms(stime.tv_nsec),
> > +                      pcpu, getRSSk(p->mm));
> > +
> > +print:
> > +     len += scnprintf((char *)&msg[len], MSGLEN - len, "\r\n");
> > +     tty_write_message(tty, msg);
>
> tty_write_message() is quite risky to use; while writing my
> implementation a couple of years ago I've found it easy to accidentally
> set up deadlocks with this interface — in particular if the function is
> called from the tty character receive path.
> I hope you're testing the functionality with CONFIG_PROVE_LOCKING enabled.

I have not, but I will.

Is there a different 'put a message on this tty' API I should be using?

Thanks.

>
> > +
> > +     return 0;
> > +}
> > diff --git a/fs/proc/array.c b/fs/proc/array.c
> > index f37c03077b58..eb14306cdde2 100644
> > --- a/fs/proc/array.c
> > +++ b/fs/proc/array.c
> > @@ -62,6 +62,7 @@
> >  #include <linux/tty.h>
> >  #include <linux/string.h>
> >  #include <linux/mman.h>
> > +#include <linux/sched.h>
> >  #include <linux/sched/mm.h>
> >  #include <linux/sched/numa_balancing.h>
> >  #include <linux/sched/task_stack.h>
> > @@ -111,34 +112,6 @@ void proc_task_name(struct seq_file *m, struct task_struct *p, bool escape)
> >               seq_printf(m, "%.64s", tcomm);
> >  }
> >
> > -/*
> > - * The task state array is a strange "bitmap" of
> > - * reasons to sleep. Thus "running" is zero, and
> > - * you can test for combinations of others with
> > - * simple bit tests.
> > - */
> > -static const char * const task_state_array[] = {
> > -
> > -     /* states in TASK_REPORT: */
> > -     "R (running)",          /* 0x00 */
> > -     "S (sleeping)",         /* 0x01 */
> > -     "D (disk sleep)",       /* 0x02 */
> > -     "T (stopped)",          /* 0x04 */
> > -     "t (tracing stop)",     /* 0x08 */
> > -     "X (dead)",             /* 0x10 */
> > -     "Z (zombie)",           /* 0x20 */
> > -     "P (parked)",           /* 0x40 */
> > -
> > -     /* states beyond TASK_REPORT: */
> > -     "I (idle)",             /* 0x80 */
> > -};
> > -
> > -static inline const char *get_task_state(struct task_struct *tsk)
> > -{
> > -     BUILD_BUG_ON(1 + ilog2(TASK_REPORT_MAX) != ARRAY_SIZE(task_state_array));
> > -     return task_state_array[task_state_index(tsk)];
> > -}
> > -
> >  static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
> >                               struct pid *pid, struct task_struct *p)
> >  {
> > diff --git a/include/asm-generic/termios.h b/include/asm-generic/termios.h
> > index b1398d0d4a1d..9b080e1a82d4 100644
> > --- a/include/asm-generic/termios.h
> > +++ b/include/asm-generic/termios.h
> > @@ -10,9 +10,9 @@
> >       eof=^D          vtime=\0        vmin=\1         sxtc=\0
> >       start=^Q        stop=^S         susp=^Z         eol=\0
> >       reprint=^R      discard=^U      werase=^W       lnext=^V
> > -     eol2=\0
> > +     eol2=\0         status=^T
> >  */
> > -#define INIT_C_CC "\003\034\177\025\004\0\1\0\021\023\032\0\022\017\027\026\0"
> > +#define INIT_C_CC "\003\034\177\025\004\0\1\0\021\023\032\0\022\017\027\026\0\024"
> >
> >  /*
> >   * Translate a "termio" structure into a "termios". Ugh.
> > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > index c1a927ddec64..2171074ec8f5 100644
> > --- a/include/linux/sched.h
> > +++ b/include/linux/sched.h
> > @@ -70,7 +70,7 @@ struct task_group;
> >
> >  /*
> >   * Task state bitmask. NOTE! These bits are also
> > - * encoded in fs/proc/array.c: get_task_state().
> > + * encoded in get_task_state().
> >   *
> >   * We have two separate sets of flags: task->state
> >   * is about runnability, while task->exit_state are
> > @@ -1643,6 +1643,56 @@ static inline char task_state_to_char(struct task_struct *tsk)
> >       return task_index_to_char(task_state_index(tsk));
> >  }
> >
> > +static inline const char *get_task_state_name(struct task_struct *tsk)
> > +{
> > +     static const char * const task_state_array[] = {
> > +
> > +             /* states in TASK_REPORT: */
> > +             "running",              /* 0x00 */
> > +             "sleeping",             /* 0x01 */
> > +             "disk sleep",           /* 0x02 */
> > +             "stopped",              /* 0x04 */
> > +             "tracing stop",         /* 0x08 */
> > +             "dead",                 /* 0x10 */
> > +             "zombie",               /* 0x20 */
> > +             "parked",               /* 0x40 */
> > +
> > +             /* states beyond TASK_REPORT: */
> > +             "idle",                 /* 0x80 */
> > +     };
> > +
> > +     BUILD_BUG_ON(1 + ilog2(TASK_REPORT_MAX) != ARRAY_SIZE(task_state_array));
> > +     return task_state_array[task_state_index(tsk)];
> > +}
> > +
> > +static inline const char *get_task_state(struct task_struct *tsk)
> > +{
> > +     /*
> > +      * The task state array is a strange "bitmap" of
> > +      * reasons to sleep. Thus "running" is zero, and
> > +      * you can test for combinations of others with
> > +      * simple bit tests.
> > +      */
> > +     static const char * const task_state_array[] = {
> > +
> > +             /* states in TASK_REPORT: */
> > +             "R (running)",          /* 0x00 */
> > +             "S (sleeping)",         /* 0x01 */
> > +             "D (disk sleep)",       /* 0x02 */
> > +             "T (stopped)",          /* 0x04 */
> > +             "t (tracing stop)",     /* 0x08 */
> > +             "X (dead)",             /* 0x10 */
> > +             "Z (zombie)",           /* 0x20 */
> > +             "P (parked)",           /* 0x40 */
> > +
> > +             /* states beyond TASK_REPORT: */
> > +             "I (idle)",             /* 0x80 */
> > +     };
> > +
> > +     BUILD_BUG_ON(1 + ilog2(TASK_REPORT_MAX) != ARRAY_SIZE(task_state_array));
> > +     return task_state_array[task_state_index(tsk)];
> > +}
> > +
> >  /**
> >   * is_global_init - check if a task structure is init. Since init
> >   * is free to have sub-threads we need to check tgid.
> > diff --git a/include/linux/signal.h b/include/linux/signal.h
> > index b77f9472a37c..76bda1a20578 100644
> > --- a/include/linux/signal.h
> > +++ b/include/linux/signal.h
> > @@ -541,6 +541,7 @@ extern bool unhandled_signal(struct task_struct *tsk, int sig);
> >   *   |  non-POSIX signal  |  default action  |
> >   *   +--------------------+------------------+
> >   *   |  SIGEMT            |  coredump        |
> > + *   |  SIGINFO           |  ignore          |
> >   *   +--------------------+------------------+
> >   *
> >   * (+) For SIGKILL and SIGSTOP the action is "always", not just "default".
> > @@ -567,6 +568,9 @@ static inline int sig_kernel_ignore(unsigned long sig)
> >       return  sig == SIGCONT  ||
> >               sig == SIGCHLD  ||
> >               sig == SIGWINCH ||
> > +#if defined(SIGINFO) && SIGINFO != SIGPWR
> > +             sig == SIGINFO  ||
> > +#endif
> >               sig == SIGURG;
> >  }
> >
> > diff --git a/include/linux/tty.h b/include/linux/tty.h
> > index 168e57e40bbb..943d85aa471c 100644
> > --- a/include/linux/tty.h
> > +++ b/include/linux/tty.h
> > @@ -49,6 +49,9 @@
> >  #define WERASE_CHAR(tty) ((tty)->termios.c_cc[VWERASE])
> >  #define LNEXT_CHAR(tty)      ((tty)->termios.c_cc[VLNEXT])
> >  #define EOL2_CHAR(tty) ((tty)->termios.c_cc[VEOL2])
> > +#ifdef VSTATUS
> > +#define STATUS_CHAR(tty) ((tty)->termios.c_cc[VSTATUS])
> > +#endif
> >
> >  #define _I_FLAG(tty, f)      ((tty)->termios.c_iflag & (f))
> >  #define _O_FLAG(tty, f)      ((tty)->termios.c_oflag & (f))
> > @@ -114,6 +117,9 @@
> >  #define L_PENDIN(tty)        _L_FLAG((tty), PENDIN)
> >  #define L_IEXTEN(tty)        _L_FLAG((tty), IEXTEN)
> >  #define L_EXTPROC(tty)       _L_FLAG((tty), EXTPROC)
> > +#ifdef NOKERNINFO
> > +#define L_NOKERNINFO(tty) _L_FLAG((tty), NOKERNINFO)
> > +#endif
> >
> >  struct device;
> >  struct signal_struct;
> > @@ -428,4 +434,6 @@ extern void tty_lock_slave(struct tty_struct *tty);
> >  extern void tty_unlock_slave(struct tty_struct *tty);
> >  extern void tty_set_lock_subclass(struct tty_struct *tty);
> >
> > +extern int tty_status(struct tty_struct *tty);
> > +
> >  #endif
> > diff --git a/include/uapi/asm-generic/ioctls.h b/include/uapi/asm-generic/ioctls.h
> > index cdc9f4ca8c27..baa2b8d42679 100644
> > --- a/include/uapi/asm-generic/ioctls.h
> > +++ b/include/uapi/asm-generic/ioctls.h
> > @@ -97,6 +97,8 @@
> >
> >  #define TIOCMIWAIT   0x545C  /* wait for a change on serial input line(s) */
> >  #define TIOCGICOUNT  0x545D  /* read serial port inline interrupt counts */
> > +/* Some architectures use 0x545E for FIOQSIZE */
> > +#define TIOCSTAT        0x545F       /* display process group stats on tty */
> >
> >  /*
> >   * Some arches already define FIOQSIZE due to a historical
> > diff --git a/include/uapi/asm-generic/signal.h b/include/uapi/asm-generic/signal.h
> > index 3c4cc9b8378e..0b771eb1db94 100644
> > --- a/include/uapi/asm-generic/signal.h
> > +++ b/include/uapi/asm-generic/signal.h
> > @@ -4,7 +4,7 @@
> >
> >  #include <linux/types.h>
> >
> > -#define _NSIG                64
> > +#define _NSIG                65
> >  #define _NSIG_BPW    __BITS_PER_LONG
> >  #define _NSIG_WORDS  ((_NSIG + _NSIG_BPW - 1) / _NSIG_BPW)
> >
> > @@ -49,9 +49,11 @@
> >  /* These should not be considered constants from userland.  */
> >  #define SIGRTMIN     32
> >  #ifndef SIGRTMAX
> > -#define SIGRTMAX     _NSIG
> > +#define SIGRTMAX     64
> >  #endif
> >
> > +#define SIGINFO              65
> > +
> >  #if !defined MINSIGSTKSZ || !defined SIGSTKSZ
> >  #define MINSIGSTKSZ  2048
> >  #define SIGSTKSZ     8192
> > diff --git a/include/uapi/asm-generic/termbits.h b/include/uapi/asm-generic/termbits.h
> > index 2fbaf9ae89dd..cb4e9c6d629f 100644
> > --- a/include/uapi/asm-generic/termbits.h
> > +++ b/include/uapi/asm-generic/termbits.h
> > @@ -58,6 +58,7 @@ struct ktermios {
> >  #define VWERASE 14
> >  #define VLNEXT 15
> >  #define VEOL2 16
> > +#define VSTATUS 17
> >
> >  /* c_iflag bits */
> >  #define IGNBRK       0000001
> > @@ -164,22 +165,23 @@ struct ktermios {
> >  #define IBSHIFT        16            /* Shift from CBAUD to CIBAUD */
> >
> >  /* c_lflag bits */
> > -#define ISIG 0000001
> > -#define ICANON       0000002
> > -#define XCASE        0000004
> > -#define ECHO 0000010
> > -#define ECHOE        0000020
> > -#define ECHOK        0000040
> > -#define ECHONL       0000100
> > -#define NOFLSH       0000200
> > -#define TOSTOP       0000400
> > -#define ECHOCTL      0001000
> > -#define ECHOPRT      0002000
> > -#define ECHOKE       0004000
> > -#define FLUSHO       0010000
> > -#define PENDIN       0040000
> > -#define IEXTEN       0100000
> > -#define EXTPROC      0200000
> > +#define ISIG    0000001
> > +#define ICANON          0000002
> > +#define XCASE           0000004
> > +#define ECHO    0000010
> > +#define ECHOE           0000020
> > +#define ECHOK           0000040
> > +#define ECHONL          0000100
> > +#define NOFLSH          0000200
> > +#define TOSTOP          0000400
> > +#define ECHOCTL         0001000
> > +#define ECHOPRT         0002000
> > +#define ECHOKE          0004000
> > +#define FLUSHO          0010000
> > +#define PENDIN          0040000
> > +#define IEXTEN          0100000
> > +#define EXTPROC         0200000
> > +#define NOKERNINFO 0400000
> >
> >  /* tcflow() and TCXONC use these */
> >  #define      TCOOFF          0
> > --
> > 2.30.2
> >


* Re: [RFC PATCH 8/8] signals: Support BSD VSTATUS, KERNINFO and SIGINFO
  2022-01-07 21:52     ` Walt Drummond
@ 2022-01-07 22:39       ` Arseny Maslennikov
  0 siblings, 0 replies; 57+ messages in thread
From: Arseny Maslennikov @ 2022-01-07 22:39 UTC (permalink / raw)
  To: Walt Drummond
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Greg Kroah-Hartman, Jiri Slaby, Arnd Bergmann,
	Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, linux-kernel, linux-fsdevel,
	linux-arch

On Fri, Jan 07, 2022 at 01:52:23PM -0800, Walt Drummond wrote:
> On Fri, Jan 7, 2022 at 1:48 PM Arseny Maslennikov <ar@cs.msu.ru> wrote:
> >
> > On Mon, Jan 03, 2022 at 10:19:56AM -0800, Walt Drummond wrote:
> > > Support TTY VSTATUS character, NOKERNINFO local control bit and the
> > > signal SIGINFO, all as in 4.3BSD.
> > >
> > > Signed-off-by: Walt Drummond <walt@drummond.us>
> > > ---
> > >  arch/x86/include/asm/signal.h       |   2 +-
> > >  arch/x86/include/uapi/asm/signal.h  |   4 +-
> > >  drivers/tty/Makefile                |   2 +-
> > >  drivers/tty/n_tty.c                 |  21 +++++
> > >  drivers/tty/tty_io.c                |  10 ++-
> > >  drivers/tty/tty_ioctl.c             |   4 +
> > >  drivers/tty/tty_status.c            | 135 ++++++++++++++++++++++++++++
> > >  fs/proc/array.c                     |  29 +-----
> > >  include/asm-generic/termios.h       |   4 +-
> > >  include/linux/sched.h               |  52 ++++++++++-
> > >  include/linux/signal.h              |   4 +
> > >  include/linux/tty.h                 |   8 ++
> > >  include/uapi/asm-generic/ioctls.h   |   2 +
> > >  include/uapi/asm-generic/signal.h   |   6 +-
> > >  include/uapi/asm-generic/termbits.h |  34 +++----
> > >  15 files changed, 264 insertions(+), 53 deletions(-)
> > >  create mode 100644 drivers/tty/tty_status.c
> > >
> > > diff --git a/arch/x86/include/asm/signal.h b/arch/x86/include/asm/signal.h
> > > index d8e2efe6cd46..0a01877c11ab 100644
> > > --- a/arch/x86/include/asm/signal.h
> > > +++ b/arch/x86/include/asm/signal.h
> > > @@ -8,7 +8,7 @@
> > >  /* Most things should be clean enough to redefine this at will, if care
> > >     is taken to make libc match.  */
> > >
> > > -#define _NSIG                64
> > > +#define _NSIG                65
> > >
> > >  #ifdef __i386__
> > >  # define _NSIG_BPW   32
> > > diff --git a/arch/x86/include/uapi/asm/signal.h b/arch/x86/include/uapi/asm/signal.h
> > > index 164a22a72984..60dca62d3dcf 100644
> > > --- a/arch/x86/include/uapi/asm/signal.h
> > > +++ b/arch/x86/include/uapi/asm/signal.h
> > > @@ -60,7 +60,9 @@ typedef unsigned long sigset_t;
> > >
> > >  /* These should not be considered constants from userland.  */
> > >  #define SIGRTMIN     32
> > > -#define SIGRTMAX     _NSIG
> > > +#define SIGRTMAX     64
> > > +
> > > +#define SIGINFO              65
> > >
> > >  #define SA_RESTORER  0x04000000
> > >
> > > diff --git a/drivers/tty/Makefile b/drivers/tty/Makefile
> > > index a2bd75fbaaa4..d50ba690bb87 100644
> > > --- a/drivers/tty/Makefile
> > > +++ b/drivers/tty/Makefile
> > > @@ -2,7 +2,7 @@
> > >  obj-$(CONFIG_TTY)            += tty_io.o n_tty.o tty_ioctl.o tty_ldisc.o \
> > >                                  tty_buffer.o tty_port.o tty_mutex.o \
> > >                                  tty_ldsem.o tty_baudrate.o tty_jobctrl.o \
> > > -                                n_null.o
> > > +                                n_null.o tty_status.o
> > >  obj-$(CONFIG_LEGACY_PTYS)    += pty.o
> > >  obj-$(CONFIG_UNIX98_PTYS)    += pty.o
> > >  obj-$(CONFIG_AUDIT)          += tty_audit.o
> > > diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
> > > index 0ec93f1a61f5..b510e01289fd 100644
> > > --- a/drivers/tty/n_tty.c
> > > +++ b/drivers/tty/n_tty.c
> > > @@ -1334,6 +1334,24 @@ static void n_tty_receive_char_special(struct tty_struct *tty, unsigned char c)
> > >                       commit_echoes(tty);
> > >                       return;
> > >               }
> > > +#ifdef VSTATUS
> > > +             if (c == STATUS_CHAR(tty)) {
> > > +                     /* Do the status message first and then send
> > > +                      * the signal, otherwise signal delivery can
> > > +                      * change the process state making the status
> > > +                      * message misleading.  Also, use __isig() and
> > > +                      * not sig(), as if we flush the tty we can
> > > +                      * lose parts of the message.
> >
> > ...As well as the character input in the canonical mode's built-in line
> > editor.
> >
> 
> Yes, good catch.  But this is not going to be in the next version of the patch.
> 
> > > +                      */
> > > +
> > > +                     if (!L_NOKERNINFO(tty))
> > > +                             tty_status(tty);
> > > +# if defined(SIGINFO) && SIGINFO != SIGPWR
> > > +                     __isig(SIGINFO, tty);
> > > +# endif
> > > +                     return;
> > > +             }
> > > +#endif       /* VSTATUS */
> > >               if (c == '\n') {
> > >                       if (L_ECHO(tty) || L_ECHONL(tty)) {
> > >                               echo_char_raw('\n', ldata);
> > > @@ -1763,6 +1781,9 @@ static void n_tty_set_termios(struct tty_struct *tty, struct ktermios *old)
> > >                       set_bit(EOF_CHAR(tty), ldata->char_map);
> > >                       set_bit('\n', ldata->char_map);
> > >                       set_bit(EOL_CHAR(tty), ldata->char_map);
> > > +#ifdef VSTATUS
> > > +                     set_bit(STATUS_CHAR(tty), ldata->char_map);
> > > +#endif
> > >                       if (L_IEXTEN(tty)) {
> > >                               set_bit(WERASE_CHAR(tty), ldata->char_map);
> > >                               set_bit(LNEXT_CHAR(tty), ldata->char_map);
> > > diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
> > > index 6616d4a0d41d..8e488ecba330 100644
> > > --- a/drivers/tty/tty_io.c
> > > +++ b/drivers/tty/tty_io.c
> > > @@ -120,18 +120,26 @@
> > >  #define TTY_PARANOIA_CHECK 1
> > >  #define CHECK_TTY_COUNT 1
> > >
> > > +/* Less ugly than an ifdef in the middle of the initalizer below, maybe? */
> > > +#ifdef NOKERNINFO
> > > +# define __NOKERNINFO NOKERNINFO
> > > +#else
> > > +# define __NOKERNINFO 0
> > > +#endif
> > > +
> > >  struct ktermios tty_std_termios = {  /* for the benefit of tty drivers  */
> > >       .c_iflag = ICRNL | IXON,
> > >       .c_oflag = OPOST | ONLCR,
> > >       .c_cflag = B38400 | CS8 | CREAD | HUPCL,
> > >       .c_lflag = ISIG | ICANON | ECHO | ECHOE | ECHOK |
> > > -                ECHOCTL | ECHOKE | IEXTEN,
> > > +                ECHOCTL | ECHOKE | IEXTEN | __NOKERNINFO,
> > >       .c_cc = INIT_C_CC,
> > >       .c_ispeed = 38400,
> > >       .c_ospeed = 38400,
> > >       /* .c_line = N_TTY, */
> > >  };
> > >  EXPORT_SYMBOL(tty_std_termios);
> > > +#undef __NOKERNINFO
> > >
> > >  /* This list gets poked at by procfs and various bits of boot up code. This
> > >   * could do with some rationalisation such as pulling the tty proc function
> > > diff --git a/drivers/tty/tty_ioctl.c b/drivers/tty/tty_ioctl.c
> > > index 507a25d692bb..b250eabca1ba 100644
> > > --- a/drivers/tty/tty_ioctl.c
> > > +++ b/drivers/tty/tty_ioctl.c
> > > @@ -809,6 +809,10 @@ int tty_mode_ioctl(struct tty_struct *tty, struct file *file,
> > >               if (get_user(arg, (unsigned int __user *) arg))
> > >                       return -EFAULT;
> > >               return tty_change_softcar(real_tty, arg);
> > > +#ifdef TIOCSTAT
> > > +     case TIOCSTAT:
> > > +             return tty_status(real_tty);
> > > +#endif
> > >       default:
> > >               return -ENOIOCTLCMD;
> > >       }
> > > diff --git a/drivers/tty/tty_status.c b/drivers/tty/tty_status.c
> > > new file mode 100644
> >
> > Nitpick: the new functionality is part of n_tty and not the generic tty
> > subsystem, so "tty_status.c" is a misleading name for the new file,
> > unlike e. g. "n_tty_status.c". It has no use in the various modem
> > drivers, for example.
> > Likewise for the tty_status() function.
> 
> ACK, will do.
> 
> >
> > > index 000000000000..a9600f5bd48c
> > > --- /dev/null
> > > +++ b/drivers/tty/tty_status.c
> > > @@ -0,0 +1,135 @@
> > > +// SPDX-License-Identifier: GPL-1.0+
> > > +/*
> > > + * tty_status.c --- implements VSTATUS and TIOCSTAT from BSD4.3/4.4
> > > + *
> > > + */
> > > +
> > > +#include <linux/sched.h>
> > > +#include <linux/mm.h>
> > > +#include <linux/tty.h>
> > > +#include <linux/sched/cputime.h>
> > > +#include <linux/sched/loadavg.h>
> > > +#include <linux/pid.h>
> > > +#include <linux/slab.h>
> > > +#include <linux/math64.h>
> > > +
> > > +#define MSGLEN (160 + TASK_COMM_LEN)
> > > +
> > > +inline unsigned long getRSSk(struct mm_struct *mm)
> > > +{
> > > +     if (mm == NULL)
> > > +             return 0;
> > > +     return get_mm_rss(mm) * PAGE_SIZE / 1024;
> > > +}
> > > +
> > > +inline long nstoms(long l)
> > > +{
> > > +     l /= NSEC_PER_MSEC * 10;
> > > +     if (l < 10)
> > > +             l *= 10;
> > > +     return l;
> > > +}
> > > +
> > > +inline struct task_struct *compare(struct task_struct *new,
> > > +                                struct task_struct *old)
> > > +{
> > > +     unsigned int ostate, nstate;
> > > +
> > > +     if (old == NULL)
> > > +             return new;
> > > +
> > > +     ostate = task_state_index(old);
> > > +     nstate = task_state_index(new);
> > > +
> > > +     if (ostate == nstate) {
> > > +             if (old->start_time > new->start_time)
> > > +                     return old;
> > > +             return new;
> > > +     }
> > > +
> > > +     if (ostate < nstate)
> > > +             return old;
> > > +
> > > +     return new;
> > > +}
> > > +
> > > +struct task_struct *pick_process(struct pid *pgrp)
> > > +{
> > > +     struct task_struct *p, *winner = NULL;
> > > +
> > > +     read_lock(&tasklist_lock);
> > > +     do_each_pid_task(pgrp, PIDTYPE_PGID, p) {
> > > +             winner = compare(p, winner);
> > > +     } while_each_pid_task(pgrp, PIDTYPE_PGID, p);
> > > +     read_unlock(&tasklist_lock);
> > > +
> > > +     return winner;
> > > +}
> > > +
> > > +int tty_status(struct tty_struct *tty)
> > > +{
> > > +     char tname[TASK_COMM_LEN];
> > > +     unsigned long loadavg[3];
> > > +     uint64_t pcpu, cputime, wallclock;
> > > +     struct task_struct *p;
> > > +     struct rusage rusage;
> > > +     struct timespec64 utime, stime, rtime;
> > > +     char msg[MSGLEN] = {0};
> > > +     int len = 0;
> > > +
> > > +     if (tty == NULL)
> > > +             return -ENOTTY;
> > > +
> > > +     get_avenrun(loadavg, FIXED_1/200, 0);
> > > +     len += scnprintf((char *)&msg[len], MSGLEN - len, "load: %lu.%02lu  ",
> > > +                    LOAD_INT(loadavg[0]), LOAD_FRAC(loadavg[0]));
> > > +
> > > +     if (tty->ctrl.session == NULL) {
> > > +             len += scnprintf((char *)&msg[len], MSGLEN - len,
> > > +                              "not a controlling terminal");
> > > +             goto print;
> > > +     }
> > > +
> > > +     if (tty->ctrl.pgrp == NULL) {
> > > +             len += scnprintf((char *)&msg[len], MSGLEN - len,
> > > +                              "no foreground process group");
> > > +             goto print;
> > > +     }
> > > +
> > > +     p = pick_process(tty->ctrl.pgrp);
> > > +     if (p == NULL) {
> > > +             len += scnprintf((char *)&msg[len], MSGLEN - len,
> > > +                              "empty foreground process group");
> > > +             goto print;
> > > +     }
> > > +
> > > +     get_task_comm(tname, p);
> > > +     getrusage(p, RUSAGE_BOTH, &rusage);
> > > +     wallclock = ktime_get_ns() - p->start_time;
> > > +
> > > +     utime.tv_sec = rusage.ru_utime.tv_sec;
> > > +     utime.tv_nsec = rusage.ru_utime.tv_usec * NSEC_PER_USEC;
> > > +     stime.tv_sec = rusage.ru_stime.tv_sec;
> > > +     stime.tv_nsec = rusage.ru_stime.tv_usec * NSEC_PER_USEC;
> > > +     rtime = ns_to_timespec64(wallclock);
> > > +
> > > +     cputime = timespec64_to_ns(&utime) + timespec64_to_ns(&stime);
> > > +     pcpu = div64_u64(cputime * 100, wallclock);
> > > +
> > > +     len += scnprintf((char *)&msg[len], MSGLEN - len,
> > > +                      /* task, PID, task state */
> > > +                      "cmd: %s %d [%s] "
> > > +                      /* rtime,    utime,      stime,      %cpu,  rss */
> > > +                      "%llu.%02lur %llu.%02luu %llu.%02lus %llu%% %luk",
> > > +                      tname, task_pid_vnr(p), (char *)get_task_state_name(p),
> > > +                      rtime.tv_sec, nstoms(rtime.tv_nsec),
> > > +                      utime.tv_sec, nstoms(utime.tv_nsec),
> > > +                      stime.tv_sec, nstoms(stime.tv_nsec),
> > > +                      pcpu, getRSSk(p->mm));
> > > +
> > > +print:
> > > +     len += scnprintf((char *)&msg[len], MSGLEN - len, "\r\n");
> > > +     tty_write_message(tty, msg);
> >
> > tty_write_message() is quite risky to use; while writing my
> > implementation a couple of years ago I've found it easy to accidentally
> > set up deadlocks with this interface — in particular if the function is
> > called from the tty character receive path.
> > I hope you're testing the functionality with CONFIG_PROVE_LOCKING enabled.
> 
> I have not, but I will.
> 
> Is there a different 'put a message on this tty' api I should be using?

There was none at the time; unfortunately, as of v5.15 it looks like
there's still none.

Please see 6/7 of the following series:
https://lore.kernel.org/lkml/20200430064301.1099452-7-ar@cs.msu.ru/
I had to add a new write function there (do_n_tty_write()), then call it
from the line discipline in 7/7 like this (copy-pasted from the series,
with new notes):

diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
index f72a3fd4b..905cdd985 100644
--- a/drivers/tty/n_tty.c
+++ b/drivers/tty/n_tty.c
@@ -2489,6 +2496,21 @@ static int n_tty_ioctl(struct tty_struct *tty, struct file *file,
 	}
 }
 
+static void n_tty_status_line(struct tty_struct *tty)
+{
+	/* private data! can't move this to another file. */
+	struct n_tty_data *ldata = tty->disc_data;
+	char *msg, *buf;
+	msg = buf = kzalloc(STATUS_LINE_LEN, GFP_KERNEL);
+	tty_sprint_status_line(tty, buf + 1, STATUS_LINE_LEN - 1);
+	/* The only current caller of this takes output_lock for us. */
+	if (ldata->column != 0)
+		*msg = '\n';
+	else
+		msg++;
+	/* a call to the new function */
+	do_n_tty_write(tty, NULL, msg, strlen(msg));
+	kfree(buf);
+}
+
 static struct tty_ldisc_ops n_tty_ops = {
 	.magic           = TTY_LDISC_MAGIC,
 	.name            = "n_tty",

tty_sprint_status_line() is defined in n_tty_status.c; it formats the
status line into the buffer and length it is given (here buf + 1, so that
the first byte stays free for an optional newline).
Unlike tty_write_message(), which bypasses the ldisc, this path goes
through the line discipline, so '\n' gets translated to '\r\n'
automatically if the relevant termios flags are set.
Also, if the cursor is not at the first column of the current row, a
newline is emitted before the status line.
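
To make the buffer handling above easier to follow, here is a user-space
sketch of the same "reserve one byte in front" trick. It is only an
illustration: status_line_start(), emit() and the STATUS_LINE_LEN value
are invented for this example and are not taken from the series; emit()
merely stands in for the real write path.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define STATUS_LINE_LEN 64	/* illustrative size, not the value from the series */

/*
 * Format "text" into buf + 1, keeping buf[0] free for an optional '\n'.
 * Returns the start of the message: buf when the cursor is mid-line
 * (column != 0), buf + 1 when it already sits at the first column.
 */
static const char *status_line_start(char *buf, size_t len,
				     unsigned int column, const char *text)
{
	assert(len >= 2);	/* room for the '\n' slot plus a terminator */
	buf[0] = '\0';
	snprintf(buf + 1, len - 1, "%s", text);
	if (column != 0) {
		buf[0] = '\n';
		return buf;
	}
	return buf + 1;
}

/* Hypothetical stand-in for the ldisc write: copies the bytes to stdout. */
static size_t emit(const char *msg)
{
	return fwrite(msg, 1, strlen(msg), stdout);
}
```

A caller would do something like emit(status_line_start(buf, sizeof(buf),
column, "load: 0.42")); only one allocation is needed and nothing has to
be shifted when the leading newline turns out to be necessary.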

> Thanks.
> 
> >
> > > +
> > > +     return 0;
> > > +}
> > > diff --git a/fs/proc/array.c b/fs/proc/array.c
> > > index f37c03077b58..eb14306cdde2 100644
> > > --- a/fs/proc/array.c
> > > +++ b/fs/proc/array.c
> > > @@ -62,6 +62,7 @@
> > >  #include <linux/tty.h>
> > >  #include <linux/string.h>
> > >  #include <linux/mman.h>
> > > +#include <linux/sched.h>
> > >  #include <linux/sched/mm.h>
> > >  #include <linux/sched/numa_balancing.h>
> > >  #include <linux/sched/task_stack.h>
> > > @@ -111,34 +112,6 @@ void proc_task_name(struct seq_file *m, struct task_struct *p, bool escape)
> > >               seq_printf(m, "%.64s", tcomm);
> > >  }
> > >
> > > -/*
> > > - * The task state array is a strange "bitmap" of
> > > - * reasons to sleep. Thus "running" is zero, and
> > > - * you can test for combinations of others with
> > > - * simple bit tests.
> > > - */
> > > -static const char * const task_state_array[] = {
> > > -
> > > -     /* states in TASK_REPORT: */
> > > -     "R (running)",          /* 0x00 */
> > > -     "S (sleeping)",         /* 0x01 */
> > > -     "D (disk sleep)",       /* 0x02 */
> > > -     "T (stopped)",          /* 0x04 */
> > > -     "t (tracing stop)",     /* 0x08 */
> > > -     "X (dead)",             /* 0x10 */
> > > -     "Z (zombie)",           /* 0x20 */
> > > -     "P (parked)",           /* 0x40 */
> > > -
> > > -     /* states beyond TASK_REPORT: */
> > > -     "I (idle)",             /* 0x80 */
> > > -};
> > > -
> > > -static inline const char *get_task_state(struct task_struct *tsk)
> > > -{
> > > -     BUILD_BUG_ON(1 + ilog2(TASK_REPORT_MAX) != ARRAY_SIZE(task_state_array));
> > > -     return task_state_array[task_state_index(tsk)];
> > > -}
> > > -
> > >  static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
> > >                               struct pid *pid, struct task_struct *p)
> > >  {
> > > diff --git a/include/asm-generic/termios.h b/include/asm-generic/termios.h
> > > index b1398d0d4a1d..9b080e1a82d4 100644
> > > --- a/include/asm-generic/termios.h
> > > +++ b/include/asm-generic/termios.h
> > > @@ -10,9 +10,9 @@
> > >       eof=^D          vtime=\0        vmin=\1         sxtc=\0
> > >       start=^Q        stop=^S         susp=^Z         eol=\0
> > >       reprint=^R      discard=^U      werase=^W       lnext=^V
> > > -     eol2=\0
> > > +     eol2=\0         status=^T
> > >  */
> > > -#define INIT_C_CC "\003\034\177\025\004\0\1\0\021\023\032\0\022\017\027\026\0"
> > > +#define INIT_C_CC "\003\034\177\025\004\0\1\0\021\023\032\0\022\017\027\026\0\024"
> > >
> > >  /*
> > >   * Translate a "termio" structure into a "termios". Ugh.
> > > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > > index c1a927ddec64..2171074ec8f5 100644
> > > --- a/include/linux/sched.h
> > > +++ b/include/linux/sched.h
> > > @@ -70,7 +70,7 @@ struct task_group;
> > >
> > >  /*
> > >   * Task state bitmask. NOTE! These bits are also
> > > - * encoded in fs/proc/array.c: get_task_state().
> > > + * encoded in get_task_state().
> > >   *
> > >   * We have two separate sets of flags: task->state
> > >   * is about runnability, while task->exit_state are
> > > @@ -1643,6 +1643,56 @@ static inline char task_state_to_char(struct task_struct *tsk)
> > >       return task_index_to_char(task_state_index(tsk));
> > >  }
> > >
> > > +static inline const char *get_task_state_name(struct task_struct *tsk)
> > > +{
> > > +     static const char * const task_state_array[] = {
> > > +
> > > +             /* states in TASK_REPORT: */
> > > +             "running",              /* 0x00 */
> > > +             "sleeping",             /* 0x01 */
> > > +             "disk sleep",           /* 0x02 */
> > > +             "stopped",              /* 0x04 */
> > > +             "tracing stop",         /* 0x08 */
> > > +             "dead",                 /* 0x10 */
> > > +             "zombie",               /* 0x20 */
> > > +             "parked",               /* 0x40 */
> > > +
> > > +             /* states beyond TASK_REPORT: */
> > > +             "idle",                 /* 0x80 */
> > > +     };
> > > +
> > > +     BUILD_BUG_ON(1 + ilog2(TASK_REPORT_MAX) != ARRAY_SIZE(task_state_array));
> > > +     return task_state_array[task_state_index(tsk)];
> > > +}
> > > +
> > > +static inline const char *get_task_state(struct task_struct *tsk)
> > > +{
> > > +     /*
> > > +      * The task state array is a strange "bitmap" of
> > > +      * reasons to sleep. Thus "running" is zero, and
> > > +      * you can test for combinations of others with
> > > +      * simple bit tests.
> > > +      */
> > > +     static const char * const task_state_array[] = {
> > > +
> > > +             /* states in TASK_REPORT: */
> > > +             "R (running)",          /* 0x00 */
> > > +             "S (sleeping)",         /* 0x01 */
> > > +             "D (disk sleep)",       /* 0x02 */
> > > +             "T (stopped)",          /* 0x04 */
> > > +             "t (tracing stop)",     /* 0x08 */
> > > +             "X (dead)",             /* 0x10 */
> > > +             "Z (zombie)",           /* 0x20 */
> > > +             "P (parked)",           /* 0x40 */
> > > +
> > > +             /* states beyond TASK_REPORT: */
> > > +             "I (idle)",             /* 0x80 */
> > > +     };
> > > +
> > > +     BUILD_BUG_ON(1 + ilog2(TASK_REPORT_MAX) != ARRAY_SIZE(task_state_array));
> > > +     return task_state_array[task_state_index(tsk)];
> > > +}
> > > +
> > >  /**
> > >   * is_global_init - check if a task structure is init. Since init
> > >   * is free to have sub-threads we need to check tgid.
> > > diff --git a/include/linux/signal.h b/include/linux/signal.h
> > > index b77f9472a37c..76bda1a20578 100644
> > > --- a/include/linux/signal.h
> > > +++ b/include/linux/signal.h
> > > @@ -541,6 +541,7 @@ extern bool unhandled_signal(struct task_struct *tsk, int sig);
> > >   *   |  non-POSIX signal  |  default action  |
> > >   *   +--------------------+------------------+
> > >   *   |  SIGEMT            |  coredump        |
> > > + *   |  SIGINFO           |  ignore          |
> > >   *   +--------------------+------------------+
> > >   *
> > >   * (+) For SIGKILL and SIGSTOP the action is "always", not just "default".
> > > @@ -567,6 +568,9 @@ static inline int sig_kernel_ignore(unsigned long sig)
> > >       return  sig == SIGCONT  ||
> > >               sig == SIGCHLD  ||
> > >               sig == SIGWINCH ||
> > > +#if defined(SIGINFO) && SIGINFO != SIGPWR
> > > +             sig == SIGINFO  ||
> > > +#endif
> > >               sig == SIGURG;
> > >  }
> > >
> > > diff --git a/include/linux/tty.h b/include/linux/tty.h
> > > index 168e57e40bbb..943d85aa471c 100644
> > > --- a/include/linux/tty.h
> > > +++ b/include/linux/tty.h
> > > @@ -49,6 +49,9 @@
> > >  #define WERASE_CHAR(tty) ((tty)->termios.c_cc[VWERASE])
> > >  #define LNEXT_CHAR(tty)      ((tty)->termios.c_cc[VLNEXT])
> > >  #define EOL2_CHAR(tty) ((tty)->termios.c_cc[VEOL2])
> > > +#ifdef VSTATUS
> > > +#define STATUS_CHAR(tty) ((tty)->termios.c_cc[VSTATUS])
> > > +#endif
> > >
> > >  #define _I_FLAG(tty, f)      ((tty)->termios.c_iflag & (f))
> > >  #define _O_FLAG(tty, f)      ((tty)->termios.c_oflag & (f))
> > > @@ -114,6 +117,9 @@
> > >  #define L_PENDIN(tty)        _L_FLAG((tty), PENDIN)
> > >  #define L_IEXTEN(tty)        _L_FLAG((tty), IEXTEN)
> > >  #define L_EXTPROC(tty)       _L_FLAG((tty), EXTPROC)
> > > +#ifdef NOKERNINFO
> > > +#define L_NOKERNINFO(tty) _L_FLAG((tty), NOKERNINFO)
> > > +#endif
> > >
> > >  struct device;
> > >  struct signal_struct;
> > > @@ -428,4 +434,6 @@ extern void tty_lock_slave(struct tty_struct *tty);
> > >  extern void tty_unlock_slave(struct tty_struct *tty);
> > >  extern void tty_set_lock_subclass(struct tty_struct *tty);
> > >
> > > +extern int tty_status(struct tty_struct *tty);
> > > +
> > >  #endif
> > > diff --git a/include/uapi/asm-generic/ioctls.h b/include/uapi/asm-generic/ioctls.h
> > > index cdc9f4ca8c27..baa2b8d42679 100644
> > > --- a/include/uapi/asm-generic/ioctls.h
> > > +++ b/include/uapi/asm-generic/ioctls.h
> > > @@ -97,6 +97,8 @@
> > >
> > >  #define TIOCMIWAIT   0x545C  /* wait for a change on serial input line(s) */
> > >  #define TIOCGICOUNT  0x545D  /* read serial port inline interrupt counts */
> > > +/* Some architectures use 0x545E for FIOQSIZE */
> > > +#define TIOCSTAT        0x545F       /* display process group stats on tty */
> > >
> > >  /*
> > >   * Some arches already define FIOQSIZE due to a historical
> > > diff --git a/include/uapi/asm-generic/signal.h b/include/uapi/asm-generic/signal.h
> > > index 3c4cc9b8378e..0b771eb1db94 100644
> > > --- a/include/uapi/asm-generic/signal.h
> > > +++ b/include/uapi/asm-generic/signal.h
> > > @@ -4,7 +4,7 @@
> > >
> > >  #include <linux/types.h>
> > >
> > > -#define _NSIG                64
> > > +#define _NSIG                65
> > >  #define _NSIG_BPW    __BITS_PER_LONG
> > >  #define _NSIG_WORDS  ((_NSIG + _NSIG_BPW - 1) / _NSIG_BPW)
> > >
> > > @@ -49,9 +49,11 @@
> > >  /* These should not be considered constants from userland.  */
> > >  #define SIGRTMIN     32
> > >  #ifndef SIGRTMAX
> > > -#define SIGRTMAX     _NSIG
> > > +#define SIGRTMAX     64
> > >  #endif
> > >
> > > +#define SIGINFO              65
> > > +
> > >  #if !defined MINSIGSTKSZ || !defined SIGSTKSZ
> > >  #define MINSIGSTKSZ  2048
> > >  #define SIGSTKSZ     8192
> > > diff --git a/include/uapi/asm-generic/termbits.h b/include/uapi/asm-generic/termbits.h
> > > index 2fbaf9ae89dd..cb4e9c6d629f 100644
> > > --- a/include/uapi/asm-generic/termbits.h
> > > +++ b/include/uapi/asm-generic/termbits.h
> > > @@ -58,6 +58,7 @@ struct ktermios {
> > >  #define VWERASE 14
> > >  #define VLNEXT 15
> > >  #define VEOL2 16
> > > +#define VSTATUS 17
> > >
> > >  /* c_iflag bits */
> > >  #define IGNBRK       0000001
> > > @@ -164,22 +165,23 @@ struct ktermios {
> > >  #define IBSHIFT        16            /* Shift from CBAUD to CIBAUD */
> > >
> > >  /* c_lflag bits */
> > > -#define ISIG 0000001
> > > -#define ICANON       0000002
> > > -#define XCASE        0000004
> > > -#define ECHO 0000010
> > > -#define ECHOE        0000020
> > > -#define ECHOK        0000040
> > > -#define ECHONL       0000100
> > > -#define NOFLSH       0000200
> > > -#define TOSTOP       0000400
> > > -#define ECHOCTL      0001000
> > > -#define ECHOPRT      0002000
> > > -#define ECHOKE       0004000
> > > -#define FLUSHO       0010000
> > > -#define PENDIN       0040000
> > > -#define IEXTEN       0100000
> > > -#define EXTPROC      0200000
> > > +#define ISIG    0000001
> > > +#define ICANON          0000002
> > > +#define XCASE           0000004
> > > +#define ECHO    0000010
> > > +#define ECHOE           0000020
> > > +#define ECHOK           0000040
> > > +#define ECHONL          0000100
> > > +#define NOFLSH          0000200
> > > +#define TOSTOP          0000400
> > > +#define ECHOCTL         0001000
> > > +#define ECHOPRT         0002000
> > > +#define ECHOKE          0004000
> > > +#define FLUSHO          0010000
> > > +#define PENDIN          0040000
> > > +#define IEXTEN          0100000
> > > +#define EXTPROC         0200000
> > > +#define NOKERNINFO 0400000
> > >
> > >  /* tcflow() and TCXONC use these */
> > >  #define      TCOOFF          0
> > > --
> > > 2.30.2
> > >

* Re: [RFC PATCH 8/8] signals: Support BSD VSTATUS, KERNINFO and SIGINFO
  2022-01-03 18:19 ` [RFC PATCH 8/8] signals: Support BSD VSTATUS, KERNINFO and SIGINFO Walt Drummond
  2022-01-04  7:27   ` Greg Kroah-Hartman
  2022-01-07 21:48   ` Arseny Maslennikov
@ 2022-01-08 14:38   ` Arseny Maslennikov
  2 siblings, 0 replies; 57+ messages in thread
From: Arseny Maslennikov @ 2022-01-08 14:38 UTC (permalink / raw)
  To: Walt Drummond
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Greg Kroah-Hartman, Jiri Slaby, Arnd Bergmann,
	Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, linux-kernel, linux-fsdevel,
	linux-arch

On Mon, Jan 03, 2022 at 10:19:56AM -0800, Walt Drummond wrote:
> Support TTY VSTATUS character, NOKERNINFO local control bit and the
> signal SIGINFO, all as in 4.3BSD.
> 
> Signed-off-by: Walt Drummond <walt@drummond.us>
> ---
>  arch/x86/include/asm/signal.h       |   2 +-
>  arch/x86/include/uapi/asm/signal.h  |   4 +-
>  drivers/tty/Makefile                |   2 +-
>  drivers/tty/n_tty.c                 |  21 +++++
>  drivers/tty/tty_io.c                |  10 ++-
>  drivers/tty/tty_ioctl.c             |   4 +
>  drivers/tty/tty_status.c            | 135 ++++++++++++++++++++++++++++
>  fs/proc/array.c                     |  29 +-----
>  include/asm-generic/termios.h       |   4 +-
>  include/linux/sched.h               |  52 ++++++++++-
>  include/linux/signal.h              |   4 +
>  include/linux/tty.h                 |   8 ++
>  include/uapi/asm-generic/ioctls.h   |   2 +
>  include/uapi/asm-generic/signal.h   |   6 +-
>  include/uapi/asm-generic/termbits.h |  34 +++----
>  15 files changed, 264 insertions(+), 53 deletions(-)
>  create mode 100644 drivers/tty/tty_status.c
> 
> <...>
> 
> diff --git a/drivers/tty/tty_status.c b/drivers/tty/tty_status.c
> new file mode 100644
> index 000000000000..a9600f5bd48c
> --- /dev/null
> +++ b/drivers/tty/tty_status.c
> @@ -0,0 +1,135 @@
> +// SPDX-License-Identifier: GPL-1.0+
> +/*
> + * tty_status.c --- implements VSTATUS and TIOCSTAT from BSD4.3/4.4
> + *
> + */
> +
> +#include <linux/sched.h>
> +#include <linux/mm.h>
> +#include <linux/tty.h>
> +#include <linux/sched/cputime.h>
> +#include <linux/sched/loadavg.h>
> +#include <linux/pid.h>
> +#include <linux/slab.h>
> +#include <linux/math64.h>
> +
> +#define MSGLEN (160 + TASK_COMM_LEN)
> +
> +inline unsigned long getRSSk(struct mm_struct *mm)
> +{
> +	if (mm == NULL)
> +		return 0;
> +	return get_mm_rss(mm) * PAGE_SIZE / 1024;
> +}
> +
> +inline long nstoms(long l)
> +{
> +	l /= NSEC_PER_MSEC * 10;
> +	if (l < 10)
> +		l *= 10;
> +	return l;
> +}
> +
> +inline struct task_struct *compare(struct task_struct *new,
> +				   struct task_struct *old)
> +{
> +	unsigned int ostate, nstate;
> +
> +	if (old == NULL)
> +		return new;
> +
> +	ostate = task_state_index(old);
> +	nstate = task_state_index(new);
> +
> +	if (ostate == nstate) {
> +		if (old->start_time > new->start_time)
> +			return old;
> +		return new;
> +	}
> +
> +	if (ostate < nstate)
> +		return old;
> +
> +	return new;
> +}
> +
> +struct task_struct *pick_process(struct pid *pgrp)
> +{
> +	struct task_struct *p, *winner = NULL;
> +
> +	read_lock(&tasklist_lock);
> +	do_each_pid_task(pgrp, PIDTYPE_PGID, p) {
> +		winner = compare(p, winner);
> +	} while_each_pid_task(pgrp, PIDTYPE_PGID, p);
> +	read_unlock(&tasklist_lock);
> +
> +	return winner;
> +}
> +
> +int tty_status(struct tty_struct *tty)
> +{
> +	char tname[TASK_COMM_LEN];
> +	unsigned long loadavg[3];
> +	uint64_t pcpu, cputime, wallclock;
> +	struct task_struct *p;
> +	struct rusage rusage;
> +	struct timespec64 utime, stime, rtime;
> +	char msg[MSGLEN] = {0};
> +	int len = 0;
> +
> +	if (tty == NULL)
> +		return -ENOTTY;
> +
> +	get_avenrun(loadavg, FIXED_1/200, 0);
> +	len += scnprintf((char *)&msg[len], MSGLEN - len, "load: %lu.%02lu  ",
> +		       LOAD_INT(loadavg[0]), LOAD_FRAC(loadavg[0]));
> +
> +	if (tty->ctrl.session == NULL) {
> +		len += scnprintf((char *)&msg[len], MSGLEN - len,
> +				 "not a controlling terminal");
> +		goto print;
> +	}
> +
> +	if (tty->ctrl.pgrp == NULL) {
> +		len += scnprintf((char *)&msg[len], MSGLEN - len,
> +				 "no foreground process group");
> +		goto print;
> +	}
> +
> +	p = pick_process(tty->ctrl.pgrp);
> +	if (p == NULL) {
> +		len += scnprintf((char *)&msg[len], MSGLEN - len,
> +				 "empty foreground process group");
> +		goto print;
> +	}
> +
> +	get_task_comm(tname, p);
> +	getrusage(p, RUSAGE_BOTH, &rusage);
> +	wallclock = ktime_get_ns() - p->start_time;
> +
> +	utime.tv_sec = rusage.ru_utime.tv_sec;
> +	utime.tv_nsec = rusage.ru_utime.tv_usec * NSEC_PER_USEC;
> +	stime.tv_sec = rusage.ru_stime.tv_sec;
> +	stime.tv_nsec = rusage.ru_stime.tv_usec * NSEC_PER_USEC;
> +	rtime = ns_to_timespec64(wallclock);
> +
> +	cputime = timespec64_to_ns(&utime) + timespec64_to_ns(&stime);
> +	pcpu = div64_u64(cputime * 100, wallclock);
> +
> +	len += scnprintf((char *)&msg[len], MSGLEN - len,
> +			 /* task, PID, task state */
> +			 "cmd: %s %d [%s] "
> +			 /* rtime,    utime,      stime,      %cpu,  rss */
> +			 "%llu.%02lur %llu.%02luu %llu.%02lus %llu%% %luk",
> +			 tname,	task_pid_vnr(p), (char *)get_task_state_name(p),

task_pid_vnr(p) returns the PID of p in the PID namespace of current:

  pid_t __task_pid_nr_ns(struct task_struct *task, enum pid_type type,
                          struct pid_namespace *ns)
  {
          pid_t nr = 0;
  
          rcu_read_lock();
          if (!ns)
                  ns = task_active_pid_ns(current);
          nr = pid_nr_ns(rcu_dereference(*task_pid_ptr(task, type)), ns);
          rcu_read_unlock();
  
          return nr;
  }
  struct pid_namespace *task_active_pid_ns(struct task_struct *tsk)
  {
          return ns_of_pid(task_pid(tsk));
  }
  static inline pid_t task_pid_vnr(struct task_struct *tsk)
  {
          return __task_pid_nr_ns(tsk, PIDTYPE_PID, NULL); 
  }

At this point current is an arbitrary kernel worker thread, not p. Most
likely we need another helper function in <linux/sched.h>.
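
To illustrate why the choice of namespace matters, here is a user-space
toy model of the lookup that pid_nr_ns() performs. It is not kernel code:
the toy_* types and MAX_NS_LEVEL are invented for this sketch, and it only
mirrors the idea that a task has one number per namespace level and that
the answer depends on which namespace is asking.

```c
#include <assert.h>

#define MAX_NS_LEVEL 4	/* arbitrary nesting depth for the toy model */

/* Toy stand-ins for struct pid_namespace and struct pid. */
struct toy_ns {
	unsigned int level;	/* 0 is the initial namespace */
};

struct toy_upid {
	int nr;			/* the number visible at this level */
	const struct toy_ns *ns;
};

struct toy_pid {
	unsigned int level;			/* deepest namespace of the task */
	struct toy_upid numbers[MAX_NS_LEVEL];	/* one entry per level 0..level */
};

/* Number of "pid" as seen from "ns", or 0 if it is not visible there. */
static int toy_pid_nr_ns(const struct toy_pid *pid, const struct toy_ns *ns)
{
	if (pid && ns->level <= pid->level &&
	    pid->numbers[ns->level].ns == ns)
		return pid->numbers[ns->level].nr;
	return 0;
}
```

With a task that is 4242 in the initial namespace and 7 in a child
namespace, the lookup returns 4242 or 7 depending on the namespace passed
in. Assuming kernel worker threads live in the initial namespace, a status
line built with the namespace of current would show 4242 even to a user
sitting inside the child namespace.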

> +			 rtime.tv_sec, nstoms(rtime.tv_nsec),
> +			 utime.tv_sec, nstoms(utime.tv_nsec),
> +			 stime.tv_sec, nstoms(stime.tv_nsec),
> +			 pcpu, getRSSk(p->mm));
> +
> +print:
> +	len += scnprintf((char *)&msg[len], MSGLEN - len, "\r\n");
> +	tty_write_message(tty, msg);
> +
> +	return 0;
> +}
> 
> <...>
> 
> -- 
> 2.30.2
> 

* Re: [RFC PATCH 0/8] signals: Support more than 64 signals
  2022-01-07 19:29             ` Arseny Maslennikov
  (?)
@ 2022-05-19 12:27               ` Pavel Machek
  -1 siblings, 0 replies; 57+ messages in thread
From: Pavel Machek @ 2022-05-19 12:27 UTC (permalink / raw)
  To: Arseny Maslennikov
  Cc: Walt Drummond, Theodore Ts'o, Eric W. Biederman, aacraid,
	viro, anna.schumaker, arnd, bsegall, bp, chuck.lever, bristot,
	dave.hansen, dwmw2, dietmar.eggemann, dinguyen, geert, gregkh,
	hpa, idryomov, mingo, yzaikin, ink, jejb, jmorris, bfields,
	jlayton, jirislaby, john.johansen, juri.lelli, keescook, mcgrof,
	martin.petersen, mattst88, mgorman, oleg, pbonzini, peterz, rth,
	richard, serge, rostedt, tglx, trond.myklebust, vincent.guittot,
	x86, linux-kernel, ceph-devel, kvm, linux-alpha, linux-arch,
	linux-fsdevel, linux-m68k, linux-mtd, linux-nfs, linux-scsi,
	linux-security-module

Hi!

> > The only standard tools that support SIGINFO are sleep, dd and ping,
> > (and kill, for obvious reasons) so it's not like there's a vast hole
> > in the tooling or something, nor is there a large legacy software base
> > just waiting for SIGINFO to appear.   So while I very much enjoyed
> > figuring out how to make SIGINFO work ...
> 
> As far as I recall, GNU make on *BSD does support SIGINFO (Not a
> standard tool, but obviously an established one).
> 
> The developers of strace have expressed interest in SIGINFO support
> to print tracer status messages (unfortunately, not on a public list).
> Computational software can use this instead of stderr progress spam, if
> run in an interactive fashion on a terminal, as it frequently is. There
> is a user base, it's just not very vocal on kernel lists. :)

And it would often be useful if cp supported this. Yes, this
is a feature I'd like to see.

BR,							Pavel

end of thread, other threads:[~2022-05-19 12:28 UTC | newest]

Thread overview: 57+ messages
2022-01-03 18:19 [RFC PATCH 0/8] signals: Support more than 64 signals Walt Drummond
2022-01-03 18:19 ` [RFC PATCH 1/8] signals: Make the real-time signal system calls accept different sized sigset_t from user space Walt Drummond
2022-01-03 18:19 ` [RFC PATCH 2/8] signals: Put the full signal mask on the signal stack for x86_64, X32 and ia32 compatibility mode Walt Drummond
2022-01-03 18:19 ` [RFC PATCH 3/8] signals: Use a helper function to test if a signal is a real-time signal Walt Drummond
2022-01-03 18:19 ` [RFC PATCH 4/8] signals: Remove sigmask() macro Walt Drummond
2022-01-03 18:19 ` [RFC PATCH 5/8] signals: Better support cases where _NSIG_WORDS is greater than 2 Walt Drummond
2022-01-03 18:19 ` [RFC PATCH 6/8] signals: Round up _NSIG_WORDS Walt Drummond
2022-01-03 18:19 ` [RFC PATCH 7/8] signals: Add signal debugging Walt Drummond
2022-01-03 18:19 ` [RFC PATCH 8/8] signals: Support BSD VSTATUS, KERNINFO and SIGINFO Walt Drummond
2022-01-04  7:27   ` Greg Kroah-Hartman
2022-01-07 21:48   ` Arseny Maslennikov
2022-01-07 21:52     ` Walt Drummond
2022-01-07 22:39       ` Arseny Maslennikov
2022-01-08 14:38   ` Arseny Maslennikov
2022-01-03 18:48 ` [RFC PATCH 0/8] signals: Support more than 64 signals Al Viro
2022-01-04  1:00   ` Walt Drummond
2022-01-04  1:16     ` Al Viro
2022-01-04  1:49       ` Al Viro
2022-01-04 18:00 ` Eric W. Biederman
2022-01-04 20:52   ` Theodore Ts'o
2022-01-04 21:33     ` Walt Drummond
2022-01-04 22:05     ` Eric W. Biederman
2022-01-04 22:23       ` Theodore Ts'o
2022-01-04 22:31         ` Walt Drummond
2022-01-07 19:29           ` Arseny Maslennikov
2022-05-19 12:27             ` Pavel Machek
2022-01-07 19:19     ` Arseny Maslennikov
