linux-api.vger.kernel.org archive mirror
* [PATCH 00/12] "Task_isolation" mode
@ 2020-03-04 16:01 Alex Belits
  2020-03-04 16:03 ` [PATCH 01/12] task_isolation: vmstat: add quiet_vmstat_sync function Alex Belits
                   ` (12 more replies)
  0 siblings, 13 replies; 71+ messages in thread
From: Alex Belits @ 2020-03-04 16:01 UTC (permalink / raw)
  To: frederic, rostedt
  Cc: mingo, peterz, linux-kernel, Prasun Kapoor, tglx, linux-api,
	linux-mm, linux-arch

This is an update of the task isolation work that was originally done
by Chris Metcalf <cmetcalf@mellanox.com> and maintained by him until
November 2017. It is adapted to the current kernel and cleaned up to
make the functionality both more complete (it prevents isolation from
breaking in situations that were not covered before) and cleaner (it
avoids any dubious or fragile use of kernel interfaces, and provides a
clean and reliable isolation-breaking procedure).

I suppose I should explain why such a thing exists.

This is the result of development and maintenance of task isolation
functionality that originally started based on task isolation patch
v15 and was later updated to include v16. It provided an RTOS-like,
predictable environment for userspace tasks running on arm64
processors alongside a full-featured Linux environment. It is intended
to provide a reliable, interruption-free environment from the point
when a userspace task enters isolation until the moment it leaves
isolation or receives a signal intentionally sent to it, and it was
successfully used for this purpose. While CPU isolation with nohz
provides an environment that is close to this requirement, the
remaining IPIs and other disturbances keep it from being usable for
tasks that require complete predictability of CPU timing.

It is clear that such isolation is neither possible nor necessary
while a CPU is running kernel code, or userspace initialization or
cleanup, so there is a need for a separate isolated state that a
userspace task can enter and exit. This was the reason for using the
original task isolation patches, and that reason still exists now. The
alternative, running an RTOS instead of Linux, is becoming more and
more labor-intensive because modern CPUs and SoCs have very complex
device/resource configuration and management procedures; for some
hardware it is clearly impractical to maintain an RTOS with hardware
support on par with the Linux kernel that is also reliable and secure.

On the other hand, the development of modern embedded-oriented SoCs
has shown that numerous CPU cores may or may not share any hardware
resources, depending on the SoC designers' intentions. Therefore the
ability of the OS to switch a CPU core into an RTOS-like mode and
truly, at all levels, leave it alone until the OS is needed there
again is an important feature for modern embedded systems development;
probably even more important than real-time interrupt latency and
preemption, now that people who don't like how their interrupts are
handled can simply add CPU cores. This is why we had to maintain task
isolation, and I believe that after all the improvements in CPU
isolation, timer and interrupt management that have been made in Linux
since 2017, it is needed even more, not less.

This set of patches covers only the implementation of task isolation.
However, the need for additional functionality, such as selective TLB
flushes, is one of the reasons why task_isolation_on_cpu() avoids any
non-isolation-specific data structures, and why there is a
fast_task_isolation_cpu_cleanup() function that is always called on
the CPU where the isolated task is running.
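
To illustrate the intended usage of this state by other subsystems,
here is a minimal sketch of the pattern used by the clock_was_set()
change in patch 03 (my_ipi_func is only a placeholder for whatever
callback a subsystem would normally send to every CPU):

	/* Send an IPI everywhere except CPUs running isolated tasks. */
	struct cpumask mask;

	cpumask_clear(&mask);
	task_isolation_cpumask(&mask);		/* CPUs with isolated tasks */
	cpumask_complement(&mask, &mask);	/* everyone else */
	on_each_cpu_mask(&mask, my_ipi_func, NULL, 1);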

Reporting of task isolation breaking in the kernel log is now more
informative and, if necessary, can be adapted to provide meaningful
cause information to userspace software. I am not sure whether such a
mechanism is needed -- development and failure reporting in production
usually rely on kernel logs, and in production it is assumed that
isolation breaking should not happen on its own. On the other hand, if
an application can collect a meaningful log in which its own events
are matched to isolation failures, this may be better for developers
than matching the timing of records from multiple sources. For now,
only the kernel log shows detailed descriptions.

The userspace support code and test program are now at
https://github.com/abelits/libtmc . They were originally developed for
an earlier implementation, so they contain some checks that may be
redundant now but are kept for compatibility.

My thanks to Chris Metcalf for the design and maintenance of the
original task isolation patch, to Francis Giraldeau
<francis.giraldeau@gmail.com> and Yuri Norov <ynorov@marvell.com> for
various contributions to this work, and to Frederic Weisbecker
<frederic@kernel.org> for his work on CPU isolation and housekeeping
that made it possible to remove some less elegant solutions that I had
to devise for earlier (pre-4.17) kernels.

The previous patch (v16 by Chris Metcalf) is at:

https://lore.kernel.org/lkml/1509728692-10460-1-git-send-email-cmetcalf@mellanox.com

-- 
Alex


* [PATCH 01/12] task_isolation: vmstat: add quiet_vmstat_sync function
  2020-03-04 16:01 [PATCH 00/12] "Task_isolation" mode Alex Belits
@ 2020-03-04 16:03 ` Alex Belits
  2020-03-04 16:04 ` [PATCH 02/12] task_isolation: vmstat: add vmstat_idle function Alex Belits
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 71+ messages in thread
From: Alex Belits @ 2020-03-04 16:03 UTC (permalink / raw)
  To: frederic, rostedt
  Cc: mingo, peterz, linux-kernel, Prasun Kapoor, tglx, linux-api,
	linux-mm, linux-arch

From: Chris Metcalf <cmetcalf@mellanox.com>

In commit f01f17d3705b ("mm, vmstat: make quiet_vmstat lighter")
the quiet_vmstat() function became asynchronous, in the sense that
the vmstat work was still scheduled to run on the core when the
function returned.  For task isolation, we need a synchronous
version of the function that guarantees that the vmstat worker
will not run on the core on return from the function.  Add a
quiet_vmstat_sync() function with that semantic.

Signed-off-by: Alex Belits <abelits@marvell.com>
---
 include/linux/vmstat.h | 2 ++
 mm/vmstat.c            | 9 +++++++++
 2 files changed, 11 insertions(+)

diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index 292485f3d24d..2bc5e85f2514 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -270,6 +270,7 @@ extern void __dec_zone_state(struct zone *, enum zone_stat_item);
 extern void __dec_node_state(struct pglist_data *, enum node_stat_item);
 
 void quiet_vmstat(void);
+void quiet_vmstat_sync(void);
 void cpu_vm_stats_fold(int cpu);
 void refresh_zone_stat_thresholds(void);
 
@@ -372,6 +373,7 @@ static inline void __dec_node_page_state(struct page *page,
 static inline void refresh_zone_stat_thresholds(void) { }
 static inline void cpu_vm_stats_fold(int cpu) { }
 static inline void quiet_vmstat(void) { }
+static inline void quiet_vmstat_sync(void) { }
 
 static inline void drain_zonestat(struct zone *zone,
 			struct per_cpu_pageset *pset) { }
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 78d53378db99..1fa0b2d04afa 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1870,6 +1870,15 @@ void quiet_vmstat(void)
 	refresh_cpu_vm_stats(false);
 }
 
+/*
+ * Synchronously quiet vmstat so the work is guaranteed not to run on return.
+ */
+void quiet_vmstat_sync(void)
+{
+	cancel_delayed_work_sync(this_cpu_ptr(&vmstat_work));
+	refresh_cpu_vm_stats(false);
+}
+
 /*
  * Shepherd worker thread that checks the
  * differentials of processors that have their worker
-- 
2.20.1



* [PATCH 02/12] task_isolation: vmstat: add vmstat_idle function
  2020-03-04 16:01 [PATCH 00/12] "Task_isolation" mode Alex Belits
  2020-03-04 16:03 ` [PATCH 01/12] task_isolation: vmstat: add quiet_vmstat_sync function Alex Belits
@ 2020-03-04 16:04 ` Alex Belits
  2020-03-04 16:07 ` [PATCH 03/12] task_isolation: userspace hard isolation from kernel Alex Belits
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 71+ messages in thread
From: Alex Belits @ 2020-03-04 16:04 UTC (permalink / raw)
  To: frederic, rostedt
  Cc: mingo, peterz, linux-kernel, Prasun Kapoor, tglx, linux-api,
	linux-mm, linux-arch

From: Chris Metcalf <cmetcalf@mellanox.com>

This function checks that the vmstat worker is not running and that
the vmstat diffs don't require an update.  The function is called
from the task-isolation code to see whether we actually need to do
some work to quiet vmstat.

Signed-off-by: Alex Belits <abelits@marvell.com>
---
 include/linux/vmstat.h |  2 ++
 mm/vmstat.c            | 10 ++++++++++
 2 files changed, 12 insertions(+)

diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index 2bc5e85f2514..66d9ae32cf07 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -271,6 +271,7 @@ extern void __dec_node_state(struct pglist_data *, enum node_stat_item);
 
 void quiet_vmstat(void);
 void quiet_vmstat_sync(void);
+bool vmstat_idle(void);
 void cpu_vm_stats_fold(int cpu);
 void refresh_zone_stat_thresholds(void);
 
@@ -374,6 +375,7 @@ static inline void refresh_zone_stat_thresholds(void) { }
 static inline void cpu_vm_stats_fold(int cpu) { }
 static inline void quiet_vmstat(void) { }
 static inline void quiet_vmstat_sync(void) { }
+static inline bool vmstat_idle(void) { return true; }
 
 static inline void drain_zonestat(struct zone *zone,
 			struct per_cpu_pageset *pset) { }
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 1fa0b2d04afa..5c4aec651062 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1879,6 +1879,16 @@ void quiet_vmstat_sync(void)
 	refresh_cpu_vm_stats(false);
 }
 
+/*
+ * Report on whether vmstat processing is quiesced on the core currently:
+ * no vmstat worker running and no vmstat updates to perform.
+ */
+bool vmstat_idle(void)
+{
+	return !delayed_work_pending(this_cpu_ptr(&vmstat_work)) &&
+		!need_update(smp_processor_id());
+}
+
 /*
  * Shepherd worker thread that checks the
  * differentials of processors that have their worker
-- 
2.20.1



* [PATCH 03/12] task_isolation: userspace hard isolation from kernel
  2020-03-04 16:01 [PATCH 00/12] "Task_isolation" mode Alex Belits
  2020-03-04 16:03 ` [PATCH 01/12] task_isolation: vmstat: add quiet_vmstat_sync function Alex Belits
  2020-03-04 16:04 ` [PATCH 02/12] task_isolation: vmstat: add vmstat_idle function Alex Belits
@ 2020-03-04 16:07 ` Alex Belits
  2020-03-05 18:33   ` Frederic Weisbecker
                     ` (2 more replies)
  2020-03-04 16:08 ` [PATCH 04/12] task_isolation: Add task isolation hooks to arch-independent code Alex Belits
                   ` (9 subsequent siblings)
  12 siblings, 3 replies; 71+ messages in thread
From: Alex Belits @ 2020-03-04 16:07 UTC (permalink / raw)
  To: frederic, rostedt
  Cc: mingo, peterz, linux-kernel, Prasun Kapoor, tglx, linux-api,
	linux-mm, linux-arch

The existing nohz_full mode is designed as a "soft" isolation mode
that makes tradeoffs to minimize userspace interruptions while
still attempting to avoid overheads in the kernel entry/exit path,
to provide 100% kernel semantics, etc.

However, some applications require a "hard" commitment from the
kernel to avoid interruptions, in particular userspace device driver
style applications, such as high-speed networking code.

This change introduces a framework to allow applications
to elect to have the "hard" semantics as needed, specifying
prctl(PR_TASK_ISOLATION, PR_TASK_ISOLATION_ENABLE) to do so.

The kernel must be built with the new TASK_ISOLATION Kconfig flag
to enable this mode, and the kernel booted with an appropriate
"isolcpus=nohz,domain,CPULIST" boot argument to enable
nohz_full and isolcpus. The "task_isolation" state is then indicated
by setting a new task struct field, task_isolation_flags, to the
value passed by prctl(), and also setting a TIF_TASK_ISOLATION
bit in the thread_info flags. When the kernel is returning to
userspace from the prctl() call and sees TIF_TASK_ISOLATION set,
it calls the new task_isolation_start() routine to arrange for
the task to avoid being interrupted in the future.
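
For example (the CPU list below is only illustrative, not part of this
patch), a kernel built with CONFIG_TASK_ISOLATION=y could be booted
with:

	isolcpus=nohz,domain,2-7

so that CPUs 2-7 are excluded from scheduler load balancing, get
nohz_full behavior, and are available for isolated tasks.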

With interrupts disabled, task_isolation_start() ensures that kernel
subsystems that might cause a future interrupt are quiesced. If it
doesn't succeed, it adjusts the syscall return value to indicate that
fact, and userspace can retry as desired. In addition to stopping
the scheduler tick, the code takes any action that avoids a future
interrupt to the core, such as quiescing a worker thread that is
currently scheduled (e.g. the vmstat worker), or cleaning up now any
state that would otherwise require a future IPI to the core (e.g. the
mm lru per-cpu cache).

Once the task has returned to userspace after issuing the prctl(),
if it enters the kernel again via system call, page fault, or any
other exception or irq, the kernel will kill it with SIGKILL.
In addition to sending a signal, the code supports a kernel
command-line "task_isolation_debug" flag which causes a stack
backtrace to be generated whenever a task loses isolation.

To allow the state to be entered and exited, the syscall checking
test ignores the prctl(PR_TASK_ISOLATION) syscall so that we can
clear the bit again later, and ignores exit/exit_group to allow
exiting the task without a pointless signal being delivered.

The prctl() API allows for specifying a signal number to use instead
of the default SIGKILL, to allow for catching the notification
signal; for example, in a production environment, it might be
helpful to log information to the application logging mechanism
before exiting. Or, the signal handler might choose to reset the
program counter back to the code segment intended to be run isolated
via prctl() to continue execution.
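
As a minimal sketch of the intended userspace usage (the fallback uapi
definitions and the retry policy below are illustrative only, not part
of this patch):

	#include <sys/prctl.h>
	#include <signal.h>
	#include <errno.h>

	#ifndef PR_TASK_ISOLATION
	#define PR_TASK_ISOLATION		48
	#define PR_TASK_ISOLATION_ENABLE	(1 << 0)
	#define PR_TASK_ISOLATION_SET_SIG(sig)	(((sig) & 0x7f) << 8)
	#endif

	static int enter_isolation(void)
	{
		/* Deliver SIGUSR1 instead of SIGKILL if isolation is lost. */
		unsigned int flags = PR_TASK_ISOLATION_ENABLE |
				     PR_TASK_ISOLATION_SET_SIG(SIGUSR1);

		/* The task must already be affine to a single isolated CPU. */
		while (prctl(PR_TASK_ISOLATION, flags, 0, 0, 0) != 0) {
			if (errno == EAGAIN)
				continue;	/* quiescing incomplete, retry */
			return -1;		/* e.g. EINVAL: wrong CPU or affinity */
		}
		return 0;	/* isolated until the next kernel entry */
	}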

In a number of cases we can tell in advance that we are going to
interrupt a remote cpu, e.g. via an IPI or a TLB flush.  In that case
we generate the diagnostic (and optional stack dump) on behalf of the
remote core, to be able to deliver better diagnostics.
If the interrupt is not something caught by Linux (e.g. a
hypervisor interrupt) we can also request a reschedule IPI to
be sent to the remote core so it can be sure to generate a
signal to notify the process.

Separate patches that follow provide these changes for x86, arm,
and arm64.

Signed-off-by: Alex Belits <abelits@marvell.com>
---
 .../admin-guide/kernel-parameters.txt         |   6 +
 include/linux/hrtimer.h                       |   4 +
 include/linux/isolation.h                     | 229 ++++++
 include/linux/sched.h                         |   4 +
 include/linux/tick.h                          |   3 +
 include/uapi/linux/prctl.h                    |   6 +
 init/Kconfig                                  |  28 +
 kernel/Makefile                               |   2 +
 kernel/context_tracking.c                     |   2 +
 kernel/isolation.c                            | 774 ++++++++++++++++++
 kernel/signal.c                               |   2 +
 kernel/sys.c                                  |   6 +
 kernel/time/hrtimer.c                         |  27 +
 kernel/time/tick-sched.c                      |  18 +
 14 files changed, 1111 insertions(+)
 create mode 100644 include/linux/isolation.h
 create mode 100644 kernel/isolation.c

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index c07815d230bc..e4a2d6e37645 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4808,6 +4808,12 @@
 			neutralize any effect of /proc/sys/kernel/sysrq.
 			Useful for debugging.
 
+	task_isolation_debug	[KNL]
+			In kernels built with CONFIG_TASK_ISOLATION, this
+			setting will generate console backtraces to
+			accompany the diagnostics generated about
+			interrupting tasks running with task isolation.
+
 	tcpmhash_entries= [KNL,NET]
 			Set the number of tcp_metrics_hash slots.
 			Default value is 8192 or 16384 depending on total
diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index 15c8ac313678..e81252eb4f92 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -528,6 +528,10 @@ extern void __init hrtimers_init(void);
 /* Show pending timers: */
 extern void sysrq_timer_list_show(void);
 
+#ifdef CONFIG_TASK_ISOLATION
+extern void kick_hrtimer(void);
+#endif
+
 int hrtimers_prepare_cpu(unsigned int cpu);
 #ifdef CONFIG_HOTPLUG_CPU
 int hrtimers_dead_cpu(unsigned int cpu);
diff --git a/include/linux/isolation.h b/include/linux/isolation.h
new file mode 100644
index 000000000000..144766d3536a
--- /dev/null
+++ b/include/linux/isolation.h
@@ -0,0 +1,229 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Task isolation support
+ *
+ * Authors:
+ *   Chris Metcalf <cmetcalf@mellanox.com>
+ *   Alex Belits <abelits@marvell.com>
+ *   Yuri Norov <ynorov@marvell.com>
+ */
+#ifndef _LINUX_ISOLATION_H
+#define _LINUX_ISOLATION_H
+
+#include <stdarg.h>
+#include <linux/errno.h>
+#include <linux/cpumask.h>
+#include <linux/prctl.h>
+#include <linux/types.h>
+
+struct task_struct;
+
+#ifdef CONFIG_TASK_ISOLATION
+
+int task_isolation_message(int cpu, int level, bool supp, const char *fmt, ...);
+
+#define pr_task_isol_emerg(cpu, fmt, ...)			\
+	task_isolation_message(cpu, LOGLEVEL_EMERG, false, fmt, ##__VA_ARGS__)
+#define pr_task_isol_alert(cpu, fmt, ...)			\
+	task_isolation_message(cpu, LOGLEVEL_ALERT, false, fmt, ##__VA_ARGS__)
+#define pr_task_isol_crit(cpu, fmt, ...)			\
+	task_isolation_message(cpu, LOGLEVEL_CRIT, false, fmt, ##__VA_ARGS__)
+#define pr_task_isol_err(cpu, fmt, ...)				\
+	task_isolation_message(cpu, LOGLEVEL_ERR, false, fmt, ##__VA_ARGS__)
+#define pr_task_isol_warn(cpu, fmt, ...)			\
+	task_isolation_message(cpu, LOGLEVEL_WARNING, false, fmt, ##__VA_ARGS__)
+#define pr_task_isol_notice(cpu, fmt, ...)			\
+	task_isolation_message(cpu, LOGLEVEL_NOTICE, false, fmt, ##__VA_ARGS__)
+#define pr_task_isol_info(cpu, fmt, ...)			\
+	task_isolation_message(cpu, LOGLEVEL_INFO, false, fmt, ##__VA_ARGS__)
+#define pr_task_isol_debug(cpu, fmt, ...)			\
+	task_isolation_message(cpu, LOGLEVEL_DEBUG, false, fmt, ##__VA_ARGS__)
+
+#define pr_task_isol_emerg_supp(cpu, fmt, ...)			\
+	task_isolation_message(cpu, LOGLEVEL_EMERG, true, fmt, ##__VA_ARGS__)
+#define pr_task_isol_alert_supp(cpu, fmt, ...)			\
+	task_isolation_message(cpu, LOGLEVEL_ALERT, true, fmt, ##__VA_ARGS__)
+#define pr_task_isol_crit_supp(cpu, fmt, ...)			\
+	task_isolation_message(cpu, LOGLEVEL_CRIT, true, fmt, ##__VA_ARGS__)
+#define pr_task_isol_err_supp(cpu, fmt, ...)				\
+	task_isolation_message(cpu, LOGLEVEL_ERR, true, fmt, ##__VA_ARGS__)
+#define pr_task_isol_warn_supp(cpu, fmt, ...)			\
+	task_isolation_message(cpu, LOGLEVEL_WARNING, true, fmt, ##__VA_ARGS__)
+#define pr_task_isol_notice_supp(cpu, fmt, ...)			\
+	task_isolation_message(cpu, LOGLEVEL_NOTICE, true, fmt, ##__VA_ARGS__)
+#define pr_task_isol_info_supp(cpu, fmt, ...)			\
+	task_isolation_message(cpu, LOGLEVEL_INFO, true, fmt, ##__VA_ARGS__)
+#define pr_task_isol_debug_supp(cpu, fmt, ...)			\
+	task_isolation_message(cpu, LOGLEVEL_DEBUG, true, fmt, ##__VA_ARGS__)
+DECLARE_PER_CPU(unsigned long, tsk_thread_flags_copy);
+extern cpumask_var_t task_isolation_map;
+
+/**
+ * task_isolation_request() - prctl hook to request task isolation
+ * @flags:	Flags from <linux/prctl.h> PR_TASK_ISOLATION_xxx.
+ *
+ * This is called from the generic prctl() code for PR_TASK_ISOLATION.
+ *
+ * Return: Returns 0 when task isolation enabled, otherwise a negative
+ * errno.
+ */
+extern int task_isolation_request(unsigned int flags);
+extern void task_isolation_cpu_cleanup(void);
+/**
+ * task_isolation_start() - attempt to actually start task isolation
+ *
+ * This function should be invoked as the last thing prior to returning to
+ * user space if TIF_TASK_ISOLATION is set in the thread_info flags.  It
+ * will attempt to quiesce the core and enter task-isolation mode.  If it
+ * fails, it will reset the system call return value to an error code that
+ * indicates the failure mode.
+ */
+extern void task_isolation_start(void);
+
+/**
+ * is_isolation_cpu() - check if CPU is intended for running isolated tasks.
+ * @cpu:	CPU to check.
+ */
+static inline bool is_isolation_cpu(int cpu)
+{
+	return task_isolation_map != NULL &&
+		cpumask_test_cpu(cpu, task_isolation_map);
+}
+
+/**
+ * task_isolation_on_cpu() - check if the cpu is running isolated task
+ * @cpu:	CPU to check.
+ */
+extern int task_isolation_on_cpu(int cpu);
+extern void task_isolation_check_run_cleanup(void);
+
+/**
+ * task_isolation_cpumask() - set CPUs currently running isolated tasks
+ * @mask:	Mask to modify.
+ */
+extern void task_isolation_cpumask(struct cpumask *mask);
+
+/**
+ * task_isolation_clear_cpumask() - clear CPUs currently running isolated tasks
+ * @mask:      Mask to modify.
+ */
+extern void task_isolation_clear_cpumask(struct cpumask *mask);
+
+/**
+ * task_isolation_syscall() - report a syscall from an isolated task
+ * @nr:		The syscall number.
+ *
+ * This routine should be invoked at syscall entry if TIF_TASK_ISOLATION is
+ * set in the thread_info flags.  It checks for valid syscalls,
+ * specifically prctl() with PR_TASK_ISOLATION, exit(), and exit_group().
+ * For any other syscall it will raise a signal and return failure.
+ *
+ * Return: 0 for acceptable syscalls, -1 for all others.
+ */
+extern int task_isolation_syscall(int nr);
+
+/**
+ * _task_isolation_interrupt() - report an interrupt of an isolated task
+ * @fmt:	A format string describing the interrupt
+ * @...:	Format arguments, if any.
+ *
+ * This routine should be invoked at any exception or IRQ if
+ * TIF_TASK_ISOLATION is set in the thread_info flags.  It is not necessary
+ * to invoke it if the exception will generate a signal anyway (e.g. a bad
+ * page fault), and in that case it is preferable not to invoke it but just
+ * rely on the standard Linux signal.  The macro task_isolation_interrupt()
+ * wraps the TIF_TASK_ISOLATION flag test to simplify the caller code.
+ */
+extern void _task_isolation_interrupt(const char *fmt, ...);
+#define task_isolation_interrupt(fmt, ...)				\
+	do {								\
+		if (current_thread_info()->flags & _TIF_TASK_ISOLATION)	\
+			_task_isolation_interrupt(fmt, ## __VA_ARGS__);	\
+	} while (0)
+
+/**
+ * task_isolation_remote() - report a remote interrupt of an isolated task
+ * @cpu:	The remote cpu that is about to be interrupted.
+ * @fmt:	A format string describing the interrupt
+ * @...:	Format arguments, if any.
+ *
+ * This routine should be invoked any time a remote IPI or other type of
+ * interrupt is being delivered to another cpu. The function will check to
+ * see if the target core is running a task-isolation task, and generate a
+ * diagnostic on the console if so; in addition, we tag the task so it
+ * doesn't generate another diagnostic when the interrupt actually arrives.
+ * Generating a diagnostic remotely yields a clearer indication of what
+ * happened than just reporting only when the remote core is interrupted.
+ *
+ */
+extern void task_isolation_remote(int cpu, const char *fmt, ...);
+
+/**
+ * task_isolation_remote_cpumask() - report interruption of multiple cpus
+ * @mask:	The set of remote cpus that are about to be interrupted.
+ * @fmt:	A format string describing the interrupt
+ * @...:	Format arguments, if any.
+ *
+ * This is the cpumask variant of task_isolation_remote().  We
+ * generate a single-line diagnostic message even if multiple remote
+ * task-isolation cpus are being interrupted.
+ */
+extern void task_isolation_remote_cpumask(const struct cpumask *mask,
+					  const char *fmt, ...);
+
+/**
+ * _task_isolation_signal() - disable task isolation when signal is pending
+ * @task:	The task for which to disable isolation.
+ *
+ * This function generates a diagnostic and disables task isolation; it
+ * should be called if TIF_TASK_ISOLATION is set when notifying a task of a
+ * pending signal.  The task_isolation_interrupt() function normally
+ * generates a diagnostic for events that just interrupt a task without
+ * generating a signal; here we need to hook the paths that correspond to
+ * interrupts that do generate a signal.  The macro task_isolation_signal()
+ * wraps the TIF_TASK_ISOLATION flag test to simplify the caller code.
+ */
+extern void _task_isolation_signal(struct task_struct *task);
+#define task_isolation_signal(task)					\
+	do {								\
+		if (task_thread_info(task)->flags & _TIF_TASK_ISOLATION) \
+			_task_isolation_signal(task);			\
+	} while (0)
+
+/**
+ * task_isolation_user_exit() - debug all user_exit calls
+ *
+ * By default, we don't generate an exception in the low-level user_exit()
+ * code, because programs lose the ability to disable task isolation: the
+ * user_exit() hook will cause a signal prior to task_isolation_syscall()
+ * disabling task isolation.  In addition, it means that we lose all the
+ * diagnostic info otherwise available from task_isolation_interrupt() hooks
+ * later in the interrupt-handling process.  But you may enable it here for
+ * a special kernel build if you are having undiagnosed userspace jitter.
+ */
+static inline void task_isolation_user_exit(void)
+{
+#ifdef DEBUG_TASK_ISOLATION
+	task_isolation_interrupt("user_exit");
+#endif
+}
+
+#else /* !CONFIG_TASK_ISOLATION */
+static inline int task_isolation_request(unsigned int flags) { return -EINVAL; }
+static inline void task_isolation_start(void) { }
+static inline bool is_isolation_cpu(int cpu) { return 0; }
+static inline int task_isolation_on_cpu(int cpu) { return 0; }
+static inline void task_isolation_cpumask(struct cpumask *mask) { }
+static inline void task_isolation_clear_cpumask(struct cpumask *mask) { }
+static inline void task_isolation_cpu_cleanup(void) { }
+static inline void task_isolation_check_run_cleanup(void) { }
+static inline int task_isolation_syscall(int nr) { return 0; }
+static inline void task_isolation_interrupt(const char *fmt, ...) { }
+static inline void task_isolation_remote(int cpu, const char *fmt, ...) { }
+static inline void task_isolation_remote_cpumask(const struct cpumask *mask,
+						 const char *fmt, ...) { }
+static inline void task_isolation_signal(struct task_struct *task) { }
+static inline void task_isolation_user_exit(void) { }
+#endif
+
+#endif /* _LINUX_ISOLATION_H */
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 04278493bf15..52fdb32aa3b9 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1280,6 +1280,10 @@ struct task_struct {
 	unsigned long			lowest_stack;
 	unsigned long			prev_lowest_stack;
 #endif
+#ifdef CONFIG_TASK_ISOLATION
+	unsigned short			task_isolation_flags;  /* prctl */
+	unsigned short			task_isolation_state;
+#endif
 
 	/*
 	 * New fields for task_struct should be added above here, so that
diff --git a/include/linux/tick.h b/include/linux/tick.h
index 7340613c7eff..27c7c033d5a8 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -268,6 +268,9 @@ static inline void tick_dep_clear_signal(struct signal_struct *signal,
 extern void tick_nohz_full_kick_cpu(int cpu);
 extern void __tick_nohz_task_switch(void);
 extern void __init tick_nohz_full_setup(cpumask_var_t cpumask);
+#ifdef CONFIG_TASK_ISOLATION
+extern int try_stop_full_tick(void);
+#endif
 #else
 static inline bool tick_nohz_full_enabled(void) { return false; }
 static inline bool tick_nohz_full_cpu(int cpu) { return false; }
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 07b4f8131e36..f4848ed2a069 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -238,4 +238,10 @@ struct prctl_mm_map {
 #define PR_SET_IO_FLUSHER		57
 #define PR_GET_IO_FLUSHER		58
 
+/* Enable task_isolation mode for TASK_ISOLATION kernels. */
+#define PR_TASK_ISOLATION		48
+# define PR_TASK_ISOLATION_ENABLE	(1 << 0)
+# define PR_TASK_ISOLATION_SET_SIG(sig)	(((sig) & 0x7f) << 8)
+# define PR_TASK_ISOLATION_GET_SIG(bits) (((bits) >> 8) & 0x7f)
+
 #endif /* _LINUX_PRCTL_H */
diff --git a/init/Kconfig b/init/Kconfig
index 20a6ac33761c..ecdf567f6bd4 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -576,6 +576,34 @@ config CPU_ISOLATION
 
 source "kernel/rcu/Kconfig"
 
+config HAVE_ARCH_TASK_ISOLATION
+	bool
+
+config TASK_ISOLATION
+	bool "Provide hard CPU isolation from the kernel on demand"
+	depends on NO_HZ_FULL && HAVE_ARCH_TASK_ISOLATION
+	help
+
+	Allow userspace processes that place themselves on cores with
+	nohz_full and isolcpus enabled, and run prctl(PR_TASK_ISOLATION),
+	to "isolate" themselves from the kernel.  Prior to returning to
+	userspace, isolated tasks will arrange that no future kernel
+	activity will interrupt the task while the task is running in
+	userspace.  Attempting to re-enter the kernel while in this mode
+	will cause the task to be terminated with a signal; you must
+	explicitly use prctl() to disable task isolation before resuming
+	normal use of the kernel.
+
+	This "hard" isolation from the kernel is required for userspace
+	tasks that are running hard real-time tasks in userspace, such as
+	a high-speed network driver in userspace.  Without this option, but
+	with NO_HZ_FULL enabled, the kernel will make a best-effort, "soft"
+	effort to shield a single userspace process from interrupts, but
+	makes no guarantees.
+
+	You should say "N" unless you are intending to run a
+	high-performance userspace driver or similar task.
+
 config BUILD_BIN2C
 	bool
 	default n
diff --git a/kernel/Makefile b/kernel/Makefile
index 4cb4130ced32..2f2ae91f90d5 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -122,6 +122,8 @@ obj-$(CONFIG_GCC_PLUGIN_STACKLEAK) += stackleak.o
 KASAN_SANITIZE_stackleak.o := n
 KCOV_INSTRUMENT_stackleak.o := n
 
+obj-$(CONFIG_TASK_ISOLATION) += isolation.o
+
 $(obj)/configs.o: $(obj)/config_data.gz
 
 targets += config_data.gz
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 0296b4bda8f1..e9206736f219 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -21,6 +21,7 @@
 #include <linux/hardirq.h>
 #include <linux/export.h>
 #include <linux/kprobes.h>
+#include <linux/isolation.h>
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/context_tracking.h>
@@ -157,6 +158,7 @@ void __context_tracking_exit(enum ctx_state state)
 			if (state == CONTEXT_USER) {
 				vtime_user_exit(current);
 				trace_user_exit(0);
+				task_isolation_user_exit();
 			}
 		}
 		__this_cpu_write(context_tracking.state, CONTEXT_KERNEL);
diff --git a/kernel/isolation.c b/kernel/isolation.c
new file mode 100644
index 000000000000..8e982ae1b19e
--- /dev/null
+++ b/kernel/isolation.c
@@ -0,0 +1,774 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ *  linux/kernel/isolation.c
+ *
+ *  Implementation of task isolation.
+ *
+ * Authors:
+ *   Chris Metcalf <cmetcalf@mellanox.com>
+ *   Alex Belits <abelits@marvell.com>
+ *   Yuri Norov <ynorov@marvell.com>
+ */
+
+#include <linux/mm.h>
+#include <linux/swap.h>
+#include <linux/vmstat.h>
+#include <linux/sched.h>
+#include <linux/isolation.h>
+#include <linux/syscalls.h>
+#include <linux/smp.h>
+#include <linux/tick.h>
+#include <asm/unistd.h>
+#include <asm/syscall.h>
+#include <linux/hrtimer.h>
+
+/*
+ * These values are stored in task_isolation_state.
+ * Note that STATE_NORMAL + TIF_TASK_ISOLATION means we are still
+ * returning from sys_prctl() to userspace.
+ */
+enum {
+	STATE_NORMAL = 0,	/* Not isolated */
+	STATE_ISOLATED = 1	/* In userspace, isolated */
+};
+
+/*
+ * This variable contains thread flags copied at the moment
+ * when schedule() switched to the task on a given CPU,
+ * or 0 if no task is running.
+ */
+DEFINE_PER_CPU(unsigned long, tsk_thread_flags_cache);
+
+/*
+ * Counter for isolation state on a given CPU, increments when entering
+ * isolation and decrements when exiting isolation (before or after the
+ * cleanup). Multiple simultaneously running procedures entering or
+ * exiting isolation are prevented by checking the result of
+ * incrementing or decrementing this variable. This variable is both
+ * incremented and decremented by the CPU that caused isolation entry
+ * or exit.
+ *
+ * This is necessary because multiple isolation-breaking events may happen
+ * at once (or one as the result of the other), however isolation exit
+ * may only happen once to transition from isolated to non-isolated state.
+ * Therefore, if decrementing this counter results in a value less than 0,
+ * isolation exit procedure can't be started -- it already happened, or is
+ * in progress, or isolation is not entered yet.
+ */
+DEFINE_PER_CPU(atomic_t, isol_counter);
+
+/*
+ * Description of the last two tasks that ran isolated on a given CPU.
+ * This is intended only for messages about isolation breaking. We
+ * don't want any references to actual task while accessing this from
+ * CPU that caused isolation breaking -- we know nothing about timing
+ * and don't want to use locking or RCU.
+ */
+struct isol_task_desc {
+	atomic_t curr_index;
+	atomic_t curr_index_wr;
+	bool	warned[2];
+	pid_t	pid[2];
+	pid_t	tgid[2];
+	char	comm[2][TASK_COMM_LEN];
+};
+static DEFINE_PER_CPU(struct isol_task_desc, isol_task_descs);
+
+/*
+ * Counter for isolation exiting procedures (from request to the start of
+ * cleanup) being attempted at once on a CPU. Normally incrementing of
+ * this counter is performed from the CPU that caused isolation breaking,
+ * however decrementing is done from the cleanup procedure, delegated to
+ * the CPU that is exiting isolation, not from the CPU that caused isolation
+ * breaking.
+ *
+ * If incrementing this counter while starting isolation exit procedure
+ * results in a value greater than 0, isolation exiting is already in
+ * progress, and cleanup did not start yet. This means, counter should be
+ * decremented back, and isolation exit that is already in progress, should
+ * be allowed to complete. Otherwise, a new isolation exit procedure should
+ * be started.
+ */
+DEFINE_PER_CPU(atomic_t, isol_exit_counter);
+
+/*
+ * Descriptor for isolation-breaking SMP calls
+ */
+DEFINE_PER_CPU(call_single_data_t, isol_break_csd);
+
+cpumask_var_t task_isolation_map;
+cpumask_var_t task_isolation_cleanup_map;
+static DEFINE_SPINLOCK(task_isolation_cleanup_lock);
+
+/* We can run on cpus that are isolated from the scheduler and are nohz_full. */
+static int __init task_isolation_init(void)
+{
+	alloc_bootmem_cpumask_var(&task_isolation_cleanup_map);
+	if (alloc_cpumask_var(&task_isolation_map, GFP_KERNEL))
+		/*
+		 * At this point task isolation should match
+		 * nohz_full. This may change in the future.
+		 */
+		cpumask_copy(task_isolation_map, tick_nohz_full_mask);
+	return 0;
+}
+core_initcall(task_isolation_init)
+
+/* Enable stack backtraces of any interrupts of task_isolation cores. */
+static bool task_isolation_debug;
+static int __init task_isolation_debug_func(char *str)
+{
+	task_isolation_debug = true;
+	return 1;
+}
+__setup("task_isolation_debug", task_isolation_debug_func);
+
+/*
+ * Record name, pid and group pid of the task entering isolation on
+ * the current CPU.
+ */
+static void record_curr_isolated_task(void)
+{
+	int ind;
+	int cpu = smp_processor_id();
+	struct isol_task_desc *desc = &per_cpu(isol_task_descs, cpu);
+	struct task_struct *task = current;
+
+	/* Finish everything before recording current task */
+	smp_mb();
+	ind = atomic_inc_return(&desc->curr_index_wr) & 1;
+	desc->comm[ind][sizeof(task->comm) - 1] = '\0';
+	memcpy(desc->comm[ind], task->comm, sizeof(task->comm) - 1);
+	desc->pid[ind] = task->pid;
+	desc->tgid[ind] = task->tgid;
+	desc->warned[ind] = false;
+	/* Write everything, to be seen by other CPUs */
+	smp_mb();
+	atomic_inc(&desc->curr_index);
+	/* Everyone will see the new record from this point */
+	smp_mb();
+}
+
+/*
+ * Print message prefixed with the description of the current (or
+ * last) isolated task on a given CPU. Intended for isolation breaking
+ * messages that include target task for the user's convenience.
+ *
+ * Messages produced with this function may have obsolete task
+ * information if isolated tasks managed to exit, start and enter
+ * isolation multiple times, or multiple tasks tried to enter
+ * isolation on the same CPU at once. For those unusual cases it would
+ * contain a valid description of the cause for isolation breaking and
+ * target CPU number, just not the correct description of which task
+ * ended up losing isolation.
+ */
+int task_isolation_message(int cpu, int level, bool supp, const char *fmt, ...)
+{
+	struct isol_task_desc *desc;
+	struct task_struct *task;
+	va_list args;
+	char buf_prefix[TASK_COMM_LEN + 20 + 3 * 20];
+	char buf[200];
+	int curr_cpu, ind_counter, ind_counter_old, ind;
+
+	curr_cpu = get_cpu();
+	desc = &per_cpu(isol_task_descs, cpu);
+	ind_counter = atomic_read(&desc->curr_index);
+
+	if (curr_cpu == cpu) {
+		/*
+		 * Message is for the current CPU so current
+		 * task_struct should be used instead of cached
+		 * information.
+		 *
+		 * Like in other diagnostic messages, if issued from
+		 * interrupt context, current will be the interrupted
+		 * task. Unlike other diagnostic messages, this is
+		 * always relevant because the message is about
+		 * interrupting a task.
+		 */
+		ind = ind_counter & 1;
+		if (supp && desc->warned[ind]) {
+			/*
+			 * If supp is true, skip the message if the
+			 * same task was mentioned in the message
+			 * originated on remote CPU, and it did not
+			 * re-enter isolated state since then (warned
+			 * is true). Only local messages following
+			 * remote messages, likely about the same
+			 * isolation breaking event, are skipped to
+			 * avoid duplication. If remote cause is
+			 * immediately followed by a local one before
+			 * isolation is broken, local cause is skipped
+			 * from messages.
+			 */
+			put_cpu();
+			return 0;
+		}
+		task = current;
+		snprintf(buf_prefix, sizeof(buf_prefix),
+			 "isolation %s/%d/%d (cpu %d)",
+			 task->comm, task->tgid, task->pid, cpu);
+		put_cpu();
+	} else {
+		/*
+		 * Message is for remote CPU, use cached information.
+		 */
+		put_cpu();
+		/*
+		 * Make sure, index remained unchanged while data was
+		 * copied. If it changed, data that was copied may be
+		 * inconsistent because two updates in a sequence could
+		 * overwrite the data while it was being read.
+		 */
+		do {
+			/* Make sure we are reading up to date values */
+			smp_mb();
+			ind = ind_counter & 1;
+			snprintf(buf_prefix, sizeof(buf_prefix),
+				 "isolation %s/%d/%d (cpu %d)",
+				 desc->comm[ind], desc->tgid[ind],
+				 desc->pid[ind], cpu);
+			desc->warned[ind] = true;
+			ind_counter_old = ind_counter;
+			/* Record the warned flag, then re-read descriptor */
+			smp_mb();
+			ind_counter = atomic_read(&desc->curr_index);
+			/*
+			 * If the counter changed, something was updated, so
+			 * repeat everything to get the current data
+			 */
+		} while (ind_counter != ind_counter_old);
+	}
+
+	va_start(args, fmt);
+	vsnprintf(buf, sizeof(buf), fmt, args);
+	va_end(args);
+
+	switch (level) {
+	case LOGLEVEL_EMERG:
+		pr_emerg("%s: %s", buf_prefix, buf);
+		break;
+	case LOGLEVEL_ALERT:
+		pr_alert("%s: %s", buf_prefix, buf);
+		break;
+	case LOGLEVEL_CRIT:
+		pr_crit("%s: %s", buf_prefix, buf);
+		break;
+	case LOGLEVEL_ERR:
+		pr_err("%s: %s", buf_prefix, buf);
+		break;
+	case LOGLEVEL_WARNING:
+		pr_warn("%s: %s", buf_prefix, buf);
+		break;
+	case LOGLEVEL_NOTICE:
+		pr_notice("%s: %s", buf_prefix, buf);
+		break;
+	case LOGLEVEL_INFO:
+		pr_info("%s: %s", buf_prefix, buf);
+		break;
+	case LOGLEVEL_DEBUG:
+		pr_debug("%s: %s", buf_prefix, buf);
+		break;
+	default:
+		/* No message without a valid level */
+		return 0;
+	}
+	return 1;
+}
+
+/*
+ * Dump stack if need be. This can be helpful even from the final exit
+ * to usermode code since stack traces sometimes carry information about
+ * what put you into the kernel, e.g. an interrupt number encoded in
+ * the initial entry stack frame that is still visible at exit time.
+ */
+static void debug_dump_stack(void)
+{
+	if (task_isolation_debug)
+		dump_stack();
+}
+
+/*
+ * Set the flags word but don't try to actually start task isolation yet.
+ * We will start it when entering user space in task_isolation_start().
+ */
+int task_isolation_request(unsigned int flags)
+{
+	struct task_struct *task = current;
+
+	/*
+	 * The task isolation flags should always be cleared just by
+	 * virtue of having entered the kernel.
+	 */
+	WARN_ON_ONCE(test_tsk_thread_flag(task, TIF_TASK_ISOLATION));
+	WARN_ON_ONCE(task->task_isolation_flags != 0);
+	WARN_ON_ONCE(task->task_isolation_state != STATE_NORMAL);
+
+	task->task_isolation_flags = flags;
+	if (!(task->task_isolation_flags & PR_TASK_ISOLATION_ENABLE))
+		return 0;
+
+	/* We are trying to enable task isolation. */
+	set_tsk_thread_flag(task, TIF_TASK_ISOLATION);
+
+	/*
+	 * Shut down the vmstat worker so we're not interrupted later.
+	 * We have to try to do this here (with interrupts enabled) since
+	 * we are canceling delayed work and will call flush_work()
+	 * (which enables interrupts) and possibly schedule().
+	 */
+	quiet_vmstat_sync();
+
+	/* We return 0 here but we may change that in task_isolation_start(). */
+	return 0;
+}
+
+/*
+ * Perform actions that should be done immediately on exit from isolation.
+ */
+static void fast_task_isolation_cpu_cleanup(void *info)
+{
+	atomic_dec(&per_cpu(isol_exit_counter, smp_processor_id()));
+	/* At this point breaking isolation from other CPUs is possible again */
+
+	/*
+	 * This task is no longer isolated (and if by any chance this
+	 * is the wrong task, it's already not isolated)
+	 */
+	current->task_isolation_flags = 0;
+	clear_tsk_thread_flag(current, TIF_TASK_ISOLATION);
+
+	/* Run the rest of cleanup later */
+	set_tsk_thread_flag(current, TIF_NOTIFY_RESUME);
+
+	/* Copy flags with task isolation disabled */
+	this_cpu_write(tsk_thread_flags_cache,
+		       READ_ONCE(task_thread_info(current)->flags));
+}
+
+/* Disable task isolation for the specified task. */
+static void stop_isolation(struct task_struct *p)
+{
+	int cpu, this_cpu;
+	unsigned long flags;
+
+	this_cpu = get_cpu();
+	cpu = task_cpu(p);
+	if (atomic_inc_return(&per_cpu(isol_exit_counter, cpu)) > 1) {
+		/* Already exiting isolation */
+		atomic_dec(&per_cpu(isol_exit_counter, cpu));
+		put_cpu();
+		return;
+	}
+
+	if (p == current) {
+		p->task_isolation_state = STATE_NORMAL;
+		fast_task_isolation_cpu_cleanup(NULL);
+		task_isolation_cpu_cleanup();
+		if (atomic_dec_return(&per_cpu(isol_counter, cpu)) < 0) {
+			/* Is not isolated already */
+			atomic_inc(&per_cpu(isol_counter, cpu));
+		}
+		put_cpu();
+	} else {
+		if (atomic_dec_return(&per_cpu(isol_counter, cpu)) < 0) {
+			/* Is not isolated already */
+			atomic_inc(&per_cpu(isol_counter, cpu));
+			atomic_dec(&per_cpu(isol_exit_counter, cpu));
+			put_cpu();
+			return;
+		}
+		/*
+		 * Schedule "slow" cleanup. This relies on
+		 * TIF_NOTIFY_RESUME being set
+		 */
+		spin_lock_irqsave(&task_isolation_cleanup_lock, flags);
+		cpumask_set_cpu(cpu, task_isolation_cleanup_map);
+		spin_unlock_irqrestore(&task_isolation_cleanup_lock, flags);
+		/*
+		 * Setting flags is delegated to the CPU where
+		 * isolated task is running
+		 * isol_exit_counter will be decremented from there as well.
+		 */
+		per_cpu(isol_break_csd, cpu).func =
+		    fast_task_isolation_cpu_cleanup;
+		per_cpu(isol_break_csd, cpu).info = NULL;
+		per_cpu(isol_break_csd, cpu).flags = 0;
+		smp_call_function_single_async(cpu,
+					       &per_cpu(isol_break_csd, cpu));
+		put_cpu();
+	}
+}
+
+/*
+ * This code runs with interrupts disabled just before the return to
+ * userspace, after a prctl() has requested enabling task isolation.
+ * We take whatever steps are needed to avoid being interrupted later:
+ * drain the lru pages, stop the scheduler tick, etc.  More
+ * functionality may be added here later to avoid other types of
+ * interrupts from other kernel subsystems.
+ *
+ * If we can't enable task isolation, we update the syscall return
+ * value with an appropriate error.
+ */
+void task_isolation_start(void)
+{
+	int error;
+
+	/*
+	 * We should only be called in STATE_NORMAL (isolation disabled),
+	 * on our way out of the kernel from the prctl() that turned it on.
+	 * If we are exiting from the kernel in another state, it means we
+	 * made it back into the kernel without disabling task isolation,
+	 * and we should investigate how (and in any case disable task
+	 * isolation at this point).  We are clearly not on the path back
+	 * from the prctl() so we don't touch the syscall return value.
+	 */
+	if (WARN_ON_ONCE(current->task_isolation_state != STATE_NORMAL)) {
+		/* Increment counter, this will allow isolation breaking */
+		if (atomic_inc_return(&per_cpu(isol_counter,
+					      smp_processor_id())) > 1) {
+			atomic_dec(&per_cpu(isol_counter, smp_processor_id()));
+		}
+		atomic_inc(&per_cpu(isol_counter, smp_processor_id()));
+		stop_isolation(current);
+		return;
+	}
+
+	/*
+	 * Must be affinitized to a single core with task isolation possible.
+	 * In principle this could be remotely modified between the prctl()
+	 * and the return to userspace, so we have to check it here.
+	 */
+	if (current->nr_cpus_allowed != 1 ||
+	    !is_isolation_cpu(smp_processor_id())) {
+		error = -EINVAL;
+		goto error;
+	}
+
+	/* If the vmstat delayed work is not canceled, we have to try again. */
+	if (!vmstat_idle()) {
+		error = -EAGAIN;
+		goto error;
+	}
+
+	/* Try to stop the dynamic tick. */
+	error = try_stop_full_tick();
+	if (error)
+		goto error;
+
+	/* Drain the pagevecs to avoid unnecessary IPI flushes later. */
+	lru_add_drain();
+
+	/* Increment counter, this will allow isolation breaking */
+	if (atomic_inc_return(&per_cpu(isol_counter,
+				      smp_processor_id())) > 1) {
+		atomic_dec(&per_cpu(isol_counter, smp_processor_id()));
+	}
+
+	/* Record isolated task IDs and name */
+	record_curr_isolated_task();
+
+	/* Copy flags with task isolation enabled */
+	this_cpu_write(tsk_thread_flags_cache,
+		       READ_ONCE(task_thread_info(current)->flags));
+
+	current->task_isolation_state = STATE_ISOLATED;
+	return;
+
+error:
+	/* Increment counter, this will allow isolation breaking */
+	if (atomic_inc_return(&per_cpu(isol_counter,
+				      smp_processor_id())) > 1) {
+		atomic_dec(&per_cpu(isol_counter, smp_processor_id()));
+	}
+	stop_isolation(current);
+	syscall_set_return_value(current, current_pt_regs(), error, 0);
+}
+
+/* Stop task isolation on the remote task and send it a signal. */
+static void send_isolation_signal(struct task_struct *task)
+{
+	int flags = task->task_isolation_flags;
+	kernel_siginfo_t info = {
+		.si_signo = PR_TASK_ISOLATION_GET_SIG(flags) ?: SIGKILL,
+	};
+
+	stop_isolation(task);
+	send_sig_info(info.si_signo, &info, task);
+}
+
+/* Only a few syscalls are valid once we are in task isolation mode. */
+static bool is_acceptable_syscall(int syscall)
+{
+	/* No need to incur an isolation signal if we are just exiting. */
+	if (syscall == __NR_exit || syscall == __NR_exit_group)
+		return true;
+
+	/* Check to see if it's the prctl for isolation. */
+	if (syscall == __NR_prctl) {
+		unsigned long arg[SYSCALL_MAX_ARGS];
+
+		syscall_get_arguments(current, current_pt_regs(), arg);
+		if (arg[0] == PR_TASK_ISOLATION)
+			return true;
+	}
+
+	return false;
+}
+
+/*
+ * This routine is called from syscall entry, prevents most syscalls
+ * from executing, and if needed raises a signal to notify the process.
+ *
+ * Note that we have to stop isolation before we even print a message
+ * here, since otherwise we might end up reporting an interrupt due to
+ * kicking the printk handling code, rather than reporting the true
+ * cause of interrupt here.
+ *
+ * The message is not suppressed by previous remotely triggered
+ * messages.
+ */
+int task_isolation_syscall(int syscall)
+{
+	struct task_struct *task = current;
+
+	if (is_acceptable_syscall(syscall)) {
+		stop_isolation(task);
+		return 0;
+	}
+
+	send_isolation_signal(task);
+
+	pr_task_isol_warn(smp_processor_id(),
+			  "task_isolation lost due to syscall %d\n",
+			  syscall);
+	debug_dump_stack();
+
+	syscall_set_return_value(task, current_pt_regs(), -ERESTARTNOINTR, -1);
+	return -1;
+}
+
+/*
+ * This routine is called from any exception or irq that doesn't
+ * otherwise trigger a signal to the user process (e.g. page fault).
+ *
+ * Messages will be suppressed if there is already a reported remote
+ * cause for isolation breaking, so we don't generate multiple
+ * confusingly similar messages about the same event.
+ */
+void _task_isolation_interrupt(const char *fmt, ...)
+{
+	struct task_struct *task = current;
+	va_list args;
+	char buf[100];
+
+	/* RCU should have been enabled prior to this point. */
+	RCU_LOCKDEP_WARN(!rcu_is_watching(), "kernel entry without RCU");
+
+	/* Are we exiting isolation already? */
+	if (atomic_read(&per_cpu(isol_exit_counter, smp_processor_id())) != 0) {
+		task->task_isolation_state = STATE_NORMAL;
+		return;
+	}
+	/*
+	 * Avoid reporting interrupts that happen after we have prctl'ed
+	 * to enable isolation, but before we have returned to userspace.
+	 */
+	if (task->task_isolation_state == STATE_NORMAL)
+		return;
+
+	va_start(args, fmt);
+	vsnprintf(buf, sizeof(buf), fmt, args);
+	va_end(args);
+
+	/* Handle NMIs minimally, since we can't send a signal. */
+	if (in_nmi()) {
+		pr_task_isol_err(smp_processor_id(),
+				 "isolation: in NMI; not delivering signal\n");
+	} else {
+		send_isolation_signal(task);
+	}
+
+	if (pr_task_isol_warn_supp(smp_processor_id(),
+				   "task_isolation lost due to %s\n", buf))
+		debug_dump_stack();
+}
+
+/*
+ * Called before we wake up a task that has a signal to process.
+ * Needs to be done to handle interrupts that trigger signals, which
+ * we don't catch with task_isolation_interrupt() hooks.
+ *
+ * This message is also suppressed if there was already a remotely
+ * caused message about the same isolation breaking event.
+ */
+void _task_isolation_signal(struct task_struct *task)
+{
+	struct isol_task_desc *desc;
+	int ind, cpu;
+	bool do_warn = (task->task_isolation_state == STATE_ISOLATED);
+
+	cpu = task_cpu(task);
+	desc = &per_cpu(isol_task_descs, cpu);
+	ind = atomic_read(&desc->curr_index) & 1;
+	if (desc->warned[ind])
+		do_warn = false;
+
+	stop_isolation(task);
+
+	if (do_warn) {
+		pr_warn("isolation: %s/%d/%d (cpu %d): task_isolation lost due to signal\n",
+			task->comm, task->tgid, task->pid, cpu);
+		debug_dump_stack();
+	}
+}
+
+/*
+ * Generate a stack backtrace if we are going to interrupt another task
+ * isolation process.
+ */
+void task_isolation_remote(int cpu, const char *fmt, ...)
+{
+	struct task_struct *curr_task;
+	va_list args;
+	char buf[200];
+
+	if (!is_isolation_cpu(cpu) || !task_isolation_on_cpu(cpu))
+		return;
+
+	curr_task = current;
+
+	va_start(args, fmt);
+	vsnprintf(buf, sizeof(buf), fmt, args);
+	va_end(args);
+	if (pr_task_isol_warn(cpu,
+			      "task_isolation lost due to %s by %s/%d/%d on cpu %d\n",
+			      buf,
+			      curr_task->comm, curr_task->tgid,
+			      curr_task->pid, smp_processor_id()))
+		debug_dump_stack();
+}
+
+/*
+ * Generate a stack backtrace if any of the cpus in "mask" are running
+ * task isolation processes.
+ */
+void task_isolation_remote_cpumask(const struct cpumask *mask,
+				   const char *fmt, ...)
+{
+	struct task_struct *curr_task;
+	cpumask_var_t warn_mask;
+	va_list args;
+	char buf[200];
+	int cpu, first_cpu;
+
+	if (task_isolation_map == NULL ||
+		!zalloc_cpumask_var(&warn_mask, GFP_KERNEL))
+		return;
+
+	first_cpu = -1;
+	for_each_cpu_and(cpu, mask, task_isolation_map) {
+		if (task_isolation_on_cpu(cpu)) {
+			if (first_cpu < 0)
+				first_cpu = cpu;
+			else
+				cpumask_set_cpu(cpu, warn_mask);
+		}
+	}
+
+	if (first_cpu < 0)
+		goto done;
+
+	curr_task = current;
+
+	va_start(args, fmt);
+	vsnprintf(buf, sizeof(buf), fmt, args);
+	va_end(args);
+
+	if (cpumask_weight(warn_mask) == 0)
+		pr_task_isol_warn(first_cpu,
+				  "task_isolation lost due to %s by %s/%d/%d on cpu %d\n",
+				  buf, curr_task->comm, curr_task->tgid,
+				  curr_task->pid, smp_processor_id());
+	else
+		pr_task_isol_warn(first_cpu,
+				  " and cpus %*pbl: task_isolation lost due to %s by %s/%d/%d on cpu %d\n",
+				  cpumask_pr_args(warn_mask),
+				  buf, curr_task->comm, curr_task->tgid,
+				  curr_task->pid, smp_processor_id());
+	debug_dump_stack();
+
+done:
+	free_cpumask_var(warn_mask);
+}
+
+/*
+ * Check if given CPU is running isolated task.
+ */
+int task_isolation_on_cpu(int cpu)
+{
+	return test_bit(TIF_TASK_ISOLATION,
+			&per_cpu(tsk_thread_flags_cache, cpu));
+}
+
+/*
+ * Set CPUs currently running isolated tasks in CPU mask.
+ */
+void task_isolation_cpumask(struct cpumask *mask)
+{
+	int cpu;
+
+	if (task_isolation_map == NULL)
+		return;
+
+	for_each_cpu(cpu, task_isolation_map)
+		if (task_isolation_on_cpu(cpu))
+			cpumask_set_cpu(cpu, mask);
+}
+
+/*
+ * Clear CPUs currently running isolated tasks in CPU mask.
+ */
+void task_isolation_clear_cpumask(struct cpumask *mask)
+{
+	int cpu;
+
+	if (task_isolation_map == NULL)
+		return;
+
+	for_each_cpu(cpu, task_isolation_map)
+		if (task_isolation_on_cpu(cpu))
+			cpumask_clear_cpu(cpu, mask);
+}
+
+/*
+ * Cleanup procedure. The call to this procedure may be delayed.
+ */
+void task_isolation_cpu_cleanup(void)
+{
+	kick_hrtimer();
+}
+
+/*
+ * Check if cleanup is scheduled on the current CPU, and if so, run it.
+ * Intended to be called from notify_resume() or another such callback
+ * on the target CPU.
+ */
+void task_isolation_check_run_cleanup(void)
+{
+	int cpu;
+	unsigned long flags;
+
+	spin_lock_irqsave(&task_isolation_cleanup_lock, flags);
+
+	cpu = smp_processor_id();
+
+	if (cpumask_test_cpu(cpu, task_isolation_cleanup_map)) {
+		cpumask_clear_cpu(cpu, task_isolation_cleanup_map);
+		spin_unlock_irqrestore(&task_isolation_cleanup_lock, flags);
+		task_isolation_cpu_cleanup();
+	} else
+		spin_unlock_irqrestore(&task_isolation_cleanup_lock, flags);
+}
diff --git a/kernel/signal.c b/kernel/signal.c
index 5b2396350dd1..1df57e38c361 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -46,6 +46,7 @@
 #include <linux/livepatch.h>
 #include <linux/cgroup.h>
 #include <linux/audit.h>
+#include <linux/isolation.h>
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/signal.h>
@@ -758,6 +759,7 @@ static int dequeue_synchronous_signal(kernel_siginfo_t *info)
  */
 void signal_wake_up_state(struct task_struct *t, unsigned int state)
 {
+	task_isolation_signal(t);
 	set_tsk_thread_flag(t, TIF_SIGPENDING);
 	/*
 	 * TASK_WAKEKILL also means wake it up in the stopped/traced/killable
diff --git a/kernel/sys.c b/kernel/sys.c
index f9bc5c303e3f..0a4059a8c4f9 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -42,6 +42,7 @@
 #include <linux/syscore_ops.h>
 #include <linux/version.h>
 #include <linux/ctype.h>
+#include <linux/isolation.h>
 
 #include <linux/compat.h>
 #include <linux/syscalls.h>
@@ -2513,6 +2514,11 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 
 		error = (current->flags & PR_IO_FLUSHER) == PR_IO_FLUSHER;
 		break;
+	case PR_TASK_ISOLATION:
+		if (arg3 || arg4 || arg5)
+			return -EINVAL;
+		error = task_isolation_request(arg2);
+		break;
 	default:
 		error = -EINVAL;
 		break;
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 3a609e7344f3..5bb98f39bde6 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -30,6 +30,7 @@
 #include <linux/syscalls.h>
 #include <linux/interrupt.h>
 #include <linux/tick.h>
+#include <linux/isolation.h>
 #include <linux/err.h>
 #include <linux/debugobjects.h>
 #include <linux/sched/signal.h>
@@ -721,6 +722,19 @@ static void retrigger_next_event(void *arg)
 	raw_spin_unlock(&base->lock);
 }
 
+#ifdef CONFIG_TASK_ISOLATION
+void kick_hrtimer(void)
+{
+	unsigned long flags;
+
+	preempt_disable();
+	local_irq_save(flags);
+	retrigger_next_event(NULL);
+	local_irq_restore(flags);
+	preempt_enable();
+}
+#endif
+
 /*
  * Switch to high resolution mode
  */
@@ -868,8 +882,21 @@ static void hrtimer_reprogram(struct hrtimer *timer, bool reprogram)
 void clock_was_set(void)
 {
 #ifdef CONFIG_HIGH_RES_TIMERS
+#ifdef CONFIG_TASK_ISOLATION
+	struct cpumask mask;
+
+	cpumask_clear(&mask);
+	task_isolation_cpumask(&mask);
+	cpumask_complement(&mask, &mask);
+	/*
+	 * Retrigger the CPU local events everywhere except CPUs
+	 * running isolated tasks.
+	 */
+	on_each_cpu_mask(&mask, retrigger_next_event, NULL, 1);
+#else
 	/* Retrigger the CPU local events everywhere */
 	on_each_cpu(retrigger_next_event, NULL, 1);
+#endif
 #endif
 	timerfd_clock_was_set();
 }
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index a792d21cac64..1d4dec9d3ee7 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -882,6 +882,24 @@ static void tick_nohz_full_update_tick(struct tick_sched *ts)
 #endif
 }
 
+#ifdef CONFIG_TASK_ISOLATION
+int try_stop_full_tick(void)
+{
+	int cpu = smp_processor_id();
+	struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched);
+
+	/* For an unstable clock, we should return a permanent error code. */
+	if (atomic_read(&tick_dep_mask) & TICK_DEP_MASK_CLOCK_UNSTABLE)
+		return -EINVAL;
+
+	if (!can_stop_full_tick(cpu, ts))
+		return -EAGAIN;
+
+	tick_nohz_stop_sched_tick(ts, cpu);
+	return 0;
+}
+#endif
+
 static bool can_stop_idle_tick(int cpu, struct tick_sched *ts)
 {
 	/*
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH 04/12] task_isolation: Add task isolation hooks to arch-independent code
  2020-03-04 16:01 [PATCH 00/12] "Task_isolation" mode Alex Belits
                   ` (2 preceding siblings ...)
  2020-03-04 16:07 ` [PATCH 03/12] task_isolation: userspace hard isolation from kernel Alex Belits
@ 2020-03-04 16:08 ` Alex Belits
  2020-03-04 16:09 ` [PATCH 05/12] task_isolation: arch/x86: enable task isolation functionality Alex Belits
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 71+ messages in thread
From: Alex Belits @ 2020-03-04 16:08 UTC (permalink / raw)
  To: frederic, rostedt
  Cc: mingo, peterz, linux-kernel, Prasun Kapoor, tglx, linux-api,
	linux-mm, linux-arch

From: Chris Metcalf <cmetcalf@mellanox.com>

This commit adds task isolation hooks as follows:

- __handle_domain_irq() generates an isolation warning for the
  local task

- irq_work_queue_on() generates an isolation warning for the remote
  task being interrupted for irq_work

- generic_exec_single() generates a remote isolation warning for
  the remote cpu being IPI'd

- smp_call_function_many() generates a remote isolation warning for
  the set of remote cpus being IPI'd

Calls to task_isolation_remote() or task_isolation_interrupt() can
be placed in the platform-independent code like this when doing so
results in fewer lines of code changes, as for example is true of
the users of the arch_send_call_function_*() APIs. Or, they can be
placed in the per-architecture code when there are many callers,
as for example is true of the smp_send_reschedule() call.

A further cleanup might be to create an intermediate layer, so that
for example smp_send_reschedule() is a single generic function that
just calls arch_smp_send_reschedule(), allowing generic code to be
called every time smp_send_reschedule() is invoked. But for now, we
just update either callers or callees as makes most sense.
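
As a rough sketch (with a hypothetical arch_smp_send_reschedule(), which
is not part of this series), the intermediate layer mentioned above would
look something like this:

	/* kernel/smp.c: generic wrapper, illustration only */
	void smp_send_reschedule(int cpu)
	{
		task_isolation_remote(cpu, "reschedule IPI");
		arch_smp_send_reschedule(cpu);
	}

Each architecture would then only provide arch_smp_send_reschedule(), and
the isolation hook would live in a single place instead of being repeated
in every architecture's implementation.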

Signed-off-by: Alex Belits <abelits@marvell.com>
---
 kernel/irq/irqdesc.c | 9 +++++++++
 kernel/irq_work.c    | 5 ++++-
 kernel/smp.c         | 6 +++++-
 3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
index 98a5f10d1900..e2b81d035fa1 100644
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -16,6 +16,7 @@
 #include <linux/bitmap.h>
 #include <linux/irqdomain.h>
 #include <linux/sysfs.h>
+#include <linux/isolation.h>
 
 #include "internals.h"
 
@@ -670,6 +671,10 @@ int __handle_domain_irq(struct irq_domain *domain, unsigned int hwirq,
 		irq = irq_find_mapping(domain, hwirq);
 #endif
 
+	task_isolation_interrupt((irq == hwirq) ?
+				 "irq %d (%s)" : "irq %d (%s hwirq %d)",
+				 irq, domain ? domain->name : "", hwirq);
+
 	/*
 	 * Some hardware gives randomly wrong interrupts.  Rather
 	 * than crashing, do something sensible.
@@ -711,6 +716,10 @@ int handle_domain_nmi(struct irq_domain *domain, unsigned int hwirq,
 
 	irq = irq_find_mapping(domain, hwirq);
 
+	task_isolation_interrupt((irq == hwirq) ?
+				 "NMI irq %d (%s)" : "NMI irq %d (%s hwirq %d)",
+				 irq, domain ? domain->name : "", hwirq);
+
 	/*
 	 * ack_bad_irq is not NMI-safe, just report
 	 * an invalid interrupt.
diff --git a/kernel/irq_work.c b/kernel/irq_work.c
index 828cc30774bc..8fd4ece43dd8 100644
--- a/kernel/irq_work.c
+++ b/kernel/irq_work.c
@@ -18,6 +18,7 @@
 #include <linux/cpu.h>
 #include <linux/notifier.h>
 #include <linux/smp.h>
+#include <linux/isolation.h>
 #include <asm/processor.h>
 
 
@@ -102,8 +103,10 @@ bool irq_work_queue_on(struct irq_work *work, int cpu)
 	if (cpu != smp_processor_id()) {
 		/* Arch remote IPI send/receive backend aren't NMI safe */
 		WARN_ON_ONCE(in_nmi());
-		if (llist_add(&work->llnode, &per_cpu(raised_list, cpu)))
+		if (llist_add(&work->llnode, &per_cpu(raised_list, cpu))) {
+			task_isolation_remote(cpu, "irq_work");
 			arch_send_call_function_single_ipi(cpu);
+		}
 	} else {
 		__irq_work_queue_local(work);
 	}
diff --git a/kernel/smp.c b/kernel/smp.c
index d0ada39eb4d4..3a8bcbdd4ce6 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -20,6 +20,7 @@
 #include <linux/sched.h>
 #include <linux/sched/idle.h>
 #include <linux/hypervisor.h>
+#include <linux/isolation.h>
 
 #include "smpboot.h"
 
@@ -176,8 +177,10 @@ static int generic_exec_single(int cpu, call_single_data_t *csd,
 	 * locking and barrier primitives. Generic code isn't really
 	 * equipped to do the right thing...
 	 */
-	if (llist_add(&csd->llist, &per_cpu(call_single_queue, cpu)))
+	if (llist_add(&csd->llist, &per_cpu(call_single_queue, cpu))) {
+		task_isolation_remote(cpu, "IPI function");
 		arch_send_call_function_single_ipi(cpu);
+	}
 
 	return 0;
 }
@@ -466,6 +469,7 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 	}
 
 	/* Send a message to all CPUs in the map */
+	task_isolation_remote_cpumask(cfd->cpumask_ipi, "IPI function");
 	arch_send_call_function_ipi_mask(cfd->cpumask_ipi);
 
 	if (wait) {
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH 05/12] task_isolation: arch/x86: enable task isolation functionality
  2020-03-04 16:01 [PATCH 00/12] "Task_isolation" mode Alex Belits
                   ` (3 preceding siblings ...)
  2020-03-04 16:08 ` [PATCH 04/12] task_isolation: Add task isolation hooks to arch-independent code Alex Belits
@ 2020-03-04 16:09 ` Alex Belits
  2020-03-04 16:10 ` [PATCH 06/12] task_isolation: arch/arm64: " Alex Belits
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 71+ messages in thread
From: Alex Belits @ 2020-03-04 16:09 UTC (permalink / raw)
  To: frederic, rostedt
  Cc: mingo, peterz, linux-kernel, Prasun Kapoor, tglx, linux-api,
	linux-mm, linux-arch

From: Chris Metcalf <cmetcalf@mellanox.com>

In prepare_exit_to_usermode(), call task_isolation_start() for
TIF_TASK_ISOLATION tasks.

In syscall_trace_enter(), add the necessary support for
reporting syscalls for task-isolation processes.

Add task_isolation_interrupt() calls for the kernel exception types
that do not result in signals, namely non-signalling page faults.

Signed-off-by: Alex Belits <abelits@marvell.com>
---
 arch/x86/Kconfig                   |  1 +
 arch/x86/entry/common.c            | 13 +++++++++++++
 arch/x86/include/asm/apic.h        |  3 +++
 arch/x86/include/asm/thread_info.h |  4 +++-
 arch/x86/kernel/apic/ipi.c         |  2 ++
 arch/x86/mm/fault.c                |  4 ++++
 6 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index beea77046f9b..9ea6d3e6e77d 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -144,6 +144,7 @@ config X86
 	select HAVE_ARCH_COMPAT_MMAP_BASES	if MMU && COMPAT
 	select HAVE_ARCH_PREL32_RELOCATIONS
 	select HAVE_ARCH_SECCOMP_FILTER
+	select HAVE_ARCH_TASK_ISOLATION
 	select HAVE_ARCH_THREAD_STRUCT_WHITELIST
 	select HAVE_ARCH_STACKLEAK
 	select HAVE_ARCH_TRACEHOOK
diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 9747876980b5..191327f4f802 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -26,6 +26,7 @@
 #include <linux/livepatch.h>
 #include <linux/syscalls.h>
 #include <linux/uaccess.h>
+#include <linux/isolation.h>
 
 #include <asm/desc.h>
 #include <asm/traps.h>
@@ -86,6 +87,15 @@ static long syscall_trace_enter(struct pt_regs *regs)
 			return -1L;
 	}
 
+	/*
+	 * In task isolation mode, we may prevent the syscall from
+	 * running, and if so we also deliver a signal to the process.
+	 */
+	if (work & _TIF_TASK_ISOLATION) {
+		if (task_isolation_syscall(regs->orig_ax) == -1)
+			return -1L;
+		work &= ~_TIF_TASK_ISOLATION;
+	}
 #ifdef CONFIG_SECCOMP
 	/*
 	 * Do seccomp after ptrace, to catch any tracer changes.
@@ -204,6 +214,9 @@ __visible inline void prepare_exit_to_usermode(struct pt_regs *regs)
 	if (unlikely(cached_flags & _TIF_NEED_FPU_LOAD))
 		switch_fpu_return();
 
+	if (cached_flags & _TIF_TASK_ISOLATION)
+		task_isolation_start();
+
 #ifdef CONFIG_COMPAT
 	/*
 	 * Compat syscalls set TS_COMPAT.  Make sure we clear it before
diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index 19e94af9cc5d..71149abbb0a0 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -3,6 +3,7 @@
 #define _ASM_X86_APIC_H
 
 #include <linux/cpumask.h>
+#include <linux/isolation.h>
 
 #include <asm/alternative.h>
 #include <asm/cpufeature.h>
@@ -524,6 +525,7 @@ extern void irq_exit(void);
 
 static inline void entering_irq(void)
 {
+	task_isolation_interrupt("irq");
 	irq_enter();
 	kvm_set_cpu_l1tf_flush_l1d();
 }
@@ -536,6 +538,7 @@ static inline void entering_ack_irq(void)
 
 static inline void ipi_entering_ack_irq(void)
 {
+	task_isolation_interrupt("ack irq");
 	irq_enter();
 	ack_APIC_irq();
 	kvm_set_cpu_l1tf_flush_l1d();
diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index cf4327986e98..60d107f784ee 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -92,6 +92,7 @@ struct thread_info {
 #define TIF_NOCPUID		15	/* CPUID is not accessible in userland */
 #define TIF_NOTSC		16	/* TSC is not accessible in userland */
 #define TIF_IA32		17	/* IA32 compatibility process */
+#define TIF_TASK_ISOLATION	18	/* task isolation enabled for task */
 #define TIF_NOHZ		19	/* in adaptive nohz mode */
 #define TIF_MEMDIE		20	/* is terminating due to OOM killer */
 #define TIF_POLLING_NRFLAG	21	/* idle is polling for TIF_NEED_RESCHED */
@@ -122,6 +123,7 @@ struct thread_info {
 #define _TIF_NOCPUID		(1 << TIF_NOCPUID)
 #define _TIF_NOTSC		(1 << TIF_NOTSC)
 #define _TIF_IA32		(1 << TIF_IA32)
+#define _TIF_TASK_ISOLATION	(1 << TIF_TASK_ISOLATION)
 #define _TIF_NOHZ		(1 << TIF_NOHZ)
 #define _TIF_POLLING_NRFLAG	(1 << TIF_POLLING_NRFLAG)
 #define _TIF_IO_BITMAP		(1 << TIF_IO_BITMAP)
@@ -140,7 +142,7 @@ struct thread_info {
 #define _TIF_WORK_SYSCALL_ENTRY	\
 	(_TIF_SYSCALL_TRACE | _TIF_SYSCALL_EMU | _TIF_SYSCALL_AUDIT |	\
 	 _TIF_SECCOMP | _TIF_SYSCALL_TRACEPOINT |	\
-	 _TIF_NOHZ)
+	 _TIF_NOHZ | _TIF_TASK_ISOLATION)
 
 /* flags to check in __switch_to() */
 #define _TIF_WORK_CTXSW_BASE					\
diff --git a/arch/x86/kernel/apic/ipi.c b/arch/x86/kernel/apic/ipi.c
index 6ca0f91372fd..b4dfaad6a440 100644
--- a/arch/x86/kernel/apic/ipi.c
+++ b/arch/x86/kernel/apic/ipi.c
@@ -2,6 +2,7 @@
 
 #include <linux/cpumask.h>
 #include <linux/smp.h>
+#include <linux/isolation.h>
 
 #include "local.h"
 
@@ -67,6 +68,7 @@ void native_smp_send_reschedule(int cpu)
 		WARN(1, "sched: Unexpected reschedule of offline CPU#%d!\n", cpu);
 		return;
 	}
+	task_isolation_remote(cpu, "reschedule IPI");
 	apic->send_IPI(cpu, RESCHEDULE_VECTOR);
 }
 
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index fa4ea09593ab..2175a8655a7d 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -18,6 +18,7 @@
 #include <linux/uaccess.h>		/* faulthandler_disabled()	*/
 #include <linux/efi.h>			/* efi_recover_from_page_fault()*/
 #include <linux/mm_types.h>
+#include <linux/isolation.h>		/* task_isolation_interrupt     */
 
 #include <asm/cpufeature.h>		/* boot_cpu_has, ...		*/
 #include <asm/traps.h>			/* dotraplinkage, ...		*/
@@ -1483,6 +1484,9 @@ void do_user_addr_fault(struct pt_regs *regs,
 		perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, regs, address);
 	}
 
+	/* No signal was generated, but notify task-isolation tasks. */
+	task_isolation_interrupt("page fault at %#lx", address);
+
 	check_v8086_mode(regs, address, tsk);
 }
 NOKPROBE_SYMBOL(do_user_addr_fault);
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH 06/12] task_isolation: arch/arm64: enable task isolation functionality
  2020-03-04 16:01 [PATCH 00/12] "Task_isolation" mode Alex Belits
                   ` (4 preceding siblings ...)
  2020-03-04 16:09 ` [PATCH 05/12] task_isolation: arch/x86: enable task isolation functionality Alex Belits
@ 2020-03-04 16:10 ` Alex Belits
  2020-03-04 16:31   ` Mark Rutland
  2020-03-04 16:11 ` [PATCH 07/12] task_isolation: arch/arm: " Alex Belits
                   ` (6 subsequent siblings)
  12 siblings, 1 reply; 71+ messages in thread
From: Alex Belits @ 2020-03-04 16:10 UTC (permalink / raw)
  To: frederic, rostedt
  Cc: mingo, peterz, linux-kernel, Prasun Kapoor, tglx, linux-api,
	linux-mm, linux-arch

From: Chris Metcalf <cmetcalf@mellanox.com>

In do_notify_resume(), call task_isolation_start() for
TIF_TASK_ISOLATION tasks. Add _TIF_TASK_ISOLATION to _TIF_WORK_MASK,
and define a local NOTIFY_RESUME_LOOP_FLAGS to check in the loop,
since we don't clear _TIF_TASK_ISOLATION in the loop.

We tweak syscall_trace_enter() slightly to carry the "flags"
value from current_thread_info()->flags for each of the tests,
rather than doing a volatile read from memory for each one. This
avoids a small overhead for each test, and in particular avoids
that overhead for TIF_NOHZ when TASK_ISOLATION is not enabled.

We instrument the smp_send_reschedule() routine so that it checks for
isolated tasks and generates a suitable warning if needed.

Finally, report on page faults in task-isolation processes in
do_page_faults().

Signed-off-by: Alex Belits <abelits@marvell.com>
---
 arch/arm64/Kconfig                   |  1 +
 arch/arm64/include/asm/thread_info.h |  5 ++++-
 arch/arm64/kernel/ptrace.c           | 10 ++++++++++
 arch/arm64/kernel/signal.c           | 13 ++++++++++++-
 arch/arm64/kernel/smp.c              |  7 +++++++
 arch/arm64/mm/fault.c                |  5 +++++
 6 files changed, 39 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 0b30e884e088..93b6aabc8be9 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -129,6 +129,7 @@ config ARM64
 	select HAVE_ARCH_PREL32_RELOCATIONS
 	select HAVE_ARCH_SECCOMP_FILTER
 	select HAVE_ARCH_STACKLEAK
+	select HAVE_ARCH_TASK_ISOLATION
 	select HAVE_ARCH_THREAD_STRUCT_WHITELIST
 	select HAVE_ARCH_TRACEHOOK
 	select HAVE_ARCH_TRANSPARENT_HUGEPAGE
diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index f0cec4160136..7563098eb5b2 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -63,6 +63,7 @@ void arch_release_task_struct(struct task_struct *tsk);
 #define TIF_FOREIGN_FPSTATE	3	/* CPU's FP state is not current's */
 #define TIF_UPROBE		4	/* uprobe breakpoint or singlestep */
 #define TIF_FSCHECK		5	/* Check FS is USER_DS on return */
+#define TIF_TASK_ISOLATION	6
 #define TIF_NOHZ		7
 #define TIF_SYSCALL_TRACE	8	/* syscall trace active */
 #define TIF_SYSCALL_AUDIT	9	/* syscall auditing */
@@ -83,6 +84,7 @@ void arch_release_task_struct(struct task_struct *tsk);
 #define _TIF_NEED_RESCHED	(1 << TIF_NEED_RESCHED)
 #define _TIF_NOTIFY_RESUME	(1 << TIF_NOTIFY_RESUME)
 #define _TIF_FOREIGN_FPSTATE	(1 << TIF_FOREIGN_FPSTATE)
+#define _TIF_TASK_ISOLATION	(1 << TIF_TASK_ISOLATION)
 #define _TIF_NOHZ		(1 << TIF_NOHZ)
 #define _TIF_SYSCALL_TRACE	(1 << TIF_SYSCALL_TRACE)
 #define _TIF_SYSCALL_AUDIT	(1 << TIF_SYSCALL_AUDIT)
@@ -96,7 +98,8 @@ void arch_release_task_struct(struct task_struct *tsk);
 
 #define _TIF_WORK_MASK		(_TIF_NEED_RESCHED | _TIF_SIGPENDING | \
 				 _TIF_NOTIFY_RESUME | _TIF_FOREIGN_FPSTATE | \
-				 _TIF_UPROBE | _TIF_FSCHECK)
+				 _TIF_UPROBE | _TIF_FSCHECK | \
+				 _TIF_TASK_ISOLATION)
 
 #define _TIF_SYSCALL_WORK	(_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \
 				 _TIF_SYSCALL_TRACEPOINT | _TIF_SECCOMP | \
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index cd6e5fa48b9c..b35b9b0c594c 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -29,6 +29,7 @@
 #include <linux/regset.h>
 #include <linux/tracehook.h>
 #include <linux/elf.h>
+#include <linux/isolation.h>
 
 #include <asm/compat.h>
 #include <asm/cpufeature.h>
@@ -1836,6 +1837,15 @@ int syscall_trace_enter(struct pt_regs *regs)
 			return -1;
 	}
 
+	/*
+	 * In task isolation mode, we may prevent the syscall from
+	 * running, and if so we also deliver a signal to the process.
+	 */
+	if (test_thread_flag(TIF_TASK_ISOLATION)) {
+		if (task_isolation_syscall(regs->syscallno) == -1)
+			return -1;
+	}
+
 	/* Do the secure computing after ptrace; failures should be fast. */
 	if (secure_computing() == -1)
 		return -1;
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 339882db5a91..d488c91a4877 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -20,6 +20,7 @@
 #include <linux/tracehook.h>
 #include <linux/ratelimit.h>
 #include <linux/syscalls.h>
+#include <linux/isolation.h>
 
 #include <asm/daifflags.h>
 #include <asm/debug-monitors.h>
@@ -898,6 +899,11 @@ static void do_signal(struct pt_regs *regs)
 	restore_saved_sigmask();
 }
 
+#define NOTIFY_RESUME_LOOP_FLAGS \
+	(_TIF_NEED_RESCHED | _TIF_SIGPENDING | \
+	_TIF_NOTIFY_RESUME | _TIF_FOREIGN_FPSTATE | \
+	_TIF_UPROBE | _TIF_FSCHECK)
+
 asmlinkage void do_notify_resume(struct pt_regs *regs,
 				 unsigned long thread_flags)
 {
@@ -908,6 +914,8 @@ asmlinkage void do_notify_resume(struct pt_regs *regs,
 	 */
 	trace_hardirqs_off();
 
+	task_isolation_check_run_cleanup();
+
 	do {
 		/* Check valid user FS if needed */
 		addr_limit_user_check();
@@ -938,7 +946,10 @@ asmlinkage void do_notify_resume(struct pt_regs *regs,
 
 		local_daif_mask();
 		thread_flags = READ_ONCE(current_thread_info()->flags);
-	} while (thread_flags & _TIF_WORK_MASK);
+	} while (thread_flags & NOTIFY_RESUME_LOOP_FLAGS);
+
+	if (thread_flags & _TIF_TASK_ISOLATION)
+		task_isolation_start();
 }
 
 unsigned long __ro_after_init signal_minsigstksz;
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index d4ed9a19d8fe..00f0f77adea0 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -32,6 +32,7 @@
 #include <linux/irq_work.h>
 #include <linux/kexec.h>
 #include <linux/kvm_host.h>
+#include <linux/isolation.h>
 
 #include <asm/alternative.h>
 #include <asm/atomic.h>
@@ -818,6 +819,7 @@ void arch_send_call_function_single_ipi(int cpu)
 #ifdef CONFIG_ARM64_ACPI_PARKING_PROTOCOL
 void arch_send_wakeup_ipi_mask(const struct cpumask *mask)
 {
+	task_isolation_remote_cpumask(mask, "wakeup IPI");
 	smp_cross_call(mask, IPI_WAKEUP);
 }
 #endif
@@ -886,6 +888,9 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
 		__inc_irq_stat(cpu, ipi_irqs[ipinr]);
 	}
 
+	task_isolation_interrupt("IPI type %d (%s)", ipinr,
+				 ipinr < NR_IPI ? ipi_types[ipinr] : "unknown");
+
 	switch (ipinr) {
 	case IPI_RESCHEDULE:
 		scheduler_ipi();
@@ -948,12 +953,14 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
 
 void smp_send_reschedule(int cpu)
 {
+	task_isolation_remote(cpu, "reschedule IPI");
 	smp_cross_call(cpumask_of(cpu), IPI_RESCHEDULE);
 }
 
 #ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST
 void tick_broadcast(const struct cpumask *mask)
 {
+	task_isolation_remote_cpumask(mask, "timer IPI");
 	smp_cross_call(mask, IPI_TIMER);
 }
 #endif
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 85566d32958f..fc4b42c81c4f 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -23,6 +23,7 @@
 #include <linux/perf_event.h>
 #include <linux/preempt.h>
 #include <linux/hugetlb.h>
+#include <linux/isolation.h>
 
 #include <asm/acpi.h>
 #include <asm/bug.h>
@@ -543,6 +544,10 @@ static int __kprobes do_page_fault(unsigned long addr, unsigned int esr,
 	 */
 	if (likely(!(fault & (VM_FAULT_ERROR | VM_FAULT_BADMAP |
 			      VM_FAULT_BADACCESS)))) {
+		/* No signal was generated, but notify task-isolation tasks. */
+		if (user_mode(regs))
+			task_isolation_interrupt("page fault at %#lx", addr);
+
 		/*
 		 * Major/minor page fault accounting is only done
 		 * once. If we go through a retry, it is extremely
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH 07/12] task_isolation: arch/arm: enable task isolation functionality
  2020-03-04 16:01 [PATCH 00/12] "Task_isolation" mode Alex Belits
                   ` (5 preceding siblings ...)
  2020-03-04 16:10 ` [PATCH 06/12] task_isolation: arch/arm64: " Alex Belits
@ 2020-03-04 16:11 ` Alex Belits
  2020-03-04 16:12 ` [PATCH 08/12] task_isolation: don't interrupt CPUs with tick_nohz_full_kick_cpu() Alex Belits
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 71+ messages in thread
From: Alex Belits @ 2020-03-04 16:11 UTC (permalink / raw)
  To: frederic, rostedt
  Cc: mingo, peterz, linux-kernel, Prasun Kapoor, tglx, linux-api,
	linux-mm, linux-arch

From: Francis Giraldeau <francis.giraldeau@gmail.com>

This patch is a port of the task isolation functionality to the arm 32-bit
architecture. Task isolation needs an additional thread flag, which
requires changing the entry assembly code to accept a bitfield larger than
one byte. The constants _TIF_SYSCALL_WORK and _TIF_WORK_MASK are now
loaded from the literal pool. The rest of the patch is straightforward and
reflects what is done on other architectures.

To avoid problems with the tst instruction in the v7m build, we renumber
TIF_SECCOMP to bit 8 and let TIF_TASK_ISOLATION use bit 7.

Signed-off-by: Alex Belits <abelits@marvell.com>
---
 arch/arm/Kconfig                   |  1 +
 arch/arm/include/asm/thread_info.h | 10 +++++++---
 arch/arm/kernel/entry-common.S     | 15 ++++++++++-----
 arch/arm/kernel/signal.c           | 10 +++++++++-
 arch/arm/kernel/smp.c              |  4 ++++
 arch/arm/mm/fault.c                |  8 +++++++-
 6 files changed, 38 insertions(+), 10 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 97864aabc2a6..1a66e6c6807c 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -67,6 +67,7 @@ config ARM
 	select HAVE_ARCH_KGDB if !CPU_ENDIAN_BE32 && MMU
 	select HAVE_ARCH_MMAP_RND_BITS if MMU
 	select HAVE_ARCH_SECCOMP_FILTER if AEABI && !OABI_COMPAT
+	select HAVE_ARCH_TASK_ISOLATION
 	select HAVE_ARCH_THREAD_STRUCT_WHITELIST
 	select HAVE_ARCH_TRACEHOOK
 	select HAVE_ARM_SMCCC if CPU_V7
diff --git a/arch/arm/include/asm/thread_info.h b/arch/arm/include/asm/thread_info.h
index 0d0d5178e2c3..ec3c2084c391 100644
--- a/arch/arm/include/asm/thread_info.h
+++ b/arch/arm/include/asm/thread_info.h
@@ -139,7 +139,8 @@ extern int vfp_restore_user_hwstate(struct user_vfp *,
 #define TIF_SYSCALL_TRACE	4	/* syscall trace active */
 #define TIF_SYSCALL_AUDIT	5	/* syscall auditing active */
 #define TIF_SYSCALL_TRACEPOINT	6	/* syscall tracepoint instrumentation */
-#define TIF_SECCOMP		7	/* seccomp syscall filtering active */
+#define TIF_TASK_ISOLATION	7	/* task isolation active */
+#define TIF_SECCOMP		8	/* seccomp syscall filtering active */
 
 #define TIF_NOHZ		12	/* in adaptive nohz mode */
 #define TIF_USING_IWMMXT	17
@@ -153,18 +154,21 @@ extern int vfp_restore_user_hwstate(struct user_vfp *,
 #define _TIF_SYSCALL_TRACE	(1 << TIF_SYSCALL_TRACE)
 #define _TIF_SYSCALL_AUDIT	(1 << TIF_SYSCALL_AUDIT)
 #define _TIF_SYSCALL_TRACEPOINT	(1 << TIF_SYSCALL_TRACEPOINT)
+#define _TIF_TASK_ISOLATION	(1 << TIF_TASK_ISOLATION)
 #define _TIF_SECCOMP		(1 << TIF_SECCOMP)
 #define _TIF_USING_IWMMXT	(1 << TIF_USING_IWMMXT)
 
 /* Checks for any syscall work in entry-common.S */
 #define _TIF_SYSCALL_WORK (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \
-			   _TIF_SYSCALL_TRACEPOINT | _TIF_SECCOMP)
+			   _TIF_SYSCALL_TRACEPOINT | _TIF_SECCOMP | \
+			   _TIF_TASK_ISOLATION)
 
 /*
  * Change these and you break ASM code in entry-common.S
  */
 #define _TIF_WORK_MASK		(_TIF_NEED_RESCHED | _TIF_SIGPENDING | \
-				 _TIF_NOTIFY_RESUME | _TIF_UPROBE)
+				 _TIF_NOTIFY_RESUME | _TIF_UPROBE | \
+				 _TIF_TASK_ISOLATION)
 
 #endif /* __KERNEL__ */
 #endif /* __ASM_ARM_THREAD_INFO_H */
diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
index 271cb8a1eba1..6ceb5cb808a9 100644
--- a/arch/arm/kernel/entry-common.S
+++ b/arch/arm/kernel/entry-common.S
@@ -53,7 +53,8 @@ __ret_fast_syscall:
 	cmp	r2, #TASK_SIZE
 	blne	addr_limit_check_failed
 	ldr	r1, [tsk, #TI_FLAGS]		@ re-check for syscall tracing
-	tst	r1, #_TIF_SYSCALL_WORK | _TIF_WORK_MASK
+	ldr	r2, =_TIF_SYSCALL_WORK | _TIF_WORK_MASK
+	tst	r1, r2
 	bne	fast_work_pending
 
 
@@ -90,7 +91,8 @@ __ret_fast_syscall:
 	cmp	r2, #TASK_SIZE
 	blne	addr_limit_check_failed
 	ldr	r1, [tsk, #TI_FLAGS]		@ re-check for syscall tracing
-	tst	r1, #_TIF_SYSCALL_WORK | _TIF_WORK_MASK
+	ldr	r2, =_TIF_SYSCALL_WORK | _TIF_WORK_MASK
+	tst	r1, r2
 	beq	no_work_pending
  UNWIND(.fnend		)
 ENDPROC(ret_fast_syscall)
@@ -98,7 +100,8 @@ ENDPROC(ret_fast_syscall)
 	/* Slower path - fall through to work_pending */
 #endif
 
-	tst	r1, #_TIF_SYSCALL_WORK
+	ldr	r2, =_TIF_SYSCALL_WORK
+	tst	r1, r2
 	bne	__sys_trace_return_nosave
 slow_work_pending:
 	mov	r0, sp				@ 'regs'
@@ -131,7 +134,8 @@ ENTRY(ret_to_user_from_irq)
 	cmp	r2, #TASK_SIZE
 	blne	addr_limit_check_failed
 	ldr	r1, [tsk, #TI_FLAGS]
-	tst	r1, #_TIF_WORK_MASK
+	ldr	r2, =_TIF_WORK_MASK
+	tst	r1, r2
 	bne	slow_work_pending
 no_work_pending:
 	asm_trace_hardirqs_on save = 0
@@ -251,7 +255,8 @@ local_restart:
 	ldr	r10, [tsk, #TI_FLAGS]		@ check for syscall tracing
 	stmdb	sp!, {r4, r5}			@ push fifth and sixth args
 
-	tst	r10, #_TIF_SYSCALL_WORK		@ are we tracing syscalls?
+	ldr	r11, =_TIF_SYSCALL_WORK		@ are we tracing syscalls?
+	tst	r10, r11
 	bne	__sys_trace
 
 	invoke_syscall tbl, scno, r10, __ret_fast_syscall
diff --git a/arch/arm/kernel/signal.c b/arch/arm/kernel/signal.c
index ab2568996ddb..f775cbcb7487 100644
--- a/arch/arm/kernel/signal.c
+++ b/arch/arm/kernel/signal.c
@@ -12,6 +12,7 @@
 #include <linux/tracehook.h>
 #include <linux/uprobes.h>
 #include <linux/syscalls.h>
+#include <linux/isolation.h>
 
 #include <asm/elf.h>
 #include <asm/cacheflush.h>
@@ -639,6 +640,9 @@ static int do_signal(struct pt_regs *regs, int syscall)
 	return 0;
 }
 
+#define WORK_PENDING_LOOP_FLAGS	(_TIF_NEED_RESCHED | _TIF_SIGPENDING |	\
+				 _TIF_NOTIFY_RESUME | _TIF_UPROBE)
+
 asmlinkage int
 do_work_pending(struct pt_regs *regs, unsigned int thread_flags, int syscall)
 {
@@ -676,7 +680,11 @@ do_work_pending(struct pt_regs *regs, unsigned int thread_flags, int syscall)
 		}
 		local_irq_disable();
 		thread_flags = current_thread_info()->flags;
-	} while (thread_flags & _TIF_WORK_MASK);
+	} while (thread_flags & WORK_PENDING_LOOP_FLAGS);
+
+	if (thread_flags & _TIF_TASK_ISOLATION)
+		task_isolation_start();
+
 	return 0;
 }
 
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 46e1be9e57a8..95f19b980776 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -26,6 +26,7 @@
 #include <linux/completion.h>
 #include <linux/cpufreq.h>
 #include <linux/irq_work.h>
+#include <linux/isolation.h>
 
 #include <linux/atomic.h>
 #include <asm/bugs.h>
@@ -560,6 +561,7 @@ void arch_send_call_function_ipi_mask(const struct cpumask *mask)
 
 void arch_send_wakeup_ipi_mask(const struct cpumask *mask)
 {
+	task_isolation_remote_cpumask(mask, "wakeup IPI");
 	smp_cross_call(mask, IPI_WAKEUP);
 }
 
@@ -579,6 +581,7 @@ void arch_irq_work_raise(void)
 #ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST
 void tick_broadcast(const struct cpumask *mask)
 {
+	task_isolation_remote_cpumask(mask, "timer IPI");
 	smp_cross_call(mask, IPI_TIMER);
 }
 #endif
@@ -702,6 +705,7 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
 
 void smp_send_reschedule(int cpu)
 {
+	task_isolation_remote(cpu, "reschedule IPI");
 	smp_cross_call(cpumask_of(cpu), IPI_RESCHEDULE);
 }
 
diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
index bd0f4821f7e1..acd11a69c4e4 100644
--- a/arch/arm/mm/fault.c
+++ b/arch/arm/mm/fault.c
@@ -17,6 +17,7 @@
 #include <linux/sched/debug.h>
 #include <linux/highmem.h>
 #include <linux/perf_event.h>
+#include <linux/isolation.h>
 
 #include <asm/pgtable.h>
 #include <asm/system_misc.h>
@@ -332,8 +333,13 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 	/*
 	 * Handle the "normal" case first - VM_FAULT_MAJOR
 	 */
-	if (likely(!(fault & (VM_FAULT_ERROR | VM_FAULT_BADMAP | VM_FAULT_BADACCESS))))
+	if (likely(!(fault & (VM_FAULT_ERROR | VM_FAULT_BADMAP |
+			      VM_FAULT_BADACCESS)))) {
+		/* No signal was generated, but notify task-isolation tasks. */
+		if (user_mode(regs))
+			task_isolation_interrupt("page fault at %#lx", addr);
 		return 0;
+	}
 
 	/*
 	 * If we are in kernel mode at this point, we
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH 08/12] task_isolation: don't interrupt CPUs with tick_nohz_full_kick_cpu()
  2020-03-04 16:01 [PATCH 00/12] "Task_isolation" mode Alex Belits
                   ` (6 preceding siblings ...)
  2020-03-04 16:11 ` [PATCH 07/12] task_isolation: arch/arm: " Alex Belits
@ 2020-03-04 16:12 ` Alex Belits
  2020-03-06 16:03   ` Frederic Weisbecker
  2020-03-04 16:13 ` [PATCH 09/12] task_isolation: net: don't flush backlog on CPUs running isolated tasks Alex Belits
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 71+ messages in thread
From: Alex Belits @ 2020-03-04 16:12 UTC (permalink / raw)
  To: frederic, rostedt
  Cc: mingo, peterz, linux-kernel, Prasun Kapoor, tglx, linux-api,
	linux-mm, linux-arch

From: Yuri Norov <ynorov@marvell.com>

For nohz_full CPUs the desirable behavior is to receive interrupts
generated by tick_nohz_full_kick_cpu(). But for hard isolation it's
obviously not desirable because it breaks isolation.

This patch adds a check for it.

Signed-off-by: Alex Belits <abelits@marvell.com>
---
 kernel/time/tick-sched.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 1d4dec9d3ee7..fe4503ba1316 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -20,6 +20,7 @@
 #include <linux/sched/clock.h>
 #include <linux/sched/stat.h>
 #include <linux/sched/nohz.h>
+#include <linux/isolation.h>
 #include <linux/module.h>
 #include <linux/irq_work.h>
 #include <linux/posix-timers.h>
@@ -262,7 +263,7 @@ static void tick_nohz_full_kick(void)
  */
 void tick_nohz_full_kick_cpu(int cpu)
 {
-	if (!tick_nohz_full_cpu(cpu))
+	if (!tick_nohz_full_cpu(cpu) || task_isolation_on_cpu(cpu))
 		return;
 
 	irq_work_queue_on(&per_cpu(nohz_full_kick_work, cpu), cpu);
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH 09/12] task_isolation: net: don't flush backlog on CPUs running isolated tasks
  2020-03-04 16:01 [PATCH 00/12] "Task_isolation" mode Alex Belits
                   ` (7 preceding siblings ...)
  2020-03-04 16:12 ` [PATCH 08/12] task_isolation: don't interrupt CPUs with tick_nohz_full_kick_cpu() Alex Belits
@ 2020-03-04 16:13 ` Alex Belits
  2020-03-04 16:14 ` [PATCH 10/12] task_isolation: ringbuffer: don't interrupt CPUs running isolated tasks on buffer resize Alex Belits
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 71+ messages in thread
From: Alex Belits @ 2020-03-04 16:13 UTC (permalink / raw)
  To: frederic, rostedt
  Cc: mingo, peterz, linux-kernel, Prasun Kapoor, tglx, linux-api,
	linux-mm, linux-arch

From: Yuri Norov <ynorov@marvell.com>

If a CPU runs an isolated task, there is no backlog on it, so we
don't need to flush it. Currently flush_all_backlogs() enqueues the
corresponding work on all CPUs, including the ones that run isolated
tasks. That breaks task isolation for nothing.

In this patch, backlog flushing is enqueued only on non-isolated CPUs.

Signed-off-by: Alex Belits <abelits@marvell.com>
---
 net/core/dev.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index c6c985fe7b1b..6d32abb1f06d 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -74,6 +74,7 @@
 #include <linux/cpu.h>
 #include <linux/types.h>
 #include <linux/kernel.h>
+#include <linux/isolation.h>
 #include <linux/hash.h>
 #include <linux/slab.h>
 #include <linux/sched.h>
@@ -5518,9 +5519,12 @@ static void flush_all_backlogs(void)
 
 	get_online_cpus();
 
-	for_each_online_cpu(cpu)
+	for_each_online_cpu(cpu) {
+		if (task_isolation_on_cpu(cpu))
+			continue;
 		queue_work_on(cpu, system_highpri_wq,
 			      per_cpu_ptr(&flush_works, cpu));
+	}
 
 	for_each_online_cpu(cpu)
 		flush_work(per_cpu_ptr(&flush_works, cpu));
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH 10/12] task_isolation: ringbuffer: don't interrupt CPUs running isolated tasks on buffer resize
  2020-03-04 16:01 [PATCH 00/12] "Task_isolation" mode Alex Belits
                   ` (8 preceding siblings ...)
  2020-03-04 16:13 ` [PATCH 09/12] task_isolation: net: don't flush backlog on CPUs running isolated tasks Alex Belits
@ 2020-03-04 16:14 ` Alex Belits
  2020-03-04 16:15 ` [PATCH 11/12] task_isolation: kick_all_cpus_sync: don't kick isolated cpus Alex Belits
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 71+ messages in thread
From: Alex Belits @ 2020-03-04 16:14 UTC (permalink / raw)
  To: frederic, rostedt
  Cc: mingo, peterz, linux-kernel, Prasun Kapoor, tglx, linux-api,
	linux-mm, linux-arch

From: Yuri Norov <ynorov@marvell.com>

CPUs running isolated tasks are in userspace, so they don't have to
perform ring buffer updates immediately. If ring_buffer_resize()
schedules the update on those CPUs, isolation is broken. To prevent
that, updates for CPUs running isolated tasks are performed locally,
like for offline CPUs.

A race condition between this update and isolation breaking is avoided
at the cost of disabling per-CPU buffer writing for the duration of the
update when it coincides with isolation breaking.

Signed-off-by: Alex Belits <abelits@marvell.com>
---
 kernel/trace/ring_buffer.c | 61 ++++++++++++++++++++++++++++++++++----
 1 file changed, 55 insertions(+), 6 deletions(-)

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 61f0e92ace99..6c7479b6411d 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -21,6 +21,7 @@
 #include <linux/delay.h>
 #include <linux/slab.h>
 #include <linux/init.h>
+#include <linux/isolation.h>
 #include <linux/hash.h>
 #include <linux/list.h>
 #include <linux/cpu.h>
@@ -1701,6 +1702,37 @@ static void update_pages_handler(struct work_struct *work)
 	complete(&cpu_buffer->update_done);
 }
 
+static bool update_if_isolated(struct ring_buffer_per_cpu *cpu_buffer,
+			       int cpu)
+{
+	bool rv = false;
+
+	if (task_isolation_on_cpu(cpu)) {
+		/*
+		 * CPU is running isolated task. Since it may lose
+		 * isolation and re-enter kernel simultaneously with
+		 * this update, disable recording until it's done.
+		 */
+		atomic_inc(&cpu_buffer->record_disabled);
+		/* Make sure the update is done and the isolation state is current */
+		smp_mb();
+		if (task_isolation_on_cpu(cpu)) {
+			/*
+			 * If CPU is still running isolated task, we
+			 * can be sure that breaking isolation will
+			 * happen while recording is disabled, and CPU
+			 * will not touch this buffer until the update
+			 * is done.
+			 */
+			rb_update_pages(cpu_buffer);
+			cpu_buffer->nr_pages_to_update = 0;
+			rv = true;
+		}
+		atomic_dec(&cpu_buffer->record_disabled);
+	}
+	return rv;
+}
+
 /**
  * ring_buffer_resize - resize the ring buffer
  * @buffer: the buffer to resize.
@@ -1784,13 +1816,22 @@ int ring_buffer_resize(struct trace_buffer *buffer, unsigned long size,
 			if (!cpu_buffer->nr_pages_to_update)
 				continue;
 
-			/* Can't run something on an offline CPU. */
+			/*
+			 * Can't run something on an offline CPU.
+			 *
+			 * CPUs running isolated tasks don't have to
+			 * update ring buffers until they exit
+			 * isolation because they are in
+			 * userspace. Use the procedure that prevents
+			 * race condition with isolation breaking.
+			 */
 			if (!cpu_online(cpu)) {
 				rb_update_pages(cpu_buffer);
 				cpu_buffer->nr_pages_to_update = 0;
 			} else {
-				schedule_work_on(cpu,
-						&cpu_buffer->update_pages_work);
+				if (!update_if_isolated(cpu_buffer, cpu))
+					schedule_work_on(cpu,
+					&cpu_buffer->update_pages_work);
 			}
 		}
 
@@ -1829,13 +1870,22 @@ int ring_buffer_resize(struct trace_buffer *buffer, unsigned long size,
 
 		get_online_cpus();
 
-		/* Can't run something on an offline CPU. */
+		/*
+		 * Can't run something on an offline CPU.
+		 *
+		 * CPUs running isolated tasks don't have to update
+		 * ring buffers until they exit isolation because they
+		 * are in userspace. Use the procedure that prevents
+		 * race condition with isolation breaking.
+		 */
 		if (!cpu_online(cpu_id))
 			rb_update_pages(cpu_buffer);
 		else {
-			schedule_work_on(cpu_id,
-					 &cpu_buffer->update_pages_work);
-			wait_for_completion(&cpu_buffer->update_done);
+			if (!update_if_isolated(cpu_buffer, cpu_id)) {
+				schedule_work_on(cpu_id,
+						 &cpu_buffer->update_pages_work);
+				wait_for_completion(&cpu_buffer->update_done);
+			}
 		}
 
 		cpu_buffer->nr_pages_to_update = 0;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH 11/12] task_isolation: kick_all_cpus_sync: don't kick isolated cpus
  2020-03-04 16:01 [PATCH 00/12] "Task_isolation" mode Alex Belits
                   ` (9 preceding siblings ...)
  2020-03-04 16:14 ` [PATCH 10/12] task_isolation: ringbuffer: don't interrupt CPUs running isolated tasks on buffer resize Alex Belits
@ 2020-03-04 16:15 ` Alex Belits
  2020-03-06 15:34   ` Frederic Weisbecker
  2020-03-04 16:16 ` [PATCH 12/12] task_isolation: CONFIG_TASK_ISOLATION prevents distribution of jobs to non-housekeeping CPUs Alex Belits
  2020-03-08  3:42 ` [PATCH v2 00/12] "Task_isolation" mode Alex Belits
  12 siblings, 1 reply; 71+ messages in thread
From: Alex Belits @ 2020-03-04 16:15 UTC (permalink / raw)
  To: frederic, rostedt
  Cc: mingo, peterz, linux-kernel, Prasun Kapoor, tglx, linux-api,
	linux-mm, linux-arch

From: Yuri Norov <ynorov@marvell.com>

Make sure that kick_all_cpus_sync() does not send IPIs to CPUs that are
running isolated tasks.

Signed-off-by: Alex Belits <abelits@marvell.com>
---
 kernel/smp.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index 3a8bcbdd4ce6..d9b4b2fedfed 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -731,9 +731,21 @@ static void do_nothing(void *unused)
  */
 void kick_all_cpus_sync(void)
 {
+	struct cpumask mask;
+
 	/* Make sure the change is visible before we kick the cpus */
 	smp_mb();
-	smp_call_function(do_nothing, NULL, 1);
+
+	preempt_disable();
+#ifdef CONFIG_TASK_ISOLATION
+	cpumask_clear(&mask);
+	task_isolation_cpumask(&mask);
+	cpumask_complement(&mask, &mask);
+#else
+	cpumask_setall(&mask);
+#endif
+	smp_call_function_many(&mask, do_nothing, NULL, 1);
+	preempt_enable();
 }
 EXPORT_SYMBOL_GPL(kick_all_cpus_sync);
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH 12/12] task_isolation: CONFIG_TASK_ISOLATION prevents distribution of jobs to non-housekeeping CPUs
  2020-03-04 16:01 [PATCH 00/12] "Task_isolation" mode Alex Belits
                   ` (10 preceding siblings ...)
  2020-03-04 16:15 ` [PATCH 11/12] task_isolation: kick_all_cpus_sync: don't kick isolated cpus Alex Belits
@ 2020-03-04 16:16 ` Alex Belits
  2020-03-08  3:42 ` [PATCH v2 00/12] "Task_isolation" mode Alex Belits
  12 siblings, 0 replies; 71+ messages in thread
From: Alex Belits @ 2020-03-04 16:16 UTC (permalink / raw)
  To: frederic, rostedt
  Cc: mingo, peterz, linux-kernel, Prasun Kapoor, tglx, linux-api,
	linux-mm, linux-arch

There are various mechanisms that select CPUs for jobs other than
regular workqueue selection. CPU isolation normally does not
prevent those jobs from running on isolated CPUs. When task
isolation is enabled, those jobs should be limited to housekeeping
CPUs.
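
The pattern applied below is roughly the following (a minimal sketch with
a hypothetical helper name, shown only to summarize the approach):

	#include <linux/sched/isolation.h>
	#include <linux/topology.h>

	/* Pick a CPU near "node" for background work, avoiding isolated CPUs. */
	static unsigned int example_pick_housekeeping_cpu(int node)
	{
		const struct cpumask *hk = housekeeping_cpumask(HK_FLAG_DOMAIN);

		return cpumask_any_and(cpumask_of_node(node), hk);
	}

If the intersection is empty, callers either fall back to an error (as in
the rps_map change below) or end up with nr_cpu_ids from cpumask_any_and().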

Signed-off-by: Alex Belits <abelits@marvell.com>
---
 drivers/pci/pci-driver.c |  9 +++++++
 lib/cpumask.c            | 53 +++++++++++++++++++++++++---------------
 net/core/net-sysfs.c     |  9 +++++++
 3 files changed, 51 insertions(+), 20 deletions(-)

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 0454ca0e4e3f..cb872cdd1782 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -12,6 +12,7 @@
 #include <linux/string.h>
 #include <linux/slab.h>
 #include <linux/sched.h>
+#include <linux/sched/isolation.h>
 #include <linux/cpu.h>
 #include <linux/pm_runtime.h>
 #include <linux/suspend.h>
@@ -332,6 +333,9 @@ static bool pci_physfn_is_probed(struct pci_dev *dev)
 static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
 			  const struct pci_device_id *id)
 {
+#ifdef CONFIG_TASK_ISOLATION
+	int hk_flags = HK_FLAG_DOMAIN | HK_FLAG_WQ;
+#endif
 	int error, node, cpu;
 	struct drv_dev_and_id ddi = { drv, dev, id };
 
@@ -353,7 +357,12 @@ static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
 	    pci_physfn_is_probed(dev))
 		cpu = nr_cpu_ids;
 	else
+#ifdef CONFIG_TASK_ISOLATION
+		cpu = cpumask_any_and(cpumask_of_node(node),
+				      housekeeping_cpumask(hk_flags));
+#else
 		cpu = cpumask_any_and(cpumask_of_node(node), cpu_online_mask);
+#endif
 
 	if (cpu < nr_cpu_ids)
 		error = work_on_cpu(cpu, local_pci_probe, &ddi);
diff --git a/lib/cpumask.c b/lib/cpumask.c
index 0cb672eb107c..dcbc30a47600 100644
--- a/lib/cpumask.c
+++ b/lib/cpumask.c
@@ -6,6 +6,7 @@
 #include <linux/export.h>
 #include <linux/memblock.h>
 #include <linux/numa.h>
+#include <linux/sched/isolation.h>
 
 /**
  * cpumask_next - get the next cpu in a cpumask
@@ -205,28 +206,40 @@ void __init free_bootmem_cpumask_var(cpumask_var_t mask)
  */
 unsigned int cpumask_local_spread(unsigned int i, int node)
 {
-	int cpu;
+	const struct cpumask *mask;
+	int cpu, m, n;
+
+#ifdef CONFIG_TASK_ISOLATION
+	mask = housekeeping_cpumask(HK_FLAG_DOMAIN);
+	m = cpumask_weight(mask);
+#else
+	mask = cpu_online_mask;
+	m = num_online_cpus();
+#endif
 
 	/* Wrap: we always want a cpu. */
-	i %= num_online_cpus();
-
-	if (node == NUMA_NO_NODE) {
-		for_each_cpu(cpu, cpu_online_mask)
-			if (i-- == 0)
-				return cpu;
-	} else {
-		/* NUMA first. */
-		for_each_cpu_and(cpu, cpumask_of_node(node), cpu_online_mask)
-			if (i-- == 0)
-				return cpu;
-
-		for_each_cpu(cpu, cpu_online_mask) {
-			/* Skip NUMA nodes, done above. */
-			if (cpumask_test_cpu(cpu, cpumask_of_node(node)))
-				continue;
-
-			if (i-- == 0)
-				return cpu;
+	n = i % m;
+
+	while (m-- > 0) {
+		if (node == NUMA_NO_NODE) {
+			for_each_cpu(cpu, mask)
+				if (n-- == 0)
+					return cpu;
+		} else {
+			/* NUMA first. */
+			for_each_cpu_and(cpu, cpumask_of_node(node), mask)
+				if (n-- == 0)
+					return cpu;
+
+			for_each_cpu(cpu, mask) {
+				/* Skip NUMA nodes, done above. */
+				if (cpumask_test_cpu(cpu,
+						     cpumask_of_node(node)))
+					continue;
+
+				if (n-- == 0)
+					return cpu;
+			}
 		}
 	}
 	BUG();
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 4c826b8bf9b1..253758f102d9 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -11,6 +11,7 @@
 #include <linux/if_arp.h>
 #include <linux/slab.h>
 #include <linux/sched/signal.h>
+#include <linux/sched/isolation.h>
 #include <linux/nsproxy.h>
 #include <net/sock.h>
 #include <net/net_namespace.h>
@@ -725,6 +726,14 @@ static ssize_t store_rps_map(struct netdev_rx_queue *queue,
 		return err;
 	}
 
+#ifdef CONFIG_TASK_ISOLATION
+	cpumask_and(mask, mask, housekeeping_cpumask(HK_FLAG_DOMAIN));
+	if (cpumask_weight(mask) == 0) {
+		free_cpumask_var(mask);
+		return -EINVAL;
+	}
+#endif
+
 	map = kzalloc(max_t(unsigned int,
 			    RPS_MAP_SIZE(cpumask_weight(mask)), L1_CACHE_BYTES),
 		      GFP_KERNEL);
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* Re: [PATCH 06/12] task_isolation: arch/arm64: enable task isolation functionality
  2020-03-04 16:10 ` [PATCH 06/12] task_isolation: arch/arm64: " Alex Belits
@ 2020-03-04 16:31   ` Mark Rutland
  2020-03-08  4:48     ` [EXT] " Alex Belits
  0 siblings, 1 reply; 71+ messages in thread
From: Mark Rutland @ 2020-03-04 16:31 UTC (permalink / raw)
  To: Alex Belits
  Cc: frederic, rostedt, mingo, peterz, linux-kernel, Prasun Kapoor,
	tglx, linux-api, linux-mm, linux-arch, will, catalin.marinas

Hi Alex,

For patches affecting arm64, please CC LAKML and the arm64 maintainers
(Will and Catalin). I've Cc'd the maintainers here.

On Wed, Mar 04, 2020 at 04:10:28PM +0000, Alex Belits wrote:
> From: Chris Metcalf <cmetcalf@mellanox.com>
> 
> In do_notify_resume(), call task_isolation_start() for
> TIF_TASK_ISOLATION tasks. Add _TIF_TASK_ISOLATION to _TIF_WORK_MASK,
> and define a local NOTIFY_RESUME_LOOP_FLAGS to check in the loop,
> since we don't clear _TIF_TASK_ISOLATION in the loop.
> 
> We tweak syscall_trace_enter() slightly to carry the "flags"
> value from current_thread_info()->flags for each of the tests,
> rather than doing a volatile read from memory for each one. This
> avoids a small overhead for each test, and in particular avoids
> that overhead for TIF_NOHZ when TASK_ISOLATION is not enabled.

Stale commit message?

Looking at the patch below, this doesn't seem to be the case; it just
calls test_thread_flag(TIF_TASK_ISOLATION).

> 
> We instrument the smp_send_reschedule() routine so that it checks for
> isolated tasks and generates a suitable warning if needed.
> 
> Finally, report on page faults in task-isolation processes in
> do_page_faults().
> 
> Signed-off-by: Alex Belits <abelits@marvell.com>

The From line says this was from Chris Metcalf, but he's missing from
the Sign-off chain here, which isn't right.

> ---
>  arch/arm64/Kconfig                   |  1 +
>  arch/arm64/include/asm/thread_info.h |  5 ++++-
>  arch/arm64/kernel/ptrace.c           | 10 ++++++++++
>  arch/arm64/kernel/signal.c           | 13 ++++++++++++-
>  arch/arm64/kernel/smp.c              |  7 +++++++
>  arch/arm64/mm/fault.c                |  5 +++++
>  6 files changed, 39 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 0b30e884e088..93b6aabc8be9 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -129,6 +129,7 @@ config ARM64
>  	select HAVE_ARCH_PREL32_RELOCATIONS
>  	select HAVE_ARCH_SECCOMP_FILTER
>  	select HAVE_ARCH_STACKLEAK
> +	select HAVE_ARCH_TASK_ISOLATION
>  	select HAVE_ARCH_THREAD_STRUCT_WHITELIST
>  	select HAVE_ARCH_TRACEHOOK
>  	select HAVE_ARCH_TRANSPARENT_HUGEPAGE
> diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
> index f0cec4160136..7563098eb5b2 100644
> --- a/arch/arm64/include/asm/thread_info.h
> +++ b/arch/arm64/include/asm/thread_info.h
> @@ -63,6 +63,7 @@ void arch_release_task_struct(struct task_struct *tsk);
>  #define TIF_FOREIGN_FPSTATE	3	/* CPU's FP state is not current's */
>  #define TIF_UPROBE		4	/* uprobe breakpoint or singlestep */
>  #define TIF_FSCHECK		5	/* Check FS is USER_DS on return */
> +#define TIF_TASK_ISOLATION	6
>  #define TIF_NOHZ		7
>  #define TIF_SYSCALL_TRACE	8	/* syscall trace active */
>  #define TIF_SYSCALL_AUDIT	9	/* syscall auditing */
> @@ -83,6 +84,7 @@ void arch_release_task_struct(struct task_struct *tsk);
>  #define _TIF_NEED_RESCHED	(1 << TIF_NEED_RESCHED)
>  #define _TIF_NOTIFY_RESUME	(1 << TIF_NOTIFY_RESUME)
>  #define _TIF_FOREIGN_FPSTATE	(1 << TIF_FOREIGN_FPSTATE)
> +#define _TIF_TASK_ISOLATION	(1 << TIF_TASK_ISOLATION)
>  #define _TIF_NOHZ		(1 << TIF_NOHZ)
>  #define _TIF_SYSCALL_TRACE	(1 << TIF_SYSCALL_TRACE)
>  #define _TIF_SYSCALL_AUDIT	(1 << TIF_SYSCALL_AUDIT)
> @@ -96,7 +98,8 @@ void arch_release_task_struct(struct task_struct *tsk);
>  
>  #define _TIF_WORK_MASK		(_TIF_NEED_RESCHED | _TIF_SIGPENDING | \
>  				 _TIF_NOTIFY_RESUME | _TIF_FOREIGN_FPSTATE | \
> -				 _TIF_UPROBE | _TIF_FSCHECK)
> +				 _TIF_UPROBE | _TIF_FSCHECK | \
> +				 _TIF_TASK_ISOLATION)
>  
>  #define _TIF_SYSCALL_WORK	(_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \
>  				 _TIF_SYSCALL_TRACEPOINT | _TIF_SECCOMP | \
> diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
> index cd6e5fa48b9c..b35b9b0c594c 100644
> --- a/arch/arm64/kernel/ptrace.c
> +++ b/arch/arm64/kernel/ptrace.c
> @@ -29,6 +29,7 @@
>  #include <linux/regset.h>
>  #include <linux/tracehook.h>
>  #include <linux/elf.h>
> +#include <linux/isolation.h>
>  
>  #include <asm/compat.h>
>  #include <asm/cpufeature.h>
> @@ -1836,6 +1837,15 @@ int syscall_trace_enter(struct pt_regs *regs)
>  			return -1;
>  	}
>  
> +	/*
> +	 * In task isolation mode, we may prevent the syscall from
> +	 * running, and if so we also deliver a signal to the process.
> +	 */
> +	if (test_thread_flag(TIF_TASK_ISOLATION)) {
> +		if (task_isolation_syscall(regs->syscallno) == -1)
> +			return -1;
> +	}

As above, this doesn't match the commit message.

AFAICT, task_isolation_syscall() always returns either 0 or -1, which
isn't great as an API. I see secure_computing() seems to do the same,
and it'd be nice to clean that up to either be a real error code or a
boolean.

> +
>  	/* Do the secure computing after ptrace; failures should be fast. */
>  	if (secure_computing() == -1)
>  		return -1;
> diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
> index 339882db5a91..d488c91a4877 100644
> --- a/arch/arm64/kernel/signal.c
> +++ b/arch/arm64/kernel/signal.c
> @@ -20,6 +20,7 @@
>  #include <linux/tracehook.h>
>  #include <linux/ratelimit.h>
>  #include <linux/syscalls.h>
> +#include <linux/isolation.h>
>  
>  #include <asm/daifflags.h>
>  #include <asm/debug-monitors.h>
> @@ -898,6 +899,11 @@ static void do_signal(struct pt_regs *regs)
>  	restore_saved_sigmask();
>  }
>  
> +#define NOTIFY_RESUME_LOOP_FLAGS \
> +	(_TIF_NEED_RESCHED | _TIF_SIGPENDING | \
> +	_TIF_NOTIFY_RESUME | _TIF_FOREIGN_FPSTATE | \
> +	_TIF_UPROBE | _TIF_FSCHECK)
> +
>  asmlinkage void do_notify_resume(struct pt_regs *regs,
>  				 unsigned long thread_flags)
>  {
> @@ -908,6 +914,8 @@ asmlinkage void do_notify_resume(struct pt_regs *regs,
>  	 */
>  	trace_hardirqs_off();
>  
> +	task_isolation_check_run_cleanup();
> +
>  	do {
>  		/* Check valid user FS if needed */
>  		addr_limit_user_check();
> @@ -938,7 +946,10 @@ asmlinkage void do_notify_resume(struct pt_regs *regs,
>  
>  		local_daif_mask();
>  		thread_flags = READ_ONCE(current_thread_info()->flags);
> -	} while (thread_flags & _TIF_WORK_MASK);
> +	} while (thread_flags & NOTIFY_RESUME_LOOP_FLAGS);
> +
> +	if (thread_flags & _TIF_TASK_ISOLATION)
> +		task_isolation_start();
>  }
>  
>  unsigned long __ro_after_init signal_minsigstksz;
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index d4ed9a19d8fe..00f0f77adea0 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -32,6 +32,7 @@
>  #include <linux/irq_work.h>
>  #include <linux/kexec.h>
>  #include <linux/kvm_host.h>
> +#include <linux/isolation.h>
>  
>  #include <asm/alternative.h>
>  #include <asm/atomic.h>
> @@ -818,6 +819,7 @@ void arch_send_call_function_single_ipi(int cpu)
>  #ifdef CONFIG_ARM64_ACPI_PARKING_PROTOCOL
>  void arch_send_wakeup_ipi_mask(const struct cpumask *mask)
>  {
> +	task_isolation_remote_cpumask(mask, "wakeup IPI");
>  	smp_cross_call(mask, IPI_WAKEUP);
>  }
>  #endif
> @@ -886,6 +888,9 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
>  		__inc_irq_stat(cpu, ipi_irqs[ipinr]);
>  	}
>  
> +	task_isolation_interrupt("IPI type %d (%s)", ipinr,
> +				 ipinr < NR_IPI ? ipi_types[ipinr] : "unknown");
> +

Are these tracing hooks?

Surely they aren't necessary for functional correctness?

>  	switch (ipinr) {
>  	case IPI_RESCHEDULE:
>  		scheduler_ipi();
> @@ -948,12 +953,14 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
>  
>  void smp_send_reschedule(int cpu)
>  {
> +	task_isolation_remote(cpu, "reschedule IPI");
>  	smp_cross_call(cpumask_of(cpu), IPI_RESCHEDULE);
>  }
>  
>  #ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST
>  void tick_broadcast(const struct cpumask *mask)
>  {
> +	task_isolation_remote_cpumask(mask, "timer IPI");
>  	smp_cross_call(mask, IPI_TIMER);
>  }
>  #endif
> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> index 85566d32958f..fc4b42c81c4f 100644
> --- a/arch/arm64/mm/fault.c
> +++ b/arch/arm64/mm/fault.c
> @@ -23,6 +23,7 @@
>  #include <linux/perf_event.h>
>  #include <linux/preempt.h>
>  #include <linux/hugetlb.h>
> +#include <linux/isolation.h>
>  
>  #include <asm/acpi.h>
>  #include <asm/bug.h>
> @@ -543,6 +544,10 @@ static int __kprobes do_page_fault(unsigned long addr, unsigned int esr,
>  	 */
>  	if (likely(!(fault & (VM_FAULT_ERROR | VM_FAULT_BADMAP |
>  			      VM_FAULT_BADACCESS)))) {
> +		/* No signal was generated, but notify task-isolation tasks. */
> +		if (user_mode(regs))
> +			task_isolation_interrupt("page fault at %#lx", addr);
> +

We check user_mode(regs) much earlier in this function to set
FAULT_FLAG_USER. Is there some reason this cannot live there?

Also, this seems to be a tracing hook -- is this necessary?

Thanks,
Mark.

>  		/*
>  		 * Major/minor page fault accounting is only done
>  		 * once. If we go through a retry, it is extremely
> -- 
> 2.20.1
> 

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 03/12] task_isolation: userspace hard isolation from kernel
  2020-03-04 16:07 ` [PATCH 03/12] task_isolation: userspace hard isolation from kernel Alex Belits
@ 2020-03-05 18:33   ` Frederic Weisbecker
  2020-03-08  5:32     ` [EXT] " Alex Belits
  2020-04-28 14:12     ` Marcelo Tosatti
  2020-03-06 15:26   ` Frederic Weisbecker
  2020-03-06 16:00   ` Frederic Weisbecker
  2 siblings, 2 replies; 71+ messages in thread
From: Frederic Weisbecker @ 2020-03-05 18:33 UTC (permalink / raw)
  To: Alex Belits
  Cc: rostedt, mingo, peterz, linux-kernel, Prasun Kapoor, tglx,
	linux-api, linux-mm, linux-arch

On Wed, Mar 04, 2020 at 04:07:12PM +0000, Alex Belits wrote:
> The existing nohz_full mode is designed as a "soft" isolation mode
> that makes tradeoffs to minimize userspace interruptions while
> still attempting to avoid overheads in the kernel entry/exit path,
> to provide 100% kernel semantics, etc.
> 
> However, some applications require a "hard" commitment from the
> kernel to avoid interruptions, in particular userspace device driver
> style applications, such as high-speed networking code.
> 
> This change introduces a framework to allow applications
> to elect to have the "hard" semantics as needed, specifying
> prctl(PR_TASK_ISOLATION, PR_TASK_ISOLATION_ENABLE) to do so.
> 
> The kernel must be built with the new TASK_ISOLATION Kconfig flag
> to enable this mode, and the kernel booted with an appropriate
> "isolcpus=nohz,domain,CPULIST" boot argument to enable
> nohz_full and isolcpus. The "task_isolation" state is then indicated
> by setting a new task struct field, task_isolation_flag, to the
> value passed by prctl(), and also setting a TIF_TASK_ISOLATION
> bit in the thread_info flags. When the kernel is returning to
> userspace from the prctl() call and sees TIF_TASK_ISOLATION set,
> it calls the new task_isolation_start() routine to arrange for
> the task to avoid being interrupted in the future.
> 
> With interrupts disabled, task_isolation_start() ensures that kernel
> subsystems that might cause a future interrupt are quiesced. If it
> doesn't succeed, it adjusts the syscall return value to indicate that
> fact, and userspace can retry as desired. In addition to stopping
> the scheduler tick, the code takes any actions that might avoid
> a future interrupt to the core, such as a worker thread being
> scheduled that could be quiesced now (e.g. the vmstat worker)
> or a future IPI to the core to clean up some state that could be
> cleaned up now (e.g. the mm lru per-cpu cache).
> 
> Once the task has returned to userspace after issuing the prctl(),
> if it enters the kernel again via system call, page fault, or any
> other exception or irq, the kernel will kill it with SIGKILL.
> In addition to sending a signal, the code supports a kernel
> command-line "task_isolation_debug" flag which causes a stack
> backtrace to be generated whenever a task loses isolation.
> 
> To allow the state to be entered and exited, the syscall checking
> test ignores the prctl(PR_TASK_ISOLATION) syscall so that we can
> clear the bit again later, and ignores exit/exit_group to allow
> exiting the task without a pointless signal being delivered.
> 
> The prctl() API allows for specifying a signal number to use instead
> of the default SIGKILL, to allow for catching the notification
> signal; for example, in a production environment, it might be
> helpful to log information to the application logging mechanism
> before exiting. Or, the signal handler might choose to reset the
> program counter back to the code segment intended to be run isolated
> via prctl() to continue execution.

Hi Alex,

I'm glad this patchset is being resurrected.
Reading that changelog, I like the general idea and the direction.
The diff is a bit scary, though; I'll check the patches in detail
in the upcoming days.

> 
> In a number of cases we can tell on a remote cpu that we are
> going to be interrupting the cpu, e.g. via an IPI or a TLB flush.
> In that case we generate the diagnostic (and optional stack dump)
> on the remote core to be able to deliver better diagnostics.
> If the interrupt is not something caught by Linux (e.g. a
> hypervisor interrupt) we can also request a reschedule IPI to
> be sent to the remote core so it can be sure to generate a
> signal to notify the process.

I'm wondering if it's wise to run that on a guest at all :-)
Or we would have to consider any guest exit to the host as a
disturbance; we would then need some sort of paravirt
driver to notify that, etc. That doesn't sound appealing.

Thanks.

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 03/12] task_isolation: userspace hard isolation from kernel
  2020-03-04 16:07 ` [PATCH 03/12] task_isolation: userspace hard isolation from kernel Alex Belits
  2020-03-05 18:33   ` Frederic Weisbecker
@ 2020-03-06 15:26   ` Frederic Weisbecker
  2020-03-08  6:06     ` [EXT] " Alex Belits
  2020-03-06 16:00   ` Frederic Weisbecker
  2 siblings, 1 reply; 71+ messages in thread
From: Frederic Weisbecker @ 2020-03-06 15:26 UTC (permalink / raw)
  To: Alex Belits
  Cc: rostedt, mingo, peterz, linux-kernel, Prasun Kapoor, tglx,
	linux-api, linux-mm, linux-arch

On Wed, Mar 04, 2020 at 04:07:12PM +0000, Alex Belits wrote:
> +
> +/*
> + * Print message prefixed with the description of the current (or
> + * last) isolated task on a given CPU. Intended for isolation breaking
> + * messages that include target task for the user's convenience.
> + *
> + * Messages produced with this function may have obsolete task
> + * information if isolated tasks managed to exit, start and enter
> + * isolation multiple times, or multiple tasks tried to enter
> + * isolation on the same CPU at once. For those unusual cases it would
> + * contain a valid description of the cause for isolation breaking and
> + * target CPU number, just not the correct description of which task
> + * ended up losing isolation.
> + */
> +int task_isolation_message(int cpu, int level, bool supp, const char *fmt, ...)
> +{
> +	struct isol_task_desc *desc;
> +	struct task_struct *task;
> +	va_list args;
> +	char buf_prefix[TASK_COMM_LEN + 20 + 3 * 20];
> +	char buf[200];
> +	int curr_cpu, ind_counter, ind_counter_old, ind;
> +
> +	curr_cpu = get_cpu();
> +	desc = &per_cpu(isol_task_descs, cpu);
> +	ind_counter = atomic_read(&desc->curr_index);
> +
> +	if (curr_cpu == cpu) {
> +		/*
> +		 * Message is for the current CPU so current
> +		 * task_struct should be used instead of cached
> +		 * information.
> +		 *
> +		 * Like in other diagnostic messages, if issued from
> +		 * interrupt context, current will be the interrupted
> +		 * task. Unlike other diagnostic messages, this is
> +		 * always relevant because the message is about
> +		 * interrupting a task.
> +		 */
> +		ind = ind_counter & 1;
> +		if (supp && desc->warned[ind]) {
> +			/*
> +			 * If supp is true, skip the message if the
> +			 * same task was mentioned in the message
> +			 * originated on remote CPU, and it did not
> +			 * re-enter isolated state since then (warned
> +			 * is true). Only local messages following
> +			 * remote messages, likely about the same
> +			 * isolation breaking event, are skipped to
> +			 * avoid duplication. If remote cause is
> +			 * immediately followed by a local one before
> +			 * isolation is broken, local cause is skipped
> +			 * from messages.
> +			 */
> +			put_cpu();
> +			return 0;
> +		}
> +		task = current;
> +		snprintf(buf_prefix, sizeof(buf_prefix),
> +			 "isolation %s/%d/%d (cpu %d)",
> +			 task->comm, task->tgid, task->pid, cpu);
> +		put_cpu();
> +	} else {
> +		/*
> +		 * Message is for remote CPU, use cached information.
> +		 */
> +		put_cpu();
> +		/*
> +		 * Make sure, index remained unchanged while data was
> +		 * copied. If it changed, data that was copied may be
> +		 * inconsistent because two updates in a sequence could
> +		 * overwrite the data while it was being read.
> +		 */
> +		do {
> +			/* Make sure we are reading up to date values */
> +			smp_mb();
> +			ind = ind_counter & 1;
> +			snprintf(buf_prefix, sizeof(buf_prefix),
> +				 "isolation %s/%d/%d (cpu %d)",
> +				 desc->comm[ind], desc->tgid[ind],
> +				 desc->pid[ind], cpu);
> +			desc->warned[ind] = true;
> +			ind_counter_old = ind_counter;
> +			/* Record the warned flag, then re-read descriptor */
> +			smp_mb();
> +			ind_counter = atomic_read(&desc->curr_index);
> +			/*
> +			 * If the counter changed, something was updated, so
> +			 * repeat everything to get the current data
> +			 */
> +		} while (ind_counter != ind_counter_old);
> +	}

So the need to log the fact that we are sending an event to a remote CPU
that *may be* running an isolated task makes things very complicated and
even racy.

How bad would it be to only log those interruptions once they land on the target?

Thanks.

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 11/12] task_isolation: kick_all_cpus_sync: don't kick isolated cpus
  2020-03-04 16:15 ` [PATCH 11/12] task_isolation: kick_all_cpus_sync: don't kick isolated cpus Alex Belits
@ 2020-03-06 15:34   ` Frederic Weisbecker
  2020-03-08  6:48     ` [EXT] " Alex Belits
  0 siblings, 1 reply; 71+ messages in thread
From: Frederic Weisbecker @ 2020-03-06 15:34 UTC (permalink / raw)
  To: Alex Belits
  Cc: rostedt, mingo, peterz, linux-kernel, Prasun Kapoor, tglx,
	linux-api, linux-mm, linux-arch

On Wed, Mar 04, 2020 at 04:15:24PM +0000, Alex Belits wrote:
> From: Yuri Norov <ynorov@marvell.com>
> 
> Make sure that kick_all_cpus_sync() does not call CPUs that are running
> isolated tasks.
> 
> Signed-off-by: Alex Belits <abelits@marvell.com>
> ---
>  kernel/smp.c | 14 +++++++++++++-
>  1 file changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/smp.c b/kernel/smp.c
> index 3a8bcbdd4ce6..d9b4b2fedfed 100644
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -731,9 +731,21 @@ static void do_nothing(void *unused)
>   */
>  void kick_all_cpus_sync(void)
>  {
> +	struct cpumask mask;
> +
>  	/* Make sure the change is visible before we kick the cpus */
>  	smp_mb();
> -	smp_call_function(do_nothing, NULL, 1);
> +
> +	preempt_disable();
> +#ifdef CONFIG_TASK_ISOLATION
> +	cpumask_clear(&mask);
> +	task_isolation_cpumask(&mask);
> +	cpumask_complement(&mask, &mask);
> +#else
> +	cpumask_setall(&mask);
> +#endif
> +	smp_call_function_many(&mask, do_nothing, NULL, 1);
> +	preempt_enable();
>  }

That looks very dangerous, the callers of kick_all_cpus_sync() want to
sync all CPUs for a reason. You will rather need to fix the callers.

Thanks.

>  EXPORT_SYMBOL_GPL(kick_all_cpus_sync);
>  
> -- 
> 2.20.1
> 

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 03/12] task_isolation: userspace hard isolation from kernel
  2020-03-04 16:07 ` [PATCH 03/12] task_isolation: userspace hard isolation from kernel Alex Belits
  2020-03-05 18:33   ` Frederic Weisbecker
  2020-03-06 15:26   ` Frederic Weisbecker
@ 2020-03-06 16:00   ` Frederic Weisbecker
  2020-03-08  7:16     ` [EXT] " Alex Belits
  2 siblings, 1 reply; 71+ messages in thread
From: Frederic Weisbecker @ 2020-03-06 16:00 UTC (permalink / raw)
  To: Alex Belits
  Cc: rostedt, mingo, peterz, linux-kernel, Prasun Kapoor, tglx,
	linux-api, linux-mm, linux-arch

On Wed, Mar 04, 2020 at 04:07:12PM +0000, Alex Belits wrote:
> +#ifdef CONFIG_TASK_ISOLATION
> +int try_stop_full_tick(void)
> +{
> +	int cpu = smp_processor_id();
> +	struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched);
> +
> +	/* For an unstable clock, we should return a permanent error code. */
> +	if (atomic_read(&tick_dep_mask) & TICK_DEP_MASK_CLOCK_UNSTABLE)
> +		return -EINVAL;
> +
> +	if (!can_stop_full_tick(cpu, ts))
> +		return -EAGAIN;

Note that the stop_tick naming in nohz can be misleading. It means
we actually leave periodic mode and enter dynamic tick mode.

In practice it means that the tick is delayed until the next event, which
in the worst case may well be in 1 ms and in the best case never. So what
you probably want to check instead is whether the tick has been entirely
stopped (ie: we called hrtimer_cancel(&ts->sched_timer)).
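
For instance, a minimal sketch of such a check -- assuming the highres
case, where ts->sched_timer is the tick hrtimer that hrtimer_cancel()
would have removed -- might look like:

	tick_nohz_stop_sched_tick(ts, cpu);
	/*
	 * Entering dynticks mode only delays the tick until the next
	 * event; only report success if the tick hrtimer is really gone.
	 */
	if (hrtimer_active(&ts->sched_timer))
		return -EAGAIN;
	return 0;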

Thanks.

> +
> +	tick_nohz_stop_sched_tick(ts, cpu);
> +	return 0;
> +}
> +#endif
> +
>  static bool can_stop_idle_tick(int cpu, struct tick_sched *ts)
>  {
>  	/*
> -- 
> 2.20.1
> 

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 08/12] task_isolation: don't interrupt CPUs with tick_nohz_full_kick_cpu()
  2020-03-04 16:12 ` [PATCH 08/12] task_isolation: don't interrupt CPUs with tick_nohz_full_kick_cpu() Alex Belits
@ 2020-03-06 16:03   ` Frederic Weisbecker
  2020-03-08  7:28     ` [EXT] " Alex Belits
  0 siblings, 1 reply; 71+ messages in thread
From: Frederic Weisbecker @ 2020-03-06 16:03 UTC (permalink / raw)
  To: Alex Belits
  Cc: rostedt, mingo, peterz, linux-kernel, Prasun Kapoor, tglx,
	linux-api, linux-mm, linux-arch

On Wed, Mar 04, 2020 at 04:12:40PM +0000, Alex Belits wrote:
> From: Yuri Norov <ynorov@marvell.com>
> 
> For nohz_full CPUs the desirable behavior is to receive interrupts
> generated by tick_nohz_full_kick_cpu(). But for hard isolation it's
> obviously not desirable because it breaks isolation.
> 
> This patch adds a check for it.
> 
> Signed-off-by: Alex Belits <abelits@marvell.com>
> ---
>  kernel/time/tick-sched.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index 1d4dec9d3ee7..fe4503ba1316 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -20,6 +20,7 @@
>  #include <linux/sched/clock.h>
>  #include <linux/sched/stat.h>
>  #include <linux/sched/nohz.h>
> +#include <linux/isolation.h>
>  #include <linux/module.h>
>  #include <linux/irq_work.h>
>  #include <linux/posix-timers.h>
> @@ -262,7 +263,7 @@ static void tick_nohz_full_kick(void)
>   */
>  void tick_nohz_full_kick_cpu(int cpu)
>  {
> -	if (!tick_nohz_full_cpu(cpu))
> +	if (!tick_nohz_full_cpu(cpu) || task_isolation_on_cpu(cpu))
>  		return;

I fear you can't do that. A nohz full CPU is kicked for a reason.
As for the other cases, you need to fix the callers.

In the general case, randomly ignoring an interrupt is a correctness
issue.

Thanks.

^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH v2 00/12] "Task_isolation" mode
  2020-03-04 16:01 [PATCH 00/12] "Task_isolation" mode Alex Belits
                   ` (11 preceding siblings ...)
  2020-03-04 16:16 ` [PATCH 12/12] task_isolation: CONFIG_TASK_ISOLATION prevents distribution of jobs to non-housekeeping CPUs Alex Belits
@ 2020-03-08  3:42 ` Alex Belits
  2020-03-08  3:44   ` [PATCH v2 01/12] task_isolation: vmstat: add quiet_vmstat_sync function Alex Belits
                     ` (12 more replies)
  12 siblings, 13 replies; 71+ messages in thread
From: Alex Belits @ 2020-03-08  3:42 UTC (permalink / raw)
  To: frederic, rostedt
  Cc: mingo, peterz, linux-kernel, Prasun Kapoor, tglx, linux-api,
	catalin.marinas, linux-arm-kernel, netdev, davem, linux-arch,
	will

This is the updated version of the task isolation patchset.

1. Commit messages updated to match changes.
2. Sign-off lines restored from original patches, changes listed wherever applicable.
3. arm platform -- added missing calls to syscall check and cleanup procedure after leaving isolation.
4. x86 platform -- added missing calls to cleanup procedure after leaving isolation.


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH v2 01/12] task_isolation: vmstat: add quiet_vmstat_sync function
  2020-03-08  3:42 ` [PATCH v2 00/12] "Task_isolation" mode Alex Belits
@ 2020-03-08  3:44   ` Alex Belits
  2020-03-08  3:46   ` [PATCH v2 02/12] task_isolation: vmstat: add vmstat_idle function Alex Belits
                     ` (11 subsequent siblings)
  12 siblings, 0 replies; 71+ messages in thread
From: Alex Belits @ 2020-03-08  3:44 UTC (permalink / raw)
  To: frederic, rostedt
  Cc: mingo, peterz, linux-kernel, Prasun Kapoor, tglx, linux-api,
	catalin.marinas, linux-arm-kernel, netdev, davem, linux-arch,
	will

From: Chris Metcalf <cmetcalf@mellanox.com>

In commit f01f17d3705b ("mm, vmstat: make quiet_vmstat lighter")
the quiet_vmstat() function became asynchronous, in the sense that
the vmstat work was still scheduled to run on the core when the
function returned.  For task isolation, we need a synchronous
version of the function that guarantees that the vmstat worker
will not run on the core on return from the function.  Add a
quiet_vmstat_sync() function with that semantic.

Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com>
Signed-off-by: Alex Belits <abelits@marvell.com>
---
 include/linux/vmstat.h | 2 ++
 mm/vmstat.c            | 9 +++++++++
 2 files changed, 11 insertions(+)

diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index 292485f3d24d..2bc5e85f2514 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -270,6 +270,7 @@ extern void __dec_zone_state(struct zone *, enum zone_stat_item);
 extern void __dec_node_state(struct pglist_data *, enum node_stat_item);
 
 void quiet_vmstat(void);
+void quiet_vmstat_sync(void);
 void cpu_vm_stats_fold(int cpu);
 void refresh_zone_stat_thresholds(void);
 
@@ -372,6 +373,7 @@ static inline void __dec_node_page_state(struct page *page,
 static inline void refresh_zone_stat_thresholds(void) { }
 static inline void cpu_vm_stats_fold(int cpu) { }
 static inline void quiet_vmstat(void) { }
+static inline void quiet_vmstat_sync(void) { }
 
 static inline void drain_zonestat(struct zone *zone,
 			struct per_cpu_pageset *pset) { }
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 78d53378db99..1fa0b2d04afa 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1870,6 +1870,15 @@ void quiet_vmstat(void)
 	refresh_cpu_vm_stats(false);
 }
 
+/*
+ * Synchronously quiet vmstat so the work is guaranteed not to run on return.
+ */
+void quiet_vmstat_sync(void)
+{
+	cancel_delayed_work_sync(this_cpu_ptr(&vmstat_work));
+	refresh_cpu_vm_stats(false);
+}
+
 /*
  * Shepherd worker thread that checks the
  * differentials of processors that have their worker
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v2 02/12] task_isolation: vmstat: add vmstat_idle function
  2020-03-08  3:42 ` [PATCH v2 00/12] "Task_isolation" mode Alex Belits
  2020-03-08  3:44   ` [PATCH v2 01/12] task_isolation: vmstat: add quiet_vmstat_sync function Alex Belits
@ 2020-03-08  3:46   ` Alex Belits
  2020-03-08  3:47   ` [PATCH v2 03/12] task_isolation: userspace hard isolation from kernel Alex Belits
                     ` (10 subsequent siblings)
  12 siblings, 0 replies; 71+ messages in thread
From: Alex Belits @ 2020-03-08  3:46 UTC (permalink / raw)
  To: frederic, rostedt
  Cc: mingo, peterz, linux-kernel, Prasun Kapoor, tglx, linux-api,
	catalin.marinas, linux-arm-kernel, netdev, davem, linux-arch,
	will

From: Chris Metcalf <cmetcalf@mellanox.com>

This function checks that the vmstat worker is not running on the
current core and that the vmstat diffs don't require an update.  The
function is called from the task-isolation code to decide whether we
need to actually do some work to quiet vmstat.
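
As a rough sketch (the actual wiring is in the task-isolation patch
later in this series), the two vmstat helpers added so far are
expected to be used along these lines on the isolation entry path:

	/* Process context, interrupts enabled: may cancel and flush work. */
	quiet_vmstat_sync();

	/* Later, with interrupts disabled, just before returning to user: */
	if (!vmstat_idle())
		return -EAGAIN;	/* vmstat work got re-armed; userspace retries */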

Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com>
Signed-off-by: Alex Belits <abelits@marvell.com>
---
 include/linux/vmstat.h |  2 ++
 mm/vmstat.c            | 10 ++++++++++
 2 files changed, 12 insertions(+)

diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index 2bc5e85f2514..66d9ae32cf07 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -271,6 +271,7 @@ extern void __dec_node_state(struct pglist_data *, enum node_stat_item);
 
 void quiet_vmstat(void);
 void quiet_vmstat_sync(void);
+bool vmstat_idle(void);
 void cpu_vm_stats_fold(int cpu);
 void refresh_zone_stat_thresholds(void);
 
@@ -374,6 +375,7 @@ static inline void refresh_zone_stat_thresholds(void) { }
 static inline void cpu_vm_stats_fold(int cpu) { }
 static inline void quiet_vmstat(void) { }
 static inline void quiet_vmstat_sync(void) { }
+static inline bool vmstat_idle(void) { return true; }
 
 static inline void drain_zonestat(struct zone *zone,
 			struct per_cpu_pageset *pset) { }
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 1fa0b2d04afa..5c4aec651062 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1879,6 +1879,16 @@ void quiet_vmstat_sync(void)
 	refresh_cpu_vm_stats(false);
 }
 
+/*
+ * Report on whether vmstat processing is quiesced on the core currently:
+ * no vmstat worker running and no vmstat updates to perform.
+ */
+bool vmstat_idle(void)
+{
+	return !delayed_work_pending(this_cpu_ptr(&vmstat_work)) &&
+		!need_update(smp_processor_id());
+}
+
 /*
  * Shepherd worker thread that checks the
  * differentials of processors that have their worker
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v2 03/12] task_isolation: userspace hard isolation from kernel
  2020-03-08  3:42 ` [PATCH v2 00/12] "Task_isolation" mode Alex Belits
  2020-03-08  3:44   ` [PATCH v2 01/12] task_isolation: vmstat: add quiet_vmstat_sync function Alex Belits
  2020-03-08  3:46   ` [PATCH v2 02/12] task_isolation: vmstat: add vmstat_idle function Alex Belits
@ 2020-03-08  3:47   ` Alex Belits
       [not found]     ` <20200307214254.7a8f6c22@hermes.lan>
                       ` (3 more replies)
  2020-03-08  3:48   ` [PATCH v2 04/12] task_isolation: Add task isolation hooks to arch-independent code Alex Belits
                     ` (9 subsequent siblings)
  12 siblings, 4 replies; 71+ messages in thread
From: Alex Belits @ 2020-03-08  3:47 UTC (permalink / raw)
  To: frederic, rostedt
  Cc: mingo, peterz, linux-kernel, Prasun Kapoor, tglx, linux-api,
	catalin.marinas, linux-arm-kernel, netdev, davem, linux-arch,
	will

The existing nohz_full mode is designed as a "soft" isolation mode
that makes tradeoffs to minimize userspace interruptions while
still attempting to avoid overheads in the kernel entry/exit path,
to provide 100% kernel semantics, etc.

However, some applications require a "hard" commitment from the
kernel to avoid interruptions, in particular userspace device driver
style applications, such as high-speed networking code.

This change introduces a framework to allow applications
to elect to have the "hard" semantics as needed, specifying
prctl(PR_TASK_ISOLATION, PR_TASK_ISOLATION_ENABLE) to do so.

The kernel must be built with the new TASK_ISOLATION Kconfig flag
to enable this mode, and the kernel booted with an appropriate
"isolcpus=nohz,domain,CPULIST" boot argument to enable
nohz_full and isolcpus. The "task_isolation" state is then indicated
by setting a new task struct field, task_isolation_flag, to the
value passed by prctl(), and also setting a TIF_TASK_ISOLATION
bit in the thread_info flags. When the kernel is returning to
userspace from the prctl() call and sees TIF_TASK_ISOLATION set,
it calls the new task_isolation_start() routine to arrange for
the task to avoid being interrupted in the future.

With interrupts disabled, task_isolation_start() ensures that kernel
subsystems that might cause a future interrupt are quiesced. If it
doesn't succeed, it adjusts the syscall return value to indicate that
fact, and userspace can retry as desired. In addition to stopping
the scheduler tick, the code takes any actions that might avoid
a future interrupt to the core, such as a worker thread being
scheduled that could be quiesced now (e.g. the vmstat worker)
or a future IPI to the core to clean up some state that could be
cleaned up now (e.g. the mm lru per-cpu cache).
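
As a rough userspace sketch of this request/retry flow (it assumes the
PR_TASK_ISOLATION constants introduced by this patch and leaves out
the affinity setup to a single nohz_full CPU):

	#include <errno.h>
	#include <sys/prctl.h>

	#ifndef PR_TASK_ISOLATION
	#define PR_TASK_ISOLATION		48
	#define PR_TASK_ISOLATION_ENABLE	(1 << 0)
	#endif

	/* Retry while the kernel reports a transient quiescing failure. */
	static int enable_task_isolation(void)
	{
		int ret;

		do {
			ret = prctl(PR_TASK_ISOLATION,
				    PR_TASK_ISOLATION_ENABLE, 0, 0, 0);
		} while (ret != 0 && errno == EAGAIN);

		return ret;
	}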

Once the task has returned to userspace after issuing the prctl(),
if it enters the kernel again via system call, page fault, or any
other exception or irq, the kernel will kill it with SIGKILL.
In addition to sending a signal, the code supports a kernel
command-line "task_isolation_debug" flag which causes a stack
backtrace to be generated whenever a task loses isolation.

To allow the state to be entered and exited, the syscall checking
test ignores the prctl(PR_TASK_ISOLATION) syscall so that we can
clear the bit again later, and ignores exit/exit_group to allow
exiting the task without a pointless signal being delivered.

The prctl() API allows for specifying a signal number to use instead
of the default SIGKILL, to allow for catching the notification
signal; for example, in a production environment, it might be
helpful to log information to the application logging mechanism
before exiting. Or, the signal handler might choose to reset the
program counter back to the code segment intended to be run isolated
via prctl() to continue execution.
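
A sketch of that variant (the signal number and the logging are just
placeholders; only async-signal-safe calls belong in the handler):

	#include <signal.h>
	#include <unistd.h>
	#include <sys/prctl.h>

	#ifndef PR_TASK_ISOLATION
	#define PR_TASK_ISOLATION		48
	#define PR_TASK_ISOLATION_ENABLE	(1 << 0)
	#define PR_TASK_ISOLATION_SET_SIG(sig)	(((sig) & 0x7f) << 8)
	#endif

	static void isolation_lost(int sig)
	{
		static const char msg[] = "task isolation lost\n";

		write(STDERR_FILENO, msg, sizeof(msg) - 1);
		_exit(1);
	}

	static int enable_task_isolation_notified(void)
	{
		signal(SIGUSR1, isolation_lost);
		return prctl(PR_TASK_ISOLATION,
			     PR_TASK_ISOLATION_ENABLE |
			     PR_TASK_ISOLATION_SET_SIG(SIGUSR1),
			     0, 0, 0);
	}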

In a number of cases we can tell on a remote cpu that we are
going to be interrupting the cpu, e.g. via an IPI or a TLB flush.
In that case we generate the diagnostic (and optional stack dump)
on the remote core to be able to deliver better diagnostics.
If the interrupt is not something caught by Linux (e.g. a
hypervisor interrupt) we can also request a reschedule IPI to
be sent to the remote core so it can be sure to generate a
signal to notify the process.

Separate patches that follow provide these changes for x86, arm,
and arm64.

Signed-off-by: Alex Belits <abelits@marvell.com>
---
 .../admin-guide/kernel-parameters.txt         |   6 +
 include/linux/hrtimer.h                       |   4 +
 include/linux/isolation.h                     | 229 ++++++
 include/linux/sched.h                         |   4 +
 include/linux/tick.h                          |   3 +
 include/uapi/linux/prctl.h                    |   6 +
 init/Kconfig                                  |  28 +
 kernel/Makefile                               |   2 +
 kernel/context_tracking.c                     |   2 +
 kernel/isolation.c                            | 774 ++++++++++++++++++
 kernel/signal.c                               |   2 +
 kernel/sys.c                                  |   6 +
 kernel/time/hrtimer.c                         |  27 +
 kernel/time/tick-sched.c                      |  18 +
 14 files changed, 1111 insertions(+)
 create mode 100644 include/linux/isolation.h
 create mode 100644 kernel/isolation.c

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index c07815d230bc..e4a2d6e37645 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4808,6 +4808,12 @@
 			neutralize any effect of /proc/sys/kernel/sysrq.
 			Useful for debugging.
 
+	task_isolation_debug	[KNL]
+			In kernels built with CONFIG_TASK_ISOLATION, this
+			setting will generate console backtraces to
+			accompany the diagnostics generated about
+			interrupting tasks running with task isolation.
+
 	tcpmhash_entries= [KNL,NET]
 			Set the number of tcp_metrics_hash slots.
 			Default value is 8192 or 16384 depending on total
diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index 15c8ac313678..e81252eb4f92 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -528,6 +528,10 @@ extern void __init hrtimers_init(void);
 /* Show pending timers: */
 extern void sysrq_timer_list_show(void);
 
+#ifdef CONFIG_TASK_ISOLATION
+extern void kick_hrtimer(void);
+#endif
+
 int hrtimers_prepare_cpu(unsigned int cpu);
 #ifdef CONFIG_HOTPLUG_CPU
 int hrtimers_dead_cpu(unsigned int cpu);
diff --git a/include/linux/isolation.h b/include/linux/isolation.h
new file mode 100644
index 000000000000..6bd71c67f10f
--- /dev/null
+++ b/include/linux/isolation.h
@@ -0,0 +1,229 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Task isolation support
+ *
+ * Authors:
+ *   Chris Metcalf <cmetcalf@mellanox.com>
+ *   Alex Belits <abelits@marvell.com>
+ *   Yuri Norov <ynorov@marvell.com>
+ */
+#ifndef _LINUX_ISOLATION_H
+#define _LINUX_ISOLATION_H
+
+#include <stdarg.h>
+#include <linux/errno.h>
+#include <linux/cpumask.h>
+#include <linux/prctl.h>
+#include <linux/types.h>
+
+struct task_struct;
+
+#ifdef CONFIG_TASK_ISOLATION
+
+int task_isolation_message(int cpu, int level, bool supp, const char *fmt, ...);
+
+#define pr_task_isol_emerg(cpu, fmt, ...)			\
+	task_isolation_message(cpu, LOGLEVEL_EMERG, false, fmt, ##__VA_ARGS__)
+#define pr_task_isol_alert(cpu, fmt, ...)			\
+	task_isolation_message(cpu, LOGLEVEL_ALERT, false, fmt, ##__VA_ARGS__)
+#define pr_task_isol_crit(cpu, fmt, ...)			\
+	task_isolation_message(cpu, LOGLEVEL_CRIT, false, fmt, ##__VA_ARGS__)
+#define pr_task_isol_err(cpu, fmt, ...)				\
+	task_isolation_message(cpu, LOGLEVEL_ERR, false, fmt, ##__VA_ARGS__)
+#define pr_task_isol_warn(cpu, fmt, ...)			\
+	task_isolation_message(cpu, LOGLEVEL_WARNING, false, fmt, ##__VA_ARGS__)
+#define pr_task_isol_notice(cpu, fmt, ...)			\
+	task_isolation_message(cpu, LOGLEVEL_NOTICE, false, fmt, ##__VA_ARGS__)
+#define pr_task_isol_info(cpu, fmt, ...)			\
+	task_isolation_message(cpu, LOGLEVEL_INFO, false, fmt, ##__VA_ARGS__)
+#define pr_task_isol_debug(cpu, fmt, ...)			\
+	task_isolation_message(cpu, LOGLEVEL_DEBUG, false, fmt, ##__VA_ARGS__)
+
+#define pr_task_isol_emerg_supp(cpu, fmt, ...)			\
+	task_isolation_message(cpu, LOGLEVEL_EMERG, true, fmt, ##__VA_ARGS__)
+#define pr_task_isol_alert_supp(cpu, fmt, ...)			\
+	task_isolation_message(cpu, LOGLEVEL_ALERT, true, fmt, ##__VA_ARGS__)
+#define pr_task_isol_crit_supp(cpu, fmt, ...)			\
+	task_isolation_message(cpu, LOGLEVEL_CRIT, true, fmt, ##__VA_ARGS__)
+#define pr_task_isol_err_supp(cpu, fmt, ...)				\
+	task_isolation_message(cpu, LOGLEVEL_ERR, true, fmt, ##__VA_ARGS__)
+#define pr_task_isol_warn_supp(cpu, fmt, ...)			\
+	task_isolation_message(cpu, LOGLEVEL_WARNING, true, fmt, ##__VA_ARGS__)
+#define pr_task_isol_notice_supp(cpu, fmt, ...)			\
+	task_isolation_message(cpu, LOGLEVEL_NOTICE, true, fmt, ##__VA_ARGS__)
+#define pr_task_isol_info_supp(cpu, fmt, ...)			\
+	task_isolation_message(cpu, LOGLEVEL_INFO, true, fmt, ##__VA_ARGS__)
+#define pr_task_isol_debug_supp(cpu, fmt, ...)			\
+	task_isolation_message(cpu, LOGLEVEL_DEBUG, true, fmt, ##__VA_ARGS__)
+DECLARE_PER_CPU(unsigned long, tsk_thread_flags_cache);
+extern cpumask_var_t task_isolation_map;
+
+/**
+ * task_isolation_request() - prctl hook to request task isolation
+ * @flags:	Flags from <linux/prctl.h> PR_TASK_ISOLATION_xxx.
+ *
+ * This is called from the generic prctl() code for PR_TASK_ISOLATION.
+ *
+ * Return: Returns 0 when task isolation enabled, otherwise a negative
+ * errno.
+ */
+extern int task_isolation_request(unsigned int flags);
+extern void task_isolation_cpu_cleanup(void);
+/**
+ * task_isolation_start() - attempt to actually start task isolation
+ *
+ * This function should be invoked as the last thing prior to returning to
+ * user space if TIF_TASK_ISOLATION is set in the thread_info flags.  It
+ * will attempt to quiesce the core and enter task-isolation mode.  If it
+ * fails, it will reset the system call return value to an error code that
+ * indicates the failure mode.
+ */
+extern void task_isolation_start(void);
+
+/**
+ * is_isolation_cpu() - check if CPU is intended for running isolated tasks.
+ * @cpu:	CPU to check.
+ */
+static inline bool is_isolation_cpu(int cpu)
+{
+	return task_isolation_map != NULL &&
+		cpumask_test_cpu(cpu, task_isolation_map);
+}
+
+/**
+ * task_isolation_on_cpu() - check if the cpu is running isolated task
+ * @cpu:	CPU to check.
+ */
+extern int task_isolation_on_cpu(int cpu);
+extern void task_isolation_check_run_cleanup(void);
+
+/**
+ * task_isolation_cpumask() - set CPUs currently running isolated tasks
+ * @mask:	Mask to modify.
+ */
+extern void task_isolation_cpumask(struct cpumask *mask);
+
+/**
+ * task_isolation_clear_cpumask() - clear CPUs currently running isolated tasks
+ * @mask:      Mask to modify.
+ */
+extern void task_isolation_clear_cpumask(struct cpumask *mask);
+
+/**
+ * task_isolation_syscall() - report a syscall from an isolated task
+ * @nr:		The syscall number.
+ *
+ * This routine should be invoked at syscall entry if TIF_TASK_ISOLATION is
+ * set in the thread_info flags.  It checks for valid syscalls,
+ * specifically prctl() with PR_TASK_ISOLATION, exit(), and exit_group().
+ * For any other syscall it will raise a signal and return failure.
+ *
+ * Return: 0 for acceptable syscalls, -1 for all others.
+ */
+extern int task_isolation_syscall(int nr);
+
+/**
+ * _task_isolation_interrupt() - report an interrupt of an isolated task
+ * @fmt:	A format string describing the interrupt
+ * @...:	Format arguments, if any.
+ *
+ * This routine should be invoked at any exception or IRQ if
+ * TIF_TASK_ISOLATION is set in the thread_info flags.  It is not necessary
+ * to invoke it if the exception will generate a signal anyway (e.g. a bad
+ * page fault), and in that case it is preferable not to invoke it but just
+ * rely on the standard Linux signal.  The macro task_isolation_interrupt()
+ * wraps the TIF_TASK_ISOLATION flag test to simplify the caller code.
+ */
+extern void _task_isolation_interrupt(const char *fmt, ...);
+#define task_isolation_interrupt(fmt, ...)				\
+	do {								\
+		if (current_thread_info()->flags & _TIF_TASK_ISOLATION)	\
+			_task_isolation_interrupt(fmt, ## __VA_ARGS__);	\
+	} while (0)
+
+/**
+ * task_isolation_remote() - report a remote interrupt of an isolated task
+ * @cpu:	The remote cpu that is about to be interrupted.
+ * @fmt:	A format string describing the interrupt
+ * @...:	Format arguments, if any.
+ *
+ * This routine should be invoked any time a remote IPI or other type of
+ * interrupt is being delivered to another cpu. The function will check to
+ * see if the target core is running a task-isolation task, and generate a
+ * diagnostic on the console if so; in addition, we tag the task so it
+ * doesn't generate another diagnostic when the interrupt actually arrives.
+ * Generating a diagnostic remotely yields a clearer indication of what
+ * happened than just reporting only when the remote core is interrupted.
+ *
+ */
+extern void task_isolation_remote(int cpu, const char *fmt, ...);
+
+/**
+ * task_isolation_remote_cpumask() - report interruption of multiple cpus
+ * @mask:	The set of remote cpus that are about to be interrupted.
+ * @fmt:	A format string describing the interrupt
+ * @...:	Format arguments, if any.
+ *
+ * This is the cpumask variant of task_isolation_remote().  We
+ * generate a single-line diagnostic message even if multiple remote
+ * task-isolation cpus are being interrupted.
+ */
+extern void task_isolation_remote_cpumask(const struct cpumask *mask,
+					  const char *fmt, ...);
+
+/**
+ * _task_isolation_signal() - disable task isolation when signal is pending
+ * @task:	The task for which to disable isolation.
+ *
+ * This function generates a diagnostic and disables task isolation; it
+ * should be called if TIF_TASK_ISOLATION is set when notifying a task of a
+ * pending signal.  The task_isolation_interrupt() function normally
+ * generates a diagnostic for events that just interrupt a task without
+ * generating a signal; here we need to hook the paths that correspond to
+ * interrupts that do generate a signal.  The macro task_isolation_signal()
+ * wraps the TIF_TASK_ISOLATION flag test to simplify the caller code.
+ */
+extern void _task_isolation_signal(struct task_struct *task);
+#define task_isolation_signal(task)					\
+	do {								\
+		if (task_thread_info(task)->flags & _TIF_TASK_ISOLATION) \
+			_task_isolation_signal(task);			\
+	} while (0)
+
+/**
+ * task_isolation_user_exit() - debug all user_exit calls
+ *
+ * By default, we don't generate an exception in the low-level user_exit()
+ * code, because programs lose the ability to disable task isolation: the
+ * user_exit() hook will cause a signal prior to task_isolation_syscall()
+ * disabling task isolation.  In addition, it means that we lose all the
+ * diagnostic info otherwise available from task_isolation_interrupt() hooks
+ * later in the interrupt-handling process.  But you may enable it here for
+ * a special kernel build if you are having undiagnosed userspace jitter.
+ */
+static inline void task_isolation_user_exit(void)
+{
+#ifdef DEBUG_TASK_ISOLATION
+	task_isolation_interrupt("user_exit");
+#endif
+}
+
+#else /* !CONFIG_TASK_ISOLATION */
+static inline int task_isolation_request(unsigned int flags) { return -EINVAL; }
+static inline void task_isolation_start(void) { }
+static inline bool is_isolation_cpu(int cpu) { return false; }
+static inline int task_isolation_on_cpu(int cpu) { return 0; }
+static inline void task_isolation_cpumask(struct cpumask *mask) { }
+static inline void task_isolation_clear_cpumask(struct cpumask *mask) { }
+static inline void task_isolation_cpu_cleanup(void) { }
+static inline void task_isolation_check_run_cleanup(void) { }
+static inline int task_isolation_syscall(int nr) { return 0; }
+static inline void task_isolation_interrupt(const char *fmt, ...) { }
+static inline void task_isolation_remote(int cpu, const char *fmt, ...) { }
+static inline void task_isolation_remote_cpumask(const struct cpumask *mask,
+						 const char *fmt, ...) { }
+static inline void task_isolation_signal(struct task_struct *task) { }
+static inline void task_isolation_user_exit(void) { }
+#endif
+
+#endif /* _LINUX_ISOLATION_H */
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 04278493bf15..52fdb32aa3b9 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1280,6 +1280,10 @@ struct task_struct {
 	unsigned long			lowest_stack;
 	unsigned long			prev_lowest_stack;
 #endif
+#ifdef CONFIG_TASK_ISOLATION
+	unsigned short			task_isolation_flags;  /* prctl */
+	unsigned short			task_isolation_state;
+#endif
 
 	/*
 	 * New fields for task_struct should be added above here, so that
diff --git a/include/linux/tick.h b/include/linux/tick.h
index 7340613c7eff..27c7c033d5a8 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -268,6 +268,9 @@ static inline void tick_dep_clear_signal(struct signal_struct *signal,
 extern void tick_nohz_full_kick_cpu(int cpu);
 extern void __tick_nohz_task_switch(void);
 extern void __init tick_nohz_full_setup(cpumask_var_t cpumask);
+#ifdef CONFIG_TASK_ISOLATION
+extern int try_stop_full_tick(void);
+#endif
 #else
 static inline bool tick_nohz_full_enabled(void) { return false; }
 static inline bool tick_nohz_full_cpu(int cpu) { return false; }
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 07b4f8131e36..f4848ed2a069 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -238,4 +238,10 @@ struct prctl_mm_map {
 #define PR_SET_IO_FLUSHER		57
 #define PR_GET_IO_FLUSHER		58
 
+/* Enable task_isolation mode for TASK_ISOLATION kernels. */
+#define PR_TASK_ISOLATION		48
+# define PR_TASK_ISOLATION_ENABLE	(1 << 0)
+# define PR_TASK_ISOLATION_SET_SIG(sig)	(((sig) & 0x7f) << 8)
+# define PR_TASK_ISOLATION_GET_SIG(bits) (((bits) >> 8) & 0x7f)
+
 #endif /* _LINUX_PRCTL_H */
diff --git a/init/Kconfig b/init/Kconfig
index 20a6ac33761c..ecdf567f6bd4 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -576,6 +576,34 @@ config CPU_ISOLATION
 
 source "kernel/rcu/Kconfig"
 
+config HAVE_ARCH_TASK_ISOLATION
+	bool
+
+config TASK_ISOLATION
+	bool "Provide hard CPU isolation from the kernel on demand"
+	depends on NO_HZ_FULL && HAVE_ARCH_TASK_ISOLATION
+	help
+
+	Allow userspace processes that place themselves on cores with
+	nohz_full and isolcpus enabled, and run prctl(PR_TASK_ISOLATION),
+	to "isolate" themselves from the kernel.  Prior to returning to
+	userspace, isolated tasks will arrange that no future kernel
+	activity will interrupt the task while the task is running in
+	userspace.  Attempting to re-enter the kernel while in this mode
+	will cause the task to be terminated with a signal; you must
+	explicitly use prctl() to disable task isolation before resuming
+	normal use of the kernel.
+
+	This "hard" isolation from the kernel is required for userspace
+	tasks that are running hard real-time tasks in userspace, such as
+	a high-speed network driver in userspace.  Without this option, but
+	with NO_HZ_FULL enabled, the kernel will make a best-effort, "soft"
+	attempt to shield a single userspace process from interrupts, but
+	makes no guarantees.
+
+	You should say "N" unless you are intending to run a
+	high-performance userspace driver or similar task.
+
 config BUILD_BIN2C
 	bool
 	default n
diff --git a/kernel/Makefile b/kernel/Makefile
index 4cb4130ced32..2f2ae91f90d5 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -122,6 +122,8 @@ obj-$(CONFIG_GCC_PLUGIN_STACKLEAK) += stackleak.o
 KASAN_SANITIZE_stackleak.o := n
 KCOV_INSTRUMENT_stackleak.o := n
 
+obj-$(CONFIG_TASK_ISOLATION) += isolation.o
+
 $(obj)/configs.o: $(obj)/config_data.gz
 
 targets += config_data.gz
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 0296b4bda8f1..e9206736f219 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -21,6 +21,7 @@
 #include <linux/hardirq.h>
 #include <linux/export.h>
 #include <linux/kprobes.h>
+#include <linux/isolation.h>
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/context_tracking.h>
@@ -157,6 +158,7 @@ void __context_tracking_exit(enum ctx_state state)
 			if (state == CONTEXT_USER) {
 				vtime_user_exit(current);
 				trace_user_exit(0);
+				task_isolation_user_exit();
 			}
 		}
 		__this_cpu_write(context_tracking.state, CONTEXT_KERNEL);
diff --git a/kernel/isolation.c b/kernel/isolation.c
new file mode 100644
index 000000000000..ae29732c376c
--- /dev/null
+++ b/kernel/isolation.c
@@ -0,0 +1,774 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ *  linux/kernel/isolation.c
+ *
+ *  Implementation of task isolation.
+ *
+ * Authors:
+ *   Chris Metcalf <cmetcalf@mellanox.com>
+ *   Alex Belits <abelits@marvell.com>
+ *   Yuri Norov <ynorov@marvell.com>
+ */
+
+#include <linux/mm.h>
+#include <linux/swap.h>
+#include <linux/vmstat.h>
+#include <linux/sched.h>
+#include <linux/isolation.h>
+#include <linux/syscalls.h>
+#include <linux/smp.h>
+#include <linux/tick.h>
+#include <asm/unistd.h>
+#include <asm/syscall.h>
+#include <linux/hrtimer.h>
+
+/*
+ * These values are stored in task_isolation_state.
+ * Note that STATE_NORMAL + TIF_TASK_ISOLATION means we are still
+ * returning from sys_prctl() to userspace.
+ */
+enum {
+	STATE_NORMAL = 0,	/* Not isolated */
+	STATE_ISOLATED = 1	/* In userspace, isolated */
+};
+
+/*
+ * This variable contains thread flags copied at the moment
+ * when schedule() switched to the task on a given CPU,
+ * or 0 if no task is running.
+ */
+DEFINE_PER_CPU(unsigned long, tsk_thread_flags_cache);
+
+/*
+ * Counter for isolation state on a given CPU, increments when entering
+ * isolation and decrements when exiting isolation (before or after the
+ * cleanup). Multiple simultaneously running procedures entering or
+ * exiting isolation are prevented by checking the result of
+ * incrementing or decrementing this variable. This variable is both
+ * incremented and decremented by CPU that caused isolation entering or
+ * exit.
+ *
+ * This is necessary because multiple isolation-breaking events may happen
+ * at once (or one as the result of the other), however isolation exit
+ * may only happen once to transition from isolated to non-isolated state.
+ * Therefore, if decrementing this counter results in a value less than 0,
+ * isolation exit procedure can't be started -- it already happened, or is
+ * in progress, or isolation is not entered yet.
+ */
+DEFINE_PER_CPU(atomic_t, isol_counter);
+
+/*
+ * Description of the last two tasks that ran isolated on a given CPU.
+ * This is intended only for messages about isolation breaking. We
+ * don't want any references to actual task while accessing this from
+ * CPU that caused isolation breaking -- we know nothing about timing
+ * and don't want to use locking or RCU.
+ */
+struct isol_task_desc {
+	atomic_t curr_index;
+	atomic_t curr_index_wr;
+	bool	warned[2];
+	pid_t	pid[2];
+	pid_t	tgid[2];
+	char	comm[2][TASK_COMM_LEN];
+};
+static DEFINE_PER_CPU(struct isol_task_desc, isol_task_descs);
+
+/*
+ * Counter for isolation exiting procedures (from request to the start of
+ * cleanup) being attempted at once on a CPU. Normally incrementing of
+ * this counter is performed from the CPU that caused isolation breaking,
+ * however decrementing is done from the cleanup procedure, delegated to
+ * the CPU that is exiting isolation, not from the CPU that caused isolation
+ * breaking.
+ *
+ * If incrementing this counter while starting isolation exit procedure
+ * results in a value greater than 0, isolation exiting is already in
+ * progress, and cleanup did not start yet. This means the counter should be
+ * decremented back, and the isolation exit that is already in progress should
+ * be allowed to complete. Otherwise, a new isolation exit procedure should
+ * be started.
+ */
+DEFINE_PER_CPU(atomic_t, isol_exit_counter);
+
+/*
+ * Descriptor for isolation-breaking SMP calls
+ */
+DEFINE_PER_CPU(call_single_data_t, isol_break_csd);
+
+cpumask_var_t task_isolation_map;
+cpumask_var_t task_isolation_cleanup_map;
+static DEFINE_SPINLOCK(task_isolation_cleanup_lock);
+
+/* We can run on cpus that are isolated from the scheduler and are nohz_full. */
+static int __init task_isolation_init(void)
+{
+	alloc_bootmem_cpumask_var(&task_isolation_cleanup_map);
+	if (alloc_cpumask_var(&task_isolation_map, GFP_KERNEL))
+		/*
+		 * At this point task isolation should match
+		 * nohz_full. This may change in the future.
+		 */
+		cpumask_copy(task_isolation_map, tick_nohz_full_mask);
+	return 0;
+}
+core_initcall(task_isolation_init)
+
+/* Enable stack backtraces of any interrupts of task_isolation cores. */
+static bool task_isolation_debug;
+static int __init task_isolation_debug_func(char *str)
+{
+	task_isolation_debug = true;
+	return 1;
+}
+__setup("task_isolation_debug", task_isolation_debug_func);
+
+/*
+ * Record name, pid and group pid of the task entering isolation on
+ * the current CPU.
+ */
+static void record_curr_isolated_task(void)
+{
+	int ind;
+	int cpu = smp_processor_id();
+	struct isol_task_desc *desc = &per_cpu(isol_task_descs, cpu);
+	struct task_struct *task = current;
+
+	/* Finish everything before recording current task */
+	smp_mb();
+	ind = atomic_inc_return(&desc->curr_index_wr) & 1;
+	desc->comm[ind][sizeof(task->comm) - 1] = '\0';
+	memcpy(desc->comm[ind], task->comm, sizeof(task->comm) - 1);
+	desc->pid[ind] = task->pid;
+	desc->tgid[ind] = task->tgid;
+	desc->warned[ind] = false;
+	/* Write everything, to be seen by other CPUs */
+	smp_mb();
+	atomic_inc(&desc->curr_index);
+	/* Everyone will see the new record from this point */
+	smp_mb();
+}
+
+/*
+ * Print message prefixed with the description of the current (or
+ * last) isolated task on a given CPU. Intended for isolation breaking
+ * messages that include target task for the user's convenience.
+ *
+ * Messages produced with this function may have obsolete task
+ * information if isolated tasks managed to exit, start and enter
+ * isolation multiple times, or multiple tasks tried to enter
+ * isolation on the same CPU at once. For those unusual cases it would
+ * contain a valid description of the cause for isolation breaking and
+ * target CPU number, just not the correct description of which task
+ * ended up losing isolation.
+ */
+int task_isolation_message(int cpu, int level, bool supp, const char *fmt, ...)
+{
+	struct isol_task_desc *desc;
+	struct task_struct *task;
+	va_list args;
+	char buf_prefix[TASK_COMM_LEN + 20 + 3 * 20];
+	char buf[200];
+	int curr_cpu, ind_counter, ind_counter_old, ind;
+
+	curr_cpu = get_cpu();
+	desc = &per_cpu(isol_task_descs, cpu);
+	ind_counter = atomic_read(&desc->curr_index);
+
+	if (curr_cpu == cpu) {
+		/*
+		 * Message is for the current CPU so current
+		 * task_struct should be used instead of cached
+		 * information.
+		 *
+		 * Like in other diagnostic messages, if issued from
+		 * interrupt context, current will be the interrupted
+		 * task. Unlike other diagnostic messages, this is
+		 * always relevant because the message is about
+		 * interrupting a task.
+		 */
+		ind = ind_counter & 1;
+		if (supp && desc->warned[ind]) {
+			/*
+			 * If supp is true, skip the message if the
+			 * same task was mentioned in the message
+			 * originated on remote CPU, and it did not
+			 * re-enter isolated state since then (warned
+			 * is true). Only local messages following
+			 * remote messages, likely about the same
+			 * isolation breaking event, are skipped to
+			 * avoid duplication. If remote cause is
+			 * immediately followed by a local one before
+			 * isolation is broken, local cause is skipped
+			 * from messages.
+			 */
+			put_cpu();
+			return 0;
+		}
+		task = current;
+		snprintf(buf_prefix, sizeof(buf_prefix),
+			 "isolation %s/%d/%d (cpu %d)",
+			 task->comm, task->tgid, task->pid, cpu);
+		put_cpu();
+	} else {
+		/*
+		 * Message is for remote CPU, use cached information.
+		 */
+		put_cpu();
+		/*
+		 * Make sure, index remained unchanged while data was
+		 * copied. If it changed, data that was copied may be
+		 * inconsistent because two updates in a sequence could
+		 * overwrite the data while it was being read.
+		 */
+		do {
+			/* Make sure we are reading up to date values */
+			smp_mb();
+			ind = ind_counter & 1;
+			snprintf(buf_prefix, sizeof(buf_prefix),
+				 "isolation %s/%d/%d (cpu %d)",
+				 desc->comm[ind], desc->tgid[ind],
+				 desc->pid[ind], cpu);
+			desc->warned[ind] = true;
+			ind_counter_old = ind_counter;
+			/* Record the warned flag, then re-read descriptor */
+			smp_mb();
+			ind_counter = atomic_read(&desc->curr_index);
+			/*
+			 * If the counter changed, something was updated, so
+			 * repeat everything to get the current data
+			 */
+		} while (ind_counter != ind_counter_old);
+	}
+
+	va_start(args, fmt);
+	vsnprintf(buf, sizeof(buf), fmt, args);
+	va_end(args);
+
+	switch (level) {
+	case LOGLEVEL_EMERG:
+		pr_emerg("%s: %s", buf_prefix, buf);
+		break;
+	case LOGLEVEL_ALERT:
+		pr_alert("%s: %s", buf_prefix, buf);
+		break;
+	case LOGLEVEL_CRIT:
+		pr_crit("%s: %s", buf_prefix, buf);
+		break;
+	case LOGLEVEL_ERR:
+		pr_err("%s: %s", buf_prefix, buf);
+		break;
+	case LOGLEVEL_WARNING:
+		pr_warn("%s: %s", buf_prefix, buf);
+		break;
+	case LOGLEVEL_NOTICE:
+		pr_notice("%s: %s", buf_prefix, buf);
+		break;
+	case LOGLEVEL_INFO:
+		pr_info("%s: %s", buf_prefix, buf);
+		break;
+	case LOGLEVEL_DEBUG:
+		pr_debug("%s: %s", buf_prefix, buf);
+		break;
+	default:
+		/* No message without a valid level */
+		return 0;
+	}
+	return 1;
+}
+
+/*
+ * Dump stack if need be. This can be helpful even from the final exit
+ * to usermode code since stack traces sometimes carry information about
+ * what put you into the kernel, e.g. an interrupt number encoded in
+ * the initial entry stack frame that is still visible at exit time.
+ */
+static void debug_dump_stack(void)
+{
+	if (task_isolation_debug)
+		dump_stack();
+}
+
+/*
+ * Set the flags word but don't try to actually start task isolation yet.
+ * We will start it when entering user space in task_isolation_start().
+ */
+int task_isolation_request(unsigned int flags)
+{
+	struct task_struct *task = current;
+
+	/*
+	 * The task isolation flags should always be cleared just by
+	 * virtue of having entered the kernel.
+	 */
+	WARN_ON_ONCE(test_tsk_thread_flag(task, TIF_TASK_ISOLATION));
+	WARN_ON_ONCE(task->task_isolation_flags != 0);
+	WARN_ON_ONCE(task->task_isolation_state != STATE_NORMAL);
+
+	task->task_isolation_flags = flags;
+	if (!(task->task_isolation_flags & PR_TASK_ISOLATION_ENABLE))
+		return 0;
+
+	/* We are trying to enable task isolation. */
+	set_tsk_thread_flag(task, TIF_TASK_ISOLATION);
+
+	/*
+	 * Shut down the vmstat worker so we're not interrupted later.
+	 * We have to try to do this here (with interrupts enabled) since
+	 * we are canceling delayed work and will call flush_work()
+	 * (which enables interrupts) and possibly schedule().
+	 */
+	quiet_vmstat_sync();
+
+	/* We return 0 here but we may change that in task_isolation_start(). */
+	return 0;
+}
+
+/*
+ * Perform actions that should be done immediately on exit from isolation.
+ */
+static void fast_task_isolation_cpu_cleanup(void *info)
+{
+	atomic_dec(&per_cpu(isol_exit_counter, smp_processor_id()));
+	/* At this point breaking isolation from other CPUs is possible again */
+
+	/*
+	 * This task is no longer isolated (and if by any chance this
+	 * is the wrong task, it's already not isolated)
+	 */
+	current->task_isolation_flags = 0;
+	clear_tsk_thread_flag(current, TIF_TASK_ISOLATION);
+
+	/* Run the rest of cleanup later */
+	set_tsk_thread_flag(current, TIF_NOTIFY_RESUME);
+
+	/* Copy flags with task isolation disabled */
+	this_cpu_write(tsk_thread_flags_cache,
+		       READ_ONCE(task_thread_info(current)->flags));
+}
+
+/* Disable task isolation for the specified task. */
+static void stop_isolation(struct task_struct *p)
+{
+	int cpu, this_cpu;
+	unsigned long flags;
+
+	this_cpu = get_cpu();
+	cpu = task_cpu(p);
+	if (atomic_inc_return(&per_cpu(isol_exit_counter, cpu)) > 1) {
+		/* Already exiting isolation */
+		atomic_dec(&per_cpu(isol_exit_counter, cpu));
+		put_cpu();
+		return;
+	}
+
+	if (p == current) {
+		p->task_isolation_state = STATE_NORMAL;
+		fast_task_isolation_cpu_cleanup(NULL);
+		task_isolation_cpu_cleanup();
+		if (atomic_dec_return(&per_cpu(isol_counter, cpu)) < 0) {
+			/* Is not isolated already */
+			atomic_inc(&per_cpu(isol_counter, cpu));
+		}
+		put_cpu();
+	} else {
+		if (atomic_dec_return(&per_cpu(isol_counter, cpu)) < 0) {
+			/* Is not isolated already */
+			atomic_inc(&per_cpu(isol_counter, cpu));
+			atomic_dec(&per_cpu(isol_exit_counter, cpu));
+			put_cpu();
+			return;
+		}
+		/*
+		 * Schedule "slow" cleanup. This relies on
+		 * TIF_NOTIFY_RESUME being set
+		 */
+		spin_lock_irqsave(&task_isolation_cleanup_lock, flags);
+		cpumask_set_cpu(cpu, task_isolation_cleanup_map);
+		spin_unlock_irqrestore(&task_isolation_cleanup_lock, flags);
+		/*
+		 * Setting flags is delegated to the CPU where the
+		 * isolated task is running;
+		 * isol_exit_counter will be decremented from there as well.
+		 */
+		per_cpu(isol_break_csd, cpu).func =
+		    fast_task_isolation_cpu_cleanup;
+		per_cpu(isol_break_csd, cpu).info = NULL;
+		per_cpu(isol_break_csd, cpu).flags = 0;
+		smp_call_function_single_async(cpu,
+					       &per_cpu(isol_break_csd, cpu));
+		put_cpu();
+	}
+}
+
+/*
+ * This code runs with interrupts disabled just before the return to
+ * userspace, after a prctl() has requested enabling task isolation.
+ * We take whatever steps are needed to avoid being interrupted later:
+ * drain the lru pages, stop the scheduler tick, etc.  More
+ * functionality may be added here later to avoid other types of
+ * interrupts from other kernel subsystems.
+ *
+ * If we can't enable task isolation, we update the syscall return
+ * value with an appropriate error.
+ */
+void task_isolation_start(void)
+{
+	int error;
+
+	/*
+	 * We should only be called in STATE_NORMAL (isolation disabled),
+	 * on our way out of the kernel from the prctl() that turned it on.
+	 * If we are exiting from the kernel in another state, it means we
+	 * made it back into the kernel without disabling task isolation,
+	 * and we should investigate how (and in any case disable task
+	 * isolation at this point).  We are clearly not on the path back
+	 * from the prctl() so we don't touch the syscall return value.
+	 */
+	if (WARN_ON_ONCE(current->task_isolation_state != STATE_NORMAL)) {
+		/* Increment counter, this will allow isolation breaking */
+		if (atomic_inc_return(&per_cpu(isol_counter,
+					      smp_processor_id())) > 1) {
+			atomic_dec(&per_cpu(isol_counter, smp_processor_id()));
+		}
+		atomic_inc(&per_cpu(isol_counter, smp_processor_id()));
+		stop_isolation(current);
+		return;
+	}
+
+	/*
+	 * Must be affinitized to a single core with task isolation possible.
+	 * In principle this could be remotely modified between the prctl()
+	 * and the return to userspace, so we have to check it here.
+	 */
+	if (current->nr_cpus_allowed != 1 ||
+	    !is_isolation_cpu(smp_processor_id())) {
+		error = -EINVAL;
+		goto error;
+	}
+
+	/* If the vmstat delayed work is not canceled, we have to try again. */
+	if (!vmstat_idle()) {
+		error = -EAGAIN;
+		goto error;
+	}
+
+	/* Try to stop the dynamic tick. */
+	error = try_stop_full_tick();
+	if (error)
+		goto error;
+
+	/* Drain the pagevecs to avoid unnecessary IPI flushes later. */
+	lru_add_drain();
+
+	/* Increment counter, this will allow isolation breaking */
+	if (atomic_inc_return(&per_cpu(isol_counter,
+				      smp_processor_id())) > 1) {
+		atomic_dec(&per_cpu(isol_counter, smp_processor_id()));
+	}
+
+	/* Record isolated task IDs and name */
+	record_curr_isolated_task();
+
+	/* Copy flags with task isolation enabled */
+	this_cpu_write(tsk_thread_flags_cache,
+		       READ_ONCE(task_thread_info(current)->flags));
+
+	current->task_isolation_state = STATE_ISOLATED;
+	return;
+
+error:
+	/* Increment counter, this will allow isolation breaking */
+	if (atomic_inc_return(&per_cpu(isol_counter,
+				      smp_processor_id())) > 1) {
+		atomic_dec(&per_cpu(isol_counter, smp_processor_id()));
+	}
+	stop_isolation(current);
+	syscall_set_return_value(current, current_pt_regs(), error, 0);
+}
+
+/* Stop task isolation on the remote task and send it a signal. */
+static void send_isolation_signal(struct task_struct *task)
+{
+	int flags = task->task_isolation_flags;
+	kernel_siginfo_t info = {
+		.si_signo = PR_TASK_ISOLATION_GET_SIG(flags) ?: SIGKILL,
+	};
+
+	stop_isolation(task);
+	send_sig_info(info.si_signo, &info, task);
+}
+
+/* Only a few syscalls are valid once we are in task isolation mode. */
+static bool is_acceptable_syscall(int syscall)
+{
+	/* No need to incur an isolation signal if we are just exiting. */
+	if (syscall == __NR_exit || syscall == __NR_exit_group)
+		return true;
+
+	/* Check to see if it's the prctl for isolation. */
+	if (syscall == __NR_prctl) {
+		unsigned long arg[SYSCALL_MAX_ARGS];
+
+		syscall_get_arguments(current, current_pt_regs(), arg);
+		if (arg[0] == PR_TASK_ISOLATION)
+			return true;
+	}
+
+	return false;
+}
+
+/*
+ * This routine is called from syscall entry, prevents most syscalls
+ * from executing, and if needed raises a signal to notify the process.
+ *
+ * Note that we have to stop isolation before we even print a message
+ * here, since otherwise we might end up reporting an interrupt due to
+ * kicking the printk handling code, rather than reporting the true
+ * cause of interrupt here.
+ *
+ * The message is not suppressed by previous remotely triggered
+ * messages.
+ */
+int task_isolation_syscall(int syscall)
+{
+	struct task_struct *task = current;
+
+	if (is_acceptable_syscall(syscall)) {
+		stop_isolation(task);
+		return 0;
+	}
+
+	send_isolation_signal(task);
+
+	pr_task_isol_warn(smp_processor_id(),
+			  "task_isolation lost due to syscall %d\n",
+			  syscall);
+	debug_dump_stack();
+
+	syscall_set_return_value(task, current_pt_regs(), -ERESTARTNOINTR, -1);
+	return -1;
+}
+
+/*
+ * This routine is called from any exception or irq that doesn't
+ * otherwise trigger a signal to the user process (e.g. page fault).
+ *
+ * Messages will be suppressed if there is already a reported remote
+ * cause for isolation breaking, so we don't generate multiple
+ * confusingly similar messages about the same event.
+ */
+void _task_isolation_interrupt(const char *fmt, ...)
+{
+	struct task_struct *task = current;
+	va_list args;
+	char buf[100];
+
+	/* RCU should have been enabled prior to this point. */
+	RCU_LOCKDEP_WARN(!rcu_is_watching(), "kernel entry without RCU");
+
+	/* Are we exiting isolation already? */
+	if (atomic_read(&per_cpu(isol_exit_counter, smp_processor_id())) != 0) {
+		task->task_isolation_state = STATE_NORMAL;
+		return;
+	}
+	/*
+	 * Avoid reporting interrupts that happen after we have prctl'ed
+	 * to enable isolation, but before we have returned to userspace.
+	 */
+	if (task->task_isolation_state == STATE_NORMAL)
+		return;
+
+	va_start(args, fmt);
+	vsnprintf(buf, sizeof(buf), fmt, args);
+	va_end(args);
+
+	/* Handle NMIs minimally, since we can't send a signal. */
+	if (in_nmi()) {
+		pr_task_isol_err(smp_processor_id(),
+				 "isolation: in NMI; not delivering signal\n");
+	} else {
+		send_isolation_signal(task);
+	}
+
+	if (pr_task_isol_warn_supp(smp_processor_id(),
+				   "task_isolation lost due to %s\n", buf))
+		debug_dump_stack();
+}
+
+/*
+ * Called before we wake up a task that has a signal to process.
+ * Needs to be done to handle interrupts that trigger signals, which
+ * we don't catch with task_isolation_interrupt() hooks.
+ *
+ * This message is also suppressed if there was already a remotely
+ * caused message about the same isolation breaking event.
+ */
+void _task_isolation_signal(struct task_struct *task)
+{
+	struct isol_task_desc *desc;
+	int ind, cpu;
+	bool do_warn = (task->task_isolation_state == STATE_ISOLATED);
+
+	cpu = task_cpu(task);
+	desc = &per_cpu(isol_task_descs, cpu);
+	ind = atomic_read(&desc->curr_index) & 1;
+	if (desc->warned[ind])
+		do_warn = false;
+
+	stop_isolation(task);
+
+	if (do_warn) {
+		pr_warn("isolation: %s/%d/%d (cpu %d): task_isolation lost due to signal\n",
+			task->comm, task->tgid, task->pid, cpu);
+		debug_dump_stack();
+	}
+}
+
+/*
+ * Generate a stack backtrace if we are going to interrupt another task
+ * isolation process.
+ */
+void task_isolation_remote(int cpu, const char *fmt, ...)
+{
+	struct task_struct *curr_task;
+	va_list args;
+	char buf[200];
+
+	if (!is_isolation_cpu(cpu) || !task_isolation_on_cpu(cpu))
+		return;
+
+	curr_task = current;
+
+	va_start(args, fmt);
+	vsnprintf(buf, sizeof(buf), fmt, args);
+	va_end(args);
+	if (pr_task_isol_warn(cpu,
+			      "task_isolation lost due to %s by %s/%d/%d on cpu %d\n",
+			      buf,
+			      curr_task->comm, curr_task->tgid,
+			      curr_task->pid, smp_processor_id()))
+		debug_dump_stack();
+}
+
+/*
+ * Generate a stack backtrace if any of the cpus in "mask" are running
+ * task isolation processes.
+ */
+void task_isolation_remote_cpumask(const struct cpumask *mask,
+				   const char *fmt, ...)
+{
+	struct task_struct *curr_task;
+	cpumask_var_t warn_mask;
+	va_list args;
+	char buf[200];
+	int cpu, first_cpu;
+
+	if (task_isolation_map == NULL ||
+		!zalloc_cpumask_var(&warn_mask, GFP_KERNEL))
+		return;
+
+	first_cpu = -1;
+	for_each_cpu_and(cpu, mask, task_isolation_map) {
+		if (task_isolation_on_cpu(cpu)) {
+			if (first_cpu < 0)
+				first_cpu = cpu;
+			else
+				cpumask_set_cpu(cpu, warn_mask);
+		}
+	}
+
+	if (first_cpu < 0)
+		goto done;
+
+	curr_task = current;
+
+	va_start(args, fmt);
+	vsnprintf(buf, sizeof(buf), fmt, args);
+	va_end(args);
+
+	if (cpumask_weight(warn_mask) == 0)
+		pr_task_isol_warn(first_cpu,
+				  "task_isolation lost due to %s by %s/%d/%d on cpu %d\n",
+				  buf, curr_task->comm, curr_task->tgid,
+				  curr_task->pid, smp_processor_id());
+	else
+		pr_task_isol_warn(first_cpu,
+				  " and cpus %*pbl: task_isolation lost due to %s by %s/%d/%d on cpu %d\n",
+				  cpumask_pr_args(warn_mask),
+				  buf, curr_task->comm, curr_task->tgid,
+				  curr_task->pid, smp_processor_id());
+	debug_dump_stack();
+
+done:
+	free_cpumask_var(warn_mask);
+}
+
+/*
+ * Check if the given CPU is running an isolated task.
+ */
+int task_isolation_on_cpu(int cpu)
+{
+	return test_bit(TIF_TASK_ISOLATION,
+			&per_cpu(tsk_thread_flags_cache, cpu));
+}
+
+/*
+ * Set the CPUs currently running isolated tasks in the given CPU mask.
+ */
+void task_isolation_cpumask(struct cpumask *mask)
+{
+	int cpu;
+
+	if (task_isolation_map == NULL)
+		return;
+
+	for_each_cpu(cpu, task_isolation_map)
+		if (task_isolation_on_cpu(cpu))
+			cpumask_set_cpu(cpu, mask);
+}
+
+/*
+ * Clear the CPUs currently running isolated tasks in the given CPU mask.
+ */
+void task_isolation_clear_cpumask(struct cpumask *mask)
+{
+	int cpu;
+
+	if (task_isolation_map == NULL)
+		return;
+
+	for_each_cpu(cpu, task_isolation_map)
+		if (task_isolation_on_cpu(cpu))
+			cpumask_clear_cpu(cpu, mask);
+}
+
+/*
+ * Cleanup procedure. The call to this procedure may be delayed.
+ */
+void task_isolation_cpu_cleanup(void)
+{
+	kick_hrtimer();
+}
+
+/*
+ * Check if cleanup is scheduled on the current CPU, and if so, run it.
+ * Intended to be called from notify_resume() or another such callback
+ * on the target CPU.
+ */
+void task_isolation_check_run_cleanup(void)
+{
+	int cpu;
+	unsigned long flags;
+
+	spin_lock_irqsave(&task_isolation_cleanup_lock, flags);
+
+	cpu = smp_processor_id();
+
+	if (cpumask_test_cpu(cpu, task_isolation_cleanup_map)) {
+		cpumask_clear_cpu(cpu, task_isolation_cleanup_map);
+		spin_unlock_irqrestore(&task_isolation_cleanup_lock, flags);
+		task_isolation_cpu_cleanup();
+	} else
+		spin_unlock_irqrestore(&task_isolation_cleanup_lock, flags);
+}
diff --git a/kernel/signal.c b/kernel/signal.c
index 5b2396350dd1..1df57e38c361 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -46,6 +46,7 @@
 #include <linux/livepatch.h>
 #include <linux/cgroup.h>
 #include <linux/audit.h>
+#include <linux/isolation.h>
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/signal.h>
@@ -758,6 +759,7 @@ static int dequeue_synchronous_signal(kernel_siginfo_t *info)
  */
 void signal_wake_up_state(struct task_struct *t, unsigned int state)
 {
+	task_isolation_signal(t);
 	set_tsk_thread_flag(t, TIF_SIGPENDING);
 	/*
 	 * TASK_WAKEKILL also means wake it up in the stopped/traced/killable
diff --git a/kernel/sys.c b/kernel/sys.c
index f9bc5c303e3f..0a4059a8c4f9 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -42,6 +42,7 @@
 #include <linux/syscore_ops.h>
 #include <linux/version.h>
 #include <linux/ctype.h>
+#include <linux/isolation.h>
 
 #include <linux/compat.h>
 #include <linux/syscalls.h>
@@ -2513,6 +2514,11 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 
 		error = (current->flags & PR_IO_FLUSHER) == PR_IO_FLUSHER;
 		break;
+	case PR_TASK_ISOLATION:
+		if (arg3 || arg4 || arg5)
+			return -EINVAL;
+		error = task_isolation_request(arg2);
+		break;
 	default:
 		error = -EINVAL;
 		break;
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 3a609e7344f3..5bb98f39bde6 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -30,6 +30,7 @@
 #include <linux/syscalls.h>
 #include <linux/interrupt.h>
 #include <linux/tick.h>
+#include <linux/isolation.h>
 #include <linux/err.h>
 #include <linux/debugobjects.h>
 #include <linux/sched/signal.h>
@@ -721,6 +722,19 @@ static void retrigger_next_event(void *arg)
 	raw_spin_unlock(&base->lock);
 }
 
+#ifdef CONFIG_TASK_ISOLATION
+void kick_hrtimer(void)
+{
+	unsigned long flags;
+
+	preempt_disable();
+	local_irq_save(flags);
+	retrigger_next_event(NULL);
+	local_irq_restore(flags);
+	preempt_enable();
+}
+#endif
+
 /*
  * Switch to high resolution mode
  */
@@ -868,8 +882,21 @@ static void hrtimer_reprogram(struct hrtimer *timer, bool reprogram)
 void clock_was_set(void)
 {
 #ifdef CONFIG_HIGH_RES_TIMERS
+#ifdef CONFIG_TASK_ISOLATION
+	struct cpumask mask;
+
+	cpumask_clear(&mask);
+	task_isolation_cpumask(&mask);
+	cpumask_complement(&mask, &mask);
+	/*
+	 * Retrigger the CPU local events everywhere except CPUs
+	 * running isolated tasks.
+	 */
+	on_each_cpu_mask(&mask, retrigger_next_event, NULL, 1);
+#else
 	/* Retrigger the CPU local events everywhere */
 	on_each_cpu(retrigger_next_event, NULL, 1);
+#endif
 #endif
 	timerfd_clock_was_set();
 }
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index a792d21cac64..1d4dec9d3ee7 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -882,6 +882,24 @@ static void tick_nohz_full_update_tick(struct tick_sched *ts)
 #endif
 }
 
+#ifdef CONFIG_TASK_ISOLATION
+int try_stop_full_tick(void)
+{
+	int cpu = smp_processor_id();
+	struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched);
+
+	/* For an unstable clock, we should return a permanent error code. */
+	if (atomic_read(&tick_dep_mask) & TICK_DEP_MASK_CLOCK_UNSTABLE)
+		return -EINVAL;
+
+	if (!can_stop_full_tick(cpu, ts))
+		return -EAGAIN;
+
+	tick_nohz_stop_sched_tick(ts, cpu);
+	return 0;
+}
+#endif
+
 static bool can_stop_idle_tick(int cpu, struct tick_sched *ts)
 {
 	/*
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v2 04/12] task_isolation: Add task isolation hooks to arch-independent code
  2020-03-08  3:42 ` [PATCH v2 00/12] "Task_isolation" mode Alex Belits
                     ` (2 preceding siblings ...)
  2020-03-08  3:47   ` [PATCH v2 03/12] task_isolation: userspace hard isolation from kernel Alex Belits
@ 2020-03-08  3:48   ` Alex Belits
  2020-03-08  3:49   ` [PATCH v2 05/12] task_isolation: arch/x86: enable task isolation functionality Alex Belits
                     ` (8 subsequent siblings)
  12 siblings, 0 replies; 71+ messages in thread
From: Alex Belits @ 2020-03-08  3:48 UTC (permalink / raw)
  To: frederic, rostedt
  Cc: mingo, peterz, linux-kernel, Prasun Kapoor, tglx, linux-api,
	catalin.marinas, linux-arm-kernel, netdev, davem, linux-arch,
	will

From: Chris Metcalf <cmetcalf@mellanox.com>

This commit adds task isolation hooks as follows:

- __handle_domain_irq() generates an isolation warning for the
  local task

- irq_work_queue_on() generates an isolation warning for the remote
  task being interrupted for irq_work

- generic_exec_single() generates a remote isolation warning for
  the remote cpu being IPI'd

- smp_call_function_many() generates a remote isolation warning for
  the set of remote cpus being IPI'd

Calls to task_isolation_remote() or task_isolation_interrupt() can
be placed in the platform-independent code like this when doing so
results in fewer lines of code changes, as for example is true of
the users of the arch_send_call_function_*() APIs. Or, they can be
placed in the per-architecture code when there are many callers,
as for example is true of the smp_send_reschedule() call.

A further cleanup might be to create an intermediate layer, so that
for example smp_send_reschedule() is a single generic function that
just calls arch_smp_send_reschedule(), allowing generic code to be
called every time smp_send_reschedule() is invoked. But for now, we
just update either callers or callees as makes most sense.
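
For reference, the shape shared by all of these hooks is roughly the
sketch below; my_send_ipi() is a made-up placeholder rather than a real
kernel function, and the real call sites (each with its own warning
text) are in the hunks that follow.

  /* Illustration only, not part of the patch. */
  static void my_send_ipi(int cpu)
  {
          /* Warn (with a backtrace) if cpu runs an isolated task... */
          task_isolation_remote(cpu, "example IPI");
          /* ...then deliver the interrupt as before. */
          arch_send_call_function_single_ipi(cpu);
  }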

Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com>
[abelits@marvell.com: adapted for kernel 5.6]
Signed-off-by: Alex Belits <abelits@marvell.com>
---
 kernel/irq/irqdesc.c | 9 +++++++++
 kernel/irq_work.c    | 5 ++++-
 kernel/smp.c         | 6 +++++-
 3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
index 98a5f10d1900..e2b81d035fa1 100644
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -16,6 +16,7 @@
 #include <linux/bitmap.h>
 #include <linux/irqdomain.h>
 #include <linux/sysfs.h>
+#include <linux/isolation.h>
 
 #include "internals.h"
 
@@ -670,6 +671,10 @@ int __handle_domain_irq(struct irq_domain *domain, unsigned int hwirq,
 		irq = irq_find_mapping(domain, hwirq);
 #endif
 
+	task_isolation_interrupt((irq == hwirq) ?
+				 "irq %d (%s)" : "irq %d (%s hwirq %d)",
+				 irq, domain ? domain->name : "", hwirq);
+
 	/*
 	 * Some hardware gives randomly wrong interrupts.  Rather
 	 * than crashing, do something sensible.
@@ -711,6 +716,10 @@ int handle_domain_nmi(struct irq_domain *domain, unsigned int hwirq,
 
 	irq = irq_find_mapping(domain, hwirq);
 
+	task_isolation_interrupt((irq == hwirq) ?
+				 "NMI irq %d (%s)" : "NMI irq %d (%s hwirq %d)",
+				 irq, domain ? domain->name : "", hwirq);
+
 	/*
 	 * ack_bad_irq is not NMI-safe, just report
 	 * an invalid interrupt.
diff --git a/kernel/irq_work.c b/kernel/irq_work.c
index 828cc30774bc..8fd4ece43dd8 100644
--- a/kernel/irq_work.c
+++ b/kernel/irq_work.c
@@ -18,6 +18,7 @@
 #include <linux/cpu.h>
 #include <linux/notifier.h>
 #include <linux/smp.h>
+#include <linux/isolation.h>
 #include <asm/processor.h>
 
 
@@ -102,8 +103,10 @@ bool irq_work_queue_on(struct irq_work *work, int cpu)
 	if (cpu != smp_processor_id()) {
 		/* Arch remote IPI send/receive backend aren't NMI safe */
 		WARN_ON_ONCE(in_nmi());
-		if (llist_add(&work->llnode, &per_cpu(raised_list, cpu)))
+		if (llist_add(&work->llnode, &per_cpu(raised_list, cpu))) {
+			task_isolation_remote(cpu, "irq_work");
 			arch_send_call_function_single_ipi(cpu);
+		}
 	} else {
 		__irq_work_queue_local(work);
 	}
diff --git a/kernel/smp.c b/kernel/smp.c
index d0ada39eb4d4..3a8bcbdd4ce6 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -20,6 +20,7 @@
 #include <linux/sched.h>
 #include <linux/sched/idle.h>
 #include <linux/hypervisor.h>
+#include <linux/isolation.h>
 
 #include "smpboot.h"
 
@@ -176,8 +177,10 @@ static int generic_exec_single(int cpu, call_single_data_t *csd,
 	 * locking and barrier primitives. Generic code isn't really
 	 * equipped to do the right thing...
 	 */
-	if (llist_add(&csd->llist, &per_cpu(call_single_queue, cpu)))
+	if (llist_add(&csd->llist, &per_cpu(call_single_queue, cpu))) {
+		task_isolation_remote(cpu, "IPI function");
 		arch_send_call_function_single_ipi(cpu);
+	}
 
 	return 0;
 }
@@ -466,6 +469,7 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 	}
 
 	/* Send a message to all CPUs in the map */
+	task_isolation_remote_cpumask(cfd->cpumask_ipi, "IPI function");
 	arch_send_call_function_ipi_mask(cfd->cpumask_ipi);
 
 	if (wait) {
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v2 05/12] task_isolation: arch/x86: enable task isolation functionality
  2020-03-08  3:42 ` [PATCH v2 00/12] "Task_isolation" mode Alex Belits
                     ` (3 preceding siblings ...)
  2020-03-08  3:48   ` [PATCH v2 04/12] task_isolation: Add task isolation hooks to arch-independent code Alex Belits
@ 2020-03-08  3:49   ` Alex Belits
  2020-03-08  3:50   ` [PATCH v2 06/12] task_isolation: arch/arm64: " Alex Belits
                     ` (7 subsequent siblings)
  12 siblings, 0 replies; 71+ messages in thread
From: Alex Belits @ 2020-03-08  3:49 UTC (permalink / raw)
  To: frederic, rostedt
  Cc: mingo, peterz, linux-kernel, Prasun Kapoor, tglx, linux-api,
	catalin.marinas, linux-arm-kernel, netdev, davem, linux-arch,
	will

From: Chris Metcalf <cmetcalf@mellanox.com>

In prepare_exit_to_usermode(), run cleanup for tasks exited from
isolation and call task_isolation_start() for tasks with
TIF_TASK_ISOLATION.

In syscall_trace_enter(), add the necessary support for
reporting syscalls for task-isolation processes.

Add a task_isolation_interrupt() call for the kernel exception types
that do not result in signals, namely non-signalling page faults.
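
For context, the userspace flow that these entry hooks serve looks
roughly like the sketch below. It assumes the PR_TASK_ISOLATION_ENABLE
and PR_TASK_ISOLATION_SET_SIG() definitions from the prctl interface
added earlier in this series, and it is only an illustration, not part
of this patch.

  #include <sys/prctl.h>
  #include <signal.h>
  #include <stdio.h>
  #include <stdlib.h>

  int main(void)
  {
          /* Assumed: already pinned to a task-isolation (nohz_full) CPU. */
          if (prctl(PR_TASK_ISOLATION,
                    PR_TASK_ISOLATION_ENABLE |
                    PR_TASK_ISOLATION_SET_SIG(SIGUSR1), 0, 0, 0) < 0) {
                  perror("prctl(PR_TASK_ISOLATION)");
                  exit(1);
          }

          /*
           * From here on, any kernel entry other than exit/exit_group or
           * this prctl ends isolation and delivers the chosen signal
           * (SIGKILL if none was configured).
           */
          for (;;)
                  ;       /* latency-critical loop, no syscalls */
  }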

Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com>
[abelits@marvell.com: adapted for kernel 5.6]
Signed-off-by: Alex Belits <abelits@marvell.com>
---
 arch/x86/Kconfig                   |  1 +
 arch/x86/entry/common.c            | 15 +++++++++++++++
 arch/x86/include/asm/apic.h        |  3 +++
 arch/x86/include/asm/thread_info.h |  4 +++-
 arch/x86/kernel/apic/ipi.c         |  2 ++
 arch/x86/mm/fault.c                |  4 ++++
 6 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index beea77046f9b..9ea6d3e6e77d 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -144,6 +144,7 @@ config X86
 	select HAVE_ARCH_COMPAT_MMAP_BASES	if MMU && COMPAT
 	select HAVE_ARCH_PREL32_RELOCATIONS
 	select HAVE_ARCH_SECCOMP_FILTER
+	select HAVE_ARCH_TASK_ISOLATION
 	select HAVE_ARCH_THREAD_STRUCT_WHITELIST
 	select HAVE_ARCH_STACKLEAK
 	select HAVE_ARCH_TRACEHOOK
diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 9747876980b5..ba8cd75dc7cf 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -26,6 +26,7 @@
 #include <linux/livepatch.h>
 #include <linux/syscalls.h>
 #include <linux/uaccess.h>
+#include <linux/isolation.h>
 
 #include <asm/desc.h>
 #include <asm/traps.h>
@@ -86,6 +87,15 @@ static long syscall_trace_enter(struct pt_regs *regs)
 			return -1L;
 	}
 
+	/*
+	 * In task isolation mode, we may prevent the syscall from
+	 * running, and if so we also deliver a signal to the process.
+	 */
+	if (work & _TIF_TASK_ISOLATION) {
+		if (task_isolation_syscall(regs->orig_ax) == -1)
+			return -1L;
+		work &= ~_TIF_TASK_ISOLATION;
+	}
 #ifdef CONFIG_SECCOMP
 	/*
 	 * Do seccomp after ptrace, to catch any tracer changes.
@@ -189,6 +199,8 @@ __visible inline void prepare_exit_to_usermode(struct pt_regs *regs)
 	lockdep_assert_irqs_disabled();
 	lockdep_sys_exit();
 
+	task_isolation_check_run_cleanup();
+
 	cached_flags = READ_ONCE(ti->flags);
 
 	if (unlikely(cached_flags & EXIT_TO_USERMODE_LOOP_FLAGS))
@@ -204,6 +216,9 @@ __visible inline void prepare_exit_to_usermode(struct pt_regs *regs)
 	if (unlikely(cached_flags & _TIF_NEED_FPU_LOAD))
 		switch_fpu_return();
 
+	if (cached_flags & _TIF_TASK_ISOLATION)
+		task_isolation_start();
+
 #ifdef CONFIG_COMPAT
 	/*
 	 * Compat syscalls set TS_COMPAT.  Make sure we clear it before
diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index 19e94af9cc5d..71149abbb0a0 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -3,6 +3,7 @@
 #define _ASM_X86_APIC_H
 
 #include <linux/cpumask.h>
+#include <linux/isolation.h>
 
 #include <asm/alternative.h>
 #include <asm/cpufeature.h>
@@ -524,6 +525,7 @@ extern void irq_exit(void);
 
 static inline void entering_irq(void)
 {
+	task_isolation_interrupt("irq");
 	irq_enter();
 	kvm_set_cpu_l1tf_flush_l1d();
 }
@@ -536,6 +538,7 @@ static inline void entering_ack_irq(void)
 
 static inline void ipi_entering_ack_irq(void)
 {
+	task_isolation_interrupt("ack irq");
 	irq_enter();
 	ack_APIC_irq();
 	kvm_set_cpu_l1tf_flush_l1d();
diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index cf4327986e98..60d107f784ee 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -92,6 +92,7 @@ struct thread_info {
 #define TIF_NOCPUID		15	/* CPUID is not accessible in userland */
 #define TIF_NOTSC		16	/* TSC is not accessible in userland */
 #define TIF_IA32		17	/* IA32 compatibility process */
+#define TIF_TASK_ISOLATION	18	/* task isolation enabled for task */
 #define TIF_NOHZ		19	/* in adaptive nohz mode */
 #define TIF_MEMDIE		20	/* is terminating due to OOM killer */
 #define TIF_POLLING_NRFLAG	21	/* idle is polling for TIF_NEED_RESCHED */
@@ -122,6 +123,7 @@ struct thread_info {
 #define _TIF_NOCPUID		(1 << TIF_NOCPUID)
 #define _TIF_NOTSC		(1 << TIF_NOTSC)
 #define _TIF_IA32		(1 << TIF_IA32)
+#define _TIF_TASK_ISOLATION	(1 << TIF_TASK_ISOLATION)
 #define _TIF_NOHZ		(1 << TIF_NOHZ)
 #define _TIF_POLLING_NRFLAG	(1 << TIF_POLLING_NRFLAG)
 #define _TIF_IO_BITMAP		(1 << TIF_IO_BITMAP)
@@ -140,7 +142,7 @@ struct thread_info {
 #define _TIF_WORK_SYSCALL_ENTRY	\
 	(_TIF_SYSCALL_TRACE | _TIF_SYSCALL_EMU | _TIF_SYSCALL_AUDIT |	\
 	 _TIF_SECCOMP | _TIF_SYSCALL_TRACEPOINT |	\
-	 _TIF_NOHZ)
+	 _TIF_NOHZ | _TIF_TASK_ISOLATION)
 
 /* flags to check in __switch_to() */
 #define _TIF_WORK_CTXSW_BASE					\
diff --git a/arch/x86/kernel/apic/ipi.c b/arch/x86/kernel/apic/ipi.c
index 6ca0f91372fd..b4dfaad6a440 100644
--- a/arch/x86/kernel/apic/ipi.c
+++ b/arch/x86/kernel/apic/ipi.c
@@ -2,6 +2,7 @@
 
 #include <linux/cpumask.h>
 #include <linux/smp.h>
+#include <linux/isolation.h>
 
 #include "local.h"
 
@@ -67,6 +68,7 @@ void native_smp_send_reschedule(int cpu)
 		WARN(1, "sched: Unexpected reschedule of offline CPU#%d!\n", cpu);
 		return;
 	}
+	task_isolation_remote(cpu, "reschedule IPI");
 	apic->send_IPI(cpu, RESCHEDULE_VECTOR);
 }
 
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index fa4ea09593ab..2175a8655a7d 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -18,6 +18,7 @@
 #include <linux/uaccess.h>		/* faulthandler_disabled()	*/
 #include <linux/efi.h>			/* efi_recover_from_page_fault()*/
 #include <linux/mm_types.h>
+#include <linux/isolation.h>		/* task_isolation_interrupt     */
 
 #include <asm/cpufeature.h>		/* boot_cpu_has, ...		*/
 #include <asm/traps.h>			/* dotraplinkage, ...		*/
@@ -1483,6 +1484,9 @@ void do_user_addr_fault(struct pt_regs *regs,
 		perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, regs, address);
 	}
 
+	/* No signal was generated, but notify task-isolation tasks. */
+	task_isolation_interrupt("page fault at %#lx", address);
+
 	check_v8086_mode(regs, address, tsk);
 }
 NOKPROBE_SYMBOL(do_user_addr_fault);
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v2 06/12] task_isolation: arch/arm64: enable task isolation functionality
  2020-03-08  3:42 ` [PATCH v2 00/12] "Task_isolation" mode Alex Belits
                     ` (4 preceding siblings ...)
  2020-03-08  3:49   ` [PATCH v2 05/12] task_isolation: arch/x86: enable task isolation functionality Alex Belits
@ 2020-03-08  3:50   ` Alex Belits
  2020-03-09 16:59     ` Mark Rutland
  2020-03-08  3:52   ` [PATCH v2 07/12] task_isolation: arch/arm: " Alex Belits
                     ` (6 subsequent siblings)
  12 siblings, 1 reply; 71+ messages in thread
From: Alex Belits @ 2020-03-08  3:50 UTC (permalink / raw)
  To: frederic, rostedt
  Cc: mingo, peterz, linux-kernel, Prasun Kapoor, tglx, linux-api,
	catalin.marinas, linux-arm-kernel, netdev, davem, linux-arch,
	will

From: Chris Metcalf <cmetcalf@mellanox.com>

In do_notify_resume(), call task_isolation_start() for
TIF_TASK_ISOLATION tasks. Add _TIF_TASK_ISOLATION to _TIF_WORK_MASK,
and define a local NOTIFY_RESUME_LOOP_FLAGS to check in the loop,
since we don't clear _TIF_TASK_ISOLATION in the loop.

We instrument the smp_send_reschedule() routine so that it checks for
isolated tasks and generates a suitable warning if needed.

Finally, report on page faults in task-isolation processes in
do_page_fault().

Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com>
[abelits@marvell.com: simplified to match kernel 5.6]
Signed-off-by: Alex Belits <abelits@marvell.com>
---
 arch/arm64/Kconfig                   |  1 +
 arch/arm64/include/asm/thread_info.h |  5 ++++-
 arch/arm64/kernel/ptrace.c           | 10 ++++++++++
 arch/arm64/kernel/signal.c           | 13 ++++++++++++-
 arch/arm64/kernel/smp.c              |  7 +++++++
 arch/arm64/mm/fault.c                |  5 +++++
 6 files changed, 39 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 0b30e884e088..93b6aabc8be9 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -129,6 +129,7 @@ config ARM64
 	select HAVE_ARCH_PREL32_RELOCATIONS
 	select HAVE_ARCH_SECCOMP_FILTER
 	select HAVE_ARCH_STACKLEAK
+	select HAVE_ARCH_TASK_ISOLATION
 	select HAVE_ARCH_THREAD_STRUCT_WHITELIST
 	select HAVE_ARCH_TRACEHOOK
 	select HAVE_ARCH_TRANSPARENT_HUGEPAGE
diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index f0cec4160136..7563098eb5b2 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -63,6 +63,7 @@ void arch_release_task_struct(struct task_struct *tsk);
 #define TIF_FOREIGN_FPSTATE	3	/* CPU's FP state is not current's */
 #define TIF_UPROBE		4	/* uprobe breakpoint or singlestep */
 #define TIF_FSCHECK		5	/* Check FS is USER_DS on return */
+#define TIF_TASK_ISOLATION	6
 #define TIF_NOHZ		7
 #define TIF_SYSCALL_TRACE	8	/* syscall trace active */
 #define TIF_SYSCALL_AUDIT	9	/* syscall auditing */
@@ -83,6 +84,7 @@ void arch_release_task_struct(struct task_struct *tsk);
 #define _TIF_NEED_RESCHED	(1 << TIF_NEED_RESCHED)
 #define _TIF_NOTIFY_RESUME	(1 << TIF_NOTIFY_RESUME)
 #define _TIF_FOREIGN_FPSTATE	(1 << TIF_FOREIGN_FPSTATE)
+#define _TIF_TASK_ISOLATION	(1 << TIF_TASK_ISOLATION)
 #define _TIF_NOHZ		(1 << TIF_NOHZ)
 #define _TIF_SYSCALL_TRACE	(1 << TIF_SYSCALL_TRACE)
 #define _TIF_SYSCALL_AUDIT	(1 << TIF_SYSCALL_AUDIT)
@@ -96,7 +98,8 @@ void arch_release_task_struct(struct task_struct *tsk);
 
 #define _TIF_WORK_MASK		(_TIF_NEED_RESCHED | _TIF_SIGPENDING | \
 				 _TIF_NOTIFY_RESUME | _TIF_FOREIGN_FPSTATE | \
-				 _TIF_UPROBE | _TIF_FSCHECK)
+				 _TIF_UPROBE | _TIF_FSCHECK | \
+				 _TIF_TASK_ISOLATION)
 
 #define _TIF_SYSCALL_WORK	(_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \
 				 _TIF_SYSCALL_TRACEPOINT | _TIF_SECCOMP | \
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index cd6e5fa48b9c..b35b9b0c594c 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -29,6 +29,7 @@
 #include <linux/regset.h>
 #include <linux/tracehook.h>
 #include <linux/elf.h>
+#include <linux/isolation.h>
 
 #include <asm/compat.h>
 #include <asm/cpufeature.h>
@@ -1836,6 +1837,15 @@ int syscall_trace_enter(struct pt_regs *regs)
 			return -1;
 	}
 
+	/*
+	 * In task isolation mode, we may prevent the syscall from
+	 * running, and if so we also deliver a signal to the process.
+	 */
+	if (test_thread_flag(TIF_TASK_ISOLATION)) {
+		if (task_isolation_syscall(regs->syscallno) == -1)
+			return -1;
+	}
+
 	/* Do the secure computing after ptrace; failures should be fast. */
 	if (secure_computing() == -1)
 		return -1;
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 339882db5a91..d488c91a4877 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -20,6 +20,7 @@
 #include <linux/tracehook.h>
 #include <linux/ratelimit.h>
 #include <linux/syscalls.h>
+#include <linux/isolation.h>
 
 #include <asm/daifflags.h>
 #include <asm/debug-monitors.h>
@@ -898,6 +899,11 @@ static void do_signal(struct pt_regs *regs)
 	restore_saved_sigmask();
 }
 
+#define NOTIFY_RESUME_LOOP_FLAGS \
+	(_TIF_NEED_RESCHED | _TIF_SIGPENDING | \
+	_TIF_NOTIFY_RESUME | _TIF_FOREIGN_FPSTATE | \
+	_TIF_UPROBE | _TIF_FSCHECK)
+
 asmlinkage void do_notify_resume(struct pt_regs *regs,
 				 unsigned long thread_flags)
 {
@@ -908,6 +914,8 @@ asmlinkage void do_notify_resume(struct pt_regs *regs,
 	 */
 	trace_hardirqs_off();
 
+	task_isolation_check_run_cleanup();
+
 	do {
 		/* Check valid user FS if needed */
 		addr_limit_user_check();
@@ -938,7 +946,10 @@ asmlinkage void do_notify_resume(struct pt_regs *regs,
 
 		local_daif_mask();
 		thread_flags = READ_ONCE(current_thread_info()->flags);
-	} while (thread_flags & _TIF_WORK_MASK);
+	} while (thread_flags & NOTIFY_RESUME_LOOP_FLAGS);
+
+	if (thread_flags & _TIF_TASK_ISOLATION)
+		task_isolation_start();
 }
 
 unsigned long __ro_after_init signal_minsigstksz;
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index d4ed9a19d8fe..00f0f77adea0 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -32,6 +32,7 @@
 #include <linux/irq_work.h>
 #include <linux/kexec.h>
 #include <linux/kvm_host.h>
+#include <linux/isolation.h>
 
 #include <asm/alternative.h>
 #include <asm/atomic.h>
@@ -818,6 +819,7 @@ void arch_send_call_function_single_ipi(int cpu)
 #ifdef CONFIG_ARM64_ACPI_PARKING_PROTOCOL
 void arch_send_wakeup_ipi_mask(const struct cpumask *mask)
 {
+	task_isolation_remote_cpumask(mask, "wakeup IPI");
 	smp_cross_call(mask, IPI_WAKEUP);
 }
 #endif
@@ -886,6 +888,9 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
 		__inc_irq_stat(cpu, ipi_irqs[ipinr]);
 	}
 
+	task_isolation_interrupt("IPI type %d (%s)", ipinr,
+				 ipinr < NR_IPI ? ipi_types[ipinr] : "unknown");
+
 	switch (ipinr) {
 	case IPI_RESCHEDULE:
 		scheduler_ipi();
@@ -948,12 +953,14 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
 
 void smp_send_reschedule(int cpu)
 {
+	task_isolation_remote(cpu, "reschedule IPI");
 	smp_cross_call(cpumask_of(cpu), IPI_RESCHEDULE);
 }
 
 #ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST
 void tick_broadcast(const struct cpumask *mask)
 {
+	task_isolation_remote_cpumask(mask, "timer IPI");
 	smp_cross_call(mask, IPI_TIMER);
 }
 #endif
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 85566d32958f..fc4b42c81c4f 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -23,6 +23,7 @@
 #include <linux/perf_event.h>
 #include <linux/preempt.h>
 #include <linux/hugetlb.h>
+#include <linux/isolation.h>
 
 #include <asm/acpi.h>
 #include <asm/bug.h>
@@ -543,6 +544,10 @@ static int __kprobes do_page_fault(unsigned long addr, unsigned int esr,
 	 */
 	if (likely(!(fault & (VM_FAULT_ERROR | VM_FAULT_BADMAP |
 			      VM_FAULT_BADACCESS)))) {
+		/* No signal was generated, but notify task-isolation tasks. */
+		if (user_mode(regs))
+			task_isolation_interrupt("page fault at %#lx", addr);
+
 		/*
 		 * Major/minor page fault accounting is only done
 		 * once. If we go through a retry, it is extremely
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v2 07/12] task_isolation: arch/arm: enable task isolation functionality
  2020-03-08  3:42 ` [PATCH v2 00/12] "Task_isolation" mode Alex Belits
                     ` (5 preceding siblings ...)
  2020-03-08  3:50   ` [PATCH v2 06/12] task_isolation: arch/arm64: " Alex Belits
@ 2020-03-08  3:52   ` Alex Belits
  2020-03-08  3:53   ` [PATCH v2 08/12] task_isolation: don't interrupt CPUs with tick_nohz_full_kick_cpu() Alex Belits
                     ` (5 subsequent siblings)
  12 siblings, 0 replies; 71+ messages in thread
From: Alex Belits @ 2020-03-08  3:52 UTC (permalink / raw)
  To: frederic, rostedt
  Cc: mingo, peterz, linux-kernel, Prasun Kapoor, tglx, linux-api,
	catalin.marinas, linux-arm-kernel, netdev, davem, linux-arch,
	will

From: Francis Giraldeau <francis.giraldeau@gmail.com>

This patch is a port of the task isolation functionality to the arm 32-bit
architecture. Task isolation needs an additional thread flag, which requires
changing the entry assembly code to accept a flag mask wider than one byte.
The constants _TIF_SYSCALL_WORK and _TIF_WORK_MASK are therefore now loaded
from the literal pool. The rest of the patch is straightforward and reflects
what is done on other architectures.

To avoid problems with the tst instruction in the v7m build, we renumber
TIF_SECCOMP to bit 8 and let TIF_TASK_ISOLATION use bit 7.

Signed-off-by: Francis Giraldeau <francis.giraldeau@gmail.com>
Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com> [with modifications]
[abelits@marvell.com: modified for kernel 5.6, added isolation cleanup]
Signed-off-by: Alex Belits <abelits@marvell.com>
---
 arch/arm/Kconfig                   |  1 +
 arch/arm/include/asm/thread_info.h | 10 +++++++---
 arch/arm/kernel/entry-common.S     | 15 ++++++++++-----
 arch/arm/kernel/ptrace.c           | 10 ++++++++++
 arch/arm/kernel/signal.c           | 13 ++++++++++++-
 arch/arm/kernel/smp.c              |  4 ++++
 arch/arm/mm/fault.c                |  8 +++++++-
 7 files changed, 51 insertions(+), 10 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 97864aabc2a6..1a66e6c6807c 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -67,6 +67,7 @@ config ARM
 	select HAVE_ARCH_KGDB if !CPU_ENDIAN_BE32 && MMU
 	select HAVE_ARCH_MMAP_RND_BITS if MMU
 	select HAVE_ARCH_SECCOMP_FILTER if AEABI && !OABI_COMPAT
+	select HAVE_ARCH_TASK_ISOLATION
 	select HAVE_ARCH_THREAD_STRUCT_WHITELIST
 	select HAVE_ARCH_TRACEHOOK
 	select HAVE_ARM_SMCCC if CPU_V7
diff --git a/arch/arm/include/asm/thread_info.h b/arch/arm/include/asm/thread_info.h
index 0d0d5178e2c3..ec3c2084c391 100644
--- a/arch/arm/include/asm/thread_info.h
+++ b/arch/arm/include/asm/thread_info.h
@@ -139,7 +139,8 @@ extern int vfp_restore_user_hwstate(struct user_vfp *,
 #define TIF_SYSCALL_TRACE	4	/* syscall trace active */
 #define TIF_SYSCALL_AUDIT	5	/* syscall auditing active */
 #define TIF_SYSCALL_TRACEPOINT	6	/* syscall tracepoint instrumentation */
-#define TIF_SECCOMP		7	/* seccomp syscall filtering active */
+#define TIF_TASK_ISOLATION	7	/* task isolation active */
+#define TIF_SECCOMP		8	/* seccomp syscall filtering active */
 
 #define TIF_NOHZ		12	/* in adaptive nohz mode */
 #define TIF_USING_IWMMXT	17
@@ -153,18 +154,21 @@ extern int vfp_restore_user_hwstate(struct user_vfp *,
 #define _TIF_SYSCALL_TRACE	(1 << TIF_SYSCALL_TRACE)
 #define _TIF_SYSCALL_AUDIT	(1 << TIF_SYSCALL_AUDIT)
 #define _TIF_SYSCALL_TRACEPOINT	(1 << TIF_SYSCALL_TRACEPOINT)
+#define _TIF_TASK_ISOLATION	(1 << TIF_TASK_ISOLATION)
 #define _TIF_SECCOMP		(1 << TIF_SECCOMP)
 #define _TIF_USING_IWMMXT	(1 << TIF_USING_IWMMXT)
 
 /* Checks for any syscall work in entry-common.S */
 #define _TIF_SYSCALL_WORK (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \
-			   _TIF_SYSCALL_TRACEPOINT | _TIF_SECCOMP)
+			   _TIF_SYSCALL_TRACEPOINT | _TIF_SECCOMP | \
+			   _TIF_TASK_ISOLATION)
 
 /*
  * Change these and you break ASM code in entry-common.S
  */
 #define _TIF_WORK_MASK		(_TIF_NEED_RESCHED | _TIF_SIGPENDING | \
-				 _TIF_NOTIFY_RESUME | _TIF_UPROBE)
+				 _TIF_NOTIFY_RESUME | _TIF_UPROBE | \
+				 _TIF_TASK_ISOLATION)
 
 #endif /* __KERNEL__ */
 #endif /* __ASM_ARM_THREAD_INFO_H */
diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
index 271cb8a1eba1..6ceb5cb808a9 100644
--- a/arch/arm/kernel/entry-common.S
+++ b/arch/arm/kernel/entry-common.S
@@ -53,7 +53,8 @@ __ret_fast_syscall:
 	cmp	r2, #TASK_SIZE
 	blne	addr_limit_check_failed
 	ldr	r1, [tsk, #TI_FLAGS]		@ re-check for syscall tracing
-	tst	r1, #_TIF_SYSCALL_WORK | _TIF_WORK_MASK
+	ldr	r2, =_TIF_SYSCALL_WORK | _TIF_WORK_MASK
+	tst	r1, r2
 	bne	fast_work_pending
 
 
@@ -90,7 +91,8 @@ __ret_fast_syscall:
 	cmp	r2, #TASK_SIZE
 	blne	addr_limit_check_failed
 	ldr	r1, [tsk, #TI_FLAGS]		@ re-check for syscall tracing
-	tst	r1, #_TIF_SYSCALL_WORK | _TIF_WORK_MASK
+	ldr	r2, =_TIF_SYSCALL_WORK | _TIF_WORK_MASK
+	tst	r1, r2
 	beq	no_work_pending
  UNWIND(.fnend		)
 ENDPROC(ret_fast_syscall)
@@ -98,7 +100,8 @@ ENDPROC(ret_fast_syscall)
 	/* Slower path - fall through to work_pending */
 #endif
 
-	tst	r1, #_TIF_SYSCALL_WORK
+	ldr	r2, =_TIF_SYSCALL_WORK
+	tst	r1, r2
 	bne	__sys_trace_return_nosave
 slow_work_pending:
 	mov	r0, sp				@ 'regs'
@@ -131,7 +134,8 @@ ENTRY(ret_to_user_from_irq)
 	cmp	r2, #TASK_SIZE
 	blne	addr_limit_check_failed
 	ldr	r1, [tsk, #TI_FLAGS]
-	tst	r1, #_TIF_WORK_MASK
+	ldr	r2, =_TIF_WORK_MASK
+	tst	r1, r2
 	bne	slow_work_pending
 no_work_pending:
 	asm_trace_hardirqs_on save = 0
@@ -251,7 +255,8 @@ local_restart:
 	ldr	r10, [tsk, #TI_FLAGS]		@ check for syscall tracing
 	stmdb	sp!, {r4, r5}			@ push fifth and sixth args
 
-	tst	r10, #_TIF_SYSCALL_WORK		@ are we tracing syscalls?
+	ldr	r11, =_TIF_SYSCALL_WORK		@ are we tracing syscalls?
+	tst	r10, r11
 	bne	__sys_trace
 
 	invoke_syscall tbl, scno, r10, __ret_fast_syscall
diff --git a/arch/arm/kernel/ptrace.c b/arch/arm/kernel/ptrace.c
index b606cded90cd..a69b0bfd71ae 100644
--- a/arch/arm/kernel/ptrace.c
+++ b/arch/arm/kernel/ptrace.c
@@ -24,6 +24,7 @@
 #include <linux/audit.h>
 #include <linux/tracehook.h>
 #include <linux/unistd.h>
+#include <linux/isolation.h>
 
 #include <asm/pgtable.h>
 #include <asm/traps.h>
@@ -921,6 +922,15 @@ asmlinkage int syscall_trace_enter(struct pt_regs *regs, int scno)
 	if (test_thread_flag(TIF_SYSCALL_TRACE))
 		tracehook_report_syscall(regs, PTRACE_SYSCALL_ENTER);
 
+	/*
+	 * In task isolation mode, we may prevent the syscall from
+	 * running, and if so we also deliver a signal to the process.
+	 */
+	if (test_thread_flag(TIF_TASK_ISOLATION)) {
+		if (task_isolation_syscall(scno) == -1)
+			return -1;
+	}
+
 	/* Do seccomp after ptrace; syscall may have changed. */
 #ifdef CONFIG_HAVE_ARCH_SECCOMP_FILTER
 	if (secure_computing() == -1)
diff --git a/arch/arm/kernel/signal.c b/arch/arm/kernel/signal.c
index ab2568996ddb..29ccef8403cd 100644
--- a/arch/arm/kernel/signal.c
+++ b/arch/arm/kernel/signal.c
@@ -12,6 +12,7 @@
 #include <linux/tracehook.h>
 #include <linux/uprobes.h>
 #include <linux/syscalls.h>
+#include <linux/isolation.h>
 
 #include <asm/elf.h>
 #include <asm/cacheflush.h>
@@ -639,6 +640,9 @@ static int do_signal(struct pt_regs *regs, int syscall)
 	return 0;
 }
 
+#define WORK_PENDING_LOOP_FLAGS	(_TIF_NEED_RESCHED | _TIF_SIGPENDING |	\
+				 _TIF_NOTIFY_RESUME | _TIF_UPROBE)
+
 asmlinkage int
 do_work_pending(struct pt_regs *regs, unsigned int thread_flags, int syscall)
 {
@@ -648,6 +652,9 @@ do_work_pending(struct pt_regs *regs, unsigned int thread_flags, int syscall)
 	 * Update the trace code with the current status.
 	 */
 	trace_hardirqs_off();
+
+	task_isolation_check_run_cleanup();
+
 	do {
 		if (likely(thread_flags & _TIF_NEED_RESCHED)) {
 			schedule();
@@ -676,7 +683,11 @@ do_work_pending(struct pt_regs *regs, unsigned int thread_flags, int syscall)
 		}
 		local_irq_disable();
 		thread_flags = current_thread_info()->flags;
-	} while (thread_flags & _TIF_WORK_MASK);
+	} while (thread_flags & WORK_PENDING_LOOP_FLAGS);
+
+	if (thread_flags & _TIF_TASK_ISOLATION)
+		task_isolation_start();
+
 	return 0;
 }
 
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 46e1be9e57a8..95f19b980776 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -26,6 +26,7 @@
 #include <linux/completion.h>
 #include <linux/cpufreq.h>
 #include <linux/irq_work.h>
+#include <linux/isolation.h>
 
 #include <linux/atomic.h>
 #include <asm/bugs.h>
@@ -560,6 +561,7 @@ void arch_send_call_function_ipi_mask(const struct cpumask *mask)
 
 void arch_send_wakeup_ipi_mask(const struct cpumask *mask)
 {
+	task_isolation_remote_cpumask(mask, "wakeup IPI");
 	smp_cross_call(mask, IPI_WAKEUP);
 }
 
@@ -579,6 +581,7 @@ void arch_irq_work_raise(void)
 #ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST
 void tick_broadcast(const struct cpumask *mask)
 {
+	task_isolation_remote_cpumask(mask, "timer IPI");
 	smp_cross_call(mask, IPI_TIMER);
 }
 #endif
@@ -702,6 +705,7 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
 
 void smp_send_reschedule(int cpu)
 {
+	task_isolation_remote(cpu, "reschedule IPI");
 	smp_cross_call(cpumask_of(cpu), IPI_RESCHEDULE);
 }
 
diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
index bd0f4821f7e1..acd11a69c4e4 100644
--- a/arch/arm/mm/fault.c
+++ b/arch/arm/mm/fault.c
@@ -17,6 +17,7 @@
 #include <linux/sched/debug.h>
 #include <linux/highmem.h>
 #include <linux/perf_event.h>
+#include <linux/isolation.h>
 
 #include <asm/pgtable.h>
 #include <asm/system_misc.h>
@@ -332,8 +333,13 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 	/*
 	 * Handle the "normal" case first - VM_FAULT_MAJOR
 	 */
-	if (likely(!(fault & (VM_FAULT_ERROR | VM_FAULT_BADMAP | VM_FAULT_BADACCESS))))
+	if (likely(!(fault & (VM_FAULT_ERROR | VM_FAULT_BADMAP |
+			      VM_FAULT_BADACCESS)))) {
+		/* No signal was generated, but notify task-isolation tasks. */
+		if (user_mode(regs))
+			task_isolation_interrupt("page fault at %#lx", addr);
 		return 0;
+	}
 
 	/*
 	 * If we are in kernel mode at this point, we
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v2 08/12] task_isolation: don't interrupt CPUs with tick_nohz_full_kick_cpu()
  2020-03-08  3:42 ` [PATCH v2 00/12] "Task_isolation" mode Alex Belits
                     ` (6 preceding siblings ...)
  2020-03-08  3:52   ` [PATCH v2 07/12] task_isolation: arch/arm: " Alex Belits
@ 2020-03-08  3:53   ` Alex Belits
  2020-03-08  3:54   ` [PATCH v2 09/12] task_isolation: net: don't flush backlog on CPUs running isolated tasks Alex Belits
                     ` (4 subsequent siblings)
  12 siblings, 0 replies; 71+ messages in thread
From: Alex Belits @ 2020-03-08  3:53 UTC (permalink / raw)
  To: frederic, rostedt
  Cc: mingo, peterz, linux-kernel, Prasun Kapoor, tglx, linux-api,
	catalin.marinas, linux-arm-kernel, netdev, davem, linux-arch,
	will

From: Yuri Norov <ynorov@marvell.com>

For nohz_full CPUs the desired behavior is to receive the interrupts
generated by tick_nohz_full_kick_cpu(). For hard isolation, however,
this is not desirable because it breaks isolation.

This patch adds a check for it.

Signed-off-by: Yuri Norov <ynorov@marvell.com>
[abelits@marvell.com: updated, only exclude CPUs running isolated tasks]
Signed-off-by: Alex Belits <abelits@marvell.com>
---
 kernel/time/tick-sched.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 1d4dec9d3ee7..fe4503ba1316 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -20,6 +20,7 @@
 #include <linux/sched/clock.h>
 #include <linux/sched/stat.h>
 #include <linux/sched/nohz.h>
+#include <linux/isolation.h>
 #include <linux/module.h>
 #include <linux/irq_work.h>
 #include <linux/posix-timers.h>
@@ -262,7 +263,7 @@ static void tick_nohz_full_kick(void)
  */
 void tick_nohz_full_kick_cpu(int cpu)
 {
-	if (!tick_nohz_full_cpu(cpu))
+	if (!tick_nohz_full_cpu(cpu) || task_isolation_on_cpu(cpu))
 		return;
 
 	irq_work_queue_on(&per_cpu(nohz_full_kick_work, cpu), cpu);
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v2 09/12] task_isolation: net: don't flush backlog on CPUs running isolated tasks
  2020-03-08  3:42 ` [PATCH v2 00/12] "Task_isolation" mode Alex Belits
                     ` (7 preceding siblings ...)
  2020-03-08  3:53   ` [PATCH v2 08/12] task_isolation: don't interrupt CPUs with tick_nohz_full_kick_cpu() Alex Belits
@ 2020-03-08  3:54   ` Alex Belits
  2020-03-08  3:55   ` [PATCH v2 10/12] task_isolation: ringbuffer: don't interrupt CPUs running isolated tasks on buffer resize Alex Belits
                     ` (3 subsequent siblings)
  12 siblings, 0 replies; 71+ messages in thread
From: Alex Belits @ 2020-03-08  3:54 UTC (permalink / raw)
  To: frederic, rostedt
  Cc: mingo, peterz, linux-kernel, Prasun Kapoor, tglx, linux-api,
	catalin.marinas, linux-arm-kernel, netdev, davem, linux-arch,
	will

From: Yuri Norov <ynorov@marvell.com>

If a CPU runs an isolated task, there is no backlog on it, so we
don't need to flush it. Currently flush_all_backlogs() enqueues the
corresponding work on all CPUs, including the ones that run isolated
tasks. This breaks task isolation for nothing.

In this patch, backlog flushing is enqueued only on non-isolated CPUs.

Signed-off-by: Yuri Norov <ynorov@marvell.com>
[abelits@marvell.com: use safe task_isolation_on_cpu() implementation]
Signed-off-by: Alex Belits <abelits@marvell.com>
---
 net/core/dev.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index c6c985fe7b1b..6d32abb1f06d 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -74,6 +74,7 @@
 #include <linux/cpu.h>
 #include <linux/types.h>
 #include <linux/kernel.h>
+#include <linux/isolation.h>
 #include <linux/hash.h>
 #include <linux/slab.h>
 #include <linux/sched.h>
@@ -5518,9 +5519,12 @@ static void flush_all_backlogs(void)
 
 	get_online_cpus();
 
-	for_each_online_cpu(cpu)
+	for_each_online_cpu(cpu) {
+		if (task_isolation_on_cpu(cpu))
+			continue;
 		queue_work_on(cpu, system_highpri_wq,
 			      per_cpu_ptr(&flush_works, cpu));
+	}
 
 	for_each_online_cpu(cpu)
 		flush_work(per_cpu_ptr(&flush_works, cpu));
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v2 10/12] task_isolation: ringbuffer: don't interrupt CPUs running isolated tasks on buffer resize
  2020-03-08  3:42 ` [PATCH v2 00/12] "Task_isolation" mode Alex Belits
                     ` (8 preceding siblings ...)
  2020-03-08  3:54   ` [PATCH v2 09/12] task_isolation: net: don't flush backlog on CPUs running isolated tasks Alex Belits
@ 2020-03-08  3:55   ` Alex Belits
  2020-04-06  4:27     ` Kevyn-Alexandre Paré
  2020-03-08  3:56   ` [PATCH v2 11/12] task_isolation: kick_all_cpus_sync: don't kick isolated cpus Alex Belits
                     ` (2 subsequent siblings)
  12 siblings, 1 reply; 71+ messages in thread
From: Alex Belits @ 2020-03-08  3:55 UTC (permalink / raw)
  To: frederic, rostedt
  Cc: mingo, peterz, linux-kernel, Prasun Kapoor, tglx, linux-api,
	catalin.marinas, linux-arm-kernel, netdev, davem, linux-arch,
	will

From: Yuri Norov <ynorov@marvell.com>

CPUs running isolated tasks are in userspace, so they don't have to
perform ring buffer updates immediately. If ring_buffer_resize()
schedules the update on those CPUs, isolation is broken. To prevent
that, updates for CPUs running isolated tasks are performed locally,
like for offline CPUs.

A race between this update and isolation breaking is avoided at the
cost of disabling per-CPU buffer writing for the duration of the update
when it coincides with isolation breaking.
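
Spelled out, the ordering relied on here is: disable recording, order
that against a re-check of the isolation state, and only then update
the pages locally. A minimal restatement (the real code is
update_if_isolated() in the diff below; the writer-side comments are an
interpretation, not code from this patch):

  atomic_inc(&cpu_buffer->record_disabled);
  smp_mb();       /* make the increment visible before the re-check */
  if (task_isolation_on_cpu(cpu)) {
          /*
           * The CPU was still in userspace after writers were disabled,
           * so if it breaks isolation now, its writers will see
           * record_disabled != 0 and leave the buffer alone while the
           * update runs locally.
           */
          rb_update_pages(cpu_buffer);
  }
  atomic_dec(&cpu_buffer->record_disabled);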

Signed-off-by: Yuri Norov <ynorov@marvell.com>
[abelits@marvell.com: updated to prevent race with isolation breaking]
Signed-off-by: Alex Belits <abelits@marvell.com>
---
 kernel/trace/ring_buffer.c | 62 ++++++++++++++++++++++++++++++++++----
 1 file changed, 56 insertions(+), 6 deletions(-)

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 61f0e92ace99..593effe40183 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -21,6 +21,7 @@
 #include <linux/delay.h>
 #include <linux/slab.h>
 #include <linux/init.h>
+#include <linux/isolation.h>
 #include <linux/hash.h>
 #include <linux/list.h>
 #include <linux/cpu.h>
@@ -1701,6 +1702,37 @@ static void update_pages_handler(struct work_struct *work)
 	complete(&cpu_buffer->update_done);
 }
 
+static bool update_if_isolated(struct ring_buffer_per_cpu *cpu_buffer,
+			       int cpu)
+{
+	bool rv = false;
+
+	if (task_isolation_on_cpu(cpu)) {
+		/*
+		 * CPU is running isolated task. Since it may lose
+		 * isolation and re-enter kernel simultaneously with
+		 * this update, disable recording until it's done.
+		 */
+		atomic_inc(&cpu_buffer->record_disabled);
+		/* Make sure, update is done, and isolation state is current */
+		smp_mb();
+		if (task_isolation_on_cpu(cpu)) {
+			/*
+			 * If CPU is still running isolated task, we
+			 * can be sure that breaking isolation will
+			 * happen while recording is disabled, and CPU
+			 * will not touch this buffer until the update
+			 * is done.
+			 */
+			rb_update_pages(cpu_buffer);
+			cpu_buffer->nr_pages_to_update = 0;
+			rv = true;
+		}
+		atomic_dec(&cpu_buffer->record_disabled);
+	}
+	return rv;
+}
+
 /**
  * ring_buffer_resize - resize the ring buffer
  * @buffer: the buffer to resize.
@@ -1784,13 +1816,22 @@ int ring_buffer_resize(struct trace_buffer *buffer, unsigned long size,
 			if (!cpu_buffer->nr_pages_to_update)
 				continue;
 
-			/* Can't run something on an offline CPU. */
+			/*
+			 * Can't run something on an offline CPU.
+			 *
+			 * CPUs running isolated tasks don't have to
+			 * update ring buffers until they exit
+			 * isolation because they are in
+			 * userspace. Use the procedure that prevents
+			 * race condition with isolation breaking.
+			 */
 			if (!cpu_online(cpu)) {
 				rb_update_pages(cpu_buffer);
 				cpu_buffer->nr_pages_to_update = 0;
 			} else {
-				schedule_work_on(cpu,
-						&cpu_buffer->update_pages_work);
+				if (!update_if_isolated(cpu_buffer, cpu))
+					schedule_work_on(cpu,
+					&cpu_buffer->update_pages_work);
 			}
 		}
 
@@ -1829,13 +1870,22 @@ int ring_buffer_resize(struct trace_buffer *buffer, unsigned long size,
 
 		get_online_cpus();
 
-		/* Can't run something on an offline CPU. */
+		/*
+		 * Can't run something on an offline CPU.
+		 *
+		 * CPUs running isolated tasks don't have to update
+		 * ring buffers until they exit isolation because they
+		 * are in userspace. Use the procedure that prevents
+		 * race condition with isolation breaking.
+		 */
 		if (!cpu_online(cpu_id))
 			rb_update_pages(cpu_buffer);
 		else {
-			schedule_work_on(cpu_id,
+			if (!update_if_isolated(cpu_buffer, cpu_id)) {
+				schedule_work_on(cpu_id,
 					 &cpu_buffer->update_pages_work);
-			wait_for_completion(&cpu_buffer->update_done);
+				wait_for_completion(&cpu_buffer->update_done);
+			}
 		}
 
 		cpu_buffer->nr_pages_to_update = 0;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v2 11/12] task_isolation: kick_all_cpus_sync: don't kick isolated cpus
  2020-03-08  3:42 ` [PATCH v2 00/12] "Task_isolation" mode Alex Belits
                     ` (9 preceding siblings ...)
  2020-03-08  3:55   ` [PATCH v2 10/12] task_isolation: ringbuffer: don't interrupt CPUs running isolated tasks on buffer resize Alex Belits
@ 2020-03-08  3:56   ` Alex Belits
  2020-03-08  3:57   ` [PATCH v2 12/12] task_isolation: CONFIG_TASK_ISOLATION prevents distribution of jobs to non-housekeeping CPUs Alex Belits
  2020-04-09 15:09   ` [PATCH v3 00/13] "Task_isolation" mode Alex Belits
  12 siblings, 0 replies; 71+ messages in thread
From: Alex Belits @ 2020-03-08  3:56 UTC (permalink / raw)
  To: frederic, rostedt
  Cc: mingo, peterz, linux-kernel, Prasun Kapoor, tglx, linux-api,
	catalin.marinas, linux-arm-kernel, netdev, davem, linux-arch,
	will

From: Yuri Norov <ynorov@marvell.com>

Make sure that kick_all_cpus_sync() does not call CPUs that are running
isolated tasks.

Signed-off-by: Yuri Norov <ynorov@marvell.com>
[abelits@marvell.com: use safe task_isolation_cpumask() implementation]
Signed-off-by: Alex Belits <abelits@marvell.com>
---
 kernel/smp.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index 3a8bcbdd4ce6..d9b4b2fedfed 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -731,9 +731,21 @@ static void do_nothing(void *unused)
  */
 void kick_all_cpus_sync(void)
 {
+	struct cpumask mask;
+
 	/* Make sure the change is visible before we kick the cpus */
 	smp_mb();
-	smp_call_function(do_nothing, NULL, 1);
+
+	preempt_disable();
+#ifdef CONFIG_TASK_ISOLATION
+	cpumask_clear(&mask);
+	task_isolation_cpumask(&mask);
+	cpumask_complement(&mask, &mask);
+#else
+	cpumask_setall(&mask);
+#endif
+	smp_call_function_many(&mask, do_nothing, NULL, 1);
+	preempt_enable();
 }
 EXPORT_SYMBOL_GPL(kick_all_cpus_sync);
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v2 12/12] task_isolation: CONFIG_TASK_ISOLATION prevents distribution of jobs to non-housekeeping CPUs
  2020-03-08  3:42 ` [PATCH v2 00/12] "Task_isolation" mode Alex Belits
                     ` (10 preceding siblings ...)
  2020-03-08  3:56   ` [PATCH v2 11/12] task_isolation: kick_all_cpus_sync: don't kick isolated cpus Alex Belits
@ 2020-03-08  3:57   ` Alex Belits
  2020-04-09 15:09   ` [PATCH v3 00/13] "Task_isolation" mode Alex Belits
  12 siblings, 0 replies; 71+ messages in thread
From: Alex Belits @ 2020-03-08  3:57 UTC (permalink / raw)
  To: frederic, rostedt
  Cc: mingo, peterz, linux-kernel, Prasun Kapoor, tglx, linux-api,
	catalin.marinas, linux-arm-kernel, netdev, davem, linux-arch,
	will

There are various mechanisms, other than regular workqueue selection,
that select CPUs for jobs. CPU isolation normally does not prevent
those jobs from running on isolated CPUs. When task isolation is
enabled, such jobs should be limited to housekeeping CPUs.
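
As an example of the cpumask_local_spread() change: a typical driver
uses it to spread queue interrupts across CPUs, roughly as in the
sketch below (my_irqs[], nvecs, i and dev are placeholders, not code
from this patch). With this change the returned CPUs come only from the
housekeeping set when CONFIG_TASK_ISOLATION is enabled, so such drivers
stop targeting isolated CPUs without needing any driver changes.

  /* Illustration only. */
  for (i = 0; i < nvecs; i++) {
          unsigned int cpu = cpumask_local_spread(i, dev_to_node(dev));

          irq_set_affinity_hint(my_irqs[i], cpumask_of(cpu));
  }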

Signed-off-by: Alex Belits <abelits@marvell.com>
---
 drivers/pci/pci-driver.c |  9 +++++++
 lib/cpumask.c            | 53 +++++++++++++++++++++++++---------------
 net/core/net-sysfs.c     |  9 +++++++
 3 files changed, 51 insertions(+), 20 deletions(-)

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 0454ca0e4e3f..cb872cdd1782 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -12,6 +12,7 @@
 #include <linux/string.h>
 #include <linux/slab.h>
 #include <linux/sched.h>
+#include <linux/sched/isolation.h>
 #include <linux/cpu.h>
 #include <linux/pm_runtime.h>
 #include <linux/suspend.h>
@@ -332,6 +333,9 @@ static bool pci_physfn_is_probed(struct pci_dev *dev)
 static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
 			  const struct pci_device_id *id)
 {
+#ifdef CONFIG_TASK_ISOLATION
+	int hk_flags = HK_FLAG_DOMAIN | HK_FLAG_WQ;
+#endif
 	int error, node, cpu;
 	struct drv_dev_and_id ddi = { drv, dev, id };
 
@@ -353,7 +357,12 @@ static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
 	    pci_physfn_is_probed(dev))
 		cpu = nr_cpu_ids;
 	else
+#ifdef CONFIG_TASK_ISOLATION
+		cpu = cpumask_any_and(cpumask_of_node(node),
+				      housekeeping_cpumask(hk_flags));
+#else
 		cpu = cpumask_any_and(cpumask_of_node(node), cpu_online_mask);
+#endif
 
 	if (cpu < nr_cpu_ids)
 		error = work_on_cpu(cpu, local_pci_probe, &ddi);
diff --git a/lib/cpumask.c b/lib/cpumask.c
index 0cb672eb107c..dcbc30a47600 100644
--- a/lib/cpumask.c
+++ b/lib/cpumask.c
@@ -6,6 +6,7 @@
 #include <linux/export.h>
 #include <linux/memblock.h>
 #include <linux/numa.h>
+#include <linux/sched/isolation.h>
 
 /**
  * cpumask_next - get the next cpu in a cpumask
@@ -205,28 +206,40 @@ void __init free_bootmem_cpumask_var(cpumask_var_t mask)
  */
 unsigned int cpumask_local_spread(unsigned int i, int node)
 {
-	int cpu;
+	const struct cpumask *mask;
+	int cpu, m, n;
+
+#ifdef CONFIG_TASK_ISOLATION
+	mask = housekeeping_cpumask(HK_FLAG_DOMAIN);
+	m = cpumask_weight(mask);
+#else
+	mask = cpu_online_mask;
+	m = num_online_cpus();
+#endif
 
 	/* Wrap: we always want a cpu. */
-	i %= num_online_cpus();
-
-	if (node == NUMA_NO_NODE) {
-		for_each_cpu(cpu, cpu_online_mask)
-			if (i-- == 0)
-				return cpu;
-	} else {
-		/* NUMA first. */
-		for_each_cpu_and(cpu, cpumask_of_node(node), cpu_online_mask)
-			if (i-- == 0)
-				return cpu;
-
-		for_each_cpu(cpu, cpu_online_mask) {
-			/* Skip NUMA nodes, done above. */
-			if (cpumask_test_cpu(cpu, cpumask_of_node(node)))
-				continue;
-
-			if (i-- == 0)
-				return cpu;
+	n = i % m;
+
+	while (m-- > 0) {
+		if (node == NUMA_NO_NODE) {
+			for_each_cpu(cpu, mask)
+				if (n-- == 0)
+					return cpu;
+		} else {
+			/* NUMA first. */
+			for_each_cpu_and(cpu, cpumask_of_node(node), mask)
+				if (n-- == 0)
+					return cpu;
+
+			for_each_cpu(cpu, mask) {
+				/* Skip NUMA nodes, done above. */
+				if (cpumask_test_cpu(cpu,
+						     cpumask_of_node(node)))
+					continue;
+
+				if (n-- == 0)
+					return cpu;
+			}
 		}
 	}
 	BUG();
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 4c826b8bf9b1..253758f102d9 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -11,6 +11,7 @@
 #include <linux/if_arp.h>
 #include <linux/slab.h>
 #include <linux/sched/signal.h>
+#include <linux/sched/isolation.h>
 #include <linux/nsproxy.h>
 #include <net/sock.h>
 #include <net/net_namespace.h>
@@ -725,6 +726,14 @@ static ssize_t store_rps_map(struct netdev_rx_queue *queue,
 		return err;
 	}
 
+#ifdef CONFIG_TASK_ISOLATION
+	cpumask_and(mask, mask, housekeeping_cpumask(HK_FLAG_DOMAIN));
+	if (cpumask_weight(mask) == 0) {
+		free_cpumask_var(mask);
+		return -EINVAL;
+	}
+#endif
+
 	map = kzalloc(max_t(unsigned int,
 			    RPS_MAP_SIZE(cpumask_weight(mask)), L1_CACHE_BYTES),
 		      GFP_KERNEL);
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* Re: [EXT] Re: [PATCH 06/12] task_isolation: arch/arm64: enable task isolation functionality
  2020-03-04 16:31   ` Mark Rutland
@ 2020-03-08  4:48     ` Alex Belits
  0 siblings, 0 replies; 71+ messages in thread
From: Alex Belits @ 2020-03-08  4:48 UTC (permalink / raw)
  To: mark.rutland
  Cc: mingo, peterz, rostedt, frederic, Prasun Kapoor, tglx, linux-api,
	linux-kernel, catalin.marinas, linux-arm-kernel, netdev, davem,
	linux-arch, will

On Wed, 2020-03-04 at 16:31 +0000, Mark Rutland wrote:
> Hi Alex,
> 
> For patches affecting arm64, please CC LAKML and the arm64
> maintainers
> (Will and Catalin). I've Cc'd the maintainers here.

Thanks. Added them to Cc:.

> 
> On Wed, Mar 04, 2020 at 04:10:28PM +0000, Alex Belits wrote:
> > From: Chris Metcalf <cmetcalf@mellanox.com>
> > 
> > In do_notify_resume(), call task_isolation_start() for
> > TIF_TASK_ISOLATION tasks. Add _TIF_TASK_ISOLATION to
> > _TIF_WORK_MASK,
> > and define a local NOTIFY_RESUME_LOOP_FLAGS to check in the loop,
> > since we don't clear _TIF_TASK_ISOLATION in the loop.
> > 
> > We tweak syscall_trace_enter() slightly to carry the "flags"
> > value from current_thread_info()->flags for each of the tests,
> > rather than doing a volatile read from memory for each one. This
> > avoids a small overhead for each test, and in particular avoids
> > that overhead for TIF_NOHZ when TASK_ISOLATION is not enabled.
> 
> Stale commit message?
> 
> Looking at the patch below, this doesn't seem to be the case; it just
> calls test_thread_flag(TIF_TASK_ISOLATION).

Right. I had to revert to a simple check to match the current
implementation of this function.

> 
> > We instrument the smp_send_reschedule() routine so that it checks for
> > isolated tasks and generates a suitable warning if needed.
> > 
> > Finally, report on page faults in task-isolation processes in
> > do_page_faults().
> > 
> > Signed-off-by: Alex Belits <abelits@marvell.com>
> 
> The From line says this was from Chris Metcalf, but he's missing from
> the Sign-off chain here, which isn't right.

I have posted updated patches with properly preserved sign-off lines
and descriptions of updates.

> >  arch/arm64/include/asm/thread_info.h |  5 ++++-
> >  arch/arm64/kernel/ptrace.c           | 10 ++++++++++
> >  arch/arm64/kernel/signal.c           | 13 ++++++++++++-
> >  arch/arm64/kernel/smp.c              |  7 +++++++
> >  arch/arm64/mm/fault.c                |  5 +++++
> >  6 files changed, 39 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> > index 0b30e884e088..93b6aabc8be9 100644
> > --- a/arch/arm64/Kconfig
> > +++ b/arch/arm64/Kconfig
> > @@ -129,6 +129,7 @@ config ARM64
> >  	select HAVE_ARCH_PREL32_RELOCATIONS
> >  	select HAVE_ARCH_SECCOMP_FILTER
> >  	select HAVE_ARCH_STACKLEAK
> > +	select HAVE_ARCH_TASK_ISOLATION
> >  	select HAVE_ARCH_THREAD_STRUCT_WHITELIST
> >  	select HAVE_ARCH_TRACEHOOK
> >  	select HAVE_ARCH_TRANSPARENT_HUGEPAGE
> > diff --git a/arch/arm64/include/asm/thread_info.h
> > b/arch/arm64/include/asm/thread_info.h
> > index f0cec4160136..7563098eb5b2 100644
> > --- a/arch/arm64/include/asm/thread_info.h
> > +++ b/arch/arm64/include/asm/thread_info.h
> > @@ -63,6 +63,7 @@ void arch_release_task_struct(struct task_struct
> > *tsk);
> >  #define TIF_FOREIGN_FPSTATE	3	/* CPU's FP state is not
> > current's */
> >  #define TIF_UPROBE		4	/* uprobe breakpoint or singlestep
> > */
> >  #define TIF_FSCHECK		5	/* Check FS is USER_DS on
> > return */
> > +#define TIF_TASK_ISOLATION	6
> >  #define TIF_NOHZ		7
> >  #define TIF_SYSCALL_TRACE	8	/* syscall trace active */
> >  #define TIF_SYSCALL_AUDIT	9	/* syscall auditing */
> > @@ -83,6 +84,7 @@ void arch_release_task_struct(struct task_struct
> > *tsk);
> >  #define _TIF_NEED_RESCHED	(1 << TIF_NEED_RESCHED)
> >  #define _TIF_NOTIFY_RESUME	(1 << TIF_NOTIFY_RESUME)
> >  #define _TIF_FOREIGN_FPSTATE	(1 << TIF_FOREIGN_FPSTATE)
> > +#define _TIF_TASK_ISOLATION	(1 << TIF_TASK_ISOLATION)
> >  #define _TIF_NOHZ		(1 << TIF_NOHZ)
> >  #define _TIF_SYSCALL_TRACE	(1 << TIF_SYSCALL_TRACE)
> >  #define _TIF_SYSCALL_AUDIT	(1 << TIF_SYSCALL_AUDIT)
> > @@ -96,7 +98,8 @@ void arch_release_task_struct(struct task_struct
> > *tsk);
> >  
> >  #define _TIF_WORK_MASK		(_TIF_NEED_RESCHED |
> > _TIF_SIGPENDING | \
> >  				 _TIF_NOTIFY_RESUME |
> > _TIF_FOREIGN_FPSTATE | \
> > -				 _TIF_UPROBE | _TIF_FSCHECK)
> > +				 _TIF_UPROBE | _TIF_FSCHECK | \
> > +				 _TIF_TASK_ISOLATION)
> >  
> >  #define _TIF_SYSCALL_WORK	(_TIF_SYSCALL_TRACE |
> > _TIF_SYSCALL_AUDIT | \
> >  				 _TIF_SYSCALL_TRACEPOINT | _TIF_SECCOMP
> > | \
> > diff --git a/arch/arm64/kernel/ptrace.c
> > b/arch/arm64/kernel/ptrace.c
> > index cd6e5fa48b9c..b35b9b0c594c 100644
> > --- a/arch/arm64/kernel/ptrace.c
> > +++ b/arch/arm64/kernel/ptrace.c
> > @@ -29,6 +29,7 @@
> >  #include <linux/regset.h>
> >  #include <linux/tracehook.h>
> >  #include <linux/elf.h>
> > +#include <linux/isolation.h>
> >  
> >  #include <asm/compat.h>
> >  #include <asm/cpufeature.h>
> > @@ -1836,6 +1837,15 @@ int syscall_trace_enter(struct pt_regs
> > *regs)
> >  			return -1;
> >  	}
> >  
> > +	/*
> > +	 * In task isolation mode, we may prevent the syscall from
> > +	 * running, and if so we also deliver a signal to the process.
> > +	 */
> > +	if (test_thread_flag(TIF_TASK_ISOLATION)) {
> > +		if (task_isolation_syscall(regs->syscallno) == -1)
> > +			return -1;
> > +	}
> 
> As above, this doesn't match the commit message.
> 
> AFAICT, task_isolation_syscall() always returns either 0 or -1, which
> isn't great as an API. I see secure_computing() seems to do the same,
> and it'd be nice to clean that up to either be a real error code or a
> boolean.

A boolean may make more sense, considering that we are not dealing with
errors returned by some operation but with the fact that the calls were
made from the wrong state.

> 

> > +
> >  	/* Do the secure computing after ptrace; failures should be
> > fast. */
> >  	if (secure_computing() == -1)
> >  		return -1;
> > diff --git a/arch/arm64/kernel/signal.c
> > b/arch/arm64/kernel/signal.c
> > index 339882db5a91..d488c91a4877 100644
> > --- a/arch/arm64/kernel/signal.c
> > +++ b/arch/arm64/kernel/signal.c
> > @@ -20,6 +20,7 @@
> >  #include <linux/tracehook.h>
> >  #include <linux/ratelimit.h>
> >  #include <linux/syscalls.h>
> > +#include <linux/isolation.h>
> >  
> >  #include <asm/daifflags.h>
> >  #include <asm/debug-monitors.h>
> > @@ -898,6 +899,11 @@ static void do_signal(struct pt_regs *regs)
> >  	restore_saved_sigmask();
> >  }
> >  
> > +#define NOTIFY_RESUME_LOOP_FLAGS \
> > +	(_TIF_NEED_RESCHED | _TIF_SIGPENDING | \
> > +	_TIF_NOTIFY_RESUME | _TIF_FOREIGN_FPSTATE | \
> > +	_TIF_UPROBE | _TIF_FSCHECK)
> > +
> >  asmlinkage void do_notify_resume(struct pt_regs *regs,
> >  				 unsigned long thread_flags)
> >  {
> > @@ -908,6 +914,8 @@ asmlinkage void do_notify_resume(struct pt_regs
> > *regs,
> >  	 */
> >  	trace_hardirqs_off();
> >  
> > +	task_isolation_check_run_cleanup();
> > +
> >  	do {
> >  		/* Check valid user FS if needed */
> >  		addr_limit_user_check();
> > @@ -938,7 +946,10 @@ asmlinkage void do_notify_resume(struct
> > pt_regs *regs,
> >  
> >  		local_daif_mask();
> >  		thread_flags = READ_ONCE(current_thread_info()->flags);
> > -	} while (thread_flags & _TIF_WORK_MASK);
> > +	} while (thread_flags & NOTIFY_RESUME_LOOP_FLAGS);
> > +
> > +	if (thread_flags & _TIF_TASK_ISOLATION)
> > +		task_isolation_start();
> >  }
> >  
> >  unsigned long __ro_after_init signal_minsigstksz;
> > diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> > index d4ed9a19d8fe..00f0f77adea0 100644
> > --- a/arch/arm64/kernel/smp.c
> > +++ b/arch/arm64/kernel/smp.c
> > @@ -32,6 +32,7 @@
> >  #include <linux/irq_work.h>
> >  #include <linux/kexec.h>
> >  #include <linux/kvm_host.h>
> > +#include <linux/isolation.h>
> >  
> >  #include <asm/alternative.h>
> >  #include <asm/atomic.h>
> > @@ -818,6 +819,7 @@ void arch_send_call_function_single_ipi(int
> > cpu)
> >  #ifdef CONFIG_ARM64_ACPI_PARKING_PROTOCOL
> >  void arch_send_wakeup_ipi_mask(const struct cpumask *mask)
> >  {
> > +	task_isolation_remote_cpumask(mask, "wakeup IPI");
> >  	smp_cross_call(mask, IPI_WAKEUP);
> >  }
> >  #endif
> > @@ -886,6 +888,9 @@ void handle_IPI(int ipinr, struct pt_regs
> > *regs)
> >  		__inc_irq_stat(cpu, ipi_irqs[ipinr]);
> >  	}
> >  
> > +	task_isolation_interrupt("IPI type %d (%s)", ipinr,
> > +				 ipinr < NR_IPI ? ipi_types[ipinr] :
> > "unknown");
> > +
> 
> Are these tracing hooks?
> 
> Surely they aren't necessary for functional correctness?

This is necessary to properly break isolation and send a signal when
something disturbs the isolated task, and to tell the user (likely a
developer) why this happened. There are some legitimate reasons for
breaking isolation (for example, a signal that terminates the process)
and a very large number of possible things that should not happen to an
isolated task -- a page fault caused by access to unmapped addresses, a
syscall issued by the task in isolated state, an interrupt directed to
the core running the isolated task, a timer remaining there, or the
kernel trying to run something on that core. It was very helpful to
know what exactly happened; without this reporting I would have a hard
time debugging the drivers and subsystems that called their functions
and scheduled their threads all over the place. So both reliably
breaking isolation and reporting the cause are important.
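
For illustration, the reporting pattern described above boils down to
something like the sketch below (simplified, not the actual code in
kernel/isolation.c; the per-CPU bookkeeping, rate limiting and the
configurable signal number are omitted, and the function name is made
up):

/* Sketch: report the cause and deliver the isolation-breaking signal. */
static void report_isolation_break(const char *cause)
{
	struct task_struct *task = current;

	if (!test_thread_flag(TIF_TASK_ISOLATION))
		return;

	/* Tell the user (likely a developer) what disturbed the task. */
	pr_warn("isolation lost on CPU %d (%s/%d): %s\n",
		smp_processor_id(), task->comm, task->pid, cause);

	/* Deliver the signal; SIGKILL is the default in this series. */
	send_sig(SIGKILL, task, 1);
}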

> 
> >  	switch (ipinr) {
> >  	case IPI_RESCHEDULE:
> >  		scheduler_ipi();
> > @@ -948,12 +953,14 @@ void handle_IPI(int ipinr, struct pt_regs
> > *regs)
> >  
> >  void smp_send_reschedule(int cpu)
> >  {
> > +	task_isolation_remote(cpu, "reschedule IPI");
> >  	smp_cross_call(cpumask_of(cpu), IPI_RESCHEDULE);
> >  }
> >  
> >  #ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST
> >  void tick_broadcast(const struct cpumask *mask)
> >  {
> > +	task_isolation_remote_cpumask(mask, "timer IPI");
> >  	smp_cross_call(mask, IPI_TIMER);
> >  }
> >  #endif
> > diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> > index 85566d32958f..fc4b42c81c4f 100644
> > --- a/arch/arm64/mm/fault.c
> > +++ b/arch/arm64/mm/fault.c
> > @@ -23,6 +23,7 @@
> >  #include <linux/perf_event.h>
> >  #include <linux/preempt.h>
> >  #include <linux/hugetlb.h>
> > +#include <linux/isolation.h>
> >  
> >  #include <asm/acpi.h>
> >  #include <asm/bug.h>
> > @@ -543,6 +544,10 @@ static int __kprobes do_page_fault(unsigned
> > long addr, unsigned int esr,
> >  	 */
> >  	if (likely(!(fault & (VM_FAULT_ERROR | VM_FAULT_BADMAP |
> >  			      VM_FAULT_BADACCESS)))) {
> > +		/* No signal was generated, but notify task-isolation
> > tasks. */
> > +		if (user_mode(regs))
> > +			task_isolation_interrupt("page fault at %#lx",
> > addr);
> > +
> 
> We check user_mode(regs) much earlier in this function to set
> FAULT_FLAG_USER. Is there some reason this cannot live there?
> 
> Also, this seems to be a tracing hook -- is this necessary?

This is another point where task isolation state is broken: the task
receives a signal and the reason is logged. Without isolation, the task
would not know that anything happened, but for an isolated task it is
important.

In general, if an isolated task is supposed to be running on a CPU
core, it has to stay in userspace. So if for some reason kernel code is
running on that CPU core, this is either the task leaving isolation on
its own, or isolation is broken and the task should be notified about
it. The only problem is how to properly and reliably explain why this
happened.

Thanks!

-- 
Alex

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [EXT] Re: [PATCH 03/12] task_isolation: userspace hard isolation from kernel
  2020-03-05 18:33   ` Frederic Weisbecker
@ 2020-03-08  5:32     ` Alex Belits
  2020-04-28 14:12     ` Marcelo Tosatti
  1 sibling, 0 replies; 71+ messages in thread
From: Alex Belits @ 2020-03-08  5:32 UTC (permalink / raw)
  To: frederic
  Cc: mingo, peterz, rostedt, Prasun Kapoor, tglx, linux-api,
	linux-kernel, catalin.marinas, linux-arm-kernel, netdev, davem,
	linux-arch, will

On Thu, 2020-03-05 at 19:33 +0100, Frederic Weisbecker wrote:
> On Wed, Mar 04, 2020 at 04:07:12PM +0000, Alex Belits wrote:
> > 
> 
> Hi Alex,
> 
> I'm glad this patchset is being resurected.
> Reading that changelog, I like the general idea and the direction.
> The diff is a bit scary though but I'll check the patches in detail
> in the upcoming days.
> 

I made some updates -- added missing code for arm and x86, restored
sign-off lines and updated commit messages.

This is the result of some work that mostly happened on earlier
versions and had to deal with the fact that timers and housekeeping
work often appeared on all CPUs, so some solutions may look like
overkill. Nevertheless it was very helpful for finding the sources of
unexpected disturbances.

Also, originally some of the race conditions and potential delayed work
at the time when a task is entering isolated state were considered
unavoidable. So the part in the kernel was focused on correctness of
handling those conditions, while detection and dealing with their
consequences was done in userspace (in libtmc). Now it looks like there
may be far fewer such situations, however I am still not very thrilled
with the idea of complicating the kernel more than we have to,
especially when it comes to code that is relevant only for a few
seconds when the task is starting and entering isolated mode. So I have
to admit that some solutions look like "more EINTR than EINTR", and I
still like them more than making the kernel side of entering/exiting
isolation even more complex than it is now.

I may be wrong, and there may be some more elegant solution, however I
don't see it now. The userspace-assisted isolation entering/exiting
procedure worked very well in a system with a huge number of cores,
threads, drivers with unusual features, etc., so at the very least we
have some usable reference point.

> > In a number of cases we can tell on a remote cpu that we are
> > going to be interrupting the cpu, e.g. via an IPI or a TLB flush.
> > In that case we generate the diagnostic (and optional stack dump)
> > on the remote core to be able to deliver better diagnostics.
> > If the interrupt is not something caught by Linux (e.g. a
> > hypervisor interrupt) we can also request a reschedule IPI to
> > be sent to the remote core so it can be sure to generate a
> > signal to notify the process.
> 
> I'm wondering if it's wise to run that on a guest at all :-)
> Or we should consider any guest exit to the host as a
> disturbance, we would then need some sort of paravirt
> driver to notify that, etc... That doesn't sound appealing.

Why not? I am not a big fan of virtualization, however people seem to
use it for all kinds of purposes now, and we only have to propagate (or
reject) isolation requests from guest to host (as long as resource and
permission policy allow that). For KVM it would be literally
replicating the guest task isolation state on the host, and as long as
the CPU core is isolated, does it really matter if the task was created
with two layers of virtualization instead of one?

For isolation to make sense, it's still code running on a CPU with a
fixed address mapping. If this is still the case, virtualization only
determines what can be in that space, not how it behaves. If this is
not the case, and the task causes kernel code to run, be it guest or
host kernel, then something is wrong, and isolation is broken. Not very
different from behavior without virtualization.

This would have been very bad in the early days of virtualization, when
very little could be done by a guest without the host messing with it.
Now, when pieces of hardware can be (relatively) safely given to guest
userspace to work on, we can just as well let it run isolated.

> 
> Thanks.

Thanks!

-- 
Alex

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [EXT] Re: [PATCH 03/12] task_isolation: userspace hard isolation from kernel
  2020-03-06 15:26   ` Frederic Weisbecker
@ 2020-03-08  6:06     ` Alex Belits
  0 siblings, 0 replies; 71+ messages in thread
From: Alex Belits @ 2020-03-08  6:06 UTC (permalink / raw)
  To: frederic
  Cc: mingo, peterz, rostedt, Prasun Kapoor, tglx, linux-api,
	linux-kernel, catalin.marinas, linux-arm-kernel, netdev, davem,
	linux-arch, will

On Fri, 2020-03-06 at 16:26 +0100, Frederic Weisbecker wrote:
> On Wed, Mar 04, 2020 at 04:07:12PM +0000, Alex Belits wrote:
> > 		do {
> > +			/* Make sure we are reading up to date values
> > */
> > +			smp_mb();
> > +			ind = ind_counter & 1;
> > +			snprintf(buf_prefix, sizeof(buf_prefix),
> > +				 "isolation %s/%d/%d (cpu %d)",
> > +				 desc->comm[ind], desc->tgid[ind],
> > +				 desc->pid[ind], cpu);
> > +			desc->warned[ind] = true;
> > +			ind_counter_old = ind_counter;
> > +			/* Record the warned flag, then re-read
> > descriptor */
> > +			smp_mb();
> > +			ind_counter = atomic_read(&desc->curr_index);
> > +			/*
> > +			 * If the counter changed, something was
> > updated, so
> > +			 * repeat everything to get the current data
> > +			 */
> > +		} while (ind_counter != ind_counter_old);
> > +	}
> 
> So the need to log the fact we are sending an event to a remote CPU
> that *may be*
> running an isolated task makes things very complicated and even racy.

The only reason why the result of this would be wrong is a race between
multiple causes of breaking isolation of the same task, or a race with
the task exiting isolation on its own at the same time (and possibly
re-entering it, or even another task entering isolation on the same CPU
core). This is possible, however for all practical purposes we are
still logging an isolation-breaking event that happened while a real
isolated task was running. We should keep in mind the possibility that
this isolation-breaking event could be preempted by another
isolation-breaking cause, and all of them will be recorded even if only
one ended up causing fast_task_isolation_cpu_cleanup() to be called on
the target CPU core.
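
To make the retry logic above easier to follow, the writer side of such
a double-buffered descriptor could look roughly like this (the struct
layout is inferred from the reader loop quoted above; the update
function and its details are assumptions for illustration, not the
actual kernel/isolation.c code):

struct isol_desc {
	atomic_t curr_index;	/* low bit selects the live slot */
	char comm[2][TASK_COMM_LEN];
	pid_t tgid[2];
	pid_t pid[2];
	bool warned[2];
};

static void isol_desc_update(struct isol_desc *desc, struct task_struct *tsk)
{
	/* Fill the slot that readers are not currently using. */
	int next = (atomic_read(&desc->curr_index) + 1) & 1;

	strscpy(desc->comm[next], tsk->comm, sizeof(desc->comm[next]));
	desc->tgid[next] = tsk->tgid;
	desc->pid[next] = tsk->pid;
	desc->warned[next] = false;

	/* Publish the new slot; readers that saw a stale index retry. */
	smp_wmb();
	atomic_inc(&desc->curr_index);
}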

> How bad would it be to only log those interruptions once they land on
> the target?
> 

For the purpose of determining the cause of isolation breaking -- very
bad. Early versions of this made people tear their hair out trying to
divine where some IPI came from. Then there was a monstrosity that did
some rather unsafe manipulations with task_struct, however it was only
suitable as a temporary mechanism for development. This version keeps
things consistent and only shows up when there is something that should
be reported.

> Thanks.

Thanks!

-- 
Alex

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [EXT] Re: [PATCH 11/12] task_isolation: kick_all_cpus_sync: don't kick isolated cpus
  2020-03-06 15:34   ` Frederic Weisbecker
@ 2020-03-08  6:48     ` Alex Belits
  2020-03-09  2:28       ` Frederic Weisbecker
  0 siblings, 1 reply; 71+ messages in thread
From: Alex Belits @ 2020-03-08  6:48 UTC (permalink / raw)
  To: frederic
  Cc: mingo, peterz, rostedt, Prasun Kapoor, tglx, linux-api,
	linux-kernel, catalin.marinas, linux-arm-kernel, netdev, davem,
	linux-arch, will

On Fri, 2020-03-06 at 16:34 +0100, Frederic Weisbecker wrote:
> On Wed, Mar 04, 2020 at 04:15:24PM +0000, Alex Belits wrote:
> > From: Yuri Norov <ynorov@marvell.com>
> > 
> > Make sure that kick_all_cpus_sync() does not call CPUs that are
> > running
> > isolated tasks.
> > 
> > Signed-off-by: Alex Belits <abelits@marvell.com>
> > ---
> >  kernel/smp.c | 14 +++++++++++++-
> >  1 file changed, 13 insertions(+), 1 deletion(-)
> > 
> > diff --git a/kernel/smp.c b/kernel/smp.c
> > index 3a8bcbdd4ce6..d9b4b2fedfed 100644
> > --- a/kernel/smp.c
> > +++ b/kernel/smp.c
> > @@ -731,9 +731,21 @@ static void do_nothing(void *unused)
> >   */
> >  void kick_all_cpus_sync(void)
> >  {
> > +	struct cpumask mask;
> > +
> >  	/* Make sure the change is visible before we kick the cpus */
> >  	smp_mb();
> > -	smp_call_function(do_nothing, NULL, 1);
> > +
> > +	preempt_disable();
> > +#ifdef CONFIG_TASK_ISOLATION
> > +	cpumask_clear(&mask);
> > +	task_isolation_cpumask(&mask);
> > +	cpumask_complement(&mask, &mask);
> > +#else
> > +	cpumask_setall(&mask);
> > +#endif
> > +	smp_call_function_many(&mask, do_nothing, NULL, 1);
> > +	preempt_enable();
> >  }
> 
> That looks very dangerous, the callers of kick_all_cpus_sync() want
> to
> sync all CPUs for a reason. You will rather need to fix the callers.

All callers use this function to synchronize IPIs and icache, and they
have no idea if there is anything special about the state of the CPUs.
If a task is isolated, this call would not be necessary because the
task is in userspace, and it would have to enter the kernel for any of
that to become relevant, at which point it switches from userspace to
kernel anyway. At worst it is returning to userspace after entering
isolation, or back in the kernel running cleanup after isolation is
broken but before tsk_thread_flags_cache is updated. There will be
nothing to run on the same CPU because we have just left isolation, so
the task will either exit or go back to userspace.

Is there any reason for a race at that point?

> Thanks.
> 
> >  EXPORT_SYMBOL_GPL(kick_all_cpus_sync);
> >  
> > -- 
> > 2.20.1
> > 

-- 
Alex

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [EXT] Re: [PATCH 03/12] task_isolation: userspace hard isolation from kernel
  2020-03-06 16:00   ` Frederic Weisbecker
@ 2020-03-08  7:16     ` Alex Belits
  0 siblings, 0 replies; 71+ messages in thread
From: Alex Belits @ 2020-03-08  7:16 UTC (permalink / raw)
  To: frederic
  Cc: mingo, linux-kernel, peterz, rostedt, Prasun Kapoor, tglx,
	linux-api, linux-mm, linux-arch

On Fri, 2020-03-06 at 17:00 +0100, Frederic Weisbecker wrote:
> On Wed, Mar 04, 2020 at 04:07:12PM +0000, Alex Belits wrote:
> > +#ifdef CONFIG_TASK_ISOLATION
> > +int try_stop_full_tick(void)
> > +{
> > +	int cpu = smp_processor_id();
> > +	struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched);
> > +
> > +	/* For an unstable clock, we should return a permanent error
> > code. */
> > +	if (atomic_read(&tick_dep_mask) & TICK_DEP_MASK_CLOCK_UNSTABLE)
> > +		return -EINVAL;
> > +
> > +	if (!can_stop_full_tick(cpu, ts))
> > +		return -EAGAIN;
> 
> Note that the stop_tick naming in nohz can be misleading. It means
> we actually leave the periodic mode and we enter in dynamic tick
> mode.
> 
> In practice it means that the tick is delayed until the next event,
> which
> in the worst case may well be in 1 ms and in the best case never. So
> what
> you probably want to check instead is whether the tick has been
> entirely
> stopped (ie: we called hrtimer_cancel(&ts->sched_timer)).

This is a part of the solution where libtmc in userspace checks for
timers from another core before it confirms that the core entering
isolation can continue. Since, indeed, it is possible that some events
are pending, it is up to userspace to tell the task that it's not
really isolated yet and should exit and re-enter isolation when
everything is done. Or that the wait would be too long, and it should
be treated as an error, reported, etc.

Maybe it would be better if we checked for the timer state and returned
-EAGAIN when it is running at this point, and left it to userspace to
check for those cases when this did not work due to some race or
preemption. However I still want to assume that as long as there is no
complete prohibition on scheduling things on isolated CPUs, there might
be things that will enable this timer at unexpected times while we are
returning to userspace or even immediately after we got into userspace.
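
As a rough illustration of that userspace-assisted approach, the entry
path on the application side can be a simple retry loop around the
prctl() added in patch 03. The constants below are the ones quoted
from include/uapi/linux/prctl.h in this series; error handling is
simplified and it is assumed that a transient failure (such as a still
pending timer) is reported as EAGAIN, as suggested by
try_stop_full_tick() above:

#include <sys/prctl.h>
#include <errno.h>

#define PR_TASK_ISOLATION		48
#define PR_TASK_ISOLATION_ENABLE	(1 << 0)

static int enter_isolation(void)
{
	for (;;) {
		if (prctl(PR_TASK_ISOLATION, PR_TASK_ISOLATION_ENABLE,
			  0, 0, 0) == 0)
			return 0;	/* isolated; stay in userspace now */
		if (errno != EAGAIN)
			return -1;	/* permanent failure, give up */
		/* something is still pending on this core; try again */
	}
}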

> 
> Thanks.
> 
> > +
> > +	tick_nohz_stop_sched_tick(ts, cpu);
> > +	return 0;
> > +}
> > +#endif
> > +
> >  static bool can_stop_idle_tick(int cpu, struct tick_sched *ts)
> >  {
> >  	/*
> > -- 
> > 2.20.1
> > 

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [EXT] Re: [PATCH 08/12] task_isolation: don't interrupt CPUs with tick_nohz_full_kick_cpu()
  2020-03-06 16:03   ` Frederic Weisbecker
@ 2020-03-08  7:28     ` Alex Belits
  2020-03-09  2:38       ` Frederic Weisbecker
  0 siblings, 1 reply; 71+ messages in thread
From: Alex Belits @ 2020-03-08  7:28 UTC (permalink / raw)
  To: frederic
  Cc: mingo, peterz, rostedt, tglx, linux-api, catalin.marinas,
	linux-arm-kernel, netdev, davem, linux-arch, will

On Fri, 2020-03-06 at 17:03 +0100, Frederic Weisbecker wrote:
> On Wed, Mar 04, 2020 at 04:12:40PM +0000, Alex Belits wrote:
> > From: Yuri Norov <ynorov@marvell.com>
> > 
> > For nohz_full CPUs the desirable behavior is to receive interrupts
> > generated by tick_nohz_full_kick_cpu(). But for hard isolation it's
> > obviously not desirable because it breaks isolation.
> > 
> > This patch adds check for it.
> > 
> > Signed-off-by: Alex Belits <abelits@marvell.com>
> > ---
> >  kernel/time/tick-sched.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> > index 1d4dec9d3ee7..fe4503ba1316 100644
> > --- a/kernel/time/tick-sched.c
> > +++ b/kernel/time/tick-sched.c
> > @@ -20,6 +20,7 @@
> >  #include <linux/sched/clock.h>
> >  #include <linux/sched/stat.h>
> >  #include <linux/sched/nohz.h>
> > +#include <linux/isolation.h>
> >  #include <linux/module.h>
> >  #include <linux/irq_work.h>
> >  #include <linux/posix-timers.h>
> > @@ -262,7 +263,7 @@ static void tick_nohz_full_kick(void)
> >   */
> >  void tick_nohz_full_kick_cpu(int cpu)
> >  {
> > -	if (!tick_nohz_full_cpu(cpu))
> > +	if (!tick_nohz_full_cpu(cpu) || task_isolation_on_cpu(cpu))
> >  		return;
> 
> I fear you can't do that. A nohz full CPU is kicked for a reason.
> As for the other cases, you need to fix the callers.
> 
> In the general case, randomly ignoring an interrupt is a correctness
> issue.

Not ignoring, just delaying until we are back from userspace. We know
that everything was done on this CPU when we successfully entered
userspace in isolated mode -- otherwise we would be kicked out. We
restart timers when we are back in the kernel again on cleanup, so
things will be back to normal at that point. Between those moments we
can just as well remain in userspace and forget about the timers until
we are back in the kernel.

> 
> Thanks.

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [EXT] Re: [PATCH v2 03/12] task_isolation: userspace hard isolation from kernel
       [not found]     ` <20200307214254.7a8f6c22@hermes.lan>
@ 2020-03-08  7:33       ` Alex Belits
  0 siblings, 0 replies; 71+ messages in thread
From: Alex Belits @ 2020-03-08  7:33 UTC (permalink / raw)
  To: stephen
  Cc: mingo, peterz, rostedt, frederic, Prasun Kapoor, tglx, linux-api,
	linux-kernel, catalin.marinas, linux-arm-kernel, netdev, davem,
	linux-arch, will

On Sat, 2020-03-07 at 21:42 -0800, Stephen Hemminger wrote:
> On Sun, 8 Mar 2020 03:47:08 +0000
> Alex Belits <abelits@marvell.com> wrote:
> 
> > +int task_isolation_message(int cpu, int level, bool supp, const
> > char *fmt, ...);
> 
> Since you are doing printf/printk var args you should use the GCC
> format attributes
> like printk.

Agreed.

-- 
Alex

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [EXT] Re: [PATCH 11/12] task_isolation: kick_all_cpus_sync: don't kick isolated cpus
  2020-03-08  6:48     ` [EXT] " Alex Belits
@ 2020-03-09  2:28       ` Frederic Weisbecker
  0 siblings, 0 replies; 71+ messages in thread
From: Frederic Weisbecker @ 2020-03-09  2:28 UTC (permalink / raw)
  To: Alex Belits
  Cc: mingo, peterz, rostedt, Prasun Kapoor, tglx, linux-api,
	linux-kernel, catalin.marinas, linux-arm-kernel, netdev, davem,
	linux-arch, will

On Sun, Mar 08, 2020 at 06:48:43AM +0000, Alex Belits wrote:
> On Fri, 2020-03-06 at 16:34 +0100, Frederic Weisbecker wrote:
> > On Wed, Mar 04, 2020 at 04:15:24PM +0000, Alex Belits wrote:
> > > From: Yuri Norov <ynorov@marvell.com>
> > > 
> > > Make sure that kick_all_cpus_sync() does not call CPUs that are
> > > running
> > > isolated tasks.
> > > 
> > > Signed-off-by: Alex Belits <abelits@marvell.com>
> > > ---
> > >  kernel/smp.c | 14 +++++++++++++-
> > >  1 file changed, 13 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/kernel/smp.c b/kernel/smp.c
> > > index 3a8bcbdd4ce6..d9b4b2fedfed 100644
> > > --- a/kernel/smp.c
> > > +++ b/kernel/smp.c
> > > @@ -731,9 +731,21 @@ static void do_nothing(void *unused)
> > >   */
> > >  void kick_all_cpus_sync(void)
> > >  {
> > > +	struct cpumask mask;
> > > +
> > >  	/* Make sure the change is visible before we kick the cpus */
> > >  	smp_mb();
> > > -	smp_call_function(do_nothing, NULL, 1);
> > > +
> > > +	preempt_disable();
> > > +#ifdef CONFIG_TASK_ISOLATION
> > > +	cpumask_clear(&mask);
> > > +	task_isolation_cpumask(&mask);
> > > +	cpumask_complement(&mask, &mask);
> > > +#else
> > > +	cpumask_setall(&mask);
> > > +#endif
> > > +	smp_call_function_many(&mask, do_nothing, NULL, 1);
> > > +	preempt_enable();
> > >  }
> > 
> > That looks very dangerous, the callers of kick_all_cpus_sync() want
> > to
> > sync all CPUs for a reason. You will rather need to fix the callers.
> 
> All callers of this use this function to synchronize IPIs and icache,
> and they have no idea if there is anything special about the state of
> CPUs. If a task is isolated, this call would not be necessary because
> the task is in userspace, and it would have to enter kernel for any of
> that to become relevant but then it will have to switch from userspace
> to kernel. At worst it is returning to userspace after entering
> isolation or back in kernel running cleanup after isolation is broken
> but before tsk_thread_flags_cache is updated. There will be nothing to
> run on the same CPU because we have just left isolation, so task will
> either exit or go back to userspace.
> 
> Is there any reason for a race at that point?


I can imagine several races:

1) The isolated task has set the cpumask but hasn't exited the kernel
yet. If it still runs kernel code while kick_all_cpus_sync() has completed,
we fail.

2) The isolated task is running do_exit() but the caller of kick_all_cpus_sync()
still sees the target as part of the isolated mask.

3) The isolated task has just set the isolated cpumask and entered userspace
but the caller still don't see the new value in the isolated cpumask, so it sends
the IPI to the isolated CPU.

Besides, any caller of kick_all_cpus_sync() is within its rights to expect
that everything preceding the call to that function is visible to all CPUs
after that call. If you spare that IPI to an isolated CPU, what ensures
it will see what it is supposed to once it calls do_exit() or prctl()?

Is there a way we could fix the callers instead? For example synchronize_rcu()
could be a replacement (it handles very well nohz_full CPUs), provided the
callsites can sleep. It seems to be the case for __do_tune_cpucache() at least.
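
For callers that are allowed to sleep, the substitution would look
roughly like this (illustrative names only, not the actual slab code):

static int shared_state;

static void publish_new_state(int new_value)
{
	WRITE_ONCE(shared_state, new_value);

	/*
	 * Instead of kick_all_cpus_sync(): after synchronize_rcu()
	 * returns, CPUs that were idle or in userspace (including
	 * nohz_full/isolated ones) will observe the update on their
	 * next kernel entry, without having received an IPI. The
	 * caller must be able to sleep.
	 */
	synchronize_rcu();
}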

flush_icache_range() is scarier I have to admit, doesn't look like it can
sleep.


> > Thanks.
> > 
> > >  EXPORT_SYMBOL_GPL(kick_all_cpus_sync);
> > >  
> > > -- 
> > > 2.20.1
> > > 
> 
> -- 
> Alex

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [EXT] Re: [PATCH 08/12] task_isolation: don't interrupt CPUs with tick_nohz_full_kick_cpu()
  2020-03-08  7:28     ` [EXT] " Alex Belits
@ 2020-03-09  2:38       ` Frederic Weisbecker
  0 siblings, 0 replies; 71+ messages in thread
From: Frederic Weisbecker @ 2020-03-09  2:38 UTC (permalink / raw)
  To: Alex Belits
  Cc: mingo, peterz, rostedt, tglx, linux-api, catalin.marinas,
	linux-arm-kernel, netdev, davem, linux-arch, will

On Sun, Mar 08, 2020 at 07:28:22AM +0000, Alex Belits wrote:
> On Fri, 2020-03-06 at 17:03 +0100, Frederic Weisbecker wrote:
> > On Wed, Mar 04, 2020 at 04:12:40PM +0000, Alex Belits wrote:
> > > From: Yuri Norov <ynorov@marvell.com>
> > > 
> > > For nohz_full CPUs the desirable behavior is to receive interrupts
> > > generated by tick_nohz_full_kick_cpu(). But for hard isolation it's
> > > obviously not desirable because it breaks isolation.
> > > 
> > > This patch adds check for it.
> > > 
> > > Signed-off-by: Alex Belits <abelits@marvell.com>
> > > ---
> > >  kernel/time/tick-sched.c | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> > > index 1d4dec9d3ee7..fe4503ba1316 100644
> > > --- a/kernel/time/tick-sched.c
> > > +++ b/kernel/time/tick-sched.c
> > > @@ -20,6 +20,7 @@
> > >  #include <linux/sched/clock.h>
> > >  #include <linux/sched/stat.h>
> > >  #include <linux/sched/nohz.h>
> > > +#include <linux/isolation.h>
> > >  #include <linux/module.h>
> > >  #include <linux/irq_work.h>
> > >  #include <linux/posix-timers.h>
> > > @@ -262,7 +263,7 @@ static void tick_nohz_full_kick(void)
> > >   */
> > >  void tick_nohz_full_kick_cpu(int cpu)
> > >  {
> > > -	if (!tick_nohz_full_cpu(cpu))
> > > +	if (!tick_nohz_full_cpu(cpu) || task_isolation_on_cpu(cpu))
> > >  		return;
> > 
> > I fear you can't do that. A nohz full CPU is kicked for a reason.
> > As for the other cases, you need to fix the callers.
> > 
> > In the general case, randomly ignoring an interrupt is a correctness
> > issue.
> 
> Not ignoring, just delaying until we are back from userspace. We know
> that everything was done on this CPU when we successfully entered
> userspace in isolated mode -- otherwise we would be kicked out. We
> restart timers when we are back in kernel again on cleanup, so things
> will be back to normal at that point. Between those moments we can just
> as well remain in userspace and forget about the timers until we are
> back in kernel.

Well, if another CPU requests the tick on our isolated CPU, we can't ignore
it. This can be a posix cpu timer belonging to our process, a timer bound
to our CPU or tasks added to our CPU that require the scheduler tick.
Denying any of that can crash the kernel randomly.

The only thing we can do is to simply avoid these situations. But those
are requirements anyway if you want to run a task undisturbed.

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v2 06/12] task_isolation: arch/arm64: enable task isolation functionality
  2020-03-08  3:50   ` [PATCH v2 06/12] task_isolation: arch/arm64: " Alex Belits
@ 2020-03-09 16:59     ` Mark Rutland
  0 siblings, 0 replies; 71+ messages in thread
From: Mark Rutland @ 2020-03-09 16:59 UTC (permalink / raw)
  To: Alex Belits
  Cc: frederic, rostedt, mingo, peterz, linux-kernel, Prasun Kapoor,
	tglx, linux-api, catalin.marinas, linux-arm-kernel, netdev,
	davem, linux-arch, will

On Sun, Mar 08, 2020 at 03:50:58AM +0000, Alex Belits wrote:
> From: Chris Metcalf <cmetcalf@mellanox.com>
> 
> In do_notify_resume(), call task_isolation_start() for
> TIF_TASK_ISOLATION tasks. Add _TIF_TASK_ISOLATION to _TIF_WORK_MASK,
> and define a local NOTIFY_RESUME_LOOP_FLAGS to check in the loop,
> since we don't clear _TIF_TASK_ISOLATION in the loop.
> 
> We instrument the smp_send_reschedule() routine so that it checks for
> isolated tasks and generates a suitable warning if needed.
> 
> Finally, report on page faults in task-isolation processes in
> do_page_faults().
> 
> Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com>
> [abelits@marvell.com: simplified to match kernel 5.6]
> Signed-off-by: Alex Belits <abelits@marvell.com>
> ---
>  arch/arm64/Kconfig                   |  1 +
>  arch/arm64/include/asm/thread_info.h |  5 ++++-
>  arch/arm64/kernel/ptrace.c           | 10 ++++++++++
>  arch/arm64/kernel/signal.c           | 13 ++++++++++++-
>  arch/arm64/kernel/smp.c              |  7 +++++++
>  arch/arm64/mm/fault.c                |  5 +++++
>  6 files changed, 39 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 0b30e884e088..93b6aabc8be9 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -129,6 +129,7 @@ config ARM64
>  	select HAVE_ARCH_PREL32_RELOCATIONS
>  	select HAVE_ARCH_SECCOMP_FILTER
>  	select HAVE_ARCH_STACKLEAK
> +	select HAVE_ARCH_TASK_ISOLATION
>  	select HAVE_ARCH_THREAD_STRUCT_WHITELIST
>  	select HAVE_ARCH_TRACEHOOK
>  	select HAVE_ARCH_TRANSPARENT_HUGEPAGE
> diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
> index f0cec4160136..7563098eb5b2 100644
> --- a/arch/arm64/include/asm/thread_info.h
> +++ b/arch/arm64/include/asm/thread_info.h
> @@ -63,6 +63,7 @@ void arch_release_task_struct(struct task_struct *tsk);
>  #define TIF_FOREIGN_FPSTATE	3	/* CPU's FP state is not current's */
>  #define TIF_UPROBE		4	/* uprobe breakpoint or singlestep */
>  #define TIF_FSCHECK		5	/* Check FS is USER_DS on return */
> +#define TIF_TASK_ISOLATION	6
>  #define TIF_NOHZ		7
>  #define TIF_SYSCALL_TRACE	8	/* syscall trace active */
>  #define TIF_SYSCALL_AUDIT	9	/* syscall auditing */
> @@ -83,6 +84,7 @@ void arch_release_task_struct(struct task_struct *tsk);
>  #define _TIF_NEED_RESCHED	(1 << TIF_NEED_RESCHED)
>  #define _TIF_NOTIFY_RESUME	(1 << TIF_NOTIFY_RESUME)
>  #define _TIF_FOREIGN_FPSTATE	(1 << TIF_FOREIGN_FPSTATE)
> +#define _TIF_TASK_ISOLATION	(1 << TIF_TASK_ISOLATION)
>  #define _TIF_NOHZ		(1 << TIF_NOHZ)
>  #define _TIF_SYSCALL_TRACE	(1 << TIF_SYSCALL_TRACE)
>  #define _TIF_SYSCALL_AUDIT	(1 << TIF_SYSCALL_AUDIT)
> @@ -96,7 +98,8 @@ void arch_release_task_struct(struct task_struct *tsk);
>  
>  #define _TIF_WORK_MASK		(_TIF_NEED_RESCHED | _TIF_SIGPENDING | \
>  				 _TIF_NOTIFY_RESUME | _TIF_FOREIGN_FPSTATE | \
> -				 _TIF_UPROBE | _TIF_FSCHECK)
> +				 _TIF_UPROBE | _TIF_FSCHECK | \
> +				 _TIF_TASK_ISOLATION)
>  
>  #define _TIF_SYSCALL_WORK	(_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \
>  				 _TIF_SYSCALL_TRACEPOINT | _TIF_SECCOMP | \
> diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
> index cd6e5fa48b9c..b35b9b0c594c 100644
> --- a/arch/arm64/kernel/ptrace.c
> +++ b/arch/arm64/kernel/ptrace.c
> @@ -29,6 +29,7 @@
>  #include <linux/regset.h>
>  #include <linux/tracehook.h>
>  #include <linux/elf.h>
> +#include <linux/isolation.h>
>  
>  #include <asm/compat.h>
>  #include <asm/cpufeature.h>
> @@ -1836,6 +1837,15 @@ int syscall_trace_enter(struct pt_regs *regs)
>  			return -1;
>  	}
>  
> +	/*
> +	 * In task isolation mode, we may prevent the syscall from
> +	 * running, and if so we also deliver a signal to the process.
> +	 */
> +	if (test_thread_flag(TIF_TASK_ISOLATION)) {
> +		if (task_isolation_syscall(regs->syscallno) == -1)

Please use NO_SYSCALL rather than -1 here.

> +			return -1;
> +	}
> +
>  	/* Do the secure computing after ptrace; failures should be fast. */
>  	if (secure_computing() == -1)
>  		return -1;
> diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
> index 339882db5a91..d488c91a4877 100644
> --- a/arch/arm64/kernel/signal.c
> +++ b/arch/arm64/kernel/signal.c
> @@ -20,6 +20,7 @@
>  #include <linux/tracehook.h>
>  #include <linux/ratelimit.h>
>  #include <linux/syscalls.h>
> +#include <linux/isolation.h>
>  
>  #include <asm/daifflags.h>
>  #include <asm/debug-monitors.h>
> @@ -898,6 +899,11 @@ static void do_signal(struct pt_regs *regs)
>  	restore_saved_sigmask();
>  }
>  
> +#define NOTIFY_RESUME_LOOP_FLAGS \
> +	(_TIF_NEED_RESCHED | _TIF_SIGPENDING | \
> +	_TIF_NOTIFY_RESUME | _TIF_FOREIGN_FPSTATE | \
> +	_TIF_UPROBE | _TIF_FSCHECK)
> +
>  asmlinkage void do_notify_resume(struct pt_regs *regs,
>  				 unsigned long thread_flags)
>  {
> @@ -908,6 +914,8 @@ asmlinkage void do_notify_resume(struct pt_regs *regs,
>  	 */
>  	trace_hardirqs_off();
>  
> +	task_isolation_check_run_cleanup();
> +
>  	do {
>  		/* Check valid user FS if needed */
>  		addr_limit_user_check();
> @@ -938,7 +946,10 @@ asmlinkage void do_notify_resume(struct pt_regs *regs,
>  
>  		local_daif_mask();
>  		thread_flags = READ_ONCE(current_thread_info()->flags);
> -	} while (thread_flags & _TIF_WORK_MASK);
> +	} while (thread_flags & NOTIFY_RESUME_LOOP_FLAGS);
> +
> +	if (thread_flags & _TIF_TASK_ISOLATION)
> +		task_isolation_start();
>  }
>  
>  unsigned long __ro_after_init signal_minsigstksz;
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index d4ed9a19d8fe..00f0f77adea0 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -32,6 +32,7 @@
>  #include <linux/irq_work.h>
>  #include <linux/kexec.h>
>  #include <linux/kvm_host.h>
> +#include <linux/isolation.h>
>  
>  #include <asm/alternative.h>
>  #include <asm/atomic.h>
> @@ -818,6 +819,7 @@ void arch_send_call_function_single_ipi(int cpu)
>  #ifdef CONFIG_ARM64_ACPI_PARKING_PROTOCOL
>  void arch_send_wakeup_ipi_mask(const struct cpumask *mask)
>  {
> +	task_isolation_remote_cpumask(mask, "wakeup IPI");
>  	smp_cross_call(mask, IPI_WAKEUP);
>  }
>  #endif
> @@ -886,6 +888,9 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
>  		__inc_irq_stat(cpu, ipi_irqs[ipinr]);
>  	}
>  
> +	task_isolation_interrupt("IPI type %d (%s)", ipinr,
> +				 ipinr < NR_IPI ? ipi_types[ipinr] : "unknown");

When I previously asked about tracing, I was asking about the format
strings, since we don't bother with that kind of thing elsewhere.

What exactly are these hooks used for? I assume the strings are only
there as a debugging aid?

What about other IRQs? Do we need something in the irqchip driver?

If we need to track that /any/ interrupt was received, I think that
would be better to put in the top-level interrupt exception handler than
to sprinkle hooks into every potential handler.

> +
>  	switch (ipinr) {
>  	case IPI_RESCHEDULE:
>  		scheduler_ipi();
> @@ -948,12 +953,14 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
>  
>  void smp_send_reschedule(int cpu)
>  {
> +	task_isolation_remote(cpu, "reschedule IPI");
>  	smp_cross_call(cpumask_of(cpu), IPI_RESCHEDULE);
>  }
>  
>  #ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST
>  void tick_broadcast(const struct cpumask *mask)
>  {
> +	task_isolation_remote_cpumask(mask, "timer IPI");
>  	smp_cross_call(mask, IPI_TIMER);
>  }
>  #endif
> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> index 85566d32958f..fc4b42c81c4f 100644
> --- a/arch/arm64/mm/fault.c
> +++ b/arch/arm64/mm/fault.c
> @@ -23,6 +23,7 @@
>  #include <linux/perf_event.h>
>  #include <linux/preempt.h>
>  #include <linux/hugetlb.h>
> +#include <linux/isolation.h>
>  
>  #include <asm/acpi.h>
>  #include <asm/bug.h>
> @@ -543,6 +544,10 @@ static int __kprobes do_page_fault(unsigned long addr, unsigned int esr,
>  	 */
>  	if (likely(!(fault & (VM_FAULT_ERROR | VM_FAULT_BADMAP |
>  			      VM_FAULT_BADACCESS)))) {
> +		/* No signal was generated, but notify task-isolation tasks. */
> +		if (user_mode(regs))
> +			task_isolation_interrupt("page fault at %#lx", addr);

This isn't an interrupt. Why do we need to hook this?

What about /other/ exceptions caused by userspace?

If we need to notify userspace, it would be much more reliable to do so
in the return path.

Thanks,
Mark.

> +
>  		/*
>  		 * Major/minor page fault accounting is only done
>  		 * once. If we go through a retry, it is extremely
> -- 
> 2.20.1
> 

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v2 03/12] task_isolation: userspace hard isolation from kernel
  2020-03-08  3:47   ` [PATCH v2 03/12] task_isolation: userspace hard isolation from kernel Alex Belits
       [not found]     ` <20200307214254.7a8f6c22@hermes.lan>
@ 2020-03-27  8:42     ` Marta Rybczynska
  2020-04-06  4:31     ` Kevyn-Alexandre Paré
  2020-04-06  4:43     ` Kevyn-Alexandre Paré
  3 siblings, 0 replies; 71+ messages in thread
From: Marta Rybczynska @ 2020-03-27  8:42 UTC (permalink / raw)
  To: Alex Belits
  Cc: frederic, rostedt, mingo, peterz, linux-kernel, Prasun Kapoor,
	tglx, linux-api, catalin.marinas, linux-arm-kernel, netdev,
	davem, linux-arch, will

On Sun, Mar 8, 2020 at 4:48 AM Alex Belits <abelits@marvell.com> wrote:
> +/* Enable task_isolation mode for TASK_ISOLATION kernels. */
> +#define PR_TASK_ISOLATION              48
> +# define PR_TASK_ISOLATION_ENABLE      (1 << 0)
> +# define PR_TASK_ISOLATION_SET_SIG(sig)        (((sig) & 0x7f) << 8)
> +# define PR_TASK_ISOLATION_GET_SIG(bits) (((bits) >> 8) & 0x7f)
> +
Thank you for resurrecting this code!

I have a question on the UAPI: the example code is using
PR_TASK_ISOLATION_USERSIG and it seems to be removed from this
version.

To enable isolation with SIGUSR1 the task should run:
prctl(PR_SET_TASK_ISOLATION, PR_TASK_ISOLATION_ENABLE
    | PR_TASK_ISOLATION_SET_SIG(SIGUSR1), 0, 0, 0);

And to disable:
prctl(PR_SET_TASK_ISOLATION, 0, 0, 0, 0);

Is this correct?
Marta

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v2 10/12] task_isolation: ringbuffer: don't interrupt CPUs running isolated tasks on buffer resize
  2020-03-08  3:55   ` [PATCH v2 10/12] task_isolation: ringbuffer: don't interrupt CPUs running isolated tasks on buffer resize Alex Belits
@ 2020-04-06  4:27     ` Kevyn-Alexandre Paré
  0 siblings, 0 replies; 71+ messages in thread
From: Kevyn-Alexandre Paré @ 2020-04-06  4:27 UTC (permalink / raw)
  To: Alex Belits
  Cc: frederic, rostedt, mingo, peterz, linux-kernel, Prasun Kapoor,
	tglx, linux-api, catalin.marinas, linux-arm-kernel, netdev,
	davem, linux-arch, will

On Sun, Mar 08, 2020 at 03:55:24AM +0000, Alex Belits wrote:
> From: Yuri Norov <ynorov@marvell.com>
> 
> CPUs running isolated tasks are in userspace, so they don't have to
> perform ring buffer updates immediately. If ring_buffer_resize()
> schedules the update on those CPUs, isolation is broken. To prevent
> that, updates for CPUs running isolated tasks are performed locally,
> like for offline CPUs.
> 
> A race condition between this update and isolation breaking is avoided
> at the cost of disabling per_cpu buffer writing for the time of update
> when it coincides with isolation breaking.
> 
> Signed-off-by: Yuri Norov <ynorov@marvell.com>
> [abelits@marvell.com: updated to prevent race with isolation breaking]
> Signed-off-by: Alex Belits <abelits@marvell.com>
> ---
>  kernel/trace/ring_buffer.c | 62 ++++++++++++++++++++++++++++++++++----
>  1 file changed, 56 insertions(+), 6 deletions(-)
> 
> diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> index 61f0e92ace99..593effe40183 100644
> --- a/kernel/trace/ring_buffer.c
> +++ b/kernel/trace/ring_buffer.c
> @@ -21,6 +21,7 @@
>  #include <linux/delay.h>
>  #include <linux/slab.h>
>  #include <linux/init.h>
> +#include <linux/isolation.h>
>  #include <linux/hash.h>
>  #include <linux/list.h>
>  #include <linux/cpu.h>
> @@ -1701,6 +1702,37 @@ static void update_pages_handler(struct work_struct *work)
>  	complete(&cpu_buffer->update_done);
>  }
>  
> +static bool update_if_isolated(struct ring_buffer_per_cpu *cpu_buffer,
> +			       int cpu)
> +{
> +	bool rv = false;
> +
> +	if (task_isolation_on_cpu(cpu)) {
> +		/*
> +		 * CPU is running isolated task. Since it may lose
> +		 * isolation and re-enter kernel simultaneously with
> +		 * this update, disable recording until it's done.
> +		 */
> +		atomic_inc(&cpu_buffer->record_disabled);
> +		/* Make sure, update is done, and isolation state is current */
> +		smp_mb();
> +		if (task_isolation_on_cpu(cpu)) {
> +			/*
> +			 * If CPU is still running isolated task, we
> +			 * can be sure that breaking isolation will
> +			 * happen while recording is disabled, and CPU
> +			 * will not touch this buffer until the update
> +			 * is done.
> +			 */
> +			rb_update_pages(cpu_buffer);
> +			cpu_buffer->nr_pages_to_update = 0;
> +			rv = true;
> +		}
> +		atomic_dec(&cpu_buffer->record_disabled);
> +	}
> +	return rv;
> +}
> +
>  /**
>   * ring_buffer_resize - resize the ring buffer
>   * @buffer: the buffer to resize.
> @@ -1784,13 +1816,22 @@ int ring_buffer_resize(struct trace_buffer *buffer, unsigned long size,
>  			if (!cpu_buffer->nr_pages_to_update)
>  				continue;
>  
> -			/* Can't run something on an offline CPU. */
> +			/*
> +			 * Can't run something on an offline CPU.
> +			 *
> +			 * CPUs running isolated tasks don't have to
> +			 * update ring buffers until they exit
> +			 * isolation because they are in
> +			 * userspace. Use the procedure that prevents
> +			 * race condition with isolation breaking.
> +			 */
>  			if (!cpu_online(cpu)) {
>  				rb_update_pages(cpu_buffer);
>  				cpu_buffer->nr_pages_to_update = 0;
>  			} else {
> -				schedule_work_on(cpu,
> -						&cpu_buffer->update_pages_work);
> +				if (!update_if_isolated(cpu_buffer, cpu))
> +					schedule_work_on(cpu,
> +					&cpu_buffer->update_pages_work);
>  			}
>  		}
>  
> @@ -1829,13 +1870,22 @@ int ring_buffer_resize(struct trace_buffer *buffer, unsigned long size,
>  
>  		get_online_cpus();
>  
> -		/* Can't run something on an offline CPU. */
> +		/*
> +		 * Can't run something on an offline CPU.
> +		 *
> +		 * CPUs running isolated tasks don't have to update
> +		 * ring buffers until they exit isolation because they
> +		 * are in userspace. Use the procedure that prevents
> +		 * race condition with isolation breaking.
> +		 */
>  		if (!cpu_online(cpu_id))
>  			rb_update_pages(cpu_buffer);
>  		else {
> -			schedule_work_on(cpu_id,
> +			if (!update_if_isolated(cpu_buffer, cpu_id))
> +				schedule_work_on(cpu_id,
>  					 &cpu_buffer->update_pages_work);
> -			wait_for_completion(&cpu_buffer->update_done);
> +				wait_for_completion(&cpu_buffer->update_done);
> +			}
>  		}
>  
>  		cpu_buffer->nr_pages_to_update = 0;

gcc output:

kernel/trace/ring_buffer.c: In function 'ring_buffer_resize':
kernel/trace/ring_buffer.c:1884:4: warning: this 'if' clause does not guard... [-Wmisleading-indentation]
    if (!update_if_isolated(cpu_buffer, cpu_id))
    ^~
kernel/trace/ring_buffer.c:1887:5: note: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'
     wait_for_completion(&cpu_buffer->update_done);
     ^~~~~~~~~~~~~~~~~~~
kernel/trace/ring_buffer.c:1858:4: error: label 'out' used but not defined
    goto out;
    ^~~~
kernel/trace/ring_buffer.c:1868:4: error: label 'out_err' used but not defined
    goto out_err;
    ^~~~

My fix:

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 593effe40183..8b458400ac31 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -1881,9 +1881,8 @@ int ring_buffer_resize(struct trace_buffer *buffer, unsigned long size,
                if (!cpu_online(cpu_id))
                        rb_update_pages(cpu_buffer);
                else {
-                       if (!update_if_isolated(cpu_buffer, cpu_id))
-                               schedule_work_on(cpu_id,
-                                        &cpu_buffer->update_pages_work);
+                       if (!update_if_isolated(cpu_buffer, cpu_id)) {
+                               schedule_work_on(cpu_id, &cpu_buffer->update_pages_work);
                                wait_for_completion(&cpu_buffer->update_done);
                        }
                }


thx,

-- Kevyn-Alexandre Paré 

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* Re: [PATCH v2 03/12] task_isolation: userspace hard isolation from kernel
  2020-03-08  3:47   ` [PATCH v2 03/12] task_isolation: userspace hard isolation from kernel Alex Belits
       [not found]     ` <20200307214254.7a8f6c22@hermes.lan>
  2020-03-27  8:42     ` Marta Rybczynska
@ 2020-04-06  4:31     ` Kevyn-Alexandre Paré
  2020-04-06  4:43     ` Kevyn-Alexandre Paré
  3 siblings, 0 replies; 71+ messages in thread
From: Kevyn-Alexandre Paré @ 2020-04-06  4:31 UTC (permalink / raw)
  To: Alex Belits
  Cc: frederic, rostedt, mingo, peterz, linux-kernel, Prasun Kapoor,
	tglx, linux-api, catalin.marinas, linux-arm-kernel, netdev,
	davem, linux-arch, will

On Sun, Mar 08, 2020 at 03:47:08AM +0000, Alex Belits wrote:
> The existing nohz_full mode is designed as a "soft" isolation mode
> that makes tradeoffs to minimize userspace interruptions while
> still attempting to avoid overheads in the kernel entry/exit path,
> to provide 100% kernel semantics, etc.
> 
> However, some applications require a "hard" commitment from the
> kernel to avoid interruptions, in particular userspace device driver
> style applications, such as high-speed networking code.
> 
> This change introduces a framework to allow applications
> to elect to have the "hard" semantics as needed, specifying
> prctl(PR_TASK_ISOLATION, PR_TASK_ISOLATION_ENABLE) to do so.
> 
> The kernel must be built with the new TASK_ISOLATION Kconfig flag
> to enable this mode, and the kernel booted with an appropriate
> "isolcpus=nohz,domain,CPULIST" boot argument to enable
> nohz_full and isolcpus. The "task_isolation" state is then indicated
> by setting a new task struct field, task_isolation_flag, to the
> value passed by prctl(), and also setting a TIF_TASK_ISOLATION
> bit in the thread_info flags. When the kernel is returning to
> userspace from the prctl() call and sees TIF_TASK_ISOLATION set,
> it calls the new task_isolation_start() routine to arrange for
> the task to avoid being interrupted in the future.
> 
> With interrupts disabled, task_isolation_start() ensures that kernel
> subsystems that might cause a future interrupt are quiesced. If it
> doesn't succeed, it adjusts the syscall return value to indicate that
> fact, and userspace can retry as desired. In addition to stopping
> the scheduler tick, the code takes any actions that might avoid
> a future interrupt to the core, such as a worker thread being
> scheduled that could be quiesced now (e.g. the vmstat worker)
> or a future IPI to the core to clean up some state that could be
> cleaned up now (e.g. the mm lru per-cpu cache).
> 
> Once the task has returned to userspace after issuing the prctl(),
> if it enters the kernel again via system call, page fault, or any
> other exception or irq, the kernel will kill it with SIGKILL.
> In addition to sending a signal, the code supports a kernel
> command-line "task_isolation_debug" flag which causes a stack
> backtrace to be generated whenever a task loses isolation.
> 
> To allow the state to be entered and exited, the syscall checking
> test ignores the prctl(PR_TASK_ISOLATION) syscall so that we can
> clear the bit again later, and ignores exit/exit_group to allow
> exiting the task without a pointless signal being delivered.
> 
> The prctl() API allows for specifying a signal number to use instead
> of the default SIGKILL, to allow for catching the notification
> signal; for example, in a production environment, it might be
> helpful to log information to the application logging mechanism
> before exiting. Or, the signal handler might choose to reset the
> program counter back to the code segment intended to be run isolated
> via prctl() to continue execution.
> 
> In a number of cases we can tell on a remote cpu that we are
> going to be interrupting the cpu, e.g. via an IPI or a TLB flush.
> In that case we generate the diagnostic (and optional stack dump)
> on the remote core to be able to deliver better diagnostics.
> If the interrupt is not something caught by Linux (e.g. a
> hypervisor interrupt) we can also request a reschedule IPI to
> be sent to the remote core so it can be sure to generate a
> signal to notify the process.
> 
> Separate patches that follow provide these changes for x86, arm,
> and arm64.
> 
> Signed-off-by: Alex Belits <abelits@marvell.com>
> ---
>  .../admin-guide/kernel-parameters.txt         |   6 +
>  include/linux/hrtimer.h                       |   4 +
>  include/linux/isolation.h                     | 229 ++++++
>  include/linux/sched.h                         |   4 +
>  include/linux/tick.h                          |   3 +
>  include/uapi/linux/prctl.h                    |   6 +
>  init/Kconfig                                  |  28 +
>  kernel/Makefile                               |   2 +
>  kernel/context_tracking.c                     |   2 +
>  kernel/isolation.c                            | 774 ++++++++++++++++++
>  kernel/signal.c                               |   2 +
>  kernel/sys.c                                  |   6 +
>  kernel/time/hrtimer.c                         |  27 +
>  kernel/time/tick-sched.c                      |  18 +
>  14 files changed, 1111 insertions(+)
>  create mode 100644 include/linux/isolation.h
>  create mode 100644 kernel/isolation.c
> 
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index c07815d230bc..e4a2d6e37645 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -4808,6 +4808,12 @@
>  			neutralize any effect of /proc/sys/kernel/sysrq.
>  			Useful for debugging.
>  
> +	task_isolation_debug	[KNL]
> +			In kernels built with CONFIG_TASK_ISOLATION, this
> +			setting will generate console backtraces to
> +			accompany the diagnostics generated about
> +			interrupting tasks running with task isolation.
> +
>  	tcpmhash_entries= [KNL,NET]
>  			Set the number of tcp_metrics_hash slots.
>  			Default value is 8192 or 16384 depending on total
> diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
> index 15c8ac313678..e81252eb4f92 100644
> --- a/include/linux/hrtimer.h
> +++ b/include/linux/hrtimer.h
> @@ -528,6 +528,10 @@ extern void __init hrtimers_init(void);
>  /* Show pending timers: */
>  extern void sysrq_timer_list_show(void);
>  
> +#ifdef CONFIG_TASK_ISOLATION
> +extern void kick_hrtimer(void);
> +#endif
> +
>  int hrtimers_prepare_cpu(unsigned int cpu);
>  #ifdef CONFIG_HOTPLUG_CPU
>  int hrtimers_dead_cpu(unsigned int cpu);
> diff --git a/include/linux/isolation.h b/include/linux/isolation.h
> new file mode 100644
> index 000000000000..6bd71c67f10f
> --- /dev/null
> +++ b/include/linux/isolation.h
> @@ -0,0 +1,229 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Task isolation support
> + *
> + * Authors:
> + *   Chris Metcalf <cmetcalf@mellanox.com>
> + *   Alex Belits <abelits@marvell.com>
> + *   Yuri Norov <ynorov@marvell.com>
> + */
> +#ifndef _LINUX_ISOLATION_H
> +#define _LINUX_ISOLATION_H
> +
> +#include <stdarg.h>
> +#include <linux/errno.h>
> +#include <linux/cpumask.h>
> +#include <linux/prctl.h>
> +#include <linux/types.h>
> +
> +struct task_struct;
> +
> +#ifdef CONFIG_TASK_ISOLATION
> +
> +int task_isolation_message(int cpu, int level, bool supp, const char *fmt, ...);
> +
> +#define pr_task_isol_emerg(cpu, fmt, ...)			\
> +	task_isolation_message(cpu, LOGLEVEL_EMERG, false, fmt, ##__VA_ARGS__)
> +#define pr_task_isol_alert(cpu, fmt, ...)			\
> +	task_isolation_message(cpu, LOGLEVEL_ALERT, false, fmt, ##__VA_ARGS__)
> +#define pr_task_isol_crit(cpu, fmt, ...)			\
> +	task_isolation_message(cpu, LOGLEVEL_CRIT, false, fmt, ##__VA_ARGS__)
> +#define pr_task_isol_err(cpu, fmt, ...)				\
> +	task_isolation_message(cpu, LOGLEVEL_ERR, false, fmt, ##__VA_ARGS__)
> +#define pr_task_isol_warn(cpu, fmt, ...)			\
> +	task_isolation_message(cpu, LOGLEVEL_WARNING, false, fmt, ##__VA_ARGS__)
> +#define pr_task_isol_notice(cpu, fmt, ...)			\
> +	task_isolation_message(cpu, LOGLEVEL_NOTICE, false, fmt, ##__VA_ARGS__)
> +#define pr_task_isol_info(cpu, fmt, ...)			\
> +	task_isolation_message(cpu, LOGLEVEL_INFO, false, fmt, ##__VA_ARGS__)
> +#define pr_task_isol_debug(cpu, fmt, ...)			\
> +	task_isolation_message(cpu, LOGLEVEL_DEBUG, false, fmt, ##__VA_ARGS__)
> +
> +#define pr_task_isol_emerg_supp(cpu, fmt, ...)			\
> +	task_isolation_message(cpu, LOGLEVEL_EMERG, true, fmt, ##__VA_ARGS__)
> +#define pr_task_isol_alert_supp(cpu, fmt, ...)			\
> +	task_isolation_message(cpu, LOGLEVEL_ALERT, true, fmt, ##__VA_ARGS__)
> +#define pr_task_isol_crit_supp(cpu, fmt, ...)			\
> +	task_isolation_message(cpu, LOGLEVEL_CRIT, true, fmt, ##__VA_ARGS__)
> +#define pr_task_isol_err_supp(cpu, fmt, ...)				\
> +	task_isolation_message(cpu, LOGLEVEL_ERR, true, fmt, ##__VA_ARGS__)
> +#define pr_task_isol_warn_supp(cpu, fmt, ...)			\
> +	task_isolation_message(cpu, LOGLEVEL_WARNING, true, fmt, ##__VA_ARGS__)
> +#define pr_task_isol_notice_supp(cpu, fmt, ...)			\
> +	task_isolation_message(cpu, LOGLEVEL_NOTICE, true, fmt, ##__VA_ARGS__)
> +#define pr_task_isol_info_supp(cpu, fmt, ...)			\
> +	task_isolation_message(cpu, LOGLEVEL_INFO, true, fmt, ##__VA_ARGS__)
> +#define pr_task_isol_debug_supp(cpu, fmt, ...)			\
> +	task_isolation_message(cpu, LOGLEVEL_DEBUG, true, fmt, ##__VA_ARGS__)
> +DECLARE_PER_CPU(unsigned long, tsk_thread_flags_copy);

gcc output:

In file included from ./arch/x86/include/asm/apic.h:6,
                 from arch/x86/kernel/apic/apic_noop.c:14:
./include/linux/isolation.h:58:32: error: unknown type name 'tsk_thread_flags_copy'
 DECLARE_PER_CPU(unsigned long, tsk_thread_flags_copy);
                                ^~~~~~~~~~~~~~~~~~~~~

My fix:

diff --git a/include/linux/isolation.h b/include/linux/isolation.h
index 6bd71c67f10f..a392abed304b 100644
--- a/include/linux/isolation.h
+++ b/include/linux/isolation.h
@@ -55,7 +55,7 @@ int task_isolation_message(int cpu, int level, bool supp, const char *fmt, ...);
        task_isolation_message(cpu, LOGLEVEL_INFO, true, fmt, ##__VA_ARGS__)
 #define pr_task_isol_debug_supp(cpu, fmt, ...)                 \
        task_isolation_message(cpu, LOGLEVEL_DEBUG, true, fmt, ##__VA_ARGS__)
-DECLARE_PER_CPU(unsigned long, tsk_thread_flags_copy);
+//DECLARE_PER_CPU(unsigned long, tsk_thread_flags_copy);
 extern cpumask_var_t task_isolation_map;
 
 /**

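An aside on that same line: the declared name also does not match what
kernel/isolation.c actually defines -- the .c file has
DEFINE_PER_CPU(unsigned long, tsk_thread_flags_cache) and only uses it from
the out-of-line task_isolation_on_cpu(), so nothing outside isolation.c seems
to need the declaration at all. If the declaration is meant to stay visible,
a minimal alternative sketch (assuming tsk_thread_flags_cache is the intended
name) would be to rename it instead of commenting it out:

-DECLARE_PER_CPU(unsigned long, tsk_thread_flags_copy);
+DECLARE_PER_CPU(unsigned long, tsk_thread_flags_cache);
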
> +extern cpumask_var_t task_isolation_map;
> +
> +/**
> + * task_isolation_request() - prctl hook to request task isolation
> + * @flags:	Flags from <linux/prctl.h> PR_TASK_ISOLATION_xxx.
> + *
> + * This is called from the generic prctl() code for PR_TASK_ISOLATION.
> + *
> + * Return: 0 when task isolation is enabled, otherwise a negative
> + * errno.
> + */
> +extern int task_isolation_request(unsigned int flags);
> +extern void task_isolation_cpu_cleanup(void);
> +/**
> + * task_isolation_start() - attempt to actually start task isolation
> + *
> + * This function should be invoked as the last thing prior to returning to
> + * user space if TIF_TASK_ISOLATION is set in the thread_info flags.  It
> + * will attempt to quiesce the core and enter task-isolation mode.  If it
> + * fails, it will reset the system call return value to an error code that
> + * indicates the failure mode.
> + */
> +extern void task_isolation_start(void);
> +
> +/**
> + * is_isolation_cpu() - check if CPU is intended for running isolated tasks.
> + * @cpu:	CPU to check.
> + */
> +static inline bool is_isolation_cpu(int cpu)
> +{
> +	return task_isolation_map != NULL &&
> +		cpumask_test_cpu(cpu, task_isolation_map);
> +}
> +
> +/**
> + * task_isolation_on_cpu() - check if the cpu is running isolated task
> + * @cpu:	CPU to check.
> + */
> +extern int task_isolation_on_cpu(int cpu);
> +extern void task_isolation_check_run_cleanup(void);
> +
> +/**
> + * task_isolation_cpumask() - set CPUs currently running isolated tasks
> + * @mask:	Mask to modify.
> + */
> +extern void task_isolation_cpumask(struct cpumask *mask);
> +
> +/**
> + * task_isolation_clear_cpumask() - clear CPUs currently running isolated tasks
> + * @mask:      Mask to modify.
> + */
> +extern void task_isolation_clear_cpumask(struct cpumask *mask);
> +
> +/**
> + * task_isolation_syscall() - report a syscall from an isolated task
> + * @nr:		The syscall number.
> + *
> + * This routine should be invoked at syscall entry if TIF_TASK_ISOLATION is
> + * set in the thread_info flags.  It checks for valid syscalls,
> + * specifically prctl() with PR_TASK_ISOLATION, exit(), and exit_group().
> + * For any other syscall it will raise a signal and return failure.
> + *
> + * Return: 0 for acceptable syscalls, -1 for all others.
> + */
> +extern int task_isolation_syscall(int nr);
> +
> +/**
> + * _task_isolation_interrupt() - report an interrupt of an isolated task
> + * @fmt:	A format string describing the interrupt
> + * @...:	Format arguments, if any.
> + *
> + * This routine should be invoked at any exception or IRQ if
> + * TIF_TASK_ISOLATION is set in the thread_info flags.  It is not necessary
> + * to invoke it if the exception will generate a signal anyway (e.g. a bad
> + * page fault), and in that case it is preferable not to invoke it but just
> + * rely on the standard Linux signal.  The macro task_isolation_interrupt()
> + * wraps the TIF_TASK_ISOLATION flag test to simplify the caller code.
> + */
> +extern void _task_isolation_interrupt(const char *fmt, ...);
> +#define task_isolation_interrupt(fmt, ...)				\
> +	do {								\
> +		if (current_thread_info()->flags & _TIF_TASK_ISOLATION)	\
> +			_task_isolation_interrupt(fmt, ## __VA_ARGS__);	\
> +	} while (0)
> +
> +/**
> + * task_isolation_remote() - report a remote interrupt of an isolated task
> + * @cpu:	The remote cpu that is about to be interrupted.
> + * @fmt:	A format string describing the interrupt
> + * @...:	Format arguments, if any.
> + *
> + * This routine should be invoked any time a remote IPI or other type of
> + * interrupt is being delivered to another cpu. The function will check to
> + * see if the target core is running a task-isolation task, and generate a
> + * diagnostic on the console if so; in addition, we tag the task so it
> + * doesn't generate another diagnostic when the interrupt actually arrives.
> + * Generating a diagnostic remotely yields a clearer indication of what
> + * happened than just reporting only when the remote core is interrupted.
> + *
> + */
> +extern void task_isolation_remote(int cpu, const char *fmt, ...);
> +
> +/**
> + * task_isolation_remote_cpumask() - report interruption of multiple cpus
> + * @mask:	The set of remote cpus that are about to be interrupted.
> + * @fmt:	A format string describing the interrupt
> + * @...:	Format arguments, if any.
> + *
> + * This is the cpumask variant of _task_isolation_remote().  We
> + * generate a single-line diagnostic message even if multiple remote
> + * task-isolation cpus are being interrupted.
> + */
> +extern void task_isolation_remote_cpumask(const struct cpumask *mask,
> +					  const char *fmt, ...);
> +
> +/**
> + * _task_isolation_signal() - disable task isolation when signal is pending
> + * @task:	The task for which to disable isolation.
> + *
> + * This function generates a diagnostic and disables task isolation; it
> + * should be called if TIF_TASK_ISOLATION is set when notifying a task of a
> + * pending signal.  The task_isolation_interrupt() function normally
> + * generates a diagnostic for events that just interrupt a task without
> + * generating a signal; here we need to hook the paths that correspond to
> + * interrupts that do generate a signal.  The macro task_isolation_signal()
> + * wraps the TIF_TASK_ISOLATION flag test to simplify the caller code.
> + */
> +extern void _task_isolation_signal(struct task_struct *task);
> +#define task_isolation_signal(task)					\
> +	do {								\
> +		if (task_thread_info(task)->flags & _TIF_TASK_ISOLATION) \
> +			_task_isolation_signal(task);			\
> +	} while (0)
> +
> +/**
> + * task_isolation_user_exit() - debug all user_exit calls
> + *
> + * By default, we don't generate an exception in the low-level user_exit()
> + * code, because programs lose the ability to disable task isolation: the
> + * user_exit() hook will cause a signal prior to task_isolation_syscall()
> + * disabling task isolation.  In addition, it means that we lose all the
> + * diagnostic info otherwise available from task_isolation_interrupt() hooks
> + * later in the interrupt-handling process.  But you may enable it here for
> + * a special kernel build if you are having undiagnosed userspace jitter.
> + */
> +static inline void task_isolation_user_exit(void)
> +{
> +#ifdef DEBUG_TASK_ISOLATION
> +	task_isolation_interrupt("user_exit");
> +#endif
> +}
> +
> +#else /* !CONFIG_TASK_ISOLATION */
> +static inline int task_isolation_request(unsigned int flags) { return -EINVAL; }
> +static inline void task_isolation_start(void) { }
> +static inline bool is_isolation_cpu(int cpu) { return 0; }
> +static inline int task_isolation_on_cpu(int cpu) { return 0; }
> +static inline void task_isolation_cpumask(struct cpumask *mask) { }
> +static inline void task_isolation_clear_cpumask(struct cpumask *mask) { }
> +static inline void task_isolation_cpu_cleanup(void) { }
> +static inline void task_isolation_check_run_cleanup(void) { }
> +static inline int task_isolation_syscall(int nr) { return 0; }
> +static inline void task_isolation_interrupt(const char *fmt, ...) { }
> +static inline void task_isolation_remote(int cpu, const char *fmt, ...) { }
> +static inline void task_isolation_remote_cpumask(const struct cpumask *mask,
> +						 const char *fmt, ...) { }
> +static inline void task_isolation_signal(struct task_struct *task) { }
> +static inline void task_isolation_user_exit(void) { }
> +#endif
> +
> +#endif /* _LINUX_ISOLATION_H */
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 04278493bf15..52fdb32aa3b9 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1280,6 +1280,10 @@ struct task_struct {
>  	unsigned long			lowest_stack;
>  	unsigned long			prev_lowest_stack;
>  #endif
> +#ifdef CONFIG_TASK_ISOLATION
> +	unsigned short			task_isolation_flags;  /* prctl */
> +	unsigned short			task_isolation_state;
> +#endif
>  
>  	/*
>  	 * New fields for task_struct should be added above here, so that
> diff --git a/include/linux/tick.h b/include/linux/tick.h
> index 7340613c7eff..27c7c033d5a8 100644
> --- a/include/linux/tick.h
> +++ b/include/linux/tick.h
> @@ -268,6 +268,9 @@ static inline void tick_dep_clear_signal(struct signal_struct *signal,
>  extern void tick_nohz_full_kick_cpu(int cpu);
>  extern void __tick_nohz_task_switch(void);
>  extern void __init tick_nohz_full_setup(cpumask_var_t cpumask);
> +#ifdef CONFIG_TASK_ISOLATION
> +extern int try_stop_full_tick(void);
> +#endif
>  #else
>  static inline bool tick_nohz_full_enabled(void) { return false; }
>  static inline bool tick_nohz_full_cpu(int cpu) { return false; }
> diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
> index 07b4f8131e36..f4848ed2a069 100644
> --- a/include/uapi/linux/prctl.h
> +++ b/include/uapi/linux/prctl.h
> @@ -238,4 +238,10 @@ struct prctl_mm_map {
>  #define PR_SET_IO_FLUSHER		57
>  #define PR_GET_IO_FLUSHER		58
>  
> +/* Enable task_isolation mode for TASK_ISOLATION kernels. */
> +#define PR_TASK_ISOLATION		48
> +# define PR_TASK_ISOLATION_ENABLE	(1 << 0)
> +# define PR_TASK_ISOLATION_SET_SIG(sig)	(((sig) & 0x7f) << 8)
> +# define PR_TASK_ISOLATION_GET_SIG(bits) (((bits) >> 8) & 0x7f)
> +
>  #endif /* _LINUX_PRCTL_H */
> diff --git a/init/Kconfig b/init/Kconfig
> index 20a6ac33761c..ecdf567f6bd4 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -576,6 +576,34 @@ config CPU_ISOLATION
>  
>  source "kernel/rcu/Kconfig"
>  
> +config HAVE_ARCH_TASK_ISOLATION
> +	bool
> +
> +config TASK_ISOLATION
> +	bool "Provide hard CPU isolation from the kernel on demand"
> +	depends on NO_HZ_FULL && HAVE_ARCH_TASK_ISOLATION
> +	help
> +
> +	Allow userspace processes that place themselves on cores with
> +	nohz_full and isolcpus enabled, and run prctl(PR_TASK_ISOLATION),
> +	to "isolate" themselves from the kernel.  Prior to returning to
> +	userspace, isolated tasks will arrange that no future kernel
> +	activity will interrupt the task while the task is running in
> +	userspace.  Attempting to re-enter the kernel while in this mode
> +	will cause the task to be terminated with a signal; you must
> +	explicitly use prctl() to disable task isolation before resuming
> +	normal use of the kernel.
> +
> +	This "hard" isolation from the kernel is required for userspace
> +	tasks that run hard real-time workloads, such as a high-speed
> +	network driver in userspace.  Without this option, but
> +	with NO_HZ_FULL enabled, the kernel will make a best-effort, "soft"
> +	attempt to shield a single userspace process from interrupts, but
> +	makes no guarantees.
> +
> +	You should say "N" unless you are intending to run a
> +	high-performance userspace driver or similar task.
> +
>  config BUILD_BIN2C
>  	bool
>  	default n
> diff --git a/kernel/Makefile b/kernel/Makefile
> index 4cb4130ced32..2f2ae91f90d5 100644
> --- a/kernel/Makefile
> +++ b/kernel/Makefile
> @@ -122,6 +122,8 @@ obj-$(CONFIG_GCC_PLUGIN_STACKLEAK) += stackleak.o
>  KASAN_SANITIZE_stackleak.o := n
>  KCOV_INSTRUMENT_stackleak.o := n
>  
> +obj-$(CONFIG_TASK_ISOLATION) += isolation.o
> +
>  $(obj)/configs.o: $(obj)/config_data.gz
>  
>  targets += config_data.gz
> diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
> index 0296b4bda8f1..e9206736f219 100644
> --- a/kernel/context_tracking.c
> +++ b/kernel/context_tracking.c
> @@ -21,6 +21,7 @@
>  #include <linux/hardirq.h>
>  #include <linux/export.h>
>  #include <linux/kprobes.h>
> +#include <linux/isolation.h>
>  
>  #define CREATE_TRACE_POINTS
>  #include <trace/events/context_tracking.h>
> @@ -157,6 +158,7 @@ void __context_tracking_exit(enum ctx_state state)
>  			if (state == CONTEXT_USER) {
>  				vtime_user_exit(current);
>  				trace_user_exit(0);
> +				task_isolation_user_exit();
>  			}
>  		}
>  		__this_cpu_write(context_tracking.state, CONTEXT_KERNEL);
> diff --git a/kernel/isolation.c b/kernel/isolation.c
> new file mode 100644
> index 000000000000..ae29732c376c
> --- /dev/null
> +++ b/kernel/isolation.c
> @@ -0,0 +1,774 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + *  linux/kernel/isolation.c
> + *
> + *  Implementation of task isolation.
> + *
> + * Authors:
> + *   Chris Metcalf <cmetcalf@mellanox.com>
> + *   Alex Belits <abelits@marvell.com>
> + *   Yuri Norov <ynorov@marvell.com>
> + */
> +
> +#include <linux/mm.h>
> +#include <linux/swap.h>
> +#include <linux/vmstat.h>
> +#include <linux/sched.h>
> +#include <linux/isolation.h>
> +#include <linux/syscalls.h>
> +#include <linux/smp.h>
> +#include <linux/tick.h>
> +#include <asm/unistd.h>
> +#include <asm/syscall.h>
> +#include <linux/hrtimer.h>
> +
> +/*
> + * These values are stored in task_isolation_state.
> + * Note that STATE_NORMAL + TIF_TASK_ISOLATION means we are still
> + * returning from sys_prctl() to userspace.
> + */
> +enum {
> +	STATE_NORMAL = 0,	/* Not isolated */
> +	STATE_ISOLATED = 1	/* In userspace, isolated */
> +};
> +
> +/*
> + * This variable contains thread flags copied at the moment
> + * when schedule() switched to the task on a given CPU,
> + * or 0 if no task is running.
> + */
> +DEFINE_PER_CPU(unsigned long, tsk_thread_flags_cache);
> +
> +/*
> + * Counter for isolation state on a given CPU, increments when entering
> + * isolation and decrements when exiting isolation (before or after the
> + * cleanup). Multiple simultaneously running procedures entering or
> + * exiting isolation are prevented by checking the result of
> + * incrementing or decrementing this variable. This variable is both
> + * incremented and decremented by the CPU that caused the isolation entry
> + * or exit.
> + *
> + * This is necessary because multiple isolation-breaking events may happen
> + * at once (or one as the result of the other), however isolation exit
> + * may only happen once to transition from isolated to non-isolated state.
> + * Therefore, if decrementing this counter results in a value less than 0,
> + * isolation exit procedure can't be started -- it already happened, or is
> + * in progress, or isolation is not entered yet.
> + */
> +DEFINE_PER_CPU(atomic_t, isol_counter);
> +
> +/*
> + * Description of the last two tasks that ran isolated on a given CPU.
> + * This is intended only for messages about isolation breaking. We
> + * don't want any references to the actual task while accessing this from
> + * the CPU that caused isolation breaking -- we know nothing about timing
> + * and don't want to use locking or RCU.
> + */
> +struct isol_task_desc {
> +	atomic_t curr_index;
> +	atomic_t curr_index_wr;
> +	bool	warned[2];
> +	pid_t	pid[2];
> +	pid_t	tgid[2];
> +	char	comm[2][TASK_COMM_LEN];
> +};
> +static DEFINE_PER_CPU(struct isol_task_desc, isol_task_descs);
> +
> +/*
> + * Counter for isolation exiting procedures (from request to the start of
> + * cleanup) being attempted at once on a CPU. Normally incrementing of
> + * this counter is performed from the CPU that caused isolation breaking,
> + * however decrementing is done from the cleanup procedure, delegated to
> + * the CPU that is exiting isolation, not from the CPU that caused isolation
> + * breaking.
> + *
> + * If incrementing this counter while starting the isolation exit procedure
> + * results in a value greater than 0, isolation exit is already in progress
> + * and cleanup has not started yet. This means the counter should be
> + * decremented back, and the isolation exit that is already in progress
> + * should be allowed to complete. Otherwise, a new isolation exit
> + * procedure should be started.
> + */
> +DEFINE_PER_CPU(atomic_t, isol_exit_counter);
> +
> +/*
> + * Descriptor for isolation-breaking SMP calls
> + */
> +DEFINE_PER_CPU(call_single_data_t, isol_break_csd);
> +
> +cpumask_var_t task_isolation_map;
> +cpumask_var_t task_isolation_cleanup_map;
> +static DEFINE_SPINLOCK(task_isolation_cleanup_lock);
> +
> +/* We can run on cpus that are isolated from the scheduler and are nohz_full. */
> +static int __init task_isolation_init(void)
> +{
> +	alloc_bootmem_cpumask_var(&task_isolation_cleanup_map);
> +	if (alloc_cpumask_var(&task_isolation_map, GFP_KERNEL))
> +		/*
> +		 * At this point task isolation should match
> +		 * nohz_full. This may change in the future.
> +		 */
> +		cpumask_copy(task_isolation_map, tick_nohz_full_mask);
> +	return 0;
> +}
> +core_initcall(task_isolation_init)
> +
> +/* Enable stack backtraces of any interrupts of task_isolation cores. */
> +static bool task_isolation_debug;
> +static int __init task_isolation_debug_func(char *str)
> +{
> +	task_isolation_debug = true;
> +	return 1;
> +}
> +__setup("task_isolation_debug", task_isolation_debug_func);
> +
> +/*
> + * Record name, pid and group pid of the task entering isolation on
> + * the current CPU.
> + */
> +static void record_curr_isolated_task(void)
> +{
> +	int ind;
> +	int cpu = smp_processor_id();
> +	struct isol_task_desc *desc = &per_cpu(isol_task_descs, cpu);
> +	struct task_struct *task = current;
> +
> +	/* Finish everything before recording current task */
> +	smp_mb();
> +	ind = atomic_inc_return(&desc->curr_index_wr) & 1;
> +	desc->comm[ind][sizeof(task->comm) - 1] = '\0';
> +	memcpy(desc->comm[ind], task->comm, sizeof(task->comm) - 1);
> +	desc->pid[ind] = task->pid;
> +	desc->tgid[ind] = task->tgid;
> +	desc->warned[ind] = false;
> +	/* Write everything, to be seen by other CPUs */
> +	smp_mb();
> +	atomic_inc(&desc->curr_index);
> +	/* Everyone will see the new record from this point */
> +	smp_mb();
> +}
> +
> +/*
> + * Print message prefixed with the description of the current (or
> + * last) isolated task on a given CPU. Intended for isolation breaking
> + * messages that include target task for the user's convenience.
> + *
> + * Messages produced with this function may have obsolete task
> + * information if isolated tasks managed to exit, start and enter
> + * isolation multiple times, or multiple tasks tried to enter
> + * isolation on the same CPU at once. For those unusual cases it would
> + * contain a valid description of the cause for isolation breaking and
> + * target CPU number, just not the correct description of which task
> + * ended up losing isolation.
> + */
> +int task_isolation_message(int cpu, int level, bool supp, const char *fmt, ...)
> +{
> +	struct isol_task_desc *desc;
> +	struct task_struct *task;
> +	va_list args;
> +	char buf_prefix[TASK_COMM_LEN + 20 + 3 * 20];
> +	char buf[200];
> +	int curr_cpu, ind_counter, ind_counter_old, ind;
> +
> +	curr_cpu = get_cpu();
> +	desc = &per_cpu(isol_task_descs, cpu);
> +	ind_counter = atomic_read(&desc->curr_index);
> +
> +	if (curr_cpu == cpu) {
> +		/*
> +		 * Message is for the current CPU so current
> +		 * task_struct should be used instead of cached
> +		 * information.
> +		 *
> +		 * Like in other diagnostic messages, if issued from
> +		 * interrupt context, current will be the interrupted
> +		 * task. Unlike other diagnostic messages, this is
> +		 * always relevant because the message is about
> +		 * interrupting a task.
> +		 */
> +		ind = ind_counter & 1;
> +		if (supp && desc->warned[ind]) {
> +			/*
> +			 * If supp is true, skip the message if the
> +			 * same task was mentioned in the message
> +			 * originated on remote CPU, and it did not
> +			 * re-enter isolated state since then (warned
> +			 * is true). Only local messages following
> +			 * remote messages, likely about the same
> +			 * isolation breaking event, are skipped to
> +			 * avoid duplication. If remote cause is
> +			 * immediately followed by a local one before
> +			 * isolation is broken, local cause is skipped
> +			 * from messages.
> +			 */
> +			put_cpu();
> +			return 0;
> +		}
> +		task = current;
> +		snprintf(buf_prefix, sizeof(buf_prefix),
> +			 "isolation %s/%d/%d (cpu %d)",
> +			 task->comm, task->tgid, task->pid, cpu);
> +		put_cpu();
> +	} else {
> +		/*
> +		 * Message is for remote CPU, use cached information.
> +		 */
> +		put_cpu();
> +		/*
> +		 * Make sure the index remained unchanged while the data was
> +		 * copied. If it changed, data that was copied may be
> +		 * inconsistent because two updates in a sequence could
> +		 * overwrite the data while it was being read.
> +		 */
> +		do {
> +			/* Make sure we are reading up to date values */
> +			smp_mb();
> +			ind = ind_counter & 1;
> +			snprintf(buf_prefix, sizeof(buf_prefix),
> +				 "isolation %s/%d/%d (cpu %d)",
> +				 desc->comm[ind], desc->tgid[ind],
> +				 desc->pid[ind], cpu);
> +			desc->warned[ind] = true;
> +			ind_counter_old = ind_counter;
> +			/* Record the warned flag, then re-read descriptor */
> +			smp_mb();
> +			ind_counter = atomic_read(&desc->curr_index);
> +			/*
> +			 * If the counter changed, something was updated, so
> +			 * repeat everything to get the current data
> +			 */
> +		} while (ind_counter != ind_counter_old);
> +	}
> +
> +	va_start(args, fmt);
> +	vsnprintf(buf, sizeof(buf), fmt, args);
> +	va_end(args);
> +
> +	switch (level) {
> +	case LOGLEVEL_EMERG:
> +		pr_emerg("%s: %s", buf_prefix, buf);
> +		break;
> +	case LOGLEVEL_ALERT:
> +		pr_alert("%s: %s", buf_prefix, buf);
> +		break;
> +	case LOGLEVEL_CRIT:
> +		pr_crit("%s: %s", buf_prefix, buf);
> +		break;
> +	case LOGLEVEL_ERR:
> +		pr_err("%s: %s", buf_prefix, buf);
> +		break;
> +	case LOGLEVEL_WARNING:
> +		pr_warn("%s: %s", buf_prefix, buf);
> +		break;
> +	case LOGLEVEL_NOTICE:
> +		pr_notice("%s: %s", buf_prefix, buf);
> +		break;
> +	case LOGLEVEL_INFO:
> +		pr_info("%s: %s", buf_prefix, buf);
> +		break;
> +	case LOGLEVEL_DEBUG:
> +		pr_debug("%s: %s", buf_prefix, buf);
> +		break;
> +	default:
> +		/* No message without a valid level */
> +		return 0;
> +	}
> +	return 1;
> +}
> +
> +/*
> + * Dump stack if need be. This can be helpful even from the final exit
> + * to usermode code since stack traces sometimes carry information about
> + * what put you into the kernel, e.g. an interrupt number encoded in
> + * the initial entry stack frame that is still visible at exit time.
> + */
> +static void debug_dump_stack(void)
> +{
> +	if (task_isolation_debug)
> +		dump_stack();
> +}
> +
> +/*
> + * Set the flags word but don't try to actually start task isolation yet.
> + * We will start it when entering user space in task_isolation_start().
> + */
> +int task_isolation_request(unsigned int flags)
> +{
> +	struct task_struct *task = current;
> +
> +	/*
> +	 * The task isolation flags should always be cleared just by
> +	 * virtue of having entered the kernel.
> +	 */
> +	WARN_ON_ONCE(test_tsk_thread_flag(task, TIF_TASK_ISOLATION));
> +	WARN_ON_ONCE(task->task_isolation_flags != 0);
> +	WARN_ON_ONCE(task->task_isolation_state != STATE_NORMAL);
> +
> +	task->task_isolation_flags = flags;
> +	if (!(task->task_isolation_flags & PR_TASK_ISOLATION_ENABLE))
> +		return 0;
> +
> +	/* We are trying to enable task isolation. */
> +	set_tsk_thread_flag(task, TIF_TASK_ISOLATION);
> +
> +	/*
> +	 * Shut down the vmstat worker so we're not interrupted later.
> +	 * We have to try to do this here (with interrupts enabled) since
> +	 * we are canceling delayed work and will call flush_work()
> +	 * (which enables interrupts) and possibly schedule().
> +	 */
> +	quiet_vmstat_sync();
> +
> +	/* We return 0 here but we may change that in task_isolation_start(). */
> +	return 0;
> +}
> +
> +/*
> + * Perform actions that should be done immediately on exit from isolation.
> + */
> +static void fast_task_isolation_cpu_cleanup(void *info)
> +{
> +	atomic_dec(&per_cpu(isol_exit_counter, smp_processor_id()));
> +	/* At this point breaking isolation from other CPUs is possible again */
> +
> +	/*
> +	 * This task is no longer isolated (and if by any chance this
> +	 * is the wrong task, it's already not isolated)
> +	 */
> +	current->task_isolation_flags = 0;
> +	clear_tsk_thread_flag(current, TIF_TASK_ISOLATION);
> +
> +	/* Run the rest of cleanup later */
> +	set_tsk_thread_flag(current, TIF_NOTIFY_RESUME);
> +
> +	/* Copy flags with task isolation disabled */
> +	this_cpu_write(tsk_thread_flags_cache,
> +		       READ_ONCE(task_thread_info(current)->flags));
> +}
> +
> +/* Disable task isolation for the specified task. */
> +static void stop_isolation(struct task_struct *p)
> +{
> +	int cpu, this_cpu;
> +	unsigned long flags;
> +
> +	this_cpu = get_cpu();
> +	cpu = task_cpu(p);
> +	if (atomic_inc_return(&per_cpu(isol_exit_counter, cpu)) > 1) {
> +		/* Already exiting isolation */
> +		atomic_dec(&per_cpu(isol_exit_counter, cpu));
> +		put_cpu();
> +		return;
> +	}
> +
> +	if (p == current) {
> +		p->task_isolation_state = STATE_NORMAL;
> +		fast_task_isolation_cpu_cleanup(NULL);
> +		task_isolation_cpu_cleanup();
> +		if (atomic_dec_return(&per_cpu(isol_counter, cpu)) < 0) {
> +			/* Is not isolated already */
> +			atomic_inc(&per_cpu(isol_counter, cpu));
> +		}
> +		put_cpu();
> +	} else {
> +		if (atomic_dec_return(&per_cpu(isol_counter, cpu)) < 0) {
> +			/* Is not isolated already */
> +			atomic_inc(&per_cpu(isol_counter, cpu));
> +			atomic_dec(&per_cpu(isol_exit_counter, cpu));
> +			put_cpu();
> +			return;
> +		}
> +		/*
> +		 * Schedule "slow" cleanup. This relies on
> +		 * TIF_NOTIFY_RESUME being set
> +		 */
> +		spin_lock_irqsave(&task_isolation_cleanup_lock, flags);
> +		cpumask_set_cpu(cpu, task_isolation_cleanup_map);
> +		spin_unlock_irqrestore(&task_isolation_cleanup_lock, flags);
> +		/*
> +		 * Setting flags is delegated to the CPU where
> +		 * isolated task is running
> +		 * isol_exit_counter will be decremented from there as well.
> +		 */
> +		per_cpu(isol_break_csd, cpu).func =
> +		    fast_task_isolation_cpu_cleanup;
> +		per_cpu(isol_break_csd, cpu).info = NULL;
> +		per_cpu(isol_break_csd, cpu).flags = 0;
> +		smp_call_function_single_async(cpu,
> +					       &per_cpu(isol_break_csd, cpu));
> +		put_cpu();
> +	}
> +}
> +
> +/*
> + * This code runs with interrupts disabled just before the return to
> + * userspace, after a prctl() has requested enabling task isolation.
> + * We take whatever steps are needed to avoid being interrupted later:
> + * drain the lru pages, stop the scheduler tick, etc.  More
> + * functionality may be added here later to avoid other types of
> + * interrupts from other kernel subsystems.
> + *
> + * If we can't enable task isolation, we update the syscall return
> + * value with an appropriate error.
> + */
> +void task_isolation_start(void)
> +{
> +	int error;
> +
> +	/*
> +	 * We should only be called in STATE_NORMAL (isolation disabled),
> +	 * on our way out of the kernel from the prctl() that turned it on.
> +	 * If we are exiting from the kernel in another state, it means we
> +	 * made it back into the kernel without disabling task isolation,
> +	 * and we should investigate how (and in any case disable task
> +	 * isolation at this point).  We are clearly not on the path back
> +	 * from the prctl() so we don't touch the syscall return value.
> +	 */
> +	if (WARN_ON_ONCE(current->task_isolation_state != STATE_NORMAL)) {
> +		/* Increment counter, this will allow isolation breaking */
> +		if (atomic_inc_return(&per_cpu(isol_counter,
> +					      smp_processor_id())) > 1) {
> +			atomic_dec(&per_cpu(isol_counter, smp_processor_id()));
> +		}
> +		atomic_inc(&per_cpu(isol_counter, smp_processor_id()));
> +		stop_isolation(current);
> +		return;
> +	}
> +
> +	/*
> +	 * Must be affinitized to a single core with task isolation possible.
> +	 * In principle this could be remotely modified between the prctl()
> +	 * and the return to userspace, so we have to check it here.
> +	 */
> +	if (current->nr_cpus_allowed != 1 ||
> +	    !is_isolation_cpu(smp_processor_id())) {
> +		error = -EINVAL;
> +		goto error;
> +	}
> +
> +	/* If the vmstat delayed work is not canceled, we have to try again. */
> +	if (!vmstat_idle()) {
> +		error = -EAGAIN;
> +		goto error;
> +	}
> +
> +	/* Try to stop the dynamic tick. */
> +	error = try_stop_full_tick();
> +	if (error)
> +		goto error;
> +
> +	/* Drain the pagevecs to avoid unnecessary IPI flushes later. */
> +	lru_add_drain();
> +
> +	/* Increment counter, this will allow isolation breaking */
> +	if (atomic_inc_return(&per_cpu(isol_counter,
> +				      smp_processor_id())) > 1) {
> +		atomic_dec(&per_cpu(isol_counter, smp_processor_id()));
> +	}
> +
> +	/* Record isolated task IDs and name */
> +	record_curr_isolated_task();
> +
> +	/* Copy flags with task isolation enabled */
> +	this_cpu_write(tsk_thread_flags_cache,
> +		       READ_ONCE(task_thread_info(current)->flags));
> +
> +	current->task_isolation_state = STATE_ISOLATED;
> +	return;
> +
> +error:
> +	/* Increment counter, this will allow isolation breaking */
> +	if (atomic_inc_return(&per_cpu(isol_counter,
> +				      smp_processor_id())) > 1) {
> +		atomic_dec(&per_cpu(isol_counter, smp_processor_id()));
> +	}
> +	stop_isolation(current);
> +	syscall_set_return_value(current, current_pt_regs(), error, 0);
> +}
> +
> +/* Stop task isolation on the remote task and send it a signal. */
> +static void send_isolation_signal(struct task_struct *task)
> +{
> +	int flags = task->task_isolation_flags;
> +	kernel_siginfo_t info = {
> +		.si_signo = PR_TASK_ISOLATION_GET_SIG(flags) ?: SIGKILL,
> +	};
> +
> +	stop_isolation(task);
> +	send_sig_info(info.si_signo, &info, task);
> +}
> +
> +/* Only a few syscalls are valid once we are in task isolation mode. */
> +static bool is_acceptable_syscall(int syscall)
> +{
> +	/* No need to incur an isolation signal if we are just exiting. */
> +	if (syscall == __NR_exit || syscall == __NR_exit_group)
> +		return true;
> +
> +	/* Check to see if it's the prctl for isolation. */
> +	if (syscall == __NR_prctl) {
> +		unsigned long arg[SYSCALL_MAX_ARGS];
> +
> +		syscall_get_arguments(current, current_pt_regs(), arg);
> +		if (arg[0] == PR_TASK_ISOLATION)
> +			return true;
> +	}
> +
> +	return false;
> +}
> +
> +/*
> + * This routine is called from syscall entry, prevents most syscalls
> + * from executing, and if needed raises a signal to notify the process.
> + *
> + * Note that we have to stop isolation before we even print a message
> + * here, since otherwise we might end up reporting an interrupt due to
> + * kicking the printk handling code, rather than reporting the true
> + * cause of interrupt here.
> + *
> + * The message is not suppressed by previous remotely triggered
> + * messages.
> + */
> +int task_isolation_syscall(int syscall)
> +{
> +	struct task_struct *task = current;
> +
> +	if (is_acceptable_syscall(syscall)) {
> +		stop_isolation(task);
> +		return 0;
> +	}
> +
> +	send_isolation_signal(task);
> +
> +	pr_task_isol_warn(smp_processor_id(),
> +			  "task_isolation lost due to syscall %d\n",
> +			  syscall);
> +	debug_dump_stack();
> +
> +	syscall_set_return_value(task, current_pt_regs(), -ERESTARTNOINTR, -1);
> +	return -1;
> +}
> +
> +/*
> + * This routine is called from any exception or irq that doesn't
> + * otherwise trigger a signal to the user process (e.g. page fault).
> + *
> + * Messages will be suppressed if there is already a reported remote
> + * cause for isolation breaking, so we don't generate multiple
> + * confusingly similar messages about the same event.
> + */
> +void _task_isolation_interrupt(const char *fmt, ...)
> +{
> +	struct task_struct *task = current;
> +	va_list args;
> +	char buf[100];
> +
> +	/* RCU should have been enabled prior to this point. */
> +	RCU_LOCKDEP_WARN(!rcu_is_watching(), "kernel entry without RCU");
> +
> +	/* Are we exiting isolation already? */
> +	if (atomic_read(&per_cpu(isol_exit_counter, smp_processor_id())) != 0) {
> +		task->task_isolation_state = STATE_NORMAL;
> +		return;
> +	}
> +	/*
> +	 * Avoid reporting interrupts that happen after we have prctl'ed
> +	 * to enable isolation, but before we have returned to userspace.
> +	 */
> +	if (task->task_isolation_state == STATE_NORMAL)
> +		return;
> +
> +	va_start(args, fmt);
> +	vsnprintf(buf, sizeof(buf), fmt, args);
> +	va_end(args);
> +
> +	/* Handle NMIs minimally, since we can't send a signal. */
> +	if (in_nmi()) {
> +		pr_task_isol_err(smp_processor_id(),
> +				 "isolation: in NMI; not delivering signal\n");
> +	} else {
> +		send_isolation_signal(task);
> +	}
> +
> +	if (pr_task_isol_warn_supp(smp_processor_id(),
> +				   "task_isolation lost due to %s\n", buf))
> +		debug_dump_stack();
> +}
> +
> +/*
> + * Called before we wake up a task that has a signal to process.
> + * Needs to be done to handle interrupts that trigger signals, which
> + * we don't catch with task_isolation_interrupt() hooks.
> + *
> + * This message is also suppressed if there was already a remotely
> + * caused message about the same isolation breaking event.
> + */
> +void _task_isolation_signal(struct task_struct *task)
> +{
> +	struct isol_task_desc *desc;
> +	int ind, cpu;
> +	bool do_warn = (task->task_isolation_state == STATE_ISOLATED);
> +
> +	cpu = task_cpu(task);
> +	desc = &per_cpu(isol_task_descs, cpu);
> +	ind = atomic_read(&desc->curr_index) & 1;
> +	if (desc->warned[ind])
> +		do_warn = false;
> +
> +	stop_isolation(task);
> +
> +	if (do_warn) {
> +		pr_warn("isolation: %s/%d/%d (cpu %d): task_isolation lost due to signal\n",
> +			task->comm, task->tgid, task->pid, cpu);
> +		debug_dump_stack();
> +	}
> +}
> +
> +/*
> + * Generate a stack backtrace if we are going to interrupt another task
> + * isolation process.
> + */
> +void task_isolation_remote(int cpu, const char *fmt, ...)
> +{
> +	struct task_struct *curr_task;
> +	va_list args;
> +	char buf[200];
> +
> +	if (!is_isolation_cpu(cpu) || !task_isolation_on_cpu(cpu))
> +		return;
> +
> +	curr_task = current;
> +
> +	va_start(args, fmt);
> +	vsnprintf(buf, sizeof(buf), fmt, args);
> +	va_end(args);
> +	if (pr_task_isol_warn(cpu,
> +			      "task_isolation lost due to %s by %s/%d/%d on cpu %d\n",
> +			      buf,
> +			      curr_task->comm, curr_task->tgid,
> +			      curr_task->pid, smp_processor_id()))
> +		debug_dump_stack();
> +}
> +
> +/*
> + * Generate a stack backtrace if any of the cpus in "mask" are running
> + * task isolation processes.
> + */
> +void task_isolation_remote_cpumask(const struct cpumask *mask,
> +				   const char *fmt, ...)
> +{
> +	struct task_struct *curr_task;
> +	cpumask_var_t warn_mask;
> +	va_list args;
> +	char buf[200];
> +	int cpu, first_cpu;
> +
> +	if (task_isolation_map == NULL ||
> +		!zalloc_cpumask_var(&warn_mask, GFP_KERNEL))
> +		return;
> +
> +	first_cpu = -1;
> +	for_each_cpu_and(cpu, mask, task_isolation_map) {
> +		if (task_isolation_on_cpu(cpu)) {
> +			if (first_cpu < 0)
> +				first_cpu = cpu;
> +			else
> +				cpumask_set_cpu(cpu, warn_mask);
> +		}
> +	}
> +
> +	if (first_cpu < 0)
> +		goto done;
> +
> +	curr_task = current;
> +
> +	va_start(args, fmt);
> +	vsnprintf(buf, sizeof(buf), fmt, args);
> +	va_end(args);
> +
> +	if (cpumask_weight(warn_mask) == 0)
> +		pr_task_isol_warn(first_cpu,
> +				  "task_isolation lost due to %s by %s/%d/%d on cpu %d\n",
> +				  buf, curr_task->comm, curr_task->tgid,
> +				  curr_task->pid, smp_processor_id());
> +	else
> +		pr_task_isol_warn(first_cpu,
> +				  " and cpus %*pbl: task_isolation lost due to %s by %s/%d/%d on cpu %d\n",
> +				  cpumask_pr_args(warn_mask),
> +				  buf, curr_task->comm, curr_task->tgid,
> +				  curr_task->pid, smp_processor_id());
> +	debug_dump_stack();
> +
> +done:
> +	free_cpumask_var(warn_mask);
> +}
> +
> +/*
> + * Check if the given CPU is running an isolated task.
> + */
> +int task_isolation_on_cpu(int cpu)
> +{
> +	return test_bit(TIF_TASK_ISOLATION,
> +			&per_cpu(tsk_thread_flags_cache, cpu));
> +}
> +
> +/*
> + * Set CPUs currently running isolated tasks in CPU mask.
> + */
> +void task_isolation_cpumask(struct cpumask *mask)
> +{
> +	int cpu;
> +
> +	if (task_isolation_map == NULL)
> +		return;
> +
> +	for_each_cpu(cpu, task_isolation_map)
> +		if (task_isolation_on_cpu(cpu))
> +			cpumask_set_cpu(cpu, mask);
> +}
> +
> +/*
> + * Clear CPUs currently running isolated tasks in CPU mask.
> + */
> +void task_isolation_clear_cpumask(struct cpumask *mask)
> +{
> +	int cpu;
> +
> +	if (task_isolation_map == NULL)
> +		return;
> +
> +	for_each_cpu(cpu, task_isolation_map)
> +		if (task_isolation_on_cpu(cpu))
> +			cpumask_clear_cpu(cpu, mask);
> +}
> +
> +/*
> + * Cleanup procedure. The call to this procedure may be delayed.
> + */
> +void task_isolation_cpu_cleanup(void)
> +{
> +	kick_hrtimer();
> +}
> +
> +/*
> + * Check if cleanup is scheduled on the current CPU, and if so, run it.
> + * Intended to be called from notify_resume() or another such callback
> + * on the target CPU.
> + */
> +void task_isolation_check_run_cleanup(void)
> +{
> +	int cpu;
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&task_isolation_cleanup_lock, flags);
> +
> +	cpu = smp_processor_id();
> +
> +	if (cpumask_test_cpu(cpu, task_isolation_cleanup_map)) {
> +		cpumask_clear_cpu(cpu, task_isolation_cleanup_map);
> +		spin_unlock_irqrestore(&task_isolation_cleanup_lock, flags);
> +		task_isolation_cpu_cleanup();
> +	} else
> +		spin_unlock_irqrestore(&task_isolation_cleanup_lock, flags);
> +}
> diff --git a/kernel/signal.c b/kernel/signal.c
> index 5b2396350dd1..1df57e38c361 100644
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -46,6 +46,7 @@
>  #include <linux/livepatch.h>
>  #include <linux/cgroup.h>
>  #include <linux/audit.h>
> +#include <linux/isolation.h>
>  
>  #define CREATE_TRACE_POINTS
>  #include <trace/events/signal.h>
> @@ -758,6 +759,7 @@ static int dequeue_synchronous_signal(kernel_siginfo_t *info)
>   */
>  void signal_wake_up_state(struct task_struct *t, unsigned int state)
>  {
> +	task_isolation_signal(t);
>  	set_tsk_thread_flag(t, TIF_SIGPENDING);
>  	/*
>  	 * TASK_WAKEKILL also means wake it up in the stopped/traced/killable
> diff --git a/kernel/sys.c b/kernel/sys.c
> index f9bc5c303e3f..0a4059a8c4f9 100644
> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -42,6 +42,7 @@
>  #include <linux/syscore_ops.h>
>  #include <linux/version.h>
>  #include <linux/ctype.h>
> +#include <linux/isolation.h>
>  
>  #include <linux/compat.h>
>  #include <linux/syscalls.h>
> @@ -2513,6 +2514,11 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
>  
>  		error = (current->flags & PR_IO_FLUSHER) == PR_IO_FLUSHER;
>  		break;
> +	case PR_TASK_ISOLATION:
> +		if (arg3 || arg4 || arg5)
> +			return -EINVAL;
> +		error = task_isolation_request(arg2);
> +		break;
>  	default:
>  		error = -EINVAL;
>  		break;
> diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
> index 3a609e7344f3..5bb98f39bde6 100644
> --- a/kernel/time/hrtimer.c
> +++ b/kernel/time/hrtimer.c
> @@ -30,6 +30,7 @@
>  #include <linux/syscalls.h>
>  #include <linux/interrupt.h>
>  #include <linux/tick.h>
> +#include <linux/isolation.h>
>  #include <linux/err.h>
>  #include <linux/debugobjects.h>
>  #include <linux/sched/signal.h>
> @@ -721,6 +722,19 @@ static void retrigger_next_event(void *arg)
>  	raw_spin_unlock(&base->lock);
>  }
>  
> +#ifdef CONFIG_TASK_ISOLATION
> +void kick_hrtimer(void)
> +{
> +	unsigned long flags;
> +
> +	preempt_disable();
> +	local_irq_save(flags);
> +	retrigger_next_event(NULL);
> +	local_irq_restore(flags);
> +	preempt_enable();
> +}
> +#endif
> +
>  /*
>   * Switch to high resolution mode
>   */
> @@ -868,8 +882,21 @@ static void hrtimer_reprogram(struct hrtimer *timer, bool reprogram)
>  void clock_was_set(void)
>  {
>  #ifdef CONFIG_HIGH_RES_TIMERS
> +#ifdef CONFIG_TASK_ISOLATION
> +	struct cpumask mask;
> +
> +	cpumask_clear(&mask);
> +	task_isolation_cpumask(&mask);
> +	cpumask_complement(&mask, &mask);
> +	/*
> +	 * Retrigger the CPU local events everywhere except CPUs
> +	 * running isolated tasks.
> +	 */
> +	on_each_cpu_mask(&mask, retrigger_next_event, NULL, 1);
> +#else
>  	/* Retrigger the CPU local events everywhere */
>  	on_each_cpu(retrigger_next_event, NULL, 1);
> +#endif
>  #endif
>  	timerfd_clock_was_set();
>  }
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index a792d21cac64..1d4dec9d3ee7 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -882,6 +882,24 @@ static void tick_nohz_full_update_tick(struct tick_sched *ts)
>  #endif
>  }
>  
> +#ifdef CONFIG_TASK_ISOLATION
> +int try_stop_full_tick(void)
> +{
> +	int cpu = smp_processor_id();
> +	struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched);
> +
> +	/* For an unstable clock, we should return a permanent error code. */
> +	if (atomic_read(&tick_dep_mask) & TICK_DEP_MASK_CLOCK_UNSTABLE)
> +		return -EINVAL;
> +
> +	if (!can_stop_full_tick(cpu, ts))
> +		return -EAGAIN;
> +
> +	tick_nohz_stop_sched_tick(ts, cpu);
> +	return 0;
> +}
> +#endif
> +
>  static bool can_stop_idle_tick(int cpu, struct tick_sched *ts)
>  {
>  	/*
> -- 
> 2.20.1
> 

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* Re: [PATCH v2 03/12] task_isolation: userspace hard isolation from kernel
  2020-03-08  3:47   ` [PATCH v2 03/12] task_isolation: userspace hard isolation from kernel Alex Belits
                       ` (2 preceding siblings ...)
  2020-04-06  4:31     ` Kevyn-Alexandre Paré
@ 2020-04-06  4:43     ` Kevyn-Alexandre Paré
  3 siblings, 0 replies; 71+ messages in thread
From: Kevyn-Alexandre Paré @ 2020-04-06  4:43 UTC (permalink / raw)
  To: Alex Belits
  Cc: frederic, rostedt, mingo, peterz, linux-kernel, Prasun Kapoor,
	tglx, linux-api, catalin.marinas, linux-arm-kernel, netdev,
	davem, linux-arch, will

On Sun, Mar 08, 2020 at 03:47:08AM +0000, Alex Belits wrote:
> The existing nohz_full mode is designed as a "soft" isolation mode
> that makes tradeoffs to minimize userspace interruptions while
> still attempting to avoid overheads in the kernel entry/exit path,
> to provide 100% kernel semantics, etc.
> 
> However, some applications require a "hard" commitment from the
> kernel to avoid interruptions, in particular userspace device driver
> style applications, such as high-speed networking code.
> 
> This change introduces a framework to allow applications
> to elect to have the "hard" semantics as needed, specifying
> prctl(PR_TASK_ISOLATION, PR_TASK_ISOLATION_ENABLE) to do so.
> 
> The kernel must be built with the new TASK_ISOLATION Kconfig flag
> to enable this mode, and the kernel booted with an appropriate
> "isolcpus=nohz,domain,CPULIST" boot argument to enable
> nohz_full and isolcpus. The "task_isolation" state is then indicated
> by setting a new task struct field, task_isolation_flags, to the
> value passed by prctl(), and also setting a TIF_TASK_ISOLATION
> bit in the thread_info flags. When the kernel is returning to
> userspace from the prctl() call and sees TIF_TASK_ISOLATION set,
> it calls the new task_isolation_start() routine to arrange for
> the task to avoid being interrupted in the future.
> 
> With interrupts disabled, task_isolation_start() ensures that kernel
> subsystems that might cause a future interrupt are quiesced. If it
> doesn't succeed, it adjusts the syscall return value to indicate that
> fact, and userspace can retry as desired. In addition to stopping
> the scheduler tick, the code takes any actions that might avoid
> a future interrupt to the core, such as a worker thread being
> scheduled that could be quiesced now (e.g. the vmstat worker)
> or a future IPI to the core to clean up some state that could be
> cleaned up now (e.g. the mm lru per-cpu cache).
> 
> Once the task has returned to userspace after issuing the prctl(),
> if it enters the kernel again via system call, page fault, or any
> other exception or irq, the kernel will kill it with SIGKILL.
> In addition to sending a signal, the code supports a kernel
> command-line "task_isolation_debug" flag which causes a stack
> backtrace to be generated whenever a task loses isolation.
> 
> To allow the state to be entered and exited, the syscall checking
> test ignores the prctl(PR_TASK_ISOLATION) syscall so that we can
> clear the bit again later, and ignores exit/exit_group to allow
> exiting the task without a pointless signal being delivered.
> 
> The prctl() API allows for specifying a signal number to use instead
> of the default SIGKILL, to allow for catching the notification
> signal; for example, in a production environment, it might be
> helpful to log information to the application logging mechanism
> before exiting. Or, the signal handler might choose to reset the
> program counter back to the code segment intended to be run isolated
> via prctl() to continue execution.
> 
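
Side note for readers following the flow described above: a hypothetical
userspace sketch -- illustrative only, not part of the patch, and assuming
CPU 3 has been configured as a nohz_full/isolated core and SIGUSR1 is chosen
as the notification signal -- might look like this:

#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <signal.h>
#include <sys/prctl.h>

#ifndef PR_TASK_ISOLATION
#define PR_TASK_ISOLATION		48
#define PR_TASK_ISOLATION_ENABLE	(1 << 0)
#define PR_TASK_ISOLATION_SET_SIG(sig)	(((sig) & 0x7f) << 8)
#endif

static void lost_isolation(int sig)
{
	(void)sig;	/* log, re-enter isolation, or exit as needed */
}

int main(void)
{
	cpu_set_t set;

	/* Must be affinitized to exactly one task-isolation CPU. */
	CPU_ZERO(&set);
	CPU_SET(3, &set);
	if (sched_setaffinity(0, sizeof(set), &set) != 0)
		return 1;

	/* Catch the notification signal instead of the default SIGKILL. */
	signal(SIGUSR1, lost_isolation);

	/* Retry until the kernel has quiesced the core (EAGAIN otherwise). */
	while (prctl(PR_TASK_ISOLATION,
		     PR_TASK_ISOLATION_ENABLE |
		     PR_TASK_ISOLATION_SET_SIG(SIGUSR1), 0, 0, 0) != 0) {
		if (errno != EAGAIN)
			return 1;	/* e.g. EINVAL: wrong CPU or affinity */
	}

	/* ... userspace-only work, no syscalls, no exceptions ... */

	/* Explicitly leave isolation before using the kernel again. */
	prctl(PR_TASK_ISOLATION, 0, 0, 0, 0);
	return 0;
}

The retry loop is just the "userspace can retry as desired" behaviour
mentioned above; nothing here is meant to be definitive about the final API.
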
> In a number of cases we can tell on a remote cpu that we are
> going to be interrupting the cpu, e.g. via an IPI or a TLB flush.
> In that case we generate the diagnostic (and optional stack dump)
> on the remote core to be able to deliver better diagnostics.
> If the interrupt is not something caught by Linux (e.g. a
> hypervisor interrupt) we can also request a reschedule IPI to
> be sent to the remote core so it can be sure to generate a
> signal to notify the process.
> 
> Separate patches that follow provide these changes for x86, arm,
> and arm64.
> 
> Signed-off-by: Alex Belits <abelits@marvell.com>
> ---
>  .../admin-guide/kernel-parameters.txt         |   6 +
>  include/linux/hrtimer.h                       |   4 +
>  include/linux/isolation.h                     | 229 ++++++
>  include/linux/sched.h                         |   4 +
>  include/linux/tick.h                          |   3 +
>  include/uapi/linux/prctl.h                    |   6 +
>  init/Kconfig                                  |  28 +
>  kernel/Makefile                               |   2 +
>  kernel/context_tracking.c                     |   2 +
>  kernel/isolation.c                            | 774 ++++++++++++++++++
>  kernel/signal.c                               |   2 +
>  kernel/sys.c                                  |   6 +
>  kernel/time/hrtimer.c                         |  27 +
>  kernel/time/tick-sched.c                      |  18 +
>  14 files changed, 1111 insertions(+)
>  create mode 100644 include/linux/isolation.h
>  create mode 100644 kernel/isolation.c
> 
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index c07815d230bc..e4a2d6e37645 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -4808,6 +4808,12 @@
>  			neutralize any effect of /proc/sys/kernel/sysrq.
>  			Useful for debugging.
>  
> +	task_isolation_debug	[KNL]
> +			In kernels built with CONFIG_TASK_ISOLATION, this
> +			setting will generate console backtraces to
> +			accompany the diagnostics generated about
> +			interrupting tasks running with task isolation.
> +
>  	tcpmhash_entries= [KNL,NET]
>  			Set the number of tcp_metrics_hash slots.
>  			Default value is 8192 or 16384 depending on total
> diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
> index 15c8ac313678..e81252eb4f92 100644
> --- a/include/linux/hrtimer.h
> +++ b/include/linux/hrtimer.h
> @@ -528,6 +528,10 @@ extern void __init hrtimers_init(void);
>  /* Show pending timers: */
>  extern void sysrq_timer_list_show(void);
>  
> +#ifdef CONFIG_TASK_ISOLATION
> +extern void kick_hrtimer(void);
> +#endif
> +
>  int hrtimers_prepare_cpu(unsigned int cpu);
>  #ifdef CONFIG_HOTPLUG_CPU
>  int hrtimers_dead_cpu(unsigned int cpu);
> diff --git a/include/linux/isolation.h b/include/linux/isolation.h
> new file mode 100644
> index 000000000000..6bd71c67f10f
> --- /dev/null
> +++ b/include/linux/isolation.h
> @@ -0,0 +1,229 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Task isolation support
> + *
> + * Authors:
> + *   Chris Metcalf <cmetcalf@mellanox.com>
> + *   Alex Belits <abelits@marvell.com>
> + *   Yuri Norov <ynorov@marvell.com>
> + */
> +#ifndef _LINUX_ISOLATION_H
> +#define _LINUX_ISOLATION_H
> +
> +#include <stdarg.h>
> +#include <linux/errno.h>
> +#include <linux/cpumask.h>
> +#include <linux/prctl.h>
> +#include <linux/types.h>
> +
> +struct task_struct;
> +
> +#ifdef CONFIG_TASK_ISOLATION
> +
> +int task_isolation_message(int cpu, int level, bool supp, const char *fmt, ...);
> +
> +#define pr_task_isol_emerg(cpu, fmt, ...)			\
> +	task_isolation_message(cpu, LOGLEVEL_EMERG, false, fmt, ##__VA_ARGS__)
> +#define pr_task_isol_alert(cpu, fmt, ...)			\
> +	task_isolation_message(cpu, LOGLEVEL_ALERT, false, fmt, ##__VA_ARGS__)
> +#define pr_task_isol_crit(cpu, fmt, ...)			\
> +	task_isolation_message(cpu, LOGLEVEL_CRIT, false, fmt, ##__VA_ARGS__)
> +#define pr_task_isol_err(cpu, fmt, ...)				\
> +	task_isolation_message(cpu, LOGLEVEL_ERR, false, fmt, ##__VA_ARGS__)
> +#define pr_task_isol_warn(cpu, fmt, ...)			\
> +	task_isolation_message(cpu, LOGLEVEL_WARNING, false, fmt, ##__VA_ARGS__)
> +#define pr_task_isol_notice(cpu, fmt, ...)			\
> +	task_isolation_message(cpu, LOGLEVEL_NOTICE, false, fmt, ##__VA_ARGS__)
> +#define pr_task_isol_info(cpu, fmt, ...)			\
> +	task_isolation_message(cpu, LOGLEVEL_INFO, false, fmt, ##__VA_ARGS__)
> +#define pr_task_isol_debug(cpu, fmt, ...)			\
> +	task_isolation_message(cpu, LOGLEVEL_DEBUG, false, fmt, ##__VA_ARGS__)
> +
> +#define pr_task_isol_emerg_supp(cpu, fmt, ...)			\
> +	task_isolation_message(cpu, LOGLEVEL_EMERG, true, fmt, ##__VA_ARGS__)
> +#define pr_task_isol_alert_supp(cpu, fmt, ...)			\
> +	task_isolation_message(cpu, LOGLEVEL_ALERT, true, fmt, ##__VA_ARGS__)
> +#define pr_task_isol_crit_supp(cpu, fmt, ...)			\
> +	task_isolation_message(cpu, LOGLEVEL_CRIT, true, fmt, ##__VA_ARGS__)
> +#define pr_task_isol_err_supp(cpu, fmt, ...)				\
> +	task_isolation_message(cpu, LOGLEVEL_ERR, true, fmt, ##__VA_ARGS__)
> +#define pr_task_isol_warn_supp(cpu, fmt, ...)			\
> +	task_isolation_message(cpu, LOGLEVEL_WARNING, true, fmt, ##__VA_ARGS__)
> +#define pr_task_isol_notice_supp(cpu, fmt, ...)			\
> +	task_isolation_message(cpu, LOGLEVEL_NOTICE, true, fmt, ##__VA_ARGS__)
> +#define pr_task_isol_info_supp(cpu, fmt, ...)			\
> +	task_isolation_message(cpu, LOGLEVEL_INFO, true, fmt, ##__VA_ARGS__)
> +#define pr_task_isol_debug_supp(cpu, fmt, ...)			\
> +	task_isolation_message(cpu, LOGLEVEL_DEBUG, true, fmt, ##__VA_ARGS__)
> +DECLARE_PER_CPU(unsigned long, tsk_thread_flags_copy);
> +extern cpumask_var_t task_isolation_map;
> +
> +/**
> + * task_isolation_request() - prctl hook to request task isolation
> + * @flags:	Flags from <linux/prctl.h> PR_TASK_ISOLATION_xxx.
> + *
> + * This is called from the generic prctl() code for PR_TASK_ISOLATION.
> + *
> + * Return: 0 when task isolation is enabled, otherwise a negative
> + * errno.
> + */
> +extern int task_isolation_request(unsigned int flags);
> +extern void task_isolation_cpu_cleanup(void);
> +/**
> + * task_isolation_start() - attempt to actually start task isolation
> + *
> + * This function should be invoked as the last thing prior to returning to
> + * user space if TIF_TASK_ISOLATION is set in the thread_info flags.  It
> + * will attempt to quiesce the core and enter task-isolation mode.  If it
> + * fails, it will reset the system call return value to an error code that
> + * indicates the failure mode.
> + */
> +extern void task_isolation_start(void);
> +
> +/**
> + * is_isolation_cpu() - check if CPU is intended for running isolated tasks.
> + * @cpu:	CPU to check.
> + */
> +static inline bool is_isolation_cpu(int cpu)
> +{
> +	return task_isolation_map != NULL &&
> +		cpumask_test_cpu(cpu, task_isolation_map);
> +}
> +
> +/**
> + * task_isolation_on_cpu() - check if the cpu is running isolated task
> + * @cpu:	CPU to check.
> + */
> +extern int task_isolation_on_cpu(int cpu);
> +extern void task_isolation_check_run_cleanup(void);
> +
> +/**
> + * task_isolation_cpumask() - set CPUs currently running isolated tasks
> + * @mask:	Mask to modify.
> + */
> +extern void task_isolation_cpumask(struct cpumask *mask);
> +
> +/**
> + * task_isolation_clear_cpumask() - clear CPUs currently running isolated tasks
> + * @mask:      Mask to modify.
> + */
> +extern void task_isolation_clear_cpumask(struct cpumask *mask);
> +
> +/**
> + * task_isolation_syscall() - report a syscall from an isolated task
> + * @nr:		The syscall number.
> + *
> + * This routine should be invoked at syscall entry if TIF_TASK_ISOLATION is
> + * set in the thread_info flags.  It checks for valid syscalls,
> + * specifically prctl() with PR_TASK_ISOLATION, exit(), and exit_group().
> + * For any other syscall it will raise a signal and return failure.
> + *
> + * Return: 0 for acceptable syscalls, -1 for all others.
> + */
> +extern int task_isolation_syscall(int nr);
> +
> +/**
> + * _task_isolation_interrupt() - report an interrupt of an isolated task
> + * @fmt:	A format string describing the interrupt
> + * @...:	Format arguments, if any.
> + *
> + * This routine should be invoked at any exception or IRQ if
> + * TIF_TASK_ISOLATION is set in the thread_info flags.  It is not necessary
> + * to invoke it if the exception will generate a signal anyway (e.g. a bad
> + * page fault), and in that case it is preferable not to invoke it but just
> + * rely on the standard Linux signal.  The macro task_isolation_interrupt()
> + * wraps the TIF_TASK_ISOLATION flag test to simplify the caller code.
> + */
> +extern void _task_isolation_interrupt(const char *fmt, ...);
> +#define task_isolation_interrupt(fmt, ...)				\
> +	do {								\
> +		if (current_thread_info()->flags & _TIF_TASK_ISOLATION)	\
> +			_task_isolation_interrupt(fmt, ## __VA_ARGS__);	\
> +	} while (0)
> +
> +/**
> + * task_isolation_remote() - report a remote interrupt of an isolated task
> + * @cpu:	The remote cpu that is about to be interrupted.
> + * @fmt:	A format string describing the interrupt
> + * @...:	Format arguments, if any.
> + *
> + * This routine should be invoked any time a remote IPI or other type of
> + * interrupt is being delivered to another cpu. The function will check to
> + * see if the target core is running a task-isolation task, and generate a
> + * diagnostic on the console if so; in addition, we tag the task so it
> + * doesn't generate another diagnostic when the interrupt actually arrives.
> + * Generating a diagnostic remotely yields a clearer indication of what
> + * happened than just reporting only when the remote core is interrupted.
> + *
> + */
> +extern void task_isolation_remote(int cpu, const char *fmt, ...);
> +
> +/**
> + * task_isolation_remote_cpumask() - report interruption of multiple cpus
> + * @mask:	The set of remote cpus that are about to be interrupted.
> + * @fmt:	A format string describing the interrupt
> + * @...:	Format arguments, if any.
> + *
> + * This is the cpumask variant of task_isolation_remote().  We
> + * generate a single-line diagnostic message even if multiple remote
> + * task-isolation cpus are being interrupted.
> + */
> +extern void task_isolation_remote_cpumask(const struct cpumask *mask,
> +					  const char *fmt, ...);
> +
> +/**
> + * _task_isolation_signal() - disable task isolation when signal is pending
> + * @task:	The task for which to disable isolation.
> + *
> + * This function generates a diagnostic and disables task isolation; it
> + * should be called if TIF_TASK_ISOLATION is set when notifying a task of a
> + * pending signal.  The task_isolation_interrupt() function normally
> + * generates a diagnostic for events that just interrupt a task without
> + * generating a signal; here we need to hook the paths that correspond to
> + * interrupts that do generate a signal.  The macro task_isolation_signal()
> + * wraps the TIF_TASK_ISOLATION flag test to simplify the caller code.
> + */
> +extern void _task_isolation_signal(struct task_struct *task);
> +#define task_isolation_signal(task)					\
> +	do {								\
> +		if (task_thread_info(task)->flags & _TIF_TASK_ISOLATION) \
> +			_task_isolation_signal(task);			\
> +	} while (0)
> +
> +/**
> + * task_isolation_user_exit() - debug all user_exit calls
> + *
> + * By default, we don't generate an exception in the low-level user_exit()
> + * code, because programs lose the ability to disable task isolation: the
> + * user_exit() hook will cause a signal prior to task_isolation_syscall()
> + * disabling task isolation.  In addition, it means that we lose all the
> + * diagnostic info otherwise available from task_isolation_interrupt() hooks
> + * later in the interrupt-handling process.  But you may enable it here for
> + * a special kernel build if you are having undiagnosed userspace jitter.
> + */
> +static inline void task_isolation_user_exit(void)
> +{
> +#ifdef DEBUG_TASK_ISOLATION
> +	task_isolation_interrupt("user_exit");
> +#endif
> +}
> +
> +#else /* !CONFIG_TASK_ISOLATION */
> +static inline int task_isolation_request(unsigned int flags) { return -EINVAL; }
> +static inline void task_isolation_start(void) { }
> +static inline bool is_isolation_cpu(int cpu) { return 0; }
> +static inline int task_isolation_on_cpu(int cpu) { return 0; }
> +static inline void task_isolation_cpumask(struct cpumask *mask) { }
> +static inline void task_isolation_clear_cpumask(struct cpumask *mask) { }
> +static inline void task_isolation_cpu_cleanup(void) { }
> +static inline void task_isolation_check_run_cleanup(void) { }
> +static inline int task_isolation_syscall(int nr) { return 0; }
> +static inline void task_isolation_interrupt(const char *fmt, ...) { }
> +static inline void task_isolation_remote(int cpu, const char *fmt, ...) { }
> +static inline void task_isolation_remote_cpumask(const struct cpumask *mask,
> +						 const char *fmt, ...) { }
> +static inline void task_isolation_signal(struct task_struct *task) { }
> +static inline void task_isolation_user_exit(void) { }
> +#endif
> +
> +#endif /* _LINUX_ISOLATION_H */
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 04278493bf15..52fdb32aa3b9 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1280,6 +1280,10 @@ struct task_struct {
>  	unsigned long			lowest_stack;
>  	unsigned long			prev_lowest_stack;
>  #endif
> +#ifdef CONFIG_TASK_ISOLATION
> +	unsigned short			task_isolation_flags;  /* prctl */
> +	unsigned short			task_isolation_state;
> +#endif
>  
>  	/*
>  	 * New fields for task_struct should be added above here, so that
> diff --git a/include/linux/tick.h b/include/linux/tick.h
> index 7340613c7eff..27c7c033d5a8 100644
> --- a/include/linux/tick.h
> +++ b/include/linux/tick.h
> @@ -268,6 +268,9 @@ static inline void tick_dep_clear_signal(struct signal_struct *signal,
>  extern void tick_nohz_full_kick_cpu(int cpu);
>  extern void __tick_nohz_task_switch(void);
>  extern void __init tick_nohz_full_setup(cpumask_var_t cpumask);
> +#ifdef CONFIG_TASK_ISOLATION
> +extern int try_stop_full_tick(void);
> +#endif
>  #else
>  static inline bool tick_nohz_full_enabled(void) { return false; }
>  static inline bool tick_nohz_full_cpu(int cpu) { return false; }
> diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
> index 07b4f8131e36..f4848ed2a069 100644
> --- a/include/uapi/linux/prctl.h
> +++ b/include/uapi/linux/prctl.h
> @@ -238,4 +238,10 @@ struct prctl_mm_map {
>  #define PR_SET_IO_FLUSHER		57
>  #define PR_GET_IO_FLUSHER		58
>  
> +/* Enable task_isolation mode for TASK_ISOLATION kernels. */
> +#define PR_TASK_ISOLATION		48
> +# define PR_TASK_ISOLATION_ENABLE	(1 << 0)
> +# define PR_TASK_ISOLATION_SET_SIG(sig)	(((sig) & 0x7f) << 8)
> +# define PR_TASK_ISOLATION_GET_SIG(bits) (((bits) >> 8) & 0x7f)
> +
>  #endif /* _LINUX_PRCTL_H */
> diff --git a/init/Kconfig b/init/Kconfig
> index 20a6ac33761c..ecdf567f6bd4 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -576,6 +576,34 @@ config CPU_ISOLATION
>  
>  source "kernel/rcu/Kconfig"
>  
> +config HAVE_ARCH_TASK_ISOLATION
> +	bool
> +
> +config TASK_ISOLATION
> +	bool "Provide hard CPU isolation from the kernel on demand"
> +	depends on NO_HZ_FULL && HAVE_ARCH_TASK_ISOLATION
> +	help
> +
> +	Allow userspace processes that place themselves on cores with
> +	nohz_full and isolcpus enabled, and run prctl(PR_TASK_ISOLATION),
> +	to "isolate" themselves from the kernel.  Prior to returning to
> +	userspace, isolated tasks will arrange that no future kernel
> +	activity will interrupt the task while the task is running in
> +	userspace.  Attempting to re-enter the kernel while in this mode
> +	will cause the task to be terminated with a signal; you must
> +	explicitly use prctl() to disable task isolation before resuming
> +	normal use of the kernel.
> +
> +	This "hard" isolation from the kernel is required for userspace
> +	tasks running hard real-time workloads, such as a high-speed
> +	network driver in userspace.  Without this option, but with
> +	NO_HZ_FULL enabled, the kernel will make a good-faith, "soft"
> +	effort to shield a single userspace process from interrupts, but
> +	makes no guarantees.
> +
> +	You should say "N" unless you are intending to run a
> +	high-performance userspace driver or similar task.
> +
>  config BUILD_BIN2C
>  	bool
>  	default n
> diff --git a/kernel/Makefile b/kernel/Makefile
> index 4cb4130ced32..2f2ae91f90d5 100644
> --- a/kernel/Makefile
> +++ b/kernel/Makefile
> @@ -122,6 +122,8 @@ obj-$(CONFIG_GCC_PLUGIN_STACKLEAK) += stackleak.o
>  KASAN_SANITIZE_stackleak.o := n
>  KCOV_INSTRUMENT_stackleak.o := n
>  
> +obj-$(CONFIG_TASK_ISOLATION) += isolation.o
> +
>  $(obj)/configs.o: $(obj)/config_data.gz
>  
>  targets += config_data.gz
> diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
> index 0296b4bda8f1..e9206736f219 100644
> --- a/kernel/context_tracking.c
> +++ b/kernel/context_tracking.c
> @@ -21,6 +21,7 @@
>  #include <linux/hardirq.h>
>  #include <linux/export.h>
>  #include <linux/kprobes.h>
> +#include <linux/isolation.h>
>  
>  #define CREATE_TRACE_POINTS
>  #include <trace/events/context_tracking.h>
> @@ -157,6 +158,7 @@ void __context_tracking_exit(enum ctx_state state)
>  			if (state == CONTEXT_USER) {
>  				vtime_user_exit(current);
>  				trace_user_exit(0);
> +				task_isolation_user_exit();
>  			}
>  		}
>  		__this_cpu_write(context_tracking.state, CONTEXT_KERNEL);
> diff --git a/kernel/isolation.c b/kernel/isolation.c
> new file mode 100644
> index 000000000000..ae29732c376c
> --- /dev/null
> +++ b/kernel/isolation.c
> @@ -0,0 +1,774 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + *  linux/kernel/isolation.c
> + *
> + *  Implementation of task isolation.
> + *
> + * Authors:
> + *   Chris Metcalf <cmetcalf@mellanox.com>
> + *   Alex Belits <abelits@marvell.com>
> + *   Yuri Norov <ynorov@marvell.com>
> + */
> +
> +#include <linux/mm.h>
> +#include <linux/swap.h>
> +#include <linux/vmstat.h>
> +#include <linux/sched.h>
> +#include <linux/isolation.h>
> +#include <linux/syscalls.h>
> +#include <linux/smp.h>
> +#include <linux/tick.h>
> +#include <asm/unistd.h>
> +#include <asm/syscall.h>
> +#include <linux/hrtimer.h>
> +
> +/*
> + * These values are stored in task_isolation_state.
> + * Note that STATE_NORMAL + TIF_TASK_ISOLATION means we are still
> + * returning from sys_prctl() to userspace.
> + */
> +enum {
> +	STATE_NORMAL = 0,	/* Not isolated */
> +	STATE_ISOLATED = 1	/* In userspace, isolated */
> +};
> +
> +/*
> + * This variable contains thread flags copied at the moment
> + * when schedule() switched to the task on a given CPU,
> + * or 0 if no task is running.
> + */
> +DEFINE_PER_CPU(unsigned long, tsk_thread_flags_cache);
> +
> +/*
> + * Counter for isolation state on a given CPU, increments when entering
> + * isolation and decrements when exiting isolation (before or after the
> + * cleanup). Multiple simultaneously running procedures entering or
> + * exiting isolation are prevented by checking the result of
> + * incrementing or decrementing this variable. This variable is both
> + * incremented and decremented by CPU that caused isolation entering or
> + * exit.
> + *
> + * This is necessary because multiple isolation-breaking events may happen
> + * at once (or one as the result of the other), however isolation exit
> + * may only happen once to transition from isolated to non-isolated state.
> + * Therefore, if decrementing this counter results in a value less than 0,
> + * isolation exit procedure can't be started -- it already happened, or is
> + * in progress, or isolation is not entered yet.
> + */
> +DEFINE_PER_CPU(atomic_t, isol_counter);
> +
> +/*
> + * Description of the last two tasks that ran isolated on a given CPU.
> + * This is intended only for messages about isolation breaking. We
> + * don't want any references to actual task while accessing this from
> + * CPU that caused isolation breaking -- we know nothing about timing
> + * and don't want to use locking or RCU.
> + */
> +struct isol_task_desc {
> +	atomic_t curr_index;
> +	atomic_t curr_index_wr;
> +	bool	warned[2];
> +	pid_t	pid[2];
> +	pid_t	tgid[2];
> +	char	comm[2][TASK_COMM_LEN];
> +};
> +static DEFINE_PER_CPU(struct isol_task_desc, isol_task_descs);
> +
> +/*
> + * Counter for isolation exiting procedures (from request to the start of
> + * cleanup) being attempted at once on a CPU. Normally incrementing of
> + * this counter is performed from the CPU that caused isolation breaking,
> + * however decrementing is done from the cleanup procedure, delegated to
> + * the CPU that is exiting isolation, not from the CPU that caused isolation
> + * breaking.
> + *
> + * If incrementing this counter while starting isolation exit procedure
> + * results in a value greater than 0, isolation exiting is already in
> + * progress, and cleanup did not start yet. This means, counter should be
> + * decremented back, and isolation exit that is already in progress, should
> + * be allowed to complete. Otherwise, a new isolation exit procedure should
> + * be started.
> + */
> +DEFINE_PER_CPU(atomic_t, isol_exit_counter);
> +
> +/*
> + * Descriptor for isolation-breaking SMP calls
> + */
> +DEFINE_PER_CPU(call_single_data_t, isol_break_csd);
> +
> +cpumask_var_t task_isolation_map;
> +cpumask_var_t task_isolation_cleanup_map;
> +static DEFINE_SPINLOCK(task_isolation_cleanup_lock);
> +
> +/* We can run on cpus that are isolated from the scheduler and are nohz_full. */
> +static int __init task_isolation_init(void)
> +{
> +	alloc_bootmem_cpumask_var(&task_isolation_cleanup_map);
> +	if (alloc_cpumask_var(&task_isolation_map, GFP_KERNEL))
> +		/*
> +		 * At this point task isolation should match
> +		 * nohz_full. This may change in the future.
> +		 */
> +		cpumask_copy(task_isolation_map, tick_nohz_full_mask);
> +	return 0;
> +}
> +core_initcall(task_isolation_init)
> +
> +/* Enable stack backtraces of any interrupts of task_isolation cores. */
> +static bool task_isolation_debug;
> +static int __init task_isolation_debug_func(char *str)
> +{
> +	task_isolation_debug = true;
> +	return 1;
> +}
> +__setup("task_isolation_debug", task_isolation_debug_func);
> +
> +/*
> + * Record name, pid and group pid of the task entering isolation on
> + * the current CPU.
> + */
> +static void record_curr_isolated_task(void)
> +{
> +	int ind;
> +	int cpu = smp_processor_id();
> +	struct isol_task_desc *desc = &per_cpu(isol_task_descs, cpu);
> +	struct task_struct *task = current;
> +
> +	/* Finish everything before recording current task */
> +	smp_mb();
> +	ind = atomic_inc_return(&desc->curr_index_wr) & 1;
> +	desc->comm[ind][sizeof(task->comm) - 1] = '\0';
> +	memcpy(desc->comm[ind], task->comm, sizeof(task->comm) - 1);
> +	desc->pid[ind] = task->pid;
> +	desc->tgid[ind] = task->tgid;
> +	desc->warned[ind] = false;
> +	/* Write everything, to be seen by other CPUs */
> +	smp_mb();
> +	atomic_inc(&desc->curr_index);
> +	/* Everyone will see the new record from this point */
> +	smp_mb();
> +}
> +
> +/*
> + * Print message prefixed with the description of the current (or
> + * last) isolated task on a given CPU. Intended for isolation breaking
> + * messages that include target task for the user's convenience.
> + *
> + * Messages produced with this function may have obsolete task
> + * information if isolated tasks managed to exit, start and enter
> + * isolation multiple times, or multiple tasks tried to enter
> + * isolation on the same CPU at once. For those unusual cases it would
> + * contain a valid description of the cause for isolation breaking and
> + * target CPU number, just not the correct description of which task
> + * ended up losing isolation.
> + */
> +int task_isolation_message(int cpu, int level, bool supp, const char *fmt, ...)
> +{
> +	struct isol_task_desc *desc;
> +	struct task_struct *task;
> +	va_list args;
> +	char buf_prefix[TASK_COMM_LEN + 20 + 3 * 20];
> +	char buf[200];
> +	int curr_cpu, ind_counter, ind_counter_old, ind;
> +
> +	curr_cpu = get_cpu();
> +	desc = &per_cpu(isol_task_descs, cpu);
> +	ind_counter = atomic_read(&desc->curr_index);
> +
> +	if (curr_cpu == cpu) {
> +		/*
> +		 * Message is for the current CPU so current
> +		 * task_struct should be used instead of cached
> +		 * information.
> +		 *
> +		 * Like in other diagnostic messages, if issued from
> +		 * interrupt context, current will be the interrupted
> +		 * task. Unlike other diagnostic messages, this is
> +		 * always relevant because the message is about
> +		 * interrupting a task.
> +		 */
> +		ind = ind_counter & 1;
> +		if (supp && desc->warned[ind]) {
> +			/*
> +			 * If supp is true, skip the message if the
> +			 * same task was mentioned in the message
> +			 * originated on remote CPU, and it did not
> +			 * re-enter isolated state since then (warned
> +			 * is true). Only local messages following
> +			 * remote messages, likely about the same
> +			 * isolation breaking event, are skipped to
> +			 * avoid duplication. If remote cause is
> +			 * immediately followed by a local one before
> +			 * isolation is broken, local cause is skipped
> +			 * from messages.
> +			 */
> +			put_cpu();
> +			return 0;
> +		}
> +		task = current;
> +		snprintf(buf_prefix, sizeof(buf_prefix),
> +			 "isolation %s/%d/%d (cpu %d)",
> +			 task->comm, task->tgid, task->pid, cpu);
> +		put_cpu();
> +	} else {
> +		/*
> +		 * Message is for remote CPU, use cached information.
> +		 */
> +		put_cpu();
> +		/*
> +		 * Make sure, index remained unchanged while data was
> +		 * copied. If it changed, data that was copied may be
> +		 * inconsistent because two updates in a sequence could
> +		 * overwrite the data while it was being read.
> +		 */
> +		do {
> +			/* Make sure we are reading up to date values */
> +			smp_mb();
> +			ind = ind_counter & 1;
> +			snprintf(buf_prefix, sizeof(buf_prefix),
> +				 "isolation %s/%d/%d (cpu %d)",
> +				 desc->comm[ind], desc->tgid[ind],
> +				 desc->pid[ind], cpu);
> +			desc->warned[ind] = true;
> +			ind_counter_old = ind_counter;
> +			/* Record the warned flag, then re-read descriptor */
> +			smp_mb();
> +			ind_counter = atomic_read(&desc->curr_index);
> +			/*
> +			 * If the counter changed, something was updated, so
> +			 * repeat everything to get the current data
> +			 */
> +		} while (ind_counter != ind_counter_old);
> +	}
> +
> +	va_start(args, fmt);
> +	vsnprintf(buf, sizeof(buf), fmt, args);
> +	va_end(args);
> +
> +	switch (level) {
> +	case LOGLEVEL_EMERG:
> +		pr_emerg("%s: %s", buf_prefix, buf);
> +		break;
> +	case LOGLEVEL_ALERT:
> +		pr_alert("%s: %s", buf_prefix, buf);
> +		break;
> +	case LOGLEVEL_CRIT:
> +		pr_crit("%s: %s", buf_prefix, buf);
> +		break;
> +	case LOGLEVEL_ERR:
> +		pr_err("%s: %s", buf_prefix, buf);
> +		break;
> +	case LOGLEVEL_WARNING:
> +		pr_warn("%s: %s", buf_prefix, buf);
> +		break;
> +	case LOGLEVEL_NOTICE:
> +		pr_notice("%s: %s", buf_prefix, buf);
> +		break;
> +	case LOGLEVEL_INFO:
> +		pr_info("%s: %s", buf_prefix, buf);
> +		break;
> +	case LOGLEVEL_DEBUG:
> +		pr_debug("%s: %s", buf_prefix, buf);
> +		break;
> +	default:
> +		/* No message without a valid level */
> +		return 0;
> +	}
> +	return 1;
> +}
> +
> +/*
> + * Dump stack if need be. This can be helpful even from the final exit
> + * to usermode code since stack traces sometimes carry information about
> + * what put you into the kernel, e.g. an interrupt number encoded in
> + * the initial entry stack frame that is still visible at exit time.
> + */
> +static void debug_dump_stack(void)
> +{
> +	if (task_isolation_debug)
> +		dump_stack();
> +}
> +
> +/*
> + * Set the flags word but don't try to actually start task isolation yet.
> + * We will start it when entering user space in task_isolation_start().
> + */
> +int task_isolation_request(unsigned int flags)
> +{
> +	struct task_struct *task = current;
> +
> +	/*
> +	 * The task isolation flags should always be cleared just by
> +	 * virtue of having entered the kernel.
> +	 */
> +	WARN_ON_ONCE(test_tsk_thread_flag(task, TIF_TASK_ISOLATION));
> +	WARN_ON_ONCE(task->task_isolation_flags != 0);
> +	WARN_ON_ONCE(task->task_isolation_state != STATE_NORMAL);
> +
> +	task->task_isolation_flags = flags;
> +	if (!(task->task_isolation_flags & PR_TASK_ISOLATION_ENABLE))
> +		return 0;
> +
> +	/* We are trying to enable task isolation. */
> +	set_tsk_thread_flag(task, TIF_TASK_ISOLATION);
> +
> +	/*
> +	 * Shut down the vmstat worker so we're not interrupted later.
> +	 * We have to try to do this here (with interrupts enabled) since
> +	 * we are canceling delayed work and will call flush_work()
> +	 * (which enables interrupts) and possibly schedule().
> +	 */
> +	quiet_vmstat_sync();
> +
> +	/* We return 0 here but we may change that in task_isolation_start(). */
> +	return 0;
> +}
> +
> +/*
> + * Perform actions that should be done immediately on exit from isolation.
> + */
> +static void fast_task_isolation_cpu_cleanup(void *info)
> +{
> +	atomic_dec(&per_cpu(isol_exit_counter, smp_processor_id()));
> +	/* At this point breaking isolation from other CPUs is possible again */
> +
> +	/*
> +	 * This task is no longer isolated (and if by any chance this
> +	 * is the wrong task, it's already not isolated)
> +	 */
> +	current->task_isolation_flags = 0;
> +	clear_tsk_thread_flag(current, TIF_TASK_ISOLATION);
> +
> +	/* Run the rest of cleanup later */
> +	set_tsk_thread_flag(current, TIF_NOTIFY_RESUME);
> +
> +	/* Copy flags with task isolation disabled */
> +	this_cpu_write(tsk_thread_flags_cache,
> +		       READ_ONCE(task_thread_info(current)->flags));
> +}
> +
> +/* Disable task isolation for the specified task. */
> +static void stop_isolation(struct task_struct *p)
> +{
> +	int cpu, this_cpu;
> +	unsigned long flags;
> +
> +	this_cpu = get_cpu();
> +	cpu = task_cpu(p);
> +	if (atomic_inc_return(&per_cpu(isol_exit_counter, cpu)) > 1) {
> +		/* Already exiting isolation */
> +		atomic_dec(&per_cpu(isol_exit_counter, cpu));
> +		put_cpu();
> +		return;
> +	}
> +
> +	if (p == current) {
> +		p->task_isolation_state = STATE_NORMAL;
> +		fast_task_isolation_cpu_cleanup(NULL);
> +		task_isolation_cpu_cleanup();
> +		if (atomic_dec_return(&per_cpu(isol_counter, cpu)) < 0) {
> +			/* Is not isolated already */
> +			atomic_inc(&per_cpu(isol_counter, cpu));
> +		}
> +		put_cpu();
> +	} else {
> +		if (atomic_dec_return(&per_cpu(isol_counter, cpu)) < 0) {
> +			/* Is not isolated already */
> +			atomic_inc(&per_cpu(isol_counter, cpu));
> +			atomic_dec(&per_cpu(isol_exit_counter, cpu));
> +			put_cpu();
> +			return;
> +		}
> +		/*
> +		 * Schedule "slow" cleanup. This relies on
> +		 * TIF_NOTIFY_RESUME being set
> +		 */
> +		spin_lock_irqsave(&task_isolation_cleanup_lock, flags);
> +		cpumask_set_cpu(cpu, task_isolation_cleanup_map);
> +		spin_unlock_irqrestore(&task_isolation_cleanup_lock, flags);
> +		/*
> +		 * Setting flags is delegated to the CPU where
> +		 * isolated task is running
> +		 * isol_exit_counter will be decremented from there as well.
> +		 */
> +		per_cpu(isol_break_csd, cpu).func =
> +		    fast_task_isolation_cpu_cleanup;
> +		per_cpu(isol_break_csd, cpu).info = NULL;
> +		per_cpu(isol_break_csd, cpu).flags = 0;
> +		smp_call_function_single_async(cpu,
> +					       &per_cpu(isol_break_csd, cpu));
> +		put_cpu();
> +	}
> +}
> +
> +/*
> + * This code runs with interrupts disabled just before the return to
> + * userspace, after a prctl() has requested enabling task isolation.
> + * We take whatever steps are needed to avoid being interrupted later:
> + * drain the lru pages, stop the scheduler tick, etc.  More
> + * functionality may be added here later to avoid other types of
> + * interrupts from other kernel subsystems.
> + *
> + * If we can't enable task isolation, we update the syscall return
> + * value with an appropriate error.
> + */
> +void task_isolation_start(void)
> +{
> +	int error;
> +
> +	/*
> +	 * We should only be called in STATE_NORMAL (isolation disabled),
> +	 * on our way out of the kernel from the prctl() that turned it on.
> +	 * If we are exiting from the kernel in another state, it means we
> +	 * made it back into the kernel without disabling task isolation,
> +	 * and we should investigate how (and in any case disable task
> +	 * isolation at this point).  We are clearly not on the path back
> +	 * from the prctl() so we don't touch the syscall return value.
> +	 */
> +	if (WARN_ON_ONCE(current->task_isolation_state != STATE_NORMAL)) {
> +		/* Increment counter, this will allow isolation breaking */
> +		if (atomic_inc_return(&per_cpu(isol_counter,
> +					      smp_processor_id())) > 1) {
> +			atomic_dec(&per_cpu(isol_counter, smp_processor_id()));
> +		}
> +		atomic_inc(&per_cpu(isol_counter, smp_processor_id()));
> +		stop_isolation(current);
> +		return;
> +	}
> +
> +	/*
> +	 * Must be affinitized to a single core with task isolation possible.
> +	 * In principle this could be remotely modified between the prctl()
> +	 * and the return to userspace, so we have to check it here.
> +	 */
> +	if (current->nr_cpus_allowed != 1 ||
> +	    !is_isolation_cpu(smp_processor_id())) {
> +		error = -EINVAL;
> +		goto error;
> +	}
> +
> +	/* If the vmstat delayed work is not canceled, we have to try again. */
> +	if (!vmstat_idle()) {
> +		error = -EAGAIN;
> +		goto error;
> +	}
> +
> +	/* Try to stop the dynamic tick. */
> +	error = try_stop_full_tick();
> +	if (error)
> +		goto error;
> +
> +	/* Drain the pagevecs to avoid unnecessary IPI flushes later. */
> +	lru_add_drain();
> +
> +	/* Increment counter, this will allow isolation breaking */
> +	if (atomic_inc_return(&per_cpu(isol_counter,
> +				      smp_processor_id())) > 1) {
> +		atomic_dec(&per_cpu(isol_counter, smp_processor_id()));
> +	}
> +
> +	/* Record isolated task IDs and name */
> +	record_curr_isolated_task();
> +
> +	/* Copy flags with task isolation enabled */
> +	this_cpu_write(tsk_thread_flags_cache,
> +		       READ_ONCE(task_thread_info(current)->flags));
> +
> +	current->task_isolation_state = STATE_ISOLATED;
> +	return;
> +
> +error:
> +	/* Increment counter, this will allow isolation breaking */
> +	if (atomic_inc_return(&per_cpu(isol_counter,
> +				      smp_processor_id())) > 1) {
> +		atomic_dec(&per_cpu(isol_counter, smp_processor_id()));
> +	}
> +	stop_isolation(current);
> +	syscall_set_return_value(current, current_pt_regs(), error, 0);
> +}
> +
> +/* Stop task isolation on the remote task and send it a signal. */
> +static void send_isolation_signal(struct task_struct *task)
> +{
> +	int flags = task->task_isolation_flags;
> +	kernel_siginfo_t info = {
> +		.si_signo = PR_TASK_ISOLATION_GET_SIG(flags) ?: SIGKILL,
> +	};
> +
> +	stop_isolation(task);
> +	send_sig_info(info.si_signo, &info, task);
> +}
> +
> +/* Only a few syscalls are valid once we are in task isolation mode. */
> +static bool is_acceptable_syscall(int syscall)
> +{
> +	/* No need to incur an isolation signal if we are just exiting. */
> +	if (syscall == __NR_exit || syscall == __NR_exit_group)
> +		return true;
> +
> +	/* Check to see if it's the prctl for isolation. */
> +	if (syscall == __NR_prctl) {
> +		unsigned long arg[SYSCALL_MAX_ARGS];

gcc output:

kernel/isolation.c: In function 'is_acceptable_syscall':
kernel/isolation.c:511:21: error: 'SYSCALL_MAX_ARGS' undeclared (first use in this function); did you mean 'SYSCALL_ALIAS'?
   unsigned long arg[SYSCALL_MAX_ARGS];
                     ^~~~~~~~~~~~~~~~
                     SYSCALL_ALIAS
kernel/isolation.c:511:21: note: each undeclared identifier is reported only once for each function it appears in
kernel/isolation.c:511:17: warning: unused variable 'arg' [-Wunused-variable]
   unsigned long arg[SYSCALL_MAX_ARGS];
                 ^~~
make[1]: *** [scripts/Makefile.build:267: kernel/isolation.o] Error 1
make[1]: *** Waiting for unfinished jobs....
make: *** [Makefile:1683: kernel] Error 2

quick search: 

grep -IHrn SYSCALL_MAX_ARGS
arch/arm/include/asm/syscall.h:54:#define SYSCALL_MAX_ARGS 7
arch/arm64/include/asm/syscall.h:53:#define SYSCALL_MAX_ARGS 6
arch/xtensa/include/asm/syscall.h:57:#define SYSCALL_MAX_ARGS 6
arch/x86/include/asm/syscall.h:91:#define SYSCALL_MAX_ARGS 6
arch/nds32/include/asm/syscall.h:125:#define SYSCALL_MAX_ARGS 6
kernel/isolation.c:511:         unsigned long arg[SYSCALL_MAX_ARGS];

my fix:

diff --git a/arch/x86/include/asm/syscall.h b/arch/x86/include/asm/syscall.h
index 8db3fdb6102e..b9de03a24e4f 100644
--- a/arch/x86/include/asm/syscall.h
+++ b/arch/x86/include/asm/syscall.h
@@ -88,6 +88,8 @@ static inline void syscall_set_return_value(struct task_struct *task,
        regs->ax = (long) error ?: val;
 }
 
+#define SYSCALL_MAX_ARGS 6
+
 #ifdef CONFIG_X86_32
 
 static inline void syscall_get_arguments(struct task_struct *task,
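
Just a thought, untested: since syscall_get_arguments() nowadays
always fills a 6-entry array, a fixed-size array in kernel/isolation.c
would avoid adding the define to every arch header:

-		unsigned long arg[SYSCALL_MAX_ARGS];
+		unsigned long arg[6];	/* syscall_get_arguments() fills 6 */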




thx,

-- Kevyn-Alexandre Paré


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* Re: [PATCH v3 00/13] "Task_isolation" mode
  2020-03-08  3:42 ` [PATCH v2 00/12] "Task_isolation" mode Alex Belits
                     ` (11 preceding siblings ...)
  2020-03-08  3:57   ` [PATCH v2 12/12] task_isolation: CONFIG_TASK_ISOLATION prevents distribution of jobs to non-housekeeping CPUs Alex Belits
@ 2020-04-09 15:09   ` Alex Belits
  2020-04-09 15:15     ` [PATCH 01/13] task_isolation: vmstat: add quiet_vmstat_sync function Alex Belits
                       ` (12 more replies)
  12 siblings, 13 replies; 71+ messages in thread
From: Alex Belits @ 2020-04-09 15:09 UTC (permalink / raw)
  To: frederic, rostedt
  Cc: Prasun Kapoor, mingo, davem, linux-api, peterz, linux-arch,
	catalin.marinas, tglx, will, linux-arm-kernel, linux-kernel,
	netdev


On Sat, 2020-03-07 at 19:42 -0800, Alex Belits wrote:
> This is the updated version of task isolation patchset.
> 
> 1. Commit messages updated to match changes.
> 2. Sign-off lines restored from original patches, changes listed
> wherever applicable.
> 3. arm platform -- added missing calls to syscall check and cleanup
> procedure after leaving isolation.
> 4. x86 platform -- added missing calls to cleanup procedure after
> leaving isolation.
> 

Another update, addressing CPU state / race conditions.

I believe I have a usable solution for the problems of both missed
events and race conditions on isolation entry and exit.

The idea is to make sure that the CPU core remains in userspace and
runs userspace code regardless of what is happening in the kernel and
userspace on the rest of the system; however, any event that results
in running anything other than userspace code causes the CPU core to
re-synchronize with the rest of the system. Then any kernel code,
with the exception of a small and clearly defined set of routines
that only perform kernel entry / exit themselves, runs on the CPU
only after all synchronization is done.

This does require an answer to possible races between isolation
entry / exit (including isolation breaking on interrupts) and updates
that are normally carried by IPIs. So the solution should involve
some mechanism that limits what runs on a CPU in its "stale" state
and forces synchronization before the rest of the kernel is called.
This should also cover preemption -- if preemption happens in that
"stale" state after entering the kernel but before synchronization is
completed, the task should still go through synchronization before
running the rest of the kernel.

Then, as long as it can be demonstrated that the routines running in
the "stale" state are safe to run in it, and that any event that
would normally require an IPI results in entering the rest of the
kernel only after synchronization, the race ceases to be a problem.
Any sequence of events results in exactly the same CPU state when
hitting the rest of the kernel as if the CPU had processed the update
event through an IPI.

I was under the impression that this was already the case; however,
after a closer look it appears that some barriers must be in place to
make sure that this sequence of events is enforced.

There is obviously a question of performance -- we don't want to
cause any additional flushes or add locking to anything
time-critical. Fortunately, entering and exiting isolation (as
opposed to events that _potentially_ call isolation-breaking
routines) is never performance-critical: it is what starts and ends a
task that has no performance-critical communication with the kernel.
So if a CPU that has an isolated task on it is running kernel code,
either the task is not isolated yet (we are exiting to userspace), or
it is no longer running anything performance-critical (intentionally
on exit from isolation, or unintentionally on an isolation-breaking
event).

Isolation state is read-mostly, and we would prefer RCU for it if we
could guarantee that the "stale" state remains safe in all code that
runs until synchronization happens. I am not sure of that, so I tried
to make something more straightforward; however, I might be wrong,
and RCU-ifying exit from isolation may be a better way to do it.

For now I want to make sure that there is a clearly defined, small
amount of kernel code that runs before the inevitable
synchronization, and that this code is unaffected by the "stale"
state.
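
To make the intended ordering explicit, here is an illustrative
sketch (made-up names everywhere except
fast_task_isolation_cpu_cleanup(); the real hooks live in the
per-architecture entry code) of what every kernel entry path should
reduce to:

/* Illustrative pseudo-code only, not a real entry path. */
void hypothetical_kernel_entry(void)
{
	/*
	 * Code running here may still observe the "stale" state, so it
	 * has to be limited to the small set of entry/exit routines
	 * that are known to be safe in that state.
	 */
	if (task_isolation_on_cpu(smp_processor_id()))
		fast_task_isolation_cpu_cleanup(NULL);	/* synchronize */

	/*
	 * From this point on the CPU is synchronized with the rest of
	 * the system, and the rest of the kernel may run normally.
	 */
	rest_of_kernel_entry();		/* made-up name */
}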

I have tried to track down all call paths from kernel entry points
to the call of fast_task_isolation_cpu_cleanup(), and will post those
separately. It's possible that all architecture-specific code already
follows some clearly defined rules about this for other reasons;
however, I am not that familiar with all of it, and tried to check
whether the existing implementation is always safe to run in the
"stale" state before everything that makes task isolation call its
cleanup. For now, this is an implementation that assumes that the
"stale" state is safe for kernel entry.

^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 01/13] task_isolation: vmstat: add quiet_vmstat_sync function
  2020-04-09 15:09   ` [PATCH v3 00/13] "Task_isolation" mode Alex Belits
@ 2020-04-09 15:15     ` Alex Belits
  2020-04-09 15:16     ` [PATCH 02/13] task_isolation: vmstat: add vmstat_idle function Alex Belits
                       ` (11 subsequent siblings)
  12 siblings, 0 replies; 71+ messages in thread
From: Alex Belits @ 2020-04-09 15:15 UTC (permalink / raw)
  To: frederic, rostedt
  Cc: Prasun Kapoor, mingo, davem, linux-api, peterz, linux-arch,
	catalin.marinas, tglx, will, linux-arm-kernel, linux-kernel,
	netdev

In commit f01f17d3705b ("mm, vmstat: make quiet_vmstat lighter")
the quiet_vmstat() function became asynchronous, in the sense that
the vmstat work was still scheduled to run on the core when the
function returned.  For task isolation, we need a synchronous
version of the function that guarantees that the vmstat worker
will not run on the core on return from the function.  Add a
quiet_vmstat_sync() function with that semantic.
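
For illustration only (the real caller is the task-isolation prctl
path added later in this series), the intended use is:

/* Sketch, not part of this patch. */
static void example_quiesce_vmstat_for_isolation(void)
{
	/*
	 * Must run with interrupts enabled: cancel_delayed_work_sync()
	 * inside quiet_vmstat_sync() may sleep while waiting for a
	 * running vmstat worker to finish.
	 */
	quiet_vmstat_sync();
}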

Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com>
Signed-off-by: Alex Belits <abelits@marvell.com>
---
 include/linux/vmstat.h | 2 ++
 mm/vmstat.c            | 9 +++++++++
 2 files changed, 11 insertions(+)

diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index 292485f3d24d..2bc5e85f2514 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -270,6 +270,7 @@ extern void __dec_zone_state(struct zone *, enum zone_stat_item);
 extern void __dec_node_state(struct pglist_data *, enum node_stat_item);
 
 void quiet_vmstat(void);
+void quiet_vmstat_sync(void);
 void cpu_vm_stats_fold(int cpu);
 void refresh_zone_stat_thresholds(void);
 
@@ -372,6 +373,7 @@ static inline void __dec_node_page_state(struct page *page,
 static inline void refresh_zone_stat_thresholds(void) { }
 static inline void cpu_vm_stats_fold(int cpu) { }
 static inline void quiet_vmstat(void) { }
+static inline void quiet_vmstat_sync(void) { }
 
 static inline void drain_zonestat(struct zone *zone,
 			struct per_cpu_pageset *pset) { }
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 78d53378db99..1fa0b2d04afa 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1870,6 +1870,15 @@ void quiet_vmstat(void)
 	refresh_cpu_vm_stats(false);
 }
 
+/*
+ * Synchronously quiet vmstat so the work is guaranteed not to run on return.
+ */
+void quiet_vmstat_sync(void)
+{
+	cancel_delayed_work_sync(this_cpu_ptr(&vmstat_work));
+	refresh_cpu_vm_stats(false);
+}
+
 /*
  * Shepherd worker thread that checks the
  * differentials of processors that have their worker
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH 02/13] task_isolation: vmstat: add vmstat_idle function
  2020-04-09 15:09   ` [PATCH v3 00/13] "Task_isolation" mode Alex Belits
  2020-04-09 15:15     ` [PATCH 01/13] task_isolation: vmstat: add quiet_vmstat_sync function Alex Belits
@ 2020-04-09 15:16     ` Alex Belits
  2020-04-09 15:17     ` [PATCH v3 03/13] task_isolation: add instruction synchronization memory barrier Alex Belits
                       ` (10 subsequent siblings)
  12 siblings, 0 replies; 71+ messages in thread
From: Alex Belits @ 2020-04-09 15:16 UTC (permalink / raw)
  To: frederic, rostedt
  Cc: Prasun Kapoor, mingo, davem, linux-api, peterz, linux-arch,
	catalin.marinas, tglx, will, linux-arm-kernel, linux-kernel,
	netdev

This function checks that the vmstat worker is not running and that
the vmstat diffs don't require an update.  The function is called
from the task-isolation code to see if we need to actually do some
work to quiet vmstat.
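
For illustration only (the function name below is made up; the real
caller is the task-isolation code added later in this series), the
expected use is a final check on the way back to userspace:

/* Sketch, not part of this patch. */
static int example_check_vmstat_quiesced(void)
{
	/*
	 * If the vmstat worker is still pending, or the per-cpu diffs
	 * still need folding, isolation cannot start yet; report
	 * -EAGAIN so userspace can retry the prctl().
	 */
	if (!vmstat_idle())
		return -EAGAIN;
	return 0;
}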

Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com>
Signed-off-by: Alex Belits <abelits@marvell.com>
---
 include/linux/vmstat.h |  2 ++
 mm/vmstat.c            | 10 ++++++++++
 2 files changed, 12 insertions(+)

diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index 2bc5e85f2514..66d9ae32cf07 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -271,6 +271,7 @@ extern void __dec_node_state(struct pglist_data *, enum node_stat_item);
 
 void quiet_vmstat(void);
 void quiet_vmstat_sync(void);
+bool vmstat_idle(void);
 void cpu_vm_stats_fold(int cpu);
 void refresh_zone_stat_thresholds(void);
 
@@ -374,6 +375,7 @@ static inline void refresh_zone_stat_thresholds(void) { }
 static inline void cpu_vm_stats_fold(int cpu) { }
 static inline void quiet_vmstat(void) { }
 static inline void quiet_vmstat_sync(void) { }
+static inline bool vmstat_idle(void) { return true; }
 
 static inline void drain_zonestat(struct zone *zone,
 			struct per_cpu_pageset *pset) { }
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 1fa0b2d04afa..5c4aec651062 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1879,6 +1879,16 @@ void quiet_vmstat_sync(void)
 	refresh_cpu_vm_stats(false);
 }
 
+/*
+ * Report on whether vmstat processing is quiesced on the core currently:
+ * no vmstat worker running and no vmstat updates to perform.
+ */
+bool vmstat_idle(void)
+{
+	return !delayed_work_pending(this_cpu_ptr(&vmstat_work)) &&
+		!need_update(smp_processor_id());
+}
+
 /*
  * Shepherd worker thread that checks the
  * differentials of processors that have their worker
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v3 03/13] task_isolation: add instruction synchronization memory barrier
  2020-04-09 15:09   ` [PATCH v3 00/13] "Task_isolation" mode Alex Belits
  2020-04-09 15:15     ` [PATCH 01/13] task_isolation: vmstat: add quiet_vmstat_sync function Alex Belits
  2020-04-09 15:16     ` [PATCH 02/13] task_isolation: vmstat: add vmstat_idle function Alex Belits
@ 2020-04-09 15:17     ` Alex Belits
  2020-04-15 12:44       ` Mark Rutland
  2020-04-09 15:20     ` [PATCH v3 04/13] task_isolation: userspace hard isolation from kernel Alex Belits
                       ` (9 subsequent siblings)
  12 siblings, 1 reply; 71+ messages in thread
From: Alex Belits @ 2020-04-09 15:17 UTC (permalink / raw)
  To: frederic, rostedt
  Cc: Prasun Kapoor, mingo, davem, linux-api, peterz, linux-arch,
	catalin.marinas, tglx, will, linux-arm-kernel, linux-kernel,
	netdev

Some architectures implement memory synchronization instructions for
the instruction cache.  Add a separate kind of barrier, imb(), that
issues them.
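
A hypothetical example of where such a barrier is meant to be used
(names are made up, this is not code from the series): making sure a
CPU does not execute stale instructions after code has been modified
and the required cache maintenance has already been done.

/* Sketch only. */
static void example_run_patched_code(void (*patched_func)(void))
{
	/* Discard stale fetched/decoded instructions on this CPU. */
	imb();
	patched_func();
}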

Signed-off-by: Alex Belits <abelits@marvell.com>
---
 arch/arm/include/asm/barrier.h   | 2 ++
 arch/arm64/include/asm/barrier.h | 2 ++
 include/asm-generic/barrier.h    | 4 ++++
 3 files changed, 8 insertions(+)

diff --git a/arch/arm/include/asm/barrier.h b/arch/arm/include/asm/barrier.h
index 83ae97c049d9..6def62c95937 100644
--- a/arch/arm/include/asm/barrier.h
+++ b/arch/arm/include/asm/barrier.h
@@ -64,12 +64,14 @@ extern void arm_heavy_mb(void);
 #define mb()		__arm_heavy_mb()
 #define rmb()		dsb()
 #define wmb()		__arm_heavy_mb(st)
+#define imb()		isb()
 #define dma_rmb()	dmb(osh)
 #define dma_wmb()	dmb(oshst)
 #else
 #define mb()		barrier()
 #define rmb()		barrier()
 #define wmb()		barrier()
+#define imb()		barrier()
 #define dma_rmb()	barrier()
 #define dma_wmb()	barrier()
 #endif
diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
index 7d9cc5ec4971..12a7dbd68bed 100644
--- a/arch/arm64/include/asm/barrier.h
+++ b/arch/arm64/include/asm/barrier.h
@@ -45,6 +45,8 @@
 #define rmb()		dsb(ld)
 #define wmb()		dsb(st)
 
+#define imb()		isb()
+
 #define dma_rmb()	dmb(oshld)
 #define dma_wmb()	dmb(oshst)
 
diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
index 85b28eb80b11..d5a822fb3e92 100644
--- a/include/asm-generic/barrier.h
+++ b/include/asm-generic/barrier.h
@@ -46,6 +46,10 @@
 #define dma_wmb()	wmb()
 #endif
 
+#ifndef imb
+#define imb()	barrier()
+#endif
+
 #ifndef read_barrier_depends
 #define read_barrier_depends()		do { } while (0)
 #endif
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v3 04/13] task_isolation: userspace hard isolation from kernel
  2020-04-09 15:09   ` [PATCH v3 00/13] "Task_isolation" mode Alex Belits
                       ` (2 preceding siblings ...)
  2020-04-09 15:17     ` [PATCH v3 03/13] task_isolation: add instruction synchronization memory barrier Alex Belits
@ 2020-04-09 15:20     ` Alex Belits
  2020-04-09 18:00       ` Andy Lutomirski
  2020-04-09 15:21     ` [PATCH 05/13] task_isolation: Add task isolation hooks to arch-independent code Alex Belits
                       ` (8 subsequent siblings)
  12 siblings, 1 reply; 71+ messages in thread
From: Alex Belits @ 2020-04-09 15:20 UTC (permalink / raw)
  To: frederic, rostedt
  Cc: Prasun Kapoor, mingo, davem, linux-api, peterz, linux-arch,
	catalin.marinas, tglx, will, linux-arm-kernel, linux-kernel,
	netdev

The existing nohz_full mode is designed as a "soft" isolation mode
that makes tradeoffs to minimize userspace interruptions while
still attempting to avoid overheads in the kernel entry/exit path,
to provide 100% kernel semantics, etc.

However, some applications require a "hard" commitment from the
kernel to avoid interruptions, in particular userspace device driver
style applications, such as high-speed networking code.

This change introduces a framework to allow applications
to elect to have the "hard" semantics as needed, specifying
prctl(PR_TASK_ISOLATION, PR_TASK_ISOLATION_ENABLE) to do so.

The kernel must be built with the new TASK_ISOLATION Kconfig flag
to enable this mode, and the kernel booted with an appropriate
"isolcpus=nohz,domain,CPULIST" boot argument to enable
nohz_full and isolcpus. The "task_isolation" state is then indicated
by setting a new task struct field, task_isolation_flags, to the
value passed by prctl(), and also setting a TIF_TASK_ISOLATION
bit in the thread_info flags. When the kernel is returning to
userspace from the prctl() call and sees TIF_TASK_ISOLATION set,
it calls the new task_isolation_start() routine to arrange for
the task to avoid being interrupted in the future.

With interrupts disabled, task_isolation_start() ensures that kernel
subsystems that might cause a future interrupt are quiesced. If it
doesn't succeed, it adjusts the syscall return value to indicate that
fact, and userspace can retry as desired. In addition to stopping
the scheduler tick, the code takes any actions that might avoid
a future interrupt to the core, such as a worker thread being
scheduled that could be quiesced now (e.g. the vmstat worker)
or a future IPI to the core to clean up some state that could be
cleaned up now (e.g. the mm lru per-cpu cache).

Once the task has returned to userspace after issuing the prctl(),
if it enters the kernel again via system call, page fault, or any
other exception or irq, the kernel will kill it with SIGKILL.
In addition to sending a signal, the code supports a kernel
command-line "task_isolation_debug" flag which causes a stack
backtrace to be generated whenever a task loses isolation.

To allow the state to be entered and exited, the syscall checking
test ignores the prctl(PR_TASK_ISOLATION) syscall so that we can
clear the bit again later, and ignores exit/exit_group to allow
exiting the task without a pointless signal being delivered.

The prctl() API allows for specifying a signal number to use instead
of the default SIGKILL, to allow for catching the notification
signal; for example, in a production environment, it might be
helpful to log information to the application logging mechanism
before exiting. Or, the signal handler might choose to reset the
program counter back to the code segment intended to be run isolated
via prctl() to continue execution.
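
For example (illustrative userspace sketch, not part of the patch;
the helper name and the choice of SIGUSR1 are arbitrary), a task that
has already been affinitized to a single isolated nohz_full core
could use the API like this:

#include <errno.h>
#include <signal.h>
#include <sys/prctl.h>

#ifndef PR_TASK_ISOLATION		/* values from this patch */
#define PR_TASK_ISOLATION		48
#define PR_TASK_ISOLATION_ENABLE	(1 << 0)
#define PR_TASK_ISOLATION_SET_SIG(sig)	(((sig) & 0x7f) << 8)
#endif

static int run_isolated(void (*fast_path)(void))
{
	int flags = PR_TASK_ISOLATION_ENABLE |
		    PR_TASK_ISOLATION_SET_SIG(SIGUSR1);

	/* Retry while the kernel reports it could not quiesce the core. */
	while (prctl(PR_TASK_ISOLATION, flags, 0, 0, 0) != 0) {
		if (errno != EAGAIN)
			return -1;
	}

	fast_path();	/* runs without kernel interruptions */

	/* Explicitly leave isolation before using the kernel again. */
	return prctl(PR_TASK_ISOLATION, 0, 0, 0, 0);
}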

In a number of cases we can tell on a remote cpu that we are
going to be interrupting the cpu, e.g. via an IPI or a TLB flush.
In that case we generate the diagnostic (and optional stack dump)
on the remote core to be able to deliver better diagnostics.
If the interrupt is not something caught by Linux (e.g. a
hypervisor interrupt) we can also request a reschedule IPI to
be sent to the remote core so it can be sure to generate a
signal to notify the process.

Separate patches that follow provide these changes for x86, arm,
and arm64.

Signed-off-by: Alex Belits <abelits@marvell.com>
---
 .../admin-guide/kernel-parameters.txt         |   6 +
 include/linux/hrtimer.h                       |   4 +
 include/linux/isolation.h                     | 229 +++++
 include/linux/sched.h                         |   4 +
 include/linux/tick.h                          |   3 +
 include/uapi/linux/prctl.h                    |   6 +
 init/Kconfig                                  |  28 +
 kernel/Makefile                               |   2 +
 kernel/context_tracking.c                     |   2 +
 kernel/isolation.c                            | 831 ++++++++++++++++++
 kernel/signal.c                               |   2 +
 kernel/sys.c                                  |   6 +
 kernel/time/hrtimer.c                         |  27 +
 kernel/time/tick-sched.c                      |  18 +
 14 files changed, 1168 insertions(+)
 create mode 100644 include/linux/isolation.h
 create mode 100644 kernel/isolation.c

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index c07815d230bc..e4a2d6e37645 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4808,6 +4808,12 @@
 			neutralize any effect of /proc/sys/kernel/sysrq.
 			Useful for debugging.
 
+	task_isolation_debug	[KNL]
+			In kernels built with CONFIG_TASK_ISOLATION, this
+			setting will generate console backtraces to
+			accompany the diagnostics generated about
+			interrupting tasks running with task isolation.
+
 	tcpmhash_entries= [KNL,NET]
 			Set the number of tcp_metrics_hash slots.
 			Default value is 8192 or 16384 depending on total
diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index 15c8ac313678..e81252eb4f92 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -528,6 +528,10 @@ extern void __init hrtimers_init(void);
 /* Show pending timers: */
 extern void sysrq_timer_list_show(void);
 
+#ifdef CONFIG_TASK_ISOLATION
+extern void kick_hrtimer(void);
+#endif
+
 int hrtimers_prepare_cpu(unsigned int cpu);
 #ifdef CONFIG_HOTPLUG_CPU
 int hrtimers_dead_cpu(unsigned int cpu);
diff --git a/include/linux/isolation.h b/include/linux/isolation.h
new file mode 100644
index 000000000000..6bd71c67f10f
--- /dev/null
+++ b/include/linux/isolation.h
@@ -0,0 +1,229 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Task isolation support
+ *
+ * Authors:
+ *   Chris Metcalf <cmetcalf@mellanox.com>
+ *   Alex Belits <abelits@marvell.com>
+ *   Yuri Norov <ynorov@marvell.com>
+ */
+#ifndef _LINUX_ISOLATION_H
+#define _LINUX_ISOLATION_H
+
+#include <stdarg.h>
+#include <linux/errno.h>
+#include <linux/cpumask.h>
+#include <linux/prctl.h>
+#include <linux/types.h>
+
+struct task_struct;
+
+#ifdef CONFIG_TASK_ISOLATION
+
+int task_isolation_message(int cpu, int level, bool supp, const char *fmt, ...);
+
+#define pr_task_isol_emerg(cpu, fmt, ...)			\
+	task_isolation_message(cpu, LOGLEVEL_EMERG, false, fmt, ##__VA_ARGS__)
+#define pr_task_isol_alert(cpu, fmt, ...)			\
+	task_isolation_message(cpu, LOGLEVEL_ALERT, false, fmt, ##__VA_ARGS__)
+#define pr_task_isol_crit(cpu, fmt, ...)			\
+	task_isolation_message(cpu, LOGLEVEL_CRIT, false, fmt, ##__VA_ARGS__)
+#define pr_task_isol_err(cpu, fmt, ...)				\
+	task_isolation_message(cpu, LOGLEVEL_ERR, false, fmt, ##__VA_ARGS__)
+#define pr_task_isol_warn(cpu, fmt, ...)			\
+	task_isolation_message(cpu, LOGLEVEL_WARNING, false, fmt, ##__VA_ARGS__)
+#define pr_task_isol_notice(cpu, fmt, ...)			\
+	task_isolation_message(cpu, LOGLEVEL_NOTICE, false, fmt, ##__VA_ARGS__)
+#define pr_task_isol_info(cpu, fmt, ...)			\
+	task_isolation_message(cpu, LOGLEVEL_INFO, false, fmt, ##__VA_ARGS__)
+#define pr_task_isol_debug(cpu, fmt, ...)			\
+	task_isolation_message(cpu, LOGLEVEL_DEBUG, false, fmt, ##__VA_ARGS__)
+
+#define pr_task_isol_emerg_supp(cpu, fmt, ...)			\
+	task_isolation_message(cpu, LOGLEVEL_EMERG, true, fmt, ##__VA_ARGS__)
+#define pr_task_isol_alert_supp(cpu, fmt, ...)			\
+	task_isolation_message(cpu, LOGLEVEL_ALERT, true, fmt, ##__VA_ARGS__)
+#define pr_task_isol_crit_supp(cpu, fmt, ...)			\
+	task_isolation_message(cpu, LOGLEVEL_CRIT, true, fmt, ##__VA_ARGS__)
+#define pr_task_isol_err_supp(cpu, fmt, ...)				\
+	task_isolation_message(cpu, LOGLEVEL_ERR, true, fmt, ##__VA_ARGS__)
+#define pr_task_isol_warn_supp(cpu, fmt, ...)			\
+	task_isolation_message(cpu, LOGLEVEL_WARNING, true, fmt, ##__VA_ARGS__)
+#define pr_task_isol_notice_supp(cpu, fmt, ...)			\
+	task_isolation_message(cpu, LOGLEVEL_NOTICE, true, fmt, ##__VA_ARGS__)
+#define pr_task_isol_info_supp(cpu, fmt, ...)			\
+	task_isolation_message(cpu, LOGLEVEL_INFO, true, fmt, ##__VA_ARGS__)
+#define pr_task_isol_debug_supp(cpu, fmt, ...)			\
+	task_isolation_message(cpu, LOGLEVEL_DEBUG, true, fmt, ##__VA_ARGS__)
+DECLARE_PER_CPU(unsigned long, tsk_thread_flags_copy);
+extern cpumask_var_t task_isolation_map;
+
+/**
+ * task_isolation_request() - prctl hook to request task isolation
+ * @flags:	Flags from <linux/prctl.h> PR_TASK_ISOLATION_xxx.
+ *
+ * This is called from the generic prctl() code for PR_TASK_ISOLATION.
+ *
+ * Return: Returns 0 when task isolation enabled, otherwise a negative
+ * errno.
+ */
+extern int task_isolation_request(unsigned int flags);
+extern void task_isolation_cpu_cleanup(void);
+/**
+ * task_isolation_start() - attempt to actually start task isolation
+ *
+ * This function should be invoked as the last thing prior to returning to
+ * user space if TIF_TASK_ISOLATION is set in the thread_info flags.  It
+ * will attempt to quiesce the core and enter task-isolation mode.  If it
+ * fails, it will reset the system call return value to an error code that
+ * indicates the failure mode.
+ */
+extern void task_isolation_start(void);
+
+/**
+ * is_isolation_cpu() - check if CPU is intended for running isolated tasks.
+ * @cpu:	CPU to check.
+ */
+static inline bool is_isolation_cpu(int cpu)
+{
+	return task_isolation_map != NULL &&
+		cpumask_test_cpu(cpu, task_isolation_map);
+}
+
+/**
+ * task_isolation_on_cpu() - check if the cpu is running isolated task
+ * @cpu:	CPU to check.
+ */
+extern int task_isolation_on_cpu(int cpu);
+extern void task_isolation_check_run_cleanup(void);
+
+/**
+ * task_isolation_cpumask() - set CPUs currently running isolated tasks
+ * @mask:	Mask to modify.
+ */
+extern void task_isolation_cpumask(struct cpumask *mask);
+
+/**
+ * task_isolation_clear_cpumask() - clear CPUs currently running isolated tasks
+ * @mask:      Mask to modify.
+ */
+extern void task_isolation_clear_cpumask(struct cpumask *mask);
+
+/**
+ * task_isolation_syscall() - report a syscall from an isolated task
+ * @nr:		The syscall number.
+ *
+ * This routine should be invoked at syscall entry if TIF_TASK_ISOLATION is
+ * set in the thread_info flags.  It checks for valid syscalls,
+ * specifically prctl() with PR_TASK_ISOLATION, exit(), and exit_group().
+ * For any other syscall it will raise a signal and return failure.
+ *
+ * Return: 0 for acceptable syscalls, -1 for all others.
+ */
+extern int task_isolation_syscall(int nr);
+
+/**
+ * _task_isolation_interrupt() - report an interrupt of an isolated task
+ * @fmt:	A format string describing the interrupt
+ * @...:	Format arguments, if any.
+ *
+ * This routine should be invoked at any exception or IRQ if
+ * TIF_TASK_ISOLATION is set in the thread_info flags.  It is not necessary
+ * to invoke it if the exception will generate a signal anyway (e.g. a bad
+ * page fault), and in that case it is preferable not to invoke it but just
+ * rely on the standard Linux signal.  The macro task_isolation_interrupt()
+ * wraps the TIF_TASK_ISOLATION flag test to simplify the caller code.
+ */
+extern void _task_isolation_interrupt(const char *fmt, ...);
+#define task_isolation_interrupt(fmt, ...)				\
+	do {								\
+		if (current_thread_info()->flags & _TIF_TASK_ISOLATION)	\
+			_task_isolation_interrupt(fmt, ## __VA_ARGS__);	\
+	} while (0)
+
+/**
+ * task_isolation_remote() - report a remote interrupt of an isolated task
+ * @cpu:	The remote cpu that is about to be interrupted.
+ * @fmt:	A format string describing the interrupt
+ * @...:	Format arguments, if any.
+ *
+ * This routine should be invoked any time a remote IPI or other type of
+ * interrupt is being delivered to another cpu. The function will check to
+ * see if the target core is running a task-isolation task, and generate a
+ * diagnostic on the console if so; in addition, we tag the task so it
+ * doesn't generate another diagnostic when the interrupt actually arrives.
+ * Generating a diagnostic remotely yields a clearer indication of what
+ * happened than just reporting only when the remote core is interrupted.
+ *
+ */
+extern void task_isolation_remote(int cpu, const char *fmt, ...);
+
+/**
+ * task_isolation_remote_cpumask() - report interruption of multiple cpus
+ * @mask:	The set of remote cpus that are about to be interrupted.
+ * @fmt:	A format string describing the interrupt
+ * @...:	Format arguments, if any.
+ *
+ * This is the cpumask variant of task_isolation_remote().  We
+ * generate a single-line diagnostic message even if multiple remote
+ * task-isolation cpus are being interrupted.
+ */
+extern void task_isolation_remote_cpumask(const struct cpumask *mask,
+					  const char *fmt, ...);
+
+/**
+ * _task_isolation_signal() - disable task isolation when signal is pending
+ * @task:	The task for which to disable isolation.
+ *
+ * This function generates a diagnostic and disables task isolation; it
+ * should be called if TIF_TASK_ISOLATION is set when notifying a task of a
+ * pending signal.  The task_isolation_interrupt() function normally
+ * generates a diagnostic for events that just interrupt a task without
+ * generating a signal; here we need to hook the paths that correspond to
+ * interrupts that do generate a signal.  The macro task_isolation_signal()
+ * wraps the TIF_TASK_ISOLATION flag test to simplify the caller code.
+ */
+extern void _task_isolation_signal(struct task_struct *task);
+#define task_isolation_signal(task)					\
+	do {								\
+		if (task_thread_info(task)->flags & _TIF_TASK_ISOLATION) \
+			_task_isolation_signal(task);			\
+	} while (0)
+
+/**
+ * task_isolation_user_exit() - debug all user_exit calls
+ *
+ * By default, we don't generate an exception in the low-level user_exit()
+ * code, because programs lose the ability to disable task isolation: the
+ * user_exit() hook will cause a signal prior to task_isolation_syscall()
+ * disabling task isolation.  In addition, it means that we lose all the
+ * diagnostic info otherwise available from task_isolation_interrupt() hooks
+ * later in the interrupt-handling process.  But you may enable it here for
+ * a special kernel build if you are having undiagnosed userspace jitter.
+ */
+static inline void task_isolation_user_exit(void)
+{
+#ifdef DEBUG_TASK_ISOLATION
+	task_isolation_interrupt("user_exit");
+#endif
+}
+
+#else /* !CONFIG_TASK_ISOLATION */
+static inline int task_isolation_request(unsigned int flags) { return -EINVAL; }
+static inline void task_isolation_start(void) { }
+static inline bool is_isolation_cpu(int cpu) { return 0; }
+static inline int task_isolation_on_cpu(int cpu) { return 0; }
+static inline void task_isolation_cpumask(struct cpumask *mask) { }
+static inline void task_isolation_clear_cpumask(struct cpumask *mask) { }
+static inline void task_isolation_cpu_cleanup(void) { }
+static inline void task_isolation_check_run_cleanup(void) { }
+static inline int task_isolation_syscall(int nr) { return 0; }
+static inline void task_isolation_interrupt(const char *fmt, ...) { }
+static inline void task_isolation_remote(int cpu, const char *fmt, ...) { }
+static inline void task_isolation_remote_cpumask(const struct cpumask *mask,
+						 const char *fmt, ...) { }
+static inline void task_isolation_signal(struct task_struct *task) { }
+static inline void task_isolation_user_exit(void) { }
+#endif
+
+#endif /* _LINUX_ISOLATION_H */
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 04278493bf15..52fdb32aa3b9 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1280,6 +1280,10 @@ struct task_struct {
 	unsigned long			lowest_stack;
 	unsigned long			prev_lowest_stack;
 #endif
+#ifdef CONFIG_TASK_ISOLATION
+	unsigned short			task_isolation_flags;  /* prctl */
+	unsigned short			task_isolation_state;
+#endif
 
 	/*
 	 * New fields for task_struct should be added above here, so that
diff --git a/include/linux/tick.h b/include/linux/tick.h
index 7340613c7eff..27c7c033d5a8 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -268,6 +268,9 @@ static inline void tick_dep_clear_signal(struct signal_struct *signal,
 extern void tick_nohz_full_kick_cpu(int cpu);
 extern void __tick_nohz_task_switch(void);
 extern void __init tick_nohz_full_setup(cpumask_var_t cpumask);
+#ifdef CONFIG_TASK_ISOLATION
+extern int try_stop_full_tick(void);
+#endif
 #else
 static inline bool tick_nohz_full_enabled(void) { return false; }
 static inline bool tick_nohz_full_cpu(int cpu) { return false; }
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 07b4f8131e36..f4848ed2a069 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -238,4 +238,10 @@ struct prctl_mm_map {
 #define PR_SET_IO_FLUSHER		57
 #define PR_GET_IO_FLUSHER		58
 
+/* Enable task_isolation mode for TASK_ISOLATION kernels. */
+#define PR_TASK_ISOLATION		48
+# define PR_TASK_ISOLATION_ENABLE	(1 << 0)
+# define PR_TASK_ISOLATION_SET_SIG(sig)	(((sig) & 0x7f) << 8)
+# define PR_TASK_ISOLATION_GET_SIG(bits) (((bits) >> 8) & 0x7f)
+
 #endif /* _LINUX_PRCTL_H */
diff --git a/init/Kconfig b/init/Kconfig
index 20a6ac33761c..ecdf567f6bd4 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -576,6 +576,34 @@ config CPU_ISOLATION
 
 source "kernel/rcu/Kconfig"
 
+config HAVE_ARCH_TASK_ISOLATION
+	bool
+
+config TASK_ISOLATION
+	bool "Provide hard CPU isolation from the kernel on demand"
+	depends on NO_HZ_FULL && HAVE_ARCH_TASK_ISOLATION
+	help
+
+	Allow userspace processes that place themselves on cores with
+	nohz_full and isolcpus enabled, and run prctl(PR_TASK_ISOLATION),
+	to "isolate" themselves from the kernel.  Prior to returning to
+	userspace, isolated tasks will arrange that no future kernel
+	activity will interrupt the task while the task is running in
+	userspace.  Attempting to re-enter the kernel while in this mode
+	will cause the task to be terminated with a signal; you must
+	explicitly use prctl() to disable task isolation before resuming
+	normal use of the kernel.
+
+	This "hard" isolation from the kernel is required for userspace
+	tasks running hard real-time workloads, such as a high-speed
+	network driver in userspace.  Without this option, but with
+	NO_HZ_FULL enabled, the kernel makes a best-effort, "soft"
+	attempt to shield a single userspace process from interrupts, but
+	makes no guarantees.
+
+	You should say "N" unless you are intending to run a
+	high-performance userspace driver or similar task.
+
 config BUILD_BIN2C
 	bool
 	default n
diff --git a/kernel/Makefile b/kernel/Makefile
index 4cb4130ced32..2f2ae91f90d5 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -122,6 +122,8 @@ obj-$(CONFIG_GCC_PLUGIN_STACKLEAK) += stackleak.o
 KASAN_SANITIZE_stackleak.o := n
 KCOV_INSTRUMENT_stackleak.o := n
 
+obj-$(CONFIG_TASK_ISOLATION) += isolation.o
+
 $(obj)/configs.o: $(obj)/config_data.gz
 
 targets += config_data.gz
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 0296b4bda8f1..e9206736f219 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -21,6 +21,7 @@
 #include <linux/hardirq.h>
 #include <linux/export.h>
 #include <linux/kprobes.h>
+#include <linux/isolation.h>
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/context_tracking.h>
@@ -157,6 +158,7 @@ void __context_tracking_exit(enum ctx_state state)
 			if (state == CONTEXT_USER) {
 				vtime_user_exit(current);
 				trace_user_exit(0);
+				task_isolation_user_exit();
 			}
 		}
 		__this_cpu_write(context_tracking.state, CONTEXT_KERNEL);
diff --git a/kernel/isolation.c b/kernel/isolation.c
new file mode 100644
index 000000000000..bec587cee284
--- /dev/null
+++ b/kernel/isolation.c
@@ -0,0 +1,831 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ *  linux/kernel/isolation.c
+ *
+ *  Implementation of task isolation.
+ *
+ * Authors:
+ *   Chris Metcalf <cmetcalf@mellanox.com>
+ *   Alex Belits <abelits@marvell.com>
+ *   Yuri Norov <ynorov@marvell.com>
+ */
+
+#include <linux/mm.h>
+#include <linux/swap.h>
+#include <linux/vmstat.h>
+#include <linux/sched.h>
+#include <linux/isolation.h>
+#include <linux/syscalls.h>
+#include <linux/smp.h>
+#include <linux/tick.h>
+#include <asm/unistd.h>
+#include <asm/syscall.h>
+#include <linux/hrtimer.h>
+
+/*
+ * These values are stored in task_isolation_state.
+ * Note that STATE_NORMAL + TIF_TASK_ISOLATION means we are still
+ * returning from sys_prctl() to userspace.
+ */
+enum {
+	STATE_NORMAL = 0,	/* Not isolated */
+	STATE_ISOLATED = 1	/* In userspace, isolated */
+};
+
+/*
+ * This variable contains thread flags copied at the moment
+ * when schedule() switched to the task on a given CPU,
+ * or 0 if no task is running.
+ */
+DEFINE_PER_CPU(unsigned long, tsk_thread_flags_cache);
+
+/*
+ * Counter for isolation state on a given CPU, incremented when entering
+ * isolation and decremented when exiting isolation (before or after the
+ * cleanup). Multiple simultaneously running procedures entering or
+ * exiting isolation are prevented by checking the result of
+ * incrementing or decrementing this variable. This variable is both
+ * incremented and decremented by the CPU that caused the isolation entry
+ * or exit.
+ *
+ * This is necessary because multiple isolation-breaking events may happen
+ * at once (or one as the result of the other), however isolation exit
+ * may only happen once to transition from isolated to non-isolated state.
+ * Therefore, if decrementing this counter results in a value less than 0,
+ * isolation exit procedure can't be started -- it already happened, or is
+ * in progress, or isolation is not entered yet.
+ */
+DEFINE_PER_CPU(atomic_t, isol_counter);
+
+/*
+ * Description of the last two tasks that ran isolated on a given CPU.
+ * This is intended only for messages about isolation breaking. We
+ * don't want any references to actual task while accessing this from
+ * CPU that caused isolation breaking -- we know nothing about timing
+ * and don't want to use locking or RCU.
+ */
+struct isol_task_desc {
+	atomic_t curr_index;
+	atomic_t curr_index_wr;
+	bool	warned[2];
+	pid_t	pid[2];
+	pid_t	tgid[2];
+	char	comm[2][TASK_COMM_LEN];
+};
+static DEFINE_PER_CPU(struct isol_task_desc, isol_task_descs);
+
+/*
+ * Counter for isolation exiting procedures (from request to the start of
+ * cleanup) being attempted at once on a CPU. Normally incrementing of
+ * this counter is performed from the CPU that caused isolation breaking,
+ * however decrementing is done from the cleanup procedure, delegated to
+ * the CPU that is exiting isolation, not from the CPU that caused isolation
+ * breaking.
+ *
+ * If incrementing this counter while starting isolation exit procedure
+ * results in a value greater than 0, isolation exiting is already in
+ * progress, and cleanup did not start yet. This means, counter should be
+ * decremented back, and isolation exit that is already in progress, should
+ * be allowed to complete. Otherwise, a new isolation exit procedure should
+ * be started.
+ */
+DEFINE_PER_CPU(atomic_t, isol_exit_counter);
+
+/*
+ * Descriptor for isolation-breaking SMP calls
+ */
+DEFINE_PER_CPU(call_single_data_t, isol_break_csd);
+
+cpumask_var_t task_isolation_map;
+cpumask_var_t task_isolation_cleanup_map;
+static DEFINE_SPINLOCK(task_isolation_cleanup_lock);
+
+/* We can run on cpus that are isolated from the scheduler and are nohz_full. */
+static int __init task_isolation_init(void)
+{
+	alloc_bootmem_cpumask_var(&task_isolation_cleanup_map);
+	if (alloc_cpumask_var(&task_isolation_map, GFP_KERNEL))
+		/*
+		 * At this point task isolation should match
+		 * nohz_full. This may change in the future.
+		 */
+		cpumask_copy(task_isolation_map, tick_nohz_full_mask);
+	return 0;
+}
+core_initcall(task_isolation_init);
+
+/* Enable stack backtraces of any interrupts of task_isolation cores. */
+static bool task_isolation_debug;
+static int __init task_isolation_debug_func(char *str)
+{
+	task_isolation_debug = true;
+	return 1;
+}
+__setup("task_isolation_debug", task_isolation_debug_func);
+
+/*
+ * Record name, pid and group pid of the task entering isolation on
+ * the current CPU.
+ */
+static void record_curr_isolated_task(void)
+{
+	int ind;
+	int cpu = smp_processor_id();
+	struct isol_task_desc *desc = &per_cpu(isol_task_descs, cpu);
+	struct task_struct *task = current;
+
+	/* Finish everything before recording current task */
+	smp_mb();
+	ind = atomic_inc_return(&desc->curr_index_wr) & 1;
+	desc->comm[ind][sizeof(task->comm) - 1] = '\0';
+	memcpy(desc->comm[ind], task->comm, sizeof(task->comm) - 1);
+	desc->pid[ind] = task->pid;
+	desc->tgid[ind] = task->tgid;
+	desc->warned[ind] = false;
+	/* Write everything, to be seen by other CPUs */
+	smp_mb();
+	atomic_inc(&desc->curr_index);
+	/* Everyone will see the new record from this point */
+	smp_mb();
+}
+
+/*
+ * Print message prefixed with the description of the current (or
+ * last) isolated task on a given CPU. Intended for isolation breaking
+ * messages that include target task for the user's convenience.
+ *
+ * Messages produced with this function may have obsolete task
+ * information if isolated tasks managed to exit, start and enter
+ * isolation multiple times, or multiple tasks tried to enter
+ * isolation on the same CPU at once. For those unusual cases it would
+ * contain a valid description of the cause for isolation breaking and
+ * target CPU number, just not the correct description of which task
+ * ended up losing isolation.
+ */
+int task_isolation_message(int cpu, int level, bool supp, const char *fmt, ...)
+{
+	struct isol_task_desc *desc;
+	struct task_struct *task;
+	va_list args;
+	char buf_prefix[TASK_COMM_LEN + 20 + 3 * 20];
+	char buf[200];
+	int curr_cpu, ind_counter, ind_counter_old, ind;
+
+	curr_cpu = get_cpu();
+	desc = &per_cpu(isol_task_descs, cpu);
+	ind_counter = atomic_read(&desc->curr_index);
+
+	if (curr_cpu == cpu) {
+		/*
+		 * Message is for the current CPU so current
+		 * task_struct should be used instead of cached
+		 * information.
+		 *
+		 * Like in other diagnostic messages, if issued from
+		 * interrupt context, current will be the interrupted
+		 * task. Unlike other diagnostic messages, this is
+		 * always relevant because the message is about
+		 * interrupting a task.
+		 */
+		ind = ind_counter & 1;
+		if (supp && desc->warned[ind]) {
+			/*
+			 * If supp is true, skip the message if the
+			 * same task was mentioned in a message that
+			 * originated on a remote CPU, and it has not
+			 * re-entered the isolated state since then
+			 * (warned is true). Only local messages that
+			 * follow remote messages, likely about the
+			 * same isolation-breaking event, are skipped
+			 * to avoid duplication. If a remote cause is
+			 * immediately followed by a local one before
+			 * isolation is broken, the local cause is
+			 * omitted from the messages.
+			 */
+			put_cpu();
+			return 0;
+		}
+		task = current;
+		snprintf(buf_prefix, sizeof(buf_prefix),
+			 "isolation %s/%d/%d (cpu %d)",
+			 task->comm, task->tgid, task->pid, cpu);
+		put_cpu();
+	} else {
+		/*
+		 * Message is for remote CPU, use cached information.
+		 */
+		put_cpu();
+		/*
+		 * Make sure the index remained unchanged while data was
+		 * copied. If it changed, data that was copied may be
+		 * inconsistent because two updates in a sequence could
+		 * overwrite the data while it was being read.
+		 */
+		do {
+			/* Make sure we are reading up to date values */
+			smp_mb();
+			ind = ind_counter & 1;
+			snprintf(buf_prefix, sizeof(buf_prefix),
+				 "isolation %s/%d/%d (cpu %d)",
+				 desc->comm[ind], desc->tgid[ind],
+				 desc->pid[ind], cpu);
+			desc->warned[ind] = true;
+			ind_counter_old = ind_counter;
+			/* Record the warned flag, then re-read descriptor */
+			smp_mb();
+			ind_counter = atomic_read(&desc->curr_index);
+			/*
+			 * If the counter changed, something was updated, so
+			 * repeat everything to get the current data
+			 */
+		} while (ind_counter != ind_counter_old);
+	}
+
+	va_start(args, fmt);
+	vsnprintf(buf, sizeof(buf), fmt, args);
+	va_end(args);
+
+	switch (level) {
+	case LOGLEVEL_EMERG:
+		pr_emerg("%s: %s", buf_prefix, buf);
+		break;
+	case LOGLEVEL_ALERT:
+		pr_alert("%s: %s", buf_prefix, buf);
+		break;
+	case LOGLEVEL_CRIT:
+		pr_crit("%s: %s", buf_prefix, buf);
+		break;
+	case LOGLEVEL_ERR:
+		pr_err("%s: %s", buf_prefix, buf);
+		break;
+	case LOGLEVEL_WARNING:
+		pr_warn("%s: %s", buf_prefix, buf);
+		break;
+	case LOGLEVEL_NOTICE:
+		pr_notice("%s: %s", buf_prefix, buf);
+		break;
+	case LOGLEVEL_INFO:
+		pr_info("%s: %s", buf_prefix, buf);
+		break;
+	case LOGLEVEL_DEBUG:
+		pr_debug("%s: %s", buf_prefix, buf);
+		break;
+	default:
+		/* No message without a valid level */
+		return 0;
+	}
+	return 1;
+}
+
+/*
+ * Dump stack if need be. This can be helpful even from the final exit
+ * to usermode code since stack traces sometimes carry information about
+ * what put you into the kernel, e.g. an interrupt number encoded in
+ * the initial entry stack frame that is still visible at exit time.
+ */
+static void debug_dump_stack(void)
+{
+	if (task_isolation_debug)
+		dump_stack();
+}
+
+/*
+ * Set the flags word but don't try to actually start task isolation yet.
+ * We will start it when entering user space in task_isolation_start().
+ */
+int task_isolation_request(unsigned int flags)
+{
+	struct task_struct *task = current;
+
+	/*
+	 * The task isolation flags should always be cleared just by
+	 * virtue of having entered the kernel.
+	 */
+	WARN_ON_ONCE(test_tsk_thread_flag(task, TIF_TASK_ISOLATION));
+	WARN_ON_ONCE(task->task_isolation_flags != 0);
+	WARN_ON_ONCE(task->task_isolation_state != STATE_NORMAL);
+
+	task->task_isolation_flags = flags;
+	if (!(task->task_isolation_flags & PR_TASK_ISOLATION_ENABLE))
+		return 0;
+
+	/* We are trying to enable task isolation. */
+	set_tsk_thread_flag(task, TIF_TASK_ISOLATION);
+
+	/*
+	 * Shut down the vmstat worker so we're not interrupted later.
+	 * We have to try to do this here (with interrupts enabled) since
+	 * we are canceling delayed work and will call flush_work()
+	 * (which enables interrupts) and possibly schedule().
+	 */
+	quiet_vmstat_sync();
+
+	/* We return 0 here but we may change that in task_isolation_start(). */
+	return 0;
+}
+
+/*
+ * Perform actions that should be done immediately on exit from isolation.
+ */
+static void fast_task_isolation_cpu_cleanup(void *info)
+{
+	unsigned long flags;
+
+	/*
+	 * This function runs on a CPU that ran an isolated task.
+	 *
+	 * We don't want this CPU running code from the rest of the kernel
+	 * until other CPUs know that it is no longer isolated.
+	 * While the CPU is running an isolated task, anything that causes
+	 * an interrupt on this CPU must end up calling this function
+	 * before touching the rest of the kernel: either an IPI to this
+	 * function, or stop_isolation() calling it. If any interrupt,
+	 * including the scheduling timer, arrives before a call to this
+	 * function, it will still end up here early after entering the
+	 * kernel. From this point interrupts are disabled until all CPUs
+	 * see that this CPU is no longer running an isolated task.
+	 */
+	local_irq_save(flags);
+	atomic_dec(&per_cpu(isol_exit_counter, smp_processor_id()));
+	smp_mb__after_atomic();
+	/*
+	 * At this point breaking isolation from other CPUs is possible again,
+	 * however interrupts won't arrive until local_irq_restore()
+	 */
+
+	/*
+	 * This task is no longer isolated (and if by any chance this
+	 * is the wrong task, it's already not isolated)
+	 */
+	current->task_isolation_flags = 0;
+	clear_tsk_thread_flag(current, TIF_TASK_ISOLATION);
+
+	/* Run the rest of cleanup later */
+	set_tsk_thread_flag(current, TIF_NOTIFY_RESUME);
+
+	/* Copy flags with task isolation disabled */
+	this_cpu_write(tsk_thread_flags_cache,
+		       READ_ONCE(task_thread_info(current)->flags));
+	/*
+	 * If something happened that requires a barrier that would
+	 * otherwise be called from remote CPUs by CPU kick procedure,
+	 * this barrier runs instead of it. After this barrier, CPU
+	 * kick procedure would see the updated tsk_thread_flags_cache
+	 * and run its own IPI to trigger a barrier.
+	 */
+	smp_mb();
+	/*
+	 * Synchronize instructions -- this CPU was not kicked while
+	 * in isolated mode, so it might require synchronization.
+	 * There might be an IPI if kick procedure happened and
+	 * tsk_thread_flags_cache was already updated while it
+	 * assembled a CPU mask. However if this did not happen,
+	 * synchronize everything here.
+	 */
+	imb();
+	local_irq_restore(flags);
+}
+
+/* Disable task isolation for the specified task. */
+static void stop_isolation(struct task_struct *p)
+{
+	int cpu, this_cpu;
+	unsigned long flags;
+
+	this_cpu = get_cpu();
+	cpu = task_cpu(p);
+	if (atomic_inc_return(&per_cpu(isol_exit_counter, cpu)) > 1) {
+		/* Already exiting isolation */
+		atomic_dec(&per_cpu(isol_exit_counter, cpu));
+		put_cpu();
+		return;
+	}
+
+	if (p == current) {
+		p->task_isolation_state = STATE_NORMAL;
+		fast_task_isolation_cpu_cleanup(NULL);
+		task_isolation_cpu_cleanup();
+		if (atomic_dec_return(&per_cpu(isol_counter, cpu)) < 0) {
+			/* Was not isolated; undo the decrement */
+			atomic_inc(&per_cpu(isol_counter, cpu));
+		}
+		put_cpu();
+	} else {
+		if (atomic_dec_return(&per_cpu(isol_counter, cpu)) < 0) {
+			/* Was not isolated; undo the decrement */
+			atomic_inc(&per_cpu(isol_counter, cpu));
+			atomic_dec(&per_cpu(isol_exit_counter, cpu));
+			put_cpu();
+			return;
+		}
+		/*
+		 * Schedule "slow" cleanup. This relies on
+		 * TIF_NOTIFY_RESUME being set
+		 */
+		spin_lock_irqsave(&task_isolation_cleanup_lock, flags);
+		cpumask_set_cpu(cpu, task_isolation_cleanup_map);
+		spin_unlock_irqrestore(&task_isolation_cleanup_lock, flags);
+		/*
+		 * Setting flags is delegated to the CPU where the
+		 * isolated task is running, and isol_exit_counter
+		 * will be decremented from there as well.
+		 */
+		per_cpu(isol_break_csd, cpu).func =
+		    fast_task_isolation_cpu_cleanup;
+		per_cpu(isol_break_csd, cpu).info = NULL;
+		per_cpu(isol_break_csd, cpu).flags = 0;
+		smp_call_function_single_async(cpu,
+					       &per_cpu(isol_break_csd, cpu));
+		put_cpu();
+	}
+}
+
+/*
+ * This code runs with interrupts disabled just before the return to
+ * userspace, after a prctl() has requested enabling task isolation.
+ * We take whatever steps are needed to avoid being interrupted later:
+ * drain the lru pages, stop the scheduler tick, etc.  More
+ * functionality may be added here later to avoid other types of
+ * interrupts from other kernel subsystems.
+ *
+ * If we can't enable task isolation, we update the syscall return
+ * value with an appropriate error.
+ */
+void task_isolation_start(void)
+{
+	int error;
+	unsigned long flags;
+
+	/*
+	 * We should only be called in STATE_NORMAL (isolation disabled),
+	 * on our way out of the kernel from the prctl() that turned it on.
+	 * If we are exiting from the kernel in another state, it means we
+	 * made it back into the kernel without disabling task isolation,
+	 * and we should investigate how (and in any case disable task
+	 * isolation at this point).  We are clearly not on the path back
+	 * from the prctl() so we don't touch the syscall return value.
+	 */
+	if (WARN_ON_ONCE(current->task_isolation_state != STATE_NORMAL)) {
+		/* Increment counter, this will allow isolation breaking */
+		if (atomic_inc_return(&per_cpu(isol_counter,
+					      smp_processor_id())) > 1) {
+			atomic_dec(&per_cpu(isol_counter, smp_processor_id()));
+		}
+		stop_isolation(current);
+		return;
+	}
+
+	/*
+	 * Must be affinitized to a single core with task isolation possible.
+	 * In principle this could be remotely modified between the prctl()
+	 * and the return to userspace, so we have to check it here.
+	 */
+	if (current->nr_cpus_allowed != 1 ||
+	    !is_isolation_cpu(smp_processor_id())) {
+		error = -EINVAL;
+		goto error;
+	}
+
+	/* If the vmstat delayed work is not canceled, we have to try again. */
+	if (!vmstat_idle()) {
+		error = -EAGAIN;
+		goto error;
+	}
+
+	/* Try to stop the dynamic tick. */
+	error = try_stop_full_tick();
+	if (error)
+		goto error;
+
+	/* Drain the pagevecs to avoid unnecessary IPI flushes later. */
+	lru_add_drain();
+
+	/*
+	 * Task is going to be marked as isolated. This disables IPIs
+	 * used for synchronization, so to avoid inconsistency
+	 * don't let anything interrupt us and issue a barrier at the end.
+	 */
+	local_irq_save(flags);
+
+	/* Increment counter, this will allow isolation breaking */
+	if (atomic_inc_return(&per_cpu(isol_counter,
+				      smp_processor_id())) > 1) {
+		atomic_dec(&per_cpu(isol_counter, smp_processor_id()));
+	}
+
+	/* Record isolated task IDs and name */
+	record_curr_isolated_task();
+
+	/* Copy flags with task isolation enabled */
+	this_cpu_write(tsk_thread_flags_cache,
+		       READ_ONCE(task_thread_info(current)->flags));
+	smp_wmb();
+	/* From this point this is recognized as isolated by other CPUs */
+	current->task_isolation_state = STATE_ISOLATED;
+	smp_mb();
+	local_irq_restore(flags);
+	/*
+	 * If anything interrupts us at this point, it will trigger
+	 * isolation breaking procedure.
+	 */
+	return;
+
+error:
+	/* Increment counter, this will allow isolation breaking */
+	if (atomic_inc_return(&per_cpu(isol_counter,
+				      smp_processor_id())) > 1) {
+		atomic_dec(&per_cpu(isol_counter, smp_processor_id()));
+	}
+	stop_isolation(current);
+	syscall_set_return_value(current, current_pt_regs(), error, 0);
+}
+
+/* Stop task isolation on the remote task and send it a signal. */
+static void send_isolation_signal(struct task_struct *task)
+{
+	int flags = task->task_isolation_flags;
+	kernel_siginfo_t info = {
+		.si_signo = PR_TASK_ISOLATION_GET_SIG(flags) ?: SIGKILL,
+	};
+
+	stop_isolation(task);
+	send_sig_info(info.si_signo, &info, task);
+}
+
+/* Only a few syscalls are valid once we are in task isolation mode. */
+static bool is_acceptable_syscall(int syscall)
+{
+	/* No need to incur an isolation signal if we are just exiting. */
+	if (syscall == __NR_exit || syscall == __NR_exit_group)
+		return true;
+
+	/* Check to see if it's the prctl for isolation. */
+	if (syscall == __NR_prctl) {
+		unsigned long arg[SYSCALL_MAX_ARGS];
+
+		syscall_get_arguments(current, current_pt_regs(), arg);
+		if (arg[0] == PR_TASK_ISOLATION)
+			return true;
+	}
+
+	return false;
+}
+
+/*
+ * This routine is called from syscall entry, prevents most syscalls
+ * from executing, and if needed raises a signal to notify the process.
+ *
+ * Note that we have to stop isolation before we even print a message
+ * here, since otherwise we might end up reporting an interrupt due to
+ * kicking the printk handling code, rather than reporting the true
+ * cause of interrupt here.
+ *
+ * The message is not suppressed by previous remotely triggered
+ * messages.
+ */
+int task_isolation_syscall(int syscall)
+{
+	struct task_struct *task = current;
+
+	if (is_acceptable_syscall(syscall)) {
+		stop_isolation(task);
+		return 0;
+	}
+
+	send_isolation_signal(task);
+
+	pr_task_isol_warn(smp_processor_id(),
+			  "task_isolation lost due to syscall %d\n",
+			  syscall);
+	debug_dump_stack();
+
+	syscall_set_return_value(task, current_pt_regs(), -ERESTARTNOINTR, -1);
+	return -1;
+}
+
+/*
+ * This routine is called from any exception or irq that doesn't
+ * otherwise trigger a signal to the user process (e.g. page fault).
+ *
+ * Messages will be suppressed if there is already a reported remote
+ * cause for isolation breaking, so we don't generate multiple
+ * confusingly similar messages about the same event.
+ */
+void _task_isolation_interrupt(const char *fmt, ...)
+{
+	struct task_struct *task = current;
+	va_list args;
+	char buf[100];
+
+	/* RCU should have been enabled prior to this point. */
+	RCU_LOCKDEP_WARN(!rcu_is_watching(), "kernel entry without RCU");
+
+	/* Are we exiting isolation already? */
+	if (atomic_read(&per_cpu(isol_exit_counter, smp_processor_id())) != 0) {
+		task->task_isolation_state = STATE_NORMAL;
+		return;
+	}
+	/*
+	 * Avoid reporting interrupts that happen after we have prctl'ed
+	 * to enable isolation, but before we have returned to userspace.
+	 */
+	if (task->task_isolation_state == STATE_NORMAL)
+		return;
+
+	va_start(args, fmt);
+	vsnprintf(buf, sizeof(buf), fmt, args);
+	va_end(args);
+
+	/* Handle NMIs minimally, since we can't send a signal. */
+	if (in_nmi()) {
+		pr_task_isol_err(smp_processor_id(),
+				 "isolation: in NMI; not delivering signal\n");
+	} else {
+		send_isolation_signal(task);
+	}
+
+	if (pr_task_isol_warn_supp(smp_processor_id(),
+				   "task_isolation lost due to %s\n", buf))
+		debug_dump_stack();
+}
+
+/*
+ * Called before we wake up a task that has a signal to process.
+ * Needs to be done to handle interrupts that trigger signals, which
+ * we don't catch with task_isolation_interrupt() hooks.
+ *
+ * This message is also suppressed if there was already a remotely
+ * caused message about the same isolation breaking event.
+ */
+void _task_isolation_signal(struct task_struct *task)
+{
+	struct isol_task_desc *desc;
+	int ind, cpu;
+	bool do_warn = (task->task_isolation_state == STATE_ISOLATED);
+
+	cpu = task_cpu(task);
+	desc = &per_cpu(isol_task_descs, cpu);
+	ind = atomic_read(&desc->curr_index) & 1;
+	if (desc->warned[ind])
+		do_warn = false;
+
+	stop_isolation(task);
+
+	if (do_warn) {
+		pr_warn("isolation: %s/%d/%d (cpu %d): task_isolation lost due to signal\n",
+			task->comm, task->tgid, task->pid, cpu);
+		debug_dump_stack();
+	}
+}
+
+/*
+ * Generate a stack backtrace if we are going to interrupt another task
+ * isolation process.
+ */
+void task_isolation_remote(int cpu, const char *fmt, ...)
+{
+	struct task_struct *curr_task;
+	va_list args;
+	char buf[200];
+
+	smp_rmb();
+	if (!is_isolation_cpu(cpu) || !task_isolation_on_cpu(cpu))
+		return;
+
+	curr_task = current;
+
+	va_start(args, fmt);
+	vsnprintf(buf, sizeof(buf), fmt, args);
+	va_end(args);
+	if (pr_task_isol_warn(cpu,
+			      "task_isolation lost due to %s by %s/%d/%d on cpu %d\n",
+			      buf,
+			      curr_task->comm, curr_task->tgid,
+			      curr_task->pid, smp_processor_id()))
+		debug_dump_stack();
+}
+
+/*
+ * Generate a stack backtrace if any of the cpus in "mask" are running
+ * task isolation processes.
+ */
+void task_isolation_remote_cpumask(const struct cpumask *mask,
+				   const char *fmt, ...)
+{
+	struct task_struct *curr_task;
+	cpumask_var_t warn_mask;
+	va_list args;
+	char buf[200];
+	int cpu, first_cpu;
+
+	if (task_isolation_map == NULL ||
+		!zalloc_cpumask_var(&warn_mask, GFP_KERNEL))
+		return;
+
+	first_cpu = -1;
+	smp_rmb();
+	for_each_cpu_and(cpu, mask, task_isolation_map) {
+		if (task_isolation_on_cpu(cpu)) {
+			if (first_cpu < 0)
+				first_cpu = cpu;
+			else
+				cpumask_set_cpu(cpu, warn_mask);
+		}
+	}
+
+	if (first_cpu < 0)
+		goto done;
+
+	curr_task = current;
+
+	va_start(args, fmt);
+	vsnprintf(buf, sizeof(buf), fmt, args);
+	va_end(args);
+
+	if (cpumask_weight(warn_mask) == 0)
+		pr_task_isol_warn(first_cpu,
+				  "task_isolation lost due to %s by %s/%d/%d on cpu %d\n",
+				  buf, curr_task->comm, curr_task->tgid,
+				  curr_task->pid, smp_processor_id());
+	else
+		pr_task_isol_warn(first_cpu,
+				  " and cpus %*pbl: task_isolation lost due to %s by %s/%d/%d on cpu %d\n",
+				  cpumask_pr_args(warn_mask),
+				  buf, curr_task->comm, curr_task->tgid,
+				  curr_task->pid, smp_processor_id());
+	debug_dump_stack();
+
+done:
+	free_cpumask_var(warn_mask);
+}
+
+/*
+ * Check if a given CPU is running an isolated task.
+ */
+int task_isolation_on_cpu(int cpu)
+{
+	return test_bit(TIF_TASK_ISOLATION,
+			&per_cpu(tsk_thread_flags_cache, cpu));
+}
+
+/*
+ * Set CPUs currently running isolated tasks in CPU mask.
+ */
+void task_isolation_cpumask(struct cpumask *mask)
+{
+	int cpu;
+
+	if (task_isolation_map == NULL)
+		return;
+
+	smp_rmb();
+	for_each_cpu(cpu, task_isolation_map)
+		if (task_isolation_on_cpu(cpu))
+			cpumask_set_cpu(cpu, mask);
+}
+
+/*
+ * Clear CPUs currently running isolated tasks in CPU mask.
+ */
+void task_isolation_clear_cpumask(struct cpumask *mask)
+{
+	int cpu;
+
+	if (task_isolation_map == NULL)
+		return;
+
+	smp_rmb();
+	for_each_cpu(cpu, task_isolation_map)
+		if (task_isolation_on_cpu(cpu))
+			cpumask_clear_cpu(cpu, mask);
+}
+
+/*
+ * Cleanup procedure. The call to this procedure may be delayed.
+ */
+void task_isolation_cpu_cleanup(void)
+{
+	kick_hrtimer();
+}
+
+/*
+ * Check if cleanup is scheduled on the current CPU, and if so, run it.
+ * Intended to be called from notify_resume() or another such callback
+ * on the target CPU.
+ */
+void task_isolation_check_run_cleanup(void)
+{
+	int cpu;
+	unsigned long flags;
+
+	spin_lock_irqsave(&task_isolation_cleanup_lock, flags);
+
+	cpu = smp_processor_id();
+
+	if (cpumask_test_cpu(cpu, task_isolation_cleanup_map)) {
+		cpumask_clear_cpu(cpu, task_isolation_cleanup_map);
+		spin_unlock_irqrestore(&task_isolation_cleanup_lock, flags);
+		task_isolation_cpu_cleanup();
+	} else
+		spin_unlock_irqrestore(&task_isolation_cleanup_lock, flags);
+}
diff --git a/kernel/signal.c b/kernel/signal.c
index 5b2396350dd1..1df57e38c361 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -46,6 +46,7 @@
 #include <linux/livepatch.h>
 #include <linux/cgroup.h>
 #include <linux/audit.h>
+#include <linux/isolation.h>
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/signal.h>
@@ -758,6 +759,7 @@ static int dequeue_synchronous_signal(kernel_siginfo_t *info)
  */
 void signal_wake_up_state(struct task_struct *t, unsigned int state)
 {
+	task_isolation_signal(t);
 	set_tsk_thread_flag(t, TIF_SIGPENDING);
 	/*
 	 * TASK_WAKEKILL also means wake it up in the stopped/traced/killable
diff --git a/kernel/sys.c b/kernel/sys.c
index f9bc5c303e3f..0a4059a8c4f9 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -42,6 +42,7 @@
 #include <linux/syscore_ops.h>
 #include <linux/version.h>
 #include <linux/ctype.h>
+#include <linux/isolation.h>
 
 #include <linux/compat.h>
 #include <linux/syscalls.h>
@@ -2513,6 +2514,11 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 
 		error = (current->flags & PR_IO_FLUSHER) == PR_IO_FLUSHER;
 		break;
+	case PR_TASK_ISOLATION:
+		if (arg3 || arg4 || arg5)
+			return -EINVAL;
+		error = task_isolation_request(arg2);
+		break;
 	default:
 		error = -EINVAL;
 		break;
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 3a609e7344f3..5bb98f39bde6 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -30,6 +30,7 @@
 #include <linux/syscalls.h>
 #include <linux/interrupt.h>
 #include <linux/tick.h>
+#include <linux/isolation.h>
 #include <linux/err.h>
 #include <linux/debugobjects.h>
 #include <linux/sched/signal.h>
@@ -721,6 +722,19 @@ static void retrigger_next_event(void *arg)
 	raw_spin_unlock(&base->lock);
 }
 
+#ifdef CONFIG_TASK_ISOLATION
+void kick_hrtimer(void)
+{
+	unsigned long flags;
+
+	preempt_disable();
+	local_irq_save(flags);
+	retrigger_next_event(NULL);
+	local_irq_restore(flags);
+	preempt_enable();
+}
+#endif
+
 /*
  * Switch to high resolution mode
  */
@@ -868,8 +882,21 @@ static void hrtimer_reprogram(struct hrtimer *timer, bool reprogram)
 void clock_was_set(void)
 {
 #ifdef CONFIG_HIGH_RES_TIMERS
+#ifdef CONFIG_TASK_ISOLATION
+	struct cpumask mask;
+
+	cpumask_clear(&mask);
+	task_isolation_cpumask(&mask);
+	cpumask_complement(&mask, &mask);
+	/*
+	 * Retrigger the CPU local events everywhere except CPUs
+	 * running isolated tasks.
+	 */
+	on_each_cpu_mask(&mask, retrigger_next_event, NULL, 1);
+#else
 	/* Retrigger the CPU local events everywhere */
 	on_each_cpu(retrigger_next_event, NULL, 1);
+#endif
 #endif
 	timerfd_clock_was_set();
 }
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index a792d21cac64..1d4dec9d3ee7 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -882,6 +882,24 @@ static void tick_nohz_full_update_tick(struct tick_sched *ts)
 #endif
 }
 
+#ifdef CONFIG_TASK_ISOLATION
+int try_stop_full_tick(void)
+{
+	int cpu = smp_processor_id();
+	struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched);
+
+	/* For an unstable clock, we should return a permanent error code. */
+	if (atomic_read(&tick_dep_mask) & TICK_DEP_MASK_CLOCK_UNSTABLE)
+		return -EINVAL;
+
+	if (!can_stop_full_tick(cpu, ts))
+		return -EAGAIN;
+
+	tick_nohz_stop_sched_tick(ts, cpu);
+	return 0;
+}
+#endif
+
 static bool can_stop_idle_tick(int cpu, struct tick_sched *ts)
 {
 	/*
-- 
2.20.1
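
For readers following along, here is a minimal sketch (not part of the
patch) of how a userspace task might use the prctl() interface added in
include/uapi/linux/prctl.h above.  The CPU number and the use of SIGUSR1
are illustrative assumptions only, and error handling is reduced to the
essentials:

	#define _GNU_SOURCE
	#include <sched.h>
	#include <signal.h>
	#include <stdio.h>
	#include <sys/prctl.h>
	#include <unistd.h>

	#ifndef PR_TASK_ISOLATION
	#define PR_TASK_ISOLATION		48
	#define PR_TASK_ISOLATION_ENABLE	(1 << 0)
	#define PR_TASK_ISOLATION_SET_SIG(sig)	(((sig) & 0x7f) << 8)
	#endif

	static volatile sig_atomic_t lost_isolation;

	static void isolation_lost(int sig)
	{
		(void)sig;		/* unused */
		lost_isolation = 1;	/* the kernel broke our isolation */
	}

	int main(void)
	{
		struct sigaction sa = { .sa_handler = isolation_lost };
		cpu_set_t set;

		/* Pin to a single nohz_full/isolcpus core; CPU 2 is illustrative. */
		CPU_ZERO(&set);
		CPU_SET(2, &set);
		if (sched_setaffinity(0, sizeof(set), &set)) {
			perror("sched_setaffinity");
			return 1;
		}

		/* Ask for SIGUSR1 on violations instead of the default SIGKILL. */
		sigaction(SIGUSR1, &sa, NULL);
		if (prctl(PR_TASK_ISOLATION,
			  PR_TASK_ISOLATION_ENABLE | PR_TASK_ISOLATION_SET_SIG(SIGUSR1),
			  0, 0, 0)) {
			perror("prctl(PR_TASK_ISOLATION)");
			return 1;
		}

		/*
		 * Interrupt-free, syscall-free work runs here.  Any kernel
		 * entry other than exit()/exit_group() or another
		 * prctl(PR_TASK_ISOLATION) breaks isolation and delivers
		 * the configured signal.
		 */

		/* Explicitly leave isolation before using the kernel again. */
		prctl(PR_TASK_ISOLATION, 0, 0, 0, 0);
		printf("isolation %s\n", lost_isolation ? "was broken" : "exited cleanly");
		return 0;
	}

Without PR_TASK_ISOLATION_SET_SIG(), an isolation violation terminates
the task with SIGKILL, as described in the Kconfig help text above.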


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH 05/13] task_isolation: Add task isolation hooks to arch-independent code
  2020-04-09 15:09   ` [PATCH v3 00/13] "Task_isolation" mode Alex Belits
                       ` (3 preceding siblings ...)
  2020-04-09 15:20     ` [PATCH v3 04/13] task_isolation: userspace hard isolation from kernel Alex Belits
@ 2020-04-09 15:21     ` Alex Belits
  2020-04-09 15:22     ` [PATCH 06/13] task_isolation: arch/x86: enable task isolation functionality Alex Belits
                       ` (7 subsequent siblings)
  12 siblings, 0 replies; 71+ messages in thread
From: Alex Belits @ 2020-04-09 15:21 UTC (permalink / raw)
  To: frederic, rostedt
  Cc: Prasun Kapoor, mingo, davem, linux-api, peterz, linux-arch,
	catalin.marinas, tglx, will, linux-arm-kernel, linux-kernel,
	netdev

From: Chris Metcalf <cmetcalf@mellanox.com>

This commit adds task isolation hooks as follows:

- __handle_domain_irq() generates an isolation warning for the
  local task

- irq_work_queue_on() generates an isolation warning for the remote
  task being interrupted for irq_work

- generic_exec_single() generates a remote isolation warning for
  the remote cpu being IPI'd

- smp_call_function_many() generates a remote isolation warning for
  the set of remote cpus being IPI'd

Calls to task_isolation_remote() or task_isolation_interrupt() can
be placed in the platform-independent code like this when doing so
results in fewer lines of code changes, as for example is true of
the users of the arch_send_call_function_*() APIs. Or, they can be
placed in the per-architecture code when there are many callers,
as for example is true of the smp_send_reschedule() call.

A further cleanup might be to create an intermediate layer, so that
for example smp_send_reschedule() is a single generic function that
just calls arch_smp_send_reschedule(), allowing generic code to be
called every time smp_send_reschedule() is invoked. But for now, we
just update either callers or callees as makes most sense.
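
To make that concrete, the intermediate layer could look roughly like the
sketch below.  This is hypothetical and not part of this series;
arch_smp_send_reschedule() does not exist yet and is named here only for
illustration:

	/* Hypothetical sketch of the intermediate layer described above. */
	void smp_send_reschedule(int cpu)
	{
		/* Generic hook: warn if an isolated task is about to be disturbed. */
		task_isolation_remote(cpu, "reschedule IPI");
		arch_smp_send_reschedule(cpu);	/* per-architecture IPI delivery */
	}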

Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com>
[abelits@marvell.com: adapted for kernel 5.6]
Signed-off-by: Alex Belits <abelits@marvell.com>
---
 kernel/irq/irqdesc.c | 9 +++++++++
 kernel/irq_work.c    | 5 ++++-
 kernel/smp.c         | 6 +++++-
 3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
index 98a5f10d1900..e2b81d035fa1 100644
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -16,6 +16,7 @@
 #include <linux/bitmap.h>
 #include <linux/irqdomain.h>
 #include <linux/sysfs.h>
+#include <linux/isolation.h>
 
 #include "internals.h"
 
@@ -670,6 +671,10 @@ int __handle_domain_irq(struct irq_domain *domain, unsigned int hwirq,
 		irq = irq_find_mapping(domain, hwirq);
 #endif
 
+	task_isolation_interrupt((irq == hwirq) ?
+				 "irq %d (%s)" : "irq %d (%s hwirq %d)",
+				 irq, domain ? domain->name : "", hwirq);
+
 	/*
 	 * Some hardware gives randomly wrong interrupts.  Rather
 	 * than crashing, do something sensible.
@@ -711,6 +716,10 @@ int handle_domain_nmi(struct irq_domain *domain, unsigned int hwirq,
 
 	irq = irq_find_mapping(domain, hwirq);
 
+	task_isolation_interrupt((irq == hwirq) ?
+				 "NMI irq %d (%s)" : "NMI irq %d (%s hwirq %d)",
+				 irq, domain ? domain->name : "", hwirq);
+
 	/*
 	 * ack_bad_irq is not NMI-safe, just report
 	 * an invalid interrupt.
diff --git a/kernel/irq_work.c b/kernel/irq_work.c
index 828cc30774bc..8fd4ece43dd8 100644
--- a/kernel/irq_work.c
+++ b/kernel/irq_work.c
@@ -18,6 +18,7 @@
 #include <linux/cpu.h>
 #include <linux/notifier.h>
 #include <linux/smp.h>
+#include <linux/isolation.h>
 #include <asm/processor.h>
 
 
@@ -102,8 +103,10 @@ bool irq_work_queue_on(struct irq_work *work, int cpu)
 	if (cpu != smp_processor_id()) {
 		/* Arch remote IPI send/receive backend aren't NMI safe */
 		WARN_ON_ONCE(in_nmi());
-		if (llist_add(&work->llnode, &per_cpu(raised_list, cpu)))
+		if (llist_add(&work->llnode, &per_cpu(raised_list, cpu))) {
+			task_isolation_remote(cpu, "irq_work");
 			arch_send_call_function_single_ipi(cpu);
+		}
 	} else {
 		__irq_work_queue_local(work);
 	}
diff --git a/kernel/smp.c b/kernel/smp.c
index d0ada39eb4d4..3a8bcbdd4ce6 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -20,6 +20,7 @@
 #include <linux/sched.h>
 #include <linux/sched/idle.h>
 #include <linux/hypervisor.h>
+#include <linux/isolation.h>
 
 #include "smpboot.h"
 
@@ -176,8 +177,10 @@ static int generic_exec_single(int cpu, call_single_data_t *csd,
 	 * locking and barrier primitives. Generic code isn't really
 	 * equipped to do the right thing...
 	 */
-	if (llist_add(&csd->llist, &per_cpu(call_single_queue, cpu)))
+	if (llist_add(&csd->llist, &per_cpu(call_single_queue, cpu))) {
+		task_isolation_remote(cpu, "IPI function");
 		arch_send_call_function_single_ipi(cpu);
+	}
 
 	return 0;
 }
@@ -466,6 +469,7 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 	}
 
 	/* Send a message to all CPUs in the map */
+	task_isolation_remote_cpumask(cfd->cpumask_ipi, "IPI function");
 	arch_send_call_function_ipi_mask(cfd->cpumask_ipi);
 
 	if (wait) {
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH 06/13] task_isolation: arch/x86: enable task isolation functionality
  2020-04-09 15:09   ` [PATCH v3 00/13] "Task_isolation" mode Alex Belits
                       ` (4 preceding siblings ...)
  2020-04-09 15:21     ` [PATCH 05/13] task_isolation: Add task isolation hooks to arch-independent code Alex Belits
@ 2020-04-09 15:22     ` Alex Belits
  2020-04-09 15:23     ` [PATCH v3 07/13] task_isolation: arch/arm64: " Alex Belits
                       ` (6 subsequent siblings)
  12 siblings, 0 replies; 71+ messages in thread
From: Alex Belits @ 2020-04-09 15:22 UTC (permalink / raw)
  To: frederic, rostedt
  Cc: Prasun Kapoor, mingo, davem, linux-api, peterz, linux-arch,
	catalin.marinas, tglx, will, linux-arm-kernel, linux-kernel,
	netdev

From: Chris Metcalf <cmetcalf@mellanox.com>

In prepare_exit_to_usermode(), run cleanup for tasks that have exited
isolation and call task_isolation_start() for tasks that have
TIF_TASK_ISOLATION set.

In syscall_trace_enter(), add the necessary support for reporting
syscalls for task-isolation processes.

Add task_isolation_interrupt() calls for the kernel exception types
that do not result in signals, namely non-signalling page faults.
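
The overall shape of the exit-to-user wiring is the same on every
architecture in this series; a condensed, hypothetical sketch (not the
literal x86 code below) is:

	/* Condensed sketch of the common exit-to-user pattern. */
	static void isolation_exit_to_usermode_hooks(struct thread_info *ti)
	{
		/* First, run any cleanup deferred to the CPU that left isolation. */
		task_isolation_check_run_cleanup();

		/* Then arm isolation for a task that requested it via prctl(). */
		if (READ_ONCE(ti->flags) & _TIF_TASK_ISOLATION)
			task_isolation_start();
	}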

Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com>
[abelits@marvell.com: adapted for kernel 5.6]
Signed-off-by: Alex Belits <abelits@marvell.com>
---
 arch/x86/Kconfig                   |  1 +
 arch/x86/entry/common.c            | 15 +++++++++++++++
 arch/x86/include/asm/apic.h        |  3 +++
 arch/x86/include/asm/thread_info.h |  4 +++-
 arch/x86/kernel/apic/ipi.c         |  2 ++
 arch/x86/mm/fault.c                |  4 ++++
 6 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index beea77046f9b..9ea6d3e6e77d 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -144,6 +144,7 @@ config X86
 	select HAVE_ARCH_COMPAT_MMAP_BASES	if MMU && COMPAT
 	select HAVE_ARCH_PREL32_RELOCATIONS
 	select HAVE_ARCH_SECCOMP_FILTER
+	select HAVE_ARCH_TASK_ISOLATION
 	select HAVE_ARCH_THREAD_STRUCT_WHITELIST
 	select HAVE_ARCH_STACKLEAK
 	select HAVE_ARCH_TRACEHOOK
diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 9747876980b5..ba8cd75dc7cf 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -26,6 +26,7 @@
 #include <linux/livepatch.h>
 #include <linux/syscalls.h>
 #include <linux/uaccess.h>
+#include <linux/isolation.h>
 
 #include <asm/desc.h>
 #include <asm/traps.h>
@@ -86,6 +87,15 @@ static long syscall_trace_enter(struct pt_regs *regs)
 			return -1L;
 	}
 
+	/*
+	 * In task isolation mode, we may prevent the syscall from
+	 * running, and if so we also deliver a signal to the process.
+	 */
+	if (work & _TIF_TASK_ISOLATION) {
+		if (task_isolation_syscall(regs->orig_ax) == -1)
+			return -1L;
+		work &= ~_TIF_TASK_ISOLATION;
+	}
 #ifdef CONFIG_SECCOMP
 	/*
 	 * Do seccomp after ptrace, to catch any tracer changes.
@@ -189,6 +199,8 @@ __visible inline void prepare_exit_to_usermode(struct pt_regs *regs)
 	lockdep_assert_irqs_disabled();
 	lockdep_sys_exit();
 
+	task_isolation_check_run_cleanup();
+
 	cached_flags = READ_ONCE(ti->flags);
 
 	if (unlikely(cached_flags & EXIT_TO_USERMODE_LOOP_FLAGS))
@@ -204,6 +216,9 @@ __visible inline void prepare_exit_to_usermode(struct pt_regs *regs)
 	if (unlikely(cached_flags & _TIF_NEED_FPU_LOAD))
 		switch_fpu_return();
 
+	if (cached_flags & _TIF_TASK_ISOLATION)
+		task_isolation_start();
+
 #ifdef CONFIG_COMPAT
 	/*
 	 * Compat syscalls set TS_COMPAT.  Make sure we clear it before
diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index 19e94af9cc5d..71149abbb0a0 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -3,6 +3,7 @@
 #define _ASM_X86_APIC_H
 
 #include <linux/cpumask.h>
+#include <linux/isolation.h>
 
 #include <asm/alternative.h>
 #include <asm/cpufeature.h>
@@ -524,6 +525,7 @@ extern void irq_exit(void);
 
 static inline void entering_irq(void)
 {
+	task_isolation_interrupt("irq");
 	irq_enter();
 	kvm_set_cpu_l1tf_flush_l1d();
 }
@@ -536,6 +538,7 @@ static inline void entering_ack_irq(void)
 
 static inline void ipi_entering_ack_irq(void)
 {
+	task_isolation_interrupt("ack irq");
 	irq_enter();
 	ack_APIC_irq();
 	kvm_set_cpu_l1tf_flush_l1d();
diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index cf4327986e98..60d107f784ee 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -92,6 +92,7 @@ struct thread_info {
 #define TIF_NOCPUID		15	/* CPUID is not accessible in userland */
 #define TIF_NOTSC		16	/* TSC is not accessible in userland */
 #define TIF_IA32		17	/* IA32 compatibility process */
+#define TIF_TASK_ISOLATION	18	/* task isolation enabled for task */
 #define TIF_NOHZ		19	/* in adaptive nohz mode */
 #define TIF_MEMDIE		20	/* is terminating due to OOM killer */
 #define TIF_POLLING_NRFLAG	21	/* idle is polling for TIF_NEED_RESCHED */
@@ -122,6 +123,7 @@ struct thread_info {
 #define _TIF_NOCPUID		(1 << TIF_NOCPUID)
 #define _TIF_NOTSC		(1 << TIF_NOTSC)
 #define _TIF_IA32		(1 << TIF_IA32)
+#define _TIF_TASK_ISOLATION	(1 << TIF_TASK_ISOLATION)
 #define _TIF_NOHZ		(1 << TIF_NOHZ)
 #define _TIF_POLLING_NRFLAG	(1 << TIF_POLLING_NRFLAG)
 #define _TIF_IO_BITMAP		(1 << TIF_IO_BITMAP)
@@ -140,7 +142,7 @@ struct thread_info {
 #define _TIF_WORK_SYSCALL_ENTRY	\
 	(_TIF_SYSCALL_TRACE | _TIF_SYSCALL_EMU | _TIF_SYSCALL_AUDIT |	\
 	 _TIF_SECCOMP | _TIF_SYSCALL_TRACEPOINT |	\
-	 _TIF_NOHZ)
+	 _TIF_NOHZ | _TIF_TASK_ISOLATION)
 
 /* flags to check in __switch_to() */
 #define _TIF_WORK_CTXSW_BASE					\
diff --git a/arch/x86/kernel/apic/ipi.c b/arch/x86/kernel/apic/ipi.c
index 6ca0f91372fd..b4dfaad6a440 100644
--- a/arch/x86/kernel/apic/ipi.c
+++ b/arch/x86/kernel/apic/ipi.c
@@ -2,6 +2,7 @@
 
 #include <linux/cpumask.h>
 #include <linux/smp.h>
+#include <linux/isolation.h>
 
 #include "local.h"
 
@@ -67,6 +68,7 @@ void native_smp_send_reschedule(int cpu)
 		WARN(1, "sched: Unexpected reschedule of offline CPU#%d!\n", cpu);
 		return;
 	}
+	task_isolation_remote(cpu, "reschedule IPI");
 	apic->send_IPI(cpu, RESCHEDULE_VECTOR);
 }
 
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index fa4ea09593ab..2175a8655a7d 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -18,6 +18,7 @@
 #include <linux/uaccess.h>		/* faulthandler_disabled()	*/
 #include <linux/efi.h>			/* efi_recover_from_page_fault()*/
 #include <linux/mm_types.h>
+#include <linux/isolation.h>		/* task_isolation_interrupt     */
 
 #include <asm/cpufeature.h>		/* boot_cpu_has, ...		*/
 #include <asm/traps.h>			/* dotraplinkage, ...		*/
@@ -1483,6 +1484,9 @@ void do_user_addr_fault(struct pt_regs *regs,
 		perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, regs, address);
 	}
 
+	/* No signal was generated, but notify task-isolation tasks. */
+	task_isolation_interrupt("page fault at %#lx", address);
+
 	check_v8086_mode(regs, address, tsk);
 }
 NOKPROBE_SYMBOL(do_user_addr_fault);
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v3 07/13] task_isolation: arch/arm64: enable task isolation functionality
  2020-04-09 15:09   ` [PATCH v3 00/13] "Task_isolation" mode Alex Belits
                       ` (5 preceding siblings ...)
  2020-04-09 15:22     ` [PATCH 06/13] task_isolation: arch/x86: enable task isolation functionality Alex Belits
@ 2020-04-09 15:23     ` Alex Belits
  2020-04-22 12:08       ` Catalin Marinas
  2020-04-09 15:24     ` [PATCH v3 08/13] task_isolation: arch/arm: " Alex Belits
                       ` (5 subsequent siblings)
  12 siblings, 1 reply; 71+ messages in thread
From: Alex Belits @ 2020-04-09 15:23 UTC (permalink / raw)
  To: frederic, rostedt
  Cc: Prasun Kapoor, mingo, davem, linux-api, peterz, linux-arch,
	catalin.marinas, tglx, will, linux-arm-kernel, linux-kernel,
	netdev

From: Chris Metcalf <cmetcalf@mellanox.com>

In do_notify_resume(), call task_isolation_start() for
TIF_TASK_ISOLATION tasks. Add _TIF_TASK_ISOLATION to _TIF_WORK_MASK,
and define a local NOTIFY_RESUME_LOOP_FLAGS to check in the loop,
since we don't clear _TIF_TASK_ISOLATION in the loop.

We instrument the smp_send_reschedule() routine so that it checks for
isolated tasks and generates a suitable warning if needed.

Finally, report on page faults in task-isolation processes in
do_page_fault().

Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com>
[abelits@marvell.com: simplified to match kernel 5.6]
Signed-off-by: Alex Belits <abelits@marvell.com>
---
 arch/arm64/Kconfig                   |  1 +
 arch/arm64/include/asm/thread_info.h |  5 ++++-
 arch/arm64/kernel/ptrace.c           | 10 ++++++++++
 arch/arm64/kernel/signal.c           | 13 ++++++++++++-
 arch/arm64/kernel/smp.c              |  7 +++++++
 arch/arm64/mm/fault.c                |  5 +++++
 6 files changed, 39 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 0b30e884e088..93b6aabc8be9 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -129,6 +129,7 @@ config ARM64
 	select HAVE_ARCH_PREL32_RELOCATIONS
 	select HAVE_ARCH_SECCOMP_FILTER
 	select HAVE_ARCH_STACKLEAK
+	select HAVE_ARCH_TASK_ISOLATION
 	select HAVE_ARCH_THREAD_STRUCT_WHITELIST
 	select HAVE_ARCH_TRACEHOOK
 	select HAVE_ARCH_TRANSPARENT_HUGEPAGE
diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index f0cec4160136..7563098eb5b2 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -63,6 +63,7 @@ void arch_release_task_struct(struct task_struct *tsk);
 #define TIF_FOREIGN_FPSTATE	3	/* CPU's FP state is not current's */
 #define TIF_UPROBE		4	/* uprobe breakpoint or singlestep */
 #define TIF_FSCHECK		5	/* Check FS is USER_DS on return */
+#define TIF_TASK_ISOLATION	6
 #define TIF_NOHZ		7
 #define TIF_SYSCALL_TRACE	8	/* syscall trace active */
 #define TIF_SYSCALL_AUDIT	9	/* syscall auditing */
@@ -83,6 +84,7 @@ void arch_release_task_struct(struct task_struct *tsk);
 #define _TIF_NEED_RESCHED	(1 << TIF_NEED_RESCHED)
 #define _TIF_NOTIFY_RESUME	(1 << TIF_NOTIFY_RESUME)
 #define _TIF_FOREIGN_FPSTATE	(1 << TIF_FOREIGN_FPSTATE)
+#define _TIF_TASK_ISOLATION	(1 << TIF_TASK_ISOLATION)
 #define _TIF_NOHZ		(1 << TIF_NOHZ)
 #define _TIF_SYSCALL_TRACE	(1 << TIF_SYSCALL_TRACE)
 #define _TIF_SYSCALL_AUDIT	(1 << TIF_SYSCALL_AUDIT)
@@ -96,7 +98,8 @@ void arch_release_task_struct(struct task_struct *tsk);
 
 #define _TIF_WORK_MASK		(_TIF_NEED_RESCHED | _TIF_SIGPENDING | \
 				 _TIF_NOTIFY_RESUME | _TIF_FOREIGN_FPSTATE | \
-				 _TIF_UPROBE | _TIF_FSCHECK)
+				 _TIF_UPROBE | _TIF_FSCHECK | \
+				 _TIF_TASK_ISOLATION)
 
 #define _TIF_SYSCALL_WORK	(_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \
 				 _TIF_SYSCALL_TRACEPOINT | _TIF_SECCOMP | \
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index cd6e5fa48b9c..b35b9b0c594c 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -29,6 +29,7 @@
 #include <linux/regset.h>
 #include <linux/tracehook.h>
 #include <linux/elf.h>
+#include <linux/isolation.h>
 
 #include <asm/compat.h>
 #include <asm/cpufeature.h>
@@ -1836,6 +1837,15 @@ int syscall_trace_enter(struct pt_regs *regs)
 			return -1;
 	}
 
+	/*
+	 * In task isolation mode, we may prevent the syscall from
+	 * running, and if so we also deliver a signal to the process.
+	 */
+	if (test_thread_flag(TIF_TASK_ISOLATION)) {
+		if (task_isolation_syscall(regs->syscallno) == -1)
+			return -1;
+	}
+
 	/* Do the secure computing after ptrace; failures should be fast. */
 	if (secure_computing() == -1)
 		return -1;
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 339882db5a91..d488c91a4877 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -20,6 +20,7 @@
 #include <linux/tracehook.h>
 #include <linux/ratelimit.h>
 #include <linux/syscalls.h>
+#include <linux/isolation.h>
 
 #include <asm/daifflags.h>
 #include <asm/debug-monitors.h>
@@ -898,6 +899,11 @@ static void do_signal(struct pt_regs *regs)
 	restore_saved_sigmask();
 }
 
+#define NOTIFY_RESUME_LOOP_FLAGS \
+	(_TIF_NEED_RESCHED | _TIF_SIGPENDING | \
+	_TIF_NOTIFY_RESUME | _TIF_FOREIGN_FPSTATE | \
+	_TIF_UPROBE | _TIF_FSCHECK)
+
 asmlinkage void do_notify_resume(struct pt_regs *regs,
 				 unsigned long thread_flags)
 {
@@ -908,6 +914,8 @@ asmlinkage void do_notify_resume(struct pt_regs *regs,
 	 */
 	trace_hardirqs_off();
 
+	task_isolation_check_run_cleanup();
+
 	do {
 		/* Check valid user FS if needed */
 		addr_limit_user_check();
@@ -938,7 +946,10 @@ asmlinkage void do_notify_resume(struct pt_regs *regs,
 
 		local_daif_mask();
 		thread_flags = READ_ONCE(current_thread_info()->flags);
-	} while (thread_flags & _TIF_WORK_MASK);
+	} while (thread_flags & NOTIFY_RESUME_LOOP_FLAGS);
+
+	if (thread_flags & _TIF_TASK_ISOLATION)
+		task_isolation_start();
 }
 
 unsigned long __ro_after_init signal_minsigstksz;
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index d4ed9a19d8fe..00f0f77adea0 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -32,6 +32,7 @@
 #include <linux/irq_work.h>
 #include <linux/kexec.h>
 #include <linux/kvm_host.h>
+#include <linux/isolation.h>
 
 #include <asm/alternative.h>
 #include <asm/atomic.h>
@@ -818,6 +819,7 @@ void arch_send_call_function_single_ipi(int cpu)
 #ifdef CONFIG_ARM64_ACPI_PARKING_PROTOCOL
 void arch_send_wakeup_ipi_mask(const struct cpumask *mask)
 {
+	task_isolation_remote_cpumask(mask, "wakeup IPI");
 	smp_cross_call(mask, IPI_WAKEUP);
 }
 #endif
@@ -886,6 +888,9 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
 		__inc_irq_stat(cpu, ipi_irqs[ipinr]);
 	}
 
+	task_isolation_interrupt("IPI type %d (%s)", ipinr,
+				 ipinr < NR_IPI ? ipi_types[ipinr] : "unknown");
+
 	switch (ipinr) {
 	case IPI_RESCHEDULE:
 		scheduler_ipi();
@@ -948,12 +953,14 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
 
 void smp_send_reschedule(int cpu)
 {
+	task_isolation_remote(cpu, "reschedule IPI");
 	smp_cross_call(cpumask_of(cpu), IPI_RESCHEDULE);
 }
 
 #ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST
 void tick_broadcast(const struct cpumask *mask)
 {
+	task_isolation_remote_cpumask(mask, "timer IPI");
 	smp_cross_call(mask, IPI_TIMER);
 }
 #endif
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 85566d32958f..fc4b42c81c4f 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -23,6 +23,7 @@
 #include <linux/perf_event.h>
 #include <linux/preempt.h>
 #include <linux/hugetlb.h>
+#include <linux/isolation.h>
 
 #include <asm/acpi.h>
 #include <asm/bug.h>
@@ -543,6 +544,10 @@ static int __kprobes do_page_fault(unsigned long addr, unsigned int esr,
 	 */
 	if (likely(!(fault & (VM_FAULT_ERROR | VM_FAULT_BADMAP |
 			      VM_FAULT_BADACCESS)))) {
+		/* No signal was generated, but notify task-isolation tasks. */
+		if (user_mode(regs))
+			task_isolation_interrupt("page fault at %#lx", addr);
+
 		/*
 		 * Major/minor page fault accounting is only done
 		 * once. If we go through a retry, it is extremely
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v3 08/13] task_isolation: arch/arm: enable task isolation functionality
  2020-04-09 15:09   ` [PATCH v3 00/13] "Task_isolation" mode Alex Belits
                       ` (6 preceding siblings ...)
  2020-04-09 15:23     ` [PATCH v3 07/13] task_isolation: arch/arm64: " Alex Belits
@ 2020-04-09 15:24     ` Alex Belits
  2020-04-09 15:25     ` [PATCH v3 09/13] task_isolation: don't interrupt CPUs with tick_nohz_full_kick_cpu() Alex Belits
                       ` (4 subsequent siblings)
  12 siblings, 0 replies; 71+ messages in thread
From: Alex Belits @ 2020-04-09 15:24 UTC (permalink / raw)
  To: frederic, rostedt
  Cc: Prasun Kapoor, mingo, davem, linux-api, peterz, linux-arch,
	catalin.marinas, tglx, will, linux-arm-kernel, linux-kernel,
	netdev

From: Francis Giraldeau <francis.giraldeau@gmail.com>

This patch is a port of the task isolation functionality to the arm 32-bit
architecture. Task isolation needs an additional thread flag, which
requires changing the entry assembly code to accept a bitfield larger than
one byte. The constants _TIF_SYSCALL_WORK and _TIF_WORK_MASK are now
loaded from the literal pool. The rest of the patch is straightforward and
reflects what is done on other architectures.

To avoid problems with the tst instruction in the v7m build, we renumber
TIF_SECCOMP to bit 8 and let TIF_TASK_ISOLATION use bit 7.

Signed-off-by: Francis Giraldeau <francis.giraldeau@gmail.com>
Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com> [with modifications]
[abelits@marvell.com: modified for kernel 5.6, added isolation cleanup]
Signed-off-by: Alex Belits <abelits@marvell.com>
---
 arch/arm/Kconfig                   |  1 +
 arch/arm/include/asm/thread_info.h | 10 +++++++---
 arch/arm/kernel/entry-common.S     | 15 ++++++++++-----
 arch/arm/kernel/ptrace.c           | 10 ++++++++++
 arch/arm/kernel/signal.c           | 13 ++++++++++++-
 arch/arm/kernel/smp.c              |  4 ++++
 arch/arm/mm/fault.c                |  8 +++++++-
 7 files changed, 51 insertions(+), 10 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 97864aabc2a6..1a66e6c6807c 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -67,6 +67,7 @@ config ARM
 	select HAVE_ARCH_KGDB if !CPU_ENDIAN_BE32 && MMU
 	select HAVE_ARCH_MMAP_RND_BITS if MMU
 	select HAVE_ARCH_SECCOMP_FILTER if AEABI && !OABI_COMPAT
+	select HAVE_ARCH_TASK_ISOLATION
 	select HAVE_ARCH_THREAD_STRUCT_WHITELIST
 	select HAVE_ARCH_TRACEHOOK
 	select HAVE_ARM_SMCCC if CPU_V7
diff --git a/arch/arm/include/asm/thread_info.h b/arch/arm/include/asm/thread_info.h
index 0d0d5178e2c3..ec3c2084c391 100644
--- a/arch/arm/include/asm/thread_info.h
+++ b/arch/arm/include/asm/thread_info.h
@@ -139,7 +139,8 @@ extern int vfp_restore_user_hwstate(struct user_vfp *,
 #define TIF_SYSCALL_TRACE	4	/* syscall trace active */
 #define TIF_SYSCALL_AUDIT	5	/* syscall auditing active */
 #define TIF_SYSCALL_TRACEPOINT	6	/* syscall tracepoint instrumentation */
-#define TIF_SECCOMP		7	/* seccomp syscall filtering active */
+#define TIF_TASK_ISOLATION	7	/* task isolation active */
+#define TIF_SECCOMP		8	/* seccomp syscall filtering active */
 
 #define TIF_NOHZ		12	/* in adaptive nohz mode */
 #define TIF_USING_IWMMXT	17
@@ -153,18 +154,21 @@ extern int vfp_restore_user_hwstate(struct user_vfp *,
 #define _TIF_SYSCALL_TRACE	(1 << TIF_SYSCALL_TRACE)
 #define _TIF_SYSCALL_AUDIT	(1 << TIF_SYSCALL_AUDIT)
 #define _TIF_SYSCALL_TRACEPOINT	(1 << TIF_SYSCALL_TRACEPOINT)
+#define _TIF_TASK_ISOLATION	(1 << TIF_TASK_ISOLATION)
 #define _TIF_SECCOMP		(1 << TIF_SECCOMP)
 #define _TIF_USING_IWMMXT	(1 << TIF_USING_IWMMXT)
 
 /* Checks for any syscall work in entry-common.S */
 #define _TIF_SYSCALL_WORK (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \
-			   _TIF_SYSCALL_TRACEPOINT | _TIF_SECCOMP)
+			   _TIF_SYSCALL_TRACEPOINT | _TIF_SECCOMP | \
+			   _TIF_TASK_ISOLATION)
 
 /*
  * Change these and you break ASM code in entry-common.S
  */
 #define _TIF_WORK_MASK		(_TIF_NEED_RESCHED | _TIF_SIGPENDING | \
-				 _TIF_NOTIFY_RESUME | _TIF_UPROBE)
+				 _TIF_NOTIFY_RESUME | _TIF_UPROBE | \
+				 _TIF_TASK_ISOLATION)
 
 #endif /* __KERNEL__ */
 #endif /* __ASM_ARM_THREAD_INFO_H */
diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
index 271cb8a1eba1..6ceb5cb808a9 100644
--- a/arch/arm/kernel/entry-common.S
+++ b/arch/arm/kernel/entry-common.S
@@ -53,7 +53,8 @@ __ret_fast_syscall:
 	cmp	r2, #TASK_SIZE
 	blne	addr_limit_check_failed
 	ldr	r1, [tsk, #TI_FLAGS]		@ re-check for syscall tracing
-	tst	r1, #_TIF_SYSCALL_WORK | _TIF_WORK_MASK
+	ldr	r2, =_TIF_SYSCALL_WORK | _TIF_WORK_MASK
+	tst	r1, r2
 	bne	fast_work_pending
 
 
@@ -90,7 +91,8 @@ __ret_fast_syscall:
 	cmp	r2, #TASK_SIZE
 	blne	addr_limit_check_failed
 	ldr	r1, [tsk, #TI_FLAGS]		@ re-check for syscall tracing
-	tst	r1, #_TIF_SYSCALL_WORK | _TIF_WORK_MASK
+	ldr	r2, =_TIF_SYSCALL_WORK | _TIF_WORK_MASK
+	tst	r1, r2
 	beq	no_work_pending
  UNWIND(.fnend		)
 ENDPROC(ret_fast_syscall)
@@ -98,7 +100,8 @@ ENDPROC(ret_fast_syscall)
 	/* Slower path - fall through to work_pending */
 #endif
 
-	tst	r1, #_TIF_SYSCALL_WORK
+	ldr	r2, =_TIF_SYSCALL_WORK
+	tst	r1, r2
 	bne	__sys_trace_return_nosave
 slow_work_pending:
 	mov	r0, sp				@ 'regs'
@@ -131,7 +134,8 @@ ENTRY(ret_to_user_from_irq)
 	cmp	r2, #TASK_SIZE
 	blne	addr_limit_check_failed
 	ldr	r1, [tsk, #TI_FLAGS]
-	tst	r1, #_TIF_WORK_MASK
+	ldr	r2, =_TIF_WORK_MASK
+	tst	r1, r2
 	bne	slow_work_pending
 no_work_pending:
 	asm_trace_hardirqs_on save = 0
@@ -251,7 +255,8 @@ local_restart:
 	ldr	r10, [tsk, #TI_FLAGS]		@ check for syscall tracing
 	stmdb	sp!, {r4, r5}			@ push fifth and sixth args
 
-	tst	r10, #_TIF_SYSCALL_WORK		@ are we tracing syscalls?
+	ldr	r11, =_TIF_SYSCALL_WORK		@ are we tracing syscalls?
+	tst	r10, r11
 	bne	__sys_trace
 
 	invoke_syscall tbl, scno, r10, __ret_fast_syscall
diff --git a/arch/arm/kernel/ptrace.c b/arch/arm/kernel/ptrace.c
index b606cded90cd..a69b0bfd71ae 100644
--- a/arch/arm/kernel/ptrace.c
+++ b/arch/arm/kernel/ptrace.c
@@ -24,6 +24,7 @@
 #include <linux/audit.h>
 #include <linux/tracehook.h>
 #include <linux/unistd.h>
+#include <linux/isolation.h>
 
 #include <asm/pgtable.h>
 #include <asm/traps.h>
@@ -921,6 +922,15 @@ asmlinkage int syscall_trace_enter(struct pt_regs *regs, int scno)
 	if (test_thread_flag(TIF_SYSCALL_TRACE))
 		tracehook_report_syscall(regs, PTRACE_SYSCALL_ENTER);
 
+	/*
+	 * In task isolation mode, we may prevent the syscall from
+	 * running, and if so we also deliver a signal to the process.
+	 */
+	if (test_thread_flag(TIF_TASK_ISOLATION)) {
+		if (task_isolation_syscall(scno) == -1)
+			return -1;
+	}
+
 	/* Do seccomp after ptrace; syscall may have changed. */
 #ifdef CONFIG_HAVE_ARCH_SECCOMP_FILTER
 	if (secure_computing() == -1)
diff --git a/arch/arm/kernel/signal.c b/arch/arm/kernel/signal.c
index ab2568996ddb..29ccef8403cd 100644
--- a/arch/arm/kernel/signal.c
+++ b/arch/arm/kernel/signal.c
@@ -12,6 +12,7 @@
 #include <linux/tracehook.h>
 #include <linux/uprobes.h>
 #include <linux/syscalls.h>
+#include <linux/isolation.h>
 
 #include <asm/elf.h>
 #include <asm/cacheflush.h>
@@ -639,6 +640,9 @@ static int do_signal(struct pt_regs *regs, int syscall)
 	return 0;
 }
 
+#define WORK_PENDING_LOOP_FLAGS	(_TIF_NEED_RESCHED | _TIF_SIGPENDING |	\
+				 _TIF_NOTIFY_RESUME | _TIF_UPROBE)
+
 asmlinkage int
 do_work_pending(struct pt_regs *regs, unsigned int thread_flags, int syscall)
 {
@@ -648,6 +652,9 @@ do_work_pending(struct pt_regs *regs, unsigned int thread_flags, int syscall)
 	 * Update the trace code with the current status.
 	 */
 	trace_hardirqs_off();
+
+	task_isolation_check_run_cleanup();
+
 	do {
 		if (likely(thread_flags & _TIF_NEED_RESCHED)) {
 			schedule();
@@ -676,7 +683,11 @@ do_work_pending(struct pt_regs *regs, unsigned int thread_flags, int syscall)
 		}
 		local_irq_disable();
 		thread_flags = current_thread_info()->flags;
-	} while (thread_flags & _TIF_WORK_MASK);
+	} while (thread_flags & WORK_PENDING_LOOP_FLAGS);
+
+	if (thread_flags & _TIF_TASK_ISOLATION)
+		task_isolation_start();
+
 	return 0;
 }
 
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 46e1be9e57a8..95f19b980776 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -26,6 +26,7 @@
 #include <linux/completion.h>
 #include <linux/cpufreq.h>
 #include <linux/irq_work.h>
+#include <linux/isolation.h>
 
 #include <linux/atomic.h>
 #include <asm/bugs.h>
@@ -560,6 +561,7 @@ void arch_send_call_function_ipi_mask(const struct cpumask *mask)
 
 void arch_send_wakeup_ipi_mask(const struct cpumask *mask)
 {
+	task_isolation_remote_cpumask(mask, "wakeup IPI");
 	smp_cross_call(mask, IPI_WAKEUP);
 }
 
@@ -579,6 +581,7 @@ void arch_irq_work_raise(void)
 #ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST
 void tick_broadcast(const struct cpumask *mask)
 {
+	task_isolation_remote_cpumask(mask, "timer IPI");
 	smp_cross_call(mask, IPI_TIMER);
 }
 #endif
@@ -702,6 +705,7 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
 
 void smp_send_reschedule(int cpu)
 {
+	task_isolation_remote(cpu, "reschedule IPI");
 	smp_cross_call(cpumask_of(cpu), IPI_RESCHEDULE);
 }
 
diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
index bd0f4821f7e1..acd11a69c4e4 100644
--- a/arch/arm/mm/fault.c
+++ b/arch/arm/mm/fault.c
@@ -17,6 +17,7 @@
 #include <linux/sched/debug.h>
 #include <linux/highmem.h>
 #include <linux/perf_event.h>
+#include <linux/isolation.h>
 
 #include <asm/pgtable.h>
 #include <asm/system_misc.h>
@@ -332,8 +333,13 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 	/*
 	 * Handle the "normal" case first - VM_FAULT_MAJOR
 	 */
-	if (likely(!(fault & (VM_FAULT_ERROR | VM_FAULT_BADMAP | VM_FAULT_BADACCESS))))
+	if (likely(!(fault & (VM_FAULT_ERROR | VM_FAULT_BADMAP |
+			      VM_FAULT_BADACCESS)))) {
+		/* No signal was generated, but notify task-isolation tasks. */
+		if (user_mode(regs))
+			task_isolation_interrupt("page fault at %#lx", addr);
 		return 0;
+	}
 
 	/*
 	 * If we are in kernel mode at this point, we
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v3 09/13] task_isolation: don't interrupt CPUs with tick_nohz_full_kick_cpu()
  2020-04-09 15:09   ` [PATCH v3 00/13] "Task_isolation" mode Alex Belits
                       ` (7 preceding siblings ...)
  2020-04-09 15:24     ` [PATCH v3 08/13] task_isolation: arch/arm: " Alex Belits
@ 2020-04-09 15:25     ` Alex Belits
  2020-04-09 15:26     ` [PATCH v3 10/13] task_isolation: net: don't flush backlog on CPUs running isolated tasks Alex Belits
                       ` (3 subsequent siblings)
  12 siblings, 0 replies; 71+ messages in thread
From: Alex Belits @ 2020-04-09 15:25 UTC (permalink / raw)
  To: frederic, rostedt
  Cc: Prasun Kapoor, mingo, davem, linux-api, peterz, linux-arch,
	catalin.marinas, tglx, will, linux-arm-kernel, linux-kernel,
	netdev

From: Yuri Norov <ynorov@marvell.com>

For nohz_full CPUs the desired behavior is to receive interrupts
generated by tick_nohz_full_kick_cpu(). But for hard task isolation this
is not desirable because it breaks isolation.

This patch adds a check for it, so that CPUs running isolated tasks are
not kicked.

Signed-off-by: Yuri Norov <ynorov@marvell.com>
[abelits@marvell.com: updated, only exclude CPUs running isolated tasks]
Signed-off-by: Alex Belits <abelits@marvell.com>
---
 kernel/time/tick-sched.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 1d4dec9d3ee7..8488c2825a45 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -20,6 +20,7 @@
 #include <linux/sched/clock.h>
 #include <linux/sched/stat.h>
 #include <linux/sched/nohz.h>
+#include <linux/isolation.h>
 #include <linux/module.h>
 #include <linux/irq_work.h>
 #include <linux/posix-timers.h>
@@ -262,7 +263,8 @@ static void tick_nohz_full_kick(void)
  */
 void tick_nohz_full_kick_cpu(int cpu)
 {
-	if (!tick_nohz_full_cpu(cpu))
+	smp_rmb();
+	if (!tick_nohz_full_cpu(cpu) || task_isolation_on_cpu(cpu))
 		return;
 
 	irq_work_queue_on(&per_cpu(nohz_full_kick_work, cpu), cpu);
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v3 10/13] task_isolation: net: don't flush backlog on CPUs running isolated tasks
  2020-04-09 15:09   ` [PATCH v3 00/13] "Task_isolation" mode Alex Belits
                       ` (8 preceding siblings ...)
  2020-04-09 15:25     ` [PATCH v3 09/13] task_isolation: don't interrupt CPUs with tick_nohz_full_kick_cpu() Alex Belits
@ 2020-04-09 15:26     ` Alex Belits
  2020-04-09 15:27     ` [PATCH v3 11/13] task_isolation: ringbuffer: don't interrupt CPUs running isolated tasks on buffer resize Alex Belits
                       ` (2 subsequent siblings)
  12 siblings, 0 replies; 71+ messages in thread
From: Alex Belits @ 2020-04-09 15:26 UTC (permalink / raw)
  To: frederic, rostedt
  Cc: Prasun Kapoor, mingo, davem, linux-api, peterz, linux-arch,
	catalin.marinas, tglx, will, linux-arm-kernel, linux-kernel,
	netdev

From: Yuri Norov <ynorov@marvell.com>

If a CPU runs an isolated task, there is no backlog on it, so we
don't need to flush it. Currently flush_all_backlogs() enqueues the
corresponding work on all CPUs, including ones that run isolated
tasks. This breaks task isolation for nothing.

In this patch, backlog flushing is enqueued only on non-isolated CPUs.

Signed-off-by: Yuri Norov <ynorov@marvell.com>
[abelits@marvell.com: use safe task_isolation_on_cpu() implementation]
Signed-off-by: Alex Belits <abelits@marvell.com>
---
 net/core/dev.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index c6c985fe7b1b..353d2be39202 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -74,6 +74,7 @@
 #include <linux/cpu.h>
 #include <linux/types.h>
 #include <linux/kernel.h>
+#include <linux/isolation.h>
 #include <linux/hash.h>
 #include <linux/slab.h>
 #include <linux/sched.h>
@@ -5518,9 +5519,13 @@ static void flush_all_backlogs(void)
 
 	get_online_cpus();
 
-	for_each_online_cpu(cpu)
+	smp_rmb();
+	for_each_online_cpu(cpu) {
+		if (task_isolation_on_cpu(cpu))
+			continue;
 		queue_work_on(cpu, system_highpri_wq,
 			      per_cpu_ptr(&flush_works, cpu));
+	}
 
 	for_each_online_cpu(cpu)
 		flush_work(per_cpu_ptr(&flush_works, cpu));
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v3 11/13] task_isolation: ringbuffer: don't interrupt CPUs running isolated tasks on buffer resize
  2020-04-09 15:09   ` [PATCH v3 00/13] "Task_isolation" mode Alex Belits
                       ` (9 preceding siblings ...)
  2020-04-09 15:26     ` [PATCH v3 10/13] task_isolation: net: don't flush backlog on CPUs running isolated tasks Alex Belits
@ 2020-04-09 15:27     ` Alex Belits
  2020-04-09 15:27     ` [PATCH v3 12/13] task_isolation: kick_all_cpus_sync: don't kick isolated cpus Alex Belits
  2020-04-09 15:28     ` [PATCH v3 13/13] task_isolation: CONFIG_TASK_ISOLATION prevents distribution of jobs to non-housekeeping CPUs Alex Belits
  12 siblings, 0 replies; 71+ messages in thread
From: Alex Belits @ 2020-04-09 15:27 UTC (permalink / raw)
  To: frederic, rostedt
  Cc: Prasun Kapoor, mingo, davem, linux-api, peterz, linux-arch,
	catalin.marinas, tglx, will, linux-arm-kernel, linux-kernel,
	netdev

From: Yuri Norov <ynorov@marvell.com>

CPUs running isolated tasks are in userspace, so they don't have to
perform ring buffer updates immediately. If ring_buffer_resize()
schedules the update on those CPUs, isolation is broken. To prevent
that, updates for CPUs running isolated tasks are performed locally,
like for offline CPUs.

A race between this update and isolation breaking is avoided at the
cost of disabling per-cpu buffer writing for the duration of the update
when it coincides with isolation breaking.
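
Spelled out (illustration only; this assumes the isolation-breaking
path publishes its state change with a full barrier before the CPU can
write a trace event, which the rest of the series is expected to
provide), the argument is the usual store-buffering pattern:

resizing CPU                          CPU running isolated task
[ inc record_disabled ]               [ breaks isolation, clears flag ]
[ smp_mb() ]                          [ barrier on kernel entry ]
[ re-check isolation flag ]           [ check record_disabled before write ]

With full barriers on both sides, at least one of the two checks sees
the other side's store: either the resizer sees isolation already
broken and falls back to schedule_work_on(), or the formerly isolated
CPU sees record_disabled != 0 and stays out of the buffer while
rb_update_pages() is performed on its behalf by the resizing CPU.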

Signed-off-by: Yuri Norov <ynorov@marvell.com>
[abelits@marvell.com: updated to prevent race with isolation breaking]
Signed-off-by: Alex Belits <abelits@marvell.com>
---
 kernel/trace/ring_buffer.c | 63 ++++++++++++++++++++++++++++++++++----
 1 file changed, 57 insertions(+), 6 deletions(-)

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 61f0e92ace99..972f26fc3540 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -21,6 +21,7 @@
 #include <linux/delay.h>
 #include <linux/slab.h>
 #include <linux/init.h>
+#include <linux/isolation.h>
 #include <linux/hash.h>
 #include <linux/list.h>
 #include <linux/cpu.h>
@@ -1701,6 +1702,38 @@ static void update_pages_handler(struct work_struct *work)
 	complete(&cpu_buffer->update_done);
 }
 
+static bool update_if_isolated(struct ring_buffer_per_cpu *cpu_buffer,
+			       int cpu)
+{
+	bool rv = false;
+
+	smp_rmb();
+	if (task_isolation_on_cpu(cpu)) {
+		/*
+		 * CPU is running isolated task. Since it may lose
+		 * isolation and re-enter kernel simultaneously with
+		 * this update, disable recording until it's done.
+		 */
+		atomic_inc(&cpu_buffer->record_disabled);
+		/* Make sure, update is done, and isolation state is current */
+		smp_mb();
+		if (task_isolation_on_cpu(cpu)) {
+			/*
+			 * If CPU is still running isolated task, we
+			 * can be sure that breaking isolation will
+			 * happen while recording is disabled, and CPU
+			 * will not touch this buffer until the update
+			 * is done.
+			 */
+			rb_update_pages(cpu_buffer);
+			cpu_buffer->nr_pages_to_update = 0;
+			rv = true;
+		}
+		atomic_dec(&cpu_buffer->record_disabled);
+	}
+	return rv;
+}
+
 /**
  * ring_buffer_resize - resize the ring buffer
  * @buffer: the buffer to resize.
@@ -1784,13 +1817,22 @@ int ring_buffer_resize(struct trace_buffer *buffer, unsigned long size,
 			if (!cpu_buffer->nr_pages_to_update)
 				continue;
 
-			/* Can't run something on an offline CPU. */
+			/*
+			 * Can't run something on an offline CPU.
+			 *
+			 * CPUs running isolated tasks don't have to
+			 * update ring buffers until they exit
+			 * isolation because they are in
+			 * userspace. Use the procedure that prevents
+			 * race condition with isolation breaking.
+			 */
 			if (!cpu_online(cpu)) {
 				rb_update_pages(cpu_buffer);
 				cpu_buffer->nr_pages_to_update = 0;
 			} else {
-				schedule_work_on(cpu,
-						&cpu_buffer->update_pages_work);
+				if (!update_if_isolated(cpu_buffer, cpu))
+					schedule_work_on(cpu,
+					&cpu_buffer->update_pages_work);
 			}
 		}
 
@@ -1829,13 +1871,22 @@ int ring_buffer_resize(struct trace_buffer *buffer, unsigned long size,
 
 		get_online_cpus();
 
-		/* Can't run something on an offline CPU. */
+		/*
+		 * Can't run something on an offline CPU.
+		 *
+		 * CPUs running isolated tasks don't have to update
+		 * ring buffers until they exit isolation because they
+		 * are in userspace. Use the procedure that prevents
+		 * race condition with isolation breaking.
+		 */
 		if (!cpu_online(cpu_id))
 			rb_update_pages(cpu_buffer);
 		else {
-			schedule_work_on(cpu_id,
+			if (!update_if_isolated(cpu_buffer, cpu_id)) {
+				schedule_work_on(cpu_id,
 					 &cpu_buffer->update_pages_work);
-			wait_for_completion(&cpu_buffer->update_done);
+				wait_for_completion(&cpu_buffer->update_done);
+			}
 		}
 
 		cpu_buffer->nr_pages_to_update = 0;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v3 12/13] task_isolation: kick_all_cpus_sync: don't kick isolated cpus
  2020-04-09 15:09   ` [PATCH v3 00/13] "Task_isolation" mode Alex Belits
                       ` (10 preceding siblings ...)
  2020-04-09 15:27     ` [PATCH v3 11/13] task_isolation: ringbuffer: don't interrupt CPUs running isolated tasks on buffer resize Alex Belits
@ 2020-04-09 15:27     ` Alex Belits
  2020-04-09 15:28     ` [PATCH v3 13/13] task_isolation: CONFIG_TASK_ISOLATION prevents distribution of jobs to non-housekeeping CPUs Alex Belits
  12 siblings, 0 replies; 71+ messages in thread
From: Alex Belits @ 2020-04-09 15:27 UTC (permalink / raw)
  To: frederic, rostedt
  Cc: Prasun Kapoor, mingo, davem, linux-api, peterz, linux-arch,
	catalin.marinas, tglx, will, linux-arm-kernel, linux-kernel,
	netdev

From: Yuri Norov <ynorov@marvell.com>

Make sure that kick_all_cpus_sync() does not interrupt CPUs that are
running isolated tasks.

Signed-off-by: Yuri Norov <ynorov@marvell.com>
[abelits@marvell.com: use safe task_isolation_cpumask() implementation]
Signed-off-by: Alex Belits <abelits@marvell.com>
---
 kernel/smp.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index 3a8bcbdd4ce6..d9b4b2fedfed 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -731,9 +731,21 @@ static void do_nothing(void *unused)
  */
 void kick_all_cpus_sync(void)
 {
+	struct cpumask mask;
+
 	/* Make sure the change is visible before we kick the cpus */
 	smp_mb();
-	smp_call_function(do_nothing, NULL, 1);
+
+	preempt_disable();
+#ifdef CONFIG_TASK_ISOLATION
+	cpumask_clear(&mask);
+	task_isolation_cpumask(&mask);
+	cpumask_complement(&mask, &mask);
+#else
+	cpumask_setall(&mask);
+#endif
+	smp_call_function_many(&mask, do_nothing, NULL, 1);
+	preempt_enable();
 }
 EXPORT_SYMBOL_GPL(kick_all_cpus_sync);
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v3 13/13] task_isolation: CONFIG_TASK_ISOLATION prevents distribution of jobs to non-housekeeping CPUs
  2020-04-09 15:09   ` [PATCH v3 00/13] "Task_isolation" mode Alex Belits
                       ` (11 preceding siblings ...)
  2020-04-09 15:27     ` [PATCH v3 12/13] task_isolation: kick_all_cpus_sync: don't kick isolated cpus Alex Belits
@ 2020-04-09 15:28     ` Alex Belits
  12 siblings, 0 replies; 71+ messages in thread
From: Alex Belits @ 2020-04-09 15:28 UTC (permalink / raw)
  To: frederic, rostedt
  Cc: Prasun Kapoor, mingo, davem, linux-api, peterz, linux-arch,
	catalin.marinas, tglx, will, linux-arm-kernel, linux-kernel,
	netdev

There are various mechanisms that select CPUs for jobs outside of
regular workqueue selection. CPU isolation normally does not
prevent those jobs from running on isolated CPUs. When task
isolation is enabled, those jobs should be limited to housekeeping
CPUs.

Signed-off-by: Alex Belits <abelits@marvell.com>
---
 drivers/pci/pci-driver.c |  9 +++++++
 lib/cpumask.c            | 53 +++++++++++++++++++++++++---------------
 net/core/net-sysfs.c     |  9 +++++++
 3 files changed, 51 insertions(+), 20 deletions(-)

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 0454ca0e4e3f..cb872cdd1782 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -12,6 +12,7 @@
 #include <linux/string.h>
 #include <linux/slab.h>
 #include <linux/sched.h>
+#include <linux/sched/isolation.h>
 #include <linux/cpu.h>
 #include <linux/pm_runtime.h>
 #include <linux/suspend.h>
@@ -332,6 +333,9 @@ static bool pci_physfn_is_probed(struct pci_dev *dev)
 static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
 			  const struct pci_device_id *id)
 {
+#ifdef CONFIG_TASK_ISOLATION
+	int hk_flags = HK_FLAG_DOMAIN | HK_FLAG_WQ;
+#endif
 	int error, node, cpu;
 	struct drv_dev_and_id ddi = { drv, dev, id };
 
@@ -353,7 +357,12 @@ static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
 	    pci_physfn_is_probed(dev))
 		cpu = nr_cpu_ids;
 	else
+#ifdef CONFIG_TASK_ISOLATION
+		cpu = cpumask_any_and(cpumask_of_node(node),
+				      housekeeping_cpumask(hk_flags));
+#else
 		cpu = cpumask_any_and(cpumask_of_node(node), cpu_online_mask);
+#endif
 
 	if (cpu < nr_cpu_ids)
 		error = work_on_cpu(cpu, local_pci_probe, &ddi);
diff --git a/lib/cpumask.c b/lib/cpumask.c
index 0cb672eb107c..dcbc30a47600 100644
--- a/lib/cpumask.c
+++ b/lib/cpumask.c
@@ -6,6 +6,7 @@
 #include <linux/export.h>
 #include <linux/memblock.h>
 #include <linux/numa.h>
+#include <linux/sched/isolation.h>
 
 /**
  * cpumask_next - get the next cpu in a cpumask
@@ -205,28 +206,40 @@ void __init free_bootmem_cpumask_var(cpumask_var_t mask)
  */
 unsigned int cpumask_local_spread(unsigned int i, int node)
 {
-	int cpu;
+	const struct cpumask *mask;
+	int cpu, m, n;
+
+#ifdef CONFIG_TASK_ISOLATION
+	mask = housekeeping_cpumask(HK_FLAG_DOMAIN);
+	m = cpumask_weight(mask);
+#else
+	mask = cpu_online_mask;
+	m = num_online_cpus();
+#endif
 
 	/* Wrap: we always want a cpu. */
-	i %= num_online_cpus();
-
-	if (node == NUMA_NO_NODE) {
-		for_each_cpu(cpu, cpu_online_mask)
-			if (i-- == 0)
-				return cpu;
-	} else {
-		/* NUMA first. */
-		for_each_cpu_and(cpu, cpumask_of_node(node), cpu_online_mask)
-			if (i-- == 0)
-				return cpu;
-
-		for_each_cpu(cpu, cpu_online_mask) {
-			/* Skip NUMA nodes, done above. */
-			if (cpumask_test_cpu(cpu, cpumask_of_node(node)))
-				continue;
-
-			if (i-- == 0)
-				return cpu;
+	n = i % m;
+
+	while (m-- > 0) {
+		if (node == NUMA_NO_NODE) {
+			for_each_cpu(cpu, mask)
+				if (n-- == 0)
+					return cpu;
+		} else {
+			/* NUMA first. */
+			for_each_cpu_and(cpu, cpumask_of_node(node), mask)
+				if (n-- == 0)
+					return cpu;
+
+			for_each_cpu(cpu, mask) {
+				/* Skip NUMA nodes, done above. */
+				if (cpumask_test_cpu(cpu,
+						     cpumask_of_node(node)))
+					continue;
+
+				if (n-- == 0)
+					return cpu;
+			}
 		}
 	}
 	BUG();
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 4c826b8bf9b1..253758f102d9 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -11,6 +11,7 @@
 #include <linux/if_arp.h>
 #include <linux/slab.h>
 #include <linux/sched/signal.h>
+#include <linux/sched/isolation.h>
 #include <linux/nsproxy.h>
 #include <net/sock.h>
 #include <net/net_namespace.h>
@@ -725,6 +726,14 @@ static ssize_t store_rps_map(struct netdev_rx_queue *queue,
 		return err;
 	}
 
+#ifdef CONFIG_TASK_ISOLATION
+	cpumask_and(mask, mask, housekeeping_cpumask(HK_FLAG_DOMAIN));
+	if (cpumask_weight(mask) == 0) {
+		free_cpumask_var(mask);
+		return -EINVAL;
+	}
+#endif
+
 	map = kzalloc(max_t(unsigned int,
 			    RPS_MAP_SIZE(cpumask_weight(mask)), L1_CACHE_BYTES),
 		      GFP_KERNEL);
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* Re: [PATCH v3 04/13] task_isolation: userspace hard isolation from kernel
  2020-04-09 15:20     ` [PATCH v3 04/13] task_isolation: userspace hard isolation from kernel Alex Belits
@ 2020-04-09 18:00       ` Andy Lutomirski
  2020-04-19  5:07         ` Alex Belits
  0 siblings, 1 reply; 71+ messages in thread
From: Andy Lutomirski @ 2020-04-09 18:00 UTC (permalink / raw)
  To: Alex Belits
  Cc: frederic, rostedt, Prasun Kapoor, mingo, davem, linux-api,
	peterz, linux-arch, catalin.marinas, tglx, will,
	linux-arm-kernel, linux-kernel, netdev



> On Apr 9, 2020, at 8:21 AM, Alex Belits <abelits@marvell.com> wrote:
> 
> The existing nohz_full mode is designed as a "soft" isolation mode
> that makes tradeoffs to minimize userspace interruptions while
> still attempting to avoid overheads in the kernel entry/exit path,
> to provide 100% kernel semantics, etc.
> 
> However, some applications require a "hard" commitment from the
> kernel to avoid interruptions, in particular userspace device driver
> style applications, such as high-speed networking code.
> 
> This change introduces a framework to allow applications
> to elect to have the "hard" semantics as needed, specifying
> prctl(PR_TASK_ISOLATION, PR_TASK_ISOLATION_ENABLE) to do so.
> 
> The kernel must be built with the new TASK_ISOLATION Kconfig flag
> to enable this mode, and the kernel booted with an appropriate
> "isolcpus=nohz,domain,CPULIST" boot argument to enable
> nohz_full and isolcpus. The "task_isolation" state is then indicated
> by setting a new task struct field, task_isolation_flag, to the
> value passed by prctl(), and also setting a TIF_TASK_ISOLATION
> bit in the thread_info flags. When the kernel is returning to
> userspace from the prctl() call and sees TIF_TASK_ISOLATION set,
> it calls the new task_isolation_start() routine to arrange for
> the task to avoid being interrupted in the future.
> 
> With interrupts disabled, task_isolation_start() ensures that kernel
> subsystems that might cause a future interrupt are quiesced. If it
> doesn't succeed, it adjusts the syscall return value to indicate that
> fact, and userspace can retry as desired. In addition to stopping
> the scheduler tick, the code takes any actions that might avoid
> a future interrupt to the core, such as a worker thread being
> scheduled that could be quiesced now (e.g. the vmstat worker)
> or a future IPI to the core to clean up some state that could be
> cleaned up now (e.g. the mm lru per-cpu cache).
> 
> Once the task has returned to userspace after issuing the prctl(),
> if it enters the kernel again via system call, page fault, or any
> other exception or irq, the kernel will kill it with SIGKILL.

I could easily imagine myself using task isolation, but not with the SIGKILL semantics. SIGKILL causes data loss. Please at least let users choose what signal to send.

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v3 03/13] task_isolation: add instruction synchronization memory barrier
  2020-04-09 15:17     ` [PATCH v3 03/13] task_isolation: add instruction synchronization memory barrier Alex Belits
@ 2020-04-15 12:44       ` Mark Rutland
  2020-04-19  5:02         ` [EXT] " Alex Belits
  0 siblings, 1 reply; 71+ messages in thread
From: Mark Rutland @ 2020-04-15 12:44 UTC (permalink / raw)
  To: Alex Belits
  Cc: frederic, rostedt, Prasun Kapoor, mingo, davem, linux-api,
	peterz, linux-arch, catalin.marinas, tglx, will,
	linux-arm-kernel, linux-kernel, netdev

On Thu, Apr 09, 2020 at 03:17:40PM +0000, Alex Belits wrote:
> Some architectures implement memory synchronization instructions for
> instruction cache. Make a separate kind of barrier that calls them.

Modifying the instruction caches requires more than an ISB, and the
'IMB' naming implies you're trying to order against memory accesses,
which isn't what ISB (generally) does.

What exactly do you want to use this for?

As-is, I don't think this makes sense as a generic barrier.

Thanks,
Mark.

> 
> Signed-off-by: Alex Belits <abelits@marvell.com>
> ---
>  arch/arm/include/asm/barrier.h   | 2 ++
>  arch/arm64/include/asm/barrier.h | 2 ++
>  include/asm-generic/barrier.h    | 4 ++++
>  3 files changed, 8 insertions(+)
> 
> diff --git a/arch/arm/include/asm/barrier.h b/arch/arm/include/asm/barrier.h
> index 83ae97c049d9..6def62c95937 100644
> --- a/arch/arm/include/asm/barrier.h
> +++ b/arch/arm/include/asm/barrier.h
> @@ -64,12 +64,14 @@ extern void arm_heavy_mb(void);
>  #define mb()		__arm_heavy_mb()
>  #define rmb()		dsb()
>  #define wmb()		__arm_heavy_mb(st)
> +#define imb()		isb()
>  #define dma_rmb()	dmb(osh)
>  #define dma_wmb()	dmb(oshst)
>  #else
>  #define mb()		barrier()
>  #define rmb()		barrier()
>  #define wmb()		barrier()
> +#define imb()		barrier()
>  #define dma_rmb()	barrier()
>  #define dma_wmb()	barrier()
>  #endif
> diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
> index 7d9cc5ec4971..12a7dbd68bed 100644
> --- a/arch/arm64/include/asm/barrier.h
> +++ b/arch/arm64/include/asm/barrier.h
> @@ -45,6 +45,8 @@
>  #define rmb()		dsb(ld)
>  #define wmb()		dsb(st)
>  
> +#define imb()		isb()
> +
>  #define dma_rmb()	dmb(oshld)
>  #define dma_wmb()	dmb(oshst)
>  
> diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
> index 85b28eb80b11..d5a822fb3e92 100644
> --- a/include/asm-generic/barrier.h
> +++ b/include/asm-generic/barrier.h
> @@ -46,6 +46,10 @@
>  #define dma_wmb()	wmb()
>  #endif
>  
> +#ifndef imb
> +#define imb()		barrier()
> +#endif
> +
>  #ifndef read_barrier_depends
>  #define read_barrier_depends()		do { } while (0)
>  #endif
> -- 
> 2.20.1
> 

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [EXT] Re: [PATCH v3 03/13] task_isolation: add instruction synchronization memory barrier
  2020-04-15 12:44       ` Mark Rutland
@ 2020-04-19  5:02         ` Alex Belits
  2020-04-20 12:23           ` Will Deacon
  2020-04-20 12:45           ` Mark Rutland
  0 siblings, 2 replies; 71+ messages in thread
From: Alex Belits @ 2020-04-19  5:02 UTC (permalink / raw)
  To: mark.rutland
  Cc: mingo, davem, linux-api, rostedt, peterz, linux-arch,
	Prasun Kapoor, catalin.marinas, frederic, tglx, will,
	linux-kernel, netdev, linux-arm-kernel


On Wed, 2020-04-15 at 13:44 +0100, Mark Rutland wrote:
> External Email
> 
> -------------------------------------------------------------------
> ---
> On Thu, Apr 09, 2020 at 03:17:40PM +0000, Alex Belits wrote:
> > Some architectures implement memory synchronization instructions
> > for
> > instruction cache. Make a separate kind of barrier that calls them.
> 
> Modifying the instruction caches requries more than an ISB, and the
> 'IMB' naming implies you're trying to order against memory accesses,
> which isn't what ISB (generally) does.
> 
> What exactly do you want to use this for?

I guess there should be a different explanation and naming.

The intention is to have a separate barrier that causes a cache
synchronization event, for use in architecture-independent code. I am
not sure what exactly it should do to be implemented in an
architecture-independent manner, so it probably only makes sense along
with a regular memory barrier.

The particular place where I had to use it is the code that has to run
after an isolated task returns to the kernel. In the model that I
propose for task isolation, remote context synchronization is skipped
while the task is isolated in userspace (it doesn't run kernel code,
and the kernel does not modify its userspace code, so it's harmless
until entering the kernel). So it will skip the results of
kick_all_cpus_sync() that was called from flush_icache_range() and
other similar places. This means that once it's out of userspace, it
should only run some "safe" kernel entry code, and then synchronize in
some manner that avoids race conditions with possible IPIs intended for
context synchronization that may happen at the same time. My next patch
in the series uses it in that one place.

Synchronization will have to be implemented without a mandatory
interrupt because it may be triggered locally, on the same CPU. On ARM,
ISB is definitely necessary there, however I am not sure what this
should look like on x86 and other architectures. On ARM this should
probably still be combined with a real memory barrier and cache
synchronization; however, I am not entirely sure about the details.
Would it make more sense to run DMB, IC and ISB?

> 
> As-is, I don't think this makes sense as a generic barrier.
> 
> Thanks,
> Mark.

Signed-off-by: Alex Belits <abelits@marvell.com>
---
 arch/arm/include/asm/barrier.h   | 2 ++
 arch/arm64/include/asm/barrier.h | 2 ++
 include/asm-generic/barrier.h    | 4 ++++
 3 files changed, 8 insertions(+)

diff --git a/arch/arm/include/asm/barrier.h
b/arch/arm/include/asm/barrier.h
index 83ae97c049d9..6def62c95937 100644
--- a/arch/arm/include/asm/barrier.h
+++ b/arch/arm/include/asm/barrier.h
@@ -64,12 +64,14 @@ extern void arm_heavy_mb(void);
 #define mb()		__arm_heavy_mb()
 #define rmb()		dsb()
 #define wmb()		__arm_heavy_mb(st)
+#define imb()		isb()
 #define dma_rmb()	dmb(osh)
 #define dma_wmb()	dmb(oshst)
 #else
 #define mb()		barrier()
 #define rmb()		barrier()
 #define wmb()		barrier()
+#define imb()		barrier()
 #define dma_rmb()	barrier()
 #define dma_wmb()	barrier()
 #endif
diff --git a/arch/arm64/include/asm/barrier.h
b/arch/arm64/include/asm/barrier.h
index 7d9cc5ec4971..12a7dbd68bed 100644
--- a/arch/arm64/include/asm/barrier.h
+++ b/arch/arm64/include/asm/barrier.h
@@ -45,6 +45,8 @@
 #define rmb()		dsb(ld)
 #define wmb()		dsb(st)
 
+#define imb()		isb()
+
 #define dma_rmb()	dmb(oshld)
 #define dma_wmb()	dmb(oshst)
 
diff --git a/include/asm-generic/barrier.h b/include/asm-
generic/barrier.h
index 85b28eb80b11..d5a822fb3e92 100644
--- a/include/asm-generic/barrier.h
+++ b/include/asm-generic/barrier.h
@@ -46,6 +46,10 @@
 #define dma_wmb()	wmb()
 #endif
 
+#ifndef imb
+#define imb()		barrier()
+#endif
+
 #ifndef read_barrier_depends
 #define read_barrier_depends()		do { } while (0)
 #endif
-- 
2.20.1





^ permalink raw reply related	[flat|nested] 71+ messages in thread

* Re: [PATCH v3 04/13] task_isolation: userspace hard isolation from kernel
  2020-04-09 18:00       ` Andy Lutomirski
@ 2020-04-19  5:07         ` Alex Belits
  0 siblings, 0 replies; 71+ messages in thread
From: Alex Belits @ 2020-04-19  5:07 UTC (permalink / raw)
  To: luto
  Cc: mingo, davem, linux-api, rostedt, peterz, linux-arch,
	Prasun Kapoor, catalin.marinas, frederic, tglx, will,
	linux-kernel, netdev, linux-arm-kernel


On Thu, 2020-04-09 at 11:00 -0700, Andy Lutomirski wrote:
> 
> > 
> > Once the task has returned to userspace after issuing the prctl(),
> > if it enters the kernel again via system call, page fault, or any
> > other exception or irq, the kernel will kill it with SIGKILL.
> 
> I could easily imagine myself using task isolation, but not with the
> SIGKILL semantics. SIGKILL causes data loss. Please at least let
> users choose what signal to send.

This is already done, even though the documentation is not updated.
There is even a userspace library that deals with this while
compensating for possible race conditions on isolation entry and
automatic re-entry after isolation is broken: 
https://github.com/abelits/libtmc
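
(For illustration: a minimal userspace sketch of the enter/retry
pattern described in the cover letter. The PR_TASK_ISOLATION constants
come from this series' uapi headers; the values and the retry limit
below are placeholders, and this is not the libtmc implementation.)

#include <sys/prctl.h>

#ifndef PR_TASK_ISOLATION		/* placeholder values, not the real ABI */
#define PR_TASK_ISOLATION		48
#define PR_TASK_ISOLATION_ENABLE	1
#endif

static int enter_isolation(void)
{
	int i;

	/*
	 * The kernel may fail to quiesce everything on the first
	 * attempt; per the cover letter, userspace can simply retry.
	 */
	for (i = 0; i < 100; i++) {
		if (prctl(PR_TASK_ISOLATION, PR_TASK_ISOLATION_ENABLE,
			  0, 0, 0) == 0)
			return 0;	/* isolated until the next kernel entry */
	}
	return -1;
}

A real application would typically pin itself to an isolated
(nohz_full) CPU and finish all of its setup before calling this.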

-- 
Alex

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [EXT] Re: [PATCH v3 03/13] task_isolation: add instruction synchronization memory barrier
  2020-04-19  5:02         ` [EXT] " Alex Belits
@ 2020-04-20 12:23           ` Will Deacon
  2020-04-20 12:36             ` Mark Rutland
  2020-04-20 12:45           ` Mark Rutland
  1 sibling, 1 reply; 71+ messages in thread
From: Will Deacon @ 2020-04-20 12:23 UTC (permalink / raw)
  To: Alex Belits
  Cc: mark.rutland, mingo, davem, linux-api, rostedt, peterz,
	linux-arch, Prasun Kapoor, catalin.marinas, frederic, tglx,
	linux-kernel, netdev, linux-arm-kernel

On Sun, Apr 19, 2020 at 05:02:01AM +0000, Alex Belits wrote:
> On Wed, 2020-04-15 at 13:44 +0100, Mark Rutland wrote:
> > On Thu, Apr 09, 2020 at 03:17:40PM +0000, Alex Belits wrote:
> > > Some architectures implement memory synchronization instructions
> > > for
> > > instruction cache. Make a separate kind of barrier that calls them.
> > 
> > Modifying the instruction caches requries more than an ISB, and the
> > 'IMB' naming implies you're trying to order against memory accesses,
> > which isn't what ISB (generally) does.
> > 
> > What exactly do you want to use this for?
> 
> I guess, there should be different explanation and naming.
> 
> The intention is to have a separate barrier that causes cache
> synchronization event, for use in architecture-independent code. I am
> not sure, what exactly it should do to be implemented in architecture-
> independent manner, so it probably only makes sense along with a
> regular memory barrier.
> 
> The particular place where I had to use is the code that has to run
> after isolated task returns to the kernel. In the model that I propose
> for task isolation, remote context synchronization is skipped while
> task is in isolated in userspace (it doesn't run kernel, and kernel
> does not modify its userspace code, so it's harmless until entering the
> kernel).

> So it will skip the results of kick_all_cpus_sync() that was
> that was called from flush_icache_range() and other similar places.
> This means that once it's out of userspace, it should only run
> some "safe" kernel entry code, and then synchronize in some manner that
> avoids race conditions with possible IPIs intended for context
> synchronization that may happen at the same time. My next patch in the
> series uses it in that one place.
> 
> Synchronization will have to be implemented without a mandatory
> interrupt because it may be triggered locally, on the same CPU. On ARM,
> ISB is definitely necessary there, however I am not sure, how this
> should look like on x86 and other architectures. On ARM this probably
> still should be combined with a real memory barrier and cache
> synchronization, however I am not entirely sure about details. Would
> it make more sense to run DMB, IC and ISB? 

IIUC, we don't need to do anything on arm64 because taking an exception acts
as a context synchronization event, so I don't think you should try to
expose this as a new barrier macro. Instead, just make it a pre-requisite
that architectures need to ensure this behaviour when entering the kernel
from userspace if they are to select HAVE_ARCH_TASK_ISOLATION.

That way, it's /very/ similar to what we do for MEMBARRIER_SYNC_CORE, the
only real difference being that that is concerned with return-to-user rather
than entry-from-user.

See Documentation/features/sched/membarrier-sync-core/arch-support.txt

Will

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [EXT] Re: [PATCH v3 03/13] task_isolation: add instruction synchronization memory barrier
  2020-04-20 12:23           ` Will Deacon
@ 2020-04-20 12:36             ` Mark Rutland
  2020-04-20 13:55               ` Will Deacon
  0 siblings, 1 reply; 71+ messages in thread
From: Mark Rutland @ 2020-04-20 12:36 UTC (permalink / raw)
  To: Will Deacon
  Cc: Alex Belits, mingo, davem, linux-api, rostedt, peterz,
	linux-arch, Prasun Kapoor, catalin.marinas, frederic, tglx,
	linux-kernel, netdev, linux-arm-kernel

On Mon, Apr 20, 2020 at 01:23:51PM +0100, Will Deacon wrote:
> On Sun, Apr 19, 2020 at 05:02:01AM +0000, Alex Belits wrote:
> > On Wed, 2020-04-15 at 13:44 +0100, Mark Rutland wrote:
> > > On Thu, Apr 09, 2020 at 03:17:40PM +0000, Alex Belits wrote:
> > > > Some architectures implement memory synchronization instructions
> > > > for
> > > > instruction cache. Make a separate kind of barrier that calls them.
> > > 
> > > Modifying the instruction caches requries more than an ISB, and the
> > > 'IMB' naming implies you're trying to order against memory accesses,
> > > which isn't what ISB (generally) does.
> > > 
> > > What exactly do you want to use this for?
> > 
> > I guess, there should be different explanation and naming.
> > 
> > The intention is to have a separate barrier that causes cache
> > synchronization event, for use in architecture-independent code. I am
> > not sure, what exactly it should do to be implemented in architecture-
> > independent manner, so it probably only makes sense along with a
> > regular memory barrier.
> > 
> > The particular place where I had to use is the code that has to run
> > after isolated task returns to the kernel. In the model that I propose
> > for task isolation, remote context synchronization is skipped while
> > task is in isolated in userspace (it doesn't run kernel, and kernel
> > does not modify its userspace code, so it's harmless until entering the
> > kernel).
> 
> > So it will skip the results of kick_all_cpus_sync() that was
> > that was called from flush_icache_range() and other similar places.
> > This means that once it's out of userspace, it should only run
> > some "safe" kernel entry code, and then synchronize in some manner that
> > avoids race conditions with possible IPIs intended for context
> > synchronization that may happen at the same time. My next patch in the
> > series uses it in that one place.
> > 
> > Synchronization will have to be implemented without a mandatory
> > interrupt because it may be triggered locally, on the same CPU. On ARM,
> > ISB is definitely necessary there, however I am not sure, how this
> > should look like on x86 and other architectures. On ARM this probably
> > still should be combined with a real memory barrier and cache
> > synchronization, however I am not entirely sure about details. Would
> > it make more sense to run DMB, IC and ISB? 
> 
> IIUC, we don't need to do anything on arm64 because taking an exception acts
> as a context synchronization event, so I don't think you should try to
> expose this as a new barrier macro. Instead, just make it a pre-requisite
> that architectures need to ensure this behaviour when entering the kernel
> from userspace if they are to select HAVE_ARCH_TASK_ISOLATION.

The CSE from the exception isn't sufficient here, because it needs to
occur after the CPU has re-registered to receive IPIs for
kick_all_cpus_sync(). Otherwise there's a window between taking the
exception and re-registering where a necessary context synchronization
event can be missed. e.g.

CPU A				CPU B
[ Modifies some code ]		
				[ enters exception ]
[ D cache maintenance ]
[ I cache maintenance ]
[ IPI ]				// IPI not taken
  ...				[ register for IPI ] 
[ IPI completes ] 
				[ execute stale code here ]

However, I think 'IMB' is far too generic, and we should have an arch
hook specific to task isolation, as it's far less likely to be abused
than IMB would be.

Thanks,
Mark.
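
(Purely for illustration, such an isolation-specific hook could look
roughly like the following; the task_isolation_sync_core() name and the
arm64 definition are invented here to picture the suggestion and are
not part of the posted series.)

/* Generic fallback: no-op; architectures override as needed. */
#ifndef task_isolation_sync_core
#define task_isolation_sync_core()	do { } while (0)
#endif

/*
 * arm64: run a context synchronization event after the CPU has
 * re-registered for kick_all_cpus_sync() IPIs, so that any code
 * modified while the task was isolated is observed.
 */
#define task_isolation_sync_core()	isb()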

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [EXT] Re: [PATCH v3 03/13] task_isolation: add instruction synchronization memory barrier
  2020-04-19  5:02         ` [EXT] " Alex Belits
  2020-04-20 12:23           ` Will Deacon
@ 2020-04-20 12:45           ` Mark Rutland
  1 sibling, 0 replies; 71+ messages in thread
From: Mark Rutland @ 2020-04-20 12:45 UTC (permalink / raw)
  To: Alex Belits
  Cc: mingo, davem, linux-api, rostedt, peterz, linux-arch,
	Prasun Kapoor, catalin.marinas, frederic, tglx, will,
	linux-kernel, netdev, linux-arm-kernel

On Sun, Apr 19, 2020 at 05:02:01AM +0000, Alex Belits wrote:
> 
> On Wed, 2020-04-15 at 13:44 +0100, Mark Rutland wrote:
> > External Email
> > 
> > -------------------------------------------------------------------
> > ---
> > On Thu, Apr 09, 2020 at 03:17:40PM +0000, Alex Belits wrote:
> > > Some architectures implement memory synchronization instructions
> > > for
> > > instruction cache. Make a separate kind of barrier that calls them.
> > 
> > Modifying the instruction caches requries more than an ISB, and the
> > 'IMB' naming implies you're trying to order against memory accesses,
> > which isn't what ISB (generally) does.
> > 
> > What exactly do you want to use this for?
> 
> I guess, there should be different explanation and naming.
> 
> The intention is to have a separate barrier that causes cache
> synchronization event, for use in architecture-independent code. I am
> not sure, what exactly it should do to be implemented in architecture-
> independent manner, so it probably only makes sense along with a
> regular memory barrier.
> 
> The particular place where I had to use is the code that has to run
> after isolated task returns to the kernel. In the model that I propose
> for task isolation, remote context synchronization is skipped while
> task is in isolated in userspace (it doesn't run kernel, and kernel
> does not modify its userspace code, so it's harmless until entering the
> kernel). So it will skip the results of kick_all_cpus_sync() that was
> that was called from flush_icache_range() and other similar places.
> This means that once it's out of userspace, it should only run
> some "safe" kernel entry code, and then synchronize in some manner that
> avoids race conditions with possible IPIs intended for context
> synchronization that may happen at the same time. My next patch in the
> series uses it in that one place.
> 
> Synchronization will have to be implemented without a mandatory
> interrupt because it may be triggered locally, on the same CPU. On ARM,
> ISB is definitely necessary there, however I am not sure, how this
> should look like on x86 and other architectures. On ARM this probably
> still should be combined with a real memory barrier and cache
> synchronization, however I am not entirely sure about details. Would
> it make more sense to run DMB, IC and ISB? 

For the cases you mention above this really depends on how the new CPU
first synchronizes with the others, and what the scope of the "safe"
kernel entry code is.

Given that this is context-dependent, I think it would make more sense
for this to be an arch hook specific to task isolation rather than a
low-level common barrier.

Thanks,
Mark.

> 
> > 
> As-is, I don't think this makes sense as a generic barrier.
> 
> Thanks,
> Mark.
> 
> Signed-off-by: Alex Belits <abelits@marvell.com>
> ---
>  arch/arm/include/asm/barrier.h   | 2 ++
>  arch/arm64/include/asm/barrier.h | 2 ++
>  include/asm-generic/barrier.h    | 4 ++++
>  3 files changed, 8 insertions(+)
> 
> diff --git a/arch/arm/include/asm/barrier.h
> b/arch/arm/include/asm/barrier.h
> index 83ae97c049d9..6def62c95937 100644
> --- a/arch/arm/include/asm/barrier.h
> +++ b/arch/arm/include/asm/barrier.h
> @@ -64,12 +64,14 @@ extern void arm_heavy_mb(void);
>  #define mb()		__arm_heavy_mb()
>  #define rmb()		dsb()
>  #define wmb()		__arm_heavy_mb(st)
> +#define imb()		isb()
>  #define dma_rmb()	dmb(osh)
>  #define dma_wmb()	dmb(oshst)
>  #else
>  #define mb()		barrier()
>  #define rmb()		barrier()
>  #define wmb()		barrier()
> +#define imb()		barrier()
>  #define dma_rmb()	barrier()
>  #define dma_wmb()	barrier()
>  #endif
> diff --git a/arch/arm64/include/asm/barrier.h
> b/arch/arm64/include/asm/barrier.h
> index 7d9cc5ec4971..12a7dbd68bed 100644
> --- a/arch/arm64/include/asm/barrier.h
> +++ b/arch/arm64/include/asm/barrier.h
> @@ -45,6 +45,8 @@
>  #define rmb()		dsb(ld)
>  #define wmb()		dsb(st)
>  
> +#define imb()		isb()
> +
>  #define dma_rmb()	dmb(oshld)
>  #define dma_wmb()	dmb(oshst)
>  
> diff --git a/include/asm-generic/barrier.h b/include/asm-
> generic/barrier.h
> index 85b28eb80b11..d5a822fb3e92 100644
> --- a/include/asm-generic/barrier.h
> +++ b/include/asm-generic/barrier.h
> @@ -46,6 +46,10 @@
>  #define dma_wmb()	wmb()
>  #endif
>  
> +#ifndef imb
> +#define imb()		barrier()
> +#endif
> +
>  #ifndef read_barrier_depends
>  #define read_barrier_depends()		do { } while (0)
>  #endif
> -- 
> 2.20.1
> 
> 
> 
> 

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [EXT] Re: [PATCH v3 03/13] task_isolation: add instruction synchronization memory barrier
  2020-04-20 12:36             ` Mark Rutland
@ 2020-04-20 13:55               ` Will Deacon
  2020-04-21  7:41                 ` Will Deacon
  0 siblings, 1 reply; 71+ messages in thread
From: Will Deacon @ 2020-04-20 13:55 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Alex Belits, mingo, davem, linux-api, rostedt, peterz,
	linux-arch, Prasun Kapoor, catalin.marinas, frederic, tglx,
	linux-kernel, netdev, linux-arm-kernel

On Mon, Apr 20, 2020 at 01:36:28PM +0100, Mark Rutland wrote:
> On Mon, Apr 20, 2020 at 01:23:51PM +0100, Will Deacon wrote:
> > On Sun, Apr 19, 2020 at 05:02:01AM +0000, Alex Belits wrote:
> > > On Wed, 2020-04-15 at 13:44 +0100, Mark Rutland wrote:
> > > > On Thu, Apr 09, 2020 at 03:17:40PM +0000, Alex Belits wrote:
> > > > > Some architectures implement memory synchronization instructions
> > > > > for
> > > > > instruction cache. Make a separate kind of barrier that calls them.
> > > > 
> > > > Modifying the instruction caches requries more than an ISB, and the
> > > > 'IMB' naming implies you're trying to order against memory accesses,
> > > > which isn't what ISB (generally) does.
> > > > 
> > > > What exactly do you want to use this for?
> > > 
> > > I guess, there should be different explanation and naming.
> > > 
> > > The intention is to have a separate barrier that causes cache
> > > synchronization event, for use in architecture-independent code. I am
> > > not sure, what exactly it should do to be implemented in architecture-
> > > independent manner, so it probably only makes sense along with a
> > > regular memory barrier.
> > > 
> > > The particular place where I had to use is the code that has to run
> > > after isolated task returns to the kernel. In the model that I propose
> > > for task isolation, remote context synchronization is skipped while
> > > task is in isolated in userspace (it doesn't run kernel, and kernel
> > > does not modify its userspace code, so it's harmless until entering the
> > > kernel).
> > 
> > > So it will skip the results of kick_all_cpus_sync() that was
> > > that was called from flush_icache_range() and other similar places.
> > > This means that once it's out of userspace, it should only run
> > > some "safe" kernel entry code, and then synchronize in some manner that
> > > avoids race conditions with possible IPIs intended for context
> > > synchronization that may happen at the same time. My next patch in the
> > > series uses it in that one place.
> > > 
> > > Synchronization will have to be implemented without a mandatory
> > > interrupt because it may be triggered locally, on the same CPU. On ARM,
> > > ISB is definitely necessary there, however I am not sure, how this
> > > should look like on x86 and other architectures. On ARM this probably
> > > still should be combined with a real memory barrier and cache
> > > synchronization, however I am not entirely sure about details. Would
> > > it make more sense to run DMB, IC and ISB? 
> > 
> > IIUC, we don't need to do anything on arm64 because taking an exception acts
> > as a context synchronization event, so I don't think you should try to
> > expose this as a new barrier macro. Instead, just make it a pre-requisite
> > that architectures need to ensure this behaviour when entering the kernel
> > from userspace if they are to select HAVE_ARCH_TASK_ISOLATION.
> 
> The CSE from the exception isn't sufficient here, because it needs to
> occur after the CPU has re-registered to receive IPIs for
> kick_all_cpus_sync(). Otherwise there's a window between taking the
> exception and re-registering where a necessary context synchronization
> event can be missed. e.g.
> 
> CPU A				CPU B
> [ Modifies some code ]		
> 				[ enters exception ]
> [ D cache maintenance ]
> [ I cache maintenance ]
> [ IPI ]				// IPI not taken
>   ...				[ register for IPI ] 
> [ IPI completes ] 
> 				[ execute stale code here ]

Thanks.

> However, I think 'IMB' is far too generic, and we should have an arch
> hook specific to task isolation, as it's far less likely to be abused as
> IMB will.

What guarantees we don't run any unsynchronised module code between
exception entry and registering for the IPI? It seems like we'd want that
code to run as early as possible, e.g. as part of
task_isolation_user_exit() but that doesn't seem to be what's happening.

Will

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [EXT] Re: [PATCH v3 03/13] task_isolation: add instruction synchronization memory barrier
  2020-04-20 13:55               ` Will Deacon
@ 2020-04-21  7:41                 ` Will Deacon
  0 siblings, 0 replies; 71+ messages in thread
From: Will Deacon @ 2020-04-21  7:41 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Alex Belits, mingo, davem, linux-api, rostedt, peterz,
	linux-arch, Prasun Kapoor, catalin.marinas, frederic, tglx,
	linux-kernel, netdev, linux-arm-kernel

On Mon, Apr 20, 2020 at 02:55:23PM +0100, Will Deacon wrote:
> On Mon, Apr 20, 2020 at 01:36:28PM +0100, Mark Rutland wrote:
> > On Mon, Apr 20, 2020 at 01:23:51PM +0100, Will Deacon wrote:
> > > IIUC, we don't need to do anything on arm64 because taking an exception acts
> > > as a context synchronization event, so I don't think you should try to
> > > expose this as a new barrier macro. Instead, just make it a pre-requisite
> > > that architectures need to ensure this behaviour when entering the kernel
> > > from userspace if they are to select HAVE_ARCH_TASK_ISOLATION.
> > 
> > The CSE from the exception isn't sufficient here, because it needs to
> > occur after the CPU has re-registered to receive IPIs for
> > kick_all_cpus_sync(). Otherwise there's a window between taking the
> > exception and re-registering where a necessary context synchronization
> > event can be missed. e.g.
> > 
> > CPU A				CPU B
> > [ Modifies some code ]		
> > 				[ enters exception ]
> > [ D cache maintenance ]
> > [ I cache maintenance ]
> > [ IPI ]				// IPI not taken
> >   ...				[ register for IPI ] 
> > [ IPI completes ] 
> > 				[ execute stale code here ]
> 
> Thanks.
> 
> > However, I think 'IMB' is far too generic, and we should have an arch
> > hook specific to task isolation, as it's far less likely to be abused as
> > IMB will.
> 
> What guarantees we don't run any unsynchronised module code between
> exception entry and registering for the IPI? It seems like we'd want that
> code to run as early as possible, e.g. as part of
> task_isolation_user_exit() but that doesn't seem to be what's happening.

Sorry, I guess that's more a question for Alex.

Alex -- do you think we could move the "register for IPI" step earlier
so that it's easier to reason about the code that runs in the dead zone
during exception entry?

Will

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v3 07/13] task_isolation: arch/arm64: enable task isolation functionality
  2020-04-09 15:23     ` [PATCH v3 07/13] task_isolation: arch/arm64: " Alex Belits
@ 2020-04-22 12:08       ` Catalin Marinas
  0 siblings, 0 replies; 71+ messages in thread
From: Catalin Marinas @ 2020-04-22 12:08 UTC (permalink / raw)
  To: Alex Belits
  Cc: frederic, rostedt, Prasun Kapoor, mingo, davem, linux-api,
	peterz, linux-arch, tglx, will, linux-arm-kernel, linux-kernel,
	netdev

On Thu, Apr 09, 2020 at 03:23:35PM +0000, Alex Belits wrote:
> diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
> index f0cec4160136..7563098eb5b2 100644
> --- a/arch/arm64/include/asm/thread_info.h
> +++ b/arch/arm64/include/asm/thread_info.h
> @@ -63,6 +63,7 @@ void arch_release_task_struct(struct task_struct *tsk);
>  #define TIF_FOREIGN_FPSTATE	3	/* CPU's FP state is not current's */
>  #define TIF_UPROBE		4	/* uprobe breakpoint or singlestep */
>  #define TIF_FSCHECK		5	/* Check FS is USER_DS on return */
> +#define TIF_TASK_ISOLATION	6
>  #define TIF_NOHZ		7
>  #define TIF_SYSCALL_TRACE	8	/* syscall trace active */
>  #define TIF_SYSCALL_AUDIT	9	/* syscall auditing */
> @@ -83,6 +84,7 @@ void arch_release_task_struct(struct task_struct *tsk);
>  #define _TIF_NEED_RESCHED	(1 << TIF_NEED_RESCHED)
>  #define _TIF_NOTIFY_RESUME	(1 << TIF_NOTIFY_RESUME)
>  #define _TIF_FOREIGN_FPSTATE	(1 << TIF_FOREIGN_FPSTATE)
> +#define _TIF_TASK_ISOLATION	(1 << TIF_TASK_ISOLATION)
>  #define _TIF_NOHZ		(1 << TIF_NOHZ)
>  #define _TIF_SYSCALL_TRACE	(1 << TIF_SYSCALL_TRACE)
>  #define _TIF_SYSCALL_AUDIT	(1 << TIF_SYSCALL_AUDIT)
> @@ -96,7 +98,8 @@ void arch_release_task_struct(struct task_struct *tsk);
>  
>  #define _TIF_WORK_MASK		(_TIF_NEED_RESCHED | _TIF_SIGPENDING | \
>  				 _TIF_NOTIFY_RESUME | _TIF_FOREIGN_FPSTATE | \
> -				 _TIF_UPROBE | _TIF_FSCHECK)
> +				 _TIF_UPROBE | _TIF_FSCHECK | \
> +				 _TIF_TASK_ISOLATION)
>  
>  #define _TIF_SYSCALL_WORK	(_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \
>  				 _TIF_SYSCALL_TRACEPOINT | _TIF_SECCOMP | \
> diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
> index cd6e5fa48b9c..b35b9b0c594c 100644
> --- a/arch/arm64/kernel/ptrace.c
> +++ b/arch/arm64/kernel/ptrace.c
> @@ -29,6 +29,7 @@
>  #include <linux/regset.h>
>  #include <linux/tracehook.h>
>  #include <linux/elf.h>
> +#include <linux/isolation.h>
>  
>  #include <asm/compat.h>
>  #include <asm/cpufeature.h>
> @@ -1836,6 +1837,15 @@ int syscall_trace_enter(struct pt_regs *regs)
>  			return -1;
>  	}
>  
> +	/*
> +	 * In task isolation mode, we may prevent the syscall from
> +	 * running, and if so we also deliver a signal to the process.
> +	 */
> +	if (test_thread_flag(TIF_TASK_ISOLATION)) {
> +		if (task_isolation_syscall(regs->syscallno) == -1)
> +			return -1;
> +	}

Is this supposed to be called only when syscall tracing is enabled?
It only gets here if the task has any of the _TIF_SYSCALL_WORK flags.

> diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
> index 339882db5a91..d488c91a4877 100644
> --- a/arch/arm64/kernel/signal.c
> +++ b/arch/arm64/kernel/signal.c
> @@ -20,6 +20,7 @@
>  #include <linux/tracehook.h>
>  #include <linux/ratelimit.h>
>  #include <linux/syscalls.h>
> +#include <linux/isolation.h>
>  
>  #include <asm/daifflags.h>
>  #include <asm/debug-monitors.h>
> @@ -898,6 +899,11 @@ static void do_signal(struct pt_regs *regs)
>  	restore_saved_sigmask();
>  }
>  
> +#define NOTIFY_RESUME_LOOP_FLAGS \
> +	(_TIF_NEED_RESCHED | _TIF_SIGPENDING | \
> +	_TIF_NOTIFY_RESUME | _TIF_FOREIGN_FPSTATE | \
> +	_TIF_UPROBE | _TIF_FSCHECK)

AFAICT, that's just _TIF_WORK_MASK without _TIF_TASK_ISOLATION. I'd
rather not duplicate these, they are prone to get out of sync. You could
do something like:

#define NOTIFY_RESUME_LOOP_FLAGS (_TIF_WORK_MASK & ~_TIF_TASK_ISOLATION)

> +
>  asmlinkage void do_notify_resume(struct pt_regs *regs,
>  				 unsigned long thread_flags)
>  {
> @@ -908,6 +914,8 @@ asmlinkage void do_notify_resume(struct pt_regs *regs,
>  	 */
>  	trace_hardirqs_off();
>  
> +	task_isolation_check_run_cleanup();
> +
>  	do {
>  		/* Check valid user FS if needed */
>  		addr_limit_user_check();
> @@ -938,7 +946,10 @@ asmlinkage void do_notify_resume(struct pt_regs *regs,
>  
>  		local_daif_mask();
>  		thread_flags = READ_ONCE(current_thread_info()->flags);
> -	} while (thread_flags & _TIF_WORK_MASK);
> +	} while (thread_flags & NOTIFY_RESUME_LOOP_FLAGS);
> +
> +	if (thread_flags & _TIF_TASK_ISOLATION)
> +		task_isolation_start();
>  }
>  
>  unsigned long __ro_after_init signal_minsigstksz;

-- 
Catalin

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 03/12] task_isolation: userspace hard isolation from kernel
  2020-03-05 18:33   ` Frederic Weisbecker
  2020-03-08  5:32     ` [EXT] " Alex Belits
@ 2020-04-28 14:12     ` Marcelo Tosatti
  1 sibling, 0 replies; 71+ messages in thread
From: Marcelo Tosatti @ 2020-04-28 14:12 UTC (permalink / raw)
  To: Frederic Weisbecker, Alex Belits
  Cc: Alex Belits, rostedt, mingo, peterz, linux-kernel, Prasun Kapoor,
	tglx, linux-api, linux-mm, linux-arch


I like the idea as well, especially the reporting infrastructure, and 
would like to see something like this integrated upstream.

On Thu, Mar 05, 2020 at 07:33:13PM +0100, Frederic Weisbecker wrote:
> On Wed, Mar 04, 2020 at 04:07:12PM +0000, Alex Belits wrote:
> > The existing nohz_full mode is designed as a "soft" isolation mode
> > that makes tradeoffs to minimize userspace interruptions while
> > still attempting to avoid overheads in the kernel entry/exit path,
> > to provide 100% kernel semantics, etc.
> > 
> > However, some applications require a "hard" commitment from the
> > kernel to avoid interruptions, in particular userspace device driver
> > style applications, such as high-speed networking code.
> > 
> > This change introduces a framework to allow applications
> > to elect to have the "hard" semantics as needed, specifying
> > prctl(PR_TASK_ISOLATION, PR_TASK_ISOLATION_ENABLE) to do so.
> > 
> > The kernel must be built with the new TASK_ISOLATION Kconfig flag
> > to enable this mode, and the kernel booted with an appropriate
> > "isolcpus=nohz,domain,CPULIST" boot argument to enable
> > nohz_full and isolcpus. The "task_isolation" state is then indicated
> > by setting a new task struct field, task_isolation_flag, to the
> > value passed by prctl(), and also setting a TIF_TASK_ISOLATION
> > bit in the thread_info flags. When the kernel is returning to
> > userspace from the prctl() call and sees TIF_TASK_ISOLATION set,
> > it calls the new task_isolation_start() routine to arrange for
> > the task to avoid being interrupted in the future.
> > 
> > With interrupts disabled, task_isolation_start() ensures that kernel
> > subsystems that might cause a future interrupt are quiesced. If it
> > doesn't succeed, it adjusts the syscall return value to indicate that
> > fact, and userspace can retry as desired. In addition to stopping
> > the scheduler tick, the code takes any actions that might avoid
> > a future interrupt to the core, such as a worker thread being
> > scheduled that could be quiesced now (e.g. the vmstat worker)
> > or a future IPI to the core to clean up some state that could be
> > cleaned up now (e.g. the mm lru per-cpu cache).
> > 
> > Once the task has returned to userspace after issuing the prctl(),
> > if it enters the kernel again via system call, page fault, or any
> > other exception or irq, the kernel will kill it with SIGKILL.

This severely limits usage of the interface. 

I suppose the reason for blocking system calls is to make sure
userspace does not initiate actions that might generate interruptions,
such as IPI flushes (from memory unmaps or mapping changes) or vmstat
work items (from page dirtying). Or is there another reason for it?


+/* Only a few syscalls are valid once we are in task isolation mode. */
+static bool is_acceptable_syscall(int syscall)
+{
+       /* No need to incur an isolation signal if we are just exiting. */
+       if (syscall == __NR_exit || syscall == __NR_exit_group)
+               return true;
+       
+       /* Check to see if it's the prctl for isolation. */
+       if (syscall == __NR_prctl) {
+               unsigned long arg[SYSCALL_MAX_ARGS];
+       
+               syscall_get_arguments(current, current_pt_regs(), arg);
+               if (arg[0] == PR_TASK_ISOLATION)
+                       return true;
+       }
+ 
+       return false;
+}
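
As a usage illustration of the interface being discussed, a minimal userspace sketch of
the enter-isolation flow the changelog describes. The PR_TASK_ISOLATION constants and
the EAGAIN retry convention are assumptions based on the changelog text, not a confirmed
ABI; per the filter above, only prctl() and exit()/exit_group() remain acceptable once
isolation is entered.

	#include <sys/prctl.h>
	#include <errno.h>

	#ifndef PR_TASK_ISOLATION
	#define PR_TASK_ISOLATION		48	/* assumed value, not final ABI */
	#define PR_TASK_ISOLATION_ENABLE	1
	#endif

	static int enter_isolation(void)
	{
		/* Retry while the kernel still has quiescing work pending. */
		for (;;) {
			if (prctl(PR_TASK_ISOLATION, PR_TASK_ISOLATION_ENABLE,
				  0, 0, 0) == 0)
				return 0;	/* isolated; any syscall now breaks it */
			if (errno != EAGAIN)
				return -1;	/* hard failure, give up */
		}
	}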


> > In addition to sending a signal, the code supports a kernel
> > command-line "task_isolation_debug" flag which causes a stack
> > backtrace to be generated whenever a task loses isolation.
> > 
> > To allow the state to be entered and exited, the syscall checking
> > test ignores the prctl(PR_TASK_ISOLATION) syscall so that we can
> > clear the bit again later, and ignores exit/exit_group to allow
> > exiting the task without a pointless signal being delivered.
> > 
> > The prctl() API allows for specifying a signal number to use instead
> > of the default SIGKILL, to allow for catching the notification
> > signal; for example, in a production environment, it might be
> > helpful to log information to the application logging mechanism
> > before exiting. Or, the signal handler might choose to reset the
> > program counter back to the code segment intended to be run isolated
> > via prctl() to continue execution.
> 
> Hi Alex,
> 
> I'm glad this patchset is being resurrected.
> Reading the changelog, I like the general idea and the direction.
> The diff is a bit scary, but I'll check the patches in detail
> in the upcoming days.
> 
> > 
> > In a number of cases we can tell on a remote cpu that we are
> > going to be interrupting the cpu, e.g. via an IPI or a TLB flush.
> > In that case we generate the diagnostic (and optional stack dump)
> > on the remote core to be able to deliver better diagnostics.
> > If the interrupt is not something caught by Linux (e.g. a
> > hypervisor interrupt) we can also request a reschedule IPI to
> > be sent to the remote core so it can be sure to generate a
> > signal to notify the process.
> 
> I'm wondering if it's wise to run that on a guest at all :-)
> Or we would have to consider any guest exit to the host as a
> disturbance, and we would then need some sort of paravirt
> driver to notify that, etc. That doesn't sound appealing.
> 
> Thanks.
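
To make the quiescing step described in the changelog more concrete, a rough sketch of
what task_isolation_start() is expected to do on the return-to-userspace path. The
helpers and ordering are assumptions drawn from the patch titles in this series (e.g.
quiet_vmstat_sync() from patch 01), not taken from the posted patches themselves.

	/* Rough sketch only; runs with interrupts disabled on the exit path. */
	static int task_isolation_start_sketch(void)
	{
		lru_add_drain();		/* flush the mm lru per-cpu cache now   */
		quiet_vmstat_sync();		/* flush vmstat and quiesce its worker  */

		if (!tick_nohz_tick_stopped())	/* the scheduler tick must be off       */
			return -EAGAIN;		/* report failure so userspace retries  */

		return 0;
	}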


^ permalink raw reply	[flat|nested] 71+ messages in thread

end of thread, other threads:[~2020-04-28 15:28 UTC | newest]

Thread overview: 71+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-04 16:01 [PATCH 00/12] "Task_isolation" mode Alex Belits
2020-03-04 16:03 ` [PATCH 01/12] task_isolation: vmstat: add quiet_vmstat_sync function Alex Belits
2020-03-04 16:04 ` [PATCH 02/12] task_isolation: vmstat: add vmstat_idle function Alex Belits
2020-03-04 16:07 ` [PATCH 03/12] task_isolation: userspace hard isolation from kernel Alex Belits
2020-03-05 18:33   ` Frederic Weisbecker
2020-03-08  5:32     ` [EXT] " Alex Belits
2020-04-28 14:12     ` Marcelo Tosatti
2020-03-06 15:26   ` Frederic Weisbecker
2020-03-08  6:06     ` [EXT] " Alex Belits
2020-03-06 16:00   ` Frederic Weisbecker
2020-03-08  7:16     ` [EXT] " Alex Belits
2020-03-04 16:08 ` [PATCH 04/12] task_isolation: Add task isolation hooks to arch-independent code Alex Belits
2020-03-04 16:09 ` [PATCH 05/12] task_isolation: arch/x86: enable task isolation functionality Alex Belits
2020-03-04 16:10 ` [PATCH 06/12] task_isolation: arch/arm64: " Alex Belits
2020-03-04 16:31   ` Mark Rutland
2020-03-08  4:48     ` [EXT] " Alex Belits
2020-03-04 16:11 ` [PATCH 07/12] task_isolation: arch/arm: " Alex Belits
2020-03-04 16:12 ` [PATCH 08/12] task_isolation: don't interrupt CPUs with tick_nohz_full_kick_cpu() Alex Belits
2020-03-06 16:03   ` Frederic Weisbecker
2020-03-08  7:28     ` [EXT] " Alex Belits
2020-03-09  2:38       ` Frederic Weisbecker
2020-03-04 16:13 ` [PATCH 09/12] task_isolation: net: don't flush backlog on CPUs running isolated tasks Alex Belits
2020-03-04 16:14 ` [PATCH 10/12] task_isolation: ringbuffer: don't interrupt CPUs running isolated tasks on buffer resize Alex Belits
2020-03-04 16:15 ` [PATCH 11/12] task_isolation: kick_all_cpus_sync: don't kick isolated cpus Alex Belits
2020-03-06 15:34   ` Frederic Weisbecker
2020-03-08  6:48     ` [EXT] " Alex Belits
2020-03-09  2:28       ` Frederic Weisbecker
2020-03-04 16:16 ` [PATCH 12/12] task_isolation: CONFIG_TASK_ISOLATION prevents distribution of jobs to non-housekeeping CPUs Alex Belits
2020-03-08  3:42 ` [PATCH v2 00/12] "Task_isolation" mode Alex Belits
2020-03-08  3:44   ` [PATCH v2 01/12] task_isolation: vmstat: add quiet_vmstat_sync function Alex Belits
2020-03-08  3:46   ` [PATCH v2 02/12] task_isolation: vmstat: add vmstat_idle function Alex Belits
2020-03-08  3:47   ` [PATCH v2 03/12] task_isolation: userspace hard isolation from kernel Alex Belits
     [not found]     ` <20200307214254.7a8f6c22@hermes.lan>
2020-03-08  7:33       ` [EXT] " Alex Belits
2020-03-27  8:42     ` Marta Rybczynska
2020-04-06  4:31     ` Kevyn-Alexandre Paré
2020-04-06  4:43     ` Kevyn-Alexandre Paré
2020-03-08  3:48   ` [PATCH v2 04/12] task_isolation: Add task isolation hooks to arch-independent code Alex Belits
2020-03-08  3:49   ` [PATCH v2 05/12] task_isolation: arch/x86: enable task isolation functionality Alex Belits
2020-03-08  3:50   ` [PATCH v2 06/12] task_isolation: arch/arm64: " Alex Belits
2020-03-09 16:59     ` Mark Rutland
2020-03-08  3:52   ` [PATCH v2 07/12] task_isolation: arch/arm: " Alex Belits
2020-03-08  3:53   ` [PATCH v2 08/12] task_isolation: don't interrupt CPUs with tick_nohz_full_kick_cpu() Alex Belits
2020-03-08  3:54   ` [PATCH v2 09/12] task_isolation: net: don't flush backlog on CPUs running isolated tasks Alex Belits
2020-03-08  3:55   ` [PATCH v2 10/12] task_isolation: ringbuffer: don't interrupt CPUs running isolated tasks on buffer resize Alex Belits
2020-04-06  4:27     ` Kevyn-Alexandre Paré
2020-03-08  3:56   ` [PATCH v2 11/12] task_isolation: kick_all_cpus_sync: don't kick isolated cpus Alex Belits
2020-03-08  3:57   ` [PATCH v2 12/12] task_isolation: CONFIG_TASK_ISOLATION prevents distribution of jobs to non-housekeeping CPUs Alex Belits
2020-04-09 15:09   ` [PATCH v3 00/13] "Task_isolation" mode Alex Belits
2020-04-09 15:15     ` [PATCH 01/13] task_isolation: vmstat: add quiet_vmstat_sync function Alex Belits
2020-04-09 15:16     ` [PATCH 02/13] task_isolation: vmstat: add vmstat_idle function Alex Belits
2020-04-09 15:17     ` [PATCH v3 03/13] task_isolation: add instruction synchronization memory barrier Alex Belits
2020-04-15 12:44       ` Mark Rutland
2020-04-19  5:02         ` [EXT] " Alex Belits
2020-04-20 12:23           ` Will Deacon
2020-04-20 12:36             ` Mark Rutland
2020-04-20 13:55               ` Will Deacon
2020-04-21  7:41                 ` Will Deacon
2020-04-20 12:45           ` Mark Rutland
2020-04-09 15:20     ` [PATCH v3 04/13] task_isolation: userspace hard isolation from kernel Alex Belits
2020-04-09 18:00       ` Andy Lutomirski
2020-04-19  5:07         ` Alex Belits
2020-04-09 15:21     ` [PATCH 05/13] task_isolation: Add task isolation hooks to arch-independent code Alex Belits
2020-04-09 15:22     ` [PATCH 06/13] task_isolation: arch/x86: enable task isolation functionality Alex Belits
2020-04-09 15:23     ` [PATCH v3 07/13] task_isolation: arch/arm64: " Alex Belits
2020-04-22 12:08       ` Catalin Marinas
2020-04-09 15:24     ` [PATCH v3 08/13] task_isolation: arch/arm: " Alex Belits
2020-04-09 15:25     ` [PATCH v3 09/13] task_isolation: don't interrupt CPUs with tick_nohz_full_kick_cpu() Alex Belits
2020-04-09 15:26     ` [PATCH v3 10/13] task_isolation: net: don't flush backlog on CPUs running isolated tasks Alex Belits
2020-04-09 15:27     ` [PATCH v3 11/13] task_isolation: ringbuffer: don't interrupt CPUs running isolated tasks on buffer resize Alex Belits
2020-04-09 15:27     ` [PATCH v3 12/13] task_isolation: kick_all_cpus_sync: don't kick isolated cpus Alex Belits
2020-04-09 15:28     ` [PATCH v3 13/13] task_isolation: CONFIG_TASK_ISOLATION prevents distribution of jobs to non-housekeeping CPUs Alex Belits
