Linux-api Archive on lore.kernel.org
 help / color / Atom feed
From: Frederic Weisbecker <frederic@kernel.org>
To: Alex Belits <abelits@marvell.com>
Cc: "rostedt@goodmis.org" <rostedt@goodmis.org>,
	"mingo@kernel.org" <mingo@kernel.org>,
	"peterz@infradead.org" <peterz@infradead.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Prasun Kapoor <pkapoor@marvell.com>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"linux-api@vger.kernel.org" <linux-api@vger.kernel.org>,
	"linux-mm@vger.kernel.org" <linux-mm@vger.kernel.org>,
	"linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>
Subject: Re: [PATCH 03/12] task_isolation: userspace hard isolation from kernel
Date: Fri, 6 Mar 2020 16:26:33 +0100
Message-ID: <20200306152632.GB8590@lenoir> (raw)
In-Reply-To: <36d84b8dd168a38e6a56549dedc15dd6ebf8c09e.camel@marvell.com>

On Wed, Mar 04, 2020 at 04:07:12PM +0000, Alex Belits wrote:
> +
> +/*
> + * Print message prefixed with the description of the current (or
> + * last) isolated task on a given CPU. Intended for isolation breaking
> + * messages that include target task for the user's convenience.
> + *
> + * Messages produced with this function may have obsolete task
> + * information if isolated tasks managed to exit, start and enter
> + * isolation multiple times, or multiple tasks tried to enter
> + * isolation on the same CPU at once. For those unusual cases it would
> + * contain a valid description of the cause for isolation breaking and
> + * target CPU number, just not the correct description of which task
> + * ended up losing isolation.
> + */
> +int task_isolation_message(int cpu, int level, bool supp, const char *fmt, ...)
> +{
> +	struct isol_task_desc *desc;
> +	struct task_struct *task;
> +	va_list args;
> +	char buf_prefix[TASK_COMM_LEN + 20 + 3 * 20];
> +	char buf[200];
> +	int curr_cpu, ind_counter, ind_counter_old, ind;
> +
> +	curr_cpu = get_cpu();
> +	desc = &per_cpu(isol_task_descs, cpu);
> +	ind_counter = atomic_read(&desc->curr_index);
> +
> +	if (curr_cpu == cpu) {
> +		/*
> +		 * Message is for the current CPU so current
> +		 * task_struct should be used instead of cached
> +		 * information.
> +		 *
> +		 * Like in other diagnostic messages, if issued from
> +		 * interrupt context, current will be the interrupted
> +		 * task. Unlike other diagnostic messages, this is
> +		 * always relevant because the message is about
> +		 * interrupting a task.
> +		 */
> +		ind = ind_counter & 1;
> +		if (supp && desc->warned[ind]) {
> +			/*
> +			 * If supp is true, skip the message if the
> +			 * same task was mentioned in the message
> +			 * originated on remote CPU, and it did not
> +			 * re-enter isolated state since then (warned
> +			 * is true). Only local messages following
> +			 * remote messages, likely about the same
> +			 * isolation breaking event, are skipped to
> +			 * avoid duplication. If remote cause is
> +			 * immediately followed by a local one before
> +			 * isolation is broken, local cause is skipped
> +			 * from messages.
> +			 */
> +			put_cpu();
> +			return 0;
> +		}
> +		task = current;
> +		snprintf(buf_prefix, sizeof(buf_prefix),
> +			 "isolation %s/%d/%d (cpu %d)",
> +			 task->comm, task->tgid, task->pid, cpu);
> +		put_cpu();
> +	} else {
> +		/*
> +		 * Message is for remote CPU, use cached information.
> +		 */
> +		put_cpu();
> +		/*
> +		 * Make sure, index remained unchanged while data was
> +		 * copied. If it changed, data that was copied may be
> +		 * inconsistent because two updates in a sequence could
> +		 * overwrite the data while it was being read.
> +		 */
> +		do {
> +			/* Make sure we are reading up to date values */
> +			smp_mb();
> +			ind = ind_counter & 1;
> +			snprintf(buf_prefix, sizeof(buf_prefix),
> +				 "isolation %s/%d/%d (cpu %d)",
> +				 desc->comm[ind], desc->tgid[ind],
> +				 desc->pid[ind], cpu);
> +			desc->warned[ind] = true;
> +			ind_counter_old = ind_counter;
> +			/* Record the warned flag, then re-read descriptor */
> +			smp_mb();
> +			ind_counter = atomic_read(&desc->curr_index);
> +			/*
> +			 * If the counter changed, something was updated, so
> +			 * repeat everything to get the current data
> +			 */
> +		} while (ind_counter != ind_counter_old);
> +	}

So the need to log the fact we are sending an event to a remote CPU that *may be*
running an isolated task makes things very complicated and even racy.

How bad would it be to only log those interruptions once they land on the target?

Thanks.

  parent reply index

Thread overview: 71+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-04 16:01 [PATCH 00/12] "Task_isolation" mode Alex Belits
2020-03-04 16:03 ` [PATCH 01/12] task_isolation: vmstat: add quiet_vmstat_sync function Alex Belits
2020-03-04 16:04 ` [PATCH 02/12] task_isolation: vmstat: add vmstat_idle function Alex Belits
2020-03-04 16:07 ` [PATCH 03/12] task_isolation: userspace hard isolation from kernel Alex Belits
2020-03-05 18:33   ` Frederic Weisbecker
2020-03-08  5:32     ` [EXT] " Alex Belits
2020-04-28 14:12     ` Marcelo Tosatti
2020-03-06 15:26   ` Frederic Weisbecker [this message]
2020-03-08  6:06     ` [EXT] " Alex Belits
2020-03-06 16:00   ` Frederic Weisbecker
2020-03-08  7:16     ` [EXT] " Alex Belits
2020-03-04 16:08 ` [PATCH 04/12] task_isolation: Add task isolation hooks to arch-independent code Alex Belits
2020-03-04 16:09 ` [PATCH 05/12] task_isolation: arch/x86: enable task isolation functionality Alex Belits
2020-03-04 16:10 ` [PATCH 06/12] task_isolation: arch/arm64: " Alex Belits
2020-03-04 16:31   ` Mark Rutland
2020-03-08  4:48     ` [EXT] " Alex Belits
2020-03-04 16:11 ` [PATCH 07/12] task_isolation: arch/arm: " Alex Belits
2020-03-04 16:12 ` [PATCH 08/12] task_isolation: don't interrupt CPUs with tick_nohz_full_kick_cpu() Alex Belits
2020-03-06 16:03   ` Frederic Weisbecker
2020-03-08  7:28     ` [EXT] " Alex Belits
2020-03-09  2:38       ` Frederic Weisbecker
2020-03-04 16:13 ` [PATCH 09/12] task_isolation: net: don't flush backlog on CPUs running isolated tasks Alex Belits
2020-03-04 16:14 ` [PATCH 10/12] task_isolation: ringbuffer: don't interrupt CPUs running isolated tasks on buffer resize Alex Belits
2020-03-04 16:15 ` [PATCH 11/12] task_isolation: kick_all_cpus_sync: don't kick isolated cpus Alex Belits
2020-03-06 15:34   ` Frederic Weisbecker
2020-03-08  6:48     ` [EXT] " Alex Belits
2020-03-09  2:28       ` Frederic Weisbecker
2020-03-04 16:16 ` [PATCH 12/12] task_isolation: CONFIG_TASK_ISOLATION prevents distribution of jobs to non-housekeeping CPUs Alex Belits
2020-03-08  3:42 ` [PATCH v2 00/12] "Task_isolation" mode Alex Belits
2020-03-08  3:44   ` [PATCH v2 01/12] task_isolation: vmstat: add quiet_vmstat_sync function Alex Belits
2020-03-08  3:46   ` [PATCH v2 02/12] task_isolation: vmstat: add vmstat_idle function Alex Belits
2020-03-08  3:47   ` [PATCH v2 03/12] task_isolation: userspace hard isolation from kernel Alex Belits
     [not found]     ` <20200307214254.7a8f6c22@hermes.lan>
2020-03-08  7:33       ` [EXT] " Alex Belits
2020-03-27  8:42     ` Marta Rybczynska
2020-04-06  4:31     ` Kevyn-Alexandre Paré
2020-04-06  4:43     ` Kevyn-Alexandre Paré
2020-03-08  3:48   ` [PATCH v2 04/12] task_isolation: Add task isolation hooks to arch-independent code Alex Belits
2020-03-08  3:49   ` [PATCH v2 05/12] task_isolation: arch/x86: enable task isolation functionality Alex Belits
2020-03-08  3:50   ` [PATCH v2 06/12] task_isolation: arch/arm64: " Alex Belits
2020-03-09 16:59     ` Mark Rutland
2020-03-08  3:52   ` [PATCH v2 07/12] task_isolation: arch/arm: " Alex Belits
2020-03-08  3:53   ` [PATCH v2 08/12] task_isolation: don't interrupt CPUs with tick_nohz_full_kick_cpu() Alex Belits
2020-03-08  3:54   ` [PATCH v2 09/12] task_isolation: net: don't flush backlog on CPUs running isolated tasks Alex Belits
2020-03-08  3:55   ` [PATCH v2 10/12] task_isolation: ringbuffer: don't interrupt CPUs running isolated tasks on buffer resize Alex Belits
2020-04-06  4:27     ` Kevyn-Alexandre Paré
2020-03-08  3:56   ` [PATCH v2 11/12] task_isolation: kick_all_cpus_sync: don't kick isolated cpus Alex Belits
2020-03-08  3:57   ` [PATCH v2 12/12] task_isolation: CONFIG_TASK_ISOLATION prevents distribution of jobs to non-housekeeping CPUs Alex Belits
2020-04-09 15:09   ` [PATCH v3 00/13] "Task_isolation" mode Alex Belits
2020-04-09 15:15     ` [PATCH 01/13] task_isolation: vmstat: add quiet_vmstat_sync function Alex Belits
2020-04-09 15:16     ` [PATCH 02/13] task_isolation: vmstat: add vmstat_idle function Alex Belits
2020-04-09 15:17     ` [PATCH v3 03/13] task_isolation: add instruction synchronization memory barrier Alex Belits
2020-04-15 12:44       ` Mark Rutland
2020-04-19  5:02         ` [EXT] " Alex Belits
2020-04-20 12:23           ` Will Deacon
2020-04-20 12:36             ` Mark Rutland
2020-04-20 13:55               ` Will Deacon
2020-04-21  7:41                 ` Will Deacon
2020-04-20 12:45           ` Mark Rutland
2020-04-09 15:20     ` [PATCH v3 04/13] task_isolation: userspace hard isolation from kernel Alex Belits
2020-04-09 18:00       ` Andy Lutomirski
2020-04-19  5:07         ` Alex Belits
2020-04-09 15:21     ` [PATCH 05/13] task_isolation: Add task isolation hooks to arch-independent code Alex Belits
2020-04-09 15:22     ` [PATCH 06/13] task_isolation: arch/x86: enable task isolation functionality Alex Belits
2020-04-09 15:23     ` [PATCH v3 07/13] task_isolation: arch/arm64: " Alex Belits
2020-04-22 12:08       ` Catalin Marinas
2020-04-09 15:24     ` [PATCH v3 08/13] task_isolation: arch/arm: " Alex Belits
2020-04-09 15:25     ` [PATCH v3 09/13] task_isolation: don't interrupt CPUs with tick_nohz_full_kick_cpu() Alex Belits
2020-04-09 15:26     ` [PATCH v3 10/13] task_isolation: net: don't flush backlog on CPUs running isolated tasks Alex Belits
2020-04-09 15:27     ` [PATCH v3 11/13] task_isolation: ringbuffer: don't interrupt CPUs running isolated tasks on buffer resize Alex Belits
2020-04-09 15:27     ` [PATCH v3 12/13] task_isolation: kick_all_cpus_sync: don't kick isolated cpus Alex Belits
2020-04-09 15:28     ` [PATCH v3 13/13] task_isolation: CONFIG_TASK_ISOLATION prevents distribution of jobs to non-housekeeping CPUs Alex Belits

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200306152632.GB8590@lenoir \
    --to=frederic@kernel.org \
    --cc=abelits@marvell.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=pkapoor@marvell.com \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-api Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-api/0 linux-api/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-api linux-api/ https://lore.kernel.org/linux-api \
		linux-api@vger.kernel.org
	public-inbox-index linux-api

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-api


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git