linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Andi Kleen <ak@muc.de>
To: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
	hch@infradead.org
Subject: Re: [patch 05/10] Linux Kernel Markers - i386 optimized version
Date: 10 May 2007 11:06:56 +0200
Date: Thu, 10 May 2007 11:06:56 +0200	[thread overview]
Message-ID: <20070510090656.GA57297@muc.de> (raw)
In-Reply-To: <20070510020916.508519573@polymtl.ca>

On Wed, May 09, 2007 at 09:56:00PM -0400, Mathieu Desnoyers wrote:
> @@ -0,0 +1,99 @@
> +/* marker.c
> + *
> + * Erratum 49 fix for Intel PIII and higher.

Errata are CPU specific so they can't be higher. You mean it's a P3
erratum only?

In general you need some more description why the int3 handler is needed.

> + *
> + * Permits marker activation by XMC with correct serialization.

This doesn't follow the Intel recommended algorithm for cross modifying
code (7.1.3 in the IA32 manual) which requires the other CPUs to spin
during the modification.

If you used that would you really need the INT3 handler?

> +static DEFINE_MUTEX(mark_mutex);
> +static long target_eip = 0;
> +
> +static void mark_synchronize_core(void *info)
> +{
> +	sync_core();	/* use cpuid to stop speculative execution */
> +}
> +
> +/* We simply skip the 2 bytes load immediate here, leaving the register in an
> + * undefined state. We don't care about the content (0 or !0), because we are
> + * changing the value 0->1 or 1->0. This small window of undefined value
> + * doesn't matter.
> + */
> +static int mark_notifier(struct notifier_block *nb,
> +	unsigned long val, void *data)
> +{
> +	enum die_val die_val = (enum die_val) val;
> +	struct die_args *args = (struct die_args *)data;
> +
> +	if (!args->regs || user_mode_vm(args->regs))

I don't think regs should be ever NULL

> +		return NOTIFY_DONE;
> +
> +	if (die_val == DIE_INT3	&& args->regs->eip == target_eip) {
> +		args->regs->eip += 1; /* Skip the next byte of load immediate */

In what instruction results that then? This is definitely underdocumented.

> +int marker_optimized_set_enable(void *address, char enable)
> +{
> +	char saved_byte;
> +	int ret;
> +	char *dest = address;
> +
> +	if (!(enable ^ dest[1])) /* Must be a state change 0<->1 to execute */
> +		return 0;

Wouldn't you need that inside the mutex too to avoid races?


> +
> +	mutex_lock(&mark_mutex);
> +	target_eip = (long)address + BREAKPOINT_INS_LEN;
> +	/* register_die_notifier has memory barriers */
> +	register_die_notifier(&mark_notify);
> +	saved_byte = *dest;
> +	*dest = BREAKPOINT_INSTRUCTION;
> +	wmb();

wmb is a nop

> +	/* Execute serializing instruction on each CPU.
> +	 * Acts as a memory barrier. */
> +	ret = on_each_cpu(mark_synchronize_core, NULL, 1, 1);
> +	BUG_ON(ret != 0);
> +
> +	dest[1] = enable;
> +	wmb();
> +	*dest = saved_byte;
> +		/* Wait for all int3 handlers to end
> +		   (interrupts are disabled in int3).
> +		   This CPU is clearly not in a int3 handler
> +		   (not preemptible).

So what happens when the kernel is preemptible? 

> +		   synchronize_sched has memory barriers */
> +	synchronize_sched();
> +	unregister_die_notifier(&mark_notify);
> +	/* unregister_die_notifier has memory barriers */
> +	target_eip = 0;
> +	mutex_unlock(&mark_mutex);
> +	flush_icache_range(address, size);

That's a nop on x86

> +
> +#ifdef CONFIG_MARKERS
> +
> +#define MF_DEFAULT (MF_OPTIMIZED | MF_LOCKDEP | MF_PRINTK)
> +
> +/* Optimized version of the markers */
> +#define trace_mark_optimized(flags, name, format, args...) \
> +	do { \
> +		static const char __mstrtab_name_##name[] \
> +		__attribute__((section("__markers_strings"))) \
> +		= #name; \
> +		static const char __mstrtab_format_##name[] \
> +		__attribute__((section("__markers_strings"))) \
> +		= format; \
> +		static const char __mstrtab_args_##name[] \
> +		__attribute__((section("__markers_strings"))) \
> +		= #args; \

For what do you need special string sections?  

If it's something to be read by external programs the interface
would need to be clearly defined and commented.
If not just use normal variables.

> +		static struct __mark_marker_data __mark_data_##name \
> +		__attribute__((section("__markers_data"))) = \
> +		{ __mstrtab_name_##name,  __mstrtab_format_##name, \
> +		__mstrtab_args_##name, \
> +		(flags) | MF_OPTIMIZED, __mark_empty_function, NULL }; \
> +		char condition; \
> +		asm volatile(	".section __markers, \"a\", @progbits;\n\t" \

The volatile is not needed because the output is used.

> +					".long %1, 0f;\n\t" \
> +					".previous;\n\t" \
> +					".align 2\n\t" \
> +					"0:\n\t" \
> +					"movb $0,%0;\n\t" \

This should be a generic facility in a separate include file / separate 
macros etc so that it can be used by other code too.

> +				: "=r" (condition) \
> +				: "m" (__mark_data_##name)); \
> +		__mark_check_format(format, ## args); \
> +		if (likely(!condition)) { \
> +		} else { \
> +			preempt_disable(); \

Why the preempt disable here and why can't it be in the hook?

> +			(*__mark_data_##name.call)(&__mark_data_##name, \
> +						format, ## args); \
> +			preempt_enable(); \
> +		} \
> +	} while (0)
> +
> +/* Marker macro selecting the generic or optimized version of marker, depending
> + * on the flags specified. */
> +#define _trace_mark(flags, format, args...) \
> +do { \
> +	if (((flags) & MF_LOCKDEP) && ((flags) & MF_OPTIMIZED)) \
> +		trace_mark_optimized(flags, format, ## args); \
> +	else \
> +		trace_mark_generic(flags, format, ## args); \

Is there ever a good reason not to use optimized markers? 

> + * bytes. */
> +#define MARK_OPTIMIZED_ENABLE_IMMEDIATE_OFFSET 1
> +#define MARK_OPTIMIZED_ENABLE_TYPE char

unsigned char is usually safer to avoid sign extension

-Andi

  reply	other threads:[~2007-05-10  9:07 UTC|newest]

Thread overview: 61+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-05-10  1:55 [patch 00/10] Linux Kernel Markers for 2.6.21-mm2 Mathieu Desnoyers
2007-05-10  1:55 ` [patch 01/10] Linux Kernel Markers - Add kconfig menus for the marker code Mathieu Desnoyers
2007-05-10  6:57   ` Christoph Hellwig
2007-05-10  1:55 ` [patch 02/10] Linux Kernel Markers, architecture independent code Mathieu Desnoyers
2007-05-10  5:10   ` Alexey Dobriyan
2007-05-10 12:58     ` Mathieu Desnoyers
2007-05-10 13:12     ` Mathieu Desnoyers
2007-05-10 19:00       ` Alexey Dobriyan
2007-05-10 19:46         ` Mathieu Desnoyers
2007-05-10  1:55 ` [patch 03/10] Allow userspace applications to use marker.h to parse the markers section in the kernel binary Mathieu Desnoyers
2007-05-10  6:51   ` Christoph Hellwig
2007-05-10 22:14     ` David Smith
2007-06-23  8:09       ` Christoph Hellwig
2007-06-23  9:25         ` Alan Cox
2007-06-23  9:32           ` Christoph Hellwig
2007-06-23  9:49             ` Alan Cox
2007-06-23 10:06               ` Christoph Hellwig
2007-06-23 14:55                 ` Alan Cox
2007-05-10  1:55 ` [patch 04/10] Linux Kernel Markers - PowerPC optimized version Mathieu Desnoyers
2007-05-10  6:57   ` Christoph Hellwig
2007-05-10  1:56 ` [patch 05/10] Linux Kernel Markers - i386 " Mathieu Desnoyers
2007-05-10  9:06   ` Andi Kleen [this message]
2007-05-10 15:55     ` Mathieu Desnoyers
2007-05-10 16:28       ` Alan Cox
2007-05-10 16:59         ` Mathieu Desnoyers
2007-05-11  4:57           ` Ananth N Mavinakayanahalli
2007-05-11 18:55             ` Mathieu Desnoyers
2007-05-12  5:29             ` Suparna Bhattacharya
2007-05-11  6:04           ` Andi Kleen
2007-05-11 18:02             ` Mathieu Desnoyers
2007-05-11 21:56               ` Alan Cox
2007-05-13 15:20                 ` Mathieu Desnoyers
2007-05-10  1:56 ` [patch 06/10] Linux Kernel Markers - Non optimized architectures Mathieu Desnoyers
2007-05-10  5:13   ` Alexey Dobriyan
2007-05-10  6:56   ` Christoph Hellwig
2007-05-10 13:11     ` Mathieu Desnoyers
2007-05-10 13:40       ` Alan Cox
2007-05-10 14:25         ` Mathieu Desnoyers
2007-05-10 15:33         ` Nicholas Berry
2007-05-10 16:09           ` Alan Cox
2007-05-10  1:56 ` [patch 07/10] Linux Kernel Markers - Documentation Mathieu Desnoyers
2007-05-10  6:58   ` Christoph Hellwig
2007-05-10 11:41     ` Alan Cox
2007-05-10 11:41       ` Christoph Hellwig
2007-05-10 12:48         ` Alan Cox
2007-05-10 12:52           ` Pekka Enberg
2007-05-10 13:04             ` Alan Cox
2007-05-10 13:16               ` Pekka J Enberg
2007-05-10 13:43                 ` Alan Cox
2007-05-10 14:04                   ` Pekka J Enberg
2007-05-10 14:12     ` Mathieu Desnoyers
2007-05-10 14:14     ` Mathieu Desnoyers
2007-05-11 15:05     ` Valdis.Kletnieks
2007-05-10 12:00   ` Christoph Hellwig
2007-05-10 15:51   ` Scott Preece
2007-05-10  1:56 ` [patch 08/10] Defines the linker macro EXTRA_RWDATA for the marker data section Mathieu Desnoyers
2007-05-10  1:56 ` [patch 09/10] Linux Kernel Markers - Use EXTRA_RWDATA in architectures Mathieu Desnoyers
2007-05-10  1:56 ` [patch 10/10] Port of blktrace to the Linux Kernel Markers Mathieu Desnoyers
2007-05-10  6:53   ` Christoph Hellwig
2007-05-10  9:20   ` Jens Axboe
2007-05-10  2:30 ` [patch 00/10] Linux Kernel Markers for 2.6.21-mm2 Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20070510090656.GA57297@muc.de \
    --to=ak@muc.de \
    --cc=akpm@linux-foundation.org \
    --cc=hch@infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@polymtl.ca \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).