LKML Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH v2 0/8] CR4 handling improvements
@ 2014-10-24 22:58 Andy Lutomirski
  2014-10-24 22:58 ` [PATCH v2 1/8] perf: Clean up pmu::event_idx Andy Lutomirski
                   ` (9 more replies)
  0 siblings, 10 replies; 39+ messages in thread
From: Andy Lutomirski @ 2014-10-24 22:58 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Valdis Kletnieks, linux-kernel, Paul Mackerras,
	Arnaldo Carvalho de Melo, Ingo Molnar, Kees Cook,
	Andrea Arcangeli, Vince Weaver, hillf.zj, Andy Lutomirski

This little series tightens up rdpmc permissions.  With it applied,
rdpmc can only be used if a perf_event is actually mmapped.  For now,
this is only really useful for seccomp.

At some point this could be further tightened up to only allow rdpmc
if an actual self-monitoring perf event that is compatible with
rdpmc is mapped.

This should add <50ns to context switches between rdpmc-capable and
rdpmc-non-capable mms.  I suspect that this is well under 10%
overhead, given that perf already adds some context switch latency.

I think that patches 1-3 are a good idea regardless of any rdpmc changes.

AMD Uncore userspace rdpmc is broken by these patches (cap_user_rdpmc
will be zero), but it was broken anyway.

Changes from v1 (aka RFC):
 - Rebased on top of the KVM CR4 fix.  This applies to a very recent -linus.
 - Renamed the cr4 helpers (Peter, Borislav)
 - Fixed buggy cr4 helpers (Hilf)
 - Improved lots of comments (everyone)
 - Renamed read_cr4 and write_cr4 to make sure I didn't miss anything.
   (NB: This will introduce conflicts with Andi's FSGSBASE work.  This is
    a good thing.)

Andy Lutomirski (7):
  x86: Clean up cr4 manipulation
  x86: Store a per-cpu shadow copy of CR4
  x86: Add a comment clarifying LDT context switching
  perf: Add pmu callbacks to track event mapping and unmapping
  perf: Pass the event to arch_perf_update_userpage
  x86, perf: Only allow rdpmc if a perf_event is mapped
  x86, perf: Add /sys/devices/cpu/rdpmc=2 to allow rdpmc for all tasks

Peter Zijlstra (1):
  perf: Clean up pmu::event_idx

 arch/powerpc/perf/hv-24x7.c          |  6 ---
 arch/powerpc/perf/hv-gpci.c          |  6 ---
 arch/s390/kernel/perf_cpum_sf.c      |  6 ---
 arch/x86/include/asm/mmu.h           |  2 +
 arch/x86/include/asm/mmu_context.h   | 32 ++++++++++++++-
 arch/x86/include/asm/paravirt.h      |  6 +--
 arch/x86/include/asm/processor.h     | 33 ----------------
 arch/x86/include/asm/special_insns.h |  6 +--
 arch/x86/include/asm/tlbflush.h      | 77 ++++++++++++++++++++++++++++++++----
 arch/x86/include/asm/virtext.h       |  5 ++-
 arch/x86/kernel/acpi/sleep.c         |  2 +-
 arch/x86/kernel/cpu/common.c         | 17 +++++---
 arch/x86/kernel/cpu/mcheck/mce.c     |  3 +-
 arch/x86/kernel/cpu/mcheck/p5.c      |  3 +-
 arch/x86/kernel/cpu/mcheck/winchip.c |  3 +-
 arch/x86/kernel/cpu/mtrr/cyrix.c     |  6 +--
 arch/x86/kernel/cpu/mtrr/generic.c   |  6 +--
 arch/x86/kernel/cpu/perf_event.c     | 76 ++++++++++++++++++++++++++---------
 arch/x86/kernel/cpu/perf_event.h     |  2 +
 arch/x86/kernel/head32.c             |  1 +
 arch/x86/kernel/head64.c             |  2 +
 arch/x86/kernel/i387.c               |  3 +-
 arch/x86/kernel/process.c            |  5 ++-
 arch/x86/kernel/process_32.c         |  2 +-
 arch/x86/kernel/process_64.c         |  2 +-
 arch/x86/kernel/setup.c              |  2 +-
 arch/x86/kernel/xsave.c              |  3 +-
 arch/x86/kvm/svm.c                   |  2 +-
 arch/x86/kvm/vmx.c                   | 10 ++---
 arch/x86/mm/fault.c                  |  2 +-
 arch/x86/mm/init.c                   | 12 +++++-
 arch/x86/mm/tlb.c                    |  3 --
 arch/x86/power/cpu.c                 | 11 ++----
 arch/x86/realmode/init.c             |  2 +-
 arch/x86/xen/enlighten.c             |  4 +-
 drivers/lguest/x86/core.c            |  4 +-
 include/linux/perf_event.h           |  7 ++++
 kernel/events/core.c                 | 29 ++++++--------
 kernel/events/hw_breakpoint.c        |  7 ----
 39 files changed, 256 insertions(+), 154 deletions(-)

-- 
1.9.3


^ permalink raw reply	[flat|nested] 39+ messages in thread
* Re: [PATCH v2 7/8] x86, perf: Only allow rdpmc if a perf_event is mapped
@ 2014-10-27  7:16 张静(长谷)
  2014-10-27 15:45 ` Andy Lutomirski
  0 siblings, 1 reply; 39+ messages in thread
From: 张静(长谷) @ 2014-10-27  7:16 UTC (permalink / raw)
  To: 'Andy Lutomirski', 'Peter Zijlstra'
  Cc: 'Valdis Kletnieks',
	linux-kernel, 'Paul Mackerras',
	'Arnaldo Carvalho de Melo', 'Ingo Molnar',
	'Kees Cook', 'Andrea Arcangeli',
	'Vince Weaver'

> 
> We currently allow any process to use rdpmc.  This significantly
> weakens the protection offered by PR_TSC_DISABLED, and it could be
> helpful to users attempting to exploit timing attacks.
> 
> Since we can't enable access to individual counters, use a very
> coarse heuristic to limit access to rdpmc: allow access only when
> a perf_event is mmapped.  This protects seccomp sandboxes.
> 
> There is plenty of room to further tighen these restrictions.  For
> example, this allows rdpmc for any x86_pmu event, but it's only
> useful for self-monitoring tasks.
> 
> As a side effect, cap_user_rdpmc will now be false for AMD uncore
> events.  This isn't a real regression, since .event_idx is disabled
> for these events anyway for the time being.  Whenever that gets
> re-added, the cap_user_rdpmc code can be adjusted or refactored
> accordingly.
> 
> Signed-off-by: Andy Lutomirski <luto@amacapital.net>
> ---
>  arch/x86/include/asm/mmu.h         |  2 ++
>  arch/x86/include/asm/mmu_context.h | 16 +++++++++++
>  arch/x86/kernel/cpu/perf_event.c   | 57 +++++++++++++++++++++++++-------------
>  arch/x86/kernel/cpu/perf_event.h   |  2 ++
>  4 files changed, 58 insertions(+), 19 deletions(-)
> 
> diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h
> index 876e74e8eec7..09b9620a73b4 100644
> --- a/arch/x86/include/asm/mmu.h
> +++ b/arch/x86/include/asm/mmu.h
> @@ -19,6 +19,8 @@ typedef struct {
> 
>  	struct mutex lock;
>  	void __user *vdso;
> +
> +	atomic_t perf_rdpmc_allowed;	/* nonzero if rdpmc is allowed */
>  } mm_context_t;
> 
>  #ifdef CONFIG_SMP
> diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
> index 23697f74b372..ccad8d616038 100644
> --- a/arch/x86/include/asm/mmu_context.h
> +++ b/arch/x86/include/asm/mmu_context.h
> @@ -19,6 +19,18 @@ static inline void paravirt_activate_mm(struct mm_struct *prev,
>  }
>  #endif	/* !CONFIG_PARAVIRT */
> 
> +#ifdef CONFIG_PERF_EVENTS
> +static inline void load_mm_cr4(struct mm_struct *mm)
> +{
> +	if (atomic_read(&mm->context.perf_rdpmc_allowed))
> +		cr4_set_bits(X86_CR4_PCE);
> +	else
> +		cr4_clear_bits(X86_CR4_PCE);
> +}
> +#else
> +static inline void load_mm_cr4(struct mm_struct *mm) {}
> +#endif
> +
>  /*
>   * Used for LDT copy/destruction.
>   */
> @@ -53,6 +65,9 @@ static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
>  		/* Stop flush ipis for the previous mm */
>  		cpumask_clear_cpu(cpu, mm_cpumask(prev));
> 
> +		/* Load per-mm CR4 state */
> +		load_mm_cr4(next);
> +
>  		/*
>  		 * Load the LDT, if the LDT is different.
>  		 *
> @@ -88,6 +103,7 @@ static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
>  			 */
>  			load_cr3(next->pgd);
>  			trace_tlb_flush(TLB_FLUSH_ON_TASK_SWITCH, TLB_FLUSH_ALL);
> +			load_mm_cr4(next);
>  			load_LDT_nolock(&next->context);
>  		}
>  	}
> diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
> index 00fbab7aa587..3e875b3b30f2 100644
> --- a/arch/x86/kernel/cpu/perf_event.c
> +++ b/arch/x86/kernel/cpu/perf_event.c
> @@ -31,6 +31,7 @@
>  #include <asm/nmi.h>
>  #include <asm/smp.h>
>  #include <asm/alternative.h>
> +#include <asm/mmu_context.h>
>  #include <asm/tlbflush.h>
>  #include <asm/timer.h>
>  #include <asm/desc.h>
> @@ -1336,8 +1337,6 @@ x86_pmu_notifier(struct notifier_block *self, unsigned long action, void *hcpu)
>  		break;
> 
>  	case CPU_STARTING:
> -		if (x86_pmu.attr_rdpmc)
> -			cr4_set_bits(X86_CR4_PCE);
>  		if (x86_pmu.cpu_starting)
>  			x86_pmu.cpu_starting(cpu);
>  		break;
> @@ -1813,14 +1812,44 @@ static int x86_pmu_event_init(struct perf_event *event)
>  			event->destroy(event);
>  	}
> 
> +	if (ACCESS_ONCE(x86_pmu.attr_rdpmc))
> +		event->hw.flags |= PERF_X86_EVENT_RDPMC_ALLOWED;
> +
>  	return err;
>  }
> 
> +static void refresh_pce(void *ignored)
> +{
> +	if (current->mm)
> +		load_mm_cr4(current->mm);
> +}
> +
> +static void x86_pmu_event_mapped(struct perf_event *event)
> +{
> +	if (!(event->hw.flags & PERF_X86_EVENT_RDPMC_ALLOWED))
> +		return;
> +
> +	if (atomic_inc_return(&current->mm->context.perf_rdpmc_allowed) == 1)
> +		on_each_cpu_mask(mm_cpumask(current->mm), refresh_pce, NULL, 1);
> +}
> +
> +static void x86_pmu_event_unmapped(struct perf_event *event)
> +{
> +	if (!current->mm)
> +		return;
> +
> +	if (!(event->hw.flags & PERF_X86_EVENT_RDPMC_ALLOWED))
> +		return;
> +
> +	if (atomic_dec_and_test(&current->mm->context.perf_rdpmc_allowed))
> +		on_each_cpu_mask(mm_cpumask(current->mm), refresh_pce, NULL, 1);

The current task(T-a on CPU A) is asking CPUs(A, B, C, D) to refresh pce, and looks
the current task(T-d on CPU D) is disturbed if T-d loaded CR4 when going on CPU D.

Hillf
> +}
> +
>  static int x86_pmu_event_idx(struct perf_event *event)
>  {
>  	int idx = event->hw.idx;
> 
> -	if (!x86_pmu.attr_rdpmc)
> +	if (!(event->hw.flags & PERF_X86_EVENT_RDPMC_ALLOWED))
>  		return 0;
> 
>  	if (x86_pmu.num_counters_fixed && idx >= INTEL_PMC_IDX_FIXED) {
> @@ -1838,16 +1867,6 @@ static ssize_t get_attr_rdpmc(struct device *cdev,
>  	return snprintf(buf, 40, "%d\n", x86_pmu.attr_rdpmc);
>  }
> 
> -static void change_rdpmc(void *info)
> -{
> -	bool enable = !!(unsigned long)info;
> -
> -	if (enable)
> -		cr4_set_bits(X86_CR4_PCE);
> -	else
> -		cr4_clear_bits(X86_CR4_PCE);
> -}
> -
>  static ssize_t set_attr_rdpmc(struct device *cdev,
>  			      struct device_attribute *attr,
>  			      const char *buf, size_t count)
> @@ -1862,11 +1881,7 @@ static ssize_t set_attr_rdpmc(struct device *cdev,
>  	if (x86_pmu.attr_rdpmc_broken)
>  		return -ENOTSUPP;
> 
> -	if (!!val != !!x86_pmu.attr_rdpmc) {
> -		x86_pmu.attr_rdpmc = !!val;
> -		on_each_cpu(change_rdpmc, (void *)val, 1);
> -	}
> -
> +	x86_pmu.attr_rdpmc = !!val;
>  	return count;
>  }
> 
> @@ -1909,6 +1924,9 @@ static struct pmu pmu = {
> 
>  	.event_init		= x86_pmu_event_init,
> 
> +	.event_mapped		= x86_pmu_event_mapped,
> +	.event_unmapped		= x86_pmu_event_unmapped,
> +
>  	.add			= x86_pmu_add,
>  	.del			= x86_pmu_del,
>  	.start			= x86_pmu_start,
> @@ -1930,7 +1948,8 @@ void arch_perf_update_userpage(struct perf_event *event,
> 
>  	userpg->cap_user_time = 0;
>  	userpg->cap_user_time_zero = 0;
> -	userpg->cap_user_rdpmc = x86_pmu.attr_rdpmc;
> +	userpg->cap_user_rdpmc =
> +		!!(event->hw.flags & PERF_X86_EVENT_RDPMC_ALLOWED);
>  	userpg->pmc_width = x86_pmu.cntval_bits;
> 
>  	if (!sched_clock_stable())
> diff --git a/arch/x86/kernel/cpu/perf_event.h b/arch/x86/kernel/cpu/perf_event.h
> index d98a34d435d7..f6868186e67b 100644
> --- a/arch/x86/kernel/cpu/perf_event.h
> +++ b/arch/x86/kernel/cpu/perf_event.h
> @@ -71,6 +71,8 @@ struct event_constraint {
>  #define PERF_X86_EVENT_COMMITTED	0x8 /* event passed commit_txn */
>  #define PERF_X86_EVENT_PEBS_LD_HSW	0x10 /* haswell style datala, load */
>  #define PERF_X86_EVENT_PEBS_NA_HSW	0x20 /* haswell style datala, unknown */
> +#define PERF_X86_EVENT_RDPMC_ALLOWED	0x40 /* grant rdpmc permission */
> +
> 
>  struct amd_nb {
>  	int nb_id;  /* NorthBridge id */
> --
> 1.9.3


^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, back to index

Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-10-24 22:58 [PATCH v2 0/8] CR4 handling improvements Andy Lutomirski
2014-10-24 22:58 ` [PATCH v2 1/8] perf: Clean up pmu::event_idx Andy Lutomirski
2014-10-24 22:58 ` [PATCH v2 2/8] x86: Clean up cr4 manipulation Andy Lutomirski
2014-11-01 19:56   ` Thomas Gleixner
2015-02-04 14:41   ` [tip:perf/x86] " tip-bot for Andy Lutomirski
2014-10-24 22:58 ` [PATCH v2 3/8] x86: Store a per-cpu shadow copy of CR4 Andy Lutomirski
2014-11-01 19:56   ` Thomas Gleixner
2015-02-04 14:41   ` [tip:perf/x86] " tip-bot for Andy Lutomirski
2014-10-24 22:58 ` [PATCH v2 4/8] x86: Add a comment clarifying LDT context switching Andy Lutomirski
2014-11-01 19:56   ` Thomas Gleixner
2015-02-04 14:41   ` [tip:perf/x86] " tip-bot for Andy Lutomirski
2014-10-24 22:58 ` [PATCH v2 5/8] perf: Add pmu callbacks to track event mapping and unmapping Andy Lutomirski
2014-11-01 19:59   ` Thomas Gleixner
2014-11-01 20:32     ` Andy Lutomirski
2014-11-01 20:39       ` Thomas Gleixner
2014-11-01 21:49         ` Andy Lutomirski
2014-11-01 22:10           ` Thomas Gleixner
2014-11-02 20:15             ` Andy Lutomirski
2015-02-04 14:42   ` [tip:perf/x86] " tip-bot for Andy Lutomirski
2014-10-24 22:58 ` [PATCH v2 6/8] perf: Pass the event to arch_perf_update_userpage Andy Lutomirski
2015-02-04 14:42   ` [tip:perf/x86] perf: Pass the event to arch_perf_update_userpage( ) tip-bot for Andy Lutomirski
2014-10-24 22:58 ` [PATCH v2 7/8] x86, perf: Only allow rdpmc if a perf_event is mapped Andy Lutomirski
2014-10-31 17:54   ` Paolo Bonzini
2014-10-31 18:25     ` Andy Lutomirski
2015-02-04 14:42   ` [tip:perf/x86] perf/x86: " tip-bot for Andy Lutomirski
2014-10-24 22:58 ` [PATCH v2 8/8] x86, perf: Add /sys/devices/cpu/rdpmc=2 to allow rdpmc for all tasks Andy Lutomirski
2015-02-04 14:43   ` [tip:perf/x86] perf/x86: Add /sys/devices/cpu/rdpmc= 2 " tip-bot for Andy Lutomirski
2014-10-31 15:09 ` [PATCH v2 0/8] CR4 handling improvements Peter Zijlstra
2014-10-31 17:09   ` Andy Lutomirski
2015-01-14  0:52   ` Andy Lutomirski
2015-01-22 22:42     ` Thomas Gleixner
2015-01-23  8:37       ` Peter Zijlstra
2014-11-12 23:38 ` Andy Lutomirski
2014-10-27  7:16 [PATCH v2 7/8] x86, perf: Only allow rdpmc if a perf_event is mapped 张静(长谷)
2014-10-27 15:45 ` Andy Lutomirski
2014-10-28  3:35   ` Hillf Danton
2014-10-28  3:57     ` Andy Lutomirski
2014-10-28  4:07       ` Hillf Danton
2014-10-28  4:27         ` Andy Lutomirski

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git
	git clone --mirror https://lore.kernel.org/lkml/8 lkml/git/8.git
	git clone --mirror https://lore.kernel.org/lkml/9 lkml/git/9.git
	git clone --mirror https://lore.kernel.org/lkml/10 lkml/git/10.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org
	public-inbox-index lkml

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git