From: Megha Dey <megha.dey@linux.intel.com>
To: x86@kernel.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org
Cc: tglx@linutronix.de, mingo@redhat.com, hpa@zytor.com,
	andriy.shevchenko@linux.intel.com, kstewart@linuxfoundation.org,
	yu-cheng.yu@intel.com, len.brown@intel.com,
	gregkh@linuxfoundation.org, peterz@infradead.org,
	acme@kernel.org, alexander.shishkin@linux.intel.com,
	jolsa@redhat.com, namhyung@kernel.org,
	vikas.shivappa@linux.intel.com, pombredanne@nexb.com,
	me@kylehuey.com, bp@suse.de, grzegorz.andrejczuk@intel.com,
	tony.luck@intel.com, corbet@lwn.net, ravi.v.shankar@intel.com,
	megha.dey@intel.com, Megha Dey <megha.dey@linux.intel.com>
Subject: [PATCH V1 3/3] x86, bm: Add documentation on Intel Branch Monitoring
Date: Sat, 11 Nov 2017 13:20:06 -0800
Message-ID: <1510435206-16110-4-git-send-email-megha.dey@linux.intel.com>
In-Reply-To: <1510435206-16110-1-git-send-email-megha.dey@linux.intel.com>

This patch adds the Documentation/x86/intel_bm.txt file with some
information about Intel Branch Monitoring.

Signed-off-by: Megha Dey <megha.dey@linux.intel.com>
---
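The window and threshold behaviour described in section II of the added
file, and the cnt_and_mode logic from section IV, can be illustrated with
a small user-space model in C. This is only a sketch under stated
assumptions, not the hardware or driver logic. The names bm_model_trips()
and bm_model_condition() are hypothetical. The model assumes a trip occurs
once the count reaches the threshold (consistent with a threshold of 0
signaling after every instruction), that a trip restarts the window, and
that no event is signaled when neither counter is enabled.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical model of a single counter (not the driver code).
 * is_event[i] is non-zero if instruction i is a monitored event.
 * Assumptions: a trip occurs once count >= threshold, a trip restarts
 * the window, and the count is cleared every window_size instructions.
 */
unsigned int bm_model_trips(const unsigned char *is_event, size_t n,
                            unsigned int window_size, unsigned int threshold)
{
    unsigned int count = 0, in_window = 0, trips = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (is_event[i])
            count++;
        in_window++;

        if (count >= threshold) {
            /* Threshold reached: count one trip and start over. */
            trips++;
            count = 0;
            in_window = 0;
        } else if (in_window >= window_size) {
            /* Window expired without a trip: reset the counter. */
            count = 0;
            in_window = 0;
        }
    }
    return trips;
}

/*
 * Hypothetical model of cnt_and_mode for the two counters.
 * A counter that is not enabled does not factor into the result.
 * Assumption: with no counter enabled, no event is signaled.
 */
bool bm_model_condition(bool en0, bool trip0, bool en1, bool trip1,
                        bool and_mode)
{
    bool any = (en0 && trip0) || (en1 && trip1);
    bool all = (!en0 || trip0) && (!en1 || trip1);

    if (!en0 && !en1)
        return false;
    return and_mode ? all : any;
}
```

Under these assumptions, a stream of 128 consecutive events with a window
size of 1023 and a threshold of 100 yields a single trip, which is
consistent with the count shown in the perf stat example in section V.
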
 Documentation/x86/intel_bm.txt | 216 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 216 insertions(+)
 create mode 100644 Documentation/x86/intel_bm.txt
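
The sysfs controls listed in section IV of the added file are set with
echo in the examples. The same writes could be done from a C program, as
in the sketch below. The helper names bm_write_control(),
bm_set_threshold() and bm_set_window_size() are hypothetical and are not
part of this patch series; the directory is passed in as a parameter, and
the range checks simply mirror the documented limits (0 to 127 for the
threshold, 0 to 1023 for the window size).

```c
#include <stdio.h>

/*
 * Hypothetical helper: write an unsigned value to the control file
 * "name" inside directory "dir". Returns 0 on success, -1 on error.
 */
int bm_write_control(const char *dir, const char *name, unsigned int value)
{
    char path[256];
    FILE *f;
    int n, ok;

    n = snprintf(path, sizeof(path), "%s/%s", dir, name);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;

    f = fopen(path, "w");
    if (!f)
        return -1;

    ok = fprintf(f, "%u\n", value) > 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Reject thresholds outside the documented 0..127 range, then write. */
int bm_set_threshold(const char *dir, unsigned int threshold)
{
    if (threshold > 127)
        return -1;
    return bm_write_control(dir, "threshold", threshold);
}

/* Reject window sizes outside the documented 0..1023 range, then write. */
int bm_set_window_size(const char *dir, unsigned int window_size)
{
    if (window_size > 1023)
        return -1;
    return bm_write_control(dir, "window_size", window_size);
}
```

Under these assumptions, bm_set_threshold("/sys/devices/intel_bm", 100)
would correspond to "echo 100 > /sys/devices/intel_bm/threshold" and
would likely need the same privileges as the shell command.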

diff --git a/Documentation/x86/intel_bm.txt b/Documentation/x86/intel_bm.txt
new file mode 100644
index 0000000..25b7177
--- /dev/null
+++ b/Documentation/x86/intel_bm.txt
@@ -0,0 +1,216 @@
+Intel(R) Branch Monitoring
+
+Copyright (C) 2017 Intel Corporation
+
+Megha Dey <megha.dey@intel.com>
+Yu-Cheng Yu <yu-cheng.yu@intel.com>
+
+I. Overview
+===========
+
+The Cannonlake family of Intel processors supports the branch monitoring
+feature. This feature uses heuristics to detect the occurrence of an ROP
+(Return Oriented Programming) or ROP-like (JOP: Jump Oriented Programming)
+attack. These heuristics are based on certain performance monitoring
+statistics, measured dynamically over a short configurable window period.
+ROP is an exploitation technique in which the attacker compromises a return
+pointer held on the stack to redirect execution to a different desired
+instruction.
+
+Support for branch monitoring has been added via the Linux kernel perf event
+infrastructure. This feature is enabled by CONFIG_PERF_EVENTS_INTEL_BM.
+
+Once the kernel is compiled with CONFIG_PERF_EVENTS_INTEL_BM=y on a
+Cannonlake system, the following perf events are added, which can be viewed
+with perf list:
+  intel_bm/branch-misp/                              [Kernel PMU event]
+  intel_bm/call-ret/                                 [Kernel PMU event]
+  intel_bm/far-branch/                               [Kernel PMU event]
+  intel_bm/indirect-branch-misp/                     [Kernel PMU event]
+  intel_bm/ret-misp/                                 [Kernel PMU event]
+  intel_bm/rets/                                     [Kernel PMU event]
+
+II. Hardware details
+====================
+
+The MSRs associated with branch monitoring are as follows:
+
+1. BR_DETECT_CTRL : Branch Monitoring Global control
+   Used for enabling and configuring global capability
+
+2. BR_DETECT_STATUS : Branch Monitoring Global Status
+   Used by SW handler for determining detect status
+
+3. BR_DETECT_COUNTER_CONFIG_i : Branch Monitoring Counter Configuration
+   Per-cpu branch monitoring counter Configuration
+
+There are two 8-bit counters, each of which can select one of the
+following 6 events:
+
+1. RET instructions: Counts the number of near return instructions retired
+
+2. CALL-RET instructions: Counts the difference between the number of near
+   return and call instructions retired
+
+3. RET mispredicts: Mispredicted return instructions retired
+
+4. Branch (all) mispredicts: Counts the number of mispredicted branches
+
+5. Indirect branch mispredicts: Counts the number of mispredicted indirect
+   near branch instructions. Includes indirect near jump/call instructions
+
+6. Far branch instructions: Counts the number of far branches retired
+
+Branch Monitoring hardware utilizes various existing performance related
+counter events. Of the 6 events above, only call-ret is newly implemented.
+
+The events are evaluated over a specified 10-bit instruction window size
+(0 to 1023). For each counter, a threshold value (0 to 127) can be
+configured to set the point at which an interrupt is generated and a
+detection event action (determined by user space) is taken. This action
+can take the form of signaling an interrupt and/or freezing the state of
+the last branch record (LBR) information.
+
+The event counters are reset after every 'window size' instructions by the
+hardware.
+
+The feature is for user mode (privilege level > 0) operation only, which is
+the environment known to be targeted by such malware. While in supervisor
+mode, this heuristic detection counter activity is suspended. This
+user-mode-only behavior is independent of root vs. non-root operation with
+respect to virtualization technology.
+
+III. Software Implementation
+============================
+
+A perf-based kernel driver is used to monitor the occurrence of any of
+the 6 branch monitoring events.
+
+If a branch monitoring interrupt is generated, the interrupt bit is set.
+The interrupt handler clears this bit, and the event counters are reset.
+
+The entire system can monitor a maximum of 2 events at any given time.
+These events can belong to the same or different tasks.
+
+Every time a task is scheduled out, we save the current window and count
+associated with the event being monitored. When the task is next scheduled
+in, we resume counting from the saved count associated with this event.
+Thus, a full context switch is not necessary in this case.
+
+The Branch Monitoring exception can be configured as a regular interrupt or
+an NMI. We chain an NMI handler after the PMU handler, because:
+1. It will not interfere with PMU events
+2. We only monitor for user-mode events, and this will not delay branch
+   monitoring events for user-mode
+
+We monitor only per-task events. It does not make sense to monitor all tasks
+for an attack, since this could generate a lot of false positives.
+
+IV. User-configurable inputs
+============================
+
+Several sysfs entries are provided in /sys/devices/intel_bm/ to configure
+controls for the supported hardware heuristics.
+
+1. LBR freeze: /sys/devices/intel_bm/lbr_freeze
+   Possible values are 0 or 1. By default this is disabled (0). When
+   enabled, an LBR freeze is observed on threshold trip.
+
+2. Guest Disable: /sys/devices/intel_bm/guest_disable
+   Possible values are 0 or 1. By default it is 0. When set to 1, the branch
+   monitoring feature is disabled in VMX non-root operation.
+
+3. Window size: /sys/devices/intel_bm/window_size
+   By default, window size is 1023. It can take values from 0 to 1023. This
+   represents the number of instructions to be executed before the event
+   counters are reset.
+
+4. Window count select: /sys/devices/intel_bm/window_cnt_sel
+   Possible values are:
+   00 = instructions retired
+   01 = branches retired
+   10 = return instructions retired
+   11 = indirect branch instructions retired
+   By default, it has a value of 0.
+
+5. Count and mode: /sys/devices/intel_bm/cnt_and_mode
+   Possible values are 0 or 1. By default it is 0. When set to 1, the
+   overall event triggering condition is true only if both enabled
+   counters' threshold conditions are true. When 0, the threshold
+   tripping condition is true if either enabled counter's threshold is
+   true. If a counter is not enabled, then it does not factor into the
+   AND'ing logic.
+
+6. Threshold: /sys/devices/intel_bm/threshold
+   An unsigned value of 0 to 127 is supported. A counter threshold of 0
+   results in a branch monitoring event being signaled after every
+   instruction. By default, it has a value of 127.
+
+7. Mispredict counting behaviour: /sys/devices/intel_bm/mispred_evt_cnt
+   Possible values are:
+   0 = mispredict events are counted in a window
+   1 = mispredict events are counted based on a consecutive occurrence.
+   By default, it has a value of 0.
+
+Threshold and Mispredict events counting behaviour are per-counter
+configurations whereas the rest are global.
+
+V. Example usage
+================
+
+1. To monitor a user space application for branch monitoring events, the
+perf command line can be used as follows:
+
+perf stat -e intel_bm/rets/ ./test
+
+ Performance counter stats for './test':
+
+                 1      intel_bm/rets/
+
+       0.104705937 seconds time elapsed
+
+where test.c is:
+
+void func(void)
+{
+        return;
+}
+
+int main(void)
+{
+        int i;
+
+        for (i = 0; i < 128; i++) {
+                func();
+        }
+
+        return 0;
+}
+
+and threshold = 100 (echo 100 > /sys/devices/intel_bm/threshold)
+
+perf reports the number of branch monitoring interrupts that occurred while
+the user-space application was running.
+
+2. To monitor 2 events for a task:
+
+perf stat -e intel_bm/far-branch/,intel_bm/rets/ ./rets-128.bin
+
+ Performance counter stats for './rets-128.bin':
+
+                 0      intel_bm/far-branch/
+                 1      intel_bm/rets/
+
+       0.104057608 seconds time elapsed
+
+For the above example, the threshold and window size are shared.
+
+3. To monitor 2 events with different thresholds (same or different task):
+
+On terminal 1:
+echo <threshold1> > /sys/devices/intel_bm/threshold
+perf stat -e intel_bm/rets/ ./test.bin
+
+On terminal 2:
+echo <threshold2> > /sys/devices/intel_bm/threshold
+perf stat -e intel_bm/call-ret/ ./test.bin
-- 
1.9.1
