linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Megha Dey <megha.dey@linux.intel.com>
To: x86@kernel.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org
Cc: tglx@linutronix.de, mingo@redhat.com, hpa@zytor.com,
	andriy.shevchenko@linux.intel.com, kstewart@linuxfoundation.org,
	yu-cheng.yu@intel.com, len.brown@intel.com,
	gregkh@linuxfoundation.org, peterz@infradead.org,
	acme@kernel.org, alexander.shishkin@linux.intel.com,
	jolsa@redhat.com, namhyung@kernel.org,
	vikas.shivappa@linux.intel.com, pombredanne@nexb.com,
	me@kylehuey.com, bp@suse.de, grzegorz.andrejczuk@intel.com,
	tony.luck@intel.com, corbet@lwn.net, ravi.v.shankar@intel.com,
	megha.dey@intel.com, Megha Dey <megha.dey@linux.intel.com>
Subject: [PATCH V1 3/3] x86, bm: Add documentation on Intel Branch Monitoring
Date: Sat, 11 Nov 2017 13:20:06 -0800	[thread overview]
Message-ID: <1510435206-16110-4-git-send-email-megha.dey@linux.intel.com> (raw)
In-Reply-To: <1510435206-16110-1-git-send-email-megha.dey@linux.intel.com>

This patch adds the Documentation/x86/intel_bm.txt file with some
information about Intel Branch monitoring.

Signed-off-by: Megha Dey <megha.dey@linux.intel.com>
---
 Documentation/x86/intel_bm.txt | 216 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 216 insertions(+)
 create mode 100644 Documentation/x86/intel_bm.txt

diff --git a/Documentation/x86/intel_bm.txt b/Documentation/x86/intel_bm.txt
new file mode 100644
index 0000000..25b7177
--- /dev/null
+++ b/Documentation/x86/intel_bm.txt
@@ -0,0 +1,216 @@
+Intel(R) Branch Monitoring
+
+Copyright (C) 2017 Intel Corporation
+
+Megha Dey <megha.dey@intel.com>
+Yu-Cheng Yu <yu-cheng.yu@intel.com>
+
+I. Overview
+===========
+
+The Cannonlake family of Intel processors support the branch monitoring
+feature. This feature uses heuristics to detect the occurrence of an ROP
+(Return Oriented Programming) or ROP like(JOP:Jump oriented programming)
+attack. These heuristics are based off certain performance monitoring
+statistics, measured dynamically over a short configurable window period.
+ROP is a malware trend in which the attacker can compromise a return
+pointer held on the stack to redirect execution to a different desired
+instruction.
+
+Support for branch monitoring has been added via Linux kernel perf event
+infrastructure. This feature is enabled by CONFIG_PERF_EVENTS_INTEL_BM.
+
+Once the kernel is compiled with CONFIG_PERF_EVENTS_INTEL_BM=y on a
+Cannonlake system, the following perf events are added which can be viewed
+with perf list:
+  intel_bm/branch-misp/                              [Kernel PMU event]
+  intel_bm/call-ret/                                 [Kernel PMU event]
+  intel_bm/far-branch/                               [Kernel PMU event]
+  intel_bm/indirect-branch-misp/                     [Kernel PMU event]
+  intel_bm/ret-misp/                                 [Kernel PMU event]
+  intel_bm/rets/                                     [Kernel PMU event]
+
+II. Hardware details
+====================
+
+The MSRs associated with branch monitoring are as follows:
+
+1. BR_DETECT_CTRL : Branch Monitoring Global control
+   Used for enabling and configuring global capability
+
+2. BR_DETECT_STATUS : Branch Monitoring Global Status
+   Used by SW handler for determining detect status
+
+3. BR_DETECT_COUNTER_CONFIG_i : Branch Monitoring Counter Configuration
+   Per-cpu branch monitoring counter Configuration
+
+There are 2 8-bit counters that each can select between one of the
+following 6 events:
+
+1. RET instructions: Counts the number of near return instructions retired
+
+2. CALL-RET instructions: Counts the difference between the number of near
+   return and call instructions retired
+
+3. RET mispredicts: Mispredicted return instructions retired
+
+4. Branch (all) mispredicts: Counts the number of mispredicted branches
+
+5. Indirect branch mispredicts: Counts the number of mispredicted indirect
+   near branch instructions. Includes indirect near jump/call instructions
+
+6. Far branch instructions: Counts the number of far branches retired
+
+Branch Monitoring hardware utilizes various existing performance related
+counter events. Of the 6 events above, only call-ret is newly implemented.
+
+The events are evaluated over a specified 10-bit instruction window size
+(0 to 1023). For each counter, a threshold value (0 to 127) can be
+configured to set a point at which an interrupt is generated and a
+detection event action is taken (determined by user-space). This can take
+the form of signaling an interrupt and/or freezing the state of the last
+branch record information.
+
+The event counters are reset after every 'window size' instructions by the
+hardware.
+
+The feature is for user mode (privilege level > 0) operation only, which is
+the known malware security threat target environment. While in supervisor
+mode, this heuristic detection counter activity is suspended. This behavior
+(user mode) is independent of root vs. non-root with respect to
+virtualization technology execution.
+
+III. Software Implementation
+============================
+
+A perf-based kernel driver has been used to monitor the occurrence of
+one of the 6 branch monitoring events.
+
+If an branch monitoring interrupt is generated, the interrupt bit is set
+which is cleared by interrupt handler and the event counters are reset.
+
+The entire system can monitor a maximum of 2 events at any given time.
+These events can belong to the same or different tasks.
+
+Everytime a task is scheduled out, we save current window and count
+associated with the event being monitored. When the task is scheduled next,
+we start counting from previous count associated with this event. Thus, a
+full context switch in this case is not necessary.
+
+The Branch Monitoring exception can be configured as a regular interrupt or
+an NMI. We chain an NMI handler after PMU, because
+1. It will not interfere with PMU events
+2. We only monitor for user-mode events, and this will not delay branch
+   monitoring events for user-mode
+
+We monitor only per-task events. It does not make sense to monitor all tasks
+for an attack. This could generate a lot of false positives.
+
+IV. User-configurable inputs
+============================
+
+Several sysfs entries are provided in /sys/devices/intel_bm/ to configure
+controls for the supported hardware heuristics.
+
+1. LBR freeze: /sys/devices/intel-bm/lbr_freeze
+   possible values are 0 or 1. By default this is disabled(0). When enabled,
+   an LBR freeze is observed on threshold trip
+
+2. Guest Disable: /sys/devices/intel-bm/guest_disable
+   Possible values are 0 or 1. By default it is 0. When set to ‘1’, branch
+   monitoring feature is disabled when operating at VMX non-root operation.
+
+3. Window size: /sys/devices/intel-bm/window_size
+   By default, window size is 1023. It can take values from 0 to 1023. This
+   represents the number of instructions to be executed before the event
+   counters are reset.
+
+4. Window count select: /sys/devices/intel-bm/window_cnt_sel
+   Possible values are:
+   ‘00 = instructions retired
+   ‘01 = branches retired
+   ‘10 = returned instructions retired
+   ‘11 = indirect branch instructions retired
+   By default, it has a value of 0.
+
+5. Count and mode: /sys/devices/intel-bm/cnt_and_mode
+   Possible values are 0 or 1. By default it is 0. When set to ‘1’, the
+   overall event triggering condition is true only if both enabled
+   counter’s threshold conditions are true. When ‘0’, the threshold
+   tripping condition is true if either enabled counter’s threshold is
+   true. If a counter is not enabled, then it does not factor into the
+   AND’ing logic
+
+6. Threshold: /sys/devices/intel-bm/threshold
+   An unsigned value of 0 to 127 is supported. The value 0 of counter
+   threshold will result in branch monitoring event signaled after every
+   instruction. By default, it has a value of 127.
+
+7. Mispredict counting behaviour: /sys/devices/intel-bm/mispred_evt_cnt
+   Possible values are:
+   0 = mispredict events are counted in a window
+   1 = mispredict events are counted based on a consecutive occurrence.
+   By default, it has a value of 0.
+
+Threshold and Mispredict events counting behaviour are per-counter
+configurations whereas the rest are global.
+
+V. Example usage
+================
+
+1. To monitor a user space application for branch monitoring events, perf
+command line can be used as follows:
+
+perf stat -e intel_bm/rets/ ./test
+
+ Performance counter stats for './test':
+
+                 1      intel_bm/rets/
+
+       0.104705937 seconds time elapsed
+
+where test.c is:
+
+void func(void)
+{
+        return;
+}
+
+void main(void)
+{
+        int i;
+
+        for (i = 0; i < 128; i++) {
+                func();
+        }
+
+        return;
+}
+
+and threshold = 100 (echo 100 > /sys/devices/intel_bm/threshold)
+
+perf returns the number of branch monitoring interrupts occurred when the
+user-space application was running.
+
+2. To monitor 2 events for a task,
+
+perf stat -e intel_bm/far-branch/,intel_bm/rets/ ./rets-128.bin
+
+ Performance counter stats for './rets-128.bin':
+
+                 0      intel_bm/far-branch/
+                 1      intel_bm/rets/
+
+       0.104057608 seconds time elapsed
+
+For the above example, the threshold and window size are shared.
+
+3. To monitor 2 events with different thresholds(same or different task)
+
+On terminal 1:
+echo <threshold1> > /sys/devices/intel_bm/threshold
+perf stat -e intel_bm/rets/ ./test.bin
+
+On terminal 2:
+echo <threshold2> > /sys/devices/intel_bm/threshold
+perf stat -e intel_bm/call-ret/ ./test.bin
-- 
1.9.1

  parent reply	other threads:[~2017-11-11 21:05 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-11-11 21:20 [PATCH V1 0/3] perf/x86/intel: Add Branch Monitoring support Megha Dey
2017-11-11 21:20 ` [PATCH V1 1/3] x86/cpu/intel: Add Cannonlake to Intel family Megha Dey
2017-11-11 21:20 ` [PATCH V1 2/3] perf/x86/intel/bm.c: Add Intel Branch Monitoring support Megha Dey
2017-11-13  9:00   ` Peter Zijlstra
2017-11-13 19:22     ` Dey, Megha
2017-11-13 20:25       ` Thomas Gleixner
2017-11-13 22:14         ` Megha Dey
2017-11-11 21:20 ` Megha Dey [this message]
2017-11-12  1:56   ` [PATCH V1 3/3] x86, bm: Add documentation on Intel Branch Monitoring Randy Dunlap

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1510435206-16110-4-git-send-email-megha.dey@linux.intel.com \
    --to=megha.dey@linux.intel.com \
    --cc=acme@kernel.org \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=andriy.shevchenko@linux.intel.com \
    --cc=bp@suse.de \
    --cc=corbet@lwn.net \
    --cc=gregkh@linuxfoundation.org \
    --cc=grzegorz.andrejczuk@intel.com \
    --cc=hpa@zytor.com \
    --cc=jolsa@redhat.com \
    --cc=kstewart@linuxfoundation.org \
    --cc=len.brown@intel.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=me@kylehuey.com \
    --cc=megha.dey@intel.com \
    --cc=mingo@redhat.com \
    --cc=namhyung@kernel.org \
    --cc=peterz@infradead.org \
    --cc=pombredanne@nexb.com \
    --cc=ravi.v.shankar@intel.com \
    --cc=tglx@linutronix.de \
    --cc=tony.luck@intel.com \
    --cc=vikas.shivappa@linux.intel.com \
    --cc=x86@kernel.org \
    --cc=yu-cheng.yu@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).