* [PATCH 0/5] x86-64: Remove syscall instructions at fixed addresses
@ 2011-05-27 17:38 Andy Lutomirski
  2011-05-27 17:38 ` [PATCH 1/5] x86-64: Fix alignment of jiffies variable Andy Lutomirski
                   ` (5 more replies)
  0 siblings, 6 replies; 28+ messages in thread
From: Andy Lutomirski @ 2011-05-27 17:38 UTC
  To: Thomas Gleixner, Ingo Molnar, x86; +Cc: linux-kernel, Andy Lutomirski

I lied about taking a while to do this.

There are a bunch of syscall instructions in kernel space at fixed
addresses that user code can execute.

One is a time() fallback.  Patch 3/5 removes it.

Several are data that isn't marked NX.  Patch 2/5 makes vvars NX and
5/5 makes the HPET NX.

The last one is the gettimeofday fallback.  We need that, but it
doesn't have to be a real syscall.  Patch 4/5 adds int 0xCC (callable
only from the vsyscall page) that implements the gettimeofday fallback
and nothing else.

Patch 1/5 is just a dumb but harmless bug fix from the last vdso
series.

I've only tested this in KVM with a hacked-up initramfs, but Ingo
wanted it for 2.6.40, so here it is.

Andy Lutomirski (5):
  x86-64: Fix alignment of jiffies variable
  x86-64: Give vvars their own page
  x86-64: Remove kernel.vsyscall64 sysctl
  x86-64: Replace vsyscall gettimeofday fallback with int 0xcc
  x86-64: Map the HPET NX

 arch/x86/include/asm/fixmap.h        |    1 +
 arch/x86/include/asm/pgtable_types.h |    6 ++-
 arch/x86/include/asm/traps.h         |    4 ++
 arch/x86/include/asm/vgtod.h         |    1 -
 arch/x86/include/asm/vsyscall.h      |    6 ++
 arch/x86/include/asm/vvar.h          |   24 ++++-----
 arch/x86/kernel/entry_64.S           |    2 +
 arch/x86/kernel/hpet.c               |    2 +-
 arch/x86/kernel/traps.c              |    4 ++
 arch/x86/kernel/vmlinux.lds.S        |   27 ++++++----
 arch/x86/kernel/vsyscall_64.c        |   86 ++++++++++++++++++---------------
 arch/x86/vdso/vclock_gettime.c       |   55 ++++++++-------------
 tools/power/x86/turbostat/turbostat  |  Bin 0 -> 29200 bytes
 13 files changed, 117 insertions(+), 101 deletions(-)
 create mode 100755 tools/power/x86/turbostat/turbostat

-- 
1.7.5.1



* [PATCH 1/5] x86-64: Fix alignment of jiffies variable
  2011-05-27 17:38 [PATCH 0/5] x86-64: Remove syscall instructions at fixed addresses Andy Lutomirski
@ 2011-05-27 17:38 ` Andy Lutomirski
  2011-05-27 17:38 ` [PATCH 2/5] x86-64: Give vvars their own page Andy Lutomirski
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 28+ messages in thread
From: Andy Lutomirski @ 2011-05-27 17:38 UTC
  To: Thomas Gleixner, Ingo Molnar, x86; +Cc: linux-kernel, Andy Lutomirski

It's declared __attribute__((aligned(16))) but it's explicitly not
aligned.  This is probably harmless but it's a bit embarrassing.

Signed-off-by: Andy Lutomirski <luto@mit.edu>
---
 arch/x86/include/asm/vvar.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/vvar.h b/arch/x86/include/asm/vvar.h
index 341b355..a4eaca4 100644
--- a/arch/x86/include/asm/vvar.h
+++ b/arch/x86/include/asm/vvar.h
@@ -45,7 +45,7 @@
 /* DECLARE_VVAR(offset, type, name) */
 
 DECLARE_VVAR(0, volatile unsigned long, jiffies)
-DECLARE_VVAR(8, int, vgetcpu_mode)
+DECLARE_VVAR(16, int, vgetcpu_mode)
 DECLARE_VVAR(128, struct vsyscall_gtod_data, vsyscall_gtod_data)
 
 #undef DECLARE_VVAR
-- 
1.7.5.1



* [PATCH 2/5] x86-64: Give vvars their own page
  2011-05-27 17:38 [PATCH 0/5] x86-64: Remove syscall instructions at fixed addresses Andy Lutomirski
  2011-05-27 17:38 ` [PATCH 1/5] x86-64: Fix alignment of jiffies variable Andy Lutomirski
@ 2011-05-27 17:38 ` Andy Lutomirski
  2011-05-29 20:34   ` Borislav Petkov
  2011-05-27 17:38 ` [PATCH 3/5] x86-64: Remove kernel.vsyscall64 sysctl Andy Lutomirski
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 28+ messages in thread
From: Andy Lutomirski @ 2011-05-27 17:38 UTC
  To: Thomas Gleixner, Ingo Molnar, x86; +Cc: linux-kernel, Andy Lutomirski

Move vvars out of the vsyscall page into their own page and mark it
NX.

Without this patch, an attacker who can force a daemon to call some
fixed address could wait until the time data contains, say, the bytes
0xCD 0x80 (the encoding of int $0x80), and then execute the current
time as code.

Signed-off-by: Andy Lutomirski <luto@mit.edu>
---
 arch/x86/include/asm/fixmap.h        |    1 +
 arch/x86/include/asm/pgtable_types.h |    2 ++
 arch/x86/include/asm/vvar.h          |   22 ++++++++++------------
 arch/x86/kernel/vmlinux.lds.S        |   27 ++++++++++++++++-----------
 arch/x86/kernel/vsyscall_64.c        |    5 +++++
 tools/power/x86/turbostat/turbostat  |  Bin 0 -> 29200 bytes
 6 files changed, 34 insertions(+), 23 deletions(-)
 create mode 100755 tools/power/x86/turbostat/turbostat

diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h
index 4729b2b..460c74e 100644
--- a/arch/x86/include/asm/fixmap.h
+++ b/arch/x86/include/asm/fixmap.h
@@ -78,6 +78,7 @@ enum fixed_addresses {
 	VSYSCALL_LAST_PAGE,
 	VSYSCALL_FIRST_PAGE = VSYSCALL_LAST_PAGE
 			    + ((VSYSCALL_END-VSYSCALL_START) >> PAGE_SHIFT) - 1,
+	VVAR_PAGE,
 	VSYSCALL_HPET,
 #endif
 	FIX_DBGP_BASE,
diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index d56187c..6a29aed6 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -108,6 +108,7 @@
 #define __PAGE_KERNEL_UC_MINUS		(__PAGE_KERNEL | _PAGE_PCD)
 #define __PAGE_KERNEL_VSYSCALL		(__PAGE_KERNEL_RX | _PAGE_USER)
 #define __PAGE_KERNEL_VSYSCALL_NOCACHE	(__PAGE_KERNEL_VSYSCALL | _PAGE_PCD | _PAGE_PWT)
+#define __PAGE_KERNEL_VVAR		(__PAGE_KERNEL_RO | _PAGE_USER)
 #define __PAGE_KERNEL_LARGE		(__PAGE_KERNEL | _PAGE_PSE)
 #define __PAGE_KERNEL_LARGE_NOCACHE	(__PAGE_KERNEL | _PAGE_CACHE_UC | _PAGE_PSE)
 #define __PAGE_KERNEL_LARGE_EXEC	(__PAGE_KERNEL_EXEC | _PAGE_PSE)
@@ -130,6 +131,7 @@
 #define PAGE_KERNEL_LARGE_EXEC		__pgprot(__PAGE_KERNEL_LARGE_EXEC)
 #define PAGE_KERNEL_VSYSCALL		__pgprot(__PAGE_KERNEL_VSYSCALL)
 #define PAGE_KERNEL_VSYSCALL_NOCACHE	__pgprot(__PAGE_KERNEL_VSYSCALL_NOCACHE)
+#define PAGE_KERNEL_VVAR		__pgprot(__PAGE_KERNEL_VVAR)
 
 #define PAGE_KERNEL_IO			__pgprot(__PAGE_KERNEL_IO)
 #define PAGE_KERNEL_IO_NOCACHE		__pgprot(__PAGE_KERNEL_IO_NOCACHE)
diff --git a/arch/x86/include/asm/vvar.h b/arch/x86/include/asm/vvar.h
index a4eaca4..de656ac 100644
--- a/arch/x86/include/asm/vvar.h
+++ b/arch/x86/include/asm/vvar.h
@@ -10,15 +10,14 @@
  * In normal kernel code, they are used like any other variable.
  * In user code, they are accessed through the VVAR macro.
  *
- * Each of these variables lives in the vsyscall page, and each
- * one needs a unique offset within the little piece of the page
- * reserved for vvars.  Specify that offset in DECLARE_VVAR.
- * (There are 896 bytes available.  If you mess up, the linker will
- * catch it.)
+ * These variables live in a page of kernel data that has an extra RO
+ * mapping for userspace.  Each variable needs a unique offset within
+ * that page; specify that offset with the DECLARE_VVAR macro.  (If
+ * you mess up, the linker will catch it.)
  */
 
-/* Offset of vars within vsyscall page */
-#define VSYSCALL_VARS_OFFSET (3072 + 128)
+/* Base address of vvars.  This is not ABI. */
+#define VVAR_ADDRESS (-10*1024*1024 - 4096)
 
 #if defined(__VVAR_KERNEL_LDS)
 
@@ -26,17 +25,17 @@
  * right place.
  */
 #define DECLARE_VVAR(offset, type, name) \
-	EMIT_VVAR(name, VSYSCALL_VARS_OFFSET + offset)
+	EMIT_VVAR(name, offset)
 
 #else
 
 #define DECLARE_VVAR(offset, type, name)				\
 	static type const * const vvaraddr_ ## name =			\
-		(void *)(VSYSCALL_START + VSYSCALL_VARS_OFFSET + (offset));
+		(void *)(VVAR_ADDRESS + (offset));
 
 #define DEFINE_VVAR(type, name)						\
-	type __vvar_ ## name						\
-	__attribute__((section(".vsyscall_var_" #name), aligned(16)))
+	type name							\
+	__attribute__((section(".vvar_" #name), aligned(16)))
 
 #define VVAR(name) (*vvaraddr_ ## name)
 
@@ -49,4 +48,3 @@ DECLARE_VVAR(16, int, vgetcpu_mode)
 DECLARE_VVAR(128, struct vsyscall_gtod_data, vsyscall_gtod_data)
 
 #undef DECLARE_VVAR
-#undef VSYSCALL_VARS_OFFSET
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 1432bd4..3c1ec1c 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -161,12 +161,6 @@ SECTIONS
 
 #define VVIRT_OFFSET (VSYSCALL_ADDR - __vsyscall_0)
 #define VVIRT(x) (ADDR(x) - VVIRT_OFFSET)
-#define EMIT_VVAR(x, offset) .vsyscall_var_ ## x	\
-	ADDR(.vsyscall_0) + offset		 	\
-	: AT(VLOAD(.vsyscall_var_ ## x)) {     		\
-		*(.vsyscall_var_ ## x)			\
-	}						\
-	x = VVIRT(.vsyscall_var_ ## x);
 
 	. = ALIGN(4096);
 	__vsyscall_0 = .;
@@ -192,17 +186,28 @@ SECTIONS
 		*(.vsyscall_3)
 	}
 
-#define __VVAR_KERNEL_LDS
-#include <asm/vvar.h>
-#undef __VVAR_KERNEL_LDS
-
-	. = __vsyscall_0 + PAGE_SIZE;
+	. = ALIGN(__vsyscall_0 + PAGE_SIZE, PAGE_SIZE);
 
 #undef VSYSCALL_ADDR
 #undef VLOAD_OFFSET
 #undef VLOAD
 #undef VVIRT_OFFSET
 #undef VVIRT
+
+	__vvar_page = .;
+
+#define EMIT_VVAR(name, offset) .vvar_ ## name		\
+	(__vvar_page + offset) :			\
+	 AT(ADDR(.vvar_ ## name) - LOAD_OFFSET) {	\
+		*(.vvar_ ## name)			\
+	} :data
+
+#define __VVAR_KERNEL_LDS
+#include <asm/vvar.h>
+#undef __VVAR_KERNEL_LDS
+
+       . = ALIGN(__vvar_page + PAGE_SIZE, PAGE_SIZE);
+
 #undef EMIT_VVAR
 
 #endif /* CONFIG_X86_64 */
diff --git a/arch/x86/kernel/vsyscall_64.c b/arch/x86/kernel/vsyscall_64.c
index 5f6ad03..ee22180 100644
--- a/arch/x86/kernel/vsyscall_64.c
+++ b/arch/x86/kernel/vsyscall_64.c
@@ -284,9 +284,14 @@ void __init map_vsyscall(void)
 {
 	extern char __vsyscall_0;
 	unsigned long physaddr_page0 = __pa_symbol(&__vsyscall_0);
+	extern char __vvar_page;
+	unsigned long physaddr_vvar_page = __pa_symbol(&__vvar_page);
 
 	/* Note that VSYSCALL_MAPPED_PAGES must agree with the code below. */
 	__set_fixmap(VSYSCALL_FIRST_PAGE, physaddr_page0, PAGE_KERNEL_VSYSCALL);
+	__set_fixmap(VVAR_PAGE, physaddr_vvar_page, PAGE_KERNEL_VVAR);
+	BUILD_BUG_ON((unsigned long)__fix_to_virt(VVAR_PAGE) !=
+		     (unsigned long)VVAR_ADDRESS);
 }
 
 static int __init vsyscall_init(void)
diff --git a/tools/power/x86/turbostat/turbostat b/tools/power/x86/turbostat/turbostat
new file mode 100755
index 0000000000000000000000000000000000000000..97aabd2a0c77eed562f15bc5285911a4db3fa763
GIT binary patch
literal 29200
zcmdsg3w%`7)$dNiKtLb?jg5kO)I<Y92=9QR10*<T(1@hPPw8aHOvq@GiSvL!rP43Z
zIv%OCwbg1>u6;<YR;`aF*g}Yp;G-5=s!`EKrR~I^BBDh_bN_4Yz0Wx_$xyt0zx%tt
zdmyv+UhBWtT6^u+Is2TMbA5Tmg3OEzp^r>)zM$N?;cjNhPa(3Ga8~h3MYeE@Q^goD
z7#tV=+$<p~Q<}Cb)tY82To&jcN@prYrduWy9V>G}Q|%#9vS%M|m6UYar829j5C$`-
z9Qi6LISX1X6N+{zKN92^9|nICo5~fbazzTS=_civrn-JqH~Jf^^p4GRBS(2R5mCmU
zHsx4=SNV^qpVIjYB_MyHn-iK&Q}#5~<t{@xuAeOSA3wF9lj?)ZXH1(^@1Iy7Y>YKe
zY%ZNKamKXbNT_%U`=1L<Wb(pGs#vFvhk+c~J^_CeGLj#fcwpGYC%inN?B*FiU%c^8
zCq3(b;YO03hCeI*E=mpOs1V=4->ie?*eOr_d-Nms{ChOA10FAdJCJaF8h#`VpR61B
zOFgaxV<7!yY4{JM;h&s_|3n&oeHwmA8h%e2{%2|U6!?Mabz>SlwaGyE*VFLdPQ%}s
zhELPTK>GibhX1QH{6HFhOB#N48vdhc_;b?m|CWY7ISv2pH2k~4A0<u_2NqW&h53~~
zg_r?K{BFh9<NG$nPo&_5*Sn%2)aZ>wec`CrD=?gEX!Onyf#zUTM4H0E#%QhZF1*y+
z6pPk~+E7!VQG}YJ!A8Gm3WUR<FjNF1p_*B<ypc$aud!A{f-4$*^#b?^@ZoDjjjz5w
zR3ladq9|y?YbCt4ralx2fL#+^-4yUH_YxLBUwDOxME$TqC1XK9Tnf~zY!bDr!og^O
zn9*QEAXMx3tros$C@5C>g3%`6YL(ZK`aqybG=&2`Kh#szR6QGf!A4OV4v^JmR6Qy#
ztXMd2zIRITR4X;pN=+%Aj_76LAI(PeNonW5K}k8M#F<8645utXb0YoGaeOd1f@Z>O
zj5FeWjq_Pzw93a$4h|)oW09w$^|ETFG*|1kC5CQqTH6d_s6)3m@lV%GP~C5cPh(ae
zT}uBDFqBCabvyWKNn(D&!Kdix<A8%tHuQ1W!B1LmRKFy<^rw$(2VeJblI1%12#|H;
zJNUyaE@I-~ALrl~IrzgJ{1OKr4P+gq4*p1sD}>v@&vEcQ4!&-45?4C-qa6694*qBd
zzuLh+-odYP@M$g8N0Wn}r<tJ54*rP_{yGQ$YYzSm4t~CazsbQr$-%$P!KbxMADbQg
zF`5aw#lhF-Y9xNh!9T@;Z*}lbb?`kc`?C{c&@&VF(MN2W5T2H&vY+gme#V7{=qoJ1
zXU=JEq$poUC40L2a1@?QdFtvtyEs3Z^3=t9+BrXr^3=6^9^!l^<!R{kZ07u@3y`NS
z-m{7G2Pr?8^6NPN4&|v!_cU?-Ey`0@?y2VdtCXiM+*8T<zfhjKZjYPu&rzPbY)=X2
zJ19?Gwa4K6R?1Ts?aAf*?<h}Qvqy0LKFU*<>^blS7=`yxp1NXBH|KAs{3y!r;`|RO
zPhGF4o%1(Rp1NGmL!AF_%2QYC+06Mhl&3D%vx)Oz%2U_sS;zUSDNkLhr-}2+DNkLg
zr<(KsMS1E%J(ZkaOnK@$J#NllM0x5mJtdr<M|tWhJqGfV-(48r{Y_8&El<n71C>h_
zcD5fl-z_|yJF1y<Zl2>7iF*M&v}qId5s&H0C^v*>Lwhu{Zx?rwP0pjr6aUBM1Y9>K
zABIHi;wSQP%8p{Ak@bA1H<xldKFjj7^k#Twbwvtlb51K0=xps7p1A8>=<Yk=X$G0D
zzmug|UE#N$aXpV@2AR9^*(YmjYm0METaPMGxj9|)V2A8RN4GKyzGZJWVCUJs?q{Fe
zv`PE4G`{wJPdv7{EpmZdm{s?CHkLo+iNENHcXGLeKsAZCjkRd*R&e9x`&-uT7pqT^
zMLH83!0dG0K~4e8#e1*zm{prS8<%yN<@-JH-CL=Tdg5=yI}@FsLvg&k+taePJ3Bh<
zi5p-a5jjwX>Y1@_kGW_w+*ueu0J!DN6qwkKzrIA`NM9dxLeck;@|fja2*gqZf+}XZ
zK7krkxd0mPB8~Bm)_chSTo?owwvmU>n%Ds^OxHQ^S)rR0`u>5?KVtN|w(fZf{qR8O
z0Hg1g=&@TLmle$6A}?80@-pglTjzW8rN7&iPM6c)Q9z?k&ne8)17Y5!(74XlcAuAa
zZ{;Gk-G4DEL!<Uc)U_$79gKPnT*PPW*4HKK;uO>!jCxw5{wYy~DX18u?$)UHC8`fS
z)f2b<x|C5tjXEq*ucV;HF>1a>Ww_C{rt6UuR2HKqYSb`^+L(fR{WFT!3W++y%v*+3
z+wl_XW2`+K9tIYME@JjG+i6No{1H6a?`V+cd*W*o=#L4rvbwE6Y8jKF<r>yP5P#cb
z2{}BM!eATnSdhB5^CaYNDUd=1@tC=7<r4C{6iB88sWjRymzeLTVBYwY#}9n2ENZKk
zn5$DTTWw59TZ6<bNWuKX#$>mxk(e`5Fku^0+IFMFeBn;6`6V{S-FBPAyqba;Z(}@d
zzmk~8Q!s;VOl8}n67$0p%v+!6x_g$Id472NM2AGyG18uSS_#?mRKA*F(9pSP=)<jh
z7+!VQ*@gH2yRWY|AAJDa&cxe1WRW)AO`bSt%i6;t_NMGH&|}CB<9Z1e4|;OO??nIl
zi%Pd2%*oCB2v-URbH3SeFlXtGgE^x}lj~Y?Z&rCCXTy(#5T<J=@@fYA?FAIij@EtT
zS?YYZ`!MzOb8YmxqoIG!=+6P!(4MpYAy|UxOO96JT9oMd9=o)Xi$%}WW1&X|HiwJb
z9o@?|ew_sNw<+*j=Tlv5@Dv8uCxPoU_^}kQi@~QcIA?tuJM{@LsLKX&ksL##<YL`!
z9zv0C@6AKtJiFg%>CW&B?I0Efof=_8Gfw^v=3Ifsp+dCQAH7U%PSI{O+Ah$Jt?$aJ
zpTG|CRF?t!2|x7{TB1IiL)GkP{ZKX85J;^h>QtaimynLHFUX|kE>_j}hgFUL?2+ke
zhM9P~<@(P4)qM`teHJe7g6q3MpHnf)whE<>k&ewV1k@QESSmlf^)nfvgXd8o><G>K
z7~N?yf+hW^%+}$b)v@XX#mlvME5{+2zhxaAtr>IO-2ao-&)pwEVfUAc+C1p{e>7@C
zsEVfR59cOFZ8$2agYuvHsp(pd(#KFgv=e3@M_VmOvHFuk)K4D*mnI82+`cVy&f%7S
z3|%RA#L&rKeYpK{c5+xg`cMtGn^=%$xIGAz>FNQ5s<FZ9DsJPL9Iq_->W=T9mF)O=
z?9$i!yHs>cF8y)RS9j@(6qkN|kfQWga8sgGa!f9b{~vJanbPDaoyab^`@2+nOfKD8
z{MDm0HN~Zt4=74^f}0X0_c6IN=zqYa%`=mubl?EFG_1c%o?~+9+KFF1N+VKSTE;GY
z58RX}RUVT|Z%p{=F0Gl79Hr;przkC9m*^sr*8Iw)Z6oD;Tfu1itmJ1vV;Y{$Y;{qv
z1-#@_yu@<u4I!Xf4rhEj2PJ;PB{~xa^@NSnL;J*WujCGV;z*Yor8sl!!o~Y8PrPbp
zXZe22Rid*z!Nk=!-Jk?c7Cu2L&iS5SU=Ink*=1JNfpnSfx>+6337llkF4I%jYM_>m
zbFNe0?dxl$U7w!yvUC?NI7kVd294`LYNscz7(^h(scFZ`QkgE1P+Gak>=Pat!)!Q}
za&cevUSD5J`Th(bcan}d7tG5Mt5zaTuT7)o>1b`DvN)wZ2lcYfkG_C^!2DieHY?05
zC+1a&`H{kOD-6!*Y(0+yvmNK^eTkd)^ZJR3R2tHIO#Zs2n#B|lh)~(tz#-G5@E)dA
z-TZlQ*>s%*%PG~Zg*{bWvQ(rj71@>)i$!Fq$Z1IlJxn>U($a%d`&l}I`Zz497IBx@
zmK3X!C|N3TT2ev}Q?k@7EtU7T^uuIJrK(P)wk5@?BubV_otBi)!;~y-k(OSY(yvZM
z$(G#8lH0bVSd~P{lG|xX2|Y~7QkS%JZGTJKdXu8(QI<TmCB>>FN|roMOG@ZrN|v(c
zbB`F(-_nv~OO?t}rEN*EDv6S%N~a|y^e`n$rP9)!llw*Qop+Mzl&vgf+m;lI&$!7_
zHZ0}b+Me@xd%Qf+vNj>2Go%kq3h7}=X6vNcv-+Ejz^p7)nX63Z+9nl?MPxG9GC4HG
zq(XX_lF7}|<Rc~hYAUFva5rbYyih&_$cFY<%wy)A0tef=WXaWFJ1;b!?cqsfIkP)j
zMVXv<rV7V|u|Z<aSC~?T!J=(XDn4L3UEcyE1ME&$%u0Dx3USzp;ZIco<(YFNr#s#G
zC?a~#l@bL&RMbpFGmn+DW(^v<)>M@*i}Dp&Y`D^zOP(N&5LYf`Uc7QC?c_!S$k80#
zwt}i+y8M%pgYsyCn-9xPXJQ1;s1kmigilX_Z;)`TP@Rc4FkICNB~9~Q2wb3LgwvoD
z08a139!Zyojt1E!qQ`;k6geB_V+*F0LV|smQhOh!g!fP4vF>fnkZq^hCAR(xp6b1p
zoDB=W+`b83Cf<ZVLhk)1SLU!>%n}s(3|h>vXG5U|zcL>b0(ZV8VHSxDB{FOgu}D;;
zL`AlUSR^V@q7qv~ED~iaQMN527KzB61dj*pAhSqRszjxBDPobxtwe5HL@W|{l*nU?
zh()4GC91SVnnjye2{Z-9(uOCo2QNzqGw;rc=q5FP!AVHhR&Jhu$I#{!%WjX3r90ou
ziyaN8BkMNIyo#d&Rh->LX5N`c1r|vU-6dw;rxT9iA~y>v@6I;!b{`d(tu*GEd0UPO
zq$VScrDopsM+KHDpxey*`cZ+Z?LhJ-9u=rs(p_oh4LT~2+7+F>Gx6@*j#JfK+STm2
z3j>D4rn_!iF4`2R=l?-XP8*$>xc_ht&7#pU8gnFzG51=Sd4}owWeVtX3HomvRAjm~
zrGUPWpk+3w#B^n)fanw#E|uD#Y}56|_++2{B|*pApj^{+dWxnIgIUvi`*h{zm71>o
zDVh#TpPscrZqwD70{TRPeqn<=rt5bppid>}dK*+}y55zb(@odgNb$g-z+)c_bzEXY
zm#V%(15xfG=WO@}2BLMDf%*+)T{-J>U})Qmh`@IIiNr|KLZzt5cO;4m`zSc&hs^Rr
z$8?P~8fvloi4CCKyunDfZI)C1ym1C@8Xm%a-cI~=s1+1jlp^uHx6~A|Ba!`O&HWr|
zwYgi7Zrdr%{k15qxqO&2U2g$^i>sAJ+W}ePe}N%;t$n%n?wbUC4HCNEOrWc8dw>=Q
zj?_MK+m2MecD-foArbW+YrM!0U4fU7Zp$xs!w=VJq|?Of8}LdsPwxMj(FTW69~t#X
zqj#R2*63T^mQkybzFbWYL%FW$ex%!)rO~4FMjMpTq*?}!HMHMz82lC)+#(JB{;agM
zJo0yJ{0-XFo3Y6nDu%r2f&~q2-D_~*<K+jEMwp2_M&OHYAVz%8<`q@kbalWO$$pE=
zMNg;eL1gJb3s)-1EmAVamfWQzYox@qCBIOTYAN}Fk|=9*I}v9$a4kV6DL8pOO{Qs)
z!3ta1B(qht+V-@gCwICQKm`(ThzsFXBw<Z=^{kHTg99*Qc^1<zk$xaAu$Xr7*g}hG
zBV*GnW}PfvH>=~@M|1Nec&Ey63!I1yEtovFb-HF0TK-Ee7Ml_can|O#&fb$WpA>bv
zu2e<Q+Li#N&(j%a=@{-8NVB*TizFh1dSCb3-qD)R&T-eVI8ZiEXOfiBJ_>v5R2H*p
zh7)T5U;?qZ?rw?Yw$p8hvebr^O|s!>ho>mMc5gh^6|dT3x&{}>x$TtJm&@J4T(rlm
z>N1z@bvzvreXXz0aT5I4UUW`<&K5gi`+gKlbb~2RfPV!vyhxZ;d*j?ckm`za9FW=*
zXLr%{p4pyzd9_G1VQa*peCa^STP)=Zk6C#gDL+10`Dau*)$_9~<@;YfR?nMB`Qt~D
zJiluo<<EjC!*cU6DtEni2Ihd4r;6+=k*4_CpS2yP#;N*Q>T@)$P`h>N0=JL?y2X`P
zLZ)b~;o0Rx{T$64%a<$w+NN_MJefni%ds!_t5-M*d<yp}fM)sq*7U}nl%n7Oo)iz@
z$#bNuzbB6XDx<O*fV7^Bb$C+g^knW5>cl0OYj?`Jo34a`sPP@XM3pq!jp}!c7>2Ry
zc@}mi&b91j0JF7@BEWN<{g}v>E_woKZ_C=fB6`C19|MtaS*4zbP+e`dZIR*Yk_zmz
zfEAX4jM&67Gk_<Qy3i(1;Djpvq)Zp`RnDTFW%0M@k)^)fw29XH0L?2Hr=1BuoTKNO
zTAnJoa_&_!_}jb4m6M@7vFj5=FLM&I2S}+ahT35-0H1W2c>$za2VdwGj;CK9ep#Q-
z#n&E*#}3D<4%v%V?)HqS&~^`LD?eho3ZP(C`H|>k+^;%~Pkv7^N92&!lcCkDI^yUp
zOQE>a^#x!VFs?cZ_J+Zrs(gW4Oxwt<?Ikp1ZU;K?T`WhY>(5g2ki@CT_P&67&tRa$
zzX3s$RVxL5v=lr<3a(&j-hCuBNBoG1YZ_yz2@<n-?<B873eIK0ZqK+Kp7>5r-1R)O
z^X`|@!W8LalKopE8`@W&1ut%=bW4Jsmb+O>HxY5Uv4ad=3-j{MlcGE#vJ1A!o%<~2
zbS4&I$HH{Iek#XuSYiSK-SaX&OxG;%;C800^pnU@Eq3CT(>6P5{+B2ud5@9Q9I=3j
z>sH2crOS9jByW)vJi`LJ(l0SPZ>E$!lp<}H>^35*(!Zp%tn}?tdIu3*>F+Zy??Ndu
zh{#Gmg?)C!Y$S6!6HBor(vyQ)(vx$Z@9f`%3boEe2l8zEaTA@9$WkkY3cBQ5jFJ_n
zkgkr_1q8x&JU!THZO4Z=2(BW9xk6#`oS2J&+5Q=HCr<yH?(d0@k%WO_80Z}<4TMzT
z5zfLRsqm*%c;6mfSPm9k1EL^KkcRL!QLnpY?UBJzm6?i3cH*icnnZ8gN@F0t!}Hg-
zJcNA8z>NUU>AMhEph?#2VhwVbXyxBYQhpLNwHzKFb+sIx2!s7fVdBeGC94jb<wy8x
z?g6+BBg5G&>|EjQ9p-6$@}r`wZoaSD?ZFu+uCU4vNV{_OtnMs-0Z3t%zYxcETlubd
z`5v=;4_3#{@;^wyA9C)Q8-J=S9@rOOwkN)5SG?*4Am&^cXpF^9MY%b_#=yeZ2`D?K
zF&e0k4eNA0JVs6grt3kZ;L<ssjku4ranTE&ja7S|SWAI7T{V!U+_iWJF^6HQu32sw
z?$%bjakbador9SfbL5W1AWrW`niE}^t_Md4<1;5ViUGMhx(D%fKt}m~({-lwW&RVJ
zr2V%~HbfcbVsq_&bJ1S2YM+U7p=d#_8QTw%ZN~P-rxldPcg&4r6||q#8V(;(?Urr>
zeS!MZ9neD^_C?j7M#6im!*2VlUgU213)QV(v}K&vIjU>N`8#X(4HjbEs&Q@I(&ZAE
zf}_rbdaY$(DjWF25R`CeOE_BXg+iL2<E|=F<GK`|stR2v$tn!<pk*-TMmekS8T86^
zdiQQ_9{A*J9;by}c(^I<dKl_({}G=`f3wm*R_RX~a(7tzuZ4cowTjC{N13h%WGbou
zW4;B~bX|texa)38SD_2+<q_~3oQ%a?w<j}S025EPu|E#%8H}9*ESA`6Yl+1ayBxf>
z5^~jaZOf;~*bDne%h{)PsYy*etd}?w-pl>o!#R(;kN4zkYxiWddo~Wr%=x{@`F(pK
zedQ_bB+!bWs+E(IiEqF^nc9K*NTw!ZT9b?ZXCG5d*TFkHFN#m-btM*K?p2Gg{p4Si
zYtWfkjZJdLa*_x46S;61i#=vC5zF_eYe2IK=jD6~x%<yplvhC7hZB510+Q%MDcKjE
zhKzgpcHWR#34zRFZ%8CoAdB&iEubS?sbUBrRx}^DEr&^2kd*z-Qq~M*2==vKqhRBr
z7`LjVtGwJ_&C<LlNNSGw1rt{}V|nGf2%8&pdTHkElCp&?vlqWtn4Py(N~fkspOoxB
z5K)WY7D~&-?@lSbi-=zQe!#rEi=^mGA~Il|Snc?9o9@3K$NbJj74Jy6mLRL2z^qj1
zcG~ia3>VnskWXN4ZIs>a!V|Slp8FHGA_R^eg6ZFguf;R(eC6IvSMG7wZ&0S^rwBHJ
z-f}n?J*DOF%IJ`m!;7Nhdz*1S2QeP`kGsYo(Hn(>ujY}F&cuB$QeS!$x!z@M&8puc
zx3wE5Th7k^@SkXWz6KF(gJI`;6I8W*OP08Wwb>(OAGG$4kp1@O=*xV`yr>(ev|Be&
z3GNYhL!<0~eK{M-uzp$HmzFow?|{dwI)Dlx!0=Q6VY>cHIe<qeT&J?tj@C`Aa`)D0
zsJ}J&&V>tj{@ww9{{*UXN9#@nKi>%-4Y>4oCH%$yD!_CGe6s1fm&)Dd+0mWl$&wp4
zKcOn+NcadmOQBi^-`R-Q7j#v`{RQ2gjT3JkWr&K6g*OlZKT&~|<v>OJx&kaIR~6vf
z9Q#Wv#{FY%{PTtJk3Bm+&srFNIq~-M7z`WUjgFoC8h!t#BK~Pb{Nu9t`*ZtF*z0L|
z65rwI4Zq_tuPVs)nDqs@6{f!+zXB<v!i*LaRhZWml+Zn_Cv#3)huVn`@qG;E;fWvg
z#J}J{+*AF6oWtls3ff%Vxu_;aX!N8f=<DL@dm_Er_}+<$?~h>UegM_<jS%_Wl*ILD
zKjJ(ou(HrZU>}p3!l9Z;{8hzC{=mvfHBGTeh5ktmk+8%dU+6dZtBys*#l;hhlIFrS
z&Et$YM&Z@|VWR1p6^5t@g#*HvUs)v#!ze7Q!Cg(mxX`l(pGz*Eub7jGNm4v9AQLlm
zVkRXD3!7?iJ6t9hLmb9Xf-xv(7>h1nOinZ#qRCftjc-L@wgC^>iP<s<Ph<ktktmrw
zqrQIWFi|trnVI3t%yeXY`1Wb7BM-Nda!xI9FiGargAkIN4!J+p(1dS`Mxx|@(=gQ4
zFht=rzbKqpJf)Vh<8g?>>70}4X~mOk$3uh?T#shebZ8f&Ce&CPToDWVqQOw3QB&t@
zToLfYp5Hhf&4bp#A=?juFjfT`qsDTCY?UwUj~IoBi4hGk1#)3y#Mc<8Uu{I{L#wFS
z45$rE;8X-(uZ<YNMkDC24_v@%#T6mg_ca>+V8pk)K49RRxJJ!H>PLZyv3#|$JQTuL
zc~=;#g3&tT3@SLMDO6v7hAb^aC~RCELnNYg0mF}P(;A^h6qIP{l7UDh6fSP?HG6AV
z>72>3sH_h<QB5d@6WQ>8sFtcf`q7ue&-`+D>bl2fK#iwku7JiLYXvm^SSuj$+~<$6
z2F`h*fyT<0DpwepZOpAK$CvTY0~T>2rDKlgE;8O&5Dr`$3pCcOHo}2OtUeks{2`+;
z6ot8!Xm|AHU^swYiB7Q6S8vE-V!TlsY(}pNaxV<W8XKt>R`72~OevmGJhg-j*Q1+;
zsYv<a#g{HNWDu4I@TKy=>|~~|7D({qSIvz`tU=ZvgKWG}SX@#|Ga->ho;3!#s}ZQJ
z4b}w7Z?4CHI2vB~mbk2*91)n+yqA=F=2nz1@>W*NU9#ZP#f!WPFIjNuY#RFY_04MF
z45*LnMq}aSA!C~1tBD3z1`s4f4Q2iVe(IORPx+Gg$tifO7x0LjEA&s0ao4exah()d
z9ifFNh#7#=(Le*QN!&z}FcF37Ln~HKYN}fu38GEBY7Ow>+xcNsRvcq_@_@?F{9qK0
z_aDpOFkvs22D(ZEzTVIC%a_m~xS421n~lj=Rz}a<6VTBo1{;IXpszl-2Hmz;G{zdd
z@DP)41YLg>PwEzLl@GHxqNrpnwASzvUcwtgMlxrZSQ1`soN=bn5b|T1T^=y7Okrki
zpt%^iFsAKcqOf>cME@C9f5)=Au*|3rtPIpg&NXU%4Z-@=v!N$YKO4=|)I<Y~S3E3w
zSe}at{j>T1xM6~e&PEkz_2!;zbApX4ec_-FixUltBAW8q2J1rEOF|gr4hVDP6rz>E
zMVeQxoVaosJ{FOYGbt6crU!2%P-9$&j@QuOYxEm9lUX)Q{)PnowgW0Wqt*ZfRc{t3
zO}25yaL&r$>_xLV09LyhWzaNZf-$9Ja)~j~s0cI~^TMH3jmCNPfyU+TYXV^`qs5`{
ziu1Xib<t?k>`9YikwAD=V0p1Ep7d?PPx93?1t-O#!TQLgriN%J1ZtU@q{jTZ^~G~#
zD_r5u7?X7_{l-oq7!^<T^;J^(>At>Z&^tT(`fdaL-m`su?Vx`K?FRiFHXrD;FB^Lz
z251g$E_gtfz0ucK2YM1NeQp3P!$TljKwCg}f_?y+0DT9KPh{hQ@u#?MDgiwcdzh7=
z4}dm-ZpKFLZJ<4%t)QQPc7aylxjTAp!i#I#{6Rvz0$K`s;=6r)OF<)`&7i*qy$!S%
zv=ub#1NaR(4U~RVsTtG&{Vu2*^jXkq&@I?+UI%)^p}xM&pdOsyw}W<pCO{LQ{3A(-
z)d@0yB4f>Bk<px+F=qIX>`jPG0r+F_cc`PU?-jz#9hv*x%=se**R*71J`51?a1f>p
zf4_VZJ|iX;E-R8rDAMD{-{)wHcc9mjuV~52+LmD{B0V?ZZ#&w`18U(i=QD=n_v7zh
zkiU)XR!g4s#!v!~oOY+~+uhgqbx=#5c^Q`{KlbCV8}cF8WU}OI;K$CvNq(FFKeyv9
z#U*GX3zxMpOWUD-_AdGg@AT=oWc{8iLw+oQJpOuL-#JO;&$7x#A-@HE>jGdc{cEHP
zRQ?voAA!8lDbEZtfaH%to`GG1Pe85mS%=s@)pHN{ui{ya2c7aOP|qJE$NLE6K0N($
zW0IW>mL2*ntmo6n$<BF@PsbSe&{=+2%b=UGGIz_`$o}D{et|~{7dmm7WtN@wkdK2r
z*D1ePYM}c50`ij}zuYO$yu>R16yz0<A5V64dk_`}mEQ;c1sIcSh@s_~F{}I_w9`)^
zpW~D_+x80}e;o2<PI=~1t9%*cA3{FEDZkb(?}vOS#{N8~e65sI9M?mB4&?cumVY6N
zW7a+%L|{`r?uUE~=8DNkTR2MNAo&ZBPk_9P7`mR<+kWhayb^NS;<e-pZTav)LYxoz
ze5d@|wtNEQKFI&>>{nT@TK&EPa^pSBQO@$2udxF1uO9Ml$a!;2`<L0q@&d?jf_&Zk
z*ixbLm)+$0?qKgocgv8QGrymOZOEk~Y0jbf>QUelz)^GSgZM9ye(<>;X~F+z6kAAq
z=xU|rHJ_S@4jqf(GAnKv=H?XbVbkHcz|HidFZx21sLpib)BerAc9+c4t~ni>GTlUk
zT7pH9GIShs@jG~m(mgHbfv(3#`(@Qkb=$S7gkFbeYNq3Bl89YH+?>+<CY9IgPqo6)
zl0b*Hk9&4ppAv;foH(!JazN!1gWX*2|JH~89l}(5kE?chM$wlP{imWID?0c%>F<e(
zo~7svMHeXgbwz6xU8(4I75#~#zf$yZMW0dhB}M<K=*Nl<9<J;wdX}Oy6kVX`*A=Z*
zbfu!Y-gT<q;dDfZ=Waju!ui;eD#EW4jx(kePcNR#=aD6|rc5)676<%>#}}3SiPOdj
zQ5;#_5cMqwjfN$y(<#30Z4$+}APyAIowsme)VD(ARy4+nm&bzm=|o)M6jQFw7pW7)
z{?(1pCuuY+CAgA~VDl&`<Ap37sP|C;m1?SwiekDn5yjC!Gd^pPfh^?rMSY?;Q0J`;
z`x*k?IzP(loG7k|hQbjTl;r9f2%!@f4mB_sf*O1jvu>$pc?4)WN5{pz4&Ya$V^E&P
zf*yw=(~TSs#i04xdFXMYd1oR)V^WtNo9RXl>yTm?oU=bLH76akhB?chm+59Xom0?J
zAc^&vTD}Yt+(Zg#`_-9laA^KwB1$9?+`o{{R(VY>L4xKiU0%;?n{;_KVbQ!tb=T$f
z{6goL1kvU7JW1!-bm)0e=A;gtCi+eM(VV2q>v<@lCR(Z=9om15)ASn1XpYl)J<pY>
ziPBAlaisb$xkAvn5XDrN*Yj$VDqpRD+J0*N*Qb=<HN?#&cB%4uUZ?W3_N12oK6n&Y
zy{_nW=?1m1=r{wy2l?te;>bA5Zyx4m@n%)NJEg<w@_IeIC8fOn{oHkGq1-uz)mb09
zoMzmfQeLm49zLlTVx$Hts_kp`W+cc(9si|jK3uBGw-41wNp<`*cMB5E^7{8?_3zM5
zOH+Sc{_&LZl^3~LRC$p^>`QSt)qXo*WJ0g&bi;*@N=Z`Vm0G?NFlTxC1ur^skForV
zz&gtts=T4fr`k)+{}tt_?`i+_x~zXUn6828(EU@F*Y(;9nbW@hoygY9xR}-7WzO#Z
zTF;v)<@I?{yDHDhbzx9lPSbZ%%P+2Gbo*i^)_IW3+2wWq01~82*S}kp?^5NnRsA(y
z*IVa5LV_SAnxUvJKOu#$)oD5ka&6lQ9dx<Oec@yxt;=aP<@MOnJfb&TDRT)m08c@-
z7=Jo$dS0R7HIV(Lt7ZAkY95oh{`NJ0Aeu$qX1e={=yy!@I_VT<!4tjyCG!Va=i|wI
zmvtVV%pWZDI+V;GBJ}!_%+D5j-ALvS6?#2L<`1*Z$CLTTS=SZG{NZBrFt?M%=`1nA
z`n__eFblJ$o=21UIYQ5m$^2aW?zY>>;&c{%kyp=WP9c8f)&3oEhagKFk9nus!I8-<
zTy5%k#VO1ZdH8+tf$%3<zZaW~|C-R_JDHy^+SRyA=G(s~oXkJjndP(`2P4G&!`(?T
z8efCN2YpFQWX9qzV}Lk}XMX=Ulrq169LmAZL_6+Eu}i}*1HXrSSk+69TN+lx7nfHf
zWsk>Zq)Cr^gqzsbk<S11K#>30De?aS^GAu2OWl?ZO5BaVi~-v5H{g@qT(#~JmX00x
z8z^pn0zU&+rK#=sHWeBmo`=%#haeys1N0yI=COfUJGDQI0iS+fIyWV5XM#T({{L&V
zn=SF4C{hy@|LMt+{~=3-n9Y2Zaga+Co_@uX4n3!S6MV9puk7m6fjWi1TiNw+DIvm&
zZ_aQthxfveTCMo{{7S*aI>o<gg2b!S9`QrwYm`lY2|oEVHpQPu(%@eN->{2Ysn^r+
zKS-k|3jv`xzd75@I(Y93spFV$skIVkrQy#`!><6J>Sd(VYe^b>BlG*ulTqf65dNud
z_J#LCk!n%;OO*ax<*AuQ|J_Q@*#%M$?}4MtgK6*`Y4~q~PxW%A)axCkzgYE?Wy&AA
zk3w<2L-mJ7#XlK*(o>eAXQsku&6Ixf-YLq=Cw?Y&x74`S^}R^x@ucXf0so8v@bnAB
zWVa#(Pxp1mPx|fx9lXa5+QN8yozmmwhYCM*qMP7?_v}IMQT$P=y)I`d?nyCUWgO%w
z)}Jr-EUsp**8d#y?e#+Q-vA%0W54;NH;sO}7^M1Erqp*R_*gaj9cM6qgqSnZ%`(3C
zgVZF&&rtQ!@t>YX&qd&q-6bh@y=m~_H2e+BA0gV$lm2wGS3+zg{h2wUI%S@}UEz&W
zB%bdHq0AP>A1~-fh@Hn{3Qs?COb1<m((yF&W!m}qvcf->>t;6J`$DRl^rOB@Q|g;g
z{Df*>eZG1`={Z-;=Qk=N*|@N#ez<D7)X(=eAUjd<KT!NR3V$~EWcR8RyR#I(I)#5R
z>rsUr#G8g6ReI<vYjkM;*D8KfivF7vpT33W#Qy~RT%5ObsW`9@ANMHyx)ePRGe2MK
zN*Nd1z#pjJ{zd7zA;s=%Y4GnW{X0|I<rBt_5S!GxFj6@>3>S>#e?sZk{^x){kUwXF
zPwlcPrCt*h|F#rAOBH`}3Vt5zA0c+Ad3ByLT&ei2Q{2p1sQBMhdbXtKuTl7iOC(<R
zw{I!_7**fNO3!zco&zcI`7z@a$U*K^di?5qkMAu*{-DzHkn$&2m3cgkp63+4UEwiI
z_;@u9{ypZ85EaV*GKK#{@$XXn`HG*7>%W2eVGj5MwPOMJ)Q-DS{GXTxe*yDHh{@yJ
zY?1F3BeggUzE0uk_toj(duX5yiXS`MO>DvUq(Rp(zyEdFI`GNQ?v%LwK<RmIteeyb
z+TY{@4-?fpURfe)!qG@HhF?^#!Mi->FS&HFw_@StOYkNWueU5IchP)@$RF~qs1Gg2
zgGs#U?e)c)1va#s>I2b$zj*qrX)^^0@aR!9<!0e!A+&FfCqo-nYdhtal(Crwi{~yX
z_v##((P`TX@36BK>!}fP_`-@y=gqC~Ub<kx<>gDfOXkk2C@05h2jA-phkdKP0la1<
ziGgR5!gz;D*%g<}U9@n%pbdHXQWH_r81;tycq<FNvc-!WJvif~aM7bll3nX|>o=Xy
zTTRqUQEDPFFW)J^4G1qkV<@a=U_~(EjWjM7$_TwQ#j75n<Y#0Bzr7_{?FxLG3EyKv
zJ@6GKa$mk4M!xUGhR%d2>E$!ti&lCU2Ug(8+dz1Jy)P08z-_!m#x6om(<^1XxW!@V
zm9L$#K{M5hVr&i-qxab)b8u5-IlXzt`68KQgp7*SWOy@8s7XXudhu*kqk5f9aslc?
zv1q{SgKj+X3}5lOoMcG97tth(BXyxwUOav$s4W@;b-wyQgO@$<@&m*OdIKuY?Z9m-
zpRbJwbhh;Hnz}&EHPjDqtI11E8Fk{D@SG(*D2E2C4dI<TkVdWvHhF6%3z!JiK=CTw
z=n;Mw5Z=oZZ1m!3DZdCd;+ZM*H|+;%9;{vM)sMfir^!`}p?gQvJA{%YUazA+t5@?R
zBP_K^uM4s<Q&sn`!#jThVd};hu<9*FHj4TK_j4~l8S9NTqSAOgTJRf#l5zY@eKM=2
z!H>6{(aVGUk&qY9bK|jTo<6AQ@ths*wW--h&jkGLAsS|UbB<nCl&q158x2_*_Y(E~
zqht{JDZ(rth?EVA2Qa<#6#$uw#PFCeHA<K}kM%U3HyWuCje&sQ8xD|ter^$U<<~SN
zn^131N@mi;!abh4SRI~pq#lByNqk*f+!9nX1VETCERcTim!9lX?{P}jj>jAEx~K*O
zko#IcA-}UJihu+fVQnRKVy-h!jq=@5$z^y10mc65fhd^>p1_mv{A99sc}@K_f}&}M
zgjp2kfu389MQB9JH&P{Aq!}rhLprEoFjvr&EZ<m_46z>ZmqSOrOe-1V4~8)X)nhue
zA4rv5$Q$5WvFc4%$)#XaHcS7>N3IQep5WoGU+tBwgWr%PI~+g1k73VU7k8DzKB}5G
PR9hPfMD=Q;NB{o@pOHv+

literal 0
HcmV?d00001

-- 
1.7.5.1



* [PATCH 3/5] x86-64: Remove kernel.vsyscall64 sysctl
  2011-05-27 17:38 [PATCH 0/5] x86-64: Remove syscall instructions at fixed addresses Andy Lutomirski
  2011-05-27 17:38 ` [PATCH 1/5] x86-64: Fix alignment of jiffies variable Andy Lutomirski
  2011-05-27 17:38 ` [PATCH 2/5] x86-64: Give vvars their own page Andy Lutomirski
@ 2011-05-27 17:38 ` Andy Lutomirski
  2011-05-27 17:38 ` [PATCH 4/5] x86-64: Replace vsyscall gettimeofday fallback with int 0xcc Andy Lutomirski
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 28+ messages in thread
From: Andy Lutomirski @ 2011-05-27 17:38 UTC
  To: Thomas Gleixner, Ingo Molnar, x86; +Cc: linux-kernel, Andy Lutomirski

The kernel.vsyscall64 sysctl is unnecessary overhead in code that's
supposed to be highly optimized.  Removing it allows us to remove one
of the two syscall instructions in the vsyscall page.

The only sensible use for it is for UML users, and it doesn't fully
address inconsistent vsyscall results on UML.  The real fix for UML
is to stop using vsyscalls entirely.

Signed-off-by: Andy Lutomirski <luto@mit.edu>
---
 arch/x86/include/asm/vgtod.h   |    1 -
 arch/x86/kernel/vsyscall_64.c  |   34 +------------------------
 arch/x86/vdso/vclock_gettime.c |   55 +++++++++++++++------------------------
 3 files changed, 22 insertions(+), 68 deletions(-)

diff --git a/arch/x86/include/asm/vgtod.h b/arch/x86/include/asm/vgtod.h
index 646b4c1..aa5add8 100644
--- a/arch/x86/include/asm/vgtod.h
+++ b/arch/x86/include/asm/vgtod.h
@@ -11,7 +11,6 @@ struct vsyscall_gtod_data {
 	time_t		wall_time_sec;
 	u32		wall_time_nsec;
 
-	int		sysctl_enabled;
 	struct timezone sys_tz;
 	struct { /* extract of a clocksource struct */
 		cycle_t (*vread)(void);
diff --git a/arch/x86/kernel/vsyscall_64.c b/arch/x86/kernel/vsyscall_64.c
index ee22180..3e8dac7 100644
--- a/arch/x86/kernel/vsyscall_64.c
+++ b/arch/x86/kernel/vsyscall_64.c
@@ -53,7 +53,6 @@ DEFINE_VVAR(int, vgetcpu_mode);
 DEFINE_VVAR(struct vsyscall_gtod_data, vsyscall_gtod_data) =
 {
 	.lock = SEQLOCK_UNLOCKED,
-	.sysctl_enabled = 1,
 };
 
 void update_vsyscall_tz(void)
@@ -103,15 +102,6 @@ static __always_inline int gettimeofday(struct timeval *tv, struct timezone *tz)
 	return ret;
 }
 
-static __always_inline long time_syscall(long *t)
-{
-	long secs;
-	asm volatile("syscall"
-		: "=a" (secs)
-		: "0" (__NR_time),"D" (t) : __syscall_clobber);
-	return secs;
-}
-
 static __always_inline void do_vgettimeofday(struct timeval * tv)
 {
 	cycle_t now, base, mask, cycle_delta;
@@ -122,8 +112,7 @@ static __always_inline void do_vgettimeofday(struct timeval * tv)
 		seq = read_seqbegin(&VVAR(vsyscall_gtod_data).lock);
 
 		vread = VVAR(vsyscall_gtod_data).clock.vread;
-		if (unlikely(!VVAR(vsyscall_gtod_data).sysctl_enabled ||
-			     !vread)) {
+		if (unlikely(!vread)) {
 			gettimeofday(tv,NULL);
 			return;
 		}
@@ -165,8 +154,6 @@ time_t __vsyscall(1) vtime(time_t *t)
 {
 	unsigned seq;
 	time_t result;
-	if (unlikely(!VVAR(vsyscall_gtod_data).sysctl_enabled))
-		return time_syscall(t);
 
 	do {
 		seq = read_seqbegin(&VVAR(vsyscall_gtod_data).lock);
@@ -227,22 +214,6 @@ static long __vsyscall(3) venosys_1(void)
 	return -ENOSYS;
 }
 
-#ifdef CONFIG_SYSCTL
-static ctl_table kernel_table2[] = {
-	{ .procname = "vsyscall64",
-	  .data = &vsyscall_gtod_data.sysctl_enabled, .maxlen = sizeof(int),
-	  .mode = 0644,
-	  .proc_handler = proc_dointvec },
-	{}
-};
-
-static ctl_table kernel_root_table2[] = {
-	{ .procname = "kernel", .mode = 0555,
-	  .child = kernel_table2 },
-	{}
-};
-#endif
-
 /* Assume __initcall executes before all user space. Hopefully kmod
    doesn't violate that. We'll find out if it does. */
 static void __cpuinit vsyscall_set_cpu(int cpu)
@@ -301,9 +272,6 @@ static int __init vsyscall_init(void)
 	BUG_ON((unsigned long) &vtime != VSYSCALL_ADDR(__NR_vtime));
 	BUG_ON((VSYSCALL_ADDR(0) != __fix_to_virt(VSYSCALL_FIRST_PAGE)));
 	BUG_ON((unsigned long) &vgetcpu != VSYSCALL_ADDR(__NR_vgetcpu));
-#ifdef CONFIG_SYSCTL
-	register_sysctl_table(kernel_root_table2);
-#endif
 	on_each_cpu(cpu_vsyscall_init, NULL, 1);
 	/* notifier priority > KVM */
 	hotcpu_notifier(cpu_vsyscall_notifier, 30);
diff --git a/arch/x86/vdso/vclock_gettime.c b/arch/x86/vdso/vclock_gettime.c
index a724905..cf54813 100644
--- a/arch/x86/vdso/vclock_gettime.c
+++ b/arch/x86/vdso/vclock_gettime.c
@@ -116,21 +116,21 @@ notrace static noinline int do_monotonic_coarse(struct timespec *ts)
 
 notrace int __vdso_clock_gettime(clockid_t clock, struct timespec *ts)
 {
-	if (likely(gtod->sysctl_enabled))
-		switch (clock) {
-		case CLOCK_REALTIME:
-			if (likely(gtod->clock.vread))
-				return do_realtime(ts);
-			break;
-		case CLOCK_MONOTONIC:
-			if (likely(gtod->clock.vread))
-				return do_monotonic(ts);
-			break;
-		case CLOCK_REALTIME_COARSE:
-			return do_realtime_coarse(ts);
-		case CLOCK_MONOTONIC_COARSE:
-			return do_monotonic_coarse(ts);
-		}
+	switch (clock) {
+	case CLOCK_REALTIME:
+		if (likely(gtod->clock.vread))
+			return do_realtime(ts);
+		break;
+	case CLOCK_MONOTONIC:
+		if (likely(gtod->clock.vread))
+			return do_monotonic(ts);
+		break;
+	case CLOCK_REALTIME_COARSE:
+		return do_realtime_coarse(ts);
+	case CLOCK_MONOTONIC_COARSE:
+		return do_monotonic_coarse(ts);
+	}
+
 	return vdso_fallback_gettime(clock, ts);
 }
 int clock_gettime(clockid_t, struct timespec *)
@@ -139,7 +139,7 @@ int clock_gettime(clockid_t, struct timespec *)
 notrace int __vdso_gettimeofday(struct timeval *tv, struct timezone *tz)
 {
 	long ret;
-	if (likely(gtod->sysctl_enabled && gtod->clock.vread)) {
+	if (likely(gtod->clock.vread)) {
 		if (likely(tv != NULL)) {
 			BUILD_BUG_ON(offsetof(struct timeval, tv_usec) !=
 				     offsetof(struct timespec, tv_nsec) ||
@@ -161,27 +161,14 @@ notrace int __vdso_gettimeofday(struct timeval *tv, struct timezone *tz)
 int gettimeofday(struct timeval *, struct timezone *)
 	__attribute__((weak, alias("__vdso_gettimeofday")));
 
-/* This will break when the xtime seconds get inaccurate, but that is
- * unlikely */
-
-static __always_inline long time_syscall(long *t)
-{
-	long secs;
-	asm volatile("syscall"
-		     : "=a" (secs)
-		     : "0" (__NR_time), "D" (t) : "cc", "r11", "cx", "memory");
-	return secs;
-}
-
+/*
+ * This will break when the xtime seconds get inaccurate, but that is
+ * unlikely
+ */
 notrace time_t __vdso_time(time_t *t)
 {
-	time_t result;
-
-	if (unlikely(!VVAR(vsyscall_gtod_data).sysctl_enabled))
-		return time_syscall(t);
-
 	/* This is atomic on x86_64 so we don't need any locks. */
-	result = ACCESS_ONCE(VVAR(vsyscall_gtod_data).wall_time_sec);
+	time_t result = ACCESS_ONCE(VVAR(vsyscall_gtod_data).wall_time_sec);
 
 	if (t)
 		*t = result;
-- 
1.7.5.1



* [PATCH 4/5] x86-64: Replace vsyscall gettimeofday fallback with int 0xcc
  2011-05-27 17:38 [PATCH 0/5] x86-64: Remove syscall instructions at fixed addresses Andy Lutomirski
                   ` (2 preceding siblings ...)
  2011-05-27 17:38 ` [PATCH 3/5] x86-64: Remove kernel.vsyscall64 sysctl Andy Lutomirski
@ 2011-05-27 17:38 ` Andy Lutomirski
  2011-05-29 19:10   ` Ingo Molnar
  2011-05-29 19:49   ` Jesper Juhl
  2011-05-27 17:38 ` [PATCH 5/5] x86-64: Map the HPET NX Andy Lutomirski
  2011-05-29 19:19 ` [PATCH 0/5] x86-64: Remove syscall instructions at fixed addresses Ingo Molnar
  5 siblings, 2 replies; 28+ messages in thread
From: Andy Lutomirski @ 2011-05-27 17:38 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, x86; +Cc: linux-kernel, Andy Lutomirski

Now the only way to issue a syscall with side effects through the
vsyscall page is to call a misaligned instruction.  I haven't
checked for that.

Signed-off-by: Andy Lutomirski <luto@mit.edu>
---
 arch/x86/include/asm/traps.h    |    4 +++
 arch/x86/include/asm/vsyscall.h |    6 +++++
 arch/x86/kernel/entry_64.S      |    2 +
 arch/x86/kernel/traps.c         |    4 +++
 arch/x86/kernel/vsyscall_64.c   |   47 ++++++++++++++++++++++++++++++++++-----
 5 files changed, 57 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
index 0310da6..7eae1e4 100644
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -1,6 +1,8 @@
 #ifndef _ASM_X86_TRAPS_H
 #define _ASM_X86_TRAPS_H
 
+#include <linux/kprobes.h>
+
 #include <asm/debugreg.h>
 #include <asm/siginfo.h>			/* TRAP_TRACE, ... */
 
@@ -38,6 +40,7 @@ asmlinkage void alignment_check(void);
 asmlinkage void machine_check(void);
 #endif /* CONFIG_X86_MCE */
 asmlinkage void simd_coprocessor_error(void);
+asmlinkage void intcc(void);
 
 dotraplinkage void do_divide_error(struct pt_regs *, long);
 dotraplinkage void do_debug(struct pt_regs *, long);
@@ -64,6 +67,7 @@ dotraplinkage void do_alignment_check(struct pt_regs *, long);
 dotraplinkage void do_machine_check(struct pt_regs *, long);
 #endif
 dotraplinkage void do_simd_coprocessor_error(struct pt_regs *, long);
+dotraplinkage void do_intcc(struct pt_regs *, long);
 #ifdef CONFIG_X86_32
 dotraplinkage void do_iret_error(struct pt_regs *, long);
 #endif
diff --git a/arch/x86/include/asm/vsyscall.h b/arch/x86/include/asm/vsyscall.h
index d555973..293ae08 100644
--- a/arch/x86/include/asm/vsyscall.h
+++ b/arch/x86/include/asm/vsyscall.h
@@ -31,6 +31,12 @@ extern struct timezone sys_tz;
 
 extern void map_vsyscall(void);
 
+/* Emulation */
+static inline bool in_vsyscall_page(unsigned long addr)
+{
+	return (addr & ~(PAGE_SIZE - 1)) == VSYSCALL_START;
+}
+
 #endif /* __KERNEL__ */
 
 #endif /* _ASM_X86_VSYSCALL_H */
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 8a445a0..8e12f50 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -1121,6 +1121,8 @@ zeroentry spurious_interrupt_bug do_spurious_interrupt_bug
 zeroentry coprocessor_error do_coprocessor_error
 errorentry alignment_check do_alignment_check
 zeroentry simd_coprocessor_error do_simd_coprocessor_error
+zeroentry intcc do_intcc
+
 
 	/* Reload gs selector with exception handling */
 	/* edi:  new selector */
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index b9b6716..d34894e 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -872,6 +872,10 @@ void __init trap_init(void)
 	set_bit(SYSCALL_VECTOR, used_vectors);
 #endif
 
+	set_system_intr_gate(0xCC, &intcc);
+	set_bit(0xCC, used_vectors);
+	printk(KERN_ERR "intcc gate isntalled\n");
+
 	/*
 	 * Should be a barrier for any external CPU state:
 	 */
diff --git a/arch/x86/kernel/vsyscall_64.c b/arch/x86/kernel/vsyscall_64.c
index 3e8dac7..6135a28 100644
--- a/arch/x86/kernel/vsyscall_64.c
+++ b/arch/x86/kernel/vsyscall_64.c
@@ -32,6 +32,7 @@
 #include <linux/cpu.h>
 #include <linux/smp.h>
 #include <linux/notifier.h>
+#include <linux/syscalls.h>
 
 #include <asm/vsyscall.h>
 #include <asm/pgtable.h>
@@ -44,6 +45,7 @@
 #include <asm/desc.h>
 #include <asm/topology.h>
 #include <asm/vgtod.h>
+#include <asm/traps.h>
 
 #define __vsyscall(nr) \
 		__attribute__ ((unused, __section__(".vsyscall_" #nr))) notrace
@@ -92,13 +94,11 @@ static __always_inline void do_get_tz(struct timezone * tz)
 	*tz = VVAR(vsyscall_gtod_data).sys_tz;
 }
 
-static __always_inline int gettimeofday(struct timeval *tv, struct timezone *tz)
+static __always_inline int fallback_gettimeofday(struct timeval *tv)
 {
 	int ret;
-	asm volatile("syscall"
-		: "=a" (ret)
-		: "0" (__NR_gettimeofday),"D" (tv),"S" (tz)
-		: __syscall_clobber );
+	/* Invoke do_intcc. */
+	asm volatile("int $0xcc" : "=a" (ret) : "D" (tv));
 	return ret;
 }
 
@@ -113,7 +113,7 @@ static __always_inline void do_vgettimeofday(struct timeval * tv)
 
 		vread = VVAR(vsyscall_gtod_data).clock.vread;
 		if (unlikely(!vread)) {
-			gettimeofday(tv,NULL);
+			fallback_gettimeofday(tv);
 			return;
 		}
 
@@ -236,6 +236,41 @@ static void __cpuinit vsyscall_set_cpu(int cpu)
 	write_gdt_entry(get_cpu_gdt_table(cpu), GDT_ENTRY_PER_CPU, &d, DESCTYPE_S);
 }
 
+void dotraplinkage do_intcc(struct pt_regs *regs, long error_code)
+{
+	/* Kernel code must never get here. */
+	if (!user_mode(regs))
+		BUG();
+
+	local_irq_enable();
+
+	if (!in_vsyscall_page(regs->ip)) {
+		struct task_struct *tsk = current;
+		if (show_unhandled_signals && unhandled_signal(tsk, SIGSEGV) &&
+		    printk_ratelimit()) {
+			printk(KERN_INFO
+			       "%s[%d] illegal int $0xCC ip:%lx sp:%lx",
+			       tsk->comm, task_pid_nr(tsk),
+			       regs->ip, regs->sp);
+			print_vma_addr(" in ", regs->ip);
+			printk("\n");
+		}
+
+		force_sig(SIGSEGV, current);
+		return;
+	}
+
+	if (current->seccomp.mode) {
+		do_exit(SIGKILL);
+		return;
+	}
+
+	regs->ax = sys_gettimeofday((struct timeval __user *)regs->di, NULL);
+
+	local_irq_disable();
+	return;
+}
+
 static void __cpuinit cpu_vsyscall_init(void *arg)
 {
 	/* preemption should be already off */
-- 
1.7.5.1



* [PATCH 5/5] x86-64: Map the HPET NX
  2011-05-27 17:38 [PATCH 0/5] x86-64: Remove syscall instructions at fixed addresses Andy Lutomirski
                   ` (3 preceding siblings ...)
  2011-05-27 17:38 ` [PATCH 4/5] x86-64: Replace vsyscall gettimeofday fallback with int 0xcc Andy Lutomirski
@ 2011-05-27 17:38 ` Andy Lutomirski
  2011-05-29 19:19 ` [PATCH 0/5] x86-64: Remove syscall instructions at fixed addresses Ingo Molnar
  5 siblings, 0 replies; 28+ messages in thread
From: Andy Lutomirski @ 2011-05-27 17:38 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, x86; +Cc: linux-kernel, Andy Lutomirski

Currently the HPET mapping is a user-accessible syscall instruction
at a fixed address some of the time.  A sufficiently determined
hacker might be able to guess when.

Signed-off-by: Andy Lutomirski <luto@mit.edu>
---
 arch/x86/include/asm/pgtable_types.h |    4 ++--
 arch/x86/kernel/hpet.c               |    2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index 6a29aed6..013286a 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -107,8 +107,8 @@
 #define __PAGE_KERNEL_NOCACHE		(__PAGE_KERNEL | _PAGE_PCD | _PAGE_PWT)
 #define __PAGE_KERNEL_UC_MINUS		(__PAGE_KERNEL | _PAGE_PCD)
 #define __PAGE_KERNEL_VSYSCALL		(__PAGE_KERNEL_RX | _PAGE_USER)
-#define __PAGE_KERNEL_VSYSCALL_NOCACHE	(__PAGE_KERNEL_VSYSCALL | _PAGE_PCD | _PAGE_PWT)
 #define __PAGE_KERNEL_VVAR		(__PAGE_KERNEL_RO | _PAGE_USER)
+#define __PAGE_KERNEL_VVAR_NOCACHE	(__PAGE_KERNEL_VVAR | _PAGE_PCD | _PAGE_PWT)
 #define __PAGE_KERNEL_LARGE		(__PAGE_KERNEL | _PAGE_PSE)
 #define __PAGE_KERNEL_LARGE_NOCACHE	(__PAGE_KERNEL | _PAGE_CACHE_UC | _PAGE_PSE)
 #define __PAGE_KERNEL_LARGE_EXEC	(__PAGE_KERNEL_EXEC | _PAGE_PSE)
@@ -130,8 +130,8 @@
 #define PAGE_KERNEL_LARGE_NOCACHE	__pgprot(__PAGE_KERNEL_LARGE_NOCACHE)
 #define PAGE_KERNEL_LARGE_EXEC		__pgprot(__PAGE_KERNEL_LARGE_EXEC)
 #define PAGE_KERNEL_VSYSCALL		__pgprot(__PAGE_KERNEL_VSYSCALL)
-#define PAGE_KERNEL_VSYSCALL_NOCACHE	__pgprot(__PAGE_KERNEL_VSYSCALL_NOCACHE)
 #define PAGE_KERNEL_VVAR		__pgprot(__PAGE_KERNEL_VVAR)
+#define PAGE_KERNEL_VVAR_NOCACHE	__pgprot(__PAGE_KERNEL_VVAR_NOCACHE)
 
 #define PAGE_KERNEL_IO			__pgprot(__PAGE_KERNEL_IO)
 #define PAGE_KERNEL_IO_NOCACHE		__pgprot(__PAGE_KERNEL_IO_NOCACHE)
diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
index bfe8f72..bf71830 100644
--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -71,7 +71,7 @@ static inline void hpet_set_mapping(void)
 {
 	hpet_virt_address = ioremap_nocache(hpet_address, HPET_MMAP_SIZE);
 #ifdef CONFIG_X86_64
-	__set_fixmap(VSYSCALL_HPET, hpet_address, PAGE_KERNEL_VSYSCALL_NOCACHE);
+	__set_fixmap(VSYSCALL_HPET, hpet_address, PAGE_KERNEL_VVAR_NOCACHE);
 #endif
 }
 
-- 
1.7.5.1



* Re: [PATCH 4/5] x86-64: Replace vsyscall gettimeofday fallback with int 0xcc
  2011-05-27 17:38 ` [PATCH 4/5] x86-64: Replace vsyscall gettimeofday fallback with int 0xcc Andy Lutomirski
@ 2011-05-29 19:10   ` Ingo Molnar
  2011-05-29 19:23     ` Andrew Lutomirski
  2011-05-29 20:26     ` Borislav Petkov
  2011-05-29 19:49   ` Jesper Juhl
  1 sibling, 2 replies; 28+ messages in thread
From: Ingo Molnar @ 2011-05-29 19:10 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: Thomas Gleixner, x86, linux-kernel


* Andy Lutomirski <luto@MIT.EDU> wrote:

> --- a/arch/x86/kernel/entry_64.S
> +++ b/arch/x86/kernel/entry_64.S
> @@ -1121,6 +1121,8 @@ zeroentry spurious_interrupt_bug do_spurious_interrupt_bug
>  zeroentry coprocessor_error do_coprocessor_error
>  errorentry alignment_check do_alignment_check
>  zeroentry simd_coprocessor_error do_simd_coprocessor_error
> +zeroentry intcc do_intcc
> +
>  
>  	/* Reload gs selector with exception handling */
>  	/* edi:  new selector */
> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c

I forgot to reply to your prior question about zeroentry vs. 
paranoidzeroentry.

That distinction is an undocumented x86-64-ism.

Background:

The SWAPGS instruction is rather fragile: it must nest perfectly and 
only in single depth, it should only be used if entering from user 
mode to kernel mode and then when returning to user-space, and 
precisely so. If we mess that up even slightly, we crash.

So when we have a secondary entry, already in kernel mode, we *must 
not* use SWAPGS blindly - nor must we forget doing a SWAPGS when it's 
not switched/swapped yet.

Now, there's a secondary complication: there's a cheap way to test 
which mode the CPU is in and an expensive way.

The cheap way is to pick this info off the entry frame on the kernel 
stack, from the CS of the ptregs area of the kernel stack:

        xorl %ebx,%ebx
        testl $3,CS+8(%rsp)
        je error_kernelspace
        SWAPGS

The expensive (paranoid) way is to read back the MSR_GS_BASE value 
(which is what SWAPGS modifies):

        movl $1,%ebx
        movl $MSR_GS_BASE,%ecx
        rdmsr
        testl %edx,%edx
        js 1f   /* negative -> in kernel */
        SWAPGS
        xorl %ebx,%ebx
1:      ret


and the whole paranoid/non-paranoid macro complexity is about whether 
to suffer that RDMSR cost.

If we are at an interrupt or user-trap/gate-alike boundary then we 
can use the faster check: the stack will be a reliable indicator of 
whether SWAPGS was already done: if we see that we are a secondary 
entry interrupting kernel mode execution, then we know that the GS 
base has already been switched. If it says that we interrupted 
user-space execution then we must do the SWAPGS.

But if we are in an NMI/MCE/DEBUG/whatever super-atomic entry 
context, which might have triggered right after a normal entry wrote 
CS to the stack but before we executed SWAPGS, then the only safe way 
to check for GS is the slower method: the RDMSR.

So we try only to mark those entry methods 'paranoid' that absolutely 
need the more expensive check for the GS base - and we generate all 
'normal' entry points with the regular (faster) entry macros.

I hope this explains!

All in all, your zeroentry choice should be fine: INT 0xCC will not 
be issued in NMI context.

Btw, as a sidenote, and since you are already touching this code, 
would you be interested in putting this explanation into the source 
code? It's certainly not obvious and whoever wrote those macros did 
not think of documenting them for later generations ;-)

> +++ b/arch/x86/kernel/traps.c
> @@ -872,6 +872,10 @@ void __init trap_init(void)
>  	set_bit(SYSCALL_VECTOR, used_vectors);
>  #endif
>  
> +	set_system_intr_gate(0xCC, &intcc);
> +	set_bit(0xCC, used_vectors);
> +	printk(KERN_ERR "intcc gate isntalled\n");

I think you mentioned it but i cannot remember your reasoning why you 
marked it 0xcc (and not closer to the existing syscall vector) - 
please add a comment about it into the source code as well.

Ok, i suspect you marked it 0xCC because that's the INT3 instruction 
- not very useful for exploits?

> +void dotraplinkage do_intcc(struct pt_regs *regs, long error_code)
> +{
> +	/* Kernel code must never get here. */
> +	if (!user_mode(regs))
> +		BUG();

Nit: you can use BUG_ON() for that.

> +	local_irq_enable();
> +
> +	if (!in_vsyscall_page(regs->ip)) {
> +		struct task_struct *tsk = current;
> +		if (show_unhandled_signals && unhandled_signal(tsk, SIGSEGV) &&

Nit: please put an empty new line between local variable definitions 
and the first statement that follows - we do this for visual clarity.

A not-so-nit: i'd not limit this message to unhandled signals alone. 
An attacker could install a SIGSEGV handler, send a SIGSEGV and 
attempt the exploit right then - he'll get a free attempt with no 
logging performed, right?

> +		    printk_ratelimit()) {
> +			printk(KERN_INFO
> +			       "%s[%d] illegal int $0xCC ip:%lx sp:%lx",
> +			       tsk->comm, task_pid_nr(tsk),
> +			       regs->ip, regs->sp);

I'd suggest putting the text 'exploit attempt?' into the printk 
somewhere - a sysadmin might not necessarily know what an illegal int 
$0xCC is..

> +			print_vma_addr(" in ", regs->ip);
> +			printk("\n");
> +		}
> +
> +		force_sig(SIGSEGV, current);
> +		return;
> +	}
> +
> +	if (current->seccomp.mode) {
> +		do_exit(SIGKILL);
> +		return;
> +	}
> +
> +	regs->ax = sys_gettimeofday((struct timeval __user *)regs->di, NULL);

Does the vsyscall gettimeofday ignore the zone parameter too?

> +
> +	local_irq_disable();
> +	return;
> +}

Nit: no need for a 'return;' at the end of a void function.

Thanks,

	Ingo


* Re: [PATCH 0/5] x86-64: Remove syscall instructions at fixed addresses
  2011-05-27 17:38 [PATCH 0/5] x86-64: Remove syscall instructions at fixed addresses Andy Lutomirski
                   ` (4 preceding siblings ...)
  2011-05-27 17:38 ` [PATCH 5/5] x86-64: Map the HPET NX Andy Lutomirski
@ 2011-05-29 19:19 ` Ingo Molnar
  2011-05-31  2:33   ` Andrew Lutomirski
  5 siblings, 1 reply; 28+ messages in thread
From: Ingo Molnar @ 2011-05-29 19:19 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Thomas Gleixner, x86, linux-kernel, Linus Torvalds,
	Andrew Morton, Arjan van de Ven, Jan Beulich


* Andy Lutomirski <luto@MIT.EDU> wrote:

> I lied about taking awhile to do this.

Heh :-)

A very nice series btw!

> There are a bunch of syscall instructions in kernel space at fixed
> addresses that user code can execute.
> 
> One is a time() fallback.  Patch 3/5 removes it.
> 
> Several are data that isn't marked NX.  Patch 2/5 makes vvars NX and
> 5/5 makes the HPET NX.
> 
> The last one is the gettimeofday fallback.  We need that, but it
> doesn't have to be a real syscall.  Patch 3/5 adds int 0xCC (callable
> only from the vsyscall page) that implements the gettimeofday fallback
> and nothing else.
> 
> Patch 1/5 is just a dumb but harmless bug fix from the last vdso
> series.
> 
> I've only tested this in KVM with a hacked-up initramfs, but Ingo
> wanted it for 2.6.40, so here it is.
> 
> Andy Lutomirski (5):
>   x86-64: Fix alignment of jiffies variable
>   x86-64: Give vvars their own page
>   x86-64: Remove kernel.vsyscall64 sysctl
>   x86-64: Replace vsyscall gettimeofday fallback with int 0xcc
>   x86-64: Map the HPET NX
> 
>  arch/x86/include/asm/fixmap.h        |    1 +
>  arch/x86/include/asm/pgtable_types.h |    6 ++-
>  arch/x86/include/asm/traps.h         |    4 ++
>  arch/x86/include/asm/vgtod.h         |    1 -
>  arch/x86/include/asm/vsyscall.h      |    6 ++
>  arch/x86/include/asm/vvar.h          |   24 ++++-----
>  arch/x86/kernel/entry_64.S           |    2 +
>  arch/x86/kernel/hpet.c               |    2 +-
>  arch/x86/kernel/traps.c              |    4 ++
>  arch/x86/kernel/vmlinux.lds.S        |   27 ++++++----
>  arch/x86/kernel/vsyscall_64.c        |   86 ++++++++++++++++++---------------
>  arch/x86/vdso/vclock_gettime.c       |   55 ++++++++-------------
>  tools/power/x86/turbostat/turbostat  |  Bin 0 -> 29200 bytes
>  13 files changed, 117 insertions(+), 101 deletions(-)
>  create mode 100755 tools/power/x86/turbostat/turbostat

If no-one finds any review problems with these patches and if you fix 
the details i pointed out for 3/5 then we can do this for v2.6.40.

I really like this series, it makes full-PIE randomized user-space 
executables fully safe against known-address syscall instructions. As 
much as i like crazy speedups, they are probably more relevant to the 
everyday Linux user than the other patches ;-)

Btw., do you know CONFIG_X86_PTDUMP=y and /debug/kernel_page_tables? 
You could use that to double check that after your patches all 
executable (and fixed address) pages are removed [or are harmless].

Thanks,

	Ingo


* Re: [PATCH 4/5] x86-64: Replace vsyscall gettimeofday fallback with int 0xcc
  2011-05-29 19:10   ` Ingo Molnar
@ 2011-05-29 19:23     ` Andrew Lutomirski
  2011-05-29 19:43       ` Ingo Molnar
  2011-05-29 19:49       ` Ingo Molnar
  2011-05-29 20:26     ` Borislav Petkov
  1 sibling, 2 replies; 28+ messages in thread
From: Andrew Lutomirski @ 2011-05-29 19:23 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Thomas Gleixner, x86, linux-kernel

On Sun, May 29, 2011 at 3:10 PM, Ingo Molnar <mingo@elte.hu> wrote:
>
> * Andy Lutomirski <luto@MIT.EDU> wrote:
>
>> --- a/arch/x86/kernel/entry_64.S
>> +++ b/arch/x86/kernel/entry_64.S
>> @@ -1121,6 +1121,8 @@ zeroentry spurious_interrupt_bug do_spurious_interrupt_bug
>>  zeroentry coprocessor_error do_coprocessor_error
>>  errorentry alignment_check do_alignment_check
>>  zeroentry simd_coprocessor_error do_simd_coprocessor_error
>> +zeroentry intcc do_intcc
>> +
>>
>>       /* Reload gs selector with exception handling */
>>       /* edi:  new selector */
>> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
>
> I forgot to reply to your prior question about zeroentry vs.
> paranoidzeroentry.
>
> That distinction is an undocumented x86-64-ism.

Is this an erratum or just the undocumented fact that
swapgs twice puts usergs back and confuses the kernel?

>
> Btw, as a sidenote, and since you are already touching this code,
> would you be interested in putting this explanation into the source
> code? It's certainly not obvious and whoever wrote those macros did
> not think of documenting them for later generations ;-)

Will do.

>
>> +++ b/arch/x86/kernel/traps.c
>> @@ -872,6 +872,10 @@ void __init trap_init(void)
>>       set_bit(SYSCALL_VECTOR, used_vectors);
>>  #endif
>>
>> +     set_system_intr_gate(0xCC, &intcc);
>> +     set_bit(0xCC, used_vectors);
>> +     printk(KERN_ERR "intcc gate isntalled\n");
>
> I think you mentioned it but i cannot remember your reasoning why you
> marked it 0xcc (and not closer to the existing syscall vector) -
> please add a comment about it into the source code as well.
>
> Ok, i suspect you marked it 0xCC because that's the INT3 instruction
> - not very useful for exploits?

Exactly.

The comments in irq_vectors.h make it sound like vectors 0x81..0xed
are used for device interrupts but AFAICT it's only 0x20..0x39 that
are used, so the precise choice of vector doesn't matter that much.

>
>> +void dotraplinkage do_intcc(struct pt_regs *regs, long error_code)
>> +{
>> +     /* Kernel code must never get here. */
>> +     if (!user_mode(regs))
>> +             BUG();
>
> Nit: you can use BUG_ON() for that.

Yep.

>
>> +     local_irq_enable();
>> +
>> +     if (!in_vsyscall_page(regs->ip)) {
>> +             struct task_struct *tsk = current;
>> +             if (show_unhandled_signals && unhandled_signal(tsk, SIGSEGV) &&
>
> Nit: please put an empty new line between local variable definitions
> and the first statement that follows - we do this for visual clarity.
>
> A not-so-nit: i'd not limit this message to unhandled signals alone.
> An attacker could install a SIGSEGV handler, send a SIGSEGV and
> attempt the exploit right then - he'll get a free attempt with no
> logging performed, right?.

I think if an exploit can call sigaction, then we've already lost.
But I can still make the change.

>
>> +                 printk_ratelimit()) {
>> +                     printk(KERN_INFO
>> +                            "%s[%d] illegal int $0xCC ip:%lx sp:%lx",
>> +                            tsk->comm, task_pid_nr(tsk),
>> +                            regs->ip, regs->sp);
>
> I'd suggest putting the text 'exploit attempt?' into the printk
> somewhere - a sysadmin might not necessarily know what an illegal int
> $0xCC is..

Will do.

>
>> +                     print_vma_addr(" in ", regs->ip);
>> +                     printk("\n");
>> +             }
>> +
>> +             force_sig(SIGSEGV, current);
>> +             return;
>> +     }
>> +
>> +     if (current->seccomp.mode) {
>> +             do_exit(SIGKILL);
>> +             return;
>> +     }
>> +
>> +     regs->ax = sys_gettimeofday((struct timeval __user *)regs->di, NULL);
>
> Does the vsyscall gettimeofday ignore the zone parameter too?

No, but the vsyscall gettimeofday doesn't use the fallback to get the timezone.

>
>> +
>> +     local_irq_disable();
>> +     return;
>> +}
>
> Nit: no need for a 'return;' at the end of a void function.

:)

That pointless "return" statement was to hide the fact that the
local_irq_enable wasn't correctly matched.

I'm changing this code a fair bit in preparation for the extra bonus
patch to defang vsyscalls even more by trapping all of them.

--Andy


* Re: [PATCH 4/5] x86-64: Replace vsyscall gettimeofday fallback with int 0xcc
  2011-05-29 19:23     ` Andrew Lutomirski
@ 2011-05-29 19:43       ` Ingo Molnar
  2011-05-29 19:49       ` Ingo Molnar
  1 sibling, 0 replies; 28+ messages in thread
From: Ingo Molnar @ 2011-05-29 19:43 UTC (permalink / raw)
  To: Andrew Lutomirski; +Cc: Thomas Gleixner, x86, linux-kernel


* Andrew Lutomirski <luto@mit.edu> wrote:

> On Sun, May 29, 2011 at 3:10 PM, Ingo Molnar <mingo@elte.hu> wrote:
> >
> > * Andy Lutomirski <luto@MIT.EDU> wrote:
> >
> >> --- a/arch/x86/kernel/entry_64.S
> >> +++ b/arch/x86/kernel/entry_64.S
> >> @@ -1121,6 +1121,8 @@ zeroentry spurious_interrupt_bug do_spurious_interrupt_bug
> >>  zeroentry coprocessor_error do_coprocessor_error
> >>  errorentry alignment_check do_alignment_check
> >>  zeroentry simd_coprocessor_error do_simd_coprocessor_error
> >> +zeroentry intcc do_intcc
> >> +
> >>
> >>       /* Reload gs selector with exception handling */
> >>       /* edi:  new selector */
> >> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> >
> > I forgot to reply to your prior question about zeroentry vs.
> > paranoidzeroentry.
> >
> > That distinction is an undocumented x86-64-ism.
> 
> Is this an erratum or just the undocumented fact that
> swapgs twice puts usergs back and confuses the kernel?

There's no erratum needed for this to be unreliable: if an NMI hits like this:


	SYSENTER

	<=== ... NMI entry ...

	SWAPGS

then the CS check of the entry frame will show 'we interrupted kernel 
mode code', but in reality the SWAPGS has not been done yet.

Regular interrupts (and pagefaults, etc.) can never interrupt the 
above sequence 'in the middle', where the CS check is unreliable.

So yes, it's about not confusing the kernel into the wrong SWAPGS 
state.

I suspect you could trigger badness very quickly: mark the NMI entry 
zeroentry and run some more extreme NMI load like:

        # 100 KHz NMI with precise (no skid) cycles PEBS profiling:

	perf record -a -e cycles:pp -F 100000 sleep 60

and i guess you'll see a nasty crash very quickly.

Thanks,

	Ingo


* Re: [PATCH 4/5] x86-64: Replace vsyscall gettimeofday fallback with int 0xcc
  2011-05-29 19:23     ` Andrew Lutomirski
  2011-05-29 19:43       ` Ingo Molnar
@ 2011-05-29 19:49       ` Ingo Molnar
  2011-05-29 19:57         ` Andrew Lutomirski
  1 sibling, 1 reply; 28+ messages in thread
From: Ingo Molnar @ 2011-05-29 19:49 UTC (permalink / raw)
  To: Andrew Lutomirski; +Cc: Thomas Gleixner, x86, linux-kernel


* Andrew Lutomirski <luto@mit.edu> wrote:

> > Ok, i suspect you marked it 0xCC because that's the INT3 instruction
> > - not very useful for exploits?
> 
> Exactly.
> 
> The comments in irq_vectors.h make it sound like vectors 0x81..0xed 
> are used for device interrupts but AFAICT it's only 0x20..0x39 that 
> are used, so the precise choice of vector doesn't matter that much.

No, we use almost all of the vector space for device interrupts. Why 
do you think only 0x20..0x39 is used?

> > A not-so-nit: i'd not limit this message to unhandled signals 
> > alone. An attacker could install a SIGSEGV handler, send a 
> > SIGSEGV and attempt the exploit right then - he'll get a free 
> > attempt with no logging performed, right?.
> 
> I think if an exploit can call sigaction, then we've already lost. 

Yes, indeed. In theory an app could be catching SIGSEGV and we could 
have an exploit there. But that's pretty theoretical ...

> But I can still make the change.

If you did it to not repeat the message then i think the 
printk_ratelimit() is more than enough. force_sig() will be able to 
sort out repeat signals just fine.

> >> +     local_irq_disable();
> >> +     return;
> >> +}
> >
> > Nit: no need for a 'return;' at the end of a void function.
> 
> :)
> 
> That pointless "return" statement was to hide the fact that the
> local_irq_enable wasn't correctly matched.

Indeed. I noticed the do_exit() local_irq_enable() asymmetry which is 
harmless (we never return), but missed the force_sig() one that isn't 
so harmless.

> I'm changing this code a fair bit in preparation for the extra 
> bonus patch to defang vsyscalls even more by trapping all of them.

ok.

Thanks,

	Ingo


* Re: [PATCH 4/5] x86-64: Replace vsyscall gettimeofday fallback with int 0xcc
  2011-05-27 17:38 ` [PATCH 4/5] x86-64: Replace vsyscall gettimeofday fallback with int 0xcc Andy Lutomirski
  2011-05-29 19:10   ` Ingo Molnar
@ 2011-05-29 19:49   ` Jesper Juhl
  2011-05-29 19:54     ` Jesper Juhl
  1 sibling, 1 reply; 28+ messages in thread
From: Jesper Juhl @ 2011-05-29 19:49 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: Thomas Gleixner, Ingo Molnar, x86, linux-kernel

On Fri, 27 May 2011, Andy Lutomirski wrote:

> Now the only way to issue a syscall with side effects through the
> vsyscall page is to call a misaligned instruction.  I haven't
> checked for that.
> 
> Signed-off-by: Andy Lutomirski <luto@mit.edu>
> ---
>  arch/x86/include/asm/traps.h    |    4 +++
>  arch/x86/include/asm/vsyscall.h |    6 +++++
>  arch/x86/kernel/entry_64.S      |    2 +
>  arch/x86/kernel/traps.c         |    4 +++
>  arch/x86/kernel/vsyscall_64.c   |   47 ++++++++++++++++++++++++++++++++++-----
>  5 files changed, 57 insertions(+), 6 deletions(-)
> 

one very tiny nit below.

[...]
> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> index b9b6716..d34894e 100644
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -872,6 +872,10 @@ void __init trap_init(void)
>  	set_bit(SYSCALL_VECTOR, used_vectors);
>  #endif
>  
> +	set_system_intr_gate(0xCC, &intcc);
> +	set_bit(0xCC, used_vectors);
> +	printk(KERN_ERR "intcc gate isntalled\n");

Let's spell the error message correctly:

	printk(KERN_ERR "intcc gate installed\n");

-- 
Jesper Juhl <jj@chaosbits.net>       http://www.chaosbits.net/
Don't top-post http://www.catb.org/jargon/html/T/top-post.html
Plain text mails only, please.



* Re: [PATCH 4/5] x86-64: Replace vsyscall gettimeofday fallback with int 0xcc
  2011-05-29 19:49   ` Jesper Juhl
@ 2011-05-29 19:54     ` Jesper Juhl
  2011-05-29 20:05       ` Andrew Lutomirski
  0 siblings, 1 reply; 28+ messages in thread
From: Jesper Juhl @ 2011-05-29 19:54 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: Thomas Gleixner, Ingo Molnar, x86, linux-kernel

On Sun, 29 May 2011, Jesper Juhl wrote:

> On Fri, 27 May 2011, Andy Lutomirski wrote:
> 
> > Now the only way to issue a syscall with side effects through the
> > vsyscall page is to call a misaligned instruction.  I haven't
> > checked for that.
> > 
> > Signed-off-by: Andy Lutomirski <luto@mit.edu>
> > ---
> >  arch/x86/include/asm/traps.h    |    4 +++
> >  arch/x86/include/asm/vsyscall.h |    6 +++++
> >  arch/x86/kernel/entry_64.S      |    2 +
> >  arch/x86/kernel/traps.c         |    4 +++
> >  arch/x86/kernel/vsyscall_64.c   |   47 ++++++++++++++++++++++++++++++++++-----
> >  5 files changed, 57 insertions(+), 6 deletions(-)
> > 
> 
> one very tiny nit below.
> 
> [...]
> > diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> > index b9b6716..d34894e 100644
> > --- a/arch/x86/kernel/traps.c
> > +++ b/arch/x86/kernel/traps.c
> > @@ -872,6 +872,10 @@ void __init trap_init(void)
> >  	set_bit(SYSCALL_VECTOR, used_vectors);
> >  #endif
> >  
> > +	set_system_intr_gate(0xCC, &intcc);
> > +	set_bit(0xCC, used_vectors);
> > +	printk(KERN_ERR "intcc gate isntalled\n");
> 
> Let's spell the error message correctly:
> 
> 	printk(KERN_ERR "intcc gate installed\n");
> 
Hmm, why is this KERN_ERR btw? Shouldn't it just be KERN_NOTICE or 
KERN_INFO ?

-- 
Jesper Juhl <jj@chaosbits.net>       http://www.chaosbits.net/
Don't top-post http://www.catb.org/jargon/html/T/top-post.html
Plain text mails only, please.



* Re: [PATCH 4/5] x86-64: Replace vsyscall gettimeofday fallback with int 0xcc
  2011-05-29 19:49       ` Ingo Molnar
@ 2011-05-29 19:57         ` Andrew Lutomirski
  2011-05-29 20:01           ` Ingo Molnar
  0 siblings, 1 reply; 28+ messages in thread
From: Andrew Lutomirski @ 2011-05-29 19:57 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Thomas Gleixner, x86, linux-kernel

On Sun, May 29, 2011 at 3:49 PM, Ingo Molnar <mingo@elte.hu> wrote:
>
> * Andrew Lutomirski <luto@mit.edu> wrote:
>
>> > Ok, i suspect you marked it 0xCC because that's the INT3 instruction
>> > - not very useful for exploits?
>>
>> Exactly.
>>
>> The comments in irq_vectors.h make it sound like vectors 0x81..0xed
>> are used for device interrupts but AFAICT it's only 0x20..0x39 that
>> are used, so the precise choice of vector doesn't matter that much.
>
> No, we use almost all of the vector space for device interrupts. Why
> do you think only 0x20..0x39 is used?

Possibly my inability to understand all the IRQ mapping code in
just half an hour of trying.

In arch/x86/kernel/irq.c, arch_probe_nr_irqs returns NR_IRQS_LEGACY,
which I think means that the genirq code will only expect IRQs on
that many vectors.

If I'm wrong then my patch could be bad: if something tries to use
vector 0xcc for a device interrupt, then the vsyscall emulation code
will eat that interrupt.

(0xcc is barely below the maximum.  INVALIDATE_TLB_VECTOR_START could
be as low as 0xcf.)

--Andy

* Re: [PATCH 4/5] x86-64: Replace vsyscall gettimeofday fallback with int 0xcc
  2011-05-29 19:57         ` Andrew Lutomirski
@ 2011-05-29 20:01           ` Ingo Molnar
  2011-05-29 20:04             ` Andrew Lutomirski
  0 siblings, 1 reply; 28+ messages in thread
From: Ingo Molnar @ 2011-05-29 20:01 UTC (permalink / raw)
  To: Andrew Lutomirski; +Cc: Thomas Gleixner, x86, linux-kernel


* Andrew Lutomirski <luto@mit.edu> wrote:

> On Sun, May 29, 2011 at 3:49 PM, Ingo Molnar <mingo@elte.hu> wrote:
> >
> > * Andrew Lutomirski <luto@mit.edu> wrote:
> >
> >> > Ok, i suspect you marked it 0xCC because that's the INT3 instruction
> >> > - not very useful for exploits?
> >>
> >> Exactly.
> >>
> >> The comments in irq_vectors.h make it sound like vectors 0x81..0xed
> >> are used for device interrupts but AFAICT it's only 0x20..0x39 that
> >> are used, so the precise choice of vector doesn't matter that much.
> >
> > No, we use almost all of the vector space for device interrupts. Why
> > do you think only 0x20..0x39 is used?
> 
> Possibly my inability to understand all the IRQ mapping code in 
> just half an hour of trying.

Hey, you managed to find all the scattered pieces in just half an 
hour, i'm impressed ;-)

> In arch/x86/kernel/irq.c, arch_probe_nr_irqs returns 
> NR_IRQS_LEGACY, which I think means that the genirq code will only 
> expect IRQs on that many vectors.
> 
> If I'm wrong then my patch could be bad: if something tries to use 
> vector 0xcc for a device interrupt, then the vsyscall emulation 
> code will eat that interrupt.

I saw the used_vectors trick you did and it looked safe to me: we set 
up these gates very early on, when there are no device interrupts yet.

If you want to be really sure you could do a BUG_ON(test_bit()) 
before setting it.
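
Something like this userspace sketch, with assert() standing in for 
BUG_ON() and a plain array standing in for used_vectors. The helper 
names are made up for illustration and are not kernel API:

```c
#include <assert.h>
#include <limits.h>

#define NR_VECTORS	256
#define BITS_PER_WORD	(sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical stand-in for the kernel's used_vectors bitmap. */
static unsigned long used_vectors_sketch[NR_VECTORS / BITS_PER_WORD];

/* Userspace analogue of test_bit(vec, used_vectors). */
static int vector_in_use(unsigned int vec)
{
	unsigned long word = used_vectors_sketch[vec / BITS_PER_WORD];

	return (int)((word >> (vec % BITS_PER_WORD)) & 1UL);
}

/* Refuse to claim a vector that somebody else already owns. */
static void claim_vector(unsigned int vec)
{
	/* Plays the role of BUG_ON(test_bit(vec, used_vectors)). */
	assert(!vector_in_use(vec));
	used_vectors_sketch[vec / BITS_PER_WORD] |= 1UL << (vec % BITS_PER_WORD);
}
```

Calling claim_vector(0xCC) twice trips the assertion, which is the 
behaviour you'd want from the BUG_ON.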

> (0xcc is barely below the maximum.  INVALIDATE_TLB_VECTOR_START 
> could be as low as 0xcf.)

Yeah - 0xcc could be fine even if it's in the middle - we are able to 
skip over used ones.

Thanks,

	Ingo

* Re: [PATCH 4/5] x86-64: Replace vsyscall gettimeofday fallback with int 0xcc
  2011-05-29 20:01           ` Ingo Molnar
@ 2011-05-29 20:04             ` Andrew Lutomirski
  0 siblings, 0 replies; 28+ messages in thread
From: Andrew Lutomirski @ 2011-05-29 20:04 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Thomas Gleixner, x86, linux-kernel

On Sun, May 29, 2011 at 4:01 PM, Ingo Molnar <mingo@elte.hu> wrote:
>
> * Andrew Lutomirski <luto@mit.edu> wrote:
>
>> On Sun, May 29, 2011 at 3:49 PM, Ingo Molnar <mingo@elte.hu> wrote:
>> >
>> > * Andrew Lutomirski <luto@mit.edu> wrote:
>> >
>> >> > Ok, i suspect you marked it 0xCC because that's the INT3 instruction
>> >> > - not very useful for exploits?
>> >>
>> >> Exactly.
>> >>
>> >> The comments in irq_vectors.h make it sound like vectors 0x81..0xed
>> >> are used for device interrupts but AFAICT it's only 0x20..0x39 that
>> >> are used, so the precise choice of vector doesn't matter that much.
>> >
>> > No, we use almost all of the vector space for device interrupts. Why
>> > do you think only 0x20..0x39 is used?
>>
>> Possibly my inability to understand all the IRQ mapping code in
>> just half an hour of trying.
>
> Hey, you managed to find all the scattered pieces in just half an
> hour, i'm impressed ;-)

grep and an SSD are amazing.  I'll add the BUG_ON just to satisfy my
paranoia.  I'll also update the comment in irq_vectors.h.

--Andy

* Re: [PATCH 4/5] x86-64: Replace vsyscall gettimeofday fallback with int 0xcc
  2011-05-29 19:54     ` Jesper Juhl
@ 2011-05-29 20:05       ` Andrew Lutomirski
  2011-05-29 20:07         ` Jesper Juhl
  0 siblings, 1 reply; 28+ messages in thread
From: Andrew Lutomirski @ 2011-05-29 20:05 UTC (permalink / raw)
  To: Jesper Juhl; +Cc: Thomas Gleixner, Ingo Molnar, x86, linux-kernel

On Sun, May 29, 2011 at 3:54 PM, Jesper Juhl <jj@chaosbits.net> wrote:
> On Sun, 29 May 2011, Jesper Juhl wrote:
>
>> On Fri, 27 May 2011, Andy Lutomirski wrote:
>>
>> > Now the only way to issue a syscall with side effects through the
>> > vsyscall page is to call a misaligned instruction.  I haven't
>> > checked for that.
>> >
>> > Signed-off-by: Andy Lutomirski <luto@mit.edu>
>> > ---
>> >  arch/x86/include/asm/traps.h    |    4 +++
>> >  arch/x86/include/asm/vsyscall.h |    6 +++++
>> >  arch/x86/kernel/entry_64.S      |    2 +
>> >  arch/x86/kernel/traps.c         |    4 +++
>> >  arch/x86/kernel/vsyscall_64.c   |   47 ++++++++++++++++++++++++++++++++++-----
>> >  5 files changed, 57 insertions(+), 6 deletions(-)
>> >
>>
>> one very tiny nit below.
>>
>> [...]
>> > diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
>> > index b9b6716..d34894e 100644
>> > --- a/arch/x86/kernel/traps.c
>> > +++ b/arch/x86/kernel/traps.c
>> > @@ -872,6 +872,10 @@ void __init trap_init(void)
>> >     set_bit(SYSCALL_VECTOR, used_vectors);
>> >  #endif
>> >
>> > +   set_system_intr_gate(0xCC, &intcc);
>> > +   set_bit(0xCC, used_vectors);
>> > +   printk(KERN_ERR "intcc gate isntalled\n");
>>
>> Let's spell the error message correctly:
>>
>>       printk(KERN_ERR "intcc gate installed\n");
>>
> Hmm, why is this KERN_ERR btw? Shouldn't it just be KERN_NOTICE or
> KERN_INFO ?

IMO it shouldn't be there at all.  It was a debugging leftover that I
forgot to delete.

--Andy

* Re: [PATCH 4/5] x86-64: Replace vsyscall gettimeofday fallback with int 0xcc
  2011-05-29 20:05       ` Andrew Lutomirski
@ 2011-05-29 20:07         ` Jesper Juhl
  0 siblings, 0 replies; 28+ messages in thread
From: Jesper Juhl @ 2011-05-29 20:07 UTC (permalink / raw)
  To: Andrew Lutomirski; +Cc: Thomas Gleixner, Ingo Molnar, x86, linux-kernel

On Sun, 29 May 2011, Andrew Lutomirski wrote:

> On Sun, May 29, 2011 at 3:54 PM, Jesper Juhl <jj@chaosbits.net> wrote:
> > On Sun, 29 May 2011, Jesper Juhl wrote:
> >
> >> On Fri, 27 May 2011, Andy Lutomirski wrote:
> >>
> >> > Now the only way to issue a syscall with side effects through the
> >> > vsyscall page is to call a misaligned instruction.  I haven't
> >> > checked for that.
> >> >
> >> > Signed-off-by: Andy Lutomirski <luto@mit.edu>
> >> > ---
> >> >  arch/x86/include/asm/traps.h    |    4 +++
> >> >  arch/x86/include/asm/vsyscall.h |    6 +++++
> >> >  arch/x86/kernel/entry_64.S      |    2 +
> >> >  arch/x86/kernel/traps.c         |    4 +++
> >> >  arch/x86/kernel/vsyscall_64.c   |   47 ++++++++++++++++++++++++++++++++++-----
> >> >  5 files changed, 57 insertions(+), 6 deletions(-)
> >> >
> >>
> >> one very tiny nit below.
> >>
> >> [...]
> >> > diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> >> > index b9b6716..d34894e 100644
> >> > --- a/arch/x86/kernel/traps.c
> >> > +++ b/arch/x86/kernel/traps.c
> >> > @@ -872,6 +872,10 @@ void __init trap_init(void)
> >> >     set_bit(SYSCALL_VECTOR, used_vectors);
> >> >  #endif
> >> >
> >> > +   set_system_intr_gate(0xCC, &intcc);
> >> > +   set_bit(0xCC, used_vectors);
> >> > +   printk(KERN_ERR "intcc gate isntalled\n");
> >>
> >> Let's spell the error message correctly:
> >>
> >>       printk(KERN_ERR "intcc gate installed\n");
> >>
> > Hmm, why is this KERN_ERR btw? Shouldn't it just be KERN_NOTICE or
> > KERN_INFO ?
> 
> IMO it shouldn't be there at all.  It was a debugging leftover that I
> forgot to delete.
> 
Just removing it sounds good to me :)

-- 
Jesper Juhl <jj@chaosbits.net>       http://www.chaosbits.net/
Don't top-post http://www.catb.org/jargon/html/T/top-post.html
Plain text mails only, please.

* Re: [PATCH 4/5] x86-64: Replace vsyscall gettimeofday fallback with int 0xcc
  2011-05-29 19:10   ` Ingo Molnar
  2011-05-29 19:23     ` Andrew Lutomirski
@ 2011-05-29 20:26     ` Borislav Petkov
  1 sibling, 0 replies; 28+ messages in thread
From: Borislav Petkov @ 2011-05-29 20:26 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Andy Lutomirski, Thomas Gleixner, x86, linux-kernel

I would welcome it very much if this explanation landed somewhere in
<Documentation/x86/x86_64/> for all those of us who find ourselves
staring at entry code now and then :).

Thanks.

On Sun, May 29, 2011 at 09:10:55PM +0200, Ingo Molnar wrote:
> 
> * Andy Lutomirski <luto@MIT.EDU> wrote:
> 
> > --- a/arch/x86/kernel/entry_64.S
> > +++ b/arch/x86/kernel/entry_64.S
> > @@ -1121,6 +1121,8 @@ zeroentry spurious_interrupt_bug do_spurious_interrupt_bug
> >  zeroentry coprocessor_error do_coprocessor_error
> >  errorentry alignment_check do_alignment_check
> >  zeroentry simd_coprocessor_error do_simd_coprocessor_error
> > +zeroentry intcc do_intcc
> > +
> >  
> >  	/* Reload gs selector with exception handling */
> >  	/* edi:  new selector */
> > diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> 
> I forgot to reply to your prior question about zeroentry vs. 
> paranoidzeroentry.
> 
> That distinction is an undocumented x86-64-ism.
> 
> Background:
> 
> The SWAPGS instruction is rather fragile: it must nest perfectly and 
> only in single depth, it should only be used if entering from user 
> mode to kernel mode and then when returning to user-space, and 
> precisely so. If we mess that up even slightly, we crash.
> 
> So when we have a secondary entry, already in kernel mode, we *must 
> not* use SWAPGS blindly - nor must we forget doing a SWAPGS when it's 
> not switched/swapped yet.
> 
> Now, there's a secondary complication: there's a cheap way to test 
> which mode the CPU is in and an expensive way.
> 
> The cheap way is to pick this info off the entry frame on the kernel 
> stack, from the CS of the ptregs area of the kernel stack:
> 
>         xorl %ebx,%ebx
>         testl $3,CS+8(%rsp)
>         je error_kernelspace
>         SWAPGS
> 
> The expensive (paranoid) way is to read back the MSR_GS_BASE value 
> (which is what SWAPGS modifies):
> 
>         movl $1,%ebx
>         movl $MSR_GS_BASE,%ecx
>         rdmsr
>         testl %edx,%edx
>         js 1f   /* negative -> in kernel */
>         SWAPGS
>         xorl %ebx,%ebx
> 1:      ret
> 
> 
> and the whole paranoid non-paranoid macro complexity is about whether 
> to suffer that RDMSR cost.
> 
> If we are at an interrupt or user-trap/gate-alike boundary then we 
> can use the faster check: the stack will be a reliable indicator of 
> whether SWAPGS was already done: if we see that we are a secondary 
> entry interrupting kernel mode execution, then we know that the GS 
> base has already been switched. If it says that we interrupted 
> user-space execution then we must do the SWAPGS.
> 
> But if we are in an NMI/MCE/DEBUG/whatever super-atomic entry 
> context, which might have triggered right after a normal entry wrote 
> CS to the stack but before we executed SWAPGS, then the only safe way 
> to check for GS is the slower method: the RDMSR.
> 
> So we try only to mark those entry methods 'paranoid' that absolutely 
> need the more expensive check for the GS base - and we generate all 
> 'normal' entry points with the regular (faster) entry macros.
> 
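
In C terms, the two checks above boil down to roughly the following. 
This is only a userspace sketch of the decision logic, not entry code: 
the function names are invented, and the values are passed in as plain 
integers instead of being read from the stack frame or the MSR.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Cheap check, mirroring "testl $3,CS+8(%rsp)": the low two bits of the
 * saved CS are the privilege level that was interrupted. Nonzero means
 * we came from user mode and still have to SWAPGS.
 */
static bool need_swapgs_fast(uint64_t saved_cs)
{
	return (saved_cs & 3) != 0;
}

/*
 * Paranoid check, mirroring "rdmsr; testl %edx,%edx; js": a kernel GS
 * base is a negative (upper-half) address, so a set sign bit means the
 * kernel base is already loaded and SWAPGS must be skipped.
 */
static bool need_swapgs_paranoid(uint64_t gs_base)
{
	return (gs_base >> 63) == 0;
}
```

The racy case is a frame whose CS says "kernel" while the GS base is 
still the user one: need_swapgs_fast() answers no, 
need_swapgs_paranoid() answers yes, and only the second is right.
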
> I hope this explains!
> 
> All in all, your zeroentry choice should be fine: INT 0xCC will not 
> be issued in NMI context.
> 
> Btw, as a sidenote, and since you are already touching this code, 
> would you be interested in putting this explanation into the source 
> code? It's certainly not obvious and whoever wrote those macros did 
> not think of documenting them for later generations ;-)
> 
> > +++ b/arch/x86/kernel/traps.c
> > @@ -872,6 +872,10 @@ void __init trap_init(void)
> >  	set_bit(SYSCALL_VECTOR, used_vectors);
> >  #endif
> >  
> > +	set_system_intr_gate(0xCC, &intcc);
> > +	set_bit(0xCC, used_vectors);
> > +	printk(KERN_ERR "intcc gate isntalled\n");
> 
> I think you mentioned it but i cannot remember your reasoning why you 
> marked it 0xcc (and not closer to the existing syscall vector) - 
> please add a comment about it into the source code as well.
> 
> Ok, i suspect you marked it 0xCC because that's the INT3 instruction 
> - not very useful for exploits?
> 
> > +void dotraplinkage do_intcc(struct pt_regs *regs, long error_code)
> > +{
> > +	/* Kernel code must never get here. */
> > +	if (!user_mode(regs))
> > +		BUG();
> 
> Nit: you can use BUG_ON() for that.
> 
> > +	local_irq_enable();
> > +
> > +	if (!in_vsyscall_page(regs->ip)) {
> > +		struct task_struct *tsk = current;
> > +		if (show_unhandled_signals && unhandled_signal(tsk, SIGSEGV) &&
> 
> Nit: please put an empty new line between local variable definitions 
> and the first statement that follows - we do this for visual clarity.
> 
> A not-so-nit: i'd not limit this message to unhandled signals alone. 
> An attacker could install a SIGSEGV handler, send a SIGSEGV and 
> attempt the exploit right then - he'll get a free attempt with no 
> logging performed, right?
> 
> > +		    printk_ratelimit()) {
> > +			printk(KERN_INFO
> > +			       "%s[%d] illegal int $0xCC ip:%lx sp:%lx",
> > +			       tsk->comm, task_pid_nr(tsk),
> > +			       regs->ip, regs->sp);
> 
> I'd suggest putting the text 'exploit attempt?' into the printk 
> somewhere - a sysadmin might not necessarily know what an illegal int 
> $0xCC is..
> 
> > +			print_vma_addr(" in ", regs->ip);
> > +			printk("\n");
> > +		}
> > +
> > +		force_sig(SIGSEGV, current);
> > +		return;
> > +	}
> > +
> > +	if (current->seccomp.mode) {
> > +		do_exit(SIGKILL);
> > +		return;
> > +	}
> > +
> > +	regs->ax = sys_gettimeofday((struct timeval __user *)regs->di, NULL);
> 
> Does the vsyscall gettimeofday ignore the zone parameter too?
> 
> > +
> > +	local_irq_disable();
> > +	return;
> > +}
> 
> Nit: no need for a 'return;' at the end of a void function.
> 
> Thanks,
> 
> 	Ingo

-- 
Regards/Gruss,
    Boris.

* Re: [PATCH 2/5] x86-64: Give vvars their own page
  2011-05-27 17:38 ` [PATCH 2/5] x86-64: Give vvars their own page Andy Lutomirski
@ 2011-05-29 20:34   ` Borislav Petkov
  2011-05-30  1:37     ` Andrew Lutomirski
  0 siblings, 1 reply; 28+ messages in thread
From: Borislav Petkov @ 2011-05-29 20:34 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: Thomas Gleixner, Ingo Molnar, x86, linux-kernel

On Fri, May 27, 2011 at 01:38:39PM -0400, Andy Lutomirski wrote:
> Move vvars out of the vsyscall page into their own page and mark it
> NX.
> 
> Without this patch, an attacker who can force a daemon to call some
> fixed address could wait until the time contains, say, 0xCD80, and
> then execute the current time.
> 
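
To make that concrete: CD 80 is the encoding of int $0x80, so any 
executable data that happens to hold those two bytes back to back is a 
usable syscall instruction. A small userspace sketch of the scan an 
attacker would effectively be waiting on (the helper name is made up):

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return 1 if the buffer contains the byte pair CD 80, i.e. the
 * encoding of "int $0x80", at any offset; 0 otherwise.
 */
static int contains_int80(const unsigned char *buf, size_t len)
{
	size_t i;

	assert(buf != NULL);

	for (i = 0; i + 1 < len; i++) {
		if (buf[i] == 0xCD && buf[i + 1] == 0x80)
			return 1;
	}
	return 0;
}
```

Run over the in-memory bytes of a time value, this returns 1 whenever 
the clock passes through a matching pattern.
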
> Signed-off-by: Andy Lutomirski <luto@mit.edu>
> ---
>  arch/x86/include/asm/fixmap.h        |    1 +
>  arch/x86/include/asm/pgtable_types.h |    2 ++
>  arch/x86/include/asm/vvar.h          |   22 ++++++++++------------
>  arch/x86/kernel/vmlinux.lds.S        |   27 ++++++++++++++++-----------
>  arch/x86/kernel/vsyscall_64.c        |    5 +++++
>  tools/power/x86/turbostat/turbostat  |  Bin 0 -> 29200 bytes

You've added the turbostat binary to the diffstat too. I believe this
wasn't your intention, no? :)

-- 
Regards/Gruss,
    Boris.

* Re: [PATCH 2/5] x86-64: Give vvars their own page
  2011-05-29 20:34   ` Borislav Petkov
@ 2011-05-30  1:37     ` Andrew Lutomirski
  0 siblings, 0 replies; 28+ messages in thread
From: Andrew Lutomirski @ 2011-05-30  1:37 UTC (permalink / raw)
  To: Borislav Petkov, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	x86, linux-kernel

On Sun, May 29, 2011 at 4:34 PM, Borislav Petkov <bp@alien8.de> wrote:
> On Fri, May 27, 2011 at 01:38:39PM -0400, Andy Lutomirski wrote:
>> Move vvars out of the vsyscall page into their own page and mark it
>> NX.
>>
>> Without this patch, an attacker who can force a daemon to call some
>> fixed address could wait until the time contains, say, 0xCD80, and
>> then execute the current time.
>>
>> Signed-off-by: Andy Lutomirski <luto@mit.edu>
>> ---
>>  arch/x86/include/asm/fixmap.h        |    1 +
>>  arch/x86/include/asm/pgtable_types.h |    2 ++
>>  arch/x86/include/asm/vvar.h          |   22 ++++++++++------------
>>  arch/x86/kernel/vmlinux.lds.S        |   27 ++++++++++++++++-----------
>>  arch/x86/kernel/vsyscall_64.c        |    5 +++++
>>  tools/power/x86/turbostat/turbostat  |  Bin 0 -> 29200 bytes
>
> You've added the turbostat binary to the diffstat too. I believe this
> wasn't your intention, no? :)

Foiled again!

--Andy

* Re: [PATCH 0/5] x86-64: Remove syscall instructions at fixed addresses
  2011-05-29 19:19 ` [PATCH 0/5] x86-64: Remove syscall instructions at fixed addresses Ingo Molnar
@ 2011-05-31  2:33   ` Andrew Lutomirski
  2011-05-31  8:07     ` Ingo Molnar
  0 siblings, 1 reply; 28+ messages in thread
From: Andrew Lutomirski @ 2011-05-31  2:33 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Gleixner, x86, linux-kernel, Linus Torvalds,
	Andrew Morton, Arjan van de Ven, Jan Beulich

On Sun, May 29, 2011 at 3:19 PM, Ingo Molnar <mingo@elte.hu> wrote:
> Btw., do you know CONFIG_X86_PTDUMP=y and /debug/kernel_page_tables?
> You could use that to double check that after your patches all
> executable (and fixed address) pages are removed [or are harmless].

Done.  Now there's only one user-executable page and it's mostly harmless.
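
For anyone who wants to repeat the check, it amounts to scanning the 
dump for lines that carry USR and are not NX. A rough sketch of the 
per-line test, assuming the flag columns look like "USR ro GLB x pte" 
(that layout is from memory, so treat it as an assumption; the function 
name is made up too):

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Classify one line of the page table dump: true if it is marked
 * user-accessible ("USR") and executable ("x" rather than "NX").
 */
static bool line_is_user_executable(const char *line)
{
	char buf[256];
	bool usr = false, exec = false;
	char *tok;

	assert(line != NULL);

	/* strtok() modifies its input, so work on a bounded copy. */
	strncpy(buf, line, sizeof(buf) - 1);
	buf[sizeof(buf) - 1] = '\0';

	for (tok = strtok(buf, " \t\n"); tok; tok = strtok(NULL, " \t\n")) {
		if (strcmp(tok, "USR") == 0)
			usr = true;
		else if (strcmp(tok, "x") == 0)
			exec = true;
	}
	return usr && exec;
}
```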

Maybe I'll try to get rid of vread_tsc and vread_hpet later on to make
it even more harmless.

--Andy

* Re: [PATCH 0/5] x86-64: Remove syscall instructions at fixed addresses
  2011-05-31  2:33   ` Andrew Lutomirski
@ 2011-05-31  8:07     ` Ingo Molnar
  2011-05-31 12:27       ` Andrew Lutomirski
  0 siblings, 1 reply; 28+ messages in thread
From: Ingo Molnar @ 2011-05-31  8:07 UTC (permalink / raw)
  To: Andrew Lutomirski
  Cc: Thomas Gleixner, x86, linux-kernel, Linus Torvalds,
	Andrew Morton, Arjan van de Ven, Jan Beulich


* Andrew Lutomirski <luto@mit.edu> wrote:

> On Sun, May 29, 2011 at 3:19 PM, Ingo Molnar <mingo@elte.hu> wrote:
> > Btw., do you know CONFIG_X86_PTDUMP=y and /debug/kernel_page_tables?
> > You could use that to double check that after your patches all
> > executable (and fixed address) pages are removed [or are harmless].
> 
> Done.  Now there's only one user-executable page and it's mostly harmless.

ok. Will test your v3 series.

> Maybe I'll try to get rid of vread_tsc and vread_hpet later on to 
> make it even more harmless.

Yeah, that's a good idea. They need pushing into the INT 0xCC 
do_intcc() handler, that's all that's needed AFAICS - or can you see 
other complications with them?

Thanks,

	Ingo

* Re: [PATCH 0/5] x86-64: Remove syscall instructions at fixed addresses
  2011-05-31  8:07     ` Ingo Molnar
@ 2011-05-31 12:27       ` Andrew Lutomirski
  2011-05-31 12:54         ` Ingo Molnar
  0 siblings, 1 reply; 28+ messages in thread
From: Andrew Lutomirski @ 2011-05-31 12:27 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Gleixner, x86, linux-kernel, Linus Torvalds,
	Andrew Morton, Arjan van de Ven, Jan Beulich

On Tue, May 31, 2011 at 4:07 AM, Ingo Molnar <mingo@elte.hu> wrote:
>
> * Andrew Lutomirski <luto@mit.edu> wrote:
>
>> On Sun, May 29, 2011 at 3:19 PM, Ingo Molnar <mingo@elte.hu> wrote:
>> > Btw., do you know CONFIG_X86_PTDUMP=y and /debug/kernel_page_tables?
>> > You could use that to double check that after your patches all
>> > executable (and fixed address) pages are removed [or are harmless].
>>
>> Done.  Now there's only one user-executable page and it's mostly harmless.
>
> ok. Will test your v3 series.
>
>> Maybe I'll try to get rid of vread_tsc and vread_hpet later on to
>> make it even more harmless.
>
> Yeah, that's a good idea. They need pushing into the INT 0xCC
> do_intcc() handler, that's all that's needed AFAICS - or can you see
> other complications with them?
>

They're called from the vDSO.  I think they should just be moved into
the vDSO since they're not used by the vsyscall code any more, but
there are two problems.  The clocksource.vread mechanism (or whatever
it's called) won't really work if we let them get relocated (not a big
deal).  More importantly, vread_tsc contains an alternative and the
vDSO can't currently contain alternative instructions.  That can
probably be fixed, but it'll take a bit of work.

--Andy

* Re: [PATCH 0/5] x86-64: Remove syscall instructions at fixed addresses
  2011-05-31 12:27       ` Andrew Lutomirski
@ 2011-05-31 12:54         ` Ingo Molnar
  2011-05-31 13:06           ` Andrew Lutomirski
  0 siblings, 1 reply; 28+ messages in thread
From: Ingo Molnar @ 2011-05-31 12:54 UTC (permalink / raw)
  To: Andrew Lutomirski
  Cc: Thomas Gleixner, x86, linux-kernel, Linus Torvalds,
	Andrew Morton, Arjan van de Ven, Jan Beulich


* Andrew Lutomirski <luto@mit.edu> wrote:

> [...] More importantly, vread_tsc contains an alternative and the 
> vDSO can't currently contain alternative instructions.  That can 
> probably be fixed, but it'll take a bit of work.

You could start with picking the more compatible alternative 
instruction initially. I don't at all mind losing half a cycle of 
performance in that case ... this code should be secure first.

Thanks,

	Ingo

* Re: [PATCH 0/5] x86-64: Remove syscall instructions at fixed addresses
  2011-05-31 12:54         ` Ingo Molnar
@ 2011-05-31 13:06           ` Andrew Lutomirski
  2011-05-31 13:11             ` Ingo Molnar
  0 siblings, 1 reply; 28+ messages in thread
From: Andrew Lutomirski @ 2011-05-31 13:06 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Gleixner, x86, linux-kernel, Linus Torvalds,
	Andrew Morton, Arjan van de Ven, Jan Beulich

On Tue, May 31, 2011 at 8:54 AM, Ingo Molnar <mingo@elte.hu> wrote:
>
> * Andrew Lutomirski <luto@mit.edu> wrote:
>
>> [...] More importantly, vread_tsc contains an alternative and the
>> vDSO can't currently contain alternative instructions.  That can
>> probably be fixed, but it'll take a bit of work.
>
> You could start with picking the more compatible alternative
> instruction initially. I don't at all mind losing half a cycle of
> performance in that case ... this code should be secure first.

The more compatible one is mfence, which in some cases could (I think)
be a lot more than half a cycle.

A better option might be rdtscp, which is actually documented to work,
but I'm not sure it's available on all supported CPUs.
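
Checking for it at runtime is cheap, at least.  A sketch using the
compiler's <cpuid.h> helper (GCC/clang; the function name is mine):
CPUID leaf 0x80000001 reports RDTSCP in EDX bit 27.

```c
#include <stdbool.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Report whether the CPU advertises the RDTSCP instruction. */
static bool cpu_has_rdtscp(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax, ebx, ecx, edx;

	/* __get_cpuid() returns 0 if the requested leaf is unsupported. */
	if (!__get_cpuid(0x80000001, &eax, &ebx, &ecx, &edx))
		return false;

	return (edx >> 27) & 1;
#else
	return false;	/* not an x86 build, so no RDTSCP */
#endif
}
```

That only tells us whether the instruction exists, not what it costs
compared to mfence.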

I'm content to wait a bit on this one.  I say let's get the rest done
first and tackle the last little hard part at the end.

--Andy

>
> Thanks,
>
>        Ingo
>

* Re: [PATCH 0/5] x86-64: Remove syscall instructions at fixed addresses
  2011-05-31 13:06           ` Andrew Lutomirski
@ 2011-05-31 13:11             ` Ingo Molnar
  2011-05-31 13:17               ` Andrew Lutomirski
  0 siblings, 1 reply; 28+ messages in thread
From: Ingo Molnar @ 2011-05-31 13:11 UTC (permalink / raw)
  To: Andrew Lutomirski
  Cc: Thomas Gleixner, x86, linux-kernel, Linus Torvalds,
	Andrew Morton, Arjan van de Ven, Jan Beulich


* Andrew Lutomirski <luto@mit.edu> wrote:

> > You could start with picking the more compatible alternative 
> > instruction initially. I don't at all mind losing half a cycle of 
> > performance in that case ... this code should be secure first.
> 
> The more compatible one is mfence, which in some cases could (I 
> think) be a lot more than half a cycle.

I'd still suggest to do the mfence change now and remove the 
alternatives patching for now - if it's more than half a cycle then 
it sure will be implemented properly, right?

Thanks,

	Ingo

* Re: [PATCH 0/5] x86-64: Remove syscall instructions at fixed addresses
  2011-05-31 13:11             ` Ingo Molnar
@ 2011-05-31 13:17               ` Andrew Lutomirski
  0 siblings, 0 replies; 28+ messages in thread
From: Andrew Lutomirski @ 2011-05-31 13:17 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Gleixner, x86, linux-kernel, Linus Torvalds,
	Andrew Morton, Arjan van de Ven, Jan Beulich

On Tue, May 31, 2011 at 9:11 AM, Ingo Molnar <mingo@elte.hu> wrote:
>
> * Andrew Lutomirski <luto@mit.edu> wrote:
>
>> > You could start with picking the more compatible alternative
>> > instruction initially. I don't at all mind losing half a cycle of
>> > performance in that case ... this code should be secure first.
>>
>> The more compatible one is mfence, which in some cases could (I
>> think) be a lot more than half a cycle.
>
> I'd still suggest to do the mfence change now and remove the
> alternatives patching for now - if it's more than half a cycle then
> it sure will be implemented properly, right?

I don't know.  I just cut 5 ns off the thing a couple weeks ago and no
one beat me to it :)

I'll take a look at how hard the patching will be.

--Andy

>
> Thanks,
>
>        Ingo
>

