From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1035357AbcIWSeu (ORCPT ); Fri, 23 Sep 2016 14:34:50 -0400 Received: from thejh.net ([37.221.195.125]:48059 "EHLO thejh.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1035314AbcIWSes (ORCPT ); Fri, 23 Sep 2016 14:34:48 -0400 Date: Fri, 23 Sep 2016 20:34:43 +0200 From: Jann Horn To: Kees Cook Cc: Tycho Andersen , Andy Lutomirski , Linus Torvalds , Andy Lutomirski , X86 ML , Borislav Petkov , "linux-kernel@vger.kernel.org" , Brian Gerst Subject: Re: [PATCH 09/12] x86/process: Pin the target stack in get_wchan() Message-ID: <20160923183443.GA3416@pc.thejh.net> References: <20160917020051.GD2543@pc.thejh.net> <20160923074306.GE20504@pc.thejh.net> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="/04w6evG8XlLl3ft" Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.23 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org --/04w6evG8XlLl3ft Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Fri, Sep 23, 2016 at 11:28:26AM -0700, Kees Cook wrote: > On Fri, Sep 23, 2016 at 12:43 AM, Jann Horn wrote: > > On Thu, Sep 22, 2016 at 03:44:37PM -0700, Andy Lutomirski wrote: > >> On Fri, Sep 16, 2016 at 7:00 PM, Jann Horn wrote: > >> > On Tue, Sep 13, 2016 at 02:29:29PM -0700, Andy Lutomirski wrote: > >> >> This will prevent a crash if get_wchan() runs after the task stack > >> >> is freed. > >> > > >> > I think I found some more stuff. Have a look at KSTK_EIP() and KSTK_= ESP(), I think > >> > they read from the saved userspace registers area at the top of the = kernel stack? > >> > > >> > Used on remote processes in: > >> > vma_is_stack_for_task() (via /proc/$pid/maps) > >> > >> This isn't used in /proc/$pid/maps -- it's only used in > >> /proc/$pid/task/$tid/maps. I wonder if anyone actually cares about it > >> -- it certainly won't work reliably. > >> > >> I could pin the stack in vma_is_stack_for_task, but it seems > >> potentially better to me to change it to vma_is_stack_for_current() > >> and remove the offending caller in /proc, replacing it with "return > >> 0". Thoughts? > > > > I just scrolled through the debian codesearch results for "\[stack\]" - > > there seem to only be 105 across all of debian's packages, many of them > > duplicates - and I didn't see any that looked like they used the tid ma= p. > > So I think this might work. > > > > ( https://codesearch.debian.net/search?q=3D%22%5C%5Bstack%5C%5D%22 ) > > > > > >> > do_task_stat() (/proc/$pid/stat) > >> > >> Like this: > >> > >> mm =3D get_task_mm(task); > >> if (mm) { > >> vsize =3D task_vsize(mm); > >> if (permitted) { > >> eip =3D KSTK_EIP(task); > >> esp =3D KSTK_ESP(task); > >> } > >> } > >> > >> Can we just delete this outright? It seems somewhere between mostly > >> and entirely useless, and it also seems dangerous. Until very > >> recently, on x86_64, this would have been a potential info leak, as > >> SYSCALL followed closely by a hardware interrupt would cause *kernel* > >> values to land in task_pt_regs(). I don't even want to think about > >> what this code does if the task is in vm86 mode. I wouldn't be at all > >> surprised if non-x86 architectures have all kinds of interesting > >> thinks happen if you do this to a task that isn't running normal > >> non-atomic kernel code at the time. > >> > >> I would advocate for unconditionally returning zeros in these two stat= fields. > > > > I'd like that a lot. > > > > I guess the two things that might theoretically use it are ptrace users > > and (very theoretically) sampling profiling stuff or so? > > > > In gdb, the only code I can find that reads this is in gdb/linux-nat.c,= but > > it's behind an "#ifdef 0": > > > > #if 0 /* Don't know how architecture-dependent the rest is... > > Anyway the signal bitmap info is available from "status". = */ > > if (fscanf (procfile, "%lu ", <mp) > 0) /* FIXME arch= ? */ > > printf_filtered (_("Kernel stack pointer: 0x%lx\n"), ltmp= ); > > if (fscanf (procfile, "%lu ", <mp) > 0) /* FIXME arch= ? */ > > printf_filtered (_("Kernel instr pointer: 0x%lx\n"), ltmp= ); > > [...] > > > > strace and ltrace don't seem to be using it. >=20 > Does CRIU use this? I wouldn't expect so, since they're using ptrace, > IIUC, to freeze/restore. As far as I can tell: parse_pid_stat() parses them into a struct proc_pid_stat as "esp" and "eip", but those struct members are never used (like, probably, most other members of that struct). child_opened_proc.c just opens /proc/%d/stat and then closes it again immediately. So in summary: I don't think so. --/04w6evG8XlLl3ft Content-Type: application/pgp-signature; name="signature.asc" Content-Description: Digital signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAEBAgAGBQJX5XXDAAoJED4KNFJOeCOoprYP/3H1QJUUvSPLKpyJWrUbFyy8 N4a0J5jMcCOz1kIOEzXQGajQRMmCQAXU3iN1j3yFQ32LM59FDGvzeVhga70Xy+eO bGG9G7Uf2rbk4XFPKt27GOtQFXmCyPHvLu1KlMWPFYqVTuCcoJvgEJS5hyID5kUf Hh0aWX1F6b3KRWny2bQPQC93tUnDudjEYZnjl40z4jh1BCSgpsV6S6z2+Pej5Gb2 FwydmRU9cotJa1L9S3ecVOz9NEdIHABDG6fUErb6fdDWwIn89vjIi1YUgjSOZq+u iKyx05mEVz9unoEjv835/xNjFbA3fBB8DRYhgbGCNYwXJClfD6DLBTXicyUpDNRi bVy/A8hnsL3bI9P6d8a5BmIbyul/ONJVip/AHg2CD0Yt2RJ5608eoS6nugxFPzOw SiHZCTNcP1KBLwpI2HVpYS+buVrBdn408k/C9lGk3TqESMuGMlEulOrVKRvjnFGG 2JZ7rJg5Q7VXrzz6JPq/tOkGU7CUh9A8ZNBRCII8lASPDTR2bQjwmIzsKzvlvDSV UZr25UrdAu2alRppdR2Cv5i6WODM4LhOnHXN0ZbP44LT2m0ssvIo/Ty0MBqTigU3 06zKSWnU60xS0PaCwDG6RCbjvZKmcJSviwrklLKTOBXdaUumXDv3Aqbpx7INHtso ccBoab7l9Pn//j9UwI97 =ksVP -----END PGP SIGNATURE----- --/04w6evG8XlLl3ft--