Subject: Re: Memory corruption kernel issue (potentially exploitable), request for help
From: Oliver Freyermuth
To: linux-kernel@vger.kernel.org
Date: Fri, 26 May 2017 16:53:08 +0200
In-Reply-To: <9a5a76c2-b38a-9cde-8da6-616ba23c6d88@googlemail.com>

Dear Kernel hackers,

a small follow-up:
The problem is reproducible with an Ubuntu 17.04 live system.
It vanishes as soon as I disable the Realtek network card in the UEFI, put in an Intel card, and use that.

So the problem must either be a kernel bug (then most likely for r8168 only), or a very strange firmware / hardware issue of this specific card.

If you have any further suggestions to debug this (if it's a kernel bug, I would guess it's exploitable since it allows writes to non-userspace memory), please let me know.

Cheers and all the best,
Oliver

On 26.05.2017 at 13:26, Oliver Freyermuth wrote:
> Dear Kernel hackers,
>
> I have a machine with a self-built, non-tainted kernel, which exhibits memory corruption as soon as I execute
> while true; do cat /proc/self/net/dev > /dev/null; done
> as a normal user.
>
> I am running 4.11.3 (almost vanilla, only Gentoo patches in) on mostly standard hardware (Intel CPU + GPU).
> I can also reproduce with 4.9 on that machine.
> RAM has already been exchanged. Due to a BIOS bug, the machine needs "iommu=soft" as a kernel parameter, but nothing special otherwise.
>
> The corruption appears in two ways:
> Often via:
> Corrupted low memory at ffff88000000b000 (b000 phys) = 0016e109
> Almost every time visible via:
> memtester 15G
> (machine has 16 G).
>
> Checking the output of memtester, the values it finds match the numbers in:
> /proc/self/net/dev
>
> After each boot, the memory page where the corruption appears seems to change slightly; it is usually in the region around 0x94F6000 (physical address).
>
> I have attached my kernel config, gzipped.
>
> I would be very grateful for any advice on how to debug this further - it does not really look like a hardware issue to me anymore,
> but if it could be, please enlighten me.
>
> Please include me in replies, as I am not subscribed to the list.
>
> In case relevant, my network controller is:
> 02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 06)
>
> Thanks and all the best,
> Oliver Freyermuth
>
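The comparison described in the quoted report (values from memtester or from a "Corrupted low memory" line versus the counters in /proc/self/net/dev) could be automated roughly as follows. This is only a minimal sketch: the function name match_counter is hypothetical and not part of any existing tool, and it assumes the corrupted word holds a counter as a plain binary integer, which the report does not confirm.

```shell
#!/bin/sh
# Hypothetical helper (not an existing tool): list the counters in a
# /proc/net/dev style file that equal a given hexadecimal value.
# Assumption: the corrupted memory word is a counter stored as a binary integer.
match_counter() {
    hex=$1
    file=${2:-/proc/self/net/dev}

    # Convert the hex value (with or without a leading 0x) to decimal.
    dec=$(printf '%d' "0x${hex#0x}") || return 2

    awk -v want="$dec" '
        NR > 2 {                      # the first two lines are column headers
            sub(/:/, " ")             # "eth0:123" -> "eth0 123"
            for (i = 2; i <= NF; i++)
                # compare as strings to avoid floating-point rounding
                if (($i "") == want) { print $1, "field", i - 1; found = 1 }
        }
        END { exit (found ? 0 : 1) }  # exit status 1 when nothing matched
    ' "$file"
}
```

Calling match_counter 0016e109 right after the kernel message appears would print one line per match in the form "<interface> field <n>", where n is the 1-based counter column, and return a non-zero status if no counter has that value. Since the counters keep changing, a miss does not rule anything out; a hit is merely suggestive.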