From: Dmitry Vyukov
Date: Fri, 10 Jun 2016 12:18:20 +0200
Subject: Re: x86: bad pte in pageattr_test
To: Thomas Gleixner
Cc: Ingo Molnar, "H. Peter Anvin", "x86@kernel.org", LKML,
    Andrey Ryabinin, Konstantin Khlebnikov, syzkaller, Kostya Serebryany,
    Alexander Potapenko, Sasha Levin, "linux-mm@kvack.org", Peter Zijlstra
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jun 9, 2016 at 11:34 PM, Thomas Gleixner wrote:
> On Tue, 7 Jun 2016, Dmitry Vyukov wrote:
>>
>> I've got the following WARNING while running syzkaller fuzzer:
>>
>> CPA ffff880054118000: bad pte after revert 8000000054118363
>>
> CPA ffff880059990000: bad pte 8000000059990060
>
> In both cases the PTE bit which the test modifies is in the wrong state.
>
>> Should we delete this test if it is not important?
>
> No. There is something badly wrong.
>
> PAGE_BIT_CPA_TEST is the same as PAGE_BIT_SPECIAL. And the latter is used by
> the mm code to mark user space mappings. The test code only modifies the
> direct mapping, i.e. the kernel side one.
>
> So something sets PAGE_BIT_SPECIAL on a kernel PTE. And that's definitely a
> bug.
>
> These are the last entries from your syzkaller log file of the first incident:
>
> r0 = perf_event_open(&(0x7f000000f000-0x78)={0x2, 0x78, 0x11, 0x7, 0xd537, 0x6, 0x0, 0xc1, 0xffff, 0x5, 0x0, 0x40, 0x4, 0x9, 0x5369, 0x8, 0x7, 0x8508, 0x3, 0x80, 0x0}, 0x0, 0xffffffff, 0xffffffffffffffff, 0x0)
> mmap(&(0x7f0000cbb000)=nil, (0x1000), 0x3, 0x32, 0xffffffffffffffff, 0x0)
> r1 = syz_open_dev$mouse(&(0x7f0000cbb000)="2f6465762f696e7075742f6d6f7573652300", 0x100, 0xa00)
> mmap(&(0x7f0000cbc000)=nil, (0x1000), 0x3, 0x32, 0xffffffffffffffff, 0x0)
> setsockopt$BT_SNDMTU(r1, 0x112, 0xc, &(0x7f0000cbc000)=0x5, 0x2)
> mmap(&(0x7f0000cbb000)=nil, (0x1000), 0x3, 0x32, 0xffffffffffffffff, 0x0)
> ioctl$EVIOCGEFFECTS(r1, 0x80044584, &(0x7f0000cbc000-0x942)=nil)
> r2 = fcntl$dupfd(r0, 0x406, r0)
> mmap(&(0x7f0000cbc000)=nil, (0x1000), 0x3, 0x32, 0xffffffffffffffff, 0x0)
> mmap(&(0x7f00002bf000)=nil, (0x1000), 0x3, 0x8010, 0xffffffffffffffff, 0x0)
> mmap(&(0x7f0000000000)=nil, (0x0), 0x3, 0x32, 0xffffffffffffffff, 0x0)
> pwritev(r2, &(0x7f00007e9000)=[{&(0x7f0000cbc000)=....
>
> Do you have log of the second one available as well?
>
> CC'ing mm and perf folks.

Here is the second log:
https://gist.githubusercontent.com/dvyukov/dd7970a5daaa7a30f6d37fa5592b56de/raw/f29182024538e604c95d989f7b398816c3c595dc/gistfile1.txt

I've hit this only twice. The first time I tried hard to reproduce it,
with no success. So unfortunately that's all we have.

Re logs: my setup executes up to 16 programs in parallel. So for
normal BUGs any of the preceding 16 programs can be guilty. But since
this check is asynchronous, the culprit can be any preceding program
in the log, not just one of the last 16. I would expect that it is
triggered by some rarely executed, poorly tested code. Maybe mmap of
some device?
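
For reference, the two reported PTE values can be decoded with a small
script. This is only a sketch: the bit positions are an assumption taken
from the x86-64 layout in arch/x86/include/asm/pgtable_types.h of that
era, where PAGE_BIT_SPECIAL and PAGE_BIT_CPA_TEST both alias software
bit 1 (bit 9). The names decode_pte and has_test_bit are made up for
this example.

```python
# Assumed x86-64 PTE flag positions (from pgtable_types.h of that era).
# Bit 9 is software bit 1, shared by PAGE_BIT_SPECIAL and PAGE_BIT_CPA_TEST.
PTE_BITS = {
    0: "PRESENT",
    1: "RW",
    2: "USER",
    3: "PWT",
    4: "PCD",
    5: "ACCESSED",
    6: "DIRTY",
    7: "PSE/PAT",
    8: "GLOBAL",
    9: "SOFTW1",
    63: "NX",
}
SOFTW1_BIT = 9


def decode_pte(pte):
    """Return the names of the known flag bits set in a raw PTE value."""
    return [name for bit, name in sorted(PTE_BITS.items()) if (pte >> bit) & 1]


def has_test_bit(pte):
    """True if software bit 1 (SPECIAL / CPA_TEST) is set."""
    return bool((pte >> SOFTW1_BIT) & 1)


if __name__ == "__main__":
    reports = [
        ("bad pte after revert", 0x8000000054118363),
        ("bad pte", 0x8000000059990060),
    ]
    for label, pte in reports:
        flags = ", ".join(decode_pte(pte))
        print(f"{label}: {pte:#018x} test bit set: {has_test_bit(pte)} [{flags}]")
```

Under that assumed layout, the "after revert" value still has bit 9
set, while the other value has bit 9 clear and does not have PRESENT
set either. That matches the observation that in both cases the bit
the test modifies is in the wrong state.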