From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S965035AbcJXNoE (ORCPT ); Mon, 24 Oct 2016 09:44:04 -0400
Received: from mx0b-00082601.pphosted.com ([67.231.153.30]:54880
	"EHLO mx0a-00082601.pphosted.com" rhost-flags-OK-OK-OK-FAIL)
	by vger.kernel.org with ESMTP id S938809AbcJXNoA (ORCPT );
	Mon, 24 Oct 2016 09:44:00 -0400
Subject: Re: bio linked list corruption.
To: Dave Jones , Andy Lutomirski , Andy Lutomirski , Linus Torvalds ,
	Jens Axboe , Al Viro , Josef Bacik , David Sterba , linux-btrfs ,
	Linux Kernel
References: <20161018234248.GB93792@clm-mbp.masoncoding.com>
	<332c8e94-a969-093f-1fb4-30d89be8993e@kernel.org>
	<20161020225028.czodw54tjbiwwv3o@codemonkey.org.uk>
	<20161020230341.jsxpia2sy53xn5l5@codemonkey.org.uk>
	<20161021200245.kahjzgqzdfyoe3uz@codemonkey.org.uk>
	<20161022152033.gkmm3l75kqjzsije@codemonkey.org.uk>
	<20161024044051.onmh4h6sc2bjxzzc@codemonkey.org.uk>
From: Chris Mason
Message-ID: <77d9983d-a00a-1dc1-a9a1-631de1d0c146@fb.com>
Date: Mon, 24 Oct 2016 09:42:39 -0400
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
	Thunderbird/45.3.0
MIME-Version: 1.0
In-Reply-To: <20161024044051.onmh4h6sc2bjxzzc@codemonkey.org.uk>
Content-Type: text/plain; charset="windows-1252"; format=flowed
Content-Transfer-Encoding: 7bit
X-Originating-IP: [2620:10d:c091:180::773d]
X-ClientProxiedBy: DM5PR17CA0016.namprd17.prod.outlook.com (10.168.112.154)
	To MWHPR15MB1246.namprd15.prod.outlook.com (10.175.3.8)
X-MS-Office365-Filtering-Correlation-Id: 8d248e95-d73e-4365-8dcf-08d3fc13a2dc
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 10/24/2016 12:40 AM, Dave Jones wrote:
> On Sun, Oct 23, 2016 at 05:32:21PM -0400, Chris Mason wrote:
> >
> >
> > On 10/22/2016 11:20 AM, Dave Jones wrote:
> > > On Fri, Oct 21, 2016 at 04:02:45PM -0400, Dave Jones wrote:
> > >
> > > > > It could be worth trying this, too:
> > > > >
> > > > > https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=x86/vmap_stack&id=174531fef4e8
> > > > >
> > > > > It occurred to me that the current code is a little bit fragile.
> > > >
> > > > It's been nearly 24hrs with the above changes, and it's been pretty much
> > > > silent the whole time.
> > > >
> > > > The only thing of note over that time period has been a btrfs lockdep
> > > > warning that's been around for a while, and occasional btrfs checksum
> > > > failures, which I've been seeing for a while, but seem to have gotten
> > > > worse since 4.8.
> > > >
> > > > I'm pretty confident in the disk being ok in this machine, so I think
> > > > the checksum warnings are bogus. Chris suggested they may be the result
> > > > of memory corruption, but there's little else going on.
> > >
> > > The only interesting thing last nights run was this..
> > >
> > > BUG: Bad page state in process kworker/u8:1 pfn:4e2b70
> > > page:ffffea00138adc00 count:0 mapcount:0 mapping:ffff88046e9fc2e0 index:0xdf0
> > > flags: 0x400000000000000c(referenced|uptodate)
> > > page dumped because: non-NULL mapping
> > > CPU: 3 PID: 24234 Comm: kworker/u8:1 Not tainted 4.9.0-rc1-think+ #11
> > > Workqueue: writeback wb_workfn (flush-btrfs-2)
> >
> > Well crud, we're back to wondering if this is Btrfs or the stack
> > corruption. Since the pagevecs are on the stack and this is a new
> > crash, my guess is you'll be able to trigger it on xfs/ext4 too. But we
> > should make sure.
>
> Here's an interesting one from today, pointing the finger at xattrs again.
>
>
> [69943.450108] Oops: 0003 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> [69943.454452] CPU: 1 PID: 21558 Comm: trinity-c60 Not tainted 4.9.0-rc1-think+ #11
> [69943.463510] task: ffff8804f8dd3740 task.stack: ffffc9000b108000
> [69943.468077] RIP: 0010:[]

Was this btrfs?

-chris