From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756745AbcEXSjV (ORCPT ); Tue, 24 May 2016 14:39:21 -0400 Received: from mail-oi0-f65.google.com ([209.85.218.65]:33952 "EHLO mail-oi0-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753823AbcEXSjS (ORCPT ); Tue, 24 May 2016 14:39:18 -0400 Subject: Re: Regression in 4.6.0-git - bisected to commit dd254f5a382c To: Al Viro References: <57437683.30008@lwfinger.net> <20160524001854.GW14480@ZenIV.linux.org.uk> <57447CE1.9020207@lwfinger.net> <20160524162812.GY14480@ZenIV.linux.org.uk> Cc: LKML From: Larry Finger Message-ID: <57449FD2.8090808@lwfinger.net> Date: Tue, 24 May 2016 13:39:14 -0500 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.7.0 MIME-Version: 1.0 In-Reply-To: <20160524162812.GY14480@ZenIV.linux.org.uk> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 05/24/2016 11:28 AM, Al Viro wrote: > On Tue, May 24, 2016 at 11:10:09AM -0500, Larry Finger wrote: > >> For now, the following one-line hack allows my system to boot: >> >> diff --git a/fs/read_write.c b/fs/read_write.c >> index 933b53a..d5d64d9 100644 >> --- a/fs/read_write.c >> +++ b/fs/read_write.c >> @@ -721,6 +721,7 @@ static ssize_t do_loop_readv_writev(struct file *filp, >> struct iov_iter *iter, >> ret += nr; >> if (nr != iovec.iov_len) >> break; >> + nr = max_t(ssize_t, nr, 1); >> iov_iter_advance(iter, nr); >> } >> >> I have no idea what subtle bug in do_loop_readv_writev() is causing nr to be >> zero, but it seems to have been exposed by commit dd254f5a382c. > > This is definitely broken. What happens is that something calls readv() > or writev() with one of the iovecs in the middle of the vector having > zero ->iov_len. The call of iov_iter_advance(iter, 0) used to step to > the next iovec in such situation; the change in this commit leaves that to the > next primitive called, and most of the time that works just fine. > do_loop_readv_writev(), though, is picking the size to be passed to > ->write()/->read() from the current iovec, and if it had been zero from > the very beginning... guess what's going to happen. > > I'm somewhat curious about the source of that syscall - it's a good testcase > to add to ltp/xfstests, but it smells very odd for normal userland code. > It certainly needs to be fixed kernel-side, but the code issuing that is > probably worth looking into. Oddities are like mushrooms - found one, look > around for more and don't assume that the rest will be of the same species... > > I'm looking into iterate_and_advance right now; hopefully will have a sane > replacement shortly... I did a pull from mainline this morning (commit 84787c572d402) and the problem has somehow been fixed. I did not see any likely candidates in the commit messages, and I'm bisecting again to see which change did it. Larry