From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-1.1 required=3.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4E80EC43387 for ; Sat, 12 Jan 2019 00:06:38 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 138792183F for ; Sat, 12 Jan 2019 00:06:38 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="SYHbrJcx" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726467AbfALAGh (ORCPT ); Fri, 11 Jan 2019 19:06:37 -0500 Received: from mail-pg1-f194.google.com ([209.85.215.194]:38989 "EHLO mail-pg1-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725927AbfALAGg (ORCPT ); Fri, 11 Jan 2019 19:06:36 -0500 Received: by mail-pg1-f194.google.com with SMTP id w6so6972088pgl.6; Fri, 11 Jan 2019 16:06:36 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=I1nIkiX2Tc0fkHb2d3oi30EcJ5EKDLXYgwIE4qsSO1U=; b=SYHbrJcxws5IH1/bObn9UX5FfQMLdLmPpIZ4BqAW7TRPmrXWva4+MIqo/cYl8Cl9D+ sC2ftCpz095nAlWotr4ZRcnrxYdzfWVhoN7RQlvH4WUYY66W3Wp/fEiGVKcz3W7fdr6P XjpEg154xE7cD+9wre0l/3CiJiIjPQpYfKpnaX/WZ4Knf/IOuIZAEhNzPhZXIGmu/Cir fNzEH9owIcZkB8EMZ/Oz4IzlOGa15se/j5+sPPHNaqfK9sZRC1twcR1AuiauWfYPfWek zWaMApWrFPj5XkdpktL7YnCPOvN8+F1uLxlx+u3jjG336y8g8RABRvojjhnJDJgZ7z/i m5YQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=I1nIkiX2Tc0fkHb2d3oi30EcJ5EKDLXYgwIE4qsSO1U=; b=Ez8LTJnl68C7dSb74gVubD4geiUkM6Mikkm/pfLCrSKzkZp4TlMUklQ0xUSTPMlfAK fNhp3RHvexVEVOUp7EZtMlSLVUu9WZMCwxRUDtHu18S77eZ6PwJK/4REUiaUkOapg6oG mgb5CewBBphmOwoRQw2Y2Vu+CtoPA2YsOdGs91dPG//H00ys6W3qCrv3yOr9DfHMT4VB T0rawTJPCDYy6ltpOD3kTsG+ZIo2JSaHsK0qYZva2hhr3hjmA5t4L+vlUUUfPZrc7ou7 i7piGUi2hQXSc7ctZSFoNNOo9aTQW3rnAtmhl3mTPbKqPSCUuGMfaPmwgahVdyB1xZ43 IfIw== X-Gm-Message-State: AJcUukdZfRQ7V+dk22yaL9ZJJexwC5tQpFW4n25ZB5rCPv6Tei/RBzNa 0IkyjI9YmDKAI6iSrRra+T65ZcIlKaGTLbPn39I= X-Google-Smtp-Source: ALg8bN64s+smcL6DSPQ8YTXNWRjEWZgK/cgGRZqAYJPc1PB+Y0UCBrtrxp+lmHmCPEcMCJj1QA6TB8teXf1NGubPtMs= X-Received: by 2002:a65:6684:: with SMTP id b4mr15225053pgw.55.1547251595388; Fri, 11 Jan 2019 16:06:35 -0800 (PST) MIME-Version: 1.0 References: <20190111132817.GE6310@bombadil.infradead.org> <2E820BDF-2984-4D02-A5A9-E25635348211@dilger.ca> In-Reply-To: <2E820BDF-2984-4D02-A5A9-E25635348211@dilger.ca> From: Steve French Date: Fri, 11 Jan 2019 18:06:23 -0600 Message-ID: Subject: Re: scp bug due to progress indicator when copying from remote to local on Linux To: Andreas Dilger Cc: Matthew Wilcox , CIFS , linux-fsdevel , Pavel Shilovsky Content-Type: text/plain; charset="UTF-8" Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org On Fri, Jan 11, 2019 at 5:51 PM Andreas Dilger wrote: > > On Jan 11, 2019, at 3:02 PM, Steve French wrote: > > > > On Fri, Jan 11, 2019 at 3:48 PM Andreas Dilger wrote: > >> > >> On Jan 11, 2019, at 2:13 PM, Steve French wrote: > >>> > >>> On Fri, Jan 11, 2019 at 7:28 AM Matthew Wilcox wrote: > >>>> > >>>> On Fri, Jan 11, 2019 at 12:11:00AM -0600, Steve French wrote: > >>>>> In discussing an interesting problem with scp recently with Pavel, we > >>>>> found a fairly obvious bug in scp when it is run with a progress > >>>>> indicator (which is the default when the source file is remote). > >>>>> > >>>>> scp triggers "SIGALRM" probably from update_progress_meter() in > >>>>> progressmeter.c when executed with "scp localhost:somelargefile /mnt" > >>>>> ie with an ssh source path but a local path for the target, as long as > > > > > >> My typical response for issues like this is to find the github repo for > >> the code and push a patch to fix the code myself. If I can't find the > >> upstream project/source repo easily, download the .src.rpm and make a > >> patch and submit it to the major distro package maintainers (RHEL, SLES, > >> Ubuntu or Debian), so that at least it is fixed for the vast majority > >> of users, and those maintainers can handle pushing the patch further > >> upstream (and smaller distros will copy patches/sources from them). > > > > I have built the scp code as an experiment and by the way there are many, > > many things that could be changed (other copy tools e.g. robocopy even rsync > > and cp) look much more mature (with lots more useful performance options). > > My earlier experiments e.g. changing the scp i/o size to something larger > > did help as expected for various cases (16K is suboptimal in many cases). > > Years ago I submitted a patch to fix GNU fileutils so that "cp" could use > st_blksize to determine the data IO size for the copy. Originally it was > allocate its copy buffer on the stack, but when Lustre returned st_blksize > of 8MB it exploded. 2MB was safe at the time, and what we used for many > years but fixing "cp" allowed us to increase st_blksize over time. Probably > scp should be modified to do the same. > > It is often just a case of the developer didn't know any better, or the code > was written so long ago that 16KB transfers were seen as a huge improvement > over 4KB or 1KB transfers, and nobody has looked at it again. Since we all > depend on other developers to know their side of the house (e.g. I'm wholly > dependent on others for the crypto implementation of SSH/scp), we may as > well provide some help when there is something as widely used as scp that > we can improve in some useful way. > > > But in this case it seems much simpler - the code path is basically > > read 16K over ssh > > write 16K to local/network file system (cached) > > (repeated 1000s of times) > > then ftruncate > > Even if the filesystem is transferring large IOs from disk/network, doing > many thousands of excess system calls is pure overhead that could be avoided. > Also, even if scp could only do 16KB network transfers, staying in tight > "scp copies data over ssh" loop is better for icache/dcache/context, then > doing a single large write call to/from the filesystem is most efficient > on that side of the house. > > >> ... why would the truncate() take a long time even if the file > >> is large, since it shouldn't actually be doing any work to change the > >> size of the file (I'm assuming that the target file is either empty or > >> about the same size during an overwrite, you didn't mention anything > >> unusual as a requirement). > > > > ftruncate if it were simply setting the file size would be very fast, > > except that setting the file size needs to have the data written out > > (writes otherwise could risk extending the file size) so triggers a > > call to writepages (filemap_write_and_wait) as a sideeffect > > so all the cached data is finally written out which can take a few > > seconds and cause the scp progress indicator to malfunction > > and trigger the SIGALRM. This is normally not a problem > > fsync etc. can take longer than a single i/o operation. > > This seems more like a kernel/filesystem issue. We shouldn't have to flush > the entire file to disk just to change the file size. In the exceedingly > common case of the current size == truncate() size, the truncate should be > a no-op unless the on-disk size is > new size? In the unlikely case that > file size > new size then it could truncate only the pages and on-disk > blocks beyond the new size? > > I don't see how pages in cache < new size could cause the file size to be > extended. Is this a VFS/MM thing, or CIFS-specific? I suspect that for various other network/cluster file systems, flushing (not to disk but at least to the server, ie across the network) is considered safer to do before certain metadata updates (including setting file size) -- Thanks, Steve