From: Andreas Dilger
Message-Id: <45C4394E-1E3B-496A-BD7A-0374CD8E3399@dilger.ca>
Subject: Re: [LSF/MM TOPIC] Enhancing Copy Tools for Linux FS
Date: Mon, 11 Feb 2019 01:32:11 -0700
To: Steve French
Cc: lsf-pc@lists.linux-foundation.org, linux-fsdevel, CIFS, samba-technical
References: <9039FE7D-7AE6-4732-A3AA-52AD9BA02B4C@dilger.ca>
X-Mailing-List: linux-fsdevel@vger.kernel.org

On Feb 8, 2019, at 4:56 PM, Steve French wrote:
>
> On Fri, Feb 8, 2019 at 5:03 PM Steve French wrote:
>>
>> On Fri, Feb 8, 2019 at 4:37 PM Andreas Dilger wrote:
>>>
>>> On Feb 8, 2019, at 8:19 AM, Steve French wrote:
>>>>
>>>> Current Linux copy tools have various problems compared to other
>>>> platforms - small I/O sizes (and not even configurable for most),
>>>
>>> Hmm, this comment puzzles me, since "cp" already uses st_blksize
>>> returned for the file as the IO size?  Not sure if tar/rsync do
>>> the same, but if they don't already use st_blksize they should.
>
> I did some experiments changing the block size returned from 1K to 64K to 1MB
> and see no difference in the copy size used by cp (it was always 128K in all
> the cases when caching is disabled)

Strange.  I just re-tested this on Lustre, in case something had changed in
GNU coreutils that I didn't notice, and it worked fine for me, using both
"cp --version = 8.4" on RHEL and "cp --version = 8.26" on Ubuntu:

$ dd if=/dev/urandom of=/tmp/foo bs=1M count=12
$ strace -v cp /tmp/foo /testfs/tmp
:
open("/tmp/foo", O_RDONLY) = 3
fstat(3, {... st_blksize=4096, st_blocks=24576, st_size=12582912, ...}) = 0
open("/testfs/tmp/foo", O_WRONLY|O_CREAT|O_EXCL, 0664) = 4
fstat(4, { ... st_blksize=4194304, st_blocks=0, st_size=0, ...}) = 0
read(3, "h\230#`\2\223\273\3423W\24\222:\2113w\327"..., 4194304) = 4194304
write(4, "h\230#`\2\223\273\3423W\24\222:\2113w\327"..., 4194304) = 4194304
:

Note that the "st_blksize=4194304" returned by Lustre for the target file
matches the read and write buffer size used by "cp".  The same is true if
the Lustre file is the source rather than the target, so cp probably picks
the maximum of both:

open("/testfs/tmp/foo", O_RDONLY) = 3
fstat(3, {... st_blksize=4194304, st_blocks=24576, st_size=12582912 ...}) = 0
open("/tmp/bar", O_WRONLY|O_TRUNC) = 4
fstat(4, {... st_blksize=4096, st_blocks=0, st_size=0 ...}) = 0
read(3, "h\230#`\2\223\273\3423W\24\222:\2113w\327"..., 4194304) = 4194304
write(4, "h\230#`\2\223\273\3423W\24\222:\2113w\327"..., 4194304) = 4194304
:

Running the same command with /tmp as the target uses a smaller 32768-byte
buffer, which is larger than the "st_blksize=4096" reported for /tmp, and
correspondingly more read/write calls:

$ strace -v cp /tmp/foo /tmp/baz
:
open("/tmp/baz", O_WRONLY|O_CREAT|O_EXCL, 0664) = 4
fstat(4, {... st_blksize=4096, st_blocks=0, st_size=0, ...}) = 0
read(3, "h\230#`\2\223\273\3423W\24\222:\2113w\327"..., 32768) = 32768
write(4, "h\230#`\2\223\273\3423W\24\222:\2113w\327"..., 32768) = 32768
:

In this case, cp probably has some minimum buffer size it uses to avoid the
poor performance of using 4KB blocks.

Cheers, Andreas
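
The behaviour inferred from the traces above (a buffer sized to the larger of the two st_blksize values, with a floor when both are small) can be sketched roughly as follows. This is only an illustration of that hypothesis, not the actual cp implementation: the names pick_copy_buf_size and copy_with_blksize are hypothetical, and the 32768-byte floor is an assumption taken from the last trace rather than from the cp source.

```python
import os

# Assumed floor: the last trace shows 32768-byte reads when both files
# report st_blksize=4096. This value is a guess, not read from cp's source.
MIN_BUF_SIZE = 32 * 1024


def pick_copy_buf_size(src_blksize, dst_blksize, floor=MIN_BUF_SIZE):
    """Return the larger st_blksize hint, but never less than the floor."""
    return max(src_blksize, dst_blksize, floor)


def copy_with_blksize(src_path, dst_path):
    """Copy src_path to dst_path and return the buffer size that was used."""
    src_fd = os.open(src_path, os.O_RDONLY)
    try:
        dst_fd = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o664)
        try:
            # fstat() on each descriptor yields the filesystem's I/O size hint.
            buf_size = pick_copy_buf_size(os.fstat(src_fd).st_blksize,
                                          os.fstat(dst_fd).st_blksize)
            while True:
                chunk = os.read(src_fd, buf_size)
                if not chunk:
                    break  # end of file
                view = memoryview(chunk)
                while view:
                    # os.write() may write fewer bytes than requested.
                    written = os.write(dst_fd, view)
                    view = view[written:]
        finally:
            os.close(dst_fd)
    finally:
        os.close(src_fd)
    return buf_size
```

With the values from the traces, pick_copy_buf_size(4096, 4194304) gives the 4 MiB Lustre size regardless of which side Lustre is on, and pick_copy_buf_size(4096, 4096) falls back to the assumed floor. Running the copy under "strace -e trace=read,write" should show whether a given filesystem's st_blksize is actually being honoured.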