From: Steve French <smfrench@gmail.com>
To: Andreas Dilger <adilger@dilger.ca>
Cc: lsf-pc@lists.linux-foundation.org,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
CIFS <linux-cifs@vger.kernel.org>,
samba-technical <samba-technical@lists.samba.org>
Subject: Re: [LSF/MM TOPIC] Enhancing Copy Tools for Linux FS
Date: Mon, 11 Feb 2019 11:43:58 -0600
Message-ID: <CAH2r5mu1CGsvJZYcddDidS+j5_Gv+YeD9EDNHGpijux2u4fpKw@mail.gmail.com>
In-Reply-To: <45C4394E-1E3B-496A-BD7A-0374CD8E3399@dilger.ca>
On Mon, Feb 11, 2019 at 2:32 AM Andreas Dilger <adilger@dilger.ca> wrote:
>
> On Feb 8, 2019, at 4:56 PM, Steve French <smfrench@gmail.com> wrote:
> >
> > On Fri, Feb 8, 2019 at 5:03 PM Steve French <smfrench@gmail.com> wrote:
> >>
> >> On Fri, Feb 8, 2019 at 4:37 PM Andreas Dilger <adilger@dilger.ca> wrote:
> >>>
> >>> On Feb 8, 2019, at 8:19 AM, Steve French <smfrench@gmail.com> wrote:
<snip>
> > I did some experiments changing the block size returned from 1K to 64K to 1MB
> > and see no difference in the copy size used by cp (it was always 128K in all
> > the cases when caching is disabled)
I figured out the problem - I read your note as meaning s_blocksize (the block
size in the superblock) rather than st_blksize (the per-file block size
returned by stat).
Changing st_blksize (stat->blksize) to 4MB did lead to better performance
(and large I/O matching the block size) for uncached cp.
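
For anyone who wants to check which value a mount reports without running
strace, here is a minimal sketch in Python. The helper names are hypothetical,
and statvfs's f_bsize is only used as a rough userspace stand-in for the
filesystem-level block size; it is not guaranteed to equal s_blocksize on
every filesystem.

```python
import os
import tempfile


def file_blksize(path):
    """Per-file preferred I/O size, i.e. st_blksize from stat(2)."""
    return os.stat(path).st_blksize


def fs_blksize(path):
    """Filesystem-level block size as reported by statvfs (f_bsize).

    Assumption: used here only as an approximation of the superblock
    block size; the two are not guaranteed to match.
    """
    return os.statvfs(path).f_bsize


# Demonstrate on a scratch file; point these helpers at a file on the
# mount under test (e.g. a CIFS or Lustre mount) to see its values.
with tempfile.NamedTemporaryFile() as f:
    blksize = file_blksize(f.name)
    fs_bsize = fs_blksize(f.name)
    print(f"st_blksize={blksize} f_bsize={fs_bsize}")
```

The two numbers can differ, which is the distinction that mattered here: the
strace output in this thread shows cp sizing its buffers from the fstat()
value.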
> Strange. I just re-tested this on Lustre, in case something had changed in
> GNU fileutils that I didn't notice, and it worked fine for me, using both
> "cp --version = 8.4" on RHEL and "cp --version = 8.26" on Ubuntu:
>
> $ dd if=/dev/urandom of=/tmp/foo bs=1M count=12
> $ strace -v cp /tmp/foo /testfs/tmp
> :
> open("/tmp/foo", O_RDONLY) = 3
> fstat(3, {... st_blksize=4096, st_blocks=24576, st_size=12582912, ...}) = 0
> open("/testfs/tmp/foo", O_WRONLY|O_CREAT|O_EXCL, 0664) = 4
> fstat(4, { ... st_blksize=4194304, st_blocks=0, st_size=0, ...}) = 0
> read(3, "h\230#`\2\223\273\3423W\24\222:\2113w\327"..., 4194304) = 4194304
> write(4, "h\230#`\2\223\273\3423W\24\222:\2113w\327"..., 4194304) = 4194304
> :
>
> Note the "st_blksize=4194304" for the target file returned by Lustre matches
> the read and write buffer size used by "cp". The same is true if Lustre is
> the source file and not the target, so it probably picks the maximum of both:
>
> open("/testfs/tmp/foo", O_RDONLY) = 3
> fstat(3, {... st_blksize=4194304, st_blocks=24576, st_size=12582912 ...}) = 0
> open("/tmp/bar", O_WRONLY|O_TRUNC) = 4
> fstat(4, {... st_blksize=4096, st_blocks=0, st_size=0 ...}) = 0
> read(3, "h\230#`\2\223\273\3423W\24\222:\2113w\327"..., 4194304) = 4194304
> write(4, "h\230#`\2\223\273\3423W\24\222:\2113w\327"..., 4194304) = 4194304
> :
>
> Running the same command with /tmp as the target uses a smaller buffer size
> (32768 bytes in the read/write calls) and correspondingly more read/write calls:
>
> $ strace -v cp /tmp/foo /tmp/baz
> :
> open("/tmp/baz", O_WRONLY|O_CREAT|O_EXCL, 0664) = 4
> fstat(4, {... st_blksize=4096, st_blocks=0, st_size=0, ...}) = 0
> read(3, "h\230#`\2\223\273\3423W\24\222:\2113w\327"..., 32768) = 32768
> write(4, "h\230#`\2\223\273\3423W\24\222:\2113w\327"..., 32768) = 32768
> :
>
> In this case, cp probably has some minimum buffer size it uses to avoid the
> poor performance of using 4KB blocks.
Yes - although the code is a little hard to follow, the minimum looks like
128K in my system's version of cp (Ubuntu).
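
The behavior discussed above (a floor on the buffer size, raised to the larger
st_blksize of source and target) can be sketched roughly as follows. This is
not the coreutils code, only an illustration of what the strace output
suggests; the function names and the 128K floor are assumptions taken from the
observations in this thread.

```python
import os

# Assumed floor: 128K was observed on Ubuntu in this thread, while the
# older cp in the quoted strace used 32768.
MIN_BUF = 128 * 1024


def pick_buf_size(src_fd, dst_fd, floor=MIN_BUF):
    """Largest of the floor and the st_blksize of both open files."""
    return max(floor,
               os.fstat(src_fd).st_blksize,
               os.fstat(dst_fd).st_blksize)


def copy_file(src, dst):
    """Copy src to dst with read/write calls sized by pick_buf_size().

    Returns the buffer size that was used.
    """
    src_fd = os.open(src, os.O_RDONLY)
    try:
        dst_fd = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            size = pick_buf_size(src_fd, dst_fd)
            while True:
                chunk = os.read(src_fd, size)
                if not chunk:
                    break
                # os.write may write fewer bytes than requested, so loop
                # until the whole chunk has been written.
                view = memoryview(chunk)
                while view:
                    written = os.write(dst_fd, view)
                    view = view[written:]
            return size
        finally:
            os.close(dst_fd)
    finally:
        os.close(src_fd)
```

With this shape, a target reporting st_blksize=4194304 would get 4MB reads and
writes, and a target reporting 4096 would fall back to the floor.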
--
Thanks,
Steve