All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] ovl: improving copy-up efficiency for sparse file
@ 2019-10-04 13:20 Chengguang Xu
  2019-10-15 10:26 ` 回复:[PATCH] " Chengguang Xu
  0 siblings, 1 reply; 4+ messages in thread
From: Chengguang Xu @ 2019-10-04 13:20 UTC (permalink / raw)
  To: miklos; +Cc: linux-unionfs, Chengguang Xu

Current copy-up is not efficient for sparse file,
It's not only slow but also wasting more disk space
when the target lower file has huge hole inside.
This patch tries to recognize file hole and skip it
during copy-up.

In detail, this optimization checks the hole according
to copy-up chunk size so it may not recognize all kind
of holes in the file. However, it is easy to implement
and will be enough for most of the time.

Additionally, this optimization relies on lseek(2)
SEEK_DATA implementation, so for some specific
filesystems which do not support this feature
will behave as before on copy-up.

Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
---
 fs/overlayfs/copy_up.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index b801c6353100..028033c9f021 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -144,10 +144,11 @@ static int ovl_copy_up_data(struct path *old, struct path *new, loff_t len)
 		goto out;
 	/* Couldn't clone, so now we try to copy the data */
 
-	/* FIXME: copy up sparse files efficiently */
 	while (len) {
 		size_t this_len = OVL_COPY_UP_CHUNK_SIZE;
 		long bytes;
+		loff_t old_next_data_pos;
+		loff_t hole_len;
 
 		if (len < this_len)
 			this_len = len;
@@ -157,6 +158,18 @@ static int ovl_copy_up_data(struct path *old, struct path *new, loff_t len)
 			break;
 		}
 
+		old_next_data_pos = vfs_llseek(old_file, old_pos, SEEK_DATA);
+		if (old_next_data_pos >= old_pos + OVL_COPY_UP_CHUNK_SIZE) {
+			hole_len = (old_next_data_pos - old_pos) /
+				OVL_COPY_UP_CHUNK_SIZE * OVL_COPY_UP_CHUNK_SIZE;
+			old_pos += hole_len;
+			new_pos += hole_len;
+			len -= hole_len;
+			continue;
+		} else if (old_next_data_pos == -ENXIO) {
+			break;
+		}
+
 		bytes = do_splice_direct(old_file, &old_pos,
 					 new_file, &new_pos,
 					 this_len, SPLICE_F_MOVE);
-- 
2.21.0

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* 回复:[PATCH] ovl: improving copy-up efficiency for sparse file
  2019-10-04 13:20 [PATCH] ovl: improving copy-up efficiency for sparse file Chengguang Xu
@ 2019-10-15 10:26 ` Chengguang Xu
  2019-10-15 14:26   ` [PATCH] " Amir Goldstein
  0 siblings, 1 reply; 4+ messages in thread
From: Chengguang Xu @ 2019-10-15 10:26 UTC (permalink / raw)
  To: miklos, linux-unionfs; +Cc: Chengguang Xu

 ---- 在 星期五, 2019-10-04 21:20:30 Chengguang Xu <cgxu519@mykernel.net> 撰写 ----
 > Current copy-up is not efficient for sparse file,
 > It's not only slow but also wasting more disk space
 > when the target lower file has huge hole inside.
 > This patch tries to recognize file hole and skip it
 > during copy-up.
 > 
 > In detail, this optimization checks the hole according
 > to copy-up chunk size so it may not recognize all kind
 > of holes in the file. However, it is easy to implement
 > and will be enough for most of the time.
 > 
 > Additionally, this optimization relies on lseek(2)
 > SEEK_DATA implementation, so for some specific
 > filesystems which do not support this feature
 > will behave as before on copy-up.
 > 
 > Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
 > ---

Any better idea or suggestion for this?


Thanks,
Chengguang


 >  fs/overlayfs/copy_up.c | 15 ++++++++++++++-
 >  1 file changed, 14 insertions(+), 1 deletion(-)
 > 
 > diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
 > index b801c6353100..028033c9f021 100644
 > --- a/fs/overlayfs/copy_up.c
 > +++ b/fs/overlayfs/copy_up.c
 > @@ -144,10 +144,11 @@ static int ovl_copy_up_data(struct path *old, struct path *new, loff_t len)
 >          goto out;
 >      /* Couldn't clone, so now we try to copy the data */
 >  
 > -    /* FIXME: copy up sparse files efficiently */
 >      while (len) {
 >          size_t this_len = OVL_COPY_UP_CHUNK_SIZE;
 >          long bytes;
 > +        loff_t old_next_data_pos;
 > +        loff_t hole_len;
 >  
 >          if (len < this_len)
 >              this_len = len;
 > @@ -157,6 +158,18 @@ static int ovl_copy_up_data(struct path *old, struct path *new, loff_t len)
 >              break;
 >          }
 >  
 > +        old_next_data_pos = vfs_llseek(old_file, old_pos, SEEK_DATA);
 > +        if (old_next_data_pos >= old_pos + OVL_COPY_UP_CHUNK_SIZE) {
 > +            hole_len = (old_next_data_pos - old_pos) /
 > +                OVL_COPY_UP_CHUNK_SIZE * OVL_COPY_UP_CHUNK_SIZE;
 > +            old_pos += hole_len;
 > +            new_pos += hole_len;
 > +            len -= hole_len;
 > +            continue;
 > +        } else if (old_next_data_pos == -ENXIO) {
 > +            break;
 > +        }
 > +
 >          bytes = do_splice_direct(old_file, &old_pos,
 >                       new_file, &new_pos,
 >                       this_len, SPLICE_F_MOVE);
 > -- 
 > 2.21.0
 > 
 > 
 >

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] ovl: improving copy-up efficiency for sparse file
  2019-10-15 10:26 ` 回复:[PATCH] " Chengguang Xu
@ 2019-10-15 14:26   ` Amir Goldstein
  2019-10-16 10:19     ` Chengguang Xu
  0 siblings, 1 reply; 4+ messages in thread
From: Amir Goldstein @ 2019-10-15 14:26 UTC (permalink / raw)
  To: cgxu519; +Cc: miklos, linux-unionfs

On Tue, Oct 15, 2019 at 2:38 PM Chengguang Xu <cgxu519@mykernel.net> wrote:
>
>  ---- 在 星期五, 2019-10-04 21:20:30 Chengguang Xu <cgxu519@mykernel.net> 撰写 ----
>  > Current copy-up is not efficient for sparse file,
>  > It's not only slow but also wasting more disk space
>  > when the target lower file has huge hole inside.
>  > This patch tries to recognize file hole and skip it
>  > during copy-up.
>  >
>  > In detail, this optimization checks the hole according
>  > to copy-up chunk size so it may not recognize all kind
>  > of holes in the file. However, it is easy to implement
>  > and will be enough for most of the time.

I must say I do not see how aligning to copy-up chunk size
simplifies the change. Why is that more complicated?

if (old_next_data_pos >= old_pos) {
      hole_len = old_next_data_pos - old_pos;
...

It can still copy hole up to this_len, because there is no
SEEK_HOLE, so you can document that.

>  >
>  > Additionally, this optimization relies on lseek(2)
>  > SEEK_DATA implementation, so for some specific
>  > filesystems which do not support this feature
>  > will behave as before on copy-up.
>  >

I am not sure if we support any lower fs with no f_op->llseek
in that case, copy up will not behave as before - it will return
-ESPIPE and will be a regression.

>  > Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
>  > ---
>
> Any better idea or suggestion for this?

This change should be accompanied with proper xfstests examining
all sorts of sparse files.
See overlay/001 and _run_seek_sanity_test for inspiration.

Perhaps you can run all _run_seek_sanity_test tests on
lower fs, then mount overlay. copy up all the sanity test
files and then check something???

Thanks,
Amir.


>
>
>  >  fs/overlayfs/copy_up.c | 15 ++++++++++++++-
>  >  1 file changed, 14 insertions(+), 1 deletion(-)
>  >
>  > diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
>  > index b801c6353100..028033c9f021 100644
>  > --- a/fs/overlayfs/copy_up.c
>  > +++ b/fs/overlayfs/copy_up.c
>  > @@ -144,10 +144,11 @@ static int ovl_copy_up_data(struct path *old, struct path *new, loff_t len)
>  >          goto out;
>  >      /* Couldn't clone, so now we try to copy the data */
>  >
>  > -    /* FIXME: copy up sparse files efficiently */
>  >      while (len) {
>  >          size_t this_len = OVL_COPY_UP_CHUNK_SIZE;
>  >          long bytes;
>  > +        loff_t old_next_data_pos;
>  > +        loff_t hole_len;
>  >
>  >          if (len < this_len)
>  >              this_len = len;
>  > @@ -157,6 +158,18 @@ static int ovl_copy_up_data(struct path *old, struct path *new, loff_t len)
>  >              break;
>  >          }
>  >
>  > +        old_next_data_pos = vfs_llseek(old_file, old_pos, SEEK_DATA);
>  > +        if (old_next_data_pos >= old_pos + OVL_COPY_UP_CHUNK_SIZE) {
>  > +            hole_len = (old_next_data_pos - old_pos) /
>  > +                OVL_COPY_UP_CHUNK_SIZE * OVL_COPY_UP_CHUNK_SIZE;
>  > +            old_pos += hole_len;
>  > +            new_pos += hole_len;
>  > +            len -= hole_len;
>  > +            continue;
>  > +        } else if (old_next_data_pos == -ENXIO) {
>  > +            break;
>  > +        }
>  > +
>  >          bytes = do_splice_direct(old_file, &old_pos,
>  >                       new_file, &new_pos,
>  >                       this_len, SPLICE_F_MOVE);
>  > --

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] ovl: improving copy-up efficiency for sparse file
  2019-10-15 14:26   ` [PATCH] " Amir Goldstein
@ 2019-10-16 10:19     ` Chengguang Xu
  0 siblings, 0 replies; 4+ messages in thread
From: Chengguang Xu @ 2019-10-16 10:19 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: miklos, linux-unionfs

 ---- 在 星期二, 2019-10-15 22:26:35 Amir Goldstein <amir73il@gmail.com> 撰写 ----
 > On Tue, Oct 15, 2019 at 2:38 PM Chengguang Xu <cgxu519@mykernel.net> wrote:
 > >
 > >  ---- 在 星期五, 2019-10-04 21:20:30 Chengguang Xu <cgxu519@mykernel.net> 撰写 ----
 > >  > Current copy-up is not efficient for sparse file,
 > >  > It's not only slow but also wasting more disk space
 > >  > when the target lower file has huge hole inside.
 > >  > This patch tries to recognize file hole and skip it
 > >  > during copy-up.
 > >  >
 > >  > In detail, this optimization checks the hole according
 > >  > to copy-up chunk size so it may not recognize all kind
 > >  > of holes in the file. However, it is easy to implement
 > >  > and will be enough for most of the time.
 > 
 > I must say I do not see how aligning to copy-up chunk size
 > simplifies the change. Why is that more complicated?

Hi Amir,

Thanks for your feedback. 
I would like to say if we do not align to copy-up chunk size then we have to recognize exact size of 
every hole in the lower file and that may need relying on both SEEK_HOLE and SEEK_DATA methods. 
However, currently SEEK_HOLE implementation seems not so reliable. Look deep into the code,
generic_file_llseek() could not work correctly for SEEK_HOLE in our check, so it means for filesystems
which don't have their own version of  f_op->llseek function will get problem.


 > 
 > if (old_next_data_pos >= old_pos) {
 >       hole_len = old_next_data_pos - old_pos;
 > ...
 > 
 > It can still copy hole up to this_len, because there is no
 > SEEK_HOLE, so you can document that.

I'll do.

 > 
 > >  >
 > >  > Additionally, this optimization relies on lseek(2)
 > >  > SEEK_DATA implementation, so for some specific
 > >  > filesystems which do not support this feature
 > >  > will behave as before on copy-up.
 > >  >
 > 
 > I am not sure if we support any lower fs with no f_op->llseek
 > in that case, copy up will not behave as before - it will return
 > -ESPIPE and will be a regression.

We need adding a check for lower fs llseek support and catching
llseek error for safety. If we notice the optimization could not work
then we back to original copy-up behavior.


 > 
 > >  > Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
 > >  > ---
 > >
 > > Any better idea or suggestion for this?
 > 
 > This change should be accompanied with proper xfstests examining
 > all sorts of sparse files.
 > See overlay/001 and _run_seek_sanity_test for inspiration.
 > 
 > Perhaps you can run all _run_seek_sanity_test tests on
 > lower fs, then mount overlay. copy up all the sanity test
 > files and then check something???
 > 

Yeah, thanks for your suggestion, I'll do more work on testing.

Thanks,
Chengguang

 > 
 > 
 > >
 > >
 > >  >  fs/overlayfs/copy_up.c | 15 ++++++++++++++-
 > >  >  1 file changed, 14 insertions(+), 1 deletion(-)
 > >  >
 > >  > diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
 > >  > index b801c6353100..028033c9f021 100644
 > >  > --- a/fs/overlayfs/copy_up.c
 > >  > +++ b/fs/overlayfs/copy_up.c
 > >  > @@ -144,10 +144,11 @@ static int ovl_copy_up_data(struct path *old, struct path *new, loff_t len)
 > >  >          goto out;
 > >  >      /* Couldn't clone, so now we try to copy the data */
 > >  >
 > >  > -    /* FIXME: copy up sparse files efficiently */
 > >  >      while (len) {
 > >  >          size_t this_len = OVL_COPY_UP_CHUNK_SIZE;
 > >  >          long bytes;
 > >  > +        loff_t old_next_data_pos;
 > >  > +        loff_t hole_len;
 > >  >
 > >  >          if (len < this_len)
 > >  >              this_len = len;
 > >  > @@ -157,6 +158,18 @@ static int ovl_copy_up_data(struct path *old, struct path *new, loff_t len)
 > >  >              break;
 > >  >          }
 > >  >
 > >  > +        old_next_data_pos = vfs_llseek(old_file, old_pos, SEEK_DATA);
 > >  > +        if (old_next_data_pos >= old_pos + OVL_COPY_UP_CHUNK_SIZE) {
 > >  > +            hole_len = (old_next_data_pos - old_pos) /
 > >  > +                OVL_COPY_UP_CHUNK_SIZE * OVL_COPY_UP_CHUNK_SIZE;
 > >  > +            old_pos += hole_len;
 > >  > +            new_pos += hole_len;
 > >  > +            len -= hole_len;
 > >  > +            continue;
 > >  > +        } else if (old_next_data_pos == -ENXIO) {
 > >  > +            break;
 > >  > +        }
 > >  > +
 > >  >          bytes = do_splice_direct(old_file, &old_pos,
 > >  >                       new_file, &new_pos,
 > >  >                       this_len, SPLICE_F_MOVE);
 > >  > --
 >

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2019-10-16 10:19 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-04 13:20 [PATCH] ovl: improving copy-up efficiency for sparse file Chengguang Xu
2019-10-15 10:26 ` 回复:[PATCH] " Chengguang Xu
2019-10-15 14:26   ` [PATCH] " Amir Goldstein
2019-10-16 10:19     ` Chengguang Xu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.