* [PATCH] generic/286: fix integer underflow on block sizes != 4096
@ 2021-05-22 18:48 Jakob Unterwurzacher
  2021-05-23  9:05 ` Eryu Guan
  0 siblings, 1 reply; 6+ messages in thread
From: Jakob Unterwurzacher @ 2021-05-22 18:48 UTC
  To: fstests; +Cc: Jakob Unterwurzacher

The read loop always requested 4096 bytes, which only works
when the total read length is a multiple of 4096 bytes.

This is not necessarily true, and when it's not, len wraps
around to UINT64_MAX and you get a lot of these:

	ERROR: [error:38] reached EOF:Success

This was caught when running xfstests against gocryptfs,
an encrypted overlay file system.

On ext4, the test still passes after this change.

Signed-off-by: Jakob Unterwurzacher <jakobunt@gmail.com>
---
 src/seek_copy_test.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/seek_copy_test.c b/src/seek_copy_test.c
index 0c2c6a3d..28c021e2 100644
--- a/src/seek_copy_test.c
+++ b/src/seek_copy_test.c
@@ -98,7 +98,7 @@ do_extent_copy(int src_fd, int dest_fd, off_t data_off, off_t hole_off)
 	}
 
 	while (len > 0) {
-		ssize_t nr_read = read(src_fd, buf, BUF_SIZE);
+		ssize_t nr_read = read(src_fd, buf, MIN(len, BUF_SIZE));
 		if (nr_read < 0) {
 			if (errno == EINTR)
 				continue;
-- 
2.31.1
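
To illustrate the wraparound described in the commit message, here is a
minimal standalone sketch of the loop's length accounting. It is not the
code from src/seek_copy_test.c: simulate_copy() is a hypothetical helper,
it assumes that len is a uint64_t which the loop decrements by the number
of bytes read (the patch context suggests this but does not show it), and
it pretends that every read returns exactly the requested byte count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BUF_SIZE 4096
#define MIN(a, b) ((a) < (b) ? (a) : (b))

/*
 * Simulate the length accounting of the copy loop without touching a file.
 * Assumption: each "read" returns exactly the number of bytes requested.
 * Returns the number of loop iterations, or -1 if len wrapped around.
 */
static int simulate_copy(uint64_t len, int clamp)
{
	int iterations = 0;

	while (len > 0) {
		/* clamp == 0 models the old code, clamp != 0 the patched code */
		size_t request = clamp ? (size_t)MIN(len, (uint64_t)BUF_SIZE)
				       : (size_t)BUF_SIZE;
		uint64_t before = len;

		assert(request > 0);
		len -= request;		/* unsigned: wraps if request > before */
		iterations++;
		if (len > before)
			return -1;	/* wrapped around towards UINT64_MAX */
	}
	return iterations;
}
```

Under these assumptions, simulate_copy(4128, 0) reports a wrap on the
second iteration, while simulate_copy(4128, 1) finishes after two reads
(4096 + 32 bytes). A length that is a multiple of 4096 behaves the same
either way.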



* Re: [PATCH] generic/286: fix integer underflow on block sizes != 4096
  2021-05-22 18:48 [PATCH] generic/286: fix integer underflow on block sizes != 4096 Jakob Unterwurzacher
@ 2021-05-23  9:05 ` Eryu Guan
  2021-05-25 17:34   ` Jakob Unterwurzacher
  0 siblings, 1 reply; 6+ messages in thread
From: Eryu Guan @ 2021-05-23  9:05 UTC
  To: Jakob Unterwurzacher; +Cc: fstests

On Sat, May 22, 2021 at 08:48:14PM +0200, Jakob Unterwurzacher wrote:
> The read loop always requested 4096 bytes, which only works
> when the total read length is a multiple of 4096 bytes.

The total read length should be

"The length of this extent is (hole_off - data_off)"

according to the comments above do_extent_copy(). Total read length
being not a multiple of 4k means 'data_off' or 'hole_off' is not 4k
aligned.

> 
> This is not necessarily true, and when it's not, len wraps

But generic/286 creates source files in which the lengths of all data
extents and hole extents are multiples of 4k. So I still don't understand
why this is valid for gocryptfs. Shouldn't that be a bug in
seek_data/seek_hole in gocryptfs? Could you please elaborate?

Thanks,
Eryu

> around to UINT64_MAX and you get a lot of these:
> 
> 	ERROR: [error:38] reached EOF:Success
> 
> This was caught when running xfstests against gocryptfs,
> an encrypted overlay file system.
> 
> On ext4, the test still passes after this change.
> 
> Signed-off-by: Jakob Unterwurzacher <jakobunt@gmail.com>
> ---
>  src/seek_copy_test.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/seek_copy_test.c b/src/seek_copy_test.c
> index 0c2c6a3d..28c021e2 100644
> --- a/src/seek_copy_test.c
> +++ b/src/seek_copy_test.c
> @@ -98,7 +98,7 @@ do_extent_copy(int src_fd, int dest_fd, off_t data_off, off_t hole_off)
>  	}
>  
>  	while (len > 0) {
> -		ssize_t nr_read = read(src_fd, buf, BUF_SIZE);
> +		ssize_t nr_read = read(src_fd, buf, MIN(len, BUF_SIZE));
>  		if (nr_read < 0) {
>  			if (errno == EINTR)
>  				continue;
> -- 
> 2.31.1


* Re: [PATCH] generic/286: fix integer underflow on block sizes != 4096
  2021-05-23  9:05 ` Eryu Guan
@ 2021-05-25 17:34   ` Jakob Unterwurzacher
  2021-05-26  3:20     ` Eryu Guan
  0 siblings, 1 reply; 6+ messages in thread
From: Jakob Unterwurzacher @ 2021-05-25 17:34 UTC
  To: Eryu Guan; +Cc: fstests

On Sun, May 23, 2021 at 11:05 AM Eryu Guan <guan@eryu.me> wrote:
> The total read length should be
>
> "The length of this extent is (hole_off - data_off)"
>
> according to the comments above do_extent_copy(). Total read length
> being not a multiple of 4k means 'data_off' or 'hole_off' is not 4k
> aligned.

That is correct.

> But generic/286 creates source files with length of all data extents and
> hole extents being multiple of 4k. So I still don't understand why this
> is valid for gocryptfs. Shouldn't that be a bug in seek_data/seek_hole
> in gocryptfs? Could you please elaborate?

Yes, sure. The situation is a bit complicated. gocryptfs works similarly
to eCryptFS and EncFS (also overlay filesystems). The files are stored in
encrypted form in regular files on ext4, xfs, or whatever "real disk"
filesystem. Disk space allocation and file holes are handled by the real
filesystem. A gocryptfs mount shows a decrypted view of these files.

gocryptfs uses AES-GCM for encryption. This adds 32 bytes of overhead to
every 4096-byte block, which gives a storage size of 4128 bytes per block.

The encryption overhead is why the data extents and holes created by
generic/286, which are 4k-aligned on disk, are not 4k-aligned when
viewed through the gocryptfs mount.

Thanks, Jakob


* Re: [PATCH] generic/286: fix integer underflow on block sizes != 4096
  2021-05-25 17:34   ` Jakob Unterwurzacher
@ 2021-05-26  3:20     ` Eryu Guan
  2021-05-26  3:41       ` Darrick J. Wong
  0 siblings, 1 reply; 6+ messages in thread
From: Eryu Guan @ 2021-05-26  3:20 UTC
  To: Jakob Unterwurzacher; +Cc: Eryu Guan, fstests

On Tue, May 25, 2021 at 07:34:14PM +0200, Jakob Unterwurzacher wrote:
> On Sun, May 23, 2021 at 11:05 AM Eryu Guan <guan@eryu.me> wrote:
> > The total read length should be
> >
> > "The length of this extent is (hole_off - data_off)"
> >
> > according to the comments above do_extent_copy(). Total read length
> > being not a multiple of 4k means 'data_off' or 'hole_off' is not 4k
> > aligned.
> 
> That is correct.
> 
> > But generic/286 creates source files with length of all data extents and
> > hole extents being multiple of 4k. So I still don't understand why this
> > is valid for gocryptfs. Shouldn't that be a bug in seek_data/seek_hole
> > in gocryptfs? Could you please elaborate?
> 
> Yes sure, the situation is a bit complicated. gocryptfs works similarly
> to eCryptFS and EncFS (also overlay filesystems).
> The files are stored in encrypted form in regular files on ext4 or xfs
> or whatever "real disk" filesystem.
> Disk space allocation & file holes are handled by the real filesystem.
> A gocryptfs mount shows a decrypted view of these files.
> 
> Now, gocryptfs uses AES-GCM for encryption. This adds 32 bytes of
> overhead to every 4096-byte block,
> which gives a storage size of 4128 bytes.

Ah, that makes sense to me now. Would you please include the detailed
explanation in the commit log as well?

Thanks,
Eryu

> 
> The encryption overhead is why the files & holes created by
> generic/286 are not 4k-aligned on disk when viewed through the
> gocryptfs mount.
> 
> Thanks, Jakob


* Re: [PATCH] generic/286: fix integer underflow on block sizes != 4096
  2021-05-26  3:20     ` Eryu Guan
@ 2021-05-26  3:41       ` Darrick J. Wong
  2021-05-26  8:02         ` Jakob Unterwurzacher
  0 siblings, 1 reply; 6+ messages in thread
From: Darrick J. Wong @ 2021-05-26  3:41 UTC
  To: Eryu Guan; +Cc: Jakob Unterwurzacher, Eryu Guan, fstests

On Wed, May 26, 2021 at 11:20:37AM +0800, Eryu Guan wrote:
> On Tue, May 25, 2021 at 07:34:14PM +0200, Jakob Unterwurzacher wrote:
> > On Sun, May 23, 2021 at 11:05 AM Eryu Guan <guan@eryu.me> wrote:
> > > The total read length should be
> > >
> > > "The length of this extent is (hole_off - data_off)"
> > >
> > > according to the comments above do_extent_copy(). Total read length
> > > being not a multiple of 4k means 'data_off' or 'hole_off' is not 4k
> > > aligned.
> > 
> > That is correct.
> > 
> > > But generic/286 creates source files with length of all data extents and
> > > hole extents being multiple of 4k. So I still don't understand why this
> > > is valid for gocryptfs. Shouldn't that be a bug in seek_data/seek_hole
> > > in gocryptfs? Could you please elaborate?
> > 
> > Yes sure, the situation is a bit complicated. gocryptfs works similarly
> > to eCryptFS and EncFS (also overlay filesystems).
> > The files are stored in encrypted form in regular files on ext4 or xfs
> > or whatever "real disk" filesystem.
> > Disk space allocation & file holes are handled by the real filesystem.
> > A gocryptfs mount shows a decrypted view of these files.
> > 
> > Now, gocryptfs uses AES-GCM for encryption. This adds 32 bytes of
> > overhead to every 4096-byte block,
> > which gives a storage size of 4128 bytes.
> 
> Ah, that makes sense to me now. Would you please include the detailed
> explanation in commit log as well?

...and maybe a sample output of a seek_data/seek_hole scan between a
gocryptfs file and the ext4fs underneath it?  I'm still trying to wrap
my head around what the problem here is.

It might also help to describe where the 32 bytes of overhead goes --
are you interleaving the overhead inline with 4k of encrypted content?

--D

> 
> Thanks,
> Eryu
> 
> > 
> > The encryption overhead is why the files & holes created by
> > generic/286 are not 4k-aligned on disk when viewed through the
> > gocryptfs mount.
> > 
> > Thanks, Jakob


* Re: [PATCH] generic/286: fix integer underflow on block sizes != 4096
  2021-05-26  3:41       ` Darrick J. Wong
@ 2021-05-26  8:02         ` Jakob Unterwurzacher
  0 siblings, 0 replies; 6+ messages in thread
From: Jakob Unterwurzacher @ 2021-05-26  8:02 UTC
  To: Darrick J. Wong; +Cc: Eryu Guan, Eryu Guan, fstests

> > Ah, that makes sense to me now. Would you please include the detailed
> > explanation in commit log as well?
>
> ...and maybe a sample output of a seek_data/seek_hole scan between a
> gocryptfs file and the ext4fs underneath it?  I'm still trying to wrap
> my head around what the problem here is.
>
> It might also help to describe where the 32 bytes of overhead goes --
> are you interleaving the overhead inline with 4k of encrypted content?

Yes, it's all inline, the file format is like this [2]:

file header  ... 18 bytes
data block 1 ... 16 bytes block header  (IV)
                 1-4096 bytes user data
                 16 bytes block trailer (MAC)
[more data blocks...]

I am attaching the SEEK_DATA/SEEK_HOLE trace below [1]. One seek goes like this:
(1) gocryptfs gets a seek() call via FUSE
(2) translate plaintext to ciphertext offset
(3) call ext4 seek
(4) translate back to plaintext offset and return
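
The two "translate" steps can be sketched in C as below. This is only an
illustration under stated assumptions, not code taken from gocryptfs: the
function names are hypothetical, the constants come from the file format
above (18-byte header, 16-byte IV plus 16-byte MAC per 4096-byte block),
and the formulas were inferred from that layout and checked against a few
values in the trace [1].

```c
#include <assert.h>
#include <stdint.h>

enum {
	HEADER_LEN     = 18,			/* file header */
	PLAIN_BLOCK    = 4096,			/* user data per block */
	BLOCK_OVERHEAD = 16 + 16,		/* IV + MAC */
	CIPHER_BLOCK   = PLAIN_BLOCK + BLOCK_OVERHEAD,
};

static_assert(CIPHER_BLOCK == 4128, "storage size of one full block");

/* Hypothetical helper: plaintext offset -> ciphertext offset (step 2). */
static uint64_t plain_to_cipher(uint64_t plain_off)
{
	if (plain_off == 0)
		return 0;
	/* number of blocks needed to hold plain_off bytes of user data */
	uint64_t blocks = (plain_off - 1) / PLAIN_BLOCK + 1;
	return plain_off + blocks * BLOCK_OVERHEAD + HEADER_LEN;
}

/* Hypothetical helper: ciphertext offset -> plaintext offset (step 4). */
static uint64_t cipher_to_plain(uint64_t cipher_off)
{
	if (cipher_off <= HEADER_LEN)
		return 0;
	/* number of (possibly partial) blocks covered by cipher_off bytes */
	uint64_t blocks = (cipher_off - 1 - HEADER_LEN) / CIPHER_BLOCK + 1;
	uint64_t overhead = blocks * BLOCK_OVERHEAD + HEADER_LEN;
	return cipher_off > overhead ? cipher_off - overhead : 0;
}
```

For example, the second line of the trace has ext4 return 1060864 for
SEEK_HOLE, and cipher_to_plain(1060864) gives 1052622, the value reported
to userspace. That offset is not a multiple of 4096, which is the case
the patch guards against.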

Actually, Eryu's first comment got me thinking about this. What gocryptfs
does is a little stupid: the offsets returned to userspace may point into
the middle of a block, but gocryptfs reads and writes in blocks of 4128
bytes (= 4096 bytes of user data), except at EOF, so it may as well round
up at seek time.

That would also mean that the offsets returned to userspace are aligned
to 4096 bytes, and generic/286 would just work as is. Another filesystem
that does not align to 4096 bytes may still hit the underflow, but it
won't be gocryptfs :)
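
As a rough sketch of the rounding idea only: the helper below rounds a
plaintext offset up to the next 4096-byte boundary. The name
round_up_to_block() is hypothetical, and nothing here says how gocryptfs
would actually implement this or whether rounding up is appropriate for
both SEEK_DATA and SEEK_HOLE results.

```c
#include <assert.h>
#include <stdint.h>

#define PLAIN_BLOCK 4096

/* Round a plaintext offset up to the next multiple of PLAIN_BLOCK. */
static uint64_t round_up_to_block(uint64_t off)
{
	/* guard against overflow in the addition below */
	assert(off <= UINT64_MAX - (PLAIN_BLOCK - 1));
	return (off + PLAIN_BLOCK - 1) / PLAIN_BLOCK * PLAIN_BLOCK;
}
```

Applied to two offsets from the trace [1], 1052622 becomes 1052672 and
5242862 becomes 5242880, both multiples of 4096. An offset that is already
aligned is returned unchanged.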

Thanks, Jakob

[1]:
gocryptfs seek(        0, SEEK_DATA) -> translate -> ext4 seek(        0, SEEK_DATA) =         0 -> translate -> return         0
gocryptfs seek(        0, SEEK_HOLE) -> translate -> ext4 seek(        0, SEEK_HOLE) =   1060864 -> translate -> return   1052622
gocryptfs seek(  1052622, SEEK_DATA) -> translate -> ext4 seek(  1060864, SEEK_DATA) =   5283840 -> translate -> return   5242862
gocryptfs seek(  5242862, SEEK_HOLE) -> translate -> ext4 seek(  5283840, SEEK_HOLE) =   6344704 -> translate -> return   6295502
gocryptfs seek(  6295502, SEEK_DATA) -> translate -> ext4 seek(  6344704, SEEK_DATA) =  10567680 -> translate -> return  10485742
gocryptfs seek( 10485742, SEEK_HOLE) -> translate -> ext4 seek( 10567680, SEEK_HOLE) =  11628544 -> translate -> return  11538382
gocryptfs seek( 11538382, SEEK_DATA) -> translate -> ext4 seek( 11628544, SEEK_DATA) =  15851520 -> translate -> return  15728622
gocryptfs seek( 15728622, SEEK_HOLE) -> translate -> ext4 seek( 15851520, SEEK_HOLE) =  16912384 -> translate -> return  16781262
gocryptfs seek( 16781262, SEEK_DATA) -> translate -> ext4 seek( 16912384, SEEK_DATA) =  21135360 -> translate -> return  20971502
gocryptfs seek( 20971502, SEEK_HOLE) -> translate -> ext4 seek( 21135360, SEEK_HOLE) =  22196224 -> translate -> return  22024142
gocryptfs seek( 22024142, SEEK_DATA) -> translate -> ext4 seek( 22196224, SEEK_DATA) =  26419200 -> translate -> return  26214382
gocryptfs seek( 26214382, SEEK_HOLE) -> translate -> ext4 seek( 26419200, SEEK_HOLE) =  27480064 -> translate -> return  27267022
gocryptfs seek( 27267022, SEEK_DATA) -> translate -> ext4 seek( 27480064, SEEK_DATA) =  31703040 -> translate -> return  31457262
gocryptfs seek( 31457262, SEEK_HOLE) -> translate -> ext4 seek( 31703040, SEEK_HOLE) =  32763904 -> translate -> return  32509902
gocryptfs seek( 32509902, SEEK_DATA) -> translate -> ext4 seek( 32763904, SEEK_DATA) =  36986880 -> translate -> return  36700142
gocryptfs seek( 36700142, SEEK_HOLE) -> translate -> ext4 seek( 36986880, SEEK_HOLE) =  38047744 -> translate -> return  37752782
gocryptfs seek( 37752782, SEEK_DATA) -> translate -> ext4 seek( 38047744, SEEK_DATA) =  42270720 -> translate -> return  41943022
gocryptfs seek( 41943022, SEEK_HOLE) -> translate -> ext4 seek( 42270720, SEEK_HOLE) =  43331584 -> translate -> return  42995662
gocryptfs seek( 42995662, SEEK_DATA) -> translate -> ext4 seek( 43331584, SEEK_DATA) =  47554560 -> translate -> return  47185902
gocryptfs seek( 47185902, SEEK_HOLE) -> translate -> ext4 seek( 47554560, SEEK_HOLE) =  48615424 -> translate -> return  48238542
gocryptfs seek( 48238542, SEEK_DATA) -> translate -> ext4 seek( 48615424, SEEK_DATA) =  52838400 -> translate -> return  52428782
gocryptfs seek( 52428782, SEEK_HOLE) -> translate -> ext4 seek( 52838400, SEEK_HOLE) =  53899264 -> translate -> return  53481422
gocryptfs seek( 53481422, SEEK_DATA) -> translate -> ext4 seek( 53899264, SEEK_DATA) =  58122240 -> translate -> return  57671662
gocryptfs seek( 57671662, SEEK_HOLE) -> translate -> ext4 seek( 58122240, SEEK_HOLE) =  59183104 -> translate -> return  58724302
gocryptfs seek( 58724302, SEEK_DATA) -> translate -> ext4 seek( 59183104, SEEK_DATA) =  63406080 -> translate -> return  62914542
gocryptfs seek( 62914542, SEEK_HOLE) -> translate -> ext4 seek( 63406080, SEEK_HOLE) =  64466944 -> translate -> return  63967182
gocryptfs seek( 63967182, SEEK_DATA) -> translate -> ext4 seek( 64466944, SEEK_DATA) =  68689920 -> translate -> return  68157422
gocryptfs seek( 68157422, SEEK_HOLE) -> translate -> ext4 seek( 68689920, SEEK_HOLE) =  69750784 -> translate -> return  69210062
gocryptfs seek( 69210062, SEEK_DATA) -> translate -> ext4 seek( 69750784, SEEK_DATA) =  73973760 -> translate -> return  73400302
gocryptfs seek( 73400302, SEEK_HOLE) -> translate -> ext4 seek( 73973760, SEEK_HOLE) =  75034624 -> translate -> return  74452942
gocryptfs seek( 74452942, SEEK_DATA) -> translate -> ext4 seek( 75034624, SEEK_DATA) =  79257600 -> translate -> return  78643182
gocryptfs seek( 78643182, SEEK_HOLE) -> translate -> ext4 seek( 79257600, SEEK_HOLE) =  80318464 -> translate -> return  79695822
gocryptfs seek( 79695822, SEEK_DATA) -> translate -> ext4 seek( 80318464, SEEK_DATA) =  84541440 -> translate -> return  83886062
gocryptfs seek( 83886062, SEEK_HOLE) -> translate -> ext4 seek( 84541440, SEEK_HOLE) =  85602304 -> translate -> return  84938702
gocryptfs seek( 84938702, SEEK_DATA) -> translate -> ext4 seek( 85602304, SEEK_DATA) =  89825280 -> translate -> return  89128942
gocryptfs seek( 89128942, SEEK_HOLE) -> translate -> ext4 seek( 89825280, SEEK_HOLE) =  90886144 -> translate -> return  90181582
gocryptfs seek( 90181582, SEEK_DATA) -> translate -> ext4 seek( 90886144, SEEK_DATA) =  95109120 -> translate -> return  94371822
gocryptfs seek( 94371822, SEEK_HOLE) -> translate -> ext4 seek( 95109120, SEEK_HOLE) =  96169984 -> translate -> return  95424462
gocryptfs seek( 95424462, SEEK_DATA) -> translate -> ext4 seek( 96169984, SEEK_DATA) = 100392960 -> translate -> return  99614702
gocryptfs seek( 99614702, SEEK_HOLE) -> translate -> ext4 seek(100392960, SEEK_HOLE) = 101453824 -> translate -> return 100667342
gocryptfs seek(100667342, SEEK_DATA) -> translate -> ext4 seek(101453824, SEEK_DATA) = 105676800 -> translate -> return 104857582
gocryptfs seek(104857582, SEEK_HOLE) -> translate -> ext4 seek(105676800, SEEK_HOLE) = 106733586 -> translate -> return 105906176

[2]: https://github.com/rfjakob/gocryptfs/blob/master/Documentation/file-format.md

