All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
To: Dave Chinner <david@fromorbit.com>,
	"Gustavo A. R. Silva" <gustavoars@kernel.org>
Cc: "Darrick J. Wong" <djwong@kernel.org>,
	linux-xfs@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-hardening@vger.kernel.org,
	Kees Cook <keescook@chromium.org>
Subject: Re: [PATCH][next] xfs: Replace one-element arrays with flexible-array members
Date: Sun, 5 Feb 2023 18:21:09 -0600	[thread overview]
Message-ID: <edd631aa-9f12-fe45-e381-f75c384861e9@embeddedor.com> (raw)
In-Reply-To: <20230205225119.GU360264@dread.disaster.area>



On 2/5/23 16:51, Dave Chinner wrote:
> On Thu, Feb 02, 2023 at 07:24:50PM -0600, Gustavo A. R. Silva wrote:
>> One-element arrays are deprecated, and we are replacing them with flexible
>> array members instead. So, replace one-element arrays with flexible-array
>> members in structures xfs_attr_leaf_name_local and
>> xfs_attr_leaf_name_remote.
>>
>> The only binary differences reported after the changes are all like
>> these:
>>
>> fs/xfs/libxfs/xfs_attr_leaf.o
>> _@@ -435,7 +435,7 @@
>>        3b8:      movzbl 0x2(%rbx),%eax
>>        3bc:      rol    $0x8,%bp
>>        3c0:      movzwl %bp,%ebp
>> -     3c3:      lea    0x2(%rax,%rbp,1),%ebx
>> +     3c3:      lea    0x3(%rax,%rbp,1),%ebx
>>        3c7:      call   3cc <xfs_attr_leaf_entsize+0x8c>
>>                          3c8: R_X86_64_PLT32     __tsan_func_exit-0x4
>>        3cc:      or     $0x3,%ebx
>> _@@ -454,7 +454,7 @@
>>        3ea:      movzbl 0x8(%rbx),%ebx
>>        3ee:      call   3f3 <xfs_attr_leaf_entsize+0xb3>
>>                          3ef: R_X86_64_PLT32     __tsan_func_exit-0x4
>> -     3f3:      add    $0xa,%ebx
>> +     3f3:      add    $0xb,%ebx
>>        3f6:      or     $0x3,%ebx
>>        3f9:      add    $0x1,%ebx
>>        3fc:      mov    %ebx,%eax
>>
>> similar changes in fs/xfs/scrub/attr.o and fs/xfs/xfs.o object files.
> 
> That seems like a red flag to me - an off-by-one change in the
> compiled code that calculates of the on-disk size of a structure as
> a result of an in-memory structure change just smells like a bug.

Ughh..

You're right. I somehow got confused between the moment I first
build-tested this in my build machine and after a final last-minute
review I did on the machine from which I ultimately send the patches
out.

More comments below...

> 
> How did you test this change?
> 
>> And the reason for this is because of the round_up() macro called in
>> functions xfs_attr_leaf_entsize_remote() and xfs_attr_leaf_entsize_local(),
>> which is compensanting for the one-byte reduction in size (due to the
>> flex-array transformation) of structures xfs_attr_leaf_name_remote and
>> xfs_attr_leaf_name_local. So, sizes remain the same before and after
>> changes.
> 
> I'm not sure that is true. Before this change:

Yeah; this in fact was a final last-minute review I did before sending out
the patch, and it was when I noticed the round_up() macro was doing something
quite idiomatic when it comes to calculating the sizes of structures containing
one-element arrays. People usually subtract the sizeof(type-of-one-element)
from the sizeof(struct-with-one-element-array) when they perform other
calculations. And in this case as the sizeof(type-of-one-element) is one byte,
at the moment I thought that subtraction was because of that, and then when I
build-tested that final change, I totally forgot about the padding (I had
actually noticed it when I modified the structure definitions :/) and now I
see I got all confused.

> 
> sizeof(xfs_attr_leaf_name_local_t) = 4
> sizeof(xfs_attr_leaf_name_remote_t) = 12
> 
> After this change:
> 
> sizeof(xfs_attr_leaf_name_local_t) = 4
> sizeof(xfs_attr_leaf_name_remote_t) = 12

Yes; in fact I noticed that. :/

> 
> i.e. no change because the structures aren't defined as packed
> structures.  Hence the compiler pads them to out to 4 byte alignment
> naturally regardless of the flex array definition. pahole on x86-64
> also confirms that the (padded) size of the structure is not
> changed.

Yep; I actually was going to include the pahole output for both structures
in the changelog text, but I decided not to do it at the last minute as
I didn't see it necessary because, as you pointed out, the sizes before
and after the flex-array transformations are the same.

> 
> However, the on-disk structure it is being used to decode is packed,
> and we're only using pointer arithmetic to pull the location of the
> name/value pairs out of the buffer to copy them - it's the structure
> size calculations that actually define the size of the structures
> for a given name length, not the sizeof() value or the flex array
> definitions...
> 
>> This helps with the ongoing efforts to tighten the FORTIFY_SOURCE
>> routines on memcpy() and help us make progress towards globally
>> enabling -fstrict-flex-arrays=3 [1].
>>
>> Link: https://github.com/KSPP/linux/issues/79
>> Link: https://github.com/KSPP/linux/issues/251
>> Link: https://gcc.gnu.org/pipermail/gcc-patches/2022-October/602902.html [1]
>> Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
>> ---
>>   fs/xfs/libxfs/xfs_da_format.h | 8 ++++----
>>   1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/fs/xfs/libxfs/xfs_da_format.h b/fs/xfs/libxfs/xfs_da_format.h
>> index 25e2841084e1..e1e62ebb0c44 100644
>> --- a/fs/xfs/libxfs/xfs_da_format.h
>> +++ b/fs/xfs/libxfs/xfs_da_format.h
>> @@ -620,14 +620,14 @@ typedef struct xfs_attr_leaf_entry {	/* sorted on key, not name */
>>   typedef struct xfs_attr_leaf_name_local {
>>   	__be16	valuelen;		/* number of bytes in value */
>>   	__u8	namelen;		/* length of name bytes */
>> -	__u8	nameval[1];		/* name/value bytes */
>> +	__u8	nameval[];		/* name/value bytes */
>>   } xfs_attr_leaf_name_local_t;
>>   
>>   typedef struct xfs_attr_leaf_name_remote {
>>   	__be32	valueblk;		/* block number of value bytes */
>>   	__be32	valuelen;		/* number of bytes in value */
>>   	__u8	namelen;		/* length of name bytes */
>> -	__u8	name[1];		/* name bytes */
>> +	__u8	name[];			/* name bytes */
>>   } xfs_attr_leaf_name_remote_t;
>>   
>>   typedef struct xfs_attr_leafblock {
>> @@ -747,13 +747,13 @@ xfs_attr3_leaf_name_local(xfs_attr_leafblock_t *leafp, int idx)
>>    */
>>   static inline int xfs_attr_leaf_entsize_remote(int nlen)
>>   {
>> -	return round_up(sizeof(struct xfs_attr_leaf_name_remote) - 1 +
>> +	return round_up(sizeof(struct xfs_attr_leaf_name_remote) +
>>   			nlen, XFS_ATTR_LEAF_NAME_ALIGN);
>>   }
> 
> To be honest, the actual padding and alignment calculations are
> kinda whacky because that's the way they were defined back in 1995.
> And, well, once set in the on-disk format, it can't easily be
> changed. FYI, here's the original definition from 1995:
> 
> #define XFS_ATTR_LEAF_ENTSIZE_REMOTE(nlen)	/* space for remote struct */ \
> 	(((sizeof(xfs_attr_leaf_name_remote_t)-1 + (nlen)) +3)&~0x3)
> 
> So apart using round_up and defines instead of magic numbers, the
> current calculation is unchanged from the original definition.
> 
> AFAICT, the modification you are proposing above breaks this because the
> sizeof(xfs_attr_leaf_name_remote) result has not changed with the
> change of the structure definition.
> 
> e.g. if namelen = 17, before we had:
> 
> 	size	= round_up(12 - 1 + 17, 4)
> 		= round_up(28, 4)
> 		= 28
> 
> Which is correct because the on-disk format is packed:
> 
>          0   4   89  12      20   26 28
> 	+---+---++--+-------+-----+-+-----....
>                    |---------------| 17 bytes of name.
> 		                  |-| 2 bytes of padding
> 				    |-----.... Next attr record.
> 
> We end up with 2 bytes of padded between the end of the name and the
> start of the next attribute record in the block.
> 
> But after this patch, now we calculate the size as:
> 
> 	size	= round_up(12 + 17, 4)
> 		= round_up(29, 4)
> 		= 32
> 
> Which is a different result, and would result in incorrect parsing
> of the attribute records in the buffer. Hence I don't think it is
> valid to be changing the entsize calculations like this if sizeof()
> is not changing results.

Yep; you're right.

> 
> Which comes back to my original question: how did you test this?

I compared the generated object files in fs/xfs/, fs/xfs/scrub/ and
fs/xfs/libxfs/ before and after the changes with something like
these[1]:

ARGS=--disassemble --demangle --reloc --no-show-raw-insn --section=.text
for i in $(cd $OUT/xfs/before && echo *.o); do  echo $i; diff -u <(objdump $ARGS $OUT/xfs/before/$i | sed "0,/^Disassembly/d") <(objdump $ARGS $OUT/xfs/after/$i 
| sed "0,/^Disassembly/d"); done

where of course the generated object files before the changes are
located in OUT/xfs/before/ and the ones after changes in $OUT/xfs/after/

I just double-checked and, indeed, the changes I mentioned in the
changelog text only show up when I modify the entsize functions.

So, because of the padding, the flex-array transformations don't
actually affect the sizes of the involved structures. So, it seems
that change is enough and is the correct one.

I really appreciate your comments and feedback, Dave. And I'm sorry
for the confusion.

Thank you!
--
Gustavo

[1] https://outflux.net/blog/archives/2022/06/24/finding-binary-differences/

  reply	other threads:[~2023-02-06  0:21 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-03  1:24 [PATCH][next] xfs: Replace one-element arrays with flexible-array members Gustavo A. R. Silva
2023-02-03 17:53 ` Kees Cook
2023-02-06 19:17   ` Gustavo A. R. Silva
2023-02-03 21:32 ` Darrick J. Wong
2023-02-05 22:51 ` Dave Chinner
2023-02-06  0:21   ` Gustavo A. R. Silva [this message]
  -- strict thread matches above, loose matches on Subject: below --
2021-03-02 15:05 Gustavo A. R. Silva
2021-03-09 17:42 ` Darrick J. Wong
2021-03-09 19:57   ` Gustavo A. R. Silva
2021-03-09 21:26     ` Darrick J. Wong
2021-03-09 22:03       ` Gustavo A. R. Silva

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=edd631aa-9f12-fe45-e381-f75c384861e9@embeddedor.com \
    --to=gustavo@embeddedor.com \
    --cc=david@fromorbit.com \
    --cc=djwong@kernel.org \
    --cc=gustavoars@kernel.org \
    --cc=keescook@chromium.org \
    --cc=linux-hardening@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.