linux-toolchains.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: Non-packed structures in IP headers
       [not found] <AS8PR05MB78952FE7E8D82245D309DEBCE7AA9@AS8PR05MB7895.eurprd05.prod.outlook.com>
@ 2021-10-01 20:10 ` Florian Weimer
  2021-10-02 15:54   ` David Laight
  2021-10-04 10:30   ` Cufi, Carles
  0 siblings, 2 replies; 6+ messages in thread
From: Florian Weimer @ 2021-10-01 20:10 UTC (permalink / raw)
  To: Cufi, Carles
  Cc: netdev, jukka.rissanen, johan.hedberg, Lubos, Robert, Bursztyka,
	Tomasz, linux-toolchains

* Carles Cufi:

> I was looking through the structures for IPv{4,6} packet headers and
> noticed that several of those that seem to be used to parse a packet
> directly from the wire are not declared as packed. This surprised me
> because, although I did find that provisions are made so that the
> alignment of the structure, it is still technically possible for the
> compiler to inject padding bytes inside those structures, since AFAIK
> the C standard makes no guarantees about padding unless it's
> instructed to pack the structure.

The C standards do not make such guarantees, but the platform ABI
standards describe struct layout and ensure that there is no padding.
Linux relies on that not just for networking, but also for the userspace
ABI, support for separately compiled kernel modules, and in other
places.

Sometimes there are alignment concerns in the way these structs are
used, but I believe the kernel generally controls placement of the data
that is being worked on, so that does not matter, either.

Therefore, I do not believe this is an actual problem.

Thanks,
Florian


^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: Non-packed structures in IP headers
  2021-10-01 20:10 ` Non-packed structures in IP headers Florian Weimer
@ 2021-10-02 15:54   ` David Laight
  2021-10-04 10:41     ` Cufi, Carles
  2021-10-04 10:30   ` Cufi, Carles
  1 sibling, 1 reply; 6+ messages in thread
From: David Laight @ 2021-10-02 15:54 UTC (permalink / raw)
  To: 'Florian Weimer', Cufi, Carles
  Cc: netdev, jukka.rissanen, johan.hedberg, Lubos, Robert, Bursztyka,
	Tomasz, linux-toolchains

From: Florian Weimer
> Sent: 01 October 2021 21:10
> 
> * Carles Cufi:
> 
> > I was looking through the structures for IPv{4,6} packet headers and
> > noticed that several of those that seem to be used to parse a packet
> > directly from the wire are not declared as packed. This surprised me
> > because, although I did find that provisions are made so that the
> > alignment of the structure, it is still technically possible for the
> > compiler to inject padding bytes inside those structures, since AFAIK
> > the C standard makes no guarantees about padding unless it's
> > instructed to pack the structure.
> 
> The C standards do not make such guarantees, but the platform ABI
> standards describe struct layout and ensure that there is no padding.
> Linux relies on that not just for networking, but also for the userspace
> ABI, support for separately compiled kernel modules, and in other
> places.

In particular structures are used to map hardware register blocks.

> Sometimes there are alignment concerns in the way these structs are
> used, but I believe the kernel generally controls placement of the data
> that is being worked on, so that does not matter, either.
> 
> Therefore, I do not believe this is an actual problem.

And adding __packed forces the compiler to do byte accesses
(with shifts) on cpu that don't support misaligned memory accesses.

So it really is wrong to specify __packed unless the structure
can be unaligned in memory, or has a 'broken' definition
that has fields that aren't 'naturally aligned'.
In the latter case it is enough to mark the field that requires
the padding before it removed as (IIRC) __aligned(1).
The compiler will then remove the padding but still assume the
field is partially aligned - so my do two 32bit access instead
of 8 8bit ones).

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: Non-packed structures in IP headers
  2021-10-01 20:10 ` Non-packed structures in IP headers Florian Weimer
  2021-10-02 15:54   ` David Laight
@ 2021-10-04 10:30   ` Cufi, Carles
  2021-10-09  6:56     ` Florian Weimer
  1 sibling, 1 reply; 6+ messages in thread
From: Cufi, Carles @ 2021-10-04 10:30 UTC (permalink / raw)
  To: Florian Weimer
  Cc: netdev, jukka.rissanen, johan.hedberg, Lubos, Robert, Bursztyka,
	Tomasz, linux-toolchains

Hi Florian,

Thanks for your response.

> 
> * Carles Cufi:
> 
> > I was looking through the structures for IPv{4,6} packet headers and
> > noticed that several of those that seem to be used to parse a packet
> > directly from the wire are not declared as packed. This surprised me
> > because, although I did find that provisions are made so that the
> > alignment of the structure, it is still technically possible for the
> > compiler to inject padding bytes inside those structures, since AFAIK
> > the C standard makes no guarantees about padding unless it's
> > instructed to pack the structure.
> 
> The C standards do not make such guarantees, but the platform ABI
> standards describe struct layout and ensure that there is no padding.
> Linux relies on that not just for networking, but also for the userspace
> ABI, support for separately compiled kernel modules, and in other places.

That makes sense, but aren't ABI standards different for every architecture? For example, I checked the Arm AAPCS[1] and it states:

"The size of an aggregate shall be the smallest multiple of its alignment that is sufficient to hold all of its
members."

Which, unless I am reading this wrong, means that the compiler would indeed insert padding if the size of the IP headers structs was not a multiple of 4. In this particular case, the struct sizes for the IP headers are 20 and 40 bytes respectively, so there will be no padding inserted. But I only checked a single architecture's ABI (or Procedure Call Standard) documentation, is this true for all archs? 

> Sometimes there are alignment concerns in the way these structs are used,
> but I believe the kernel generally controls placement of the data that is
> being worked on, so that does not matter, either.

I did see those when browsing the code, thanks for confirming this. It is really padding that I am concerned about, and not alignment.

> Therefore, I do not believe this is an actual problem.

Would the static assert still make sense in order to check this for all architectures?

Thanks,

Carles

[1] https://github.com/ARM-software/abi-aa/blob/2bcab1e3b22d55170c563c3c7940134089176746/aapcs64/aapcs64.rst#aggregates


^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: Non-packed structures in IP headers
  2021-10-02 15:54   ` David Laight
@ 2021-10-04 10:41     ` Cufi, Carles
  2021-10-04 12:18       ` David Laight
  0 siblings, 1 reply; 6+ messages in thread
From: Cufi, Carles @ 2021-10-04 10:41 UTC (permalink / raw)
  To: David Laight, 'Florian Weimer'
  Cc: netdev, jukka.rissanen, johan.hedberg, Lubos, Robert, Bursztyka,
	Tomasz, linux-toolchains

> From: David Laight <David.Laight@ACULAB.COM>
> Sent: 02 October 2021 17:55
> From: Florian Weimer
> > Sent: 01 October 2021 21:10
> >
> > * Carles Cufi:
> >
> > > I was looking through the structures for IPv{4,6} packet headers and
> > > noticed that several of those that seem to be used to parse a packet
> > > directly from the wire are not declared as packed. This surprised me
> > > because, although I did find that provisions are made so that the
> > > alignment of the structure, it is still technically possible for the
> > > compiler to inject padding bytes inside those structures, since
> > > AFAIK the C standard makes no guarantees about padding unless it's
> > > instructed to pack the structure.
> >
> > The C standards do not make such guarantees, but the platform ABI
> > standards describe struct layout and ensure that there is no padding.
> > Linux relies on that not just for networking, but also for the
> > userspace ABI, support for separately compiled kernel modules, and in
> > other places.
> 
> In particular structures are used to map hardware register blocks.

Non-padded ones? Because this again might be an issue depending on the compiler/ABI as per my understanding.

> 
> > Sometimes there are alignment concerns in the way these structs are
> > used, but I believe the kernel generally controls placement of the
> > data that is being worked on, so that does not matter, either.
> >
> > Therefore, I do not believe this is an actual problem.
> 
> And adding __packed forces the compiler to do byte accesses (with shifts)
> on cpu that don't support misaligned memory accesses.

Right, I understand that using __packed involves a runtime penalty hit on memory accesses, but I wasn't proposing to pack those structs, just to check their size for padding at compile-time.

> So it really is wrong to specify __packed unless the structure can be
> unaligned in memory, or has a 'broken' definition that has fields that
> aren't 'naturally aligned'.

But who defines what "broken" or "natural alignment" is for each architecture? Furthermore, the C standard doesn't really mention any of these terms as far as I know, it just leaves complete freedom to the compiler to add the padding it might consider appropriate.

> In the latter case it is enough to mark the field that requires the
> padding before it removed as (IIRC) __aligned(1).
> The compiler will then remove the padding but still assume the field is
> partially aligned - so my do two 32bit access instead of 8 8bit ones).

Interesting, I have never seen this used before, but then again I come from an (deeply) embedded background that tends to pack all of its structs that represent bytes that go over the wire (or hardware registers for that matter).

What is also interesting in this case, is why the ethernet header is packed in Linux[2]. There don't seem to be any special alignment or size constraints when compared to the IP headers, since the single 16-bit integer is placed 12 bytes after the beginning of the struct. I assume the reason for packing this header is indeed alignment, and not padding. Unlike the IP headers, the kernel probably doesn't ensure that an ethernet header begins at an address compatible with the alignment requirements of that 16-bit integer, so the header needs packing not because there's a risk of padding being introduced by the compiler, but because the header might start at an odd memory address?

Thanks,

Carles

[2] https://elixir.bootlin.com/linux/latest/source/include/uapi/linux/if_ether.h#L169

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: Non-packed structures in IP headers
  2021-10-04 10:41     ` Cufi, Carles
@ 2021-10-04 12:18       ` David Laight
  0 siblings, 0 replies; 6+ messages in thread
From: David Laight @ 2021-10-04 12:18 UTC (permalink / raw)
  To: 'Cufi, Carles', 'Florian Weimer'
  Cc: netdev, jukka.rissanen, johan.hedberg, Lubos, Robert, Bursztyka,
	Tomasz, linux-toolchains

From: Cufi, Carles
> Sent: 04 October 2021 11:42
> 
> > From: David Laight <David.Laight@ACULAB.COM>
> > Sent: 02 October 2021 17:55
> > From: Florian Weimer
> > > Sent: 01 October 2021 21:10
> > >
> > > * Carles Cufi:
> > >
> > > > I was looking through the structures for IPv{4,6} packet headers and
> > > > noticed that several of those that seem to be used to parse a packet
> > > > directly from the wire are not declared as packed. This surprised me
> > > > because, although I did find that provisions are made so that the
> > > > alignment of the structure, it is still technically possible for the
> > > > compiler to inject padding bytes inside those structures, since
> > > > AFAIK the C standard makes no guarantees about padding unless it's
> > > > instructed to pack the structure.
> > >
> > > The C standards do not make such guarantees, but the platform ABI
> > > standards describe struct layout and ensure that there is no padding.
> > > Linux relies on that not just for networking, but also for the
> > > userspace ABI, support for separately compiled kernel modules, and in
> > > other places.
> >
> > In particular structures are used to map hardware register blocks.
> 
> Non-padded ones? Because this again might be an issue depending on the compiler/ABI as per my
> understanding.

The ABI are usually 'sane' - at the time of writing.
So you can assume that the compiler will only add padding if it is
necessary to get a field on its correct alignment.
The most common issue is whether the ABI specifies 32bit or 64bit
alignment for 64bit items.

Most hardware register layouts are all of 'words' (usually 32bit).
So there is unlikely to be any alignment padding.
(But there may be unused/undocumented registers.)
The advantage of using a C structure (rather than constants) to
define a hardware register layout is that is much harder to end
up using offsets for the wrong register block.

> > > Sometimes there are alignment concerns in the way these structs are
> > > used, but I believe the kernel generally controls placement of the
> > > data that is being worked on, so that does not matter, either.
> > >
> > > Therefore, I do not believe this is an actual problem.
> >
> > And adding __packed forces the compiler to do byte accesses (with shifts)
> > on cpu that don't support misaligned memory accesses.
> 
> Right, I understand that using __packed involves a runtime penalty hit on memory accesses, but I
> wasn't proposing to pack those structs, just to check their size for padding at compile-time.

That is sensible, but not really needed if the structure is short
and everything is 'naturally aligned'.

> > So it really is wrong to specify __packed unless the structure can be
> > unaligned in memory, or has a 'broken' definition that has fields that
> > aren't 'naturally aligned'.
> 
> But who defines what "broken" or "natural alignment" is for each architecture? Furthermore, the C
> standard doesn't really mention any of these terms as far as I know, it just leaves complete freedom
> to the compiler to add the padding it might consider appropriate.

'natural alignment' means that a 2**n byte item is always a
multiple of 2**n mytes from the front.

> > In the latter case it is enough to mark the field that requires the
> > padding before it removed as (IIRC) __aligned(1).
> > The compiler will then remove the padding but still assume the field is
> > partially aligned - so my do two 32bit access instead of 8 8bit ones).
> 
> Interesting, I have never seen this used before, but then again I come from an (deeply) embedded
> background that tends to pack all of its structs that represent bytes that go over the wire (or
> hardware registers for that matter).

Hmmm, I get you are using constant offsets not C struts for register layouts.
Either that or you've not actually looked at the object code.

> What is also interesting in this case, is why the ethernet header is packed in Linux[2]. There don't
> seem to be any special alignment or size constraints when compared to the IP headers, since the single
> 16-bit integer is placed 12 bytes after the beginning of the struct. I assume the reason for packing
> this header is indeed alignment, and not padding. Unlike the IP headers, the kernel probably doesn't
> ensure that an ethernet header begins at an address compatible with the alignment requirements of that
> 16-bit integer, so the header needs packing not because there's a risk of padding being introduced by
> the compiler, but because the header might start at an odd memory address?

Almost certainly because the base address might be odd.
(Although odd is actually unlikely.)

Ethernet frames are annoying.
They were 'designed' before anyone thought about 32bit cpus.
So they have two 6-byte addresses followed by a 2-byte 'ethertype'
(or a 2 byte length for ISO frames).
This is then followed by the IP header - which is all 32bit fields.
So if you want the IP header aligned (which is better - even on x86)
you need to rean the ethernet frame to a 4n+2 byte boundary.

It is surprising how many systems that have cpu that require 32bit
items be on 32bit boundaries (crossing page boundaries is a PITA)
have ethernet controllers that require receive buffers be 32bit
aligned.
Even writing two bytes on junk would help.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Non-packed structures in IP headers
  2021-10-04 10:30   ` Cufi, Carles
@ 2021-10-09  6:56     ` Florian Weimer
  0 siblings, 0 replies; 6+ messages in thread
From: Florian Weimer @ 2021-10-09  6:56 UTC (permalink / raw)
  To: Cufi, Carles
  Cc: netdev, jukka.rissanen, johan.hedberg, Lubos, Robert, Bursztyka,
	Tomasz, linux-toolchains

* Carles Cufi:

>> * Carles Cufi:
>> 
>> > I was looking through the structures for IPv{4,6} packet headers and
>> > noticed that several of those that seem to be used to parse a packet
>> > directly from the wire are not declared as packed. This surprised me
>> > because, although I did find that provisions are made so that the
>> > alignment of the structure, it is still technically possible for the
>> > compiler to inject padding bytes inside those structures, since AFAIK
>> > the C standard makes no guarantees about padding unless it's
>> > instructed to pack the structure.
>> 
>> The C standards do not make such guarantees, but the platform ABI
>> standards describe struct layout and ensure that there is no padding.
>> Linux relies on that not just for networking, but also for the userspace
>> ABI, support for separately compiled kernel modules, and in other places.
>
> That makes sense, but aren't ABI standards different for every
> architecture? For example, I checked the Arm AAPCS[1] and it states:
>
> "The size of an aggregate shall be the smallest multiple of its
> alignment that is sufficient to hold all of its members."
>
> Which, unless I am reading this wrong, means that the compiler would
> indeed insert padding if the size of the IP headers structs was not
> a multiple of 4. In this particular case, the struct sizes for the
> IP headers are 20 and 40 bytes respectively, so there will be no
> padding inserted. But I only checked a single architecture's ABI (or
> Procedure Call Standard) documentation, is this true for all archs?

For structure layout in memory, there is a large overlap between ABIs.
There is divergence around long long (which is easily avoided by
adding padding manually), and potentially bit fileds (but I haven't
looked at that).

Things only get weird for pass-by-value structs and unions and return
types.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-10-09  7:05 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <AS8PR05MB78952FE7E8D82245D309DEBCE7AA9@AS8PR05MB7895.eurprd05.prod.outlook.com>
2021-10-01 20:10 ` Non-packed structures in IP headers Florian Weimer
2021-10-02 15:54   ` David Laight
2021-10-04 10:41     ` Cufi, Carles
2021-10-04 12:18       ` David Laight
2021-10-04 10:30   ` Cufi, Carles
2021-10-09  6:56     ` Florian Weimer

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).