linux-man.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alejandro Colomar <alx.manpages@gmail.com>
To: "Bastien Roucariès" <rouca@debian.org>
Cc: "linux-man@vger.kernel.org" <linux-man@vger.kernel.org>,
	eblake <eblake@redhat.com>, Zack Weinberg <zack@owlfolio.org>,
	GNU C Library <libc-alpha@sourceware.org>
Subject: struct sockaddr_storage, union (was: Improve getsockname)
Date: Thu, 19 Jan 2023 21:19:31 +0100	[thread overview]
Message-ID: <0fdb2d61-db2d-d85c-35e4-ce32b2f36269@gmail.com> (raw)
In-Reply-To: <2860541.uBSZ6KuyZf@portable-bastien>


[-- Attachment #1.1: Type: text/plain, Size: 6757 bytes --]

[CC += Zack, glibc]

Hi Bastien, Eric, and Zack,

On 1/19/23 20:44, Bastien Roucariès wrote:
> Le jeudi 19 janvier 2023, 12:42:52 UTC Alejandro Colomar a écrit :
> Hi all,
> [adding Eric Blake from redhat and if I remember well member of POSIX comitee]

Thanks.

BTW, your post triggered some changes.  I didn't CC to not be too noisy, but 
since you seem to be interested in it, here are a few things:


-  Suggest deprecating struct sockaddr_storage to glibc:

 
<https://inbox.sourceware.org/libc-alpha/0f25d60f-f183-b518-b6c1-6d46aa63ee57@gmail.com/T/>

-  Which caused Zack to update some old stackoverflow answer :)

    <https://stackoverflow.com/a/42190913/6872717>

-  He made me realize getnameinfo(3) exists, and so I used in place of a similar 
function that I had implemented recently:

    <https://github.com/shadow-maint/shadow/pull/629>

-  And while reading the manual page for getnameinfo(3), I realized it was 
missing some missing _Nullable qualifiers:

 
<https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/commit/?id=39017f90644141a8bd297c783201e04b238769fb>


BTW, Zack, glibc could add [[nonnull(1)]] for the prototype.  Should I send a 
patch adding it to the header?

> 
> Eric to be short posix behavior of sockaddr sockaddr_storage is UB

Agree.  Almost all uses of struct sockaddr_storage are UB, and those that are 
not UB, are error prone.  stockaddr_storage is bad.

> and example supplied are UB. See here:
> https://pubs.opengroup.org/onlinepubs/009696699/basedefs/sys/socket.h.html

I don't see any usage examples there.  Only a definition.  The definition isn't 
bad in itself; it's just that it's unusable.

BTW, now we need to check all the man pages that refer to that structure and fix 
them to not promote UB.

> 
> I think we could save the day by defining struct sockadd_storage using anonymous union (thanks to C11 support),
>   but it will leave C++. But how can be done...
> 
>> Hi Bastien,
>> Hi Bastien,
>>
>> On 1/17/23 11:22, Bastien Roucariès wrote:
>>> Hi,
>>>
>>> I have a lot of student that does not use correctly getsockname in case of
>>> dual stack.
>>
>> Please show some examples of common mistakes, if you can.
> oh the basic one is to pass a sockaddr_in instead of sockaddr_in6... Or to pass a sockaddr_in6 and do not think that it could be returned an IPV4 socket or an Unix stream socket.

Understood.

>>
>>>
>>> May be this kind of discussion should be factorized in  sockaddr_storage (the
>>> historical note particularly).
>>>
>>> i suppose the same should be done for getpeername
>>>
>>> I think a safe programming example may be given that accept a socket as stdin
>>> and print information on it. Using socat it could be simple to test.
>>
>> Please provide some if you can.  However, be careful, since it might easily fall
>> into Undefined Behavior.
> 
> Ok see it attached.

Thanks.

>>
>>> maybe
>>> forcing ENOTSUPP if *addr > sizeof(sockadd_storage)
>>>
>>> Regards
>>>
>>> Bastien
>>
>> I'll add some comments to the patch.
>>
>>>  From 0afb3ad23f8ea09331f21a377e3ad19c44e4df18 Mon Sep 17 00:00:00 2001
>>> From: =?UTF-8?q?Bastien=20Roucari=C3=A8s?= <rouca@debian.org>
>>> Date: Tue, 17 Jan 2023 10:07:43 +0000
>>> Subject: [PATCH] Document use of struct sockaddr_storage in getsockname
>>>
>>> Document:
>>> - storage requierement
>>> - future compatibility
>>> ---
>>>   man2/getsockname.2 | 56 ++++++++++++++++++++++++++++++++++++++++++++++
>>>   1 file changed, 56 insertions(+)
>>>
>>> diff --git a/man2/getsockname.2 b/man2/getsockname.2
>>> index e6e8980c9..5914c9e12 100644
>>> --- a/man2/getsockname.2
>>> +++ b/man2/getsockname.2
>>> @@ -39,6 +39,17 @@ The returned address is truncated if the buffer provided is too small;
>>>   in this case,
>>>   .I addrlen
>>>   will return a value greater than was supplied to the call.
>>> +.PP
>>> +For greater portability
>>> +.I addr
>>> +should point to a structure of type
>>> +.I sockaddr_storage.
>>
>> This is not correct.  If the object has a type of 'struct sockaddr_storage',
>> then it can only be accessed as a 'struct sockaddr_storage'.  Trying to access
>> it as any other structure type would be Undefined Behavior.  The way to do it
>> conforming to ISO C would be to declare a union which contains all the 'struct
>> sockaddr_*' that are interesting, and access it through the correct union
>> member.  Anything else is on UB land.  The only use of 'struct sockaddr_storage'
>> that I can think of, is for having a more consistent union size.
> 
> Ok I see, Eric could we redefine at posix level the struct sockaddr_storage to be something like, it seems that it is allowed by
> /*
>   *  Following fields are implementation-defined.
>   */
> 
> union sockaddr_storage {
>    sa_family_t     ss_family;
>    struct sockaddr sock;
>    struct _sockaddr_storage storage;
>    struct sockaddr_in in;
>    struct sockaddr_in6 in6;
>    struct sockaddr_un un;
> };

Hmmm, interesting.  We can't do that, because this would be a breaking change to 
source code.  We can't change from `struct sockaddr_storage` to `union 
sockaddr_storage`.  However, this suggestion, combined with the idea of using an 
anonymous union, can be made to work as something that is compatible with the 
current sockaddr_storage:

struct sockaddr_storage {
	union {
		sa_family_t          ss_family;
		struct sockaddr      sa;
		struct sockaddr_in   sin;
		struct sockaddr_in6  sin6;
		struct sockaddr_un   sun;
	};
};

This is compatible:

-  It had at least the `ss_family` field.  It's still there, at the same binary 
location.
-  It has a size at least as large as any other sockaddr_* structure, and a 
suitable alignment.
-  Old code still works with it just fine.
-  New code will be able to avoid UB, and all casts, just by accessing the right 
structure element.
-  It's trivial to test at configure time if the implementation provides this 
new definition of the structure.

>>
>>> +.I sockaddr_storage
>>> +structure is large enough to hold any of the other
>>> +.I sockaddr_*
>>> +variants and always well aligned. On return, it should be cast to the correct
>>> +.I sockaddr_*
>>
>> The fact that it is correctly aligned, and a cast will work most of the time,
>> isn't enough for strict aliasing rules.  The compiler is free to assume things,
>> just by the fact that it's a different type.
> 
> Ok any idea for writing this kind of stuff

I'm thinking about writing something to several pages; will keep you all updated 
on important changes to the pages.


Cheers,

Alex

-- 
<http://www.alejandro-colomar.es/>

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2023-01-19 20:19 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-17 10:22 Improve getsockname Bastien Roucariès
2023-01-19 12:42 ` Alejandro Colomar
2023-01-19 19:44   ` Bastien Roucariès
2023-01-19 20:19     ` Alejandro Colomar [this message]
2023-01-19 21:00       ` struct sockaddr_storage, union (was: Improve getsockname) Bastien Roucariès
2023-01-19 21:19         ` Alejandro Colomar
2023-01-19 21:38           ` Bastien Roucariès
2023-01-19 23:31             ` Alejandro Colomar
2023-01-20  0:12               ` Alejandro Colomar
2023-01-20 21:11               ` Bastien Roucariès

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0fdb2d61-db2d-d85c-35e4-ce32b2f36269@gmail.com \
    --to=alx.manpages@gmail.com \
    --cc=eblake@redhat.com \
    --cc=libc-alpha@sourceware.org \
    --cc=linux-man@vger.kernel.org \
    --cc=rouca@debian.org \
    --cc=zack@owlfolio.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).