Le samedi 21 janvier 2023, 03:17:39 UTC Alejandro Colomar a écrit : > Hi Zack, > > On 1/21/23 03:38, Alejandro Colomar wrote: > > Hi Zack, > > > > On 1/20/23 20:25, Alejandro Colomar wrote: > >> [CC += GCC] // pun not intended :P > >> > >> Hi Zack, > >> > >> On 1/20/23 19:04, Zack Weinberg wrote: > >>> On Fri, Jan 20, 2023, at 8:40 AM, Alejandro Colomar wrote: > >>>> The historical design of `sockaddr_storage` makes it impossible to use > >>>> without breaking strict aliasing rules. Not only this type is unusable, > >>>> but even the use of other `sockaddr_*` structures directly (when the > >>>> programmer only cares about a single address family) is also incorrect, > >>>> since at some point the structure will be accessed as a `sockaddr`, and > >>>> that breaks strict aliasing rules too. > >>>> > >>>> So, the only way for a programmer to not invoke Undefined Behavior is to > >>>> declare a union that includes `sockaddr` and any `sockaddr_*` structures > >>>> that are of interest, which allows later accessing as either the correct > >>>> structure or plain `sockaddr` for the sa_family. > >>> > >>> ... > >>> > >>>> struct new_sockaddr_storage nss; > >>>> > >>>> // ... (initialize oss and nss) > >>>> > >>>> inet_sockaddr2str(&nss.sa); // correct (and has no casts) > >>> > >>> I think we need to move slowly here and be _sure_ that any proposed change > >>> does in fact reduce the amount of UB. > >> > >> Sure, I'm just sending the patch to polish the idea around some concrete code. > >> While I checked that it compiles, I didn't add any tests about it or anything, > >> to see that it's usable (and Joseph's email showed me that it's far from being > >> finished). I expect that this'll take some time. > >> > >> > >>> This construct, in particular, might > >>> not actually be correct in practice: see https://godbolt.org/z/rn51cracn for > >>> a case where, if I'm reading it right, the compiler assumes that a write > >>> through a `struct fancy *` cannot alter the values accessible through a > >>> `struct simple *` even though both pointers point into the same union. > >>> (Test case provided by ; > >> > > > > [...] > > > > I was wrong in my guess; the correct output is 3/3; I think I had read it the > > other way around. So yes, I believe it's doing what you just wrote there, but > > can't understand why. > > > > I reduced @supercat's example to a smaller reproducer program (I couldn't > > minimize it any more than this; any further simplification removes the incorrect > > behavior): > > > > #include > > > > struct a { int y[1];}; > > struct b { int y[1];}; > > union u { struct a a; struct b b; }; > > > > > > int read_a(struct a *a) > > { > > return a->y[0]; > > } > > > > void write_b(struct b *b, int j) > > { > > b->y[j] = 2; > > } > > > > int use_union(union u *u, int j) > > { > > if (u->a.y[0] == 0) > > write_b(&u->b, j); > > //write_b((struct b *)u, j); // this has the same issue > > return read_a(&u->a); > > return read_a((struct a *)u); // this has the same issue > > } > > > > int (*volatile vtest)(union u *u, int j) = use_union; > > > > int main(void) > > { > > int r1, r2; > > union u u; > > struct b b = {0}; > > > > u.b = b; > > r1 = vtest(&u, 0); > > r2 = u.a.y[0]; > > > > printf("%d/%d\n", r1, r2); > > } > > > Ahh, indeed it seems to be UB. It's in the same 6.5.2.3/6: there's a > requirement that the information about the union is kept in the function in > which it's accessed. > > The standard presents an example, which is a bit ambiguous: > > The following is not a valid fragment (because the union type is not > visible within function f): > > struct t1 { int m; }; > struct t2 { int m; }; > int f(struct t1 *p1, struct t2 *p2) > { > if (p1->m < 0) > p2->m = -p2->m; > return p1->m; > } > int g() > { > union { > struct t1 s1; > struct t2 s2; > } u; > /* ... */ > return f(&u.s1, &u.s2); > } > > I don't know what's the intention if the union type was global but the variable > `u` was still not seen by f(). But it seems GCC's interpretation is UB, > according to the test we just saw. > > The solution that I can see for that is making sockaddr also be a union. That > way, the union is kept across all calls (since they all use sockaddr). > > struct sockaddr { > union { > struct { > sa_family_t sa_family; > char sa_data[14]; // why 14? > } > struct sockaddr_in sin; > struct sockaddr_in6 sin6; > struct sockaddr_un sun; > }; > }; No the solution is to avoid sockaddr and mark as deprecated. The problem it should be part of union without raising a warning each time we use a safe type... The other solution is to render public and ABI stable the type here https://github.com/bminor/glibc/blob/ae612c45efb5e34713859a5facf92368307efb6e/socket/sys/socket.h#L78 under for instance sockaddr_ptr and sockaddr_const_ptr Moreover this are some patch arch by arch https://sourceware.org/legacy-ml/libc-alpha/2016-02/msg00340.html that should be made default Bastien > > struct sockaddr_storage { > union { > sa_family_t ss_family; > struct sockaddr sa; > }; > }; > > > With this, sockaddr_storage becomes useless, but still usable. New code, could > just use sockaddr and use the internal union members as necessary. Union info > is kept across all function boundaries. > > Cheers, > > Alex > > -- > >