All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] delayacct: align to 8 byte boundary on 64-bit systems
@ 2010-02-12 16:48 Jeff Mahoney
  2010-02-12 18:19 ` Andrew Morton
  0 siblings, 1 reply; 7+ messages in thread
From: Jeff Mahoney @ 2010-02-12 16:48 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Andrew Morton

 prepare_reply sets up an skb for the response. If I understand it correctly,
 the payload contains:

 +--------------------------------+
 | genlmsghdr - 4 bytes           |
 +--------------------------------+
 | NLA header - 4 bytes           | /* Aggregate header */
 +-+------------------------------+
 | | NLA header - 4 bytes         | /* PID header */
 | +------------------------------+
 | | pid/tgid   - 4 bytes         |
 | +------------------------------+
 | | NLA header - 4 bytes         | /* stats header */
 | + -----------------------------+ <- oops. aligned on 4 byte boundary
 | | struct taskstats - 328 bytes |
 +-+------------------------------+

 The start of the taskstats struct must be 8 byte aligned on IA64 (and other
 systems with 8 byte alignment rules for 64-bit types) or runtime alignment
 warnings will be issued.

 This patch pads the pid/tgid field out to sizeof(long), which forces
 the alignment of taskstats. The getdelays userspace code is ok with this
 since it assumes 32-bit pid/tgid and then honors that header's length field.

 An array is used to avoid exposing kernel memory contents to userspace in the
 response.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
---
 kernel/taskstats.c |    8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

--- a/kernel/taskstats.c
+++ b/kernel/taskstats.c
@@ -362,6 +362,12 @@ static struct taskstats *mk_reply(struct
 	struct nlattr *na, *ret;
 	int aggr;
 
+	/* If we don't pad, we end up with alignment on a 4 byte boundary.
+	 * This causes lots of runtime warnings on systems requiring 8 byte
+	 * alignment */
+	u32 pids[2] = { pid, 0 };
+	int pid_size = ALIGN(sizeof(pid), sizeof(long));
+
 	aggr = (type == TASKSTATS_TYPE_PID)
 			? TASKSTATS_TYPE_AGGR_PID
 			: TASKSTATS_TYPE_AGGR_TGID;
@@ -369,7 +375,7 @@ static struct taskstats *mk_reply(struct
 	na = nla_nest_start(skb, aggr);
 	if (!na)
 		goto err;
-	if (nla_put(skb, type, sizeof(pid), &pid) < 0)
+	if (nla_put(skb, type, pid_size, pids) < 0)
 		goto err;
 	ret = nla_reserve(skb, TASKSTATS_TYPE_STATS, sizeof(struct taskstats));
 	if (!ret)
-- 
Jeff Mahoney
SUSE Labs

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] delayacct: align to 8 byte boundary on 64-bit systems
  2010-02-12 16:48 [PATCH] delayacct: align to 8 byte boundary on 64-bit systems Jeff Mahoney
@ 2010-02-12 18:19 ` Andrew Morton
  2010-02-12 19:20   ` Jeff Mahoney
  2010-02-13  2:14   ` Balbir Singh
  0 siblings, 2 replies; 7+ messages in thread
From: Andrew Morton @ 2010-02-12 18:19 UTC (permalink / raw)
  To: Jeff Mahoney; +Cc: Linux Kernel Mailing List, Balbir Singh

On Fri, 12 Feb 2010 11:48:27 -0500
Jeff Mahoney <jeffm@suse.com> wrote:

>  prepare_reply sets up an skb for the response. If I understand it correctly,
>  the payload contains:
> 
>  +--------------------------------+
>  | genlmsghdr - 4 bytes           |
>  +--------------------------------+
>  | NLA header - 4 bytes           | /* Aggregate header */
>  +-+------------------------------+
>  | | NLA header - 4 bytes         | /* PID header */
>  | +------------------------------+
>  | | pid/tgid   - 4 bytes         |

So we put another four zero bytes in here and add four to the "PID header".

>  | +------------------------------+
>  | | NLA header - 4 bytes         | /* stats header */
>  | + -----------------------------+ <- oops. aligned on 4 byte boundary
>  | | struct taskstats - 328 bytes |
>  +-+------------------------------+
> 
>  The start of the taskstats struct must be 8 byte aligned on IA64 (and other
>  systems with 8 byte alignment rules for 64-bit types) or runtime alignment
>  warnings will be issued.
> 
>  This patch pads the pid/tgid field out to sizeof(long), which forces
>  the alignment of taskstats. The getdelays userspace code is ok with this
>  since it assumes 32-bit pid/tgid and then honors that header's length field.
> 
>  An array is used to avoid exposing kernel memory contents to userspace in the
>  response.
> 
> Signed-off-by: Jeff Mahoney <jeffm@suse.com>
> ---
>  kernel/taskstats.c |    8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> --- a/kernel/taskstats.c
> +++ b/kernel/taskstats.c
> @@ -362,6 +362,12 @@ static struct taskstats *mk_reply(struct
>  	struct nlattr *na, *ret;
>  	int aggr;
>  
> +	/* If we don't pad, we end up with alignment on a 4 byte boundary.
> +	 * This causes lots of runtime warnings on systems requiring 8 byte
> +	 * alignment */
> +	u32 pids[2] = { pid, 0 };
> +	int pid_size = ALIGN(sizeof(pid), sizeof(long));
> +
>  	aggr = (type == TASKSTATS_TYPE_PID)
>  			? TASKSTATS_TYPE_AGGR_PID
>  			: TASKSTATS_TYPE_AGGR_TGID;
> @@ -369,7 +375,7 @@ static struct taskstats *mk_reply(struct
>  	na = nla_nest_start(skb, aggr);
>  	if (!na)
>  		goto err;
> -	if (nla_put(skb, type, sizeof(pid), &pid) < 0)
> +	if (nla_put(skb, type, pid_size, pids) < 0)
>  		goto err;
>  	ret = nla_reserve(skb, TASKSTATS_TYPE_STATS, sizeof(struct taskstats));
>  	if (!ret)

So any code which assumes that the pid/tgid field is four bytes long
will break.  Code which takes that length from the netlink message
header will work OK.

32-bit architectures are unaltered.

Seems safe enough.  We'd be safer still if we didn't do this on 64-bit
architectures which don't need it.  ie: x86_64.  But if we do that we
add a risk that people will develop shoddy code which works on x86_64
and doesn't work on ia64.

hmm.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] delayacct: align to 8 byte boundary on 64-bit systems
  2010-02-12 18:19 ` Andrew Morton
@ 2010-02-12 19:20   ` Jeff Mahoney
  2010-02-12 19:29     ` Andrew Morton
  2010-02-13  2:14   ` Balbir Singh
  1 sibling, 1 reply; 7+ messages in thread
From: Jeff Mahoney @ 2010-02-12 19:20 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Linux Kernel Mailing List, Balbir Singh

On 02/12/2010 01:19 PM, Andrew Morton wrote:
> On Fri, 12 Feb 2010 11:48:27 -0500
> Jeff Mahoney <jeffm@suse.com> wrote:
> 
>>  prepare_reply sets up an skb for the response. If I understand it correctly,
>>  the payload contains:
>>
>>  +--------------------------------+
>>  | genlmsghdr - 4 bytes           |
>>  +--------------------------------+
>>  | NLA header - 4 bytes           | /* Aggregate header */
>>  +-+------------------------------+
>>  | | NLA header - 4 bytes         | /* PID header */
>>  | +------------------------------+
>>  | | pid/tgid   - 4 bytes         |
> 
> So we put another four zero bytes in here and add four to the "PID header".
> 
>>  | +------------------------------+
>>  | | NLA header - 4 bytes         | /* stats header */
>>  | + -----------------------------+ <- oops. aligned on 4 byte boundary
>>  | | struct taskstats - 328 bytes |
>>  +-+------------------------------+
>>
>>  The start of the taskstats struct must be 8 byte aligned on IA64 (and other
>>  systems with 8 byte alignment rules for 64-bit types) or runtime alignment
>>  warnings will be issued.
>>
>>  This patch pads the pid/tgid field out to sizeof(long), which forces
>>  the alignment of taskstats. The getdelays userspace code is ok with this
>>  since it assumes 32-bit pid/tgid and then honors that header's length field.
>>
>>  An array is used to avoid exposing kernel memory contents to userspace in the
>>  response.
>>
>> Signed-off-by: Jeff Mahoney <jeffm@suse.com>
>> ---
>>  kernel/taskstats.c |    8 +++++++-
>>  1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> --- a/kernel/taskstats.c
>> +++ b/kernel/taskstats.c
>> @@ -362,6 +362,12 @@ static struct taskstats *mk_reply(struct
>>  	struct nlattr *na, *ret;
>>  	int aggr;
>>  
>> +	/* If we don't pad, we end up with alignment on a 4 byte boundary.
>> +	 * This causes lots of runtime warnings on systems requiring 8 byte
>> +	 * alignment */
>> +	u32 pids[2] = { pid, 0 };
>> +	int pid_size = ALIGN(sizeof(pid), sizeof(long));
>> +
>>  	aggr = (type == TASKSTATS_TYPE_PID)
>>  			? TASKSTATS_TYPE_AGGR_PID
>>  			: TASKSTATS_TYPE_AGGR_TGID;
>> @@ -369,7 +375,7 @@ static struct taskstats *mk_reply(struct
>>  	na = nla_nest_start(skb, aggr);
>>  	if (!na)
>>  		goto err;
>> -	if (nla_put(skb, type, sizeof(pid), &pid) < 0)
>> +	if (nla_put(skb, type, pid_size, pids) < 0)
>>  		goto err;
>>  	ret = nla_reserve(skb, TASKSTATS_TYPE_STATS, sizeof(struct taskstats));
>>  	if (!ret)
> 
> So any code which assumes that the pid/tgid field is four bytes long
> will break.  Code which takes that length from the netlink message
> header will work OK.
> 
> 32-bit architectures are unaltered.
> 
> Seems safe enough.  We'd be safer still if we didn't do this on 64-bit
> architectures which don't need it.  ie: x86_64.  But if we do that we
> add a risk that people will develop shoddy code which works on x86_64
> and doesn't work on ia64.

Is there a way to do that without needlessly complicating things? I
didn't see any existing infrastructure to do that.

Another option was to put an empty attribute in with a garbage type,
which would add a 4 byte header - but even the getdelays code included
with the kernel can't deal with that.

It's ugly all the way around.

-Jeff

-- 
Jeff Mahoney
SUSE Labs

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] delayacct: align to 8 byte boundary on 64-bit systems
  2010-02-12 19:20   ` Jeff Mahoney
@ 2010-02-12 19:29     ` Andrew Morton
  2010-02-12 19:34       ` Jeff Mahoney
  0 siblings, 1 reply; 7+ messages in thread
From: Andrew Morton @ 2010-02-12 19:29 UTC (permalink / raw)
  To: Jeff Mahoney; +Cc: Linux Kernel Mailing List, Balbir Singh

On Fri, 12 Feb 2010 14:20:23 -0500
Jeff Mahoney <jeffm@suse.com> wrote:

> > Seems safe enough.  We'd be safer still if we didn't do this on 64-bit
> > architectures which don't need it.  ie: x86_64.  But if we do that we
> > add a risk that people will develop shoddy code which works on x86_64
> > and doesn't work on ia64.
> 
> Is there a way to do that without needlessly complicating things? I
> didn't see any existing infrastructure to do that.

ifdefs?  I don't think it's worth doing, really.  Probably anyone who
wrote an application for this copied the getdelays.c code anyway.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] delayacct: align to 8 byte boundary on 64-bit systems
  2010-02-12 19:29     ` Andrew Morton
@ 2010-02-12 19:34       ` Jeff Mahoney
  0 siblings, 0 replies; 7+ messages in thread
From: Jeff Mahoney @ 2010-02-12 19:34 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Linux Kernel Mailing List, Balbir Singh

On 02/12/2010 02:29 PM, Andrew Morton wrote:
> On Fri, 12 Feb 2010 14:20:23 -0500
> Jeff Mahoney <jeffm@suse.com> wrote:
> 
>>> Seems safe enough.  We'd be safer still if we didn't do this on 64-bit
>>> architectures which don't need it.  ie: x86_64.  But if we do that we
>>> add a risk that people will develop shoddy code which works on x86_64
>>> and doesn't work on ia64.
>>
>> Is there a way to do that without needlessly complicating things? I
>> didn't see any existing infrastructure to do that.
> 
> ifdefs?  I don't think it's worth doing, really.  Probably anyone who
> wrote an application for this copied the getdelays.c code anyway.

Yeah I didn't want to get into #if defined(LIST OF ARCHES) for this. I
was hoping for a way to get the alignment rules for the arch. In the
absence of that, this is good enough. You're probably right about people
just lifting the getdelays.c code.

-Jeff

-- 
Jeff Mahoney
SUSE Labs

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] delayacct: align to 8 byte boundary on 64-bit systems
  2010-02-12 18:19 ` Andrew Morton
  2010-02-12 19:20   ` Jeff Mahoney
@ 2010-02-13  2:14   ` Balbir Singh
  2010-02-17 21:47     ` Jeff Mahoney
  1 sibling, 1 reply; 7+ messages in thread
From: Balbir Singh @ 2010-02-13  2:14 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Jeff Mahoney, Linux Kernel Mailing List

* Andrew Morton <akpm@linux-foundation.org> [2010-02-12 10:19:57]:

> On Fri, 12 Feb 2010 11:48:27 -0500
> Jeff Mahoney <jeffm@suse.com> wrote:
> 
> >  prepare_reply sets up an skb for the response. If I understand it correctly,
> >  the payload contains:
> > 
> >  +--------------------------------+
> >  | genlmsghdr - 4 bytes           |
> >  +--------------------------------+
> >  | NLA header - 4 bytes           | /* Aggregate header */
> >  +-+------------------------------+
> >  | | NLA header - 4 bytes         | /* PID header */
> >  | +------------------------------+
> >  | | pid/tgid   - 4 bytes         |
> 
> So we put another four zero bytes in here and add four to the "PID header".
>

Do you know if these breaks existing applications?
 
> >  | +------------------------------+
> >  | | NLA header - 4 bytes         | /* stats header */
> >  | + -----------------------------+ <- oops. aligned on 4 byte boundary
> >  | | struct taskstats - 328 bytes |
> >  +-+------------------------------+
> > 
> >  The start of the taskstats struct must be 8 byte aligned on IA64 (and other
> >  systems with 8 byte alignment rules for 64-bit types) or runtime alignment
> >  warnings will be issued.
> > 
> >  This patch pads the pid/tgid field out to sizeof(long), which forces
> >  the alignment of taskstats. The getdelays userspace code is ok with this
> >  since it assumes 32-bit pid/tgid and then honors that header's length field.
> > 

Could you define OK above? Does the application work without
recompiling? Have you checked for endianness issues?

> >  An array is used to avoid exposing kernel memory contents to userspace in the
> >  response.
> > 
> > Signed-off-by: Jeff Mahoney <jeffm@suse.com>
> > ---
> >  kernel/taskstats.c |    8 +++++++-
> >  1 file changed, 7 insertions(+), 1 deletion(-)
> > 
> > --- a/kernel/taskstats.c
> > +++ b/kernel/taskstats.c
> > @@ -362,6 +362,12 @@ static struct taskstats *mk_reply(struct
> >  	struct nlattr *na, *ret;
> >  	int aggr;
> >  
> > +	/* If we don't pad, we end up with alignment on a 4 byte boundary.
> > +	 * This causes lots of runtime warnings on systems requiring 8 byte
> > +	 * alignment */
> > +	u32 pids[2] = { pid, 0 };

Shouldn't this be endianness dependent?

> > +	int pid_size = ALIGN(sizeof(pid), sizeof(long));
> > +
> >  	aggr = (type == TASKSTATS_TYPE_PID)
> >  			? TASKSTATS_TYPE_AGGR_PID
> >  			: TASKSTATS_TYPE_AGGR_TGID;
> > @@ -369,7 +375,7 @@ static struct taskstats *mk_reply(struct
> >  	na = nla_nest_start(skb, aggr);
> >  	if (!na)
> >  		goto err;
> > -	if (nla_put(skb, type, sizeof(pid), &pid) < 0)
> > +	if (nla_put(skb, type, pid_size, pids) < 0)
> >  		goto err;
> >  	ret = nla_reserve(skb, TASKSTATS_TYPE_STATS, sizeof(struct taskstats));
> >  	if (!ret)
> 
> So any code which assumes that the pid/tgid field is four bytes long
> will break.  Code which takes that length from the netlink message
> header will work OK.
> 

I think we assume that PID/TGID is 32 bits long, we use the length to
extract the rest of the message. We cast NLA_DATA to int* in
getdelays.c.

> 32-bit architectures are unaltered.
> 
> Seems safe enough.  We'd be safer still if we didn't do this on 64-bit
> architectures which don't need it.  ie: x86_64.  But if we do that we
> add a risk that people will develop shoddy code which works on x86_64
> and doesn't work on ia64.
>

May be, this deserves an ABI and version bump in taskstats. I'll test
this patch soon. 

-- 
	Three Cheers,
	Balbir

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] delayacct: align to 8 byte boundary on 64-bit systems
  2010-02-13  2:14   ` Balbir Singh
@ 2010-02-17 21:47     ` Jeff Mahoney
  0 siblings, 0 replies; 7+ messages in thread
From: Jeff Mahoney @ 2010-02-17 21:47 UTC (permalink / raw)
  To: balbir; +Cc: Andrew Morton, Linux Kernel Mailing List

On 02/12/2010 09:14 PM, Balbir Singh wrote:
> * Andrew Morton <akpm@linux-foundation.org> [2010-02-12 10:19:57]:
> 
>> On Fri, 12 Feb 2010 11:48:27 -0500
>> Jeff Mahoney <jeffm@suse.com> wrote:
>>
>>>  prepare_reply sets up an skb for the response. If I understand it correctly,
>>>  the payload contains:
>>>
>>>  +--------------------------------+
>>>  | genlmsghdr - 4 bytes           |
>>>  +--------------------------------+
>>>  | NLA header - 4 bytes           | /* Aggregate header */
>>>  +-+------------------------------+
>>>  | | NLA header - 4 bytes         | /* PID header */
>>>  | +------------------------------+
>>>  | | pid/tgid   - 4 bytes         |
>>
>> So we put another four zero bytes in here and add four to the "PID header".
>>
> 
> Do you know if these breaks existing applications?

AFAIK it doesn't. The initial problem was reported by a partner and
there have been no reports of breakage since the fix was shipped in a
maintenance update for SLE11 in October. I think Andrew's guess about
users just copying the getdelays.c code is probably pretty accurate.

>>>  | +------------------------------+
>>>  | | NLA header - 4 bytes         | /* stats header */
>>>  | + -----------------------------+ <- oops. aligned on 4 byte boundary
>>>  | | struct taskstats - 328 bytes |
>>>  +-+------------------------------+
>>>
>>>  The start of the taskstats struct must be 8 byte aligned on IA64 (and other
>>>  systems with 8 byte alignment rules for 64-bit types) or runtime alignment
>>>  warnings will be issued.
>>>
>>>  This patch pads the pid/tgid field out to sizeof(long), which forces
>>>  the alignment of taskstats. The getdelays userspace code is ok with this
>>>  since it assumes 32-bit pid/tgid and then honors that header's length field.
>>>
> 
> Could you define OK above? Does the application work without
> recompiling? Have you checked for endianness issues?

Yes, it seemed to work as before. I just used getdelays.c to test it.

>>>  An array is used to avoid exposing kernel memory contents to userspace in the
>>>  response.
>>>
>>> Signed-off-by: Jeff Mahoney <jeffm@suse.com>
>>> ---
>>>  kernel/taskstats.c |    8 +++++++-
>>>  1 file changed, 7 insertions(+), 1 deletion(-)
>>>
>>> --- a/kernel/taskstats.c
>>> +++ b/kernel/taskstats.c
>>> @@ -362,6 +362,12 @@ static struct taskstats *mk_reply(struct
>>>  	struct nlattr *na, *ret;
>>>  	int aggr;
>>>  
>>> +	/* If we don't pad, we end up with alignment on a 4 byte boundary.
>>> +	 * This causes lots of runtime warnings on systems requiring 8 byte
>>> +	 * alignment */
>>> +	u32 pids[2] = { pid, 0 };
> 
> Shouldn't this be endianness dependent?

It could if the user casts to a 64-bit type instead of a 32-bit type,
but that assumption was covered in the comment about the pids and tgids
being 32-bit.

>>> +	int pid_size = ALIGN(sizeof(pid), sizeof(long));
>>> +
>>>  	aggr = (type == TASKSTATS_TYPE_PID)
>>>  			? TASKSTATS_TYPE_AGGR_PID
>>>  			: TASKSTATS_TYPE_AGGR_TGID;
>>> @@ -369,7 +375,7 @@ static struct taskstats *mk_reply(struct
>>>  	na = nla_nest_start(skb, aggr);
>>>  	if (!na)
>>>  		goto err;
>>> -	if (nla_put(skb, type, sizeof(pid), &pid) < 0)
>>> +	if (nla_put(skb, type, pid_size, pids) < 0)
>>>  		goto err;
>>>  	ret = nla_reserve(skb, TASKSTATS_TYPE_STATS, sizeof(struct taskstats));
>>>  	if (!ret)
>>
>> So any code which assumes that the pid/tgid field is four bytes long
>> will break.  Code which takes that length from the netlink message
>> header will work OK.
>>
> 
> I think we assume that PID/TGID is 32 bits long, we use the length to
> extract the rest of the message. We cast NLA_DATA to int* in
> getdelays.c.
> 
>> 32-bit architectures are unaltered.
>>
>> Seems safe enough.  We'd be safer still if we didn't do this on 64-bit
>> architectures which don't need it.  ie: x86_64.  But if we do that we
>> add a risk that people will develop shoddy code which works on x86_64
>> and doesn't work on ia64.
>>
> 
> May be, this deserves an ABI and version bump in taskstats. I'll test
> this patch soon. 

That could be too. Though if an ABI bump is in the works, then it's
probably better to not fix it this way and have an empty attribute type
instead.

-Jeff

-- 
Jeff Mahoney
SuSE Labs

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2010-02-17 21:47 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-02-12 16:48 [PATCH] delayacct: align to 8 byte boundary on 64-bit systems Jeff Mahoney
2010-02-12 18:19 ` Andrew Morton
2010-02-12 19:20   ` Jeff Mahoney
2010-02-12 19:29     ` Andrew Morton
2010-02-12 19:34       ` Jeff Mahoney
2010-02-13  2:14   ` Balbir Singh
2010-02-17 21:47     ` Jeff Mahoney

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.