All of lore.kernel.org
 help / color / mirror / Atom feed
* __sk_buff.data_end
@ 2017-04-19 21:31 Johannes Berg
  2017-04-19 22:20 ` __sk_buff.data_end Johannes Berg
  2017-04-19 23:51 ` __sk_buff.data_end Daniel Borkmann
  0 siblings, 2 replies; 15+ messages in thread
From: Johannes Berg @ 2017-04-19 21:31 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann; +Cc: netdev

Hi Alexei, Daniel,

I'm looking at adding the __wifi_sk_buff I talked about, and I notice
that it uses CB space to store data_end. Unfortunately, in a lot of
cases, we don't have any CB space to spare in wifi.

Is there any way to generate a series of instructions that instead does
the necessary calculations? I don't actually *see* such a way, because
I don't see how I could have a scratch register or scratch stack space,
but perhaps there's a way to do it?

johannes

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: __sk_buff.data_end
  2017-04-19 21:31 __sk_buff.data_end Johannes Berg
@ 2017-04-19 22:20 ` Johannes Berg
  2017-04-20  0:01   ` __sk_buff.data_end Daniel Borkmann
  2017-04-19 23:51 ` __sk_buff.data_end Daniel Borkmann
  1 sibling, 1 reply; 15+ messages in thread
From: Johannes Berg @ 2017-04-19 22:20 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann; +Cc: netdev

On Wed, 2017-04-19 at 23:31 +0200, Johannes Berg wrote:
> Hi Alexei, Daniel,
> 
> I'm looking at adding the __wifi_sk_buff I talked about, and I notice
> that it uses CB space to store data_end. Unfortunately, in a lot of
> cases, we don't have any CB space to spare in wifi.

I guess I can work around this, would this seem reasonable?

 struct bpf_skb_data_end {
        struct qdisc_skb_cb qdisc_cb;
-       void *data_end;
+       /*
+        * The alignment here is for mac80211, since that doesn't use
+        * a pointer but a u64 value and needs to save/restore that
+        * across running its BPF programs.
+        */
+       void *data_end __aligned(sizeof(u64));
 };

johannes

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: __sk_buff.data_end
  2017-04-19 21:31 __sk_buff.data_end Johannes Berg
  2017-04-19 22:20 ` __sk_buff.data_end Johannes Berg
@ 2017-04-19 23:51 ` Daniel Borkmann
  1 sibling, 0 replies; 15+ messages in thread
From: Daniel Borkmann @ 2017-04-19 23:51 UTC (permalink / raw)
  To: Johannes Berg, Alexei Starovoitov; +Cc: netdev

On 04/19/2017 11:31 PM, Johannes Berg wrote:
> Hi Alexei, Daniel,
>
> I'm looking at adding the __wifi_sk_buff I talked about, and I notice
> that it uses CB space to store data_end. Unfortunately, in a lot of
> cases, we don't have any CB space to spare in wifi.
>
> Is there any way to generate a series of instructions that instead does
> the necessary calculations? I don't actually *see* such a way, because
> I don't see how I could have a scratch register or scratch stack space,
> but perhaps there's a way to do it?

One option would be, similarly as in bpf_prog_run_save_cb(), to just
save / restore _only_ the data_end pointer portion for this. Calculating
this inline via bpf_convert_ctx_access() would indeed need one more
reg than just the dst reg available in order to perform the data_end
calculation only as BPF insns.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: __sk_buff.data_end
  2017-04-19 22:20 ` __sk_buff.data_end Johannes Berg
@ 2017-04-20  0:01   ` Daniel Borkmann
  2017-04-20  0:12     ` __sk_buff.data_end Alexei Starovoitov
  2017-04-20  6:01     ` __sk_buff.data_end Johannes Berg
  0 siblings, 2 replies; 15+ messages in thread
From: Daniel Borkmann @ 2017-04-20  0:01 UTC (permalink / raw)
  To: Johannes Berg, Alexei Starovoitov; +Cc: netdev

On 04/20/2017 12:20 AM, Johannes Berg wrote:
> On Wed, 2017-04-19 at 23:31 +0200, Johannes Berg wrote:
>> Hi Alexei, Daniel,
>>
>> I'm looking at adding the __wifi_sk_buff I talked about, and I notice
>> that it uses CB space to store data_end. Unfortunately, in a lot of
>> cases, we don't have any CB space to spare in wifi.
>
> I guess I can work around this, would this seem reasonable?
>
>   struct bpf_skb_data_end {
>          struct qdisc_skb_cb qdisc_cb;
> -       void *data_end;
> +       /*
> +        * The alignment here is for mac80211, since that doesn't use
> +        * a pointer but a u64 value and needs to save/restore that
> +        * across running its BPF programs.
> +        */
> +       void *data_end __aligned(sizeof(u64));
>   };

Yeah, should work as well for the 32 bit archs, on 64 bit we
have this effectively already:

struct bpf_skb_data_end {
         struct qdisc_skb_cb        qdisc_cb;             /*     0    28 */

         /* XXX 4 bytes hole, try to pack */

         void *                     data_end;             /*    32     8 */

         /* size: 40, cachelines: 1, members: 2 */
         /* sum members: 36, holes: 1, sum holes: 4 */
         /* last cacheline: 40 bytes */
};

Can you elaborate on why this works for mac80211? It uses cb
only up to that point from where you invoke the prog?

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: __sk_buff.data_end
  2017-04-20  0:01   ` __sk_buff.data_end Daniel Borkmann
@ 2017-04-20  0:12     ` Alexei Starovoitov
  2017-04-20  0:38       ` __sk_buff.data_end Daniel Borkmann
  2017-04-20  6:06       ` __sk_buff.data_end Johannes Berg
  2017-04-20  6:01     ` __sk_buff.data_end Johannes Berg
  1 sibling, 2 replies; 15+ messages in thread
From: Alexei Starovoitov @ 2017-04-20  0:12 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: Johannes Berg, Alexei Starovoitov, netdev

On Thu, Apr 20, 2017 at 02:01:49AM +0200, Daniel Borkmann wrote:
> On 04/20/2017 12:20 AM, Johannes Berg wrote:
> >On Wed, 2017-04-19 at 23:31 +0200, Johannes Berg wrote:
> >>Hi Alexei, Daniel,
> >>
> >>I'm looking at adding the __wifi_sk_buff I talked about, and I notice
> >>that it uses CB space to store data_end. Unfortunately, in a lot of
> >>cases, we don't have any CB space to spare in wifi.
> >
> >I guess I can work around this, would this seem reasonable?
> >
> >  struct bpf_skb_data_end {
> >         struct qdisc_skb_cb qdisc_cb;
> >-       void *data_end;
> >+       /*
> >+        * The alignment here is for mac80211, since that doesn't use
> >+        * a pointer but a u64 value and needs to save/restore that
> >+        * across running its BPF programs.
> >+        */
> >+       void *data_end __aligned(sizeof(u64));
> >  };
> 
> Yeah, should work as well for the 32 bit archs, on 64 bit we
> have this effectively already:
> 
> struct bpf_skb_data_end {
>         struct qdisc_skb_cb        qdisc_cb;             /*     0    28 */
> 
>         /* XXX 4 bytes hole, try to pack */
> 
>         void *                     data_end;             /*    32     8 */
> 
>         /* size: 40, cachelines: 1, members: 2 */
>         /* sum members: 36, holes: 1, sum holes: 4 */
>         /* last cacheline: 40 bytes */
> };
> 
> Can you elaborate on why this works for mac80211? It uses cb
> only up to that point from where you invoke the prog?

+1

also didn't we discuss that wifi has crazy non-linear skb?
this data/data_end is used by cls_bpf with headlen only
for direct packet access where performance matters.
Since wifi skbs have only eth in headlen, there is not much
pointing adding support for data/data_end to wifi.
Just use ld_abs/ld_ind instructions and load_bytes() helper.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: __sk_buff.data_end
  2017-04-20  0:12     ` __sk_buff.data_end Alexei Starovoitov
@ 2017-04-20  0:38       ` Daniel Borkmann
  2017-04-20  6:07         ` __sk_buff.data_end Johannes Berg
  2017-04-20  6:06       ` __sk_buff.data_end Johannes Berg
  1 sibling, 1 reply; 15+ messages in thread
From: Daniel Borkmann @ 2017-04-20  0:38 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: Johannes Berg, Alexei Starovoitov, netdev

On 04/20/2017 02:12 AM, Alexei Starovoitov wrote:
> On Thu, Apr 20, 2017 at 02:01:49AM +0200, Daniel Borkmann wrote:
>> On 04/20/2017 12:20 AM, Johannes Berg wrote:
>>> On Wed, 2017-04-19 at 23:31 +0200, Johannes Berg wrote:
>>>> Hi Alexei, Daniel,
>>>>
>>>> I'm looking at adding the __wifi_sk_buff I talked about, and I notice
>>>> that it uses CB space to store data_end. Unfortunately, in a lot of
>>>> cases, we don't have any CB space to spare in wifi.
>>>
>>> I guess I can work around this, would this seem reasonable?
>>>
>>>   struct bpf_skb_data_end {
>>>          struct qdisc_skb_cb qdisc_cb;
>>> -       void *data_end;
>>> +       /*
>>> +        * The alignment here is for mac80211, since that doesn't use
>>> +        * a pointer but a u64 value and needs to save/restore that
>>> +        * across running its BPF programs.
>>> +        */
>>> +       void *data_end __aligned(sizeof(u64));
>>>   };
>>
>> Yeah, should work as well for the 32 bit archs, on 64 bit we
>> have this effectively already:
>>
>> struct bpf_skb_data_end {
>>          struct qdisc_skb_cb        qdisc_cb;             /*     0    28 */
>>
>>          /* XXX 4 bytes hole, try to pack */
>>
>>          void *                     data_end;             /*    32     8 */
>>
>>          /* size: 40, cachelines: 1, members: 2 */
>>          /* sum members: 36, holes: 1, sum holes: 4 */
>>          /* last cacheline: 40 bytes */
>> };
>>
>> Can you elaborate on why this works for mac80211? It uses cb
>> only up to that point from where you invoke the prog?
>
> +1
>
> also didn't we discuss that wifi has crazy non-linear skb?
> this data/data_end is used by cls_bpf with headlen only
> for direct packet access where performance matters.

bpf_skb_pull_data() helper can be used as an option to pull
in more, though, f.e. up to bpf_skb_pull_data(skb, skb->len)
in the worst case, which then results in a fully linearized
skb where data/data_end has complete access. That much may
not be needed, though, but f.e. cls_bpf can certainly expand
the available headlen for direct packet access.

> Since wifi skbs have only eth in headlen, there is not much
> pointing adding support for data/data_end to wifi.
> Just use ld_abs/ld_ind instructions and load_bytes() helper.

Afaik, the ld_abs/ld_ind are not an option due to the data
on the wire being in little endian, but the bpf_skb_load_bytes()
might be the way to go initially, agree.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: __sk_buff.data_end
  2017-04-20  0:01   ` __sk_buff.data_end Daniel Borkmann
  2017-04-20  0:12     ` __sk_buff.data_end Alexei Starovoitov
@ 2017-04-20  6:01     ` Johannes Berg
  2017-04-20 14:10       ` __sk_buff.data_end Daniel Borkmann
  1 sibling, 1 reply; 15+ messages in thread
From: Johannes Berg @ 2017-04-20  6:01 UTC (permalink / raw)
  To: Daniel Borkmann, Alexei Starovoitov; +Cc: netdev

On Thu, 2017-04-20 at 02:01 +0200, Daniel Borkmann wrote:
> 
> Yeah, should work as well for the 32 bit archs, on 64 bit we
> have this effectively already:

Right.

[...]

> Can you elaborate on why this works for mac80211? It uses cb
> only up to that point from where you invoke the prog?

No, it works because then I can move a u64 field to the same offset,
and save/restore it across the BPF call :)

But I don't have a *pointer* field to move there, and no space for the
alignment anyway (already using all 48 bytes).

Come to think of it - somebody had proposed extensions to this by
passing an on-stack pointer in addition to the data in the cb.

Perhaps we can extend BPF to have an optional second argument, and
track a second context around the verifier, if applicable? Then we can
solve all of this really easily, because it means we don't always have
to go from the SKB context but could go from the other one (which could
be that on-stack buffer).

Alternatively I can clear another pointer (u64) in the CB, store a
pointer there, and always emit code following that pointer - should be
possible right?

johannes

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: __sk_buff.data_end
  2017-04-20  0:12     ` __sk_buff.data_end Alexei Starovoitov
  2017-04-20  0:38       ` __sk_buff.data_end Daniel Borkmann
@ 2017-04-20  6:06       ` Johannes Berg
  1 sibling, 0 replies; 15+ messages in thread
From: Johannes Berg @ 2017-04-20  6:06 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann; +Cc: Alexei Starovoitov, netdev

On Wed, 2017-04-19 at 17:12 -0700, Alexei Starovoitov wrote:
> 
> also didn't we discuss that wifi has crazy non-linear skb?

Not always, depends on the driver.

But also, you have to remember that this filter would be before the
conversion to Ethernet header. Right now, when packets enter the stack,
we always linearize the 802.11 header - or even the whole frame for the
(rare) management packets. We don't do this before monitor mode and
before this filter right now, but there's no super good reason we can't
(it was just so we still report there in case of allocation failures.)

> this data/data_end is used by cls_bpf with headlen only
this data/data_end is used by cls_bpf with headlen only
> for direct packet access where performance matters.

Well performance does matter, though perhaps not as much - the people
I'd think could be the first to use this were adding code way after
doing many function calls, so ... :)

> Since wifi skbs have only eth in headlen, there is not much
> pointing adding support for data/data_end to wifi.
> Just use ld_abs/ld_ind instructions and load_bytes() helper.

You guys just told me not to support the skb access instructions ;-)

But yeah I have the load_bytes helper working. Just thought it might be
nice to get more. But we can also leave that for later.

johannes

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: __sk_buff.data_end
  2017-04-20  0:38       ` __sk_buff.data_end Daniel Borkmann
@ 2017-04-20  6:07         ` Johannes Berg
  0 siblings, 0 replies; 15+ messages in thread
From: Johannes Berg @ 2017-04-20  6:07 UTC (permalink / raw)
  To: Daniel Borkmann, Alexei Starovoitov; +Cc: Alexei Starovoitov, netdev

On Thu, 2017-04-20 at 02:38 +0200, Daniel Borkmann wrote:

> > Since wifi skbs have only eth in headlen, there is not much
> > pointing adding support for data/data_end to wifi.
> > Just use ld_abs/ld_ind instructions and load_bytes() helper.
> 
> Afaik, the ld_abs/ld_ind are not an option due to the data
> on the wire being in little endian, but the bpf_skb_load_bytes()
> might be the way to go initially, agree.

Oh yeah, we could really only use byte loads which would be even less
efficient, so that's a good argument for not even providing the data
load instructions... :)

johannes

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: __sk_buff.data_end
  2017-04-20  6:01     ` __sk_buff.data_end Johannes Berg
@ 2017-04-20 14:10       ` Daniel Borkmann
  2017-04-20 14:17         ` __sk_buff.data_end Johannes Berg
  0 siblings, 1 reply; 15+ messages in thread
From: Daniel Borkmann @ 2017-04-20 14:10 UTC (permalink / raw)
  To: Johannes Berg, Alexei Starovoitov; +Cc: netdev

On 04/20/2017 08:01 AM, Johannes Berg wrote:
> On Thu, 2017-04-20 at 02:01 +0200, Daniel Borkmann wrote:
>>
>> Yeah, should work as well for the 32 bit archs, on 64 bit we
>> have this effectively already:
>
> Right.
>
> [...]
>
>> Can you elaborate on why this works for mac80211? It uses cb
>> only up to that point from where you invoke the prog?
>
> No, it works because then I can move a u64 field to the same offset,
> and save/restore it across the BPF call :)

Right.

> But I don't have a *pointer* field to move there, and no space for the
> alignment anyway (already using all 48 bytes).
>
> Come to think of it - somebody had proposed extensions to this by
> passing an on-stack pointer in addition to the data in the cb.
>
> Perhaps we can extend BPF to have an optional second argument, and
> track a second context around the verifier, if applicable? Then we can
> solve all of this really easily, because it means we don't always have
> to go from the SKB context but could go from the other one (which could
> be that on-stack buffer).

I think this would be a rather more complex operation on the BPF side,
it would need changes from LLVM (which assumes initial ctx sits in r1),
verifier for tracking this ctx2, all the way down to JITs plus some way
to handle 1 and 2 argument program calls generically. Much easier to
pass additional meta data for the program via cb[], for example.

> Alternatively I can clear another pointer (u64) in the CB, store a
> pointer there, and always emit code following that pointer - should be
> possible right?

What kind of pointer? If it's something like data_end as read-only, then
this needs to be tracked in the verifier in addition, of course. Other
option you could do (depending on what you want to achieve) is to have
a bpf_probe_read() version as a helper for your prog type that would
further walk that pointer/struct (similar to tracing) where this comes
w/o any backward compat guarantees, though.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: __sk_buff.data_end
  2017-04-20 14:10       ` __sk_buff.data_end Daniel Borkmann
@ 2017-04-20 14:17         ` Johannes Berg
  2017-04-20 14:28           ` __sk_buff.data_end Daniel Borkmann
  0 siblings, 1 reply; 15+ messages in thread
From: Johannes Berg @ 2017-04-20 14:17 UTC (permalink / raw)
  To: Daniel Borkmann, Alexei Starovoitov; +Cc: netdev

On Thu, 2017-04-20 at 16:10 +0200, Daniel Borkmann wrote:
> 
> I think this would be a rather more complex operation on the BPF
> side, it would need changes from LLVM (which assumes initial ctx sits
> in r1), verifier for tracking this ctx2, all the way down to JITs
> plus some way to handle 1 and 2 argument program calls generically.
> Much easier to pass additional meta data for the program via cb[],
> for example.

Yeah, it did seem very complex :)

> > Alternatively I can clear another pointer (u64) in the CB, store a
> > pointer there, and always emit code following that pointer - should
> > be possible right?
> 
> What kind of pointer? If it's something like data_end as read-only,
> then this needs to be tracked in the verifier in addition, of course.
> Other option you could do (depending on what you want to achieve) is
> to have a bpf_probe_read() version as a helper for your prog type
> that would further walk that pointer/struct (similar to tracing)
> where this comes w/o any backward compat guarantees, though.

I meant something like this

struct wifi_cb {
	struct wifi_data *wifi_data;
	...
	void *data_end; // with BUILD_BUG_ON to the right offset
};

Then struct wifi_data can contain extra data that doesn't fit into
wifi_cb, like the stuff I evicted for *data_end and *wifi_data. Let's
say one of those fields is "u64 boottime_ns;" (as I did in my patch
now), so we have

struct wifi_data {
	u64 boottime_ns;
};

then I can still have

struct __wifi_sk_buff {
	u32 len;
	u32 data;
	u32 data_end;
	u32 boottime_ns; // this is strange but
			 // seems to be done this way?
};

And then when boottime_ns is accessed, I can have:

        case offsetof(struct __wifi_sk_buff, boottime_ns):
                off  = si->off;
                off -= offsetof(struct __wifi_sk_buff, boottime_ns);
                off += offsetof(struct sk_buff, cb);
                off += offsetof(struct wifi_cb, wifi_data);
                *insn++ = BPF_LDX_MEM(BPF_SIZEOF(void *), si->dst_reg,
                                      si->src_reg, off);
		off = offsetof(struct wifi_data, boottime_ns);
		*isns++ = BPF_LDX_MEM(BPF_SIZEOF(u64), si->dst_reg,
				      si->src_reg, off);
                break;

no?

It seems to me this should work, and essentially emit code to follow
the pointer to inside struct wifi_data. Assuming 

johannes

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: __sk_buff.data_end
  2017-04-20 14:17         ` __sk_buff.data_end Johannes Berg
@ 2017-04-20 14:28           ` Daniel Borkmann
  2017-04-20 14:32             ` __sk_buff.data_end Johannes Berg
  0 siblings, 1 reply; 15+ messages in thread
From: Daniel Borkmann @ 2017-04-20 14:28 UTC (permalink / raw)
  To: Johannes Berg, Alexei Starovoitov; +Cc: netdev

On 04/20/2017 04:17 PM, Johannes Berg wrote:
> On Thu, 2017-04-20 at 16:10 +0200, Daniel Borkmann wrote:
>>
>> I think this would be a rather more complex operation on the BPF
>> side, it would need changes from LLVM (which assumes initial ctx sits
>> in r1), verifier for tracking this ctx2, all the way down to JITs
>> plus some way to handle 1 and 2 argument program calls generically.
>> Much easier to pass additional meta data for the program via cb[],
>> for example.
>
> Yeah, it did seem very complex :)
>
>>> Alternatively I can clear another pointer (u64) in the CB, store a
>>> pointer there, and always emit code following that pointer - should
>>> be possible right?
>>
>> What kind of pointer? If it's something like data_end as read-only,
>> then this needs to be tracked in the verifier in addition, of course.
>> Other option you could do (depending on what you want to achieve) is
>> to have a bpf_probe_read() version as a helper for your prog type
>> that would further walk that pointer/struct (similar to tracing)
>> where this comes w/o any backward compat guarantees, though.
>
> I meant something like this
>
> struct wifi_cb {
> 	struct wifi_data *wifi_data;
> 	...
> 	void *data_end; // with BUILD_BUG_ON to the right offset
> };
>
> Then struct wifi_data can contain extra data that doesn't fit into
> wifi_cb, like the stuff I evicted for *data_end and *wifi_data. Let's
> say one of those fields is "u64 boottime_ns;" (as I did in my patch
> now), so we have
>
> struct wifi_data {
> 	u64 boottime_ns;
> };
>
> then I can still have
>
> struct __wifi_sk_buff {
> 	u32 len;
> 	u32 data;
> 	u32 data_end;
> 	u32 boottime_ns; // this is strange but
> 			 // seems to be done this way?
> };
>
> And then when boottime_ns is accessed, I can have:
>
>          case offsetof(struct __wifi_sk_buff, boottime_ns):
>                  off  = si->off;
>                  off -= offsetof(struct __wifi_sk_buff, boottime_ns);
>                  off += offsetof(struct sk_buff, cb);
>                  off += offsetof(struct wifi_cb, wifi_data);
>                  *insn++ = BPF_LDX_MEM(BPF_SIZEOF(void *), si->dst_reg,
>                                        si->src_reg, off);
> 		off = offsetof(struct wifi_data, boottime_ns);
> 		*isns++ = BPF_LDX_MEM(BPF_SIZEOF(u64), si->dst_reg,
> 				      si->src_reg, off);
>                  break;
>
> no?
>
> It seems to me this should work, and essentially emit code to follow
> the pointer to inside struct wifi_data. Assuming

I see what you mean now. Yes, that's fine. We already do something
similar essentially with skb->ifindex access already (skb->dev +
dev->ifindex), f.e.:

[...]
	case offsetof(struct __sk_buff, ifindex):
		BUILD_BUG_ON(FIELD_SIZEOF(struct net_device, ifindex) != 4);

		*insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct sk_buff, dev),
				      si->dst_reg, si->src_reg,
				      offsetof(struct sk_buff, dev));
		*insn++ = BPF_JMP_IMM(BPF_JEQ, si->dst_reg, 0, 1);
		*insn++ = BPF_LDX_MEM(BPF_W, si->dst_reg, si->dst_reg,
				      offsetof(struct net_device, ifindex));
		break;
[...]

Which is not too different from the above. You'd probably need to
populate the struct wifi_data each time if you place it onto the
stack, but perhaps could be optimized by storing that somewhere
else (e.g. somewhere via netdev, etc) and walking the pointer from
there, which would also spare you the cb[] save/restore.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: __sk_buff.data_end
  2017-04-20 14:28           ` __sk_buff.data_end Daniel Borkmann
@ 2017-04-20 14:32             ` Johannes Berg
  2017-04-20 14:46               ` __sk_buff.data_end Daniel Borkmann
  0 siblings, 1 reply; 15+ messages in thread
From: Johannes Berg @ 2017-04-20 14:32 UTC (permalink / raw)
  To: Daniel Borkmann, Alexei Starovoitov; +Cc: netdev

On Thu, 2017-04-20 at 16:28 +0200, Daniel Borkmann wrote:
> 
> I see what you mean now. Yes, that's fine. We already do something
> similar essentially with skb->ifindex access already (skb->dev +
> dev->ifindex), f.e.:
> 
> [...]
> 	case offsetof(struct __sk_buff, ifindex):
> 		BUILD_BUG_ON(FIELD_SIZEOF(struct net_device, ifindex)
> != 4);
> 
> 		*insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct sk_buff,
> dev),
> 				      si->dst_reg, si->src_reg,
> 				      offsetof(struct sk_buff, dev));
> 		*insn++ = BPF_JMP_IMM(BPF_JEQ, si->dst_reg, 0, 1);
> 		*insn++ = BPF_LDX_MEM(BPF_W, si->dst_reg, si->dst_reg,
> 				      offsetof(struct net_device,
> ifindex));
> 		break;
> [...]

Oh, right, good point.

> Which is not too different from the above. You'd probably need to
> populate the struct wifi_data each time if you place it onto the
> stack, but perhaps could be optimized by storing that somewhere
> else (e.g. somewhere via netdev, etc) and walking the pointer from
> there, which would also spare you the cb[] save/restore.

Hmm. I don't see what "somewhere else" I could possibly have though,
given that I want the (kernel-side) context to be "struct sk_buff *"
to allow the skb helpers?

johannes

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: __sk_buff.data_end
  2017-04-20 14:32             ` __sk_buff.data_end Johannes Berg
@ 2017-04-20 14:46               ` Daniel Borkmann
  2017-04-20 14:48                 ` __sk_buff.data_end Johannes Berg
  0 siblings, 1 reply; 15+ messages in thread
From: Daniel Borkmann @ 2017-04-20 14:46 UTC (permalink / raw)
  To: Johannes Berg, Alexei Starovoitov; +Cc: netdev

On 04/20/2017 04:32 PM, Johannes Berg wrote:
> On Thu, 2017-04-20 at 16:28 +0200, Daniel Borkmann wrote:
>>
>> I see what you mean now. Yes, that's fine. We already do something
>> similar essentially with skb->ifindex access already (skb->dev +
>> dev->ifindex), f.e.:
>>
>> [...]
>> 	case offsetof(struct __sk_buff, ifindex):
>> 		BUILD_BUG_ON(FIELD_SIZEOF(struct net_device, ifindex)
>> != 4);
>>
>> 		*insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct sk_buff,
>> dev),
>> 				      si->dst_reg, si->src_reg,
>> 				      offsetof(struct sk_buff, dev));
>> 		*insn++ = BPF_JMP_IMM(BPF_JEQ, si->dst_reg, 0, 1);
>> 		*insn++ = BPF_LDX_MEM(BPF_W, si->dst_reg, si->dst_reg,
>> 				      offsetof(struct net_device,
>> ifindex));
>> 		break;
>> [...]
>
> Oh, right, good point.
>
>> Which is not too different from the above. You'd probably need to
>> populate the struct wifi_data each time if you place it onto the
>> stack, but perhaps could be optimized by storing that somewhere
>> else (e.g. somewhere via netdev, etc) and walking the pointer from
>> there, which would also spare you the cb[] save/restore.
>
> Hmm. I don't see what "somewhere else" I could possibly have though,
> given that I want the (kernel-side) context to be "struct sk_buff *"
> to allow the skb helpers?

I have not enough context on the wireless side, perhaps could be
somewhere under skb->dev->ieee80211_ptr or so, iff suitable. But
it also really doesn't matter much since this is all transparently
handled in the kernel, meaning these kind of rewrites can still be
changed at a later point in time, f.e. if it's only 'u64 boottime_ns'
right now, that could live directly in the cb[] w/o extra pointer,
and should that grow to more members, then it could be moved behind
a pointer later on and it still works as-is from the program point
of view.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: __sk_buff.data_end
  2017-04-20 14:46               ` __sk_buff.data_end Daniel Borkmann
@ 2017-04-20 14:48                 ` Johannes Berg
  0 siblings, 0 replies; 15+ messages in thread
From: Johannes Berg @ 2017-04-20 14:48 UTC (permalink / raw)
  To: Daniel Borkmann, Alexei Starovoitov; +Cc: netdev

On Thu, 2017-04-20 at 16:46 +0200, Daniel Borkmann wrote:
> > Hmm. I don't see what "somewhere else" I could possibly have
> > though, given that I want the (kernel-side) context to be "struct
> > sk_buff *" to allow the skb helpers?
> 
> I have not enough context on the wireless side, perhaps could be
> somewhere under skb->dev->ieee80211_ptr or so, iff suitable. 

I don't think I even have skb->dev assigned at this point :)

> But
> it also really doesn't matter much since this is all transparently
> handled in the kernel, meaning these kind of rewrites can still be
> changed at a later point in time, f.e. if it's only 'u64 boottime_ns'
> right now, that could live directly in the cb[] w/o extra pointer,
> and should that grow to more members, then it could be moved behind
> a pointer later on and it still works as-is from the program point
> of view.

Well, there are 48 bytes in the cb already, so doing this would mean
moving two pointer-sized things out, but yeah - as long as we don't add
everything right now we can keep those things that are available to BPF
in the cb and move something else out.

As long as what I described there with the indirect load works, I'm
fine with this, and I can even do data_end pretty easily then.

johannes

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2017-04-20 14:48 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-19 21:31 __sk_buff.data_end Johannes Berg
2017-04-19 22:20 ` __sk_buff.data_end Johannes Berg
2017-04-20  0:01   ` __sk_buff.data_end Daniel Borkmann
2017-04-20  0:12     ` __sk_buff.data_end Alexei Starovoitov
2017-04-20  0:38       ` __sk_buff.data_end Daniel Borkmann
2017-04-20  6:07         ` __sk_buff.data_end Johannes Berg
2017-04-20  6:06       ` __sk_buff.data_end Johannes Berg
2017-04-20  6:01     ` __sk_buff.data_end Johannes Berg
2017-04-20 14:10       ` __sk_buff.data_end Daniel Borkmann
2017-04-20 14:17         ` __sk_buff.data_end Johannes Berg
2017-04-20 14:28           ` __sk_buff.data_end Daniel Borkmann
2017-04-20 14:32             ` __sk_buff.data_end Johannes Berg
2017-04-20 14:46               ` __sk_buff.data_end Daniel Borkmann
2017-04-20 14:48                 ` __sk_buff.data_end Johannes Berg
2017-04-19 23:51 ` __sk_buff.data_end Daniel Borkmann

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.