* Assertion error when using map
@ 2019-12-30 23:23 Changli Gao
  2020-01-02 23:25 ` Florian Westphal
  0 siblings, 1 reply; 4+ messages in thread
From: Changli Gao @ 2019-12-30 23:23 UTC (permalink / raw)
  To: netfilter

I want to use a map to simplify the configuration of DSCP fields, with
the following command:

> ... ip dscp set meta cgroup map { 3000 : 0x2c, 4000 : 0x20 }

But it fails with the following message:

> BUG: invalid mapping expression set reference
> nft: evaluate.c:1426: expr_evaluate_map: Assertion `0' failed.
> Aborted (core dumped)

It seems that the parser recognizes the command as valid, but the
later evaluation step rejects it.

-- 
Regards,
Changli Gao(xiaosuo@gmail.com)


* Re: Assertion error when using map
  2019-12-30 23:23 Assertion error when using map Changli Gao
@ 2020-01-02 23:25 ` Florian Westphal
  2020-01-03  3:05   ` Changli Gao
  0 siblings, 1 reply; 4+ messages in thread
From: Florian Westphal @ 2020-01-02 23:25 UTC (permalink / raw)
  To: Changli Gao; +Cc: netfilter

Changli Gao <xiaosuo@gmail.com> wrote:
> I want to use map to simplify the configuration of DSCP fields with
> the following command:
> 
> > ... ip dscp set meta cgroup map { 3000 : 0x2c, 4000 : 0x20 }
> 
> But it fails with the following message:
> 
> > BUG: invalid mapping expression set reference
> > nft: evaluate.c:1426: expr_evaluate_map: Assertion `0' failed.
> > Aborted (core dumped)
> 
> It seems that the parser recognizes the command as valid, but the
> later evaluation step rejects it.

Yes, this is unsupported.

The problem is that 'dscp' is a 6-bit field, i.e. its length is not a
multiple of 8 bits.

tcp dport set 42

is simple:
  [ immediate reg 1 0x00002a00 ]
  [ payload write reg 1 => 2b @ transport header + 2 ..

We can just place the immediate in a register and tell payload
expression to place two bytes from the register at the proper location.
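
As a rough illustration of where 0x00002a00 comes from: a minimal Python
sketch, assuming the debug output prints the 4-byte register as a 32-bit
word in host byte order on a little-endian machine (that display convention
is an assumption here, and reg_display is a made-up helper, not part of nft):

```python
def reg_display(value: int, nbytes: int = 2) -> str:
    """Format a big-endian (network order) field the way a little-endian
    host would print the 4-byte register that holds it."""
    wire = value.to_bytes(nbytes, "big")        # bytes as they sit in the packet
    padded = wire + b"\x00" * (4 - nbytes)      # register is zero-padded to 4 bytes
    return f"0x{int.from_bytes(padded, 'little'):08x}"


print(reg_display(42))  # 0x00002a00
```

Port 42 is 00 2a on the wire, which reads as 0x2a00 when those two bytes
are interpreted as a little-endian word.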

ip dscp set 42

is already more complicated:

  [ payload load 2b @ network header + 0 => reg 1 ]
  [ bitwise reg 1 = (reg=1 & 0x000003ff ) ^ 0x0000a800 ]
  [ payload write reg 1 => 2b @ network header + 0 ..

This is needed because the 'payload write' size is in bytes: just placing
42 in a register and then telling the payload expression to write
that to the proper location in the packet would zero the ECN
signalling bits.

So, nft first loads the existing data, masks off the dscp
bits (retaining everything else sharing the same byte-addressed
location), then XORs in the immediate (0xa8 == 42 << 2).

Then, the register is written to the packet payload.
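
The load / mask / XOR / write sequence can be sketched in Python as
follows. This is only a model of the three expressions above, not nft
code: set_dscp is a hypothetical helper, the register is treated as a
little-endian word so that the constants match the dump, and checksum
updates are ignored.

```python
DSCP_KEEP_MASK = 0x03ff   # keeps byte 0 entirely and the two ECN bits of byte 1


def set_dscp(header: bytes, dscp: int) -> bytes:
    """Rewrite the 6-bit DSCP field while preserving the ECN bits."""
    if not 0 <= dscp < 64:
        raise ValueError("dscp must fit in 6 bits")
    reg = int.from_bytes(header[:2], "little")        # payload load 2b @ network header + 0
    reg = (reg & DSCP_KEEP_MASK) ^ ((dscp << 2) << 8) # bitwise: mask, then xor 0xa800 for dscp 42
    return reg.to_bytes(2, "little") + header[2:]     # payload write 2b @ network header + 0


hdr = bytes([0x45, 0x03])            # version/ihl byte, then tos byte with both ECN bits set
print(set_dscp(hdr, 42).hex())       # 45ab: dscp 42 written, ECN bits kept
print(bytes([0x45, 42 << 2]).hex())  # 45a8: a plain byte write would clear ECN
```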

In order to support this for map, we would need something
similar to this:

   [ meta load cgroup => reg 1 ]
   [ lookup reg 1 set __map%d dreg 1 ]  # reg1 now contains desired dscp value
   [ payload load 2b @ network header + 0 => reg 2 ] # reg 2: original bytes that need mangling
   [ bitwise reg 2 = (reg=2 & 0x000003ff ) ^ reg1 ] # XOR reg1 into reg2
   [ payload write reg 2 => 2b @ ... # write back reg2 to packet header

This needs quite some work:

We must preprocess the map data values to contain the shifted
immediates, i.e. if the user stores 0x20 we need to pass 0x80 to the kernel.

nft does this via 'binop_transfer' in the evaluation phase (but not yet for
maps, as you found).
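
The value preprocessing amounts to shifting every map data value left by
two bits. A minimal Python sketch of that arithmetic, using the map from
the original report (preshift_map is a hypothetical name; this is not how
binop_transfer is implemented):

```python
def preshift_map(mapping: dict[int, int], shift: int = 2) -> dict[int, int]:
    """Return a copy of the map with each DSCP value moved into its bit position."""
    for value in mapping.values():
        if not 0 <= value < 64:
            raise ValueError(f"not a 6-bit dscp value: {value:#x}")
    return {key: value << shift for key, value in mapping.items()}


shifted = preshift_map({3000: 0x2c, 4000: 0x20})
print({k: hex(v) for k, v in shifted.items()})  # {3000: '0xb0', 4000: '0x80'}
```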

The second, more severe problem is that 'bitwise' only takes one source
register, not two.  So '[ reg2 = (reg2 & 0x3ff) ^ reg1 ]' is not
possible at the moment.  nft_bitwise.c in the kernel needs to be extended
for this.
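
To make the difference concrete, here is a small Python model of the
five-step sequence sketched above. Everything in it is hypothetical:
bitwise_sreg stands in for the two-register operation that does not exist
yet, the map is assumed to already hold pre-shifted values, and a lookup
miss is modelled as leaving the packet untouched.

```python
from typing import Mapping

DSCP_KEEP_MASK = 0x03ff   # little-endian view: keep byte 0 and the ECN bits of byte 1


def bitwise_imm(sreg: int, mask: int, xor_imm: int) -> int:
    """One source register, immediate mask and xor operands."""
    return (sreg & mask) ^ xor_imm


def bitwise_sreg(sreg: int, mask: int, xor_sreg: int) -> int:
    """Hypothetical variant: the xor operand comes from a second register."""
    return (sreg & mask) ^ xor_sreg


def set_dscp_from_map(header: bytes, cgroup: int, shifted: Mapping[int, int]) -> bytes:
    if cgroup not in shifted:                        # lookup miss: leave packet alone
        return header
    reg1 = shifted[cgroup] << 8                      # lookup result, placed in the tos byte
    reg2 = int.from_bytes(header[:2], "little")      # payload load 2b
    reg2 = bitwise_sreg(reg2, DSCP_KEEP_MASK, reg1)  # needs two source registers
    return reg2.to_bytes(2, "little") + header[2:]   # payload write 2b


hdr = bytes([0x45, 0x01])
print(set_dscp_from_map(hdr, 4000, {3000: 0xb0, 4000: 0x80}).hex())  # 4581
```

With only bitwise_imm available, the xor operand has to be known when the
rule is built, which is why a per-packet lookup result cannot be used.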

We will also likely need surgery on netlink
linearization/delinearization steps.

Regarding bitwise, there are other use cases that will need the
ability to handle more than one sreg, e.g. to restore only parts
of the packet mark to the connmark or vice versa, while retaining
existing bits, so this will need to be added eventually.


* Re: Assertion error when using map
  2020-01-02 23:25 ` Florian Westphal
@ 2020-01-03  3:05   ` Changli Gao
  2020-01-04 11:15     ` Florian Westphal
  0 siblings, 1 reply; 4+ messages in thread
From: Changli Gao @ 2020-01-03  3:05 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netfilter

I know it is difficult. Do you have any plans to support this kind of
feature? Or is there any way to work around this issue?


-- 
Regards,
Changli Gao(xiaosuo@gmail.com)


* Re: Assertion error when using map
  2020-01-03  3:05   ` Changli Gao
@ 2020-01-04 11:15     ` Florian Westphal
  0 siblings, 0 replies; 4+ messages in thread
From: Florian Westphal @ 2020-01-04 11:15 UTC (permalink / raw)
  To: Changli Gao; +Cc: Florian Westphal, netfilter

Changli Gao <xiaosuo@gmail.com> wrote:
> I know it is difficult. Do you have any plans to support this kind of
> feature? Or is there any way to work around this issue?

Yes, I think we should extend bitwise first, since that would make this
work:
src/nft -e -a --debug=netlink add rule inet filter input \
   ct mark set "ct mark & 0xffff0000 | meta mark & 0xffff"
inet filter input
  [ meta load mark => reg 2 ]
  [ bitwise reg 2 = (reg=2 & 0x0000ffff ) ^ 0x00000000 ]
  [ ct load mark => reg 1 ]
  [ bitwise reg 1 = (reg=1 & 0xffffffff ) ^ reg 2]
  [ ct set mark with reg 1 ]

As you can see, I have the netlink linearization part working; I don't know
yet when I will start working on the kernel part (or on testing this, for
that matter ...).
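
For reference, the arithmetic that the rule's expression asks for can be
written out in Python. This only evaluates the expression as typed on the
command line; merge_marks is a hypothetical name and the sketch says
nothing about how the work-in-progress bytecode computes it.

```python
def merge_marks(ct_mark: int, meta_mark: int) -> int:
    """ct mark & 0xffff0000 | meta mark & 0xffff, on 32-bit marks."""
    return (ct_mark & 0xffff0000) | (meta_mark & 0x0000ffff)


print(hex(merge_marks(0x12345678, 0xabcdef01)))  # 0x1234ef01
```

The upper 16 bits of the connmark are retained and the lower 16 bits are
taken from the packet mark. Because the two masks do not overlap, the OR
could equally be an XOR.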
