Re: [PATCH net-next 1/1] net/sched: Introduce skb hash classifier

From: Jamal Hadi Salim <jhs@mojatatu.com>
To: Cong Wang <xiyou.wangcong@gmail.com>
Cc: David Miller <davem@davemloft.net>,
	Linux Kernel Network Developers <netdev@vger.kernel.org>,
	Jiri Pirko <jiri@resnulli.us>,
	Ariel Levkovich <lariel@mellanox.com>
Subject: Re: [PATCH net-next 1/1] net/sched: Introduce skb hash classifier
Date: Wed, 19 Aug 2020 05:48:03 -0400
Message-ID: <b7460988-1f9c-5693-f09c-729453c1e58a@mojatatu.com>
In-Reply-To: <CAM_iQpX71-jFUddZoSQrXWpd0KRpi0ueoK=h3ugBh5ufYvqLEQ@mail.gmail.com>

On 2020-08-17 3:47 p.m., Cong Wang wrote:
> On Mon, Aug 17, 2020 at 4:19 AM Jamal Hadi Salim <jhs@mojatatu.com> wrote:
>>

[..]
>> There is no ambiguity of intent in the fw case, there is only one field.
>> In the case of having multiple fields it is ambiguous if you
>> unconditionally look.
>>
>> Example: policy says to match skb mark of 5 and hash of 3.
>> If a packet arrives with skb->mark 5 and skb->hash 3, it
>> very clearly matches the intent of the policy.
>> If a packet arrives with skb->mark 7 and hash 3, it clearly
>> does not match the intent, etc.
> 
> This example clearly shows no ambiguity, right? ;)
> 

Ambiguous only from the perspective of relational AND vs OR
(your original pseudo code had it in an OR relation).

> 
>>
>>> But if filters were put in a global hashtable, the above would be
>>> much harder to implement.
>>>
>>
>> Ok, yes. My assumption has been that you will have some global
>> shared structure that all filters are installed on.
> 
> Sure, if not hashtable, we could simply put them in a list:
> 
> list_for_each_filter {
>    if (filter_parameter_has_hash) {
>      match skb->hash with cls->param_hash
>    }
>    if (filter_parameter_has_mark) {
>      match skb->mark with cls->param_mark
>    }
> }
> 

Yes, that would work - but iteration is linear.

> 
>>
>> I think i may have misunderstood all along what you were saying
>> which is:
>>
>> a) add the rules so they are each _independent with different
>>      priorities_ in a chain.
> 
> Yes, because this gives users the freedom to pick a prio that is
> independent of the value (hash or mark).
>

ok.

> 
>>
>> b)  when i do a lookup on packet arrival, i will only see a filter
>>    that matches "match mark 5 and hash 3" (meaning there is no
>>    ambiguity on intent). If the packet data doesn't match the policy
>>    then i will iterate to another filter on the chain list with
>>    lower priority.
> 
> Right. Multiple values mean AND, not OR, so if you specify
> mark 5 and hash 3, it will match skb->mark==5 && skb->hash==3.
> If not matched, it will continue the iteration until the end.
>

That would remove the ambiguity (assuming iteration with "continue"
to create the AND effect).
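
A rough userspace C sketch of that iteration, to make the AND semantics
concrete. All names here (struct pkt, struct filter, the F_HAS_* flags,
classify) are made up for illustration and are not taken from any kernel
code; struct pkt merely stands in for the two skb fields being discussed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; this is not kernel code. */
#define F_HAS_MARK 0x1u
#define F_HAS_HASH 0x2u

struct pkt {			/* stand-in for skb->mark and skb->hash */
	uint32_t mark;
	uint32_t hash;
};

struct filter {
	uint32_t flags;		/* which parameters the policy specified */
	uint32_t mark;
	uint32_t hash;
	int classid;
	struct filter *next;	/* next (lower priority) filter on the chain */
};

/*
 * Walk the chain in priority order. Every field a filter specifies must
 * match (AND); the first mismatch does a "continue" to the next filter.
 * Returns the classid of the first matching filter, or -1 if none match.
 * A filter that specifies no field at all matches every packet.
 */
static int classify(const struct filter *head, const struct pkt *p)
{
	const struct filter *f;

	assert(p != NULL);
	for (f = head; f; f = f->next) {
		if ((f->flags & F_HAS_MARK) && f->mark != p->mark)
			continue;
		if ((f->flags & F_HAS_HASH) && f->hash != p->hash)
			continue;
		return f->classid;
	}
	return -1;
}
```

With a "mark 5 and hash 3" filter ahead of a "hash 3" filter, a packet
with mark 7 and hash 3 skips the first and lands on the second. The cost
is one pass over the chain per packet in the worst case.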

>>
>> Am i correct in my understanding?
>>
>> If i am - then we still have a problem with lookup scale in the
>> presence of a large number of filters, since essentially this
>> approach is a linear lookup (similar to the problem iptables has).
>> I am afraid a hash table or something with similar goals in
>> principle is needed.
> 
> Yeah, this is why I asked you whether we have to put them in a
> hashtable in previous emails, as hashtable organizes them with
> a key, it is hard to combine multiple fields in one key and allow
> to extend easily in the future. But other people smarter than me
> may have better ideas here.

To achieve reasonable performance (with many filters) I don't think
there is any escape from having something that is centralized
(per priority) - sort of what fw, u32 or flower do. A hash table is
the most common approach; i was hoping that IDR may be useful since
skb->hash maps nicely to a "32 bit key", but Vlad was saying at the
tc workshop that he was seeing bottlenecks with IDR.

I will test a few hash algorithms (including common ones like jenkins)
for this use case.
The problem right now is that fw also has that rcu-inspired upper limit
on buckets: if i have 1M entries which at best can be spread perfectly
across 256 buckets, then i have ~4K entries per bucket, and each bucket
is a linked list. So we are still not doing so well. Do you have
time to make that better for fw (since you are an rcu maestro)?
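
A rough userspace sketch of the arithmetic above, plus one possible way
to fold both fields into a single bucket index. Assumptions: the names
(NBUCKETS, oaat_hash, bucket_of, best_case_chain) are made up; the hash
is Bob Jenkins' one-at-a-time function, not the kernel's jhash; and every
filter is assumed to specify both mark and hash, which sidesteps the
harder question of filters that set only one of them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NBUCKETS 256u	/* the bucket cap discussed above */

/* Bob Jenkins' one-at-a-time hash over a byte string. */
static uint32_t oaat_hash(const uint8_t *key, size_t len)
{
	uint32_t h = 0;
	size_t i;

	for (i = 0; i < len; i++) {
		h += key[i];
		h += h << 10;
		h ^= h >> 6;
	}
	h += h << 3;
	h ^= h >> 11;
	h += h << 15;
	return h;
}

/* Fold (mark, hash) into one bucket index; hypothetical helper. */
static uint32_t bucket_of(uint32_t mark, uint32_t hash)
{
	uint8_t key[8];
	int i;

	/* The mask below only works for a power-of-two bucket count. */
	assert((NBUCKETS & (NBUCKETS - 1)) == 0);
	for (i = 0; i < 4; i++) {
		key[i] = (uint8_t)(mark >> (8 * i));
		key[4 + i] = (uint8_t)(hash >> (8 * i));
	}
	return oaat_hash(key, sizeof(key)) & (NBUCKETS - 1);
}

/* Chain length when nfilters spread perfectly evenly (rounded up). */
static uint32_t best_case_chain(uint32_t nfilters)
{
	return (nfilters + NBUCKETS - 1) / NBUCKETS;
}
```

Even with a perfect spread, best_case_chain(1000000) comes to 3907,
i.e. the ~4K linked-list entries per bucket mentioned above; a better
hash function does not help until the bucket count itself can grow.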

cheers,
jamal