dev.dpdk.org archive mirror
 help / color / mirror / Atom feed
From: Wang Xiang <xiang.w.wang@intel.com>
To: Ori Kam <orika@mellanox.com>
Cc: Jerin Jacob <jerinjacobk@gmail.com>,
	Jerin Jacob <jerinj@marvell.com>, dpdk-dev <dev@dpdk.org>,
	Pavan Nikhilesh <pbhagavatula@marvell.com>,
	Shahaf Shuler <shahafs@mellanox.com>,
	Hemant Agrawal <hemant.agrawal@nxp.com>,
	Opher Reviv <opher@mellanox.com>,
	Alex Rosenbaum <alexr@mellanox.com>,
	"dovrat@marvell.com" <dovrat@marvell.com>,
	Prasun Kapoor <pkapoor@marvell.com>,
	Nipun Gupta <nipun.gupta@nxp.com>,
	"Richardson, Bruce" <bruce.richardson@intel.com>,
	"yang.a.hong@intel.com" <yang.a.hong@intel.com>,
	"harry.chang@intel.com" <harry.chang@intel.com>,
	"gu.jian1@zte.com.cn" <gu.jian1@zte.com.cn>,
	"shanjiangh@chinatelecom.cn" <shanjiangh@chinatelecom.cn>,
	"zhangy.yun@chinatelecom.cn" <zhangy.yun@chinatelecom.cn>,
	"lixingfu@huachentel.com" <lixingfu@huachentel.com>,
	"wushuai@inspur.com" <wushuai@inspur.com>,
	"yuyingxia@yxlink.com" <yuyingxia@yxlink.com>,
	"fanchenggang@sunyainfo.com" <fanchenggang@sunyainfo.com>,
	"davidfgao@tencent.com" <davidfgao@tencent.com>,
	"liuzhong1@chinaunicom.cn" <liuzhong1@chinaunicom.cn>,
	"zhaoyong11@huawei.com" <zhaoyong11@huawei.com>,
	"oc@yunify.com" <oc@yunify.com>,
	"jim@netgate.com" <jim@netgate.com>,
	"hongjun.ni@intel.com" <hongjun.ni@intel.com>,
	"j.bromhead@titan-ic.com" <j.bromhead@titan-ic.com>,
	"deri@ntop.org" <deri@ntop.org>,
	"fc@napatech.com" <fc@napatech.com>,
	"arthur.su@lionic.com" <arthur.su@lionic.com>,
	Thomas Monjalon <thomas@monjalon.net>
Subject: Re: [dpdk-dev] [PATCH v3] regexdev: introduce regexdev subsystem
Date: Wed, 26 Feb 2020 04:03:18 -0500	[thread overview]
Message-ID: <20200226090318.GA85327@hs1> (raw)
In-Reply-To: <AM6PR05MB5176621E6EEFB45C88CB4AD3DBED0@AM6PR05MB5176.eurprd05.prod.outlook.com>

Hi Ori and Jerin,

One comment regarding my concern with len and end_offset problem.
From open source SW regex library(libpcre, re2 and Hyperscan) and 
Intel's perspective, the matching results returned are always start
offset and end offset. More importantly, Hyperscan only reports end offset
most of the time.

It'll be good to keep this union as an abstraction and enforce the default
behavior for each solution, i.e. HW solutions doesn't support MATCH_AS_START
flag at rule compile time. Applications will know the meaning of variable at
rule compile time with the flag so they don't have to do extra check at fast path
run-time matching.
Welcome for better abstraction ideas.

Thanks,
Xiang

> > > > +                       /**< Starting Byte Position for matched rule. */
> > > > +                       RTE_STD_C11
> > > > +                       union {
> > > > +                               uint16_t len;
> > > > +                               /**< Length of match in bytes */
> > > > +                               uint16_t end_offset;
> > > > +                               /**< The end offset of the match. In case
> > > > +                                * MATCH_AS_START configuration is disabled.
> > > > +                                * @see RTE_REGEX_DEV_CFG_MATCH_AS_START
> > > > +                                */
> > >
> > > We have not concluded on this scheme. Have one field which has
> > > different meaning will be difficult
> > > for application. i.e fast path we need to have a check for this.
> > >
> >
> > This is the time to conclude . at least for the first version.
> > Why do we have one field with different meaning?
> > The result can be ether len or end_offset.
> >
> > > I think, Based on the majority of HW/SW implementation, we need to
> > > either go with len or
> > > end_offset. What Mellanox HW returns? len or end_offset?
> > >
> >
> > From Mellanox perspective we prefer the len approach. We also think
> > it is much more user oriented.
> >
> > > or We can keep it as len or end_offset based on which drivers upstream
> first,
> > > other drivers when it comes, we can see how to abstract it?
> > >
> >
> > I can except that assuming we choose the start and len approach
>
> I think, we can have first version with "start and len" by removing
> RTE_REGEX_DEV_CFG_MATCH_AS_START.
> When can think, how to abstract new drivers when it upstream based on
> the overhead.
>


On Tue, Feb 25, 2020 at 07:48:54AM +0000, Ori Kam wrote:
> 
> 
> > -----Original Message-----
> > From: Jerin Jacob <jerinjacobk@gmail.com>
> > Sent: Tuesday, February 25, 2020 7:57 AM
> > To: Ori Kam <orika@mellanox.com>
> > Cc: Jerin Jacob <jerinj@marvell.com>; xiang.w.wang@intel.com; dpdk-dev
> > <dev@dpdk.org>; Pavan Nikhilesh <pbhagavatula@marvell.com>; Shahaf
> > Shuler <shahafs@mellanox.com>; Hemant Agrawal
> > <hemant.agrawal@nxp.com>; Opher Reviv <opher@mellanox.com>; Alex
> > Rosenbaum <alexr@mellanox.com>; dovrat@marvell.com; Prasun Kapoor
> > <pkapoor@marvell.com>; Nipun Gupta <nipun.gupta@nxp.com>; Richardson,
> > Bruce <bruce.richardson@intel.com>; yang.a.hong@intel.com;
> > harry.chang@intel.com; gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn;
> > zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com; wushuai@inspur.com;
> > yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
> > davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
> > zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
> > hongjun.ni@intel.com; j.bromhead@titan-ic.com; deri@ntop.org;
> > fc@napatech.com; arthur.su@lionic.com; Thomas Monjalon
> > <thomas@monjalon.net>
> > Subject: Re: [dpdk-dev] [PATCH v3] regexdev: introduce regexdev subsystem
> > 
> > > > 4) app/test/test_regexdev.c like app/test/test_eventdev.c
> > >
> > > We started to create a super basic app, after the API will be finalized and we
> > will have HW
> > > we can push it. (if you need it faster than feel free)
> > 
> > A simple Unit test case needs to be present for the APIs. On the
> > course of developing common code,
> > it can be developed to test the common code with dummy/skeleton driver.
> > 
> 
> Agree this is what we are currently have.
> 
> > >
> > > > 5) Need a maintainer for maintaining the regex subsystem
> > > >
> > > We wish to maintain it if you agree.
> > 
> > Yes. Please.
> > 
> 
> Great.
> 
> > > > >
> > > > > One more thing, regarding the ops structure, I think it is better to split it
> > to 2
> > > > different
> > > > > structures one enque and one for dequeue, since there are no real shared
> > > > data and we will
> > > > > be able to save memory, what do you think?
> > > >
> > > > Ops are allocated from mempool so it will be overhead to manage both.
> > > > moreover, some
> > > > of the fields added in req can be used for resp as info. cryptodev
> > > > follows the similar concept,
> > > > I think, we can have symmetry with cryptodev wherever is possible to avoid
> > > > end-user to learn new API models.
> > >
> > > True that there will be overhead with 2 mempools (small one)
> > > but lets assume 255 results. This means that the buffer should be 255 *
> > sizeof(rte_regex_match) = 2K
> > > also this will enable us to replace groupX with group[] which will allow even
> > more groups.
> > > In addition don't think that crypto is a good example.
> > > The main difference is that in RegEx the output is different format then the
> > input.
> > 
> > # IMO, Some of the fields may be useful for a response as well. I
> > think application may be interested in following
> > req filed in the response.
> > a) buf_addr
> 
> I don't see how this can be used in the response. since if working in out of order result.
> you don’t know which result will be returned. 
> I also think it is error prone to use the same op for the enqueue and dequeue.
> 
> > b) scan_size
> 
> Please see above.
> 
> > c) user_id (This would be main one)
> 
> Agree
> 
> > 
> > # Having two mempools adds overhead per lcore L1 cache usage and extra
> > complexity to the application.
> > 
> > # IMO, From a performance perspective, one mempool is good due to less
> > stress on the cache and it is costly to
> > add new mempool for HW mempool implementations.
> > 
> > # I think, group[] use case we can add it when it required by
> > introducing "matches_start_offset" field, which will
> > tell the req, where is the end of group[] and where "matches" start
> > with single mempool scheme also.
> > 
> > # I think, one of the other use case for "matches_start_offset" that,
> > It may possible to put vendor-specific
> > opaque data. It will be filled by driver on response. The application
> > can reference the matches as
> > 
> > struct rte_regex_match *matches = RTE_PTR_ADD(ops, ops-
> > 
> >matches_start_offset);
> > 
> 
> O.K for now we will keep  it as is, and we will see what will be in the future.
> 
> > >
> > > > I assume you will send the v4 with these comments. I think, with v4 we
> > > > can start implementing common library code.
> > >
> > > Just need to agree on the split (one more iteration )
> > > and I will start working on the common code.
> > 
> > Ack.
> 
> Great,
> I'm starting to work on V4 with all comments so the RFC will be acked and then will start 
> coding the rest of the common code.
> 

  reply	other threads:[~2020-02-26  1:47 UTC|newest]

Thread overview: 61+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-27 15:50 [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev subsystem jerinj
2019-07-15  4:26 ` Jerin Jacob Kollanukkaran
2019-08-15  9:35 ` Thomas Monjalon
2019-08-15 11:34   ` Thomas Monjalon
2019-08-19  3:09     ` Jerin Jacob Kollanukkaran
2019-08-20  1:54       ` Wang, Xiang W
2019-09-10  8:05         ` Jerin Jacob Kollanukkaran
2019-09-19 13:58           ` Wang Xiang
2019-09-27 14:35             ` Jerin Jacob Kollanukkaran
2019-10-14 13:59               ` Wang Xiang
2020-01-26 11:55                 ` Ori Kam
2019-08-21  5:32     ` Shahaf Shuler
2019-08-21 15:12       ` John Bromhead
2019-09-10 10:31       ` Jerin Jacob Kollanukkaran
2019-09-10 11:02       ` Jerin Jacob Kollanukkaran
2019-09-27 14:45         ` Jerin Jacob Kollanukkaran
2019-10-02  5:53           ` Shahaf Shuler
2019-10-02  8:31             ` Jerin Jacob Kollanukkaran
2019-10-02  8:52               ` Shahaf Shuler
2019-10-02  9:34                 ` Jerin Jacob Kollanukkaran
2020-01-27 21:19 ` [dpdk-dev] [PATCH v2] net/regexdev: " Ori Kam
2020-01-28  9:00 ` [dpdk-dev] [PATCH v3] regexdev: " Ori Kam
2020-02-22 16:52   ` Jerin Jacob
2020-02-23  8:41     ` Ori Kam
2020-02-23  9:53       ` Jerin Jacob
2020-02-23 12:33         ` Ori Kam
2020-02-25  5:57           ` Jerin Jacob
2020-02-25  7:48             ` Ori Kam
2020-02-26  9:03               ` Wang Xiang [this message]
2020-02-26  8:36                 ` Ori Kam
2020-02-27  9:25                   ` Wang Xiang
2020-02-27  7:31                     ` Ori Kam
2020-02-27  9:16                       ` Wang Xiang
2020-02-27 14:40 ` [dpdk-dev] [RFC v4] " Ori Kam
2020-02-27 14:55   ` Jerin Jacob
2020-02-27 15:08 ` [dpdk-dev] [RFC v5] " Ori Kam
2020-03-01  6:13   ` [dpdk-dev] [EXT] " Pavan Nikhilesh Bhagavatula
2020-03-01  7:31     ` Ori Kam
2020-03-01 13:23       ` Pavan Nikhilesh Bhagavatula
2020-03-01 14:10         ` Ori Kam
2020-03-01 14:38           ` Pavan Nikhilesh Bhagavatula
2020-03-01 15:41             ` Ori Kam
2020-03-01 15:57               ` Pavan Nikhilesh Bhagavatula
2020-03-02  7:18                 ` Jerin Jacob
2020-03-03  7:06                   ` Ori Kam
2020-03-02  7:05   ` [dpdk-dev] " Wang Xiang
2020-03-03  7:44     ` Ori Kam
2020-03-03  7:54       ` Jerin Jacob
2020-03-10 10:32 ` [dpdk-dev] [RFC v6] " Ori Kam
2020-03-10 13:42   ` Pavan Nikhilesh Bhagavatula
2020-03-10 16:23     ` Ori Kam
2020-03-10 16:36       ` Pavan Nikhilesh Bhagavatula
2020-03-10 17:00         ` Ori Kam
2020-03-12 12:13           ` Ori Kam
2020-03-13  1:20   ` Wang Xiang
2020-03-15 10:05     ` Ori Kam
2020-03-16  1:25       ` Wang Xiang
2020-03-16  9:09         ` Ori Kam
2020-03-16 20:48           ` Wang Xiang
2020-03-16 13:49             ` Ori Kam
2020-03-16 21:10               ` Wang Xiang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200226090318.GA85327@hs1 \
    --to=xiang.w.wang@intel.com \
    --cc=alexr@mellanox.com \
    --cc=arthur.su@lionic.com \
    --cc=bruce.richardson@intel.com \
    --cc=davidfgao@tencent.com \
    --cc=deri@ntop.org \
    --cc=dev@dpdk.org \
    --cc=dovrat@marvell.com \
    --cc=fanchenggang@sunyainfo.com \
    --cc=fc@napatech.com \
    --cc=gu.jian1@zte.com.cn \
    --cc=harry.chang@intel.com \
    --cc=hemant.agrawal@nxp.com \
    --cc=hongjun.ni@intel.com \
    --cc=j.bromhead@titan-ic.com \
    --cc=jerinj@marvell.com \
    --cc=jerinjacobk@gmail.com \
    --cc=jim@netgate.com \
    --cc=liuzhong1@chinaunicom.cn \
    --cc=lixingfu@huachentel.com \
    --cc=nipun.gupta@nxp.com \
    --cc=oc@yunify.com \
    --cc=opher@mellanox.com \
    --cc=orika@mellanox.com \
    --cc=pbhagavatula@marvell.com \
    --cc=pkapoor@marvell.com \
    --cc=shahafs@mellanox.com \
    --cc=shanjiangh@chinatelecom.cn \
    --cc=thomas@monjalon.net \
    --cc=wushuai@inspur.com \
    --cc=yang.a.hong@intel.com \
    --cc=yuyingxia@yxlink.com \
    --cc=zhangy.yun@chinatelecom.cn \
    --cc=zhaoyong11@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).