From: Weiping Zhang <zwp10758@gmail.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Weiping Zhang <zhangweiping@didiglobal.com>,
	Jens Axboe <axboe@kernel.dk>, Tejun Heo <tj@kernel.org>,
	Christoph Hellwig <hch@lst.de>,
	Bart Van Assche <bvanassche@acm.org>,
	keith.busch@intel.com, Minwoo Im <minwoo.im.dev@gmail.com>,
	"Nadolski, Edmund" <edmund.nadolski@intel.com>,
	linux-block@vger.kernel.org, cgroups@vger.kernel.org,
	linux-nvme@lists.infradead.org
Subject: Re: [PATCH v4 4/5] genirq/affinity: allow driver's discontigous affinity set
Date: Tue, 4 Feb 2020 11:11:25 +0800	[thread overview]
Message-ID: <CAA70yB7ThwiaGFkM6J-cja4OcD0oH_6KTwH7vpmp9mVG0Xte4w@mail.gmail.com> (raw)
In-Reply-To: <871rrevfmz.fsf@nanos.tec.linutronix.de>

Thomas Gleixner <tglx@linutronix.de> wrote on Saturday, February 1, 2020 at 5:19 PM:
>
> Weiping Zhang <zhangweiping@didiglobal.com> writes:
>
> > The nvme driver will add 4 sets to support NVMe weighted round robin,
> > and some of these sets may be empty (depending on user configuration),
> > so each particular set is assigned one static index to avoid
> > management trouble; the empty sets will then be seen by
> > irq_create_affinity_masks().
>
> What's the point of an empty interrupt set in the first place? This does
> not make sense and smells like a really bad hack.
>
> Can you please explain in detail why this is required and why it
> actually makes sense?
>
Hi Thomas,
Sorry for the late reply; I will post a new patch that avoids creating
empty sets.
In this version nvme adds 4 extra sets, because nvme splits its io
queues into 7 parts (poll, default, read, wrr_low, wrr_medium,
wrr_high, wrr_urgent). The poll queues do not use irqs, so nvme has at
most 6 irq sets. The nvme driver uses two variables
(dev->io_queues[index] and affd->set_size[index]) to track how many
queues/irqs are in each part, and the user may set some queue counts
to 0. For example, with 96 io queues:

default
dev->io_queues[0] = 90
affd->set_size[0] = 90

read
dev->io_queues[1] = 0
affd->set_size[1] = 0

wrr low
dev->io_queues[2] = 0
affd->set_size[2] = 0

wrr medium
dev->io_queues[3] = 0
affd->set_size[3] = 0

wrr high
dev->io_queues[4] = 6
affd->set_size[4] = 6

wrr urgent
dev->io_queues[5] = 0
affd->set_size[5] = 0
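
(For reference, a sketch of the set bookkeeping that affd points to,
taken from struct irq_affinity in include/linux/interrupt.h; note that
mainline defines IRQ_AFFINITY_MAX_SETS as 4, so a 6-set layout like
the one above presumably also needs that limit raised:)

        #define IRQ_AFFINITY_MAX_SETS  4   /* mainline value; 6 sets would need more */

        struct irq_affinity {
                unsigned int    pre_vectors;    /* excluded from spreading, at the front */
                unsigned int    post_vectors;   /* excluded from spreading, at the back */
                unsigned int    nr_sets;        /* number of interrupt sets */
                unsigned int    set_size[IRQ_AFFINITY_MAX_SETS]; /* irqs per set */
                void            (*calc_sets)(struct irq_affinity *, unsigned int nvecs);
                void            *priv;
        };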

In this case, set indexes 1 through 3 have 0 irqs.

But actually there is no need to use a fixed index for io_queues and
set_size: nvme just tells the irq engine how many irq sets it has and
how many irqs are in each set. I will post a V5 that solves this
problem; the new set-packing logic looks like this:
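        /* The default set always exists; every other queue class adds
         * an affinity set only when its count is non-zero, so the sets
         * handed to the irq core are packed with no empty holes. */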
        nr_sets = 1;
        dev->io_queues[HCTX_TYPE_DEFAULT] = nr_default;
        affd->set_size[nr_sets - 1] = nr_default;
        dev->io_queues[HCTX_TYPE_READ] = nr_read;
        if (nr_read) {
                nr_sets++;
                affd->set_size[nr_sets - 1] = nr_read;
        }
        dev->io_queues[HCTX_TYPE_WRR_LOW] = nr_wrr_low;
        if (nr_wrr_low) {
                nr_sets++;
                affd->set_size[nr_sets - 1] = nr_wrr_low;
        }
        dev->io_queues[HCTX_TYPE_WRR_MEDIUM] = nr_wrr_medium;
        if (nr_wrr_medium) {
                nr_sets++;
                affd->set_size[nr_sets - 1] = nr_wrr_medium;
        }
        dev->io_queues[HCTX_TYPE_WRR_HIGH] = nr_wrr_high;
        if (nr_wrr_high) {
                nr_sets++;
                affd->set_size[nr_sets - 1] = nr_wrr_high;
        }
        dev->io_queues[HCTX_TYPE_WRR_URGENT] = nr_wrr_urgent;
        if (nr_wrr_urgent) {
                nr_sets++;
                affd->set_size[nr_sets - 1] = nr_wrr_urgent;
        }
        affd->nr_sets = nr_sets;
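
(For completeness, the populated descriptor is then handed to the PCI
core in the usual way; below is a sketch modeled on the existing
pci_alloc_irq_vectors_affinity() call in nvme_setup_irqs(), where
nr_irqs stands for the total vector count and is only illustrative:)

        result = pci_alloc_irq_vectors_affinity(pdev, 1, nr_irqs,
                        PCI_IRQ_ALL_TYPES | PCI_IRQ_AFFINITY, &affd);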

Thanks
Weiping

Thread overview: 23+ messages

2020-01-28 11:51 [PATCH v4 0/5] Add support Weighted Round Robin for blkcg and nvme Weiping Zhang
2020-01-28 11:52 ` [PATCH v4 1/5] block: add weighted round robin for blkcgroup Weiping Zhang
2020-01-28 11:53 ` [PATCH v4 2/5] nvme: add get_ams for nvme_ctrl_ops Weiping Zhang
2020-01-28 11:53 ` [PATCH v4 3/5] nvme-pci: rename module parameter write_queues to read_queues Weiping Zhang
2020-01-28 11:53 ` [PATCH v4 4/5] genirq/affinity: allow driver's discontigous affinity set Weiping Zhang
2020-02-01  9:08   ` Thomas Gleixner
2020-02-04  3:11     ` Weiping Zhang [this message]
2020-01-28 11:53 ` [PATCH v4 5/5] nvme: add support weighted round robin queue Weiping Zhang
