From mboxrd@z Thu Jan 1 00:00:00 1970
From: Felipe Franciosi
Subject: Re: [PATCH 04/10] xen/blkfront: separate ring information to an new struct
Date: Wed, 18 Feb 2015 18:08:07 +0000
Message-ID: <9F2C4E7DFB7839489C89757A66C5AD629EB997__37783.7482643828$1424282993$gmane$org@AMSPEX01CL03.citrite.net>
References: <1423988345-4005-1-git-send-email-bob.liu@oracle.com> <1423988345-4005-5-git-send-email-bob.liu@oracle.com> <54E4CBD1.1000802@citrix.com> <20150218173746.GF8152@l.oracle.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
In-Reply-To: <20150218173746.GF8152@l.oracle.com>
Content-Language: en-US
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: 'Konrad Rzeszutek Wilk' , Roger Pau Monne , Bob Liu
Cc: "hch@infradead.org" , "linux-kernel@vger.kernel.org" , "xen-devel@lists.xen.org" , "axboe@fb.com" , David Vrabel , "avanzini.arianna@gmail.com"
List-Id: xen-devel@lists.xenproject.org

> -----Original Message-----
> From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@oracle.com]
> Sent: 18 February 2015 17:38
> To: Roger Pau Monne
> Cc: Bob Liu; xen-devel@lists.xen.org; David Vrabel; linux-kernel@vger.kernel.org;
> Felipe Franciosi; axboe@fb.com; hch@infradead.org; avanzini.arianna@gmail.com
> Subject: Re: [PATCH 04/10] xen/blkfront: separate ring information to an new struct
>
> On Wed, Feb 18, 2015 at 06:28:49PM +0100, Roger Pau Monné wrote:
> > On 15/02/15 at 9:18, Bob Liu wrote:
> > > A ring is the representation of a hardware queue. This patch
> > > separates ring information from blkfront_info into a new struct,
> > > blkfront_ring_info, in preparation for supporting real multiple
> > > hardware queues.
> > >
> > > Signed-off-by: Arianna Avanzini
> > > Signed-off-by: Bob Liu
> > > ---
> > >  drivers/block/xen-blkfront.c | 403 +++++++++++++++++++++++--------------------
> > >  1 file changed, 218 insertions(+), 185 deletions(-)
> > >
> > > diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
> > > index 5a90a51..aaa4a0e 100644
> > > --- a/drivers/block/xen-blkfront.c
> > > +++ b/drivers/block/xen-blkfront.c
> > > @@ -102,23 +102,15 @@ MODULE_PARM_DESC(max, "Maximum amount of segments in indirect requests (default
> > >  #define BLK_RING_SIZE __CONST_RING_SIZE(blkif, PAGE_SIZE)
> > >
> > >  /*
> > > - * We have one of these per vbd, whether ide, scsi or 'other'.  They
> > > - * hang in private_data off the gendisk structure. We may end up
> > > - * putting all kinds of interesting stuff here :-)
> > > + * Per-ring info.
> > > + * A blkfront_info structure can associate with one or more blkfront_ring_info,
> > > + * depending on how many hardware queues supported.
> > >  */
> > > -struct blkfront_info
> > > -{
> > > +struct blkfront_ring_info {
> > >  	spinlock_t io_lock;
> > > -	struct mutex mutex;
> > > -	struct xenbus_device *xbdev;
> > > -	struct gendisk *gd;
> > > -	int vdevice;
> > > -	blkif_vdev_t handle;
> > > -	enum blkif_state connected;
> > >  	int ring_ref;
> > >  	struct blkif_front_ring ring;
> > >  	unsigned int evtchn, irq;
> > > -	struct request_queue *rq;
> > >  	struct work_struct work;
> > >  	struct gnttab_free_callback callback;
> > >  	struct blk_shadow shadow[BLK_RING_SIZE];
> > > @@ -126,6 +118,22 @@ struct blkfront_info
> > >  	struct list_head indirect_pages;
> > >  	unsigned int persistent_gnts_c;
> > >  	unsigned long shadow_free;
> > > +	struct blkfront_info *info;
> >
> > AFAICT you seem to have a list of persistent grants, indirect pages
> > and a grant table callback for each ring, isn't this supposed to be
> > shared between all rings?
> >
> > I don't think we should be going down that route, or else we can hoard
> > a large amount of memory and grants.
>
> It does remove the lock that would have to be accessed by each ring thread to
> access those. Those values (grants) can be limited to a smaller value such
> that the overall number is the same as it was with the previous version. As in:
> each ring has = MAX_GRANTS / nr_online_cpus().
>

We should definitely be concerned with the amount of memory consumed on the backend for each plugged virtual disk. We have faced several problems in XenServer around this area before; it drastically affects VBD scalability per host.

This makes me think that all the persistent grants work was done as a workaround while we were facing several performance problems around concurrent grant un/mapping operations. Given all the recent submissions made around this (grant ops) area, is this something we should perhaps revisit and discuss whether we want to continue offering persistent grants as a feature?

Thanks,
Felipe