From: Eric Farman <farman@linux.ibm.com>
To: Matthew Rosato <mjrosato@linux.ibm.com>,
	Halil Pasic <pasic@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>,
	Vasily Gorbik <gor@linux.ibm.com>,
	Alexander Gordeev <agordeev@linux.ibm.com>,
	Christian Borntraeger <borntraeger@linux.ibm.com>,
	Sven Schnelle <svens@linux.ibm.com>,
	Vineeth Vijayan <vneethv@linux.ibm.com>,
	Peter Oberparleiter <oberpar@linux.ibm.com>,
	linux-s390@vger.kernel.org, kvm@vger.kernel.org
Subject: Re: [PATCH v1 07/16] vfio/ccw: remove unnecessary malloc alignment
Date: Mon, 19 Dec 2022 11:22:34 -0500
Message-ID: <61827ce18008642d556ca899179fb6216a079939.camel@linux.ibm.com>
In-Reply-To: <f814a82c-f1a6-4e90-4898-290dbbc73770@linux.ibm.com>

On Fri, 2022-12-16 at 15:10 -0500, Matthew Rosato wrote:
> On 11/21/22 4:40 PM, Eric Farman wrote:
> > Everything about this allocation is harder than necessary,
> > since the memory allocation is already aligned to our needs.
> > Break them apart for readability, instead of doing the
> > funky arithmetic.
> > 
> > Of the structures that are involved, only ch_ccw needs the
> > GFP_DMA flag, so the others can be allocated without it.
> > 
> > Signed-off-by: Eric Farman <farman@linux.ibm.com>
> > ---
> >  drivers/s390/cio/vfio_ccw_cp.c | 39 ++++++++++++++++++------------
> > ----
> >  1 file changed, 21 insertions(+), 18 deletions(-)
> > 
> > diff --git a/drivers/s390/cio/vfio_ccw_cp.c
> > b/drivers/s390/cio/vfio_ccw_cp.c
> > index d41d94cecdf8..4b6b5f9dc92d 100644
> > --- a/drivers/s390/cio/vfio_ccw_cp.c
> > +++ b/drivers/s390/cio/vfio_ccw_cp.c
> > @@ -311,40 +311,41 @@ static inline int is_tic_within_range(struct
> > ccw1 *ccw, u32 head, int len)
> >  static struct ccwchain *ccwchain_alloc(struct channel_program *cp,
> > int len)
> >  {
> >         struct ccwchain *chain;
> > -       void *data;
> > -       size_t size;
> > -
> > -       /* Make ccw address aligned to 8. */
> > -       size = ((sizeof(*chain) + 7L) & -8L) +
> > -               sizeof(*chain->ch_ccw) * len +
> > -               sizeof(*chain->ch_pa) * len;
> > -       chain = kzalloc(size, GFP_DMA | GFP_KERNEL);
> > +
> > +       chain = kzalloc(sizeof(*chain), GFP_KERNEL);
> 
> I suppose you could consider a WARN_ONCE here if one of these
> kzalloc'd addresses has something in the low-order 3 bits; would
> probably make it more obvious if for some reason the alignment
> guarantee was broken vs some status after-the-fact in the IRB.  But
> as per our discussion off-list I think that can only happen if
> ARCH_KMALLOC_MINALIGN were to change.

Yeah, maybe, but the "status after-the-fact" is a program check that
would be generated by the channel, just as would be done if the ORB was
located in a similarly-weird location (which we don't check for
either). Since this is all mainline paths, I don't think it makes sense
to re-check all those possible permutations here.

(And, for what it's worth, it's not this allocation that matters, but
rather the one that gets stuffed into the ORB below [1])
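
For reference, the check suggested above boils down to testing the low-order 3 bits of an address. A minimal user-space sketch, assuming a hypothetical helper name (is_aligned_8 is not part of the driver; in the kernel one would more likely use IS_ALIGNED() inside a WARN_ONCE()):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only: returns true when addr
 * sits on a doubleword (8-byte) boundary.
 */
static bool is_aligned_8(const void *addr)
{
	/* An 8-byte-aligned address has its low-order 3 bits clear. */
	return ((uintptr_t)addr & 7) == 0;
}
```

Called on the pointer that ends up in the ORB, this would return false for any address with one of the low-order 3 bits set.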

> 
> >         if (!chain)
> >                 return NULL;
> >  
> > -       data = (u8 *)chain + ((sizeof(*chain) + 7L) & -8L);
> > -       chain->ch_ccw = (struct ccw1 *)data;
> > -
> > -       data = (u8 *)(chain->ch_ccw) + sizeof(*chain->ch_ccw) * len;
> > -       chain->ch_pa = (struct page_array *)data;
> > +       chain->ch_ccw = kcalloc(len, sizeof(*chain->ch_ccw), GFP_DMA | GFP_KERNEL);

[1]

> > +       if (!chain->ch_ccw)
> > +               goto out_err;
> >  
> > -       chain->ch_len = len;
> > +       chain->ch_pa = kcalloc(len, sizeof(*chain->ch_pa), GFP_KERNEL);
> > +       if (!chain->ch_pa)
> > +               goto out_err;
> >  
> >         list_add_tail(&chain->next, &cp->ccwchain_list);
> >  
> >         return chain;
> > +
> > +out_err:
> > +       kfree(chain->ch_ccw);
> > +       kfree(chain);
> > +       return NULL;
> >  }
> >  
> >  static void ccwchain_free(struct ccwchain *chain)
> >  {
> >         list_del(&chain->next);
> > +       kfree(chain->ch_pa);
> > +       kfree(chain->ch_ccw);
> >         kfree(chain);
> >  }
> >  
> >  /* Free resource for a ccw that allocated memory for its cda. */
> >  static void ccwchain_cda_free(struct ccwchain *chain, int idx)
> >  {
> > -       struct ccw1 *ccw = chain->ch_ccw + idx;
> > +       struct ccw1 *ccw = &chain->ch_ccw[idx];
> >  
> >         if (ccw_is_tic(ccw))
> >                 return;
> > @@ -443,6 +444,8 @@ static int ccwchain_handle_ccw(u32 cda, struct
> > channel_program *cp)
> >         chain = ccwchain_alloc(cp, len);
> >         if (!chain)
> >                 return -ENOMEM;
> > +
> > +       chain->ch_len = len;
> >         chain->ch_iova = cda;
> >  
> >         /* Copy the actual CCWs into the new chain */
> > @@ -464,7 +467,7 @@ static int ccwchain_loop_tic(struct ccwchain
> > *chain, struct channel_program *cp)
> >         int i, ret;
> >  
> >         for (i = 0; i < chain->ch_len; i++) {
> > -               tic = chain->ch_ccw + i;
> > +               tic = &chain->ch_ccw[i];
> 
> These don't seem equivalent...  Before at each iteration you'd offset
> tic by i bytes, now you're treating i as an index of 8B ccw1 structs,
> so it seems like this went from tic = x + i to tic = x + (8 * i)? 
> Was the old code broken or am I missing something? 

I think the latter. :) The old code did one allocation measured in
bytes, stored it in chain, and then calculated locations within that
for ch_ccw and ch_pa, cast to the respective pointer types. (See the
reference [1] above.)
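
To make the old layout concrete, here is a rough user-space sketch of that byte arithmetic. The struct and function names are made up for illustration, and the element sizes are passed in as parameters rather than taken from the real ccw1 and page_array definitions:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of where each piece lands in the single buffer. */
struct old_layout {
	size_t ccw_off;	/* byte offset of the CCW array */
	size_t pa_off;	/* byte offset of the page_array array */
	size_t total;	/* bytes requested from the allocator */
};

/* Round up to a multiple of 8; same effect as (size + 7L) & -8L. */
static size_t round_up_8(size_t size)
{
	return (size + 7) & ~(size_t)7;
}

/* Mirror the removed size/offset computation from ccwchain_alloc(). */
static struct old_layout old_chain_layout(size_t hdr_size, size_t ccw_size,
					  size_t pa_size, size_t len)
{
	struct old_layout l;

	l.ccw_off = round_up_8(hdr_size);
	l.pa_off = l.ccw_off + ccw_size * len;
	l.total = l.pa_off + pa_size * len;
	return l;
}
```

The byte offsets are used only once, to set up the typed ch_ccw and ch_pa pointers. After that, every "+ i" on those pointers is scaled by the element size.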

So any use of "i" was an index into the pointer types, and thus already
an "8 * i" addition from your example. My intention here was to remove
the pseudo-assembly above, and I changed these along the way as I was
untangling everything. Looking at the resulting assembly before/after,
these hunks don't end up changing at all, so I'll back these changes
out. Especially since...

> 
> >  
> >                 if (!ccw_is_tic(tic))
> >                         continue;
> > @@ -739,8 +742,8 @@ int cp_prefetch(struct channel_program *cp)
> >         list_for_each_entry(chain, &cp->ccwchain_list, next) {
> >                 len = chain->ch_len;
> >                 for (idx = 0; idx < len; idx++) {
> > -                       ccw = chain->ch_ccw + idx;
> > -                       pa = chain->ch_pa + idx;
> > +                       ccw = &chain->ch_ccw[idx];
> > +                       pa = &chain->ch_pa[idx];
> 
> Same sort of question re: ch_pa

...this prompted me to notice that I didn't change the users of
"chain->ch_pa + i" when calling page_array_unpin_free(), so now we have
both flavors, which isn't ideal.

BEFORE:
                        ccw = chain->ch_ccw + idx;
                        pa = chain->ch_pa + idx;
    1536:       eb 3b 00 01 00 0d       sllg    %r3,%r11,1
    153c:       b9 08 00 3b             agr     %r3,%r11
    1540:       eb 33 00 03 00 0d       sllg    %r3,%r3,3
    1546:       e3 30 80 28 00 08       ag      %r3,40(%r8)
                        ccw = chain->ch_ccw + idx;
    154c:       eb 2b 00 03 00 0d       sllg    %r2,%r11,3
    1552:       e3 20 80 10 00 08       ag      %r2,16(%r8)
AFTER:
                        ccw = &chain->ch_ccw[idx];
                        pa = &chain->ch_pa[idx];
    15be:       eb 3b 00 01 00 0d       sllg    %r3,%r11,1
    15c4:       b9 08 00 3b             agr     %r3,%r11
    15c8:       eb 33 00 03 00 0d       sllg    %r3,%r3,3
    15ce:       e3 30 80 28 00 08       ag      %r3,40(%r8)
                        ccw = &chain->ch_ccw[idx];
    15d4:       eb 2b 00 03 00 0d       sllg    %r2,%r11,3
    15da:       e3 20 80 10 00 08       ag      %r2,16(%r8)
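
In both listings idx is scaled by 8 for ch_ccw (sllg by 3) and by 24 for ch_pa (idx * 3, then sllg by 3). The same equivalence can be checked in plain C. The stand-in structs below are assumptions sized to match those scale factors, not the kernel definitions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-ins: 8 and 24 bytes, like the scale factors above. */
struct fake_ccw { uint64_t raw; };
struct fake_pa { uint64_t a, b, c; };

static_assert(sizeof(struct fake_ccw) == 8, "stand-in ccw must be 8 bytes");
static_assert(sizeof(struct fake_pa) == 24, "stand-in pa must be 24 bytes");

/* Byte distance from base to "base + i" (pointer arithmetic form). */
static size_t ccw_off_add(const struct fake_ccw *base, size_t i)
{
	return (size_t)((const char *)(base + i) - (const char *)base);
}

/* Byte distance from base to "&base[i]" (array index form). */
static size_t ccw_off_index(const struct fake_ccw *base, size_t i)
{
	return (size_t)((const char *)&base[i] - (const char *)base);
}

static size_t pa_off_add(const struct fake_pa *base, size_t i)
{
	return (size_t)((const char *)(base + i) - (const char *)base);
}

static size_t pa_off_index(const struct fake_pa *base, size_t i)
{
	return (size_t)((const char *)&base[i] - (const char *)base);
}
```

For any valid index both forms yield the same byte offset (i * 8 and i * 24 here), which is why the compiler emits identical code for the two spellings.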


> 
> >  
> >                         ret = ccwchain_fetch_one(ccw, pa, cp);
> >                         if (ret)
> 

