All of lore.kernel.org
 help / color / mirror / Atom feed
From: John Garry <john.garry@huawei.com>
To: Damien Le Moal <damien.lemoal@opensource.wdc.com>,
	Oliver Sang <oliver.sang@intel.com>
Cc: Christoph Hellwig <hch@lst.de>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	LKML <linux-kernel@vger.kernel.org>,
	"Linux Memory Management List" <linux-mm@kvack.org>,
	<linux-ide@vger.kernel.org>, <lkp@lists.01.org>, <lkp@intel.com>,
	<ying.huang@intel.com>, <feng.tang@intel.com>,
	<zhengjun.xing@linux.intel.com>, <fengwei.yin@intel.com>
Subject: Re: [ata] 0568e61225: stress-ng.copy-file.ops_per_sec -15.0% regression
Date: Tue, 16 Aug 2022 21:44:33 +0100	[thread overview]
Message-ID: <c93e529d-b688-9910-50c4-779c2f85fbc3@huawei.com> (raw)
In-Reply-To: <05a48c68-33ae-10e2-e565-6c124bad93c5@opensource.wdc.com>

On 16/08/2022 21:02, Damien Le Moal wrote:
>> ou confirm this? Thanks!
>>
>> On this basis, it appears that max_hw_sectors_kb is getting capped from
>> scsi default @ 1024 sectors by commit 0568e61225. If it were getting
>> capped by swiotlb mapping limit then that would give us 512 sectors -
>> this value is fixed.
>>
>> So for my SHT change proposal I am just trying to revert to previous
>> behaviour in 5.19 - make max_hw_sectors_kb crazy big again.
> I reread the entire thing and I think I got things reverted here. The perf
> regression happens with the 512/512 settings, while the original 1280/32767
> before your patches was OK.

Right, that's as I read it. It would be useful for Oliver to confirm the 
results.

> So is your patch defining the optimal mapping size
> cause the reduction to 512/512.

The optimal mapping size only affects specifically sas controllers, so I 
think that we can ignore that one for now. The reduction to 512/512 
comes from the change in ata_scsi_dev_config().

> It would mean that for ATA, we need a sane
> default mapping instead of SCSI default 1024 sectors.

Right

> Now I understand your
> proposed change using ATA_MAX_SECTORS_LBA48.
> 
> However, that would be correct only for LBA48 capable drives.
> ata_dev_configure() already sets dev->max_sectors correctly according to the
> drive type, capabilities and eventual quirks. So the problem comes from the
> libata-scsi change:
> 
> dev->max_sectors = min(dev->max_sectors, sdev->host->max_sectors);
> 
> when sdev->host->max_sectors is 0 (not initialized).

That cannot happen. If sht->max_sectors is 0, then we set 
shost->max_sectors at SCSI default 1024 sectors in scsi_host_alloc()

For my proposed change, dev->max_sectors would still be initialized in 
ata_dev_configure() according to drive type, etc. And it should be <= 
LBA48 max sectors (=65535).

So then in ata_scsi_dev_config():

dev->max_sectors = min(dev->max_sectors, sdev->host->max_sectors)

this only has an impact for ahci controllers if sdev->host->max_sectors 
was capped according to host dma dev max mapping size.

I will admit that this is not ideal. An alt approach is to change 
ata_scsi_dev_config() to cap the dev max_sectors only according to shost 
dma dev mapping limit (similar to scsi_add_host_with_dma()), but that 
would not work for a controller like ipr, which does have a geniune 
max_sectors limit (which we should respect).

Thanks,
John


> So maybe simply changing
> this line to:
> 
> dev->max_sectors = min_not_zero(dev->max_sectors, sdev->host->max_sectors);
> 
> would do the trick ? Any particular adapter driver that needs a mapping cap on
> the adpter max mapping size can still set sdev->host->max_sectors as needed, and
> we keep the same defaults as before when it is not set. Thoughts ? Or am I
> missing something else ?
> 
> 
>>> The regression may come not from commands becoming tiny, but from the fact that
>>> after the patch, max_sectors_kb is too large,
>> I don't think it is, but need confirmation.
>>
>>> causing a lot of overhead with
>>> qemu swiotlb mapping and slowing down IO processing.
>>> Above, it can be seen that we ed up with max_sectors_kb being 1280, which is the
>>> default for most scsi disks (including ATA drives). That is normal. But before
>>> that, it was 512, which likely better fits qemu swiotlb and does not generate
>> Again, I don't think this this is the case. Need confirmation.
>>
>>> overhead. So the above fix will not change anything I think...


WARNING: multiple messages have this Message-ID (diff)
From: John Garry <john.garry@huawei.com>
To: lkp@lists.01.org
Subject: Re: [ata] 0568e61225: stress-ng.copy-file.ops_per_sec -15.0% regression
Date: Tue, 16 Aug 2022 21:44:33 +0100	[thread overview]
Message-ID: <c93e529d-b688-9910-50c4-779c2f85fbc3@huawei.com> (raw)
In-Reply-To: <05a48c68-33ae-10e2-e565-6c124bad93c5@opensource.wdc.com>

[-- Attachment #1: Type: text/plain, Size: 3588 bytes --]

On 16/08/2022 21:02, Damien Le Moal wrote:
>> ou confirm this? Thanks!
>>
>> On this basis, it appears that max_hw_sectors_kb is getting capped from
>> scsi default @ 1024 sectors by commit 0568e61225. If it were getting
>> capped by swiotlb mapping limit then that would give us 512 sectors -
>> this value is fixed.
>>
>> So for my SHT change proposal I am just trying to revert to previous
>> behaviour in 5.19 - make max_hw_sectors_kb crazy big again.
> I reread the entire thing and I think I got things reverted here. The perf
> regression happens with the 512/512 settings, while the original 1280/32767
> before your patches was OK.

Right, that's as I read it. It would be useful for Oliver to confirm the 
results.

> So is your patch defining the optimal mapping size
> cause the reduction to 512/512.

The optimal mapping size only affects specifically sas controllers, so I 
think that we can ignore that one for now. The reduction to 512/512 
comes from the change in ata_scsi_dev_config().

> It would mean that for ATA, we need a sane
> default mapping instead of SCSI default 1024 sectors.

Right

> Now I understand your
> proposed change using ATA_MAX_SECTORS_LBA48.
> 
> However, that would be correct only for LBA48 capable drives.
> ata_dev_configure() already sets dev->max_sectors correctly according to the
> drive type, capabilities and eventual quirks. So the problem comes from the
> libata-scsi change:
> 
> dev->max_sectors = min(dev->max_sectors, sdev->host->max_sectors);
> 
> when sdev->host->max_sectors is 0 (not initialized).

That cannot happen. If sht->max_sectors is 0, then we set 
shost->max_sectors at SCSI default 1024 sectors in scsi_host_alloc()

For my proposed change, dev->max_sectors would still be initialized in 
ata_dev_configure() according to drive type, etc. And it should be <= 
LBA48 max sectors (=65535).

So then in ata_scsi_dev_config():

dev->max_sectors = min(dev->max_sectors, sdev->host->max_sectors)

this only has an impact for ahci controllers if sdev->host->max_sectors 
was capped according to host dma dev max mapping size.

I will admit that this is not ideal. An alt approach is to change 
ata_scsi_dev_config() to cap the dev max_sectors only according to shost 
dma dev mapping limit (similar to scsi_add_host_with_dma()), but that 
would not work for a controller like ipr, which does have a geniune 
max_sectors limit (which we should respect).

Thanks,
John


> So maybe simply changing
> this line to:
> 
> dev->max_sectors = min_not_zero(dev->max_sectors, sdev->host->max_sectors);
> 
> would do the trick ? Any particular adapter driver that needs a mapping cap on
> the adpter max mapping size can still set sdev->host->max_sectors as needed, and
> we keep the same defaults as before when it is not set. Thoughts ? Or am I
> missing something else ?
> 
> 
>>> The regression may come not from commands becoming tiny, but from the fact that
>>> after the patch, max_sectors_kb is too large,
>> I don't think it is, but need confirmation.
>>
>>> causing a lot of overhead with
>>> qemu swiotlb mapping and slowing down IO processing.
>>> Above, it can be seen that we ed up with max_sectors_kb being 1280, which is the
>>> default for most scsi disks (including ATA drives). That is normal. But before
>>> that, it was 512, which likely better fits qemu swiotlb and does not generate
>> Again, I don't think this this is the case. Need confirmation.
>>
>>> overhead. So the above fix will not change anything I think...

  reply	other threads:[~2022-08-16 20:44 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-05  8:05 [ata] 0568e61225: stress-ng.copy-file.ops_per_sec -15.0% regression kernel test robot
2022-08-05  8:05 ` kernel test robot
2022-08-08 14:52 ` Damien Le Moal
2022-08-08 14:52   ` Damien Le Moal
2022-08-09  9:58   ` John Garry
2022-08-09  9:58     ` John Garry
2022-08-09 14:16     ` John Garry
2022-08-09 14:16       ` John Garry
2022-08-09 14:57       ` Damien Le Moal
2022-08-09 14:57         ` Damien Le Moal
2022-08-10  8:33         ` John Garry
2022-08-10  8:33           ` John Garry
2022-08-10 13:52           ` Damien Le Moal
2022-08-10 13:52             ` Damien Le Moal
2022-08-09 14:55     ` Damien Le Moal
2022-08-09 14:55       ` Damien Le Moal
2022-08-09 15:16       ` David Laight
2022-08-09 15:16         ` David Laight
2022-08-10 13:57         ` Damien Le Moal
2022-08-10 13:57           ` Damien Le Moal
2022-08-12  5:01       ` Oliver Sang
2022-08-12  5:01         ` Oliver Sang
2022-08-12 11:13         ` John Garry
2022-08-12 11:13           ` John Garry
2022-08-12 14:58           ` John Garry
2022-08-12 14:58             ` John Garry
2022-08-16  6:57             ` Oliver Sang
2022-08-16  6:57               ` Oliver Sang
2022-08-16 10:35               ` John Garry
2022-08-16 10:35                 ` John Garry
2022-08-16 15:42                 ` Damien Le Moal
2022-08-16 15:42                   ` Damien Le Moal
2022-08-16 16:38                   ` John Garry
2022-08-16 16:38                     ` John Garry
2022-08-16 20:02                     ` Damien Le Moal
2022-08-16 20:02                       ` Damien Le Moal
2022-08-16 20:44                       ` John Garry [this message]
2022-08-16 20:44                         ` John Garry
2022-08-17 15:55                         ` Damien Le Moal
2022-08-17 15:55                           ` Damien Le Moal
2022-08-17 13:51                     ` Oliver Sang
2022-08-17 13:51                       ` Oliver Sang
2022-08-17 14:04                       ` John Garry
2022-08-17 14:04                         ` John Garry
2022-08-18  2:06                         ` Oliver Sang
2022-08-18  2:06                           ` Oliver Sang
2022-08-18  9:28                           ` John Garry
2022-08-18  9:28                             ` John Garry
2022-08-19  6:24                             ` Oliver Sang
2022-08-19  6:24                               ` Oliver Sang
2022-08-19  7:54                               ` John Garry
2022-08-19  7:54                                 ` John Garry
2022-08-20 16:36                               ` Damien Le Moal
2022-08-20 16:36                                 ` Damien Le Moal
2022-08-12 15:41           ` Damien Le Moal
2022-08-12 15:41             ` Damien Le Moal
2022-08-12 17:17             ` John Garry
2022-08-12 17:17               ` John Garry
2022-08-12 18:27               ` Damien Le Moal
2022-08-12 18:27                 ` Damien Le Moal
2022-08-13  7:23                 ` John Garry
2022-08-13  7:23                   ` John Garry
2022-08-16  2:52           ` Oliver Sang
2022-08-16  2:52             ` Oliver Sang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c93e529d-b688-9910-50c4-779c2f85fbc3@huawei.com \
    --to=john.garry@huawei.com \
    --cc=damien.lemoal@opensource.wdc.com \
    --cc=feng.tang@intel.com \
    --cc=fengwei.yin@intel.com \
    --cc=hch@lst.de \
    --cc=linux-ide@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lkp@intel.com \
    --cc=lkp@lists.01.org \
    --cc=martin.petersen@oracle.com \
    --cc=oliver.sang@intel.com \
    --cc=ying.huang@intel.com \
    --cc=zhengjun.xing@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.