On Wed, 2019-04-24 at 14:07 -0600, Keith Busch wrote:
> On Wed, Apr 24, 2019 at 09:55:16AM -0700, Sagi Grimberg wrote:
> > 
> > > As different nvme controllers are connected via different fabrics,
> > > some require different timeout settings than others. This series
> > > implements per-controller timeouts in the nvme subsystem which can
> > > be set via sysfs.
> > 
> > How much of a real issue is this?
> > 
> > block io_timeout defaults to 30 seconds, which is considered a
> > universal eternity for pretty much any nvme fabric. Moreover,
> > io_timeout is mutable already on a per-namespace level.
> > 
> > This leaves the admin_timeout which goes beyond this to 60 seconds...
> > 
> > Can you describe what exactly you are trying to solve?
> 
> I think they must have an nvme target that is backed by slow media
> (i.e. non-SSD). If that's the case, I think it may be a better option
> if the target advertises relatively shallow queue depths and/or lower
> MDTS that better aligns to the backing storage capabilities.

It isn't that the media is slow; the max timeout is based on the SLA for
certain classes of "fabric" outages. Linux copes *really* badly with I/O
errors, and if we can make the timeout last long enough to cover the
worst-case switch restart, then users are a lot happier.
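
For context, the per-namespace knob Sagi mentions is the block queue's
io_timeout attribute. A minimal sketch of how we bump it from userspace
today (the device path and the 120-second value are just examples, and
the attribute takes milliseconds on the kernels I've looked at):

/*
 * Sketch only: raise the existing per-namespace I/O timeout via sysfs.
 * Example path and value; needs root and a kernel that exposes
 * /sys/block/<ns>/queue/io_timeout.
 */
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
	const char *path = argc > 1 ? argv[1]
				    : "/sys/block/nvme0n1/queue/io_timeout";
	const char *msecs = argc > 2 ? argv[2] : "120000";	/* 120s */
	FILE *f = fopen(path, "w");

	if (!f) {
		perror(path);
		return EXIT_FAILURE;
	}
	fprintf(f, "%s\n", msecs);
	if (fclose(f)) {
		perror(path);
		return EXIT_FAILURE;
	}
	return EXIT_SUCCESS;
}

What we can't do that way is cover the admin queue, or set one value for
the controller as a whole rather than namespace by namespace, which is
what the series is after.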