linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Pe: [PATCH v5 1/3] virtio-scsi: first version
@ 2012-02-06  9:51 Christian Hoff
  2012-02-07  9:54 ` Paolo Bonzini
  0 siblings, 1 reply; 34+ messages in thread
From: Christian Hoff @ 2012-02-06  9:51 UTC (permalink / raw)
  To: pbonzini, linux-kernel; +Cc: linux-scsi, rusty, mst, kvm

Hello Paolo,

first let me say that your patch is working fine on my local clone of the 
qemu repository.

Let me ask just one question about the format of the data being 
transmitted over the virtqueue.

Paolo Bonzini wrote:
+                cmd->req.cmd = (struct virtio_scsi_cmd_req){
+                                .lun[0] = 1,
+                                .lun[1] = sc->device->id,
+                                .lun[2] = (sc->device->lun >> 8) | 0x40,
+                                .lun[3] = sc->device->lun & 0xff,
+                               [...]
+                };

Can't we have seperate fields for the SCSI target ID and the LUN number 
here? Putting all this into a single field seems confusing. The following 
line of code (sc->device->lun >> 8) | 0x40 essentially means that LUN 
numbers will be limited to 8+6 Bits=14 Bits for no obvious reason that I 
can see. Maybe we could just split the LUN field up into two uint32 fields 
for target ID and LUN number?

Also, lun[1] = sc->device->id means that only 255 SCSI target IDs will be 
supported. Think about bigger usage scenarios, such as FCP networks with 
several hundred HBAs in the net. If you want to have the target ID<->HBA 
mapping the same as on the guest as on the host, then 255 virtual target 
IDs could be a limit.

Sorry for coming up so late with these suggestions. I hope there is still 
enough time left to discuss and address these problems.

Mit freundlichen Grüßen / Kind regards

Christian Hoff

Student - Applied Computer Science


Phone:
49-16098976-950
 IBM Deutschland

E-Mail:
christian.hoff@de.ibm.com
 Am Fichtenberg 1


 71083 Herrenberg


 Germany


IBM Deutschland GmbH / Vorsitzender des Aufsichtsrats: Martin Jetter
Geschäftsführung: Martina Koederitz (Vorsitzende), Reinhard Reschke, 
Dieter Scholz, Gregor Pillen, Joachim Heel, Christian Noll
Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, 
HRB 14562 / WEEE-Reg.-Nr. DE 99369940 


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Pe: [PATCH v5 1/3] virtio-scsi: first version
  2012-02-06  9:51 Pe: [PATCH v5 1/3] virtio-scsi: first version Christian Hoff
@ 2012-02-07  9:54 ` Paolo Bonzini
  2012-02-07 11:10   ` Michael S. Tsirkin
  2012-02-07 11:56   ` Christian Borntraeger
  0 siblings, 2 replies; 34+ messages in thread
From: Paolo Bonzini @ 2012-02-07  9:54 UTC (permalink / raw)
  To: Christian Hoff; +Cc: linux-kernel, linux-scsi, rusty, mst, kvm

On 02/06/2012 10:51 AM, Christian Hoff wrote:
> Hello Paolo,
>
> first let me say that your patch is working fine on my local clone of the
> qemu repository.
>
> Let me ask just one question about the format of the data being
> transmitted over the virtqueue.
>
> Paolo Bonzini wrote:
> +                cmd->req.cmd = (struct virtio_scsi_cmd_req){
> +                                .lun[0] = 1,
> +                                .lun[1] = sc->device->id,
> +                                .lun[2] = (sc->device->lun>>  8) | 0x40,
> +                                .lun[3] = sc->device->lun&  0xff,
> +                               [...]
> +                };
>
> Can't we have seperate fields for the SCSI target ID and the LUN number
> here? Putting all this into a single field seems confusing. The following
> line of code (sc->device->lun>>  8) | 0x40 essentially means that LUN
> numbers will be limited to 8+6 Bits=14 Bits for no obvious reason that I
> can see. Maybe we could just split the LUN field up into two uint32 fields
> for target ID and LUN number?

The 14-bit limitation can be lifted.  SAM defines a 24-bit LUN format 
too, but I've never seen it used in practice.

> Also, lun[1] = sc->device->id means that only 255 SCSI target IDs will be
> supported. Think about bigger usage scenarios, such as FCP networks with
> several hundred HBAs in the net. If you want to have the target ID<->HBA
> mapping the same as on the guest as on the host, then 255 virtual target
> IDs could be a limit.

I think you would hit other scalability limitations well before that.  I 
plan to give each target its own MSI-X interrupt, but there is no 
infinite supplies of those either.

VMware only supports 15 targets and 255 LUNs per host, by comparison.  I 
think 255 targets and 16383 LUNs is already several times more than is 
actually needed.  But in any case, we could still use the fixed "1" byte 
to go beyond 255 targets.

> Sorry for coming up so late with these suggestions. I hope there is still
> enough time left to discuss and address these problems.

Sure. :)  I hope the above answer is satisfactory, though.

Paolo

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Pe: [PATCH v5 1/3] virtio-scsi: first version
  2012-02-07  9:54 ` Paolo Bonzini
@ 2012-02-07 11:10   ` Michael S. Tsirkin
  2012-02-07 11:26     ` Paolo Bonzini
  2012-02-07 11:56   ` Christian Borntraeger
  1 sibling, 1 reply; 34+ messages in thread
From: Michael S. Tsirkin @ 2012-02-07 11:10 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Christian Hoff, linux-kernel, linux-scsi, rusty, kvm

On Tue, Feb 07, 2012 at 10:54:54AM +0100, Paolo Bonzini wrote:
> >Also, lun[1] = sc->device->id means that only 255 SCSI target IDs will be
> >supported. Think about bigger usage scenarios, such as FCP networks with
> >several hundred HBAs in the net. If you want to have the target ID<->HBA
> >mapping the same as on the guest as on the host, then 255 virtual target
> >IDs could be a limit.
> 
> I think you would hit other scalability limitations well before
> that.  I plan to give each target its own MSI-X interrupt, but there
> is no infinite supplies of those either.

virtio-pci generally lets guests share MSI-X vectors between queues,
why not allow this here?

-- 
MST

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Pe: [PATCH v5 1/3] virtio-scsi: first version
  2012-02-07 11:10   ` Michael S. Tsirkin
@ 2012-02-07 11:26     ` Paolo Bonzini
  0 siblings, 0 replies; 34+ messages in thread
From: Paolo Bonzini @ 2012-02-07 11:26 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Christian Hoff, linux-kernel, linux-scsi, rusty, kvm

On 02/07/2012 12:10 PM, Michael S. Tsirkin wrote:
>>> Also, lun[1] = sc->device->id means that only 255 SCSI target IDs will be
>>> >  >supported. Think about bigger usage scenarios, such as FCP networks with
>>> >  >several hundred HBAs in the net. If you want to have the target ID<->HBA
>>> >  >mapping the same as on the guest as on the host, then 255 virtual target
>>> >  >IDs could be a limit.
>> >
>> >  I think you would hit other scalability limitations well before
>> >  that.  I plan to give each target its own MSI-X interrupt, but there
>> >  is no infinite supplies of those either.
> virtio-pci generally lets guests share MSI-X vectors between queues,
> why not allow this here?

Yes, of course.  However, with dozens of queues, many of them will share 
the same vector and all of them will be examined when you get the 
interrupt.  Even if you find the right balance between sharing (because 
you have to) and separating (because of scalability), I wouldn't be 
surprised if more than 255 targets do not work too well.

Anyway multiqueue is not even in this patchset, so there's more work to 
do before we can worry. :)

Paolo

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Pe: [PATCH v5 1/3] virtio-scsi: first version
  2012-02-07  9:54 ` Paolo Bonzini
  2012-02-07 11:10   ` Michael S. Tsirkin
@ 2012-02-07 11:56   ` Christian Borntraeger
  2012-02-07 12:31     ` Paolo Bonzini
  1 sibling, 1 reply; 34+ messages in thread
From: Christian Borntraeger @ 2012-02-07 11:56 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Christian Hoff, linux-kernel, linux-scsi, rusty, mst, kvm

>> +                cmd->req.cmd = (struct virtio_scsi_cmd_req){
>> +                                .lun[0] = 1,
>> +                                .lun[1] = sc->device->id,
>> +                                .lun[2] = (sc->device->lun>>  8) | 0x40,
>> +                                .lun[3] = sc->device->lun&  0xff,
>> +                               [...]
>> +                };
>>
>> Can't we have seperate fields for the SCSI target ID and the LUN number
>> here? Putting all this into a single field seems confusing. The following
>> line of code (sc->device->lun>>  8) | 0x40 essentially means that LUN
>> numbers will be limited to 8+6 Bits=14 Bits for no obvious reason that I
>> can see. Maybe we could just split the LUN field up into two uint32 fields
>> for target ID and LUN number?
> 
> The 14-bit limitation can be lifted.  SAM defines a 24-bit LUN format too, 
> but I've never seen it used in practice.

Why not lift that limitation before the first version is committed upstream?

As far as I see we have to allocate multiple target ids if we want
to provide multipath (e.g. 8 target ids if there are 8 pathes, thus limiting
ourselves to 64 targets, no?)

As a compromise between space/flexibility, cant we just split the 4 bytes
in a similar fashion as our major/minor numbers (12/20bit)?

In the mainframe area I have seen real-life problems hitting the 64k 
device limit for disks (mostly due to multipathing), it was extended
afterwards, but every extension tends to make an interface less
appealing. (look at some virtio drivers and you will find places
were feature bits made the code "less pretty")
> 
>> Also, lun[1] = sc->device->id means that only 255 SCSI target IDs will be
>> supported. Think about bigger usage scenarios, such as FCP networks with
>> several hundred HBAs in the net. If you want to have the target ID<->HBA
>> mapping the same as on the guest as on the host, then 255 virtual target
>> IDs could be a limit.
> 
> I think you would hit other scalability limitations well before that.

Performance might be fixed later, but the interface is then settled.
[..]

> But in any case, we could still use the fixed "1" byte to go beyond 255 targets.

Again, why not now? Any extension would require a feature bit, no?
Is there a technical reason why a fixed 1 here is better than just using that space
as scsi id? (e.g. hash table sizes, lookup etc)

Regards

Christian

PS: what puzzles me as well, is the fact that struct virtio_scsi_cmd_req has lun[8]
in the structure. That would sum up as 5 bytes wasted of those 8, no?


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Pe: [PATCH v5 1/3] virtio-scsi: first version
  2012-02-07 11:56   ` Christian Borntraeger
@ 2012-02-07 12:31     ` Paolo Bonzini
  2012-02-07 13:18       ` Christian Borntraeger
  0 siblings, 1 reply; 34+ messages in thread
From: Paolo Bonzini @ 2012-02-07 12:31 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: Christian Hoff, linux-kernel, linux-scsi, rusty, mst, kvm

On 02/07/2012 12:56 PM, Christian Borntraeger wrote:
>> The 14-bit limitation can be lifted.  SAM defines a 24-bit LUN format too,
>> but I've never seen it used in practice.
>
> Why not lift that limitation before the first version is committed upstream?

Because nobody supports >14-bit LUNs.  The in-kernel LIO target doesn't, 
I don't know anything that does.

> As far as I see we have to allocate multiple target ids if we want
> to provide multipath (e.g. 8 target ids if there are 8 pathes, thus limiting
> ourselves to 64 targets, no?)

Well, 256/8 is actually 32, but yes. :)  But it's more likely that you 
would do multipathing and ALUA on the host, and present a single path to 
the guest using the QEMU SCSI target.  The guest would see a single disk 
that just works.

> As a compromise between space/flexibility, cant we just split the 4 bytes
> in a similar fashion as our major/minor numbers (12/20bit)?

The structure of the LUN is defined by SAM, not by me.  I specified a 
mandatory subset for two reason: 1) simplicity of implementation; 2) 
nobody supports the full hierarchical LUNs spec (SCSI has a lot of 
fringe features), so you need to standardize on something to start with.

Also, you can always have more than one HBA.  You do not even have to 
hotplug the HBAs, you can start a guest with 8 HBAs on a single 
multifunction PCI device, and add disks to them as you see fit.

> Again, why not now? Any extension would require a feature bit, no?

No, because the guest would simply scan a wider LUN space, and it would 
not find anything on older hosts.  It could also look at the 
max_channel/max_target/max_lun fields in the configuration and print a 
warning to the user that some targets may not be accessible.

Paolo

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Pe: [PATCH v5 1/3] virtio-scsi: first version
  2012-02-07 12:31     ` Paolo Bonzini
@ 2012-02-07 13:18       ` Christian Borntraeger
  2012-02-07 13:59         ` Christian Hoff
  0 siblings, 1 reply; 34+ messages in thread
From: Christian Borntraeger @ 2012-02-07 13:18 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Christian Hoff, linux-kernel, linux-scsi, rusty, mst, kvm

On 07/02/12 13:31, Paolo Bonzini wrote:
> The structure of the LUN is defined by SAM, not by me. 

Ok, got it.


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Pe: [PATCH v5 1/3] virtio-scsi: first version
  2012-02-07 13:18       ` Christian Borntraeger
@ 2012-02-07 13:59         ` Christian Hoff
  2012-02-07 14:28           ` Paolo Bonzini
  0 siblings, 1 reply; 34+ messages in thread
From: Christian Hoff @ 2012-02-07 13:59 UTC (permalink / raw)
  To: BORNTRAE; +Cc: kvm, linux-kernel, linux-scsi, mst, Paolo Bonzini, rusty

Hello Paolo,

On 07/02/12 13:31, Paolo Bonzini wrote:
> The structure of the LUN is defined by SAM, not by me. 

Just had a look at the SAM LUN structure. It appears that you are using 
the SCSI format for hierarchical LUNs(SAM 5 chapter 4.7.6) in order to 
indicate that the LUN is actually attached to the host system and not to 
the Qemu guest operating system itself.

What I am wondering is why you are doing this? The hierarchical SCSI 
format is only retained while the LUN number is being transmitted over the 
virtqueue, so there does not seem to be a technical necessity for using 
this format in order to transmit SCSI target ID and LUN number information 
to the host.

Instead the format has some disadvantages:
- It uses up 8 bytes where 3 bytes would be sufficient in order to store 
both the target ID and LUN number information
- The format limits us to 255 target IDs. I agree that the LUN limit is 
probably more a theoretical and not a practical one, but 255 target IDs 
could become a limitation in the future.

Nonetheless I think that virtio-scsi is a useful project and addresses 
many of the limitations imposed by virtio-block. The fact that I am still 
persisting has more to do with interest in the project rather than wanting 
to keep the code from going upstream. Thank you for your work on this 
topic.

Mit freundlichen Grüßen / Kind regards

Christian Hoff

Student - Applied Computer Science


Phone:
49-16098976-950
 IBM Deutschland

E-Mail:
christian.hoff@de.ibm.com
 Am Fichtenberg 1


 71083 Herrenberg


 Germany


IBM Deutschland GmbH / Vorsitzender des Aufsichtsrats: Martin Jetter
Geschäftsführung: Martina Koederitz (Vorsitzende), Reinhard Reschke, 
Dieter Scholz, Gregor Pillen, Joachim Heel, Christian Noll
Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, 
HRB 14562 / WEEE-Reg.-Nr. DE 99369940 




From:
BORNTRAE@linux.vnet.ibm.com
To:
Paolo Bonzini <pbonzini@redhat.com>
Cc:
Christian Hoff/Germany/IBM@IBMDE, linux-kernel@vger.kernel.org, 
linux-scsi@vger.kernel.org, rusty@rustcorp.com.au, mst@redhat.com, 
kvm@vger.kernel.org
Date:
07/02/2012 14:18
Subject:
Re: Pe: [PATCH v5 1/3] virtio-scsi: first version



On 07/02/12 13:31, Paolo Bonzini wrote:
> The structure of the LUN is defined by SAM, not by me. 

Ok, got it.




^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Pe: [PATCH v5 1/3] virtio-scsi: first version
  2012-02-07 13:59         ` Christian Hoff
@ 2012-02-07 14:28           ` Paolo Bonzini
  2012-02-08 13:37             ` Christian Hoff
  0 siblings, 1 reply; 34+ messages in thread
From: Paolo Bonzini @ 2012-02-07 14:28 UTC (permalink / raw)
  To: Christian Hoff; +Cc: BORNTRAE, kvm, linux-kernel, linux-scsi, mst, rusty

On 02/07/2012 02:59 PM, Christian Hoff wrote:
> Instead the format has some disadvantages:
> - It uses up 8 bytes where 3 bytes would be sufficient in order to store
> both the target ID and LUN number information
> - The format limits us to 255 target IDs. I agree that the LUN limit is
> probably more a theoretical and not a practical one, but 255 target IDs
> could become a limitation in the future.

It also provides better upwards-compatibility in case the limitations 
are actually hit.  If I had used "uint8_t target; uint16_t lun;" an 
extension would require a feature bit and a new struct.  With 8-bytes, 
you can just expand the definition.  That pretty much sums it up.

But again, I don't think the limitations are serious.  A MegaSAS header 
has room for 256 targets too, VMWare has only 15, Hyper-V has 1 (and 2 
channels, but I think that's an off-by-one), and you can always have 
multiple HBAs on the same guest.

> Nonetheless I think that virtio-scsi is a useful project and addresses
> many of the limitations imposed by virtio-block. The fact that I am still
> persisting has more to do with interest in the project rather than wanting
> to keep the code from going upstream.

No problem. :)

Paolo

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Pe: [PATCH v5 1/3] virtio-scsi: first version
  2012-02-07 14:28           ` Paolo Bonzini
@ 2012-02-08 13:37             ` Christian Hoff
  2012-02-09  9:25               ` Paolo Bonzini
  0 siblings, 1 reply; 34+ messages in thread
From: Christian Hoff @ 2012-02-08 13:37 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: BORNTRAE, kvm, linux-kernel, linux-scsi, mst, rusty

Paolo Bonzini wrote:
> Christian Hoff wrote:
> > Instead the format has some disadvantages:
> > - It uses up 8 bytes where 3 bytes would be sufficient in order to 
store
> > both the target ID and LUN number information
> > - The format limits us to 255 target IDs. I agree that the LUN limit 
is
> > probably more a theoretical and not a practical one, but 255 target 
IDs
> > could become a limitation in the future.
>
> It also provides better upwards-compatibility in case the limitations 
> are actually hit.  If I had used "uint8_t target; uint16_t lun;" an 
> extension would require a feature bit and a new struct.  With 8-bytes, 
> you can just expand the definition.  That pretty much sums it up.

Ok, fair enough. This addresses my question.

Again, I have already done much testing with virtio-scsi and can confirm 
that the code is working flawlessly. In my opinion, virtio-scsi is a 
worthwhile addition to virtio-block and should be considered for inclusion 
into mainline kernel code.

Mit freundlichen Grüßen / Kind regards

Christian Hoff

Student - Applied Computer Science


Phone:
49-16098976-950
 IBM Deutschland

E-Mail:
christian.hoff@de.ibm.com
 Am Fichtenberg 1


 71083 Herrenberg


 Germany


IBM Deutschland GmbH / Vorsitzender des Aufsichtsrats: Martin Jetter
Geschäftsführung: Martina Koederitz (Vorsitzende), Reinhard Reschke, 
Dieter Scholz, Gregor Pillen, Joachim Heel, Christian Noll
Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, 
HRB 14562 / WEEE-Reg.-Nr. DE 99369940 


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Pe: [PATCH v5 1/3] virtio-scsi: first version
  2012-02-08 13:37             ` Christian Hoff
@ 2012-02-09  9:25               ` Paolo Bonzini
  2012-02-09 12:18                 ` Christian Hoff
  2012-02-12 20:16                 ` James Bottomley
  0 siblings, 2 replies; 34+ messages in thread
From: Paolo Bonzini @ 2012-02-09  9:25 UTC (permalink / raw)
  To: Christian Hoff, James Bottomley
  Cc: BORNTRAE, kvm, linux-kernel, linux-scsi, mst, rusty

On 02/08/2012 02:37 PM, Christian Hoff wrote:
> Again, I have already done much testing with virtio-scsi and can confirm
> that the code is working flawlessly. In my opinion, virtio-scsi is a
> worthwhile addition to virtio-block and should be considered for inclusion
> into mainline kernel code.

Thank you very much!

James, will you include virtio-scsi in 3.4?

Thanks,

Paolo

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Pe: [PATCH v5 1/3] virtio-scsi: first version
  2012-02-09  9:25               ` Paolo Bonzini
@ 2012-02-09 12:18                 ` Christian Hoff
  2012-02-12 20:16                 ` James Bottomley
  1 sibling, 0 replies; 34+ messages in thread
From: Christian Hoff @ 2012-02-09 12:18 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: BORNTRAE, James Bottomley, kvm, linux-kernel, linux-scsi, mst, rusty

Paolo Bonzini wrote:
> James, will you include virtio-scsi in 3.4?

The key arguments from my side for inclusion of virtio-scsi are:
- Support for SCSI multipathing, which can optionally be done on the host 
operating system.
- We can now use other SCSI(non-block) devices and make them accessible to 
the guest. For example we were able to virtualize a SCSI FCP tape using 
virtio-scsi. One of the biggest constraints of virtio-block is that it is 
limited to block devices, while virtio-scsi presents you with a more 
generic solution.

So I consider virtio-scsi to be the cleaner approach to do virtualization 
of SCSI commands. This is supported by the fact that virtio-block has a 
maintenance problem and has to change its protocol from time to time to 
support new features introduced by a new version of the SCSI standard.

So please consider including virtio-scsi into the mainline kernel tree. 
This is good and conceptually clean code, and it has proven to work very 
well in our internal testing.

Mit freundlichen Grüßen / Kind regards

Christian Hoff

Student - Applied Computer Science


Phone:
49-16098976-950
 IBM Deutschland

E-Mail:
christian.hoff@de.ibm.com
 Am Fichtenberg 1


 71083 Herrenberg


 Germany


IBM Deutschland GmbH / Vorsitzender des Aufsichtsrats: Martin Jetter
Geschäftsführung: Martina Koederitz (Vorsitzende), Reinhard Reschke, 
Dieter Scholz, Gregor Pillen, Joachim Heel, Christian Noll
Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, 
HRB 14562 / WEEE-Reg.-Nr. DE 99369940 


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Pe: [PATCH v5 1/3] virtio-scsi: first version
  2012-02-09  9:25               ` Paolo Bonzini
  2012-02-09 12:18                 ` Christian Hoff
@ 2012-02-12 20:16                 ` James Bottomley
  2012-02-12 23:41                   ` Rusty Russell
                                     ` (2 more replies)
  1 sibling, 3 replies; 34+ messages in thread
From: James Bottomley @ 2012-02-12 20:16 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Christian Hoff, BORNTRAE, kvm, linux-kernel, linux-scsi, mst, rusty

On Thu, 2012-02-09 at 10:25 +0100, Paolo Bonzini wrote:
> On 02/08/2012 02:37 PM, Christian Hoff wrote:
> > Again, I have already done much testing with virtio-scsi and can confirm
> > that the code is working flawlessly. In my opinion, virtio-scsi is a
> > worthwhile addition to virtio-block and should be considered for inclusion
> > into mainline kernel code.
> 
> Thank you very much!
> 
> James, will you include virtio-scsi in 3.4?

Well, no-one's yet answered the question I had about why.  virtio-scsi
seems to be a basic duplication of virtio-blk except that it seems to
fix some problems virtio-blk has.  Namely queue parameter discover,
which virtio-blk doesn't seem to do.  There may also be a reason to cut
the stack lower down.  Error handling is most often cited for this, but
no-one's satisfactorily explaned why it's better to do error handling in
the guest instead of the host.

Could someone please explain to me why you can't simply fix virtio-blk?
Or would virtio-blk maintainers give a reason why they're unwilling to
have it fixed?

This isn't a "no" by the way: we have absolutely hideous virtual drivers
for other virtualisation systems in SCSI which should also have been in
block, except that's not the way the various virt people think, so I'm
willing to extend KVM the same courtesy ... I'd just really like to know
that the virtio-blk situation is intractable before I do.

James



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Pe: [PATCH v5 1/3] virtio-scsi: first version
  2012-02-12 20:16                 ` James Bottomley
@ 2012-02-12 23:41                   ` Rusty Russell
  2012-02-13  7:05                   ` Christian Borntraeger
  2012-02-13  9:19                   ` Paolo Bonzini
  2 siblings, 0 replies; 34+ messages in thread
From: Rusty Russell @ 2012-02-12 23:41 UTC (permalink / raw)
  To: James Bottomley, Paolo Bonzini
  Cc: Christian Hoff, BORNTRAE, kvm, linux-kernel, linux-scsi, mst

On Sun, 12 Feb 2012 14:16:17 -0600, James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> On Thu, 2012-02-09 at 10:25 +0100, Paolo Bonzini wrote:
> > On 02/08/2012 02:37 PM, Christian Hoff wrote:
> > > Again, I have already done much testing with virtio-scsi and can confirm
> > > that the code is working flawlessly. In my opinion, virtio-scsi is a
> > > worthwhile addition to virtio-block and should be considered for inclusion
> > > into mainline kernel code.
> > 
> > Thank you very much!
> > 
> > James, will you include virtio-scsi in 3.4?
> 
> Well, no-one's yet answered the question I had about why.  virtio-scsi
> seems to be a basic duplication of virtio-blk except that it seems to
> fix some problems virtio-blk has.  Namely queue parameter discover,
> which virtio-blk doesn't seem to do.  There may also be a reason to cut
> the stack lower down.  Error handling is most often cited for this, but
> no-one's satisfactorily explaned why it's better to do error handling in
> the guest instead of the host.
> 
> Could someone please explain to me why you can't simply fix virtio-blk?
> Or would virtio-blk maintainers give a reason why they're unwilling to
> have it fixed?

My concern is simple: virtio_blk covers the 99% of cases, with very
little complexity.  To get that last 1%, we will end up re-specing much
of SCSI.

Having found someone who understand SCSI and is eager to maintain a
driver and spec, I am deeply tempted to partition the problem as simple
== virtio_blk, complex == virtio_scsi.

In fact, it would allow us to tighten the spec on VIRTIO_BLK_T_SCSI_CMD
to its actual use, which AFAICT is CDROMEJECT (maybe CDROMCLOSETRAY).

Cheers,
Rusty.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Pe: [PATCH v5 1/3] virtio-scsi: first version
  2012-02-12 20:16                 ` James Bottomley
  2012-02-12 23:41                   ` Rusty Russell
@ 2012-02-13  7:05                   ` Christian Borntraeger
  2012-02-13  7:57                     ` Dor Laor
  2012-02-13 11:08                     ` Bart Van Assche
  2012-02-13  9:19                   ` Paolo Bonzini
  2 siblings, 2 replies; 34+ messages in thread
From: Christian Borntraeger @ 2012-02-13  7:05 UTC (permalink / raw)
  To: James Bottomley
  Cc: Paolo Bonzini, Christian Hoff, borntrae, kvm, linux-kernel,
	linux-scsi, mst, rusty

On 12/02/12 21:16, James Bottomley wrote:
> Well, no-one's yet answered the question I had about why.  

Just to give one example from a different angle:
In the big datacenters tape libraries are still very important, and lots
of them have a scsi attachement. virtio-blk certainly is not the right
way to handle those. Furthermore it seems even pretty hard to craft
a virtio-tape since most of those libraries have vendor specific library
controls (via sg). We would need to duplicate scsi generic (hint, hint :-)

> virtio-scsi seems to be a basic duplication of virtio-blk except that it seems to
> fix some problems virtio-blk has.  Namely queue parameter discover,
> which virtio-blk doesn't seem to do.  There may also be a reason to cut
> the stack lower down.  Error handling is most often cited for this, but
> no-one's satisfactorily explaned why it's better to do error handling in
> the guest instead of the host.
> 
> Could someone please explain to me why you can't simply fix virtio-blk?

I dont think that virtio-scsi will replace virtio-blk everywhere. For non-scsi
block devices, image files or logical volumes virtio-blk seems to be the right 
approach, I think.

> Or would virtio-blk maintainers give a reason why they're unwilling to
> have it fixed?

I dont consider virtio-blk broken. It just doesnt cover everything.

Christian


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Pe: [PATCH v5 1/3] virtio-scsi: first version
  2012-02-13  7:05                   ` Christian Borntraeger
@ 2012-02-13  7:57                     ` Dor Laor
  2012-02-13 12:40                       ` Nicholas A. Bellinger
  2012-02-13 11:08                     ` Bart Van Assche
  1 sibling, 1 reply; 34+ messages in thread
From: Dor Laor @ 2012-02-13  7:57 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: James Bottomley, Paolo Bonzini, Christian Hoff, borntrae, kvm,
	linux-kernel, linux-scsi, mst, rusty

On 02/13/2012 09:05 AM, Christian Borntraeger wrote:
> On 12/02/12 21:16, James Bottomley wrote:
>> Well, no-one's yet answered the question I had about why.
>
> Just to give one example from a different angle:
> In the big datacenters tape libraries are still very important, and lots
> of them have a scsi attachement. virtio-blk certainly is not the right
> way to handle those. Furthermore it seems even pretty hard to craft
> a virtio-tape since most of those libraries have vendor specific library
> controls (via sg). We would need to duplicate scsi generic (hint, hint :-)
>
>> virtio-scsi seems to be a basic duplication of virtio-blk except that it seems to
>> fix some problems virtio-blk has.  Namely queue parameter discover,
>> which virtio-blk doesn't seem to do.  There may also be a reason to cut
>> the stack lower down.  Error handling is most often cited for this, but
>> no-one's satisfactorily explaned why it's better to do error handling in
>> the guest instead of the host.
>>
>> Could someone please explain to me why you can't simply fix virtio-blk?
>
> I dont think that virtio-scsi will replace virtio-blk everywhere. For non-scsi
> block devices, image files or logical volumes virtio-blk seems to be the right
> approach, I think.

+1

virtio-scsi is superior w.r.t:
   - Device support: tapes, cdroms, other
   - Does guest-host mapped multipath
   - Supports plenty of virtual disks mapped to the guest w/o need for a
     pci slot per each virtio-blk
   - offload fancy/new/sophisticated scsi commands from the guest to the
     storage array w/o need for qemu implementation. Example XCOPY.

There are some more goodies like ability to support windows guest 
clustering w/o hacky versions of scsi pass through over virtio-blk.
virtio-blk is also a candidate to change the request based towards bio 
based implementation, so sticking to it does not buy us too much.

>
>> Or would virtio-blk maintainers give a reason why they're unwilling to
>> have it fixed?
>
> I dont consider virtio-blk broken. It just doesnt cover everything.
>
> Christian
>
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Pe: [PATCH v5 1/3] virtio-scsi: first version
  2012-02-12 20:16                 ` James Bottomley
  2012-02-12 23:41                   ` Rusty Russell
  2012-02-13  7:05                   ` Christian Borntraeger
@ 2012-02-13  9:19                   ` Paolo Bonzini
  2012-02-14  0:07                     ` Rusty Russell
  2 siblings, 1 reply; 34+ messages in thread
From: Paolo Bonzini @ 2012-02-13  9:19 UTC (permalink / raw)
  To: James Bottomley
  Cc: Christian Hoff, BORNTRAE, kvm, linux-kernel, linux-scsi, mst, rusty

On 02/12/2012 09:16 PM, James Bottomley wrote:
> Well, no-one's yet answered the question I had about why.  virtio-scsi
> seems to be a basic duplication of virtio-blk except that it seems to
> fix some problems virtio-blk has.  Namely queue parameter discover,
> which virtio-blk doesn't seem to do.

The biggest differences between virtio-blk and virtio-scsi are that:

1) how the feature set is defined.  virtio-blk defines the feature set 
of the device through a shared spec between the guest and the host.  The 
virtio-scsi spec does not define a feature set for the devices, only for 
the transport.  Introducing new features in the guest does not need to 
be done specifically for virt, it can be done in generic code (sd.c). 
This results in a large feature set and at the same time a very stable spec.

Right now virtio-blk covers common usecases nicely.  However, the Linux 
block layer _is_ growing support for new operations: discard is already 
there, write same is in the works, extended copy will also come in due 
time.  Perhaps we'll add them to virtio-blk, perhaps not.  If we will, 
we will have to modify the spec, the host implementation, and the guest 
drivers for each possible guest OS.  virtio-scsi will support them 
transparently.  Depending on your configuration, it might work without 
touching the host at all.


2) for disks with SCSI attachment, the native interface is exposed 
precisely as it is in the host.  I think we had some misunderstanding 
WRT queue parameter discovery.  My concern with virtio-blk's SG_IO 
support is more general than that.  It is that SG_IO accesses the host 
disk, not the guest disk.  They will have the same data, but they are 
effectively different disks.  For example they might have different 
queue parameters, hence the misunderstanding.

People are mostly using the SG_IO interface for sane purposes.  For 
example you can ping the storage with INQUIRY commands to detect 
problems on the NAS or SAN.  For these usecases the difference does not 
matter.  However, there _are_ worrisome usecases for SG_IO that people 
are looking at.  For example installing vendor backup tools in their 
guests.  These tools send vendor-specific commands to the disks. 
Nothing particularly insane about that, but we want them to do it using 
a saner interface than VIRTIO_BLK_T_SCSI_CMD.


On top of this, only virtio-scsi obviously will support devices such as 
tapes.

> There may also be a reason to cut the stack lower down.  Error
> handling is most often cited for this, but no-one's satisfactorily
> explaned why it's better to do error handling in the guest instead of
> the host.

It's not necessarily better.  However error handling in the host may 
simply not be there.  This is for example the case of NFS-based storage 
with the "hard" option.

Paolo

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Pe: [PATCH v5 1/3] virtio-scsi: first version
  2012-02-13  7:05                   ` Christian Borntraeger
  2012-02-13  7:57                     ` Dor Laor
@ 2012-02-13 11:08                     ` Bart Van Assche
  1 sibling, 0 replies; 34+ messages in thread
From: Bart Van Assche @ 2012-02-13 11:08 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: James Bottomley, Paolo Bonzini, Christian Hoff, borntrae, kvm,
	linux-kernel, linux-scsi, mst, rusty

On Mon, Feb 13, 2012 at 8:05 AM, Christian Borntraeger
<borntraeger@de.ibm.com> wrote:
> On 12/02/12 21:16, James Bottomley wrote:
> > Could someone please explain to me why you can't simply fix virtio-blk?
>
> I dont think that virtio-scsi will replace virtio-blk everywhere. For non-scsi
> block devices, image files or logical volumes virtio-blk seems to be the right
> approach, I think.
>
> > Or would virtio-blk maintainers give a reason why they're unwilling to
> > have it fixed?
>
> I dont consider virtio-blk broken. It just doesnt cover everything.

Although I'm not sure whether that helps here: since about a year
there is software present in the upstream kernel that allows to use
any block device or even a file as a SCSI device.

Bart.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Pe: [PATCH v5 1/3] virtio-scsi: first version
  2012-02-13  7:57                     ` Dor Laor
@ 2012-02-13 12:40                       ` Nicholas A. Bellinger
  2012-02-13 12:54                         ` Dor Laor
  0 siblings, 1 reply; 34+ messages in thread
From: Nicholas A. Bellinger @ 2012-02-13 12:40 UTC (permalink / raw)
  To: dlaor
  Cc: Christian Borntraeger, James Bottomley, Paolo Bonzini,
	Christian Hoff, borntrae, kvm, linux-kernel, linux-scsi, mst,
	rusty, Stefan Hajnoczi, target-devel

Hi Dor, James & Co,

On Mon, 2012-02-13 at 09:57 +0200, Dor Laor wrote:
> On 02/13/2012 09:05 AM, Christian Borntraeger wrote:
> > On 12/02/12 21:16, James Bottomley wrote:
> >> Well, no-one's yet answered the question I had about why.
> >
> > Just to give one example from a different angle:
> > In the big datacenters tape libraries are still very important, and lots
> > of them have a scsi attachement. virtio-blk certainly is not the right
> > way to handle those. Furthermore it seems even pretty hard to craft
> > a virtio-tape since most of those libraries have vendor specific library
> > controls (via sg). We would need to duplicate scsi generic (hint, hint :-)
> >
> >> virtio-scsi seems to be a basic duplication of virtio-blk except that it seems to
> >> fix some problems virtio-blk has.  Namely queue parameter discover,
> >> which virtio-blk doesn't seem to do.  There may also be a reason to cut
> >> the stack lower down.  Error handling is most often cited for this, but
> >> no-one's satisfactorily explaned why it's better to do error handling in
> >> the guest instead of the host.
> >>
> >> Could someone please explain to me why you can't simply fix virtio-blk?
> >
> > I dont think that virtio-scsi will replace virtio-blk everywhere. For non-scsi
> > block devices, image files or logical volumes virtio-blk seems to be the right
> > approach, I think.
> 
> +1
> 
> virtio-scsi is superior w.r.t:
>    - Device support: tapes, cdroms, other

AFAICT any type of non TYPE_DISK struct scsi_device passthrough is going
to currently require virtio-scsi in order to work.

>    - Does guest-host mapped multipath

The logic that comes with target_core_fabric_configfs.c and the native
target control plane gives a host-side (tcm_vhost) fabric driver generic
explict/implict ALUA multipath support by default.

I think there are some interesting possibilities for paravirtualized
ALUA multipath..  8-)

>    - Supports plenty of virtual disks mapped to the guest w/o need for a
>      pci slot per each virtio-blk

Ouch, virtio-blk lacks multi-lun per pci slot support..?

>    - offload fancy/new/sophisticated scsi commands from the guest to the
>      storage array w/o need for qemu implementation. Example XCOPY.
> 

...

> There are some more goodies like ability to support windows guest 
> clustering w/o hacky versions of scsi pass through over virtio-blk.
> virtio-blk is also a candidate to change the request based towards bio 
> based implementation, so sticking to it does not buy us too much.
> 

MSFT cluster guests that require SPC-3 PR support can run today with
tcm_loop LLD SCSI LUNs + SG_IO/BSG + right megasas QEMU HBA emulation,
but I do agree this would be better served by virtio-scsi for guests
that require SPC-3 PR support or passthrough.

--nab








^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Pe: [PATCH v5 1/3] virtio-scsi: first version
  2012-02-13 12:40                       ` Nicholas A. Bellinger
@ 2012-02-13 12:54                         ` Dor Laor
  2012-02-13 13:00                           ` Michael S. Tsirkin
  0 siblings, 1 reply; 34+ messages in thread
From: Dor Laor @ 2012-02-13 12:54 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: Christian Borntraeger, James Bottomley, Paolo Bonzini,
	Christian Hoff, borntrae, kvm, linux-kernel, linux-scsi, mst,
	rusty, Stefan Hajnoczi, target-devel

On 02/13/2012 02:40 PM, Nicholas A. Bellinger wrote:
> Hi Dor, James&  Co,
>
> On Mon, 2012-02-13 at 09:57 +0200, Dor Laor wrote:
>> On 02/13/2012 09:05 AM, Christian Borntraeger wrote:
>>> On 12/02/12 21:16, James Bottomley wrote:
>>>> Well, no-one's yet answered the question I had about why.
>>>
>>> Just to give one example from a different angle:
>>> In the big datacenters tape libraries are still very important, and lots
>>> of them have a scsi attachement. virtio-blk certainly is not the right
>>> way to handle those. Furthermore it seems even pretty hard to craft
>>> a virtio-tape since most of those libraries have vendor specific library
>>> controls (via sg). We would need to duplicate scsi generic (hint, hint :-)
>>>
>>>> virtio-scsi seems to be a basic duplication of virtio-blk except that it seems to
>>>> fix some problems virtio-blk has.  Namely queue parameter discover,
>>>> which virtio-blk doesn't seem to do.  There may also be a reason to cut
>>>> the stack lower down.  Error handling is most often cited for this, but
>>>> no-one's satisfactorily explaned why it's better to do error handling in
>>>> the guest instead of the host.
>>>>
>>>> Could someone please explain to me why you can't simply fix virtio-blk?
>>>
>>> I dont think that virtio-scsi will replace virtio-blk everywhere. For non-scsi
>>> block devices, image files or logical volumes virtio-blk seems to be the right
>>> approach, I think.
>>
>> +1
>>
>> virtio-scsi is superior w.r.t:
>>     - Device support: tapes, cdroms, other
>
> AFAICT any type of non TYPE_DISK struct scsi_device passthrough is going
> to currently require virtio-scsi in order to work.
>
>>     - Does guest-host mapped multipath
>
> The logic that comes with target_core_fabric_configfs.c and the native
> target control plane gives a host-side (tcm_vhost) fabric driver generic
> explict/implict ALUA multipath support by default.
>
> I think there are some interesting possibilities for paravirtualized
> ALUA multipath..  8-)
>
>>     - Supports plenty of virtual disks mapped to the guest w/o need for a
>>       pci slot per each virtio-blk
>
> Ouch, virtio-blk lacks multi-lun per pci slot support..?

Only if you use the pci multi-function option but that kills standard 
hot unplug

>
>>     - offload fancy/new/sophisticated scsi commands from the guest to the
>>       storage array w/o need for qemu implementation. Example XCOPY.
>>
>
> ...
>
>> There are some more goodies like ability to support windows guest
>> clustering w/o hacky versions of scsi pass through over virtio-blk.
>> virtio-blk is also a candidate to change the request based towards bio
>> based implementation, so sticking to it does not buy us too much.
>>
>
> MSFT cluster guests that require SPC-3 PR support can run today with
> tcm_loop LLD SCSI LUNs + SG_IO/BSG + right megasas QEMU HBA emulation,
> but I do agree this would be better served by virtio-scsi for guests
> that require SPC-3 PR support or passthrough.
>
> --nab
>
>
>
>
>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Pe: [PATCH v5 1/3] virtio-scsi: first version
  2012-02-13 12:54                         ` Dor Laor
@ 2012-02-13 13:00                           ` Michael S. Tsirkin
  2012-02-13 13:13                             ` ronnie sahlberg
  0 siblings, 1 reply; 34+ messages in thread
From: Michael S. Tsirkin @ 2012-02-13 13:00 UTC (permalink / raw)
  To: Dor Laor
  Cc: Nicholas A. Bellinger, Christian Borntraeger, James Bottomley,
	Paolo Bonzini, Christian Hoff, borntrae, kvm, linux-kernel,
	linux-scsi, rusty, Stefan Hajnoczi, target-devel

On Mon, Feb 13, 2012 at 02:54:03PM +0200, Dor Laor wrote:
> Only if you use the pci multi-function option but that kills
> standard hot unplug

It doesn't kill it as such, rather you can't unplug luns individually.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Pe: [PATCH v5 1/3] virtio-scsi: first version
  2012-02-13 13:00                           ` Michael S. Tsirkin
@ 2012-02-13 13:13                             ` ronnie sahlberg
  2012-02-13 13:17                               ` Paolo Bonzini
  2012-02-13 13:18                               ` Michael S. Tsirkin
  0 siblings, 2 replies; 34+ messages in thread
From: ronnie sahlberg @ 2012-02-13 13:13 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Dor Laor, Nicholas A. Bellinger, Christian Borntraeger,
	James Bottomley, Paolo Bonzini, Christian Hoff, borntrae, kvm,
	linux-kernel, linux-scsi, rusty, Stefan Hajnoczi, target-devel

On Tue, Feb 14, 2012 at 12:00 AM, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Mon, Feb 13, 2012 at 02:54:03PM +0200, Dor Laor wrote:
>> Only if you use the pci multi-function option but that kills
>> standard hot unplug
>
> It doesn't kill it as such, rather you can't unplug luns individually.

Isnt that just a consequence of the current implementation rather than
a SCSI limitation?

A different way to do hoplug could be to flag all devices as removable
in the standard inq page then
leave the LUN there persistently and what you remove/add is not the
LUN device itself but just the media in the device.

Instead of hot-plug remove the LUN,  hot-plug becomes "media eject" or
"media insert".
The device remains present all time, you never remove it, but instead
hot-plug controls if the media is present or not.


This would require implementing at least START_STOP_UNIT and
PREVENT_ALLOW_MEDIUM_REMOVAL opcode emulation from SBC.


regards
ronnie sahlberg

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Pe: [PATCH v5 1/3] virtio-scsi: first version
  2012-02-13 13:13                             ` ronnie sahlberg
@ 2012-02-13 13:17                               ` Paolo Bonzini
  2012-02-13 13:18                               ` Michael S. Tsirkin
  1 sibling, 0 replies; 34+ messages in thread
From: Paolo Bonzini @ 2012-02-13 13:17 UTC (permalink / raw)
  To: ronnie sahlberg
  Cc: Michael S. Tsirkin, Dor Laor, Nicholas A. Bellinger,
	Christian Borntraeger, James Bottomley, Christian Hoff, borntrae,
	kvm, linux-kernel, linux-scsi, rusty, Stefan Hajnoczi,
	target-devel

On 02/13/2012 02:13 PM, ronnie sahlberg wrote:
> On Tue, Feb 14, 2012 at 12:00 AM, Michael S. Tsirkin <mst@redhat.com> wrote:
>> On Mon, Feb 13, 2012 at 02:54:03PM +0200, Dor Laor wrote:
>>> Only if you use the pci multi-function option but that kills
>>> standard hot unplug
>>
>> It doesn't kill it as such, rather you can't unplug luns individually.
>
> Isnt that just a consequence of the current implementation rather than
> a SCSI limitation?

We're talking about virtio-blk here. :)

Paolo

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Pe: [PATCH v5 1/3] virtio-scsi: first version
  2012-02-13 13:13                             ` ronnie sahlberg
  2012-02-13 13:17                               ` Paolo Bonzini
@ 2012-02-13 13:18                               ` Michael S. Tsirkin
  2012-02-13 15:12                                 ` Hannes Reinecke
  1 sibling, 1 reply; 34+ messages in thread
From: Michael S. Tsirkin @ 2012-02-13 13:18 UTC (permalink / raw)
  To: ronnie sahlberg
  Cc: Dor Laor, Nicholas A. Bellinger, Christian Borntraeger,
	James Bottomley, Paolo Bonzini, Christian Hoff, borntrae, kvm,
	linux-kernel, linux-scsi, rusty, Stefan Hajnoczi, target-devel

On Tue, Feb 14, 2012 at 12:13:36AM +1100, ronnie sahlberg wrote:
> On Tue, Feb 14, 2012 at 12:00 AM, Michael S. Tsirkin <mst@redhat.com> wrote:
> > On Mon, Feb 13, 2012 at 02:54:03PM +0200, Dor Laor wrote:
> >> Only if you use the pci multi-function option but that kills
> >> standard hot unplug
> >
> > It doesn't kill it as such, rather you can't unplug luns individually.
> 
> Isnt that just a consequence of the current implementation rather than
> a SCSI limitation?

Yes.

> A different way to do hoplug could be to flag all devices as removable
> in the standard inq page then
> leave the LUN there persistently and what you remove/add is not the
> LUN device itself but just the media in the device.
> 
> Instead of hot-plug remove the LUN,  hot-plug becomes "media eject" or
> "media insert".
> The device remains present all time, you never remove it, but instead
> hot-plug controls if the media is present or not.
> 
> 
> This would require implementing at least START_STOP_UNIT and
> PREVENT_ALLOW_MEDIUM_REMOVAL opcode emulation from SBC.
> 
> 
> regards
> ronnie sahlberg

That would work.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Pe: [PATCH v5 1/3] virtio-scsi: first version
  2012-02-13 13:18                               ` Michael S. Tsirkin
@ 2012-02-13 15:12                                 ` Hannes Reinecke
  2012-02-13 20:42                                   ` ronnie sahlberg
  0 siblings, 1 reply; 34+ messages in thread
From: Hannes Reinecke @ 2012-02-13 15:12 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: ronnie sahlberg, Dor Laor, Nicholas A. Bellinger,
	Christian Borntraeger, James Bottomley, Paolo Bonzini,
	Christian Hoff, borntrae, kvm, linux-kernel, linux-scsi, rusty,
	Stefan Hajnoczi, target-devel

On 02/13/2012 02:18 PM, Michael S. Tsirkin wrote:
> On Tue, Feb 14, 2012 at 12:13:36AM +1100, ronnie sahlberg wrote:
>> On Tue, Feb 14, 2012 at 12:00 AM, Michael S. Tsirkin <mst@redhat.com> wrote:
>>> On Mon, Feb 13, 2012 at 02:54:03PM +0200, Dor Laor wrote:
>>>> Only if you use the pci multi-function option but that kills
>>>> standard hot unplug
>>>
>>> It doesn't kill it as such, rather you can't unplug luns individually.
>>
>> Isnt that just a consequence of the current implementation rather than
>> a SCSI limitation?
> 
> Yes.
> 
>> A different way to do hoplug could be to flag all devices as removable
>> in the standard inq page then
>> leave the LUN there persistently and what you remove/add is not the
>> LUN device itself but just the media in the device.
>>
>> Instead of hot-plug remove the LUN,  hot-plug becomes "media eject" or
>> "media insert".
>> The device remains present all time, you never remove it, but instead
>> hot-plug controls if the media is present or not.
>>
>>
>> This would require implementing at least START_STOP_UNIT and
>> PREVENT_ALLOW_MEDIUM_REMOVAL opcode emulation from SBC.
>>
>>
>> regards
>> ronnie sahlberg
> 
> That would work.
>
Or we simply use the Peripheral Qualifier that the device is gone;
eg we could simply set PQ = 1, return sense code 0x25/00 and be done
with ...

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Pe: [PATCH v5 1/3] virtio-scsi: first version
  2012-02-13 15:12                                 ` Hannes Reinecke
@ 2012-02-13 20:42                                   ` ronnie sahlberg
  2012-02-13 20:53                                     ` ronnie sahlberg
  0 siblings, 1 reply; 34+ messages in thread
From: ronnie sahlberg @ 2012-02-13 20:42 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Michael S. Tsirkin, Dor Laor, Nicholas A. Bellinger,
	Christian Borntraeger, James Bottomley, Paolo Bonzini,
	Christian Hoff, borntrae, kvm, linux-kernel, linux-scsi, rusty,
	Stefan Hajnoczi, target-devel

On Tue, Feb 14, 2012 at 2:12 AM, Hannes Reinecke <hare@suse.de> wrote:
> On 02/13/2012 02:18 PM, Michael S. Tsirkin wrote:
>> On Tue, Feb 14, 2012 at 12:13:36AM +1100, ronnie sahlberg wrote:
>>> On Tue, Feb 14, 2012 at 12:00 AM, Michael S. Tsirkin <mst@redhat.com> wrote:
>>>> On Mon, Feb 13, 2012 at 02:54:03PM +0200, Dor Laor wrote:
>>>>> Only if you use the pci multi-function option but that kills
>>>>> standard hot unplug
>>>>
>>>> It doesn't kill it as such, rather you can't unplug luns individually.
>>>
>>> Isnt that just a consequence of the current implementation rather than
>>> a SCSI limitation?
>>
>> Yes.
>>
>>> A different way to do hoplug could be to flag all devices as removable
>>> in the standard inq page then
>>> leave the LUN there persistently and what you remove/add is not the
>>> LUN device itself but just the media in the device.
>>>
>>> Instead of hot-plug remove the LUN,  hot-plug becomes "media eject" or
>>> "media insert".
>>> The device remains present all time, you never remove it, but instead
>>> hot-plug controls if the media is present or not.
>>>
>>>
>>> This would require implementing at least START_STOP_UNIT and
>>> PREVENT_ALLOW_MEDIUM_REMOVAL opcode emulation from SBC.
>>>
>>>
>>> regards
>>> ronnie sahlberg
>>
>> That would work.
>>
> Or we simply use the Peripheral Qualifier that the device is gone;
> eg we could simply set PQ = 1, return sense code 0x25/00 and be done
> with ...
>

That is still similar to "rip a device out from the guest without notice"
and can cause the guest to be "surprised".


Removable media is standard feature in SCSI SBC (and other commandsets).
The nice part of removable media is that it activates a contract
between the device and the guest
to prevent removal of the media when the guest depends on the media
not being removed.

I.e.  If you have a SBC device with the removable-media bit set,
this is used to tell the initiator "this media can be removed, be
prepared that this might happen".
So when you mount such a SBC device in the guest, the guest will issue
a "PREVENT_ALLOW_MEDIUM_REMOVAL"
to tell the device "this medium is in use and may not be removed".

This automatically provides you with a mechanism where any guest can
signal to qemu when qemu may or may not remove the device/medium.



In addition to implementing PREVENT_ALLOW_MEDIUM_REMOVAL emulation,
qemu would also need to check the prevent-allow status before it
allows the device to be removed.

If nothing else, using this approach will automatically provide a
channel from the guest kernel to qemu to tell qemu when a device may
be unplugged and when it is not safe to unplug the device.



regards
ronnie sahlberg

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Pe: [PATCH v5 1/3] virtio-scsi: first version
  2012-02-13 20:42                                   ` ronnie sahlberg
@ 2012-02-13 20:53                                     ` ronnie sahlberg
  2012-02-13 22:59                                       ` Michael S. Tsirkin
  0 siblings, 1 reply; 34+ messages in thread
From: ronnie sahlberg @ 2012-02-13 20:53 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Michael S. Tsirkin, Dor Laor, Nicholas A. Bellinger,
	Christian Borntraeger, James Bottomley, Paolo Bonzini,
	Christian Hoff, borntrae, kvm, linux-kernel, linux-scsi, rusty,
	Stefan Hajnoczi, target-devel

On Tue, Feb 14, 2012 at 7:42 AM, ronnie sahlberg
<ronniesahlberg@gmail.com> wrote:
> On Tue, Feb 14, 2012 at 2:12 AM, Hannes Reinecke <hare@suse.de> wrote:
>> On 02/13/2012 02:18 PM, Michael S. Tsirkin wrote:
>>> On Tue, Feb 14, 2012 at 12:13:36AM +1100, ronnie sahlberg wrote:
>>>> On Tue, Feb 14, 2012 at 12:00 AM, Michael S. Tsirkin <mst@redhat.com> wrote:
>>>>> On Mon, Feb 13, 2012 at 02:54:03PM +0200, Dor Laor wrote:
>>>>>> Only if you use the pci multi-function option but that kills
>>>>>> standard hot unplug
>>>>>
>>>>> It doesn't kill it as such, rather you can't unplug luns individually.
>>>>
>>>> Isnt that just a consequence of the current implementation rather than
>>>> a SCSI limitation?
>>>
>>> Yes.
>>>
>>>> A different way to do hoplug could be to flag all devices as removable
>>>> in the standard inq page then
>>>> leave the LUN there persistently and what you remove/add is not the
>>>> LUN device itself but just the media in the device.
>>>>
>>>> Instead of hot-plug remove the LUN,  hot-plug becomes "media eject" or
>>>> "media insert".
>>>> The device remains present all time, you never remove it, but instead
>>>> hot-plug controls if the media is present or not.
>>>>
>>>>
>>>> This would require implementing at least START_STOP_UNIT and
>>>> PREVENT_ALLOW_MEDIUM_REMOVAL opcode emulation from SBC.
>>>>
>>>>
>>>> regards
>>>> ronnie sahlberg
>>>
>>> That would work.
>>>
>> Or we simply use the Peripheral Qualifier that the device is gone;
>> eg we could simply set PQ = 1, return sense code 0x25/00 and be done
>> with ...
>>
>
> That is still similar to "rip a device out from the guest without notice"
> and can cause the guest to be "surprised".
>
>
> Removable media is standard feature in SCSI SBC (and other commandsets).
> The nice part of removable media is that it activates a contract
> between the device and the guest
> to prevent removal of the media when the guest depends on the media
> not being removed.
>
> I.e.  If you have a SBC device with the removable-media bit set,
> this is used to tell the initiator "this media can be removed, be
> prepared that this might happen".
> So when you mount such a SBC device in the guest, the guest will issue
> a "PREVENT_ALLOW_MEDIUM_REMOVAL"
> to tell the device "this medium is in use and may not be removed".
>

What I mean is that if /dev/sdb is removable,
if you mount this as   "mount /dev/sdb1 /mnt"
this will automatically cause the guest kernel to send a
PREVENT_ALLOW_MEDIUM_REMOVAL to /dev/sdb to prevent removal.

When you "umount /dev/sdb1"   the kernel/guest will automagically send
PREVENT_ALLOW_MEDIUM_REMOVEAL to /dev/sdb and allow removal of the
media again.


If you capture this command and track the "prevent/allow removal
status"  you automatically get a channel where qemu will
know when it is safe to unplug the device  and when it is not safe to
unplug the device.
This is a nice feature.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Pe: [PATCH v5 1/3] virtio-scsi: first version
  2012-02-13 20:53                                     ` ronnie sahlberg
@ 2012-02-13 22:59                                       ` Michael S. Tsirkin
  2012-02-13 23:30                                         ` ronnie sahlberg
  0 siblings, 1 reply; 34+ messages in thread
From: Michael S. Tsirkin @ 2012-02-13 22:59 UTC (permalink / raw)
  To: ronnie sahlberg
  Cc: Hannes Reinecke, Dor Laor, Nicholas A. Bellinger,
	Christian Borntraeger, James Bottomley, Paolo Bonzini,
	Christian Hoff, borntrae, kvm, linux-kernel, linux-scsi, rusty,
	Stefan Hajnoczi, target-devel

On Tue, Feb 14, 2012 at 07:53:26AM +1100, ronnie sahlberg wrote:
> On Tue, Feb 14, 2012 at 7:42 AM, ronnie sahlberg
> <ronniesahlberg@gmail.com> wrote:
> > On Tue, Feb 14, 2012 at 2:12 AM, Hannes Reinecke <hare@suse.de> wrote:
> >> On 02/13/2012 02:18 PM, Michael S. Tsirkin wrote:
> >>> On Tue, Feb 14, 2012 at 12:13:36AM +1100, ronnie sahlberg wrote:
> >>>> On Tue, Feb 14, 2012 at 12:00 AM, Michael S. Tsirkin <mst@redhat.com> wrote:
> >>>>> On Mon, Feb 13, 2012 at 02:54:03PM +0200, Dor Laor wrote:
> >>>>>> Only if you use the pci multi-function option but that kills
> >>>>>> standard hot unplug
> >>>>>
> >>>>> It doesn't kill it as such, rather you can't unplug luns individually.
> >>>>
> >>>> Isnt that just a consequence of the current implementation rather than
> >>>> a SCSI limitation?
> >>>
> >>> Yes.
> >>>
> >>>> A different way to do hoplug could be to flag all devices as removable
> >>>> in the standard inq page then
> >>>> leave the LUN there persistently and what you remove/add is not the
> >>>> LUN device itself but just the media in the device.
> >>>>
> >>>> Instead of hot-plug remove the LUN,  hot-plug becomes "media eject" or
> >>>> "media insert".
> >>>> The device remains present all time, you never remove it, but instead
> >>>> hot-plug controls if the media is present or not.
> >>>>
> >>>>
> >>>> This would require implementing at least START_STOP_UNIT and
> >>>> PREVENT_ALLOW_MEDIUM_REMOVAL opcode emulation from SBC.
> >>>>
> >>>>
> >>>> regards
> >>>> ronnie sahlberg
> >>>
> >>> That would work.
> >>>
> >> Or we simply use the Peripheral Qualifier that the device is gone;
> >> eg we could simply set PQ = 1, return sense code 0x25/00 and be done
> >> with ...
> >>
> >
> > That is still similar to "rip a device out from the guest without notice"
> > and can cause the guest to be "surprised".
> >
> >
> > Removable media is standard feature in SCSI SBC (and other commandsets).
> > The nice part of removable media is that it activates a contract
> > between the device and the guest
> > to prevent removal of the media when the guest depends on the media
> > not being removed.
> >
> > I.e.  If you have a SBC device with the removable-media bit set,
> > this is used to tell the initiator "this media can be removed, be
> > prepared that this might happen".
> > So when you mount such a SBC device in the guest, the guest will issue
> > a "PREVENT_ALLOW_MEDIUM_REMOVAL"
> > to tell the device "this medium is in use and may not be removed".
> >
> 
> What I mean is that if /dev/sdb is removable,
> if you mount this as   "mount /dev/sdb1 /mnt"
> this will automatically cause the guest kernel to send a
> PREVENT_ALLOW_MEDIUM_REMOVAL to /dev/sdb to prevent removal.
> 
> When you "umount /dev/sdb1"   the kernel/guest will automagically send
> PREVENT_ALLOW_MEDIUM_REMOVEAL to /dev/sdb and allow removal of the
> media again.
> 
> 
> If you capture this command and track the "prevent/allow removal
> status"  you automatically get a channel where qemu will
> know when it is safe to unplug the device  and when it is not safe to
> unplug the device.
> This is a nice feature.

Presumably there's a way for device to notify the OS
that user requested removal, as well?

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Pe: [PATCH v5 1/3] virtio-scsi: first version
  2012-02-13 22:59                                       ` Michael S. Tsirkin
@ 2012-02-13 23:30                                         ` ronnie sahlberg
  2012-02-13 23:33                                           ` Michael S. Tsirkin
  0 siblings, 1 reply; 34+ messages in thread
From: ronnie sahlberg @ 2012-02-13 23:30 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Hannes Reinecke, Dor Laor, Nicholas A. Bellinger,
	Christian Borntraeger, James Bottomley, Paolo Bonzini,
	Christian Hoff, borntrae, kvm, linux-kernel, linux-scsi, rusty,
	Stefan Hajnoczi, target-devel

On Tue, Feb 14, 2012 at 9:59 AM, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Tue, Feb 14, 2012 at 07:53:26AM +1100, ronnie sahlberg wrote:
>> On Tue, Feb 14, 2012 at 7:42 AM, ronnie sahlberg
>> <ronniesahlberg@gmail.com> wrote:
>> > On Tue, Feb 14, 2012 at 2:12 AM, Hannes Reinecke <hare@suse.de> wrote:
>> >> On 02/13/2012 02:18 PM, Michael S. Tsirkin wrote:
>> >>> On Tue, Feb 14, 2012 at 12:13:36AM +1100, ronnie sahlberg wrote:
>> >>>> On Tue, Feb 14, 2012 at 12:00 AM, Michael S. Tsirkin <mst@redhat.com> wrote:
>> >>>>> On Mon, Feb 13, 2012 at 02:54:03PM +0200, Dor Laor wrote:
>> >>>>>> Only if you use the pci multi-function option but that kills
>> >>>>>> standard hot unplug
>> >>>>>
>> >>>>> It doesn't kill it as such, rather you can't unplug luns individually.
>> >>>>
>> >>>> Isnt that just a consequence of the current implementation rather than
>> >>>> a SCSI limitation?
>> >>>
>> >>> Yes.
>> >>>
>> >>>> A different way to do hoplug could be to flag all devices as removable
>> >>>> in the standard inq page then
>> >>>> leave the LUN there persistently and what you remove/add is not the
>> >>>> LUN device itself but just the media in the device.
>> >>>>
>> >>>> Instead of hot-plug remove the LUN,  hot-plug becomes "media eject" or
>> >>>> "media insert".
>> >>>> The device remains present all time, you never remove it, but instead
>> >>>> hot-plug controls if the media is present or not.
>> >>>>
>> >>>>
>> >>>> This would require implementing at least START_STOP_UNIT and
>> >>>> PREVENT_ALLOW_MEDIUM_REMOVAL opcode emulation from SBC.
>> >>>>
>> >>>>
>> >>>> regards
>> >>>> ronnie sahlberg
>> >>>
>> >>> That would work.
>> >>>
>> >> Or we simply use the Peripheral Qualifier that the device is gone;
>> >> eg we could simply set PQ = 1, return sense code 0x25/00 and be done
>> >> with ...
>> >>
>> >
>> > That is still similar to "rip a device out from the guest without notice"
>> > and can cause the guest to be "surprised".
>> >
>> >
>> > Removable media is standard feature in SCSI SBC (and other commandsets).
>> > The nice part of removable media is that it activates a contract
>> > between the device and the guest
>> > to prevent removal of the media when the guest depends on the media
>> > not being removed.
>> >
>> > I.e.  If you have a SBC device with the removable-media bit set,
>> > this is used to tell the initiator "this media can be removed, be
>> > prepared that this might happen".
>> > So when you mount such a SBC device in the guest, the guest will issue
>> > a "PREVENT_ALLOW_MEDIUM_REMOVAL"
>> > to tell the device "this medium is in use and may not be removed".
>> >
>>
>> What I mean is that if /dev/sdb is removable,
>> if you mount this as   "mount /dev/sdb1 /mnt"
>> this will automatically cause the guest kernel to send a
>> PREVENT_ALLOW_MEDIUM_REMOVAL to /dev/sdb to prevent removal.
>>
>> When you "umount /dev/sdb1"   the kernel/guest will automagically send
>> PREVENT_ALLOW_MEDIUM_REMOVEAL to /dev/sdb and allow removal of the
>> media again.
>>
>>
>> If you capture this command and track the "prevent/allow removal
>> status"  you automatically get a channel where qemu will
>> know when it is safe to unplug the device  and when it is not safe to
>> unplug the device.
>> This is a nice feature.
>
> Presumably there's a way for device to notify the OS
> that user requested removal, as well?


I think that is done by responding with sense  to one of the commands,
like the every few second TEST_UNIT_READY that the
initiator/guest-kernel will send.

5Ah 01h    DT WROM BK  OPERATOR MEDIUM REMOVAL REQUEST

This sense code should be the one to use.


I dont know if linux scsi initiator honors this  or what it will do.



I guess something like this could work ?

IF device is marked as prevent-removal THEN
    send OPERATOR SEND MEDIUM REMOVAL REQUEST to the initiator
    wait xyz seconds
    IF device is still marked as prevent-removal THEN
        ask operator "guest refused to release the LUN, do you want to
forcefully remove it?"
    ELSE
        unmount the media
    FI
ELSE
   unmount the media
FI

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Pe: [PATCH v5 1/3] virtio-scsi: first version
  2012-02-13 23:30                                         ` ronnie sahlberg
@ 2012-02-13 23:33                                           ` Michael S. Tsirkin
  2012-02-14  0:49                                             ` ronnie sahlberg
  0 siblings, 1 reply; 34+ messages in thread
From: Michael S. Tsirkin @ 2012-02-13 23:33 UTC (permalink / raw)
  To: ronnie sahlberg
  Cc: Hannes Reinecke, Dor Laor, Nicholas A. Bellinger,
	Christian Borntraeger, James Bottomley, Paolo Bonzini,
	Christian Hoff, borntrae, kvm, linux-kernel, linux-scsi, rusty,
	Stefan Hajnoczi, target-devel

On Tue, Feb 14, 2012 at 10:30:59AM +1100, ronnie sahlberg wrote:
> On Tue, Feb 14, 2012 at 9:59 AM, Michael S. Tsirkin <mst@redhat.com> wrote:
> > On Tue, Feb 14, 2012 at 07:53:26AM +1100, ronnie sahlberg wrote:
> >> On Tue, Feb 14, 2012 at 7:42 AM, ronnie sahlberg
> >> <ronniesahlberg@gmail.com> wrote:
> >> > On Tue, Feb 14, 2012 at 2:12 AM, Hannes Reinecke <hare@suse.de> wrote:
> >> >> On 02/13/2012 02:18 PM, Michael S. Tsirkin wrote:
> >> >>> On Tue, Feb 14, 2012 at 12:13:36AM +1100, ronnie sahlberg wrote:
> >> >>>> On Tue, Feb 14, 2012 at 12:00 AM, Michael S. Tsirkin <mst@redhat.com> wrote:
> >> >>>>> On Mon, Feb 13, 2012 at 02:54:03PM +0200, Dor Laor wrote:
> >> >>>>>> Only if you use the pci multi-function option but that kills
> >> >>>>>> standard hot unplug
> >> >>>>>
> >> >>>>> It doesn't kill it as such, rather you can't unplug luns individually.
> >> >>>>
> >> >>>> Isnt that just a consequence of the current implementation rather than
> >> >>>> a SCSI limitation?
> >> >>>
> >> >>> Yes.
> >> >>>
> >> >>>> A different way to do hoplug could be to flag all devices as removable
> >> >>>> in the standard inq page then
> >> >>>> leave the LUN there persistently and what you remove/add is not the
> >> >>>> LUN device itself but just the media in the device.
> >> >>>>
> >> >>>> Instead of hot-plug remove the LUN,  hot-plug becomes "media eject" or
> >> >>>> "media insert".
> >> >>>> The device remains present all time, you never remove it, but instead
> >> >>>> hot-plug controls if the media is present or not.
> >> >>>>
> >> >>>>
> >> >>>> This would require implementing at least START_STOP_UNIT and
> >> >>>> PREVENT_ALLOW_MEDIUM_REMOVAL opcode emulation from SBC.
> >> >>>>
> >> >>>>
> >> >>>> regards
> >> >>>> ronnie sahlberg
> >> >>>
> >> >>> That would work.
> >> >>>
> >> >> Or we simply use the Peripheral Qualifier that the device is gone;
> >> >> eg we could simply set PQ = 1, return sense code 0x25/00 and be done
> >> >> with ...
> >> >>
> >> >
> >> > That is still similar to "rip a device out from the guest without notice"
> >> > and can cause the guest to be "surprised".
> >> >
> >> >
> >> > Removable media is standard feature in SCSI SBC (and other commandsets).
> >> > The nice part of removable media is that it activates a contract
> >> > between the device and the guest
> >> > to prevent removal of the media when the guest depends on the media
> >> > not being removed.
> >> >
> >> > I.e.  If you have a SBC device with the removable-media bit set,
> >> > this is used to tell the initiator "this media can be removed, be
> >> > prepared that this might happen".
> >> > So when you mount such a SBC device in the guest, the guest will issue
> >> > a "PREVENT_ALLOW_MEDIUM_REMOVAL"
> >> > to tell the device "this medium is in use and may not be removed".
> >> >
> >>
> >> What I mean is that if /dev/sdb is removable,
> >> if you mount this as   "mount /dev/sdb1 /mnt"
> >> this will automatically cause the guest kernel to send a
> >> PREVENT_ALLOW_MEDIUM_REMOVAL to /dev/sdb to prevent removal.
> >>
> >> When you "umount /dev/sdb1"   the kernel/guest will automagically send
> >> PREVENT_ALLOW_MEDIUM_REMOVEAL to /dev/sdb and allow removal of the
> >> media again.
> >>
> >>
> >> If you capture this command and track the "prevent/allow removal
> >> status"  you automatically get a channel where qemu will
> >> know when it is safe to unplug the device  and when it is not safe to
> >> unplug the device.
> >> This is a nice feature.
> >
> > Presumably there's a way for device to notify the OS
> > that user requested removal, as well?
> 
> 
> I think that is done by responding with sense  to one of the commands,
> like the every few second TEST_UNIT_READY that the
> initiator/guest-kernel will send.

Does it do this even for mounted media?
I didn't realize ...

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Pe: [PATCH v5 1/3] virtio-scsi: first version
  2012-02-13  9:19                   ` Paolo Bonzini
@ 2012-02-14  0:07                     ` Rusty Russell
  0 siblings, 0 replies; 34+ messages in thread
From: Rusty Russell @ 2012-02-14  0:07 UTC (permalink / raw)
  To: Paolo Bonzini, James Bottomley
  Cc: Christian Hoff, BORNTRAE, kvm, linux-kernel, linux-scsi, mst

On Mon, 13 Feb 2012 10:19:56 +0100, Paolo Bonzini <pbonzini@redhat.com> wrote:
> block layer _is_ growing support for new operations: discard is already 
> there, write same is in the works, extended copy will also come in due 
> time.  Perhaps we'll add them to virtio-blk, perhaps not.

FYI, I'd take patches for discard in virtio_blk today; it's a no-brainer
in a virtual devoce.

But I wouldn't want extended copy and write same.

Thanks,
Rusty.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Pe: [PATCH v5 1/3] virtio-scsi: first version
  2012-02-13 23:33                                           ` Michael S. Tsirkin
@ 2012-02-14  0:49                                             ` ronnie sahlberg
  2012-02-14  1:11                                               ` Michael S. Tsirkin
  0 siblings, 1 reply; 34+ messages in thread
From: ronnie sahlberg @ 2012-02-14  0:49 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Hannes Reinecke, Dor Laor, Nicholas A. Bellinger,
	Christian Borntraeger, James Bottomley, Paolo Bonzini,
	Christian Hoff, borntrae, kvm, linux-kernel, linux-scsi, rusty,
	Stefan Hajnoczi, target-devel

On Tue, Feb 14, 2012 at 10:33 AM, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Tue, Feb 14, 2012 at 10:30:59AM +1100, ronnie sahlberg wrote:
>> On Tue, Feb 14, 2012 at 9:59 AM, Michael S. Tsirkin <mst@redhat.com> wrote:
>> > On Tue, Feb 14, 2012 at 07:53:26AM +1100, ronnie sahlberg wrote:
>> >> On Tue, Feb 14, 2012 at 7:42 AM, ronnie sahlberg
>> >> <ronniesahlberg@gmail.com> wrote:
>> >> > On Tue, Feb 14, 2012 at 2:12 AM, Hannes Reinecke <hare@suse.de> wrote:
>> >> >> On 02/13/2012 02:18 PM, Michael S. Tsirkin wrote:
>> >> >>> On Tue, Feb 14, 2012 at 12:13:36AM +1100, ronnie sahlberg wrote:
>> >> >>>> On Tue, Feb 14, 2012 at 12:00 AM, Michael S. Tsirkin <mst@redhat.com> wrote:
>> >> >>>>> On Mon, Feb 13, 2012 at 02:54:03PM +0200, Dor Laor wrote:
>> >> >>>>>> Only if you use the pci multi-function option but that kills
>> >> >>>>>> standard hot unplug
>> >> >>>>>
>> >> >>>>> It doesn't kill it as such, rather you can't unplug luns individually.
>> >> >>>>
>> >> >>>> Isnt that just a consequence of the current implementation rather than
>> >> >>>> a SCSI limitation?
>> >> >>>
>> >> >>> Yes.
>> >> >>>
>> >> >>>> A different way to do hoplug could be to flag all devices as removable
>> >> >>>> in the standard inq page then
>> >> >>>> leave the LUN there persistently and what you remove/add is not the
>> >> >>>> LUN device itself but just the media in the device.
>> >> >>>>
>> >> >>>> Instead of hot-plug remove the LUN,  hot-plug becomes "media eject" or
>> >> >>>> "media insert".
>> >> >>>> The device remains present all time, you never remove it, but instead
>> >> >>>> hot-plug controls if the media is present or not.
>> >> >>>>
>> >> >>>>
>> >> >>>> This would require implementing at least START_STOP_UNIT and
>> >> >>>> PREVENT_ALLOW_MEDIUM_REMOVAL opcode emulation from SBC.
>> >> >>>>
>> >> >>>>
>> >> >>>> regards
>> >> >>>> ronnie sahlberg
>> >> >>>
>> >> >>> That would work.
>> >> >>>
>> >> >> Or we simply use the Peripheral Qualifier that the device is gone;
>> >> >> eg we could simply set PQ = 1, return sense code 0x25/00 and be done
>> >> >> with ...
>> >> >>
>> >> >
>> >> > That is still similar to "rip a device out from the guest without notice"
>> >> > and can cause the guest to be "surprised".
>> >> >
>> >> >
>> >> > Removable media is standard feature in SCSI SBC (and other commandsets).
>> >> > The nice part of removable media is that it activates a contract
>> >> > between the device and the guest
>> >> > to prevent removal of the media when the guest depends on the media
>> >> > not being removed.
>> >> >
>> >> > I.e.  If you have a SBC device with the removable-media bit set,
>> >> > this is used to tell the initiator "this media can be removed, be
>> >> > prepared that this might happen".
>> >> > So when you mount such a SBC device in the guest, the guest will issue
>> >> > a "PREVENT_ALLOW_MEDIUM_REMOVAL"
>> >> > to tell the device "this medium is in use and may not be removed".
>> >> >
>> >>
>> >> What I mean is that if /dev/sdb is removable,
>> >> if you mount this as   "mount /dev/sdb1 /mnt"
>> >> this will automatically cause the guest kernel to send a
>> >> PREVENT_ALLOW_MEDIUM_REMOVAL to /dev/sdb to prevent removal.
>> >>
>> >> When you "umount /dev/sdb1"   the kernel/guest will automagically send
>> >> PREVENT_ALLOW_MEDIUM_REMOVEAL to /dev/sdb and allow removal of the
>> >> media again.
>> >>
>> >>
>> >> If you capture this command and track the "prevent/allow removal
>> >> status"  you automatically get a channel where qemu will
>> >> know when it is safe to unplug the device  and when it is not safe to
>> >> unplug the device.
>> >> This is a nice feature.
>> >
>> > Presumably there's a way for device to notify the OS
>> > that user requested removal, as well?
>>
>>
>> I think that is done by responding with sense  to one of the commands,
>> like the every few second TEST_UNIT_READY that the
>> initiator/guest-kernel will send.
>
> Does it do this even for mounted media?
> I didn't realize ...

Yes it does. At least for SBC devices that are flagged as removable.
"Normal" SBC devices that are not removable might trigger different
behaviour in the kernel.
(endless string of TEST_UNIT_READY is a way to check if something
"unexpected" happens on the device.)

I just tested on a 3.0.0  kernel.

I have a mediachanger for SBC and the SBC is flagged as removable.
/dev/sdd is the iscsi lun with a removable SBC disk.

By just exposing this device to the kernel, the kernel keeps sending,
or if not the kernel maybe some other process trying to poll the
status?

every few seconds :
PREVENT_ALLOW_MEDIUM_REMOVAL  prevent removal
PREVENT_ALLOW_MEDIUM_REMOVAL  to immediatel change it back to allow
removal again
TEST_UNIT_READY


After I run this
mount /dev/sdd1 /mnt

The kernel sends a single
PREVENT_ALLOW_MEDIUM_REMOVAL to prevent removal
then every few seconds a
TEST_UNIT_READY


This is what it does for removable SBC disks. Maybe it does something
differently for "permanent" SBC disks.

regards
ronnie sahlberg

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Pe: [PATCH v5 1/3] virtio-scsi: first version
  2012-02-14  0:49                                             ` ronnie sahlberg
@ 2012-02-14  1:11                                               ` Michael S. Tsirkin
  2012-02-14  9:57                                                 ` Paolo Bonzini
  0 siblings, 1 reply; 34+ messages in thread
From: Michael S. Tsirkin @ 2012-02-14  1:11 UTC (permalink / raw)
  To: ronnie sahlberg
  Cc: Hannes Reinecke, Dor Laor, Nicholas A. Bellinger,
	Christian Borntraeger, James Bottomley, Paolo Bonzini,
	Christian Hoff, borntrae, kvm, linux-kernel, linux-scsi, rusty,
	Stefan Hajnoczi, target-devel

On Tue, Feb 14, 2012 at 11:49:55AM +1100, ronnie sahlberg wrote:
> By just exposing this device to the kernel, the kernel keeps sending,
> or if not the kernel maybe some other process trying to poll the
> status?
> 
> every few seconds :
> PREVENT_ALLOW_MEDIUM_REMOVAL  prevent removal
> PREVENT_ALLOW_MEDIUM_REMOVAL  to immediatel change it back to allow
> removal again
> TEST_UNIT_READY
> 
> 
> After I run this
> mount /dev/sdd1 /mnt
> 
> The kernel sends a single
> PREVENT_ALLOW_MEDIUM_REMOVAL to prevent removal
> then every few seconds a
> TEST_UNIT_READY
> 
> 
> This is what it does for removable SBC disks. Maybe it does something
> differently for "permanent" SBC disks.
> 
> regards
> ronnie sahlberg

There used to be a daemon for this.


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Pe: [PATCH v5 1/3] virtio-scsi: first version
  2012-02-14  1:11                                               ` Michael S. Tsirkin
@ 2012-02-14  9:57                                                 ` Paolo Bonzini
  0 siblings, 0 replies; 34+ messages in thread
From: Paolo Bonzini @ 2012-02-14  9:57 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: ronnie sahlberg, Hannes Reinecke, Dor Laor,
	Nicholas A. Bellinger, Christian Borntraeger, James Bottomley,
	Christian Hoff, borntrae, kvm, linux-kernel, linux-scsi, rusty,
	Stefan Hajnoczi, target-devel

On 02/14/2012 02:11 AM, Michael S. Tsirkin wrote:
> On Tue, Feb 14, 2012 at 11:49:55AM +1100, ronnie sahlberg wrote:
>> By just exposing this device to the kernel, the kernel keeps sending,
>> or if not the kernel maybe some other process trying to poll the
>> status?
>>
>> every few seconds :
>> PREVENT_ALLOW_MEDIUM_REMOVAL  prevent removal
>> PREVENT_ALLOW_MEDIUM_REMOVAL  to immediatel change it back to allow
>> removal again
>> TEST_UNIT_READY
>>
>>
>> After I run this
>> mount /dev/sdd1 /mnt
>>
>> The kernel sends a single
>> PREVENT_ALLOW_MEDIUM_REMOVAL to prevent removal
>> then every few seconds a
>> TEST_UNIT_READY

Sorry to interrupt you again guys, but: the discussion started with 
virtio-blk hotplug and now we're talking about SCSI commands?  Sure 
somebody switched topic at some point :) and anyway this is irrelevant 
to what virtio-blk can/cannot do.

BTW, for virtio-scsi the spec provides a way to do hotplug and hotunplug 
without any polling, though it's not implemented yet in the driver.

Paolo

^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2012-02-14  9:57 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-02-06  9:51 Pe: [PATCH v5 1/3] virtio-scsi: first version Christian Hoff
2012-02-07  9:54 ` Paolo Bonzini
2012-02-07 11:10   ` Michael S. Tsirkin
2012-02-07 11:26     ` Paolo Bonzini
2012-02-07 11:56   ` Christian Borntraeger
2012-02-07 12:31     ` Paolo Bonzini
2012-02-07 13:18       ` Christian Borntraeger
2012-02-07 13:59         ` Christian Hoff
2012-02-07 14:28           ` Paolo Bonzini
2012-02-08 13:37             ` Christian Hoff
2012-02-09  9:25               ` Paolo Bonzini
2012-02-09 12:18                 ` Christian Hoff
2012-02-12 20:16                 ` James Bottomley
2012-02-12 23:41                   ` Rusty Russell
2012-02-13  7:05                   ` Christian Borntraeger
2012-02-13  7:57                     ` Dor Laor
2012-02-13 12:40                       ` Nicholas A. Bellinger
2012-02-13 12:54                         ` Dor Laor
2012-02-13 13:00                           ` Michael S. Tsirkin
2012-02-13 13:13                             ` ronnie sahlberg
2012-02-13 13:17                               ` Paolo Bonzini
2012-02-13 13:18                               ` Michael S. Tsirkin
2012-02-13 15:12                                 ` Hannes Reinecke
2012-02-13 20:42                                   ` ronnie sahlberg
2012-02-13 20:53                                     ` ronnie sahlberg
2012-02-13 22:59                                       ` Michael S. Tsirkin
2012-02-13 23:30                                         ` ronnie sahlberg
2012-02-13 23:33                                           ` Michael S. Tsirkin
2012-02-14  0:49                                             ` ronnie sahlberg
2012-02-14  1:11                                               ` Michael S. Tsirkin
2012-02-14  9:57                                                 ` Paolo Bonzini
2012-02-13 11:08                     ` Bart Van Assche
2012-02-13  9:19                   ` Paolo Bonzini
2012-02-14  0:07                     ` Rusty Russell

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).