From: Paolo Bonzini <pbonzini@redhat.com>
To: Namhyung Kim <namhyung@kernel.org>
Cc: "Michael S. Tsirkin" <mst@redhat.com>,
	virtio-dev@lists.oasis-open.org,
	"Tony Luck" <tony.luck@intel.com>,
	"Kees Cook" <keescook@chromium.org>, KVM <kvm@vger.kernel.org>,
	"Radim Krčmář" <rkrcmar@redhat.com>,
	"Anton Vorontsov" <anton@enomsg.org>,
	LKML <linux-kernel@vger.kernel.org>,
	"Steven Rostedt" <rostedt@goodmis.org>,
	qemu-devel <qemu-devel@nongnu.org>,
	"Minchan Kim" <minchan@kernel.org>,
	"Anthony Liguori" <aliguori@amazon.com>,
	"Colin Cross" <ccross@android.com>,
	virtualization@lists.linux-foundation.org,
	"Ingo Molnar" <mingo@kernel.org>
Subject: Re: [PATCH 1/3] virtio: Basic implementation of virtio pstore driver
Date: Wed, 16 Nov 2016 07:10:36 -0500 (EST)
Message-ID: <1627682877.13077005.1479298236793.JavaMail.zimbra@redhat.com>
In-Reply-To: <CAM9d7cj8CxS77WC-7O7fRX1s0sqTZqJo=ZHseN-hc2jpE8QD2Q@mail.gmail.com>

> Not sure how independent ERST is from ACPI and other specs.  It looks
> like referencing UEFI spec at least.

It is just the format of error records that comes from the UEFI spec
(include/linux/cper.h) but you can ignore it, I think.  It should be
handled by tools on the host side.  For you, the error log address
range contains a CPER header followed by a binary blob.  In practice,
you only need the record length field (bytes 20-23 of the header),
though it may be a good idea to validate the signature at the beginning
of the header.
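
A minimal sketch of that check, assuming the header starts with the
four-byte ASCII signature "CPER" and that the length field is stored
little-endian.  The helper name and the bounds policy are hypothetical,
not taken from any existing code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CPER_SIG            "CPER"  /* assumed signature at the start of the header */
#define CPER_SIG_LEN        4
#define CPER_RECLEN_OFFSET  20      /* record length occupies bytes 20-23 */
#define CPER_MIN_HEADER     24      /* enough bytes to reach the length field */

/*
 * Hypothetical helper: return the record length stored in a CPER header,
 * or 0 if the buffer is too short, the signature does not match, or the
 * length does not fit in the available range.
 */
static uint32_t cper_record_length(const uint8_t *buf, size_t avail)
{
    uint32_t len;

    assert(buf != NULL);
    if (avail < CPER_MIN_HEADER)
        return 0;
    if (memcmp(buf, CPER_SIG, CPER_SIG_LEN) != 0)
        return 0;

    /* Assemble the little-endian field byte by byte (no alignment needed). */
    len = (uint32_t)buf[CPER_RECLEN_OFFSET] |
          ((uint32_t)buf[CPER_RECLEN_OFFSET + 1] << 8) |
          ((uint32_t)buf[CPER_RECLEN_OFFSET + 2] << 16) |
          ((uint32_t)buf[CPER_RECLEN_OFFSET + 3] << 24);

    /* Reject lengths that are shorter than the header or overrun the range. */
    if (len < CPER_MIN_HEADER || len > avail)
        return 0;
    return len;
}
```

Returning 0 for every failure keeps the caller simple: a write operation
would just be rejected when the helper reports 0.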

> Btw, is the ERST used for pstore only (in Linux)?

Yes.  It can store various records, including dmesg and MCE.

There are other examples in QEMU of interfaces with ACPI.  They all use the
DSDT, but the logic is similar.  For example, docs/specs/acpi_mem_hotplug.txt
documents the memory hotplug interface. In all cases, ACPI tables contain small
programs that talk to specialized hardware registers, typically allocated to
hard-coded I/O ports.

In your case, the registers could occupy 16 consecutive I/O ports, like the
following:

     0x00       read/write   operation type (0=write,1=read,2=clear,3=dummy write)

     0x01       read-only    bit 7: if set, operation in progress

                             bits 0-6: operation status, see "Command Status Definition" in
                             the ACPI spec

     0x02       read-only    when read:

                             - read a 64-bit record id from memory, at the address
                               that was last written to 0x08.

                             - if the id is valid and is not the last id in the store,
                               write the next 64-bit record id to the same address

                             - otherwise, write the first record id to the same address,
                               or 0xffffffffffffffff if the store is empty

     0x03                    unused, read as zero

     0x04-0x07  read/write   offset of the error record into the error log address range

     0x08-0x0b  read/write   when read, return number of stored records

                             when written, the written value is a 32-bit memory address,
                             which points to a 64-bit location used to communicate record ids.

     0x0c-0x0f  read/write   when read, always return -1 (together with the "mask" field
                             and READ_REGISTER, this lets ERST instructions return any value!)

                             when written, trigger the pstore operation:

                             - if the current operation is a dummy write, do nothing

                             - if the current operation is a write, write a new record, using
                               the written value as the base of the error log address range.
                               The length must be parsed from the CPER header.

                             - if the current operation is a clear, read the record id
                               from the memory location that was last written to 0x08 and
                               clear that record.  The written value is ignored.

                             - if the current operation is a read, read the record id from the
                               memory location that was last written to 0x08 and fetch that
                               record, using the written value as the base of the error log
                               address range.
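
The register layout above could be written down in C roughly as follows.
This is only a sketch: all identifiers are hypothetical, and the offsets
are relative to whatever base I/O port ends up being chosen.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Register offsets relative to the base I/O port (hypothetical names). */
enum {
    ERST_REG_OP      = 0x00, /* operation type */
    ERST_REG_STATUS  = 0x01, /* busy flag and command status */
    ERST_REG_NEXT_ID = 0x02, /* a read steps the record id in memory */
    ERST_REG_OFFSET  = 0x04, /* record offset, 32 bits */
    ERST_REG_ID_ADDR = 0x08, /* read: record count; write: id location */
    ERST_REG_EXECUTE = 0x0c, /* read: -1; write: trigger the operation */
    ERST_REG_SPAN    = 0x10  /* 16 consecutive ports in total */
};

/* Values accepted by the operation type register at 0x00. */
enum erst_op {
    ERST_OP_WRITE       = 0,
    ERST_OP_READ        = 1,
    ERST_OP_CLEAR       = 2,
    ERST_OP_DUMMY_WRITE = 3
};

#define ERST_STATUS_BUSY 0x80 /* bit 7: operation in progress */
#define ERST_STATUS_MASK 0x7f /* bits 0-6: command status */

/* The last 32-bit register must end exactly at the end of the block. */
static_assert(ERST_REG_EXECUTE + 4 == ERST_REG_SPAN,
              "register block spans 16 ports");

static bool erst_status_busy(uint8_t status)
{
    return (status & ERST_STATUS_BUSY) != 0;
}

static uint8_t erst_status_code(uint8_t status)
{
    return status & ERST_STATUS_MASK;
}

static bool erst_op_valid(uint8_t op)
{
    return op <= ERST_OP_DUMMY_WRITE;
}
```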

In addition, the firmware will need to reserve a few KB of RAM for the error log
address range (I checked a real system and it reserves 8KB).  The first eight
bytes are needed for the record identifier interface, because there's no such
thing as 64-bit I/O ports, and the rest can be used for the actual buffer.
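
As a rough illustration of that split, assuming an 8KB region whose base
fits in 32 bits (because the register at 0x08 is 32 bits wide), the
layout could be computed like this.  The struct and function names are
made up for the example:

```c
#include <assert.h>
#include <stdint.h>

#define ERST_MEM_SIZE 8192u /* size seen on a real system, per the text */
#define ERST_ID_SIZE  8u    /* leading bytes used to pass 64-bit record ids */

struct erst_layout {
    uint32_t id_addr;  /* where record ids are exchanged */
    uint32_t log_base; /* start of the actual record buffer */
    uint32_t log_len;  /* bytes available for one record */
};

/* Hypothetical helper: derive the layout from the reserved region's base. */
static struct erst_layout erst_layout_for(uint32_t mem_base)
{
    struct erst_layout l;

    /* The region must not wrap around the 32-bit address space. */
    assert(mem_base <= UINT32_MAX - ERST_MEM_SIZE);

    l.id_addr  = mem_base;
    l.log_base = mem_base + ERST_ID_SIZE;
    l.log_len  = ERST_MEM_SIZE - ERST_ID_SIZE;
    return l;
}
```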

QEMU already has an interface to allocate RAM and patch the address into an
ACPI table (bios_linker_loader_alloc).  Because this interface is actually meant
to load data from QEMU into the firmware (using the "fw_cfg" interface), you
would have to add a dummy 8KB file to fw_cfg using fw_cfg_add_file (for
example "etc/erst-memory"); it can just be full of zeros.

QEMU supports two chipsets, PIIX and ICH9, and the free I/O port ranges are
different.  You could use 0xa20 for ICH9 and 0xae20 for PIIX.

All in all, the contents of the ERST table would not be very different from a
non-virtual system, except that on real hardware the firmware would use SMIs
as the trap mechanism.  You almost have a one-to-one mapping between ERST
actions and register accesses:

   BEGIN_WRITE_OPERATION                  write value 0 to register at 0x00
   BEGIN_READ_OPERATION                   write value 1 to register at 0x00
   BEGIN_CLEAR_OPERATION                  write value 2 to register at 0x00
   BEGIN_DUMMY_WRITE_OPERATION            write value 3 to register at 0x00
   END_OPERATION                          no-op
   CHECK_BUSY_STATUS                      read register at 0x01 with mask 0x80
   GET_COMMAND_STATUS                     read register at 0x01 with mask 0x7f
   SET_RECORD_OFFSET                      write register at 0x04
   GET_RECORD_COUNT                       read register at 0x08
   EXECUTE_OPERATION                      write ERST memory base + 8 to 0x0c
   GET_ERROR_LOG_ADDRESS_RANGE            read register at 0x0c (with mask = ERST memory base + 8)
   GET_ERROR_LOG_ADDRESS_RANGE_LENGTH     read register at 0x0c (with mask = 8192 - 8 = 8184)
   GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES read register at 0x0c (with mask = 0)
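
The mask trick used by the last three rows can be checked with a small
model.  This is a simplification that treats READ_REGISTER as "raw value
AND mask" on 32 bits and ignores any bit offset in the register
description; the function names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

#define ERST_EXECUTE_READ 0xffffffffu /* register at 0x0c always reads as -1 */

/* Simplified model of READ_REGISTER: raw value filtered by the entry's mask. */
static uint32_t erst_read_register(uint32_t raw, uint32_t mask)
{
    return raw & mask;
}

/* GET_ERROR_LOG_ADDRESS_RANGE: mask = ERST memory base + 8 */
static uint32_t erst_log_range_base(uint32_t mem_base)
{
    assert(mem_base <= UINT32_MAX - 8u);
    return erst_read_register(ERST_EXECUTE_READ, mem_base + 8u);
}

/* GET_ERROR_LOG_ADDRESS_RANGE_LENGTH: mask = 8192 - 8 */
static uint32_t erst_log_range_length(void)
{
    return erst_read_register(ERST_EXECUTE_READ, 8192u - 8u);
}

/* GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES: mask = 0 */
static uint32_t erst_log_range_attributes(void)
{
    return erst_read_register(ERST_EXECUTE_READ, 0u);
}
```

Because every bit of the raw value is set, the result is always the mask
itself, so the table can encode constants without dedicated registers.
The same model covers the status register: masks 0x80 and 0x7f select
the busy bit and the status code respectively.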

Only the get/set record identifier instructions are a little harder:

   GET_RECORD_IDENTIFIER                  write ERST memory base to register at 0x08
                                          read register at 0x02
                                          read eight bytes at ERST memory base

   SET_RECORD_IDENTIFIER                  write ERST memory base to register at 0x08
                                          write eight bytes at ERST memory base
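
On the device side, the id stepping that a read of register 0x02
performs could be modelled as below.  This is a sketch under the
assumption that the backend can present its record ids as an ordered
array; the function name and that representation are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ERST_ID_NONE UINT64_MAX /* 0xffffffffffffffff: the store is empty */

/*
 * Given the id currently held in the 64-bit memory location, return the
 * id that should be written back to it:
 *  - the next id, if the current one is valid and not the last;
 *  - otherwise the first id, or ERST_ID_NONE if there are no records.
 */
static uint64_t erst_next_record_id(const uint64_t *ids, size_t count,
                                    uint64_t current)
{
    size_t i;

    assert(ids != NULL || count == 0);
    if (count == 0)
        return ERST_ID_NONE;

    /* Stop before the last entry: a match there wraps to the first id. */
    for (i = 0; i + 1 < count; i++) {
        if (ids[i] == current)
            return ids[i + 1];
    }
    return ids[0];
}
```

With ids {10, 20, 30}, starting from an invalid id yields 10, then 20,
then 30, then 10 again, so the guest can walk all records by reading
register 0x02 repeatedly.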

On top of this, you need to add the APEI UUID (see apei_osc_setup in Linux)
to build_q35_osc_method, and use "-M q35" when you start QEMU.  If you need
more help just ask.  I or others can help you with the ACPI glue, then you
can write the file backend yourself, based on your existing virtio-pstore code.

> Also I need to control pstore driver like using bigger buffer,
> enabling specific message types and so on if ERST supports.  Is it
> possible for ERST to provide such information?

It's the normal pstore driver, same as on a real server.  What exactly do you
need?

Paolo
