All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 00/19] Add a host IOMMU device abstraction to check with vIOMMU
@ 2024-04-29  6:50 Zhenzhong Duan
  2024-04-29  6:50 ` [PATCH v3 01/19] backends: Introduce HostIOMMUDevice abstract Zhenzhong Duan
                   ` (19 more replies)
  0 siblings, 20 replies; 59+ messages in thread
From: Zhenzhong Duan @ 2024-04-29  6:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, kevin.tian, yi.l.liu, chao.p.peng,
	Zhenzhong Duan

Hi,

The most important change in this version is instroducing a common
HostIOMMUDeviceCaps structure in HostIOMMUDevice and a new interface
between vIOMMU and HostIOMMUDevice.

HostIOMMUDeviceClass::realize() is introduced to initialize
HostIOMMUDeviceCaps and other fields of HostIOMMUDevice variants.

HostIOMMUDeviceClass::check_cap() is introduced to query host IOMMU
device capabilities.

After the change, part2 is only 3 patches, so merge it with part1 to be
a single prerequisite series, same for changelog. If anyone doesn't like
that, I can split again.

The class tree is as below:

                              HostIOMMUDevice
                                     | .caps
                                     | .realize()
                                     | .check_cap()
                                     |
            .-----------------------------------------------.
            |                        |                      |
HostIOMMUDeviceLegacyVFIO  {HostIOMMUDeviceLegacyVDPA}  HostIOMMUDeviceIOMMUFD
            | .vdev                  | {.vdev}              | .iommufd
                                                            | .devid
                                                            | [.ioas_id]
                                                            | [.attach_hwpt()]
                                                            | [.detach_hwpt()]
                                                            |
                                          .----------------------.
                                          |                      |
                       HostIOMMUDeviceIOMMUFDVFIO  {HostIOMMUDeviceIOMMUFDVDPA}
                                          | .vdev                | {.vdev}

* The attributes in [] will be implemented in nesting series.
* The classes in {} will be implemented in future.
* .vdev in different class points to different agent device,
* i.e., for VFIO it points to VFIODevice.

PATCH1-4: Introduce HostIOMMUDevice and its sub classes
PATCH5-11: Introduce HostIOMMUDeviceCaps, implement .realize() and .check_cap() handler
PATCH12-16: Create HostIOMMUDevice instance and pass to vIOMMU
PATCH17-19: Implement compatibility check between host IOMMU and vIOMMU(intel_iommu)

Qemu code can be found at:
https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_nesting_preq_v3

Besides the compatibility check in this series, in nesting series, this
host IOMMU device is extended for much wider usage. For anyone interested
on the nesting series, here is the link:
https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_nesting_rfcv2

Thanks
Zhenzhong

Changelog:
v3:
- refine declaration and doc for HostIOMMUDevice (Cédric, Philippe)
- introduce HostIOMMUDeviceCaps, .realize() and .check_cap() (Cédric)
- introduce helper range_get_last_bit() for range operation (Cédric)
- separate pci_device_get_iommu_bus_devfn() in a prereq patch (Cédric)
- replace HIOD_ abbreviation with HOST_IOMMU_DEVICE_ (Cédric)
- add header in include/sysemu/iommufd.h (Cédric)

v2:
- use QOM to abstract host IOMMU device and its sub-classes (Cédric)
- move host IOMMU device creation in attach_device() (Cédric)
- refine pci_device_set/unset_iommu_device doc further (Eric)
- define host IOMMU info format of different backend
- implement get_host_iommu_info() for different backend (Cédric)
- drop cap/ecap update logic (MST)
- check aw-bits from get_host_iommu_info() in legacy mode

v1:
- use HostIOMMUDevice handle instead of union in VFIODevice (Eric)
- change host_iommu_device_init to host_iommu_device_create
- allocate HostIOMMUDevice in host_iommu_device_create callback
  and set the VFIODevice base_hdev handle (Eric)
- refine pci_device_set/unset_iommu_device doc (Eric)
- use HostIOMMUDevice handle instead of union in VTDHostIOMMUDevice (Eric)
- convert HostIOMMUDevice to sub object pointer in vtd_check_hdev

rfcv2:
- introduce common abstract HostIOMMUDevice and sub struct for different BEs (Eric, Cédric)
- remove iommufd_device.[ch] (Cédric)
- remove duplicate iommufd/devid define from VFIODevice (Eric)
- drop the p in aliased_pbus and aliased_pdevfn (Eric)
- assert devfn and iommu_bus in pci_device_get_iommu_bus_devfn (Cédric, Eric)
- use errp in iommufd_device_get_info (Eric)
- split and simplify cap/ecap check/sync code in intel_iommu.c (Cédric)
- move VTDHostIOMMUDevice declaration to intel_iommu_internal.h (Cédric)
- make '(vtd->cap_reg >> 16) & 0x3fULL' a MACRO and add missed '+1' (Cédric)
- block migration if vIOMMU cap/ecap updated based on host IOMMU cap/ecap
- add R-B

Yi Liu (2):
  hw/pci: Introduce pci_device_[set|unset]_iommu_device()
  intel_iommu: Implement [set|unset]_iommu_device() callbacks

Zhenzhong Duan (17):
  backends: Introduce HostIOMMUDevice abstract
  vfio/container: Introduce HostIOMMUDeviceLegacyVFIO device
  backends/iommufd: Introduce abstract HostIOMMUDeviceIOMMUFD device
  vfio/iommufd: Introduce HostIOMMUDeviceIOMMUFDVFIO device
  backends/host_iommu_device: Introduce HostIOMMUDeviceCaps
  range: Introduce range_get_last_bit()
  vfio/container: Implement HostIOMMUDeviceClass::realize() handler
  backends/iommufd: Introduce helper function
    iommufd_backend_get_device_info()
  vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler
  vfio/container: Implement HostIOMMUDeviceClass::check_cap() handler
  backends/iommufd: Implement HostIOMMUDeviceClass::check_cap() handler
  vfio: Introduce VFIOIOMMUClass::hiod_typename attribute
  vfio: Create host IOMMU device instance
  hw/pci: Introduce helper function pci_device_get_iommu_bus_devfn()
  vfio/pci: Pass HostIOMMUDevice to vIOMMU
  intel_iommu: Extract out vtd_cap_init() to initialize cap/ecap
  intel_iommu: Check compatibility with host IOMMU capabilities

 MAINTAINERS                           |   2 +
 hw/i386/intel_iommu_internal.h        |   8 ++
 include/hw/i386/intel_iommu.h         |   3 +
 include/hw/pci/pci.h                  |  38 ++++-
 include/hw/vfio/vfio-common.h         |  26 ++++
 include/hw/vfio/vfio-container-base.h |   3 +
 include/qemu/range.h                  |  11 ++
 include/sysemu/host_iommu_device.h    |  95 +++++++++++++
 include/sysemu/iommufd.h              |  34 +++++
 backends/host_iommu_device.c          |  59 ++++++++
 backends/iommufd.c                    |  75 +++++++---
 hw/i386/intel_iommu.c                 | 197 ++++++++++++++++++++------
 hw/pci/pci.c                          |  75 +++++++++-
 hw/vfio/common.c                      |  18 ++-
 hw/vfio/container.c                   |  49 ++++++-
 hw/vfio/iommufd.c                     |  52 ++++++-
 hw/vfio/pci.c                         |  20 ++-
 backends/Kconfig                      |   5 +
 backends/meson.build                  |   1 +
 19 files changed, 701 insertions(+), 70 deletions(-)
 create mode 100644 include/sysemu/host_iommu_device.h
 create mode 100644 backends/host_iommu_device.c

-- 
2.34.1



^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH v3 01/19] backends: Introduce HostIOMMUDevice abstract
  2024-04-29  6:50 [PATCH v3 00/19] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
@ 2024-04-29  6:50 ` Zhenzhong Duan
  2024-04-29  6:50 ` [PATCH v3 02/19] vfio/container: Introduce HostIOMMUDeviceLegacyVFIO device Zhenzhong Duan
                   ` (18 subsequent siblings)
  19 siblings, 0 replies; 59+ messages in thread
From: Zhenzhong Duan @ 2024-04-29  6:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, kevin.tian, yi.l.liu, chao.p.peng,
	Zhenzhong Duan, Paolo Bonzini

Introduce HostIOMMUDevice as an abstraction of host IOMMU device.

Introduce .realize() to initialize HostIOMMUDevice further after
instance init.

Introduce a macro CONFIG_HOST_IOMMU_DEVICE to define the usage
for VFIO, and VDPA in the future.

Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 MAINTAINERS                        |  2 ++
 include/sysemu/host_iommu_device.h | 51 ++++++++++++++++++++++++++++++
 backends/host_iommu_device.c       | 30 ++++++++++++++++++
 backends/Kconfig                   |  5 +++
 backends/meson.build               |  1 +
 5 files changed, 89 insertions(+)
 create mode 100644 include/sysemu/host_iommu_device.h
 create mode 100644 backends/host_iommu_device.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 302b6fd00c..f67cd36b34 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2190,6 +2190,8 @@ M: Zhenzhong Duan <zhenzhong.duan@intel.com>
 S: Supported
 F: backends/iommufd.c
 F: include/sysemu/iommufd.h
+F: backends/host_iommu_device.c
+F: include/sysemu/host_iommu_device.h
 F: include/qemu/chardev_open.h
 F: util/chardev_open.c
 F: docs/devel/vfio-iommufd.rst
diff --git a/include/sysemu/host_iommu_device.h b/include/sysemu/host_iommu_device.h
new file mode 100644
index 0000000000..2b58a94d62
--- /dev/null
+++ b/include/sysemu/host_iommu_device.h
@@ -0,0 +1,51 @@
+/*
+ * Host IOMMU device abstract declaration
+ *
+ * Copyright (C) 2024 Intel Corporation.
+ *
+ * Authors: Zhenzhong Duan <zhenzhong.duan@intel.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ */
+
+#ifndef HOST_IOMMU_DEVICE_H
+#define HOST_IOMMU_DEVICE_H
+
+#include "qom/object.h"
+#include "qapi/error.h"
+
+#define TYPE_HOST_IOMMU_DEVICE "host-iommu-device"
+OBJECT_DECLARE_TYPE(HostIOMMUDevice, HostIOMMUDeviceClass, HOST_IOMMU_DEVICE)
+
+struct HostIOMMUDevice {
+    Object parent_obj;
+};
+
+/**
+ * struct HostIOMMUDeviceClass - The base class for all host IOMMU devices.
+ *
+ * Different type of host devices (e.g., VFIO or VDPA device) or devices
+ * with different backend (e.g., VFIO legacy container or IOMMUFD backend)
+ * can have different sub-classes.
+ */
+struct HostIOMMUDeviceClass {
+    ObjectClass parent_class;
+
+    /**
+     * @realize: initialize host IOMMU device instance further.
+     *
+     * Mandatory callback.
+     *
+     * @hiod: pointer to a host IOMMU device instance.
+     *
+     * @opaque: pointer to agent device of this host IOMMU device,
+     *          i.e., for VFIO, pointer to VFIODevice
+     *
+     * @errp: pass an Error out when realize fails.
+     *
+     * Returns: true on success, false on failure.
+     */
+    bool (*realize)(HostIOMMUDevice *hiod, void *opaque, Error **errp);
+};
+#endif
diff --git a/backends/host_iommu_device.c b/backends/host_iommu_device.c
new file mode 100644
index 0000000000..41f2fdce20
--- /dev/null
+++ b/backends/host_iommu_device.c
@@ -0,0 +1,30 @@
+/*
+ * Host IOMMU device abstract
+ *
+ * Copyright (C) 2024 Intel Corporation.
+ *
+ * Authors: Zhenzhong Duan <zhenzhong.duan@intel.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "sysemu/host_iommu_device.h"
+
+OBJECT_DEFINE_ABSTRACT_TYPE(HostIOMMUDevice,
+                            host_iommu_device,
+                            HOST_IOMMU_DEVICE,
+                            OBJECT)
+
+static void host_iommu_device_class_init(ObjectClass *oc, void *data)
+{
+}
+
+static void host_iommu_device_init(Object *obj)
+{
+}
+
+static void host_iommu_device_finalize(Object *obj)
+{
+}
diff --git a/backends/Kconfig b/backends/Kconfig
index 2cb23f62fa..34ab29e994 100644
--- a/backends/Kconfig
+++ b/backends/Kconfig
@@ -3,3 +3,8 @@ source tpm/Kconfig
 config IOMMUFD
     bool
     depends on VFIO
+
+config HOST_IOMMU_DEVICE
+    bool
+    default y
+    depends on VFIO
diff --git a/backends/meson.build b/backends/meson.build
index 8b2b111497..2e975d641e 100644
--- a/backends/meson.build
+++ b/backends/meson.build
@@ -25,6 +25,7 @@ if have_vhost_user
 endif
 system_ss.add(when: 'CONFIG_VIRTIO_CRYPTO', if_true: files('cryptodev-vhost.c'))
 system_ss.add(when: 'CONFIG_IOMMUFD', if_true: files('iommufd.c'))
+system_ss.add(when: 'CONFIG_HOST_IOMMU_DEVICE', if_true: files('host_iommu_device.c'))
 if have_vhost_user_crypto
   system_ss.add(when: 'CONFIG_VIRTIO_CRYPTO', if_true: files('cryptodev-vhost-user.c'))
 endif
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 02/19] vfio/container: Introduce HostIOMMUDeviceLegacyVFIO device
  2024-04-29  6:50 [PATCH v3 00/19] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
  2024-04-29  6:50 ` [PATCH v3 01/19] backends: Introduce HostIOMMUDevice abstract Zhenzhong Duan
@ 2024-04-29  6:50 ` Zhenzhong Duan
  2024-04-30  7:51   ` Cédric Le Goater
  2024-04-29  6:50 ` [PATCH v3 03/19] backends/iommufd: Introduce abstract HostIOMMUDeviceIOMMUFD device Zhenzhong Duan
                   ` (17 subsequent siblings)
  19 siblings, 1 reply; 59+ messages in thread
From: Zhenzhong Duan @ 2024-04-29  6:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, kevin.tian, yi.l.liu, chao.p.peng,
	Zhenzhong Duan

HostIOMMUDeviceLegacyVFIO represents a host IOMMU device under VFIO
legacy container backend.

It includes a link to VFIODevice.

Suggested-by: Eric Auger <eric.auger@redhat.com>
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 include/hw/vfio/vfio-common.h | 12 ++++++++++++
 hw/vfio/container.c           |  6 +++++-
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index b9da6c08ef..aa3abe0a18 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -31,6 +31,7 @@
 #endif
 #include "sysemu/sysemu.h"
 #include "hw/vfio/vfio-container-base.h"
+#include "sysemu/host_iommu_device.h"
 
 #define VFIO_MSG_PREFIX "vfio %s: "
 
@@ -147,6 +148,17 @@ typedef struct VFIOGroup {
     bool ram_block_discard_allowed;
 } VFIOGroup;
 
+#define TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO TYPE_HOST_IOMMU_DEVICE "-legacy-vfio"
+OBJECT_DECLARE_SIMPLE_TYPE(HostIOMMUDeviceLegacyVFIO,
+                           HOST_IOMMU_DEVICE_LEGACY_VFIO)
+
+/* Abstract of host IOMMU device with VFIO legacy container backend */
+struct HostIOMMUDeviceLegacyVFIO {
+    HostIOMMUDevice parent_obj;
+
+    VFIODevice *vdev;
+};
+
 typedef struct VFIODMABuf {
     QemuDmaBuf buf;
     uint32_t pos_x, pos_y, pos_updates;
diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index 77bdec276e..3b6826996a 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -1148,7 +1148,11 @@ static const TypeInfo types[] = {
         .name = TYPE_VFIO_IOMMU_LEGACY,
         .parent = TYPE_VFIO_IOMMU,
         .class_init = vfio_iommu_legacy_class_init,
-    },
+    }, {
+        .name = TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO,
+        .parent = TYPE_HOST_IOMMU_DEVICE,
+        .instance_size = sizeof(HostIOMMUDeviceLegacyVFIO),
+    }
 };
 
 DEFINE_TYPES(types)
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 03/19] backends/iommufd: Introduce abstract HostIOMMUDeviceIOMMUFD device
  2024-04-29  6:50 [PATCH v3 00/19] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
  2024-04-29  6:50 ` [PATCH v3 01/19] backends: Introduce HostIOMMUDevice abstract Zhenzhong Duan
  2024-04-29  6:50 ` [PATCH v3 02/19] vfio/container: Introduce HostIOMMUDeviceLegacyVFIO device Zhenzhong Duan
@ 2024-04-29  6:50 ` Zhenzhong Duan
  2024-04-29  6:50 ` [PATCH v3 04/19] vfio/iommufd: Introduce HostIOMMUDeviceIOMMUFDVFIO device Zhenzhong Duan
                   ` (16 subsequent siblings)
  19 siblings, 0 replies; 59+ messages in thread
From: Zhenzhong Duan @ 2024-04-29  6:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, kevin.tian, yi.l.liu, chao.p.peng,
	Zhenzhong Duan

HostIOMMUDeviceIOMMUFD represents a host IOMMU device under iommufd
backend.

Currently it contains public iommufd handle and device id which
will be passed to vIOMMU to allocate/free ioas, hwpt, etc.

When nested translation is supported in future, vIOMMU will
request iommufd related operations like attaching/detaching hwpt.
VFIO and VDPA device have different way of attaching/detaching hwpt.
So HostIOMMUDeviceIOMMUFD is still an abstract class which will be
inherited by VFIO and VDPA device sub-classes.

Opportunistically, add missed header to include/sysemu/iommufd.h.

Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 include/sysemu/iommufd.h | 30 ++++++++++++++++++++++++++++++
 backends/iommufd.c       | 37 ++++++++++++++++++++-----------------
 2 files changed, 50 insertions(+), 17 deletions(-)

diff --git a/include/sysemu/iommufd.h b/include/sysemu/iommufd.h
index 9af27ebd6c..6a9fb0007a 100644
--- a/include/sysemu/iommufd.h
+++ b/include/sysemu/iommufd.h
@@ -1,9 +1,23 @@
+/*
+ * iommufd container backend declaration
+ *
+ * Copyright (C) 2024 Intel Corporation.
+ * Copyright Red Hat, Inc. 2024
+ *
+ * Authors: Yi Liu <yi.l.liu@intel.com>
+ *          Eric Auger <eric.auger@redhat.com>
+ *          Zhenzhong Duan <zhenzhong.duan@intel.com>
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
 #ifndef SYSEMU_IOMMUFD_H
 #define SYSEMU_IOMMUFD_H
 
 #include "qom/object.h"
 #include "exec/hwaddr.h"
 #include "exec/cpu-common.h"
+#include "sysemu/host_iommu_device.h"
 
 #define TYPE_IOMMUFD_BACKEND "iommufd"
 OBJECT_DECLARE_TYPE(IOMMUFDBackend, IOMMUFDBackendClass, IOMMUFD_BACKEND)
@@ -33,4 +47,20 @@ int iommufd_backend_map_dma(IOMMUFDBackend *be, uint32_t ioas_id, hwaddr iova,
                             ram_addr_t size, void *vaddr, bool readonly);
 int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id,
                               hwaddr iova, ram_addr_t size);
+
+#define TYPE_HOST_IOMMU_DEVICE_IOMMUFD TYPE_HOST_IOMMU_DEVICE "-iommufd"
+OBJECT_DECLARE_TYPE(HostIOMMUDeviceIOMMUFD, HostIOMMUDeviceIOMMUFDClass,
+                    HOST_IOMMU_DEVICE_IOMMUFD)
+
+/* Abstract of host IOMMU device with iommufd backend */
+struct HostIOMMUDeviceIOMMUFD {
+    HostIOMMUDevice parent_obj;
+
+    IOMMUFDBackend *iommufd;
+    uint32_t devid;
+};
+
+struct HostIOMMUDeviceIOMMUFDClass {
+    HostIOMMUDeviceClass parent_class;
+};
 #endif
diff --git a/backends/iommufd.c b/backends/iommufd.c
index 76a0204852..19e46194a2 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -211,23 +211,26 @@ int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id,
     return ret;
 }
 
-static const TypeInfo iommufd_backend_info = {
-    .name = TYPE_IOMMUFD_BACKEND,
-    .parent = TYPE_OBJECT,
-    .instance_size = sizeof(IOMMUFDBackend),
-    .instance_init = iommufd_backend_init,
-    .instance_finalize = iommufd_backend_finalize,
-    .class_size = sizeof(IOMMUFDBackendClass),
-    .class_init = iommufd_backend_class_init,
-    .interfaces = (InterfaceInfo[]) {
-        { TYPE_USER_CREATABLE },
-        { }
+static const TypeInfo types[] = {
+    {
+        .name = TYPE_IOMMUFD_BACKEND,
+        .parent = TYPE_OBJECT,
+        .instance_size = sizeof(IOMMUFDBackend),
+        .instance_init = iommufd_backend_init,
+        .instance_finalize = iommufd_backend_finalize,
+        .class_size = sizeof(IOMMUFDBackendClass),
+        .class_init = iommufd_backend_class_init,
+        .interfaces = (InterfaceInfo[]) {
+            { TYPE_USER_CREATABLE },
+            { }
+        }
+    }, {
+        .name = TYPE_HOST_IOMMU_DEVICE_IOMMUFD,
+        .parent = TYPE_HOST_IOMMU_DEVICE,
+        .instance_size = sizeof(HostIOMMUDeviceIOMMUFD),
+        .class_size = sizeof(HostIOMMUDeviceIOMMUFDClass),
+        .abstract = true,
     }
 };
 
-static void register_types(void)
-{
-    type_register_static(&iommufd_backend_info);
-}
-
-type_init(register_types);
+DEFINE_TYPES(types)
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 04/19] vfio/iommufd: Introduce HostIOMMUDeviceIOMMUFDVFIO device
  2024-04-29  6:50 [PATCH v3 00/19] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
                   ` (2 preceding siblings ...)
  2024-04-29  6:50 ` [PATCH v3 03/19] backends/iommufd: Introduce abstract HostIOMMUDeviceIOMMUFD device Zhenzhong Duan
@ 2024-04-29  6:50 ` Zhenzhong Duan
  2024-04-30  7:52   ` Cédric Le Goater
  2024-04-29  6:50 ` [PATCH v3 05/19] backends/host_iommu_device: Introduce HostIOMMUDeviceCaps Zhenzhong Duan
                   ` (15 subsequent siblings)
  19 siblings, 1 reply; 59+ messages in thread
From: Zhenzhong Duan @ 2024-04-29  6:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, kevin.tian, yi.l.liu, chao.p.peng,
	Zhenzhong Duan

HostIOMMUDeviceIOMMUFDVFIO represents a host IOMMU device under VFIO
iommufd backend. It will be created during VFIO device attaching and
passed to vIOMMU.

It includes a link to VFIODevice so that we can do VFIO device
specific operations, i.e., [at/de]taching hwpt, etc.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 include/hw/vfio/vfio-common.h | 13 +++++++++++++
 hw/vfio/iommufd.c             |  6 +++++-
 2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index aa3abe0a18..0943add3bc 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -32,6 +32,7 @@
 #include "sysemu/sysemu.h"
 #include "hw/vfio/vfio-container-base.h"
 #include "sysemu/host_iommu_device.h"
+#include "sysemu/iommufd.h"
 
 #define VFIO_MSG_PREFIX "vfio %s: "
 
@@ -159,6 +160,18 @@ struct HostIOMMUDeviceLegacyVFIO {
     VFIODevice *vdev;
 };
 
+#define TYPE_HOST_IOMMU_DEVICE_IOMMUFD_VFIO \
+            TYPE_HOST_IOMMU_DEVICE_IOMMUFD "-vfio"
+OBJECT_DECLARE_SIMPLE_TYPE(HostIOMMUDeviceIOMMUFDVFIO,
+                           HOST_IOMMU_DEVICE_IOMMUFD_VFIO)
+
+/* Abstraction of host IOMMU device with VFIO IOMMUFD backend */
+struct HostIOMMUDeviceIOMMUFDVFIO {
+    HostIOMMUDeviceIOMMUFD parent;
+
+    VFIODevice *vdev;
+};
+
 typedef struct VFIODMABuf {
     QemuDmaBuf buf;
     uint32_t pos_x, pos_y, pos_updates;
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index 8827ffe636..997f4ac43e 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -639,7 +639,11 @@ static const TypeInfo types[] = {
         .name = TYPE_VFIO_IOMMU_IOMMUFD,
         .parent = TYPE_VFIO_IOMMU,
         .class_init = vfio_iommu_iommufd_class_init,
-    },
+    }, {
+        .name = TYPE_HOST_IOMMU_DEVICE_IOMMUFD_VFIO,
+        .parent = TYPE_HOST_IOMMU_DEVICE_IOMMUFD,
+        .instance_size = sizeof(HostIOMMUDeviceIOMMUFDVFIO),
+    }
 };
 
 DEFINE_TYPES(types)
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 05/19] backends/host_iommu_device: Introduce HostIOMMUDeviceCaps
  2024-04-29  6:50 [PATCH v3 00/19] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
                   ` (3 preceding siblings ...)
  2024-04-29  6:50 ` [PATCH v3 04/19] vfio/iommufd: Introduce HostIOMMUDeviceIOMMUFDVFIO device Zhenzhong Duan
@ 2024-04-29  6:50 ` Zhenzhong Duan
  2024-04-30  9:41   ` Cédric Le Goater
  2024-05-07  6:11   ` Cédric Le Goater
  2024-04-29  6:50 ` [PATCH v3 06/19] range: Introduce range_get_last_bit() Zhenzhong Duan
                   ` (14 subsequent siblings)
  19 siblings, 2 replies; 59+ messages in thread
From: Zhenzhong Duan @ 2024-04-29  6:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, kevin.tian, yi.l.liu, chao.p.peng,
	Zhenzhong Duan

HostIOMMUDeviceCaps's elements map to the host IOMMU's capabilities.
Different platform IOMMU can support different elements.

Currently only two elements, type and aw_bits, type hints the host
platform IOMMU type, i.e., INTEL vtd, ARM smmu, etc; aw_bits hints
host IOMMU address width.

Introduce .check_cap() handler to check if HOST_IOMMU_DEVICE_CAP_XXX
is supported.

Introduce a HostIOMMUDevice API host_iommu_device_check_cap() which
is a wrapper of .check_cap().

Introduce a HostIOMMUDevice API host_iommu_device_check_cap_common()
to check common capabalities of different host platform IOMMUs.

Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 include/sysemu/host_iommu_device.h | 44 ++++++++++++++++++++++++++++++
 backends/host_iommu_device.c       | 29 ++++++++++++++++++++
 2 files changed, 73 insertions(+)

diff --git a/include/sysemu/host_iommu_device.h b/include/sysemu/host_iommu_device.h
index 2b58a94d62..12b6afb463 100644
--- a/include/sysemu/host_iommu_device.h
+++ b/include/sysemu/host_iommu_device.h
@@ -14,12 +14,27 @@
 
 #include "qom/object.h"
 #include "qapi/error.h"
+#include "linux/iommufd.h"
+
+/**
+ * struct HostIOMMUDeviceCaps - Define host IOMMU device capabilities.
+ *
+ * @type: host platform IOMMU type.
+ *
+ * @aw_bits: host IOMMU address width. 0xff if no limitation.
+ */
+typedef struct HostIOMMUDeviceCaps {
+    enum iommu_hw_info_type type;
+    uint8_t aw_bits;
+} HostIOMMUDeviceCaps;
 
 #define TYPE_HOST_IOMMU_DEVICE "host-iommu-device"
 OBJECT_DECLARE_TYPE(HostIOMMUDevice, HostIOMMUDeviceClass, HOST_IOMMU_DEVICE)
 
 struct HostIOMMUDevice {
     Object parent_obj;
+
+    HostIOMMUDeviceCaps caps;
 };
 
 /**
@@ -47,5 +62,34 @@ struct HostIOMMUDeviceClass {
      * Returns: true on success, false on failure.
      */
     bool (*realize)(HostIOMMUDevice *hiod, void *opaque, Error **errp);
+    /**
+     * @check_cap: check if a host IOMMU device capability is supported.
+     *
+     * Optional callback, if not implemented, hint not supporting query
+     * of @cap.
+     *
+     * @hiod: pointer to a host IOMMU device instance.
+     *
+     * @cap: capability to check.
+     *
+     * @errp: pass an Error out when fails to query capability.
+     *
+     * Returns: <0 on failure, 0 if a @cap is unsupported, or else
+     * 1 or some positive value for some special @cap,
+     * i.e., HOST_IOMMU_DEVICE_CAP_AW_BITS.
+     */
+    int (*check_cap)(HostIOMMUDevice *hiod, int cap, Error **errp);
 };
+
+/*
+ * Host IOMMU device capability list.
+ */
+#define HOST_IOMMU_DEVICE_CAP_IOMMUFD       0
+#define HOST_IOMMU_DEVICE_CAP_IOMMU_TYPE    1
+#define HOST_IOMMU_DEVICE_CAP_AW_BITS       2
+
+
+int host_iommu_device_check_cap(HostIOMMUDevice *hiod, int cap, Error **errp);
+int host_iommu_device_check_cap_common(HostIOMMUDevice *hiod, int cap,
+                                       Error **errp);
 #endif
diff --git a/backends/host_iommu_device.c b/backends/host_iommu_device.c
index 41f2fdce20..b97d008cc7 100644
--- a/backends/host_iommu_device.c
+++ b/backends/host_iommu_device.c
@@ -28,3 +28,32 @@ static void host_iommu_device_init(Object *obj)
 static void host_iommu_device_finalize(Object *obj)
 {
 }
+
+/* Wrapper of HostIOMMUDeviceClass:check_cap */
+int host_iommu_device_check_cap(HostIOMMUDevice *hiod, int cap, Error **errp)
+{
+    HostIOMMUDeviceClass *hiodc = HOST_IOMMU_DEVICE_GET_CLASS(hiod);
+    if (!hiodc->check_cap) {
+        error_setg(errp, ".check_cap() not implemented");
+        return -EINVAL;
+    }
+
+    return hiodc->check_cap(hiod, cap, errp);
+}
+
+/* Implement check on common IOMMU capabilities */
+int host_iommu_device_check_cap_common(HostIOMMUDevice *hiod, int cap,
+                                       Error **errp)
+{
+    HostIOMMUDeviceCaps *caps = &hiod->caps;
+
+    switch (cap) {
+    case HOST_IOMMU_DEVICE_CAP_IOMMU_TYPE:
+        return caps->type;
+    case HOST_IOMMU_DEVICE_CAP_AW_BITS:
+        return caps->aw_bits;
+    default:
+        error_setg(errp, "Not support query cap %x", cap);
+        return -EINVAL;
+    }
+}
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 06/19] range: Introduce range_get_last_bit()
  2024-04-29  6:50 [PATCH v3 00/19] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
                   ` (4 preceding siblings ...)
  2024-04-29  6:50 ` [PATCH v3 05/19] backends/host_iommu_device: Introduce HostIOMMUDeviceCaps Zhenzhong Duan
@ 2024-04-29  6:50 ` Zhenzhong Duan
  2024-04-30  9:41   ` Cédric Le Goater
  2024-04-29  6:50 ` [PATCH v3 07/19] vfio/container: Implement HostIOMMUDeviceClass::realize() handler Zhenzhong Duan
                   ` (13 subsequent siblings)
  19 siblings, 1 reply; 59+ messages in thread
From: Zhenzhong Duan @ 2024-04-29  6:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, kevin.tian, yi.l.liu, chao.p.peng,
	Zhenzhong Duan

This helper get the highest 1 bit position of the upper bound.

If the range is empty or upper bound is zero, -1 is returned.

Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 include/qemu/range.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/include/qemu/range.h b/include/qemu/range.h
index 205e1da76d..8e05bc1d9f 100644
--- a/include/qemu/range.h
+++ b/include/qemu/range.h
@@ -20,6 +20,8 @@
 #ifndef QEMU_RANGE_H
 #define QEMU_RANGE_H
 
+#include "qemu/bitops.h"
+
 /*
  * Operations on 64 bit address ranges.
  * Notes:
@@ -217,6 +219,15 @@ static inline int ranges_overlap(uint64_t first1, uint64_t len1,
     return !(last2 < first1 || last1 < first2);
 }
 
+/* Get highest non-zero bit position of a range */
+static inline int range_get_last_bit(Range *range)
+{
+    if (range_is_empty(range) || !range->upb) {
+        return -1;
+    }
+    return find_last_bit(&range->upb, sizeof(range->upb));
+}
+
 /*
  * Return -1 if @a < @b, 1 @a > @b, and 0 if they touch or overlap.
  * Both @a and @b must not be empty.
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 07/19] vfio/container: Implement HostIOMMUDeviceClass::realize() handler
  2024-04-29  6:50 [PATCH v3 00/19] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
                   ` (5 preceding siblings ...)
  2024-04-29  6:50 ` [PATCH v3 06/19] range: Introduce range_get_last_bit() Zhenzhong Duan
@ 2024-04-29  6:50 ` Zhenzhong Duan
  2024-04-30  9:41   ` Cédric Le Goater
  2024-04-29  6:50 ` [PATCH v3 08/19] backends/iommufd: Introduce helper function iommufd_backend_get_device_info() Zhenzhong Duan
                   ` (12 subsequent siblings)
  19 siblings, 1 reply; 59+ messages in thread
From: Zhenzhong Duan @ 2024-04-29  6:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, kevin.tian, yi.l.liu, chao.p.peng,
	Zhenzhong Duan

Utilize range_get_last_bit() to get host IOMMU address width and
package it in HostIOMMUDeviceCaps for query with .check_cap().

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 hw/vfio/container.c | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index 3b6826996a..863eec3943 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -1143,6 +1143,34 @@ static void vfio_iommu_legacy_class_init(ObjectClass *klass, void *data)
     vioc->pci_hot_reset = vfio_legacy_pci_hot_reset;
 };
 
+static bool hiod_legacy_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
+                                     Error **errp)
+{
+    VFIODevice *vdev = opaque;
+    /* iova_ranges is a sorted list */
+    GList *l = g_list_last(vdev->bcontainer->iova_ranges);
+
+    /* There is no VFIO uAPI to query host platform IOMMU type */
+    hiod->caps.type = IOMMU_HW_INFO_TYPE_NONE;
+    HOST_IOMMU_DEVICE_IOMMUFD_VFIO(hiod)->vdev = vdev;
+
+    if (l) {
+        Range *range = l->data;
+        hiod->caps.aw_bits = range_get_last_bit(range) + 1;
+    } else {
+        hiod->caps.aw_bits = 0xff;
+    }
+
+    return true;
+}
+
+static void hiod_legacy_vfio_class_init(ObjectClass *oc, void *data)
+{
+    HostIOMMUDeviceClass *hioc = HOST_IOMMU_DEVICE_CLASS(oc);
+
+    hioc->realize = hiod_legacy_vfio_realize;
+};
+
 static const TypeInfo types[] = {
     {
         .name = TYPE_VFIO_IOMMU_LEGACY,
@@ -1152,6 +1180,7 @@ static const TypeInfo types[] = {
         .name = TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO,
         .parent = TYPE_HOST_IOMMU_DEVICE,
         .instance_size = sizeof(HostIOMMUDeviceLegacyVFIO),
+        .class_init = hiod_legacy_vfio_class_init,
     }
 };
 
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 08/19] backends/iommufd: Introduce helper function iommufd_backend_get_device_info()
  2024-04-29  6:50 [PATCH v3 00/19] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
                   ` (6 preceding siblings ...)
  2024-04-29  6:50 ` [PATCH v3 07/19] vfio/container: Implement HostIOMMUDeviceClass::realize() handler Zhenzhong Duan
@ 2024-04-29  6:50 ` Zhenzhong Duan
  2024-04-30  9:41   ` Cédric Le Goater
  2024-04-29  6:50 ` [PATCH v3 09/19] vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler Zhenzhong Duan
                   ` (11 subsequent siblings)
  19 siblings, 1 reply; 59+ messages in thread
From: Zhenzhong Duan @ 2024-04-29  6:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, kevin.tian, yi.l.liu, chao.p.peng,
	Zhenzhong Duan, Yi Sun

Introduce a helper function iommufd_backend_get_device_info() to get
host IOMMU related information through iommufd uAPI.

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 include/sysemu/iommufd.h |  4 ++++
 backends/iommufd.c       | 24 +++++++++++++++++++++++-
 2 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/include/sysemu/iommufd.h b/include/sysemu/iommufd.h
index 6a9fb0007a..e9593637a3 100644
--- a/include/sysemu/iommufd.h
+++ b/include/sysemu/iommufd.h
@@ -17,6 +17,7 @@
 #include "qom/object.h"
 #include "exec/hwaddr.h"
 #include "exec/cpu-common.h"
+#include <linux/iommufd.h>
 #include "sysemu/host_iommu_device.h"
 
 #define TYPE_IOMMUFD_BACKEND "iommufd"
@@ -47,6 +48,9 @@ int iommufd_backend_map_dma(IOMMUFDBackend *be, uint32_t ioas_id, hwaddr iova,
                             ram_addr_t size, void *vaddr, bool readonly);
 int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id,
                               hwaddr iova, ram_addr_t size);
+int iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
+                                    enum iommu_hw_info_type *type,
+                                    void *data, uint32_t len, Error **errp);
 
 #define TYPE_HOST_IOMMU_DEVICE_IOMMUFD TYPE_HOST_IOMMU_DEVICE "-iommufd"
 OBJECT_DECLARE_TYPE(HostIOMMUDeviceIOMMUFD, HostIOMMUDeviceIOMMUFDClass,
diff --git a/backends/iommufd.c b/backends/iommufd.c
index 19e46194a2..d61209788a 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -19,7 +19,6 @@
 #include "monitor/monitor.h"
 #include "trace.h"
 #include <sys/ioctl.h>
-#include <linux/iommufd.h>
 
 static void iommufd_backend_init(Object *obj)
 {
@@ -211,6 +210,29 @@ int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id,
     return ret;
 }
 
+int iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
+                                    enum iommu_hw_info_type *type,
+                                    void *data, uint32_t len, Error **errp)
+{
+    struct iommu_hw_info info = {
+        .size = sizeof(info),
+        .dev_id = devid,
+        .data_len = len,
+        .data_uptr = (uintptr_t)data,
+    };
+    int ret;
+
+    ret = ioctl(be->fd, IOMMU_GET_HW_INFO, &info);
+    if (ret) {
+        error_setg_errno(errp, errno, "Failed to get hardware info");
+    } else {
+        g_assert(type);
+        *type = info.out_data_type;
+    }
+
+    return ret;
+}
+
 static const TypeInfo types[] = {
     {
         .name = TYPE_IOMMUFD_BACKEND,
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 09/19] vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler
  2024-04-29  6:50 [PATCH v3 00/19] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
                   ` (7 preceding siblings ...)
  2024-04-29  6:50 ` [PATCH v3 08/19] backends/iommufd: Introduce helper function iommufd_backend_get_device_info() Zhenzhong Duan
@ 2024-04-29  6:50 ` Zhenzhong Duan
  2024-04-29  6:50 ` [PATCH v3 10/19] vfio/container: Implement HostIOMMUDeviceClass::check_cap() handler Zhenzhong Duan
                   ` (10 subsequent siblings)
  19 siblings, 0 replies; 59+ messages in thread
From: Zhenzhong Duan @ 2024-04-29  6:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, kevin.tian, yi.l.liu, chao.p.peng,
	Zhenzhong Duan, Marcel Apfelbaum

It calls iommufd_backend_get_device_info() to get host IOMMU
related information and translate it into HostIOMMUDeviceCaps
for query with .check_cap().

Introduce macro VTD_MGAW_FROM_CAP to get MGAW which equals to
(aw_bits - 1).

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 include/hw/i386/intel_iommu.h |  1 +
 hw/vfio/iommufd.c             | 44 +++++++++++++++++++++++++++++++++++
 2 files changed, 45 insertions(+)

diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index 7fa0a695c8..7d694b0813 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -47,6 +47,7 @@ OBJECT_DECLARE_SIMPLE_TYPE(IntelIOMMUState, INTEL_IOMMU_DEVICE)
 #define VTD_HOST_AW_48BIT           48
 #define VTD_HOST_ADDRESS_WIDTH      VTD_HOST_AW_39BIT
 #define VTD_HAW_MASK(aw)            ((1ULL << (aw)) - 1)
+#define VTD_MGAW_FROM_CAP(cap)      ((cap >> 16) & 0x3fULL)
 
 #define DMAR_REPORT_F_INTR          (1)
 
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index 997f4ac43e..6bc2dc68f6 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -25,6 +25,7 @@
 #include "qemu/cutils.h"
 #include "qemu/chardev_open.h"
 #include "pci.h"
+#include "hw/i386/intel_iommu_internal.h"
 
 static int iommufd_cdev_map(const VFIOContainerBase *bcontainer, hwaddr iova,
                             ram_addr_t size, void *vaddr, bool readonly)
@@ -634,6 +635,48 @@ static void vfio_iommu_iommufd_class_init(ObjectClass *klass, void *data)
     vioc->pci_hot_reset = iommufd_cdev_pci_hot_reset;
 };
 
+static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
+                                      Error **errp)
+{
+    VFIODevice *vdev = opaque;
+    HostIOMMUDeviceIOMMUFD *idev = HOST_IOMMU_DEVICE_IOMMUFD(hiod);
+    HostIOMMUDeviceCaps *caps = &hiod->caps;
+    enum iommu_hw_info_type type;
+    union {
+        struct iommu_hw_info_vtd vtd;
+    } data;
+    int ret;
+
+    HOST_IOMMU_DEVICE_IOMMUFD_VFIO(hiod)->vdev = vdev;
+    idev->iommufd = vdev->iommufd;
+    idev->devid = vdev->devid;
+
+    ret = iommufd_backend_get_device_info(idev->iommufd, idev->devid,
+                                          &type, &data, sizeof(data), errp);
+    if (ret) {
+        return false;
+    }
+
+    caps->type = type;
+
+    switch (type) {
+    case IOMMU_HW_INFO_TYPE_INTEL_VTD:
+        caps->aw_bits = VTD_MGAW_FROM_CAP(data.vtd.cap_reg) + 1;
+        break;
+    case IOMMU_HW_INFO_TYPE_NONE:
+        break;
+    }
+
+    return true;
+}
+
+static void hiod_iommufd_vfio_class_init(ObjectClass *oc, void *data)
+{
+    HostIOMMUDeviceClass *hiodc = HOST_IOMMU_DEVICE_CLASS(oc);
+
+    hiodc->realize = hiod_iommufd_vfio_realize;
+};
+
 static const TypeInfo types[] = {
     {
         .name = TYPE_VFIO_IOMMU_IOMMUFD,
@@ -643,6 +686,7 @@ static const TypeInfo types[] = {
         .name = TYPE_HOST_IOMMU_DEVICE_IOMMUFD_VFIO,
         .parent = TYPE_HOST_IOMMU_DEVICE_IOMMUFD,
         .instance_size = sizeof(HostIOMMUDeviceIOMMUFDVFIO),
+        .class_init = hiod_iommufd_vfio_class_init,
     }
 };
 
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 10/19] vfio/container: Implement HostIOMMUDeviceClass::check_cap() handler
  2024-04-29  6:50 [PATCH v3 00/19] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
                   ` (8 preceding siblings ...)
  2024-04-29  6:50 ` [PATCH v3 09/19] vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler Zhenzhong Duan
@ 2024-04-29  6:50 ` Zhenzhong Duan
  2024-04-29  6:50 ` [PATCH v3 11/19] backends/iommufd: " Zhenzhong Duan
                   ` (9 subsequent siblings)
  19 siblings, 0 replies; 59+ messages in thread
From: Zhenzhong Duan @ 2024-04-29  6:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, kevin.tian, yi.l.liu, chao.p.peng,
	Zhenzhong Duan

Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 hw/vfio/container.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index 863eec3943..3683487605 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -1164,11 +1164,23 @@ static bool hiod_legacy_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
     return true;
 }
 
+static int hiod_legacy_vfio_check_cap(HostIOMMUDevice *hiod, int cap,
+                                      Error **errp)
+{
+    switch (cap) {
+    case HOST_IOMMU_DEVICE_CAP_IOMMUFD:
+        return 0;
+    default:
+        return host_iommu_device_check_cap_common(hiod, cap, errp);
+    }
+}
+
 static void hiod_legacy_vfio_class_init(ObjectClass *oc, void *data)
 {
     HostIOMMUDeviceClass *hioc = HOST_IOMMU_DEVICE_CLASS(oc);
 
     hioc->realize = hiod_legacy_vfio_realize;
+    hioc->check_cap = hiod_legacy_vfio_check_cap;
 };
 
 static const TypeInfo types[] = {
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 11/19] backends/iommufd: Implement HostIOMMUDeviceClass::check_cap() handler
  2024-04-29  6:50 [PATCH v3 00/19] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
                   ` (9 preceding siblings ...)
  2024-04-29  6:50 ` [PATCH v3 10/19] vfio/container: Implement HostIOMMUDeviceClass::check_cap() handler Zhenzhong Duan
@ 2024-04-29  6:50 ` Zhenzhong Duan
  2024-04-30  9:41   ` Cédric Le Goater
  2024-04-29  6:50 ` [PATCH v3 12/19] vfio: Introduce VFIOIOMMUClass::hiod_typename attribute Zhenzhong Duan
                   ` (8 subsequent siblings)
  19 siblings, 1 reply; 59+ messages in thread
From: Zhenzhong Duan @ 2024-04-29  6:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, kevin.tian, yi.l.liu, chao.p.peng,
	Zhenzhong Duan

Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 backends/iommufd.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/backends/iommufd.c b/backends/iommufd.c
index d61209788a..28faec528e 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -233,6 +233,23 @@ int iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
     return ret;
 }
 
+static int hiod_iommufd_check_cap(HostIOMMUDevice *hiod, int cap, Error **errp)
+{
+    switch (cap) {
+    case HOST_IOMMU_DEVICE_CAP_IOMMUFD:
+        return 1;
+    default:
+        return host_iommu_device_check_cap_common(hiod, cap, errp);
+    }
+}
+
+static void hiod_iommufd_class_init(ObjectClass *oc, void *data)
+{
+    HostIOMMUDeviceClass *hioc = HOST_IOMMU_DEVICE_CLASS(oc);
+
+    hioc->check_cap = hiod_iommufd_check_cap;
+};
+
 static const TypeInfo types[] = {
     {
         .name = TYPE_IOMMUFD_BACKEND,
@@ -251,6 +268,7 @@ static const TypeInfo types[] = {
         .parent = TYPE_HOST_IOMMU_DEVICE,
         .instance_size = sizeof(HostIOMMUDeviceIOMMUFD),
         .class_size = sizeof(HostIOMMUDeviceIOMMUFDClass),
+        .class_init = hiod_iommufd_class_init,
         .abstract = true,
     }
 };
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 12/19] vfio: Introduce VFIOIOMMUClass::hiod_typename attribute
  2024-04-29  6:50 [PATCH v3 00/19] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
                   ` (10 preceding siblings ...)
  2024-04-29  6:50 ` [PATCH v3 11/19] backends/iommufd: " Zhenzhong Duan
@ 2024-04-29  6:50 ` Zhenzhong Duan
  2024-04-29  6:50 ` [PATCH v3 13/19] vfio: Create host IOMMU device instance Zhenzhong Duan
                   ` (7 subsequent siblings)
  19 siblings, 0 replies; 59+ messages in thread
From: Zhenzhong Duan @ 2024-04-29  6:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, kevin.tian, yi.l.liu, chao.p.peng,
	Zhenzhong Duan

Initialize attribute VFIOIOMMUClass::hiod_typename based on
VFIO backend type.

This attribute will facilitate HostIOMMUDevice creation in
vfio_attach_device().

Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 include/hw/vfio/vfio-container-base.h | 3 +++
 hw/vfio/container.c                   | 2 ++
 hw/vfio/iommufd.c                     | 2 ++
 3 files changed, 7 insertions(+)

diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-container-base.h
index 3582d5f97a..c387f0d8a4 100644
--- a/include/hw/vfio/vfio-container-base.h
+++ b/include/hw/vfio/vfio-container-base.h
@@ -110,6 +110,9 @@ DECLARE_CLASS_CHECKERS(VFIOIOMMUClass, VFIO_IOMMU, TYPE_VFIO_IOMMU)
 struct VFIOIOMMUClass {
     InterfaceClass parent_class;
 
+    /* Properties */
+    const char *hiod_typename;
+
     /* basic feature */
     int (*setup)(VFIOContainerBase *bcontainer, Error **errp);
     int (*dma_map)(const VFIOContainerBase *bcontainer,
diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index 3683487605..57c814fcd5 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -1133,6 +1133,8 @@ static void vfio_iommu_legacy_class_init(ObjectClass *klass, void *data)
 {
     VFIOIOMMUClass *vioc = VFIO_IOMMU_CLASS(klass);
 
+    vioc->hiod_typename = TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO;
+
     vioc->setup = vfio_legacy_setup;
     vioc->dma_map = vfio_legacy_dma_map;
     vioc->dma_unmap = vfio_legacy_dma_unmap;
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index 6bc2dc68f6..1ac7dea789 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -628,6 +628,8 @@ static void vfio_iommu_iommufd_class_init(ObjectClass *klass, void *data)
 {
     VFIOIOMMUClass *vioc = VFIO_IOMMU_CLASS(klass);
 
+    vioc->hiod_typename = TYPE_HOST_IOMMU_DEVICE_IOMMUFD_VFIO;
+
     vioc->dma_map = iommufd_cdev_map;
     vioc->dma_unmap = iommufd_cdev_unmap;
     vioc->attach_device = iommufd_cdev_attach;
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 13/19] vfio: Create host IOMMU device instance
  2024-04-29  6:50 [PATCH v3 00/19] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
                   ` (11 preceding siblings ...)
  2024-04-29  6:50 ` [PATCH v3 12/19] vfio: Introduce VFIOIOMMUClass::hiod_typename attribute Zhenzhong Duan
@ 2024-04-29  6:50 ` Zhenzhong Duan
  2024-04-30  9:41   ` Cédric Le Goater
  2024-04-29  6:50 ` [PATCH v3 14/19] hw/pci: Introduce helper function pci_device_get_iommu_bus_devfn() Zhenzhong Duan
                   ` (6 subsequent siblings)
  19 siblings, 1 reply; 59+ messages in thread
From: Zhenzhong Duan @ 2024-04-29  6:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, kevin.tian, yi.l.liu, chao.p.peng,
	Zhenzhong Duan

Create host IOMMU device instance in vfio_attach_device() and call
.realize() to initialize it further.

Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 include/hw/vfio/vfio-common.h |  1 +
 hw/vfio/common.c              | 18 +++++++++++++++++-
 2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 0943add3bc..b204b93a55 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -126,6 +126,7 @@ typedef struct VFIODevice {
     OnOffAuto pre_copy_dirty_page_tracking;
     bool dirty_pages_supported;
     bool dirty_tracking;
+    HostIOMMUDevice *hiod;
     int devid;
     IOMMUFDBackend *iommufd;
 } VFIODevice;
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 8f9cbdc026..0be8b70ebd 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1497,6 +1497,8 @@ int vfio_attach_device(char *name, VFIODevice *vbasedev,
 {
     const VFIOIOMMUClass *ops =
         VFIO_IOMMU_CLASS(object_class_by_name(TYPE_VFIO_IOMMU_LEGACY));
+    HostIOMMUDevice *hiod;
+    int ret;
 
     if (vbasedev->iommufd) {
         ops = VFIO_IOMMU_CLASS(object_class_by_name(TYPE_VFIO_IOMMU_IOMMUFD));
@@ -1504,7 +1506,20 @@ int vfio_attach_device(char *name, VFIODevice *vbasedev,
 
     assert(ops);
 
-    return ops->attach_device(name, vbasedev, as, errp);
+    ret = ops->attach_device(name, vbasedev, as, errp);
+    if (ret < 0) {
+        return ret;
+    }
+
+    hiod = HOST_IOMMU_DEVICE(object_new(ops->hiod_typename));
+    if (!HOST_IOMMU_DEVICE_GET_CLASS(hiod)->realize(hiod, vbasedev, errp)) {
+        object_unref(hiod);
+        ops->detach_device(vbasedev);
+        return -EINVAL;
+    }
+    vbasedev->hiod = hiod;
+
+    return 0;
 }
 
 void vfio_detach_device(VFIODevice *vbasedev)
@@ -1512,5 +1527,6 @@ void vfio_detach_device(VFIODevice *vbasedev)
     if (!vbasedev->bcontainer) {
         return;
     }
+    object_unref(vbasedev->hiod);
     vbasedev->bcontainer->ops->detach_device(vbasedev);
 }
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 14/19] hw/pci: Introduce helper function pci_device_get_iommu_bus_devfn()
  2024-04-29  6:50 [PATCH v3 00/19] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
                   ` (12 preceding siblings ...)
  2024-04-29  6:50 ` [PATCH v3 13/19] vfio: Create host IOMMU device instance Zhenzhong Duan
@ 2024-04-29  6:50 ` Zhenzhong Duan
  2024-04-29  6:50 ` [PATCH v3 15/19] hw/pci: Introduce pci_device_[set|unset]_iommu_device() Zhenzhong Duan
                   ` (5 subsequent siblings)
  19 siblings, 0 replies; 59+ messages in thread
From: Zhenzhong Duan @ 2024-04-29  6:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, kevin.tian, yi.l.liu, chao.p.peng,
	Zhenzhong Duan, Yi Sun, Marcel Apfelbaum

Extract out pci_device_get_iommu_bus_devfn() from
pci_device_iommu_address_space() to facilitate
implementation of pci_device_[set|unset]_iommu_device()
in following patch.

No functional change intended.

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 hw/pci/pci.c | 48 +++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 45 insertions(+), 3 deletions(-)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 324c1302d2..02a4bb2af6 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2648,11 +2648,27 @@ static void pci_device_class_base_init(ObjectClass *klass, void *data)
     }
 }
 
-AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
+/*
+ * Get IOMMU root bus, aliased bus and devfn of a PCI device
+ *
+ * IOMMU root bus is needed by all call sites to call into iommu_ops.
+ * For call sites which don't need aliased BDF, passing NULL to
+ * aliased_[bus|devfn] is allowed.
+ *
+ * @piommu_bus: return root #PCIBus backed by an IOMMU for the PCI device.
+ *
+ * @aliased_bus: return aliased #PCIBus of the PCI device, optional.
+ *
+ * @aliased_devfn: return aliased devfn of the PCI device, optional.
+ */
+static void pci_device_get_iommu_bus_devfn(PCIDevice *dev,
+                                           PCIBus **piommu_bus,
+                                           PCIBus **aliased_bus,
+                                           int *aliased_devfn)
 {
     PCIBus *bus = pci_get_bus(dev);
     PCIBus *iommu_bus = bus;
-    uint8_t devfn = dev->devfn;
+    int devfn = dev->devfn;
 
     while (iommu_bus && !iommu_bus->iommu_ops && iommu_bus->parent_dev) {
         PCIBus *parent_bus = pci_get_bus(iommu_bus->parent_dev);
@@ -2693,7 +2709,33 @@ AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
 
         iommu_bus = parent_bus;
     }
-    if (!pci_bus_bypass_iommu(bus) && iommu_bus->iommu_ops) {
+
+    assert(0 <= devfn && devfn < PCI_DEVFN_MAX);
+    assert(iommu_bus);
+
+    if (pci_bus_bypass_iommu(bus) || !iommu_bus->iommu_ops) {
+        iommu_bus = NULL;
+    }
+
+    *piommu_bus = iommu_bus;
+
+    if (aliased_bus) {
+        *aliased_bus = bus;
+    }
+
+    if (aliased_devfn) {
+        *aliased_devfn = devfn;
+    }
+}
+
+AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
+{
+    PCIBus *bus;
+    PCIBus *iommu_bus;
+    int devfn;
+
+    pci_device_get_iommu_bus_devfn(dev, &iommu_bus, &bus, &devfn);
+    if (iommu_bus) {
         return iommu_bus->iommu_ops->get_address_space(bus,
                                  iommu_bus->iommu_opaque, devfn);
     }
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 15/19] hw/pci: Introduce pci_device_[set|unset]_iommu_device()
  2024-04-29  6:50 [PATCH v3 00/19] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
                   ` (13 preceding siblings ...)
  2024-04-29  6:50 ` [PATCH v3 14/19] hw/pci: Introduce helper function pci_device_get_iommu_bus_devfn() Zhenzhong Duan
@ 2024-04-29  6:50 ` Zhenzhong Duan
  2024-05-07  7:04   ` Cédric Le Goater
  2024-04-29  6:50 ` [PATCH v3 16/19] vfio/pci: Pass HostIOMMUDevice to vIOMMU Zhenzhong Duan
                   ` (4 subsequent siblings)
  19 siblings, 1 reply; 59+ messages in thread
From: Zhenzhong Duan @ 2024-04-29  6:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, kevin.tian, yi.l.liu, chao.p.peng,
	Yi Sun, Zhenzhong Duan, Marcel Apfelbaum

From: Yi Liu <yi.l.liu@intel.com>

pci_device_[set|unset]_iommu_device() call pci_device_get_iommu_bus_devfn()
to get iommu_bus->iommu_ops and call [set|unset]_iommu_device callback to
set/unset HostIOMMUDevice for a given PCI device.

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 include/hw/pci/pci.h | 38 +++++++++++++++++++++++++++++++++++++-
 hw/pci/pci.c         | 27 +++++++++++++++++++++++++++
 2 files changed, 64 insertions(+), 1 deletion(-)

diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index eaa3fc99d8..849e391813 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -3,6 +3,7 @@
 
 #include "exec/memory.h"
 #include "sysemu/dma.h"
+#include "sysemu/host_iommu_device.h"
 
 /* PCI includes legacy ISA access.  */
 #include "hw/isa/isa.h"
@@ -383,10 +384,45 @@ typedef struct PCIIOMMUOps {
      *
      * @devfn: device and function number
      */
-   AddressSpace * (*get_address_space)(PCIBus *bus, void *opaque, int devfn);
+    AddressSpace * (*get_address_space)(PCIBus *bus, void *opaque, int devfn);
+    /**
+     * @set_iommu_device: attach a HostIOMMUDevice to a vIOMMU
+     *
+     * Optional callback, if not implemented in vIOMMU, then vIOMMU can't
+     * retrieve host information from the associated HostIOMMUDevice.
+     *
+     * @bus: the #PCIBus of the PCI device.
+     *
+     * @opaque: the data passed to pci_setup_iommu().
+     *
+     * @devfn: device and function number of the PCI device.
+     *
+     * @dev: the data structure representing host IOMMU device.
+     *
+     * @errp: pass an Error out only when return false
+     *
+     * Returns: 0 if HostIOMMUDevice is attached, or else <0 with errp set.
+     */
+    int (*set_iommu_device)(PCIBus *bus, void *opaque, int devfn,
+                            HostIOMMUDevice *dev, Error **errp);
+    /**
+     * @unset_iommu_device: detach a HostIOMMUDevice from a vIOMMU
+     *
+     * Optional callback.
+     *
+     * @bus: the #PCIBus of the PCI device.
+     *
+     * @opaque: the data passed to pci_setup_iommu().
+     *
+     * @devfn: device and function number of the PCI device.
+     */
+    void (*unset_iommu_device)(PCIBus *bus, void *opaque, int devfn);
 } PCIIOMMUOps;
 
 AddressSpace *pci_device_iommu_address_space(PCIDevice *dev);
+int pci_device_set_iommu_device(PCIDevice *dev, HostIOMMUDevice *hiod,
+                                Error **errp);
+void pci_device_unset_iommu_device(PCIDevice *dev);
 
 /**
  * pci_setup_iommu: Initialize specific IOMMU handlers for a PCIBus
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 02a4bb2af6..c3293e9357 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2742,6 +2742,33 @@ AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
     return &address_space_memory;
 }
 
+int pci_device_set_iommu_device(PCIDevice *dev, HostIOMMUDevice *hiod,
+                                Error **errp)
+{
+    PCIBus *iommu_bus;
+
+    /* set_iommu_device requires device's direct BDF instead of aliased BDF */
+    pci_device_get_iommu_bus_devfn(dev, &iommu_bus, NULL, NULL);
+    if (iommu_bus && iommu_bus->iommu_ops->set_iommu_device) {
+        return iommu_bus->iommu_ops->set_iommu_device(pci_get_bus(dev),
+                                                      iommu_bus->iommu_opaque,
+                                                      dev->devfn, hiod, errp);
+    }
+    return 0;
+}
+
+void pci_device_unset_iommu_device(PCIDevice *dev)
+{
+    PCIBus *iommu_bus;
+
+    pci_device_get_iommu_bus_devfn(dev, &iommu_bus, NULL, NULL);
+    if (iommu_bus && iommu_bus->iommu_ops->unset_iommu_device) {
+        return iommu_bus->iommu_ops->unset_iommu_device(pci_get_bus(dev),
+                                                        iommu_bus->iommu_opaque,
+                                                        dev->devfn);
+    }
+}
+
 void pci_setup_iommu(PCIBus *bus, const PCIIOMMUOps *ops, void *opaque)
 {
     /*
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 16/19] vfio/pci: Pass HostIOMMUDevice to vIOMMU
  2024-04-29  6:50 [PATCH v3 00/19] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
                   ` (14 preceding siblings ...)
  2024-04-29  6:50 ` [PATCH v3 15/19] hw/pci: Introduce pci_device_[set|unset]_iommu_device() Zhenzhong Duan
@ 2024-04-29  6:50 ` Zhenzhong Duan
  2024-04-29  6:50 ` [PATCH v3 17/19] intel_iommu: Extract out vtd_cap_init() to initialize cap/ecap Zhenzhong Duan
                   ` (3 subsequent siblings)
  19 siblings, 0 replies; 59+ messages in thread
From: Zhenzhong Duan @ 2024-04-29  6:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, kevin.tian, yi.l.liu, chao.p.peng,
	Zhenzhong Duan, Yi Sun

With HostIOMMUDevice passed, vIOMMU can check compatibility with host
IOMMU, call into IOMMUFD specific methods, etc.

Originally-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 hw/vfio/pci.c | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 64780d1b79..224501a86e 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -3111,11 +3111,17 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
 
     vfio_bars_register(vdev);
 
-    ret = vfio_add_capabilities(vdev, errp);
+    ret = pci_device_set_iommu_device(pdev, vbasedev->hiod, errp);
     if (ret) {
+        error_prepend(errp, "Failed to set iommu_device: ");
         goto out_teardown;
     }
 
+    ret = vfio_add_capabilities(vdev, errp);
+    if (ret) {
+        goto out_unset_idev;
+    }
+
     if (vdev->vga) {
         vfio_vga_quirk_setup(vdev);
     }
@@ -3132,7 +3138,7 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
             error_setg(errp,
                        "cannot support IGD OpRegion feature on hotplugged "
                        "device");
-            goto out_teardown;
+            goto out_unset_idev;
         }
 
         ret = vfio_get_dev_region_info(vbasedev,
@@ -3141,13 +3147,13 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
         if (ret) {
             error_setg_errno(errp, -ret,
                              "does not support requested IGD OpRegion feature");
-            goto out_teardown;
+            goto out_unset_idev;
         }
 
         ret = vfio_pci_igd_opregion_init(vdev, opregion, errp);
         g_free(opregion);
         if (ret) {
-            goto out_teardown;
+            goto out_unset_idev;
         }
     }
 
@@ -3233,6 +3239,8 @@ out_deregister:
     if (vdev->intx.mmap_timer) {
         timer_free(vdev->intx.mmap_timer);
     }
+out_unset_idev:
+    pci_device_unset_iommu_device(pdev);
 out_teardown:
     vfio_teardown_msi(vdev);
     vfio_bars_exit(vdev);
@@ -3261,6 +3269,7 @@ static void vfio_instance_finalize(Object *obj)
 static void vfio_exitfn(PCIDevice *pdev)
 {
     VFIOPCIDevice *vdev = VFIO_PCI(pdev);
+    VFIODevice *vbasedev = &vdev->vbasedev;
 
     vfio_unregister_req_notifier(vdev);
     vfio_unregister_err_notifier(vdev);
@@ -3275,7 +3284,8 @@ static void vfio_exitfn(PCIDevice *pdev)
     vfio_teardown_msi(vdev);
     vfio_pci_disable_rp_atomics(vdev);
     vfio_bars_exit(vdev);
-    vfio_migration_exit(&vdev->vbasedev);
+    vfio_migration_exit(vbasedev);
+    pci_device_unset_iommu_device(pdev);
 }
 
 static void vfio_pci_reset(DeviceState *dev)
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 17/19] intel_iommu: Extract out vtd_cap_init() to initialize cap/ecap
  2024-04-29  6:50 [PATCH v3 00/19] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
                   ` (15 preceding siblings ...)
  2024-04-29  6:50 ` [PATCH v3 16/19] vfio/pci: Pass HostIOMMUDevice to vIOMMU Zhenzhong Duan
@ 2024-04-29  6:50 ` Zhenzhong Duan
  2024-04-29  6:50 ` [PATCH v3 18/19] intel_iommu: Implement [set|unset]_iommu_device() callbacks Zhenzhong Duan
                   ` (2 subsequent siblings)
  19 siblings, 0 replies; 59+ messages in thread
From: Zhenzhong Duan @ 2024-04-29  6:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, kevin.tian, yi.l.liu, chao.p.peng,
	Zhenzhong Duan, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Marcel Apfelbaum

Extract cap/ecap initialization in vtd_cap_init() to make code
cleaner.

No functional change intended.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 hw/i386/intel_iommu.c | 93 ++++++++++++++++++++++++-------------------
 1 file changed, 51 insertions(+), 42 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index cc8e59674e..519063c8f8 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -3934,30 +3934,10 @@ static void vtd_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n)
     return;
 }
 
-/* Do the initialization. It will also be called when reset, so pay
- * attention when adding new initialization stuff.
- */
-static void vtd_init(IntelIOMMUState *s)
+static void vtd_cap_init(IntelIOMMUState *s)
 {
     X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s);
 
-    memset(s->csr, 0, DMAR_REG_SIZE);
-    memset(s->wmask, 0, DMAR_REG_SIZE);
-    memset(s->w1cmask, 0, DMAR_REG_SIZE);
-    memset(s->womask, 0, DMAR_REG_SIZE);
-
-    s->root = 0;
-    s->root_scalable = false;
-    s->dmar_enabled = false;
-    s->intr_enabled = false;
-    s->iq_head = 0;
-    s->iq_tail = 0;
-    s->iq = 0;
-    s->iq_size = 0;
-    s->qi_enabled = false;
-    s->iq_last_desc_type = VTD_INV_DESC_NONE;
-    s->iq_dw = false;
-    s->next_frcd_reg = 0;
     s->cap = VTD_CAP_FRO | VTD_CAP_NFR | VTD_CAP_ND |
              VTD_CAP_MAMV | VTD_CAP_PSI | VTD_CAP_SLLPS |
              VTD_CAP_MGAW(s->aw_bits);
@@ -3974,27 +3954,6 @@ static void vtd_init(IntelIOMMUState *s)
     }
     s->ecap = VTD_ECAP_QI | VTD_ECAP_IRO;
 
-    /*
-     * Rsvd field masks for spte
-     */
-    vtd_spte_rsvd[0] = ~0ULL;
-    vtd_spte_rsvd[1] = VTD_SPTE_PAGE_L1_RSVD_MASK(s->aw_bits,
-                                                  x86_iommu->dt_supported);
-    vtd_spte_rsvd[2] = VTD_SPTE_PAGE_L2_RSVD_MASK(s->aw_bits);
-    vtd_spte_rsvd[3] = VTD_SPTE_PAGE_L3_RSVD_MASK(s->aw_bits);
-    vtd_spte_rsvd[4] = VTD_SPTE_PAGE_L4_RSVD_MASK(s->aw_bits);
-
-    vtd_spte_rsvd_large[2] = VTD_SPTE_LPAGE_L2_RSVD_MASK(s->aw_bits,
-                                                         x86_iommu->dt_supported);
-    vtd_spte_rsvd_large[3] = VTD_SPTE_LPAGE_L3_RSVD_MASK(s->aw_bits,
-                                                         x86_iommu->dt_supported);
-
-    if (s->scalable_mode || s->snoop_control) {
-        vtd_spte_rsvd[1] &= ~VTD_SPTE_SNP;
-        vtd_spte_rsvd_large[2] &= ~VTD_SPTE_SNP;
-        vtd_spte_rsvd_large[3] &= ~VTD_SPTE_SNP;
-    }
-
     if (x86_iommu_ir_supported(x86_iommu)) {
         s->ecap |= VTD_ECAP_IR | VTD_ECAP_MHMV;
         if (s->intr_eim == ON_OFF_AUTO_ON) {
@@ -4027,6 +3986,56 @@ static void vtd_init(IntelIOMMUState *s)
     if (s->pasid) {
         s->ecap |= VTD_ECAP_PASID;
     }
+}
+
+/*
+ * Do the initialization. It will also be called when reset, so pay
+ * attention when adding new initialization stuff.
+ */
+static void vtd_init(IntelIOMMUState *s)
+{
+    X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s);
+
+    memset(s->csr, 0, DMAR_REG_SIZE);
+    memset(s->wmask, 0, DMAR_REG_SIZE);
+    memset(s->w1cmask, 0, DMAR_REG_SIZE);
+    memset(s->womask, 0, DMAR_REG_SIZE);
+
+    s->root = 0;
+    s->root_scalable = false;
+    s->dmar_enabled = false;
+    s->intr_enabled = false;
+    s->iq_head = 0;
+    s->iq_tail = 0;
+    s->iq = 0;
+    s->iq_size = 0;
+    s->qi_enabled = false;
+    s->iq_last_desc_type = VTD_INV_DESC_NONE;
+    s->iq_dw = false;
+    s->next_frcd_reg = 0;
+
+    vtd_cap_init(s);
+
+    /*
+     * Rsvd field masks for spte
+     */
+    vtd_spte_rsvd[0] = ~0ULL;
+    vtd_spte_rsvd[1] = VTD_SPTE_PAGE_L1_RSVD_MASK(s->aw_bits,
+                                                  x86_iommu->dt_supported);
+    vtd_spte_rsvd[2] = VTD_SPTE_PAGE_L2_RSVD_MASK(s->aw_bits);
+    vtd_spte_rsvd[3] = VTD_SPTE_PAGE_L3_RSVD_MASK(s->aw_bits);
+    vtd_spte_rsvd[4] = VTD_SPTE_PAGE_L4_RSVD_MASK(s->aw_bits);
+
+    vtd_spte_rsvd_large[2] = VTD_SPTE_LPAGE_L2_RSVD_MASK(s->aw_bits,
+                                                    x86_iommu->dt_supported);
+    vtd_spte_rsvd_large[3] = VTD_SPTE_LPAGE_L3_RSVD_MASK(s->aw_bits,
+                                                    x86_iommu->dt_supported);
+
+    if (s->scalable_mode || s->snoop_control) {
+        vtd_spte_rsvd[1] &= ~VTD_SPTE_SNP;
+        vtd_spte_rsvd_large[2] &= ~VTD_SPTE_SNP;
+        vtd_spte_rsvd_large[3] &= ~VTD_SPTE_SNP;
+    }
 
     vtd_reset_caches(s);
 
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 18/19] intel_iommu: Implement [set|unset]_iommu_device() callbacks
  2024-04-29  6:50 [PATCH v3 00/19] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
                   ` (16 preceding siblings ...)
  2024-04-29  6:50 ` [PATCH v3 17/19] intel_iommu: Extract out vtd_cap_init() to initialize cap/ecap Zhenzhong Duan
@ 2024-04-29  6:50 ` Zhenzhong Duan
  2024-04-29  6:50 ` [PATCH v3 19/19] intel_iommu: Check compatibility with host IOMMU capabilities Zhenzhong Duan
  2024-05-03 14:04 ` [PATCH v3 00/19] Add a host IOMMU device abstraction to check with vIOMMU Cédric Le Goater
  19 siblings, 0 replies; 59+ messages in thread
From: Zhenzhong Duan @ 2024-04-29  6:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, kevin.tian, yi.l.liu, chao.p.peng,
	Yi Sun, Zhenzhong Duan, Marcel Apfelbaum, Paolo Bonzini,
	Richard Henderson, Eduardo Habkost

From: Yi Liu <yi.l.liu@intel.com>

Implement [set|unset]_iommu_device() callbacks in Intel vIOMMU.
In set call, a new structure VTDHostIOMMUDevice which holds
a reference to HostIOMMUDevice is stored in hash table
indexed by PCI BDF.

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 hw/i386/intel_iommu_internal.h |  8 ++++
 include/hw/i386/intel_iommu.h  |  2 +
 hw/i386/intel_iommu.c          | 76 ++++++++++++++++++++++++++++++++++
 3 files changed, 86 insertions(+)

diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index f8cf99bddf..becafd03c1 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -537,4 +537,12 @@ typedef struct VTDRootEntry VTDRootEntry;
 #define VTD_SL_IGN_COM              0xbff0000000000000ULL
 #define VTD_SL_TM                   (1ULL << 62)
 
+
+typedef struct VTDHostIOMMUDevice {
+    IntelIOMMUState *iommu_state;
+    PCIBus *bus;
+    uint8_t devfn;
+    HostIOMMUDevice *dev;
+    QLIST_ENTRY(VTDHostIOMMUDevice) next;
+} VTDHostIOMMUDevice;
 #endif
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index 7d694b0813..2bbde41e45 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -293,6 +293,8 @@ struct IntelIOMMUState {
     /* list of registered notifiers */
     QLIST_HEAD(, VTDAddressSpace) vtd_as_with_notifiers;
 
+    GHashTable *vtd_host_iommu_dev;             /* VTDHostIOMMUDevice */
+
     /* interrupt remapping */
     bool intr_enabled;              /* Whether guest enabled IR */
     dma_addr_t intr_root;           /* Interrupt remapping table pointer */
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 519063c8f8..4f84e2e801 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -237,6 +237,13 @@ static gboolean vtd_as_equal(gconstpointer v1, gconstpointer v2)
            (key1->pasid == key2->pasid);
 }
 
+static gboolean vtd_as_idev_equal(gconstpointer v1, gconstpointer v2)
+{
+    const struct vtd_as_key *key1 = v1;
+    const struct vtd_as_key *key2 = v2;
+
+    return (key1->bus == key2->bus) && (key1->devfn == key2->devfn);
+}
 /*
  * Note that we use pointer to PCIBus as the key, so hashing/shifting
  * based on the pointer value is intended. Note that we deal with
@@ -3812,6 +3819,70 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
     return vtd_dev_as;
 }
 
+static int vtd_dev_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
+                                    HostIOMMUDevice *hiod, Error **errp)
+{
+    IntelIOMMUState *s = opaque;
+    VTDHostIOMMUDevice *vtd_hdev;
+    struct vtd_as_key key = {
+        .bus = bus,
+        .devfn = devfn,
+    };
+    struct vtd_as_key *new_key;
+
+    assert(hiod);
+
+    vtd_iommu_lock(s);
+
+    vtd_hdev = g_hash_table_lookup(s->vtd_host_iommu_dev, &key);
+
+    if (vtd_hdev) {
+        error_setg(errp, "IOMMUFD device already exist");
+        vtd_iommu_unlock(s);
+        return -EEXIST;
+    }
+
+    vtd_hdev = g_malloc0(sizeof(VTDHostIOMMUDevice));
+    vtd_hdev->bus = bus;
+    vtd_hdev->devfn = (uint8_t)devfn;
+    vtd_hdev->iommu_state = s;
+    vtd_hdev->dev = hiod;
+
+    new_key = g_malloc(sizeof(*new_key));
+    new_key->bus = bus;
+    new_key->devfn = devfn;
+
+    object_ref(hiod);
+    g_hash_table_insert(s->vtd_host_iommu_dev, new_key, vtd_hdev);
+
+    vtd_iommu_unlock(s);
+
+    return 0;
+}
+
+static void vtd_dev_unset_iommu_device(PCIBus *bus, void *opaque, int devfn)
+{
+    IntelIOMMUState *s = opaque;
+    VTDHostIOMMUDevice *vtd_hdev;
+    struct vtd_as_key key = {
+        .bus = bus,
+        .devfn = devfn,
+    };
+
+    vtd_iommu_lock(s);
+
+    vtd_hdev = g_hash_table_lookup(s->vtd_host_iommu_dev, &key);
+    if (!vtd_hdev) {
+        vtd_iommu_unlock(s);
+        return;
+    }
+
+    g_hash_table_remove(s->vtd_host_iommu_dev, &key);
+    object_unref(vtd_hdev->dev);
+
+    vtd_iommu_unlock(s);
+}
+
 /* Unmap the whole range in the notifier's scope. */
 static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
 {
@@ -4116,6 +4187,8 @@ static AddressSpace *vtd_host_dma_iommu(PCIBus *bus, void *opaque, int devfn)
 
 static PCIIOMMUOps vtd_iommu_ops = {
     .get_address_space = vtd_host_dma_iommu,
+    .set_iommu_device = vtd_dev_set_iommu_device,
+    .unset_iommu_device = vtd_dev_unset_iommu_device,
 };
 
 static bool vtd_decide_config(IntelIOMMUState *s, Error **errp)
@@ -4235,6 +4308,9 @@ static void vtd_realize(DeviceState *dev, Error **errp)
                                      g_free, g_free);
     s->vtd_address_spaces = g_hash_table_new_full(vtd_as_hash, vtd_as_equal,
                                       g_free, g_free);
+    s->vtd_host_iommu_dev = g_hash_table_new_full(vtd_as_hash,
+                                                  vtd_as_idev_equal,
+                                                  g_free, g_free);
     vtd_init(s);
     pci_setup_iommu(bus, &vtd_iommu_ops, dev);
     /* Pseudo address space under root PCI bus. */
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 19/19] intel_iommu: Check compatibility with host IOMMU capabilities
  2024-04-29  6:50 [PATCH v3 00/19] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
                   ` (17 preceding siblings ...)
  2024-04-29  6:50 ` [PATCH v3 18/19] intel_iommu: Implement [set|unset]_iommu_device() callbacks Zhenzhong Duan
@ 2024-04-29  6:50 ` Zhenzhong Duan
  2024-05-03 14:04 ` [PATCH v3 00/19] Add a host IOMMU device abstraction to check with vIOMMU Cédric Le Goater
  19 siblings, 0 replies; 59+ messages in thread
From: Zhenzhong Duan @ 2024-04-29  6:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, kevin.tian, yi.l.liu, chao.p.peng,
	Zhenzhong Duan, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Marcel Apfelbaum

If check fails, host device (either VFIO or VDPA device) is not
compatible with current vIOMMU config and should not be passed to
guest.

Only aw_bits is checked for now, we don't care other capabilities
before scalable modern mode is introduced.

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 hw/i386/intel_iommu.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 4f84e2e801..4a295c41cc 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -3819,6 +3819,26 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
     return vtd_dev_as;
 }
 
+static int vtd_check_hdev(IntelIOMMUState *s, VTDHostIOMMUDevice *vtd_hdev,
+                          Error **errp)
+{
+    HostIOMMUDevice *hiod = vtd_hdev->dev;
+    int ret;
+
+    /* Common checks */
+    ret = host_iommu_device_check_cap(hiod, HOST_IOMMU_DEVICE_CAP_AW_BITS,
+                                      errp);
+    if (ret < 0) {
+        return ret;
+    }
+    if (s->aw_bits > ret) {
+        error_setg(errp, "aw-bits %d > host aw-bits %d", s->aw_bits, ret);
+        return -EINVAL;
+    }
+
+    return 0;
+}
+
 static int vtd_dev_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
                                     HostIOMMUDevice *hiod, Error **errp)
 {
@@ -3829,6 +3849,7 @@ static int vtd_dev_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
         .devfn = devfn,
     };
     struct vtd_as_key *new_key;
+    int ret;
 
     assert(hiod);
 
@@ -3848,6 +3869,13 @@ static int vtd_dev_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
     vtd_hdev->iommu_state = s;
     vtd_hdev->dev = hiod;
 
+    ret = vtd_check_hdev(s, vtd_hdev, errp);
+    if (ret) {
+        g_free(vtd_hdev);
+        vtd_iommu_unlock(s);
+        return ret;
+    }
+
     new_key = g_malloc(sizeof(*new_key));
     new_key->bus = bus;
     new_key->devfn = devfn;
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 02/19] vfio/container: Introduce HostIOMMUDeviceLegacyVFIO device
  2024-04-29  6:50 ` [PATCH v3 02/19] vfio/container: Introduce HostIOMMUDeviceLegacyVFIO device Zhenzhong Duan
@ 2024-04-30  7:51   ` Cédric Le Goater
  2024-04-30  9:13     ` Duan, Zhenzhong
  0 siblings, 1 reply; 59+ messages in thread
From: Cédric Le Goater @ 2024-04-30  7:51 UTC (permalink / raw)
  To: Zhenzhong Duan, qemu-devel
  Cc: alex.williamson, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, kevin.tian, yi.l.liu, chao.p.peng

On 4/29/24 08:50, Zhenzhong Duan wrote:
> HostIOMMUDeviceLegacyVFIO represents a host IOMMU device under VFIO
> legacy container backend.
> 
> It includes a link to VFIODevice.

I don't see any use of this attribute. May be introduce later when needed.

Thanks,

C.

> 
> Suggested-by: Eric Auger <eric.auger@redhat.com>
> Suggested-by: Cédric Le Goater <clg@redhat.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
>   include/hw/vfio/vfio-common.h | 12 ++++++++++++
>   hw/vfio/container.c           |  6 +++++-
>   2 files changed, 17 insertions(+), 1 deletion(-)
> 
> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> index b9da6c08ef..aa3abe0a18 100644
> --- a/include/hw/vfio/vfio-common.h
> +++ b/include/hw/vfio/vfio-common.h
> @@ -31,6 +31,7 @@
>   #endif
>   #include "sysemu/sysemu.h"
>   #include "hw/vfio/vfio-container-base.h"
> +#include "sysemu/host_iommu_device.h"
>   
>   #define VFIO_MSG_PREFIX "vfio %s: "
>   
> @@ -147,6 +148,17 @@ typedef struct VFIOGroup {
>       bool ram_block_discard_allowed;
>   } VFIOGroup;
>   
> +#define TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO TYPE_HOST_IOMMU_DEVICE "-legacy-vfio"
> +OBJECT_DECLARE_SIMPLE_TYPE(HostIOMMUDeviceLegacyVFIO,
> +                           HOST_IOMMU_DEVICE_LEGACY_VFIO)
> +
> +/* Abstract of host IOMMU device with VFIO legacy container backend */
> +struct HostIOMMUDeviceLegacyVFIO {
> +    HostIOMMUDevice parent_obj;
> +
> +    VFIODevice *vdev;
> +};
> +
>   typedef struct VFIODMABuf {
>       QemuDmaBuf buf;
>       uint32_t pos_x, pos_y, pos_updates;
> diff --git a/hw/vfio/container.c b/hw/vfio/container.c
> index 77bdec276e..3b6826996a 100644
> --- a/hw/vfio/container.c
> +++ b/hw/vfio/container.c
> @@ -1148,7 +1148,11 @@ static const TypeInfo types[] = {
>           .name = TYPE_VFIO_IOMMU_LEGACY,
>           .parent = TYPE_VFIO_IOMMU,
>           .class_init = vfio_iommu_legacy_class_init,
> -    },
> +    }, {
> +        .name = TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO,
> +        .parent = TYPE_HOST_IOMMU_DEVICE,
> +        .instance_size = sizeof(HostIOMMUDeviceLegacyVFIO),
> +    }
>   };
>   
>   DEFINE_TYPES(types)



^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 04/19] vfio/iommufd: Introduce HostIOMMUDeviceIOMMUFDVFIO device
  2024-04-29  6:50 ` [PATCH v3 04/19] vfio/iommufd: Introduce HostIOMMUDeviceIOMMUFDVFIO device Zhenzhong Duan
@ 2024-04-30  7:52   ` Cédric Le Goater
  2024-04-30  9:25     ` Duan, Zhenzhong
  0 siblings, 1 reply; 59+ messages in thread
From: Cédric Le Goater @ 2024-04-30  7:52 UTC (permalink / raw)
  To: Zhenzhong Duan, qemu-devel
  Cc: alex.williamson, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, kevin.tian, yi.l.liu, chao.p.peng

On 4/29/24 08:50, Zhenzhong Duan wrote:
> HostIOMMUDeviceIOMMUFDVFIO represents a host IOMMU device under VFIO
> iommufd backend. It will be created during VFIO device attaching and
> passed to vIOMMU.
> 
> It includes a link to VFIODevice so that we can do VFIO device
> specific operations, i.e., [at/de]taching hwpt, etc.
> 
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
>   include/hw/vfio/vfio-common.h | 13 +++++++++++++
>   hw/vfio/iommufd.c             |  6 +++++-
>   2 files changed, 18 insertions(+), 1 deletion(-)
> 
> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> index aa3abe0a18..0943add3bc 100644
> --- a/include/hw/vfio/vfio-common.h
> +++ b/include/hw/vfio/vfio-common.h
> @@ -32,6 +32,7 @@
>   #include "sysemu/sysemu.h"
>   #include "hw/vfio/vfio-container-base.h"
>   #include "sysemu/host_iommu_device.h"
> +#include "sysemu/iommufd.h"
>   
>   #define VFIO_MSG_PREFIX "vfio %s: "
>   
> @@ -159,6 +160,18 @@ struct HostIOMMUDeviceLegacyVFIO {
>       VFIODevice *vdev;
>   };
>   
> +#define TYPE_HOST_IOMMU_DEVICE_IOMMUFD_VFIO \
> +            TYPE_HOST_IOMMU_DEVICE_IOMMUFD "-vfio"
> +OBJECT_DECLARE_SIMPLE_TYPE(HostIOMMUDeviceIOMMUFDVFIO,
> +                           HOST_IOMMU_DEVICE_IOMMUFD_VFIO)
> +
> +/* Abstraction of host IOMMU device with VFIO IOMMUFD backend */
> +struct HostIOMMUDeviceIOMMUFDVFIO {
> +    HostIOMMUDeviceIOMMUFD parent;
> +
> +    VFIODevice *vdev;

Seems useless today.

Thanks,

C.



> +};
> +
>   typedef struct VFIODMABuf {
>       QemuDmaBuf buf;
>       uint32_t pos_x, pos_y, pos_updates;
> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
> index 8827ffe636..997f4ac43e 100644
> --- a/hw/vfio/iommufd.c
> +++ b/hw/vfio/iommufd.c
> @@ -639,7 +639,11 @@ static const TypeInfo types[] = {
>           .name = TYPE_VFIO_IOMMU_IOMMUFD,
>           .parent = TYPE_VFIO_IOMMU,
>           .class_init = vfio_iommu_iommufd_class_init,
> -    },
> +    }, {
> +        .name = TYPE_HOST_IOMMU_DEVICE_IOMMUFD_VFIO,
> +        .parent = TYPE_HOST_IOMMU_DEVICE_IOMMUFD,
> +        .instance_size = sizeof(HostIOMMUDeviceIOMMUFDVFIO),
> +    }
>   };
>   
>   DEFINE_TYPES(types)



^ permalink raw reply	[flat|nested] 59+ messages in thread

* RE: [PATCH v3 02/19] vfio/container: Introduce HostIOMMUDeviceLegacyVFIO device
  2024-04-30  7:51   ` Cédric Le Goater
@ 2024-04-30  9:13     ` Duan, Zhenzhong
  2024-04-30 12:00       ` Cédric Le Goater
  0 siblings, 1 reply; 59+ messages in thread
From: Duan, Zhenzhong @ 2024-04-30  9:13 UTC (permalink / raw)
  To: Cédric Le Goater, qemu-devel
  Cc: alex.williamson, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, Tian, Kevin, Liu, Yi L, Peng, Chao P



>-----Original Message-----
>From: Cédric Le Goater <clg@redhat.com>
>Subject: Re: [PATCH v3 02/19] vfio/container: Introduce
>HostIOMMUDeviceLegacyVFIO device
>
>On 4/29/24 08:50, Zhenzhong Duan wrote:
>> HostIOMMUDeviceLegacyVFIO represents a host IOMMU device under
>VFIO
>> legacy container backend.
>>
>> It includes a link to VFIODevice.
>
>I don't see any use of this attribute. May be introduce later when needed.

Indeed, will remove.

Then 'struct HostIOMMUDeviceLegacyVFIO' is same as
struct HostIOMMUDevice.

Not clear if it's preferred to remove 'struct HostIOMMUDeviceLegacyVFIO'
and use HostIOMMUDevice instead. Something like:

OBJECT_DECLARE_SIMPLE_TYPE(HostIOMMUDevice,
                                            HOST_IOMMU_DEVICE_LEGACY_VFIO)

Thanks
Zhenzhong

>
>Thanks,
>
>C.
>
>>
>> Suggested-by: Eric Auger <eric.auger@redhat.com>
>> Suggested-by: Cédric Le Goater <clg@redhat.com>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> ---
>>   include/hw/vfio/vfio-common.h | 12 ++++++++++++
>>   hw/vfio/container.c           |  6 +++++-
>>   2 files changed, 17 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-
>common.h
>> index b9da6c08ef..aa3abe0a18 100644
>> --- a/include/hw/vfio/vfio-common.h
>> +++ b/include/hw/vfio/vfio-common.h
>> @@ -31,6 +31,7 @@
>>   #endif
>>   #include "sysemu/sysemu.h"
>>   #include "hw/vfio/vfio-container-base.h"
>> +#include "sysemu/host_iommu_device.h"
>>
>>   #define VFIO_MSG_PREFIX "vfio %s: "
>>
>> @@ -147,6 +148,17 @@ typedef struct VFIOGroup {
>>       bool ram_block_discard_allowed;
>>   } VFIOGroup;
>>
>> +#define TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO
>TYPE_HOST_IOMMU_DEVICE "-legacy-vfio"
>> +OBJECT_DECLARE_SIMPLE_TYPE(HostIOMMUDeviceLegacyVFIO,
>> +                           HOST_IOMMU_DEVICE_LEGACY_VFIO)
>> +
>> +/* Abstract of host IOMMU device with VFIO legacy container backend */
>> +struct HostIOMMUDeviceLegacyVFIO {
>> +    HostIOMMUDevice parent_obj;
>> +
>> +    VFIODevice *vdev;
>> +};
>> +
>>   typedef struct VFIODMABuf {
>>       QemuDmaBuf buf;
>>       uint32_t pos_x, pos_y, pos_updates;
>> diff --git a/hw/vfio/container.c b/hw/vfio/container.c
>> index 77bdec276e..3b6826996a 100644
>> --- a/hw/vfio/container.c
>> +++ b/hw/vfio/container.c
>> @@ -1148,7 +1148,11 @@ static const TypeInfo types[] = {
>>           .name = TYPE_VFIO_IOMMU_LEGACY,
>>           .parent = TYPE_VFIO_IOMMU,
>>           .class_init = vfio_iommu_legacy_class_init,
>> -    },
>> +    }, {
>> +        .name = TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO,
>> +        .parent = TYPE_HOST_IOMMU_DEVICE,
>> +        .instance_size = sizeof(HostIOMMUDeviceLegacyVFIO),
>> +    }
>>   };
>>
>>   DEFINE_TYPES(types)


^ permalink raw reply	[flat|nested] 59+ messages in thread

* RE: [PATCH v3 04/19] vfio/iommufd: Introduce HostIOMMUDeviceIOMMUFDVFIO device
  2024-04-30  7:52   ` Cédric Le Goater
@ 2024-04-30  9:25     ` Duan, Zhenzhong
  0 siblings, 0 replies; 59+ messages in thread
From: Duan, Zhenzhong @ 2024-04-30  9:25 UTC (permalink / raw)
  To: Cédric Le Goater, qemu-devel
  Cc: alex.williamson, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, Tian, Kevin, Liu, Yi L, Peng, Chao P



>-----Original Message-----
>From: Cédric Le Goater <clg@redhat.com>
>Subject: Re: [PATCH v3 04/19] vfio/iommufd: Introduce
>HostIOMMUDeviceIOMMUFDVFIO device
>
>On 4/29/24 08:50, Zhenzhong Duan wrote:
>> HostIOMMUDeviceIOMMUFDVFIO represents a host IOMMU device under
>VFIO
>> iommufd backend. It will be created during VFIO device attaching and
>> passed to vIOMMU.
>>
>> It includes a link to VFIODevice so that we can do VFIO device
>> specific operations, i.e., [at/de]taching hwpt, etc.
>>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> ---
>>   include/hw/vfio/vfio-common.h | 13 +++++++++++++
>>   hw/vfio/iommufd.c             |  6 +++++-
>>   2 files changed, 18 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-
>common.h
>> index aa3abe0a18..0943add3bc 100644
>> --- a/include/hw/vfio/vfio-common.h
>> +++ b/include/hw/vfio/vfio-common.h
>> @@ -32,6 +32,7 @@
>>   #include "sysemu/sysemu.h"
>>   #include "hw/vfio/vfio-container-base.h"
>>   #include "sysemu/host_iommu_device.h"
>> +#include "sysemu/iommufd.h"
>>
>>   #define VFIO_MSG_PREFIX "vfio %s: "
>>
>> @@ -159,6 +160,18 @@ struct HostIOMMUDeviceLegacyVFIO {
>>       VFIODevice *vdev;
>>   };
>>
>> +#define TYPE_HOST_IOMMU_DEVICE_IOMMUFD_VFIO \
>> +            TYPE_HOST_IOMMU_DEVICE_IOMMUFD "-vfio"
>> +OBJECT_DECLARE_SIMPLE_TYPE(HostIOMMUDeviceIOMMUFDVFIO,
>> +                           HOST_IOMMU_DEVICE_IOMMUFD_VFIO)
>> +
>> +/* Abstraction of host IOMMU device with VFIO IOMMUFD backend */
>> +struct HostIOMMUDeviceIOMMUFDVFIO {
>> +    HostIOMMUDeviceIOMMUFD parent;
>> +
>> +    VFIODevice *vdev;
>
>Seems useless today.

Yes, useless before nesting series, will add in nesting series.

Thanks
Zhenzhong

>
>Thanks,
>
>C.
>
>
>
>> +};
>> +
>>   typedef struct VFIODMABuf {
>>       QemuDmaBuf buf;
>>       uint32_t pos_x, pos_y, pos_updates;
>> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
>> index 8827ffe636..997f4ac43e 100644
>> --- a/hw/vfio/iommufd.c
>> +++ b/hw/vfio/iommufd.c
>> @@ -639,7 +639,11 @@ static const TypeInfo types[] = {
>>           .name = TYPE_VFIO_IOMMU_IOMMUFD,
>>           .parent = TYPE_VFIO_IOMMU,
>>           .class_init = vfio_iommu_iommufd_class_init,
>> -    },
>> +    }, {
>> +        .name = TYPE_HOST_IOMMU_DEVICE_IOMMUFD_VFIO,
>> +        .parent = TYPE_HOST_IOMMU_DEVICE_IOMMUFD,
>> +        .instance_size = sizeof(HostIOMMUDeviceIOMMUFDVFIO),
>> +    }
>>   };
>>
>>   DEFINE_TYPES(types)


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 05/19] backends/host_iommu_device: Introduce HostIOMMUDeviceCaps
  2024-04-29  6:50 ` [PATCH v3 05/19] backends/host_iommu_device: Introduce HostIOMMUDeviceCaps Zhenzhong Duan
@ 2024-04-30  9:41   ` Cédric Le Goater
  2024-04-30  9:55     ` Duan, Zhenzhong
  2024-05-07  6:11   ` Cédric Le Goater
  1 sibling, 1 reply; 59+ messages in thread
From: Cédric Le Goater @ 2024-04-30  9:41 UTC (permalink / raw)
  To: Zhenzhong Duan, qemu-devel
  Cc: alex.williamson, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, kevin.tian, yi.l.liu, chao.p.peng

On 4/29/24 08:50, Zhenzhong Duan wrote:
> HostIOMMUDeviceCaps's elements map to the host IOMMU's capabilities.
> Different platform IOMMU can support different elements.
> 
> Currently only two elements, type and aw_bits, type hints the host
> platform IOMMU type, i.e., INTEL vtd, ARM smmu, etc; aw_bits hints
> host IOMMU address width.
> 
> Introduce .check_cap() handler to check if HOST_IOMMU_DEVICE_CAP_XXX
> is supported.
> 
> Introduce a HostIOMMUDevice API host_iommu_device_check_cap() which
> is a wrapper of .check_cap().
> 
> Introduce a HostIOMMUDevice API host_iommu_device_check_cap_common()
> to check common capabalities of different host platform IOMMUs.
> 
> Suggested-by: Cédric Le Goater <clg@redhat.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
>   include/sysemu/host_iommu_device.h | 44 ++++++++++++++++++++++++++++++
>   backends/host_iommu_device.c       | 29 ++++++++++++++++++++
>   2 files changed, 73 insertions(+)
> 
> diff --git a/include/sysemu/host_iommu_device.h b/include/sysemu/host_iommu_device.h
> index 2b58a94d62..12b6afb463 100644
> --- a/include/sysemu/host_iommu_device.h
> +++ b/include/sysemu/host_iommu_device.h
> @@ -14,12 +14,27 @@
>   
>   #include "qom/object.h"
>   #include "qapi/error.h"
> +#include "linux/iommufd.h"
> +
> +/**
> + * struct HostIOMMUDeviceCaps - Define host IOMMU device capabilities.
> + *
> + * @type: host platform IOMMU type.
> + *
> + * @aw_bits: host IOMMU address width. 0xff if no limitation.
> + */
> +typedef struct HostIOMMUDeviceCaps {
> +    enum iommu_hw_info_type type;
> +    uint8_t aw_bits;
> +} HostIOMMUDeviceCaps;
>   
>   #define TYPE_HOST_IOMMU_DEVICE "host-iommu-device"
>   OBJECT_DECLARE_TYPE(HostIOMMUDevice, HostIOMMUDeviceClass, HOST_IOMMU_DEVICE)
>   
>   struct HostIOMMUDevice {
>       Object parent_obj;
> +
> +    HostIOMMUDeviceCaps caps;
>   };
>   
>   /**
> @@ -47,5 +62,34 @@ struct HostIOMMUDeviceClass {
>        * Returns: true on success, false on failure.
>        */
>       bool (*realize)(HostIOMMUDevice *hiod, void *opaque, Error **errp);
> +    /**
> +     * @check_cap: check if a host IOMMU device capability is supported.
> +     *
> +     * Optional callback, if not implemented, hint not supporting query
> +     * of @cap.
> +     *
> +     * @hiod: pointer to a host IOMMU device instance.
> +     *
> +     * @cap: capability to check.
> +     *
> +     * @errp: pass an Error out when fails to query capability.
> +     *
> +     * Returns: <0 on failure, 0 if a @cap is unsupported, or else
> +     * 1 or some positive value for some special @cap,
> +     * i.e., HOST_IOMMU_DEVICE_CAP_AW_BITS.
> +     */
> +    int (*check_cap)(HostIOMMUDevice *hiod, int cap, Error **errp);
>   };
> +
> +/*
> + * Host IOMMU device capability list.
> + */
> +#define HOST_IOMMU_DEVICE_CAP_IOMMUFD       0
> +#define HOST_IOMMU_DEVICE_CAP_IOMMU_TYPE    1
> +#define HOST_IOMMU_DEVICE_CAP_AW_BITS       2
> +
> +
> +int host_iommu_device_check_cap(HostIOMMUDevice *hiod, int cap, Error **errp);
> +int host_iommu_device_check_cap_common(HostIOMMUDevice *hiod, int cap,
> +                                       Error **errp);
>   #endif
> diff --git a/backends/host_iommu_device.c b/backends/host_iommu_device.c
> index 41f2fdce20..b97d008cc7 100644
> --- a/backends/host_iommu_device.c
> +++ b/backends/host_iommu_device.c
> @@ -28,3 +28,32 @@ static void host_iommu_device_init(Object *obj)
>   static void host_iommu_device_finalize(Object *obj)
>   {
>   }
> +
> +/* Wrapper of HostIOMMUDeviceClass:check_cap */
> +int host_iommu_device_check_cap(HostIOMMUDevice *hiod, int cap, Error **errp)

Since we have an 'Error **errp', we could return a bool instead,
unless this is a 'get_cap' routine ?

Thanks,

C.


> +{
> +    HostIOMMUDeviceClass *hiodc = HOST_IOMMU_DEVICE_GET_CLASS(hiod);
> +    if (!hiodc->check_cap) {
> +        error_setg(errp, ".check_cap() not implemented");
> +        return -EINVAL;
> +    }
> +
> +    return hiodc->check_cap(hiod, cap, errp);
> +}
> +
> +/* Implement check on common IOMMU capabilities */
> +int host_iommu_device_check_cap_common(HostIOMMUDevice *hiod, int cap,
> +                                       Error **errp)
> +{
> +    HostIOMMUDeviceCaps *caps = &hiod->caps;
> +
> +    switch (cap) {
> +    case HOST_IOMMU_DEVICE_CAP_IOMMU_TYPE:
> +        return caps->type;
> +    case HOST_IOMMU_DEVICE_CAP_AW_BITS:
> +        return caps->aw_bits;
> +    default:
> +        error_setg(errp, "Not support query cap %x", cap);
> +        return -EINVAL;
> +    }
> +}



^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 06/19] range: Introduce range_get_last_bit()
  2024-04-29  6:50 ` [PATCH v3 06/19] range: Introduce range_get_last_bit() Zhenzhong Duan
@ 2024-04-30  9:41   ` Cédric Le Goater
  2024-04-30  9:58     ` Duan, Zhenzhong
  0 siblings, 1 reply; 59+ messages in thread
From: Cédric Le Goater @ 2024-04-30  9:41 UTC (permalink / raw)
  To: Zhenzhong Duan, qemu-devel
  Cc: alex.williamson, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, kevin.tian, yi.l.liu, chao.p.peng

On 4/29/24 08:50, Zhenzhong Duan wrote:
> This helper get the highest 1 bit position of the upper bound.
> 
> If the range is empty or upper bound is zero, -1 is returned.
> 
> Suggested-by: Cédric Le Goater <clg@redhat.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
>   include/qemu/range.h | 11 +++++++++++
>   1 file changed, 11 insertions(+)
> 
> diff --git a/include/qemu/range.h b/include/qemu/range.h
> index 205e1da76d..8e05bc1d9f 100644
> --- a/include/qemu/range.h
> +++ b/include/qemu/range.h
> @@ -20,6 +20,8 @@
>   #ifndef QEMU_RANGE_H
>   #define QEMU_RANGE_H
>   
> +#include "qemu/bitops.h"
> +
>   /*
>    * Operations on 64 bit address ranges.
>    * Notes:
> @@ -217,6 +219,15 @@ static inline int ranges_overlap(uint64_t first1, uint64_t len1,
>       return !(last2 < first1 || last1 < first2);
>   }
>   
> +/* Get highest non-zero bit position of a range */
> +static inline int range_get_last_bit(Range *range)
> +{
> +    if (range_is_empty(range) || !range->upb) {
> +        return -1;
> +    }
> +    return find_last_bit(&range->upb, sizeof(range->upb));

This breaks builds on 32-bit host systems.


Thanks,

C.


> +}
> +
>   /*
>    * Return -1 if @a < @b, 1 @a > @b, and 0 if they touch or overlap.
>    * Both @a and @b must not be empty.



^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 07/19] vfio/container: Implement HostIOMMUDeviceClass::realize() handler
  2024-04-29  6:50 ` [PATCH v3 07/19] vfio/container: Implement HostIOMMUDeviceClass::realize() handler Zhenzhong Duan
@ 2024-04-30  9:41   ` Cédric Le Goater
  2024-04-30  9:59     ` Duan, Zhenzhong
  0 siblings, 1 reply; 59+ messages in thread
From: Cédric Le Goater @ 2024-04-30  9:41 UTC (permalink / raw)
  To: Zhenzhong Duan, qemu-devel
  Cc: alex.williamson, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, kevin.tian, yi.l.liu, chao.p.peng

On 4/29/24 08:50, Zhenzhong Duan wrote:
> Utilize range_get_last_bit() to get host IOMMU address width and
> package it in HostIOMMUDeviceCaps for query with .check_cap().
> 
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
>   hw/vfio/container.c | 29 +++++++++++++++++++++++++++++
>   1 file changed, 29 insertions(+)
> 
> diff --git a/hw/vfio/container.c b/hw/vfio/container.c
> index 3b6826996a..863eec3943 100644
> --- a/hw/vfio/container.c
> +++ b/hw/vfio/container.c
> @@ -1143,6 +1143,34 @@ static void vfio_iommu_legacy_class_init(ObjectClass *klass, void *data)
>       vioc->pci_hot_reset = vfio_legacy_pci_hot_reset;
>   };
>   
> +static bool hiod_legacy_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
> +                                     Error **errp)
> +{
> +    VFIODevice *vdev = opaque;
> +    /* iova_ranges is a sorted list */
> +    GList *l = g_list_last(vdev->bcontainer->iova_ranges);
> +
> +    /* There is no VFIO uAPI to query host platform IOMMU type */
> +    hiod->caps.type = IOMMU_HW_INFO_TYPE_NONE;
> +    HOST_IOMMU_DEVICE_IOMMUFD_VFIO(hiod)->vdev = vdev;

cast uses the wrong type and I am not sure the ->vdev is useful.


Thanks,

C.

  
> +
> +    if (l) {
> +        Range *range = l->data;
> +        hiod->caps.aw_bits = range_get_last_bit(range) + 1;
> +    } else {
> +        hiod->caps.aw_bits = 0xff;
> +    }
> +
> +    return true;
> +}
> +
> +static void hiod_legacy_vfio_class_init(ObjectClass *oc, void *data)
> +{
> +    HostIOMMUDeviceClass *hioc = HOST_IOMMU_DEVICE_CLASS(oc);
> +
> +    hioc->realize = hiod_legacy_vfio_realize;
> +};
> +
>   static const TypeInfo types[] = {
>       {
>           .name = TYPE_VFIO_IOMMU_LEGACY,
> @@ -1152,6 +1180,7 @@ static const TypeInfo types[] = {
>           .name = TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO,
>           .parent = TYPE_HOST_IOMMU_DEVICE,
>           .instance_size = sizeof(HostIOMMUDeviceLegacyVFIO),
> +        .class_init = hiod_legacy_vfio_class_init,
>       }
>   };
>   



^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 08/19] backends/iommufd: Introduce helper function iommufd_backend_get_device_info()
  2024-04-29  6:50 ` [PATCH v3 08/19] backends/iommufd: Introduce helper function iommufd_backend_get_device_info() Zhenzhong Duan
@ 2024-04-30  9:41   ` Cédric Le Goater
  2024-04-30 10:06     ` Duan, Zhenzhong
  0 siblings, 1 reply; 59+ messages in thread
From: Cédric Le Goater @ 2024-04-30  9:41 UTC (permalink / raw)
  To: Zhenzhong Duan, qemu-devel
  Cc: alex.williamson, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, kevin.tian, yi.l.liu, chao.p.peng,
	Yi Sun

On 4/29/24 08:50, Zhenzhong Duan wrote:
> Introduce a helper function iommufd_backend_get_device_info() to get
> host IOMMU related information through iommufd uAPI.
> 
> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
>   include/sysemu/iommufd.h |  4 ++++
>   backends/iommufd.c       | 24 +++++++++++++++++++++++-
>   2 files changed, 27 insertions(+), 1 deletion(-)
> 
> diff --git a/include/sysemu/iommufd.h b/include/sysemu/iommufd.h
> index 6a9fb0007a..e9593637a3 100644
> --- a/include/sysemu/iommufd.h
> +++ b/include/sysemu/iommufd.h
> @@ -17,6 +17,7 @@
>   #include "qom/object.h"
>   #include "exec/hwaddr.h"
>   #include "exec/cpu-common.h"
> +#include <linux/iommufd.h>
>   #include "sysemu/host_iommu_device.h"
>   
>   #define TYPE_IOMMUFD_BACKEND "iommufd"
> @@ -47,6 +48,9 @@ int iommufd_backend_map_dma(IOMMUFDBackend *be, uint32_t ioas_id, hwaddr iova,
>                               ram_addr_t size, void *vaddr, bool readonly);
>   int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id,
>                                 hwaddr iova, ram_addr_t size);
> +int iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
> +                                    enum iommu_hw_info_type *type,
> +                                    void *data, uint32_t len, Error **errp);
>   
>   #define TYPE_HOST_IOMMU_DEVICE_IOMMUFD TYPE_HOST_IOMMU_DEVICE "-iommufd"
>   OBJECT_DECLARE_TYPE(HostIOMMUDeviceIOMMUFD, HostIOMMUDeviceIOMMUFDClass,
> diff --git a/backends/iommufd.c b/backends/iommufd.c
> index 19e46194a2..d61209788a 100644
> --- a/backends/iommufd.c
> +++ b/backends/iommufd.c
> @@ -19,7 +19,6 @@
>   #include "monitor/monitor.h"
>   #include "trace.h"
>   #include <sys/ioctl.h>
> -#include <linux/iommufd.h>
>   
>   static void iommufd_backend_init(Object *obj)
>   {
> @@ -211,6 +210,29 @@ int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id,
>       return ret;
>   }
>   
> +int iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
> +                                    enum iommu_hw_info_type *type,
> +                                    void *data, uint32_t len, Error **errp)

When taking an 'Error **' argument, routines preferably return a bool.

Thanks,

C.



  
> +{
> +    struct iommu_hw_info info = {
> +        .size = sizeof(info),
> +        .dev_id = devid,
> +        .data_len = len,
> +        .data_uptr = (uintptr_t)data,
> +    };
> +    int ret;
> +
> +    ret = ioctl(be->fd, IOMMU_GET_HW_INFO, &info);
> +    if (ret) {
> +        error_setg_errno(errp, errno, "Failed to get hardware info");
> +    } else {
> +        g_assert(type);
> +        *type = info.out_data_type;
> +    }
> +
> +    return ret;
> +}
> +
>   static const TypeInfo types[] = {
>       {
>           .name = TYPE_IOMMUFD_BACKEND,



^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 11/19] backends/iommufd: Implement HostIOMMUDeviceClass::check_cap() handler
  2024-04-29  6:50 ` [PATCH v3 11/19] backends/iommufd: " Zhenzhong Duan
@ 2024-04-30  9:41   ` Cédric Le Goater
  2024-04-30 10:06     ` Duan, Zhenzhong
  0 siblings, 1 reply; 59+ messages in thread
From: Cédric Le Goater @ 2024-04-30  9:41 UTC (permalink / raw)
  To: Zhenzhong Duan, qemu-devel
  Cc: alex.williamson, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, kevin.tian, yi.l.liu, chao.p.peng

On 4/29/24 08:50, Zhenzhong Duan wrote:
> Suggested-by: Cédric Le Goater <clg@redhat.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
>   backends/iommufd.c | 18 ++++++++++++++++++
>   1 file changed, 18 insertions(+)
> 
> diff --git a/backends/iommufd.c b/backends/iommufd.c
> index d61209788a..28faec528e 100644
> --- a/backends/iommufd.c
> +++ b/backends/iommufd.c
> @@ -233,6 +233,23 @@ int iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
>       return ret;
>   }
>   
> +static int hiod_iommufd_check_cap(HostIOMMUDevice *hiod, int cap, Error **errp)
> +{
> +    switch (cap) {
> +    case HOST_IOMMU_DEVICE_CAP_IOMMUFD:
> +        return 1;

I don't understand this value.


Thanks,

C.


> +    default:
> +        return host_iommu_device_check_cap_common(hiod, cap, errp);
> +    }
> +}
> +
> +static void hiod_iommufd_class_init(ObjectClass *oc, void *data)
> +{
> +    HostIOMMUDeviceClass *hioc = HOST_IOMMU_DEVICE_CLASS(oc);
> +
> +    hioc->check_cap = hiod_iommufd_check_cap;
> +};
> +
>   static const TypeInfo types[] = {
>       {
>           .name = TYPE_IOMMUFD_BACKEND,
> @@ -251,6 +268,7 @@ static const TypeInfo types[] = {
>           .parent = TYPE_HOST_IOMMU_DEVICE,
>           .instance_size = sizeof(HostIOMMUDeviceIOMMUFD),
>           .class_size = sizeof(HostIOMMUDeviceIOMMUFDClass),
> +        .class_init = hiod_iommufd_class_init,
>           .abstract = true,
>       }
>   };



^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 13/19] vfio: Create host IOMMU device instance
  2024-04-29  6:50 ` [PATCH v3 13/19] vfio: Create host IOMMU device instance Zhenzhong Duan
@ 2024-04-30  9:41   ` Cédric Le Goater
  2024-04-30 10:16     ` Duan, Zhenzhong
  0 siblings, 1 reply; 59+ messages in thread
From: Cédric Le Goater @ 2024-04-30  9:41 UTC (permalink / raw)
  To: Zhenzhong Duan, qemu-devel
  Cc: alex.williamson, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, kevin.tian, yi.l.liu, chao.p.peng

On 4/29/24 08:50, Zhenzhong Duan wrote:
> Create host IOMMU device instance in vfio_attach_device() and call
> .realize() to initialize it further.
> 
> Suggested-by: Cédric Le Goater <clg@redhat.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
>   include/hw/vfio/vfio-common.h |  1 +
>   hw/vfio/common.c              | 18 +++++++++++++++++-
>   2 files changed, 18 insertions(+), 1 deletion(-)
> 
> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> index 0943add3bc..b204b93a55 100644
> --- a/include/hw/vfio/vfio-common.h
> +++ b/include/hw/vfio/vfio-common.h
> @@ -126,6 +126,7 @@ typedef struct VFIODevice {
>       OnOffAuto pre_copy_dirty_page_tracking;
>       bool dirty_pages_supported;
>       bool dirty_tracking;
> +    HostIOMMUDevice *hiod;
>       int devid;
>       IOMMUFDBackend *iommufd;
>   } VFIODevice;
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index 8f9cbdc026..0be8b70ebd 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -1497,6 +1497,8 @@ int vfio_attach_device(char *name, VFIODevice *vbasedev,
>   {
>       const VFIOIOMMUClass *ops =
>           VFIO_IOMMU_CLASS(object_class_by_name(TYPE_VFIO_IOMMU_LEGACY));
> +    HostIOMMUDevice *hiod;
> +    int ret;
>   
>       if (vbasedev->iommufd) {
>           ops = VFIO_IOMMU_CLASS(object_class_by_name(TYPE_VFIO_IOMMU_IOMMUFD));
> @@ -1504,7 +1506,20 @@ int vfio_attach_device(char *name, VFIODevice *vbasedev,
>   
>       assert(ops);
>   
> -    return ops->attach_device(name, vbasedev, as, errp);
> +    ret = ops->attach_device(name, vbasedev, as, errp);
> +    if (ret < 0) {
> +        return ret;


hmm, I wonder if we should change the return value of vfio_attach_device()
to be a bool.


Thanks,

C.

	

> +    }
> +
> +    hiod = HOST_IOMMU_DEVICE(object_new(ops->hiod_typename));
> +    if (!HOST_IOMMU_DEVICE_GET_CLASS(hiod)->realize(hiod, vbasedev, errp)) {
> +        object_unref(hiod);
> +        ops->detach_device(vbasedev);
> +        return -EINVAL;
> +    }
> +    vbasedev->hiod = hiod;
> +
> +    return 0;
>   }
>   
>   void vfio_detach_device(VFIODevice *vbasedev)
> @@ -1512,5 +1527,6 @@ void vfio_detach_device(VFIODevice *vbasedev)
>       if (!vbasedev->bcontainer) {
>           return;
>       }
> +    object_unref(vbasedev->hiod);
>       vbasedev->bcontainer->ops->detach_device(vbasedev);
>   }



^ permalink raw reply	[flat|nested] 59+ messages in thread

* RE: [PATCH v3 05/19] backends/host_iommu_device: Introduce HostIOMMUDeviceCaps
  2024-04-30  9:41   ` Cédric Le Goater
@ 2024-04-30  9:55     ` Duan, Zhenzhong
  2024-04-30 12:01       ` Cédric Le Goater
  0 siblings, 1 reply; 59+ messages in thread
From: Duan, Zhenzhong @ 2024-04-30  9:55 UTC (permalink / raw)
  To: Cédric Le Goater, qemu-devel
  Cc: alex.williamson, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, Tian, Kevin, Liu, Yi L, Peng, Chao P



>-----Original Message-----
>From: Cédric Le Goater <clg@redhat.com>
>Subject: Re: [PATCH v3 05/19] backends/host_iommu_device: Introduce
>HostIOMMUDeviceCaps
>
>On 4/29/24 08:50, Zhenzhong Duan wrote:
>> HostIOMMUDeviceCaps's elements map to the host IOMMU's capabilities.
>> Different platform IOMMU can support different elements.
>>
>> Currently only two elements, type and aw_bits, type hints the host
>> platform IOMMU type, i.e., INTEL vtd, ARM smmu, etc; aw_bits hints
>> host IOMMU address width.
>>
>> Introduce .check_cap() handler to check if
>HOST_IOMMU_DEVICE_CAP_XXX
>> is supported.
>>
>> Introduce a HostIOMMUDevice API host_iommu_device_check_cap()
>which
>> is a wrapper of .check_cap().
>>
>> Introduce a HostIOMMUDevice API
>host_iommu_device_check_cap_common()
>> to check common capabalities of different host platform IOMMUs.
>>
>> Suggested-by: Cédric Le Goater <clg@redhat.com>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> ---
>>   include/sysemu/host_iommu_device.h | 44
>++++++++++++++++++++++++++++++
>>   backends/host_iommu_device.c       | 29 ++++++++++++++++++++
>>   2 files changed, 73 insertions(+)
>>
>> diff --git a/include/sysemu/host_iommu_device.h
>b/include/sysemu/host_iommu_device.h
>> index 2b58a94d62..12b6afb463 100644
>> --- a/include/sysemu/host_iommu_device.h
>> +++ b/include/sysemu/host_iommu_device.h
>> @@ -14,12 +14,27 @@
>>
>>   #include "qom/object.h"
>>   #include "qapi/error.h"
>> +#include "linux/iommufd.h"
>> +
>> +/**
>> + * struct HostIOMMUDeviceCaps - Define host IOMMU device capabilities.
>> + *
>> + * @type: host platform IOMMU type.
>> + *
>> + * @aw_bits: host IOMMU address width. 0xff if no limitation.
>> + */
>> +typedef struct HostIOMMUDeviceCaps {
>> +    enum iommu_hw_info_type type;
>> +    uint8_t aw_bits;
>> +} HostIOMMUDeviceCaps;
>>
>>   #define TYPE_HOST_IOMMU_DEVICE "host-iommu-device"
>>   OBJECT_DECLARE_TYPE(HostIOMMUDevice, HostIOMMUDeviceClass,
>HOST_IOMMU_DEVICE)
>>
>>   struct HostIOMMUDevice {
>>       Object parent_obj;
>> +
>> +    HostIOMMUDeviceCaps caps;
>>   };
>>
>>   /**
>> @@ -47,5 +62,34 @@ struct HostIOMMUDeviceClass {
>>        * Returns: true on success, false on failure.
>>        */
>>       bool (*realize)(HostIOMMUDevice *hiod, void *opaque, Error **errp);
>> +    /**
>> +     * @check_cap: check if a host IOMMU device capability is supported.
>> +     *
>> +     * Optional callback, if not implemented, hint not supporting query
>> +     * of @cap.
>> +     *
>> +     * @hiod: pointer to a host IOMMU device instance.
>> +     *
>> +     * @cap: capability to check.
>> +     *
>> +     * @errp: pass an Error out when fails to query capability.
>> +     *
>> +     * Returns: <0 on failure, 0 if a @cap is unsupported, or else
>> +     * 1 or some positive value for some special @cap,
>> +     * i.e., HOST_IOMMU_DEVICE_CAP_AW_BITS.
>> +     */
>> +    int (*check_cap)(HostIOMMUDevice *hiod, int cap, Error **errp);
>>   };
>> +
>> +/*
>> + * Host IOMMU device capability list.
>> + */
>> +#define HOST_IOMMU_DEVICE_CAP_IOMMUFD       0
>> +#define HOST_IOMMU_DEVICE_CAP_IOMMU_TYPE    1
>> +#define HOST_IOMMU_DEVICE_CAP_AW_BITS       2
>> +
>> +
>> +int host_iommu_device_check_cap(HostIOMMUDevice *hiod, int cap,
>Error **errp);
>> +int host_iommu_device_check_cap_common(HostIOMMUDevice *hiod,
>int cap,
>> +                                       Error **errp);
>>   #endif
>> diff --git a/backends/host_iommu_device.c
>b/backends/host_iommu_device.c
>> index 41f2fdce20..b97d008cc7 100644
>> --- a/backends/host_iommu_device.c
>> +++ b/backends/host_iommu_device.c
>> @@ -28,3 +28,32 @@ static void host_iommu_device_init(Object *obj)
>>   static void host_iommu_device_finalize(Object *obj)
>>   {
>>   }
>> +
>> +/* Wrapper of HostIOMMUDeviceClass:check_cap */
>> +int host_iommu_device_check_cap(HostIOMMUDevice *hiod, int cap,
>Error **errp)
>
>Since we have an 'Error **errp', we could return a bool instead,
>unless this is a 'get_cap' routine ?

Maybe better to name it host_iommu_device_get_cap()?
Because not all results are bool, some are integer, i.e., aw_bits.

Thanks
Zhenzhong

>
>Thanks,
>
>C.
>
>
>> +{
>> +    HostIOMMUDeviceClass *hiodc =
>HOST_IOMMU_DEVICE_GET_CLASS(hiod);
>> +    if (!hiodc->check_cap) {
>> +        error_setg(errp, ".check_cap() not implemented");
>> +        return -EINVAL;
>> +    }
>> +
>> +    return hiodc->check_cap(hiod, cap, errp);
>> +}
>> +
>> +/* Implement check on common IOMMU capabilities */
>> +int host_iommu_device_check_cap_common(HostIOMMUDevice *hiod,
>int cap,
>> +                                       Error **errp)
>> +{
>> +    HostIOMMUDeviceCaps *caps = &hiod->caps;
>> +
>> +    switch (cap) {
>> +    case HOST_IOMMU_DEVICE_CAP_IOMMU_TYPE:
>> +        return caps->type;
>> +    case HOST_IOMMU_DEVICE_CAP_AW_BITS:
>> +        return caps->aw_bits;
>> +    default:
>> +        error_setg(errp, "Not support query cap %x", cap);
>> +        return -EINVAL;
>> +    }
>> +}


^ permalink raw reply	[flat|nested] 59+ messages in thread

* RE: [PATCH v3 06/19] range: Introduce range_get_last_bit()
  2024-04-30  9:41   ` Cédric Le Goater
@ 2024-04-30  9:58     ` Duan, Zhenzhong
  2024-05-02 10:30       ` Cédric Le Goater
  0 siblings, 1 reply; 59+ messages in thread
From: Duan, Zhenzhong @ 2024-04-30  9:58 UTC (permalink / raw)
  To: Cédric Le Goater, qemu-devel
  Cc: alex.williamson, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, Tian, Kevin, Liu, Yi L, Peng, Chao P



>-----Original Message-----
>From: Cédric Le Goater <clg@redhat.com>
>Subject: Re: [PATCH v3 06/19] range: Introduce range_get_last_bit()
>
>On 4/29/24 08:50, Zhenzhong Duan wrote:
>> This helper get the highest 1 bit position of the upper bound.
>>
>> If the range is empty or upper bound is zero, -1 is returned.
>>
>> Suggested-by: Cédric Le Goater <clg@redhat.com>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> ---
>>   include/qemu/range.h | 11 +++++++++++
>>   1 file changed, 11 insertions(+)
>>
>> diff --git a/include/qemu/range.h b/include/qemu/range.h
>> index 205e1da76d..8e05bc1d9f 100644
>> --- a/include/qemu/range.h
>> +++ b/include/qemu/range.h
>> @@ -20,6 +20,8 @@
>>   #ifndef QEMU_RANGE_H
>>   #define QEMU_RANGE_H
>>
>> +#include "qemu/bitops.h"
>> +
>>   /*
>>    * Operations on 64 bit address ranges.
>>    * Notes:
>> @@ -217,6 +219,15 @@ static inline int ranges_overlap(uint64_t first1,
>uint64_t len1,
>>       return !(last2 < first1 || last1 < first2);
>>   }
>>
>> +/* Get highest non-zero bit position of a range */
>> +static inline int range_get_last_bit(Range *range)
>> +{
>> +    if (range_is_empty(range) || !range->upb) {
>> +        return -1;
>> +    }
>> +    return find_last_bit(&range->upb, sizeof(range->upb));
>
>This breaks builds on 32-bit host systems.

Oh, I missed 32bit build. Thanks, will fix.

Thanks
zhenzhong

>
>
>Thanks,
>
>C.
>
>
>> +}
>> +
>>   /*
>>    * Return -1 if @a < @b, 1 @a > @b, and 0 if they touch or overlap.
>>    * Both @a and @b must not be empty.


^ permalink raw reply	[flat|nested] 59+ messages in thread

* RE: [PATCH v3 07/19] vfio/container: Implement HostIOMMUDeviceClass::realize() handler
  2024-04-30  9:41   ` Cédric Le Goater
@ 2024-04-30  9:59     ` Duan, Zhenzhong
  0 siblings, 0 replies; 59+ messages in thread
From: Duan, Zhenzhong @ 2024-04-30  9:59 UTC (permalink / raw)
  To: Cédric Le Goater, qemu-devel
  Cc: alex.williamson, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, Tian, Kevin, Liu, Yi L, Peng, Chao P



>-----Original Message-----
>From: Cédric Le Goater <clg@redhat.com>
>Subject: Re: [PATCH v3 07/19] vfio/container: Implement
>HostIOMMUDeviceClass::realize() handler
>
>On 4/29/24 08:50, Zhenzhong Duan wrote:
>> Utilize range_get_last_bit() to get host IOMMU address width and
>> package it in HostIOMMUDeviceCaps for query with .check_cap().
>>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> ---
>>   hw/vfio/container.c | 29 +++++++++++++++++++++++++++++
>>   1 file changed, 29 insertions(+)
>>
>> diff --git a/hw/vfio/container.c b/hw/vfio/container.c
>> index 3b6826996a..863eec3943 100644
>> --- a/hw/vfio/container.c
>> +++ b/hw/vfio/container.c
>> @@ -1143,6 +1143,34 @@ static void
>vfio_iommu_legacy_class_init(ObjectClass *klass, void *data)
>>       vioc->pci_hot_reset = vfio_legacy_pci_hot_reset;
>>   };
>>
>> +static bool hiod_legacy_vfio_realize(HostIOMMUDevice *hiod, void
>*opaque,
>> +                                     Error **errp)
>> +{
>> +    VFIODevice *vdev = opaque;
>> +    /* iova_ranges is a sorted list */
>> +    GList *l = g_list_last(vdev->bcontainer->iova_ranges);
>> +
>> +    /* There is no VFIO uAPI to query host platform IOMMU type */
>> +    hiod->caps.type = IOMMU_HW_INFO_TYPE_NONE;
>> +    HOST_IOMMU_DEVICE_IOMMUFD_VFIO(hiod)->vdev = vdev;
>
>cast uses the wrong type and I am not sure the ->vdev is useful.

Good catch, will remove vdev as you suggested.

Thanks
Zhenzhong

>
>
>Thanks,
>
>C.
>
>
>> +
>> +    if (l) {
>> +        Range *range = l->data;
>> +        hiod->caps.aw_bits = range_get_last_bit(range) + 1;
>> +    } else {
>> +        hiod->caps.aw_bits = 0xff;
>> +    }
>> +
>> +    return true;
>> +}
>> +
>> +static void hiod_legacy_vfio_class_init(ObjectClass *oc, void *data)
>> +{
>> +    HostIOMMUDeviceClass *hioc = HOST_IOMMU_DEVICE_CLASS(oc);
>> +
>> +    hioc->realize = hiod_legacy_vfio_realize;
>> +};
>> +
>>   static const TypeInfo types[] = {
>>       {
>>           .name = TYPE_VFIO_IOMMU_LEGACY,
>> @@ -1152,6 +1180,7 @@ static const TypeInfo types[] = {
>>           .name = TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO,
>>           .parent = TYPE_HOST_IOMMU_DEVICE,
>>           .instance_size = sizeof(HostIOMMUDeviceLegacyVFIO),
>> +        .class_init = hiod_legacy_vfio_class_init,
>>       }
>>   };
>>


^ permalink raw reply	[flat|nested] 59+ messages in thread

* RE: [PATCH v3 11/19] backends/iommufd: Implement HostIOMMUDeviceClass::check_cap() handler
  2024-04-30  9:41   ` Cédric Le Goater
@ 2024-04-30 10:06     ` Duan, Zhenzhong
  2024-04-30 12:12       ` Cédric Le Goater
  0 siblings, 1 reply; 59+ messages in thread
From: Duan, Zhenzhong @ 2024-04-30 10:06 UTC (permalink / raw)
  To: Cédric Le Goater, qemu-devel
  Cc: alex.williamson, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, Tian, Kevin, Liu, Yi L, Peng, Chao P



>-----Original Message-----
>From: Cédric Le Goater <clg@redhat.com>
>Subject: Re: [PATCH v3 11/19] backends/iommufd: Implement
>HostIOMMUDeviceClass::check_cap() handler
>
>On 4/29/24 08:50, Zhenzhong Duan wrote:
>> Suggested-by: Cédric Le Goater <clg@redhat.com>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> ---
>>   backends/iommufd.c | 18 ++++++++++++++++++
>>   1 file changed, 18 insertions(+)
>>
>> diff --git a/backends/iommufd.c b/backends/iommufd.c
>> index d61209788a..28faec528e 100644
>> --- a/backends/iommufd.c
>> +++ b/backends/iommufd.c
>> @@ -233,6 +233,23 @@ int
>iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
>>       return ret;
>>   }
>>
>> +static int hiod_iommufd_check_cap(HostIOMMUDevice *hiod, int cap,
>Error **errp)
>> +{
>> +    switch (cap) {
>> +    case HOST_IOMMU_DEVICE_CAP_IOMMUFD:
>> +        return 1;
>
>I don't understand this value.

1 means this host iommu device is attached to IOMMUFD backend,
or else 0 if attached to legacy backend.

Strictly speaking, HOST_IOMMU_DEVICE_CAP_IOMMUFD is not a
hardware capability, I'm trying to put all(sw/hw) in CAPs checking
framework just like KVM<->qemu CAPs does.

Thanks
Zhenzhong

>
>
>Thanks,
>
>C.
>
>
>> +    default:
>> +        return host_iommu_device_check_cap_common(hiod, cap, errp);
>> +    }
>> +}
>> +
>> +static void hiod_iommufd_class_init(ObjectClass *oc, void *data)
>> +{
>> +    HostIOMMUDeviceClass *hioc = HOST_IOMMU_DEVICE_CLASS(oc);
>> +
>> +    hioc->check_cap = hiod_iommufd_check_cap;
>> +};
>> +
>>   static const TypeInfo types[] = {
>>       {
>>           .name = TYPE_IOMMUFD_BACKEND,
>> @@ -251,6 +268,7 @@ static const TypeInfo types[] = {
>>           .parent = TYPE_HOST_IOMMU_DEVICE,
>>           .instance_size = sizeof(HostIOMMUDeviceIOMMUFD),
>>           .class_size = sizeof(HostIOMMUDeviceIOMMUFDClass),
>> +        .class_init = hiod_iommufd_class_init,
>>           .abstract = true,
>>       }
>>   };


^ permalink raw reply	[flat|nested] 59+ messages in thread

* RE: [PATCH v3 08/19] backends/iommufd: Introduce helper function iommufd_backend_get_device_info()
  2024-04-30  9:41   ` Cédric Le Goater
@ 2024-04-30 10:06     ` Duan, Zhenzhong
  0 siblings, 0 replies; 59+ messages in thread
From: Duan, Zhenzhong @ 2024-04-30 10:06 UTC (permalink / raw)
  To: Cédric Le Goater, qemu-devel
  Cc: alex.williamson, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, Tian, Kevin, Liu, Yi L, Peng, Chao P,
	Yi Sun



>-----Original Message-----
>From: Cédric Le Goater <clg@redhat.com>
>Subject: Re: [PATCH v3 08/19] backends/iommufd: Introduce helper
>function iommufd_backend_get_device_info()
>
>On 4/29/24 08:50, Zhenzhong Duan wrote:
>> Introduce a helper function iommufd_backend_get_device_info() to get
>> host IOMMU related information through iommufd uAPI.
>>
>> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
>> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> ---
>>   include/sysemu/iommufd.h |  4 ++++
>>   backends/iommufd.c       | 24 +++++++++++++++++++++++-
>>   2 files changed, 27 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/sysemu/iommufd.h b/include/sysemu/iommufd.h
>> index 6a9fb0007a..e9593637a3 100644
>> --- a/include/sysemu/iommufd.h
>> +++ b/include/sysemu/iommufd.h
>> @@ -17,6 +17,7 @@
>>   #include "qom/object.h"
>>   #include "exec/hwaddr.h"
>>   #include "exec/cpu-common.h"
>> +#include <linux/iommufd.h>
>>   #include "sysemu/host_iommu_device.h"
>>
>>   #define TYPE_IOMMUFD_BACKEND "iommufd"
>> @@ -47,6 +48,9 @@ int iommufd_backend_map_dma(IOMMUFDBackend
>*be, uint32_t ioas_id, hwaddr iova,
>>                               ram_addr_t size, void *vaddr, bool readonly);
>>   int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t
>ioas_id,
>>                                 hwaddr iova, ram_addr_t size);
>> +int iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t
>devid,
>> +                                    enum iommu_hw_info_type *type,
>> +                                    void *data, uint32_t len, Error **errp);
>>
>>   #define TYPE_HOST_IOMMU_DEVICE_IOMMUFD
>TYPE_HOST_IOMMU_DEVICE "-iommufd"
>>   OBJECT_DECLARE_TYPE(HostIOMMUDeviceIOMMUFD,
>HostIOMMUDeviceIOMMUFDClass,
>> diff --git a/backends/iommufd.c b/backends/iommufd.c
>> index 19e46194a2..d61209788a 100644
>> --- a/backends/iommufd.c
>> +++ b/backends/iommufd.c
>> @@ -19,7 +19,6 @@
>>   #include "monitor/monitor.h"
>>   #include "trace.h"
>>   #include <sys/ioctl.h>
>> -#include <linux/iommufd.h>
>>
>>   static void iommufd_backend_init(Object *obj)
>>   {
>> @@ -211,6 +210,29 @@ int
>iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id,
>>       return ret;
>>   }
>>
>> +int iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t
>devid,
>> +                                    enum iommu_hw_info_type *type,
>> +                                    void *data, uint32_t len, Error **errp)
>
>When taking an 'Error **' argument, routines preferably return a bool.

Got it, will fix.

Thanks
Zhenzhong

>
>Thanks,
>
>C.
>
>
>
>
>> +{
>> +    struct iommu_hw_info info = {
>> +        .size = sizeof(info),
>> +        .dev_id = devid,
>> +        .data_len = len,
>> +        .data_uptr = (uintptr_t)data,
>> +    };
>> +    int ret;
>> +
>> +    ret = ioctl(be->fd, IOMMU_GET_HW_INFO, &info);
>> +    if (ret) {
>> +        error_setg_errno(errp, errno, "Failed to get hardware info");
>> +    } else {
>> +        g_assert(type);
>> +        *type = info.out_data_type;
>> +    }
>> +
>> +    return ret;
>> +}
>> +
>>   static const TypeInfo types[] = {
>>       {
>>           .name = TYPE_IOMMUFD_BACKEND,


^ permalink raw reply	[flat|nested] 59+ messages in thread

* RE: [PATCH v3 13/19] vfio: Create host IOMMU device instance
  2024-04-30  9:41   ` Cédric Le Goater
@ 2024-04-30 10:16     ` Duan, Zhenzhong
  2024-04-30 12:15       ` Cédric Le Goater
  0 siblings, 1 reply; 59+ messages in thread
From: Duan, Zhenzhong @ 2024-04-30 10:16 UTC (permalink / raw)
  To: Cédric Le Goater, qemu-devel
  Cc: alex.williamson, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, Tian, Kevin, Liu, Yi L, Peng, Chao P



>-----Original Message-----
>From: Cédric Le Goater <clg@redhat.com>
>Subject: Re: [PATCH v3 13/19] vfio: Create host IOMMU device instance
>
>On 4/29/24 08:50, Zhenzhong Duan wrote:
>> Create host IOMMU device instance in vfio_attach_device() and call
>> .realize() to initialize it further.
>>
>> Suggested-by: Cédric Le Goater <clg@redhat.com>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> ---
>>   include/hw/vfio/vfio-common.h |  1 +
>>   hw/vfio/common.c              | 18 +++++++++++++++++-
>>   2 files changed, 18 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-
>common.h
>> index 0943add3bc..b204b93a55 100644
>> --- a/include/hw/vfio/vfio-common.h
>> +++ b/include/hw/vfio/vfio-common.h
>> @@ -126,6 +126,7 @@ typedef struct VFIODevice {
>>       OnOffAuto pre_copy_dirty_page_tracking;
>>       bool dirty_pages_supported;
>>       bool dirty_tracking;
>> +    HostIOMMUDevice *hiod;
>>       int devid;
>>       IOMMUFDBackend *iommufd;
>>   } VFIODevice;
>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>> index 8f9cbdc026..0be8b70ebd 100644
>> --- a/hw/vfio/common.c
>> +++ b/hw/vfio/common.c
>> @@ -1497,6 +1497,8 @@ int vfio_attach_device(char *name, VFIODevice
>*vbasedev,
>>   {
>>       const VFIOIOMMUClass *ops =
>>
>VFIO_IOMMU_CLASS(object_class_by_name(TYPE_VFIO_IOMMU_LEGACY));
>> +    HostIOMMUDevice *hiod;
>> +    int ret;
>>
>>       if (vbasedev->iommufd) {
>>           ops =
>VFIO_IOMMU_CLASS(object_class_by_name(TYPE_VFIO_IOMMU_IOMMUF
>D));
>> @@ -1504,7 +1506,20 @@ int vfio_attach_device(char *name,
>VFIODevice *vbasedev,
>>
>>       assert(ops);
>>
>> -    return ops->attach_device(name, vbasedev, as, errp);
>> +    ret = ops->attach_device(name, vbasedev, as, errp);
>> +    if (ret < 0) {
>> +        return ret;
>
>
>hmm, I wonder if we should change the return value of vfio_attach_device()
>to be a bool.

I see, also VFIOIOMMUClass:: setup and VFIOIOMMUClass::add_window.
I can add cleanup patches to fix them if you have no other plan.

Thanks
Zhenzhong

>
>
>Thanks,
>
>C.
>
>
>
>> +    }
>> +
>> +    hiod = HOST_IOMMU_DEVICE(object_new(ops->hiod_typename));
>> +    if (!HOST_IOMMU_DEVICE_GET_CLASS(hiod)->realize(hiod, vbasedev,
>errp)) {
>> +        object_unref(hiod);
>> +        ops->detach_device(vbasedev);
>> +        return -EINVAL;
>> +    }
>> +    vbasedev->hiod = hiod;
>> +
>> +    return 0;
>>   }
>>
>>   void vfio_detach_device(VFIODevice *vbasedev)
>> @@ -1512,5 +1527,6 @@ void vfio_detach_device(VFIODevice *vbasedev)
>>       if (!vbasedev->bcontainer) {
>>           return;
>>       }
>> +    object_unref(vbasedev->hiod);
>>       vbasedev->bcontainer->ops->detach_device(vbasedev);
>>   }


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 02/19] vfio/container: Introduce HostIOMMUDeviceLegacyVFIO device
  2024-04-30  9:13     ` Duan, Zhenzhong
@ 2024-04-30 12:00       ` Cédric Le Goater
  0 siblings, 0 replies; 59+ messages in thread
From: Cédric Le Goater @ 2024-04-30 12:00 UTC (permalink / raw)
  To: Duan, Zhenzhong, qemu-devel
  Cc: alex.williamson, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, Tian, Kevin, Liu, Yi L, Peng, Chao P

On 4/30/24 11:13, Duan, Zhenzhong wrote:
> 
> 
>> -----Original Message-----
>> From: Cédric Le Goater <clg@redhat.com>
>> Subject: Re: [PATCH v3 02/19] vfio/container: Introduce
>> HostIOMMUDeviceLegacyVFIO device
>>
>> On 4/29/24 08:50, Zhenzhong Duan wrote:
>>> HostIOMMUDeviceLegacyVFIO represents a host IOMMU device under
>> VFIO
>>> legacy container backend.
>>>
>>> It includes a link to VFIODevice.
>>
>> I don't see any use of this attribute. May be introduce later when needed.
> 
> Indeed, will remove.
> 
> Then 'struct HostIOMMUDeviceLegacyVFIO' is same as
> struct HostIOMMUDevice.
> 
> Not clear if it's preferred to remove 'struct HostIOMMUDeviceLegacyVFIO'
> and use HostIOMMUDevice instead. Something like:
> 
> OBJECT_DECLARE_SIMPLE_TYPE(HostIOMMUDevice,
>                                              HOST_IOMMU_DEVICE_LEGACY_VFIO)

I would. The simpler the better.

Thanks,

C.



^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 05/19] backends/host_iommu_device: Introduce HostIOMMUDeviceCaps
  2024-04-30  9:55     ` Duan, Zhenzhong
@ 2024-04-30 12:01       ` Cédric Le Goater
  0 siblings, 0 replies; 59+ messages in thread
From: Cédric Le Goater @ 2024-04-30 12:01 UTC (permalink / raw)
  To: Duan, Zhenzhong, qemu-devel
  Cc: alex.williamson, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, Tian, Kevin, Liu, Yi L, Peng, Chao P

On 4/30/24 11:55, Duan, Zhenzhong wrote:
> 
> 
>> -----Original Message-----
>> From: Cédric Le Goater <clg@redhat.com>
>> Subject: Re: [PATCH v3 05/19] backends/host_iommu_device: Introduce
>> HostIOMMUDeviceCaps
>>
>> On 4/29/24 08:50, Zhenzhong Duan wrote:
>>> HostIOMMUDeviceCaps's elements map to the host IOMMU's capabilities.
>>> Different platform IOMMU can support different elements.
>>>
>>> Currently only two elements, type and aw_bits, type hints the host
>>> platform IOMMU type, i.e., INTEL vtd, ARM smmu, etc; aw_bits hints
>>> host IOMMU address width.
>>>
>>> Introduce .check_cap() handler to check if
>> HOST_IOMMU_DEVICE_CAP_XXX
>>> is supported.
>>>
>>> Introduce a HostIOMMUDevice API host_iommu_device_check_cap()
>> which
>>> is a wrapper of .check_cap().
>>>
>>> Introduce a HostIOMMUDevice API
>> host_iommu_device_check_cap_common()
>>> to check common capabalities of different host platform IOMMUs.
>>>
>>> Suggested-by: Cédric Le Goater <clg@redhat.com>
>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>> ---
>>>    include/sysemu/host_iommu_device.h | 44
>> ++++++++++++++++++++++++++++++
>>>    backends/host_iommu_device.c       | 29 ++++++++++++++++++++
>>>    2 files changed, 73 insertions(+)
>>>
>>> diff --git a/include/sysemu/host_iommu_device.h
>> b/include/sysemu/host_iommu_device.h
>>> index 2b58a94d62..12b6afb463 100644
>>> --- a/include/sysemu/host_iommu_device.h
>>> +++ b/include/sysemu/host_iommu_device.h
>>> @@ -14,12 +14,27 @@
>>>
>>>    #include "qom/object.h"
>>>    #include "qapi/error.h"
>>> +#include "linux/iommufd.h"
>>> +
>>> +/**
>>> + * struct HostIOMMUDeviceCaps - Define host IOMMU device capabilities.
>>> + *
>>> + * @type: host platform IOMMU type.
>>> + *
>>> + * @aw_bits: host IOMMU address width. 0xff if no limitation.
>>> + */
>>> +typedef struct HostIOMMUDeviceCaps {
>>> +    enum iommu_hw_info_type type;
>>> +    uint8_t aw_bits;
>>> +} HostIOMMUDeviceCaps;
>>>
>>>    #define TYPE_HOST_IOMMU_DEVICE "host-iommu-device"
>>>    OBJECT_DECLARE_TYPE(HostIOMMUDevice, HostIOMMUDeviceClass,
>> HOST_IOMMU_DEVICE)
>>>
>>>    struct HostIOMMUDevice {
>>>        Object parent_obj;
>>> +
>>> +    HostIOMMUDeviceCaps caps;
>>>    };
>>>
>>>    /**
>>> @@ -47,5 +62,34 @@ struct HostIOMMUDeviceClass {
>>>         * Returns: true on success, false on failure.
>>>         */
>>>        bool (*realize)(HostIOMMUDevice *hiod, void *opaque, Error **errp);
>>> +    /**
>>> +     * @check_cap: check if a host IOMMU device capability is supported.
>>> +     *
>>> +     * Optional callback, if not implemented, hint not supporting query
>>> +     * of @cap.
>>> +     *
>>> +     * @hiod: pointer to a host IOMMU device instance.
>>> +     *
>>> +     * @cap: capability to check.
>>> +     *
>>> +     * @errp: pass an Error out when fails to query capability.
>>> +     *
>>> +     * Returns: <0 on failure, 0 if a @cap is unsupported, or else
>>> +     * 1 or some positive value for some special @cap,
>>> +     * i.e., HOST_IOMMU_DEVICE_CAP_AW_BITS.
>>> +     */
>>> +    int (*check_cap)(HostIOMMUDevice *hiod, int cap, Error **errp);
>>>    };
>>> +
>>> +/*
>>> + * Host IOMMU device capability list.
>>> + */
>>> +#define HOST_IOMMU_DEVICE_CAP_IOMMUFD       0
>>> +#define HOST_IOMMU_DEVICE_CAP_IOMMU_TYPE    1
>>> +#define HOST_IOMMU_DEVICE_CAP_AW_BITS       2
>>> +
>>> +
>>> +int host_iommu_device_check_cap(HostIOMMUDevice *hiod, int cap,
>> Error **errp);
>>> +int host_iommu_device_check_cap_common(HostIOMMUDevice *hiod,
>> int cap,
>>> +                                       Error **errp);
>>>    #endif
>>> diff --git a/backends/host_iommu_device.c
>> b/backends/host_iommu_device.c
>>> index 41f2fdce20..b97d008cc7 100644
>>> --- a/backends/host_iommu_device.c
>>> +++ b/backends/host_iommu_device.c
>>> @@ -28,3 +28,32 @@ static void host_iommu_device_init(Object *obj)
>>>    static void host_iommu_device_finalize(Object *obj)
>>>    {
>>>    }
>>> +
>>> +/* Wrapper of HostIOMMUDeviceClass:check_cap */
>>> +int host_iommu_device_check_cap(HostIOMMUDevice *hiod, int cap,
>> Error **errp)
>>
>> Since we have an 'Error **errp', we could return a bool instead,
>> unless this is a 'get_cap' routine ?
> 
> Maybe better to name it host_iommu_device_get_cap()?
> Because not all results are bool, some are integer, i.e., aw_bits.

LGTM.


Thanks,

C.





^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 11/19] backends/iommufd: Implement HostIOMMUDeviceClass::check_cap() handler
  2024-04-30 10:06     ` Duan, Zhenzhong
@ 2024-04-30 12:12       ` Cédric Le Goater
  2024-05-01 12:34         ` Duan, Zhenzhong
  0 siblings, 1 reply; 59+ messages in thread
From: Cédric Le Goater @ 2024-04-30 12:12 UTC (permalink / raw)
  To: Duan, Zhenzhong, qemu-devel
  Cc: alex.williamson, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, Tian, Kevin, Liu, Yi L, Peng, Chao P

On 4/30/24 12:06, Duan, Zhenzhong wrote:
> 
> 
>> -----Original Message-----
>> From: Cédric Le Goater <clg@redhat.com>
>> Subject: Re: [PATCH v3 11/19] backends/iommufd: Implement
>> HostIOMMUDeviceClass::check_cap() handler
>>
>> On 4/29/24 08:50, Zhenzhong Duan wrote:
>>> Suggested-by: Cédric Le Goater <clg@redhat.com>
>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>> ---
>>>    backends/iommufd.c | 18 ++++++++++++++++++
>>>    1 file changed, 18 insertions(+)
>>>
>>> diff --git a/backends/iommufd.c b/backends/iommufd.c
>>> index d61209788a..28faec528e 100644
>>> --- a/backends/iommufd.c
>>> +++ b/backends/iommufd.c
>>> @@ -233,6 +233,23 @@ int
>> iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
>>>        return ret;
>>>    }
>>>
>>> +static int hiod_iommufd_check_cap(HostIOMMUDevice *hiod, int cap,
>> Error **errp)
>>> +{
>>> +    switch (cap) {
>>> +    case HOST_IOMMU_DEVICE_CAP_IOMMUFD:
>>> +        return 1;
>>
>> I don't understand this value.
> 
> 1 means this host iommu device is attached to IOMMUFD backend,
> or else 0 if attached to legacy backend.

Hmm, this looks hacky to me and it is not used anywhere in the patchset.
Let's reconsider when there is actually a use for it. Until then, please
drop. My feeling is that a new HostIOMMUDeviceClass handler/attributed
should be introduced instead.


Thanks,

C.



> Strictly speaking, HOST_IOMMU_DEVICE_CAP_IOMMUFD is not a
> hardware capability, I'm trying to put all(sw/hw) in CAPs checking
> framework just like KVM<->qemu CAPs does.
> 
> Thanks
> Zhenzhong
> 
>>
>>
>> Thanks,
>>
>> C.
>>
>>
>>> +    default:
>>> +        return host_iommu_device_check_cap_common(hiod, cap, errp);
>>> +    }
>>> +}
>>> +
>>> +static void hiod_iommufd_class_init(ObjectClass *oc, void *data)
>>> +{
>>> +    HostIOMMUDeviceClass *hioc = HOST_IOMMU_DEVICE_CLASS(oc);
>>> +
>>> +    hioc->check_cap = hiod_iommufd_check_cap;
>>> +};
>>> +
>>>    static const TypeInfo types[] = {
>>>        {
>>>            .name = TYPE_IOMMUFD_BACKEND,
>>> @@ -251,6 +268,7 @@ static const TypeInfo types[] = {
>>>            .parent = TYPE_HOST_IOMMU_DEVICE,
>>>            .instance_size = sizeof(HostIOMMUDeviceIOMMUFD),
>>>            .class_size = sizeof(HostIOMMUDeviceIOMMUFDClass),
>>> +        .class_init = hiod_iommufd_class_init,
>>>            .abstract = true,
>>>        }
>>>    };
> 



^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 13/19] vfio: Create host IOMMU device instance
  2024-04-30 10:16     ` Duan, Zhenzhong
@ 2024-04-30 12:15       ` Cédric Le Goater
  0 siblings, 0 replies; 59+ messages in thread
From: Cédric Le Goater @ 2024-04-30 12:15 UTC (permalink / raw)
  To: Duan, Zhenzhong, qemu-devel
  Cc: alex.williamson, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, Tian, Kevin, Liu, Yi L, Peng, Chao P

On 4/30/24 12:16, Duan, Zhenzhong wrote:
> 
> 
>> -----Original Message-----
>> From: Cédric Le Goater <clg@redhat.com>
>> Subject: Re: [PATCH v3 13/19] vfio: Create host IOMMU device instance
>>
>> On 4/29/24 08:50, Zhenzhong Duan wrote:
>>> Create host IOMMU device instance in vfio_attach_device() and call
>>> .realize() to initialize it further.
>>>
>>> Suggested-by: Cédric Le Goater <clg@redhat.com>
>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>> ---
>>>    include/hw/vfio/vfio-common.h |  1 +
>>>    hw/vfio/common.c              | 18 +++++++++++++++++-
>>>    2 files changed, 18 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-
>> common.h
>>> index 0943add3bc..b204b93a55 100644
>>> --- a/include/hw/vfio/vfio-common.h
>>> +++ b/include/hw/vfio/vfio-common.h
>>> @@ -126,6 +126,7 @@ typedef struct VFIODevice {
>>>        OnOffAuto pre_copy_dirty_page_tracking;
>>>        bool dirty_pages_supported;
>>>        bool dirty_tracking;
>>> +    HostIOMMUDevice *hiod;
>>>        int devid;
>>>        IOMMUFDBackend *iommufd;
>>>    } VFIODevice;
>>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>>> index 8f9cbdc026..0be8b70ebd 100644
>>> --- a/hw/vfio/common.c
>>> +++ b/hw/vfio/common.c
>>> @@ -1497,6 +1497,8 @@ int vfio_attach_device(char *name, VFIODevice
>> *vbasedev,
>>>    {
>>>        const VFIOIOMMUClass *ops =
>>>
>> VFIO_IOMMU_CLASS(object_class_by_name(TYPE_VFIO_IOMMU_LEGACY));
>>> +    HostIOMMUDevice *hiod;
>>> +    int ret;
>>>
>>>        if (vbasedev->iommufd) {
>>>            ops =
>> VFIO_IOMMU_CLASS(object_class_by_name(TYPE_VFIO_IOMMU_IOMMUF
>> D));
>>> @@ -1504,7 +1506,20 @@ int vfio_attach_device(char *name,
>> VFIODevice *vbasedev,
>>>
>>>        assert(ops);
>>>
>>> -    return ops->attach_device(name, vbasedev, as, errp);
>>> +    ret = ops->attach_device(name, vbasedev, as, errp);
>>> +    if (ret < 0) {
>>> +        return ret;
>>
>>
>> hmm, I wonder if we should change the return value of vfio_attach_device()
>> to be a bool.
> 
> I see, also VFIOIOMMUClass:: setup and VFIOIOMMUClass::add_window.
> I can add cleanup patches to fix them if you have no other plan.

Yes please. I don't have plans for changes in that area. The only pending
patches are from the series "[v4] migration: Improve error reporting" [*].

Thanks,

C.


[*] https://lore.kernel.org/qemu-devel/20240306133441.2351700-1-clg@redhat.com/




^ permalink raw reply	[flat|nested] 59+ messages in thread

* RE: [PATCH v3 11/19] backends/iommufd: Implement HostIOMMUDeviceClass::check_cap() handler
  2024-04-30 12:12       ` Cédric Le Goater
@ 2024-05-01 12:34         ` Duan, Zhenzhong
  2024-05-02  8:17           ` Cédric Le Goater
  0 siblings, 1 reply; 59+ messages in thread
From: Duan, Zhenzhong @ 2024-05-01 12:34 UTC (permalink / raw)
  To: Cédric Le Goater, qemu-devel
  Cc: alex.williamson, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, Tian, Kevin, Liu, Yi L, Peng, Chao P



>-----Original Message-----
>From: Cédric Le Goater <clg@redhat.com>
>Subject: Re: [PATCH v3 11/19] backends/iommufd: Implement
>HostIOMMUDeviceClass::check_cap() handler
>
>On 4/30/24 12:06, Duan, Zhenzhong wrote:
>>
>>
>>> -----Original Message-----
>>> From: Cédric Le Goater <clg@redhat.com>
>>> Subject: Re: [PATCH v3 11/19] backends/iommufd: Implement
>>> HostIOMMUDeviceClass::check_cap() handler
>>>
>>> On 4/29/24 08:50, Zhenzhong Duan wrote:
>>>> Suggested-by: Cédric Le Goater <clg@redhat.com>
>>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>>> ---
>>>>    backends/iommufd.c | 18 ++++++++++++++++++
>>>>    1 file changed, 18 insertions(+)
>>>>
>>>> diff --git a/backends/iommufd.c b/backends/iommufd.c
>>>> index d61209788a..28faec528e 100644
>>>> --- a/backends/iommufd.c
>>>> +++ b/backends/iommufd.c
>>>> @@ -233,6 +233,23 @@ int
>>> iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t
>devid,
>>>>        return ret;
>>>>    }
>>>>
>>>> +static int hiod_iommufd_check_cap(HostIOMMUDevice *hiod, int cap,
>>> Error **errp)
>>>> +{
>>>> +    switch (cap) {
>>>> +    case HOST_IOMMU_DEVICE_CAP_IOMMUFD:
>>>> +        return 1;
>>>
>>> I don't understand this value.
>>
>> 1 means this host iommu device is attached to IOMMUFD backend,
>> or else 0 if attached to legacy backend.
>
>Hmm, this looks hacky to me and it is not used anywhere in the patchset.
>Let's reconsider when there is actually a use for it. Until then, please
>drop. My feeling is that a new HostIOMMUDeviceClass handler/attributed
>should be introduced instead.

Got it, will drop it in this series.

Is "return 1" directly the concern on your side? If yes, what about adding a new
element be_type which can be initialized in realize(), like below:

--- a/include/sysemu/host_iommu_device.h
+++ b/include/sysemu/host_iommu_device.h
@@ -28,6 +28,9 @@
  * @fs1gp: first stage(a.k.a, Stage-1) 1GB huge page support.
  */
 typedef struct HostIOMMUDeviceCaps {
+#define HOST_IOMMU_DEVICE_CAP_BACKEND_LEGACY        0
+#define HOST_IOMMU_DEVICE_CAP_BACKEND_IOMMUFD       1
+    uint32_t be_type;
     enum iommu_hw_info_type type;
     uint8_t aw_bits;
     bool nesting;
@@ -91,7 +94,7 @@ struct HostIOMMUDeviceClass {
 /*
  * Host IOMMU device capability list.
  */
-#define HOST_IOMMU_DEVICE_CAP_IOMMUFD       0
+#define HOST_IOMMU_DEVICE_CAP_BACKEND_TYPE  0
 #define HOST_IOMMU_DEVICE_CAP_IOMMU_TYPE    1
 #define HOST_IOMMU_DEVICE_CAP_AW_BITS       2
 #define HOST_IOMMU_DEVICE_CAP_NESTING       3

This looks a bit simpler than adding another handler.
Or you have other concern?

Thanks
Zhenzhong 

>
>
>Thanks,
>
>C.
>
>
>
>> Strictly speaking, HOST_IOMMU_DEVICE_CAP_IOMMUFD is not a
>> hardware capability, I'm trying to put all(sw/hw) in CAPs checking
>> framework just like KVM<->qemu CAPs does.
>>
>> Thanks
>> Zhenzhong
>>
>>>
>>>
>>> Thanks,
>>>
>>> C.
>>>
>>>
>>>> +    default:
>>>> +        return host_iommu_device_check_cap_common(hiod, cap, errp);
>>>> +    }
>>>> +}
>>>> +
>>>> +static void hiod_iommufd_class_init(ObjectClass *oc, void *data)
>>>> +{
>>>> +    HostIOMMUDeviceClass *hioc = HOST_IOMMU_DEVICE_CLASS(oc);
>>>> +
>>>> +    hioc->check_cap = hiod_iommufd_check_cap;
>>>> +};
>>>> +
>>>>    static const TypeInfo types[] = {
>>>>        {
>>>>            .name = TYPE_IOMMUFD_BACKEND,
>>>> @@ -251,6 +268,7 @@ static const TypeInfo types[] = {
>>>>            .parent = TYPE_HOST_IOMMU_DEVICE,
>>>>            .instance_size = sizeof(HostIOMMUDeviceIOMMUFD),
>>>>            .class_size = sizeof(HostIOMMUDeviceIOMMUFDClass),
>>>> +        .class_init = hiod_iommufd_class_init,
>>>>            .abstract = true,
>>>>        }
>>>>    };
>>



^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 11/19] backends/iommufd: Implement HostIOMMUDeviceClass::check_cap() handler
  2024-05-01 12:34         ` Duan, Zhenzhong
@ 2024-05-02  8:17           ` Cédric Le Goater
  2024-05-06  1:47             ` Duan, Zhenzhong
  0 siblings, 1 reply; 59+ messages in thread
From: Cédric Le Goater @ 2024-05-02  8:17 UTC (permalink / raw)
  To: Duan, Zhenzhong, qemu-devel
  Cc: alex.williamson, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, Tian, Kevin, Liu, Yi L, Peng, Chao P

>>>>> +static int hiod_iommufd_check_cap(HostIOMMUDevice *hiod, int cap,
>>>> Error **errp)
>>>>> +{
>>>>> +    switch (cap) {
>>>>> +    case HOST_IOMMU_DEVICE_CAP_IOMMUFD:
>>>>> +        return 1;
>>>>
>>>> I don't understand this value.
>>>
>>> 1 means this host iommu device is attached to IOMMUFD backend,
>>> or else 0 if attached to legacy backend.
>>
>> Hmm, this looks hacky to me and it is not used anywhere in the patchset.
>> Let's reconsider when there is actually a use for it. Until then, please
>> drop. My feeling is that a new HostIOMMUDeviceClass handler/attributed
>> should be introduced instead.
> 
> Got it, will drop it in this series.
> 
> Is "return 1" directly the concern on your side? 

I don't know yet why the implementation would need to know if the host
IOMMU device is of type IOMMUFD. If that's the case, there are alternative
ways, like using OBJECT_CHECK( ..., TYPE_HOST_IOMMU_DEVICE_IOMMUFD) or
a class attribute defined at build time but that's a bit the same. Let's
see when the need arises.


Thanks,

C.






^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 06/19] range: Introduce range_get_last_bit()
  2024-04-30  9:58     ` Duan, Zhenzhong
@ 2024-05-02 10:30       ` Cédric Le Goater
  2024-05-06  6:45         ` Duan, Zhenzhong
  0 siblings, 1 reply; 59+ messages in thread
From: Cédric Le Goater @ 2024-05-02 10:30 UTC (permalink / raw)
  To: Duan, Zhenzhong, qemu-devel
  Cc: alex.williamson, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, Tian, Kevin, Liu, Yi L, Peng, Chao P

On 4/30/24 11:58, Duan, Zhenzhong wrote:
> 
> 
>> -----Original Message-----
>> From: Cédric Le Goater <clg@redhat.com>
>> Subject: Re: [PATCH v3 06/19] range: Introduce range_get_last_bit()
>>
>> On 4/29/24 08:50, Zhenzhong Duan wrote:
>>> This helper get the highest 1 bit position of the upper bound.
>>>
>>> If the range is empty or upper bound is zero, -1 is returned.
>>>
>>> Suggested-by: Cédric Le Goater <clg@redhat.com>
>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>> ---
>>>    include/qemu/range.h | 11 +++++++++++
>>>    1 file changed, 11 insertions(+)
>>>
>>> diff --git a/include/qemu/range.h b/include/qemu/range.h
>>> index 205e1da76d..8e05bc1d9f 100644
>>> --- a/include/qemu/range.h
>>> +++ b/include/qemu/range.h
>>> @@ -20,6 +20,8 @@
>>>    #ifndef QEMU_RANGE_H
>>>    #define QEMU_RANGE_H
>>>
>>> +#include "qemu/bitops.h"
>>> +
>>>    /*
>>>     * Operations on 64 bit address ranges.
>>>     * Notes:
>>> @@ -217,6 +219,15 @@ static inline int ranges_overlap(uint64_t first1,
>> uint64_t len1,
>>>        return !(last2 < first1 || last1 < first2);
>>>    }
>>>
>>> +/* Get highest non-zero bit position of a range */
>>> +static inline int range_get_last_bit(Range *range)
>>> +{
>>> +    if (range_is_empty(range) || !range->upb) {
>>> +        return -1;
>>> +    }
>>> +    return find_last_bit(&range->upb, sizeof(range->upb));
>>
>> This breaks builds on 32-bit host systems.
> 
> Oh, I missed 32bit build. Thanks, will fix.

This should provide the same result ?

     return 63 - clz64(range->upb);

Thanks,

C.




^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 00/19] Add a host IOMMU device abstraction to check with vIOMMU
  2024-04-29  6:50 [PATCH v3 00/19] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
                   ` (18 preceding siblings ...)
  2024-04-29  6:50 ` [PATCH v3 19/19] intel_iommu: Check compatibility with host IOMMU capabilities Zhenzhong Duan
@ 2024-05-03 14:04 ` Cédric Le Goater
  2024-05-03 14:10   ` Jason Gunthorpe
  2024-05-06  2:30   ` Duan, Zhenzhong
  19 siblings, 2 replies; 59+ messages in thread
From: Cédric Le Goater @ 2024-05-03 14:04 UTC (permalink / raw)
  To: Zhenzhong Duan, qemu-devel
  Cc: alex.williamson, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, kevin.tian, yi.l.liu, chao.p.peng

On 4/29/24 08:50, Zhenzhong Duan wrote:
> Hi,
> 
> The most important change in this version is instroducing a common
> HostIOMMUDeviceCaps structure in HostIOMMUDevice and a new interface
> between vIOMMU and HostIOMMUDevice.
> 
> HostIOMMUDeviceClass::realize() is introduced to initialize
> HostIOMMUDeviceCaps and other fields of HostIOMMUDevice variants.
> 
> HostIOMMUDeviceClass::check_cap() is introduced to query host IOMMU
> device capabilities.
> 
> After the change, part2 is only 3 patches, so merge it with part1 to be
> a single prerequisite series, same for changelog. If anyone doesn't like
> that, I can split again.
> 
> The class tree is as below:
> 
>                                HostIOMMUDevice
>                                       | .caps
>                                       | .realize()
>                                       | .check_cap()
>                                       |
>              .-----------------------------------------------.
>              |                        |                      |
> HostIOMMUDeviceLegacyVFIO  {HostIOMMUDeviceLegacyVDPA}  HostIOMMUDeviceIOMMUFD
>              | .vdev                  | {.vdev}              | .iommufd
>                                                              | .devid
>                                                              | [.ioas_id]
>                                                              | [.attach_hwpt()]
>                                                              | [.detach_hwpt()]
>                                                              |
>                                            .----------------------.
>                                            |                      |
>                         HostIOMMUDeviceIOMMUFDVFIO  {HostIOMMUDeviceIOMMUFDVDPA}
>                                            | .vdev                | {.vdev}
> 
> * The attributes in [] will be implemented in nesting series.
> * The classes in {} will be implemented in future.
> * .vdev in different class points to different agent device,
> * i.e., for VFIO it points to VFIODevice.
> 
> PATCH1-4: Introduce HostIOMMUDevice and its sub classes
> PATCH5-11: Introduce HostIOMMUDeviceCaps, implement .realize() and .check_cap() handler
> PATCH12-16: Create HostIOMMUDevice instance and pass to vIOMMU
> PATCH17-19: Implement compatibility check between host IOMMU and vIOMMU(intel_iommu)
> 
> Qemu code can be found at:
> https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_nesting_preq_v3
> 
> Besides the compatibility check in this series, in nesting series, this
> host IOMMU device is extended for much wider usage. For anyone interested
> on the nesting series, here is the link:
> https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_nesting_rfcv2


v4 should be a good candidate, we will need feedback from the vIOMMU
maintainers though.

However, have you considered another/complementary approach which
would be to create an host IOMMU (iommufd) backend object and a vIOMMU
device object together for each vfio-pci device being plugged in the
machine ?

Something like,
     
     -device pcie-root-port,port=23,chassis=8,id=pci.8,bus=pcie.0 \
     -object iommufd,id=iommufd1 \
     -device intel-iommu,intremap=on,device-iotlb=on,caching-mode=on,iommufd=iommufd1 \
     -device vfio-pci,host=0000:08:10.0,bus=pci.1,iommufd=iommufd0

The vIOMMU device would be linked to the host IOMMU (iommufd) backend
object at realize time and it would simplify the discovery of the host
IOMMU properties. The implementation would be more straight forward.

That said, I didn't study deeply what needs to be done. The vIOMMU
implementation is not ready yet to support multiple instances and some
massaging is needed to change that first.

Thanks,

C.

       



^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 00/19] Add a host IOMMU device abstraction to check with vIOMMU
  2024-05-03 14:04 ` [PATCH v3 00/19] Add a host IOMMU device abstraction to check with vIOMMU Cédric Le Goater
@ 2024-05-03 14:10   ` Jason Gunthorpe
  2024-05-03 14:29     ` Cédric Le Goater
  2024-05-06  2:30   ` Duan, Zhenzhong
  1 sibling, 1 reply; 59+ messages in thread
From: Jason Gunthorpe @ 2024-05-03 14:10 UTC (permalink / raw)
  To: Cédric Le Goater
  Cc: Zhenzhong Duan, qemu-devel, alex.williamson, eric.auger, mst,
	peterx, jasowang, nicolinc, joao.m.martins, kevin.tian, yi.l.liu,
	chao.p.peng

On Fri, May 03, 2024 at 04:04:25PM +0200, Cédric Le Goater wrote:
> However, have you considered another/complementary approach which
> would be to create an host IOMMU (iommufd) backend object and a vIOMMU
> device object together for each vfio-pci device being plugged in the
> machine ?
> 
> Something like,
>     -device pcie-root-port,port=23,chassis=8,id=pci.8,bus=pcie.0 \
>     -object iommufd,id=iommufd1 \
>     -device intel-iommu,intremap=on,device-iotlb=on,caching-mode=on,iommufd=iommufd1 \
>     -device vfio-pci,host=0000:08:10.0,bus=pci.1,iommufd=iommufd0

? The main point of this is to have a single iommufd FD open in
qemu. Not multiple. Would you achieve this with a iommufd0 and
iommufd1 ?

Jason


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 00/19] Add a host IOMMU device abstraction to check with vIOMMU
  2024-05-03 14:10   ` Jason Gunthorpe
@ 2024-05-03 14:29     ` Cédric Le Goater
  0 siblings, 0 replies; 59+ messages in thread
From: Cédric Le Goater @ 2024-05-03 14:29 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Zhenzhong Duan, qemu-devel, alex.williamson, eric.auger, mst,
	peterx, jasowang, nicolinc, joao.m.martins, kevin.tian, yi.l.liu,
	chao.p.peng

On 5/3/24 16:10, Jason Gunthorpe wrote:
> On Fri, May 03, 2024 at 04:04:25PM +0200, Cédric Le Goater wrote:
>> However, have you considered another/complementary approach which
>> would be to create an host IOMMU (iommufd) backend object and a vIOMMU
>> device object together for each vfio-pci device being plugged in the
>> machine ?
>>
>> Something like,
>>      -device pcie-root-port,port=23,chassis=8,id=pci.8,bus=pcie.0 \
>>      -object iommufd,id=iommufd1 \
>>      -device intel-iommu,intremap=on,device-iotlb=on,caching-mode=on,iommufd=iommufd1 \
>>      -device vfio-pci,host=0000:08:10.0,bus=pci.1,iommufd=iommufd0
> 
> ? The main point of this is to have a single iommufd FD open in
> qemu. Not multiple.

oups. The above example should have the same IOMMUFD object device
instance: iommufd0.

This is bogus copy-paste of a command line with multiple vfio-pci
devices, each using its own IOMMUFD object device instance. That's
where the idea comes from. Sorry for the noise.

C.



^ permalink raw reply	[flat|nested] 59+ messages in thread

* RE: [PATCH v3 11/19] backends/iommufd: Implement HostIOMMUDeviceClass::check_cap() handler
  2024-05-02  8:17           ` Cédric Le Goater
@ 2024-05-06  1:47             ` Duan, Zhenzhong
  0 siblings, 0 replies; 59+ messages in thread
From: Duan, Zhenzhong @ 2024-05-06  1:47 UTC (permalink / raw)
  To: Cédric Le Goater, qemu-devel
  Cc: alex.williamson, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, Tian, Kevin, Liu, Yi L, Peng, Chao P

>-----Original Message-----
>From: Cédric Le Goater <clg@redhat.com>
>Subject: Re: [PATCH v3 11/19] backends/iommufd: Implement
>HostIOMMUDeviceClass::check_cap() handler
>
>>>>>> +static int hiod_iommufd_check_cap(HostIOMMUDevice *hiod, int
>cap,
>>>>> Error **errp)
>>>>>> +{
>>>>>> +    switch (cap) {
>>>>>> +    case HOST_IOMMU_DEVICE_CAP_IOMMUFD:
>>>>>> +        return 1;
>>>>>
>>>>> I don't understand this value.
>>>>
>>>> 1 means this host iommu device is attached to IOMMUFD backend,
>>>> or else 0 if attached to legacy backend.
>>>
>>> Hmm, this looks hacky to me and it is not used anywhere in the patchset.
>>> Let's reconsider when there is actually a use for it. Until then, please
>>> drop. My feeling is that a new HostIOMMUDeviceClass
>handler/attributed
>>> should be introduced instead.
>>
>> Got it, will drop it in this series.
>>
>> Is "return 1" directly the concern on your side?
>
>I don't know yet why the implementation would need to know if the host
>IOMMU device is of type IOMMUFD. If that's the case, there are alternative
>ways, like using OBJECT_CHECK( ..., TYPE_HOST_IOMMU_DEVICE_IOMMUFD)
>or
>a class attribute defined at build time but that's a bit the same. Let's
>see when the need arises.

Got it, let's revisit it in nesting series, will drop it for now.

Thanks
Zhenzhong

^ permalink raw reply	[flat|nested] 59+ messages in thread

* RE: [PATCH v3 00/19] Add a host IOMMU device abstraction to check with vIOMMU
  2024-05-03 14:04 ` [PATCH v3 00/19] Add a host IOMMU device abstraction to check with vIOMMU Cédric Le Goater
  2024-05-03 14:10   ` Jason Gunthorpe
@ 2024-05-06  2:30   ` Duan, Zhenzhong
  2024-05-06 12:05     ` Jason Gunthorpe
  1 sibling, 1 reply; 59+ messages in thread
From: Duan, Zhenzhong @ 2024-05-06  2:30 UTC (permalink / raw)
  To: Cédric Le Goater, qemu-devel
  Cc: alex.williamson, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, Tian, Kevin, Liu, Yi L, Peng, Chao P

Hi Cédric,

>-----Original Message-----
>From: Cédric Le Goater <clg@redhat.com>
>Sent: Friday, May 3, 2024 10:04 PM
>To: Duan, Zhenzhong <zhenzhong.duan@intel.com>; qemu-
>devel@nongnu.org
>Cc: alex.williamson@redhat.com; eric.auger@redhat.com; mst@redhat.com;
>peterx@redhat.com; jasowang@redhat.com; jgg@nvidia.com;
>nicolinc@nvidia.com; joao.m.martins@oracle.com; Tian, Kevin
><kevin.tian@intel.com>; Liu, Yi L <yi.l.liu@intel.com>; Peng, Chao P
><chao.p.peng@intel.com>
>Subject: Re: [PATCH v3 00/19] Add a host IOMMU device abstraction to
>check with vIOMMU
>
>On 4/29/24 08:50, Zhenzhong Duan wrote:
>> Hi,
>>
>> The most important change in this version is instroducing a common
>> HostIOMMUDeviceCaps structure in HostIOMMUDevice and a new
>interface
>> between vIOMMU and HostIOMMUDevice.
>>
>> HostIOMMUDeviceClass::realize() is introduced to initialize
>> HostIOMMUDeviceCaps and other fields of HostIOMMUDevice variants.
>>
>> HostIOMMUDeviceClass::check_cap() is introduced to query host IOMMU
>> device capabilities.
>>
>> After the change, part2 is only 3 patches, so merge it with part1 to be
>> a single prerequisite series, same for changelog. If anyone doesn't like
>> that, I can split again.
>>
>> The class tree is as below:
>>
>>                                HostIOMMUDevice
>>                                       | .caps
>>                                       | .realize()
>>                                       | .check_cap()
>>                                       |
>>              .-----------------------------------------------.
>>              |                        |                      |
>> HostIOMMUDeviceLegacyVFIO  {HostIOMMUDeviceLegacyVDPA}
>HostIOMMUDeviceIOMMUFD
>>              | .vdev                  | {.vdev}              | .iommufd
>>                                                              | .devid
>>                                                              | [.ioas_id]
>>                                                              | [.attach_hwpt()]
>>                                                              | [.detach_hwpt()]
>>                                                              |
>>                                            .----------------------.
>>                                            |                      |
>>                         HostIOMMUDeviceIOMMUFDVFIO
>{HostIOMMUDeviceIOMMUFDVDPA}
>>                                            | .vdev                | {.vdev}
>>
>> * The attributes in [] will be implemented in nesting series.
>> * The classes in {} will be implemented in future.
>> * .vdev in different class points to different agent device,
>> * i.e., for VFIO it points to VFIODevice.
>>
>> PATCH1-4: Introduce HostIOMMUDevice and its sub classes
>> PATCH5-11: Introduce HostIOMMUDeviceCaps, implement .realize()
>and .check_cap() handler
>> PATCH12-16: Create HostIOMMUDevice instance and pass to vIOMMU
>> PATCH17-19: Implement compatibility check between host IOMMU and
>vIOMMU(intel_iommu)
>>
>> Qemu code can be found at:
>>
>https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_nesting_pre
>q_v3
>>
>> Besides the compatibility check in this series, in nesting series, this
>> host IOMMU device is extended for much wider usage. For anyone
>interested
>> on the nesting series, here is the link:
>>
>https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_nesting_rfc
>v2
>
>
>v4 should be a good candidate, we will need feedback from the vIOMMU
>maintainers though.
>
>However, have you considered another/complementary approach which
>would be to create an host IOMMU (iommufd) backend object and a
>vIOMMU
>device object together for each vfio-pci device being plugged in the
>machine ?

I did consider about a single iommufd instance for qemu and finally chose
to support multiple iommufd instances, reason below:

I was taking iommufd as a backend of VFIO device not a backend of vIOMMU.
So there is an iommufd property linked to iommufd instances.
We do support multiple iommufd instances in nesting series just as
we do in cdev series, such as:

-device intel-iommu,caching-mode=on,dma-drain=on,device-iotlb=on,x-scalable-mode=modern
-object iommufd,id=iommufd0
-device vfio-pci,host=6f:01.0,id=vfio0,bus=root0,iommufd=iommufd0
-object iommufd,id=iommufd1
-device vfio-pci,host=6f:01.1,id=vfio1,bus=root1,iommufd=iommufd1
-device vfio-pci,host=6f:01.2,id=vfio2,bus=root2

Adding iommufd property to vIOMMU will limit the whole qemu to use only
one iommufd instance, it's also confusing if there is also vfio device with legacy
backend.

I'm not clear how useful multiple iommufd instances support are.
One possible benefit is for security? It may bring a slightly fine-grained
isolation in kernel.

We can discuss this further, if it's unnecessary to support multiple iommufd
instances in nesting series, I can change to single iommufd instance support
and add an iommufd property for vIOMMU just as you suggested.

Thanks
Zhenzhong

>
>Something like,
>
>     -device pcie-root-port,port=23,chassis=8,id=pci.8,bus=pcie.0 \
>     -object iommufd,id=iommufd1 \
>     -device intel-iommu,intremap=on,device-iotlb=on,caching-
>mode=on,iommufd=iommufd1 \
>     -device vfio-pci,host=0000:08:10.0,bus=pci.1,iommufd=iommufd0
>
>The vIOMMU device would be linked to the host IOMMU (iommufd) backend
>object at realize time and it would simplify the discovery of the host
>IOMMU properties. The implementation would be more straight forward.
>
>That said, I didn't study deeply what needs to be done. The vIOMMU
>implementation is not ready yet to support multiple instances and some
>massaging is needed to change that first.



^ permalink raw reply	[flat|nested] 59+ messages in thread

* RE: [PATCH v3 06/19] range: Introduce range_get_last_bit()
  2024-05-02 10:30       ` Cédric Le Goater
@ 2024-05-06  6:45         ` Duan, Zhenzhong
  0 siblings, 0 replies; 59+ messages in thread
From: Duan, Zhenzhong @ 2024-05-06  6:45 UTC (permalink / raw)
  To: Cédric Le Goater, qemu-devel
  Cc: alex.williamson, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, Tian, Kevin, Liu, Yi L, Peng, Chao P



>-----Original Message-----
>From: Cédric Le Goater <clg@redhat.com>
>Subject: Re: [PATCH v3 06/19] range: Introduce range_get_last_bit()
>
>On 4/30/24 11:58, Duan, Zhenzhong wrote:
>>
>>
>>> -----Original Message-----
>>> From: Cédric Le Goater <clg@redhat.com>
>>> Subject: Re: [PATCH v3 06/19] range: Introduce range_get_last_bit()
>>>
>>> On 4/29/24 08:50, Zhenzhong Duan wrote:
>>>> This helper get the highest 1 bit position of the upper bound.
>>>>
>>>> If the range is empty or upper bound is zero, -1 is returned.
>>>>
>>>> Suggested-by: Cédric Le Goater <clg@redhat.com>
>>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>>> ---
>>>>    include/qemu/range.h | 11 +++++++++++
>>>>    1 file changed, 11 insertions(+)
>>>>
>>>> diff --git a/include/qemu/range.h b/include/qemu/range.h
>>>> index 205e1da76d..8e05bc1d9f 100644
>>>> --- a/include/qemu/range.h
>>>> +++ b/include/qemu/range.h
>>>> @@ -20,6 +20,8 @@
>>>>    #ifndef QEMU_RANGE_H
>>>>    #define QEMU_RANGE_H
>>>>
>>>> +#include "qemu/bitops.h"
>>>> +
>>>>    /*
>>>>     * Operations on 64 bit address ranges.
>>>>     * Notes:
>>>> @@ -217,6 +219,15 @@ static inline int ranges_overlap(uint64_t first1,
>>> uint64_t len1,
>>>>        return !(last2 < first1 || last1 < first2);
>>>>    }
>>>>
>>>> +/* Get highest non-zero bit position of a range */
>>>> +static inline int range_get_last_bit(Range *range)
>>>> +{
>>>> +    if (range_is_empty(range) || !range->upb) {
>>>> +        return -1;
>>>> +    }
>>>> +    return find_last_bit(&range->upb, sizeof(range->upb));
>>>
>>> This breaks builds on 32-bit host systems.
>>
>> Oh, I missed 32bit build. Thanks, will fix.
>
>This should provide the same result ?
>
>     return 63 - clz64(range->upb);

Yes, I tried 32bit and 64bit, it works. Will use it, thanks for suggestion.

BRs.
Zhenzhong

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 00/19] Add a host IOMMU device abstraction to check with vIOMMU
  2024-05-06  2:30   ` Duan, Zhenzhong
@ 2024-05-06 12:05     ` Jason Gunthorpe
  2024-05-07  2:24       ` Duan, Zhenzhong
  0 siblings, 1 reply; 59+ messages in thread
From: Jason Gunthorpe @ 2024-05-06 12:05 UTC (permalink / raw)
  To: Duan, Zhenzhong
  Cc: Cédric Le Goater, qemu-devel, alex.williamson, eric.auger,
	mst, peterx, jasowang, nicolinc, joao.m.martins, Tian, Kevin,
	Liu, Yi L, Peng, Chao P

On Mon, May 06, 2024 at 02:30:47AM +0000, Duan, Zhenzhong wrote:

> I'm not clear how useful multiple iommufd instances support are.
> One possible benefit is for security? It may bring a slightly fine-grained
> isolation in kernel.

No. I don't think there is any usecase, it is only harmful.

Jason


^ permalink raw reply	[flat|nested] 59+ messages in thread

* RE: [PATCH v3 00/19] Add a host IOMMU device abstraction to check with vIOMMU
  2024-05-06 12:05     ` Jason Gunthorpe
@ 2024-05-07  2:24       ` Duan, Zhenzhong
  2024-05-07 11:58         ` Jason Gunthorpe
  0 siblings, 1 reply; 59+ messages in thread
From: Duan, Zhenzhong @ 2024-05-07  2:24 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Cédric Le Goater, qemu-devel, alex.williamson, eric.auger,
	mst, peterx, jasowang, nicolinc, joao.m.martins, Tian, Kevin,
	Liu, Yi L, Peng, Chao P



>-----Original Message-----
>From: Jason Gunthorpe <jgg@nvidia.com>
>Subject: Re: [PATCH v3 00/19] Add a host IOMMU device abstraction to
>check with vIOMMU
>
>On Mon, May 06, 2024 at 02:30:47AM +0000, Duan, Zhenzhong wrote:
>
>> I'm not clear how useful multiple iommufd instances support are.
>> One possible benefit is for security? It may bring a slightly fine-grained
>> isolation in kernel.
>
>No. I don't think there is any usecase, it is only harmful.

OK, so we need to limit QEMU to only one iommufd instance.

In cdev series, we support mix of legacy and iommufd backend and multiple iommufd backend instances for flexibility.
We need to make a choice to have this limitation only for nesting series or globally(including cdev).
May I ask what harmfulness we may have?

Thanks
Zhenzhong


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 05/19] backends/host_iommu_device: Introduce HostIOMMUDeviceCaps
  2024-04-29  6:50 ` [PATCH v3 05/19] backends/host_iommu_device: Introduce HostIOMMUDeviceCaps Zhenzhong Duan
  2024-04-30  9:41   ` Cédric Le Goater
@ 2024-05-07  6:11   ` Cédric Le Goater
  2024-05-07  6:24     ` Duan, Zhenzhong
  1 sibling, 1 reply; 59+ messages in thread
From: Cédric Le Goater @ 2024-05-07  6:11 UTC (permalink / raw)
  To: Zhenzhong Duan, qemu-devel
  Cc: alex.williamson, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, kevin.tian, yi.l.liu, chao.p.peng

Hello Zhenzhong,

On 4/29/24 08:50, Zhenzhong Duan wrote:
> HostIOMMUDeviceCaps's elements map to the host IOMMU's capabilities.
> Different platform IOMMU can support different elements.
> 
> Currently only two elements, type and aw_bits, type hints the host
> platform IOMMU type, i.e., INTEL vtd, ARM smmu, etc; aw_bits hints
> host IOMMU address width.
> 
> Introduce .check_cap() handler to check if HOST_IOMMU_DEVICE_CAP_XXX
> is supported.
> 
> Introduce a HostIOMMUDevice API host_iommu_device_check_cap() which
> is a wrapper of .check_cap().
> 
> Introduce a HostIOMMUDevice API host_iommu_device_check_cap_common()
> to check common capabalities of different host platform IOMMUs.
> 
> Suggested-by: Cédric Le Goater <clg@redhat.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
>   include/sysemu/host_iommu_device.h | 44 ++++++++++++++++++++++++++++++
>   backends/host_iommu_device.c       | 29 ++++++++++++++++++++
>   2 files changed, 73 insertions(+)
> 
> diff --git a/include/sysemu/host_iommu_device.h b/include/sysemu/host_iommu_device.h
> index 2b58a94d62..12b6afb463 100644
> --- a/include/sysemu/host_iommu_device.h
> +++ b/include/sysemu/host_iommu_device.h
> @@ -14,12 +14,27 @@
>   
>   #include "qom/object.h"
>   #include "qapi/error.h"
> +#include "linux/iommufd.h"


Please use instead :

    #include <linux/iommufd.h>


Thanks,

C.


> +
> +/**
> + * struct HostIOMMUDeviceCaps - Define host IOMMU device capabilities.
> + *
> + * @type: host platform IOMMU type.
> + *
> + * @aw_bits: host IOMMU address width. 0xff if no limitation.
> + */
> +typedef struct HostIOMMUDeviceCaps {
> +    enum iommu_hw_info_type type;
> +    uint8_t aw_bits;
> +} HostIOMMUDeviceCaps;
>   
>   #define TYPE_HOST_IOMMU_DEVICE "host-iommu-device"
>   OBJECT_DECLARE_TYPE(HostIOMMUDevice, HostIOMMUDeviceClass, HOST_IOMMU_DEVICE)
>   
>   struct HostIOMMUDevice {
>       Object parent_obj;
> +
> +    HostIOMMUDeviceCaps caps;
>   };
>   
>   /**
> @@ -47,5 +62,34 @@ struct HostIOMMUDeviceClass {
>        * Returns: true on success, false on failure.
>        */
>       bool (*realize)(HostIOMMUDevice *hiod, void *opaque, Error **errp);
> +    /**
> +     * @check_cap: check if a host IOMMU device capability is supported.
> +     *
> +     * Optional callback, if not implemented, hint not supporting query
> +     * of @cap.
> +     *
> +     * @hiod: pointer to a host IOMMU device instance.
> +     *
> +     * @cap: capability to check.
> +     *
> +     * @errp: pass an Error out when fails to query capability.
> +     *
> +     * Returns: <0 on failure, 0 if a @cap is unsupported, or else
> +     * 1 or some positive value for some special @cap,
> +     * i.e., HOST_IOMMU_DEVICE_CAP_AW_BITS.
> +     */
> +    int (*check_cap)(HostIOMMUDevice *hiod, int cap, Error **errp);
>   };
> +
> +/*
> + * Host IOMMU device capability list.
> + */
> +#define HOST_IOMMU_DEVICE_CAP_IOMMUFD       0
> +#define HOST_IOMMU_DEVICE_CAP_IOMMU_TYPE    1
> +#define HOST_IOMMU_DEVICE_CAP_AW_BITS       2
> +
> +
> +int host_iommu_device_check_cap(HostIOMMUDevice *hiod, int cap, Error **errp);
> +int host_iommu_device_check_cap_common(HostIOMMUDevice *hiod, int cap,
> +                                       Error **errp);
>   #endif
> diff --git a/backends/host_iommu_device.c b/backends/host_iommu_device.c
> index 41f2fdce20..b97d008cc7 100644
> --- a/backends/host_iommu_device.c
> +++ b/backends/host_iommu_device.c
> @@ -28,3 +28,32 @@ static void host_iommu_device_init(Object *obj)
>   static void host_iommu_device_finalize(Object *obj)
>   {
>   }
> +
> +/* Wrapper of HostIOMMUDeviceClass:check_cap */
> +int host_iommu_device_check_cap(HostIOMMUDevice *hiod, int cap, Error **errp)
> +{
> +    HostIOMMUDeviceClass *hiodc = HOST_IOMMU_DEVICE_GET_CLASS(hiod);
> +    if (!hiodc->check_cap) {
> +        error_setg(errp, ".check_cap() not implemented");
> +        return -EINVAL;
> +    }
> +
> +    return hiodc->check_cap(hiod, cap, errp);
> +}
> +
> +/* Implement check on common IOMMU capabilities */
> +int host_iommu_device_check_cap_common(HostIOMMUDevice *hiod, int cap,
> +                                       Error **errp)
> +{
> +    HostIOMMUDeviceCaps *caps = &hiod->caps;
> +
> +    switch (cap) {
> +    case HOST_IOMMU_DEVICE_CAP_IOMMU_TYPE:
> +        return caps->type;
> +    case HOST_IOMMU_DEVICE_CAP_AW_BITS:
> +        return caps->aw_bits;
> +    default:
> +        error_setg(errp, "Not support query cap %x", cap);
> +        return -EINVAL;
> +    }
> +}



^ permalink raw reply	[flat|nested] 59+ messages in thread

* RE: [PATCH v3 05/19] backends/host_iommu_device: Introduce HostIOMMUDeviceCaps
  2024-05-07  6:11   ` Cédric Le Goater
@ 2024-05-07  6:24     ` Duan, Zhenzhong
  0 siblings, 0 replies; 59+ messages in thread
From: Duan, Zhenzhong @ 2024-05-07  6:24 UTC (permalink / raw)
  To: Cédric Le Goater, qemu-devel
  Cc: alex.williamson, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, Tian, Kevin, Liu, Yi L, Peng, Chao P



>-----Original Message-----
>From: Cédric Le Goater <clg@redhat.com>
>Subject: Re: [PATCH v3 05/19] backends/host_iommu_device: Introduce
>HostIOMMUDeviceCaps
>
>Hello Zhenzhong,
>
>On 4/29/24 08:50, Zhenzhong Duan wrote:
>> HostIOMMUDeviceCaps's elements map to the host IOMMU's capabilities.
>> Different platform IOMMU can support different elements.
>>
>> Currently only two elements, type and aw_bits, type hints the host
>> platform IOMMU type, i.e., INTEL vtd, ARM smmu, etc; aw_bits hints
>> host IOMMU address width.
>>
>> Introduce .check_cap() handler to check if
>HOST_IOMMU_DEVICE_CAP_XXX
>> is supported.
>>
>> Introduce a HostIOMMUDevice API host_iommu_device_check_cap()
>which
>> is a wrapper of .check_cap().
>>
>> Introduce a HostIOMMUDevice API
>host_iommu_device_check_cap_common()
>> to check common capabalities of different host platform IOMMUs.
>>
>> Suggested-by: Cédric Le Goater <clg@redhat.com>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> ---
>>   include/sysemu/host_iommu_device.h | 44
>++++++++++++++++++++++++++++++
>>   backends/host_iommu_device.c       | 29 ++++++++++++++++++++
>>   2 files changed, 73 insertions(+)
>>
>> diff --git a/include/sysemu/host_iommu_device.h
>b/include/sysemu/host_iommu_device.h
>> index 2b58a94d62..12b6afb463 100644
>> --- a/include/sysemu/host_iommu_device.h
>> +++ b/include/sysemu/host_iommu_device.h
>> @@ -14,12 +14,27 @@
>>
>>   #include "qom/object.h"
>>   #include "qapi/error.h"
>> +#include "linux/iommufd.h"
>
>
>Please use instead :
>
>    #include <linux/iommufd.h>

Got it.

Thanks
Zhenzhong

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 15/19] hw/pci: Introduce pci_device_[set|unset]_iommu_device()
  2024-04-29  6:50 ` [PATCH v3 15/19] hw/pci: Introduce pci_device_[set|unset]_iommu_device() Zhenzhong Duan
@ 2024-05-07  7:04   ` Cédric Le Goater
  2024-05-07  7:48     ` Duan, Zhenzhong
  0 siblings, 1 reply; 59+ messages in thread
From: Cédric Le Goater @ 2024-05-07  7:04 UTC (permalink / raw)
  To: Zhenzhong Duan, qemu-devel
  Cc: alex.williamson, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, kevin.tian, yi.l.liu, chao.p.peng,
	Yi Sun, Marcel Apfelbaum

Hello Zhenzhong,

On 4/29/24 08:50, Zhenzhong Duan wrote:
> From: Yi Liu <yi.l.liu@intel.com>
> 
> pci_device_[set|unset]_iommu_device() call pci_device_get_iommu_bus_devfn()
> to get iommu_bus->iommu_ops and call [set|unset]_iommu_device callback to
> set/unset HostIOMMUDevice for a given PCI device.
> 
> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
>   include/hw/pci/pci.h | 38 +++++++++++++++++++++++++++++++++++++-
>   hw/pci/pci.c         | 27 +++++++++++++++++++++++++++
>   2 files changed, 64 insertions(+), 1 deletion(-)
> 
> diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
> index eaa3fc99d8..849e391813 100644
> --- a/include/hw/pci/pci.h
> +++ b/include/hw/pci/pci.h
> @@ -3,6 +3,7 @@
>   
>   #include "exec/memory.h"
>   #include "sysemu/dma.h"
> +#include "sysemu/host_iommu_device.h"

This include directive pulls a Linux header file <linux/iommufd.h>
which doesn't exist on all platforms, such as windows and it breaks
compile. So,

>   
>   /* PCI includes legacy ISA access.  */
>   #include "hw/isa/isa.h"
> @@ -383,10 +384,45 @@ typedef struct PCIIOMMUOps {
>        *
>        * @devfn: device and function number
>        */
> -   AddressSpace * (*get_address_space)(PCIBus *bus, void *opaque, int devfn);
> +    AddressSpace * (*get_address_space)(PCIBus *bus, void *opaque, int devfn);
> +    /**
> +     * @set_iommu_device: attach a HostIOMMUDevice to a vIOMMU
> +     *
> +     * Optional callback, if not implemented in vIOMMU, then vIOMMU can't
> +     * retrieve host information from the associated HostIOMMUDevice.
> +     *
> +     * @bus: the #PCIBus of the PCI device.
> +     *
> +     * @opaque: the data passed to pci_setup_iommu().
> +     *
> +     * @devfn: device and function number of the PCI device.
> +     *
> +     * @dev: the data structure representing host IOMMU device.
> +     *
> +     * @errp: pass an Error out only when return false
> +     *
> +     * Returns: 0 if HostIOMMUDevice is attached, or else <0 with errp set.
> +     */
> +    int (*set_iommu_device)(PCIBus *bus, void *opaque, int devfn,
> +                            HostIOMMUDevice *dev, Error **errp);
> +    /**
> +     * @unset_iommu_device: detach a HostIOMMUDevice from a vIOMMU
> +     *
> +     * Optional callback.
> +     *
> +     * @bus: the #PCIBus of the PCI device.
> +     *
> +     * @opaque: the data passed to pci_setup_iommu().
> +     *
> +     * @devfn: device and function number of the PCI device.
> +     */
> +    void (*unset_iommu_device)(PCIBus *bus, void *opaque, int devfn);
>   } PCIIOMMUOps;
>   
>   AddressSpace *pci_device_iommu_address_space(PCIDevice *dev);
> +int pci_device_set_iommu_device(PCIDevice *dev, HostIOMMUDevice *hiod,
> +                                Error **errp);

please include a forward declaration for HostIOMMUDevice instead.

Thanks,

C.




> +void pci_device_unset_iommu_device(PCIDevice *dev);
>   
>   /**
>    * pci_setup_iommu: Initialize specific IOMMU handlers for a PCIBus
> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> index 02a4bb2af6..c3293e9357 100644
> --- a/hw/pci/pci.c
> +++ b/hw/pci/pci.c
> @@ -2742,6 +2742,33 @@ AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
>       return &address_space_memory;
>   }
>   
> +int pci_device_set_iommu_device(PCIDevice *dev, HostIOMMUDevice *hiod,
> +                                Error **errp)
> +{
> +    PCIBus *iommu_bus;
> +
> +    /* set_iommu_device requires device's direct BDF instead of aliased BDF */
> +    pci_device_get_iommu_bus_devfn(dev, &iommu_bus, NULL, NULL);
> +    if (iommu_bus && iommu_bus->iommu_ops->set_iommu_device) {
> +        return iommu_bus->iommu_ops->set_iommu_device(pci_get_bus(dev),
> +                                                      iommu_bus->iommu_opaque,
> +                                                      dev->devfn, hiod, errp);
> +    }
> +    return 0;
> +}
> +
> +void pci_device_unset_iommu_device(PCIDevice *dev)
> +{
> +    PCIBus *iommu_bus;
> +
> +    pci_device_get_iommu_bus_devfn(dev, &iommu_bus, NULL, NULL);
> +    if (iommu_bus && iommu_bus->iommu_ops->unset_iommu_device) {
> +        return iommu_bus->iommu_ops->unset_iommu_device(pci_get_bus(dev),
> +                                                        iommu_bus->iommu_opaque,
> +                                                        dev->devfn);
> +    }
> +}
> +
>   void pci_setup_iommu(PCIBus *bus, const PCIIOMMUOps *ops, void *opaque)
>   {
>       /*



^ permalink raw reply	[flat|nested] 59+ messages in thread

* RE: [PATCH v3 15/19] hw/pci: Introduce pci_device_[set|unset]_iommu_device()
  2024-05-07  7:04   ` Cédric Le Goater
@ 2024-05-07  7:48     ` Duan, Zhenzhong
  2024-05-07 12:26       ` Cédric Le Goater
  0 siblings, 1 reply; 59+ messages in thread
From: Duan, Zhenzhong @ 2024-05-07  7:48 UTC (permalink / raw)
  To: Cédric Le Goater, qemu-devel
  Cc: alex.williamson, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, Tian, Kevin, Liu, Yi L, Peng, Chao P,
	Yi Sun, Marcel Apfelbaum

Hi Cédric,

>-----Original Message-----
>From: Cédric Le Goater <clg@redhat.com>
>Subject: Re: [PATCH v3 15/19] hw/pci: Introduce
>pci_device_[set|unset]_iommu_device()
>
>Hello Zhenzhong,
>
>On 4/29/24 08:50, Zhenzhong Duan wrote:
>> From: Yi Liu <yi.l.liu@intel.com>
>>
>> pci_device_[set|unset]_iommu_device() call
>pci_device_get_iommu_bus_devfn()
>> to get iommu_bus->iommu_ops and call [set|unset]_iommu_device
>callback to
>> set/unset HostIOMMUDevice for a given PCI device.
>>
>> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
>> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
>> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> ---
>>   include/hw/pci/pci.h | 38
>+++++++++++++++++++++++++++++++++++++-
>>   hw/pci/pci.c         | 27 +++++++++++++++++++++++++++
>>   2 files changed, 64 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
>> index eaa3fc99d8..849e391813 100644
>> --- a/include/hw/pci/pci.h
>> +++ b/include/hw/pci/pci.h
>> @@ -3,6 +3,7 @@
>>
>>   #include "exec/memory.h"
>>   #include "sysemu/dma.h"
>> +#include "sysemu/host_iommu_device.h"
>
>This include directive pulls a Linux header file <linux/iommufd.h>
>which doesn't exist on all platforms, such as windows and it breaks
>compile. So,
>
>>
>>   /* PCI includes legacy ISA access.  */
>>   #include "hw/isa/isa.h"
>> @@ -383,10 +384,45 @@ typedef struct PCIIOMMUOps {
>>        *
>>        * @devfn: device and function number
>>        */
>> -   AddressSpace * (*get_address_space)(PCIBus *bus, void *opaque, int
>devfn);
>> +    AddressSpace * (*get_address_space)(PCIBus *bus, void *opaque, int
>devfn);
>> +    /**
>> +     * @set_iommu_device: attach a HostIOMMUDevice to a vIOMMU
>> +     *
>> +     * Optional callback, if not implemented in vIOMMU, then vIOMMU
>can't
>> +     * retrieve host information from the associated HostIOMMUDevice.
>> +     *
>> +     * @bus: the #PCIBus of the PCI device.
>> +     *
>> +     * @opaque: the data passed to pci_setup_iommu().
>> +     *
>> +     * @devfn: device and function number of the PCI device.
>> +     *
>> +     * @dev: the data structure representing host IOMMU device.
>> +     *
>> +     * @errp: pass an Error out only when return false
>> +     *
>> +     * Returns: 0 if HostIOMMUDevice is attached, or else <0 with errp set.
>> +     */
>> +    int (*set_iommu_device)(PCIBus *bus, void *opaque, int devfn,
>> +                            HostIOMMUDevice *dev, Error **errp);
>> +    /**
>> +     * @unset_iommu_device: detach a HostIOMMUDevice from a
>vIOMMU
>> +     *
>> +     * Optional callback.
>> +     *
>> +     * @bus: the #PCIBus of the PCI device.
>> +     *
>> +     * @opaque: the data passed to pci_setup_iommu().
>> +     *
>> +     * @devfn: device and function number of the PCI device.
>> +     */
>> +    void (*unset_iommu_device)(PCIBus *bus, void *opaque, int devfn);
>>   } PCIIOMMUOps;
>>
>>   AddressSpace *pci_device_iommu_address_space(PCIDevice *dev);
>> +int pci_device_set_iommu_device(PCIDevice *dev, HostIOMMUDevice
>*hiod,
>> +                                Error **errp);
>
>please include a forward declaration for HostIOMMUDevice instead.

Got it, will do.
Maybe using iommu_hw_info_type in include/sysemu/host_iommu_device.h
isn't a good idea from start.

Thanks
Zhenzhong


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 00/19] Add a host IOMMU device abstraction to check with vIOMMU
  2024-05-07  2:24       ` Duan, Zhenzhong
@ 2024-05-07 11:58         ` Jason Gunthorpe
  2024-05-08  6:36           ` Duan, Zhenzhong
  0 siblings, 1 reply; 59+ messages in thread
From: Jason Gunthorpe @ 2024-05-07 11:58 UTC (permalink / raw)
  To: Duan, Zhenzhong
  Cc: Cédric Le Goater, qemu-devel, alex.williamson, eric.auger,
	mst, peterx, jasowang, nicolinc, joao.m.martins, Tian, Kevin,
	Liu, Yi L, Peng, Chao P

On Tue, May 07, 2024 at 02:24:30AM +0000, Duan, Zhenzhong wrote:
> >On Mon, May 06, 2024 at 02:30:47AM +0000, Duan, Zhenzhong wrote:
> >
> >> I'm not clear how useful multiple iommufd instances support are.
> >> One possible benefit is for security? It may bring a slightly fine-grained
> >> isolation in kernel.
> >
> >No. I don't think there is any usecase, it is only harmful.
> 
> OK, so we need to limit QEMU to only one iommufd instance.

I don't know about limit, but you don't need to do extra stuff to make
it work.

The main issue will be to get all the viommu instances to share the
same iommufd IOAS for the guest physical mapping. Otherwise each
viommu should be largely unware of the others sharing (or not) a
iommufd.

If you can structure things properly it probably doesn't need a hard
limit, it will just work worse.

Jason


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 15/19] hw/pci: Introduce pci_device_[set|unset]_iommu_device()
  2024-05-07  7:48     ` Duan, Zhenzhong
@ 2024-05-07 12:26       ` Cédric Le Goater
  2024-05-08  6:24         ` Duan, Zhenzhong
  0 siblings, 1 reply; 59+ messages in thread
From: Cédric Le Goater @ 2024-05-07 12:26 UTC (permalink / raw)
  To: Duan, Zhenzhong, qemu-devel
  Cc: alex.williamson, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, Tian, Kevin, Liu, Yi L, Peng, Chao P,
	Yi Sun, Marcel Apfelbaum

On 5/7/24 09:48, Duan, Zhenzhong wrote:
> Hi Cédric,
> 
>> -----Original Message-----
>> From: Cédric Le Goater <clg@redhat.com>
>> Subject: Re: [PATCH v3 15/19] hw/pci: Introduce
>> pci_device_[set|unset]_iommu_device()
>>
>> Hello Zhenzhong,
>>
>> On 4/29/24 08:50, Zhenzhong Duan wrote:
>>> From: Yi Liu <yi.l.liu@intel.com>
>>>
>>> pci_device_[set|unset]_iommu_device() call
>> pci_device_get_iommu_bus_devfn()
>>> to get iommu_bus->iommu_ops and call [set|unset]_iommu_device
>> callback to
>>> set/unset HostIOMMUDevice for a given PCI device.
>>>
>>> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
>>> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
>>> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>> ---
>>>    include/hw/pci/pci.h | 38
>> +++++++++++++++++++++++++++++++++++++-
>>>    hw/pci/pci.c         | 27 +++++++++++++++++++++++++++
>>>    2 files changed, 64 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
>>> index eaa3fc99d8..849e391813 100644
>>> --- a/include/hw/pci/pci.h
>>> +++ b/include/hw/pci/pci.h
>>> @@ -3,6 +3,7 @@
>>>
>>>    #include "exec/memory.h"
>>>    #include "sysemu/dma.h"
>>> +#include "sysemu/host_iommu_device.h"
>>
>> This include directive pulls a Linux header file <linux/iommufd.h>
>> which doesn't exist on all platforms, such as windows and it breaks
>> compile. So,
>>
>>>
>>>    /* PCI includes legacy ISA access.  */
>>>    #include "hw/isa/isa.h"
>>> @@ -383,10 +384,45 @@ typedef struct PCIIOMMUOps {
>>>         *
>>>         * @devfn: device and function number
>>>         */
>>> -   AddressSpace * (*get_address_space)(PCIBus *bus, void *opaque, int
>> devfn);
>>> +    AddressSpace * (*get_address_space)(PCIBus *bus, void *opaque, int
>> devfn);
>>> +    /**
>>> +     * @set_iommu_device: attach a HostIOMMUDevice to a vIOMMU
>>> +     *
>>> +     * Optional callback, if not implemented in vIOMMU, then vIOMMU
>> can't
>>> +     * retrieve host information from the associated HostIOMMUDevice.
>>> +     *
>>> +     * @bus: the #PCIBus of the PCI device.
>>> +     *
>>> +     * @opaque: the data passed to pci_setup_iommu().
>>> +     *
>>> +     * @devfn: device and function number of the PCI device.
>>> +     *
>>> +     * @dev: the data structure representing host IOMMU device.
>>> +     *
>>> +     * @errp: pass an Error out only when return false
>>> +     *
>>> +     * Returns: 0 if HostIOMMUDevice is attached, or else <0 with errp set.
>>> +     */
>>> +    int (*set_iommu_device)(PCIBus *bus, void *opaque, int devfn,
>>> +                            HostIOMMUDevice *dev, Error **errp);
>>> +    /**
>>> +     * @unset_iommu_device: detach a HostIOMMUDevice from a
>> vIOMMU
>>> +     *
>>> +     * Optional callback.
>>> +     *
>>> +     * @bus: the #PCIBus of the PCI device.
>>> +     *
>>> +     * @opaque: the data passed to pci_setup_iommu().
>>> +     *
>>> +     * @devfn: device and function number of the PCI device.
>>> +     */
>>> +    void (*unset_iommu_device)(PCIBus *bus, void *opaque, int devfn);
>>>    } PCIIOMMUOps;
>>>
>>>    AddressSpace *pci_device_iommu_address_space(PCIDevice *dev);
>>> +int pci_device_set_iommu_device(PCIDevice *dev, HostIOMMUDevice
>> *hiod,
>>> +                                Error **errp);
>>
>> please include a forward declaration for HostIOMMUDevice instead.
> 
> Got it, will do.
> Maybe using iommu_hw_info_type in include/sysemu/host_iommu_device.h
> isn't a good idea from start.

It is not indeed since some included files are used on the Windows build.

Thanks,

C.



> 
> Thanks
> Zhenzhong
> 



^ permalink raw reply	[flat|nested] 59+ messages in thread

* RE: [PATCH v3 15/19] hw/pci: Introduce pci_device_[set|unset]_iommu_device()
  2024-05-07 12:26       ` Cédric Le Goater
@ 2024-05-08  6:24         ` Duan, Zhenzhong
  0 siblings, 0 replies; 59+ messages in thread
From: Duan, Zhenzhong @ 2024-05-08  6:24 UTC (permalink / raw)
  To: Cédric Le Goater, qemu-devel
  Cc: alex.williamson, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, Tian, Kevin, Liu, Yi L, Peng, Chao P,
	Yi Sun, Marcel Apfelbaum



>-----Original Message-----
>From: Cédric Le Goater <clg@redhat.com>
>Subject: Re: [PATCH v3 15/19] hw/pci: Introduce
>pci_device_[set|unset]_iommu_device()
>
>On 5/7/24 09:48, Duan, Zhenzhong wrote:
>> Hi Cédric,
>>
>>> -----Original Message-----
>>> From: Cédric Le Goater <clg@redhat.com>
>>> Subject: Re: [PATCH v3 15/19] hw/pci: Introduce
>>> pci_device_[set|unset]_iommu_device()
>>>
>>> Hello Zhenzhong,
>>>
>>> On 4/29/24 08:50, Zhenzhong Duan wrote:
>>>> From: Yi Liu <yi.l.liu@intel.com>
>>>>
>>>> pci_device_[set|unset]_iommu_device() call
>>> pci_device_get_iommu_bus_devfn()
>>>> to get iommu_bus->iommu_ops and call [set|unset]_iommu_device
>>> callback to
>>>> set/unset HostIOMMUDevice for a given PCI device.
>>>>
>>>> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
>>>> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
>>>> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
>>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>>> ---
>>>>    include/hw/pci/pci.h | 38
>>> +++++++++++++++++++++++++++++++++++++-
>>>>    hw/pci/pci.c         | 27 +++++++++++++++++++++++++++
>>>>    2 files changed, 64 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
>>>> index eaa3fc99d8..849e391813 100644
>>>> --- a/include/hw/pci/pci.h
>>>> +++ b/include/hw/pci/pci.h
>>>> @@ -3,6 +3,7 @@
>>>>
>>>>    #include "exec/memory.h"
>>>>    #include "sysemu/dma.h"
>>>> +#include "sysemu/host_iommu_device.h"
>>>
>>> This include directive pulls a Linux header file <linux/iommufd.h>
>>> which doesn't exist on all platforms, such as windows and it breaks
>>> compile. So,
>>>
>>>>
>>>>    /* PCI includes legacy ISA access.  */
>>>>    #include "hw/isa/isa.h"
>>>> @@ -383,10 +384,45 @@ typedef struct PCIIOMMUOps {
>>>>         *
>>>>         * @devfn: device and function number
>>>>         */
>>>> -   AddressSpace * (*get_address_space)(PCIBus *bus, void *opaque, int
>>> devfn);
>>>> +    AddressSpace * (*get_address_space)(PCIBus *bus, void *opaque,
>int
>>> devfn);
>>>> +    /**
>>>> +     * @set_iommu_device: attach a HostIOMMUDevice to a vIOMMU
>>>> +     *
>>>> +     * Optional callback, if not implemented in vIOMMU, then vIOMMU
>>> can't
>>>> +     * retrieve host information from the associated HostIOMMUDevice.
>>>> +     *
>>>> +     * @bus: the #PCIBus of the PCI device.
>>>> +     *
>>>> +     * @opaque: the data passed to pci_setup_iommu().
>>>> +     *
>>>> +     * @devfn: device and function number of the PCI device.
>>>> +     *
>>>> +     * @dev: the data structure representing host IOMMU device.
>>>> +     *
>>>> +     * @errp: pass an Error out only when return false
>>>> +     *
>>>> +     * Returns: 0 if HostIOMMUDevice is attached, or else <0 with errp
>set.
>>>> +     */
>>>> +    int (*set_iommu_device)(PCIBus *bus, void *opaque, int devfn,
>>>> +                            HostIOMMUDevice *dev, Error **errp);
>>>> +    /**
>>>> +     * @unset_iommu_device: detach a HostIOMMUDevice from a
>>> vIOMMU
>>>> +     *
>>>> +     * Optional callback.
>>>> +     *
>>>> +     * @bus: the #PCIBus of the PCI device.
>>>> +     *
>>>> +     * @opaque: the data passed to pci_setup_iommu().
>>>> +     *
>>>> +     * @devfn: device and function number of the PCI device.
>>>> +     */
>>>> +    void (*unset_iommu_device)(PCIBus *bus, void *opaque, int devfn);
>>>>    } PCIIOMMUOps;
>>>>
>>>>    AddressSpace *pci_device_iommu_address_space(PCIDevice *dev);
>>>> +int pci_device_set_iommu_device(PCIDevice *dev, HostIOMMUDevice
>>> *hiod,
>>>> +                                Error **errp);
>>>
>>> please include a forward declaration for HostIOMMUDevice instead.
>>
>> Got it, will do.
>> Maybe using iommu_hw_info_type in
>include/sysemu/host_iommu_device.h
>> isn't a good idea from start.
>
>It is not indeed since some included files are used on the Windows build.

Windows build pass after below change to drop iommu_hw_info.
I'll do more build test, then send v5, sorry for the trouble.

diff --git a/backends/iommufd.c b/backends/iommufd.c
index 7c332bd167..fc3e498bc6 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -208,7 +208,7 @@ int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id,
 }

 bool iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
-                                     enum iommu_hw_info_type *type,
+                                     unit32_t *type,
                                      void *data, uint32_t len, Error **errp)
 {
     struct iommu_hw_info info = {
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 29fbe61454..ddef667c0d 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -20,6 +20,7 @@
  */

 #include "qemu/osdep.h"
+#include CONFIG_DEVICES /* CONFIG_HOST_IOMMU_DEVICE */
 #include "qemu/error-report.h"
 #include "qemu/main-loop.h"
 #include "qapi/error.h"
@@ -4203,6 +4204,8 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
 static bool vtd_check_hdev(IntelIOMMUState *s, VTDHostIOMMUDevice *vtd_hdev,
                            Error **errp)
 {
+#ifdef CONFIG_HOST_IOMMU_DEVICE
+
     HostIOMMUDevice *hiod = vtd_hdev->dev;
     int ret;

@@ -4223,6 +4226,9 @@ static bool vtd_check_hdev(IntelIOMMUState *s, VTDHostIOMMUDevice *vtd_hdev,

     error_setg(errp, "host device is unsupported in scalable modern mode yet");
     return false;
+#else
+    return true;
+#endif
 }

 static bool vtd_dev_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
diff --git a/include/sysemu/host_iommu_device.h b/include/sysemu/host_iommu_device.h
index 680c2a311a..ed4cbc967a 100644
--- a/include/sysemu/host_iommu_device.h
+++ b/include/sysemu/host_iommu_device.h
@@ -14,7 +14,6 @@

 #include "qom/object.h"
 #include "qapi/error.h"
-#include <linux/iommufd.h>

 /**
  * struct HostIOMMUDeviceCaps - Define host IOMMU device capabilities.
@@ -24,7 +23,7 @@
  * @aw_bits: host IOMMU address width. 0xff if no limitation.
  */
 typedef struct HostIOMMUDeviceCaps {
-    enum iommu_hw_info_type type;
+    uint32_t type;
     uint8_t aw_bits;
 } HostIOMMUDeviceCaps;

diff --git a/include/sysemu/iommufd.h b/include/sysemu/iommufd.h
index ee1907c23a..6503a51d79 100644
--- a/include/sysemu/iommufd.h
+++ b/include/sysemu/iommufd.h
@@ -17,7 +17,6 @@
 #include "qom/object.h"
 #include "exec/hwaddr.h"
 #include "exec/cpu-common.h"
-#include <linux/iommufd.h>
 #include "sysemu/host_iommu_device.h"

 #define TYPE_IOMMUFD_BACKEND "iommufd"
@@ -49,7 +48,7 @@ int iommufd_backend_map_dma(IOMMUFDBackend *be, uint32_t ioas_id, hwaddr iova,
 int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id,
                               hwaddr iova, ram_addr_t size);
 bool iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
-                                     enum iommu_hw_info_type *type,
+                                     uint32_t *type,
                                      void *data, uint32_t len, Error **errp);

 #define TYPE_HOST_IOMMU_DEVICE_IOMMUFD TYPE_HOST_IOMMU_DEVICE "-iommufd"

Thanks
Zhenzhong

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* RE: [PATCH v3 00/19] Add a host IOMMU device abstraction to check with vIOMMU
  2024-05-07 11:58         ` Jason Gunthorpe
@ 2024-05-08  6:36           ` Duan, Zhenzhong
  0 siblings, 0 replies; 59+ messages in thread
From: Duan, Zhenzhong @ 2024-05-08  6:36 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Cédric Le Goater, qemu-devel, alex.williamson, eric.auger,
	mst, peterx, jasowang, nicolinc, joao.m.martins, Tian, Kevin,
	Liu, Yi L, Peng, Chao P



>-----Original Message-----
>From: Jason Gunthorpe <jgg@nvidia.com>
>Subject: Re: [PATCH v3 00/19] Add a host IOMMU device abstraction to
>check with vIOMMU
>
>On Tue, May 07, 2024 at 02:24:30AM +0000, Duan, Zhenzhong wrote:
>> >On Mon, May 06, 2024 at 02:30:47AM +0000, Duan, Zhenzhong wrote:
>> >
>> >> I'm not clear how useful multiple iommufd instances support are.
>> >> One possible benefit is for security? It may bring a slightly fine-grained
>> >> isolation in kernel.
>> >
>> >No. I don't think there is any usecase, it is only harmful.
>>
>> OK, so we need to limit QEMU to only one iommufd instance.
>
>I don't know about limit, but you don't need to do extra stuff to make
>it work.
>
>The main issue will be to get all the viommu instances to share the
>same iommufd IOAS for the guest physical mapping. Otherwise each
>viommu should be largely unware of the others sharing (or not) a
>iommufd.

I see.

>
>If you can structure things properly it probably doesn't need a hard
>limit, it will just work worse.

OK, thanks for clarify.
The extra code to support multiple instances in intel_iommu is trivial.
So I'd like to keep this flexibility to user just like cdev. User can configure
QEMU cmdline to use one IOMMUFD instance easily whenever they want.

Thanks
Zhenzhong


^ permalink raw reply	[flat|nested] 59+ messages in thread

end of thread, other threads:[~2024-05-08  6:36 UTC | newest]

Thread overview: 59+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-29  6:50 [PATCH v3 00/19] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
2024-04-29  6:50 ` [PATCH v3 01/19] backends: Introduce HostIOMMUDevice abstract Zhenzhong Duan
2024-04-29  6:50 ` [PATCH v3 02/19] vfio/container: Introduce HostIOMMUDeviceLegacyVFIO device Zhenzhong Duan
2024-04-30  7:51   ` Cédric Le Goater
2024-04-30  9:13     ` Duan, Zhenzhong
2024-04-30 12:00       ` Cédric Le Goater
2024-04-29  6:50 ` [PATCH v3 03/19] backends/iommufd: Introduce abstract HostIOMMUDeviceIOMMUFD device Zhenzhong Duan
2024-04-29  6:50 ` [PATCH v3 04/19] vfio/iommufd: Introduce HostIOMMUDeviceIOMMUFDVFIO device Zhenzhong Duan
2024-04-30  7:52   ` Cédric Le Goater
2024-04-30  9:25     ` Duan, Zhenzhong
2024-04-29  6:50 ` [PATCH v3 05/19] backends/host_iommu_device: Introduce HostIOMMUDeviceCaps Zhenzhong Duan
2024-04-30  9:41   ` Cédric Le Goater
2024-04-30  9:55     ` Duan, Zhenzhong
2024-04-30 12:01       ` Cédric Le Goater
2024-05-07  6:11   ` Cédric Le Goater
2024-05-07  6:24     ` Duan, Zhenzhong
2024-04-29  6:50 ` [PATCH v3 06/19] range: Introduce range_get_last_bit() Zhenzhong Duan
2024-04-30  9:41   ` Cédric Le Goater
2024-04-30  9:58     ` Duan, Zhenzhong
2024-05-02 10:30       ` Cédric Le Goater
2024-05-06  6:45         ` Duan, Zhenzhong
2024-04-29  6:50 ` [PATCH v3 07/19] vfio/container: Implement HostIOMMUDeviceClass::realize() handler Zhenzhong Duan
2024-04-30  9:41   ` Cédric Le Goater
2024-04-30  9:59     ` Duan, Zhenzhong
2024-04-29  6:50 ` [PATCH v3 08/19] backends/iommufd: Introduce helper function iommufd_backend_get_device_info() Zhenzhong Duan
2024-04-30  9:41   ` Cédric Le Goater
2024-04-30 10:06     ` Duan, Zhenzhong
2024-04-29  6:50 ` [PATCH v3 09/19] vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler Zhenzhong Duan
2024-04-29  6:50 ` [PATCH v3 10/19] vfio/container: Implement HostIOMMUDeviceClass::check_cap() handler Zhenzhong Duan
2024-04-29  6:50 ` [PATCH v3 11/19] backends/iommufd: " Zhenzhong Duan
2024-04-30  9:41   ` Cédric Le Goater
2024-04-30 10:06     ` Duan, Zhenzhong
2024-04-30 12:12       ` Cédric Le Goater
2024-05-01 12:34         ` Duan, Zhenzhong
2024-05-02  8:17           ` Cédric Le Goater
2024-05-06  1:47             ` Duan, Zhenzhong
2024-04-29  6:50 ` [PATCH v3 12/19] vfio: Introduce VFIOIOMMUClass::hiod_typename attribute Zhenzhong Duan
2024-04-29  6:50 ` [PATCH v3 13/19] vfio: Create host IOMMU device instance Zhenzhong Duan
2024-04-30  9:41   ` Cédric Le Goater
2024-04-30 10:16     ` Duan, Zhenzhong
2024-04-30 12:15       ` Cédric Le Goater
2024-04-29  6:50 ` [PATCH v3 14/19] hw/pci: Introduce helper function pci_device_get_iommu_bus_devfn() Zhenzhong Duan
2024-04-29  6:50 ` [PATCH v3 15/19] hw/pci: Introduce pci_device_[set|unset]_iommu_device() Zhenzhong Duan
2024-05-07  7:04   ` Cédric Le Goater
2024-05-07  7:48     ` Duan, Zhenzhong
2024-05-07 12:26       ` Cédric Le Goater
2024-05-08  6:24         ` Duan, Zhenzhong
2024-04-29  6:50 ` [PATCH v3 16/19] vfio/pci: Pass HostIOMMUDevice to vIOMMU Zhenzhong Duan
2024-04-29  6:50 ` [PATCH v3 17/19] intel_iommu: Extract out vtd_cap_init() to initialize cap/ecap Zhenzhong Duan
2024-04-29  6:50 ` [PATCH v3 18/19] intel_iommu: Implement [set|unset]_iommu_device() callbacks Zhenzhong Duan
2024-04-29  6:50 ` [PATCH v3 19/19] intel_iommu: Check compatibility with host IOMMU capabilities Zhenzhong Duan
2024-05-03 14:04 ` [PATCH v3 00/19] Add a host IOMMU device abstraction to check with vIOMMU Cédric Le Goater
2024-05-03 14:10   ` Jason Gunthorpe
2024-05-03 14:29     ` Cédric Le Goater
2024-05-06  2:30   ` Duan, Zhenzhong
2024-05-06 12:05     ` Jason Gunthorpe
2024-05-07  2:24       ` Duan, Zhenzhong
2024-05-07 11:58         ` Jason Gunthorpe
2024-05-08  6:36           ` Duan, Zhenzhong

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.