All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] x86/fpu: Extend kernel_fpu_begin_mask() for the In-Field Scan driver
@ 2024-04-30 21:25 Chang S. Bae
  2024-04-30 21:25 ` [PATCH 1/2] x86/fpu: Extend kernel_fpu_begin_mask() to initialize AMX state Chang S. Bae
                   ` (4 more replies)
  0 siblings, 5 replies; 30+ messages in thread
From: Chang S. Bae @ 2024-04-30 21:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, platform-driver-x86, tglx, mingo, bp, dave.hansen, hdegoede,
	ilpo.jarvinen, tony.luck, ashok.raj, jithu.joseph,
	chang.seok.bae

The recent update [1] in the SDM highlights the requirement of
initializing the AMX state for executing the scan test:
    "... maintaining AMX state in a non-initialized state ... will
     prevent the execution of In-Field Scan tests."
which is one of CPU state conditions required for the test's execution.

In situations where AMX workloads are running, the switched-away active
user AMX state remains due to the optimization to reduce the state
switching cost. A user state reload is fully completed right before
returning to userspace. Consequently, if the switched-in kernel task is
executing the scan test, this non-initialized AMX state causes the test
to be unable to start.

Given the benefit of the scan test in detecting hardware faults, ensuring
its seamless execution is not negligible. This necessitates a proper API
for the driver code to initialize AMX states. Although fpu_idle_fpregs()
may initialize the AMX state, its primary usage should be limited to
sleep state handling, making it unsuitable for the scan test.

The across-architecture FPU API, kernel_fpu_begin()/kernel_fpu_end(), is
universally established for floating-point SIMD code in the kernel. On
x86, kernel_fpu_begin_mask() is available, with kernel_fpu_begin()
serving as a wrapper to it for initializing legacy states by default.

The proposed solution extends kernel_fpu_begin_mask() to accept a new
option for initializing the AMX state. Additionally, it introduces custom
FPU handling wrappers for the In-Field Scan driver, which are variants of
kernel_fpu_begin()/kernel_fpu_end(). This approach is considerably
compliant with established semantics, following the EFI case with
efi_fpu_begin/efi_fpu_end().

Thanks,
Chang

[1] Intel Software Development Manual as of March 2024, Section 18.2
    RECOMMENDATIONS FOR SYSTEM SOFTWARE of Vol. 1.
    https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html

Chang S. Bae (2):
  x86/fpu: Extend kernel_fpu_begin_mask() to initialize AMX state
  platform/x86/intel/ifs: Initialize AMX state for the scan test

 arch/x86/include/asm/fpu/api.h           |  1 +
 arch/x86/kernel/fpu/core.c               |  3 +++
 drivers/platform/x86/intel/ifs/ifs.h     | 14 ++++++++++++++
 drivers/platform/x86/intel/ifs/runtest.c |  6 ++++++
 4 files changed, 24 insertions(+)


base-commit: e67572cd2204894179d89bd7b984072f19313b03
-- 
2.34.1


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 1/2] x86/fpu: Extend kernel_fpu_begin_mask() to initialize AMX state
  2024-04-30 21:25 [PATCH 0/2] x86/fpu: Extend kernel_fpu_begin_mask() for the In-Field Scan driver Chang S. Bae
@ 2024-04-30 21:25 ` Chang S. Bae
  2024-04-30 23:42   ` Ashok Raj
  2024-04-30 21:25 ` [PATCH 2/2] platform/x86/intel/ifs: Initialize AMX state for the scan test Chang S. Bae
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 30+ messages in thread
From: Chang S. Bae @ 2024-04-30 21:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, platform-driver-x86, tglx, mingo, bp, dave.hansen, hdegoede,
	ilpo.jarvinen, tony.luck, ashok.raj, jithu.joseph,
	chang.seok.bae

The In-Field Scan [1] test does not start its operation when the CPU is
unprepared. If the AMX state is uninitialized, the scan test will
immediately terminate. Therefore, a proper initialization of the AMX
state is necessary to run the test.

Although fpu_idle_fpregs() initializes the state, its usage should be
limited to specialized cases, primarily before entering the sleep state.
The restore_fpregs_from_fpstate() function offers a suitable mechanism
for initializing fpstate in general, which remains within the core code.

Extend kernel_fpu_begin_mask() to include AMX state initialization,
providing the in-field scan driver code access to the appropriate
initialization method while maintaining compliance with established FPU
API semantics.

[1] https://docs.kernel.org/arch/x86/ifs.html
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
---
The necessity for AMX initialization is clarified in the Intel Software
Development Manual as of March 2024, particularly in Section 18.2
RECOMMENDATIONS FOR SYSTEM SOFTWARE of Vol. 1.

Side note: restore_fpregs_from_fpstate() also sets the x87 state to a
fixed value. However, this only applies to AMD CPUs with the FXSAVE_LEAK
quirk.
---
 arch/x86/include/asm/fpu/api.h | 1 +
 arch/x86/kernel/fpu/core.c     | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/arch/x86/include/asm/fpu/api.h b/arch/x86/include/asm/fpu/api.h
index a2be3aefff9f..67887fc45c24 100644
--- a/arch/x86/include/asm/fpu/api.h
+++ b/arch/x86/include/asm/fpu/api.h
@@ -25,6 +25,7 @@
 /* Kernel FPU states to initialize in kernel_fpu_begin_mask() */
 #define KFPU_387	_BITUL(0)	/* 387 state will be initialized */
 #define KFPU_MXCSR	_BITUL(1)	/* MXCSR will be initialized */
+#define KFPU_AMX	_BITUL(2)	/* AMX will be initialized */
 
 extern void kernel_fpu_begin_mask(unsigned int kfpu_mask);
 extern void kernel_fpu_end(void);
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 520deb411a70..0c0235b4a901 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -440,6 +440,9 @@ void kernel_fpu_begin_mask(unsigned int kfpu_mask)
 
 	if (unlikely(kfpu_mask & KFPU_387) && boot_cpu_has(X86_FEATURE_FPU))
 		asm volatile ("fninit");
+
+	if (unlikely(kfpu_mask & KFPU_AMX) && boot_cpu_has(X86_FEATURE_AMX_TILE))
+		restore_fpregs_from_fpstate(&init_fpstate, XFEATURE_MASK_XTILE);
 }
 EXPORT_SYMBOL_GPL(kernel_fpu_begin_mask);
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 2/2] platform/x86/intel/ifs: Initialize AMX state for the scan test
  2024-04-30 21:25 [PATCH 0/2] x86/fpu: Extend kernel_fpu_begin_mask() for the In-Field Scan driver Chang S. Bae
  2024-04-30 21:25 ` [PATCH 1/2] x86/fpu: Extend kernel_fpu_begin_mask() to initialize AMX state Chang S. Bae
@ 2024-04-30 21:25 ` Chang S. Bae
  2024-05-01  0:06   ` Ashok Raj
                     ` (4 more replies)
  2024-05-02 20:52 ` [PATCH 0/2] x86/fpu: Extend kernel_fpu_begin_mask() for the In-Field Scan driver Chang S. Bae
                   ` (2 subsequent siblings)
  4 siblings, 5 replies; 30+ messages in thread
From: Chang S. Bae @ 2024-04-30 21:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, platform-driver-x86, tglx, mingo, bp, dave.hansen, hdegoede,
	ilpo.jarvinen, tony.luck, ashok.raj, jithu.joseph,
	chang.seok.bae

The scan test does not start when the AMX state remains active and is not
re-initialized. With the extension of kernel_fpu_begin_mask(), the driver
code can now initialize the state properly.

Introduce custom FPU handling wrappers to ensure compliant with the
established FPU API semantics, as kernel_fpu_begin() exclusively sets
legacy states. This follows the EFI case from commit b0dc553cfc9d
("x86/fpu: Make the EFI FPU calling convention explicit").

Then, use these wrappers to surround the MSR_ACTIVATE_SCAN write to
minimize the critical section. To prevent unnecessary delays, invoke
ifs_fpu_begin() before entering the rendezvous loop.

Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Reviewed-by: Jithu Joseph <jithu.joseph@intel.com>
Tested-by: Jithu Joseph <jithu.joseph@intel.com>
---
 drivers/platform/x86/intel/ifs/ifs.h     | 14 ++++++++++++++
 drivers/platform/x86/intel/ifs/runtest.c |  6 ++++++
 2 files changed, 20 insertions(+)

diff --git a/drivers/platform/x86/intel/ifs/ifs.h b/drivers/platform/x86/intel/ifs/ifs.h
index 56b9f3e3cf76..71d8b50854b2 100644
--- a/drivers/platform/x86/intel/ifs/ifs.h
+++ b/drivers/platform/x86/intel/ifs/ifs.h
@@ -325,4 +325,18 @@ int do_core_test(int cpu, struct device *dev);
 extern struct attribute *plat_ifs_attrs[];
 extern struct attribute *plat_ifs_array_attrs[];
 
+static inline void ifs_fpu_begin(void)
+{
+	/*
+	 * The AMX state must be initialized prior to executing In-Field
+	 * Scan tests, according to Intel SDM.
+	 */
+	kernel_fpu_begin_mask(KFPU_AMX);
+}
+
+static inline void ifs_fpu_end(void)
+{
+	kernel_fpu_end();
+}
+
 #endif
diff --git a/drivers/platform/x86/intel/ifs/runtest.c b/drivers/platform/x86/intel/ifs/runtest.c
index 95b4b71fab53..a35eac7c0b44 100644
--- a/drivers/platform/x86/intel/ifs/runtest.c
+++ b/drivers/platform/x86/intel/ifs/runtest.c
@@ -188,6 +188,9 @@ static int doscan(void *data)
 	/* Only the first logical CPU on a core reports result */
 	first = cpumask_first(cpu_smt_mask(cpu));
 
+	/* Prepare FPU state before entering the rendezvous loop*/
+	ifs_fpu_begin();
+
 	wait_for_sibling_cpu(&scan_cpus_in, NSEC_PER_SEC);
 
 	/*
@@ -199,6 +202,9 @@ static int doscan(void *data)
 	 * are processed in a single pass) before it retires.
 	 */
 	wrmsrl(MSR_ACTIVATE_SCAN, params->activate->data);
+
+	ifs_fpu_end();
+
 	rdmsrl(MSR_SCAN_STATUS, status.data);
 
 	trace_ifs_status(ifsd->cur_batch, start, stop, status.data);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2] x86/fpu: Extend kernel_fpu_begin_mask() to initialize AMX state
  2024-04-30 21:25 ` [PATCH 1/2] x86/fpu: Extend kernel_fpu_begin_mask() to initialize AMX state Chang S. Bae
@ 2024-04-30 23:42   ` Ashok Raj
  0 siblings, 0 replies; 30+ messages in thread
From: Ashok Raj @ 2024-04-30 23:42 UTC (permalink / raw)
  To: Chang S. Bae
  Cc: linux-kernel, x86, platform-driver-x86, tglx, mingo, bp,
	dave.hansen, hdegoede, ilpo.jarvinen, tony.luck, jithu.joseph,
	Ashok Raj

On Tue, Apr 30, 2024 at 02:25:07PM -0700, Bae, Chang Seok wrote:
> The In-Field Scan [1] test does not start its operation when the CPU is
> unprepared. If the AMX state is uninitialized, the scan test will
> immediately terminate. Therefore, a proper initialization of the AMX
> state is necessary to run the test.

Instead of using new terminology like "unprepared", maybe use the
term "non-initialized state" as in the SDM for clarity? 

	Maintaining AMX state in a non-initialized state will prevent In-field
	Scan test [1] to abort without making progress.

> 
> Although fpu_idle_fpregs() initializes the state, its usage should be
> limited to specialized cases, primarily before entering the sleep state.
> The restore_fpregs_from_fpstate() function offers a suitable mechanism
> for initializing fpstate in general, which remains within the core code.
> 
> Extend kernel_fpu_begin_mask() to include AMX state initialization,
> providing the in-field scan driver code access to the appropriate
> initialization method while maintaining compliance with established FPU
> API semantics.

Maybe simplify the above paragraph with

Extend kernel_fpu_begin_mask() to allow In-field scan driver to initialize
AMX state before running tests.

> 
> [1] https://docs.kernel.org/arch/x86/ifs.html
> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
> ---
> The necessity for AMX initialization is clarified in the Intel Software
> Development Manual as of March 2024, particularly in Section 18.2
> RECOMMENDATIONS FOR SYSTEM SOFTWARE of Vol. 1.
> 
> Side note: restore_fpregs_from_fpstate() also sets the x87 state to a
> fixed value. However, this only applies to AMD CPUs with the FXSAVE_LEAK
> quirk.
> ---
>  arch/x86/include/asm/fpu/api.h | 1 +
>  arch/x86/kernel/fpu/core.c     | 3 +++
>  2 files changed, 4 insertions(+)
> 
> diff --git a/arch/x86/include/asm/fpu/api.h b/arch/x86/include/asm/fpu/api.h
> index a2be3aefff9f..67887fc45c24 100644
> --- a/arch/x86/include/asm/fpu/api.h
> +++ b/arch/x86/include/asm/fpu/api.h
> @@ -25,6 +25,7 @@
>  /* Kernel FPU states to initialize in kernel_fpu_begin_mask() */
>  #define KFPU_387	_BITUL(0)	/* 387 state will be initialized */
>  #define KFPU_MXCSR	_BITUL(1)	/* MXCSR will be initialized */
> +#define KFPU_AMX	_BITUL(2)	/* AMX will be initialized */
>  
>  extern void kernel_fpu_begin_mask(unsigned int kfpu_mask);
>  extern void kernel_fpu_end(void);
> diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
> index 520deb411a70..0c0235b4a901 100644
> --- a/arch/x86/kernel/fpu/core.c
> +++ b/arch/x86/kernel/fpu/core.c
> @@ -440,6 +440,9 @@ void kernel_fpu_begin_mask(unsigned int kfpu_mask)
>  
>  	if (unlikely(kfpu_mask & KFPU_387) && boot_cpu_has(X86_FEATURE_FPU))
>  		asm volatile ("fninit");
> +
> +	if (unlikely(kfpu_mask & KFPU_AMX) && boot_cpu_has(X86_FEATURE_AMX_TILE))
> +		restore_fpregs_from_fpstate(&init_fpstate, XFEATURE_MASK_XTILE);
>  }
>  EXPORT_SYMBOL_GPL(kernel_fpu_begin_mask);
>  
> -- 
> 2.34.1
> 

-- 
Cheers,
Ashok

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 2/2] platform/x86/intel/ifs: Initialize AMX state for the scan test
  2024-04-30 21:25 ` [PATCH 2/2] platform/x86/intel/ifs: Initialize AMX state for the scan test Chang S. Bae
@ 2024-05-01  0:06   ` Ashok Raj
  2024-05-01  8:58   ` Hans de Goede
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 30+ messages in thread
From: Ashok Raj @ 2024-05-01  0:06 UTC (permalink / raw)
  To: Chang S. Bae
  Cc: linux-kernel, x86, platform-driver-x86, tglx, mingo, bp,
	dave.hansen, hdegoede, ilpo.jarvinen, tony.luck, jithu.joseph,
	Ashok Raj

On Tue, Apr 30, 2024 at 02:25:08PM -0700, Bae, Chang Seok wrote:
> The scan test does not start when the AMX state remains active and is not
> re-initialized. With the extension of kernel_fpu_begin_mask(), the driver
> code can now initialize the state properly.

Avoid "re-initialized" and maybe use the same SDM language here as well?

	Infield Scan aborts if AMX state is not in initialized state. Use
	the kernel_fpu_begin_mask(KFPU_AMX) to ensure AMX state is initialized.

> 
> Introduce custom FPU handling wrappers to ensure compliant with the
> established FPU API semantics, as kernel_fpu_begin() exclusively sets
> legacy states. This follows the EFI case from commit b0dc553cfc9d
> ("x86/fpu: Make the EFI FPU calling convention explicit").

The EFI and commit mention is worthy of prior use, but the first line
isn't reading well... Maybe

	Introduce custom FPU handling wrappers to ensure compliance with the 
	established FPU API semantics.  This change follows the EFI case from 
	commit b0dc553cfc9d ("x86/fpu: Make the EFI FPU calling convention explicit").

I dropped the note about legacy states, I didn't quite know how to word
that.

> 
> Then, use these wrappers to surround the MSR_ACTIVATE_SCAN write to
> minimize the critical section. To prevent unnecessary delays, invoke
> ifs_fpu_begin() before entering the rendezvous loop.
> 
> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
> Reviewed-by: Jithu Joseph <jithu.joseph@intel.com>
> Tested-by: Jithu Joseph <jithu.joseph@intel.com>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 2/2] platform/x86/intel/ifs: Initialize AMX state for the scan test
  2024-04-30 21:25 ` [PATCH 2/2] platform/x86/intel/ifs: Initialize AMX state for the scan test Chang S. Bae
  2024-05-01  0:06   ` Ashok Raj
@ 2024-05-01  8:58   ` Hans de Goede
  2024-05-08  0:22     ` Chang S. Bae
  2024-05-02 11:00   ` Ilpo Järvinen
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 30+ messages in thread
From: Hans de Goede @ 2024-05-01  8:58 UTC (permalink / raw)
  To: Chang S. Bae, linux-kernel
  Cc: x86, platform-driver-x86, tglx, mingo, bp, dave.hansen,
	ilpo.jarvinen, tony.luck, ashok.raj, jithu.joseph

Hi,

On 4/30/24 11:25 PM, Chang S. Bae wrote:
> The scan test does not start when the AMX state remains active and is not
> re-initialized. With the extension of kernel_fpu_begin_mask(), the driver
> code can now initialize the state properly.
> 
> Introduce custom FPU handling wrappers to ensure compliant with the
> established FPU API semantics, as kernel_fpu_begin() exclusively sets
> legacy states. This follows the EFI case from commit b0dc553cfc9d
> ("x86/fpu: Make the EFI FPU calling convention explicit").
> 
> Then, use these wrappers to surround the MSR_ACTIVATE_SCAN write to
> minimize the critical section. To prevent unnecessary delays, invoke
> ifs_fpu_begin() before entering the rendezvous loop.
> 
> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
> Reviewed-by: Jithu Joseph <jithu.joseph@intel.com>
> Tested-by: Jithu Joseph <jithu.joseph@intel.com>

Thanks, the patch looks good to me.

I believe this is best merged through the tip/x86 tree together
with patch 1/2 :

Acked-by: Hans de Goede <hdegoede@redhat.com>

I have checked and this should not cause any conflicts with
the ifs changes current in platform-drivers-x86/for-next.

Regards,

Hans




> ---
>  drivers/platform/x86/intel/ifs/ifs.h     | 14 ++++++++++++++
>  drivers/platform/x86/intel/ifs/runtest.c |  6 ++++++
>  2 files changed, 20 insertions(+)
> 
> diff --git a/drivers/platform/x86/intel/ifs/ifs.h b/drivers/platform/x86/intel/ifs/ifs.h
> index 56b9f3e3cf76..71d8b50854b2 100644
> --- a/drivers/platform/x86/intel/ifs/ifs.h
> +++ b/drivers/platform/x86/intel/ifs/ifs.h
> @@ -325,4 +325,18 @@ int do_core_test(int cpu, struct device *dev);
>  extern struct attribute *plat_ifs_attrs[];
>  extern struct attribute *plat_ifs_array_attrs[];
>  
> +static inline void ifs_fpu_begin(void)
> +{
> +	/*
> +	 * The AMX state must be initialized prior to executing In-Field
> +	 * Scan tests, according to Intel SDM.
> +	 */
> +	kernel_fpu_begin_mask(KFPU_AMX);
> +}
> +
> +static inline void ifs_fpu_end(void)
> +{
> +	kernel_fpu_end();
> +}
> +
>  #endif
> diff --git a/drivers/platform/x86/intel/ifs/runtest.c b/drivers/platform/x86/intel/ifs/runtest.c
> index 95b4b71fab53..a35eac7c0b44 100644
> --- a/drivers/platform/x86/intel/ifs/runtest.c
> +++ b/drivers/platform/x86/intel/ifs/runtest.c
> @@ -188,6 +188,9 @@ static int doscan(void *data)
>  	/* Only the first logical CPU on a core reports result */
>  	first = cpumask_first(cpu_smt_mask(cpu));
>  
> +	/* Prepare FPU state before entering the rendezvous loop*/
> +	ifs_fpu_begin();
> +
>  	wait_for_sibling_cpu(&scan_cpus_in, NSEC_PER_SEC);
>  
>  	/*
> @@ -199,6 +202,9 @@ static int doscan(void *data)
>  	 * are processed in a single pass) before it retires.
>  	 */
>  	wrmsrl(MSR_ACTIVATE_SCAN, params->activate->data);
> +
> +	ifs_fpu_end();
> +
>  	rdmsrl(MSR_SCAN_STATUS, status.data);
>  
>  	trace_ifs_status(ifsd->cur_batch, start, stop, status.data);


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 2/2] platform/x86/intel/ifs: Initialize AMX state for the scan test
  2024-04-30 21:25 ` [PATCH 2/2] platform/x86/intel/ifs: Initialize AMX state for the scan test Chang S. Bae
  2024-05-01  0:06   ` Ashok Raj
  2024-05-01  8:58   ` Hans de Goede
@ 2024-05-02 11:00   ` Ilpo Järvinen
  2024-05-02 22:19     ` Chang S. Bae
  2024-05-05  6:39   ` kernel test robot
  2024-05-09  0:45   ` Kuppuswamy Sathyanarayanan
  4 siblings, 1 reply; 30+ messages in thread
From: Ilpo Järvinen @ 2024-05-02 11:00 UTC (permalink / raw)
  To: Chang S. Bae
  Cc: LKML, x86, platform-driver-x86, tglx, mingo, bp, dave.hansen,
	Hans de Goede, tony.luck, ashok.raj, jithu.joseph

On Tue, 30 Apr 2024, Chang S. Bae wrote:

> The scan test does not start when the AMX state remains active and is not
> re-initialized. With the extension of kernel_fpu_begin_mask(), the driver
> code can now initialize the state properly.
> 
> Introduce custom FPU handling wrappers to ensure compliant with the
> established FPU API semantics, as kernel_fpu_begin() exclusively sets
> legacy states. This follows the EFI case from commit b0dc553cfc9d
> ("x86/fpu: Make the EFI FPU calling convention explicit").
> 
> Then, use these wrappers to surround the MSR_ACTIVATE_SCAN write to
> minimize the critical section. To prevent unnecessary delays, invoke
> ifs_fpu_begin() before entering the rendezvous loop.
> 
> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
> Reviewed-by: Jithu Joseph <jithu.joseph@intel.com>
> Tested-by: Jithu Joseph <jithu.joseph@intel.com>
> ---
>  drivers/platform/x86/intel/ifs/ifs.h     | 14 ++++++++++++++
>  drivers/platform/x86/intel/ifs/runtest.c |  6 ++++++
>  2 files changed, 20 insertions(+)
> 
> diff --git a/drivers/platform/x86/intel/ifs/ifs.h b/drivers/platform/x86/intel/ifs/ifs.h
> index 56b9f3e3cf76..71d8b50854b2 100644
> --- a/drivers/platform/x86/intel/ifs/ifs.h
> +++ b/drivers/platform/x86/intel/ifs/ifs.h
> @@ -325,4 +325,18 @@ int do_core_test(int cpu, struct device *dev);
>  extern struct attribute *plat_ifs_attrs[];
>  extern struct attribute *plat_ifs_array_attrs[];
>  
> +static inline void ifs_fpu_begin(void)
> +{
> +	/*
> +	 * The AMX state must be initialized prior to executing In-Field
> +	 * Scan tests, according to Intel SDM.
> +	 */
> +	kernel_fpu_begin_mask(KFPU_AMX);
> +}
> +
> +static inline void ifs_fpu_end(void)
> +{
> +	kernel_fpu_end();
> +}
> +
>  #endif
> diff --git a/drivers/platform/x86/intel/ifs/runtest.c b/drivers/platform/x86/intel/ifs/runtest.c
> index 95b4b71fab53..a35eac7c0b44 100644
> --- a/drivers/platform/x86/intel/ifs/runtest.c
> +++ b/drivers/platform/x86/intel/ifs/runtest.c
> @@ -188,6 +188,9 @@ static int doscan(void *data)
>  	/* Only the first logical CPU on a core reports result */
>  	first = cpumask_first(cpu_smt_mask(cpu));
>  
> +	/* Prepare FPU state before entering the rendezvous loop*/

Missing space

-- 
 i.


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/2] x86/fpu: Extend kernel_fpu_begin_mask() for the In-Field Scan driver
  2024-04-30 21:25 [PATCH 0/2] x86/fpu: Extend kernel_fpu_begin_mask() for the In-Field Scan driver Chang S. Bae
  2024-04-30 21:25 ` [PATCH 1/2] x86/fpu: Extend kernel_fpu_begin_mask() to initialize AMX state Chang S. Bae
  2024-04-30 21:25 ` [PATCH 2/2] platform/x86/intel/ifs: Initialize AMX state for the scan test Chang S. Bae
@ 2024-05-02 20:52 ` Chang S. Bae
  2024-05-02 21:35   ` Chang S. Bae
  2024-05-02 20:59 ` Dave Hansen
  2024-05-07 23:53 ` [PATCH v2 " Chang S. Bae
  4 siblings, 1 reply; 30+ messages in thread
From: Chang S. Bae @ 2024-05-02 20:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, platform-driver-x86, tglx, mingo, bp, dave.hansen, hdegoede,
	ilpo.jarvinen, tony.luck, ashok.raj, jithu.joseph

On 4/30/2024 2:25 PM, Chang S. Bae wrote:
> The recent update [1] in the SDM highlights the requirement of
> initializing the AMX state for executing the scan test:
>      "... maintaining AMX state in a non-initialized state ... will
>       prevent the execution of In-Field Scan tests."
> which is one of CPU state conditions required for the test's execution.

This brief mention may prompt questions about why the hardware must 
refuse to run the test under these conditions. Because this is part of 
its internal, we'd go back to folks who wrote this test and will grab 
their write-up to provide the logic behind this requirement.

Thanks,
Chang


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/2] x86/fpu: Extend kernel_fpu_begin_mask() for the In-Field Scan driver
  2024-04-30 21:25 [PATCH 0/2] x86/fpu: Extend kernel_fpu_begin_mask() for the In-Field Scan driver Chang S. Bae
                   ` (2 preceding siblings ...)
  2024-05-02 20:52 ` [PATCH 0/2] x86/fpu: Extend kernel_fpu_begin_mask() for the In-Field Scan driver Chang S. Bae
@ 2024-05-02 20:59 ` Dave Hansen
  2024-05-07 23:53 ` [PATCH v2 " Chang S. Bae
  4 siblings, 0 replies; 30+ messages in thread
From: Dave Hansen @ 2024-05-02 20:59 UTC (permalink / raw)
  To: Chang S. Bae, linux-kernel
  Cc: x86, platform-driver-x86, tglx, mingo, bp, dave.hansen, hdegoede,
	ilpo.jarvinen, tony.luck, ashok.raj, jithu.joseph

On 4/30/24 14:25, Chang S. Bae wrote:
> The recent update [1] in the SDM highlights the requirement of
> initializing the AMX state for executing the scan test:
>     "... maintaining AMX state in a non-initialized state ... will
>      prevent the execution of In-Field Scan tests."
> which is one of CPU state conditions required for the test's execution.

This ended up just being phrased weirdly.  It's a lot more compact to
just say:

	AMX must be in its init state for In-Field Scan tests to run.

> In situations where AMX workloads are running, the switched-away active
> user AMX state remains due to the optimization to reduce the state
> switching cost. A user state reload is fully completed right before
> returning to userspace. Consequently, if the switched-in kernel task is
> executing the scan test, this non-initialized AMX state causes the test
> to be unable to start.

FPU state in general (and AMX state in particular) is large and
expensive to context switch so the kernel tries to leave FPU state alone
 even while running kernel tasks.  But, this behavior obviously
conflicts with the (new) IFS need for AMX must be in its init state.

Right?

...
> [1] Intel Software Development Manual as of March 2024, Section 18.2
>     RECOMMENDATIONS FOR SYSTEM SOFTWARE of Vol. 1.
>     https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html

Rather than, "we're just following the spec", I think there can be a
better explanation here.

These in-field scan tests (IFS) poke the hardware in unique ways. It
ends up that IFS and AMX could attempt to use the same hardware
resources at the same time and step on each other.  While it would be
possible to add additional resources to the CPU to allow simultaneous
AMX and IFS, the hardware to do this would be relatively expensive.  It
seems pretty reasonable for software to help out here.

The other argument that could be made is that an admin could isolate the
CPUs on which they wanted to run an IFS test.  They could use cpusets,
or even the task binding API to try and evict AMX workloads from these
CPUs.  But, the promise of IFS is that it can be run without disturbing
workloads _too_ much.  Basically anything an admin would do is probably
too onerous and high-impact.

So this mechanism provides two things: One, it makes the hardware
simpler and two, it takes the admin out of the picture.  Thing just work.



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/2] x86/fpu: Extend kernel_fpu_begin_mask() for the In-Field Scan driver
  2024-05-02 20:52 ` [PATCH 0/2] x86/fpu: Extend kernel_fpu_begin_mask() for the In-Field Scan driver Chang S. Bae
@ 2024-05-02 21:35   ` Chang S. Bae
  0 siblings, 0 replies; 30+ messages in thread
From: Chang S. Bae @ 2024-05-02 21:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, platform-driver-x86, tglx, mingo, bp, dave.hansen, hdegoede,
	ilpo.jarvinen, tony.luck, ashok.raj, jithu.joseph

On 5/2/2024 1:52 PM, Chang S. Bae wrote:
> 
> This brief mention may prompt questions about why the hardware must 
> refuse to run the test under these conditions. Because this is part of 
> its internal, we'd go back to folks who wrote this test and will grab 
> their write-up to provide the logic behind this requirement.

Okay, I got the write-up sooner than expected:

In-Field Scan (IFS) is a destructive test meaning that it will overwrite 
the existing state to achieve the goal of testing the logic. IFS must 
ensure that it is able to return the CPU back to the OS in the same 
architectural state as it was prior to IFS start. To do this 
architectural state (register state visible to the OS) is saved in a 
cache [*]. The cache has a limited size. When AMX is in use, the state 
of the AMX logic that needs to be saved is too large to be accommodated 
in the cache. Therefore OS must clear the AMX logic from being in use 
prior to IFS start.

[*] This cache is software-inaccessible space.

Thanks,
Chang

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 2/2] platform/x86/intel/ifs: Initialize AMX state for the scan test
  2024-05-02 11:00   ` Ilpo Järvinen
@ 2024-05-02 22:19     ` Chang S. Bae
  0 siblings, 0 replies; 30+ messages in thread
From: Chang S. Bae @ 2024-05-02 22:19 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: LKML, x86, platform-driver-x86, tglx, mingo, bp, dave.hansen,
	Hans de Goede, tony.luck, ashok.raj, jithu.joseph

On 5/2/2024 4:00 AM, Ilpo Järvinen wrote:
> On Tue, 30 Apr 2024, Chang S. Bae wrote:
> 
>> +	/* Prepare FPU state before entering the rendezvous loop*/
> 
> Missing space

Yeah, right. Thanks for spotting this.


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 2/2] platform/x86/intel/ifs: Initialize AMX state for the scan test
  2024-04-30 21:25 ` [PATCH 2/2] platform/x86/intel/ifs: Initialize AMX state for the scan test Chang S. Bae
                     ` (2 preceding siblings ...)
  2024-05-02 11:00   ` Ilpo Järvinen
@ 2024-05-05  6:39   ` kernel test robot
  2024-05-09  0:45   ` Kuppuswamy Sathyanarayanan
  4 siblings, 0 replies; 30+ messages in thread
From: kernel test robot @ 2024-05-05  6:39 UTC (permalink / raw)
  To: Chang S. Bae; +Cc: llvm, oe-kbuild-all

Hi Chang,

kernel test robot noticed the following build errors:

[auto build test ERROR on e67572cd2204894179d89bd7b984072f19313b03]

url:    https://github.com/intel-lab-lkp/linux/commits/Chang-S-Bae/x86-fpu-Extend-kernel_fpu_begin_mask-to-initialize-AMX-state/20240501-054234
base:   e67572cd2204894179d89bd7b984072f19313b03
patch link:    https://lore.kernel.org/r/20240430212508.105117-3-chang.seok.bae%40intel.com
patch subject: [PATCH 2/2] platform/x86/intel/ifs: Initialize AMX state for the scan test
config: x86_64-buildonly-randconfig-004-20240505 (https://download.01.org/0day-ci/archive/20240505/202405051417.5CWuLrhT-lkp@intel.com/config)
compiler: clang version 18.1.4 (https://github.com/llvm/llvm-project e6c3289804a67ea0bb6a86fadbe454dd93b8d855)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240505/202405051417.5CWuLrhT-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202405051417.5CWuLrhT-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from drivers/platform/x86/intel/ifs/load.c:9:
>> drivers/platform/x86/intel/ifs/ifs.h:334:2: error: call to undeclared function 'kernel_fpu_begin_mask'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
     334 |         kernel_fpu_begin_mask(KFPU_AMX);
         |         ^
>> drivers/platform/x86/intel/ifs/ifs.h:334:24: error: use of undeclared identifier 'KFPU_AMX'
     334 |         kernel_fpu_begin_mask(KFPU_AMX);
         |                               ^
>> drivers/platform/x86/intel/ifs/ifs.h:339:2: error: call to undeclared function 'kernel_fpu_end'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
     339 |         kernel_fpu_end();
         |         ^
   3 errors generated.


vim +/kernel_fpu_begin_mask +334 drivers/platform/x86/intel/ifs/ifs.h

   327	
   328	static inline void ifs_fpu_begin(void)
   329	{
   330		/*
   331		 * The AMX state must be initialized prior to executing In-Field
   332		 * Scan tests, according to Intel SDM.
   333		 */
 > 334		kernel_fpu_begin_mask(KFPU_AMX);
   335	}
   336	
   337	static inline void ifs_fpu_end(void)
   338	{
 > 339		kernel_fpu_end();
   340	}
   341	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 0/2] x86/fpu: Extend kernel_fpu_begin_mask() for the In-Field Scan driver
  2024-04-30 21:25 [PATCH 0/2] x86/fpu: Extend kernel_fpu_begin_mask() for the In-Field Scan driver Chang S. Bae
                   ` (3 preceding siblings ...)
  2024-05-02 20:59 ` Dave Hansen
@ 2024-05-07 23:53 ` Chang S. Bae
  2024-05-07 23:53   ` [PATCH v2 1/2] x86/fpu: Extend kernel_fpu_begin_mask() to initialize AMX state Chang S. Bae
                     ` (2 more replies)
  4 siblings, 3 replies; 30+ messages in thread
From: Chang S. Bae @ 2024-05-07 23:53 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, platform-driver-x86, tglx, mingo, bp, dave.hansen, hdegoede,
	ilpo.jarvinen, tony.luck, ashok.raj, jithu.joseph,
	chang.seok.bae

Following feedback from the previous posting [1], this revision comes
with the changelog updates for more clarity:

* The recent change in the SDM alone does not fully explain the
  underlying reasons. Provide a clearer explanation rather than simply
  echoing the SDM text.

* While it is feasible for administrators to isolate CPUs running the
  In-Field Scan tests from those running AMX workloads, this approach is
  considerably disruptive and conflicts with the goal of its live
  testing. Add this point to the changelog as well.

Thanks to Dave for highlighting these aspects [2]. Additionally, it is
worth noting that the IFS Technical Paper [3] was recently published,
which may offer further context on the IFS testing.

Thanks,
Chang

[1] V1: https://lore.kernel.org/lkml/20240430212508.105117-1-chang.seok.bae@intel.com/
[2] Feedback: https://lore.kernel.org/lkml/a41d7012-2eea-436e-9f7e-6a7702f7e2c2@intel.com/
[3] IFS Paper: https://www.intel.com/content/www/us/en/content-details/822279/finding-faulty-components-in-a-live-fleet-environment.html

Chang S. Bae (2):
  x86/fpu: Extend kernel_fpu_begin_mask() to initialize AMX state
  platform/x86/intel/ifs: Initialize AMX state for the scan test

 arch/x86/include/asm/fpu/api.h           |  1 +
 arch/x86/kernel/fpu/core.c               |  3 +++
 drivers/platform/x86/intel/ifs/ifs.h     | 15 +++++++++++++++
 drivers/platform/x86/intel/ifs/runtest.c |  6 ++++++
 4 files changed, 25 insertions(+)


base-commit: 7598293ab37c92025086de4b0ecd9474013a725f
-- 
2.34.1


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 1/2] x86/fpu: Extend kernel_fpu_begin_mask() to initialize AMX state
  2024-05-07 23:53 ` [PATCH v2 " Chang S. Bae
@ 2024-05-07 23:53   ` Chang S. Bae
  2024-05-08 13:00     ` Thomas Gleixner
  2024-05-08 14:40     ` Dave Hansen
  2024-05-07 23:53   ` [PATCH v2 2/2] platform/x86/intel/ifs: Initialize AMX state for the scan test Chang S. Bae
  2024-05-08  8:28   ` [PATCH v2 0/2] x86/fpu: Extend kernel_fpu_begin_mask() for the In-Field Scan driver Ilpo Järvinen
  2 siblings, 2 replies; 30+ messages in thread
From: Chang S. Bae @ 2024-05-07 23:53 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, platform-driver-x86, tglx, mingo, bp, dave.hansen, hdegoede,
	ilpo.jarvinen, tony.luck, ashok.raj, jithu.joseph,
	chang.seok.bae

The In-Field Scan (IFS) test [1] is a destructive process, overwriting
the existing state to test the logic on the fly. As part of this test
process, the architectural state is saved before the test begins and
then restored upon completion.

However, due to resource constraints in storage, AMX state is excluded
from the scope of state recovery. Consequently, AMX state must be in its
initialized state for the IFS test to run.

When AMX workloads are running, an active user AMX state remains even
after a context switch, optimizing to reduce the state reload cost. In
such cases, the test cannot proceed if it is scheduled.

System administrators may attempt to mitigate this issue, by arranging
AMX workloads not to run on CPUs selected for the tests. However, this
approach is disruptive for managing large-scaled systems, diminishing the
benefit of the live testing.

The kernel can help by properly initializing the state before the test.
This initialization impacts the performance to some degree. But, this
approach is considerably cheaper than adding hardware resources and
simpler than a userspace approach.

While fpu_idle_fpregs() can initialize the AMX state, its usage should be
limited to specialized cases, primarily before entering the sleep state.
The restore_fpregs_from_fpstate() function offers a suitable mechanism
for initializing fpstate in general, which remains within the core code.

Extend kernel_fpu_begin_mask() to allow the IFS driver to initialize AMX
state through restore_fpregs_from_fpstate().

[1]: https://docs.kernel.org/arch/x86/ifs.html
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
---
V1 -> V2: Revise the changelog (Dave Hansen and Ashok Raj).

The recently published IFS documentation [2] elaborates its purpose and
the requirements of the context restoration after the scan test.

Additionally, the necessity for AMX initialization is emphasized in the
Intel Software Development Manual as of March 2024, in Section 18.2 of
Vol.1.

Side note: restore_fpregs_from_fpstate() also sets the x87 state to a
fixed value. However, this only applies to AMD CPUs with the FXSAVE_LEAK
quirk.

[2] IFS Technical Paper: Finding Faulty Components in a Live Fleet
    Environment
    https://www.intel.com/content/www/us/en/content-details/822279/finding-faulty-components-in-a-live-fleet-environment.html
---
 arch/x86/include/asm/fpu/api.h | 1 +
 arch/x86/kernel/fpu/core.c     | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/arch/x86/include/asm/fpu/api.h b/arch/x86/include/asm/fpu/api.h
index a2be3aefff9f..67887fc45c24 100644
--- a/arch/x86/include/asm/fpu/api.h
+++ b/arch/x86/include/asm/fpu/api.h
@@ -25,6 +25,7 @@
 /* Kernel FPU states to initialize in kernel_fpu_begin_mask() */
 #define KFPU_387	_BITUL(0)	/* 387 state will be initialized */
 #define KFPU_MXCSR	_BITUL(1)	/* MXCSR will be initialized */
+#define KFPU_AMX	_BITUL(2)	/* AMX will be initialized */
 
 extern void kernel_fpu_begin_mask(unsigned int kfpu_mask);
 extern void kernel_fpu_end(void);
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 1209c7aebb21..04cc6f14ca42 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -440,6 +440,9 @@ void kernel_fpu_begin_mask(unsigned int kfpu_mask)
 
 	if (unlikely(kfpu_mask & KFPU_387) && boot_cpu_has(X86_FEATURE_FPU))
 		asm volatile ("fninit");
+
+	if (unlikely(kfpu_mask & KFPU_AMX) && boot_cpu_has(X86_FEATURE_AMX_TILE))
+		restore_fpregs_from_fpstate(&init_fpstate, XFEATURE_MASK_XTILE);
 }
 EXPORT_SYMBOL_GPL(kernel_fpu_begin_mask);
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 2/2] platform/x86/intel/ifs: Initialize AMX state for the scan test
  2024-05-07 23:53 ` [PATCH v2 " Chang S. Bae
  2024-05-07 23:53   ` [PATCH v2 1/2] x86/fpu: Extend kernel_fpu_begin_mask() to initialize AMX state Chang S. Bae
@ 2024-05-07 23:53   ` Chang S. Bae
  2024-05-08  0:19     ` Ashok Raj
  2024-05-08  8:28   ` [PATCH v2 0/2] x86/fpu: Extend kernel_fpu_begin_mask() for the In-Field Scan driver Ilpo Järvinen
  2 siblings, 1 reply; 30+ messages in thread
From: Chang S. Bae @ 2024-05-07 23:53 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, platform-driver-x86, tglx, mingo, bp, dave.hansen, hdegoede,
	ilpo.jarvinen, tony.luck, ashok.raj, jithu.joseph,
	chang.seok.bae

In-Field Scan aborts if AMX state is not in initialized state. Use
kernel_fpu_begin_mask(KFPU_AMX) to ensure AMX state is initialized.

Introduce custom FPU handling wrappers to ensure compliance with the
established FPU API semantics. This change follows the EFI case from
commit b0dc553cfc9d ("x86/fpu: Make the EFI FPU calling convention
explicit").

Then, use these wrappers to surround the MSR_ACTIVATE_SCAN write to
minimize the critical section. To prevent unnecessary delays, invoke
ifs_fpu_begin() before entering the rendezvous loop.

Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Reviewed-by: Jithu Joseph <jithu.joseph@intel.com>
Tested-by: Jithu Joseph <jithu.joseph@intel.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
---
V1 -> V2: Massage the changelog (Ashok), add a space to the code comment
          (Ilpo), and include the header file explicitly (0-day).
---
 drivers/platform/x86/intel/ifs/ifs.h     | 15 +++++++++++++++
 drivers/platform/x86/intel/ifs/runtest.c |  6 ++++++
 2 files changed, 21 insertions(+)

diff --git a/drivers/platform/x86/intel/ifs/ifs.h b/drivers/platform/x86/intel/ifs/ifs.h
index 56b9f3e3cf76..a2ec0dab809b 100644
--- a/drivers/platform/x86/intel/ifs/ifs.h
+++ b/drivers/platform/x86/intel/ifs/ifs.h
@@ -129,6 +129,7 @@
  */
 #include <linux/device.h>
 #include <linux/miscdevice.h>
+#include <asm/fpu/api.h>
 
 #define MSR_ARRAY_BIST				0x00000105
 #define MSR_COPY_SCAN_HASHES			0x000002c2
@@ -325,4 +326,18 @@ int do_core_test(int cpu, struct device *dev);
 extern struct attribute *plat_ifs_attrs[];
 extern struct attribute *plat_ifs_array_attrs[];
 
+static inline void ifs_fpu_begin(void)
+{
+	/*
+	 * The AMX state must be initialized prior to executing In-Field
+	 * Scan tests, according to Intel SDM.
+	 */
+	kernel_fpu_begin_mask(KFPU_AMX);
+}
+
+static inline void ifs_fpu_end(void)
+{
+	kernel_fpu_end();
+}
+
 #endif
diff --git a/drivers/platform/x86/intel/ifs/runtest.c b/drivers/platform/x86/intel/ifs/runtest.c
index 95b4b71fab53..ca2b496a6b01 100644
--- a/drivers/platform/x86/intel/ifs/runtest.c
+++ b/drivers/platform/x86/intel/ifs/runtest.c
@@ -188,6 +188,9 @@ static int doscan(void *data)
 	/* Only the first logical CPU on a core reports result */
 	first = cpumask_first(cpu_smt_mask(cpu));
 
+	/* Prepare FPU state before entering the rendezvous loop */
+	ifs_fpu_begin();
+
 	wait_for_sibling_cpu(&scan_cpus_in, NSEC_PER_SEC);
 
 	/*
@@ -199,6 +202,9 @@ static int doscan(void *data)
 	 * are processed in a single pass) before it retires.
 	 */
 	wrmsrl(MSR_ACTIVATE_SCAN, params->activate->data);
+
+	ifs_fpu_end();
+
 	rdmsrl(MSR_SCAN_STATUS, status.data);
 
 	trace_ifs_status(ifsd->cur_batch, start, stop, status.data);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 2/2] platform/x86/intel/ifs: Initialize AMX state for the scan test
  2024-05-07 23:53   ` [PATCH v2 2/2] platform/x86/intel/ifs: Initialize AMX state for the scan test Chang S. Bae
@ 2024-05-08  0:19     ` Ashok Raj
  2024-05-08  0:21       ` Chang S. Bae
  0 siblings, 1 reply; 30+ messages in thread
From: Ashok Raj @ 2024-05-08  0:19 UTC (permalink / raw)
  To: Chang S. Bae
  Cc: linux-kernel, x86, platform-driver-x86, tglx, mingo, bp,
	dave.hansen, hdegoede, ilpo.jarvinen, tony.luck, jithu.joseph,
	Ashok Raj

On Tue, May 07, 2024 at 04:53:44PM -0700, Bae, Chang Seok wrote:
> In-Field Scan aborts if AMX state is not in initialized state. Use
> kernel_fpu_begin_mask(KFPU_AMX) to ensure AMX state is initialized.
> 
> Introduce custom FPU handling wrappers to ensure compliance with the
> established FPU API semantics. This change follows the EFI case from
> commit b0dc553cfc9d ("x86/fpu: Make the EFI FPU calling convention
> explicit").
> 
> Then, use these wrappers to surround the MSR_ACTIVATE_SCAN write to
> minimize the critical section. To prevent unnecessary delays, invoke
> ifs_fpu_begin() before entering the rendezvous loop.
> 
> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
> Reviewed-by: Jithu Joseph <jithu.joseph@intel.com>
> Tested-by: Jithu Joseph <jithu.joseph@intel.com>
> Acked-by: Hans de Goede <hdegoede@redhat.com>

Reviewed-by: Ashok Raj <ashok.raj@intel.com>

for both patches. 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 2/2] platform/x86/intel/ifs: Initialize AMX state for the scan test
  2024-05-08  0:19     ` Ashok Raj
@ 2024-05-08  0:21       ` Chang S. Bae
  0 siblings, 0 replies; 30+ messages in thread
From: Chang S. Bae @ 2024-05-08  0:21 UTC (permalink / raw)
  To: Ashok Raj
  Cc: linux-kernel, x86, platform-driver-x86, tglx, mingo, bp,
	dave.hansen, hdegoede, ilpo.jarvinen, tony.luck, jithu.joseph

On 5/7/2024 5:19 PM, Ashok Raj wrote:
> 
> Reviewed-by: Ashok Raj <ashok.raj@intel.com>
> 
> for both patches.

Thanks!

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 2/2] platform/x86/intel/ifs: Initialize AMX state for the scan test
  2024-05-01  8:58   ` Hans de Goede
@ 2024-05-08  0:22     ` Chang S. Bae
  0 siblings, 0 replies; 30+ messages in thread
From: Chang S. Bae @ 2024-05-08  0:22 UTC (permalink / raw)
  To: Hans de Goede, linux-kernel
  Cc: x86, platform-driver-x86, tglx, mingo, bp, dave.hansen,
	ilpo.jarvinen, tony.luck, ashok.raj, jithu.joseph

On 5/1/2024 1:58 AM, Hans de Goede wrote:
> 
> Thanks, the patch looks good to me.
> 
> I believe this is best merged through the tip/x86 tree together
> with patch 1/2 :
> 
> Acked-by: Hans de Goede <hdegoede@redhat.com>
> 
> I have checked and this should not cause any conflicts with
> the ifs changes current in platform-drivers-x86/for-next.

Thanks!


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 0/2] x86/fpu: Extend kernel_fpu_begin_mask() for the In-Field Scan driver
  2024-05-07 23:53 ` [PATCH v2 " Chang S. Bae
  2024-05-07 23:53   ` [PATCH v2 1/2] x86/fpu: Extend kernel_fpu_begin_mask() to initialize AMX state Chang S. Bae
  2024-05-07 23:53   ` [PATCH v2 2/2] platform/x86/intel/ifs: Initialize AMX state for the scan test Chang S. Bae
@ 2024-05-08  8:28   ` Ilpo Järvinen
  2 siblings, 0 replies; 30+ messages in thread
From: Ilpo Järvinen @ 2024-05-08  8:28 UTC (permalink / raw)
  To: Chang S. Bae
  Cc: LKML, x86, platform-driver-x86, tglx, mingo, bp, dave.hansen,
	Hans de Goede, tony.luck, ashok.raj, jithu.joseph

[-- Attachment #1: Type: text/plain, Size: 1430 bytes --]

On Tue, 7 May 2024, Chang S. Bae wrote:

> Following feedback from the previous posting [1], this revision comes
> with the changelog updates for more clarity:
> 
> * The recent change in the SDM alone does not fully explain the
>   underlying reasons. Provide a clearer explanation rather than simply
>   echoing the SDM text.
> 
> * While it is feasible for administrators to isolate CPUs running the
>   In-Field Scan tests from those running AMX workloads, this approach is
>   considerably disruptive and conflicts with the goal of its live
>   testing. Add this point to the changelog as well.
> 
> Thanks to Dave for highlighting these aspects [2]. Additionally, it is
> worth noting that the IFS Technical Paper [3] was recently published,
> which may offer further context on the IFS testing.
> 
> Thanks,
> Chang
> 
> [1] V1: https://lore.kernel.org/lkml/20240430212508.105117-1-chang.seok.bae@intel.com/
> [2] Feedback: https://lore.kernel.org/lkml/a41d7012-2eea-436e-9f7e-6a7702f7e2c2@intel.com/
> [3] IFS Paper: https://www.intel.com/content/www/us/en/content-details/822279/finding-faulty-components-in-a-live-fleet-environment.html
> 
> Chang S. Bae (2):
>   x86/fpu: Extend kernel_fpu_begin_mask() to initialize AMX state
>   platform/x86/intel/ifs: Initialize AMX state for the scan test

For both patches:

Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>


-- 
 i.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/2] x86/fpu: Extend kernel_fpu_begin_mask() to initialize AMX state
  2024-05-07 23:53   ` [PATCH v2 1/2] x86/fpu: Extend kernel_fpu_begin_mask() to initialize AMX state Chang S. Bae
@ 2024-05-08 13:00     ` Thomas Gleixner
  2024-05-08 14:40     ` Dave Hansen
  1 sibling, 0 replies; 30+ messages in thread
From: Thomas Gleixner @ 2024-05-08 13:00 UTC (permalink / raw)
  To: Chang S. Bae, linux-kernel
  Cc: x86, platform-driver-x86, mingo, bp, dave.hansen, hdegoede,
	ilpo.jarvinen, tony.luck, ashok.raj, jithu.joseph,
	chang.seok.bae

On Tue, May 07 2024 at 16:53, Chang S. Bae wrote:
> diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
> index 1209c7aebb21..04cc6f14ca42 100644
> --- a/arch/x86/kernel/fpu/core.c
> +++ b/arch/x86/kernel/fpu/core.c
> @@ -440,6 +440,9 @@ void kernel_fpu_begin_mask(unsigned int kfpu_mask)
>  
>  	if (unlikely(kfpu_mask & KFPU_387) && boot_cpu_has(X86_FEATURE_FPU))
>  		asm volatile ("fninit");
> +
> +	if (unlikely(kfpu_mask & KFPU_AMX) && boot_cpu_has(X86_FEATURE_AMX_TILE))

  cpu_feature_enabled() please

> +		restore_fpregs_from_fpstate(&init_fpstate, XFEATURE_MASK_XTILE);
>  }
>  EXPORT_SYMBOL_GPL(kernel_fpu_begin_mask);

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/2] x86/fpu: Extend kernel_fpu_begin_mask() to initialize AMX state
  2024-05-07 23:53   ` [PATCH v2 1/2] x86/fpu: Extend kernel_fpu_begin_mask() to initialize AMX state Chang S. Bae
  2024-05-08 13:00     ` Thomas Gleixner
@ 2024-05-08 14:40     ` Dave Hansen
  2024-05-08 18:03       ` Chang S. Bae
  2024-05-09  0:29       ` Chang S. Bae
  1 sibling, 2 replies; 30+ messages in thread
From: Dave Hansen @ 2024-05-08 14:40 UTC (permalink / raw)
  To: Chang S. Bae, linux-kernel
  Cc: x86, platform-driver-x86, tglx, mingo, bp, dave.hansen, hdegoede,
	ilpo.jarvinen, tony.luck, ashok.raj, jithu.joseph

The subject should probably be:

	Allow kernel FPU users to initialize AMX state

On 5/7/24 16:53, Chang S. Bae wrote:
> The In-Field Scan (IFS) test [1] is a destructive process, overwriting
> the existing state to test the logic on the fly. As part of this test
> process, the architectural state is saved before the test begins and
> then restored upon completion.

This should say "_most_ architectural state".

> However, due to resource constraints in storage, AMX state is excluded
> from the scope of state recovery. Consequently, AMX state must be in its
> initialized state for the IFS test to run.

This doesn't mention how this issue got introduced.  Are we all bad at
reading the SDM? :)

> When AMX workloads are running, an active user AMX state remains even
> after a context switch, optimizing to reduce the state reload cost. In
> such cases, the test cannot proceed if it is scheduled.

This is a bit out of the blue.  What does scheduling have do do with IFS?

> System administrators may attempt to mitigate this issue, by arranging
> AMX workloads not to run on CPUs selected for the tests. However, this
> approach is disruptive for managing large-scaled systems, diminishing the
> benefit of the live testing.

I personally prefer to go and mention alternative solutions *after* I've
discussed the things that are truly relevant to the problem and fix.  I
think this distracts from the real issue.

> The kernel can help by properly initializing the state before the test.
> This initialization impacts the performance to some degree. But, this
> approach is considerably cheaper than adding hardware resources and
> simpler than a userspace approach.
> 
> While fpu_idle_fpregs() can initialize the AMX state, its usage should be
> limited to specialized cases, primarily before entering the sleep state.
> The restore_fpregs_from_fpstate() function offers a suitable mechanism
> for initializing fpstate in general, which remains within the core code.

I'm not sure those last two paragraphs add much value.  I'd try to
banish most of that content to *after* you talk about the solution.  Or
maybe put it in the cover letter.

> Extend kernel_fpu_begin_mask() to allow the IFS driver to initialize AMX
> state through restore_fpregs_from_fpstate().

The function names are pretty blatantly obvious from the patch.  No need
to copy them here.

As I look closer, I'm not sure I think this is a good match for the two
other KFPU_* flags.  I don't think either KFPU_387 or KFPU_MXCSR causes
any XFEATURE to be tracked as init.  The SDM says that FNINIT "sets the
FPU control, status, tag, instruction pointer, and data pointer
registers to their default states."

Off the top of my head, I don't know how that maps to the XSAVE init
tracking but I do know that MXCSR has a very weird mapping to the first
two XSAVE features.

I really think it would be simplest to just have this whole thing do this:

	kernel_fpu_begin();

	// Zap AMX state

	kernel_fpu_end();

Where the zap is either an os_xrstor() or an explicit tile release
instruction.

It's just a matter of whether you want this to work like a regular old
kernel FPU user or you want to tie it to *only* being able to run in its
own kernel thread. -- Aside: I don't think I realized IFS had its own
thread earlier.


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/2] x86/fpu: Extend kernel_fpu_begin_mask() to initialize AMX state
  2024-05-08 14:40     ` Dave Hansen
@ 2024-05-08 18:03       ` Chang S. Bae
  2024-05-08 19:11         ` Dave Hansen
  2024-05-09  0:29       ` Chang S. Bae
  1 sibling, 1 reply; 30+ messages in thread
From: Chang S. Bae @ 2024-05-08 18:03 UTC (permalink / raw)
  To: Dave Hansen, linux-kernel
  Cc: x86, platform-driver-x86, tglx, mingo, bp, dave.hansen, hdegoede,
	ilpo.jarvinen, tony.luck, ashok.raj, jithu.joseph

On 5/8/2024 7:40 AM, Dave Hansen wrote:
> On 5/7/24 16:53, Chang S. Bae wrote:
> 
>> However, due to resource constraints in storage, AMX state is excluded
>> from the scope of state recovery. Consequently, AMX state must be in its
>> initialized state for the IFS test to run.
> 
> This doesn't mention how this issue got introduced.  Are we all bad at
> reading the SDM? :)

Ah, I'd rather zap out this SDM sentence.

>> When AMX workloads are running, an active user AMX state remains even
>> after a context switch, optimizing to reduce the state reload cost. In
>> such cases, the test cannot proceed if it is scheduled.
> 
> This is a bit out of the blue.  What does scheduling have do do with IFS?

$ echo <cpu#> > /sys/devices/virtual/misc/intel_ifs_0/run_test

Then,
run_test_store()
-> do_core_test()
   -> ifs_test_core()
     -> stop_core_cpuslocked()
       -> stop_cpus()
         -> queue_stop_cpus_work()
           -> cpu_stop_queue_work()
             -> wake_q_add()
             -> wake_up_q()

So, the CPU stopper threads for <cpu#> and its sibling to execute 
doscan() are queued up with the highest priority.

queue_stop_cpus_work() has

	for_each_cpu(cpu, cpumask) {
		work = &per_cpu(cpu_stopper.stop_work, cpu);
		work->fn = fn;
		work->arg = arg;
		work->done = done;
		work->caller = _RET_IP_;
		if (cpu_stop_queue_work(cpu, work))
			queued = true;
	}

Those threads are created during early boot via 
smpboot_register_percpu_thread().

> I'm not sure those last two paragraphs add much value.  I'd try to
> banish most of that content to *after* you talk about the solution.  Or
> maybe put it in the cover letter.

It looks like lots of distractions coming from bunch of alternatives in 
different levels.

Thanks,
Chang

PS: Let me respond the solution discussion separately. I do want to 
experiment the init-track behavior a bit.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/2] x86/fpu: Extend kernel_fpu_begin_mask() to initialize AMX state
  2024-05-08 18:03       ` Chang S. Bae
@ 2024-05-08 19:11         ` Dave Hansen
  0 siblings, 0 replies; 30+ messages in thread
From: Dave Hansen @ 2024-05-08 19:11 UTC (permalink / raw)
  To: Chang S. Bae, linux-kernel
  Cc: x86, platform-driver-x86, tglx, mingo, bp, dave.hansen, hdegoede,
	ilpo.jarvinen, tony.luck, ashok.raj, jithu.joseph

On 5/8/24 11:03, Chang S. Bae wrote:
> On 5/8/2024 7:40 AM, Dave Hansen wrote:
>> On 5/7/24 16:53, Chang S. Bae wrote:
>>
>>> However, due to resource constraints in storage, AMX state is excluded
>>> from the scope of state recovery. Consequently, AMX state must be in its
>>> initialized state for the IFS test to run.
>>
>> This doesn't mention how this issue got introduced.  Are we all bad at
>> reading the SDM? :)
> 
> Ah, I'd rather zap out this SDM sentence.

My point is that this is fixing a bug.  Where did that bug come from?
What got screwed up here?

Hint: I don't think us software folks screwed up here.  It was likely
the folks that built the two hardware features (AMX and IFS) forgot to
talk to each other, or someone forgot to document the AMX clobbering
aspect of the architecture.

>>> When AMX workloads are running, an active user AMX state remains even
>>> after a context switch, optimizing to reduce the state reload cost. In
>>> such cases, the test cannot proceed if it is scheduled.
>>
>> This is a bit out of the blue.  What does scheduling have do do with IFS?
...
> So, the CPU stopper threads for <cpu#> and its sibling to execute
> doscan() are queued up with the highest priority.
...

But this is the IFS implementation *today*.  The explanation depends on
IFS being implemented with something that context switches.  It also
depends on folks expecting context switches to always switch FPU state.

I'd just say:

	The kernel generally runs with live user FPU state, including
	AMX. That state can prevent IFS tests from running.

That's _much_ more simple, generic and also fully explains the
situation.  It also isn't dependent on the IFS stop_cpus_run()
implementation of today, which could totally change tomorrow.

The underlying rule has zero to do with scheduling or context switching
optimizations.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/2] x86/fpu: Extend kernel_fpu_begin_mask() to initialize AMX state
  2024-05-08 14:40     ` Dave Hansen
  2024-05-08 18:03       ` Chang S. Bae
@ 2024-05-09  0:29       ` Chang S. Bae
  2024-05-09 17:36         ` Dave Hansen
  1 sibling, 1 reply; 30+ messages in thread
From: Chang S. Bae @ 2024-05-09  0:29 UTC (permalink / raw)
  To: Dave Hansen, linux-kernel
  Cc: x86, platform-driver-x86, tglx, mingo, bp, dave.hansen, hdegoede,
	ilpo.jarvinen, tony.luck, ashok.raj, jithu.joseph

On 5/8/2024 7:40 AM, Dave Hansen wrote:
> 
> As I look closer, I'm not sure I think this is a good match for the two
> other KFPU_* flags.  I don't think either KFPU_387 or KFPU_MXCSR causes
> any XFEATURE to be tracked as init.  The SDM says that FNINIT "sets the
> FPU control, status, tag, instruction pointer, and data pointer
> registers to their default states."
> 
> Off the top of my head, I don't know how that maps to the XSAVE init
> tracking but I do know that MXCSR has a very weird mapping to the first
> two XSAVE features.

FNINIT does *not* set the component to be tracked as init. Indeed, the 
SSE component is somewhat convoluted. AMX state will be likely tracked 
as init.  But, I believe this init tracking mechanism is 
implementation-specific, not architectural. Beyond this intricacy of 
init-tracking, KFPU_* flags all initialize each state:

/* Kernel FPU states to initialize in kernel_fpu_begin_mask() */
#define KFPU_387	_BITUL(0)	/* 387 state will be initialized */
#define KFPU_MXCSR	_BITUL(1)	/* MXCSR will be initialized */

> I really think it would be simplest to just have this whole thing do this:
> 
> 	kernel_fpu_begin();
> 
> 	// Zap AMX state
> 
> 	kernel_fpu_end();
> 
> Where the zap is either an os_xrstor() or an explicit tile release
> instruction.

If I understand correctly, the change could be something like this, 
although I'm not sure about this new API naming and prototype at this point:

diff --git a/arch/x86/include/asm/fpu/api.h b/arch/x86/include/asm/fpu/api.h
index a2be3aefff9f..906634402ea6 100644
--- a/arch/x86/include/asm/fpu/api.h
+++ b/arch/x86/include/asm/fpu/api.h
@@ -28,6 +28,7 @@

  extern void kernel_fpu_begin_mask(unsigned int kfpu_mask);
  extern void kernel_fpu_end(void);
+extern void kernel_fpu_reset(void);
  extern bool irq_fpu_usable(void);
  extern void fpregs_mark_activate(void);

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 1209c7aebb21..0351f9ee3e49 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -452,6 +452,15 @@ void kernel_fpu_end(void)
  }
  EXPORT_SYMBOL_GPL(kernel_fpu_end);

+void kernel_fpu_reset(void)
+{
+       kernel_fpu_begin();
+       if (cpu_feature_enabled(X86_FEATURE_AMX_TILE))
+               tile_release();
+       kernel_fpu_end();
+}
+EXPORT_SYMBOL(kernel_fpu_reset);
+
  /*
   * Sync the FPU register state to current's memory register state when the
   * current task owns the FPU. The hardware register state is preserved.
diff --git a/drivers/platform/x86/intel/ifs/ifs.h 
b/drivers/platform/x86/intel/ifs/ifs.h
index 56b9f3e3cf76..5e7ba94b4054 100644
--- a/drivers/platform/x86/intel/ifs/ifs.h
+++ b/drivers/platform/x86/intel/ifs/ifs.h
@@ -129,6 +129,7 @@
   */
  #include <linux/device.h>
  #include <linux/miscdevice.h>
+#include <asm/fpu/api.h>

  #define MSR_ARRAY_BIST                         0x00000105
  #define MSR_COPY_SCAN_HASHES                   0x000002c2
diff --git a/drivers/platform/x86/intel/ifs/runtest.c 
b/drivers/platform/x86/intel/ifs/runtest.c
index 95b4b71fab53..33ef4962aeba 100644
--- a/drivers/platform/x86/intel/ifs/runtest.c
+++ b/drivers/platform/x86/intel/ifs/runtest.c
@@ -188,6 +188,8 @@ static int doscan(void *data)
         /* Only the first logical CPU on a core reports result */
         first = cpumask_first(cpu_smt_mask(cpu));

+       kernel_fpu_reset();
+
         wait_for_sibling_cpu(&scan_cpus_in, NSEC_PER_SEC);

         /*

> It's just a matter of whether you want this to work like a regular old
> kernel FPU user or you want to tie it to *only* being able to run in its
> own kernel thread. -- Aside: I don't think I realized IFS had its own
> thread earlier.

While both approaches aim for the same functionality, the code change 
seems to be smaller for the above. At the same time, it is notable that 
these new APIs, fpu_idle_fpregs() and the other, are *only* for AMX.

1. The diff state of this series' approach:
  arch/x86/include/asm/fpu/api.h           |  1 +
  arch/x86/kernel/fpu/core.c               |  3 +++
  drivers/platform/x86/intel/ifs/ifs.h     | 15 +++++++++++++++
  drivers/platform/x86/intel/ifs/runtest.c |  6 ++++++
  4 files changed, 25 insertions(+)

2. The above code changes:
  arch/x86/include/asm/fpu/api.h           | 1 +
  arch/x86/kernel/fpu/core.c               | 9 +++++++++
  drivers/platform/x86/intel/ifs/ifs.h     | 1 +
  drivers/platform/x86/intel/ifs/runtest.c | 2 ++
  4 files changed, 13 insertions(+)

Thanks,
Chang

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH 2/2] platform/x86/intel/ifs: Initialize AMX state for the scan test
  2024-04-30 21:25 ` [PATCH 2/2] platform/x86/intel/ifs: Initialize AMX state for the scan test Chang S. Bae
                     ` (3 preceding siblings ...)
  2024-05-05  6:39   ` kernel test robot
@ 2024-05-09  0:45   ` Kuppuswamy Sathyanarayanan
  4 siblings, 0 replies; 30+ messages in thread
From: Kuppuswamy Sathyanarayanan @ 2024-05-09  0:45 UTC (permalink / raw)
  To: Chang S. Bae, linux-kernel
  Cc: x86, platform-driver-x86, tglx, mingo, bp, dave.hansen, hdegoede,
	ilpo.jarvinen, tony.luck, ashok.raj, jithu.joseph


On 4/30/24 2:25 PM, Chang S. Bae wrote:
> The scan test does not start when the AMX state remains active and is not
> re-initialized. With the extension of kernel_fpu_begin_mask(), the driver
> code can now initialize the state properly.
>
> Introduce custom FPU handling wrappers to ensure compliant with the
> established FPU API semantics, as kernel_fpu_begin() exclusively sets
> legacy states. This follows the EFI case from commit b0dc553cfc9d
> ("x86/fpu: Make the EFI FPU calling convention explicit").
>
> Then, use these wrappers to surround the MSR_ACTIVATE_SCAN write to
> minimize the critical section. To prevent unnecessary delays, invoke
> ifs_fpu_begin() before entering the rendezvous loop.
>
> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
> Reviewed-by: Jithu Joseph <jithu.joseph@intel.com>
> Tested-by: Jithu Joseph <jithu.joseph@intel.com>
> ---

Looks good to me.

Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
>  drivers/platform/x86/intel/ifs/ifs.h     | 14 ++++++++++++++
>  drivers/platform/x86/intel/ifs/runtest.c |  6 ++++++
>  2 files changed, 20 insertions(+)
>
> diff --git a/drivers/platform/x86/intel/ifs/ifs.h b/drivers/platform/x86/intel/ifs/ifs.h
> index 56b9f3e3cf76..71d8b50854b2 100644
> --- a/drivers/platform/x86/intel/ifs/ifs.h
> +++ b/drivers/platform/x86/intel/ifs/ifs.h
> @@ -325,4 +325,18 @@ int do_core_test(int cpu, struct device *dev);
>  extern struct attribute *plat_ifs_attrs[];
>  extern struct attribute *plat_ifs_array_attrs[];
>  
> +static inline void ifs_fpu_begin(void)
> +{
> +	/*
> +	 * The AMX state must be initialized prior to executing In-Field
> +	 * Scan tests, according to Intel SDM.

Nit: May be helpful to include the section title in SDM.

> +	 */
> +	kernel_fpu_begin_mask(KFPU_AMX);
> +}
> +
> +static inline void ifs_fpu_end(void)
> +{
> +	kernel_fpu_end();
> +}
> +
>  #endif
> diff --git a/drivers/platform/x86/intel/ifs/runtest.c b/drivers/platform/x86/intel/ifs/runtest.c
> index 95b4b71fab53..a35eac7c0b44 100644
> --- a/drivers/platform/x86/intel/ifs/runtest.c
> +++ b/drivers/platform/x86/intel/ifs/runtest.c
> @@ -188,6 +188,9 @@ static int doscan(void *data)
>  	/* Only the first logical CPU on a core reports result */
>  	first = cpumask_first(cpu_smt_mask(cpu));
>  
> +	/* Prepare FPU state before entering the rendezvous loop*/
> +	ifs_fpu_begin();
> +
>  	wait_for_sibling_cpu(&scan_cpus_in, NSEC_PER_SEC);
>  
>  	/*
> @@ -199,6 +202,9 @@ static int doscan(void *data)
>  	 * are processed in a single pass) before it retires.
>  	 */
>  	wrmsrl(MSR_ACTIVATE_SCAN, params->activate->data);
> +
> +	ifs_fpu_end();
> +
>  	rdmsrl(MSR_SCAN_STATUS, status.data);
>  
>  	trace_ifs_status(ifsd->cur_batch, start, stop, status.data);

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/2] x86/fpu: Extend kernel_fpu_begin_mask() to initialize AMX state
  2024-05-09  0:29       ` Chang S. Bae
@ 2024-05-09 17:36         ` Dave Hansen
  2024-05-09 17:41           ` Dave Hansen
                             ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Dave Hansen @ 2024-05-09 17:36 UTC (permalink / raw)
  To: Chang S. Bae, linux-kernel
  Cc: x86, platform-driver-x86, tglx, mingo, bp, dave.hansen, hdegoede,
	ilpo.jarvinen, tony.luck, ashok.raj, jithu.joseph

On 5/8/24 17:29, Chang S. Bae wrote:
> +void kernel_fpu_reset(void)
> +{
> +       kernel_fpu_begin();
> +       if (cpu_feature_enabled(X86_FEATURE_AMX_TILE))
> +               tile_release();
> +       kernel_fpu_end();
> +}
> +EXPORT_SYMBOL(kernel_fpu_reset);
> +
...
> --- a/drivers/platform/x86/intel/ifs/runtest.c
> +++ b/drivers/platform/x86/intel/ifs/runtest.c
> @@ -188,6 +188,8 @@ static int doscan(void *data)
>         /* Only the first logical CPU on a core reports result */
>         first = cpumask_first(cpu_smt_mask(cpu));
> 
> +       kernel_fpu_reset();
> +
>         wait_for_sibling_cpu(&scan_cpus_in, NSEC_PER_SEC);

Remember, kernel_fpu_begin/end() mark a section of code that needs the
FPU.  Once code calls kernel_fpu_end(), it no longer owns the FPU and
all bets are off.  A interrupt could theoretically come in and do
whatever it wants.

I _assume_ that this is practically impossible since the stop_machine()
infrastructure keeps interrupts at bay.  But it's rather subtle.

I'd probably just do this:

+	kernel_fpu_begin();
+	// AMX *MUST* be in the init state for the wrmsr() to work.
+	// But, the more in the init state, the less state the test
+	// has to save and restore.  Just zap everything.
+	restore_fpregs_from_fpstate(&init_fpstate,	
+				    fpu_user_cfg.max_features);
+
        wrmsrl(MSR_ACTIVATE_SCAN, params->activate->data);
        rdmsrl(MSR_SCAN_STATUS, status.data);

+	kernel_fpu_end();

That's dirt simple.  It doesn't require new infrastructure.  It doesn't
call an opaque new helper.  It doesn't require a feature check.  It
probably makes the IFS test run faster.  It will also magically work for
any fancy new feature that comes along which *ALSO* needs to be in its
init state ... with zero changes to this code.  For bonus points, this
code is quite universal.  It will work, as-is, in a bunch of kernel
contexts if future deranged kernel developer copies and pastes it.  The
code you suggested above can race unless it's called under
stop_machine() and isn't safe to copy elsewhere.

Three lines of code:

	1. IFS declares its need to own the FPU for a moment, like any
	   other kernel_fpu_begin() user.  It's not a special snowflake.
	   It is boring.
	2. IFS zaps the FPU state
	3. IFS gives up the FPU

Am I out of my mind?  What am I missing?  Why bother with _anything_
more complicated than this?

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/2] x86/fpu: Extend kernel_fpu_begin_mask() to initialize AMX state
  2024-05-09 17:36         ` Dave Hansen
@ 2024-05-09 17:41           ` Dave Hansen
  2024-05-09 18:41           ` Chang S. Bae
  2024-05-10  7:51           ` Thomas Gleixner
  2 siblings, 0 replies; 30+ messages in thread
From: Dave Hansen @ 2024-05-09 17:41 UTC (permalink / raw)
  To: Chang S. Bae, linux-kernel
  Cc: x86, platform-driver-x86, tglx, mingo, bp, dave.hansen, hdegoede,
	ilpo.jarvinen, tony.luck, ashok.raj, jithu.joseph

On 5/9/24 10:36, Dave Hansen wrote:
> +	restore_fpregs_from_fpstate(&init_fpstate,	
> +				    fpu_user_cfg.max_features);

There is *one* subtlety here.  This assumes that 'init_fpstate' has AMX
marked as being in its init state.  But I think that's a pretty safe
assumption.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/2] x86/fpu: Extend kernel_fpu_begin_mask() to initialize AMX state
  2024-05-09 17:36         ` Dave Hansen
  2024-05-09 17:41           ` Dave Hansen
@ 2024-05-09 18:41           ` Chang S. Bae
  2024-05-09 18:50             ` Dave Hansen
  2024-05-10  7:51           ` Thomas Gleixner
  2 siblings, 1 reply; 30+ messages in thread
From: Chang S. Bae @ 2024-05-09 18:41 UTC (permalink / raw)
  To: Dave Hansen, linux-kernel
  Cc: x86, platform-driver-x86, tglx, mingo, bp, dave.hansen, hdegoede,
	ilpo.jarvinen, tony.luck, ashok.raj, jithu.joseph

On 5/9/2024 10:36 AM, Dave Hansen wrote:
> 
> I'd probably just do this:
> 
> +	kernel_fpu_begin();
> +	// AMX *MUST* be in the init state for the wrmsr() to work.
> +	// But, the more in the init state, the less state the test
> +	// has to save and restore.  Just zap everything.
> +	restore_fpregs_from_fpstate(&init_fpstate,	
> +				    fpu_user_cfg.max_features);
> +

I assume that this snippet goes to the IFS driver side. Then, we need to 
introduce and export a new wrapper for this. 
restore_fpregs_from_fpstate() and its arguments are not accessible as of 
now.

Also, I think we should encapsulate them. If we follow this style, we 
could have invoked tilerelease() directly from the idle driver, right?

>          wrmsrl(MSR_ACTIVATE_SCAN, params->activate->data);
>          rdmsrl(MSR_SCAN_STATUS, status.data);
> 
> +	kernel_fpu_end();


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/2] x86/fpu: Extend kernel_fpu_begin_mask() to initialize AMX state
  2024-05-09 18:41           ` Chang S. Bae
@ 2024-05-09 18:50             ` Dave Hansen
  0 siblings, 0 replies; 30+ messages in thread
From: Dave Hansen @ 2024-05-09 18:50 UTC (permalink / raw)
  To: Chang S. Bae, linux-kernel
  Cc: x86, platform-driver-x86, tglx, mingo, bp, dave.hansen, hdegoede,
	ilpo.jarvinen, tony.luck, ashok.raj, jithu.joseph

On 5/9/24 11:41, Chang S. Bae wrote:
> On 5/9/2024 10:36 AM, Dave Hansen wrote:
>>
>> I'd probably just do this:
>>
>> +    kernel_fpu_begin();
>> +    // AMX *MUST* be in the init state for the wrmsr() to work.
>> +    // But, the more in the init state, the less state the test
>> +    // has to save and restore.  Just zap everything.
>> +    restore_fpregs_from_fpstate(&init_fpstate,   
>> +                    fpu_user_cfg.max_features);
>> +
> 
> I assume that this snippet goes to the IFS driver side. Then, we need
> to introduce and export a new wrapper for this. 
> restore_fpregs_from_fpstate() and its arguments are not accessible as
> of now.

Yes, a new wrapper to initialize all the user FPU state is fine.

> Also, I think we should encapsulate them. If we follow this style, we
> could have invoked tilerelease() directly from the idle driver, right?

You could have...  But I think the point there truly was to do a minimal
amount of work because you're going idle and need to race to get there.
The path to idle is super hot and you don't want to do _any_ additional
work.  It's worth writing highly AMX-specific code because generic code
would be slow there.

The IFS path is a super duper slow one.  It takes *FOREVER*.  I like the
idea of being simple, dumb and slow.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/2] x86/fpu: Extend kernel_fpu_begin_mask() to initialize AMX state
  2024-05-09 17:36         ` Dave Hansen
  2024-05-09 17:41           ` Dave Hansen
  2024-05-09 18:41           ` Chang S. Bae
@ 2024-05-10  7:51           ` Thomas Gleixner
  2 siblings, 0 replies; 30+ messages in thread
From: Thomas Gleixner @ 2024-05-10  7:51 UTC (permalink / raw)
  To: Dave Hansen, Chang S. Bae, linux-kernel
  Cc: x86, platform-driver-x86, mingo, bp, dave.hansen, hdegoede,
	ilpo.jarvinen, tony.luck, ashok.raj, jithu.joseph

On Thu, May 09 2024 at 10:36, Dave Hansen wrote:
> Three lines of code:
>
> 	1. IFS declares its need to own the FPU for a moment, like any
> 	   other kernel_fpu_begin() user.  It's not a special snowflake.
> 	   It is boring.
> 	2. IFS zaps the FPU state
> 	3. IFS gives up the FPU
>
> Am I out of my mind?  What am I missing?  Why bother with _anything_
> more complicated than this?

Right. Keep it simple :)

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2024-05-10  7:51 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-30 21:25 [PATCH 0/2] x86/fpu: Extend kernel_fpu_begin_mask() for the In-Field Scan driver Chang S. Bae
2024-04-30 21:25 ` [PATCH 1/2] x86/fpu: Extend kernel_fpu_begin_mask() to initialize AMX state Chang S. Bae
2024-04-30 23:42   ` Ashok Raj
2024-04-30 21:25 ` [PATCH 2/2] platform/x86/intel/ifs: Initialize AMX state for the scan test Chang S. Bae
2024-05-01  0:06   ` Ashok Raj
2024-05-01  8:58   ` Hans de Goede
2024-05-08  0:22     ` Chang S. Bae
2024-05-02 11:00   ` Ilpo Järvinen
2024-05-02 22:19     ` Chang S. Bae
2024-05-05  6:39   ` kernel test robot
2024-05-09  0:45   ` Kuppuswamy Sathyanarayanan
2024-05-02 20:52 ` [PATCH 0/2] x86/fpu: Extend kernel_fpu_begin_mask() for the In-Field Scan driver Chang S. Bae
2024-05-02 21:35   ` Chang S. Bae
2024-05-02 20:59 ` Dave Hansen
2024-05-07 23:53 ` [PATCH v2 " Chang S. Bae
2024-05-07 23:53   ` [PATCH v2 1/2] x86/fpu: Extend kernel_fpu_begin_mask() to initialize AMX state Chang S. Bae
2024-05-08 13:00     ` Thomas Gleixner
2024-05-08 14:40     ` Dave Hansen
2024-05-08 18:03       ` Chang S. Bae
2024-05-08 19:11         ` Dave Hansen
2024-05-09  0:29       ` Chang S. Bae
2024-05-09 17:36         ` Dave Hansen
2024-05-09 17:41           ` Dave Hansen
2024-05-09 18:41           ` Chang S. Bae
2024-05-09 18:50             ` Dave Hansen
2024-05-10  7:51           ` Thomas Gleixner
2024-05-07 23:53   ` [PATCH v2 2/2] platform/x86/intel/ifs: Initialize AMX state for the scan test Chang S. Bae
2024-05-08  0:19     ` Ashok Raj
2024-05-08  0:21       ` Chang S. Bae
2024-05-08  8:28   ` [PATCH v2 0/2] x86/fpu: Extend kernel_fpu_begin_mask() for the In-Field Scan driver Ilpo Järvinen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.