All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] x86/fault: speed up uffd-unit-test by 10x: rate-limit "MCE: Killing" logs
@ 2024-05-07  2:29 John Hubbard
  2024-05-07  8:13 ` David Hildenbrand
  2024-05-07 19:26 ` Borislav Petkov
  0 siblings, 2 replies; 11+ messages in thread
From: John Hubbard @ 2024-05-07  2:29 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Dave Hansen, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H . Peter Anvin, LKML, linux-mm,
	John Hubbard

If a system experiences a lot of memory failures, then any associated
printk() output really needs to be rate-limited. I noticed this while
running selftests/mm/uffd-unit-tests, which logs 12,305 lines of output,
adding (on my system) an extra 97 seconds of runtime due to printk time.

The test normally only takes about 10 seconds, and the enclosing set of
mm selftests normally takes 305 seconds, so the additional 97 seconds
really hurts.

Generating lots of memory poisoning events seems like a valid use case,
by which I mean that this is not just a testing artifact. And that's why
the fix applies to the code that directly generates the output, rather
than the selftest that triggers it.

With this patch, all but 10 lines are suppressed, thus speeding up that
particular selftest by 90% (runtime drops from 107 seconds, to 10.6
seconds).

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 arch/x86/mm/fault.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index bba4e020dd64..e4f3c7721f45 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -928,7 +928,7 @@ do_sigbus(struct pt_regs *regs, unsigned long error_code, unsigned long address,
 		struct task_struct *tsk = current;
 		unsigned lsb = 0;
 
-		pr_err(
+		pr_err_ratelimited(
 	"MCE: Killing %s:%d due to hardware memory corruption fault at %lx\n",
 			tsk->comm, tsk->pid, address);
 		if (fault & VM_FAULT_HWPOISON_LARGE)

base-commit: dccb07f2914cdab2ac3a5b6c98406f765acab803
prerequisite-patch-id: b901ece2a5b78503e2fb5480f20e304d36a0ea27
-- 
2.45.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] x86/fault: speed up uffd-unit-test by 10x: rate-limit "MCE: Killing" logs
  2024-05-07  2:29 [PATCH] x86/fault: speed up uffd-unit-test by 10x: rate-limit "MCE: Killing" logs John Hubbard
@ 2024-05-07  8:13 ` David Hildenbrand
  2024-05-07 16:28   ` John Hubbard
  2024-05-07 19:26 ` Borislav Petkov
  1 sibling, 1 reply; 11+ messages in thread
From: David Hildenbrand @ 2024-05-07  8:13 UTC (permalink / raw)
  To: John Hubbard, Andy Lutomirski
  Cc: Dave Hansen, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H . Peter Anvin, LKML, linux-mm, Peter Xu,
	Axel Rasmussen

The patch subject is misleading. This should be "don't flood the system 
log". Nobody cares about the speed of a unittest ;)

On 07.05.24 04:29, John Hubbard wrote:
> If a system experiences a lot of memory failures, then any associated
> printk() output really needs to be rate-limited. I noticed this while
> running selftests/mm/uffd-unit-tests, which logs 12,305 lines of output,
> adding (on my system) an extra 97 seconds of runtime due to printk time.

Recently discussed:

https://lkml.kernel.org/r/a9e3120d-8b79-4435-b113-ceb20aa45ee2@alu.unizg.hr

See the pros/cons of using ratelimiting, and what an alternative for 
uffd is that Axel is working on.

(CCing Peter and Axel)

-- 
Cheers,

David / dhildenb


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] x86/fault: speed up uffd-unit-test by 10x: rate-limit "MCE: Killing" logs
  2024-05-07  8:13 ` David Hildenbrand
@ 2024-05-07 16:28   ` John Hubbard
  2024-05-07 16:43     ` David Hildenbrand
  0 siblings, 1 reply; 11+ messages in thread
From: John Hubbard @ 2024-05-07 16:28 UTC (permalink / raw)
  To: David Hildenbrand, Andy Lutomirski
  Cc: Dave Hansen, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H . Peter Anvin, LKML, linux-mm, Peter Xu,
	Axel Rasmussen

On 5/7/24 1:13 AM, David Hildenbrand wrote:
> The patch subject is misleading. This should be "don't flood the system 

I went back and forth on that subject line. :)

> log". Nobody cares about the speed of a unittest ;)

Yes they do. People should actually run the selftests, which in turn have
enshrined their guidelines in kernel doc. See dev-tools/kselftest.rst,
"Contributing new tests", which says, as you would hope, "Don't take
too long".

It's important. Tests need to be quick, and having one out of 50 that
blows it up is worth fixing.

> 
> On 07.05.24 04:29, John Hubbard wrote:
>> If a system experiences a lot of memory failures, then any associated
>> printk() output really needs to be rate-limited. I noticed this while
>> running selftests/mm/uffd-unit-tests, which logs 12,305 lines of output,
>> adding (on my system) an extra 97 seconds of runtime due to printk time.
> 
> Recently discussed:
> 
> https://lkml.kernel.org/r/a9e3120d-8b79-4435-b113-ceb20aa45ee2@alu.unizg.hr
> 
> See the pros/cons of using ratelimiting, and what an alternative for 
> uffd is that Axel is working on.
> 
> (CCing Peter and Axel)
> 

That thread seems to have stalled. I *do* have MCE experience (writing a 
handler,
dealing with MCEs and related bugs), and what they wrote so far is exactly
correct: if you were going to flood the log, then no, we don't need to see
every single line printed. The first 10 or so, plus the fact that rate 
limiting
kicked in, is sufficient to proceed with debugging and/or hardware 
replacement.

I'd like to just do this patch almost as-is, just with a fixed up subject,
perhaps:

     x86/fault: rate-limit to avoid flooding dmesg with "MCE: Killing" 
reports

Yes?

thanks,
-- 
John Hubbard
NVIDIA


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] x86/fault: speed up uffd-unit-test by 10x: rate-limit "MCE: Killing" logs
  2024-05-07 16:28   ` John Hubbard
@ 2024-05-07 16:43     ` David Hildenbrand
  2024-05-07 16:53       ` John Hubbard
  2024-05-07 18:08       ` Axel Rasmussen
  0 siblings, 2 replies; 11+ messages in thread
From: David Hildenbrand @ 2024-05-07 16:43 UTC (permalink / raw)
  To: John Hubbard, Andy Lutomirski
  Cc: Dave Hansen, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H . Peter Anvin, LKML, linux-mm, Peter Xu,
	Axel Rasmussen

On 07.05.24 18:28, John Hubbard wrote:
> On 5/7/24 1:13 AM, David Hildenbrand wrote:
>> The patch subject is misleading. This should be "don't flood the system
> 
> I went back and forth on that subject line. :)
> 
>> log". Nobody cares about the speed of a unittest ;)
> 
> Yes they do. People should actually run the selftests, which in turn have
> enshrined their guidelines in kernel doc. See dev-tools/kselftest.rst,
> "Contributing new tests", which says, as you would hope, "Don't take
> too long".
> 
> It's important. Tests need to be quick, and having one out of 50 that
> blows it up is worth fixing.

I'm pretty sure you got my point: it's much more important to not let 
unprivileged users flood the log (possibly harming the system?) than 
making a test run faster :)

> 
>>
>> On 07.05.24 04:29, John Hubbard wrote:
>>> If a system experiences a lot of memory failures, then any associated
>>> printk() output really needs to be rate-limited. I noticed this while
>>> running selftests/mm/uffd-unit-tests, which logs 12,305 lines of output,
>>> adding (on my system) an extra 97 seconds of runtime due to printk time.
>>
>> Recently discussed:
>>
>> https://lkml.kernel.org/r/a9e3120d-8b79-4435-b113-ceb20aa45ee2@alu.unizg.hr
>>
>> See the pros/cons of using ratelimiting, and what an alternative for
>> uffd is that Axel is working on.
>>
>> (CCing Peter and Axel)
>>
> 
> That thread seems to have stalled.

Yes, there was no follow-up.

> I *do* have MCE experience (writing a
> handler,
> dealing with MCEs and related bugs), and what they wrote so far is exactly
> correct: if you were going to flood the log, then no, we don't need to see
> every single line printed. The first 10 or so, plus the fact that rate
> limiting
> kicked in, is sufficient to proceed with debugging and/or hardware
> replacement.
> 
> I'd like to just do this patch almost as-is, just with a fixed up subject,
> perhaps:
> 
>       x86/fault: rate-limit to avoid flooding dmesg with "MCE: Killing"
> reports
> 
> Yes?


Makes sense to me (and thanks for confirming that we don't need 
complexity elsewhere).

I think we at least need "Fixes:" (not sure if stable is warranted as 
well, 1b0a151c10a6d823f033023b9fdd9af72a89591b didn't CC stable).

Consider adding a link to the report in that thread.

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Cheers,

David / dhildenb


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] x86/fault: speed up uffd-unit-test by 10x: rate-limit "MCE: Killing" logs
  2024-05-07 16:43     ` David Hildenbrand
@ 2024-05-07 16:53       ` John Hubbard
  2024-05-07 16:55         ` David Hildenbrand
  2024-05-07 18:08       ` Axel Rasmussen
  1 sibling, 1 reply; 11+ messages in thread
From: John Hubbard @ 2024-05-07 16:53 UTC (permalink / raw)
  To: David Hildenbrand, Andy Lutomirski
  Cc: Dave Hansen, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H . Peter Anvin, LKML, linux-mm, Peter Xu,
	Axel Rasmussen

On 5/7/24 9:43 AM, David Hildenbrand wrote:
> On 07.05.24 18:28, John Hubbard wrote:
>> On 5/7/24 1:13 AM, David Hildenbrand wrote:
...
>> I'd like to just do this patch almost as-is, just with a fixed up 
>> subject,
>> perhaps:
>>
>>       x86/fault: rate-limit to avoid flooding dmesg with "MCE: Killing"
>> reports
>>
>> Yes?
> 
> 
> Makes sense to me (and thanks for confirming that we don't need 
> complexity elsewhere).
> 
> I think we at least need "Fixes:" (not sure if stable is warranted as 
> well, 1b0a151c10a6d823f033023b9fdd9af72a89591b didn't CC stable).
> 
> Consider adding a link to the report in that thread.
> 
> Acked-by: David Hildenbrand <david@redhat.com>
> 

OK, there's enough changes to the commit description that I'll go ahead
and send a v2, with a new subject line, a Fixes:, your Ack, a link to
the other conversation, and maybe reword a bit based on your feedback
too.

Thanks for the review and the Ack!

thanks,
-- 
John Hubbard
NVIDIA


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] x86/fault: speed up uffd-unit-test by 10x: rate-limit "MCE: Killing" logs
  2024-05-07 16:53       ` John Hubbard
@ 2024-05-07 16:55         ` David Hildenbrand
  0 siblings, 0 replies; 11+ messages in thread
From: David Hildenbrand @ 2024-05-07 16:55 UTC (permalink / raw)
  To: John Hubbard, Andy Lutomirski
  Cc: Dave Hansen, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H . Peter Anvin, LKML, linux-mm, Peter Xu,
	Axel Rasmussen

On 07.05.24 18:53, John Hubbard wrote:
> On 5/7/24 9:43 AM, David Hildenbrand wrote:
>> On 07.05.24 18:28, John Hubbard wrote:
>>> On 5/7/24 1:13 AM, David Hildenbrand wrote:
> ...
>>> I'd like to just do this patch almost as-is, just with a fixed up
>>> subject,
>>> perhaps:
>>>
>>>        x86/fault: rate-limit to avoid flooding dmesg with "MCE: Killing"
>>> reports
>>>
>>> Yes?
>>
>>
>> Makes sense to me (and thanks for confirming that we don't need
>> complexity elsewhere).
>>
>> I think we at least need "Fixes:" (not sure if stable is warranted as
>> well, 1b0a151c10a6d823f033023b9fdd9af72a89591b didn't CC stable).
>>
>> Consider adding a link to the report in that thread.
>>
>> Acked-by: David Hildenbrand <david@redhat.com>
>>
> 
> OK, there's enough changes to the commit description that I'll go ahead
> and send a v2, with a new subject line, a Fixes:, your Ack, a link to
> the other conversation, and maybe reword a bit based on your feedback
> too.

Sounds good. We might want to CC stable: I just read 
1b0a151c10a6d823f033023b9fdd9af72a89591b and I think the problem there 
cannot really be triggered by unprivileged users.

Hm... thanks!

-- 
Cheers,

David / dhildenb


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] x86/fault: speed up uffd-unit-test by 10x: rate-limit "MCE: Killing" logs
  2024-05-07 16:43     ` David Hildenbrand
  2024-05-07 16:53       ` John Hubbard
@ 2024-05-07 18:08       ` Axel Rasmussen
  2024-05-07 18:10         ` John Hubbard
  1 sibling, 1 reply; 11+ messages in thread
From: Axel Rasmussen @ 2024-05-07 18:08 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: John Hubbard, Andy Lutomirski, Dave Hansen, Peter Zijlstra,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, LKML, linux-mm, Peter Xu

On Tue, May 7, 2024 at 9:43 AM David Hildenbrand <david@redhat.com> wrote:
>
> On 07.05.24 18:28, John Hubbard wrote:
> > On 5/7/24 1:13 AM, David Hildenbrand wrote:
> >> The patch subject is misleading. This should be "don't flood the system
> >
> > I went back and forth on that subject line. :)
> >
> >> log". Nobody cares about the speed of a unittest ;)
> >
> > Yes they do. People should actually run the selftests, which in turn have
> > enshrined their guidelines in kernel doc. See dev-tools/kselftest.rst,
> > "Contributing new tests", which says, as you would hope, "Don't take
> > too long".
> >
> > It's important. Tests need to be quick, and having one out of 50 that
> > blows it up is worth fixing.
>
> I'm pretty sure you got my point: it's much more important to not let
> unprivileged users flood the log (possibly harming the system?) than
> making a test run faster :)
>
> >
> >>
> >> On 07.05.24 04:29, John Hubbard wrote:
> >>> If a system experiences a lot of memory failures, then any associated
> >>> printk() output really needs to be rate-limited. I noticed this while
> >>> running selftests/mm/uffd-unit-tests, which logs 12,305 lines of output,
> >>> adding (on my system) an extra 97 seconds of runtime due to printk time.
> >>
> >> Recently discussed:
> >>
> >> https://lkml.kernel.org/r/a9e3120d-8b79-4435-b113-ceb20aa45ee2@alu.unizg.hr
> >>
> >> See the pros/cons of using ratelimiting, and what an alternative for
> >> uffd is that Axel is working on.
> >>
> >> (CCing Peter and Axel)
> >>
> >
> > That thread seems to have stalled.
>
> Yes, there was no follow-up.

Apologies, I had completely forgotten about this. I blame the weekend. :)

No objections from me to the simple rate limiting proposed here, if
useful you can take:

Acked-by: Axel Rasmussen <axelrasmussen@google.com>

But, it seems to me the earlier proposal may still be useful.
Specifically, don't print at all for "synthetic" poisons from
UFFDIO_POISON or similar mechanisms. This way, "real" errors aren't
gobbled up by the ratelimit due to spam from "synthetic" errors. If
folks agree, I can *actually* send a patch this time. :)

>
> > I *do* have MCE experience (writing a
> > handler,
> > dealing with MCEs and related bugs), and what they wrote so far is exactly
> > correct: if you were going to flood the log, then no, we don't need to see
> > every single line printed. The first 10 or so, plus the fact that rate
> > limiting
> > kicked in, is sufficient to proceed with debugging and/or hardware
> > replacement.
> >
> > I'd like to just do this patch almost as-is, just with a fixed up subject,
> > perhaps:
> >
> >       x86/fault: rate-limit to avoid flooding dmesg with "MCE: Killing"
> > reports
> >
> > Yes?
>
>
> Makes sense to me (and thanks for confirming that we don't need
> complexity elsewhere).
>
> I think we at least need "Fixes:" (not sure if stable is warranted as
> well, 1b0a151c10a6d823f033023b9fdd9af72a89591b didn't CC stable).
>
> Consider adding a link to the report in that thread.
>
> Acked-by: David Hildenbrand <david@redhat.com>
>
> --
> Cheers,
>
> David / dhildenb
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] x86/fault: speed up uffd-unit-test by 10x: rate-limit "MCE: Killing" logs
  2024-05-07 18:08       ` Axel Rasmussen
@ 2024-05-07 18:10         ` John Hubbard
  2024-05-07 18:15           ` Axel Rasmussen
  0 siblings, 1 reply; 11+ messages in thread
From: John Hubbard @ 2024-05-07 18:10 UTC (permalink / raw)
  To: Axel Rasmussen, David Hildenbrand
  Cc: Andy Lutomirski, Dave Hansen, Peter Zijlstra, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, x86, H . Peter Anvin, LKML,
	linux-mm, Peter Xu

On 5/7/24 11:08 AM, Axel Rasmussen wrote:
> On Tue, May 7, 2024 at 9:43 AM David Hildenbrand <david@redhat.com> wrote:
...
>>> That thread seems to have stalled.
>>
>> Yes, there was no follow-up.
> 
> Apologies, I had completely forgotten about this. I blame the weekend. :)
> 
> No objections from me to the simple rate limiting proposed here, if
> useful you can take:
> 
> Acked-by: Axel Rasmussen <axelrasmussen@google.com>
> 
> But, it seems to me the earlier proposal may still be useful.
> Specifically, don't print at all for "synthetic" poisons from
> UFFDIO_POISON or similar mechanisms. This way, "real" errors aren't
> gobbled up by the ratelimit due to spam from "synthetic" errors. If
> folks agree, I can *actually* send a patch this time. :)
> 

That sounds good to me. (Should it also rate limit, though? I'm leaning
toward yes.)


thanks,
-- 
John Hubbard
NVIDIA


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] x86/fault: speed up uffd-unit-test by 10x: rate-limit "MCE: Killing" logs
  2024-05-07 18:10         ` John Hubbard
@ 2024-05-07 18:15           ` Axel Rasmussen
  2024-05-07 22:49             ` John Hubbard
  0 siblings, 1 reply; 11+ messages in thread
From: Axel Rasmussen @ 2024-05-07 18:15 UTC (permalink / raw)
  To: John Hubbard
  Cc: David Hildenbrand, Andy Lutomirski, Dave Hansen, Peter Zijlstra,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, LKML, linux-mm, Peter Xu

On Tue, May 7, 2024 at 11:11 AM John Hubbard <jhubbard@nvidia.com> wrote:
>
> On 5/7/24 11:08 AM, Axel Rasmussen wrote:
> > On Tue, May 7, 2024 at 9:43 AM David Hildenbrand <david@redhat.com> wrote:
> ...
> >>> That thread seems to have stalled.
> >>
> >> Yes, there was no follow-up.
> >
> > Apologies, I had completely forgotten about this. I blame the weekend. :)
> >
> > No objections from me to the simple rate limiting proposed here, if
> > useful you can take:
> >
> > Acked-by: Axel Rasmussen <axelrasmussen@google.com>
> >
> > But, it seems to me the earlier proposal may still be useful.
> > Specifically, don't print at all for "synthetic" poisons from
> > UFFDIO_POISON or similar mechanisms. This way, "real" errors aren't
> > gobbled up by the ratelimit due to spam from "synthetic" errors. If
> > folks agree, I can *actually* send a patch this time. :)
> >
>
> That sounds good to me. (Should it also rate limit, though? I'm leaning
> toward yes.)

I believe the proposal so far was, simulated poisons aren't really
"global" events, and are only relevant to the process itself. So don't
send them to the global kernel log at all, and instead let the process
do whatever it wants with them (e.g. it could print something when it
receives a signal, perhaps with rate limiting).

>
>
> thanks,
> --
> John Hubbard
> NVIDIA
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] x86/fault: speed up uffd-unit-test by 10x: rate-limit "MCE: Killing" logs
  2024-05-07  2:29 [PATCH] x86/fault: speed up uffd-unit-test by 10x: rate-limit "MCE: Killing" logs John Hubbard
  2024-05-07  8:13 ` David Hildenbrand
@ 2024-05-07 19:26 ` Borislav Petkov
  1 sibling, 0 replies; 11+ messages in thread
From: Borislav Petkov @ 2024-05-07 19:26 UTC (permalink / raw)
  To: John Hubbard
  Cc: Andy Lutomirski, Dave Hansen, Peter Zijlstra, Thomas Gleixner,
	Ingo Molnar, x86, H . Peter Anvin, LKML, linux-mm

On Mon, May 06, 2024 at 07:29:39PM -0700, John Hubbard wrote:
> Generating lots of memory poisoning events seems like a valid use case,
> by which I mean that this is not just a testing artifact. And that's why
> the fix applies to the code that directly generates the output, rather
> than the selftest that triggers it.

Sorry, not taking a "fix" for something hypothetical.

If this is a real issue on a real system and the printing is the
*actual* problem at hand in a hw error storm sure, but no, not because
a selftest runs slower.

I'm pretty sure in a hw error storm scenario, printk being slow is the
least of your problems.

And in such a scenario the *last* thing you wanna do is ratelimit prints
so that you can't even get all the logs which is *the* thing you need to
debug the hw.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] x86/fault: speed up uffd-unit-test by 10x: rate-limit "MCE: Killing" logs
  2024-05-07 18:15           ` Axel Rasmussen
@ 2024-05-07 22:49             ` John Hubbard
  0 siblings, 0 replies; 11+ messages in thread
From: John Hubbard @ 2024-05-07 22:49 UTC (permalink / raw)
  To: Axel Rasmussen
  Cc: David Hildenbrand, Andy Lutomirski, Dave Hansen, Peter Zijlstra,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, LKML, linux-mm, Peter Xu

On 5/7/24 11:15 AM, Axel Rasmussen wrote:
> On Tue, May 7, 2024 at 11:11 AM John Hubbard <jhubbard@nvidia.com> wrote:
>>
>> On 5/7/24 11:08 AM, Axel Rasmussen wrote:
>>> On Tue, May 7, 2024 at 9:43 AM David Hildenbrand <david@redhat.com> wrote:
>> ...
>>>>> That thread seems to have stalled.
>>>>
>>>> Yes, there was no follow-up.
>>>
>>> Apologies, I had completely forgotten about this. I blame the weekend. :)
>>>
>>> No objections from me to the simple rate limiting proposed here, if
>>> useful you can take:
>>>
>>> Acked-by: Axel Rasmussen <axelrasmussen@google.com>
>>>
>>> But, it seems to me the earlier proposal may still be useful.
>>> Specifically, don't print at all for "synthetic" poisons from
>>> UFFDIO_POISON or similar mechanisms. This way, "real" errors aren't
>>> gobbled up by the ratelimit due to spam from "synthetic" errors. If
>>> folks agree, I can *actually* send a patch this time. :)
>>>
>>
>> That sounds good to me. (Should it also rate limit, though? I'm leaning
>> toward yes.)
> 
> I believe the proposal so far was, simulated poisons aren't really
> "global" events, and are only relevant to the process itself. So don't
> send them to the global kernel log at all, and instead let the process
> do whatever it wants with them (e.g. it could print something when it
> receives a signal, perhaps with rate limiting).
  
OK. And seeing as how I'm not (at all) in alignment with Borislav on
where to apply rate limiting, we'd better go with your approach.


thanks,
-- 
John Hubbard
NVIDIA


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2024-05-07 22:50 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-05-07  2:29 [PATCH] x86/fault: speed up uffd-unit-test by 10x: rate-limit "MCE: Killing" logs John Hubbard
2024-05-07  8:13 ` David Hildenbrand
2024-05-07 16:28   ` John Hubbard
2024-05-07 16:43     ` David Hildenbrand
2024-05-07 16:53       ` John Hubbard
2024-05-07 16:55         ` David Hildenbrand
2024-05-07 18:08       ` Axel Rasmussen
2024-05-07 18:10         ` John Hubbard
2024-05-07 18:15           ` Axel Rasmussen
2024-05-07 22:49             ` John Hubbard
2024-05-07 19:26 ` Borislav Petkov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.