[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH] BugFix: grub menu gets stuck due to unreliable rdtsc instruc
From: |
Daniel Kiper |
Subject: |
Re: [PATCH] BugFix: grub menu gets stuck due to unreliable rdtsc instruction. |
Date: |
Tue, 26 Nov 2024 16:32:07 +0100 |
User-agent: |
NeoMutt/20170113 (1.7.2) |
On Mon, Nov 11, 2024 at 02:17:19PM +0800, Duan Yayong via Grub-devel wrote:
> This patch is used to fix grub menu gets stuck in server
> AC poweron/poweroff stress test of x86_64, which is reproduced with 1/200
> ratio. The root cause analysis as below:
>
> Q. What's the code logic?
> A. "grub_tsc_init" function will init tsc by setting grub_tsc_rate, which
> call stack is:
> grub_tsc_init -> grub_tsc_calibrate_from_pmtimer -> grub_divmod64
> Among, "grub_divmod64" function needs "tsc_diff" as the second parameter.
> In "grub_pmtimer_wait_count_tsc", we will call "grub_get_tsc" function to get
> time stamp counter value to assign to "start_tsc" variable, and get into
> "while(1)" loop space to get "end_tsc" variable value with same function,
> after 3580 ticks, return "end_tsc - start_tsc". Actually, "rdtsc" instruction
> will be called in "grub_get_tsc", but "rdtsc" instruction is not reliable(for
> the reason see the next question), which will cause "tsc_diff" to be a very
> big number larger than (1UL << 32) or a negative number, so that
> grub_tsc_rate will be zero. When "run_menu" function is startup, and calls
> "grub_tsc_get_time_ms" function to get current time to check if timeout time
> reach, at this time, "grub_tsc_get_time_ms" function will return zero due to
> zero "grub_tsc_rate" variable, then grub menu gets stuck...
Please wrap lines in the commit message. Now it is unreadable.
FYI, lines can be (a bit) longer than 80 chars...
>
> Q. What's the difference between rdtsc and rdtscp instructions
> in x86_64 architecture? Here is more explanations from Intel
> Manual per check with Intel OS expert as below:
Could you point exact paragraph(s) explaining this in Intel SDM? How
this applies to the AMD? Additionally, it seems to me we should check
for rdtscp instruction availability and fallback to rdtsc/lfence when
it is not present.
> A. The RDTSC instruction is not a serializing instruction. It does not
> necessarily wait until all previous instructions have been executedbefore
> reading the counter. Similarly, subsequent instructions may begin execution
> before the read operation is performed. The following items may guide
> software seeking to order executions of RDTSC:
> - If software requires RDTSC to be executed only after all previouss
> instructions have executed and all previous loads are globally visible, 1 it
> can execute LFENCE immediately before RDTSC.
> - If software requires RDTSC to be executed only after all previouss
> instructions have executed and all previous loads and stores areglobally
> visible, it can execute the sequence MFENCE;LFENCE immediately before RDTSC.
> - If software requires RDTSC to be executed prior to execution of any
> sulbsequent instruction (including any memory accesses), it can execute the
> sequence LFENCE immediately after RDTSC.
>
> A. The RDTSCP instruction is not a serializing instruction, but it does wait
> until all previous instructions have executed and all previous loads are
> globally visible. 1 But it does not wait for previous stores to be globally
> visible, and subsequent instructions may begin execution before the read
> operation is performed. The following items may guide softwareseeking to
> order executions of RDTSCP:
> - If software requires RDTSCP to be executed only after all previous stores
> are globally visible, it can execute MFENCE immediatelybefore RDTSCP.
> - If software requires RDTSCP to be executed prior to execution of any
> subsequent instruction (including any memory accesses), itcan execute LFENCE
> immediately after RDTSCP.
>
> Q. Why do we do this fix?
>
> A. Changing "rdtsc" instruction to "rdtscp" to make sure do read counter
> operation after previous instruction have executed and all previous loads are
> globally visiable, that means we keep current and previous instruction
> serializing; add "lfence" load barrier instruction after "rdrscp"
> instruction, that means we keep current and subsequent instrution
> serializing. So we must get correct tsc value in this timeout check scenario.
>
> A. Adding "if (grub_tsc_rate == 0)" judgement just in case other unknown
> instruction exception, so that
> "grub_tsc_calibrate_from_pit||grub_tsc_calibrate_from_efi" getting
> "grub_tsc_rate" methods have a opportunity to be performed but causing grub
> menu stucking.
>
> A. After this change, we cannot reproduce this issue via 1060 AC
> poweron/poweroff stress test cycles.
>
> Signed-off-by: Duan Yayong <duanyayong@bytedance.com>
> Signed-off-by: Li Yongqiang <liyongqiang@huaqin.com>
> Signed-off-by: Sun Ming <simon.sun@huaqin.com>
> ---
> grub-core/kern/i386/tsc_pmtimer.c | 11 +++++++++++
> include/grub/i386/tsc.h | 8 ++++----
> 2 files changed, 15 insertions(+), 4 deletions(-)
>
> diff --git a/grub-core/kern/i386/tsc_pmtimer.c
> b/grub-core/kern/i386/tsc_pmtimer.c
> index 5c03c510a..04f7c2006 100644
> --- a/grub-core/kern/i386/tsc_pmtimer.c
> +++ b/grub-core/kern/i386/tsc_pmtimer.c
> @@ -143,5 +143,16 @@ grub_tsc_calibrate_from_pmtimer (void)
> if (tsc_diff == 0)
> return 0;
> grub_tsc_rate = grub_divmod64 ((1ULL << 32), tsc_diff, 0);
> + /*
> + Specifically, when the tsc_diff (end_tsc - start_tsc) is greater than
> (1ULL << 32),
> + the result of grub_divmod64() becomes zero, causing grub_tsc_rate to
> always be zero.
> + As a result, grub_tsc_get_time_ms() consistently returns zero, and the
> grub menu
> + countdown gets stuck. To resolve this, we return 0 to proceed to the
> next calibration
> + function when grub_tsc_rate is zero.
> + */
Wrong coding style for comment [1]. Please fix it...
> + if (grub_tsc_rate == 0) {
> + return 0;
> + }
> +
I think the grub-core/kern/i386/tsc_pmtimer.c fix should be in separate patch.
> return 1;
> }
> diff --git a/include/grub/i386/tsc.h b/include/grub/i386/tsc.h
> index 947e62fa1..543de93e0 100644
> --- a/include/grub/i386/tsc.h
> +++ b/include/grub/i386/tsc.h
> @@ -44,10 +44,10 @@ grub_get_tsc (void)
> /* The CPUID instruction is a 'serializing' instruction, and
> avoids out-of-order execution of the RDTSC instruction. */
> grub_cpuid (0,a,b,c,d);
> - /* Read TSC value. We cannot use "=A", since this would use
> - %rax on x86_64. */
> - asm volatile ("rdtsc":"=a" (lo), "=d" (hi));
> -
> + /*
> + Use the rdtscp command instead of RDTSC to provide better ordering
> guarantees.
> + */
Wrong coding style... [1]
> + asm volatile ("rdtscp\n\t""lfence\n\t":"=a"(lo), "=d"(hi));
This is a mess...
asm volatile ("rdtscp; lfence" : "=a" (lo), "=d" (hi));
> return (((grub_uint64_t) hi) << 32) | lo;
> }
Daniel
[1] https://www.gnu.org/software/grub/manual/grub-dev/grub-dev.html#Comments