[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v4 7/7] target/i386: Add EPYC-Genoa model to support Zen 4 pr
From: |
Moger, Babu |
Subject: |
Re: [PATCH v4 7/7] target/i386: Add EPYC-Genoa model to support Zen 4 processor series |
Date: |
Tue, 12 Nov 2024 10:23:50 -0600 |
User-agent: |
Mozilla Thunderbird |
Hi Maksim,
On 11/12/24 04:09, Maksim Davydov wrote:
>
>
> On 11/8/24 23:56, Moger, Babu wrote:
>> Hi Maxim,
>>
>> Thanks for looking into this. I will fix the bits I mentioned below in
>> upcoming Genoa/Turin model update.
>>
>> I have few comments below.
>>
>> On 11/8/2024 12:15 PM, Maksim Davydov wrote:
>>> Hi!
>>> I compared EPYC-Genoa CPU model with CPUID output from real EPYC Genoa
>>> host. I found some mismatches that confused me. Could you help me to
>>> understand them?
>>>
>>> On 5/4/23 23:53, Babu Moger wrote:
>>>> Adds the support for AMD EPYC Genoa generation processors. The model
>>>> display for the new processor will be EPYC-Genoa.
>>>>
>>>> Adds the following new feature bits on top of the feature bits from
>>>> the previous generation EPYC models.
>>>>
>>>> avx512f : AVX-512 Foundation instruction
>>>> avx512dq : AVX-512 Doubleword & Quadword Instruction
>>>> avx512ifma : AVX-512 Integer Fused Multiply Add instruction
>>>> avx512cd : AVX-512 Conflict Detection instruction
>>>> avx512bw : AVX-512 Byte and Word Instructions
>>>> avx512vl : AVX-512 Vector Length Extension Instructions
>>>> avx512vbmi : AVX-512 Vector Byte Manipulation Instruction
>>>> avx512_vbmi2 : AVX-512 Additional Vector Byte Manipulation Instruction
>>>> gfni : AVX-512 Galois Field New Instructions
>>>> avx512_vnni : AVX-512 Vector Neural Network Instructions
>>>> avx512_bitalg : AVX-512 Bit Algorithms, add bit algorithms Instructions
>>>> avx512_vpopcntdq: AVX-512 AVX-512 Vector Population Count Doubleword and
>>>> Quadword Instructions
>>>> avx512_bf16 : AVX-512 BFLOAT16 instructions
>>>> la57 : 57-bit virtual address support (5-level Page Tables)
>>>> vnmi : Virtual NMI (VNMI) allows the hypervisor to inject
>>>> the NMI
>>>> into the guest without using Event Injection mechanism
>>>> meaning not required to track the guest NMI and
>>>> intercepting
>>>> the IRET.
>>>> auto-ibrs : The AMD Zen4 core supports a new feature called
>>>> Automatic IBRS.
>>>> It is a "set-and-forget" feature that means that,
>>>> unlike e.g.,
>>>> s/w-toggled SPEC_CTRL.IBRS, h/w manages its IBRS
>>>> mitigation
>>>> resources automatically across CPL transitions.
>>>>
>>>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>>>> ---
>>>> target/i386/cpu.c | 122 ++++++++++++++++++++++++++++++++++++++++++++++
>>>> 1 file changed, 122 insertions(+)
>>>>
>>>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>>>> index d50ace84bf..71fe1e02ee 100644
>>>> --- a/target/i386/cpu.c
>>>> +++ b/target/i386/cpu.c
>>>> @@ -1973,6 +1973,56 @@ static const CPUCaches epyc_milan_v2_cache_info
>>>> = {
>>>> },
>>>> };
>>>> +static const CPUCaches epyc_genoa_cache_info = {
>>>> + .l1d_cache = &(CPUCacheInfo) {
>>>> + .type = DATA_CACHE,
>>>> + .level = 1,
>>>> + .size = 32 * KiB,
>>>> + .line_size = 64,
>>>> + .associativity = 8,
>>>> + .partitions = 1,
>>>> + .sets = 64,
>>>> + .lines_per_tag = 1,
>>>> + .self_init = 1,
>>>> + .no_invd_sharing = true,
>>>> + },
>>>> + .l1i_cache = &(CPUCacheInfo) {
>>>> + .type = INSTRUCTION_CACHE,
>>>> + .level = 1,
>>>> + .size = 32 * KiB,
>>>> + .line_size = 64,
>>>> + .associativity = 8,
>>>> + .partitions = 1,
>>>> + .sets = 64,
>>>> + .lines_per_tag = 1,
>>>> + .self_init = 1,
>>>> + .no_invd_sharing = true,
>>>> + },
>>>> + .l2_cache = &(CPUCacheInfo) {
>>>> + .type = UNIFIED_CACHE,
>>>> + .level = 2,
>>>> + .size = 1 * MiB,
>>>> + .line_size = 64,
>>>> + .associativity = 8,
>>>> + .partitions = 1,
>>>> + .sets = 2048,
>>>> + .lines_per_tag = 1,
>>>
>>> 1. Why L2 cache is not shown as inclusive and self-initializing?
>>>
>>> PPR for AMD Family 19h Model 11 says for L2 (0x8000001d):
>>> * cache inclusive. Read-only. Reset: Fixed,1.
>>> * cache is self-initializing. Read-only. Reset: Fixed,1.
>>
>> Yes. That is correct. This needs to be fixed. I Will fix it.
>>>
>>>> + },
>>>> + .l3_cache = &(CPUCacheInfo) {
>>>> + .type = UNIFIED_CACHE,
>>>> + .level = 3,
>>>> + .size = 32 * MiB,
>>>> + .line_size = 64,
>>>> + .associativity = 16,
>>>> + .partitions = 1,
>>>> + .sets = 32768,
>>>> + .lines_per_tag = 1,
>>>> + .self_init = true,
>>>> + .inclusive = true,
>>>> + .complex_indexing = false,
>>>
>>> 2. Why L3 cache is shown as inclusive? Why is it not shown in L3 that
>>> the WBINVD/INVD instruction is not guaranteed to invalidate all lower
>>> level caches (0 bit)?
>>>
>>> PPR for AMD Family 19h Model 11 says for L2 (0x8000001d):
>>> * cache inclusive. Read-only. Reset: Fixed,0.
>>> * Write-Back Invalidate/Invalidate. Read-only. Reset: Fixed,1.
>>>
>>
>> Yes. Both of this needs to be fixed. I Will fix it.
>>
>>>
>>>
>>> 3. Why the default stub is used for TLB, but not real values as for
>>> other caches?
>>
>> Can you please eloberate on this?
>>
>
> For L1i, L1d, L2 and L3 cache we provide the correct information about
> characteristics. In contrast, for L1i TLB, L1d TLB, L2i TLB and L2d TLB
> (0x80000005 and 0x80000006) we use the same value for all CPU models.
> Sometimes it seems strange. For instance, the current default value in
> QEMU for L2 TLB associativity for 4 KB pages is 4. But 4 is a reserved
> value for Genoa (as PPR for Family 19h Model 11h says)
Yes. I see that. We may need to address this sometime in the future.
>
>>>
>>>> + },
>>>> +};
>>>> +
>>>> /* The following VMX features are not supported by KVM and are left
>>>> out in the
>>>> * CPU definitions:
>>>> *
>>>> @@ -4472,6 +4522,78 @@ static const X86CPUDefinition
>>>> builtin_x86_defs[] = {
>>>> { /* end of list */ }
>>>> }
>>>> },
>>>> + {
>>>> + .name = "EPYC-Genoa",
>>>> + .level = 0xd,
>>>> + .vendor = CPUID_VENDOR_AMD,
>>>> + .family = 25,
>>>> + .model = 17,
>>>> + .stepping = 0,
>>>> + .features[FEAT_1_EDX] =
>>>> + CPUID_SSE2 | CPUID_SSE | CPUID_FXSR | CPUID_MMX |
>>>> CPUID_CLFLUSH |
>>>> + CPUID_PSE36 | CPUID_PAT | CPUID_CMOV | CPUID_MCA |
>>>> CPUID_PGE |
>>>> + CPUID_MTRR | CPUID_SEP | CPUID_APIC | CPUID_CX8 |
>>>> CPUID_MCE |
>>>> + CPUID_PAE | CPUID_MSR | CPUID_TSC | CPUID_PSE | CPUID_DE |
>>>> + CPUID_VME | CPUID_FP87,
>>>> + .features[FEAT_1_ECX] =
>>>> + CPUID_EXT_RDRAND | CPUID_EXT_F16C | CPUID_EXT_AVX |
>>>> + CPUID_EXT_XSAVE | CPUID_EXT_AES | CPUID_EXT_POPCNT |
>>>> + CPUID_EXT_MOVBE | CPUID_EXT_SSE42 | CPUID_EXT_SSE41 |
>>>> + CPUID_EXT_PCID | CPUID_EXT_CX16 | CPUID_EXT_FMA |
>>>> + CPUID_EXT_SSSE3 | CPUID_EXT_MONITOR | CPUID_EXT_PCLMULQDQ |
>>>> + CPUID_EXT_SSE3,
>>>> + .features[FEAT_8000_0001_EDX] =
>>>> + CPUID_EXT2_LM | CPUID_EXT2_RDTSCP | CPUID_EXT2_PDPE1GB |
>>>> + CPUID_EXT2_FFXSR | CPUID_EXT2_MMXEXT | CPUID_EXT2_NX |
>>>> + CPUID_EXT2_SYSCALL,
>>>> + .features[FEAT_8000_0001_ECX] =
>>>> + CPUID_EXT3_OSVW | CPUID_EXT3_3DNOWPREFETCH |
>>>> + CPUID_EXT3_MISALIGNSSE | CPUID_EXT3_SSE4A | CPUID_EXT3_ABM |
>>>> + CPUID_EXT3_CR8LEG | CPUID_EXT3_SVM | CPUID_EXT3_LAHF_LM |
>>>> + CPUID_EXT3_TOPOEXT | CPUID_EXT3_PERFCORE,
>>>> + .features[FEAT_8000_0008_EBX] =
>>>> + CPUID_8000_0008_EBX_CLZERO |
>>>> CPUID_8000_0008_EBX_XSAVEERPTR |
>>>> + CPUID_8000_0008_EBX_WBNOINVD | CPUID_8000_0008_EBX_IBPB |
>>>> + CPUID_8000_0008_EBX_IBRS | CPUID_8000_0008_EBX_STIBP |
>>>> + CPUID_8000_0008_EBX_STIBP_ALWAYS_ON |
>>>> + CPUID_8000_0008_EBX_AMD_SSBD | CPUID_8000_0008_EBX_AMD_PSFD,
>>>
>>> 4. Why 0x80000008_EBX features related to speculation vulnerabilities
>>> (BTC_NO, IBPB_RET, IbrsPreferred, INT_WBINVD) are not set?
>>
>> KVM does not expose these bits to the guests yet.
>>
>> I normally check using the ioctl KVM_GET_SUPPORTED_CPUID.
>>
>
> I'm not sure, but at least the first two of these features seem to be
> helpful to choose the appropriate mitigation. Do you think that we should
> add them to KVM?
Yes. Sure.
>
>>
>>>
>>>> + .features[FEAT_8000_0021_EAX] =
>>>> + CPUID_8000_0021_EAX_No_NESTED_DATA_BP |
>>>> + CPUID_8000_0021_EAX_LFENCE_ALWAYS_SERIALIZING |
>>>> + CPUID_8000_0021_EAX_NULL_SEL_CLR_BASE |
>>>> + CPUID_8000_0021_EAX_AUTO_IBRS,
>>>
>>> 5. Why some 0x80000021_EAX features are not set?
>>> (FsGsKernelGsBaseNonSerializing, FSRC and FSRS)
>>
>> KVM does not expose FSRC and FSRS bits to the guests yet.
>
> But KVM exposes the same features (0x7 ecx=1, bits 10 and 11) for Intel
> CPU models. Do we have to add these bits for AMD to KVM?
Yes. Sure.
>
>>
>> The KVM reports the bit FsGsKernelGsBaseNonSerializing. I will check if
>> we can add this bit to the Genoa and Turin.
Will add this in my qemu series.
>>
>>>
>>>> + .features[FEAT_7_0_EBX] =
>>>> + CPUID_7_0_EBX_FSGSBASE | CPUID_7_0_EBX_BMI1 |
>>>> CPUID_7_0_EBX_AVX2 |
>>>> + CPUID_7_0_EBX_SMEP | CPUID_7_0_EBX_BMI2 |
>>>> CPUID_7_0_EBX_ERMS |
>>>> + CPUID_7_0_EBX_INVPCID | CPUID_7_0_EBX_AVX512F |
>>>> + CPUID_7_0_EBX_AVX512DQ | CPUID_7_0_EBX_RDSEED |
>>>> CPUID_7_0_EBX_ADX |
>>>> + CPUID_7_0_EBX_SMAP | CPUID_7_0_EBX_AVX512IFMA |
>>>> + CPUID_7_0_EBX_CLFLUSHOPT | CPUID_7_0_EBX_CLWB |
>>>> + CPUID_7_0_EBX_AVX512CD | CPUID_7_0_EBX_SHA_NI |
>>>> + CPUID_7_0_EBX_AVX512BW | CPUID_7_0_EBX_AVX512VL,
>>>> + .features[FEAT_7_0_ECX] =
>>>> + CPUID_7_0_ECX_AVX512_VBMI | CPUID_7_0_ECX_UMIP |
>>>> CPUID_7_0_ECX_PKU |
>>>> + CPUID_7_0_ECX_AVX512_VBMI2 | CPUID_7_0_ECX_GFNI |
>>>> + CPUID_7_0_ECX_VAES | CPUID_7_0_ECX_VPCLMULQDQ |
>>>> + CPUID_7_0_ECX_AVX512VNNI | CPUID_7_0_ECX_AVX512BITALG |
>>>> + CPUID_7_0_ECX_AVX512_VPOPCNTDQ | CPUID_7_0_ECX_LA57 |
>>>> + CPUID_7_0_ECX_RDPID,
>>>> + .features[FEAT_7_0_EDX] =
>>>> + CPUID_7_0_EDX_FSRM,
>>>
>>> 6. Why L1D_FLUSH is not set? Because only vulnerable MMIO stale data
>>> processors have to use it, am I right?
>>
>> KVM does not expose L1D_FLUSH to the guests. Not sure why. Need to
>> investigate.
>>
>
> It seems that KVM has exposed L1D_FLUSH since da3db168fb67
Sure. Will update my patch series.
>
>>
>>>
>>>> + .features[FEAT_7_1_EAX] =
>>>> + CPUID_7_1_EAX_AVX512_BF16,
>>>> + .features[FEAT_XSAVE] =
>>>> + CPUID_XSAVE_XSAVEOPT | CPUID_XSAVE_XSAVEC |
>>>> + CPUID_XSAVE_XGETBV1 | CPUID_XSAVE_XSAVES,
>>>> + .features[FEAT_6_EAX] =
>>>> + CPUID_6_EAX_ARAT,
>>>> + .features[FEAT_SVM] =
>>>> + CPUID_SVM_NPT | CPUID_SVM_NRIPSAVE | CPUID_SVM_VNMI |
>>>> + CPUID_SVM_SVME_ADDR_CHK,
>>>> + .xlevel = 0x80000022,
>>>> + .model_id = "AMD EPYC-Genoa Processor",
>>>> + .cache_info = &epyc_genoa_cache_info,
>>>> + },
>>>> };
>>>> /*
>>>
>>
>
> So, If you don't mind, I will send a patch to KVM within a few hours. I
> will add bits for FSRC, FSRS and some bits from 0x80000008_EBX
>
FSRC and FSRS are not used anywhere in the kernel. It is mostly FYI kind
of information. It does not hurt to add. Please go ahead.
--
Thanks
Babu Moger