qemu-arm
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-arm] [PATCH v2 10/14] target/arm/kvm64: Add kvm_arch_get/put_s


From: Richard Henderson
Subject: Re: [Qemu-arm] [PATCH v2 10/14] target/arm/kvm64: Add kvm_arch_get/put_sve
Date: Wed, 26 Jun 2019 17:22:34 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.7.0

On 6/21/19 6:34 PM, Andrew Jones wrote:
> +/*
> + * If ARM_MAX_VQ is increased to be greater than 16, then we can no
> + * longer hard code slices to 1 in kvm_arch_put/get_sve().
> + */
> +QEMU_BUILD_BUG_ON(ARM_MAX_VQ > 16);

This seems easy to fix, or simply drop the slices entirely for now, as
otherwise they are a teeny bit confusing.

It's a shame that these slices exist at all.  It seems like the kernel could
use the negotiated max sve size to grab the data all at once.

> +        for (n = 0; n < KVM_ARM64_SVE_NUM_ZREGS; n++) {
> +            uint64_t *q = aa64_vfp_qreg(env, n);
> +#ifdef HOST_WORDS_BIGENDIAN
> +            uint64_t d[ARM_MAX_VQ * 2];
> +            int j;
> +            for (j = 0; j < cpu->sve_max_vq * 2; j++) {
> +                d[j] = bswap64(q[j]);
> +            }
> +            reg.addr = (uintptr_t)d;
> +#else
> +            reg.addr = (uintptr_t)q;
> +#endif
> +            reg.id = KVM_REG_ARM64_SVE_ZREG(n, i);
> +            ret = kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, &reg);

It might be worth splitting this...

> +        for (n = 0; n < KVM_ARM64_SVE_NUM_PREGS; n++) {
> +            uint64_t *q = &env->vfp.pregs[n].p[0];
> +#ifdef HOST_WORDS_BIGENDIAN
> +            uint64_t d[ARM_MAX_VQ * 2 / 8];
> +            int j;
> +            for (j = 0; j < cpu->sve_max_vq * 2 / 8; j++) {
> +                d[j] = bswap64(q[j]);
> +            }
> +            reg.addr = (uintptr_t)d;
> +#else
> +            reg.addr = (uintptr_t)q;
> +#endif
> +            reg.id = KVM_REG_ARM64_SVE_PREG(n, i);
> +            ret = kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, &reg);

... and this (unified w/ reg + size parameters?) to a function because ...

> +        reg.addr = (uintptr_t)&env->vfp.pregs[FFR_PRED_NUM].p[0];
> +        reg.id = KVM_REG_ARM64_SVE_FFR(i);
> +        ret = kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, &reg);

... you forgot to apply the bswap here.

Likewise for the other direction.


r~


PS: It's also tempting to drop the ifdefs and, since we know the host supports
sve instructions, and that the host supports sve_max_vq, do the reformatting as

    uint64_t scratch[ARM_MAX_VQ * 2];
    asm("whilelo  p0.d, xzr, %2\n\t"
        "ld1d     z0.d, p0/z [%1]\n\t"
        "str      z0, [%0]"
        : "=Q"(scratch)
        : "Q"(*aa64_vfp_qreg(env, n)),
          "r"(cpu->sve_max_vq)
        : "p0", "v0");

PPS: Ideally, this would be further cleaned up with acle builtins, but those
are still under development for GCC.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]