|
From: | Matheus K. Ferst |
Subject: | Re: [PATCH v4 29/31] target/ppc: Implement cfuged instruction |
Date: | Thu, 13 May 2021 09:24:32 -0300 |
User-agent: | Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.7.1 |
On 13/05/2021 08:31, Richard Henderson wrote:
On 5/12/21 1:54 PM, matheus.ferst@eldorado.org.br wrote:+ while (i) { + n = ctz64(mask); + if (n > i) { + n = i; + } + + m = (1ll << n) - 1; + if (bit) { + right = ror64(right | (src & m), n); + } else { + left = ror64(left | (src & m), n); + } + + src >>= n; + mask >>= n; + i -= n; + bit = !bit; + mask = ~mask; + } + + if (bit) { + n = ctpop64(mask); + } else { + n = 64 - ctpop64(mask); + } + + return left | (right >> n); +}This doesn't correspond to the algorithm presented in the manual. Thus this requires lots of extra commentary.I guess I see how you're trying to process blocks at a time, instead of single bits at a time. But I don't think the merging of data into "right" and "left" looks right. I would have expectedright = (right << n) | (src & m); and similarly for left.It doesn't look like that the ctpop at the end is correct, given how mask has been modified. I would have thought thatn = ctpop64(orig_mask); return (left << n) | right; would be the correct answer.I could be wrong about the above, but that's what the missing commentary should have helped me understand.
It sure worth more comments. Yes, the idea is to process in blocks, and we negate the mask to avoid deciding between ctz and cto inside the loop. We use rotate instead of shift so it don't change the number of zeros and ones, and then we don't need orig_mask.
You'll find my test cases for cfuged and vcfuged on https://github.com/PPC64/qemu/blob/ferst-tcg-cfuged/tests/tcg/ppc64le/ . I got the same results by running them with this implementation and with the Power10 Functional Simulator.
+static bool trans_CFUGED(DisasContext *ctx, arg_X *a) +{ + REQUIRE_64BIT(ctx); + REQUIRE_INSNS_FLAGS2(ctx, ISA310); +#if defined(TARGET_PPC64) + gen_helper_cfuged(cpu_gpr[a->ra], cpu_gpr[a->rt], cpu_gpr[a->rb]); +#else + gen_invalid(ctx); +#endif + return true; +}Given that this helper will also be used by vcfuged, there's no point in hiding it in a TARGET_PPC64 block, and thus you can drop the ifdefs.r~
If I remove it, the build for ppc will fail, because cpu_gpr is declared as TCGv, and the helper uses i64 to match {get,set}_cpu_vsr{l,h}. REQUIRE_64BIT makes the helper call unreachable for ppc, but it's a runtime check. At build time, the compiler will check the types anyway, and give us an error.
-- Matheus K. Ferst Instituto de Pesquisas ELDORADO <http://www.eldorado.org.br/> Analista de Software Júnior Aviso Legal - Disclaimer <https://www.eldorado.org.br/disclaimer.html>
[Prev in Thread] | Current Thread | [Next in Thread] |