Re: [PATCH v5 49/60] target/riscv: vector mask population count vmpopc
From: Richard Henderson
Subject: Re: [PATCH v5 49/60] target/riscv: vector mask population count vmpopc
Date: Sat, 14 Mar 2020 18:20:33 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.4.1

On 3/12/20 7:58 AM, LIU Zhiwei wrote:
> +target_ulong HELPER(vmpopc_m)(void *v0, void *vs2, CPURISCVState *env,
> +                              uint32_t desc)
> +{
> +    target_ulong cnt = 0;
> +    uint32_t mlen = vext_mlen(desc);
> +    uint32_t vm = vext_vm(desc);
> +    uint32_t vl = env->vl;
> +    int i;
> +
> +    for (i = 0; i < vl; i++) {
> +        if (vm || vext_elem_mask(v0, mlen, i)) {
> +            if (vext_elem_mask(vs2, mlen, i)) {
> +                cnt++;
> +            }
> +        }
> +    }
> +    return cnt;
> +}

This is ok as-is, so
Reviewed-by: Richard Henderson <address@hidden>

But you can do better.

Create an array, similar to arm's pred_esz_masks[], indexed by
log2(mlen), and count a whole 64-bit word at a time:

    mask = pred_mlen_masks[log2_mlen];
    n = vl >> (6 - log2_mlen);
    r = extract32(vl, 0, 6 - log2_mlen);
    if (r) {
        rmask = extract64(mask, 0, r << log2_mlen);
    } else {
        rmask = 0;
    }

    if (vm) {
        for (i = 0; i < n; i++) {
            uint64_t j = ((uint64_t *)vs2)[i];
            cnt += ctpop64(j & mask);
        }
        if (rmask) {
            uint64_t j = ((uint64_t *)vs2)[i];
            cnt += ctpop64(j & rmask);
        }
    } else {
        for (i = 0; i < n; i++) {
            uint64_t j = ((uint64_t *)vs2)[i];
            uint64_t k = ((uint64_t *)v0)[i];
            cnt += ctpop64(j & k & mask);
        }
        if (rmask) {
            uint64_t j = ((uint64_t *)vs2)[i];
            uint64_t k = ((uint64_t *)v0)[i];
            cnt += ctpop64(j & k & rmask);
        }
    }

r~
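
For readers who want to try the word-at-a-time idea outside QEMU, here is a
minimal standalone sketch. It is not the code that was merged. The contents
of pred_mlen_masks[] are an assumption (bit 0 of every mlen-bit mask element,
by analogy with arm's pred_esz_masks[]), popcount64() is a portable stand-in
for ctpop64(), and popc_naive() / popc_fast() are hypothetical names. The
element layout in popc_naive() assumes mask element i lives at bit i * mlen,
which is what the quoted loop implies.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed table: bit 0 of every mlen-bit mask element, indexed by log2(mlen). */
static const uint64_t pred_mlen_masks[7] = {
    0xffffffffffffffffull, /* mlen = 1  */
    0x5555555555555555ull, /* mlen = 2  */
    0x1111111111111111ull, /* mlen = 4  */
    0x0101010101010101ull, /* mlen = 8  */
    0x0001000100010001ull, /* mlen = 16 */
    0x0000000100000001ull, /* mlen = 32 */
    0x0000000000000001ull, /* mlen = 64 */
};

/* Portable stand-in for QEMU's ctpop64(): clear the lowest set bit each pass. */
static unsigned popcount64(uint64_t x)
{
    unsigned n = 0;

    while (x) {
        x &= x - 1;
        n++;
    }
    return n;
}

/* Reference version: test one mask element per iteration. */
static uint64_t popc_naive(const uint64_t *v0, const uint64_t *vs2,
                           unsigned log2_mlen, uint32_t vl, int vm)
{
    uint64_t cnt = 0;

    for (uint32_t i = 0; i < vl; i++) {
        uint64_t bit = (uint64_t)i << log2_mlen; /* bit offset of element i */
        unsigned idx = bit / 64, pos = bit % 64;

        if ((vm || ((v0[idx] >> pos) & 1)) && ((vs2[idx] >> pos) & 1)) {
            cnt++;
        }
    }
    return cnt;
}

/* Word-at-a-time version: n full 64-bit words, then r leftover elements. */
static uint64_t popc_fast(const uint64_t *v0, const uint64_t *vs2,
                          unsigned log2_mlen, uint32_t vl, int vm)
{
    assert(log2_mlen <= 6);

    uint64_t mask = pred_mlen_masks[log2_mlen];
    unsigned shift = 6 - log2_mlen;        /* log2(elements per word) */
    uint32_t n = vl >> shift;              /* number of full words */
    uint32_t r = vl & ((1u << shift) - 1); /* elements in the partial word */
    /* r > 0 implies shift >= 1, so (r << log2_mlen) is always below 64. */
    uint64_t rmask = r ? mask & ((1ull << (r << log2_mlen)) - 1) : 0;
    uint64_t cnt = 0;
    uint32_t i;

    for (i = 0; i < n; i++) {
        uint64_t k = vm ? ~0ull : v0[i];   /* unmasked: every element active */
        cnt += popcount64(vs2[i] & k & mask);
    }
    if (rmask) {
        uint64_t k = vm ? ~0ull : v0[i];
        cnt += popcount64(vs2[i] & k & rmask);
    }
    return cnt;
}
```

The sketch computes r with a plain AND rather than extract32(), because for
mlen = 64 the field length 6 - log2_mlen is zero, and QEMU's extract32()
asserts, as far as I recall, that the length is nonzero. It also folds the
vm and !vm branches into one loop for brevity; keeping them separate, as in
the message above, avoids the per-word test on vm.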