Re: [Qemu-ppc] [Qemu-devel] [RFC 4/6] target-ppc: add cmprb instruction
From: Nikunj A Dadhania
Subject: Re: [Qemu-ppc] [Qemu-devel] [RFC 4/6] target-ppc: add cmprb instruction
Date: Thu, 21 Jul 2016 13:38:49 +0530
User-agent: Notmuch/0.21 (https://notmuchmail.org) Emacs/25.0.94.1 (x86_64-redhat-linux-gnu)
Richard Henderson <address@hidden> writes:
> On 07/12/2016 11:33 PM, Nikunj A Dadhania wrote:
>> +/* cmprb - range comparison: isupper, isalpha, islower */
>> +static void gen_cmprb(DisasContext *ctx)
>> +{
>> + TCGLabel *lab1 = gen_new_label();
>> + TCGLabel *lab2 = gen_new_label();
>> + TCGv src1 = tcg_temp_local_new();
>> + TCGv src2 = tcg_temp_local_new();
>> + TCGv src2lo = tcg_temp_local_new();
>> + TCGv src2hi = tcg_temp_local_new();
>> +
>> + tcg_gen_andi_tl(src1, cpu_gpr[rA(ctx->opcode)], 0xFF);
>> + tcg_gen_andi_tl(src2, cpu_gpr[rB(ctx->opcode)], 0xFFFFFFFF);
>
> There's no point in this mask, since it's covered by
>
>> +
>> + tcg_gen_andi_tl(src2lo, src2, 0xFF);
>> + tcg_gen_shri_tl(src2hi, src2, 8);
>> + tcg_gen_andi_tl(src2hi, src2hi, 0xFF);
>
> these ones.
Right.
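
To spell out why the 32-bit mask is redundant, here is a minimal plain-C sketch (not TCG code). The helper names byte_premasked and byte_direct are hypothetical and exist only for this illustration; it assumes the byte shifts used in the patch (0, 8, 16, 24), all of which stay below bit 32.

```c
#include <stdint.h>

/* Byte extraction as in the patch: mask rB to 32 bits first,
 * then shift and mask down to 8 bits. */
static uint64_t byte_premasked(uint64_t rb, unsigned shift)
{
    uint64_t src2 = rb & 0xFFFFFFFFu;
    return (src2 >> shift) & 0xFF;
}

/* Same extraction without the 32-bit pre-mask. For shift values
 * 0, 8, 16 and 24 the final 0xFF mask already discards every bit
 * that the pre-mask would have cleared. */
static uint64_t byte_direct(uint64_t rb, unsigned shift)
{
    return (rb >> shift) & 0xFF;
}
```

Both helpers return the same byte for any rB value at those four shifts, so dropping the andi with 0xFFFFFFFF does not change the result.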
>> +
>> + tcg_gen_brcond_tl(TCG_COND_GTU, src1, src2hi, lab1);
>> + tcg_gen_brcond_tl(TCG_COND_LTU, src1, src2lo, lab1);
>> + tcg_gen_movi_i32(cpu_crf[crfD(ctx->opcode)], 1 << CRF_GT);
>> + tcg_gen_br(lab2);
>> + gen_set_label(lab1);
>> +
>> + if (ctx->opcode & 0x00200000) {
>> +     tcg_gen_shri_tl(src2hi, src2, 24);
>> +     tcg_gen_andi_tl(src2hi, src2hi, 0xFF);
>> +     tcg_gen_shri_tl(src2lo, src2, 16);
>> +     tcg_gen_andi_tl(src2lo, src2lo, 0xFF);
>> +     tcg_gen_brcond_tl(TCG_COND_GTU, src1, src2hi, lab2);
>> +     tcg_gen_brcond_tl(TCG_COND_LTU, src1, src2lo, lab2);
>> +     tcg_gen_movi_i32(cpu_crf[crfD(ctx->opcode)], 1 << CRF_GT);
>> + }
>> + gen_set_label(lab2);
>> + tcg_temp_free(src1);
>> + tcg_temp_free(src2);
>> + tcg_temp_free(src2lo);
>> + tcg_temp_free(src2hi);
>> +}
>
> You've forgotten to clear crf in the false case.
Yes, next version has the fix.
> This is better implemented without branches, like
>
> TCGv_i32 src1, src2, src2lo, src2hi;
> TCGv_i32 crf = cpu_crf[crfD(ctx->opcode)];
>
> // allocate all 4 "src" temps
>
> tcg_gen_trunc_tl_i32(src1, cpu_gpr[rA(ctx->opcode)]);
> tcg_gen_trunc_tl_i32(src2, cpu_gpr[rB(ctx->opcode)]);
>
> tcg_gen_ext8u_i32(src2lo, src2);
> tcg_gen_shri_i32(src2, src2, 8);
> tcg_gen_ext8u_i32(src2hi, src2);
>
> tcg_gen_setcond_i32(TCG_COND_LEU, src2lo, src2lo, src1);
> tcg_gen_setcond_i32(TCG_COND_LEU, src2hi, src1, src2hi);
> tcg_gen_and_i32(crf, src2lo, src2hi);
>
> if (ctx->opcode & 0x00200000) {
>     tcg_gen_shri_i32(src2, src2, 8);
>     tcg_gen_ext8u_i32(src2lo, src2);
>     tcg_gen_shri_i32(src2, src2, 8);
>     tcg_gen_ext8u_i32(src2hi, src2);
>     tcg_gen_setcond_i32(TCG_COND_LEU, src2lo, src2lo, src1);
>     tcg_gen_setcond_i32(TCG_COND_LEU, src2hi, src1, src2hi);
>     tcg_gen_and_i32(src2lo, src2lo, src2hi);
>     tcg_gen_or_i32(crf, crf, src2lo);
> }
>
> tcg_gen_shli_i32(crf, crf, CRF_GT);
>
> // free all 4 "src" temps
Sure.
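
As a cross-check for the next version, the branchless logic can be modeled in plain C. This is only a sketch under stated assumptions, not the TCG implementation: cmprb_model is a hypothetical helper name, CRF_GT is assumed to be bit 2 of the 4-bit CR field, and src1 is assumed to be the low byte of rA (as the andi with 0xFF in the original patch does). The return value is what would be written to the CR field, so the out-of-range case yields 0 rather than leaving the field untouched.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: GT is bit 2 within a 4-bit CR field (LT=3, GT=2, EQ=1, SO=0). */
#define CRF_GT 2

/* Hypothetical reference model of cmprb.
 * ra, rb: register values; l: the L bit (opcode & 0x00200000).
 * Returns the value for the target CR field. */
static uint32_t cmprb_model(uint64_t ra, uint64_t rb, bool l)
{
    uint32_t src1 = (uint32_t)ra & 0xFF;   /* byte under test */
    uint32_t src2 = (uint32_t)rb;          /* truncate, as trunc_tl_i32 would */

    /* First range: low bound in bits 0..7, high bound in bits 8..15. */
    uint32_t lo = src2 & 0xFF;
    uint32_t hi = (src2 >> 8) & 0xFF;
    uint32_t in_range = (lo <= src1) & (src1 <= hi);

    if (l) {
        /* Second range: low bound in bits 16..23, high bound in bits 24..31. */
        lo = (src2 >> 16) & 0xFF;
        hi = (src2 >> 24) & 0xFF;
        in_range |= (lo <= src1) & (src1 <= hi);
    }

    /* 0 when out of range, so the field is always fully written. */
    return in_range << CRF_GT;
}
```

For example, with rB = 0x7A615A41 (ranges 'A'..'Z' and 'a'..'z') and L set, this behaves like isalpha on the low byte of rA; with L clear, only the 'A'..'Z' range is checked.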
Regards
Nikunj