qemu-s390x
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 2/7] target/s390x: vxeh2: vector string search


From: Richard Henderson
Subject: Re: [PATCH v2 2/7] target/s390x: vxeh2: vector string search
Date: Mon, 7 Mar 2022 08:50:56 -1000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.5.0

On 3/6/22 16:03, David Miller wrote:
+static DisasJumpType op_vstrs(DisasContext *s, DisasOps *o)
+{
+    const uint8_t es = get_field(s, m5);
+    const uint32_t D = get_field(s, m6);
+
+    if (es > ES_32) {
+        gen_program_exception(s, PGM_SPECIFICATION);
+        return DISAS_NORETURN;
+    }

Missed a check for bits of m6 other than zs must be zero.

+    gen_gvec_4_ptr(get_field(s, v1), get_field(s, v2),
+                   get_field(s, v3), get_field(s, v4),
+                   cpu_env, (D << 16) | es, gen_helper_vstrs);

Why << 16?

+void HELPER(vstrs)(void *v1, const void *v2, const void *v3, void *v4,
+                   CPUS390XState *env, uint32_t desc) {
+    const bool zs = (desc >> 16);

If this is meant to recover D, it won't. The value that you passed above is had via simd_data(desc). After that, you could get D via >> 16.

+    const uint8_t es = desc & 16;

I think you clearly meant & 15 here.  Or maybe & 3, since ES <= 2?
However!

I think you shouldn't actually pass these values via simd_data(). I think you should have 6 helper functions, which call this as an inline function passing ES and ZS as immediate arguments. This will allow the compiler to fold away all of the multiple checks vs ES and ZS.


+    uint32_t str_len = 0, eos = 0;
+    uint32_t i = 0, j = 0, k = 0, cc = 0;
+    uint32_t substr_len = ((uint8_t *)v4)[H1(7)] & 31;
+
+    for (i = 0; i < 16; i += char_size) {
+        if (0 == es && !((uint8_t  *)v3)[H1(i >> es)]) { break; }
+        if (1 == es && !((uint16_t *)v3)[H2(i >> es)]) { break; }
+        if (2 == es && !((uint32_t *)v3)[H4(i >> es)]) { break; }

Instead of iterating to 16 by char_size and dividing by es, you should iterate to element_count by 1. You should use s390_vec_read_element here, and elsewhere.

+    }
+    if (i < substr_len) {
+        substr_len = i;
+    }

This bounding of substr_len is only done when ZS is true.

+    ((uint64_t *)v1)[0] = ((uint64_t *)v1)[1] = 0;
+    ((uint8_t *)v1)[H1(7)] = k;

s390_vec_write_element64(v1, 0, k);
s390_vec_write_element64(v1, 1, 0);


r~



reply via email to

[Prev in Thread] Current Thread [Next in Thread]