qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v9 05/13] accel: adding TB_JIT_TIME and full replacing CONFIG


From: Richard Henderson
Subject: Re: [PATCH v9 05/13] accel: adding TB_JIT_TIME and full replacing CONFIG_PROFILER
Date: Tue, 8 Oct 2019 11:25:48 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.8.0

On 10/7/19 11:28 AM, Alex Bennée wrote:
> From: "Vanderson M. do Rosario" <address@hidden>
> 
> Replace all others CONFIG_PROFILER statistics and migrate it to
> TBStatistics system. However, TCGProfiler still exists and can
> be use to store global statistics and times. All TB related
> statistics goes to TBStatistics.
> 
> Signed-off-by: Vanderson M. do Rosario <address@hidden>
> Message-Id: <address@hidden>
> [AJB: fix authorship]
> Signed-off-by: Alex Bennée <address@hidden>
> ---
>  accel/tcg/tb-stats.c      |  98 +++++++++++++++++++++---------
>  accel/tcg/translate-all.c |  46 ++++++++-------
>  configure                 |   3 -
>  cpus.c                    |  14 ++---
>  include/exec/tb-stats.h   |  15 +++++
>  include/qemu/timer.h      |   5 +-
>  monitor/misc.c            |  28 ++-------
>  tcg/tcg.c                 | 121 ++++++++++++--------------------------
>  tcg/tcg.h                 |   2 +-
>  vl.c                      |   8 +--
>  10 files changed, 159 insertions(+), 181 deletions(-)
> 
> diff --git a/accel/tcg/tb-stats.c b/accel/tcg/tb-stats.c
> index 0e64c176b3..f431159fd2 100644
> --- a/accel/tcg/tb-stats.c
> +++ b/accel/tcg/tb-stats.c
> @@ -17,11 +17,18 @@
>  #include "exec/tb-stats.h"
>  
>  /* TBStatistic collection controls */
> -enum TBStatsStatus { TB_STATS_RUNNING, TB_STATS_PAUSED, TB_STATS_STOPPED };
> +enum TBStatsStatus {
> +    TB_STATS_DISABLED = 0,
> +    TB_STATS_RUNNING,
> +    TB_STATS_PAUSED,
> +    TB_STATS_STOPPED
> +};

This should be in patch 1, I should think.

> +uint64_t dev_time;

This patch is doing several things at once, and I think it should be split.
All of the changes to dev_time, for instance, are unrelated to TBStatistic.

> +    jpi->interm_time += stat_per_translation(tbs, time.interm);
> +    jpi->code_time += stat_per_translation(tbs, time.code);
> +    jpi->opt_time += stat_per_translation(tbs, time.opt);
> +    jpi->la_time += stat_per_translation(tbs, time.la);
> +    jpi->restore_time += tbs->time.restore;
> +    jpi->restore_count += tbs->time.restore_count;

Why are some of these "per translation" and some not?

> @@ -370,11 +371,11 @@ static int cpu_restore_state_from_tb
>      }
>      restore_state_to_opc(env, tb, data);
>  
> -#ifdef CONFIG_PROFILER
> -    atomic_set(&prof->restore_time,
> -                prof->restore_time + profile_getclock() - ti);
> -    atomic_set(&prof->restore_count, prof->restore_count + 1);
> -#endif
> +    if (tb_stats_enabled(tb, TB_JIT_TIME)) {
> +        atomic_add(&tb->tb_stats->time.restore, profile_getclock() - ti);
> +        atomic_inc(&tb->tb_stats->time.restore_count);
> +    }

Would it be better to use a the TBStatistics lock than two atomic updates?

> @@ -1826,10 +1828,10 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
>          tcg_ctx->tb_jmp_target_addr = tb->jmp_target_arg;
>      }
>  
> -#ifdef CONFIG_PROFILER
> -    atomic_set(&prof->interm_time, prof->interm_time + profile_getclock() - 
> ti);
> -    ti = profile_getclock();
> -#endif
> +    if (tb_stats_enabled(tb, TB_JIT_TIME)) {
> +        atomic_add(&tb->tb_stats->time.interm, profile_getclock() - ti);
> +        ti = profile_getclock();
> +    }

You should call profile_getclock() only once here.

Why does this need an atomic_add, while the rest of TB_JIT_STATS within
tb_gen_code do not, and in fact have a comment:

> +    /*
> +     * Collect JIT stats when enabled. We batch them all up here to
> +     * avoid spamming the cache with atomic accesses
> +     */

> @@ -1871,9 +1873,9 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
>      }
>      tb->tc.size = gen_code_size;
>  
> -#ifdef CONFIG_PROFILER
> -    atomic_set(&prof->code_time, prof->code_time + profile_getclock() - ti);
> -#endif
> +    if (tb_stats_enabled(tb, TB_JIT_TIME)) {
> +        atomic_add(&tb->tb_stats->time.code, profile_getclock() - ti);
> +    }

Likewise.

> diff --git a/configure b/configure
> index 8f8446f52b..eedeb9016e 100755
> --- a/configure
> +++ b/configure
> @@ -6566,9 +6566,6 @@ fi
>  if test "$static" = "yes" ; then
>    echo "CONFIG_STATIC=y" >> $config_host_mak
>  fi
> -if test "$profiler" = "yes" ; then
> -  echo "CONFIG_PROFILER=y" >> $config_host_mak
> -fi

Removing the define without removing --enable-profiler.

>  static int tcg_cpu_exec(CPUState *cpu)
>  {
>      int ret;
> -#ifdef CONFIG_PROFILER
> -    int64_t ti;
> -#endif
> +    uint64_t ti;
>  
>      assert(tcg_enabled());
> -#ifdef CONFIG_PROFILER
>      ti = profile_getclock();
> -#endif
> +
>      cpu_exec_start(cpu);
>      ret = cpu_exec(cpu);
>      cpu_exec_end(cpu);
> -#ifdef CONFIG_PROFILER
> -    atomic_set(&tcg_ctx->prof.cpu_exec_time,
> -               tcg_ctx->prof.cpu_exec_time + profile_getclock() - ti);
> -#endif
> +
> +    atomic_add(&tcg_ctx->prof.cpu_exec_time, profile_getclock() - ti);

This is also unrelated to TBStatistics.

It's also adding an unconditional atomic_add, of which I am not fond.  Should
this be testing tb_stats_collection_enabled() or something else?

I'll also note that tcg_ctx is per-thread (for mttcg), and so perhaps this does
not require an atomic_add anyway.  Perhaps just an atomic_set to be paired with
atomic_read in the dump function that's iterating over the tcg_ctx.

Even without the atomic_add, we shouldn't make the syscall for getclock in the
normal case.


> +    /* These are accessed with atomic operations */
> +    struct {
> +        int64_t restore;
> +        uint64_t restore_count;
> +        int64_t interm;
> +        int64_t code;
> +        int64_t opt;
> +        int64_t la;
> +    } time;

Why are these signed?  We're always adding positive values.

Why is restore_count here in "time"?  Indeed, why all of these unnamed
sub-structures at all?  I don't see that "." helps organization any more than 
"_".

> @@ -4020,18 +3959,18 @@ int tcg_gen_code(TCGContext *s, TranslationBlock *tb)
>      }
>  #endif
>  
> -#ifdef CONFIG_PROFILER
> -    atomic_set(&prof->opt_time, prof->opt_time - profile_getclock());
> -#endif
> +    if (tb_stats_enabled(tb, TB_JIT_TIME)) {
> +        atomic_add(&tb->tb_stats->time.opt, -profile_getclock());
> +    }

Atomic add of a negative clock value?  That just means the intermediate value
is unusable, so why make any of this atomic?

Also, this is tb_gen_code again, wherein we already talked about not using
atomic_foo.

> @@ -4081,14 +4020,17 @@ int tcg_gen_code(TCGContext *s, TranslationBlock *tb)
>      s->pool_labels = NULL;
>  #endif
>  
> +    if (!tb_stats_collection_enabled()) {
> +        QTAILQ_FOREACH(op, &s->ops, link) {
> +            TCGOpcode opc = op->opc;
> +            atomic_add(&s->prof.table_op_count[opc], 1);
> +        }
> +    }

Why would you collect stats when stats collection is disabled?  Is this a
simple typo?


r~



reply via email to

[Prev in Thread] Current Thread [Next in Thread]