From: Frederic Konrad
Subject: Re: [Qemu-devel] Making all TB invalidation asynchronous (MTTCG safe_work)?
Date: Thu, 25 Feb 2016 09:58:29 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.5.0

Hi Alex,

We decided in Seattle to make this flag per-TB (e.g. move it into the TB
struct).
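Roughly something like this (an untested sketch; the field name is only a
placeholder, not anything we agreed on):

    struct TranslationBlock {
        ...
        /* set when this TB has been invalidated; would replace the
           global tcg_ctx.tb_ctx.tb_invalidated_flag */
        bool invalidated;
        ...
    };

The idea being that cpu_exec only needs to look at the TB it is about to
chain from, rather than taking tb_lock just to test a global flag.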



On 24/02/2016 18:30, Alex Bennée wrote:
> Hi,
>
> So I've been working on reducing MTTCG tb_lock contention and currently
> have a tb_lock around the following code (in my cpu_exec):
>
>     /* Note: we do it here to avoid a gcc bug on Mac OS X when
>        doing it in tb_find_slow */
>     tb_lock();
>     if (tcg_ctx.tb_ctx.tb_invalidated_flag) {
>         /* as some TB could have been invalidated because
>            of memory exceptions while generating the code, we
>            must recompute the hash index here */
>         next_tb = 0;
>         tcg_ctx.tb_ctx.tb_invalidated_flag = 0;
>     }
>     /* see if we can patch the calling TB. When the TB
>        spans two pages, we cannot safely do a direct
>        jump. */
>     if (next_tb != 0 && tb->page_addr[1] == -1
>         && !qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
>         tb_add_jump((TranslationBlock *)(next_tb & ~TB_EXIT_MASK),
>                     next_tb & TB_EXIT_MASK, tb);
>     }
>     tb_unlock();
>
> And this started me down the rabbit hole of the meaning of
> tcg_ctx.tb_ctx.tb_invalidated_flag. As far as I can follow, there are
> two places where it is set:
>
>  * We've run out of translation memory and we are throwing everything
>    away (tb_alloc == NULL)
>  * We've invalidated the physical pages of some TranslationBlocks
>
> The first case is handled by some slightly convoluted buffer-overflow
> handling code (in tb_gen_code):
>
>     if (unlikely(!tb)) {
>  buffer_overflow:
>         /* flush must be done */
>         tb_flush_safe(cpu);
>         /* Don't forget to invalidate previous TB info.  */
>         tcg_ctx.tb_ctx.tb_invalidated_flag = 1;
>         tb_unlock();
>         cpu_loop_exit(cpu);
>     }
>
> I'm sure this could be handled more simply by just queuing the safe
> tb_flush, returning a NULL tb, and letting the execution loop unwind
> before the translation buffers are reset.
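>
> A rough, untested sketch of what I mean (still using Fred's
> tb_flush_safe()):
>
>     if (unlikely(!tb)) {
>         /* queue the flush; it runs as safe work once every vCPU
>            has left its translated code */
>         tb_flush_safe(cpu);
>         tb_unlock();
>         /* return NULL and let the exec loop unwind; by the time we
>            translate again the buffers have been reset */
>         return NULL;
>     }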
>
> The second case has been partially asynced by Fred:
>
>     /* invalidate one TB */
>     void tb_phys_invalidate(TranslationBlock *tb, tb_page_addr_t page_addr)
>     {
>         CPUState *cpu;
>         PageDesc *p;
>         unsigned int h;
>         tb_page_addr_t phys_pc;
>         struct CPUDiscardTBParams *params;
>
>         assert_tb_lock(); /* added by me, see below */
>
>         /* remove the TB from the hash list */
>         phys_pc = tb->page_addr[0] + (tb->pc & ~TARGET_PAGE_MASK);
>         h = tb_phys_hash_func(phys_pc);
>         tb_hash_remove(&tcg_ctx.tb_ctx.tb_phys_hash[h], tb);
>
>         /* remove the TB from the page list */
>         if (tb->page_addr[0] != page_addr) {
>             p = page_find(tb->page_addr[0] >> TARGET_PAGE_BITS);
>             tb_page_remove(&p->first_tb, tb);
>             invalidate_page_bitmap(p);
>         }
>         if (tb->page_addr[1] != -1 && tb->page_addr[1] != page_addr) {
>             p = page_find(tb->page_addr[1] >> TARGET_PAGE_BITS);
>             tb_page_remove(&p->first_tb, tb);
>             invalidate_page_bitmap(p);
>         }
>
>         tcg_ctx.tb_ctx.tb_invalidated_flag = 1;
>
>         CPU_FOREACH(cpu) {
>             params = g_malloc(sizeof(struct CPUDiscardTBParams));
>             params->cpu = cpu;
>             params->tb = tb;
>             async_run_on_cpu(cpu, cpu_discard_tb_from_jmp_cache, params);
>         }
>         async_run_safe_work_on_cpu(first_cpu, tb_invalidate_jmp_remove, tb);
>
>         tcg_ctx.tb_ctx.tb_phys_invalidate_count++;
>     }
>
> But I'm wondering why we can't defer all the page invalidation to safe
> work?
>
> I don't think it matters to the invalidating vCPU, as it has to get
> to the end of its block anyway. For the other vCPUs, since there is no
> strict synchronisation, can we not pretend that whatever operation
> triggered the invalidation happened just as their current block ended?
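>
> Something like this, perhaps (untested; the helper name is made up and
> the page_addr argument would need to be carried in a small params
> struct):
>
>     static void tb_page_invalidate_safe_work(void *data)
>     {
>         TranslationBlock *tb = data;
>         /* ...the hash-table and page-list removal that
>            tb_phys_invalidate does synchronously today would move
>            here, running once every vCPU is outside its TB... */
>     }
>
>     void tb_phys_invalidate(TranslationBlock *tb, tb_page_addr_t page_addr)
>     {
>         async_run_safe_work_on_cpu(first_cpu, tb_page_invalidate_safe_work, tb);
>     }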
>
> The final case, which I don't quite follow, is avoiding the
> invalidation of tb_next in cpu_exec_nocache() when a tb invalidation
> event has already occurred:
>
>     tb->orig_tb = tcg_ctx.tb_ctx.tb_invalidated_flag ? NULL : orig_tb;
>
> That is then checked later (in cpu_io_recompile):
>
>     if (tb->cflags & CF_NOCACHE) {
>         if (tb->orig_tb) {
>             /* Invalidate original TB if this TB was generated in
>              * cpu_exec_nocache() */
>             tb_phys_invalidate(tb->orig_tb, -1);
>         }
>         tb_free(tb);
>     }
>
> My aim in all of this is to see if we can remove another flag from
> tb_ctx (one less thing whose access needs a mutex) and make the code
> flow easier to follow. So the remaining question is:
>
> * Are there cases where not immediately invalidating the tb_page
>   structures would cause problems for the emulation?

Is that the same issue we might have with the memory barriers?

Fred
>
> Thanks in advance for any elucidation ;-)
>
> --
> Alex Bennée
>



