[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH v1 1/1] xilinx_axidma.c: Fix up the stream_runni
From: |
Alistair Francis |
Subject: |
Re: [Qemu-devel] [PATCH v1 1/1] xilinx_axidma.c: Fix up the stream_running() function |
Date: |
Fri, 19 Jun 2015 10:04:28 +1000 |
On Fri, Jun 5, 2015 at 12:58 PM, Peter Crosthwaite
<address@hidden> wrote:
> On Thu, Jun 4, 2015 at 7:00 PM, Alistair Francis
> <address@hidden> wrote:
>> On Thu, Jun 4, 2015 at 8:52 AM, Peter Crosthwaite
>> <address@hidden> wrote:
>>> On Wed, May 27, 2015 at 12:37 AM, Alistair Francis
>>> <address@hidden> wrote:
>>>> Previously the stream_running() function didn't check
>>>> if the DMA was halted. This caused hangs in recent versions
>>>> of MicroBlaze u-boot. Correct stream_running() to check
>>>> DMASR_HALTED as well as DMACR_RUNSTOP.
>>>>
>>>
>>> So I'm stuggling with this one. Partly because I think HALTED might be
>>> misimplemented in existing code. I did some digging, and AFAICS,
>>> HALTED is conditional on !DAMCR_RUNSTOP. I think i might have got
>>> 210914e29975d17e635f9e8c1f7478c0ed7a208f wrong:
>>>
>>> @@ -276,7 +276,7 @@ static void stream_process_mem2s(struct Stream *s,
>>> stream_desc_load(s, s->regs[R_CURDESC]);
>>>
>>> if (s->desc.status & SDESC_STATUS_COMPLETE) {
>>> - s->regs[R_DMASR] |= DMASR_IDLE;
>>> + s->regs[R_DMASR] |= DMASR_HALTED;
>>> break;
>>> }
>>>
>>> Stepping back and ignoring the existing implementation of HALTED there
>>> are 4 states of RS:H (RUNSTOP and HALTED):
>>>
>>> !RS && H - this is the off state. doc refers to this as the "halted" state.
>>> RS && !H - This is the running state.
>>> !RS && !H - This is the transient state. Software has cleared RS but
>>> there s still something on AXI bus so cant assert halted yet.
>>> RS && H - This is an invalid state.
>>>
>>> Current code reaches the invalid state on the ring buffer full case
>>> due to the bug above. My thoery is
>>> 210914e29975d17e635f9e8c1f7478c0ed7a208f should have just been:
>>>
>>> if (s->desc.status & SDESC_STATUS_COMPLETE) {
>>> - s->regs[R_DMASR] |= DMASR_IDLE;
>>> break;
>>> }
>>>
>>> Now I think there is yet another bug in that clearing RS doesn't seem
>>> to be able to reliably set the HALTED bit (only in the unrelated case
>>> of a ring buffer fill).
>>>
>>> I'm starting to question whether the HALTED bit as far as QEMU is
>>> concerned should just be a straight negation of RS. Depending on what
>>> the conditions cause a transient and what doesn't, the transient as I
>>> describe above may evaporate as we can get away with this simple
>>> shortcut.
>>
>> Hey Peter,
>>
>> Ok, so you are saying maybe we should only have 'Running' and 'Off'
>> states?
>>
>>>
>>> This would make this patch obsolete without fixing your bug :).
>>>
>>> So running on the assumption that HALTED is misimplemented your patch
>>> is doing something with that behaviour. The misimplemented HALTED is
>>> currently holding the state of "we are blocked on a full buffer". If
>>> you can point me which of the 3 call sites of stream_running was
>>> giving you problems I might have more clues.
>>
>> So the problem was basically because:
>> - axienet_eth_rx_notify() calls the DMA push functions
>>
>> - xilinx_axidma_data_stream_can_push() returned true
>> - Therefore xilinx_axidma_data_stream_push() could be called
>>
>> - This meant that stream_process_s2mem() was called
>> - But it would break straight away as
>> 's->desc.status & SDESC_STATUS_COMPLETE' would return true.
>> - This meant it would cause an infinite loop as the stream could be
>> pushed, but nothing was ever pushed. As ret was always zero
>> the code never broke out of the second while loop in
>> axienet_eth_rx_notify()
>>
>
> Ok I think I know what is up. It happens by fluke, that the bad
> implementation of halted is exactly the state you need to correct this
> issue. You could capture it as a new boolean in the state struct and
> just check it in can_push. I'm looking at ways to refactor it to avoid
Do you mean capture the running/!running state in the state struct?
> the extra calls to _push though. It might be possible to refactor such
> that all of stream_running, stream_idle and the SDESC_STATUS_COMPLETE
> happen in can_push. This means can_push has to do the
> stream_desc_load. But that is ok, as the current solution will already
> have spurious fetches of the descriptor. Then make can_push itself the
> iteration condition for _mem2s's loop which will handle descriptor
> fetches for you in the process.
Is it really that bad to be calling _push?
Thanks,
Alistair
>
> Regards,
> Peter
>
>> Thanks,
>>
>> Alistair
>>
>>>
>>> FYI you patch may still be correct but I wondering whether is has
>>> uncovered a bug that should lead to a rework of this.
>>>
>>> Regards,
>>> Peter
>>>
>>>> Signed-off-by: Alistair Francis <address@hidden>
>>>> Reviewed-by: Sai Pavan Boddu <address@hidden>
>>>> ---
>>>> hw/dma/xilinx_axidma.c | 3 ++-
>>>> 1 files changed, 2 insertions(+), 1 deletions(-)
>>>>
>>>> diff --git a/hw/dma/xilinx_axidma.c b/hw/dma/xilinx_axidma.c
>>>> index d06002d..27fba40 100644
>>>> --- a/hw/dma/xilinx_axidma.c
>>>> +++ b/hw/dma/xilinx_axidma.c
>>>> @@ -154,7 +154,8 @@ static inline int stream_resetting(struct Stream *s)
>>>>
>>>> static inline int stream_running(struct Stream *s)
>>>> {
>>>> - return s->regs[R_DMACR] & DMACR_RUNSTOP;
>>>> + return s->regs[R_DMACR] & DMACR_RUNSTOP &&
>>>> + !(s->regs[R_DMASR] & DMASR_HALTED);
>>>> }
>>>>
>>>> static inline int stream_idle(struct Stream *s)
>>>> --
>>>> 1.7.1
>>>>
>>>>
>>>
>>
>