[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH v2 0/3] error: allow local errors to trigger abo
From: |
Markus Armbruster |
Subject: |
Re: [Qemu-devel] [PATCH v2 0/3] error: allow local errors to trigger abort |
Date: |
Thu, 18 Jun 2015 18:34:15 +0200 |
User-agent: |
Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux) |
"Michael S. Tsirkin" <address@hidden> writes:
> It's a common idiom:
>
> Error *local_err = NULL;
> ....
> foo(&local_err);
> ...
> if (local_err) {
> error_propagate(errp, local_err);
> return;
> }
>
> Unfortunately it means that call to foo(&local_err) will
> not abort even if errp is set to error_abort.
>
> Instead, we get an abort at error_propagate which is too late,
> that is, the quality of the stack trace is degraded in that it no longer
> pinpoints the actual cause of failure.
>
> To fix, add an API to check errp and set local_err to error_abort
> if errp is error_abort.
>
> This is out of RFC but I'm still not converting all users:
> let's merge these patches, then I'll convert all users
> on top.
Let's take a step back and review intended use of Error before and after
this series.
Before:
* Parameter Error **errp (by convention the last one)
To create and return an error, do:
error_setg(errp, ...);
To propagate an existing error (typically one you received from a
callee), do:
error_propagate(errp, ...);
You're not supposed to examine errp, let alone dereference it.
* Actual argument
- to receive an error: address of an Error * variable, its value must
be null before the call, and may be examined afterwards
- to ignore errors: null pointer
- to abort on error: &error_abort
This leads to a few patterns:
T foo(..., Error **errp)
{
// pattern: receive, test and propagate error
Error *err = NULL;
bar(..., &err);
if (err) {
error_propagate(errp, err);
return ...;
}
// pattern: set error
if (...) {
error_setg(errp, ...);
return ...;
}
// pattern: pass through error
// really the first pattern less the test simplified
baz(..., errp);
return ...;
}
Your patch modifies the "actual argument to receive an error" clause:
the variable must now be either null or the value of
error_init_local(errp). If it's the latter, then it must be passed to
error_propagate(errp, err).
We acquire a new pattern:
// pattern: receive, test and propagate error
Error *err = error_init_local(errp);
bar(..., &err);
if (err) {
error_propagate(errp, err);
return ...;
}
Let's see whether it can fully replace the existing pattern. Before you
can convert err = NULL to err = error_init_local(errp), you have to:
* find the appropriate errp (usually trivial)
* double-check we error_propagate(errp, err) on every path leaving the
function
It's perfectly possible that we *don't* propagate on all paths:
Error *err = NULL;
bar(..., &err);
if (error is non-recoverable) {
error_propagate(errp, err);
return ...;
}
recover and carry on
Here, error_init_local(errp) would be *wrong*.
The Error boilerplate is annoying, but at least there are few ways to
get it wrong (the most common one is "Error * variable not null before
the call"). I'm most reluctant to add more ways to get it wrong.
Is the gain worth all this additional complexity? The gain is certainly
real: backtrace shows where the error got created instead of where it
was propagated to &error_abort. But the existing backtraces have never
bothered me. When I get one ending at an error_propagate(errp, err),
and I need the real one instead, finding it in the debugger is easy
enough. The complexity does bother me.
Here's an utterly trivial way to get some of the gain for none of the
complexity: make error_setg() & friends store caller's __FILE__ and
__LINE__.