qemu-devel

Re: Maximum QMP reply size


From: Daniel P. Berrangé
Subject: Re: Maximum QMP reply size
Date: Tue, 20 Sep 2022 09:14:21 +0100
User-agent: Mutt/2.2.6 (2022-06-05)

On Mon, Sep 19, 2022 at 08:45:07AM +0200, Markus Armbruster wrote:
> Daniel P. Berrangé <berrange@redhat.com> writes:
> 
> > On Fri, Sep 16, 2022 at 04:12:00PM +0100, Peter Maydell wrote:
> >> On Thu, 15 Sept 2022 at 16:21, Dr. David Alan Gilbert
> >> <dgilbert@redhat.com> wrote:
> >> >
> >> > * Peter Maydell (peter.maydell@linaro.org) wrote:
> >> > > On Tue, 6 Sept 2022 at 20:41, John Snow <jsnow@redhat.com> wrote:
> >> > > > Hi, I suspect I have asked this before, but I didn't write it down in
> >> > > > a comment, so I forget my justification...
> >> > > >
> >> > > > In the QMP lib, we need to set a buffering limit for how big a QMP
> >> > > > message can be -- In practice, I found that the largest possible
> >> > > > response was the QAPI schema reply, and I set the code to this:
> >> > > >
> >> > > >     # Maximum allowable size of read buffer
> >> > > >     _limit = (64 * 1024)
> >> > > >
> >> > > > However, I didn't document if this was a reasonable limit or just a
> >> > > > "worksforme" one. I assume that there's no hard limit for the 
> >> > > > protocol
> >> > > > or the implementation thereof in QEMU. Is there any kind of value 
> >> > > > here
> >> > > > that would be more sensible than another?
> >> > > >
> >> > > > I'm worried that if replies get bigger in the future (possibly in 
> >> > > > some
> >> > > > degenerate case I am presently unaware of) that the library default
> >> > > > will become nonsensical.
> >> > >
> >> > > There are some QMP commands which return lists of things
> >> > > where we put no inherent limit on how many things there
> >> > > are in the list, like qom-list-types. We'd have to be getting
> >> > > a bit enthusiastic about defining types for that to get
> >> > > up towards 64K's worth of response, but it's not inherently
> >> > > impossible. I think using human-monitor-command to send
> >> > > an 'xp' HMP command is also a way to get back an arbitrarily
> >> > > large string (just ask for a lot of memory to be dumped).
> >> >
> >> > We could put size limits on xp; most Humans will only dump a few kB
> >> > maximum like that, any larger and you can dump to file.
> >> 
> >> Sure, we could, but why? It's not clear to me why a consumer
> >> of QMP needs to impose a maximum message size limit on it:
> >> I thought it was just JSON. Fixed buffer sizes are very 1980s :-)
> >
> > Well even if they parse the JSON as it streams in, rather than
> > reading the whole doc and then parsing it in one go, you still
> > need to have limits on any sane QMP client.
> >
> > The QEMU process is an untrusted component in the stack, talking
> > to a trusted mgmt layer. If the QEMU process sends a 1 TB JSON
> > message as a QMP reply, the mgmt layer must not try to parse
> > that as they'll let loose the kraken^H^H^H^H^H OOM killer.
> >
> > To be robust against either a malicious or mis-behaving QEMU
> > they need to impose a limit on the size of QMP response they'll
> > be willing to process. The challenge is figuring out what limit
> > is big enough to handle any conceivable valid message, while
> > being small enough to minimize denial of service risks.
> 
> Yes.
> 
> QEMU does this for QMP input.
> 
> Trying to defend against malicious QMP input would of course be
> pointless; if you can send QMP input, you "own" QEMU anyway.  It's
> defense against *accidents*.
> 
> The limits are extremely (overly?) generous: each command is limited to
> 1024 levels of nesting to protect the stack, 64MiB of total token size
> and 2Mi[*] tokens to protect the heap.
> 
> > NB, that's not the only thing clients need to do to protect from
> > a bad QEMU. Rate limiting consumption is potentially important too
> > lest a bad QEMU inflict a DoS on the CPU by sending such frequent
> > messages that the QMP client is burning 100% CPU just parsing
> > them.  I've not seen any QMP client do this in practice though,
> > not even libvirt has attempted it.
> 
> What could a management application do when it detects it can't /
> doesn't want to keep up with QMP output?

Gratuitously killing the VM is an option, but likely too harsh for
most scenarios.

Another option is to disconnect from QMP and then immediately connect
again, in the hope that it was just a transiently confused QEMU.

Another option would be to simply mark the VM as "broken": leave it
running but stop talking to QMP entirely. The guest owner can still
use their VM, but can't make mgmt app changes and loses some monitoring
features. Possibly allow the user to attempt a "repair", i.e. disconnect
and reconnect (roughly sketched below).

Another option is to have a dedicated OS thread for handling the QMP
client side, and ensure that thread is under the same CPU usage limits
as the VM as a whole, so any excess CPU usage is accounted to the VM.
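
As a very rough sketch of the "mark it broken" option, the client could
count incoming messages and simply give up once QEMU floods it. The
class name, threshold and API below are purely illustrative, not taken
from any existing QMP client:

    import time

    class QMPFlowGuard:
        """Give up on a QEMU that floods us with QMP messages."""

        def __init__(self, max_msgs_per_sec=1000):
            self.max_msgs_per_sec = max_msgs_per_sec
            self._window_start = time.monotonic()
            self._count = 0
            self.broken = False  # VM keeps running; we just stop listening

        def on_message(self):
            """Call once per received message; False once we give up."""
            now = time.monotonic()
            if now - self._window_start >= 1.0:
                self._window_start = now
                self._count = 0
            self._count += 1
            if self._count > self.max_msgs_per_sec:
                # Excessive message rate: mark the VM broken and let the
                # caller disconnect instead of burning CPU on parsing.
                self.broken = True
            return not self.broken

A "repair" action would then just reset the guard and reconnect.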

> >> If this is a common requirement then should we define something
> >> in the protocol where the client says "I can support messages
> >> up to this big" and then the server has to split things up?
> >
> > Splitting doesn't help protect against the DoS, because the QMP
> > client would have to reassemble the pieces afterwards to process
> > the reply / async event.
> 
> Yes.
> 
> Can we estimate limits that should suffice?  Documenting them could help
> management applications.

FWIW, libvirt uses 10 MB as an arbitrarily picked large limit. I'm not
aware of us having had any bug reports complaining that it is too low
thus far.
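
For what it's worth, the asyncio streams a Python client is likely to
use already support a byte limit, so a cap in that spirit is cheap to
enforce. The snippet below is only a sketch - the socket path, limit
value and function name are illustrative assumptions, not the actual
python-qemu-qmp implementation:

    import asyncio
    import json

    MAX_REPLY = 10 * 1024 * 1024   # mirror libvirt's arbitrary 10 MB cap

    async def read_one_message(path="/tmp/qmp.sock"):
        # StreamReader enforces 'limit'; readuntil() raises
        # LimitOverrunError rather than buffering an unbounded reply.
        reader, writer = await asyncio.open_unix_connection(
            path, limit=MAX_REPLY)
        try:
            # Reads whatever QEMU sends next (the greeting on a fresh
            # connection); each QMP message is one JSON document per line.
            line = await reader.readuntil(b"\n")
            return json.loads(line)
        except asyncio.LimitOverrunError:
            # QEMU is misbehaving; don't try to parse a huge document.
            raise RuntimeError("QMP reply exceeded the size limit")
        finally:
            writer.close()
            await writer.wait_closed()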


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



