Re: [RFC] Implementing RLIMIT_AS
From: Samuel Thibault
Subject: Re: [RFC] Implementing RLIMIT_AS
Date: Sun, 22 Dec 2024 11:45:11 +0100
Diego Nieto Cid, on Sun, 22 Dec 2024 00:11:09 -0300, wrote:
> On Sun, Dec 22, 2024 at 02:35:08AM +0100, Samuel Thibault wrote:
> >
> > What do you refer to by hard/soft?
> >
>
> I just didn't understand the hard/soft limits. It's better described
> by the structure members and not the comments:
Read the documentation, man setrlimit:
The soft limit is the value that the kernel enforces for the corresponding
resource. The hard limit acts as a ceiling for the soft limit: an
unprivileged process may set only its soft limit to a value in the range
from 0 up to the hard limit, and (irreversibly) lower its hard limit. A
privileged process (under Linux: one with the CAP_SYS_RESOURCE capability
in the initial user namespace) may make arbitrary changes to either
limit value.
or posix:
The rlim_cur member specifies the current or soft limit and the
rlim_max member specifies the maximum or hard limit. Soft limits may
be changed by a process to any value that is less than or equal to
the hard limit. A process may (irreversibly) lower its hard limit
to any value that is greater than or equal to the soft limit. Only
a process with appropriate privileges can raise a hard limit. Both
hard and soft limits can be changed in a single call to setrlimit()
subject to the constraints described above.
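To make the quoted rules concrete, here is a minimal sketch of the check an implementation could apply before accepting a new limit pair. The function name rlimit_change_allowed and its privileged flag are hypothetical, not part of any libc or gnumach interface, and the comparison assumes RLIM_INFINITY is the largest rlim_t value (as it is on glibc).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/resource.h>

/* Hypothetical helper: return true if changing a limit from *OLD to *NEXT
   is permitted under the rules quoted above.  PRIVILEGED stands in for
   whatever privilege check the system uses.  */
bool
rlimit_change_allowed (const struct rlimit *old, const struct rlimit *next,
                       bool privileged)
{
  assert (old != NULL && next != NULL);

  /* The soft limit can never exceed the hard limit.  */
  if (next->rlim_cur > next->rlim_max)
    return false;

  /* Only a privileged process may raise the hard limit.  */
  if (next->rlim_max > old->rlim_max && !privileged)
    return false;

  /* Lowering either limit, or raising the soft limit up to the hard
     limit, is always allowed.  */
  return true;
}
```

With old = {100, 200}, an unprivileged caller may move to {200, 200} or {100, 150}, but not to {100, 400} or {300, 200}.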
Sergey Bugaev, on Sun, 22 Dec 2024 09:33:22 +0300, wrote:
> Hm so now that I think of it, it could make sense to enforce soft
> limits in userland,
But I don't think it'll really be easy from inside userland since you'd
have to intercept all vm_allocate etc. calls done by applications.
> if we care about "memory allocated" and not "size
> of address space" (because the latter is influenced by memory received
> in messages etc).
Here we really care about the latter: the potential to get memory
allocated (overcommit).
> You'd track the amount of memory allocated, increase
> it in mmap (when making anonymous or private mappings) and sbrk, and
> compare it with the soft value
> Maybe I'm describing the same thing as RLIMIT_DATA?
It looks so.
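For illustration only, the userland accounting Sergey describes could be sketched as below. The struct and function names are hypothetical; nothing like this exists in glibc. Such a counter only sees allocations that go through the wrappers that call it, so an application calling vm_allocate directly would bypass it, and it measures "memory allocated" rather than address-space size.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-process counter of anonymous/private memory.  */
struct alloc_account
{
  size_t used;        /* bytes currently charged */
  size_t soft_limit;  /* soft limit to compare against */
};

/* Charge SIZE bytes before an anonymous or private mmap, or before sbrk
   grows the heap.  Return 0 on success, or ENOMEM if the soft limit
   would be exceeded (the counter is then left unchanged).  */
int
alloc_account_charge (struct alloc_account *acc, size_t size)
{
  assert (acc != NULL);

  /* Written as a subtraction so that used + size cannot overflow.  */
  if (acc->used > acc->soft_limit || size > acc->soft_limit - acc->used)
    return ENOMEM;

  acc->used += size;
  return 0;
}

/* Give SIZE bytes back, e.g. after munmap.  */
void
alloc_account_release (struct alloc_account *acc, size_t size)
{
  assert (acc != NULL && size <= acc->used);
  acc->used -= size;
}
```

A wrapper around mmap would call alloc_account_charge first and fail with ENOMEM without mapping anything if the charge is refused.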
> The Linux man page
> says Linux applies it to mmap as well as sbrk since 4.7.
Uh? That doesn't make sense. Posix says
RLIMIT_DATA
This is the maximum size of a data segment of the process, in bytes.
It doesn't talk about mmaps.
> > Luca, on Fri, 20 Dec 2024 10:25:02 +0100, wrote:
> > > are you working on x86_64? if yes, that could be the redzone configured
> > > here:
> > >
> > > https://git.savannah.gnu.org/cgit/hurd/hurd.git/tree/exec/exec.c#n1247
> >
> > That's indeed a very good candidate.
> >
> > One thing is: it's a VM_PROT_NONE/VM_PROT_NONE area. We wouldn't really
> > want to make such area account for RLIMIT_AS, as they are not meant to
> > store anything.
>
> This complicates the accounting a bit. I can keep a count of memory allocated
> with that protection. But I suppose I need to check for calls to `vm_protect`
> or its underlying implementation.
No, as mentioned by Sergey, vm_protect cannot raise the max part, it's
cast in stone. Only vm_deallocate will be able to just remove that
mapping.
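Since the maximum protection cannot be raised afterwards, the accounting decision can be made once, when the mapping is created. A rough sketch of that idea follows; the SK_PROT_* constants are local stand-ins that mirror the values of Mach's vm_prot_t bits so the example compiles without Mach headers, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Local stand-ins mirroring Mach's VM_PROT_* bit values.  */
enum
{
  SK_PROT_NONE = 0x0,
  SK_PROT_READ = 0x1,
  SK_PROT_WRITE = 0x2,
  SK_PROT_EXECUTE = 0x4,
  SK_PROT_ALL = SK_PROT_READ | SK_PROT_WRITE | SK_PROT_EXECUTE
};

/* Hypothetical: number of bytes of a mapping that count toward RLIMIT_AS.
   A region whose maximum protection is NONE can never hold data, so it
   is not charged.  */
size_t
as_accountable_size (size_t size, int max_protection)
{
  assert ((max_protection & ~SK_PROT_ALL) == 0);
  return max_protection == SK_PROT_NONE ? 0 : size;
}

/* Hypothetical model of the rule above: a new current protection is
   acceptable only if it stays within the maximum protection.  */
bool
protect_allowed (int new_protection, int max_protection)
{
  assert ((new_protection & ~SK_PROT_ALL) == 0);
  return (new_protection & ~max_protection) == 0;
}
```

Under this model a redzone mapped with NONE/NONE contributes nothing to the total, and no later protection change can make it accessible; only removing the mapping changes the picture.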
> > Even that patch shouldn't be needed nowadays: the support was committed
> > upstream, and it's only very old-built netdde/rumpdisk that would need the
> > debian patch.
>
> A small nuisance I'm getting is that the gnumach.gz I build from source
> does not have a proper version and dpkg friends complain when regenerating
> GRUB configuration.
grub doesn't care about what's inside gnumach.gz. Just rename it to
gnumach-mine.gz and for grub it'll have the "mine" version.
Sergey Bugaev, on Sun, 22 Dec 2024 09:33:22 +0300, wrote:
> > > Yes, with the host port being an optional parameter for the case when
> > > the limit is getting requested to be increased.
> >
> > Great.
>
> FWIW, this means that the caller would be potentially sending the host
> priv port to someone who's not necessarily the kernel. That's fine if
> we're acting on mach_task_self (since if someone is interposing our
> task port, we can trust them), but not fine if we're a privileged
> server who's willing to raise the given task's memory allowance
> according to some policy.
Task ports are created only by task_create, so they are managed only by
gnumach. Privileged servers do trust that their task ports indeed come
from there.
Samuel