bug-gnu-emacs

bug#46881: 28.0.50; pdumper dumping causes way too many syscalls


From: Daniel Colascione
Subject: bug#46881: 28.0.50; pdumper dumping causes way too many syscalls
Date: Thu, 4 Mar 2021 17:26:32 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.7.1

On 3/3/21 12:51 AM, Eli Zaretskii wrote:

From: Pip Cet <pipcet@gmail.com>
Date: Tue, 2 Mar 2021 20:45:04 +0000

On Tue, Mar 2, 2021 at 8:35 PM Pip Cet <pipcet@gmail.com> wrote:
I've looked into the problem, and it seems easy to solve and worth it
in terms of debuggability and performance.
Very rough benchmarks, but this seems to be clearly worth it:

Performance:
With patch:
real    0m3.861s
user    0m3.776s
sys    0m0.085s

Without patch:
real    0m7.001s
user    0m4.476s
sys    0m2.511s

Number of syscalls:
With patch: 415442
Without patch: 2028307

Patch will be attached once this has a bug number.
And here's the patch. Testing would be very appreciated.

I'm unsure about the precise usage of dump_off vs ptrdiff_t here; I
don't think it matters, but suggestions, nitpicks, and comments, on
this or any other aspect, would be very appreciated.
 From 92ee138852b34ede2f43dd7f93f310fc746bb3bf Mon Sep 17 00:00:00 2001
From: Pip Cet <pipcet@gmail.com>
Date: Tue, 2 Mar 2021 20:38:23 +0000
Subject: [PATCH] Prepare pdumper dump file in memory, write it in one go
  (Bug#46881)

* src/pdumper.c (struct dump_context): Add buf, buf_size, max_offset fields.
(dump_grow_buffer): New function.
(dump_write): Use memcpy, not an actual emacs_write.
(dump_seek): Keep track of maximum seen offset.
(Fdump_emacs_portable): Write out the file contents when done.
---
  src/pdumper.c | 20 ++++++++++++++++++--
  1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/src/pdumper.c b/src/pdumper.c
index 337742fda4ade..62ddad8ee5e34 100644
--- a/src/pdumper.c
+++ b/src/pdumper.c
@@ -473,6 +473,10 @@ dump_fingerprint (char const *label,
  {
    /* Header we'll write to the dump file when done.  */
    struct dump_header header;
+  /* Data that will be written to the dump file.  */
+  void *buf;
+  ptrdiff_t buf_size;
+  ptrdiff_t max_offset;
    Lisp_Object old_purify_flag;
    Lisp_Object old_post_gc_hook;
@@ -581,6 +585,13 @@ dump_fingerprint (char const *label,
  
  /* Dump file creation */
+static void dump_grow_buffer (struct dump_context *ctx)
+{
+  ctx->buf = xrealloc (ctx->buf, ctx->buf_size = (ctx->buf_size ?
+                                                 (ctx->buf_size * 2)
+                                                 : 1024 * 1024));
+}
+
  static dump_off dump_object (struct dump_context *ctx, Lisp_Object object);
  static dump_off dump_object_for_offset (struct dump_context *ctx,
                                        Lisp_Object object);
@@ -747,8 +758,9 @@ dump_write (struct dump_context *ctx, const void *buf, dump_off nbyte)
    eassert (nbyte == 0 || buf != NULL);
    eassert (ctx->obj_offset == 0);
    eassert (ctx->flags.dump_object_contents);
-  if (emacs_write (ctx->fd, buf, nbyte) < nbyte)
-    report_file_error ("Could not write to dump file", ctx->dump_filename);
+  while (ctx->offset + nbyte > ctx->buf_size)
+    dump_grow_buffer (ctx);
+  memcpy ((char *)ctx->buf + ctx->offset, buf, nbyte);
    ctx->offset += nbyte;
  }
@@ -828,6 +840,8 @@ dump_tailq_pop (struct dump_tailq *tailq)
  static void
  dump_seek (struct dump_context *ctx, dump_off offset)
  {
+  if (ctx->max_offset < ctx->offset)
+    ctx->max_offset = ctx->offset;
    eassert (ctx->obj_offset == 0);
    if (lseek (ctx->fd, offset, SEEK_SET) < 0)
      report_file_error ("Setting file position",
@@ -4159,6 +4173,8 @@ DEFUN ("dump-emacs-portable",
    ctx->header.magic[0] = dump_magic[0];
    dump_seek (ctx, 0);
    dump_write (ctx, &ctx->header, sizeof (ctx->header));
+  if (emacs_write (ctx->fd, ctx->buf, ctx->max_offset) < ctx->max_offset)
+    report_file_error ("Could not write to dump file", ctx->dump_filename);
    dump_off
      header_bytes = header_end - header_start,
--
2.30.1
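
For readers who want to try the idea outside Emacs, here is a minimal standalone sketch of the technique the patch describes: build the file image in a growable heap buffer, allow seeking back to patch earlier bytes, and write everything out once at the end. It is not the pdumper code. The names struct membuf, membuf_reserve, membuf_write, membuf_seek and membuf_flush are hypothetical, plain realloc stands in for xrealloc, and stdio's fwrite stands in for emacs_write. Unlike the patch, which records the maximum offset in dump_seek, this sketch records it in the write function, and it zero-fills newly allocated space so that gaps left by seeking are deterministic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the buffer fields added to struct dump_context.  */
struct membuf
{
  char *buf;          /* Heap buffer holding the file image.  */
  size_t buf_size;    /* Bytes currently allocated.  */
  size_t offset;      /* Current write position.  */
  size_t max_offset;  /* One past the highest byte ever written.  */
};

/* Ensure at least NEEDED bytes are allocated, starting at 1 MiB and
   doubling, as in the patch.  Return 0 on success, -1 on failure.  */
static int
membuf_reserve (struct membuf *m, size_t needed)
{
  size_t size = m->buf_size ? m->buf_size : 1024 * 1024;
  while (size < needed)
    {
      if (size > SIZE_MAX / 2)
        return -1;              /* Doubling would overflow.  */
      size *= 2;
    }
  if (size != m->buf_size)
    {
      char *p = realloc (m->buf, size);
      if (!p)
        return -1;
      /* Zero the new tail so skipped regions have defined contents.  */
      memset (p + m->buf_size, 0, size - m->buf_size);
      m->buf = p;
      m->buf_size = size;
    }
  return 0;
}

/* Copy NBYTE bytes into the image at the current position.  */
static int
membuf_write (struct membuf *m, const void *data, size_t nbyte)
{
  if (nbyte > SIZE_MAX - m->offset)
    return -1;
  if (membuf_reserve (m, m->offset + nbyte) < 0)
    return -1;
  if (nbyte)
    memcpy (m->buf + m->offset, data, nbyte);
  m->offset += nbyte;
  if (m->max_offset < m->offset)
    m->max_offset = m->offset;
  return 0;
}

/* Seeking is just bookkeeping: no system call is involved.  */
static void
membuf_seek (struct membuf *m, size_t offset)
{
  m->offset = offset;
}

/* Write the whole image to OUT in a single call.  */
static int
membuf_flush (const struct membuf *m, FILE *out)
{
  if (m->max_offset == 0)
    return 0;
  return fwrite (m->buf, 1, m->max_offset, out) == m->max_offset ? 0 : -1;
}
```

A caller would write a placeholder header plus the body, call membuf_seek (&m, 0), overwrite the header, and then call membuf_flush once; the output length is max_offset rather than the final write position.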
Thanks.

Daniel, Paul: any comments?  In particular, is it safe to allocate
large amounts of memory off the heap while dumping?  A couple of
places in pdumper.c say some parts of code should call malloc.

It looks fine, but wouldn't dumping to a FILE* (with internal buffering) do the same basic thing in a simpler way? There aren't any particular constraints on the environment _during_ the dump: we even make new Lisp objects. It's when loading the dump, early in initialization, that you have to be careful.
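
To make the FILE* suggestion concrete, here is a rough standalone sketch of what dumping through a buffered stdio stream could look like. It is only an illustration under stated assumptions, not code from pdumper.c: dump_via_stdio and HEADER_SIZE are hypothetical names, the 1 MiB buffer size passed to setvbuf is an arbitrary choice, and the stream is assumed to be freshly opened, since setvbuf has to be called before any other operation on it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum { HEADER_SIZE = 8 };       /* Hypothetical fixed header length.  */

/* Write a placeholder header and BODY through stdio, then seek back
   and fill in the real HEADER.  Return 0 on success, -1 on failure.  */
static int
dump_via_stdio (FILE *f, const void *body, size_t body_len,
                const char header[HEADER_SIZE])
{
  char placeholder[HEADER_SIZE];
  memset (placeholder, 0, sizeof placeholder);

  /* Ask for a large, fully buffered stream so that many small writes
     are coalesced into few write system calls.  */
  if (setvbuf (f, NULL, _IOFBF, 1 << 20) != 0)
    return -1;

  if (fwrite (placeholder, 1, HEADER_SIZE, f) != HEADER_SIZE)
    return -1;

  /* Deliberately write one byte at a time, imitating a dumper that
     emits many tiny objects; stdio buffers these in user space.  */
  const unsigned char *p = body;
  for (size_t i = 0; i < body_len; i++)
    if (fputc (p[i], f) == EOF)
      return -1;

  /* Go back and patch the header now that the body is complete.  */
  if (fseek (f, 0, SEEK_SET) != 0)
    return -1;
  if (fwrite (header, 1, HEADER_SIZE, f) != HEADER_SIZE)
    return -1;

  return fflush (f) == 0 ? 0 : -1;
}
```

One caveat worth checking before relying on this: stdio implementations typically flush pending output when the stream is repositioned, so a dumper that seeks often may still issue more system calls than one that assembles the whole image in memory first.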