groff-commit
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[groff] 09/47: [libbib]: Refactor index header checking.


From: G. Branden Robinson
Subject: [groff] 09/47: [libbib]: Refactor index header checking.
Date: Tue, 11 Jan 2022 06:33:14 -0500 (EST)

gbranden pushed a commit to branch master
in repository groff.

commit 4fad0459bb039468a9be60c0e1341833740aa262
Author: G. Branden Robinson <g.branden.robinson@gmail.com>
AuthorDate: Wed Jan 5 10:15:18 2022 +1100

    [libbib]: Refactor index header checking.
    
    * src/libs/libbib/index.cpp: Move more header validity checks...
    
      (index_search_item::load): ...from here....
    
      (index_search_item::check_header): ...to here.  Test all size values
      in header for negative values (never valid) before proceeding.  (These
      data could be changed to unsigned integer types in the file format,
      but that would require bumping the file version.  That in turn would
      make indexes generated with groff 1.23 unusable on systems running
      older groffs and, perhaps worse, would make groff 1.23 reject index
      files produced by older groffs.  On the other hand, the regeneration
      of index files should be, for those who use them, a common activity,
      and as long as the original database files are kept intact, it seems
      likely that most people, given modern machines, won't even notice a
      slowdown in document generation when refer(1) and friends fail to open
      the indices and fall back to full-text searches of the originals.  So
      we could still consider revising the file format before the groff 1.23
      release.)
    
    * src/libs/libbib/index.cpp (index_search_item::load): Explicitly
      perform widening conversion from signed `off_t` value (from a stat(2)
      buffer) to an unsigned integer; quietens compiler warning about
      comparison between integers differing in signedness.
---
 ChangeLog                 | 29 ++++++++++++++++++++++++
 src/libs/libbib/index.cpp | 56 +++++++++++++++++++++++------------------------
 2 files changed, 56 insertions(+), 29 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index f7a2c784..ac187689 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,32 @@
+2022-01-05  G. Branden Robinson <g.branden.robinson@gmail.com>
+
+       [libbib]: Refactor index header checking.
+
+       * src/libs/libbib/index.cpp: Move more header validity checks...
+       (index_search_item::load): ...from here...
+
+       (index_search_item::check_header): ...to here.  Test all size
+       values in header for negative values (never valid) before
+       proceeding.  (These data could be changed to unsigned integer
+       types in the file format, but that would require bumping the
+       file version.  That in turn would make indexes generated with
+       groff 1.23 unusable on systems running older groffs and, perhaps
+       worse, would make groff 1.23 reject index files produced by
+       older groffs.  On the other hand, the regeneration of index
+       files should be, for those who use them, a common activity, and
+       as long as the original database files are kept intact, it seems
+       likely that most people, given modern machines, won't even
+       notice a slowdown in document generation when refer(1) and
+       friends fail to open the indices and fall back to full-text
+       searches of the originals.  So we could still consider revising
+       the file format before the groff 1.23 release.)
+
+       * src/libs/libbib/index.cpp (index_search_item::load):
+       Explicitly perform widening conversion from signed `off_t` value
+       {from a stat(2) buffer} to an unsigned integer; quietens
+       compiler warning about comparison between integers differing in
+       signedness.
+
 2022-01-05  G. Branden Robinson <g.branden.robinson@gmail.com>
 
        * configure.ac (AC_CHECK_HEADERS): Expect stdbool.h, since we
diff --git a/src/libs/libbib/index.cpp b/src/libs/libbib/index.cpp
index 3fc9b941..aebcbbfd 100644
--- a/src/libs/libbib/index.cpp
+++ b/src/libs/libbib/index.cpp
@@ -76,7 +76,7 @@ class index_search_item : public search_item {
 public:
   index_search_item(const char *, int);
   ~index_search_item();
-  const char *check_header(int);
+  const char *check_header(index_header *, unsigned);
   bool load(int fd);
   search_item_iterator *make_search_item_iterator(const char *);
   bool is_valid();
@@ -146,32 +146,40 @@ inline void unused(void *) { }
 // Validate the data reported in the header so that we don't overread on
 // the heap in the load() member function.  Return null pointer if no
 // problems are detected.
-const char *index_search_item::check_header(int size_remaining)
+const char *index_search_item::check_header(index_header *file_header,
+                                           unsigned file_size)
 {
-  size_t chunk_size;
-  if (header.tags_size < 0)
-    return "tag list length supposedly negative";
-  chunk_size = header.tags_size * sizeof(tag);
+  if (file_header->tags_size < 0)
+    return "tag list length negative";
+  if (file_header->lists_size < 0)
+    return "reference list length negative";
+  // The table and string pool sizes will not be zero, even in an empty
+  // index.
+  if (file_header->table_size < 1)
+    return "table size nonpositive";
+  if (file_header->strings_size < 1)
+    return "string pool size nonpositive";
+  size_t sz = (file_header->tags_size * sizeof(tag)
+              + file_header->lists_size * sizeof(int)
+              + file_header->table_size * sizeof(int)
+              + file_header->strings_size
+              + sizeof(file_header));
+  if (sz != file_size)
+    return("size mismatch between header and data");
+  unsigned size_remaining = file_size;
+  unsigned chunk_size = file_header->tags_size * sizeof(tag);
   if (chunk_size > size_remaining)
     return "claimed tag list length exceeds file size";
   size_remaining -= chunk_size;
-  if (header.lists_size < 0)
-    return "reference list length supposedly negative";
-  chunk_size = header.lists_size * sizeof(int);
+  chunk_size = file_header->lists_size * sizeof(int);
   if (chunk_size > size_remaining)
     return "claimed reference list length exceeds file size";
   size_remaining -= chunk_size;
-  // The table and string pool sizes will not be zero, even in an empty
-  // index.
-  if (header.table_size < 1)
-    return "table size supposedly nonpositive";
-  chunk_size = header.table_size * sizeof(int);
+  chunk_size = file_header->table_size * sizeof(int);
   if (chunk_size > size_remaining)
     return "claimed table size exceeds file size";
   size_remaining -= chunk_size;
-  if (header.strings_size < 1)
-    return "string pool size supposedly nonpositive";
-  chunk_size = header.strings_size;
+  chunk_size = file_header->strings_size;
   if (chunk_size > size_remaining)
     return "claimed string pool size exceeds file size";
   return 0;
@@ -191,7 +199,7 @@ bool index_search_item::load(int fd)
     return false;
   }
   mtime = sb.st_mtime;
-  int size = int(sb.st_size);
+  unsigned size = unsigned(sb.st_size); // widening conversion
   if (size == 0) {
     error("index '%1' is an empty file", name);
     return false;
@@ -234,17 +242,7 @@ bool index_search_item::load(int fd)
          name, header.version, INDEX_VERSION);
     return false;
   }
-  int sz = (header.tags_size * sizeof(tag)
-           + header.lists_size * sizeof(int)
-           + header.table_size * sizeof(int)
-           + header.strings_size
-           + sizeof(header));
-  if (sz != size) {
-    error("size of '%1' is wrong: was %2, should be %3",
-         name, size, sz);
-    return false;
-  }
-  const char *problem = check_header(size);
+  const char *problem = check_header(&header, size);
   if (problem != 0) {
     if (do_verify)
       error("corrupt header in index file '%1': %2", name, problem);



reply via email to

[Prev in Thread] Current Thread [Next in Thread]