gnuastro-commits
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[gnuastro-commits] master 7d117daa: Library (fits.h): Table info is read


From: Mohammad Akhlaghi
Subject: [gnuastro-commits] master 7d117daa: Library (fits.h): Table info is read when END key not read by CFITSIO
Date: Thu, 7 Jul 2022 12:56:28 -0400 (EDT)

branch: master
commit 7d117daa2d0a4cdccce35c889e2513f4bbde81d7
Author: Mohammad Akhlaghi <mohammad@akhlaghi.org>
Commit: Mohammad Akhlaghi <mohammad@akhlaghi.org>

    Library (fits.h): Table info is read when END key not read by CFITSIO
    
    Until now, the 'gal_fits_tab_info' function would only stop parsing the
    headers of the input FITS file when CFITSIO returned the 'END'
    keyword. However, as reported by Hilderic Browne in [1], for some FITS
    files, strangely CFITSIO doesn't return the 'END' keyword. As a result,
    Table would fall into an infinite loop and would never stop!
    
    After finding that the problem originates from CFITSIO, it was immediately
    reported to the CFITSIO maintainers at 'ccfits@heasarc.gsfc.nasa.gov' and
    they are looking into it.
    
    With this commit, a work-around has been implemented (which is necessary
    because even after the fix in CFITSIO, it may take many years for many
    users to update it): immediately after attempting to read the next keyword
    in a FITS file, a check has been added for the 'status' output of
    CFITSIO. If we have indeed passed 'END', the 'status' will be
    'KEY_OUT_BOUNDS'. In this case, we will manually write 'END' and let the
    loop stop as before.
    
    While checking that same file, I noticed that while reading blank columns,
    we aren't setting the string blank columns properly (instead of directly
    passing the pointer to the blank string, we were passing the pointer to a
    pointer!). With this commit, this bug has also been fixed.
    
    [1] https://lists.gnu.org/archive/html/bug-gnuastro/2022-07/msg00005.html
---
 NEWS                         |   3 +-
 THANKS                       |   1 +
 bin/fits/fits.c              |   5 +-
 doc/announce-acknowledge.txt |   1 +
 lib/fits.c                   | 167 ++++++++++++++++++++++++++++++-------------
 5 files changed, 124 insertions(+), 53 deletions(-)

diff --git a/NEWS b/NEWS
index bf925fe9..9060070d 100644
--- a/NEWS
+++ b/NEWS
@@ -58,7 +58,6 @@ See the end of the file for license conditions.
          astarithmetic img-a.fits img-b.fits + --metaname="my-sum" \
                        --output=sum.fits
 
-
   Crop:
    --oneelemstdout: when a crop has a single pixel and this option is
      called, the single pixel's value will be printed on the standard
@@ -217,6 +216,8 @@ See the end of the file for license conditions.
   bug #62710: Plain-text integers starting with 0 are read in the octal
               base (which is common in software engineering, but not in
               data analysis). Found by Manuel Sánchez-Benavente.
+  bug #62718: Table goes into an infinite loop on some old ASCII FITS
+              tables. Found by Hilderic Browne.
 
 
 
diff --git a/THANKS b/THANKS
index 899501f7..3e6ac378 100644
--- a/THANKS
+++ b/THANKS
@@ -29,6 +29,7 @@ support in Gnuastro. The list is ordered alphabetically (by 
family name).
     Faezeh Bijarchian                    fbidjarchian@gmail.com
     Leindert Boogaard                    boogaard@strw.leidenuniv.nl
     Nicolas Bouché                       nicolas.bouche@irap.omp.eu
+    Hilderic Browne                      hilderic@storm.ca
     Stefan Brüns                         stefan.bruens@rwth-aachen.de
     Fernando Buitrago                    fbuitrago@oal.ul.pt
     Adrian Bunk                          bunk@debian.org
diff --git a/bin/fits/fits.c b/bin/fits/fits.c
index 91aa4c16..e292c8f3 100644
--- a/bin/fits/fits.c
+++ b/bin/fits/fits.c
@@ -300,9 +300,10 @@ fits_print_extension_info(struct fitsparams *p)
       printf(" Column 3: Image data type or 'table' format (ASCII or "
              "binary).\n");
       printf(" Column 4: Size of data in HDU.\n");
-      printf(" Column 5: Units of data in HDU (only images, for tables use 
'asttable -i').\n");
+      printf(" Column 5: Units of data in HDU (only images, for tables "
+             "use 'asttable -i').\n");
       if(hasblankunits)
-        printf("           ('%s': no unit in HDU metadata, or is "
+        printf("           ('%s': no unit in HDU metadata, or "
                "HDU is a table)\n", GAL_BLANK_STRING);
       if(hascomments)
         printf(" Column 6: Comments about the HDU (e.g., if its HEALpix, or "
diff --git a/doc/announce-acknowledge.txt b/doc/announce-acknowledge.txt
index 72e38fe6..1097ce77 100644
--- a/doc/announce-acknowledge.txt
+++ b/doc/announce-acknowledge.txt
@@ -2,6 +2,7 @@ Alphabetically ordered list to acknowledge in the next release.
 
 Marjan Akbari
 Faezeh Bijarchian
+Hilderic Browne
 Sepideh Eskandarlou
 Sílvia Farras
 Teet Kuutma
diff --git a/lib/fits.c b/lib/fits.c
index d70f867d..d26f6e1d 100644
--- a/lib/fits.c
+++ b/lib/fits.c
@@ -3033,10 +3033,11 @@ gal_fits_tab_info(char *filename, char *hdu, size_t 
*numcols,
 
   /* Necessary when a keyword can't be written immediately as it is read in
      the FITS header and it actually depends on other data before. */
-  gal_list_str_t *tmp_n, *later_name=NULL;
-  gal_list_str_t *tmp_v, *later_value=NULL;
+  gal_list_str_t   *tmp_n, *later_name=NULL;
+  gal_list_str_t   *tmp_v, *later_value=NULL;
   gal_list_sizet_t *tmp_i, *later_index=NULL;
 
+
   /* Open the FITS file and get the basic information. */
   fptr=gal_fits_hdu_open_format(filename, hdu, 1);
   *tableformat=gal_fits_tab_format(fptr);
@@ -3069,8 +3070,30 @@ gal_fits_tab_info(char *filename, char *hdu, size_t 
*numcols,
      reserved. */
   for(i=9; strcmp(keyname, "END"); ++i)
     {
+
       /* Read the next keyword. */
       fits_read_keyn(fptr, i, keyname, value, NULL, &status);
+      switch(status)
+        {
+        /* Everything is good, ignore (this is just a place-holder!) */
+        case 0: break;
+
+        /* It can happen that the 'END' keyword is not read by
+           'fits_read_keyn' (see the file in [1] below, with CFITSIO
+           4.1.0). In that case, the loop will pass this keyword and the
+           next call to 'fits_read_key' will return with a 'status' of
+           'KEY_OUT_BOUNDS'. To stop the loop therefore, we'll simply put
+           an 'END' manually and decrement the counter (for any future
+           checks).
+
+           [1] 
https://lists.gnu.org/archive/html/bug-gnuastro/2022-07/msg00005.html*/
+        case KEY_OUT_BOUNDS: --i; sprintf(keyname, "END"); status=0; break;
+
+        /* There is an un-expected problem, abort and let the user know
+           about it. */
+        default: gal_fits_io_error (status, NULL); /* An error. */
+        }
+
 
       /* For string valued keywords, CFITSIO's function above, keeps the
          single quotes around the value string, one before and one
@@ -3079,6 +3102,7 @@ gal_fits_tab_info(char *filename, char *hdu, size_t 
*numcols,
          space.*/
       if(value[0]=='\'') gal_fits_key_clean_str_value(value);
 
+
       /* COLUMN DATA TYPE. According the the FITS standard, the value of
          TFORM is most generally in this format: 'rTa'. 'T' is actually a
          code of the datatype. 'r' is the 'repeat' counter and 'a' is
@@ -3095,21 +3119,36 @@ gal_fits_tab_info(char *filename, char *hdu, size_t 
*numcols,
              keywords beyond the number of columns, but we don't want to be
              that strict here.*/
           index = strtoul(&keyname[5], &tailptr, 10) - 1;
-          if(index<tfields)     /* Counting from zero was corrected above. */
+          if(index<tfields)  /* Counting from zero was corrected above. */
             {
               /* The FITS standard's value to this option for FITS ASCII
                  and binary files differ. */
               if(*tableformat==GAL_TABLE_FORMAT_AFITS)
-                fits_ascii_tform(value, &datatype, NULL, NULL, &status);
+                {
+                  /* 'repeat' is not defined for ASCII tables in FITS, so
+                     it should be 1 (zero means that the column is empty);
+                     while we actually have one number in each row. */
+                  repeat=1;
+                  fits_ascii_tform(value, &datatype, NULL, NULL, &status);
+                }
               else
-                fits_binary_tform(value, &datatype, &repeat, NULL, &status);
+                {
+                  /* Read the column's numeric data type. */
+                  fits_binary_tform(value, &datatype, &repeat, NULL,
+                                    &status);
+
+                  /* Set the repeat flag if necessary (recall that by
+                     default 'allcols[index].minmapsize' will be zero! So
+                     we need this flag to confirm that the zero there is
+                     meaningful.*/
+                  if(repeat==0)
+                    allcols[index].flag
+                      |= GAL_TABLEINTERN_FLAG_TFORM_REPEAT_IS_ZERO;
+                }
 
-              /* Write the "repeat" element into 'allcols->minmapsize', but
+              /* Write the "repeat" element into 'allcols->minmapsize',
                  also activate the repeat flag within the dataset.*/
               allcols[index].minmapsize = repeat;
-              if(repeat==0)
-                allcols[index].flag
-                  |= GAL_TABLEINTERN_FLAG_TFORM_REPEAT_IS_ZERO;
 
               /* Write the type into the data structure. */
               allcols[index].type=gal_fits_datatype_to_type(datatype, 1);
@@ -3122,24 +3161,25 @@ gal_fits_tab_info(char *filename, char *hdu, size_t 
*numcols,
                     {
                       repeat=strtol(value+1, &tailptr, 0);
                       if(*tailptr!='\0')
-                        error(EXIT_FAILURE, 0, "%s (hdu: %s): the value to "
-                              "keyword '%s' ('%s') is not in 'Aw' format "
-                              "(for strings) as required by the FITS "
-                              "standard in %s", filename, hdu, keyname, value,
-                              __func__);
+                        error(EXIT_FAILURE, 0, "%s (hdu: %s): the value "
+                              "to keyword '%s' ('%s') is not in 'Aw' "
+                              "format (for strings) as required by the "
+                              "FITS standard in %s", filename, hdu,
+                              keyname, value, __func__);
                     }
 
                   /* TFORM's 'repeat' element can be zero (signifying a
                      column without any data)! In this case, we want the
                      output to be fully blank, so we need space for the
                      blank string. */
-                  allcols[index].disp_width = ( repeat
-                                                ? repeat
-                                                : strlen(GAL_BLANK_STRING)+1);
+                  allcols[index].disp_width=( repeat
+                                              ? repeat
+                                              : strlen(GAL_BLANK_STRING)+1);
                 }
             }
         }
 
+
       /* COLUMN SCALE FACTOR. */
       else if(strncmp(keyname, "TSCAL", 5)==0)
         {
@@ -3148,12 +3188,13 @@ gal_fits_tab_info(char *filename, char *hdu, size_t 
*numcols,
             {
               tscal[index]=strtol(value, &tailptr, 0);
               if(*tailptr!='\0')
-                error(EXIT_FAILURE, 0, "%s (hdu: %s): value to %s keyword "
-                      "('%s') couldn't be read as a number in %s", filename,
-                      hdu, keyname, value, __func__);
+                error(EXIT_FAILURE, 0, "%s (hdu: %s): value to %s "
+                      "keyword ('%s') couldn't be read as a number "
+                      "in %s", filename, hdu, keyname, value, __func__);
             }
         }
 
+
       /* COLUMN ZERO VALUE (for signed/unsigned types). */
       else if(strncmp(keyname, "TZERO", 5)==0)
         {
@@ -3168,6 +3209,7 @@ gal_fits_tab_info(char *filename, char *hdu, size_t 
*numcols,
             }
         }
 
+
       /* COLUMN NAME. All strings in CFITSIO start and finish with single
          quotation marks, CFITSIO puts them in itsself, so if we don't
          remove them here, we might have duplicates later, its easier to
@@ -3180,6 +3222,7 @@ gal_fits_tab_info(char *filename, char *hdu, size_t 
*numcols,
             gal_checkset_allocate_copy(value, &allcols[index].name);
         }
 
+
       /* COLUMN UNITS. */
       else if(strncmp(keyname, "TUNIT", 5)==0)
         {
@@ -3188,6 +3231,7 @@ gal_fits_tab_info(char *filename, char *hdu, size_t 
*numcols,
             gal_checkset_allocate_copy(value, &allcols[index].unit);
         }
 
+
       /* COLUMN COMMENTS */
       else if(strncmp(keyname, "TCOMM", 5)==0)
         {
@@ -3196,6 +3240,7 @@ gal_fits_tab_info(char *filename, char *hdu, size_t 
*numcols,
             gal_checkset_allocate_copy(value, &allcols[index].comment);
         }
 
+
       /* COLUMN BLANK VALUE. Note that to interpret the blank value the
          type of the column must already have been defined for this column
          in previous keywords. Otherwise, there will be a warning and it
@@ -3228,6 +3273,7 @@ gal_fits_tab_info(char *filename, char *hdu, size_t 
*numcols,
             }
         }
 
+
       /* COLUMN DISPLAY FORMAT */
       else if(strncmp(keyname, "TDISP", 5)==0)
         {
@@ -3240,7 +3286,6 @@ gal_fits_tab_info(char *filename, char *hdu, size_t 
*numcols,
       /* Column zero. */
     }
 
-
   /* If any columns should be added later because of missing information,
      add them here. */
   if(later_name)
@@ -3378,13 +3423,13 @@ fits_tab_read_onecol(void *in_prm)
     = (struct fits_tab_read_onecol_params *)tprm->params;
 
   /* Subsequent definitions. */
-  void *blank;
   char **strarr;
   fitsfile *fptr;
   gal_data_t *col;
   gal_list_sizet_t *tmp;
+  void *blank, *blankuse;
   int isfloat, hdutype, anynul=0, status=0;
-  size_t i, j, c, indout, indin=GAL_BLANK_SIZE_T;
+  size_t i, j, c, strw, indout, indin=GAL_BLANK_SIZE_T;
 
   /* Open the FITS file */
   fptr=gal_fits_hdu_open_format(p->filename, p->hdu, 1);
@@ -3417,12 +3462,19 @@ fits_tab_read_onecol(void *in_prm)
          automatically in 'gal_fits_table_info'. */
       if(col->type==GAL_TYPE_STRING)
         {
+          /* Since the column may contain blank values, and the blank
+             string is pre-defined in Gnuastro, we need to be sure that for
+             each row, a blank string can fit. */
+          strw = ( strlen(GAL_BLANK_STRING) > p->allcols[indin].disp_width
+                   ? strlen(GAL_BLANK_STRING)
+                   : p->allcols[indin].disp_width );
+
+          /* Allocate the space for each row's strings. */
           strarr=col->array;
           for(j=0;j<p->numrows;++j)
             {
               errno=0;
-              strarr[j]=calloc(p->allcols[indin].disp_width+1,
-                               sizeof *strarr[j]);
+              strarr[j]=calloc(strw+1, sizeof *strarr[0]); /* +1 for '\0' */
               if(strarr[j]==NULL)
                 error(EXIT_FAILURE, errno, "%s: allocating %zu bytes for "
                       "strarr[%zu]", __func__,
@@ -3443,38 +3495,53 @@ fits_tab_read_onecol(void *in_prm)
       else
         {
           /* Allocate a blank value for the given type and read/store the
-             column using CFITSIO. Note that for binary tables, we only
-             need blank values for integer types. For binary floating point
-             types, the FITS standard defines blanks as NaN (same as almost
-             any other software like Gnuastro). However if a blank value is
-             specified, CFITSIO will convert other special numbers like
-             'inf' to NaN also. We want to be able to distringuish 'inf'
-             and NaN here, so for floating point types in binary tables, we
-             won't define any blank value. In ASCII tables, CFITSIO doesn't
-             read the 'NAN' values (that it has written itself) unless we
-             specify a blank pointer/value. */
+             column using CFITSIO.
+
+             * For binary tables, we only need blank values for integer
+               types. For binary floating point types, the FITS standard
+               defines blanks as NaN (same as almost any other software
+               like Gnuastro). However if a blank value is specified,
+               CFITSIO will convert other special numbers like 'inf' to NaN
+               also. We want to be able to distringuish 'inf' and NaN here,
+               so for floating point types in binary tables, we won't
+               define any blank value. In ASCII tables, CFITSIO doesn't
+               read the 'NAN' values (that it has written itself) unless we
+               specify a blank pointer/value.
+
+             * 'fits_read_col' takes the pointer to the thing that should
+               be placed in a blank column (for strings, the 'char *')
+               pointer. However, for strings, 'gal_blank_alloc_write' will
+               return a 'char **' pointer! So for strings, we need to
+               dereference the blank. This is why we need 'blankuse'. */
           isfloat = ( col->type==GAL_TYPE_FLOAT32
                       || col->type==GAL_TYPE_FLOAT64 );
           blank = ( ( hdutype==BINARY_TBL && isfloat )
                     ? NULL
                     : gal_blank_alloc_write(col->type) );
-          fits_read_col(fptr, gal_fits_type_to_datatype(col->type), indin+1,
-                        1, 1, col->size, blank, col->array, &anynul, &status);
-
-          /* In the ASCII table format, CFITSIO might not be able to read
-             'INF' or '-INF'. In this case, it will set status to 'BAD_C2D'
-             or 'BAD_C2F'. So, we'll use our own parser for the column
-             values. */
-          if( hdutype==ASCII_TBL
-              && isfloat
-              && (status==BAD_C2D || status==BAD_C2F) )
+          blankuse = ( col->type==GAL_TYPE_STRING
+                       ? *((char **)blank)
+                       : blank);
+          fits_read_col(fptr, gal_fits_type_to_datatype(col->type),
+                        indin+1, 1, 1, col->size, blankuse, col->array,
+                        &anynul, &status);
+
+          /* In the ASCII table format some things need to be checked. */
+          if( hdutype==ASCII_TBL )
             {
-              fits_tab_read_ascii_float_special(p->filename, p->hdu,
-                                                fptr, col, indin+1, p->numrows,
-                                                p->minmapsize, p->quietmmap);
-              status=0;
+              /* CFITSIO might not be able to read 'INF' or '-INF'. In this
+                case, it will set status to 'BAD_C2D' or 'BAD_C2F'. So,
+                we'll use our own parser for the column values. */
+              if(isfloat && (status==BAD_C2D || status==BAD_C2F) )
+                {
+                  fits_tab_read_ascii_float_special(p->filename, p->hdu,
+                                                    fptr, col, indin+1,
+                                                    p->numrows,
+                                                    p->minmapsize,
+                                                    p->quietmmap);
+                  status=0;
+                }
             }
-          gal_fits_io_error(status, NULL); /* After the 'status' correction. */
+          gal_fits_io_error(status, NULL); /* After 'status' correction. */
 
           /* Clean up and sanity check (just note that the blank value for
              strings, is an array of strings, so we need to free the



reply via email to

[Prev in Thread] Current Thread [Next in Thread]