From: Mohammad Akhlaghi
Subject: [gnuastro-commits] master a626911: Table: column concatenation with only certain columns
Date: Tue, 21 Jul 2020 19:15:56 -0400 (EDT)

branch: master
commit a626911254ff99ffb12a2de20b4cfce29c3a5556
Author: Mohammad Akhlaghi <mohammad@akhlaghi.org>
Commit: Mohammad Akhlaghi <mohammad@akhlaghi.org>

    Table: column concatenation with only certain columns
    
    Until now, column concatenation (appending) would append all the columns
    of the specified file to the output table. This was very annoying when,
    for example, only one column was needed. Also, the names of the columns
    were left unchanged, so the same name would often be used for more than
    one column.
    
    With this commit we now have the '--catcolumns' option, which can be used
    to limit the columns that will be appended. Also, by default a '-N' suffix
    is added to the names of all appended columns (where 'N' is the counter
    of the file hosting the column); the new '--catcolumnrawname' option
    disables this. Finally, the old '--catcolumn' option (which took a file
    name) has been renamed to '--catcolumnfile' to avoid confusion.
    
    With these features, it is now very easy to merge multiple measurements
    into one table (for example catalogs generated by MakeCatalog from
    different filters, which all have the same column names and some
    redundant columns).
---
 NEWS                    | 16 ++++++++++++++++
 bin/table/args.h        | 43 ++++++++++++++++++++++++++++++++++---------
 bin/table/asttable.conf |  2 +-
 bin/table/main.h        |  6 ++++--
 bin/table/table.c       | 40 ++++++++++++++++++++++++++++++----------
 bin/table/ui.h          |  8 +++++---
 doc/gnuastro.texi       | 38 ++++++++++++++++++++++++++++----------
 7 files changed, 118 insertions(+), 35 deletions(-)

diff --git a/NEWS b/NEWS
index 8bf882c..30ab78c 100644
--- a/NEWS
+++ b/NEWS
@@ -29,6 +29,17 @@ See the end of the file for license conditions.
      distortion, you can simply run 'astfits --wcsdistortion=SIP' on the
      file. The inverse conversion is also supported (from SIP to TPV).
 
+  Table:
+   - New '--catcolumns' to specify which columns to concatenate (or append)
+     to the output. You can specify the file name containing the columns to
+     append with the '--catcolumnfile' option and '--catcolumnhdu' (see
+     changed features because until now they had different names).
+   - New '--catcolumnrawname' will leave the name of concatenated
+     (appended) columns unchanged. By default the names of the appended
+     columns will be appended with a '-N' (where 'N' is a counter for the
+     file that is used to append columns). The default behavior is to avoid
+     multiple columns having the same name.
+
   Library:
    - Spectral lines library: SiIII, OIII, CIV, NV and rest of Lyman series.
    - GAL_CONFIG_HAVE_WCSLIB_DIS_H: if the host's WCSLIB supports distortions.
@@ -50,6 +61,11 @@ See the end of the file for license conditions.
    - The 'pow' operator can also accept integer inputs. This also applies
      to column arithmetic in Table.
 
+  Table:
+   --catcolumnfile ('-L') is new name for '--catcolumn' ('-C').
+   --catcolumnhdu is new name for '--catcolhdu' (short option name hasn't
+     changed).
+
   Library:
    - gal_type_string_to_number: Numbers ending in '.' or '.0' will be
      parsed as floating point. Until now, it would only parse numbers as
diff --git a/bin/table/args.h b/bin/table/args.h
index d50304e..bd734c1 100644
--- a/bin/table/args.h
+++ b/bin/table/args.h
@@ -71,26 +71,39 @@ struct argp_option program_options[] =
       GAL_OPTIONS_NOT_SET
     },
     {
-      "catcolumn",
-      UI_KEY_CATCOLUMN,
+      "catcolumnfile",
+      UI_KEY_CATCOLUMNFILE,
       "STR",
       0,
-      "Name of files to be concat column",
+      "File(s) to be concatenated by column.",
       GAL_OPTIONS_GROUP_INPUT,
-      &p->catcolumn,
+      &p->catcolumnfile,
       GAL_TYPE_STRLL,
       GAL_OPTIONS_RANGE_ANY,
       GAL_OPTIONS_NOT_MANDATORY,
       GAL_OPTIONS_NOT_SET
     },
     {
-      "catcolhdu",
-      UI_KEY_CATCOLHDU,
+      "catcolumnhdu",
+      UI_KEY_CATCOLUMNHDU,
       "STR/INT",
       0,
-      "HDU/Extension(s) for the calcolmn files.",
+      "HDU/Extension(s) in catcolumnfile.",
       GAL_OPTIONS_GROUP_INPUT,
-      &p->catcolhdu,
+      &p->catcolumnhdu,
+      GAL_TYPE_STRLL,
+      GAL_OPTIONS_RANGE_ANY,
+      GAL_OPTIONS_NOT_MANDATORY,
+      GAL_OPTIONS_NOT_SET
+    },
+    {
+      "catcolumns",
+      UI_KEY_CATCOLUMNS,
+      "STR",
+      0,
+      "Columns to use in catcolumnfile.",
+      GAL_OPTIONS_GROUP_INPUT,
+      &p->catcolumns,
       GAL_TYPE_STRLL,
       GAL_OPTIONS_RANGE_ANY,
       GAL_OPTIONS_NOT_MANDATORY,
@@ -128,7 +141,19 @@ struct argp_option program_options[] =
       GAL_OPTIONS_NOT_MANDATORY,
       GAL_OPTIONS_NOT_SET
     },
-
+    {
+      "catcolumnrawname",
+      UI_KEY_CATCOLUMNRAWNAME,
+      0,
+      0,
+      "Don't touch column names of --catcolumnfile.",
+      GAL_OPTIONS_GROUP_OUTPUT,
+      &p->catcolumnrawname,
+      GAL_OPTIONS_NO_ARG_TYPE,
+      GAL_OPTIONS_RANGE_0_OR_1,
+      GAL_OPTIONS_NOT_MANDATORY,
+      GAL_OPTIONS_NOT_SET
+    },
 
 
 
diff --git a/bin/table/asttable.conf b/bin/table/asttable.conf
index 0590550..84b7de0 100644
--- a/bin/table/asttable.conf
+++ b/bin/table/asttable.conf
@@ -21,4 +21,4 @@
 
 # Inputs
  wcshdu        1
- catcolhdu     1
+ catcolumnhdu  1
diff --git a/bin/table/main.h b/bin/table/main.h
index 4b103ea..9bef786 100644
--- a/bin/table/main.h
+++ b/bin/table/main.h
@@ -100,8 +100,10 @@ struct tableparams
   uint8_t          descending;  /* Sort columns in descending order.    */
   size_t                 head;  /* Output only the no. of top rows.     */
   size_t                 tail;  /* Output only the no. of bottom rows.  */
-  gal_list_str_t   *catcolumn;  /* Filename to concat column wise.      */
-  gal_list_str_t   *catcolhdu;  /* HDU/extension for the catcolumn.     */
+  gal_list_str_t *catcolumnfile; /* Filename to concat column wise.     */
+  gal_list_str_t *catcolumnhdu;  /* HDU/extension for the catcolumn.    */
+  gal_list_str_t  *catcolumns;  /* List of columns to concatenate.      */
+  uint8_t    catcolumnrawname;  /* Don't modify name of appended col.   */
 
   /* Internal. */
   struct column_pack *outcols;  /* Output column packages.              */
diff --git a/bin/table/table.c b/bin/table/table.c
index ceea67a..6cc3608 100644
--- a/bin/table/table.c
+++ b/bin/table/table.c
@@ -612,28 +612,30 @@ table_head_tail(struct tableparams *p)
 static void
 table_catcolumn(struct tableparams *p)
 {
-  char *hdu=NULL;
-  gal_data_t *tocat, *final;
+  size_t counter=1;
   gal_list_str_t *filell, *hdull;
+  gal_data_t *tocat, *final, *newcol;
+  char *tmpname, *hdu=NULL, cstr[100];
   struct gal_options_common_params *cp=&p->cp;
 
   /* Go over all the given files. */
-  hdull=p->catcolhdu;
-  for(filell=p->catcolumn; filell!=NULL; filell=filell->next)
+  hdull=p->catcolumnhdu;
+  for(filell=p->catcolumnfile; filell!=NULL; filell=filell->next)
     {
       /* Set the HDU (not necessary for non-FITS tables). */
       if(gal_fits_name_is_fits(filell->v))
         {
           if(hdull) { hdu=hdull->v; hdull=hdull->next; }
           else
-            error(EXIT_FAILURE, 0, "not enough '--catcolhdu's. For every "
-                  "FITS table given to '--catcolumn', a call to "
-                  "'--catcolhdu' is necessary to identify its HDU/extension");
+            error(EXIT_FAILURE, 0, "not enough '--catcolumnhdu's (or '-u'). "
+                  "For every FITS table given to '--catcolumnfile', a call to "
+                  "'--catcolumnhdu' is necessary to identify its "
+                  "HDU/extension");
         }
       else hdu=NULL;
 
       /* Read the catcolumn table. */
-      tocat=gal_table_read(filell->v, hdu, NULL, NULL, cp->searchin,
+      tocat=gal_table_read(filell->v, hdu, NULL, p->catcolumns, cp->searchin,
                            cp->ignorecase, cp->minmapsize, p->cp.quietmmap,
                            NULL);
 
@@ -646,9 +648,27 @@ table_catcolumn(struct tableparams *p)
               gal_fits_name_save_as_string(filell->v, hdu), tocat->dsize[0],
               p->table->dsize[0]);
 
+      /* Append a counter to the column names because this option is most
+         often used with columns that have a similar name and it would help
+         the user if the output doesn't have multiple columns with same
+         name. */
+      if(p->catcolumnrawname==0)
+        for(newcol=tocat; newcol!=NULL; newcol=newcol->next)
+          if(newcol->name)
+            {
+              /* Add the counter suffix to the column name. */
+              sprintf(cstr, "-%zu", counter);
+              tmpname=gal_checkset_malloc_cat(newcol->name, cstr);
+
+              /* Free the old name and put in the new one. */
+              free(newcol->name);
+              newcol->name=tmpname;
+            }
+
       /* Find the final column of the main table and add this table.*/
       final=gal_list_data_last(p->table);
       final->next=tocat;
+      ++counter;
     }
 }
 
@@ -691,8 +711,8 @@ table(struct tableparams *p)
   if(p->outcols)
     arithmetic_operate(p);
 
-  /* Concatenate the columns of tables(if required)*/
-  if(p->catcolumn) table_catcolumn(p);
+  /* Concatenate the columns of tables (if required)*/
+  if(p->catcolumnfile) table_catcolumn(p);
 
   /* Write the output. */
   gal_table_write(p->table, NULL, p->cp.tableformat, p->cp.output,
diff --git a/bin/table/ui.h b/bin/table/ui.h
index 9d27a80..23be7a4 100644
--- a/bin/table/ui.h
+++ b/bin/table/ui.h
@@ -42,7 +42,7 @@ enum program_args_groups
 /* Available letters for short options:
 
    a b d f g j k l m p t v x y z
-   A B E G H J L O Q R X Y
+   A B E G H J O Q R X Y
 */
 enum option_keys_enum
 {
@@ -59,14 +59,16 @@ enum option_keys_enum
   UI_KEY_DESCENDING      = 'd',
   UI_KEY_HEAD            = 'H',
   UI_KEY_TAIL            = 't',
-  UI_KEY_CATCOLUMN       = 'C',
-  UI_KEY_CATCOLHDU       = 'u',
+  UI_KEY_CATCOLUMNS      = 'C',
+  UI_KEY_CATCOLUMNHDU    = 'u',
+  UI_KEY_CATCOLUMNFILE   = 'L',
 
   /* Only with long version (start with a value 1000, the rest will be set
      automatically). */
   UI_KEY_POLYGON         = 1000,
   UI_KEY_INPOLYGON,
   UI_KEY_OUTPOLYGON,
+  UI_KEY_CATCOLUMNRAWNAME,
 };
 
 
diff --git a/doc/gnuastro.texi b/doc/gnuastro.texi
index 5c1bbf2..1db0a48 100644
--- a/doc/gnuastro.texi
+++ b/doc/gnuastro.texi
@@ -9376,20 +9376,38 @@ If the value to this option is @option{none}, no WCS will be written in the outp
 FITS extension/HDU that contains the WCS to be used in the @code{wcstoimg} and @code{imgtowcs} operators of @option{--column} (see above).
 The FITS file name can be specified with @option{--wcsfile}.
 
-@item -F STR
-@itemx --catcolumn=STR
-Concatenate/add the columns of this option's value (a filename) with the main input table (keeping number of rows fixed).
-The concatenation is done after any column selection (for example with @option{--column}) or row selection (for example with @option{--range}) is applied to the main input argument.
+@item -L STR
+@itemx --catcolumnfile=STR
+Concatenate (or add, or append) the columns of this option's value (a filename) to the output columns.
+This option may be called multiple times (to add columns from more than one file into the final output), the columns from each file will be added in the same order that this option is called.
+
+By default all the columns of the given file will be appended, if you only want certain columns to be appended, use the @option{--catcolumns} option to specify their name or number (see @ref{Selecting table columns}).
+Note that the columns given to @option{--catcolumns} must be present in all the given files (if this option is called more than once).
+
+The concatenation is done after any column selection (for example with @option{--column}) or row selection (for example with @option{--range}) is applied to the main input table given to Table.
+The number of rows in the file(s) given to this option has to be the same as the final output table if this option wasn't given.
 
-If the file given to this option is a FITS file, its necessary to also define the corresponding HDU/extension with @option{--catcolhdu}.
-Also note that no column or row selection is applied to the table given to this option.
-This option may be called multiple times (to add columns from more than one file into the final output), the columns will be added in the same order that this option is called.
+If the file given to this option is a FITS file, its necessary to also define the corresponding HDU/extension with @option{--catcolumnhdu}.
+Also note that no operation (for example row selection, arithmetic or etc) is applied to the table given to this option.
+
+If the appended columns have a name, the column names of each file will be appended with a @code{-N}, where @code{N} is a counter starting from 1 for each appended file.
+This is done because when concatenating columns from multiple tables (more than two) into one, they may have the same name, and its not good practice to have multiple columns with the same name.
+You can disable this feature with @option{--catcolumnrawname}.
 
 @item -u STR/INT
-@itemx --catcolhdu=STR/INT
-The HDU/extension of the FITS file(s) that have been added with @option{--catcolumn}.
+@itemx --catcolumnhdu=STR/INT
+The HDU/extension of the FITS file(s) that should be concatenated, or appended, with @option{--catcolumnfile}.
 If @option{--catcolumn} is called more than once with more than one FITS file, its necessary to call this option more than once.
-The HDUs will be loaded in the same order as the FITS files given to @option{--catcolumn}.
+The HDUs will be loaded in the same order as the FITS files given to @option{--catcolumnfile}.
+
+@item -C STR/INT
+@itemx --catcolumns=STR/INT
+The column(s) in the file(s) given to @option{--catcolumnfile} to append.
+When this option is not given, all the columns will be concatenated.
+See @option{--catcolumnfile} for more.
+
+@item --catcolumnrawname
+Don't modify the names of the concatenated (appended) columns, see description in @option{--catcolumnfile}.
 
 @item -O
 @itemx --colinfoinstdout


