From: Andrew J. Schorr
Subject: Re: [bug-gawk] gawk 5.0.1 patch to allow *valid* awk variable names to be assigned to SYMTAB
Date: Mon, 21 Oct 2019 07:25:06 -0400
User-agent: Mutt/1.5.21 (2010-09-15)

Hi,
A few points:
1. Tom's suggestion involves more typing than necessary. The example below:
    FNR == 1 { for (i = 1; i <= NF; i++) mem[$i] = i }
    $(mem["Name"]) ~ /^H/ { process H.* records }
    $(mem["Amount"]) < 0 { process negative amount records }
can be rewritten as (and I use this approach all the time):
    NR == 1 {
        for (i = 1; i <= NF; i++)
            m[$i] = i
        next
    }
    $m["Name"] ~ /^H/ { process H.* records }
    $m["Amount"] < 0 { process negative amount records }
This saves 4 characters per variable reference (mem -> m, and no need
for the parentheses), so it reduces the overhead by almost 50% (9 -> 5). :-)
So yeah, it costs you a bit of typing, but it's so much cleaner than
mucking with SYMTAB.
2. If the goal here is really to process CSV files, then there's a
gawkextlib project to develop a CSV processing extension that could
probably benefit from some development/contributions.
3. If you develop your own extension, you are welcome to contribute
it to the gawkextlib project.
http://gawkextlib.sourceforge.net/
https://sourceforge.net/projects/gawkextlib/
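For concreteness, here is a minimal sketch of the header-map approach from
point 1 as a runnable shell snippet. The sample data is a simplified,
hypothetical version of the CSV quoted further down (no quoted fields, so a
plain -F, is enough), and the print actions and the run_demo wrapper name are
placeholders for illustration, not anything from the thread. Real CSV with
embedded commas would still need FPAT or a CSV extension.

```shell
#!/bin/sh
# Hypothetical sample data: simplified from the CSV in the thread,
# with no quoted fields, so a comma field separator is sufficient.
csv='Name,Desc,Amount
Harry,Item 1,-30
Jeffery,Groups and pairs,46
Helen,Item 2,12'

# run_demo is an illustrative wrapper name, not part of any tool.
run_demo() {
    printf '%s\n' "$csv" | awk -F, '
    NR == 1 {
        # Map each column header to its field number, then skip the header.
        for (i = 1; i <= NF; i++)
            m[$i] = i
        next
    }
    # Placeholder actions standing in for the real record processing.
    $m["Name"] ~ /^H/ { print "H-name: " $m["Name"] }
    $m["Amount"] < 0  { print "negative: " $m["Name"], $m["Amount"] }
    '
}

run_demo
```

With several input files, FNR == 1 in place of NR == 1 would rebuild the map
from each file's header line.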
Regards,
Andy
On Mon, Oct 21, 2019 at 01:50:32AM -0400, address@hidden wrote:
> Apologies for forgetting to CC the list for the last two iterations of this
> discussion. I am correcting that mistake with this reply.
>
> Arnold,
>
> I understand your lack of enthusiasm, particularly after seeing the
> unexpected and undesired results when I tried to actually use my proposed
> update.
>
> After having reviewed the manual documentation of the gawk extension API, I
> tend to agree that what I want to do is most easily done in a new extension
> function.
>
> There do appear to be several of the delivered extension function sources I
> could use as a model for a relatively simple extension function that
> satisfies my use case.
>
> Thank you for your understanding, guidance, and genuine consideration of my
> needs.
>
> If I get such an extension operating correctly and robustly, is there any
> interest in my contributing that extension to the project?
>
> Regards,
>
> Peter
>
> > -----Original Message-----
> > From: address@hidden <address@hidden>
> > Sent: Sunday, October 20, 2019 3:08 PM
> > To: address@hidden; address@hidden
> > Subject: Re: [bug-gawk] gawk 5.0.1 patch to allow *valid* awk variable names
> > to be assigned to SYMTAB
> >
> > OK, I understand the use case. You want to allow for column names that are
> > not used as variables in the program.
> >
> > I am not overly enthusiastic about making this change. I think it encourages
> > confusion as to how SYMTAB works and should be used, and leads (or can easily
> > lead) to sloppy programming.
> >
> > W.R.T. functions that can create and set variables, these can easily be
> > written in C as a loadable extension; the manual provides the details. That
> > is probably the easier path to follow than patching gawk itself.
> >
> > Thanks,
> >
> > Arnold
> >
> > <address@hidden> wrote:
> >
> > > The actual use case here is for CSV files with column header lines.
> > > At BEGINFILE or FNR == 1 time I would like to assign the column header
> > > values (checked for valid variable name format first) as real gawk
> > > variable names and assign the column number as their value.
> > >
> > > Assuming a sample CSV like this:
> > >
> > > Name,Desc,Amount
> > > Harry,Item # 1,-30
> > > Jeffery,"Groups, Pairs, and stuff",46
> > >
> > > I would like to be able to write gawk code like the following
> > > (assuming FPAT has been set to deconstruct CSV records and without a
> > > check for variable name validity to keep it simpler):
> > >
> > > FNR == 1 { for (i = 1; i <= NF; i++) SYMTAB[$i] = i }
> > > $Name ~ /^H/ { process H.* records }
> > > $Amount < 0 { process negative amount records }
> > >
> > > Obviously with Tom's suggestion this can be coded today as:
> > >
> > > FNR == 1 { for (i = 1; i <= NF; i++) mem[$i] = i }
> > > $(mem["Name"]) ~ /^H/ { process H.* records }
> > > $(mem["Amount"]) < 0 { process negative amount records }
> > >
> > > But that alternative is more typing, IMHO much less clear, and much
> > > more prone to typing mistakes while coding.
> > >
> > > If this is insufficient reason to open up assignment to SYMTAB I will
> > > accept your decision, but that is what I was trying to accomplish with
> > > the patch I submitted.
> > >
> > > Alternatively, could (a) builtin function(s) be supplied to DTRT to
> > > create dynamic variables and assign values to them? E.G.,
> > > create_var($i[,value]) and/or assign_var($i,value).
> > >
> > > Peter
> > >
> > > > -----Original Message-----
> > > > From: address@hidden <address@hidden>
> > > > Sent: Wednesday, October 16, 2019 11:14 AM
> > > > To: address@hidden; address@hidden;
> > > > address@hidden
> > > > Cc: address@hidden
> > > > Subject: Re: [bug-gawk] gawk 5.0.1 patch to allow *valid* awk
> > > > variable names to be assigned to SYMTAB
> > > >
> > > > Peter,
> > > >
> > > > I have to agree with Tom here. From the snippet you sent, it doesn't
> > > > look like there's a significant advantage to your using SYMTAB.
> > > >
> > > > Arnold
> > > >
> > > > Tom Gray <address@hidden> wrote:
> > > >
> > > > > Hi Peter,
> > > > >
> > > > > Have you considered using your own array instead of SYMTAB[]? Let's
> > > > > call it mem[] ... to represent some generic storage space.
> > > > > Then the index will not have any "nice variable name" restrictions.
> > > > >
> > > > > FNR == 1 { for (i = 1; i <= NF; i++) { mem[$i] = computed_value } }
> > > > >
> > > > > if ((mem[MyVar] > x) && (mem[MyVar] < y)) { do-something }
> > > > >
> > > > > In my opinion gawk's jagged arrays of arrays are the best thing since
> > > > > sliced bread. Combined with recursion and indirect function calls you
> > > > > get incredible power.
> > > > >
> > > > > Do not underestimate the power of portability. If you write something
> > > > > cool, you want others to be able to use it.
> > > > >
> > > > > Tom
> <Remainder of original chain snipped for brevity>
> --
>
--
Andrew Schorr e-mail: address@hidden
Telemetry Investments, L.L.C. phone: 917-305-1748
545 Fifth Ave, Suite 1108 fax: 212-425-5550
New York, NY 10017-3630