Re: Proposal: Making datamash extendable
From: Tim Rice
Subject: Re: Proposal: Making datamash extendable
Date: Wed, 18 May 2022 21:25:32 +0000
Hey Shawn,
I like the idea of making datamash more easily extendable. On the other hand, I
have concerns about the performance hit of moving core functionality out to any
scripting language.
An idea that comes to mind is using something like Bash's dynamically-loadable
builtins. We could have it so that datamash is able to read extra object files
from a particular directory. Since they are dynamically linked after being
compiled, I believe (correct me if I'm wrong) they would or could be
language-agnostic. People could then write extensions with C, Fortran, or
whatever. Even assembly if that's the way they like to party :)
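To make the plugin-directory idea concrete, here is a rough sketch of the lookup step. It is written in Python with `ctypes` purely as a stand-in for the `dlopen()`/`dlsym()` calls a C implementation would make. The entry-point name `datamash_op`, its `double -> double` signature, and the `.so` suffix are assumptions for illustration, not anything datamash defines today.

```python
import ctypes
from pathlib import Path

ENTRY_POINT = "datamash_op"  # hypothetical symbol each plugin would export


def load_plugins(directory, symbol=ENTRY_POINT):
    """Map plugin name -> foreign function for every loadable *.so in directory."""
    operations = {}
    for lib_path in sorted(Path(directory).resolve().glob("*.so")):
        try:
            lib = ctypes.CDLL(str(lib_path))  # dlopen() under the hood on POSIX
            func = getattr(lib, symbol)       # symbol lookup; AttributeError if absent
        except (OSError, AttributeError):
            continue  # not a loadable library, or no entry point: skip it
        # Assumed signature: double datamash_op(double)
        func.restype = ctypes.c_double
        func.argtypes = [ctypes.c_double]
        operations[lib_path.stem] = func
    return operations
```

Under these assumptions, any language that can produce a shared object exporting that symbol with the C calling convention would work, e.g. a C file built with `cc -shared -fPIC add1.c -o add1.so`.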
Another option would be to do what Git does: a "core" program which basically
just searches `PATH` for a program named `git-<subcommand>` and passes the rest
of the arguments to it. This would make datamash very easy to extend, with
the main problem being that it would certainly break backwards compatibility in
heavy-handed ways.
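For reference, the Git-style lookup amounts to something like the following sketch (Python, just to show the mechanism). The `datamash-` prefix and the exit code 127 for an unknown operation are assumptions chosen to mirror Git and shell conventions.

```python
import shutil
import subprocess
import sys

PREFIX = "datamash-"  # hypothetical naming convention, mirroring Git's "git-"


def find_subcommand(name, path=None):
    """Return the full path of the external program for `name`, or None."""
    return shutil.which(PREFIX + name, path=path)


def dispatch(argv, path=None):
    """Run `datamash-<argv[0]>` with the remaining arguments; return its exit code."""
    if not argv:
        raise ValueError("no operation given")
    exe = find_subcommand(argv[0], path=path)
    if exe is None:
        print(f"datamash: unknown operation '{argv[0]}'", file=sys.stderr)
        return 127
    # stdin/stdout are inherited, so the subprogram sees the same data stream.
    return subprocess.run([exe, *argv[1:]]).returncode
```

With this layout, `dispatch(["add1", "1"])` would run an executable called `datamash-add1` found on `PATH` and hand it the argument `1`.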
If people do want to use scripting languages with datamash, our refactoring work for v2.0
could aim to establish a "libdatamash" which people could then create language
bindings for. Then datamash could be scripted not only with Guile or Tcl but also Python,
Perl, Ruby, Lua, etc., depending on who wants to create the bindings for their favorite
language.
Thoughts?
~ Tim
On Wed, May 18, 2022 at 05:52:46AM -0700, Shawn Wagner wrote:
(This is a datamash 2.0 idea)
Currently, adding a new operation is an annoying pain - you have to
touch 3 or 4 different source files, making sure the order of
different things all match up, etc.
I want to embed a scripting language in it so that if an unknown
operation is encountered, it can just load a source file that
implements it - and maybe rewrite some/all of the existing operations
to use this framework. It'll make for easier additions of new
features, and allow user-contributed ones without needing to patch and
recompile.
My preference for a language to use is Guile, since it's GNU's
official extension language and I'm quite fond of Scheme, with tcl a
close second. There are some who like lua for an embedded scripting
language, but they're silly people who should be treated kindly.
A simple example of what defining a new operation might look like:
(define-scalar add1 #:type 'numeric #:help "Add 1 to the value"
  (lambda (n) (+ n 1)))