[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[bug#49946] [PATCH v7 01/32] gnu: tree-sitter: Move to its own module.
From: |
Pierre Langlois |
Subject: |
[bug#49946] [PATCH v7 01/32] gnu: tree-sitter: Move to its own module. |
Date: |
Fri, 17 Feb 2023 12:38:05 +0000 |
User-agent: |
mu4e 1.8.13; emacs 28.2 |
Hi,
Andrew Tropin <andrew@trop.in> writes:
> [[PGP Signed Part:Undecided]]
> On 2023-02-12 12:07, Pierre Langlois wrote:
>
>> Hi,
>>
>> Andrew Tropin <andrew@trop.in> writes:
>>
>>> [[PGP Signed Part:Undecided]]
>>> On 2023-02-10 15:48, Pierre Langlois wrote:
>>>
>>>> Hi Andrew, thanks for pushing this along! It's great to see things
>>>> getting merged.
>>>>
>>>> Andrew Tropin <andrew@trop.in> writes:
>>>>
>>>>> [[PGP Signed Part:Undecided]]
>>>>> On 2023-02-09 18:04, Andrew Tropin wrote:
>>>>>
>>>>>> On 2023-02-09 13:39, zimoun wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> On Thu, 09 Feb 2023 at 14:11, Andrew Tropin <andrew@trop.in> wrote:
>>>>>>>
>>>>>>>> I applied tree-sitter and tree-sitter-cli patches,
>>>>>>>
>>>>>>> Just to be sure to understand, you have only applied 02/32 and 05/32,
>>>>>>> right?
>>>>>>>
>>>>>>>
>>>>>>> [bug#49946] [PATCH v7 02/32] gnu: tree-sitter: Update to 0.20.7.
>>>>>>> id:20221125012142.22579-3-pierre.langlois@gmx.com
>>>>>>> http://issues.guix.gnu.org/msgid/20221125012142.22579-3-pierre.langlois@gmx.com
>>>>>>>
>>>>>>> [bug#49946] [PATCH v7 05/32] gnu: Add tree-sitter-cli.
>>>>>>> id:20221125012142.22579-6-pierre.langlois@gmx.com
>>>>>>> http://issues.guix.gnu.org/msgid/20221125012142.22579-6-pierre.langlois@gmx.com
>>>>>>>
>>>>>>> Leaving out all the others, right?
>>>>>>
>>>>>> Merged first 5 patches from 01 to 05, also added one more commit, which
>>>>>> addresses some things from reviews and one commit, which adds html
>>>>>> grammar.
>>>>>>
>>>>>> The html grammar is added for the testing purposes. It relies on
>>>>>> generated parser.c and scanner.c and we will need to repackage it using
>>>>>> grammar.js instead. I'm not sure if a separate build system is needed
>>>>>> for this, I guess we can just rewrite tree-sitter-grammar function,
>>>>>> which generates packages as in example with tree-sitter-grammar-html:
>>>>>> https://git.savannah.gnu.org/cgit/guix.git/tree/gnu/packages/tree-sitter.scm?h=53b00b91b73bd60412d5bd057e22e6d63194a7f7#n158
>>>>>>
>>>>>> Anyway, I only skimmed tree-sitter-build-system source code, and plan to
>>>>>> read it carefully, evaluate and either introduce new build system or
>>>>>> just move all needed parts to tree-sitter-grammar function. WDYT?
>>>>>> After we done with it we can package all other grammars.
>>>>>
>>>>> Ok, I realized that the proper build process for tree-sitter grammars is
>>>>> a little harder than I expected, tree-sitter-build system make sense. I
>>>>> reviewed it, made a small change:
>>>>
>>>> Ah great, I was going to comment to try and push for us to keep the
>>>> build system. I originally went with a template package and inheritance,
>>>> but Maxime suggested moving to a build-system which ended up making the
>>>> package definitions a *lot* nicer IMO (see previous discussion here
>>>> https://issues.guix.gnu.org/49946#144). It also allows us to deal with
>>>> grammars that depend on each other more nicely I think.
>>>>
>>>>>
>>>>> @@ -29,7 +29,7 @@ (define-module (guix build tree-sitter-build-system)
>>>>> ;; Commentary:
>>>>> ;;
>>>>> ;; Build procedures for tree-sitter grammar packages. This is the
>>>>> -;; builder-side code, which builds on top fo the node build-system.
>>>>> +;; builder-side code, which builds on top of the node build-system.
>>>>> ;;
>>>>> ;; Tree-sitter grammars are written in JavaScript and compiled to a
>>>>> native
>>>>> ;; shared object. The `tree-sitter generate' command invokes `node' in
>>>>> order
>>>>> @@ -114,7 +114,7 @@ (define (compile-language dir)
>>>>> "-fno-exceptions"
>>>>> "-O2"
>>>>> "-g"
>>>>> - "-o" ,(string-append lib "/" lang ".so")
>>>>> + "-o" ,(string-append lib "/libtree-sitter-" lang
>>>>> ".so")
>>>>> ;; An additional `scanner.{c,cc}' file is sometimes
>>>>> ;; provided.
>>>>> ,@(cond
>>>>>
>>>>>
>>>>> rewrote html grammar to use this build system and made it work with
>>>>> built-in treesit package. Also, tried examples of c and cpp grammars
>>>>> from patches in this thread.
>>>>>
>>>>> If you ok with it, I'll push the build system to master and update the
>>>>> html grammar accordingly.
>>>>>
>>>>> The final result will look like this:
>>>>>
>>>>> (define tree-sitter-delete-generated-files
>>>>> #~(begin
>>>>> (delete-file "binding.gyp")
>>>>> (delete-file-recursively "bindings")
>>>>> (delete-file "src/grammar.json")
>>>>> (delete-file "src/node-types.json")
>>>>> (delete-file "src/parser.c")
>>>>> (delete-file-recursively "src/tree_sitter")))
>>>>>
>>>>> (define* (tree-sitter-grammar
>>>>> language language-for-synopsis version commit hash
>>>>> #:key
>>>>> (repository-url
>>>>> (format #f "https://github.com/tree-sitter/tree-sitter-~a"
>>>>> language))
>>>>> (inputs '()))
>>>>> (let ((synopsis (string-append language-for-synopsis
>>>>> " grammar for tree-sitter"))
>>>>> (name (string-append "tree-sitter-grammar-" language)))
>>>>> (package
>>>>> (name name)
>>>>> (version version)
>>>>> (home-page repository-url)
>>>>> (source (origin
>>>>> (method git-fetch)
>>>>> (uri (git-reference
>>>>> (url repository-url)
>>>>> (commit commit)))
>>>>> (file-name (git-file-name name version))
>>>>> (sha256 (base32 hash))
>>>>> (modules '((guix build utils)))
>>>>> (snippet tree-sitter-delete-generated-files)))
>>>>> (build-system tree-sitter-build-system)
>>>>> (inputs inputs)
>>>>> (synopsis synopsis)
>>>>> (description (string-append synopsis "."))
>>>>> (license license:expat))))
>>>>>
>>>>> (define-public tree-sitter-grammar-html
>>>>> (tree-sitter-grammar
>>>>> "html" "HTML"
>>>>> "0.19.0" "v0.19.0"
>>>>> "1hg7vbcy7bir6b8x11v0a4x0glvqnsqc3i2ixiarbxmycbgl3axy"))
>>>>>
>>>>> After that we can bring the rest of the grammars.
>>>>
>>>> I would suggest to rmeove the `tree-sitter-grammar' function, and keep
>>>> grammars as "regular" package records, even though it's a little bit
>>>> more verbose:
>>>>
>>>> --8<---------------cut here---------------start------------->8---
>>>> (define-public tree-sitter-html
>>>> (package
>>>> (name "tree-sitter-html")
>>>
>>> It seems tree-sitter-html mimics upstream package name and probably make
>>> more sense than tree-sitter-grammar-html used by me.
>>
>> Yeah, at some point I think I had named the packages with "grammar" as
>> well, but thought it was a bit of a mouthful. I'm also thinking one day
>> we may build language bindings as part of the build system (Rust and
>> NodeJS I think ATM), so those packages could do more than ship the
>> grammar in the future (although we don't know if we'll ever really need
>> that).
>>
>>>
>>>> (version "0.19.0")
>>>> (source (origin
>>>> (method git-fetch)
>>>> (uri (git-reference
>>>> (url "https://github.com/tree-sitter/tree-sitter-html")
>>>> (commit (string-append "v" version))))
>>>> (file-name (git-file-name name version))
>>>> (sha256
>>>> (base32
>>>> "1hg7vbcy7bir6b8x11v0a4x0glvqnsqc3i2ixiarbxmycbgl3axy"))
>>>> (modules '((guix build utils)))
>>>> (snippet tree-sitter-delete-generated-files)))
>>>> (build-system tree-sitter-build-system)
>>>> (home-page "https://github.com/tree-sitter/tree-sitter-html")
>>>> (synopsis "Tree-sitter HTML grammar")
>>>> (description
>>>> "This package provides a HTML grammar for the Tree-sitter library.")
>>>> (license license:expat)))
>>>> --8<---------------cut here---------------end--------------->8---
>>>>
>>>> This way, they look like any other package in Guix, which makes it
>>>> easier for us to apply automatic changes in the future if needed (for
>>>> example like how the input format could be automically updated for all
>>>> "simple" package definitions, but had to be manual whenever custom code
>>>> refactoring was done). Does this make sense?
>>>
>>> Make sense, but on the other hand we already have hunspell, aspell
>>> dictionaries and probably a few more others, which are very similiar in
>>> spirit and we already have to keep in mind their existence on such
>>> automatic code updates.
>>>
>>> It looks that the packages differ only in url for the source code, lang
>>> name and sometimes in inputs. Having template package function can make
>>> management of shared parts more centralized, reduce possibility of
>>> copy-paste mistakes, when the description wasn't updated and so on and
>>> can reduce the amount of the code overall (which also reduces the change
>>> of introducing an error).
>>>
>>> I don't have a strong opinion on this topic, but leaning towards the
>>> template function slightly more, however I'm completely ok with the
>>> standalone package definitions as well. WDYT?
>>
>> I can think of both cost/benefits to the template so I don't have a
>> strong opinion either :-).
>>
>> I do like the template to make sure people don't forget to delete
>> generated files, that's quite important as it seems upstream packages
>> often check-in the generated C code. Although, we could probably assert
>> that with in the build-system phase? I'll think about that.
>>
>> On the other hand, I wonder how the template works for packages that
>> provide multiple grammars (see ocaml and typescript for example). I
>> guess we could use the template for trivial packages, and standalone
>> definitions for more complex ones? In general, if we keep the template
>> interface really simple, then I'm happy with it.
>
> Hi Pierre!
>
> I spend two days trying grammars with and without helper function and
> found hepler quite helpful to reduce boilerplate and errors from
> copypaste, so I went the way with helper. The logic inside is quite
> trivial, the only downside I found so far is that in cases when
> repository url constructed automatically I can't easily open the repo
> url in the browser.
>
> I packaged all the grammars from this thread and a few more on top of
> it. Updated them to usually latest versions, added some comments, when
> needed.
>
> If I forgot to reply on something or you have any comments/ideas, let me
> know! :)
>
> Kudos to Pierre and everyone, who helped with all the tree-sitter stuff.
Thank you for landing all this work!
signature.asc
Description: PGP signature
- [bug#49946] [PATCH v7 01/32] gnu: tree-sitter: Move to its own module., (continued)
- [bug#49946] [PATCH v7 01/32] gnu: tree-sitter: Move to its own module., zimoun, 2023/02/09
- [bug#49946] [PATCH v7 01/32] gnu: tree-sitter: Move to its own module., Andrew Tropin, 2023/02/09
- [bug#49946] [PATCH v7 01/32] gnu: tree-sitter: Move to its own module., Andrew Tropin, 2023/02/10
- [bug#49946] [PATCH v7 01/32] gnu: tree-sitter: Move to its own module., Pierre Langlois, 2023/02/10
- [bug#49946] [PATCH v7 01/32] gnu: tree-sitter: Move to its own module., Pierre Langlois, 2023/02/10
- [bug#49946] [PATCH v7 01/32] gnu: tree-sitter: Move to its own module., Andrew Tropin, 2023/02/12
- [bug#49946] [PATCH v7 01/32] gnu: tree-sitter: Move to its own module., Pierre Langlois, 2023/02/12
- [bug#49946] [PATCH v7 01/32] gnu: tree-sitter: Move to its own module., Andrew Tropin, 2023/02/12
- [bug#49946] [PATCH v7 01/32] gnu: tree-sitter: Move to its own module., Pierre Langlois, 2023/02/12
- [bug#49946] [PATCH v7 01/32] gnu: tree-sitter: Move to its own module., Andrew Tropin, 2023/02/14
- [bug#49946] [PATCH v7 01/32] gnu: tree-sitter: Move to its own module.,
Pierre Langlois <=
- [bug#49946] [PATCH v7 01/32] gnu: tree-sitter: Move to its own module., Pierre Langlois, 2023/02/10
- [bug#49946] [PATCH v7 01/32] gnu: tree-sitter: Move to its own module., Andrew Tropin, 2023/02/12
- [bug#49946] [PATCH v7 01/32] gnu: tree-sitter: Move to its own module., Pierre Langlois, 2023/02/12