guix-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Accuracy of importers?


From: Ricardo Wurmus
Subject: Re: Accuracy of importers?
Date: Thu, 28 Oct 2021 12:25:56 +0000
User-agent: mu4e 1.6.6; emacs 27.2


Ludovic Courtès <ludovic.courtes@inria.fr> writes:

Hello Guix!

As I’m preparing my PackagingCon talk and wondering how language package managers could make our lives easier, I thought it’d be interesting to
know how well our importers are doing.

My understanding is that most of them require manual intervention—i.e.,
one has to tweak what ‘guix import’ produces, even if we ignore
synopsis/description/license, to set the right inputs, etc. If we were to estimate the fraction of imported packages for which manual changes
are needed, what would it look like?

   importer     fraction of imported packages needing changes
[…]
cran 5% (Ricardo? Simon? seems to almost always work?)

Like Lars and Simon wrote: the importers work *really* well for both CRAN and Bioconductor, so much so that I’m using them in the background here:

https://git.elephly.net/gitweb.cgi?p=software/r-guix-install.git;a=blob;f=guix-install.R;h=2766aa1f2d248a8ed2a4eb4c3244b85574d326e2;hb=HEAD

The biggest annoyance is the missing “license:” prefix when packaging things for gnu/packages/cran.scm or gnu/packages/bioconductor.scm. Descriptions need regular clean- up work (e.g. to complete sentences), even though we’re using some heuristics to fix the most common stylistic problems. It’s really not a big deal, though.

The biggest missing feature is recursive import of dependencies hosted on Github or Mercurial (with “-r -a git” or “-r -a hg”). I.e. a package on Github that declares a dependency on another package that’s also only hosted on Github will fail to import that dependency. This is pretty rare, but it happens with experimental bioinfo software.

   texlive      (Ricardo? Thiago? Marius?)

This one is not usable. I’d even add “at all”. I keep announcing that one day I’ll replace it with a new importer, but that new importer just isn’t ready yet.

What about licensing info: which ones provide accurate licensing info?
My guess:

   gnu
   pypi
   cpan
   cran

The CRAN importer is as accurate as upstream allows. CRAN requires a free license, Bioconductor requires a license declaration (there have been very few cases where the license was not correct, but a number of cases where the license was non-free, such as the Artistic 1.0 license. Bioconductor sometimes is sneaky and the R code is free but a necessary library is not.

   texlive

Pretty terrible. The license declaration is generally too vague. Licenses are often declared without version number, and sometimes it’s just some generic “free” license. A new importer based on texlive.tlpdb would not improve this by much, because the upstream declarations are just spotty and unreliable.

--
Ricardo


PS: attached is a rough WIP patch of what I had been using to import new texlive stuff.

diff --git a/guix/import/texlive.scm b/guix/import/texlive.scm
index 18d8b95ee0..b94aa1cf40 100644
--- a/guix/import/texlive.scm
+++ b/guix/import/texlive.scm
@@ -19,10 +19,12 @@
 
 (define-module (guix import texlive)
   #:use-module (ice-9 match)
+  #:use-module (ice-9 rdelim)
   #:use-module (sxml simple)
   #:use-module (sxml xpath)
   #:use-module (srfi srfi-11)
   #:use-module (srfi srfi-1)
+  #:use-module (srfi srfi-2)
   #:use-module (srfi srfi-26)
   #:use-module (srfi srfi-34)
   #:use-module (web uri)
@@ -125,9 +127,9 @@ (define (fetch-sxml name)
       (xml->sxml (http-fetch url)
                  #:trim-whitespace? #t))))
 
-(define (guix-name component name)
+(define (guix-name name)
   "Return a Guix package name for a given Texlive package NAME."
-  (string-append "texlive-" component "-"
+  (string-append "texlive-"
                  (string-map (match-lambda
                                (#\_ #\-)
                                (#\. #\-)
@@ -186,12 +188,123 @@ (define (sxml-value path)
                      ((lst ...) `(list ,@lst))
                      (license license)))))))
 
+(define tlpdb
+  (memoize
+   (lambda ()
+     (let ((file "/home/rekado/dev/gx/branches/master/texlive.tlpdb")
+           (fields
+            '((name     . string)
+              (shortdesc . string)
+              (longdesc . string)
+              (catalogue-license . string)
+              (catalogue-ctan . string)
+              (srcfiles . list)
+              (runfiles . list)
+              (docfiles . list)
+              (depend   . list)))
+           (record
+            (lambda* (key value alist #:optional (type 'string))
+              (let ((new
+                     (or (and=> (assoc-ref alist key)
+                                (lambda (existing)
+                                  (cond
+                                   ((eq? type 'string)
+                                    (string-append existing " " value))
+                                   ((eq? type 'list)
+                                    (cons value existing)))))
+                         (cond
+                          ((eq? type 'string)
+                           value)
+                          ((eq? type 'list)
+                           (list value))))))
+                (acons key new (alist-delete key alist))))))
+       (call-with-input-file file
+         (lambda (port)
+           (let loop ((all (list))
+                      (current (list))
+                      (last-property #false))
+             (let ((line (read-line port)))
+               (cond
+                ((eof-object? line) all)
+
+                ;; End of record.
+                ((string-null? line)
+                 (loop (cons (cons (assoc-ref current 'name) current)
+                             all)
+                       (list) #false))
+
+                ;; Continuation of a list
+                ((and (zero? (string-index line #\space)) last-property)
+                 ;; Erase optional second part of list values like
+                 ;; "details=Readme" for files
+                 (let ((plain-value (first
+                                     (string-split
+                                      (string-trim-both line) #\space))))
+                   (loop all (record last-property
+                                     plain-value
+                                     current
+                                     'list)
+                         last-property)))
+                (else
+                 (or (and-let* ((space (string-index line #\space))
+                                (key   (string->symbol (string-take line 
space)))
+                                (value (string-drop line (1+ space)))
+                                (field-type (assoc-ref fields key)))
+                       ;; Erase second part of list keys like "size=29"
+                       (if (eq? field-type 'list)
+                           (loop all current key)
+                           (loop all (record key value current field-type) 
key)))
+                     (loop all current #false))))))))))))
+
+(define (files->directories files)
+  (map (cut string-join <> "/" 'suffix)
+       (delete-duplicates (map (lambda (file)
+                                 (drop-right (string-split file #\/) 1))
+                               files)
+                          equal?)))
+
+(define (tlpdb->package name)
+  (and-let* ((data (assoc-ref (tlpdb) name))
+             (dirs (files->directories
+                    (append (or (assoc-ref data 'docfiles) (list))
+                            (or (assoc-ref data 'runfiles) (list))
+                            (or (assoc-ref data 'srcfiles) (list))))))
+    (pk data)
+    ;; TODO
+    `(package
+       (name ,(guix-name name))
+       (version (number->string %texlive-revision))
+       (source (texlive-origin name version
+                               ',dirs
+                               (base32
+                                "TODO"
+                                #;
+                                ,(bytevector->nix-base32-string
+                                  (let-values (((port get-hash) 
(open-sha256-port)))
+                                    (write-file checkout port)
+                                    (force-output port)
+                                    (get-hash))))))
+       (build-system texlive-build-system)
+       (arguments ,`(,'quote (#:tex-directory "TODO")))
+       ,@(or (and=> (assoc-ref data 'depend)
+                    (lambda (inputs)
+                      `((propagated-inputs ,inputs))))
+             '())
+       ,@(or (and=> (assoc-ref data 'catalogue-ctan)
+                    (lambda (url)
+                      `((home-page ,(string-append "https://ctan.org"; url)))))
+             '((home-page "https://www.tug.org/texlive/";)))
+       (synopsis ,(assoc-ref data 'shortdesc))
+       (description ,(beautify-description
+                      (assoc-ref data 'longdesc)))
+       (license ,(string->license
+                  (assoc-ref data 'catalogue-license))))))
+
 (define texlive->guix-package
   (memoize
    (lambda* (package-name #:optional (component "latex"))
      "Fetch the metadata for PACKAGE-NAME from REPO and return the `package'
 s-expression corresponding to that package, or #f on failure."
-     (and=> (fetch-sxml package-name)
-            (cut sxml->package <> component)))))
+     (tlpdb->package package-name))))
 
 ;;; ctan.scm ends here

reply via email to

[Prev in Thread] Current Thread [Next in Thread]