bug-texinfo
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Further reducing special case code for XS structuring


From: Patrice Dumas
Subject: Re: Further reducing special case code for XS structuring
Date: Wed, 3 Apr 2024 23:20:28 +0200

On Wed, Apr 03, 2024 at 09:23:28PM +0100, Gavin Smith wrote:
> We previously discussed the interface of the XS modules and trying to
> improve their abstraction so that the same interface was used for both
> XS and non-XS code.  (Emails from 2024-02-23 "Better abstraction of
> Texinfo::Document interface regardless of XS".)  We made changes in
> this direction.
> 
> There is still code like this in multiple places in the code, e.g. in
> texi2any.pl:
> 
> # XS parser and not explicitely unset
> my $XS_structuring = ((not defined($ENV{TEXINFO_XS})
>                         or $ENV{TEXINFO_XS} ne 'omit')
>                        and (not defined($ENV{TEXINFO_XS_PARSER})
>                             or $ENV{TEXINFO_XS_PARSER} eq '1')
>                        and (not defined($ENV{TEXINFO_XS_STRUCTURE})
>                             or $ENV{TEXINFO_XS_STRUCTURE} ne '0'));
> 
> I think this makes the code harder to understand and is probably not
> necessary.  It will make the code harder to maintain in the future.
> (E.g., it could be something that makes new contributors give up.)

Some of those variables could be removed, but at least some of the
conversion variables need to stay in, and they need to be constructed
using similar conditions.  So I do not think that this kind of
constructs can be completly removed.

In Texinfo/Convert/HTML.pm, I think that those constructs need to stay in.
The $XS_structuring variable itself can be removed, but the corresponding
code would then need to be incorporated in the $XS_convert definition,
(maybe without TEXINFO_XS_STRUCTURE if the variable is removed).

> I tried to eliminate this, but haven't succeeded so far in the time I've
> spent on it.
> 
> I found the $XS_structuring variable was being passed to
> Texinfo::Parser::parse_texi_file, and the difference that this made was
> whether Texinfo::Document::build_document or Texinfo::Document::get_document
> was called (in 'parse_texi_file', in Texinfo/XS/parsetexi/Parsetexi.pm).
> If $XS_structuring was false, then it was 'build_document' that was called,
> which builds more of the document data in Perl structures.
> 
> I first tried fixing the value of this variable so that get_document would
> always be called:

This could theoretically work.  The current approach is simpler,
considering that with TEXINFO_XS_STRUCTURE set to 0, the perl structures
are needed so they are all built instead of relying on being built on
demand.

> I thought that the extra Perl structures were needed by the pure Perl
> module Texinfo/Structuring.pm.

Indeed, if TEXINFO_XS_STRUCTURE is set to 0.

> Access to these structures is through
> accessor functions in Texinfo/Document.pm; (e.g. 'nodes_list') however,
> the Perl structures are not built on demand in the pure Perl accessor
> functions.

Indeed, since the perl structures are in general needed, I did not try
to get them built from XS code, but they are obtained through
build_document.

>  To try to compensate for this, I enabled the XS overrides in
> Texinfo/Document.pm:

My first thought was that it does not acknowledge
TEXINFO_XS_STRUCTURE=0.  But it may also be a relevant path when the
XS parser is used and the Document is considered to be more in the
parser than with Structuring.

> diff --git a/tp/Texinfo/Document.pm b/tp/Texinfo/Document.pm
> index 4b8ceb1c99..76479499b1 100644
> --- a/tp/Texinfo/Document.pm
> +++ b/tp/Texinfo/Document.pm
> @@ -95,7 +95,7 @@ sub import {
>        for my $sub (keys %XS_overrides) {
>          Texinfo::XSLoader::override ($sub, $XS_overrides{$sub});
>        }
> -      if ($XS_structuring) {
> +      if ($XS_parser or $XS_structuring) {
>          for my $sub (keys %XS_structure_overrides) {
>            Texinfo::XSLoader::override ($sub, $XS_structure_overrides{$sub});
>          }
> 
> (Another reason that this seems like a good change to make is that there
> seems to be no reason that Texinfo/Document.pm should have a different
> implementation depending on the value of TEXINFO_XS_STRUCTURE.)

To me this is more a choice.  We can decide that TEXINFO_XS_STRUCTURE
extends to Document.pm or that Document.pm is governed by
TEXINFO_XS_PARSER instead.  Document.pm is between the two, where it
stands exactly is our choice.

> This does appear to make texi2any work with TEXINFO_XS_STRUCTURE=0, although
> there are various test failures that I haven't investigated properly
> It appears that the failures are to do with @item; e.g. for
> "converters_tests.t indices_in_begin_tables_lists_entries_after_item", the
> diff includes

This seems to me to be something worth investigating.  There could be a
logical reason why it does not work, but it is not so obvious to me,
normally the on-demand build of Perl structure is supposed to cover this
too.  It could point to a need to call $document->tree() somewhere
earlier or later.

> @@ -3848,15 +3846,13 @@ 
> $result_texis{'indices_in_begin_tables_lists_entries_after_item'} = '\\input 
> tex
>  
>  @itemize @minus
>  @c comment in itemize
> -@item 
>  @cindex also a cindex in itemize
> -e--mph item
> +@item e--mph item
>  @end itemize
>  
>  @itemize @bullet
> -@item 
>  @cindex index entry within itemize
> -i--tem 1
> +@item i--tem 1
>  @item @cindex index entry right after @@item
>  i--tem 2
>  @end itemize
> 
> - which appears to be the result of some tree transformation not being
> reflected.
> 
> I guess the main issue is with having a tree in both Perl and C and the
> Perl transformation not being reflected in the C tree structure which
> is then used for the output.

Seems right.

> So I think this approach is close to working if I can work out what
> is going on with the right version of the tree being used.  Hopefully,
> then much of the special-casing for TEXINFO_XS_STRUCTURE throughout the
> code for texi2any would no longer be needed.  Do you think this is right,
> or have I missed something?

It would be nice if such a setup worked correctly, but I am not sure
that it is "right", as it is not obvious to me that having Document.pm XS
used when TEXINFO_XS_STRUCTURE=0 is the right thing.

> Another option is to get rid of TEXINFO_XS_STRUCTURE completely.  As you
> may remember, we introduced this option after the last Texinfo release,
> when the new XS code had introduced a degree of slow-down and we were
> looking for ways to disable the new XS code while this could be fixed.
> Since this slow-down was in fact later fixed, it is not as necessary.  I
> expect that I would find it easy to remove TEXINFO_XS_STRUCTURE, while I
> am still not sure how easily I could get the approach above to work.

It would indeed be easier to remove TEXINFO_XS_STRUCTURE, but I think
that the condition should be kept in texi2any.pl, except that it would
only be on TEXINFO_XS and TEXINFO_XS_PARSER.

> Do you have any idea as to the best way to proceed?

I will try to understand why your approach did not work, as my feeling
is that it should have.  As I said above, it is not clear that it is a
good idea, but that is another issue.

I also still think that removing TEXINFO_XS_STRUCTURE would be ok, and
use XS or not for Structuring based on what is used for the parser.

-- 
Pat



reply via email to

[Prev in Thread] Current Thread [Next in Thread]