texinfo-commits

branch release/7.0 updated: Non-XS optimizations


From: Gavin D. Smith
Subject: branch release/7.0 updated: Non-XS optimizations
Date: Sun, 12 Feb 2023 13:09:47 -0500

This is an automated email from the git hooks/post-receive script.

gavin pushed a commit to branch release/7.0
in repository texinfo.

The following commit(s) were added to refs/heads/release/7.0 by this push:
     new 6e59c7d0a0 Non-XS optimizations
6e59c7d0a0 is described below

commit 6e59c7d0a048d99947c67fa62613abab36d172d4
Author: Gavin Smith <gavinsmith0123@gmail.com>
AuthorDate: Sun Feb 12 18:09:38 2023 +0000

    Non-XS optimizations
    
    * tp/Texinfo/Convert/ParagraphNonXS.pm (_add_next, add_text):
    Add /o flag to regexes using string interpolation, in order to
    only compile each regex once.
    * tp/Texinfo/Convert/Unicode.pm (string_width):
    Sometimes call 'length' on the string.  This makes a difference
    because string_width is now called from ParagraphNonXS.pm.
---
 ChangeLog                            | 11 +++++++++++
 tp/Texinfo/Convert/ParagraphNonXS.pm |  4 ++--
 tp/Texinfo/Convert/Unicode.pm        |  8 ++++++++
 3 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index c43aaa29ec..43b0f47d42 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,14 @@
+2023-02-12  Gavin Smith <gavinsmith0123@gmail.com>
+
+       Non-XS optimizations
+
+       * tp/Texinfo/Convert/ParagraphNonXS.pm (_add_next, add_text):
+       Add /o flag to regexes using string interpolation, in order to
+       only compile each regex once.
+       * tp/Texinfo/Convert/Unicode.pm (string_width):
+       Sometimes call 'length' on the string.  This makes a difference
+       because string_width is now called from ParagraphNonXS.pm.
+
 2023-02-12  Gavin Smith <gavinsmith0123@gmail.com>
 
        * tp/Texinfo/ParserNonXS.pm (_process_remaining_on_line):
diff --git a/tp/Texinfo/Convert/ParagraphNonXS.pm b/tp/Texinfo/Convert/ParagraphNonXS.pm
index 1cb442846b..19add4f402 100644
--- a/tp/Texinfo/Convert/ParagraphNonXS.pm
+++ b/tp/Texinfo/Convert/ParagraphNonXS.pm
@@ -213,7 +213,7 @@ sub _add_next($;$$$)
         $paragraph->{'last_char'} = 'a';
       } elsif ($word =~
            /([^$end_sentence_character$after_punctuation_characters])
-            [$end_sentence_character$after_punctuation_characters]*$/x) {
+            [$end_sentence_character$after_punctuation_characters]*$/ox) {
         # Save the last character in $word before punctuation
         $paragraph->{'last_char'} = $1;
       }
@@ -406,7 +406,7 @@ sub add_text($$)
           and $tmp =~
         /(^|[^\p{Upper}$after_punctuation_characters$end_sentence_character])
          [$after_punctuation_characters]*[$end_sentence_character]
-         [$end_sentence_character\x08$after_punctuation_characters]*$/x) {
+         [$end_sentence_character\x08$after_punctuation_characters]*$/ox) {
         if ($paragraph->{'frenchspacing'}) {
           $paragraph->{'end_sentence'} = -1;
         } else {
diff --git a/tp/Texinfo/Convert/Unicode.pm b/tp/Texinfo/Convert/Unicode.pm
index 3fcc4b307a..9b7b235c22 100644
--- a/tp/Texinfo/Convert/Unicode.pm
+++ b/tp/Texinfo/Convert/Unicode.pm
@@ -1566,6 +1566,14 @@ sub string_width($)
 {
   my $string = shift;
 
+  # Optimise for the common case where we can just return the length
+  # of the string.  These regexes are faster than making the substitutions
+  # below.
+  if ($string =~ /^[\p{IsPrint}\p{IsSpace}]*$/
+      and $string !~ /[\p{InFullwidth}\pM]/) {
+    return length($string);
+  }
+
   $string =~ s/\p{InFullwidth}/\x{02}/g;
   $string =~ s/\pM/\x{00}/g;
   $string =~ s/\p{IsPrint}/\x{01}/g;
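
Note on the /o flag used in the ParagraphNonXS.pm hunks: with /o, Perl interpolates the variables in a pattern and compiles it the first time the match is executed, then reuses that compiled pattern even if the variables later change. That is presumably safe here because the sentence-ending and after-punctuation character sets are not expected to change at run time. The following is a minimal sketch of that behaviour, not code from Texinfo; the helper names ends_with_any_o and ends_with_any are hypothetical and exist only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helpers (not part of Texinfo): report whether $word ends
# with one of the characters listed in $chars.

sub ends_with_any_o {
  my ($chars, $word) = @_;
  # /o: $chars is interpolated and the pattern compiled on first use only;
  # later calls reuse that pattern regardless of the value of $chars.
  return ($word =~ /[$chars]$/o) ? 1 : 0;
}

sub ends_with_any {
  my ($chars, $word) = @_;
  # No /o: the pattern follows the current value of $chars on every call.
  return ($word =~ /[$chars]$/) ? 1 : 0;
}

my @with_o    = (ends_with_any_o('.?!', 'end.'), ends_with_any_o('xyz', 'end.'));
my @without_o = (ends_with_any('.?!', 'end.'),   ends_with_any('xyz', 'end.'));

print "with /o:    @with_o\n";     # prints "with /o:    1 1"
print "without /o: @without_o\n";  # prints "without /o: 1 0"
```

The second call to ends_with_any_o still matches against the '.?!' set from the first call, which shows both the saving (no recompilation and no check for changed variables) and the constraint (the interpolated values must be effectively constant).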


