branch release/7.0 updated: Non-XS optimizations
From: Gavin D. Smith
Subject: branch release/7.0 updated: Non-XS optimizations
Date: Sun, 12 Feb 2023 13:09:47 -0500

This is an automated email from the git hooks/post-receive script.
gavin pushed a commit to branch release/7.0
in repository texinfo.
The following commit(s) were added to refs/heads/release/7.0 by this push:
new 6e59c7d0a0 Non-XS optimizations
6e59c7d0a0 is described below
commit 6e59c7d0a048d99947c67fa62613abab36d172d4
Author: Gavin Smith <gavinsmith0123@gmail.com>
AuthorDate: Sun Feb 12 18:09:38 2023 +0000

    Non-XS optimizations

    * tp/Texinfo/Convert/ParagraphNonXS.pm (_add_next, add_text):
    Add /o flag to regexes using string interpolation, in order to
    only compile each regex once.
    * tp/Texinfo/Convert/Unicode.pm (string_width):
    Sometimes call 'length' on the string.  This makes a difference
    because string_width is now called from ParagraphNonXS.pm.
---
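Note on the /o change: the following is a minimal Perl sketch, not taken
from ParagraphNonXS.pm, of what /o does to a pattern that interpolates a
variable. The helper names ends_in_set_o and ends_in_set are hypothetical.
With /o the variable is interpolated and the pattern compiled the first
time the match is executed, and later values of the variable are ignored.
That is only safe when the interpolated variables never change after first
use, which is presumably the case for $end_sentence_character and
$after_punctuation_characters.

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration only; not part of Texinfo.

# With /o: $set is interpolated and the pattern compiled on first use only.
sub ends_in_set_o {
    my ($word, $set) = @_;
    return $word =~ /[$set]$/o ? 1 : 0;
}

# Without /o: $set is interpolated again on every call.
sub ends_in_set {
    my ($word, $set) = @_;
    return $word =~ /[$set]$/ ? 1 : 0;
}

my @results = (
    ends_in_set_o('end.', '.?!'),   # first call fixes the pattern as /[.?!]$/
    ends_in_set_o('end,', ','),     # new $set is ignored because of /o
    ends_in_set('end,', ','),       # pattern rebuilt from the new $set
);
print "@results\n";
```

The second call shows the trade-off: /o saves the repeated interpolation,
at the cost of silently freezing the pattern.
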
ChangeLog | 11 +++++++++++
tp/Texinfo/Convert/ParagraphNonXS.pm | 4 ++--
tp/Texinfo/Convert/Unicode.pm | 8 ++++++++
3 files changed, 21 insertions(+), 2 deletions(-)
diff --git a/ChangeLog b/ChangeLog
index c43aaa29ec..43b0f47d42 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,14 @@
+2023-02-12 Gavin Smith <gavinsmith0123@gmail.com>
+
+ Non-XS optimizations
+
+ * tp/Texinfo/Convert/ParagraphNonXS.pm (_add_next, add_text):
+ Add /o flag to regexes using string interpolation, in order to
+ only compile each regex once.
+ * tp/Texinfo/Convert/Unicode.pm (string_width):
+ Sometimes call 'length' on the string. This makes a difference
+ because string_width is now called from ParagraphNonXS.pm.
+
2023-02-12 Gavin Smith <gavinsmith0123@gmail.com>
* tp/Texinfo/ParserNonXS.pm (_process_remaining_on_line):
diff --git a/tp/Texinfo/Convert/ParagraphNonXS.pm b/tp/Texinfo/Convert/ParagraphNonXS.pm
index 1cb442846b..19add4f402 100644
--- a/tp/Texinfo/Convert/ParagraphNonXS.pm
+++ b/tp/Texinfo/Convert/ParagraphNonXS.pm
@@ -213,7 +213,7 @@ sub _add_next($;$$$)
$paragraph->{'last_char'} = 'a';
} elsif ($word =~
/([^$end_sentence_character$after_punctuation_characters])
- [$end_sentence_character$after_punctuation_characters]*$/x) {
+ [$end_sentence_character$after_punctuation_characters]*$/ox) {
# Save the last character in $word before punctuation
$paragraph->{'last_char'} = $1;
}
@@ -406,7 +406,7 @@ sub add_text($$)
and $tmp =~
/(^|[^\p{Upper}$after_punctuation_characters$end_sentence_character])
[$after_punctuation_characters]*[$end_sentence_character]
- [$end_sentence_character\x08$after_punctuation_characters]*$/x) {
+ [$end_sentence_character\x08$after_punctuation_characters]*$/ox) {
if ($paragraph->{'frenchspacing'}) {
$paragraph->{'end_sentence'} = -1;
} else {
diff --git a/tp/Texinfo/Convert/Unicode.pm b/tp/Texinfo/Convert/Unicode.pm
index 3fcc4b307a..9b7b235c22 100644
--- a/tp/Texinfo/Convert/Unicode.pm
+++ b/tp/Texinfo/Convert/Unicode.pm
@@ -1566,6 +1566,14 @@ sub string_width($)
{
my $string = shift;
+ # Optimise for the common case where we can just return the length
+ # of the string. These regexes are faster than making the substitutions
+ # below.
+ if ($string =~ /^[\p{IsPrint}\p{IsSpace}]*$/
+ and $string !~ /[\p{InFullwidth}\pM]/) {
+ return length($string);
+ }
+
$string =~ s/\p{InFullwidth}/\x{02}/g;
$string =~ s/\pM/\x{00}/g;
$string =~ s/\p{IsPrint}/\x{01}/g;
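
For illustration, a self-contained sketch of the fast-path idea in
string_width, under stated assumptions. \p{InFullwidth} is defined inside
Unicode.pm and is not available here, so the sketch substitutes the
standard East_Asian_Width=Wide and East_Asian_Width=Fullwidth properties.
width_slow is a hypothetical per-character stand-in for the
substitution-based code, not the real implementation, and width is a
hypothetical wrapper that adds the early return.

```perl
use strict;
use warnings;

# Assumption: stand-in for Texinfo's own \p{InFullwidth} property.
my $wide = qr/[\p{East_Asian_Width=Wide}\p{East_Asian_Width=Fullwidth}]/;

# Hypothetical slow path: classify each character separately.
sub width_slow {
    my $string = shift;
    my $width = 0;
    for my $char (split //, $string) {
        if ($char =~ $wide) {
            $width += 2;            # wide characters take two columns
        } elsif ($char =~ /\pM/) {
            # combining marks take no column
        } elsif ($char =~ /[\p{IsPrint}\p{IsSpace}]/) {
            $width += 1;
        }
    }
    return $width;
}

# Fast path: if every character is printable or space, and none is wide
# or a combining mark, the width is just the character count.
sub width {
    my $string = shift;
    if ($string =~ /^[\p{IsPrint}\p{IsSpace}]*$/
        and $string !~ /[\p{East_Asian_Width=Wide}\p{East_Asian_Width=Fullwidth}\pM]/) {
        return length($string);
    }
    return width_slow($string);
}

my @samples = ("hello world", "\x{65E5}\x{672C}", "e\x{0301}");
printf "%d %d %d\n", map { width($_) } @samples;
```

Plain ASCII text, the common case in a paragraph formatter, costs two
regex matches and one length call, while strings with wide or combining
characters still go through the slower classification.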