help-gnu-emacs

Re: "split-sentences"?


From: tomas
Subject: Re: "split-sentences"?
Date: Sat, 23 Jan 2021 09:41:37 +0100
User-agent: Mutt/1.5.21 (2010-09-15)

On Sat, Jan 23, 2021 at 07:38:49AM +0100, moasenwood--- via Users list for the GNU Emacs text editor wrote:
> moasenwood--- via Users list for the GNU Emacs text editor wrote:
> 
> > Can I parse/split a string into sentences based on
> > human-language punctuation?
> >
> > Did anyone do that already?
> 
> I mean very mechanically is fine, no linguistics or anything.
> 
> So this
> 
> "'This sentence is spoken by Mr. W. E. B Dubois, Esq.!' played
> through amazon.com alexa speakers?"
>
> would be
> 
> ("'" "This sentence is spoken by Mr" "." "W" "." "E" "." "B
> Dubois" "," "Esq" "." "!" "'" "played through amazon" "."
> "com" "alexa "speakers" "?")

Not exactly your result, but this comes close:

  (split-string
    "'This sentence is spoken by Mr. W. E. B Dubois, Esq.!' played through amazon.com alexa speakers?"
    "[[:punct:]][[:space:]]*")

=>

  (""
   "This sentence is spoken by Mr"
   "W"
   "E"
   "B Dubois"
   "Esq"
   ""
   ""
   "played through amazon"
   "com alexa speakers"
   "")
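
The empty strings come from punctuation characters that sit right
next to each other (and from the punctuation at the very start and
end of the string): each separator match eats one punctuation
character plus any whitespace after it, so nothing is left between
two adjacent matches. A reduced example, using only the tail of the
sentence above:

```elisp
;; ".", "!" and "' " are three separate separator matches; the two
;; gaps between them are empty, hence the two "" in the result.
(split-string "Esq.!' played" "[[:punct:]][[:space:]]*")
;; => ("Esq" "" "" "played")
```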

You can adjust the results by tweaking the regexp (try word
boundaries like '\<' and '\>' if you want to keep punctuation)
or by using split-string's other optional parameters (e.g. a
non-nil OMIT-NULLS argument drops the empty strings).
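
A rough sketch of both adjustments follows. The names
my-sample-text and my-split-keep-punct are made up for
illustration and are not part of Emacs. The helper does not use
word boundaries; it swaps in a plain string-match loop that keeps
each punctuation character as its own element, which may or may
not be the tokenization you want.

```elisp
(require 'subr-x)  ; string-trim, string-empty-p

;; Hypothetical sample variable, only for this example.
(defconst my-sample-text
  "'This sentence is spoken by Mr. W. E. B Dubois, Esq.!' played through amazon.com alexa speakers?")

;; 1. Drop the empty strings: pass a non-nil OMIT-NULLS (third argument).
(split-string my-sample-text "[[:punct:]][[:space:]]*" t)
;; => ("This sentence is spoken by Mr" "W" "E" "B Dubois" "Esq"
;;     "played through amazon" "com alexa speakers")

;; 2. Keep the punctuation. Hypothetical helper, not an Emacs function.
(defun my-split-keep-punct (string)
  "Split STRING at punctuation, keeping each punctuation char as an element."
  (let ((start 0)
        (tokens '()))
    (while (string-match "[[:punct:]]" string start)
      ;; Save the match positions first: `string-trim' itself does
      ;; regexp matching and may clobber the match data.
      (let* ((beg (match-beginning 0))
             (end (match-end 0))
             (text (string-trim (substring string start beg))))
        ;; Text before the punctuation character, if any.
        (unless (string-empty-p text)
          (push text tokens))
        ;; The punctuation character itself.
        (push (substring string beg end) tokens)
        (setq start end)))
    ;; Whatever follows the last punctuation character.
    (let ((rest (string-trim (substring string start))))
      (unless (string-empty-p rest)
        (push rest tokens)))
    (nreverse tokens)))

(my-split-keep-punct my-sample-text)
;; => ("'" "This sentence is spoken by Mr" "." "W" "." "E" "."
;;     "B Dubois" "," "Esq" "." "!" "'" "played through amazon" "."
;;     "com alexa speakers" "?")
```

Note that "com alexa speakers" stays in one piece, since there is
no punctuation inside it to split on.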

Cheers
 - t
