help-gnu-emacs

Re: Highlight saved, rendered HTML document


From: Jean Louis
Subject: Re: Highlight saved, rendered HTML document
Date: Wed, 9 Jun 2021 22:48:51 +0300
User-agent: Mutt/2.0.7+183 (3d24855) (2021-05-28)

* Julius Hamilton <julkhami@gmail.com> [2021-06-09 21:06]:
> Hello,
> 
> I would like to be able to highlight webpages offline, for better reading
> comprehension of them.

Hypothes.is Annotate the web, with anyone, anywhere.
https://web.hypothes.is/

That may be one of the best tools for annotation. It could be installed
on your computer.

More resources:

Open Annotation · GitHub
https://github.com/openannotation

Home - Annotator - Annotating the Web
http://annotatorjs.org/

A different solution is to save the HTML page as PDF and annotate the
PDF in Emacs (you said that works) or in the Evince PDF viewer.

Yet another solution is to convert the HTML to text and then annotate
it with annotate.el:

;; Author: Bastian Bechtold
;; Maintainer: Bastian Bechtold
;; URL: https://github.com/bastibe/annotate.el

Converting HTML to text is not hard; there are many tools for that,
including within Emacs.

$ elinks --dump https://www.example.com > example.txt

or

$ pandoc -f html -t plain https://www.example.com

> I recently discovered that for some reason, these tools do not work
> for downloaded pages being viewed in a browser. Maybe it's because
> they try to save the highlights in relation to each URL, and the
> downloaded pages don't have URLs.

Maybe this system could help?

Home | CollectiveAccess
https://collectiveaccess.org/

You may install CollectiveAccess on your computer and annotate
anything from the WWW. Demo:
https://demo.collectiveaccess.org/index.php/system/auth/login?redirect=https%3A%2F%2Fdemo.collectiveaccess.org%2Findex.php%2FDashboard%2FIndex

> I was wondering if anybody could recommend a way to highlight rendered HTML
> pages in Emacs. I know Emacs provides annotation tools for PDFs in
> pdf-tools mode, and highlighting plaintext in a certain highlighting mode.
> It seems likely that it should be possible for HTML pages too.
> 
> Just to be clear, I don't mean syntax highlighting HTML code, but rather
> moving a cursor through a web document to highlight information of
> interest.

With annotate.el I could annotate HTML that I had opened with
eww-open-file, using annotate-mode, but I could not save the
annotations. Now I am thinking it could, or should, be possible to
adapt it.

Cc: to Ihor as he may know the solution.

You can see how annotate.el works in the attached image, but I think
the annotation is too short or somehow limited when it is placed
straight in the text.

A good and simple way to annotate documents would be either GNU
Hyperbole or the `eev' package. I would then take the approach of
making buttons which I would highlight, so that I can quickly jump to
the annotation. Here is an example hyperlink to a text annotation file:
"/home/admin/tmp/annotations.txt"

Or a hyperlink to a specific line number:
"/home/admin/tmp/annotations.txt:2"

Or `eev' hyperlinks:

(find-fline "~/tmp/annotations.txt")

Or like the one below, which could annotate the paragraph and jump to
the annotations file, searching for "lorem ipsum"; it could also go to
a specific position. This assumes that the files are writeable.

(find-fline "~/tmp/annotations.txt" "lorem ipsum")

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec a diam
lectus. Sed sit amet ipsum mauris. Maecenas congue ligula ac quam
viverra nec consectetur ante hendrerit. Donec et mollis
dolor. 

I would take a programmatic, higher-level approach to annotations,
one that works with files but also with buffers not related to files,
such as values edited from a database. The approach would be similar
to the `eev' package and its function `find-fline': for read-only
files I would anchor annotations by line or by query, and for
writeable files by query only (which is prone to fail if things are
changed). A query or a line could even be highlighted later when a
mode is turned on, or it could become a button on the fly (Emacs
package button.el), and the data would be stored outside the file, in
the database object that refers to the file. That approach makes it a
little more visual.
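
To make the line-or-query idea concrete outside Emacs, here is a
minimal shell sketch. Everything in it is an assumption made for
illustration and not the database described in this message: the
tab-separated store, the ANNOT_DB variable and the annot_* function
names are all hypothetical. It only shows annotations kept outside the
annotated file and resolved either by a fixed line number or by
searching for a query string.

```shell
# Hypothetical external store, one annotation per line, tab-separated:
#   FILE <TAB> TYPE (line|query) <TAB> ANCHOR <TAB> NOTE
# Assumes that FILE, TYPE and ANCHOR are non-empty and contain no tabs.
ANNOT_DB="${ANNOT_DB:-$HOME/.annotations.tsv}"

# annot_add FILE TYPE ANCHOR NOTE: append one annotation to the store.
annot_add() {
    printf '%s\t%s\t%s\t%s\n' "$1" "$2" "$3" "$4" >> "$ANNOT_DB"
}

# annot_resolve FILE TYPE ANCHOR: print the line number the anchor points to.
# A "line" anchor is taken as is (only safe for files that do not change).
# A "query" anchor is searched as a fixed string; if the text is gone,
# nothing is printed and the function returns non-zero.
annot_resolve() {
    case "$2" in
        line)
            printf '%s\n' "$3"
            ;;
        query)
            n=$(grep -n -F -- "$3" "$1" | head -n 1 | cut -d: -f1)
            [ -n "$n" ] && printf '%s\n' "$n"
            ;;
        *)
            return 2
            ;;
    esac
}

# annot_list FILE: print every annotation for FILE as "FILE:LINE: NOTE",
# or "FILE:?: NOTE ..." when a query anchor no longer matches.
annot_list() {
    tab=$(printf '\t')
    while IFS="$tab" read -r file type anchor note; do
        [ "$file" = "$1" ] || continue
        if line=$(annot_resolve "$file" "$type" "$anchor"); then
            printf '%s:%s: %s\n' "$file" "$line" "$note"
        else
            printf '%s:?: %s (anchor "%s" not found)\n' "$file" "$note" "$anchor"
        fi
    done < "$ANNOT_DB"
}
```

A call such as

$ annot_add ~/tmp/page.txt query "lorem ipsum" "check this paragraph"

followed by

$ annot_list ~/tmp/page.txt

would print each note in the same file-plus-line shape as the
hyperlink to a line number shown earlier. The query anchor fails in
the way mentioned above: once the searched text is edited away, the
note is reported as not found instead of pointing at the wrong line.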

Right now I annotate any file or object by using database meta-level
attributes. For a file there is a description, internal description,
text, report, author and tags; all such pieces of information are
kept separate from the file, and so are not specific to parts of the
text, as I simply do not need it that fine-grained. I have 14000
objects that point to PDFs by page number. That is not annotation, but
it is similar, as I can jump from the description straight to the PDF
(or to files of any kind). I have already "annotated" this message and
can work on it further; it is stored offline though it is also online,
and jumping from the annotation to the offline or online version works
too.

-- 
Jean

Take action in Free Software Foundation campaigns:
https://www.fsf.org/campaigns

In support of Richard M. Stallman
https://stallmansupport.org/

⟦ (hyperscope 38467) ⟧


