Re: To fetch URL, extract <title> element?


From: Jean Louis
Subject: Re: To fetch URL, extract <title> element?
Date: Thu, 12 Nov 2020 16:20:46 +0300
User-agent: Mutt/2.0 (3d08634) (2020-11-07)

* Michael Heerdegen <michael_heerdegen@web.de> [2020-11-12 15:57]:
> Jean Louis <bugs@gnu.support> writes:
> 
> > > If I understand what you want correctly, eww seems to get the title with
> > > `eww-tag-title'
> >
> > That somehow sounds easier to do. To get HTML or any text is first
> > priority.
> 
> I also only had looked at the eww code.  Maybe Lars wants to help
> more.

Some hyperlinks are copied from a browser and inserted into Emacs.

- Such links have no title or annotation, but they need one. The
  title has to be fetched automatically, which is an expensive
  process. I would like to fetch only the headers.

- Some WWW links expire, so their status has to be updated from time
  to time.

- It then becomes possible for the user to mark hyperlinks and update
  the titles for all of them.
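
A minimal sketch of a headers-only check, assuming an HTTP HEAD
request is acceptable for this purpose. The names my-url-head,
my-parse-http-status and my-parse-http-header are hypothetical and
not part of Hyperscope. Note that the title itself lives in the
response body, so a HEAD request can only report the status and the
content type.

```elisp
(require 'url)

(defun my-url-head (url)
  "Send a HEAD request to URL and return the raw response headers.
Return nil if nothing could be retrieved within 10 seconds."
  (let* ((url-request-method "HEAD")   ; ask for headers only, no body
         (buffer (url-retrieve-synchronously url t t 10)))
    (when buffer
      (unwind-protect
          (with-current-buffer buffer (buffer-string))
        (kill-buffer buffer)))))

(defun my-parse-http-status (response)
  "Return the numeric status code of the raw HTTP RESPONSE, or nil."
  (when (string-match "\\`HTTP/[0-9.]+ \\([0-9]+\\)" response)
    (string-to-number (match-string 1 response))))

(defun my-parse-http-header (name response)
  "Return the value of header NAME in the raw HTTP RESPONSE, or nil."
  (let ((case-fold-search t))          ; header names are case-insensitive
    (when (string-match
           (concat "^" (regexp-quote name) ":[ \t]*\\([^\r\n]*\\)")
           response)
      (match-string 1 response))))
```

A status of 404 or 410 would suggest an expired link, and a
content type starting with text/html would suggest that fetching the
body for its title is worthwhile.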

I do not know how to use url-retrieve, but I found out how to fetch
synchronously, and for now this works, if not elegantly.

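(defun hyperscope-url-to-string (url)
  "Fetch URL and return the raw response, headers included, as a string."
  (let ((buffer (url-retrieve-synchronously url)))
    ;; `url-retrieve-synchronously' may return nil, e.g. when no data arrives.
    (if (null buffer)
        ""
      (unwind-protect
          (with-current-buffer buffer (buffer-string))
        (kill-buffer buffer)))))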
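
For comparison, a minimal sketch of the asynchronous variant with
url-retrieve, so that Emacs does not block while a page loads. The
names my-fetch-async and my-handle-response are hypothetical. The
caller's function is passed through the CBARGS argument, so the
sketch does not depend on lexical binding.

```elisp
(require 'url)

(defun my-handle-response (status callback)
  "Pass the response in the current buffer to CALLBACK as a string.
STATUS is the plist `url-retrieve' hands to its callback; if it
contains :error, CALLBACK receives nil instead."
  (let ((response (unless (plist-get status :error)
                    (buffer-string))))
    (kill-buffer (current-buffer))   ; the response buffer is no longer needed
    (funcall callback response)))

(defun my-fetch-async (url callback)
  "Fetch URL in the background, then call CALLBACK with the response string."
  ;; `url-retrieve' returns at once; the handler runs later, with the
  ;; response buffer current.  CALLBACK is passed through CBARGS.
  (url-retrieve url #'my-handle-response (list callback) t))
```

It could be called as (my-fetch-async url (lambda (response) ...)),
where the lambda extracts the title and stores it.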

(defun hyperscope-fetch-title (url)
  "Return the title of the page at URL, or URL itself if no title is found."
  (let ((string (hyperscope-url-to-string url))
        (case-fold-search t))
    ;; `[^<]*' also matches newlines, so a title spanning lines is found.
    (if (string-match "<title[^>]*>\\([^<]*\\)</title>" string)
        (match-string 1 string)
      url)))
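
A regexp can miss titles in unusual markup. A minimal sketch of a
parser-based alternative, assuming an Emacs built with libxml2 and
assuming HTML holds only the page body, with the HTTP headers already
removed. The name my-html-title is hypothetical. It reads the title
the same general way eww does, through the DOM.

```elisp
(require 'dom)
(require 'subr-x)   ; for `string-trim' on older Emacs versions

(defun my-html-title (html)
  "Return the text of the <title> element in the string HTML, or nil.
Return nil as well when libxml2 support is not available."
  (when (fboundp 'libxml-parse-html-region)
    (with-temp-buffer
      (insert html)
      (let* ((dom (libxml-parse-html-region (point-min) (point-max)))
             ;; `dom-by-tag' returns a list of all matching nodes.
             (title (car (dom-by-tag dom 'title))))
        (when title
          (string-trim (dom-text title)))))))
```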

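(defun hyperscope-fetch-title-for-url (id)
  "Fetch the title for the hyperlink with ID and store it as its name."
  (let* ((url (hlinks-link id))
         (title-or-url (hyperscope-fetch-title url)))
    (hlink-update-name-1 title-or-url id)))

(defun hyperscope-update-url-title ()
  "Update the title of the hyperlink on the current tabulated-list line."
  (interactive)
  (let ((id (tabulated-list-get-id)))
    (hyperscope-fetch-title-for-url id)))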

> > That will help in Hyperscope to automatically update WWW links with
> > their titles provided that content-type is HTML.
> 
> I'm curious: what exactly are you doing?  (I don't know Hyperscope but
> see that it's easy to find infos about it in the Internet.)

It is a DKR, or Dynamic Knowledge Repository:
https://www.dougengelbart.org/content/view/190/163/
https://en.wikipedia.org/wiki/Dynamic_knowledge_repository

Hyperscope is a browsing tool that enables most of the viewing and
navigating features called for in Doug Engelbart's open hyperdocument
system framework (OHS) to support dynamic knowledge repositories
(DKRs) and rising Collective IQ.
https://www.dougengelbart.org/content/view/154/86/

This HyperScope for Emacs is similar to it. It may grow into a large
index, or it can be used just for bookmarking simple things. It is a
collection of hyperlinks to anything. Like the Emacs bookmarking
system, it can link to any file, to a file by search, or to a file by
line number. It does not store its data as text, as it is backed by a
database.

The emacs-libpq dynamic module for the PostgreSQL database is coming
soon to GNU ELPA. When it arrives, maybe I will get a productive
version out as well.

As a result, it provides collective IQ, or easier access to the pieces
of information that a group may need in order to work more efficiently.






