[www_shared] branch master updated: extract only article content


From: gnunet
Subject: [www_shared] branch master updated: extract only article content
Date: Tue, 11 May 2021 19:18:26 +0200

This is an automated email from the git hooks/post-receive script.

dold pushed a commit to branch master
in repository www_shared.

The following commit(s) were added to refs/heads/master by this push:
     new 2b72c7f  extract only article content
2b72c7f is described below

commit 2b72c7f57d318271856f992eb2e58c133ae5179e
Author: Florian Dold <florian@dold.me>
AuthorDate: Tue May 11 19:16:04 2021 +0200

    extract only article content
---
 sitegen/site.py | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/sitegen/site.py b/sitegen/site.py
index f9d3e7d..2148763 100644
--- a/sitegen/site.py
+++ b/sitegen/site.py
@@ -65,14 +65,14 @@ def cut_text(filename, count):
         return textreduced
 
 
-def extract_body(text):
+def extract_body(text, content_id="newspost-content"):
     """Extract the body of some HTML and
     return it wrapped in an <article> tag."""
     soup = BeautifulSoup(text, features="lxml")
-    bs = soup.findAll("body")
-    b = bs[0]
-    b.name = "article"
-    return b.prettify()
+    content = soup.find(id=content_id)
+    if content is None:
+        raise Error("can't extract content")
+    return content.prettify()
 
 
 def make_helpers(root, in_file, locale):
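
For readers who want to try the new lookup outside the repository, the following is a minimal standalone sketch of the patched function, not the code as committed. Two details are assumptions made for this sketch: it raises ValueError, because the Error name used in the diff is not a Python builtin and its definition in site.py is not shown here, and it passes "html.parser" instead of "lxml" so that no extra parser package is needed. The sample page is invented for illustration.

```python
from bs4 import BeautifulSoup


def extract_body(text, content_id="newspost-content"):
    """Return the prettified element whose id equals content_id."""
    # Assumption: "html.parser" replaces the commit's "lxml" so the
    # sketch runs without the lxml package installed.
    soup = BeautifulSoup(text, features="html.parser")
    content = soup.find(id=content_id)
    if content is None:
        # Assumption: ValueError stands in for the commit's Error,
        # whose definition is not part of the diff.
        raise ValueError(f"can't extract content: no element with id {content_id!r}")
    return content.prettify()


# Hypothetical input page, used only to exercise the function.
page = """
<html><body>
  <nav>site menu</nav>
  <div id="newspost-content"><p>Hello</p></div>
</body></html>
"""

article = extract_body(page)
print(article)
```

With the rename to "article" removed by this commit, the returned markup keeps the matched element's own tag name, and anything outside that element (the nav in the sample) is dropped.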

-- 
To stop receiving notification emails like this one, please contact
gnunet@gnunet.org.


