help-gnu-emacs
convert whole website from iso-8859-2/1 to utf-8


From: Miroslav Rovis
Date: Fri, 09 Jul 2004 00:22:42 +0200
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7b) Gecko/20040406

No one has helped yet...
So I went scuba diving into the huge Emacs Lisp ocean... Breathe! Breathe!
Two days! First modest success...
----------------------------------------------------------------------
----------------------------------------------------------------------
An HTML file (schoolih.htm):
----------------------------------------------------------------------
<html>
<head>
<title>Virtualna Škola Supero!</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-2">
</head>

<body>
  <h3>Dobrodošli u Virtualnu školu Supero! </h3>
<p>Podučavam engleski, hrvatski i talijanski kao strane jezike. I u tome sam
    istinski vješt.</p>
  <p>Nudim također poduku početnika u informatici. I pisanju na Mreži.</p>
  <p>No ovo je ustvari virtualna škola. Nije &quot;prava&quot;.</p>
<p>Naravno, budem li ikad imao stvarno malo čedo u obliku institucije, tako će se zvati.</p>
  <p>Podučavam brojne đake u stvarnom životu već lijepi broj godina.</p>
</body>
</html>
----------------------------------------------------------------------
----------------------------------------------------------------------
My first Lisp program (latin2-utf8.sh):
(use at your own risk ;-)
----------------------------------------------------------------------
#!/usr/bin/emacs --script
;; latin2-utf8.sh -- write a UTF-8 copy of one ISO-8859-2 HTML file.

(defun report-state ()
  "Print the current buffer's name, file and coding-system settings."
  (princ (format "buffer-name is: %s\n" (buffer-name)))
  (princ (format "buffer-file-name is: %s\n" (buffer-file-name)))
  (princ (format "buffer-file-coding-system is: %s\n"
                 buffer-file-coding-system))
  (princ (format "coding-system-for-write is: %s\n\n"
                 coding-system-for-write)))

;; step: visit the source file (Emacs guesses its encoding on read)
(find-file "/test/schoolih.htm")
(report-state)
;; step: point the buffer at the output file, so the original stays untouched
(set-visited-file-name "/test/schoolih_u.htm")
;; step: update the charset declared in the <meta> tag, if there is one
(goto-char (point-min))
(when (search-forward "iso-8859-2" nil t)
  (replace-match "utf-8" nil t))
;; step: save, forcing UTF-8 for this one write only
(let ((coding-system-for-write 'utf-8))
  (report-state)
  ;; save-buffer takes no buffer argument; it saves the current buffer
  (save-buffer))
----------------------------------------------------------------------
(Of course, much of it, the diagnostic printing above all, is just me groping for solutions in territory entirely new to me, and can be cut out and forgotten.)
----------------------------------------------------------------------
----------------------------------------------------------------------
The program writes a file schoolih_u.htm which is identical in every
respect to schoolih.htm, except for:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
and the fact that these:

š č đ ž ć

characters are now in genuine utf-8 flavour!
(any other non-ASCII characters in the file would have been converted in
just the same fashion -- anyone interested is encouraged to try)...
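
To see what "genuine utf-8 flavour" means at the byte level, here is a minimal sketch that encodes one of those characters both ways and prints the resulting bytes in hex. The helper name my-hex-bytes is made up for this illustration; encode-coding-string is the standard Emacs function.

```elisp
;; Hypothetical helper, not part of the script above.
(defun my-hex-bytes (text coding-system)
  "Return the bytes of TEXT encoded in CODING-SYSTEM as a hex string."
  (mapconcat (lambda (byte) (format "%02X" byte))
             (encode-coding-string text coding-system)
             " "))

;; U+0161 is LATIN SMALL LETTER S WITH CARON, the first character listed.
(let ((s-caron (string #x161)))
  (princ (format "iso-8859-2: %s\n" (my-hex-bytes s-caron 'iso-8859-2)))
  (princ (format "utf-8:      %s\n" (my-hex-bytes s-caron 'utf-8))))
;; prints:
;; iso-8859-2: B9
;; utf-8:      C5 A1
```

One byte in ISO-8859-2 becomes two bytes in UTF-8, which is why the charset declaration and the bytes on disk have to change together.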

----------------------------------------------------------------------
----------------------------------------------------------------------
OK. Enough braggadocio (oh, I know how little and puny this is, I
know...).
This is a very small part of the whole project.
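
One way the single-file script might grow toward the whole site is sketched below. It is only a sketch under several assumptions: every page declares its encoding as charset=iso-8859-1 or charset=iso-8859-2 in a <meta> tag, pages end in .htm or .html, and Emacs is version 25 or later (for directory-files-recursively). The three function names are invented for this example. Unlike the script above, it overwrites files in place, so it should only be run on a copy of the site.

```elisp
;; All three function names are hypothetical, chosen for this sketch.
(defun my-declared-charset (file)
  "Return the iso-8859-1/2 charset FILE declares, as a symbol, or nil."
  (with-temp-buffer
    (insert-file-contents-literally file)   ; raw bytes, no decoding
    (goto-char (point-min))
    (let ((case-fold-search t))
      (when (re-search-forward "charset=\\(iso-8859-[12]\\)" nil t)
        (intern (downcase (match-string 1)))))))

(defun my-convert-file-to-utf8 (file)
  "Rewrite FILE as UTF-8 in place.  Return t if it was converted.
Files that do not declare iso-8859-1 or iso-8859-2 are left alone."
  (let ((charset (my-declared-charset file)))
    (when charset
      (with-temp-buffer
        ;; Decode using the charset the page itself declares.
        (let ((coding-system-for-read charset))
          (insert-file-contents file))
        ;; Update the declaration to match the new encoding.
        (goto-char (point-min))
        (let ((case-fold-search t))
          (when (re-search-forward "charset=iso-8859-[12]" nil t)
            (replace-match "charset=utf-8" t t)))
        ;; Encode as UTF-8 for this one write only.
        (let ((coding-system-for-write 'utf-8))
          (write-region (point-min) (point-max) file nil 'silent)))
      t)))

(defun my-convert-site-to-utf8 (root)
  "Convert every .htm/.html file below ROOT.  Return the converted files."
  (let (converted)
    (dolist (file (directory-files-recursively root "\\.html?\\'"))
      (when (my-convert-file-to-utf8 file)
        (push file converted)))
    (nreverse converted)))
```

Calling (my-convert-site-to-utf8 "/test/") would then process everything under /test/. Because a converted page declares utf-8 afterwards, a second run skips it rather than encoding it twice.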

Any help is still appreciated.

May all lispers stay well and healthy, esp. in their souls!
Miroslav Rovis
www.rovis.org

----------------------------------------------------------------------




