Re: [Help-smalltalk] [Q] Bug in EncodedStream?


From: Paolo Bonzini
Subject: Re: [Help-smalltalk] [Q] Bug in EncodedStream?
Date: Mon, 16 Oct 2006 10:21:29 +0200
User-agent: Thunderbird 1.5.0.7 (Macintosh/20060909)

Sungjin Chun wrote:
> Hi,
>
> When I run the following:
>
> (I18N.EncodedStream encoding: (UnicodeString fromString: '전성진'))
> contents !
>
> gst emits endless garbage-collection messages and then crashes
> with a segmentation fault.

Yes, it is a stupid bug. When using the system function iconv, gst has to split each UnicodeCharacter back into 8-bit Characters, and here it gets stuck in an infinite loop. The first character, for example, is $<16rC804>; its "C8" byte was converted with asCharacter, which created a UnicodeCharacter rather than a Character. This causes the recursive creation of another I18N.EncodedStream.

The attached patch fixes the bug; thanks for reporting it.
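
A minimal sketch of the byte splitting described above, for illustration only (it is not part of the patch). The variable names are hypothetical, and it assumes, as the patch does, that Character value: answers a plain 8-bit Character for any value in 0..255:

```smalltalk
"Split the code point 16rC804 into its two low-order bytes."
| codePoint low high |
codePoint := 16rC804.
low := codePoint bitAnd: 255.                   "16r04"
high := (codePoint bitShift: -8) bitAnd: 255.   "16rC8, the byte that triggered the bug"

"Build each byte with Character value:, as the patch does, instead of
 asCharacter, and print the byte values that come back."
(Character value: low) value printNl.
(Character value: high) value printNl.
```

The high byte is the interesting one: it is above 127, which is the range my earlier tests never exercised.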

In my testing, I only used Eastern European characters, where all bytes are < 0x80.

> And is there a simple example of processing a UTF-8 encoded string?

Can you expand?

Paolo
--- orig/i18n/Sets.st
+++ mod/i18n/Sets.st
@@ -718,13 +718,13 @@ next
          been extracted."
        wch := answer := self nextInput codePoint.
        wch := (wch bitShift: -8) + 16r1000000.
-       ^(answer bitAnd: 255) asCharacter
+       ^Character value: (answer bitAnd: 255)
     ].
 
     "Answer any other byte"
     answer := wch bitAnd: 255.
     wch := wch bitShift: -8.
-    ^answer asCharacter
+    ^Character value: answer
 !
 
 flush
@@ -754,7 +754,7 @@ next
        wch := answer := self nextInput codePoint.
        wch := wch bitAnd: 16rFFFFFF.
        count := 3.
-       ^(answer bitShift: -24) asCharacter
+       ^Character value: (answer bitShift: -24)
     ].
 
     "Answer any other byte.  We keep things so that the byte we answer
@@ -763,7 +763,7 @@ next
     wch := wch bitAnd: 16rFFFF.
     wch := wch bitShift: 8.
     count := count - 1.
-    ^answer asCharacter
+    ^Character value: answer
 !
 
 flush