Re: problems with string encoding

gnustep-dev

From:	Fred Kiefer
Subject:	Re: problems with string encoding
Date:	Wed, 10 Nov 2010 20:46:58 +0100
User-agent:	Mozilla/5.0 (X11; U; Linux x86_64; de; rv:1.9.1.15) Gecko/20101026 SUSE/3.0.10 Thunderbird/3.0.10

On 09.11.2010 23:21, David Wetzel wrote:
> Hi,
> 
> when parsing web pages I need to figure out the encoding.
> What I am currently doing is reading the start of the page into a string
> buffer and looking for a substring like "charset=iso-8859-1", which gives
> me the encoding.
> The problem is what to do if this fails:
> 
>   encStr = [[NSString alloc] initWithBytes:buffer 
>                                     length:len
>                                   encoding:NSISOLatin1StringEncoding];
> 
> I have no means to get the charset string part.
> Before, I used NSASCIIStringEncoding, but that fails for some reason.
> What to do if it's not Latin1? It could be anything.
> It would be fine if all non-7-bit-ASCII characters were lost.
> 
> Is there a nice way of pushing that cString in and getting a lossy (and I
> mean really lossy) 7-bit ASCII NSString back?
> No iconv or other conversions are needed.
> 
> Comments are welcome :-)

As far as I remember, we only do all that fancy conversion when the
encoding provided doesn't match the internal encoding used by GNUstep.
So everything should be fine when you use the internal encoding, and
the simplest way to do that is to create your NSString via this method:

- (id) initWithCString: (const char*)byteString  length: (NSUInteger)length
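
If the goal is only to locate the charset declaration, a byte-level filter
may be enough before any NSString is created. The following is a minimal
sketch in plain C (which also compiles as Objective-C), not GNUstep API:
the helper names strip_to_ascii7 and find_charset are hypothetical, and the
set of characters accepted in a charset name is an assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst, dropping NUL bytes and every byte
 * outside 7-bit ASCII. dst must have room for len + 1 bytes.
 * Returns the number of bytes kept (not counting the terminator). */
static size_t strip_to_ascii7(const char *src, size_t len, char *dst)
{
    assert(src != NULL && dst != NULL);
    size_t kept = 0;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)src[i];
        if (c != 0 && c < 0x80) {
            dst[kept++] = (char)c;
        }
    }
    dst[kept] = '\0';
    return kept;
}

/* Hypothetical helper: find the first "charset=" (case-insensitive) in an
 * ASCII string and copy its value into out. An optional opening quote is
 * skipped. Assumption: a charset name consists of letters, digits and the
 * characters '-', '_', '.', ':'. Returns 1 if a non-empty name was found,
 * 0 otherwise. */
static int find_charset(const char *ascii, char *out, size_t outsize)
{
    static const char key[] = "charset=";
    const size_t keylen = sizeof key - 1;

    assert(ascii != NULL && out != NULL);
    if (outsize == 0) {
        return 0;
    }
    for (const char *p = ascii; *p != '\0'; p++) {
        size_t i = 0;
        while (i < keylen && tolower((unsigned char)p[i]) == key[i]) {
            i++;
        }
        if (i != keylen) {
            continue; /* no match at this position */
        }
        const char *v = p + keylen;
        if (*v == '"' || *v == '\'') {
            v++;
        }
        size_t n = 0;
        while (n + 1 < outsize &&
               (isalnum((unsigned char)v[n]) || v[n] == '-' ||
                v[n] == '_' || v[n] == '.' || v[n] == ':')) {
            out[n] = v[n];
            n++;
        }
        out[n] = '\0';
        return n > 0;
    }
    return 0;
}
```

After strip_to_ascii7 the buffer holds only 7-bit ASCII, so it should be
safe to hand to an NSString initializer with NSASCIIStringEncoding, or the
charset name can be read directly from the C buffer with find_charset.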
