Re: extended format is wrong?

bug-ncurses

From:	Anmol Sethi
Subject:	Re: extended format is wrong?
Date:	Tue, 26 Apr 2016 23:24:25 -0400
I found https://github.com/mauke/unibilium

It supports extended capabilities but its code is much easier for me to 
understand. I’m gonna try and implement it again tomorrow.
> On Apr 26, 2016, at 10:03 PM, Anmol Sethi <address@hidden> wrote:
> 
> I’d test to see how ncurses handles it, and fix my problems from there, but I 
> cannot enable tracing for some reason. I have 0 experience with C, sorry! I 
> set the environment variable, I see the tracing set, but it outputs nothing. 
> And I do have the right flags set.
> 
>> On Apr 26, 2016, at 10:02 PM, Anmol Sethi <address@hidden> wrote:
>> 
>> Hello!
>> 
>> I’m writing a terminfo library for go. Someone here suggested I add support 
>> for the extended storage format. I’ve tried and I’m running into quite a few 
>> problems. Here is an example xterm terminfo file from my system in hex. I’ve 
>> only included the extended format divided up.
>> 
>> 
>> EXTENDED:                             
>> Header:
>> 01 00 00 00 39 00 73 00 a2 02 
>> Values are 1, 0, 57, 115, and 674 in decimal.
>> 
>> According to term(5), that means 1 bool cap, 0 numeric caps, 57 string caps, 
>> 115 bytes for the extended string table and 674 bytes for the last offset of 
>> the extended string table. So I attempted to write a reader with this 
>> information, and it turns out those values are wrong or I am not 
>> understanding any of this correctly. Before I begin, what exactly is the 
>> difference between the 4th and 5th short integers of the header? One is the 
>> byte length of the string table and the other is the last offset of the 
>> string table. I’m not 100% sure what exactly that means. I’m gonna assume 
>> that the first one is the size in which the string cap values are contained 
>> and the second is the entire size of the table. The ncurses source defines 
>> the 4th short as ext_str_size and the 5th as ext_str_limit. Funny thing is, 
>> the source doesn’t even use the 4th short.
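
The header bytes quoted above can be decoded mechanically. Here is a minimal Go sketch, assuming the five values are little-endian 16-bit shorts as in the legacy terminfo format. The type extHeader and the function parseExtHeader are names made up for this illustration; they are not part of ncurses or of the library under discussion. The 4th and 5th fields are deliberately left with neutral names, since their meaning is the open question here.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// extHeader holds the five shorts that open the extended section.
// Field names are illustrative only.
type extHeader struct {
	BoolCount int // 1st short
	NumCount  int // 2nd short
	StrCount  int // 3rd short
	Fourth    int // 4th short (ext_str_size in the ncurses source)
	Fifth     int // 5th short (ext_str_limit in the ncurses source)
}

// parseExtHeader decodes the 10-byte extended header.
func parseExtHeader(b []byte) (extHeader, error) {
	if len(b) < 10 {
		return extHeader{}, fmt.Errorf("extended header needs 10 bytes, got %d", len(b))
	}
	le := binary.LittleEndian
	return extHeader{
		BoolCount: int(le.Uint16(b[0:2])),
		NumCount:  int(le.Uint16(b[2:4])),
		StrCount:  int(le.Uint16(b[4:6])),
		Fourth:    int(le.Uint16(b[6:8])),
		Fifth:     int(le.Uint16(b[8:10])),
	}, nil
}

func main() {
	// The header bytes from the dump above.
	raw := []byte{0x01, 0x00, 0x00, 0x00, 0x39, 0x00, 0x73, 0x00, 0xa2, 0x02}
	h, err := parseExtHeader(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", h) // {BoolCount:1 NumCount:0 StrCount:57 Fourth:115 Fifth:674}
}
```
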
>> 
>> Bool:
>> 01 00 
>> 
>> Anyway, after the header there is the boolean section. Since there is only 
>> one boolean value, I read the byte, and then I noticed the old quirk: I need 
>> to skip the extra null byte inserted to keep everything on word boundaries. 
>> Ok, so done and done.
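
The padding rule described above can be written down in a few lines. This is a sketch only, and it assumes the extended section itself starts on an even offset, so that the parity of the boolean count alone decides whether a pad byte follows. readBools is a hypothetical helper name.

```go
package main

import "fmt"

// readBools reads count boolean bytes and reports how many bytes were
// consumed, including the pad byte that follows an odd count.
// Assumption: b starts on an even offset within the file.
func readBools(b []byte, count int) ([]bool, int, error) {
	if count < 0 {
		return nil, 0, fmt.Errorf("negative boolean count %d", count)
	}
	n := count + count%2 // round up to the next even length
	if len(b) < n {
		return nil, 0, fmt.Errorf("need %d bytes for booleans, got %d", n, len(b))
	}
	flags := make([]bool, count)
	for i := range flags {
		flags[i] = b[i] == 1
	}
	return flags, n, nil
}

func main() {
	// The boolean section from the dump: one flag plus one pad byte.
	flags, used, err := readBools([]byte{0x01, 0x00}, 1)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(flags, used) // [true] 2
}
```
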
>> 
>> String Section:
>> ff ff 00 00 07 00 0e 00 15 00 1c 00 23 00 2a 00 31 00 38 00 3f 00 46 00 4d 
>> 00 54 00 5b 00 62 00 69 00
>> 70 00 77 00 7e 00 85 00 8c 00 93 00 9a 00 a1 00 a8 00 af 00 b6 00 bd 00 c4 
>> 00 cb 00 d2 00 d9 00 e0 00
>> e7 00 ee 00 f5 00 fc 00 03 01 0a 01 11 01 18 01 1f 01 26 01 2d 01 34 01 3b 
>> 01 42 01 49 01 50 01 57 01
>> 5e 01 65 01 ff ff ff ff ff ff ff ff 
>> 
>> Next the numeric capabilities were 0 so I went straight to the string caps. 
>> Now here I read in these 114 bytes, or 57 shorts.
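
For reference, the ff ff entries in that section are -1 when read as signed shorts, which in terminfo marks an absent string (-2 marks a cancelled one). A small Go sketch of decoding the offsets that way, with readOffsets as a hypothetical helper name:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// readOffsets decodes count little-endian shorts as signed values.
// Negative results mark strings that are absent (-1) or cancelled (-2).
func readOffsets(b []byte, count int) ([]int, error) {
	if count < 0 || len(b) < 2*count {
		return nil, fmt.Errorf("cannot read %d offsets from %d bytes", count, len(b))
	}
	offs := make([]int, count)
	for i := range offs {
		offs[i] = int(int16(binary.LittleEndian.Uint16(b[2*i:])))
	}
	return offs, nil
}

func main() {
	// First four shorts of the string section quoted above.
	raw := []byte{0xff, 0xff, 0x00, 0x00, 0x07, 0x00, 0x0e, 0x00}
	offs, err := readOffsets(raw, 4)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(offs) // [-1 0 7 14]
}
```
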
>> 
>> Rest:
>> 00 00 03 00 06 00 0b 00 10 00
>> 15 00 1a 00 1f 00 23 00 28 00 2d 00 32 00 37 00 3c 00 42 00 48 00 4e 00 54 
>> 00 5a 00 60 00 66 00 6c 00
>> 72 00 78 00 7d 00 82 00 87 00 8c 00 91 00 97 00 9d 00 a3 00 a9 00 af 00 b5 
>> 00 bb 00 c1 00 c7 00 cd 00
>> d3 00 d9 00 df 00 e5 00 eb 00 f1 00 f7 00 fd 00 03 01 09 01 0d 01 12 01 17 
>> 01 1c 01 21 01 26 01 2a 01
>> 2e 01 32 01 1b 5b 33 3b 33 7e 00 1b 5b 33 3b 34 7e 00 1b 5b 33 3b 35 7e 00 
>> 1b 5b 33 3b 36 7e 00 1b 5b
>> 33 3b 37 7e 00 1b 5b 31 3b 32 42 00 1b 5b 31 3b 33 42 00 1b 5b 31 3b 34 42 
>> 00 1b 5b 31 3b 35 42 00 1b
>> 5b 31 3b 36 42 00 1b 5b 31 3b 37 42 00 1b 5b 31 3b 33 46 00 1b 5b 31 3b 34 
>> 46 00 1b 5b 31 3b 35 46 00
>> 1b 5b 31 3b 36 46 00 1b 5b 31 3b 37 46 00 1b 5b 31 3b 33 48 00 1b 5b 31 3b 
>> 34 48 00 1b 5b 31 3b 35 48
>> 00 1b 5b 31 3b 36 48 00 1b 5b 31 3b 37 48 00 1b 5b 32 3b 33 7e 00 1b 5b 32 
>> 3b 34 7e 00 1b 5b 32 3b 35
>> 7e 00 1b 5b 32 3b 36 7e 00 1b 5b 32 3b 37 7e 00 1b 5b 31 3b 33 44 00 1b 5b 
>> 31 3b 34 44 00 1b 5b 31 3b
>> 35 44 00 1b 5b 31 3b 36 44 00 1b 5b 31 3b 37 44 00 1b 5b 36 3b 33 7e 00 1b 
>> 5b 36 3b 34 7e 00 1b 5b 36
>> 3b 35 7e 00 1b 5b 36 3b 36 7e 00 1b 5b 36 3b 37 7e 00 1b 5b 35 3b 33 7e 00 
>> 1b 5b 35 3b 34 7e 00 1b 5b
>> 35 3b 35 7e 00 1b 5b 35 3b 36 7e 00 1b 5b 35 3b 37 7e 00 1b 5b 31 3b 33 43 
>> 00 1b 5b 31 3b 34 43 00 1b
>> 5b 31 3b 35 43 00 1b 5b 31 3b 36 43 00 1b 5b 31 3b 37 43 00 1b 5b 31 3b 32 
>> 41 00 1b 5b 31 3b 33 41 00
>> 1b 5b 31 3b 34 41 00 1b 5b 31 3b 35 41 00 1b 5b 31 3b 36 41 00 1b 5b 31 3b 
>> 37 41 00 41 58 00 58 4d 00
>> 6b 44 43 33 00 6b 44 43 34 00 6b 44 43 35 00 6b 44 43 36 00 6b 44 43 37 00 
>> 6b 44 4e 00 6b 44 4e 33 00
>> 6b 44 4e 34 00 6b 44 4e 35 00 6b 44 4e 36 00 6b 44 4e 37 00 6b 45 4e 44 33 
>> 00 6b 45 4e 44 34 00 6b 45
>> 4e 44 35 00 6b 45 4e 44 36 00 6b 45 4e 44 37 00 6b 48 4f 4d 33 00 6b 48 4f 
>> 4d 34 00 6b 48 4f 4d 35 00
>> 6b 48 4f 4d 36 00 6b 48 4f 4d 37 00 6b 49 43 33 00 6b 49 43 34 00 6b 49 43 
>> 35 00 6b 49 43 36 00 6b 49
>> 43 37 00 6b 4c 46 54 33 00 6b 4c 46 54 34 00 6b 4c 46 54 35 00 6b 4c 46 54 
>> 36 00 6b 4c 46 54 37 00 6b
>> 4e 58 54 33 00 6b 4e 58 54 34 00 6b 4e 58 54 35 00 6b 4e 58 54 36 00 6b 4e 
>> 58 54 37 00 6b 50 52 56 33
>> 00 6b 50 52 56 34 00 6b 50 52 56 35 00 6b 50 52 56 36 00 6b 50 52 56 37 00 
>> 6b 52 49 54 33 00 6b 52 49
>> 54 34 00 6b 52 49 54 35 00 6b 52 49 54 36 00 6b 52 49 54 37 00 6b 55 50 00 
>> 6b 55 50 33 00 6b 55 50 34
>> 00 6b 55 50 35 00 6b 55 50 36 00 6b 55 50 37 00 6b 61 32 00 6b 62 31 00 6b 
>> 62 33 00 6b 63 32 00
>> 
>> Alright, now the rest of it should be the string table, right? Well, it 
>> turns out that it isn’t. Here, look: this is the rest of the extended format 
>> (it is in a different format but it should be clear what’s going on):
>> 
>> First 116 bytes:
>> "\x00\x00\x03\x00\x06\x00\v\x00\x10\x00\x15\x00\x1a\x00\x1f\x00#\x00(\x00-\x002\x007\x00<\x00B\x00H\x0
>> 0N\x00T\x00Z\x00`\x00f\x00l\x00r\x00x\x00}\x00\x82\x00\x87\x00\x8c\x00\x91\x00\x97\x00\x9d\x00\xa3\x00
>> \xa9\x00\xaf\x00\xb5\x00\xbb\x00\xc1\x00\xc7\x00\xcd\x00\xd3\x00\xd9\x00\xdf\x00\xe5\x00\xeb\x00\xf1\x
>> 00\xf7\x00\xfd\x00\x03\x01\t\x01\r\x01\x12\x01\x17\x01\x1c\x01!\x01&\x01*\x01.\x012\x01"
>> 
>> String Table:
>> "\x1b[3;3~\x00\x
>> 1b[3;4~\x00\x1b[3;5~\x00\x1b[3;6~\x00\x1b[3;7~\x00\x1b[1;2B\x00\x1b[1;3B\x00\x1b[1;4B\x00\x1b[1;5B\x00
>> \x1b[1;6B\x00\x1b[1;7B\x00\x1b[1;3F\x00\x1b[1;4F\x00\x1b[1;5F\x00\x1b[1;6F\x00\x1b[1;7F\x00\x1b[1;3H\x
>> 00\x1b[1;4H\x00\x1b[1;5H\x00\x1b[1;6H\x00\x1b[1;7H\x00\x1b[2;3~\x00\x1b[2;4~\x00\x1b[2;5~\x00\x1b[2;6~
>> \x00\x1b[2;7~\x00\x1b[1;3D\x00\x1b[1;4D\x00\x1b[1;5D\x00\x1b[1;6D\x00\x1b[1;7D\x00\x1b[6;3~\x00\x1b[6;
>> 4~\x00\x1b[6;5~\x00\x1b[6;6~\x00\x1b[6;7~\x00\x1b[5;3~\x00\x1b[5;4~\x00\x1b[5;5~\x00\x1b[5;6~\x00\x1b[
>> 5;7~\x00\x1b[1;3C\x00\x1b[1;4C\x00\x1b[1;5C\x00\x1b[1;6C\x00\x1b[1;7C\x00\x1b[1;2A\x00\x1b[1;3A\x00\x1
>> b[1;4A\x00\x1b[1;5A\x00\x1b[1;6A\x00\x1b[1;7A\x00AX\x00XM\x00kDC3\x00kDC4\x00kDC5\x00kDC6\x00kDC7\x00k
>> DN\x00kDN3\x00kDN4\x00kDN5\x00kDN6\x00kDN7\x00kEND3\x00kEND4\x00kEND5\x00kEND6\x00kEND7\x00kHOM3\x00kH
>> OM4\x00kHOM5\x00kHOM6\x00kHOM7\x00kIC3\x00kIC4\x00kIC5\x00kIC6\x00kIC7\x00kLFT3\x00kLFT4\x00kLFT5\x00k
>> LFT6\x00kLFT7\x00kNXT3\x00kNXT4\x00kNXT5\x00kNXT6\x00kNXT7\x00kPRV3\x00kPRV4\x00kPRV5\x00kPRV6\x00kPRV
>> 7\x00kRIT3\x00kRIT4\x00kRIT5\x00kRIT6\x00kRIT7\x00kUP\x00kUP3\x00kUP4\x00kUP5\x00kUP6\x00kUP7\x00ka2\x
>> 00kb1\x00kb3\x00kc2\x00"
>> 
>> The first 116 bytes are actually not part of the string table. I have no 
>> idea what purpose they serve. If I skip these 116 bytes, I get to the actual 
>> string table and the offsets from before work perfectly fine. I noticed that 
>> 114 + 116 is 230. So 230 bytes need to be read to get to the string table. I 
>> noticed that the 4th short is 115, and 115 * 2 = 230. So essentially I’m 
>> using the 4th short multiplied by 2 as the length of the offsets. But I take 
>> the 3rd short of the header, aka the number of string caps and use only that 
>> number of shorts from the buffer. As in I read in those extra 116 bytes but 
>> I never use them. This seems to work fine across different files but I don’t 
>> think it’s correct. I think I’m misunderstanding something.
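
One reading that fits every number in this dump: 115 = 57 + 58. That is 57 offsets for the string values, plus 58 offsets for the names of all extended capabilities (1 bool + 0 numeric + 57 string). Under that reading the 116 "unused" bytes are 58 name offsets (0, 3, 6, 11, ... line up with AX, XM, kDC3, kDC4, ... at the end of the table), and the 4th short counts items rather than bytes. The sketch below only checks that arithmetic. offsetLayout and its field names are made up for illustration, and the interpretation is offered as an assumption consistent with the data, not as a statement of what the ncurses reader does.

```go
package main

import "fmt"

// layout describes how the offset area would split under the assumption
// that the 4th header short counts value offsets plus name offsets.
type layout struct {
	ValueOffsets    int // one per extended string capability
	NameOffsets     int // one per extended capability of any type
	OffsetBytes     int // total size of the offset area
	NameOffsetBytes int // portion of it that holds name offsets
}

// offsetLayout derives the split and checks it against the 4th short.
func offsetLayout(bools, nums, strs, fourth int) (layout, error) {
	names := bools + nums + strs
	if fourth != strs+names {
		return layout{}, fmt.Errorf("4th short is %d, expected %d values + %d names", fourth, strs, names)
	}
	return layout{
		ValueOffsets:    strs,
		NameOffsets:     names,
		OffsetBytes:     2 * fourth,
		NameOffsetBytes: 2 * names,
	}, nil
}

func main() {
	// Counts and 4th short taken from the header in this dump.
	l, err := offsetLayout(1, 0, 57, 115)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", l)
}
```

With the header from this dump the function reports 57 value offsets, 58 name offsets, 230 offset bytes in total and 116 bytes of name offsets, matching the 114 + 116 = 230 observation above.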
>> 
>> Could someone please help me understand the extended format and what I am 
>> doing wrong?
>> 
>> If you need any further clarification on my problem, please do not hesitate 
>> to ask.
>> 
>> Here is the current extended reader code. 
>> https://github.com/nhooyr/terminfo/blob/master/read.go#L126
>> 
>> First I make sure we’re on a word boundary. Then I read in the 10 bytes for 
>> the header. Next I print it out. Then I read in the bool section. Then I 
>> read in the numeric section (it’s 0 in this case). Next I read the string 
>> offset section, but for the size I use r.h[lenExtTable] * 2, which means: 
>> take the 4th short (the size of the string table section, apparently) and 
>> multiply it by 2, and use that. Then I use the last offset as the size of 
>> the actual string table and print out all of the extended values. You can 
>> easily run this on your own computer by setting TERM to xterm, running 
>> "go get github.com/nhooyr/terminfo" and then running "go test 
>> github.com/nhooyr/terminfo -v". You should see all of the string caps being 
>> printed out.
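
The steps listed above can be put together in one place. What follows is a rough Go sketch, not the code at the link, and it rests on several assumptions: 16-bit little-endian shorts throughout, a 4th header short that counts value offsets plus name offsets, value strings packed one after another at the start of the table, and names starting right after the last value string. The buffer in sample is hand-built and heavily trimmed; it keeps only the AX flag and the kDC3 key from the dump. All type and function names are hypothetical.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// extCaps is a hypothetical result type for this sketch.
type extCaps struct {
	Bools   map[string]bool
	Strings map[string]string
}

// cstr returns the NUL-terminated string starting at off in table.
func cstr(table []byte, off int) (string, error) {
	if off < 0 || off >= len(table) {
		return "", fmt.Errorf("offset %d outside table of %d bytes", off, len(table))
	}
	end := bytes.IndexByte(table[off:], 0)
	if end < 0 {
		return "", fmt.Errorf("unterminated string at offset %d", off)
	}
	return string(table[off : off+end]), nil
}

// readExtended parses an extended section that starts at b[0].
func readExtended(b []byte) (extCaps, error) {
	caps := extCaps{Bools: map[string]bool{}, Strings: map[string]string{}}
	if len(b) < 10 {
		return caps, fmt.Errorf("short header: %d bytes", len(b))
	}
	u16 := func(i int) int { return int(binary.LittleEndian.Uint16(b[i:])) }
	nb, nn, ns, items, limit := u16(0), u16(2), u16(4), u16(6), u16(8)
	if items != ns+nb+nn+ns {
		return caps, fmt.Errorf("item count %d does not match the capability counts", items)
	}
	boolStart := 10
	offStart := boolStart + nb + nb%2 + 2*nn // numbers are skipped in this sketch
	tableStart := offStart + 2*items
	if len(b) < tableStart+limit {
		return caps, fmt.Errorf("truncated section")
	}
	table := b[tableStart : tableStart+limit]
	off := func(i int) int { return int(int16(u16(offStart + 2*i))) }

	// String values come first; remember where the last one ends.
	values := make([]string, ns)
	present := make([]bool, ns)
	namesBase := 0
	for i := 0; i < ns; i++ {
		o := off(i)
		if o < 0 {
			continue // absent or cancelled
		}
		s, err := cstr(table, o)
		if err != nil {
			return caps, err
		}
		values[i], present[i] = s, true
		if end := o + len(s) + 1; end > namesBase {
			namesBase = end
		}
	}
	// Name offsets follow the value offsets and are relative to namesBase.
	name := func(i int) (string, error) { return cstr(table, namesBase+off(ns+i)) }
	for i := 0; i < nb; i++ {
		n, err := name(i)
		if err != nil {
			return caps, err
		}
		caps.Bools[n] = b[boolStart+i] == 1
	}
	for i := 0; i < ns; i++ {
		n, err := name(nb + nn + i)
		if err != nil {
			return caps, err
		}
		if present[i] {
			caps.Strings[n] = values[i]
		}
	}
	return caps, nil
}

// sample builds a tiny hand-made section: bool AX, string kDC3.
func sample() []byte {
	table := "\x1b[3;3~\x00" + "AX\x00kDC3\x00"
	buf := []byte{
		0x01, 0x00, 0x00, 0x00, 0x01, 0x00, // 1 bool, 0 numbers, 1 string
		0x03, 0x00, byte(len(table)), 0x00, // 3 offsets, table size
		0x01, 0x00, // AX = true, plus pad byte
		0x00, 0x00, // value offset of kDC3
		0x00, 0x00, 0x03, 0x00, // name offsets of AX and kDC3
	}
	return append(buf, table...)
}

func main() {
	caps, err := readExtended(sample())
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%v %q\n", caps.Bools, caps.Strings)
}
```
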
>> 
>