extended format is wrong?

bug-ncurses

From:	Anmol Sethi
Subject:	extended format is wrong?
Date:	Tue, 26 Apr 2016 22:02:13 -0400
Hello!

I’m writing a terminfo library for Go. Someone here suggested I add support for 
the extended storage format. I’ve tried, and I’m running into quite a few 
problems. Here is an example xterm terminfo file from my system in hex. I’ve 
included only the extended portion, divided into its sections.

  
EXTENDED:
Header:
01 00 00 00 39 00 73 00 a2 02 
Values are 1, 0, 57, 115, and 674 in decimal.
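To make the decoding of those ten bytes concrete, here is a minimal Go sketch that reads them as five little-endian shorts. The name decodeHeader is made up for this illustration and is not taken from ncurses or from my library.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// decodeHeader (hypothetical helper) splits the 10-byte extended header
// into five unsigned little-endian shorts.
func decodeHeader(b [10]byte) [5]uint16 {
	var h [5]uint16
	for i := range h {
		h[i] = binary.LittleEndian.Uint16(b[2*i:])
	}
	return h
}

func main() {
	// Header bytes exactly as they appear in the dump above.
	raw := [10]byte{0x01, 0x00, 0x00, 0x00, 0x39, 0x00, 0x73, 0x00, 0xa2, 0x02}
	fmt.Println(decodeHeader(raw)) // prints [1 0 57 115 674]
}
```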

According to term(5), that means 1 boolean cap, 0 numeric caps, 57 string caps, 
115 bytes for the extended string table and 674 bytes for the last offset of 
the extended string table. I tried writing a reader based on this information, 
and it turns out that either those values are wrong or I am not understanding 
this correctly. Before I begin, what exactly is the difference between the 4th 
and 5th short integers of the header? One is the byte length of the string 
table and the other is the last offset of the string table. I’m not 100% sure 
what that means. I’m going to assume that the first is the size of the region 
that holds the string cap values and the second is the size of the entire 
table. The ncurses source names the 4th short ext_str_size and the 5th 
ext_str_limit. Funny thing is, the source doesn’t even use the 4th short.

Bool:
01 00 

Anyway, after the header comes the boolean section. Since there is only one 
boolean value, I read that byte, and then I noticed the old quirk: I need to 
skip the extra null byte inserted to keep everything on word boundaries. So 
that part is done.

String Section:
ff ff 00 00 07 00 0e 00 15 00 1c 00 23 00 2a 00 31 00 38 00 3f 00 46 00 4d 00 
54 00 5b 00 62 00 69 00
70 00 77 00 7e 00 85 00 8c 00 93 00 9a 00 a1 00 a8 00 af 00 b6 00 bd 00 c4 00 
cb 00 d2 00 d9 00 e0 00
e7 00 ee 00 f5 00 fc 00 03 01 0a 01 11 01 18 01 1f 01 26 01 2d 01 34 01 3b 01 
42 01 49 01 50 01 57 01
5e 01 65 01 ff ff ff ff ff ff ff ff 

Next, the numeric capability count was 0, so I went straight to the string 
caps. Here I read in these 114 bytes, or 57 shorts.
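As a small illustration of how I read this section, the Go sketch below decodes shorts as signed values. I am assuming that ff ff, which comes out as -1, marks a capability with no value; decodeOffsets is a name invented for this example.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// decodeOffsets (hypothetical helper) reads consecutive little-endian
// shorts as signed values, so that ff ff comes out as -1.
func decodeOffsets(b []byte) []int16 {
	offs := make([]int16, len(b)/2)
	for i := range offs {
		offs[i] = int16(binary.LittleEndian.Uint16(b[2*i:]))
	}
	return offs
}

func main() {
	// First four shorts of the string section shown above.
	sample := []byte{0xff, 0xff, 0x00, 0x00, 0x07, 0x00, 0x0e, 0x00}
	fmt.Println(decodeOffsets(sample)) // prints [-1 0 7 14]
}
```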

Rest:
00 00 03 00 06 00 0b 00 10 00
15 00 1a 00 1f 00 23 00 28 00 2d 00 32 00 37 00 3c 00 42 00 48 00 4e 00 54 00 
5a 00 60 00 66 00 6c 00
72 00 78 00 7d 00 82 00 87 00 8c 00 91 00 97 00 9d 00 a3 00 a9 00 af 00 b5 00 
bb 00 c1 00 c7 00 cd 00
d3 00 d9 00 df 00 e5 00 eb 00 f1 00 f7 00 fd 00 03 01 09 01 0d 01 12 01 17 01 
1c 01 21 01 26 01 2a 01
2e 01 32 01 1b 5b 33 3b 33 7e 00 1b 5b 33 3b 34 7e 00 1b 5b 33 3b 35 7e 00 1b 
5b 33 3b 36 7e 00 1b 5b
33 3b 37 7e 00 1b 5b 31 3b 32 42 00 1b 5b 31 3b 33 42 00 1b 5b 31 3b 34 42 00 
1b 5b 31 3b 35 42 00 1b
5b 31 3b 36 42 00 1b 5b 31 3b 37 42 00 1b 5b 31 3b 33 46 00 1b 5b 31 3b 34 46 
00 1b 5b 31 3b 35 46 00
1b 5b 31 3b 36 46 00 1b 5b 31 3b 37 46 00 1b 5b 31 3b 33 48 00 1b 5b 31 3b 34 
48 00 1b 5b 31 3b 35 48
00 1b 5b 31 3b 36 48 00 1b 5b 31 3b 37 48 00 1b 5b 32 3b 33 7e 00 1b 5b 32 3b 
34 7e 00 1b 5b 32 3b 35
7e 00 1b 5b 32 3b 36 7e 00 1b 5b 32 3b 37 7e 00 1b 5b 31 3b 33 44 00 1b 5b 31 
3b 34 44 00 1b 5b 31 3b
35 44 00 1b 5b 31 3b 36 44 00 1b 5b 31 3b 37 44 00 1b 5b 36 3b 33 7e 00 1b 5b 
36 3b 34 7e 00 1b 5b 36
3b 35 7e 00 1b 5b 36 3b 36 7e 00 1b 5b 36 3b 37 7e 00 1b 5b 35 3b 33 7e 00 1b 
5b 35 3b 34 7e 00 1b 5b
35 3b 35 7e 00 1b 5b 35 3b 36 7e 00 1b 5b 35 3b 37 7e 00 1b 5b 31 3b 33 43 00 
1b 5b 31 3b 34 43 00 1b
5b 31 3b 35 43 00 1b 5b 31 3b 36 43 00 1b 5b 31 3b 37 43 00 1b 5b 31 3b 32 41 
00 1b 5b 31 3b 33 41 00
1b 5b 31 3b 34 41 00 1b 5b 31 3b 35 41 00 1b 5b 31 3b 36 41 00 1b 5b 31 3b 37 
41 00 41 58 00 58 4d 00
6b 44 43 33 00 6b 44 43 34 00 6b 44 43 35 00 6b 44 43 36 00 6b 44 43 37 00 6b 
44 4e 00 6b 44 4e 33 00
6b 44 4e 34 00 6b 44 4e 35 00 6b 44 4e 36 00 6b 44 4e 37 00 6b 45 4e 44 33 00 
6b 45 4e 44 34 00 6b 45
4e 44 35 00 6b 45 4e 44 36 00 6b 45 4e 44 37 00 6b 48 4f 4d 33 00 6b 48 4f 4d 
34 00 6b 48 4f 4d 35 00
6b 48 4f 4d 36 00 6b 48 4f 4d 37 00 6b 49 43 33 00 6b 49 43 34 00 6b 49 43 35 
00 6b 49 43 36 00 6b 49
43 37 00 6b 4c 46 54 33 00 6b 4c 46 54 34 00 6b 4c 46 54 35 00 6b 4c 46 54 36 
00 6b 4c 46 54 37 00 6b
4e 58 54 33 00 6b 4e 58 54 34 00 6b 4e 58 54 35 00 6b 4e 58 54 36 00 6b 4e 58 
54 37 00 6b 50 52 56 33
00 6b 50 52 56 34 00 6b 50 52 56 35 00 6b 50 52 56 36 00 6b 50 52 56 37 00 6b 
52 49 54 33 00 6b 52 49
54 34 00 6b 52 49 54 35 00 6b 52 49 54 36 00 6b 52 49 54 37 00 6b 55 50 00 6b 
55 50 33 00 6b 55 50 34
00 6b 55 50 35 00 6b 55 50 36 00 6b 55 50 37 00 6b 61 32 00 6b 62 31 00 6b 62 
33 00 6b 63 32 00

Alright, now the rest of it should be the string table, right? Well, it turns 
out that it isn’t. Look, this is the rest of the extended section (shown in a 
different notation, but it should be clear what’s going on):

First 116 bytes:
"\x00\x00\x03\x00\x06\x00\v\x00\x10\x00\x15\x00\x1a\x00\x1f\x00#\x00(\x00-\x002\x007\x00<\x00B\x00H\x0
0N\x00T\x00Z\x00`\x00f\x00l\x00r\x00x\x00}\x00\x82\x00\x87\x00\x8c\x00\x91\x00\x97\x00\x9d\x00\xa3\x00
\xa9\x00\xaf\x00\xb5\x00\xbb\x00\xc1\x00\xc7\x00\xcd\x00\xd3\x00\xd9\x00\xdf\x00\xe5\x00\xeb\x00\xf1\x
00\xf7\x00\xfd\x00\x03\x01\t\x01\r\x01\x12\x01\x17\x01\x1c\x01!\x01&\x01*\x01.\x012\x01"

String Table:
"\x1b[3;3~\x00\x
1b[3;4~\x00\x1b[3;5~\x00\x1b[3;6~\x00\x1b[3;7~\x00\x1b[1;2B\x00\x1b[1;3B\x00\x1b[1;4B\x00\x1b[1;5B\x00
\x1b[1;6B\x00\x1b[1;7B\x00\x1b[1;3F\x00\x1b[1;4F\x00\x1b[1;5F\x00\x1b[1;6F\x00\x1b[1;7F\x00\x1b[1;3H\x
00\x1b[1;4H\x00\x1b[1;5H\x00\x1b[1;6H\x00\x1b[1;7H\x00\x1b[2;3~\x00\x1b[2;4~\x00\x1b[2;5~\x00\x1b[2;6~
\x00\x1b[2;7~\x00\x1b[1;3D\x00\x1b[1;4D\x00\x1b[1;5D\x00\x1b[1;6D\x00\x1b[1;7D\x00\x1b[6;3~\x00\x1b[6;
4~\x00\x1b[6;5~\x00\x1b[6;6~\x00\x1b[6;7~\x00\x1b[5;3~\x00\x1b[5;4~\x00\x1b[5;5~\x00\x1b[5;6~\x00\x1b[
5;7~\x00\x1b[1;3C\x00\x1b[1;4C\x00\x1b[1;5C\x00\x1b[1;6C\x00\x1b[1;7C\x00\x1b[1;2A\x00\x1b[1;3A\x00\x1
b[1;4A\x00\x1b[1;5A\x00\x1b[1;6A\x00\x1b[1;7A\x00AX\x00XM\x00kDC3\x00kDC4\x00kDC5\x00kDC6\x00kDC7\x00k
DN\x00kDN3\x00kDN4\x00kDN5\x00kDN6\x00kDN7\x00kEND3\x00kEND4\x00kEND5\x00kEND6\x00kEND7\x00kHOM3\x00kH
OM4\x00kHOM5\x00kHOM6\x00kHOM7\x00kIC3\x00kIC4\x00kIC5\x00kIC6\x00kIC7\x00kLFT3\x00kLFT4\x00kLFT5\x00k
LFT6\x00kLFT7\x00kNXT3\x00kNXT4\x00kNXT5\x00kNXT6\x00kNXT7\x00kPRV3\x00kPRV4\x00kPRV5\x00kPRV6\x00kPRV
7\x00kRIT3\x00kRIT4\x00kRIT5\x00kRIT6\x00kRIT7\x00kUP\x00kUP3\x00kUP4\x00kUP5\x00kUP6\x00kUP7\x00ka2\x
00kb1\x00kb3\x00kc2\x00"

The first 116 bytes are actually not part of the string table. I have no idea 
what purpose they serve. If I skip these 116 bytes, I get to the actual string 
table, and the offsets from before work perfectly fine. I noticed that 114 + 116 
is 230, so 230 bytes need to be read to get to the string table. I also noticed 
that the 4th short is 115, and 115 * 2 = 230. So essentially I’m using the 4th 
short multiplied by 2 as the byte length of the offsets section, but I take the 
3rd short of the header (the number of string caps) and use only that many 
shorts from the buffer. In other words, I read in those extra 116 bytes but 
never use them. This seems to work fine across different files, but I don’t 
think it’s correct. I think I’m misunderstanding something.
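The arithmetic can be checked with a few lines of Go. This only restates the numbers from this file; offsetLayout is a name made up for the example, and the split into "used" and "unused" bytes reflects my workaround, not anything documented in term(5).

```go
package main

import "fmt"

// offsetLayout (hypothetical helper) returns how many bytes of the offsets
// section hold string-cap offsets and how many bytes follow them before
// the string table, given the 3rd and 4th header shorts.
func offsetLayout(numStrings, fourthShort int) (used, unused int) {
	total := 2 * fourthShort // bytes read before reaching the string table
	used = 2 * numStrings    // bytes that hold the string-cap offsets
	return used, total - used
}

func main() {
	used, unused := offsetLayout(57, 115)
	fmt.Println(used, unused, used+unused) // prints 114 116 230
}
```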

Could someone please help me understand the extended format and what I am doing 
wrong?

If you need any further clarification on my problem, please do not hesitate to 
ask.

Here is the current extended reader code. 
https://github.com/nhooyr/terminfo/blob/master/read.go#L126

First I make sure we’re on a word boundary. Then I read in the 10 bytes for the 
header and print it out. Then I read in the bool section, followed by the 
numeric section (it’s empty in this case). Next I read the string offset 
section, but for the size I use r.h[lenExtTable] * 2, which means: take the 4th 
short (apparently the size of the string table section), multiply it by 2, and 
use that. Then I use the last offset as the size of the actual string table and 
print out all of the extended values. You can easily run this on your own 
computer by setting TERM to xterm, running "go get github.com/nhooyr/terminfo", 
and then running "go test github.com/nhooyr/terminfo -v". You should see all of 
the string caps being printed out.
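
For readers who would rather not open the link, the steps can be sketched in Go as below. This is a simplified illustration of the approach described in this message, not the code in read.go: readExtStrings and the tiny input buffer are invented for the example, and treating the 4th short times 2 as the size of the offsets section is my workaround, which may well be the wrong interpretation.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

// readExtStrings (hypothetical) expects buf to begin at the extended header,
// already aligned to an even offset, and returns the string cap values.
func readExtStrings(buf []byte) ([]string, error) {
	if len(buf) < 10 {
		return nil, errors.New("short extended header")
	}
	var h [5]int
	for i := range h {
		h[i] = int(binary.LittleEndian.Uint16(buf[2*i:]))
	}
	pos := 10
	pos += h[0] + h[0]%2 // booleans, plus a padding byte if the count is odd
	pos += 2 * h[1]      // numeric section, two bytes per value

	offEnd := pos + 2*h[3] // workaround: 4th short * 2 bytes of offsets
	tblEnd := offEnd + h[4] // 5th short used as the string table size
	if h[2] > h[3] || tblEnd > len(buf) {
		return nil, errors.New("truncated extended section")
	}
	table := buf[offEnd:tblEnd]

	strs := make([]string, 0, h[2])
	for i := 0; i < h[2]; i++ { // only the first h[2] offsets are used
		off := int(int16(binary.LittleEndian.Uint16(buf[pos+2*i:])))
		if off < 0 {
			strs = append(strs, "") // assumed to mean "no value"
			continue
		}
		if off >= len(table) {
			return nil, errors.New("offset outside string table")
		}
		end := bytes.IndexByte(table[off:], 0)
		if end < 0 {
			return nil, errors.New("unterminated string")
		}
		strs = append(strs, string(table[off:off+end]))
	}
	return strs, nil
}

func main() {
	// Made-up input: 1 bool, 0 numbers, 2 strings, 4th short 5, 5th short 10.
	buf := []byte{
		0x01, 0x00, 0x00, 0x00, 0x02, 0x00, 0x05, 0x00, 0x0a, 0x00, // header
		0x01, 0x00, // one boolean plus padding
		0x00, 0x00, 0x02, 0x00, // offsets of the two string values
		0x00, 0x00, 0x02, 0x00, 0x04, 0x00, // the extra shorts I skip
	}
	buf = append(buf, "A\x00B\x00x\x00y\x00z\x00"...)
	strs, err := readExtStrings(buf)
	fmt.Printf("%q %v\n", strs, err) // prints ["A" "B"] <nil>
}
```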