extended format is wrong?
From: Anmol Sethi
Subject: extended format is wrong?
Date: Tue, 26 Apr 2016 22:02:13 -0400
Hello!
I’m writing a terminfo library for Go. Someone here suggested I add support for
the extended storage format. I’ve tried and I’m running into quite a few
problems. Here is an example xterm terminfo file from my system in hex. I’ve
included only the extended section, split into its parts.
EXTENDED:
Header:
01 00 00 00 39 00 73 00 a2 02
Values are 1, 0, 57, 115, and 674 in decimal.
According to term(5), that means 1 bool cap, 0 numeric caps, 57 string caps,
115 bytes for the extended string table and 674 bytes for the last offset of
the extended string table. I tried writing a reader with this information, and
it turns out that either those values are wrong or I am not understanding this
correctly. Before I begin, what exactly is the difference between the 4th and
5th short integers of the header? One is the byte length of the string table
and the other is the last offset of the string table. I’m not 100% sure what
that means. I’m going to assume that the first is the size of the region that
holds the string cap values and the second is the size of the entire table. The
ncurses source defines the 4th short as ext_str_size and the 5th as
ext_str_limit. Funny thing is, the source doesn’t even use the 4th short.
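For reference, a minimal Go sketch of decoding those 10 header bytes as five little-endian 16-bit values. The type and field names (extHeader, Field4, Field5) are made up for illustration and are not taken from ncurses or from the library linked below; the 4th and 5th fields are left unnamed on purpose, since their meaning is exactly what is in question.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// extHeader holds the five little-endian 16-bit values that open the
// extended section. Field names are illustrative only.
type extHeader struct {
	Bools, Nums, Strs, Field4, Field5 int
}

// parseExtHeader decodes the 10-byte extended header.
func parseExtHeader(b []byte) (extHeader, error) {
	if len(b) < 10 {
		return extHeader{}, fmt.Errorf("extended header needs 10 bytes, got %d", len(b))
	}
	// u reads the i-th unsigned short from the buffer.
	u := func(i int) int { return int(binary.LittleEndian.Uint16(b[2*i:])) }
	return extHeader{u(0), u(1), u(2), u(3), u(4)}, nil
}

func main() {
	// The header bytes from the dump above.
	h, err := parseExtHeader([]byte{0x01, 0x00, 0x00, 0x00, 0x39, 0x00, 0x73, 0x00, 0xa2, 0x02})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", h)
}
```

Run on the bytes above, this yields the same five decimal values listed at the top of this section.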
Bool:
01 00
After the header comes the boolean section. Since there is only one boolean
value, I read that byte and then hit the old quirk: I need to skip the extra
null byte inserted to keep everything on word boundaries. Done and done.
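The padding rule can be sketched as below, assuming the extended section itself starts on an even offset so that only the parity of the bool count matters. boolSectionLen is a hypothetical helper name.

```go
package main

import "fmt"

// boolSectionLen returns how many bytes the boolean section occupies on
// disk: one byte per flag, plus one pad byte when the count is odd so that
// the 16-bit values that follow start on an even offset.
// Assumption: the section itself begins on an even offset.
func boolSectionLen(nBools int) int {
	return nBools + nBools%2
}

func main() {
	// One flag, as in the dump above: the flag byte plus one pad byte.
	fmt.Println(boolSectionLen(1))
}
```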
String Section:
ff ff 00 00 07 00 0e 00 15 00 1c 00 23 00 2a 00 31 00 38 00 3f 00 46 00 4d 00
54 00 5b 00 62 00 69 00
70 00 77 00 7e 00 85 00 8c 00 93 00 9a 00 a1 00 a8 00 af 00 b6 00 bd 00 c4 00
cb 00 d2 00 d9 00 e0 00
e7 00 ee 00 f5 00 fc 00 03 01 0a 01 11 01 18 01 1f 01 26 01 2d 01 34 01 3b 01
42 01 49 01 50 01 57 01
5e 01 65 01 ff ff ff ff ff ff ff ff
Next, since there are 0 numeric capabilities, I went straight to the string
caps. Here I read in these 114 bytes, or 57 shorts.
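A small sketch of reading such a run of shorts as signed values, so that ff ff comes out as -1. In terminfo files a negative offset marks a capability with no value, which would account for the leading and trailing ff ff entries in the dump. readOffsets is a hypothetical helper, not the function used in the linked reader.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// readOffsets decodes n little-endian signed 16-bit offsets from b.
// A negative value (0xffff reads as -1) marks a capability with no value.
func readOffsets(b []byte, n int) ([]int, error) {
	if len(b) < 2*n {
		return nil, fmt.Errorf("need %d bytes for %d offsets, got %d", 2*n, n, len(b))
	}
	offs := make([]int, n)
	for i := range offs {
		offs[i] = int(int16(binary.LittleEndian.Uint16(b[2*i:])))
	}
	return offs, nil
}

func main() {
	// First four shorts of the string section shown above.
	offs, err := readOffsets([]byte{0xff, 0xff, 0x00, 0x00, 0x07, 0x00, 0x0e, 0x00}, 4)
	if err != nil {
		panic(err)
	}
	fmt.Println(offs)
}
```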
Rest:
00 00 03 00 06 00 0b 00 10 00
15 00 1a 00 1f 00 23 00 28 00 2d 00 32 00 37 00 3c 00 42 00 48 00 4e 00 54 00
5a 00 60 00 66 00 6c 00
72 00 78 00 7d 00 82 00 87 00 8c 00 91 00 97 00 9d 00 a3 00 a9 00 af 00 b5 00
bb 00 c1 00 c7 00 cd 00
d3 00 d9 00 df 00 e5 00 eb 00 f1 00 f7 00 fd 00 03 01 09 01 0d 01 12 01 17 01
1c 01 21 01 26 01 2a 01
2e 01 32 01 1b 5b 33 3b 33 7e 00 1b 5b 33 3b 34 7e 00 1b 5b 33 3b 35 7e 00 1b
5b 33 3b 36 7e 00 1b 5b
33 3b 37 7e 00 1b 5b 31 3b 32 42 00 1b 5b 31 3b 33 42 00 1b 5b 31 3b 34 42 00
1b 5b 31 3b 35 42 00 1b
5b 31 3b 36 42 00 1b 5b 31 3b 37 42 00 1b 5b 31 3b 33 46 00 1b 5b 31 3b 34 46
00 1b 5b 31 3b 35 46 00
1b 5b 31 3b 36 46 00 1b 5b 31 3b 37 46 00 1b 5b 31 3b 33 48 00 1b 5b 31 3b 34
48 00 1b 5b 31 3b 35 48
00 1b 5b 31 3b 36 48 00 1b 5b 31 3b 37 48 00 1b 5b 32 3b 33 7e 00 1b 5b 32 3b
34 7e 00 1b 5b 32 3b 35
7e 00 1b 5b 32 3b 36 7e 00 1b 5b 32 3b 37 7e 00 1b 5b 31 3b 33 44 00 1b 5b 31
3b 34 44 00 1b 5b 31 3b
35 44 00 1b 5b 31 3b 36 44 00 1b 5b 31 3b 37 44 00 1b 5b 36 3b 33 7e 00 1b 5b
36 3b 34 7e 00 1b 5b 36
3b 35 7e 00 1b 5b 36 3b 36 7e 00 1b 5b 36 3b 37 7e 00 1b 5b 35 3b 33 7e 00 1b
5b 35 3b 34 7e 00 1b 5b
35 3b 35 7e 00 1b 5b 35 3b 36 7e 00 1b 5b 35 3b 37 7e 00 1b 5b 31 3b 33 43 00
1b 5b 31 3b 34 43 00 1b
5b 31 3b 35 43 00 1b 5b 31 3b 36 43 00 1b 5b 31 3b 37 43 00 1b 5b 31 3b 32 41
00 1b 5b 31 3b 33 41 00
1b 5b 31 3b 34 41 00 1b 5b 31 3b 35 41 00 1b 5b 31 3b 36 41 00 1b 5b 31 3b 37
41 00 41 58 00 58 4d 00
6b 44 43 33 00 6b 44 43 34 00 6b 44 43 35 00 6b 44 43 36 00 6b 44 43 37 00 6b
44 4e 00 6b 44 4e 33 00
6b 44 4e 34 00 6b 44 4e 35 00 6b 44 4e 36 00 6b 44 4e 37 00 6b 45 4e 44 33 00
6b 45 4e 44 34 00 6b 45
4e 44 35 00 6b 45 4e 44 36 00 6b 45 4e 44 37 00 6b 48 4f 4d 33 00 6b 48 4f 4d
34 00 6b 48 4f 4d 35 00
6b 48 4f 4d 36 00 6b 48 4f 4d 37 00 6b 49 43 33 00 6b 49 43 34 00 6b 49 43 35
00 6b 49 43 36 00 6b 49
43 37 00 6b 4c 46 54 33 00 6b 4c 46 54 34 00 6b 4c 46 54 35 00 6b 4c 46 54 36
00 6b 4c 46 54 37 00 6b
4e 58 54 33 00 6b 4e 58 54 34 00 6b 4e 58 54 35 00 6b 4e 58 54 36 00 6b 4e 58
54 37 00 6b 50 52 56 33
00 6b 50 52 56 34 00 6b 50 52 56 35 00 6b 50 52 56 36 00 6b 50 52 56 37 00 6b
52 49 54 33 00 6b 52 49
54 34 00 6b 52 49 54 35 00 6b 52 49 54 36 00 6b 52 49 54 37 00 6b 55 50 00 6b
55 50 33 00 6b 55 50 34
00 6b 55 50 35 00 6b 55 50 36 00 6b 55 50 37 00 6b 61 32 00 6b 62 31 00 6b 62
33 00 6b 63 32 00
Alright, now the rest of it should be the string table, right? Well, it turns
out that it isn’t. Here, look: this is the rest of the extended format (it is in
a different format, but it should be clear what’s going on):
First 116 bytes:
"\x00\x00\x03\x00\x06\x00\v\x00\x10\x00\x15\x00\x1a\x00\x1f\x00#\x00(\x00-\x002\x007\x00<\x00B\x00H\x0
0N\x00T\x00Z\x00`\x00f\x00l\x00r\x00x\x00}\x00\x82\x00\x87\x00\x8c\x00\x91\x00\x97\x00\x9d\x00\xa3\x00
\xa9\x00\xaf\x00\xb5\x00\xbb\x00\xc1\x00\xc7\x00\xcd\x00\xd3\x00\xd9\x00\xdf\x00\xe5\x00\xeb\x00\xf1\x
00\xf7\x00\xfd\x00\x03\x01\t\x01\r\x01\x12\x01\x17\x01\x1c\x01!\x01&\x01*\x01.\x012\x01"
String Table:
"\x1b[3;3~\x00\x
1b[3;4~\x00\x1b[3;5~\x00\x1b[3;6~\x00\x1b[3;7~\x00\x1b[1;2B\x00\x1b[1;3B\x00\x1b[1;4B\x00\x1b[1;5B\x00
\x1b[1;6B\x00\x1b[1;7B\x00\x1b[1;3F\x00\x1b[1;4F\x00\x1b[1;5F\x00\x1b[1;6F\x00\x1b[1;7F\x00\x1b[1;3H\x
00\x1b[1;4H\x00\x1b[1;5H\x00\x1b[1;6H\x00\x1b[1;7H\x00\x1b[2;3~\x00\x1b[2;4~\x00\x1b[2;5~\x00\x1b[2;6~
\x00\x1b[2;7~\x00\x1b[1;3D\x00\x1b[1;4D\x00\x1b[1;5D\x00\x1b[1;6D\x00\x1b[1;7D\x00\x1b[6;3~\x00\x1b[6;
4~\x00\x1b[6;5~\x00\x1b[6;6~\x00\x1b[6;7~\x00\x1b[5;3~\x00\x1b[5;4~\x00\x1b[5;5~\x00\x1b[5;6~\x00\x1b[
5;7~\x00\x1b[1;3C\x00\x1b[1;4C\x00\x1b[1;5C\x00\x1b[1;6C\x00\x1b[1;7C\x00\x1b[1;2A\x00\x1b[1;3A\x00\x1
b[1;4A\x00\x1b[1;5A\x00\x1b[1;6A\x00\x1b[1;7A\x00AX\x00XM\x00kDC3\x00kDC4\x00kDC5\x00kDC6\x00kDC7\x00k
DN\x00kDN3\x00kDN4\x00kDN5\x00kDN6\x00kDN7\x00kEND3\x00kEND4\x00kEND5\x00kEND6\x00kEND7\x00kHOM3\x00kH
OM4\x00kHOM5\x00kHOM6\x00kHOM7\x00kIC3\x00kIC4\x00kIC5\x00kIC6\x00kIC7\x00kLFT3\x00kLFT4\x00kLFT5\x00k
LFT6\x00kLFT7\x00kNXT3\x00kNXT4\x00kNXT5\x00kNXT6\x00kNXT7\x00kPRV3\x00kPRV4\x00kPRV5\x00kPRV6\x00kPRV
7\x00kRIT3\x00kRIT4\x00kRIT5\x00kRIT6\x00kRIT7\x00kUP\x00kUP3\x00kUP4\x00kUP5\x00kUP6\x00kUP7\x00ka2\x
00kb1\x00kb3\x00kc2\x00"
The first 116 bytes are actually not part of the string table. I have no idea
what purpose they serve. If I skip these 116 bytes, I get to the actual string
table and the offsets from before work perfectly fine. I noticed that 114 + 116
is 230, so 230 bytes need to be read to get to the string table. I also noticed
that the 4th short is 115, and 115 * 2 = 230. So essentially I’m using the 4th
short multiplied by 2 as the byte length of the offsets section, but I use only
the first 57 shorts from that buffer (the 3rd short of the header, i.e. the
number of string caps). In other words, I read in those extra 116 bytes but
never use them. This seems to work fine across different files, but I don’t
think it’s correct. I think I’m misunderstanding something.
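One reading that fits every number in this dump, offered as an assumption and not something the term(5) wording quoted above states: the offsets section holds two arrays back to back, 57 offsets for the string values followed by 1 + 0 + 57 = 58 offsets for the capability names. That gives 57 + 58 = 115 shorts, or 230 bytes. In the dump, the first few of the 116 unexplained bytes decode to 0, 3, 6 and 11, which are where "AX", "XM", "kDC3" and "kDC4" begin if counted from the start of the names. The byte counts also add up: the values end at 0x165 + 7 = 364, the names end at 0x132 + 4 = 310, and 364 + 310 = 674, the 5th short. The sketch below only checks that arithmetic; offsetCounts and cstring are hypothetical helpers.

```go
package main

import (
	"bytes"
	"fmt"
)

// offsetCounts returns how many 16-bit offsets the extended section would
// hold under the assumption above: one per string value, then one per
// capability name (bools + nums + strings).
func offsetCounts(bools, nums, strs int) (values, names int) {
	return strs, bools + nums + strs
}

// cstring returns the NUL-terminated string starting at off in table.
func cstring(table []byte, off int) string {
	if off < 0 || off >= len(table) {
		return ""
	}
	end := bytes.IndexByte(table[off:], 0)
	if end < 0 {
		return string(table[off:])
	}
	return string(table[off : off+end])
}

func main() {
	values, names := offsetCounts(1, 0, 57)
	// Compare with the 4th short (115) and with 114 + 116 bytes.
	fmt.Println(values+names, 2*(values+names))

	// First four names and the first four unexplained shorts from the dump.
	nameTable := []byte("AX\x00XM\x00kDC3\x00kDC4\x00")
	for _, off := range []int{0x00, 0x03, 0x06, 0x0b} {
		fmt.Println(off, cstring(nameTable, off))
	}
}
```

If this reading holds, a reader would keep all 115 shorts and resolve the last 58 against the part of the string table that follows the final string value, instead of discarding them.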
Could someone please help me understand the extended format and what I am doing
wrong?
If you need any further clarification on my problem, please do not hesitate to
ask.
Here is the current extended reader code.
https://github.com/nhooyr/terminfo/blob/master/read.go#L126
First I make sure we’re on a word boundary. Then I read in the 10 bytes for the
header and print it out. Then I read in the bool section, followed by the
numeric section (it’s empty in this case). Next I read the string offset
section, but for the size I use r.h[lenExtTable] * 2, which means: take the 4th
short (apparently the size of the string table section), multiply it by 2, and
use that. Then I use the last offset as the size of the actual string table and
print out all of the extended values. You can easily run this on your own
computer by setting TERM to xterm, running "go get
github.com/nhooyr/terminfo" and then running "go test
github.com/nhooyr/terminfo -v". You should see all of the string caps being
printed out.