Re: extended format is wrong?
From: Anmol Sethi
Subject: Re: extended format is wrong?
Date: Fri, 29 Apr 2016 08:31:18 -0400
I got it now. Those extra bytes were the offsets for the names in the string
table. And the term(5) description of the fourth short of the extended header
as "(4) size of the extended string table in bytes" is wrong. It's actually the
total number of offsets. Though that field is unnecessary, because you can very
easily compute it from the other fields in the header.
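
In case it helps anyone else hitting this, the count falls straight out of the
other three header fields. A quick Go sketch (the names here are mine, not from
ncurses or my reader):

// Every extended cap (bool, numeric or string) has its name in the string
// table, and every string cap additionally has its value there, so the
// number of offsets stored ahead of the table is:
func extOffsetCount(boolCount, numCount, strCount int) int {
    return strCount + (boolCount + numCount + strCount)
}

For the xterm entry from earlier in the thread that gives 57 + (1 + 0 + 57) =
115, exactly the value of the 4th short.
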
> On Apr 26, 2016, at 11:24 PM, Anmol Sethi <address@hidden> wrote:
>
> I found https://github.com/mauke/unibilium
>
> It supports extended capabilities, and its code is much easier for me to
> understand. I’m going to try implementing it again tomorrow.
>> On Apr 26, 2016, at 10:03 PM, Anmol Sethi <address@hidden> wrote:
>>
>> I’d test to see how ncurses handles it and fix my problems from there, but
>> I cannot enable tracing for some reason. I have zero experience with C,
>> sorry! I set the environment variable, and I can see that it is set, but it
>> outputs nothing. And I do have the right flags set.
>>
>>> On Apr 26, 2016, at 10:02 PM, Anmol Sethi <address@hidden> wrote:
>>>
>>> Hello!
>>>
>>> I’m writing a terminfo library for Go. Someone here suggested I add support
>>> for the extended storage format. I’ve tried, and I’m running into quite a
>>> few problems. Here is an example xterm terminfo file from my system in hex;
>>> I’ve only included the extended portion, divided up into sections.
>>>
>>>
>>> EXTENDED:
>>> Header:
>>> 01 00 00 00 39 00 73 00 a2 02
>>> Values are 1, 0, 57, 115, and 674 in decimal.
>>>
>>> According to term(5), that means 1 bool cap, 0 numeric caps, 57 string
>>> caps, 115 bytes for the extended string table and 674 as the last offset of
>>> the extended string table. So I attempted to write a reader using this
>>> information, and it turns out those values are wrong or I am not
>>> understanding any of this correctly. Before I begin, what exactly is the
>>> difference between the 4th and 5th short integers of the header? One is the
>>> byte length of the string table and the other is the last offset of the
>>> string table. I’m not 100% sure what exactly that means. I’m going to
>>> assume that the first one is the size of the portion holding the string cap
>>> values and the second is the size of the entire table. The ncurses source
>>> defines the 4th short as ext_str_size and the 5th as ext_str_limit. The
>>> funny thing is, the source doesn’t even use the 4th short.
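>>>
>>> For reference, this is roughly how I’ve been decoding those five shorts
>>> while poking at the file (a throwaway sketch, not the code from my repo;
>>> the field names just mirror the ncurses ones):
>>>
>>> package main
>>>
>>> import (
>>>     "bytes"
>>>     "encoding/binary"
>>>     "fmt"
>>> )
>>>
>>> func main() {
>>>     // The 10 header bytes from the xterm dump above.
>>>     raw := []byte{0x01, 0x00, 0x00, 0x00, 0x39, 0x00, 0x73, 0x00, 0xa2, 0x02}
>>>
>>>     // Five little-endian shorts; names mirror the ncurses source.
>>>     var hdr struct {
>>>         ExtBool     int16 // count of extended boolean caps
>>>         ExtNum      int16 // count of extended numeric caps
>>>         ExtStr      int16 // count of extended string caps
>>>         ExtStrSize  int16 // the 4th short, ext_str_size in ncurses
>>>         ExtStrLimit int16 // the 5th short, ext_str_limit in ncurses
>>>     }
>>>     if err := binary.Read(bytes.NewReader(raw), binary.LittleEndian, &hdr); err != nil {
>>>         panic(err)
>>>     }
>>>     fmt.Println(hdr) // {1 0 57 115 674}
>>> }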
>>>
>>> Bool:
>>> 01 00
>>>
>>> Anyway, after the header there is the boolean section. Since there is only
>>> one boolean value, I read the byte, and then I noticed the old quirk: I
>>> need to skip the extra null byte inserted to keep everything on word
>>> boundaries. OK, done and done.
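>>>
>>> In code that quirk comes out as a tiny parity check, something like this (a
>>> sketch reusing the hdr names from above, and assuming r is an io.Reader
>>> positioned right after the bool values):
>>>
>>> // Bool values are one byte each, so if the count is odd there is a null
>>> // padding byte before the shorts that follow; skip it.
>>> if hdr.ExtBool%2 == 1 {
>>>     var pad [1]byte
>>>     if _, err := io.ReadFull(r, pad[:]); err != nil {
>>>         return err
>>>     }
>>> }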
>>>
>>> String Section:
>>> ff ff 00 00 07 00 0e 00 15 00 1c 00 23 00 2a 00 31 00 38 00 3f 00 46 00 4d
>>> 00 54 00 5b 00 62 00 69 00
>>> 70 00 77 00 7e 00 85 00 8c 00 93 00 9a 00 a1 00 a8 00 af 00 b6 00 bd 00 c4
>>> 00 cb 00 d2 00 d9 00 e0 00
>>> e7 00 ee 00 f5 00 fc 00 03 01 0a 01 11 01 18 01 1f 01 26 01 2d 01 34 01 3b
>>> 01 42 01 49 01 50 01 57 01
>>> 5e 01 65 01 ff ff ff ff ff ff ff ff
>>>
>>> Next, the numeric capability count was 0, so I went straight to the string
>>> caps. Here I read in these 114 bytes, i.e. 57 shorts.
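>>>
>>> That read is just hdr.ExtStr little-endian shorts, roughly (same assumed
>>> hdr and r as above):
>>>
>>> // One int16 offset per extended string cap; with hdr.ExtStr == 57 this
>>> // consumes 114 bytes. 0xffff (-1) is the usual absent-capability marker.
>>> strOffsets := make([]int16, hdr.ExtStr)
>>> if err := binary.Read(r, binary.LittleEndian, strOffsets); err != nil {
>>>     return err
>>> }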
>>>
>>> Rest:
>>> 00 00 03 00 06 00 0b 00 10 00
>>> 15 00 1a 00 1f 00 23 00 28 00 2d 00 32 00 37 00 3c 00 42 00 48 00 4e 00 54
>>> 00 5a 00 60 00 66 00 6c 00
>>> 72 00 78 00 7d 00 82 00 87 00 8c 00 91 00 97 00 9d 00 a3 00 a9 00 af 00 b5
>>> 00 bb 00 c1 00 c7 00 cd 00
>>> d3 00 d9 00 df 00 e5 00 eb 00 f1 00 f7 00 fd 00 03 01 09 01 0d 01 12 01 17
>>> 01 1c 01 21 01 26 01 2a 01
>>> 2e 01 32 01 1b 5b 33 3b 33 7e 00 1b 5b 33 3b 34 7e 00 1b 5b 33 3b 35 7e 00
>>> 1b 5b 33 3b 36 7e 00 1b 5b
>>> 33 3b 37 7e 00 1b 5b 31 3b 32 42 00 1b 5b 31 3b 33 42 00 1b 5b 31 3b 34 42
>>> 00 1b 5b 31 3b 35 42 00 1b
>>> 5b 31 3b 36 42 00 1b 5b 31 3b 37 42 00 1b 5b 31 3b 33 46 00 1b 5b 31 3b 34
>>> 46 00 1b 5b 31 3b 35 46 00
>>> 1b 5b 31 3b 36 46 00 1b 5b 31 3b 37 46 00 1b 5b 31 3b 33 48 00 1b 5b 31 3b
>>> 34 48 00 1b 5b 31 3b 35 48
>>> 00 1b 5b 31 3b 36 48 00 1b 5b 31 3b 37 48 00 1b 5b 32 3b 33 7e 00 1b 5b 32
>>> 3b 34 7e 00 1b 5b 32 3b 35
>>> 7e 00 1b 5b 32 3b 36 7e 00 1b 5b 32 3b 37 7e 00 1b 5b 31 3b 33 44 00 1b 5b
>>> 31 3b 34 44 00 1b 5b 31 3b
>>> 35 44 00 1b 5b 31 3b 36 44 00 1b 5b 31 3b 37 44 00 1b 5b 36 3b 33 7e 00 1b
>>> 5b 36 3b 34 7e 00 1b 5b 36
>>> 3b 35 7e 00 1b 5b 36 3b 36 7e 00 1b 5b 36 3b 37 7e 00 1b 5b 35 3b 33 7e 00
>>> 1b 5b 35 3b 34 7e 00 1b 5b
>>> 35 3b 35 7e 00 1b 5b 35 3b 36 7e 00 1b 5b 35 3b 37 7e 00 1b 5b 31 3b 33 43
>>> 00 1b 5b 31 3b 34 43 00 1b
>>> 5b 31 3b 35 43 00 1b 5b 31 3b 36 43 00 1b 5b 31 3b 37 43 00 1b 5b 31 3b 32
>>> 41 00 1b 5b 31 3b 33 41 00
>>> 1b 5b 31 3b 34 41 00 1b 5b 31 3b 35 41 00 1b 5b 31 3b 36 41 00 1b 5b 31 3b
>>> 37 41 00 41 58 00 58 4d 00
>>> 6b 44 43 33 00 6b 44 43 34 00 6b 44 43 35 00 6b 44 43 36 00 6b 44 43 37 00
>>> 6b 44 4e 00 6b 44 4e 33 00
>>> 6b 44 4e 34 00 6b 44 4e 35 00 6b 44 4e 36 00 6b 44 4e 37 00 6b 45 4e 44 33
>>> 00 6b 45 4e 44 34 00 6b 45
>>> 4e 44 35 00 6b 45 4e 44 36 00 6b 45 4e 44 37 00 6b 48 4f 4d 33 00 6b 48 4f
>>> 4d 34 00 6b 48 4f 4d 35 00
>>> 6b 48 4f 4d 36 00 6b 48 4f 4d 37 00 6b 49 43 33 00 6b 49 43 34 00 6b 49 43
>>> 35 00 6b 49 43 36 00 6b 49
>>> 43 37 00 6b 4c 46 54 33 00 6b 4c 46 54 34 00 6b 4c 46 54 35 00 6b 4c 46 54
>>> 36 00 6b 4c 46 54 37 00 6b
>>> 4e 58 54 33 00 6b 4e 58 54 34 00 6b 4e 58 54 35 00 6b 4e 58 54 36 00 6b 4e
>>> 58 54 37 00 6b 50 52 56 33
>>> 00 6b 50 52 56 34 00 6b 50 52 56 35 00 6b 50 52 56 36 00 6b 50 52 56 37 00
>>> 6b 52 49 54 33 00 6b 52 49
>>> 54 34 00 6b 52 49 54 35 00 6b 52 49 54 36 00 6b 52 49 54 37 00 6b 55 50 00
>>> 6b 55 50 33 00 6b 55 50 34
>>> 00 6b 55 50 35 00 6b 55 50 36 00 6b 55 50 37 00 6b 61 32 00 6b 62 31 00 6b
>>> 62 33 00 6b 63 32 00
>>>
>>> Alright, now the rest of it should be the string table, right? Well, it
>>> turns out that it isn’t. Look, this is the rest of the extended data (shown
>>> in a different format, but it should be clear what’s going on):
>>>
>>> First 116 bytes:
>>> "\x00\x00\x03\x00\x06\x00\v\x00\x10\x00\x15\x00\x1a\x00\x1f\x00#\x00(\x00-\x002\x007\x00<\x00B\x00H\x0
>>> 0N\x00T\x00Z\x00`\x00f\x00l\x00r\x00x\x00}\x00\x82\x00\x87\x00\x8c\x00\x91\x00\x97\x00\x9d\x00\xa3\x00
>>> \xa9\x00\xaf\x00\xb5\x00\xbb\x00\xc1\x00\xc7\x00\xcd\x00\xd3\x00\xd9\x00\xdf\x00\xe5\x00\xeb\x00\xf1\x
>>> 00\xf7\x00\xfd\x00\x03\x01\t\x01\r\x01\x12\x01\x17\x01\x1c\x01!\x01&\x01*\x01.\x012\x01"
>>>
>>> String Table:
>>> "\x1b[3;3~\x00\x
>>> 1b[3;4~\x00\x1b[3;5~\x00\x1b[3;6~\x00\x1b[3;7~\x00\x1b[1;2B\x00\x1b[1;3B\x00\x1b[1;4B\x00\x1b[1;5B\x00
>>> \x1b[1;6B\x00\x1b[1;7B\x00\x1b[1;3F\x00\x1b[1;4F\x00\x1b[1;5F\x00\x1b[1;6F\x00\x1b[1;7F\x00\x1b[1;3H\x
>>> 00\x1b[1;4H\x00\x1b[1;5H\x00\x1b[1;6H\x00\x1b[1;7H\x00\x1b[2;3~\x00\x1b[2;4~\x00\x1b[2;5~\x00\x1b[2;6~
>>> \x00\x1b[2;7~\x00\x1b[1;3D\x00\x1b[1;4D\x00\x1b[1;5D\x00\x1b[1;6D\x00\x1b[1;7D\x00\x1b[6;3~\x00\x1b[6;
>>> 4~\x00\x1b[6;5~\x00\x1b[6;6~\x00\x1b[6;7~\x00\x1b[5;3~\x00\x1b[5;4~\x00\x1b[5;5~\x00\x1b[5;6~\x00\x1b[
>>> 5;7~\x00\x1b[1;3C\x00\x1b[1;4C\x00\x1b[1;5C\x00\x1b[1;6C\x00\x1b[1;7C\x00\x1b[1;2A\x00\x1b[1;3A\x00\x1
>>> b[1;4A\x00\x1b[1;5A\x00\x1b[1;6A\x00\x1b[1;7A\x00AX\x00XM\x00kDC3\x00kDC4\x00kDC5\x00kDC6\x00kDC7\x00k
>>> DN\x00kDN3\x00kDN4\x00kDN5\x00kDN6\x00kDN7\x00kEND3\x00kEND4\x00kEND5\x00kEND6\x00kEND7\x00kHOM3\x00kH
>>> OM4\x00kHOM5\x00kHOM6\x00kHOM7\x00kIC3\x00kIC4\x00kIC5\x00kIC6\x00kIC7\x00kLFT3\x00kLFT4\x00kLFT5\x00k
>>> LFT6\x00kLFT7\x00kNXT3\x00kNXT4\x00kNXT5\x00kNXT6\x00kNXT7\x00kPRV3\x00kPRV4\x00kPRV5\x00kPRV6\x00kPRV
>>> 7\x00kRIT3\x00kRIT4\x00kRIT5\x00kRIT6\x00kRIT7\x00kUP\x00kUP3\x00kUP4\x00kUP5\x00kUP6\x00kUP7\x00ka2\x
>>> 00kb1\x00kb3\x00kc2\x00"
>>>
>>> The first 116 bytes are actually not part of the string table. I have no
>>> idea what purpose they serve. If I skip these 116 bytes, I get to the
>>> actual string table, and the offsets from before work perfectly fine. I
>>> noticed that 114 + 116 is 230, so 230 bytes need to be read to get to the
>>> string table. I also noticed that the 4th short is 115, and 115 * 2 = 230.
>>> So essentially I’m using the 4th short multiplied by 2 as the length of the
>>> offset section, but I take the 3rd short of the header, i.e. the number of
>>> string caps, and use only that many shorts from the buffer. That is, I read
>>> in those extra 116 bytes but never use them. This seems to work fine across
>>> different files, but I don’t think it’s correct. I think I’m
>>> misunderstanding something.
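>>>
>>> Put differently, the only thing keeping my reader working right now is this
>>> bit of arithmetic (a sketch, with the same made-up names as above):
>>>
>>> // Treat the 4th short as the total number of offset shorts, read them
>>> // all, but only keep the first hdr.ExtStr of them as string-value
>>> // offsets. The remaining 58 shorts (the mysterious 116 bytes) get read
>>> // and thrown away.
>>> allOffsets := make([]int16, hdr.ExtStrSize) // 115 shorts = 230 bytes
>>> if err := binary.Read(r, binary.LittleEndian, allOffsets); err != nil {
>>>     return err
>>> }
>>> valueOffsets := allOffsets[:hdr.ExtStr] // the 57 offsets I actually use
>>>
>>> 57*2 + 58*2 = 115*2 = 230, which is why multiplying the 4th short by 2
>>> happens to line up.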
>>>
>>> Could someone please help me understand the extended format and what I am
>>> doing wrong?
>>>
>>> If you need any further clarification on my problem, please do not hesitate
>>> to ask.
>>>
>>> Here is the current extended reader code.
>>> https://github.com/nhooyr/terminfo/blob/master/read.go#L126
>>>
>>> First I make sure we’re on a word boundary. Then I read in the 10-byte
>>> extended header and print it out. Then I read in the bool section. Then I
>>> read in the numeric section (it’s empty in this case). Next I read the
>>> string offset section, but for its size I use r.h[lenExtTable] * 2, which
>>> means: take the 4th short (apparently the size of the string table
>>> section), multiply it by 2, and use that. Then I use the last offset as the
>>> size of the actual string table and print out all of the extended values.
>>> You can easily run this on your own computer by setting TERM to xterm,
>>> running "go get github.com/nhooyr/terminfo" and then running "go test
>>> github.com/nhooyr/terminfo -v". You should see all of the string caps
>>> printed out.
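>>>
>>> The last step looks roughly like this (a sketch with the same assumed names
>>> as above; the values are null-terminated strings pulled out of the table by
>>> offset):
>>>
>>> // Read the whole extended string table, then slice each cap value out at
>>> // its offset.
>>> table := make([]byte, hdr.ExtStrLimit)
>>> if _, err := io.ReadFull(r, table); err != nil {
>>>     return err
>>> }
>>> for _, off := range valueOffsets {
>>>     if off < 0 { // 0xffff marks an absent capability
>>>         continue
>>>     }
>>>     val := table[off:]
>>>     if i := bytes.IndexByte(val, 0); i >= 0 {
>>>         val = val[:i]
>>>     }
>>>     fmt.Printf("%q\n", val)
>>> }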
>>>
>>
>