Re: bug#58320: Hurd VM fails to boot on AMD EPYC (kvm-amd)
From: Ludovic Courtès
Subject: Re: bug#58320: Hurd VM fails to boot on AMD EPYC (kvm-amd)
Date: Mon, 17 Oct 2022 14:51:01 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.1 (gnu/linux)
Hi,
Ludovic Courtès <ludo@gnu.org> writes:
> … so ‘exec_load’ is doing its job, it seems.
Turns out that may not be the case.
Here’s a *bad* mapping on the second ‘task_resume’ breakpoint (when
‘exec’ is about to start):
--8<---------------cut here---------------start------------->8---
db> show all threads
TASK THREADS
0 gnumach (f5f7cf00): 7 threads:
0 (f5f7be18) .W..N. 0xc11dac04
1 (f5f7bcd0) R..O..(idle_thread_continue)
2 (f5f7bb88) .W.ON.(reaper_thread_continue) 0xc12015d4
3 (f5f7ba40) .W.ON.(swapin_thread_continue) 0xc11f8e2c
4 (f5f7b8f8) .W.ON.(sched_thread_continue) 0
5 (f5f7b7b0) .W.ON.(io_done_thread_continue) 0xc1201f74
6 (f5f7b668) .W.ON.(net_thread_continue) 0xc11db0a8
1 ext2fs (f5f7ce40): 6 threads:
0 (f5f7b520) R....F
1 (f5f7b290) .W.O..(mach_msg_receive_continue) 0
2 (f5f7b148) .W.O..(mach_msg_receive_continue) 0
3 (f5f7b000) .W.O..(mach_msg_continue) 0
4 (f67d3e20) .W.O..(mach_msg_receive_continue) 0
5 (f67d3cd8) .W.O..(mach_msg_continue) 0
2 exec (f5f7cd80): (f5f7b3d8) ..SO..(thread_bootstrap_return)
db> trace
task_resume(f593e010,fb7d9010,f5f73e80,c106972a)
ipc_kobject_server(f593e000,3,18,0)+0x1eb
mach_msg_trap(bffff4c0,3,18,20,8)+0x1703
>>>>> user space <<<<<
db> x/tbx 0xcbc 0xf5f7b3d8
no memory is assigned to address 00000cbc
0
db> show map $map2
Map 0xf5f6ff30: name="exec", pmap=0xf5f71fa8,ref=1,nentries=5
size=290816,resident:225280,wired=0
version=13
map entry 0xf625ec08: start=0x0, end=0x1000
prot=1/7/copy, object=0x0, offset=0x0
map entry 0xf625ebb0: start=0x1000, end=0x26000
prot=5/7/copy, object=0xf5f6ad70, offset=0x0
Object 0xf5f6ad70: size=0x25000, 1 references
37 resident pages, 0 absent pages, 0 paging ops
memory object=0x0 (offset=0x0),control=0x0, name=0xf5f82780
uninitialized,temporary internal,copy_strategy=0
shadow=0x0 (offset=0x0),copy=0x0
map entry 0xf625eb58: start=0x26000, end=0x34000
prot=1/7/copy, object=0xf5f6ad20, offset=0x0
Object 0xf5f6ad20: size=0xe000, 1 references
14 resident pages, 0 absent pages, 0 paging ops
memory object=0x0 (offset=0x0),control=0x0, name=0xf5f82730
uninitialized,temporary internal,copy_strategy=0
shadow=0x0 (offset=0x0),copy=0x0
map entry 0xf625eb00: start=0x34000, end=0x37000
prot=3/7/copy, object=0xf5f6acd0, offset=0x0
Object 0xf5f6acd0: size=0x3000, 1 references
3 resident pages,--db_more--
--8<---------------cut here---------------end--------------->8---
Compare with what a “good” mapping looks like at that same moment:
--8<---------------cut here---------------start------------->8---
start ext2fs: Hurd server bootstrap: ext2fs[device:hd0s1]Kernel Breakpoint trap, eip 0xc1030d5b
Breakpoint at task_resume: pushl %ebp
db> show all threads
TASK THREADS
0 gnumach (f5f7cf00): 7 threads:
0 (f5f7be18) .W..N. 0xc11dac04
1 (f5f7bcd0) R..O..(idle_thread_continue)
2 (f5f7bb88) .W.ON.(reaper_thread_continue) 0xc12015d4
3 (f5f7ba40) .W.ON.(swapin_thread_continue) 0xc11f8e2c
4 (f5f7b8f8) .W.ON.(sched_thread_continue) 0
5 (f5f7b7b0) .W.ON.(io_done_thread_continue) 0xc1201f74
6 (f5f7b668) .W.ON.(net_thread_continue) 0xc11db0a8
1 ext2fs (f5f7ce40): 6 threads:
0 (f5f7b520) R....F
1 (f5f7b290) .W.O..(mach_msg_receive_continue) 0
2 (f5f7b148) .W.O..(mach_msg_receive_continue) 0
3 (f5f7b000) .W.O..(mach_msg_continue) 0
4 (f67d2e20) .W.O..(mach_msg_receive_continue) 0
5 (f67d2cd8) .W.O..(mach_msg_continue) 0
2 exec (f5f7cd80): (f5f7b3d8) ..SO..(thread_bootstrap_return)
db> x/tbx 0xcbc 0xf5f7b3d8
8
db> show map $map2
Map 0xf5f6ff30: name="exec", pmap=0xf5f71fa8,ref=1,nentries=5
size=290816,resident:229376,wired=0
version=14
map entry 0xf625ec08: start=0x0, end=0x1000
prot=1/7/copy, object=0xf5f6ad70, offset=0x0
Object 0xf5f6ad70: size=0x1000, 1 references
1 resident pages, 0 absent pages, 0 paging ops
memory object=0x0 (offset=0x0),control=0x0, name=0xf5f82780
uninitialized,temporary internal,copy_strategy=0
shadow=0x0 (offset=0x0),copy=0x0
map entry 0xf625ebb0: start=0x1000, end=0x26000
prot=5/7/copy, object=0xf5f6ad20, offset=0x0
Object 0xf5f6ad20: size=0x25000, 1 references
37 resident pages, 0 absent pages, 0 paging ops
memory object=0x0 (offset=0x0),control=0x0, name=0xf5f82730
uninitialized,temporary internal,copy_strategy=0
shadow=0x0 (offset=0x0),copy=0x0
map entry 0xf625eb58: start=0x26000, end=0x34000
prot=1/7/copy, object=0xf5f6acd0, offset=0x0
Object 0xf5f6acd0: size=0xe000, 1 references
14 resident pages, 0 absent pages, 0 paging ops
memory object=0x0 (offset=0x0),control=0x0, name=0xf5f826e0
uninitialized,temporary internal,copy_strategy=0
shadow=0x0 (offset=0x0),copy=0x0
map entry 0xf625eb00: start=0x34000, end=0x37000
prot=3/7/copy, object=0xf5f6ac80, offset=0x0
Object 0xf5f6ac80: size=0x3000, 1 references
3 resident pages, 0 absent pages, 0 paging ops
memory object=0x0 (offset=0x0),control=0x0, name=0xf5f82690
uninitialized,temporary internal,copy_strategy=0
shadow=0x0 (offset=0x0),copy=0x0
map entry 0xf625eaa8: start=0xbfff0000, end=0xc0000000
prot=3/7/copy, object=0xf5f6ac30, offset=0x0
Object 0xf5f6ac30: size=0x10000, 1 references
1 resident pages, 0 absent pages, 0 paging ops
memory object=0x0 (offset=0x0),control=0x0, name=0xf5f82640
uninitialized,temporary internal,copy_strategy=0
shadow=0x0 (offset=0x0),copy=0x0
--8<---------------cut here---------------end--------------->8---
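Notice that in the “good” case, reading 0xcbc yields a valid relocation
type: 8 = R_386_RELATIVE. In the “bad” case, the first map entry (0x0 to
0x1000, the page that contains 0xcbc) is empty: it has no associated
memory object and hence no resident pages, which is why the debugger
reports that no memory is assigned to that address.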
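As a sanity check on the two dumps, here is a small C sketch of the arithmetic. It assumes that the byte read at 0xcbc is the low byte of an Elf32_Rel ‘r_info’ field (which is where the relocation type lives) and that pages are 4 KiB; the function names are made up for illustration and are not part of gnumach.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u        /* i386 page size (assumed) */
#define R_386_RELATIVE 8u      /* relocation type from the i386 ELF psABI */

/* The relocation type is the low byte of r_info, as ELF32_R_TYPE computes.
   Hypothetical helper, not a gnumach function.  */
unsigned reloc_type(uint32_t r_info)
{
  return r_info & 0xffu;
}

/* Sum per-entry resident page counts and convert the total to bytes.  */
unsigned long resident_bytes(const unsigned *pages, size_t n)
{
  unsigned long total = 0;
  for (size_t i = 0; i < n; i++)
    total += pages[i];
  return total * PAGE_SIZE;
}

/* Resident page counts of the five map entries in the "good" dump.  */
const unsigned good_pages[] = { 1, 37, 14, 3, 1 };

/* Check the figures printed by 'show map' in both dumps.  */
void check_dumps(void)
{
  unsigned long good = resident_bytes(good_pages, 5);
  unsigned long bad = 225280ul;   /* "resident:" value in the "bad" dump */

  assert(reloc_type(8) == R_386_RELATIVE);
  assert(good == 229376ul);              /* matches the "good" dump */
  assert((good - bad) / PAGE_SIZE == 1); /* exactly one page is missing */
}
```

The “bad” map is thus short by exactly one resident page, which is consistent with the empty first entry being the only difference.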
My reading of ‘read_exec’ is that the page is supposed to be populated
eagerly by the ‘copyout’ call here:
--8<---------------cut here---------------start------------->8---
static int
read_exec(void *handle, vm_offset_t file_ofs, vm_size_t file_size,
          vm_offset_t mem_addr, vm_size_t mem_size,
          exec_sectype_t sec_type)
{
  struct multiboot_module *mod = handle;
  [...]
  err = vm_allocate(user_map, &start_page, end_page - start_page, FALSE);
  assert(err == 0);
  assert(start_page == trunc_page(mem_addr));
  if (file_size > 0)
    {
      err = copyout((char *)phystokv (mod->mod_start) + file_ofs,
                    (void *)mem_addr, file_size);
      assert(err == 0);
    }
  [...]
  return 0;
}
--8<---------------cut here---------------end--------------->8---
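To make the page ranges concrete, here is a rough sketch of the rounding that precedes the ‘vm_allocate’ call. The excerpt only shows that ‘start_page’ equals ‘trunc_page(mem_addr)’; computing ‘end_page’ by rounding ‘mem_addr + mem_size’ up to a page boundary is an assumption, and the helper names below are hypothetical stand-ins rather than the Mach macros.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096u   /* i386 page size (assumed) */

/* Round an address down to the start of its page.  */
uint32_t page_trunc(uint32_t addr)
{
  return addr & ~(PAGE_SIZE - 1u);
}

/* Round an address up to the next page boundary.  */
uint32_t page_round(uint32_t addr)
{
  return (addr + PAGE_SIZE - 1u) & ~(PAGE_SIZE - 1u);
}

/* Number of pages a segment [mem_addr, mem_addr + mem_size) touches.  */
unsigned pages_spanned(uint32_t mem_addr, uint32_t mem_size)
{
  uint32_t start_page = page_trunc(mem_addr);
  uint32_t end_page = page_round(mem_addr + mem_size); /* assumption */
  return (end_page - start_page) / PAGE_SIZE;
}

/* Check against the map entries shown by 'show map'.  */
void check_spans(void)
{
  assert(page_trunc(0xcbcu) == 0x0u);             /* 0xcbc lives in page 0 */
  assert(pages_spanned(0x0u, 0x1000u) == 1);      /* first map entry */
  assert(pages_spanned(0x1000u, 0x25000u) == 37); /* second map entry */
}
```

Under these assumptions, the address 0xcbc falls in the single page covered by the first map entry, i.e. the one that ‘copyout’ should have filled.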
There are interesting tricks in ‘copyout_retry’ to fake a page fault so
the copy can actually be made, IIUC.
Could it be that this bit isn’t quite working?
Ideas?
The problem with debugging this is that setting a breakpoint on
‘exec_load’ makes the system boot fine (breaking on ‘task_resume’ still
reproduces the failure, though; go figure…).
Ludo’.