Re: "AIscm" array JIT
From: Nala Ginrut
Subject: Re: "AIscm" array JIT
Date: Sat, 11 Jun 2016 01:38:46 +0800
I installed it from the Debian repo and tried all the example code. Very
cool and impressive! Thanks for the work!
In recent months I tried matrix operations in Guile and Chez Scheme, but
they were too slow. I want to do some machine learning to control a robot
car with a camera, and I found that AIscm has the basics I need.
Please continue the work, it's useful. ;-)
On Wed, 2016-06-08 at 22:16 +0100, Jan Wedekind wrote:
> Hi,
> I am working on a compact library [1] for JIT compilation of array
> operations. It only runs on AMD64 processors. Currently it supports array
> operations using booleans, integers, and integer RGB and integer complex
> numbers.
> There are still important things missing: floating point numbers,
> compiling calls to C methods (e.g. sin, cos, ...), tensor operations,
> convolutions, ... I would like to eventually do numerical processing
> similar to Python's NumPy (but more generic), Theano (but more compact
> syntax as facilitated by macros), and OpenCV.
> Here is an example adding an integer to each element of a 2D array and
> returning the result:
>
> scheme@(guile-user)> (use-modules (oop goops) (aiscm jit) (aiscm int) (aiscm pointer) (aiscm sequence))
> scheme@(guile-user)> (+ (arr (2 3 5) (7 11 13)) 3)
> $1 = #<sequence<sequence<int<8,unsigned>>>>:
> ((5 6 8)
> (10 14 16))
>
> The fallback method for the GOOPS generic "+" adds a JIT compiled plus
> operation for the specific array types to the generic and then calls "+"
> again.
> The corresponding machine code to produce the unsigned byte array is
> shown below:
>
> 0: 4c 89 64 24 f0 mov QWORD PTR [rsp-0x10],r12
> 5: 48 89 6c 24 e8 mov QWORD PTR [rsp-0x18],rbp
> a: 4c 89 7c 24 e0 mov QWORD PTR [rsp-0x20],r15
> f: 4c 89 74 24 d8 mov QWORD PTR [rsp-0x28],r14
> 14: 4c 89 6c 24 d0 mov QWORD PTR [rsp-0x30],r13
> 19: 48 89 5c 24 c8 mov QWORD PTR [rsp-0x38],rbx
> 1e: 48 89 7c 24 f8 mov QWORD PTR [rsp-0x8],rdi
> 23: 4c 8b 64 24 08 mov r12,QWORD PTR [rsp+0x8]
> 28: 48 8b 7c 24 18 mov rdi,QWORD PTR [rsp+0x18]
> 2d: 48 8b 6c 24 20 mov rbp,QWORD PTR [rsp+0x20]
> 32: 8a 44 24 28 mov al,BYTE PTR [rsp+0x28]
> 36: 48 6b de 01 imul rbx,rsi,0x1
> 3a: 49 8b f0 mov rsi,r8
> 3d: 4d 6b cc 01 imul r9,r12,0x1
> 41: 4c 8b fd mov r15,rbp
> 44: 49 be 00 00 00 00 00 movabs r14,0x0
> 4b: 00 00 00
> 4e: 4c 8b 44 24 f8 mov r8,QWORD PTR [rsp-0x8]
> 53: 4d 3b f0 cmp r14,r8
> 56: 74 3e je 0x96
> 58: 49 ff c6 inc r14
> 5b: 4c 6b d9 01 imul r11,rcx,0x1
> 5f: 4c 8b ee mov r13,rsi
> 62: 4c 6b d7 01 imul r10,rdi,0x1
> 66: 4d 8b e7 mov r12,r15
> 69: 48 bd 00 00 00 00 00 movabs rbp,0x0
> 70: 00 00 00
> 73: 48 3b ea cmp rbp,rdx
> 76: 74 16 je 0x8e
> 78: 48 ff c5 inc rbp
> 7b: 45 8a 04 24 mov r8b,BYTE PTR [r12]
> 7f: 44 02 c0 add r8b,al
> 82: 45 88 45 00 mov BYTE PTR [r13+0x0],r8b
> 86: 4d 03 eb add r13,r11
> 89: 4d 03 e2 add r12,r10
> 8c: eb e5 jmp 0x73
> 8e: 48 03 f3 add rsi,rbx
> 91: 4d 03 f9 add r15,r9
> 94: eb b8 jmp 0x4e
> 96: 4c 8b 64 24 f0 mov r12,QWORD PTR [rsp-0x10]
> 9b: 48 8b 6c 24 e8 mov rbp,QWORD PTR [rsp-0x18]
> a0: 4c 8b 7c 24 e0 mov r15,QWORD PTR [rsp-0x20]
> a5: 4c 8b 74 24 d8 mov r14,QWORD PTR [rsp-0x28]
> aa: 4c 8b 6c 24 d0 mov r13,QWORD PTR [rsp-0x30]
> af: 48 8b 5c 24 c8 mov rbx,QWORD PTR [rsp-0x38]
> b4: c3 ret
>
>
>
> Any comments, suggestions, and feedback are welcome!
>
> Regards
> Jan
>
> [1] https://github.com/wedesoft/aiscm
>
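
For readers who would rather not trace the disassembly by hand, the two
nested loops in the quoted listing appear to correspond roughly to the C
sketch below. This is only a reading aid under stated assumptions: the
function name, the parameter names and their order are made up here and
are not AIscm's actual calling convention. What it mirrors is the shape of
the machine code: an outer counter over rows, an inner counter over
columns, one byte load, a wrapping 8-bit add of the scalar, one byte store,
and a pointer bump by the stride of each array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C rendering of the JIT-compiled kernel (names are assumptions).
 * Strides are counted in elements; the element size is 1 byte, which matches
 * the "imul ..., 0x1" instructions in the listing. */
void add_scalar_u8(uint8_t *dst, ptrdiff_t dst_row_stride, ptrdiff_t dst_col_stride,
                   const uint8_t *src, ptrdiff_t src_row_stride, ptrdiff_t src_col_stride,
                   size_t rows, size_t cols, uint8_t scalar)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < rows; i++) {        /* outer loop (0x4e..0x94) */
        uint8_t *d = dst;
        const uint8_t *s = src;
        for (size_t j = 0; j < cols; j++) {    /* inner loop (0x73..0x8c) */
            *d = (uint8_t)(*s + scalar);       /* byte add, wraps modulo 256 */
            d += dst_col_stride;
            s += src_col_stride;
        }
        dst += dst_row_stride;                 /* advance both arrays one row */
        src += src_row_stride;
    }
}
```

Calling it on the 2x3 array from the example with row stride 3, column
stride 1 and scalar 3 gives the same values as the REPL output in the
quoted message. Passing strides explicitly, rather than assuming a
contiguous layout, is what allows one kernel to serve differently laid out
arrays.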