bug-apl
Re: [Bug-apl] segfault when using 'CORE_COUNT_WANTED' configure flag


From: Xiao-Yong Jin
Subject: Re: [Bug-apl] segfault when using 'CORE_COUNT_WANTED' configure flag
Date: Thu, 17 Oct 2019 12:43:34 -0500

CPU cache behavior is the main argument for JIT compilation (or conversion) of 
APL expressions: it allows local operations to be fused together, so that data 
is processed while it is still in cache.
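
A minimal sketch of what fusing operations means, using plain Python lists as a
stand-in for APL vectors. The function names `unfused` and `fused` are
hypothetical and nothing here is taken from GNU APL or any other APL
implementation; it only contrasts evaluating A+B×C one primitive at a time
(with a full-length temporary) against a single fused pass.

```python
# Illustrative only: Python lists stand in for APL vectors.

def unfused(a, b, c):
    """Evaluate A+B×C primitive by primitive, as a simple interpreter might."""
    tmp = [y * z for y, z in zip(b, c)]     # first pass writes a full-length temporary
    return [x + t for x, t in zip(a, tmp)]  # second pass reads the temporary back

def fused(a, b, c):
    """Evaluate the same expression in one pass, with no intermediate vector."""
    return [x + y * z for x, y, z in zip(a, b, c)]

a = [1.0, 2.0, 3.0]
b = [4.0, 5.0, 6.0]
c = [7.0, 8.0, 9.0]
assert unfused(a, b, c) == fused(a, b, c) == [29.0, 42.0, 57.0]
```

For vectors larger than the cache, the unfused version has to write the
temporary out to memory and read it back, which is the extra traffic that
fusion is meant to avoid.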

If you are into the CPU cache aspects of HPC, you may find the BLIS papers 
interesting.  See the citations at https://github.com/flame/blis

> On Oct 17, 2019, at 10:03 AM, Rowan Cannaday <address@hidden> wrote:
> 
> Peter:
> 
> I am new to APL (~9 months). Most of my day-to-day work is sql/shell, however 
> I use APL for a couple things: 1.) as an ad-hoc calculator & 2.) a symbolic 
> notation that greatly simplifies complex mathematical calculations in how I 
> think about, remember, and approach them.
> 
> As for things I've built in APL, it's been mostly toy projects. I built a 
> simple artificial neural net, but ran into difficulties when attempting 
> to generalize backpropagation for arbitrary array sizes. I would like to 
> complete this at some point.
> 
> Parallelizing APL was more of a curiosity than an immediate need. I do not 
> plan to invest much energy into pursuing it at this time (especially in light 
> of Jürgen's explanation). To be honest, I had watched a talk about CPU caching 
> in C++ and was interested in how that related to APL's handling of arrays.
> 
> On Wed, Oct 16, 2019 at 9:53 PM Peter Teeson <address@hidden> wrote:
> Hi Rowan:
> 
> What classes of problems are you trying to solve that would benefit from 
> parallel processing?
> 
> Respect
> 
> Peter Teeson
>> On Oct 16, 2019, at 1:27 PM, Dr. Jürgen Sauermann <address@hidden> wrote:
>> 
>> Hi Rowan,
>> 
>> actually there is no syntax tree in GNU APL. The way in which APL binds names
>> (late and ambiguously) makes it fairly useless to parse it beforehand.
>> What happens in GNU APL is prefix matching at runtime. The prefix table is in
>> src/Prefix.def (an automatically generated hash table that does lookups
>> essentially in time O(1) per prefix).
>> 
>> This lookup table replaces the AST that you would have in a compiled
>> language.
>> 
>> Best regards,
>> Jürgen Sauermann
>> 
>> 
>> On 10/16/19 6:55 PM, Rowan Cannaday wrote:
>>> Thanks again,
>>> 
>>> AST = abstract syntax tree. The tree-like structure that is produced by the 
>>> parser.
>>> 
>>> Avoiding compilation is a reasonable restriction.
>>> 
>>> Thanks for the context.
>>> 
>>> - Rowan
>>> 
>>> ```
>>> #gdb apl
>>> GNU gdb (Debian 8.3.1-1) 8.3.1
>>> Copyright (C) 2019 Free Software Foundation, Inc.
>>> License GPLv3+: GNU GPL version 3 or later 
>>> <http://gnu.org/licenses/gpl.html>
>>> This is free software: you are free to change and redistribute it.
>>> There is NO WARRANTY, to the extent permitted by law.
>>> Type "show copying" and "show warranty" for details.
>>> This GDB was configured as "x86_64-linux-gnu".
>>> Type "show configuration" for configuration details.
>>> For bug reporting instructions, please see:
>>> <http://www.gnu.org/software/gdb/bugs/>.
>>> Find the GDB manual and other documentation resources online at:
>>>     <http://www.gnu.org/software/gdb/documentation/>.
>>> 
>>> For help, type "help".
>>> Type "apropos word" to search for commands related to "word"...
>>> Reading symbols from apl...
>>> (gdb) run
>>> Starting program: /usr/local/bin/apl
>>> [Thread debugging using libthread_db enabled]
>>> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
>>> [Detaching after vfork from child process 23377]
>>> [New Thread 0x7ffff68d7700 (LWP 23381)]
>>> [New Thread 0x7ffff60d6700 (LWP 23382)]
>>> [New Thread 0x7ffff58d5700 (LWP 23383)]
>>> 
>>>                     ______ _   __ __  __    ___     ____   __
>>>                    / ____// | / // / / /   /   |   / __ \ / /
>>>                   / / __ /  |/ // / / /   / /| |  / /_/ // /
>>>                  / /_/ // /|  // /_/ /   / ___ | / ____// /___
>>>                  \____//_/ |_/ \____/   /_/  |_|/_/    /_____/
>>> 
>>>                     Welcome to GNU APL version 1.8 / 1191M
>>> 
>>>                 Copyright (C) 2008-2019  Dr. Jürgen Sauermann
>>>                        Banner by FIGlet: www.figlet.org
>>> 
>>>                 This program comes with ABSOLUTELY NO WARRANTY;
>>>                   for details run: /usr/local/bin/apl --gpl.
>>> 
>>>      This program is free software, and you are welcome to redistribute it
>>>          according to the GNU Public License (GPL) version 3 or later.
>>> 
>>>       )OFF
>>> 
>>> Goodbye.
>>> Session duration: 5.06809 seconds
>>> Couldn't read debug register: No such process.
>>> (gdb) Cannot find user-level thread for LWP 23382: generic error
>>> (gdb) [Thread 0x7ffff58d5700 (LWP 23383) exited]
>>> [Thread 0x7ffff60d6700 (LWP 23382) exited]
>>> [Thread 0x7ffff68d7700 (LWP 23381) exited]
>>> [Inferior 1 (process 23368) exited normally]
>>> 
>>> (gdb) bt
>>> No stack.
>>> 
>>> ```
>>> 
>>> On Wed, Oct 16, 2019 at 4:18 PM Dr. Jürgen Sauermann 
>>> <mail@jürgen-sauermann.de> wrote:
>>> Hi Rowan,
>>> 
>>> a stack trace for the segfault would be good (run 'gdb apl', then 'run',
>>> and finally 'bt' after the segfault).
>>> 
>>> No idea what AST is.
>>> You could try TAB-expansion to get options in various situations and try 
>>> e.g.
>>> 
>>> ]help ⌹
>>> 
>>> to get help for APL primitives. Currently system functions and variables 
>>> are not in )help,
>>> but I suppose extending file src/Help.def could easily add them.
>>> 
>>> 
>>> Compiling APL is IMHO a wrong path. Too many problems, too little gain.
>>> 
>>> Best Regards,
>>> Jürgen Sauermann
>>> 
>>> 
>>> On 10/16/19 5:01 PM, Rowan Cannaday wrote:
>>>> Thank you for the explanation Jürgen.
>>>> 
>>>> That makes intuitive sense. A shared-memory single threaded service is a 
>>>> reasonable abstraction.
>>>> 
>>>> Another approach is to compile a subset of APL to an intermediate 
>>>> representation.
>>>> 
>>>> Is there a way to export the AST?
>>>> In addition, is there an in-REPL method of viewing help and/or arguments 
>>>> for system variables & functions?
>>>> 
>>>> By the way, a minor regression: segfaulting, but only after exiting.
>>>> ```
>>>>       )OFF
>>>> ====================================================
>>>> SEGMENTATION FAULT
>>>> thread: 0x7f8747766700
>>>> thread_cSegmentation fault
>>>> ```
>>>> 
>>>> Thanks again,
>>>> - Rowan
>>>> 
>>>> On Wed, Oct 16, 2019 at 12:06 PM Dr. Jürgen Sauermann 
>>>> <mail@jürgen-sauermann.de> wrote:
>>>> Hi Blake,
>>>> 
>>>> it is sort of working, but I could well use some help in troubleshooting
>>>> the remaining problems. I can help fix them, but finding their root
>>>> cause (and making them reproducible) is a different story.
>>>> 
>>>> My current interpretation of various benchmarks that Elias Mårtenson and
>>>> I did some years ago is that the bandwidth of the memory interface
>>>> between the CPUs (or cores) and the memory is the limiting factor, and no
>>>> matter how efficient the APL interpreter is, this bottleneck will dictate
>>>> the speedup that can be achieved.
>>>> 
>>>> As an example, from 1985 to 1990, four students and I built the
>>>> hardware of a parallel APL machine with 32 CPUs and measured a speedup
>>>> of close to 32 for sufficiently large vectors.
>>>> 
>>>> In contrast, if I remember correctly, Elias achieved a speedup of 12
>>>> with 80 CPUs using the parallel feature of GNU APL. The only difference that
>>>> I can see between our 1990 machine (called Datis-P-256 because the
>>>> architecture could be scaled up to 256 processors) and current multicore
>>>> boxes is the memory architecture:
>>>> 
>>>> Datis-P had one separate memory for each CPU, while current multicore
>>>> boxes share their memory module(s) among different cores. That simply
>>>> boils down to the fact that the memory bandwidth of Datis-P scaled with the
>>>> number of processors, while the memory bandwidth of a typical multi-core
>>>> box does not scale with its number of cores. As long as this is the case,
>>>> parallel APL remains severely limited in terms of the speedup that can be
>>>> achieved.
>>>> 
>>>> Best Regards,
>>>> Jürgen Sauermann
>>>> 
>>>> 
>>>> 
>>>> On 10/16/19 12:58 PM, Blake McBride wrote:
>>>>> Greetings,
>>>>> 
>>>>> I think getting the parallel processing working is important.  It may be 
>>>>> that for various reasons the speedup in general cases is minimal and not 
>>>>> worth the effort.  However, I'd imagine that there are particular 
>>>>> use-cases utilizing large arrays where the speedup would be substantial.  
>>>>> That is when those types of enhancements would make APL a real benefit.
>>>>> 
>>>>> Thanks.
>>>>> 
>>>>> Blake
>>>>> 
>>>>> 
>>>>> On Wed, Oct 16, 2019 at 5:27 AM Dr. Jürgen Sauermann 
>>>>> <mail@jürgen-sauermann.de> wrote:
>>>>> Hi Rowan,
>>>>> 
>>>>> fixed in SVN 1191.
>>>>> 
>>>>> You should not be too enthusiastic, though, because the speed-ups that
>>>>> can be achieved are somewhat disappointing. And due to that, I
>>>>> haven't put too much effort into fixing faults (sometimes apl hangs
>>>>> on a semaphore when parallel execution is enabled).
>>>>> 
>>>>> Best Regards,
>>>>> Jürgen Sauermann
>>>>> 
>>>>> 
>>>>> On 10/16/19 5:15 AM, Rowan Cannaday wrote:
>>>>>> Hello,
>>>>>> 
>>>>>> intrigued by the ability to parallelize APL, thought I'd try to test it:
>>>>>> 
>>>>>> `apl --cfg` followed by a line of '=' signs followed by `apl -q`:
>>>>>> 
>>>>>> 
>>>>>> configurable options:
>>>>>> ---------------------
>>>>>>     ASSERT_LEVEL_WANTED=2
>>>>>>     SECURITY_LEVEL_WANTED=0 (default)
>>>>>>     APSERVER_PATH=/tmp/GNU-APL/APserver (default)
>>>>>>     APSERVER_PORT=16366 (default)
>>>>>>     APSERVER_TRANSPORT=0 (default)
>>>>>>     CORE_COUNT_WANTED=2
>>>>>>     DYNAMIC_LOG_WANTED=yes
>>>>>>     MAX_RANK_WANTED=8 (default)
>>>>>>     RATIONAL_NUMBERS_WANTED=yes
>>>>>>     SHORT_VALUE_LENGTH_WANTED=12, therefore:
>>>>>>         sizeof(Value)       : 456 bytes
>>>>>>         sizeof(Cell)        :  24 bytes
>>>>>>         sizeof(Value header): 168 bytes
>>>>>> 
>>>>>>     VALUE_CHECK_WANTED=yes
>>>>>>     VALUE_HISTORY_WANTED=yes
>>>>>>     VF_TRACING_WANTED=no (default)
>>>>>>     VISIBLE_MARKERS_WANTED=yes
>>>>>> 
>>>>>> how ./configure was (probably) called:
>>>>>> --------------------------------------
>>>>>>     ./configure  'CORE_COUNT_WANTED=2' 'DEVELOP_WANTED=yes' 
>>>>>> 'VALUE_HISTORY_WANTED=yes' 'VISIBLE_MARKERS_WANTED=yes' 
>>>>>> '--enable-maintainer-mode'
>>>>>> 
>>>>>> BUILDTAG:
>>>>>> ---------
>>>>>>     Project:        GNU APL
>>>>>>     Version / SVN:  1.8 / 1190M
>>>>>>     Build Date:     2019-10-16 02:45:24 UTC
>>>>>>     Build OS:       Linux 5.2.0-3-amd64 x86_64
>>>>>>     config.status:  'CORE_COUNT_WANTED=2' 'DEVELOP_WANTED=yes' 
>>>>>> 'VALUE_HISTORY_WANTED=yes' 'VISIBLE_MARKERS_WANTED=yes' 
>>>>>> '--enable-maintainer-mode'
>>>>>>     Archive SVN:    1161
>>>>>> 
>>>>>> ================================================================================
>>>>>> 
>>>>>> $ apl -q
>>>>>> 
>>>>>> 
>>>>>> ====================================================
>>>>>> SEGMENTATION FAULT
>>>>>> thread: 0x7f6078042e00
>>>>>> thread_contexts_count: 2
>>>>>> busy_worker_count:     0
>>>>>> active_core_count:     1
>>>>>> thread # 0:               0 RUN  job:   0 no-name
>>>>>> thread #-1:               0 RUN  job:   0 no-name
>>>>>> 
>>>>>> 
>>>>>> ----------------------------------------
>>>>>> -- Stack trace at main.cc:88
>>>>>> ----------------------------------------
>>>>>> 0x7F6078FD1BBB __libc_start_main
>>>>>> 0x5631406C386D  main
>>>>>> 0x5631406CAD8D   init_apl(int, char const**)
>>>>>> 0x5631407E881B    Parallel::init(bool)
>>>>>> 0x563140832E2D     Thread_context::init_parallel(CoreCount, bool)
>>>>>> 0x7F60794E5B18      sem_init
>>>>>> 0x7F60794E8510
>>>>>> 0x5631406CA95A
>>>>>> ========================================
>>>>>> ====================================================
>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
>> 
> 



