Re: Dynamic multidimensional arrays

Hi again

On Wed, 5 Apr 2023 at 15:25, Fischlin Andreas wrote:

Yes, you can implement this in PIM and ISO and need no extra feature. Michael, as I tried to express earlier, I fear you are hung up on doing the client side of things and the implementation in the same go, i.e. all in one program module (e.g. your M2 and Fortran pseudo code samples). Instead I advocate splitting the task into the servicing part, which goes into a library module, and the client part, the program module where you use that functionality. Yes, this requires a bit more coding and some design effort, but the overall result is more robust, offers reusability and is safe to use. That is the way to do it in M2.

I wholeheartedly agree. And I might add that the separation of definition and implementation into separate compilation units is an important part of this design and development approach. There is a methodology of thinking developed by psychologist Edward de Bono called the "Six Thinking Hats".

The premise is that when we cooperate in teams, our efforts are often misaligned: we work against each other instead of with each other. The reason is that everybody looks at the subject from one specific perspective, and their efforts are aimed only at covering that perspective. Thus in a meeting, one person will say something about feasibility, another about costs, and yet another about risks. The outcome is then often that everybody keeps doing what they are doing without incorporating the insights of the other viewpoints. With the six hats method, the moderator of the meeting sets the context, where each context is represented by a coloured paper hat. When the context is set, everyone puts on the paper hat of the colour representing that context, and nobody is permitted to say anything that is not within that context. As the meeting progresses, the moderator goes through all the contexts one after another. As a result, everybody's thinking is aligned and productivity increases.

https://en.wikipedia.org/wiki/Six_Thinking_Hats

The separation of definition and implementation modules into separate editing and compilation units is quite similar. When you write the definition module, you have the client hat on. Your focus is on "How am I going to use this library so it will be practical and convenient for me as a user?". By contrast, when you are writing the implementation module, you have the implementer hat on. Your focus has now shifted to "How am I going to implement the specified functionality with the given interface in the most efficient way?". At times, these two goals are at odds with each other, and then it takes some going back and forth until a good compromise is found. But every time you switch between the two units, you switch hats and you switch focus. Try to keep this aspect in mind when you write your code and you will probably find that it makes things easier.

Benjamin, I prefer our approach (cf. LgMatrices) over what you sketched, as it is less limited: MaxElems is maximally large, LMatrix = POINTER TO LMat, and one can allocate the maximum storage that is addressable via CARDINAL* for any given compiler.

I am not saying that this is the only way to do it. In fact, I had earlier given a different example where the dimensions were static. There are advantages and disadvantages to both approaches. The reason I wrote this particular example code is that Michael asked for something matching the approach he had given in his own example.

Generally, I would say from experience that for mathematical applications, the approach using vectors and matrices with static sizes but dynamically bounded (by introducing a counter) is probably preferable, if only because when doing mathematics, the sizes of vectors and matrices needed for any given problem are usually within a certain ballpark.
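To make the "static size, dynamically bounded" idea concrete, here is a minimal sketch in C rather than M2. The names MAX_ELEMS, Vector, vec_init, vec_append and vec_get are made up for illustration and are not taken from any library mentioned in this thread.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ELEMS 64  /* hypothetical fixed capacity, chosen for the problem domain */

typedef struct {
    size_t count;            /* number of elements currently in use */
    double elem[MAX_ELEMS];  /* storage whose size is fixed at compile time */
} Vector;

/* Start with an empty vector; the capacity never changes. */
void vec_init(Vector *v) {
    v->count = 0;
}

/* Append a value; returns 0 on success, -1 if the capacity is exhausted. */
int vec_append(Vector *v, double x) {
    if (v->count >= MAX_ELEMS) {
        return -1;
    }
    v->elem[v->count++] = x;
    return 0;
}

/* Access is checked against the counter, not against the capacity. */
double vec_get(const Vector *v, size_t i) {
    assert(i < v->count);
    return v->elem[i];
}
```

No heap allocation is involved at all. The price is that every vector occupies MAX_ELEMS slots no matter how few are in use.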

By contrast, the approach using dynamically sized objects is often more practical when working with collections such as associative arrays, trees and lists. Those tend to have applications where the sizes are all over the place from the very small to the very large.

A programming language should provide means to implement both approaches without too much trouble.

And more importantly, it avoids the messy address arithmetic, which I really advise against, as the notation when accessing elements or vectors of a matrix is quite familiar: e.g. the i-th element of vec is denoted by vec^[i], and the (i,j)-th element of mat by mat^[i]^[j]. This notation helps when it comes to implementing more complex calculations, e.g. when computing eigenvalues etc. I would not want to implement that with the address arithmetic your approach would require.
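For readers who want to see the two notations side by side, here is a rough C analogue. It is not LgMatrices itself, and the names rows_new, rows_free and flat_index are hypothetical. The row-pointer layout gives the familiar m[i][j] subscripts, while a flat buffer needs the index computed by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Row-pointer matrix: one allocation for the row table plus one per row. */
double **rows_new(size_t rows, size_t cols) {
    double **m = calloc(rows, sizeof *m);
    if (m == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < rows; i++) {
        m[i] = calloc(cols, sizeof **m);
        if (m[i] == NULL) {
            /* undo the rows allocated so far */
            while (i > 0) {
                free(m[--i]);
            }
            free(m);
            return NULL;
        }
    }
    return m;
}

/* Release every row, then the row table. */
void rows_free(double **m, size_t rows) {
    for (size_t i = 0; i < rows; i++) {
        free(m[i]);
    }
    free(m);
}

/* The same element in a flat row-major buffer needs explicit arithmetic. */
size_t flat_index(size_t cols, size_t i, size_t j) {
    return i * cols + j;
}
```

With the first layout an element is written as m[i][j]. With the second it is buf[flat_index(cols, i, j)], which is the kind of arithmetic being advised against above.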

That's why I don't like working with PIM and ISO. It's a serious shortcoming, and in this particular aspect C wins hands down over M2. And that is the reason why we added the means to declare record types with indeterminate array fields. This has the advantage that you do only one single allocation instead of at least two, which matters because memory allocation on the heap is an expensive operation and typically a bottleneck in concurrency situations. Further, it keeps the metadata in close proximity to the payload data, which is important for performance on modern CPUs, as they rely heavily on caching: if you allocate your metadata separately from your payload data, accesses that go from one to the other tend to produce cache misses. And yet it preserves the array subscript notation while also allowing automatic boundary checks to be inserted by the compiler.

The facility actually permits both approaches. So there does not need to be any separate language facility for each approach.
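As a rough illustration of the single-allocation layout, here is how the closest C feature, the C99 flexible array member, looks. This is a C analogue only, not the M2 facility itself, and the names Matrix, matrix_new and matrix_at are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    size_t rows, cols;  /* metadata, stored directly in front of the payload */
    double elem[];      /* flexible array member: length fixed at allocation */
} Matrix;

/* One heap allocation covers metadata and payload. Returns NULL on failure. */
Matrix *matrix_new(size_t rows, size_t cols) {
    /* refuse sizes whose byte count would overflow size_t */
    if (cols != 0 &&
        rows > (SIZE_MAX - sizeof(Matrix)) / sizeof(double) / cols) {
        return NULL;
    }
    Matrix *m = calloc(1, sizeof(Matrix) + rows * cols * sizeof(double));
    if (m != NULL) {
        m->rows = rows;
        m->cols = cols;
    }
    return m;
}

/* Checked element access; in C the bounds check has to be written by hand. */
double *matrix_at(Matrix *m, size_t i, size_t j) {
    assert(i < m->rows && j < m->cols);
    return &m->elem[i * m->cols + j];
}
```

A single free(m) releases everything. Note that in C the two-dimensional index arithmetic and the bounds check still have to be hidden in a helper such as matrix_at, whereas the point made above is that the compiler can take care of both.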

hope this clarifies

regards

benjamin

From:	Benjamin Kowarsch
Subject:	Re: Dynamic multidimensional arrays
Date:	Wed, 5 Apr 2023 16:03:31 +0900