gomp-discuss

Re: [Gomp-discuss] Plan ... coments wanted !


From: Steven Bosscher
Subject: Re: [Gomp-discuss] Plan ... coments wanted !
Date: 30 Jan 2003 01:00:05 +0100

On Thu 30-01-2003, at 00:23, Diego Novillo wrote:
> On Thu, 30 Jan 2003, Steven Bosscher wrote:
> 
> > On Wed 29-01-2003, at 15:50, Diego Novillo wrote:
> > > You really want to work on GIMPLE.  That's the language over
> > > which GCC will do tree optimizations.
> > ---- 8< ----
> > > As far as the optimizations go, almost everything will be
> > > analyzed and optimized at the GIMPLE level.
> > 
> > So where should the OpenMP stuff be translated to libcalls and who knows
> > what else?  During GENERIC->GIMPLE, or during GIMPLE->RTL?
> > 
> GIMPLE->RTL.  GENERIC->GIMPLE will probably be just a
> simplification of the expressions much like we simplify the
> original parse trees.

I'm still not convinced that's possible. You're the CS PhD here, so
I must be misunderstanding something... I hope you can explain.

For OpenMP we need to keep track of where variables are, because most
directives can explicitly specify what should happen with a given
variable. With all the cool SSA optimizations, loop normalization, etc,
how can we make sure we still have all the information we need when we
parallelize the code?

For example, consider a small modification to the snippet from the Intel
article (s/k/k+1/).  Not a very bright piece of code, but for the sake
of argument:

#define N 10000

extern int workunit(int);

void
ploop(void)
{
  int k, x[N], y[N], z[N];
  #pragma omp parallel for private(k) shared(x,y,z)
  for (k = 1; k <= N; k++) {
    x[k-1] = x[k-1] * y[k-1] + workunit(z[k-1]);
  }
}

Now, if we first do GIMPLE+Optimizations, the for-loop will be
normalized.  Then all the [k-1]s would be replaced with the index
variable for the normalized loop with PRE/CCP.  Et voila, k is dead and
is probably eliminated by DCE(?).
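For illustration, here is a rough sketch of what the loop might look
like after those passes.  The function and variable names are mine,
not anything GCC would emit, and workunit() is stubbed out because the
example above leaves it undefined:

```c
#define N 10000

/* Hypothetical stand-in for the real workunit(), which the
   example does not define. */
static int workunit(int z) { return z; }

/* Sketch of the loop after normalization + PRE/CCP + DCE: every
   use of k-1 has been folded into the new index i running 0..N-1,
   so k is dead and gets eliminated. */
static void ploop_normalized(int x[N], int y[N], int z[N])
{
  int i;
  for (i = 0; i < N; i++)
    x[i] = x[i] * y[i] + workunit(z[i]);
}
```

Note there is no variable left that corresponds to the k named in the
private(k) clause.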

We would have to update the OpenMP information to something like
"private(normalized_loop_index)".  IMO optimizers shouldn't have to do
that.

The OpenMP semantics for the original loop are: "Each thread will have
its own copy of k". What we would end up doing is giving each thread its
own copy of the normalized loop index.  Would that still be correct?
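To make that concrete, here is a sketch (again, my own naming, not
what GCC would actually generate, and with workunit() stubbed) of an
outlined thread body for the normalized loop.  The normalized index
is a local of the thread function, so each thread automatically gets
its own copy; the question is whether that is still what private(k)
was supposed to mean:

```c
#define N 10000

/* Hypothetical stand-in for the undefined workunit(). */
static int workunit(int z) { return z; }

struct ploop_data { int *x, *y, *z; };

/* Sketch of an outlined body for the parallelized, normalized loop.
   Each thread would be handed its own [lo, hi) slice of iterations;
   the index i is a local variable, hence private per thread. */
static void ploop_thread(struct ploop_data *d, int lo, int hi)
{
  int i;  /* per-thread copy of the normalized index, not of k */
  for (i = lo; i < hi; i++)
    d->x[i] = d->x[i] * d->y[i] + workunit(d->z[i]);
}
```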

Maybe that's why Intel handles OpenMP directives *before* high-level
optimizations?

Greetz
Steven





