Re: dataframe dereferencing
From: Jaroslav Hajek
Subject: Re: dataframe dereferencing
Date: Fri, 3 Sep 2010 08:55:29 +0200
On Thu, Sep 2, 2010 at 9:31 PM, Judd Storrs <address@hidden> wrote:
> On Thu, Sep 2, 2010 at 3:04 PM, Jaroslav Hajek <address@hidden> wrote:
>>
>> while you have every right to naively expect this, understand that for
>> cell(x(1:3, 1:2)) the inner expression must result in some kind of
>> intermediary object (e.g. a sub-dataframe) which is then converted to
>> cell, while x.cell(1:3, 1:2) may be optimized so as to extract the
>> proper portion of data to cell directly. Similarly for matrix.
>
> I see your point--you think it's a performance issue, but I think it is
> incorrect to assume that subsetting a dataframe is necessarily
> inefficient. Really, that's a question of implementation not semantics. I
> don't think that linguistic novelty is a good approach to optimization. Two
> competing semantic models is a bad thing.
Competing? Oh no, these would be just happily co-existing :) Besides,
for a dataframe df there are actually two cell conversions, df.cell
and df.as.cell, and you need to distinguish between them.
> If performance is a problem,
> optimize later.
Like every dogmatic statement, this is not always true. The
optimizations that remain possible always depend on the design to some
extent. Certain optimizations are simply impossible later if not borne
in mind from the very start.
> Personally, I think octave's internal function dispatch is
> always going to be faster than a cobbled-together m-file-based dispatch.
The dispatch is not the problem, the intermediate object is.
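To make the distinction concrete, here is a minimal sketch using a toy
stand-in for a dataframe (a plain struct with one column vector per
entry). The names subframe, frame2cell and direct_cell are hypothetical
and are not part of the dataframe package; they only illustrate the
two-step path versus the direct one.

```octave
1;  % mark this file as a script so the functions below can be defined inline

% Two-step path, step 1: build an intermediate sub-frame (names and columns).
function sub = subframe (df, r, c)
  sub.names = df.names(c);
  sub.cols = cellfun (@(col) col(r), df.cols(c), "UniformOutput", false);
endfunction

% Two-step path, step 2: convert a whole frame to a cell array.
function out = frame2cell (df)
  out = num2cell ([df.cols{:}]);
endfunction

% Direct path: wrap only the requested elements, no sub-frame is built.
function out = direct_cell (df, r, c)
  out = cell (numel (r), numel (c));
  for j = 1:numel (c)
    out(:, j) = num2cell (df.cols{c(j)}(r));
  endfor
endfunction

df.names = {"a", "b", "c"};
df.cols = {[1; 2; 3; 4], [5; 6; 7; 8], [9; 10; 11; 12]};

c1 = frame2cell (subframe (df, 1:3, 1:2));  % like cell (x(1:3, 1:2))
c2 = direct_cell (df, 1:3, 1:2);            % like x.cell(1:3, 1:2)
```

Both paths give the same 3x2 cell array; the first one simply constructs
and discards a sub-frame on the way.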
> A
> different optimization would be to make dataframe perform lazy
> sub-referencing--e.g. a subframe is a view of the original frame (which
> could also have memory advantages).
>>
Lazy indexing is cool, but there are a number of problems implied...
>> However, I see no reason why dataframe couldn't support conversion to
>> cell through cell (dataframe) as well.
>
> Well, I don't think we want to go the perl route if we can avoid it...
>>
Huh?
>> Before you overload {} or suggest doing it, make sure you understand
>> the associated cs-list & numel issues.
>
> You're going to have to point me somewhere on this one. I'm proposing it
> anyway because it's semantically correct.
>
Expressions like A{I} and A(I).B may generate a cs-list. This is
especially important in assignment, where the cs-list length needs to
be evaluated *prior* to the right hand side (and hence prior to the
subsasgn call).
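A small illustration with built-in cell and struct arrays (not a
dataframe) shows what this means in practice. The numel call at the end
is only my stand-in for the length query the interpreter has to make; as
far as I understand it, for a user class it is the class's own numel
that gets asked.

```octave
c = {10, 20, 30};
x = [c{1:2}];                 % c{1:2} expands to the cs-list 10, 20

s = struct ("B", {1, 2, 3});  % 1x3 struct array
[s(1:2).B] = deal (7, 8);     % the left side is a two-element cs-list,
                              % so deal is asked for two outputs

n = numel (c, 1:2);           % number of elements c{1:2} would produce
```

The number of outputs requested from deal is fixed by the left-hand
side before deal runs, which is why the length must be known first.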
regards
--
RNDr. Jaroslav Hajek, PhD
computing expert & GNU Octave developer
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz