[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: ADD FILES and UPDATE
From: |
Ben Pfaff |
Subject: |
Re: ADD FILES and UPDATE |
Date: |
Sun, 30 Nov 2008 21:00:56 -0800 |
User-agent: |
Gnus/5.11 (Gnus v5.11) Emacs/22.2 (gnu/linux) |
John Darrington <address@hidden> writes:
> I also wasn't aware that compare_values only checked the first 8
> characters. I have a feeling this is a bug.
Yes, in many cases I think so too.
Here is my analysis of each use in the source tree:
* src/data/category.c (cat_value_find): The code assumes
that each candidate is exactly 1 union value in size,
so I don't think that this is a bug here.
* src/language/stats/binomial.c (do_binomial): I think
that this is a bug, since I don't see any guarantee
that the width is short.
* src/language/stats/examine.q (show_summary): I'm not
sure from a first look, but I'm inclined to say that
this is a bug. However, the SPSS documentation for
EXAMINE says that only the short string part of long
string variables is analyzed, and maybe we have that
restriction too, though we do not document it.
* src/language/stats/freq.c (compare_freq): The main user
of this function is FREQUENCIES, and it intentionally
only analyzes the short-string part of long string
variables. chisquare.c also uses it, and I don't know
whether it has the same intention.
* src/language/stats/oneway.q (run_oneway): ONEWAY only
supports numeric variables, so this could actually use
a different comparison function that only compared
doubles. At any rate I don't think it is buggy.
* src/language/stats/t-test.q (which_group): T-TEST only
analyzes the short-string part of long string
variables, so this should be correct.
* src/math/coefficient.c (pspp_coeff_var_to_coeff): I
don't know. Further investigation required.
* src/math/covariance-matrix.c (various functions):
Ditto.
* src/math/group.c (compare_group): Ditto.
* src/ui/gui/find-dialog.c (value_compare): Seems likely
to be wrong.
I could use advice from you or Jason on the code that you
respectively maintain, especially in the "I don't know" cases
above.
> The only problem with the 3way function is that the caller must
> gaurantee that both values have the same with.
>
> I think that we need a more general compare_values_with_widths_3way
> function and all existing callers of compare_values_short should use it.
That sounds like a good idea. I know that in some places
comparison of strings of differing length is done by extending
the shorter string with spaces to the length of the longer
string, then comparing the result; see, e.g., compare_string() in
src/language/expressions/helpers.c, which is used for string
inequality comparison operators (and elsewhere) in expressions.
Are you aware of unequal-length strings being compared any other
way in PSPP?
Have you had a chance to look at any of the other commits yet?
Thanks,
Ben.
--
"Premature optimization is the root of all evil."
--D. E. Knuth, "Structured Programming with go to Statements"
- Re: ADD FILES and UPDATE,
Ben Pfaff <=
- Re: ADD FILES and UPDATE, John Darrington, 2008/12/01
- Re: ADD FILES and UPDATE, Ben Pfaff, 2008/12/02
- Re: ADD FILES and UPDATE, Ben Pfaff, 2008/12/03
- a better way to represent values (was: Re: ADD FILES and UPDATE), Ben Pfaff, 2008/12/08
- Re: a better way to represent values (was: Re: ADD FILES and UPDATE), John Darrington, 2008/12/08
- Re: a better way to represent values, Ben Pfaff, 2008/12/08
- Re: a better way to represent values, John Darrington, 2008/12/08
- Re: a better way to represent values, Ben Pfaff, 2008/12/08