Re: Import data from other file formats and Histogram question
From: Erik Frebold
Subject: Re: Import data from other file formats and Histogram question
Date: Sun, 15 Nov 2009 13:34:11 -0800 (PST)
1. Thanks for this. I'm still mystified, though. I tried the first two
suggestions, and I think I'm running under US English, as this is a fresh
install of SUSE 11.1 and I selected US English at install. By the way, this is
pspp 0.6.2.
I did succeed in loading the complete data file by cutting and pasting all the
numbers into a text editor (kludge alert...) and saving to a new file. What
would that have stripped out? (The original file still produces pspp's "Number
followed by garbage" warning for every one of the 5000 values.)
The new .csv file that works contains 5000 items, but the successful import
into psppire contains only 3838. That's odd.
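One way to see what the cut-and-paste might have stripped is to look at the raw bytes of both files. The following is only a sketch in Python (chosen because the data come from Python modules); the function names and the set of "expected" bytes are assumptions made for illustration, not anything pspp defines. It reports byte-order marks, CRLF line endings, any byte that is not part of a plain number or separator, and how many values the file holds.

```python
import re
from collections import Counter

# Assumption: a plain numeric data file should contain only these bytes
# (digits, sign, dot, exponent marker, comma, space, tab, LF).
EXPECTED = set(b"0123456789+-.eE, \t\n")


def count_values(raw: bytes) -> int:
    """Count tokens separated by commas and/or whitespace."""
    text = raw.decode("ascii", errors="replace")
    return len([tok for tok in re.split(r"[,\s]+", text) if tok])


def describe(raw: bytes) -> dict:
    """Summarise features of raw that could confuse a numeric parser."""
    return {
        "utf8_bom": raw.startswith(b"\xef\xbb\xbf"),
        "utf16_bom": raw[:2] in (b"\xff\xfe", b"\xfe\xff"),
        "crlf_line_endings": raw.count(b"\r\n"),
        # Maps byte value -> number of occurrences.
        "unexpected": dict(Counter(b for b in raw if b not in EXPECTED)),
        "values": count_values(raw),
    }


# In-memory sample standing in for a real file; for a file on disk,
# read it with open(path, "rb").read() and pass the bytes to describe().
sample = b"\xef\xbb\xbf1.5\r\n2.25\r\n"
print(describe(sample))
```

Running describe() on both the original file and the re-saved one and comparing the two dictionaries should show which bytes went missing, and the "values" count gives a figure to hold against the 5000 and 3838 above.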
2. Re: Histogram: I run the following on the successful import:
FREQUENCIES
/VARIABLES= VAR001
/FORMAT=AVALUE TABLE
/HISTOGRAM=NORMAL.
This produces some tables and a .png histogram with six bins. Shouldn't there
be more bins for a dataset this size? I seem to recall reading that quite some
thought went into the numbins algorithm for HISTOGRAM.
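For comparison, here is a rough Python sketch of what two textbook bin-count rules would suggest for a dataset of this size. These are the Sturges and Freedman-Diaconis rules, shown purely as a yardstick; I have not checked which rule pspp's HISTOGRAM actually uses, so this is not a description of its algorithm.

```python
import math
import statistics


def sturges_bins(n: int) -> int:
    """Sturges' rule: ceil(log2(n)) + 1 bins for n observations."""
    return math.ceil(math.log2(n)) + 1


def freedman_diaconis_bins(values) -> int:
    """Freedman-Diaconis rule: bin width = 2 * IQR / n^(1/3)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    if iqr == 0:
        return 1  # degenerate spread: fall back to a single bin
    width = 2 * iqr / len(values) ** (1 / 3)
    return max(1, math.ceil((max(values) - min(values)) / width))


print(sturges_bins(3838))  # 13
print(sturges_bins(5000))  # 14
```

Both rules give noticeably more than six bins for a few thousand values, which is why six looks low to me.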
> Message: 4
> Date: Sun, 15 Nov 2009 07:21:14 +0000
> From: John Darrington <address@hidden>
> Subject: Re: Importing data from other file formats than .sav
> To: Erik Frebold <address@hidden>
> Cc: address@hidden
> Message-ID: <address@hidden>
> Content-Type: text/plain; charset="us-ascii"
>
> My guess is that you're running under a locale which uses the comma as the
> decimal separator instead of the dot (fr_CA ?). If this is the case, then
> obviously none of the strings in your file are valid numbers. Either do
> "SET DECIMAL=DOT." or change all the "." characters to "," in your file or
> run under an English locale.
>
> J'
>
>
> On Fri, Nov 13, 2009 at 09:12:45PM -0800, Erik Frebold wrote:
> I'm having a bit of difficulty getting this to happen. What I'd like to
> do is get pspp to read data files output from simple python modules. Thus
> far, I've been trying to use psppire's "import delimited text data" function.
> Have saved small test files variously as .txt, .csv, or as a .csv with
> python-added commas between values. In all cases the data look fine in the
> home program, and also look fine in psppire's import function right up to the
> last step "ok", whereupon they produce multiple errors of the type:
> "dataSet.csv:1: data file warning: (columns 1-0, F field) Number followed by
> garbage." Only when each data value is separated from the next by a comma
> (and no spaces or carriage returns) do I escape these errors, but then each
> datum heads its own variable column in psppire.
>
> Perhaps the way to tackle this is to use a pspp syntax window with "BEGIN
> DATA" and "DATA LIST" or "GET DATA"? (on some sort of .csv or .txt file?)
> Does pspp require each imported datum to be exactly the same length (i.e.
> same precision)?
> A few suggestions to set me on the right track would be helpful as I've
> run out of ideas.
>
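On the Python side, a minimal sketch of writing the values in the plainest form I can think of: one number per line, ASCII only, a dot as the decimal separator, and LF line endings. The function name and the sample values are made up for illustration, and whether this layout is what the psppire importer wants is an assumption, not something confirmed in this thread.

```python
import os
import tempfile


def write_one_per_line(values, path):
    """Write each value on its own line as plain ASCII text."""
    # newline="\n" stops Python from translating line endings to CRLF.
    with open(path, "w", encoding="ascii", newline="\n") as fh:
        for v in values:
            # repr() of a float always uses "." regardless of locale.
            fh.write(repr(float(v)) + "\n")


# Usage with made-up values, written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    out = os.path.join(tmp, "dataSet.txt")
    write_one_per_line([1.5, 2.25, 3], out)
    with open(out, "rb") as fh:
        print(fh.read())
```

Note that repr() does not pad to a fixed width, so the values will differ in length; that keeps the open question about precision visible rather than hiding it.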