Re: MATCH FILES BY TOTALLY OPEN JOINTS


From: Frans Houweling
Subject: Re: MATCH FILES BY TOTALLY OPEN JOINTS
Date: Sun, 5 Jan 2025 22:53:23 +0100
User-agent: Mozilla Thunderbird

Hi Ricardo,

This example shows that a "/FILE ... /FILE" match includes all cases from both files, even when a BY value in one file has no match in the other.


DATASET DECLARE left.
DATASET ACTIVATE left.
DATA LIST LIST /voterID var1 var2.
BEGIN DATA
55 1 2
66 3 4
END DATA.

DATASET DECLARE right.
DATASET ACTIVATE right.
DATA LIST LIST /voterID var1 var3.
BEGIN DATA
55 5 6
77 3 4
END DATA.

DATASET ACTIVATE left.
MATCH FILES FILE = * /FILE = right /BY voterID.
LIST.


       Data List
╭───────┬────┬────┬────╮
│voterID│var1│var2│var3│
├───────┼────┼────┼────┤
│  55.00│1.00│2.00│6.00│
│  66.00│3.00│4.00│   .│
│  77.00│3.00│   .│4.00│
╰───────┴────┴────┴────╯
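
To see which file each case in the result came from, MATCH FILES accepts an /IN subcommand after each /FILE. The following is only a sketch, reusing the two datasets defined above and run in place of the MATCH FILES command in the example; the names in_left and in_right are made up for illustration.

```pspp
* Sketch only: in_left and in_right are illustrative variable names.
DATASET ACTIVATE left.
MATCH FILES /FILE = * /IN = in_left
            /FILE = right /IN = in_right
            /BY voterID.
LIST.
```

Each /IN variable should be 1 when the case was present in that file and 0 otherwise, so a case found only in the right file can be selected afterwards with a condition such as in_left = 0.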

If by FULL OUTER JOIN you mean "include all cases even if the BY variable is missing" (a strange scenario), then I suggest three steps. First, assign a unique value to each case whose BY variable is missing (for example: IF (MISSING(voterID)) voterID = -$CASENUM.), taking care that the values assigned in one file do not collide with those assigned in the other. Then use ADD FILES to stack all cases from both files. Finally, use AGGREGATE with the BY variable as the break variable to eliminate duplicates.
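
The three steps could be sketched roughly as follows. This is untested and assumes freshly defined "left" and "right" datasets as in the example above. The even/odd numbering and the *_first variable names are my own choices for illustration, not anything PSPP requires.

```pspp
* Sketch only, assuming the datasets left and right from the example above.
* Step 1: give each case with a missing voterID its own negative ID.
* Even numbers for left and odd numbers for right avoid collisions.
DATASET ACTIVATE left.
IF (MISSING(voterID)) voterID = -2 * $CASENUM.
EXECUTE.

DATASET ACTIVATE right.
IF (MISSING(voterID)) voterID = -(2 * $CASENUM + 1).
EXECUTE.

* Step 2: stack all cases from both files.
DATASET ACTIVATE left.
ADD FILES /FILE = * /FILE = right.

* Step 3: collapse to one case per voterID.
* FIRST returns the first non-missing value in each break group.
AGGREGATE OUTFILE = * /BREAK = voterID
  /var1_first = FIRST(var1)
  /var2_first = FIRST(var2)
  /var3_first = FIRST(var3).
LIST.
```

For a variable that exists in both files (var1 here), which value survives depends on the order of the cases within each break group, so check the result if the two files can disagree.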



On 1/5/25 18:58, Ricardo Mejias wrote:


From: Ricardo Mejias <ricardomejias@hotmail.com>
Sent: Sunday, January 5, 2025 12:50 PM
To: pspp-users@gnu.org <pspp-users@gnu.org>
Subject: Re: Pspp-users Digest, Vol 221, Issue 2
 
This is my code, and it works fine for matching the left file to the right, but only for VoterIDs that are in both files.

MATCH FILES
/FILE = "F:\External Drive for PSPP Data\FloridaTotalsFiles\SelectedFields_20241112.sav"
/FILE = "F:\External Drive for PSPP Data\FloridaTotalsFiles\VoterIDParty_20241022.sav"
/BY VoterID.

SAVE
/OUTFILE = "F:\External Drive for PSPP Data\FloridaTotalsFiles\SelectedFieldsWithLastMonParty_20241112.sav".




What is the syntax to make this a totally open join? By that I mean that blank records from both files are also selected when their VoterID exists in the other file.







Message: 1
Date: Sat, 4 Jan 2025 18:01:35 -0600
From: Phong Duong <phong_duong@yahoo.com>
To: Ricardo Mejias <ricardomejias@hotmail.com>
Cc: pspp-users@gnu.org
Subject: Re: Questions of MATCH FILES
Message-ID: <1DEA2BEE-7AE4-42E6-9A67-00E685CAB223@yahoo.com>
Content-Type: text/plain; charset="utf-8"

An HTML attachment was scrubbed...

------------------------------

Message: 2
Date: Sun, 5 Jan 2025 13:56:49 +0100
From: ft gmail <public.ftr@gmail.com>
To: pspp-users@gnu.org
Subject: Re: Questions of MATCH FILES
Message-ID: <99bbcb90-dd6e-4b1e-b3af-2fd96bafb28f@gmail.com>
Content-Type: text/plain; charset="utf-8"; Format="flowed"

The list does not accept images; it is text only. So that we can
understand your question, please copy and paste your syntax.

Thanks.

On 05/01/2025 at 01:01, Phong Duong wrote:
> Perhaps there should be no period before /ByVoterID.
>
> Sent from my iPhone
>
>> On Jan 4, 2025, at 13:58, Ricardo Mejias <ricardomejias@hotmail.com>
>> wrote:
>>
>>
>> When I run this code:
>>
>> <image.png>
>>
>> I get this error message.  It happens with and without a period
>> after /ByVoterID.  What do I have to change to avoid this error?
>>
>> <image.png>
>>
>> Also: What do I do to make the matched file include all the records
>> from both original files, and a blank for all of the fields that do
>> not match the other file, except for the VoterID field?
>> I read an IBM website that calls this "a Full Outer Joint," but
>> does not explain how to accomplish it.  And I searched for Outer
>> Joint in the PSPP documentation and did not find it.
>>
>> Merge node (SPSS Modeler) - IBM
>> The function of a Merge node is to take multiple input records and
>> create a single output record containing all or some of the input
>> fields. This is a useful operation when you want to merge data from
>> different sources, such as internal customer data and purchased
>> demographic data.