Re: [Pan-users] Command line use; download of nzb files does not stop


From: Duncan
Subject: Re: [Pan-users] Command line use; download of nzb files does not stop
Date: Sun, 6 Nov 2011 19:16:10 +0000 (UTC)
User-agent: Pan/0.135 (Tomorrow I'll Wake Up and Scald Myself with Tea; GIT bb16cbd /st/portage/src/egit-src/pan2)

Graham Lawrence posted on Sun, 06 Nov 2011 07:38:35 -0800 as excerpted:

> Dang, this is a fine group, where I've been introduced to newsgroups and
> the intricacies of downloading from nzbs, what I guess should be called
> implicit compound conditionals in Bash, several new linux commands, and
> how to use email.

=:^)

Obviously this group/list doesn't always stay 100% on topic.  I'm 
definitely as much if not more to blame there than most and I recognize 
that fact.  But in my defense I've not seen complaints about it yet, and 
in the context of my previous reply to the effect that people whose posts 
tend to be the most helpful also tend to be the ones setting group/list 
mores and tolerance levels, it could be argued that my own example has 
helped set the tone here for what /is/ acceptable.

Additionally, pan went through some pretty dark times a few years ago, 
when it looked like development was abandoned and the few regulars here 
were on death watch, simply waiting for the day it would no longer 
compile on current systems, and the last one out could turn out the 
lights, as they say.  Fortunately, we have new development now and I 
think everyone's 
thankful to those doing it, but back then, it's quite likely that the 
occasional OT discussion helped keep the last dying embers alive when 
otherwise the list might have been abandoned, as there was simply nothing 
on-topic left to discuss for those still around.

But I too quite enjoy the list/group and the discussions we have, which I 
guess is a good thing given my role in creation and support of the 
atmosphere we have. =:^)

> And my Pan download completed using
> 
> pan --no-gui -o /home/g/Films --nzb "${nzb[1]}" 2>/home/g/pan.debug
> 
> Turns out the problem was my quoting the quotes with \.

IIRC I mentioned that quote-escaping.  What it was effectively doing was 
killing bash's interpretation and removal of the quotes, so they were 
passed on to pan itself.  I wasn't quite sure how pan itself dealt with 
quotes, since they'd normally be stripped before it got the commandline, 
but it seems they do have an effect.  (If I were more of a coder I could 
of course actually look at the sources to see what pan did with quotes, 
but...)  Fortunately or unfortunately, pan doesn't blow up with them; it 
just changes behavior enough to complicate troubleshooting.  (If it 
crashed pan, or if pan interpreted the quotes as literal parts of the 
filename, the problem would be more severe, but also much more 
immediately obvious.)
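
For anyone following along, here's a rough sketch of what the shell hands 
to a program in the two cases.  show_args is a made-up helper that just 
prints each argument it receives, nzb_file stands in for ${nzb[1]}, and 
the path is hypothetical.  It says nothing about what pan then does with 
the arguments, only what the shell passes along.

```shell
# show_args: hypothetical helper, prints each received argument in brackets.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

# Hypothetical path with a space in it; stands in for ${nzb[1]}.
nzb_file="/tmp/my file.nzb"

# Plain double quotes: the shell strips them and passes ONE intact argument.
show_args --nzb "$nzb_file"
# prints:
# [--nzb]
# [/tmp/my file.nzb]

# Backslash-escaped quotes: the quote characters become literal text, and
# the now-unquoted expansion is split at the space into TWO arguments.
show_args --nzb \"$nzb_file\"
# prints:
# [--nzb]
# ["/tmp/my]
# [file.nzb"]
```

So with the escaped form, the program sees stray quote characters and, 
if the path contains whitespace, more arguments than intended.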

> Yet it still made copies of some files, but only a few instead of
> seemingly every single one, as before.  Which suggests that there may be
> a legitimate reason why Pan does this, and that I simply got spooked
> because it was doing it so much.  Is there a valid reason or is this
> truly anomalous behavior on its part?

I'm not exactly sure how to read your question, and whether this answer 
really applies or not, but...

Keep in mind that in this context at some level, pan is simply a "bot 
downloader", downloading, decoding and saving potentially all attachments 
found in a list to some arbitrary save-directory.

The problem then appears when multiple entirely different attachments, 
potentially on entirely unrelated groups, happen to appear with the same 
filename.  This problem is compounded by the fact that sometimes posts 
are made, for example on images groups, with the original filenames 
assigned by default by the camera that took the picture in the first 
place.  Since many cameras have the same default pattern 
prefix_number.jpg, with a common prefix, name clashes are not uncommon.  
(Of course they happen frequently with spam as well, but it isn't just 
spam, and pan can't judge what's spam and what's not; it's just a bot 
doing what it's told to do.)

If pan didn't have the insert-copyN-on-file-already-exists behavior, 
these same-name attachments would overwrite each other, and only the last 
one that happened to be downloaded would be saved.
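
As an illustration only, the general idea of picking a non-clashing name 
can be sketched in shell.  This is not pan's actual code, and the 
_copy_N pattern and the unique_name function are assumptions made up for 
the example; pan's real naming scheme may differ.

```shell
# unique_name DIR FILE
# Prints DIR/FILE if that path is free; otherwise inserts _copy_N before
# the extension, counting up until an unused name is found.
# NOTE: the _copy_N pattern is an assumption, not pan's documented scheme.
unique_name() {
    dir=$1
    file=$2
    case $file in
        ?*.*) base=${file%.*}; ext=.${file##*.} ;;  # split name.ext
        *)    base=$file; ext= ;;                   # no extension
    esac
    candidate=$dir/$file
    n=2
    while [ -e "$candidate" ]; do
        candidate=${dir}/${base}_copy_${n}${ext}
        n=$((n + 1))
    done
    printf '%s\n' "$candidate"
}
```

With an existing DSC_0001.jpg in the save directory, a second attachment 
of the same name would land in DSC_0001_copy_2.jpg instead of 
overwriting the first.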

That doesn't mean that pan couldn't at least do checksums on the files 
and delete the new version if it has the same checksum and size as the 
old one, which would help in some ways.  But of course that'd make it 
harder to catch duplicate downloading, as well, and pan would certainly 
have to download the entire thing and decode it to do the checksumming, 
so it wouldn't save bandwidth, just the hassle of having a bunch of 
duplicated files around.  And since many people pay by the gigabyte, 
silent name/size/checksum duplicate deletion would result in a lot of 
unhappy users who used up their bandwidth allowance without realizing it, 
because they had mistakenly set pan to download the same files over and 
over again and pan was doing just that, then deleting them as duplicates.

But more likely, it's simply that such a deduplicating feature just never 
got coded up and merged into pan.  It'd be a useful thing to have, 
particularly if there were an option to either simply log the 
dup-deletion or pop up a warning dialog when it happened, so the user 
could choose just how noisy they wanted pan to be when it finds it's 
downloading the same thing over and over.
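
Until something like that exists in pan, the same idea can be 
approximated after the fact from the shell.  A minimal sketch, assuming 
POSIX cksum (which reports a CRC and the byte count); same_content and 
dedup_new are names invented for this example, and nothing here is a pan 
feature.  Note it logs every deletion, for exactly the pay-by-the-gigabyte 
reason above.

```shell
# same_content OLD NEW
# Succeeds when both files have the same CRC checksum and the same size.
# Reading from stdin makes cksum print only "crc size", without a filename.
same_content() {
    [ "$(cksum < "$1")" = "$(cksum < "$2")" ]
}

# dedup_new OLD NEW
# Deletes NEW if it duplicates OLD and logs the deletion to stderr, so
# repeated downloads of the same thing do not go unnoticed.
# Returns 0 if NEW was removed, 1 if the files differ and NEW was kept.
dedup_new() {
    old=$1
    new=$2
    if same_content "$old" "$new"; then
        rm -- "$new"
        printf 'duplicate removed: %s (same as %s)\n' "$new" "$old" >&2
        return 0
    fi
    return 1
}
```

Usage would be along the lines of 
dedup_new DSC_0001.jpg DSC_0001_copy_2.jpg, run over each original and 
its copies.  As said, this only saves the clutter, not the bandwidth.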

But I'm not sure if that answers your question or just sidesteps it.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



