bug-wget
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Bug-wget] [PATCH] Support metalink:file elements with a "path/file"


From: Tim Rühsen
Subject: Re: [Bug-wget] [PATCH] Support metalink:file elements with a "path/file" format
Date: Tue, 09 Aug 2016 21:25:16 +0200
User-agent: KMail/5.2.3 (Linux/4.6.0-1-amd64; KDE/5.23.0; x86_64; ; )

On Freitag, 5. August 2016 20:25:06 CEST Matthew White wrote:
> On Thu, 04 Aug 2016 16:47:18 +0200
> 
> Tim Ruehsen <address@hidden> wrote:
> > On Wednesday, August 3, 2016 1:46:56 PM CEST Matthew White wrote:
> > > On Tue, 02 Aug 2016 11:27:08 +0200
> > > 
> > > Tim Ruehsen <address@hidden> wrote:
> > > > On Tuesday, August 2, 2016 10:06:42 AM CEST Matthew White wrote:
> > > > > On Sat, 30 Jul 2016 21:23:56 +0200
> > > > > 
> > > > > Matthew White <address@hidden> wrote:
> > > > > > Hello!
> > > > > > 
> > > > > > I noticed that wget doesn't use the metalink:file element as the
> > > > > > RFC5854
> > > > > > suggests.
> > > > > > 
> > > > > > https://tools.ietf.org/html/rfc5854#section-4.1.2.1
> > > > > > 
> > > > > > The RFC5854 specifies that the metalink:file could have a
> > > > > > "path/file"
> > > > > > format. In this case wget should create the "path" tree and save
> > > > > > the
> > > > > > file
> > > > > > as "path/file", but it doesn't. Instead wget saves the file in the
> > > > > > working directory.
> > > > > > 
> > > > > > e.g. <file name="dirA/dirB/file.gz">
> > > > > > 
> > > > > > With this patch wget conforms to the RFC5854.
> > > > > > 
> > > > > > I made this patch working on the following branch:
> > > > > > master (latest 20cac2c5ab3d63aacfba35fb10878a2d490e2377)
> > > > > > git://git.savannah.gnu.org/wget.git
> > > > > > 
> > > > > > Let me know if this helps.
> > > > > 
> > > > > Hi,
> > > > > 
> > > > > After the suggestions of Tim, I fixed the patch (nice! fix the
> > > > > fix...).
> > > > > So,
> > > > > scratch the previous patch and use this one instead.
> > > > > 
> > > > > The function concat_strings() replaces the combination of malloc(),
> > > > > strlen(), and strcpy(). (Thanks Tim.)
> > > > 
> > > > In this special case (just one string to clone), please use xstrdup().
> > > > That is much less overhead.
> > > > 
> > > > > The description now follows the style of the other patches. (Thanks
> > > > > again
> > > > > Tim!)
> > > > > 
> > > > > You may consider this patch a Bugfix:
> > > > > * src/utils.c (unique_create, fopen_excl): cannot create a directory
> > > > > tree
> > > > > like "path/file".
> > > > > 
> > > > > The problem is that fopen_excl() doesn't create non-existing
> > > > > directories.
> > > > > What do you suggest?
> > > > 
> > > > IMO, saving metalink files should work 'as usual and expected'. That
> > > > means, all the directory options should apply (see man wget /
> > > > Directory
> > > > Options). We also have to deal with 'escaping' file name sequences
> > > > like
> > > > ../ etc.
> > > > Not to forget character set conversions.
> > > > 
> > > > What about cases where you download https://myserver/file.tgz, and in
> > > > the
> > > > metalink file 'name' is set to 'xyz.doc' ? IMO, we should keep
> > > > file.tgz,
> > > > except the user stated --trust-server-names.
> > > > 
> > > > Maybe we should set up a a metalink document where we define how
> > > > everything
> > > > should work including corner cases. Next step would be to set up
> > > > appropriate tests and fix the wget metalink code to survive the tests.
> > > > 
> > > > > Can we make this patch obsolete?
> > > > 
> > > > Basically yes.
> > > > But we see, there is still much to do around metalink ... but most of
> > > > the
> > > > needed code is already there. Most work will be the definition and the
> > > > tests.
> > > > 
> > > > Regards, Tim
> > > 
> > > Changed to use xstrdup(), new patch attached.
> > 
> > I already mentioned, that taking the metalink filename pulls in a security
> > issue. It might escape the current directory with a relative or absolute
> > path.
> Actually, wget aborts if something like "../path/file" is found as
> metalink:file name (see attached document as patch). The same goes for
> "./path/file" and "/path/file".
> > You might just take the file name without path, but only if the the option
> > -- trust-server-names has been set.
> > 
> > Tim
> 
> Hi Tim,
> 
> I got your PM, did you get mine (I had an email server problem)?
> 
> I wrote a draft Metalink.README trying to explain the current interaction
> between "Directory Options" and the option '--include-metalink=file' on the
> command line.
> 
> I attach the Metalink.README here as a patch.
> 
> Can you confirm my findings? (Don't rush)

I created a new upstream branch 'metalink' where we can put further work for 
testing. I included your README as doc/metalink.txt and also added two tests 
for '../File1' and '/File1', which currently fail (the tests expect 'File1' to 
be written). 

You can't push there, but maybe you clone it (e.g. to Gitlab, Github or 
elsewhere) and push your changes to that repo. So I can directly merge from 
there (and you could then merge from upstream 'metalink').

I still didn't find time to think about the Metalink file naming details...

Regards, Tim

Attachment: signature.asc
Description: This is a digitally signed message part.


reply via email to

[Prev in Thread] Current Thread [Next in Thread]