
Re: [Bug-wget] wget produces erroneous robots.txt


From: leoh Jones
Subject: Re: [Bug-wget] wget produces erroneous robots.txt
Date: Wed, 18 Feb 2015 08:40:39 -0500

Thanks for the reply.
I am using Debian 8 (jessie), if that matters, though I had the same issue
on a recent version of Ubuntu.
I did not use the --content-on-error option; I just used "-m".
I have no ~/.wgetrc and no /etc/wgetrc.
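Not wget's own code, but a small self-contained sketch of the mechanism Tim describes below: a mirroring client requests /robots.txt first, gets a 404, and if it still writes the response body to disk (which is what --content-on-error enables), the saved robots.txt ends up containing the server's error page. The toy server and client here are stand-ins, not wget internals:

```python
# Sketch (hypothetical server/client, not wget itself): how a custom 404 page
# can end up on disk as robots.txt when error bodies are saved.
import http.server
import threading
import urllib.error
import urllib.request

class Handler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        # This toy site has no robots.txt: every request gets a custom 404 page.
        body = b"<html>404 Not Found - custom error page</html>"
        self.send_response(404)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet

server = http.server.HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

url = f"http://127.0.0.1:{server.server_port}/robots.txt"
try:
    urllib.request.urlopen(url)
except urllib.error.HTTPError as err:
    # A client that writes this body to robots.txt on error (wget's
    # --content-on-error behaviour) reproduces the reported symptom.
    saved = err.read()
    print(saved.decode())

server.shutdown()
```

If nothing enables that behaviour (no command-line flag, no wgetrc entry), wget should discard the 404 body instead of saving it.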
Hey, where is the official GitHub repo?
I will try again on the mailing list. Here is the wget version on my Debian
machine:

$ wget --version
GNU Wget 1.16 built on linux-gnu.

+digest +https +ipv6 +iri +large-file +nls +ntlm +opie +psl +ssl/gnutls

Wgetrc:
    /etc/wgetrc (system)
Locale:
    /usr/share/locale
Compile:
    gcc -DHAVE_CONFIG_H -DSYSTEM_WGETRC="/etc/wgetrc"
    -DLOCALEDIR="/usr/share/locale" -I. -I../lib -I../lib
    -D_FORTIFY_SOURCE=2 -I/usr/include -g -O2 -fstack-protector-strong
    -Wformat -Werror=format-security -DNO_SSLv2 -D_FILE_OFFSET_BITS=64
    -g -Wall
Link:
    gcc -g -O2 -fstack-protector-strong -Wformat
    -Werror=format-security -DNO_SSLv2 -D_FILE_OFFSET_BITS=64 -g -Wall
    -Wl,-z,relro -L/usr/lib -lnettle -lgnutls -lz -lpsl -lidn -luuid
    ftp-opie.o gnutls.o http-ntlm.o ../lib/libgnu.a

Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
<http://www.gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Originally written by Hrvoje Niksic <address@hidden>.
Please send bug reports and questions to <address@hidden>.


On Wed, Feb 18, 2015 at 8:22 AM, Tim Ruehsen <address@hidden> wrote:

> On Wednesday 18 February 2015 07:45:53 leoh Jones wrote:
> > Pardon me if this email reaches you in error; the email addresses were
> > taken from the wget source.
> > I was mirroring a web server with wget -m <address>.
> > When it was done, I went in to look at the files and noticed that there
> > is a robots.txt file. This was interesting, because the mirrored site
> > doesn't have a robots.txt file.
> > So then I looked at the robots.txt file's contents, which were those of
> > the site's 404 page.
>
> First of all, I can't reproduce it here with the latest version from git.
>
> Looks like the new feature --content-on-error is enabled. Did you use it?
> What do /etc/wgetrc and ~/.wgetrc look like? And, very important: what is
> the output of 'wget --version'?
>
> > Is this a bug? I signed up for the mailing list, for wget bug reports but
> > never heard back. Or is this expected behavior?
>
> When you sign up for the mailing list, you should get an email very soon
> with further instructions. Just try it again.
>
> Tim

