wget-dev



From: Steven Willis (@onlynone)
Subject: wget2 | Fails to download podcast episodes in rss feed with "not followed (not matching base)" (#634)
Date: Fri, 30 Jun 2023 01:22:06 +0000


Steven Willis created an issue: https://gitlab.com/gnuwget/wget2/-/issues/634



I have a podcast feed available at a URL of the form: 
`https://subdomain.example.com/feed/1234567.xml?u=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJfaWQiOiIyNzU3ODQ0MTkxZWQ2MzFhMTdlODc3ZjUifQ.cHjoXumlDyvchHmkSgYiIzhTT+dPOB4guGSHfxQp8oE`

It returns a feed like:

```
<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:content="http://purl.org/rss/1.0/modules/content/"
     xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"
     xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd"
     xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0">
 <channel>
  <!-- SNIP -->
  <item>
   <!-- SNIP -->
   <enclosure url="https://subdomain.example.com/episode/234567890.mp3?u=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJfaWQiOiIyNzU3ODQ0MTkxZWQ2MzFhMTdlODc3ZjUifQ.cHjoXumlDyvchHmkSgYiIzhTT+dPOB4guGSHfxQp8oE" length="1" type="audio/mpeg"/>
   <!-- SNIP -->
  </item>
  <!-- SNIP -->
 </channel>
</rss>
```

I'd like to mirror the feed and all the episodes referenced by it. I tried: 
`wget2 --span-hosts --mirror 'https://subdomain.example.com/feed/1234567.xml?u=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJfaWQiOiIyNzU3ODQ0MTkxZWQ2MzFhMTdlODc3ZjUifQ.cHjoXumlDyvchHmkSgYiIzhTT+dPOB4guGSHfxQp8oE'`. 
But that only downloads the feed's XML/RSS file to 
`./subdomain.example.com/feed/1234567.xml\?u\=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJfaWQiOiIyNzU3ODQ0MTkxZWQ2MzFhMTdlODc3ZjUifQ.cHjoXumlDyvchHmkSgYiIzhTT+dPOB4guGSHfxQp8oE`.

If I add `--verbose --debug -o log` and look at the log after, I see lines like:

```
URL 'https://subdomain.example.com/episode/234567890.mp3?u=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJfaWQiOiIyNzU3ODQ0MTkxZWQ2MzFhMTdlODc3ZjUifQ.cHjoXumlDyvchHmkSgYiIzhTT+dPOB4guGSHfxQp8oE' not followed (not matching base)
```

I don't really understand since that URL does appear to match the original base 
URL requested... but I also have `--span-hosts` specified, so it should be 
allowed to cross hosts anyway, right? I also tried setting 
`--base=https://subdomain.example.com/` and slight variants on that, but 
nothing seemed to work.
