Linux Journal digital edition and wget

Hello All,

Like others here, I am a Linux Journal subscriber, and they have discontinued the print edition, providing a PDF from the web instead. I get a monthly email with a link; clicking on it brings up a page in Firefox and an auto-started download in the Firefox download manager. Sometimes it "completes" short of the whole file, which is useless, but usually I can get it after multiple tries. This month, no such luck.

I have tried copying and pasting the link into wget, but without success: I get a stub document, and wget then exits instead of waiting for the file that follows. I have looked at the wget man page, and there is provision for getting and saving session cookies, as well as ways to supply login credentials, but I am not certain exactly what is required. I do have login credentials available, but they appear not to be needed. I am required not to pass on the full URL, but the final string consists of six alphanumeric characters, a hyphen, then fourteen characters, another hyphen, then ten characters. A shortened version follows, which should fail in a browser, but gives the first portion for reference:

http://linuxjournalservices.com/portal/wts/cgmciy-

I am wondering whether I need recursion, or some other option, to get wget to wait for the download to start on the opened connection.

Regards,
Mark Trickett
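For reference, the cookie and credential handling mentioned above looks roughly like this in wget. This is only a sketch: the login URL, form field names, download URL and file names below are placeholders, not the actual Linux Journal ones.

    # Log in once and keep the session cookies (form URL and field names are guesses)
    wget --save-cookies cookies.txt --keep-session-cookies \
         --post-data 'user=me&pass=secret' \
         'https://example.com/login'

    # Then fetch the PDF with those cookies; note the single quotes around the URL
    wget --load-cookies cookies.txt -O issue.pdf \
         'https://example.com/portal/wts/SOME-LONG-TOKEN'

Whether any of that is actually needed depends on how the download portal is set up; as noted above, the emailed link may not require credentials at all.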

On 2012-06-17 22:45, Mark Trickett wrote:
Hello All,
Like others here, I am a Linux Journal subscriber, and they have discontinued the print edition, providing a PDF from the web instead. I get a monthly email with a link; clicking on it brings up a page in Firefox and an auto-started download in the Firefox download manager. Sometimes it "completes" short of the whole file, which is useless, but usually I can get it after multiple tries. This month, no such luck.
I have tried copying and pasting the link into wget, but without success: I get a stub document, and wget then exits instead of waiting for the file that follows. I have looked at the wget man page, and there is provision for getting and saving session cookies, as well as ways to supply login credentials, but I am not certain exactly what is required. I do have login credentials available, but they appear not to be needed. I am required not to pass on the full URL, but the final string consists of six alphanumeric characters, a hyphen, then fourteen characters, another hyphen, then ten characters. A shortened version follows, which should fail in a browser, but gives the first portion for reference:
http://linuxjournalservices.com/portal/wts/cgmciy-
I am wondering whether I need recursion, or some other option to get wget to wait for the download to start on the opened connection.
Hi Mark,

Firstly, I suggest you try this: http://www.linuxjournal.com/digital.

Alternatively, in the current LJ issue (ironically) one of the letters suggests that a 'wget' of the URL in the email works fine as long as you enclose the URL in double quotes. Bruno wrote:
Here is how I do it:
* I click on the link received via e-mail.
* In the browser that opens, I copy the entire address bar (via Ctrl-A, Ctrl-C). The URL is something like "http://download.linuxjournal.com/pdf/get-doc.php?code=...".
* I then type wget and paste the URL, with one very important precaution though. The URL must be enclosed in double quotes, as the parameters are separated by an ampersand (&), which has a special meaning under UNIX/Linux.
It works like a charm! So, if you have the chance to pass that information on to Bob, that would be great!
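In command form, Bruno's recipe amounts to something like the following; the code value and the second parameter are placeholders here, not the real ones from the email:

    wget "http://download.linuxjournal.com/pdf/get-doc.php?code=XXXXXXXX&t=YYYYYYYY"

Without the quotes, the shell would treat everything after the & as a separate command, and wget would only be given the URL up to code=XXXXXXXX.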
Alternatively:
Jordi Clariana wrote:
I was reading the April 2012 issue, and regarding the letter from Bob Johnson titled "Linux Journal Download with wget?", I think I may have a solution: the cliget extension for Firefox (https://addons.mozilla.org/en-us/firefox/addon/cliget).
Hope this helps.

--
Regards,
Matthew Cengia
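For what it's worth, cliget works by turning the browser's pending download into a ready-made command line that you can paste into a terminal. Its output is typically something along these lines; the header values, code and output file name here are placeholders:

    wget --header 'Cookie: SESSID=...' \
         --referer 'http://download.linuxjournal.com/' \
         -O linuxjournal-issue.pdf \
         'http://download.linuxjournal.com/pdf/get-doc.php?code=XXXXXXXX'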

On Mon, Jun 18, 2012 at 09:59:40AM +1000, Matthew Cengia wrote:
Firstly, I suggest you try this: http://www.linuxjournal.com/digital. Alternatively, in the current LJ issue (ironically) one of the letters suggests that a 'wget' of the URL in the email works fine as long as you enclose the URL in double-quotes:
note: enclosing URLs in single-quotes on the shell command line is generally a better idea.

Strings inside double quotes are subject to further expansion by the shell, including $ characters being interpreted as shell variables, backticks causing a sub-shell to be executed, and so on. Single-quoted strings are fixed: they are not subject to further expansion by the shell. There are all sorts of additional context-dependent qualifiers, exceptions, caveats and oddities, but in short: double quotes expand variables; single quotes do not, and any text inside them is treated as a fixed string.

And without quoting... well, since long URLs often have ampersand (&) characters in them (it's the separator between HTTP GET variables), as soon as the shell sees the & it will run the command line up to that point in the background and then continue execution with the remainder of the command line. That is the reason why complex URLs need to be quoted (or why any "special" characters need to be individually escaped with a backslash).

Similarly, semicolons are interpreted by the shell as a separator between commands, e.g. "ls ; echo foo" is treated exactly the same as if you typed "ls<enter>echo foo<enter>".

Simple URLs without ampersands (or semicolons, or other characters that have special meaning to the shell) are no problem, but it's safer to just quote all URLs that are more complex than http://example.com/path/filename.html.

craig

--
craig sanders <cas@taz.net.au>

BOFH excuse #321: Scheduled global CPU outage
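A minimal illustration of the difference, using a made-up URL with two GET parameters:

    # Unquoted: the shell backgrounds 'wget http://example.com/get.php?a=1' at the
    # first &, and 'b=2' runs as a separate command, so wget never sees the full URL
    wget http://example.com/get.php?a=1&b=2

    # Double quotes: the & is now safe, but $a would still be expanded as a shell variable
    wget "http://example.com/get.php?a=1&b=$a"

    # Single quotes: everything between them is passed to wget exactly as written
    wget 'http://example.com/get.php?a=1&b=2'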

Craig Sanders wrote:
note: enclosing URLs in single-quotes on the shell command line is generally a better idea.
greybot says: "USE MORE QUOTES!" They are vital. Also, learn the difference between ' and " and `. See <http://mywiki.wooledge.org/Quotes> and <http://wiki.bash-hackers.org/syntax/words>. -- twb, who is a big fan of linking to FAQs instead of rewriting the answer each time ;-)

On Mon, Jun 18, 2012 at 12:16:47PM +1000, Trent W. Buck wrote:
Craig Sanders wrote:
note: enclosing URLs in single-quotes on the shell command line is generally a better idea.
greybot says:
"USE MORE QUOTES!" They are vital. Also, learn the difference between ' and " and `. See <http://mywiki.wooledge.org/Quotes> and <http://wiki.bash-hackers.org/syntax/words>.
Good links.
twb, who is a big fan of linking to FAQs instead of rewriting the answer each time ;-)
Having been frustrated on *numerous* occasions by Google searches, mailing list archives and web sites linking to FAQs and other URLs that no longer exist, I'm a big fan of duplication, rewriting, custom explanations, and paraphrasing in addition to linking to FAQs.

craig

--
craig sanders <cas@taz.net.au>

On 06/18/2012 12:31 PM, Craig Sanders wrote:
Having been frustrated on *numerous* occasions by Google searches, mailing list archives and web sites linking to FAQs and other URLs that no longer exist, I'm a big fan of duplication, rewriting, custom explanations, and paraphrasing in addition to linking to FAQs.
craig
And that is greatly appreciated.

Ben

Hello Matthew,

On Mon, 2012-06-18 at 09:59 +1000, Matthew Cengia wrote:
On 2012-06-17 22:45, Mark Trickett wrote:
Hello All,
Like others here, I am a Linux Journal subscriber, and they have discontinued the print edition, providing a PDF from the web instead. I get a monthly email with a link; clicking on it brings up a page in Firefox and an auto-started download in the Firefox download manager. Sometimes it "completes" short of the whole file, which is useless, but usually I can get it after multiple tries. This month, no such luck.
Hi Mark,
Firstly, I suggest you try this: http://www.linuxjournal.com/digital. Alternatively, in the current LJ issue (ironically) one of the letters suggests that a 'wget' of the URL in the email works fine as long as you enclose the URL in double-quotes:
I have visited that URL, and can copy and paste the appropriate URL for the particular issue, complete with the PHP code parameter. I have tried it with single quotes as well. There appear to be problems with the network: even wget was failing at somewhere up to 5% of the file. It gave up after twenty tries, with one early effort reaching a peak of 9%. I have suspicions about Telstra BigPond. wget did work perfectly a little later, when I grabbed the podcast of the ABC RN Science Show from Sunday past, about Alan Turing.
Bruno wrote:
Here is how I do it:
* I click on the link received via e-mail.
* In the browser that opens, I copy the entire address bar (via Ctrl-A, Ctrl-C). The URL is something like "http://download.linuxjournal.com/pdf/get-doc.php?code=...".
* I then type wget and paste the URL, with one very important precaution though. The URL must be enclosed in double quotes, as the parameters are separated by an ampersand (&), which has a special meaning under UNIX/Linux.
It works like a charm! So, if you have the chance to pass that information on to Bob, that would be great!
Alternatively:
Jordi Clariana wrote:
I was reading the April 2012 issue, and regarding the letter from Bob Johnson titled "Linux Journal Download with wget?", I think I may have a solution: the cliget extension for Firefox (https://addons.mozilla.org/en-us/firefox/addon/cliget).
Hope this helps.
I have also added that Firefox add-on, but I need to sort out the invocation options for wget. I _MUST_ use the --no-proxy option to get anywhere, hence my suspicion about Telstra BigPond. This still does not answer why getting the podcast from the ABC in Oz does work.

Regards,
Mark Trickett

Mark Trickett wrote:
quotes as well. There appear to be problems with the network: even wget was failing at somewhere up to 5% of the file. It gave up after twenty tries, with one early effort reaching a peak of 9%.
wget -c. There are plenty more interesting options in the manpage, too.
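For example, something along these lines will resume a partial download and keep retrying until it succeeds (the code value is a placeholder):

    wget -c --tries=0 --timeout=30 -O issue.pdf \
         'http://download.linuxjournal.com/pdf/get-doc.php?code=XXXXXXXX'

Note that -c only helps if the server honours byte-range requests; a script like get-doc.php may simply restart the download from the beginning each time.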
I have also added that Firefox add-on, but I need to sort out the invocation options for wget. I _MUST_ use the --no-proxy option to get anywhere,
That's your fault, not Telstra's. If Telstra are using a transparent proxy (as, IIRC, TPG does), you won't bypass it with wget --no-proxy.
participants (5):

- Ben Nisenbaum
- Craig Sanders
- Mark Trickett
- Matthew Cengia
- Trent W. Buck