From: Thomas A Watson, 11/4/2013 8:11:47 PM
 
Downloading a set of files with wget, lynx -dump, and filters.
The NIPCC report is 1,004 pages and is also provided as per-chapter PDFs.

watson@xen1[126] lynx -dump nipccreport.com | grep pdf
5. nipccreport.com
6. nipccreport.com
7. nipccreport.com
8. nipccreport.com
9. nipccreport.com
10. nipccreport.com
11. nipccreport.com
12. nipccreport.com
13. nipccreport.com
14. nipccreport.com
15. nipccreport.com
16. nipccreport.com
17. nipccreport.com

Piping to P2 gets rid of the line numbers.
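If you don't have P2 handy, a plain sed one-liner should do much the same job here (a rough sketch, assuming lynx -dump's usual "  N. URL" reference listing):

lynx -dump nipccreport.com | grep pdf | sed 's/^ *[0-9]*\. //'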

Surrounding the command with backquotes (`) passes its output to wget as arguments.

wget will get each file in the list.

wget `lynx -dump nipccreport.com | grep pdf | P2`
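As an alternative to backquotes, GNU wget can read the URL list from standard input with -i - (a sketch, assuming P2 prints one bare URL per line):

lynx -dump nipccreport.com | grep pdf | P2 | wget -i -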

P2 is a self-generating, multi-entry soft-link creation script: a tcsh script from toms.homeip.net, open-source license. It's an oldie but still works fine.

Note: for some file sets there are multiple identical links, so pipe to sort and then to uniq. This should leave only one line for each unique file.

lynx -dump nipccreport.com | grep pdf | P2 | sort | uniq
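For what it's worth, sort -u does the sort and the dedup in one step:

lynx -dump nipccreport.com | grep pdf | P2 | sort -u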

Putting it together:

wget `lynx -dump nipccreport.com | grep pdf | P2 | sort | uniq`
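If the link list ever outgrows the shell's argument-length limit, xargs avoids that, and wget's -nc (no-clobber) flag makes a re-run skip files already downloaded (a sketch, same assumptions as above):

lynx -dump nipccreport.com | grep pdf | P2 | sort -u | xargs wget -nc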