http://qs321.pair.com?node_id=1202129


in reply to Re^2: Batch remove "404 Not Found" URLs
in thread Batch remove URLs

Assuming that you just want to get the job done and are not pursuing this as an academic exercise, I would abandon the one-liner approach. It can be done that way, but the more you throw into it the messier it gets. Here's one plan:

  1. Store your 300 URLs in a file, one per line (if you haven't already done so). You can then slurp this into an array at the start of your script.
  2. Loop over the files with a simple glob.
  3. Inside that loop, loop over all the URLs.
  4. Inside the inner loop, call a subroutine with the filename and the URL to replace.
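
A minimal sketch of that plan, to show how the pieces could fit together. The sub names (read_urls, remove_url, process_all) and the command-line arguments are my own placeholders, not anything from your code, and remove_url is deliberately a stub that only reports what it would do:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Step 1: slurp the URL list (one URL per line) into an array.
sub read_urls {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    chomp(my @urls = <$fh>);
    close $fh;
    return grep { length } @urls;    # skip blank lines
}

# Step 4: placeholder for the real work. It only reports for now,
# so the looping logic can be checked without touching any file.
sub remove_url {
    my ($file, $url) = @_;
    print "Would remove $url from $file\n";
    return;
}

# Steps 2 and 3: outer loop over the files, inner loop over the URLs.
# Returns the number of (file, URL) pairs handed to remove_url.
sub process_all {
    my ($url_list, $pattern) = @_;
    my @urls  = read_urls($url_list);
    my $calls = 0;
    for my $file (glob $pattern) {
        for my $url (@urls) {
            remove_url($file, $url);
            $calls++;
        }
    }
    return $calls;
}

# Hypothetical usage: perl plan.pl urls.txt '*.html'
process_all(@ARGV) if @ARGV == 2;
```

With the stub in place you can confirm that every file gets paired with every URL before any real editing happens.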

You can now test the inner subroutine in isolation on a test file to your heart's content, getting it exactly right without destroying the original content. Consider quotemeta for the search terms: URLs are full of characters such as . and ? that are regex metacharacters, so they need escaping before you use them in a pattern. If you get stuck with that approach, come back with specific questions, ideally as an SSCCE. Good luck.
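
For what it's worth, here is one way the inner subroutine might look. This is a sketch under assumptions: I am guessing that "replace" means deleting the URL text outright, and the sub name remove_url and the ".test" scratch copy are made up for illustration. Swap in whatever replacement you actually need.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Copy qw(copy);

# Delete every literal occurrence of $url from $file.
# Returns the number of occurrences removed (0 if none).
# Assumption: "replace" here means "replace with nothing".
sub remove_url {
    my ($file, $url) = @_;

    open my $in, '<', $file or die "Cannot read $file: $!";
    my $content = do { local $/; <$in> } // '';    # slurp the whole file
    close $in;

    # quotemeta escapes . ? + and friends so the URL matches literally.
    my $literal = quotemeta $url;
    my $count   = ($content =~ s/$literal//g) || 0;

    # Only rewrite the file if something actually changed.
    if ($count) {
        open my $out, '>', $file or die "Cannot write $file: $!";
        print {$out} $content;
        close $out or die "Cannot close $file: $!";
    }
    return $count;
}

# Hypothetical usage: perl remove_url.pl page.html 'http://example.com/p?id=1'
# Works on a scratch copy so the original file stays untouched.
if (@ARGV == 2) {
    my ($file, $url) = @ARGV;
    my $scratch = "$file.test";
    copy($file, $scratch) or die "Copy failed: $!";
    printf "Removed %d occurrence(s) from %s\n",
        remove_url($scratch, $url), $scratch;
}
```

Diff the scratch copy against the original to see exactly what was removed before you let it loose on the real files.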