Don’t fret the next time you have to check a large list of links to discover which are dead and which are live. Dead Link Checker can check your list for you in next to no time.
Most regular visitors to JournalXtra are aware of the Webmaster Marketing Tools that I compiled to make it easier to submit websites, blogs, ebooks etc. to relevant directories, ping services and syndication sites, as well as to submit press releases. Websites come and go over time and so links to them expire. Every once in a while I recheck those links, thousands of them, for activity. If I checked those links manually by clicking them one by one it would take me a week or two to keep those tools up to date.
Thankfully, there is a Perl script called Dead Link Checker which will run through a list of links in a file, test them for activity and write each result to an HTML formatted file.
Dead Link Checker places each link into an HTML file (which it automatically creates) categorized by server response code. Each tested link that returns 3xx or 4xx server codes (for example 301 or 403) is written into the HTML file alongside any page to which it redirects. This makes it easy to decide whether a link is dead or just redirected to an updated version of the page.
Many page redirects only switch between the http:// and http://www. forms of a URL, depending on the preference of the site’s webmaster.
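Server response codes are grouped by their first digit, so sorting results by code class is a simple pattern match. The sketch below is only an illustration of that grouping; the function name `classify_status` and the labels it prints are my own and are not part of Dead Link Checker.

```shell
#!/bin/sh
# Hypothetical helper (not part of Dead Link Checker):
# print a rough label for an HTTP status code based on its first digit.
classify_status() {
    case "$1" in
        2[0-9][0-9]) echo "live" ;;          # 2xx: request succeeded
        3[0-9][0-9]) echo "redirect" ;;      # 3xx: redirected elsewhere
        4[0-9][0-9]) echo "client-error" ;;  # 4xx: e.g. page not found or forbidden
        5[0-9][0-9]) echo "server-error" ;;  # 5xx: the server failed to respond properly
        *)           echo "unknown" ;;       # anything that is not a 2xx-5xx code
    esac
}

classify_status 301   # prints: redirect
classify_status 403   # prints: client-error
```

A label is only a first guess. Links in the redirect and error groups still deserve a manual check.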
Here’s an example of how to use it:
- create a text file (links.txt)
- paste the collection of URLs into the file (links.txt)
- format the links so that each one only contains the URL from http:// to .com or .net etc., i.e. remove any superfluous text. Use the Text Manipulation FAQs if you need help to quickly edit large text files.
- open a terminal (Konsole or console) in the folder that holds the text file (links.txt)
- type the command `deadlinkcheck -HTMLoutput -noCache -Verb links.txt > checked.html`
- that command produces an HTML file called checked.html from the links it checks in the file links.txt
- load the created HTML file (checked.html) in a web browser
- also load the HTML file (checked.html) in a text editor (I use Kate)
- read the links and manually remove the bad ones from the HTML file loaded in the text editor
  - those with a 2xx result are live so should be kept
  - those with a 3xx result are redirecting to somewhere else, so check that the redirect is not to a totally different site; if it is then that link is probably dead, so check it manually to confirm its status
  - those with a 4xx result lead to pages that were not found. Check them manually to confirm whether they are active or dead. Dead Link Checker relies on the server response and not all servers give a correct response
  - those with a 5xx result will likely never load. Check them manually
- use the text editor to reformat the links in the HTML file (checked.html) once the bad links have been removed from it (these instructions assume the use of Kate)
  - convert all characters to lower case (Tools>Lowercase)
  - remove superfluous characters with find and replace, then sed:
    - open find and replace (Ctrl+R)
    - find
    - replace it with `|` (or some other character that is not reproduced anywhere within the file)
    - use Alt+A to replace all occurrences
    - find `<code>|-> </code>` and replace it with nothing (i.e. just remove it)
    - find `<b>` and replace it with nothing (i.e. just remove it)
    - find `</b>` and replace it with nothing (i.e. just remove it)
    - find `</a><br>` and replace it with `</a>`
    - manually remove anything that isn’t between a tags (`<a>` and `</a>`)
    - use a terminal (Konsole or console), opened from within the folder that holds the HTML file (checked.html), and enter this command: `sed 's/.*|//g' checked.html > links.txt`
    - that command removes anything in a line that is written in front of a pipe “|” (inclusive of the pipe) and places all the links into the file links.txt
- the file links.txt will now contain the checked, active URLs from the original list, formatted within a tags with anchor text equal to the link, e.g. `<a href="http://journalxtra.com">http://journalxtra.com</a>`
- if required, alter the anchor text to remove the http:// and .com (or .net etc.) components.
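Two of the steps above lend themselves to small shell helpers: trimming pasted text down to bare URLs before the check, and stripping everything in front of the pipe marker afterwards. This is a rough sketch assuming GNU grep and sed; the function names `trim_urls` and `strip_before_pipe` are made up for illustration, and the host pattern is deliberately simple, so it will not suit every URL.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Dead Link Checker.

# trim_urls: read messy text on stdin and print each URL cut back to
# "http(s)://host", one per line, sorted with duplicates removed.
trim_urls() {
    grep -oE 'https?://[A-Za-z0-9.-]+' |   # keep only scheme + host
        sed 's/[.-]*$//' |                 # drop a trailing full stop or hyphen
        sort -u
}

# strip_before_pipe: same idea as the sed command in the steps above.
# On each line, delete everything up to and including the last "|".
strip_before_pipe() {
    sed 's/.*|//'
}

# Example usage (file names as used in this post):
#   trim_urls < pasted.txt > links.txt
#   strip_before_pipe < checked.html > links.txt
```

Because `.*` in sed is greedy, the match runs to the last pipe on each line, which is why the marker character must not appear anywhere inside the links themselves.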
Intelligent use of Dead Link Checker can transform a task that would usually take a week or two to complete into one that takes less than 20 minutes.
Download Dead Link Checker from Sourceforge and read its user guide here.
I do not know whether DLC works in Windows. Use a Linux Live Disk (Kubuntu or Linux Mint) if it doesn’t.
Just thought I’d post some cheerful, happy and uplifting music videos to help shake off the January blues.
You wouldn’t believe how hard it is to find cheerful songs on the Net. If you don’t know what they are called then you haven’t much chance of finding them. Must warn you, cheery songs tend to be accompanied by the worst music videos.
The first is Always Look on The Bright Side of Life from the film The Life of Brian by Monty Python which was one of the more comical films released when I was a child.
Here’s a bebop called Don’t Worry be Happy by Bobby McFerrin. It’s one of the easiest songs to remember, sing and whistle. The best thing for me is that it makes me smile every time I hear it. Probably because one of my ex girlfriends gave it to me on a vinyl just before we got together.
This next one is called Sunshine Reggae. It was released in the early 80’s by a Danish group called Laid Back. The B side of the 7″ Vinyl became a hit in the US whereas the rest of the world preferred the A side.
This final one is livelier and more uplifting. It is It’s A Beautiful Day by Tom Boxer ft Jay.
I’ll post a few more cheery songs as and when I remember them.
Just a quick comment to let you know I’ve removed the Amazon ads that once adorned the sidebars. I’ve been considering their removal for the past month or two because they slow down the site’s page load time, but I wavered because some visitors might find them useful, so I left them up. However, Amazon has twice adjusted its user agreement within the last 30 days. One of those adjustments changed its linking policy and the second one combined its Amazon U.K, Amazon FR and Amazon DE user agreements such that one agreement for Amazon EU covers all three programs. That latter change is a little guise of EU harmony and a big ruse to impose ever more stringent rules on its affiliates; rules that heavily favor Amazon over its affiliates. As you can tell, I am not happy with the changes. To be blunt, I feel pretty shafted by Amazon and I am sure many other webmasters will feel the same once they read Amazon’s new user agreement.
Amazon was and still is one of the most complicated affiliate programs to administer due to its use of separate sites for separate customer regions. For an affiliate to receive commission from a product referral, the referred visitor has to purchase directly from the Amazon site to which the affiliate referred him or her. For example:
when an affiliate (publisher) sends someone to Amazon U.K and that someone purchases an item through Amazon U.K then the affiliate earns commission; but when someone is referred to Amazon U.K and moves across to Amazon US and makes a purchase through Amazon US then the affiliate earns nothing.
That means that affiliates who wish to advertise Amazon products for commission have two choices:
- sign up to one Amazon site’s affiliate program (usually Amazon US) and advertise its products solely, or
- sign up to all Amazon programs and use an ad server like OpenX to display separate ads from each different Amazon program according to a visitor’s geographic location.
As you can see, Amazon already favors itself above its affiliates beyond reasonableness – it gets a lot of free publicity from webmasters who hope to make a financial return on their hard work and valuable advertisement space.
The first alteration Amazon made to its affiliate program (less than a month ago) prohibits webmasters from adding their affiliate links to search engines and permits Amazon to withhold commission from affiliates when visitors arrive at Amazon through links listed by search engines.
Reasonable enough, Amazon has to protect itself from those webmasters who use unscrupulous advertisement methods to drive prospects to Amazon via their personal affiliate links. However, there is one slight issue with this rule: search engines find links to websites through websites. It is entirely possible for a scrupulous webmaster to link to an Amazon product on his or her own website and for a search engine to display the link to that Amazon product when it returns its search results. Hence a well-intentioned webmaster who follows Amazon’s rules might be penalized through no mistake of his or her own. This is especially likely when products are advertised via Twitter, which is something Amazon encourages its affiliates to do.
It is my understanding that the second (and most recent) major alteration, the one that unifies its British, German and French user agreements, prohibits the use of Amazon links in forum, blog and other website signatures and prohibits the use of automated shops except those provided by Amazon. To a lot of webmasters (not I), these are standard marketing practices. A lot of webmasters will now have to change their signatures and close their automated shops.
I have one more Amazon link to remove from my network of sites. Thankfully it is served by OpenX so I can replace it with a link provided by a more favorable advertisement program. With the exception of that link, from this time onwards, I will not advertise products for Amazon until they change their affiliate program to one that is less hostile to the webmasters who use it to advertise Amazon.
I will still buy products through Amazon; they have some pretty amazing deals. I just won’t be advertising their products for the foreseeable future.
I do have another reason to be grumpy about Amazon: for the whole time I have advertised their products I have made not one sale. Tens of thousands of ad impressions across several websites have produced zero sales and a couple of click-throughs. I will stick with the likes of Clickbank, Affiliate Future and Affiliate Window from now onwards. At least their products sell and they pay more commission, which helps pay toward my server and domain registration fees.
Here is Amazon’s official overview of its new EU user agreement, enjoy:
Operating Agreement Update, March 1 2010 version compared with February 1 2010 version
The Associates Programme Operating Agreement has been updated effective March 1, 2010. A single Operating Agreement governing the Associates Programme for each of the Amazon sites in the UK, DE and FR, and a combined sign up process for new Associates enables enrolment in one, two, or all three programmes at once. The updated Operating Agreement has been restructured to make the information you want more accessible. In addition, some terms of the agreement have changed. For instance, the updated agreement clarifies or modifies (see sections referenced below for applicable agreement provisions):
- your rights and obligations regarding content, offers, and links posted in connection with the program, as well as your use of our content and trademarks (see Linking Requirements, Programme Participation Requirements, and sections 3,4, 5, and 11 of the Operating Agreement);
- your responsibilities for your site users’ claims (see section 5 of the Operating Agreement);
- your obligations to provide us with certain information and for keeping information you give us accurate and up to date (see section 2 of the Operating Agreement);
- your obligations as to communications regarding your relationship with us (see section 10 of the Operating Agreement);
- prohibitions on (i) your placement of Special Links in posts to the Amazon site (e.g., in reviews or on forums), (ii) your use of sub-tags to identify specific users, (iii) making inaccurate, deceptive, or misleading claims about any product, the Amazon site, or any Amazon policy, promotion, or price, (iv) your ability to collect account information used by our customers in connection with any Amazon site, and (v) your use of any malicious or harmful code or any automatically-installing software application on your site (see Linking Requirements, Programme Participation Requirements, and section 10 of the Operating Agreement);
- your representations and warranties (see Linking Requirements, Programme Participation Requirements, and the preamble and sections 5, 12, and 13 of the Operating Agreement);
- your indemnity and other obligations to us (see section 5 of the Operating Agreement)
- our rights to monitor your compliance with the agreement (see section 4 of the Operating Agreement);
- rights to withhold payment of advertising fees in certain circumstances (see section 4 of the Operating Agreement);
- our right to charge administration fees on and/or close dormant accounts (see section 8 of the Operating Agreement);
- the minimum advertising fees that you must earn before payment by direct deposit, cheque, or gift certificate can be issued (see section 8 of the Operating Agreement);
- the amount of prior notice of any modifications to or termination of the agreement (see sections 14 and 15 of the Operating Agreement);
- our rights in what you submit to us (see section 12 of the Operating Agreement); and
- limitations of obligations and liability (see sections 17 and 18 of the Operating Agreement).
This is only a general summary of some of the changes and does not affect the interpretation of the updated Operating Agreement. Your continued participation in the program constitutes your acceptance of the updated agreement. Therefore, please carefully read the updated Operating Agreement.
The full new agreement is available here.

