Don’t fret the next time you have to check a large list of links to discover which are dead and which are live. Dead Link Checker can check your list for you in next to no time.
Most regular visitors to JournalXtra are aware of the Webmaster Marketing Tools that I compiled to make it easier to submit websites, blogs, ebooks etc. to relevant directory, ping service and syndication sites, as well as to submit press releases. Websites come and go over time, and so links to them expire. Every once in a while I recheck those links, hundreds of them, for activity. If I checked those links manually by clicking them one by one, it would take me a week or two to keep those tools up to date.
One method used to check for dead links is to use software that reads a webpage or website and attempts to load each link. Such software can be set to follow the links on the tested page, check the links on the pages they lead to, and go to further depth if required. For example, these programs will check the links on page one, then the links on the pages that page one links to, then the links on the pages that those pages link to, and so on.
A few of these link check programs are:
- KLinkStatus (Linux)
- Linkchecker
- Xenu’s Link Sleuth (Windows)
A more exhaustive list of URL checker tools is at the bottom of this page along with the download link for Dead Link Checker.
This type of link check program has two huge limitations:
- they check all links on a specified page; one cannot specify which links are checked and which are not;
- the links they check must be placed within a webpage that is stored and loaded by a web server.
It is not difficult to see how those limitations can become hindrances to anyone wishing to check the activity of a specific set of links within a webpage or website with many links that need not be checked.
Thankfully, there is a Perl script called Dead Link Checker which will run through a list of links in a file that is stored on a local machine, test them for activity and write each result to an HTML formatted file.
Dead Link Checker places each link into an HTML file (which it automatically creates), categorized by server response code. Each tested link that returns a 3xx or 4xx server code (for example 301 or 403) is written into the HTML file alongside any page to which it redirects. This makes it easy to decide whether a link is dead or just redirected to an updated version of the page.
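As a rough guide to reading those response codes, here is a minimal sketch of a shell function that maps each class of code to a next step. The function name triage and its messages are hypothetical, made up for illustration; they are not part of Dead Link Checker.

```shell
# Hypothetical helper (not part of Dead Link Checker):
# map an HTTP response code to a suggested next step.
triage() {
    case "$1" in
        2[0-9][0-9]) echo "live: keep" ;;
        3[0-9][0-9]) echo "redirect: check where it leads" ;;
        4[0-9][0-9]) echo "page not found or refused: check manually" ;;
        5[0-9][0-9]) echo "server error: check manually" ;;
        *)           echo "unrecognised code: check manually" ;;
    esac
}

# Show the suggestion for a few common codes.
for code in 200 301 403 500; do
    printf '%s  %s\n' "$code" "$(triage "$code")"
done
```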
Many page redirects are only between the http:// and http://www. versions of a URL, depending on which form the site’s webmaster prefers.
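A quick way to spot that kind of harmless redirect is to compare the two URLs with the scheme and any leading www. stripped off. The sketch below does that in plain shell. The function names bare_host and same_site are made up for illustration, and it assumes simple URLs without user names or passwords in them.

```shell
# Hypothetical helper: reduce a URL to its bare host name by dropping
# the scheme, any path, port or query, and a leading "www.".
bare_host() {
    printf '%s\n' "$1" |
        tr '[:upper:]' '[:lower:]' |
        sed -e 's|^[a-z][a-z]*://||' -e 's|[/:?#].*$||' -e 's|^www\.||'
}

# Succeeds (exit status 0) when both URLs share the same bare host.
same_site() {
    [ "$(bare_host "$1")" = "$(bare_host "$2")" ]
}

if same_site "http://journalxtra.com" "http://www.journalxtra.com/"; then
    echo "same site: the redirect only changes the URL form"
fi
```

A redirect that fails this test is not necessarily dead, but it is one to look at by hand.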
Here’s an example of how to use it. All work is performed from within the same folder:
- create a text file (links.txt)
- paste the collection of URLs into the file (links.txt)
- format the links so that each one only describes the URL from http:// to .com or .net etc., i.e. remove any superfluous text. Use the Text Manipulation FAQs if you need help to quickly edit large text files.
- open a terminal (Konsole or console) in the folder that holds the text file (links.txt)
- type the command
    deadlinkcheck -HTMLoutput -noCache -Verb links.txt > checked.html
- that command checks the links in the file links.txt and writes the results to an HTML file called checked.html
- load the created HTML file (checked.html) in a web browser
- also load the HTML file (checked.html) in a text editor (I use Kate)
- read the links and manually remove the bad ones from the HTML file loaded in the text editor
  - those with a 2xx result are live so should be kept
  - those with a 3xx result are redirecting somewhere else, so check that the redirect is not to a totally different site; if it is, the link is probably dead, so check it manually to confirm its status
  - those with a 4xx result lead to pages that were not found. Check them manually to confirm whether they are active or dead. Dead Link Checker relies on the server response, and not all servers give a correct response
  - those with a 5xx result will likely never load. Check them manually
- use the text editor to reformat the links in the HTML file (checked.html) once the bad links have been removed from it (these instructions assume the use of Kate)
  - convert all characters to lower case (Tools>Lowercase)
  - remove superfluous characters with find and replace, then with sed:
    - open find and replace (Ctrl+R)
    - find
    - replace it with “|” (or some other character that is not reproduced anywhere within the file)
    - use Alt+A to replace all occurrences
    - find “|-> ” (including the trailing space) and replace it with nothing (i.e. just remove it)
    - find “<b>” and replace it with nothing (i.e. just remove it)
    - find “</b>” and replace it with nothing (i.e. just remove it)
    - find “</a><br>” and replace it with “</a>”
    - manually remove anything that isn’t between a tags (<a> and </a>)
    - use a terminal (Konsole or console), opened from within the folder that holds the HTML file (checked.html), and enter this command
        sed 's/.*|//g' checked.html > links.txt
    - that command removes anything in a line that is written in front of a pipe “|” (inclusive of the pipe) and places all the links into the file links.txt
  - the file links.txt will now contain the checked, active URLs from the original list, formatted within <a> tags with anchor text equal to the link, e.g.
      <a href="http://journalxtra.com">http://journalxtra.com</a>
  - if required, alter the anchor text to remove the http:// and .com (or .net etc.) components. It will look similar to this
      <a href="http://journalxtra.com">journalxtra</a>
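To see what the sed command in the steps above does, the sketch below runs the same expression on a few made-up lines. The sample lines and the function name strip_to_last_pipe are assumptions for illustration only; real checked.html content may look different. Because .* matches as much as it can, everything up to and including the last pipe on a line is removed, and a line with no pipe passes through unchanged, so anything left over without a pipe in front of it ends up in links.txt as well.

```shell
# Hypothetical wrapper around the sed expression used in the steps above.
strip_to_last_pipe() {
    sed 's/.*|//g'
}

# Made-up sample lines: one pipe, two pipes, and no pipe at all.
printf '%s\n' \
    '200 |<a href="http://journalxtra.com">http://journalxtra.com</a>' \
    'one | two |<a href="http://example.com">http://example.com</a>' \
    'a line without a pipe is left as it is' |
    strip_to_last_pipe
```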
Intelligent use of Dead Link Checker can transform a task that would usually take a week or two to complete into one that takes less than 20 minutes.
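The formatting step for links.txt (keeping only the part of each URL from http:// to .com or .net) can also be scripted. This is a minimal sketch that assumes a grep with the -o and -h options, as found on most Linux systems; the function name extract_urls and the sample text are hypothetical.

```shell
# Hypothetical helper: print each distinct scheme-and-host URL found in
# the given files (or standard input), one per line, in order of appearance.
extract_urls() {
    grep -ohiE 'https?://[a-z0-9.-]+' "$@" |   # keep only http(s)://host
        sed 's/[.-]*$//' |                      # drop trailing dots or hyphens
        awk '!seen[$0]++'                       # remove repeats, keep order
}

# Demo on made-up text read from standard input;
# prints the two distinct site URLs, one per line.
extract_urls <<'EOF'
Visit http://journalxtra.com/tools today.
<a href="http://www.example.net/a?b=1">Example</a>
Also listed at http://journalxtra.com.
EOF
```

To use it on a real file, something like extract_urls raw.txt > links.txt should work, where raw.txt is a hypothetical file holding the pasted text.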
I do not know whether DLC works in Windows. Use a Linux Live Disk (Kubuntu or Linux Mint) if it doesn’t.
Download Link
Download Dead Link Checker:
- Sourceforge
- Its user guide is available here (also at Sourceforge)
List of Link Checker Software
These URL checker tools only check links within webpages that are served online.
Run from a local PC
- Checklinks
- Dead link check
- gURLChecker
- KLinkStatus
- Linkchecker
- link-checker
- linklint
- webcheck
- webgrep
- Xenu’s Link Sleuth
HTML interface
Copyright secured by Digiprove © 2010