The Sitemap is the Holy Grail of a website. It’s the sheet (or sheets) of xml that new webmasters don’t know to use and some experienced webmasters neglect to create. Consider that every website has a front, a back, a mouthpiece, a gang of security guards and a guide. Visitors see the front, the webmaster uses the backend to create the front, the RSS feed tells the world what’s happening at the website, robots.txt and other little bits help protect it, and the sitemap guides search engine spiders around it.
Usually, if you use a content management system (CMS) you will be blessed with automatic sitemap generation either through an inbuilt process or a plugin. In which case, you only need to locate it, submit it to search engines, link to it from your index page or the footer of every page, and regularly ping it to tell search engines about updates to it. You will usually find your sitemap sitting comfortably close to your robots.txt at the root of your domain e.g. your-domain.com/sitemap.xml
If you are not blessed with automatic sitemap generation and submission then you will need to create your own sitemap. Of course, that is what this article is all about and below are the instructions you should follow to do that.
Most often, a sitemap needs to be manually created when a website is hand crafted in (x)html or when a sitemap is to be remotely hosted (i.e. the sitemap is placed on a different domain or server to the website it maps), as is frequently the case when a sponsor provides a co-brand or white label site but not enough space or facility to host a sitemap. You can learn how to split a domain across multiple hosts in this EasyGuide.
There are programs and scripts that can be used to generate sitemaps. These can be split into two categories: those that work and those that don’t work. Pedants might point out that a third category exists which includes those that only work when they feel like it or after a lot of flirtatious smooth-talking, as is often the case.
Those sitemap generators that do work can be subdivided into two subcategories:
- those that run from a desktop PC, and
- those that run from a web server.
And they may be subdivided into paid and free. Guess which we’re going to work with.
Most of the free sitemap tools that work from a desktop PC are the same ones used to check for dead links. You should read How to Check for Deadlinks to learn more about them because I am not going to discuss them here. More often than not the “sitemaps” created by those programs need to be manually edited into an xml sitemap format, for example, the URLs
http://journalxtra.com/downloads/
http://journalxtra.com/tools/
Would become:
    <?xml version="1.0" encoding="UTF-8"?>
    <urlset
      xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
        http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
      <!-- The site URLs go below here -->
      <url>
        <loc>http://journalxtra.com/downloads/</loc>
        <changefreq>weekly</changefreq>
      </url>
      <url>
        <loc>http://journalxtra.com/tools/</loc>
        <changefreq>weekly</changefreq>
      </url>
    </urlset>
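If you would rather not edit the list by hand, the conversion above can be sketched as a small shell function. This is only a rough sketch and not the Scriptilitious utility: the function name `urls_to_sitemap` is made up for illustration, it assumes one raw (unescaped) URL per line on standard input, and it writes the shorter form of the `urlset` header without the schema attributes.

```shell
#!/bin/sh
# Hypothetical helper (not part of any tool named in this article):
# read one URL per line on stdin and write an XML sitemap to stdout.
# Optional first argument: the change frequency, "weekly" by default.
#
# Example (not run here):
#   urls_to_sitemap weekly < urls.txt > sitemap.xml
urls_to_sitemap() {
    changefreq="${1:-weekly}"
    printf '%s\n' '<?xml version="1.0" encoding="UTF-8"?>'
    printf '%s\n' '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    # Skip blank lines, drop duplicate URLs, and escape "&" for XML.
    # Assumption: the input URLs are not already XML-escaped.
    grep -v '^[[:space:]]*$' | sort -u | sed 's/&/\&amp;/g' |
    while IFS= read -r url; do
        printf '  <url>\n    <loc>%s</loc>\n    <changefreq>%s</changefreq>\n  </url>\n' \
            "$url" "$changefreq"
    done
    printf '%s\n' '</urlset>'
}
```

Because the list is passed through `sort -u`, a URL that appears twice in the input is written only once.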
The Scriptilitious ScriptBox is free and comes with a sitemap-maker utility that makes it easy to convert those URLs into a sitemap. You can get it by clicking the download button at the end of this article. I advise you to use klinkstatus (or similar) to index a website from your desktop then to use Scriptilitious to convert the indexed URLs into an xml sitemap. There is a rumor that klinkstatus will soon have a specific template for xml sitemap creation, which is good news for webmasters who use Linux (like me). Unfortunately, neither of these programs is yet capable of automatically uploading a sitemap to a server.
So let’s take a look at the working, free online sitemap generators:
There are many scripts that can be uploaded to a web server and configured to automatically rebuild a sitemap and submit it to various search engines. Unfortunately, they are incredibly awkward to set up and configure; plus, for security reasons, many of them will only map a website that is on the domain where the script is being used. That restriction prohibits them from being used to create sitemaps for remote sites.
A better option is to use free online sitemap generators. They work, they are not limited to one website, they don’t care whether you own the site being mapped and they can be used frequently. There is one catch: most limit their free maps to either 500, 1000 or 5000 URLs and only map URLs that can be reached from the root (index) page of a website. The ones I use are no exception:
- xml-sitemaps.com will generate a well formatted xml sitemap of up to 500 URLs,
- sitemaps-builder.com generates a map with up to 1,000 URLs, and
- PC Time Limit builds sitemaps of up to 5,000 links.
Those three sitemap generators are more than enough for most sites but what if you have a co-brand, white label or hand-crafted website that updates daily and has hundreds of thousands of pages that must be indexed? How might all those lovely URLs be indexed?
Think about this:
A list of the most recent URLs is created when you generate a sitemap. When a new web page is created a new URL is created which must be added to that map. If you start out with 1,000 URLs and add 10 new URLs every day then over 20 days another 200 URLs must be mapped. If a sitemap generator maps only the first 1000 URLs it encounters from a website’s index page and there are 1200 URLs to index then 200 URLs will be missed out of the map. An incomplete map is bad news. An incomplete map could result in a site being poorly indexed by search engines.
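The arithmetic is simple enough to check in a terminal. The function below is a throwaway illustration (the name `missed_urls` and its argument order are my own invention): it takes the starting number of URLs, the number added per day, the number of days, and the generator's limit, then prints how many URLs fall outside the map.

```shell
#!/bin/sh
# Hypothetical helper: missed_urls START PER_DAY DAYS LIMIT
# Prints how many URLs a generator capped at LIMIT would leave unmapped.
missed_urls() {
    total=$(( $1 + $2 * $3 ))
    if [ "$total" -gt "$4" ]; then
        echo $(( total - $4 ))
    else
        echo 0
    fi
}

# The example from the text: 1,000 URLs, 10 new per day, 20 days, 1,000 URL limit.
missed_urls 1000 10 20 1000   # prints 200
```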
Is there a way to coax the online generators to create a bigger sitemap?
Fortunately, sitemap generators do not check the size of a current sitemap and cannot determine whether a sitemap is made up from the contents of multiple sitemaps that have been generated by free sitemap generators. This failing can be turned to our advantage: we can use the same free tools to create daily or weekly sitemaps then combine their results to build one super sitemap. We can then force the generator to map different parts of a website by putting links to those parts on the website’s index page. For best results, one of those links should point to an artificial linklist that points to the sections of the site that need to be mapped; but we must be careful not to duplicate data lines!
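An artificial linklist is nothing more than a plain page of links, so it can be built from a text file of deep URLs. Here is a minimal sketch, assuming one raw URL per line on standard input; the function name `make_linklist` and the output file name in the example are placeholders of mine, not something the generators require.

```shell
#!/bin/sh
# Hypothetical helper: turn a list of URLs (one per line on stdin) into a
# bare-bones HTML page of links that a sitemap generator can crawl.
# Optional first argument: the page title.
#
# Example (not run here):
#   make_linklist "Deep links" < deep-urls.txt > linklist.html
make_linklist() {
    title="${1:-Link list}"
    printf '<!DOCTYPE html>\n<html>\n<head><title>%s</title></head>\n<body>\n<ul>\n' "$title"
    # sort -u removes duplicate lines so no URL is listed twice.
    # Assumption: the input URLs are not already HTML-escaped.
    grep -v '^[[:space:]]*$' | sort -u | sed 's/&/\&amp;/g' |
    while IFS= read -r url; do
        printf '  <li><a href="%s">%s</a></li>\n' "$url" "$url"
    done
    printf '</ul>\n</body>\n</html>\n'
}
```

Upload the resulting page and link to it from the index page so the generator can reach it from the root.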
The Method
The method is easy for those who use Linux. I do not know whether Windows comes with “sed” but Windows users can use VirtualBox, a Linux LiveDisk, or they can install Cygwin (or CygwinX). These instructions assume you have already placed strategic links on your site’s front (index) page that point to the deeper parts of your website or a linklist that contains deeplinks to those parts you wish to have mapped. Strategic links should be as close to the top of the index page as possible (machines read webpages top-to-bottom, left-to-right). You can make your life easier by using the automated sitemap-ripper utility that comes with Scriptilitious. Again, Scriptilitious can be downloaded at the bottom of this article. So, here’s how we create a sitemap using online generators and (or not) the free sitemap-ripper utility:
- Use one of the sitemap generation tools listed above. Sometimes the generators can be mistaken for DoS attacks and hack attempts so they can be blocked by server security software. My general route is to try 5,000 then 1000 then 500 URLs. The latter one is rarely ever blocked;
- Upload the sitemap to your server. It should usually be placed in the root directory e.g. your-domain.com/sitemap.xml;
- Register the sitemap with the big search engines (Google, Bing and Yahoo);
- Place a link to the sitemap in the footer of your site’s index page (I suggest the footer because, most often, the same footer is repeated on every page). This ensures that Yahoo! and other search engines can easily find the sitemap;
- If possible, place a link to your sitemap in robots.txt by adding this line to it:
    Sitemap: http://www.example.com/sitemap.xml
- Use My Page Rank to ping the major search engines with the details of your sitemap;
- To update the sitemap, use one of the sitemap generation tools but instead of overwriting the old sitemap with the newly created one, combine their contents. You can do this with sitemap-ripper or with this little bit of code:
- Place the content of both sitemaps (old and new) into one file called sitemap.xml.
- Open a terminal (Bash/Konsole/Console) and type or copy and paste this script into it:
    # strip leading whitespace, then force every URL into the http://www. format
    sed -i 's#^[ \t]*##g' sitemap.xml
    sed -i 's#http://www\.#http://#g' sitemap.xml
    sed -i 's#http://#http://www.#g' sitemap.xml
    # drop the <url> wrappers, keep only the <loc> lines, sort and de-duplicate them
    sed -i 's#<url>##g' sitemap.xml
    sed -i 's#</url>##g' sitemap.xml
    grep "<loc>" sitemap.xml > extracted.xml
    sort -u extracted.xml > sorted.txt
    rm sitemap.xml
    rm extracted.xml
    mv sorted.txt sitemap.xml
    # rebuild each <url> entry with a change frequency and a priority
    sed -i 's#<url><url>#<url>#g' sitemap.xml
    sed -i 's#<loc>#<url>\n <loc>#g' sitemap.xml
    sed -i 's#</loc>#</loc>\n <changefreq>daily</changefreq>\n <priority>0.5</priority>\n</url>#g' sitemap.xml
    # put the xml header at the top and close the urlset at the bottom
    sed -i.bak '1i <?xml version="1.0" encoding="UTF-8"?>\n<urlset\n xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"\n xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"\n xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9\n http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">\n <!-- The site URLs go below here -->\n <!-- formatted with a script from http://journalxtra.com/downloads -->\n' sitemap.xml
    echo '</urlset>' >> sitemap.xml
    rm sitemap.xml.bak
  That code removes superfluous whitespace at the beginning of all lines, changes all URLs to the http://www format, sorts the data, removes duplicate content, extracts all the mapped URLs, sets their priority to “0.5” and specifies their frequency of change as “daily”. The final file it creates is the all important sitemap.xml. The downloadable script is more interactive and allows the URL format, the change frequency and the page priority to be specified as it runs. It also automatically combines the original sitemaps before it rips them apart, extracts the URLs, cleans them up, removes duplicates and reformats them into our Holy Grail.
- Repeat step 7 then 6 every time a new sitemap is generated.
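The robots.txt step can also be scripted so that the Sitemap line is added once and never duplicated when the steps are repeated. The following is a sketch under my own assumptions: the function name `add_sitemap_to_robots` is hypothetical, and it expects a local copy of robots.txt that you upload afterwards.

```shell
#!/bin/sh
# Hypothetical helper: add_sitemap_to_robots SITEMAP_URL [ROBOTS_FILE]
# Appends "Sitemap: SITEMAP_URL" to the robots file unless that exact
# line is already there. ROBOTS_FILE defaults to ./robots.txt.
#
# Example (not run here):
#   add_sitemap_to_robots http://www.example.com/sitemap.xml robots.txt
add_sitemap_to_robots() {
    sitemap_url="$1"
    robots_file="${2:-robots.txt}"
    line="Sitemap: $sitemap_url"
    touch "$robots_file"
    # -x matches whole lines only, -F treats the line as plain text.
    if ! grep -qxF "$line" "$robots_file"; then
        # Make sure the file ends with a newline before appending.
        if [ -s "$robots_file" ] && [ -n "$(tail -c 1 "$robots_file")" ]; then
            printf '\n' >> "$robots_file"
        fi
        printf '%s\n' "$line" >> "$robots_file"
    fi
}
```

Running it twice with the same URL leaves the file unchanged the second time.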
Ensure that your URLs use only one of the http:// or http://www formats. If your URLs are mixed then the pages could be indexed twice or thrice, which could be rewarded with a search engine penalty and lower page rank due to different backlinks pointing to different pages (http:// is different to http://www.).
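A quick way to bring a mixed list into line is to run it through sed before building the sitemap. This sketch uses the same strip-then-add trick as the script above but on a plain list with one URL per line; the function name `normalize_www` is made up, and it only handles http:// URLs on the bare domain.

```shell
#!/bin/sh
# Hypothetical helper: read URLs (one per line) on stdin and rewrite
# every http:// URL to the http://www. form, then drop duplicates.
# Limitation: a URL on another subdomain (e.g. http://blog.example.com/)
# would wrongly gain a "www." prefix, so filter those out first.
#
# Example (not run here):
#   normalize_www < urls.txt > urls-www.txt
normalize_www() {
    sed -e 's#^http://www\.#http://#' -e 's#^http://#http://www.#' | sort -u
}
```

To settle on the http:// format instead, keep only the first of the two sed expressions.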
If you cannot be bothered to do all the above, for as little as $19.99 you can get a sitemap generator that will create as many sitemaps as you need as big as you need them. It’s available from xml-sitemaps.com, provides automatic sitemap updates and will save you a lot of time.
Scriptilitious
This free interactive utility box comes with two sitemap creation scripts that automate step 7 to produce a sitemap with the name sitemap.xml. Scriptilitious is known to work with Linux. It might work natively with Windows but will most likely require Cygwin or some other Linux terminal emulator.
Instructions
- Unzip the downloaded file,
- Place the two sitemaps that need to be combined into the WorkBox folder (give them both a different name),
- Open a terminal in the Scriptilitious folder and type ./scriptilitious.sh
- Further usage instructions are provided as the script runs.
No, these are nothing like the buttons used to fasten shirts and unfasten women’s blouses; these are the type used to create wonderfully stylish clickable areas on web pages. In total, there are 21 free Gimp button brushes waiting to be downloaded. They are easy enough to use but I have provided full instructions and tips below for those who need them. Those instructions along with their usage and redistribution notes are also in the Readme file packaged with the download.
The buttons are in greyscale so can be colored any way you like; they are without text to make it easier for you to customize them; and they are quicker to use than the scripts that originally created them. The ones shown in the image are only exemplary of the goodies given away in the download.
INSTALLATION
Unzip the downloaded file and place the gbr files contained in the GIMP-Brushes-Web-Buttons folder into your GIMP Brushes folder. Linux users will find this in their home directory at ~/.gimp-2.6/brushes. Windows users, why don’t you migrate to Linux? Just kidding; look for GIMP under Program Files then locate your Brushes folder under that.
USAGE INSTRUCTIONS
- create a blank canvas (File>New). Use a white background,
- select a brush color,
- select the paint brush icon from the Toolbox,
- select one of the buttons (brushes) from the Layers and Channels box,
- paint the brush onto the canvas (press the left mouse button multiple times if you want a deeper colour),
- follow this step if your chosen button has curved edges, otherwise skip to step 7:
- select the fuzzy select tool (it’s in the Toolbox, 4th one from the left along the top row, looks like a magic wand). Adjust the threshold for precision (between 90 and 100 is usually best),
- left click on the canvas (not the button),
- make the selected areas transparent (right click and select Colours>Colour to Alpha and follow the presented instructions),
- autocrop the image (Image>Autocrop Image),
- save the image (for safety),
- if required, add text with the Text Tool (it’s in the Toolbox),
- save the image again but with a different name. This could be your inactive button.
- select the button layer,
- darken or lighten the image (Colours>Brightness-Contrast etc),
- save it again with a different name to the last one saved. This could be your active button.
- darken or lighten the image (Colours>Brightness-Contrast etc),
- save it again with a different name to the last one saved. This could be your hover button.
TIPS
- Two of the brushes are shadows. One is for use in combination with the Aqua Pill button, the other is for the Aqua Bou button. They are handy for creating shadows of different colour to the main button. The Aqua buttons already come in non-shadowed and shadowed form;
- Some of the buttons may be filled with gradients (Colours>Map>Gradient Map) etc;
- It’s best to use dark colors or a combination of a light colour with a secondary dark color (with lowered opacity) superimposed over it i.e. use the same brush twice but with different colors;
- If a button’s background needs to be changed from transparent to a color just right click the image layer, create a new layer, fill it with your chosen color then move the layer’s position in the layer stack to place it under the button’s layer.
I wish you the best of fun with them.
Welcome to the second JournalXtra newsletter of the year where you will learn what’s new, what’s hot and what’s not at JournalXtra. February was an incredibly busy month for me. I invested a lot of time adding and updating many articles. Here’s a brief recap of what those articles were.
New Articles
How to Split a Website Across Multiple Servers
Have you ever in your wildest dreams thought, I wish I could use my domain name on more than one web server so that I can use it with multiple white label sites. No? Yes? Probably? Sometimes white label and co-brand sites just do not provide as much server space, site customization or products needed to build a successful website. Thankfully, what one sponsor doesn’t provide another might and, if not, your own server definitely will. This short EasyGuide explains exactly how to configure a registrar to re-route traffic for specific sub domains of a website so you can have site1.mydomain.com and site2.mydomain.com on different servers.
10,000 Backlinks in 1 – The Best URL Submission You Will Ever Make
During February I discovered a new plug in for Wordpress. Called the BungeeBones Remotely Hosted Directory, it aims to provide a remotely hosted and managed directory for Wordpress blogs. It works similarly to a webring but with a difference – it displays as a directory and not as a sidebar advertisement billboard for other people’s blogs. It has a lot of potential and will, eventually, provide thousands of backlinks to its members as more and more webmasters take to using it. I highly recommend the use of this plugin. You can see JournalXtra’s standalone version of the directory here.
Rant About the Amazon Affiliate Program
Yes, I finally killed off the Amazon ads that littered the sidebar. They were ugly, didn’t generate any income, slowed down the site’s load speed to an unbearable level and Amazon’s we-only-pay-you-if-people-click-the-right-ad-for-their-location ad system is much too fiddly to implement and check (my stats didn’t match their stats). Plus, Amazon has made changes to their contract that are too overbearing for a small site like JournalXtra to cope with. If you wish to read more about my decision to remove Amazon’s ads then please follow the link.
Happy Songs to Keep us Smiling
A few of my friends felt a bit low at the beginning of the month so I thought I’d add some sunshine to their lives by pointing them to a few uplifting happy songs. You wouldn’t believe how hard it is to find happy songs on the Net. It’s ridiculous. Eventually, after a few hours of searching, I came up with, and you might want to sit down before you read this, are you ready? Sure??? I found four. Yes, four (4) happy songs. Amazing. I know. I couldn’t believe it myself. I was so caught up in the excitement I nearly forgot about you guys. Luckily I didn’t and I posted them here for you to enjoy too.
Everybody who uses a computer knows what an ebook is so I won’t go into detail about them but suffice it to say I have added 85 ebook directories to the Webmaster’s Toolbox. Ebooks are a fantastic way to advertise and get targeted traffic to websites, landing pages and sponsor sites. Best of all, good ebooks go viral. So, what are you waiting for? Get writing, save it as a pdf and submit it to an ebook directory.
How to Check for Dead Links within a List
This is a must read article for every URL collecting surfer or webmaster with a list of links in a file that need to be checked for life signs. The process is simple – download the right script, properly format your links, run the script. Easy. Even easier when the instructions for using the script and formatting the links are provided in an EasyGuide. Need to check the links on a website? I’ve answered that one too.
Find Hot Current Search Terms and Keywords
A couple of years ago I stumbled upon a website that listed real search terms in realtime. I thought I’d revisit the guide and update it. I was shocked, the website listed in the original article had changed. It no longer provides realtime search terms, it is now an advert for a web design company. It appears the site’s designers played a not too uncommon trick on us – it had used the domain name to provide a service that would gain lots and lots of backlinks while it matured. Once the backlinks were in place, they converted the site to its real function of advertising a web design company. One has to respect the cunning of its architects; nevertheless, I was a little pissed at it but it did give me the impetus to re-write the whole article instead of just updating it. The new article lists several sites that provide near realtime and historical search words and phrases entered into search engines by surfers. I hope you find it useful.
JournalXtra now proudly boasts its brand new, all shiny and sparkling, download section. No, don’t go looking for movies and music ’cause ya ain’t gonna find any in there unless I’ve created them. This new part of the site is home to all the free downloads linked within other parts of the site along with other goodies I decide to give away. It is still under development and already has a few GIMP plug-ins, scripts and brushes ready for download. The download area is part of a new strategy I’m using to attract traffic so expect to see it fill up with all sorts of free stuff over the next few weeks and months.
Updated Articles
The Suffusion Review has been updated to show off the theme’s new admin panel and its ever increasing list of widgets and config options.
The GIMP guide has had another script repository added to it and the scripts download zip file has been replaced with a larger one (it’s better to grab this from the download section than the guide). The guide will be re-written over the course of March so expect a surprise if you view it in April.
The Webmaster’s Toolbox has had a lot of updates. To name a few of them, there are more links website and product review sites, more ping services and an ebooks directory and syndication site list. All the links have been re-checked for activity. I am aware that some of the links still point to sites other than their intended targets, I apologize for that and promise the offending URLs will be weeded out over time; you are welcome to help me there by pointing them out (use the contact form listed in the menubar).
March will be just as busy as February. I have several articles half started and several more lined up to be written. I’ve hinted at what some of them are above but I can’t say too much because other webmasters might steal my ideas. I hope you had a good February and I wish you a brilliant March.
To my readers and friends who are in Chile, I sincerely hope you have not been too badly affected by your recent Earthquake and Tsunami. Those of you who have been affected by it, I cannot imagine the devastation it has caused and will not pretend to understand what you might currently be suffering. All I can do is pray that you get all the help you, your family and friends need to get through it and get past it. I am here if you need to talk.


















