A site in need of a sitemap

The Sitemap is the Holy Grail of a website. It’s the sheet (or sheets) of XML that new webmasters don’t know to use and some experienced webmasters neglect to create. Consider that every website has a front, a back, a mouthpiece, a gang of security guards and a guide. Visitors see the front, the webmaster uses the backend to create the front, the RSS feed tells the world what’s happening at the website, robots.txt and other little bits help protect it, and the sitemap guides search engine spiders around it.

Usually, if you use a content management system (CMS) you will be blessed with automatic sitemap generation either through an inbuilt process or a plugin. In which case, you only need to locate it, submit it to search engines, link to it from your index page or the footer of every page, and regularly ping it to tell search engines about updates to it. You will usually find your sitemap sitting comfortably close to your robots.txt at the root of your domain e.g. your-domain.com/sitemap.xml

If you are not blessed with automatic sitemap generation and submission then you will need to create your own sitemap. Of course, that is what this article is all about, and below are the instructions you should follow to do that.

Most often, a sitemap needs to be manually created when a website is hand crafted in (x)html or when a sitemap is to be remotely hosted (i.e. the sitemap is placed on a different domain or server to the website it maps, as is frequently the case when a sponsor provides a co-brand or white label site but not enough space or facility to host a sitemap). You can learn how to split a domain across multiple hosts in this EasyGuide.

There are programs and scripts that can be used to generate sitemaps. These can be split into two categories: those that work and those that don’t work. Pedants might point out that a third category exists which includes those that only work when they feel like it or after a lot of flirtatious smooth-talking, as is often the case.

Those sitemap generators that do work can be subdivided into two subcategories:

  • those that run from a desktop PC, and
  • those that run from a web server.

And they may be subdivided into paid and free. Guess which we’re going to work with :-)

Most of the free sitemap tools that work from a desktop PC are the same ones used to check for dead links. You should read How to Check for Deadlinks to learn more about them because I am not going to discuss them here. More often than not, the “sitemaps” created by those programs need to be manually edited into an XML sitemap format. For example, the URLs

http://journalxtra.com/downloads/

http://journalxtra.com/tools/

Would become:

<?xml version="1.0" encoding="UTF-8"?>
<urlset
 xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
<!-- The site URLs go below here -->

<url>
 <loc>http://journalxtra.com/downloads/</loc>
 <changefreq>weekly</changefreq>
</url>
<url>
 <loc>http://journalxtra.com/tools/</loc>
 <changefreq>weekly</changefreq>
</url>

</urlset>
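The conversion shown above is mechanical, so it can be scripted. Here is a minimal sketch, assuming the URLs arrive one per line on standard input. The function name urls_to_sitemap and the default frequency of "weekly" are my own choices for illustration and are not part of any tool mentioned in this article.

```shell
# urls_to_sitemap [CHANGEFREQ]
# Reads one URL per line from stdin and writes an XML sitemap to stdout.
# The function name and the "weekly" default are illustrative assumptions.
urls_to_sitemap() (
    freq="${1:-weekly}"
    printf '%s\n' '<?xml version="1.0" encoding="UTF-8"?>'
    printf '%s\n' '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    # Skip blank lines, drop duplicate URLs, and escape "&" for XML.
    grep -v '^[[:space:]]*$' | sort -u | sed 's/&/\&amp;/g' |
    while IFS= read -r url; do
        printf '<url>\n <loc>%s</loc>\n <changefreq>%s</changefreq>\n</url>\n' "$url" "$freq"
    done
    printf '%s\n' '</urlset>'
)
```

It could be used as urls_to_sitemap weekly < urls.txt > sitemap.xml, where urls.txt is a hypothetical file holding the list exported by a link checker.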

The Scriptilitious ScriptBox is free and comes with a sitemap-maker utility that makes it easy to convert those URLs into a sitemap. You can get it by clicking the download button at the end of this article. I advise you to use klinkstatus (or similar) to index a website from your desktop then to use Scriptilitious to convert the indexed URLs into an XML sitemap. There is a rumor that klinkstatus will soon have a specific template for XML sitemap creation, which is good news for webmasters who use Linux (like me). Unfortunately, neither of these programs is yet capable of automatically uploading a sitemap to a server.

So let’s take a look at the working, free online sitemap generators:

There are many scripts that can be uploaded to a web server and configured to automatically rebuild a sitemap and submit it to various search engines. Unfortunately, they are incredibly awkward to set up and configure; plus, for security reasons, many of them will only map a website that is on the domain where the script is being used. That restriction prohibits them from being used to create sitemaps for remote sites.

A better option is to use free online sitemap generators. They work, they are not limited to one website, they don’t care whether you own the site being mapped and they can be used frequently. There is one catch: most limit their free maps to either 500, 1000 or 5000 URLs and only map URLs that can be reached from the root (index) page of a website. The ones I use are no exception:

Those three sitemap generators are more than enough for most sites but what if you have a co-brand, white label or hand-crafted website that updates daily and has hundreds of thousands of pages that must be indexed? How might all those lovely URLs be indexed?

Think about this:

A list of the most recent URLs is created when you generate a sitemap. When a new web page is created a new URL is created which must be added to that map. If you start out with 1,000 URLs and add 10 new URLs every day then over 20 days another 200 URLs must be mapped. If a sitemap generator maps only the first 1000 URLs it encounters from a website’s index page and there are 1200 URLs to index then 200 URLs will be missed out of the map. An incomplete map is bad news. An incomplete map could result in a site being poorly indexed by search engines.
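The arithmetic above can be checked with a few lines of shell. This is only a sketch: the function name missed_urls is my own, and it assumes the generator simply stops at its URL cap.

```shell
# missed_urls START PER_DAY DAYS CAP
# Prints how many URLs fall outside a generator's cap after DAYS days.
# The function name is an illustrative assumption.
missed_urls() (
    start="$1"; per_day="$2"; days="$3"; cap="$4"
    total=$(( start + per_day * days ))
    missed=$(( total - cap ))
    # A site smaller than the cap misses nothing.
    if [ "$missed" -lt 0 ]; then
        missed=0
    fi
    printf '%s\n' "$missed"
)

missed_urls 1000 10 20 1000   # prints 200
```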

Is there a way to coax the online generators to create a bigger sitemap?

Fortunately, sitemap generators do not check the size of a current sitemap and cannot determine whether a sitemap is made up from the contents of multiple sitemaps that have been generated by free sitemap generators. This failing can be turned to our advantage: we can use the same free tools to create daily or weekly sitemaps then combine their results to build one super sitemap. We can then force the generator to map different parts of a website by putting links to those parts on the website’s index page. For best results, one of those links should point to an artificial linklist that points to the sections of the site that need to be mapped; but we must be careful not to duplicate data lines!

The Method

The method is easy for those who use Linux. I do not know whether Windows comes with “sed” but Windows users can use VirtualBox, a Linux LiveDisk, or they can install Cygwin (or CygwinX). These instructions assume you have already placed strategic links on your site’s front (index) page that point to the deeper parts of your website or a linklist that contains deeplinks to those parts you wish to have mapped. Strategic links should be as close to the top of the index page as possible (machines read webpages top-to-bottom, left-to-right). You can make your life easier by using the automated sitemap-ripper utility that comes with Scriptilitious. Again, Scriptilitious can be downloaded at the bottom of this article. So, here’s how we create a sitemap using online generators and (or not) the free sitemap-ripper utility:

  1. Use one of the sitemap generation tools listed above. Sometimes the generators can be mistaken for DoS attacks and hack attempts so they can be blocked by server security software. My general route is to try 5,000 then 1000 then 500 URLs. The latter one is rarely ever blocked;
  2. Upload the sitemap to your server. It should usually be placed in the root directory e.g. your-domain.com/sitemap.xml;
  3. Register the sitemap with the big two search engines (Google and Bing (and Yahoo));
  4. Place a link to the sitemap in the footer of your site’s index page (I suggest the footer because, most often, the same footer is repeated on every page). This ensures that Yahoo! and other search engines can easily find the sitemap;
  5. If possible, place a link to your sitemap in robots.txt by adding this line to it:
       Sitemap: http://www.example.com/sitemap.xml
  6. Use My Page Rank to ping the major search engines with the details of your sitemap;
  7. To update the sitemap, use one of the sitemap generation tools but, instead of overwriting the old sitemap with the newly created one, combine their contents. You can do this with sitemap-ripper or with this little bit of code:
    1. Place the content of both sitemaps (old and new) into one file called sitemap.xml.
    2. Open a terminal (Bash/Konsole/Console) and type or copy and paste this script into it:
      # Written for GNU sed, the sed found on Linux.
      # Strip leading spaces and tabs from every line.
      sed -i 's#^[ \t]*##g' sitemap.xml
      # Normalise every URL to the http://www. form.
      sed -i 's#http://www\.#http://#g' sitemap.xml
      sed -i 's#http://#http://www.#g' sitemap.xml
      # Remove the <url> wrappers, keep only the <loc> lines, then sort and de-duplicate them.
      sed -i 's#<url>##g' sitemap.xml
      sed -i 's#</url>##g' sitemap.xml
      grep "<loc>" sitemap.xml > extracted.xml
      sort -u extracted.xml > sorted.txt
      rm sitemap.xml
      rm extracted.xml
      mv sorted.txt sitemap.xml
      # Rebuild each <url> entry around its <loc> line.
      sed -i 's#<url><url>#<url>#g' sitemap.xml
      sed -i 's#<loc>#<url>\n  <loc>#g' sitemap.xml
      sed -i 's#</loc>#</loc>\n  <changefreq>daily</changefreq>\n  <priority>0.5</priority>\n</url>#g' sitemap.xml
      # Add the XML header at the top and the closing tag at the bottom.
      sed -i.bak '1i <?xml version="1.0" encoding="UTF-8"?>\n<urlset\n      xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"\n      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"\n      xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9\n      http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">\n      <!-- The site URLs go below here -->\n      <!-- formatted with a script from http://journalxtra.com/downloads -->\n' sitemap.xml
      echo '</urlset>' >> sitemap.xml
      rm sitemap.xml.bak
       That code removes superfluous whitespace at the beginning of all lines, changes all URLs to the http://www format, extracts all the mapped URLs, sorts the data, removes duplicate content, sets their priority to “0.5” and specifies their frequency of change as “daily”. The final file it creates is the all important sitemap.xml. The downloadable script is more interactive: it allows the URL format, the change frequency and the page priority to be specified as it runs, and it automatically combines the original sitemaps before it rips them apart, extracts the URLs, cleans them up, removes duplicates and reformats them into our Holy Grail.
  8. Repeat step 7 then 6 every time a new sitemap is generated.
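Step 5 is easy to get wrong when it is repeated, because the same Sitemap line can end up in robots.txt several times. The sketch below adds the line only when it is missing. The function name add_sitemap_line is my own invention, and the sketch assumes you are working on a local copy of robots.txt that you will upload afterwards.

```shell
# add_sitemap_line ROBOTS_FILE SITEMAP_URL
# Appends "Sitemap: SITEMAP_URL" to ROBOTS_FILE unless that exact line exists.
# The function name is an illustrative assumption.
add_sitemap_line() (
    robots_file="$1"; sitemap_url="$2"
    line="Sitemap: $sitemap_url"
    touch "$robots_file"
    if ! grep -qxF "$line" "$robots_file"; then
        # Make sure the file ends with a newline before appending.
        if [ -n "$(tail -c 1 "$robots_file")" ]; then
            printf '\n' >> "$robots_file"
        fi
        printf '%s\n' "$line" >> "$robots_file"
    fi
)
```

For example: add_sitemap_line robots.txt http://www.example.com/sitemap.xml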

Ensure that your URLs use only one of the http:// or http://www formats. If your URLs are mixed then the pages could be indexed twice or thrice, which could be rewarded with a search engine penalty and lower page rank due to different backlinks pointing to different pages (http:// is different to http://www.).
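A quick way to see whether a sitemap mixes the two formats is to count them. This is a rough sketch: the function name count_url_formats is my own, and it assumes each <loc> entry sits on its own line and uses http://, as in the examples in this article.

```shell
# count_url_formats SITEMAP_FILE
# Prints how many <loc> lines use http://www. and how many do not.
# The function name is an illustrative assumption.
count_url_formats() (
    file="$1"
    # grep -c exits non-zero when the count is 0, hence "|| true".
    with_www=$(grep -c '<loc>http://www\.' "$file" || true)
    total=$(grep -c '<loc>http://' "$file" || true)
    printf 'www: %s, non-www: %s\n' "$with_www" "$(( total - with_www ))"
)
```

If both counts are above zero the sitemap is mixed and should be cleaned up before it is uploaded.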

If you cannot be bothered to do all the above, for as little as $19.99 you can get a sitemap generator that will create as many sitemaps as you need as big as you need them. It’s available from xml-sitemaps.com, provides automatic sitemap updates and will save you a lot of time.

Scriptilitious

This free interactive utility box comes with two sitemap creation scripts that automate step 7 to produce a sitemap with the name sitemap.xml. Scriptilitious is known to work with Linux. It might work natively with Windows but most likely will require Cygwin or some other Linux terminal emulator.

Instructions

  1. Unzip the downloaded file,
  2. Place the two sitemaps that need to be combined into the WorkBox folder (give them both a different name),
  3. Open a terminal in the Scriptilitious folder and type ./scriptilitious.sh,
  4. Further usage instructions are provided as the script runs.

Scriptilitious ScriptBox (79)


Crafty Sitemap Building. Copyright secured by Digiprove © 2010

Free GIMP Button Brushes

Sleek, Sexy and Stylish!

No, these are nothing like the buttons used to fasten shirts and unfasten women’s blouses ;-) These are the type used to create wonderfully stylish clickable areas on web pages. In total, there are 21 free GIMP button brushes waiting to be downloaded. They are easy enough to use but I have provided full instructions and tips below for those who need them. Those instructions, along with their usage and redistribution notes, are also in the Readme file packaged with the download.

The buttons are in greyscale so can be colored any way you like; they are without text to make it easier for you to customize them; and they are quicker to use than the scripts that originally created them. The ones shown in the image are only a sample of the goodies given away in the download.

Website Buttons (108)

INSTALLATION

Unzip the downloaded file and place the gbr files contained in the GIMP-Brushes-Web-Buttons folder into your GIMP Brushes folder. Linux users will find this in ~/.gimp-2.6/brushes. Windows users, why don’t you migrate to Linux???? Just kidding :-) Look for GIMP under Program Files then locate your Brushes folder under that.

USAGE INSTRUCTIONS

  1. create a blank canvas (File>New). Use a white background,
  2. select a brush color,
  3. select the paint brush icon from the Toolbox,
  4. select one of the buttons (brushes) from the Layers and Channels box,
  5. paint the brush onto the canvas (press the left mouse button multiple times if you want a deeper colour),
  6. follow this step if your chosen button has curved edges, otherwise skip to step 7:
    1. select the fuzzy select tool (it’s in the Toolbox, 4th one from the left along the top row, looks like a magic wand). Adjust the threshold for precision (between 90 and 100 is usually best),
    2. left click on the canvas (not the button),
    3. make the selected areas transparent (right click and select Colours>Colour to Alpha and follow the presented instructions),
  7. autocrop the image (Image>Autocrop Image),
  8. save the image (for safety),
  9. if required, add text with the Text Tool (it’s in the Toolbox),
  10. save the image again but with a different name. This could be your inactive button.
  11. select the button layer,
  12. darken or lighten the image (Colours>Brightness-Contrast etc),
  13. save it again with a different name to the last one saved. This could be your active button.
  14. darken or lighten the image (Colours>Brightness-Contrast etc),
  15. save it again with a different name to the last one saved. This could be your hover button.

TIPS

  1. Two of the brushes are shadows. One is for use in combination with the Aqua Pill button, the other is for the Aqua Bou button. They are handy for creating shadows of different colour to the main button. The Aqua buttons already come in non-shadowed and shadowed form;
  2. Some of the buttons may be filled with gradients (Colours>Map>Gradient Map) etc;
  3. It’s best to use dark colors or a combination of a light colour with a secondary dark color (with lowered opacity) superimposed over it i.e. use the same brush twice but with different colors;
  4. If a button’s background needs to be changed from transparent to a color just right click the image layer, create a new layer, fill it with your chosen color then move the layer’s position in the layer stack to place it under the button’s layer.

I wish you the best of fun with them.



Free GIMP Button Brushes. Copyright secured by Digiprove © 2010

The Rock in the Way

“What? Why would I want to do that?” I hear you shout.

Picture this:

you’ve joined an affiliate program and your sponsor provides you with a hosted free site. You register your domain name, point it to your sponsor’s server then set up your free website. A few days later you think about marketing it. You visit lots of directories and send back link requests to hundreds of webmasters but everywhere you turn you see a great massive big boulder blocking your path. You approach it, inspect it and see some words written on it in BIG, BOLD RED LETTERS:

Reciprocal Link Required

You think, Mmm, O.K, I’ll put the recips in my website’s footer and when that’s full I’ll stick them in my sidebar.

So you get out your hammer and bolster and start chipping away at it. After a few days of chipping you finally break through the last of the boulder. As it splits into two pieces you step onto the other side of the path, exclaim “Phew! That was hard work” then sit down in exhaustion. Just before you close your eyes to relax you notice all the chippings you’ve left on the path. What a mess, you think to yourself. How am I going to clear that up?

Well, how are you going to clear it up?

How about this:

create a remotely hosted sub domain of your site, preferably on your own server, and install a directory, blog, gallery or whatever else your sponsor provided white label site needs.

Believe me when I say this, it is far easier to do than it sounds; and you will need to do this whenever you need to spread a domain name across free sites provided by your sponsors, blogs provided by free blog services and your own paid for server space.

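In this guide I will show how to point your sub domains to servers other than the server your main domain points toward. I call that main domain your top level domain (TLD) throughout, although strictly speaking the TLD is only the final part of the name, such as .com.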

You will need

  • 1 registered domain name
  • access to your domain name registrar’s control panel
  • 1 server that isn’t connected to the one your top level domain, TLD, points at

The Set-Up

We will pretend we have a TLD called

example.com (http://example.com)

We will create a sub domain called

subdomain.example.com (http://subdomain.example.com)

We will host example.com on Server 1 and the sub domain on Server 2.

We will assume that example.com already points to Server 1 with a set-up similar to this (the left hand side is the Host Name, the right hand side is the server detail):

A-NAME or A-ADDRESS

* = 123.456.789.10

@ = 123.456.789.10

www = 123.456.789.10

A-NAMEs point to the IP address of the target server

CNAME or Name Server (NS)

ns123.server-one.com.

ns124.server-one.com.

Strictly speaking these are NS records: they point to the target server’s name servers, whereas a CNAME record aliases one host name to another.

The full stop after the “com” marks the name as fully qualified. Some registrars add it automatically so might not like the name server if you add the dot yourself. Use the dot first; if the registrar spits it out, try it without the dot.

The Method – How to Split a Website Across Multiple Servers

Go to your own server’s control panel (usually CPanel) and create an add on domain by entering the details for your top level domain. For our example domain, this would be created as example.com

Do not worry, creating an add on domain will not change the current set-up for your domain. Requests for your domain will still be directed to the server it already points at.

Next, create a sub domain for your newly created add on domain. Again, you should do this from your server control panel. Ensure the sub domain’s folder resides under the top level domain’s folder. Our example sub domain is subdomain.example.com It would sit on our server under the following directory structure (Linux)

public_html/domain/subdomain

To send requests for the sub domain, subdomain.example.com, to any server we specify we must create the following items for it

  • Domain Name Server (DNS) records (A-NAME and CNAME), and
  • A URL Redirect

They are very easy to set up and should take less than 5 minutes, although they could take several hours to fully propagate across the web. Using our example sub domain, subdomain.example.com, to point it to our example server, Server 2, those records will look similar to these (the left hand side is the Host Name, the right hand side is the server detail):

A-NAME or A-ADDRESS

subdomain = 999.888.777.66

www.subdomain = 999.888.777.66

A-NAMEs point to the IP address of the target server

CNAME or Name Server (NS)

subdomain = ns123.server-two.com.

subdomain = ns124.server-two.com.

Create as many NS records as you have name servers.

As before, these are NS records that point to the target server’s name servers.

URL Redirect

www.subdomain = http://subdomain.example.com

The URL redirection ensures that requests for the www address are sent to the non-www (http://) version of the site.

Here is an example of how your domain’s DNS records might look:

[Image: Example DNS Records (Domain CNAME and A-NAME Records Preview)]

The new sub domain’s server details could take a few hours to fully propagate across the Internet. Once its details have spread around, whenever someone requests your TLD they will be sent to one server and when they request your sub domain they will reach your secondary server.
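You can check from a terminal whether the new records have reached your own resolver yet. The sketch below relies on the dig utility being installed. The function names resolves_to and check_subdomain are my own, and 192.0.2.66 in the example call is a documentation-only address standing in for Server 2’s real IP.

```shell
# resolves_to EXPECTED_IP
# Succeeds if EXPECTED_IP appears among the addresses read from stdin
# (one per line, as printed by "dig +short").
resolves_to() {
    grep -qxF "$1"
}

# check_subdomain HOST EXPECTED_IP
# Looks up HOST's A records with dig and reports whether they match.
# Both function names are illustrative assumptions.
check_subdomain() {
    if dig +short "$1" A | resolves_to "$2"; then
        printf '%s points at %s\n' "$1" "$2"
    else
        printf '%s does not point at %s yet\n' "$1" "$2"
        return 1
    fi
}
```

For example: check_subdomain subdomain.example.com 192.0.2.66. A "not yet" answer may only mean that your own resolver still holds the old details, so it is worth trying again later before changing anything.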

You can create as many sub domains on as many servers as you require for your website. I often use this method to create sitemaps, directories and blogs when a free hosted site fails to provide them.

If you need to reroute your emails then you can usually create an email forwarding service from within your domain registrar’s control panel; but that’s another article altogether.

If you need your own host server then I highly recommend Hostgator. I’ve been with them for many years and cannot fault their servers, their services or their products. I register most of my domain names through them too.

Here is my Hostgator affiliate link

COUPON CODES: Hostgator’s current coupon codes are

  • Use “freemonth” to get a free month of hosting if you plan to pay monthly,
  • Use “snowman” to get 20% off a year’s hosting if you plan to pay per year.

Those codes are valid for both shared and reseller hosting plans. Dedicated server lease is on a per year basis so only the “snowman” coupon code can be used to get a massive 20% off the yearly fee.

When I can’t buy domain names from Hostgator I buy them from 123-reg.co.uk, although Hostgator is my preferred option.


How to Split a Website Across Multiple Servers. Copyright secured by Digiprove © 2010