Most web pages are built with that elusive “HTML” thing. Those 4 letters that web developers, Internet users and trendy grandmothers brandish about as though everybody knows what they mean. But do they really know what HTML is? Do they know what those 4 letters stand for?
Though there are many guides that explain how to use HTML, few explain what HTML is, and fewer still show the difference between the HTML code and the page that is displayed because of that code. This guide fixes some of that. It won’t show you how to build a web page, but it will show you what a web page is and give you a behind-the-scenes tour of the HTML that makes a web page.
Behind the Scenes with HTML
HTML stands for Hypertext Mark-up Language. It is called a mark-up language because it allows us to ‘mark out’ areas of text and assign special meaning to them.
When viewed in a text editor, a very basic HTML page looks like this:
<html>
  <head>
  </head>
  <body>
  </body>
</html>
The above is an example of HTML mark-up.
The words between the less-than and greater-than symbols (< and >) are called HTML tags. Most tags come in pairs: an opening tag and a closing tag. An opening tag tells web clients, such as the Firefox browser, where a tag’s properties start to apply within a page. Content like images and text is sandwiched between the opening and closing tags. A closing tag tells a web client where that content stops.
<html>
  <head>
  </head>
  <body>
    SOME CONTENT
  </body>
</html>
The opening <html> tag tells web browsers like Firefox that the document being viewed is an HTML document and is written in HTML mark-up. Browsers will display a web page based on the layout instructions given by the HTML mark-up.
The closing </html> tag tells a browser that the HTML page has ended. Any tag that has a forward slash (/) after the opening ‘less than sign’ (<) symbolizes the closing (ending) of its opening tag’s reach. The opening tag is the same as the closing tag but without the forward slash.
Everything from the start of an opening tag to the end of its corresponding closing tag is called an HTML element.
<div></div> is called an HTML element.
<div>CONTENT</div> is called an HTML element.
In <body><div>CONTENT</div></body>, both <body> to </body> and <div> to </div> are HTML elements. Some of what affects the style of the <body> tag will also affect the <div> tag it encases. More about this soon.
Here is the key concept:
HTML is a way to lay out the parts of a web page, like text and images. It is not a programming language and it is not a scripting language; it is a mark-up language.
Imagine you have a blank collage book and lots of pictures from newspapers and catalogs that you want to stick in it. You take your scissors, cut out the pictures then smear plenty of glue onto the back of each clipping and stick it into your book. Every clipping is exactly where you want it and whenever you review your collage you will see the same clippings in the same places.
A web page is like a collage book, except that there is no glue and the clippings are things like images, movies and podcasts.
There are drag’n’drop website creation tools like Kompozer and Maqetta that let you build web pages by positioning your ‘clippings’ or content and adding text into a page without ever seeing the HTML that tells a web browser where each clipping should be displayed within the page.
If you look behind the scenes of an HTML page you will see <html> tags, <head> tags and <body> tags.
The HTML markup tags are not displayed on screen in browsers. They are invisible unless they are specially written to be displayed.
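To make a tag show up as visible text, its angle brackets are usually written as character entities so that the browser does not treat them as mark-up. Here is a minimal sketch; the paragraph text is made up for illustration:

```html
<html>
  <head>
  </head>
  <body>
    <!-- &lt; and &gt; are displayed as the less-than and greater-than
         symbols, so the browser shows the tag names as plain text
         instead of starting a new division -->
    <p>A division starts with &lt;div&gt; and ends with &lt;/div&gt;.</p>
  </body>
</html>
```

Viewed in a browser, the paragraph shows the tag names as ordinary text, while the surrounding <p> tags stay invisible as usual.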
When you make a collage, you can use clippings of any shape. Not so for web pages. All web content is laid out in rectangular or square blocks. There are no triangles, no circles, no hexagons. Just blocks with 4 internal 90 degree angles and 4 straight edges. Every image, every movie and every chunk of text is placed within a rectangular/square division of the page. The Hypertext Mark-up Language (HTML) marks out those blocks with opening and closing tags like <div>TEXT</div>.
The Most Basic of the Basic Elements
The <html></html> tags were explained earlier. They tell a web browser that a page is laid out in HTML and that everything written between the HTML tags should be interpreted (the technical term is parsed) as marked out in HTML.
The opening <head> tag and closing </head> tag mark out an area within the page that is read by web browsers but which isn’t itself shown within the page content. This is known as the page head (some developers also call it the page header). Anything within the head tags takes effect within the page but is not shown to visitors. Do not confuse the head tags with the visible page header. Keep reading to learn more about the page header.
Content is added to a page between the <body></body> tags. A typical page with content might look something like this:
<html>
  <head>
  </head>
  <body>
    <div id="wrapper">
      <div id="container">
        <div class="header">
          <img src="http://example.com/logo.jpg" />
        </div>
        <div class="content">
          <p>This is a page of content.</p>
        </div>
        <div class="footer">
          <p>Footer copyright text.</p>
        </div>
      </div>
    </div>
  </body>
</html>
The opening <div> tags and closing </div> tags mark out divisions within the page.
The page is read by web browsers and other clients from top-left to bottom-right, line by line. Each tag is read in turn. When an opening tag is found, the browser knows to look for a corresponding closing tag further down the page. When two opening tags are found, the browser knows to look for two closing tags further down the page. It is just like using brackets in text: when two opening brackets are used, you close the inner brackets before you close the outer brackets. HTML does the same with its tags.
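The bracket rule can be sketched with two divisions, one inside the other. The IDs "outer" and "inner" are made-up names used only for this illustration:

```html
<html>
  <head>
  </head>
  <body>
    <!-- The outer division opens first and closes last -->
    <div id="outer">
      <!-- The inner division opens second and closes first -->
      <div id="inner">
        <p>Some text.</p>
      </div>
    </div>
  </body>
</html>
```

Reading from the top, the browser meets the tags in the order outer, inner, paragraph, and then closes them in the reverse order: paragraph, inner, outer.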
Our example page has five opening <div> tags and five corresponding closing </div> tags. Some of our <div> tags are named with IDs and classes:
- An ID is unique to the page. It can only be used once in any one page. It can be reused in other pages, but once a tag is given an ID, no other tag in the same page may use that ID.
- A class is not unique. Multiple tags (of any type) may use the same class name within the same page and across multiple pages.
IDs and classes are used for styling elements within a web page by adding style instructions to the web page’s associated CSS file. More about these later.
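As a rough sketch of the difference, the mark-up below uses one ID once and one class several times. The names "gallery" and "caption" are hypothetical and chosen only for illustration:

```html
<html>
  <head>
  </head>
  <body>
    <!-- An ID may appear only once in a page -->
    <div id="gallery">
      <!-- A class may be shared by many tags, of any type -->
      <p class="caption">First caption.</p>
      <p class="caption">Second caption.</p>
      <div class="caption">A division can share the class too.</div>
    </div>
  </body>
</html>
```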
Each <div></div> tag group represents a section within the page:
The <div> with the ID “wrapper” is used to hold and centrally align the visible part of the page. This division is mainly for compatibility with old Internet Explorer browsers that ignore the page styling that centres a page.
The <div> with the ID “container” is used to hold the actual visible page content. This division corrects the abnormalities caused by Internet Explorer compatibility fixes employed in the “wrapper” div.
The <div> with the class “header” is used to create the top of the page where the site logo image, site title and site by-line are displayed. Do not confuse the “header” with the <head></head> tags.
The <div> with the class “content” holds the page content that displays between the page header div and the page footer div.
The <div> with the class “footer” holds content such as links, copyright text and contact details that is displayed at the bottom of the page.
Our example page has header, content and footer divisions of equal width. A sidebar could easily be inserted into the page next to the content div. This would require the content div to be made narrower than the header and footer elements so that it can accommodate the extra sidebar element. This is one reason that the header and footer are created separately from the content div. Another reason is that the header and footer rarely change across a site. Keeping them separate helps a page to load more quickly and seamlessly when they are loaded into a page using server side includes (SSIs).
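Here is a sketch of how the visible part of the example page might look with a sidebar added. The class name "sidebar" and its text are assumptions made for illustration, not part of the original example:

```html
<div id="wrapper">
  <div id="container">
    <div class="header">
      <img src="http://example.com/logo.jpg" />
    </div>
    <div class="content">
      <p>This is a page of content.</p>
    </div>
    <!-- Hypothetical sidebar, added as a sibling of the content division -->
    <div class="sidebar">
      <p>Sidebar links.</p>
    </div>
    <div class="footer">
      <p>Footer copyright text.</p>
    </div>
  </div>
</div>
```

The mark-up only adds the new division. Making the content and sidebar sit side by side, and making the content narrower, is the job of the style sheet.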
Other tags used in the example above are:
- <img /> which is used to load an image. This is an example of a self closing tag – look at where the forward slash is.
- <p></p> tags which are used to mark out text within a page. These are called paragraph tags. Each paragraph is wrapped in <p>tags</p>.
CSS and Element Styling
CSS stands for Cascading Style Sheets. A web page needs a CSS file if its HTML elements are to be styled with background colors, background images, borders, external margins, internal paddings and other properties.
A style sheet is referenced by (linked from) a web page using a <link> tag placed in the web page’s head section. The head section (element) is everything between the opening and closing <head></head> tags. Remember what I said earlier about the head section of a page?
Tags such as <meta>, <title> and <link> are used within the head section to provide details about a web page to web browsers and bots. Except for the <title> tag, these tags are usually invisible to people viewing a web page; only their effects are expressed within the page.
Here’s an example <head></head> section:
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
  <title>A Web Page</title>
  <link rel="Shortcut Icon" href="favicon.ico" type="image/x-icon" />
  <link rel="stylesheet" href="style.css" type="text/css" media="screen" />
</head>
The first meta tag, <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />, tells web clients that the character encoding used by the web page is UTF-8.
The <title></title> tag tells browsers the title to use for the tab the page is displayed in.
The <link rel="Shortcut Icon" … /> tag tells browsers where to locate the favicon to display in the page tab within the browser. This is a self-closing tag.
The <link rel="stylesheet" … /> tag tells browsers where to find the CSS style sheet for the web page.
The style sheet contains the instructions a browser needs to refine the appearance of the web page that requests it. A web page can use more than one CSS style sheet.
Style information can also be placed directly into a web page between <style></style> tags within the head section. Scripts can be placed in the head section too, usually between <script></script> tags.
In the styling information, the named HTML elements, classes and IDs are called selectors.
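Here is a minimal sketch of style information placed in the head section, with one rule each for the body selector, an ID named "id" and a class named "class". The color and border values are assumptions chosen purely for illustration:

```html
<head>
  <title>A Web Page</title>
  <style type="text/css">
    /* Element selector: applies to the <body> tag */
    body {
      background-color: #eeeeee;
    }

    /* ID selector: applies to the one element with id="id" */
    #id {
      border: 1px solid #000000;
    }

    /* Class selector: applies to every element with class="class" */
    .class {
      border: 1px dashed #999999;
    }
  </style>
</head>
```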
The above example adds style for:
The “body” selector. This affects the <body> tag to change its background color.
An ID called “id”. This adds a border to the element whose ID is “id”. In the CSS, an ID is referenced with a hash symbol (#) in front of it, e.g. #wrapper and #container.
A class called “class”. This adds a border to any element with the class “class”. Classes are referenced in the CSS with a full stop in front of them, e.g. .header, .content and .footer.
How Does HTML Fit into PHP?
HTML pages are known as static pages. When an HTML document is viewed it will always display the same way. The styling will always be the same, the elements will always be in the same places.
PHP web pages are known as dynamic. Dynamic pages are recreated every time they are viewed. They might change according to the browser they are requested by, they might change to match a browser’s width, they might load with random images and text, or they might change according to the time or any other detectable property of the client that requests the page. It all depends on how they are built.
PHP uses HTML to lay out a page, but the PHP determines how that page will look and behave when viewed.
The objective of this HTML discussion was to explain the meaning of HTML and to define some of the terminology used by web developers when discussing HTML markup and web page elements. It was not this discussion’s intention to teach you how to use HTML; there are plenty of good guides to HTML on the Net already and one more guide will only add to the confusion.
What you should now be aware of is that HTML web pages are laid out (built) in a similar way to collages but instead of glue, HTML tags and CSS selectors are used to stick page elements (like text and images) into their positions.
A basic web page needs <html></html>, <head></head> and <body></body> tags. A corresponding pair of tags and the content they encase are called an HTML element. The <html> element encases all the other elements of the page, the <head> element holds the page’s meta information that surfers do not see, and the <body> element holds the content that surfers do see and that is styled by CSS and scripts referenced in the <head> element.
When web developers talk about a page’s header, they can mean the visible page top where the site logo is shown or they can mean the head section of the HTML mark-up. Usually, when asked to put something in a site’s head or header – unless it is an image or text – it will be intended to go between the <head></head> tags or it may be a vague reference to the style sheet of the page.
I have found the following sites invaluable throughout my years as a web developer (I feel old writing it like that). Even though I now almost exclusively use WordPress to build websites, these HTML resources have proved themselves useful: