The meta element allows you to write additional information or data about the HTML document in the head section between <head> and </head>.
These can be instructions for the web browser, the web server, or a web crawler (also spider, searchbot, bot, or search robot). Even though the use of those meta elements is optional, they often get specified. It’s quite difficult, especially for beginners, to keep track of the many existing HTML attributes and the possible attribute values you can use with the meta element. Many of these additional details aren’t standardized at all.
Web Crawler: A web crawler is an application that searches the internet and analyzes entire websites. Different types of web crawlers are in use, each collecting different types of information. Search engines also use web crawlers to analyze websites. The principle is similar to web browsing, where hyperlinks take you from one web page to other URLs: a web crawler stores these URLs and visits the pages one by one. The pages are then indexed so that relevant data can be found in a search.
The Most Commonly Used Metadata
A meta element is usually composed of at least two attributes. Either the attributes consist of a name/content combination or an http-equiv/content combination. In addition, a special version exists for character encoding.
“name/content” Combinations: Freely Definable Metadata
The meta element containing the HTML attribute name can basically contain any information in the HTML attribute content. Theoretically, you could assign any value to the contents of name yourself. Nevertheless, some default metadata for the name attribute value has been defined in HTML. However, these name/content combinations aren’t intended for personal information, but should only contain information about the HTML document. A simple example might look as follows:
...
<head>
<title>Freely definable metadata</title>
<meta name="author" content="John Doe">
<meta name="keywords" content="metadata, meta, html">
<meta charset="UTF-8">
</head>
...
Here, you can see two typical name/content combinations. The first example defines the author of the web page, while the second pair defines keywords for the search engines. You could use any number of other meta elements here.
“http-equiv/content” Combinations: HTTP Equivalents
The specifications with http-equiv (also called pragma directives) were originally intended for the web server: the server was supposed to read this information and include it in the HTTP response header when responding to the client (web browser). However, web servers don’t actually parse HTML documents, so in practice it’s up to the browser how this information gets processed. Let’s look at a simple example:
<!doctype html>
<html lang="en">
<head>
<title>HTTP equivalents</title>
<meta http-equiv="refresh" content="5">
<meta charset="UTF-8">
</head>
<body>
<p>Page gets refreshed every 5 seconds.</p>
</body>
</html>
The refresh value for the http-equiv attribute and the value 5 for the content attribute allow you to make the web browser refresh the web page every five seconds.
Setting the Character Encoding for the HTML Document
In addition to the name/content and http-equiv/content pairs, there’s a third option that allows you to specify the character encoding (more easily). You should always include this information; it’s particularly important when a web page contains characters outside the basic English alphabet.
<meta charset="UTF-8">
This will ensure that special characters such as German umlauts are displayed correctly, provided the file itself is saved in the UTF-8 character encoding. Besides the internet, modern operating systems also use UTF-8, and unless you have a reason to use a different encoding, you should always work with UTF-8.
Setting the Viewport
Let’s jump ahead to the viewport now, as a correct setting will prevent a responsive website from being displayed in a small view on the mobile device. The viewport is the area of the browser window where the web content gets displayed. Without any special precautions, web pages on a smartphone’s mobile browser would be scaled down until they fit completely on the screen. This allows visitors to keep an overview and zoom into the page.
If you want to create modern websites today, then taking into account the different device sizes and a responsive web design is part of the development process. When creating responsive web pages, you must prevent this automatic downsizing. You can do this via a meta element like the following:
<meta name="viewport" content="width=device-width">
This tells the browser to use the actual width of the device rather than a wider virtual width. You can see the result of this line in a responsive web page in the figure below, where automatic downscaling is in effect on the left-hand side and the viewport meta element was used on the right.
Specifying Useful Metadata for a Web Crawler
This section provides a brief description of some metadata for search engine robots (web crawlers). However, you must be aware that this information is only a recommendation for the web crawlers. Whether the search bots adhere to it is out of your hands. These attribute values were partly (co)designed by Google, Yahoo, and Microsoft, though, so the crawlers of these companies will probably stick to them. If you want to include information for the web crawler as metadata, you must assign the robots value to the name attribute. In the content attribute, you write (or suggest) what the web crawler should do when it visits the web page, for example:
<meta name="robots" content="index,follow">
This allows the search robot to include the web page in the search engine index and to follow the hyperlinks on the page. However, you can usually omit this information because this is the usual behavior of a web crawler.
If you don’t want the page to be indexed or the hyperlinks to be followed, you can use the attribute values noindex and/or nofollow in content:
<meta name="robots" content="noindex">
Here, you indicate that the web page shouldn’t be included in the search engine index (noindex), so that the page can’t be found via a search engine. If you want the page to be included in the search engine index, but don’t want the hyperlinks to be followed, you merely need to use the attribute value nofollow in content.
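As a short sketch of the remaining combinations described above (the two elements are alternatives, so a page would normally use only one of them):

```html
<!-- Index the page, but don't follow its hyperlinks -->
<meta name="robots" content="nofollow">

<!-- Neither index the page nor follow its hyperlinks -->
<meta name="robots" content="noindex,nofollow">
```

As with the other values, these remain requests that a web crawler may or may not honor.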
Useful Metadata for Search Engines
For search engines, two name values are especially important, namely keywords and description. However, the keywords value has lost importance because it was misused in the past to feed search engines with many misleading keywords (keyword stuffing) in order to be listed as close to the top of the search results as possible. Search engines now index the actual content of a website in a more targeted manner and tend to ignore the keywords (or give them less weight). If you still want to specify keywords, you must separate the individual keywords in content by commas, as the following example shows:
<meta name="keywords" content="html, meta, keywords">
Here, for example, html, meta, and keywords were used as keywords for the website.
What’s more interesting, however, is the description text of the website. Although this text will probably not be considered directly for the ranking in the search results, the description is, in addition to the title, the first thing a user sees listed in the search engine as information from your website. You should keep the description as short and precise as possible and use a maximum of 150 to 250 characters (depending on the search engine). A text that’s too long will be shortened.
Here’s an example of such a description:
...
<head>
<title>Description text for search engines</title>
<meta charset="UTF-8">
<meta name="description"
content="A description should be as
short and precise as possible. Here
you should summarize in 2-3 sentences
what this page is about. Characters
exceeding the limit will be shortened.">
</head>
<body>
...
</body>
...
In Google, for example, this description text is usually listed as shown below.
If you don’t specify a description with a meta element, this text will get generated from the parts of the page content. However, it isn’t possible to predict exactly what this description will look like and what kind of text will be used for it. For this reason, you should definitely take the description into your own hands instead of leaving it up to the algorithm of a search engine.
The First Impression Is Important: Although it isn’t as important as it was in the early days of the internet, metadata still plays a significant role in search engine coverage. You should therefore always pay attention to the title element and the description (name="description") because these elements are often the first things that website visitors get in return from search engines when the page is listed in a search.
Useful Metadata for the Web Browser
If you want to refresh the content of a web page after a certain time or redirect the visitor to another URL, you can use the http-equiv attribute with the refresh value for this purpose. The content attribute enables you to set the time in seconds after which the update or redirection should take place.
You can force a refresh of the web page as follows:
<meta http-equiv="refresh" content="30">
This would refresh the currently loaded web page every 30 seconds.
The redirection to another website can be set up in a similar manner:
<meta http-equiv="refresh"
content="5; URL=http://domain.com/">
This causes the browser to switch to the domain.com URL after five seconds. You could also use zero seconds here, but with a short delay, you can at least let the user know in the HTML document body why they are being redirected and where to.
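A minimal sketch of such a redirect page with an explanatory note might look as follows. The address https://example.com/ and the wording of the note are placeholders chosen for illustration, not part of the original example:

```html
<!doctype html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>This page has moved</title>
<!-- Redirect after five seconds; https://example.com/ is a placeholder address -->
<meta http-equiv="refresh" content="5; URL=https://example.com/">
</head>
<body>
<!-- Tell visitors what is about to happen and offer a manual link -->
<p>This page has moved. You will be redirected in five seconds.
If nothing happens, please use this link:
<a href="https://example.com/">https://example.com/</a></p>
</body>
</html>
```

The hyperlink in the body gives visitors a way forward even if their browser doesn't perform the redirect.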
Stop Using the Automatic Redirection Feature: Automatic redirection can be helpful if the address of the web project has changed. However, some browsers ignore this redirection depending on their settings, and you can’t rely on search engines treating it like a real redirect. In this context, it’s often better to place a hyperlink to the new URL, together with an explanatory note, in the HTML document body. In addition, when you use the time 0, it can be difficult for visitors to use the browser’s back button because it would throw them forward again and again. Alternatively, a redirection can be created on the server. For example, if you have access to the configuration file .htaccess (for the Apache web server) or web.config (for IIS), you can configure the redirect there. The W3C’s accessibility guidelines advise against automatic refreshes and timed redirects anyway, which is why you should refrain from using them for future web projects. But because redirects are still commonly used, I included the topic here.
As mentioned previously, you can also use the old character encoding specification:
<meta http-equiv="content-type"
content="text/html; charset=utf-8" />
This specification corresponds to the more recent, shorter form introduced with HTML5: <meta charset="UTF-8">
The old specification has the advantage that it’s also understood by very old browsers that don’t know <meta charset="UTF-8">. Note, however, that the HTML standard doesn’t permit both forms in the same document, so you should choose one of them.
Using General Metadata
In addition, there’s a considerable amount of general metadata, such as the author of the HTML document or the date and time the document was edited. This is helpful, for example, when several people work on one HTML project. You can specify all this information as a name/content combination. Let’s look at some examples:
<meta name="author" content="John Doe">
<meta name="date" content="2021-01-15T12:00:00+01:00">
Here, the author of the web page (author) and the date of the last change (date) are indicated. If you want to provide personal information for the readers about the current HTML document, you shouldn’t do that via metadata, but directly in the HTML document in a readable manner. The metadata is only useful when someone looks at the source code of the document or when it’s read by software. There’s also other general metadata such as generator, which provides information on the software that was used to create the website. Additionally, you can use application-name to name the web application if the web page belongs to a specific web platform or if a specific web application is running in the web page.
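For illustration, the two names mentioned last could be used as sketched below. The content values are hypothetical and only stand in for the name of your own tool and application:

```html
<!-- Hypothetical values: replace them with your own tool and application names -->
<meta name="generator" content="Example Site Builder 1.0">
<meta name="application-name" content="Example Web App">
```

The generator element is usually written by the authoring software itself rather than by hand.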
My Recommendation: This Metadata Belongs in the Basic HTML Framework
As you’ve now been introduced to a number of different types of metadata, you’ll probably wonder which type of metadata will be useful for your own website. This is ultimately up to you, but personally I always use at least the character encoding for UTF-8, a page description, and the viewport in the head element:
...
<head>
<title>German umlauts</title>
...
<meta charset="UTF-8" />
<meta name="description" content="A description should preferably be as
short and precise as possible. Here
you should summarize in 2-3 sentences
what this page is about. Characters
exceeding the limit will be shortened."/>
<meta name="viewport"
content="width=device-width, initial-scale=1.0, shrink-to-fit=no" />
</head>
...
HTML Attributes for the <meta> Element
This table provides an overview of the HTML attributes for the meta element.
Editor’s note: This post has been adapted from a section of the book HTML and CSS: The Comprehensive Guide by Jürgen Wolf.