Total Validator
HTML / XHTML / WCAG / Section 508 / CSS / Links / Spelling

Introduction

All of the options that appear on the Include tab of the Pro tool are described below.

[Screenshot: the Include tab of the Pro tool]

Skip

You may not always wish to check certain parts of your site. This option allows you to skip parts of the site by specifying one or more paths below the starting page to ignore.

When you click on the 'Skip' button a dialog appears to allow you to add, remove, and update a list of paths to skip.

The paths that you enter here must start with a '/'. Any pages that lie below the starting page and within the path specified here will be ignored.

For example, if your starting page is 'http://mysite.com/somepath/index.html' and you specify a path to skip of '/otherpath', then pages that start with 'http://mysite.com/somepath/otherpath' will be skipped. Note that this includes both 'http://mysite.com/somepath/otherpath/index.html' and 'http://mysite.com/somepath/otherpathaswell/index.html'. If you only wish to skip pages within the 'otherpath' folder, such as 'http://mysite.com/somepath/otherpath/index.html', then specify a path to skip of '/otherpath/'.

You can use this option in combination with the 'Include' option to provide further restrictions on what to check. For example, you could set a path to skip of '/' to skip everything except the starting page, and then use 'Include' to specify exactly which paths should be validated. You can also use the 'robots.txt' option at the same time for even finer-grained selection of what to validate.

You can use regular expression syntax here, but the path must always start with '/' as its first character. Note that '.*' is always automatically appended to whatever you enter.
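
The matching rule described in this section can be sketched in Python. This is only an illustration, not Total Validator's actual implementation: the function names `base_of` and `is_skipped` are hypothetical, and the sketch assumes that the 'base' is the starting page URL with its file name removed, as the example above suggests.

```python
import re
from urllib.parse import urlsplit


def base_of(start_url):
    """Return the starting page URL without its final file name (assumed base)."""
    parts = urlsplit(start_url)
    directory = parts.path.rsplit("/", 1)[0]  # '/somepath/index.html' -> '/somepath'
    return f"{parts.scheme}://{parts.netloc}{directory}"


def is_skipped(url, start_url, skip_paths):
    """Return True if url falls under any skip path; the starting page never does."""
    base = base_of(start_url)
    if url == start_url or not url.startswith(base):
        return False
    relative = url[len(base):]
    # '.*' is appended to each path, mirroring the behaviour described above.
    return any(re.fullmatch(path + ".*", relative) for path in skip_paths)
```

With a starting page of 'http://mysite.com/somepath/index.html', a skip path of '/otherpath' matches both the 'otherpath' and 'otherpathaswell' pages, while '/otherpath/' matches only pages inside the 'otherpath' folder.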

Include

If you specify some paths to skip, or use Disallow within your robots.txt file, you may wish to override this to include some paths within these areas of your website that you do wish to check. Note that it only makes sense for these include paths to be 'below' the paths to skip or 'below' paths disallowed in the robots.txt file.

When you click on the 'Include' button a dialog appears to allow you to add, remove, and update a list of paths to include.

The paths that you enter here must start with a '/' and be more than just a single '/'.

For example, if your starting page is 'http://mysite.com/somepath/index.html' and you specify a path to skip of '/otherpath/' and an include path of '/otherpath/subpath/', then pages such as 'http://mysite.com/somepath/otherpath/index.html' will be skipped, but pages such as 'http://mysite.com/somepath/otherpath/subpath/index.html' will be included (as long as they are linked from a page that is validated).

If you wish to skip everything except a single folder you could set a path to skip of '/' (to skip everything except the starting page), and then use 'Include' to specify exactly which paths below this should be validated. You can also use the 'robots.txt' option at the same time for further fine-grained selection of what to validate.

You can use regular expression syntax here, but the path must always start with '/' as its first character. Note that '.*' is always automatically appended to whatever you enter.
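
How include paths override skip paths can be sketched as follows. This is a hedged illustration rather than the tool's own logic: `matches_any` and `should_validate` are hypothetical names, and `relative_path` is assumed to be the part of the page URL below the folder of the starting page.

```python
import re


def matches_any(relative_path, patterns):
    """True if any pattern, with '.*' appended, matches the whole relative path."""
    return any(re.fullmatch(p + ".*", relative_path) for p in patterns)


def should_validate(relative_path, skip_paths, include_paths):
    """Include paths take priority over skip paths; unskipped pages are validated."""
    if matches_any(relative_path, include_paths):
        return True
    return not matches_any(relative_path, skip_paths)
```

Using the example above, a skip path of '/otherpath/' with an include path of '/otherpath/subpath/' rejects '/otherpath/index.html' but accepts '/otherpath/subpath/index.html'.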

Use robots.txt

An alternative way of specifying which parts of your site to skip is to add a standard robots.txt file to your website. Total Validator will use any rules marked for all user agents with a '*', as well as those specifically marked with a user agent of 'TotalValidator'. For example:

User-agent: *
Disallow: /blogs

User-agent: TotalValidator
Allow: /support/
Disallow: /support/resources/

Total Validator supports all of the features supported by Google including multiple 'Disallow:' and 'Allow:' statements in any order, wildcards and suffixes.

Note that, unlike the 'Skip' and 'Include' options, paths in a robots.txt file are relative to the root of the site and not to the starting page for validation. The starting page itself will always be validated, even if the robots.txt file disallows it. This option can also be used in combination with the 'Skip' and 'Include' options for fine-grained selection of what to validate.
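
To show how the example rules might be evaluated, here is a minimal Python sketch. It is not Total Validator's parser: the names `parse_rules` and `is_allowed` are hypothetical, wildcards and suffixes are deliberately left out, and the precedence rule (the longest matching path wins, with 'Allow' winning a tie) is an assumption based on Google's documented behaviour. Rules for '*' and 'TotalValidator' are combined, as described above.

```python
def parse_rules(robots_txt, agents=("*", "totalvalidator")):
    """Collect (allow, path) rules from every group that names one of the agents."""
    rules = []
    active = False          # does the current group apply to us?
    in_agent_lines = False  # are we inside a run of User-agent lines?
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if not in_agent_lines:
                active = False  # a new group starts here
            in_agent_lines = True
            active = active or value.lower() in agents
        elif field in ("allow", "disallow"):
            in_agent_lines = False
            if active and value:
                rules.append((field == "allow", value))
    return rules


def is_allowed(path, rules):
    """Longest matching prefix wins; 'Allow' wins a tie (assumed precedence)."""
    best_allow, best_len = True, -1  # no matching rule means the path is allowed
    for allow, prefix in rules:
        if path.startswith(prefix):
            if len(prefix) > best_len or (len(prefix) == best_len and allow):
                best_allow, best_len = allow, len(prefix)
    return best_allow
```

Applied to the example file above, '/blogs/post.html' and '/support/resources/guide.pdf' would be disallowed, while '/support/index.html' would be allowed. Remember that these paths are relative to the root of the site.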

Follow remote links

When checking more than one page, those pages that don't start with the URL of the starting page will be ignored. This includes pages on remote sites and pages 'above' the starting page or in a different part of the website.

Selecting this option will cause the validator to ignore this restriction, so it will visit all the pages linked from the starting page regardless of their URL. This applies to the starting page only, so remote links on subsequent pages will still be ignored.

Use this option with care otherwise you may end up checking far more pages than intended. It is expected that in most cases this option will be used with a specially constructed starting page that references different parts of the same website. See the FAQ for further details.
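
The rule described in this section can be summarised in a few lines of Python. This is a sketch under assumptions, not the tool's code: `in_scope` is a hypothetical name, and `base_url` stands for the URL prefix derived from the starting page.

```python
def in_scope(url, base_url, linked_from_start, follow_remote):
    """Decide whether a linked page would be visited.

    Pages under base_url are always in scope. Other pages are only in scope
    when the option is selected and the link appears on the starting page.
    """
    if url.startswith(base_url):
        return True
    return follow_remote and linked_from_start
```

A remote link found on the starting page is followed only when the option is selected, and a remote link found on any later page is never followed.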

Validate errors

If you select this option, then whenever your web server returns an error status code such as 404 (page not found), the error page sent by the web server will be validated.

This is a useful way of checking that your error pages also conform to standards.

Ignore errors/warnings

If you use the tool and it reports that there are 'errors' in your site that you are happy to live with, then use this option to stop them appearing. In this way you can clean up the reports produced to make them more useful to you. You could also ignore any errors/warnings that you think are errors in the tool itself, although we would prefer it if you could let us know so we can fix them so that everyone will benefit.

The value you supply must be a comma-separated list of errors and/or warnings to ignore. For example:

E601, W600, E404, P861

Once you've seen how the errors/warnings are reported we are sure you'll understand what to put in here.

This option applies to all pages/CSS validated. If you need finer control then you can add special instructions to your pages/CSS instead.
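
As an illustration of the value format, the following Python sketch parses such a list and filters a set of reported problems. The names `parse_ignore_list` and `filter_problems` are hypothetical, and the assumed shape of a code (one capital letter followed by digits) is inferred only from the example above.

```python
import re

# Assumed shape of a code, based only on the example: a letter then digits.
CODE_PATTERN = re.compile(r"[A-Z]\d+")


def parse_ignore_list(value):
    """Split a comma-separated list of codes into a set, rejecting odd entries."""
    codes = [item.strip() for item in value.split(",") if item.strip()]
    invalid = [code for code in codes if not CODE_PATTERN.fullmatch(code)]
    if invalid:
        raise ValueError(f"Unrecognised codes: {invalid}")
    return set(codes)


def filter_problems(problems, ignored):
    """Keep only (code, message) pairs whose code is not being ignored."""
    return [(code, message) for code, message in problems if code not in ignored]
```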

Stop after problems

This option allows you to specify the maximum number of problems to be reported before the validation is automatically halted. This is especially useful on large sites, where the same problems may be reported again and again. Instead of waiting for the whole site to be validated you can fix these common problems after validating only a few pages and then validate the entire site.

The value that you enter here must be an integer (whole number) greater than 0. Leave it blank for this option to be ignored.

Note that the number of problems reported could be slightly higher than the figure you enter, as all problems for the last tag validated will be reported.
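
Why the reported total can exceed the limit is easiest to see in a small sketch. This is a hypothetical illustration, not the validator's code: `validate_tags` and `find_problems` are made-up names, and the limit is assumed to be checked only once all problems for the current tag have been recorded.

```python
def validate_tags(tags, find_problems, max_problems=None):
    """Report problems tag by tag, halting once the limit has been reached."""
    reported = []
    for tag in tags:
        # Every problem for this tag is reported before the limit is checked,
        # so the final count can overshoot max_problems.
        reported.extend(find_problems(tag))
        if max_problems is not None and len(reported) >= max_problems:
            break
    return reported
```

With a limit of 4 and three problems per tag, validation halts after the second tag with six problems reported.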

Stop after pages

This option allows you to specify the maximum number of pages with problems to be reported before the validation is automatically halted. This is an alternative to the 'Stop after problems' option, although both can be used at the same time if required.

The value that you enter here must be an integer (whole number) greater than 0. Leave it blank for this option to be ignored.

Page pause

If you wish to minimise the impact of validation requests on your server you can use this option to set the time in milliseconds to pause before retrieving each page. By pausing in this way the rate of requests hitting the server will be reduced. Normally this option is used together with the Link pause option.
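
The effect of the pause can be sketched as a simple loop. This is an assumption-laden illustration: `fetch_with_pause` is a hypothetical name, `fetch` stands in for whatever retrieves a page, and the `sleep` parameter exists only so the pause can be replaced in tests.

```python
import time


def fetch_with_pause(urls, fetch, pause_ms, sleep=time.sleep):
    """Retrieve each page in turn, pausing pause_ms milliseconds before each one."""
    pages = []
    for url in urls:
        sleep(pause_ms / 1000.0)  # time.sleep takes seconds, the option is in ms
        pages.append(fetch(url))
    return pages
```

With a pause of 500 milliseconds, the server would receive at most two page requests per second from this loop.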

Browser identification

When validating a website, the tool identifies itself as 'TotalValidator/6.0' by default. If you wish the tool to identify itself as another user agent, then select the required identity from the drop-down list.

You can amend the list of identities and what they mean using the 'Edit List' button. This will display a dialog box for easy editing of the list of user agents and the corresponding text sent to the web server when the tool accesses it.

If you wish to return to the default list of identities then use the 'Reset' button provided on the edit screen. This also allows you to update the list with the identities provided in the most recent version following an upgrade.
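
The identity is the text a client sends to the web server in the User-Agent request header. The following Python sketch shows the general idea using the standard library; it is not how Total Validator itself is implemented, and `build_request` is a hypothetical helper.

```python
from urllib.request import Request

DEFAULT_IDENTITY = "TotalValidator/6.0"  # the default identity quoted above


def build_request(url, identity=DEFAULT_IDENTITY):
    """Create a request that announces the given identity to the web server."""
    return Request(url, headers={"User-Agent": identity})
```

A server that varies its output by user agent would then see the chosen identity instead of the default one.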

Strip query

Some websites dynamically add query parameters to the links on their pages, so that the links are different each time a page is served. This is a problem for Total Validator, which treats these links as pointing to different pages because the URLs are different. This means that it will test the same page(s) again and again.

If this happens to you then use this option to prevent it. The links will then be stripped of all query parameters before being used. Note that this may mean that not all pages are checked, depending on how the query parameters are used.
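
The effect of stripping query parameters can be sketched with Python's standard library. This is an illustration only: `strip_query` is a hypothetical name, and the example URLs are made up.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url):
    """Return the URL with all of its query parameters removed."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(query=""))
```

Two links such as 'http://mysite.com/page.html?ts=1' and 'http://mysite.com/page.html?ts=2' both reduce to 'http://mysite.com/page.html', so the page is tested once. Equally, 'list.html?page=1' and 'list.html?page=2' would collapse into one URL, which is why some pages may no longer be checked.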

Strip session

Some websites are constructed such that session ids may be dynamically added to links on their pages. These links typically add these session ids to the end of the link using a semicolon ';' to separate them like so:

http://thewebsite.com/path/page.html;jsession=123456

This can sometimes be a problem for Total Validator, which may view two links to the same page as referring to different pages because the URLs are different. This means that it may test the same page(s) again and again.

If this happens to you then use this option to prevent it. The links will then be stripped of the semicolon and everything following it, up to the start of any query parameters or to the end of the URL if there are none.
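
The stripping rule can be sketched in Python as follows. This is a minimal illustration of the description above, not the tool's own code, and `strip_session` is a hypothetical name.

```python
def strip_session(url):
    """Remove ';' and what follows it, up to any query string or the URL's end."""
    head, separator, query = url.partition("?")  # keep any query parameters
    head = head.split(";", 1)[0]                 # drop the session id part
    return head + separator + query
```

For the example above this returns 'http://thewebsite.com/path/page.html', and any query parameters after the session id are preserved.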
