Backtracking

"Backtracking" refers to navigating a web site manually by editing its URL to browse up through the directory structure. In order to backtrack, you'll need to know something about how URLs work:

This illustration shows each of the four main elements of a URL (protocol, domain name, file path, and file name), using the sample URL http://www.niu.edu/english/english_home.html.


PROTOCOL
The first element in any URL is the Protocol. In the case of web pages, the protocol will almost always be "http", which stands for "HyperText Transfer Protocol." Basically, this tells your browser that it will be loading a web page. As mentioned earlier, web browsers can use other protocols to access other kinds of information on the Internet. More specifically, in addition to "http", this protocol element of the URL could also read "gopher" or "ftp". Most modern browsers assume that users are looking for web pages, so the protocol is optional. If the user simply types in www.abacon.com, for example, the browser will automatically fill in the "http://" section of the URL.
One final word about protocols: most newer web browsers can also be used to browse documents on the user's own computer. Instead of typing in a web page address, the user might type in something like "c:\". This directs the web browser to display the contents of the computer's "C" drive. The actual URL would look like this: "file:///c|/". We can therefore add "file" to the list of protocols that web browsers can use; Microsoft Windows 98 makes use of this feature by letting users navigate their own computer through a web page-like interface.

SERVER DOMAIN NAME
The second element of every URL is the server domain name, which is like the street address of the web server. Basically, the domain name tells the browser where it can find the web page in question, and it reads much like a street address, from most specific to most general. In the above example, the domain name consists of three parts: "www", "niu", and "edu". Going in reverse, we see that "edu" tells us the web page in question is associated with an educational institution. Other domain types include "com" (a commercial site), "org" (a non-profit organization), "gov" (a governmental site), and "net" (originally a network provider); the last part might also be a country code like "us" (United States), "uk" (United Kingdom), "jp" (Japan), or "ca" (Canada). The next part, "niu", specifies which educational institution we'll be looking at: in this case, Northern Illinois University. The final part, "www", tells us what kind of server we'll be accessing: in this case, a web server (slightly redundant given the definition of the "http" protocol, but standard).

FILE PATH
The third element included in a URL is the file path. This element tells the browser where on the server to look for the requested web page. In the example above, the file path specifies "english", so the web browser will look on the server for a folder called "english."  File paths can include nested folders as well. For example, consider the following URL: http://www.niu.edu/english/classes/ceh/main.html. In this example, the file path specifies several layers of folders. First, the browser will look for a folder called "english." Assuming it finds that folder, it will look for a folder called "classes" within the "english" folder; then it will look for "ceh" inside "classes."

FILE NAME
The final element to a URL is the actual file name of the web page in question. In the example above, the file name of the web page we are looking for is "english_home.html". Note that most web pages will end in ".htm" or ".html".
One special exception to this final URL element concerns servers that use "default documents." For example, if a user were to type in http://www.niu.edu/english/, one of three things would happen. First, they might get an error telling them that the page they requested couldn't be found. This would happen because the user did not enter a file name and the server has nothing to offer in its place. Second, the user might go to a default page. In this case, the web server knows that if users do not enter a file name, it should automatically look for a file called "index.html" and send that. This "index.html" file is a normal web page, so it could look like anything at all. Third, the user might receive a list of all the files currently in that folder (the folder called "english" in this case). This is called "directory browsing", and is basically the same thing that a user does when he or she looks at an index of the files on a floppy disk. Some servers allow directory browsing, and some do not. Thus, if a user does not enter a file name, the server will probably look for a file called "index.html". If it finds one, that page will be loaded. If it does not, the server will check whether it is allowed to list the contents of the folder. If it is, it will. If it isn't, the user will get an error message (probably to the effect that directory browsing is not allowed, or permission denied).
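The decision described above can be sketched in a few lines of Python. This is only an illustration of the logic, not how any particular web server is written: the function name resolve_folder_request, the allow_directory_browsing setting, and the use of a local folder to stand in for the server's folder are all assumptions made for the example.

```python
from pathlib import Path


def resolve_folder_request(folder, allow_directory_browsing,
                           default_document="index.html"):
    """Model what a server might do when a URL ends in a folder, not a file.

    Hypothetical helper: returns a (kind, names) pair describing the response.
    """
    folder = Path(folder)
    # 1. If the default document exists, serve it.
    if (folder / default_document).is_file():
        return ("default page", [default_document])
    # 2. Otherwise, list the folder if directory browsing is allowed.
    if allow_directory_browsing:
        return ("directory listing", sorted(p.name for p in folder.iterdir()))
    # 3. Otherwise, the user gets an error message.
    return ("error", [])
```

Real servers let the administrator choose the default document name and whether directory browsing is allowed, so the same URL can behave differently from one server to the next.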

To sum up how URLs work, let's take another look at our sample URL: http://www.niu.edu/english/english_home.html. The protocol tells the browser that it should look for a web page. The domain name tells the browser that it should look for a web server at an educational institution called NIU. The file path tells the browser to look for a folder called "english" on the web server. The file name tells the browser which page in the "english" folder it should copy and display for the user. That's all clear now, right?
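For readers who like to see this in code, here is a minimal sketch that splits the sample URL into the same four elements using Python's standard urllib.parse module. The function name split_url and the dictionary keys are invented for this example; they simply mirror the terms used in this tutorial.

```python
import posixpath
from urllib.parse import urlsplit


def split_url(url):
    """Break a URL into the four elements described in this tutorial."""
    parts = urlsplit(url)
    # Separate the folders (file path) from the last component (file name).
    file_path, file_name = posixpath.split(parts.path)
    return {
        "protocol": parts.scheme,
        "domain_name": parts.hostname,
        # Reading the domain name in reverse: most general part first.
        "domain_parts_reversed": parts.hostname.split(".")[::-1],
        "file_path": file_path,
        "file_name": file_name,
    }


info = split_url("http://www.niu.edu/english/english_home.html")
for element, value in info.items():
    print(element, "=", value)
```

With the sample URL, the protocol comes out as "http", the domain name as "www.niu.edu", the file path as "/english", and the file name as "english_home.html".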
 

Backtracking:

Now that you know how URLs work, the concept of backtracking might be fairly obvious. Backtracking is the process of moving up the FILE PATH one folder at a time, and seeing what each layer in the hierarchy returns.

Let's take an example:
http://www.engl.niu.edu/ceh/104/

This process is useful for discovering the relationship between any particular document and the hierarchical structure of the PATH from the server root (the page you get when looking at the Domain Name alone) to the page in question.
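Backtracking is normally done by hand in the browser's address bar, but the steps can also be sketched in Python. The function below is a hypothetical helper written for this example: it removes one layer of the path at a time and returns each URL you would visit, ending at the server root.

```python
from urllib.parse import urlsplit, urlunsplit


def backtrack(url):
    """Return the URLs visited when backtracking from url to the server root."""
    parts = urlsplit(url)
    # Split the path into its layers, ignoring empty pieces between slashes.
    segments = [s for s in parts.path.split("/") if s]
    steps = []
    while segments:
        segments.pop()  # move up one layer in the hierarchy
        path = "/" + "/".join(segments) + ("/" if segments else "")
        steps.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return steps


for step in backtrack("http://www.engl.niu.edu/ceh/104/"):
    print(step)
```

For the example URL above, this lists http://www.engl.niu.edu/ceh/ and then http://www.engl.niu.edu/. What each of those addresses actually returns (a default page, a directory listing, or an error) depends on how the server is set up.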
