FTP and troubleshooting

ISP policy, your URL

Get some guidance from your Web server's system operators: look for a "how to post your Web pages" page on their Web site. If your system is run like mine, each user has full control of their own Web site directory and files. I can update my Web files whenever I want. There used to be some systems that didn't give their users that much control: sometimes the rule was "send your stuff to us and we'll post it for you if we like it."

If your ISP has a "how to post" page, it should also tell you the format of the home page URL you'll be publishing. Unless you are paying extra for hosting of your own domain name (e.g. www.janedoe.com), your URL will include a directory path of some sort.

Some URL formats on Unix servers include a squiggle character called a tilde (~). It's usually a shift character at the upper left corner of your keyboard, above the Tab key. This tends to confuse folks the first time they encounter it; they sometimes mistake it for a dash. In Unix pathnames, the tilde character stands for "user's home directory," but it gets interpreted a little differently depending on what Unix process is seeing it. In shell commands, ~jdoe/ might be expanded to /home/jdoe/ (user J. Doe's home directory on that system). When the Web server sees /~jdoe/ in a URL, it might translate that to /home/jdoe/public_html/ (J. Doe's default Web page directory). The specific directory names could be different on different systems.
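Here's a rough Python sketch of those two interpretations of the tilde. The /home root and the public_html name are just the example values used above, and the function names are made up for illustration; a real shell or Web server has its own configuration for this.

```python
from pathlib import PurePosixPath

# Assumed layout, taken from the example in the text. Real systems may differ.
HOME_ROOT = PurePosixPath("/home")
WEB_SUBDIR = "public_html"


def shell_expand(path):
    """Expand a leading ~user roughly the way a Unix shell might."""
    if not path.startswith("~"):
        return path
    user, _, rest = path[1:].partition("/")
    if not user:
        return path  # a bare ~ (the current user) is not handled in this sketch
    return str(HOME_ROOT / user / rest)


def web_expand(url_path):
    """Map a URL path such as /~jdoe/gifs/fido.jpg to a file path."""
    if not url_path.startswith("/~"):
        return url_path
    user, _, rest = url_path[2:].partition("/")
    if not user:
        return url_path
    return str(HOME_ROOT / user / WEB_SUBDIR / rest)


print(shell_expand("~jdoe/notes.txt"))
print(web_expand("/~jdoe/gifs/fido.jpg"))
```

The same ~jdoe prefix lands in two different places: the shell version stops at the home directory, while the Web version adds the Web page subdirectory.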

URLs often end with a directory name. When a Web server gets an HTTP request for a URL ending with a directory name, it's treated as a request for the default html filename for that server, usually index.html. When you're putting a URL of that form in actual hyperlink code, you should always include the trailing slash: without it, the server typically has to send the browser a redirect to the slash form first, which costs an extra request. You can leave the slash out (as in the examples below) in email signatures, business cards, advertising and the like. People also usually omit the leading http:// protocol section in print.
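To make the trailing-slash business concrete, here's a small Python sketch of how a server might decide what to do with a request. It is not any particular server's logic: the handle_request function, the directories set, and the index.html default are assumptions for illustration.

```python
DEFAULT_FILE = "index.html"  # assumed default; servers can be set up differently


def handle_request(url_path, directories):
    """Return (action, path) for a requested URL path.

    directories is a hypothetical set of directory paths the server knows
    about, each written without a trailing slash.
    """
    if url_path.endswith("/"):
        # Directory request with the slash: serve the default file directly.
        return ("serve", url_path + DEFAULT_FILE)
    if url_path in directories:
        # Directory named without the slash: many servers reply with a
        # redirect, so the browser has to make a second request.
        return ("redirect", url_path + "/")
    return ("serve", url_path)


known_dirs = {"/~jdoe", "/~jdoe/gifs"}
print(handle_request("/~jdoe/", known_dirs))
print(handle_request("/~jdoe", known_dirs))
```

The second call is the case you avoid by writing the slash in your hyperlink code.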

URL examples for print formats
Hosted personal domain address            www.janedoe.com
ISP-domain address with home directory    www.isp.com/~jdoe

Usually the free Web page that comes with your Internet account is defined as being a personal page, meaning that you may not explicitly offer commercial services from it. ISPs normally have different rates for commercial pages. Ask your ISP how they define the difference.

File transfer client

You can use your Web browser to upload your html files, one at a time, but a real ftp client lets you select and upload multiple files in one step, and also lets you rename and delete in the remote directory.

I like to use WS_FTP for file transfer. I think it's pretty hard to beat for its simple, intuitive interface and general ease of use. It also has a feature where you can examine local and remote directory listings in Notepad windows. Having access to the text of the file list like that is sometimes handy for site maintenance.

Windows native ftp

Windows 95 and later versions come with an undocumented text-mode ftp client that you run from a DOS window. It also supports multiple file transfer, so you can use it if you have nothing else. It has pretty much the same command syntax as command-line ftp on Unix. Type ftp at a DOS prompt to start ftp, and bye or quit to terminate ftp and go back to the DOS prompt.

Some text-mode ftp commands
open        Open ftp address (or you can type ftp [address] from the DOS prompt)
?           List available commands
help        List available commands, or help [command] to display description of command
dir         List files in the working (current) directory on the remote system
pwd         "Print" working directory (display current directory on remote system)
cd          Change directory on remote system
get         Transfer a single file from the remote system to your computer
mget        Transfer multiple files from the remote system to your computer
put         Transfer a single file from your computer to the remote system
mput        Transfer multiple files from your computer to the remote system
delete      Delete a file on the remote system
status      Show current option status
close       Disconnect from remote system (without exiting ftp)
bye, quit   Exit ftp (to DOS prompt)
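
If you'd rather script an upload than type the commands, the same steps can be sketched with Python's standard ftplib module. This is only a minimal sketch: the upload_all and publish function names, and the "ready" and "public_html" directory names, are assumptions for illustration, and you'd supply your own host and login.

```python
from ftplib import FTP
from pathlib import Path


def upload_all(ftp, local_dir, remote_dir):
    """Upload every regular file in local_dir, roughly like cd plus mput."""
    ftp.cwd(remote_dir)                                  # cd
    sent = []
    for path in sorted(Path(local_dir).iterdir()):
        if path.is_file():                               # skip subdirectories
            with path.open("rb") as fh:
                ftp.storbinary(f"STOR {path.name}", fh)  # put
            sent.append(path.name)
    return sent


def publish(host, user, password):
    """Connect, upload the local "ready" directory, and disconnect."""
    # Leaving the with-block sends QUIT, like typing bye.
    with FTP(host) as ftp:                               # open
        ftp.login(user, password)
        return upload_all(ftp, "ready", "public_html")
```

A call would look something like publish("ftp.example.com", "jdoe", "secret"), with placeholder values replaced by whatever your ISP gave you.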

I found a text-mode ftp help page which covers Windows ftp.

Filenames and attributes on the server

If you were editing your pages on a DOS/Win3 system, they had the extension htm on your hard disk (DOS filenames allow only three-character extensions), and you had to change the extension after uploading. Because a Unix system treats woof.htm and woof.html as different filenames, you had to upload the new file, delete woof.html on the server (the old version), and then rename the new file from "woof.htm" to "woof.html".

Unix filenames are case-sensitive: index.html and INDEX.html are different filenames on a Unix system. The simplest thing is to keep everything lower case. Also, Unix Web servers are usually set up to expect Web page filenames to have the four-character extension html.
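
A quick check along these lines can catch naming trouble before you upload to a Unix server. It's a minimal Python sketch; the filename_problems function and its messages are made up for illustration.

```python
def filename_problems(name):
    """List reasons a filename may cause trouble on a Unix Web server."""
    problems = []
    if name != name.lower():
        problems.append("contains upper-case letters")
    if name.lower().endswith(".htm"):
        problems.append("uses .htm instead of .html")
    return problems


for name in ("index.html", "INDEX.html", "Woof.HTM"):
    print(name, filename_problems(name))
```

An empty list means the name follows the lower-case, four-character-extension convention.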

I believe Windows-based Web servers can handle links correctly to files with the Unix-style "html" extension, but require "index.htm" as the default filename. When I moved my site from Computech to Icehouse, I had to switch my home page filename from index.html to index.htm and change all my internal hyperlinks to the home page.

An internal hyperlink is a link between two html files that are both part of your Web site, or from one place to another within a single file. An external hyperlink points to a URL out on the Internet somewhere.

There are Internet Web sites with URL addresses that use the three-character extension ".htm". Don't change those to ".html", or you'll break the link. Sorry, but you can't just globally search-and-replace ".htm" to ".html".
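
What you can do instead is rewrite only the links that stay inside your own site. Here's a rough Python sketch, assuming your links are written as double-quoted HREF attributes and that anything with a protocol in it is external. The fix_internal_links name is hypothetical, and you should check the output by eye before trusting it.

```python
import re

# Matches href="..." in either case; only double-quoted values are handled.
HREF = re.compile(r'(href\s*=\s*")([^"]*)(")', re.IGNORECASE)


def fix_internal_links(html):
    """Change .htm to .html in internal links, leaving external URLs alone."""
    def repl(match):
        target = match.group(2)
        is_external = "://" in target or target.startswith(("//", "mailto:"))
        if not is_external:
            # Only touch a .htm that ends the filename (before any # or ?).
            target = re.sub(r"\.htm(?=$|[#?])", ".html", target,
                            flags=re.IGNORECASE)
        return match.group(1) + target + match.group(3)
    return HREF.sub(repl, html)


page = ('<A HREF="woof.htm">Woof</A> '
        '<A HREF="http://www.example.com/page.htm">Elsewhere</A>')
print(fix_internal_links(page))
```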

Internal hyperlinks can have either absolute or relative addresses. An absolute address is a pathname that looks like /foo/bar/whizbang.html and gives a full path starting from the root directory, symbolized by the initial slash in the pathname. External URLs are normally absolute, and include a protocol and system name as well. A relative address is either just a filename, or a relative pathname that's assumed to start from whatever directory is current. A relative pathname begins with a directory name, not a slash.

Your life as a Web author will be a lot easier if all your internal hyperlinks use relative addresses. That way they will work exactly the same on the server and when you are previewing locally. They'll also keep working if your provider changes the location of user pages on the server.
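
You can see why with Python's standard urljoin function, which resolves a link against the page it appears on much as a browser does. The two page URLs below are made-up examples, one on a server and one on a local disk.

```python
from urllib.parse import urljoin

# Hypothetical locations of the same page, on the server and on a local disk.
server_page = "http://www.isp.com/~jdoe/index.html"
local_page = "file:///C:/My%20Documents/Html/index.html"

for base in (server_page, local_page):
    # A relative address follows the page wherever it lives.
    print(urljoin(base, "gifs/fido.jpg"))
    # An absolute address starts over from the root of that system.
    print(urljoin(base, "/foo/bar/whizbang.html"))
```

The relative link stays next to the page in both cases. The absolute one jumps to the root, which on the server means leaving the ~jdoe directory and on the local disk means a path that probably doesn't exist.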

Server and local-disk web page directories

Unless your site is extremely complicated, you'll probably want to put all your html files in one directory, and your internal hyperlink relative addresses will just be filenames. But you may want to put your GIFs and JPGs in a separate subdirectory like "gifs". Your image tags will then need to look like <IMG SRC="gifs/fido.jpg">. This is an example of a relative address. The path section of an absolute address always starts with a slash; a relative address always begins with either a filename or a directory name.

I also like to put any other external files in separate subdirectories, such as zipfiles or movies. This way, when I do File Open in my tag editor, I only see my actual html files, plus any external CSS and JavaScript files. If you're using a converter, you'll probably still need to edit the generated code a bit: consider separate directories for word-processor source files and html files.

When you're updating a Web site, you'll usually modify a few files and need to upload and overwrite just the changed ones. I keep a local-disk subdirectory called "ready", copy changed files to it when preparing to upload, and delete them from the ready directory after uploading. My saved ftp-client entry for my Web server is preset to display this ready directory when I connect. This keeps me from getting confused while online and maybe uploading the wrong files.
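
I do this copying by hand, but the routine could be scripted. Here's a minimal Python sketch that treats "changed" as "modified after a given time", which is an assumption of mine rather than anything the ftp client does. The stage_changed and clear_ready names are made up for illustration.

```python
import shutil
from pathlib import Path


def stage_changed(html_dir, ready_dir, since):
    """Copy files modified after `since` (seconds since the epoch) to ready_dir."""
    ready = Path(ready_dir)
    ready.mkdir(exist_ok=True)
    staged = []
    for path in sorted(Path(html_dir).iterdir()):
        if path.is_file() and path.stat().st_mtime > since:
            shutil.copy2(path, ready / path.name)  # copy2 keeps the timestamp
            staged.append(path.name)
    return staged


def clear_ready(ready_dir):
    """Delete the staged files once they have been uploaded."""
    for path in Path(ready_dir).iterdir():
        if path.is_file():
            path.unlink()
```

You'd call stage_changed with the time of your last upload, send everything in the ready directory, and then call clear_ready.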

Here's an example of the kind of local-disk directory system I'm suggesting:

c:--|--Batch
    |--My Documents--|--(other data directories)
    |                |--
    |                |--
    |                |==Html====|==gifs
    |                |          |==zips
    |                |          |==movies
    |                |--Ready
    |--Windows
   etc.

Files in C:\My Documents\Html: index.html etc.

The Html directory holds only the HTML files (and external CSS and JavaScript files if any). All other types of files (pictures, zipfiles etc.) are in subdirectories. The Ready directory holds modified HTML files ready for upload to the server. When you have a new picture file or zipfile or something to upload, you can just copy that to the Ready directory too, or you can reproduce the subdirectories under the Html directory if you want.

On a Unix Web server, if your login is jdoe, things might look something like this:

home--|--elaine  (other users' home directories)
      |--george
      |--jerry
      |--jdoe----|--(possible other subdirectories)
      |          |--
      |          |==public_html==|==gifs
      |                          |==zips
      |                          |==movies
      |
      |--kramer
     etc.

Files in /home/jdoe/public_html: index.html etc.

Your local-disk Html directory and its subdirectories are a functioning model of the server public_html directory and its subdirectories. In fact, whenever your Internet connection is active, your pages should work exactly the same from your local drive as they do from the server, including external hyperlinks.

You can even name your top-level local html directory the same as the server one if you want, but there's no need to. Any subdirectories under that (gifs etc. in the example, or maybe more web-page directories for a larger site) do have to have matching names locally and on the server, to make it possible for the relative addresses in your internal hyperlinks to work on both.
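
A mismatch is easy to miss when it's only a difference in case, so here's a small Python sketch for comparing the two sets of names. It assumes you already have the server's subdirectory names from a dir listing. The missing_on_server function is hypothetical.

```python
from pathlib import Path


def missing_on_server(local_root, server_dirs):
    """Return local subdirectory names with no exact match on the server."""
    local = {p.name for p in Path(local_root).iterdir() if p.is_dir()}
    # Set difference is case-sensitive, like Unix filenames.
    return sorted(local - set(server_dirs))
```

For example, a local "Gifs" directory would be reported as missing if the server only has "gifs".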

