Web design; look before you leap

If you're going to do Web design and authorship, there are some important things to think about before you dive in. Settling them first will tend to save a lot of time fixing things later.

Some relate to organizing information so people can find things, and some relate to time-saving tools and techniques you will use to create and maintain your pages.

The easy way: HTML plus CSS

The introduction of Cascading Style Sheets (CSS) makes Web publishing easier. CSS separates document structure (HTML) from document formatting (CSS), and makes the formatting simpler, more efficient, and easier to maintain.

This section is about what CSS is good for, and deciding whether to use CSS in your Web project; I cover the details of implementing CSS in another page.

How we got where we are

Stupidity got us into this mess; why can't it get us out? (Will Rogers)

When the Web was first invented, it was supposed to be for everyone, not just as readers but also as authors. It was originally designed for structured academic text, with specific visual formatting to be left up to the user and the user's browser. Then people noticed the commercial applications, the popularity of the Web exploded, and the programmers took over.

Corporate clients demanded more control over presentation and formatting than was provided for in the original design of HTML. They got it in the short term through the use of four main work-arounds:

  1. Abuse of structural and non-standard HTML tags for formatting purposes (such as BLOCKQUOTE, BR, CENTER): tends to make pages hard to index and search, and harder to maintain.
  2. Giant borderless invisible nested tables, used to force relative arrangement of page elements: HTML tables are always complicated; putting all your content inside nested tables makes it that much worse, and tends to make pages slow to load and plot besides.
  3. Flashy-looking graphics used in place of text elements: lets you have whatever "look" you want, but such text can't be indexed or searched.
  4. JavaScript, CGI/Perl, PHP, VBScript, Active Server Pages, and other programmer's tools: diving into various programming languages, some based on C, obviously takes things out of the realm of the ordinary would-be Web author.

This style of Web-page coding, which relies on abuse of tags, tables, and graphics to force formatting in the HTML code, is called presentational HTML, to distinguish it from the more efficient HTML plus CSS technique.

The presentational HTML coding style leads to all page content being encased in multiple nested tables and DIV tags, making the code unnecessarily complex and bloated. Lots of corporate people today think it always requires a programmer to create or modify Web pages. Many of the programmer types who have been making money consulting at it are willing to let you think so too.

CSS to the rescue

In the original vision, HTML was going to be purely structural: the tags would only specify "this is a heading" or "this is a body paragraph." Formatting was to be left entirely to the user's choices of browser program and settings, to decide things like what font to use for headings, or what margins body paragraphs would have. Corporate clients insisted on control of presentation, and they got it the hard way.

One function of Cascading Style Sheets (CSS) is to give everybody some control over formatting, in a much simpler and more efficient way. Webmasters and page authors can each have different levels of control over the "look" of their pages, and room can still be left for the user's preferences.

Here's how CSS works. We have at least two files, possibly three, to think about in this process: two files written by the Web page author, located on the Web server, and a third file that the user may or may not have chosen to create on their own PC.

File 1: HTML page, widgets.html, on the Web server
  code: <H2>Type B widgets</H2>
  English translation of code: "This is a second-level heading"
  result seen in the browser: Type B widgets (in the browser's default heading font)

File 2: Site style sheet, basic.css, on the Web server
  code: H2 { font-family: Arial; }
  English translation of code: "Display all second-level headings in Arial font"
  result seen in the browser: Type B widgets (in Arial)

File 3: User's style sheet, mystyles.css, on the user's PC
  code: H2 { color: red; }
  English translation of code: "Display all second-level headings in red"
  result seen in the browser: Type B widgets (in Arial, and red)

You can already see how the "cascading" part works: the user's local style-sheet turns that heading red, without affecting the Arial font choice specified in the author/Webmaster style sheet on the server.
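A minimal sketch of how the two server-side files might be connected, assuming the page pulls in the site style sheet with a LINK element in its HEAD. The filenames are the ones used above; the TITLE text and the rest of the page skeleton are placeholders.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<HTML>
<HEAD>
  <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
  <TITLE>Widgets</TITLE>
  <!-- basic.css is the site style sheet on the server; in this example
       it holds the single rule:  H2 { font-family: Arial; }  -->
  <LINK REL="stylesheet" TYPE="text/css" HREF="basic.css">
</HEAD>
<BODY>
  <!-- Structure only: no FONT tags and no formatting attributes -->
  <H2>Type B widgets</H2>
</BODY>
</HTML>
```

The user's mystyles.css is not mentioned anywhere in the page. The user sets it up in the browser's own settings, and the browser combines its rules with those from basic.css.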

Another big "win" of the CSS method is seen when you specify most of your styles in a single external style-sheet file, for an entire site. A single line in a single 4K style-sheet file, such as:

H1, H2, H3, H4, H5, H6 { font-family: Verdana, Arial, sans-serif; }

... can set the font for all headings in all pages of your entire site, and you can change the formatting of all your headings at any time just by changing that one line.*

The equivalent in 1997-style presentational HTML is:

<H2><FONT FACE="Arial">Type B widgets</FONT></H2>

... and it has to be set and maintained individually for every heading tag in every page of the entire site.

There's provision in the CSS1 and CSS2 specifications for control over fonts, font-variants (such as bold, italic, small caps), text and background colors, element borders, whitespace for pages and particular elements, relative positioning of elements, line spacing and justification, and even effects such as drop shadows, all based on simple text elements, without resort to graphics.

You can also set up style definitions for CSS named "classes" and apply them to different types of elements. For example, I use a simple one called "small-print" which specifies a sans-serif font and a percentage size reduction, and I can apply that single CSS class to body text, lists, and tables, with a consistent effect.
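A sketch of what such a class might look like. The class name comes from the paragraph above; the particular font list, the 85% figure, and the sample text are assumptions made for illustration.

```html
<STYLE TYPE="text/css">
  /* One class definition: sans-serif font, reduced by a percentage.
     The 85% value is an assumed example, not a recommendation. */
  .small-print { font-family: Verdana, Arial, sans-serif; font-size: 85%; }
</STYLE>

<!-- The same class applied to three different element types -->
<P CLASS="small-print">Prices are subject to change without notice.</P>

<UL CLASS="small-print">
  <LI>Offer good while supplies last</LI>
  <LI>Not valid with other offers</LI>
</UL>

<TABLE CLASS="small-print">
  <TR><TD>Type B widget</TD><TD>Ships in 3 days</TD></TR>
</TABLE>
```

In a real site the rule would more likely live in the external style sheet than in a STYLE block; it is inline here only to keep the example in one piece.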

Selectors can also be grouped so that different heading levels get different fonts:

H1, H2 { font-family: "Comic Sans MS", Arial, sans-serif; }
H3, H4, H5, H6 { font-family: Verdana, Arial, sans-serif; }

Good and bad news about CSS

Full browser compliance with the CSS specs has been slow in coming.

Back to good news: Firefox, Opera, and to a less impressive extent Internet Explorer 6, all support a good portion of defined CSS properties. IE7 is supposed to be better than IE6 at CSS compliance.

There's usable support for font-family; the basic font variants (bold, italic); font colors; case transforms; left, right, and center alignment; first-line indents; basic margin, padding, and whitespace properties; and list style types. There's also support for various kinds of dimension settings, including absolute units (such as pixels, inches, cm, points, picas) and relative units tied to either window or font sizes, which are often preferable to absolute units for usability reasons.
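To make that list concrete, here is a small sketch that uses only properties of the kinds just mentioned, with relative units throughout. The specific values and colors are arbitrary choices for illustration.

```html
<STYLE TYPE="text/css">
  /* Percentage margins scale with the width of the window */
  BODY   { font-family: Georgia, "Times New Roman", serif;
           margin-left: 5%; margin-right: 5%; }

  /* Alignment, case transform, and font color on plain text */
  H1     { text-align: center; text-transform: uppercase; color: #003366; }

  /* em units scale with the current font size */
  P      { text-indent: 1.5em; margin-top: 0; margin-bottom: 0.5em; }

  /* Basic font variants */
  EM     { font-style: italic; }
  STRONG { font-weight: bold; }

  /* List style type and padding */
  UL     { list-style-type: square; padding-left: 2em; }
</STYLE>
```

Because nothing here is given in pixels or points, the layout follows the user's chosen text size and window size instead of fighting them.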

You can't do everything in the CSS specs yet, but you can definitely do enough to be worthwhile. When better support comes along, you'll be ready to take advantage of it.

Depending on your organization's values, you may or may not be able to use CSS with an external Web site. If the expressed priority is clear communication, CSS can definitely work. If there are visual-arts types insisting on a certain finicky, flashy "look" for things, then there may be no alternative, right now, to traditional mid-90s "tag-abuse" Web design. As browser support of CSS improves, this may change.

CSS is great for real, functional intranet content. It makes Web publishing easier for non-programmers and simpler to maintain, which is a big win for intranets. There's enough of CSS functional in current browsers to be very workable now, in the intranet role where communication and efficiency are more important than flashy visuals.

There are a lot of people still getting paid to do Web pages who cut their teeth on bloated presentational HTML, with 100% of every page's content inside complex nested tables, loading pages up with gratuitous JavaScript, and in general keeping the customers too intimidated to think Web publishing might possibly be within their reach. Many of them have now added CSS to their bag of tricks, and use it for formatting just enough so nobody can point to it as something they don't encompass ... but only as another element of the general snow job.

"Why tables for layout is stupid" is a user-friendly (even management-friendly) Web presentation that lays out all the advantages of the efficient CSS+HTML style of Web design over 1997-style presentational HTML with all layout controlled by tables.


Intranet vs. external Web

Intranet content is probably easier to do than external Web-site content. People tend to expect flashy visuals more with external Internet Web sites ... not so much because it's useful as because a lot of existing external sites are flashy-looking. With intranet pages, it's a little more obvious to people that clear communication should be a priority. Formatting can be kept relatively simple, although still pleasing and interesting, and a Webmaster can concentrate on rational navigation and site structure, and good writing.

Intranet and external content should be structured, written, and formatted differently. In neither case should it be structured to match the organization's org-chart. It's always going to be tempting to do that, partly because the org-chart's already done, and partly because it makes it so easy to tell who's responsible for writing what.

Branching an external Internet Web site just like the org-chart is especially bad, because the readers are going to be people from outside your company who have no clue about your org-chart structure. Even with intranet pages there's usually a better way.

In either case it's probably better to try to structure the site according to the way the target audience naturally thinks about the information that will be presented. For an external site, you can talk to your customers about it, look at sites posted by competitors, and consider industry journals, convention publications, and academic sources. For an intranet, you might talk to employees and middle managers, look at people's paper files and routing baskets, and the company newsletter. Remember you're laying out an information space, and not a container for desks, or accounting dollars.


Plan your information architecture

Frames: just say No

Some folks will disagree with me on this point. Some percentage of those folks will be Web consultants who just love to do frames sites, because they feel fairly safe that the suckers—excuse me, the customers—will never figure out how the frames code works. (Cynical? Who, me?)

The World Wide Web Consortium, which has always been the primary nonpartisan forum for interoperable Web standards and guidelines, offers a multi-platform Web editor and browser called Amaya. Amaya doesn't support frames. I think that makes it pretty clear how they feel about it.

This site had a frames architecture for all of calendar 1997. In fact by the end of that year any reader could choose online between four different frameset layouts, two horizontal and two vertical, with text links or pushbutton graphics, serving the same set of content pages. I was pretty pleased with myself that I figured all that stuff out with only moderate aspirin consumption. Then I discovered Jakob Nielsen's Alertbox site.

I now think that frames are more trouble—for the user—than they're worth, especially for intranet content. Dr. Nielsen's article Why Frames Suck (December 1996) is a good exposition of the fundamental usability problems with frames, and still relevant. If you search for "frames" inside Yahoo's WWW category you'll usually find more on this subject.

Briefly, these are the problems with frames:

  1. They break one of the fundamental paradigms of the Web: URL = Web page = bookmark. This is one of the first things a new Web user learns. On the ordinary kind of frames site, the only URL available for bookmarking is that of the frameset; there's no simple way to bookmark individual content pages, which users will always want to do.
  2. Users who still have 14- and 15-inch monitors don't have any screen space to spare; a frames design may use up space they can't afford to give up.
  3. Frames make printing Web pages harder. They also make it more complicated to view the source code for a particular page.
  4. Search engines have trouble properly indexing frames sites. Also, users following search engine hits are likely to land on a body document instead of the frameset, and see either a navigation bar with no content, or a content page without the navigation bar.
  5. For all these reasons, experienced Web users find frames instantly annoying. It's known from experiment that on sites that offer both frames and non-frames versions, a majority of users go straight to the non-frames version.
  6. Frames are a pain to code.

Many site designs still use frames for navigation, and there are complex design approaches available that address at least some of these concerns, if you really want to work that hard. See Nielsen's Why Frames Suck page for more on this. On the other hand, there are also several kinds of banners and logos becoming popular that you can display on your home page to declare your site frames-free.

Perhaps a few years down the road, nearly everyone will have large displays, and the HTML/CSS specs and browser compliance will have advanced enough, and frames will become sufficiently easy and workable. Maybe everyone will start using floating navigation bars, and bookmarking will still work the way users expect. But don't hold your breath till it happens.

For helpful details on coping with other people's frames sites, see the Frames section of my Basic browser tips page.

Logical arrangement of topics

The structure of your site needs to make enough sense to your readers so that they will be able to tell where to look to find what they need at the moment. For many sites, especially business sites, a hierarchical subject tree will make the most sense for the overall structure. For certain topics, you may need a chronological or other sequential structure.

If you were archiving past company newsletters on the intranet, for example, it would be natural to link them by date, although a short listing of top stories would probably be helpful as well. For complex procedures to be followed, a stepwise sequence of pages might make sense. Follow the structure implicit in the information itself.

Speed-tuning

Anyone who's been using the Web for a while knows its biggest problem is being too slow. For more on this see Jakob Nielsen's 1997 Alertbox article The Need for Speed. Some people even take to calling it the World Wide Wait.

I believe all external Web sites should still be tested and usable at dialup speed. There's no excuse for knowingly designing a site that only works on broadband. During the first years of the Web's popularity, a steady 80% of Web users had dialup modem access. Dialup users were fairly quick to adopt faster modems as they became available—it was a relatively cheap upgrade—but users with faster-than-modem access stayed in that 20% minority. Faster connection types such as DSL, cable modems, and satellite have a larger and expanding share now, but there are still millions of users stuck with dialup, in many cases due to geography rather than money. Besides making you look incompetent, designing for broadband only is therefore elitist and socially destructive.

Average and maximum HTML file sizes for any site must be held to limits. According to Dr. Nielsen's book Designing Web Usability, 30K is the maximum allowable size for a single HTML file to be viewed over dialup access. This is based on ten seconds as the longest a Web user can be expected to wait, after clicking a hyperlink, before their attention begins to wander.
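As a rough check on that figure, assume an effective dialup throughput of about 3 kilobytes per second. That rate is an assumed round number, roughly what a 28.8 kbit/s modem delivers in practice; it is not taken from Dr. Nielsen's book.

```latex
t = \frac{30\ \text{KB}}{3\ \text{KB/s}} = 10\ \text{s}
```

At the same assumed rate, a 60K page would take about twenty seconds, twice the limit.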

Even intranet pages viewed over a LAN should be kept small and fast. Ten seconds is a maximum permissible response time, not an optimum. We know from research that users work best with hypertext systems when response times are less than a second. Since page files tend to grow as they are edited and improved, your site design needs to include methods for breaking them into sections when they become too large.

Web pages loaded up with a lot of JavaScript, or with formatting forced by having all content inside giant borderless nested tables, contribute to a phenomenon called code bloat. Use only the JavaScript you really need for functionality, and lose the showoff stuff. Use fast, efficient CSS instead of obsolete table-based formatting.

Don't add graphics to Web pages just to be adding graphics; they add their own increment to the response time. Add them only when they add meaning or usability to your presentation. Make sure the graphics you do use are as small and fast as possible, and their image tags are coded so they don't unnecessarily delay page plotting. There's more about fast graphics on my Web graphics page.
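One common way to code an image tag so it doesn't hold up the page is to state the image's dimensions, so the browser can reserve the space and lay out the surrounding text before the image file arrives. A sketch, with a placeholder file name, size, and ALT text:

```html
<!-- WIDTH and HEIGHT match the image's real pixel size, so the browser
     can lay out the page without waiting for the file to download.
     ALT gives text-only users and search engines something to read. -->
<IMG SRC="widget-b.gif" WIDTH="120" HEIGHT="90" ALT="Type B widget, side view">
```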

Validation

You can also make your pages render faster by validating your code. First you'll need to include an appropriate document type declaration (DOCTYPE) and character encoding at the top of each page's code, and then you can validate your HTML and CSS code.
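A sketch of what those lines might look like for a page written to HTML 4.01 Strict and saved in the ISO-8859-1 encoding. Both choices are assumptions; use the document type and encoding that match your own pages.

```html
<!-- First line of the file: the document type declaration.
     Pages that still use some presentational tags could use the
     Transitional version instead:
     "-//W3C//DTD HTML 4.01 Transitional//EN"
     "http://www.w3.org/TR/html4/loose.dtd" -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<HTML>
<HEAD>
  <!-- Character encoding, placed before any other text in the HEAD -->
  <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
  <TITLE>Page title goes here</TITLE>
</HEAD>
<BODY>
  <P>Page content goes here.</P>
</BODY>
</HTML>
```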

W3C HTML validator http://validator.w3.org/
This online validator checks pages one at a time, but has more helpful feedback on errors it finds.
Web Design Group HTML validator http://www.htmlhelp.com/tools/validator/
This one can check a whole site in one step, up to 100 pages.
W3C CSS validator http://jigsaw.w3.org/css-validator/
Web Design Group CSS validator http://www.htmlhelp.com/tools/csscheck/

When you submit a page URL to the validator it may return errors, cited by line number in your code. Then you'll need to correct the errors, upload the corrected version, and resubmit the URL. Sometimes one mistake in your code can produce several errors in the validator output, but usually they will all refer to the same line number.

Web pages without a document type declaration cause readers' browsers to operate in quirks mode, in which the browser is prepared to cope with badly coded HTML. The browser has to receive the whole page, analyze the code, and produce a best guess at the results the page author was trying for. With a document type declaration and character encoding at the top of each page, and valid HTML code, the browser can operate in standards compliance mode, in which it immediately starts parsing and rendering the code as it comes in from the network, endwise, so to speak.

PDF abuse

Acrobat PDF is a semi-open document format controlled by Adobe. Free Adobe Reader is available to everyone for download,* and most people have it installed. Microsoft Office and OpenOffice.org can both output PDF format now, and if you install the free utility CutePDF Writer, any Windows program you have that can print can output PDF files. PDF files are easily published on the Web using file hyperlinks, and the user's browser will respond by launching Adobe Reader and opening the PDF file.

Unfortunately some organizations respond to all this by treating PDF format as just another way to Web-publish, for any content they haven't had time or budget yet to convert to plain HTML. This is a really bad idea. The proper use of PDF format on the Web is only for content meant to be printed out by the user, and never read online.

What's wrong with PDF? When a user clicks on a link to a PDF file, it takes longer to load and display, especially on slower hardware and/or Internet connections, and the browse interface is totally different in appearance and function. This can confuse the heck out of inexperienced users. Even users who understand what's happening will be annoyed, especially if it's not made clear which links go to PDF files and which to the expected HTML content.

If you do use PDF format for content that is meant to be printed, make it very clear which links cause the PDF file(s) to be sent. Expecting the user to see the PDF extension in their browser status bar when they hover the link isn't good enough: I never notice that. Put a big bold (PDF) label right next to the file hyperlink. If you can put those PDF file links in something like a bullet or number list to focus more attention, instead of buried in a text flow, so much the better. If there's just one PDF link, format it by itself between paragraphs like a block quote.
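A sketch of the labeled-list approach just described. The file names and link text are placeholders.

```html
<P>Forms to print and fill out:</P>
<UL>
  <!-- The bold (PDF) label sits right next to each file hyperlink -->
  <LI><A HREF="order-form.pdf">Widget order form</A> <STRONG>(PDF)</STRONG></LI>
  <LI><A HREF="price-list.pdf">Current price list</A> <STRONG>(PDF)</STRONG></LI>
</UL>
```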

There are cases where PDF has to be used for content like software documentation, that one would like to supply in printed form but can't for economic reasons. Usually this is either included on an install CD or made available for download. An excellent example is the OpenOffice.org user guide, which was properly set up with multi-level bookmarks for navigation, something that should be done anytime lengthy content is published in PDF format.

"Under construction"

Give me a break.

All Web sites are always under construction. Dynamic content and rapid updates and publishing are part of the whole point of doing things on the Web. "Under construction" has been identified by everyone as a dumb thing to put on any Web page since the first days of the popularity of the Web, for more than ten years now. It's hard to believe there are still a few sites that do this.

If you want to publish your site when there's a planned section or two you haven't had time to write yet, and you just gotta, you can say on the site that those sections aren't ready, but at least avoid that "U.C." phrase. If you do that, it would be better to have that element of your navigation scheme not be a link, perhaps with a "not ready" note next to it, rather than having a navigation link that opens a page with no content, or worse, produces an error.
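For example, a navigation list might carry the unfinished section as plain text until its page exists. The section names and file names here are made up for illustration.

```html
<UL>
  <LI><A HREF="index.html">Home</A></LI>
  <LI><A HREF="products.html">Products</A></LI>
  <!-- Planned section: plain text with a note, not a link,
       so nobody lands on an empty page or an error -->
  <LI>Support <SMALL>(not ready)</SMALL></LI>
</UL>
```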

In many cases it will be more professional looking and/or less annoying not to mention those sections at all until they are ready.


Site map & site search

These are very important elements of any medium to large Web site. Beginning Web authors (and government employees) often omit them. After you design a rational information architecture for your site, its logic will seem obvious to you; alternate ways of finding things will seem unnecessary.

We have to remember that every human being who sets out to publish on the Web designs a different information architecture. Whenever a Web user clicks an external hyperlink and moves to a different author's Web pages, he or she has a whole new system to learn. Sometimes the user is going to give up. We must provide backup ways of finding things.

There are even chronically impatient Web users who never try to learn site structures, who immediately use keyword search on every new site they see, if it's available.

If you provide site search, a site map, or both, every page in your site should link to them, as well as to your site's home page. Web users are going to drop out of the sky like paratroopers onto any individual page in your site, following "hits" from Internet search engines. They need a quick way to tap into the structure of your site, to answer questions like "Where am I?" and "What's going on here?"
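One simple way to do that is a short row of text links repeated on every page. In this sketch the class name and the three file names are assumptions.

```html
<!-- Repeated at the top or bottom of every page in the site -->
<P CLASS="site-links">
  <A HREF="/index.html">Home</A> |
  <A HREF="/sitemap.html">Site map</A> |
  <A HREF="/search.html">Search this site</A>
</P>
```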

Site search

Site search is more important than a site map. In the early days of the Web you needed a programmer to implement site search. Now there are free advertising-supported services such as FreeFind, and every site that's big enough to need it can have keyword search. You can have a separate page with the search box and button, put it on your home page, or at the top of every page. FreeFind's advertising appears on the search results pages.

Bravenet also offers free site search, along with many other free services for people with Web sites, not all of which are good things to do. Or a Google search on "free site search" will probably find you more sites that offer it.

If the service provides lots of advanced search features, you might want to provide a separate "advanced search" help page, so as not to confuse casual users, or see if the search service provides one, as FreeFind does.

One occasionally sees sites which offer a search box that uses Google or some other global search engine to keyword-search the whole Web. This is pointless. Web users already know how to search the whole Internet; in fact, that's probably how they arrived at one of your pages in the first place. Either provide keyword search that works within your own pages, or no search widget at all.

Site map

It's also pretty easy to provide a site map. There's a lot of variation in the things Web authors provide under the name "site map." Some are done as a giant graphic, set up as a clickable image map; some are sophisticated JavaScript-based dynamic widgets produced by programmers.

In its simplest form, a site map is just a list of some kind, that provides a hyperlink for each and every HTML page file in the site, together with some indication of how the pages are interrelated or structured.

We need to remember that a person using a site map may be doing so out of desperation. They may already have tried and failed to find what they need, using your carefully-designed navigation scheme. I think site maps should always be text-based, so that they will load relatively quickly, and therefore always be accessible for dialup/modem users.

A simple way to do this is using nested HTML lists; number lists or bullet lists or some combination. Create a list object where each item is a hyperlink to one of the top-level pages in your site. Then, under each top-level list item, add a subordinate nested list, with a link for each child page. Keep going like that until you've included all the levels in your site's tree-structure hierarchy.
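The steps above can be sketched as follows, for a made-up site with two top-level pages and three levels of hierarchy. All page names and file names are placeholders.

```html
<UL>
  <!-- Top-level page, with its child pages in a nested list -->
  <LI><A HREF="products.html">Products</A>
    <UL>
      <LI><A HREF="widgets.html">Widgets</A>
        <UL>
          <LI><A HREF="widgets-a.html">Type A widgets</A></LI>
          <LI><A HREF="widgets-b.html">Type B widgets</A></LI>
        </UL>
      </LI>
      <LI><A HREF="gadgets.html">Gadgets</A></LI>
    </UL>
  </LI>
  <!-- Top-level page with no child pages -->
  <LI><A HREF="support.html">Support</A></LI>
</UL>
```

Each nested UL goes inside the LI of its parent page, so the indentation in the browser mirrors the tree structure of the site.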

Having a site map does add a little to the required maintenance of a site: whenever you add or remove a whole page, you'll have to make the corresponding change in your site map. Fortunately most of us change content a lot more often than page structure, so it's not too difficult.

