
Cascading Style Sheets

Håkon Wium Lie

Thesis submitted for the degree of Doctor Philosophiæ
Faculty of Mathematics and Natural Sciences
University of Oslo
Norway
2005

© Håkon Wium Lie, 1994-2005

This work is licensed under a Creative Commons Attribution-NonCommercial 2.5 License.

Submitted 29th of March, 2005, as partial fulfillment of the degree
Doctor Philosophiæ
At the Faculty of Mathematics and Natural Sciences
University of Oslo
Norway

Series of dissertations submitted to the Faculty of Mathematics and Natural Sciences, University of Oslo.
No. 498

ISSN 1501-7710

Abstract

The topic of this thesis is style sheet languages for structured documents on the web. Due to characteristics of the web – including a screen-centric publishing model, a multitude of output devices, uncertain delivery, strong user preferences, and the possibility for later binding between content and style – the hypothesis is that the web calls for different style sheet languages than does traditional electronic publishing.

Style sheet languages that were developed and used prior to the web are analyzed and compared with style sheet proposals for the web between 1993 and 1996. The dissertation describes the design of a web-centric style sheet language known as Cascading Style Sheets (CSS). CSS has several notable features including: cascading, pseudo-classes and pseudo-elements, forward-compatible parsing rules, support for different media types, and a strong emphasis on selectors. Problems in CSS are analyzed, and recommended future research is described.

Inspiration

Style sheets constitute a wormhole into unspeakable universes. –James D Mason, 1994
Style sheet languages are terribly underresearched. –Philip M Marden, Ethan V Munson, 1999
In which form are you planning to publish the first edition of the Parsifal poem? Even if I like Latin letters, I'm afraid they are unpopular (especially among publishers). So, if the letters will be German, please make the type large and of good quality. The legibility of a text is very important to me. –Richard Wagner, in a letter to his publisher Ludwig Strecker

Table of contents

Abstract
Inspiration
Table of contents
List of figures
List of tables
Acknowledgements
Overview and summary of the thesis
Introduction
Structured documents
Style sheets prior to the web
Style sheet proposals for the web
  Robert Raisch's proposal (RRP)
  Pei Wei's proposal (PWP)
  Steve Heaney's proposal (SHP)
  Cascading HTML Style Sheets (CHSS)
  Joe English's proposal (JEP)
  Sketch of Simple Formatting Primitives (SSFP)
  DSSSL Lite
  Stream-based style sheet proposal (SSP)
  PSL96
  Summary and conclusions
Web requirements
Cascading Style Sheets
Problems in CSS
CSS for small screens
Cascading links
Future research
Conclusions
Glossary
References
Colophon

Acknowledgements

Having glanced through a fair number of doctoral dissertations myself, I believe the acknowledgements to be one of the most widely read sections. It is where the author, for a brief moment, can stray from the dryness of academic writing to express years of accumulated frustration and gratitude. Having worked on Cascading Style Sheets (CSS) for a decade, I have had my fair share of both frustration and gratitude. I'll try to express the latter in words while the frustration will be left to the leading. (Terms in bold are explained in the Glossary.)

My gratitude goes first and foremost to my parents, Sissel and Alfred Lie. My father set a fine academic example by getting his PhD at the age of 50 and the fact that I'm beating him by a decade or so is a compliment to him rather than to me. My mother's love of publications and her extensive information filing system have also contributed to my own urge to get my notes into order. I hereby pass the challenge of beating their father to a PhD on to my own children. Or, at least, to get their notes in order.

Two very special people deserve particular mention and thanks; without them, this thesis would not exist. Bert Bos joined me at a point when CSS had been named and waved, but was still a set of immature ideas rather than a coherent specification. During some short weeks around a white-board in the summer of 1995, CSS was hammered out. I will remember that time in Sophia-Antipolis as some of the best days and nights of my life. Karen Mosman is my publisher, muse and partner. Her enduring loyalty to my writing and to my person has changed both for the better. My writing and my person, that is; Karen herself is practically perfect.

The World Wide Web Consortium (W3C) has been a good home for CSS. I thank Tim Berners-Lee and Jean-Francois Abramatic for setting up the organizational structures necessary to make it happen. Tim also gets very special thanks for inventing the web, not patenting it, and leaving a stylistic gap to be filled. Among my W3C colleagues who were instrumental in supporting the work in the early days are Dave Raggett and Dan Connolly. Dave's browser, later named Arena, provided the perfect testing ground for CSS. Dan – after some healthy initial resistance – supported me when presenting CSS to the W3C HTML Editorial Review Board (ERB) which he co-chaired with Dave.

One small anecdote from the ERB meeting in April 1996 is worth recounting. Since I was there primarily to present CSS rather than take part in HTML discussions, I was given the task of taking minutes. It so happened that the name of the next HTML version was decided in this meeting. I hope I will be forgiven for disclosing an (anonymized) excerpt from the minutes:

The naming issue was raised, and the meeting switched 
into brainstorm mode. Suggestions fell into three groups:
 - version numbers: 3.1, 3.2, 3.5, 4.0
 - code names: Wilbur, Classic HTML, Unified HTML, 
   Common HTML, W3C HTML
 - compounds: HTML96, W3C HTML4
In the end, people preferred version numbers. NN 
argued that Wilbur was a major change that deserved a new 
major number: 4.0. Other people didn't like the zero in 
that name. "HTML 3.2" was selected after discussions and 
votes.

So, somewhat by accident, I was the first person to type the now ubiquitous string HTML 3.2 into a computer. A few small key strokes for a man, a giant leap for the web.

Inside W3C, the CSS Working Group has been the keeper of the flame. Some highly intelligent and dedicated people joined the group over the years. I would especially like to thank Ian Hickson, David Baron, Tantek Çelik, Daniel Glazman and Eric Meyer. Additionally, Steven Pemberton chaired the first W3C Workshop on Style Sheets, Chris Lilley served as chair for many years, and Ian Jacobs contributed his editorial skills. I am grateful to all of you.

In 1999, when CSS1 and CSS2 had been written, I joined Opera Software to ensure that the specifications were implemented correctly by at least one browser. Thanks go to Jon von Tetzchner and Geir Ivarsøy for founding a company worth working for. Geir, along with Karl Anders Øygard, is also the mastermind behind Opera's display engine that makes CSS shine on screens of all sizes. Thanks also go to Snorre Grimsby, Rijk van Geijtenbeek, Brian Wilson and Sue Sims for supporting CSS internally and externally.

Many people have been helpful while writing this thesis. I am indebted to Paul Grosso, Vincent Quint, Pamela Gennusa, Ethan Munson, Joe English, Harvey Bingham, Paul Prescod, Jany Quintard, Yann Dirson, Dave Pawson, Ian Castle, Didier P. H. Martin, Geir Ove Grønmo and Bette Harvey for answering my many questions about the past. I am grateful to Joe English, Wayne Gramlich, James Mason, Jeff Moore and Dan Connolly for allowing me to quote from their unpublished writings. Gunilla Petersén of the Royal Swedish Opera directed me to the inspirational Wagner quote.

This thesis concerns style sheet proposals for the web. I am grateful to the authors of the proposals for contributing a very interesting topic of research. Having analyzed their proposals without having access to their minds, I may have misunderstood or misinterpreted their work. If so, please contact me. Thanks also go to the participants on the www-talk, www-html and www-style mailing lists. Without the communities that formed on the mailing lists, the web would not have existed as we know it today.

CSS has borrowed many ideas from the MIT Media Lab where I spent two formative years. Thanks to Walter Bender and Andy Lippman for exposing me to those ideas. At the University of Oslo, Ole Hanseth and Gisle Hannemyr have motivated me to write up my notes into a thesis, and advised me on how this should be done. Without them, my notes would still be scattered around.

I am grateful to Anthea Vaughan for patiently copy-editing my drafts.

I would like to thank the people who created FrameMaker, GNU-emacs, and the Prince formatter. FrameMaker taught me typography, GNU-emacs gracefully accepted all my handcrafted tags, and Prince put this thesis onto paper.

Oslo, March 2005
Håkon Wium Lie

Overview and summary of the thesis

The topic of this thesis is style sheet languages for structured documents on the web. The hypothesis is that the web calls for different style sheet languages than does traditional electronic publishing. Further, the design of a style sheet language that fulfills the specific requirements of the web, namely Cascading Style Sheets, is described. The thesis can be divided into a why part (Chapters 1-5), a how part (Chapters 6-9), and where to go from here (Chapter 10).

Chapter 1: Introduction

The first chapter is an introduction to the topic of the thesis and related subjects. The historical context in which CSS was developed is described, including the development of HTML from its roots in structured documents to the presentational tags introduced by various browsers. Key concepts such as structured documents, style sheets and cascading are introduced.

Chapter 2: Structured documents

Style sheet languages and structured documents are mutually dependent. Without style sheets, structured documents cannot be presented, and without structured documents there is nothing for style sheets to present. Chapter 2 starts by introducing the ladder of abstraction which is proposed as a measuring tool for structured document formats. Such formats developed prior to the web (Scribe, LaTeX, ODA, SGML) and for the web (HTML, XML) are described. Finally, the role of transformation languages vs. style sheet languages is discussed.

Chapter 3: Style sheets prior to the web

Chapter 3 is the first chapter in which style sheets are discussed in some detail. The first part of the chapter establishes a set of criteria for style sheet languages; in order to qualify as a style sheet language, six components must be present: syntax, selectors, properties, values and units, value propagation and a formatting model. Three style sheet languages developed before the web (FOSI, DSSSL and P94) are described. The historical background of each is followed by a technical review.

Chapter 4: Style sheet proposals for the web

This chapter is a survey of the style sheet languages that were proposed for the web in the period 1993-1996. Nine different proposals are reviewed according to the criteria established in the previous chapter.

Chapter 5: Web requirements

Publishing on the web is different from other types of electronic publishing. Six web-specific requirements are discussed in Chapter 5. Neither the pre-web style sheet languages nor the subsequent style sheet language proposals fulfill all requirements for publishing on the web.

Chapter 6: Cascading Style Sheets

This chapter marks the start of the how section of the thesis. In this chapter Cascading Style Sheets (CSS) is described in some detail, and the language is evaluated according to the criteria that were established in Chapter 3. CSS is also evaluated against the web requirements discussed in Chapter 5.

Chapter 7: Problems in CSS

This chapter discusses problems in, and related to, the CSS specifications. These range from simple spelling errors to more complex questions such as whether or not some functionality fulfills its intended role. The chapter is loosely organized along an axis of complexity; the first part describes how simple errors have been handled. Thereafter, real and perceived problems in the specifications are discussed. The last section is dedicated to problems in the cascading mechanism.

Chapter 8: CSS for small screens

This chapter describes how cascading can be used to render web pages on small screens. By enforcing a carefully crafted browser style sheet, web pages are reformatted into narrow columns to avoid horizontal scrolling.

Chapter 9: Cascading links

A novel use of CSS to represent hyperlink information rather than stylistic information is discussed in this chapter. Cascading links make it possible to deploy new markup languages with hyperlinks in them, without user agents knowing how linking information is coded.

Chapter 10: Future research

This chapter points to areas of future research and development that are likely to yield beneficial results.

Chapter 11: Conclusions

The conclusions support the argument of the thesis: due to its characteristics, the web calls for style sheet languages different from those for traditional electronic publishing. The main contributions of the thesis are listed: the ladder of abstraction, the components of a style sheet language, the web requirements on style sheet languages, and CSS.

Introduction

Around 1990, Tim Berners-Lee developed three specifications that formed the basis of the World Wide Web project: the HyperText Markup Language (HTML) was developed as a document format for the web; Universal Resource Locators (URL) were added to represent links between the documents; and the HyperText Transfer Protocol (HTTP) was developed to transfer documents between machines on the internet [Berners-Lee 1999]. Both specifications and implementations were made freely available by CERN.

The web quickly gained momentum. With the launch of the National Center for Supercomputing Applications (NCSA) Mosaic browser in 1993 [Andreessen 1993a], users suddenly had an attractive browser to surf a steadily increasing set of interlinked documents. With a rising number of users, more authors were attracted to the web, and content proliferated.

In the beginning, HTML was a simple structured document format with markup tags added between text strings to indicate the role of the text. For example, a string of text could be marked as a paragraph, while another string could be marked as a clickable link. The elements in early HTML were logical rather than presentational. For example, HTML would mark some text as a heading but would not describe how the heading was to be presented. The presentation of text – including what font, color and size to use – was primarily determined by the browser.

Structure versus presentation

Scientific environments such as CERN value logic, structure and content more highly than aesthetics, imagery and style. This sense of structure is reflected in HTML. Each paragraph is marked as such and headings are given a numbered level to indicate their place in the document structure.

As the web attracted attention outside of scientific environments, authors started complaining that they did not have enough influence over the appearance of their pages. One of the most frequent questions asked by authors new to the web was how to change fonts and colors of elements. (I quote from a message sent to a mailing list for the developer community in this chapter, and will do so many times in chapters to come. Mailing lists were crucial for bringing together the web community in the early years, and hypertext archives of mailing lists quickly sprang up in the early 1990s. Today, a decade later, these archives provide valuable insights into the web's design and development.) This excerpt from a message sent to the www-talk [www-talk] mailing list early in 1994 [Andreessen 1994a] gives a sense of the tension between authors and browser implementors:

In fact, it has been a constant source of delight for me over the past year to get to continually tell hordes (literally) of people who want to – strap yourselves in, here it comes – control what their documents look like in ways that would be trivial in TeX, Microsoft Word, and every other common text processing environment: 'Sorry, you're screwed.'

The author of the message was Marc Andreessen, one of the programmers behind the popular NCSA Mosaic browser. He later became a co-founder of Netscape, which fulfilled authors' requests by introducing presentational tags in HTML. On October 13, 1994, Netscape announced [Andreessen 1994b] the first beta release of their browser. The Netscape browser supported a set of new presentational HTML tags (e.g. CENTER to center text) and more were to follow shortly.

Abstraction levels

By adding presentational tags to HTML, the language evolved from being an abstract, structured, markup language where authors marked the different logical roles of the text (paragraphs, headlines, lists and so forth) towards a concrete presentation language where emphasis is on the final form presentation of documents (fonts, colors and layout).

In traditional paper-based publishing, the reader receives a final form product. Each letter on a printed page has a fixed position, shape, size and color that cannot be changed by the reader. Electronic documents, however, are unfinished products that must be assembled before they can be presented to the human reader. In the assembly process – better known as formatting – many choices of how to present the document are made. For example, the browser must pick the fonts and colors to use when presenting the document on a color screen. The level of processing that an electronic document needs will vary considerably depending on what document format is used. As such, electronic documents are similar to furniture: some furniture comes pre-assembled while other items are bought in flat packages and the owner must do the final assembly. If a document format requires much processing, it is said to have a high level of abstraction. If the document format needs little processing, it is said to have a low level of abstraction.

Determining the right abstraction level is an important part of designing a document format. If the abstraction level is high, both the authoring process and the task of formatting the document become more complex. The author must relate to non-visible abstract concepts. On the receiving end, the browser must transform elements from abstract to concrete objects and this task is more complex if the elements are highly abstract. The benefit of a high abstraction level is that the content can be reused in many contexts. For example, a headline can be presented in large letters on printed sheets, and with a louder voice in a text-to-speech system.

Conversely, a low level of abstraction will make the authoring and formatting process easier (up to a point). Authors can use visually oriented WYSIWYG (What You See Is What You Get) tools, and the browser does not have to perform extensive transformations before presenting the document. The drawback of using presentation-oriented document formats is that the content is not easily reusable in other contexts. For example, it can be difficult to make presentation-oriented documents available on a device with a different screen size, or to a visually impaired person.

When transforming documents from one format to another, the chances are that the two formats are at different abstraction levels. In general, it is possible to transform documents from a higher to a lower abstraction level, but not the other way around. The ladder of abstraction is introduced in this thesis as a way of measuring the level of abstraction.
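The asymmetry can be illustrated with a minimal sketch. The role names and formatting values below are hypothetical and chosen only for illustration; they are not taken from any document format discussed in this thesis. Moving down the ladder is a simple lookup, while moving up is ambiguous because several logical roles may share the same concrete formatting.

```python
# Hypothetical mapping from logical roles (high abstraction) to concrete
# formatting (low abstraction). Two roles deliberately share one appearance.
FORMATTING = {
    "heading": {"font-weight": "bold", "font-size": "large"},
    "warning": {"font-weight": "bold", "font-size": "large"},
    "paragraph": {"font-weight": "normal", "font-size": "medium"},
}


def lower(document):
    """Transform (role, text) items into (formatting, text) items."""
    return [(FORMATTING[role], text) for role, text in document]


def candidate_roles(formatting):
    """List every logical role that could have produced this formatting."""
    return sorted(role for role, fmt in FORMATTING.items() if fmt == formatting)


formatted = lower([("heading", "Results"), ("paragraph", "Sales rose.")])

# Going back up the ladder: more than one role fits the first item,
# so the original role cannot be recovered from the formatting alone.
print(candidate_roles(formatted[0][0]))
```

Once the role has been replaced by its formatting, the information needed to tell a heading from a warning is no longer in the document.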

Presentational HTML

The introduction of presentational tags in HTML was a downwards move on the ladder of abstraction. Several of the new elements (e.g., BLINK) were meaningful only for particular output devices (how is blinking text displayed in a text-to-speech system?). The creators of HTML intended it to be usable in many settings but presentational tags threatened device independence, accessibility and content reuse.

The development of HTML into a presentation-oriented language also changed the power balance between authors and users. Structured documents must be formatted by the browser before presentation, and – to some extent – the formatting process can be influenced by the user. However, when the browser receives a document in its final form, the formatting process is complete and can no longer be influenced by the user.

Web authors had asked for more influence over the document presentation and welcomed this development, but there was also resistance in the web community. Many felt that the web had the potential of realizing personalized publishing where the reader – rather than the publisher – was in control. Content should be selectable based on reader preferences, and the medium and form of presentation should also be the choice of the reader. By turning HTML into a presentation language there was a risk of losing the degrees of freedom necessary to realize a user-centric publishing model.

Style sheets

Style sheets were proposed as an alternative to the evolution of HTML from a structural language to a presentational language. The term style sheet is used in traditional publishing as a way to ensure consistency [Chicago 1993] in documents. In the traditional publishing process, a manuscript is accompanied by a style sheet which serves as a running account of rules about diction and language usage adopted for a particular manuscript [Brüggemann-Klein&Wood 1992].

In the 1980s, publishing changed dramatically with the introduction of personal computers for use in the preparation of manuscripts. Electronic publishing offered tools to ease all stages of publishing from authoring, through editing, to printing. In electronic publishing, the term style sheets came to mean a set of rules regarding how to present content, rather than rules for how to author content. Style sheets would be specified by the designer and sent to the typesetter before printing. Typically, they would describe the visual layout of a text-centric document, including fonts, colors and white space.

In this thesis, the term style sheet refers to a set of rules that associate stylistic properties and values with structural elements in a document, thereby expressing how to present the document. Style sheets generally do not contain content, are linkable from documents, and they are reusable. This definition allows the term to be used in the context of electronic publishing both off and on the web.
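The definition can be made concrete with a minimal sketch. The data model below is an assumption made for illustration only: a style sheet is a list of rules, each pairing an element name with property/value declarations, and the names house_style, present, report and memo are hypothetical. The sketch shows a style sheet that contains no content and is reused by two documents.

```python
# A hypothetical, minimal model of a style sheet: a list of rules, each
# pairing a structural element name with property/value declarations.
house_style = [
    ("h1", {"font-size": "24pt", "color": "blue"}),
    ("p", {"font-size": "10pt"}),
]


def present(document, style_sheet):
    """Pair each (element, text) item with the declarations that apply to it."""
    presentation = []
    for element, text in document:
        style = {}
        for selector, declarations in style_sheet:
            if selector == element:
                style.update(declarations)  # later rules override earlier ones
        presentation.append((element, text, style))
    return presentation


# The documents hold only structure and content; both reuse the same sheet.
report = [("h1", "Annual report"), ("p", "Sales rose.")]
memo = [("h1", "Memo"), ("p", "Meeting at noon.")]

for element, text, style in present(report, house_style):
    print(element, text, style)
```

Changing house_style changes the presentation of both documents without touching their content.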

Style sheets were available in electronic publishing systems from around 1980 (see Chapters 2 and 3). Combined with structured documents, style sheets offered late binding [Reid 1989] of content and presentation where the content and the presentation are combined after the authoring is complete. This idea was attractive to publishers for two reasons. First, a consistent style could be achieved across a range of publications. Second, the author did not have to worry about the presentation of the publication but could concentrate on the content.

Indeed, some authors found it liberating not having to worry about presentational details in the authoring process [Cailliau 1997]. However, most authors ended up using authoring systems which emphasize the presentation rather than the structure [Sørgaard 1996].

WYSIWYG – a competing model

WYSIWYG – What You See Is What You Get – is a competing model for authoring documents. WYSIWYG applications constantly update a final form presentation. As the author types, the screen is updated to reflect the page layout that would result should the document be printed at that point.

Instead of the late binding between presentation and content, employed by structured documents and style sheets, WYSIWYG offers instant binding; all editing operations result in instant visual changes to the final presentation. This approach often results in documents whose authors emphasize the final presentation – which is typically a printed document – rather than the logical markup.

Several applications try to combine the concept of structured documents with WYSIWYG editing, including Adobe's FrameMaker [FrameMaker], Microsoft's Word [MS-Word] and W3C's Amaya [Amaya]. Typically, these applications offer the author several views of the document, one of which is WYSIWYG and others that are more structural. This makes it possible to author structured documents with a WYSIWYG tool. There is a risk associated with using WYSIWYG tools, however: they also allow authors to make purely presentational modifications which may not be consistent with the document structure.

Web characteristics

Research has shown that when documents are authored with the printed copy as the final target, it is difficult to motivate authors to work on a logical level rather than a visual level [Sandahl 1999]. With the emergence of the web, however, the possibilities for reuse of content increase. Instead of printing and distributing documents on paper, web documents are transferred electronically to the user's computer. The shift towards electronic distribution of documents has several key characteristics that influence both the authoring process and style sheet languages.

Late binding becomes later binding: On the web, documents are transmitted in electronic form to the user's computer. The late binding between content and presentation of electronic publishing becomes even later binding on the web. The binding no longer takes place in the publishing house but, rather, in the user's computer. This increases the freedom of the presentation but also poses a new performance challenge since the binding takes place while the user is waiting. Also, the author is not present to make sure that the presentation is correct.

Paper-centric publishing becomes screen-centric: Before the advent of the web, most electronic documents ended up as printed documents. They were edited and processed on computer screens but, most often, the final media type was print. On the web, most users view documents on a screen.

Single output becomes multiple outputs: Although screens are the primary media type on the web, many other types exist. Authors do not know what kind of output device will be employed by a user. There is no longer one final form presentation, there are many. Therefore, it is important that style sheets can describe presentations for multiple output devices.

Author control becomes shared author/user influence: Since the binding between content and presentation takes place in the user's computer, influences from several sources may be combined to form a presentation. Given this freedom, it seems reasonable that the user – as well as the author – should be able to influence the presentation. Personalized presentations based on the needs and preferences of the user become possible. This is different from other publishing environments where authors and publishers have full control of the presentation.

Stand-alone documents become hyperlinked: The web is a large collection of hyperlinked documents and information that was previously expressed as textual references can now be active hyperlinks.

Dependable delivery becomes uncertain: Web resources are distributed across many connected computers and the chance of a resource not being available is significant. Another change is that the web is more likely to fail than are in-house publishing systems. It is natural to make the style sheet available on the web, but the resource may not always be available at the user's end.

Thus, with the introduction of the web the focus of style sheets is shifted from being an author's tool in the authoring process to being a tool for content reuse after the content has been generated. Style sheets on the web are potentially more important than are style sheets for paper-centric publishing because the possibility of content reuse is greater. Just as the nature of style sheets changed from paper-based publishing to electronic publishing, so has the nature of style sheets changed again for web publishing.

Style sheet mechanisms for the web

A crude form of style sheets was hard-coded into the first WWW client implemented on the NeXT machine at CERN. However, no specification for style sheets was written and no syntax for a style sheet language was proposed; it was considered a matter for each browser to decide how to best display pages to its users.

The potential benefits of using style sheets on the web are significant. A well-developed style sheet mechanism would give authors a richer stylistic vocabulary than they could hope for in an evolving HTML. Also, HTML would remain a structured markup language that worked on a wide range of devices.

For these reasons, many people on the www-talk mailing list [www-talk], which was the electronic meeting place for the early web community, agreed that the web could benefit from style sheets. However, there was disagreement as to whether or not the web would require a new style sheet language or if one of the existing languages, designed primarily for paper-based publishing, would be suitable.

Several style sheet languages for the web were proposed in 1993 (see Chapter 4: Style sheet proposals for the web) but none of them gained momentum. This was mostly due to lack of support in browsers; as long as Mosaic – by far the most popular browser of its day – did not support style sheets there was little motivation for authors to write them. Also, none of the proposals were developed to a stable state. A successful style sheet language for the web had to be compelling enough both for browser developers to implement, and for authors to use.

CSS

Three days before Netscape announced their new browser, this author published the first CSS proposal (named Cascading HTML style sheets – a proposal) [Lie 1994] on the web. In addition to describing fonts, colors and layout of documents – which several proposals had done previously – CSS introduced new functionality to account for the differences in publishing imposed by the web. The concept of cascading allowed both authors and users to influence the presentation of a document:

The proposed scheme supplies the browser with an ordered list (cascade) of style sheets. The user supplies the initial sheet which may request total control of the presentation, but – more likely – hands most of the influence over to the style sheets referenced in the incoming document.

Negotiating between the needs and wishes of readers and authors was one of the main ambitions of CSS. If successful, authors would get their fair share of influence over the presentation and would not feel compelled to use presentational HTML and other tricks. Readers, on the other hand, would be served documents in a form in which they could choose between accepting the author's suggested presentation or specifying their own.

In many cases there would be no conflict between the author and the reader. Neither would want to specify the presentation of the document. In such cases, it is important for the browser to have a default style sheet that describes a default presentation of HTML documents. CSS, therefore, defines three possible sources for style sheets: authors, readers, and browsers. CSS is able to combine style sheets from these sources to form the presentation of a document. The process of combining several style sheets – and resolving conflicts if they occur – is known as cascading.
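The combination of the three sources can be sketched as follows. This is a deliberately simplified illustration and not the cascading algorithm of the CSS specifications: it assumes a fixed precedence in which author declarations override reader declarations, which in turn override browser defaults, and it ignores importance, selector specificity and rule order. The names PRECEDENCE, cascade and sheets are hypothetical.

```python
# Assumed precedence, lowest first. Real CSS also weighs importance,
# selector specificity and the order of rules.
PRECEDENCE = ["browser", "reader", "author"]


def cascade(sheets, element):
    """Combine declarations for one element from several style sheet sources."""
    resolved = {}
    for source in PRECEDENCE:
        declarations = sheets.get(source, {}).get(element, {})
        for prop, value in declarations.items():
            resolved[prop] = (value, source)  # a later source wins a conflict
    return resolved


sheets = {
    "browser": {"h1": {"font-size": "2em", "color": "black"}},
    "reader": {"h1": {"color": "navy", "font-family": "serif"}},
    "author": {"h1": {"color": "blue"}},
}

for prop, (value, source) in sorted(cascade(sheets, "h1").items()):
    print(f"{prop}: {value} (from {source})")
```

In this sketch only the color property is in conflict. The font size still comes from the browser's default style sheet and the font family from the reader, so all three sources contribute to the final presentation.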

The CSS development

The first CSS proposal was put forward in the spirit of open exchange of ideas on how the web should develop, and discussions took place on public mailing lists. A number of people responded to the proposal [Bos 1994][Behlendorf 1994][Wei 1994] and the draft was developed further. During the course of 1995, approximately eight revisions were published. The last of these, published in December 1995, was declared to be stable and browser vendors were encouraged to use it as a base for implementations [Lie 1996].

With a few minor exceptions, the syntax from the draft of December 1995 has remained stable and the first section of the specification can still serve as an introduction to CSS:

Designing simple style sheets is easy. One only needs to know a little HTML and some basic desktop publishing terminology. E.g., to set the text color of 'H1' elements to blue, one can say:
  H1 { color: blue } 
The example consists of two main parts: selector ('H1') and declaration ('color: blue'). The declaration has two parts, property ('color') and value ('blue').
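The anatomy of the rule can be made concrete with a small sketch that splits a rule into its parts. It assumes a rule with a single selector and a single declaration, as in the example above; the function name parse_rule is hypothetical and the code is not a conforming CSS parser.

```python
def parse_rule(rule):
    """Split a one-declaration rule into (selector, property, value)."""
    selector, _, rest = rule.partition("{")
    declaration = rest.rpartition("}")[0]      # text between the braces
    prop, _, value = declaration.partition(":")
    return selector.strip(), prop.strip(), value.strip()


selector, prop, value = parse_rule("H1 { color: blue }")
print(selector, prop, value)
```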

The CSS1 specification became a W3C Recommendation [CSS1 1996] in December 1996. In May 1998 CSS2 became a W3C Recommendation [CSS2 1998]. Chapter 6 (Cascading Style Sheets) describes the development of the Recommendations in more detail.

A decade after the first CSS proposal was published, all major web browsers support CSS and a majority of web pages use CSS. It may still be too early to fully evaluate CSS and its impact on the web, but it is possible to study the design of CSS and compare it with other style sheet languages and style sheet language proposals.

Summary and conclusions

This chapter introduces some of the key concepts of this thesis. HTML was developed as a simple structured document format for the web. As web authors requested more presentational influence over their documents, HTML started developing into a presentational rather than a structural language. To stop this downwards slide on the ladder of abstraction, CSS was developed as a style sheet language for the web. Style sheets have been part of electronic publishing systems since around 1980. On the web, the focus of style sheets is shifted from being a tool in the authoring process to being a tool for content reuse after the content has been generated.

The thesis explores in more detail why the web requires style sheet languages different from those in other kinds of publishing, and how such a language can be designed. Before doing so, however, it is necessary to discuss two other topics. First, structured documents must be understood since style sheets are applied to structured documents. Second, style sheet languages developed before the advent of the web must be researched to determine if any of these languages are suitable for use on the web. This is done in Chapter 2 and Chapter 3, respectively.

Structured documents

Style sheet languages and structured document formats are mutually dependent on each other. Without style sheets, structured documents cannot be presented, and without structured documents there is nothing for style sheets to present. Due to the strong relationship between the two, it is important to understand structured documents when studying style sheet languages. Some structured document systems that have been most influential on style sheet languages are discussed in this chapter.

In a seminal work titled Structured Documents [André, et al. 1989], the topic is defined as:

A document may be described as a collection of objects with higher-level objects formed from more primitive objects. The object relationships represent the logical relationships between components of the document. For example, the present document is described as a book at the highest level. The book is subdivided into chapters, each chapter into sections, subsections, paragraphs, and so forth. Such a document organization has come to be known as the structured document representation.

One important feature of the structured document representation is that it has a certain level of abstraction. The level of abstraction is especially important when the structured document is combined with a style sheet to form a presentation. Therefore, the first part of this chapter discusses abstraction levels in structured documents and proposes a ladder of abstraction to measure the level of abstraction in web document formats.

The second part of the chapter describes seminal structured document systems, namely Scribe; LaTeX; Open Document Architecture (ODA); Standard Generalized Markup Language (SGML); HyperText Markup Language (HTML); and Extensible Markup Language (XML). Each of the systems is briefly described historically and technically with special emphasis on their relationships with style sheet languages.

A third part discusses the relationship between transformation languages and style sheet languages on the web.

Abstraction levels

In his book, Language in Action, Hayakawa [Hayakawa 1940] introduces the notion of a linguistic ladder of abstraction. At the bottom of the abstraction ladder is an object. As an example, Hayakawa uses a cow named Bessie. The cow is composed of muscle, bones, skin and other biological parts. As the first step up the ladder, we disregard the biology inside the cow but retain its physical properties – for example its color, size and shape – and we call it Bessie. Bessie is just one of many objects that can be classified as cows. On the farm where Bessie lives, there are many other kinds of animals that can all be referred to as livestock. The climb up the ladder of abstraction can continue to farm assets and wealth. This concept is illustrated in Figure 1.

The ladder of abstraction. Illustration reprinted from Hayakawa [Hayakawa 1940].

A similar example of abstraction levels can be found in the field of computer networking. In 1983, the International Organization for Standardization (ISO) developed a network model called the Open Systems Interconnection (OSI) Reference Model, which defines a framework for computer communications. The ISO/OSI Reference Model has seven layers, each of which has a different level of abstraction. The seven layers are: physical, data link, network, transport, session, presentation and application.

I believe the notion of an abstraction ladder is useful when evaluating document formats. How high a certain document format is on the ladder will determine the complexity of formatting the document into a presentation. Since the formatting of a document is specified by a style sheet, the abstraction level is a crucial feature for the success of style sheets.

The vertical nature of a ladder corresponds to how one describes abstraction levels as high or low. Typical characteristics of document formats that are high on the ladder of abstraction are:

The information needs processing in order to be presented. For example, in order to render an HTML document visually, the words must be broken into lines, fonts must be selected, and the characters must be turned into rasterized glyphs.

The information can be processed and presented in many different ways. Presenting a document visually is only one of several possibilities; others include aural renderings and braille embedding.

The information is represented in a compact manner. Representing a character with an eight-bit code is more compact than representing an image of the same character.

Conversely, documents written in formats that are low on the ladder of abstraction need less processing in order to be presented, they have less flexibility of presentation, and they are less compact.

Another important observation is that it is generally possible to transform documents downwards on the ladder but much more difficult to move the other way [Lie&Saarela 1999]. For example, graphical web browsers – in collaboration with the windowing system – rasterize HTML documents into pixels and thereby move information downwards on the ladder of abstraction. Optical Character Recognition (OCR) software attempts to climb the ladder by turning images into text, but OCR systems only work under certain conditions and are prone to errors. Similarly, due to the halting problem, it is impossible to devise a general algorithm that lifts documents written in a Turing-complete language to a higher level of abstraction [Connolly 1994a].

In the context of web document formats, I believe the following criteria can be used to establish the steps in the ladder of abstraction:

Is the text human-readable? That is, if the document is presented to a human reader, will he/she be able to read the document?

Is the text machine-readable? That is, does the format have a notion of numbered characters, or does it represent text as images – in which case the text is not available?

Is the logical order of text preserved? That is, do documents written in the format have a notion of the logical reading order of the content?

Is the document scalable? That is, can the document be zoomed in without introducing visible artefacts?

Is reflow possible? That is, can text be reflowed into lines, columns and pages of different dimensions?

Can the roles of the various text elements be represented? For example, can the author mark part of the text as a headline, a paragraph, or perhaps as the name of a variable in a computer program? Being able to distinguish between these roles is important. When making documents available in braille, for example, some text should be contracted (e.g. headlines), while other text should not (e.g. variable names) [Lorimer 1996].

Is the format device-independent? That is, can documents written in the format be rendered into many different devices (e.g. printers, screens, braille printers, and text synthesizers) or are documents intended for a single type of device?

Does the format contain application-specific semantics? HTML is a general document format that does not attempt to describe semantics from more specialized fields, e.g. mathematics and chemistry, and therefore does not contain application-specific semantics. Formats that contain application-specific semantics tend to be higher on the ladder of abstraction.

Table 1. A comparison of document formats on the ladder of abstraction.

                                  GIF, PNG  private XML  PDF  XSL-FO  HTML  MathML
                                            vocabulary
application-specific semantics?   no        no           no   no      no    yes
device-independent?               no        no           no   no      yes   yes
roles known?                      no        no           no   no      yes   yes
text in logical order?            unknown   unknown      no   yes     yes   yes
reflow possible?                  no        unknown      no   yes     yes   yes
scalable?                         no        unknown      yes  yes     yes   yes
text machine-readable?            no        yes          yes  yes     yes   yes
text human-readable?              yes       yes          yes  yes     yes   yes

Table 1 shows the relative positions of various document formats on the ladder of abstraction. Some notes to the table:

GIF [GIF 1990] and PNG [PNG 1996] are bitmap image formats rather than document formats, but images are often used to represent documents. Fax transmission is a common example outside the web.

PDF [Adobe 1993] is a document format developed by Adobe Systems. PDF is a presentation-oriented format and has no concept of, for example, paragraphs and headings. Many users have discovered this when trying to copy content from PDF documents laid out in several columns. When selecting text, the selection will span across multiple columns and thereby mix text from several parts of the document into the same selection. Recent versions of PDF have introduced functionality to retain a document's logical structure in PDF [Adobe 2001].

XSL-FO refers to a document consisting of formatting objects as defined in the XSL Recommendation [XSL 2001]. XSL-FO is discussed later in this chapter.

XML [XML 1998], in which several of the emerging formats are written, is also included in the table and refers to documents published using private XML vocabularies where the semantics are not universally known.

The rating of HTML is based on a best-case scenario where the author makes use of semantic elements and does not alter the reading order of elements by using features such as positioning or tables. It may be argued that most HTML documents do not follow these conventions.

MathML is a W3C Recommendation for mathematical notation [MathML 1998].
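The comparison in Table 1 can also be expressed as data. The sketch below transcribes the table and orders the formats by the number of criteria answered with "yes". Counting the answers is an assumption made for illustration only; the thesis does not define a numeric score, and the names CRITERIA, TABLE_1 and rung are hypothetical.

```python
# Criteria from the bottom of the ladder to the top, as in Table 1.
CRITERIA = [
    "text human-readable",
    "text machine-readable",
    "scalable",
    "reflow possible",
    "text in logical order",
    "roles known",
    "device-independent",
    "application-specific semantics",
]

# One answer per criterion, transcribed from Table 1.
TABLE_1 = {
    "GIF, PNG": ["yes", "no", "no", "no", "unknown", "no", "no", "no"],
    "private XML vocabulary": ["yes", "yes", "unknown", "unknown",
                               "unknown", "no", "no", "no"],
    "PDF": ["yes", "yes", "yes", "no", "no", "no", "no", "no"],
    "XSL-FO": ["yes", "yes", "yes", "yes", "yes", "no", "no", "no"],
    "HTML": ["yes", "yes", "yes", "yes", "yes", "yes", "yes", "no"],
    "MathML": ["yes"] * len(CRITERIA),
}


def rung(answers):
    """Assumed score: the number of criteria answered with 'yes'."""
    return answers.count("yes")


# Formats ordered from low to high on the ladder under this assumption.
ranking = sorted(TABLE_1, key=lambda name: rung(TABLE_1[name]))
for name in ranking:
    print(f"{name}: {rung(TABLE_1[name])} of {len(CRITERIA)}")
```

Under this assumption the formats come out in the same left-to-right order as the columns of the table, with bitmap images lowest and MathML highest.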

Having established the ladder of abstraction as a measuring tool for structured document formats, the next section discusses structured document systems in more detail.

Structured document systems

Beginning around 1980, there was an active research community in the field of electronic publishing and structured documents. The community published their results in the proceedings of the Electronic Publishing conferences, in the journal Electronic Publishing – Origination, Dissemination and Design [Electronic Publishing], and Cambridge University Press published a series of books on the topic. Richard Furuta lists many of the important works in Important papers in the history of document preparation systems: basic sources [Furuta 1992].

The researchers generally agreed on the benefits of vendor-neutral document formats to facilitate document exchange. The benefits of structured documents were also well understood. There were, however, several approaches to structured documents, and competing formats were developed. This section describes and discusses six of them.

One line of development started in the late 1970s when Brian Reid developed Scribe [Reid 1980]. Scribe pioneered the notion of structured documents and enforced a distinction between logical markup and presentational templates in the authoring process. The Scribe philosophy was continued in Leslie Lamport's LaTeX which was first released in 1985 [Lamport 1986]. LaTeX is a macro package on top of Donald Knuth's TeX program which serves as the low-level formatter [Knuth 1984].

Open Document Architecture (ODA) is a set of ISO standards to facilitate the electronic exchange of documents [ODA]. ODA documents can contain both the logical and the presentational representation of a document.

Standard Generalized Markup Language (SGML) [SGML 1986] and its predecessor GML were developed by Charles Goldfarb and colleagues during the 1970s and 1980s [Furuta, et al. 1982]. SGML became an ISO standard in 1986.

These six systems (Scribe, LaTeX, ODA, SGML, HTML and XML) are described in this section. Before discussing each one, it may be helpful to informally list the perceived ambitions and achievements of the six systems; see Table 2.

Table 2. The ambitions and achievements of the six different structured document systems.

System  Primarily a system to  Notion of document  Notion of document  Encoding  Reference                       Level of complexity  Main achievement
        define new languages?  semantics?          presentation?
Scribe  no                     yes                 yes                 text      implementation                  moderate             inspired LaTeX
LaTeX   no                     yes                 yes                 text      implementation                  moderate             de facto format in scientific publishing
ODA     no                     yes                 yes                 binary    specification                   high                 became ISO standard
SGML    yes                    no                  no                  text      specification                   high                 became ISO standard, inspired HTML and XML
HTML    no                     yes                 some                text      specification & implementation  moderate             universally understood hypertext format
XML     yes                    no                  no                  text      specification                   moderate             syntactic basis for emerging formats

For a more formal taxonomy of document formats, see The Origin of (Document) Species [Khare&Rifkin 1998].

In addition to the achievements listed in Table 2, all systems should be credited for having inspired authors and programmers to see the benefits of structured documents.

The discussions of the various structured document systems below do not follow a strict pattern. The systems vary widely in how well they are understood, how much use they have seen, and how much information is currently available about each system. The primary goal of the descriptions is not to perform a comparative analysis, but rather to discuss aspects of these languages which this author finds interesting in the context of style sheets.

Scribe

The Scribe system was developed in the late 1970s by Brian Reid at Carnegie-Mellon University [Reid 1980]. Scribe is noteworthy for pioneering the structured approach to authoring. It encourages authors to work with predefined logical objects, and authors typically produce documents in their final form without having to specify any of the formatting.

The Scribe system changed somewhat over the years. The discussion in this chapter is based on Scribe as described in Scribe Introductory User's Manual from 1980 [Reid&Walker 1979]. The description attempts to give a general overview of Scribe, and not all features are discussed.

A simple document

A Scribe document can be remarkably simple:

@Make(Text)
@Device(Diablo)
@Heading(Comrades and Strangers)

The example above uses three key concepts of Scribe: document types, commands, and formatting environments. The first line chooses a particular document type (Text) from a set of different document types. The second line is a command which specifies that the document should be printed on a specific device. The third line specifies that a certain string (Comrades and Strangers) is the heading of the document.

Document types

An installation of Scribe comes with a database of document types. The Scribe documentation lists 11 different document types: Text (which is default), Article, Report, Manual, Thesis, Brochure, Guide, Letter, Letterhead, ReferenceCard, and Slides. A Scribe document typically starts by selecting which document type to use:

@Make(Thesis)

The system administrator of the Scribe installation is expected to change the database to fit local needs. For example, the formatting requirements of a dissertation vary from one university to another, and the differences can be accounted for in the Thesis document type. In theory, authors can write their dissertations without thinking about the formatting requirements and can concentrate rather on the content.

Document types influence both the content model and the presentation of a document. For example, the Thesis document type allows and expects the TitlePage and various other environments to be used:

@Make(Thesis)
@Device(Diablo)
@Begin(TitlePage)
  @TitleBox(Comrades and Strangers)
  @CopyrightNotice(Michael Harrold)
@End(TitlePage)

It is possible for authors to change both the content model and the presentation of their own documents, but doing so is cumbersome. Scribe encourages a mode where a local administrator maintains control over – and responsibility for – the various document types that are used in the organization.

Scribe commands

In addition to the content itself, a Scribe source file contains Scribe commands. These correspond to what is known as markup in SGML/HTML/XML terminology. There are approximately 35 commands. They can be divided into five main groups:

Classification commands: @Begin and @End are used to mark the start and end of environments, and @Make is used to declare a document type.

Variable commands: To handle counters (@Set, @Tag), cross-references (@Ref), and string variables (@String, @Value) that are expanded into, for example, date and username.

Visual commands: To mark page breaks (@NewPage), add vertical spacing (@BlankSpace), handle tab stops (@Tabset, @TabDivide, @TabClear) or change style (@Style) or font (@SpecialFont).

Out-of-flow content commands: To label certain out-of-flow content, for example footnotes (@Foot), running headers (@PageHeading) and footers (@PageFooting).

System commands: To import other files (@Import), specify the output device (@Device), and output a message on the console (@Message).

The classification of commands into groups is done by this author.

The Scribe documentation describes commands as non-procedural. However, some of the commands are arguably procedural, most notably @BlankSpace and @NewPage. In a structured approach, page breaks are attached to structured elements (e.g. a heading) rather than using a separate command. Scribe also supports the structured approach through the Pagebreak environment attribute.

Likewise, the @Style and @SpecialFont commands, which are used to set stylistic and font preferences, can be questioned since they are not attached to structured elements.

Another command that is easily challenged is @Device, which is used to specify the printing device for the output. Web authors will not know what printing device (if any) the user has. Scribe, however, was used mostly with paper as the final form, and including commands like @Device is a pragmatic choice.

Formatting environments

The most frequently used commands in Scribe are @Begin and @End which, respectively, mark the beginning and end of formatting environments. A formatting environment corresponds roughly to an element in SGML/HTML/XML terminology, and the @Begin and @End commands correspond to tags. Formatting environments are also referred to as named formatting environments or just environments. Here is a simple fragment from the Scribe documentation (the quote is from Oscar Wilde, The Soul of Man under Socialism, 1895):

@Begin(Quotation) 
On mechanical slavery, on the slavery of the machine, 
the future of the world depends.
@End(Quotation)

Text inside the Quotation environment is given extra space on all sides. Text can also be placed in environments through a shorthand syntax:

@Quotation(On mechanical slavery, on the slavery of the machine, 
the future of the world depends.)

Environments can be nested inside each other:

@Quotation(On mechanical slavery, on the slavery of the @i[machine], 
the future of the world depends.)

The example above also shows how different pairs of characters can be used as delimiters. The outer delimiters use () characters, while the inner delimiters use [].
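How nested shorthand environments map onto a tree of elements can be sketched as follows. The sketch assumes only the two delimiter pairs shown above and does not handle @Begin/@End, escaping, or unbalanced delimiters; the names PAIRS, COMMAND and parse are hypothetical and the code does not reproduce how Scribe itself parses its input.

```python
import re

# Only the two delimiter pairs shown in the text are assumed here.
PAIRS = {"(": ")", "[": "]"}
COMMAND = re.compile(r"@(\w+)([(\[])")


def parse(text, pos=0, closer=None):
    """Return (nodes, next_pos).

    Each node is either a text string or a (name, children) tuple.
    """
    nodes, buf = [], []
    while pos < len(text):
        if text[pos] == closer:
            break                      # end of the current environment
        match = COMMAND.match(text, pos)
        if match:
            if buf:                    # flush text seen so far
                nodes.append("".join(buf))
                buf = []
            name, opener = match.group(1), match.group(2)
            children, pos = parse(text, match.end(), PAIRS[opener])
            nodes.append((name, children))
        else:
            buf.append(text[pos])
            pos += 1
    if buf:
        nodes.append("".join(buf))
    return nodes, pos + 1              # step past the closing delimiter


source = "@Quotation(On the slavery of the @i[machine], the future depends.)"
tree, _ = parse(source)
print(tree)
```

Because each environment remembers its own closing delimiter, a ")" inside "[...]" would be treated as ordinary text, which is one reason why mixing delimiter pairs is convenient when environments are nested.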

All Scribe systems offer a common set of environments for authors to use. See Table 3.

Keeping in mind how important Scribe has been in the promotion of logical markup, it is noteworthy that around half of the environments have presentational rather than logical roles.

Not all structure in a Scribe document must be marked up explicitly. Scribe is able to identify paragraphs from the white space in the source document. Consider this example:

@begin(enumerate)
The first item of three.

The second item. 

The last item.
@end(enumerate)

The resulting enumerated list consists of three items. The Multi