Flowing Web Documents (fwd:rev4)

This is a proposal document that identifies two things: basic requirements a language for web based documents should have, and a proposal for a language that meets this requirements.

Previous knowledge of HTML is encouraged.

Requirements of a web document language

1. It should be easy to expose information to the reader: I shouldn't need a lot of formatting sugar to produce a document.

2. There should be little assumptions on the capabilities of the reader and web browser: accessibility is extremely important, not only sensorial but also about the languages used and the way the user can navigate and manipulate a document. A document is not a piece of art for the original creator to expect control over how it is displayed. There is also an increasing trend for accessing documents with differing layouts and resolutions.

3. The contents of a document should be kept sane. Just as you wouldn't expect a book to do something behind your back, you shouldn't expect a web document to be interactive. The activity should be in the hands, and control, of the user.

4. A network of linked documents is by itself only as rich as the information that was codified by the hosts. Therefore the reader should have the ability to further link documents, modify, annotate and improve on the experience.

In conclusion, this document proposes a language for writing documents oriented for the web that puts the user as the focal point of document display and use, facilitate common actions taken on documents, and overall improve the usability of the web.

The flowing web document

The proposed language assumes a document aims to codify some knowledge, and it's quality is as the ability of all the pieces involved in transmitting that information to the reader. No matter how responsive or beautiful your website is, if is not optimized for my phone screen. No matter how well written in English it is, if I cannot read English. You get the point. The variety of uses is too large for a web host to be able to consistently deliver the intended knowledge to its reader. This proposal starts tackling this by a flowing document layout.

The flowing layout considers contexts; pieces of information that should logically be rendered together. The contents of a context however can be more or less stretched and moved around to better fit the desired display. What best fit means is what the user is most comfortable with, and should be configurable in the web browser. For contrast this webpage written in HTML by default left justifies everything. Contexts can also contain other contexts, consider the example of a text that contains a figure with a caption. You probably mean the caption to be close to the figure, with the positioning of the figure in the overall text being less important.


Fig. 1 - Example of possible visual renderings where more maleable content is adjusted to the screen. Notice how a scrollbar is added in the second example instead of increasing the density of content.

For the contexts to naturally adjust it's form to the display there needs to be a display referential, that is to say, a way the display is bound and direction for it to flow. Right now in HTML documents can usually flow in both an xx and yy axis. Again, this behavior should be configurable, with the default only a yy axis to resemble printed media. The content could also be possible to flow in columns, and to cycle these columns through a xx axis. Through the use of flowing documents I should be able to have all documents of the web simultaneously adapt to the way I want to browse them, as well as the density of information I want to be greeted with, see for instance the depiction in Figure 1. It shows the exact same number of contexts.

The language

Since the language is based on context, it makes sense that the basic construct is a bounding context:

{ This is some content. }

This is a simple context, meaning it doesn't have any special meaning beyond implying some information should be kept together. Instead of using a space after the starting curly brace we could also use a function. Consider the following:

{header This is a header}

In this case the function header is operated on the rest of the context, and the browser would probably treat it differently. Functions always have either one or two arguments. Before showing all the available functions let's look at some more difficult examples making use of two arguments:

{language français
	{
		{header Croissants}
		au fromage {link email:name@organization.code contact information}
	}
}

Here we see how we can inform the browser of the language being used in the text. An intelligent browser should have the ability to translate it to another language if it better fits the users needs. We also notice the ubiquitous link function. Again, no guarantees are made about what it does besides that somehow the first argument links to something that is described in the second argument. A browser might not display it's text; might compose a references list, might replace the text by something else and only show it on mouseover and so on. It should be whatever is best for the user, which probably depends on what the first argument refers to, which brings us to document bookmarks:

{mark Bibliography {header Bibliography}}

Naturally the browser might already compose its quick access index of headers in the document for easier browsing, but adding a mark function guarantees there should be some way for me to quickly jump from one part of the document to the other; or be shown a preview, and so on. Finally, the remaining fragments of text in a context always form the last argument of the function; if this is not intended open a new context.

{header { Name} { Some content} { More content}}

The above example just uses as first and only context the whole three contexts passed, there is no way to over-parametrize functions. The curly braces characters can also be escaped by using \{ and \} instead. And to finish, doing something like bellow or related nonsense is subject to the browsers disposition on what to do with the inner most contexts. The outermost should be valid in its behavior.

{link uri:abc {link uri:xyz This might make your browser cry.}}

Functions list

section 1

By putting section contexts inside section contexts, and also using header contexts we can automatically define the level of document content.

header 1

Argument is the header text.

link 2

Arguments are first the URI of the linked resource and secondly the description of the resource.

language 1

Argument is the name of the language; languages can have multiple variants and be spelled and written in different languages as well. You can use none to specify something is impossible to translate.

emph 1

Argument is the content to which to apply emphasis.

scratch 1

Argument is the content to mark as wrong or out-of-date; can be difficult to express not visually.

spoiler 1

Hide the argument content until the user activates it. The hidden content should maintain it's occupied space in a display even if not shown or obfuscated.

warn 2

Similar to spoiler but with a descriptive warning as second parameter. May require human confirmation so avoid abusing this.

mark 1

Mark a context in a unique keyword that can be later used for links inside the document. Bookmarks are local to the file they are declared in.

embed 2

Embed a file. This should be done in moderation for files whose size is not easy to resize, which is the case of images that shouldn't be stretched. Whether or not a file is embedded or a link created, or some other solution used is again browser dependent and user configurable. The first argument is an URI and the second is a textual representation. It is very important to have descriptive representations for when that file type is not displayed visually. File embeding can be a security concern, interactive media types and cross-domain URIs might be disallowed by default by the we browser.

quote 1

Marks a context as being a quotation. Should probably be part of a context with the information on the author.

listing 1

Does not produce a list! Instead is should output its parameters as readable a representation as possible. Can be used for instance to render code in monospace or hexadecimal data.

bullet 1

Produces one bullet point for some context. To change the level of the bullet points just used them inside other bullet point contexts or enum contexts.

enum 1

Similar to bullet but with an enumeration.

keyword 1

Used to indicate a certain word is important and should be either stylized or added to an index, if it exists.

note 1

Used to indicate or introduce something that doesn't really follow the text flow, but should be made a note of. Can be for instance a footnote or commentary box, or just indented text.

format 2

Used to format the second argument with the type of the first argument. Useful for country independent units, phone numbers, currencies, dates, ...

define 2

This function allows the definition of new, non recursive, functions, for instance:

{define newchapter {section {header %1} %2}}

The symbols %1 and %2 are only understood inside a define function and can be escaped with \%1 or \%2. Outside of a define function they cannot be escaped. You cannot use a define inside another define functions, since it would be ambiguous. Defined functions have their scope in the file they are declared in, an embed function cannot be used to export them and reuse them, while the functions that use it in the embeded file continue valid. You still cannot have more than two parameters and the number indicates the order of passage (%1 is the head of the arguments and %2 the rest). Argument invocation is made left to right.

When a document is parsed some specific defined functions can also be used to obtain information about the whole document, these are: doc-title, doc-description, doc-summary, doc-date, doc-authors, doc-license, doc-version and cache-control. Since there is only one symbol defined at a time there is no confusion about the properties of the document. None of this definitions is compulsory, though either doc-title or doc-description should be always present. The difference between doc-description and doc-summary is that the first should be a short description of the page, while doc-title might be the official title. doc-summary is a more lengthy summary of the document contents and can be useful for instance for previewing documents. Understand that given the way this information is defined, it is expected that it will be reused throughout the document.

User Interaction and input functions

inputarea 2

Allows the user to input something, text and other media types. The underlying protocol should make it efficient to reject disallowed media types without reading unecessary data. The first argument is a codified hint of the expected input type, like textline, textmultiline, image, etc; this isn't used to specify the allowed media type, only the size and capabilities the browser should prefer to offer. The second argument is the long description of the use of the input. This description should contain things like the maximum file size, or maximum number of line changes in a text allowed; not just names like "Comment".

selectone 2

Offers a choice of a number of selectable contexts, described and numbered by its parameters number. It is also an existential operator that forces the user to select at least one of the options. The first argument is a context whose subcontexts are used to identify the options, the second is a long description.

selectany 2

Similar to selectone but any number of options may be selected, or not. The two can be mixed to ensure any number can be select and at the same time at least one option has to selected as well:

{selectone
	{selectany
		{
			Option1
			Option2
			{ Option 3 }
			Option4
		}
		You can select more than one though.
	}
	You must select at least one option.
}
button 2

Allows the user to make an action, that provokes a new client request to the server. The first argument specifies the context specified by a mark that should be updated; if it contains selectables and input areas their information is codified and sent; codified in a way relative to the innermost mark they are inserted in. Because of this it is often useful to include inputtables inside a mark functions. The server should respond with a new partial flowing web document that will replace the context inside the mark identified. This means a button can be used to easily refresh an online document by including it entirely on an initial mark function. It also means a button "press" cannot cause changes to the local document outside of the expectations of the user.

What is expected of a web browser

The web browser should be configurable about how the information is convened, and for each method what is the preferred way of doing so. For visual rendering it should be capable of automatic and on demand translation of text, spell checking, be flexible about how date/time, units and other country dependent information is displayed. It should also obey the user preferences on document information density, text orientation, rendering direction, text effects, font family and size, effects and colors for specific functions and background color. It should further be configurable on how to display descriptions, produce automatic short versions of text, embed some types of files and how to treat followed links by file type.

For non-visual cues it should support text-to-speech conversion and audio cues, as well as speech-to-text conversion for word search.

In general the browser should also be capable of supporting adding extra links, commentary, emphasis, notes and embed other media on their personal view on the document. This does not necessarily affect how others perceive the document, but this data should be reasonably expected to remain in place even if a document is changed by the host. When the host changes the document, the user should have the ability to see his changes on the previous, locally stored version of the document; and to his changes that remain valid in the new document.

It should also be possible to specify how you want content to flow and specify some configuration on a domain by domain basis, instead of for the whole web.

Implementation notes for web browsers

A document without any contexts uses by default the global context; there is no need for a global body context. When a browser is first run it is a good idea for it to try to notice some things from the system used (such as the unicode locale and screen size) as well as prompt the user for its 1st use setup. Also on the topic of locales there are no ways to specify it, so a web browser should assume unicode is used and use the locale according to the languages specified by the language function or the default for the system. Since there is no way to specify paragraphs it should also be useful to apply automatic paragraph breaks like LaTeX for instance. Finally, since the symbol for end of contexts is the same regardless of context, there might be some confusion on context formatting when the document is malformed. In this situation the web browser should present a non-intrusive warning and attempt to display the document to the best of it's abilities.


This proposal was written in the hope that it will be useful. For criticism please e-mail me at gonmf [at] sapo.pt, open an issue or submit a pull request in the git repository.