Data Model
The data model of WWW is what the HTTP protocol implements.
Until now, WWW had a simple data model consisting of two kinds of
data:
- a classical hypertext node,
- an index.
The following addition is made:
- an indirect node.
These node types are discussed in more detail below. UDIs in the text
are informally represented by a readable phrase.
Classical hypertext node
This consists only of text, with links going out from it or coming
in to it. A link may be anchored to a selection of the text rather
than to the whole node.
Index
This is to be thought of as a set of links to documents; the link
to be followed is chosen by typing a string of text into a search
panel. An index is usually implemented by a database.
Indirect node
This is a document which, instead of containing the desired contents,
contains the information needed to get to the desired document.
Thus, when a UDI refers to an indirect node, the document actually
returned is the result of processing the contents of the indirect
node.
Indirect nodes come in three types:
- forward nodes,
- redirected nodes,
- queries.
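Indirect nodes are used to solve:
- the relocation of documents,
- the aliasing of documents,
- queries too complex to be carried in a UDI.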
Once PUT is part of the HTTP protocol, there is no reason why a more
sophisticated user could not write his/her own indirect documents to
this effect, or even be allowed to store them at the server site.
The three types are discussed in more detail below:
Forward nodes
When document D1 on machine M1 gets relocated to document D2 on machine
M2, all links to D1 become invalid. To avoid massive updating, a document
D1.f is put on M1. It contains the UDI of D2. When a client wants
D1, the server on M1 will try D1, fail, then try D1.f, succeed, and
thereby know that D1 has been relocated. It then sends the client
the contents of D1.f, preceded by the indication that the document
found was a forward. The browser then uses this to find the real document.
Note that if there is a chain of forwards, it is the browser that
follows this chain, not the server! The browser can then decide to
tell the user, or even automatically update the old link to the new
location. Once the document D2 is displayed, a link made to it will
be a link to D2, not to D1.f.
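
The forward lookup described above can be sketched as follows. This is
only an illustration under stated assumptions: the in-memory SERVERS
table, the "machine/name" UDI notation, the serve() and fetch() names
and the max_hops guard are all hypothetical and do not come from the
protocol itself.

```python
from typing import Dict, Tuple

# Hypothetical in-memory stand-in for two server sites.
SERVERS: Dict[str, Dict[str, str]] = {
    "M1": {"D1.f": "M2/D2"},  # forward node left behind on M1
    "M2": {"D2": "contents of the relocated document"},
}


def serve(machine: str, name: str) -> Tuple[str, str]:
    """Server side: try the document itself, then its forward node."""
    docs = SERVERS[machine]
    if name in docs:
        return "document", docs[name]
    forward = name + ".f"
    if forward in docs:
        # Contents are returned with the indication that this is a forward.
        return "forward", docs[forward]
    raise KeyError(f"{machine}/{name} not found")


def fetch(udi: str, max_hops: int = 8) -> Tuple[str, str]:
    """Browser side: the browser, not the server, follows the chain."""
    for _ in range(max_hops):  # max_hops is an assumed safety limit
        machine, name = udi.split("/", 1)
        kind, body = serve(machine, name)
        if kind == "document":
            return udi, body
        udi = body  # the forward node contains the UDI of the new location
    raise RuntimeError("forward chain too long")


final_udi, body = fetch("M1/D1")
print(final_udi)  # M2/D2
```

Because fetch() returns the final UDI together with the contents, a
browser built this way could tell the user about the move or update
the old link.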
Redirected nodes
Sometimes a document needs an alias, e.g. because it is one of a growing
series: in the case of monthly reports, one may well want to link
to "this month's report", and always get what is in the latest report.
There exist documents for, say, January, February, March, ... which
can also be linked to. A redirecting node works much like a forward
node in that it contains the UDI of the aliased document. So, if a
document TMR is requested, the server would not find TMR, would look
for TMR.r and return that with the indication that it is a redirected
document. The browser would then use the UDI (e.g. March.html) to
find the current real document. The difference from a forward node
lies in the behaviour when a link is made. Suppose March.html is on
the screen. A link made to it would be a link to TMR if March.html
was found through TMR, and a link to March.html if it was found
directly. Thus it
is possible to make new links to "this month's report" rather than
to March.html.
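
The linking behaviour of a redirected node can be sketched as follows.
This is an illustration only: the DOCS table, the Page record and the
open_page() function are hypothetical names, and a real browser would
fetch the documents from a server rather than from a dictionary.

```python
from dataclasses import dataclass
from typing import Dict

# Hypothetical document store on one server.
DOCS: Dict[str, str] = {
    "TMR.r": "March.html",  # redirecting node: holds the UDI of the alias target
    "March.html": "report for March",
}


@dataclass
class Page:
    requested_udi: str  # UDI the user followed
    resolved_udi: str   # UDI of the document actually displayed
    body: str

    def link_target(self) -> str:
        """A new link points to the UDI through which the page was found."""
        return self.requested_udi


def open_page(udi: str) -> Page:
    if udi in DOCS:
        return Page(udi, udi, DOCS[udi])
    target = DOCS[udi + ".r"]  # raises KeyError if there is no redirect either
    return Page(udi, target, DOCS[target])


print(open_page("TMR").link_target())         # TMR
print(open_page("March.html").link_target())  # March.html
```

The same contents are displayed in both cases; only the UDI recorded
for new links differs.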
Queries
Complex queries are bound to be written in some form of programming
language (e.g. SQL). To ensure independence from these languages, the
complete text of the query should not be part of the UDI. The UDI
should represent the desired contents. Take the example of a query
that returns the current age distribution of staff in a given category
as a table. This table could well be called "Physicists & Engineers"
but require a quite lengthy SQL program. If the SQL text were part of
the UDI and the personnel data were at some time transferred to a new
database that does not use SQL, the link to this document would become
invalid. It would also be difficult to produce the query in the first
place using search panels alone.
A solution is to put the entire SQL program into a file PE.q, and
give that document the UDI "Physicists & Engineers". As before, the
server will not find this document and will search for PE.q. It will
return the contents with the indication that it is a query.
There are two ways to get the query executed. The first puts the UDI
of the query document plus the UDI of the server that is to execute
it in the anchor. This is unacceptable because the first server to
receive the request may have to follow the links along a potentially
infinite chain, servers would need to contain browsers, and the query
becomes recursive. Its advantages are that the scheme works with old
browsers and that the query document itself does not contain
indications of where it is to be executed, making it possible to
re-use the same query text on several servers.
The other solution, which is the one adopted, uses the same mechanism
as for forwards: the browser gets the query returned and then sends
it to the server that is to execute it. The browser also follows any
chains; there is no recursion. The disadvantages are that the query
document now must contain the address of the server and cannot be
reused without introducing some type of include-file mechanism, and
that the scheme does not work with old browsers.
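
The adopted scheme can be sketched as follows. Everything named here
is an assumption made for illustration: the layout of the query
document (a server address plus the query text), the QUERY_FILES and
EXECUTORS tables, and the placeholder query text. The text above does
not specify how a UDI such as "Physicists & Engineers" maps to the
file name PE, so the sketch simply uses PE as the requested name.

```python
from typing import Callable, Dict, Tuple

# Hypothetical query document: it must carry the address of the
# server that is to execute it, as noted above.
QUERY_FILES: Dict[str, Dict[str, str]] = {
    "PE.q": {"server": "personnel", "text": "<query program>"},
}

# Hypothetical executing servers, modelled as plain functions.
EXECUTORS: Dict[str, Callable[[str], str]] = {
    "personnel": lambda text: "table for " + text,
}


def serve(name: str) -> Tuple[str, Dict[str, str]]:
    """Document server: the document is not found, so look for name.q."""
    query_file = name + ".q"
    if query_file in QUERY_FILES:
        # Returned with the indication that it is a query, not executed here.
        return "query", QUERY_FILES[query_file]
    raise KeyError(name)


def fetch(name: str) -> str:
    """Browser: receives the query and sends it to the executing server."""
    kind, doc = serve(name)
    if kind != "query":
        raise ValueError("unexpected node type: " + kind)
    return EXECUTORS[doc["server"]](doc["text"])


print(fetch("PE"))  # table for <query program>
```

Since the browser performs the second request itself, the document
server never has to act as a client of another server.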
Linking with indirect nodes
In the following paragraphs, document A contains an anchor that has
UDI b in it. Following the link should lead to document B which is
identified by b.
Forward:
B has been relocated and is now identified by b1. A forwarding
document F is identified by the old UDI b and contains b1.
When B appears, the browser knows it through b1. Making a link to
B means using b1. In exceptional cases one might want to link using
b.
For changing B, the PUT command normally uses b1.
Redirected:
UDI b points to redirecting document R, which contains b1. When B
appears, the browser knows it through b. Exceptionally one might prefer
to link using b1, namely when one wants to link to the current value
of b.
For changing B, the PUT command normally uses b1.
Query:
Links always use b. There is no general way to use PUT on results
of queries.
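
The normal cases of the three rules above can be summarised in a short
sketch. The function names link_udi() and put_udi() and the string
labels for the node types are hypothetical; the exceptional cases
mentioned in the text are deliberately left out.

```python
from typing import Optional


def link_udi(kind: str, b: str, b1: Optional[str] = None) -> str:
    """UDI normally used when making a new link to the displayed document."""
    if kind == "forward":
        if b1 is None:
            raise ValueError("a forward node must supply b1")
        return b1  # the browser knows B through b1
    if kind in ("redirected", "query"):
        return b   # the browser knows B through b
    raise ValueError("unknown node type: " + kind)


def put_udi(kind: str, b: str, b1: Optional[str] = None) -> Optional[str]:
    """UDI normally used by PUT to change B, or None if PUT does not apply."""
    if kind in ("forward", "redirected"):
        return b1
    if kind == "query":
        return None  # no general way to PUT the result of a query
    raise ValueError("unknown node type: " + kind)


for kind in ("forward", "redirected", "query"):
    print(kind, link_udi(kind, "b", "b1"), put_udi(kind, "b", "b1"))
# forward b1 b1
# redirected b b1
# query b None
```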
RC