logiweb man page on Cygwin

Man page or keyword search:  
man Server   22533 pages
apropos Keyword Search (all sections)
Output format
Cygwin logo
[printable version]

LOGIWEB(5)		  File formats and protocols		    LOGIWEB(5)

NAME
       logiweb - Logiweb file formats and protocols

DESCRIPTION
       Logiweb	is  a  system  for storing, locating, and transmitting Logiweb
       pages. Logiweb pages may contain free text mixed with machine  intelli‐
       gible objects like computer programs, testsuites, and formal proofs.

       Logiweb	defines	 a referencing scheme in which each Logiweb page has a
       unique Logiweb reference. A Logiweb reference is	 typically  around  30
       bytes  long.  A	Logiweb	 reference contains, among other, a RIPEMD-160
       hash key of the referenced page.

       The Logiweb standard comprises the following file formats  and  network
       protocols.

	   The Logiweb vector format
	   The Logiweb rack format
	   The Logiweb protocol

       The vector format is a condensed representation suited for transmission
       over the Internet. Furthermore, the authenticity of a Logiweb  page  in
       vector format can be verified using the RIPEMD-160 hash function. Logi‐
       web vectors can be retrieved from untrusted repositories and  transmit‐
       ted over untrusted networks.

       The  rack  format is a comprehensive, precompiled representation suited
       for storing on a local disk. Furthermore, the rack  format  is  a  good
       starting	 point	for  conversion to other formats. The Logiweb Abstract
       Machine lgwam(1) needs pages in rack format in order  to	 execute  pro‐
       grams defined on the pages.

       Earlier	versions  of Logiweb supported XML, MathML, Lisp S-expressions
       and several other formats. The current version leaves such  support  to
       the  user  through  the	capabilities for user defined rendering of the
       Logiweb compiler lgc(1). Only support for latex(1), bibtex(1),  makein‐
       dex(1), dvipdfm(1), and cc(1) has survived in the present version.

       The  Logiweb protocol allows to locate a Logiweb page given its Logiweb
       reference. More specificially, the Logiweb protocol allows to translate
       a  Logiweb  reference  into  an	http  URL.  Then  http may be used for
       retrieving the referenced page and RIPEMD-160 may be used for verifying
       the authenticity of page.

       Logiweb	version 0.1.1 to 0.1.10 included a Logiweb server which imple‐
       mented the Logiweb protocol. From Version 1.0.1, the Logiweb server  is
       excluded	 from  the  distribution.  That is because installation of the
       Logiweb server requires considerable privileges and network insight.

       For Version 1.0.1 a Logiweb server does not even exist,	and  users  of
       the  Logiweb  compiler  lgc(1) are refered to PATH based translation of
       references to http URLs. Furthermore, the Logiweb protocol will change.
       That  is	 because  it has proved to be too difficult to get an assigned
       UDP port and thus the protocol will be modified	to  be	channeled  via
       http. That will reduce efficiency but also has some benefits.

       The  Logiweb  server  will  be distributed separately at a later stage.
       When Logiweb servers are installed in the  future,  repositories	 which
       run  a  server will appear to lgc(1) users to contain all Logiweb pages
       in the world. The present but obsolete Logiweb protocol	is  documented
       in logiweb(7).

BITS, BYTES, AND VECTORS
       The  file  formats  and protocols are byte oriented, so we first define
       bytes and vectors (where 'vector' means 'byte vector').

	   bit		 ::= "0" | "1"
	   byte		 ::= bit*8
	   vector	 ::= byte*

       A byte denotes a value in 0..255. We refer to the least significant bit
       of a byte as bit zero and to the most significant bit as bit seven.

       Bytes  of  vectors are given in network order which is identical to the
       order in which they are stored on disk. We refer to the first byte of a
       vector as byte zero.

       We  write vectors in mixed endian binary: each byte is written with the
       most significant bit first and bytes a written in network order.

       We refer to bit m of byte n of a vector as bit  m+8*n  of  the  vector.
       Thus,  the  bits of a two byte vector are given in this order: 7, 6, 5,
       4, 3, 2, 1, 0, 15, 14, 13, 12, 11, 10, 9, 8. This order gives  rise  to
       the term 'mixed endian'.

       Mixed  endian  notation	resembles  Internet dot notation. Internet dot
       notation specifies bytes to be written in network order and  each  byte
       to be written with the most significant digit first.

       As  an  example,	 mixed endian binary 0000 0010 0000 0011 denotes a two
       byte followed by a three byte.

CARDINALS AND STRINGS
       We shall refer to non-negative integers as cardinals. We represent car‐
       dinals and byte strings thus:

	   septet	 ::= bit*7
	   middle-septet ::= "1" septet
	   end-septet	 ::= "0" septet
	   cardinal	 ::= middle-septet* end-septet
	   string	 ::= length vector
	   length	 ::= cardinal

       A middle or end septet denotes a value in 0..127. As examples, the mid‐
       dle septet 1000 0010 and the end septet 0000 0010 both denote two.

       Cardinals are expressed little endian base 128. As an example,

	   1000 0011 0000 0010

       denotes 3+128*2 = 3+256 = 259.

       A string encodes a sequence of bytes. The length field  of  the	string
       indicates the number of bytes in the string.

       The  grammar  is	 *not*	context	 free  because	the  strings cannot be
       expressed in a context free manner.

VECTOR FORMAT
       The vector format of Logiweb pages has the following syntax:

	   pagevector	 ::= bibliography dictionary body
	   bibliography	 ::= reference reference* zero
	   reference	 ::= string	    ; the string must be non-empty
	   zero		 ::= cardinal	    ; the cardinal must denote zero
	   dictionary	 ::= entry* zero
	   entry	 ::= index arity
	   index	 ::= nonzero
	   nonzero	 ::= cardinal	    ; the cardinal must be non-zero
	   arity	 ::= cardinal
	   body		 ::= vector

       The indexes of the dictionary must be strictly decreasing.

       We refer to the first reference of the bibliography as reference	 zero.
       Reference zero must be the reference of the page itself.

       Above, references are described as strings. The following definition is
       more precise:

	   reference	 ::= length scheme ripemd timestamp
	   scheme	 ::= "0000 0001"
	   ripemd	 ::= byte*20
	   timestamp	 ::= mantissa exponent
	   mantissa	 ::= cardinal
	   exponent	 ::= cardinal

       The length must equal the total number of bytes in scheme, ripemd,  and
       timestamp.

       Whenever	 a  Logiweb  page  is  published,  the	time of publication is
       included as a timestamp in the reference of the page. The timestamp  is
       given in Logiweb time as M*10^(-E) where M and E are the values of man‐
       tissa and exponent, respectively.

       Logiweb time indicates the number of seconds which have elapsed accord‐
       ing  to	International Atomic Time (TAI) since TAI:00:00:00 of Modified
       Julian  Day  (MJD)  Zero.  MJD-0	 corresponds  to  the  Gregorian  date
       GRD-1858-11-17. In 1858, UTC lacked 10 seconds behind TAI (even thought
       they did not know at that time). Hence, Logiweb time measures the  num‐
       ber   of	  seconds   according	to  International  Atomic  Time	 since
       UTC:23:59:50 of GRD-1858-11-16.

       The ripemd code consists of 20 bytes which corresponds to 160 bits. The
       ripemd  code of a page must be computed using the RIPEMD-160 hash func‐
       tion applied to all bytes following ripemd, including the timestamp  of
       the page itself.

       The  body represents a parse tree whose nodes may be Logiweb symbols or
       strings. A Logiweb symbol comprises a Logiweb reference	and  an	 index
       where  the  index is a cardinal. To be a valid symbol, the index of the
       symbol must occur in the dictionary of the referenced page.  The	 arity
       of a symbol is the arity assigned to the symbol by that dictionary.

       Transformation  of the body given as a vector to a parse tree is called
       'unpacking'. Reference one of a page may define	an  unpacker,  i.e.  a
       user defined unpacking function. See the 'base' and 'lgc' Logiweb pages
       which come with the distribution for details on this.

       If reference one of a page does not define an unpacker then the body is
       in default format which is as follows:

	   body		 ::= node*
	   node*	 ::= symbol | zero string
	   symbol	 ::= nonzero

       The  nodes  of the body are given in Polish prefix. The arities of each
       symbol is taken from dictionaries of the relevant  page.	 Strings  have
       arity zero.

       Each  symbol has value 1+R+n*i where i is the index of the symbol, n is
       the number of references in the bibliography, and  R  is	 the  relative
       reference of the symbol. The relative reference R if between 0 and n-1.
       The reference of a symbol is the R'th element of the bibliography.

RACK FORMAT
       The rack of a page is a structure which represents a  compiled  version
       of the page. See the 'base' and 'lgc' pages which come with the distri‐
       bution for more on this.

       A rack is an inhomogeneous array, i.e. and  array  whose	 elements  may
       have  different	types.	Indexes	 of a rack are called hooks. Racks and
       hooks are to Logiweb what records and record fields  are	 to  languages
       like C.

       Racks  may  reside  in memory or may be stored to disk. Racks in memory
       contain more data than racks stored to disk. Hence,  racks  are	pruned
       when  written  to  disk	and the missing parts have to be reconstructed
       when reading the racks back into memory.

       Pruning affects three hooks of a rack.

       First, a rack in memory has a code hook which  contains	compiled  ver‐
       sions  of  value	 definitions. To keep disk racks architecture indepen‐
       dent, the code hook is removed when writing a rack to  disk  and	 value
       definitions  have  to  be  recompiled when reading the rack back. Using
       racks still represent a great saving in time since no parsing,  unpack‐
       ing, and macro expansion is needed when loading a rack.

       Second, a rack in memory has a cluster hook which contains racks of all
       transitively referenced pages. This hook is removed since it contains a
       lot  of	data  which is fast and easy to reconstruct from the disk ver‐
       sions of transitively referenced pages.

       Third and finaly, a rack in memory has a diagnose hook.	The  value  of
       that  hook  is 'maptagged'. A maptag allows to delay computation of the
       diagnose in an otherwise eager process. When writing a  rack  to	 disk,
       computation  of	the diagnose is forced and the maptag is removed. When
       reading the rack back from disk, a maptag  is  added  to	 the  diagnose
       hook.  Again, see the 'base' and 'lgc' pages that come with the distri‐
       bution for details.

       Once a rack is pruned for code hook, cluster hook, and diagnose maptag,
       it  is  a  data structure which is built up from only three data types:
       cons, cardinals, and a special value named T which corresponds to  NULL
       in C.

       Pruned racks are stored in the following format:

	   rack		 ::= racknode* count
	   count	 ::= cardinal	    ; three plus the number of nodes
	   racknode	 ::= card1 | card2 | cons
	   card1	 ::= zero cardinal
	   card2	 ::= two string
	   two		 ::= cardinal	    ; the cardinal must represent 2
	   cons		 ::= head tail
	   head		 ::= cardinal
	   tail		 ::= cardinal

       A  rack	with  no nodes represents the value T. Racks with at least one
       node represents the value represented by the last node. Each node  rep‐
       resents a value as indicated in the following.

       A card1 node represents the given cardinal.

       A  card2	 node represents a cardinal as follows: Take the given string.
       Discard the length field to obtain a vector. Append a byte  with	 value
       one  to the vector. Then interpret the vector as a little endian cardi‐
       nal base 256.

       A cons represents the pair of the given head and	 tail.	A  head	 whose
       value  is one represents T. A head whose value H is greater than 2 rep‐
       resents the value of node H-3. The tail is interpretted similarly.  The
       head  and tail must equal one or reference nodes that appear before the
       cons.

RACK WRITING CONVENTIONS
       As can be seen, cardinals can be represented in two ways. All cardinals
       can be represented as a card1 whereas only some cardinals can be repre‐
       sented as a card2. When writing racks to disk, one should use the card2
       representation when possible and the card1 representation otherwise.

       The  representation  allows  to represent sharing in that a node can be
       referenced from more than one cons node. When writing a rack  to	 disk,
       one  should  traverse the rack left-to-right, root-to-leaves, and depth
       first. One should maximize sharing such that no two nodes represent the
       same value.

       If  followed,  the  rack	 writing conventions ensure that the same rack
       will be written the same way on distinct systems. That simplifies test‐
       ing but is not necessary for reading racks back to memory.

AUTHOR
       Klaus Grue, http://logiweb.eu/

SEE ALSO
       lgc(1), lgwam(1), lgc(5), lgc.conf(5), logiweb(7)

Logiweb				   JULY 2009			    LOGIWEB(5)
[top]

List of man pages available for Cygwin

Copyright (c) for man pages and the logo by the respective OS vendor.

For those who want to learn more, the polarhome community provides shell access and support.

[legal] [privacy] [GNU] [policy] [cookies] [netiquette] [sponsors] [FAQ]
Tweet
Polarhome, production since 1999.
Member of Polarhome portal.
Based on Fawad Halim's script.
....................................................................
Vote for polarhome
Free Shell Accounts :: the biggest list on the net