author James Bursa <james@netsurf-browser.org> 2006-10-02 21:58:33 +0000
committer James Bursa <james@netsurf-browser.org> 2006-10-02 21:58:33 +0000
commit c88b268f846e5501e19b5d75b0e8ac9b11858857
tree e09f15adccc34647b590638c1416c1df4ca027ca /Docs
parent 118f435133cb5aa3a81fa84e29f69e563e2a0e35
Split documentation into files by topic and convert to AsciiDoc format.
svn path=/trunk/netsurf/; revision=2975
Diffstat (limited to 'Docs')
-rw-r--r-- Docs/00-overview 58
-rw-r--r-- Docs/01-content 24
-rw-r--r-- Docs/02-layout 31
-rw-r--r-- Docs/03-css 81
-rw-r--r-- Docs/04-errors 30
5 files changed, 224 insertions, 0 deletions
diff --git a/Docs/00-overview b/Docs/00-overview
new file mode 100644
index 000000000..c7984d06b
--- /dev/null
+++ b/Docs/00-overview
@@ -0,0 +1,58 @@
+NetSurf Documentation for Developers
+====================================
+
+The documents in this directory describe how the NetSurf code works and provide
+other information useful to developers.
+
+Directory Structure
+-------------------
+The source is split at top level as follows:
+
+content:: Fetching, managing, and converting content
+render:: HTML processing and layout
+css:: CSS parser
+image:: Image conversion
+desktop:: Non-platform specific front-end
+riscos:: RISC OS specific code
+debug:: Unix debug build specific code
+gtk:: GTK specific code
+utils:: Misc. useful functions
+
+Other Documentation
+-------------------
+RISC OS specific protocols:
+
+- Plugin http://www.ecs.soton.ac.uk/~jmb202/riscos/acorn/funcspec.html[]
+ http://www.ecs.soton.ac.uk/~jmb202/riscos/acorn/browse-plugins.html[]
+- URI http://www.ecs.soton.ac.uk/~jmb202/riscos/acorn/uri.html[]
+- URL http://www.vigay.com/inet/inet_url.html[]
+- Nested WIMP http://www.ecs.soton.ac.uk/~jmb202/riscos/acorn/nested.html[]
+
+Specifications:
+
+- HTML 4.01 http://www.w3.org/TR/html401/[]
+ (see also http://www.w3.org/MarkUp/[])
+- XHTML 1.0 http://www.w3.org/TR/xhtml1/[]
+- CSS 2.1 http://www.w3.org/TR/CSS21/[]
+- HTTP/1.1 http://www.w3.org/Protocols/rfc2616/rfc2616.html[]
+ and errata http://purl.org/NET/http-errata[]
+ (see also http://www.w3.org/Protocols/[])
+- HTTP Authentication http://www.cis.ohio-state.edu/cgi-bin/rfc/rfc2617.html[]
+- PNG http://www.w3.org/Graphics/PNG/[]
+- URI http://www.cis.ohio-state.edu/cgi-bin/rfc/rfc2396.html[]
+ (see also http://www.w3.org/Addressing/[] and RFC 2616)
+- Cookies http://wp.netscape.com/newsref/std/cookie_spec.html[] and
+ http://www.cis.ohio-state.edu/cgi-bin/rfc/rfc2109.html[]
+
+Libraries
+---------
+Get these compiled for RISC OS with headers from
+http://netsurf.strcprstskrzkrk.co.uk/developer/[]
+
+- libxml (XML and HTML parser) http://www.xmlsoft.org/[]
+- libcurl (HTTP, FTP, etc) http://curl.haxx.se/libcurl/[]
+- OSLib (C interface to RISC OS SWIs) http://ro-oslib.sourceforge.net/[]
+- libmng (PNG, JNG, MNG support) http://www.libmng.com/[]
+- libjpeg (JPEG support) http://www.ijg.org/[]
+- zlib http://www.gzip.org/zlib/[]
+- OpenSSL (HTTPS support) http://www.openssl.org/[]
diff --git a/Docs/01-content b/Docs/01-content
new file mode 100644
index 000000000..4db4bade8
--- /dev/null
+++ b/Docs/01-content
@@ -0,0 +1,24 @@
+Fetching, managing, and converting content
+==========================================
+
+The modules in the content directory provide the infrastructure for fetching
+data, managing it in memory, and converting it for display.
+
+Struct Content
+--------------
+The data for each URL is stored in a struct ::content. This structure contains
+the content_type and a union with fields for each type of data (HTML, CSS,
+images). The content_* functions provide a general interface for handling these
+structures. For example, content_redraw() calls html_redraw() or
+nsjpeg_redraw(), etc., depending on the type of content. See content.h and
+content.c.
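The type-based dispatch described above can be sketched as a tagged union with a switch on the type. This is only an illustration: the enum values, the union members, and the function signatures here are simplified assumptions, not the real definitions in content.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-ins for the definitions in content.h. */
typedef enum { CONTENT_HTML, CONTENT_CSS, CONTENT_JPEG } content_type;

struct content {
    content_type type;              /* selects the active union member */
    union {
        struct { int box_count; } html;
        struct { int rule_count; } css;
        struct { int width, height; } jpeg;
    } data;
};

/* Stub handlers: each returns a label so the dispatch can be observed. */
static const char *html_redraw(const struct content *c)
{
    (void) c;
    return "html";
}

static const char *nsjpeg_redraw(const struct content *c)
{
    (void) c;
    return "jpeg";
}

/* General interface: pick the type-specific handler, or return NULL
 * for content types that cannot be drawn (such as a stylesheet). */
const char *content_redraw(const struct content *c)
{
    assert(c != NULL);
    switch (c->type) {
    case CONTENT_HTML:
        return html_redraw(c);
    case CONTENT_JPEG:
        return nsjpeg_redraw(c);
    default:
        return NULL;
    }
}
```

Callers only ever see content_redraw(), so supporting a new content type in this sketch means adding a union member and one more case.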
+
+Fetching
+--------
+A high-level interface for starting the process of fetching and converting a URL
+is provided by the fetchcache functions, which check the memory cache for a URL
+and fetch, convert, and cache it if not present. See fetchcache.h and
+fetchcache.c.
+
+The fetch module provides a low-level URL fetching interface. See fetch.h and
+fetch.c.
diff --git a/Docs/02-layout b/Docs/02-layout
new file mode 100644
index 000000000..ddc7cfd06
--- /dev/null
+++ b/Docs/02-layout
@@ -0,0 +1,31 @@
+HTML processing and layout
+==========================
+
+The modules in the render directory process and lay out HTML pages.
+
+Overview
+--------
+This is the process to render an HTML document:
+
+First the HTML is parsed to a tree of xmlNodes using the HTML parser in libxml.
+This happens simultaneously with the fetch [html_process_data()].
+
+Any stylesheets which the document depends on are fetched and parsed.
+
+The tree is converted to a 'box tree' by xml_to_box(). The box tree contains a
+node for each block, inline element, table, etc. The aim of this stage is to
+determine the 'display' or 'float' CSS property of each element, and create the
+corresponding node in the box tree. At this stage the style for each element is
+also calculated (from CSS rules and element attributes). The tree is normalised
+so that each node only has children of permitted types (eg. TABLE_CELLs must be
+within TABLE_ROWs) by adding missing boxes.
+
+The box tree is passed to the layout engine [layout_document()], which finds the
+space required by each element and assigns coordinates to the boxes, based on
+the style of each element and the available width. This includes formatting
+inline elements into lines, laying out tables, and positioning floats. The
+layout engine can be invoked again on an already laid out box tree to reformat it
+to a new width. Coordinates in the box tree are relative to the position of the
+parent node.
+
+The box tree can then be rendered using each node's coordinates.
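Because coordinates are stored relative to the parent, a renderer has to sum offsets up the tree to find where a box falls on the page. A minimal sketch of that calculation, assuming a hypothetical cut-down struct box and a hypothetical helper name (the real structure has many more fields):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down box: only the fields this sketch needs. */
struct box {
    int x, y;               /* position relative to the parent box */
    struct box *parent;     /* NULL for the root of the box tree */
};

/* Hypothetical helper: compute the position of a box relative to the
 * root by adding up the relative offsets of the box and its ancestors. */
void box_absolute_position(const struct box *b, int *abs_x, int *abs_y)
{
    assert(b != NULL && abs_x != NULL && abs_y != NULL);
    *abs_x = 0;
    *abs_y = 0;
    for (; b != NULL; b = b->parent) {
        *abs_x += b->x;
        *abs_y += b->y;
    }
}
```

Storing relative coordinates means that moving a box moves its whole subtree without touching the children.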
diff --git a/Docs/03-css b/Docs/03-css
new file mode 100644
index 000000000..5744e27b7
--- /dev/null
+++ b/Docs/03-css
@@ -0,0 +1,81 @@
+CSS parser
+==========
+
+CSS is tokenised by a re2c-generated scanner (scanner.l), and then parsed into a
+memory representation by a lemon-generated parser (parser.y, ruleset.c).
+
+Styles are retrieved using css_get_style(). They can be cascaded by
+css_cascade().
+
+Implementing a new CSS property
+-------------------------------
+In this section I go through adding a CSS property to NetSurf, using the
+'white-space' property as an example. -- James Bursa
+
+First read and understand the description of the property in the CSS
+specification (I have worked from CSS 2, but now 2.1 is probably better).
+
+Add the property to css_enums. This file is used to generate css_enum.h and
+css_enum.c:
+
+ css_white_space inherit normal nowrap pre
+
+(I'm not doing pre-wrap and pre-line for now.)
+
+Add fields to struct css_style to represent the property:
+
+ css_white_space white_space;
+
+Add a parser function for the property to ruleset.c. Declare a new function:
+
+ static void parse_white_space(struct css_style * const s, const struct css_node * const v);
+
+and add it to property_table:
+
+ { "white-space", parse_white_space },
+
+This will cause the function to be called when the parser comes to a rule giving
+a value for white-space. The function is passed a linked list of struct
+::css_node, each of which corresponds to a token in the CSS source, and must
+update s to correspond to that rule. For white-space, the implementation is
+simply:
+
+ static void parse_white_space(struct css_style * const s, const struct css_node * const v)
+ {
+     css_white_space z;
+     /* the value must be exactly one identifier */
+     if (v->type != CSS_NODE_IDENT || v->next != 0)
+         return;
+     z = css_white_space_parse(v->data, v->data_length);
+     if (z != CSS_WHITE_SPACE_UNKNOWN)
+         s->white_space = z;
+ }
+
+First we check that the value consists of exactly one identifier, as described
+in the specification. If it is not, we ignore it, since it may be some future
+CSS. The css_white_space_parse() function is generated in css_enum.c, and
+converts a string giving a value to a constant. If the conversion succeeds, the
+style s is updated.
+
+Add defaults for the style to css_base_style, css_empty_style, and
+css_blank_style in css.c. The value in css_base_style should be the one given as
+'Initial' in the spec, and the value in css_empty_style should be inherit. If
+'Inherited' is yes in the spec, the value in css_blank_style should be inherit,
+otherwise it should be the one given as 'Initial'. Thus for white-space, which
+has "Initial: normal, Inherited: yes" in the spec, we use CSS_WHITE_SPACE_NORMAL
+in css_base_style and CSS_WHITE_SPACE_INHERIT in the other two.
+
+Edit css_cascade() and css_merge() in css.c to handle the property. In both
+cases for white-space this looks like:
+
+ if (apply->white_space != CSS_WHITE_SPACE_INHERIT)
+ style->white_space = apply->white_space;
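The effect of this test can be seen in a small standalone sketch. The enum and the one-field struct css_style below are assumptions made for illustration (the real enum is generated into css_enum.h), and cascade_white_space() is a hypothetical helper holding just this fragment.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed shape of the generated enum; the order of values is a guess. */
typedef enum {
    CSS_WHITE_SPACE_INHERIT,
    CSS_WHITE_SPACE_NORMAL,
    CSS_WHITE_SPACE_NOWRAP,
    CSS_WHITE_SPACE_PRE
} css_white_space;

/* Cut-down style holding only the property under discussion. */
struct css_style {
    css_white_space white_space;
};

/* Hypothetical helper: apply one property from 'apply' onto 'style'.
 * A value of inherit leaves whatever 'style' already holds. */
void cascade_white_space(struct css_style *style,
        const struct css_style *apply)
{
    assert(style != NULL && apply != NULL);
    if (apply->white_space != CSS_WHITE_SPACE_INHERIT)
        style->white_space = apply->white_space;
}
```

Starting from a parent style of pre, applying a rule that leaves white-space at inherit keeps pre, while applying a rule that sets nowrap replaces it.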
+
+Add the property to css_dump_style() (not essential).
+
+Now the box, layout, and/or redraw code needs to be changed to use the new
+style property. This varies much more depending on the property.
+
+For white-space, convert_xml_to_box() was changed to split text at newlines if
+white-space was pre, and to replace spaces with hard spaces for nowrap.
+Additionally, calculate_inline_container_widths() was changed to give the
+appropriate minimum width for pre and nowrap.
diff --git a/Docs/04-errors b/Docs/04-errors
new file mode 100644
index 000000000..786c46374
--- /dev/null
+++ b/Docs/04-errors
@@ -0,0 +1,30 @@
+Error handling
+==============
+
+This section describes error handling in the code.
+
+The most common serious error is memory exhaustion. If malloc(), strdup(), etc.
+fail, clean up and free any partially complete structures, leaving data in a
+consistent state, and return a value which indicates failure, e.g. 0 for
+functions which return a pointer (document the value in the function
+documentation). The caller should then propagate the failure up in the same way.
+At some point, the error should stop being passed up and be reported to the user
+using
+
+ warn_user("NoMemory", 0);
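As an illustration of the clean-up rule, here is a sketch of a constructor that allocates three blocks and releases whatever it managed to allocate if any of them fails. The struct link type and both function names are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical structure owning two heap-allocated strings. */
struct link {
    char *url;
    char *title;
};

/* Create a link holding copies of url and title.
 * Returns 0 on memory exhaustion, with nothing left allocated. */
struct link *link_create(const char *url, const char *title)
{
    struct link *l;

    assert(url != NULL && title != NULL);

    l = malloc(sizeof *l);
    if (l == NULL)
        return 0;

    l->url = malloc(strlen(url) + 1);
    l->title = malloc(strlen(title) + 1);
    if (l->url == NULL || l->title == NULL) {
        /* free the partially complete structure; free(NULL) is a no-op */
        free(l->url);
        free(l->title);
        free(l);
        return 0;
    }

    strcpy(l->url, url);
    strcpy(l->title, title);
    return l;
}

/* Release a link created by link_create(). Accepts NULL. */
void link_destroy(struct link *l)
{
    if (l == NULL)
        return;
    free(l->url);
    free(l->title);
    free(l);
}
```

A caller that receives 0 either returns its own failure value or, at the level where the user should be told, calls warn_user("NoMemory", 0) as shown above.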
+
+The other common error is one returned by a RISC OS SWI. Always use "X" SWIs,
+something like this:
+
+ os_error *error;
+ error = xwimp_get_pointer_info(&pointer);
+ if (error) {
+     LOG(("xwimp_get_pointer_info: 0x%x: %s\n",
+             error->errnum, error->errmess));
+     warn_user("WimpError", error->errmess);
+     return false;
+ }
+
+If an error occurs during initialisation, in most cases exit immediately using
+die(), since this indicates that there is already insufficient memory, or a
+resource file is corrupted, etc.