[[!meta title="LibDOM/MinimalSubset"]]
[[!meta author="Tlsa"]]
[[!meta date="2012-03-31T09:51:42Z"]]
[[!toc]]

Overview
--------
This page lists the minimal subset of LibDOM interfaces that must exist
before it is worth integrating LibDOM with NetSurf's current engine:

- Core (Document, Element, Attribute, etc)
- EventTarget
- HTMLDocument, HTMLElement

With the above interfaces, we can create an HTMLDocument instance per
page and create a tree of HTMLElement instances for the content.
Specialised HTMLElement subclasses can be added as required.

Current Status
--------------
All of the above interfaces are now implemented (to some level of
completeness). The example dom-structure-dump code can now build an
HTMLDocument out of HTMLElement nodes without leaking dynamically
allocated resources.

Before integration can proceed, at least the following needs doing:

- Design an API for comparing class names that satisfies the needs of
LibCSS's has\_class selection callback.
- Implement this API.
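
One possible shape for such an API is sketched below. This is only an illustration, not LibDOM's actual interface: the `sketch_` types, the error codes and the function name are all hypothetical, and the element is reduced to a list of already-separated class names. It reports the answer through an out-parameter and keeps the return value for errors.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical error codes, invented for this sketch. */
typedef enum { SKETCH_OK = 0, SKETCH_BAD_ARG = 1 } sketch_err;

/* Hypothetical element: only its list of class names is modelled. */
typedef struct {
    const char **classes;  /* individual class names */
    size_t n_classes;      /* number of entries in classes */
} sketch_element;

/*
 * Report whether `element` carries the class `name`.
 * The answer is written to `*match`; the return value is an error code.
 * Comparison here is a plain case-sensitive string compare (an assumption).
 */
sketch_err sketch_element_has_class(const sketch_element *element,
        const char *name, bool *match)
{
    size_t i;

    if (element == NULL || name == NULL || match == NULL)
        return SKETCH_BAD_ARG;

    *match = false;
    for (i = 0; i < element->n_classes; i++) {
        if (strcmp(element->classes[i], name) == 0) {
            *match = true;
            break;
        }
    }

    return SKETCH_OK;
}
```

A selection callback could then call this once per class selector and pass the boolean straight back to the CSS engine.
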
Once the above things are complete, integration may proceed as follows:
- Remove the parser binding infrastructure from NetSurf's HTML
content handler and replace it with instantiation of LibDOM.
- Port the box tree constructor to the LibDOM API
- Fix up anything else that expects a libxml tree (save complete is
a likely candidate)
- Construct form gadgets during box tree construction and register
them with the relevant DOM node

To avoid memory consumption issues and performance regressions, we
also need to:

- Replace the hash table of EventListeners on an EventTarget with a
linked list, or similar -- the current implementation is both a
memory hog and broken.
- Replace the hash tables for Attributes attached to Element nodes
with linked lists, or similar. Again, this is currently wasting
memory for no benefit.
- The Hubbub parser binding needs bringing into sync with the feature
set of the parser binding used in NetSurf. Feature parity must be
ensured to avoid regressions. Specifically:
    - There needs to be a way to set the quirks mode of a document,
      and retrieve it (i.e. specialised API on HTMLDocument)
    - Form association needs implementing (i.e. specialised API on
      HTMLFormElement)
    - Fix up or implement HTMLCollection,
      HTMLOptionsCollection, HTMLFormElement, HTMLInputElement,
      HTMLSelectElement, HTMLOptionElement, HTMLTextAreaElement
      (we may get away with just HTMLCollection and HTMLFormElement
      initially)
        - HTMLCollection can \*not\* assume that the collection
          contents all live beneath a fixed root node
          (particularly for form inputs)
    - Register form control elements with their form
    - Work out when to deregister them again
    - Class/id attribute optimisations need retaining (see
      below)
- HTMLElement needs to gain optimisations for class attributes
(i.e. pre-parse the value string into individual class identifiers).
See NetSurf's existing Hubbub binding for an approximation of what's
needed -- HTMLElement.setAttr(NS) is probably where this needs to
live (or possibly implement it in Element.setAttr(NS) or
Attr.setValue if we want the same behaviour for XML documents).
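
The class pre-parsing mentioned above could look roughly like the following. This is a minimal sketch under stated assumptions, not the code from NetSurf's Hubbub binding: the `sketch_` names are hypothetical, class names are stored as plain heap-allocated C strings, and the separators are assumed to be the usual HTML whitespace characters (space, tab, LF, FF, CR).

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the pre-parsed class names. */
typedef struct {
    char **names;   /* array of NUL-terminated class names */
    size_t count;   /* number of entries in names */
} sketch_class_list;

/* Whitespace characters assumed to separate class names. */
static int sketch_is_space(char c)
{
    return c == ' ' || c == '\t' || c == '\n' || c == '\f' || c == '\r';
}

/* Release everything held by the list and reset it to empty. */
void sketch_free_classes(sketch_class_list *list)
{
    size_t i;

    for (i = 0; i < list->count; i++)
        free(list->names[i]);
    free(list->names);
    list->names = NULL;
    list->count = 0;
}

/*
 * Split a class attribute value into individual class names.
 * Returns 0 on success, -1 on allocation failure (list left empty).
 */
int sketch_parse_classes(const char *value, sketch_class_list *out)
{
    const char *p = value;

    out->names = NULL;
    out->count = 0;

    while (*p != '\0') {
        const char *start;
        size_t len;
        char *copy;
        char **grown;

        /* Skip separators before the next name. */
        while (sketch_is_space(*p))
            p++;
        if (*p == '\0')
            break;

        /* Find the end of this name. */
        start = p;
        while (*p != '\0' && !sketch_is_space(*p))
            p++;
        len = (size_t)(p - start);

        copy = malloc(len + 1);
        if (copy == NULL) {
            sketch_free_classes(out);
            return -1;
        }
        memcpy(copy, start, len);
        copy[len] = '\0';

        grown = realloc(out->names, (out->count + 1) * sizeof(*grown));
        if (grown == NULL) {
            free(copy);
            sketch_free_classes(out);
            return -1;
        }
        out->names = grown;
        out->names[out->count++] = copy;
    }

    return 0;
}
```

Running this once whenever the class attribute is set means that selector matching only has to walk a short array of names instead of re-scanning the attribute string each time.
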
The above work can be done in parallel with the integration work but
must be completed before a release can occur.

To tidy things up before a LibDOM release, we need to do the following:

- Get NetSurf's speed back to what it was before the move to LibDOM.
- Sort out DOM events.
- Cleanup idea:
    - Every point in NetSurf that passes a pointer to a dom\_node
      pointer into LibDOM should set that dom\_node pointer to NULL
      first.
    - Every LibDOM API that takes a pointer to a dom\_node pointer in
      order to return a dom\_node should then assert that the
      pointer it was given points to NULL.
    - Once we are sure that no caller trips that assertion, change the
      assert to "if not NULL, unref and set to NULL".
    - Callers can then be simplified, often into for() loops rather
      than complex while loops with manual unrefs.
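
The stages of the cleanup idea can be illustrated with a toy reference-counted node. This is a sketch only: `sketch_node` and the functions below are hypothetical stand-ins, not LibDOM's dom\_node API, and the toy unref only decrements a counter rather than freeing anything.

```c
#include <assert.h>
#include <stddef.h>

/* Toy reference-counted node standing in for a DOM node. */
typedef struct sketch_node {
    int refcnt;
    struct sketch_node *next;   /* next sibling, or NULL */
} sketch_node;

static sketch_node *sketch_ref(sketch_node *n)
{
    if (n != NULL)
        n->refcnt++;
    return n;
}

static void sketch_unref(sketch_node *n)
{
    if (n != NULL)
        n->refcnt--;
}

/* Intermediate stage: the getter insists that the caller cleared *result. */
void sketch_next_sibling_strict(sketch_node *node, sketch_node **result)
{
    assert(*result == NULL);
    *result = sketch_ref(node->next);
}

/*
 * Final stage: the getter releases whatever *result held before.
 * The new reference is taken first, so passing the same variable as
 * both the node and the result is safe.
 */
void sketch_next_sibling(sketch_node *node, sketch_node **result)
{
    sketch_node *next = sketch_ref(node->next);

    if (*result != NULL)
        sketch_unref(*result);
    *result = next;
}

/* With the relaxed getter, sibling iteration collapses into a for() loop. */
int sketch_count_from(sketch_node *first)
{
    sketch_node *cur = sketch_ref(first);
    int count = 0;

    for (; cur != NULL; sketch_next_sibling(cur, &cur))
        count++;

    return count;
}
```

The loop in `sketch_count_from` holds exactly one reference at a time and drops it as it advances, which is the simplification the final step is aiming for.
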