[[!meta title="LibDOM/MinimalSubset"]]
[[!meta author="Tlsa"]]
[[!meta date="2012-03-31T09:51:42Z"]]

[[!toc]]

Overview
--------

This page enumerates the minimal subset of LibDOM required before it's worth integrating LibDOM with NetSurf's current engine:

- Core (Document, Element, Attribute, etc)
- EventTarget
- HTMLDocument, HTMLElement

With the above interfaces, we can create an HTMLDocument instance per page and create a tree of HTMLElement instances for the content. Specialised HTMLElement subclasses can be added as required.

Current Status
--------------

We now have all the above interfaces implemented (to some level of completeness). The example dom-structure-dump code is now capable of building an HTMLDocument using HTMLElement nodes without leaking dynamic resources.

Before integration can proceed, the following (at least) needs doing:

- We need to invent an API to permit comparison of classes, to satisfy the needs of LibCSS' has\_class selection callback.
- Implement this API.

Once the above things are complete, integration may proceed as follows:

- Remove the parser binding infrastructure from NetSurf's HTML content handler and replace it with instantiation of LibDOM.
- Port the box tree constructor to the LibDOM API.
- Fix up anything else that expects a libxml tree (save complete is a likely candidate).
- Construct form gadgets during box tree construction and register them with the relevant DOM node.

To avoid memory consumption issues and performance regressions, we also need to:

- Replace the hash table of EventListeners on an EventTarget with a linked list, or similar -- the current implementation is both a memory hog and broken.
- Replace the hash tables for Attributes attached to Element nodes with linked lists, or similar. Again, this is currently wasting memory for no benefit.
- Bring the Hubbub parser binding into sync with the feature set of the parser binding used in NetSurf. Feature parity must be ensured to avoid regressions. Specifically:
    - There needs to be a way to set the quirks mode of a document, and retrieve it (i.e. specialised API on HTMLDocument).
    - Form association needs implementing (i.e. specialised API on HTMLFormElement):
        - Fix the mess that is (or implement) HTMLCollection, HTMLOptionsCollection, HTMLFormElement, HTMLInputElement, HTMLSelectElement, HTMLOptionElement, HTMLTextareaElement. (We may get away with just HTMLCollection and HTMLFormElement initially.)
        - HTMLCollection can \*not\* assume that the collection contents all live beneath a fixed root node (particularly for form inputs).
        - Register form control elements with their form.
        - Work out when to deregister them again.
    - Class/id attribute optimisations need retaining (see below).
- HTMLElement needs to gain optimisations for class attributes (i.e. pre-parse the value string into individual class identifiers). See NetSurf's existing Hubbub binding for an approximation of what's needed -- HTMLElement.setAttr(NS) is probably where this needs to live (or possibly implement it in Element.setAttr(NS) or Attr.setValue if we want the same behaviour for XML documents).

The above work can be done in parallel with the integration work but must be completed before a release can occur.

To tidy things up before a libdom release, we need to do the following:

- Get speed back to the level it was at before libdom was integrated.
- Sort out DOM events.
- Cleanup idea:
    - Every point in netsurf which uses libdom should ensure that any time it passes a pointer to a dom\_node pointer, it sets the dom\_node pointer to NULL first.
    - Then every API in libdom which takes a pointer to a dom\_node pointer in order to return a dom\_node pointer asserts that the pointer it was given points to NULL.
    - Then, once we're sure that no caller is tripping the assert, we alter that assert to an "if not NULL, unref and set to NULL".
    - Then we can simplify callers, often into for() loops rather than complex while loops with unreffings.
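
The final stage of the cleanup idea can be sketched roughly as below. This is only an illustration under stated assumptions: `struct node`, `node_ref`, `node_unref`, `node_store`, `node_get_first_child` and `node_get_next_sibling` are hypothetical stand-ins invented for this sketch, not libdom's real types or functions, and the reference count here never frees memory. The point is to show how a getter that releases whatever the out-pointer already holds lets a caller walk children in a plain for() loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted DOM node (not libdom's type).
 * Memory ownership is out of scope: unref only decrements the counter. */
struct node {
	unsigned refcnt;
	struct node *first_child;
	struct node *next;
};

static struct node *node_ref(struct node *n)
{
	if (n != NULL)
		n->refcnt++;
	return n;
}

static void node_unref(struct node *n)
{
	if (n != NULL) {
		assert(n->refcnt > 0);
		n->refcnt--;
	}
}

/* "If not NULL, unref and set": take the new reference first, because the
 * node being replaced may be the very node we are reading from.
 * At the earlier (assert) stage this would be: assert(*result == NULL); */
static void node_store(struct node **result, struct node *fresh)
{
	struct node *old = *result;

	*result = node_ref(fresh);
	node_unref(old);
}

static void node_get_first_child(struct node *n, struct node **result)
{
	node_store(result, n->first_child);
}

static void node_get_next_sibling(struct node *n, struct node **result)
{
	node_store(result, n->next);
}

/* The simplified caller: no explicit unref calls inside the loop. */
static size_t count_children(struct node *parent)
{
	struct node *child = NULL; /* callers always start from NULL */
	size_t count = 0;

	for (node_get_first_child(parent, &child);
	     child != NULL;
	     node_get_next_sibling(child, &child))
		count++;

	return count;
}

/* Builds a root with three children, walks it, and reports whether every
 * child's reference count is back to its starting value afterwards. */
int demo_walk(size_t *count)
{
	struct node c = { 1, NULL, NULL };
	struct node b = { 1, NULL, &c };
	struct node a = { 1, NULL, &b };
	struct node root = { 1, &a, NULL };

	*count = count_children(&root);
	return a.refcnt == 1 && b.refcnt == 1 && c.refcnt == 1;
}
```

In this sketch the loop variable doubles as both input and output of `node_get_next_sibling`, which is why the new reference is taken before the old one is dropped. Whether real libdom getters could adopt the same ordering would need checking against each API.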