|author||Daniel Silverstone <firstname.lastname@example.org>||2017-02-04 09:41:13 +0000|
|committer||Daniel Silverstone <email@example.com>||2017-02-04 09:41:13 +0000|
Initial conversion from MediaWiki, 20170204
Diffstat (limited to 'libdom/minimalsubset.mdwn')
1 files changed, 99 insertions, 0 deletions
diff --git a/libdom/minimalsubset.mdwn b/libdom/minimalsubset.mdwn
new file mode 100644
@@ -0,0 +1,99 @@
+This page enumerates the minimal subset of DOM interfaces required before
+it's worth integrating LibDOM with NetSurf's current engine:
+- Core (Document, Element, Attribute, etc)
+- HTMLDocument, HTMLElement
+With the above interfaces, we can create an HTMLDocument instance per
+page and create a tree of HTMLElement instances for the content.
+Specialised HTMLElement subclasses can be added as required.
+We now have all the above interfaces implemented (to some level of
+completeness). The example dom-structure-dump code is now capable of
+building an HTMLDocument using HTMLElement nodes and not leaking dynamic
+Before integration can proceed, the following (at least) needs doing:
+- <s>We need to invent an API to permit comparison of classes to
+ satisfy LibCSS' has\_class selection callback's needs.</s>
+ - <s>Implement this API</s>.
+Once the above things are complete, integration may proceed as follows:
+- <s>Remove the parser binding infrastructure from NetSurf's HTML
+ content handler and replace it with instantiation of LibDOM.</s>
+- <s>Port the box tree constructor to the LibDOM API</s>
+- <s>Fix up anything else that expects a libxml tree (save complete is
+ a likely candidate)</s>
+- Construct form gadgets during box tree construction & register them
+ with the relevant DOM node
+In order to not have memory consumption issues and performance
+regressions, we also need to:
+- <s>Replace the hash table of EventListeners on an EventTarget with a
+ linked list, or similar -- the current implementation is both a
+ memory hog and broken.</s>
+- <s>Replace the hash tables for Attributes attached to Element nodes
+ with linked lists, or similar. Again, this is currently wasting
+ memory for no benefit.</s>
+- The Hubbub parser binding needs bringing into sync with the feature
+ set of the parser binding used in NetSurf. Feature parity must be
+ ensured to avoid regressions. Specifically:
+ - <s>There needs to be a way to set the quirks mode of a document,
+ and retrieve it (i.e. specialised API on HTMLDocument)</s>
+ - Form association needs implementing (i.e. specialised API on
+ - Fix the mess that is/implement HTMLCollection,
+ HTMLOptionsCollection, HTMLFormElement, HTMLInputElement,
+ HTMLSelectElement, HTMLOptionElement, HTMLTextareaElement
+ (May get away with just HTMLCollection and HTMLFormElement
+ - HTMLCollection can \*not\* assume that the collection
+ contents all live beneath a fixed root node
+ (particularly for form inputs)
+ - Register form control elements with their form
+ - Work out when to deregister them again
+ - <s>Class/id attribute optimisations need retaining (see
+- <s>HTMLElement needs to gain optimisations for class attributes
+ (i.e. pre-parse the value string into individual class identifiers).
+ See NetSurf's existing Hubbub binding for an approximation of what's
+ needed -- HTMLElement.setAttr(NS) is probably where this needs to
+ live (or possibly implement it in Element.setAttr(NS) or
+ Attr.setValue if we want the same behaviour for XML documents). </s>
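
The class-attribute optimisation described above (pre-parsing the value string into individual class identifiers) might look roughly like the sketch below. It is not LibDOM's code: `struct class_list` and the `class_list_*` functions are hypothetical names invented for illustration, and plain C strings stand in for whatever string type the library really uses. The idea is that the split happens once, when the attribute is set, so a later class lookup only has to compare identifiers.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for a pre-parsed class attribute (not a LibDOM type). */
struct class_list {
    char **names;   /* individual class identifiers, NUL-terminated */
    size_t count;   /* number of entries in names */
};

/* HTML separates class names with space, tab, LF, FF or CR. */
static bool is_html_space(char c)
{
    return c == ' ' || c == '\t' || c == '\n' || c == '\f' || c == '\r';
}

/* Release every identifier and reset the list to empty. */
static void class_list_free(struct class_list *list)
{
    for (size_t i = 0; i < list->count; i++)
        free(list->names[i]);
    free(list->names);
    list->names = NULL;
    list->count = 0;
}

/* Split an attribute value into identifiers. Returns false if allocation fails. */
static bool class_list_parse(const char *value, struct class_list *out)
{
    out->names = NULL;
    out->count = 0;

    while (*value != '\0') {
        /* Skip leading separators, then measure one identifier. */
        while (is_html_space(*value))
            value++;
        if (*value == '\0')
            break;
        const char *start = value;
        while (*value != '\0' && !is_html_space(*value))
            value++;
        size_t len = (size_t)(value - start);

        char **grown = realloc(out->names, (out->count + 1) * sizeof(*grown));
        if (grown == NULL)
            goto fail;
        out->names = grown;

        char *name = malloc(len + 1);
        if (name == NULL)
            goto fail;
        memcpy(name, start, len);
        name[len] = '\0';
        out->names[out->count++] = name;
    }
    return true;

fail:
    class_list_free(out);
    return false;
}

/* Class lookup over the pre-parsed list: no re-tokenising of the attribute. */
static bool class_list_has(const struct class_list *list, const char *name)
{
    for (size_t i = 0; i < list->count; i++) {
        if (strcmp(list->names[i], name) == 0)
            return true;
    }
    return false;
}
```

Parsing `"  nav  main-menu\tactive "` with this sketch yields three identifiers, and `class_list_has()` then answers membership queries without touching the original string. Whether the parse belongs in HTMLElement, Element or Attr is the open question the list item raises; the sketch takes no position on it.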
+The above work can be done in parallel with the integration work but
+must be completed before a release can occur.
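
For the attribute-storage item in the list above, a singly linked list is about the simplest replacement for a per-element hash table. The following is a minimal sketch under the assumption that most elements carry only a few attributes, so a linear search is acceptable; `struct attr`, `attr_find`, `attr_set` and `attr_free_all` are hypothetical names and do not reflect LibDOM's real structures.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical attribute node; one allocation per attribute, no bucket array. */
struct attr {
    char *name;
    char *value;
    struct attr *next;
};

/* Portable string duplicate (strdup is not part of ISO C before C23). */
static char *dup_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/* Linear search by name. Returns NULL when the attribute is absent. */
static struct attr *attr_find(struct attr *head, const char *name)
{
    for (struct attr *a = head; a != NULL; a = a->next) {
        if (strcmp(a->name, name) == 0)
            return a;
    }
    return NULL;
}

/* Set or replace an attribute. Returns 0 on success, -1 if allocation fails. */
static int attr_set(struct attr **head, const char *name, const char *value)
{
    char *v = dup_string(value);
    if (v == NULL)
        return -1;

    struct attr *a = attr_find(*head, name);
    if (a != NULL) {
        /* Existing attribute: swap in the new value. */
        free(a->value);
        a->value = v;
        return 0;
    }

    a = malloc(sizeof(*a));
    char *n = dup_string(name);
    if (a == NULL || n == NULL) {
        free(a);
        free(n);
        free(v);
        return -1;
    }
    a->name = n;
    a->value = v;
    a->next = *head;   /* push onto the front of the list */
    *head = a;
    return 0;
}

/* Free every attribute and leave the list empty. */
static void attr_free_all(struct attr **head)
{
    while (*head != NULL) {
        struct attr *next = (*head)->next;
        free((*head)->name);
        free((*head)->value);
        free(*head);
        *head = next;
    }
}
```

An element with no attributes costs a single NULL pointer in this layout, which is the kind of memory saving the list item is after. The same shape would serve for the EventListener list, with a listener and event type in place of the name and value.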
+To tidy things up before a libdom release, we need to do the following:
+- Get speed back to pre-libdom levels.
+- Sort out DOM events.
+- Cleanup idea:
+ - Every point in NetSurf which uses libdom should ensure that,
+ whenever it passes a pointer to a dom\_node pointer, the
+ dom\_node pointer has first been set to NULL.
+ - Then every API in libdom which takes a pointer to a dom\_node
+ pointer in order to return a dom\_node pointer should assert
+ that the pointer points to a NULL.
+ - Then, once we're sure that no caller trips the assert, we change
+ it to an "if not NULL, unref and set to NULL".
+ - Then we can simplify callers, often into for() loops rather than
+ complex while loops with unreffings.
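
One reading of the cleanup idea above, sketched with a toy node type. Everything here is hypothetical: `struct node` stands in for a reference-counted dom\_node, and none of the function names are libdom's. The strict accessor shows the intermediate stage (assert that the caller passed a pointer to NULL); the other two show the final stage, where the accessor releases whatever the caller left in the out-pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted dom_node. */
struct node {
    int refcnt;
    struct node *first_child;
    struct node *next;
};

static struct node *node_ref(struct node *n)
{
    if (n != NULL)
        n->refcnt++;
    return n;
}

/* Toy unref: only adjusts the count, never frees. */
static void node_unref(struct node *n)
{
    if (n != NULL) {
        assert(n->refcnt > 0);
        n->refcnt--;
    }
}

/* Intermediate stage: callers must hand in a pointer to NULL. */
static void node_first_child_strict(struct node *n, struct node **result)
{
    assert(*result == NULL);
    *result = node_ref(n->first_child);
}

/* Final stage: "if not NULL, unref and set" replaces the assert. */
static void node_first_child(struct node *n, struct node **result)
{
    struct node *child = node_ref(n->first_child);
    node_unref(*result);
    *result = child;
}

static void node_next_sibling(struct node *n, struct node **result)
{
    /* Take the new reference before dropping the old one: n may be *result. */
    struct node *next = node_ref(n->next);
    node_unref(*result);
    *result = next;
}

/* Caller written as a plain for() loop with no explicit unref calls. */
static int count_children(struct node *parent)
{
    int count = 0;
    struct node *child = NULL;

    for (node_first_child(parent, &child); child != NULL;
         node_next_sibling(child, &child))
        count++;

    return count;
}
```

When the loop runs to completion, the last call stores NULL after releasing the final child, so no reference is left over. A caller that breaks out early still holds a reference and has to release it itself. Ordering matters in the final-stage accessors: the new reference is taken before the old one is dropped, so passing the same variable as both input and output stays safe.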