[[!meta title="LibDOM/MinimalSubset"]]
[[!meta author="Tlsa"]]
[[!meta date="2012-03-31T09:51:42Z"]]
[[!toc]]

Overview
--------

This page enumerates the minimal subset of DOM interfaces required
before it's worth integrating LibDOM with NetSurf's current engine:

- Core (Document, Element, Attribute, etc)
- EventTarget
- HTMLDocument, HTMLElement

With the above interfaces, we can create an HTMLDocument instance per
page and create a tree of HTMLElement instances for the content.
Specialised HTMLElement subclasses can be added as required.

Current Status
--------------
We now have all the above interfaces implemented (to some level of
completeness). The example dom-structure-dump code is now capable of
building an HTMLDocument using HTMLElement nodes and not leaking dynamic
resources.

Before integration can proceed, the following (at least) needs doing:

- <s>We need to invent an API to permit comparison of classes to
satisfy LibCSS' has\_class selection callback's needs.</s>
- <s>Implement this API</s>.

Once the above things are complete, integration may proceed as follows:

- <s>Remove the parser binding infrastructure from NetSurf's HTML
content handler and replace it with instantiation of LibDOM.</s>
- <s>Port the box tree constructor to the LibDOM API</s>
- <s>Fix up anything else that expects a libxml tree (save complete is
a likely candidate)</s>
- Construct form gadgets during box tree construction & register them
with the relevant DOM node

To avoid memory consumption issues and performance regressions, we
also need to:

- <s>Replace the hash table of EventListeners on an EventTarget with a
linked list, or similar -- the current implementation is both a
memory hog and broken.</s>
- <s>Replace the hash tables for Attributes attached to Element nodes
with linked lists, or similar. Again, this is currently wasting
memory for no benefit.</s>
- The Hubbub parser binding needs bringing into sync with the feature
set of the parser binding used in NetSurf. Feature parity must be
ensured to avoid regressions. Specifically:
    - <s>There needs to be a way to set the quirks mode of a document,
      and retrieve it (i.e. specialised API on HTMLDocument)</s>
    - Form association needs implementing (i.e. specialised API on
      HTMLFormElement)
        - Fix up or implement HTMLCollection,
          HTMLOptionsCollection, HTMLFormElement, HTMLInputElement,
          HTMLSelectElement, HTMLOptionElement, HTMLTextAreaElement.
          (We may get away with just HTMLCollection and
          HTMLFormElement initially.)
            - HTMLCollection can \*not\* assume that the collection
              contents all live beneath a fixed root node
              (particularly for form inputs)
        - Register form control elements with their form
        - Work out when to deregister them again
    - <s>Class/id attribute optimisations need retaining (see
      below)</s>
- <s>HTMLElement needs to gain optimisations for class attributes
(i.e. pre-parse the value string into individual class identifiers).
See NetSurf's existing Hubbub binding for an approximation of what's
needed -- HTMLElement.setAttr(NS) is probably where this needs to
live (or possibly implement it in Element.setAttr(NS) or
Attr.setValue if we want the same behaviour for XML documents).</s>

The above work can be done in parallel with the integration work but
must be completed before a release can occur.
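The class attribute optimisation in the list above (pre-parsing the value string into individual class identifiers, then comparing against them for a has\_class-style check) can be sketched roughly as follows. This is a minimal illustration, not LibDOM's or NetSurf's actual code: `struct class_list`, `class_list_parse`, `class_list_has` and `class_list_free` are hypothetical names, plain C strings stand in for whatever string type the real binding uses, and matching is assumed to be case-sensitive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for a pre-parsed class attribute value. */
struct class_list {
	char **names;  /* individually allocated class identifiers */
	size_t count;  /* number of entries in names */
};

/* Whitespace characters that separate class names in an attribute value. */
static bool is_space(char c)
{
	return c == ' ' || c == '\t' || c == '\n' || c == '\f' || c == '\r';
}

/* Release every identifier and reset the list to empty. */
static void class_list_free(struct class_list *list)
{
	for (size_t i = 0; i < list->count; i++)
		free(list->names[i]);
	free(list->names);
	list->names = NULL;
	list->count = 0;
}

/* Split value into class identifiers (done once, when the attribute is set).
 * Returns false on allocation failure, leaving the list empty. */
static bool class_list_parse(struct class_list *list, const char *value)
{
	assert(list != NULL && value != NULL);
	list->names = NULL;
	list->count = 0;

	while (*value != '\0') {
		const char *start;
		size_t len;
		char *name;
		char **grown;

		while (is_space(*value))
			value++;
		if (*value == '\0')
			break;

		start = value;
		while (*value != '\0' && !is_space(*value))
			value++;
		len = (size_t) (value - start);

		name = malloc(len + 1);
		grown = realloc(list->names, (list->count + 1) * sizeof(*grown));
		if (grown != NULL)
			list->names = grown;  /* old pointer is no longer valid */
		if (name == NULL || grown == NULL) {
			free(name);
			class_list_free(list);
			return false;
		}
		memcpy(name, start, len);
		name[len] = '\0';
		list->names[list->count++] = name;
	}
	return true;
}

/* has_class-style check: does the pre-parsed list contain name? */
static bool class_list_has(const struct class_list *list, const char *name)
{
	for (size_t i = 0; i < list->count; i++) {
		if (strcmp(list->names[i], name) == 0)
			return true;
	}
	return false;
}
```

Parsing once when the attribute is set means each selector match only walks a short array of identifiers rather than re-tokenising the attribute value.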
To tidy things up before a libdom release, we need to do the following:

- Get speed back to pre-libdom levels.
- Sort out DOM events.
- Cleanup idea:
    - Every point in NetSurf which uses libdom should ensure that, any
      time it passes a pointer to a dom\_node pointer, it sets that
      dom\_node pointer to NULL first.
    - Then every API in libdom which takes a pointer to a dom\_node
      pointer in order to return a dom\_node pointer should assert
      that the pointer it was given points to NULL.
    - Then, once we're sure that no caller trips the assert, we change
      that assert to an "if not NULL, unref and set to NULL".
    - Then we can simplify callers, often into for() loops rather than
      complex while loops with unreffings.
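The staged cleanup idea above can be illustrated with a small stand-alone sketch. It does not use the real libdom API: `struct node`, `node_create`, `node_ref`, `node_unref`, `node_get_next_strict`, `node_get_next` and `count_siblings` are all hypothetical names, and the sketch assumes that a getter hands back a new reference which the caller must release.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted DOM node. */
struct node {
	unsigned int refcnt;
	struct node *next;  /* next sibling, or NULL */
};

/* Allocate a node holding one reference, owned by the creator. */
static struct node *node_create(struct node *next)
{
	struct node *n = malloc(sizeof(*n));
	if (n != NULL) {
		n->refcnt = 1;
		n->next = next;
	}
	return n;
}

static struct node *node_ref(struct node *n)
{
	if (n != NULL)
		n->refcnt++;
	return n;
}

static void node_unref(struct node *n)
{
	if (n != NULL && --n->refcnt == 0)
		free(n);
}

/* First stage: the getter asserts that the caller cleared *result. */
static void node_get_next_strict(struct node *n, struct node **result)
{
	assert(*result == NULL);
	*result = node_ref(n->next);
}

/* Later stage: the getter releases whatever *result held before. */
static void node_get_next(struct node *n, struct node **result)
{
	/* Take the new reference first, since *result may alias n. */
	struct node *next = node_ref(n->next);

	if (*result != NULL)
		node_unref(*result);
	*result = next;
}

/* With the later-stage getter, a sibling walk becomes a plain for() loop
 * with no explicit unref calls in the loop body. */
static size_t count_siblings(struct node *first)
{
	size_t count = 0;
	struct node *cur;

	for (cur = node_ref(first); cur != NULL; node_get_next(cur, &cur))
		count++;
	return count;
}
```

In the loop, each call to `node_get_next(cur, &cur)` drops the reference on the current node and replaces it with a reference on its sibling, so the walk ends with no references outstanding.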