| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
right options.
Fix a couple more doxygen warnings.
svn path=/trunk/hubbub/; revision=5996
|
|
|
|
|
|
| |
Add a bunch of extra commentary to stop doxygen warning.
svn path=/trunk/hubbub/; revision=5994
|
|
|
|
|
|
|
|
|
|
| |
states.
What happened was that, given \rabc, we would advance past the \r, then read at current_offset + len (len == 1), i.e. read 'b' instead of 'a'. If the data in the inputstream's internal buffer happened to end immediately after the \r, we would read past the end of the buffer, thanks to a bug in lpu_inputstream_peek which was fixed in r5965. In any case, we would still be looking at the wrong character when looking for CRLF pairs.
All regression tests now pass again.
svn path=/trunk/hubbub/; revision=5967
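The off-by-one described in the r5967 message can be sketched as follows. This is a minimal illustration, not hubbub's code: char_after_cr() is a hypothetical helper working on a plain byte buffer rather than on the libparserutils inputstream. It shows that once the offset has been advanced past the \r, the next character is read at that offset, not at offset + len.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of hubbub or libparserutils):
 * return the character following the CR at buf[cr_offset],
 * or -1 if the buffer ends immediately after the CR. */
static int char_after_cr(const char *buf, size_t buf_len, size_t cr_offset)
{
	assert(cr_offset < buf_len && buf[cr_offset] == '\r');

	/* Advance past the CR (its length is 1). */
	size_t offset = cr_offset + 1;

	/* The buggy form read buf[offset + 1] here: 'b' in "\rabc",
	 * and one byte past the end when the buffer ends after the CR. */
	if (offset >= buf_len)
		return -1;

	return (unsigned char) buf[offset];
}
```

With this shape, a CRLF pair is detected by comparing the returned value against '\n', and a return of -1 signals that more data is needed before the decision can be made.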
|
|
|
|
|
|
|
| |
Drop HUBBUB_OOD and just use HUBBUB_NEEDDATA instead.
Currently aborts in bogus comment handling if it encounters a \r at the end of the inputstream's UTF-8 buffer.
svn path=/trunk/hubbub/; revision=5966
|
|
|
|
| |
svn path=/trunk/hubbub/; revision=5851
|
|
|
|
| |
svn path=/trunk/hubbub/; revision=5846
|
|
|
|
| |
svn path=/trunk/hubbub/; revision=5664
|
|
|
|
|
|
| |
Fix commentary copied from libcss
svn path=/trunk/hubbub/; revision=5663
|
|
|
|
| |
svn path=/trunk/hubbub/; revision=5656
|
|
|
|
| |
svn path=/trunk/hubbub/; revision=5285
|
|
|
|
|
|
|
| |
Pedantic whitespace changes.
More paranoia surrounding entity handling.
svn path=/trunk/hubbub/; revision=5266
|
|
|
|
|
|
|
|
| |
testing whether the 4 most recently read characters in the data state are <!--.
Add a couple of assertions for paranoia.
svn path=/trunk/hubbub/; revision=5146
|
|
|
|
| |
svn path=/trunk/hubbub/; revision=5108
|
|
|
|
| |
svn path=/trunk/hubbub/; revision=5107
|
|
|
|
| |
svn path=/trunk/hubbub/; revision=5106
|
|
|
|
| |
svn path=/trunk/hubbub/; revision=5104
|
|
|
|
| |
svn path=/trunk/hubbub/; revision=5099
|
|
|
|
| |
svn path=/trunk/hubbub/; revision=5098
|
|
|
|
|
|
| |
store their own length and when emitting the comment, the buffer contains the whole comment and nothing else.
svn path=/trunk/hubbub/; revision=5095
|
|
|
|
| |
svn path=/trunk/hubbub/; revision=5093
|
|
|
|
| |
svn path=/trunk/hubbub/; revision=5086
|
|
|
|
| |
svn path=/trunk/hubbub/; revision=5080
|
|
|
|
| |
svn path=/trunk/hubbub/; revision=5030
|
|
|
|
|
|
|
| |
- Remove the no-op FINISH() macro from the tokeniser
- Fix a typo in the charset detector
svn path=/trunk/hubbub/; revision=5007
|
|
|
|
| |
svn path=/trunk/hubbub/; revision=5000
|
|
|
|
| |
svn path=/trunk/hubbub/; revision=4980
|
|
|
|
| |
svn path=/trunk/hubbub/; revision=4979
|
|
|
|
|
|
| |
so that "encoding change" requests can be sent back down the chain from the treebuilder at some point.
svn path=/trunk/hubbub/; revision=4978
|
|
|
|
| |
svn path=/trunk/hubbub/; revision=4894
|
|
|
|
| |
svn path=/trunk/hubbub/; revision=4893
|
|
|
|
|
|
| |
without semicolons in attribute values.
svn path=/trunk/hubbub/; revision=4892
|
|
|
|
| |
svn path=/trunk/hubbub/; revision=4890
|
|
|
|
|
|
| |
of the file.
svn path=/trunk/hubbub/; revision=4889
|
|
|
|
| |
svn path=/trunk/hubbub/; revision=4888
|
|
|
|
|
|
| |
used to simply tokeniser->context.pending.
svn path=/trunk/hubbub/; revision=4882
|
|
|
|
| |
svn path=/trunk/hubbub/; revision=4881
|
|
|
|
|
|
| |
single statements they expanded to.
svn path=/trunk/hubbub/; revision=4880
|
|
|
|
|
|
| |
now necessary. Should provide a small speedup.
svn path=/trunk/hubbub/; revision=4873
|
|
|
|
|
|
|
|
|
| |
profiling
- Fix a few instances where the wrong COLLECT*() macros were used
- Always use emit_current_chars(tokeniser) rather than emit_character_token(tokeniser, tokeniser->context.chars), to make sure that the pointer is always set correctly
svn path=/trunk/hubbub/; revision=4872
|
|
|
|
| |
svn path=/trunk/hubbub/; revision=4845
|
|
|
|
| |
svn path=/trunk/hubbub/; revision=4844
|
|
|
|
| |
svn path=/trunk/hubbub/; revision=4843
|
|
|
|
| |
svn://source.netsurf-browser.org/branches/takkaria/hubbub-parserutils
........
r4631 | takkaria | 2008-07-13 12:54:30 +0100 (Sun, 13 Jul 2008) | 2 lines
Initial hatchet job moving to libparserutils (search and replace and a bit of cleaning up). This doesn't compile.
........
r4632 | takkaria | 2008-07-13 15:28:52 +0100 (Sun, 13 Jul 2008) | 2 lines
libparserutilize everything up to the "before attribute name" state. (Not compiling)
........
r4633 | takkaria | 2008-07-13 15:32:14 +0100 (Sun, 13 Jul 2008) | 2 lines
Replace all uses of "current_{comment|chars}" with just "chars".
........
r4634 | takkaria | 2008-07-13 16:12:06 +0100 (Sun, 13 Jul 2008) | 2 lines
Fix lots of compile errors, lpuise "before attribute name" state.
........
r4636 | takkaria | 2008-07-13 17:23:17 +0100 (Sun, 13 Jul 2008) | 2 lines
Finish lpuising the tag states, apart from character references.
........
r4637 | takkaria | 2008-07-13 19:58:52 +0100 (Sun, 13 Jul 2008) | 2 lines
lpuise the comment states.
........
r4638 | takkaria | 2008-07-13 20:04:31 +0100 (Sun, 13 Jul 2008) | 2 lines
Switch to setting hubbub_string::len to 0 instead of hubbub_string::ptr to NULL to indicate an empty buffer, as it was previously.
........
r4639 | takkaria | 2008-07-13 21:02:11 +0100 (Sun, 13 Jul 2008) | 2 lines
"lpu up" about half of the DOCTYPE handling stages.
........
r4640 | takkaria | 2008-07-13 21:23:00 +0100 (Sun, 13 Jul 2008) | 2 lines
Finish off LPUing the doctype modes.
........
r4641 | takkaria | 2008-07-13 21:37:33 +0100 (Sun, 13 Jul 2008) | 2 lines
The tokeniser uses lpu apart from the entity matcher, now.
........
r4643 | takkaria | 2008-07-14 01:20:36 +0100 (Mon, 14 Jul 2008) | 2 lines
Fix up the character reference matching stuff. Still not properly dealt with, but compiles further.
........
r4644 | takkaria | 2008-07-14 01:24:49 +0100 (Mon, 14 Jul 2008) | 2 lines
Get the tokeniser compiling in its LPU'd form.
........
r4645 | takkaria | 2008-07-14 01:26:34 +0100 (Mon, 14 Jul 2008) | 2 lines
Remember to advance the stream position after emitting tokens.
........
r4646 | takkaria | 2008-07-14 01:34:36 +0100 (Mon, 14 Jul 2008) | 2 lines
Nuke the src/input directory and start work on the treebuilder.
........
r4647 | takkaria | 2008-07-14 01:56:27 +0100 (Mon, 14 Jul 2008) | 2 lines
Get hubbub building in its LPU'd form.
........
r4648 | takkaria | 2008-07-14 02:41:03 +0100 (Mon, 14 Jul 2008) | 2 lines
Get the tokeniser2 testrunner working.
........
r4649 | takkaria | 2008-07-14 02:48:55 +0100 (Mon, 14 Jul 2008) | 2 lines
Fix test LDFLAGS so things link properly.
........
r4650 | takkaria | 2008-07-14 16:25:51 +0100 (Mon, 14 Jul 2008) | 2 lines
Get testcases compiling, remove ones now covered by libparserutils.
........
r4651 | takkaria | 2008-07-14 16:37:09 +0100 (Mon, 14 Jul 2008) | 2 lines
Remove more tests covered by libpu.
........
r4652 | takkaria | 2008-07-14 17:53:18 +0100 (Mon, 14 Jul 2008) | 2 lines
Fix up the tokeniser a bit.
........
r4653 | takkaria | 2008-07-14 19:02:15 +0100 (Mon, 14 Jul 2008) | 3 lines
- Remove the buffer_handler stuff from hubbub
- Add the basics of a buffer for attribute values and text.
........
r4654 | takkaria | 2008-07-14 20:00:45 +0100 (Mon, 14 Jul 2008) | 2 lines
Get character references working in attribute values, start trying to make them work in character tokens.
........
r4656 | takkaria | 2008-07-14 23:28:52 +0100 (Mon, 14 Jul 2008) | 2 lines
Get entities working a bit better.
........
r4657 | takkaria | 2008-07-14 23:37:16 +0100 (Mon, 14 Jul 2008) | 2 lines
Get entities working properly. (!)
........
r4658 | takkaria | 2008-07-14 23:56:10 +0100 (Mon, 14 Jul 2008) | 2 lines
Make doctypes work a bit better.
........
r4659 | takkaria | 2008-07-15 00:18:49 +0100 (Tue, 15 Jul 2008) | 2 lines
Get DOCTYPEs working.
........
r4660 | takkaria | 2008-07-15 00:26:36 +0100 (Tue, 15 Jul 2008) | 2 lines
Fix CDATA sections.
........
r4661 | takkaria | 2008-07-15 01:01:16 +0100 (Tue, 15 Jul 2008) | 2 lines
Get comments working again.
........
r4662 | takkaria | 2008-07-15 01:14:19 +0100 (Tue, 15 Jul 2008) | 2 lines
Fix EOF in "after attribute name" state.
........
r4664 | takkaria | 2008-07-15 01:30:27 +0100 (Tue, 15 Jul 2008) | 2 lines
Put the tests in better order, remove one now superseded by libpu.
........
r4665 | takkaria | 2008-07-15 01:46:29 +0100 (Tue, 15 Jul 2008) | 2 lines
Remove a lot of now-redundant clearings of the current stream offset.
........
r4667 | jmb | 2008-07-15 11:56:54 +0100 (Tue, 15 Jul 2008) | 2 lines
Completely purge charset stuff from hubbub. Parserutils handles this now.
........
r4677 | takkaria | 2008-07-15 21:03:42 +0100 (Tue, 15 Jul 2008) | 2 lines
Get more tests passing, handle NUL bytes in data state.
........
r4694 | takkaria | 2008-07-18 17:55:44 +0100 (Fri, 18 Jul 2008) | 3 lines
- Handle CRs correctly in some token states.
- Handle NULs correctly in the CDATA state.
........
r4706 | takkaria | 2008-07-19 14:58:48 +0100 (Sat, 19 Jul 2008) | 2 lines
Improve the tokeniser2 output a bit.
........
r4721 | takkaria | 2008-07-21 20:57:29 +0100 (Mon, 21 Jul 2008) | 2 lines
Get a better framework in place to allow switching to using a buffer mid-collect. This fails a couple of testcases and doesn't implement proper CR or NUL support yet.
........
r4725 | takkaria | 2008-07-23 17:20:07 +0100 (Wed, 23 Jul 2008) | 2 lines
Make comment tokens in tokeniser2 display both expected and actual output.
........
r4726 | takkaria | 2008-07-23 19:10:23 +0100 (Wed, 23 Jul 2008) | 4 lines
- Add FINISH() macro which stops using buffered character collection.
- Make the UTF-8 encoding of U+FFFD a global variable, for sanity
- Make the bogus comment state deal with NULs correctly.
........
r4730 | takkaria | 2008-07-24 00:35:16 +0100 (Thu, 24 Jul 2008) | 2 lines
Try to get NUL bytes handled as the spec says.
........
r4731 | takkaria | 2008-07-24 00:40:59 +0100 (Thu, 24 Jul 2008) | 2 lines
Get CRs working in the data state.
........
r4732 | takkaria | 2008-07-24 00:47:45 +0100 (Thu, 24 Jul 2008) | 2 lines
Set force-quirks correctly when failing to match PUBLIC or SYSTEM in DOCTYPEs.
........
r4773 | takkaria | 2008-07-28 15:34:41 +0100 (Mon, 28 Jul 2008) | 2 lines
Fix up the tokeniser, finally.
........
r4801 | takkaria | 2008-07-29 15:59:31 +0100 (Tue, 29 Jul 2008) | 2 lines
Refactor macros a bit.
........
r4802 | takkaria | 2008-07-29 16:04:17 +0100 (Tue, 29 Jul 2008) | 2 lines
Do s/HUBBUB_TOKENISER_STATE_/STATE_/, for shorter line lengths.
........
r4805 | takkaria | 2008-07-29 16:58:37 +0100 (Tue, 29 Jul 2008) | 4 lines
Start cleaning up the hubbub tokeniser:
- refactor to use new inline emit_character_token() and emit_current_tag() functions; makes code clearer
- check EOF before using the CHAR() macro, so eventually it can be removed.
........
r4806 | takkaria | 2008-07-29 17:45:36 +0100 (Tue, 29 Jul 2008) | 2 lines
More cleanup like the previous commit.
........
r4807 | takkaria | 2008-07-29 19:48:44 +0100 (Tue, 29 Jul 2008) | 2 lines
Rewrite comment-handling code to be just the one function, whilst updating it to handle CRs and NULs properly. (All comments now always use the buffer.)
........
r4820 | takkaria | 2008-07-30 14:14:49 +0100 (Wed, 30 Jul 2008) | 2 lines
Finish off the first sweep of cleaning up and refactoring the tokeniser.
........
r4821 | takkaria | 2008-07-30 15:12:22 +0100 (Wed, 30 Jul 2008) | 2 lines
Add copyright statement.
........
r4822 | takkaria | 2008-07-30 17:23:01 +0100 (Wed, 30 Jul 2008) | 2 lines
Apply changes made to tokeniser2 to tokeniser3.
........
r4829 | takkaria | 2008-07-31 01:59:07 +0100 (Thu, 31 Jul 2008) | 4 lines
- Make the tokeniser save everything into the buffer, at least for now.
- Fix logic errors introduced in refactoring
- Avoid emitting more tokens than we have to (e.g. instead of emitting "<>" and switching back to the data state, just switch back to the data state and let it take care of it)
........
r4830 | takkaria | 2008-07-31 02:03:08 +0100 (Thu, 31 Jul 2008) | 2 lines
Small treebuilder <isindex> fix.
........
r4831 | takkaria | 2008-07-31 02:32:29 +0100 (Thu, 31 Jul 2008) | 2 lines
Stop holding on to pointers to character data across treebuilder calls.
........
r4832 | takkaria | 2008-07-31 02:45:09 +0100 (Thu, 31 Jul 2008) | 18 lines
Merge revisions 4620-4831 from trunk hubbub to libinputstream hubbub, modulo one change to test/Makefile which makes the linker choke when linking tests.
------------------------------------------------------------------------
r4666 | jmb | 2008-07-15 11:52:13 +0100 (Tue, 15 Jul 2008) | 3 lines
Make tree2 perform reference counting.
Fix bits of the treebuilder to perform reference counting correctly in the face of *result not pointing to the same object as the node passed in to the treebuilder client callbacks.
------------------------------------------------------------------------
r4668 | jmb | 2008-07-15 12:37:30 +0100 (Tue, 15 Jul 2008) | 2 lines
Fully document treebuilder callbacks.
------------------------------------------------------------------------
r4675 | takkaria | 2008-07-15 21:01:03 +0100 (Tue, 15 Jul 2008) | 2 lines
Fix memory leak in tokeniser2.
------------------------------------------------------------------------
........
r4834 | jmb | 2008-07-31 09:57:51 +0100 (Thu, 31 Jul 2008) | 2 lines
Fix infinite loop in charset detector
........
r4835 | jmb | 2008-07-31 13:01:24 +0100 (Thu, 31 Jul 2008) | 2 lines
Actually store namespaces on formatting list. Otherwise we read uninitialised memory. Add some semblance of filling allocations with junk to myrealloc().
........
r4836 | jmb | 2008-07-31 13:06:07 +0100 (Thu, 31 Jul 2008) | 2 lines
Lose debug again
........
r4837 | jmb | 2008-07-31 15:09:19 +0100 (Thu, 31 Jul 2008) | 2 lines
Lose obsolete testdata (this is now part of lpu)
........
svn path=/trunk/hubbub/; revision=4839
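Several of the merged commits above deal with NUL bytes, and r4726 mentions keeping the UTF-8 encoding of U+FFFD in a single global. A minimal sketch of that idea follows, assuming NULs are to be replaced by U+FFFD; the names u_fffd and replace_nuls() are hypothetical and are not taken from the hubbub source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* U+FFFD REPLACEMENT CHARACTER encoded as UTF-8 (three bytes). */
static const unsigned char u_fffd[3] = { 0xEF, 0xBF, 0xBD };

/* Hypothetical helper: copy len bytes from src to dst, writing
 * U+FFFD in place of every NUL byte. dst must have room for
 * 3 * len bytes. Returns the number of bytes written. */
static size_t replace_nuls(const unsigned char *src, size_t len,
		unsigned char *dst)
{
	size_t out = 0;

	assert(src != NULL && dst != NULL);

	for (size_t i = 0; i < len; i++) {
		if (src[i] == 0x00) {
			memcpy(dst + out, u_fffd, sizeof u_fffd);
			out += sizeof u_fffd;
		} else {
			dst[out++] = src[i];
		}
	}

	return out;
}
```

Keeping the three bytes in one shared constant means every state that substitutes the replacement character copies from the same place, which is presumably the "sanity" the commit message refers to.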
|
|
|
|
| |
svn path=/trunk/hubbub/; revision=4550
|
|
|
|
| |
svn path=/trunk/hubbub/; revision=4452
|
|
|
|
| |
svn path=/trunk/hubbub/; revision=4410
|
|
|
|
|
|
| |
warnings.
svn path=/trunk/hubbub/; revision=4408
|
|
|
|
|
|
| |
reached.
svn path=/trunk/hubbub/; revision=4406
|
|
|
|
| |
svn path=/trunk/hubbub/; revision=4405
|
|
|
|
| |
svn path=/trunk/hubbub/; revision=4404
|