summaryrefslogtreecommitdiff
path: root/README
blob: cd4a57cb69e892d8c566823debd186a072bbcd95 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
NetSurf public suffix list handling
===================================

library to generate static code representation of the Public suffix list

Public suffix list
------------------

The public suffix list is a database of top level domain names [1].
The database allows an application to determine if if a domain
name requires an additional label to be valid.

Uses
----

The principle use in a web browser is to restrict supercookies being
set [2] although it can also serve secondary purposes in the UI such
as domain highlighting.

Implementation
--------------

This implementation uses a directly mapped tree to allow for a
relatively compact representation (~70k compiled vs ~185k source) to
determine if a given hostname is a public suffix. There is no
allocation and the data tables all reside in read only data section
and cannot be updated after compilation.

There is a single API nspsl_getpublicsuffix() which will return the
public suffix of a hostname if it was valid or NULL if not. The
hostname passed must be normalised (i.e. lower case and domain labels
in punycode)

The data tables are staticly generated from a given input with the
genpubsuffix.pl perl program. This program takes a downloaded suffix
database and generates a file the main library lookup code directly
includes.

The generated psl.inc file and the public suffix list used to generate
it are included in the source release as a conveniance but can be
regenerated simply by deleting the src/public_suffix_list.dat file and
making any target.

Other libraries
---------------

This library was created for a very limited purpose in NetSurf and is
therefore possibly not useful as a generic solution. Other C libraries
might be more apropriate:

- Registered domain libs

  Small C library which builds a pre-processed database into dynamic
  memory. Similar tree based lookup to nspsl. Not been updated for a
  while

  https://github.com/usrflo/registered-domain-libs/

- libpsl

  has a very compact data representation (~35k vs nspsl ~70k) using
  DAFSA taken from the chromium project. Can load (preprocessed)
  database files at runtime as well as using a compile time builtin
  set.

  Can be compiled with runtime IDNA support. Directly uses the PSL git
  repo as a submodule so PSL database always up to date.

  If the 0.14 release was more readily available in distribution
  packages NetSurf would use this in preference to nspsl

  https://github.com/rockdaboot/libpsl

[1] https://publicsuffix.org
[2] https://en.wikipedia.org/wiki/HTTP_cookie#Supercookie