diff --git a/README.md b/README.md
index 8b3dafc..8b28421 100644
--- a/README.md
+++ b/README.md
@@ -380,7 +380,7 @@ Spaces are removed between attributes if possible.
 Entities are decoded if valid (see relevant parsing section) and their decoded characters as UTF-8 is shorter or equal in length.
-Numeric entities that do not refer to a valid Unicode Scalar Value are decoded to U+FFFD REPLACEMENT CHARACTER.
+Numeric entities that do not refer to a valid [Unicode Scalar Value](https://www.unicode.org/glossary/#unicode_scalar_value) are decoded to U+FFFD REPLACEMENT CHARACTER.
 If an entity is unintentionally formed after decoding, the leading ampersand is encoded, e.g. `&` becomes `&`. This is done as `&` is equal to or shorter than all other entity representations of characters part of an entity (`[a-zA-Z0-9;]`), and there is no other conflicting entity name that starts with `amp`.
@@ -402,7 +402,7 @@ hyperbuild simply does HTML minification, and almost does no syntax checking or
 For example, this means that it's not an error to have self-closing tags, declare multiple `
` elements, use incorrect attribute names and values, or write something like `