From 4570c647a9fc6c398a22bc6acc169628806ea4c1 Mon Sep 17 00:00:00 2001
From: Wilson Lin
Date: Mon, 30 Dec 2019 16:52:59 +1100
Subject: [PATCH] Fix invalid entity decoding

---
 README.md              |  2 ++
 fuzz/in/complex.html   |  4 ++--
 src/err.rs             |  1 +
 src/main.rs            |  3 +++
 src/unit/attr/value.rs |  8 +-------
 src/unit/content.rs    | 27 ++++++++++++++-------------
 src/unit/entity.rs     | 38 ++++++++++++++++++++++++++++++++++----
 7 files changed, 57 insertions(+), 26 deletions(-)

diff --git a/README.md b/README.md
index 4d4ba70..e1244c7 100644
--- a/README.md
+++ b/README.md
@@ -271,6 +271,8 @@ If a named entity is an invalid reference as per the [specification](https://htm
 
 Numeric character references that do not reference a valid [Unicode Scalar Value](https://www.unicode.org/glossary/#unicode_scalar_value) are considered malformed.
 
+No ampersand can immediately follow a malformed entity e.g. `&am&` or `&`.
+
 ### Attributes
 
 Backticks (`` ` ``) are not valid quote marks and are not interpreted as such.
diff --git a/fuzz/in/complex.html b/fuzz/in/complex.html
index 463ec82..5ca43dc 100644
--- a/fuzz/in/complex.html
+++ b/fuzz/in/complex.html
@@ -9,9 +9,9 @@ there b " data="a" class=" ">
- a
+ a&
ÆA
-

Hello

+

Hello &

&;

 

&amp