Minor reformatting; minor README wording improvements

2020-09-20 20:50:22 +10:00 · 8ff5bc3768
parent 9968dd4649
commit 8ff5bc3768
3 changed files with 14 additions and 8 deletions
--- a/README.md
+++ b/README.md
@@ -1,6 +1,7 @@
 # minify-html

-An HTML minifier meticulously optimised for both speed and effectiveness, available for Rust, Node.js, Python, Java, and Ruby.
+An HTML minifier meticulously optimised for both speed and effectiveness written in Rust.
+Comes with native bindings to Node.js, Python, Java, and Ruby.

 - Advanced minification strategy beats other minifiers with only one pass.
 - Uses zero memory allocations, SIMD searching, direct tries, and lookup tables.
@@ -45,7 +46,7 @@ minify-html --src /path/to/src.html --out /path/to/output.min.html
 minify-html = { version = "0.3.8", features = ["js-esbuild"] }
 ```

-Building with the `js-esbuild` feature requires the Go compiler to be installed as well, to build the [JS minifier](https://github.com/evanw/esbuild).
+Building with the `js-esbuild` feature requires the Go compiler to be installed as well, to build the [JS minifier](https://github.com/wilsonzlin/esbuild-rs).

 If the `js-esbuild` feature is not enabled, `cfg.minify_js` will have no effect.

@@ -415,9 +416,7 @@ Numeric entities that do not refer to a valid [Unicode Scalar Value](https://www

 If an entity is unintentionally formed after decoding, the leading ampersand is encoded, e.g. `&&#97;&#109;&#112;;` becomes `&ampamp;`. This is done as `&amp` is equal to or shorter than all other entity representations of characters part of an entity (`[&#a-zA-Z0-9;]`), and there is no other conflicting entity name that starts with `amp`.

-It's possible to get an unintentional entity after removing comments, e.g. `&am<!-- -->p`.
-
-Left chevrons after any decoding in text are encoded to `&LT` if possible or `&LT;` otherwise.
+Note that it's possible to get an unintentional entity after removing comments, e.g. `&am<!-- -->p`; minify-html will **not** encode the leading ampersand.

 ### Comments

--- a/src/unit/content.rs
+++ b/src/unit/content.rs
@@ -157,8 +157,8 @@ pub fn process_content(proc: &mut Processor, cfg: &Cfg, ns: Namespace, parent: O
                if proc.last_is(b'<') && (
                    TAG_NAME_CHAR[c] || c == b'?' || c == b'!' || c == b'/'
                ) {
-                    // If this is a tag name char and we just wrote `<` (decoded or original),
-                    // we need to encode the `<`.
+                    // We need to encode the `<` that we just wrote as otherwise this char will
+                    // cause it to be interpreted as something else (e.g. opening tag).
                    // NOTE: This conditional should mean that we never have to worry about a
                    // semicolon after encoded `<` becoming `&LT;` and part of the entity, as the
                    // only time `&LT` appears is when we write it here; every other time we always
--- a/src/unit/tag.rs
+++ b/src/unit/tag.rs
@@ -94,7 +94,14 @@ impl MaybeClosingTag {
 }

 // TODO Comment param `prev_sibling_closing_tag`.
-pub fn process_tag(proc: &mut Processor, cfg: &Cfg, ns: Namespace, parent: Option<ProcessorRange>, mut prev_sibling_closing_tag: MaybeClosingTag, source_tag_name: ProcessorRange) -> ProcessingResult<MaybeClosingTag> {
+pub fn process_tag(
+    proc: &mut Processor,
+    cfg: &Cfg,
+    ns: Namespace,
+    parent: Option<ProcessorRange>,
+    mut prev_sibling_closing_tag: MaybeClosingTag,
+    source_tag_name: ProcessorRange,
+) -> ProcessingResult<MaybeClosingTag> {
    if prev_sibling_closing_tag.exists_and(|prev_tag| !can_omit_as_before(proc, Some(prev_tag), source_tag_name)) {
        prev_sibling_closing_tag.write(proc);
    };
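
Note on the `src/unit/content.rs` comment change: the condition it documents can be illustrated with a small standalone sketch. This is not the project's code. `is_tag_name_char` is a hypothetical stand-in for the `TAG_NAME_CHAR` lookup table (its exact contents are assumed here), and rewriting the output buffer with `pop` is an assumption about how the previously written `<` gets replaced.

```rust
/// Hypothetical approximation of the `TAG_NAME_CHAR` table from the diff;
/// the real table's contents are not shown in this commit.
fn is_tag_name_char(c: u8) -> bool {
    c.is_ascii_alphanumeric() || c == b':' || c == b'-'
}

/// Mirrors the conditional in the diff: the `<` just written must be encoded
/// if the next char would turn it into a tag, closing tag, bang, or
/// processing instruction.
fn must_encode_prev_lt(last_written: Option<u8>, c: u8) -> bool {
    last_written == Some(b'<') && (is_tag_name_char(c) || matches!(c, b'?' | b'!' | b'/'))
}

/// Assumed driver loop: copies text bytes to the output, replacing a
/// previously written `<` with `&LT` when the next byte requires it.
fn encode_text(src: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(src.len());
    for &c in src {
        if must_encode_prev_lt(out.last().copied(), c) {
            out.pop(); // drop the bare `<` we wrote on the previous iteration
            out.extend_from_slice(b"&LT");
        }
        out.push(c);
    }
    out
}
```

Under these assumptions, `encode_text(b"a<b")` yields `a&LTb`, while `encode_text(b"1 < 2")` is left unchanged because a space after `<` cannot start a tag.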