author | Martin Fischer <martin@push-f.com> | 2023-08-12 11:06:02 +0200 |
---|---|---|
committer | Martin Fischer <martin@push-f.com> | 2023-08-19 06:41:55 +0200 |
commit | 9f1019afa7a8e9102d67356d85bd632044eb2d0c (patch) | |
tree | 4c6664aad5a11a942d6684a62e507de28193f5bb /examples/tokenize.rs | |
parent | c3d60e88efa32329614178dfc9455ef33ea0a88d (diff) |
break!: merge Tokenizer::new_with_emitter into Tokenizer::new
The Tokenizer does not perform any state switching, since
proper state switching requires a feedback loop between
tokenization and DOM tree building. Using the Tokenizer
directly therefore is a bit of a pitfall, since you might
not expect it to e.g. tokenize `<script><b>` as:

    StartTag(StartTag { name: "script", .. })
    StartTag(StartTag { name: "b", .. })

Since we don't want to make walking into pitfalls
particularly easy, this commit changes the Tokenizer::new
method so that you have to specify the Emitter.

Since this makes new_with_emitter redundant it is removed.
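
To make the pitfall concrete, here is a minimal toy sketch of what a feedback loop changes. It is not the html5tokenizer API: `ToyTokenizer`, `State`, `Token` and `tokenize` are hypothetical names invented for illustration, and the "tree builder" is reduced to a single check for a `script` start tag.

```rust
/// Toy token type (hypothetical, not the html5tokenizer `Token`).
#[derive(Debug, PartialEq)]
enum Token {
    StartTag(String),
    EndTag(String),
    Text(String),
}

/// The two states this sketch distinguishes.
enum State {
    Data,
    ScriptData,
}

/// Toy tokenizer that only understands `<name>`, `</name>` and text.
struct ToyTokenizer<'a> {
    rest: &'a str,
    state: State,
}

impl<'a> ToyTokenizer<'a> {
    fn new(input: &'a str) -> Self {
        Self { rest: input, state: State::Data }
    }

    fn next_token(&mut self) -> Option<Token> {
        if self.rest.is_empty() {
            return None;
        }
        if let State::ScriptData = self.state {
            // In script data, everything up to `</script>` is plain text.
            let end = self.rest.find("</script>").unwrap_or(self.rest.len());
            if end > 0 {
                let (text, rest) = self.rest.split_at(end);
                self.rest = rest;
                return Some(Token::Text(text.to_string()));
            }
            // We are at `</script>`: return to the data state and
            // fall through so the end tag is tokenized below.
            self.state = State::Data;
        }
        if let Some(after_lt) = self.rest.strip_prefix('<') {
            if let Some(close) = after_lt.find('>') {
                let name = &after_lt[..close];
                self.rest = &after_lt[close + 1..];
                return Some(match name.strip_prefix('/') {
                    Some(n) => Token::EndTag(n.to_string()),
                    None => Token::StartTag(name.to_string()),
                });
            }
        }
        // Plain text: consume at least one char, then stop at the next `<`.
        let end = self
            .rest
            .char_indices()
            .skip(1)
            .find(|&(_, c)| c == '<')
            .map_or(self.rest.len(), |(i, _)| i);
        let (text, rest) = self.rest.split_at(end);
        self.rest = rest;
        Some(Token::Text(text.to_string()))
    }
}

/// Drives the tokenizer. With `feedback` set, the loop stands in for a
/// tree builder that switches the tokenizer state after `<script>`.
fn tokenize(input: &str, feedback: bool) -> Vec<Token> {
    let mut tokenizer = ToyTokenizer::new(input);
    let mut tokens = Vec::new();
    while let Some(token) = tokenizer.next_token() {
        if feedback && token == Token::StartTag("script".to_string()) {
            tokenizer.state = State::ScriptData;
        }
        tokens.push(token);
    }
    tokens
}

fn main() {
    println!("{:?}", tokenize("<script><b>", false));
    println!("{:?}", tokenize("<script><b>", true));
}
```

Without feedback the toy prints `[StartTag("script"), StartTag("b")]`, mirroring the output above. With feedback it prints `[StartTag("script"), Text("<b>")]`, which is what someone tokenizing a script element would likely expect.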
Diffstat (limited to 'examples/tokenize.rs')
-rw-r--r-- | examples/tokenize.rs | 11 |
1 files changed, 9 insertions, 2 deletions
diff --git a/examples/tokenize.rs b/examples/tokenize.rs
index ceb5751..5776362 100644
--- a/examples/tokenize.rs
+++ b/examples/tokenize.rs
@@ -1,9 +1,16 @@
 //! Let's you easily try out the tokenizer with e.g.
 //! printf '<h1>Hello world!</h1>' | cargo run --example=tokenize
-use html5tokenizer::{BufReadReader, Tokenizer};
+
+use html5tokenizer::{DefaultEmitter, Tokenizer};
+use std::io::BufReader;
 
 fn main() {
-    for token in Tokenizer::new(BufReadReader::new(std::io::stdin().lock())).flatten() {
+    for token in Tokenizer::new(
+        BufReader::new(std::io::stdin().lock()),
+        DefaultEmitter::<_, ()>::default(),
+    )
+    .flatten()
+    {
         println!("{:?}", token);
     }
 }