Any good Scala/Java solutions for sanitizing HTML?
I'd like to use the Scala XPath features, and it's quite possible some
of the HTML I'll be dealing with won't be properly formatted. Can
someone recommend a good sanitizer?
Thanks,
Ken

Re: Any good Scala/Java solutions for sanitizing HTML?
Kenneth McDonald wrote:
> I'd like to use the Scala XPath features, and it's quite possible some
> of the HTML I'll be dealing with won't be properly formatted. Can
> someone recommend a good sanitizer?
http://www.nabble.com/How-to-use-TagSoup-with-Scala-XML--td17575225.html
- Florian
Re: Any good Scala/Java solutions for sanitizing HTML?
I was looking into this recently, and I found an article that was
helpful. The comments are worth reading too.
http://www.benmccann.com/dev-blog/java-html-parsing-library-comparison/
Cheers
Rich
On Wed, Jan 21, 2009 at 7:22 PM, Florian Hars <hars [at] bik-gmbh [dot] de> wrote:
--
http://www.richdougherty.com/
Re: Any good Scala/Java solutions for sanitizing HTML?
Rich Dougherty wrote:
> I was looking into this recently, and I found an article that was
> helpful. The comments are worth reading too.
>
> http://www.benmccann.com/dev-blog/java-html-parsing-library-comparison/
Most of those are DOM parsers, while Scala's XML library expects SAX. I
put up code for the two that are usable without a DOM2SAX converter here:
http://www.hars.de/2009/01/html-as-xml-in-scala.html
- Florian
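
A minimal sketch of what "Scala expects SAX" means in practice, assuming
the scala-xml library is on the classpath. scala.xml builds its tree from
SAX events, so any javax.xml.parsers.SAXParser can be plugged in through
XML.withSAXParser. The names HtmlAsXml, load and strictParser are made up
for this illustration and do not come from the linked posts. The JDK
parser used here is only a stand-in and rejects malformed input; for real
tag soup you would pass a lenient SAX parser instead (TagSoup is assumed
to provide one via org.ccil.cowan.tagsoup.jaxp.SAXFactoryImpl, which this
sketch does not depend on).

```scala
import javax.xml.parsers.{SAXParser, SAXParserFactory}
import scala.xml.{Elem, XML}

object HtmlAsXml {
  // Build a scala.xml tree from markup, using whichever SAX parser is supplied.
  // Assumption: a lenient HTML parser exposed as a SAXParser can be passed here
  // in place of the strict JDK one.
  def load(markup: String, parser: SAXParser): Elem =
    XML.withSAXParser(parser).loadString(markup)

  // Stand-in parser from the JDK; it only accepts well-formed XML/XHTML.
  def strictParser(): SAXParser =
    SAXParserFactory.newInstance().newSAXParser()

  def main(args: Array[String]): Unit = {
    val page =
      "<html><body><a href=\"http://www.scala-lang.org/\">Scala</a></body></html>"
    val doc = load(page, strictParser())
    // XPath-like projection: collect the href attribute of every <a> element.
    val links = (doc \\ "a").map(a => (a \ "@href").text)
    println(links.mkString("\n"))
  }
}
```

Once the document is loaded, the usual \ and \\ projections work on it
regardless of which parser produced the SAX events, so swapping in a
lenient parser only changes the argument passed to load.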