XML encoding problem.
Mon, 2009-05-25, 07:29
The following code:
val node =
val str = scala.xml.Utility.toXML(node)
scala.xml.XML.loadString(str)
Causes an exception:
[Fatal Error] :1:7: An invalid XML character (Unicode: 0x1) was found in the element content of the document.
org.xml.sax.SAXParseException: An invalid XML character (Unicode: 0x1) was found in the element content of the document.
I guess my first question: is there a better way to encode the XML
into a string?
Should the string encoding in Utility be fixed to use a char ref? The
set of valid characters seems fairly complicated from what I can tell,
but perhaps some simple test is available.
Thanks,
David
Mon, 2009-05-25, 08:17
#2
Re: XML encoding problem.
On Sun, May 24, 2009 at 11:38:01PM -0700, Alex Cruise wrote:
>> Should the string encoding in Utility be fixed to use a char ref? The
>> set of valid characters seems fairly complicated from what I can tell,
>> but perhaps some simple test is available.
> XML 1.0 doesn't permit most control characters at all, either as literals
> or entities: http://www.w3.org/TR/xml/#charsets
>
> XML 1.1 is less restrictive in this area but I wouldn't bet on your
> document living for long in the wild. :)
Thanks,
I've changed my XML format to allow these blocks to be encoded in base64.
I scan each string for control characters and, if it contains any,
encode the whole string.
David
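
A minimal sketch of the workaround described above, for readers who want something concrete. It is not David's actual code: the object and method names are hypothetical, "control character" is assumed to mean anything below 0x20 other than tab, LF and CR, and java.util.Base64 (Java 8 or later) is assumed to be available. How the base64 flag is recorded in the XML (an attribute, a different element name) is left open.

```scala
import java.nio.charset.StandardCharsets
import java.util.Base64

// Hypothetical helper, not part of scala.xml.
object ControlCharEncoding {
  // Assumption: "control character" means a C0 control other than
  // tab, line feed and carriage return, which XML 1.0 does allow.
  def hasControlChars(s: String): Boolean =
    s.exists(c => c < 0x20 && c != '\t' && c != '\n' && c != '\r')

  // Returns (isBase64, payload). The whole string is base64-encoded
  // as UTF-8 bytes if it contains any control character.
  def encode(s: String): (Boolean, String) =
    if (hasControlChars(s))
      (true, Base64.getEncoder.encodeToString(s.getBytes(StandardCharsets.UTF_8)))
    else
      (false, s)

  // Inverse of encode; the caller must have stored the flag somewhere.
  def decode(isBase64: Boolean, payload: String): String =
    if (isBase64) new String(Base64.getDecoder.decode(payload), StandardCharsets.UTF_8)
    else payload
}
```

The payload produced for a flagged string contains only base64 characters, so it can be placed in element content without tripping the parser.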
David Brown wrote:
> val node =
> val str = scala.xml.Utility.toXML(node)
> scala.xml.XML.loadString(str)
>
> Causes an exception:
>
> [Fatal Error] :1:7: An invalid XML character (Unicode: 0x1) was
> found in the element content of the document.
> org.xml.sax.SAXParseException: An invalid XML character (Unicode:
> 0x1) was found in the element content of the document.
>
> Should the string encoding in Utility be fixed to use a char ref? The
> set of valid characters seems fairly complicated from what I can tell,
> but perhaps some simple test is available.
XML 1.0 doesn't permit most control characters at all, either as
literals or entities: http://www.w3.org/TR/xml/#charsets
XML 1.1 is less restrictive in this area but I wouldn't bet on your
document living for long in the wild. :)
-0xe1a
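
On the "simple test" asked about earlier in the thread: the Char production in the section of the XML 1.0 spec linked above is short enough to check directly. The sketch below is only an illustration of that production; the object and method names are hypothetical and nothing like this is claimed to exist in scala.xml.Utility.

```scala
// Hypothetical helper illustrating the XML 1.0 Char production:
//   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
object XmlChars {
  // Tests a single Unicode code point.
  def isValidXml10(cp: Int): Boolean =
    cp == 0x9 || cp == 0xA || cp == 0xD ||
    (cp >= 0x20 && cp <= 0xD7FF) ||
    (cp >= 0xE000 && cp <= 0xFFFD) ||
    (cp >= 0x10000 && cp <= 0x10FFFF)

  // Index of the first offending char in the string, if any.
  // Walks by code point so that valid surrogate pairs are accepted
  // and unpaired surrogates are rejected.
  def firstInvalid(s: String): Option[Int] = {
    var i = 0
    while (i < s.length) {
      val cp = s.codePointAt(i)
      if (!isValidXml10(cp)) return Some(i)
      i += Character.charCount(cp)
    }
    None
  }
}
```

A caller could run firstInvalid on text content before serializing and decide what to do with a hit (reject it, strip it, or switch to an encoding such as base64).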