In XML, one can express the lack of a value either by omitting an element or by using an empty element.

Sometimes it is desirable to represent an unshipped item, unknown information, or inapplicable information explicitly with an element, rather than by an absent element.

The trouble with using an empty element <foo></foo> is that the empty content no longer matches a specified type such as xs:positiveInteger. It is possible to form an xs:union of xs:positiveInteger and an xs:enumeration containing only the empty string, which allows either positive integers or the empty string. However, technically speaking, an empty string is different from pure emptiness. In terms of code, it's the difference between null and "", or in Scala, None and Some("").
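The None versus Some("") distinction can be sketched in a few lines of Scala. The EmptinessDemo object and its describe helper are hypothetical names used only for illustration; they are not part of scalaxb.

```scala
// Hypothetical helper: tells "no value at all" apart from "an empty string".
object EmptinessDemo {
  def describe(value: Option[String]): String = value match {
    case None     => "absent"        // pure emptiness, the counterpart of null
    case Some("") => "empty string"  // a value is present, and it happens to be ""
    case Some(s)  => s"value: $s"    // a non-empty value
  }
}
```

A union with the empty string can only ever produce the second case, which is why it cannot stand in for a truly missing value.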

XML Schema resolves this issue by introducing a special attribute called xsi:nil. By writing

<price xsi:nil="true" />

parsing with parser combinators

I've known the limitations of hand parsing for a while. Parsing that relies on token positions quickly gets out of hand when the grammar becomes more complex, with repetitions, options, and choices of sequences. At some point, I decided to use Scala's parser combinators to parse content types, but it's been a long road to implement it.
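To show why combinators cope with repetitions, options, and choices better than position-based parsing, here is a hand-rolled miniature. It is not Scala's parser combinator library and not scalaxb's code: the names Parser, token, seq, choice, opt, and rep, the token names, and the grammar are all made up for this sketch.

```scala
// A hand-rolled miniature for illustration only; every name here is hypothetical.
object MiniCombinators {
  // A parser consumes a prefix of the token list and returns a result plus the rest.
  type Parser[A] = List[String] => Option[(A, List[String])]

  // Matches exactly one token with the given name.
  def token(name: String): Parser[String] = {
    case head :: tail if head == name => Some((head, tail))
    case _                            => None
  }

  // Sequence: run p, then run q on whatever p left over.
  def seq[A, B](p: Parser[A], q: Parser[B]): Parser[(A, B)] = in =>
    p(in).flatMap { case (a, rest) =>
      q(rest).map { case (b, rest2) => ((a, b), rest2) }
    }

  // Choice: try p first and fall back to q.
  def choice[A](p: Parser[A], q: Parser[A]): Parser[A] = in =>
    p(in).orElse(q(in))

  // Option: zero or one occurrence; never fails.
  def opt[A](p: Parser[A]): Parser[Option[A]] = in =>
    p(in) match {
      case Some((a, rest)) => Some((Some(a), rest))
      case None            => Some((None, in))
    }

  // Repetition: zero or more occurrences; never fails.
  // The length guard stops the loop if p succeeds without consuming input.
  def rep[A](p: Parser[A]): Parser[List[A]] = in =>
    p(in) match {
      case Some((a, rest)) if rest.length < in.length =>
        rep(p)(rest).map { case (more, rest2) => (a :: more, rest2) }
      case _ => Some((Nil, in))
    }

  // Hypothetical grammar: (A | B)? C*
  val grammar: Parser[(Option[String], List[String])] =
    seq(opt(choice(token("A"), token("B"))), rep(token("C")))
}
```

Each combinator only knows about its own piece of the grammar, so nesting a choice inside an option inside a sequence needs no bookkeeping of token positions.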

First, let's look at a real-life example of such a complex structure:

<complexType name="SubjectType">
                <element ref="saml:BaseID"/>

mixed content

Added support for mixed content, in which text nodes are interleaved with subelements within an element. Similar to the way I handled <any>, text nodes are placed in DataRecord[String] objects along with other DataRecord objects under a member value called mixed.
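As a rough picture of that layout, the sketch below uses simplified stand-ins. The two-field DataRecord, the Element3 case class, the sample document in the comment, and the integer type chosen for Choice2 are all assumptions made for illustration; scalaxb's real DataRecord and generated classes differ.

```scala
// Simplified stand-ins for illustration; not scalaxb's actual classes.
object MixedDemo {
  // key is None for a text node and Some(element name) for a subelement (assumed convention).
  case class DataRecord[+A](key: Option[String], value: A)
  case class Element3(mixed: Seq[DataRecord[Any]])

  // Hypothetical instance for: <Element3>foo<Choice2>1</Choice2>bar</Element3>
  val example: Element3 = Element3(Seq(
    DataRecord(None, "foo"),
    DataRecord(Some("Choice2"), 1),
    DataRecord(None, "bar")
  ))

  // Collect only the text nodes, in document order.
  def texts(e: Element3): Seq[String] =
    e.mixed.collect { case DataRecord(None, s: String) => s }
}
```

Keeping text and subelements in one ordered sequence is what preserves their relative positions.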

Suppose we have a schema that looks like this:

<element name="Element3">
  <complexType mixed="true">
    <choice maxOccurs="unbounded">
      <element ref="ipo:Choice1"/>
      <element ref="ipo:Choice2"/>

<any> and <anyAttribute>

Last time I wrote:

Next goal is to handle <any> without ignoring it completely using this generic container. I can probably store the scala.xml.Elem object "as is" in a collection. The user of round trip probably would expect that I don't lose luggage during the flight.

That's pretty much what I did for the two usages of <any> handled by scalaxb. The first pattern is that it can appear as part of a sequence.

<element name="Choice1">
      <any namespace="##any" processContents="lax" minOccurs="0" maxOccurs="unbounded"/>

round trip

I started implementing round trip: XML document -> Scala object -> back to XML document. So far, it works for elements and attributes tracked in the case class. Suppose you have the following document:

val subject = <shipTo xmlns="http://www.example.com/IPO"
  <street>1537 Paper Street</street>

per-namespace package name

You can now generate classes under a different package for each namespace.

  -p:<namespaceURI>=<package> | --package:<namespaceURI>=<package>
        specifies the target package for <namespaceURI>

For example, XML Signature and SAML 2.0 assertions should probably go under different package names:

attributes and namespaces in xml

I've heard on occasion that the way attributes work in XML is a mess. It is. The fault lies not with attributes per se; it's the way XML namespaces are implemented that's confusing. The spec is called Namespaces in XML 1.0. Try to keep a straight face.

  1. Default namespace declarations do not apply directly to attribute names.
  2. The interpretation of unprefixed attributes is determined by the element on which they appear.
  3. The namespace name for an unprefixed attribute name always has no value.
  4. In all cases, the local name is local part.
  5. No tag may contain two attributes which have identical names.
  6. No tag may contain two attributes which have qualified names with the same local part and with prefixes which have been bound to namespace names that are identical.
  7. All element and attribute names contain either zero or one colon.
  8. No attributes with a declared type of ID, IDREF(S), ENTITY(IES), or NOTATION contain any colons.
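The first and third points can be checked with the namespace-aware DOM parser that ships with the JDK. This is a minimal sketch: the document, the urn:example namespace names, and the AttrNamespaceDemo object are invented for illustration.

```scala
import java.io.ByteArrayInputStream
import javax.xml.parsers.DocumentBuilderFactory

object AttrNamespaceDemo {
  // Made-up document: a default namespace, one prefixed namespace,
  // one unprefixed attribute and one prefixed attribute.
  private val xml =
    """<doc xmlns="urn:example:default" xmlns:p="urn:example:p" plain="1" p:qualified="2"/>"""

  private val factory = DocumentBuilderFactory.newInstance()
  factory.setNamespaceAware(true) // without this the parser ignores namespaces

  private val root = factory.newDocumentBuilder()
    .parse(new ByteArrayInputStream(xml.getBytes("UTF-8")))
    .getDocumentElement

  // The element picks up the default namespace.
  val elementNs: String = root.getNamespaceURI
  // The unprefixed attribute does not: its namespace is null.
  val plainNs: String = root.getAttributeNode("plain").getNamespaceURI
  // Only a prefixed attribute carries a namespace.
  val qualifiedNs: String =
    root.getAttributeNodeNS("urn:example:p", "qualified").getNamespaceURI
}
```

Here elementNs is "urn:example:default" while plainNs is null, even though both appear on the same tag.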

<import>, <any>, and simple types

Finally implemented a rough cut of <import> support. The process is looking more and more like a language compiler. When you are processing multiple XSD schemas, the actual parsing step doesn't really care about other files, but the type checking/object binding phase needs to be aware of the other schemas.

In the schema, all you have to do is write

<import namespace="http://www.example.com/IPO"/>

and the types etc. from the given namespace are imported to the current schema. So, this works more like Java's import than C's #include, which is limited to inclusion of a physical file.

xml namespace

I've been working on Issue 1 submitted by Tsuresh Kumar for a while now. The idea is to compile the schema for SAML Metadata, which is saml-schema-metadata-2.0.xsd. I need to implement a few things so scalaxb can handle the schema.

scala, maven, and netbeans

I've been coding Scala using TextMate with ant or simple-build-tool, but I'd like to give NetBeans a try again. Coding without an IDE wasn't that bad, since the compiler and tests can catch undefined symbols and typos while making code changes; however, that's not to say it was perfect. I had to go back and forth between the scaladoc and TextMate in situations where I could have just let IDE autocompletion look things up. Also, navigating between error messages and the code has been slow, especially since I moved to sbt, which is not integrated with TextMate.
