Xsd type provider and nillable elements
XSD is dead, long live XSD!
My little contribution to the F# OSS ecosystem is schema support for the XML Type Provider. It's been recently merged into F# Data (and will ship soon in the upcoming version 3.0) after being available for a while as a standalone project.
It "comes with comprehensible documentation" but I'm going to use this blog to post a few tips covering marginal aspects.
Before introducing the type provider (and today's tip about nillable elements) let me say a few words about schemas.
Validation
Having a schema lets us validate documents against it. We will use the following handy snippet:
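The listing itself did not survive on this page, so here is a minimal sketch of what such a validation helper can look like, built only on System.Xml. The names parseSchema and validate are my own choice for illustration, not necessarily those of the original snippet.

```fsharp
open System.IO
open System.Xml
open System.Xml.Schema

/// Builds a compiled schema set from XSD text (hypothetical helper name).
let parseSchema (xsdText: string) =
    let schemaSet = XmlSchemaSet()
    use reader = XmlReader.Create(new StringReader(xsdText))
    // A null target namespace means "use the one declared in the schema".
    schemaSet.Add(null, reader) |> ignore
    schemaSet.Compile()
    schemaSet

/// Returns Ok () for a valid document, Error with the validator message otherwise.
let validate (schemaSet: XmlSchemaSet) (xml: string) =
    let settings =
        XmlReaderSettings(ValidationType = ValidationType.Schema, Schemas = schemaSet)
    use reader = XmlReader.Create(new StringReader(xml), settings)
    try
        // Reading to the end forces validation of the whole document.
        while reader.Read() do ()
        Ok ()
    with :? XmlSchemaException as e ->
        Error e.Message
```

Since no validation event handler is registered, the first validation error surfaces as an exception, which the sketch turns into an Error value.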
Given a schema (AuthorXsd) and some documents (xml1 and xml2):
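The original definitions are missing here. The sketch below is a plausible stand-in that matches the rest of the post: an author with a mandatory name and a nillable born element. The element names, types and sample values are assumptions.

```fsharp
// Hypothetical schema: a mandatory name and a nillable year of birth.
[<Literal>]
let AuthorXsd = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="author">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string" />
        <xs:element name="born" type="xs:int" nillable="true" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

// Hypothetical document with both elements.
let xml1 = """
<author>
  <name>Karl Popper</name>
  <born>1902</born>
</author>"""

// Hypothetical document without the name element.
let xml2 = """
<author>
  <born>1902</born>
</author>"""
```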
we can check their validity and see that xml2 lacks the name element.
Type Provider
The XML Type Provider can be used with the Schema parameter, generating a type with Name and Born properties.
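As the original listing is missing, here is a small sketch of the provider in action. It assumes the FSharp.Data package (3.0 or later) and a hypothetical schema inlined in the type definition so that the snippet stands alone.

```fsharp
#r "nuget: FSharp.Data"
open FSharp.Data

// Hypothetical schema, passed through the Schema parameter.
type AuthorXsd = XmlProvider<Schema = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="author">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string" />
        <xs:element name="born" type="xs:int" nillable="true" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>""">

let author = AuthorXsd.Parse "<author><name>Karl Popper</name><born>1902</born></author>"
printfn "%s" author.Name

// Parsing does not validate: a document without name is accepted here...
let broken = AuthorXsd.Parse "<author><born>1902</born></author>"

// ...and the problem only shows up when the missing element is accessed.
let nameOrError =
    try Ok broken.Name
    with e -> Error e.Message
```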
Beware that no validation is performed; in fact, xml2 can be parsed too, although accessing the Name property then throws an exception.
If you need to validate your input you have to do it yourself, using code like the above validation snippet. It is useful anyway: whenever the type provider behaves unexpectedly, first check whether the input is valid.
You may be surprised, for example, that a document without the born element is invalid.
Nillable Elements
The validator complains that the born element is missing, although it was declared nillable.
Declaring a nillable element is a weird way to specify that its value is not mandatory. A much simpler and more common alternative is to rely on minOccurs and maxOccurs to constrain the allowed number of elements.
But in case you stumble across a schema with nillable elements,
you need to be aware that valid documents look like this:
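The sample document is missing from this page. A hypothetical one, assuming an author element with name and a nillable born, could be written like this (the xsi prefix is bound to the standard XML Schema instance namespace):

```fsharp
// Hypothetical document: born is present but explicitly marked as nil.
let xmlNil = """
<author>
  <name>Karl Popper</name>
  <born xsi:nil="true" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" />
</author>"""
```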
You may legitimately wonder what the heck this strange nil attribute is. It belongs to a special W3C namespace and its purpose is to explicitly signal the absence of a value.
The element tag must always be present for a nillable element!
But the element is allowed to have content only when the nil attribute is false (or is simply omitted, as in xml1).
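The rules above can be checked with the .NET validator alone. The following is a sketch under assumptions: the schema, the isValid helper and the sample documents are all made up for illustration.

```fsharp
open System.IO
open System.Xml
open System.Xml.Schema

// Hypothetical schema with a nillable born element.
let xsd = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="author">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string" />
        <xs:element name="born" type="xs:int" nillable="true" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

let schemas = XmlSchemaSet()
schemas.Add(null, XmlReader.Create(new StringReader(xsd))) |> ignore

/// True when the document validates against the schema above.
let isValid (xml: string) =
    let settings = XmlReaderSettings(ValidationType = ValidationType.Schema, Schemas = schemas)
    use reader = XmlReader.Create(new StringReader(xml), settings)
    try
        while reader.Read() do ()
        true
    with :? XmlSchemaException -> false

let xsi = "xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\""
// Wraps a born fragment (possibly empty) in an otherwise complete author document.
let author born = sprintf "<author><name>Karl Popper</name>%s</author>" born

let missingBorn = isValid (author "")                                                     // false: the tag is required
let nilEmpty    = isValid (author (sprintf "<born xsi:nil=\"true\" %s />" xsi))           // true
let nilContent  = isValid (author (sprintf "<born xsi:nil=\"true\" %s>1902</born>" xsi))  // false: nil forbids content
let notNil      = isValid (author (sprintf "<born xsi:nil=\"false\" %s>1902</born>" xsi)) // true
```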
For nillable elements the XML Type Provider creates two optional properties (Nil and Value).
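To see the two properties side by side, here is a sketch assuming the FSharp.Data package (3.0 or later) and a hypothetical schema with a nillable born element of type xs:int:

```fsharp
#r "nuget: FSharp.Data"
open FSharp.Data

// Hypothetical schema, inlined so that the snippet stands alone.
type AuthorXsd = XmlProvider<Schema = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="author">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string" />
        <xs:element name="born" type="xs:int" nillable="true" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>""">

// born carries a value and no nil attribute.
let withValue =
    AuthorXsd.Parse "<author><name>Karl Popper</name><born>1902</born></author>"

// born is empty and explicitly marked as nil.
let withNil =
    AuthorXsd.Parse """<author><name>Karl Popper</name><born xsi:nil="true" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" /></author>"""

printfn "%A" (withValue.Born.Nil, withValue.Born.Value)
printfn "%A" (withNil.Born.Nil, withNil.Born.Value)
```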
For valid elements, if Nil = Some true, then Value = None.
The converse does not hold in general: for certain data types like xs:string that admit empty content, it is possible to have Value = None even if Nil = Some false or Nil = None. In fact, the nil attribute helps disambiguate subtleties about the lack of a value: the value was not entered vs the value NULL was entered (can you feel the smell of the billion dollar mistake?).
In practice, when reading XML, you mostly rely on Value and ignore Nil.
When you use the type provider to write XML, on the other hand, you need to pass appropriate values in order to obtain a valid document: either Nil = Some true together with Value = None, or a Value with actual content.