AntaniXml


Tutorial

The public API comprises factory methods for loading a schema. Schema definitions are usually given as xsd files, so you specify their URI. Methods accepting the xsd as plain text are also provided; they are handy for experimenting with small xsd snippets:

    open System.Xml
    open AntaniXml

    // A small inline schema declaring two global elements.
    let xsdText = """
        <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
            elementFormDefault="qualified" attributeFormDefault="unqualified">
            <xs:element name="e1" type="xs:int" />
            <xs:element name="e2" type="xs:string" />
        </xs:schema>"""

    // Get a generator for the global element e1 and print five random samples.
    Schema.CreateFromText(xsdText)
          .Generator(XmlQualifiedName "e1")
          .GenerateInfinite()
          |> Seq.take 5
          |> Seq.iter (printfn "%A")

Given the name of a global element, we can get a generator for it. The same example in C# is:

using System;
using System.Linq;
using System.Xml;
using AntaniXml;

// A small inline schema declaring two global elements.
var xsdText = @"
    <xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'
        elementFormDefault='qualified' attributeFormDefault='unqualified'>
        <xs:element name='e1' type='xs:int' />
        <xs:element name='e2' type='xs:string' />
    </xs:schema>";

// Get a generator for the global element e1 and print five random samples.
Schema.CreateFromText(xsdText)
    .Generator(new XmlQualifiedName("e1"))
    .GenerateInfinite()
    .Take(5)
    .ToList()
    .ForEach(Console.WriteLine);

Either version may generate something like this:

<e1>0</e1>
<e1>  -3</e1>
<e1>0</e1>
<e1>4</e1>
<e1>-1</e1>

Property based testing

For property based testing we can get instances of the Arbitrary type defined by FsCheck:

    let arb = Schema.CreateFromUri("po.xsd")
                    .Arbitrary(XmlQualifiedName "purchaseOrder")
    

Again, the C# version is almost the same:

var arb = Schema.CreateFromUri("po.xsd")
    .Arbitrary(new XmlQualifiedName("purchaseOrder"));

The idea of property based testing is to express a specification with boolean functions (properties). Instead of trying to prove that a property holds, we simply check that the function returns true for a large number of randomly generated input values.

In our context the first, obligatory example is validity. This is of course a property we always expect to hold, and hopefully AntaniXml produces valid elements. It is still worth checking, because for some schemas this may not be the case, and you may discover the need to customize generators in order to obtain valid elements.

The Check.Quick function generates a certain number of values using the given Arbitrary instance and, for each one, checks if the property holds; in this case it checks if the generated element is valid:

    open FsCheck
    
    let schema = Schema.CreateFromUri "foo.xsd"
    let arbFoo = schema.Arbitrary(XmlQualifiedName "foo")
    Prop.forAll arbFoo schema.IsValid
    |> Check.Quick 

The same in C# is:

var schema = Schema.CreateFromUri("foo.xsd");
var arbFoo = schema.Arbitrary(new XmlQualifiedName("foo"));
Prop.ForAll(arbFoo, x => schema.IsValid(x))
    .QuickCheck();

A message like the following should be printed to standard output:

Ok, passed 100 tests.

If a counter-example is found, it is printed instead. FsCheck has a concept of shrinking, aimed at minimizing counter-examples. At the moment AntaniXml lacks proper support for shrinking, so the counter-examples reported for failing tests may be bigger than necessary.

When checking properties in a unit test, the function Check.QuickThrowOnFailure may be used instead of Check.Quick, so that a test failure is triggered when a property does not hold. For popular unit testing frameworks like NUnit and xUnit there are also extensions that let you express FsCheck properties more directly.
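As a minimal sketch of the unit-test variant, the validity check can be wrapped in a function that throws when the property fails. The inline schema and the function name are illustrative assumptions; the test attribute depends on your framework and is only hinted at in a comment.

```fsharp
open System.Xml
open AntaniXml
open FsCheck

// Illustrative inline schema (an assumption for this sketch);
// a schema loaded with Schema.CreateFromUri is used the same way.
let e1Xsd = """
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
        <xs:element name="e1" type="xs:int" />
    </xs:schema>"""

// In a real test project, decorate this function with the attribute of your
// framework (for example [<Test>] in NUnit or [<Fact>] in xUnit).
let ``generated e1 elements are valid`` () =
    let schema = Schema.CreateFromText e1Xsd
    let arb = schema.Arbitrary(XmlQualifiedName "e1")
    // Throws an exception, and so fails the test, if a counter-example is found.
    Prop.forAll arb (fun elm -> schema.IsValid elm)
    |> Check.QuickThrowOnFailure
```

Calling the function from a test runner is enough: a passing run returns unit, a failing one surfaces the counter-example in the exception message.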

A more interesting example of property based testing concerns XML data binding and serialization. Suppose you have a class representing a global element in a schema. Data binding classes of this kind are often obtained with tools like xsd.exe. It may be interesting to check that all valid elements can be properly deserialized into instances of the corresponding class, and that serializing such instances back to XML results in equivalent elements. The resulting elements probably need not be identical to the original ones, especially when it comes to formatting details, but at least we should expect no loss of content. You may be surprised to discover that for many schemas it is quite hard or impossible to get a suitable data binding class. This is due to the X/O impedance mismatch.
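Part of this idea can be sketched as follows: a generic helper pushes an element through XmlSerializer and back, and a property checks that the result is still valid. This is only a necessary condition, not evidence that no content was lost. The names roundTrip and roundTripStaysValid are illustrative assumptions, as is the choice of XmlSerializer as the binding mechanism.

```fsharp
open System.Xml
open System.Xml.Linq
open System.Xml.Serialization
open AntaniXml
open FsCheck

/// Deserializes an element into 'T, then serializes that instance back to XML.
let roundTrip<'T> (elm: XElement) : XElement =
    let serializer = XmlSerializer(typeof<'T>)
    let value =
        use reader = elm.CreateReader()
        serializer.Deserialize(reader)
    let doc = XDocument()
    let write () =
        // The document is populated once the writer has been disposed.
        use writer = doc.CreateWriter()
        serializer.Serialize(writer, value)
    write ()
    doc.Root

/// Property: every generated element is still valid after the round trip.
let roundTripStaysValid<'T> (schema: Schema) (elementName: XmlQualifiedName) =
    let arb = schema.Arbitrary(elementName)
    Prop.forAll arb (fun elm -> schema.IsValid(roundTrip<'T> elm))

// Usage, where Foo is a hypothetical data binding class (e.g. produced by xsd.exe):
//   roundTripStaysValid<Foo> schema (XmlQualifiedName "foo") |> Check.Quick
```

A stricter property would compare the original and the re-serialized element with an equivalence function of your own that ignores formatting details.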

One more use case is schema evolution. Evolving a schema while preserving backward compatibility means that every element valid according to the old version of the schema must also be valid for the new version. This can be checked with property based testing:

    open FsCheck
    
    let oldSchema = Schema.CreateFromUri "old.xsd"
    let newSchema = Schema.CreateFromUri "new.xsd"
    let arbFooOld = oldSchema.Arbitrary(XmlQualifiedName "foo")

    let isStillValid elm = oldSchema.IsValid elm ==> newSchema.IsValid elm

    Prop.forAll arbFooOld isStillValid
    |> Check.Quick

The same in C# is:

var oldSchema = Schema.CreateFromUri("old.xsd");
var newSchema = Schema.CreateFromUri("new.xsd");
var arbFooOld = oldSchema.Arbitrary(new XmlQualifiedName("foo"));
Prop.ForAll(arbFooOld, x => newSchema.IsValid(x).When(oldSchema.IsValid(x)))
    .QuickCheck();

This example also shows a conditional property in action, expressed in F# with the ==> operator and in C# with the fluent method When. Conditional properties are well explained in the FsCheck documentation.

Creating samples for the XML type provider

Another possible usage scenario is creating samples for the XML type provider.

FSharp.Data is a popular F# library featuring many type providers, including one for XML. Strongly typed access to XML documents is achieved by inference on samples. AntaniXml may help to produce the needed samples:

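    open System.Xml
    open AntaniXml
    open System.IO
    open FSharp.Data
    open System.Xml.Linq

    // Generate five random purchaseOrder elements from the schema.
    let samples = 
        Schema.CreateFromUri(@"C:\temp\po.xsd")
              .Generator(XmlQualifiedName "purchaseOrder")
              .Generate(5)
    // Wrap the samples in a single root element and save them as the sample file.
    XElement(XName.Get "root", samples).Save(@"C:\temp\samples.xml")

    // Each child of the root is treated as an individual sample for the inference.
    type po = XmlProvider< @"C:\temp\samples.xml", SampleIsList = true>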

Of course when a schema is available it would be a better option to infer types directly from it. Future versions of FSharp.Data may support xsd.

Known Limitations

XML Schema is rich and complex, so some limitations are inevitable. A few are known and listed below, and some of them may hopefully be addressed in the future. There are likely many more unknown limitations; if you find one, please raise an issue. Don't be too scared by this disclaimer, though: AntaniXml copes with many nuances and supports many features (like regex patterns, thanks to Fare). The main limitations currently known are:

built-in types

A few built-in types are not supported: Notation, NmTokens, Id, Idref, Idrefs, Entity and Entities.

identity and subset constraint

XML Schema provides rough equivalents of primary and foreign keys in databases. Version 1.1 also introduced assertions to allow further constraints. Schemas are primarily grammar based, so they are a good fit for random generators. But these complementary features for specifying constraints are at odds with the generative approach.

wildcards

With certain kinds of wildcards (e.g. ##other) it may be impossible to generate valid contents.

regex patterns

Some regex patterns may not be properly supported, for example those using the character classes \i and \c which are specific to W3C XML Schema.
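One way to find out whether a given pattern is handled is to generate a few samples and validate them against the schema. The sketch below does this for an illustrative pattern that avoids \i and \c; the schema, the element name code and the pattern itself are assumptions made for the example.

```fsharp
open System.Xml
open AntaniXml

// Illustrative schema: three uppercase letters, a dash and four digits.
let codeXsd = """
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
        <xs:element name="code">
            <xs:simpleType>
                <xs:restriction base="xs:string">
                    <xs:pattern value="[A-Z]{3}-[0-9]{4}" />
                </xs:restriction>
            </xs:simpleType>
        </xs:element>
    </xs:schema>"""

let codeSchema = Schema.CreateFromText codeXsd

let codeSamples = codeSchema.Generator(XmlQualifiedName "code").Generate(5)

// Validate each sample against the schema instead of trusting the generator.
for sample in codeSamples do
    printfn "%O valid=%b" sample (codeSchema.IsValid sample)
```

If some samples turn out to be invalid, the pattern probably relies on a construct that is not supported, and a customized generator may be needed.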

Public API

The public types of the AntaniXml namespace constitute the public API. Users of the library are expected to interact only with the Schema class and sometimes with the CustomGenerators class.
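As a small sketch of typical interaction with the Schema class, the snippet below lists the global elements of a schema and validates a few generated samples for each. The inline schema and the sample count are illustrative assumptions; only members used elsewhere in this tutorial, plus the GlobalElements property of Schema, are called.

```fsharp
open System.Xml
open AntaniXml

// Illustrative inline schema with two global elements.
let apiXsd = """
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
        <xs:element name="e1" type="xs:int" />
        <xs:element name="e2" type="xs:string" />
    </xs:schema>"""

let apiSchema = Schema.CreateFromText apiXsd

// GlobalElements exposes the qualified names of the global elements in the schema.
for name in apiSchema.GlobalElements do
    let samples = apiSchema.Generator(name).Generate(3)
    let allValid = samples |> Array.forall (fun elm -> apiSchema.IsValid elm)
    printfn "%O: %d samples, all valid = %b" name samples.Length allValid
```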

The rest of the library is organized in modules. Even though they are public, they are not designed to be used directly by C# client code.
