A provocative claim

As a software anarchitect, I like to challenge the status quo: I propose using RDF and SPARQL for the core domain logic of business applications.

Many business applications consist of simple workflows to process rich information. My claim is that RDF and SPARQL are ideal to model and process such information while a workflow engine can orchestrate the processing steps.

Cheap philosophy

Algebraic data types are concrete structures capable of representing information explicitly and are becoming popular for domain modeling.

But a logical framework like RDF also shines at representing knowledge about a domain. Rich Hickey's provocative talks may upset my F# friends, but I think he has a point: explicit, precise data types may lead to rigid designs (to be clear, this article explains that the culprit for a rigid design is not the type system).

In domain modeling, common advice is to focus on functions and not on data: we should describe the dynamic behavior of a system rather than static information. This applies both to OO (classes are collections of functions) and FP (pure functions still have a dynamic, computational sense even though we like to think of them as static input-output mappings).

Often this advice is neglected, partly for historical reasons stemming from the dominance of relational databases, and partly because the value of many business applications lies more in the data than in their processing steps. My endorsement of RDF is limited to this kind of application, for which other declarative approaches, SQL-like or Prolog-like, may work as well.

Proof of concept

I admit this is cheap philosophy and my claim is not backed by real-world experience, so I decided to get a feel for what it means to build an application with a core domain based on RDF. As a proof of concept, I hacked together a toy workflow engine in a few lines of F# code. It orchestrates the steps of workflow definitions like the following one (expressed as RDF in Turtle notation):

@prefix w: <http://workflow.org/> .
@prefix : <http://example.org/> .

:search a w:Workflow ;
    w:startAt :validation .
:validation a w:AskStep ;
    w:sparqlQuery "validation.rq" ;
    w:nextOnTrue :retrieval ;
    w:nextOnFalse :ko .
:retrieval a w:ConstructStep ;
    w:sparqlQuery "retrieval.rq" ;
    w:next :ok .
:ko a w:FinalStep ;
    w:success false .
:ok a w:FinalStep ;
    w:success true .

The workflow accepts RDF input like:

@prefix : <http://example.org/> .

[ a :SearchRequest ;
    :keyword "logic", "software" ] .

and the workflow steps use the dotNetRDF library to process information with SPARQL. ASK queries handle branching (although for validation we may also use something more specific like SHACL):

# validation.rq
prefix : <http://example.org/>

ASK
WHERE {
    ?request a :SearchRequest ;
        :keyword ?keyword .
    FILTER (strlen(?keyword) > 3)
}

and CONSTRUCT queries to transform and merge information:

# retrieval.rq
prefix : <http://example.org/>

CONSTRUCT {
    ?result :about ?keyword .
}
WHERE {
    ?request a :SearchRequest ;
        :keyword ?keyword .
    SERVICE <https://mytriples/sparql> {
        ?result :about ?keyword .
    }
}

Query processing happens in memory, but we can also use RDF databases (triplestores) for persistent storage of information. Federated queries (with the SERVICE keyword) make it possible to relate information in memory with information stored in RDF databases.

Of course real applications interact with different kinds of databases and other infrastructure (queues, APIs...) so our workflow engine needs to plug in custom adapter code for such interactions (and for when data processing is complex enough and requires a real programming language). But, overall, RDF provides a great data model with standard and uniform tools to process, persist and serialize information with no impedance mismatch.
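
To give an idea of how little such an engine needs, here is a minimal sketch of an interpreter for definitions like the :search workflow above. It is not the actual proof-of-concept code: the Step and Workflow types and the run function are hypothetical names, steps are plain F# functions instead of SPARQL queries read from a graph, and a list of keywords stands in for the RDF data.

```fsharp
// Hypothetical model of a workflow definition. The cases mirror w:AskStep,
// w:ConstructStep and w:FinalStep, plus a Custom case for adapter code.
// 'Data stands for the information being processed (an RDF graph in the article).
type Step<'Data> =
    | Ask of ask: ('Data -> bool) * nextOnTrue: string * nextOnFalse: string
    | Construct of construct: ('Data -> 'Data) * next: string
    | Custom of run: ('Data -> 'Data) * next: string
    | Final of success: bool

type Workflow<'Data> =
    { StartAt: string
      Steps: Map<string, Step<'Data>> }

// Follows the steps from StartAt until a final step is reached.
// Map.find throws if a step refers to a name that is not defined.
let run (workflow: Workflow<'Data>) (input: 'Data) : bool * 'Data =
    let rec loop name data =
        match Map.find name workflow.Steps with
        | Ask (ask, onTrue, onFalse) ->
            loop (if ask data then onTrue else onFalse) data
        | Construct (f, next)
        | Custom (f, next) -> loop next (f data)
        | Final success -> success, data
    loop workflow.StartAt input

// Stand-in for validation.rq: true if some keyword is longer than 3 characters.
let hasValidKeyword (keywords: string list) =
    keywords |> List.exists (fun k -> String.length k > 3)

// Same shape as the :search workflow, with functions in place of queries.
let search =
    { StartAt = "validation"
      Steps =
        Map.ofList [
            "validation", Ask (hasValidKeyword, "retrieval", "ko")
            "retrieval", Construct (List.map (fun k -> "result about " + k), "ok")
            "ko", Final false
            "ok", Final true ] }

let outcome = run search [ "logic"; "software" ]
```

In this sketch a custom adapter is just another function from data to data, so plugging in code that talks to a queue or an API does not change the interpreter.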

A mixed paradigm

Most programmers (including me) are scared of building applications using something other than their favourite programming language. Filling in the gaps of some 'bubbles and arrows' workflow framework can be frustrating and painful, especially when such tools are built to appeal to managers, selling the illusion of creating applications with almost no programming skills. Therefore, a smooth integration of declarative RDF processing with regular programming is fundamental. Type providers in Iride can help to bridge RDF information with processing code.

The following sendOffers function can be plugged as a custom step into a workflow. It takes an instance of IGraph as input and accesses its information through types generated from an RDF schema by GraphProvider. A concern may be that external libraries like dotNetRDF pollute our domain. But the IGraph interface is much like ICollection or IDictionary from the base library. Purists would ban all of them, but in practice they appear routinely in domain logic.

open Iride

type Schema = GraphProvider<Schema="""

@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix schema: <http://schema.org/> .
@prefix : <http://example.org/> .

schema:price a rdf:Property ;
	schema:domainIncludes schema:Offer ;
	schema:rangeIncludes xsd:decimal .
schema:priceCurrency a rdf:Property ;
	schema:domainIncludes schema:Offer ;
	schema:rangeIncludes xsd:string .
schema:gtin a rdf:Property ;
	schema:domainIncludes schema:Offer ;
	schema:rangeIncludes xsd:string .
""">

let (|EUR|USD|Other|) (offer: Schema.Offer) =
    match Seq.exactlyOne offer.PriceCurrency with
    | "EUR" -> EUR offer
    | "USD" -> USD offer
    | _ -> Other offer

let (|Expensive|_|) (offer: Schema.Offer) =
    let price = Seq.exactlyOne offer.Price
    match offer with
    | EUR _ ->
        if price > 200m
        then Some (Expensive offer)
        else None
    | USD _ ->
        if price > 250m
        then Some (Expensive offer)
        else None
    | Other _ -> None

let sendOffer = function
    | Expensive offer ->
        let gtin = Seq.exactlyOne offer.Gtin
        printfn "promote %s to rich customers" gtin
    | _ -> ()

let sendOffers (data: VDS.RDF.IGraph) =
    Schema.Offer.Get data
    |> Seq.iter sendOffer

Notice how provided types help navigate information but lack precision. Price, PriceCurrency and Gtin are sequences because RDF allows multiple property values. Here, the application assumes there is a single value for each of them (possibly relying on a previous SHACL validation step, because the schema only describes a domain, imposing no constraint).
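
Without such a validation step, Seq.exactlyOne throws an ArgumentException as soon as a property has zero or several values. A more defensive variant could rely on Seq.tryExactlyOne from recent FSharp.Core versions. The sketch below is only an illustration: RawOffer is a hypothetical stand-in for the provided Schema.Offer type, and PricedOffer and tryOffer are names made up for this example.

```fsharp
// Hypothetical stand-in for the provided Schema.Offer type:
// every property is a sequence, because RDF allows multiple values.
type RawOffer =
    { Price: seq<decimal>
      PriceCurrency: seq<string>
      Gtin: seq<string> }

// A precise view of an offer, only available
// when each property has exactly one value.
type PricedOffer =
    { Amount: decimal
      Currency: string
      Code: string }

// Returns None instead of throwing when a value is missing or duplicated.
let tryOffer (raw: RawOffer) : PricedOffer option =
    match Seq.tryExactlyOne raw.Price,
          Seq.tryExactlyOne raw.PriceCurrency,
          Seq.tryExactlyOne raw.Gtin with
    | Some amount, Some currency, Some code ->
        Some { Amount = amount; Currency = currency; Code = code }
    | _ -> None

let complete =
    tryOffer
        { Price = Seq.singleton 250m
          PriceCurrency = Seq.singleton "EUR"
          Gtin = Seq.singleton "0123456789012" }

let ambiguous =
    tryOffer
        { Price = Seq.ofList [ 250m; 300m ]
          PriceCurrency = Seq.singleton "EUR"
          Gtin = Seq.singleton "0123456789012" }
```

Here complete holds a PricedOffer while ambiguous is None, so the single-value assumption becomes an explicit branch rather than a runtime exception.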

In F#, we enjoy the kind of precision given by union types. I argue their strength lies more in taming cyclomatic complexity than in information modeling. By providing exhaustive case matching (like active patterns in the example), union types implicitly constrain the processing paths, hence they pertain more to the dynamic aspect of a system.
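
As a small illustration of this point, the currency branching of the example can be sketched with a closed union instead of active patterns. The Currency type and the functions below are hypothetical and not part of the original code. The thresholds are the ones used by the Expensive pattern.

```fsharp
// Hypothetical closed union for the currencies handled by the application.
type Currency =
    | EUR
    | USD
    | Other of code: string

// Maps the raw string found in the data to one of the known cases.
let parseCurrency (code: string) =
    match code with
    | "EUR" -> EUR
    | "USD" -> USD
    | other -> Other other

// Every case must be handled: dropping a branch here makes the compiler
// report an incomplete pattern match (warning FS0025).
let expensiveThreshold currency =
    match currency with
    | EUR -> Some 200m
    | USD -> Some 250m
    | Other _ -> None

// An offer is expensive only if its currency has a threshold
// and the price exceeds it.
let isExpensive currency price =
    match expensiveThreshold currency with
    | Some threshold -> price > threshold
    | None -> false
```

The union says little about what an offer is. What it pins down is the set of paths the processing code can take, which is the sense in which it constrains behavior rather than information.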

The next example shows yet another option, in which we get rid of the workflow engine and throw into the mix union types for branching logic (namely Option to short-circuit on failure) and type providers to access raw graph data encapsulated in OO-style types:

open Iride
open VDS.RDF

type G = GraphProvider<Schema = """
    @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix : <http://example.org/> .

    :keyword rdfs:domain :SearchRequest ;
         rdfs:range xsd:string .
""">

let validKeyword k = String.length k > 3

type SearchRequest private(request: G.SearchRequest) =

    member _.Keywords = request.Keyword |> Seq.filter validKeyword

    static member TryCreate(data: IGraph) =
        match G.SearchRequest.Get data |> Seq.toArray with
        | [| r |] ->
            if r.Keyword |> Seq.exists validKeyword
            then Some (SearchRequest r)
            else None
        | _ -> None
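
The short-circuiting itself happens where such functions are composed. The following sketch shows the idea with plain lists standing in for graph data. All names in it (tryCreateRequest, tryRetrieve, search and the in-memory index) are assumptions made for illustration and are not part of the original code.

```fsharp
// Stand-in for SearchRequest.TryCreate: keeps the keywords longer than
// 3 characters and fails when none of them is valid.
let tryCreateRequest (keywords: string list) =
    match keywords |> List.filter (fun k -> String.length k > 3) with
    | [] -> None
    | valid -> Some valid

// Hypothetical retrieval step: looks keywords up in an in-memory index
// and fails when nothing is found.
let tryRetrieve (index: Map<string, string list>) (keywords: string list) =
    let results =
        keywords
        |> List.collect (fun k -> Map.tryFind k index |> Option.defaultValue [])
    match results with
    | [] -> None
    | found -> Some found

// Option.bind skips retrieval when validation fails,
// Option.map post-processes only successful results.
let search index keywords =
    tryCreateRequest keywords
    |> Option.bind (tryRetrieve index)
    |> Option.map List.distinct

let index =
    Map.ofList [ "logic", [ "doc1"; "doc2" ]; "software", [ "doc2" ] ]

let found = search index [ "logic"; "software" ]
let rejected = search index [ "ab" ]
```

With this index, found is Some [ "doc1"; "doc2" ] and rejected is None. The pipeline plays the role of the workflow definition, with None taking the place of the :ko final step.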

Conclusion

Type providers and data-related technologies like RDF are expected to live inside adapters at the boundaries of applications, far removed from the core domain logic. I argue in favor of admitting them inside the core of information-based applications. Although my aim is mainly thought-provoking, I really hope to see some ideas from declarative, logic-based paradigms percolate into mainstream programming, much like what happened with functional programming permeating OO languages.
