Maybe RDF
A provocative claim
As a software anarchitect, I like to challenge the status quo: I propose to use RDF and SPARQL for the core domain logic of business applications.
Many business applications consist of simple workflows to process rich information. My claim is that RDF and SPARQL are ideal to model and process such information while a workflow engine can orchestrate the processing steps.
Cheap philosophy
Algebraic data types are concrete structures capable of representing information explicitly and are becoming popular for domain modeling.
But a logical framework like RDF also shines at representing knowledge about a domain. Rich Hickey's provocative talks may upset my F# friends, but I think he has a point: explicit, precise data types may lead to rigid designs (to be clear, this article explains that the culprit for a rigid design is not the type system).
In domain modeling, common advice is to focus on functions and not on data: we should describe the dynamic behavior of a system rather than static information. This applies both to OO (classes are collections of functions) and FP (pure functions still have a dynamic, computational sense even though we like to think of them as static input-output mappings).
This advice is often neglected, partly for historical reasons stemming from the dominance of relational databases, and partly because the value of many business applications lies more in the data than in their processing steps. My endorsement of RDF is limited to this kind of application, for which other declarative approaches, SQL-like or Prolog-like, may work as well.
Proof of Concept
I admit this is cheap philosophy and my claim is not backed by real-world experience, so I decided to get a feel for what it means to build an application with a core domain based on RDF. As a proof of concept, I hacked together a toy workflow engine in a few lines of F# code. It orchestrates the steps of workflow definitions like the following one (expressed as RDF in Turtle notation):
[Listing not preserved: workflow definition in Turtle]
The workflow accepts RDF input like:
[Listing not preserved: sample RDF input in Turtle]
and the workflow steps use the dotNetRDF library to process information with SPARQL: ASK queries for branching (although for validation we may also use something more specific like SHACL):
[Listing not preserved: SPARQL ASK query]
and CONSTRUCT queries to transform and merge information:
[Listing not preserved: SPARQL CONSTRUCT query]
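To give a rough feel for the two query forms, here is a minimal sketch in plain F# that mimics them over an in-memory set of triples. It is not SPARQL and does not use dotNetRDF: strings stand in for RDF terms, and the names `Slot`, `ask`, `construct` and the `ex:status` property are assumptions made up for this illustration.

```fsharp
// A triple is (subject, predicate, object); plain strings stand in for RDF terms.
type Triple = string * string * string

// One position of a triple pattern: a fixed term, or a wildcard
// (comparable to a SPARQL variable whose binding we ignore).
type Slot =
    | Fixed of string
    | Wildcard

let matchesSlot slot value =
    match slot with
    | Fixed expected -> expected = value
    | Wildcard -> true

let matchesPattern (s, p, o) ((s', p', o') : Triple) =
    matchesSlot s s' && matchesSlot p p' && matchesSlot o o'

// ASK analogue: true when at least one triple matches the pattern.
let ask pattern (graph : Set<Triple>) =
    graph |> Set.exists (matchesPattern pattern)

// CONSTRUCT analogue: build new triples from every match and merge them
// into the input graph (set union drops duplicates, as an RDF graph would).
let construct pattern (template : Triple -> Triple list) (graph : Set<Triple>) =
    graph
    |> Seq.filter (matchesPattern pattern)
    |> Seq.collect template
    |> Set.ofSeq
    |> Set.union graph

// Hypothetical input data.
let input : Set<Triple> =
    set [ (":offer1", "rdf:type", "schema:Offer")
          (":offer1", "schema:price", "10.00") ]

// Branching condition: is there any resource typed as an offer?
let hasOffer = ask (Wildcard, Fixed "rdf:type", Fixed "schema:Offer") input

// Transformation: tag every resource that has a price.
let pricedPattern = (Wildcard, Fixed "schema:price", Wildcard)
let markPriced ((s, _, _) : Triple) : Triple list = [ (s, "ex:status", "priced") ]
let enriched = construct pricedPattern markPriced input
```

A real SPARQL engine joins several patterns and binds variables across them; the sketch only covers the single-pattern case to show where branching and merging fit in a step.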
Query processing happens in memory, but we can also use RDF databases (triplestores) for persistent storage of information. Federated queries (with the SERVICE keyword) allow us to relate information in memory with information stored in RDF databases.
Of course, real applications interact with different kinds of databases and other infrastructure (queues, APIs...), so our workflow engine needs to plug in custom adapter code for such interactions (and for cases where data processing is complex enough to require a real programming language). But, overall, RDF provides a great data model with standard and uniform tools to process, persist and serialize information with no impedance mismatch.
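The orchestration idea can be sketched in a few lines of plain F#. This is not the engine from the proof of concept, only an illustration under stated assumptions: a set of string triples stands in for an RDF graph, the step names and the `ex:` properties are hypothetical, and the `notify` adapter is a stub where a real one might call a queue or an API.

```fsharp
// The state passed between steps; string triples stand in for an RDF graph.
type Graph = Set<string * string * string>

// A step transforms the graph, chooses the next step from a test
// (the role played by ASK queries), or ends the workflow.
type Step =
    | Transform of apply : (Graph -> Graph) * next : string
    | Branch of test : (Graph -> bool) * ifTrue : string * ifFalse : string
    | Done

type Workflow = { Start : string; Steps : Map<string, Step> }

// Follows step names from Start until a Done step is reached.
let run (workflow : Workflow) (input : Graph) : Graph =
    let rec loop name graph =
        match Map.find name workflow.Steps with
        | Transform (apply, next) -> loop next (apply graph)
        | Branch (test, ifTrue, ifFalse) ->
            loop (if test graph then ifTrue else ifFalse) graph
        | Done -> graph
    loop workflow.Start input

let hasOffer (g : Graph) =
    g |> Set.exists (fun (_, p, o) -> p = "rdf:type" && o = "schema:Offer")

// Hypothetical custom adapter steps plugged in as ordinary functions.
let notify (g : Graph) = Set.add (":workflow", "ex:notified", "true") g
let reject (g : Graph) = Set.add (":workflow", "ex:rejected", "true") g

let workflow =
    { Start = "check"
      Steps =
        Map.ofList
            [ ("check", Branch (hasOffer, "notify", "reject"))
              ("notify", Transform (notify, "end"))
              ("reject", Transform (reject, "end"))
              ("end", Done) ] }

let accepted = run workflow (set [ (":offer1", "rdf:type", "schema:Offer") ])
let refused = run workflow Set.empty
```

Because a custom step is just a function from graph to graph, adapter code and declarative steps share one signature, which is what keeps the mix of the two manageable.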
A mixed paradigm
Most programmers (including me) are scared of building applications using something other than their favourite programming language. Filling in the gaps of some 'bubbles and arrows' workflow framework can be frustrating and painful, especially when such tools are built to appeal to managers, selling the illusion of creating applications with almost no programming skills. Therefore, a smooth integration of declarative RDF processing with regular programming is fundamental. Type providers in Iride can help to bridge RDF information with processing code.
The following sendOffers function can be plugged as a custom step into a workflow. It takes an instance of IGraph as input and accesses its information through types generated from an RDF schema by GraphProvider. A concern may be that external libraries like dotNetRDF pollute our domain. But the IGraph interface is much like ICollection or IDictionary from the base library. Purists would ban all of them, but in practice they appear routinely in domain logic.
[Listing not preserved: the sendOffers function in F#]
Notice how provided types help navigate information but lack precision. Price, PriceCurrency and Gtin are sequences because RDF allows multiple property values. Here, the application assumes there is a single value for each of them (possibly relying on a previous SHACL validation step, because the schema only describes a domain and imposes no constraints).
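The single-value assumption can be made visible with a small sketch. The `Offer` record below is a hand-written, hypothetical stand-in for a provided type (it is not what GraphProvider generates), and the `decimal` price type is an assumption; only the shape, every property being a sequence, is taken from the text.

```fsharp
// Hypothetical stand-in for a provided type: each property is a sequence,
// since an RDF resource may carry zero, one, or many values for it.
type Offer =
    { Price : seq<decimal>
      PriceCurrency : seq<string>
      Gtin : seq<string> }

// Trusts an earlier validation step: Seq.exactlyOne throws an
// ArgumentException when a property does not have precisely one value.
let priceOf (offer : Offer) =
    Seq.exactlyOne offer.Price, Seq.exactlyOne offer.PriceCurrency

// Defensive variant: returns None instead of throwing.
let tryPriceOf (offer : Offer) =
    match Seq.tryExactlyOne offer.Price, Seq.tryExactlyOne offer.PriceCurrency with
    | Some price, Some currency -> Some (price, currency)
    | _ -> None

let valid =
    { Price = Seq.singleton 10.5M
      PriceCurrency = Seq.singleton "EUR"
      Gtin = Seq.singleton "0123456789012" }

// Two prices for the same offer: legal RDF, but it breaks the assumption.
let ambiguous = { valid with Price = Seq.ofList [ 10.5M; 12M ] }
```

Which variant fits depends on where validation happens: after a SHACL step the throwing version documents the assumption, while the `try` version moves the check into the processing code.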
In F#, we enjoy the kind of precision given by union types. I argue their strength is more in taming cyclomatic complexity than in information modeling. By providing exhaustive case matching (like active patterns in the example), union types implicitly constrain the processing paths, hence they pertain more to the dynamic aspect of a system.
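A short sketch of that point, with names invented for the illustration (`Missing`, `Single`, `Multiple`, `Decision`): a complete active pattern classifies a multi-valued property, and the match that consumes it has to cover every case, so the set of processing paths is fixed by the pattern rather than by the data model.

```fsharp
// Complete active pattern: classifies how many values a property carries.
let (|Missing|Single|Multiple|) (values : seq<'a>) =
    match List.ofSeq values with
    | [] -> Missing
    | [ value ] -> Single value
    | many -> Multiple many

// The outcome of processing, again a closed set of cases.
type Decision =
    | Reject of reason : string
    | Accept of price : decimal

// Dropping any of the three branches makes the compiler report an
// incomplete match, so each path has to be handled deliberately.
let decide (prices : seq<decimal>) =
    match prices with
    | Missing -> Reject "no price"
    | Single price -> Accept price
    | Multiple values -> Reject (sprintf "%d conflicting prices" (List.length values))
```

The data itself stays loosely typed (a sequence of values); the precision lives in the branching logic.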
The next example shows yet another option, in which we get rid of the workflow engine and throw into the mix union types for branching logic (namely Option to short-circuit on failure) and type providers to access raw graph data encapsulated in OO-style types:
[Listing not preserved: F# example combining Option and type providers]
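The general shape of such code might look like the sketch below. It is an approximation under assumptions, not a reconstruction of the lost listing: a list of string triples replaces IGraph and the provided types, and the `ex:keyword` property, the `search` step and the catalog are hypothetical. The private constructor means a `SearchRequest` exists only if `TryCreate` found usable data, and `Option.bind` skips the remaining steps on failure.

```fsharp
// Raw data: string triples stand in for the graph behind the wrapper.
type Triple = string * string * string

// OO-style type with a private constructor: the only way to obtain
// an instance is TryCreate, which returns None on unusable data.
type SearchRequest private (keywords : string list) =
    member this.Keywords = keywords

    static member TryCreate (data : Triple list) =
        let found =
            data
            |> List.filter (fun (_, p, _) -> p = "ex:keyword")
            |> List.map (fun (_, _, o) -> o)
            |> List.filter (fun keyword -> String.length keyword > 0)
        if List.isEmpty found then None else Some (SearchRequest found)

// Hypothetical second step that may fail too: returns the first
// catalog entry matching any keyword.
let search (catalog : Map<string, string>) (request : SearchRequest) =
    request.Keywords |> List.tryPick (fun keyword -> Map.tryFind keyword catalog)

// Option.bind short-circuits: if TryCreate yields None, search never runs.
let handle catalog data =
    SearchRequest.TryCreate data
    |> Option.bind (search catalog)
    |> Option.map (sprintf "found %s")

let catalog = Map.ofList [ ("rdf", "RDF primer") ]
let hit = handle catalog [ (":request", "ex:keyword", "rdf") ]
let miss = handle catalog [ (":request", "ex:keyword", "sparql") ]
let invalid = handle catalog []
```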
Conclusion
Type providers and data-related technologies like RDF are expected to live inside adapters at the boundaries of applications, far removed from the core domain logic. I argue in favor of admitting them inside the core of information-based applications. Although my aim is mainly thought-provoking, I really hope to see some ideas from declarative, logic-based paradigms percolate into mainstream programming, much like what happened with functional programming permeating OO languages.