After some time with an in-memory mutable dictionary serving as the data store, the time seemed right to introduce a permanent one.

Selecting a data store

Choosing a store for a solution is a key architectural decision that requires weighing many factors: requirements, how well the store matches the domain, cost, scale, the ratio of reads to writes, consistency vs availability trade-offs, transactional integrity, and so on.

There are a number of different stores that would be reasonable for this project. Ideally I'd try implementing them all, but I don't expect that to happen in the short term. These are some options:

  • Document store e.g. MongoDB. This may be appropriate given our aggregates (template and asset) are quite complex objects.
  • Search index e.g. ElasticSearch. If there are many orders of magnitude more reads than writes, then optimizing for querying with a search index may be appropriate.
  • Event store. An event store records all the business events triggered by API or system actions, typically as an append-only log. It is often used in a CQRS system in concert with a read-specific store. Having read/write separation in our stack allows for this option.
  • Relational database e.g. SQL Server. Provides transactions, is the most mature, and the best supported by libraries.

Choosing between these depends on criteria that don't exist here, because this isn't a commercial project. Since I'm learning, and want to reduce the number of learning variables, I chose the tried-and-true option: a relational database.

As an aside, an architect also wants to allow for future contingencies where practical. By providing a clear persistence layer in the code and comprehensive integration tests, we keep the choice of store open to re-evaluation in the future (provided the organization is prepared for the cost).


F# has an excellent feature, Type Providers, which generate strongly typed code at compile time for interacting with data sources. This page has a nice comparison of some type providers for F#. For no particular reason I selected SQLProvider.
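As a sketch of how SQLProvider is typically wired up (the connection string and database name here are assumptions, not taken from the project):

```fsharp
open FSharp.Data.Sql

// Hypothetical connection string; SQLProvider reads the schema at compile
// time and generates types for each table (dbo.Template, dbo.FieldValue, ...).
[<Literal>]
let ConnectionString =
    "Server=localhost;Database=Assets;Integrated Security=true"

type Sql = SqlDataProvider<Common.DatabaseProviderTypes.MSSQLSERVER, ConnectionString>
type DbContext = Sql.dataContext
```

The generated `DbContext` type is what the query functions later in this post accept as a parameter.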

As mentioned earlier, the domain aggregates are relatively deep, so they map to multiple tables, which is why in a production solution I'd lean towards a non-relational store. Here is a comparison of the Template domain type and the tables that store it in SQL (constraints excluded):

type FieldValue = 
    | StringField of string
    | DateField of System.DateTime
    | NumericField of float

type FieldDefinition = {
    Id: System.Guid
    Name: string
    Field: FieldValue
}

type Template = {
    Id: System.Guid
    Name: string
    Fields: FieldDefinition list
    MaintenanceProgramId: System.Guid option
}

CREATE TABLE [dbo].[FieldValue](
    [FieldValueId] [uniqueidentifier] NOT NULL,
    [FieldDefinitionId] [uniqueidentifier] NOT NULL,
    [AssetId] [uniqueidentifier] NULL,
    [ValueType] [tinyint] NOT NULL,
    [StringValue] [nvarchar](max) NULL,
    [DateValue] [datetime] NULL,
    [NumericValue] [real] NULL)

CREATE TABLE [dbo].[FieldDefinition](
    [FieldDefinitionId] [uniqueidentifier] NOT NULL,
    [Name] [nvarchar](255) NOT NULL,
    [TemplateId] [uniqueidentifier] NOT NULL)

CREATE TABLE [dbo].[Template](
    [TemplateId] [uniqueidentifier] NOT NULL,
    [Name] [nvarchar](255) NOT NULL,
    [MaintenanceProgramId] [uniqueidentifier] NULL)

The result is some fairly complex queries and mapping logic. One source of complexity is that the FieldValue table is used both for default template field values (when AssetId is null) and for asset field values.
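Because the same table serves both purposes, an asset-side read path (not shown in this excerpt) would differ from the template query only in its filter on AssetId. A hypothetical sketch of that filter as a plain function:

```fsharp
// Hypothetical helper: where the template query demands AssetId.IsNone
// (default values), an asset query would instead match the specific asset.
let assetFieldFilter (assetId: System.Guid) (storedAssetId: System.Guid option) =
    match storedAssetId with
    | Some id -> id = assetId   // asset-specific value row
    | None    -> false          // template default row; excluded here
```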

The query contains two left outer joins because a template may exist without any fields. This is shown using the rather cryptic (!!) operator. It would be nice if a more pleasant name could be used for this operator.

The mapping between discriminated unions and the underlying store can be handled by joining to a different table for each case, or by the approach used here, which is to have a different column for each case. Using different columns makes the query easier but results in a sparse table. Given there are only three cases, the sparseness shouldn't be a big space penalty.
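The write side of this scheme is the mirror image of the read mapping shown below: each union case fills its own column and leaves the others null, producing the sparse row. A minimal sketch (the `toColumns` function is illustrative, not from the project code):

```fsharp
type FieldValue =
    | StringField of string
    | DateField of System.DateTime
    | NumericField of float

// Flatten a union case into (ValueType tag, StringValue, DateValue, NumericValue).
// Exactly one of the three value columns is populated per row.
let toColumns field : byte * string option * System.DateTime option * float option =
    match field with
    | StringField s  -> 1uy, Some s, None, None
    | DateField d    -> 2uy, None, Some d, None
    | NumericField n -> 3uy, None, None, Some n
```

The tag values (1uy, 2uy, 3uy) match the ones decoded in the read mapping below.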

The read mapping is shown below. The full code is available here.

let private mapSingleTemplate (rows:TemplateQueryResultSet list) : Template =
    let fields = [ for row in rows do
                    let (_, defn, value) = row
                    if defn.FieldDefinitionId <> System.Guid.Empty then // empty guid means template has no fields
                        yield {
                            Id = defn.FieldDefinitionId
                            Name = defn.Name
                            Field = match value.ValueType with
                                    | 1uy -> StringField(value.StringValue.Value)
                                    | 2uy -> DateField(value.DateValue.Value)
                                    | 3uy -> NumericField(float value.NumericValue.Value)
                                    | _ -> failwith "Unknown field type"
                        } ]
    let (templateCols, _, _) = rows.Head
    {
        Id = templateCols.TemplateId
        Name = templateCols.Name
        Fields = fields
        MaintenanceProgramId = templateCols.MaintenanceProgramId
    }

let private templateByIdQuery (dc:DbContext) id : System.Linq.IQueryable =
    query { 
        for template in dc.Dbo.Template do
        // (!!) means left outer join
        for fieldDef in (!!) template.``dbo.FieldDefinition by TemplateId`` do
        for fieldVal in (!!) fieldDef.``dbo.FieldValue by FieldDefinitionId`` do
        where (template.TemplateId = id && fieldVal.AssetId.IsNone)
        select (template, fieldDef, fieldVal)
    }

let private templateById id : Template option =
    let dc:DbContext = Sql.GetDataContext()
    let rows = templateByIdQuery dc id |> Seq.toList
    if [] = rows then None else Some (mapSingleTemplate rows)