How we’ve used Parslet to build the Framework Definition Language for CCS

dxw shapes

The language needed to be simple enough for non-developers to compose

This is the 2nd post of two, please check out the first one if you haven’t already.

Now that you (hopefully) have an idea of what Parslet is and how it can be used to parse structured sentences into – well – almost anything, let’s look at the language we developed for RMI to build framework definitions with.

The problem

Suppliers on the RMI system report their business to CCS using Excel templates. Each framework has its own template, which are stored in Github and can be downloaded by suppliers from the application. The supplier then re-uploads the filled-in template, and the RMI system ingests it & stores the data in a database.

The problem we needed to solve is that each framework needs to be modelled in the application so that the uploaded data from Excel can be validated and stored. Currently, the modelling is done by dxw developers; but as mentioned earlier, once we leave the project, CCS have no developers to do this work themselves.

The solution

Before I joined the project, the team decided to create a structured domain-specific language, called Framework Definition Language, for CCS to write the framework definitions.

This language needed to be simple enough for non-developers to compose, but rich enough to contain all the nuances of the framework definitions.

The frameworks consist of three main parts:

  1. The metadata. The framework name, its management charge rate etc.
  2. The fields. These are the fields the data will be ingested into, and can consist of a set of Invoice fields, Contract fields, or both.
  3. Lookups. Some fields can only contain certain values, and the lookups are the lists of values for those fields.

Additionally, the fields can be validated as optional or mandatory, a date, a number, one of a specific set of values (Lookups) etc. To make matters more complicated, the management charge can be a flat rate, or vary – either according to the value of another column in the spreadsheet, or according to whether the supplier was supplying central government or the wider public sector.

As you can see, creating something that is both usable by non-developers and feature rich was a daunting task.

To keep this blog post at a less-than-War-and-Peace length, I’m going to give a general overview of the language we eventually came up with.

The metadata

The first part is the framework metadata. This consists of a Framework instruction (the root, as mentioned in part 1), the framework name and short name, and the management charge.

Framework RM807 {
  Name 'Vehicle Hire (RM807)'
  ManagementCharge 0.5%
}

The FDL must begin with the word ‘Framework’ followed by the framework short name (in this case, ‘RM807’). Then the framework name and management charge (metadata) follow, encapsulated by curly braces. The InvoiceFields, ContractFields and Lookups follow the metadata, each block of information encapsulated by curly braces.

Because the management charge may be calculated one of three ways (flat rate, varies by column, varies by sector) there are three ways of writing it:

Flat rate

ManagementCharge 0.5%

Varies by column value, in this case ‘Spend Code’

ManagementCharge varies_by 'Spend Code' {
  'Lease Rental'                 -> 0.5%
  'Fleet Management Fee'         -> 0.5%
  'Damage'                       -> 0%
  'Other Re-charges'             -> 0%
}

Varies by sector

ManagementCharge sector_based {
  CentralGovernment -> 0.5%
  WiderPublicSector -> 1.2%
}

The fields

The (not great) screenshot above is of a subsection of the RM807 framework Excel template (the actual file can be found here). The Excel file represents the fields discussed above. The framework definition needs to map the fields in the Excel template to the fields in the data warehouse.

The FDL representation of the Excel fields looks like this:

InvoiceFields {
  LotNumber from 'Lot Number'
  optional CustomerPostCode from 'Customer PostCode'
  CustomerName from 'Customer Organisation'
  CustomerURN from 'Customer URN'
  optional CustomerReferenceNumber from 'Customer Agreement Number/Reference'
  InvoiceDate from 'Customer Invoice Date'
  InvoiceNumber from 'Customer Invoice Number'
  optional String from 'Customer Invoice Line Number'
  optional ProductDescription from 'Invoice Line Product/Service Description'
...
}

(The second column in the Excel, ‘Supplier Order Number’, is not to be filled in by the supplier, hence it is left out of the fields.)

To break down what we’re seeing here:

The framework builders will be familiar both with the data warehouse field names and the Excel column names, so hopefully, these definitions will be easy to write and understand.

We decided to place the ‘optional’ keyword for non-mandatory fields at the beginning of the lines, as the left-hand side of the document makes them easier to spot than the ragged right-hand side.

The lookups

Some columns in the spreadsheets can only contain certain values. In the Excel, they look something like this:

As shown above, the only acceptable values for “Spend Code” are “Lease Rental, Fleet Management Fee, Damage and Other Re-charges”. In FDL, fields like this are expressed with a ‘lookup’:

InvoiceFields {
  ...
  optional PromotionCode from 'Spend Code'
  ...
}
Lookups {
  PromotionCode [
    'Lease Rental'
    'Fleet Management Fee'
    'Damage'
    'Other Re-charges'
  ]
}

As you can see, the ‘lookup’ block has the same name as the data warehouse field the data will be imported into. We assume that any field with the same name as a lookup block will be a lookup type field.

So has this solution been a hit with CCS’s framework builders?

CCS have technical capabilities in-house so we were confident that with support and clear documentation the team would be able to create their own frameworks. 

Since dxw’s direct involvement with the project ended, CCS has been using the existing frameworks through several submission windows, and have begun creating their own frameworks. Ultimately this was a complex solution to support a complex system, but we are really pleased with how it turned out.