Hello! I'm new to Clojure and am hoping for some advice on how to solve a problem.
A desired feature of the application I am writing is to allow users to set up rules to transform data records from different sources (by this point turned into edn) into a normalized data structure.
So, for example, the user might source equity prices from Bloomberg and Reuters, which will obviously come in with slightly different record schemas, and I would want users to be able to set up two mappings which transform each of them to a pre-defined ‘price’ target object (with, say, a date, price, security name). There might be a few circumstances where the logic is slightly more complex than ‘the value of key s_price in the source maps to the value of price in the target’, like adding two source fields together to get to the target field, but I wouldn’t expect much more complexity than that.
Obviously it’s fairly trivial to write a function ‘transform-bloomberg-object’ which accepts the bloomberg data structure and converts it to the target object schema. However I don’t want the concrete mapping logic to be part of the source code, i.e. I don’t want to redeploy every time a new mapping rule is needed by the business. I would like mapping logic to be loaded at runtime from a database or file (and ultimately generated and saved by business users through some kind of UI).
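To make the "loaded at runtime" part concrete, here is a minimal sketch that assumes the mapping is stored as EDN text. The name `mapping-text`, the `:fields` key and the `:ticker` source field are made up for illustration; in practice the string would come from something like `(slurp "some-file.edn")` or a database column.

```clojure
(require '[clojure.edn :as edn])

;; Hypothetical mapping text, as it might be stored in a file or a database row.
(def mapping-text
  "{:name \"Bloomberg Price Transform Map\"
    :fields {:price    :s_price
             :security :ticker}}")

;; clojure.edn/read-string only reads data and never evaluates code,
;; which matters if the text is ultimately written by business users.
(def mapping (edn/read-string mapping-text))

(get-in mapping [:fields :price])
;; => :s_price
```

Using `clojure.edn/read-string` rather than `clojure.core/read-string` seems like the safer choice here, since the core reader can be made to execute code embedded in the input.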
As that would suggest, I think I need a Mapping object: some sort of data structure with a declarative description of how to map from source fields to target fields, something like:
    {:name "Bloomberg Price Transform Map"
     :target-field1 :source-fieldX
     :target-field2 (+ :source-fieldY :source-fieldZ)}
I would then want to have a (single) function which accepts both a source record and a Mapping object, applies the operations described in the Mapping object, and returns a target object.
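A rough sketch of what such a function could look like, under some assumptions of mine: the field rules live under a `:fields` key (so that `:name` is not mistaken for a target field), a keyword means "copy this source field", and a list such as `(+ :a :b)` means "apply a whitelisted operation". The names `ops`, `resolve-rule` and `apply-mapping`, and the sample fields, are all hypothetical.

```clojure
;; Whitelist of operations a mapping may name; anything else is rejected,
;; so a mapping can never call arbitrary functions.
(def ops {'+ + '- - '* *})

(defn resolve-rule
  "Evaluates one rule against a source record.
   A keyword looks up a source field; a list like (+ :a :b) applies a
   whitelisted operation to its resolved arguments; anything else is
   treated as a literal value."
  [record rule]
  (cond
    (keyword? rule) (get record rule)
    (seq? rule)     (let [[op & args] rule
                          f (or (ops op)
                                (throw (ex-info "Unknown operation" {:op op})))]
                      (apply f (map #(resolve-rule record %) args)))
    :else           rule))

(defn apply-mapping
  "Builds a target record by resolving every rule under :fields."
  [mapping record]
  (into {}
        (map (fn [[target rule]] [target (resolve-rule record rule)]))
        (:fields mapping)))

;; Quoted so that the (+ ...) form stays data instead of being evaluated.
(def bloomberg-mapping
  '{:name   "Bloomberg Price Transform Map"
    :fields {:price    (+ :s_price :s_adjustment)
             :security :ticker}})

(apply-mapping bloomberg-mapping
               {:s_price 100 :s_adjustment 2 :ticker "VOD LN"})
;; => {:price 102, :security "VOD LN"}
```

The rules are interpreted by a small lookup table rather than passed to `eval`, which keeps user-authored mappings from running arbitrary code and makes adding a new operation a one-line change.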
This feels like something Clojure would be suited for (‘it’s just data’ yada yada), but I’ve not done anything like this before and am unsure if this is a sensible approach. It also seems like something that is a very common business problem, and I don’t want to reinvent the wheel. It’s just the T part of ETL after all, and I’ve seen it solved with graphical ETL tools.
So my questions are:
- is what I’ve described a reasonable approach?
- are there any difficulties you would see with this approach?
- am I trying to reinvent the wheel, and are there libraries that solve or partially solve this problem?
- is there something obvious I’m missing (in particular it feels like spec might have a part to play here, but I’m too new to Clojure to see how or where)?
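On the spec question, one place it might fit, as far as I can tell, is validating that a transformed record really matches the target schema. A minimal sketch using `clojure.spec.alpha`, where the `:price/...` spec names and the choice of predicates are my assumptions about the ‘price’ target object:

```clojure
(require '[clojure.spec.alpha :as s])

;; Hypothetical specs for the three fields of the 'price' target object.
(s/def :price/date inst?)
(s/def :price/price number?)
(s/def :price/security string?)

;; :req-un means the record uses the unqualified keys :date, :price, :security.
(s/def :price/record
  (s/keys :req-un [:price/date :price/price :price/security]))

(s/valid? :price/record
          {:date #inst "2020-01-02" :price 102 :security "VOD LN"})
;; => true

;; Missing :date, and :price is a string rather than a number.
(s/valid? :price/record {:price "102" :security "VOD LN"})
;; => false
```

`s/explain-data` on a failing record would give a data description of what went wrong, which could be shown to the user who authored the mapping.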
I’ve googled around and found some answers which seem to touch on the subject, but not really get to the core of the issue.