Vectors of maps: building paths, or can I detect duplicate keys?

Refining a library we have used heavily which I wrote a few years ago, I am looking at ways of refining its UPDATE operations. The basic functionality is to represent HTML input forms as data structures. As forms are ordered this is naturally a hiccup-ish vector of kw-map pairs, so the vector itsely looks like a map where the map in which the curly braces have been exchanged for square ones.

Trouble is, updating this thing to edit it is difficult. Given a structure like,

[:name {:type :text
	:placeholder "Name here"}
 :price {:type :number
	 :read-only true
	 :value 50}]

An example is imagining the user has used an edit-button beside the field on the front-end, finding the address to the matching item in the data structure. The problem is programmatically editing a form-schema value chosen by the user, such as [:price :value]. One approach involves converting the schema vector structure into a map, with a quick (apply hash-map my-schema-vec) , then utilizing get-in, then diffing against the original structure to regain the order. In this situation, besides the probable performance issues, there is a significant chance that the schema vector contains duplicate “keys” at different places in the form, and so they would get clobbered by the convert-to-map process.

I’m open to any suggestions for a better way of doing this, but in particular, is there a way I could at least throw an error if the schema vector contains duplicate keywords, like two of :name? And have that error say which keyword is the duplicate?

My implementation below isn’t the cleanest, but here’s my solution:

I build a hashmap of the keywords in the vector and their indexes. Part of that building is a check if the key already exists, and if so, throw an informative exception.

I made an example function that specifically works for price, but could easily be made more generic. A very important detail I learned is that assoc can replace a value at an index with (assoc coll idx new-val). That’ll likely resolve your performance concerns.

(defn build-map-from-vec2 [value-vecs]
  (reduce (fn [acc v]
            (let [key (first v)
                  idx (second v)]
              (when (contains? acc key)
                (throw (Exception. (str "Key "
                                        key
                                        " already exists at index "
                                        (get acc key)
                                        " (duplicate key at "
                                        idx
                                        ")"))))
              (assoc acc key idx)))
          {}
          value-vecs))

(defn map-keyword-indexes [m]
  (->> m
       (map-indexed (fn [i v]
                      (if (keyword? v)
                        [v i]
                        nil)))
       (filter (complement nil?))
       build-map-from-vec2))

(defn replace-price [data new-price]
  (let [keyword-indexes (map-keyword-indexes data)]
    (if (contains? keyword-indexes :price)
      (let [price-idx (inc (:price keyword-indexes))
            old-price-map (nth data price-idx)
            new-price-map (assoc old-price-map :value new-price)]
        (assoc data price-idx new-price-map))
      (concat data [:price {:type :number
                            :value new-price}]))))

;; replaces existing price
(replace-price [:name {:type :text
                       :placeholder "Name here"}
                :price {:type :number
                        :read-only true
                        :value 50}]
               60)

;; [:name {:type :text,
;;         :placeholder "Name here"}
;;  :price {:type :number,
;;          :read-only true,
;;          :value 60}]



;; adds price if it doesn't exist
(replace-price [:name {:type :text
                       :placeholder "Name here"}]
               60)

;; [:name {:type :text,
;;         :placeholder "Name here"}
;;  :price {:type :number,
;;          :value 60}]



;; helper function throws exception for duplicate keys
(map-keyword-indexes [:name {:type :text
                             :placeholder "Name here"}
                      :price {:type :number
                              :read-only true
                              :value 50}
                      :name {:type :text
                             :value "New value"}])
;; Key :name already exists at index 0 (duplicate key at 4)
1 Like

It’s worth considering if you’re using the wrong data structure, rather than trying to optimize the update process of a possibly sub-optimal representation. (Transformation of the existing data structure back and forth to another representation can be a “tell”.) Designing the data structure around your desired interactions with it (instead of picking the data structure then designing its programmatic interface after the fact) often makes hairy update problems like this melt away.

Maybe an ordered map could help? https://github.com/clj-commons/ordered#maps

Or, if the problem is better identifying the element to change, is there a reason not to add a unique id key to each element, and use that instead of possibly-duplicate keys?

Lastly, to contradict my own advice above, libs like specter can make it straightforward to update a particular thing in a heterogenous nested structure like you describe. But I often find that I prefer solving such problems by simplifying the data structure :slight_smile:

2 Likes

Ah – constructor function that serves solely to check for issues. Not bad! Thanks!

Your “worth considering” is excellent advice for evaluation all the time. Indeed, when I first started the library I tried ordered maps The problem is, ordered maps are second-class citizens in Clojure and have lots of issues with normal functions converting them back in to unordered maps. Since I am moving toward a solution that further involves storing them in a database and retrieving them there, using ordered maps would just be asking for problems.

The idea of using an id key is a good one. But the idea of a path is to be able to use it in a function that operates like (get-in structure [:meaningful1 :meaningful2]), and the solution is non-obvious for a function that uses something other than the primary keyword for that purpose. I guess you would have to manually implement a lookup table that links ids to the map keywords, in which case you are running perilously close to bad complexity…

specter, though, looks like it might just be what I need. I haven’t looked at it with this project in mind, and had forgotten how promising it is. Thanks!

1 Like

Is there any reason you’re not wrapping the thing in a fn?

(fn [v]
  [:name {:type :text
	      :placeholder "Name here"}
   :price {:type :number
	       :read-only true
	       :value (or v 50)}])

Is it part of a larger, default data structure that you want to keep static?

Oh, yes. The full forms may include dozens of id-map pairs. Also, with the goal of eventually storing the structure in a DB, wrapping it in a function would complicate that

I’d try to use functions that are already generic transducers in a thread-pipeline, so that you can opt in to more performance later if you want.

(def data
  [:name {:type :text
          :placeholder "Name here"}
   :price {:type :number
           :read-only true
           :value 50}])

(defn set-val [[current-field field-map] field value]
  (if-not (= field current-field)
    [current-field field-map]
    [current-field (assoc field-map :value value)]))

(defn set-value [data field value]
  (->> data
       (partition 2)
       (mapcat #(set-val % field value))
       vec))

(set-value data :price 60) ;=> [:name {:type :text :placeholder "Name here"} :price {:type :number :read-only true :value 60}]

Then you could replace ->> with injest’s x>> if you want it to test out the transducer performance.

1 Like

Oh, I misread the requested behavior. If a user can pass in the key-value too, like :value, you could parameterize that bit too:

(defn set-val [[current-field field-map] field k value]
  (if-not (= field current-field)
    [current-field field-map]
    [current-field (assoc field-map k value)]))

(defn set-value [data field k value]
  (->> data
       (partition 2)
       (mapcat #(set-val % field k value))
       vec))

(set-value data :price :value 60) ;=> [:name {:type :text :placeholder "Name here"} :price {:type :number :read-only true :value 60}]

I’d imagine you’d want to allow more than one instance of a field in the form though, so I’m not doing validations on uniqueness above. It’d probably be better to pass in an update-function too, instead of a direct value, that takes the old value and optionally uses that to provide the new value.

You can always do (constantly 'some-value) if you really want to coerce it to something specific, but if you want to use the old value, you don’t have to do two passes over the datastructure. Like:

(defn update-field [[current-field field-map] field k update-fn]
  (if-not (= field current-field)
    [current-field field-map]
    [current-field (update field-map k update-fn)]))

(defn update-fields [data field k update-fn]
  (->> data
       (partition 2)
       (mapcat #(update-field % field k update-fn))
       vec))

(update-fields data :price :value #(+ 10 %)) ;=> [:name {:type :text :placeholder "Name here"} :price {:type :number :read-only true :value 60}]

1 Like

If you really want to assert unique fields though:

(defn assert-unique-fields [rows row]
  (let [fields (set (map first rows))
        field (first row)]
    (assert (not (contains? fields field))
            "Fields must be unique")
    (conj rows row)))

(defn update-fields [data field k update-fn]
  (->> data
       (partition 2)
       (reduce assert-unique-fields [])
       (mapcat #(update-field % field k update-fn))
       vec))

(-> [:name {:type :text} :name {:type :jpeg}]
    (update-fields :name :type (constantly 60))) 
;; Assert failed: Fields must be unique
; (not (contains? fields field))

And then for a faster, more eager version, you can sequence all three with x>> and cgrand.xforms:

(defn update-fields [data field k update-fn]
  (x>> data
       (partition 2)
       (x/reduce assert-unique-fields [])
       (mapcat #(update-field % field k update-fn))
       vec))
1 Like