Extract text content from Hickory data using Specter

I have the following data structure:

    (def root
      {:tag :a,
       :content
       [{:tag :b,
         :content
         [{:tag :c, :content [1 2 "da" {:tag :d, :content ["hello" "c"]}]}]}]})

How can I select or extract the text from the above data structure? I tried something like the following, but it didn't work.

    (select [??SOME-SELECTOR  ;; Couldn't figure out what to put here
           :content
           ALL
           string?]
          root)

Hello!

Welcome to Clojureverse!

A reason you haven’t got any replies yet might be that you haven’t described what you expect to get out of your call. Should your call return just a list of the strings like ["da" "hello" "c"]? Or do you want to keep any other information?


From a quick review, I couldn’t find any way to do “arbitrary depth” traversals with Specter. Here’s a way to get all strings recursively from a map using the standard library:

    (require '[clojure.walk])

    (def root
      {:tag :a,
       :content
       [{:tag :b,
         :content
         [{:tag :c, :content [1 2 "da" {:tag :d,
                                        :content ["hello" "c"]}]}]}]})

    (defn filter-recursive [pred coll]
      (let [matches (atom [])]
        (clojure.walk/postwalk (fn [el]
                                 (when (pred el)
                                   (swap! matches conj el))
                                 el)
                               coll)
        @matches))

    (filter-recursive string? root)
    ;; => ["da" "hello" "c"]

    ;; Or try to handle just the "valid" nodes
    (->> root
         (filter-recursive (fn [m]
                             (and (map? m)
                                  (contains? m :content))))
         (map :content)
         (mapcat #(filter string? %)))
    ;; => ("hello" "c" "da")
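If you'd rather avoid the atom, here is a minimal sketch of the same idea with `tree-seq` from `clojure.core`. The name `all-strings` is just something I made up for illustration. Note that it collects every string anywhere in the structure, so with real Hickory data it would presumably also pick up strings stored outside `:content` (attribute values, for example).

```clojure
(def root
  {:tag :a,
   :content
   [{:tag :b,
     :content
     [{:tag :c, :content [1 2 "da" {:tag :d,
                                    :content ["hello" "c"]}]}]}]})

;; Hypothetical helper: walk every nested collection depth-first
;; and keep only the strings.
(defn all-strings [data]
  (filter string? (tree-seq coll? seq data)))

(all-strings root)
;; => ("da" "hello" "c")
```

`tree-seq` visits parents before children, so the strings come out in document order, the same order as the first `filter-recursive` call.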

Does this work for you? It doesn’t use Specter, but personally, I wouldn’t pull in a library for this. Feel free to ask if you have any questions.

Teodor

Thanks! Yes, I only need the strings in the :content vector.

I finally got the following to work using Specter:

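    (require '[com.rpl.specter :refer [select recursive-path cond-path ALL STAY]])

    ;; Recursively navigate to every string reachable through :content.
    (def STR-VAL
      (recursive-path [] p
        (cond-path
          vector? [ALL p]
          map?    [:content ALL p]
          string? STAY)))

    (select STR-VAL root)
    ;; => ["da" "hello" "c"]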

Glad you got it working! It seems recursive-path was the magic sauce. For anyone interested in more, the "Using Specter Recursively" page in the Specter wiki seems to cover our use case.
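For readers who want to see what the recursive path does, here is a rough plain-Clojure sketch of the same recursion. It is not how Specter implements `recursive-path`; the function name `content-strings` is made up, and the three branches simply mirror the three `cond-path` clauses above.

```clojure
(def root
  {:tag :a,
   :content
   [{:tag :b,
     :content
     [{:tag :c, :content [1 2 "da" {:tag :d,
                                    :content ["hello" "c"]}]}]}]})

;; Hypothetical helper mirroring the cond-path clauses:
;; strings are kept, maps are followed through :content only,
;; vectors are followed element by element, anything else is dropped.
(defn content-strings [node]
  (cond
    (string? node) [node]
    (map? node)    (mapcat content-strings (:content node))
    (vector? node) (mapcat content-strings node)
    :else          []))

(content-strings root)
;; => ("da" "hello" "c")
```

Unlike a walk over the whole structure, this only descends through `:content`, so strings stored under other keys are ignored.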
