I’m having trouble searching XML purely on the client side. I’ve been vacillating between the browser’s built-in DOM functionality, trying to leverage Closure, and trying to leverage clojure.data.xml. I can fetch and read the XML in each of these ways, but I’m struggling to search it. In my example, I want to find every <title> element and get its text as a string. Even this seems difficult, though. Here’s what I’ve scratched up so far, with limited success:
;; this is all cljs
;; with clojure.data.xml, but is non-trivially nested without search capabilities (css/hiccup style would be best, or at least xpath)
(let [x (xml/parse-str "<title>Tech.ToryAnderson.com</title>")]
  (-> x :content) ; ("Tech.ToryAnderson.com")
  #_(js/console.log x))
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; With raw javascript
(let [s "<title>Tech.ToryAnderson.com</title>"
      p (js/DOMParser.)
      doc (.parseFromString p s "text/xml")]
  (-> (.getElementsByTagName doc "title")
      ;vec ; Can't get to a place to use cljs (map).
      ;    ;; repl/invoke error Error: [object HTMLCollection] is not ISeqable
      #_((aget 0)
         .-innerHTML) ; "Tech.ToryAnderson.com" ;; works for just one
      ))
;; but how to do this for a large collection with nested data?
You can also use zippers. Here’s an example from Stack Overflow. You probably don’t need io/reader, and instead of x/parse you might have to use x/parse-str. Caveat: I haven’t tried this out myself yet.
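For reference, a minimal sketch of the zipper approach. This is not the Stack Overflow code; it uses only clojure.zip plus xml/parse-str, and assumes the parsed elements expose :tag and :content with un-namespaced tags as plain keywords. The namespace and the function name `titles` are made up for illustration.

```clojure
(ns example.zip-search
  (:require [clojure.data.xml :as xml]
            [clojure.zip :as zip]))

(defn titles
  "Walks the whole document depth-first with a zipper and returns the
  text of every <title> element, in document order."
  [xml-string]
  (->> (xml/parse-str xml-string)
       zip/xml-zip                          ; zipper over :content children
       (iterate zip/next)                   ; depth-first walk
       (take-while (complement zip/end?))   ; stop once the walk is finished
       (map zip/node)
       (filter #(= :title (:tag %)))        ; text nodes are strings, so :tag is nil
       (map #(apply str (:content %)))))    ; assumes <title> holds only text

(titles "<rss><channel><title>Feed</title><item><title>A</title></item></channel></rss>")
;; => ("Feed" "A")
```

Since each step is a zipper location, you could swap `(map zip/node)` for logic that calls zip/up to inspect the parent before deciding whether to keep a node.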
In ClojureScript, clojure.data.xml uses the browser’s DOMParser.
If the run-time cost of conversion to Clojure data is OK, then zippers are the Cadillac of next steps. Traversing the zipper, you can “see” up and down and all directions from the current node, which can be convenient. At the other extreme is the standard library’s best-kept secret: xml-seq! Demonstrated here on an “RSS” feed which has its own title, in addition to items with titles. We select only the items’ titles:
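A minimal sketch of that idea, under some assumptions: the feed string is invented, tags come back as plain keywords, and the namespace and `item-titles` name are hypothetical. In Clojure, xml-seq is defined as a tree-seq over :content, so the sketch spells that out with tree-seq in case xml-seq is not available on your host.

```clojure
(ns example.seq-search
  (:require [clojure.data.xml :as xml]))

(defn element-seq
  "Same shape as clojure.core/xml-seq: every node of the tree, depth-first.
  Strings are leaves; anything else is descended into through :content."
  [root]
  (tree-seq (complement string?) (comp seq :content) root))

(defn item-titles
  "Text of each <title> that is a direct child of an <item>.
  The feed's own <title> sits under <channel>, so it is skipped."
  [xml-string]
  (for [node  (element-seq (xml/parse-str xml-string))
        :when (= :item (:tag node))
        child (:content node)
        :when (= :title (:tag child))]
    (apply str (:content child))))

(item-titles
 (str "<rss><channel><title>Feed</title>"
      "<item><title>First</title></item>"
      "<item><title>Second</title></item>"
      "</channel></rss>"))
;; => ("First" "Second")
```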
+1 for showing me xml-seq. This works as a Clojure-native searching method, but it still lacks the advanced querying of something like XPath; for example, it’s non-trivial to express a query like “all title nodes that are under doc.type=movie”. Or maybe I just need to embrace a more Clojure way of thinking here.
Working in ClojureScript, in a browser, there’s no dishonor in using the browser’s built-in XPath. In “pure Clojure”, Enlive accomplished something more flexible than XPath with zippers, but Enlive’s notation will seem abstruse unless it’s obvious that XPath would have been harder. (The zippery part of Enlive is here: https://github.com/cgrand/enlive/blob/master/src/net/cgrand/enlive_html.clj)
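A rough sketch of the browser route, using the standard DOM calls Document.evaluate and XPathResult. It only runs where js/DOMParser exists, so a browser rather than plain Node. The sample XML and the function name `xpath-texts` are made up, and I'm assuming “doc.type=movie” means a <doc> element with a type="movie" attribute.

```clojure
(ns example.xpath-search)

(defn xpath-texts
  "Parses xml-string with the browser's DOMParser, runs the XPath
  expression against it, and returns a vector of the matching nodes'
  text content, in document order."
  [xml-string xpath]
  (let [doc    (.parseFromString (js/DOMParser.) xml-string "text/xml")
        result (.evaluate doc xpath doc nil
                          (.-ORDERED_NODE_SNAPSHOT_TYPE js/XPathResult)
                          nil)]
    ;; A snapshot result is not seqable, so index into it by position.
    (mapv #(.-textContent (.snapshotItem result %))
          (range (.-snapshotLength result)))))

(xpath-texts
 (str "<docs>"
      "<doc type=\"movie\"><title>Alien</title></doc>"
      "<doc type=\"book\"><title>Dune</title></doc>"
      "</docs>")
 "//doc[@type='movie']//title")
;; => ["Alien"]
```

Indexing by position is also a way around the “HTMLCollection is not ISeqable” error from the original question: an HTMLCollection has .-length and .item, so the same range-and-index pattern applies to the result of getElementsByTagName.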