OK, I’m not totally following your use case, and I don’t really understand your example either, because I don’t think I know what language that is.
But if I understood correctly, you have a bunch of HTML strings and you want to loop over each one. For each, you’d parse its content for some keywords, call getLinks with the list of keywords from that HTML, and get back the things they link to. You’d then replace each keyword in the HTML with an anchor to the returned link. Is that it?
And so, you first need to parse the HTML to find the strings to call getLinks with, but getLinks is an IO call to some other service, so you’re wondering how to push that impure logic to the edge?
Well, first of all, I’d like you to think about the pros and cons of doing that. You mentioned Haskell, but without going over the details, Haskell basically says that:
(defn replace-with-links
  [html links-getter]
  (->> (get-strings-from-html html)
       (links-getter)
       (replace-anchors html)))
is a pure function, because if the function you pass in as links-getter is pure, then the above is pure as well. And you can imagine Haskell saying that links-getter will be pure unless you run this in prod, in which case links-getter will be the real impure IO call to getLinks.
And so this is true in Clojure as well. The above function is pure in a lot of scenarios, maybe the ones that matter, like your unit tests and whatnot.
I like to think Clojure is pragmatic like that, and so this could be totally fine.
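To make that a bit more concrete, here’s a rough sketch of what calling it with a pure stand-in could look like in a test. The bodies of get-strings-from-html and replace-anchors are made up for illustration (I’m pretending keywords are marked as [[keyword]] in the HTML and that links come back as a map of keyword to URL), and fake-links-getter is hypothetical too, so treat it as a sketch under those assumptions and not as your actual implementation:

```clojure
(require '[clojure.string :as str])

;; Assumption: keywords appear in the HTML as [[keyword]].
(defn get-strings-from-html
  [html]
  (mapv second (re-seq #"\[\[(.+?)\]\]" html)))

;; Assumption: links is a map of keyword -> URL.
(defn replace-anchors
  [html links]
  (reduce-kv (fn [acc kw url]
               (str/replace acc
                            (str "[[" kw "]]")
                            (str "<a href=\"" url "\">" kw "</a>")))
             html
             links))

(defn replace-with-links
  [html links-getter]
  (->> (get-strings-from-html html)
       (links-getter)
       (replace-anchors html)))

;; Hypothetical pure stand-in for the real IO call:
;; no network, same output for the same input.
(defn fake-links-getter
  [kws]
  (into {} (map (fn [kw] [kw (str "https://example.test/" kw)])) kws))

(replace-with-links "<p>See [[clojure]].</p>" fake-links-getter)
;; => "<p>See <a href=\"https://example.test/clojure\">clojure</a>.</p>"
```

With the stand-in passed in, the whole call is deterministic, which is the sense in which the function is "pure where it matters".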
But alternatively, you can break things down even more. You only need to push the impure things to the edge, not get rid of them from your code. So it could be:
(defn main ; (or some API entry point)
  [htmls]
  (for [html htmls]
    (->> (get-strings-from-html html)
         (getLinks)
         (replace-anchors html))))
Now you’ve pushed it to the edge. main and getLinks are the only impure functions. main is your workflow definition: the series of steps needed to fulfill your program (or request, if it’s an API, or command, if it’s a user input event). It sits at the very edge doing all the IO, and getLinks is the IO itself. The other two functions are pure.
Now I’ll say that sometimes that main function could get pretty damn long, since a complicated use case might have a lot of steps. But you only need to break it down into a pure-chunk -> impure-chunk -> pure-chunk -> impure-chunk kind of thing. So it’s not like it needs to list every single small step, just be split into sections where you can do everything pure until you need something impure. So it’s manageable most of the time. And for the rare case where it gets unruly, well, I’d be pragmatic about it and break the rule of having it all at the very edge, and allow the IO to be as close to the edge as I can get it, but maybe not at the very edge.
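As a rough illustration of that alternation, here’s a sketch where the top-level function only strings together pure and impure chunks. Everything in it is hypothetical: extract-keywords, fetch-links!, render, save-page! and handle-page! are names I made up, the [[keyword]] markup is an assumption, fetch-links! just fakes the service call, and the "store" is a plain atom standing in for whatever you’d really write to:

```clojure
(require '[clojure.string :as str])

;; Pure chunk: everything you can work out before needing IO.
(defn extract-keywords
  [html]
  (->> (re-seq #"\[\[(.+?)\]\]" html)
       (map second)
       (distinct)
       (vec)))

;; Impure chunk: hypothetical stand-in for the real service call.
(defn fetch-links!
  [kws]
  (println "fetching links for" kws)
  (zipmap kws (map #(str "https://example.test/" %) kws)))

;; Pure chunk: turn the fetched data into the final HTML.
(defn render
  [html links]
  (reduce-kv (fn [acc kw url]
               (str/replace acc
                            (str "[[" kw "]]")
                            (str "<a href=\"" url "\">" kw "</a>")))
             html
             links))

;; Impure chunk: hypothetical persistence, here just an atom.
(defn save-page!
  [store id page]
  (swap! store assoc id page))

;; The edge: only sequences the chunks, no logic of its own.
(defn handle-page!
  [store id html]
  (let [kws   (extract-keywords html)   ; pure
        links (fetch-links! kws)        ; impure
        page  (render html links)]      ; pure
    (save-page! store id page)          ; impure
    page))
```

The pure chunks can be as big as you like and tested on their own. The edge function only grows by one line per switch between pure and impure.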