Help on designing: A logistics network optimizer in Clojure

eerjree · May 5, 2019, 9:08am

I’ve been learning Clojure for several weeks, and feel confident to write a (toy) project that I have written in C++/Python before – a logistics network optimizer. I am new to functional programming and to Clojure,

Why this? I know about the subject.
Why Clojure – I feel network algorithms will “sing” in a language like Clojure. Plus seems like a fun thing to try.

Please let me know if this is not the right forum for such general advise.

I need help to unlearn the OO patterns that I’ve used in the past and write it in FP/Clojure.

A logistics network consists of entities:

"Product"s residing in "Facility"s. Products and Facilities both have lots of internal structure (size, weight etc. for Product) and (location, capacity etc.) for Facility.
“Products” also “Move” from one “Facility” to another “Facility” via transportation processes (Air/Truck etc).
The network consists of the tuples of (Product, Facility) connected to other tuples via transportation links.

In C++/Python, I would make all of the above (Product, Facility etc.) as classes. I would create and maintain state within the classes, and write a bunch of get/set methods.

The network itself would be maintained in a global singleton “Network” class with a vector of [(from,to)] tuples. The problem with this approach is maintaining a global network state and solving it is gets very complex very quickly.

I feel tempted to create a similar design in Clojure using records. Individual record types for Products, Facilities etc. Global network would be a vector. I feel that this is too close to the OO paradigm, so I’m not there yet.

Hence the call for help. Any design articles/discussions would be welcome.

Thanks!

l3nz · May 5, 2019, 3:12pm

I would go for the easiest thing that works. That’s basically a lot of maps. Maybe something like

{ :facilities {:newyork {:stores {:beans {:number 5 :weight 7}, :rice {:number 2 :weight 1}}},
                  :zuerich {:stores {:beans {:number 1 :weight 7}}}
:links [{:from :zuerich :to :newyork :mode :airplane :cost  10}]}

You go for the easiest thing that works, think about it as just data. Then you have a function that, out of this structure, builds action candidates, and another one that applies them and generates a new state.

While you get started and design the data structure, don’t forget to spec it, and use something like Orchestra https://github.com/jeaye/orchestra to instrument your runtime.

What I found hard coming from Java was: how do I know the type of an object? how do I know what to do? short answer: you don’t and you don’t care. Make it simple, and write short functions. You will notice that, compared to most other languages, you don’t “type” much in Clojure - most of the time you will be reflecting, and scratching your head.

And yes, this is the right place to ask questions. You’ll notice that in general the Clojure community is a very friendly and polite space.

linpengcheng · May 6, 2019, 5:23am

Don’t be confused by the Acrobats of Lisp and FP.

As long as you use clojure as RMDB, you can solve it simply, perfectly and reliably. Express (each) container as a table(ref hash-map) and move Product as a transaction(STM).

see:

the Pure function Pipeline Dataflow

Implement relational data model and programming based on hash-map (NoSQL)

didibus · May 6, 2019, 6:35am

In general, FP models things as mappings from input to output, in a sort of dataflow style. It’s more important in the small. So you want functions that only depend on their arguments, and that do not modify anything outside of them. All they do is read from their input arguments, and generate the appropriate output return from them.

This is where most of the difference will be as opposed to the traditional procedural code, like you’d have in Java or C++/Python, where methods almost always manipulate external state such as object fields, global vars, static fields, or even their input arguments pointed value (they mutate their input args).

So when you get to the small design choices, you’ll need to try really hard to follow this FP style, and write most of your computations as taking the current state as input and returning a modified version as output.

Now, in the medium, things don’t look very different then in OO. What happens is that, the part of the code which updates the global state is no longer inside the functions that transform it. Instead, it is inside the calling function. Ideally, the calling function is an orchestrator of state changes and exist at the top most level, so it’s the boundary of the app.

Imagine that top layer as only being allowed to move data around. It can’t actually do anything else. It can extract data from IO or globals, pass it back and forth to other functions and it can write it back to IO or globals. All data transformation happen in your pure FP functions.

That leaves the choice of where and how to store the global state up to you. You could use a real DB, SQL or NoSql, like say DynamoDB, MongoDB, MySQL or PostgresSQL. Or you could use an in-memory one like SQLite, H2, or DataScript. Or you can go with simpler things, like plain old Clojure Maps, Vectors, Sets stored inside mutable references in global Vars such as Atom or Refs. In any case, how you model the data is up to you. For example, @linpengcheng posts shows a way to model data in a relational way using Clojure maps. More conventionally, you can model it hierarchically as well, and use Spec on top to define the data model.

Records in Clojure are normally used for polymorphic type dispatch or really performant field access. If you don’t need any of those, use normal Maps instead with a Spec for defining their schema. It’s simpler and more flexible.

Now, there’s one other way to manage the state, it’s a lot harder, but more functional. That is through recursion instead of mutation. This allows you to make the orchestrator pure as well (minus any sort of IO required). Basically, the orchestrator would take in the state, which it extracts things from, pass it back and forth to the pure FP functions in the order you need it, but it doesn’t update the state, instead it reconstructs a new one, and when it is done, it calls itself or the next orchestrator and passes in that new state. In Clojure, you’d need to use Loop/Recur and Trampoline not to blow up the stack and get TCO.

A simple model of this would be to have a top level recursive event loop. It waits for commands from IO, and based on the command, it calls the appropriate orchestrator for it, passing into it the state. The orchestrator when done returns it the new version of the state, and the event loop recurs, calling itself back with the new state, and then waits again on IO for the next command, rinse and repeat.

eerjree · May 6, 2019, 1:31pm

Many thanks for your inputs. I’d like to check if you have any pointers to using clojure.spec as a way to define domain entities? I’ve read some tutorials, but mostly the basic stuff.

I worry that using bare maps will result in deeply nested data structures that are as difficult to understand and maintain as much as deeply nested code.

eerjree · May 6, 2019, 1:44pm

What are the performance implications of Records vs. maps? I’ve read 2 things: (1) records are faster for field access but (2) there is no structural sharing with records. If I’m creating a large vector of records, will it be less performant than a vector of maps?

didibus · May 6, 2019, 5:27pm

I don’t think so. Records compile to an actual Java class. And Record instances are actual instances of the underlying class. This means Records are their own type, it’s creating an ADT. This means they have a distinct type and are thus useful for type polymorphism, such as with the use of Protocols.

Similarly, because their fields are Object fields, there’s no hashing or looping over any structure to write or read them. It’s a true O(1) read/write. Where’s map keys/vals are pseudo O(1).

l3nz · May 6, 2019, 7:32pm

Go for the easiest thing that works. Especially if you’re getting started, don’t make your life more complex than it should be. You can always experiment when it works

didibus · May 6, 2019, 9:34pm

I’ll add that if you really want to learn Clojure, try it without records first. That’s because you really want to learn to get used to duck typing. Records are a sort of nominal typing, structures have names, predefined and values have places on them, predefined, and access to fields can be statically bound. In general, for Clojure, it is more idiomatic to have things using late binding, and structured to be duck typed.

In essence, structure is defined by the things that use it. Spec allows you to still clearly document and specify the structures that are being used and passed around.

Also, try and use destructuring in your argument vector, that makes it clear what fields are used by a function. And remember to have good doc-strings as well to help clarify confusion.

Basically, you want to use records because you have a reason to over just using maps. And generally that is either interop, polymorphism or performance critical code. Which is something you’d know you had if you did. Doesn’t seem you need any of that for your use case.

jackrusher · May 7, 2019, 10:53am

My recommendations for your project given your purpose:

Avoid records, protocols or any other advanced features that might overlap with your OO experience. Do everything with vectors, maps, sets and functions.
Everything that would otherwise be an object should be a map where the keys are named for the fields of your old objects. Make functions that consume and return these maps.
Don’t spec anything until the general shape of your data and APIs have started to coalesce into something more permanent.

eerjree · May 7, 2019, 1:52pm

Thanks, this is sensible advise.
How do I deal with deeply nested maps? Is this a common design pattern, or am I doing something wrong - my design seems to have maps that contain maps that contain maps. (example: facilities map, where each individual facility has a map of properties, some properties are composites themselves like address).

In effect I want to find the Clojurian way to do object composition from OO.

jackrusher · May 7, 2019, 5:32pm

There’s definitely a place for nested maps, but it’s hard to be overly specific about what’s best for your situation with seeing your data model. A nice exercise would be to describe your data to us as if you were making a DB schema for it, answering questions about the cardinality and relations between the kinds of things you have. If you do that I’ll work through some possible ways of structuring the application with you.

didibus · May 7, 2019, 6:13pm

I think you’re getting confused between constructs and domain modeling.

What you want is to no longer use OOP like constructs such as classes and objects. That’s why we recommend not using records. Instead use data-structures such as Maps, Vectors, Sets, Lists, etc.

Now your domain model is orthogonal to the constructs you end up using. You need to ask, what data do I have, how does it relate to other pieces of data, etc. What’s the structure of the data in the domain basically. An ER diagram is good for that.

If your domain has a lot of relationships like say People have Jobs and LovedOnes. LovedOnes are other People. Now you have a graph. In that Jane can love Paul who loves Jordan who in turn loves Jane. This is regardless of any language or programming paradigm. That’s just your domain and how its data works.

Now you need to ask yourself, how are you going to encode this data in code?

One approach is hierarchical data modeling. Vectors or Maps of Vectors or Maps, etc. This can be tricky to deal with graphs, cause you have circular references, so a traversal could be infinite. Or you could model it as an actual graph, and use an Adjencency List. Or maybe you model it as tables and relations between them such as a SQL db would. The trick is to start thinking about data-structures, the full spectrum of them. Not about Classes and Objects.

Now, most of the time, Vectors or Maps of Vectors or Maps will be good enough. Just go for it. Clojure can really easily manipulate such structure. It’s rare to have graph like data, or even many to many relations. So this encoding will normally work pretty well for most use case, and is well supported by Clojure.

It’s also totally idiomatic in that case to have nested structures. Because you are modeling hierarchies, which are nested in the domain as well.

Basically, what I’m saying is that, in OO, generally things are modeled hierarchically. You have Classes with nested Classes in them. Hierarchical data modeling is good for domains to which it works well for, domains that have hierarchical data themselves. And Clojure doesn’t go against hierarchical data modeling. Using Clojure does not mean not using hierarchical data models when they would be most appropriate. It means implementing the hierarchical data model using dynamic persistent data-structures instead of Classes and Objects. Which you’ll see, Clojure actually makes using and working with hierarchical data way easier then OO.

The other thing Clojure does, is put the state in a central place at the top/boundaries of the app. This makes state more explicit. And similarly, it makes it easier to choose to use a different type of data model, like a relational one, or a graph based one. You can see DataScript as a good example. And this scales well, because then its trivial to move your state into a distributed store like a DB and make your DB the implementation for your data model.

system · November 6, 2019, 6:14am

This topic was automatically closed 182 days after the last reply. New replies are no longer allowed.