What is your favourite way to document your programs?

yanisurbis · March 15, 2021, 9:17am

What is your favourite way to document your programs? I’m not talking about API documentation, but more about architectural decisions which might look weird for newcomers. What tools do you use? Comments, Google docs, markdown files, separate namespaces with a very long docstring?

Thank you

abogoyavlensky · March 15, 2021, 9:44am

Hi! For documentation I mostly use docstrings and markdown files stored in the repo, or in a separate repo if the project has more than one. Or Atlassian Confluence (instead of markdown) for the same purposes and in the same way if the company has this tool. Documentation might contain the following sections: technical docs by service, product docs (business logic description and so on), common architectural docs (mostly diagrams and descriptions of service interaction), and a knowledge base (recipes and useful guides for the whole system). For small projects it could be docs without any sections.

For diagrams I really like to use Diagrams or (with slightly fewer features, but simpler to use) Mermaid.

I’d really like to hear about any other use cases and approaches to documenting projects, because it is an important thing.

joinr · March 15, 2021, 1:21pm

Doc strings, marginalia, codox. I’ve done stuff in org mode outside of the inlined code (comments are markdown friendly which marginalia can pick up and format nicely). There are some cool tools out there, though, with really nice docs; @generateme has some stuff leveraging metadoc. I think they also use notespace - I “think” the tablecloth docs (scicloj) are in it. Pretty sure the libraries are far greater these days for literate programming and api-generated docs and the like. Certainly put in doc strings and the repl-tooling can pick up a lot just from the dev environment.

generateme · March 15, 2021, 2:03pm

tablecloth docs are based on RMarkdown actually. I’ve tested the following:

marginalia - literate programming tool where comments from the namespace are converted to a documentation. Here is the result of using it.
metadoc, which allows combining code snippets with doc strings. Works as a codox writer. The examples part here is generated by metadoc.
RMarkdown combined with rep works pretty well. tablecloth docs are generated using this approach. Here + source
notespace, still in progress. Notebook in your editor. Generates static html out of namespace. result + source

BadChicken · March 16, 2021, 3:43am

For architecture decisions, I like to keep Lightweight Architectural Decision Records in markdown files in a docs/architecture/decisions directory in my git repository. A good example can be seen at govuk-aws/docs/architecture/decisions at master · alphagov/govuk-aws · GitHub

simongray · March 16, 2021, 8:01am

Some good suggestions here - did not know about metadoc.

Personally, I like to keep it simple, writing docstrings for most of my functions, even private ones, where I use Markdown backticks for specifying input parameters. I also write explanatory comments whenever I feel like the code is doing something for some reason that isn’t totally apparent.
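As a minimal sketch of that docstring style (the functions `apply-discount` and `total` are hypothetical, made up only for illustration), it could look something like this:

```clojure
;; Hypothetical functions, used only to illustrate the docstring style.

(defn- apply-discount
  "Return `price` reduced by `rate`.

  `price` is a non-negative number; `rate` is a fraction between 0 and 1."
  [price rate]
  (* price (- 1 rate)))

(defn total
  "Sum the discounted prices of `items`.

  `items` is a collection of maps with a `:price` key; `rate` is passed on
  to `apply-discount` unchanged."
  [items rate]
  (reduce + (map #(apply-discount (:price %) rate) items)))
```

The backticks cost nothing at the REPL, and tools that render docstrings as Markdown can show the parameter names in monospace. Note that the private helper gets a docstring too.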

To me, the best documentation is the intuitive one that you only feel, but can’t see. Stuff like using a consistent and logical naming pattern (for vars and so on). The hardest code to read (for me) is the one that is inconsistent. At my old work I spent many PRs simply refactoring code that had been modified by 5-6 different people over the years where the names of the same inputs kept changing from function to function, or were modified within the function using illogical and inconsistent, temporary names. The fact that some people don’t take names seriously in a dynamically typed language is a huge sin. Names are all we have.

I recommend reading Elements of Clojure - and re-reading your own code

Anthony_Leonard · March 17, 2021, 1:30am

I’ve always been a firm believer that code and APIs should express themselves with zero documentation, mainly through consistent naming (as called out already above) and descriptive tests. The rare times I write a docstring it feels like an admission that this code is weird and needs explanation, and most I read can be positively misleading and best ignored. (For internally produced code I mean … I do use and appreciate doc strings in nicely maintained open source libraries etc - thank you!)

For years I similarly discouraged writing out supporting documentation as it was rarely maintained well, and prone to heavyweight review processes… But I find myself writing more and more documents now - and am usually glad I did. For example, where APIs are used by other teams or even 3rd parties, some kind of API contract documentation is fairly unavoidable. We have gitlab and confluence for this - both of which have good support for Markdown and Mermaid.

So I’m beginning to conclude that documentation is particularly invaluable not for docstrings but for designs, whether used prospectively (to plan new work) or retrospectively (to look back on what was built and why), and have been going back over thoughts in this area by these usual suspects fairly recently:

+1 for lightweight architectural decision records, which are in the same spirit as the techniques described above in that “designs are decisions.” We use an opt-in “guild” style group to govern changes to ours. I’m actively trying out the “one picture” approach right now - even if it is retrospective - to try and build a visual map of how all our microservices fit together, with plenty of free form annotation and visual cues, using diagrams.net.

Finally some more on consistent naming - still the hardest problem in computer science. This extends beyond coding. As someone working in a difficult (government) domain where there are a lot of opaque and confusing business terms flying about I’ve started to insist that our business folks call out the terms they use and use them consistently … I’ve even taken to using italics when referencing some domain specific term or phrase, and deliberately use code-styled monospace when referencing someSimilarlyNamedButWierdlyCasedDomainSpecificAttribute in e.g. a JSON API, just to help us all figure out what’s actually being meant. I’m even becoming interested in building business glossaries and information models, which must be the ultimate in geeky documentation of software systems - though I must admit being quite worried about what I might be getting myself into there!

didibus · March 17, 2021, 1:35am

I came across this recently: https://documentation.divio.com/

Not exactly what you asked for, but related.

For architectural decisions, I like to add doc-strings on the namespace or use doc-strings/comments in code. But for the overall architecture, I like putting that in the README.md or have an ARCHITECTURE.md.
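A rough sketch of what a namespace-level note might look like. The namespace name, the design note itself, and the `ns-note` helper are all hypothetical and only there for illustration:

```clojure
(ns example.orders.queue
  "Hypothetical namespace showing an architectural note in a ns docstring.

  Design note: orders are buffered in an in-memory queue instead of being
  written straight to the database, so that bursts of traffic do not block
  request threads. See ARCHITECTURE.md for the wider picture.")

;; The note travels with the code as metadata on the namespace object.
(defn ns-note
  "Return the docstring of the loaded namespace named by the symbol `ns-sym`."
  [ns-sym]
  (:doc (meta (the-ns ns-sym))))
```

With the namespace loaded, `(ns-note 'example.orders.queue)` returns the note as a string, and `clojure.repl/doc` should print it as well when given the namespace name.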

At work, I just link in README.md to the work wiki and the architecture documentation goes there. That way it’s easier to have others update it over time with low friction.

seancorfield · March 19, 2021, 3:36am

What a fascinating topic!

I have a lot of sympathy for @Anthony_Leonard 's position on this, to be honest. I’ve been writing commercial software for close to forty years and, at this point, I feel like I’ve tried every single possible documentation option… Pretty much every written thing – including “literate programming” – seems to get out of date over time, and the other aspect of “docs in the code repo” that can be problematic is that it’s not practical for communicating with business folks which at least some of our documentation has to be concerned with.

The only thing that has seemed relatively effective, in my experience, is (high-level) diagramming. So much intent can be portrayed in pictures and done so in a way that is a lot less likely to get out of sync with the implementation.

My gripe is that across all these decades, I still haven’t found a good diagramming tool that a) produces nice diagrams and b) is portable across Mac/PC/Linux and c) is easy to use for programmers. Suggestions?

didibus · March 19, 2021, 5:59am

I’ve seen that too, but I still find out of date documentation much better than none. It gives you a good starting context and the code can complement the rest.

Though I guess I’m not really talking about documentation of the kind open source projects have or commercial software provides. I mean design and architecture artifacts, which often do also include diagrams, and operational run books.

For business folks, that’s why I like putting a link to a wiki. In the wiki there can be subpages (or attached Word docs and all that), some written for a business audience and others for a more tech savvy audience.

For diagramming I have the same problem as you, but I’ve been using draw.io for a while, and it’s the best I’ve found, even though I still don’t find it too great.

More recently (and with work from home), I’ve been considering getting this: https://www.wacom.com/en-us/products/pen-displays/wacom-one and just hand drawing all my diagrams.

mvarela · March 19, 2021, 7:57am

PlantUML ticks those boxes (though the “nice” aspect does require some effort), plus it’s just text, so you can embed it in comments / org-mode files, etc…

Here’s a scrubbed version of an actual diagram I made for some stuff I’m working on.

The code looks like:

@startuml Architecture
!include /path/to/C4-PlantUML/C4_Container.puml

skinparam backgroundColor #EEEBDC
left to right direction

' External systems that talk to ours
System_Ext(receiver, "Foo Service")
System_Ext(telemetry, "Bar Service")
System_Ext(omicron_api, "Quux API")
System_Ext(devel_dashboard, "Analytics Dashboard, TBD")

' Containers that make up the system
Container(kafka, "Backbone", "Kafka", "Substrate for streaming")
ContainerDb(crux, "Graph DB", "Crux", "Flexible querying")
ContainerDb(elastic, "ELK Stack", "Elasticsearch", "Analytics")
Container(api, "API", "Jetty/Reitit")
Container(elasticconnect, "Kafka Elasticsearch Connector", "Sink")

' Relationships: Rel(from, to, label, technology)
Rel(elasticconnect, kafka, "Takes data for ElasticSearch")
Rel(devel_dashboard, elastic, "Queries data for analytics")
Rel(elasticconnect, elastic, "Pushes data into ElasticSearch")
Rel(omicron_api, api, "Proxies queries", "JSON")
Rel(receiver, kafka, "Pushes foobars into", "JSON")
Rel(telemetry, kafka, "Pushes quux data into", "JSON")
Rel(kafka, crux, "Stores frobnicator data", "Documents + indices")
Rel(crux, kafka, "Uses as Tx and Doc storage", "Transactions + docs")
Rel(api, crux, "Queries frobnicator data", "JSON")

@enduml

I tend to use org-mode for documenting, and also for literate programming, but unfortunately I haven’t really made it work successfully with my Clojure workflow (RDD is more fluid when done directly from the source files, I find…). I have used it to very good effect with Python, Haskell, Julia, and R in the past, and have always received good feedback on my documentation. To be honest, though, I tend to do it that way since it helps me “rubber duck” my development process, especially when tackling new problems.

Edit: here is an example of literate Python in org-mode: battleship-python/battleship.org at master · mvarela/battleship-python · GitHub

joinr · March 19, 2021, 1:06pm

I stumbled onto yEd years ago. It has a java-based version that’s just a jar file, which works great for my purposes. Also other backends, as well as a cloud offering. The interface is great. It’s portable. Lots of really cool tricks you can do with diagramming and graph transformations. Many graph import/export options. I use it for ad hoc interactive diagrams, as well as analyzing stuff from graphs I’ve generated.

seancorfield · March 19, 2021, 5:19pm

Thanks for the pointers to PlantUML and yEd. Next time I’m thinking about a diagram, I have some new options to consider!

Re: literate programming – as an experiment at work, I built one of our services using Marginalia and it was fine while I was documenting design and usage but as the code actually got fleshed out, it began to completely overwhelm the documentation and made it pretty hard to read. And over time the “comments” (documentation) gradually got out of sync with the code because we didn’t always catch all of the implications of code changes in the large blocks of documentation around the code. Which is what I’ve always seen happen even with regular comments in code if they’re not just short, succinct, and right there next to the line they are documenting.

Hence my overall preference for minimal comments in code and docstrings that are “complete” but not too extensive since they also suffer from that over time if they describe too much of what is also “in” the code.

didibus · March 19, 2021, 6:59pm

Oh shoot, I used to use yEd, it’s great, best UX I found. I somehow forgot about it lol, I need to use it again.

Anthony_Leonard · March 20, 2021, 9:54pm

draw.io - now renamed diagrams.net - gets my vote. They have an open source javascript/electron based client for their web and desktop versions that works everywhere, with installers for Mac, Windows (including a no-installer version), and Linux for yum, deb etc. It’s very feature rich and touch screen friendly - I particularly like the sketch and comic effects. The desktop version is completely free to use and exports/imports many formats. I think their sell for the OSS client is for big companies to be sure their diagrams are not in the hands of a 3rd party, and to reduce the barrier to entry, which makes a lot of sense to me. They make money from the confluence plugin (and other similar integrations I’m assuming) which does have hefty corporate pricing - but you obviously don’t need the plugin to stick a PNG file of a diagram into your wiki page even though you built it from the desktop version of drawio.

joinr · March 22, 2021, 1:16pm

I forgot about dynadoc as demoed here. Very interesting concept.

schmudde · March 25, 2021, 11:24pm

Code is particularly bad at capturing two interrelated concepts when assembling a system:

Domain knowledge
Decisions

I find this context incredibly helpful. Any given function may express something about the domain, but won’t provide any context. Domain knowledge ages quite well, can be used throughout the organization, and requires little maintenance. It’s a must.

Describing why a certain approach was taken is clarifying for both the author and the reader. But it’s probably not going to work unless a team really commits to a literate programming approach. I’ve been experimenting with this using org-mode on a single microservice in our organization.

It has worked really well for devops. One document can explain the architecture next to the code that builds it. That’s the only document which is ever updated. The traditional .clj, .sh, etc… files are “tangled” (generated) from this one document. Even better - scripts can be run in the document itself.

For the software itself, it is more of a mixed bag. I’ve found docstrings to be good enough to explain what the function expects, what it returns, and any domain knowledge required to grok the i/o. I think there is room to capture high level design decisions… but as was referenced earlier, I’m not sure those will age so well. So far I haven’t asked my team to include them.

I can’t really share the above documents. The closest example I have handy is for my personal blog. It uses a heavily documented build file (using org-mode and Boot) because I always forget why/how I did what I did when I want to change things.

HolyJak · March 26, 2021, 9:59am

What has worked quite well for me for documenting non-trivial business-level rules is “living documentation” via Specification by Example, in particular using my clj-concordion. I.e. I have a tree of Markdown documents describing the domain and the important rules in its respective parts. These include examples, that are “instrumented” and thus bound to clojure.test namespaces. The document feeds data into test functions and displays the result compared to what was expected. The resulting HTML is also published to a place where business folks can access it. We use it regularly to discuss and clarify how the system behaves. However, IMHO, this is only suitable when you have complex business rules.
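To give a feel for the "examples bound to tests" idea without reproducing clj-concordion's actual API, here is a minimal sketch using plain `clojure.test`. The namespace, the `shipping-fee` function, and the business rule itself are all made up for illustration:

```clojure
(ns example.shipping-spec
  "Hypothetical example: a business rule written as a table of examples."
  (:require [clojure.test :refer [deftest are]]))

(defn shipping-fee
  "Return the shipping fee for an order `total`.

  The rule (free shipping from 50 upwards, otherwise a fee of 5) is invented
  purely for this sketch."
  [total]
  (if (>= total 50) 0 5))

(deftest shipping-fee-examples
  ;; Each row reads like a line in a specification: order total, expected fee.
  (are [total fee] (= fee (shipping-fee total))
    10  5
    49  5
    50  0
    120 0))
```

In the living-documentation approach the rows of such a table would live in the Markdown document that business folks read, rather than in the test source; the sketch only shows the example-driven shape of the tests.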

I also +1 to using Architecture Decision Records and text-based diagrams.

I have also found FC4, which looks interesting, as it aims to solve a number of problems with other solutions in the space:

a Docs as Code tool that helps software creators and documentarians author software architecture diagrams using the C4 model for visualising software architecture.

but haven’t tried it myself.

schmudde · March 26, 2021, 3:15pm

This sounds great. Why did you come to this conclusion? Because they aren’t useful documents for developers?

didibus · March 26, 2021, 4:29pm

Interesting, have you found this to work for apps without a frontend as well?

Like say there was:

“When a customer logs in, a welcome email is sent to them.”