The #3 most important idea in Computer Science (from Eric Normand's YouTube series)

So I’ve really been enjoying @ericnormand’s ruminations on programming on YouTube lately. Most of his topics are good food for thought and would make for some good conversation topics.

In this video, Eric argues that the “trie” is the third most important idea in computer science. I’m not sure how to go about ranking ideas, but this one does seem pretty fundamental.

Basically, the idea is that branching gives us logarithmic depth: the number of steps to reach any item grows only logarithmically with the size of the structure, so the larger the system becomes, the more efficient each lookup becomes relative to its size. Just watch the video…
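To make the branching idea concrete, here's a minimal trie sketch (my own toy Python, not from the video). Lookup cost depends only on the key length, not on how many keys are stored, which is where the efficiency-at-scale comes from:

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # one branch per character
        self.terminal = False  # marks the end of a stored word

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.terminal = True

    def contains(self, word):
        node = self.root
        for ch in word:
            if ch not in node.children:
                return False
            node = node.children[ch]
        return node.terminal
```

Inserting a million words doesn't slow down `contains("trie")` at all: it still walks exactly four nodes.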

So I’ve been researching some related topics recently and I thought there was an interesting connection with allometry and scaling.

There’s this idea of “metabolic scaling,” and a lot of papers point to fractal, tree-like distribution networks as the explanation for the sub-linear scaling of metabolism with body size.

This paper even tackles this question of comparing metabolic growth in biology and microchips directly: Energy and time determine scaling in biological and computer designs

And I guess that leads to another question of whether there is any correspondence between the polynomial hierarchy and the biology of organisms.

What are y’all’s thoughts? Do you notice similarities between scaling in algorithms and biology? Would you put tries in your top 3 most important ideas in computer science?

I think the grain of my answer would be much bigger. I might say abstraction, data structures, and computation are the three biggest ideas.

Where abstraction is the idea that you can express more with less, or X with Y. All programming languages are abstractions over machine code, for example.

Data structure is the idea that the same data can be organised in many ways, and that some ways are more conducive to certain kinds of processing.
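A quick illustration of that point (a made-up example, not from the thread): the same records organised as a list versus a dict, where one organisation makes lookup by name much cheaper.

```python
# The same data, organised two ways.
records = [("ada", 36), ("grace", 85), ("alan", 41)]

# As a list: preserves insertion order, but lookup is a linear scan, O(n).
def age_from_list(name):
    for n, a in records:
        if n == name:
            return a
    return None

# As a dict: identical data, reorganised for O(1) lookup by name.
age_by_name = dict(records)

assert age_from_list("grace") == age_by_name["grace"] == 85
```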

And computation is the idea that you can create systematic, generic mechanisms which allow data to be transformed in potentially limitless ways, letting you build machines that can process data. This is really the most fundamental one; the Turing model of computation and the lambda calculus are examples of it. Algorithms would be a subset of this category.
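As a small taste of the lambda calculus point, here are Church numerals sketched in Python (a standard encoding, my own transcription): numbers and addition built from nothing but functions.

```python
# Church numerals: a number n is "apply f n times".
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))
add  = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))

def to_int(n):
    # Decode a Church numeral by counting applications.
    return n(lambda k: k + 1)(0)

one = succ(zero)
two = succ(one)
```

Pure function application is enough to do arithmetic, which is the whole "generic mechanism for transforming data" idea in miniature.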

I would also add side effects as the 4th most important. The ability to attach side effects to data and data changes is what makes the whole thing practical, allowing most programs to be active participants in our lives instead of simply passive calculators.
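One way to picture the split (an invented example, with names of my own choosing): keep the calculation pure, and attach the side effect in a thin layer on top.

```python
# Calculation: pure, data in, data out. Easy to test and reason about.
def overdue(invoices, today):
    return [inv for inv in invoices if inv["due"] < today]

# Action: attaches a side effect (here, sending a message) to that data.
# The effect is injected so the calculation stays reusable.
def send_reminders(invoices, today, send=print):
    for inv in overdue(invoices, today):
        send(f"Reminder: invoice {inv['id']} is overdue")
```

The program only becomes an "active participant" through `send`; everything else is a passive calculator.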

Apparently some bioinformaticists like to organize things similarly. They do sound familiar as basic principles.

I wonder how computation and abstraction are different, though, specifically?

Is a computation an abstraction? Is an abstraction a computation?

hi @John_Newman

I like the idea of using a scaling factor to evaluate software. If scaling laws are followed, it seems like software should get easier to maintain and extend as it grows, just like animals get more efficient as they get bigger. That is, each new feature is easier to write than the last. Sadly, this is usually not the case.

it’s a fun thought experiment: what would have to be the case for a piece of software to scale like that?



Actually, this is already the case, maybe not in the scope of a single project. Take programming languages, for instance: they enable us to add features more easily because their syntax is more succinct than assembler’s.
Databases give every developer a usable persistent backend.
I think the answer for a single piece of software is the same: abstract it and add features on top by using it as a library, instead of stuffing everything together.


@ericnormand Yeah, I’m not sure. It seems we’d have to differentiate between scaling relative to a particular architecture and problem context, and scaling with respect to generic polynomial difficulty.

The physics of this universe will probably imply that certain mechanisms are more efficient than others. Likewise, CPUs, GPUs, and FPGAs (discussed here (PDF)) each process different algorithms at different levels of efficiency, regardless of the algorithm’s asymptotic complexity. Amdahl’s Law is also a scaling law that applies here.
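For reference, Amdahl's Law is just one formula, sketched here in Python (function name is mine): the serial fraction of a workload caps the speedup no matter how many workers you add.

```python
def amdahl_speedup(parallel_fraction, n_workers):
    """Amdahl's Law: overall speedup when a fraction p of the work
    is spread across n workers and the rest (1 - p) stays serial."""
    p = parallel_fraction
    return 1.0 / ((1.0 - p) + p / n_workers)
```

Even with a million workers, a 95%-parallel workload never exceeds a 20x speedup, because the 5% serial part dominates.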

There are some papers (PDF) out there about power laws and ‘scale-free’ architectures in software, which claim that “distributions with long, fat tails in software are much more pervasive than previously established.”

It would also appear as though the naming of things in software development tends to create Zipf distributions, forming a sort of symbolic trie. Some blogs and papers explore Zipfian distributions in software but I’d probably just chalk up the phenomenon to the trie efficiency you pointed out.
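The Zipf observation is easy to poke at yourself. A rough sketch (entirely my own, toy-sized code snippet as input): tokenise identifiers out of source text and rank them by frequency; under Zipf's law, frequency falls off roughly as 1/rank.

```python
import re
from collections import Counter

source = """
def process(data):
    result = []
    for item in data:
        if item:
            result.append(item)
    return result
"""

# Pull out identifier-like tokens and rank them by frequency.
names = re.findall(r"[A-Za-z_]\w*", source)
ranked = Counter(names).most_common()
```

On a real codebase you'd tokenise properly and fit the rank–frequency curve on a log–log plot, but even this crude version shows a few names dominating a long tail.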

I’d bet all kinds of scaling laws could be correlated across the various garbage collection schemes and memory management strategies in various languages. But then you could also imagine software designed with no notion of “garbage” at all. Still, in an architecture-independent, context-independent way, can we say certain things that are true at all scales? I’m not sure.

What about “actions” vs “calculations” you’ve been discussing in other videos? I’ve been wondering if, for any given calculation, there is some requisite percentage of underlying action. In other words, for any given intended effect, there will be some necessary amount of unintended side effect (relative to the primary purpose of the calculation), such as time going by, heat building up, missiles accidentally launching, etc. And perhaps there is some minimum amount of ontological baggage for any given teleological action, which might be quantifiable via analysis of available software.

