Cool!
Just want to say though: you'd only care to limit this if you believe that the host running the code won't be able to vertically scale to some load and will thus brown out.
One example would be running out of memory from using too many threads, or from having too many IO buffers and connections open at once.
So the question is: can user behavior drive the load on your host to the point where it runs itself out of memory? If it can, that's a problem, and you'll want to set some bounds to prevent it. There are many places where you can bound things, though.
So in your case, if the posts to fetch are user-provided, that could be a problem. You can imagine a user making a single request asking to fetch, say, 10 billion posts. That would cause your code to create 10 billion threads, each opening an IO buffer and a connection, which could run your host out of memory.
Now, you could choose to bound the threads using a thread pool; that would similarly bound the IO buffers and connections, since you have one of each per thread. But then consider that keeping track of the 10-billion-item payload itself consumes memory: you're holding a vector of 10 billion strings. And since that's unbounded, if 10 billion isn't enough to OOM, the user can request 100 billion, or however many it takes, and still tip your host over.
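To make that concrete, here's a rough sketch in async Rust with the `futures` crate (the thread-pool idea translated to futures, since that's what you're working with). `fetch_post` and the limit of 32 are made up; note how the in-flight work is bounded but the input `Vec` itself is not:

```rust
use futures::stream::{self, StreamExt};

// Stand-in for whatever actually does the network IO.
async fn fetch_post(id: String) -> String {
    // ... open a connection, fetch the post ...
    id
}

async fn fetch_all(post_ids: Vec<String>) -> Vec<String> {
    // At most 32 fetches (and thus 32 connections/IO buffers) are in
    // flight at a time -- but `post_ids` is already fully in memory,
    // so this alone doesn't save you from a 10-billion-entry request.
    stream::iter(post_ids)
        .map(fetch_post)
        .buffer_unordered(32)
        .collect()
        .await
}
```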
So now you realize that what should probably be bounded in this case is how many posts a user can ask to be fetched in a single request. If you had a bound on that, you wouldn't need to bound your futures, since they're one-to-one with the number of user-requested posts.
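In code that's just a guard at the top of the handler. Names and the constant here are made up; the real number comes from load testing:

```rust
// Made-up limit: whatever your load tests say a host can take.
const MAX_POSTS_PER_REQUEST: usize = 100;

fn validate_request(post_ids: &[String]) -> Result<(), String> {
    if post_ids.len() > MAX_POSTS_PER_REQUEST {
        // Reject before spawning any work, so the only memory the
        // oversized request ever cost you is the payload itself
        // (which your web server's body-size limit should cap).
        return Err(format!(
            "requested {} posts, max is {}",
            post_ids.len(),
            MAX_POSTS_PER_REQUEST
        ));
    }
    Ok(())
}
```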
The second thing to look at is the total number of concurrent requests on your server. Even if you limit each request to fetching at most 1 post, a user could make 100 billion requests and you'd still OOM. So you probably also want a bound on the number of concurrent requests you handle.
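One way to do that (a sketch assuming tokio; the constant is again illustrative) is a semaphore acquired at the entry point. If you're behind something like tower/axum, their built-in concurrency-limit layer does the same job:

```rust
use std::sync::Arc;
use tokio::sync::Semaphore;

// Made-up cap; load tests tell you the real number for your host.
const MAX_CONCURRENT_REQUESTS: usize = 1024;

async fn handle(post_ids: Vec<String>) -> Vec<String> {
    // ... validate_request, then fetch_all from the earlier sketches ...
    post_ids
}

async fn handle_with_limit(
    // Created once at startup: Arc::new(Semaphore::new(MAX_CONCURRENT_REQUESTS))
    limiter: Arc<Semaphore>,
    post_ids: Vec<String>,
) -> Vec<String> {
    // Queues the request once the cap is reached; use try_acquire()
    // instead if you'd rather shed load with an error response.
    let _permit = limiter.acquire().await.expect("semaphore never closed");
    handle(post_ids).await
    // _permit drops here, freeing a slot for the next request
}
```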
So now, just by having a bound on max concurrent requests and on max posts to fetch per request, you've also properly bounded your number of futures at MaxConcurrentRequests × MaxPostsToFetch.
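To put illustrative numbers on it: with the made-up constants from the sketches above, that's 1024 × 100 = 102,400 futures in flight at worst. If you assume, say, 64 KiB of buffer per connection, the worst case is roughly 6.25 GiB, which is exactly the kind of number a load test would confirm or correct for a given host.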
And you can run load tests to see what those values should be for a particular type of host.
So my point is, protecting yourself from tip-overs like this is something where you need to consider the entirety of the application. And just as with premature optimization, there exist premature artificial limits. I'm in the camp that limits should be put in place, but at the end, when you can properly load test and reason about the whole application. And generally, limits at the entry points are good enough and much simpler to implement, maintain, and reason about.