What HTTP server library to use

iyedb · December 28, 2018, 12:29am

What are the recommended HTTP server libraries? And what is the threading/execution model of each?
Thanks.
I have looked into both http-kit and aleph. They do not seem mature enough for production use. So my question is: what is the recommended, production-ready HTTP server library in the Clojure ecosystem?
PS: Ring uses Jetty, which is based on the old one-request-per-thread execution model and is not an option by today’s standards (think HTTP servers in Golang, Python asyncio, etc.)

swlkr · December 31, 2018, 12:39am

I use http-kit in production. Of course, it’s mostly for low-traffic web apps (~200-400 reqs/day). I also use SQLite in prod, so I’m probably a bad person to give out this kind of advice.

didibus · December 31, 2018, 6:39am

I’d say Pedestal, Immutant or http-kit feel like the most battle-tested. Though I think aleph can very much be used in production. You could also check out nginx-clojure, which seems to have been actively maintained for the last three years.

borkdude · December 31, 2018, 8:50am

I’m very happy with yada on top of aleph. Have been using it for a long period in production now.

seancorfield · January 1, 2019, 7:04am

We’ve used both Jetty (Ring’s built-in adapter) and http-kit in production, with many millions of low-latency requests per day, very happily. Don’t underestimate the JVM’s ability to perform to “today’s standards”.

We started with Jetty but switched to http-kit because of some weird thread-related exceptions in Jetty. However, http-kit isn’t as well supported by New Relic, our monitoring tool of choice, so we switched back to (a newer version of) Jetty and we’ve had no problems at all with it.

http-kit is definitely mature enough for production use but tooling support, such as New Relic, lags behind the more mainstream servers.

iyedb · January 1, 2019, 11:45am

I agree with that, but once you get used to the asynchronous programming style without callbacks that Golang offers, using something like Jetty (which, as far as I know, uses a thread for each request) seems like a step back.

orestis · January 1, 2019, 2:19pm

I wonder myself about using a thread per request – in practice of course Jetty uses a lot of smart algorithms and thread pools to both keep the total number of alive threads low, and also minimise the amount of generated garbage, but any battle stories (or tales of great success) with using Jetty to talk to a database would be very welcome.

In my mental model, using a thread per request shouldn’t be that big of a deal when you’re talking to a database, as the database will also have a thread per connection or some similar bottleneck, and in the end you need to match the scale of your web server to the scale of your database. I’m not sure at what point that breaks down though.

lukaszkorecki · January 1, 2019, 5:51pm

A common pattern is to use a connection pool for accessing the database. We use a combo of Jetty and HikariCP, and so far it’s all been working fine after tweaking the thread pool size in Jetty’s settings.

seancorfield · January 1, 2019, 6:34pm

Personally, I find the callback style of programming to be a major step backwards. I guess it depends on what you’re used to…

…and if you really want to use that style of programming, Ring supports it (even on Jetty).
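For readers who have not seen it, here is a minimal sketch of what an asynchronous Ring handler can look like. Ring (1.6 and later) accepts a three-argument handler that receives `respond` and `raise` callbacks instead of returning a response map. `fetch-greeting-async` is a hypothetical stand-in for a non-blocking client call, not a real library function.

```clojure
;; Hypothetical non-blocking call: invokes `on-done` with a result from
;; another thread, or `on-error` if something goes wrong.
(defn fetch-greeting-async
  [who on-done on-error]
  (doto (Thread. ^Runnable (fn []
                             (try
                               (on-done (str "Hello, " who))
                               (catch Throwable t (on-error t)))))
    (.setDaemon true)
    (.start)))

(defn handler
  ;; One-argument arity: the classic synchronous Ring handler.
  ([request]
   {:status 200
    :headers {"Content-Type" "text/plain"}
    :body "Hello, world"})
  ;; Three-argument arity: the asynchronous form. The calling thread
  ;; returns immediately; `respond` is invoked later with the response map.
  ([request respond raise]
   (fetch-greeting-async
     "world"
     (fn [greeting]
       (respond {:status 200
                 :headers {"Content-Type" "text/plain"}
                 :body greeting}))
     raise)))

;; Assumption: with ring-jetty-adapter on the classpath, the async arity
;; is used when the adapter is started with the :async? option, e.g.
;; (ring.adapter.jetty/run-jetty handler {:port 3000 :async? true})
```

The handler can be exercised without a server by passing a promise as `respond`, since a Clojure promise can be called as a function to deliver its value.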

seancorfield · January 1, 2019, 6:38pm

As noted, we have a fairly high-traffic system, with a large database (millions of members worldwide, with thousands of them accessing the system concurrently 24x7). We ran on Tomcat for years (with one thread per request) in our legacy system with very few issues and now we’re running our new system on Jetty with no issues (we haven’t even had to tweak the defaults yet). We use c3p0 for connection pooling with Percona 5.7, via ProxySQL, and a master/slave setup.

orestis · January 1, 2019, 8:05pm

That’s encouraging! We’re about to start experimenting with migrating some node.js + mongo services over to clojure + mongo, and there’s many decisions and knobs to turn. Our mongo backend is blazingly fast so hopefully I won’t have to care about threadpools just yet.

What kind of hardware are you running your JVMs on, if I may ask?

seancorfield · January 1, 2019, 8:32pm

I don’t know what’s in the data center, sorry, beyond them being multi-core Linux servers that we’ve been using for several years. Our database servers are virtualized, along with our Redis servers, but our JVM servers are bare metal still. We were using MongoDB heavily a few years ago but we’ve migrated pretty much all of that data back to MySQL – which reduces costs and complexity (running two large master/slave database clusters, in order to support both MySQL and MongoDB did not make sense).

DjebbZ · January 2, 2019, 7:48am

I can testify that we’ve been using Jetty + Pedestal in production for a similar website with multiple millions of views per day for quite some time, and coupling it with core.async for calls to external services over HTTP, to avoid doing blocking I/O, has proved pretty stable. CPU usage is well under our limits, as is memory usage, and we’re able to render most pages in less than 100ms (some of them even under 50ms).

iyedb · January 2, 2019, 1:50pm

Golang (I use Golang to write HTTP services that handle hundreds of rps) offers the best of both worlds: asynchronous I/O, but with a synchronous API from the programmer’s point of view. So no callbacks are used at all. Thanks to the Go runtime, every time a piece of code in a goroutine reads from or writes to the network (database drivers included, of course), it yields the OS thread that is running it, and the Go runtime can run any other goroutine that is “ready”, using underlying OS mechanisms like epoll and kqueue. This concurrency style was a direct inspiration for core.async, which even uses a macro named go, much like the way you launch a goroutine in Golang. I am wondering why there isn’t an HTTP library in Clojure that uses core.async to allow a programming style similar to Go’s. And if there isn’t any, what is the use case of core.async?
I know aleph uses core.async via manifold but it’s not clear to me how.
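As a point of comparison, the goroutine style described above looks roughly like this in core.async. This is only a sketch and assumes org.clojure/core.async is on the classpath; `fake-io` and `handle-request` are hypothetical names, and the timeouts stand in for real network calls. Inside a `go` block, `<!` parks the block rather than blocking a thread, so many in-flight "requests" can share a small thread pool.

```clojure
(require '[clojure.core.async :as async :refer [go <! <!! timeout]])

;; Hypothetical I/O call: returns a channel that yields `result`
;; after `ms` milliseconds without holding a thread while it waits.
(defn fake-io [ms result]
  (go
    (<! (timeout ms))
    result))

;; Reads top to bottom like synchronous code, with no callbacks.
;; Each <! parks this go block until the value is available.
(defn handle-request [id]
  (go
    (let [user   (<! (fake-io 50 {:id id}))
          orders (<! (fake-io 50 [:order-1]))]
      (assoc user :orders orders))))

;; Start 1000 concurrent "requests" and collect every result.
;; <!! blocks the calling thread, so it is used only at the outer edge.
(def results
  (<!! (async/into [] (async/merge (mapv handle-request (range 1000))))))
```

Note that this only helps when the underlying I/O is itself non-blocking. A blocking JDBC call inside a `go` block would still tie up one of the pool's threads.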

iyedb · January 2, 2019, 1:55pm

Thanks! But aren’t you still blocking the Jetty thread running the request when using core.async in the code handling the request?

iyedb · January 2, 2019, 2:35pm

I didn’t know about pedestal. Looks great.

DjebbZ · January 2, 2019, 6:27pm

No, it doesn’t block. I think it’s Pedestal that sees that an interceptor (handler in Ring parlance) returns a channel via the go macro, so it parks the Jetty thread somehow, and Jetty can go back to handling another request. I’m not sure about the specifics but I’m quite sure about the end result. All our I/O is non-blocking through core.async and our webserver handles dozens of requests per second without a problem.
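To make the mechanism concrete: in Pedestal, an interceptor's `:enter` function may return a core.async channel that eventually delivers the updated context, instead of returning the context directly. The sketch below shows only the shape of such an interceptor, written as a plain map. It assumes core.async is available, and `lookup-user` is a hypothetical non-blocking call to an external service, not part of Pedestal.

```clojure
(require '[clojure.core.async :refer [go <! <!! timeout]])

;; Hypothetical external-service call that returns a channel.
(defn lookup-user [id]
  (go
    (<! (timeout 20))
    {:id id :name "example"}))

;; Interceptor as a map with :name and :enter keys. Because :enter
;; returns a channel (the result of `go`), the framework can continue
;; processing once the context arrives on it.
(def load-user
  {:name  ::load-user
   :enter (fn [context]
            (go
              (let [id   (get-in context [:request :path-params :id])
                    user (<! (lookup-user id))]
                (assoc context :response {:status 200
                                          :body   (pr-str user)}))))})
```

Outside of Pedestal, the interceptor can be tried by calling `(:enter load-user)` on a context map and taking from the returned channel.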

orestis · January 2, 2019, 8:23pm

Do you connect to a database or external service somehow? Isn’t that eventually blocking for I/O? Usually, to get proper async you have to go all the way down, otherwise you’ve just shifted the bottleneck to some other place.

DjebbZ · January 2, 2019, 8:52pm

I see what you mean. Seeing it this way we’re not truly asynchronous because we need the response of the external services to answer the request. But at least every I/O in our webserver is async, so that it doesn’t completely block a Jetty thread.

The only other way that I know of to be truly async is to let the webserver respond very quickly to the request, then deliver the true response through web sockets or server sent events. It’s a technique pioneered by LinkedIn I think.

orestis · January 4, 2019, 7:19am

Not really. If anywhere in your system you have a limited number of threads servicing requests, and these threads can all be blocked waiting for I/O, then it makes no difference if other parts of your code are async, as they will eventually get stalled.

So in Jetty’s case, you will be accepting connections faster than you can service them, because the remote database will become your bottleneck. Eventually you will run out of memory or file descriptors. Perhaps it would be better to actually refuse new connections, and service the existing ones.
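The trade-off between queueing without limit and refusing new work can be sketched with a plain JVM executor. This is an illustration of the general idea, not how Jetty is implemented: a fixed number of worker threads all block on a slow "database" (a latch here), a bounded queue holds a few waiting requests, and anything beyond that is rejected immediately. `simulate-overload` and `try-submit!` are hypothetical names.

```clojure
(import '(java.util.concurrent ArrayBlockingQueue CountDownLatch
                               RejectedExecutionException
                               ThreadPoolExecutor
                               ThreadPoolExecutor$AbortPolicy
                               TimeUnit))

;; Try to hand a task to the pool; report whether it was accepted.
(defn try-submit! [^ThreadPoolExecutor pool ^Runnable task]
  (try
    (.execute pool task)
    :accepted
    (catch RejectedExecutionException _
      :rejected)))

(defn simulate-overload
  "Submits n-requests tasks that all block until the simulated database
  answers, and returns how many were accepted versus rejected."
  [n-threads queue-size n-requests]
  (let [db-latch (CountDownLatch. 1)
        pool     (ThreadPoolExecutor. (int n-threads) (int n-threads)
                                      (long 0) TimeUnit/MILLISECONDS
                                      (ArrayBlockingQueue. (int queue-size))
                                      (ThreadPoolExecutor$AbortPolicy.))
        ;; Every request waits on the same slow backend.
        slow-request (fn [] (.await db-latch))
        outcomes (mapv (fn [_] (try-submit! pool slow-request))
                       (range n-requests))]
    ;; The "database" finally responds; let the workers drain and exit.
    (.countDown db-latch)
    (.shutdown pool)
    (frequencies outcomes)))

(simulate-overload 2 2 6)
;; => {:accepted 4, :rejected 2}
```

With two threads and a queue of two, four requests are held (two running, two waiting) and the remaining two are refused right away, which keeps memory use bounded while the backend is the bottleneck.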

Not all systems fit this architecture or load characteristics, so perhaps this is an academic view that doesn’t bear much connection to reality. My point is, I don’t worry too much about the Jetty part of my system… yet.