Hi @camilaa,
I would say the first step is to identify where the bottleneck is. If you have already determined that writing the data to the database is the slow part, that's a good step forward.
If you are unsure, please have a look at the comments on this post:
Many of the same suggestions will apply here.
If you are sure that the bottleneck is writing the data to the database, then there are a couple of things you could double-check:
- Are you using a connection pool for your database connections?
Opening a new connection per query can be rather expensive.
If you are not using a connection pool, try one of the following (there is a short setup sketch after this list):
- GitHub - tomekw/hikari-cp: A Clojure wrapper to HikariCP JDBC connection pool
- GitHub - metabase/connection-pool: Connection pools for JDBC databases. Simple wrapper around C3P0.
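For reference, here is a minimal sketch of setting up a pool with hikari-cp; the adapter, host, database name, and credentials are placeholders for your own settings:

(ns example.db
  (:require [hikari-cp.core :as hikari]))

;; Placeholder connection options -- adjust for your own database.
(def datasource-options
  {:adapter       "postgresql"
   :server-name   "localhost"
   :port-number   5432
   :database-name "mydb"
   :username      "user"
   :password      "secret"})

;; Create the pool once at application startup and reuse it for every
;; query, instead of opening a fresh connection per statement.
(defonce datasource
  (delay (hikari/make-datasource datasource-options)))

;; Use {:datasource @datasource} as the db-spec you pass to
;; clojure.java.jdbc / next.jdbc, and call
;; (hikari/close-datasource @datasource) on shutdown.

The pool then hands out already-open connections, so each query only pays the cost of borrowing a connection rather than opening a new one.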
- Do you batch write your data into Postgres?
Writing data can be done in large batches of thousands of records with a single statement, which is far more efficient than issuing an individual INSERT INTO statement for each record. See PostgreSQL: Documentation: 9.4: INSERT.
Here is an example of how to bulk insert records:
INSERT INTO films (code, title, did, date_prod, kind) VALUES
('B6717', 'Tampopo', 110, '1985-02-10', 'Comedy'),
('UA502', 'Bananas', 105, '1971-07-13', 'Comedy'),
('HG120', 'The Dinner Game', 140, '1961-06-16', 'Comedy');
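From Clojure, the same multi-row insert can be produced with clojure.java.jdbc's insert-multi!. This is only a sketch: db-spec is assumed to be the pooled datasource from the sketch above, and the films table is assumed to exist.

(require '[clojure.java.jdbc :as jdbc])

;; db-spec is assumed to be {:datasource @datasource} from the pooling sketch.
;; Passing column names plus a sequence of row vectors lets insert-multi!
;; issue a single batched insert rather than one round trip per record.
(jdbc/insert-multi! db-spec :films
                    [:code :title :did :date_prod :kind]
                    [["B6717" "Tampopo"         110 (java.sql.Date/valueOf "1985-02-10") "Comedy"]
                     ["UA502" "Bananas"         105 (java.sql.Date/valueOf "1971-07-13") "Comedy"]
                     ["HG120" "The Dinner Game" 140 (java.sql.Date/valueOf "1961-06-16") "Comedy"]])

For very large loads you can chunk the records (e.g. with partition-all) and call it once per chunk of a few thousand rows.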
I hope it helps.
Bruno