Postgresql – Configuring PostgreSQL for write performance

performancepostgresql

One of my PostgreSQL servers hosts several (1-3) databases which receive a constant stream of data. The data is not particularly structured, it amounts to the current time and a variety of observed data for that particular instant. The data rate is fairly high; it works out to about a gigabyte a day for one database, about a tenth of that for another one. I don't expect this rate to increase. Read performance is a much lower priority and is currently acceptable.

In the logs I have this message:

LOG:  checkpoints are occurring too frequently (15 seconds apart)
HINT:  Consider increasing the configuration parameter "checkpoint_segments".

This value is currently set to 16, which is courtesy of pgtune.

What are the settings I should consider to improve write performance? I would prefer to keep as much safety as possible. Considering the volume of data coming in, I could accept losing some recent data in a failure as long as the bulk of the data were intact.

Edit: I'm using PostgreSQL 9.0 for now, but I plan to upgrade to 9.1. I am not posting the hardware details because while I acknowledge their importance, I ultimately will be needing to make this optimization on several machines with very diverse hardware. If the hardware is essential to the answer, please give me the general information so I can apply the answer to machines with different hardware configurations.

Best Answer

1 Gigabyte a day is not that high of a write load. Spread out throughout the day, that comes out to about 50kbytes a second. A slow USB thumb drive could handle that. I'm assuming it's more bursty though. As a_horse_with_no_name suggests, increase checkpoint segments. 100 or so is not out of the ordinary.

Then increase your checkpoint_timeout to 1 hour, as well as look at increasing your checkpoint_completion_target to something closer to 1.0 (100%). The completion target tells PostgreSQL how aggressively to write in the background so that it's x% complete before running a checkpoint, which force all the data to be written out at once from the WAL and will slow the system to a crawl while it's happening.

The reason you don't usually set it to 100% is that it's pretty common to write to the same block more than once, and by delaying WAL writes out to the main store, you prevent the same block being written twice for no reason.

If it's unlikely you'll be writing to the same block more than once before your timeout occurs, i.e. all you do is insert then setting it pretty high makes sense to raise it to 0.9 or so. The worst that'll happen is you'll write a little more often than you might otherwise need to, but checkpoints impact will be greatly reduced.