DisCopy


Monday 27 June 2022

PostgreSQL VACUUM!

VACUUM

is the garbage-collector of PostgreSQL Database to discard obselete (updated) / deleted records of tables, materialized views and optionally analyze the statistics of a database/object.

If a table doesn’t get vacuumed, it will get bloated, which wastes disk space and also hampers the performance.

 

PostgreSQL's VACUUM command is a maintenance command needs to be run periodically to cater the following vital tasks:

1.      To recover or reuse disk space occupied by updated or deleted rows.

2.      To analyze/update data statistics used by the PostgreSQL query planner.

3.      To protect against loss of very old data due to transaction ID wraparound.

 

In normal PostgreSQL operation, tuples/records/rows that are deleted or obsoleted by an update are not physically removed from their table until a VACUUM is done. Therefore it's necessary to do VACUUM periodically, especially on frequently-updated tables as VACUUM reclaims storage occupied by dead tuples.

Normally we don’t need to take care of all that, because the autovacuum daemon does that with some limitations based on the configuration.

The frequency and scope of the VACUUM operations performed for each of the above reasons will vary depending on the needs of each database environment. Some tables with heavy updates and deletes need frequent vacuuming to discard obselete and deleted records. 

 

  • VACUUM processes every table and materialized view in the current database that the current user has permission to vacuum by default i.e. without a table_and_columns list specified, .With a list, VACUUM processes only those specified table(s).
  • VACUUM ANALYZE performs a VACUUM and then an ANALYZE for each selected table. This is a handy combination form for routine maintenance scripts. This option Updates statistics used by the planner to determine the most efficient way to execute a query thus the Optimizer choose right plan including join and indexes.

VACUUM (without FULLsimply reclaims space and makes it available for re-use. This command can operate in parallel with normal reading and writing of the table, as an exclusive lock is not obtained on the table. However, extra space is not returned to the operating system immediately. it's just kept available for re-use within the same table. It also allows us to leverage multiple CPUs in order to process indexes. This feature is known as parallel vacuum. We can use PARALLEL option and specify parallel workers as zero to disable the parallelism.

  • VACUUM FULL rewrites the entire contents of the table into a new disk file with no extra space, allowing unused space to be returned to the operating system. This form is much slower and requires an ACCESS EXCLUSIVE lock on each table while it is being processed. 

BTW, “full” vacuum, which can reclaim more space, takes much longer and exclusively locks impacts the performance and availability of the specific table. This method also requires extra disk space, since it writes a new copy of the table and doesn't release the old copy until the operation is complete. Usually this should only be used when a significant amount of space needs to be reclaimed from within the table like after a huge purge activity on the table.

 

  • VACUUM FREEZE selects aggressive “freezing” of tuples.

Specifying FREEZE with VACUUM is useful in transaction id wrapping and is equivalent to performing VACUUM with the vacuum_freeze_min_age and vacuum_freeze_table_age   parameters set to zero

 Aggressive freezing is always performed when the table is rewritten, so this option is redundant when FULL is specified.

 

Equivalent Commands in other RDBMS:

  • ·      REORG and UPDATE STATISTICS in Sybase
  • ·      REORG and GATHER STATS in Oracle

 

No comments:

Post a Comment