Mirror of the official PostgreSQL GIT repository. Note that this is just a *mirror* - we don't work with pull requests on github. To contribute, please see https://wiki.postgresql.org/wiki/Submitting_a_Patch
Go to file
Tomas Vondra b229c10164 Enforce memory limit during parallel GIN builds
Index builds are expected to respect maintenance_work_mem, just like
other maintenance operations. For serial builds this is done simply by
flushing the buffer in ginBuildCallback() into the index. But with
parallel builds it's more complicated, because there are multiple places
that can allocate memory.

ginBuildCallbackParallel() does the same thing as ginBuildCallback(),
except that the accumulated items are written into tuplesort. Then the
entries with the same key get merged - first in the worker, then in the
leader - and the TID lists may get (arbitrarily) long. It's unlikely it
would exceed the memory limit, but it's possible. We address this by
evicting some of the data if the list gets too long.

We can't simply dump the whole in-memory TID list. The GIN index bulk
insert code expects to see TIDs in monotonic order; it may fail if the
TIDs go backwards. If the TID lists overlap, evicting the whole current
TID list would break this (a later entry might add "old" TID values into
the already-written part).

In the workers this is not an issue, because the lists never overlap.
But the leader may see overlapping lists produced by the workers.

We can however derive a safe "horizon" TID - the entries (for a given
key) are sorted by (key, first TID), which means no future list can add
values before the last "first TID" we've seen. This patch tracks the
"frozen" part of the TID list, which we know can't change by merging
additional TID lists. If needed, we can evict this part of the list.

We don't want to do this too often - the smaller lists we evict, the
more expensive it'll be to merge them in the next step (especially in
the leader). Therefore we only trim the list if we have at least 1024
frozen items, and if the whole list is at least 64kB large.

These thresholds are somewhat arbitrary and conservative. We might
calculate the values from maintenance_work_mem, but tests show that does
not really improve anything (time, compression ratio, ...). So we stick
to these conservative values to release memory faster.

Author: Tomas Vondra
Reviewed-by: Matthias van de Meent, Andy Fan, Kirill Reshke
Discussion: https://postgr.es/m/6ab4003f-a8b8-4d75-a67f-f25ad98582dc%40enterprisedb.com
2025-03-04 20:41:13 +01:00
.github Add CODE_OF_CONDUCT.md, CONTRIBUTING.md, and SECURITY.md. 2024-07-02 13:03:58 -05:00
config Add support for OAUTHBEARER SASL mechanism 2025-02-20 16:25:17 +01:00
contrib postgres_fdw: Extend postgres_fdw_get_connections to return remote backend PID. 2025-03-03 08:51:30 +09:00
doc doc: Expand version compatibility for pg_basebackup features 2025-03-04 12:08:27 +01:00
src Enforce memory limit during parallel GIN builds 2025-03-04 20:41:13 +01:00
.cirrus.star Remove duplicate words in docs and code comments. 2023-10-09 09:18:47 +05:30
.cirrus.tasks.yml ci: Use a RAM disk for NetBSD and OpenBSD. 2025-03-04 11:29:21 +13:00
.cirrus.yml ci: Test NetBSD and OpenBSD 2025-02-12 09:40:07 -05:00
.dir-locals.el Make Emacs perl-mode indent more like perltidy. 2019-01-13 11:32:31 -08:00
.editorconfig Add script to keep .editorconfig in sync with .gitattributes 2025-02-01 10:09:45 +01:00
.git-blame-ignore-revs Add commit 76aa615943 to .git-blame-ignore-revs 2025-02-01 16:48:18 +09:00
.gitattributes Add script to keep .editorconfig in sync with .gitattributes 2025-02-01 10:09:45 +01:00
.gitignore Update top-level .gitignore. 2022-12-04 15:23:00 -05:00
.mailmap Add a Git .mailmap file 2024-11-05 13:56:02 +01:00
aclocal.m4 autoconf: Move export_dynamic determination to configure 2022-12-06 18:55:28 -08:00
configure Add support for OAUTHBEARER SASL mechanism 2025-02-20 16:25:17 +01:00
configure.ac Add support for OAUTHBEARER SASL mechanism 2025-02-20 16:25:17 +01:00
COPYRIGHT Update copyright for 2025 2025-01-01 11:21:55 -05:00
GNUmakefile.in Allow selecting the git revision to be packaged by "make dist". 2024-05-03 11:08:50 -04:00
HISTORY Canonicalize some URLs 2020-02-10 20:47:50 +01:00
Makefile Remove AIX support 2024-02-28 15:17:23 +04:00
meson_options.txt Add support for OAUTHBEARER SASL mechanism 2025-02-20 16:25:17 +01:00
meson.build Make test portlock logic work with meson 2025-02-21 11:25:05 -05:00
README.md Revise the style of a paragraph in README.md. 2024-03-21 10:16:41 -05:00

PostgreSQL Database Management System

This directory contains the source code distribution of the PostgreSQL database management system.

PostgreSQL is an advanced object-relational database management system that supports an extended subset of the SQL standard, including transactions, foreign keys, subqueries, triggers, user-defined types and functions. This distribution also contains C language bindings.

Copyright and license information can be found in the file COPYRIGHT.

General documentation about this version of PostgreSQL can be found at https://www.postgresql.org/docs/devel/. In particular, information about building PostgreSQL from the source code can be found at https://www.postgresql.org/docs/devel/installation.html.

The latest version of this software, and related software, may be obtained at https://www.postgresql.org/download/. For more information look at our web site located at https://www.postgresql.org/.