Compare commits

...

965 Commits

Author SHA1 Message Date
Etsuro Fujita
e5a3c9d9b5 postgres_fdw: Inherit the local transaction's access/deferrable modes.
Previously, postgres_fdw always:

1) opened a remote transaction in READ WRITE mode even when the local
transaction was READ ONLY, allowing a READ ONLY transaction that
references a foreign table mapped to a remote view executing a volatile
function to write on the remote side, and

2) opened the remote transaction in NOT DEFERRABLE mode even when the
local transaction was DEFERRABLE, causing a SERIALIZABLE READ ONLY
DEFERRABLE transaction to abort due to a serialization failure on the
remote side.

To avoid these, modify postgres_fdw to open a remote transaction in the
same access/deferrable modes as the local transaction.  This commit also
modifies it to open a remote subtransaction in the same access mode as
the local subtransaction.

Although these issues have existed since the introduction of
postgres_fdw, there have been no reports from the field, so it seems
fine to fix them in master only.
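
As a minimal sketch of the first scenario (foreign_tab is a
hypothetical foreign table mapped to such a remote view):

    -- with this change, the remote transaction inherits READ ONLY,
    -- so the volatile function behind the remote view can no longer
    -- write on the remote side
    BEGIN TRANSACTION READ ONLY;
    SELECT * FROM foreign_tab;
    COMMIT;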

Author: Etsuro Fujita <etsuro.fujita@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAPmGK16n_hcUUWuOdmeUS%2Bw4Q6dZvTEDHb%3DOP%3D5JBzo-M3QmpQ%40mail.gmail.com
2025-06-01 17:30:00 +09:00
Dean Rasheed
b006bcd531 Fix MERGE into a plain inheritance parent table.
When a MERGE's target table is the parent of an inheritance tree, any
INSERT actions insert into the parent table using ModifyTableState's
rootResultRelInfo.  However, there are two bugs in the way it is
initialized:

1. ExecInitMerge() incorrectly uses a different ResultRelInfo entry
from ModifyTableState's resultRelInfo array to build the insert
projection, which may not be compatible with rootResultRelInfo.

2. ExecInitModifyTable() does not fully initialize rootResultRelInfo.
Specifically, ri_WithCheckOptions, ri_WithCheckOptionExprs,
ri_returningList, and ri_projectReturning are not initialized.

This can lead to crashes, or incorrect query results due to failing to
check WCOs or process the RETURNING list for INSERT actions.

Fix both these bugs in ExecInitMerge(), noting that it is only
necessary to fully initialize rootResultRelInfo if the MERGE has
INSERT actions and the target table is a plain inheritance parent.
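
A hedged sketch of the affected shape (names hypothetical; the
RETURNING clause requires v17 or later):

    -- the INSERT action routes tuples through the inheritance parent,
    -- exercising the rootResultRelInfo initialization fixed here
    CREATE TABLE parent (a int, b text);
    CREATE TABLE child () INHERITS (parent);
    MERGE INTO parent p
      USING (VALUES (1, 'x')) AS s(a, b) ON p.a = s.a
      WHEN NOT MATCHED THEN INSERT (a, b) VALUES (s.a, s.b)
      RETURNING p.a;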

Backpatch to v15, where MERGE was introduced.

Reported-by: Andres Freund <andres@anarazel.de>
Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Jian He <jian.universality@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/4rlmjfniiyffp6b3kv4pfy4jw3pciy6mq72rdgnedsnbsx7qe5@j5hlpiwdguvc
Backpatch-through: 15
2025-05-31 12:12:58 +01:00
Michael Paquier
e050af2868 Change internal plan ID type from uint64 to int64
uint64 was chosen to be consistent with the type used by the query ID,
but the conclusion of a recent discussion about the query ID is that
int64 is a better fit, since the signed form is what is shown to the
user in pg_stat_statements (PGSS) or EXPLAIN outputs.

This commit changes the plan ID to use int64, following c3eda50b06
that has done the same for the query ID.

The plan ID is new to v18, introduced in 2a0cd38da5.

Author: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/aCvzJNwetyEI3Sgo@paquier.xyz
2025-05-31 09:40:45 +09:00
Nathan Bossart
706054b11b Ensure we have a snapshot when updating various system catalogs.
A few places that access system catalogs don't set up an active
snapshot before potentially accessing their TOAST tables.  To fix,
push an active snapshot just before each section of code that might
require accessing one of these TOAST tables, and pop it shortly
afterwards.  While at it, this commit adds some rather strict
assertions in an attempt to prevent such issues in the future.

Commit 16bf24e0e4 recently removed pg_replication_origin's TOAST
table in order to fix the same problem for that catalog.  On the
back-branches, those bugs are left in place.  We cannot easily
remove a catalog's TOAST table on released major versions, and only
replication origins with extremely long names are affected.  Given
the low severity of the issue, fixing older versions doesn't seem
worth the trouble of significantly modifying the patch.

Also, on v13 and v14, the aforementioned strict assertions have
been omitted because commit 2776922201, which added
HaveRegisteredOrActiveSnapshot(), was not back-patched.  While we
could probably back-patch it now, I've opted against it because it
seems unlikely that new TOAST snapshot issues will be introduced in
the oldest supported versions.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/18127-fe54b6a667f29658%40postgresql.org
Discussion: https://postgr.es/m/18309-c0bf914950c46692%40postgresql.org
Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan
Backpatch-through: 13
2025-05-30 15:17:28 -05:00
Tom Lane
232d8caeaa Fix memory leakage in postgres_fdw's DirectModify code path.
postgres_fdw tries to use PG_TRY blocks to ensure that it will
eventually free the PGresult created by the remote modify command.
However, it's fundamentally impossible for this scheme to work
reliably when there's RETURNING data, because the query could fail
in between invocations of postgres_fdw's DirectModify methods.
There is at least one instance of exactly this situation in the
regression tests, and the ensuing session-lifespan leak is visible
under Valgrind.

We can improve matters by using a memory context reset callback
attached to the ExecutorState context.  That ensures that the
PGresult will be freed when the ExecutorState context is torn
down, even if control never reaches postgresEndDirectModify.

I have little faith that there aren't other potential PGresult
leakages in the backend modules that use libpq.  So I think it'd
be a good idea to apply this concept universally by creating
infrastructure that attaches a reset callback to every PGresult
generated in the backend.  However, that seems too invasive for
v18 at this point, let alone the back branches.  So for the
moment, apply this narrow fix that just makes DirectModify safe.
I have a patch in the queue for the more general idea, but it
will have to wait for v19.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com>
Discussion: https://postgr.es/m/2976982.1748049023@sss.pgh.pa.us
Backpatch-through: 13
2025-05-30 13:45:41 -04:00
Tom Lane
d98cefe114 Allow larger packets during GSSAPI authentication exchange.
Our GSSAPI code only allows packet sizes up to 16kB.  However it
emerges that during authentication, larger packets might be needed;
various authorities suggest 48kB or 64kB as the maximum packet size.
This limitation caused login failure for AD users who belong to many
AD groups.  To add insult to injury, we gave an unintelligible error
message, typically "GSSAPI context establishment error: The routine
must be called again to complete its function: Unknown error".

As noted in code comments, the 16kB packet limit is effectively a
protocol constant once we are doing normal data transmission: the
GSSAPI code splits the data stream at those points, and if we change
the limit then we will have cross-version compatibility problems
due to the receiver's buffer being too small in some combinations.
However, during the authentication exchange the packet sizes are
not determined by us, but by the underlying GSSAPI library.  So we
might as well just try to send what the library tells us to.
An unpatched recipient will fail on a packet larger than 16kB,
but that's not worse than the sender failing without even trying.
So this doesn't introduce any meaningful compatibility problem.

We still need a buffer size limit, but we can easily make it be
64kB rather than 16kB until transport negotiation is complete.
(Larger values were discussed, but don't seem likely to add
anything.)

Reported-by: Chris Gooch <cgooch@bamfunds.com>
Fix-suggested-by: Jacob Champion <jacob.champion@enterprisedb.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://postgr.es/m/DS0PR22MB5971A9C8A3F44BCC6293C4DABE99A@DS0PR22MB5971.namprd22.prod.outlook.com
Backpatch-through: 13
2025-05-30 12:55:15 -04:00
Fujii Masao
961553daf5 Make XactLockTableWait() and ConditionalXactLockTableWait() more interruptible.
Previously, XactLockTableWait() and ConditionalXactLockTableWait() could enter
a non-interruptible loop when they successfully acquired a lock on a transaction
but the transaction still appeared to be running. Since this loop continued
until the transaction completed, it could result in long, uninterruptible waits.

Although this scenario is generally unlikely, since XactLockTableWait()
and ConditionalXactLockTableWait() can normally acquire a transaction
lock only when the transaction is not running, it can occur on a hot
standby.
In such cases, the transaction may still appear active due to
the KnownAssignedXids list, even while no lock on the transaction exists.
For example, this situation can happen when creating a logical replication
slot on a standby.

The cause of the non-interruptible loop was the absence of CHECK_FOR_INTERRUPTS()
within it. This commit adds CHECK_FOR_INTERRUPTS() to the loop in both functions,
ensuring they can be interrupted safely.

Back-patch to all supported branches.

Author: Kevin K Biju <kevinkbiju@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAM45KeELdjhS-rGuvN=ZLJ_asvZACucZ9LZWVzH7bGcD12DDwg@mail.gmail.com
Backpatch-through: 13
2025-05-31 00:08:40 +09:00
David Rowley
c3eda50b06 Change internal queryid type from uint64 to int64
uint64 was perhaps chosen in cff440d36 as the type was uint32 prior to
that widening work.

Having this as uint64 doesn't make much sense and just adds the overhead of
having to remember that we always output this in its signed form.  Let's
remove that overhead.

The signed form output is seemingly required since we have no way to
represent the full range of uint64 in an SQL type.  We use BIGINT in places
like pg_stat_statements, which maps directly to int64.

The release notes "Source Code" section may want to mention this
adjustment as some extensions may wish to adjust their code.

Author: David Rowley <dgrowleyml@gmail.com>
Suggested-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/50cb0c8b-994b-48f9-a1c4-13039eb3536b@eisentraut.org
2025-05-30 22:59:39 +12:00
Bruce Momjian
03c53a7314 doc PG 18 relnotes: modify async I/O item for other improvements
Add "etc." to indicate other actions will also be improved by
asynchronous I/O.

Reported-by: Melanie Plageman

Discussion: https://postgr.es/m/CAAKRu_bqjgSYA+OdemL-X91Yv53OwsVARZy+-tRyj8YQ=kcj0A@mail.gmail.com
2025-05-29 12:37:05 -04:00
Tom Lane
470273da0f Avoid resource leaks when a dblink connection fails.
If we hit out-of-memory between creating the PGconn and inserting
it into dblink's hashtable, we'd lose track of the PGconn, which
is quite bad since it represents a live connection to a remote DB.
Fix by rearranging things so that we create the hashtable entry
first.

Also reduce the number of states we have to deal with by getting rid
of the separately-allocated remoteConn object, instead allocating it
in-line in the hashtable entries.  (That incidentally removes a
session-lifespan memory leak observed in the regression tests.)

There is an apparently-irreducible remaining OOM hazard, which
is that if the connection fails at the libpq level (ie it's
CONNECTION_BAD) then we have to pstrdup the PGconn's error message
before we can release it, and theoretically that could fail.  However,
in such cases we're only leaking memory not a live remote connection,
so I'm not convinced that it's worth sweating over.

This is a pretty low-probability failure mode of course, but losing
a live connection seems bad enough to justify back-patching.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com>
Discussion: https://postgr.es/m/1346940.1748381911@sss.pgh.pa.us
Backpatch-through: 13
2025-05-29 10:39:55 -04:00
Fujii Masao
3c4d7557e0 Fix assertion failure in pg_prewarm() on objects without storage.
An assertion test added in commit 049ef33 could fail when pg_prewarm()
was called on objects without storage, such as partitioned tables.
This resulted in the following failure in assert-enabled builds:

    Failed Assert("RelFileNumberIsValid(rlocator.relNumber)")

Note that, in non-assert builds, pg_prewarm() just failed with an error
in that case, so there was no ill effect in practice.

This commit fixes the issue by having pg_prewarm() raise an error early
if the specified object has no storage. This approach is similar to
the fix in commit 4623d7144 for pg_freespacemap.
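
A small sketch of the failing case (requires the pg_prewarm extension;
the wording of the error is an assumption):

    -- partitioned tables have no storage of their own
    CREATE TABLE pt (id int) PARTITION BY RANGE (id);
    SELECT pg_prewarm('pt');  -- now fails cleanly with an error rather
                              -- than an assertion failure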

Back-patched to v17, where the issue was introduced.

Author: Masahiro Ikeda <ikedamsh@oss.nttdata.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Reviewed-by: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/e082e6027610fd0a4091ae6d033aa117@oss.nttdata.com
Backpatch-through: 17
2025-05-29 17:50:32 +09:00
Michael Paquier
c3623703f3 Add AioUringCompletion in wait_event_names.txt
Oversight in c325a7633f, where the LWLock tranche AioUringCompletion
has been added.

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/aDT5sBOxJTdulXnE@paquier.xyz
2025-05-29 13:25:05 +09:00
Bruce Momjian
a1de1b0833 doc PG 18 relnotes: split apart log_connections item
Also add details to asynchronous I/O item.

Reported-by: Melanie Plageman

Discussion: https://postgr.es/m/CAAKRu_YsVvyantS0X0Y_-vp_97=yGaoYJMXXyCEkR7pumAH3Jg@mail.gmail.com
2025-05-28 22:43:36 -04:00
Michael Paquier
35a428f30b pg_stat_statements: Fix parameter number gaps in normalized queries
pg_stat_statements anticipates that certain constant locations may be
recorded multiple times and attempts to avoid calculating a length for
these locations in fill_in_constant_lengths().

However, during generate_normalized_query() where normalized query
strings are generated, these locations are not excluded from
consideration.  This could increment the parameter number counter for
every recorded occurrence at such a location, leading to an incorrect
normalization in certain cases with gaps in the numbers reported.

For example, take this query:

    SELECT WHERE '1' IN ('2'::int, '3'::int::text)

Before this commit, it would be normalized like this, with gaps in the
parameter numbers:

    SELECT WHERE $1 IN ($3::int, $4::int::text)

The correct, less confusing normalization is:

    SELECT WHERE $1 IN ($2::int, $3::int::text)

This commit fixes the computation of the parameter numbers by tracking
the number of constants replaced with a $n in a separate counter,
instead of reusing the iterator that loops through the list of
locations.

The underlying query IDs are not changed, neither are the normalized
strings for existing PGSS hash entries.  New entries with fresh
normalized queries would automatically get reshaped based on the new
parameter numbering.

Issue discovered while discussing a separate problem for HEAD, but this
affects all the stable branches.

Author: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/CAA5RZ0tzxvWXsacGyxrixdhy3tTTDfJQqxyFBRFh31nNHBQ5qA@mail.gmail.com
Backpatch-through: 13
2025-05-29 11:26:03 +09:00
Bruce Momjian
089f27cf8a doc: clarify log_connections new "setup_durations" output
2025-05-28 21:42:34 -04:00
Bruce Momjian
bf6034d00d doc PG 18 relnotes: move ANALYZE item, split ANALYZE/EXPLAIN item
Reported-by: Yugo Nagata

Author: Yugo Nagata

Discussion: https://postgr.es/m/20250528232503.7db770f651c2c821c0e3c1df@sraoss.co.jp
2025-05-28 18:43:31 -04:00
Tom Lane
e5d64fd654 Tighten parsing of datetime input.
ParseFraction only expects to deal with fields that contain a decimal
point and digit(s).  However it's possible in some edge cases for it
to be passed input that doesn't look like that.  In particular the
input could look like a valid floating-point number, such as ".123e6".
strtod() will happily eat that, possibly producing a result that is
not within the expected range 0..1, which can result in integer
overflow in the callers.  That doesn't have any security consequences,
but it's still not very desirable.  Fix by checking that the input
has the expected form.

Similarly, DecodeNumberField only expects to deal with fields that
contain a decimal point and digit(s), but it's sometimes abused to
parse strings that might not look like that.  This could result in
failure to reject bogus input, yielding silly results.  Again, fix
by rejecting input that doesn't look as-expected.  That decision
also means that we can affirmatively answer the very old comment
questioning whether we couldn't save some duplicative code by
using ParseFractionalSecond here.

While these changes should only reject input that nobody would
consider valid, it still doesn't seem like a change to make in
stable branches.  Apply to HEAD only.
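
A hypothetical illustration of the class of input concerned; the exact
literal is an assumption, not a case taken from the report:

    -- a fractional-second field must be a decimal point plus digits;
    -- exponent forms that strtod() would happily eat are now rejected
    SELECT '12:34:56.123e6'::time;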

Reported-by: Evgeniy Gorbanev <gorbanev.es@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/1328335.1748371099@sss.pgh.pa.us
2025-05-28 15:10:48 -04:00
Tom Lane
be86ca103a Fix memory leakage when function compilation fails.
In pl_comp.c, initially create the plpgsql function's cache context
under the assumed-short-lived caller's context, and reparent it under
CacheMemoryContext only upon success.  This avoids a process-lifespan
leak of 8kB or more if the function contains syntax errors.  (This
leakage has existed for a long time without many complaints, but as
we move towards a possibly multi-threaded future, getting rid of
process-lifespan leaks grows more important.)

In funccache.c, arrange to reclaim the CachedFunction struct in case
the language-specific compile callback function throws an error;
previously, that resulted in an independent process-lifespan leak.
This is arguably a new bug in v18, since the leakage now occurred
for SQL-language functions as well as plpgsql.

Also, don't fill fn_xmin/fn_tid/dcallback until after successful
completion of the compile callback.  This avoids a scenario where a
partially-built function cache might appear already valid upon later
inspection, and another scenario where dcallback might fail upon being
presented with an incomplete cache entry.  We would have to reach such
a faulty cache entry via a pre-existing fn_extra pointer, so I'm not
sure these scenarios correspond to any live bug.  (The predecessor
code in pl_comp.c never took any care about this, and we've heard no
complaints about that.)  Still, it's better to be careful.

Given the lack of field complaints, I'm not very excited about
back-patching any of this; but it seems still in-scope for v18.

Discussion: https://postgr.es/m/999171.1748300004@sss.pgh.pa.us
2025-05-28 13:29:45 -04:00
Bruce Momjian
c861092b0e doc PG 18 relnotes: clarify multiplication item
Reported-by: Dean Rasheed

Author: Dean Rasheed

Discussion: https://postgr.es/m/CAEZATCXZGU3LLMZHobYys1MLpyNMAus7+UUpWeeFYwSaPNC2CA@mail.gmail.com
2025-05-28 12:34:11 -04:00
Michael Paquier
4fbb46f612 Adjust regex for test with opening parenthesis in character classes
As written, the test was throwing an error because of an unbalanced
parenthesis.  The regex used in the test is adjusted to not fail and to
test the case of an opening parenthesis in a character class after some
nested square brackets.

Oversight in d46911e584.

Discussion: https://postgr.es/m/16ab039d1af455652bdf4173402ddda145f2c73b.camel@cybertec.at
2025-05-28 09:43:31 +09:00
Michael Paquier
d46911e584 Fix conversion of SIMILAR TO regexes for character classes
The code that translates SIMILAR TO pattern matching expressions to
POSIX-style regular expressions did not consider that square brackets
can be nested.  For example, in an expression like [[:alpha:]%_], the
logic replaced the placeholders '_' and '%' but it should not.

This commit fixes the conversion logic by tracking the nesting level of
square brackets marking character class areas, while considering that
in expressions like []] or [^]] the first closing square bracket is a
regular character.  Multiple tests are added to show how the conversions
should or should not be applied while in a character class area, with
specific cases added for all the characters converted outside character
classes like an opening parenthesis '(', dollar sign '$', etc.
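
A sketch using the pattern shape from the message:

    -- inside a bracket expression, '%' and '_' are literal characters
    -- and must not be translated into POSIX wildcards
    SELECT 'a' SIMILAR TO '[[:alpha:]%_]';  -- true: 'a' is alphabetic
    SELECT '%' SIMILAR TO '[[:alpha:]%_]';  -- true: '%' matches literally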

Author: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/16ab039d1af455652bdf4173402ddda145f2c73b.camel@cybertec.at
Backpatch-through: 13
2025-05-28 08:58:40 +09:00
Bruce Momjian
3e782ca322 doc PG 18 relnotes: add removal details to MD5 item
Reported-by: Nathan Bossart

Author: Nathan Bossart

Discussion: https://postgr.es/m/aDXLoTcBYjfyqeTA@nathan
2025-05-27 17:50:52 -04:00
Bruce Momjian
08b8aa1748 doc PG 18 relnotes: fix markup
Reported-by: Peter Smith

Discussion: https://postgr.es/m/CAHut+PswZ7wFtpNgv3bdtYK5D0eGMpvz4CcnAxvj7gR_acazGQ@mail.gmail.com
2025-05-27 17:34:45 -04:00
Jeff Davis
34eb2a80d5 Change pg_dump default for statistics export.
Set the default behavior of pg_dump and pg_dumpall to be
--no-statistics.

Leave the default for pg_restore and pg_upgrade to be
--with-statistics.

Discussion: https://postgr.es/m/CA+TgmoZ9=RnWcCOZiKYYjZs_AW1P4QXCw--h4dOLLHuf1Omung@mail.gmail.com
Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
2025-05-27 13:54:38 -07:00
Masahiko Sawada
4c08ecd161 Fix assertion when decrementing eager scanning success and failure counters.
Previously, we asserted that the eager scan's success and failure
counters were positive before decrementing them. However, this
assumption was incorrect, as it's possible that some blocks have
already been eagerly scanned by the time eager scanning is disabled.

This commit replaces the assertions with guards to handle this
scenario gracefully.

With this change, we continue to allow read-ahead operations by the
read stream that exceed the success and failure caps. While there is a
possibility that overruns will trigger eager scans of additional
pages, this does not pose a practical concern as the overruns will not
be substantial and remain within an acceptable range.

Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CAD21AoConf6tkVCv-=JhQJj56kYsDwo4jG5+WqgT+ukSkYomSQ@mail.gmail.com
2025-05-27 11:42:36 -07:00
Peter Eisentraut
c53f3b9cc8 Improve file_copy_method entry in postgresql.conf.sample
Improve the wording of the comment a bit, fix whitespace.  Also move
the entry so that the section order is consistent with config.sgml.
2025-05-26 14:52:00 +02:00
Daniel Gustafsson
1f62dbf5f0 doc: Fix wording in JIT README
Remove superfluous 'is' from sentence.

Author: Yugo Nagata <nagata@sraoss.co.jp>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/20250526154412.5f77dfead87af9afc089cc48@sraoss.co.jp
2025-05-26 13:30:01 +02:00
Michael Paquier
52a1df85f2 Fix race condition in subscription TAP test 021_twophase
The test did not wait for all the subscriptions to have caught up when
dropping the subscription "tab_copy".  In a slow environment, it could
be possible for the replay of the COMMIT PREPARED transaction "mygid"
to not be confirmed yet, causing one prepared transaction to be left
around before moving to the next steps of the test.

One failure noticed is a transaction found in pg_prepared_xacts for the
cases where copy_data = false and two_phase = true, but there should be
none after dropping the subscription.

As an extra safety measure, a check is added before dropping the
subscription, scanning pg_prepared_xacts to make sure that no prepared
transactions are left once both subscriptions have caught up.
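
In SQL terms, the added safety check amounts to something like:

    -- should report zero once both subscriptions have caught up
    SELECT count(*) FROM pg_prepared_xacts;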

Issue introduced by a8fd13cab0, fixing a problem similar to
eaf5321c35.

Per buildfarm member kestrel.

Author: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CALDaNm329QaZ+bwU--bW6GjbNSZ8-38cDE8QWofafub7NV67oA@mail.gmail.com
Backpatch-through: 15
2025-05-26 17:28:37 +09:00
Amit Kapila
3bcb554fd2 Doc: Make logical replication examples executable in bulk.
To improve the usability of logical replication examples, we need to
enable bulk copy-pasting of DML/DDL series.

Currently, output command tags and prompts disrupt this workflow. While
prompts are typically removed, converting them to comments is acceptable
here, given the multi-server context.

Additionally, ensure all examples containing operators like < and > are
wrapped in CDATA blocks to guarantee correct rendering and consistency
with other places.

Author: David G. Johnston <david.g.johnston@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Discussion: https://postgr.es/m/CAKFQuwbhbL1uaDTuo9shmo1rA-fX6XGotR7qZQ7rd-ia5ZDoQA@mail.gmail.com
2025-05-26 11:05:05 +05:30
Fujii Masao
47d90b741d doc: Fix documentation for snapshot export in logical decoding.
The documentation for exported snapshots in logical decoding previously
stated that snapshot creation may fail on a hot standby. This is no longer
accurate, as snapshot exporting on standbys has been supported since
PostgreSQL 10. This commit removes the outdated description.

Additionally, the docs referred to the NOEXPORT_SNAPSHOT option to
suppress snapshot exporting in CREATE_REPLICATION_SLOT. However,
since PostgreSQL 15, NOEXPORT_SNAPSHOT is considered legacy syntax
and retained only for backward compatibility. This commit updates
the documentation for v15 and later to use the modern equivalent:
SNAPSHOT 'nothing'. The older syntax is preserved in documentation for
v14 and earlier.

Back-patched to all supported branches.

Reported-by: Kevin K Biju <kevinkbiju@gmail.com>
Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Kevin K Biju <kevinkbiju@gmail.com>
Discussion: https://postgr.es/m/174791480466.798.17122832105389395178@wrigleys.postgresql.org
Backpatch-through: 13
2025-05-26 12:47:33 +09:00
Bruce Momjian
44ce4e1593 doc PG 18 relnotes: clarify btree skip-scan item
Reported-by: Peter Geoghegan

Discussion: https://postgr.es/m/CAH2-Wzko57+sT=FcxHHo7jnPLhh35up_5aAvogLtj_D9bATsgQ@mail.gmail.com
2025-05-23 17:02:33 -04:00
Jacob Champion
a8f093234d oauth: Correct missing comma in Requires.private
I added libcurl to the Requires.private section of libpq.pc in commit
b0635bfda, but I missed that the Autoconf side needs commas added
explicitly. Configurations which used both --with-libcurl and
--with-openssl ended up with the following entry:

    Requires.private: libssl, libcrypto libcurl

The pkg-config parser appears to be fairly lenient in this case, and
accepts the whitespace as an equivalent separator, but let's not rely on
that. Add an add_to_list macro (inspired by Makefile.global's
add_to_path) to build up the PKG_CONFIG_REQUIRES_PRIVATE list correctly.
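
With the fix, the generated entry should read:

    Requires.private: libssl, libcrypto, libcurl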

Reported-by: Wolfgang Walther <walther@technowledgy.de>
Reviewed-by: Fabrízio de Royes Mello <fabriziomello@gmail.com>
Discussion: https://postgr.es/m/CAOYmi+k2z7Rqj5xiWLUT0+bSXLvdE7TYgS5gCOSqSyXyTSSXiQ@mail.gmail.com
2025-05-23 13:05:38 -07:00
Jacob Champion
cbc8fd0c9a oauth: Limit JSON parsing depth in the client
Check the ctx->nested level as we go, to prevent a server from running
the client out of stack space.

The limit we choose when communicating with authorization servers can't
be overly strict, since those servers will continue to add extensions in
their JSON documents which we need to correctly ignore. For the SASL
communication, we can be more conservative, since there are no defined
extensions (and the peer is probably more Postgres code).

Reviewed-by: Aleksander Alekseev <aleksander@timescale.com>
Discussion: https://postgr.es/m/CAOYmi%2Bm71aRUEi0oQE9ciBnBS8xVtMn3CifaPu2kmJzUfhOZgA%40mail.gmail.com
2025-05-23 13:05:33 -07:00
Bruce Momjian
1ca583f6c0 doc PG 18 relnotes: update to current
Includes runtime injection point item by Michael Paquier.

Reported-by: Michael Paquier

Author: Michael Paquier

Discussion: https://postgr.es/m/aDAS0_eWzeGl4sok@paquier.xyz
2025-05-23 16:01:07 -04:00
Tom Lane
02502c1bca Fix per-relation memory leakage in autovacuum.
PgStat_StatTabEntry and AutoVacOpts structs were leaked until
the end of the autovacuum worker's run, which is bad news if
there are a lot of relations in the database.

Note: pfree'ing the PgStat_StatTabEntry structs here seems a bit
risky, because pgstat_fetch_stat_tabentry_ext does not guarantee
anything about whether its result is long-lived.  It appears okay
so long as autovacuum forces PGSTAT_FETCH_CONSISTENCY_NONE, but
I think that API could use a re-think.

Also ensure that the VacuumRelation structure passed to
vacuum() is in recoverable storage.

Back-patch to v15 where we started to manage table statistics
this way.  (The AutoVacOpts leakage is probably older, but
I'm not excited enough to worry about just that part.)

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/285483.1746756246@sss.pgh.pa.us
Backpatch-through: 15
2025-05-23 14:43:43 -04:00
Tom Lane
6aa33afe6d Fix AlignedAllocRealloc to cope sanely with OOM.
If the inner allocation call returns NULL, we should restore the
previous state and return NULL.  Previously this code pfree'd
the old chunk anyway, which is surely wrong.

Also, make it call MemoryContextAllocationFailure rather than
summarily returning NULL.  The fact that we got control back from the
inner call proves that MCXT_ALLOC_NO_OOM was passed, so this change
is just cosmetic, but someday it might be less so.

This is just a latent bug at present: AFAICT no in-core callers use
this function at all, let alone call it with MCXT_ALLOC_NO_OOM.
Still, it's the kind of bug that might bite back-patched code pretty
hard someday, so let's back-patch to v17 where the bug was introduced
(by commit 743112a2e).

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/285483.1746756246@sss.pgh.pa.us
Backpatch-through: 17
2025-05-23 11:47:33 -04:00
Daniel Gustafsson
fb844b9f06 Revert function to get memory context stats for processes
Due to concerns raised about the approach, and memory leaks found in
sensitive contexts, the functionality is reverted.  This reverts
commits 45e7e8ca9, f8c115a6c, d2a1ed172, 55ef7abf8 and 042a66291
for v18, with an intent to revisit this patch for v19.

Discussion: https://postgr.es/m/594293.1747708165@sss.pgh.pa.us
2025-05-23 15:44:54 +02:00
Peter Eisentraut
70a13c528b Move oauth_validator_libraries in postgresql.conf.sample
Move oauth_validator_libraries in postgresql.conf.sample to be grouped
with the other CONN_AUTH_AUTH settings, rather than making up a new
ad-hoc category.  This matches the internal categorization and also
how it is listed in the documentation.
2025-05-23 09:03:09 +02:00
Bruce Momjian
883339c170 doc PG 18 relnotes: adjust CREATE SUBSCRIPTION attribution
Reported-by: vignesh C

Discussion: https://postgr.es/m/CALDaNm0Wy-vJ6dE+e=y=yuq31i2KvGf-Rs-u6QOG4K7TpU_6Tw@mail.gmail.com
2025-05-22 23:02:11 -04:00
Bruce Momjian
7ddfac79f2 doc PG 18 relnotes: clarify btree skip scan item
Reported-by: Peter Geoghegan

Discussion: https://postgr.es/m/CAH2-Wz=2CWXgO1+uyR-VfN3ALMtFnfTtXK-VtkoQQ89ogm=4sg@mail.gmail.com
2025-05-22 22:24:18 -04:00
Bruce Momjian
3b7140d27e doc PG 18 relnotes: remove duplicate commit entry
Item related to btree skip scans.
2025-05-22 21:41:38 -04:00
Tom Lane
b7ab88ddb1 Fix assorted new memory leaks in libpq.
Valgrind'ing the postgres_fdw tests showed me that libpq was leaking
PGconn.be_cancel_key.  It looks like freePGconn is expecting
pqDropServerData to release it ... but in a cancel connection
object, that doesn't happen.

Looking a little closer, I was dismayed to find that freePGconn
also missed freeing the pgservice, min_protocol_version,
max_protocol_version, sslkeylogfile, scram_client_key_binary,
and scram_server_key_binary strings.  There's much less excuse
for those oversights.  Worse, that's from five different commits
(a460251f0, 4b99fed75, 285613c60, 2da74d8d6, 761c79508),
some of them by extremely senior hackers.

Fortunately, all of these are new in v18, so we haven't
shipped any leaky versions of libpq.

While at it, reorder the operations in freePGconn to match the
order of the fields in struct PGconn.  Some of those free's seem
to have been inserted with the aid of a dartboard.
2025-05-22 20:35:32 -04:00
Melanie Plageman
cb1456423d Replace deprecated log_connections values in docs and tests
9219093cab modularized log_connections output to allow more
granular control over which aspects of connection establishment are
logged. It converted the boolean log_connections GUC into a list of strings
and deprecated previously supported boolean-like values on, off, true,
false, 1, 0, yes, and no. Those values still work, but they are
supported mainly for backwards compatibility. As such, documented
examples of log_connections should not use these deprecated values.

Update references in the docs to deprecated log_connections values. Many
of the tests use log_connections. This commit also updates the tests to
use the new values of log_connections. In some of the tests, the updated
log_connections value covers a narrower set of aspects (e.g. the
'authentication' aspect in the tests in src/test/authentication and the
'receipt' aspect in src/test/postmaster). In other cases, the new value
for log_connections is a superset of the previous included aspects (e.g.
'all' in src/test/kerberos/t/001_auth.pl).
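
A sketch of the preferred spelling, using the granular aspect values:

    -- instead of the deprecated boolean-like 'on'
    ALTER SYSTEM SET log_connections = 'receipt,authentication,authorization';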

Reported-by: Peter Eisentraut <peter@eisentraut.org>
Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://postgr.es/m/e1586594-3b69-4aea-87ce-73a7488cdc97%40eisentraut.org
2025-05-22 17:14:54 -04:00
Tom Lane
d376ab570e In ExecInitModifyTable, don't scribble on the source plan.
The code carelessly modified mtstate->ps.plan->targetlist,
which it's not supposed to do.  Fortunately, there's not
really any need to do that because the planner already
set up a perfectly acceptable targetlist for the plan node.
We just need to remove the erroneous assignments and update some
relevant comments.

As it happens, the erroneous assignments caused the targetlist to
point to a different part of the source plan tree, so that there
isn't really a risk of the pointer becoming dangling after executor
termination.  The only visible effect of this change we can find is
that EXPLAIN will show upper references to the ModifyTable's output
expressions using different variables.  Formerly it showed Vars from
the first target relation that survived executor-startup pruning.
Now it always shows such references using the first relation appearing
in the planner output, independently of what happens during executor
pruning.  On the whole that seems like a good thing.

Also make a small tweak in ExplainPreScanNode to ensure that the first
relation will receive a refname assignment in set_rtable_names, even
if it got pruned at startup.  Previously the Vars might be shown
without any table qualification, which is confusing in a multi-table
query.

I considered back-patching this, but since the bug doesn't seem to
have any really terrible consequences in existing branches, it
seems better to not change their EXPLAIN output.  It's not too late
for v18 though, especially since v18 already made other changes in
the EXPLAIN output for these cases.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Author: Andres Freund <andres@anarazel.de>
Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/213261.1747611093@sss.pgh.pa.us
2025-05-22 14:28:51 -04:00
Tom Lane
f24605e2dc Fix memory leak in XMLSERIALIZE(... INDENT).
xmltotext_with_options sometimes tries to replace the existing
root node of a libxml2 document.  In that case xmlDocSetRootElement
will unlink and return the old root node; if we fail to free it,
it's leaked for the remainder of the session.  The amount of memory
at stake is not large, a couple hundred bytes per occurrence, but
that could still become annoying in heavy usage.

Our only other xmlDocSetRootElement call is not at risk because
it's working on a just-created document, but let's modify that
code too to make it clear that it's dependent on that.
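
The call shape that exercises the root-replacement path, as a sketch:

    -- INDENT makes xmltotext_with_options reformat the document, which
    -- may replace the root node of the libxml2 document
    SELECT xmlserialize(DOCUMENT '<a><b/></a>'::xml AS text INDENT);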

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Jim Jones <jim.jones@uni-muenster.de>
Discussion: https://postgr.es/m/1358967.1747858817@sss.pgh.pa.us
Backpatch-through: 16
2025-05-22 13:52:46 -04:00
Nathan Bossart
5d6eac80cd pg_dump: Adjust reltuples from 0 to -1 for dumps of older versions.
Before v14, a reltuples value of 0 was ambiguous: it could either
mean the relation is empty, or it could mean that it hadn't yet
been vacuumed or analyzed.  (Commit 3d351d916b taught v14 and newer
to use -1 for the latter case.)  This ambiguity allegedly can cause
the planner to choose inefficient plans after restoring to v18 or
newer.  To fix, let's just dump reltuples as -1 in that case.  This
will cause some truly empty tables to be seen as not-yet-processed,
but that seems unlikely to cause too much trouble in practice.

Note that we could alternatively teach pg_restore_relation_stats()
to translate reltuples based on the version argument, but since
that function doesn't exist until v18, there's no particular
advantage to that approach.  That is, there's no chance of
restoring stats dumped from a pre-v14 server to another pre-v14
server.  Per discussion, the current policy is to fix pre-v18
behavior differences during export and everything else during
import.

Commit 9879105024 fixed a similar problem for vacuumdb by removing
the check for reltuples != 0.  Presumably we could reinstate that
check now, but I've chosen to leave it in place in case reltuples
isn't accurate.  As before, processing some empty tables seems
relatively harmless.

Author: Hari Krishna Sunder <hari.db.pg@gmail.com>
Reviewed-by: Jeff Davis <pgsql@j-davis.com>
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://postgr.es/m/CAAeiqZ0o2p4SX5_xPcuAbbsmXjg6MJLNuPYSLUjC%3DWh-VeW64A%40mail.gmail.com
2025-05-22 10:23:26 -05:00
Amit Langote
1722d5eb05 Revert "Don't lock partitions pruned by initial pruning"
As pointed out by Tom Lane, the patch introduced fragile and invasive
design around plan invalidation handling when locking of prunable
partitions was deferred from plancache.c to the executor. In
particular, it violated assumptions about CachedPlan immutability and
altered executor APIs in ways that are difficult to justify given the
added complexity and overhead.

This also removes the firstResultRels field added to PlannedStmt in
commit 28317de72, which was intended to support deferred locking of
certain ModifyTable result relations.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/605328.1747710381@sss.pgh.pa.us
2025-05-22 17:02:35 +09:00
Peter Eisentraut
f3622b6476 doc: Move documentation of md5_password_warnings to a better place
Commit db6a4a985b categorized md5_password_warnings as an
authentication setting, and the placement in postgresql.conf.sample
matches that, but in the documentation it ended up under logging
settings, which isn't unreasonable but inconsistent.  This moves the
documentation chunk to authentication settings as well.
2025-05-21 16:29:05 +02:00
Michael Paquier
3d0c3a418f Adjust operation names of pg_aios to match the documentation
pg_aios used the terms "read" and "write" for vectored I/O read and
write operations, respectively.  The documentation refers to them as
"readv" and "writev", and the code internally uses the terms
PGAIO_OP_READV and PGAIO_OP_WRITEV, the "V" standing for "vectored".

This commit adjusts these operation names to match the code and the
documentation.
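
A quick way to observe the adjusted names (a sketch; assumes the view's
operation column as described in the documentation):

    SELECT operation, count(*) FROM pg_aios GROUP BY operation;
    -- now reports 'readv'/'writev' rather than 'read'/'write'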

Oversight in 8e293e689b.

Author: Atsushi Torikoshi <torikoshia@oss.nttdata.com>
Discussion: https://postgr.es/m/6df1e949d1d759ad2767c18e5845963e@oss.nttdata.com
2025-05-21 15:58:03 +09:00
Fujii Masao
0bd762e81f Fix incorrect WAL description for PREPARE TRANSACTION record.
Since commit 8b1dccd37c, the PREPARE TRANSACTION WAL record includes
information about dropped statistics entries. However, the WAL resource
manager description function for PREPARE TRANSACTION record failed to
parse this information correctly and always assumed there were
no such entries.

As a result, for example, pg_waldump could not display the dropped
statistics entries stored in PREPARE TRANSACTION records.

The root cause was that ParsePrepareRecord() did not set the number of
statistics entries to drop on commit or abort. These values remained
zero-initialized and were never updated from the parsed record.

This commit fixes the issue by properly setting those values during parsing.
With this fix, pg_waldump can now correctly report dropped statistics
entries in PREPARE TRANSACTION records.

Back-patch to v15, where commit 8b1dccd37c was introduced.

Author: Daniil Davydov <3danissimo@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAJDiXgh-6Epb2XiJe4uL0zF-cf0_s_7Lw1TfEHDMLzYjEmfGOw@mail.gmail.com
Backpatch-through: 15
2025-05-21 11:55:14 +09:00
Michael Paquier
06450c7b8c Fix regression with location calculation of nested statements
The statement location calculated for some nested query cases was wrong
when multiple queries separated by semicolons are sent as a single
string.  As pointed out by Sami Imseih, the location calculation was
incorrect when the last query of such a nested statement does **NOT**
finish with a semicolon.  In this case, the statement length tracked by
RawStmt is 0, which is equivalent to saying that the string should be
used until its end.  The code previously discarded this case entirely,
causing the location to remain at 0, the same as pointing at the
beginning of the string.  This caused pg_stat_statements to store
incorrect query strings.
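
A minimal sketch of the mishandled shape:

    -- two statements in one string; the last has no trailing semicolon,
    -- so its RawStmt length is 0 ("use the string until its end")
    SELECT 1; SELECT 2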

This issue was introduced in 499edb0974.  I had looked at the diffs
generated by pgaudit back then and noticed the difference generated for
this nested query case, but missed the point that it was an actual
regression in an existing case.  A test case is added in
pg_stat_statements to provide some coverage, restoring the pre-17
behavior for the calculation of the query locations.  Special thanks to
David Steele, who, through an analysis of the test diffs generated by
pgaudit with the new v18 logic, pointed out that my original analysis
of the matter was wrong.

The test output of pg_overexplain is updated to reflect the new logic,
as the new locations refer to the beginning of the argument passed to
the function explain_filter().  When the module was introduced in
8d5ceb113e, which was after 499edb0974 (for the new calculation
method), the locations of the test were not actually right: the plan
generated for the query string given in input of the function pointed to
the top-level query, not the nested one.

Reported-by: David Steele <david@pgbackrest.org>
Author: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Reviewed-by: Jian He <jian.universality@gmail.com>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: David Steele <david@pgbackrest.org>
Discussion: https://postgr.es/m/844a3b38-bbf1-4fb2-9fd6-f58c35c09917@pgbackrest.org
2025-05-21 10:22:12 +09:00
Nathan Bossart
a6060f1cbe pg_dump: Fix array literals in fetchAttributeStats().
Presently, fetchAttributeStats() builds array literals by treating
the elements as SQL identifiers.  This is incorrect for a couple of
reasons:

* Array literal content must match the external text representation
  of the array, i.e., what array_out() would return.  One notable
  problem is that double quotes are escaped with "" in identifiers
  but with \" in array literals.  To fix, build the array content
  using the pre-existing appendPGArray() function.

* Array literals must be written as string constants.  A notable
  problem here is that single quotes are escaped via '' in strings
  but are not escaped in the text representation of an array.  To
  fix, append the aforementioned array literal content to the query
  with appendStringLiteralAH().

While at it, modify a test case to use an identifier that would
cause the test to fail without this change.
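
A sketch of the escaping difference at the heart of the fix:

    -- in an array literal, an embedded double quote is escaped as \"
    SELECT '{"he said \"hi\""}'::text[];
    -- an identifier would instead double it: "he said ""hi"""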

Oversight in commit 9c02e3a986.

Reported-by: Philippe Beaudoin <pbh.emaj@free.fr>
Author: Jian He <jian.universality@gmail.com>
Co-authored-by: Nathan Bossart <nathandbossart@gmail.com>
Co-authored-by: Stepan Neretin <slpmcf@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Bug: #18923
Discussion: https://postgr.es/m/18923-e79273f87c6bed69%40postgresql.org
2025-05-20 16:31:00 -05:00
Heikki Linnakangas
cbf53e2b8a Fix cross-version upgrade test failure
Commit 29f7ce6fe7 added another view that needs adjustment in the
cross-version upgrade test. This should fix the XversionUpgrade
failures in the buildfarm.

Backpatch-through: 16
Discussion: https://www.postgresql.org/message-id/18929-077d6b7093b176e2@postgresql.org
2025-05-20 10:39:14 +03:00
Michael Paquier
54675d8986 doc: Clarify use of _ccnew and _ccold in REINDEX CONCURRENTLY
Invalid indexes are suffixed with "_ccnew" or "_ccold".  The
documentation failed to mention the initial underscore.
ChooseRelationName() may also append an extra number if indexes with a
similar name already exist; let's add a note about that too.
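
One way to spot such leftovers, as a sketch:

    -- invalid indexes left behind by a failed REINDEX CONCURRENTLY
    SELECT indexrelid::regclass FROM pg_index WHERE NOT indisvalid;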

Author: Alec Cozens <acozens@pixelpower.com>
Discussion: https://postgr.es/m/174733277404.1455388.11471370288789479593@wrigleys.postgresql.org
Backpatch-through: 13
2025-05-20 14:39:06 +09:00
Andres Freund
acad909321 aio: Fix possible state confusions due to interrupt processing
elog()/ereport() process interrupts, iff the log message is < ERROR and
the log message will be emitted.  AIO's debug messages are emitted via
ereport(), but in some places the code is not ready for interrupts to
be processed.

Fix the issue using a few different methods:

1) handle interrupts arriving concurrently - in some places it's easy to
   detect that by fetching the handle's generation a bit earlier
2) Check if interrupts made the work needing to be done obsolete
3) Disallow interrupts, as there's no sane way to make interrupt processing
   safe

To prevent some similar issues from being re-introduced, assert that
interrupts are held in pgaio_io_update_state().

This commit also fixes the contents of a debug message I added in 039bfc457e.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/mvpm7ga3dfgz7bvum22hmuz26cariylmcppb3irayftc7bwk3r@l7gb6gr7azhc
2025-05-19 21:07:06 -04:00
Heikki Linnakangas
29f7ce6fe7 Fix deparsing FETCH FIRST <expr> ROWS WITH TIES
In the grammar, <expr> is a c_expr, which accepts only a limited set
of integer literals and simple expressions without parens. The
deparsing logic didn't quite match the grammar rule, and failed to use
parens e.g. for "5::bigint".

To fix, always surround the expression with parens. Would be nice to
omit the parens in simple cases, but unfortunately it's non-trivial to
detect such simple cases. Even if the expression is a simple literal
123 in the original query, after parse analysis it becomes a FuncExpr
with COERCE_IMPLICIT_CAST rather than a simple Const.
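
A sketch of a view whose stored definition needed the parens (names
hypothetical):

    CREATE VIEW v AS
      SELECT g FROM generate_series(1, 10) AS g
      ORDER BY g FETCH FIRST 5 ROWS WITH TIES;
    -- the definition now deparses as FETCH FIRST (5) ROWS WITH TIES,
    -- which round-trips through the grammar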

Reported-by: yonghao lee
Backpatch-through: 13
Discussion: https://www.postgresql.org/message-id/18929-077d6b7093b176e2@postgresql.org
2025-05-19 18:50:26 +03:00
Amit Kapila
ad5eaf390c Don't retreat slot's confirmed_flush LSN.
Prevent moving the confirmed_flush backwards, as this could lead to data
duplication issues caused by replicating already replicated changes.

This can happen when a client acknowledges an LSN it doesn't have to do
anything for, and thus didn't store persistently. After a restart, the
client can send the prior LSN that it stored persistently as an
acknowledgement, but we need to ignore such an LSN to avoid retreating
the confirmed_flush LSN.

Diagnosed-by: Zhijie Hou <houzj.fnst@fujitsu.com>
Author: shveta malik <shveta.malik@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Tested-by: Nisha Moond <nisha.moond412@gmail.com>
Backpatch-through: 13
Discussion: https://postgr.es/m/CAJpy0uDZ29P=BYB1JDWMCh-6wXaNqMwG1u1mB4=10Ly0x7HhwQ@mail.gmail.com
Discussion: https://postgr.es/m/OS0PR01MB57164AB5716AF2E477D53F6F9489A@OS0PR01MB5716.jpnprd01.prod.outlook.com
2025-05-19 12:13:06 +05:30
Tom Lane
f8db5c7a3f Doc: add pre-branch task to run src/tools/copyright.pl.
It's common for some files with last year's copyright date
to sneak into the tree between early January (when we normally run
copyright.pl) and feature freeze.  Immediately before branching
the new release is an ideal time to fix the stragglers, so add a
note about it to the RELEASE_CHANGES checklist.

Discussion: https://postgr.es/m/CALa6HA4_Wu7-2PV0xv-Q84cT8eG7rTx6bdjUV0Pc=McAwkNMfQ@mail.gmail.com
2025-05-18 23:31:44 -04:00
Michael Paquier
2c6469d4cd Fix incorrect year in some copyright notices
A couple of new files have been added in the tree with a copyright year
of 2024 while we were already in 2025.  These should be marked with
2025, so let's fix them.

Reported-by: Shaik Mohammad Mujeeb <mujeeb.sk.dev@gmail.com>
Discussion: https://postgr.es/m/CALa6HA4_Wu7-2PV0xv-Q84cT8eG7rTx6bdjUV0Pc=McAwkNMfQ@mail.gmail.com
2025-05-19 09:46:52 +09:00
Michael Paquier
11b2dc3709 ecpg: Add missing newline in meson.build
Noticed while performing a routine sanity check of the files in the
tree.  Issue introduced by 28f04984f0.

Discussion: https://postgr.es/m/CALa6HA4_Wu7-2PV0xv-Q84cT8eG7rTx6bdjUV0Pc=McAwkNMfQ@mail.gmail.com
2025-05-19 09:44:17 +09:00
Alexander Korotkov
3d3a81fc24 Fix tuple_fraction calculation in generate_orderedappend_paths()
6b94e7a6da adjusted generate_orderedappend_paths() to consider fractional
paths.  However, it didn't manage to interpret the tuple_fraction value
correctly.  According to the header comment of grouping_planner(), the
tuple_fraction >= 1 specifies the absolute number of expected tuples.  That
number must be divided by the expected total number of tuples to get the
actual fraction.
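
For example, a tuple_fraction of 100 with an expected total of 10000
tuples corresponds to an effective fraction of 100 / 10000 = 0.01.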

Even though this is a bug fix, we don't backpatch it.  The risks of the side
effects of plan changes on stable branches are too high.

Reported-by: Andrei Lepikhov <lepihov@gmail.com>
Discussion: https://postgr.es/m/3ca271fa-ca5c-458c-8934-eb148622b270%40gmail.com
Author: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
2025-05-18 23:49:50 +03:00
Tom Lane
12eee85e51 Make our usage of memset_s() conform strictly to the C11 standard.
Per the letter of the C11 standard, one must #define
__STDC_WANT_LIB_EXT1__ as 1 before including <string.h> in order to
have access to memset_s().  It appears that many platforms are lenient
about this, because we weren't doing it and yet the code appeared to
work anyway.  But we now find that with -std=c11, macOS is strict and
doesn't declare memset_s, leading to compile failures since we try to
use it anyway.  (Given the lack of prior reports, perhaps this is new
behavior in the latest SDK?  No matter, we're clearly in the wrong.)

In addition to the immediate problem, which could be fixed merely by
adding the needed #define to explicit_bzero.c, it seems possible that
our configure-time probe for memset_s() could fail in case a platform
implements the function in some odd way due to this spec requirement.
This concern can be fixed in largely the same way that we dealt with
strchrnul() in 6da2ba1d8: switch to using a declaration-based
configure probe instead of a does-it-link probe.

Back-patch to v13 where we started using memset_s().

Reported-by: Lakshmi Narayana Velayudam <dev.narayana.v@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAA4pTnLcKGG78xeOjiBr5yS7ZeE-Rh=FaFQQGOO=nPzA1L8yEA@mail.gmail.com
Backpatch-through: 13
2025-05-18 12:45:55 -04:00
Daniel Gustafsson
0d4dad200d Fix function name reference in comment
Ensure that we refer to the function being used, rather than the
name of the resulting function in question.

Author: Paul A Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CA+renyVZNiHEv5ceKDjA4j5xC6NT6mRuW33BDERBQMi_90_t6A@mail.gmail.com
2025-05-18 10:05:38 +02:00
Daniel Gustafsson
5987553fde Align organization wording in copyright statement
This aligns the copyright and legal notice wording with commit
a233a603ba and pgweb commit 2d764dbc083ab8.  Backpatch down
to all supported versions.

Author: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Dave Page <dpage@pgadmin.org>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/744E414E-3F52-404C-97FB-ED9B3AA37DC8@yesql.se
Backpatch-through: 13
2025-05-16 11:20:07 -04:00
Richard Guo
fe29b2a1da Fix Assert failure in XMLTABLE parser
In an XMLTABLE expression, columns can be marked NOT NULL, and the
parser internally fabricates an option named "is_not_null" to
represent this.  However, the parser also allows users to specify
arbitrary option names.  This creates a conflict: a user can
explicitly use "is_not_null" as an option name and assign it a
non-Boolean value, which violates internal assumptions and triggers an
assertion failure.

To fix, this patch checks whether a user-supplied name collides with
the internally reserved option name and raises an error if so.
Additionally, the internal name is renamed to "__pg__is_not_null" to
further reduce the risk of collision with user-defined names.
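
A hypothetical minimal trigger (the grammar accepts arbitrary column
option names):

    -- supplying a non-Boolean value under the formerly-internal option
    -- name used to hit an assertion; it now raises a plain error
    SELECT * FROM XMLTABLE('/r' PASSING '<r/>'::xml
                           COLUMNS c text PATH 'c' is_not_null 1);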

Reported-by: Евгений Горбанев <gorbanyoves@basealt.ru>
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@kurilemu.de>
Discussion: https://postgr.es/m/6bac9886-65bf-4cec-96bd-e304159f28db@basealt.ru
Backpatch-through: 15
2025-05-15 17:09:04 +09:00
Richard Guo
2c0ed86d39 Add explicit initialization for all PlannerGlobal fields
When creating a new PlannerGlobal node in standard_planner(), most
fields are explicitly initialized, but a few are not.  This doesn't
cause any functional issues, as makeNode() zeroes all fields by
default.  However, the inconsistency is undesirable from a clarity and
maintenance perspective.

This patch explicitly initializes the remaining fields to improve
consistency and readability.

Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CAMbWs4-TgQHNOiouqGcuHoBqbJjWyx4UxGKxUY3FrF4trGbcPA@mail.gmail.com
2025-05-14 09:59:31 +09:00
Daniel Gustafsson
6e289f2d5d Fix order of parameters in POD documentation
The documentation for log_check() had the parameters in the wrong
order.  Also while there, rename %parameters to %params to better match
the documentation of similar functions which use %params.  Backpatch
down to v14, where this was introduced.

Author: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/9F503B5-32F2-45D7-A0AE-952879AD65F1@yesql.se
Backpatch-through: 14
2025-05-13 07:29:14 -04:00
Amit Kapila
8ede692de5 Fix the race condition in the test added by 7c99dc587.
After executing ALTER SUBSCRIPTION tap_sub SET PUBLICATION, we did not
wait for the new walsender process to restart. As a result, an INSERT
executed immediately after the ALTER could be decoded and skipped,
considering it is not part of any subscribed publication. And, the old
apply worker could also confirm the LSN of such an INSERT. This could
cause the replication to resume from a point after the INSERT. In such
cases, we miss the expected warning about the missing publication.

To fix this, ensure the walsender has restarted before continuing after
ALTER SUBSCRIPTION.

Reported-by: Tom Lane as per CI
Author: vignesh C <vignesh21@gmail.com>
Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/1230066.1745992333@sss.pgh.pa.us
2025-05-13 09:54:29 +05:30
Álvaro Herrera
dbf42b84ac Add tab-complete for ALTER DOMAIN ADD [CONSTRAINT]
Add tab-completion of "CHECK (" and "NOT NULL" after ALTER DOMAIN
ADD [CONSTRAINT].

ALTER DOMAIN dom ADD -> CHECK (
ALTER DOMAIN dom ADD -> NOT NULL
ALTER DOMAIN dom ADD -> CONSTRAINT
ALTER DOMAIN dom ADD CONSTRAINT nm -> CHECK (
ALTER DOMAIN dom ADD CONSTRAINT nm -> NOT NULL
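
For reference, the completed forms correspond to statements like the
following (the domain and constraint names here are hypothetical):

SELECT 1; -- see sketch below
ALTER DOMAIN dom ADD CHECK (VALUE > 0);
ALTER DOMAIN dom ADD NOT NULL;
ALTER DOMAIN dom ADD CONSTRAINT dom_positive CHECK (VALUE > 0);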

Author: jian he <jian.universality@gmail.com>
Author: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Discussion: https://postgr.es/m/CACJufxG_f6LzAT_McC-kKmQWpuWnOYKyNBw8Kv3xzTjPqmeHcA@mail.gmail.com
2025-05-11 10:16:45 -04:00
Álvaro Herrera
0588656366
Fix comment of tsquerysend()
The comment describes the order in which fields are sent, and it had one
of the fields in the wrong place.

This has been wrong since e6dbcb72fa (2008), so backpatch all the way
back.

Author: Emre Hasegeli <emre@hasegeli.com>
Discussion: https://postgr.es/m/CAE2gYzzf38bR_R=izhpMxAmqHXKeM5ajkmukh4mNs_oXfxcMCA@mail.gmail.com
2025-05-11 09:47:10 -04:00
Álvaro Herrera
dc9a2d54fd
relcache: Avoid memory leak on tables with no CHECK constraints
As complained about by Valgrind, in commit a379061a22 I failed to
realize that I was causing rd_att->constr->check to become allocated
when no CHECK constraints exist; previously it'd remain NULL.  (This was
my bug, not the mentioned commit author's).  Fix by making the
allocation conditional, and set ->check to NULL if unallocated.

Reported-by: Yasir <yasir.hussain.shah@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/202505082025.57ijx3qrbx7u@alvherre.pgsql
2025-05-11 09:22:12 -04:00
Álvaro Herrera
7b2ad43426
Sort includes in alphabetical order
Added by commit 042a66291b, no backpatch needed.
2025-05-11 09:15:05 -04:00
Tom Lane
d4a7e4e179 Fix incorrect "return NULL" in BumpAllocLarge().
This must be "return MemoryContextAllocationFailure(context, size, flags)"
instead.  The effect of this oversight is that if we got a malloc
failure right here, the code would act as though MCXT_ALLOC_NO_OOM
had been specified, whether it was or not.  That would likely lead
to a null-pointer-dereference crash at the unsuspecting call site.

Noted while messing with a patch to improve our Valgrind leak
detection support.  Back-patch to v17 where this code came in.
2025-05-10 20:22:39 -04:00
Noah Misch
4a4ee0c2c1 Remove GLOBALTABLESPACE_OID assert for locked buffers.
Commit f4ece891fc added the assertion in
an attempt to catch some defects even after VACUUM FULL or REINDEX.
However, IsCatalogTextUniqueIndexOid(tag.relNumber) always returns false
after a relfilenode change, provoking unintended assertion failures.

Reported-by: Adam Guo <adamguo@amazon.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Bug: #18912
Discussion: https://postgr.es/m/18912-a41c9bd0e0ad19b1@postgresql.org
2025-05-10 07:36:27 -07:00
Bruce Momjian
99ddf8615c doc PG 18 relnotes: mv. hash joins and GROUP BY item to General
Reported-by: David Rowley

Discussion: https://postgr.es/m/CAApHDvqJz+Zf7a6abisqoTGottDSRD+YPx=aQSgCsCKD476vGA@mail.gmail.com
2025-05-09 23:40:02 -04:00
Michael Paquier
c259ba881c aio: Use runtime arguments with injections points in tests
This cleans up the code related to the testing infrastructure of AIO
that used injection points, switching the test code to use the new
facility for injection points added by 371f2db8b0 rather than ad-hoc
tweaks to pass and reset the arguments given to the callbacks.

This removes all the dependencies to USE_INJECTION_POINTS in the AIO
code.  pgaio_io_call_inj(), pgaio_inj_io_get() and pgaio_inj_cur_handle
are now gone.

Reviewed-by: Greg Burd <greg@burd.me>
Discussion: https://postgr.es/m/Z_y9TtnXubvYAApS@paquier.xyz
2025-05-10 12:36:57 +09:00
Michael Paquier
36e5fda632 injection_points: Add support and tests for runtime arguments
This commit provides some test coverage for the runtime arguments of
injection points, for both INJECTION_POINT_CACHED() and
INJECTION_POINT(), as extended in 371f2db8b0.

The SQL functions injection_points_cached() and injection_points_run()
are extended so that it is possible to pass an optional string value
to them.
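
As a hedged sketch of the extended interface (the point name and the
'notice' action here are made up for illustration):

-- attach a callback, then run the point with a runtime argument
SELECT injection_points_attach('my-point', 'notice');
SELECT injection_points_run('my-point', 'runtime-arg');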

Reviewed-by: Greg Burd <greg@burd.me>
Discussion: https://postgr.es/m/Z_y9TtnXubvYAApS@paquier.xyz
2025-05-10 07:40:25 +09:00
Michael Paquier
371f2db8b0 Add support for runtime arguments in injection points
The macros INJECTION_POINT() and INJECTION_POINT_CACHED() are extended
with an optional argument that can be passed down to the callback
attached when an injection point is run, giving to callbacks the
possibility to manipulate a stack state given by the caller.  The
existing callbacks in modules injection_points and test_aio have their
declarations adjusted based on that.

da7226993f (core AIO infrastructure) and 93bc3d75d8 (test_aio) had
been relying on a set of workarounds where a static variable called
pgaio_inj_cur_handle is used as the runtime argument in the injection point
callbacks used by the AIO tests, in combination with a TRY/CATCH block
to reset the argument value.  The infrastructure introduced in this
commit will be reused for the AIO tests, simplifying them.

Reviewed-by: Greg Burd <greg@burd.me>
Discussion: https://postgr.es/m/Z_y9TtnXubvYAApS@paquier.xyz
2025-05-10 06:56:26 +09:00
Bruce Momjian
89372d0aaa doc PG 18 relnotes: fix missing parens for crc32c()
Reported-by: Steven Niu

Discussion: https://postgr.es/m/CABBtG=ejqK58cFWpw3etVZfQfhjC-qOqV+9GQWRnLO+p9wYMbw@mail.gmail.com
2025-05-09 14:16:17 -04:00
Tom Lane
95129709fd Skip RSA-PSS ssl test when using LibreSSL.
Presently, LibreSSL does not have working support for RSA-PSS,
so disable that test.  Per discussion at
https://marc.info/?l=libressl&m=174664225002441&w=2
they do intend to fix this, but it's a ways off yet.

Reported-by: Thomas Munro <thomas.munro@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CA+hUKG+fLqyweHqFSBcErueUVT0vDuSNWui-ySz3+d_APmq7dw@mail.gmail.com
Backpatch-through: 15
2025-05-09 12:29:01 -04:00
Tom Lane
75d73331d0 Hack one ssl test case to pass with current LibreSSL.
With LibreSSL, our test of error logging for cert chain depths > 0
reports the wrong certificate.  This is almost certainly their bug,
not ours, so just tweak the test to accept their answer.

No back-patch needed, since this test case wasn't enabled before
e0f373ee4.

Reported-by: Thomas Munro <thomas.munro@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CA+hUKG+fLqyweHqFSBcErueUVT0vDuSNWui-ySz3+d_APmq7dw@mail.gmail.com
2025-05-09 11:53:51 -04:00
Tom Lane
0aaf69965d Centralize ssl tests' check for whether we're using LibreSSL.
Right now there's only one caller, so that this is merely
an exercise in shoving code from one module to another,
but there will shortly be another one.  It seems better to
avoid having two copies of this highly-subject-to-change test.

Back-patch to v15, where we first introduced some tests that
don't work with LibreSSL.

Reported-by: Thomas Munro <thomas.munro@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CA+hUKG+fLqyweHqFSBcErueUVT0vDuSNWui-ySz3+d_APmq7dw@mail.gmail.com
Backpatch-through: 15
2025-05-09 11:50:33 -04:00
Peter Eisentraut
bc35adee8d doc: Put new options in consistent order on man pages 2025-05-09 09:03:41 +02:00
Heikki Linnakangas
b28c59a6cd Use 'void *' for arbitrary buffers, 'uint8 *' for byte arrays
A 'void *' argument suggests that the caller might pass an arbitrary
struct, which is appropriate for functions like libc's read/write, or
pq_sendbytes(). 'uint8 *' is more appropriate for byte arrays that
have no structure, like the cancellation keys or SCRAM tokens. Some
places used 'char *', but 'uint8 *' is better because 'char *' is
commonly used for null-terminated strings. Change code around SCRAM,
MD5 authentication, and cancellation key handling to follow these
conventions.

Discussion: https://www.postgresql.org/message-id/61be9e31-7b7d-49d5-bc11-721800d89d64@eisentraut.org
2025-05-08 22:01:25 +03:00
Heikki Linnakangas
965213d9c5 Use more mundane 'int' type for cancel key lengths in libpq
The documented max length of a cancel key is 256 bytes, which doesn't
quite fit in uint8. Either way, it seems weird to not just use 'int',
like in commit 0f1433f053 for the backend.

Discussion: https://www.postgresql.org/message-id/61be9e31-7b7d-49d5-bc11-721800d89d64%40eisentraut.org
2025-05-08 22:01:20 +03:00
Bruce Momjian
9d710a1ac0 PG 18 relnotes: adjust RETURNING new/old item
Reported-by: jian he

Discussion: https://postgr.es/m/CACJufxFM1avdwu=OrTx_uMAjTDbFOj1Gp7mnNHOofTVj9QtmRw@mail.gmail.com
2025-05-08 11:11:08 -04:00
Daniel Gustafsson
8fcc648780 doc: Fix title markup for AT TIME ZONE and AT LOCAL
The title for AT TIME ZONE and AT LOCAL was accidentally wrapping the
"and" in the <literal> tag.  Backpatch to v17 where it was introduced
in 97957fdbaa.

Author: Noboru Saito <noborusai@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Tatsuo Ishii <ishii@postgresql.org>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CAAM3qn+7QUWW9R6_YwPKXmky0xGE4n63U3EsxZeWE_QtogeU8g@mail.gmail.com
Backpatch-through: 17
2025-05-08 13:53:16 +02:00
Richard Guo
c06e909c26 Track the number of presorted outer pathkeys in MergePath
When creating an explicit Sort node for the outer path of a mergejoin,
we need to determine the number of presorted keys of the outer path to
decide whether explicit incremental sort can be applied.  Currently,
this is done by repeatedly calling pathkeys_count_contained_in.

This patch caches the number of presorted outer pathkeys in MergePath,
allowing us to save several calls to pathkeys_count_contained_in.  It
can be considered a complement to the changes in commit 828e94c9d.

Reported-by: David Rowley <dgrowleyml@gmail.com>
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/CAApHDvqvBireB_w6x8BN5txdvBEHxVgZBt=rUnpf5ww5P_E_ww@mail.gmail.com
2025-05-08 18:21:32 +09:00
Richard Guo
773db22269 Suppress unnecessary explicit sorting for EPQ mergejoin path
When building a ForeignPath for a joinrel, if there's a possibility
that EvalPlanQual will be executed, we must identify a suitable path
for EPQ checks.  If the outer or inner path of the chosen path is a
ForeignPath representing a pushed-down join, we replace it with its
fdw_outerpath to ensure that the EPQ check path consists entirely of
local joins.

If the chosen path is a MergePath, and its outer or inner path is a
ForeignPath that is not already well enough ordered, the MergePath
will have non-NIL outersortkeys or innersortkeys indicating the
desired ordering to be created by an explicit Sort node.  If we then
replace the outer or inner path with its corresponding fdw_outerpath,
and that path is already sufficiently ordered, we end up in an
inconsistent state: the MergePath has non-NIL outersortkeys or
innersortkeys, and its input path is already properly ordered.  This
inconsistency can result in an Assert failure or the addition of a
redundant Sort node.

To fix, check if the new outer or inner path of a MergePath is already
properly sorted, and set its outersortkeys or innersortkeys to NIL if
so.

Bug: #18902
Reported-by: Nikita Kalinin <n.kalinin@postgrespro.ru>
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/18902-71c1bed2b9f7c46f@postgresql.org
2025-05-08 18:20:18 +09:00
Bruce Momjian
9fef27a83b doc PG 18 relnotes: adjust pg_log_backend_memory_contexts()
Reported-by: David Rowley

Discussion: https://postgr.es/m/CAApHDvrGLBqs_Vm9COMY7uBDvUDMKds7RwC20YjEPf+XRTY9XQ@mail.gmail.com
2025-05-07 21:11:16 -04:00
Bruce Momjian
f8d49aa130 doc PG 18 relnotes: add pg_log_backend_memory_contexts() mention
Now zero-based.

Reported-by: David Rowley

Discussion: https://postgr.es/m/CAApHDvqMfTBdfwc0Z-tHXLnBMKJLYEZDApgUzA7x_PUDZsY3GA@mail.gmail.com
2025-05-07 20:36:21 -04:00
Bruce Momjian
69aca072eb doc PG 18 relnotes: adjust pgbench per-script reporting item
Also run src/tools/add_commit_links.pl for a previous commit.

Reported-by: Yugo Nagata

Discussion: https://postgr.es/m/20250507195941.c6e1b48c73f062b727f686a8@sraoss.co.jp
2025-05-07 16:56:26 -04:00
Bruce Momjian
3bd5271729 doc PG 18 relnotes: mention GROUP SET fixes
Reported-by: Richard Guo

Discussion: https://postgr.es/m/CAMbWs4_asKPqTCt0h9pp=zHc9vmPcnczbHeF6Xkxn1LhLapcTQ@mail.gmail.com
2025-05-07 16:39:49 -04:00
Nathan Bossart
16bf24e0e4 Remove pg_replication_origin's TOAST table.
A few places that access this catalog don't set up an active
snapshot before potentially accessing its TOAST table.  However,
roname (the replication origin name) is the only varlena column, so
this is only a problem if the name requires out-of-line storage.
This commit removes its TOAST table to avoid needing to set up a
snapshot.  It also places a limit on replication origin names so
that attempts to set long names will fail with a more user-friendly
error.  The chosen limit of 512 bytes should be sufficient to
avoid "row is too big" errors independent of BLCKSZ, but it should
also be lenient enough for all reasonable use-cases.
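
A minimal sketch of the new user-facing behavior (the exact error text
is assumed, not quoted from the patch):

-- names beyond the 512-byte limit are now rejected up front
SELECT pg_replication_origin_create(repeat('x', 600));
-- ERROR:  replication origin name is too long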

Bumps catversion.

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Euler Taveira <euler@eulerto.com>
Reviewed-by: Nisha Moond <nisha.moond412@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan
2025-05-07 14:47:36 -05:00
Peter Geoghegan
5f4d98d4f3 Prevent premature nbtree array advancement.
nbtree array index scans could fail to return matching tuples in rare
cases where the missed tuples cover key space that the scan's arrays
incorrectly indicate has already been read.  These cases involved nearby
tuples with NULL values that were evaluated using a skip array key while
in pstate.forcenonrequired mode.

To fix, prevent forcenonrequired mode from prematurely advancing the
scan's array keys beyond key space that the scan has yet to read tuples
from: reset the scan's array keys (to the first elements in the current
scan direction) before the _bt_checkkeys call for pstate.finaltup.  That
way _bt_checkkeys starts from a clean slate, which ensures that it will
call _bt_advance_array_keys (while passing it sktrig_required=true).
This reliably restores the invariant that the scan's arrays always
accurately track its progress through the index's key space (at least
when the scan is "between pages").

Oversight in commit 8a510275, which optimized nbtree search scan key
comparisons.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://postgr.es/m/CAH2-WzmodSE+gpTd1CRGU9ez8ytyyDS+Kns2r9NzgUp1s56kpw@mail.gmail.com
2025-05-07 15:20:42 -04:00
Peter Geoghegan
7e25c9363a nbtree: tighten up array recheck rules.
Be more conservative when performing a scheduled recheck of an nbtree
scan's array keys once on the next page, having set so->scanBehind: back
out of reading the page (perform another primitive scan instead) when
the next page's high key/finaltup has an untruncated prefix of matching
values and truncated suffix attributes associated with lower-order keys.
In other words, stop assuming that the lower-order keys have been
satisfied by the truncated suffix attributes in this context (only do so
when considering scheduling a recheck within _bt_advance_array_keys).

The new behavior is more logical: if the next page read after setting
so->scanBehind can only contain tuples that are themselves "behind the
scan", that's reason enough to cut our losses.  In general, when we set
so->scanBehind, we only expect to perform one recheck on the next page
to make a final decision about whether or not to continue the current
primitive index scan.  It seems unprincipled for the recheck to allow a
_bt_readpage to continue unless the scan's arrays will advance/unless
the page might actually contain relevant tuples.

In practice it is highly unlikely that things will line up like this
(the untruncated prefix of attribute values from the next page's high
key is seldom an exact match for their corresponding array's current
element following array advancement on the original/previous page).
That gives us all the more reason to keep things simple and consistent.

This was arguably an oversight in commit 9a2e2a285a, which improved
nbtree array primitive scan scheduling.

Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-WzkXzJajgyW-pCQ7vaDPhaT3huU+Zw_j448rpCBEsu2YOQ@mail.gmail.com
2025-05-07 15:17:40 -04:00
Nathan Bossart
acea3fc49f pg_dumpall: Add --sequence-data.
I recently added this option to pg_dump, but I forgot to add it to
pg_dumpall, too.  There's probably little use for it at the moment,
but we will need it if/when we teach pg_upgrade to use pg_dumpall
to dump the database schemas.

Oversight in commit 9c49f0e8cd.
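
A hypothetical invocation, mirroring the existing pg_dump option:

pg_dumpall --sequence-data -f all.sql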

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/aBE8rHFo922xQUwh%40nathan
2025-05-07 13:36:51 -05:00
Alexander Korotkov
ab42d643c1 Refactor ChangeVarNodesExtended() using the custom callback
fc069a3a63 implemented Self-Join Elimination (SJE) and put the related
logic into ChangeVarNodes_walker().  This commit refactors that code to
remove the SJE-related logic from ChangeVarNodes_walker() and instead
adds a custom callback to ChangeVarNodesExtended(), which gets a chance
to process a node before ChangeVarNodes_walker().  Passing this callback
to ChangeVarNodesExtended() allows the SJE-related node handling to be
kept within analyzejoins.c.

Reported-by: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/CAMbWs49PE3CvnV8vrQ0Dr%3DHqgZZmX0tdNbzVNJxqc8yg-8kDQQ%40mail.gmail.com
Author: Andrei Lepikhov <lepihov@gmail.com>
Author: Alexander Korotkov <aekorotkov@gmail.com>
2025-05-07 11:10:16 +03:00
Peter Eisentraut
2448c7a9e0 doc: Put some psql documentation pieces back into alphabetical order 2025-05-07 08:23:44 +02:00
Peter Eisentraut
c0cf282551 Remove some tabs in C string literals 2025-05-07 08:23:44 +02:00
Peter Eisentraut
c11bd5f500 doc: Add link to table
Formal tables should generally have an xref in the text that points to
them.  Add them here.
2025-05-07 08:23:44 +02:00
Peter Eisentraut
a2c6d84acd doc: Fix up spacing around verbatim DocBook elements 2025-05-07 08:23:44 +02:00
Michael Paquier
c4c236ab5c Fix some comments related to IO workers
IO workers are treated as auxiliary processes.  The comments fixed in
this commit stated that there could be only one auxiliary process of
each BackendType at the same time.  This is not true for IO workers, as
up to MAX_IO_WORKERS of them can co-exist at the same time.

Author: Cédric Villemain <Cedric.Villemain@data-bene.io>
Co-authored-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/e4a3ac45-abce-4b58-a043-b4a31cd11113@Data-Bene.io
2025-05-07 14:55:57 +09:00
Peter Eisentraut
09a47c68e2 Fix whitespace 2025-05-07 07:01:03 +02:00
Bruce Momjian
b560ce7884 doc PG 18 relnotes: adjust partition planning item
Reported-by: David Rowley

Discussion: https://postgr.es/m/CAApHDvqgK7uqPZAwxsfBiFhvBHHB0txaUxhUrdwG4d5Mik_RnA@mail.gmail.com
2025-05-06 21:15:44 -04:00
Bruce Momjian
ada78f9bef doc PG 18 relnotes: small adjustments regarding options
Reported-by: jian he

Discussion: https://postgr.es/m/CACJufxH1jo=hv77AK0HUJYBBMuPmr6+JT+8g-yovuJmHUPGOZQ@mail.gmail.com
2025-05-06 17:17:46 -04:00
Bruce Momjian
575f6003ed doc PG 18 relnotes: move partition locking item to General Perf
Reported-by: Amit Langote

Discussion: https://postgr.es/m/CA+HiwqE+8Pui_NCCC7zgacnet0Cf3tc_vU+P=nhLDES-8xuCUw@mail.gmail.com
2025-05-06 16:03:56 -04:00
Bruce Momjian
45750c6cfe doc PG 18 relnotes: adjust partition items
Reported-by: David Rowley

Discussion: https://postgr.es/m/CAApHDvo+BrVTXMBPjNXBTnAovJWN9+-dYc0kN7rSDqdNvpggZQ@mail.gmail.com
2025-05-06 15:45:03 -04:00
Tom Lane
caa76b91a6 Stamp 18beta1. 2025-05-05 16:25:46 -04:00
Bruce Momjian
c0e6aace02 doc PG 18 relnotes: reword OAuth item
Reported-by: Jacob Champion

Discussion: https://postgr.es/m/CAOYmi+mEQOqBSJas5V5t__b+6h_MLxyy3JFrVJEq638fnNxi0A@mail.gmail.com
2025-05-05 15:42:03 -04:00
Bruce Momjian
0de2e1c8b5 doc PG 18 relnotes: add mention of pg_stat_reset_backend_stats()
This is for WAL statistics.

Reported-by: Bertrand Drouvot

Discussion: https://postgr.es/m/aBjGlj+Yi++fVRQt@ip-10-97-1-34.eu-west-3.compute.internal
2025-05-05 14:56:58 -04:00
Bruce Momjian
092e72a930 doc PG 18 relnotes: adjust hash item
Reported-by: David Rowley

Discussion: https://postgr.es/m/CAApHDvrNmGncNgZMh2oBG5K-+4d1LGJgzrz7180OcHRT1VFojw@mail.gmail.com
2025-05-05 12:30:35 -04:00
Bruce Momjian
cf847d6340 doc PG 18 relnotes: split partition optimizer item into two
Reported-by: David Rowley

Discussion: https://postgr.es/m/CAApHDvohfoJ0D9eiUuVyHU_kq2Y7A_jAjWVsUt0Fm7Gw1Q=1cQ@mail.gmail.com
2025-05-05 11:59:56 -04:00
Noah Misch
627acc3caa With GB18030, prevent SIGSEGV from reading past end of allocation.
With GB18030 as source encoding, applications could crash the server via
SQL functions convert() or convert_from().  Applications themselves
could crash after passing unterminated GB18030 input to libpq functions
PQescapeLiteral(), PQescapeIdentifier(), PQescapeStringConn(), or
PQescapeString().  Extension code could crash by passing unterminated
GB18030 input to jsonapi.h functions.  All those functions have been
intended to handle untrusted, unterminated input safely.

A crash required allocating the input such that the last byte of the
allocation was the last byte of a virtual memory page.  Some malloc()
implementations take measures against that, making the SIGSEGV hard to
reach.  Back-patch to v13 (all supported versions).

Author: Noah Misch <noah@leadboat.com>
Author: Andres Freund <andres@anarazel.de>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Backpatch-through: 13
Security: CVE-2025-4207
2025-05-05 04:52:04 -07:00
Noah Misch
5be213caaa Refactor test_escape.c for additional ways of testing.
Start the file with static functions not specific to pe_test_vectors
tests.  This way, new tests can use them without disrupting the file's
layout.  Change report_result() PQExpBuffer arguments to plain strings.
Back-patch to v13 (all supported versions), for the next commit.

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Backpatch-through: 13
Security: CVE-2025-4207
2025-05-05 04:52:04 -07:00
Peter Eisentraut
18c4fff640 Translation updates
Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: f90ee4803c30491e5c49996b973b8a30de47bfb2
2025-05-05 12:04:49 +02:00
Bruce Momjian
b3754dcc9f doc PG 18 relnotes: adjust COPY and REJECT_LIMIT items
Reported-by: Atsushi Torikoshi

Discussion: https://postgr.es/m/CAM6-o=CEF6tKAjtGMEOd45YySwNRXPu8d_zyYq=fhnia9hOU6Q@mail.gmail.com
2025-05-04 22:37:20 -04:00
Bruce Momjian
d83981c24b doc PG 18 relnotes: move and clarify constraint items
Reported-by: Álvaro Herrera

Discussion: https://postgr.es/m/202505041135.cpo7zgdcya2u@alvherre.pgsql
2025-05-04 22:08:20 -04:00
Bruce Momjian
8c9eec540d doc PG 18 relnotes: add commit for cancel key and protocol neg.
Reported-by: Jelte Fennema-Nio

Discussion: https://postgr.es/m/CAGECzQQehQrhkNNXvLiBgE3odBbTPG=9PzV8F4Oqq3kOorK0Sw@mail.gmail.com
2025-05-04 21:44:39 -04:00
Bruce Momjian
a675149e87 doc PG 18 relnotes: fix libpq wording
Reported-by: Jelte Fennema-Nio

Discussion: https://postgr.es/m/CAGECzQT4804OLOP+nDBxDpMw3Soq=g+fKOE7NryBHggy4GgEcg@mail.gmail.com
2025-05-03 18:50:03 -04:00
Alexander Korotkov
2782f3b845 Revert "Refactor ChangeVarNodesExtended() using the custom callback"
This reverts commit 250a718aad.
It shouldn't have been pushed during the release freeze.

Reported-by: Tom Lane
Discussion: https://postgr.es/m/E1uBIbY-000owH-0O%40gemulon.postgresql.org
2025-05-03 22:42:05 +03:00
Alexander Korotkov
250a718aad Refactor ChangeVarNodesExtended() using the custom callback
fc069a3a63 implemented Self-Join Elimination (SJE) and put the related
logic into ChangeVarNodes_walker().  This commit refactors that code to
remove the SJE-related logic from ChangeVarNodes_walker() and instead
adds a custom callback to ChangeVarNodesExtended(), which gets a chance
to process a node before ChangeVarNodes_walker().  Passing this callback
to ChangeVarNodesExtended() allows the SJE-related node handling to be
kept within analyzejoins.c.

Reported-by: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/CAMbWs49PE3CvnV8vrQ0Dr%3DHqgZZmX0tdNbzVNJxqc8yg-8kDQQ%40mail.gmail.com
Author: Andrei Lepikhov <lepihov@gmail.com>
Author: Alexander Korotkov <aekorotkov@gmail.com>
2025-05-03 22:30:52 +03:00
Bruce Momjian
fb21ed6c38 doc: update guidelines on non-ASCII characters in docs 2025-05-03 14:45:26 -04:00
Bruce Momjian
24987c6f06 doc PG 18 relnotes: add GROUP BY column elimination item
With a nod to PG 9.6.

Reported-by: jian he

Discussion: https://postgr.es/m/CACJufxEqs=EXZETwtaOooTFhZrtxvSWg8M2uPfzjNtS3wQ6Dzw@mail.gmail.com
2025-05-03 12:57:18 -04:00
Bruce Momjian
04b269da56 doc PG 18 relnotes: move protocol version item to "server"
Reported-by: Jelte Fennema-Nio

Discussion: https://postgr.es/m/CAGECzQSTBgTsDJPxOHWKo7106-YnnYQGzpzNJdis+xTKGUhu2g@mail.gmail.com
2025-05-03 12:19:54 -04:00
Etsuro Fujita
5201bba266 Fix memory allocation/copy mistakes.
The previous code was allocating more memory and copying more data than
necessary because it specified the wrong PgStat_KindInfo member as the
size argument for MemoryContextAlloc and memcpy, respectively.

Although these issues exist since 5891c7a8e, there have been no reports
from the field.  So for now, it seems sufficient to fix them in master.

Author: Etsuro Fujita <etsuro.fujita@gmail.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Gurjeet Singh <gurjeet@singh.im>
Discussion: https://postgr.es/m/CAPmGK15eTRCZTnfgQ4EuBNo%3DQLYGFEbXS_7m2dXqtkcT7L8qrQ%40mail.gmail.com
2025-05-03 20:00:00 +09:00
Etsuro Fujita
6e91b9c16f Fix typos in comments.
Also adjust the phrasing in the comments.

Author: Etsuro Fujita <etsuro.fujita@gmail.com>
Author: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Gurjeet Singh <gurjeet@singh.im>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CAPmGK17%3DPHSDZ%2B0G6jcj12buyyE1bQQc3sbp1Wxri7tODT-SDw%40mail.gmail.com
Backpatch-through: 15
2025-05-03 19:10:00 +09:00
Bruce Momjian
9fd989ff99 doc PG 18 relnotes: update chapter tags for recent commit 2025-05-02 20:10:10 -04:00
Bruce Momjian
9f8fcadb20 doc PG 18 relnotes: adjust libpq trace & protocol version items
Reported-by: Jelte Fennema-Nio

Discussion: https://postgr.es/m/CAGECzQQj0r_JX38fa-_kepp9UaMzCcujRAYaJG2+fPks1b8MVg@mail.gmail.com
2025-05-02 20:09:12 -04:00
Bruce Momjian
aa82ebdc29 doc PG 18 relnotes: reword and reorder items
Also move ssl_groups to a more appropriate section.

Reported-by: Jacob Champion (ssl_groups item)

Discussion: https://postgr.es/m/CAOYmi+k_zpGaDOrwV46_j-O-a_hSWxcXM6h8vccq45Y28deP-g@mail.gmail.com
2025-05-02 19:59:17 -04:00
Peter Geoghegan
0f08df4068 Avoid treating nonrequired nbtree keys as required.
Consistently prevent nbtree array advancement from treating a scankey as
required when operating in pstate.forcenonrequired mode.  Otherwise, we
risk a NULL pointer dereference.  This was possible in the path where
_bt_check_compare is called to recheck a tuple that advanced all of the
scan's arrays to matching values: its continuescan=false handling
expects _bt_advance_array_keys to have been called with a valid pstate,
but it'll always be NULL during sktrig_required=false calls (which is
how _bt_advance_array_keys must be called when pstate.forcenonrequired
is set).

Oversight in commit 8a510275, which optimized nbtree search scan key
comparisons.

Author: Peter Geoghegan <pg@bowt.ie>
Reported-By: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://postgr.es/m/CAHgHdKsn2W=gPBmj7p6MjQFvxB+zZDBkwTSg0o3f5Hh8rkRrsA@mail.gmail.com
Discussion: https://postgr.es/m/CAH2-WzmodSE+gpTd1CRGU9ez8ytyyDS+Kns2r9NzgUp1s56kpw@mail.gmail.com
2025-05-02 17:50:58 -04:00
Tomas Vondra
1681a70df3 Fix memory leak in _gin_parallel_merge
To insert the merged GIN entries in _gin_parallel_merge, the leader
calls ginEntryInsert(). This may allocate memory, e.g. for a new leaf
tuple. This was allocated in the PortalContext, and kept until the end
of the index build. For most GIN indexes the amount of leaked memory is
negligible, but for custom opclasses with large keys it may cause OOMs.

Fixed by calling ginEntryInsert() in a temporary memory context, reset
after each insert. Other ginEntryInsert() callers do this too, except
that the context is reset after batches of inserts. More frequent resets
don't seem to hurt performance; they may even help a bit.

Report and fix by Vinod Sridharan.

Author: Vinod Sridharan <vsridh90@gmail.com>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CAFMdLD4p0VBd8JG=Nbi=BKv6rzFAiGJ_sXSFrw-2tNmNZFO5Kg@mail.gmail.com
2025-05-02 23:05:18 +02:00
Tom Lane
e83a8ae447 Don't use a tuplestore if we don't have to for SQL-language functions.
We only need a tuplestore if we're actually going to accumulate
multiple result tuples.  Obviously then we don't need one for non-set-
returning functions; but even a SRF doesn't need one if we decide to
use "lazyEval" (one row at a time) mode.  In these cases, it's
sufficient to use the junkfilter's result slot to hold the single row
that's due to be returned.  We just need to "materialize" that slot
to ensure it holds onto the data past shutdown of the sub-executor.

The original intent of this patch was partially to save a few cycles
(by not putting tuples into a tuplestore only to pull them back out
immediately), but mostly to ensure that we don't use a tuplestore
in non-set-returning functions.  That's because I had concerns
about whether a tuplestore is safe to keep across queries,
which was possible for functions invoked via long-lived FmgrInfos
such as those kept in the typcache.  There are no cases where SRFs
are called that way, so getting rid of the tuplestore in non-SRFs
should make things safer.

However, it emerges that running fmgr_sql in a short-lived context
(as 595d1efed made it do) makes the existing coding unsafe anyway:
we can end up with a long-lived TupleTableSlot holding a freeable
reference to a short-lived tuple, resulting in a double-free crash.
Not trying to pull tuples out of the tuplestore using that slot
dodges the problem, so I'm going to commit this now rather than
invent a band-aid solution for v18.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/2443532.1744919968@sss.pgh.pa.us
Discussion: https://postgr.es/m/9f975803-1a1c-4f21-b987-f572e110e860@gmail.com
2025-05-02 16:16:20 -04:00
Álvaro Herrera
c83a38758d
Handle self-referencing FKs correctly in partitioned tables
For self-referencing foreign keys in partitioned tables, we weren't
handling creation of pg_constraint rows during CREATE TABLE PARTITION AS
as well as ALTER TABLE ATTACH PARTITION.  This is an old bug -- mostly,
we broke this in 614a406b4f while trying to fix it (so 12.13, 13.9,
14.6 and 15.0 and up all behave incorrectly).  This commit reverts part
of that with additional fixes for full correctness, and installs more
tests to verify the parts we broke, not just the catalog contents but
also the user-visible behavior.

Backpatch to all live branches.  In branches 13 and 14, commit
46a8c27a7226 changed the behavior during DETACH to drop a FK
constraint rather than trying to repair it, because the complete fix of
repairing catalog constraints was problematic due to lack of previous
fixes.  For this reason, the test behavior in those branches is a bit
different.  However, as best as I can tell, the fix works correctly
there.

In release notes we have to recommend that all self-referencing foreign
keys on partitioned tables be recreated if partitions have been created
or attached after the FK was created, keeping in mind that violating
rows might already be present on the referencing side.
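
A hedged sketch of that recommended repair (all table, column, and
constraint names are hypothetical, and it assumes the referencing rows
have been checked first):

ALTER TABLE part_tab DROP CONSTRAINT part_tab_parent_id_fkey;
ALTER TABLE part_tab ADD CONSTRAINT part_tab_parent_id_fkey
  FOREIGN KEY (parent_id) REFERENCES part_tab (id);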

Reported-by: Guillaume Lelarge <guillaume@lelarge.info>
Reported-by: Matthew Gabeler-Lee <fastcat@gmail.com>
Reported-by: Luca Vallisa <luca.vallisa@gmail.com>
Discussion: https://postgr.es/m/CAECtzeWHCA+6tTcm2Oh2+g7fURUJpLZb-=pRXgeWJ-Pi+VU=_w@mail.gmail.com
Discussion: https://postgr.es/m/18156-a44bc7096f0683e6@postgresql.org
Discussion: https://postgr.es/m/CAAT=myvsiF-Attja5DcWoUWh21R12R-sfXECY2-3ynt8kaOqjw@mail.gmail.com
2025-05-02 21:25:50 +02:00
Tom Lane
ac557793d4 Doc: correct spelling of meson switch.
It's --auto-features not --auto_features.

Reported-by: Egor Chindyaskin <kyzevan23@mail.ru>
Discussion: https://postgr.es/m/172465652540.862882.17808523044292761256@wrigleys.postgresql.org
Discussion: https://postgr.es/m/1979661.1746212726@sss.pgh.pa.us
Backpatch-through: 16
2025-05-02 15:12:49 -04:00
Jacob Champion
3db68212a3 oauth: Correct SSL dependency for libpq-oauth.a
libpq-oauth.a includes libpq-int.h, which includes OpenSSL headers. The
Autoconf side picks up the necessary include directories via CPPFLAGS,
but Meson needs the dependency to be made explicit.

Reported-by: Nathan Bossart <nathandbossart@gmail.com>
Tested-by: Nathan Bossart <nathandbossart@gmail.com>
Tested-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/aBTgjDfrdOZmaPgv%40nathan
2025-05-02 10:45:12 -07:00
Peter Eisentraut
81eaaa2c41 Make "directory" setting work with extension_control_path
The extension_control_path setting (commit 4f7f7b0375) did not
support extensions that set a custom "directory" setting in their
control file.  Very few extensions use that, and during the discussion
of the previous commit it was suggested to maybe remove that
functionality.  But a fix was easier than initially thought, so this
just adds that support.  The fix is to use control->control_dir as the
share dir when returning the path of the extension script files.

To make this work more sensibly overall, the directory suffix
"extension" is no longer to be included in the extension_control_path
value.  To quote the patch, it would be

-extension_control_path = '/usr/local/share/postgresql/extension:/home/my_project/share/extension:$system'
+extension_control_path = '/usr/local/share/postgresql:/home/my_project/share:$system'

During review of the initial patch, there was some discussion on which
of these two approaches would be better, and the committed patch was a
50/50 decision.  But the support for the "directory" setting pushed it
the other way, and it also seems that many people didn't like the
previous behavior much.
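
For context, a hypothetical control file using the "directory" setting
that this commit now supports (all names invented for illustration):

# my_ext.control
comment = 'example extension'
default_version = '1.0'
directory = 'my_ext'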

Author: Matheus Alcantara <mths.dev@pm.me>
Reviewed-by: Christoph Berg <myon@debian.org>
Reviewed-by: David E. Wheeler <david@justatheory.com>
Discussion: https://www.postgresql.org/message-id/flat/aAi1VACxhjMhjFnb%40msg.df7cb.de#0cdf7b7d727cc593b029650daa3c4fbc
2025-05-02 16:35:48 +02:00
Bruce Momjian
a724c7889f doc: first draft of the PG 18 release notes 2025-05-01 22:36:58 -04:00
Noah Misch
c6a26e4ccd Doc: stop implying recommendation of insecure search_path value.
SQL "SET search_path = 'pg_catalog, pg_temp'" is silently equivalent to
"SET search_path = pg_temp, pg_catalog, "pg_catalog, pg_temp"" instead
of the intended "SET search_path = pg_catalog, pg_temp".  (The intent
was a two-element search path.  With the single quotes, it instead
specifies one element with a comma and a space in the middle of the
element.)  In addition to the SET statement, this affects SET clauses of
CREATE FUNCTION, ALTER ROLE, and ALTER DATABASE.  It does not affect the
set_config() SQL function.
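
A minimal illustration of the pitfall described above:

-- intended: a two-element search path
SET search_path = pg_catalog, pg_temp;

-- unsafe: the quotes make this a single element named
-- "pg_catalog, pg_temp", which resolves as
-- pg_temp, pg_catalog, "pg_catalog, pg_temp"
SET search_path = 'pg_catalog, pg_temp';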

Though the documentation did not show an insecure command, remove single
quotes that could entice a reader to write an insecure command.
Back-patch to v13 (all supported versions).

Reported-by: Sven Klemm <sven@timescale.com>
Author: Sven Klemm <sven@timescale.com>
Backpatch-through: 13
2025-05-01 16:51:59 -07:00
Peter Eisentraut
0064020680 doc: Flesh out extension docs for the "prefix" make variable
The variable is a bit magical in how it requires "postgresql" or
"pgsql" to be part of the path, and files end up in its "share" and
"lib" subdirectories.  So mention all that and show an example of
setting "extension_control_path" and "dynamic_library_path" to use
those locations.

Author: David E. Wheeler <david@justatheory.com>
Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com>
Reviewed-by: Christoph Berg <myon@debian.org>
Discussion: https://www.postgresql.org/message-id/6B5BF07B-8A21-48E3-858C-1DC22F3A28B4@justatheory.com
2025-05-01 22:23:52 +02:00
Jacob Champion
4ea1254f35 oauth: Fix Autoconf build on macOS
Oversight in b0635bfda. -lintl is necessary for gettext on Mac, which
libpq-oauth depends on via pgport/pgcommon. (I'd incorrectly removed
this change from an earlier version of the patch, where it was suggested
by Peter Eisentraut.)

Per buildfarm member indri.
2025-05-01 12:35:52 -07:00
Jacob Champion
b0635bfda0 oauth: Move the builtin flow into a separate module
The additional packaging footprint of the OAuth Curl dependency, as well
as the existence of libcurl in the address space even if OAuth isn't
ever used by a client, has raised some concerns. Split off this
dependency into a separate loadable module called libpq-oauth.

When configured using --with-libcurl, libpq.so searches for this new
module via dlopen(). End users may choose not to install the libpq-oauth
module, in which case the default flow is disabled.

For static applications using libpq.a, the libpq-oauth staticlib is a
mandatory link-time dependency for --with-libcurl builds. libpq.pc has
been updated accordingly.

The default flow relies on some libpq internals. Some of these can be
safely duplicated (such as the SIGPIPE handlers), but others need to be
shared between libpq and libpq-oauth for thread-safety. To avoid
exporting these internals to all libpq clients forever, these
dependencies are instead injected from the libpq side via an
initialization function. This also lets libpq communicate the offsets of
PGconn struct members to libpq-oauth, so that we can function without
crashing if the module on the search path came from a different build of
Postgres. (A minor-version upgrade could swap the libpq-oauth module out
from under a long-running libpq client before it does its first load of
the OAuth flow.)

This ABI is considered "private". The module has no SONAME or version
symlinks, and it's named libpq-oauth-<major>.so to avoid mixing and
matching across Postgres versions. (Future improvements may promote this
"OAuth flow plugin" to a first-class concept, at which point we would
need a public API to replace this anyway.)

Additionally, NLS support for error messages in b3f0be788a was
incomplete, because the new error macros weren't being scanned by
xgettext. Fix that now.

Per request from Tom Lane and Bruce Momjian. Based on an initial patch
by Daniel Gustafsson, who also contributed docs changes. The "bare"
dlopen() concept came from Thomas Munro. Many people reviewed the design
and implementation; thank you!

Co-authored-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Christoph Berg <myon@debian.org>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Wolfgang Walther <walther@technowledgy.de>
Discussion: https://postgr.es/m/641687.1742360249%40sss.pgh.pa.us
2025-05-01 09:14:30 -07:00
Nathan Bossart
a3ef0b570c Remove extra "not" in pg_upgrade documentation.
Oversight in commit cb45dc3afb.

Reported-by: Erik Rijkers <er@xs4all.nl>
Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Discussion: https://postgr.es/m/7b856277-62ad-80f0-36e1-a134ec3c9cab%40xs4all.nl
2025-05-01 09:31:36 -05:00
Dean Rasheed
d73d4cfdfc doc: Warn that ts_headline() output is not HTML-safe.
Add a documentation warning to ts_headline() pointing out that, when
working with untrusted input documents, the output is not guaranteed
to be safe for direct inclusion in web pages. This is because, while
it does remove some XML tags from the input, it doesn't remove all
HTML markup, and so the result may be unsafe (e.g., it might permit
XSS attacks).

To guard against that, all HTML markup should be removed from the
input, making it plain text, or the output should be passed through an
HTML sanitizer.

In addition, document precisely what the default text search parser
recognises as valid XML tags, since that's what determines which XML
tags ts_headline() will remove.
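
A hedged sketch of the hazard (the table and column are hypothetical):

-- ts_headline() strips only what the default parser recognizes as XML
-- tags, so other markup in untrusted input may survive into the output;
-- sanitize the input or the result before placing it in a web page
SELECT ts_headline('english', body_text,
                   to_tsquery('english', 'postgres'))
FROM documents;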

Reported-by: Richard Neill <richard.neill@telos.digital>
Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Noah Misch <noah@leadboat.com>
Backpatch-through: 13
2025-05-01 11:03:43 +01:00
Peter Eisentraut
06c4f3ae80 doc: Improve explanations when a table rewrite is needed
Further improvement for commit 11bd831860.  That commit confused
identity and generated columns; fix that.  Also, virtual generated
columns have since been added; add more details about that.  Also some
small rewordings and reformattings to further improve clarity.

Reviewed-by: Robert Treat <rob@xzilla.net>
Discussion: https://postgr.es/m/00e6eb5f5c793b8ef722252c7a519c9a@oss.nttdata.com
2025-05-01 08:57:48 +02:00
Peter Geoghegan
9d924dbb37 Adjust overstrong nbtree skip array assertion.
Make an nbtree array preprocessing assertion account for scans that add
fewer skip arrays than initially expected due to preprocessing finding
an unsatisfiable array qual.

Oversight in commit 92fe23d9.

Author: Peter Geoghegan <pg@bowt.ie>
Reported-By: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://postgr.es/m/CAHgHdKtQMhHy5qcB3KqCcGiW-Rp8P7KzUFRa9ZMKUiv6zen7LQ@mail.gmail.com
2025-04-30 23:15:51 -04:00
Michael Paquier
92ee8a4df5 doc: Mention cost-based delays for total_[auto]{vacuum,analyze}_time
30a6ed0ce4 added four attributes to pg_stat_all_tables to track the
cumulative time spent in [auto]vacuum and [auto]analyze.  It was not
mentioned that the vacuum cost-based delays are included in these
numbers, which could be confusing now that the delays are included in
the vacuum progress view (bb8dff9995).

This commit adds an extra note about this matter.
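
A small sketch of reading these columns (names expanded from the
pattern in the commit subject):

SELECT relname,
       total_vacuum_time, total_autovacuum_time,
       total_analyze_time, total_autoanalyze_time
FROM pg_stat_all_tables
ORDER BY total_autovacuum_time DESC;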

Reported-by: Magnus Hagander <magnus@hagander.net>
Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/CABUevEz9v1ZNToPyD98JnWDGZgG=SmPZKkSNzU9hXQ-nGTQF0g@mail.gmail.com
2025-05-01 08:52:19 +09:00
Daniel Gustafsson
45e7e8ca9e Convert strncpy to strlcpy
We try to avoid using strncpy() due to the ease with which it can
be misused.  Convert this callsite to use strlcpy() instead to
match similar codepaths in this file.

Suggested-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/2a796830-de2d-4030-b480-d673f6cc5d94@eisentraut.org
2025-04-30 23:00:47 +02:00
Nathan Bossart
2d6745a66b doc: Add missing reference to track_cost_delay_timing.
Oversight in commit bb8dff9995.
2025-04-30 14:45:54 -05:00
Nathan Bossart
9879105024 vacuumdb: Don't skip empty relations in --missing-stats-only mode.
Presently, --missing-stats-only skips relations with reltuples set
to 0 because empty relations don't get optimizer statistics.
However, before v14, a reltuples value of 0 was ambiguous: it could
either mean the relation is empty, or it could mean that it hadn't
yet been vacuumed or analyzed.  (Commit 3d351d916b taught v14 and
newer to use -1 for the latter case.)  This ambiguity can cause
--missing-stats-only to inadvertently skip relations that need
optimizer statistics after upgrades to v18 and newer (since
reltuples is now transferred from the old cluster).

To fix, simply remove the check for reltuples != 0.  This will
cause --missing-stats-only to analyze some empty tables, but that
doesn't seem too terrible a trade-off.

Reported-by: Christoph Berg <myon@debian.org>
Reviewed-by: Christoph Berg <myon@debian.org>
Discussion: https://postgr.es/m/aAjyvW5_fRGNr7yF%40msg.df7cb.de
2025-04-30 14:12:59 -05:00
Nathan Bossart
d5f1b6a75b Further adjust guidance for running vacuumdb after pg_upgrade.
Since pg_upgrade does not transfer the cumulative statistics used
to trigger autovacuum and autoanalyze, the server may take much
longer than expected to process them post-upgrade.  Currently, we
recommend analyzing only relations for which optimizer statistics
were not transferred by using the --analyze-in-stages and
--missing-stats-only options.  This commit appends another
recommendation to analyze all relations to update the relevant
cumulative statistics by using the --analyze-only option.  This is
similar to the recommendation for pg_stat_reset().
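
A hedged sketch of the combined post-upgrade recipe described above:

# first, rebuild only the missing optimizer statistics
vacuumdb --all --analyze-in-stages --missing-stats-only
# then, analyze everything to update the relevant cumulative statistics
vacuumdb --all --analyze-only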

Reported-by: Christoph Berg <myon@debian.org>
Reviewed-by: Christoph Berg <myon@debian.org>
Discussion: https://postgr.es/m/aAfxfKC82B9NvJDj%40msg.df7cb.de
2025-04-30 14:12:59 -05:00
Nathan Bossart
f60420cff6 doc: Alphabetize long options for pg_dump[all].
The current ordering strategy for these pages is to list the short
options in alphabetical order followed by the long options in
alphabetical order.  If an option has both a short variant and a
long variant, the short variant takes precedence.  This commit
moves a few recently added options to match this style.  We should
probably adjust all pages and --help output to list the long and
short options in one combined alphabetical list (with the long
variants taking precedence), but that is a much larger change, so
it is left as a future exercise.

Oversights in commits a5cf808be5, 1fd1bd8710, and bde2fb797a.

Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/aBFBtsltgu3-IU1d%40nathan
2025-04-30 13:07:51 -05:00
Tom Lane
368c3fbf9d Update time zone data files to tzdata release 2025b.
DST law changes in Chile: there is a new time zone America/Coyhaique
for Chile's Aysén Region, to account for it changing to UTC-03
year-round and thus diverging from America/Santiago.

Historical corrections for Iran.

Backpatch-through: 13
2025-04-30 11:13:49 -04:00
Daniel Gustafsson
f8c115a6cb Typo and doc fixups for memory context reporting
This fixes comment and docs typos and makes a small documentation
change for clarity.  Found via post-commit review.

Author: Rahila Syed <rahilasyed90@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CAH2L28vt16C9xTuK+K7QZvtA3kCNWXOEiT=gEekUw3Xxp9LVQw@mail.gmail.com
2025-04-30 11:10:27 +02:00
Daniel Gustafsson
d2a1ed1727 Add missing string terminator
When copying the string, strncpy() won't add nul termination since
the string length is equal to the length specified.  Explicitly
set a nul terminator after copying to properly terminate it. Found
via post-commit review.

Author: Rahila Syed <rahilasyed90@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CAH2L28vt16C9xTuK+K7QZvtA3kCNWXOEiT=gEekUw3Xxp9LVQw@mail.gmail.com
2025-04-30 10:34:08 +02:00
David Rowley
991407ae86 Add 918e7287e to .git-blame-ignore-revs 2025-04-30 19:27:56 +12:00
David Rowley
918e7287ed Fix broken indentation
I forgot to run pgindent in d8555e522.

Reported-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Discussion: https://postgr.es/m/156083c9-eac0-418d-9667-92dec4d6d6cd@oss.nttdata.com
2025-04-30 19:18:30 +12:00
David Rowley
d8555e522e Fix a couple of comment typos
Author: Junwang Zhao <zhjwpku@gmail.com>
Discussion: https://postgr.es/m/CAEG8a3+MRwDKc4YSFKKPKq7Y+vMufVC5u94wM5KZPB2CbgCxnQ@mail.gmail.com
2025-04-30 13:40:46 +12:00
Tom Lane
810a8b1c80 Give up on running with NetBSD/OpenBSD's default semaphore settings.
This reverts commit 38da053463, which
attempted to preserve our ability to start with only 60 semaphores.

Subsequent changes (particularly 55b454d0e) have put that idea pretty
much permanently out of reach: people wishing to use Postgres v18 on
OpenBSD or NetBSD will have no choice but to increase those platforms'
default values of SEMMNI and SEMMNS.

Hence, revert 38da05346's changes in SEMAS_PER_SET and the minimum
tested value of max_connections.  Adjust a comment from the subsequent
patch 6d0154196, and tweak the wording in runtime.sgml to make it
clear that changing SEMMNI/SEMMNS is no longer even a little bit
optional on these platforms.
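
For illustration, raising the limits on OpenBSD might look like this
(the sysctl names follow its seminfo interface; the values are
placeholders that depend on max_connections):

# /etc/sysctl.conf
kern.seminfo.semmni=100
kern.seminfo.semmns=2048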

Although 38da05346 was later back-patched into v17, leave that branch
alone: it's still capable of starting with 60 semaphores, and there's
no reason to break that.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://postgr.es/m/E1tuZNv-0037Gs-34@gemulon.postgresql.org
Discussion: https://postgr.es/m/1052019.1745947915@sss.pgh.pa.us
2025-04-29 17:27:52 -04:00
Jacob Champion
e974f1c216 oauth: Classify oauth_client_secret as a password
Tell UIs to hide the value of oauth_client_secret, like the other
passwords. Due to the previous commit, this does not affect postgres_fdw
and dblink, but add a comment to try to warn others of the hazard in the
future.

Reported-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/20250415191435.55.nmisch%40google.com
2025-04-29 13:08:55 -07:00
Jacob Champion
d2e7d2a09d oauth: Disallow OAuth connections via postgres_fdw/dblink
A subsequent commit will reclassify oauth_client_secret from dispchar=""
to dispchar="*", so that UIs will treat it like a secret. For our FDWs,
this change will move that option from SERVER to USER MAPPING, which we
need to avoid.

But upon further discussion, we don't really want our FDWs to use our
builtin Device Authorization flow at all, for several reasons:

- the URL and code would be printed to the server logs, not sent over
  the client connection
- tokens are not cached/refreshed, so every single connection has to be
  manually authorized by a user with a browser
- oauth_client_secret needs to belong to the foreign server, but options
  on SERVER are publicly accessible
- all non-superusers would need password_required=false, which is
  dangerous

Future OAuth work can use FDWs as a motivating use case. But for now,
disallow all oauth_* connection options for these two extensions.

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/20250415191435.55.nmisch%40google.com
2025-04-29 13:08:24 -07:00
Jacob Champion
45363fca63 Bump the minimum supported Python version to 3.6.8
Python 3.2 is no longer tested by the buildfarm, and there are only a
handful of buildfarm animals running versions older than 3.6, which
itself went end-of-life in 2021. Python 3.6.8 is the default version
shipped in RHEL8, so that seems like a reasonable baseline for PG18.

Now that we use the Python Limited API as of 0793ab810, older versions
of Python should continue functioning for users of PL/Python in
particular, so soften the language from "required" to "supported".

Wording by Tom Lane. Separate from the review of the patch itself,
several people provided input on the choice of cutoff: Christoph Berg,
Devrim Gündüz, Florents Tselai, Jelte Fennema-Nio, and Renan Alves
Fonseca. Thank you!

Suggested-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/16098.1745079444%40sss.pgh.pa.us
2025-04-29 13:04:19 -07:00
Peter Eisentraut
eec34099c3 Fix whitespace typo in string 2025-04-29 19:16:11 +02:00
Nathan Bossart
2b49492eda initdb: Do not report default autovacuum_worker_slots.
Commit 6d01541960 taught initdb to lower the default value of
autovacuum_worker_slots for systems with very few semaphores.  It
also added a "fake" report for the chosen value, i.e., initdb
prints a message about selecting the default, but the value was
already selected in a previous test.  Per discussion, this is not a
precedent we want to set, and it seems unnecessary to report
everything derived from max_connections, so let's remove the "fake"
report.

Reported-by: Peter Eisentraut <peter@eisentraut.org>
Suggested-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/de722583-4ba4-4063-bc41-e20684978116%40eisentraut.org
2025-04-29 11:41:42 -05:00
Bruce Momjian
faced8e6a4 doc: adjust max_files_per_process again
Reported-by: Andres Freund

Discussion: https://postgr.es/m/5yqochswkulckuzzrwgv2nqdrfh4k4coc4uwq4lvgzkfwnbjbd@46igbiwjabn2
2025-04-29 10:30:08 -04:00
Bruce Momjian
9a9e60fed3 doc: clarify new behavior of max_files_per_process 2025-04-29 09:45:41 -04:00
Peter Eisentraut
913c60b067 doc: Small example improvement
Add a comment character before a line annotation, so that the query
can be used as presented.

Reported-by: Yaroslav Saburov <y.saburov@gmail.com>
Author: Euler Taveira <euler@eulerto.com>
Reviewed-by: Robert Treat <rob@xzilla.net>
Discussion: https://www.postgresql.org/message-id/flat/174393459040.678.17810152410419444783%40wrigleys.postgresql.org
2025-04-29 14:43:35 +02:00
Alexander Korotkov
2260c7f6d9 Fixes for ChangeVarNodes_walker()
This commit fixes two bugs in the ChangeVarNodes_walker() function.

 * When considering a RestrictInfo, walk down to its clauses based on the
   presence of the relid to be deleted not just in clause_relids but also
   in required_relids.

 * Incrementally adjust num_base_rels based on the change of clause_relids
   instead of recalculating it using clause_relids, which could contain
   outer-join relids.

Reported-by: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/CAMbWs49PE3CvnV8vrQ0Dr%3DHqgZZmX0tdNbzVNJxqc8yg-8kDQQ%40mail.gmail.com
Author: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
2025-04-29 14:34:44 +03:00
Peter Eisentraut
15b1b4dd3f pg_restore: Improve --help synopsis
The --help synopsis should only be one line.  This rephrases the first
line a bit to reflect the new functionality of restoring multiple
databases from pg_dumpall output.  Additional explanations are better
kept in the man page.
2025-04-29 11:32:49 +02:00
Peter Eisentraut
dadc58f50a pg_restore: Put new option in consistent order in --help output
Also make the description a bit more consistent with similar options.
2025-04-29 10:59:05 +02:00
Amit Kapila
3ff2a1f0c9 Fix assertion failure during decoding from synced slots.
The slot synchronization skips updating the confirmed_flush LSN of the
local slot if the local slot has a newer catalog_xmin or restart_lsn, but
still allows updating the two_phase and two_phase_at fields of the slot.
This opens up a window for prepared transactions between the old
confirmed_flush LSN and two_phase_at to unexpectedly get decoded and
sent to the downstream after promotion. Then, while decoding the
COMMIT PREPARED, an assert will fail that expects the prepare not to
have been sent to the downstream yet.

The fix is to skip updating the other slot fields when we skip updating
the confirmed_flush LSN of the slot.

We didn't backpatch this commit as two_phase_at was not synced in back
branches, which means prepared transactions won't be unexpectedly sent to
downstream.

We discovered this problem while analyzing BF failure reported in the
discussion link.

Reliably reproducing this issue without a debugger is difficult. Given
its rarity, adding specific injection point to test it doesn't seem
worthwhile, so we won't be adding a dedicated test case.

Author: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: shveta malik <shveta.malik@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/OS0PR01MB5716B44052000EB91EFAE60E94BC2@OS0PR01MB5716.jpnprd01.prod.outlook.com
2025-04-29 12:52:05 +05:30
Peter Eisentraut
ef1811ac9a pg_verifybackup: Message style improvements 2025-04-29 09:19:15 +02:00
Peter Eisentraut
c893245ec3 test_slru: Fix incorrect format placeholders
Before commit a0ed19e0a9 there was a cast around these; the cast
inadvertently changed the signedness, but that happened to make the
format placeholders correct.  Commit a0ed19e0a9 removed the casts, so
now the format placeholders have the wrong signedness.
2025-04-29 09:09:00 +02:00
Amit Kapila
9807617a92 Doc: Specify the interaction of publish_generated_columns with column list.
Author: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CAHut+PtnjLiNFFh-3f9cXH0wnwqjdkTjQNbVmZdZ1y+zKt_PPg@mail.gmail.com
2025-04-29 09:01:43 +05:30
Melanie Plageman
f132815fd7 Add maintenance_io_concurrency flag to some read stream users
Index vacuuming and [auto]prewarm AIO concurrency should be governed by
maintenance_io_concurrency. As such, pass those read stream users the
READ_STREAM_MAINTENANCE flag which will calculate their read stream
distance with maintenance_io_concurrency instead of
effective_io_concurrency. This was an oversight in the original commits
making those operations use the read stream API.

Discussion: https://postgr.es/m/flat/CAAKRu_aopDxTo4b41Mt_7Zc-z0_ngocrY8SFCCY6Aph1HgwuNw%40mail.gmail.com
2025-04-28 14:19:45 -04:00
Peter Geoghegan
ce72e7e02e Fix obsolete nbtree array advancement comment.
Checking whether another primitive scan is required after all, once the
next leaf page is read, was moved from _bt_checkkeys to its _bt_readpage
caller by commit 9a2e2a28.  Update a comment that incorrectly described the
recheck mechanism as something that takes place in _bt_checkkeys.

Also fix an older typo in related code comments.
2025-04-28 12:49:17 -04:00
Peter Geoghegan
b75fedcab7 Make NULL tuple values always advance skip arrays.
_bt_check_compare neglected to handle a case that can arise when the
scan's keys are temporarily treated as nonrequired, as an optimization:
whenever a NULL tuple value was encountered that had a skip array whose
current element wasn't already NULL, _bt_check_compare failed to advance
the array to the NULL element.  This allowed _bt_check_compare to fail
to return matching tuples containing a NULL value (though only with an
array column that came before a skip array column with NULLs, and only
during _bt_readpage calls that set pstate.forcenonrequired=true on a
page where the higher-order column also had to advance).

To fix, teach _bt_check_compare to handle this case just like any other
case where a skip array key is unsatisfied and must be advanced directly
(due to the key being considered a nonrequired key).

Oversight in commit 8a510275, which optimized nbtree search scan key
comparisons with skip arrays.

Author: Peter Geoghegan <pg@bowt.ie>
Reported-By: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://postgr.es/m/CAHgHdKtLFWZcjr87hMH0hYDHgcifu4Tj7iHz-xh8qsJREt5cqA@mail.gmail.com
2025-04-28 12:11:08 -04:00
Álvaro Herrera
0e13b13d26 Fix pg_dump for inherited validated not-null constraints
When a child constraint is validated and the parent constraint it
derives from isn't, pg_dump must be coerced into printing the child
constraint; failing to do so would result in a dump that restores the
constraint as not valid, which would be incorrect.

Co-authored-by: jian he <jian.universality@gmail.com>
Co-authored-by: Álvaro Herrera <alvherre@kurilemu.de>
Reported-by: jian he <jian.universality@gmail.com>
Message-id: https://postgr.es/m/CACJufxGHNNMc0E2JphUqJMzD3=bwRSuAEVBF5ekgkG8uY0Q3hg@mail.gmail.com
2025-04-28 16:25:06 +02:00
Peter Eisentraut
c061000311 pg_combinebackup: Message style improvements 2025-04-28 14:26:49 +02:00
Alexander Korotkov
73e7361376 Restore comments in ChangeVarNodesExtended()
This commit restores comments in ChangeVarNodesExtended(), which were
accidentally removed by fc069a3a63.

Reported-by: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/CAMbWs49PE3CvnV8vrQ0Dr%3DHqgZZmX0tdNbzVNJxqc8yg-8kDQQ%40mail.gmail.com
2025-04-28 11:20:22 +03:00
Amit Kapila
aaf9e95e87 Fix xmin advancement during fast_forward decoding.
During logical decoding, we advance the catalog_xmin of the logical slot
too early in fast_forward mode, resulting in required catalog data being
removed by vacuum. This mode is normally used to advance the slot without
processing the changes, but we still can't let the slot's xmin advance to
an incorrect value.

Commit f49a80c481 fixed a similar issue where the logical slot's
catalog_xmin was getting advanced prematurely during non-fast-forward
mode. During xl_running_xacts processing, instead of directly advancing
the slot's xmin to the oldest running xid in the record, it allowed the
xmin to be held back for snapshots that can be used for
not-yet-replayed transactions, as those might consider older txns as
running too. However, it missed the fact that the same problem can happen
during fast_forward mode decoding, as we won't build a base snapshot in
that mode, and a future call to get_changes from the same slot can miss
seeing the required catalog changes, leading to incorrect results.

This commit allows building the base snapshot even in fast_forward mode to
prevent the early advancement of xmin.

Reported-by: Amit Kapila <amit.kapila16@gmail.com>
Author: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: shveta malik <shveta.malik@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Backpatch-through: 13
Discussion: https://postgr.es/m/CAA4eK1LqWncUOqKijiafe+Ypt1gQAQRjctKLMY953J79xDBgAg@mail.gmail.com
Discussion: https://postgr.es/m/OS0PR01MB57163087F86621D44D9A72BF94BB2@OS0PR01MB5716.jpnprd01.prod.outlook.com
2025-04-28 11:35:54 +05:30
Michael Paquier
b225c5e76e Remove circular #include's between wait_event.h and wait_event_types.h
wait_event_types.h is generated by the code, and included wait_event.h.
wait_event.h did the opposite, including wait_event_types.h, causing a
circular dependency between the two.

wait_event_types.h only needs to know about the wait event classes, so
this information is moved into its own file, and wait_event_types.h uses
this new header so that it no longer depends on wait_event.h.

Note that such errors can be found with clang-tidy, with commands like
this one:
clang-tidy source_file.c --checks=misc-header-include-cycle -- \
  -I/install/path/include/ -I/install/path/include/server/

Issue introduced by fa88928470.

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/350192.1745768770@sss.pgh.pa.us
2025-04-28 09:08:15 +09:00
Alexander Korotkov
1aa7cf9eb8 Disallow removing placeholders during Self-Join Elimination.
fc069a3a63 implements Self-Join Elimination (SJE), which can remove base
relations when appropriate.  However, the regression tests for SJE only
cover the case when placeholder variables (PHVs) are evaluated and needed
only in a single base rel.  If this baserel is removed due to SJE, its
clauses, including PHVs, will be transferred to the kept relation.
Removing these PHVs may trigger an error on plan creation -- thanks to
b3ff6c742f for detecting that.

This commit skips the removal of PHVs during SJE.  It might happen that
we thereby keep some PHVs that could have been removed.  However, the
overhead of extra PHVs is small compared to the complexity of the
analysis needed to remove them.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Author: Alena Rybakina <a.rybakina@postgrespro.ru>
Author: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
Reviewed-by: Richard Guo <guofenglinux@gmail.com>
2025-04-28 01:40:42 +03:00
Tom Lane
2f5b056203 Remove inappropriate inclusions of c.h and postgres_fe.h.
Per our usual policy, Postgres header files should not include these;
the decision as to which one to use is to be made in the calling .c
file instead.

These errors aren't particularly new, but I'm not feeling a need
to back-patch these changes; it's mostly just neatnik-ism.
2025-04-27 16:58:57 -04:00
Tom Lane
94b84a6072 Don't use double-quotes in #include's of system headers, redux.
This cleans up some loose ends left by commit e8ca9ed1d.  I hadn't
looked closely enough at these places before, but now I have.

The use of double-quoted #includes for Perl headers in plperl_system.h
seems to be simply a mistake introduced in 6c944bf3c and faithfully
copied forward since then.  (I had thought possibly it was required
by some weird Windows build setup, but there's no evidence of that in
our history.)

The occurrences in SectionMemoryManager.h and SectionMemoryManager.cpp
evidently stem from those files' origin as LLVM code.  It's
understandable that LLVM would treat their own files as needing
double-quoted #includes; but they're still system headers to us.

I also applied the same check to *.c files, and found a few other
random incorrect usages in both directions.

Our ECPG headers and test files routinely use angle brackets to refer
to ECPG headers.  I left those usages alone, since it seems reasonable
for an ECPG user to regard those headers as system headers.
2025-04-27 13:23:19 -04:00
Tom Lane
2311f193ea Remove circular #include's between plpython.h and plpy_util.h.
plpython.h included plpy_util.h, simply on the grounds that "it's
easier to just include it everywhere".  However, plpy_util.h must
include plpython.h, or it won't pass headerscheck.  While the
resulting circularity doesn't have any immediate bad effect,
it's poor design.  We have seen serious messes arise in the past
from overly-broad inclusion footprints created by such circularities,
so let's establish a project policy against it.

To fix, just replace *.c files' inclusions of plpython.h with
plpy_util.h.  They'll pull in plpython.h indirectly; indeed, almost
all have already done so via inclusions of other plpy_xxx.h headers.
(Any extensions using plpython.h can do likewise without breaking
the compatibility of their code with prior Postgres versions.)

Reported-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/aAxQ6fcY5QQV1lo3@ip-10-97-1-34.eu-west-3.compute.internal
2025-04-27 11:43:02 -04:00
Tom Lane
e8ca9ed1d2 Don't use double-quotes in #include's of system headers.
While few if any C compilers will complain about this, it's
inconsistent with our other #include's of the same headers.

There are some other questionable usages in
src/include/jit/SectionMemoryManager.h and
src/pl/plperl/plperl_system.h, but perhaps those have a
reason to be like that.  I can't see that these do.

Noticed while fooling around with a script to do analysis
of our header cross-inclusions.
2025-04-26 20:30:27 -04:00
David Rowley
936457419d Eliminate divide in new fast-path locking code
c4d5cb71d2 adjusted the fast-path locking code to allow some
configuration of the number of fast-path locking slots via the
max_locks_per_transaction GUC.  In that commit the FAST_PATH_REL_GROUP()
macro used integer division to determine the fast-path locking group slot
to use for the lock.

The divisor in this case is always a power-of-two value.  Here we swap
out the divide for a bitwise-AND, which is a significantly faster
operation to perform.
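
For illustration, a minimal C sketch of the transformation; the variable
names are hypothetical, not the committed macro:

    unsigned int group;

    /* this equivalence only holds when the divisor is a power of two */
    group = hashcode % groups_per_backend;          /* before: divide */
    group = hashcode & (groups_per_backend - 1);    /* after: AND */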

In passing, adjust the code that's setting FastPathLockGroupsPerBackend
so that it's clearer that the value being set is a power of two.

Also, adjust some comments in the area which contained some magic
numbers.  It seems better to justify the 1024 upper limit in the
location where the #define is made instead of where it is used.

Author: David Rowley <drowleyml@gmail.com>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CAApHDvodr3bcnpxcs7+k-3cFwYR0tP-BYhyd2PpDhe-bCx9i=g@mail.gmail.com
2025-04-27 11:53:40 +12:00
John Naylor
27757677ca Match parameter in new function to earlier equivalents
Oversight in commit 3c6e8c123.
2025-04-27 03:03:52 +07:00
Bruce Momjian
10e8176950 doc: improve wording of vacuum_max_eager_freeze_failure_rate 2025-04-26 11:41:23 -04:00
Andres Freund
039bfc457e aio: Improve debug logging around waiting for IOs
Trying to investigate a bug report by Alexander Lakhin made it apparent that
the debug logging around waiting for IO completion is insufficient. Fix that.

Discussion: https://postgr.es/m/h4in2db37vepagmi2oz5vvqymjasc5gyb4lpqkunj4eusu274i@37jpd3c2spd3
2025-04-25 13:31:25 -04:00
Andres Freund
500b61769f Fix bug allowing io_combine_limit > io_max_combine_limit
10f6646847 intended to limit the value of io_combine_limit to the minimum of
io_combine_limit and io_max_combine_limit. To avoid issues with interdependent
GUCs, it introduced io_combine_limit_guc and set io_combine_limit in assign
hooks. That plan was thwarted by guc_tables.c accidentally still referencing
io_combine_limit, instead of io_combine_limit_guc.  That led to the GUC
machinery overriding the work done in the assign hooks, potentially leaving
io_combine_limit with too high a value.
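
As an illustration only, a hedged sketch of the intended assign-hook
pattern, using the standard GUC assign-hook signature; the hook name and
body here are assumptions, not the committed code:

    /* io_combine_limit_guc is the user-settable GUC; io_combine_limit
     * is the effective value, capped by io_max_combine_limit. */
    static void
    assign_io_combine_limit(int newval, void *extra)
    {
        io_combine_limit = Min(newval, io_max_combine_limit);
    }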

The consequence of this bug was that when running with io_combine_limit >
io_max_combine_limit, the AIO machinery would not have reserved large enough
iovec and IO data arrays, with one IO's arrays overlapping with another IO's,
leading to total confusion.

To make such a problem easier to detect in the future, add assertions to
pgaio_io_set_handle_data_* checking the length is smaller than
io_max_combine_limit (not just PG_IOV_MAX).

It'd be nice to have a few tests for this, but it's not entirely obvious how
to do so portably.

As remarked upon by Tom, the GUC assignment hooks really shouldn't set the
underlying variable, that's the job of the GUC machinery. Change that as well.

Discussion: https://postgr.es/m/c5jyqnuwrpigd35qe7xdypxsisdjrdba5iw63mhcse4mzjogxo@qdjpv22z763f
2025-04-25 13:31:24 -04:00
Andres Freund
0d9114b704 aio: Fix crash potential for pg_aios views due to late state update
pgaio_io_reclaim() reset the fields in PgAioHandle before updating the state
to IDLE or incrementing the generation. For most things that's OK, but for
pg_get_aios() it is not - if it copied the PgAioHandle while fields were being
reset, we wouldn't detect that and could call
pgaio_io_get_target_description() with ioh->target == PGAIO_TID_INVALID,
leading to a crash.

Fix this issue by incrementing the generation and state earlier, before
resetting.

Also add an assertion to pgaio_io_get_target_description() for the target to
be valid - that'd have made this case a bit easier to debug. While at it,
add/update a few related assertions.

Author: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/062daca9-dfad-4750-9da8-b13388301ad9@gmail.com
2025-04-25 13:31:13 -04:00
Peter Eisentraut
76d52e7165 Fix incorrect format placeholders
Before commit a0ed19e0a9 there was a cast around these; the cast
inadvertently changed the signedness, which happened to make the format
placeholders correct.  Commit a0ed19e0a9 removed the casts, so the
format placeholders then had the wrong signedness.
2025-04-25 16:49:30 +02:00
Peter Eisentraut
385959bdea Fix terminology in comment and message
Should be "bracket" not "brace" for [].
2025-04-25 16:26:28 +02:00
Peter Eisentraut
0787646e1d Small code consistency improvement
Adjust the way the increment operators are placed to be consistent
throughout the function.  Fixup for commit c1da728106.
2025-04-25 13:01:31 +02:00
Amit Kapila
50b8ad30f7 Fix typo in test file name added in commit 4909b38af0.
Author: Shlok Kyal <shlok.kyal.oss@gmail.com>
Backpatch-through: 13
Discussion: https://postgr.es/m/CANhcyEXsObdjkjxEnq10aJumDpa5J6aiPzgTh_w4KCWRYHLw6Q@mail.gmail.com
2025-04-25 12:46:02 +05:30
Fujii Masao
632f62dcec doc: remove unnecessary secondary index terms for replication settings.
Previously, config.sgml included secondary index terms for
max_replication_slots and max_active_replication_origins. These are
no longer necessary, as each parameter now has a single distinct index entry.

The secondary index terms were originally useful because
max_active_replication_origins was part of max_replication_slots,
and separate index entries helped users locate each setting. However,
commit 04ff636cbc split them into independent parameters,
making the secondary terms redundant.

This commit removes the unnecessary secondary index entries to
simplify the documentation.

Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Euler Taveira <euler@eulerto.com>
Reviewed-by: Robert Treat <rob@xzilla.net>
Discussion: https://postgr.es/m/e825e7a7-4877-441d-93c1-25377db36c31@oss.nttdata.com
2025-04-25 14:58:14 +09:00
Bruce Momjian
6389db2320 doc: simplify new EXPLAIN ANALYZE BUFFERS description 2025-04-24 22:02:35 -04:00
Michael Paquier
3631612eae psql: Fix assertion failures with pipeline mode
A correct cocktail of COPY FROM, SELECT and/or DML queries and
\syncpipeline was able to break the logic in charge of discarding
results of a pipeline, done in discardAbortedPipelineResults().  Such a
sequence makes the backend generate a FATAL error, due to a protocol
synchronization loss.

This problem comes down to the fact that we did not consider the case of
libpq returning a PGRES_FATAL_ERROR when discarding the results of an
aborted pipeline.  The discarding code is changed so that this result
status is handled as a special case, with the caller of
discardAbortedPipelineResults() being responsible for consuming the
result.

A couple of tests are added to cover the problems reported, bringing an
interesting gain in coverage as there were no tests in the tree covering
the case of protocol synchronization loss.

Issue introduced by 41625ab8ea.

Reported-by: Alexander Kozhemyakin <a.kozhemyakin@postgrespro.ru>
Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Co-authored-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/ebf6ce77-b180-4d6b-8eab-71f641499ddf@postgrespro.ru
2025-04-24 12:22:53 +09:00
Michael Paquier
923ae50cf5 Add sanity check for dshash entries when reading pgstats file
Not having this check would produce a core dump at startup when running
pgstat_read_statsfile(), in the case where the information of a stats
kind for an entry in the dshash could not be found.  The same check
already happens for fixed-numbered stats and entries that are stored
with their names.  This issue can be seen with custom stats kinds.
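
To illustrate, a hedged sketch of the kind of check added, assuming the
existing pgstat_get_kind_info() lookup; the surrounding code is an
assumption, not the committed change:

    const PgStat_KindInfo *info = pgstat_get_kind_info(kind);

    if (info == NULL)
    {
        elog(WARNING, "invalid stats kind %u in stats file", kind);
        goto error;     /* hypothetical error path */
    }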

Note that this problem can be reproduced with what is in the core code:
- Tweak the test module injection_points to not load the fixed-numbered
stats part, leaving only the variable-numbered stats.
- Create an instance with injection_points defined in
shared_preload_libraries.
- Create a pgstats entry by attaching and running a point.
- Restart the server without shared_preload_libraries.  The startup
process detects that something is wrong and reports a WARNING.

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/aAieZAvM+K1d89R2@ip-10-97-1-34.eu-west-3.compute.internal
2025-04-24 09:20:01 +09:00
Tom Lane
bc19f63f80 Avoid possibly-theoretical OOM crash hazard in hash_create().
One place in hash_create() used DynaHashAlloc() as a convenient
shorthand for MemoryContextAlloc().  That was fine when it was
written, but it stopped being fine when 9c911ec06 changed
DynaHashAlloc() to use MCXT_ALLOC_NO_OOM (mea culpa).  Change
the code to call plain MemoryContextAlloc() as intended.

I think that this bug may be unreachable in practice, since we now
always create AllocSets with some space already allocated, so that
an OOM failure here for a non-shared hash table should be impossible
(with a hash table name of reasonable length anyway).  And there
aren't enough shared hash tables to make a crash for one of those
probable.  Nonetheless it's clearly not operating as designed, so
back-patch to v16 where 9c911ec06 came in.
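
For illustration, a hedged C sketch of why the no-OOM variant is
hazardous here; this is pseudo-context, not the hash_create() code:

    /* MCXT_ALLOC_NO_OOM returns NULL instead of raising an error... */
    char *ptr = MemoryContextAllocExtended(ctx, size, MCXT_ALLOC_NO_OOM);

    strcpy(ptr, name);      /* ...so this can crash if ptr is NULL */

    /* plain MemoryContextAlloc() raises "out of memory" instead */
    char *safe = MemoryContextAlloc(ctx, size);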

Reported-by: Maksim Korotkov <m.korotkov@postgrespro.ru>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/219bdccd460510efaccf90b57e5e5ef2@postgrespro.ru
Backpatch-through: 16
2025-04-23 16:04:55 -04:00
Jacob Champion
005ccae0f2 oauth: Support Python 3.6 in tests
RHEL8 ships a patched 3.6.8 as its base Python version, and I
accidentally let some newer Python-isms creep into oauth_server.py
during development.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl>
Tested-by: Renan Alves Fonseca <renanfonseca@gmail.com>
Tested-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/16098.1745079444%40sss.pgh.pa.us
2025-04-23 11:16:45 -07:00
Alexander Korotkov
bb78e42678 Maintain RelIdToTypeIdCacheHash in TypeCacheOpcCallback()
b85a9d046e introduced a new RelIdToTypeIdCacheHash, whose entries should
exist for typecache entries with TCFLAGS_HAVE_PG_TYPE_DATA flag set or any
of TCFLAGS_OPERATOR_FLAGS set or tupDesc set.  However, TypeCacheOpcCallback(),
which resets TCFLAGS_OPERATOR_FLAGS, neglected to update
RelIdToTypeIdCacheHash.

This commit adds a delete_rel_type_cache_if_needed() call to the
TypeCacheOpcCallback() function to maintain RelIdToTypeIdCacheHash after
resetting TCFLAGS_OPERATOR_FLAGS.
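
For illustration, a hedged sketch of the shape of the fix; only the
function and flag names come from this commit's description, the
surrounding code is an assumption:

    /* in TypeCacheOpcCallback(), after clearing the operator flags */
    typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
    delete_rel_type_cache_if_needed(typentry);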

Also, this commit fixes the name of the delete_rel_type_cache_if_needed()
function in its mentions in the comments.

Reported-by: Noah Misch
Discussion: https://postgr.es/m/20250411203241.e9.nmisch%40google.com
2025-04-23 20:26:52 +03:00
Alexander Korotkov
9f404d7922 Properly prepare varinfos in estimate_multivariate_bucketsize()
To estimate with extended statistics, we need to clear the varnullingrels
field in the expression, and duplicates are not allowed in the GroupVarInfo
list.  We might re-use add_unique_group_var(), but we don't do so for two
reasons.

  1) We must keep the origin_rinfos list ordered exactly the same way as
     varinfos.
  2) add_unique_group_var() is designed for estimate_num_groups(), where a
     larger number of groups is worse.  While estimating the number of hash
     buckets, we have the opposite: a smaller number of groups is worse.
     Therefore, we don't have to remove "known equal" vars: the removed var
     may valuably contribute to the multivariate statistics, growing the
     number of groups.

This commit adds custom code to estimate_multivariate_bucketsize() to
initialize varinfos properly.

Reported-by: Robins Tharakan <tharakan@gmail.com>
Discussion: https://postgr.es/m/18885-da51324078588253%40postgresql.org
Author: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
2025-04-23 20:25:21 +03:00
Tom Lane
3db61db48e Change the names generated for child foreign key constraints.
When a foreign key constraint is placed on a partitioned table, we
actually make two pg_constraint entries associated with that table.
(I have my doubts about the wisdom of that, but it's been like that
since v12 and post-feature-freeze is no time to be messing with such
entrenched decisions.)  The second "child" entry always had a name
generated according to the default rule, "table_column(s)_fkey[nnn]",
even if the primary entry had an unrelated user-specified name.  The
trouble with doing that is that the default name could collide with
the user-specified name of some other constraint on the same table.
While we were willing to adjust the generated name to avoid
collisions, that only helps if it's made second; if it's made first
then creation of the other constraint would fail, potentially causing
dump/reload or pg_upgrade failures.

The core of the problem here is that we're infringing on user
namespace, so I doubt that there's any 100% solution other than to
find a way to not need the "child" entry.  In the meantime, it seems
like it'd be an improvement to make the child's name be the name of
the parent constraint with an underscore and digit(s) appended as
necessary to make it unique.  This rule can in theory fail in the same
way, but it seems much less probable; for one thing, this rule is
guaranteed not to match primary entries having auto-generated names.
(While an auto-generated primary name isn't user-specified to begin
with, it acts like that during dump/reload, so collisions against such
names are definitely possible.)

An additional bonus, visible in some of the regression test cases
that change here, arises from the fact that some error messages
cite the child constraint's name not the parent's.  In the
previous approach the two names could be completely unrelated,
leading to user confusion --- the more so since psql's \d command
hides child constraints.  With this approach it's hopefully much
clearer which constraint-the-user-knows-about is failing.

However, that does mean that there's user-visible behavior change
occurring here, making it seem like not something to back-patch.
I feel it's not too late for v18, though.

Reported-by: Kirill Reshke <reshkekirill@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Alvaro Herrera <alvherre@kurilemu.de>
Discussion: https://postgr.es/m/CALdSSPhGitjpTfzEMJN-Y2x+Q-5QChSxAsmSJ1-E8mQJLkHOqQ@mail.gmail.com
2025-04-23 12:03:02 -04:00
Daniel Gustafsson
994a100b37 Allocate JsonLexContexts on the heap to avoid warnings
The stack-allocated JsonLexContexts, in combination with codepaths
using goto, were causing warnings when compiling with LTO enabled,
as the optimizer is unable to figure out that this is safe.  Rather than
contort the code with workarounds for this, simply heap-allocate the
structs instead, as these are not in any performance-critical paths.

Author: Daniel Gustafsson <daniel@yesql.se>
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/2074634.1744839761@sss.pgh.pa.us
2025-04-23 11:02:05 +02:00
Michael Paquier
0ff95e0a5b psql: Rework TAP routine psql_fails_like() to define WAL sender context
The routine was coded so that a WAL sender was always used, state required
only for one failure test related to START_REPLICATION.  This test is
changed so that a WAL sender is used by passing a replication option to
psql_fails_like(), instead of forcing the use of a WAL sender for all
the tests.

This has come up as useful in the context of a separate bug fix where
we are looking at extending tests for some failure scenarios.  These
tests need to happen in the context of a normal backend, and not a WAL
sender where the extended query protocol cannot be used.

Discussion: https://postgr.es/m/aAXkJIOildLUA7vQ@paquier.xyz
2025-04-23 15:33:07 +09:00
Amit Kapila
0e091ce409 Fix an oversight in 3f28b2fcac.
Commit 3f28b2fcac tried to ensure that the replication origin shouldn't be
advanced in case of an ERROR in the apply worker, so that it can request
the same data again after restart. However, it is possible that an ERROR
was caught and handled by a (say, PL/pgSQL) function, and the apply worker
continues to apply further changes, in which case we shouldn't reset the
replication origin.

Ensure to reset the origin only when the apply worker exits after an
ERROR.

Commit 3f28b2fcac added a new function, geterrlevel, which we removed in
HEAD as part of this commit, but kept in the back branches to avoid
breaking any applications. A separate case can be made to have such a
function even for
HEAD.

Reported-by: Shawn McCoy <shawn.the.mccoy@gmail.com>
Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Backpatch-through: 16, where it was introduced
Discussion: https://postgr.es/m/CALsgZNCGARa2mcYNVTSj9uoPcJo-tPuWUGECReKpNgTpo31_Pw@mail.gmail.com
2025-04-23 11:08:24 +05:30
Michael Paquier
1f7878c33c Remove assertion based on pending_since in pgstat_report_stat()
This assertion, based on pending_since (a timestamp used to prevent stats
reports from being too frequent, or after a partial flush), is reached
when no data can be flushed even though a previous call of
pgstat_report_stat() determined that some stats data was in need of a
flush.  pending_since is set when some stats data is pending (in
non-force mode) or if report attempts are too frequent, and is reset to
0 once all stats have been flushed.

Since 5cbbe70a9cc6, WAL senders have begun to report their stats on a
periodic basis for IO stats in v16~ and backend stats on HEAD, creating
some friction with the concurrent pgstat_report_stat() calls that can
happen in the context of a WAL sender (shutdown callback doing a final
report or backend-related code paths).  This problem is the cause of
spurious failures in the TAP tests.

In theory, this assertion can also be reached in v15, even if that's
very unlikely.  For example, a process, say a background worker, could
do periodic and direct stats flushes with concurrent calls of
pgstat_report_stat() that could cause conflicting values of
pending_since.  This can be done with WAL or SLRU stats flushes using
pgstat_flush_wal() or pgstat_slru_flush().  HEAD makes this situation
easier to happen with custom cumulative stats.

This commit removes the assertion altogether, per discussion, as it is
more useful to keep the state of things as they are for the WAL sender.
The assertion could use a special state based on, for example,
am_walsender, but I doubt that this would be meaningful in the long run
based on the other arguments raised while discussing this issue.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Reported-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/1489124.1744685908@sss.pgh.pa.us
Discussion: https://postgr.es/m/dwrkeszz6czvtkxzr5mqlciy652zau5qqnm3cp5f3p2po74ppk@omg4g3cc6dgq
Backpatch-through: 15
2025-04-23 13:53:29 +09:00
Tom Lane
e0f373ee42 Re-enable SSL connect_fails tests, and fix related race conditions.
Cluster.pm's connect_fails routine has long had the ability to
sniff the postmaster log file for expected messages after a
connection failure.  However, that's always had a race condition:
on some platforms it's possible for psql to exit and the test
script to slurp up the postmaster log before the backend process
has been able to write out its final log messages.  Back in
commit 55828a6b6 we disabled a bunch of tests after discovering
that, and the aim of this patch is to re-enable them.

(The sibling function connect_ok doesn't seem to have a similar
problem, mainly because the messages we look for come out during
the authentication handshake, so that if psql reports successful
connection they should certainly have been emitted already.)

The solution used here is borrowed from 002_connection_limits.pl's
connect_fails_wait routine: set the server's log_min_messages setting
to DEBUG2 so that the postmaster will log child-process exit, and then
wait till we see that log entry before checking for the messages we
are actually interested in.

If a TAP test uses connect_fails' log_like or log_unlike options, and
forgets to set log_min_messages, those connect_fails calls will now
hang until timeout.  Fixing up the existing callers shows that we had
several other TAP tests that were in theory vulnerable to the same
problem.  It's unclear whether the lack of failures is just luck, or
lack of buildfarm coverage, or perhaps there is some obscure timing
effect that only manifests in SSL connections.  In any case, this
change should in principle make those other call sites more robust.
I'm not inclined to back-patch though, unless sometime we observe
an actual failure in one of them.

Reported-by: Andrew Dunstan <andrew@dunslane.net>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/984fca80-85a8-4c6f-a5cc-bb860950b435@dunslane.net
2025-04-22 15:10:50 -04:00
Tom Lane
da83b1ea10 Avoid depending on post-UPDATE row order in float4/float8 tests.
While heapam reproduces the insertion order of rows well, updates
can move rows to varying places depending on autovacuum activity.
In most regression tests we've guarded against getting variable
results due to that, but float4.sql and float8.sql had escaped
notice so far because they update tables that are too small for
autovacuum to pay attention to.

With increasing interest in non-heap table AMs, it seems worth
allowing for update behaviors that are not like heapam's.  Hence,
add ORDER BY to stabilize the results in case the updates put
the rows in a different order.  (We'll continue to assume that a
seqscan will reproduce original insertion order, though.  Removing
that assumption would require vastly-more-invasive test changes.)

Author: Pavel Borisov <pashkin.elfe@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CALT9ZEExHAnBoBVQzQuWPMKUbapF5-FBO3fdeYG3s2tuWQz1NQ@mail.gmail.com
2025-04-22 14:24:21 -04:00
Tom Lane
eaf582806c gen_node_support.pl: improve error message for unclosed struct.
This error message was 'runaway "struct_name"', which isn't all
that clear; I think 'could not find closing brace for "struct_name"'
is better.  Also, provide the location of the struct start using the
script's usual '$file:$lineno' style.

Bug: #18901
Reported-by: Clemens Ruck <clemens.ruck@t-online.de>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/18901-424272abe01357e6@postgresql.org
2025-04-22 13:56:31 -04:00
Michael Paquier
e29df428a1 doc: Mention naming convention used by injection points
All the injection points used in the tree have relied on an implied
rule: their names should be made of lower-case characters, with dashes
between words.

This commit adds a brief mention of that in the docs, encouraging the
practice.

Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Aleksander Alekseev <aleksander@timescale.com>
Discussion: https://postgr.es/m/OSCPR01MB14966E14C1378DEE51FB7B7C5F5B32@OSCPR01MB14966.jpnprd01.prod.outlook.com
Backpatch-through: 17
2025-04-22 12:41:29 +09:00
David Rowley
0b06459f3c Doc: reword text explaining the --maintenance-db option
The previous text was a little clumsy.  Here we improve that.

Author: David Rowley <dgrowleyml@gmail.com>
Reported-by: Noboru Saito <noborusai@gmail.com>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Discussion: https://postgr.es/m/CAAM3qnJtv5YbjpwDfVOYN2gZ9zGSLFM1UGJgptSXmwfifOZJFQ@mail.gmail.com
Backpatch-through: 13
2025-04-22 14:54:22 +12:00
Michael Paquier
02c63f9438 Rename injection point for invalidation messages at end of transaction
This injection point was named "AtEOXact_Inval-with-transInvalInfo", not
respecting the implied naming convention that injection points should
use lower-case characters, with terms separated by dashes.  All the
other points defined in the tree follow this style, so let's be more
consistent.

Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Aleksander Alekseev <aleksander@timescale.com>
Discussion: https://postgr.es/m/OSCPR01MB14966E14C1378DEE51FB7B7C5F5B32@OSCPR01MB14966.jpnprd01.prod.outlook.com
Backpatch-through: 17
2025-04-22 10:01:38 +09:00
David Rowley
5e6f9a9c4e Doc: various fixups
* Use <symbol> tags for CONNECTION_* #defines

We were using an inconsistent mix of <literal> and sometimes <function>
tags.

* Use <application> tag for libpq

There was a mix of <literal> and <productname>.

Also fix a whitespace issue.

None of these seem critical enough mistakes to backpatch.

Author: Noboru Saito <noborusai@gmail.com>
Discussion: https://postgr.es/m/CAAM3qnJtv5YbjpwDfVOYN2gZ9zGSLFM1UGJgptSXmwfifOZJFQ@mail.gmail.com
2025-04-22 11:10:08 +12:00
David Rowley
d010cc6cca Doc: fix incorrect punctuation
Author: Noboru Saito <noborusai@gmail.com>
Discussion: https://postgr.es/m/CAAM3qnJtv5YbjpwDfVOYN2gZ9zGSLFM1UGJgptSXmwfifOZJFQ@mail.gmail.com
Backpatch-through: 17
2025-04-22 11:04:04 +12:00
Jeff Davis
90260e2ec6 Fix INITCAP() word boundaries for PG_UNICODE_FAST.
Word boundaries are based on whether a character is alphanumeric or
not. For the PG_UNICODE_FAST collation, alphanumeric includes
non-ASCII digits; whereas for the PG_C_UTF8 collation, it only
includes digits 0-9. Pass down the right information from the
pg_locale_t into initcap_wbnext to differentiate the behavior.

Reported-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/20250417135841.33.nmisch@google.com
2025-04-21 12:34:58 -07:00
Tom Lane
80b727eb9d Use the same cmd_context throughout a walsender's lifetime.
exec_replication_command created a cmd_context to work in and
then deleted it on exit.  This is pretty dangerous because
some replication commands start/finish transactions.  In the
wake of commit 1afe31f03, that could lead to re-selecting a
CurrentMemoryContext that's already been deleted, leading to
hilarity such as a memory context that is its own parent.

To fix, let's make the cmd_context persist across
exec_replication_command calls; instead of deleting it, we'll just
reset it each time.  In this way it retains the same identity and
there's no problem if transaction abort restores it as the working
context.  It probably even saves a few microseconds to do this.
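
For illustration, a hedged sketch of the pattern; the context name and
parent are assumptions, not the committed code:

    static MemoryContext cmd_context = NULL;

    if (cmd_context == NULL)
        cmd_context = AllocSetContextCreate(TopMemoryContext,
                                            "Replication command context",
                                            ALLOCSET_DEFAULT_SIZES);
    else
        MemoryContextReset(cmd_context);    /* reuse instead of delete */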

This fix also ensures that exec_replication_command returns to the
caller (PostgresMain) with the same context active that had been
when it was called (probably MessageContext).  The previous
coding could get that wrong too.

Reported-by: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAO6_XqoJA7-_G6t7Uqe5nWF3nj+QBGn4F6Ptp=rUGDr0zo+KvA@mail.gmail.com
2025-04-21 12:09:36 -04:00
Tom Lane
5ec8b01c30 MemoryContextCreate: assert parent is valid and different from node.
The case of "node == parent" might seem impossible, since we just
allocated the new node.  But it's possible if parent is a dangling
reference to a recently-deleted context.  In fact, given aset.c's
habit of recycling contexts, it's actually rather likely if that's so.
If we'd had this assertion before, it would have simplified debugging
a recently-identified walsender issue.
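
To illustrate, a hedged sketch of the assertions being added; their
exact form and placement inside MemoryContextCreate() are assumptions:

    Assert(parent == NULL || MemoryContextIsValid(parent));
    Assert(node != parent);     /* parent may be a dangling reference */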

Reported-by: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAO6_XqoJA7-_G6t7Uqe5nWF3nj+QBGn4F6Ptp=rUGDr0zo+KvA@mail.gmail.com
2025-04-21 11:34:36 -04:00
Fujii Masao
706cbed351 doc: Fix memory context level in pg_log_backend_memory_contexts() example.
Commit d9e03864b6 changed the memory context level numbers shown by
pg_log_backend_memory_contexts() to be 1-based. However, the example in
the documentation was not updated and still used 0-based numbering.

This commit updates the example to match the current 1-based output.

Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: David Rowley <drowleyml@gmail.com>
Discussion: https://postgr.es/m/1ad6d388-1b43-400d-bec9-36d52f755f74@oss.nttdata.com
2025-04-21 14:53:25 +09:00
David Rowley
78eda9e264 Fix a few more duplicate words in comments
Similar to 84fd3bc14 but these ones were found using a regex that can span
multiple lines.

Author: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CAApHDvrMcr8XD107H3NV=WHgyBcu=sx5+7=WArr-n_cWUqdFXQ@mail.gmail.com
2025-04-21 13:50:50 +12:00
David Rowley
84fd3bc141 Fix a few duplicate words in comments
These are all new to v18.

Author: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CAApHDvrMcr8XD107H3NV=WHgyBcu=sx5+7=WArr-n_cWUqdFXQ@mail.gmail.com
2025-04-21 10:41:18 +12:00
Noah Misch
8180136652 Comment on need to MarkBufferDirty() if omitting DELAY_CHKPT_START.
Blocking checkpoint phase 2 requires MarkBufferDirty() and
BUFFER_LOCK_EXCLUSIVE; neither suffices by itself.  transam/README documents
this, citing SyncOneBuffer().  Update the DELAY_CHKPT_START documentation to
say this.  Expand the heap_inplace_update_and_unlock() comment that cites
XLogSaveBufferForHint() as precedent, since heap_inplace_update_and_unlock()
could have opted not to use DELAY_CHKPT_START.

Commit 8e7e672cda added DELAY_CHKPT_START to
heap_inplace_update_and_unlock().  Since commit
bc6bad88572501aecaa2ac5d4bc900ac0fd457d5 reverted it in non-master branches,
no back-patch.

Discussion: https://postgr.es/m/20250406180054.26.nmisch@google.com
2025-04-20 12:00:17 -07:00
Noah Misch
714bd9e3a7 Test restartpoints in archive recovery.
v14 commit 1f95181b44c843729caaa688f74babe9403b5850 and its v13
equivalent caused timing-dependent failures in archive recovery, at
restartpoints.  The symptom was "invalid magic number 0000 in log
segment X, offset 0", "unexpected pageaddr X in log segment Y, offset 0"
[X < Y], or an assertion failure.  Commit
3635a0a35aafd3bfa80b7a809bc6e91ccd36606a and predecessors back-patched
v15 changes to fix that.  This test reproduces the problem
probabilistically, typically in less than 1000 iterations of the test.
Hence, buildfarm and CI runs would have surfaced enough failures to get
attention within a day.

Reported-by: Arun Thirupathi <arunth@google.com>
Discussion: https://postgr.es/m/20250306193013.36.nmisch@google.com
Backpatch-through: 13
2025-04-20 08:28:48 -07:00
Noah Misch
2d5350cfbd Avoid ERROR at ON COMMIT DELETE ROWS after relhassubclass=f.
Commit 7102070329 fixed a similar bug, but
it missed the case of database-wide ANALYZE ("use_own_xacts" mode).
Commit a07e03fd8f changed consequences
from silent discard of a pg_class stats (relpages et al.) update to
ERROR "tuple to be updated was already modified".  Losing a relpages
update of an ON COMMIT DELETE ROWS table was negligible, but a
COMMIT-time error isn't negligible.  Back-patch to v13 (all supported
versions).

Reported-by: Richard Guo <guofenglinux@gmail.com>
Reported-by: Robins Tharakan <tharakan@gmail.com>
Discussion: https://postgr.es/m/CAMbWs4-XwMKMKJ_GT=p3_-_=j9rQSEs1FbDFUnW9zHuKPsPNEQ@mail.gmail.com
Backpatch-through: 13
2025-04-20 08:28:48 -07:00
David Rowley
d47f922246 Fix issue with ORDER BY / DISTINCT aggregates and FILTER
1349d2790 added support so that aggregate functions with an ORDER BY or
DISTINCT clause could make use of presorted inputs to avoid an implicit
sort within nodeAgg.c.  That commit failed to consider that a FILTER
clause may exist that filters rows before the aggregate function
arguments are evaluated.  That can be problematic if an aggregate
argument contains an expression which could error out during evaluation.
It's perfectly valid to want to have a FILTER clause which eliminates
such values, and with the pre-sorted path added in 1349d2790, it was
possible that the planner would produce a plan with a Sort node above
the Aggregate to perform the sort on the aggregate's arguments long before
the Aggregate node would filter out the non-matching values.

Here we fix this by inspecting ORDER BY / DISTINCT aggregate functions
which have a FILTER clause to see if the aggregate's arguments are
anything more complex than a Var or a Const.  Evaluating these isn't
going to cause an error.  If we find any non-Var, non-Const parameters
then the planner will now opt to perform the sort in the Aggregate node
for these aggregates, i.e. disable the presorted aggregate optimization.
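
For illustration, a hedged sketch of the kind of check described; the
surrounding planner code is an assumption, while the Aggref fields are
the real ones:

    /* With a FILTER clause, only presort when every argument is a plain
     * Var or Const, whose evaluation cannot error out. */
    if (aggref->aggfilter != NULL)
    {
        ListCell   *lc;

        foreach(lc, aggref->args)
        {
            TargetEntry *tle = (TargetEntry *) lfirst(lc);

            if (!IsA(tle->expr, Var) && !IsA(tle->expr, Const))
                return false;   /* sort inside the Aggregate node */
        }
    }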

An alternative fix would have been to completely disallow the presorted
optimization for Aggrefs with any FILTER clause, but that wasn't done as
that could cause large performance regressions for queries that see
significant gains from 1349d2790 due to presorted results coming in from
an Index Scan.

Backpatch to 16, where 1349d2790 was introduced.

Author: David Rowley <dgrowleyml@gmail.com>
Reported-by: Kaimeh <kkaimeh@gmail.com>
Diagnosed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAK-%2BJz9J%3DQ06-M7cDJoPNeYbz5EZDqkjQbJnmRyQyzkbRGsYkA%40mail.gmail.com
Backpatch-through: 16
2025-04-20 22:12:07 +12:00
Michael Paquier
78231baaf9 psql: Split extended query protocol meta-commands in --help=commands
Compared to v17 with only \bind able to do extended query protocol work,
v18 now has a total of 11 meta-commands related to the extended query
protocol.  These were all listed under the "General" section of the
--help=commands output and are specialized, bloating the output
generated.

All these meta-commands are moved into a new section called "Extended
Query Protocol", listed at the end of --help=commands.

This split has been suggested by Noah Misch.

Discussion: https://postgr.es/m/20250415213450.1f.nmisch@google.com
2025-04-20 08:34:38 +09:00
Michael Paquier
5743d122fc psql: Improve descriptions of \flush[request] in --help
Noah has reported that the current wording was confusing compared to the
description of the underlying libpq routine.  The new wording is from
me.

Reported-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/20250415213450.1f.nmisch@google.com
2025-04-20 08:16:57 +09:00
Michael Paquier
5ee7bd944e psql: Fix incorrect status code returned by \getresults
When an invalid number of results is requested for \getresults, the
status code returned by exec_command_getresults() was PSQL_CMD_SKIP_LINE
and not PSQL_CMD_ERROR.

This led to incorrect behaviors, with ON_ERROR_STOP for example.
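
To illustrate, a hedged sketch of the fix's shape; the surrounding
parsing code is an assumption, while the status codes are psql's own:

    if (num_results <= 0)
    {
        pg_log_error("\\getresults: invalid number of requested results");
        return PSQL_CMD_ERROR;      /* previously PSQL_CMD_SKIP_LINE */
    }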

Reported-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/20250415213450.1f.nmisch@google.com
2025-04-20 08:15:39 +09:00
Tom Lane
d05996340d Be more wary of corrupt data in pageinspect's heap_page_items().
The original intent in heap_page_items() was to return nulls, not
throw an error or crash, if an item was sufficiently corrupt that
we couldn't safely extract data from it.  However, commit d6061f83a
utterly missed that memo, and not only put in an un-length-checked
copy of the tuple's data section, but also managed to break the check
on sane nulls-bitmap length.  Either mistake could possibly lead to
a SIGSEGV crash if the tuple is corrupt.

Bug: #18896
Reported-by: Dmitry Kovalenko <d.kovalenko@postgrespro.ru>
Author: Dmitry Kovalenko <d.kovalenko@postgrespro.ru>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/18896-add267b8e06663e3@postgresql.org
Backpatch-through: 13
2025-04-19 16:37:42 -04:00
Michael Paquier
88e947136b Fix typos and grammar in the code
The large majority of these have been introduced by recent commits done
in the v18 development cycle.

Author: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/9a7763ab-5252-429d-a943-b28941e0e28b@gmail.com
2025-04-19 19:17:42 +09:00
Michael Paquier
114f7fa81c Rename injection points used in AIO tests
The format of the injection point names used by the AIO code does not
match the existing naming convention used everywhere else in the code,
so let's be consistent.  These points are used in test_aio.

Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Discussion: https://postgr.es/m/Z_yTB80bdu1sYDqJ@paquier.xyz
2025-04-19 18:53:35 +09:00
Fujii Masao
3aad76a0a9 Make pg_upgrade log message with control file path translatable.
Commit 173c97812f replaced the hardcoded "global/pg_control" in a pg_upgrade
log message with a string literal concatenation of the XLOG_CONTROL_FILE
macro.
However, this change made the message untranslatable.

This commit fixes the issue by using %s with XLOG_CONTROL_FILE instead of
that literal concatenation, allowing the message to be translated properly.
It also wraps the file path in double quotes for consistency with similar
log messages.

Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/20250407.155546.2129693791769531891.horikyota.ntt@gmail.com
2025-04-18 18:35:40 +09:00
Tatsuo Ishii
05883bd6e5 Doc: fix missing comma at the end of a line.
Backpatch to 17, where the line was added.

Reported by Noboru Saito while he was working on translating the file
into Japanese.

Discussion: https://postgr.es/m/20250417.203047.1321297410457834775.ishii%40postgresql.org
Reported-by: Noboru Saito <noborusai@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Backpatch-through: 17
2025-04-18 09:38:46 +09:00
David Rowley
1bd08f6ba5 Fixup various older misuses of appendPQExpBuffer
Use appendPQExpBufferStr when there are no parameters and
appendPQExpBufferChar when the string length is 1.
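
For illustration, a hedged example of the substitutions (the buffer and
string values are made up):

    appendPQExpBuffer(&buf, "WHERE ");      /* before: no format args */
    appendPQExpBufferStr(&buf, "WHERE ");   /* after */

    appendPQExpBuffer(&buf, ";");           /* before: single character */
    appendPQExpBufferChar(&buf, ';');       /* after */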

Unlike 3fae25cbb, which fixed this issue for code that was new to v18,
this one fixes up instances which exist in the backbranches.  We've
historically tried to maintain this standard and if we're going to
continue doing that, then we won't be doing that selectively based on
when the code was introduced.  Now seems like a good time to flush out the
existing misuses.  Waiting until v19 just prolongs their existence in
terms of released versions that the misuses exist in.

Author: David Rowley <drowleyml@gmail.com>
Discussion: https://postgr.es/m/CAApHDvoARMvPeXTTC0HnpARBHn-WgVstc8XFCyMGOzvgu_1HvQ@mail.gmail.com
2025-04-18 12:15:08 +12:00
David Rowley
d9e03864b6 Make levels 1-based in pg_log_backend_memory_contexts()
Both pg_get_process_memory_contexts() and pg_backend_memory_contexts
have 1-based levels, whereas pg_log_backend_memory_contexts() was using
0-based levels.  Align these.

This results in slightly saner behavior from MemoryContextStatsDetail()
with regard to the max_level.  Previously it would stop one level before
the maximum requested level rather than at that level.

Reported-by: Atsushi Torikoshi <torikoshia@oss.nttdata.com>
Author: Atsushi Torikoshi <torikoshia@oss.nttdata.com>
Author: David Rowley <drowleyml@gmail.com>
Reviewed-by: Melih Mutlu <m.melihmutlu@gmail.com>
Reviewed-by: Rahila Syed <rahilasyed90@gmail.com>
Discussion: https://postgr.es/m/395ea5d4fe190480efa95bf533485c70@oss.nttdata.com
2025-04-18 09:04:28 +12:00
Tom Lane
fc5e966f73 Suppress "may be used uninitialized" warnings from older compilers.
The "children" list won't be used until "got_children" has been set
true, but older compilers don't get that; about half a dozen
buildfarm animals are warning about this.  Issue added by 11ff192b5.
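
To illustrate, a hedged sketch of the pattern and fix; the variable
names come from this commit's description, the use site is hypothetical:

    bool    got_children = false;
    List   *children = NIL;     /* initialize to silence the warning */

    /* ... */
    if (got_children)
        process_children(children);     /* hypothetical use */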

While here, improve slightly-shaky grammar in comment.

Discussion: https://postgr.es/m/2057835.1744833309@sss.pgh.pa.us
2025-04-17 16:47:04 -04:00
Tom Lane
4aad2cb770 Portability fix: isdigit() must be passed an unsigned char.
Oversight in commit 40b9c2701, per buildfarm member mamba.
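
For illustration, the classic portable pattern (the variable and helper
are made up):

    #include <ctype.h>

    /* plain char may be signed; passing a negative value to isdigit()
     * is undefined behavior, hence the cast */
    if (isdigit((unsigned char) *s))
        process_digit(*s);      /* hypothetical */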
2025-04-17 16:33:21 -04:00
Tom Lane
0400ae4a68 Cache typlens of a SQL function's input arguments.
This gets rid of repetitive get_typlen calls in postquel_sub_params,
which show up as costing a few percent of the runtime in simple test
cases (more with more parameters).

In combination with the preceding patches, this gets us most of the
way back down to the amount of per-call overhead that functions.c
had before commit 0dca5d68d.  There are some more things that could
be done, but this seems like an okay place to stop for v18.
2025-04-17 12:56:40 -04:00
Tom Lane
0313c5dc62 Make SQLFunctionCache long-lived again.
At this point, the only data structures we allocate directly in
fcontext are the SQLFunctionCache struct itself, the ParamListInfo
struct, and the execution_state array, all of which are small and
perfectly capable of being re-used across executions of the same
FmgrInfo.  Hence, let's give them the same lifespan as the FmgrInfo.
This step gets rid of the separate SQLFunctionLink struct and makes
fn_extra point to SQLFunctionCache again.  We also get rid of the
separate fcontext memory context and allocate these items directly
in fn_mcxt.

For notational simplicity, SQLFunctionCache still has an fcontext
field, but it's just a copy of fn_mcxt.

The motivation for this is to allow these structures to live as
long as the FmgrInfo and be re-used across calls, restoring the
original design without its propensity for memory leaks.  This
gets rid of some per-call overhead that we added in 0dca5d68d.

We also make an effort to re-use the JunkFilter and result slot.
Those might need to change if the function definition changes,
so we compromise by rebuilding them if the cached plan changes.

This also moves the tuplestore into fn_mcxt so that it can be
re-used across calls, again undoing a change made in 0dca5d68d.
2025-04-17 12:56:31 -04:00
Tom Lane
f45a5444ee Split some storage out to separate subcontexts of fcontext.
Put the JunkFilter and its result slot (and thence also
some subsidiary data such as the result tupledesc) into a
separate subcontext "jfcontext".  This doesn't accomplish
a lot at this point, because we make a new JunkFilter each
time through the SQL function.  However, the plan is to make
the fcontext long-lived, and that raises the possibility
that we'll need a new JunkFilter because the plan for the
result-generating query changes.  A separate context makes
it easy to free the obsoleted data when that happens.

Also, instead of always running the sub-executor in fcontext,
make a separate context for it if we're doing lazy eval of
a SRF, and otherwise just run it inside CurrentMemoryContext.
2025-04-17 12:56:21 -04:00
Tom Lane
595d1efeda Make functions.c mostly run in a short-lived memory context.
Previously, much of this code ran with CurrentMemoryContext set
to be the function's fcontext, so that we tended to leak a lot of
stuff there.  Commit 0dca5d68d dealt with that by releasing the
fcontext at the completion of each SQL function call, but we'd
like to go back to the previous approach of allowing the fcontext
to be query-lifespan.  To control the leakage problem, rearrange
the code so that we mostly run in the memory context that fmgr_sql
is called in (which we expect to be short-lived).  Notably, this
means that parsing/planning is all done in the short-lived context
and doesn't leak cruft into fcontext.

This patch also fixes the allocation of execution_state records
so that we don't leak them across executions.  I set that up
with a re-usable array that contains at least as many
execution_state structs as we need for the current querytree.
The chain structure is still there, but it's not really doing
much for us, and maybe somebody will be motivated to get rid
of it.  I'm not though.

This incidentally also moves the call of BlessTupleDesc to be
with the code that creates the JunkFilter.  That doesn't make
much difference now, but a later patch will reduce the number
of times the JunkFilter gets made, and we needn't bless the
results any more often than that.

We still leak a fair amount in fcontext, particularly when
executing utility statements, but that's material for a
separate patch step; the point here is only to get rid of
unintentional allocations in fcontext.
2025-04-17 12:56:08 -04:00
Tom Lane
09b07c2953 Minor performance improvement for SQL-language functions.
Late in the development of commit 0dca5d68d, I added a step to copy
the result tlist we extract from the cached final query, because
I was afraid that that might not last as long as the JunkFilter that
we're passing it off to.  However, that turns out to cost a noticeable
number of cycles, and it's really quite unnecessary because the
JunkFilter will not examine that tlist after it's been created.
(ExecFindJunkAttribute would use it, but we don't use that function
on this JunkFilter.)  Hence, remove the copy step.  For safety,
reset the might-become-dangling jf_targetList pointer to NIL.

In passing, remove DR_sqlfunction.cxt, which we don't use anymore;
it's confusing because it's not entirely clear which context it
ought to point at.
2025-04-17 12:55:58 -04:00
Noah Misch
f4ece891fc Assert lack of hazardous buffer locks before possible catalog read.
Commit 0bada39c83 fixed a bug of this kind,
which existed in all branches for six days before detection.  While the
probability of reaching the trouble was low, the disruption was extreme.  No
new backends could start, and service restoration needed an immediate
shutdown.  Hence, add this to catch the next bug like it.

The new check in RelationIdGetRelation() suffices to make autovacuum detect
the bug in commit 243e9b40f1 that led to commit
0bada39.  This commit also adds checks in a number of similar places, replacing each
Assert(IsTransactionState()) that pertained to a conditional catalog read.

No back-patch for now, but a back-patch of commit 243e9b4 should back-patch
this, too.  A back-patch could omit the src/test/regress changes, since back
branches won't gain new index columns.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/20250410191830.0e.nmisch@google.com
Discussion: https://postgr.es/m/10ec0bc3-5933-1189-6bb8-5dec4114558e@gmail.com
2025-04-17 05:00:30 -07:00
Daniel Gustafsson
b669293e34 pg_dump: Set private_data pointer to NULL in callback
The end callback for ZStandard compression frees the private_data
but didn't set the pointer to NULL after freeing.  This is not a
bug as the code stands right now, since nothing dereferences the
pointer upon returning from the callback, but it is good practice
to do so.
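
For illustration, the generic C pattern (the struct name is
hypothetical, not the pg_dump code itself):

    free(ctx->private_data);
    ctx->private_data = NULL;   /* avoid leaving a dangling pointer */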

Author: Alexander Kuznetsov <kuznetsovam@altlinux.org>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/efaee52b-9550-44ca-8633-ea86076b3283@altlinux.org
2025-04-17 12:58:00 +02:00
Fujii Masao
e4b0f86e1f pg_dump: Fix incorrect archive format shown in error message.
In pg_dump and pg_restore, _allocAH() calls _discoverArchiveFormat() to
determine the archive format when the input format is unknown.
If the input or discovered format is unrecognized, it reports an error
including the archive format number.

If the discovered format is unrecognized, its number should be shown in
the error message.  But previously the error message mistakenly showed
the originally requested format number (i.e., the unknown one) instead
of the discovered one, due to referencing the wrong variable in the
error message.

This commit corrects the issue by using the appropriate variable in
the error message.

This fix has no practical impact since _discoverArchiveFormat() never
returns an unrecognized format and that error message is actually
never output. Therefore, while the issue exists in back branches,
it's not worth the trouble and buildfarm cycles to back-patch.
So this fix is applied only to the master branch.
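
A minimal sketch of the corrected reporting, with control flow
simplified and variable names assumed:

    ArchiveFormat fmt = _discoverArchiveFormat(AH);

    /* report the discovered format, not the originally requested one */
    if (fmt != archCustom && fmt != archDirectory && fmt != archTar)
        pg_fatal("unrecognized file format \"%d\"", fmt);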

Author: Mahendra Singh Thalor <mahi6run@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAKYtNAqu+N-Ab2Fq6wzNSOm_-0N-BMneanYNV1+6kFDXjva1Eg@mail.gmail.com
2025-04-17 09:52:47 +09:00
Jeff Davis
2e5353be25 Another unintentional behavior change in commit e9931bfb75.
Reported-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/20250412123430.8c.nmisch@google.com
2025-04-16 16:49:42 -07:00
Jeff Davis
b107744ce7 Improve comment in regc_pg_locale.c.
Reported-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/20250412123430.8c.nmisch@google.com
2025-04-16 16:49:35 -07:00
David Rowley
3fae25cbb3 Fixup various new-to-v18 usages of appendPQExpBuffer
Use appendPQExpBufferStr when there are no parameters and
appendPQExpBufferChar when the string length is 1.
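
The substitutions follow this pattern; buf here is a hypothetical
PQExpBufferData:

    appendPQExpBuffer(&buf, "WHERE ");      /* before: format call, no args */
    appendPQExpBufferStr(&buf, "WHERE ");   /* after: plain string append */

    appendPQExpBuffer(&buf, "(");           /* before: one-character string */
    appendPQExpBufferChar(&buf, '(');       /* after: single-char append */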

Author: David Rowley <drowleyml@gmail.com>
Discussion: https://postgr.es/m/CAApHDvoARMvPeXTTC0HnpARBHn-WgVstc8XFCyMGOzvgu_1HvQ@mail.gmail.com
2025-04-17 11:37:55 +12:00
David Rowley
f3281f9f93 Improve comments for estimate_multivariate_ndistinct()
estimate_multivariate_ndistinct() is coded to assume the caller handles
passing it a list of GroupVarInfos with unique 'var' fields over the
entire list.  6bb6a62f3 added code which didn't ensure this and that
could result in estimate_multivariate_ndistinct() erroring out with:

ERROR:  corrupt MVNDistinct entry

This occurred because estimate_multivariate_ndistinct() first searches
for a set of stats that match at least two of the given GroupVarInfos
and then later assumes that the MVNDistinctItem.items array of the
best matching stats will have an entry for those two columns.  If the
GroupVarInfos List contained a duplicate entry then the same column could
be matched twice, which could trick the code into thinking we have
>= 2 columns matched in cases where only a single distinct column has been
matched.  This could result in a failure to find the correct
MVNDistinctItem in the stats, as the array containing those never
contains an item for single columns.

Here we make it more clear that the function needs a distinct set of
GroupVarInfos and also tidy up a few other comments to make things a bit
easier to follow.
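
A minimal sketch of how a caller keeps the list distinct, following the
existing add_unique_group_var() idiom; the surrounding variable names
are assumed:

    foreach(lc, varinfos)
    {
        GroupVarInfo *varinfo = (GroupVarInfo *) lfirst(lc);

        /* don't add a duplicate: the same Var may only appear once */
        if (equal(var, varinfo->var) && varinfo->rel == rel)
            return varinfos;
    }
    varinfos = lappend(varinfos, new_varinfo);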

Author: David Rowley <drowleyml@gmail.com>
Discussion: https://postgr.es/m/CAApHDvocZCUhM9W9mJ39d6oQz7ePKoqFnao_347mvC-A7QatcQ@mail.gmail.com
2025-04-17 11:03:24 +12:00
Tom Lane
ab3d8afc7f Sync declarations and definitions of two new tablecmds.c functions.
Buildfarm member drongo complained because the definitions of these
functions used "const Oid foo" where the forward declarations just
had "Oid foo".  (I'm a bit surprised that drongo seems to be the only
complainant.)  I chose to fix this by removing the "consts" because
(a) I'm generally not a fan of using const that way, and (b) it was
a minority usage even within these two functions, let alone compared
to the rest of our code base.

Oversight in commit eec0040c4, so no need for back-patch.
2025-04-16 17:59:08 -04:00
Álvaro Herrera
11ff192b5b
Elide not-null constraint checks on child tables during PK creation
We were unnecessarily acquiring AccessExclusiveLock on all child tables
when "ALTER TABLE ONLY sometab ADD PRIMARY KEY" was run on their parent
table, an oversight in commit 14e87ffa5c.  This caused deadlocks
during pg_restore of partitioned tables.

The reason to acquire the AEL was that we need to verify that child
tables have the involved columns already marked as not-null; but if the
parent table has an inheritable not-null constraint, then all children
must necessarily be in the correct state already, so we can skip the
check, which avoids acquiring the lock.  Reorder the code so that it
works that way.  This doesn't change things in the case where the
constraint doesn't exist, but that case is of lesser importance because
it doesn't occur during parallel pg_restore.

While at it, reword some errmsg() and add errhint() to similar cases in
related but not adjacent code.

Diagnosed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/67469c1c-38bc-7d94-918a-67033f5dd731@gmx.net
Discussion: https://postgr.es/m/2045026.1743801143@sss.pgh.pa.us
Discussion: https://postgr.es/m/1280408.1744650810@sss.pgh.pa.us
2025-04-16 21:51:23 +02:00
Daniel Gustafsson
1fd3566ebc Update pg_config.h.in with libnuma changes
Add macros from autoheader which were accidentally omitted in
commit 65c298f61f. There is no function change by this as no
code is currently using the missing macro.

Author: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://postgr.es/m/CF6D7D7F-E1C4-45BE-9019-0F4B4BC7C135@yesql.se
2025-04-16 20:16:57 +02:00
Tom Lane
1fc3403626 Fix pg_dump --clean with partitioned indexes.
We'd try to drop the partitions of a partitioned index separately,
which is disallowed by the backend, leading to an error during
restore.  While the error is harmless, it causes problems if you
try to use --single-transaction mode.

Fortunately, there seems no need to do a DROP at all, since the
partition will go away silently when we drop either the parent index
or the partition's table.  So just make the DROP conditional on not
being a partition.
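
A minimal sketch of the conditional DROP; using IndxInfo's parentidx
field as the "is a partition" test is an assumption about the exact fix:

    /*
     * An index partition cannot be dropped directly; it goes away when
     * the parent index or its table is dropped, so emit DROP commands
     * only for indexes that are not partitions.
     */
    if (indxinfo->parentidx == InvalidOid)
        appendPQExpBuffer(delq, "DROP INDEX %s;\n", qualifiedIndexName);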

Reported-by: jian he <jian.universality@gmail.com>
Author: jian he <jian.universality@gmail.com>
Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CACJufxF0QSdkjFKF4di-JGWN6CSdQYEAhGPmQJJCdkSZtd=oLg@mail.gmail.com
Backpatch-through: 13
2025-04-16 13:31:59 -04:00
Andrew Dunstan
40b9c27014 pg_restore cleanups
. remove unnecessary oid_string list stuff
. use pg_get_line_buf() instead of open-coding it
. cleaner parsing of map.dat lines

Reverts 2b69afbe50 "add new list type simple_oid_string_list to fe-utils/simple_list"

Author: Álvaro Herrera <alvherre@kurilemu.de>
Author: Andrew Dunstan <andrew@dunslane.net>

Discussion: https://postgr.es/m/202504141220.343fmoxfsbj4@alvherre.pgsql
2025-04-16 12:04:34 -04:00
Richard Guo
3b35f9a4c5 Fix an incorrect check in get_memoize_path
Memoize typically marks cache entries as complete after fully scanning
the inner side of a join.  However, in the case of unique joins, we
skip to the next outer tuple as soon as the first matching inner tuple
is found, leaving no opportunity to scan the inner side to completion.
To work around that, we mark cache entries as complete after fetching
the first matching inner tuple in unique joins.

This approach is only safe when all of the join's restriction clauses
are parameterized; otherwise, there is no guarantee that reading just
one tuple from the inner side is sufficient.

Currently, we check for this by verifying that the number of clauses
in ppi_clauses is no less than the number of the join's restriction
clauses.  However, this check isn't entirely reliable, as ppi_clauses
includes join clauses available from all outer rels, not just the
current outer rel.  This means the check could pass even if a
restriction clause isn't parameterized, as long as another join
clause, which doesn't belong to the current join, is included in
ppi_clauses.

To fix this, we explicitly check whether each restriction clause of
the current join is present in ppi_clauses.

While we're here, remove the XXX comment from the modified code, as
it's not justified; in certain cases, it's not possible to move a join
clause to the inner side.

This is arguably a bugfix, but no backpatch given the lack of field
reports.
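
A minimal sketch of the explicit per-clause check, with the surrounding
variable names assumed:

    /*
     * Memoize on a unique join is only safe if every restriction clause
     * of the join is parameterized, so require each one to appear in
     * ppi_clauses instead of merely comparing list lengths.
     */
    foreach(lc, extra->restrictlist)
    {
        RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);

        if (!list_member_ptr(inner_path->param_info->ppi_clauses, rinfo))
            return NULL;        /* give up on creating a Memoize path */
    }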

Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com>
Reviewed-by: Andrei Lepikhov <lepihov@gmail.com>
Discussion: https://postgr.es/m/CAMbWs4-8JPouj=wBDj4DhK-WO4+Xdx=A2jbjvvyyTBQneJ1=BQ@mail.gmail.com
2025-04-16 10:55:44 +09:00
Daniel Gustafsson
5ee476294c doc: Fix typos in documentation
This fixes a set of typos introduced during the v18 development
cycle.

Author: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at>
Discussion: https://postgr.es/m/7038B4C5-2742-42B1-A8F0-0FFEAECF02A7@yesql.se
2025-04-15 21:32:18 +02:00
Tom Lane
7c87284940 Fix failure for generated column with a not-null domain constraint.
If a GENERATED column is declared to have a domain data type where
the domain's constraints disallow null values, INSERT commands failed
because we built a targetlist that included coercing a null constant
to the domain's type.  The failure occurred even when the generated
value would have been perfectly OK.  This is adjacent to the issues
fixed in 0da39aa76, but we didn't notice for lack of testing a domain
with such a constraint.

We aren't going to use the result of the targetlist entry for the
generated column --- ExecComputeStoredGenerated will overwrite it.
So it's not really necessary that it have the exact datatype of
the generated column.  This patch fixes the problem by changing
the targetlist entry to be a null Const of the domain's base type,
which should be sufficiently legal.  (We do have to tweak
ExecCheckPlanOutput to accept the situation, though.)
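
A minimal sketch of the targetlist change; the attribute and expression
variable names are assumed:

    /*
     * Build the placeholder as a null Const of the domain's base type,
     * so the domain's NOT NULL check isn't applied to a value that
     * ExecComputeStoredGenerated will overwrite anyway.
     */
    Oid     baseTypeId = getBaseType(attr->atttypid);

    new_expr = (Node *) makeNullConst(baseTypeId, -1,
                                      get_typcollation(baseTypeId));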

This has been broken since we implemented generated columns.
However, this patch only applies easily as far back as v14, partly
because I (tgl) only carried 0da39aa76 back that far, but mostly
because v14 significantly refactored the handling of INSERT/UPDATE
targetlists.  Given the lack of field complaints and the short
remaining support lifetime of v13, I judge the cost-benefit ratio
not good for devising a version that would work in v13.

Reported-by: jian he <jian.universality@gmail.com>
Author: jian he <jian.universality@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CACJufxG59tip2+9h=rEv-ykOFjt0cbsPVchhi0RTij8bABBA0Q@mail.gmail.com
Backpatch-through: 14
2025-04-15 12:08:34 -04:00
Fujii Masao
f840f8ee30 doc: Fix missing whitespace in pg_restore documentation.
Previously, a space was missing between "<option>--exclude-schema</option>"
and "for" in the pg_restore documentation. This commit fixes the typo by
adding the missing whitespace.

Back-patch to v17 where the typo was added.

Author: Lele Gaifax <lele@metapensiero.it>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/87lds3ysm0.fsf@metapensiero.it
Backpatch-through: 17
2025-04-15 23:15:06 +09:00
Daniel Gustafsson
7ae13170ba pg_combinebackup: Fix incorrect code documentation
The code comment for parse_oid accidentally used the wrong parameter
when referring to the location of the last backup. Also, while there,
improve sentence wording by removing a superfluous word.

Backpatch to v17 where pg_combinebackup was added.

Author: Amul Sul <sulamul@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/CAAJ_b95ecWgzcS4K3Dx0E_Yp-SLwK5JBasFgioKMSjhQLw9xvg@mail.gmail.com
Backpatch-through: 17
2025-04-15 15:27:08 +02:00
Peter Eisentraut
c55df7c6ea Fix incorrect format placeholders
BlockNumber is unsigned int.  Fix for commit 14ffaece0f.
2025-04-14 08:56:33 +02:00
Peter Eisentraut
7cd171a5d2 Add more source files to pg_verifybackup/nls.mk
also related to commit 8dfd312902
2025-04-14 08:32:46 +02:00
David Rowley
b51f86e49a Doc: use "an SQL" consistently rather than "a SQL"
Per the precedent set by 04539e73f, adjust article prefixes for "SQL" to
use "an" consistently rather than "a", i.e., "an es-que-ell" rather than
"a sequel".

Both of these are new to v18. Also see b1b13d2b5, d866f0374 and
7bdd489d3.
2025-04-14 11:55:18 +12:00
Daniel Gustafsson
2970c75dd9 Mark sslkeylogfile as Debug option
Mark the sslkeylogfile option as "D" (debug), as this truly is a debug
option, and it will allow postgres_fdw et al. to filter it out as
well.  Also update the display length to match that of an ssl key,
as they are both filename-based inputs.

Author: Daniel Gustafsson <daniel@yesql.se>
Reported-by: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://postgr.es/m/CAOYmi+=5GyBKpu7bU4D_xkAnYJTj=rMzGaUvHO99-DpNG_YKcw@mail.gmail.com
2025-04-13 21:53:03 +02:00
Andrew Dunstan
64e193f5dd Make AIO error test more portable
Alpine Linux's C library (musl) spells one error message differently.

Reported-by: Wolfgang Walther
2025-04-13 14:39:45 -04:00
Andrew Dunstan
f09088a01d Free memory properly in pg_restore.c
Thinko in commit 39729ec01d. Mea maxima culpa.

Per Mahendra Singh Thalor <mahi6run@gmail.com>
2025-04-12 14:54:48 -04:00
Tom Lane
78637a8be2 Doc: do a little copy-editing on Index Storage Parameters list.
Add a paragraph break per suggestion from David G. Johnston.
Use a consistent voice for all the different parameter
descriptions, and fix a couple of grammatical issues.

Reported-by: Igor Korot <ikorot01@gmail.com>
Co-authored-by: "David G. Johnston" <david.g.johnston@gmail.com>
Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CA+FnnTz=EW1VQRpWB9J+G-NSchrPFcw4nR7d0JqzEK9jWKB35A@mail.gmail.com
2025-04-12 13:42:31 -04:00
Tom Lane
e708ffe79d Fix GIN's shimTriConsistentFn to not corrupt its input.
Commit 0f21db36d made an assumption that GIN triConsistentFns
would not modify their input entryRes[] arrays.  But in fact,
the "shim" triConsistentFn that we use for opclasses that don't
supply their own did exactly that, potentially leading to wrong
answers from a GIN index search.  Through bad luck, none of the
test cases that we have for such opclasses exposed the bug.

One response to this could be that the assumption of consistency check
functions not modifying entryRes[] arrays is a bad one, but it still
seems reasonable to me.  Notably, shimTriConsistentFn is itself
assuming that with respect to the underlying boolean consistentFn,
so it's sure being self-centered in supposing that it gets to do so.

Fortunately, it's quite simple to fix shimTriConsistentFn to restore
the entry-time state of entryRes[], so let's do that instead.
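
A minimal sketch of the save/restore added to shimTriConsistentFn; the
array bound and loop variables are assumed:

    GinTernaryValue saved[MAX_ENTRIES];     /* hypothetical bound */
    int         i;

    /* remember the caller's ternary inputs before forcing them */
    for (i = 0; i < key->nentries; i++)
        saved[i] = key->entryRes[i];

    /* ... evaluate the boolean consistentFn over forced TRUE/FALSE ... */

    /* restore the caller's entryRes[] before returning */
    for (i = 0; i < key->nentries; i++)
        key->entryRes[i] = saved[i];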

This issue doesn't affect any core GIN opclasses, since they all
supply their own triConsistentFns.  It does affect contrib modules
btree_gin, hstore, and intarray.

Along the way, I (tgl) noticed that shimTriConsistentFn failed to
pick up on a "recheck" flag returned by its first call to the boolean
consistentFn.  This may be only a latent problem, since it would be
unlikely for a consistentFn to set recheck for the all-false case
and not any other cases.  (Indeed, none of our contrib modules do
that.)  Nonetheless, it's formally wrong.

Reported-by: Vinod Sridharan <vsridh90@gmail.com>
Author: Vinod Sridharan <vsridh90@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAFMdLD7XzsXfi1+DpTqTgrD8XU0i2C99KuF=5VHLWjx4C1pkcg@mail.gmail.com
Backpatch-through: 13
2025-04-12 12:28:02 -04:00
Peter Geoghegan
a6cab6a78e Harmonize function parameter names for Postgres 18.
Make sure that function declarations use names that exactly match the
corresponding names from function definitions in a few places.  These
inconsistencies were all introduced during Postgres 18 development.

This commit was written with help from clang-tidy, by mechanically
applying the same rules as similar clean-up commits (the earliest such
commit was commit 035ce1fe).
2025-04-12 12:07:36 -04:00
Michael Paquier
fdb69dd582 Fix instability with WAL fsync test in stats.sql
A backend using wal_sync_method set to "open_sync" or "open_datasync"
would fail the test checking the WAL sync data in pg_stat_io.  These
modes guarantee that a sync is done when WAL is written to disk, and the
data checked by the test is not incremented in this case, since
issue_xlog_fsync() does nothing.

Oversight in commit a051e71e28.

Author: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/CAA5RZ0uxwg3xAi4nvdBMJ-zJQEeyg+RotuU+ebM2F6CKmnvaYA@mail.gmail.com
2025-04-12 13:09:48 +09:00
Daniel Gustafsson
847bbb21f8 Fix recently introduced typos
This fixes typos in docs and comments introduced during the v18
development cycle, to keep them from ending up in backbranches.

Author: Jacob Brazeal <jacob.brazeal@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CA+COZaCgGua25f2hSrjrDLJcJJAHkwoKgTTqUy-wyL1=64JNjw@mail.gmail.com
2025-04-11 22:17:12 +02:00
Nathan Bossart
5822bf21d5 Add missing space in pg_restore documentation.
Oversight in commit 1495eff7bd.
2025-04-11 10:05:32 -05:00
Peter Eisentraut
914ea1c93c Add missing source file to pg_verifybackup/nls.mk
added by commit 8dfd312902
2025-04-11 10:53:36 +02:00
Peter Eisentraut
b63cbacb86 Add missing source file to pg_dump/nls.mk
added by commit c1da728106
2025-04-11 10:28:59 +02:00
Peter Eisentraut
9e0e1cfc3e Add missing source file to pg_upgrade/nls.mk
added by commit 40e2e5e92b
2025-04-11 10:26:51 +02:00
Peter Eisentraut
7d430a5728 Add missing PGDLLIMPORT markings
Discussion: https://www.postgresql.org/message-id/flat/25095db5-b595-4b85-9100-d358907c25b5%40eisentraut.org
2025-04-11 08:59:52 +02:00
Michael Paquier
2e57790836 Fix race with synchronous_standby_names at startup
synchronous_standby_names cannot be reloaded safely by backends, and the
checkpointer is in charge of updating a state in shared memory if the
GUC is enabled in WalSndCtl, to let the backends know if they should
wait or not for a given LSN.  This provides a strict control on the
timing of the waiting queues if the GUC is enabled or disabled, then
reloaded.  The checkpointer is also in charge of waking up the backends
that could be waiting for a LSN when the GUC is disabled.

This logic had a race condition at startup, where it would be possible
for backends to not wait for a LSN even if synchronous_standby_names is
enabled.  This would cause visibility issues with transactions that we
should have been waiting for but were not.  The problem lasts until the
checkpointer does its initial update of the shared memory state when it
loads synchronous_standby_names.

In order to take care of this problem, the shared memory state in
WalSndCtl is extended to detect if it has been initialized by the
checkpointer, and not only check if synchronous_standby_names is
defined.  In WalSndCtlData, sync_standbys_defined is renamed to
sync_standbys_status, a bits8 able to know about two states:
- If the shared memory state has been initialized.  This flag is set by
the checkpointer at startup once, and never removed.
- If synchronous_standby_names is known as defined in the shared memory
state.  This is the same as the previous sync_standbys_defined in
WalSndCtl.

This method gives a way for backends to decide what they should do until
the shared memory area is initialized, and they now ultimately fall back
to a check on the GUC value in this case, which is the best thing that
can be done.
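
A minimal sketch of the two-flag logic described above; the flag names
follow the commit message but their exact spellings are assumed:

    #define SYNC_STANDBY_INIT     0x01  /* shmem state set up by checkpointer */
    #define SYNC_STANDBY_DEFINED  0x02  /* synchronous_standby_names is set */

    if (!(WalSndCtl->sync_standbys_status & SYNC_STANDBY_INIT))
    {
        /* checkpointer hasn't initialized shmem yet: check the GUC itself */
        if (SyncRepStandbyNames == NULL || SyncRepStandbyNames[0] == '\0')
            return;             /* no sync standbys configured; don't wait */
    }
    else if (!(WalSndCtl->sync_standbys_status & SYNC_STANDBY_DEFINED))
        return;                 /* feature known to be disabled */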

Fortunately, SyncRepUpdateSyncStandbysDefined() is called immediately by
the checkpointer when this process starts, so the window is very narrow.
It is possible to enlarge the problematic window by making the
checkpointer wait at the beginning of SyncRepUpdateSyncStandbysDefined()
with a hardcoded sleep for example, and doing so has shown that a 2PC
visibility test indeed fails.  On machines slow enough, this bug
would cause spurious failures.

In 17~, we have looked at the possibility of adding an injection point
to have a reproducible test, but as the problematic window happens at
early startup, we would need to invent a way to make an injection point
optionally persistent across restarts when attached, something that
would be fine for this case as it would involve the checkpointer.  This
issue is quite old, and can be reproduced on all the stable branches.

Author: Melnikov Maksim <m.melnikov@postgrespro.ru>
Co-authored-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/163fcbec-900b-4b07-beaa-d2ead8634bec@postgrespro.ru
Backpatch-through: 13
2025-04-11 10:00:21 +09:00
David Rowley
530050d8d2 Add code comment explaining ins_since_vacuum and aborted inserts
Sami complained that there's a discrepancy between n_mod_since_analyze
and n_ins_since_vacuum, as the former only accounts for committed changes
and the latter tracks committed and aborted inserts.  Nobody seemed
overly concerned that this would cause any real issues.  The
repercussions, from what I can tell, are limited to causing an
autovacuum to trigger for inserts sooner than it otherwise might. For
typical ratios of commits to aborts, it's unlikely to ever be noticed.

Fixing things to make it so n_ins_since_vacuum only displays committed
inserts would require an additional field in PgStat_TableCounts, which
does not quite seem worthwhile at this stage.  This commit just adds a
comment with some details to mention that we know about it, which will
hopefully prevent repeat discussions.

Reported-by: Sami Imseih <samimseih@gmail.com>
Author: David Rowley <drowleyml@gmail.com>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/CAApHDvpgV3a-R2EGmPOh0L-x3pHbZpM3y4dySWfy+UqUazwDQA@mail.gmail.com
2025-04-11 11:36:21 +12:00
Andrew Dunstan
39729ec01d Fix fat fingering in 22cb6d2895
Per Rainier Vilela
2025-04-10 19:08:04 -04:00
David Rowley
928394b664 Improve various new-to-v18 appendStringInfo calls
Similar to 8461424fd, here we adjust a few new locations which were not
using the most suitable appendStringInfo* function for the intended
purpose.

Author: David Rowley <drowleyml@gmail.com>
Discussion: https://postgr.es/m/CAApHDvqJnNjueb=Eoj8K+8n0g7nj_AcPWSiCj5RNV4fDejAfqA@mail.gmail.com
2025-04-11 10:07:22 +12:00
Daniel Gustafsson
55ef7abf88 Rename global variable backing DSA area
The global variable backing the DSA area for Memory Context stats
reporting had a too generic name, rename to be more descriptive.
Independently reported by Peter and Laurenz.

Author: Daniel Gustafsson <daniel@yesql.se>
Reported-by: Peter Eisentraut <peter@eisentraut.org>
Reported-by: Laurenz Albe <laurenz.albe@cybertec.at>
Discussion: https://postgr.es/m/d51172bd4e7f4b07a18a0288ca1b1c28a71a5f6a.camel@cybertec.at
Discussion: https://postgr.es/m/25095db5-b595-4b85-9100-d358907c25b5@eisentraut.org
2025-04-10 22:40:27 +02:00
Andrew Dunstan
22cb6d2895 Fix memory leak in pg_restore.c
Oversight in 1495eff7bd

Author: Ranier Vilela <ranier.vf@gmail.com>
2025-04-10 14:57:02 -04:00
Tom Lane
d89335eea6 Doc: remove long-obsolete advice about generated constraint names.
It's been twenty years since we generated constraint names that
look like "$N".  So this advice about double-quoting such names
is well past its sell-by date, and now it merely seems confusing.

Reported-by: Yaroslav Saburov <y.saburov@gmail.com>
Author: "David G. Johnston" <david.g.johnston@gmail.com>
Discussion: https://postgr.es/m/174393459040.678.17810152410419444783@wrigleys.postgresql.org
Backpatch-through: 13
2025-04-10 14:49:10 -04:00
Tom Lane
f27eb0325b Remove useless check for negative result of ip_addrsize().
By inspection, ip_addrsize() can't return a negative result.
(If it could, we'd have way bigger problems elsewhere.)
So delete useless check in network_send().  Most C compilers
are probably perfectly capable of removing this code by
themselves, but it's confusing/misleading.

Bug: #18889
Reported-by: Daniel Elishakov <dan-eli@mail.ru>
Discussion: https://postgr.es/m/18889-73d4f19e953a629e@postgresql.org
2025-04-10 14:18:07 -04:00
Andrew Dunstan
4170298b6e Further cleanup for directory creation on pg_dump/pg_dumpall
Instead of two separate (and different) implementations, refactor to use
a single common routine.

Along the way, remove use of a hardcoded file permissions constant in
favor of the common project setting for directory creation.

Author: Mahendra Singh Thalor <mahi6run@gmail.com>

Discussion: https://postgr.es/m/CAKYtNApihL8X1h7XO-zOjznc8Ca66Aevgvhc9zOTh6DBh2iaeA@mail.gmail.com
2025-04-10 12:11:36 -04:00
Amit Kapila
4909b38af0 Fix data loss in logical replication.
Data loss can happen when the DDLs like ALTER PUBLICATION ... ADD TABLE ...
or ALTER TYPE ...  that don't take a strong lock on table happens
concurrently to DMLs on the tables involved in the DDL. This happens
because logical decoding doesn't distribute invalidations to concurrent
transactions and those transactions use stale cache data to decode the
changes. The problem becomes bigger because we keep using the stale cache
even after those in-progress transactions are finished and skip the
changes required to be sent to the client.

This commit fixes the issue by distributing invalidation messages from
catalog-modifying transactions to all concurrent in-progress transactions.
This allows the necessary rebuild of the catalog cache when decoding new
changes after concurrent DDL.

We observed performance regression primarily during frequent execution of
*publication DDL* statements that modify the published tables. The
regression is minor or nearly nonexistent for DDLs that do not affect the
published tables or occur infrequently, making this a worthwhile cost to
resolve a longstanding data loss issue.

An alternative approach considered was to take a strong lock on each
affected table during publication modification. However, this would only
address issues related to publication DDLs (but not the ALTER TYPE ...)
and require locking every relation in the database for publications
created as FOR ALL TABLES, which is impractical.

The bug exists in all supported branches, but we are backpatching till 14.
The fix for 13 requires somewhat bigger changes than this fix, so the fix
for that branch is still under discussion.

Reported-by: hubert depesz lubaczewski <depesz@depesz.com>
Reported-by: Tomas Vondra <tomas.vondra@enterprisedb.com>
Author: Shlok Kyal <shlok.kyal.oss@gmail.com>
Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Tested-by: Benoit Lobréau <benoit.lobreau@dalibo.com>
Backpatch-through: 14
Discussion: https://postgr.es/m/de52b282-1166-1180-45a2-8d8917ca74c6@enterprisedb.com
Discussion: https://postgr.es/m/CAD21AoAenVqiMjpN-PvGHL1N9DWnHSq673bfgr6phmBUzx=kLQ@mail.gmail.com
2025-04-10 13:14:40 +05:30
Peter Eisentraut
9ad19295e9 Fix incorrect format placeholders
for commits 8f427187db, 6ee3b91bad
2025-04-10 08:04:35 +02:00
David Rowley
d7c04db27a Update wording in optimizer/README for EquivalenceClasses
d69d45a5a changed how em_is_child members are stored in
EquivalenceClasses.  Children are no longer stored in the ec_members
list.  optimizer/README mentioned that most operations "should ignore
child members", but that felt a little untrue now since child members
are now stored in a separate place, they simply won't be found by the
normal means of looking (a foreach loop over ec_members), and if you don't
find them, there's technically no need to "ignore" them.

Here we tweak the wording slightly to reflect the new storage location
for child members.

Reported-by: Amit Langote <amitlangote09@gmail.com>
Author: Amit Langote <amitlangote09@gmail.com>
Author: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CA+HiwqE8v=EuAP_3F_A2xn8zWx+nG_etW_Fe_DvKO-Fkx=+DdQ@mail.gmail.com
2025-04-10 17:33:58 +12:00
Amit Kapila
d438515c29 Cosmetic fixes for pg_createsubscriber's -all option.
Author: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CAHut+PsmSCQ-ENSDQ0YOUcsgzT=GG-E9jyXBvxd51A_dMXH5XA@mail.gmail.com
2025-04-10 10:30:05 +05:30
Tomas Vondra
d15acc915d ci: Check for missing dependencies in meson builds
Extends the Linux and Windows meson builds with a check for missing
dependencies by running

    ninja -t missingdeps

after the build. This highlights unintended dependencies.

Reviewed-by: Andres Freund <andres@anarazel.de>
https://postgr.es/m/CALdSSPi5fj0a7UG7Fmw2cUD1uWuckU_e8dJ+6x-bJEokcSXzqA@mail.gmail.com
2025-04-09 22:01:58 +02:00
Tomas Vondra
3887d0cfeb Cleanup of pg_numa.c
This moves/renames some of the functions defined in pg_numa.c:

* pg_numa_get_pagesize() is renamed to pg_get_shmem_pagesize(), and
  moved to src/backend/storage/ipc/shmem.c. The new name better reflects
  that the page size is not related to NUMA, and it's specifically about
  the page size used for the main shared memory segment.

* move pg_numa_available() to src/backend/storage/ipc/shmem.c, i.e. into
  the backend (which is more appropriate for functions callable from SQL).
  While at it, improve the comment to explain what page size it returns.

* remove unnecessary includes from src/port/pg_numa.c that added
  unnecessary dependencies (src/port should be suitable for frontend code).
  These were either leftovers or became unnecessary thanks to the other changes
  in this commit.

This eliminates unnecessary dependencies on backend symbols, which we
don't want in src/port.

Reported-by: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
https://postgr.es/m/CALdSSPi5fj0a7UG7Fmw2cUD1uWuckU_e8dJ+6x-bJEokcSXzqA@mail.gmail.com
2025-04-09 21:50:17 +02:00
Nathan Bossart
e2665efd0f pg_upgrade: Mention that we preserve database OIDs in a comment.
Oversight in commit aa01051418.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/4055696.1744134682%40sss.pgh.pa.us
2025-04-09 14:27:08 -05:00
Tom Lane
837cc73af2 Fix performance issue in deadlock-parallel isolation test.
With debug_discard_caches = 1, the runtime of this test script
increased by about a factor of 10 after commit 0dca5d68d.  That's
causing some of our buildfarm animals to fail with a timeout.

The reason for the increased time is that now we are re-planning
some intentionally-non-inlineable SQL functions on every execution,
where the previous coding held onto the original plans throughout
the outer query.  The previous behavior was arguably quite buggy,
so I don't think 0dca5d68d deserves blame here.  But we would
like this test script to not take so long.

To fix, instead of forcing a "parallel safe" label via a
non-inlineable SQL function, apply it directly to the advisory-lock
functions by making internal-language aliases for them.  A small
problem is that the advisory-lock functions return void but this
test would really like them to return integer 1.  I cheated here by
declaring the aliases as returning "int".  That's perhaps undue
familiarity with the implementation of PG_RETURN_VOID(), but that
hasn't changed in twenty years and is unlikely to do so in the next
twenty.  That gets us an integer 0 result, and then an inline-able
wrapper to convert that to an integer 1 allows the rest of the script
to remain unchanged.

For me, this reduces the runtime with debug_discard_caches = 1
by about 100x, making the test comfortably faster than before
instead of slower.

Discussion: https://postgr.es/m/136163.1744179562@sss.pgh.pa.us
2025-04-09 12:28:34 -04:00
Noah Misch
5bbc596391 Fix test races between syscache-update-pruned.spec and autovacuum.
This spec fails ~3% of my Valgrind runs, and the spec has failed on Valgrind
buildfarm member skink at a similar rate.  Two problems contributed to that:

- A competing buffer pin triggered VACUUM's lazy_scan_noprune() path, causing
  "tuples missed: 1 dead from 1 pages not removed due to cleanup lock
  contention".  FREEZE fixes that.

- The spec ran lazy VACUUM immediately after VACUUM FULL.  The spec implicitly
  assumed lazy VACUUM prunes the one tuple that VACUUM FULL made dead.  First
  wait for old snapshots, making that assumption reliable.

This also adds two forms of defense in depth:

- Wait for snapshots using shared catalog pruning rules (VISHORIZON_SHARED).
  This avoids the removable cutoff moving backward when an XID-bearing
  autoanalyze process runs in another database.  That may never happen in this
  test, but it's cheap insurance.

- Use lazy VACUUM option DISABLE_PAGE_SKIPPING.  Commit
  c2dc1a7976 did this for a related requirement
  in other tests, but I suspect FREEZE is necessary and sufficient in all
  these tests.

Back-patch to v17, where the test first appeared.

Reported-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/sv3taq4e6ea4qckimien3nxp3sz4b6cw6sfcy4nhwl52zpur4g@h6i6tohxmizu
Backpatch-through: 17
2025-04-09 07:23:39 -07:00
Peter Eisentraut
306dd6e727 Update config.guess and config.sub 2025-04-09 12:41:54 +02:00
Heikki Linnakangas
0f1433f053 Fix a few oversights in the longer cancel keys patch
Change MyCancelKeyLength's type from uint8 to int. While it always
fits in a uint8, plain int is less surprising, as there's no
particular reason for it to be uint8.

Fix one ProcSignalInit caller that passed 'false' instead of NULL for
the pointer argument.

Author: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://www.postgresql.org/message-id/61be9e31-7b7d-49d5-bc11-721800d89d64@eisentraut.org
2025-04-09 13:11:42 +03:00
Daniel Gustafsson
ef366b7d7e Perform missed catversion bump
Commit c57971034e renamed an argument for a function but missed
to bump the catversion to reflect this.

Reported-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CAApHDvqOega=dPtu3h2C5fJWJEuaGCMDib_sVfhKQqgUNJVmFA@mail.gmail.com
2025-04-09 09:29:12 +02:00
Tom Lane
dd496eedea Doc: note that two examples in optimizer/README are oversimplified.
These examples fail to account for join clauses generated by
EquivalenceClasses, but since we haven't mentioned EquivalenceClasses
yet it seems like it'd just add confusion to make them fully accurate.
Instead, parenthetically note that they're oversimplified.

Reported-by: Zeyuan Hu <ferrishu3886@gmail.com>
Co-authored-by: David Rowley <dgrowleyml@gmail.com>
Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CACvHWmYFo+60yMqKJajDDvKN5EM41YHrCT3oxukwXmGAqpWvyw@mail.gmail.com
2025-04-08 23:03:33 -04:00
Tom Lane
b65b9da568 Adjust AdjustUpgrade.pm for commit b1720fe63.
Need to delete the functions we no longer have available from
the dumps to be reloaded from old versions.

Per buildfarm.
2025-04-08 20:21:03 -04:00
Tom Lane
b1720fe63f Move contrib/spi testing from core regression tests to contrib/spi.
It's weird to have the core regression tests depending on contrib
code, and coverage testing shows that those test queries add nothing
to the core-code coverage of the core tests.  So pull those test bits
out and put them into ordinary test scripts inside contrib/spi/,
making that more like other contrib modules.

Aside from being structurally nicer, anything we can take out of the
core tests (which are executed multiple times per check-world run)
and put into tests executed only once should be a win.  It doesn't
look like this change will buy a whole lot of milliseconds, but a
cycle saved is a cycle earned.

Also, there is some discussion around possibly removing refint and/or
autoinc altogether.  I don't know if that will happen, but we'd
certainly need to decouple them from the core tests to do so.

The tests for autoinc were quite intertwined with the undocumented
"ttdummy" trigger in regress.c.  That made the tests very hard to
understand and contributed nothing to autoinc's testing either.
So I just deleted ttdummy and rewrote the autoinc tests without it.

I realized while doing this that the description of autoinc in
the SGML docs is not a great description of what the function
actually does, so the patch includes some updates to those docs.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/3872677.1744077559@sss.pgh.pa.us
2025-04-08 19:12:03 -04:00
Daniel Gustafsson
c57971034e Rename argument in pg_get_process_memory_contexts().
During development the third argument to pg_get_process_memory_contexts
was a retry count, but it was changed to a timeout instead.  The param
name was accidentally left in pg_proc.dat though.  Fix by renaming to
the correct parameter name.

Author: Fujii Masao <masao.fujii@oss.nttdata.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/3eb40b3e-45c7-426a-b7f8-81f7d05a9b53@oss.nttdata.com
2025-04-08 23:09:13 +02:00
Peter Eisentraut
8969194b73 Fix incorrect format placeholder
for commit 749a9e20c9
2025-04-08 19:12:03 +02:00
Nathan Bossart
b0a4c3e88b Prevent 006_transfer_modes.pl from leaving files behind.
This test was leaving files like delete_old_cluster.{sh,bat} in the
source directory for VPATH and meson builds.  To fix, change the
directory to tmp_check before running the test, as was done in
commits 15b6d21553, 8af917be6b, and c462b054ba.

Oversight in commit af0d4901c1.

Reported-by: Andrew Dunstan <andrew@dunslane.net> (on Discord)
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/Z_RHkG770w3SE0yU%40nathan
2025-04-08 10:57:31 -05:00
Daniel Gustafsson
88edd661c8 ci: Add MBUILD_TARGET for NetBSD and OpenBSD
Commit b2bdb972c0 added MBUILD_TARGET to ensure that meson builds
the tests before running them, this adds MBUILD_TARGET to OpenBSD
and NetBSD builds as well where it was missing.

No backpatching since OpenBSD and NetBSD support does not exist
in the backbranch CI.

Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CAN55FZ2LNnRrtL+cpSdEg44fQcLPq_GjJjfNa0vz+xqEdq=ZHw@mail.gmail.com
2025-04-08 15:28:29 +02:00
Tomas Vondra
91f1fe90c7 pg_buffercache: Change page_num type to bigint
The page_num was defined as integer, which should be sufficient for the
near future (with 4K pages it's 8TB). But it's virtually free to return
bigint, and get a wider range. This was agreed on the thread, but I
forgot to tweak this in ba2a3c2302.

While at it, make the data types in CREATE VIEW a bit more consistent.

Discussion: https://postgr.es/m/CAKZiRmxh6KWo0aqRqvmcoaX2jUxZYb4kGp3N%3Dq1w%2BDiH-696Xw%40mail.gmail.co
2025-04-08 12:38:42 +02:00
Tomas Vondra
b8a6078ca8 doc: Correct pg_shmem_allocations_numa.size data type
The code in pg_get_shmem_allocations_numa() returned 'size' as int64,
but the docs said int32.

Report and fix by Noriyoshi Shinoda.

Reported-by: Noriyoshi Shinoda <noriyoshi.shinoda@hpe.com>
Discussion: https://postgr.es/m/DM4PR84MB1734308EB741A6ECFF040C27EEAA2@DM4PR84MB1734.NAMPRD84.PROD.OUTLOOK.COM
2025-04-08 12:36:36 +02:00
Amit Kapila
12eece5fd5 Fix uninitialized index information access during apply.
The issue happens when building conflict information during apply of
INSERT or UPDATE operations that violate unique constraints on leaf
partitions.

The problem was introduced in commit 9ff68679b5, which removed the
redundant calls to ExecOpenIndices/ExecCloseIndices. The previous code was
relying on the redundant ExecOpenIndices call in
apply_handle_tuple_routing() to build the index information required for
unique key conflict detection.

The fix is to delay building the index information until a conflict is
detected instead of relying on ExecOpenIndices to do the same. The
additional benefit of this approach is that it avoids building index
information when there is no conflict.

Author: Hou Zhijie <houzj.fnst@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/TYAPR01MB57244ADA33DDA57119B9D26494A62@TYAPR01MB5724.jpnprd01.prod.outlook.com
2025-04-08 15:35:42 +05:30
Thomas Munro
7ea21f4ee2 Fix typo in docs.
Typo in previous commit.
2025-04-08 22:02:45 +12:00
Thomas Munro
f78ca6f3eb Introduce file_copy_method setting.
It can be set to either COPY (the default) or CLONE if the system
supports it.  CLONE causes callers of copydir(), currently CREATE
DATABASE ... STRATEGY=FILE_COPY and ALTER DATABASE ... SET TABLESPACE =
..., to use copy_file_range (Linux, FreeBSD) or copyfile (macOS) to copy
files instead of a read-write loop over the contents.

CLONE gives the kernel the opportunity to share block ranges on
copy-on-write file systems and push copying down to storage on others,
depending on configuration.  On some systems CLONE can be used to clone
large databases quickly with CREATE DATABASE ... TEMPLATE=source
STRATEGY=FILE_COPY.

Other operating systems could be supported; patches welcome.

Co-authored-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Ranier Vilela <ranier.vf@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGLM%2Bt%2BSwBU-cHeMUXJCOgBxSHLGZutV5zCwY4qrCcE02w%40mail.gmail.com
2025-04-08 21:35:38 +12:00
Daniel Gustafsson
042a66291b Add function to get memory context stats for processes
This adds a function for retrieving memory context statistics
and information from backends as well as auxiliary processes.
The intended usecase is cluster debugging when under memory
pressure or unanticipated memory usage characteristics.

When calling the function it sends a signal to the specified
process to submit statistics regarding its memory contexts
into dynamic shared memory.  Each memory context is returned
in detail, followed by a cumulative total in case the number
of contexts exceeds the maximum allocated amount of shared memory.
Each process is limited to using at most 1MB of memory for this.

A summary can also be explicitly requested by the user, this
will return the TopMemoryContext and a cumulative total of
all lower contexts.

In order to not block on busy processes the caller specifies
the number of seconds during which to retry before timing out.
In the case where no statistics are published within the set
timeout, the last known statistics are returned, or NULL if
no previously published statistics exist.  This allows
dashboard-type queries to keep returning results even if the
target process is temporarily congested.  Context records contain a
timestamp to indicate when they were submitted.

Author: Rahila Syed <rahilasyed90@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Reviewed-by: Atsushi Torikoshi <torikoshia@oss.nttdata.com>
Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
Discussion: https://postgr.es/m/CAH2L28v8mc9HDt8QoSJ8TRmKau_8FM_HKS41NeO9-6ZAkuZKXw@mail.gmail.com
2025-04-08 11:06:56 +02:00
Andres Freund
15f0cb26b5 Increase BAS_BULKREAD based on effective_io_concurrency
Before, BAS_BULKREAD was always of size 256kB. With the default
io_combine_limit of 16, that only allowed 1-2 IOs to be in flight -
insufficient even on very low latency storage.

We don't just want to increase the size to a much larger hardcoded value, as
very large rings (tens of MBs of buffers) appear to have negative
performance effects when reading in data that the OS has cached (but not when
actually needing to do IO).

To address this, increase the size of BAS_BULKREAD to allow for
io_combine_limit * effective_io_concurrency buffers getting read in. To
prevent the ring being much larger than useful, limit the increased size with
GetPinLimit().
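
A minimal sketch of the sizing rule just described; GetPinLimit() is
named in the message above, the rest of the arithmetic is an assumption:

    /* allow io_combine_limit * effective_io_concurrency buffers in the
     * ring, but keep the old 256kB floor and respect the pin limit */
    ring_buffers = io_combine_limit * effective_io_concurrency;
    ring_buffers = Max(ring_buffers, 256 * 1024 / BLCKSZ);
    ring_buffers = Min(ring_buffers, GetPinLimit());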

The formula outlined above keeps the ring size to sizes for which we have not
observed performance regressions, unless very large effective_io_concurrency
values are used together with large shared_buffers setting.

Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/lqwghabtu2ak4wknzycufqjm5ijnxhb4k73vzphlt2a3wsemcd@gtftg44kdim6
Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah@brqs62irg4dt
2025-04-08 02:41:03 -04:00
Andres Freund
dcf7e1697b Add pg_buffercache_evict_{relation,all} functions
In addition to the added functions, the pg_buffercache_evict() function now
shows whether the buffer was flushed.

pg_buffercache_evict_relation(): Evicts all shared buffers in a
relation at once.
pg_buffercache_evict_all(): Evicts all shared buffers at once.

Both functions provide a mechanism to evict multiple shared buffers at
once. They are designed to address the inefficiency of repeatedly calling
pg_buffercache_evict() for each individual buffer, which can be time-consuming
when dealing with large shared buffer pools. (e.g., ~477ms vs. ~2576ms for
16GB of fully populated shared buffers).

These functions are intended for developer testing and debugging
purposes and are available to superusers only.

Minimal tests for the new functions are included. Also, there was no test for
pg_buffercache_evict(), so a test for it is added too.

No new extension version is needed, as it was already increased this release
by ba2a3c2302.

Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Aidar Imamov <a.imamov@postgrespro.ru>
Reviewed-by: Joseph Koshakow <koshy44@gmail.com>
Discussion: https://postgr.es/m/CAN55FZ0h_YoSqqutxV6DES1RW8ig6wcA8CR9rJk358YRMxZFmw%40mail.gmail.com
2025-04-08 02:19:32 -04:00
David Rowley
d69d45a5a9 Speedup child EquivalenceMember lookup in planner
When planning queries to partitioned tables, we clone all
EquivalenceMembers belonging to the partitioned table into em_is_child
EquivalenceMembers for each non-pruned partition.  For partitioned tables
with large numbers of partitions, this meant the ec_members list could
become large and code searching that list would become slow.  Effectively,
the more partitions which were present, the more searches needed to be
performed for operations such as find_ec_member_matching_expr() during
create_plan() and the more partitions present, the longer these searches
would take, i.e., a quadratic slowdown.

To fix this, here we adjust how we store EquivalenceMembers for
em_is_child members.  Instead of storing these directly in ec_members,
these are now stored in a new array of Lists in the EquivalenceClass,
which is indexed by the relid.  When we want to find EquivalenceMembers
belonging to a certain child relation, we can narrow the search to the
array element for that relation.

To make EquivalenceMember lookup easier and to reduce the amount of code
change, this commit provides a pair of functions to allow iteration over
the EquivalenceMembers of an EC which also handles finding the child
members, if required.  Callers that never need to look at child members
can remain using the foreach loop over ec_members, which will now often
be faster due to only parent-level members being stored there.
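
A minimal sketch of the iterator pair in use; the function and type
names are assumed from the commit message:

    EquivalenceMemberIterator it;
    EquivalenceMember *em;

    /* visit parent members plus only the child members for these relids */
    setup_eclass_member_iterator(&it, ec, child_relids);
    while ((em = eclass_member_iterator_next(&it)) != NULL)
    {
        /* ... same inspection the old foreach over ec_members did ... */
    }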

The actual performance increases here are highly dependent on the number
of partitions and the query being planned.  Performance increases can be
visible with as few as 8 partitions, but the speedup is marginal for
such low numbers of partitions.  The speedups become much more visible
with a few dozen to hundreds of partitions.  With some tested queries
using 56 partitions, the planner was around 3x faster than before.  For
use cases with thousands of partitions, these are likely to become
significantly faster.  Some testing has shown planner speedups of 60x or
more with 8192 partitions.

Author: Yuya Watari <watari.yuya@gmail.com>
Co-authored-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andrey Lepikhov <a.lepikhov@postgrespro.ru>
Reviewed-by: Alena Rybakina <lena.ribackina@yandex.ru>
Reviewed-by: Dmitry Dolgov <9erthalion6@gmail.com>
Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Tested-by: Thom Brown <thom@linux.com>
Tested-by: newtglobal postgresql_contributors <postgresql_contributors@newtglobalcorp.com>
Discussion: https://postgr.es/m/CAJ2pMkZNCgoUKSE%2B_5LthD%2BKbXKvq6h2hQN8Esxpxd%2Bcxmgomg%40mail.gmail.com
2025-04-08 18:09:57 +12:00
Amit Kapila
105b2cb336 Stabilize 035_standby_logical_decoding.pl.
Some tests try to invalidate logical slots on the standby server by
running VACUUM on the primary. The problem is that xl_running_xacts was
getting generated and replayed before the VACUUM command, leading to the
advancement of the active slot's catalog_xmin. Due to this, active slots
were not getting invalidated, leading to test failures.

We fix it by skipping the generation of xl_running_xacts for the required
tests with the help of injection points. As the required interface for
injection points was not present in back branches, we fixed the failing
tests in them by disallowing the slot to become active for the required
cases (where rows_removed conflict could be generated).

Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Backpatch-through: 16, where it was introduced
Discussion: https://postgr.es/m/Z6oQXc8LmiTLfwLA@ip-10-97-1-34.eu-west-3.compute.internal
2025-04-08 09:38:02 +05:30
Bruce Momjian
46b4ba533c Fix PG 17 [NOT] NULL optimization bug for domains
A PG 17 optimization allowed columns with NOT NULL constraints to skip
table scans for IS NULL queries, and to skip IS NOT NULL checks for IS
NOT NULL queries.  This didn't work for domain types, since domain types
don't follow the IS NULL/IS NOT NULL constraint logic.  To fix, disable
this optimization for domains for PG 17+.

Reported-by: Jan Behrens

Diagnosed-by: Tom Lane

Discussion: https://postgr.es/m/Z37p0paENWWUarj-@momjian.us

Backpatch-through: 17
2025-04-07 21:33:42 -04:00
Michael Paquier
039549d70f Flush the IO statistics of active WAL senders more frequently
WAL senders do not flush their statistics until they exit, limiting the
monitoring possible for live processes.  This is penalizing when WAL
senders are running for a long time, like in streaming or logical
replication setups, because it is not possible to know the amount of IO
they generate while running.

This commit makes WAL senders more aggressive with their statistics
flush, using an interval of 1 second, with the flush timing calculated
based on the existing GetCurrentTimestamp() call made before the sleeps
that wait for activity.  Note that the sleep done for logical and
physical WAL senders happens in two different code paths, so the stats
flushes need to happen in these two places.

One test is added for the physical WAL sender case, and one for the
logical WAL sender case.  This can be done in a stable fashion by
relying on the WAL generated by the TAP tests in combination with a
stats reset while a server is running, but only on HEAD as WAL data has
been added to pg_stat_io in a051e71e28.

This issue exists since a9c70b46db and the introduction of pg_stat_io,
so backpatch down to v16.

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>
Discussion: https://postgr.es/m/Z73IsKBceoVd4t55@ip-10-97-1-34.eu-west-3.compute.internal
Backpatch-through: 16
2025-04-08 07:57:19 +09:00
Tomas Vondra
ba2a3c2302 Add pg_buffercache_numa view with NUMA node info
Introduces a new view pg_buffercache_numa, showing NUMA memory nodes
for individual buffers. For each buffer the view returns an entry for
each memory page, with the associated NUMA node.

The database blocks and OS memory pages may have different sizes - the
default block size is 8KB, while the memory page is 4K (on x86). But
other combinations are possible, depending on configure parameters,
platform, etc. This means buffers may overlap with multiple memory
pages, each associated with a different NUMA node.

To determine the NUMA node for a buffer, we first need to touch the
memory pages using pg_numa_touch_mem_if_required, otherwise we might get
status -2 (ENOENT = The page is not present), indicating the page is
either unmapped or unallocated.

The view may be relatively expensive, especially when accessed for the
first time in a backend, as it touches all memory pages to get reliable
information about the NUMA node. This may also force allocation of the
shared memory.

Author: Jakub Wartak <jakub.wartak@enterprisedb.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CAKZiRmxh6KWo0aqRqvmcoaX2jUxZYb4kGp3N%3Dq1w%2BDiH-696Xw%40mail.gmail.com
2025-04-07 23:08:17 +02:00
Tomas Vondra
8cc139bec3 Introduce pg_shmem_allocations_numa view
Introduce a new pg_shmem_allocations_numa view with information about how
shared memory is distributed across NUMA nodes. For each shared memory
segment, the view returns one row for each NUMA node backing it, with
the total amount of memory allocated from that node.

The view may be relatively expensive, especially when executed for the
first time in a backend, as it has to touch all memory pages to get
reliable information about the NUMA node. This may also force allocation
of the shared memory.

Unlike pg_shmem_allocations, the view does not show anonymous shared
memory allocations. It also does not show memory allocated using the
dynamic shared memory infrastructure.

Author: Jakub Wartak <jakub.wartak@enterprisedb.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CAKZiRmxh6KWo0aqRqvmcoaX2jUxZYb4kGp3N%3Dq1w%2BDiH-696Xw%40mail.gmail.com
2025-04-07 23:08:17 +02:00
Tomas Vondra
65c298f61f Add support for basic NUMA awareness
Add basic NUMA awareness routines, using a minimal src/port/pg_numa.c
portability wrapper and an optional build dependency, enabled by
--with-libnuma configure option. For now this is Linux-only, other
platforms may be supported later.

A built-in SQL function pg_numa_available() allows checking NUMA
support, i.e. that the server was built/linked with the NUMA library.

The main function introduced is pg_numa_query_pages(), which allows
determining the NUMA node for individual memory pages. Internally the
function uses move_pages(2) syscall, as it allows batching, and is more
efficient than get_mempolicy(2).
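
A minimal sketch of the Linux wrapper; numa_move_pages() with a NULL
nodes array only queries page placement:

    #include <numaif.h>

    int
    pg_numa_query_pages(int pid, unsigned long count, void **pages, int *status)
    {
        /* nodes == NULL: report the NUMA node of each page in status[] */
        return numa_move_pages(pid, count, pages, NULL, status, 0);
    }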

Author: Jakub Wartak <jakub.wartak@enterprisedb.com>
Co-authored-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CAKZiRmxh6KWo0aqRqvmcoaX2jUxZYb4kGp3N%3Dq1w%2BDiH-696Xw%40mail.gmail.com
2025-04-07 23:08:17 +02:00
Álvaro Herrera
17bcf4f545
Use specific collation where needed in new test
Oversight in commit a379061a22.

Per Czech buildfarm members jay and hippopotamus.
2025-04-07 21:58:06 +02:00
Tom Lane
8cfbdf8f4d Fix some issues in contrib/spi/refint.c.
check_foreign_key incorrectly used a single cache entry for its saved
plans for a 'c' (cascade) trigger, although there are two different
queries to execute depending on whether it fires for an update or a
delete.  This caused the wrong things to be done if both types of
event occur in one session.  (This was indeed visible in the triggers
regression test, but apparently nobody ever questioned it.)  To fix,
add the operation type to the cache key.

Its debug log output failed to distinguish update from delete
events, too.

Also, change the intended trigger usage from BEFORE ROW to AFTER ROW,
and add checks insisting on that usage.  BEFORE is really rather
unsafe, since if there are other BEFORE triggers they might change or
cancel the operation we are trying to check.  AFTER triggers are the
standard way to propagate changes to other rows, so we should follow
that way here.

In passing, remove a useless duplicate lookup of the cache entry.

This code is mostly intended as a documentation example, so we
won't consider a back-patch.

Author: Dmitrii Bondar <d.bondar@postgrespro.ru>
Reviewed-by: Paul Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Lilian Ontowhee <ontowhee@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/79755a2b18ed4fe5e29da6a87a1e00d1@postgrespro.ru
2025-04-07 15:54:16 -04:00
Andres Freund
8e293e689b aio: Make AIO more compatible with valgrind
In some edge cases valgrind flags issues with the memory referenced by
IOs. All of the cases addressed in this change are false positives.

Most of the false positives are caused by UnpinBuffer[NoOwner] marking buffer
data as inaccessible. This happens even though the AIO subsystem still holds a
pin. That's good: there shouldn't be accesses to the buffer outside of
AIO-related code until it is pinned by "user" code again. But it requires some
explicit work - if the buffer is not pinned by the current backend, we need to
explicitly mark the buffer data accessible/inaccessible while executing
completion callbacks.

That however causes a cascading issue in IO workers: After the completion
callbacks for a buffer are executed, the page is marked as inaccessible. If
subsequently the same worker is executing IO targeting the same buffer, we
would get an error, as the memory is still marked inaccessible. To avoid that,
we need to explicitly mark the memory as accessible in IO workers.

Another issue is that IO executed in workers or via io_uring will not mark
memory as DEFINED. In the case of workers that is because valgrind does not
track memory definedness across processes. For io_uring that is because
valgrind does not understand io_uring, and therefore its IOs never mark memory
as defined, whether the completions are processed in the defining process or
in another context.  It's not entirely clear how to best solve that. The
current user of AIO is not affected, as it explicitly marks buffers as DEFINED
& NOACCESS anyway.  Defer solving this issue until we have a user with
different needs.

Per buildfarm animal skink.

Reviewed-by: Noah Misch <noah@leadboat.com>
Co-authored-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/3pd4322mogfmdd5nln3zphdwhtmq3rzdldqjwb2sfqzcgs22lf@ok2gletdaoe6
2025-04-07 15:20:30 -04:00
Andres Freund
8ab4241b9f localbuf: Add Valgrind buffer access instrumentation
This mirrors 1e0dfd166b (+ 46ef520b95), for temporary table buffers. This
is mainly interesting right now because the AIO work currently triggers
spurious valgrind errors, and the fix for that is cleaner if temp buffers
behave the same as shared buffers.

This requires one change beyond the annotations themselves, namely to pin
local buffers while writing them out in FlushRelationBuffers().

Reviewed-by: Noah Misch <noah@leadboat.com>
Co-authored-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/3pd4322mogfmdd5nln3zphdwhtmq3rzdldqjwb2sfqzcgs22lf@ok2gletdaoe6
2025-04-07 15:20:30 -04:00
Masahiko Sawada
a13d49014d doc: Fix a typo in pg_recvlogical documentation.
Oversight in cf2655a902.

Author: Zhijie Hou <houzj.fnst@fujitsu.com>
Discussion: https://postgr.es/m/OS3PR01MB5718DD1466E2B9043448AE5094AA2@OS3PR01MB5718.jpnprd01.prod.outlook.com
2025-04-07 12:13:08 -07:00
Tom Lane
969ab9d4f5 Follow-up fixes for SHA-2 patch (commit 749a9e20c).
This changes the check for valid characters in the salt string to
only allow plain ASCII letters and digits.  The previous coding was
locale-dependent, which doesn't really seem like a great idea here;
moreover it could not work correctly in multibyte encodings.

This fixes a careless pointer-use-after-pfree, too.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Reported-by: Andres Freund <andres@anarazel.de>
Author: Bernd Helmle <mailings@oopsware.de>
Discussion: https://postgr.es/m/6fab35422df6b6b9727fdcc243c5fa1c667dd3b5.camel@oopsware.de
2025-04-07 14:14:28 -04:00
Tom Lane
b73e6d71a8 Fix erroneous construction of functions' dependencies on transforms.
The list of transform objects that a function should use is specified
in CREATE FUNCTION's TRANSFORM clause, and then represented indirectly
in pg_proc.protrftypes.  However, ProcedureCreate completely ignored
that for purposes of constructing pg_depend entries, and instead made
the function depend on any transforms that exist for its parameter or
return data types.  This is bad in both directions: the function could
be made dependent on a transform it does not actually use, or it
could try to use a transform that's since been dropped.  (The latter
scenario would require use of a transform that's not for any of the
parameter or return types, but that seems legit for cases where the
function performs SQL operations internally.)

To fix, pass in the list of transform objects that CreateFunction
identified, and build pg_depend entries from that not from the
parameter/return types.  This results in changes in the expected
test outputs in contrib/bool_plperl, which I guess are due to
different ordering of pg_depend entries -- that test case is
surely not exercising either of the problem scenarios.

This fix is not back-patchable as-is: changing the signature of
ProcedureCreate seems too risky in stable branches.  We could
do something like making ProcedureCreate a wrapper around
ProcedureCreateExt or so.  However, I'm more inclined to do
nothing in the back branches.  We had no field complaints up to
now, so the hazards don't seem to be a big issue in practice.
And we couldn't do anything about existing pg_depend entries,
so a back-patched fix would result in a mishmash of dependencies
created according to different rules.  That cure could be worse
than the disease, perhaps.

I bumped catversion just to lay down a marker that the expected
contents of pg_depend are a bit different than before.

Reported-by: Chapman Flack <jcflack@acm.org>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/3112950.1743984111@sss.pgh.pa.us
2025-04-07 13:31:37 -04:00
Álvaro Herrera
a379061a22
Allow NOT NULL constraints to be added as NOT VALID
This allows them to be added without scanning the table, and validated
afterwards without holding an access exclusive lock on the table, once
any violating rows have been deleted or fixed.

Doing ALTER TABLE ... SET NOT NULL for a column that has an invalid
not-null constraint validates that constraint.  ALTER TABLE .. VALIDATE
CONSTRAINT is also supported.  There are various checks on whether an
invalid constraint is allowed in a child table when the parent table has
a valid constraint; this should match what we do for enforced/not
enforced constraints.
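
For example (editor's sketch with a hypothetical table t and column a):

    ALTER TABLE t ADD CONSTRAINT t_a_not_null NOT NULL a NOT VALID;
    -- fix or delete any violating rows, then:
    ALTER TABLE t VALIDATE CONSTRAINT t_a_not_null;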

pg_attribute.attnotnull is now only an indicator for whether a not-null
constraint exists for the column; whether it's valid or invalid must be
queried in pg_constraint.  Applications can continue to query
pg_attribute.attnotnull as before, but now it's possible that NULL rows
are present in the column even when that's set to true.

For backend internal purposes, we cache the nullability status in
CompactAttribute->attnullability that each tuple descriptor carries
(replacing CompactAttribute.attnotnull, which was a mirror of
Form_pg_attribute.attnotnull).  During the initial tuple descriptor
creation, based on the pg_attribute scan, we set this to UNRESTRICTED if
pg_attribute.attnotnull is false, or to UNKNOWN if it's true; then we
update the latter to VALID or INVALID depending on the pg_constraint
scan.  This flag is also copied when tupledescs are copied.

Comparing tuple descs for equality must also compare the
CompactAttribute.attnullability flag and return false in case of a
mismatch.

pg_dump deals with these constraints by storing the OIDs of invalid
not-null constraints in a separate array, and running a query to obtain
their properties.  The regular table creation SQL omits them entirely.
They are then dealt with in the same way as "separate" CHECK
constraints, and dumped after the data has been loaded.  Because no
additional pg_dump infrastructure was required, we don't bump its
version number.

I decided not to bump catversion either, because the old catalog state
works perfectly in the new world.  (Trying to run with new catalog state
and the old server version would likely run into issues, however.)

System catalogs do not support invalid not-null constraints (because
commit 14e87ffa5c didn't allow them to have pg_constraint rows
anyway).

Author: Rushabh Lathia <rushabh.lathia@gmail.com>
Author: Jian He <jian.universality@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Tested-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://postgr.es/m/CAGPqQf0KitkNack4F5CFkFi-9Dqvp29Ro=EpcWt=4_hs-Rt+bQ@mail.gmail.com
2025-04-07 19:19:50 +02:00
Andrew Dunstan
b52a4a5f28 Clean up error messages from 1495eff7bd
Quote file names, and mostly avoid hard coded file names. Along the way
make a few other minor improvements.

Discussion: https://postgr.es/m/20250407.152721.1397761902317499205.horikyota.ntt@gmail.com
2025-04-07 12:22:41 -04:00
Tom Lane
3516ea768c Add local-address escape "%L" to log_line_prefix.
This escape shows the numeric server IP address that the client
has connected to.  Unix-socket connections will show "[local]".
Non-client processes (e.g. background processes) will show "[none]".

We expect that this option will be of interest to only a fairly
small number of users.  Therefore the implementation is optimized
for the case where it's not used (that is, we don't do the string
conversion until we have to), and we've not added the field to
csvlog or jsonlog formats.
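
For illustration (editor's example, not part of the commit message):

    ALTER SYSTEM SET log_line_prefix = '%m [%p] %L ';
    SELECT pg_reload_conf();
    -- subsequent log lines include the local address the client connected to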

Author: Greg Sabino Mullane <htamfids@gmail.com>
Reviewed-by: Cary Huang <cary.huang@highgo.ca>
Reviewed-by: David Steele <david@pgmasters.net>
Reviewed-by: Jim Jones <jim.jones@uni-muenster.de>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAKAnmmK-U+UicE-qbNU23K--Q5XTLdM6bj+gbkZBZkjyjrd3Ow@mail.gmail.com
2025-04-07 11:06:05 -04:00
Andrew Dunstan
8f5e419484 Revert "Use workaround of __builtin_setjmp only on MINGW on MSVCRT"
This reverts commit c313fa4602.

This is found to cause issues on x86_64 Windows even when using UCRT.

Discussion: https://postgr.es/m/3312149.1744001936@sss.pgh.pa.us
2025-04-07 11:01:15 -04:00
Andres Freund
8ce79483dc read_stream: Fix overflow hazard with large shared buffers
If the limit returned by GetAdditionalPinLimit() is large, the buffer_limit
variable in read_stream_start_pending_read() can overflow. While the code is
careful to limit buffer_limit to PG_INT16_MAX, we subsequently add the number of
forwarded buffers.

The overflow can lead to assertion failures, crashes or wrong query results
when using large shared buffers.

It seems easier to avoid this if we make the buffer_limit variable an int,
instead of an int16.  Do so, and clamp buffer_limit after adding the number of
forwarded buffers.

It's possible we might want to address this and related issues more widely by
changing to int instead of int16 more widely, but since the consequences of
this bug can be confusing, it seems better to fix it now.

This bug was introduced in ed0b87caac.

Discussion: https://postgr.es/m/ewvz3cbtlhrwqk7h6ca6cctiqh7r64ol3pzb3iyjycn2r5nxk5@tnhw3a5zatlr
2025-04-07 09:45:00 -04:00
Alexander Korotkov
717d0e8dd9 Remove GUC_NOT_IN_SAMPLE from enable_self_join_elimination
fc069a3a63 implements Self-Join Elimination (SJE) and provides a new GUC
variable: enable_self_join_elimination.  This new GUC variable was marked
as GUC_NOT_IN_SAMPLE.  However, enable_self_join_elimination is documented
and is not different from any other enable_* GUCs.  Thus, remove
GUC_NOT_IN_SAMPLE from it and add it to the postgresql.conf.sample.

Discussion: https://postgr.es/m/CAPpHfdsqMTEsmxk3aQwt6xPz%2BKpUELO%3D6fzmER9ZRGrbs4uMfA%40mail.gmail.com
Author: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
2025-04-07 16:28:54 +03:00
Daniel Gustafsson
ae60947643 psql: Clarify help message for WATCH_INTERVAL
The help message for WATCH_INTERVAL was hard to interpret and didn't
follow the style of other messages; this updates it to make it fit in
better and be easier to interpret.

Author: Daniel Gustafsson <daniel@yesql.se>
Reported-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Discussion: https://postgr.es/m/20250326.120732.1167093737847500721.horikyota.ntt@gmail.com
2025-04-07 13:44:58 +02:00
Michael Paquier
d6f118444d Fix grammar in log message of pg_restore.c
Introduced by 1495eff7bd.

Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20250407.151359.72428746612514925.horikyota.ntt@gmail.com
2025-04-07 15:37:34 +09:00
Michael Paquier
2c7bd2ba50 libpq: Fix some issues in TAP tests for service files
The valid service file was not correctly shaped, as append_to_file() was
called with an array as input.  This is changed so that the parameter and
value pairs from the valid connection string are appended to the valid
service file one by one.

Even with the first issue fixed, the tests should have failed.  However, they
have been passing because all the connection attempts relied on the
default values given to PGPORT and PGHOST from the node when using
Cluster.pm's connect_ok() and connect_fails(), rather than the data in
the service file.  The test is updated to use an interesting trick: a
dummy node is initialized but not started, and all the connection
attempts are done through it.  This ensures that the data inside the
service file is used for all the connection tests.  Note that breaking
the contents of the valid service file on purpose makes all the tests
that rely on it fail.

Issues introduced by 72c2f36d57.

Author: Andrew Jackson <andrewjackson947@gmail.com>
Discussion: https://postgr.es/m/CAKK5BkG_6_YSaebM6gG=8EuKaY7_VX1RFgYeySuwFPh8FZY73g@mail.gmail.com
2025-04-07 12:55:09 +09:00
Michael Paquier
c36eda2591 Clarify comment for worst-case allocation in quote_literal_cstr()
palloc() is invoked with a specific formula for its allocation size in
quote_literal_cstr().  This wastes some memory, but the size is large
enough to cover even the worst-case scenarios.

No explanations were given about the reasons behind these numbers.  This
commit adds more documentation about all that.

Author: Steve Chavez <steve@supabase.io>
Discussion: https://postgr.es/m/CAGRrpzZ9bToRWS+fAnjxDJrxwZN1QcJ-y1Pn2yg=Hst6rydLtw@mail.gmail.com
2025-04-07 10:02:12 +09:00
Michael Paquier
3191a593d6 Fix use-after-free in pgstat_fetch_stat_backend_by_pid()
stats_fetch_consistency set to "snapshot" causes the backend entry
"beentry" retrieved by pgstat_get_beentry_by_proc_number() to be reset
at the beginning of pgstat_fetch_stat_backend() when fetching the
backend pgstats entry.  As coded, "beentry" was being accessed after
being freed.  This commit moves all the accesses to "beentry" to happen
before calling pgstat_fetch_stat_backend(), fixing the problem.

This problem could be reached by calling the SQL functions
pg_stat_get_backend_io() or pg_stat_get_backend_wal().
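
A minimal reproducer-style sketch (editor's example; it assumes the function
takes a backend PID):

    SET stats_fetch_consistency = 'snapshot';
    SELECT * FROM pg_stat_get_backend_io(pg_backend_pid());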

Issue caught by valgrind.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/f1788cc0-253a-4a3a-aee0-1b8ab9538736@gmail.com
2025-04-07 09:51:40 +09:00
Fujii Masao
173c97812f Use XLOG_CONTROL_FILE macro consistently for control file name.
The XLOG_CONTROL_FILE macro (defined in access/xlog_internal.h)
represents the control file name. While some parts of the codebase already
use this macro, others previously hardcoded the file name as a string.

This commit replaces those hardcoded strings with the macro,
ensuring consistent usage throughout the code. This makes future
maintenance easier and improves searchability, for example when
grepping for control file usage.

Author: Anton A. Melnikov <a.melnikov@postgrespro.ru>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/0841ec77-47e5-452a-adb4-c6fa55d605fc@postgrespro.ru
2025-04-07 09:27:33 +09:00
Daniel Gustafsson
a233a603ba doc: Clarify project naming
Clarify the project naming in the history section of the docs
to match the recent license preamble changes.

Backpatch to all supported versions.

Author: Dave Page <dpage@pgadmin.org>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CA+OCxozLzK2+Jc14XZyWXSp6L9Ot+3efwXUE35FJG=fsbib2EA@mail.gmail.com
Backpatch-through: 13
2025-04-07 00:03:18 +02:00
Andrew Dunstan
643a1a6198 Clean up checking for pg_dumpall output directory
Coverity objected to the original code, and in any case this is much
cleaner, using the existing routine pg_check_dir() instead of rolling
its own test.

Per suggestion from Tom Lane.
2025-04-06 17:04:58 -04:00
Tom Lane
218ab68275 Doc: fix PDF "contents ... exceed the available area" warnings.
Tweak column widths in a new table, similarly to some previous
fixes such as b62381d9a.

Per buildfarm.
2025-04-06 16:27:39 -04:00
Nathan Bossart
de48056ec7 pg_upgrade: Fix memory leak in check_for_unicode_update().
This function was initializing the "task" variable before a couple
of early returns.  To fix, postpone the initialization until just
before it's needed.

Per Coverity.

Discussion: https://postgr.es/m/Z_KMsUH2-FEbiNjC%40nathan
2025-04-06 15:11:41 -05:00
Andres Freund
57dec20fd4 aio: Avoid spurious coverity warning
PgAioResult.result is never accessed in the relevant path, but Coverity
complains about an uninitialized access anyway. So just zero-initialize the
whole thing.  While at it, reduce the scope of the variable.

Reported-by: Ranier Vilela <ranier.vf@gmail.com>
Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/CAEudQApsKqd-s+fsUQ0OmxJAMHmBSXxrAz3dCs+uvqb3iRtjSw@mail.gmail.com
2025-04-06 12:07:02 -04:00
Tom Lane
8ab6ef2bb8 Fix memory leaks in px_crypt_shacrypt().
Per Coverity.  I don't think these are of any actual significance
since the function ought to be invoked in a short-lived context.
Still, if it's trying to be neat it should get it right.

Also const-ify a constant and fix up typedef formatting.
2025-04-06 11:57:22 -04:00
Tom Lane
2e4ccf1b45 Use "(void)" to mark pgstat_lock_entry(..., false) calls.
This should silence Coverity's complaints about the result being
sometimes ignored.

I'm inclined to think that these routines are simply misdesigned,
because sometimes it's okay to ignore the result and sometimes it
isn't, and we have no way to enforce the latter.  But for now
I just added a comment.
2025-04-06 11:37:09 -04:00
Andrew Dunstan
5e19154390 Avoid unnecessary copying of a string in pg_restore.c
Coverity complained about a possible overrun in the copy, but there is
no actual need to copy the string at all.
2025-04-06 09:21:09 -04:00
Andrew Dunstan
6d5417e634 Fix a couple of memory leaks in pg_restore.c
Per complaint from Coverity.
2025-04-06 09:09:25 -04:00
Peter Eisentraut
a8025f5448 Relax ordering-related hardcoded btree requirements in planning
There were several places in ordering-related planning where a
requirement for btree was hardcoded but an amcanorder index could
suffice.  This fixes that.  We just need to do the necessary mapping
between strategy numbers and compare types and adjust some related
APIs so that this works independent of btree strategy numbers.  For
instance, non-btree amcanorder indexes can now be used to support
sorting and merge joins.  Also, predtest.c works independent of btree
strategy numbers now.

To avoid performance regressions, some details on btree and other
built-in index types are still hardcoded as shortcuts, but other index
types now have access to the same features by providing the required
flags and callbacks.

Author: Mark Dilger <mark.dilger@enterprisedb.com>
Co-authored-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2025-04-06 14:43:51 +02:00
Alexander Korotkov
3a1a7c5a70 Revert "Put enable_self_join_elimination into postgresql.conf.sample"
This reverts commit c2d329260c.

Reported-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/D292EB44-806E-439A-82A4-491A1BA59E7A%40yesql.se
2025-04-06 14:30:20 +03:00
Alexander Korotkov
c2d329260c Put enable_self_join_elimination into postgresql.conf.sample
fc069a3a63 implements Self-Join Elimination (SJE) and provides a new
GUC variable: enable_self_join_elimination.  This commit adds
enable_self_join_elimination to the postgresql.conf.sample, as it was
forgotten in the original commit.

Discussion: https://postgr.es/m/CAHewXN%3D%2Bghd6O6im46q7j2u6c3H6vkXtXmF%3D_v4CfGSnjje8PA%40mail.gmail.com
Author: Tender Wang <tndrwang@gmail.com>
2025-04-06 13:24:16 +03:00
John Naylor
3c6e8c1238 Compute CRC32C using AVX-512 instructions where available
The previous implementation of CRC32C on x86 relied on the native
CRC32 instruction from the SSE 4.2 extension, which operates on
up to 8 bytes at a time. We can get a substantial speedup by using
carryless multiplication on SIMD registers, processing 64 bytes per
loop iteration. Shorter inputs fall back to ordinary CRC instructions.
On Intel Tiger Lake hardware (2020), CRC is now 50% faster for inputs
between 64 and 112 bytes, and 3x faster for 256 bytes.

The VPCLMULQDQ instruction on 512-bit registers has been available
on Intel hardware since 2019 and AMD since 2022. There is an older
variant for 128-bit registers, but at least on Zen 2 it performs worse
than normal CRC instructions for short inputs.

We must now do a runtime check, even for builds that target SSE
4.2. This doesn't matter in practice for WAL (arguably the most
critical case), because since commit e2809e3a1 the final computation
with the 20-byte WAL header is inlined and unrolled when targeting
that extension. Compared with two direct function calls, testing
showed equal or slightly faster performance in performing an indirect
function call on several dozen bytes followed by inlined instructions
on constant input of 20 bytes.

The MIT-licensed implementation was generated with the "generate"
program from

https://github.com/corsix/fast-crc32/

Based on: "Fast CRC Computation for Generic Polynomials Using PCLMULQDQ
Instruction" V. Gopal, E. Ozturk, et al., 2009

Co-authored-by: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
Co-authored-by: Paul Amonson <paul.d.amonson@intel.com>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de> (earlier version)
Reviewed-by: Matthew Sterrett <matthewsterrett2@gmail.com> (earlier version)
Tested-by: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
Tested-by: David Rowley <dgrowleyml@gmail.com> (earlier version)
Discussion: https://postgr.es/m/BL1PR11MB530401FA7E9B1CA432CF9DC3DC192@BL1PR11MB5304.namprd11.prod.outlook.com
Discussion: https://postgr.es/m/PH8PR11MB82869FF741DFA4E9A029FF13FBF72@PH8PR11MB8286.namprd11.prod.outlook.com
2025-04-06 14:04:30 +07:00
Daniel Gustafsson
683df3f4de Quote filename in error message
Project standard is to quote filenames in error and log messages, which
commit 2da74d8d64 missed in two error messages.

Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reported-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/20250404.120328.103562371975971823.horikyota.ntt@gmail.com
2025-04-05 22:10:28 +02:00
Tom Lane
691836405f Fix parse_cte.c's failure to examine sub-WITHs in DML statements.
makeDependencyGraphWalker thought that only SelectStmt nodes could
contain a WithClause.  Which was true in our original implementation
of WITH, but astonishingly we missed updating this code when we added
the ability to attach WITH to INSERT/UPDATE/DELETE (and later MERGE).
Moreover, since it was coded to deliberately block recursion to a
WithClause, even updating raw_expression_tree_walker didn't save it.

The upshot of this was that we didn't see references to outer CTE
names appearing within an inner WITH, and would neither complain about
disallowed recursion nor account for such references when sorting CTEs
into a usable order.  The lack of complaints about this is perhaps not
so surprising, because typical usage of WITH wouldn't hit either case.
Still, it's pretty broken; failing to detect recursion here leads to
assert failures or worse later on.

Fix by factoring out the processing of sub-WITHs into a new function
WalkInnerWith, and invoking that for all the statement types that
can have WITH.
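
An example of the previously-mishandled nesting (editor's sketch; the table
name is hypothetical):

    WITH outer_cte AS (SELECT 1 AS x)
    INSERT INTO t
        WITH inner_cte AS (SELECT x FROM outer_cte)  -- outer CTE referenced
        SELECT x FROM inner_cte;                     -- inside an inner WITH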

Bug: #18878
Reported-by: Yu Liang <luy70@psu.edu>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/18878-a26fa5ab6be2f2cf@postgresql.org
Backpatch-through: 13
2025-04-05 15:01:48 -04:00
Álvaro Herrera
749a9e20c9
Add modern SHA-2 based password hashes to pgcrypto.
This adapts the publicly available reference implementation on
https://www.akkadia.org/drepper/SHA-crypt.txt and adds the new hash
algorithms sha256crypt and sha512crypt to crypt() and gen_salt()
respectively.
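
For illustration (editor's example; the salt type names are those introduced
by this commit):

    SELECT crypt('my secret', gen_salt('sha512crypt'));
    -- also available: gen_salt('sha256crypt')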

Author: Bernd Helmle <mailings@oopsware.de>
Reviewed-by: Japin Li <japinli@hotmail.com>
Discussion: https://postgr.es/m/c763235a2757e2f5f9e3e27268b9028349cef659.camel@oopsware.de
2025-04-05 19:17:13 +02:00
Tom Lane
e33f2335a9 Avoid double transformation of json_array()'s subquery.
transformJsonArrayQueryConstructor() applied transformStmt() to
the same subquery tree twice.  While this causes no issue in many
cases, there are some where it causes a coredump, thanks to the
parser's habit of scribbling on its input.

Fix by making a copy before the first transformation (compare
0f43083d1).  This is quite brute-force, but then so is the
whole business of transforming the input twice.  Per discussion
in the bug thread, this implementation of json_array() parsing
should be replaced completely.  But that will take some work
and will surely not be back-patchable, so for the moment let's
take the easy way out.
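
The affected constructor form looks roughly like this (editor's sketch):

    SELECT json_array(SELECT x FROM (VALUES (1), (2)) AS v(x));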

Oversight in 7081ac46a.  Back-patch to v16 where that came in.

Bug: #18877
Reported-by: Yu Liang <luy70@psu.edu>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/18877-c3c3ad75845833bb@postgresql.org
Backpatch-through: 16
2025-04-05 12:13:35 -04:00
Andrew Dunstan
5db3bf7391 Clean up from commit 1495eff7bd
Fix some comments, and remove the hacky way of quoting database names in
favor of appendStringLiteralConn.
2025-04-05 08:00:24 -04:00
Álvaro Herrera
64fba9c617
Set log_statement=none in t/002_pg_upgrade.pl
This should make the test a wee bit faster on high-load machines (e.g.,
when running under valgrind).

Per complaint from Andres Freund.

Discussion: https://postgr.es/m/cwbcyjp2ts7o7xgy5y5gwtcd4zltvncsj67el7xgci7xbwrhlu@k363vk5tce4g
2025-04-05 11:41:01 +02:00
Álvaro Herrera
4be6a74cfb
pg_dump: Tiny header cleanup
In commits 9c02e3a986 and 8ec0aaeae0, Nathan added a duplicate
TocEntry typedef forward declaration (plus assorted #ifdef hackery to
avoid C99 preprocessor issues) to deal with some very old untidiness
regarding the DefnDumperPtr function prototype being located in pg_backup.h.
But there's no reason to have the DefnDumperPtr typedef (and the
accompanying DataDumperPtr typedef) in that file at all; they are better
placed in pg_backup_archiver.h, the internal header, because they are
only used internally.  That also requires zero #ifdef hackery, so move
them there.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/202504042140.qo66ggw6wzsz@alvherre.pgsql
2025-04-05 11:22:40 +02:00
Nathan Bossart
f0d0083f52 pg_dump: Fix query for gathering attribute stats on older versions.
Commit 9c02e3a986 taught pg_dump to retrieve attribute statistics
for 64 relations at a time.  pg_dump supports dumping from v9.2 and
newer versions, but our query for retrieving statistics for
multiple relations uses WITH ORDINALITY and multi-argument
UNNEST(), both of which were introduced in v9.4.  To fix, we resort
to gathering statistics for a single relation at a time on versions
older than v9.4.
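
For reference, the two v9.4 features in question (editor's example):

    SELECT * FROM unnest(ARRAY[10, 20], ARRAY['a', 'b'])
        WITH ORDINALITY AS u(val1, val2, ord);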

Per buildfarm member crake.

Author: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://postgr.es/m/Z_BcWVMvlUIJ_iuZ%40nathan
2025-04-04 21:05:30 -05:00
Tom Lane
43b8e6c4ab Repair misbehavior with duplicate entries in FK SET column lists.
Since v15 we've had an option to apply a foreign key constraint's
ON DELETE SET DEFAULT or SET NULL action to just some of the
referencing columns.  There was not a check for duplicate entries in
the list of columns-to-set, though.  That caused a potential memory
stomp in CreateConstraintEntry(), which incautiously assumed that
the list of columns-to-set couldn't be longer than the number of key
columns.  Even after fixing that, the case doesn't work because you
get an error like "multiple assignments to same column" from the SQL
command that is generated to do the update.

We could either raise an error for duplicate columns or silently
suppress the dups, and after a bit of thought I chose to do the
latter.  This is motivated by the fact that duplicates in the FK
column list are legal, so it's not real clear why duplicates
in the columns-to-set list shouldn't be.  Of course there's no
need to actually set the column more than once.

I left in the fix in CreateConstraintEntry() too, just because
it didn't seem like such low-level code ought to be making
assumptions about what it's handed.
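
A sketch of the affected syntax (editor's example; the tables are
hypothetical):

    CREATE TABLE pk (a int, b int, PRIMARY KEY (a, b));
    CREATE TABLE fk (a int, b int,
        FOREIGN KEY (a, b) REFERENCES pk
            ON DELETE SET NULL (a, a));  -- duplicate "a" now de-duplicated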

Bug: #18879
Reported-by: Yu Liang <luy70@psu.edu>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/18879-259fc59d072bd4d7@postgresql.org
Backpatch-through: 15
2025-04-04 20:11:48 -04:00
Tom Lane
0f43083d16 functions.c: copy trees from source_list before parse analysis etc.
This is yet another bit of fallout from the fact that backend/parser
(like other code) feels free to scribble on the parse tree it's
handed.  In this case that resulted in modifying the
relatively-short-lived copy in the cached function's source_list.
That would be fine since we only need each source_list tree once
... except that if the parser fails after making some changes,
the function cache entry remains as-is and will still be there
if the user tries to execute the function again.  Then we have
problems because we're feeding a non-pristine tree to the parser.

The most expedient fix is a quick copyObject().  I considered
other answers like somehow marking the cache entry invalid
temporarily, but that would add complexity and I'm not sure
it's worth it.  In typical scenarios we'd only do this once
per function query per session.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/6d442183-102c-498a-81d1-eeeb086cdc5a@gmail.com
2025-04-04 18:26:51 -04:00
Andrew Dunstan
2ef5790806 Fix a couple of error messages and tests for them
Oversights in 1495eff7bd and 289f74d0cb. Mea culpa.
2025-04-04 17:07:45 -04:00
Nathan Bossart
8ec0aaeae0 Prevent redeclaration of typedef TocEntry.
Commit 9c02e3a986 added a forward declaration for this typedef that
caused redeclarations, which is not valid in C99.  To fix, add some
preprocessor guards to avoid a redefinition, as is done elsewhere
(e.g., commit 382092a0cd).

Per buildfarm.
2025-04-04 15:56:23 -05:00
Andrew Dunstan
289f74d0cb Add more TAP tests for pg_dumpall
Author: Matheus Alcantara <matheusssilv97@gmail.com>
Author: Mahendra Singh Thalor <mahi6run@gmail.com>
2025-04-04 16:07:46 -04:00
Andrew Dunstan
1495eff7bd Non text modes for pg_dumpall, correspondingly change pg_restore
pg_dumpall acquires a new -F/--format option, with the same meanings as
pg_dump. The default is p, meaning plain text. For any other value, a
directory is created containing two files, globals.data and map.dat. The
first contains SQL for restoring the global data, and the second
contains a map from oids to database names. It will also contain a
subdirectory called databases, inside which it will create archives in
the specified format, named using the database oids.

In these cases the -f argument is required.

If pg_restore encounters a directory containing globals.dat, and no
toc.dat, it restores the global settings and then restores each
database.

pg_restore acquires two new options: -g/--globals-only which suppresses
restoration of any databases, and --exclude-database which inhibits
restoration of particular database(s), in the same way the option
works in pg_dumpall.

Author: Mahendra Singh Thalor <mahi6run@gmail.com>
Co-authored-by:  Andrew Dunstan <andrew@dunslane.net>
Reviewed-by: jian he <jian.universality@gmail.com>
Reviewed-by: Srinath Reddy <srinath2133@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>

Discussion: https://postgr.es/m/cb103623-8ee6-4ba5-a2c9-f32e3a4933fa@dunslane.net
2025-04-04 16:01:22 -04:00
Andrew Dunstan
2b69afbe50 add new list type simple_oid_string_list to fe-utils/simple_list
This type contains both an oid and a string.

This will be used in forthcoming changes to pg_restore.

Author: Andrew Dunstan <andrew@dunslane.net>
2025-04-04 16:01:22 -04:00
Andrew Dunstan
c1da728106 Move common pg_dump code related to connections to a new file
ConnectDatabase is used by pg_dumpall, pg_restore and pg_dump so move
common code to a new file.

new file name: connectdb.c

Author: Mahendra Singh Thalor <mahi6run@gmail.com>
2025-04-04 16:01:22 -04:00
Nathan Bossart
ff3a7f0b68 Remove unused function parameters in pg_backup_archiver.c.
Thanks to commit 9c02e3a986, which modified some of the changes
from commit a0a4601765, we can remove the now-unused ArchiveHandle
parameter from _tocEntryRestorePass() and move_to_ready_heap().

Reviewed-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://postgr.es/m/Z-3x2AnPCP331JA3%40nathan
2025-04-04 14:55:04 -05:00
Nathan Bossart
9c02e3a986 pg_dump: Retrieve attribute statistics in batches.
Currently, pg_dump gathers attribute statistics with a query per
relation, which can cause pg_dump to take significantly longer,
especially when there are many relations.  This commit addresses
this by teaching pg_dump to gather attribute statistics for 64
relations at a time.  Some simple tests showed this was the optimal
batch size, but performance may vary depending on the workload.

Our lookahead code determines the next batch of relations by
searching the TOC sequentially for relevant entries.  This approach
assumes that we will dump all such entries in TOC order, which
unfortunately isn't true for dump formats that use
RestoreArchive().  RestoreArchive() does multiple passes through
the TOC and selectively dumps certain groups of entries each time.
This is particularly problematic for index stats and a subset of
matview stats; both are in SECTION_POST_DATA, but matview stats
that depend on matview data are dumped in RESTORE_PASS_POST_ACL,
while all other stats are dumped in RESTORE_PASS_MAIN.  To handle
this, this commit moves all statistics data entries in
SECTION_POST_DATA to RESTORE_PASS_POST_ACL, which ensures that we
always dump them in TOC order.  A convenient side effect of this
change is that we can revert a decent chunk of commit a0a4601765,
but that is left for a follow-up commit.

Author: Corey Huinker <corey.huinker@gmail.com>
Co-authored-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://postgr.es/m/CADkLM%3Dc%2Br05srPy9w%2B-%2BnbmLEo15dKXYQ03Q_xyK%2BriJerigLQ%40mail.gmail.com
2025-04-04 14:51:08 -05:00
Nathan Bossart
7d5c83b4e9 pg_dump: Reduce memory usage of dumps with statistics.
Right now, pg_dump stores all generated commands for statistics in
memory.  These commands can be quite large and therefore can
significantly increase pg_dump's memory footprint.  To fix, wait
until we are about to write out the commands before generating
them, and be sure to free the commands after writing.  This is
implemented via a new defnDumper callback that works much like the
dataDumper one but is specifically designed for TOC entries.

Custom dumps that include data might write the TOC twice (to update
data offset information), which would ordinarily cause pg_dump to
run the attribute statistics queries twice.  However, as a hack, we
save the length of the written-out entry in the first pass and skip
over it in the second.  While there is no known technical issue
with executing the queries multiple times and rewriting the
results, it's expensive and feels risky, so let's avoid it.

As an exception, we _do_ execute the queries twice for the tar
format.  This format does a second pass through the TOC to generate
the restore.sql file.  pg_restore doesn't use this file, so even if
the second round of queries returns different results than the
first, it won't corrupt the output; the archive and restore.sql
file will just have different content.  A follow-up commit will
teach pg_dump to gather attribute statistics in batches, which our
testing indicates more than makes up for the added expense of
running the queries twice.

Author: Corey Huinker <corey.huinker@gmail.com>
Co-authored-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://postgr.es/m/CADkLM%3Dc%2Br05srPy9w%2B-%2BnbmLEo15dKXYQ03Q_xyK%2BriJerigLQ%40mail.gmail.com
2025-04-04 14:51:08 -05:00
Nathan Bossart
e3cc039a7d Skip second WriteToc() call for custom-format dumps without data.
Presently, "pg_dump --format=custom" calls WriteToc() twice.  The
second call updates the data offset information, which allegedly
makes parallel pg_restore significantly faster.  However, if we're
not dumping any data, there are no data offsets to update, so we
can skip this step.

Reviewed-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://postgr.es/m/Z9c1rbzZegYQTOQE%40nathan
2025-04-04 14:51:08 -05:00
Melanie Plageman
d9c7911e1a Use streaming read I/O in autoprewarm
Make a read stream for each valid fork of each valid relation
represented in the autoprewarm dump file and prewarm those blocks
through the read stream API instead of by directly invoking
ReadBuffer().

Co-authored-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Co-authored-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Andrey M. Borodin <x4mmm@yandex-team.ru> (earlier versions)
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>  (earlier versions)
Reviewed-by: Matheus Alcantara <mths.dev@pm.me> (earlier versions)
Discussion: https://postgr.es/m/flat/CAN55FZ3n8Gd%2BhajbL%3D5UkGzu_aHGRqnn%2BxktXq2fuds%3D1AOR6Q%40mail.gmail.com
2025-04-04 15:28:54 -04:00
Melanie Plageman
6acab8bdbc Refactor autoprewarm_database_main() in preparation for read stream
Autoprewarm prewarms blocks from a dump file representing the contents
of shared buffers at the time it was dumped. It uses a sorted array of
BlockInfoRecords, each representing a block from one of the cluster's
databases and tables.

autoprewarm_database_main() prewarms all the blocks from a single
database. It is optimized to ensure we don't try to open the same
relation or fork over and over again if it has been dropped or is
invalid. The main loop handled this by carefully setting various local
variables to sentinel values when a run of blocks should be skipped.

This method won't work with the read stream API. The read stream
callback must be able to advance the current position in the
BlockInfoRecord array to allow for reading ahead additional blocks;
however, a read stream maps 1-1 with a relation and fork combination. So,
the main loop in autoprewarm_database_main() must also advance the
position in the array of BlockInfoRecords to skip invalid relations and
forks. This split control doesn't fit well with the current flow control
in autoprewarm_database_main().

To make it compatible with the read stream API, change
autoprewarm_database_main() to explicitly fast-forward in the
BlockInfoRecords array past the blocks belonging to an invalid relation
or fork.

This commit only implements the new control flow -- it does not use the
read stream API.

Co-authored-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Co-authored-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/flat/CAN55FZ3n8Gd%2BhajbL%3D5UkGzu_aHGRqnn%2BxktXq2fuds%3D1AOR6Q%40mail.gmail.com
2025-04-04 15:28:49 -04:00
Melanie Plageman
7f848cb788 Remove superfluous autoprewarm check
autoprewarm_database_main() prewarms blocks from the same database. It
is passed an array of sorted BlockInfoRecords and a start and stop index
into the array. The range represented should include only blocks
belonging to global objects or blocks from a single database. Remove an
unnecessary check that the current block is from the same database and
add an assert to ensure this invariant remains. Doing so removes a
special case that makes future refactoring to accommodate read
streamifying autoprewarm easier.

Noticed off-list by Andres Freund
2025-04-04 15:28:39 -04:00
Peter Geoghegan
b3f1a13f22 Avoid extra index searches through preprocessing.
Transform low_compare and high_compare nbtree skip array inequalities
(with opclasses that offer skip support) in such a way as to allow
_bt_first to consistently apply later keys when it descends the tree.
This can lower the number of index searches for multi-column scans that
use a ">" key on one of the index's prefix columns (or use a "<" key,
when scanning backwards) when it precedes some later lower-order key.

For example, an index qual "WHERE a > 5 AND b = 2" will now be converted
to "WHERE a >= 6 AND b = 2" by a new preprocessing step that takes place
after low_compare and high_compare have been finalized.  That way, the
initial call to _bt_first can use "WHERE a >= 6 AND b = 2" to find an
initial position, rather than just using "WHERE a > 5" -- "b = 2" can be
applied during every _bt_first call.  There's a decent chance that this
will allow such a scan to avoid the extra search that might otherwise be
needed to determine the lowest "a" value still satisfying "WHERE a > 5".

The transformation process can only lower the total number of index
pages read when the use of a more restrictive set of initial positioning
keys in _bt_first actually allows the scan to land on some later leaf
page directly, relative to the unoptimized case (or on an earlier leaf
page directly, when scanning backwards).  But the savings can really add
up in cases where an affected skip array comes after some other array.
For example, a scan indexqual "WHERE x IN (1, 2, 3) AND y > 5 AND z = 2"
can save as many as 3 _bt_first calls by applying the new transformation
to its "y" array (up to 1 extra search can be avoided per "x" element).

Follow-up to commit 92fe23d9, which added nbtree skip scan.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAH2-Wz=FJ78K3WsF3iWNxWnUCY9f=Jdg3QPxaXE=uYUbmuRz5Q@mail.gmail.com
2025-04-04 14:14:08 -04:00
Peter Geoghegan
21a152b37f Improve nbtree skip scan primitive scan scheduling.
Don't allow nbtree scans with skip arrays to end any primitive scan on
its first leaf page without giving some consideration to how many times
the scan's arrays advanced while changing at least one skip array
(though continue not caring about the number of array advancements that
only affected SAOP arrays, even during skip scans with SAOP arrays).
Now when a scan performs more than 3 such array advancements in the
course of reading a single leaf page, it is taken as a signal that the
next page is unlikely to be skippable.  We'll therefore continue the
ongoing primitive index scan, at least until we can perform a recheck
against the next page's finaltup.

Testing has shown that this new heuristic occasionally makes all the
difference with skip scans that were expected to rely on the "passed
first page" heuristic added by commit 9a2e2a28.  Without it, there is a
remaining risk that certain kinds of skip scans will never quite manage
to clear the initial hurdle of performing a primitive scan that lasts
beyond its first leaf page (or that such a skip scan will only clear
that initial hurdle when it has already wasted noticeably-many cycles
due to inefficient primitive scan scheduling).

Follow-up to commits 92fe23d9 and 9a2e2a28.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAH2-Wz=RVdG3zWytFWBsyW7fWH7zveFvTHed5JKEsuTT0RCO_A@mail.gmail.com
2025-04-04 13:58:05 -04:00
Masahiko Sawada
cf2655a902 pg_recvlogical: Add --failover option.
This new option instructs pg_recvlogical to create the logical
replication slot with the failover option enabled. It can be used in
conjunction with the --create-slot option.

Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Michael Banck <mbanck@gmx.net>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/OSCPR01MB14966C54097FC83AF19F3516BF5AC2@OSCPR01MB14966.jpnprd01.prod.outlook.com
2025-04-04 10:39:57 -07:00
Jeff Davis
3556c89321 Oversight in commit b81ffa13e3.
Should warn if a materialized view may be affected, as well.
2025-04-04 10:28:52 -07:00
Peter Geoghegan
8a510275dd Further optimize nbtree search scan key comparisons.
Postgres 17 commit e0b1ee17 added two complementary optimizations to
nbtree: the "prechecked" and "firstmatch" optimizations.  _bt_readpage
was made to avoid needlessly evaluating keys that are guaranteed to be
satisfied by applying page-level context.  "prechecked" did this for
keys required in the current scan direction, while "firstmatch" did it
for keys required in the opposite-to-scan direction only.

The "prechecked" design had a number of notable issues.  It didn't
account for the fact that an = array scan key's sk_argument field might
need to advance at the point of the page precheck (it didn't check the
precheck tuple against the key's array, only the key's sk_argument,
which needlessly made it ineffective in cases involving stepping to a
page having advanced the scan's arrays using a truncated high key).
"prechecked" was also completely ineffective when only one scan key
wasn't guaranteed to be satisfied by every tuple (it didn't recognize
that it was still safe to avoid evaluating other, earlier keys).

The "firstmatch" optimization had similar limitations.  It could only be
applied after _bt_readpage found its first matching tuple, regardless of
why any earlier tuples failed to satisfy the scan's index quals.  This
allowed unsatisfied non-required scan keys to impede the optimization.

Replace both optimizations with a new optimization, without any of these
limitations: the "startikey" optimization.  Affected _bt_readpage calls
generate a page-level key offset ("startikey"), that their _bt_checkkeys
calls can then start at.  This is an offset to the first key that isn't
known to be satisfied by every tuple on the page.

Although this is independently useful work, its main goal is to avoid
performance regressions with index scans that use skip arrays, but still
never manage to skip over irrelevant leaf pages.  We must avoid wasting
CPU cycles on overly granular skip array maintenance in these cases.
The new "startikey" optimization helps with this by selectively
disabling array maintenance for the duration of a _bt_readpage call.
This has no lasting consequences for the scan's array keys (they'll
still reliably track the scan's progress through the index's key space
whenever the scan is "between pages").

Skip scan adds skip arrays during preprocessing using simple, static
rules, and decides how best to navigate/apply the scan's skip arrays
dynamically, at runtime.  The "startikey" optimization enables this
approach.  As a result of all this, the planner doesn't need to generate
distinct, competing index paths (one path for skip scan, another for an
equivalent traditional full index scan).  The overall effect is to make
scan runtime close to optimal, even when the planner works off an
incorrect cardinality estimate.  Scans will also perform well given a
skipped column with data skew: individual groups of pages with many
distinct values (in respect of a skipped column) can be read about as
efficiently as before -- without the scan being forced to give up on
skipping over other groups of pages that are provably irrelevant.

Many scans that cannot possibly skip will still benefit from the use of
skip arrays, since they'll allow the "startikey" optimization to be as
effective as possible (by allowing preprocessing to mark all the scan's
keys as required).  A scan that uses a skip array on "a" for a qual
"WHERE a BETWEEN 0 AND 1_000_000 AND b = 42" is often much faster now,
even when every tuple read by the scan has its own distinct "a" value.
However, there are still some remaining regressions, affecting certain
trickier cases.

Scans whose index quals have several range skip arrays, each on some
high cardinality column, can still be slower than they were before the
introduction of skip scan -- even with the new "startikey" optimization.
There are also known regressions affecting very selective index scans
that use a skip array.  The underlying issue with such selective scans
is that they never get as far as reading a second leaf page, and so will
never get a chance to consider applying the "startikey" optimization.
In principle, all regressions could be avoided by teaching preprocessing
to not add skip arrays whenever they aren't expected to help, but it
seems best to err on the side of robust performance.

Follow-up to commit 92fe23d9, which added nbtree skip scan.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Reviewed-By: Masahiro Ikeda <ikedamsh@oss.nttdata.com>
Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAH2-Wz=Y93jf5WjoOsN=xvqpMjRy-bxCE037bVFi-EasrpeUJA@mail.gmail.com
Discussion: https://postgr.es/m/CAH2-WznWDK45JfNPNvDxh6RQy-TaCwULaM5u5ALMXbjLBMcugQ@mail.gmail.com
2025-04-04 12:27:52 -04:00
Peter Geoghegan
92fe23d93a Add nbtree skip scan optimization.
Teach nbtree multi-column index scans to opportunistically skip over
irrelevant sections of the index given a query with no "=" conditions on
one or more prefix index columns.  When nbtree is passed input scan keys
derived from a predicate "WHERE b = 5", new nbtree preprocessing steps
output "WHERE a = ANY(<every possible 'a' value>) AND b = 5" scan keys.
That is, preprocessing generates a "skip array" (and an output scan key)
for the omitted prefix column "a", which makes it safe to mark the scan
key on "b" as required to continue the scan.  The scan is therefore able
to repeatedly reposition itself by applying both the "a" and "b" keys.
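
For example (editor's sketch; the table and index are hypothetical):

    CREATE INDEX t_a_b_idx ON t (a, b);
    -- no "=" condition on the prefix column "a"; the index can now be
    -- used, skipping over irrelevant sections keyed on "a"
    EXPLAIN SELECT * FROM t WHERE b = 5;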

A skip array has "elements" that are generated procedurally and on
demand, but otherwise works just like a regular ScalarArrayOp array.
Preprocessing can freely add a skip array before or after any input
ScalarArrayOp arrays.  Index scans with a skip array decide when and
where to reposition the scan using the same approach as any other scan
with array keys.  This design builds on the design for array advancement
and primitive scan scheduling added to Postgres 17 by commit 5bf748b8.

Testing has shown that skip scans of an index with a low cardinality
skipped prefix column can be multiple orders of magnitude faster than an
equivalent full index scan (or sequential scan).  In general, the
cardinality of the scan's skipped column(s) limits the number of leaf
pages that can be skipped over.

The core B-Tree operator classes on most discrete types generate their
array elements with the help of their own custom skip support routine.
This infrastructure gives nbtree a way to generate the next required
array element by incrementing (or decrementing) the current array value.
It can reduce the number of index descents in cases where the next
possible indexable value frequently turns out to be the next value
stored in the index.  Opclasses that lack a skip support routine fall
back on having nbtree "increment" (or "decrement") a skip array's
current element by setting the NEXT (or PRIOR) scan key flag, without
directly changing the scan key's sk_argument.  These sentinel values
behave just like any other value from an array -- though they can never
locate equal index tuples (they can only locate the next group of index
tuples containing the next set of non-sentinel values that the scan's
arrays need to advance to).

A skip array's range is constrained by "contradictory" inequality keys.
For example, a skip array on "x" will only generate the values 1 and 2
given a qual such as "WHERE x BETWEEN 1 AND 2 AND y = 66".  Such a skip
array qual usually has near-identical performance characteristics to a
comparable SAOP qual "WHERE x = ANY('{1, 2}') AND y = 66".  However,
improved performance isn't guaranteed.  Much depends on physical index
characteristics.

B-Tree preprocessing is optimistic about skipping working out: it
applies static, generic rules when determining where to generate skip
arrays, which assumes that the runtime overhead of maintaining skip
arrays will pay for itself -- or lead to only a modest performance loss.
As things stand, these assumptions are much too optimistic: skip array
maintenance will lead to unacceptable regressions with unsympathetic
queries (queries whose scan can't skip over many irrelevant leaf pages).
An upcoming commit will address the problems in this area by enhancing
_bt_readpage's approach to saving cycles on scan key evaluation, making
it work in a way that directly considers the needs of = array keys
(particularly = skip array keys).

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Masahiro Ikeda <masahiro.ikeda@nttdata.com>
Reviewed-By: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com>
Reviewed-By: Tomas Vondra <tomas@vondra.me>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Alena Rybakina <a.rybakina@postgrespro.ru>
Discussion: https://postgr.es/m/CAH2-Wzmn1YsLzOGgjAQZdn1STSG_y8qP__vggTaPAYXJP+G4bw@mail.gmail.com
2025-04-04 12:27:04 -04:00
Tom Lane
3ba2cdaa45 Stabilize regression test from c0962a113.
Per buildfarm.

Co-authored-by: Alena Rybakina <a.rybakina@postgrespro.ru>
Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/srnuqlttuimzmvoulhsrbgvj4vnul6b65osswvua7sfkqsvmuy@yg7apybpxp34
2025-04-04 11:57:26 -04:00
Melanie Plageman
64e7fa43a9 Fix autoprewarm neglect of tablespaces
While prewarming blocks from a dump file, autoprewarm_database_main()
mistakenly ignored tablespace when detecting the beginning of the next
relation to prewarm. Because RelFileNumbers are only unique within a
tablespace, autoprewarm could miss prewarming blocks from a
relation with the same RelFileNumber in a different tablespace.

Though this situation is likely rare in practice, it's best to make the
code correct. Do so by explicitly checking the tablespace along with the
RelFileNumber when detecting a new relation.

Reported-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/97c36982-603b-494a-95f4-aaf2a12ac27e%40iki.fi
2025-04-04 11:34:06 -04:00
Nathan Bossart
742317a80f Add commit e1a8b1ad58 to .git-blame-ignore-revs. 2025-04-04 09:41:59 -05:00
Nathan Bossart
e1a8b1ad58 Re-pgindent pg_largeobject.c after commit 0d6c477664. 2025-04-04 09:38:22 -05:00
Alexander Korotkov
c0962a113d Convert 'x IN (VALUES ...)' to 'x = ANY ...' when appropriate
This commit implements the automatic conversion of 'x IN (VALUES ...)' into
ScalarArrayOpExpr.  That simplifies the query tree, eliminating the appearance
of an unnecessary join.

Since VALUES describes a relational table, and the value of such a list is
a table row, the optimizer will likely face an underestimation problem due to
the inability to estimate cardinality through MCV statistics.  The cardinality
evaluation mechanism can work with the array inclusion check operation.
If the array is small enough (< 100 elements), it will perform a statistical
evaluation element by element.

We perform the transformation in convert_ANY_sublink_to_join() if the VALUES
RTE is suitable and the transformation is applicable.  The conversion is only
possible for operations on scalar values, not rows.  Also, we currently
support the transformation only when it ends up with a constant array.
Otherwise, the evaluation of non-hashed SAOP might be slower than the
corresponding Hash Join with VALUES.
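
For illustration (editor's example; the table is hypothetical):

    -- now planned as if written: WHERE x = ANY (ARRAY[1, 2, 3])
    EXPLAIN SELECT * FROM t WHERE x IN (VALUES (1), (2), (3));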

Discussion: https://postgr.es/m/0184212d-1248-4f1f-a42d-f5cb1c1976d2%40tantorlabs.com
Author: Alena Rybakina <a.rybakina@postgrespro.ru>
Author: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Ivan Kush <ivan.kush@tantorlabs.com>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
2025-04-04 16:01:50 +03:00
Alexander Korotkov
d48d2e2dc8 Extract make_SAOP_expr() function from match_orclause_to_indexcol()
This commit extracts the code to generate ScalarArrayOpExpr on top of the list
of expressions from match_orclause_to_indexcol() into a separate function
make_SAOP_expr().  This function was extracted to be used in the optimization
that converts 'x IN (VALUES ...)' to 'x = ANY ...'.  make_SAOP_expr() is
placed in the clauses.c file, as only two additional headers were needed
there compared with other candidate locations.

Discussion: https://postgr.es/m/0184212d-1248-4f1f-a42d-f5cb1c1976d2%40tantorlabs.com
Author: Alena Rybakina <a.rybakina@postgrespro.ru>
Author: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Ivan Kush <ivan.kush@tantorlabs.com>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
2025-04-04 16:01:28 +03:00
Peter Eisentraut
ee1ae8b99f Fix crash/valgrind error
Fix for commit 9ef1851685b: We have to skip indexes where sortopfamily
is NULL.  This takes the place of the previous btree check.  Detected
by valgrind on the buildfarm.
2025-04-04 14:45:53 +02:00
Heikki Linnakangas
b4f453f6ab docs: Clarify that NULL arg to set_config() means reset to default
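
A minimal illustration of the clarified behavior (parameter chosen
arbitrarily):

    SELECT set_config('work_mem', NULL, false);  -- same effect as RESET work_mem
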
Author: David G. Johnston <david.g.johnston@gmail.com>
Reviewed-by: Zhang Mingli <zmlpostgres@gmail.com>
Discussion: https://www.postgresql.org/message-id/CAKFQuwY0SK6JdCci1VJX6xsztRXgGeVEY-grkENZx%2B3CZpyPcQ@mail.gmail.com
2025-04-04 15:17:17 +03:00
Heikki Linnakangas
7afca7edef Relax assertion in finding correct GiST parent
Commit 28d3c2ddcf introduced an assertion that if the memorized
downlink location in the insertion stack isn't valid, the parent's
LSN should've changed too. Turns out that was too strict. In
gistFindCorrectParent(), if we walk right, we update the parent's
block number and clear its memorized 'downlinkoffnum'. That triggered
the assertion on next call to gistFindCorrectParent(), if the parent
needed to be split too. Relax the assertion, so that it's OK if
downlinkOffnum is InvalidOffsetNumber.

Backpatch to v13-, all supported versions. The assertion was added in
commit 28d3c2ddcf in v12.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://www.postgresql.org/message-id/18396-03cac9beb2f7aac3@postgresql.org
2025-04-04 13:49:00 +03:00
Fujii Masao
534874fac0 Allow "COPY table TO" command to copy rows from materialized views.
Previously, "COPY table TO" command worked only with plain tables and
did not support materialized views, even when they were populated and
had physical storage. To copy rows from materialized views,
"COPY (query) TO" command had to be used, instead.

This commit extends "COPY table TO" to support populated materialized
views directly, improving usability and performance, as "COPY table TO"
is generally faster than "COPY (query) TO". Note that copying from
unpopulated materialized views will still result in an error.
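
A short sketch of the new behavior (names invented):

    CREATE MATERIALIZED VIEW mv AS SELECT 1 AS x;
    COPY mv TO STDOUT;                         -- now supported
    REFRESH MATERIALIZED VIEW mv WITH NO DATA;
    COPY mv TO STDOUT;                         -- still an error: unpopulated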

Author: jian he <jian.universality@gmail.com>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Reviewed-by: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CACJufxHVxnyRYy67hiPePNCPwVBMzhTQ6FaL9_Te5On9udG=yg@mail.gmail.com
2025-04-04 19:32:00 +09:00
Peter Eisentraut
9ef1851685 Support non-btree indexes in get_actual_variable_range()
This was previously not supported because the btree strategy numbers
were hardcoded.  Now we can support this for any index that has the
required strategy mapping support and the required operators.

If an index scan used for get_actual_variable_range() requires
recheck, we now just ignore it instead of erroring out.  With btree we
knew this couldn't happen, but now it might.

Author: Mark Dilger <mark.dilger@enterprisedb.com>
Co-authored-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2025-04-04 12:21:34 +02:00
Fujii Masao
0d6c477664 Extend ALTER DEFAULT PRIVILEGES to define default privileges for large objects.
Previously, ALTER DEFAULT PRIVILEGES did not support large objects.
This meant that to grant privileges to users other than the owner,
permissions had to be manually assigned each time a large object
was created, which was inconvenient.

This commit extends ALTER DEFAULT PRIVILEGES to allow defining default
access privileges for large objects. With this change, specified privileges
will automatically apply to newly created large objects, making privilege
management more efficient.

As a side effect, this commit introduces the new keyword OBJECTS
since it's used in the syntax of ALTER DEFAULT PRIVILEGES.
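
For example (role names invented), large objects subsequently created by
alice become readable by bob without per-object GRANTs:

    ALTER DEFAULT PRIVILEGES FOR ROLE alice
        GRANT SELECT ON LARGE OBJECTS TO bob;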

Original patch by Haruka Takatsuka, with some fixes and tests by Yugo Nagata,
and rebased by Laurenz Albe.

Author: Takatsuka Haruka <harukat@sraoss.co.jp>
Co-authored-by: Yugo Nagata <nagata@sraoss.co.jp>
Co-authored-by: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: Masao Fujii <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/20240424115242.236b499b2bed5b7a27f7a418@sraoss.co.jp
2025-04-04 19:02:17 +09:00
Heikki Linnakangas
6e9c81836e Use standard die() signal handler in walreceiver
This gets rid of the bespoke ProcessWalRcvInterrupts() function,
which lets walreceiver terminate at any CHECK_FOR_INTERRUPTS() call.
And it's less code anyway.

We can now use the standard libpqsrv_connect_params() libpq wrapper
from libpq-be-fe-helpers.h, removing more code. We attempted to do
that earlier already in commit 728f86fec6, but that was reverted
because it didn't call ProcessWalRcvInterrupts() and therefore didn't
react to shutdown requests. Now that ProcessWalRcvInterrupts() is
gone, it works. As stated in that commit, this also leads to
libpqwalreceiver reserving file descriptors for libpq connections,
which is nice.

Author: Andres Freund <andres@anarazel.de> (the earlier commit)
Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Yura Sokolov <y.sokolov@postgrespro.ru>
2025-04-04 12:38:32 +03:00
Peter Eisentraut
8123e91f5a Convert PathKey to use CompareType
Change the PathKey struct to use CompareType to record the sort
direction instead of hardcoding btree strategy numbers.  The
CompareType is then converted to the index-type-specific strategy when
the plan is created.

This reduces the number of places btree strategy numbers are
hardcoded, and it's a self-contained subset of a larger effort to
allow non-btree indexes to behave like btrees.

Author: Mark Dilger <mark.dilger@enterprisedb.com>
Co-authored-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2025-04-04 11:22:20 +02:00
Daniel Gustafsson
daa16893fa doc: Clarify the system value for sslrootcert
The documentation for the special value "system" for sslrootcert could
be misinterpreted to mean the default operating system CA store, which
it may be, but it's defined to be the default CA store of the SSL
library in use.

Backpatch down to v16 where support for the system value was added.

Author: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: George MacKerron <george@mackerron.co.uk>
Discussion: https://postgr.es/m/B3CBBAA3-6EA3-4AB7-8619-4BBFAB93DDB4@yesql.se
Backpatch-through: 16
2025-04-04 09:47:36 +02:00
Amit Kapila
898c131b58 pg_createsubscriber: Improve error messages.
Use the option name consistently in the error messages where
applicable. Also, change the code to use pg_fatal() instead of a
combination of pg_log_error() and exit().

Author: vignesh C <vignesh21@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CALDaNm0HxF1RH27LP7VisLzNsSJbssy8a64M5p6UduDaBq6-ag@mail.gmail.com
2025-04-04 10:58:59 +05:30
Fujii Masao
d5d85f1881 Fix logical decoding test to correctly check slot removal on standby.
The regression test for logical decoding verifies whether a logical slot
is correctly dropped on a standby when its associated database is dropped.
However, the test mistakenly retrieved slot information from the primary
instead of the standby, causing incorrect behavior.

This commit fixes the issue by ensuring the test correctly checks the slot
on the standby.

Back-patch to all supported versions.

Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/1fdfd020-a509-403c-bd8f-a04664aba148@oss.nttdata.com
Backpatch-through: 13
2025-04-04 13:32:46 +09:00
Fujii Masao
c754bdd8a2 Fix logical decoding regression tests to correctly check slot existence.
The regression tests for logical decoding verify whether a logical slot
exists or has been dropped. Previously, these tests attempted to
retrieve "slot_name" from the result of slot(), but since "slot_name" was
not included in the result, slot()->{'slot_name'} always returned undef,
leading to incorrect behavior.

This commit fixes the issue by checking the "plugin" field in the result
of slot() instead, ensuring the tests properly verify slot existence.

Back-patch to all supported versions.

Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/OSCPR01MB149667EC4E738769CA80B7EA5F5AE2@OSCPR01MB14966.jpnprd01.prod.outlook.com
Backpatch-through: 13
2025-04-04 13:09:06 +09:00
Tomas Vondra
1aff1dc8df Revert "Improve accounting for memory used by shared hash tables"
This reverts commit f5930f9a98.

This broke the expansion of private hash tables, which reallocates the
directory. But that's impossible when it's allocated together with the
other fields, and dir_realloc() failed with BogusFree. Clearly, this
needs rethinking.

Discussion: https://postgr.es/m/CAApHDvriCiNkm=v521AP6PKPfyWkJ++jqZ9eqX4cXnhxLv8w-A@mail.gmail.com
2025-04-04 04:43:50 +02:00
Amit Langote
88f55bc976 Make derived clause lookup in EquivalenceClass more efficient
Derived clauses are stored in ec_derives, a List of RestrictInfos.
These clauses are later looked up by matching the left and right
EquivalenceMembers along with the clause's parent EC.

This linear search becomes expensive in queries with many joins or
partitions, where ec_derives may contain thousands of entries. In
particular, create_join_clause() can spend significant time scanning
this list.

To improve performance, introduce a hash table (ec_derives_hash) that
is built when the list reaches 32 entries -- the same threshold used
for join_rel_hash. The original list is retained alongside the hash
table to support EC merging and serialization
(_outEquivalenceClass()).

Each clause is stored in the hash table using a canonicalized key: the
EquivalenceMember with the lower memory address is placed in the key
before the one with the higher memory address. This avoids storing or
searching for both permutations of the same clause. For clauses
involving a constant EM, the key places NULL in the first slot and the
non-constant EM in the second.

The hash table is initialized using list_length(ec_derives_list) as
the size hint. simplehash internally adjusts this to the next power of
two after dividing by the fillfactor, so this typically results in at
least 64 buckets near the threshold -- avoiding immediate resizing
while adapting to the actual number of entries.

The lookup logic for derived clauses is now centralized in
ec_search_derived_clause_for_ems(), which consults the hash table when
available and falls back to the list otherwise.

The new ec_clear_derived_clauses() always frees ec_derives_list, even
though some of the original code paths that cleared the old
ec_derives field did not. This ensures consistent cleanup and avoids
leaking memory when large lists are discarded.

An assertion originally placed in find_derived_clause_for_ec_member()
is moved into ec_search_derived_clause_for_ems() so that it is
enforced consistently, regardless of whether the hash table or list is
used for lookup.

This design incorporates suggestions by David Rowley, who proposed
both the key canonicalization and the initial sizing approach to
balance memory usage and CPU efficiency.

Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Tested-by: Dmitry Dolgov <9erthalion6@gmail.com>
Tested-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Tested-by: Amit Langote <amitlangote09@gmail.com>
Tested-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CAExHW5vZiQtWU6moszLP5iZ8gLX_ZAUbgEX0DxGLx9PGWCtqUg@mail.gmail.com
2025-04-04 10:45:05 +09:00
Amit Langote
887160d1be Add assertion to verify derived clause has constant RHS
find_derived_clause_for_ec_member() searches for a previously-derived
clause that equates a non-constant EquivalenceMember to a constant.
It is only called for EquivalenceClasses with ec_has_const set, and
with a non-constant member the EquivalenceMember to search for.

The matched clause is expected to have the non-constant member on the
left-hand side and the constant EquivalenceMember on the right.

Assert that the RHS is indeed a constant, to catch violations of this
structure and enforce assumptions made by
generate_base_implied_equalities_const().

Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/CAExHW5scMxyFRqOFE6ODmBiW2rnVBEmeEcA-p4W_CyuEikURdA@mail.gmail.com
2025-04-04 10:45:05 +09:00
Melanie Plageman
67be093562 Use AIO batchmode for bitmap heap scans
Previously bitmap heap scan was not AIO batchmode safe because of the
visibility map reads potentially done for the "skip fetch" optimization
(which skipped fetching tuples from the heap if the pages were all
visible and none of the columns were used in the query).

The skip fetch optimization implementation was found to have bugs and
was removed in 459e7bf8e2, so we can safely enable batchmode for
bitmap heap scans.
2025-04-03 18:23:02 -04:00
Melanie Plageman
54a3615f15 Remove misleading read stream asserts in a few users
Several read stream users asserted that the read stream was exhausted
after looping on that very condition. It was pointed out in a
review of an as-yet-uncommitted read stream user [1] that this was
confusing and could lead the reader to think there was a possibility of
some kind of race condition. Remove these asserts.

[1] https://postgr.es/m/F9ACE8D0-B807-4A17-B6BD-87EF0717983D%40yesql.se
2025-04-03 18:22:37 -04:00
Tom Lane
dbd437e670 Fix oversight in commit 0dca5d68d.
As coded, fmgr_sql() would get an assertion failure for a SQL function
that has an empty body and is declared to return some type other than
VOID.  Typically you'd never get that far because fmgr_sql_validator()
would reject such a definition (I suspect that's how come I managed to
miss the bug).  But if check_function_bodies is off or the function is
polymorphic, the validation check wouldn't get made.
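
A sketch of a reproducer (only reachable with body validation disabled):

    SET check_function_bodies = off;
    CREATE FUNCTION f() RETURNS int LANGUAGE sql AS '';
    SELECT f();  -- previously an assertion failure in assert-enabled builds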

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/0fde377a-3870-4d18-946a-ce008ee5bb88@gmail.com
2025-04-03 16:03:12 -04:00
Daniel Gustafsson
46c4c7cbc6 oauth: Remove timeout from t/002_client when not needed
The connect_timeout=1 setting for the --hang-forever test was left in
place and used by later tests, causing unexpected timeouts on slower
buildfarm animals. Remove it when no longer needed.

Per buildfarm member skink, reported by Andres on Discord.

Author: Jacob Champion <jacob.champion@enterprisedb.com>
Reported-by: Andres Freund <andres@anarazel.de>
2025-04-03 20:41:09 +02:00
Daniel Gustafsson
8ae0a37932 oauth: Fix build on platforms without epoll/kqueue
register_socket() missed a variable declaration if neither
HAVE_SYS_EPOLL_H nor HAVE_SYS_EVENT_H was defined.

While we're fixing that, adjust the tests to check pg_config.h for one
of the multiplexer implementations, rather than assuming that Windows is
the only platform without support. (Christoph reported this on
hurd-amd64, an experimental Debian.)

Author: Jacob Champion <jacob.champion@enterprisedb.com>
Reported-by: Christoph Berg <myon@debian.org>
Discussion: https://postgr.es/m/Z-sPFl27Y0ZC-VBl%40msg.df7cb.de
2025-04-03 20:37:52 +02:00
Jeff Davis
945126234b Fix unintentional 'NULL' string literal in pg_upgrade.
Introduced in 2a083ab807.

Discussion: https://postgr.es/m/e852442da35b4f31acc600ed98bbee0f12e65e0c.camel@j-davis.com
Reviewed-by: Michael Paquier <michael@paquier.xyz>
2025-04-03 11:04:37 -07:00
Jeff Davis
b81ffa13e3 pg_upgrade check for Unicode-dependent relations.
This check will not cause an upgrade failure, only a warning.

Discussion: https://postgr.es/m/ef03d678b39a64392f4b12e0f59d1495c740969e.camel%40j-davis.com
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
2025-04-03 10:45:38 -07:00
Masahiko Sawada
fd09c1316b Restrict copying of invalidated replication slots.
Previously, invalidated logical and physical replication slots could
be copied using the pg_copy_logical_replication_slot and
pg_copy_physical_replication_slot functions. Replication slots that
were invalidated for reasons other than WAL removal retained their
restart_lsn. This meant that a new slot copied from an invalidated
slot could have a restart_lsn pointing to a WAL segment that might
have already been removed.

This commit restricts the copying of invalidated replication slots.
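
For illustration (slot names invented; assume old_slot has been
invalidated):

    SELECT pg_copy_logical_replication_slot('old_slot', 'new_slot');
    -- now rejected with an error rather than creating a slot whose
    -- restart_lsn may point to already-removed WAL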

Backpatch to v16, where slots could retain their restart_lsn when
invalidated for reasons other than WAL removal.

For v15 and earlier, this check is not required since slots can only
be invalidated due to WAL removal, and existing checks already handle
this issue.

Author: Shlok Kyal <shlok.kyal.oss@gmail.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CANhcyEU65aH0VYnLiu%3DOhNNxhnhNhwcXBeT-jvRe1OiJTo_Ayg%40mail.gmail.com
Backpatch-through: 16
2025-04-03 10:30:00 -07:00
Álvaro Herrera
f104192e52
Remove duplicate set of print_notnull
I inserted the second one by mistake in commit 14e87ffa5c.

Reported-by: jian he <jian.universality@gmail.com>
Confirmed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://postgr.es/m/CACJufxFqckBFxPfCixHHbOr0zMLksviTj2m3o12-tErfx_PvTg@mail.gmail.com
2025-04-03 17:34:25 +02:00
Daniel Gustafsson
b82e7eddb0 Add missing declarations to pg_config.h.in
Add missing pg_config.h.in declarations from 09be391126
where the corresponding autoconf/meson declarations were
added.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/70145721-6949-4ABF-BB54-63F866488DF8@yesql.se
2025-04-03 13:57:27 +02:00
Daniel Gustafsson
2da74d8d64 libpq: Add support for dumping SSL key material to file
This adds a new connection parameter which instructs libpq to
write out keymaterial clientside into a file in order to make
connection debugging with Wireshark and similar tools possible.
The file format used is the standardized NSS format.

Author: Abhishek Chanda <abhishek.becs@gmail.com>
Co-authored-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://postgr.es/m/CAKiP-K85C8uQbzXKWf5wHQPkuygGUGcufke713iHmYWOe9q2dA@mail.gmail.com
2025-04-03 13:16:43 +02:00
Heikki Linnakangas
e4309f73f6 Add support for sorted gist index builds to btree_gist
This enables sortsupport in the btree_gist extension for faster builds
of gist indexes.

The sorted gist index build strategy is now the default. Regression
test output is unchanged (except for one small change in the 'enum' test
to add coverage for enum values added later), and the tests now exercise
the sorted build strategy.

One version of this was committed a long time ago already, in commit
9f984ba6d2, but it was quickly reverted because of buildfarm
failures. The failures were presumably caused by some small bugs, but
we never got around to debug and commit it again. This patch was
written from scratch, implementing the same idea, with some fragments
and ideas from the original patch.

Author: Bernd Helmle <mailings@oopsware.de>
Author: Andrey Borodin <x4mmm@yandex-team.ru>
Discussion: https://www.postgresql.org/message-id/64d324ce2a6d535d3f0f3baeeea7b25beff82ce4.camel@oopsware.de
2025-04-03 13:46:35 +03:00
Heikki Linnakangas
9370978da8 Fix boilerplate comments in btree_gist
A few of these were copy-pasted wrong, like the comment "Bytea ops" in
btree_numeric.c. Instead of fixing the incorrect ones, replace them
all with generic comment "GiST support functions".

Also tidy up the inconsistent newlines between various functions while
we're at it.
2025-04-03 13:39:33 +03:00
Peter Eisentraut
82a46cca99 Update Unicode data to Unicode 16.0.0
Reviewed-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://www.postgresql.org/message-id/flat/146349e4-4687-4321-91af-f235572490a8@eisentraut.org
2025-04-03 12:00:09 +02:00
Peter Eisentraut
231064aa0f plpython: Add test for returning Python set from SETOF function
This is claimed in the documentation, but there was no test case for
it.

Reported-by: Bogdan Grigorenko <gri.bogdan.2020@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/173543330569.680.6706329879058172623%40wrigleys.postgresql.org
2025-04-03 11:09:50 +02:00
Amit Kapila
d1d83827ba Doc: Improve -R option added in e5aeed4b80.
Author: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Discussion: https://postgr.es/m/CAHut+PvJPnaL=70SbBe3fYg2nq74Z=Yv4X=zRpUWYfOi-q6=2w@mail.gmail.com
2025-04-03 14:27:13 +05:30
Álvaro Herrera
8806e4e8de
002_pg_upgrade.pl: Move pg_dump test code for better stability
The alleged "statistics pg_dump bug" that prevented us from enabling
stats dumping in commit 172259afb5 wasn't a pg_dump bug after all: it
was just a side effect of not running pg_dump at the right time (namely,
before giving autovacuum some time to do its thing and then disabling it
to stabilize things).  Move the code around to fix this problem and
enable statistics dumping.

Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Diagnosed-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://postgr.es/m/5f3703fd7f27da62a8f3615218f937507f522347.camel@j-davis.com
Discussion: https://postgr.es/m/CAExHW5sDm+aGb7A4EXK=X9rkrmSPDgc03EdADt=wWkdMO=XPSA@mail.gmail.com
2025-04-03 10:16:24 +02:00
Álvaro Herrera
abe56227b2
002_pg_upgrade.pl: rename some variables for clarity
This renames %node_params to %old_node_params, @initdb_params to
@old_initdb_params, and adds separate @new_initdb_params and
%new_node_params rather than reusing the former in confusing ways.

Extracted from a larger patch from the same author.

Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://postgr.es/m/CAExHW5sDm+aGb7A4EXK=X9rkrmSPDgc03EdADt=wWkdMO=XPSA@mail.gmail.com
2025-04-03 09:56:58 +02:00
Richard Guo
ea5d3f5233 Remove duplicated comment in get_relation_constraints
The check for non-inheritable constraints is performed later, and the
same comment is included at that point.

While we're here, remove one extraneous blank line.

Author: jian he <jian.universality@gmail.com>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/CACJufxETi6x86S8EkH8mRfOcm2AenoE9t1pyCFVMpU34gVhF3w@mail.gmail.com
2025-04-03 16:43:53 +09:00
Peter Eisentraut
84fea854c9 Update Unicode data to CLDR 47
No actual changes result.
2025-04-03 09:20:25 +02:00
Peter Eisentraut
bbf24fe2f1 Update code comment
Commit 4e7f62bc38 added a new input file to a script but didn't
update the comment listing the input files.
2025-04-03 09:20:25 +02:00
Peter Eisentraut
34f04aa653 Fix update-unicode make target
The addition of SpecialCasing.txt by commit 286a365b9c was not added
to the make target dependencies, so the invoked script would fail
because the required file wasn't downloaded first.  (The meson version
appears to work correctly.)
2025-04-03 09:20:25 +02:00
Amit Kapila
4868c96bc8 Fix slot synchronization for two_phase enabled slots.
The issue is that the transactions prepared before two-phase decoding is
enabled can fail to replicate to the subscriber after being committed on a
promoted standby following a failover. This is because the two_phase_at
field of a slot, which tracks the LSN from which two-phase decoding
starts, is not synchronized to standby servers. Without two_phase_at, the
logical decoding might incorrectly identify prepared transactions as
already replicated to the subscriber after promotion of the standby
server, causing them to be skipped.

To address the issue on HEAD, the two_phase_at field of the slot is
exposed by the pg_replication_slots view and allows the slot
synchronization to copy this value to the corresponding synced slot on the
standby server.
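
The new field can be inspected like any other slot attribute:

    SELECT slot_name, two_phase, two_phase_at FROM pg_replication_slots;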

This bug is likely to occur if the user toggles the two_phase option to
true after initial slot creation. Given that altering the two_phase option
of a replication slot is not allowed in PostgreSQL 17, this bug is less
likely to occur. We can't change the view/function definition in a
backbranch, so we can't push the same fix there, but we are brainstorming
an appropriate solution for PG17.

Author: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/TYAPR01MB5724CC7C288535BBCEEE65DA94A72@TYAPR01MB5724.jpnprd01.prod.outlook.com
2025-04-03 12:26:54 +05:30
Tom Lane
a7187c3723 Remove unnecessary type violation in tsvectorrecv().
compareentry() is declared to work on WordEntryIN structs, but
tsvectorrecv() is using it in two places to work on WordEntry
structs.  This is almost okay, since WordEntry is the first
field of WordEntryIN.  But on machines with 8-byte pointers,
WordEntryIN will have a larger alignment spec than WordEntry,
and it's at least theoretically possible that the compiler
could generate code that depends on the larger alignment.

Given the lack of field reports, this may be just a hypothetical bug
that upsets nothing except sanitizer tools.  Or it may be real on
certain hardware but nobody's tried to use tsvectorrecv() on such
hardware.  In any case we should fix it, and the fix is trivial:
just change compareentry() so that it works on WordEntry without any
mention of WordEntryIN.  We can also get rid of the quite-useless
intermediate function WordEntryCMP.

Bug: #18875
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/18875-07a29c49c825a608@postgresql.org
Backpatch-through: 13
2025-04-02 16:17:43 -04:00
Andres Freund
24da5b239a Add test for HeapBitmapScan's broken skip_fetch optimization
In the previous commit HeapBitmapScan's skip_fetch optimization was removed,
due to being broken in not easily fixable ways. Add a test that verifies we
don't re-introduce this bug if somebody tries to re-add the feature.

Only add the test to master for now; it's possible it's not entirely
stable. That seems sufficient, as we're not going to re-introduce the feature
on the backbranches. I did verify that the test passes on all branches. If the
test turns out to be unproblematic, we can backpatch it later, should we feel
a need to do so.

Discussion: https://postgr.es/m/CAEze2Wg3gXXZTr6_rwC+s4-o2ZVFB5F985uUSgJTsECx6AmGcQ@mail.gmail.com
2025-04-02 14:58:39 -04:00
Andres Freund
459e7bf8e2 Remove HeapBitmapScan's skip_fetch optimization
The optimization does not take the removal of TIDs by a concurrent vacuum into
account. The concurrent vacuum can remove dead TIDs and make pages ALL_VISIBLE
while those dead TIDs are referenced in the bitmap. This can lead to a
skip_fetch scan returning too many tuples.

It likely would be possible to implement this optimization safely, but we
don't have the necessary infrastructure in place. Nor is it clear that it's
worth building that infrastructure, given how limited the skip_fetch
optimization is.

In the backbranches we just disable the optimization by always passing
need_tuples=true to table_beginscan_bm(). We can't perform API/ABI changes in
the backbranches and we want to make the change as minimal as possible.

Author: Matthias van de Meent <boekewurm+postgres@gmail.com>
Reported-By: Konstantin Knizhnik <knizhnik@garret.ru>
Discussion: https://postgr.es/m/CAEze2Wg3gXXZTr6_rwC+s4-o2ZVFB5F985uUSgJTsECx6AmGcQ@mail.gmail.com
Backpatch-through: 13
2025-04-02 14:54:20 -04:00
Tom Lane
0dca5d68d7 Change SQL-language functions to use the plan cache.
In the historical implementation of SQL functions (if they don't get
inlined), we built plans for all the contained queries at first call
within an outer query, and then re-used those plans for the duration
of the outer query, and then forgot everything.  This was not ideal,
not least because the plans could not be customized to specific values
of the function's parameters.  Our plancache infrastructure seems
mature enough to be used here.  That will solve both the problem with
not being able to build custom plans and the problem with not being
able to share work across successive outer queries.

Aside from those performance concerns, this change fixes a
longstanding bugaboo with SQL functions: you could not write DDL that
would affect later statements in the same function.  That's mostly
still true with new-style SQL functions, since the results of parse
analysis are baked into the stored query trees (and protected by
dependency records).  But for old-style SQL functions, it will now
work much as it does with PL/pgSQL functions, because we delay parse
analysis and planning of each query until we're ready to run it.
Some edge cases that require replanning are now handled better too;
see for example the new rowsecurity test, where we now detect an RLS
context change that was previously missed.

One other edge-case change that might be worthy of a release note
is that we now insist that a SQL function's result be generated
by the physically-last query within it.  Previously, if the last
original query was deleted by a DO INSTEAD NOTHING rule, we'd be
willing to take the result from the preceding query instead.
This behavior was undocumented except in source-code comments,
and it seems hard to believe that anyone's relying on it.
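
A sketch of the now-working DDL pattern in an old-style SQL function
(names invented; check_function_bodies must be off so the reference to
the not-yet-created table passes CREATE FUNCTION):

    SET check_function_bodies = off;
    CREATE FUNCTION make_and_count() RETURNS int LANGUAGE sql AS $$
        CREATE TABLE scratch (x int);
        INSERT INTO scratch VALUES (1), (2);
        SELECT count(*)::int FROM scratch;
    $$;
    SELECT make_and_count();  -- previously failed at first call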

Along the way to this feature, we needed a few infrastructure changes:

* The plancache can now take either a raw parse tree or an
analyzed-but-not-rewritten Query as the starting point for a
CachedPlanSource.  If given a Query, it is caller's responsibility
that nothing will happen to invalidate that form of the query.
We use this for new-style SQL functions, where what's in pg_proc is
serialized Query(s) and we trust the dependency mechanism to disallow
DDL that would break those.

* The plancache now offers a way to invoke a post-rewrite callback
to examine/modify the rewritten parse tree when it is rebuilding
the parse trees after a cache invalidation.  We need this because
SQL functions sometimes adjust the parse tree to make its output
exactly match the declared result type; if the plan gets rebuilt,
that has to be re-done.

* There is a new backend module utils/cache/funccache.c that
abstracts the idea of caching data about a specific function
usage (a particular function and set of input data types).
The code in it is moved almost verbatim from PL/pgSQL, which
has done that for a long time.  We use that logic now for
SQL-language functions too, and maybe other PLs will have use
for it in the future.

Author: Alexander Pyhalov <a.pyhalov@postgrespro.ru>
Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
Discussion: https://postgr.es/m/8216639.NyiUUSuA9g@aivenlaptop
2025-04-02 14:06:02 -04:00
Heikki Linnakangas
e9e7b66044 Add GiST and btree sortsupport routines for range types
For GiST, having a sortsupport function allows building the index
using the "sorted build" method, which is much faster.

For b-tree, the sortsupport routine doesn't give any new
functionality, but speeds up sorting a tiny bit. The difference is not
very significant, about 2% in cursory testing on my laptop, because
the range type comparison function has quite a lot of overhead from
detoasting. In any case, since we have the function for GiST anyway,
we might as well register it for the btree opfamily too.

Author: Bernd Helmle <mailings@oopsware.de>
Discussion: https://www.postgresql.org/message-id/64d324ce2a6d535d3f0f3baeeea7b25beff82ce4.camel@oopsware.de
2025-04-02 19:51:28 +03:00
Heikki Linnakangas
ea3f9b6da3 docs: Fix column count attribute in table
Nothing seems to actually depend on the attribute, as the docs built
successfully, but let's be tidy.

Reported offlist by Matthias van de Meent
2025-04-02 18:21:07 +03:00
Tomas Vondra
46df9487d9 Improve accounting for PredXactList, RWConflictPool and PGPROC
Various places allocated shared memory by first allocating a small chunk
using ShmemInitStruct(), followed by ShmemAlloc() calls to allocate more
memory. Unfortunately, ShmemAlloc() does not update ShmemIndex, so this
affected pg_shmem_allocations - it only showed the initial chunk.

This commit modifies the following allocations, to allocate everything
as a single chunk, and then split it internally.

- PredXactList
- RWConflictPool
- PGPROC structures
- Fast-Path Lock Array

The fast-path lock array is allocated separately, not as a part of the
PGPROC structures allocation.

Author: Rahila Syed <rahilasyed90@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CAH2L28vHzRankszhqz7deXURxKncxfirnuW68zD7+hVAqaS5GQ@mail.gmail.com
2025-04-02 17:14:28 +02:00
Tomas Vondra
f5930f9a98 Improve accounting for memory used by shared hash tables
pg_shmem_allocations tracks the memory allocated by ShmemInitStruct(),
but for shared hash tables that covered only the header and hash
directory.  The remaining parts (segments and buckets) were allocated
later using ShmemAlloc(), which does not update the shmem accounting.
Thus, these allocations were not shown in pg_shmem_allocations.

This commit improves the situation by allocating all the hash table
parts at once, using a single ShmemInitStruct() call. This way the
ShmemIndex entries (and thus pg_shmem_allocations) better reflect the
proper size of the hash table.

This affects allocations for private (non-shared) hash tables too, as
the hash_create() code is shared. For non-shared tables this however
makes no practical difference.

This changes the alignment a bit. ShmemAlloc() aligns the chunks using
CACHELINEALIGN(), which means some parts (header, directory, segments)
were aligned this way. Allocating all parts as a single chunk removes
this (implicit) alignment. We've considered adding explicit alignment,
but we've decided not to - it seems to be merely a coincidence due to
using the ShmemAlloc() API, not due to necessity.

Author: Rahila Syed <rahilasyed90@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CAH2L28vHzRankszhqz7deXURxKncxfirnuW68zD7+hVAqaS5GQ@mail.gmail.com
2025-04-02 17:14:28 +02:00
Tom Lane
bd178960c6 Need to do CommandCounterIncrement after StoreAttrMissingVal.
Without this, an additional change to the same pg_attribute row
within the same command will fail.  This is possible at least with
ALTER TABLE ADD COLUMN on a multiple-inheritance-pathway structure.
(Another potential hazard is that immediately-following operations
might not see the missingval.)

Introduced by 95f650674, which split the former coding that
used a single pg_attribute update to change both atthasdef and
atthasmissing/attmissingval into two updates, but missed that
this should entail two CommandCounterIncrements as well.  Like
that fix, back-patch through v13.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Author: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/025a3ffa-5eff-4a88-97fb-8f583b015965@gmail.com
Backpatch-through: 13
2025-04-02 11:13:01 -04:00
Heikki Linnakangas
b05751220b docs: Add a new section and a table listing protocol versions
Move the discussion on protocol versions and version negotiation to a
new "Protocol versions" section. Add a table listing all the different
protocol versions, starting from the obsolete protocol version 2, and
the PostgreSQL versions that support each.

Discussion: https://www.postgresql.org/message-id/69f53970-1d55-4165-9151-6fb524e36af9@iki.fi
2025-04-02 16:41:51 +03:00
Heikki Linnakangas
a460251f0a Make cancel request keys longer
Currently, the cancel request key is a 32-bit token, which isn't very
much entropy. If you want to cancel another session's query, you can
brute-force it. In most environments, an unauthorized cancellation of
a query isn't very serious, but it nevertheless would be nice to have
more protection from it. Hence make the key longer, to make it harder
to guess.

The longer cancellation keys are generated when using the new protocol
version 3.2. For connections using version 3.0, short 4-bytes keys are
still used.

The new longer key length is not hardcoded in the protocol anymore,
the client is expected to deal with variable length keys, up to 256
bytes. This flexibility allows e.g. a connection pooler to add more
information to the cancel key, which might be useful for finding the
connection.

Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl>
Reviewed-by: Robert Haas <robertmhaas@gmail.com> (earlier versions)
Discussion: https://www.postgresql.org/message-id/508d0505-8b7a-4864-a681-e7e5edfe32aa@iki.fi
2025-04-02 16:41:48 +03:00
Heikki Linnakangas
285613c60a libpq: Add min/max_protocol_version connection options
All supported version of the PostgreSQL server send the
NegotiateProtocolVersion message when an unsupported minor protocol
version is requested by a client. But many other applications that
implement the PostgreSQL protocol (connection poolers, or other
databases) do not, and the same is true for PostgreSQL server versions
older than 9.3. Connecting to such other applications thus fails if a
client requests a protocol version different than 3.0.

This patch adds a max_protocol_version connection option to libpq that
specifies the protocol version that libpq should request from the
server. Currently only 3.0 is supported, but that will change in a
future commit that bumps the protocol version. Even after that version
bump the default will likely stay 3.0 for the time being. Once more of
the ecosystem supports the NegotiateProtocolVersion message we might
want to change the default to the latest minor version.

This also adds the similar min_protocol_version connection option, to
allow the client to specify that connecting should fail if a lower
protocol version is attempted by the server. This can be used to
ensure that certain protocol features are used, which can be
particularly useful if those features impact security.

Author: Jelte Fennema-Nio <postgres@jeltef.nl>
Reviewed-by: Robert Haas <robertmhaas@gmail.com> (earlier versions)
Discussion: https://www.postgresql.org/message-id/CAGECzQTfc_O%2BHXqAo5_-xG4r3EFVsTefUeQzSvhEyyLDba-O9w@mail.gmail.com
Discussion: https://www.postgresql.org/message-id/CAGECzQRbAGqJnnJJxTdKewTsNOovUt4bsx3NFfofz3m2j-t7tA@mail.gmail.com
2025-04-02 16:41:45 +03:00
Heikki Linnakangas
5070349102 libpq: Handle NegotiateProtocolVersion message differently
Previously libpq would always error out if the server sends a
NegotiateProtocolVersion message. This was fine because libpq only
supported a single protocol version and did not support any protocol
parameters. But in the upcoming commits, we will introduce a new
protocol version and the NegotiateProtocolVersion message starts to
actually be used.

This patch modifies the client side checks to allow a range of
supported protocol versions, instead of only allowing the exact
version that was requested. Currently this "range" only contains the
3.0 version, but in a future commit we'll change this.

Also clarify the error messages, making them suitable for the world
where libpq will support multiple protocol versions and protocol
extensions.

Note that until the later commits that introduce new protocol version,
this change does not have any behavioural effect, because libpq will
only request version 3.0 and will never send protocol parameters, and
therefore will never receive a NegotiateProtocolVersion message from
the server.

Author: Jelte Fennema-Nio <postgres@jeltef.nl>
Reviewed-by: Robert Haas <robertmhaas@gmail.com> (earlier versions)
Discussion: https://www.postgresql.org/message-id/CAGECzQTfc_O%2BHXqAo5_-xG4r3EFVsTefUeQzSvhEyyLDba-O9w@mail.gmail.com
Discussion: https://www.postgresql.org/message-id/CAGECzQRbAGqJnnJJxTdKewTsNOovUt4bsx3NFfofz3m2j-t7tA@mail.gmail.com
2025-04-02 16:41:42 +03:00
Peter Eisentraut
748e98d05b Fix code comment
The changes made in commit d2b4b4c225 contained incorrect comments:
They said that certain forward declarations were necessary to "avoid
including pathnodes.h here", but the file is itself pathnodes.h!  So
change the comment to just say it's a forward declaration in one case,
and in the other case we don't need the declaration at all because it
already appeared earlier in the file.
2025-04-02 14:46:47 +02:00
Heikki Linnakangas
09be391126 Add timingsafe_bcmp(), for constant-time memory comparison
timingsafe_bcmp() should be used instead of memcmp() or a naive
for-loop, when comparing passwords or secret tokens, to avoid leaking
information about the secret token by timing. This commit just
introduces the function but does not change any existing code to use
it yet.

Co-authored-by: Jelte Fennema-Nio <github-tech@jeltef.nl>
Discussion: https://www.postgresql.org/message-id/7b86da3b-9356-4e50-aa1b-56570825e234@iki.fi
2025-04-02 15:32:40 +03:00
Heikki Linnakangas
85d799ba8a docs: Update phrase on message lengths in the protocol
The reasoning for why all the message formats are parseable without
the explicit message length field is anachronistic; the real reason is
that protocol version 2 did not have a message length field. There's
nothing wrong with relying on the message length, like we do in the
CopyData messags, even though it often still makes sense to have
length fields for individual parts in messages.

Discussion: https://www.postgresql.org/message-id/02a4eed2-98f0-4796-9d4f-12128ff44fe0@iki.fi
2025-04-02 15:32:33 +03:00
Andres Freund
a6285b150a tests: Fix incompatibility of test_aio with *_FORCE_RELEASE
The test added in 93bc3d75d8 failed in a build with RELCACHE_FORCE_RELEASE
and CATCACHE_FORCE_RELEASE defined. The test intentionally forgets to exit
batchmode - normally that would trigger an error at the end of the
transaction, which the test verifies.  However, with RELCACHE_FORCE_RELEASE
and CATCACHE_FORCE_RELEASE defined, we get other code (output function lookup)
entering batchmode and erroring out because batchmode isn't allowed to be
entered recursively.

Fix that by changing the queries in question to not output any rows. That's
not exactly pretty, but seems to avoid the problem reliably.

Eventually we might want to make RELCACHE_FORCE_RELEASE and
CATCACHE_FORCE_RELEASE GUCs, so we can disable them where necessary - this
isn't the first test having difficulty with those debug options. But that's
for later.

Per buildfarm member prion.

Discussion: https://postgr.es/m/uc62i6vi5gd4bi6wtjj5poadqxolgy55e7ihkmf3mthjegb6zl@zqo7xez7sc2r
2025-04-02 07:57:11 -04:00
Andres Freund
43dca8a116 tests: Cope with WARNINGs during failed CREATE DB on windows
The test added in 93bc3d75d8 sometimes fails on windows, due to warnings like
WARNING:  some useless files may be left behind in old database directory "base/16514"

The reason for that is createdb_failure_callback() does not ensure that there
are no open file descriptors for files in the partially created,
to-be-dropped, database. We do take care in dropdb(), but that involves
waiting for checkpoints and a ProcSignalBarrier, which we probably don't want
to do in an error callback.  This should probably be fixed one day, but for
now 001_aio.pl needs to cope.

Per buildfarm animals fairywren and drongo.

Discussion: https://postgr.es/m/uc62i6vi5gd4bi6wtjj5poadqxolgy55e7ihkmf3mthjegb6zl@zqo7xez7sc2r
2025-04-02 07:51:48 -04:00
Peter Eisentraut
eec0040c4b Add support for NOT ENFORCED in foreign key constraints
This expands the NOT ENFORCED constraint flag, previously only
supported for CHECK constraints (commit ca87c415e2), to foreign key
constraints.

Normally, when a foreign key constraint is created on a table, action
and check triggers are added to maintain data integrity.  With this
patch, if a constraint is marked as NOT ENFORCED, integrity checks are
no longer required, making these triggers unnecessary.  Consequently,
when creating a NOT ENFORCED foreign key constraint, triggers will not
be created, and the constraint will be marked as NOT VALID.
Similarly, if an existing foreign key constraint is changed to NOT
ENFORCED, the associated triggers will be dropped, and the constraint
will also be marked as NOT VALID.  Conversely, if a NOT ENFORCED
foreign key constraint is changed to ENFORCED, the necessary triggers
will be created, and the constraint will be changed to VALID by
performing the necessary validation.
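
A sketch of the intended usage (names invented):

    ALTER TABLE orders
        ADD CONSTRAINT orders_customer_fk
        FOREIGN KEY (customer_id) REFERENCES customers (id)
        NOT ENFORCED;  -- no triggers are created; constraint is NOT VALID

    ALTER TABLE orders
        ALTER CONSTRAINT orders_customer_fk ENFORCED;
        -- triggers are created and existing rows are validated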

Since not-enforced foreign key constraints have no triggers, the
shortcut used for example in psql and pg_dump to skip looking for
foreign keys if the relation is known not to have triggers no longer
applies.  (It already didn't work for partitioned tables.)

Author: Amul Sul <sulamul@gmail.com>
Reviewed-by: Joel Jacobson <joel@compiler.org>
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: jian he <jian.universality@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Isaac Morland <isaac.morland@gmail.com>
Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com>
Tested-by: Triveni N <triveni.n@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAJ_b962c5AcYW9KUt_R_ER5qs3fUGbe4az-SP-vuwPS-w-AGA@mail.gmail.com
2025-04-02 13:36:44 +02:00
Andres Freund
327d987df1 tests: Cope with io_method in TEMP_CONFIG in test_aio
If io_method is set in TEMP_CONFIG the test added in 93bc3d75d8 fails,
because it assumes the io_method specified at initdb is actually used.

Fix that by appending the io_method again, after initdb (and thus after
TEMP_CONFIG has been added by Cluster.pm).

Per buildfarm animal bumblebee

Discussion: https://postgr.es/m/zh5u22wbpcyfw2ddl3lsvmsxf4yvsrvgxqwwmfjddc4c2khsgp@gfysyjsaelr5
2025-04-02 07:00:40 -04:00
Alexander Korotkov
bc22dc0e0d Get rid of WALBufMappingLock
Allow multiple backends to initialize WAL buffers concurrently.  This way
`MemSet((char *) NewPage, 0, XLOG_BLCKSZ);` can run in parallel without
taking a single LWLock in exclusive mode.

The new algorithm works as follows:
 * reserve a page for initialization using XLogCtl->InitializeReserved,
 * ensure the page is written out,
 * once the page is initialized, try to advance XLogCtl->InitializedUpTo and
   signal to waiters using XLogCtl->InitializedUpToCondVar condition
   variable,
 * repeat previous steps until we reserve initialization up to the target
   WAL position,
 * wait until concurrent initialization finishes using a
   XLogCtl->InitializedUpToCondVar.

Now, multiple backends can, in parallel, concurrently reserve pages,
initialize them, and advance XLogCtl->InitializedUpTo to point to the latest
initialized page.

Author: Yura Sokolov <y.sokolov@postgrespro.ru>
Co-authored-by: Alexander Korotkov <aekorotkov@gmail.com>
Reviewed-by: Pavel Borisov <pashkin.elfe@gmail.com>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Tested-by: Michael Paquier <michael@paquier.xyz>
2025-04-02 12:44:24 +03:00
Fujii Masao
b53b88109f Improve error message when standby does not accept connections.
Even after reaching the minimum recovery point, if there are long-lived
write transactions with 64 subtransactions on the primary, the recovery
snapshot may not yet be ready for hot standby, delaying read-only
connections on the standby. Previously, when read-only connections were
not accepted due to this condition, the following error message was logged:

    FATAL:  the database system is not yet accepting connections
    DETAIL:  Consistent recovery state has not been yet reached.

This DETAIL message was misleading because the following message was
already logged in this case:

    LOG:  consistent recovery state reached

This contradiction, i.e., indicating that the recovery state was consistent
while also stating it wasn’t, caused confusion.

This commit improves the error message to better reflect the actual state:

    FATAL: the database system is not yet accepting connections
    DETAIL: Recovery snapshot is not yet ready for hot standby.
    HINT: To enable hot standby, close write transactions with more than 64 subtransactions on the primary server.

To implement this, the commit introduces a new postmaster signal,
PMSIGNAL_RECOVERY_CONSISTENT. When the startup process reaches
a consistent recovery state, it sends this signal to the postmaster,
allowing it to correctly recognize that state.

Since this is not a clear bug, the change is applied only to the master
branch and is not back-patched.

Author: Atsushi Torikoshi <torikoshia@oss.nttdata.com>
Co-authored-by: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Yugo Nagata <nagata@sraoss.co.jp>
Discussion: https://postgr.es/m/02db8cd8e1f527a8b999b94a4bee3165@oss.nttdata.com
2025-04-02 15:13:01 +09:00
David Rowley
121d774cae Doc: add information about partition locking
The documentation around locking of partitions for the executor startup
phase of run-time partition pruning wasn't clear about which partitions
were being locked.  Fix that.

Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/CAApHDvp738G75HfkKcfXaf3a8s%3D6mmtOLh46tMD0D2hAo1UCzA%40mail.gmail.com
Backpatch-through: 13
2025-04-02 14:02:44 +13:00
Melanie Plageman
b3219c69fc aio: Add errcontext for processing I/Os for another backend
Push an ErrorContextCallback adding additional detail about the process
performing the I/O and the owner of the I/O when those are not the same.

For io_method worker, this adds context specifying which process owns
the I/O that the I/O worker is processing.

For io_method io_uring, this adds context only when a backend is
*completing* I/O for another backend. It specifies the pid of the owning
process.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/rdml3fpukrqnas7qc5uimtl2fyytrnu6ymc2vjf2zuflbsjuul%40hyizyjsexwmm
2025-04-01 19:53:07 -04:00
David Rowley
b136db07c6 Fix planner's failure to identify multiple hashable ScalarArrayOpExprs
50e17ad28 (v14) and 29f45e299 (v15) made it so the planner could identify
IN and NOT IN clauses which have Const lists as right-hand arguments and
when an appropriate hash function is available for the data types, mark
the ScalarArrayOpExpr as hashable so the executor could execute it more
optimally by building and probing a hash table during expression
evaluation.

These commits both worked correctly when there was only a single
ScalarArrayOpExpr in the given expression being processed by the
planner, but when there were multiple, only the first was checked and any
subsequent ones were not identified, which resulted in less optimal
expression evaluation during query execution for all but the first found
ScalarArrayOpExpr.
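
A sketch of an affected query shape (table invented; both IN lists are
long enough for hashing to be considered):

    SELECT * FROM t
    WHERE a IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
      AND b NOT IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
    -- previously only the first clause could be marked hashable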

Backpatch to 14, where 50e17ad28 was introduced.

Author: David Geier <geidav.pg@gmail.com>
Discussion: https://postgr.es/m/29a76f51-97b0-4c07-87b7-ec8e3b5345c9@gmail.com
Backpatch-through: 14
2025-04-02 11:56:29 +13:00
Tom Lane
6c12ae09f5 Introduce a SQL-callable function array_sort(anyarray).
Create a function that will sort the elements of an array
according to the element type's sort order.  If the array
has more than one dimension, the sub-arrays of the first
dimension are sorted per normal array-comparison rules,
leaving their contents alone.

In support of this, add pg_type.typarray to the set of fields
cached by the typcache.

Author: Junwang Zhao <zhjwpku@gmail.com>
Co-authored-by: Jian He <jian.universality@gmail.com>
Reviewed-by: Aleksander Alekseev <aleksander@timescale.com>
Discussion: https://postgr.es/m/CAEG8a3J41a4dpw_-F94fF-JPRXYxw-GfsgoGotKcjs9LVfEEvw@mail.gmail.com
2025-04-01 18:03:55 -04:00
Tom Lane
6da2ba1d8a Fix detection and handling of strchrnul() for macOS 15.4.
As of 15.4, macOS has strchrnul(), but access to it is blocked behind
a check for MACOSX_DEPLOYMENT_TARGET >= 15.4.  But our does-it-link
configure check finds it, so we try to use it, and fail with the
present default deployment target (namely 15.0).  This accounts for
today's buildfarm failures on indri and sifaka.

This is the identical problem that we faced some years ago when Apple
introduced preadv and pwritev in the same way.  We solved that in
commit f014b1b9b by using AC_CHECK_DECLS instead of AC_CHECK_FUNCS
to check the functions' availability.  So do the same now for
strchrnul().  Interestingly, we already had a workaround for
"the link check doesn't agree with <string.h>" cases with glibc,
which we no longer need since only the header declaration is being
checked.

Testing this revealed that the meson version of this check has never
worked, because it failed to use "-Werror=unguarded-availability-new".
(Apparently nobody's tried to build with meson on macOS versions that
lack preadv/pwritev as standard.)  Adjust that while at it.  Also,
we had never put support for "-Werror=unguarded-availability-new"
into v13, but we need that now.

Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Co-authored-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/385134.1743523038@sss.pgh.pa.us
Backpatch-through: 13
2025-04-01 16:50:09 -04:00
Andrew Dunstan
c313fa4602 Use workaround of __builtin_setjmp only on MINGW on MSVCRT
MSVCRT is not present on Windows/ARM64, and the workaround is not
necessary on any UCRT-based toolchain.

Author: Lars Kanis <lars@greiz-reinsdorf.de>

Discussion: https://postgr.es/m/CAHXCYb2OjNHtoGVKyXtXmw4B3bUXwJX6M-Lcp1KcMCRUMLOocA@mail.gmail.com
2025-04-01 16:24:59 -04:00
Andres Freund
e19dc74491 aio: Minor comment improvements
Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/usbwzckj7q3jhfx3ann3nrfnukmupbs35axvq5zfyeo6nvrzrm@onjhxs2du4st
2025-04-01 16:06:48 -04:00
Andres Freund
fdd146a8ef aio: Add README.md explaining higher level design
Reviewed-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt
Discussion: https://postgr.es/m/20210223100344.llw5an2aklengrmn@alap3.anarazel.de
Discussion: https://postgr.es/m/stj36ea6yyhoxtqkhpieia2z4krnam7qyetc57rfezgk4zgapf@gcnactj4z56m
2025-04-01 16:06:48 -04:00
Nathan Bossart
5aec7e07fb doc: Adjust some notes about pg_upgrade's file transfer modes.
--copy-file-range and --swap were not mentioned in a few places
that discuss the available file transfer modes.  This entire page
would likely benefit from an overhaul, but that's v19 material at
this point.

Oversights in commits d93627bcbe and 626d7236b6.
2025-04-01 14:37:47 -05:00
Andres Freund
00066aa173 md: Add comment & assert to buffer-zeroing path in md[start]readv()
mdreadv() has a codepath to zero out buffers when a read returns zero bytes,
guarded by a check for zero_damaged_pages || InRecovery.

The InRecovery codepath to zero out buffers in mdreadv() appears to be
unreachable. The only known paths to reach mdreadv()/mdstartreadv() in
recovery are XLogReadBufferExtended(), vm_readbuf(), and fsm_readbuf(), each
of which takes care to extend the relation if necessary. This appears either
to have been the case for a long time, or the code was never reachable.

The zero_damaged_pages path is incomplete, as missing segments are not
created.

Putting blocks into the buffer-pool that do not exist on disk is rather
problematic, as such blocks will, at least initially, not be found by scans
that rely on smgrnblocks(), as they are beyond EOF. It also can cause weird
problems with relation extension, as relation extension does not expect blocks
beyond EOF to exist.

Therefore we would like to remove that path.

mdstartreadv(), which I added in e5fe570b51c, does not implement this zeroing
logic. I had started a discussion about that a while ago (linked below), but
forgot to act on the conclusion of the discussion, namely to disable the
in-memory-zeroing behavior.

We could certainly implement equivalent zeroing logic in mdstartreadv(), but
it would have to be more complicated due to potential differences in the
zero_damaged_pages setting between the definer and completor of IO.  Given
that we want to remove the logic, implementing it there does not seem
worthwhile.

For now, put an Assert(false) and comments documenting this choice into
mdreadv(), along with comments documenting the deprecation of the path there
and its non-implementation in mdstartreadv().  If we discover during testing
that we do need the path, we can implement it at that time.
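
As a standalone illustration of that pattern (hypothetical names, not the
committed code; build with assertions enabled to trip the check):

  #include <assert.h>
  #include <string.h>

  /* Keep the believed-unreachable zero-fill branch, but assert so that
   * testing reveals whether it is ever reached. */
  static int read_block(int nread, char *buf, int len)
  {
      if (nread == 0)
      {
          assert(false);        /* believed unreachable */
          memset(buf, 0, len);  /* zero-fill fallback in non-assert builds */
          return len;
      }
      return nread;
  }

  int main(void)
  {
      char buf[8];
      return read_block(8, buf, 8) == 8 ? 0 : 1;
  }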

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/20250330024513.ac.nmisch@google.com
Discussion: https://postgr.es/m/3qxxsnciyffyf3wyguiz4besdp5t5uxvv3utg75cbcszojlz7p@uibfzmnukkbd
2025-04-01 13:50:39 -04:00
Andres Freund
93bc3d75d8 aio: Add test_aio module
To make the tests possible, a few functions from bufmgr.c/localbuf.c had to be
exported, via buf_internals.h.

Reviewed-by: Noah Misch <noah@leadboat.com>
Co-authored-by: Andres Freund <andres@anarazel.de>
Co-authored-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt
2025-04-01 13:47:46 -04:00
Andres Freund
60f566b4f2 aio: Add pg_aios view
The new view lists all IO handles that are currently in use and is mainly
useful for PG developers, but may also be useful when tuning PG.

Bumps catversion.

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt
2025-04-01 13:30:33 -04:00
Andres Freund
46250cdcb0 docs: Add acronym and glossary entries for I/O and AIO
These are fairly basic, but better than nothing.  While there are several
opportunities to link to these entries, this patch does not add any. They will
however be referenced by future patches.

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/20250326183102.92.nmisch@google.com
2025-04-01 13:30:33 -04:00
Álvaro Herrera
172259afb5
Verify roundtrip dump/restore of regression database
Add a test to pg_upgrade's test suite that verifies that
dump-restore-dump of the regression database produces output equivalent to
dumping it directly.  This was already being tested by running
pg_upgrade itself, but non-binary-upgrade mode was not being covered.

The regression database has accrued, over time, a sufficient collection
of interesting objects to ensure good coverage, but there hasn't been a
concerted effort to be completely exhaustive, so more could likely
still be added.

This'd belong more naturally in the pg_dump test suite, but we chose to
put it in src/bin/pg_upgrade/t/002_pg_upgrade.pl because we need a run
of the regression tests, which is already done here, so this has less
total test runtime impact.  Also, experiments have shown that using
parallel dump/restore is slightly faster, so we use --format=directory -j2.

This test has already reported pg_dump bugs, as fixed in fd41ba93e4,
74563f6b90, d611f8b158, 4694aedf63.

Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://www.postgresql.org/message-id/CAExHW5uF5V=Cjecx3_Z=7xfh4rg2Wf61PT+hfquzjBqouRzQJQ@mail.gmail.com
2025-04-01 18:50:40 +02:00
Peter Eisentraut
764d501d24 Remove a stray "pgrminclude" annotation
We don't use those anymore.  Fix for commit 8492feb98f.
2025-04-01 15:28:22 +02:00
Peter Eisentraut
113ecf1f8c Fix minor C type confusion
Returning false instead of NULL gets a compiler error under gcc-14
-std=gnu23, and it appears to have been unintentional.  Fix for commit
8492feb98f.
2025-04-01 15:28:22 +02:00
Heikki Linnakangas
2904324a88 heapam: Only set tuple's block once per page in pagemode
Due to splitting the block id into two 16 bit integers, BlockIdSet()
is more expensive than one might think.  Doing it once per returned
tuple shows up as a small but reliably reproducible cost.  It's simple
enough to set the block number just once per block in pagemode, so do
so.
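
To see why BlockIdSet() is not free, here is a minimal standalone sketch of
the two-halves layout (field names mirror PostgreSQL's BlockIdData, but this
is illustrative code, not the committed change):

  #include <assert.h>
  #include <stdint.h>

  typedef struct { uint16_t bi_hi; uint16_t bi_lo; } block_id;

  /* Splitting the 32-bit block number costs two shifts/stores per call. */
  static void block_id_set(block_id *id, uint32_t blkno)
  {
      id->bi_hi = (uint16_t) (blkno >> 16);
      id->bi_lo = (uint16_t) (blkno & 0xFFFF);
  }

  int main(void)
  {
      block_id id;
      block_id_set(&id, 0x0001FFFF);
      assert(id.bi_hi == 1 && id.bi_lo == 0xFFFF);
      return 0;
  }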

Author: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/lxzj26ga6ippdeunz6kuncectr5gfuugmm2ry22qu6hcx6oid6@lzx3sjsqhmt6
2025-04-01 13:24:27 +03:00
John Naylor
af0c248557 Use function attributes for SSE 4.2 even when targeting that extension
On Red Hat 9 systems (or similar), the packaged gcc targets x86-64-v2,
but clang does not. This has caused build failures in the wake of
commit e2809e3a1 when building --with-llvm.

The most expedient fix is to use the same function attributes for
the inlined function as we do for the global function.

Reported-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> (plus members skimmer and bumblebee)
Diagnosed-by: Tom Lane <tgl@sss.pgh.pa.us>
Tested-by: Todd Cook <cookt@blackduck.com>
Discussion: https://postgr.es/m/CANWCAZZSxs3a1YRKehkgk2OHKbrVn+xZ+AWW8Co2R_f70NqqmA@mail.gmail.com
2025-04-01 12:01:58 +07:00
David Rowley
3dbdf86c63 Fix failing regression test on x86-32 machines
95d6e9af0 added code to display the tuplestore storage type for
WindowAgg nodes and added a test to ensure the "Disk" storage method was
working correctly by setting work_mem to 64 and running a test which
caused the WindowAgg to go to disk.  Seemingly, the number of rows
chosen there wasn't quite enough for that to happen on 32-bit x86.

Fix this by increasing the number of rows slightly.

I suspect the buildfarm didn't catch this because MEMORY_CONTEXT_CHECKING
builds use a bit more memory for MemoryChunks, to store the
requested_size, and also because of the additional space to store the
chunk's sentinel byte.

Reported-by: Christoph Berg <myon@debian.org>
Discussion: https://postgr.es/m/Z-q3ZAM4OhE-4UiI@msg.df7cb.de
2025-04-01 10:52:25 +13:00
Tom Lane
2fd3e2fa5c Fix accidentally-harmless thinko in psqlscan_test_variable().
This code was passing literal strings to psqlscan_emit,
which is quite contrary to that function's specification:
"If you pass it something that is not part of the yytext
string, you are making a mistake".  It accidentally worked
anyway, even in non-safe_encoding mode.  psqlscan_emit
would compute a garbage "reference" pointer, but would
never dereference that since the passed string is all-ASCII.
So there's no live bug today, but that is a happenstance
outcome of psqlscan_emit's current implementation.

Let's make psqlscan_test_variable do what it's supposed to,
namely append directly to the output buffer.  This is just
future-proofing against possible changes in psqlscan_emit,
so I don't feel a need to back-patch.
2025-03-31 12:16:32 -04:00
Peter Eisentraut
0fcf02ad45 doc: Mention clock synchronization recommendation for hot_standby_feedback
hot_standby_feedback mechanics assume that clocks are synchronized,
but this was not clear from the documentation.

Author: Jakub Wartak <jakub.wartak@enterprisedb.com>
Reviewed-by: Euler Taveira <euler@eulerto.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Discussion: https://postgr.es/m/CAKZiRmwBcALLrDgCyEhHP1enUxtPMjyNM_d1A2Lng3_6Rf4Qfw%40mail.gmail.com
2025-03-31 16:54:50 +02:00
John Naylor
e2809e3a10 Inline CRC computation for small fixed-length input on x86
pg_crc32c.h now has a simplified copy of the loop in pg_crc32c_sse42.c
suitable for inlining where possible.

This may slightly reduce contention for the WAL insertion lock,
but that hasn't been tested. The motivation for this change is to avoid
regressing in a future commit that will use a function pointer for
non-constant input in all x86 builds.

While it's technically possible to make a similar change for Arm and
LoongArch, there are some questions about how inlining should work
since those platforms prefer stricter alignment. There are also no
immediate plans to add additional implementations for them.
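
A rough standalone sketch of the kind of SSE 4.2 loop that becomes
inlinable (on x86-64, compile with -msse4.2; illustrative, not the
committed code):

  #include <nmmintrin.h>
  #include <stddef.h>
  #include <stdint.h>
  #include <string.h>

  static uint32_t crc32c_sse42(uint32_t crc, const void *data, size_t len)
  {
      const unsigned char *p = data;

      /* Consume 8 bytes at a time, then finish byte-wise. */
      for (; len >= 8; p += 8, len -= 8)
      {
          uint64_t chunk;

          memcpy(&chunk, p, 8);
          crc = (uint32_t) _mm_crc32_u64(crc, chunk);
      }
      while (len-- > 0)
          crc = _mm_crc32_u8(crc, *p++);
      return crc;
  }

  int main(void)
  {
      return crc32c_sse42(0xFFFFFFFF, "123456789", 9) ? 0 : 1;
  }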

Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
Discussion: https://postgr.es/m/CANWCAZZEiTzhZcuwTiJ2=opiNpAUn1vuDRu1N02z61AthwRZLA@mail.gmail.com
Discussion: https://postgr.es/m/CANWCAZYRhLHArpyfV4uRK-Rw9N5oV5HMkkKtBehcuTjNOMwCZg@mail.gmail.com
2025-03-31 13:17:21 +07:00
Jeff Davis
4694aedf63 Add relallfrozen to pg_dump statistics.
Author: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://postgr.es/m/CADkLM=desCuf3dVHasADvdUVRmb-5gO0mhMO5u9nzgv6i7U86Q@mail.gmail.com
2025-03-30 22:14:06 -07:00
Andres Freund
2a5e709e72 Enable IO concurrency on all systems
Previously effective_io_concurrency and maintenance_io_concurrency could not
be set above 0 on machines without fadvise support. AIO enables IO concurrency
without such support, via io_method=worker.

Currently only subsystems using the read stream API will take advantage of
this. Other users of maintenance_io_concurrency (like recovery prefetching)
which leverage OS advice directly will not benefit from this change. In those
cases, maintenance_io_concurrency will have no effect on I/O behavior.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/CAAKRu_atGgZePo=_g6T3cNtfMf0QxpvoUh5OUqa_cnPdhLd=gw@mail.gmail.com
2025-03-30 19:16:47 -04:00
Andres Freund
ae3df4b341 read_stream: Introduce and use optional batchmode support
Submitting IO in larger batches can be more efficient than doing so
one-by-one, particularly for many small reads. It does, however, require
the ReadStreamBlockNumberCB callback to abide by the restrictions of AIO
batching (c.f. pgaio_enter_batchmode()). Basically, the callback may not:
a) block without first calling pgaio_submit_staged(), unless a
   to-be-waited-on lock cannot be part of a deadlock, e.g. because it is
   never held while waiting for IO.

b) directly or indirectly start another batch via pgaio_enter_batchmode()

As this requires care and is nontrivial in some cases, batching is only
used with explicit opt-in.

This patch adds an explicit flag (READ_STREAM_USE_BATCHING) to read_stream and
uses it where appropriate.

There are two cases where batching would likely be beneficial, but where we
aren't using it yet:

1) bitmap heap scans, because the callback reads the VM

   This should soon be solved, because we are planning to remove the use of
   the VM there, as that is not sound.

2) The first phase of heap vacuum

   This could be made to support batchmode, but would require some care.

Reviewed-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt
2025-03-30 18:36:41 -04:00
Andres Freund
f4d0730bbc aio: Basic read_stream adjustments for real AIO
Adapt the read stream logic for real AIO:
- If AIO is enabled, we shouldn't issue advice, but if it isn't, we should
  continue issuing advice
- AIO benefits from reading ahead with direct IO
- If effective_io_concurrency=0, pass READ_BUFFERS_SYNCHRONOUSLY to
  StartReadBuffers() to ensure synchronous IO execution

There are further improvements we should consider:

- While in read_stream_look_ahead(), we can use AIO batch submission mode for
  increased efficiency. That however requires care to avoid deadlocks and thus
  is done separately.
- It can be beneficial to defer starting new IOs until we can issue multiple
  IOs at once. That however requires non-trivial heuristics to decide when to
  do so.

Reviewed-by: Noah Misch <noah@leadboat.com>
Co-authored-by: Andres Freund <andres@anarazel.de>
Co-authored-by: Thomas Munro <thomas.munro@gmail.com>
2025-03-30 18:26:44 -04:00
Andres Freund
b27f8637ea docs: Reframe track_io_timing related docs as wait time
With AIO it does not make sense anymore to track the time for each individual
IO, as multiple IOs can be in-flight at the same time. Instead we now track
the time spent *waiting* for IOs.

This should be reflected in the docs. While, so far, we only do a subset of
reads, and no other operations, via AIO, describing the GUC and view columns
as measuring IO waits is accurate for synchronous and asynchronous IO.

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/5dzyoduxlvfg55oqtjyjehez5uoq6hnwgzor4kkybkfdgkj7ag@rbi4gsmzaczk
2025-03-30 18:04:40 -04:00
Andres Freund
12ce89fd07 bufmgr: Use AIO in StartReadBuffers()
This finally introduces the first actual use of AIO. StartReadBuffers() now
uses the AIO routines to issue IO.

As the implementation of StartReadBuffers() is also used by the functions for
reading individual blocks (StartReadBuffer() and through that
ReadBufferExtended()) this means all buffered read IO passes through the AIO
paths.  However, as those are synchronous reads, actually performing the IO
asynchronously would be rarely beneficial. Instead such IOs are flagged to
always be executed synchronously. This way we don't have to duplicate a fair
bit of code.

When io_method=sync is used, the IO patterns generated after this change are
the same as before, i.e. actual reads are only issued in WaitReadBuffers() and
StartReadBuffers() may issue prefetch requests.  This allows bypassing most of
the actual asynchronicity, which is important to make a change as big as this
less risky.

One thing worth calling out is that, if IO is actually executed
asynchronously, the precise meaning of what track_io_timing is measuring has
changed. Previously it tracked the time for each IO, but that does not make
sense when multiple IOs are executed concurrently. Now it only measures the
time actually spent waiting for IO. A subsequent commit will adjust the docs
for this.

While AIO is now actually used, the logic in read_stream.c will often prevent
using sufficiently many concurrent IOs. That will be addressed in the next
commit.

Reviewed-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Co-authored-by: Andres Freund <andres@anarazel.de>
Co-authored-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt
Discussion: https://postgr.es/m/20210223100344.llw5an2aklengrmn@alap3.anarazel.de
Discussion: https://postgr.es/m/stj36ea6yyhoxtqkhpieia2z4krnam7qyetc57rfezgk4zgapf@gcnactj4z56m
2025-03-30 18:02:23 -04:00
Andres Freund
047cba7fa0 bufmgr: Implement AIO read support
This commit implements the infrastructure to perform asynchronous reads into
the buffer pool.

To do so, it:

- Adds readv AIO callbacks for shared and local buffers

  It may be worth calling out that shared buffer completions may be run in a
  different backend than where the IO started.

- Adds an AIO wait reference to BufferDesc, to allow backends to wait for
  in-progress asynchronous IOs

- Adapts StartBufferIO(), WaitIO(), TerminateBufferIO(), and their localbuf.c
  equivalents, to be able to deal with AIO

- Moves the code to handle BM_PIN_COUNT_WAITER into a helper function, as it
  now also needs to be called on IO completion

As of this commit, nothing issues AIO on shared/local buffers. A future commit
will update StartReadBuffers() to do so.

Buffer reads executed through this infrastructure will report invalid page /
checksum errors / warnings differently than before:

In the error case the error message will cover all the blocks that were
included in the read, rather than just reporting the first invalid
block. If more than one block is invalid, the error will include information
about the range of the read, the first invalid block and the number of invalid
pages, with a HINT towards the server log for per-block details.

For the warning case (i.e. zero_damaged_pages) we would previously emit one
warning message for each buffer in a multi-block read. Now there is only a
single warning message for the entire read, again referring to the server log
for more details in case of multiple checksum failures within a single larger
read.

Reviewed-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt
Discussion: https://postgr.es/m/20210223100344.llw5an2aklengrmn@alap3.anarazel.de
Discussion: https://postgr.es/m/stj36ea6yyhoxtqkhpieia2z4krnam7qyetc57rfezgk4zgapf@gcnactj4z56m
2025-03-30 17:28:03 -04:00
Andres Freund
ef64fe26ba aio: Add WARNING result status
If an IO succeeds, but issues a warning, e.g. due to a page verification
failure with zero_damaged_pages, we want to issue that warning in the context
of the issuer of the IO, not the process that executes the completion (always
the case for worker).

It's already possible for a completion callback to report a custom error
message, we just didn't have a result status that allowed a user of AIO to
know that a warning should be emitted even though the IO request succeeded.

All that's needed for that is a dedicated PGAIO_RS_ value.

Previously there were not enough bits in PgAioResult.id for the new
value, so increase it.  While at it, add defines for the number of bits and
static asserts to check that the widths are appropriate.

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/20250329212929.a6.nmisch@google.com
2025-03-30 16:27:10 -04:00
Andres Freund
d445990adc Let caller of PageIsVerified() control ignore_checksum_failure
For AIO the completion of a read into shared buffers (i.e. verifying the page
including the checksum, updating the BufferDesc to reflect the IO) can happen
in a different backend than the backend that started the IO. As
ignore_checksum_failure can differ between backends, we need to allow the
caller of PageIsVerified() to control whether to ignore checksum failures.

The commit leaves a gap in the PIV_* values, as an upcoming commit, which
depends on this commit, will add PIV_LOG_LOG, which better fits just after
PIV_LOG_WARNING.

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/20250329212929.a6.nmisch@google.com
2025-03-30 16:27:10 -04:00
Andres Freund
b96d3c3897 pgstat: Allow checksum errors to be reported in critical sections
For AIO we execute completion callbacks in critical sections (to ensure that
AIO can in the future be used for WAL, which in turn requires that we can call
completion callbacks in critical sections, to get the resources for WAL
io). To report checksum errors a backend now has to call
pgstat_prepare_report_checksum_failure(), before entering a critical section,
which guarantees the relevant pgstats entry is in shared memory, the relevant
DSM segment is mapped into the backend's memory and the address is known via a
PgStat_EntryRef.

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/wkjj4p2rmkevutkwc6tewoovdqznj6c6nvjmvii4oo5wmbh5sr@retq7d6uqs4j
2025-03-30 16:12:04 -04:00
Andres Freund
4244cf6876 Add errhint_internal()
We have errmsg_internal(), errdetail_internal(), but not errhint_internal().

Sometimes it is useful to output a hint with an already translated format
string (e.g. because there are different messages depending on the
condition).  For message/detail we do that with the _internal() variants, but
we can't do that with hint today.  It's possible to work around that by
using something like
  str = psprintf(translated_format, args);
  ereport(...
          errhint("%s", str));
but that's not exactly pretty and makes it harder to avoid memory leaks.
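
The idea can be sketched standalone with plain C varargs (only an
illustration of the concept, not the backend implementation):

  #include <stdarg.h>
  #include <stdio.h>

  /* Accept an already-formatted/translated format string directly,
   * instead of routing it through translation again. */
  static void errhint_internal_sketch(const char *fmt, ...)
  {
      va_list ap;

      fputs("HINT:  ", stdout);
      va_start(ap, fmt);
      vprintf(fmt, ap);
      va_end(ap);
      putchar('\n');
  }

  int main(void)
  {
      const char *translated = "retry with a value below %d";
      errhint_internal_sketch(translated, 42);
      return 0;
  }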

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/ym3dqpa4xcvoeknewcw63x77vnqdosbqcetjinb2zfoh65k55m@m4ozmwhr6lk6
2025-03-30 16:10:51 -04:00
Tomas Vondra
49b82522f1 Remove incidental md5() function use from test
Replace md5() with sha256() in tests introduced in 14ffaece0f, to
allow the test to pass in OpenSSL FIPS mode.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/3518736.1743307492@sss.pgh.pa.us
2025-03-30 13:22:39 +02:00
Andres Freund
d6d8054dc7 localbuf: Track pincount in BufferDesc as well
For AIO on temporary table buffers the AIO subsystem needs to be able to
ensure a pin on a buffer while AIO is going on, even if the IO-issuing query
errors out. Tracking the buffer in LocalRefCount does not work, as it would
cause CheckForLocalBufferLeaks() to assert out.

Instead, also track the refcount in BufferDesc.state, not just
LocalRefCount. This also makes local buffers behave a bit more akin to shared
buffers.

Note that we still don't need locking: AIO completion callbacks for local
buffers are executed in the issuing session (i.e. nobody else has access to
the BufferDesc).

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt
2025-03-29 16:36:51 -04:00
Andres Freund
08ccd56ac7 aio, bufmgr: Comment fixes/improvements
Some of these comments have been wrong for a while (12f3867f55), some I
recently introduced (da7226993f, 55b454d0e1). This includes an update to a
comment in FlushBuffer(), which will be copied in a future commit.

These changes seem big enough to be worth doing in separate commits.

Suggested-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/20250319212530.80.nmisch@google.com
2025-03-29 14:45:42 -04:00
Andres Freund
50cb7505b3 aio: Implement support for reads in smgr/md/fd
This implements the following:

1) An smgr AIO target, for AIO on smgr files. This should be usable not just
   for md.c but also other SMGR implementations if we ever get them.
2) readv support in fd.c, which requires a small bit of infrastructure work in
   fd.c
3) smgr.c and md.c support for readv

There still is nothing performing AIO, but as of this commit it would be
possible.

As part of this change FileGetRawDesc() actually ensures that the file is
opened - previously it was basically not usable. It's used to reopen a file in
IO workers.
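
For reference, a minimal sketch of the vectored-read primitive that such
readv support builds on (Linux/BSD preadv(); the file path and buffer sizes
are arbitrary, not from this commit):

  #include <fcntl.h>
  #include <stdio.h>
  #include <sys/uio.h>
  #include <unistd.h>

  int main(void)
  {
      char a[4], b[4];
      struct iovec iov[2] = {{a, sizeof(a)}, {b, sizeof(b)}};
      int fd = open("/etc/hosts", O_RDONLY);

      if (fd < 0)
          return 1;
      /* One syscall fills both buffers, starting at offset 0. */
      printf("read %zd bytes\n", preadv(fd, iov, 2, 0));
      close(fd);
      return 0;
  }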

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt
Discussion: https://postgr.es/m/20210223100344.llw5an2aklengrmn@alap3.anarazel.de
Discussion: https://postgr.es/m/stj36ea6yyhoxtqkhpieia2z4krnam7qyetc57rfezgk4zgapf@gcnactj4z56m
2025-03-29 13:38:35 -04:00
Andres Freund
dee8002468 Fix mis-attribution of checksum failure stats to the wrong database
Checksum failure stats could be attributed to the wrong database in two cases:

- when a read of a shared relation encountered a checksum error, it would be
  attributed to the current database, instead of the "database" representing
  shared relations

- when using CREATE DATABASE ... STRATEGY WAL_LOG checksum errors in the
  source database would be attributed to the current database

The checksum stats reporting via PageIsVerifiedExtended(PIV_REPORT_STAT) does
not have access to the information about what database a page belongs to.

This fixes the issue by removing PIV_REPORT_STAT and delegating the
responsibility to report stats to the caller, which can now learn about the
number of checksum failures via a new optional argument.

As this changes the signature of PageIsVerifiedExtended() and all callers
should adapt to the new signature, use the occasion to rename the function to
PageIsVerified() and remove the compatibility macro.

We could instead have fixed this by adding information about the database to
the args of PageIsVerified(), but there are soon-to-be-applied patches that
need to separate the stats reporting from the PageIsVerified() call
anyway. Those patches also include testing for the failure paths, something we
inexplicably have not had.

As there is no caller of pgstat_report_checksum_failure() left, remove it.

It'd be possible, but awkward, to fix this in the back branches.  We
considered the work not quite worth it, as mis-attributed stats should still
elicit concern.  The emitted error messages do allow attributing the errors
correctly.

Discussion: https://postgr.es/m/5tyic6epvdlmd6eddgelv47syg2b5cpwffjam54axp25xyq2ga@ptwkinxqo3az
Discussion: https://postgr.es/m/mglpvvbhighzuwudjxzu4br65qqcxsnyvio3nl4fbog3qknwhg@e4gt7npsohuz
2025-03-29 13:38:35 -04:00
Tomas Vondra
68f97aeadb amcheck: Add a GIN index to the CREATE INDEX CONCURRENTLY tests
The existing CREATE INDEX CONCURRENTLY tests check only B-Tree, but
can be cheaply extended to also check GIN.  This helps increase test
coverage for GIN amcheck, especially related to handling concurrent page
splits and posting list trees.

This already helped to identify several issues during development of the
GIN amcheck support.

Author: Mark Dilger <mark.dilger@enterprisedb.com>
Reviewed-By: Tomas Vondra <tomas.vondra@enterprisedb.com>
Reviewed-By: Kirill Reshke <reshkekirill@gmail.com>
Discussion: https://postgr.es/m/BC221A56-977C-418E-A1B8-9EFC881D80C5%40enterprisedb.com
2025-03-29 16:47:44 +01:00
Tomas Vondra
ca738bdc4c amcheck: Add a test with GIN index on JSONB data
Extend the existing test of GIN checks to also include an index on JSONB
data, using the jsonb_path_ops opclass. This is a common enough usage of
GIN that it makes sense to have better test coverage for it.

Author: Mark Dilger <mark.dilger@enterprisedb.com>
Reviewed-By: Tomas Vondra <tomas.vondra@enterprisedb.com>
Reviewed-By: Kirill Reshke <reshkekirill@gmail.com>
Discussion: https://postgr.es/m/BC221A56-977C-418E-A1B8-9EFC881D80C5%40enterprisedb.com
2025-03-29 16:47:44 +01:00
Tomas Vondra
ec4327d106 amcheck: Fix indentation in verify_gin.c
I forgot to reindent the code after a couple last-minute adjustments
just before committing 14ffaece0f.

Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
2025-03-29 16:47:44 +01:00
Andres Freund
116e851db5 Fix "‘static’ is not at beginning of declaration" warning
b98be8a2a2 used "const static" instead of "static const". We normally use the
latter form.

Discussion: https://postgr.es/m/z4mc2hzecahyq3paupfsouhuupmzmgum45md3k5my6bmo7gvn7@z5j26doqamqy
2025-03-29 10:48:59 -04:00
Tomas Vondra
14ffaece0f amcheck: Add gin_index_check() to verify GIN index
Adds a new function, validating two kinds of invariants on a GIN index:

- parent-child consistency: Paths in a GIN graph have to contain
  consistent keys. Tuples on parent pages consistently include tuples
  from child pages; parent tuples do not require any adjustments.

- balanced-tree / graph: Each internal page has at least one downlink,
  and can reference either only leaf pages or only internal pages.

The GIN verification is based on work by Grigory Kryachko, reworked by
Heikki Linnakangas and with various improvements by Andrey Borodin.
Investigation and fixes for multiple bugs by Kirill Reshke.

Author: Grigory Kryachko <GSKryachko@gmail.com>
Author: Heikki Linnakangas <hlinnaka@iki.fi>
Author: Andrey Borodin <amborodin@acm.org>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Reviewed-By: Tomas Vondra <tomas.vondra@enterprisedb.com>
Reviewed-By: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-By: Mark Dilger <mark.dilger@enterprisedb.com>
Reviewed-By: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
2025-03-29 15:44:29 +01:00
Peter Eisentraut
53a2a1564a pgbench: Make set_random_seed() 64-bit everywhere.
Delete an intermediate variable, a redundant cast, a use of long and a
use of long long.  scanf() the seed directly into a uint64, now that we
can do that with SCNu64 from <inttypes.h>.

The previous coding was from pre-C99 times when %lld might not have been
there, so it read into an unsigned long.  Therefore behavior varied
by OS, and --random-seed would accept either 32- or 64-bit seeds.  Now
it's the same everywhere.
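
A standalone sketch of the parsing change (the seed value here is
arbitrary, for illustration only):

  #include <inttypes.h>
  #include <stdio.h>

  int main(void)
  {
      uint64_t seed;
      const char *arg = "12345678901234567890";

      /* Read the seed directly into a uint64 via SCNu64. */
      if (sscanf(arg, "%" SCNu64, &seed) != 1)
          return 1;
      printf("seed = %" PRIu64 "\n", seed);
      return 0;
  }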

Author: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/b936d2fb-590d-49c3-a615-92c3a88c6c19%40eisentraut.org
2025-03-29 15:24:42 +01:00
Tomas Vondra
d70b17636d amcheck: Move common routines into a separate module
Before performing checks on an index, we need to take some safety
measures that apply to all index AMs. This includes:

* verifying that the index can be checked - Only selected AMs are
supported by amcheck (right now only B-Tree). The index has to be
valid and not a temporary index from another session.

* changing (and then restoring) user's security context

* obtaining proper locks on the index (and table, if needed)

* discarding GUC changes from the index functions

Until now this was implemented in the B-Tree amcheck module, but it's
something every AM will have to do. So relocate the code into a new
module verify_common for reuse.

The shared steps are implemented by amcheck_lock_relation_and_check(),
receiving the AM-specific verification as a callback. Custom parameters
may be supplied using a pointer.

Author: Andrey Borodin <amborodin@acm.org>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Reviewed-By: Tomas Vondra <tomas@vondra.me>
Reviewed-By: Mark Dilger <mark.dilger@enterprisedb.com>
Reviewed-By: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Kirill Reshke <reshkekirill@gmail.com>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
2025-03-29 15:14:49 +01:00
Tomas Vondra
fb9dff7663 Fix grammar in GIN README
Author: Kirill Reshke <reshkekirill@gmail.com>
Discussion: https://postgr.es/m/CALdSSPgu9uAhVYojQ0yjG%3Dq5MaqmiSLUJPhz%2B-u7cA6K6Mc9UA%40mail.gmail.com
2025-03-29 15:14:25 +01:00
Dean Rasheed
8b6a0e2392 Fix MERGE with DO NOTHING actions into a partitioned table.
ExecInitPartitionInfo() duplicates much of the logic in
ExecInitMerge(), except that it failed to handle DO NOTHING
actions. This would cause an "unknown action in MERGE WHEN clause"
error if a MERGE with any DO NOTHING actions attempted to insert into
a partition not already initialised by ExecInitModifyTable().

Bug: #18871
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Author: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Gurjeet Singh <gurjeet@singh.im>
Discussion: https://postgr.es/m/18871-b44e3c96de3bd2e8%40postgresql.org
Backpatch-through: 15
2025-03-29 09:58:40 +00:00
Peter Eisentraut
a0ed19e0a9 Use PRI?64 instead of "ll?" in format strings (continued).
Continuation of work started in commit 15a79c73, after initial trial.

Author: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/b936d2fb-590d-49c3-a615-92c3a88c6c19%40eisentraut.org
2025-03-29 10:43:57 +01:00
Jeff Davis
a0a4601765 Matview statistics depend on matview data.
REFRESH MATERIALIZED VIEW replaces the storage, which resets
statistics, so statistics must be restored afterward.

If both statistics and data are being dumped for a materialized view,
add a dependency from the former to the latter. Defer the statistics
to SECTION_POST_DATA, and use RESTORE_PASS_POST_ACL.

Reported-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://postgr.es/m/CAExHW5s47kmubpbbRJzSM-Zfe0Tj2O3GBagB7YAyE8rQ-V24Uw@mail.gmail.com
2025-03-28 16:12:55 -07:00
Alexander Korotkov
775a06d44c Make group_similar_or_args() reorder clause list as little as possible
Currently, group_similar_or_args() permutes the original positions of clauses
regardless of whether it manages to find any groups of similar clauses.
While we are not providing any strict guarantees on preserving the original
order of OR-clauses, it is preferred that the original order be modified as
little as possible.

This commit changes the reordering algorithm of group_similar_or_args() in
the following way.  We reorder each group of similar clauses so that the
first item of the group stays in place, but all the other items are moved
after it.  So, if there are no similar clauses, the order of clauses stays
the same.  When there are some groups, only the required reordering happens
the rest of the clauses remain in their places.

Reported-by: Andrei Lepikhov <lepihov@gmail.com>
Discussion: https://postgr.es/m/3ac7c436-81e1-4191-9caf-b0dd70b51511%40gmail.com
Reviewed-by: Pavel Borisov <pashkin.elfe@gmail.com>
Reviewed-by: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Alena Rybakina <a.rybakina@postgrespro.ru>
2025-03-28 23:37:49 +02:00
Nathan Bossart
519338ace4 Optimize popcount functions with ARM SVE intrinsics.
This commit introduces SVE implementations of pg_popcount{32,64}.
Unlike the Neon versions, we need an additional configure-time
check to determine if the compiler supports SVE intrinsics, and we
need a runtime check to determine if the current CPU supports SVE
instructions.  Our testing showed that the SVE implementations are
much faster for larger inputs and are comparable to the status
quo for smaller inputs.

Author: "Devanga.Susmitha@fujitsu.com" <Devanga.Susmitha@fujitsu.com>
Co-authored-by: "Chiranmoy.Bhattacharya@fujitsu.com" <Chiranmoy.Bhattacharya@fujitsu.com>
Co-authored-by: "Malladi, Rama" <ramamalladi@hotmail.com>
Reviewed-by: John Naylor <johncnaylorls@gmail.com>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Discussion: https://postgr.es/m/010101936e4aaa70-b474ab9e-b9ce-474d-a3ba-a3dc223d295c-000000%40us-west-2.amazonses.com
Discussion: https://postgr.es/m/OSZPR01MB84990A9A02A3515C6E85A65B8B2A2%40OSZPR01MB8499.jpnprd01.prod.outlook.com
2025-03-28 16:20:20 -05:00
Peter Eisentraut
3c8e463b0d Revert "Tidy up locale thread safety in ECPG library."
This reverts commit 8e993bff53.

It causes various build failures on the buildfarm, to be investigated.

Discussion: https://postgr.es/m/CWZBBRR6YA8D.8EHMDRGLCKCD%40neon.tech
2025-03-28 21:27:37 +01:00
Nathan Bossart
6be53c2767 Optimize popcount functions with ARM Neon intrinsics.
This commit introduces Neon implementations of pg_popcount{32,64},
pg_popcount(), and pg_popcount_masked().  As in simd.h, we assume
that all available AArch64 hardware supports Neon, so we don't need
any new configure-time or runtime checks.  Some compilers already
emit Neon instructions for these functions, but our hand-rolled
implementations for pg_popcount() and pg_popcount_masked()
performed better in testing, likely due to better instruction-level
parallelism.
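
For context, the portable bit-twiddling fallback that such SIMD versions
compete against looks roughly like this (a standard SWAR popcount, shown
for illustration; not the committed Neon code):

  #include <stdint.h>
  #include <stdio.h>

  static int popcount64_slow(uint64_t x)
  {
      /* Classic SWAR reduction: pairwise, nibble, then byte sums. */
      x = x - ((x >> 1) & UINT64_C(0x5555555555555555));
      x = (x & UINT64_C(0x3333333333333333)) +
          ((x >> 2) & UINT64_C(0x3333333333333333));
      x = (x + (x >> 4)) & UINT64_C(0x0F0F0F0F0F0F0F0F);
      return (int) ((x * UINT64_C(0x0101010101010101)) >> 56);
  }

  int main(void)
  {
      printf("%d\n", popcount64_slow(0xFF));   /* prints 8 */
      return 0;
  }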

Author: "Chiranmoy.Bhattacharya@fujitsu.com" <Chiranmoy.Bhattacharya@fujitsu.com>
Reviewed-by: John Naylor <johncnaylorls@gmail.com>
Discussion: https://postgr.es/m/010101936e4aaa70-b474ab9e-b9ce-474d-a3ba-a3dc223d295c-000000%40us-west-2.amazonses.com
2025-03-28 14:49:35 -05:00
Heikki Linnakangas
51a0382e8d Fix crash if LockErrorCleanup() is called twice
The refactoring in commit 3c0fd64fec removed the clearing of
awaitedLock from LockErrorCleanup(). It's still needed, otherwise
LockErrorCleanup() during abort processing will try to update the
LOCALLOCK struct even after the lock has already been released. Put it
back.

Reported-by: Richard Guo <guofenglinux@gmail.com>
Reported-by: Robins Tharakan <tharakan@gmail.com>
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAMbWs4_dNX1SzBmvFdoY-LxJh_4W_BjtVd5i008ihfU-wFF=eg@mail.gmail.com
Discussion: https://www.postgresql.org/message-id/18832-38e5575b1bbd7277@postgresql.org
Discussion: https://www.postgresql.org/message-id/e11a30e5-c0d8-491d-8546-3a1b50c10ad4@gmail.com
2025-03-28 20:19:17 +02:00
Nathan Bossart
9ac6f7e7ce Rename TRY_POPCNT_FAST to TRY_POPCNT_X86_64.
This macro protects x86_64-specific code, and a subsequent commit
will introduce AArch64-specific versions of that code.  To prevent
confusion, let's rename it to clearly indicate that it's for
x86_64.  We should likely move this code to its own file (perhaps
merging it with the AVX-512 popcount code), but that is left as a
future exercise.

Reviewed-by: "Chiranmoy.Bhattacharya@fujitsu.com" <Chiranmoy.Bhattacharya@fujitsu.com>
Reviewed-by: John Naylor <johncnaylorls@gmail.com>
Discussion: https://postgr.es/m/010101936e4aaa70-b474ab9e-b9ce-474d-a3ba-a3dc223d295c-000000%40us-west-2.amazonses.com
2025-03-28 12:27:47 -05:00
Masahiko Sawada
a5419bc72e Fix timestamp overflow in UUIDv7 implementation.
The uuidv7_interval() function previously converted a shifted
microsecond-precision timestamp (64-bit integer) to another 64-bit
integer representing a timestamp with nanosecond precision. This
conversion caused overflow for dates beyond the year 2262. The
millisecond and sub-millisecond parts were then extracted from this
nanosecond-precision timestamp and stored in UUIDv7 values.

With this commit, the millisecond and sub-millisecond parts are stored
directly into the UUIDv7 value without being converted back to a
nanosecond precision timestamp. Following RFC 9562, the timestamp is
stored as an unsigned integer, enabling support for dates up to the
year 10889.
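
A standalone sketch of storing the 48-bit unsigned millisecond timestamp
directly into the UUID bytes, per RFC 9562 (illustrative, not the committed
code):

  #include <assert.h>
  #include <stdint.h>

  /* Place unix_ts_ms into the first 6 bytes, big-endian. */
  static void uuidv7_set_ts(unsigned char uuid[16], uint64_t unix_ts_ms)
  {
      for (int i = 0; i < 6; i++)
          uuid[i] = (unsigned char) (unix_ts_ms >> (8 * (5 - i)));
  }

  int main(void)
  {
      unsigned char uuid[16] = {0};

      uuidv7_set_ts(uuid, (UINT64_C(1) << 48) - 1);   /* max 48-bit value */
      assert(uuid[0] == 0xFF && uuid[5] == 0xFF);
      return 0;
  }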

Reported and fixed by Andrey Borodin, with cosmetic changes and
regression tests by me.

Reported-by: Andrey Borodin <x4mmm@yandex-team.ru>
Author: Andrey Borodin <x4mmm@yandex-team.ru>
Discussion: https://postgr.es/m/96DEC2D9-659A-40E8-B7BA-AF5D162A9E21@yandex-team.ru
2025-03-28 09:39:11 -07:00
Peter Eisentraut
8e993bff53 Tidy up locale thread safety in ECPG library.
Remove setlocale() and _configthreadlocal() as fallback strategy on
systems that don't have uselocale(), where ECPG tries to control
LC_NUMERIC formatting on input and output of floating point numbers.  It
was probably broken on some systems (NetBSD), and the code was also
quite messy and complicated, with obsolete configure tests (Windows).
It was also arguably broken, or at least had unstated environmental
requirements, if pgtypeslib code was called directly.

Instead, introduce PG_C_LOCALE to refer to the "C" locale as a locale_t
value.  It maps to the special constant LC_C_LOCALE when defined by libc
(macOS, NetBSD), or otherwise uses a process-lifetime locale_t that is
allocated on first use, just as ECPG previously did itself.  The new
replacement might be more widely useful.  Then change the float parsing
and printing code to pass that to _l() functions where appropriate.

Unfortunately the portability of those functions is a bit complicated.
First, many obvious and useful _l() functions are missing from POSIX,
though most standard libraries define some of them anyway.  Second,
although the thread-safe save/restore technique can be used to replace
the missing ones, Windows and NetBSD refused to implement standard
uselocale().  They might have a point: "wide scope" uselocale() is hard
to combine with other code and error-prone, especially in library code.
Luckily they have the _l() functions we want so far anyway.  So we have
to be prepared for both ways of doing things:

1.  In ECPG, use strtod_l() for parsing, and supply a port.h replacement
using uselocale() over a limited scope if missing.

2.  Inside our own snprintf.c, use three different approaches to format
floats.  For frontend code, call libc's snprintf_l(), or wrap libc's
snprintf() in uselocale() if it's missing.  For backend code, snprintf.c
can keep assuming that the global locale's LC_NUMERIC is "C" and call
libc's snprintf() without change, for now.

(It might eventually be possible to call our in-tree Ryū routines to
display floats in snprintf.c, given the C-locale-always remit of our
in-tree snprintf(), but this patch doesn't risk changing anything that
complicated.)

Author: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Tristan Partin <tristan@partin.io>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/CWZBBRR6YA8D.8EHMDRGLCKCD%40neon.tech
2025-03-28 16:18:36 +01:00
Peter Eisentraut
2247281c47 Cast result of i64abs() back to int64
Without the cast, the return type could be long or long long,
depending on what int64 is underneath.  This doesn't affect code
correctness, but it could result in format-mismatch warnings when
attempting to printf such values using PRId64.
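
A standalone illustration of the mismatch and the fix (the i64abs macro
here is a sketch, not the actual definition):

  #include <inttypes.h>
  #include <stdio.h>
  #include <stdlib.h>

  /* Without the cast, llabs() returns long long even where int64_t is
   * long, which can trip printf format checking with PRId64. */
  #define i64abs(x) ((int64_t) llabs(x))

  int main(void)
  {
      int64_t v = -42;

      printf("%" PRId64 "\n", i64abs(v));
      return 0;
  }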

Reported-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+hUKGJc4s+Wyb3EFOQNN9VVK+Qv40r2LK41o9PkS9ThxviTvQ@mail.gmail.com
2025-03-28 14:34:57 +01:00
Robert Haas
83ccc85859 pg_overexplain: Use PG_MODULE_MAGIC_EXT.
I committed this contrib module just after Tom committed
55527368bd07248e91e3d37a782bf66b76f06865; adjust it to match.

Author: Man Zeng <zengman@halodbtech.com>
Discussion: http://postgr.es/m/174313513707.60295.16516085012903412705.pgcf@coridan.postgresql.org
2025-03-28 09:16:29 -04:00
Robert Haas
9f0c36aea0 pg_overexplain: Call previous hooks as appropriate.
It makes no sense to remember the previous values of the hook variables
and then never bother calling those functions. Thanks to Andrei for
spotting my goof.

Author: Andrei Lepikhov <lepihov@gmail.com>
Discussion: http://postgr.es/m/41a344e3-ffb1-4296-8ba7-801f1e9642e5@gmail.com
2025-03-28 09:02:37 -04:00
Peter Eisentraut
cdc168ad4b Add support for not-null constraints on virtual generated columns
This was left out of the original patch for virtual generated columns
(commit 83ea6c5402).

This just involves a bit of extra work in the executor to expand the
generation expressions and run an "IS NOT NULL" test against them.

There is also a bit of work to make sure that not-null constraints are
checked during a table rewrite.

Author: jian he <jian.universality@gmail.com>
Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>
Reviewed-by: Navneet Kumar <thanit3111@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/CACJufxHArQysbDkWFmvK+D1TPHQWWTxWN15cMuUaTYX3xhQXgg@mail.gmail.com
2025-03-28 13:53:37 +01:00
Peter Eisentraut
747ddd38cb Modernize some code a bit
Modernize code in ExecRelCheck() and ExecConstraints() a bit,
preparing the way for some new code.

Co-authored-by: jian he <jian.universality@gmail.com>
Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>
Reviewed-by: Navneet Kumar <thanit3111@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/CACJufxHArQysbDkWFmvK+D1TPHQWWTxWN15cMuUaTYX3xhQXgg@mail.gmail.com
2025-03-28 10:49:15 +01:00
Peter Eisentraut
9a9ead1105 Rename a node field for clarity
Rename ResultRelInfo.ri_ConstraintExprs to ri_CheckConstraintExprs.
This reflects its specific purpose better and avoids confusion with
adjacent fields with similar but distinct purposes.

Discussion: https://postgr.es/m/CACJufxHArQysbDkWFmvK+D1TPHQWWTxWN15cMuUaTYX3xhQXgg@mail.gmail.com
2025-03-28 09:50:01 +01:00
Amit Kapila
fb2ea12f42 pg_createsubscriber: Add '--all' option.
The '--all' option indicates that the tool queries the source server
(publisher) for all databases and creates subscriptions on the target
server (subscriber) for databases with matching names.  Without this, the
user needs to explicitly specify every database by using a -d option for
each one.

This simplifies converting a physical standby to a logical subscriber,
particularly during upgrades.

The options '--database', '--publication', '--subscription', and
'--replication-slot' cannot be used when '--all' is specified.

Author: Shubham Khanna <khannashubham1197@gmail.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Euler Taveira <euler@eulerto.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Shlok Kyal <shlok.kyal.oss@gmail.com>
Discussion: https://postgr.es/m/CAHv8RjKhA=_h5vAbozzJ1Opnv=KXYQHQ-fJyaMfqfRqPpnC2bA@mail.gmail.com
2025-03-28 12:26:39 +05:30
Peter Eisentraut
890fc826c9 Use thread-safe strftime_l() instead of strftime().
This removes some setlocale() calls and a lot of commentary about how
dangerous that is.  strftime_l() is from POSIX 2008, and on Windows we
use _wcsftime_l().

While here, adjust error message for strftime_l() failure: it does not
in practice set errno (even though POSIX says it could), so no %m.
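
A standalone sketch of the POSIX 2008 interface in use (format string and
locale chosen arbitrarily, for illustration):

  #include <locale.h>
  #include <stdio.h>
  #include <time.h>

  int main(void)
  {
      locale_t loc = newlocale(LC_ALL_MASK, "C", (locale_t) 0);
      struct tm tmres;
      time_t now = time(NULL);
      char buf[64];

      if (loc == (locale_t) 0)
          return 1;
      localtime_r(&now, &tmres);
      /* Format using an explicit locale; no process-wide setlocale(). */
      if (strftime_l(buf, sizeof(buf), "%A %d %B %Y", &tmres, loc) > 0)
          puts(buf);
      freelocale(loc);
      return 0;
  }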

Author: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/CA%2BhUKGJqVe0%2BPv9dvC9dSums_PXxGo9SWcxYAMBguWJUGbWz-A%40mail.gmail.com
2025-03-28 07:13:43 +01:00
Amit Kapila
474d7a1fd8 Stabilize tests added in 3abe9dc188.
The problem is that after the ALTER SUBSCRIPTION tap_sub SET PUBLICATION
command, we didn't wait for the new walsender to start on the publisher.
Immediately after the ALTER, we performed an Insert and expected it to replicate.
However, the replication could start from a point after the INSERT location,
and as the subscription isn't copying initial data, we could miss such an
Insert.

The fix is to wait for the connection to be established between publisher and
subscriber before starting DML operations that are expected to replicate.

As per CI.

Reported-by: Andres Freund <andres@anarazel.de>
Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Discussion: https://postgr.es/m/CALDaNm2ms1deM5EYNLFEfESv_Kw=Y4AiTB0LP=qGS-UpFwGbPg@mail.gmail.com
2025-03-28 11:03:05 +05:30
Daniel Gustafsson
058b5152f0 Fix guc_malloc calls for consistency and OOM checks
check_createrole_self_grant and check_synchronized_standby_slots
were allocating memory on a LOG elevel without checking if the
allocation succeeded or not, which would have led to a segfault
on allocation failure.

On top of that, a number of callsites were using the ERROR level,
relying on erroring out rather than returning false to let the
GUC machinery handle it gracefully.  Other callsites used WARNING
instead of LOG.  While neither of those is wrong as such, this changes all
check_ functions to do it consistently with LOG.

init_custom_variable gets a promoted elevel to FATAL to keep
the guc_malloc error handling in line with the rest of the
error handling in that function, which already calls FATAL.  If
we encounter an OOM in this callsite there is no graceful
handling to be had, better to error out hard.
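
A standalone sketch of the check-hook pattern described above (names and
the stderr stand-in for elog are illustrative, not PostgreSQL code):

  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>

  /* Allocate, logging (not erroring) on failure, like guc_malloc(LOG). */
  static void *guc_malloc_log(size_t size)
  {
      void *p = malloc(size);

      if (p == NULL)
          fprintf(stderr, "LOG:  out of memory\n");
      return p;
  }

  /* A check hook returns false on OOM so the GUC machinery can cope. */
  static int check_some_guc(const char *newval)
  {
      char *copy = guc_malloc_log(strlen(newval) + 1);

      if (copy == NULL)
          return 0;
      strcpy(copy, newval);
      /* ... validate the copy here ... */
      free(copy);
      return 1;
  }

  int main(void)
  {
      return check_some_guc("on") ? 0 : 1;
  }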

Backpatch the fix to check_createrole_self_grant down to v16
and the fix to check_synchronized_standby_slots down to v17
where they were introduced.

Author: Daniel Gustafsson <daniel@yesql.se>
Reported-by: Nikita <pm91.arapov@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Bug: #18845
Discussion: https://postgr.es/m/18845-582c6e10247377ec@postgresql.org
Backpatch-through: 16
2025-03-27 22:57:34 +01:00
Melanie Plageman
043799fa08 Use streaming read I/O in heap amcheck
Instead of directly invoking ReadBuffer() for each unskippable block in
the heap relation, verify_heapam() now uses the read stream API to
acquire the next buffer to check for corruption.

Author: Matheus Alcantara <matheusssilv97@gmail.com>
Co-authored-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: jian he <jian.universality@gmail.com>
Discussion: https://postgr.es/m/flat/CAFY6G8eLyz7%2BsccegZYFj%3D5tAUR-GZ9uEq4Ch5gvwKqUwb_hCA%40mail.gmail.com
2025-03-27 14:04:14 -04:00
Tom Lane
4623d71443 Prevent assertion failure in contrib/pg_freespacemap.
Applying pg_freespacemap() to a relation lacking storage (such as a
view) caused an assertion failure, although there was no ill effect
in non-assert builds.  Add an error check for that case.

Bug: #18866
Reported-by: Robins Tharakan <tharakan@gmail.com>
Author: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Euler Taveira <euler@eulerto.com>
Discussion: https://postgr.es/m/18866-d68926d0f1c72d44@postgresql.org
Backpatch-through: 13
2025-03-27 13:20:23 -04:00
Tom Lane
d66997dfe8 Avoid mixing designated and non-designated field initializers.
As revised by commit 9324c8c58, PG_MODULE_MAGIC constructed a
struct initializer containing both designated fields and a
non-designated "0".  That's okay in C, but not in C++, with
the result that extensions written in C++ failed to compile.
Change it to use only designated field initializers.

Author: Yurii Rashkovskii <yrashk@omnigres.com>
Discussion: https://postgr.es/m/CAG=VW14mctsR543gpzLCuJ9JgJqwa=ptmBfGvxEjs+k8Jf7-Bg@mail.gmail.com
2025-03-27 11:06:30 -04:00
Daniel Gustafsson
0f3604a518 psql: Fix incorrect equality comparison
Commit 1a759c8327 contained an incorrect equality comparison
which was discovered by Coverity.

Reported-by: Ranier Vilela <ranier.vf@gmail.com>
Discussion: https://postgr.es/m/CAEudQApfAWzLo+oSuy2byXktdr7R8KJC_ACT5VV8fontrL35Pw@mail.gmail.com
2025-03-27 14:09:25 +01:00
Robert Haas
081ec08e6a pg_overexplain: Filter out actual row count from test result.
Per buildfarm, these are not stable. In particular, 1/8 is sometimes
0.12 and sometimes 0.13.
2025-03-27 09:00:46 -04:00
Álvaro Herrera
9fbd53dea5
Remove the query_id_squash_values GUC
Commit 62d712ecfd introduced the capability to calculate the same
queryId for queries whose lists of constants in an IN clause differ only
in length.  This behavior was originally enabled with a GUC
query_id_squash_values.  After a discussion about the value of such a
GUC, it was decided to back out of the use of a GUC and make the
squashing behavior the only available option.

Author: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/Z-LZyygkkNyA8-kR@msg.df7cb.de
Discussion: https://postgr.es/m/CA+q6zcVTK-3C-8NWV1oY2NZrvtnMCDqnyYYyk1T7WMUG65MeOQ@mail.gmail.com
2025-03-27 13:33:37 +01:00
Peter Eisentraut
5d5f415816 Expand test a bit
Make the pg_constraint output in the inherit test show the convalidated column
as well.  This shows the interaction between convalidated and
conenforced.

This is extracted from a larger patch so that this reformatting isn't
distracting there.

Author: Amul Sul <amul.sul@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAJ_b962c5AcYW9KUt_R_ER5qs3fUGbe4az-SP-vuwPS-w-AGA@mail.gmail.com
2025-03-27 12:11:15 +01:00
Peter Eisentraut
b98be8a2a2 Provide thread-safe pg_localeconv_r().
This involves four different implementation strategies:

1.  For Windows, we now require _configthreadlocale() to be available
and work (commit f1da075d9a), and the documentation says that the
object returned by localeconv() is in thread-local memory.

2.  For glibc, we translate to nl_langinfo_l() calls, because it
offers the same information that way as an extension, and that API is
thread-safe.

3.  For macOS/*BSD, use localeconv_l(), which is thread-safe.

4.  For everything else, use uselocale() to set the locale for the
thread, and use a big ugly lock to defend against the returned object
being concurrently clobbered.  In practice this currently means only
Solaris.

The new call is used in pg_locale.c, replacing calls to setlocale() and
localeconv().
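
A standalone sketch of strategy 2, querying numeric formatting through the
thread-safe nl_langinfo_l() (the locale name is chosen for illustration):

  #include <langinfo.h>
  #include <locale.h>
  #include <stdio.h>

  int main(void)
  {
      locale_t loc = newlocale(LC_NUMERIC_MASK, "C", (locale_t) 0);

      if (loc == (locale_t) 0)
          return 1;
      /* Thread-safe equivalents of localeconv()'s decimal_point and
       * thousands_sep. */
      printf("decimal point: '%s'\n", nl_langinfo_l(RADIXCHAR, loc));
      printf("thousands sep: '%s'\n", nl_langinfo_l(THOUSEP, loc));
      freelocale(loc);
      return 0;
  }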

Author: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/CA%2BhUKGJqVe0%2BPv9dvC9dSums_PXxGo9SWcxYAMBguWJUGbWz-A%40mail.gmail.com
2025-03-27 10:54:28 +01:00
Álvaro Herrera
4a02af8b1a
Simplify syntax for ALTER TABLE ALTER CONSTRAINT NO INHERIT
Commit d45597f72f introduced the ability to change a not-null
constraint from NO INHERIT to INHERIT and vice versa, but we included
the SET noise word in the syntax for it.  The SET turns out not to be
necessary and goes against what the SQL standard says for other ALTER
TABLE subcommands, so remove it.

This changes the way this command is processed for constraint types
other than not-null, so there are some error message changes.

Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Suraj Kharage <suraj.kharage@enterprisedb.com>
Discussion: https://postgr.es/m/202503251602.vsxaehsyaoac@alvherre.pgsql
2025-03-27 09:24:52 +01:00
Michael Paquier
72c2f36d57 libpq: Add TAP tests for service files and names
This commit adds a set of regression tests that checks various patterns
with service names and service files, with:
- Service file with no contents, used as default for PGSERVICEFILE to
prevent any lookups at the HOME directory of an environment where the
test is run.
- Service file with valid service name and its section.
- Service file at the root of PGSYSCONFDIR, named pg_service.conf.
- Missing service file.
- Service name defined as a connection parameter or as PGSERVICE.

Note that PGSYSCONFDIR is set to always point at a temporary directory
created by the test, so that we never try to look at SYSCONFDIR.

This set of tests has come up as a useful independent addition while
discussing a patch that adds an equivalent of PGSERVICEFILE as a
connection parameter, since there have never been any tests for service
files and service names.  Torsten Foertsch and Ryo Kanbayashi have
provided a basic implementation, which I have expanded to what is
introduced in this commit.

Author: Torsten Foertsch <tfoertsch123@gmail.com>
Author: Ryo Kanbayashi <kanbayashi.dev@gmail.com>
Author: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CAKkG4_nCjx3a_F3gyXHSPWxD8Sd8URaM89wey7fG_9g7KBkOCQ@mail.gmail.com
2025-03-27 16:01:38 +09:00
David Rowley
ad9a23bc4f Optimize Query jumble
f31aad9b0 adjusted query jumbling so it no longer ignores NULL nodes
during the jumble.  This added some overhead.  Here we tune a few
things to make jumbling faster again.  This makes jumbling perform
similarly to, or even slightly faster than, prior to that change.

Author: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CAApHDvreP04nhTKuYsPw0F-YN+4nr4f=L72SPeFb81jfv+2c7w@mail.gmail.com
2025-03-27 18:34:34 +13:00
David Rowley
f31aad9b07 Fix query jumbling to account for NULL nodes
Previously NULL nodes were ignored.  This could cause the computed query ID
to match for queries where, of two fields next to each other in their Node
struct, one was NULL and the other non-NULL.  For example, the Query struct
had distinctClause and sortClause next to each other.  If someone wrote:

SELECT DISTINCT c1 FROM t;

and then:

SELECT c1 FROM t ORDER BY c1;

these would produce the same query ID since, in the first query, we
ignored the NULL sortClause and appended the jumble bytes for the
distinctClause.  In the latter query, we did nothing for the NULL
distinctClause and then jumbled the non-NULL sortClause, and since the node
representation stored is the same in both cases, the query IDs were
identical.

Here we fix this by always accounting for NULL nodes by recording that
we saw a NULL in the jumble buffer.  This fixes the issue as the order in
which the NULL is recorded differs between the above two queries.
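
A standalone sketch of the difference (a toy jumble over two child
pointers; names and buffer handling are illustrative, not the jumbling
code itself):

  #include <stdio.h>
  #include <string.h>

  static void jumble(unsigned char *buf, size_t *off, const void *p, size_t n)
  {
      memcpy(buf + *off, p, n);
      *off += n;
  }

  /* Record a marker byte for NULL children instead of skipping them. */
  static void jumble_node(unsigned char *buf, size_t *off, const int *node)
  {
      const unsigned char null_marker = 0;

      if (node == NULL)
          jumble(buf, off, &null_marker, 1);
      else
          jumble(buf, off, node, sizeof(*node));
  }

  int main(void)
  {
      unsigned char a[16] = {0}, b[16] = {0};
      size_t oa = 0, ob = 0;
      int x = 7;

      jumble_node(a, &oa, NULL); jumble_node(a, &oa, &x);   /* (NULL, x) */
      jumble_node(b, &ob, &x);   jumble_node(b, &ob, NULL); /* (x, NULL) */
      puts(oa == ob && memcmp(a, b, oa) == 0 ? "same" : "different");
      return 0;
  }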

Author: Bykov Ivan <i.bykov@modernsys.ru>
Author: Michael Paquier <michael@paquier.xyz>
Author: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/aafce7966e234372b2ba876c0193f1e9%40localhost.localdomain
2025-03-27 18:23:00 +13:00
Michael Paquier
44fe6ceb51 doc: Correct description of values used in FSM for indexes
The implementation of the FSM for indexes is simpler than for heap, where 0
is used to track if a page is in use and (BLCKSZ - 1) if a page is free.
One comment in indexfsm.c and one description in the documentation of
pg_freespacemap were incorrect about that.

Author: Alex Friedman <alexf01@gmail.com>
Discussion: https://postgr.es/m/71eef655-c192-453f-ac45-2772fec2cb04@gmail.com
Backpatch-through: 13
2025-03-27 10:20:41 +09:00
Andres Freund
c325a7633f aio: Add io_method=io_uring
Performing AIO using io_uring can be considerably faster than
io_method=worker, particularly when lots of small IOs are issued, as
a) the context-switch overhead for worker-based AIO becomes more significant
b) the number of IO workers can become limiting

io_uring, however, is linux specific and requires an additional compile-time
dependency (liburing).

This implementation is fairly simple and there are substantial optimization
opportunities.

The description of the existing AIO_IO_COMPLETION wait event is updated to
make the difference between it and the new AIO_IO_URING_EXECUTION clearer.
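
For readers unfamiliar with io_uring, a minimal liburing read sketch
(build with -luring; the file path is arbitrary; not PostgreSQL code):

  #include <fcntl.h>
  #include <liburing.h>
  #include <stdio.h>
  #include <unistd.h>

  int main(void)
  {
      struct io_uring ring;
      struct io_uring_sqe *sqe;
      struct io_uring_cqe *cqe;
      char buf[128];
      int fd = open("/etc/hosts", O_RDONLY);

      if (fd < 0 || io_uring_queue_init(8, &ring, 0) < 0)
          return 1;
      /* Queue one read, submit it, then wait for its completion. */
      sqe = io_uring_get_sqe(&ring);
      io_uring_prep_read(sqe, fd, buf, sizeof(buf), 0);
      io_uring_submit(&ring);
      if (io_uring_wait_cqe(&ring, &cqe) == 0)
      {
          printf("read %d bytes\n", cqe->res);
          io_uring_cqe_seen(&ring, cqe);
      }
      io_uring_queue_exit(&ring);
      close(fd);
      return 0;
  }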

Reviewed-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Jakub Wartak <jakub.wartak@enterprisedb.com>
Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt
Discussion: https://postgr.es/m/20210223100344.llw5an2aklengrmn@alap3.anarazel.de
Discussion: https://postgr.es/m/stj36ea6yyhoxtqkhpieia2z4krnam7qyetc57rfezgk4zgapf@gcnactj4z56m
2025-03-26 19:49:13 -04:00
Andres Freund
8eadd5c73c aio: Add liburing dependency
Will be used in a subsequent commit, to implement io_method=io_uring. Kept
separate for easier review.

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt
2025-03-26 19:45:32 -04:00
Michael Paquier
f056f75daf doc: Mention possible ephemeral discrepancies in pg_stat_activity
Ephemeral inconsistencies across multiple attributes of pg_stat_activity
can exist as the system is designed to be efficient with a low overhead.
This question is raised by users from time to time based on the data
read in the view, so let's add a note in the docs about this
possibility.

Author: Alex Friedman <alexf01@gmail.com>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/8a275154-a654-44b0-ab37-197802f04c7b@gmail.com
2025-03-27 08:07:54 +09:00
Andres Freund
9469d7fdd2 aio: Rename pgaio_io_prep_* to pgaio_io_start_*
The old naming pattern (mirroring liburing's naming) was inconsistent with
the (not yet introduced) callers. It seems better to get rid of the
inconsistency now than to grow more users of the odd naming.

Reported-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/20250326001915.bc.nmisch@google.com
2025-03-26 16:10:29 -04:00
Andres Freund
f321ec237a aio: Pass result of local callbacks to ->report_return
Otherwise the results of e.g. temp table buffer verification errors will not
reach bufmgr.c. Obviously that's not right. Found while expanding the tests
for invalid buffer contents.

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/20250326001915.bc.nmisch@google.com
2025-03-26 16:06:54 -04:00
Andres Freund
96da9050a5 aio: Be more paranoid about interrupts
As reported by Noah, it's possible, although practically very unlikely, that
interrupts could be processed in between pgaio_io_reopen() and
pgaio_io_perform_synchronously(). Prevent that by explicitly holding
interrupts.

It also seems good to add an assertion to pgaio_io_before_prep() to ensure
that interrupts are held, as otherwise FDs referenced by the IO could be
closed during interrupt processing.  All code in the AIO series currently
runs with interrupts held, but it seems better to be paranoid.

Reviewed-by: Noah Misch <noah@leadboat.com>
Reported-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/20250324002939.5c.nmisch@google.com
2025-03-26 16:06:54 -04:00
Robert Haas
47a1f076a7 pg_overexplain: SET jit=off when running tests.
Per buildfarm.
2025-03-26 15:43:25 -04:00
Robert Haas
de65c4dade Fix oversights in commit 8d5ceb113e
It added bogus whitespace at the end of a line in the documentation.
It should not have done that.

The pg_overexplain tests must SET debug_parallel_query = false,
not just RESET debug_parallel_query, or we get failures on test
machines that make debug_parallel_query = true the default.
2025-03-26 14:22:45 -04:00
Robert Haas
8d5ceb113e pg_overexplain: Additional EXPLAIN options for debugging.
There's a fair amount of information in the Plan and PlanState trees
that isn't printed by any existing EXPLAIN option. This means that,
when working on the planner, it's often necessary to rely on facilities
such as debug_print_plan, which produce excessively voluminous
output. Hence, use the new EXPLAIN extension facilities to implement
EXPLAIN (DEBUG) and EXPLAIN (RANGE_TABLE) as extensions to the core
EXPLAIN facility.
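
Usage is a sketch along these lines (the module must be loaded first,
and the exact output is plan-dependent):

LOAD 'pg_overexplain';
EXPLAIN (DEBUG) SELECT 1;
EXPLAIN (RANGE_TABLE) SELECT * FROM pg_class;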

A great deal more could be done here, and the specific choices about
what to print and how are definitely arguable, but this is at least
a starting point for discussion and a jumping-off point for possible
future improvements.

Reviewed-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andrei Lepikhov <lepihov@gmail.com> (who didn't like it)
Discussion: http://postgr.es/m/CA+TgmoZfvQUBWQ2P8iO30jywhfEAKyNzMZSR+uc2xr9PZBw6eQ@mail.gmail.com
2025-03-26 13:52:21 -04:00
Tomas Vondra
818245506c Keep the decompressed filter in brin_bloom_union
The brin_bloom_union() function combines two BRIN summaries by merging
one filter into the other. With bloom, we have to decompress the filters
first, but the function failed to update the summary to store the merged
filter. As a consequence, the index may be missing some of the data, and
return false negatives.

This issue exists since BRIN bloom indexes were introduced in Postgres
14, but at that point the union function was called only when two
sessions happened to summarize a range concurrently, which is rare. It
got much easier to hit in 17, as parallel builds use the union function
to merge summaries built by workers.

Fixed by storing a pointer to the decompressed filter, and freeing the
original one. Free the second filter too, if it was decompressed. The
freeing is not strictly necessary, because the union is called in
short-lived contexts, but it's tidy.

Backpatch to 14, where BRIN bloom indexes were introduced.

Reported by Arseniy Mukhin, investigation and fix by me.

Reported-by: Arseniy Mukhin
Discussion: https://postgr.es/m/18855-1cf1c8bcc22150e6%40postgresql.org
Backpatch-through: 14
2025-03-26 17:01:41 +01:00
Tom Lane
55527368bd Use PG_MODULE_MAGIC_EXT in our installable shared libraries.
It seems potentially useful to label our shared libraries with version
information, now that a facility exists for retrieving that.  This
patch labels them with the PG_VERSION string.  There was some
discussion about using semantic versioning conventions, but that
doesn't seem terribly helpful for modules with no SQL-level presence;
and for those that do have SQL objects, we typically expect them
to support multiple revisions of the SQL definitions, so it'd still
not be very helpful.

I did not label any of src/test/modules/.  It seems unnecessary since
we don't install those, and besides there ought to be someplace that
still provides test coverage for the original PG_MODULE_MAGIC macro.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/dd4d1b59-d0fe-49d5-b28f-1e463b68fa32@gmail.com
2025-03-26 11:11:02 -04:00
Tom Lane
9324c8c580 Introduce PG_MODULE_MAGIC_EXT macro.
This macro allows dynamically loaded shared libraries (modules) to
provide a wired-in module name and version, and possibly other
compile-time-constant fields in future.  This information can be
retrieved with the new pg_get_loaded_modules() function.

This feature is expected to be particularly useful for modules
that do not have any exposed SQL functionality and thus are
not associated with a SQL-level extension object.  But even for
modules that do belong to extensions, being able to verify the
actual code version can be useful.
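
For example (a sketch; the exact output columns are not shown here):

LOAD 'auto_explain';
SELECT * FROM pg_get_loaded_modules();
-- lists each loaded module with whatever name and version its
-- author compiled in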

Author: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Yurii Rashkovskii <yrashk@omnigres.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/dd4d1b59-d0fe-49d5-b28f-1e463b68fa32@gmail.com
2025-03-26 11:06:12 -04:00
Daniel Gustafsson
e92c0632c1 Move GSSAPI includes into its own header
Due to a conflict in macro names on Windows between <wincrypt.h>
and <openssl/ssl.h>, these headers need to be included using a
predictable pattern with an undef to handle that.  The GSSAPI
header <gssapi.h> does include <wincrypt.h>, which causes problems
with compiling PostgreSQL using MSVC when OpenSSL and GSSAPI are
both enabled in the tree.  Rather than fixing this piecemeal for each
file including GSSAPI headers, move the includes and undef
to a new file which should be used to centralize the logic.

This patch is a reworked version of a patch by Imran Zaheer
proposed earlier in the thread. Once this has proven effective
in master we should look at backporting this, as the problem has
existed since at least v16.

Author: Daniel Gustafsson <daniel@yesql.se>
Co-authored-by: Imran Zaheer <imran.zhir@gmail.com>
Reported-by: Dave Page <dpage@pgadmin.org>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Discussion: https://postgr.es/m/20240708173204.3f3xjilglx5wuzx6@awork3.anarazel.de
2025-03-26 15:31:46 +01:00
Daniel Gustafsson
1eb399366e psql: Make test robust against locale variations
The test committed in 1a759c8327 was prone to failing when using
locales with a different decimal separator.  Since the test value
isn't the important part, change to using an integer instead.

Author: Daniel Gustafsson <daniel@yesql.se>
Reported-by: Pavel Stehule <pavel.stehule@gmail.com>
Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
Discussion: https://postgr.es/m/CAFj8pRDE=7uW7QP4rg-OQLE2i-puYsUUt+eHE-L6_b_J9w=eWg@mail.gmail.com
2025-03-26 13:20:56 +01:00
Peter Eisentraut
3642df265d dblink: SCRAM authentication pass-through
This enables SCRAM authentication for dblink (using dblink_fdw) when
connecting to a foreign server without having to store a plain-text
password in user mapping options.
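
A minimal configuration sketch, assuming dblink accepts the same
use_scram_passthrough option spelling as postgres_fdw:

CREATE SERVER remote_srv FOREIGN DATA WRAPPER dblink_fdw
    OPTIONS (host 'remote.example.com', dbname 'postgres',
             use_scram_passthrough 'true');
CREATE USER MAPPING FOR CURRENT_USER SERVER remote_srv;
-- note: no password option in the user mapping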

This uses the same approach as was implemented for postgres_fdw in
commit 761c79508e.  (It also contains the equivalent of the
subsequent fixes 76563f88cf and d2028e9bbc1.)

Author: Matheus Alcantara <mths.dev@pm.me>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/CAFY6G8ercA1KES%3DE_0__R9QCTR805TTyYr1No8qF8ZxmMg8z2Q%40mail.gmail.com
2025-03-26 10:49:23 +01:00
Dean Rasheed
a3b6dfd410 Add support for gamma() and lgamma() functions.
These are useful general-purpose math functions which are included in
POSIX and C99, and are commonly included in other math libraries, so
expose them as SQL-callable functions.
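
For example (illustrative; for a positive integer n, gamma(n) is
(n - 1) factorial):

SELECT gamma(6);      -- 120, i.e. 5!
SELECT lgamma(1000);  -- log of gamma(1000), which would itself
                      -- overflow double precision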

Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Stepan Neretin <sncfmgg@gmail.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Dmitry Koval <d.koval@postgrespro.ru>
Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com>
Discussion: https://postgr.es/m/CAEZATCXpGyfjXCirFk9au+FvM0y2Ah+2-0WSJx7MO368ysNUPA@mail.gmail.com
2025-03-26 09:35:53 +00:00
Richard Guo
7c82b4f711 Fix integer-overflow problem in scram_SaltedPassword()
Setting the iteration count for SCRAM secret generation to INT_MAX
will cause an infinite loop in scram_SaltedPassword() due to integer
overflow, as the loop uses the "i <= iterations" comparison.  To fix,
use "i < iterations" instead.

Back-patch to v16 where the user-settable GUC scram_iterations has
been added.

Author: Kevin K Biju <kevinkbiju@gmail.com>
Reviewed-by: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CAM45KeEMm8hnxdTOxA98qhfZ9CzGDdgy3mxgJmy0c+2WwjA6Zg@mail.gmail.com
2025-03-26 17:46:51 +09:00
Michael Paquier
787514b30b Use relation name instead of OID in query jumbling for RangeTblEntry
custom_query_jumble (introduced in 5ac462e2b7 as a node field
attribute) is now assigned to the expanded reference name "eref" of
RangeTblEntry, adding the non-qualified aliased relation name, without
the list of column names, to the query jumble computation.  The relation
OID is removed from the query jumbling.

The effects of this change can be seen in the tests added by
3430215fe3, where pg_stat_statements (PGSS) entries are now grouped
using the relation name, ignoring the relation that search_path may point at.
For example, these two relations are different, but are now grouped in a
single PGSS entry as they are assigned the same query ID:
CREATE TABLE foo1.tab (a int);
CREATE TABLE foo2.tab (b int);
SET search_path = 'foo1';
SELECT count(*) FROM tab;
SET search_path = 'foo2';
SELECT count(*) FROM tab;
SELECT count(*) FROM foo1.tab;
SELECT count(*) FROM foo2.tab;
SELECT query, calls FROM pg_stat_statements WHERE query ~ 'FROM tab';
          query           | calls
--------------------------+-------
 SELECT count(*) FROM tab |     4
(1 row)

It is still possible to use an alias in the FROM clause to split these,
as sketched below.
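
(A hypothetical continuation of the example above:)

SELECT count(*) FROM foo1.tab AS t1;
SELECT count(*) FROM foo2.tab AS t2;
-- distinct query IDs, since the alias is what gets jumbled
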
The grouping behavior is useful for relations re-created with the same name,
where queries based on such relations would be grouped in the same
PGSS entry.  For permanent schemas, it should not really matter in
practice.  The main benefit is for workloads that use a lot of temporary
relations, which are usually re-created with the same name continuously.
These can be a heavy source of bloat in PGSS depending on the workload.
Such entries can now be grouped together, improving the user experience.

The original idea from Christoph Berg used catalog lookups to find
temporary relations, something that the query jumble has never done, and
it could cause some performance regressions.  The idea to use
RangeTblEntry.eref and the relation name, applying the same rules for
all relations, temporary and not temporary, has been proposed by Tom
Lane.  The documentation additions have been suggested by Sami Imseih.

Author: Michael Paquier <michael@paquier.xyz>
Co-authored-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Christoph Berg <myon@debian.org>
Reviewed-by: Lukas Fittl <lukas@fittl.com>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/Z9iWXKGwkm8RAC93@msg.df7cb.de
2025-03-26 15:21:05 +09:00
Peter Eisentraut
d2028e9bbc postgres_fdw: Fix tests on some Windows variants
The tests introduced by commit 76563f88cf only work when Unix-domain
sockets are available.  This is optional on Windows, and buildfarm
member drongo runs without them.  To fix, skip the test if Unix-domain
sockets are not enabled.
2025-03-26 07:00:00 +01:00
Jeff Davis
bde2fb797a Add pg_dump --with-{schema|data|statistics} options.
By adding the positive variants of options, in addition to the
negative variants that already exist, users can be explicit about what
pg_dump should produce.

Discussion: https://postgr.es/m/bd0513e4b1ea2b2f2d06f02720c6579711cb62a6.camel@j-davis.com
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
2025-03-25 17:36:38 -07:00
Michael Paquier
27ee6ede6b Fix two issues with custom_query_jumble in gen_node_support.pl
A node field marked with both custom_query_jumble and query_jumble_ignore
would generate some code for a custom routine.  The script is changed so
that custom_query_jumble behaves like the other options in this case:
query_jumble_ignore takes priority, and no code is generated.

A comment related to the code generated for node types was misplaced.

Thinkos introduced in 5ac462e2b7.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/1324036.1742945060@sss.pgh.pa.us
2025-03-26 09:06:36 +09:00
Tom Lane
cb36f8ec21 Fix order of -I switches for building pg_regress.o.
We need the -I switch for libpq_srcdir to come before any -I switches
injected by configure.  Otherwise there is a risk of pulling in a
mismatched version of libpq_fe.h from someplace like
/usr/local/include, if the platform has another Postgres version
installed there.  This evidently accounts for today's buildfarm
failures on "anaconda".

In principle the -I switch for src/port/ is at similar hazard, and has
been for a very long time.  But the only .h files we keep there are
pg_config_paths.h and pthread-win32.h, neither of which get installed
on Unix-ish systems, so the odds of picking up a conflicting header
seem pretty small.  That doubtless accounts for the lack of prior
reports.

Back-patch to v17 where pg_regress acquired a build dependency on
libpq_fe.h.  We could go back further to fix the hazard for src/port/
in older branches, but it seems unlikely to be worth troubling over.

Reported-by: Nathan Bossart <nathandbossart@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/Z-MhRzoc7t-nPUQG@nathan
Backpatch-through: 17
2025-03-25 20:03:56 -04:00
Michael Paquier
3430215fe3 pg_stat_statements: Add more tests with temp tables and namespaces
These tests provide coverage for RangeTblEntry and how query jumbling
works with search_path, as well as the case where relations are
re-created, generating a different query ID as the relation OID is used
in the computation.

A patch is under discussion to switch to a different approach based on
the relation name, and there was no test coverage for this area,
including how queries are currently grouped with search_path.  This is
useful to track how the situation changes between HEAD and any patches
proposed.

Christoph has proposed the test with ON COMMIT DROP temporary tables,
and I have written the second part.

Author: Christoph Berg <myon@debian.org>
Author: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/Z9iWXKGwkm8RAC93@msg.df7cb.de
2025-03-26 07:25:23 +09:00
Nathan Bossart
626d7236b6 pg_upgrade: Add --swap for faster file transfer.
This new option instructs pg_upgrade to move the data directories
from the old cluster to the new cluster and then to replace the
catalog files with those generated for the new cluster.  This mode
can outperform --link, --clone, --copy, and --copy-file-range,
especially on clusters with many relations.

However, this mode creates many garbage files in the old cluster,
which can prolong the file synchronization step if
--sync-method=syncfs is used.  To handle that, we recommend using
--sync-method=fsync with this mode, and pg_upgrade internally uses
"initdb --sync-only --no-sync-data-files" for file synchronization.
pg_upgrade will synchronize the catalog files as they are
transferred.  We assume that the database files transferred from
the old cluster were synchronized prior to upgrade.

This mode also complicates reverting to the old cluster, so we
recommend restoring from backup upon failure during or after file
transfer.  We did consider teaching pg_upgrade how to generate a
revert script for such failures, but we decided against it due to
the rarity of failing during file transfer, the complexity of
generating the script, and the potential for misusing the script.

The new mode is limited to clusters located in the same file
system.  With some effort, we could probably support upgrades
between different file systems, but this mode is unlikely to offer
much benefit if we have to copy the files across file system
boundaries.

It is also limited to upgrades from version 10 or newer.  There are
a few known obstacles for using swap mode to upgrade from older
versions.  For example, the visibility map format changed in v9.6,
and the sequence tuple format changed in v10.  In fact, swap mode
omits the --sequence-data option in its uses of pg_dump and instead
reuses the old cluster's sequence data files.  While teaching swap
mode to deal with these kinds of changes is surely possible (and we
may have to deal with similar problems in the future, anyway), it
doesn't seem worth the effort to support upgrades from
long-unsupported versions.

Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/Zyvop-LxLXBLrZil%40nathan
2025-03-25 16:02:35 -05:00
Nathan Bossart
9c49f0e8cd pg_dump: Add --sequence-data.
This new option instructs pg_dump to dump sequence data when the
--no-data, --schema-only, or --statistics-only option is specified.
This was originally considered for commit a7e5457db8, but it was
left out at that time because there was no known use-case.  A
follow-up commit will use this to optimize pg_upgrade's file
transfer step.

Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/Zyvop-LxLXBLrZil%40nathan
2025-03-25 16:02:35 -05:00
Nathan Bossart
cf131fa942 initdb: Add --no-sync-data-files.
This new option instructs initdb to skip synchronizing any files
in database directories, the database directories themselves, and
the tablespace directories, i.e., everything in the base/
subdirectory and any other tablespace directories.  Other files,
such as those in pg_wal/ and pg_xact/, will still be synchronized
unless --no-sync is also specified.  --no-sync-data-files is
primarily intended for internal use by tools that separately ensure
the skipped files are synchronized to disk.  A follow-up commit
will use this to help optimize pg_upgrade's file transfer step.

The --sync-method=fsync implementation of this option makes use of
a new exclude_dir parameter for walkdir().  When not NULL,
exclude_dir specifies a directory to skip processing.  The
--sync-method=syncfs implementation of this option just skips
synchronizing the non-default tablespace directories.  This means
that initdb will still synchronize some or all of the database
files, but there's not much we can do about that.

Discussion: https://postgr.es/m/Zyvop-LxLXBLrZil%40nathan
2025-03-25 16:02:35 -05:00
Jeff Davis
650ab8aaf1 Stats: use schemaname/relname instead of regclass.
For import and export, use schemaname/relname rather than
regclass.

This is more natural during export, fits with the other arguments
better, and it gives better control over error handling in case we
need to downgrade more errors to warnings.

Also, use text for the argument types for schemaname, relname, and
attname so that casts to "name" are not required.

Author: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://postgr.es/m/CADkLM=ceOSsx_=oe73QQ-BxUFR2Cwqum7-UP_fPe22DBY0NerA@mail.gmail.com
2025-03-25 11:16:06 -07:00
Jeff Davis
2a420f7995 Minor doc update for commit 99f8f3fbbc.
Author: Corey Huinker <corey.huinker@gmail.com>
2025-03-25 11:15:52 -07:00
Daniel Gustafsson
1a759c8327 psql: Make default \watch interval configurable
The default interval for \watch to wait between executing queries,
when executed without a specified interval, was hardcoded to two
seconds.  This adds the new variable WATCH_INTERVAL which is used
to set the default interval, making it configurable for the user.
This makes \watch the first command which has a user configurable
default setting.

Author: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: Masahiro Ikeda <ikedamsh@oss.nttdata.com>
Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://postgr.es/m/B2FD26B4-8F64-4552-A603-5CC3DF1C7103@yesql.se
2025-03-25 17:53:33 +01:00
Daniel Gustafsson
a19db08274 pg_basebackup: Add missing PQclear in error path
This adds a missing PQclear in the error path of StreamLogicalLog, a
fix in the same vein as e889422d98 with an equivalent low impact.

Author: Steven Niu <niushiji@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/c4b1c627-a3e4-4347-a670-1e28a43ce0eb@gmail.com
2025-03-25 17:24:23 +01:00
Peter Eisentraut
ef7a5af77d refactor: Pass relation OID instead of Relation to createForeignKeyCheckTriggers()
Currently, createForeignKeyCheckTriggers() takes a Relation type as
its first argument, but it doesn't use that argument directly.
Instead, it fetches the relation OID by calling RelationGetRelid().
Therefore, it would be more consistent with other functions (e.g.,
createForeignKeyActionTriggers()) to pass the relation OID directly
instead of the whole Relation.

Author: Amul Sul <amul.sul@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAJ_b962c5AcYW9KUt_R_ER5qs3fUGbe4az-SP-vuwPS-w-AGA@mail.gmail.com
2025-03-25 17:04:12 +01:00
Peter Eisentraut
639238b978 refactor: Split ATExecAlterConstraintInternal()
Split ATExecAlterConstraintInternal() into two functions:
ATExecAlterConstrDeferrability() and
ATExecAlterConstrInheritability().  This simplifies the code and
avoids unnecessary confusion caused by recursive code, which isn't
needed for ATExecAlterConstrInheritability().

(This also takes over the changes in commit 64224a834c, as the new
AlterConstrDeferrabilityRecurse() is essentially the old
ATExecAlterChildConstr().)

Author: Amul Sul <amul.sul@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAJ_b962c5AcYW9KUt_R_ER5qs3fUGbe4az-SP-vuwPS-w-AGA@mail.gmail.com
2025-03-25 16:18:00 +01:00
Peter Eisentraut
a3280e2a49 refactor: Move some code that updates pg_constraint to a separate function
This extracts common/duplicate code for different ALTER CONSTRAINT
variants into a common function.  We plan to add more variants that
would use the same code.

Author: Amul Sul <amul.sul@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAJ_b962c5AcYW9KUt_R_ER5qs3fUGbe4az-SP-vuwPS-w-AGA@mail.gmail.com
2025-03-25 14:37:22 +01:00
Peter Eisentraut
f4b2a62ae3 Small fixes for Add ALTER TABLE ... ALTER CONSTRAINT ... SET [NO] INHERIT
Small fixes for commit f4e53e10b6c: Add missing calls to
InvokeObjectPostAlterHook() and also CacheInvalidateRelcache().  The
former change could have a user-visible effect.  The latter omission
might have caused other bugs, but it is not clear whether one actually
existed.  With these changes, the code is now more consistent with
similar ALTER CONSTRAINT variants, especially the ones that set the
deferrability.

Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/CAF1DzPVfOW6Kk=7SSh7LbneQDJWh=PbJrEC_Wkzc24tHOyQWGg@mail.gmail.com
2025-03-25 13:40:24 +01:00
Alexander Korotkov
62f36d6924 postgres_fdw: Remove redundant check in semijoin_target_ok()
If a var belongs to the innerrel of the joinrel, it's not possible that
it belongs to the outerrel.  This commit removes the redundant check from
the if-clause but keeps it as an assertion.

Discussion: https://postgr.es/m/flat/CAHewXN=8aW4hd_W71F7Ua4+_w0=bppuvvTEBFBF6G0NuSXLwUw@mail.gmail.com
Author: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Alexander Pyhalov <a.yhalov@postgrespro.ru>
Backpatch-through: 17
2025-03-25 12:49:01 +02:00
Thomas Munro
3c86223c99 libpq: Deprecate pg_int64.
Previously we used pg_int64 in three function prototypes in libpq.  It
was added by commit 461ef73f to expose the platform-dependent type used
for int64 in the C89 era.  As of commit 962da900 it is defined as
standard int64_t, and the dust seems to have settled.

Let's just use int64_t directly in these three client-facing functions
instead of (yet) another name.  We've required C99 and thus <stdint.h>
since PostgreSQL 12, C89 and C++98 compilers are long gone, and client
applications very likely use standard types for their own 64-bit needs.
This also cleans up the obscure placement of a new #include <stdint.h>
directive in postgres_ext.h, required for the new definition.  The
typedef was hiding in there for historical reasons, but it doesn't fit
postgres_ext.h's own description of its purpose and there is no evidence
of client applications including postgres_ext.h directly to see it.

Keep a typedef marked deprecated for backward compatibility, but move it
into libpq-fe.h where it was used.

Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/CA%2BhUKGKn_EkNNGMY5RzMcKP%2Ba6urT4JF%3DCPhw_zHtQwjvX6P2g%40mail.gmail.com
2025-03-25 21:40:00 +13:00
Peter Eisentraut
be1cc9aaf5 Generalize index support in network support function
The network (inet) support functions previously supported only a
hardcoded btree operator family.  With the generalized compare type
facility, we can generalize this to support any operator family from
any index type that supports the required operators.

Author: Mark Dilger <mark.dilger@enterprisedb.com>
Co-authored-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2025-03-25 07:11:56 +01:00
Michael Paquier
5ac462e2b7 Add support for custom_query_jumble as a node field attribute
This option gives the possibility for query jumble to define a custom
routine for the field of a Node, extending support for
custom_query_jumble as a node field attribute.  When dealing with
complex node structures, this can be simpler than having to enforce a
custom function across a full node.

Custom functions need to be defined in queryjumblefuncs.c, named
_jumble${node}_${field}(), and take as input the JumbleState, the node
and its field.  The field is not really required if we have the Node,
but it makes custom implementations somewhat easier to think about.  The
code generated by gen_node_support.pl uses a macro called
JUMBLE_CUSTOM(), hiding the internals of the logic inside
queryjumblefuncs.c.

This will be used by an upcoming patch adding a custom routine to a
field of RangeTblEntry, but this facility can become
useful in more cases.

Reviewed-by: Christoph Berg <myon@debian.org>
Discussion: https://postgr.es/m/Z9y43-dRvb4EtxQ0@paquier.xyz
2025-03-25 14:18:00 +09:00
Jeff Davis
626df47ad9 Remove 'additional' pointer from TupleHashEntryData.
Reduces memory required for hash aggregation by avoiding an allocation
and a pointer in the TupleHashEntryData structure. That structure is
used for all buckets, whether occupied or not, so the savings is
substantial.

Discussion: https://postgr.es/m/AApHDvpN4v3t_sdz4dvrv1Fx_ZPw=twSnxuTEytRYP7LFz5K9A@mail.gmail.com
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
2025-03-24 22:06:02 -07:00
Jeff Davis
a0942f441e Add ExecCopySlotMinimalTupleExtra().
Allows an "extra" argument that allocates extra memory at the end of
the MinimalTuple. This is important for callers that need to store
additional data, but do not want to perform an additional allocation.

Suggested-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CAApHDvppeqw2pNM-+ahBOJwq2QmC0hOAGsmCpC89QVmEoOvsdg@mail.gmail.com
2025-03-24 22:05:53 -07:00
Jeff Davis
4d143509cb Create accessor functions for TupleHashEntry.
Refactor for upcoming optimizations.

Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/1cc3b400a0e8eead18ff967436fa9e42c0c14cfb.camel@j-davis.com
2025-03-24 22:05:41 -07:00
Jeff Davis
cc721c459d HashAgg: use Bump allocator for hash TupleHashTable entries.
The entries aren't freed until the entire hash table is destroyed, so
use the Bump allocator to improve allocation speed, avoid wasting
space on the chunk header, and avoid wasting space due to the
power-of-two allocations.

Discussion: https://postgr.es/m/CAApHDvqv1aNB4cM36FzRwivXrEvBO_LsG_eQ3nqDXTjECaatOQ@mail.gmail.com
Reviewed-by: David Rowley
2025-03-24 22:05:33 -07:00
Amit Kapila
cc4331605a Fix the typo in the test case added in 73eba5004a.
Author: vignesh C <vignesh21@gmail.com>
Discussion: https://postgr.es/m/CALDaNm2ms1deM5EYNLFEfESv_Kw=Y4AiTB0LP=qGS-UpFwGbPg@mail.gmail.com
Discussion: https://postgr.es/m/CABdArM7FW-_dnthGkg2s0fy1HhUB8C3ELA0gZX1kkbs1ZZoV3Q@mail.gmail.com
2025-03-25 09:39:53 +05:30
Amit Kapila
b87ced747d Fix an oversight in 3abe9dc188.
Forgot to update the comment atop one of the functions.

Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Discussion: https://postgr.es/m/OSCPR01MB1496623BE1125B44614494E7AF5A72@OSCPR01MB14966.jpnprd01.prod.outlook.com
2025-03-25 09:26:23 +05:30
Alexander Korotkov
023fb51275 postgres_fdw: Avoid pulling up restrict infos from subqueries
Semi-joins below a left/right join are deparsed as
subqueries.  Thus, we can't refer to subquery vars from upper relations.
This commit avoids pulling conditions out of them.

Reported-by: Robins Tharakan <tharakan@gmail.com>
Bug: #18852
Discussion: https://postgr.es/m/CAEP4nAzryLd3gwcUpFBAG9MWyDfMRX8ZjuyY2XXjyC_C6k%2B_Zw%40mail.gmail.com
Author: Alexander Pyhalov <a.pyhalov@postgrespro.ru>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
Backpatch-through: 17
2025-03-25 05:49:47 +02:00
Andres Freund
adb5f85fa5 Redefine max_files_per_process to control additionally opened files
Until now max_files_per_process=N limited each backend to opening N files in
total (minus a safety factor), even if there were already more files opened in
postmaster and inherited by backends.  Change max_files_per_process to control
how many additional files each process is allowed to open.

The main motivation for this is the patch to add io_method=io_uring, which
needs to open one file for each backend.  Without this patch, even if
RLIMIT_NOFILE is high enough, postmaster will fail in set_max_safe_fds() if
started with a high max_connections.  The cause of the failure is that, until
now, set_max_safe_fds() subtracted the already open files from
max_files_per_process.

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/w6uiicyou7hzq47mbyejubtcyb2rngkkf45fk4q7inue5kfbeo@bbfad3qyubvs
Discussion: https://postgr.es/m/CAGECzQQh6VSy3KG4pN1d=h9J=D1rStFCMR+t7yh_Kwj-g87aLQ@mail.gmail.com
2025-03-24 18:20:18 -04:00
Nathan Bossart
7d559c8580 Expand comment for isset_offset.
This field was added in commit 0164a0f9ee to provide a way to
determine whether a storage parameter was explicitly set for the
relation or if it just picked up the default value.  In most cases,
this can be accomplished by giving the storage parameter a special
out-of-range default value (e.g., the
autovacuum_vacuum_insert_threshold storage parameter defaults to
-2), but this approach doesn't work in all cases.  For example, a
Boolean storage parameter cannot be given an out-of-range default,
so we need another way to discover the source of its value.

Reported-by: "David G. Johnston" <david.g.johnston@gmail.com>
Reviewed-by: "David G. Johnston" <david.g.johnston@gmail.com>
Discussion: https://postgr.es/m/CAKFQuwYKtEUYKS%2B18gRs-xPhn0qOJgM2KGyyWVCODHuVn9F-XQ%40mail.gmail.com
2025-03-24 15:47:02 -05:00
Melanie Plageman
aea916fe55 Fix bitmapheapscan incorrect recheck of NULL tuples
The bitmap heap scan skip fetch optimization skips fetching the heap
block when a page is set all-visible in the visibility map and no
columns from the table are needed to satisfy the query.

2b73a8cd33 and c3953226a0 changed the control flow of bitmap heap scan
to use the read stream API. The read stream API returns buffers
containing blocks to the user. To make this work with the skip fetch
optimization, we keep a count of the empty tuples we need to emit for
all the blocks skipped and only emit the empty tuples after processing
the next block fetched from the heap or at the end of the scan.

It's incorrect to recheck NULL tuples, so we must set `recheck` to false
before yielding control back to BitmapHeapNext(). This was done before
emitting any remaining empty tuples at the end of the scan but not for
empty tuples emitted during the scan. This meant that if a page fetched
from the heap did require recheck and set `recheck` to true and then we
emitted empty tuples for subsequent blocks, we would get wrong results.

Fix this by always setting `recheck` to false before emitting empty
tuples.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Tested-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/496f7acd-881c-4df3-9bd3-8f8534dfec26%40gmail.com
2025-03-24 16:40:59 -04:00
Álvaro Herrera
0e3e0ec06b Fix typo
2025-03-24 17:36:44 +01:00
Fujii Masao
c68100aa43 Allow pg_recvlogical --drop-slot to work without --dbname.
When pg_recvlogical was introduced in 9.4, the --dbname option was not
required for --drop-slot. Without it, pg_recvlogical --drop-slot connected
using a replication connection (not tied to a specific database) and
was able to drop both physical and logical replication slots, similar to
pg_receivewal --drop-slot.

However, commit 0c013e08cf unintentionally changed this behavior in 9.5,
making pg_recvlogical always check whether it's connected to a specific
database and fail if it's not. This change was expected for --create-slot
and --start, which handle logical replication slots and require a database
connection, but it was unnecessary for --drop-slot, which should work with
any replication connection. As a result, --dbname became a required option
for --drop-slot.

This commit removes that restriction, restoring the original behavior and
allowing pg_recvlogical --drop-slot to work without specifying --dbname.

Although this issue originated from an unintended change, it has existed
for a long time without complaints or bug reports, and the documentation
never explicitly stated that --drop-slot should work without --dbname.
Therefore, the change is not treated as a bug fix and is applied only to
master.

Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/b15ecf4f-e5af-4fbb-82c2-a425f453e0b2@oss.nttdata.com
2025-03-25 00:18:27 +09:00
Fujii Masao
dfc13428a9 doc: Clarify required options for each action in pg_recvlogical.
Each pg_recvlogical action requires specific options. For example,
--slot, --dbname, and --file must be specified with the --start action.
Previously, the documentation did not clearly outline these requirements.

This commit updates the documentation to explicitly state
the necessary options for each action.

Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Co-authored-by: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Vignesh C <vignesh21@gmail.com>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Discussion: https://postgr.es/m/OSCPR01MB14966930B4357BAE8C9D68A8AF5C72@OSCPR01MB14966.jpnprd01.prod.outlook.com
2025-03-25 00:14:38 +09:00
Peter Eisentraut
76563f88cf postgres_fdw: improve security checks
SCRAM pass-through should not bypass the FDW security check as it was
implemented for postgres_fdw in commit 761c79508e.

This commit improves the security check by adding new SCRAM
pass-through checks to ensure that the required SCRAM connection
options are not overwritten by the user mapping or foreign server
options.  This is meant to match the security requirements for a
password-using connection.

Since libpq has no SCRAM-specific equivalent of
PQconnectionUsedPassword(), we enforce this instead by making the
use_scram_passthrough option of postgres_fdw imply
require_auth=scram-sha-256.  This means that if use_scram_passthrough
is set, some situations that might otherwise have worked are
preempted, for example GSSAPI with delegated credentials.  This could
be enhanced in the future if there is desire for more flexibility.

Reported-by: Jacob Champion <jacob.champion@enterprisedb.com>
Author: Matheus Alcantara <mths.dev@pm.me>
Co-authored-by: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/CAFY6G8ercA1KES%3DE_0__R9QCTR805TTyYr1No8qF8ZxmMg8z2Q%40mail.gmail.com
2025-03-24 15:56:53 +01:00
Magnus Hagander
a8eeb22f17 psql: use consistent alias for pg_description
Author: Jelte Fennema-Nio <github-tech@jeltef.nl>
Suggested-By: Michael Banck <mbanck@gmx.net>
Discussion: https://www.postgresql.org/message-id/67813520.170a0220.183245.7bf0%40mx.google.com
2025-03-24 14:31:28 +01:00
Magnus Hagander
d696406a9b psql: show default extension version in \dx output
Reviewed-By: Julien Rouhaud <rjuju123@gmail.com>
Reviewed-By: Michael Banck <mbanck@gmx.net>
Reviewed-By: Yugo Nagata <nagata@sraoss.co.jp>
Reviewed-By: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-By: Jelte Fennema-Nio <postgres@jeltef.nl>
Discussion: https://postgr.es/m/CABUevEyTMyXC6OvCWkj+rPnHrfi8_Rw_+DD_jzgFFNPqgf+Oig@mail.gmail.com
2025-03-24 14:25:05 +01:00
Heikki Linnakangas
19c6eb06c5 Add test case for when subscriber table is missing a column
We haven't had bugs in this area, but there's some not-entirely
trivial code to detect that case, so it seems good to have test
coverage for it.

Author: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://www.postgresql.org/message-id/CAHut%2BPtX8P0EGhsk9p%3DhQGUHrzxeCSzANXSMKOvYiLX-EjdyNw@mail.gmail.com
2025-03-24 12:13:32 +02:00
Amit Kapila
73eba5004a Detect and Log multiple_unique_conflicts type conflict.
Introduce a new conflict type, multiple_unique_conflicts, to handle cases
where an incoming row during logical replication violates multiple UNIQUE
constraints.

Previously, the apply worker detected and reported only the first
encountered key conflict (insert_exists/update_exists), causing repeated
failures as each constraint violation needs to be handled one by one
making the process slow and error-prone.

With this patch, the apply worker checks all unique constraints upfront
once the first key conflict is detected and reports
multiple_unique_conflicts if multiple violations exist. This allows users
to resolve all conflicts at once by deleting all conflicting tuples rather
than dealing with them individually or skipping the transaction.
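
As an illustration (hypothetical subscriber-side table):

CREATE TABLE tab (a int UNIQUE, b int UNIQUE);
INSERT INTO tab VALUES (1, 1);
-- an incoming replicated row (1, 1) violates both unique constraints;
-- the apply worker now reports one multiple_unique_conflicts conflict
-- instead of surfacing the violations one at a time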

In the future, this will also allow us to specify different resolution
handlers for such a conflict type.

Add the stats for this conflict type in pg_stat_subscription_stats.

Author: Nisha Moond <nisha.moond412@gmail.com>
Author: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Discussion: https://postgr.es/m/CABdArM7FW-_dnthGkg2s0fy1HhUB8C3ELA0gZX1kkbs1ZZoV3Q@mail.gmail.com
2025-03-24 12:30:44 +05:30
David Rowley
35a92b7c25 Add tests for POSITION(bytea, bytea)
Previously there was no coverage for this function.
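
For reference (illustrative values; byte positions are 1-based and 0
means not found):

SELECT position('\x6c6c'::bytea IN '\x68656c6c6f'::bytea);  -- 3, 'll' in 'hello'
SELECT position('\x7a'::bytea IN '\x68656c6c6f'::bytea);    -- 0, 'z' not found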

Author: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Rustam ALLAKOV <rustamallakov@gmail.com>
Discussion: https://postgr.es/m/CAJ7c6TMT6XCooMVKnCd_tR2oBdGcnjefSeCDCv8jzKy9VkWA5w@mail.gmail.com
2025-03-24 19:32:02 +13:00
Michael Paquier
2a0cd38da5 Allow plugins to set a 64-bit plan identifier in PlannedStmt
This field can be optionally set in a PlannedStmt through the planner
hook, giving extensions the possibility to assign an identifier related
to a computed plan.  The backend is changed to report it in the backend
entry of a process running (including the extended query protocol), with
semantics and APIs to set or get it similar to what is used for the
existing query ID (introduced in the backend via 4f0b0966c8).  The plan
ID is reset at the same timing as the query ID.  Currently, this
information is not added to the system view pg_stat_activity; extensions
can access it through PgBackendStatus.

Some patches have been proposed to provide some features in the planning
area, where a plan identifier is used as a key to know the plan involved
(for statistics, plan storage and manipulations, etc.), and the point of
this commit is to provide an anchor in the backend that extensions can
rely on for future work.   The reset of the plan identifier is
controlled by core and follows the same pattern as the query identifier
added in 4f0b0966c8.

The contents of this commit are extracted from a larger set proposed
originally by Lukas Fittl, that Sami Imseih has proposed as an
independent change, with a few tweaks sprinkled by me.

Author: Lukas Fittl <lukas@fittl.com>
Author: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CAP53Pkyow59ajFMHGpmb1BK9WHDypaWtUsS_5DoYUEfsa_Hktg@mail.gmail.com
Discussion: https://postgr.es/m/CAA5RZ0vyWd4r35uUBUmhngv8XqeiJUkJDDKkLf5LCoWxv-t_pw@mail.gmail.com
2025-03-24 13:23:42 +09:00
Tom Lane
8a3e4011f0 psql: Add tab completion for VACUUM and ANALYZE ... ONLY option.
Improve psql's tab completion for VACUUM and ANALYZE by supporting
the ONLY option introduced in 62ddf7ee9.
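
The completed syntax looks like this (a sketch; table names are
placeholders):

VACUUM ONLY inh_parent;
ANALYZE ONLY parted_table;
-- ONLY limits processing to the named table, skipping its
-- children or partitions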

In passing, simplify some of the VACUUM patterns by making use
of MatchAnyN.

Author: Umar Hayat <postgresql.wizard@gmail.com>
Reviewed-by: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Ilia Evdokimov <ilya.evdokimov@tantorlabs.com>
Discussion: https://postgr.es/m/CAD68Dp3L6yW_nWs+MWBs6s8tKLRzXaQdQgVRm4byZe0L-hRD8g@mail.gmail.com
2025-03-23 17:16:08 -04:00
Heikki Linnakangas
2817525f0d Fix rare assertion failure in standby, if primary is restarted
During hot standby, ExpireAllKnownAssignedTransactionIds() and
ExpireOldKnownAssignedTransactionIds() functions mark old transactions
as no-longer running, but they failed to update xactCompletionCount
and latestCompletedXid. AFAICS it would not lead to incorrect query
results, because those functions effectively turn in-progress
transactions into aborted transactions and an MVCC snapshot considers
both as "not visible". But it could surprise GetSnapshotDataReuse()
and trigger the "TransactionIdPrecedesOrEquals(TransactionXmin,
RecentXmin))" assertion in it, if the apparent xmin in a backend would
move backwards. We saw this happen when GetCatalogSnapshot() would
reuse an older catalog snapshot, when GetTransactionSnapshot() had
already advanced TransactionXmin.

The bug goes back all the way to commit 623a9ba79b in v14 that
introduced the snapshot reuse mechanism, but it started to happen more
frequently with commit 952365cded which removed a
GetTransactionSnapshot() call from backend startup. That made it more
likely for ExpireOldKnownAssignedTransactionIds() to be called between
GetCatalogSnapshot() and the first GetTransactionSnapshot() in a
backend.

Andres Freund first spotted this assertion failure on buildfarm member
'skink'. Reproduction and analysis by Tomas Vondra.

Backpatch-through: 14
Discussion: https://www.postgresql.org/message-id/oey246mcw43cy4qw2hqjmurbd62lfdpcuxyqiu7botx3typpax%40h7o7mfg5zmdj
2025-03-23 20:41:16 +02:00
Noah Misch
f0446384ea Fix "make clean" for new TAP suite.
Commit 28f04984f0 missed this.
2025-03-23 06:12:02 -07:00
Andres Freund
ca3067cc57 aio: Change prefix of PgAioResultStatus values to PGAIO_RS_
The previous prefix wasn't consistent with the naming of other AIO related
enum values. It seems best to rename it before the users are introduced.

Reported-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CAAKRu_Yb+JzQpNsgUxCB0gBi+sE-mi_HmcJF6ALnmO4W+UgwpA@mail.gmail.com
2025-03-22 17:30:44 -04:00
Tom Lane
58fdca2204 plpgsql: make WHEN OTHERS distinct from WHEN SQLSTATE '00000'.
The catchall exception condition OTHERS was represented as
sqlerrstate == 0, which was a poor choice because that comes
out the same as SQLSTATE '00000'.  While we don't issue that
as an error code ourselves, there isn't anything particularly
stopping users from doing so.  Use -1 instead, which can't
match any allowed SQLSTATE string.
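
A sketch of the kind of usage that made the old representation
ambiguous (raising ERRCODE '00000' is unusual but possible):

DO $$
BEGIN
  RAISE EXCEPTION 'not really an error' USING ERRCODE = '00000';
EXCEPTION
  WHEN OTHERS THEN RAISE NOTICE 'caught SQLSTATE %', SQLSTATE;
END $$;

Previously a handler condition compiled from SQLSTATE '00000' was
internally identical to OTHERS; with -1 for OTHERS they can no
longer collide.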

While at it, invent a macro PLPGSQL_OTHERS to use instead of
a hard-coded magic number.

While this seems like a bug fix, I'm inclined not to back-patch.
It seems barely possible that someone has written code like this
and would be annoyed by changing the behavior in a minor release.

Reported-by: David Fiedler <david.fido.fiedler@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAHjN70-=H5EpTOuZVbC8mPvRS5EfZ4MY2=OUdVDWoyGvKhb+Rw@mail.gmail.com
2025-03-22 14:17:00 -04:00
Peter Geoghegan
9a2e2a285a Improve nbtree array primitive scan scheduling.
Add a new scheduling heuristic: don't end the ongoing primitive index
scan immediately (at the point where _bt_advance_array_keys notices that
the next set of matching tuples must be on a later page) if the primscan
already managed to step right/left from its first leaf page.  Schedule a
recheck against the next sibling leaf page's finaltup instead.

The new heuristic tends to avoid scenarios where the top-level scan
repeatedly starts and ends primitive index scans that each read only one
leaf page from a group of neighboring leaf pages.  Affected top-level
scans will now tend to step forward (or backward) through the index
instead, without wasting cycles on descending the index anew.

The recheck mechanism isn't exactly new.  But up until now it has only
been used to deal with edge cases involving high key finaltups with one
or more truncated -inf attributes that _bt_advance_array_keys deemed
"provisionally satisfied" (satisfied for the purposes of allowing the
scan to step onto the next page, subject to recheck once on that page).
The mechanism was added by commit 5bf748b8, which invented the general
concept of primitive scan scheduling.  It was later enhanced by commit
79fa7b3b, which taught it about cases involving -inf attributes that
satisfy inequality scan keys required in the opposite-to-scan direction
only (arguably, they should have been covered by the earliest version).
Now the recheck mechanism can be applied based on scan-level heuristics,
which have nothing to do with truncated high keys.  Now rechecks might
be performed by _bt_readpage when scanning in _either_ scan direction.

The theory behind the new heuristic is that any primitive scan that
makes it past its first leaf page is one that is already likely to have
arrays whose key values match index tuples that are closely clustered
together in the index.  The rules that determine whether we ever get
past the first page are still conservative (that'll still only happen
when pstate.finaltup strongly suggests that it's the right thing to do).
Surviving past the first leaf page is a strong signal in itself.

Preparation for an upcoming patch that will add skip scan optimizations
to nbtree.  That'll work by adding skip arrays, which behave similarly
to SAOP arrays, but generate their elements procedurally and on-demand.

Note that this commit isn't specifically concerned with skip arrays; the
scheduling logic doesn't (and won't) condition anything on whether the
scan uses skip arrays, SAOP arrays, or some combination of the two
(which seems like a good general principle for _bt_advance_array_keys).
While the problems that this commit ameliorates are more likely with
skip arrays (at least in practice), SAOP arrays (or those with very
dense, contiguous array elements) are also affected.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAH2-Wzkz0wPe6+02kr+hC+JJNKfGtjGTzpG3CFVTQmKwWNrXNw@mail.gmail.com
2025-03-22 13:02:18 -04:00
Melanie Plageman
e215166c9c Use streaming read I/O in SP-GiST vacuuming
Like 69273b818b did for GiST vacuuming, make SP-GiST vacuum use the
read stream API for vacuuming physically contiguous index pages.

Concurrent insertions may cause SP-GiST index tuples to be redirected.
While vacuuming, these are added to a pending list which is later
processed to ensure no dead tuples are left behind. Pages containing
such tuples are still read by directly calling ReadBuffer() and do not
use the read stream API.

Author: Andrey M. Borodin <x4mmm@yandex-team.ru>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/37432403-8657-403B-9CDF-5A642BECDD81%40yandex-team.ru
2025-03-21 17:51:22 -04:00
Thomas Munro
e51ca405ed Fix ps display for IO workers.
This code must have missed a memo about the backend type description
being supplied automatically these days, and was duplicating that
information.

Before: "io worker io worker: N"
After:  "io worker N"
2025-03-22 10:13:23 +13:00
Tom Lane
16a3ae504e Revert inappropriate weakening of an Assert in plpgsql.
Commit 682ce911f modified exec_save_simple_expr to accept a Param
in the tlist of a Gather node, rather than the normal case of a Var
referencing the Gather's input.  It turns out that this was a kluge
to work around the bug later fixed in 0f7ec8d9c, namely that setrefs.c
was failing to replace Params in upper plan nodes with Var references
to the same Params appearing in the child tlists.  With that fixed,
there seems no reason to continue to allow a Param here.  (Moreover,
even if we did expect a Param here, the semantically correct thing
to do would be to take the Param as the expression being sought.
Whatever it may represent, it is *not* a reference to the child.)
Hence, revert that part of 682ce911f.

That all happened a long time ago.  However, since the net effect
here is just to tighten an Assert condition, I'm content to change
it only in master.

Discussion: https://postgr.es/m/1565347.1742572349@sss.pgh.pa.us
2025-03-21 15:55:06 -04:00
Masahiko Sawada
04ff636cbc Add GUC option to control maximum active replication origins.
This commit introduces a new GUC option max_active_replication_origins
to control the maximum number of active replication
origins. Previously, this was controlled by
'max_replication_slots'. Having a separate GUC option provides better
flexibility for setting up subscribers, as they may not require
replication slots (for cascading replication) but always require
replication origins.
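
For example (illustrative values; both settings presumably require a
server restart, like max_replication_slots):

ALTER SYSTEM SET max_active_replication_origins = 50;
ALTER SYSTEM SET max_replication_slots = 0;  -- no slots needed on a
                                             -- plain subscriber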

Author: Euler Taveira <euler@eulerto.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Discussion: https://postgr.es/m/b81db436-8262-4575-b7c4-bc0c1551000b@app.fastmail.com
2025-03-21 12:20:15 -07:00
Tom Lane
0e032a2240 Place "extern" declaration in the right part of pg_class.h.
errdetail_relkind_not_supported() was declared within
EXPOSE_TO_CLIENT_CODE, which is mistaken since that function
isn't available client-side.  While relatively harmless,
this isn't good precedent.

Discussion: https://postgr.es/m/1134562.1742507765@sss.pgh.pa.us
2025-03-21 15:14:15 -04:00
Tom Lane
cd72c1b76e Label the contents of pg_*_d.h files a little better.
Make genbki.pl emit some boilerplate comments identifying the
sections of the pg_*_d.h files that it generates.  This is in
hopes of making them slightly more readable, in case people
look at those files and not the pg_*.h/pg_*.dat originals.

Discussion: https://postgr.es/m/1134562.1742507765@sss.pgh.pa.us
2025-03-21 15:09:46 -04:00
Melanie Plageman
69273b818b Use streaming read I/O in GiST vacuuming
Like c5c239e26e did for btree vacuuming, make GiST vacuum use the
read stream API for sequentially processed pages.

Because it is possible for concurrent insertions to relocate unprocessed
index entries to already vacuumed pages, GiST vacuum must backtrack and
reprocess those pages. These pages are still read with explicit
ReadBuffer() calls.

Author: Andrey M. Borodin <x4mmm@yandex-team.ru>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/EFEBED92-18D1-4C0F-A4EB-CD47072EF071%40yandex-team.ru
2025-03-21 14:06:45 -04:00
Melanie Plageman
3f850c3fc5 Assorted trivial cleanup of c5c239e26e
c5c239e26e made btree vacuum use the read stream API. Though it used
functions declared in read_stream.h, it relied on transitively including
it. Explicitly include that file. Also remove an extraneous newline and
decrease the scope of one of the local variables in btvacuumscan().
2025-03-21 14:06:40 -04:00
Tom Lane
7fe312f609 Fix plpgsql's handling of simple expressions in scrollable cursors.
exec_save_simple_expr did not account for the possibility that
standard_planner would stick a Materialize node atop the plan
of even a simple Result, if CURSOR_OPT_SCROLL is set.  This led
to an "unexpected plan node type" error.

This is a very old bug, but it'd only be reached by declaring a
cursor for a "SELECT simple-expression" query and explicitly
marking it scrollable, which is an odd thing to do.  So the lack
of prior reports isn't too surprising.

Bug: #18859
Reported-by: Olleg Samoylov <splarv@ya.ru>
Author: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/18859-0d5f28ac99a37059@postgresql.org
Backpatch-through: 13
2025-03-21 11:30:42 -04:00
Melanie Plageman
c5c239e26e Use streaming read I/O in btree vacuuming
Btree vacuum processes all index pages in physical order. Now it uses
the read stream API to get the next buffer instead of explicitly
invoking ReadBuffer().

It is possible for concurrent insertions to cause page splits during
index vacuuming. This can lead to index entries that have yet to be
vacuumed being moved to pages that have already been vacuumed. Btree
vacuum code handles this by backtracking to reprocess those pages. So,
while sequentially encountered pages are now read through the
read stream API, backtracked pages are still read with explicit
ReadBuffer() calls.

Author: Andrey Borodin <x4mmm@yandex-team.ru>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Discussion: https://postgr.es/m/flat/CAAKRu_bW1UOyup%3DjdFw%2BkOF9bCaAm%3D9UpiyZtbPMn8n_vnP%2Big%40mail.gmail.com#3b3a84132fc683b3ee5b40bc4c2ea2a5
2025-03-21 09:09:39 -04:00
Álvaro Herrera
1d617a2028 Change one loop in ATRewriteTable to use 1-based attnums
All TupleDescAttr() calls in tablecmds.c that aren't in loops across all
attributes use AttrNumber-style indexes (1-based); there was only one
place in ATRewriteTable that was stashing 0-based indexes in a list for
later processing.  Switch that to use attnums for consistency.

Author: jian he <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxEoYA5ScUr2=CmA1xcpaS_1ixneDbEkVU77X1ctGxY2mA@mail.gmail.com
2025-03-21 10:55:06 +01:00
Thomas Munro
ce1a75c4fe Support buffer forwarding in StartReadBuffers().
StartReadBuffers() reports a short read when it finds a cached block
that ends a range needing I/O by updating the caller's *nblocks.  It
doesn't want to have to unpin the trailing hit that it knows the caller
wants, so the v17 version used sleight of hand in the name of
simplicity: it included it in *nblocks as if it were part of the I/O,
but internally tracked the shorter real I/O size in io_buffers_len (now
removed).

This API change "forwards" the delimiting buffer to the next call.  It's
still pinned, and still stored in the caller's array, but *nblocks no
longer includes stray buffers that are not really part of the operation.
The expectation is that the caller still wants the rest of the blocks
and will call again starting from that point, and now it can pass the
already pinned buffer back in (or choose not to and release it).

The change is needed for the coming asynchronous I/O version's larger
version of the problem: by definition it must move BM_IO_IN_PROGRESS
negotiation from WaitReadBuffers() to StartReadBuffers(), but it might
already have many buffers pinned before it discovers a need to split an
I/O.  (The current synchronous I/O version hides that detail from
callers by looping over smaller reads if required to make all covered
buffers valid in WaitReadBuffers(), so it looks like one operation but
it might occasionally be several under the covers.)

Aside from avoiding unnecessary pin traffic, this will also be important
for later work on out-of-order streams: you can't prioritize data that
is already available right now if that fact is hidden from you.

The new API is natural for read_stream.c (see ed0b87ca).  After a short
read it leaves forwarded buffers where they fell in its circular queue
for the continuing call to pick up.

Single-block StartReadBuffer() and traditional ReadBuffer() share code
but are not affected by the change.  They don't do multi-block I/O.

Reviewed-by: Andres Freund <andres@anarazel.de> (earlier versions)
Discussion: https://postgr.es/m/CA%2BhUKGK_%3D4CVmMHvsHjOVrK6t4F%3DLBpFzsrr3R%2BaJYN8kcTfWg%40mail.gmail.com
2025-03-21 20:43:59 +13:00
Thomas Munro
ed0b87caac Support buffer forwarding in read_stream.c.
In preparation for a follow-up change to the buffer manager, teach
read_stream.c to manage buffers "forwarded" from one StartReadBuffers()
call to the next after a short read.  This involves a small amount of
extra book-keeping, and opens the way for lower levels to split I/O
operations without having to drop pins, as required for efficient
handling of various edge cases.

Concretely, the "buffers" argument will change from an out parameter to
an in/out parameter.  Buffer queue elements must be initialized on first
use and cleared after they're consumed, but forwarded buffers are left
where they fall ahead of the current pending read in the queue, ready
for use by the operation that continues where a short read left off.
The stream also needs to count them for pin limit management and release
them on reset/early end.

Tested-by: Andres Freund <andres@anarazel.de> (earlier versions)
Discussion: https://postgr.es/m/CA%2BhUKGK_%3D4CVmMHvsHjOVrK6t4F%3DLBpFzsrr3R%2BaJYN8kcTfWg%40mail.gmail.com
2025-03-21 18:44:47 +13:00
Fujii Masao
14413d0ef5 doc: Remove incorrect description about dropping replication slots.
pg_drop_replication_slot() can drop replication slots created on
a different database than the one where it is executed. This behavior
has been in place since PostgreSQL 9.4, when pg_drop_replication_slot()
was introduced.

However, commit ff539d mistakenly added the following incorrect
description in the documentation:

     For logical slots, this must be called when connected to
     the same database the slot was created on.

This commit removes that incorrect statement. A similar mistake was
also present in the documentation for the DROP_REPLICATION_SLOT
command, which has now been corrected as well.

Back-patch to all supported versions.

Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/OSCPR01MB14966C6BE304B5BB2E58D4009F5DE2@OSCPR01MB14966.jpnprd01.prod.outlook.com
Backpatch-through: 13
2025-03-21 12:56:39 +09:00
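
A short illustration of the documented behavior (the slot name is a
placeholder): dropping a logical slot does not require being connected
to the database it was created on.

    -- run from any database in the cluster
    SELECT pg_drop_replication_slot('regress_slot');
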
David Rowley
00b52c3db6 Simplify EXPLAIN code for Memoize
This removes a needless special case for Memoize's FORMAT TEXT EXPLAIN
output.

ExplainPropertyText() outputs the same thing in text mode as the
special-case code was doing, so removing the special-case code results in
the same EXPLAIN output, just with less code.

It seems like a good idea to fix this to help prevent future changes in
this area from copying the same pattern.

Author: Ilia Evdokimov <ilya.evdokimov@tantorlabs.com>
Reported-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/88a71bcd-0b5c-4d0b-8107-757e96f402d5@tantorlabs.com
2025-03-21 13:40:05 +13:00
Andres Freund
202b12774d bufmgr: Improve stats when a buffer is read in concurrently
Previously we would have the following inaccuracies when a backend tried to
read in a buffer, but that buffer was read in concurrently by another backend:
- the read IO was double-counted in the global buffer access stats (pgBufferUsage)
- the buffer hit was not accounted for in:
  - global buffer access statistics
  - pg_stat_io
  - relation level IO stats
  - vacuum cost balancing

While trying to read in a buffer that is concurrently read in by another
backend is not a common occurrence, it's also not that rare, e.g. due to
concurrent sequential scans on the same relation.  This scenario has become
more likely in PG 17, due to the introduction of read streams, which can pin
multiple buffers before calling StartBufferIO() for all the buffers.

This behaviour grew historically, but there doesn't seem to be any reason
to continue with the wrong accounting.

Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CAAKRu_Zk-B08AzPsO-6680LUHLOCGaNJYofaxTFseLa=OepV1g@mail.gmail.com
2025-03-20 19:58:22 -04:00
Andrew Dunstan
12604593e9 Show plperl version in the meson setup summary.
Also, use perl 'version' instead of 'api_versionstring' to sync with
the configure script.

Author: Roman Zharkov <r.zharkov@postgrespro.ru>

Discussion: https://postgr.es/m/93e7f77bf4e1ef4640e4ee733f9e2a78@postgrespro.ru
2025-03-20 18:55:29 -04:00
Andres Freund
fc51a60dd4 smgr: Hold interrupts in most smgr functions
We need to hold interrupts across most of the smgr.c/md.c functions, as
otherwise interrupt processing, e.g. due to a < ERROR elog/ereport, can
trigger procsignal processing, which in turn can trigger smgrreleaseall(). As
the relevant code is not reentrant, we quickly end up in a bad situation.

The only reason we haven't noticed this before is that there is only one
non-error ereport called in affected routines, in register_dirty_segments(),
and that one is extremely rarely reached. If one enables fd.c's FDDEBUG it's
easy to reproduce crashes.

It seems better to put the HOLD_INTERRUPTS()/RESUME_INTERRUPTS() in smgr.c,
instead of trying to push them down to md.c where possible: For one, every
smgr implementation would be vulnerable, for another, a good bit of smgr.c
code itself is affected too.

Eventually we might want a more targeted solution, allowing e.g. a networked
smgr implementation to be interrupted, but many other, more complicated,
problems would need to be fixed for that to be viable (e.g. smgr.c is often
called with interrupts already held).

One could argue this should be backpatched, but the existing < ERROR
elog/ereports that can be reached with unmodified sources are unlikely to be
reached. On balance the risk of backpatching seems higher than the gain - at
least for now.

Reviewed-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/3vae7l5ozvqtxmd7rr7zaeq3qkuipz365u3rtim5t5wdkr6f4g@vkgf2fogjirl
2025-03-20 17:33:57 -04:00
Tom Lane
fdb5dd6331 Be more paranoid in configure's checks for CRC and POPCNT intrinsics.
In these tests, we need to verify not only that the compiler has heard
of these intrinsics, but that lower-level tools cope with them too.
(For example, the assembler must also know the instructions, and on
some platforms there might be library support involved.)  The hazard
is that the compiler might optimize away the calls altogether,
allowing the configure check to succeed only to have the build fail
later if lower-level support is missing.  The existing code tried to
prevent that by ensuring that the result of the intrinsic is used
for something, but that's really insufficient because we were feeding
constant input to it.  So the compiler would be perfectly entitled to
optimize away the calls anyway.  Fix by making the inputs into global
variables.  (Hypothetically, LTO optimization could still remove the
code --- but that's well past where we'd be likely to hit trouble.)

It is not known that any current compiler would actually optimize
away these calls, and even if that happened it would be unlikely
that any problem would manifest.  Our concern for this stems from
largely-bygone days when it was common to install gcc on platforms
with some other native compiler, so that a compiler-vs-library
support discrepancy was more probable.  Still, there's little
point in defending against such cases in a way that is visibly
incomplete.

I'm content to fix this in master for now; we can back-patch if
any indication appears that it's a live problem for someone.

Discussion: https://postgr.es/m/3368102.1741993462@sss.pgh.pa.us
2025-03-20 16:23:09 -04:00
Robert Haas
50ba65e733 Add an additional hook for EXPLAIN option validation.
Commit c65bc2e1d1 made it possible for
loadable modules to add EXPLAIN options. Normally, any necessary
validation can be performed by the hook function passed to
RegisterExtensionExplainOption, but if a loadable module wants to sanity
check options against each other, that needs to be done after the entire
options list has been processed. So, add an additional hook for that
purpose.

Author: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: http://postgr.es/m/CAA5RZ0vOcJF91O2e5AQN+V6guMNLMhJx83dxALf-iUZ-hLGO_Q@mail.gmail.com
2025-03-20 13:47:55 -04:00
Nathan Bossart
af0d4901c1 Add test for pg_upgrade file transfer modes.
This new test checks all of pg_upgrade's file transfer modes.  For
each mode, we verify that pg_upgrade either succeeds (and some test
objects successfully reach the new version) or fails with an error
that indicates the mode is not supported on the current platform.
For cross-version tests, we also check that pg_upgrade transfers
non-default tablespaces.  (Tablespaces can't be tested on same
version upgrades because of the version-specific subdirectory
conflict, but we might be able to enable such tests once we teach
pg_upgrade how to handle in-place tablespaces.)

Suggested-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/Zyvop-LxLXBLrZil%40nathan
2025-03-20 11:08:42 -05:00
Nathan Bossart
0164a0f9ee Add vacuum_truncate configuration parameter.
This new parameter works just like the storage parameter of the
same name: if set to true (which is the default), autovacuum and
VACUUM attempt to truncate any empty pages at the end of the table.
It is primarily intended to help users avoid locking issues on hot
standbys.  The setting can be overridden with the storage parameter
or VACUUM's TRUNCATE option.

Since there's presently no way to determine whether a Boolean
storage parameter is explicitly set or has just picked up the
default value, this commit also introduces an isset_offset member
to relopt_parse_elt.

Suggested-by: Will Storey <will@summercat.com>
Author: Nathan Bossart <nathandbossart@gmail.com>
Co-authored-by: Gurjeet Singh <gurjeet@singh.im>
Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Reviewed-by: Robert Treat <rob@xzilla.net>
Discussion: https://postgr.es/m/Z2DE4lDX4tHqNGZt%40dev.null
2025-03-20 10:16:50 -05:00
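
A usage sketch (the table name is a placeholder); the storage parameter
and VACUUM's TRUNCATE option override the new GUC as described above:

    ALTER SYSTEM SET vacuum_truncate = off;
    SELECT pg_reload_conf();
    -- re-enable truncation for one table, or for a single run:
    ALTER TABLE measurements SET (vacuum_truncate = on);
    VACUUM (TRUNCATE) measurements;
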
Peter Eisentraut
618c64ffd3 Revert workarounds for -Wmissing-braces false positives on old GCC
We have collected several instances of a workaround for GCC bug 53119,
which caused false-positive compiler warnings.  This bug has long been
fixed, but was still seen on the buildfarm, most recently on lapwing
with gcc (Debian 4.7.2-5).  (The GCC bug tracker mentions that a fix
was backported to 4.7.4 and 4.8.3.)

That compiler no longer runs warning-free since commit 6fdd5d9563, so
we don't need to keep these workarounds.  And furthermore, the
consensus appears to be that we don't want to keep supporting that era
of platform anymore at all.

This reverts the following commits:

d937904cce
506428d091
b449afb582
6392f2a096
bad0763a4d
5e0c761d0a

and makes a few similar fixes to newer code.

Discussion: https://www.postgresql.org/message-id/flat/e170d61f-01ab-4cf9-ab68-91cd1fac62c5%40eisentraut.org
Discussion: https://www.postgresql.org/message-id/flat/CA%2BTgmoYEAm-KKZibAP3hSqbTFTjUd47XtVcf3xSFDpyecXX9uQ%40mail.gmail.com
2025-03-20 11:25:58 +01:00
Peter Eisentraut
b7076c1e7f Fix extension control path tests
Change the expected extension to be installed from amcheck to plpgsql,
since not all buildfarm animals have the contrib module installed.

Author: Matheus Alcantara <mths.dev@pm.me>
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/E7C7BFFB-8857-48D4-A71F-88B359FADCFD@justatheory.com
2025-03-20 10:53:59 +01:00
Peter Eisentraut
47929324c5 Fix typo in comment 2025-03-20 10:44:12 +01:00
Amit Kapila
e5aeed4b80 pg_createsubscriber: Add -R publications option.
This patch introduces a new '-R'/'--remove' option in the
'pg_createsubscriber' utility to specify the object types to be removed
from the subscriber. Currently, we add support to specify 'publications'
as an object type. In the future, other object types like failover-slots
could be added.

This feature optionally allows removing publications on the subscriber
that were replicated from the primary server (before running this tool)
during physical replication. Users may want to retain these publications
in case they want some pre-existing subscribers to point to the newly
created subscriber.

Author: Shubham Khanna <khannashubham1197@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Reviewed-by: Euler Taveira <euler@eulerto.com>
Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Reviewed-by: Nisha Moond <nisha.moond412@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CAHv8RjL4OvoYafofTb_U_JD5HuyoNowBoGpMfnEbhDSENA74Kg@mail.gmail.com
2025-03-20 12:21:54 +05:30
Andres Freund
5941946d09 meson: Flush stdout in testwrap
Otherwise the progress won't reliably be displayed during a test.

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/kx6xu7suexal5vwsxpy7ybgkcznx6hgywbuhkr6qabcwxjqax2@i4pcpk75jvaa
Backpatch-through: 16
2025-03-19 09:04:09 -04:00
Peter Eisentraut
190dc27998 Update a code comment
The comment explained that ALTER TABLE ADD CONSTRAINT USING INDEX is
only supported with a btree index.  (This is not being changed.)  The
reason is to keep upgrades robust, as explained there.  The other part
of the comment, that btree is the only unique index kind anyway, is
somewhat less true as we're trying to enable unique indexes other than
btree, and it's irrelevant to this check.  There is a check for
indisunique earlier already.  So just remove this part of the comment.

Author: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2025-03-19 10:39:06 +01:00
Peter Eisentraut
4f7f7b0375 extension_control_path
The new GUC extension_control_path specifies a path to look for
extension control files.  The default value is $system, which looks in
the compiled-in location, as before.

The path search uses the same code and works in the same way as
dynamic_library_path.

Some use cases of this are: (1) testing extensions during package
builds, (2) installing extensions outside security-restricted
containers like Python.app (on macOS), (3) adding extensions to
PostgreSQL running in a Kubernetes environment using operators such as
CloudNativePG without having to rebuild the base image for each new
extension.

There is also a tweak in Makefile.global so that it is possible to
install extensions using PGXS into a different directory than the
default, using 'make install prefix=/else/where'.  This previously
only worked when specifying the subdirectories, like 'make install
datadir=/else/where/share pkglibdir=/else/where/lib', for purely
implementation reasons.  (Of course, without the path feature,
installing elsewhere was rarely useful.)

Author: Peter Eisentraut <peter@eisentraut.org>
Co-authored-by: Matheus Alcantara <matheusssilv97@gmail.com>
Reviewed-by: David E. Wheeler <david@justatheory.com>
Reviewed-by: Gabriele Bartolini <gabriele.bartolini@enterprisedb.com>
Reviewed-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>
Reviewed-by: Niccolò Fei <niccolo.fei@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/E7C7BFFB-8857-48D4-A71F-88B359FADCFD@justatheory.com
2025-03-19 07:03:20 +01:00
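
A configuration sketch, assuming the value format mirrors
dynamic_library_path as described above ($system is the compiled-in
location; the extra directory is a placeholder):

    ALTER SYSTEM SET extension_control_path =
      '$system:/opt/extras/share/postgresql/extension';
    SELECT pg_reload_conf();
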
Michael Paquier
2cce0fe440 psql: Allow queries terminated by semicolons while in pipeline mode
Currently, the only way to pipe queries in an ongoing pipeline (in a
\startpipeline block) is to leverage the meta-commands able to create
extended queries such as \bind, \parse or \bind_named.

While this is good enough for testing the backend with pipelines, it has
been mentioned that it can also be very useful to allow queries
terminated by semicolons to be appended to a pipeline.  For example, it
would be possible to migrate existing psql scripts to use pipelines by
just adding a set of \startpipeline and \endpipeline meta-commands,
making such scripts more efficient.

Such a change proves to be simple in psql: when pipeline mode is active,
queries terminated by semicolons can be executed through
PQsendQueryParams() without any parameters set, instead of the default
PQsendQuery(), as pgbench already does.  \watch is still forbidden
while in a pipeline, as it expects its results to be processed
synchronously.

The large portion of this commit consists in providing more test
coverage, with mixes of extended queries appended in a pipeline by \bind
and friends, and queries terminated by semicolons.

This improvement has been suggested by Daniel Vérité.

Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Discussion: https://postgr.es/m/d67b9c19-d009-4a50-8020-1a0ea92366a1@manitou-mail.org
2025-03-19 13:34:59 +09:00
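
A sketch of what this enables in a psql session: plain semicolon-
terminated queries appended to an ongoing pipeline, with results fetched
at \endpipeline (or earlier via \getresults):

    \startpipeline
    SELECT 1;
    SELECT 2;
    \endpipeline
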
Thomas Munro
0b53c08677 Fix compiler warning for commit 434dbf69.
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
2025-03-19 17:26:16 +13:00
Thomas Munro
1cf4c56480 oauth: Simplify copy of PGoauthBearerRequest
Follow-up to 03366b61d. Since there are no more const members in the
PGoauthBearerRequest struct, the previous memcpy() can be replaced with
simple assignment.

Author: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://postgr.es/m/p4bd7mn6dxr2zdak74abocyltpfdxif4pxqzixqpxpetjwt34h%40qc6jgfmoddvq
2025-03-19 16:59:25 +13:00
Thomas Munro
873c0fd678 oauth: Improve validator docs on interruptibility
Andres pointed out that EINTR handling is inadequate for real-world use
cases. Direct module writers to our wait APIs instead.

Author: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://postgr.es/m/p4bd7mn6dxr2zdak74abocyltpfdxif4pxqzixqpxpetjwt34h%40qc6jgfmoddvq
2025-03-19 16:58:06 +13:00
Thomas Munro
d7e40845f9 oauth: Disallow synchronous DNS in libcurl
There is concern that a blocking DNS lookup in libpq could stall a
backend process (say, via FDW). Since there's currently no strong
evidence that synchronous DNS is a popular option, disallow it entirely
rather than warning at configure time. We can revisit if anyone
complains.

Per query from Andres Freund.

Author: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://postgr.es/m/p4bd7mn6dxr2zdak74abocyltpfdxif4pxqzixqpxpetjwt34h%40qc6jgfmoddvq
2025-03-19 16:56:19 +13:00
Thomas Munro
434dbf6907 oauth: Fix postcondition for set_timer on macOS
On macOS, re-adding an EVFILT_TIMER to a kqueue does not appear to clear
out previously queued timer events, so checks for timer expiration do
not work correctly during token retrieval. Switching to IPv4-only
communication exposes the problem, because libcurl is no longer clearing
out other timeouts related to Happy Eyeballs dual-stack handling.

Fully remove and re-register the kqueue timer events during each call to
set_timer(), to clear out any stale expirations.

Author: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://postgr.es/m/CAOYmi%2Bn4EDOOUL27_OqYT2-F2rS6S%2B3mK-ppWb2Ec92UEoUbYA%40mail.gmail.com
2025-03-19 16:45:01 +13:00
Thomas Munro
8d9d5843b5 oauth: Use IPv4-only issuer in oauth_validator tests
The test authorization server implemented in oauth_server.py does not
listen on IPv6. Most of the time, libcurl happily falls back to IPv4
after failing its initial connection, but on NetBSD, something is
consistently showing up on the unreserved IPv6 port and causing a test
failure.

Rather than deal with dual-stack details across all test platforms,
change the issuer to enforce the use of IPv4 only. (This elicits more
punishing timeout behavior from libcurl, so it's a useful change from
the testing perspective as well.)

Author: Jacob Champion <jacob.champion@enterprisedb.com>
Reported-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/CAOYmi%2Bn4EDOOUL27_OqYT2-F2rS6S%2B3mK-ppWb2Ec92UEoUbYA%40mail.gmail.com
2025-03-19 16:45:01 +13:00
Amit Langote
28317de723 Ensure first ModifyTable rel initialized if all are pruned
Commit cbc127917e introduced tracking of unpruned relids to avoid
processing pruned relations, and changed ExecInitModifyTable() to
initialize only unpruned result relations. As a result, MERGE
statements that prune all target partitions can now lead to crashes
or incorrect behavior during execution.

The crash occurs because some executor code paths rely on
ModifyTableState.resultRelInfo[0] being present and initialized,
even when no result relations remain after pruning. For example,
ExecMerge() and ExecMergeNotMatched() use the first resultRelInfo
to determine the appropriate action. Similarly,
ExecInitPartitionInfo() assumes that at least one result relation
exists.

To preserve these assumptions, ExecInitModifyTable() now includes the
first result relation in the initialized result relation list if all
result relations for that ModifyTable were pruned. To enable that,
ExecDoInitialPruning() ensures the first relation is locked if it was
pruned and locking is necessary.

To support this exception to the pruning logic, PlannedStmt now
includes a list of RT indexes identifying the first result relation
of each ModifyTable node in the plan. This allows
ExecDoInitialPruning() to check whether each such relation was
pruned and, if so, lock it if necessary.

Bug: #18830
Reported-by: Robins Tharakan <tharakan@gmail.com>
Diagnosed-by: Tender Wang <tndrwang@gmail.com>
Diagnosed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Co-authored-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://postgr.es/m/18830-1f31ea1dc930d444%40postgresql.org
2025-03-19 12:14:24 +09:00
Thomas Munro
06fb5612c9 Increase io_combine_limit range to 1MB.
The default of 128kB is unchanged, but the upper limit is changed from
32 blocks to 128 blocks, unless the operating system's IOV_MAX is too
low.  Some other RDBMSes seem to cap their multi-block buffer pool I/O
around this number, and it seems useful to allow experimentation.

The concrete change is to our definition of PG_IOV_MAX, which provides
the maximum for io_combine_limit and io_max_combine_limit.  It also
affects a couple of other places that work with arrays of struct iovec
or smaller objects on the stack, so we still don't want to use the
system IOV_MAX directly without a clamp: it is not under our control and
likely to be 1024.  128 seems acceptable for our current usage.

For Windows, we can't use real scatter/gather yet, so we continue to
define our own IOV_MAX value of 16 and emulate preadv()/pwritev() with
loops.  Someone would need to research the trade-offs of raising that
number.

NB if trying to see this working: you might temporarily need to hack
BAS_BULKREAD to be bigger, since otherwise the obvious way of "a very
big SELECT" is limited by that for now.

Suggested-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CA%2BhUKG%2B2T9p-%2BzM6Eeou-RAJjTML6eit1qn26f9twznX59qtCA%40mail.gmail.com
2025-03-19 15:40:35 +13:00
Thomas Munro
10f6646847 Introduce io_max_combine_limit.
The existing io_combine_limit can be changed by users.  The new
io_max_combine_limit is fixed at server startup time, and functions as a
silent clamp on the user setting.  That in itself is probably quite
useful, but the primary motivation is:

aio_init.c allocates shared memory for all asynchronous IOs including
some per-block data, and we didn't want to waste memory you'd never use
by assuming they could be up to PG_IOV_MAX.  This commit already halves
the size of 'AioHandleIov' and 'AioHandleData'.  A follow-up commit can
now expand PG_IOV_MAX without affecting that.

Since our GUC system doesn't support dependencies or cross-checks
between GUCs, the user-settable one now assigns a "raw" value to
io_combine_limit_guc, and the lower of io_combine_limit_guc and
io_max_combine_limit is maintained in io_combine_limit.

Reviewed-by: Andres Freund <andres@anarazel.de> (earlier version)
Discussion: https://postgr.es/m/CA%2BhUKG%2B2T9p-%2BzM6Eeou-RAJjTML6eit1qn26f9twznX59qtCA%40mail.gmail.com
2025-03-19 15:23:54 +13:00
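
A brief illustration of the clamping described above: the effective
combine limit is the lower of the two settings.

    SHOW io_max_combine_limit;       -- fixed at server start
    SET io_combine_limit = '1MB';    -- silently clamped to the value above
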
Michael Paquier
17d8bba6da Fix copy-paste error related to the autovacuum launcher in pgstat_io.c
Autovacuum launchers perform no WAL IO reads, but pgstat_tracks_io_op()
was tracking them as an allowed combination for the "init" and "normal"
contexts.

This caused the "read", "read_bytes" and "read_time" attributes of
pg_stat_io to show zeros for the autovacuum launcher rather than NULL.
NULL means that a combination of IO object, IO context and IO operation
has no meaning for a backend type.  Zero is the same as telling that a
combination is relevant, and that WAL reads are possible in an
autovacuum launcher, but it is not relevant.

Copy-pasto introduced in a051e71e28.

Author: Ranier Vilela <ranier.vf@gmail.com>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/CAEudQAopEMAPiUqE7BvDV+x2fUPmKmb9RrsaoDR+hhQzLKg4PQ@mail.gmail.com
2025-03-19 08:52:10 +09:00
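
A query sketch to observe the fix (column and value names follow the
description above; after the fix these columns are NULL, not zero, for
the launcher):

    SELECT reads, read_bytes, read_time
      FROM pg_stat_io
     WHERE backend_type = 'autovacuum launcher'
       AND object = 'wal';
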
Masahiko Sawada
f4290f20dd Fix assertion failure in parallel vacuum with minimal maintenance_work_mem setting.
bbf668d66f lowered the minimum value of maintenance_work_mem to
64kB. However, in parallel vacuum cases, since the initial underlying
DSA size is 256kB, it attempts to perform a cycle of index vacuuming
and table vacuuming with an empty TID store, resulting in an assertion
failure.

This commit ensures that at least one page is processed before index
vacuuming and table vacuuming begins.

Backpatch to 17, where the minimum maintenance_work_mem value was
lowered.

Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CAD21AoCEAmbkkXSKbj4dB+5pJDRL4ZHxrCiLBgES_g_g8mVi1Q@mail.gmail.com
Backpatch-through: 17
2025-03-18 16:37:02 -07:00
Michael Paquier
6d3ea48ff1 Optimize check for pending backend IO stats
This commit changes the backend stats code so that we rely on a single
boolean rather than a repeated check based on pg_memory_is_all_zeros()
in the code, making it cheaper should PgStat_PendingIO get bigger in
size.

The frequency of backend stats reports is not a bottleneck, but there is
no reason to not make that cheaper, and the logic is simple as the only
entry points updating backend IO stats are pgstat_count_backend_io_op()
and pgstat_count_backend_io_op_time().

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>
Discussion: https://postgr.es/m/Z8WYf1jyy4MwOveQ@ip-10-97-1-34.eu-west-3.compute.internal
2025-03-19 08:03:06 +09:00
Nathan Bossart
7fb418f020 Add commit 796bdda484 to .git-blame-ignore-revs. 2025-03-18 17:00:23 -05:00
Nathan Bossart
c9d502eb68 Update guidance for running vacuumdb after pg_upgrade.
Now that pg_upgrade can carry over most optimizer statistics, we
should recommend using vacuumdb's new --missing-stats-only option
to only analyze relations that are missing statistics.

Reviewed-by: John Naylor <johncnaylorls@gmail.com>
Discussion: https://postgr.es/m/Z5O1bpcwDrMgyrYy%40nathan
2025-03-18 16:32:56 -05:00
Nathan Bossart
edba754f05 vacuumdb: Add option for analyzing only relations missing stats.
This commit adds a new --missing-stats-only option that can be used
with --analyze-only or --analyze-in-stages.  When this option is
specified, vacuumdb will analyze a relation if it lacks any
statistics for a column, expression index, or extended statistics
object.  This new option is primarily intended for use after
pg_upgrade (since it can now retain most optimizer statistics), but
it might be useful in other situations, too.

Author: Corey Huinker <corey.huinker@gmail.com>
Co-authored-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: John Naylor <johncnaylorls@gmail.com>
Discussion: https://postgr.es/m/Z5O1bpcwDrMgyrYy%40nathan
2025-03-18 16:32:56 -05:00
Nathan Bossart
9c03c8d187 vacuumdb: Teach vacuum_one_database() to reuse query results.
Presently, each call to vacuum_one_database() queries the catalogs
to retrieve the list of tables to process.  A follow-up commit will
add a "missing stats only" feature to --analyze-in-stages, which
requires saving the catalog query results (since tables without
statistics will have them after the first stage).  This commit adds
a new parameter to vacuum_one_database() that specifies either a
previously-retrieved list or a place to return the catalog query
results.  Note that nothing uses this new parameter yet.

Author: Corey Huinker <corey.huinker@gmail.com>
Co-authored-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: John Naylor <johncnaylorls@gmail.com>
Discussion: https://postgr.es/m/Z5O1bpcwDrMgyrYy%40nathan
2025-03-18 16:32:55 -05:00
Tom Lane
a6524105d2 Doc: manually break lines in wide UUID examples.
Buildfarm member crake has been complaining "WARNING: The contents of
fo:inline line 1 exceed the available area in the inline-progression
direction by 20500 millipoints. (See position 23808:106)" since
ba57dcfdc went in.  The other doc-building animals are not showing
this warning, and I don't see it on my RHEL8 workstation either, but
I was able to reproduce it on a Fedora 41 box.  So apparently this
is due to a recent-ish change in DocBook's line-breaking heuristics,
which caused it to cope less well with the UUIDs in these examples.
Put in some zero-width spaces to encourage the PDF toolchain to
break these lines in a better place.  (Only one of these examples
actually needs this today, but I marked up all three to ensure that
they get wrapped in a consistent way.)
2025-03-18 15:35:13 -04:00
Andres Freund
499faf9063 smgr: Make SMgrRelation initialization safer against errors
In case the smgr_open callback failed, the ->pincount field would not be
initialized and the relation would not be put onto the unpinned_relns list.

This buglet was introduced in 21d9c3ee4e, in 17.

Discussion: https://postgr.es/m/3vae7l5ozvqtxmd7rr7zaeq3qkuipz365u3rtim5t5wdkr6f4g@vkgf2fogjirl
Backpatch-through: 17
2025-03-18 14:04:44 -04:00
Álvaro Herrera
62d712ecfd
Introduce squashing of constant lists in query jumbling
pg_stat_statements produces multiple entries for queries like
    SELECT something FROM table WHERE col IN (1, 2, 3, ...)

depending on the number of parameters, because every element of
ArrayExpr is individually jumbled.  Most of the time that's undesirable,
especially if the list becomes too large.

Fix this by introducing a new GUC query_id_squash_values which modifies
the node jumbling code to only consider the first and last element of a
list of constants, rather than each list element individually.  This
affects both the query_id generated by query jumbling, as well as
pg_stat_statements query normalization so that it suppresses printing of
the individual elements of such a list.

The default value is off, meaning the previous behavior is maintained.

Author: Dmitry Dolgov <9erthalion6@gmail.com>
Reviewed-by: Sergey Dudoladov (mysterious, off-list)
Reviewed-by: David Geier <geidav.pg@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Sutou Kouhei <kou@clear-code.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Marcos Pegoraro <marcos@f10.com.br>
Reviewed-by: Julien Rouhaud <rjuju123@gmail.com>
Reviewed-by: Zhihong Yu <zyu@yugabyte.com>
Tested-by: Yasuo Honda <yasuo.honda@gmail.com>
Tested-by: Sergei Kornilov <sk@zsrv.org>
Tested-by: Maciek Sakrejda <m.sakrejda@gmail.com>
Tested-by: Chengxi Sun <sunchengxi@highgo.com>
Tested-by: Jakub Wartak <jakub.wartak@enterprisedb.com>
Discussion: https://postgr.es/m/CA+q6zcWtUbT_Sxj0V6HY6EZ89uv5wuG5aefpe_9n0Jr3VwntFg@mail.gmail.com
2025-03-18 18:56:11 +01:00
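
A sketch of the effect (the table name is a placeholder): with the GUC
enabled, both queries below should normalize to a single
pg_stat_statements entry with the IN list collapsed.

    SET query_id_squash_values = on;
    SELECT * FROM orders WHERE id IN (1, 2, 3);
    SELECT * FROM orders WHERE id IN (4, 5, 6, 7, 8);
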
Andres Freund
247ce06b88 aio: Add io_method=worker
The previous commit introduced the infrastructure to start io_workers. This
commit actually makes the workers execute IOs.

IO workers consume IOs from a shared memory submission queue, run traditional
synchronous system calls, and perform the shared completion handling
immediately.  Client code submits most requests by pushing IOs into the
submission queue, and waits (if necessary) using condition variables.  Some
IOs cannot be performed in another process due to lack of infrastructure for
reopening the file, and must be processed synchronously by the client code when
submitted.

For now the default io_method is changed to "worker". We should re-evaluate
that around beta1; we might want to be careful and set the default to "sync"
for 18.

Reviewed-by: Noah Misch <noah@leadboat.com>
Co-authored-by: Thomas Munro <thomas.munro@gmail.com>
Co-authored-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt
Discussion: https://postgr.es/m/20210223100344.llw5an2aklengrmn@alap3.anarazel.de
Discussion: https://postgr.es/m/stj36ea6yyhoxtqkhpieia2z4krnam7qyetc57rfezgk4zgapf@gcnactj4z56m
2025-03-18 11:54:01 -04:00
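
A configuration sketch (assuming the worker-count GUC is named
io_workers): io_method requires a restart, while the number of workers
can be adjusted with a reload, per the PGC_SIGHUP note in the
infrastructure commit below.

    ALTER SYSTEM SET io_method = 'worker';  -- takes effect at restart
    ALTER SYSTEM SET io_workers = 8;        -- reload is enough
    SELECT pg_reload_conf();
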
Andres Freund
55b454d0e1 aio: Infrastructure for io_method=worker
This commit contains the basic, system-wide, infrastructure for
io_method=worker. It does not yet actually execute IO, this commit just
provides the infrastructure for running IO workers, kept separate for easier
review.

The number of IO workers can be adjusted with a PGC_SIGHUP GUC. Eventually
we'd like to make the number of workers dynamically scale up/down based on the
current "IO load".

To allow the number of IO workers to be increased without a restart, we need
to reserve PGPROC entries for the workers unconditionally. This has been
judged to be worth the cost. If it turns out to be problematic, we can
introduce a PGC_POSTMASTER GUC to control the maximum number.

As io workers might be needed during shutdown, e.g. for AIO during the
shutdown checkpoint, a new PMState phase is added. IO workers are shut down
after the shutdown checkpoint has been performed and walsender/archiver have
shut down, but before the checkpointer itself shuts down. See also
87a6690cc6.

Updates PGSTAT_FILE_FORMAT_ID due to the addition of a new BackendType.

Reviewed-by: Noah Misch <noah@leadboat.com>
Co-authored-by: Thomas Munro <thomas.munro@gmail.com>
Co-authored-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt
Discussion: https://postgr.es/m/20210223100344.llw5an2aklengrmn@alap3.anarazel.de
Discussion: https://postgr.es/m/stj36ea6yyhoxtqkhpieia2z4krnam7qyetc57rfezgk4zgapf@gcnactj4z56m
2025-03-18 11:54:01 -04:00
Jeff Davis
549ea06e42 Fix headerscheck warning.
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/93731.1742310701@sss.pgh.pa.us
2025-03-18 08:37:07 -07:00
Tom Lane
4078da6c47 Silence compiler warning.
Assorted buildfarm members are complaining about "'process_list' may
be used uninitialized in this function" since f76892c9f, presumably
because they don't trust that the switch case labels are exhaustive.
We can silence that by initializing the variable to NULL.  Should
a switch fall-through actually happen, we'll get SIGSEGV at the
first use, which is as good as an Assert.
2025-03-18 10:54:10 -04:00
Daniel Gustafsson
daa02c6bd9 Add X25519 to the default set of curves
Since many clients default to the X25519 curve in the TLS handshake,
the fact that the server doesn't support it by default causes an extra
roundtrip for each TLS connection.  By adding multiple curves, which
is supported since 3d1ef3a15c, we can reduce the risk of extra
roundtrips.

Author: Daniel Gustafsson <daniel@yesql.se>
Co-authored-by: Jacob Champion <jacob.champion@enterprisedb.com>
Reported-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://postgr.es/m/20240616234612.6cslu7nqexquvwj7@awork3.anarazel.de
2025-03-18 15:26:27 +01:00
Robert Haas
4fd02bf7cf Add some new hooks so extensions can add details to EXPLAIN.
Specifically, add a per-node hook that is called after the per-node
information has been displayed but before we display children, and a
per-query hook that is called after existing query-level information
is printed. This assumes that extension-added information should
always go at the end rather than the beginning or the middle, but
that seems like an acceptable limitation for simplicity. It also
assumes that extensions will only want to add information, not remove
or reformat existing details; those also seem like acceptable
restrictions, at least for now.

If multiple EXPLAIN extensions are used, the order in which any
additional details are printed is likely to depend on the order in
which the modules are loaded. That seems OK, since the user may
have opinions about the order in which output should appear, and the
extension author can't really know whether their stuff is more or
less important to a particular user than some other extension.

Discussion: http://postgr.es/m/CA+TgmoYSzg58hPuBmei46o8D3SKX+SZoO4K_aGQGwiRzvRApLg@mail.gmail.com
Reviewed-by: Srinath Reddy <srinath2133@gmail.com>
Reviewed-by: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
2025-03-18 09:28:01 -04:00
Álvaro Herrera
f76892c9ff
Simplify reindexdb coding
get_parallel_object_list() was trying to serve two masters, and it was
doing a bad job at both.  In particular, it treated the given user_list
as an output argument, but only sometimes.  This was confusing, and the
two paths through it didn't really have all that much in common, so the
complexity wasn't buying us much.  Split it in two:
get_parallel_tables_list() handles the straightforward cases for
schemas, databases and tables, takes one list as argument and returns
another list.

A new function get_parallel_tabidx_list() handles the case for indexes.
This takes a list as argument and outputs two lists, just like
get_parallel_object_list used to do, but now the API is clearer (IMO
anyway).  Another difference is that accompanying the list of indexes
now we have a list of tables as an OID list rather than a
fully-qualified table name list.  This makes some comparisons easier,
and we don't really need the names of the tables, just their OIDs.
(This requires atooid, which requires <stdlib.h>).

Author: Ranier Vilela <ranier.vf@gmail.com>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/CAEudQArfqr0-s0VVPSEh=0kgOgBJvFNdGW=xSL5rBcr0WDMQYQ@mail.gmail.com
2025-03-18 14:21:26 +01:00
Melanie Plageman
cc6be07ebd Increase default maintenance_io_concurrency to 16
Since its introduction in fc34b0d9de, the default
maintenance_io_concurrency has been larger than the default
effective_io_concurrency. maintenance_io_concurrency primarily
controlled prefetching done on behalf of the whole system, for
operations like recovery. Therefore it makes sense for it to have a
value equal to or greater than effective_io_concurrency, which controls
I/O concurrency for reading a relation in a bitmap heap scan.

ff79b5b2ab increased effective_io_concurrency to 16, so we'll increase
maintenance_io_concurrency as well. For now, though, we'll keep the
defaults of effective_io_concurrency and maintenance_io_concurrency
equal to one another (16).

On fast, high IOPs systems, significantly higher values of
maintenance_io_concurrency are observably beneficial [1]. However, such
values would flood low IOPs systems and increase overall system I/O
latency.

It is worth mentioning that since 9256822608 and c3e775e608,
maintenance_io_concurrency also controls the I/O concurrency of each
vacuum worker. Since many autovacuum workers may be simultaneously
issuing I/Os, we want to keep maintenance_io_concurrency appropriately
conservative.

[1] https://postgr.es/m/c5d52837-6256-0556-ac8c-d6d3d558820a%40enterprisedb.com

Suggested-by: Jakub Wartak <jakub.wartak@enterprisedb.com>
Discussion: https://postgr.es/m/CAKZiRmxdHQaU%2B2Zpe6d%3Dx%3D0vigJ1sfWwwVYLJAf%3Dud_wQ_VcUw%40mail.gmail.com
2025-03-18 09:08:10 -04:00
Robert Haas
796bdda484 Fix indentation again.
Because somehow I manage to keep forgetting this.
2025-03-18 09:02:36 -04:00
Robert Haas
c65bc2e1d1 Make it possible for loadable modules to add EXPLAIN options.
Modules can use RegisterExtensionExplainOption to register new
EXPLAIN options, and GetExplainExtensionId, GetExplainExtensionState,
and SetExplainExtensionState to store related state inside the
ExplainState object.

Since this substantially increases the amount of code that needs
to handle ExplainState-related tasks, move a few bits of existing
code to a new file explain_state.c and add the rest of this
infrastructure there.

See the comments at the top of explain_state.c for further
explanation of how this mechanism works.

This does not yet provide a way for such options to do anything
useful. The intention is that we'll add hooks for that purpose in a
separate commit.

Discussion: http://postgr.es/m/CA+TgmoYSzg58hPuBmei46o8D3SKX+SZoO4K_aGQGwiRzvRApLg@mail.gmail.com
Reviewed-by: Srinath Reddy <srinath2133@gmail.com>
Reviewed-by: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
2025-03-18 08:41:12 -04:00
Peter Eisentraut
9d6db8bec1 Allow non-btree unique indexes for matviews
We were rejecting non-btree indexes in some cases owing to the
inability to determine the equality operators for other index AMs;
that problem no longer exists, because we can look up the equality
operator using COMPARE_EQ.

Stop rejecting these indexes; instead rely on the fact that all unique
indexes must have equality operators.

Author: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2025-03-18 11:29:15 +01:00
Peter Eisentraut
f278e1fe30 Allow non-btree unique indexes for partition keys
We were rejecting non-btree indexes in some cases owing to the
inability to determine the equality operators for other index AMs;
that problem no longer exists, because we can look up the equality
operator using COMPARE_EQ.  The problem of not knowing the strategy
number for equality in other index AMs is already resolved.

Stop rejecting the indexes upfront, and instead reject any for which
the equality operator lookup fails.

Author: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2025-03-18 11:25:36 +01:00
Peter Eisentraut
7317e64126 Add some opfamily support functions to lsyscache.c
Add get_opfamily_method() and get_opfamily_member_for_cmptype() in
lsyscache.c.  No callers yet, but we'll add some soon.  This is part
of generalizing some parts of the code away from having btree
hardcoded and use CompareType instead.

Author: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2025-03-18 11:17:43 +01:00
Amit Kapila
122a9af5de Fix typo.
Author: vignesh C <vignesh21@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://postgr.es/m/CALDaNm1KqJ0VFfDJRPbfYi9Shz6LHFEE-Ckn+eqsePfKhebv9w@mail.gmail.com
2025-03-18 14:18:09 +05:30
Amit Kapila
01e27aab05 Use correct variable name in publicationcmds.c.
subid was used in a few places for publicationid in publicationcmds.c/.h.

Author: vignesh C <vignesh21@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://postgr.es/m/CALDaNm1KqJ0VFfDJRPbfYi9Shz6LHFEE-Ckn+eqsePfKhebv9w@mail.gmail.com
2025-03-18 14:06:51 +05:30
Masahiko Sawada
c462b054ba Fix the test 005_char_signedness.
pg_upgrade test 005_char_signedness was leaving files like
delete_old_cluster.sh in the source directory for VPATH and meson
builds. The fix is to change the directory to tmp_check before running
the test.

Reported-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: http://postgr.es/m/CA+TgmoYg5e4oznn0XGoJ3+mceG1qe_JJt34rF2JLwvGS5T1hgQ@mail.gmail.com
2025-03-17 21:34:10 -07:00
Michael Paquier
17caf66445 psql: Add \sendpipeline to send query buffers while in a pipeline
In the initial pipeline support for psql added in 41625ab8ea, \g was
used as the way to push extended query into an ongoing pipeline.  \gx
was blocked.

These two meta-commands have format-related options that can be applied
when fetching a query result (expanded, etc.).  As the results of a
pipeline are fetched asynchronously, not at the moment of the
meta-command execution but at the moment of a \getresults or a
\endpipeline, authorizing \g while blocking \gx leads to a confusing
implementation, making one think that psql should be smart enough to
remember the output format options defined from the time when \g or \gx
were executed.  Doing so would lead to more code complications when
retrieving a batch of results.  There is an extra argument other than
simplicity here: the output format options defined at the point of a
\getresults or a \endpipeline execution should be what affects the output
format for a batch of results.

To avoid any confusion, we have settled to the introduction of a new
meta-command called \sendpipeline, replacing \g when within a pipeline.
An advantage of this design is that it is possible to add new options
specific to pipelines when sending a query buffer, independent of \g
and \gx, should it prove to be necessary.

Most of the changes of this commit happen in the regression tests, where
\g is replaced by \sendpipeline.  More tests are added to check that \g
is not allowed.

Per discussion between the author, Daniel Vérité and me.

Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Discussion: https://postgr.es/m/ad4b9f1a-f7fe-4ab8-8546-90754726d0be@manitou-mail.org
2025-03-18 09:41:21 +09:00
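
A usage sketch modeled on the regression tests (the parameter value is a
placeholder):

    \startpipeline
    SELECT $1 \bind 'val1' \sendpipeline
    \endpipeline
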
Andres Freund
da7226993f aio: Add core asynchronous I/O infrastructure
The main motivations to use AIO in PostgreSQL are:

a) Reduce the time spent waiting for IO by issuing IO sufficiently early.

   In a few places we have approximated this using posix_fadvise() based
   prefetching, but that is fairly limited (no completion feedback, double the
   syscalls, only works with buffered IO, only works on some OSs).

b) Allow to use Direct-I/O (DIO).

   DIO can offload most of the work for IO to hardware and thus increase
   throughput / decrease CPU utilization, as well as reduce latency.  While we
   have gained the ability to configure DIO in d4e71df6, it is not yet usable
   for real world workloads, as every IO is executed synchronously.

For portability, the new AIO infrastructure allows AIO to be implemented
using different methods. The choice of the AIO method is controlled by the new
io_method GUC. As of this commit, the only implemented method is "sync",
i.e. AIO is not actually executed asynchronously. The "sync" method exists to
allow bypassing most of the new code initially.

Subsequent commits will introduce additional IO methods, including a
cross-platform method implemented using worker processes and a linux specific
method using io_uring.

To allow different parts of postgres to use AIO, the core AIO infrastructure
does not need to know what kind of files it is operating on. The necessary
behavioral differences for different files are abstracted as "AIO
Targets". One example target would be smgr. For boring portability reasons,
all targets currently need to be added to an array in aio_target.c.  This
commit does not implement any AIO targets, just the infrastructure for
them. The smgr target will be added in a later commit.

Completion (and other events) of IOs for one type of file (i.e. one AIO
target) need to be reacted to differently, based on the IO operation and the
callsite. This is made possible by callbacks that can be registered on
IOs. E.g. an smgr read into a local buffer does not need to update the
corresponding BufferDesc (as there is none), but a read into shared buffers
does.  This commit does not contain any callbacks, they will be added in
subsequent commits.

For now the AIO infrastructure only understands READV and WRITEV operations,
but it is expected that more operations will be added. E.g. fsync/fdatasync,
flush_range and network operations like send/recv.

As of this commit, nothing uses the AIO infrastructure. Later commits will add
an smgr target, md.c and bufmgr.c callbacks and then finally use AIO for
read_stream.c IO, which, in one fell swoop, will convert all read stream users
to AIO.

The goal is to use AIO in many more places. There are patches to use AIO for
checkpointer and bgwriter that are reasonably close to being ready. There also
are prototypes to use it for WAL, relation extension, backend writes and many
more. Those prototypes were important to ensure the design of the AIO
subsystem is not too limiting (e.g. WAL writes need to happen in critical
sections, which influenced a lot of the design).

A future commit will add an AIO README explaining the AIO architecture and how
to use the AIO subsystem. The README is added later, as it references details
only added in later commits.

Many many more people than the folks named below have contributed with
feedback, work on semi-independent patches etc. E.g. various folks have
contributed patches to use the read stream infrastructure (added by Thomas in
b5a9b18cd0) in more places. Similarly, a *lot* of folks have contributed to
the CI infrastructure, which I had started to work on to make adding AIO
feasible.

Some of the work by contributors has gone into the "v1" prototype of AIO,
which heavily influenced the current design of the AIO subsystem. None of the
code from that directly survives, but without the prototype, the current
version of the AIO infrastructure would not exist.

Similarly, the reviewers below have not necessarily looked at the current
design or the whole infrastructure, but have provided very valuable input. I
am to blame for problems, not they.

Author: Andres Freund <andres@anarazel.de>
Co-authored-by: Thomas Munro <thomas.munro@gmail.com>
Co-authored-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Co-authored-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Jakub Wartak <jakub.wartak@enterprisedb.com>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Dmitry Dolgov <9erthalion6@gmail.com>
Reviewed-by: Antonin Houska <ah@cybertec.at>
Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt
Discussion: https://postgr.es/m/20210223100344.llw5an2aklengrmn@alap3.anarazel.de
Discussion: https://postgr.es/m/stj36ea6yyhoxtqkhpieia2z4krnam7qyetc57rfezgk4zgapf@gcnactj4z56m
2025-03-17 18:51:33 -04:00
Andres Freund
02844012b3 aio: Basic subsystem initialization
This commit just does the minimal wiring up of the AIO subsystem, added in the
next commit, to the rest of the system. The next commit contains more details
about motivation and architecture.

This commit is kept separate to make it easier to review, separating the
changes across the tree, from the implementation of the new subsystem.

We discussed squashing this commit with the main commit before merging AIO,
but there has been a mild preference for keeping it separate.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt
2025-03-17 18:51:33 -04:00
Nathan Bossart
65db3963ae Add commit 203c1b4cc4 to .git-blame-ignore-revs. 2025-03-17 15:58:02 -05:00
Robert Haas
203c1b4cc4 Fix indentation.
Commit 99aeb84703 wasn't fully
reindented prior to commit.
2025-03-17 16:06:17 -04:00
Nathan Bossart
7e05df430b pg_upgrade: Remove some dead code.
Since commit e469f0aaf3, tablespace_suffix can't be empty.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/Z9hc3mkYFKR56Xof%40nathan
2025-03-17 13:18:14 -05:00
Andres Freund
1a22a8a0f1 tests: Expand temp table tests to some pin related matters
Added tests:
- recovery from running out of unpinned local buffers
- that we don't run out of unpinned buffers due to read stream (only recently
  fixed, in 92fc6856cb)
- temp tables can't be dropped while in use by cursors

Discussion: weskknhckugbdm2yt7sa2uq53xlsax67gcdkac34sanb7qpd3p@hcc2wadao5wy
Discussion: https://postgr.es/m/ge6nsuddurhpmll3xj22vucvqwp4agqz6ndtcf2mhyeydzarst@l75dman5x53p
2025-03-17 14:12:44 -04:00
Robert Haas
99aeb84703 pg_combinebackup: Add -k, --link option.
This is similar to pg_upgrade's --link option, except that here we won't
typically be able to use it for every input file: sometimes we will need
to reconstruct a complete backup from blocks stored in different files.
However, when a whole file does need to be copied, we can use an
optimized copying strategy: see the existing --clone and
--copy-file-range options and the code to use CopyFile() on Windows.
This commit adds a new strategy: add a hard link to an existing file.
Making a hard link doesn't actually copy anything, but it makes sense
for the code to treat it as doing so.

This is useful when the input directories are merely staging directories
that will be removed once the restore is complete. In such cases, there
is no need to actually copy the data, and making a bunch of new hard
links can be very quick. However, it would be quite dangerous to use it
if the input directories might later be reused for any other purpose,
since starting postgres on the output directory would destructively
modify the input directories. For that reason, using this new option
causes pg_combinebackup to emit a warning about the danger involved.

Author: Israel Barth Rubio <barthisrael@gmail.com>
Co-authored-by: Robert Haas <robertmhaas@gmail.com> (cosmetic changes)
Reviewed-by: Vignesh C <vignesh21@gmail.com>
Discussion: http://postgr.es/m/CA+TgmoaEFsYHsMefNaNkU=2SnMRufKE3eVJxvAaX=OWgcnPmPg@mail.gmail.com
2025-03-17 14:03:14 -04:00
Tom Lane
ed762e9425 Unify wording of user-facing "row security" messages.
Row-level security is mostly referred to as "row security" in
user-facing messages.  Commit cd3c45125 introduced one inconsistent
use of "row level security"; make that one match the rest.

Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20250317.135305.573764276033358827.horikyota.ntt@gmail.com
2025-03-17 12:53:50 -04:00
Michael Paquier
3943f5cff6 Fix inconsistent quoting for some options in TAP tests
This commit addresses some inconsistencies with how the options of some
routines from PostgreSQL/Test/ are written, mainly for init() and
init_from_backup() in Cluster.pm.  These are written unquoted, except
in the locations updated here.

Changes extracted from a larger patch by the same author.

Author: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Discussion: https://postgr.es/m/87jz8rzf3h.fsf@wibble.ilmari.org
2025-03-17 14:07:12 +09:00
Michael Paquier
19c6e92b13 Apply more consistent style for command options in TAP tests
This commit reshapes the grammar of some commands to apply a more
consistent style across the board, following rules similar to
ce1b0f9da03e:
- Elimination of some pointless used-once variables.
- Use of long options, to self-document better the options used.
- Use of fat commas to link option names and their assigned values,
including redirections, so that perltidy can be tricked into putting
them together.

Author: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Discussion: https://postgr.es/m/87jz8rzf3h.fsf@wibble.ilmari.org
2025-03-17 12:42:23 +09:00
Michael Paquier
5721e5453e Revert "Add redo LSN to pgstats files"
This reverts commit b860848232, which was added as a prerequisite for
the support of pgstats data flush across checkpoints, linking a pgstats
file to a specific checkpoint redo LSN.

As reported, this is proving to be problematic when going through
pg_upgrade, which directly manipulates the control file in the new
cluster.  The LSN stored in the pgstats file is not yet able to cope
with changes done to the control file by pg_upgrade, causing the
pgstats file to be discarded when starting the new cluster after its
redo LSN has been overridden (one such case is `pg_resetwal -l`, where
the new cluster's start LSN is bumped by a hardcoded value of 8
segments, see copy_xact_xlog_xid).

The least painful path going forward is likely a refactor of the
pgstats code so that it is possible to read and write some of its data
with routines in src/common/, allowing pg_upgrade or pg_resetwal to
update that data.  The main point is that we are going to need an
LSN in the stats file should we make it written at checkpoint time and
not only as part of a shutdown sequence.  It is too late to dive into
these details for v18, so let's revert the change, and let's try to
figure out all the details in the next release cycle.  The pgstats file
is currently only written as part of a shutdown sequence, and its
contents are still lost on crash, same as older releases.

Bump PGSTAT_FILE_FORMAT_ID.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/2563883.1741826489@sss.pgh.pa.us
2025-03-17 08:35:12 +09:00
Tom Lane
cd3c45125d pg_dump, pg_dumpall, pg_restore: Add --no-policies option.
Add --no-policies option to control row level security policy handling
in dump and restore operations. When this option is used, both CREATE
POLICY commands and ALTER TABLE ... ENABLE ROW LEVEL SECURITY commands
are excluded from dumps and skipped during restores.

This is useful in scenarios where policies need to be redefined in the
target system or when moving data between environments with different
security requirements.

Author: Nikolay Samokhvalov <nik@postgres.ai>
Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com>
Reviewed-by: Jim Jones <jim.jones@uni-muenster.de>
Reviewed-by: newtglobal postgresql_contributors <postgresql_contributors@newtglobalcorp.com>
Discussion: https://postgr.es/m/CAM527d8kG2qPKvbfJ=OYJkT7iRNd623Bk+m-a4ngm+nyHYsHog@mail.gmail.com
2025-03-16 18:08:15 -04:00
Tom Lane
4489044239 contrib/isn: Make weak mode a GUC setting, and fix related functions.
isn's weak mode used to be a simple static variable, settable only
via the isn_weak(boolean) function.  This wasn't optimal, as it
meant the setting didn't respect transactions or respond to RESET ALL.

This patch makes isn.weak a GUC parameter instead, so that
it acts like any other user-settable parameter.

The isn_weak() functions are retained for backwards compatibility.
But we must fix their volatility markings: they were marked IMMUTABLE
which is surely incorrect, and PARALLEL RESTRICTED which isn't right
for GUC-related functions either.  Mark isn_weak(boolean) as
VOLATILE and PARALLEL UNSAFE, matching set_config().  Mark isn_weak()
as STABLE and PARALLEL SAFE, matching current_setting().
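
For illustration, a minimal sketch of the new GUC-style usage (assumes
the isn extension is installed; exact scoping follows normal GUC rules):

    SET isn.weak = on;                    -- transactional, honors RESET
    SELECT current_setting('isn.weak');   -- 'on'
    SELECT isn_weak(false);               -- legacy setter, now VOLATILE, PARALLEL UNSAFE
    SELECT isn_weak();                    -- legacy getter, now STABLE, PARALLEL SAFE
    RESET ALL;                            -- now also resets isn.weak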

Reported-by: Viktor Holmberg <v@viktorh.net>
Diagnosed-by: Daniel Gustafsson <daniel@yesql.se>
Author: Viktor Holmberg <v@viktorh.net>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/790bc1f9-74dc-4b50-94d2-8147315b1556@Spark
2025-03-16 13:45:48 -04:00
Alexander Korotkov
682c5be25c reindexdb: Fix the index-level REINDEX with multiple jobs
47f99a407d introduced a parallel index-level REINDEX.  The code was written
assuming that running run_reindex_command() with 'async == true' can schedule
a number of queries for a connection.  That's not true, and the second query
sent using run_reindex_command() will wait for the completion of the previous
one.

This commit fixes that by putting REINDEX commands for the same table into a
single query.

Also, this commit removes the 'async' argument from run_reindex_command(),
as its only caller always passes 'async == true'.

Reported-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/202503071820.j25zn3lo4hvn%40alvherre.pgsql
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Backpatch-through: 17
2025-03-16 13:29:15 +02:00
Michael Paquier
83e5763d4d pg_createsubscriber: Remove some code bloat in the atexit() callback
This commit adjusts some code added by e117cfb2f6 in the atexit()
callback of pg_createsubscriber.c, in charge of performing post-failure
cleanup actions.  The code loops over all the databases specified, and
it is changed here to rely on a single LogicalRepInfo for each database
rather than always using LogicalRepInfos, simplifying its logic.

Author: Peter Smith <smithpb2250@gmail.com>
Discussion: https://postgr.es/m/CAHut+PtdBSVi4iH7BObDVwDNVwOpn+H3fezOBdSTtENx+rhNMw@mail.gmail.com
2025-03-16 19:20:49 +09:00
Andres Freund
771ba90298 localbuf: Introduce StartLocalBufferIO()
To initiate IO on a shared buffer we have StartBufferIO(). For temporary table
buffers no similar function exists - likely because the code for that
currently is very simple due to the lack of concurrency.

However, the upcoming AIO support will make it possible to re-encounter a
local buffer, while the buffer already is the target of IO. In that case we
need to wait for already in-progress IO to complete. This commit makes it
easier to add the necessary code, by introducing StartLocalBufferIO().

Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CAAKRu_b9anbWzEs5AAF9WCvcEVmgz-1AkHSQ-CLLy-p7WHzvFw@mail.gmail.com
2025-03-15 22:07:48 -04:00
Andres Freund
4b4d33b9ea localbuf: Introduce FlushLocalBuffer()
Previously we had two code paths for writing out temporary table
buffers. For shared buffers, the logic for that is centralized in
FlushBuffer(). Introduce FlushLocalBuffer() to do the same for local buffers.

Besides being a nice cleanup on its own, it also makes an upcoming change
slightly easier.

Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CAAKRu_b9anbWzEs5AAF9WCvcEVmgz-1AkHSQ-CLLy-p7WHzvFw@mail.gmail.com
2025-03-15 22:07:48 -04:00
Andres Freund
dd6f2618f6 localbuf: Introduce TerminateLocalBufferIO()
Previously TerminateLocalBufferIO() was open-coded in multiple places, which
doesn't seem like a great idea. While TerminateLocalBufferIO() currently is
rather simple, an upcoming patch requires additional code to be added to
TerminateLocalBufferIO(), making this modification particularly worthwhile.

For some reason FlushRelationBuffers() previously cleared BM_JUST_DIRTIED,
even though that's never set for temporary buffers. This is not carried over
as part of this change.

Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CAAKRu_b9anbWzEs5AAF9WCvcEVmgz-1AkHSQ-CLLy-p7WHzvFw@mail.gmail.com
2025-03-15 22:07:48 -04:00
Andres Freund
0762a151b0 localbuf: Introduce InvalidateLocalBuffer()
Previously, there were three copies of this code, two of them
identical. There's no good reason for that.

This change is nice on its own, but the main motivation is the AIO patchset,
which needs to add extra checks to the deduplicated code, which is of course
easier if there is only one version.

Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CAAKRu_b9anbWzEs5AAF9WCvcEVmgz-1AkHSQ-CLLy-p7WHzvFw@mail.gmail.com
2025-03-15 22:07:48 -04:00
Andres Freund
fa6af9b25e localbuf: Fix dangerous coding pattern in GetLocalVictimBuffer()
If PinLocalBuffer() were to modify the buf_state, the buf_state in
GetLocalVictimBuffer() would be out of date. Currently that does not happen,
as PinLocalBuffer() only modifies the buf_state if adjust_usagecount=true and
GetLocalVictimBuffer() passes false.

However, it's easy to make this not the case anymore - it cost me a few hours
to debug the consequences.

The minimal fix would be to just refetch the buf_state after calling
PinLocalBuffer(), but the same danger exists in later parts of the
function. Instead, declare buf_state in the narrower scopes and re-read the
state in conditional branches.  Besides being safer, it also fits well with
an upcoming set of cleanup patches that move the contents of the conditional
branches in GetLocalVictimBuffer() into helper functions.

I "broke" this in 794f259447.

Arguably this should be backpatched, but as the relevant functions are not
exported and there is no actual misbehaviour, I chose to not backpatch, at
least for now.

Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CAAKRu_b9anbWzEs5AAF9WCvcEVmgz-1AkHSQ-CLLy-p7WHzvFw@mail.gmail.com
2025-03-15 22:07:48 -04:00
Andrew Dunstan
5eabd91a83 Silence perl critic
Commit 27bdec0684 uses a loop variable that is not strictly local to
the loop. Perlcritic disapproves, and there's really no reason for it,
as the variable is not used outside the loop.

Per buildfarm animals koel and crake.
2025-03-15 17:41:54 -04:00
Jeff Davis
27bdec0684 Optimization for lower(), upper(), casefold() functions.
Improve performance and reduce table sizes for case mapping.

The main case mapping table stores only 16-bit offsets, which can be
used to look up the mapped code point in any of the case tables (fold,
lower, upper, or title case). Simple case pairs point to the same
offsets.

Generate a function in generate-unicode_case_table.pl that consists of
nested branches to test for specific codepoint ranges that determine
the offset in the main table.

Other approaches were considered, such as representing these ranges as
another structure (rather than branches in a generated function), or a
different approach such as a radix tree, or perfect hashing. The
author implemented and tested these alternatives and settled on the
generated branches.

Author: Alexander Borisov <lex.borisov@gmail.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/7cac7e66-9a3b-4e3f-a997-42aa0c401f80%40gmail.com
2025-03-15 13:00:50 -07:00
Melanie Plageman
c3953226a0 Remove table AM callback scan_bitmap_next_block
After pushing the bitmap iterator into table-AM specific code (as part
of making bitmap heap scan use the read stream API in 2b73a8cd33),
scan_bitmap_next_block() no longer returns the current block number.
Since scan_bitmap_next_block() isn't returning any relevant information
to bitmap table scan code, it makes more sense to get rid of it.

Now, bitmap table scan code only calls table_scan_bitmap_next_tuple(),
and the heap AM implementation of scan_bitmap_next_block() is a local
helper in heapam_handler.c.

Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/flat/CAAKRu_ZwCwWFeL_H3ia26bP2e7HiKLWt0ZmGXPVwPO6uXq0vaA%40mail.gmail.com
2025-03-15 10:37:46 -04:00
Melanie Plageman
2b73a8cd33 BitmapHeapScan uses the read stream API
Make Bitmap Heap Scan use the read stream API instead of invoking
ReadBuffer() for each block indicated by the bitmap.

The read stream API handles prefetching, so remove all of the explicit
prefetching from bitmap heap scan code.

Now, heap table AM implements a read stream callback which uses the
bitmap iterator to return the next required block to the read stream
code.

Tomas Vondra conducted extensive regression testing of this feature.
Andres Freund, Thomas Munro, and I analyzed regressions and Thomas Munro
patched the read stream API.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Tested-by: Tomas Vondra <tomas@vondra.me>
Tested-by: Andres Freund <andres@anarazel.de>
Tested-by: Thomas Munro <thomas.munro@gmail.com>
Tested-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/flat/CAAKRu_ZwCwWFeL_H3ia26bP2e7HiKLWt0ZmGXPVwPO6uXq0vaA%40mail.gmail.com
2025-03-15 10:34:42 -04:00
Melanie Plageman
944e81bf99 Separate TBM[Shared|Private]Iterator and TBMIterateResult
Remove the TBMIterateResult member from the TBMPrivateIterator and
TBMSharedIterator and make tbm_[shared|private_]iterate() take a
TBMIterateResult as a parameter.

This allows tidbitmap API users to manage multiple TBMIterateResults per
scan. This is required for bitmap heap scan to use the read stream API,
with which there may be multiple I/Os in flight at once, each one with a
TBMIterateResult.

Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/d4bb26c9-fe07-439e-ac53-c0e244387e01%40vondra.me
2025-03-15 10:11:19 -04:00
Thomas Munro
799959dc7c Simplify distance heuristics in read_stream.c.
Make the distance control heuristics simpler and more aggressive in
preparation for asynchronous I/O.

The v17 version of read_stream.c made a conservative choice to limit the
look-ahead distance when streaming sequential blocks, because it
couldn't benefit very much from looking ahead further yet.  It had a
three-behavior model where only random I/O would rapidly increase the
look-ahead distance, to support read-ahead advice.  Sequential I/O would
move it towards the io_combine_limit setting, just enough to build one
full-sized synchronous I/O at a time, and then expect kernel read-ahead
to avoid I/O stalls.

That already left I/O performance on the table with advice-based I/O
concurrency, since sequential blocks could be followed by random jumps,
eg with the proposed streaming Bitmap Heap Scan patch.

It is time to delete the cautious middle option and adjust the distance
based on recent I/O needs only, since asynchronous reads will need to be
started ahead of time whether random or sequential.  It is still limited
by io_combine_limit, *_io_concurrency, buffer availability and
strategy ring size, as before.

Reviewed-by: Andres Freund <andres@anarazel.de> (earlier version)
Tested-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGK_%3D4CVmMHvsHjOVrK6t4F%3DLBpFzsrr3R%2BaJYN8kcTfWg%40mail.gmail.com
2025-03-16 03:05:07 +13:00
Thomas Munro
7ea8cd1566 Improve read_stream.c advice for dense streams.
read_stream.c tries not to issue read-ahead advice when it thinks the
kernel's own read-ahead should be active, ie when using buffered I/O and
reading sequential blocks.  It previously gave up too easily, and issued
advice only for the first read of up to io_combine_limit blocks in a
larger range of sequential blocks after a random jump.  The following read
could suffer an avoidable I/O stall.

Fix, by continuing to issue advice until the corresponding preadv()
calls catch up with the start of the region we're currently issuing
advice for, if ever.  That's when the kernel actually sees the
sequential pattern.  Advice is now disabled only when the stream is
entirely sequential as far as we can see in the look-ahead window, or
in other words, when a sequential region is larger than we can cover
with the current io_concurrency and io_combine_limit settings.

While refactoring the advice control logic, also get rid of the
"suppress_advice" argument that was passed around between functions to
skip useless posix_fadvise() calls immediately followed by preadv().
read_stream_start_pending_read() can figure that out, so let's
concentrate knowledge of advice heuristics in fewer places (our goal
being to make advice-based I/O concurrency a legacy mode soon).

The problem cases were revealed by Tomas Vondra's extensive regression
testing with many different disk access patterns using Melanie
Plageman's streaming Bitmap Heap Scan patch, in a battle against the
venerable always-issue-advice-and-always-one-block-at-a-time code.

Reviewed-by: Andres Freund <andres@anarazel.de> (earlier version)
Reported-by: Melanie Plageman <melanieplageman@gmail.com>
Reported-by: Tomas Vondra <tomas@vondra.me>
Reported-by: Andres Freund <andres@anarazel.de>
Tested-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGK_%3D4CVmMHvsHjOVrK6t4F%3DLBpFzsrr3R%2BaJYN8kcTfWg%40mail.gmail.com
Discussion: https://postgr.es/m/CA%2BhUKGJ3HSWciQCz8ekP1Zn7N213RfA4nbuotQawfpq23%2Bw-5Q%40mail.gmail.com
2025-03-15 19:04:54 +13:00
Álvaro Herrera
11bd831860
doc: Explain more thoroughly when a table rewrite is needed
Author: Masahiro Ikeda <ikedamsh@oss.nttdata.com>
Reviewed-by: Robert Treat <rob@xzilla.net>
Discussion: https://postgr.es/m/00e6eb5f5c793b8ef722252c7a519c9a@oss.nttdata.com
2025-03-14 20:44:59 +01:00
Tom Lane
1c9242b2cd Doc: remove obsolete comment.
This para should have been removed by 2f9661311, which made it
both false and irrelevant.  Noted while looking at SQL function
plancache patch.
2025-03-14 14:08:47 -04:00
Fujii Masao
6d376c3b0d Add GUC option to log lock acquisition failures.
This commit introduces a new GUC, log_lock_failure, which controls whether
a detailed log message is produced when a lock acquisition fails. Currently,
it only supports logging lock failures caused by SELECT ... NOWAIT.

The log message includes information about all processes holding or
waiting for the lock that couldn't be acquired, helping users analyze and
diagnose the causes of lock failures.

Currently, this option does not log failures from SELECT ... SKIP LOCKED,
as that could generate excessive log messages if many locks are skipped,
causing unnecessary noise.

This mechanism can be extended in the future to support logging
lock failures from other commands, such as LOCK TABLE ... NOWAIT.
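
A minimal usage sketch (table name hypothetical; setting the GUC may
require appropriate privileges):

    SET log_lock_failure = on;
    BEGIN;
    SELECT * FROM accounts WHERE id = 1 FOR UPDATE NOWAIT;
    -- if the row lock cannot be acquired, the server log now details
    -- the processes holding or waiting for the lock
    ROLLBACK;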

Author: Yuki Seino <seinoyu@oss.nttdata.com>
Co-authored-by: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl>
Discussion: https://postgr.es/m/411280a186cc26ef7034e0f2dfe54131@oss.nttdata.com
2025-03-14 23:14:12 +09:00
Fujii Masao
e80171d57c Optimize iteration over PGPROC for fast-path lock searches.
This commit improves efficiency in FastPathTransferRelationLocks()
and GetLockConflicts(), which iterate over PGPROCs to search for
fast-path locks.

Previously, these functions recalculated the fast-path group during
every loop iteration, even though it remained constant. This update
optimizes the process by calculating the group once and reusing it
throughout the loop.

The functions also now skip empty fast-path groups, avoiding
unnecessary scans of their slots. Additionally, groups belonging to
inactive backends (with pid=0) are always empty, so checking
the group is sufficient to bypass these backends, further enhancing
performance.

Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://postgr.es/m/07d5fd6a-71f1-4ce8-8602-4cc6883f4bd1@oss.nttdata.com
2025-03-14 22:49:29 +09:00
Peter Eisentraut
a359d37019 Simplify and generalize PrepareSortSupportFromIndexRel()
PrepareSortSupportFromIndexRel() was accepting btree strategy numbers
purely for the purpose of comparing them later against btree strategies
to determine if the sort direction was forward or reverse.  Change
that.  Instead, pass a bool directly, to indicate the same without an
unfortunate assumption that a strategy number refers specifically to a
btree strategy.  (This is similar in spirit to commits 0d2aa4d493 and
c594f1ad2ba.)

(This could arguably be simplified further by having the callers fill
in ssup_reverse directly.  But this way, it preserves consistency by
having all PrepareSortSupport*() variants be responsible for filling
in ssup_reverse.)

Moreover, remove the hardcoded check against BTREE_AM_OID, and check
against amcanorder instead, which is the actual requirement.

Co-authored-by: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2025-03-14 10:34:08 +01:00
Álvaro Herrera
1548c3a304
Remove direct handling of reloptions for toast tables
It doesn't actually work, even with allow_system_table_mods turned on:
the ALTER TABLE operation is rejected by ATSimplePermissions(), so even
the error message we're adding in this commit is unreachable.

Add a test case for it.

Author: Nikolay Shaplov <dhyan@nataraj.su>
Discussion: https://postgr.es/m/1913854.tdWV9SEqCh@thinkpad-pgpro
2025-03-14 09:28:51 +01:00
Thomas Munro
92fc6856cb Respect changing pin limits in read_stream.c.
To avoid pinning too much of the buffer pool at once, read_stream.c
previously used LimitAdditionalPins().  The coding was naive, and only
considered the available buffers at stream construction time.

This commit checks before each StartReadBuffers() call with
GetAdditionalPinLimit().  The result might change over time due to pins
acquired outside this stream by the same backend.  No extra CPU cycles
are added to the all-buffered fast-path code, but the I/O-starting path
now considers the up-to-date remaining buffer limit.

In practice it was quite difficult to exceed limits and cause any real
problems in v17, so no back-patch for now, but proposed changes will
make it easier.

Per code review from Andres, in the course of testing his AIO patches.

Reviewed-by: Andres Freund <andres@anarazel.de> (earlier versions)
Discussion: https://postgr.es/m/CA%2BhUKGK_%3D4CVmMHvsHjOVrK6t4F%3DLBpFzsrr3R%2BaJYN8kcTfWg%40mail.gmail.com
2025-03-14 21:21:09 +13:00
Peter Eisentraut
0793ab8100 Activate Python "Limited API" in PL/Python
This allows building PL/Python against any Python 3.x version and
using another Python 3.x version at run time.  This is useful for
installers that want to run against a separately downloaded Python, so
that they don't have to bundle it themselves.

This builds on the earlier patch to only use APIs supported by the
Limited API.

At the moment, this is not activated on MSVC because that leads to
build failures that no one could explain or cared enough to address.
This could be done later.

Reviewed-by: Jakob Egger <jakob@eggerapps.at>
Discussion: https://www.postgresql.org/message-id/flat/ee410de1-1e0b-4770-b125-eeefd4726a24@eisentraut.org
2025-03-14 08:57:02 +01:00
Peter Eisentraut
05cbd6cb22 Swap order of extern/static and pg_nodiscard
When pg_nodiscard was first added, the C standard draft had it as a
function specifier, and so the code comment about placement was
written with that in mind.  The final C23 standard has it as an
attribute and the placement rules are a bit different for that.
Specifically, it needs to be before extern or static.  (Or at least
both current clang and gcc require that.)  So just swap these.  (To be
clear: The current implementation with gcc attributes doesn't care.
This change is just for maximum forward compatibility for non-gcc
compilers.)  This also keeps the order consistent with the previously
introduced pg_noreturn.  Also update the code comment to reflect the
mentioned developments since its introduction.

Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/flat/pxr5b3z7jmkpenssra5zroxi7qzzp6eswuggokw64axmdixpnk@zbwxuq7gbbcw
2025-03-14 07:18:07 +01:00
Thomas Munro
01261fb078 Improve buffer manager API for backend pin limits.
Previously the support functions assumed that the caller needed one pin
to make progress, and could optionally use some more, allowing enough
for every connection to do the same.  Add a couple more functions for
callers that want to know:

* what the maximum possible number could be, irrespective of currently
  held pins, for space planning purposes

* how many additional pins they could acquire right now, without the
  special case allowing one pin, for callers that already hold pins and
  could already make progress even if no extra pins are available

The pin limit logic began in commit 31966b15.  This refactoring is
better suited to read_stream.c, which will be adjusted to respect the
remaining limit as it changes over time in a follow-up commit.  It also
computes MaxProportionalPins up front, to avoid performing divisions
whenever a caller needs to check the balance.

Reviewed-by: Andres Freund <andres@anarazel.de> (earlier versions)
Discussion: https://postgr.es/m/CA%2BhUKGK_%3D4CVmMHvsHjOVrK6t4F%3DLBpFzsrr3R%2BaJYN8kcTfWg%40mail.gmail.com
2025-03-14 17:13:09 +13:00
Amit Kapila
7c99dc587a Fix ALTER SUBSCRIPTION ... SET PUBLICATION ... command.
The problem is that ALTER SUBSCRIPTION ... SET PUBLICATION ... will lead
to restarting of apply worker and after the restart, the apply worker will
use the existing slot and replication origin corresponding to the
subscription. Now, it is possible that before the restart, the origin has
not been updated, and the WAL start location points to a position before
the publication specified by SET PUBLICATION was created, so that
publication doesn't exist yet.  That can lead to an error like:
"ERROR:  publication "pub1" does not exist".
Once this error occurs, apply worker will never be able to proceed and
will always return the same error.

We decided to skip loading the publication if the publication does not
exist. The publication is loaded later and updates the relation entry when
the publication gets created.
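
The failure shape, sketched (subscription and publication names
hypothetical):

    ALTER SUBSCRIPTION sub1 SET PUBLICATION pub_new;
    -- before: the restarted apply worker could fail repeatedly with
    --   ERROR:  publication "pub_new" does not exist
    -- after: the missing publication is skipped and loaded once created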

We decided not to backpatch this as this is a behaviour change, and we don't
see field reports. This problem has been found by intermittent buildfarm
failures.

Author: vignesh C <vignesh21@gmail.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/flat/CALDaNm0-n8FGAorM%2BbTxkzn%2BAOUyx5%3DL_XmnvOP6T24%2B-NcBKg%40mail.gmail.com
Discussion: https://postgr.es/m/CAA4eK1+T-ETXeRM4DHWzGxBpKafLCp__5bPA_QZfFQp7-0wj4Q@mail.gmail.com
2025-03-14 08:57:40 +05:30
Tom Lane
4618045bee Fix ARRAY_SUBLINK and ARRAY[] for int2vector and oidvector input.
If the given input_type yields valid results from both
get_element_type and get_array_type, initArrayResultAny believed the
former and treated the input as an array type.  However this is
inconsistent with what get_promoted_array_type does, leading to
situations where the output of an ARRAY() subquery is labeled with
the wrong type: it's labeled as oidvector[] but is really a 2-D
array of OID.  That at least results in strange output, and can
result in crashes if further processing such as unnest() is applied.
AFAIK this is only possible with the int2vector and oidvector
types, which are special-cased to be treated mostly as true arrays
even though they aren't quite.

Fix by switching the logic to match get_promoted_array_type by
testing get_array_type not get_element_type, and remove an Assert
thereby made pointless.  (We need not introduce a symmetrical
check for get_element_type in the other if-branch, because
initArrayResultArr will check it.)  This restores the behavior
that existed before bac27394a introduced initArrayResultAny:
the output really is int2vector[] or oidvector[].

Comparable confusion exists when an input of an ARRAY[] construct
is int2vector or oidvector: transformArrayExpr decides it's dealing
with a multidimensional array constructor, and we end up with
something that's a multidimensional OID array but is alleged to be
of type oidvector.  I have not found a crashing case here, but it's
easy to demonstrate totally-wrong results.  Adjust that code so
that what you get is an oidvector[] instead, for consistency with
ARRAY() subqueries.  (This change also makes these types work like
domains-over-arrays in this context, which seems correct.)
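
A quick sketch of the corrected labeling (behavior as described above):

    SELECT pg_typeof(ARRAY(SELECT '1 2'::int2vector));  -- int2vector[] again
    SELECT pg_typeof(ARRAY['1 2'::oidvector]);          -- now oidvector[]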

Bug: #18840
Reported-by: yang lei <ylshiyu@126.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/18840-fbc9505f066e50d6@postgresql.org
Backpatch-through: 13
2025-03-13 16:07:55 -04:00
Álvaro Herrera
c7fc8808a9
ATExecSetRelOptions: Reduce scope of 'isnull' variable
Author: Nikolay Shaplov <dhyan@nataraj.su>
Reviewed-by: Timur Magomedov <t.magomedov@postgrespro.ru>
Discussion: https://postgr.es/m/1913854.tdWV9SEqCh@thinkpad-pgpro
2025-03-13 18:15:59 +01:00
Álvaro Herrera
da0f0582e8
Make lwlocknames.h generated file less ugly
We can make the output look a bit better by aligning each lock's
definition, so add some padding space to achieve that.  This change
makes no practical difference, but casual onlookers will be less
distracted by (lack of) whitespace.

Author: Gurjeet Singh <gurjeet@singh.im>
Discussion: https://postgr.es/m/CABwTF4VxfwDtRV-H22_XK4XeDogaV-Vaobu+af5U=8ZAZn9ZZQ@mail.gmail.com
2025-03-13 17:38:21 +01:00
Nathan Bossart
0697b23906 Add reverse(bytea).
This commit introduces a function for reversing the order of the
bytes in binary strings.

Bumps catversion.
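
A quick example of the new function:

    SELECT reverse('\xdeadbeef'::bytea);  -- \xefbeadde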

Author: Aleksander Alekseev <aleksander@timescale.com>
Discussion: https://postgr.es/m/CAJ7c6TMe0QVRuNssUArbMi0bJJK32%2BzNA3at5m3osrBQ25MHuw%40mail.gmail.com
2025-03-13 11:20:53 -05:00
Peter Eisentraut
bb25276205 Fix copy-and-paste mistake in error message
Introduced in commit a68159ff2b.
2025-03-13 15:17:08 +01:00
Peter Eisentraut
3691edfab9 pg_noreturn to replace pg_attribute_noreturn()
We want to support a "noreturn" decoration on more compilers besides
just GCC-compatible ones, but for that we need to move the decoration
in front of the function declaration instead of either behind it or
wherever, which is the current style afforded by GCC-style attributes.
Also rename the macro to "pg_noreturn" to be similar to the C11
standard "noreturn".

pg_noreturn is now supported on all compilers that support C11 (using
_Noreturn), as well as GCC-compatible ones (using __attribute__, as
before), as well as MSVC (using __declspec).  (When PostgreSQL
requires C11, the latter two variants can be dropped.)

Now, all supported compilers effectively support pg_noreturn, so the
extra code for !HAVE_PG_ATTRIBUTE_NORETURN can be dropped.

This also fixes a possible problem if third-party code includes
stdnoreturn.h, because then the current definition of

    #define pg_attribute_noreturn() __attribute__((noreturn))

would cause an error.

Note that the C standard does not support a noreturn attribute on
function pointer types.  So we have to drop these here.  There are
only two instances at this time, so it's not a big loss.  In one case,
we can make up for it by adding the pg_noreturn to a wrapper function
and adding a pg_unreachable(), in the other case, the latter was
already done before.

Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/flat/pxr5b3z7jmkpenssra5zroxi7qzzp6eswuggokw64axmdixpnk@zbwxuq7gbbcw
2025-03-13 12:37:26 +01:00
Richard Guo
cc5d98525d Fix incorrect handling of subquery pullup
When pulling up a subquery, if the subquery's target list items are
used in grouping set columns, we need to wrap them in PlaceHolderVars.
This ensures that expressions retain their separate identity so that
they will match grouping set columns when appropriate.

In 90947674f, we decided to wrap subquery outputs that are non-var
expressions in PlaceHolderVars.  This prevents const-simplification
from merging them into the surrounding expressions after subquery
pullup, which could otherwise lead to failing to match those
subexpressions to grouping set columns, with the effect that they'd
not go to null when expected.

However, that left some loose ends.  If the subquery's target list
contains two or more identical Var expressions, we can still fail to
match the Var expression to the expected grouping set expression.
This is not related to const-simplification, but rather to how we
match expressions to lower target items in setrefs.c.

For sort/group expressions, we use ressortgroupref matching, which
works well.  For other expressions, we primarily rely on comparing the
expressions to determine if they are the same.  Therefore, we need a
way to prevent setrefs.c from matching the expression to some other
identical ones.

To fix, wrap all subquery outputs in PlaceHolderVars if the parent
query uses grouping sets, ensuring that they preserve their separate
identity throughout the whole planning process.

Reported-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://postgr.es/m/CAMbWs4-meSahaanKskpBn0KKxdHAXC1_EJCVWHxEodqirrGJnw@mail.gmail.com
2025-03-13 16:36:03 +09:00
Richard Guo
4c49611715 Remove code setting wrap_non_vars to true for UNION ALL subqueries
In pull_up_simple_subquery and pull_up_constant_function, there is
code that sets wrap_non_vars to true when dealing with an appendrel
member.  The goal is to wrap subquery outputs that are not simple Vars
in PlaceHolderVars, ensuring that what we pull up doesn't get merged
into a surrounding expression during later processing, which could
cause it to fail to match the expression actually available from the
appendrel.

However, this is unnecessary.  When pulling up an appendrel child
subquery, the only part of the upper query that could reference the
appendrel child yet is the translated_vars list of the associated
AppendRelInfo that we just made for this child.  Furthermore, we do
not want to force use of PHVs in the AppendRelInfo, as there is no
outer join in between.  In fact, perform_pullup_replace_vars always sets
wrap_non_vars to false before performing pullup_replace_vars on the
AppendRelInfo.

This patch simply removes the code that sets wrap_non_vars to true for
UNION ALL subqueries.

Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://postgr.es/m/CAMbWs4-VXDEi1v+hZYLxpOv0riJxHsCkCH1f46tLnhonEAyGCQ@mail.gmail.com
2025-03-13 16:34:28 +09:00
Jeff Davis
d3b2e5e1ab Refactor convert_case() to prepare for optimizations.
Upcoming optimizations will add complexity to convert_case(). This
patch reorganizes slightly so that the complexity can be contained
within the logic to convert the case of a single character, rather
than mixing it in with logic to iterate through the string.

Reviewed-by: Alexander Borisov <lex.borisov@gmail.com>
Discussion: https://postgr.es/m/44005c3d-88f4-4a26-981f-fd82dfa8e313@gmail.com
2025-03-12 21:51:52 -07:00
Amit Kapila
3abe9dc188 Avoid invalidating all RelationSyncCache entries on publication rename.
On Publication rename, we need to only invalidate the RelationSyncCache
entries corresponding to relations that are part of the publication being
renamed.

As part of this patch, we introduce a new invalidation message to
invalidate the cache maintained by the logical decoding output plugin. We
can't use existing relcache invalidation for this purpose, as that would
unnecessarily cause relcache invalidations in other backends.

This will improve performance by building fewer relation cache entries
during logical replication.
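
Sketch of the affected operation (publication name hypothetical):

    ALTER PUBLICATION pub_orders RENAME TO pub_sales;
    -- only RelationSyncCache entries for relations in the renamed
    -- publication are invalidated, not the whole cache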

Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Author: Shlok Kyal <shlok.kyal.oss@gmail.com>
Reviewed-by: Hou Zhijie <houzj.fnst@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/OSCPR01MB14966C09AA201EFFA706576A7F5C92@OSCPR01MB14966.jpnprd01.prod.outlook.com
2025-03-13 09:16:33 +05:30
Thomas Munro
75da2bece6 Fix read_stream.c for changing io_combine_limit.
In a couple of places, read_stream.c assumed that io_combine_limit would
be stable during the lifetime of a stream.  That is not true in at least
one unusual case: streams held by CURSORs where you could change the GUC
between FETCH commands, with unpredictable results.

Fix, by storing stream->io_combine_limit and referring only to that
after construction.  This mirrors the treatment of the other important
setting {effective,maintenance}_io_concurrency, which is stored in
stream->max_ios.
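
The unusual case described above, sketched (cursor and table names
hypothetical):

    BEGIN;
    DECLARE c CURSOR FOR SELECT * FROM big_table;
    FETCH 1000 FROM c;
    SET io_combine_limit = 32;  -- changed between FETCH commands...
    FETCH 1000 FROM c;          -- ...the stream keeps its construction-time value
    COMMIT;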

One of the cases was the queue overflow space, which was sized for
io_combine_limit and could be overrun if the GUC was increased.  Since
that coding was a little hard to follow, also introduce a variable for
better readability instead of open-coding the arithmetic.  Doing so
revealed an off-by-one thinko while clamping max_pinned_buffers to
INT16_MAX, though that wasn't a live bug due to the current limits on
GUC values.

Back-patch to 17.

Discussion: https://postgr.es/m/CA%2BhUKG%2B2T9p-%2BzM6Eeou-RAJjTML6eit1qn26f9twznX59qtCA%40mail.gmail.com
2025-03-13 15:43:34 +13:00
Amit Langote
d4f79865d4 Fix copy-paste error in datum_to_jsonb_internal()
Commit 3c152a27b0 mistakenly repeated JSONTYPE_JSON in a condition,
omitting JSONTYPE_CAST. As a result, datum_to_jsonb_internal() failed
to reject inputs that were casts (e.g., from an enum to json as in the
example below) when used as keys in JSON constructors.

This led to a crash in cases like:

  SELECT JSON_OBJECT('happy'::mood: '123'::jsonb);

where 'happy'::mood is implicitly cast to json. The missing check
meant such cast values weren't properly rejected as invalid
(non-scalar) JSON keys.

Reported-by: Maciek Sakrejda <maciek@pganalyze.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Maciek Sakrejda <maciek@pganalyze.com>
Discussion: https://postgr.es/m/CADXhmgTJtJZK9A3Na_ry+Xrq-ghjcejBRhcRMzWZvbd__QdgJA@mail.gmail.com
Backpatch-through: 17
2025-03-13 09:56:36 +09:00
Masahiko Sawada
4ecdd4110d pg_rewind: Add dbname to primary_conninfo when using --write-recovery-conf.
This commit enhances pg_rewind's --write-recovery-conf option to
include the dbname in the generated primary_conninfo value when
specified in the --source-server option. With this modification, the
rewound server can connect to the primary server without manual
configuration file modifications when sync_replication_slots is
enabled.

Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Discussion: https://postgr.es/m/CAD21AoAkW=Ht0k9dVoBTCcqLiiZ2MXhVr+d=j2T_EZMerGrLWQ@mail.gmail.com
2025-03-12 16:56:04 -07:00
David Rowley
cdc1471cc7 Add b955df443 to .git-blame-ignore-revs 2025-03-13 12:44:26 +13:00
David Rowley
b955df4434 Fix indentation issue
Introduced recently by 9e088f7dd

Per buildfarm member koel
2025-03-13 12:41:44 +13:00
Masahiko Sawada
9e088f7dd8 Fix compiler warning in pg_logicalinspect.
Oversight in bd65cb3cd4.

Reported-by: David Rowley <dgrowleyml@gmail.com>
Reported-by: Nathan Bossart <nathandbossart@gmail.com>
Author: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CAApHDvqrhFfnetbcwgGkJ=z63T8HfQ_OyP=vX8BYiXyxFKt67w@mail.gmail.com
2025-03-12 14:23:56 -07:00
Heikki Linnakangas
ac4494646d Rename alloc/free functions in reorderbuffer.c
There used to be bespoke pools for these structs to reduce the
palloc/pfree overhead, but that was ripped out a long time ago and
replaced with the generic, cheaper generational memory allocator
(commit a4ccc1cef5). The Get/Return terminology made sense with the
pools, as you "got" an object from the pool and "returned" it later,
but now it just looks weird. Rename to Alloc/Free.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/c9e43d2d-8e83-444f-b111-430377368989@iki.fi
2025-03-12 22:03:39 +02:00
Nathan Bossart
025e7e1eb4 Remove count_one_bits() in acl.c.
The only caller, select_best_grantor(), can instead use
pg_popcount64().  This isn't performance-critical code, but we
might as well use the centralized implementation.  While at it, add
some test coverage for this part of select_best_grantor().

Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/Z9GtL7Nm6hsYyJnF%40nathan
2025-03-12 15:01:52 -05:00
Melanie Plageman
ff79b5b2ab Increase default effective_io_concurrency to 16
The default effective_io_concurrency has been 1 since it was introduced
in b7b8f0b609. Referencing the associated discussion [1], it
seems 1 was chosen as a conservative value that seemed unlikely to cause
regressions.

Experimentation on high latency cloud storage as well as fast, local
nvme storage (see Discussion link) shows that even slightly higher
values improve query timings substantially. 1 actually performs worse
than 0 [2]. With effective_io_concurrency 1, we are not prefetching
enough to avoid I/O stalls, but we are issuing extra syscalls.

The new default is 16, which should be more appropriate for common
hardware while still avoiding flooding low IOPs devices with I/O
requests.
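
A quick sketch of inspecting and tuning the setting:

    SHOW effective_io_concurrency;                   -- now 16 by default
    ALTER SYSTEM SET effective_io_concurrency = 32;  -- e.g. for fast NVMe
    SELECT pg_reload_conf();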

[1] https://www.postgresql.org/message-id/flat/FDDBA24E-FF4D-4654-BA75-692B3BA71B97%40enterprisedb.com
[2] https://www.postgresql.org/message-id/CAAKRu_Zv08Cic%3DqdCfzrQabpEXGrd9Z9UOW5svEVkCM6%3DFXA9g%40mail.gmail.com

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CAAKRu_Z%2BJa-mwXebOoOERMMUMvJeRhzTjad4dSThxG0JLXESxw%40mail.gmail.com
2025-03-12 15:57:44 -04:00
Heikki Linnakangas
af717317a0 Handle interrupts while waiting on Append's async subplans
We did not wake up on interrupts while waiting on async events on an
async-capable append node. For example, if you tried to cancel the
query, nothing would happen until one of the async subplans becomes
readable. To fix, add WL_LATCH_SET to the WaitEventSet.

Backpatch down to v14 where async Append execution was introduced.

Discussion: https://www.postgresql.org/message-id/37a40570-f558-40d3-b5ea-5c2079b3b30b@iki.fi
2025-03-12 20:53:09 +02:00
Tom Lane
f4e7756ef9 Build whole-row Vars the same way during parsing and planning.
makeWholeRowVar() has different rules for constructing a
whole-row Var depending on the kind of RTE it's representing.
This turns out to be problematic because the rewriter and planner
can convert view RTEs and set-returning-function RTEs into
subquery RTEs; so a whole-row Var made during planning might
look different from one made by the parser.  In isolation this
doesn't cause any problem, but if a query contains Vars made
both ways for the same varno, there are cross-checks in the
executor that will complain.  This manifests for UPDATE, DELETE,
and MERGE queries that use whole-row table references.

To fix, we need makeWholeRowVar() to produce the same result
from an inlined RTE as it would have for the original.  For
an inlined view, we can use RangeTblEntry.relid to detect
that this had been a view RTE.  For inlined SRFs, make a
data structure definition change akin to commit 47bb9db75,
and say that we won't clear RangeTblEntry.functions until
the end of planning.  That allows makeWholeRowVar() to
repeat what it would have done with the unmodified RTE.

Reported-by: Duncan Sands <duncan.sands@deepbluecap.com>
Reported-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Diagnosed-by: Tender Wang <tndrwang@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://postgr.es/m/3518c50a-ab18-482f-b916-a37263622501@deepbluecap.com
Backpatch-through: 13
2025-03-12 11:47:38 -04:00
Melanie Plageman
18cd15e706 Add connection establishment duration logging
Add log_connections option 'setup_durations' which logs durations of
several key parts of connection establishment and backend setup.

For an incoming connection, starting from when the postmaster gets a
socket from accept() and ending when the forked child backend is first
ready for query, there are multiple steps that could each take longer
than expected due to external factors. This logging provides visibility
into authentication and fork duration as well as the end-to-end
connection establishment and backend initialization time.

To make this portable, the timings captured in the postmaster (socket
creation time, fork initiation time) are passed through the
BackendStartupData.
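
A minimal enabling sketch (assumes superuser; the logged line format is
illustrative only):

    ALTER SYSTEM SET log_connections = 'setup_durations';
    SELECT pg_reload_conf();
    -- subsequent connections log fork, authentication, and total
    -- establishment/setup durations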

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl>
Reviewed-by: Guillaume Lelarge <guillaume.lelarge@dalibo.com>
Discussion: https://postgr.es/m/flat/CAAKRu_b_smAHK0ZjrnL5GRxnAVWujEXQWpLXYzGbmpcZd3nLYw%40mail.gmail.com
2025-03-12 11:35:27 -04:00
Melanie Plageman
9219093cab Modularize log_connections output
Convert the boolean log_connections GUC into a list GUC comprised of the
connection aspects to log.

This gives users more control over the volume and kind of connection
logging.

The current log_connections options are 'receipt', 'authentication', and
'authorization'. The empty string disables all connection logging. 'all'
enables all available connection logging.

For backwards compatibility, the most common values for the
log_connections boolean are still supported (on, off, 1, 0, true, false,
yes, no). Note that previously supported substrings of on, off, true,
false, yes, and no are no longer supported.
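
A sketch of the new list syntax (values as named above):

    ALTER SYSTEM SET log_connections = 'receipt,authentication,authorization';
    ALTER SYSTEM SET log_connections = 'all';  -- everything
    ALTER SYSTEM SET log_connections = '';     -- disable connection logging
    SELECT pg_reload_conf();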

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/flat/CAAKRu_b_smAHK0ZjrnL5GRxnAVWujEXQWpLXYzGbmpcZd3nLYw%40mail.gmail.com
2025-03-12 11:35:21 -04:00
Michael Paquier
f554a95379 Remove initialization from PendingBackendStats
9a8dd2c5a6 has added an initialization to PendingBackendStats, which
has been causing compilation warnings in the buildfarm.  This code does
not strictly require it as PendingBackendStats is always initialized
with memset(0), so let's remove it.

Per report from multiple buildfarm members, like ayu and batfish, via
Tom Lane.

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/1870853.1741749264@sss.pgh.pa.us
2025-03-12 20:37:43 +09:00
Peter Eisentraut
72a3d0462b Prepare for Python "Limited API" in PL/Python
Using the Python Limited API would allow building PL/Python against
any Python 3.x version and using another Python 3.x version at run
time.  This commit does not activate that, but it prepares the code to
only use APIs supported by the Limited API.

Implementation details:

- Convert static types to heap types
  (https://docs.python.org/3/howto/isolating-extensions.html#heap-types).

- Replace PyRun_String() with component functions.

- Replace PyList_SET_ITEM() with PyList_SetItem().

This was previously committed as c47e8df815 and then reverted because
it wasn't working under Python older than 3.8.  That has been fixed in
this version.  There was a Python API change/bugfix between 3.7 and
3.8 that directly affects this patch.  The relevant commit is
<https://github.com/python/cpython/commit/364f0b0f19c>.  The
workarounds described there have been applied in this patch, and it
has been confirmed to work with Python 3.6 and 3.7.

Reviewed-by: Jakob Egger <jakob@eggerapps.at>
Discussion: https://www.postgresql.org/message-id/flat/ee410de1-1e0b-4770-b125-eeefd4726a24@eisentraut.org
2025-03-12 08:53:54 +01:00
Tom Lane
c872516d8f Doc: silence A4 PDF build warnings.
Commit 0fbceae84 put a "&zwsp;" in almost but not quite the correct
place to avoid "The contents of fo:block line 1 exceed the available
area" warnings.  Per buildfarm.
2025-03-11 23:35:39 -04:00
Heikki Linnakangas
043745c3a0 Improve snapmgr.c comment
Add more details on the different kinds of snapshots, how to use them,
and how the active snapshot stack works.

Discussion: https://www.postgresql.org/message-id/7c56f180-b9e1-481e-8c1d-efa63de3ecbb@iki.fi
2025-03-11 23:28:38 +02:00
Heikki Linnakangas
8076c00592 Assert that a snapshot is active or registered before it's used
The comment in GetTransactionSnapshot() said that you "should call
RegisterSnapshot or PushActiveSnapshot on the returned snap if it is
to be used very long". That felt too unclear to me. Make the comment
more strongly worded.

To enforce that rule and to catch potential bugs where a snapshot
might get invalidated while it's still in use, add an assertion to
HeapTupleSatisfiesMVCC() to check that the snapshot is registered or
pushed to active stack. No new bugs were found by this, but it seems
like good future-proofing. It's not a great place for the check;
HeapTupleSatisfiesMVCC() is in fact safe to call with an unregistered
snapshot, and the assertion won't catch other unsafe uses. But it goes
a long way in practice.

Fix a few cases that were playing fast and loose with that and just
assumed that the snapshot cannot be invalidated during a scan. Those
assumptions were not wrong, but they're not performance critical, so
let's drop the excuses and just register the snapshot. These were
false positives found by the new assertion.

Discussion: https://www.postgresql.org/message-id/7c56f180-b9e1-481e-8c1d-efa63de3ecbb@iki.fi
2025-03-11 23:20:34 +02:00
Masahiko Sawada
bd65cb3cd4 pg_logicalinspect: Fix possible crash when passing a directory path.
Previously, pg_logicalinspect functions were too trusting of their
input and blindly passed it to SnapBuildRestoreSnapshot(). If the
input pointed to a directory, the server could raise a PANIC error
while attempting to fsync_fname() with isdir=false on a directory.

This commit adds validation checks for input filenames and passes the
LSN extracted from the filename to SnapBuildRestoreSnapshot() instead
of the filename itself. It also adds regression tests for various
input patterns and permission checks.
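
A hedged sketch (function name per the pg_logicalinspect extension;
snapshot file name hypothetical):

    SELECT * FROM pg_get_logical_snapshot_meta('0-40796E18.snap');
    -- passing a directory path is now rejected with an error rather
    -- than risking a PANIC in fsync_fname()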

Bug: #18828
Reported-by: Robins Tharakan <tharakan@gmail.com>
Co-authored-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Co-authored-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/18828-0f4701c635064211@postgresql.org
2025-03-11 09:56:40 -07:00
Masahiko Sawada
a49927f04c pg_logicalinspect: Stabilize isolation tests.
The previous isolation tests did not account for the possibility that
the background writer or the checkpointer could write a RUNNING_XACTS
record, which could cause logical decoding to produce more logical
snapshots than expected.

This commit modifies the isolation tests to verify that at least one
logical snapshot contains the expected number of committed or ongoing
catalog-change transactions.

Per buildfarm member skink.

Reported-by: Andres Freund <andres@anarazel.de>
Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/5qbxud4pvnvmtuoi7weiizm5hmumxaeohx4vztfhrwlfhyz6rj@buh4435mllwo
2025-03-11 09:30:00 -07:00
Tom Lane
8b1b342544 Improve EXPLAIN's display of window functions.
Up to now we just punted on showing the window definitions used
in a plan, with window function calls represented as "OVER (?)".
To improve that, show the window definition implemented by each
WindowAgg plan node, and reference their window names in OVER.
For nameless window clauses generated by "OVER (...)", assign
unique names w1, w2, etc.
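
A sketch of the improved output (table and column names hypothetical):

    EXPLAIN (COSTS OFF)
    SELECT sum(x) OVER w, rank() OVER ()
    FROM t WINDOW w AS (PARTITION BY y ORDER BY x);
    -- WindowAgg nodes now show the definition, e.g.
    --   Window: w AS (PARTITION BY y ORDER BY x)
    -- and the nameless OVER () gets a generated name such as w1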

In passing, re-order the properties shown for a WindowAgg node
so that the Run Condition (if any) appears after the Window
property and before the Filter (if any).  This seems more
sensible since the Run Condition is associated with the Window
and acts before the Filter.

Thanks to David G. Johnston and Álvaro Herrera for design
suggestions.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/144530.1741469955@sss.pgh.pa.us
2025-03-11 11:19:54 -04:00
Peter Geoghegan
426ea61117 nbtree: Make BTMaxItemSize into object-like macro.
Make nbtree's "1/3 of a page limit" BTMaxItemSize function-like macro
(which accepts a "page" argument) into an object-like macro that can be
used from code that doesn't have convenient access to an nbtree page.

Preparation for an upcoming patch that adds skip scan to nbtree.
Parallel index scans that use skip scan will serialize datums (not just
SAOP array subscripts) when scheduling primitive scans.  BTMaxItemSize
will be used by btestimateparallelscan to determine how much DSM to
request.

Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-Wz=H_RG5weNGeUG_TkK87tRBnH9mGCQj6WpM4V4FNWKv2g@mail.gmail.com
2025-03-11 10:35:56 -04:00
Peter Geoghegan
0fbceae841 Show index search count in EXPLAIN ANALYZE, take 2.
Expose the count of index searches/index descents in EXPLAIN ANALYZE's
output for index scan/index-only scan/bitmap index scan nodes.  This
information is particularly useful with scans that use ScalarArrayOp
quals, where the number of index searches can be unpredictable due to
implementation details that interact with physical index characteristics
(at least with nbtree SAOP scans, since Postgres 17 commit 5bf748b8).
The information shown also provides useful context when EXPLAIN ANALYZE
runs a plan with an index scan node that successfully applied the skip
scan optimization (set to be added to nbtree by an upcoming patch).

The instrumentation works by teaching all index AMs to increment a new
nsearches counter whenever a new index search begins.  The counter is
incremented at exactly the same point that index AMs already increment
the pg_stat_*_indexes.idx_scan counter (we're counting the same event,
but at the scan level rather than the relation level).  Parallel queries
have workers copy their local counter struct into shared memory when an
index scan node ends -- even when it isn't a parallel aware scan node.
An earlier version of this patch that only worked with parallel aware
scans became commit 5ead85fb (though that was quickly reverted by commit
d00107cd following "debug_parallel_query=regress" buildfarm failures).

Our approach doesn't match the approach used when tracking other index
scan related costs (e.g., "Rows Removed by Filter:").  It is comparable
to the approach used in similar cases involving costs that are only
readily accessible inside an access method, not from the executor proper
(e.g., "Heap Blocks:" output for a Bitmap Heap Scan, which was recently
enhanced to show per-worker costs by commit 5a1e6df3, using essentially
the same scheme as the one used here).  It is necessary for index AMs to
have direct responsibility for maintaining the new counter, since the
counter might need to be incremented multiple times per amgettuple call
(or per amgetbitmap call).  But it is also necessary for the executor
proper to manage the shared memory now used to transfer each worker's
counter struct to the leader.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Robert Haas <robertmhaas@gmail.com>
Reviewed-By: Tomas Vondra <tomas@vondra.me>
Reviewed-By: Masahiro Ikeda <ikedamsh@oss.nttdata.com>
Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAH2-WzkRqvaqR2CTNqTZP0z6FuL4-3ED6eQB0yx38XBNj1v-4Q@mail.gmail.com
Discussion: https://postgr.es/m/CAH2-Wz=PKR6rB7qbx+Vnd7eqeB5VTcrW=iJvAsTsKbdG+kW_UA@mail.gmail.com
2025-03-11 09:20:50 -04:00
Peter Eisentraut
12c5f797ea Update nls.mk for newly added file
Commit f18231e817 moved some code to a new file, but the new file
wasn't added to nls.mk.
2025-03-11 13:48:14 +01:00
Álvaro Herrera
17ce344f86
BRIN: be more strict about required support procs
With improperly defined operator classes, it's possible to get a
Postgres crash because we'd try to invoke a procedure that doesn't
exist.  This is because the code is being a bit too trusting that the
opclass is correctly defined.  Add some ereport(ERROR)s for cases where
mandatory support procedures are not defined, transforming the crashes
into errors.

The particular case that was reported is an incomplete opclass in
PostGIS.

Backpatch all the way down to 13.

Reported-by: Tobias Wendorff <tobias.wendorff@tu-dortmund.de>
Diagnosed-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/fb6d9a35-6c8e-4869-af80-0a4944a793a4@tu-dortmund.de
2025-03-11 12:50:35 +01:00
Daniel Gustafsson
d35d32d711 Add special case fast-paths for strict functions
Many STRICT function calls will have one or two arguments, in which
case we can speed up checking for NULL input by avoiding setting up
a loop over the arguments. This adds EEOP_FUNCEXPR_STRICT_1 and the
corresponding EEOP_FUNCEXPR_STRICT_2 for functions with one and two
arguments respectively.

Author: Andres Freund <andres@anarazel.de>
Co-authored-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Discussion: https://postgr.es/m/415721CE-7D2E-4B74-B5D9-1950083BA03E@yesql.se
Discussion: https://postgr.es/m/20191023163849.sosqbfs5yenocez3@alap3.anarazel.de
2025-03-11 12:02:42 +01:00
Daniel Gustafsson
8dd7c7cd0a Replace EEOP_DONE with special steps for return/no return
Knowing when the side-effects of an expression are the intended result
of the execution, rather than the return value, is important for being
able to generate more efficient JITed code. This replaces EEOP_DONE with
two new steps: EEOP_DONE_RETURN and EEOP_DONE_NO_RETURN.  Expressions
which return a value should use the former step; expressions used for
their side-effects which don't return a value should use the latter.

Author: Andres Freund <andres@anarazel.de>
Co-authored-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Discussion: https://postgr.es/m/415721CE-7D2E-4B74-B5D9-1950083BA03E@yesql.se
Discussion: https://postgr.es/m/20191023163849.sosqbfs5yenocez3@alap3.anarazel.de
2025-03-11 12:02:38 +01:00
Peter Eisentraut
dabccf4513 Move RemoveInheritedConstraint() call slightly earlier
This change is harmless and does not affect the existing intended
operation.  It is necessary for a subsequent patch (NOT ENFORCED
foreign keys), where we may need to change the child
constraint to enforced.  In this case, we would create the necessary
triggers and queue the constraint for validation, so it is important
to remove any unnecessary constraints before proceeding.

This is a small change that could have been included in the previous
"split tryAttachPartitionForeignKey" refactoring patch (commit
1d26c2d2c4), but was kept separate to highlight the changes.

Author: Amul Sul <amul.sul@enterprisedb.com>
Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAJ_b962c5AcYW9KUt_R_ER5qs3fUGbe4az-SP-vuwPS-w-AGA%40mail.gmail.com
2025-03-11 10:43:48 +01:00
Peter Eisentraut
1d26c2d2c4 refactor: Split tryAttachPartitionForeignKey()
Split tryAttachPartitionForeignKey() into three functions:
AttachPartitionForeignKey(), RemoveInheritedConstraint(), and
DropForeignKeyConstraintTriggers(), so they can be reused in some
subsequent patches for the NOT ENFORCED feature.

Author: Amul Sul <amul.sul@enterprisedb.com>
Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAJ_b962c5AcYW9KUt_R_ER5qs3fUGbe4az-SP-vuwPS-w-AGA%40mail.gmail.com
2025-03-11 09:35:24 +01:00
Peter Eisentraut
64224a834c refactor: re-add ATExecAlterChildConstr()
ATExecAlterChildConstr() was removed in commit 80d7f99049, but it is
needed in some subsequent patches for the NOT ENFORCED feature, to
recurse over child constraints.  This adds it back in slightly altered
form.

Author: Amul Sul <amul.sul@enterprisedb.com>
Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAJ_b962c5AcYW9KUt_R_ER5qs3fUGbe4az-SP-vuwPS-w-AGA%40mail.gmail.com
2025-03-11 08:43:35 +01:00
Michael Paquier
76def4cdd7 Add WAL data to backend statistics
This commit adds per-backend WAL statistics, providing the same
information as pg_stat_wal, except that it is now possible to know how
much WAL activity is happening in each backend rather than an overall
aggregate of all the activity.  Like pg_stat_wal, the implementation
relies on pgWalUsage, tracking the difference of activity between two
reports to pgstats.

This data can be retrieved with a new system function called
pg_stat_get_backend_wal(), which returns one tuple based on the PID
provided as input.  Like pg_stat_get_backend_io(), this is useful when
joined with pg_stat_activity to get a live picture of the WAL generated
for each running backend, showing how the activity is [un]balanced.

pgstat_flush_backend() gains a new flag value, able to control the flush
of the WAL stats.

This commit relies mostly on the infrastructure provided by
9aea73fc61, that has introduced backend statistics.

Bump catalog version.  A bump of PGSTAT_FILE_FORMAT_ID is not required,
as backend stats do not persist on disk.
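
For example, a sketch of such a join (treating the output columns of
pg_stat_get_backend_wal() as mirroring pg_stat_wal, which is an
assumption here):

    SELECT a.pid, a.backend_type, w.wal_records, w.wal_fpi, w.wal_bytes
    FROM pg_stat_activity AS a,
         LATERAL pg_stat_get_backend_wal(a.pid) AS w
    ORDER BY w.wal_bytes DESC;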

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>
Discussion: https://postgr.es/m/Z3zqc4o09dM/Ezyz@ip-10-97-1-34.eu-west-3.compute.internal
2025-03-11 09:04:11 +09:00
Andres Freund
59a1592e39 tests: Make postmaster/002_connection_limits deal with verbose logs
When log_error_verbosity=verbose is configured, the test would hang (and
then fail) because the sqlstate is added between the log level and the
message.  Make the regex cope.

Reported-by: Andrew Dunstan <andrew@dunslane.net>
Discussion: https://postgr.es/m/c7ba6bd0-3701-43d1-9087-017777fe9cd2%40dunslane.net
2025-03-10 19:32:26 -04:00
Tom Lane
29d6808ede CREATE INDEX: do update index stats if autovacuum=off.
This fixes a thinko from commit d611f8b15.  The intent was to prevent
updating the stats of the pre-existing heap if autovacuum is off,
but it also disabled updating the stats of the just-created index.
There is AFAICS no good reason to do the latter, since there could not
be any pre-existing stats to refrain from overwriting, and the zeroed
stats that are there to begin with are very unlikely to be useful.
Moreover, the change broke our cross-version upgrade tests again.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/1116282.1741374848@sss.pgh.pa.us
2025-03-10 17:49:27 -04:00
Heikki Linnakangas
f7c566a1a2 Fix a few more redundant calls of GetLatestSnapshot()
Commit 2367503177 fixed this in RelationFindReplTupleByIndex(), but I
missed two other similar cases.

Per report from Ranier Vilela.

Discussion: https://www.postgresql.org/message-id/CAEudQArUT1dE45WN87F-Gb7XMy_hW6x1DFd3sqdhhxP-RMDa0Q@mail.gmail.com
Backpatch-through: 13
2025-03-10 18:58:10 +02:00
Heikki Linnakangas
2367503177 Fix snapshot used in logical replication index lookup
The function calls GetLatestSnapshot() to acquire a fresh snapshot,
makes it active, and was meant to pass it to table_tuple_lock(), but
instead called GetLatestSnapshot() again to acquire yet another
snapshot. It was harmless because the heap AM and all other known
table AMs ignore the 'snapshot' argument anyway, but let's be tidy.

In the long run, this perhaps should be redesigned so that snapshot
was not needed in the first place. The table AM API uses TID +
snapshot as the unique identifier for the row version, which is
questionable when the row came from an index scan with a Dirty
snapshot. You might lock a different row version when you use a
different snapshot in the table_tuple_lock() call (a fresh MVCC
snapshot) than in the index scan (DirtySnapshot). However, in the heap
AM and other AMs where the TID alone identifies the row version, it
doesn't matter. So for now, just fix the obvious albeit harmless bug.

This has been wrong ever since the table AM API was introduced in
commit 5db6df0c01, so backpatch to all supported versions.

Discussion: https://www.postgresql.org/message-id/83d243d6-ad8d-4307-8b51-2ee5844f6230@iki.fi
Backpatch-through: 13
2025-03-10 17:07:38 +02:00
Tom Lane
9f87e2593f Doc: improve description of window function processing.
The previous wording talked about a "single pass over the data",
which can be read as promising more than intended (to wit, that only
one WindowAgg plan node will be used).  What we promise is only what
the SQL spec requires, namely that the data not get re-sorted between
window functions with compatible PARTITION BY/ORDER BY clauses.
Adjust the wording in hopes of making this clearer.

Reported-by: Christopher Inokuchi <cinokuchi@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Discussion: https://postgr.es/m/CABde6B5va2wMsnM79u_x=n9KUgfKQje_pbLROEBmA9Ru5XWidw@mail.gmail.com
Backpatch-through: 13
2025-03-10 10:22:08 -04:00
Alexander Korotkov
6bb6a62f3c Use extended stats for precise estimation of bucket size in hash join
Recognizing the real-life complexity where columns in a table often have
functional dependencies, PostgreSQL's estimate of the number of distinct
values over a set of columns can be too low (or, much more rarely, too
high) when dealing with a multi-clause JOIN.  In the case of hash join,
this can end up predicting a small number of hash buckets and, as a
result, picking a non-optimal merge join.

To improve the situation, we introduce one additional stage of bucket
size estimation: when there are two or more join clauses, the estimator
looks up extended statistics and uses them for multicolumn estimation.
Clauses are grouped into lists, each containing expressions referencing
the same relation.  The result of the multicolumn estimation made over
such a list is combined with the others according to the caller's logic.
Clauses that are not estimated are returned to the caller for further
estimation.
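
A hypothetical setup where this extra estimation stage can kick in
(table and statistics names invented):

    -- ndistinct extended statistics capture the dependency between a
    -- and b, improving the bucket-size estimate for the two-clause
    -- hash join below.
    CREATE STATISTICS t1_a_b (ndistinct) ON a, b FROM t1;
    ANALYZE t1;
    EXPLAIN SELECT * FROM t1 JOIN t2 ON t1.a = t2.a AND t1.b = t2.b;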

Discussion: https://postgr.es/m/52257607-57f6-850d-399a-ec33a654457b%40postgrespro.ru
Author: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Andy Fan <zhihui.fan1213@gmail.com>
Reviewed-by: Tomas Vondra <tomas.vondra@enterprisedb.com>
Reviewed-by: Alena Rybakina <lena.ribackina@yandex.ru>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
2025-03-10 13:42:01 +02:00
Alexander Korotkov
fae535da0a Teach Append to consider tuple_fraction when accumulating subpaths.
This change enables more active use of IndexScan and parameterized
NestLoop paths in partitioned cases under an Append node, as already
happens with plain tables.  As the newly added regression tests
demonstrate, it should make the partitionwise technique smarter.

With an indication of how many tuples are needed, it may be more
meaningful to use the 'fractional branch' subpaths of the Append path
list, which are more optimal for that specific number of tuples.
Planning at a higher level, if the optimizer needs all the tuples, it
will choose non-fractional paths.  If, during execution, Append needs to
return fewer tuples than declared by tuple_fraction, using the
'intermediate' variant of paths does no harm.  However, it can yield a
considerable benefit if a sensible set of tuples is selected.

The change to an existing regression test demonstrates the positive
outcome of this feature: instead of scanning the whole table, the
optimizer prefers a parameterized scan, being aware that the join has to
produce only a single tuple to perform the query.
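
A sketch of a query shape that can now benefit (names invented):

    -- The LIMIT implies a small tuple_fraction, so Append may pick
    -- 'fractional' subpaths such as parameterized index scans on each
    -- partition instead of scanning every partition in full.
    SELECT * FROM parted p JOIN other o ON p.id = o.id LIMIT 1;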

Discussion: https://www.postgresql.org/message-id/flat/CAN-LCVPxnWB39CUBTgOQ9O7Dd8DrA_tpT1EY3LNVnUuvAX1NjA%40mail.gmail.com
Author: Nikita Malakhov <hukutoc@gmail.com>
Author: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Andy Fan <zhihuifan1213@163.com>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
2025-03-10 13:38:39 +02:00
Peter Eisentraut
b83e8a2ca2 Remove support for temporal RESTRICT foreign keys
It isn't clear how these should behave, so let's wait to implement them
until we are sure how to do it.

This feature was initially added by commit 89f908a6d0, so it hasn't
been released yet.

Author: Paul A. Jungwirth <pj@illuminatedcomputing.com>
Discussion: https://postgr.es/m/e773bc11-4ac1-40de-bb91-814e02f05b6d%40eisentraut.org
2025-03-10 11:31:01 +01:00
David Rowley
e033696596 Fix incorrect #endif comment
Noticed while reading code in this area.
2025-03-10 13:36:04 +13:00
Heikki Linnakangas
03f8e9a7fe Fix incorrect assertion in libpqwalreceiver
Was supposed to check the length of the array, but was checking its
size in bytes.

Author: Jacob Brazeal <jacob.brazeal@gmail.com>
Discussion: https://www.postgresql.org/message-id/CA%2BCOZaA_9afJxj9ZuO73U5P7WXP%2BZM9NGnZvTDCmBFz0FGP%2BwA@mail.gmail.com
2025-03-09 20:40:45 +02:00
Heikki Linnakangas
2a943afcff Fix test name and username used in failed connection attempts
The first failed connection tests the "regular" connections limit, not
the reserved limit.

In the second failed connection, the username doesn't really matter,
but since the previous successful connections used "regress_reserved",
it seems weird to switch back to "regress_regular" for the
expected-to-fail attempt.

Discussion: https://www.postgresql.org/message-id/fd5e9523-78d3-4270-86b2-fd1b1eeb4fc9@iki.fi
2025-03-09 19:47:55 +02:00
Tom Lane
fedfcf6650 Don't try to parallelize array_agg() on an anonymous record type.
This doesn't work because record_recv requires the typmod that
identifies the specific record type (in our session) and
array_agg_deserialize has no convenient way to get that information.
The result is an "input of anonymous composite types is not
implemented" error.

We could probably make this work if we had to, but it does not seem
worth the trouble, given that it took this long to get a field report.
Just shut off parallelization, as though record_recv didn't exist.
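
A sketch of the affected query shape (names invented):

    -- ROW(a, b) has an anonymous record type, so this aggregate is no
    -- longer parallelized; a parallel plan could previously fail with
    -- "input of anonymous composite types is not implemented".
    SELECT array_agg(ROW(a, b)) FROM t;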

Oversight in commit 16fd03e95.  Back-patch to v16 where that
came in.

Reported-by: Kirill Zdornyy <kirill@dineserve.com>
Diagnosed-by: Richard Guo <guofenglinux@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/atLI5Kce2ie1zcYjU0w_kjtVaxiYbYGTihrkLDmGZQnRDD4pnXukIATaABbnIj9pUnelC4ESvCXMm4HAyHg-v61XABaKpERj0A2IXzJZM7g=@dineserve.com
Backpatch-through: 16
2025-03-09 13:11:20 -04:00
Nathan Bossart
3c472a1829 doc: Adjust note about pg_upgrade's --jobs option.
Presently, this section lists a couple of parallelized parts of
pg_upgrade and suggests a starting point for setting the --jobs
option.  The list of parallelized tasks is not particularly
actionable, and the phrasing for the --jobs recommendation is
confusing to some readers.

This commit attempts to improve this section by eliminating the
list of parallelized tasks and instead highlighting that --jobs is
most useful for clusters with multiple databases or tablespaces.
Additionally, the recommendation for setting --jobs is simplified
to suggest starting with the number of CPU cores.

Reported-by: Magnus Hagander <magnus@hagander.net>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Magnus Hagander <magnus@hagander.net>
Discussion: https://postgr.es/m/Z8dBn_5iGLNuYiPo%40nathan
2025-03-08 14:28:16 -06:00
Jeff Davis
1852aea3f5 Don't convert to and from floats in pg_dump.
Commit 8f427187db improved performance by remembering relation stats
as native types rather than issuing a new query for each relation.

Using native types is fine for integers like relpages; but reltuples
is floating point. The commit controlled for that complexity by using
setlocale(LC_NUMERIC, "C"). After that, Alexander Lakhin found a
problem in pg_strtof(), fixed in 00d61a08c5.

While we aren't aware of any more problems with that approach, it
seems wise to just use a string the whole way for floating point
values, as Corey's original patch did, and get rid of the
setlocale(). Integers are still converted to native types to avoid
wasting memory.

Co-authored-by: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://postgr.es/m/3049348.1740855411@sss.pgh.pa.us
Discussion: https://postgr.es/m/560cca3781740bd69881bb07e26eb8f65b09792c.camel%40j-davis.com
2025-03-08 11:25:36 -08:00
Tom Lane
7fb8801021 Clear errno before calling strtol() in spell.c.
Per POSIX, a caller of strtol() that wishes to check for errors must
set errno to 0 beforehand.  Several places in spell.c neglected that,
so that they risked delivering a false overflow error in case errno
had been ERANGE already.  Given the lack of field reports, this case
may be unreachable at present --- but it's surely trouble waiting to
happen, so fix it.

Author: Jacob Brazeal <jacob.brazeal@gmail.com>
Discussion: https://postgr.es/m/CA+COZaBhsq6EromFm+knMJfzK6nTpG23zJ+K2=nfUQQXcj_xcQ@mail.gmail.com
Backpatch-through: 13
2025-03-08 11:24:25 -05:00
Peter Geoghegan
67fc4c9fd7 Make parallel nbtree index scans use an LWLock.
Teach parallel nbtree index scans to use an LWLock (not a spinlock) to
protect the scan's shared descriptor state.

Preparation for an upcoming patch that will add skip scan optimizations
to nbtree.  That patch will create the need to occasionally allocate
memory while the scan descriptor is locked, while copying datums that
were serialized by another backend.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAH2-Wz=PKR6rB7qbx+Vnd7eqeB5VTcrW=iJvAsTsKbdG+kW_UA@mail.gmail.com
2025-03-08 11:10:14 -05:00
Peter Eisentraut
8021c77769 Make amcanorder independent of amconsistentordering
Follow-up to commit af4002b381d: Make amconsistentordering not depend
on amcanorder.  Although they are related, they are independent
properties.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/E1tngY6-0000UL-2n%40gemulon.postgresql.org
2025-03-08 09:37:06 +01:00
Peter Eisentraut
661781f3a3 Fix typo
A duplicate assignment in commit af4002b381 should have been to a
different field.  (But it didn't affect the outcome.)

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/E1tngY6-0000UL-2n%40gemulon.postgresql.org
2025-03-08 08:06:30 +01:00
Michael Paquier
21f653cc00 Use stricter ordering in regression test query for pg_stat_io
The query introduced in 8b532771a0 is proving to have ordering issues
under at least the locale cs_CZ.  This commit updates the query to use a
stricter ordering.

Per reports from buildfarm members hippopotamus and jay.
2025-03-08 13:39:57 +09:00
Michael Paquier
8b532771a0 Add regression test listing all the possible tuples in pg_stat_io
pg_stat_io returns a set of tuples based on a combination of three
properties (BackendType, IOObject and IOContext) and
pgstat_tracks_io_object() to decide if a BackendType should return a
tuple based on a pair made of an IOObject and an IOContext.

This commit adds a regression test to track all the combinations
supported.  This is useful for knowing which tuples are relevant when
adding a new BackendType to the set or when touching
pgstat_tracks_io_object(); while playing with this area I have noticed
that it is not hard to break things without the regression tests
noticing a difference in some cases.

Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/Z8exfAehbVbEKXW5@paquier.xyz
2025-03-08 12:22:41 +09:00
Michael Paquier
9a8dd2c5a6 Improve check for detection of pending data in backend statistics
The callback pgstat_backend_have_pending_cb() is used as a way for
pg_stat_report() to detect if there is any pending data for backend
statistics.

It did not include a check based on pgstat_tracks_backend_bktype(), which
discards processes whose backend types do not support backend
statistics.  The logic is not a problem on HEAD, as processes that do
not support backend statistics cannot touch PendingBackendStats, so the
callback would always report that there is no pending data in this case.
However, we would run into trouble once backend statistics include
portions of pending stats that are not always zeroed, like pgWalUsage.

There is no reason for pgstat_backend_have_pending_cb() to not check
for pgstat_tracks_backend_bktype(), anyway, and this pattern is safer in
the long run, so let's update the code to do so.

While on it, this commit adds a proper initialization to
PendingBackendStats.

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Co-authored-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/Z8l6EMM4ImVoWRkg@ip-10-97-1-34.eu-west-3.compute.internal
2025-03-08 10:56:30 +09:00
Peter Geoghegan
8e167e6188 nbtree: refine _bt_readnextpage contract comments.
Another minor follow-up commit for commit 1bd4bc85, which changed the
_bt_readnextpage contract.
2025-03-07 18:35:13 -05:00
Nathan Bossart
088f8e2d56 Assert that wrapper_handler()'s argument is within expected range.
pqsignal() already does a similar check, but strange Valgrind
reports have us wondering if wrapper_handler() is somehow getting
called with an invalid signal number.

Reported-by: Tomas Vondra <tomas@vondra.me>
Suggested-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/ace01111-f9ac-4f61-b1b1-8e9379415444%40vondra.me
Backpatch-through: 17
2025-03-07 15:23:09 -06:00
Tom Lane
34c3c5ce1c Include column name in build_attrmap_by_position's error reports.
Formerly we only provided the column number, but it's frequently
more useful to mention the column name.  The input tupdesc often
doesn't have useful column names, but the output tupdesc usually
contains user-supplied names, so report that one.

Author: Marcos Pegoraro <marcos@f10.com.br>
Co-authored-by: jian he <jian.universality@gmail.com>
Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Co-authored-by: Erik Wienhold <ewie@ewie.name>
Reviewed-by: Vladlen Popolitov <v.popolitov@postgrespro.ru>
Discussion: https://postgr.es/m/CAB-JLwanky28gjAMdnMh1CjyO1b2zLdr6UOA1-oY9G7PVL9KKQ@mail.gmail.com
2025-03-07 13:24:20 -05:00
Andres Freund
b48832cddb tests: Don't fail due to high default timeout in postmaster/003_start_stop
Some BF animals use very high timeouts due to their slowness. Unfortunately
postmaster/003_start_stop fails if a high timeout is configured, due to
authentication_timeout having a fairly low max.

As this test is reasonably fast, the easiest fix seems to be to cap the
timeout to 600.

Per buildfarm animal skink.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/ggflhkciwdyotpoie323chu2c2idpjk5qimrn462encwx2io7s@thmcxl7i6dpw
2025-03-07 13:09:16 -05:00
Andres Freund
71d1ed6fe1 tests: Fix race condition in postmaster/002_connection_limits
The test occasionally failed due to unexpected connection limit errors being
encountered after having waited for FATAL errors on another connection. These
spurious failures were caused by the backend reporting FATAL errors to the
client before detaching from the PGPROC entry. Adding a sleep(1) before
proc_exit() makes it easy to reproduce that problem.

To fix the issue, add a helper function that waits for postmaster to notice
the process having exited. For now this is implemented by waiting for the
DEBUG2 message that postmaster logs in that case. That's not the prettiest
fix, but simple. If we notice this problem elsewhere, it might be worthwhile
to make this more general, e.g. by adding an injection point.

Reported-by: Tomas Vondra <tomas@vondra.me>
Diagnosed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Tested-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/ggflhkciwdyotpoie323chu2c2idpjk5qimrn462encwx2io7s@thmcxl7i6dpw
2025-03-07 13:09:16 -05:00
Robert Haas
d3fc7a5120 doc: Add missing decimal places to example rowcount.
Commit 95dbd827f2 updated a bunch
of similar cases in the documentation, but missed this one.

Author: Ilia Evdokimov <ilya.evdokimov@tantorlabs.com>
Reviewed-by: Fabrízio de Royes Mello <fabriziomello@gmail.com>
2025-03-07 09:00:53 -05:00
Peter Eisentraut
7f24c02743 Improve possible performance regression
Commit ce62f2f2a0 introduced calls to GetIndexAmRoutineByAmId() in
lsyscache.c functions.  This call is a bit more expensive than a
simple syscache lookup.  So rearrange the nesting so that we call that
one last and do the cheaper checks first.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/E1tngY6-0000UL-2n%40gemulon.postgresql.org
2025-03-07 11:46:33 +01:00
Peter Eisentraut
af4002b381 Rename amcancrosscompare
After more discussion about commit ce62f2f2a0, rename the index AM
property amcancrosscompare to two separate properties
amconsistentequality and amconsistentordering.  Also improve the
documentation and update some comments that were previously missed.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/E1tngY6-0000UL-2n%40gemulon.postgresql.org
2025-03-07 11:46:33 +01:00
Dean Rasheed
6da469bada Allow casting between bytea and integer types.
This allows smallint, integer, and bigint values to be cast to and
from bytea. The bytea value is the two's complement representation of
the integer, with the most significant byte first. For example:

  1234::bytea -> \x000004d2
  (-1234)::bytea -> \xfffffb2e
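
The casts also work in the opposite direction, e.g. (results inferred
from the definition above):

  '\x000004d2'::bytea::integer -> 1234
  '\xfffffb2e'::bytea::integer -> -1234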

Author: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-by: Joel Jacobson <joel@compiler.org>
Reviewed-by: Yugo Nagata <nagata@sraoss.co.jp>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://postgr.es/m/CAJ7c6TPtOp6%2BkFX5QX3fH1SVr7v65uHr-7yEJ%3DGMGQi5uhGtcA%40mail.gmail.com
2025-03-07 09:31:18 +00:00
Jeff Davis
d611f8b158 CREATE INDEX: don't update table stats if autovacuum=off.
We previously fixed this for binary upgrade in 71b66171d0, but a
similar problem remained when dumping statistics without data.

Fix by not opportunistically updating table stats during CREATE INDEX
when autovacuum is disabled. For stats to be stable at all, the server
needs to be aware that it should not take every opportunity to update
stats. Per discussion, autovacuum=off is a signal that the user
expects stats to be stable; though if necessary, we could create
a more specific mode in the future.

Reported-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://postgr.es/m/CAExHW5vf9D+8-a5_BEX3y=2y_xY9hiCxV1=C+FnxDvfprWvkng@mail.gmail.com
Discussion: https://postgr.es/m/ca81cbf6e6ea2af838df972801ad4da52640a503.camel%40j-davis.com
2025-03-06 19:39:14 -08:00
John Naylor
19e57f4f78 Revert "vacuumdb: Add option for analyzing only relations missing stats."
This reverts commit 5f8eb25706, which ended up in
my branch by mistake.
2025-03-07 10:35:21 +07:00
John Naylor
fcabc3adf8 Doc: correct aggressive vacuum threshold for multixact members storage
The threshold is two billion members, which was interpreted as 2GB
in the documentation. Fix to reflect that each member takes up five
bytes, which translates to about 10GB. This is not exact, because of
page boundaries. While at it, mention the maximum size of 20GB.

This has been wrong since commit c552e171d1, so backpatch to
version 14.

Author: Alex Friedman <alexf01@gmail.com>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/CACbFw60UOk6fCC02KsyT3OfU9Dnuq5roYxdw2aFisiN_p1L0bg@mail.gmail.com
Backpatch-through: 14
2025-03-07 10:22:56 +07:00
Nathan Bossart
5f8eb25706 vacuumdb: Add option for analyzing only relations missing stats.
This commit adds a new --missing-only option that can be used in
conjunction with --analyze-only and --analyze-in-stages.  When this
option is specified, vacuumdb will generate ANALYZE commands for a
relation if it is missing any statistics it should ordinarily have.
For example, if a table has statistics for one column but not
another, we will analyze the whole table.  A similar principle
applies to extended statistics, expression indexes, and table
inheritance.

Co-authored-by: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: TODO
Discussion: https://postgr.es/m/Z5O1bpcwDrMgyrYy%40nathan
2025-03-07 10:17:35 +07:00
Michael Paquier
e2080261cc Fix race condition in TAP test 007_pre_auth
The authentication test added in c76db55c90 expects a backend to start
and wait at the injection point "init-pre-auth".  A query is used to
retrieve the PID of the backend waiting at authentication, but its WHERE
clause was too soft, checking only for a backend in a "starting" state.

As proved by the CI, this WHERE clause is not enough.  There is a small
window between the moment when the backend is reported as "starting" in
its backend entry and the moment when it waits in its injection point,
and it was possible for the test to return the PID of a backend process
not yet waiting in the injection point, causing spurious failures.  This
issue is fixed by tweaking the query retrieving the PID of the backend
waiting before authentication so that we check for "init-pre-auth" in its
wait_event.  An extra check based on the backend_type is added, based on
a suggestion by Jacob, to be more cautious.

Error spotted by the CI on Windows, but it could happen anywhere, as
long as the authentication path is slow enough compared to the TAP test.

Reported-by: Andres Freund <andres@anarazel.de>
Author: Jacob Champion <jacob.champion@enterprisedb.com>
Co-authored-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/soexrl7oeyku24bj3czupxmv27ow35u6edymp5y3oyoysbe2kb@r3tgoos2xp2x
2025-03-07 08:12:45 +09:00
Álvaro Herrera
24503fa95c
reindexdb: move PQfinish() calls to the right place
get_parallel_object_list() has no business closing a connection it did
not create.  Make things more sensible by closing the connection at the
level where it is created, in reindex_one_database().

Extracted from a larger patch by the same author.  However, the patch as
submitted not only was not described as containing this change, but in
addition it contained a fatal flaw whereby reindexdb would crash and
fail across all of its TAP tests, which is why I list myself as
co-author.

Author: Ranier Vilela <ranier.vf@gmail.com>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/CAEudQArfqr0-s0VVPSEh=0kgOgBJvFNdGW=xSL5rBcr0WDMQYQ@mail.gmail.com
2025-03-06 19:40:06 +01:00
Tom Lane
0f21db36d6 Fix some performance issues in GIN query startup.
If a GIN index search had a lot of search keys (for example,
"jsonbcol ?| array[]" with tens of thousands of array elements),
both ginFillScanKey() and startScanKey() took O(N^2) time.
Worse, those loops were uncancelable for lack of CHECK_FOR_INTERRUPTS.

The problem in ginFillScanKey() is the brute-force search key
de-duplication done in ginFillScanEntry().  The most expedient
solution seems to be to just stop trying to de-duplicate once
there are "too many" search keys.  We could imagine working harder,
say by using a sort-and-unique algorithm instead of brute force
compare-all-the-keys.  But it seems unlikely to be worth the trouble.
There is no correctness issue here, since the code already allowed
duplicate keys if any extra_data is present.

The problem in startScanKey() is the loop that attempts to identify
the first non-required search key.  In the submitted test case, that
vainly tests all the key positions, and each iteration takes O(N)
time.  One part of that is that it's reinitializing the entryRes[]
array from scratch each time, which is entirely unnecessary given
that the triConsistentFn isn't supposed to scribble on its input.
We can easily adjust the array contents incrementally instead.
The other part of it is that the triConsistentFn may itself take
O(N) time (and does in this test case).  This is all extremely
brute force: in simple cases with AND or OR semantics, we could
know without any looping whatever that all or none of the keys
are required.  But GIN opclasses don't have any API for exposing
that knowledge, so at least in the short run there is little to
be done about that.  Put in a CHECK_FOR_INTERRUPTS so that at
least the loop is cancelable.

These two changes together resolve the primary complaint that
the test query doesn't respond promptly to cancel interrupts.
Also, while they don't completely eliminate the O(N^2) behavior,
they do provide quite a nice speedup for mid-sized examples.
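
A sketch of the kind of query that triggered the complaint (names
invented):

    -- Tens of thousands of search keys made query startup take O(N^2)
    -- time, and it could not be cancelled.
    SELECT count(*) FROM tab
    WHERE jsonbcol ?| (SELECT array_agg(i::text)
                       FROM generate_series(1, 50000) AS i);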

Bug: #18831
Reported-by: Niek <niek.brasa@hitachienergy.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/18831-e845ac44ebc5dd36@postgresql.org
Backpatch-through: 13
2025-03-06 11:54:31 -05:00
Andrew Dunstan
e33969abc1 Further fix for json_strip_nulls documentation
Oversight in commit 4603903d29.

Author: Shinoda, Noriyoshi (SXD Japan FSI) <noriyoshi.shinoda@hpe.com>
2025-03-06 10:24:03 -05:00
Andrew Dunstan
0e76f253f4 Remove extraneous commas in json{b}_strip_nulls documentation
Oversight in commit 4603903d29.

Author: Ian Lawrence Barwick <barwick@gmail.com>
2025-03-06 08:46:15 -05:00
Amit Kapila
588acf6d0e Avoid invalidating all RelationSyncCache entries on publication change.
On a change of publication via ALTER PUBLICATION ... SET/ADD/DROP
commands, we were invalidating all the relations present in the relation
sync cache maintained by pgoutput.  We need to invalidate only the
relation entries that are changed as part of the publication DDL.

We have ensured that publication DDL execution generates the
invalidations required to invalidate the impacted relation sync entries
in RelationSyncCache.

This improves performance by avoiding the rebuilding of cache entries in
cases where a publication has many tables but only one of them is
dropped.
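
A sketch of the case this helps (names invented):

    -- Previously this invalidated every entry in pgoutput's relation
    -- sync cache; now only the entry for t3 is invalidated.
    ALTER PUBLICATION pub1 DROP TABLE t3;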

Author: Shlok Kyal <shlok.kyal.oss@gmail.com>
Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Hou Zhijie <houzj.fnst@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/OSCPR01MB14966C09AA201EFFA706576A7F5C92@OSCPR01MB14966.jpnprd01.prod.outlook.com
2025-03-06 14:19:38 +05:30
Jeff Davis
1d33de9d68 Organize and deduplicate statistics import tests.
Author: Corey Huinker <corey.huinker@gmail.com>
Reported-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CAAKRu_bWEqUfxhODfJ-XbZC75vq=P6DYOKK6biyey=yM1Ah3Hg@mail.gmail.com
Discussion: https://postgr.es/m/CADkLM=f1n2_Vomq0gKab7xdxDHmJGgn=DE48P8fzQOp3Mrs1Qg@mail.gmail.com
2025-03-06 00:19:22 -08:00
Jeff Davis
f9f4b43b8d Address stats export review comments.
Per discussion, did not use Jian He's patch exactly.

Reported-by: jian he <jian.universality@gmail.com>
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://postgr.es/m/CACJufxFVq=tq9u1zrHWYSbMi1T07gS9Ff0LJScMco4HZmtZ1xw@mail.gmail.com
Discussion: https://postgr.es/m/CADkLM=f1n2_Vomq0gKab7xdxDHmJGgn=DE48P8fzQOp3Mrs1Qg@mail.gmail.com
2025-03-06 00:11:12 -08:00
Jeff Davis
298944e8d8 Address stats import review comments.
Reported-by: jian he <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxHG9MBQozbJQ4JRBcRbUO+t+sx4qLZX092rS_9b4SR_EA@mail.gmail.com
2025-03-05 23:07:25 -08:00
Heikki Linnakangas
39de4f157d Fix compiler warnings about typedef redefinitions
Clang with -Wtypedef-redefinition produced warnings:

    src/include/storage/latch.h:122:3: error: redefinition of typedef 'Latch' is a C11 feature [-Werror,-Wtypedef-redefinition]

Per buildfarm
2025-03-06 03:10:22 +02:00
Michael Paquier
7f7f324eb5 Add more monitoring data for WAL writes in the WAL receiver
This commit adds two improvements related to the monitoring of WAL
writes for the WAL receiver.

First, write counts and timings are now counted in pg_stat_io for the
WAL receiver.  These have been discarded from pg_stat_wal in
ff99918c62 due to performance concerns, related to the fact that we
still relied on an on-disk file for the stats back then, even with
track_wal_io_timing available to avoid the overhead of the timestamp
calculations.
This implementation is simpler than the original proposal as it is
possible to rely on the APIs of pgstat_io.c to do the job.  Like the
fsync and read data, track_wal_io_timing needs to be enabled to track
the timings.

Second, a wait event is added around the pg_pwrite() call in charge of
the writes, using the existing WAIT_EVENT_WAL_WRITE.  This is useful as
the WAL receiver data is tracked in pg_stat_activity.

Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/Z8gFnH4o3jBm5BRz@ip-10-97-1-34.eu-west-3.compute.internal
2025-03-06 09:41:37 +09:00
Heikki Linnakangas
393e0d2314 Split WaitEventSet functions to separate source file
latch.c now only contains the Latch related functions, which build on
the WaitEventSet abstraction. Most of the platform-dependent stuff is
now in waiteventset.c.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/8a507fb6-df28-49d3-81a5-ede180d7f0fb@iki.fi
2025-03-06 01:26:16 +02:00
Heikki Linnakangas
84e5b2f07a Use ModifyWaitEvent to update exit_on_postmaster_death
This is in preparation for splitting WaitEventSet related functions to
a separate source file. That will hide the details of WaitEventSet
from WaitLatch, so it must use an exposed function instead of
modifying WaitEventSet->exit_on_postmaster_death directly.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/8a507fb6-df28-49d3-81a5-ede180d7f0fb@iki.fi
2025-03-06 01:26:12 +02:00
Fujii Masao
9f25b9f739 ecpg: Fix compiler warning in ecpg build with Meson.
Previously, Meson could produce a warning about the use of 'deps' in ecpg:

    WARNING: Project targets '>=0.54' but uses a feature introduced in '0.60.0': list.<plus>. The right-hand operand was not a list.

The right-hand operand of 'deps' should be a list. This commit fixes
the warning by wrapping it with square brackets.

This issue was introduced in commit 28f04984f0.

Author: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://postgr.es/m/CAOYmi+ks8wO06Ymxduw2h_eQJ_D4_jHGeyMK0P=p5Q3psnEdMA@mail.gmail.com
2025-03-06 08:22:30 +09:00
Heikki Linnakangas
a98e4dee63 Remove unused ShutdownLatchSupport() function
The only caller was removed in commit 80a8f95b3b. I don't foresee
needing it any time soon, and I'm working on some big changes in this
area, so let's remove it out of the way.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/8a507fb6-df28-49d3-81a5-ede180d7f0fb@iki.fi
2025-03-05 23:52:04 +02:00
Daniel Gustafsson
153836b99a ci: Remove installation of libcurl
The CI images come with libcurl pre-installed since commit a119426
in the pg-vm-images repository, so remove the installation commands
from the Cirrus tasks.  Installation of libcurl packages was added
in the OAuth patchset which introduced the dependency, a backpatch
is thus not applicable.

Author: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/8745B9D8-D897-4302-BD4C-FC18F291ECB7@yesql.se
2025-03-05 22:12:20 +01:00
Andres Freund
d4a6c847ca ci: Document what makes certain tasks special
To increase coverage without drastically increasing CI resource usage, we have
different CI tasks test different things (e.g. the linux tasks use
sanitizers).  Unfortunately that can create confusing situations where CI
fails on some OS, but not others, without the problem appearing to be platform
dependent.

To partially address that, add a comment, prefixed with SPECIAL, to each
task that we use to test in some non-default way.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/321570.1741195755@sss.pgh.pa.us
2025-03-05 13:19:28 -05:00
Andres Freund
0a2f5df881 ci: freebsd: Specify debug_parallel_query=regress
A lot of buildfarm animals run with debug_parallel_query=regress, while CI
didn't test that. That led to the annoying situation of only noticing related
test instabilities after merging changes upstream.

FreeBSD was chosen because it's a relatively fast task. It also tests
debug_write_read_parse_plan_trees etc., which are probably exercised a bit more
heavily with debug_parallel_query=regress.

Discussion: https://postgr.es/m/zbuk4mlov22yfoktf5ub3lwjw2b7ezwphwolbplthepda42int@h6wpvq7orc44
2025-03-05 13:19:28 -05:00
Andres Freund
ad40644eb8 ci: Upgrade FreeBSD image
Upgrade to the current stable version. To avoid needing commits like this in
the future, the CI image name no longer contains the OS version number.

Backpatch to all versions with CI support, we don't want to generate CI images
for multiple FreeBSD versions.

Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/CAN55FZ3_P4JJ6tWZafjf-_XbHgG6DQGXhH-y6Yp78_bwBJjcww@mail.gmail.com
Backpatch-through: 15
2025-03-05 10:33:47 -05:00
Peter Geoghegan
d00107cd63 Revert "Show index search count in EXPLAIN ANALYZE."
This reverts commit 5ead85fbc8.

This commit shows test failures with debug_parallel_query=regress.  The
underlying issue needs to be debugged, so revert for now.
2025-03-05 10:27:31 -05:00
Andrew Dunstan
4603903d29 Allow json{b}_strip_nulls to remove null array elements
An additional parameter ("strip_in_arrays") is added to these functions.
It defaults to false. If true, then null array elements are removed as
well as null-valued object fields. JSON that consists of just a single
null is not affected.
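
For example (exact output spacing is an assumption):

    SELECT json_strip_nulls('[1,null,{"a":null}]', true);
     -> [1,{}]
    SELECT json_strip_nulls('[1,null,{"a":null}]', false);
     -> [1,null,{}]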

Author: Florents Tselai <florents.tselai@gmail.com>

Discussion: https://postgr.es/m/4BCECCD5-4F40-4313-9E98-9E16BEB0B01D@gmail.com
2025-03-05 10:04:02 -05:00
Peter Geoghegan
5ead85fbc8 Show index search count in EXPLAIN ANALYZE.
Expose the count of index searches/index descents in EXPLAIN ANALYZE's
output for index scan nodes.  This information is particularly useful
with scans that use ScalarArrayOp quals, where the number of index scans
isn't predictable in advance (at least not with optimizations like the
one added to nbtree by Postgres 17 commit 5bf748b8).  It will also be
useful when EXPLAIN ANALYZE shows details of an nbtree index scan that
uses skip scan optimizations set to be introduced by an upcoming patch.

The instrumentation works by teaching index AMs to increment a new
nsearches counter whenever a new index search begins.  The counter is
incremented at exactly the same point that index AMs must already
increment the index's pg_stat_*_indexes.idx_scan counter (we're counting
the same event, but at the scan level rather than the relation level).
The new counter is stored in the scan descriptor (IndexScanDescData),
which explain.c reaches by going through the scan node's PlanState.

This approach doesn't match the approach used when tracking other index
scan specific costs (e.g., "Rows Removed by Filter:").  It is similar to
the approach used in other cases where we must track costs that are only
readily accessible inside an access method, and not from the executor
(e.g., "Heap Blocks:" output for a Bitmap Heap Scan).  It is inherently
necessary to maintain a counter that can be incremented multiple times
during a single amgettuple call (or amgetbitmap call), and directly
exposing PlanState.instrument to index access methods seems unappealing.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Tomas Vondra <tomas@vondra.me>
Reviewed-By: Robert Haas <robertmhaas@gmail.com>
Reviewed-By: Masahiro Ikeda <ikedamsh@oss.nttdata.com>
Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAH2-Wz=PKR6rB7qbx+Vnd7eqeB5VTcrW=iJvAsTsKbdG+kW_UA@mail.gmail.com
Discussion: https://postgr.es/m/CAH2-WzkRqvaqR2CTNqTZP0z6FuL4-3ED6eQB0yx38XBNj1v-4Q@mail.gmail.com
2025-03-05 09:36:48 -05:00
Heikki Linnakangas
635f580120 Rename some signal and interrupt handling functions for consistency
The usual pattern for handling a signal is that the signal handler
sets a flag and calls SetLatch(MyLatch), and CHECK_FOR_INTERRUPTS() or
other code that is part of a wait loop calls another function to deal
with it. The naming of the functions involved was a bit inconsistent,
however. CHECK_FOR_INTERRUPTS() calls ProcessInterrupts() to do the
heavy-lifting, but the analogous functions in aux processes were
called HandleMainLoopInterrupts(), HandleStartupProcInterrupts(),
etc. Similarly, most subroutines of ProcessInterrupts() were called
Process*(), but some were called Handle*().

To make things less confusing, rename all the functions that are part
of the overall signal/interrupt handling system but are not executed
in a signal handler to e.g. ProcessSomething(), rather than
HandleSomething(). The "Process" prefix is now consistently used in
the non-signal-handler functions, and the "Handle" prefix in functions
that are part of signal handlers, except for some completely unrelated
functions that clearly have nothing to do with signal or interrupt
handling.

Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://www.postgresql.org/message-id/8a384b26-1499-41f6-be33-64b801fb98b8@iki.fi
2025-03-05 16:22:26 +02:00
Álvaro Herrera
f4e53e10b6
Add ALTER TABLE ... ALTER CONSTRAINT ... SET [NO] INHERIT
This allows redefining an existing non-inheritable constraint as
inheritable, which makes it possible to straighten out situations with
NO INHERIT constraints so that they can become normal constraints
without having to re-verify existing data.  For existing inheritance
children this may require creating additional constraints, if they don't
exist already.

It also allows doing the opposite, if only for symmetry.
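
A sketch of the new syntax (names invented):

    -- Make an existing NO INHERIT constraint inheritable:
    ALTER TABLE parent ALTER CONSTRAINT parent_a_check SET INHERIT;
    -- The opposite direction is also allowed, for symmetry:
    ALTER TABLE parent ALTER CONSTRAINT parent_a_check SET NO INHERIT;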

Author: Suraj Kharage <suraj.kharage@enterprisedb.com>
Reviewed-by: jian he <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CAF1DzPVfOW6Kk=7SSh7LbneQDJWh=PbJrEC_Wkzc24tHOyQWGg@mail.gmail.com
2025-03-05 13:50:22 +01:00
Michael Paquier
f4694e0f35 Fix some gaps in pg_stat_io with WAL receiver and WAL summarizer
The WAL receiver and WAL summarizer processes each gain a call to
pgstat_report_wal(), to make sure that they report their WAL statistics
to pgstats, gathering data for pg_stat_io.

In the WAL receiver, the stats reports are timed with status updates sent
to the primary, which depend on wal_receiver_status_interval and
wal_receiver_timeout.  This is a conservative choice, but perhaps we
could be more aggressive with the frequency of the stats reports.  An
interesting historical fact is that the WAL receiver does writes and
syncs of WAL, but it has never reported its statistics to pgstats in
pg_stat_wal.

In the WAL summarizer, the stats reports are done each time the process
waits for WAL.

While at it, pg_stat_io is adjusted so that these two processes do not
report any rows when IOObject is not WAL, making the view easier to use
with fewer rows.

Two tests are added in TAP, checking statistics for the WAL summarizer
and the WAL receiver.  Status updates in the WAL receiver are currently
possible in the recovery test 001_stream_rep.pl.
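
For instance, a query like the following (the backend_type spellings
are assumed here) should now show rows for these two processes:

    SELECT backend_type, object, context, writes, write_time, fsyncs
    FROM pg_stat_io
    WHERE object = 'wal'
      AND backend_type IN ('walreceiver', 'walsummarizer');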

Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/Z8UKZyVSHUUQJHNb@paquier.xyz
2025-03-05 10:17:39 +09:00
Michael Paquier
54d23601b9 psql: Fix memory leak with \gx used within a pipeline
While inside a pipeline, \gx is currently forbidden and will make
exec_command_g() exit early.  There was a memory leak in this code path,
so let's fix it.

Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Discussion: https://postgr.es/m/CAO6_XqqFVQjLjZQiL7xdwLpzZEy1ghO_JWvCFPM_OmwF9s7XdA@mail.gmail.com
2025-03-05 07:56:03 +09:00
Tomas Vondra
b229c10164 Enforce memory limit during parallel GIN builds
Index builds are expected to respect maintenance_work_mem, just like
other maintenance operations. For serial builds this is done simply by
flushing the buffer in ginBuildCallback() into the index. But with
parallel builds it's more complicated, because there are multiple places
that can allocate memory.

ginBuildCallbackParallel() does the same thing as ginBuildCallback(),
except that the accumulated items are written into tuplesort. Then the
entries with the same key get merged - first in the worker, then in the
leader - and the TID lists may get (arbitrarily) long. It's unlikely it
would exceed the memory limit, but it's possible. We address this by
evicting some of the data if the list gets too long.

We can't simply dump the whole in-memory TID list. The GIN index bulk
insert code expects to see TIDs in monotonic order; it may fail if the
TIDs go backwards. If the TID lists overlap, evicting the whole current
TID list would break this (a later entry might add "old" TID values into
the already-written part).

In the workers this is not an issue, because the lists never overlap.
But the leader may see overlapping lists produced by the workers.

We can however derive a safe "horizon" TID - the entries (for a given
key) are sorted by (key, first TID), which means no future list can add
values before the last "first TID" we've seen. This patch tracks the
"frozen" part of the TID list, which we know can't change by merging
additional TID lists. If needed, we can evict this part of the list.

We don't want to do this too often - the smaller lists we evict, the
more expensive it'll be to merge them in the next step (especially in
the leader). Therefore we only trim the list if we have at least 1024
frozen items, and if the whole list is at least 64kB large.

These thresholds are somewhat arbitrary and conservative. We might
calculate the values from maintenance_work_mem, but tests show that does
not really improve anything (time, compression ratio, ...). So we stick
to these conservative values to release memory faster.

Author: Tomas Vondra
Reviewed-by: Matthias van de Meent, Andy Fan, Kirill Reshke
Discussion: https://postgr.es/m/6ab4003f-a8b8-4d75-a67f-f25ad98582dc%40enterprisedb.com
2025-03-04 20:41:13 +01:00
Masahiko Sawada
f52345995d pg_upgrade: Check for the expected error message in TAP tests.
Since pg_upgrade prints its error messages on stdout, we can't use
command_fails_like() to check if it fails for the right reason. This
commit uses command_checks_all() in pg_upgrade TAP tests to check the
exit status and stdout, enabling proper verification of error
reasons.

Author: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Discussion: https://postgr.es/m/87tt8h1vb7.fsf@wibble.ilmari.org
2025-03-04 11:16:12 -08:00
Álvaro Herrera
7bbc46213d
Fix ALTER TABLE error message
This bogus error message was introduced in 2013 by commit f177cbfe67,
because of misunderstanding the processCASbits() API; at the time, no
test cases were added that would be affected by this change.  Only in
ca87c415e2 was one added (along with a couple of typos), with an XXX
note that the error message was bogus.  Fix the whole thing, and add
some test cases.

Backpatch all the way back.

Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://postgr.es/m/202503041822.aobpqke3igvb@alvherre.pgsql
2025-03-04 20:07:30 +01:00
Masahiko Sawada
bacbc4863b Refactor Copy{From|To}GetRoutine() to use pass-by-reference argument.
The change improves efficiency by eliminating unnecessary copying of
CopyFormatOptions.

Coverity also complained about inefficiencies caused by
pass-by-value.

Oversight in 7717f6300 and 2e4127b6d.

Reported-by: Junwang Zhao <zhjwpku@gmail.com>
Reported-by: Tom Lane <tgl@sss.pgh.pa.us> (per reports from coverity)
Author: Sutou Kouhei <kou@clear-code.com>
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/CAEG8a3L6YCpPksTQMzjD_CvwDEhW3D_t=5md9BvvdOs5k+TA=Q@mail.gmail.com
2025-03-04 10:38:41 -08:00
Tomas Vondra
0b2a45a5d1 Compress TID lists when writing GIN tuples to disk
When serializing GIN tuples to tuplesorts during parallel index builds,
we can significantly reduce the amount of data by compressing the TID
lists. The GIN opclasses may produce a lot of data (depending on how
many keys are extracted from each row), and the TID compression is very
efficient and effective.

If the number of distinct keys is high, the first worker pass (reading
data from the table and writing them into a private tuplesort) may not
benefit from the compression very much. It is likely to spill data to
disk before the TID lists get long enough for the compression to help.
The second pass (writing the merged data into the shared tuplesort) is
more likely to benefit from compression.

The compression can be seen as a way to reduce the amount of disk space
needed by the parallel builds, because the data is written twice. First
into the per-worker tuplesorts, then into the shared tuplesort.

Author: Tomas Vondra
Reviewed-by: Matthias van de Meent, Andy Fan, Kirill Reshke
Discussion: https://postgr.es/m/6ab4003f-a8b8-4d75-a67f-f25ad98582dc%40enterprisedb.com
2025-03-04 19:02:05 +01:00
Tom Lane
9b4bdf876a Add .gitignore entry for ecpg test detritus.
Oversight in commit 28f04984f.
2025-03-04 12:58:07 -05:00
Tomas Vondra
c878de1db4 Make FP_LOCK_SLOTS_PER_BACKEND look like a function
The FP_LOCK_SLOTS_PER_BACKEND macro looks like a constant, but it
depends on the max_locks_per_transaction GUC, and thus can change. This
is non-obvious and confusing, so make it look more like a function by
renaming it to FastPathLockSlotsPerBackend().

While at it, use the macro when initializing fast-path shared memory,
instead of using the formula.

Reported-by: Andres Freund
Discussion: https://postgr.es/m/ffiwtzc6vedo6wb4gbwelon5nefqg675t5c7an2ta7pcz646cg%40qwmkdb3l4ett
2025-03-04 18:33:12 +01:00
Fujii Masao
91ecb5e0bc Add regression tests for pg_stat_progress_copy.tuples_skipped.
This commit adds tests to verify that tuples_skipped in pg_stat_progress_copy
works as expected. While existing tests checked other fields, tuples_skipped
was previously untested.

This improves test coverage and ensures accurate tracking of skipped tuples.
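
A sketch of the scenario being tested (path and table name invented):

    -- Session 1: a COPY that skips malformed rows instead of failing.
    COPY tab FROM '/tmp/data.csv' WITH (FORMAT csv, ON_ERROR ignore);
    -- Session 2: monitor progress, including the skipped-row count.
    SELECT relid::regclass, tuples_processed, tuples_skipped
    FROM pg_stat_progress_copy;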

Author: Jian He <jian.universality@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Josef Šimánek <josef.simanek@gmail.com>
Discussion: https://postgr.es/m/CACJufxFazq-bfyhiO0KBojR=yOr84E25Rqf6mHB0Ow0KPidkKw@mail.gmail.com
2025-03-04 23:56:49 +09:00
Heikki Linnakangas
d2e7068392 Fix outdated comment
Commit bc971f4025 replaced the latch-setting mechanism that the
comment talked about with a condition variable. And before that,
commit 2258e76f90 moved the code so that the comment got detached from
the loop that it talked about, so move the comment closer to the loop.
2025-03-04 15:33:19 +02:00
Daniel Gustafsson
ad13490be0 doc: Expand version compatibility for pg_basebackup features
This updates the paragraph on backward compatibility for server
features to include --incremental, which only works on servers with
v17 or newer.  Backpatch down to v17 where incremental backup was
added.

Author: David G. Johnston <David.G.Johnston@Gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CAKFQuwZYfZyeTkS3g2Ovw84TsxHa796xnf-u5kfgn_auyxZk0Q@mail.gmail.com
Backpatch-through: 17
2025-03-04 12:08:27 +01:00
Peter Eisentraut
3abbd8dbeb Fix accidental use of = instead of ==
Fix for commit 630f9a43ce.  It used = instead of ==.  The result
would be an incorrect error message.

Author: Jacob Brazeal <jacob.brazeal@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/flat/CA%2BCOZaC-JMbhQ4O0Q8V1Bxa0R%2BNex_RN9D6UyuLPiEx_CK4Heg%40mail.gmail.com
2025-03-04 09:45:01 +01:00
Peter Eisentraut
f011acdd61 Fix ALTER TABLE ADD VIRTUAL GENERATED COLUMN when table rewrite
demo:

    CREATE TABLE gtest20a (a int PRIMARY KEY,
                           b int GENERATED ALWAYS AS (a * 2) VIRTUAL);
    ALTER TABLE gtest20a ADD COLUMN c float8 DEFAULT RANDOM() CHECK (b < 60);
    ERROR:  no generation expression found for column number 2 of table "pg_temp_17306"

In ATRewriteTable, the pg_attrdef default expression entry corresponding
to the variable OIDNewHeap (if valid) was not populated.  So OIDNewHeap
cannot be used to call expand_generated_columns_in_expr or
build_generation_expression.  Therefore, in ATRewriteTable we can only
use the existing relation to expand the generated expression.

Author: jian he <jian.universality@gmail.com>
Reviewed-by: Srinath Reddy <srinath2133@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CACJufxEJ%3DFoajabWXjszo_yrQeKSxdZ87KJqBW373rSbajKGAA%40mail.gmail.com
2025-03-04 09:18:32 +01:00
Richard Guo
716a051aac Avoid NullTest deduction for clone clauses
In commit b262ad440, we introduced an optimization that reduces an IS
NOT NULL qual on a column defined as NOT NULL to constant true, and an
IS NULL qual on a NOT NULL column to constant false, provided we can
prove that the input expression of the NullTest is not nullable by any
outer join.  This deduction happens after we have generated multiple
clones of the same qual condition to cope with commuted-left-join
cases.

However, performing the NullTest deduction for clone clauses can be
unsafe, because we don't have a reliable way to determine if the input
expression of a NullTest is non-nullable: nullingrel bits in clone
clauses may not reflect reality, so we dare not draw conclusions from
clones about whether Vars are guaranteed not-null.

To fix, we check whether the given RestrictInfo is a clone clause in
restriction_is_always_true and restriction_is_always_false, and avoid
performing any reduction if it is.

There are several ensuing plan changes in predicate.out, and we have
to modify the tests to ensure that they continue to test what they are
intended to.  Additionally, this fix causes the test case added in
f00ab1fd1 to no longer trigger the bug that commit fixed, so we also
remove that test case.

Back-patch to v17 where this bug crept in.

Reported-by: Ronald Cruz <cruz@rentec.com>
Diagnosed-by: Tom Lane <tgl@sss.pgh.pa.us>
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/f5320d3d-77af-4ce8-b9c3-4715ff33f213@rentec.com
Backpatch-through: 17
2025-03-04 16:11:03 +09:00
Fujii Masao
28f04984f0 ecpg: Add TAP test for the ecpg command.
This commit adds a TAP test to verify that the ecpg command correctly
detects unsupported or disallowed statements in input files and reports
the appropriate error or warning messages.

This test helps catch bugs like the one introduced in commit 3d009e45bd,
which broke ecpg's handling of unsupported COPY FROM STDIN statements,
later fixed by commit 94b914f601.

Author: Ryo Kanbayashi <kanbayashi.dev@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CANOn0EzoMyxA1m-quDS1UeQUq6FNki6+GGiGucgr9tm2R78rKw@mail.gmail.com
2025-03-04 14:58:46 +09:00
Michael Paquier
c76db55c90 Split pgstat_bestart() into three different routines
pgstat_bestart(), used post-authentication to set up a backend entry
in the PgBackendStatus array so that its data becomes visible in
pg_stat_activity and related catalogs, has its logic divided into three
routines by this commit, called in order at different steps of
backend initialization:
* pgstat_bestart_initial() sets up the backend entry with a minimal
amount of information, reporting it with a new BackendState called
STATE_STARTING while waiting for backend initialization and client
authentication to complete.  The main benefit that this offers is
observability, making it possible to monitor backend activity
during authentication.  This step happens earlier than in the logic
prior to this commit.  pgstat_beinit() happens earlier as well, before
authentication.
* pgstat_bestart_security() reports the SSL/GSS status of the
connection, once authentication completes.  Auxiliary processes, for
example, do not need to call this step, hence it is optional.  This
step is called after performing authentication, same as previously.
* pgstat_bestart_final() reports the user and database IDs, takes the
entry out of STATE_STARTING, and reports its application_name.  This is
called as the last step of the three, once authentication completes.

An injection point is added, with a test checking that the "starting"
phase of a backend entry is visible in pg_stat_activity.  Some follow-up
patches are planned to take advantage of this refactoring with more
information provided in backend entries during authentication (LDAP
hanging was a problem for the author, initially).

Author: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CAOYmi+=60deN20WDyCoHCiecgivJxr=98s7s7-C8SkXwrCfHXg@mail.gmail.com
2025-03-04 14:09:44 +09:00
Michael Paquier
40d3f82744 Add more assertions in palloc0() and palloc_extended()
palloc() includes an assertion checking that an alloc() implementation
never returns NULL for all MemoryContextMethods.

This commit adds a similar assertion in palloc0().  In palloc_extended(),
a different assertion is added, checking that MCXT_ALLOC_NO_OOM is set
when an alloc() routine returns NULL.  These additions can be useful to
catch errors when implementing a new set of MemoryContextMethods
routines.

Author: Andreas Karlsson <andreas@proxel.se>
Discussion: https://postgr.es/m/507e8eba-2035-4a12-a777-98199a66beb8@proxel.se
2025-03-04 10:53:10 +09:00
Masahiko Sawada
ba57dcfdcd doc: Convert UUID functions list to table format.
Convert the list of UUID functions into a table for better
readability. This commit also adds references to the UUID type section
and includes descriptions of different UUID generation algorithm
versions.

Author: Andy Alsup <bluesbreaker@gmail.com>
Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at>
Discussion: https://postgr.es/m/CADOZ7s7OHag+r6w+BzKw2xgb3fVtAD-pU=_N9-9pSe5W1TB+xQ@mail.gmail.com
2025-03-03 15:44:01 -08:00
Tom Lane
246dedc5d0 Allow => syntax for named cursor arguments in plpgsql.
We've traditionally accepted "name := value" syntax for
cursor arguments in plpgsql.  But it turns out that the
equivalent statements in Oracle use "name => value".
Since we accept both forms of punctuation for function
arguments, it makes sense to do the same here.
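
For illustration, a hedged sketch of the two equivalent punctuations
(cursor and parameter names invented):

DO $$
DECLARE
  c CURSOR (n int) FOR SELECT generate_series(1, n);
BEGIN
  OPEN c (n := 3);   -- traditional plpgsql punctuation
  CLOSE c;
  OPEN c (n => 3);   -- now also accepted, matching function-call syntax
  CLOSE c;
END $$;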

Author: Pavel Stehule <pavel.stehule@gmail.com>
Reviewed-by: Gilles Darold <gilles@darold.net>
Discussion: https://postgr.es/m/CAFj8pRA3d0ARQEMbABa1n6q25AUdNmyO8aGs56XNf9pD4sRMjQ@mail.gmail.com
2025-03-03 18:00:13 -05:00
Thomas Munro
b6904afae4 ci: Use a RAM disk for NetBSD and OpenBSD.
Put the RAM disk setup for all three *BSD CI tasks into a common script,
replacing the old FreeBSD-specific one from commit 0265e5c1.  This makes
the NetBSD and OpenBSD tasks run 3 times and a bit over 2 times faster,
respectively.

NetBSD and FreeBSD now share the same one-liner to mount tmpfs.  OpenBSD
needs a GCP-image specific recipe that knows where to steal an unused
disk partition needed to reserve swap space for an mfs RAM disk, because
its tmpfs is deprecated and currently broken.  The configured size is
enough for our current tests but could potentially need future
expansion.  Thanks to Bilal for the disklabel incantation.

Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGJJ-XrPhN%2BQA4ZUfYAAXcwOSDty9t0vE9Z8__AdacKnQg%40mail.gmail.com
2025-03-04 11:29:21 +13:00
Melanie Plageman
06eae9e621 Trigger more frequent autovacuums with relallfrozen
Calculate the insert threshold for triggering an autovacuum of a
relation based on the number of unfrozen pages.

By only considering the unfrozen portion of the table when calculating
how many tuples to add to the insert threshold, we can trigger more
frequent vacuums of insert-heavy tables. This increases the chances of
vacuuming those pages while they still reside in shared buffers.

This also increases the number of autovacuums triggered by tuples
inserted and not by wraparound risk. We prefer to freeze these pages
during insert-triggered autovacuums, as anti-wraparound vacuums are not
automatically canceled by conflicting lock requests.

We calculate the unfrozen percentage of the table using the recently
added (99f8f3fbbc) relallfrozen column of pg_class.
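
A rough sketch of the adjusted threshold, assuming the scale factor is
applied to the unfrozen share of the table (names follow the autovacuum
GUCs and pg_class columns; the exact computation is per this commit):

insert_threshold = autovacuum_vacuum_insert_threshold
                 + autovacuum_vacuum_insert_scale_factor
                   * reltuples * (relpages - relallfrozen) / relpages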

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com>
Reviewed-by: Robert Treat <rob@xzilla.net>
Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com>
Discussion: https://postgr.es/m/flat/CAAKRu_aj-P7YyBz_cPNwztz6ohP%2BvWis%3Diz3YcomkB3NpYA--w%40mail.gmail.com
2025-03-03 14:42:00 -05:00
Tom Lane
35c8dd9e11 Simplify some logic around setting pg_attribute.atthasdef.
DefineRelation was of the opinion that it could usefully pre-fill
atthasdef flags to eliminate work for StoreAttrDefault.  This is not
the case, however: the tupledesc that it's filling is not the one that
InsertPgAttributeTuples will work from.  The tupledesc used there is
made by RelationBuildLocalRelation, which deliberately doesn't copy
atthasdef.  Moreover, if this did happen as the code thinks, it would
be wrong for the case of plain "DEFAULT NULL" clauses, since we detect
and ignore simple-null-Const defaults later on.  Hence, remove the
useless code.

It also emerges that it's not really worth a special-case path in
StoreAttrDefault() for atthasdef already being set, because as far as
we can see that never happens: cases where an existing default gets
updated always do RemoveAttrDefault first, so as to clean up
possibly-no-longer-correct dependency entries.  Even if that did happen,
the code would still work anyway.

Also remove a nearby comment made moot by 5eaa0e92e.

Author: jian he <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxHFssPvkP1we7WMhPD_1kwgbG52o=kQgL+TnVoX5LOyCQ@mail.gmail.com
2025-03-03 13:35:48 -05:00
Tom Lane
4528768d98 Remove now-dead code in StoreAttrDefault().
StoreAttrDefault() is no longer responsible for filling
attmissingval, so remove the code for that.

Get rid of RawColumnDefault.missingMode, too, as we no longer
need that to pass information around.

While here, clean up some sloppy coding in StoreAttrDefault(),
such as failure to use XXXGetDatum macros.  These aren't bugs
but they're not good code either.

Reported-by: jian he <jian.universality@gmail.com>
Author: jian he <jian.universality@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CACJufxHFssPvkP1we7WMhPD_1kwgbG52o=kQgL+TnVoX5LOyCQ@mail.gmail.com
2025-03-03 13:09:20 -05:00
Tom Lane
95f650674d Fix broken handling of domains in atthasmissing logic.
If a domain type has a default, adding a column of that type (without
any explicit DEFAULT clause) failed to install the domain's default
value in existing rows, instead leaving the new column null.  This
is unexpected, and it used to work correctly before v11.  The cause
is confusion in the atthasmissing mechanism about which default value
to install: we'd only consider installing an explicitly-specified
default, and then we'd decide that no table rewrite is needed.
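
A minimal illustration of the previously-broken case (object names
invented):

CREATE DOMAIN idom AS int DEFAULT 42;
CREATE TABLE t (a int);
INSERT INTO t VALUES (1);
ALTER TABLE t ADD COLUMN b idom;  -- no explicit DEFAULT clause
SELECT b FROM t;                  -- returned NULL before this fix; now 42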

To fix, take the responsibility for filling attmissingval out of
StoreAttrDefault, and instead put it into ATExecAddColumn's existing
logic that derives the correct value to fill the new column with.
Also, centralize the logic that determines the need for
default-related table rewriting there, instead of spreading it over
four or five places.

In the back branches, we'll leave the attmissingval-filling code
in StoreAttrDefault even though it's now dead, for fear that some
extension may be depending on that functionality to exist there.
A separate HEAD-only patch will clean up the now-useless code.

Reported-by: jian he <jian.universality@gmail.com>
Author: jian he <jian.universality@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CACJufxHFssPvkP1we7WMhPD_1kwgbG52o=kQgL+TnVoX5LOyCQ@mail.gmail.com
Backpatch-through: 13
2025-03-03 12:43:44 -05:00
Melanie Plageman
99f8f3fbbc Add relallfrozen to pg_class
Add relallfrozen, an estimate of the number of pages marked all-frozen
in the visibility map.

pg_class already has relallvisible, an estimate of the number of pages
in the relation marked all-visible in the visibility map. This is used
primarily for planning.

relallfrozen, together with relallvisible, is useful for estimating the
outstanding number of all-visible but not all-frozen pages in the
relation for the purposes of scheduling manual VACUUMs and tuning vacuum
freeze parameters.
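
For example, a hedged query sketch over the new column (both columns
are estimates and may be stale):

SELECT relname, relallvisible, relallfrozen,
       relallvisible - relallfrozen AS visible_but_not_frozen
FROM pg_class
WHERE relkind = 'r'
ORDER BY visible_but_not_frozen DESC;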

A future commit will use relallfrozen to trigger more frequent vacuums
on insert-focused workloads with significant volume of frozen data.

Bump catalog version

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Robert Treat <rob@xzilla.net>
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com>
Discussion: https://postgr.es/m/flat/CAAKRu_aj-P7YyBz_cPNwztz6ohP%2BvWis%3Diz3YcomkB3NpYA--w%40mail.gmail.com
2025-03-03 11:18:05 -05:00
Tomas Vondra
8492feb98f Allow parallel CREATE INDEX for GIN indexes
Allow using parallel workers to build a GIN index, similarly to BTREE
and BRIN. For large tables this may result in significant speedup when
the build is CPU-bound.
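
For illustration (object names invented; worker use is subject to the
usual maintenance-worker settings):

SET max_parallel_maintenance_workers = 4;
CREATE INDEX docs_body_gin ON docs USING gin (to_tsvector('english', body));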

The work is divided so that each worker builds index entries on a subset
of the table, determined by the regular parallel scan used to read the
data. Each worker uses a local tuplesort to sort and merge the entries
for the same key. The TID lists do not overlap (for a given key), which
means the merge sort simply concatenates the two lists. The merged
entries are written into a shared tuplesort for the leader.

The leader needs to merge the sorted entries again, before writing them
into the index. But this way a significant part of the work happens in
the workers, and the leader is left with merging fewer large entries,
which is more efficient.

Most of the parallelism infrastructure is a simplified copy of the code
used by BTREE indexes, omitting the parts irrelevant for GIN indexes
(e.g. uniqueness checks).

Original patch by me, with reviews and substantial improvements by
Matthias van de Meent, certainly enough to make him a co-author.

Author: Tomas Vondra, Matthias van de Meent
Reviewed-by: Matthias van de Meent, Andy Fan, Kirill Reshke
Discussion: https://postgr.es/m/6ab4003f-a8b8-4d75-a67f-f25ad98582dc%40enterprisedb.com
2025-03-03 16:53:06 +01:00
Michael Paquier
3f1db99bfa Handle auxiliary processes in SQL functions of backend statistics
This commit impacts the following SQL functions, authorizing the access
to the PGPROC entries of auxiliary processes when attempting to fetch or
reset backend-level pgstats entries:
- pg_stat_reset_backend_stats()
- pg_stat_get_backend_io()

This is relevant since a051e71e28, which changed the backend statistics
to authorize at least the WAL summarizer, WAL receiver and WAL writer
processes, following the addition of WAL I/O statistics in pg_stat_io
and backend statistics.  Written this way, the code is more flexible
with future changes, adapting automatically to any updates done in
pgstat_tracks_backend_bktype().
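
An illustrative call now accepted for one of these auxiliary processes
(a sketch, with the PID looked up via pg_stat_activity):

SELECT * FROM pg_stat_get_backend_io(
  (SELECT pid FROM pg_stat_activity WHERE backend_type = 'walwriter'));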

While on it, pgstat_report_wal() gains a call to pgstat_flush_backend(),
making sure that backend I/O statistics are updated when calling this
routine.  This makes the statistics report correctly for the WAL writer.
WAL receiver and WAL summarizer do not call pgstat_report_wal() yet
(spoiler: both should).  It should be possible to lift some of the
existing restrictions for other auxiliary processes, as well, but this
is left as future work.

Reported-by: Rahila Syed <rahilasyed90@gmail.com>
Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/CAH2L28v9BwN8_y0k6FQ591=0g2Hj_esHLGj3bP38c9nmVykoiA@mail.gmail.com
2025-03-03 09:57:48 +09:00
Fujii Masao
fe186bda78 postgres_fdw: Extend postgres_fdw_get_connections to return remote backend PID.
This commit adds a new "remote_backend_pid" output column to
the postgres_fdw_get_connections function. It returns the process ID of
the remote backend on the foreign server that is handling the connection.

This enhancement is useful for troubleshooting, monitoring, and reporting.
For example, if a connection is unexpectedly closed by the foreign server,
the remote backend's PID can help diagnose the cause.
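
Illustrative usage (other output columns omitted):

SELECT server_name, remote_backend_pid
FROM postgres_fdw_get_connections();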

No extension version bump is needed, as commit c297a47c5f already
handled it for v18~.

Author: Sagar Dilip Shedge <sagar.shedge92@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAPhYifF25q5xUQWXETfKwhc0YVa_6+tfG9Kw4bCvCjpCWxYs2A@mail.gmail.com
2025-03-03 08:51:30 +09:00
Peter Eisentraut
15a79c7311 Use PRI*64 instead of "ll*" in format strings (minimal trial)
Old: errmsg("hello %llu", (unsigned long long) x)
New: errmsg("hello %" PRIu64, x)

And likewise for everything printf-like.

In the past we had to use long long so localized format strings remained
architecture independent in message catalogs.  Although long long is
expected to be 64 bit everywhere, if we hadn't also cast the int64
values, we'd have generated compiler warnings on systems where int64 was
long.

Now that int64 is int64_t, C99 understands how to format them using
<inttypes.h> macros, the casts are not necessary, and the gettext()
tools recognize the macros and defer expansion until load time.  (And if
we ever manage to get -Wformat-signedness to work for us, that'd help
with these too, but not the type-system-clobbering casts.)

This particular patch converts only pg_checksums.c to the new system,
to allow testing of the translation toolchain for everyone.  If this
works okay, a later patch will convert most of the rest.

Author: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/b936d2fb-590d-49c3-a615-92c3a88c6c19%40eisentraut.org
2025-03-02 13:53:03 +01:00
Tom Lane
00d61a08c5 Fix pg_strtof() to not crash on NULL endptr.
We had managed not to notice this simple oversight because none
of our calls exercised the case --- until commit 8f427187d.
That led to pg_dump crashing on any platform that uses this code
(currently Cygwin and Mingw).

Even though there's no immediate bug in the back branches, backpatch,
because a non-POSIX-compliant strtof() substitute is trouble waiting
to happen for extensions or future back-patches.

Diagnosed-by: Alexander Lakhin <exclusion@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/339b3902-4e98-4e31-a744-94e43b7b9292@gmail.com
Backpatch-through: 13
2025-03-01 14:22:56 -05:00
Peter Eisentraut
56ba0463d3 Set amcancrosscompare to true for hash
This was missed in the refactoring in patch ce62f2f2a0, which thus
created a regression.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/E1tngY6-0000UL-2n%40gemulon.postgresql.org
2025-03-01 09:15:27 +01:00
Thomas Munro
c301a0a74a Work around OAuth/EVFILT_TIMER quirk on NetBSD.
NetBSD's EVFILT_TIMER doesn't like zero timeouts, as introduced by
commit b3f0be788.  Steal the workaround from the same problem on Linux
from a few lines up: round zero up to one.  Do this only for NetBSD, as
the other systems with the kevent() API accept zero and shouldn't have
to insert a small bogus wait.

Future improvement ideas:
 * when NetBSD < 10 falls out of support, we could try NOTE_ABSTIME for
   the "fire now" meaning if timeout == 0
 * when libcurl tells us to start a 0ms timer and call it back, we could
   figure out how to handle that more directly without involving the
   kernel (the current architecture doesn't make that straightforward)

Failures with EINVAL errors could be seen on the new optional NetBSD CI
task that we're trying to keep green as a candidate for inclusion as a
default-enabled CI task.  The NetBSD build farm animals aren't testing
OAuth yet, so no breakage there.

Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://postgr.es/m/CA%2BhUKGJ%2BWyJ26QGvO_nkgvbxgw%2B03U4EQ4Hxw%2BQBft6Np%2BXW7w%40mail.gmail.com
2025-03-01 14:41:02 +13:00
Masahiko Sawada
8a1012b35d Re-export NextCopyFromRawFields() to copy.h.
Commit 7717f63006 removed NextCopyFromRawFields() from copy.h. While
it was hoped that NextCopyFrom() could serve as an alternative,
certain use cases still require NextCopyFromRawFields(). For instance,
extensions like file_text_array_fdw, which process source data with an
unknown number of columns, rely on this function.

Per buildfarm member crake.

Reported-by: Andrew Dunstan <andrew@dunslane.net>
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Reviewed-by: Sutou Kouhei <kou@clear-code.com>
Discussion: https://postgr.es/m/5c7e1ac8-5083-4c08-af19-cb9ade2f16ce@dunslane.net
2025-02-28 15:11:41 -08:00
Nathan Bossart
e636da9200 Adjust auto_explain's GUC descriptions.
This commit adjusts auto_explain's GUC descriptions to follow the
style guidelines established by commit 977d865c36.  Specifically,
it ensures the accepted special values are listed in a consistent
manner.

Author: Ilia Evdokimov <ilya.evdokimov@tantorlabs.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Discussion: https://postgr.es/m/e82d4647-ce7f-45c7-9b01-fb900a050767%40tantorlabs.com
2025-02-28 16:05:51 -06:00
Tom Lane
8b49392b27 Tweak regex to avoid a bug in Perl 5.16.3.
For some reason, 5.16.3 (and perhaps slightly earlier/later versions)
go into an infinite loop with the version-replacement regex installed
by commit fc0d0ce97.  We can work around that by using an explicit
"\n" instead of the line-start metacharacter "^".

Reported-by: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/CAA5RZ0u9dV3CdKqkqdusA_RdvBkwWe0c0rxcFWj++VYoutFYSw@mail.gmail.com
2025-02-28 15:20:24 -05:00
Masahiko Sawada
7717f63006 Refactor COPY FROM to use format callback functions.
This commit introduces a new CopyFromRoutine struct, which is a set of
callback routines to read tuples in a specific format. It also makes
COPY FROM with the existing formats (text, CSV, and binary) utilize
these format callbacks.

This change is a preliminary step towards making the COPY FROM command
extensible in terms of input formats.

Similar to 2e4127b6d2, this refactoring contributes to a performance
improvement by reducing the number of "if" branches that need to be
checked on a per-row basis when sending field representations in text
or CSV mode. The performance benchmark results showed ~5% performance
gain in text or CSV mode.

Author: Sutou Kouhei <kou@clear-code.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Tomas Vondra <tomas.vondra@enterprisedb.com>
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
Discussion: https://postgr.es/m/20231204.153548.2126325458835528809.kou@clear-code.com
2025-02-28 10:29:36 -08:00
Robert Haas
77cb08be51 Avoid including explain.h in explain_format.h and explain_dr.h
As per a suggestion from Tom Lane, we do this by declaring "struct
ExplainState" here and referring to that rather than "ExplainState".

Also per Tom, CreateExplainSerializeDestReceiver was still defined
in explain.h in addition to explain_dr.h. Remove leftover prototype.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: http://postgr.es/m/CA+TgmoYtaad3i21V0jqua-fbr+CR0ix6uBvEX8_s6BG96abd=g@mail.gmail.com
2025-02-28 13:17:29 -05:00
Robert Haas
51d3e279c3 Fix missing space in EXPLAIN ANALYZE output.
Commit ddb17e387a introduced this
regression. Ideally, the regression tests would have caught this
mistake, but apparently they don't test with timing enabled,
presumably because that would make the output vary.

Author: Thom Brown <thom@linux.com>
Reviewed-by: Fabrízio de Royes Mello <fabriziomello@gmail.com>
Discussion: http://postgr.es/m/CAA-aLv6nq=UeiyvM7_Mxgo9TVBzs2oh46b9vfyLzuyVEz3j1-g@mail.gmail.com
2025-02-28 13:04:12 -05:00
Jeff Davis
424ededc58 Adjust pg_dump tag for relation stats.
Do not use fmtId(), just use dobj->name directly, like for table data.
2025-02-27 20:42:12 -08:00
Michael Paquier
c2a50ac678 Invent pgstat_fetch_stat_backend_by_pid()
This code is extracted from pg_stat_get_backend_io() in pgstatfuncs.c,
so that it can be shared with other areas that need backend pgstats
entries while having the benefits of the various sanity checks
refactored here.  As per its name, this retrieves backend statistics
based on a PID, with the option of retrieving a BackendType if given in
input.

Currently, this is used for the backend-level IO statistics.  The next
move would be to reuse that for the backend-level WAL statistics.

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/Z3zqc4o09dM/Ezyz@ip-10-97-1-34.eu-west-3.compute.internal
2025-02-28 11:20:31 +09:00
Michael Paquier
2a083ab807 pg_upgrade: Fix inconsistency in memory freeing
The function in charge of freeing the memory from a result created by
PQescapeIdentifier() has to be PQfreemem(), to ensure that both
allocation and free come from libpq.

One spot in pg_upgrade was not respecting that for pg_database's
datlocale (daticulocale in v16) when the collation provider is libc (aka
datlocale/daticulocale is NULL), with the allocation done using
pg_strdup() and the free done with PQfreemem().  The code is changed to
always use PQescapeLiteral() when processing the input.

Oversight in 9637badd9f.  This commit is similar to 48e4ae9a07 and
5b94e27534.

Author: Michael Paquier <michael@paquier.xyz>
Co-authored-by: Ranier Vilela <ranier.vf@gmail.com>
Discussion: https://postgr.es/m/Z601RQxTmIUohdkV@paquier.xyz
Backpatch-through: 16
2025-02-28 10:15:29 +09:00
Masahiko Sawada
2e4127b6d2 Refactor COPY TO to use format callback functions.
This commit introduces a new CopyToRoutine struct, which is a set of
callback routines to copy tuples in a specific format. It also makes
the existing formats (text, CSV, and binary) utilize these format
callbacks.

This change is a preliminary step towards making the COPY TO command
extensible in terms of output formats.

Additionally, this refactoring contributes to a performance
improvement by reducing the number of "if" branches that need to be
checked on a per-row basis when sending field representations in text
or CSV mode. The performance benchmark results showed ~5% performance
gain in text or CSV mode.

Author: Sutou Kouhei <kou@clear-code.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Tomas Vondra <tomas.vondra@enterprisedb.com>
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
Discussion: https://postgr.es/m/20231204.153548.2126325458835528809.kou@clear-code.com
2025-02-27 15:03:52 -08:00
Robert Haas
555960a0fb Create explain_dr.c and move DestReceiver-related code there.
explain.c has grown rather large, and the code that deals with the
DestReceiver that supports the SERIALIZE option is pretty easily severable
from the rest of explain.c; hence, move it to a separate file.

Reviewed-by: Peter Geoghegan <pg@bowt.ie>
Discussion: http://postgr.es/m/CA+TgmoYutMw1Jgo8BWUmB3TqnOhsEAJiYO=rOQufF4gPLWmkLQ@mail.gmail.com
2025-02-27 13:14:16 -05:00
Robert Haas
9173e8b604 Create explain_format.c and move relevant code there.
explain.c has grown rather large, so move various functions that
are principally concerned with output generation to a new source
file, explain_format.c, instead of lumping them in with everything
else that is part of explain.c

Reviewed-by: Peter Geoghegan <pg@bowt.ie>
Discussion: http://postgr.es/m/CA+TgmoYutMw1Jgo8BWUmB3TqnOhsEAJiYO=rOQufF4gPLWmkLQ@mail.gmail.com
2025-02-27 12:37:10 -05:00
Robert Haas
95dbd827f2 EXPLAIN: Always use two fractional digits for row counts.
Commit ddb17e387a attempted to avoid
confusing users by displaying digits after the decimal point only when
nloops > 1, since it's impossible to have a fractional row count after a
single iteration. However, this made the regression tests unstable since
parallel queries will have nloops>1 for all nodes below the Gather or
Gather Merge in normal cases, but if the workers don't start in time and
the leader finishes all the work, they will suddenly have nloops==1,
making it unpredictable whether the digits after the decimal point would
be displayed or not. Although 44cbba9a7f
seemed to fix the immediate failures, it may still be the case that there
are lower-probability failures elsewhere in the regression tests.

Various fixes are possible here. For example, it has previously been
proposed that we should try to display the digits after the decimal
point only if rows/nloops is an integer, but currently rows is stored
as a float so it's not theoretically an exact quantity -- precision
could be lost in extreme cases. It has also been proposed that we
should try to display the digits after the decimal point only if we're
under some sort of construct that could potentially cause looping
regardless of whether it actually does. While such ideas are not
without merit, this patch adopts the much simpler solution of always
displaying two decimal digits. If that approach stands up to scrutiny
from the buildfarm and human users, it spares us the trouble of doing
anything more complex; if not, we can reassess.
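
Illustrative output shape (query, plan and numbers invented):

EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF) SELECT * FROM t;
-- e.g.  Seq Scan on t (actual rows=100.00 loops=1)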

This commit incidentally reverts 44cbba9a7f,
which should no longer be needed.

Author: Robert Haas <robertmhaas@gmail.com>
Author: Ilia Evdokimov <ilya.evdokimov@tantorlabs.com>
Discussion: http://postgr.es/m/CA+TgmoazzVHn8sFOMFAEwoqBTDxKT45D7mvkyeHgqtoD2cn58Q@mail.gmail.com
2025-02-27 11:27:16 -05:00
Peter Eisentraut
ce62f2f2a0 Generalize hash and ordering support in amapi
Stop comparing access method OID values against HASH_AM_OID and
BTREE_AM_OID, and instead check the IndexAmRoutine for an index to see
if it advertises its ability to perform the necessary ordering,
hashing, or cross-type comparing functionality.  A field amcanorder
already existed, this uses it more widely.  Fields amcanhash and
amcancrosscompare are added for the other purposes.

Author: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2025-02-27 17:03:31 +01:00
Tom Lane
6eb8a1a4f9 Avoid unnecessary computation of pgbench's script line number.
ParseScript only needs the lineno for meta-commands, so let's not
bother computing it otherwise.  While this doesn't save much given
the previous patch, there's no point in doing unnecessary work.
While we're at it, avoid calling psql_scan_get_location() twice for
a meta-command.

One reason for making this change is that the line number computed
in ParseScript's main loop was actually wrong in most cases: it
would point just past the semicolon of the previous SQL command,
not at what the user thinks the current command's line number is.
We could add some code to skip whitespace before capturing the line
number, but it would be pretty pointless at present.  Just move the
call to avoid the temptation to rely on that value.  (Once we've
lexed the backslash, the computed line number will be right.)

This change also means that pgbench never inquires about the
location before it's lexed something, so that the care taken in
the previous patch to behave sanely in that case is unnecessary.
It seems best to keep that logic, though, as future callers
might depend on it.

Author: Daniel Vérité <daniel@manitou-mail.org>
Discussion: https://postgr.es/m/84a8a89e-adb8-47a9-9d34-c13f7150ee45@manitou-mail.org
2025-02-27 10:57:55 -05:00
Tom Lane
c8c74ad7e1 Get rid of O(N^2) script-parsing overhead in pgbench.
pgbench wants to record the starting line number of each command
in its scripts.  It was computing that by scanning from the script
start and counting newlines, so that O(N^2) work had to be done
for an N-command script.  In a script with 50K lines, this adds
up to about 10 seconds on my machine.

To add insult to injury, the results were subtly wrong, because
expr_scanner_offset() scanned to find the NUL that flex inserts
at the end of the current token --- and before the first yylex
call, no such NUL has been inserted.  So we ended by computing the
script's last line number not its first one.  This was visible only
in case of \gset at the start of a script, which perhaps accounts
for the lack of complaints.

To fix, steal an idea from plpgsql and track the current lexer
ending position and line count as we advance through the script.
(It's a bit simpler than plpgsql since we never need to back up.)
Also adjust a couple of other places that were invoking scans
from script start when they didn't really need to.  I made a new
psqlscan function psql_scan_get_location() that replaces both
expr_scanner_offset() and expr_scanner_get_lineno(), since in
practice expr_scanner_get_lineno() was only being invoked to find
the line number of the current lexer end position.

Reported-by: Daniel Vérité <daniel@manitou-mail.org>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/84a8a89e-adb8-47a9-9d34-c13f7150ee45@manitou-mail.org
2025-02-27 10:53:38 -05:00
Alexander Korotkov
e167191dc1 Get rid of ojrelid local variable in remove_rel_from_query()
As spotted by Coverity, the calculation of ojrelid mixes signed and unsigned
types, causing possible overflow and undefined behavior.  Instead of trying to
fix the expression, this commit eliminates the ojrelid local variable.
Explicit branching is used to replace the -1 value.  That, in turn, requires
changing the signature of the remove_rel_from_eclass() function.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/914330.1740330169%40sss.pgh.pa.us
Reviewed-by: Andrei Lepikhov <lepihov@gmail.com>
2025-02-27 11:22:01 +02:00
Thomas Munro
55918f798b Remove arbitrary cap on read_stream.c buffer queue.
Previously the internal queue of buffers was capped at max_ios * 4,
though not less than io_combine_limit, at allocation time.  That was
done in the first version based on conservative theories about resource
usage and heuristics pending later work.  The configured I/O depth could
not always be reached with dense random streams generated by ANALYZE,
VACUUM, the proposed Bitmap Heap Scan patch, and also sequential streams
with the proposed AIO subsystem to name some examples.

The new formula is (max_ios + 1) * io_combine_limit, enough buffers for
the full configured I/O concurrency level using the full configured I/O
combine size, plus the buffers from one finished but not yet consumed
full-sized I/O.  Significantly more memory would be needed for high GUC
values if the client code requests a large per-buffer data size, but
that is discouraged (existing and proposed stream users try to keep it
under a few words, if not zero).

With this new formula, an intermediate variable could have overflowed
under maximum GUC values, so its data type is adjusted to cope.

Discussion: https://postgr.es/m/CA%2BhUKGK_%3D4CVmMHvsHjOVrK6t4F%3DLBpFzsrr3R%2BaJYN8kcTfWg%40mail.gmail.com
2025-02-27 20:49:48 +13:00
Michael Paquier
48e4ae9a07 pg_amcheck: Fix inconsistency in memory freeing
The function in charge of freeing the memory from a result created by
PQescapeIdentifier() has to be PQfreemem(), to ensure that both
allocation and free come from libpq, but one spot in pg_amcheck was
missing that.

Oversight in b859d94c63.

Author: Ranier Vilela <ranier.vf@gmail.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Discussion: https://postgr.es/m/CAEudQArD_nKSnYCNUZiPPsJ2tNXgRmLbXGSOrH1vpOF_XtP0Vg@mail.gmail.com
Discussion: https://postgr.es/m/CAEudQArbTWVSbxq608GRmXJjnNSQ0B6R7CSffNnj2hPWMUsRNg@mail.gmail.com
Backpatch-through: 14
2025-02-27 14:05:51 +09:00
Amit Kapila
8709dccc79 Fix the race condition in ReplicationSlotAcquire().
After commit f41d8468dd, a process could acquire and use a replication
slot that had just been invalidated, leading to failures while accessing
WAL.

To ensure that we don't accidentally start using invalid slots, we must
perform the invalidation check after acquiring the slot or under the
spinlock where we associate the slot with a particular process. We choose
the earlier method to keep the code simple.

Reported-by: Hou Zhijie <houzj.fnst@fujitsu.com>
Author: Nisha Moond <nisha.moond412@gmail.com>
Reviewed-by: Hou Zhijie <houzj.fnst@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CABdArM7J-LbGoMPGUPiFiLOyB_TZ5+YaZb=HMES0mQqzVTn8Gg@mail.gmail.com
2025-02-27 09:47:04 +05:30
Amit Kapila
845511a72a Doc: Additional clarification for -d option of pg_createsubscriber.
Author: vignesh C <vignesh21@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CALDaNm0zsFUYpe-tLha+-sp3K8KmBXu0o=LUN=8FFtxMLYikPA@mail.gmail.com
2025-02-27 08:50:03 +05:30
Michael Paquier
495864a4cf Refactor code of pg_stat_get_wal() building result tuple
This commit adds to pgstatfuncs.c a new routine called
pg_stat_wal_build_tuple(), helper routine for pg_stat_get_wal().  This
is in charge of filling one tuple based on the contents of
PgStat_WalStats retrieved from pgstats.

This refactoring will be used by an upcoming patch introducing
backend-level WAL statistics, simplifying the main patch.  Note that
it is not possible for stats_reset to be NULL in pg_stat_wal; backend
statistics, however, will need to be able to handle that case.

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/Z3zqc4o09dM/Ezyz@ip-10-97-1-34.eu-west-3.compute.internal
2025-02-27 11:54:36 +09:00
Michael Paquier
62ec3e1f67 Fix possible double-release of spinlock in procsignal.c
9d9b9d46f3 has added spinlocks to protect the fields in ProcSignal
flags, introducing a code path in ProcSignalInit() where a spinlock
could be released twice if the pss_pid field of a ProcSignalSlot is
found to be already set.  Multiple spinlock releases have no effect with
most spinlock implementations, but this could cause the code to run into
issues when the spinlock is acquired concurrently by a different
process.

This sanity check on pss_pid generates a LOG that can be delayed until
after the spinlock is released as, like older versions up to v17, the
code expects the initialization of the ProcSignalSlot to happen even if
pss_pid is found to be incorrect.  The code is changed so that the old pss_pid
is read while holding the slot's spinlock, with the LOG from the sanity
check generated after releasing the spinlock, preventing the double
release.

Author: Maksim Melnikov <m.melnikov@postgrespro.ru>
Co-authored-by: Maxim Orlov <orlovmg@gmail.com>
Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru>
Discussion: https://postgr.es/m/dca47527-2d8b-4e3b-b5a0-e2deb73371a4@postgrespro.ru
2025-02-27 09:43:06 +09:00
Jeff Davis
15df9d7b51 Remove stray diff introduced by a5cbdeb98a.
Reported-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/Z77IkjmmfbFfNh3f@paquier.xyz
2025-02-26 13:37:14 -08:00
Tom Lane
40e27d04b4 Use attnum to identify index columns in pg_restore_attribute_stats().
Previously we used attname for both table and index columns, but
that is problematic for indexes because their attnames are assigned
by internal rules that don't guarantee to preserve the names across
dump and reload.  (This is what's causing the remaining buildfarm
failures in cross-version-upgrade tests.)  Fortunately we can use
attnum instead, since there's no such thing as adding or dropping
columns in an existing index.  We met this same problem previously
with ALTER INDEX ... SET STATISTICS, and solved it the same way,
cf commit 5b6d13eec.

In pg_restore_attribute_stats() itself, we accept either attnum or
attname, but the policy used by pg_dump is to always use attname
for tables and attnum for indexes.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Author: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://postgr.es/m/1457469.1740419458@sss.pgh.pa.us
2025-02-26 16:36:20 -05:00
Peter Eisentraut
f734c9fc3a Revert "Prepare for Python "Limited API" in PL/Python"
This reverts commit c47e8df815.

That commit makes the plpython tests crash with Python 3.6.* and
3.7.*.  It will need further investigation and testing, so revert for
now.
2025-02-26 21:58:38 +01:00
Masahiko Sawada
945a9e3832 Fix a typo in 005_char_signedness.pl test.
The test in 005_char_signedness.pl was missing a dash in the
--set-char-signedness option. Although the test didn't fail since it
doesn't check the error message, it resulted in an unexpected error
message instead of the intended one.

Oversight in 1aab680591.

Author: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Discussion: https://postgr.es/m/87tt8h1vb7.fsf@wibble.ilmari.org
2025-02-26 11:10:03 -08:00
Peter Eisentraut
c47e8df815 Prepare for Python "Limited API" in PL/Python
Using the Python Limited API would allow building PL/Python against
any Python 3.x version and using another Python 3.x version at run
time.  This commit does not activate that, but it prepares the code to
only use APIs supported by the Limited API.

Implementation details:

- Convert static types to heap types
  (https://docs.python.org/3/howto/isolating-extensions.html#heap-types).

- Replace PyRun_String() with component functions.

- Replace PyList_SET_ITEM() with PyList_SetItem().

Reviewed-by: Jakob Egger <jakob@eggerapps.at>
Discussion: https://www.postgresql.org/message-id/flat/ee410de1-1e0b-4770-b125-eeefd4726a24@eisentraut.org
2025-02-26 16:14:39 +01:00
Michael Paquier
0e42d31b0b Adding new PgStat_WalCounters structure in pgstat.h
This new structure contains the counters and the data related to the WAL
activity statistics gathered from WalUsage, separated into its own
structure so that it can be shared across more than one Stats structure in
pgstat.h.

This refactoring will be used by an upcoming patch introducing
backend-level WAL statistics.

Bump PGSTAT_FILE_FORMAT_ID.

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/Z3zqc4o09dM/Ezyz@ip-10-97-1-34.eu-west-3.compute.internal
2025-02-26 16:48:54 +09:00
Michael Paquier
d7cbeaf261 Remove pgstat_flush_wal()
All the processes that generate WAL should call pgstat_report_wal() to
report all their statistics related to WAL, and this is already what
happens in the tree.  Keeping pgstat_flush_wal() is confusing while the
other routine is encouraged.

This routine is not required since fc415edf8c, where it was last
used in pgstat_report_stat() before an equivalent callback existed.

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/Z71oPkJJICrRB5Ws@paquier.xyz
2025-02-26 15:37:28 +09:00
Amit Kapila
e117cfb2f6 Add two-phase option in pg_createsubscriber.
This patch introduces the '--enable-two-phase' option to the
'pg_createsubscriber' utility, allowing users to enable two-phase commit
for all subscriptions during their creation.

Note that even without this option users can enable the two_phase option
for the subscriptions created by pg_createsubscriber. However, it requires
the subscription to be disabled first which could be inconvenient for
users.
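
A hedged sketch of the manual steps the new option avoids (subscription
name invented):

ALTER SUBSCRIPTION mysub DISABLE;
ALTER SUBSCRIPTION mysub SET (two_phase = on);
ALTER SUBSCRIPTION mysub ENABLE;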

When two-phase commit is enabled, prepared transactions are sent to the
subscriber at the time of 'PREPARE TRANSACTION', and they are processed as
two-phase transactions on the subscriber as well. If disabled, prepared
transactions are sent only when committed and are processed immediately by
the subscriber.

Author: Shubham Khanna <khannashubham1197@gmail.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Ajin Cherian <itsajin@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CAHv8RjLPdFP=kA5LNSmWZ=+GMXmO+LczvV6p9HJjsXxZz10KGA@mail.gmail.com
2025-02-26 11:12:50 +05:30
Michael Paquier
adc6032fa8 Improve FATAL message for invalid TLI history at recovery
The original message did not mention where the checkpoint record LSN was
found, a control file or a backup_label file.  A couple of LOG messages
are generated before this FATAL check is reached, providing more details
about the way recovery is set up.  However, knowing this information in
this specific message is useful for debugging.  This is also useful for
instances where log_min_messages is set to FATAL or more, where LOG
messages do not show up.

Author: Benoit Lobréau <benoit.lobreau@dalibo.com>
Reviewed-by: David Steele <david@pgbackrest.org>
Discussion: https://postgr.es/m/4ed10bc8-5513-4d8e-8643-8abcaa08336d@dalibo.com
2025-02-26 14:26:16 +09:00
Jeff Davis
6ee3b91bad pg_dump: prepare attribute stats query.
Follow precedent in pg_dump for preparing queries to improve
performance. Also, simplify the query by removing unnecessary joins.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reported-by: Andres Freund <andres@anarazel.de>
Co-authored-by: Corey Huinker <corey.huinker@gmail.com>
Co-authored-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://postgr.es/m/CADkLM=dRMC6t8gp9GVf6y6E_r5EChQjMAAh_vPyih_zMiq0zvA@mail.gmail.com
2025-02-25 19:52:11 -08:00
Jeff Davis
8f427187db Avoid unnecessary relation stats query in pg_dump.
The few fields we need can be easily collected in getTables() and
getIndexes() and stored in RelStatsInfo.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reported-by: Andres Freund <andres@anarazel.de>
Co-authored-by: Corey Huinker <corey.huinker@gmail.com>
Co-authored-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://postgr.es/m/CADkLM=f0a43aTd88xW4xCFayEF25g-7hTrHX_WhV40HyocsUGg@mail.gmail.com
2025-02-25 19:51:45 -08:00
Michael Paquier
6c349d83b6 Re-add GUC track_wal_io_timing
This commit is a rework of 2421e9a51d, about which Andres Freund has
raised some concerns as it is valuable to have both track_io_timing and
track_wal_io_timing in some cases, as the WAL write and fsync paths can
be a major bottleneck for some workloads.  Hence, it can be relevant to
not calculate the WAL timings in environments where pg_test_timing
performs poorly while capturing some IO data under track_io_timing for
the non-WAL IO paths.  The opposite can also be true: it should be
possible to disable the non-WAL timings and enable the WAL timings (the
previous GUC setups allowed this possibility).

track_wal_io_timing is added back in this commit, controlling whether WAL
timings should be calculated in pg_stat_io for the read, fsync and write
paths, as done previously with pg_stat_wal.  pg_stat_wal previously
tracked only the sync and write parts (now removed); read stats are new
data tracked in pg_stat_io, and all three are aggregated if
track_wal_io_timing is enabled.  The read part matters during recovery
or if an XLogReader is used.
Extra note: more control over if the types of timings calculated in
pg_stat_io could be done with a GUC that lists pairs of (IOObject,IOOp).

Reported-by: Andres Freund <andres@anarazel.de>
Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Co-authored-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/3opf2wh2oljco6ldyqf7ukabw3jijnnhno6fjb4mlu6civ5h24@fcwmhsgmlmzu
2025-02-26 09:49:59 +09:00
Jeff Davis
a5cbdeb98a Remove redundant pg_set_*_stats() variants.
After commit f3dae2ae58, the primary purpose of separating the
pg_set_*_stats() from the pg_restore_*_stats() variants was
eliminated.

Leave pg_restore_relation_stats() and pg_restore_attribute_stats(),
which satisfy both purposes, and remove pg_set_relation_stats() and
pg_set_attribute_stats().

Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://postgr.es/m/1457469.1740419458@sss.pgh.pa.us
2025-02-25 16:15:47 -08:00
Andres Freund
ecbff4378b Change _mdfd_segpath() to return paths by value
This basically mirrors the changes done in the predecessor commit. While there
isn't currently a need to get these paths in critical sections, it seems a
shame to unnecessarily allocate memory in these paths now that relpath()
doesn't allocate anymore.

Discussion: https://postgr.es/m/xeri5mla4b5syjd5a25nok5iez2kr3bm26j2qn4u7okzof2bmf@kwdh2vf7npra
2025-02-25 09:02:07 -05:00
Andres Freund
37c87e63f9 Change relpath() et al to return path by value
For AIO, and also some other recent patches, we need the ability to call
relpath() in a critical section. Until now that was not feasible, as it
allocated memory.

The fact that relpath() allocated memory also made it awkward to use in log
messages because we had to take care to free the memory afterwards, which we
e.g. didn't do when zeroing out an invalid buffer.

We discussed other solutions, e.g. filling a pre-allocated buffer that's
passed to relpath(), but they all came with plenty of downsides or were larger
projects. The easiest fix seems to be to make relpath() return the path by
value.

To be able to return the path by value we need to determine the maximum length
of a relation path. This patch adds a long #define that computes the exact
maximum, which is verified to be correct in a regression test.

As this changes the signature of relpath(), extensions using it will need to
adapt their code. We discussed leaving a backward-compat shim in place, but
decided it's not worth it given the use of relpath() doesn't seem widespread.

Discussion: https://postgr.es/m/xeri5mla4b5syjd5a25nok5iez2kr3bm26j2qn4u7okzof2bmf@kwdh2vf7npra
2025-02-25 09:02:07 -05:00
Peter Eisentraut
32c393f9f1 Remove obsolete Python version check
The checked version is already the current minimum supported version
(3.2).

Discussion: https://www.postgresql.org/message-id/flat/ee410de1-1e0b-4770-b125-eeefd4726a24@eisentraut.org
2025-02-25 14:11:38 +01:00
Richard Guo
363a6e8c6f Eliminate code duplication in replace_rte_variables callbacks
The callback functions ReplaceVarsFromTargetList_callback and
pullup_replace_vars_callback are both used to replace Vars in an
expression tree that reference a particular RTE with items from a
targetlist, and they both need to expand whole-tuple references and
deal with OLD/NEW RETURNING list Vars.  As a result, currently there
is significant code duplication between these two functions.

This patch introduces a new function, ReplaceVarFromTargetList, to
perform the replacement and calls it from both callback functions,
thereby eliminating code duplication.

Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Jian He <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CAEZATCWhr=FM4X5kCPvVs-g2XEk+ceLsNtBK_zZMkqFn9vUjsw@mail.gmail.com
2025-02-25 16:11:34 +09:00
Richard Guo
1e4351af32 Expand virtual generated columns in the planner
Commit 83ea6c540 added support for virtual generated columns that are
computed on read.  All Var nodes in the query that reference virtual
generated columns must be replaced with the corresponding generation
expressions.  Currently, this replacement occurs in the rewriter.
However, this approach has several issues.  If a Var referencing a
virtual generated column has any varnullingrels, those varnullingrels
need to be propagated into the generation expression.  Failing to do
so can lead to "wrong varnullingrels" errors and improper outer-join
removal.

Additionally, if such a Var comes from the nullable side of an outer
join, we may need to wrap the generation expression in a
PlaceHolderVar to ensure that it is evaluated at the right place and
hence is forced to null when the outer join should do so.  In certain
cases, such as when the query uses grouping sets, we also need a
PlaceHolderVar for anything that is not a simple Var to isolate
subexpressions.  Failure to do so can result in incorrect results.

To fix these issues, this patch expands the virtual generated columns
in the planner rather than in the rewriter, and leverages the
pullup_replace_vars architecture to avoid code duplication.  The
generation expressions will be correctly marked with nullingrel bits
and wrapped in PlaceHolderVars when needed by the pullup_replace_vars
callback function.  This requires handling the OLD/NEW RETURNING list
Vars in pullup_replace_vars_callback, as it may now deal with Vars
referencing the result relation instead of a subquery.

The "wrong varnullingrels" error was reported by Alexander Lakhin.
The incorrect result issue and the improper outer-join removal issue
were reported by Richard Guo.

Author: Richard Guo <guofenglinux@gmail.com>
Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Jian He <jian.universality@gmail.com>
Discussion: https://postgr.es/m/75eb1a6f-d59f-42e6-8a78-124ee808cda7@gmail.com
2025-02-25 16:10:25 +09:00
Michael Paquier
560a842d63 Fix untranslatable string concatenation in pg_upgrade
Oversight in 1aab680591.

Author: Kyotaro Horiguchi
Discussion: https://postgr.es/m/20250225.140953.1271748916018759840.horikyota.ntt@gmail.com
2025-02-25 15:53:32 +09:00
Amit Kapila
5b8f2ccc0a Doc: Fix pg_copy_logical_replication_slot description.
This commit documents that the failover option is not copied when using
the pg_copy_logical_replication_slot function.

In passing, we modify the comments in the function clarifying the reason
for this behavior.
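
Illustrative call (slot names invented; the copy's failover flag is left
at its default rather than inherited from the source):

SELECT pg_copy_logical_replication_slot('src_slot', 'dst_slot');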

Reported-by: <duffieldzane@gmail.com>
Author: Hou Zhijie <houzj.fnst@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Backpatch-through: 17, where it was introduced
Discussion: https://postgr.es/m/173976850802.682632.11315364077431550250@wrigleys.postgresql.org
2025-02-25 09:42:07 +05:30
Jeff Davis
15601fa21a Missing doc update for f3dae2ae58. 2025-02-24 17:27:32 -08:00
Jeff Davis
f3dae2ae58 Do not use in-place updates for statistics import.
The use of in-place updates was originally there to follow the
precedent of ANALYZE and to reduce the potential for bloat on
pg_class. Per discussion, it's not worth the risks.

Reported-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/cpdanvzykcb5o64rmapkx6n5gjypoce3y52hff7ocxupgpbxu4@53jmlyvukijo
2025-02-24 17:10:59 -08:00
Michael Paquier
3ce357584e psql: Add pipeline status to prompt and some state variables
This commit adds %P to psql prompts, which reports the status of a
pipeline depending on PQpipelineStatus(): on, off or abort.

The following variables are added to report the state of an ongoing
pipeline:
- PIPELINE_SYNC_COUNT: reports the number of piped syncs.
- PIPELINE_COMMAND_COUNT: reports the number of piped commands, a
command being either \bind, \bind_named, \close or \parse.
- PIPELINE_RESULT_COUNT: reports the results available to read with
\getresults.

These variables can be used with \echo or in a prompt, using "%:name:"
in PROMPT1, PROMPT2 or PROMPT3.  Some basic regression tests are added
for these.  The suggestion to use variables to show the details about
the status counters comes from me.  The patch as originally proposed was
less extensible, hardcoding the output in the prompt.
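
An illustrative psql snippet using the new escape and variables (prompt
string invented):

\set PROMPT1 '%/ [%P] => '
\echo :PIPELINE_SYNC_COUNT :PIPELINE_COMMAND_COUNT :PIPELINE_RESULT_COUNT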

Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Discussion: https://postgr.es/m/CAO6_XqroE7JuMEm1sWz55rp9fAYX2JwmcP_3m_v51vnOFdsLiQ@mail.gmail.com
2025-02-25 10:07:24 +09:00
Amit Langote
cbb9086c9e Fix bug in cbc127917 to handle nested Append correctly
A non-leaf partition with a subplan that is an Append node was
omitted from PlannedStmt.unprunableRelids because it was mistakenly
included in PlannerGlobal.prunableRelids due to the way
PartitionedRelPruneInfo.leafpart_rti_map[] is constructed. This
happened when a non-leaf partition used an unflattened Append or
MergeAppend.  As a result, ExecGetRangeTableRelation() reported an
error when called from CreatePartitionPruneState() to process the
partition's own PartitionPruneInfo, since it was treated as prunable
when it should not have been.

Reported-by: Alexander Lakhin <exclusion@gmail.com> (via sqlsmith)
Diagnosed-by: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/74839af6-aadc-4f60-ae77-ae65f94bf607@gmail.com
2025-02-25 09:24:42 +09:00
Masahiko Sawada
48796a98d5 Fix assertion when decoding XLOG_PARAMETER_CHANGE on promoted primary.
When a standby replays an XLOG_PARAMETER_CHANGE record that lowers
wal_level below logical, we invalidate all logical slots in hot
standby mode. However, if this record was replayed while not in hot
standby mode, logical slots could remain valid even after promotion,
potentially causing an assertion failure during WAL record decoding.

To fix this issue, this commit adds a check for hot_standby status
when restoring a logical replication slot on standbys. This check
ensures that logical slots are invalidated when they become
incompatible due to insufficient wal_level during recovery.

Backpatch to v16 where logical decoding on standby was introduced.

Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/CAD21AoABoFwGY_Rh2aeE6tEq3HkJxf0c6UeOXn4VV9v6BAQPSw%40mail.gmail.com
Backpatch-through: 16
2025-02-24 14:03:04 -08:00
Daniel Gustafsson
d1146dc2a7 oauth: Rename macro to avoid collisions on Windows
Our json parsing defined the macros OPTIONAL and REQUIRED to decorate the
structs with for increased readability. This however collides with macros
in the <windef.h> header on Windows.

../src/interfaces/libpq/fe-auth-oauth-curl.c:398:9: warning: "OPTIONAL" redefined
  398 | #define OPTIONAL false
      |         ^~~~~~~~
In file included from D:/a/_temp/msys64/ucrt64/include/windef.h:9,
                 from D:/a/_temp/msys64/ucrt64/include/windows.h:69,
                 from D:/a/_temp/msys64/ucrt64/include/winsock2.h:23,
                 from ../src/include/port/win32_port.h:60,
                 from ../src/include/port.h:24,
                 from ../src/include/c.h:1331,
                 from ../src/include/postgres_fe.h:28,
                 from ../src/interfaces/libpq/fe-auth-oauth-curl.c:16:
include/minwindef.h:65:9: note: this is the location of the previous definition
   65 | #define OPTIONAL
      |         ^~~~~~~~

Rename to avoid compilation errors in anticipation of implementing
support for Windows.

Reported-by: Dave Cramer (on PostgreSQL Hacking Discord)
2025-02-24 22:20:37 +01:00
Daniel Gustafsson
03366b61df oauth: Fix incorrect const markers in struct
Two members in PGoauthBearerRequest were incorrectly marked as const.
While in there, align the name of the struct with the typedef as per
project style.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/912516.1740329361@sss.pgh.pa.us
2025-02-24 22:20:29 +01:00
Melanie Plageman
bfe56cdf9a Delay extraction of TIDBitmap per page offsets
Pages from the bitmap created by the TIDBitmap API can be exact or
lossy. The TIDBitmap API extracts the tuple offsets from exact pages
into an array for the convenience of the caller.

This was done in tbm_private|shared_iterate() right after advancing the
iterator. However, as long as tbm_private|shared_iterate() sets a
reference to the PagetableEntry in the TBMIterateResult, the offset
extraction can be done later.

Waiting to extract the tuple offsets has a few benefits. For the shared
iterator case, it allows us to extract the offsets after dropping the
shared iterator state lock, reducing time spent holding a contended
lock.

Separating the iteration step and extracting the offsets later also
allows us to avoid extracting the offsets for prefetched blocks. Those
offsets were never used, so the overhead of extracting and storing them
was wasted.

The real motivation for this change, however, is that future commits
will make bitmap heap scan use the read stream API. This requires a
TBMIterateResult per issued block. By removing the array of tuple
offsets from the TBMIterateResult and only extracting the offsets when
they are used, we reduce the memory required for per buffer data
substantially.

Suggested-by: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGLHbKP3jwJ6_%2BhnGi37Pw3BD5j2amjV3oSk7j-KyCnY7Q%40mail.gmail.com
2025-02-24 16:10:19 -05:00
Melanie Plageman
b8778c4cd8 Add lossy indicator to TBMIterateResult
TBMIterateResult->ntuples is -1 when the page in the bitmap is lossy.
Add an explicit lossy indicator so that we can move ntuples out of the
TBMIterateResult in a future commit.

Discussion: https://postgr.es/m/CA%2BhUKGLHbKP3jwJ6_%2BhnGi37Pw3BD5j2amjV3oSk7j-KyCnY7Q%40mail.gmail.com
2025-02-24 16:10:13 -05:00
Nathan Bossart
c56e8af75e Fix comment for MAX_BACKENDS.
This comment mentions that we check that the configured number of
backends does not exceed MAX_BACKENDS in RegisterBackgroundWorker()
and relevant GUC check hooks, neither of which has those checks
anymore.  To fix, adjust this comment to say that we do the check
in InitializeMaxBackends().

Oversights in commits 6bc8ef0b7f and 0b1fe1413e.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/Z7zOEzz8lNjaU9yf%40nathan
2025-02-24 15:02:09 -06:00
Robert Haas
e87c14b19e libpq: Trace all NegotiateProtocolVersion fields
Previously, the names of the unsupported protocol options were not
traced. Since NegotiateProtocolVersion has not really been used yet,
that has not mattered much, but we hope to use it eventually, so let's
fix this.

Author: Jelte Fennema-Nio <postgres@jeltef.nl>
Discussion: https://postgr.es/m/CAGECzQTfc_O+HXqAo5_-xG4r3EFVsTefUeQzSvhEyyLDba-O9w@mail.gmail.com
2025-02-24 12:06:21 -05:00
Robert Haas
c9d94ea215 libpq: Add PQfullProtocolVersion to exports.txt
This is necessary to be able to actually use the function on Windows;
bug introduced in commit cdb6b0fdb0.

Author: Jelte Fennema-Nio <postgres@jeltef.nl>
Discussion: https://postgr.es/m/CAGECzQTfc_O+HXqAo5_-xG4r3EFVsTefUeQzSvhEyyLDba-O9w@mail.gmail.com
2025-02-24 11:47:31 -05:00
Tom Lane
9de2cc455e Fix confusion about data type of pg_class.relpages and relallvisible.
Although they're exposed as int4 in pg_class, relpages and
relallvisible are really of type BlockNumber, that is uint32.
Correct type puns in relation_statistics_update() and remove
inappropriate range-checks.  The type puns are only cosmetic
issues, but the range checks would cause failures with huge
relations.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Author: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/614341.1740269035@sss.pgh.pa.us
2025-02-24 11:16:04 -05:00
Daniel Gustafsson
e889422d98 pg_amcheck: PQclear query results
While the potential memory leak is small, make sure to PQclear the query
results before disconnecting.

Author: Jiao Shuntian <312199339@qq.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/tencent_F34922C91C41E76C734773E767C9FBDB9906@qq.com
2025-02-24 16:03:19 +01:00
Andres Freund
5ee75e32fa Add static asserts for MAX_BACKENDS limiting factors
So far the various dependencies were documented in the comment above
MAX_BACKENDS, but not checked.

Discussion: https://postgr.es/m/CA+COZaBO_s3LfALq=b+HcBHFSOEGiApVjrRacCe4VP9m7CJsNQ@mail.gmail.com
2025-02-24 06:23:41 -05:00
Andres Freund
418451bfe1 bufmgr: Make it easier to change number of buffer state bits
In an upcoming commit I'd like to change the number of bits for the usage
count (the current max is 5, fitting in three bits, but we reserve four
bits). Until now that required adjusting a bunch of magic constants; now the
constants are defined based on the number of bits reserved.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/lxzj26ga6ippdeunz6kuncectr5gfuugmm2ry22qu6hcx6oid6@lzx3sjsqhmt6
Discussion: https://postgr.es/m/riivolmg6uzfvpzfn6wjo3ghwt42rcec43ok6mv4oenfg654y7@x7dbposbskwd
2025-02-24 06:23:41 -05:00
Andres Freund
cd3ccf88aa Base LWLock limits directly on MAX_BACKENDS
Jacob reported that comments for LW_SHARED_MASK referenced a MAX_BACKENDS
limit of 2^23-1, but that MAX_BACKENDS is actually limited to 2^18-1. The
limit was lowered in 48354581a4, but the comment in lwlock.c wasn't updated.

Instead of just fixing the comment, it seems better to directly base the
lwlock defines on MAX_BACKENDS and add static assertions to ensure that there
is enough space. That way there's no comment that can go out of sync in the
future.

As part of that change I noticed that for some reason the high bit wasn't used
for flags, which seems somewhat odd. Redefine the flag values to start at the
highest bit.

Reported-by: Jacob Brazeal <jacob.brazeal@gmail.com>
Reviewed-by: Jacob Brazeal <jacob.brazeal@gmail.com>
Discussion: https://postgr.es/m/CA+COZaBO_s3LfALq=b+HcBHFSOEGiApVjrRacCe4VP9m7CJsNQ@mail.gmail.com
2025-02-24 06:23:41 -05:00
Andres Freund
6394a3a61c Move MAX_BACKENDS to procnumber.h
MAX_BACKENDS influences many things besides the postmaster.  For example, I
noticed that we don't have static assertions ensuring BUF_REFCOUNT_MASK is
big enough for MAX_BACKENDS; adding them would require including
postmaster.h in buf_internals.h, which doesn't seem right.

While at it, add MAX_BACKENDS_BITS, as that's useful in various places for
static assertions (to be added in subsequent commits).

Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/wptizm4qt6yikgm2pt52xzyv6ycmqiutloyvypvmagn7xvqkce@d4xuv3mylpg4
2025-02-24 06:23:41 -05:00
John Naylor
0600d276d4 Silence warning in older versions of Valgrind
Due to a misunderstanding on my part, commit 235328ee4 did not go far
enough to silence older versions of Valgrind. For those, it was the bit
scan that was problematic, not the subsequent bit-masking operation. To
fix, use the unaligned path for the trailing bytes. Since we don't have
a bit scan here anymore, also remove some comments and endian-specific
coding around that.

Reported-by: Anton A. Melnikov <a.melnikov@postgrespro.ru>
Discussion: https://postgr.es/m/f3aa2d45-3b28-41c5-9499-a1bc30e0f8ec@postgrespro.ru
Backpatch-through: 17
2025-02-24 18:03:29 +07:00
Michael Paquier
2421e9a51d Remove read/sync fields from pg_stat_wal and GUC track_wal_io_timing
The four following attributes are removed from pg_stat_wal:
* wal_write
* wal_sync
* wal_write_time
* wal_sync_time

a051e71e28 has added an equivalent of this information in pg_stat_io with
more granularity, as the data is now split across backend types, IO
contexts and IO objects.  So, keeping the same information in pg_stat_wal
has little benefit.

Another benefit of this commit is the removal of PendingWalStats,
simplifying an upcoming patch to add per-backend WAL statistics, which
already support IO statistics and which have access to the write/sync
stats data of WAL.

The GUC track_wal_io_timing, which was used to enable or disable the
aggregation of the write and sync timings for WAL, is also removed.
pgstat_prepare_io_time() is simplified.

Bump catalog version.
Bump PGSTAT_FILE_FORMAT_ID, due to the update of PgStat_WalStats.
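
The equivalent information can now be queried from pg_stat_io, for example
(timing columns are populated when track_io_timing is enabled):

    SELECT backend_type, context, writes, write_time, fsyncs, fsync_time
      FROM pg_stat_io
     WHERE object = 'wal';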

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/Z7RkQ0EfYaqqjgz/@ip-10-97-1-34.eu-west-3.compute.internal
2025-02-24 09:51:56 +09:00
Tom Lane
fc0d0ce978 Ignore hash's relallvisible when checking pg_upgrade from pre-v10.
Our cross-version upgrade tests have been failing for some pre-v10
source versions since commit 1fd1bd871.  This turns out to be
because relallvisible may change for tables that have hash indexes,
because the upgrade process forcibly reindexes such indexes to
deal with the changes made in v10.

Fortunately, the set of tables that have such indexes is small
and won't change anymore in those branches.  So just hack up
AdjustUpgrade.pm to not compare the relallvisible values of
those specific tables.

While here, also tighten the regex that suppresses comparison
of version fields.

Discussion: https://postgr.es/m/812817.1740277228@sss.pgh.pa.us
2025-02-23 14:16:26 -05:00
Peter Eisentraut
454c182f85 backend libpq void * argument for binary data
Change some backend libpq functions to take void * for binary data
instead of char *.  This removes the need for numerous casts.

Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Discussion: https://www.postgresql.org/message-id/flat/fd1fcedb-3492-4fc8-9e3e-74b97f2db6c7%40eisentraut.org
2025-02-23 14:27:02 +01:00
Peter Eisentraut
ebdccead16 SnapBuildRestoreContents() void * argument for binary data
Change internal snapbuild API function to take void * for binary data
instead of char *.  This removes the need for numerous casts.

Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Discussion: https://www.postgresql.org/message-id/flat/fd1fcedb-3492-4fc8-9e3e-74b97f2db6c7%40eisentraut.org
2025-02-23 12:38:21 +01:00
Michael Paquier
a4e986ef5a Add more tests for utility commands in pipelines
This commit checks interactions with pipelines and implicit transaction
blocks for the following commands that have their own behaviors when
used in pipelines depending on their order in a pipeline and sync
requests:
- SET LOCAL
- REINDEX CONCURRENTLY
- VACUUM
- Subtransactions (SAVEPOINT, ROLLBACK TO SAVEPOINT)

These scenarios could be tested only with pgbench previously.  The
meta-commands of psql controlling pipelines make these easier to
implement and debug, and they can be run in a SQL script.

Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Discussion: https://postgr.es/m/CAO6_XqroE7JuMEm1sWz55rp9fAYX2JwmcP_3m_v51vnOFdsLiQ@mail.gmail.com
2025-02-23 16:43:07 +09:00
Peter Eisentraut
f98765f0ce jsonb internal API void * argument for binary data
Change some internal jsonb API functions to take void * for binary
data instead of char *.  This removes the need for numerous casts.

Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Discussion: https://www.postgresql.org/message-id/flat/fd1fcedb-3492-4fc8-9e3e-74b97f2db6c7%40eisentraut.org
2025-02-23 08:34:55 +01:00
Jeff Davis
cb45dc3afb Documentation fixups for dumping statistics.
Reported-by: Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com>
Reported-by: Andrew Dunstan <andrew@dunslane.net>
Discussion: https://postgr.es/m/OSCPR01MB149665630030E7F54FDA8B27BF5C72@OSCPR01MB14966.jpnprd01.prod.outlook.com
Discussion: https://postgr.es/m/25d26774-25fa-46f2-9888-c6a707d1fef7@dunslane.net
2025-02-22 10:03:11 -08:00
Álvaro Herrera
bba2fbc623
Change \conninfo to use tabular format
(Initially the proposal was to keep \conninfo alone and add this feature
as \conninfo+, but we decided against keeping the original.)

Also display more fields than before, though not as many as were
suggested during the discussion.  In particular, we don't show 'role'
nor 'session authorization', for both which a case can probably be made.
These can be added as followup commits, if we agree to it.

Some (most?) reviewers actually reviewed rather different versions of
the patch and do not necessarily endorse the current one.

Co-authored-by: Maiquel Grassi <grassi@hotmail.com.br>
Co-authored-by: Hunaid Sohail <hunaidpgml@gmail.com>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Sami Imseih <simseih@amazon.com>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Reviewed-by: Jim Jones <jim.jones@uni-muenster.de>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Pavel Luzanov <p.luzanov@postgrespro.ru>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Erik Wienhold <ewie@ewie.name>
Discussion: https://postgr.es/m/CP8P284MB24965CB63DAC00FC0EA4A475EC462@CP8P284MB2496.BRAP284.PROD.OUTLOOK.COM
2025-02-22 10:05:26 +01:00
Amit Langote
4f1b6e5bb4 Remove unstable test suite added by 525392d57
The 'cached-plan-inval' test suite, introduced in 525392d57 under
src/test/modules/delay_execution, aimed to verify that cached plan
invalidation triggers replanning after deferred locks are taken.
However, its ExecutorStart_hook-based approach relies on lock timing
assumptions that, in retrospect, are fragile. This instability was
exposed by failures on BF animal trilobite, which builds with
CLOBBER_CACHE_ALWAYS.

One option was to dynamically disable the cache behavior that causes
the test suite to fail by setting "debug_discard_caches = 0", but it
seems better to remove the suite. The risk of future failures due to
other cache flush hazards outweighs the benefit of catching real
breakage in the backend behavior it tests.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/2990641.1740117879@sss.pgh.pa.us
2025-02-22 15:19:23 +09:00
Andres Freund
f8d7f29b3e Allow lwlocks to be disowned
To implement AIO writes, the backend initiating writes needs to transfer the
lock ownership to the AIO subsystem, so the lock held during the write can be
released in another backend.

Other backends need to be able to "complete" an asynchronously started IO to
avoid deadlocks (consider e.g. one backend starting IO for a buffer and then
waiting for a heavyweight lock held by another backend followed by the
current holder of the heavyweight lock waiting for the IO to complete).

To that end, this commit adds LWLockDisown() and LWLockReleaseDisowned(). If
code uses LWLockDisown() it's the code's responsibility to ensure that the
lock is released in case of errors.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/1f6b50a7-38ef-4d87-8246-786d39f46ab9@iki.fi
2025-02-21 20:55:23 -05:00
Robert Haas
44cbba9a7f Adjust EXPLAIN test case to filter out "Actual Rows" values.
Per the buildfarm, these tests appear to be unstable in the wake of
commit ddb17e387a. I'm not sure that
just hiding this output is the right way forward, because I think
there may be other test cases that will fail with lower probability
even after this fix. However, it's hard to tell right now, because
this is failing on a number of buildfarm animals. So let's try this
for now to either get a clearer picture of what else is broken, or
as a stopgap until we decide what the permanent fix should be, or
perhaps this will be the permanent fix after all.
2025-02-21 19:20:41 -05:00
Tom Lane
98fc31d649 Avoid race condition between "GRANT role" and "DROP ROLE".
Concurrently dropping either the granted role or the grantee
does not stop GRANT from completing, instead resulting in a
dangling role reference in pg_auth_members.  That's relatively
harmless in the short run, but inconsistent catalog entries
are not a good thing.

This patch solves the problem by adding the granted and grantee
roles as explicit shared dependencies of the pg_auth_members entry.
That's a bit indirect, but it works because the pg_shdepend code
applies the necessary locking and rechecking.

Commit 6566133c5 previously established similar handling for
the grantor column of pg_auth_members; it's not clear why it
didn't cover the other two role OID columns.

A side-effect of this approach is that DROP OWNED BY will now drop
pg_auth_members entries that mention the target role as either the
granted or grantee role.  That's clearly appropriate for the
grantee, since we'll drop its other privileges too.  It doesn't
seem too far out of line for the granted role, since we're
presumably about to drop it and besides we're removing all reasons
why it'd matter to be a member of it.  (One could argue that this
makes DropRole's code to auto-drop pg_auth_members entries
unnecessary, but I chose to leave it in place since perhaps some
people's workflows expect that to work without a DROP OWNED BY.)
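
A sketch of the resulting DROP OWNED BY behavior (role names invented):

    CREATE ROLE alice;
    CREATE ROLE bob;
    GRANT alice TO bob;
    DROP OWNED BY alice;  -- now also removes the membership granted above,
                          -- with alice as either granted role or grantee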

Note to patch readers: CreateRole's first CommandCounterIncrement
call is now unconditional, because this change creates another
case in which it's needed, and it seemed to be more trouble than
it's worth to preserve that micro-optimization.

Arguably this is a bug fix, but the fact that it changes the
expected contents of pg_shdepend seems like not a great thing
to do in the stable branches, and perhaps we don't want the
change in DROP OWNED BY semantics there either.  On the other
hand, I opted not to force a catversion bump in HEAD, because
the presence or absence of these entries doesn't matter for
most purposes.

Reported-by: Virender Singla <virender.cse@gmail.com>
Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at>
Discussion: https://postgr.es/m/CAM6Zo8woa62ZFHtMKox6a4jb8qQ=w87R2L0K8347iE-juQL2EA@mail.gmail.com
2025-02-21 17:07:01 -05:00
Robert Haas
ddb17e387a Allow EXPLAIN to indicate fractional rows.
When nloops > 1, we now display two digits after the decimal point,
rather than none. This is important because what we print is actually
planstate->instrument->ntuples / nloops, and sometimes what you want
to know is planstate->instrument->ntuples. You can estimate that by
multiplying the displayed row count by the displayed nloops value, but
the fact that the displayed value is rounded makes that inexact. It's
still inexact even if we show these two extra decimal places, but less
so. Perhaps we will agree on a way to further improve this output later,
but for now this seems better than doing nothing.
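
For illustration, a hypothetical fragment of EXPLAIN (ANALYZE, TIMING OFF)
output (node and numbers invented):

    ->  Index Only Scan using t_pkey on t (actual rows=100.33 loops=3)

Previously this would have been shown as "rows=100"; with the extra decimal
places, 100.33 * 3 recovers a total of roughly 301 tuples with much less
rounding error.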

Author: Ibrar Ahmed <ibrar.ahmad@gmail.com>
Author: Ilia Evdokimov <ilya.evdokimov@tantorlabs.com>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Greg Stark <stark@mit.edu>
Reviewed-by: Naeem Akhter <akhternaeem@gmail.com>
Reviewed-by: Hamid Akhtar <hamid.akhtar@percona.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andrei Lepikhov <a.lepikhov@postgrespro.ru>
Reviewed-by: Guillaume Lelarge <guillaume@lelarge.info>
Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com>
Reviewed-by: Alena Rybakina <a.rybakina@postgrespro.ru>
Discussion: http://postgr.es/m/603c8f070905281830g2e5419c4xad2946d149e21f9d%40mail.gmail.com
2025-02-21 16:14:13 -05:00
Masahiko Sawada
78d3f48895 Add test 005_char_signedness.pl to meson.build.
Oversight in a8238f87f9 where the test has been added.

Discussion: https://postgr.es/m/CB11ADBC-0C3F-4FE0-A678-666EE80CBB07%40amazon.com
2025-02-21 12:31:16 -08:00
Tom Lane
29d75b25b5 Fix pg_dumpall to cope with dangling OIDs in pg_auth_members.
There is a race condition between "GRANT role" and "DROP ROLE",
which allows GRANT to install pg_auth_members entries that refer to
dropped roles.  (Commit 6566133c5 prevented that for the grantor
field, but not for the granted or grantee roles.)  We'll soon fix
that, at least in HEAD, but pg_dumpall needs to cope with the
situation in case of pre-existing inconsistency.  As pg_dumpall
stands, it will emit invalid commands like 'GRANT foo TO ""',
which causes pg_upgrade to fail.  Fix it to emit warnings and skip
those GRANTs, instead.

There was some discussion of removing the problem by changing
dumpRoleMembership's query to use JOIN not LEFT JOIN, but that
would result in silently ignoring such entries.  It seems better
to produce a warning.

Pre-v16 branches already coped with dangling grantor OIDs by simply
omitting the GRANTED BY clause.  I left that behavior as-is, although
it's somewhat inconsistent with the behavior of later branches.

Reported-by: Virender Singla <virender.cse@gmail.com>
Discussion: https://postgr.es/m/CAM6Zo8woa62ZFHtMKox6a4jb8qQ=w87R2L0K8347iE-juQL2EA@mail.gmail.com
Backpatch-through: 13
2025-02-21 13:37:15 -05:00
Masahiko Sawada
dfd8e6c73e Fix an issue with index scan using pg_trgm due to char signedness on different architectures.
GIN and GiST indexes utilizing pg_trgm's opclasses store sorted
trigrams within index tuples. When comparing and sorting each trigram,
pg_trgm treats each trigram as a 'char[3]' array in C. However, the
char type in C can be interpreted as either signed char or unsigned
char, depending on the platform, if the signedness is not explicitly
specified. Consequently, during replication between different CPU
architectures, there was an issue where index scans on standby servers
could not locate matching index tuples due to the differing treatment
of character signedness.

This change introduces comparison functions for trgm that explicitly
handle signed char and unsigned char. The appropriate comparison
function will be dynamically selected based on the character
signedness stored in the control file. Therefore, upgraded clusters
can utilize the indexes without rebuilding, provided the cluster
upgrade occurs on platforms with the same character signedness as the
original cluster initialization.

The default char signedness information was introduced in 44fe30fdab,
so no backpatch.

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/CB11ADBC-0C3F-4FE0-A678-666EE80CBB07%40amazon.com
2025-02-21 10:27:39 -08:00
Masahiko Sawada
1aab680591 pg_upgrade: Add --set-char-signedness to set the default char signedness of new cluster.
This change adds a new option --set-char-signedness to pg_upgrade. It
enables users to set the char signedness explicitly during pg_upgrade. This
helps cases where a user who knows that the v17 source cluster was copied
from x86 (signedness=true) to ARM (signedness=false) can run pg_upgrade
properly without first having to acquire an x86 VM.

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/CB11ADBC-0C3F-4FE0-A678-666EE80CBB07%40amazon.com
2025-02-21 10:23:39 -08:00
Masahiko Sawada
a8238f87f9 pg_upgrade: Preserve default char signedness value from old cluster.
Commit 44fe30fdab introduced the 'default_char_signedness' field in
controlfile. Newly created database clusters always set this field to
'signed'.

This change ensures that pg_upgrade updates the
'default_char_signedness' to 'unsigned' if the source database cluster
has signedness=false. For source clusters from v17 or earlier, which
lack the 'default_char_signedness' information, pg_upgrade assumes the
source cluster was initialized on the same platform where pg_upgrade
is running. It then sets the 'default_char_signedness' value according
to the current platform's default character signedness.

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/CB11ADBC-0C3F-4FE0-A678-666EE80CBB07%40amazon.com
2025-02-21 10:19:40 -08:00
Masahiko Sawada
30666d1857 pg_resetwal: Add --char-signedness option to change the default char signedness.
With the newly added option --char-signedness, pg_resetwal updates the
default char signedness flag in the controlfile. This option is
primarily intended for an upcoming patch that pg_upgrade supports
preserving the default char signedness during upgrades, and is not
meant for manual operation.

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/CB11ADBC-0C3F-4FE0-A678-666EE80CBB07%40amazon.com
2025-02-21 10:14:36 -08:00
Masahiko Sawada
44fe30fdab Add default_char_signedness field to ControlFileData.
The signedness of the 'char' type in C is
implementation-dependent. For instance, 'signed char' is used by
default on x86 CPUs, while 'unsigned char' is used on AArch64
CPUs. Previously, we accidentally let C implementation signedness
affect persistent data. This led to inconsistent results when
comparing char data across different platforms.

This commit introduces a new 'default_char_signedness' field in
ControlFileData to store the signedness of the 'char' type. While this
change does not encourage the use of 'char' without explicitly
specifying its signedness, this field can be used as a hint to ensure
consistent behavior for pre-v18 data files that store data sorted by
the 'char' type on disk (e.g., GIN and GiST indexes), especially in
cross-platform replication scenarios.

Newly created database clusters unconditionally set the default char
signedness to true. pg_upgrade (with an upcoming commit) changes this
flag for clusters if the source database cluster has
signedness=false. As a result, signedness=false setting will become
rare over time. If we had known about the problem during the last
development cycle that forced initdb (v8.3), we would have made all
clusters signed or all clusters unsigned. Making pg_upgrade the only
source of signedness=false will cause the population of database
clusters to converge toward that retrospective ideal.

Bump catalog version (for the catalog changes) and PG_CONTROL_VERSION
(for the additions in ControlFileData).

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/CB11ADBC-0C3F-4FE0-A678-666EE80CBB07%40amazon.com
2025-02-21 10:12:08 -08:00
Bruce Momjian
901a1cf8b4 doc: clarify default checksum behavior in non-master branches
Also simplify and correct data checksum wording in master now that it is
the default.  PG 13 did not have the awkward wording.

Reported-by: Felix <afripowered@gmail.com>

Reviewed-by: Laurenz Albe

Discussion: https://postgr.es/m/173928241056.707.3989867022954178032@wrigleys.postgresql.org

Backpatch-through: 14
2025-02-21 13:03:29 -05:00
Bruce Momjian
6ea0734e41 doc: remove non-breaking space in SGML files, causes make error 2025-02-21 12:15:53 -05:00
Andres Freund
32ce58e9e9 Make test portlock logic work with meson
Previously the portlock logic, added in 9b4eafcaf4, didn't actually work
properly when the tests were run via meson. 9b4eafcaf4 used the
MESON_BUILD_ROOT environment variable to determine the port lock
directory, but that's never set when running the tests.  That meant that
each test used its own portlock dir, unless the PG_TEST_PORT_DIR environment
variable was set.

Fix the problem by setting top_builddir for the environment. That's also used
for the autoconf/make build.

Backpatch back to 16, where meson support was added.

Reported-by: Zharkov Roman <r.zharkov@postgrespro.ru>
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Backpatch-through: 16
2025-02-21 11:25:05 -05:00
Michael Paquier
665cafe8a4 Fix cross-version upgrades with XMLSERIALIZE(NO INDENT)
Dumps from versions older than v16 do not know about NO INDENT in a
XMLSERIALIZE() clause.  This commit adjusts AdjustUpgrade.pm so as NO
INDENT is discarded in the contents of the new dump adjusted for
comparison when the old version is v15 or older.  This should be enough
to make the cross-version upgrade tests pass.

Per report from buildfarm member crake.  Oversight in 984410b923.

Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Discussion: https://postgr.es/m/88b183f1-ebf9-4f51-9144-3704380ccae7@dunslane.net
Backpatch-through: 16
2025-02-21 20:37:31 +09:00
Peter Eisentraut
329304c901 Support text position search functions with nondeterministic collations
This allows using text position search functions with nondeterministic
collations.  These functions are

- position, strpos
- replace
- split_part
- string_to_array
- string_to_table

which all use common internal infrastructure.

There was previously no internal implementation of this, so it was met
with a not-supported error.  This adds the internal implementation and
removes the error.

Unlike with deterministic collations, the search cannot use any
byte-by-byte optimized techniques but has to go substring by
substring.  We also need to consider that the found match could have a
different length than the needle and that there could be substrings of
different length matching at a position.  In most cases, we need to
find the longest such substring (greedy semantics), but this can be
configured by each caller.
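
A minimal sketch of what now works (assuming an ICU build; the collation
name is invented):

    CREATE COLLATION ignore_case
      (provider = icu, locale = 'und-u-ks-level2', deterministic = false);
    SELECT strpos('HIGH HOPES' COLLATE ignore_case, 'ho');  -- 6
    SELECT replace('Hello hello' COLLATE ignore_case, 'HELLO', 'bye');

Before this commit, both SELECTs raised a not-supported error.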

Reviewed-by: Euler Taveira <euler@eulerto.com>
Discussion: https://www.postgresql.org/message-id/flat/582b2613-0900-48ca-8b0d-340c06f4d400@eisentraut.org
2025-02-21 12:21:17 +01:00
Daniel Gustafsson
41336bf085 doc: Add links to olsen93 and ong90 in bibliography
The bibliography entries for olsen93 and ong90 lacked links to
online copies.  While ong90 is available in digital form, the
olsen93 thesis is only available as a physical copy in the UCB
library.  To save people from searching for it, we still link
to it via the UCB library page.

Reported-by: jian he <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxFcJYdRvzgt59N26XjFp2tFFUXu+VN+x8Uo0NbDUCMCbw@mail.gmail.com
2025-02-21 11:28:42 +01:00
Amit Kapila
b4e0d0c53f Fix a WARNING for data origin discrepancies.
Previously, a WARNING was issued at the time of defining a subscription
with origin=NONE only when the publisher subscribed to the same table from
other publishers, indicating potential data origination from different
origins. However, the publisher can subscribe to the partition ancestors
or partition children of the table from other publishers, which could also
result in mixed-origin data inclusion. So, give a WARNING in those cases
as well.
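
A sketch of a case that now warns (names and connection string invented;
warning text approximate):

    CREATE SUBSCRIPTION sub1
      CONNECTION 'dbname=src'
      PUBLICATION pub_parent
      WITH (origin = none, copy_data = true);
    -- WARNING: subscription "sub1" requested copy_data with origin=NONE
    -- but might copy data that had a different origin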

Reported-by: Sergey Tatarintsev <s.tatarintsev@postgrespro.ru>
Author: Hou Zhijie <houzj.fnst@fujitsu.com>
Author: Shlok Kyal <shlok.kyal.oss@gmail.com>
Reviewed-by: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Backpatch-through: 16, where it was introduced
Discussion: https://postgr.es/m/5eda6a9c-63cf-404d-8a49-8dcb116a29f3@postgrespro.ru
2025-02-21 14:34:40 +05:30
Michael Paquier
984410b923 Add missing deparsing of [NO] INDENT to XMLSERIALIZE()
NO INDENT is the default, and is added if no explicit indentation
flag was provided with XMLSERIALIZE().
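
A sketch of the effect on deparsed output (view name invented; formatting
approximate):

    CREATE VIEW v AS SELECT xmlserialize(DOCUMENT '<a/>'::xml AS text);
    SELECT pg_get_viewdef('v'::regclass);
    -- now includes: XMLSERIALIZE(DOCUMENT '<a/>'::xml AS text NO INDENT)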

Oversight in 483bdb2afe.

Author: Jim Jones <jim.jones@uni-muenster.de>
Discussion: https://postgr.es/m/bebd457e-5b43-46b3-8fc6-f6a6509483ba@uni-muenster.de
Backpatch-through: 16
2025-02-21 17:30:56 +09:00
Peter Eisentraut
7d6d2c4bbd Drop opcintype from index AM strategy translation API
The type argument wasn't actually necessary.  It was a remnant
of converting the API of the gist strategy translation from using
opclass to using opfamily+opcintype (commits c09e5a6a01,
622f678c10).  For looking up the gist translation function, we used
the convention "amproclefttype = amprocrighttype = opclass's
opcintype" (see pg_amproc.h).  But each operator family should only
have one translation function, and getting the right type for the
lookup is sometimes cumbersome and fragile, so this is all
unnecessarily complicated.

To simplify this, change the gist strategy support procedure to take
"any", "any" as argument.  (This is arbitrary but seems intuitive.
The alternative of using InvalidOid as argument(s) upsets various DDL
commands, so it's not practical.)  Then we don't need opcintype for
the lookup, and we can remove it from all the API layers introduced by
commit c09e5a6a01.

This also adds some more documentation about the correct signature of
the gist support function and adds more checks in gistvalidate().
This was previously underspecified.  (It relied implicitly on
the convention mentioned above.)

Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2025-02-21 09:07:16 +01:00
Peter Eisentraut
7202d72787 backend launchers void * arguments for binary data
Change backend launcher functions to take void * for binary data
instead of char *.  This removes the need for numerous casts.

Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Discussion: https://www.postgresql.org/message-id/flat/fd1fcedb-3492-4fc8-9e3e-74b97f2db6c7%40eisentraut.org
2025-02-21 08:03:33 +01:00
Jeff Davis
b50a554cc8 Fix for pg_restore_attribute_stats().
Use RelationGetIndexExpressions() rather than rd_indexprs directly.

Author: Corey Huinker <corey.huinker@gmail.com>
2025-02-20 22:31:22 -08:00
Michael Paquier
41625ab8ea psql: Add support for pipelines
With \bind, \parse, \bind_named and \close, it is possible to issue
queries from psql using the extended protocol.  However, it was not
possible to send these queries using libpq's pipeline mode.  This
feature has two advantages:
- Testing.  Pipeline tests were only possible with pgbench, using TAP
tests.  It now becomes possible to have more SQL tests that are able to
stress the backend with pipelines and extended queries.  More tests,
discussed on some other threads, will be added in a follow-up commit.
Some external projects in the community had to implement their
own facility to work around this limitation.
- Emulation of custom workloads, with more control over the actions
taken by a client with libpq APIs.  It is possible to emulate more
workload patterns to bottleneck the backend with the extended query
protocol.

This patch adds six new meta-commands to be able to control pipelines:
* \startpipeline starts a new pipeline.  All extended queries are queued
until the end of the pipeline is reached or a sync request is sent and
processed.
* \endpipeline ends an existing pipeline.  All queued commands are sent
to the server and all responses are processed by psql.
* \syncpipeline queues a synchronisation request, without flushing the
commands to the server, equivalent of PQsendPipelineSync().
* \flush, equivalent of PQflush().
* \flushrequest, equivalent of PQsendFlushRequest().
* \getresults reads the server's results for the queries in a pipeline.
Unsent data is automatically pushed when \getresults is called.  It is
possible to control the number of results read in a single meta-command
execution with an optional parameter; 0 means that all the results
should be read.
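
Putting these together, a minimal sketch of a pipelined psql script
(prompts and result output elided; the parameter value is invented):

    \startpipeline
    SELECT $1 \bind 42 \g
    \flushrequest
    \getresults
    \endpipeline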

Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Discussion: https://postgr.es/m/CAO6_XqroE7JuMEm1sWz55rp9fAYX2JwmcP_3m_v51vnOFdsLiQ@mail.gmail.com
2025-02-21 11:19:59 +09:00
Michael Paquier
40af897eb7 Add braces for if block with large comment in psql's common.c
A patch touching this area of the code is under review, and this format
makes the code slightly harder to read.

Extracted from a larger patch by the same author.

Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Discussion: https://postgr.es/m/CAO6_XqroE7JuMEm1sWz55rp9fAYX2JwmcP_3m_v51vnOFdsLiQ@mail.gmail.com
2025-02-21 09:18:49 +09:00
Daniel Gustafsson
2c53dec7f4 Add missing entry to oauth_validator test .gitignore
Commit b3f0be788 accidentally missed adding the oauth client test
binary to the relevant .gitignore.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/2839306.1740082041@sss.pgh.pa.us
2025-02-20 21:29:21 +01:00
Peter Eisentraut
3e4d868615 Remove various unnecessary (char *) casts
Remove a number of (char *) casts that are unnecessary.  Or in some
cases, rewrite the code to make the purpose of the cast clearer.

Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Discussion: https://www.postgresql.org/message-id/flat/fd1fcedb-3492-4fc8-9e3e-74b97f2db6c7%40eisentraut.org
2025-02-20 19:49:27 +01:00
Jeff Davis
ab84d0ff80 Trial fix for old cross-version upgrades.
Per buildfarm and reports, it seems that 9.X to 18 upgrades were
failing after commit 1fd1bd8710 due to an incorrect regex. Loosen the
regex to accommodate older versions.

Reported-by: vignesh C <vignesh21@gmail.com>
Reported-by: Andrew Dunstan <andrew@dunslane.net>
Discussion: https://postgr.es/m/CALDaNm3GUs+U8Nt4S=V5zmb+K8-RfAc03vRENS0teeoq0Lc6Tw@mail.gmail.com
Discussion: https://postgr.es/m/ea4cbbc1-c5a5-43d1-9618-8ff3f2155bfe@dunslane.net
2025-02-20 10:21:24 -08:00
Andrew Dunstan
8e4d72573c Ignore blank lines in pgindent exclude files
Currently a blank line matches everything, which is almost never what
someone would want. If they really want that they can use a wildcard
regex to do it.

Author: Zsolt Parragi <zsolt.parragi@percona.com>

Discussion: https://postgr.es/m/CAN4CZFNka+2q3=-Dithr4w65RJfwPaV92T62spEzLn+T4MgcMg@mail.gmail.com
2025-02-20 11:36:07 -05:00
Daniel Gustafsson
9d9a71002a cirrus: Temporarily fix libcurl link error
On FreeBSD the ftp/curl port appears to be missing a minimum
version dependency on libssh2, so the following starts showing
up after upgrading to curl 8.11.1_1:

  libcurl.so.4: Undefined symbol "libssh2_session_callback_set2"

While awaiting an upgrade of the FreeBSD CI images to version 14, work
around the issue.

Author: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CAOYmi+kZAka0sdxCOBxsQc2ozEZGZKHWU_9nrPXg3sG1NJ-zJw@mail.gmail.com
2025-02-20 16:25:47 +01:00
Daniel Gustafsson
b3f0be788a Add support for OAUTHBEARER SASL mechanism
This commit implements OAUTHBEARER, RFC 7628, and OAuth 2.0 Device
Authorization Grants, RFC 8628.  In order to use this there is a
new pg_hba auth method called oauth.  When speaking to an OAuth-
enabled server, it looks a bit like this:

  $ psql 'host=example.org oauth_issuer=... oauth_client_id=...'
  Visit https://oauth.example.org/login and enter the code: FPQ2-M4BG

Device authorization is currently the only supported flow so the
OAuth issuer must support that in order for users to authenticate.
Third-party clients may however extend this and provide their own
flows.  The built-in device authorization flow is currently not
supported on Windows.

In order for validation to happen server side a new framework for
plugging in OAuth validation modules is added.  As validation is
implementation specific, with no default specified in the standard,
PostgreSQL does not ship with one built-in.  Each pg_hba entry can
specify a specific validator or be left blank for the validator
installed as default.

This adds a requirement on libcurl for the client side support,
which is optional to build, but the server side has no additional
build requirements.  In order to run the tests, Python is required
as this adds an https server written in Python.  Tests are gated
behind PG_TEST_EXTRA as they open ports.

This patch has been a multi-year project with many contributors
involved with reviews and in-depth discussions:  Michael Paquier,
Heikki Linnakangas, Zhihong Yu, Mahendrakar Srinivasarao, Andrey
Chudnovsky and Stephen Frost to name a few.  While Jacob Champion
is the main author there have been some levels of hacking by others.
Daniel Gustafsson contributed the validation module and various bits
and pieces; Thomas Munro wrote the client side support for kqueue.

Author: Jacob Champion <jacob.champion@enterprisedb.com>
Co-authored-by: Daniel Gustafsson <daniel@yesql.se>
Co-authored-by: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Antonin Houska <ah@cybertec.at>
Reviewed-by: Kashif Zeeshan <kashi.zeeshan@gmail.com>
Discussion: https://postgr.es/m/d1b467a78e0e36ed85a09adf979d04cf124a9d4b.camel@vmware.com
2025-02-20 16:25:17 +01:00
Jeff Davis
1fd1bd8710 Transfer statistics during pg_upgrade.
Add support to pg_dump for dumping stats, and use that during
pg_upgrade so that statistics are transferred during upgrade. In most
cases this removes the need for a costly re-analyze after upgrade.

Some statistics are not transferred, such as extended statistics or
statistics with a custom stakind.

Now pg_dump accepts the options --schema-only, --no-schema,
--data-only, --no-data, --statistics-only, and --no-statistics; which
allow all combinations of schema, data, and/or stats. The options are
named this way to preserve compatibility with the previous
--schema-only and --data-only options.

Statistics are in SECTION_DATA, unless the object itself is in
SECTION_POST_DATA.

The stats are represented as calls to pg_restore_relation_stats() and
pg_restore_attribute_stats().
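
A hypothetical sketch of such a call (the exact keyword/argument set is
documented with the functions; values invented):

    SELECT pg_restore_relation_stats(
      'relation', 'public.mytable'::regclass,
      'relpages', 173::integer,
      'reltuples', 10000::real,
      'relallvisible', 83::integer);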

Author: Corey Huinker, Jeff Davis
Reviewed-by: Jian He
Discussion: https://postgr.es/m/CADkLM=fzX7QX6r78fShWDjNN3Vcr4PVAnvXxQ4DiGy6V=0bCUA@mail.gmail.com
Discussion: https://postgr.es/m/CADkLM%3DcB0rF3p_FuWRTMSV0983ihTRpsH%2BOCpNyiqE7Wk0vUWA%40mail.gmail.com
2025-02-20 01:29:06 -08:00
Amit Kapila
7da344b9f8 Improve errdetail message added by ac0e33136a.
Make it consistent with other similar messages.

Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Discussion: https://postgr.es/m/20250220.140839.1444694904721968348.horikyota.ntt@gmail.com
2025-02-20 14:02:29 +05:30
Amit Langote
525392d572 Don't lock partitions pruned by initial pruning
Before executing a cached generic plan, AcquireExecutorLocks() in
plancache.c locks all relations in a plan's range table to ensure the
plan is safe for execution. However, this locks runtime-prunable
relations that will later be pruned during "initial" runtime pruning,
introducing unnecessary overhead.

This commit defers locking for such relations to executor startup and
ensures that if the CachedPlan is invalidated due to concurrent DDL
during this window, replanning is triggered. Deferring these locks
avoids unnecessary locking overhead for pruned partitions, resulting
in significant speedup, particularly when many partitions are pruned
during initial runtime pruning.

* Changes to locking when executing generic plans:

AcquireExecutorLocks() now locks only unprunable relations, that is,
those found in PlannedStmt.unprunableRelids (introduced in commit
cbc127917e), to avoid locking runtime-prunable partitions
unnecessarily.  The remaining locks are taken by
ExecDoInitialPruning(), which acquires them only for partitions that
survive pruning.

This deferral does not affect the locks required for permission
checking in InitPlan(), which takes place before initial pruning.
ExecCheckPermissions() now includes an Assert to verify that all
relations undergoing permission checks, none of which can be in the
set of runtime-prunable relations, are properly locked.

* Plan invalidation handling:

Deferring locks introduces a window where prunable relations may be
altered by concurrent DDL, invalidating the plan. A new function,
ExecutorStartCachedPlan(), wraps ExecutorStart() to detect and handle
invalidation caused by deferred locking. If invalidation occurs,
ExecutorStartCachedPlan() updates CachedPlan using the new
UpdateCachedPlan() function and retries execution with the updated
plan. To ensure all code paths that may be affected by this handle
invalidation properly, all callers of ExecutorStart that may execute a
PlannedStmt from a CachedPlan have been updated to use
ExecutorStartCachedPlan() instead.

UpdateCachedPlan() replaces stale plans in CachedPlan.stmt_list. A new
CachedPlan.stmt_context, created as a child of CachedPlan.context,
allows freeing old PlannedStmts while preserving the CachedPlan
structure and its statement list. This ensures that loops over
statements in upstream callers of ExecutorStartCachedPlan() remain
intact.

ExecutorStart() and ExecutorStart_hook implementations now return a
boolean value indicating whether plan initialization succeeded with a
valid PlanState tree in QueryDesc.planstate, or false otherwise, in
which case QueryDesc.planstate is NULL. Hook implementations are
required to call standard_ExecutorStart() at the beginning, and if it
returns false, they should do the same without proceeding.

* Testing:

To verify these changes, the delay_execution module tests scenarios
where cached plans become invalid due to changes in prunable relations
after deferred locks.

* Note to extension authors:

ExecutorStart_hook implementations must verify plan validity after
calling standard_ExecutorStart(), as explained earlier. For example:

    if (prev_ExecutorStart)
        plan_valid = prev_ExecutorStart(queryDesc, eflags);
    else
        plan_valid = standard_ExecutorStart(queryDesc, eflags);

    if (!plan_valid)
        return false;

    <extension-code>

    return true;

Extensions accessing child relations, especially prunable partitions,
via ExecGetRangeTableRelation() must now ensure their RT indexes are
present in es_unpruned_relids (introduced in commit cbc127917e), or
they will encounter an error. This is a strict requirement after this
change, as only relations in that set are locked.

The idea of deferring some locks to executor startup, allowing locks
for prunable partitions to be skipped, was first proposed by Tom Lane.
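
A sketch of a case that benefits ("parted" stands for a table with many
partitions; names invented):

    SET plan_cache_mode = force_generic_plan;
    PREPARE q(int) AS SELECT * FROM parted WHERE id = $1;
    EXECUTE q(42);  -- previously locked every partition up front; now only
                    -- unprunable relations plus the partitions surviving
                    -- initial pruning are locked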

Reviewed-by: Robert Haas <robertmhaas@gmail.com> (earlier versions)
Reviewed-by: David Rowley <dgrowleyml@gmail.com> (earlier versions)
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> (earlier versions)
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
Discussion: https://postgr.es/m/CA+HiwqFGkMSge6TgC9KQzde0ohpAycLQuV7ooitEEpbKB0O_mg@mail.gmail.com
2025-02-20 17:09:48 +09:00
Amit Kapila
4aa6fa3cd0 Include schema/table publications even with exclude options in dump.
The current implementation inconsistently includes the public schema but
not information_schema when those are specified in FOR TABLES IN SCHEMA ...
Apart from that, the current behavior for publications w.r.t. the exclude
table and schema options (--exclude-table, --exclude-schema) differs from
what we do at other places. We try to avoid including publications for
the corresponding tables or schemas when an exclude-table or exclude-schema
option is given, unlike what we do for views using functions defined in a
particular schema or a subscription pointing to publications with their
corresponding exclude options.

I decided not to backpatch this as it leads to a behavior change and we
don't see any field reports about the current behavior.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Author: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/1270733.1734134272@sss.pgh.pa.us
2025-02-20 11:25:29 +05:30
Michael Paquier
f11674f8df doc: Fix typo in section "WAL configuration"
pg_stat_io has an attribute named fsync_time, not sync_time.

Oversight in 2f70871c2b.

Discussion: https://postgr.es/m/Z7RkQ0EfYaqqjgz/@ip-10-97-1-34.eu-west-3.compute.internal
2025-02-20 14:22:00 +09:00
Michael Paquier
4538bd3f1d doc: Add details about object "wal" in pg_stat_io
This commit adds a short description of what kind of activity is tracked
in pg_stat_io for the object "wal", with a link pointing to the section
"WAL configuration" that has a lot of details on the matter.

This should perhaps have been added in a051e71e28, but things are what
they are.

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/Z7RkQ0EfYaqqjgz/@ip-10-97-1-34.eu-west-3.compute.internal
2025-02-20 14:16:23 +09:00
Michael Paquier
2f70871c2b doc: Recommend pg_stat_io rather than pg_stat_wal in WAL configuration
Since a051e71e28, pg_stat_io is able to track statistics for the WAL
activity, providing an equivalent of pg_stat_wal with more granularity
for the fsyncs/writes counts and timings, as the data is split across
backend types.

This commit now recommends pg_stat_io rather than pg_stat_wal in the
section "WAL configuration", some of the latter's attributes being
candidates for removal in a follow-up commit.

Extracted from a larger patch by the same author.

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/Z7RkQ0EfYaqqjgz/@ip-10-97-1-34.eu-west-3.compute.internal
2025-02-20 13:55:00 +09:00
Michael Paquier
71f17823ba Fix FATAL message for invalid recovery timeline at beginning of recovery
If the requested recovery timeline is not reachable, the logged
checkpoint and timeline should be the values read from the
backup_label when it is defined.  The message generated used the values
from the control file in this case, which is fine when recovering from
the control file without a backup_label, but not if there is a
backup_label.

Issue introduced in ee994272ca.  v15 has introduced xlogrecovery.c and
more simplifications in this area (4a92a1c3d1, a27048cbcb), making
this change a bit simpler to think about, so backpatch only down to this
version.

Author: David Steele <david@pgbackrest.org>
Reviewed-by: Andrey M. Borodin <x4mmm@yandex-team.ru>
Reviewed-by: Benoit Lobréau <benoit.lobreau@dalibo.com>
Discussion: https://postgr.es/m/c3d617d4-1696-4aa7-8a4d-5a7d19cc5618@pgbackrest.org
Backpatch-through: 15
2025-02-20 10:42:20 +09:00
Andres Freund
d38bab5edd pgbench: Increase RLIMIT_NOFILE if necessary
pgbench already had code to check if the soft rlimit is too low for the
specified number of connections. If too low, it errored out, telling the user
to increase the limit.

However, we can do better: if the hard limit allows, increase the soft limit
so that it is sufficient for the number of connections.

It is common for the soft limit to be considerably lower than the hard limit,
due to the danger of soft limits > 1024 breaking programs that use
select(2), as explained in [1].

[1]: https://0pointer.net/blog/file-descriptor-limits.html

Author: Jelte Fennema-Nio <postgres@jeltef.nl>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAGECzQQh6VSy3KG4pN1d%3Dh9J%3DD1rStFCMR%2Bt7yh_Kwj-g87aLQ%40mail.gmail.com
2025-02-19 19:35:09 -05:00
Michael Paquier
9b1cb58c5f test_escape: Fix output of --help
The short option name -f was not listed, only its long option name
--force-unsupported.

Author: Japin Li
Discussion: https://postgr.es/m/ME0P300MB04452BD1FB1B277D4C1C20B9B6C52@ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM
Backpatch-through: 13
2025-02-20 09:30:54 +09:00
Tomas Vondra
9ba7bcc894 Correct relation size estimate with low fillfactor
Since commit 29cf61ade3, table_block_relation_estimate_size() considers
fillfactor when estimating the number of rows in a relation before the first
ANALYZE. The formula however did not consider that tuples may be larger than
the space available under the fillfactor, ending up with density 0. This
ultimately means the relation was estimated to contain a single row.

The executor however places at least one tuple per page, even with very
low fillfactor values, so the density should be at least 1. Fixed by
clamping the density estimate using clamp_row_est().
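
A sketch of the kind of setup that was affected (illustrative only; with an
8kB page, fillfactor = 10 leaves well under 1kB per page for new tuples):

    CREATE TABLE wide (a int, b text) WITH (fillfactor = 10);
    -- rows wider than the fillfactor-limited free space previously drove
    -- the pre-ANALYZE density to 0, i.e. a one-row estimate for the whole
    -- relation; the estimate is now clamped to at least one row per page
    EXPLAIN SELECT * FROM wide;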

Reported by Heikki Linnakangas. Fix by me, with regression test inspired
by example provided by Heikki.

Backpatch to 17, where the issue was introduced.

Reported-by: Heikki Linnakangas
Backpatch-through: 17
Discussion: https://postgr.es/m/2bf9d973-7789-4937-a7ca-0af9fb49c71e@iki.fi
2025-02-19 23:53:37 +01:00
Tom Lane
e596e077bb Assert that ExecOpenIndices and ExecCloseIndices are not repeated.
These functions should be called at most once per ResultRelInfo;
it's wasteful to do otherwise, and certainly the pattern of
opening twice and then closing twice is a bad idea.  Moreover,
aminsertcleanup functions might not be prepared to be called twice,
as the just-hardened code in BRIN demonstrates.

This amounts to an API change, since such coding patterns were
safe even if wasteful before v17.  Hence, apply to HEAD only.
(Extension code violating this new rule faces some risk in v17,
but we just fixed brininsertcleanup and there are probably few
other aminsertcleanup functions as yet.  So the odds of breaking
usable code seem higher than the odds of doing something useful
with a back-patch.)

Bug: #18815
Reported-by: Sergey Belyashov <sergey.belyashov@gmail.com>
Discussion: https://postgr.es/m/18815-2a0407cc7f40b327@postgresql.org
2025-02-19 16:45:12 -05:00
Tom Lane
9ff68679b5 Fix crash in brininsertcleanup during logical replication.
Logical replication crashes if the subscriber's partitioned table
has a BRIN index.  There are two independently blamable causes,
and this patch fixes both:

1. brininsertcleanup fails if called twice for the same IndexInfo,
because it half-destroys its BrinInsertState but leaves it still
linked from ii_AmCache.  brininsert would also fail in that state,
so it's pretty hard to see any advantage to this coding.  Fully
remove the BrinInsertState, instead, so that a new brininsert
call would create a new cache.

2. A logical replication subscriber sometimes does ExecOpenIndices
twice on the same ResultRelInfo, followed by doing ExecCloseIndices
twice; the second call reaches the brininsertcleanup bug.  Quite
aside from tickling unexpected cases in aminsertcleanup methods,
this seems very wasteful, because the IndexInfos built in the
first ExecOpenIndices call are just lost during the second call,
and have to be rebuilt at possibly-nontrivial cost.  We should
establish a coding rule that you don't do that.

The problematic coding is that when the target table is partitioned,
apply_handle_tuple_routing calls ExecFindPartition which does
ExecOpenIndices (and expects that ExecCleanupTupleRouting will
close the indexes again).  Using the ResultRelInfo made by
ExecFindPartition, it calls apply_handle_delete_internal or
apply_handle_insert_internal, both of which think they need to do
ExecOpenIndices/ExecCloseIndices for themselves.  They do in the main
non-partitioned code paths, but not here.  The simplest fix is to pull
their ExecOpenIndices/ExecCloseIndices calls out and put them in the
call sites for the non-partitioned cases.  (We could have refactored
apply_handle_update_internal similarly, but I did not do so today
because there's no bug there: the partitioned code path doesn't
call it.)

Also, remove the always-duplicative open/close calls within
apply_handle_tuple_routing itself.
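
A sketch of the crashing setup on the subscriber side (names invented):

    CREATE TABLE t (a int) PARTITION BY RANGE (a);
    CREATE TABLE t1 PARTITION OF t FOR VALUES FROM (0) TO (1000);
    CREATE INDEX ON t USING brin (a);
    CREATE SUBSCRIPTION s CONNECTION 'dbname=src' PUBLICATION p;
    -- replicated inserts routed into t1 previously crashed in
    -- brininsertcleanup via the duplicate ExecOpenIndices/ExecCloseIndices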

Since brininsertcleanup and indeed the whole aminsertcleanup mechanism
are new in v17, there's no observable bug in older branches.  A case
could be made for trying to avoid these duplicative open/close calls
in the older branches, but for now it seems not worth the trouble and
risk of new bugs.

Bug: #18815
Reported-by: Sergey Belyashov <sergey.belyashov@gmail.com>
Discussion: https://postgr.es/m/18815-2a0407cc7f40b327@postgresql.org
Backpatch-through: 17
2025-02-19 16:35:15 -05:00
Tomas Vondra
a1b4f289be Consider BufFiles when adjusting hashjoin parameters
Until now ExecChooseHashTableSize() considered only the size of the
in-memory hash table, and ignored the memory needed for the batch files.
Which can be a significant amount, because each batch needs two BufFiles
(each with a BLCKSZ buffer). The same issue applies to increasing the
number of batches during execution.

It's also possible to trigger a "batch explosion", e.g. due to duplicate
values or skew. We've seen reports of joins with hundreds of thousands
(or even millions) of batches, consuming gigabytes of memory, triggering
OOM errors. These cases may be fairly rare, but it's clearly possible to
hit them.

These issues can't be prevented during planning. Even if we improve
that, it does not help with execution-time batch explosion. We can
however reduce the impact and use as little memory as possible.

This patch improves the behavior by adjusting how the memory is divided
between the hash table and batch files. It may be better to use fewer
batch files, even if it means the hash table will exceed the limit.

The capacity of the hash node may be increased either by doubling the
number of batches, or doubling the size of the in-memory hash table. The
outcome is the same, but the memory usage may be very different. For low
nbatch values it's better to add batches, for high nbatch values it's
better to allow a larger hash table.

The patch considers both options, both during the initial sizing and
then during execution, to minimize how much the limit gets exceeded.
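
For illustration (BLCKSZ = 8kB; the numbers are only a sketch): with a 4MB
hash table limit and nbatch = 1024, the batch files alone already need two
BufFile buffers per batch, i.e. 2 * 1024 * 8kB = 16MB.  At that point,
doubling the number of batches adds another 16MB of file buffers, while
doubling the hash table costs only 4MB more, so allowing a larger hash
table is the cheaper way to add capacity.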

It might seem this patch is relaxing the memory limit by allowing it to
be exceeded. But that's not really the case. It has always been like
that, except the memory used by batches was ignored.

Allowing the hash table to grow may also prevent the batch explosion.
If there's a large batch that can't be split (due to hash collisions or
duplicate values), at some point the memory limit will increase enough
for the batch to fit into the hash table.
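
As a rough illustration of the tradeoff (a hedged sketch with hypothetical
helper names, not the actual ExecChooseHashTableSize() code), each batch
costs two BufFile buffers of BLCKSZ bytes each, so the two ways of doubling
capacity can be compared directly:

    #include <stdbool.h>
    #include <stddef.h>

    #define BLCKSZ 8192

    /* in-memory hash table plus two BufFile buffers per batch */
    static size_t
    total_hash_memory(size_t hash_table_bytes, int nbatch)
    {
        return hash_table_bytes + (size_t) 2 * nbatch * BLCKSZ;
    }

    /*
     * Prefer whichever doubling keeps the combined memory use smaller:
     * for low nbatch the BufFile overhead is small, so doubling batches
     * wins; for high nbatch it is cheaper to grow the hash table.
     */
    static bool
    prefer_more_batches(size_t hash_table_bytes, int nbatch)
    {
        return total_hash_memory(hash_table_bytes, 2 * nbatch) <=
               total_hash_memory(2 * hash_table_bytes, nbatch);
    }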

This patch was in the works for a long time. The early versions were
posted in 2019, and revived every year or two when we happened to get
the next report of OOM due to a hashjoin batch explosion. Each of those
patch versions was reviewed by a couple of people. I'm mentioning only
Melanie Plageman and Robert Haas, because they reviewed the last
version, and the older patches are very different.

Reviewed-by: Melanie Plageman, Robert Haas
Discussion: https://postgr.es/m/7bed6c08-72a0-4ab9-a79c-e01fcdd0940f@vondra.me
Discussion: https://postgr.es/m/20190504003414.bulcbnge3rhwhcsh%40development
Discussion: https://postgr.es/m/20190428141901.5dsbge2ka3rxmpk6%40development
2025-02-19 21:08:20 +01:00
Andres Freund
8b886a4e34 tests: BackgroundPsql: Fix potential for lost errors on windows
This addresses various corner cases in BackgroundPsql:

- On windows stdout and stderr may arrive out of order, leading to errors not
  being reported, or attributed to the wrong statement.

  To fix, emit the "query-separation banner" on both stdout and stderr and
  wait for both.

- Very occasionally the "query-separation banner" would not get removed,
  because we waited only until the banner arrived, but then tried to replace
  the banner plus a trailing newline that might not have arrived yet.

  To fix, wait for banner and newline.

- For interactive psql, replacing $banner\n is not sufficient, because
  interactive psql outputs \r\n.

- For interactive psql, where commands are echoed to stdout, the \echo
  command, rather than its output, would be matched.

  This would sometimes lead to output from the prior query, or wait_connect(),
  being returned in the next command.

  This also affected wait_connect(), leading to sometimes sending queries to
  psql before the connection actually was established.

While debugging these issues I also found that it's hard to know whether a
query separation banner was attributed to the right query. Make that easier by
counting the queries each BackgroundPsql instance has emitted and including
the number in the banner.

Also emit psql stdout/stderr in query() and wait_connect() as Test::More
notes; without that it's rather hard to debug some issues in CI and on the
buildfarm.

As this can cause issues not just in to-be-added tests, but also in existing ones,
backpatch the fix to all supported versions.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/wmovm6xcbwh7twdtymxuboaoarbvwj2haasd3sikzlb3dkgz76@n45rzycluzft
Backpatch-through: 13
2025-02-19 10:45:48 -05:00
Álvaro Herrera
80d7f99049
Add ATAlterConstraint struct for ALTER .. CONSTRAINT
Replace the use of Constraint with a new ATAlterConstraint struct, which
allows us to pass additional information.  No functionality is added by
this commit.  This is necessary for future work that allows altering
constraints in other ways.

I (Álvaro) took the liberty of restructuring the code for ALTER
CONSTRAINT beyond what Amul did.  The original coding before Amul's
patch was unnecessarily baroque, and this change makes things simpler
by removing one level of subroutine.  Also, partly remove the assumption
that only partitioned tables are relevant (by passing sensible 'recurse'
arguments) and no longer ignore whether ONLY was specified.  I say
'partly' because the current coding only walks down via the 'conparentid'
relationship, which is only used for partitioned tables; but future
patches could handle ONLY or not for other types of constraint changes
for legacy inheritance trees too.

Author: Amul Sul <sulamul@gmail.com>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/CAAJ_b94bfgPV-8Mw_HwSBeheVwaK9=5s+7+KbBj_NpwXQFgDGg@mail.gmail.com
2025-02-19 13:06:13 +01:00
Alexander Korotkov
e983ee9380 Improve statistics estimation for single-column GROUP BY in sub-queries
This commit follows the idea of commit 4767bc8ff2.  If a sub-query has only
one GROUP BY column, we can consider its output variable as being unique (for
example, in SELECT ... FROM (SELECT x FROM t GROUP BY x) s, the column s.x
cannot contain duplicates). We can employ this fact in the statistics to make
more precise estimations in the upper query block.

Author: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
2025-02-19 11:59:30 +02:00
Amit Kapila
8a695d7998 Add a test for commit ac0e33136a using the injection point.
This test uses an injection point to bypass the time overhead caused by
the idle_replication_slot_timeout GUC, which has a minimum value of one
minute.

Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Author: Nisha Moond <nisha.moond412@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
2025-02-19 15:02:22 +05:30
Michael Paquier
302cf15759 Add support for LIKE in CREATE FOREIGN TABLE
LIKE enables the creation of foreign tables based on the column
definitions, constraints and objects of the defined source relation(s).

This feature mirrors the behavior of CREATE TABLE LIKE, but ignores
the INCLUDING sub-options that do not make sense for foreign tables:
INDEXES, COMPRESSION, IDENTITY and STORAGE.  The supported sub-options
are COMMENTS, CONSTRAINTS, DEFAULTS, GENERATED and STATISTICS, mapping
with the clauses already supported by the command.

Note that the restriction with LIKE in CREATE FOREIGN TABLE was added in
a0c6dfeecf.

Author: Zhang Mingli
Reviewed-by: Álvaro Herrera, Sami Imseih, Michael Paquier
Discussion: https://postgr.es/m/42d3f855-2275-4361-a42a-826172ca2dc4@Spark
2025-02-19 15:50:37 +09:00
Amit Langote
e7563e3c75 doc: Fix some issues with JSON_TABLE() examples
1. Remove an unused PASSING variable.

2. Adjust formatting of JSON data used in an example to be valid
   under strict mode.

Reported-by: Miłosz Chmura <mieszko4@gmail.com>
Author: Robert Treat <rob@xzilla.net>
Discussion: https://postgr.es/m/173859550337.1071.4748984213168572913@wrigleys.postgresql.org
2025-02-19 15:08:17 +09:00
Amit Kapila
ac0e33136a Invalidate inactive replication slots.
This commit introduces the idle_replication_slot_timeout GUC, which allows
inactive slots to be invalidated at checkpoint time. Because checkpoints
happen at checkpoint_timeout intervals, there can be some lag between when
idle_replication_slot_timeout is exceeded and when the slot invalidation is
triggered at the next checkpoint. To avoid such lags, users can force a
checkpoint (for example, with the CHECKPOINT command) to promptly invalidate
inactive slots.

Note that the idle timeout invalidation mechanism is not applicable for
slots that do not reserve WAL or for slots on the standby server that are
synced from the primary server (i.e., standby slots having 'synced' field
'true'). Synced slots are always considered to be inactive because they
don't perform logical decoding to produce changes.

The slots can become inactive for a long period if a subscriber is down
due to a system error or is inaccessible because of network issues. If such a
situation persists, it might be more practical to recreate the subscriber
rather than attempt to recover the node and wait for it to catch up which
could be time-consuming.

In addition, external tools could create replication slots (e.g., for
migrations or upgrades) and then fail to remove them if an error occurs,
leaving behind unused slots that take up space and resources. Manually cleaning
them up can be tedious and error-prone, and without intervention, these
lingering slots can cause unnecessary WAL retention and system bloat.

As idle_replication_slot_timeout is measured in minutes, any test using it
would be time-consuming. We are planning to commit a follow-up patch that
adds tests using the injection point framework.

Author: Nisha Moond <nisha.moond412@gmail.com>
Author: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Hou Zhijie <houzj.fnst@fujitsu.com>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
Discussion: https://postgr.es/m/OS0PR01MB5716C131A7D80DAE8CB9E88794FC2@OS0PR01MB5716.jpnprd01.prod.outlook.com
2025-02-19 09:29:50 +05:30
Tom Lane
b464e51ab3 Update to latest Snowball sources.
It's been some time since we did this, partly because the upstream
snowball project hasn't formally tagged a new release since 2021.
The main motivation for doing it now is to absorb a bug fix
(their commit e322673a841d9abd69994ae8cd20e191090b6ef4), which
prevents a null pointer dereference crash if SN_create_env() gets
a malloc failure at just the wrong point.  We'll patch the back
branches with only that change, but we might as well do the full
sync dance on HEAD.

Aside from a bunch of mostly-minor tweaks to existing stemmers, this
update adds a new stemmer for Estonian.  It also removes the existing
stemmer for Romanian using ISO-8859-2 encoding.  Upstream apparently
concluded that ISO-8859-2 doesn't provide an adequate representation
of some Romanian characters, and the UTF-8 implementation should be
used instead.

While at it, update the README's instructions for doing a sync,
which were not adjusted when the meson tooling was added.

Thanks to Maksim Korotkov for discovering the null-pointer
bug and submitting the fix to upstream snowball.

Reported-by: Maksim Korotkov <m.korotkov@postgrespro.ru>
Discussion: https://postgr.es/m/1d1a46-67ab1000-21-80c451@83151435
2025-02-18 21:13:54 -05:00
Richard Guo
71d02dc478 Fix unsafe access to BufferDescriptors
When considering a local buffer, the GetBufferDescriptor() call in
BufferGetLSNAtomic() would be retrieving a shared buffer descriptor with a
bad buffer ID.  Since the code checks whether the buffer is shared before
using the retrieved BufferDesc, this issue did not lead to any
malfunction.  Nonetheless this seems like trouble waiting to happen,
so fix it by ensuring that GetBufferDescriptor() is only called when
we know the buffer is shared.
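
The fixed ordering, sketched with the usual bufmgr helpers (simplified from
the real BufferGetLSNAtomic()):

    /* handle local buffers before touching the shared descriptor array */
    if (BufferIsLocal(buffer))
        return PageGetLSN(page);

    /* safe now: the buffer is known to be shared */
    bufHdr = GetBufferDescriptor(buffer - 1);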

Author: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>
Reviewed-by: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/CAHewXNku-o46-9cmUgyv6LkSZ25doDrWq32p=oz9kfD8ovVJMg@mail.gmail.com
Backpatch-through: 13
2025-02-19 11:05:35 +09:00
Richard Guo
c39392ebae Fix freeing a child join's SpecialJoinInfo
In try_partitionwise_join, we try to break down the join between two
partitioned relations into joins between matching partitions.  To
achieve this, we iterate through each pair of partitions from the two
joining relations and create child join relations for them.  To reduce
memory accumulation during each iteration, one step we take is freeing
the SpecialJoinInfos created for the child joins.

A child join's SpecialJoinInfo is a copy of the parent join's
SpecialJoinInfo, with some members being translated copies of their
counterparts in the parent.  However, when freeing the bitmapset
members in a child join's SpecialJoinInfo, we failed to check whether
they were translated copies.  As a result, we inadvertently freed the
members that were still in use by the parent SpecialJoinInfo, leading
to crashes when those freed members were accessed.

To fix, check if each member of the child join's SpecialJoinInfo is a
translated copy and free it only if that's the case.  This requires
passing the parent join's SpecialJoinInfo as a parameter to
free_child_join_sjinfo.
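
As a hedged illustration (the bitmapset field names are from
SpecialJoinInfo, but the actual free_child_join_sjinfo() differs in
detail), the rule amounts to freeing a member only when the child's
pointer differs from the parent's:

    static void
    free_child_sjinfo_members(SpecialJoinInfo *child, SpecialJoinInfo *parent)
    {
        /* free only bitmapsets that were translated (copied) for the child */
        if (child->min_lefthand != parent->min_lefthand)
            bms_free(child->min_lefthand);
        if (child->min_righthand != parent->min_righthand)
            bms_free(child->min_righthand);
        if (child->syn_lefthand != parent->syn_lefthand)
            bms_free(child->syn_lefthand);
        if (child->syn_righthand != parent->syn_righthand)
            bms_free(child->syn_righthand);
    }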

Back-patch to v17 where this bug crept in.

Bug: #18806
Reported-by: 孟令彬 <m_lingbin@126.com>
Diagnosed-by: Tender Wang <tndrwang@gmail.com>
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://postgr.es/m/18806-d70b0c9fdf63dcbf@postgresql.org
Backpatch-through: 17
2025-02-19 10:02:32 +09:00
Michael Paquier
aef6f907f6 test_escape: Fix handling of short options in getopt_long()
This addresses two errors in the module, based on the set of options
supported:
- '-c', for --conninfo, was not listed.
- '-f', for --force-unsupported, was not listed.

While at it, these are now listed in alphabetical order.

Author: Japin Li
Discussion: https://postgr.es/m/ME0P300MB04451FB20CE0346A59C25CADB6FA2@ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM
Backpatch-through: 13
2025-02-19 09:45:42 +09:00
Michael Paquier
f2e4c2b203 Make the description of some GUCs more consistent
This commit improves the description of a couple of GUCs, to be more
consistent with the style of their surroundings:
* array_nulls
* enable_self_join_elimination
* optimize_bounded_sort
* row_security
* synchronize_seqscans

Author: Kyotaro Horiguchi
Discussion: https://postgr.es/m/20250218.103240.1422205966404509831.horikyota.ntt@gmail.com
2025-02-19 08:42:35 +09:00
Bruce Momjian
06dc1ffd24 doc: add example of sign mismatch with POSIX/ISO-8601 time zones
Author: Laurenz Albe

Discussion: https://postgr.es/m/eb4d1e15c6822c1937be1491118500dd9201492f.camel@cybertec.at
2025-02-18 15:51:31 -05:00
Jeff Davis
a1f7f80bfe Update outdated comments in nodeAgg.c.
Author: Zhang Mingli
Reviewed-by: Richard Guo
Discussion: https://postgr.es/m/198a8d1e-0792-4e7f-828e-902aa342f36e@Spark
2025-02-18 10:37:50 -08:00
Melanie Plageman
c623e8593e Reduce scope of heap vacuum per_buffer_data
Move lazy_scan_heap()'s per_buffer_data variable into a tighter scope.
In lazy_scan_heap()'s phase I heap vacuuming, the read stream API
returns a pointer to the next block number to vacuum. As long as
read_stream_next_buffer() returns a valid buffer, per_buffer_data should
always be valid.

Move per_buffer_data into a tighter scope and make sure it is reset to
NULL on each iteration so that we get a core dump instead of bogus data
from a previous block if something goes wrong in the read stream API.
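
A simplified sketch of the scoping pattern (not the exact lazy_scan_heap()
loop):

    while (true)
    {
        void   *per_buffer_data = NULL; /* reset on every iteration */
        Buffer  buf;

        buf = read_stream_next_buffer(stream, &per_buffer_data);
        if (!BufferIsValid(buf))
            break;

        /* per_buffer_data is valid here, and only for this block */
    }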

Suggested-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/626104.1739729538%40sss.pgh.pa.us
2025-02-18 09:29:10 -05:00
Daniel Gustafsson
95ef3d9029 Add PGErrorVerbosity to typedefs.list
PGErrorVerbosity was missing, which resulted in incorrect whitespace
alignment going back all the way to e3860ffa4d.  No backpatch for this,
though, since we don't pgindent backbranches.

Author: Jelte Fennema-Nio <postgres@jeltef.nl>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CAGECzQTVi8n-HW4Q27je-b9ckQk7zf6bS_it42gNvQu+DX0NCQ@mail.gmail.com
2025-02-18 13:23:13 +01:00
David Rowley
593509202f Fix poorly written regression test
bd10ec529 added code to allow redundant functionally dependent GROUP BY
columns to be removed using unique indexes and NOT NULL constraints as
proofs of functional dependency.  In that commit, I (David) added a test
to ensure that when there are multiple indexes available to remove columns
that we pick the index that allows us to remove the most columns.  This
test was faulty as it assumed the t3 table's primary key index was valid
to use as functional dependency proof, but that's not the case since
that's defined as deferrable.

Here we adjust the tests added by that commit to use the t2 table instead.
That's defined with a non-deferrable primary key.

Author: songjinzhou <tsinghualucky912@foxmail.com>
Author: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Japin Li <japinli@hotmail.com>
Discussion: https://postgr.es/m/tencent_CD414C79D39668455DF80D35143B87634C08@qq.com
2025-02-19 00:42:22 +13:00
Amit Kapila
217919dd09 Raise a WARNING for max_slot_wal_keep_size in pg_createsubscriber.
During the pg_createsubscriber execution, it is possible that the required
WAL is removed from the primary/publisher node due to
'max_slot_wal_keep_size'.

This patch raises a WARNING during the '--dry-run' mode if the
'max_slot_wal_keep_size' is set to a non-default value on the
primary/publisher node.

Author: Shubham Khanna <khannashubham1197@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Vignesh C <vignesh21@gmail.com>
Discussion: https://postgr.es/m/CAHv8Rj+deqsQXOMa7Tck8CBQUbsua=+4AuMVQ2=MPM0f-ZHbjA@mail.gmail.com
2025-02-18 12:15:43 +05:30
John Naylor
53d3daa491 Specialize intarray sorting
There is at least one report in the field of storing millions of
integers in arrays, so it seems like a good time to specialize
intarray's qsort function. In doing so, streamline the comparators:
previously there were three: two for sorting (one per direction)
and one passed to qunique_arg. To preserve the early exit in the
case of descending input, pass the direction as an argument to
the comparator. This requires giving up duplicate detection, which
previously allowed skipping the qunique_arg() call. Testing showed
no regressions this way.
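
A hedged sketch of a direction-aware comparator (hypothetical name, using
the qsort_arg-style callback signature; the committed code is specialized
differently):

    static int
    isort_cmp(const void *a, const void *b, void *arg)
    {
        int32   av = *(const int32 *) a;
        int32   bv = *(const int32 *) b;
        bool    ascending = *(bool *) arg;

        if (av == bv)
            return 0;
        /* flip the result for descending input, keeping one comparator */
        if (ascending)
            return (av < bv) ? -1 : 1;
        return (av < bv) ? 1 : -1;
    }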

In passing, get rid of nearby checks that the input has at least
two elements, since preserving them would make some macros less
readable. These are not necessary for correctness, and seem like
premature optimizations.

Author: Andrey M. Borodin <x4mmm@yandex-team.ru>
Discussion: https://postgr.es/m/098A3E67-E4A6-4086-9C66-B1EAEB1DFE1C@yandex-team.ru
2025-02-18 11:04:55 +07:00
Amit Kapila
164bac92f0 Doc: Improve pg_replication_slots.inactive_since description.
Author: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CAHut+PssvVMTWVtUPto6HbPO8pgVsvtzndt_FdBomA_Oq4zf3w@mail.gmail.com
2025-02-18 09:23:43 +05:30
Thomas Munro
2509b857cc Fix typo in 2a8a0067.
Builds configured with Valgrind but without assertions would fail due to
a typo in the recent change.  This should be included when back-patching
2a8a0067 into v17.
2025-02-18 14:44:59 +13:00
Daniel Gustafsson
9cdc21b533 Fix translator notes in comments
The translator comments detailing what a %s inclusion refers to were
accidentally including too many address types.  In practice this is
not a problem since it's not a translated string, but to minimize any
risk of confusion let's fix them anyway.  Even though this exists in
backbranches there is little use for backpatch as the translation work
has already happened there, so let's avoid the churn.

Author: Japin Li <japinli@hotmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/ME0P300MB04458DE627480614ABE639D2B6FB2@ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM
2025-02-17 20:23:34 +01:00
1651 changed files with 248187 additions and 139844 deletions

View File

@ -2,6 +2,12 @@
#
# For instructions on how to enable the CI integration in a repository and
# further details, see src/tools/ci/README
#
#
# NB: Different tasks intentionally test with different, non-default,
# configurations, to increase the chance of catching problems. Each task with
# non-obvious non-default documents their oddity at the top of the task,
# prefixed by "SPECIAL:".
env:
@ -23,7 +29,7 @@ env:
MTEST_ARGS: --print-errorlogs --no-rebuild -C build
PGCTLTIMEOUT: 120 # avoids spurious failures during parallel tests
TEMP_CONFIG: ${CIRRUS_WORKING_DIR}/src/tools/ci/pg_ci_base.conf
PG_TEST_EXTRA: kerberos ldap ssl libpq_encryption load_balance
PG_TEST_EXTRA: kerberos ldap ssl libpq_encryption load_balance oauth
# What files to preserve in case tests fail
@ -55,6 +61,10 @@ on_failure_meson: &on_failure_meson
# To avoid unnecessarily spinning up a lot of VMs / containers for entirely
# broken commits, have a minimal task that all others depend on.
#
# SPECIAL:
# - Builds with --auto-features=disabled and thus almost no enabled
# dependencies
task:
name: SanityCheck
@ -125,21 +135,33 @@ task:
src/tools/ci/cores_backtrace.sh linux /tmp/cores
# SPECIAL:
# - Uses postgres specific CPPFLAGS that increase test coverage
# - Specifies configuration options that test reading/writing/copying of node trees
# - Specifies debug_parallel_query=regress, to catch related issues during CI
# - Also runs tests against a running postgres instance, see test_running_script
task:
name: FreeBSD - 13 - Meson
name: FreeBSD - Meson
env:
CPUS: 4
BUILD_JOBS: 4
TEST_JOBS: 8
IMAGE_FAMILY: pg-ci-freebsd-13
IMAGE_FAMILY: pg-ci-freebsd
DISK_SIZE: 50
CCACHE_DIR: /tmp/ccache_dir
CPPFLAGS: -DRELCACHE_FORCE_RELEASE -DENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS
CFLAGS: -Og -ggdb
PG_TEST_INITDB_EXTRA_OPTS: -c debug_copy_parse_plan_trees=on -c debug_write_read_parse_plan_trees=on -c debug_raw_expression_coverage_test=on
# Several buildfarm animals enable these options. Without testing them
# during CI, it would be easy to cause breakage on the buildfarm with CI
# passing.
PG_TEST_INITDB_EXTRA_OPTS: >-
-c debug_copy_parse_plan_trees=on
-c debug_write_read_parse_plan_trees=on
-c debug_raw_expression_coverage_test=on
-c debug_parallel_query=regress
PG_TEST_PG_UPGRADE_MODE: --link
<<: *freebsd_task_template
@ -155,8 +177,7 @@ task:
ccache_cache:
folder: $CCACHE_DIR
# Work around performance issues due to 32KB block size
repartition_script: src/tools/ci/gcp_freebsd_repartition.sh
setup_ram_disk_script: src/tools/ci/gcp_ram_disk.sh
create_user_script: |
pw useradd postgres
chown -R postgres:postgres .
@ -275,7 +296,7 @@ task:
ccache_cache:
folder: $CCACHE_DIR
setup_ram_disk_script: src/tools/ci/gcp_ram_disk.sh
create_user_script: |
useradd postgres
chown -R postgres:users /home/postgres
@ -301,7 +322,7 @@ task:
build
EOF
build_script: su postgres -c 'ninja -C build -j${BUILD_JOBS}'
build_script: su postgres -c 'ninja -C build -j${BUILD_JOBS} ${MBUILD_TARGET}'
upload_caches: ccache
test_world_script: |
@ -329,6 +350,7 @@ LINUX_CONFIGURE_FEATURES: &LINUX_CONFIGURE_FEATURES >-
--with-gssapi
--with-icu
--with-ldap
--with-libcurl
--with-libxml
--with-libxslt
--with-llvm
@ -348,6 +370,7 @@ LINUX_MESON_FEATURES: &LINUX_MESON_FEATURES >-
-Duuid=e2fs
# Check SPECIAL in the matrix: below
task:
env:
CPUS: 4
@ -426,6 +449,10 @@ task:
#DEBIAN_FRONTEND=noninteractive apt-get -y install ...
matrix:
# SPECIAL:
# - Uses address sanitizer, sanitizer failures are typically printed in
# the server log
# - Configures postgres with a small segment size
- name: Linux - Debian Bookworm - Autoconf
env:
@ -444,6 +471,8 @@ task:
--enable-cassert --enable-injection-points --enable-debug \
--enable-tap-tests --enable-nls \
--with-segsize-blocks=6 \
--with-libnuma \
--with-liburing \
\
${LINUX_CONFIGURE_FEATURES} \
\
@ -461,11 +490,18 @@ task:
on_failure:
<<: *on_failure_ac
# SPECIAL:
# - Uses undefined behaviour and alignment sanitizers, sanitizer failures
# are typically printed in the server log
# - Test both 64bit and 32 bit builds
# - uses io_method=io_uring
- name: Linux - Debian Bookworm - Meson
env:
CCACHE_MAXSIZE: "400M" # tests two different builds
SANITIZER_FLAGS: -fsanitize=alignment,undefined
PG_TEST_INITDB_EXTRA_OPTS: >-
-c io_method=io_uring
configure_script: |
su postgres <<-EOF
@ -488,11 +524,21 @@ task:
-Dllvm=disabled \
--pkg-config-path /usr/lib/i386-linux-gnu/pkgconfig/ \
-DPERL=perl5.36-i386-linux-gnu \
-Dlibnuma=disabled \
build-32
EOF
build_script: su postgres -c 'ninja -C build -j${BUILD_JOBS} ${MBUILD_TARGET}'
build_32_script: su postgres -c 'ninja -C build-32 -j${BUILD_JOBS} ${MBUILD_TARGET}'
build_script: |
su postgres <<-EOF
ninja -C build -j${BUILD_JOBS} ${MBUILD_TARGET}
ninja -C build -t missingdeps
EOF
build_32_script: |
su postgres <<-EOF
ninja -C build-32 -j${BUILD_JOBS} ${MBUILD_TARGET}
ninja -C build -t missingdeps
EOF
upload_caches: ccache
@ -521,6 +567,11 @@ task:
cores_script: src/tools/ci/cores_backtrace.sh linux /tmp/cores
# NB: macOS is by far the most expensive OS to run CI for, therefore no
# expensive additional checks should be added.
#
# SPECIAL:
# - Enables --clone for pg_upgrade and pg_combinebackup
task:
name: macOS - Sonoma - Meson
@ -687,6 +738,7 @@ task:
build_script: |
vcvarsall x64
ninja -C build %MBUILD_TARGET%
ninja -C build -t missingdeps
check_world_script: |
vcvarsall x64

View File

@ -14,6 +14,21 @@
#
# $ git log --pretty=format:"%H # %cd%n# %s" $PGINDENTGITHASH -1 --date=iso
918e7287ed20eb1fe280ab6c4056ccf94dcd53a8 # 2025-04-30 19:18:30 +1200
# Fix broken indentation
e1a8b1ad587112e67fdc5aa7b388631dde4dbdda # 2025-04-04 09:38:22 -0500
# Re-pgindent pg_largeobject.c after commit 0d6c477664.
796bdda484c838313959f65e2b700f14ac7c0e66 # 2025-03-18 09:02:36 -0400
# Fix indentation again.
203c1b4cc49455364b6bcab8034900d1c016b9cd # 2025-03-17 16:06:17 -0400
# Fix indentation.
b955df443405e056fd9047ef819a1465654f9d79 # 2025-03-13 12:41:44 +1300
# Fix indentation issue
76aa615943049c04efd36ab4765c06eda89cdfea # 2025-01-31 16:44:24 +0900
# Fix bad indentation introduced in commit d47cbf474

View File

@ -1,5 +1,5 @@
PostgreSQL Database Management System
(formerly known as Postgres, then as Postgres95)
(also known as Postgres, formerly known as Postgres95)
Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group

View File

@ -553,16 +553,20 @@ fi])# PGAC_HAVE_GCC__ATOMIC_INT64_CAS
# the other ones are, on x86-64 platforms)
#
# If the intrinsics are supported, sets pgac_sse42_crc32_intrinsics.
#
# To detect the case where the compiler knows the function but library support
# is missing, we must link not just compile, and store the results in global
# variables so the compiler doesn't optimize away the call.
AC_DEFUN([PGAC_SSE42_CRC32_INTRINSICS],
[define([Ac_cachevar], [AS_TR_SH([pgac_cv_sse42_crc32_intrinsics])])dnl
AC_CACHE_CHECK([for _mm_crc32_u8 and _mm_crc32_u32], [Ac_cachevar],
[AC_LINK_IFELSE([AC_LANG_PROGRAM([#include <nmmintrin.h>
unsigned int crc;
#if defined(__has_attribute) && __has_attribute (target)
__attribute__((target("sse4.2")))
#endif
static int crc32_sse42_test(void)
{
unsigned int crc = 0;
crc = _mm_crc32_u8(crc, 0);
crc = _mm_crc32_u32(crc, 0);
/* return computed value, to prevent the above being optimized away */
@ -577,6 +581,43 @@ fi
undefine([Ac_cachevar])dnl
])# PGAC_SSE42_CRC32_INTRINSICS
# PGAC_AVX512_PCLMUL_INTRINSICS
# ---------------------------
# Check if the compiler supports AVX-512 carryless multiplication
# and three-way exclusive-or instructions used for computing CRC.
# AVX-512F is assumed to be supported if the above are.
#
# If the intrinsics are supported, sets pgac_avx512_pclmul_intrinsics.
AC_DEFUN([PGAC_AVX512_PCLMUL_INTRINSICS],
[define([Ac_cachevar], [AS_TR_SH([pgac_cv_avx512_pclmul_intrinsics])])dnl
AC_CACHE_CHECK([for _mm512_clmulepi64_epi128], [Ac_cachevar],
[AC_LINK_IFELSE([AC_LANG_PROGRAM([#include <immintrin.h>
__m512i x;
__m512i y;
#if defined(__has_attribute) && __has_attribute (target)
__attribute__((target("vpclmulqdq,avx512vl")))
#endif
static int avx512_pclmul_test(void)
{
__m128i z;
y = _mm512_clmulepi64_epi128(x, y, 0);
z = _mm_ternarylogic_epi64(
_mm512_castsi512_si128(y),
_mm512_extracti32x4_epi32(y, 1),
_mm512_extracti32x4_epi32(y, 2),
0x96);
return _mm_crc32_u64(0, _mm_extract_epi64(z, 0));
}],
[return avx512_pclmul_test();])],
[Ac_cachevar=yes],
[Ac_cachevar=no])])
if test x"$Ac_cachevar" = x"yes"; then
pgac_avx512_pclmul_intrinsics=yes
fi
undefine([Ac_cachevar])dnl
])# PGAC_AVX512_PCLMUL_INTRINSICS
# PGAC_ARMV8_CRC32C_INTRINSICS
# ----------------------------
@ -593,9 +634,9 @@ AC_DEFUN([PGAC_ARMV8_CRC32C_INTRINSICS],
AC_CACHE_CHECK([for __crc32cb, __crc32ch, __crc32cw, and __crc32cd with CFLAGS=$1], [Ac_cachevar],
[pgac_save_CFLAGS=$CFLAGS
CFLAGS="$pgac_save_CFLAGS $1"
AC_LINK_IFELSE([AC_LANG_PROGRAM([#include <arm_acle.h>],
[unsigned int crc = 0;
crc = __crc32cb(crc, 0);
AC_LINK_IFELSE([AC_LANG_PROGRAM([#include <arm_acle.h>
unsigned int crc;],
[crc = __crc32cb(crc, 0);
crc = __crc32ch(crc, 0);
crc = __crc32cw(crc, 0);
crc = __crc32cd(crc, 0);
@ -628,9 +669,8 @@ AC_DEFUN([PGAC_LOONGARCH_CRC32C_INTRINSICS],
AC_CACHE_CHECK(
[for __builtin_loongarch_crcc_w_b_w, __builtin_loongarch_crcc_w_h_w, __builtin_loongarch_crcc_w_w_w and __builtin_loongarch_crcc_w_d_w],
[Ac_cachevar],
[AC_LINK_IFELSE([AC_LANG_PROGRAM([],
[unsigned int crc = 0;
crc = __builtin_loongarch_crcc_w_b_w(0, crc);
[AC_LINK_IFELSE([AC_LANG_PROGRAM([unsigned int crc;],
[crc = __builtin_loongarch_crcc_w_b_w(0, crc);
crc = __builtin_loongarch_crcc_w_h_w(0, crc);
crc = __builtin_loongarch_crcc_w_w_w(0, crc);
crc = __builtin_loongarch_crcc_w_d_w(0, crc);
@ -680,22 +720,23 @@ undefine([Ac_cachevar])dnl
AC_DEFUN([PGAC_AVX512_POPCNT_INTRINSICS],
[define([Ac_cachevar], [AS_TR_SH([pgac_cv_avx512_popcnt_intrinsics])])dnl
AC_CACHE_CHECK([for _mm512_popcnt_epi64], [Ac_cachevar],
[AC_LINK_IFELSE([AC_LANG_PROGRAM([#include <immintrin.h>
[AC_LINK_IFELSE([AC_LANG_PROGRAM([[#include <immintrin.h>
#include <stdint.h>
char buf[sizeof(__m512i)];
#if defined(__has_attribute) && __has_attribute (target)
__attribute__((target("avx512vpopcntdq,avx512bw")))
#endif
static int popcount_test(void)
{
const char buf@<:@sizeof(__m512i)@:>@;
int64_t popcnt = 0;
__m512i accum = _mm512_setzero_si512();
const __m512i val = _mm512_maskz_loadu_epi8((__mmask64) 0xf0f0f0f0f0f0f0f0, (const __m512i *) buf);
const __m512i cnt = _mm512_popcnt_epi64(val);
__m512i val = _mm512_maskz_loadu_epi8((__mmask64) 0xf0f0f0f0f0f0f0f0, (const __m512i *) buf);
__m512i cnt = _mm512_popcnt_epi64(val);
accum = _mm512_add_epi64(accum, cnt);
popcnt = _mm512_reduce_add_epi64(accum);
return (int) popcnt;
}],
}]],
[return popcount_test();])],
[Ac_cachevar=yes],
[Ac_cachevar=no])])
@ -704,3 +745,55 @@ if test x"$Ac_cachevar" = x"yes"; then
fi
undefine([Ac_cachevar])dnl
])# PGAC_AVX512_POPCNT_INTRINSICS
# PGAC_SVE_POPCNT_INTRINSICS
# --------------------------
# Check if the compiler supports the SVE popcount instructions using the
# svptrue_b64, svdup_u64, svcntb, svld1_u64, svld1_u8, svadd_u64_x,
# svcnt_u64_x, svcnt_u8_x, svaddv_u64, svaddv_u8, svwhilelt_b8_s32,
# svand_n_u64_x, and svand_n_u8_x intrinsic functions.
#
# If the intrinsics are supported, sets pgac_sve_popcnt_intrinsics.
AC_DEFUN([PGAC_SVE_POPCNT_INTRINSICS],
[define([Ac_cachevar], [AS_TR_SH([pgac_cv_sve_popcnt_intrinsics])])dnl
AC_CACHE_CHECK([for svcnt_x], [Ac_cachevar],
[AC_LINK_IFELSE([AC_LANG_PROGRAM([[#include <arm_sve.h>
char buf[128];
#if defined(__has_attribute) && __has_attribute (target)
__attribute__((target("arch=armv8-a+sve")))
#endif
static int popcount_test(void)
{
svbool_t pred = svptrue_b64();
svuint8_t vec8;
svuint64_t accum1 = svdup_u64(0),
accum2 = svdup_u64(0),
vec64;
char *p = buf;
uint64_t popcnt,
mask = 0x5555555555555555;
vec64 = svand_n_u64_x(pred, svld1_u64(pred, (const uint64_t *) p), mask);
accum1 = svadd_u64_x(pred, accum1, svcnt_u64_x(pred, vec64));
p += svcntb();
vec64 = svand_n_u64_x(pred, svld1_u64(pred, (const uint64_t *) p), mask);
accum2 = svadd_u64_x(pred, accum2, svcnt_u64_x(pred, vec64));
p += svcntb();
popcnt = svaddv_u64(pred, svadd_u64_x(pred, accum1, accum2));
pred = svwhilelt_b8_s32(0, sizeof(buf));
vec8 = svand_n_u8_x(pred, svld1_u8(pred, (const uint8_t *) p), 0x55);
return (int) (popcnt + svaddv_u8(pred, svcnt_u8_x(pred, vec8)));
}]],
[return popcount_test();])],
[Ac_cachevar=yes],
[Ac_cachevar=no])])
if test x"$Ac_cachevar" = x"yes"; then
pgac_sve_popcnt_intrinsics=yes
fi
undefine([Ac_cachevar])dnl
])# PGAC_SVE_POPCNT_INTRINSICS

11
config/config.guess vendored
View File

@ -4,7 +4,7 @@
# shellcheck disable=SC2006,SC2268 # see below for rationale
timestamp='2024-01-01'
timestamp='2024-07-27'
# This file is free software; you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by
@ -123,7 +123,7 @@ set_cc_for_build() {
dummy=$tmp/dummy
case ${CC_FOR_BUILD-},${HOST_CC-},${CC-} in
,,) echo "int x;" > "$dummy.c"
for driver in cc gcc c89 c99 ; do
for driver in cc gcc c17 c99 c89 ; do
if ($driver -c -o "$dummy.o" "$dummy.c") >/dev/null 2>&1 ; then
CC_FOR_BUILD=$driver
break
@ -634,7 +634,8 @@ EOF
sed 's/^ //' << EOF > "$dummy.c"
#include <sys/systemcfg.h>
main()
int
main ()
{
if (!__power_pc())
exit(1);
@ -718,7 +719,8 @@ EOF
#include <stdlib.h>
#include <unistd.h>
int main ()
int
main ()
{
#if defined(_SC_KERNEL_BITS)
long bits = sysconf(_SC_KERNEL_BITS);
@ -1621,6 +1623,7 @@ cat > "$dummy.c" <<EOF
#endif
#endif
#endif
int
main ()
{
#if defined (sony)

729
config/config.sub vendored
View File

@ -2,9 +2,9 @@
# Configuration validation subroutine script.
# Copyright 1992-2024 Free Software Foundation, Inc.
# shellcheck disable=SC2006,SC2268 # see below for rationale
# shellcheck disable=SC2006,SC2268,SC2162 # see below for rationale
timestamp='2024-01-01'
timestamp='2024-05-27'
# This file is free software; you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by
@ -120,7 +120,6 @@ case $# in
esac
# Split fields of configuration type
# shellcheck disable=SC2162
saved_IFS=$IFS
IFS="-" read field1 field2 field3 field4 <<EOF
$1
@ -142,10 +141,20 @@ case $1 in
# parts
maybe_os=$field2-$field3
case $maybe_os in
nto-qnx* | linux-* | uclinux-uclibc* \
| uclinux-gnu* | kfreebsd*-gnu* | knetbsd*-gnu* | netbsd*-gnu* \
| netbsd*-eabi* | kopensolaris*-gnu* | cloudabi*-eabi* \
| storm-chaos* | os2-emx* | rtmk-nova* | managarm-* \
cloudabi*-eabi* \
| kfreebsd*-gnu* \
| knetbsd*-gnu* \
| kopensolaris*-gnu* \
| linux-* \
| managarm-* \
| netbsd*-eabi* \
| netbsd*-gnu* \
| nto-qnx* \
| os2-emx* \
| rtmk-nova* \
| storm-chaos* \
| uclinux-gnu* \
| uclinux-uclibc* \
| windows-* )
basic_machine=$field1
basic_os=$maybe_os
@ -161,8 +170,12 @@ case $1 in
esac
;;
*-*)
# A lone config we happen to match not fitting any pattern
case $field1-$field2 in
# Shorthands that happen to contain a single dash
convex-c[12] | convex-c3[248])
basic_machine=$field2-convex
basic_os=
;;
decstation-3100)
basic_machine=mips-dec
basic_os=
@ -170,28 +183,88 @@ case $1 in
*-*)
# Second component is usually, but not always the OS
case $field2 in
# Prevent following clause from handling this valid os
# Do not treat sunos as a manufacturer
sun*os*)
basic_machine=$field1
basic_os=$field2
;;
# Manufacturers
3100* \
| 32* \
| 3300* \
| 3600* \
| 7300* \
| acorn \
| altos* \
| apollo \
| apple \
| atari \
| att* \
| axis \
| be \
| bull \
| cbm \
| ccur \
| cisco \
| commodore \
| convergent* \
| convex* \
| cray \
| crds \
| dec* \
| delta* \
| dg \
| digital \
| dolphin \
| encore* \
| gould \
| harris \
| highlevel \
| hitachi* \
| hp \
| ibm* \
| intergraph \
| isi* \
| knuth \
| masscomp \
| microblaze* \
| mips* \
| motorola* \
| ncr* \
| news \
| next \
| ns \
| oki \
| omron* \
| pc533* \
| rebel \
| rom68k \
| rombug \
| semi \
| sequent* \
| siemens \
| sgi* \
| siemens \
| sim \
| sni \
| sony* \
| stratus \
| sun \
| sun[234]* \
| tektronix \
| tti* \
| ultra \
| unicom* \
| wec \
| winbond \
| wrs)
basic_machine=$field1-$field2
basic_os=
;;
zephyr*)
basic_machine=$field1-unknown
basic_os=$field2
;;
# Manufacturers
dec* | mips* | sequent* | encore* | pc533* | sgi* | sony* \
| att* | 7300* | 3300* | delta* | motorola* | sun[234]* \
| unicom* | ibm* | next | hp | isi* | apollo | altos* \
| convergent* | ncr* | news | 32* | 3600* | 3100* \
| hitachi* | c[123]* | convex* | sun | crds | omron* | dg \
| ultra | tti* | harris | dolphin | highlevel | gould \
| cbm | ns | masscomp | apple | axis | knuth | cray \
| microblaze* | sim | cisco \
| oki | wec | wrs | winbond)
basic_machine=$field1-$field2
basic_os=
;;
*)
basic_machine=$field1
basic_os=$field2
@ -272,26 +345,6 @@ case $1 in
basic_machine=arm-unknown
basic_os=cegcc
;;
convex-c1)
basic_machine=c1-convex
basic_os=bsd
;;
convex-c2)
basic_machine=c2-convex
basic_os=bsd
;;
convex-c32)
basic_machine=c32-convex
basic_os=bsd
;;
convex-c34)
basic_machine=c34-convex
basic_os=bsd
;;
convex-c38)
basic_machine=c38-convex
basic_os=bsd
;;
cray)
basic_machine=j90-cray
basic_os=unicos
@ -714,15 +767,26 @@ case $basic_machine in
vendor=dec
basic_os=tops20
;;
delta | 3300 | motorola-3300 | motorola-delta \
| 3300-motorola | delta-motorola)
delta | 3300 | delta-motorola | 3300-motorola | motorola-delta | motorola-3300)
cpu=m68k
vendor=motorola
;;
dpx2*)
# This used to be dpx2*, but that gets the RS6000-based
# DPX/20 and the x86-based DPX/2-100 wrong. See
# https://oldskool.silicium.org/stations/bull_dpx20.htm
# https://www.feb-patrimoine.com/english/bull_dpx2.htm
# https://www.feb-patrimoine.com/english/unix_and_bull.htm
dpx2 | dpx2[23]00 | dpx2[23]xx)
cpu=m68k
vendor=bull
basic_os=sysv3
;;
dpx2100 | dpx21xx)
cpu=i386
vendor=bull
;;
dpx20)
cpu=rs6000
vendor=bull
;;
encore | umax | mmax)
cpu=ns32k
@ -837,18 +901,6 @@ case $basic_machine in
next | m*-next)
cpu=m68k
vendor=next
case $basic_os in
openstep*)
;;
nextstep*)
;;
ns2*)
basic_os=nextstep2
;;
*)
basic_os=nextstep3
;;
esac
;;
np1)
cpu=np1
@ -937,7 +989,6 @@ case $basic_machine in
;;
*-*)
# shellcheck disable=SC2162
saved_IFS=$IFS
IFS="-" read cpu vendor <<EOF
$basic_machine
@ -972,15 +1023,19 @@ unset -v basic_machine
# Decode basic machines in the full and proper CPU-Company form.
case $cpu-$vendor in
# Here we handle the default manufacturer of certain CPU types in canonical form. It is in
# some cases the only manufacturer, in others, it is the most popular.
# Here we handle the default manufacturer of certain CPU types in canonical form.
# It is in some cases the only manufacturer, in others, it is the most popular.
c[12]-convex | c[12]-unknown | c3[248]-convex | c3[248]-unknown)
vendor=convex
basic_os=${basic_os:-bsd}
;;
craynv-unknown)
vendor=cray
basic_os=${basic_os:-unicosmp}
;;
c90-unknown | c90-cray)
vendor=cray
basic_os=${Basic_os:-unicos}
basic_os=${basic_os:-unicos}
;;
fx80-unknown)
vendor=alliant
@ -1026,11 +1081,29 @@ case $cpu-$vendor in
vendor=alt
basic_os=${basic_os:-linux-gnueabihf}
;;
dpx20-unknown | dpx20-bull)
cpu=rs6000
vendor=bull
# Normalized CPU+vendor pairs that imply an OS, if not otherwise specified
m68k-isi)
basic_os=${basic_os:-sysv}
;;
m68k-sony)
basic_os=${basic_os:-newsos}
;;
m68k-tektronix)
basic_os=${basic_os:-bsd}
;;
m88k-harris)
basic_os=${basic_os:-sysv3}
;;
i386-bull | m68k-bull)
basic_os=${basic_os:-sysv3}
;;
rs6000-bull)
basic_os=${basic_os:-bosx}
;;
mips-sni)
basic_os=${basic_os:-sysv4}
;;
# Here we normalize CPU types irrespective of the vendor
amd64-*)
@ -1038,7 +1111,7 @@ case $cpu-$vendor in
;;
blackfin-*)
cpu=bfin
basic_os=linux
basic_os=${basic_os:-linux}
;;
c54x-*)
cpu=tic54x
@ -1061,7 +1134,7 @@ case $cpu-$vendor in
;;
m68knommu-*)
cpu=m68k
basic_os=linux
basic_os=${basic_os:-linux}
;;
m9s12z-* | m68hcs12z-* | hcs12z-* | s12z-*)
cpu=s12z
@ -1071,7 +1144,7 @@ case $cpu-$vendor in
;;
parisc-*)
cpu=hppa
basic_os=linux
basic_os=${basic_os:-linux}
;;
pentium-* | p5-* | k5-* | k6-* | nexgen-* | viac3-*)
cpu=i586
@ -1085,9 +1158,6 @@ case $cpu-$vendor in
pentium4-*)
cpu=i786
;;
pc98-*)
cpu=i386
;;
ppc-* | ppcbe-*)
cpu=powerpc
;;
@ -1121,9 +1191,6 @@ case $cpu-$vendor in
tx39el-*)
cpu=mipstx39el
;;
x64-*)
cpu=x86_64
;;
xscale-* | xscalee[bl]-*)
cpu=`echo "$cpu" | sed 's/^xscale/arm/'`
;;
@ -1179,90 +1246,227 @@ case $cpu-$vendor in
# Recognize the canonical CPU types that are allowed with any
# company name.
case $cpu in
1750a | 580 \
1750a \
| 580 \
| [cjt]90 \
| a29k \
| aarch64 | aarch64_be | aarch64c | arm64ec \
| aarch64 \
| aarch64_be \
| aarch64c \
| abacus \
| alpha | alphaev[4-8] | alphaev56 | alphaev6[78] \
| alpha64 | alpha64ev[4-8] | alpha64ev56 | alpha64ev6[78] \
| alphapca5[67] | alpha64pca5[67] \
| alpha \
| alpha64 \
| alpha64ev56 \
| alpha64ev6[78] \
| alpha64ev[4-8] \
| alpha64pca5[67] \
| alphaev56 \
| alphaev6[78] \
| alphaev[4-8] \
| alphapca5[67] \
| am33_2.0 \
| amdgcn \
| arc | arceb | arc32 | arc64 \
| arm | arm[lb]e | arme[lb] | armv* \
| avr | avr32 \
| arc \
| arc32 \
| arc64 \
| arceb \
| arm \
| arm64e \
| arm64ec \
| arm[lb]e \
| arme[lb] \
| armv* \
| asmjs \
| avr \
| avr32 \
| ba \
| be32 | be64 \
| bfin | bpf | bs2000 \
| c[123]* | c30 | [cjt]90 | c4x \
| c8051 | clipper | craynv | csky | cydra \
| d10v | d30v | dlx | dsp16xx \
| e2k | elxsi | epiphany \
| f30[01] | f700 | fido | fr30 | frv | ft32 | fx80 \
| javascript \
| h8300 | h8500 \
| hppa | hppa1.[01] | hppa2.0 | hppa2.0[nw] | hppa64 \
| be32 \
| be64 \
| bfin \
| bpf \
| bs2000 \
| c30 \
| c4x \
| c8051 \
| c[123]* \
| clipper \
| craynv \
| csky \
| cydra \
| d10v \
| d30v \
| dlx \
| dsp16xx \
| e2k \
| elxsi \
| epiphany \
| f30[01] \
| f700 \
| fido \
| fr30 \
| frv \
| ft32 \
| fx80 \
| h8300 \
| h8500 \
| hexagon \
| i370 | i*86 | i860 | i960 | ia16 | ia64 \
| ip2k | iq2000 \
| hppa \
| hppa1.[01] \
| hppa2.0 \
| hppa2.0[nw] \
| hppa64 \
| i*86 \
| i370 \
| i860 \
| i960 \
| ia16 \
| ia64 \
| ip2k \
| iq2000 \
| javascript \
| k1om \
| kvx \
| le32 | le64 \
| le32 \
| le64 \
| lm32 \
| loongarch32 | loongarch64 \
| m32c | m32r | m32rle \
| m5200 | m68000 | m680[012346]0 | m68360 | m683?2 | m68k \
| m6811 | m68hc11 | m6812 | m68hc12 | m68hcs12x \
| m88110 | m88k | maxq | mb | mcore | mep | metag \
| microblaze | microblazeel \
| loongarch32 \
| loongarch64 \
| m32c \
| m32r \
| m32rle \
| m5200 \
| m68000 \
| m680[012346]0 \
| m6811 \
| m6812 \
| m68360 \
| m683?2 \
| m68hc11 \
| m68hc12 \
| m68hcs12x \
| m68k \
| m88110 \
| m88k \
| maxq \
| mb \
| mcore \
| mep \
| metag \
| microblaze \
| microblazeel \
| mips* \
| mmix \
| mn10200 | mn10300 \
| mn10200 \
| mn10300 \
| moxie \
| mt \
| msp430 \
| mt \
| nanomips* \
| nds32 | nds32le | nds32be \
| nds32 \
| nds32be \
| nds32le \
| nfp \
| nios | nios2 | nios2eb | nios2el \
| none | np1 | ns16k | ns32k | nvptx \
| nios \
| nios2 \
| nios2eb \
| nios2el \
| none \
| np1 \
| ns16k \
| ns32k \
| nvptx \
| open8 \
| or1k* \
| or32 \
| orion \
| pdp10 \
| pdp11 \
| picochip \
| pdp10 | pdp11 | pj | pjl | pn | power \
| powerpc | powerpc64 | powerpc64le | powerpcle | powerpcspe \
| pj \
| pjl \
| pn \
| power \
| powerpc \
| powerpc64 \
| powerpc64le \
| powerpcle \
| powerpcspe \
| pru \
| pyramid \
| riscv | riscv32 | riscv32be | riscv64 | riscv64be \
| rl78 | romp | rs6000 | rx \
| s390 | s390x \
| riscv \
| riscv32 \
| riscv32be \
| riscv64 \
| riscv64be \
| rl78 \
| romp \
| rs6000 \
| rx \
| s390 \
| s390x \
| score \
| sh | shl \
| sh[1234] | sh[24]a | sh[24]ae[lb] | sh[23]e | she[lb] | sh[lb]e \
| sh[1234]e[lb] | sh[12345][lb]e | sh[23]ele | sh64 | sh64le \
| sparc | sparc64 | sparc64b | sparc64v | sparc86x | sparclet \
| sh \
| sh64 \
| sh64le \
| sh[12345][lb]e \
| sh[1234] \
| sh[1234]e[lb] \
| sh[23]e \
| sh[23]ele \
| sh[24]a \
| sh[24]ae[lb] \
| sh[lb]e \
| she[lb] \
| shl \
| sparc \
| sparc64 \
| sparc64b \
| sparc64v \
| sparc86x \
| sparclet \
| sparclite \
| sparcv8 | sparcv9 | sparcv9b | sparcv9v | sv1 | sx* \
| sparcv8 \
| sparcv9 \
| sparcv9b \
| sparcv9v \
| spu \
| sv1 \
| sx* \
| tahoe \
| thumbv7* \
| tic30 | tic4x | tic54x | tic55x | tic6x | tic80 \
| tic30 \
| tic4x \
| tic54x \
| tic55x \
| tic6x \
| tic80 \
| tron \
| ubicom32 \
| v70 | v850 | v850e | v850e1 | v850es | v850e2 | v850e2v3 \
| v70 \
| v810 \
| v850 \
| v850e \
| v850e1 \
| v850e2 \
| v850e2v3 \
| v850es \
| vax \
| vc4 \
| visium \
| w65 \
| wasm32 | wasm64 \
| wasm32 \
| wasm64 \
| we32k \
| x86 | x86_64 | xc16x | xgate | xps100 \
| xstormy16 | xtensa* \
| x86 \
| x86_64 \
| xc16x \
| xgate \
| xps100 \
| xstormy16 \
| xtensa* \
| ymp \
| z8k | z80)
| z80 \
| z8k)
;;
*)
@ -1307,7 +1511,6 @@ case $basic_os in
os=`echo "$basic_os" | sed -e 's|nto-qnx|qnx|'`
;;
*-*)
# shellcheck disable=SC2162
saved_IFS=$IFS
IFS="-" read kernel os <<EOF
$basic_os
@ -1354,6 +1557,23 @@ case $os in
unixware*)
os=sysv4.2uw
;;
# The marketing names for NeXT's operating systems were
# NeXTSTEP, NeXTSTEP 2, OpenSTEP 3, OpenSTEP 4. 'openstep' is
# mapped to 'openstep3', but 'openstep1' and 'openstep2' are
# mapped to 'nextstep' and 'nextstep2', consistent with the
# treatment of SunOS/Solaris.
ns | ns1 | nextstep | nextstep1 | openstep1)
os=nextstep
;;
ns2 | nextstep2 | openstep2)
os=nextstep2
;;
ns3 | nextstep3 | openstep | openstep3)
os=openstep3
;;
ns4 | nextstep4 | openstep4)
os=openstep4
;;
# es1800 is here to avoid being matched by es* (a different OS)
es1800*)
os=ose
@ -1424,6 +1644,7 @@ case $os in
;;
utek*)
os=bsd
vendor=`echo "$vendor" | sed -e 's|^unknown$|tektronix|'`
;;
dynix*)
os=bsd
@ -1440,21 +1661,25 @@ case $os in
386bsd)
os=bsd
;;
ctix* | uts*)
ctix*)
os=sysv
vendor=`echo "$vendor" | sed -e 's|^unknown$|convergent|'`
;;
uts*)
os=sysv
;;
nova*)
os=rtmk-nova
;;
ns2)
os=nextstep2
kernel=rtmk
os=nova
;;
# Preserve the version number of sinix5.
sinix5.*)
os=`echo "$os" | sed -e 's|sinix|sysv|'`
vendor=`echo "$vendor" | sed -e 's|^unknown$|sni|'`
;;
sinix*)
os=sysv4
vendor=`echo "$vendor" | sed -e 's|^unknown$|sni|'`
;;
tpf*)
os=tpf
@ -1595,6 +1820,14 @@ case $cpu-$vendor in
os=
obj=elf
;;
# The -sgi and -siemens entries must be before the mips- entry
# or we get the wrong os.
*-sgi)
os=irix
;;
*-siemens)
os=sysv4
;;
mips*-cisco)
os=
obj=elf
@ -1607,7 +1840,8 @@ case $cpu-$vendor in
os=
obj=coff
;;
*-tti) # must be before sparc entry or we get the wrong os.
# This must be before the sparc-* entry or we get the wrong os.
*-tti)
os=sysv3
;;
sparc-* | *-sun)
@ -1639,7 +1873,7 @@ case $cpu-$vendor in
os=hpux
;;
*-hitachi)
os=hiux
os=hiuxwe2
;;
i860-* | *-att | *-ncr | *-altos | *-motorola | *-convergent)
os=sysv
@ -1683,12 +1917,6 @@ case $cpu-$vendor in
*-encore)
os=bsd
;;
*-sgi)
os=irix
;;
*-siemens)
os=sysv4
;;
*-masscomp)
os=rtu
;;
@ -1735,40 +1963,193 @@ case $os in
ghcjs)
;;
# Now accept the basic system types.
# The portable systems comes first.
# Each alternative MUST end in a * to match a version number.
gnu* | android* | bsd* | mach* | minix* | genix* | ultrix* | irix* \
| *vms* | esix* | aix* | cnk* | sunos | sunos[34]* \
| hpux* | unos* | osf* | luna* | dgux* | auroraux* | solaris* \
| sym* | plan9* | psp* | sim* | xray* | os68k* | v88r* \
| hiux* | abug | nacl* | netware* | windows* \
| os9* | macos* | osx* | ios* | tvos* | watchos* \
| mpw* | magic* | mmixware* | mon960* | lnews* \
| amigaos* | amigados* | msdos* | newsos* | unicos* | aof* \
| aos* | aros* | cloudabi* | sortix* | twizzler* \
| nindy* | vxsim* | vxworks* | ebmon* | hms* | mvs* \
| clix* | riscos* | uniplus* | iris* | isc* | rtu* | xenix* \
| mirbsd* | netbsd* | dicos* | openedition* | ose* \
| bitrig* | openbsd* | secbsd* | solidbsd* | libertybsd* | os108* \
| ekkobsd* | freebsd* | riscix* | lynxos* | os400* \
| bosx* | nextstep* | cxux* | oabi* \
| ptx* | ecoff* | winnt* | domain* | vsta* \
| udi* | lites* | ieee* | go32* | aux* | hcos* \
| chorusrdb* | cegcc* | glidix* | serenity* \
| cygwin* | msys* | moss* | proelf* | rtems* \
| midipix* | mingw32* | mingw64* | mint* \
| uxpv* | beos* | mpeix* | udk* | moxiebox* \
| interix* | uwin* | mks* | rhapsody* | darwin* \
| openstep* | oskit* | conix* | pw32* | nonstopux* \
| storm-chaos* | tops10* | tenex* | tops20* | its* \
| os2* | vos* | palmos* | uclinux* | nucleus* | morphos* \
| scout* | superux* | sysv* | rtmk* | tpf* | windiss* \
| powermax* | dnix* | nx6 | nx7 | sei* | dragonfly* \
| skyos* | haiku* | rdos* | toppers* | drops* | es* \
| onefs* | tirtos* | phoenix* | fuchsia* | redox* | bme* \
| midnightbsd* | amdhsa* | unleashed* | emscripten* | wasi* \
| nsk* | powerunix* | genode* | zvmoe* | qnx* | emx* | zephyr* \
| fiwix* | mlibc* | cos* | mbr* | ironclad* )
abug \
| aix* \
| amdhsa* \
| amigados* \
| amigaos* \
| android* \
| aof* \
| aos* \
| aros* \
| atheos* \
| auroraux* \
| aux* \
| beos* \
| bitrig* \
| bme* \
| bosx* \
| bsd* \
| cegcc* \
| chorusos* \
| chorusrdb* \
| clix* \
| cloudabi* \
| cnk* \
| conix* \
| cos* \
| cxux* \
| cygwin* \
| darwin* \
| dgux* \
| dicos* \
| dnix* \
| domain* \
| dragonfly* \
| drops* \
| ebmon* \
| ecoff* \
| ekkobsd* \
| emscripten* \
| emx* \
| es* \
| fiwix* \
| freebsd* \
| fuchsia* \
| genix* \
| genode* \
| glidix* \
| gnu* \
| go32* \
| haiku* \
| hcos* \
| hiux* \
| hms* \
| hpux* \
| ieee* \
| interix* \
| ios* \
| iris* \
| irix* \
| ironclad* \
| isc* \
| its* \
| l4re* \
| libertybsd* \
| lites* \
| lnews* \
| luna* \
| lynxos* \
| mach* \
| macos* \
| magic* \
| mbr* \
| midipix* \
| midnightbsd* \
| mingw32* \
| mingw64* \
| minix* \
| mint* \
| mirbsd* \
| mks* \
| mlibc* \
| mmixware* \
| mon960* \
| morphos* \
| moss* \
| moxiebox* \
| mpeix* \
| mpw* \
| msdos* \
| msys* \
| mvs* \
| nacl* \
| netbsd* \
| netware* \
| newsos* \
| nextstep* \
| nindy* \
| nonstopux* \
| nova* \
| nsk* \
| nucleus* \
| nx6 \
| nx7 \
| oabi* \
| ohos* \
| onefs* \
| openbsd* \
| openedition* \
| openstep* \
| os108* \
| os2* \
| os400* \
| os68k* \
| os9* \
| ose* \
| osf* \
| oskit* \
| osx* \
| palmos* \
| phoenix* \
| plan9* \
| powermax* \
| powerunix* \
| proelf* \
| psos* \
| psp* \
| ptx* \
| pw32* \
| qnx* \
| rdos* \
| redox* \
| rhapsody* \
| riscix* \
| riscos* \
| rtems* \
| rtmk* \
| rtu* \
| scout* \
| secbsd* \
| sei* \
| serenity* \
| sim* \
| skyos* \
| solaris* \
| solidbsd* \
| sortix* \
| storm-chaos* \
| sunos \
| sunos[34]* \
| superux* \
| syllable* \
| sym* \
| sysv* \
| tenex* \
| tirtos* \
| toppers* \
| tops10* \
| tops20* \
| tpf* \
| tvos* \
| twizzler* \
| uclinux* \
| udi* \
| udk* \
| ultrix* \
| unicos* \
| uniplus* \
| unleashed* \
| unos* \
| uwin* \
| uxpv* \
| v88r* \
|*vms* \
| vos* \
| vsta* \
| vxsim* \
| vxworks* \
| wasi* \
| watchos* \
| wince* \
| windiss* \
| windows* \
| winnt* \
| xenix* \
| xray* \
| zephyr* \
| zvmoe* )
;;
# This one is extra strict with allowed versions
sco3.2v2 | sco3.2v[4-9]* | sco5v6*)
@ -1829,9 +2210,9 @@ esac
case $kernel-$os-$obj in
linux-gnu*- | linux-android*- | linux-dietlibc*- | linux-llvm*- \
| linux-mlibc*- | linux-musl*- | linux-newlib*- \
| linux-relibc*- | linux-uclibc*- )
| linux-relibc*- | linux-uclibc*- | linux-ohos*- )
;;
uclinux-uclibc*- )
uclinux-uclibc*- | uclinux-gnu*- )
;;
managarm-mlibc*- | managarm-kernel*- )
;;
@ -1856,7 +2237,7 @@ case $kernel-$os-$obj in
echo "Invalid configuration '$1': '$os' needs 'windows'." 1>&2
exit 1
;;
kfreebsd*-gnu*- | kopensolaris*-gnu*-)
kfreebsd*-gnu*- | knetbsd*-gnu*- | netbsd*-gnu*- | kopensolaris*-gnu*-)
;;
vxworks-simlinux- | vxworks-simwindows- | vxworks-spe-)
;;
@ -1864,6 +2245,8 @@ case $kernel-$os-$obj in
;;
os2-emx-)
;;
rtmk-nova-)
;;
*-eabi*- | *-gnueabi*-)
;;
none--*)
@ -1890,7 +2273,7 @@ case $vendor in
*-riscix*)
vendor=acorn
;;
*-sunos*)
*-sunos* | *-solaris*)
vendor=sun
;;
*-cnk* | *-aix*)

View File

@ -274,3 +274,83 @@ AC_DEFUN([PGAC_CHECK_STRIP],
AC_SUBST(STRIP_STATIC_LIB)
AC_SUBST(STRIP_SHARED_LIB)
])# PGAC_CHECK_STRIP
# PGAC_CHECK_LIBCURL
# ------------------
# Check for required libraries and headers, and test to see whether the current
# installation of libcurl is thread-safe.
AC_DEFUN([PGAC_CHECK_LIBCURL],
[
AC_CHECK_HEADER(curl/curl.h, [],
[AC_MSG_ERROR([header file <curl/curl.h> is required for --with-libcurl])])
AC_CHECK_LIB(curl, curl_multi_init, [
AC_DEFINE([HAVE_LIBCURL], [1], [Define to 1 if you have the `curl' library (-lcurl).])
AC_SUBST(LIBCURL_LDLIBS, -lcurl)
],
[AC_MSG_ERROR([library 'curl' does not provide curl_multi_init])])
pgac_save_CPPFLAGS=$CPPFLAGS
pgac_save_LDFLAGS=$LDFLAGS
pgac_save_LIBS=$LIBS
CPPFLAGS="$LIBCURL_CPPFLAGS $CPPFLAGS"
LDFLAGS="$LIBCURL_LDFLAGS $LDFLAGS"
LIBS="$LIBCURL_LDLIBS $LIBS"
# Check to see whether the current platform supports threadsafe Curl
# initialization.
AC_CACHE_CHECK([for curl_global_init thread safety], [pgac_cv__libcurl_threadsafe_init],
[AC_RUN_IFELSE([AC_LANG_PROGRAM([
#include <curl/curl.h>
],[
curl_version_info_data *info;
if (curl_global_init(CURL_GLOBAL_ALL))
return -1;
info = curl_version_info(CURLVERSION_NOW);
#ifdef CURL_VERSION_THREADSAFE
if (info->features & CURL_VERSION_THREADSAFE)
return 0;
#endif
return 1;
])],
[pgac_cv__libcurl_threadsafe_init=yes],
[pgac_cv__libcurl_threadsafe_init=no],
[pgac_cv__libcurl_threadsafe_init=unknown])])
if test x"$pgac_cv__libcurl_threadsafe_init" = xyes ; then
AC_DEFINE(HAVE_THREADSAFE_CURL_GLOBAL_INIT, 1,
[Define to 1 if curl_global_init() is guaranteed to be thread-safe.])
fi
# Fail if a thread-friendly DNS resolver isn't built.
AC_CACHE_CHECK([for curl support for asynchronous DNS], [pgac_cv__libcurl_async_dns],
[AC_RUN_IFELSE([AC_LANG_PROGRAM([
#include <curl/curl.h>
],[
curl_version_info_data *info;
if (curl_global_init(CURL_GLOBAL_ALL))
return -1;
info = curl_version_info(CURLVERSION_NOW);
return (info->features & CURL_VERSION_ASYNCHDNS) ? 0 : 1;
])],
[pgac_cv__libcurl_async_dns=yes],
[pgac_cv__libcurl_async_dns=no],
[pgac_cv__libcurl_async_dns=unknown])])
if test x"$pgac_cv__libcurl_async_dns" = xno ; then
AC_MSG_ERROR([
*** The installed version of libcurl does not support asynchronous DNS
*** lookups. Rebuild libcurl with the AsynchDNS feature enabled in order
*** to use it with libpq.])
fi
CPPFLAGS=$pgac_save_CPPFLAGS
LDFLAGS=$pgac_save_LDFLAGS
LIBS=$pgac_save_LIBS
])# PGAC_CHECK_LIBCURL

940
configure vendored

File diff suppressed because it is too large.

View File

@ -17,7 +17,7 @@ dnl Read the Autoconf manual for details.
dnl
m4_pattern_forbid(^PGAC_)dnl to catch undefined macros
AC_INIT([PostgreSQL], [18devel], [pgsql-bugs@lists.postgresql.org], [], [https://www.postgresql.org/])
AC_INIT([PostgreSQL], [18beta1], [pgsql-bugs@lists.postgresql.org], [], [https://www.postgresql.org/])
m4_if(m4_defn([m4_PACKAGE_VERSION]), [2.69], [], [m4_fatal([Autoconf version 2.69 is required.
Untested combinations of 'autoconf' and PostgreSQL versions are not
@ -975,6 +975,18 @@ AC_SUBST(with_readline)
PGAC_ARG_BOOL(with, libedit-preferred, no,
[prefer BSD Libedit over GNU Readline])
#
# liburing
#
AC_MSG_CHECKING([whether to build with liburing support])
PGAC_ARG_BOOL(with, liburing, no, [build with io_uring support, for asynchronous I/O],
[AC_DEFINE([USE_LIBURING], 1, [Define to build with io_uring support. (--with-liburing)])])
AC_MSG_RESULT([$with_liburing])
AC_SUBST(with_liburing)
if test "$with_liburing" = yes; then
PKG_CHECK_MODULES(LIBURING, liburing)
fi
#
# UUID library
@ -1007,6 +1019,62 @@ fi
AC_SUBST(with_uuid)
#
# libcurl
#
AC_MSG_CHECKING([whether to build with libcurl support])
PGAC_ARG_BOOL(with, libcurl, no, [build with libcurl support],
[AC_DEFINE([USE_LIBCURL], 1, [Define to 1 to build with libcurl support. (--with-libcurl)])])
AC_MSG_RESULT([$with_libcurl])
AC_SUBST(with_libcurl)
if test "$with_libcurl" = yes ; then
# Check for libcurl 7.61.0 or higher (corresponding to RHEL8 and the ability
# to explicitly set TLS 1.3 ciphersuites).
PKG_CHECK_MODULES(LIBCURL, [libcurl >= 7.61.0])
# Curl's flags are kept separate from the standard CPPFLAGS/LDFLAGS. We use
# them only for libpq-oauth.
LIBCURL_CPPFLAGS=
LIBCURL_LDFLAGS=
# We only care about -I, -D, and -L switches. Note that -lcurl will be added
# to LIBCURL_LDLIBS by PGAC_CHECK_LIBCURL, below.
for pgac_option in $LIBCURL_CFLAGS; do
case $pgac_option in
-I*|-D*) LIBCURL_CPPFLAGS="$LIBCURL_CPPFLAGS $pgac_option";;
esac
done
for pgac_option in $LIBCURL_LIBS; do
case $pgac_option in
-L*) LIBCURL_LDFLAGS="$LIBCURL_LDFLAGS $pgac_option";;
esac
done
AC_SUBST(LIBCURL_CPPFLAGS)
AC_SUBST(LIBCURL_LDFLAGS)
# OAuth requires python for testing
if test "$with_python" != yes; then
AC_MSG_WARN([*** OAuth support tests require --with-python to run])
fi
fi
#
# libnuma
#
AC_MSG_CHECKING([whether to build with libnuma support])
PGAC_ARG_BOOL(with, libnuma, no, [build with libnuma support],
[AC_DEFINE([USE_LIBNUMA], 1, [Define to build with NUMA support. (--with-libnuma)])])
AC_MSG_RESULT([$with_libnuma])
AC_SUBST(with_libnuma)
if test "$with_libnuma" = yes ; then
AC_CHECK_LIB(numa, numa_available, [], [AC_MSG_ERROR([library 'libnuma' is required for NUMA support])])
PKG_CHECK_MODULES(LIBNUMA, numa)
fi
#
# XML
#
@ -1294,6 +1362,10 @@ failure. It is possible the compiler isn't looking in the proper directory.
Use --without-zlib to disable zlib support.])])
fi
if test "$with_libcurl" = yes ; then
PGAC_CHECK_LIBCURL
fi
if test "$with_gssapi" = yes ; then
if test "$PORTNAME" != "win32"; then
AC_SEARCH_LIBS(gss_store_cred_into, [gssapi_krb5 gss 'gssapi -lkrb5 -lcrypto'], [],
@ -1329,7 +1401,7 @@ if test "$with_ssl" = openssl ; then
# Function introduced in OpenSSL 1.0.2, not in LibreSSL.
AC_CHECK_FUNCS([SSL_CTX_set_cert_cb])
# Function introduced in OpenSSL 1.1.1, not in LibreSSL.
AC_CHECK_FUNCS([X509_get_signature_info SSL_CTX_set_num_tickets])
AC_CHECK_FUNCS([X509_get_signature_info SSL_CTX_set_num_tickets SSL_CTX_set_keylog_callback])
AC_DEFINE([USE_OPENSSL], 1, [Define to 1 to build with OpenSSL support. (--with-ssl=openssl)])
elif test "$with_ssl" != no ; then
AC_MSG_ERROR([--with-ssl must specify openssl])
@ -1587,6 +1659,13 @@ if test "$PORTNAME" = "win32" ; then
AC_CHECK_HEADERS(crtdefs.h)
fi
if test "$with_libcurl" = yes ; then
# Error out early if this platform can't support libpq-oauth.
if test "$ac_cv_header_sys_event_h" != yes -a "$ac_cv_header_sys_epoll_h" != yes; then
AC_MSG_ERROR([client-side OAuth is not supported on this platform])
fi
fi
##
## Types, structures, compiler characteristics
##
@ -1711,14 +1790,13 @@ AC_CHECK_FUNCS(m4_normalize([
getpeerucred
inet_pton
kqueue
localeconv_l
mbstowcs_l
memset_s
posix_fallocate
ppoll
pthread_is_threaded_np
setproctitle
setproctitle_fast
strchrnul
strsignal
syncfs
sync_file_range
@@ -1752,12 +1830,15 @@ AC_CHECK_DECLS(posix_fadvise, [], [], [#include <fcntl.h>])
]) # fi
AC_CHECK_DECLS(fdatasync, [], [], [#include <unistd.h>])
AC_CHECK_DECLS([strlcat, strlcpy, strnlen, strsep])
AC_CHECK_DECLS([strlcat, strlcpy, strnlen, strsep, timingsafe_bcmp])
# We can't use AC_CHECK_FUNCS to detect these functions, because it
# won't handle deployment target restrictions on macOS
AC_CHECK_DECLS([preadv], [], [], [#include <sys/uio.h>])
AC_CHECK_DECLS([pwritev], [], [], [#include <sys/uio.h>])
AC_CHECK_DECLS([strchrnul], [], [], [#include <string.h>])
AC_CHECK_DECLS([memset_s], [], [], [#define __STDC_WANT_LIB_EXT1__ 1
#include <string.h>])
# This is probably only present on macOS, but may as well check always
AC_CHECK_DECLS(F_FULLFSYNC, [], [], [#include <fcntl.h>])
@@ -1772,6 +1853,7 @@ AC_REPLACE_FUNCS(m4_normalize([
strlcpy
strnlen
strsep
timingsafe_bcmp
]))
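For context: AC_REPLACE_FUNCS compiles a replacement object when the C library lacks a listed function. Below is a minimal sketch of what a timingsafe_bcmp replacement typically looks like (illustrative only, not the tree's actual replacement file): it compares in constant time by accumulating differences rather than returning at the first mismatch.

#include <stddef.h>

int
timingsafe_bcmp(const void *b1, const void *b2, size_t n)
{
	const unsigned char *p1 = b1;
	const unsigned char *p2 = b2;
	int			ret = 0;

	/* Accumulate differences; never branch on the data being compared. */
	for (size_t i = 0; i < n; i++)
		ret |= p1[i] ^ p2[i];
	return ret != 0;
}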
AC_REPLACE_FUNCS(pthread_barrier_wait)
@@ -2016,6 +2098,15 @@ if test x"$host_cpu" = x"x86_64"; then
fi
fi
# Check for SVE popcount intrinsics
#
if test x"$host_cpu" = x"aarch64"; then
PGAC_SVE_POPCNT_INTRINSICS()
if test x"$pgac_sve_popcnt_intrinsics" = x"yes"; then
AC_DEFINE(USE_SVE_POPCNT_WITH_RUNTIME_CHECK, 1, [Define to 1 to use SVE popcount instructions with a runtime check.])
fi
fi
# Check for Intel SSE 4.2 intrinsics to do CRC calculations.
#
PGAC_SSE42_CRC32_INTRINSICS()
@@ -2052,17 +2143,21 @@ AC_SUBST(CFLAGS_CRC)
# Select CRC-32C implementation.
#
# If we are targeting a processor that has Intel SSE 4.2 instructions, we can
# use the special CRC instructions for calculating CRC-32C. If we're not
# targeting such a processor, but we can nevertheless produce code that uses
# the SSE intrinsics, compile both implementations and select which one to use
# at runtime, depending on whether SSE 4.2 is supported by the processor we're
# running on.
# There are three methods of calculating CRC, in order of increasing
# performance:
#
# Similarly, if we are targeting an ARM processor that has the CRC
# instructions that are part of the ARMv8 CRC Extension, use them. And if
# we're not targeting such a processor, but can nevertheless produce code that
# uses the CRC instructions, compile both, and select at runtime.
# 1. The fallback using a lookup table, called slicing-by-8
# 2. CRC-32C instructions (found in e.g. Intel SSE 4.2 and ARMv8 CRC Extension)
# 3. Algorithms using carryless multiplication instructions
# (e.g. Intel PCLMUL and Arm PMULL)
#
# If we can produce code (via function attributes or additional compiler
# flags) that uses #2 (and possibly #3), we compile all implementations
# and select which one to use at runtime, depending on what is supported
# by the processor we're running on.
#
# If we are targeting a processor that has #2, we can use that without
# runtime selection.
#
# Note that we do not use __attribute__((target("..."))) for the ARM CRC
# instructions because until clang 16, using the ARM intrinsics still requires
@@ -2110,7 +2205,7 @@ fi
AC_MSG_CHECKING([which CRC-32C implementation to use])
if test x"$USE_SSE42_CRC32C" = x"1"; then
AC_DEFINE(USE_SSE42_CRC32C, 1, [Define to 1 to use Intel SSE 4.2 CRC instructions.])
PG_CRC32C_OBJS="pg_crc32c_sse42.o"
PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sse42_choose.o"
AC_MSG_RESULT(SSE 4.2)
else
if test x"$USE_SSE42_CRC32C_WITH_RUNTIME_CHECK" = x"1"; then
@@ -2143,6 +2238,19 @@ else
fi
AC_SUBST(PG_CRC32C_OBJS)
# Check for carryless multiplication intrinsics to do vectorized CRC calculations.
#
if test x"$host_cpu" = x"x86_64"; then
PGAC_AVX512_PCLMUL_INTRINSICS()
fi
AC_MSG_CHECKING([for vectorized CRC-32C])
if test x"$pgac_avx512_pclmul_intrinsics" = x"yes"; then
AC_DEFINE(USE_AVX512_CRC32C_WITH_RUNTIME_CHECK, 1, [Define to 1 to use AVX-512 CRC algorithms with a runtime check.])
AC_MSG_RESULT(AVX-512 with runtime check)
else
AC_MSG_RESULT(none)
fi
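The runtime-selection scheme described above generally boils down to a function pointer that resolves itself on first call. A minimal C sketch of that pattern follows; the names and stub bodies are illustrative, not the tree's actual pg_crc32c code.

#include <stddef.h>
#include <stdint.h>

typedef uint32_t (*crc32c_fn) (uint32_t crc, const void *data, size_t len);

static uint32_t
crc32c_sb8(uint32_t crc, const void *data, size_t len)
{
	(void) data;				/* stand-in for the slicing-by-8 fallback */
	(void) len;
	return crc;
}

static uint32_t
crc32c_hw(uint32_t crc, const void *data, size_t len)
{
	(void) data;				/* stand-in for a hardware implementation */
	(void) len;
	return crc;
}

static int
hw_crc_available(void)
{
	return 0;					/* real code would query CPUID or getauxval() */
}

static uint32_t crc32c_choose(uint32_t crc, const void *data, size_t len);

/* Starts out pointing at the chooser; the first call resolves it. */
static crc32c_fn comp_crc32c = crc32c_choose;

static uint32_t
crc32c_choose(uint32_t crc, const void *data, size_t len)
{
	comp_crc32c = hw_crc_available() ? crc32c_hw : crc32c_sb8;
	return comp_crc32c(crc, data, len);
}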
# Select semaphore implementation type.
if test "$PORTNAME" != "win32"; then


@@ -33,6 +33,7 @@ SUBDIRS = \
pg_buffercache \
pg_freespacemap \
pg_logicalinspect \
pg_overexplain \
pg_prewarm \
pg_stat_statements \
pg_surgery \


@@ -3,14 +3,17 @@
MODULE_big = amcheck
OBJS = \
$(WIN32RES) \
verify_common.o \
verify_gin.o \
verify_heapam.o \
verify_nbtree.o
EXTENSION = amcheck
DATA = amcheck--1.3--1.4.sql amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql \
amcheck--1.3--1.4.sql amcheck--1.4--1.5.sql
PGFILEDESC = "amcheck - function for verifying relation integrity"
REGRESS = check check_btree check_heap
REGRESS = check check_btree check_gin check_heap
EXTRA_INSTALL = contrib/pg_walinspect
TAP_TESTS = 1


@@ -0,0 +1,14 @@
/* contrib/amcheck/amcheck--1.4--1.5.sql */
-- complain if script is sourced in psql, rather than via CREATE EXTENSION
\echo Use "ALTER EXTENSION amcheck UPDATE TO '1.5'" to load this file. \quit
-- gin_index_check()
--
CREATE FUNCTION gin_index_check(index regclass)
RETURNS VOID
AS 'MODULE_PATHNAME', 'gin_index_check'
LANGUAGE C STRICT;
REVOKE ALL ON FUNCTION gin_index_check(regclass) FROM PUBLIC;


@@ -1,5 +1,5 @@
# amcheck extension
comment = 'functions for verifying relation integrity'
default_version = '1.4'
default_version = '1.5'
module_pathname = '$libdir/amcheck'
relocatable = true


@@ -57,8 +57,8 @@ ERROR: could not open relation with OID 17
BEGIN;
CREATE INDEX bttest_a_brin_idx ON bttest_a USING brin(id);
SELECT bt_index_parent_check('bttest_a_brin_idx');
ERROR: only B-Tree indexes are supported as targets for verification
DETAIL: Relation "bttest_a_brin_idx" is not a B-Tree index.
ERROR: expected "btree" index as target for verification
DETAIL: Relation "bttest_a_brin_idx" is a brin index.
ROLLBACK;
-- normal check outside of xact
SELECT bt_index_check('bttest_a_idx');


@@ -0,0 +1,78 @@
-- Test of index bulk load
SELECT setseed(1);
setseed
---------
(1 row)
CREATE TABLE "gin_check"("Column1" int[]);
-- posting trees (frequently used entries)
INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
-- posting leaves (sparse entries)
INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
SELECT gin_index_check('gin_check_idx');
gin_index_check
-----------------
(1 row)
-- cleanup
DROP TABLE gin_check;
-- Test index inserts
SELECT setseed(1);
setseed
---------
(1 row)
CREATE TABLE "gin_check"("Column1" int[]);
CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
ALTER INDEX gin_check_idx SET (fastupdate = false);
-- posting trees
INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
-- posting leaves
INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
SELECT gin_index_check('gin_check_idx');
gin_index_check
-----------------
(1 row)
-- cleanup
DROP TABLE gin_check;
-- Test GIN over text array
SELECT setseed(1);
setseed
---------
(1 row)
CREATE TABLE "gin_check_text_array"("Column1" text[]);
-- posting trees
INSERT INTO gin_check_text_array select array_agg(sha256(round(random()*300)::text::bytea)::text) from generate_series(1, 100000) as i group by i % 10000;
-- posting leaves
INSERT INTO gin_check_text_array select array_agg(sha256(round(random()*300 + 300)::text::bytea)::text) from generate_series(1, 10000) as i group by i % 100;
CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
SELECT gin_index_check('gin_check_text_array_idx');
gin_index_check
-----------------
(1 row)
-- cleanup
DROP TABLE gin_check_text_array;
-- Test GIN over jsonb
CREATE TABLE "gin_check_jsonb"("j" jsonb);
INSERT INTO gin_check_jsonb values ('{"a":[["b",{"x":1}],["b",{"x":2}]],"c":3}');
INSERT INTO gin_check_jsonb values ('[[14,2,3]]');
INSERT INTO gin_check_jsonb values ('[1,[14,2,3]]');
CREATE INDEX "gin_check_jsonb_idx" on gin_check_jsonb USING GIN("j" jsonb_path_ops);
SELECT gin_index_check('gin_check_jsonb_idx');
gin_index_check
-----------------
(1 row)
-- cleanup
DROP TABLE gin_check_jsonb;


@@ -1,6 +1,8 @@
# Copyright (c) 2022-2025, PostgreSQL Global Development Group
amcheck_sources = files(
'verify_common.c',
'verify_gin.c',
'verify_heapam.c',
'verify_nbtree.c',
)
@@ -24,6 +26,7 @@ install_data(
'amcheck--1.1--1.2.sql',
'amcheck--1.2--1.3.sql',
'amcheck--1.3--1.4.sql',
'amcheck--1.4--1.5.sql',
kwargs: contrib_data_args,
)
@@ -35,6 +38,7 @@ tests += {
'sql': [
'check',
'check_btree',
'check_gin',
'check_heap',
],
},


@@ -0,0 +1,52 @@
-- Test of index bulk load
SELECT setseed(1);
CREATE TABLE "gin_check"("Column1" int[]);
-- posting trees (frequently used entries)
INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
-- posting leaves (sparse entries)
INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
SELECT gin_index_check('gin_check_idx');
-- cleanup
DROP TABLE gin_check;
-- Test index inserts
SELECT setseed(1);
CREATE TABLE "gin_check"("Column1" int[]);
CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
ALTER INDEX gin_check_idx SET (fastupdate = false);
-- posting trees
INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
-- posting leaves
INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
SELECT gin_index_check('gin_check_idx');
-- cleanup
DROP TABLE gin_check;
-- Test GIN over text array
SELECT setseed(1);
CREATE TABLE "gin_check_text_array"("Column1" text[]);
-- posting trees
INSERT INTO gin_check_text_array select array_agg(sha256(round(random()*300)::text::bytea)::text) from generate_series(1, 100000) as i group by i % 10000;
-- posting leaves
INSERT INTO gin_check_text_array select array_agg(sha256(round(random()*300 + 300)::text::bytea)::text) from generate_series(1, 10000) as i group by i % 100;
CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
SELECT gin_index_check('gin_check_text_array_idx');
-- cleanup
DROP TABLE gin_check_text_array;
-- Test GIN over jsonb
CREATE TABLE "gin_check_jsonb"("j" jsonb);
INSERT INTO gin_check_jsonb values ('{"a":[["b",{"x":1}],["b",{"x":2}]],"c":3}');
INSERT INTO gin_check_jsonb values ('[[14,2,3]]');
INSERT INTO gin_check_jsonb values ('[1,[14,2,3]]');
CREATE INDEX "gin_check_jsonb_idx" on gin_check_jsonb USING GIN("j" jsonb_path_ops);
SELECT gin_index_check('gin_check_jsonb_idx');
-- cleanup
DROP TABLE gin_check_jsonb;


@@ -21,8 +21,9 @@ $node->append_conf('postgresql.conf',
'lock_timeout = ' . (1000 * $PostgreSQL::Test::Utils::timeout_default));
$node->start;
$node->safe_psql('postgres', q(CREATE EXTENSION amcheck));
$node->safe_psql('postgres', q(CREATE TABLE tbl(i int)));
$node->safe_psql('postgres', q(CREATE TABLE tbl(i int, j jsonb)));
$node->safe_psql('postgres', q(CREATE INDEX idx ON tbl(i)));
$node->safe_psql('postgres', q(CREATE INDEX ginidx ON tbl USING gin(j)));
#
# Stress CIC with pgbench.
@@ -40,13 +41,13 @@ $node->pgbench(
{
'002_pgbench_concurrent_transaction' => q(
BEGIN;
INSERT INTO tbl VALUES(0);
INSERT INTO tbl VALUES(0, '{"a":[["b",{"x":1}],["b",{"x":2}]],"c":3}');
COMMIT;
),
'002_pgbench_concurrent_transaction_savepoints' => q(
BEGIN;
SAVEPOINT s1;
INSERT INTO tbl VALUES(0);
INSERT INTO tbl VALUES(0, '[[14,2,3]]');
COMMIT;
),
'002_pgbench_concurrent_cic' => q(
@@ -54,7 +55,10 @@ $node->pgbench(
\if :gotlock
DROP INDEX CONCURRENTLY idx;
CREATE INDEX CONCURRENTLY idx ON tbl(i);
DROP INDEX CONCURRENTLY ginidx;
CREATE INDEX CONCURRENTLY ginidx ON tbl USING gin(j);
SELECT bt_index_check('idx',true);
SELECT gin_index_check('ginidx');
SELECT pg_advisory_unlock(42);
\endif
)


@@ -25,7 +25,7 @@ $node->append_conf('postgresql.conf',
'lock_timeout = ' . (1000 * $PostgreSQL::Test::Utils::timeout_default));
$node->start;
$node->safe_psql('postgres', q(CREATE EXTENSION amcheck));
$node->safe_psql('postgres', q(CREATE TABLE tbl(i int)));
$node->safe_psql('postgres', q(CREATE TABLE tbl(i int, j jsonb)));
#
@@ -41,7 +41,7 @@ my $main_h = $node->background_psql('postgres');
$main_h->query_safe(
q(
BEGIN;
INSERT INTO tbl VALUES(0);
INSERT INTO tbl VALUES(0, '[[14,2,3]]');
));
my $cic_h = $node->background_psql('postgres');
@@ -50,6 +50,7 @@ $cic_h->query_until(
qr/start/, q(
\echo start
CREATE INDEX CONCURRENTLY idx ON tbl(i);
CREATE INDEX CONCURRENTLY ginidx ON tbl USING gin(j);
));
$main_h->query_safe(
@@ -60,7 +61,7 @@ PREPARE TRANSACTION 'a';
$main_h->query_safe(
q(
BEGIN;
INSERT INTO tbl VALUES(0);
INSERT INTO tbl VALUES(0, '[[14,2,3]]');
));
$node->safe_psql('postgres', q(COMMIT PREPARED 'a';));
@@ -69,7 +70,7 @@ $main_h->query_safe(
q(
PREPARE TRANSACTION 'b';
BEGIN;
INSERT INTO tbl VALUES(0);
INSERT INTO tbl VALUES(0, '"mary had a little lamb"');
));
$node->safe_psql('postgres', q(COMMIT PREPARED 'b';));
@@ -86,6 +87,9 @@ $cic_h->quit;
$result = $node->psql('postgres', q(SELECT bt_index_check('idx',true)));
is($result, '0', 'bt_index_check after overlapping 2PC');
$result = $node->psql('postgres', q(SELECT gin_index_check('ginidx')));
is($result, '0', 'gin_index_check after overlapping 2PC');
#
# Server restart shall not change whether prepared xact blocks CIC
@@ -94,7 +98,7 @@ is($result, '0', 'bt_index_check after overlapping 2PC');
$node->safe_psql(
'postgres', q(
BEGIN;
INSERT INTO tbl VALUES(0);
INSERT INTO tbl VALUES(0, '{"a":[["b",{"x":1}],["b",{"x":2}]],"c":3}');
PREPARE TRANSACTION 'spans_restart';
BEGIN;
CREATE TABLE unused ();
@@ -108,12 +112,16 @@ $reindex_h->query_until(
\echo start
DROP INDEX CONCURRENTLY idx;
CREATE INDEX CONCURRENTLY idx ON tbl(i);
DROP INDEX CONCURRENTLY ginidx;
CREATE INDEX CONCURRENTLY ginidx ON tbl USING gin(j);
));
$node->safe_psql('postgres', "COMMIT PREPARED 'spans_restart'");
$reindex_h->quit;
$result = $node->psql('postgres', q(SELECT bt_index_check('idx',true)));
is($result, '0', 'bt_index_check after 2PC and restart');
$result = $node->psql('postgres', q(SELECT gin_index_check('ginidx')));
is($result, '0', 'gin_index_check after 2PC and restart');
#
@@ -136,14 +144,14 @@ $node->pgbench(
{
'003_pgbench_concurrent_2pc' => q(
BEGIN;
INSERT INTO tbl VALUES(0);
INSERT INTO tbl VALUES(0,'null');
PREPARE TRANSACTION 'c:client_id';
COMMIT PREPARED 'c:client_id';
),
'003_pgbench_concurrent_2pc_savepoint' => q(
BEGIN;
SAVEPOINT s1;
INSERT INTO tbl VALUES(0);
INSERT INTO tbl VALUES(0,'[false, "jnvaba", -76, 7, {"_": [1]}, 9]');
PREPARE TRANSACTION 'c:client_id';
COMMIT PREPARED 'c:client_id';
),
@@ -163,7 +171,25 @@ $node->pgbench(
SELECT bt_index_check('idx',true);
SELECT pg_advisory_unlock(42);
\endif
),
'005_pgbench_concurrent_cic' => q(
SELECT pg_try_advisory_lock(42)::integer AS gotginlock \gset
\if :gotginlock
DROP INDEX CONCURRENTLY ginidx;
CREATE INDEX CONCURRENTLY ginidx ON tbl USING gin(j);
SELECT gin_index_check('ginidx');
SELECT pg_advisory_unlock(42);
\endif
),
'006_pgbench_concurrent_ric' => q(
SELECT pg_try_advisory_lock(42)::integer AS gotginlock \gset
\if :gotginlock
REINDEX INDEX CONCURRENTLY ginidx;
SELECT gin_index_check('ginidx');
SELECT pg_advisory_unlock(42);
\endif
)
});
$node->stop;


@@ -0,0 +1,191 @@
/*-------------------------------------------------------------------------
*
* verify_common.c
* Utility functions common to all access methods.
*
* Copyright (c) 2016-2025, PostgreSQL Global Development Group
*
* IDENTIFICATION
* contrib/amcheck/verify_common.c
*
*-------------------------------------------------------------------------
*/
#include "postgres.h"
#include "access/genam.h"
#include "access/table.h"
#include "access/tableam.h"
#include "verify_common.h"
#include "catalog/index.h"
#include "catalog/pg_am.h"
#include "commands/tablecmds.h"
#include "utils/guc.h"
#include "utils/syscache.h"
static bool amcheck_index_mainfork_expected(Relation rel);
/*
* Check if index relation should have a file for its main relation fork.
* Verification uses this to skip unlogged indexes when in hot standby mode,
* where there is simply nothing to verify.
*
* NB: Caller should call index_checkable() before calling here.
*/
static bool
amcheck_index_mainfork_expected(Relation rel)
{
if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
!RecoveryInProgress())
return true;
ereport(NOTICE,
(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
RelationGetRelationName(rel))));
return false;
}
/*
* Amcheck main workhorse.
* Given an index relation OID, lock the relation.
* Next, take a number of standard actions:
* 1) make sure the index can be checked,
* 2) switch to the table owner's user context,
* 3) keep track of GUCs modified via index functions,
* 4) execute the callback function to verify integrity.
*/
void
amcheck_lock_relation_and_check(Oid indrelid,
Oid am_id,
IndexDoCheckCallback check,
LOCKMODE lockmode,
void *state)
{
Oid heapid;
Relation indrel;
Relation heaprel;
Oid save_userid;
int save_sec_context;
int save_nestlevel;
/*
* We must lock table before index to avoid deadlocks. However, if the
* passed indrelid isn't an index then IndexGetRelation() will fail.
* Rather than emitting a not-very-helpful error message, postpone
* complaining, expecting that the is-it-an-index test below will fail.
*
* In hot standby mode this will raise an error when lockmode is ShareLock.
*/
heapid = IndexGetRelation(indrelid, true);
if (OidIsValid(heapid))
{
heaprel = table_open(heapid, lockmode);
/*
* Switch to the table owner's userid, so that any index functions are
* run as that user. Also lock down security-restricted operations
* and arrange to make GUC variable changes local to this command.
*/
GetUserIdAndSecContext(&save_userid, &save_sec_context);
SetUserIdAndSecContext(heaprel->rd_rel->relowner,
save_sec_context | SECURITY_RESTRICTED_OPERATION);
save_nestlevel = NewGUCNestLevel();
}
else
{
heaprel = NULL;
/* Set these just to suppress "uninitialized variable" warnings */
save_userid = InvalidOid;
save_sec_context = -1;
save_nestlevel = -1;
}
/*
* Open the target index relations separately (like relation_openrv(), but
* with heap relation locked first to prevent deadlocking). In hot
* standby mode this will raise an error when lockmode is ShareLock.
*
* There is no need for the usual indcheckxmin usability horizon test
* here, even in the heapallindexed case, because index undergoing
* verification only needs to have entries for a new transaction snapshot.
* (If this is a parentcheck verification, there is no question about
* committed or recently dead heap tuples lacking index entries due to
* concurrent activity.)
*/
indrel = index_open(indrelid, lockmode);
/*
* Since we did the IndexGetRelation call above without any lock, it's
* barely possible that a race against an index drop/recreation could have
* netted us the wrong table.
*/
if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_TABLE),
errmsg("could not open parent table of index \"%s\"",
RelationGetRelationName(indrel))));
/* Check that the relation is suitable for checking */
if (index_checkable(indrel, am_id))
check(indrel, heaprel, state, lockmode == ShareLock);
/* Roll back any GUC changes executed by index functions */
AtEOXact_GUC(false, save_nestlevel);
/* Restore userid and security context */
SetUserIdAndSecContext(save_userid, save_sec_context);
/*
* Release locks early. That's ok here because nothing in the called
* routines will trigger shared cache invalidations to be sent, so we can
* relax the usual pattern of only releasing locks after commit.
*/
index_close(indrel, lockmode);
if (heaprel)
table_close(heaprel, lockmode);
}
/*
* Basic checks about the suitability of a relation for checking as an index.
*
* NB: Intentionally not checking permissions, the function is normally not
* callable by non-superusers. If granted, it's useful to be able to check a
* whole cluster.
*/
bool
index_checkable(Relation rel, Oid am_id)
{
if (rel->rd_rel->relkind != RELKIND_INDEX ||
rel->rd_rel->relam != am_id)
{
HeapTuple amtup;
HeapTuple amtuprel;
amtup = SearchSysCache1(AMOID, ObjectIdGetDatum(am_id));
amtuprel = SearchSysCache1(AMOID, ObjectIdGetDatum(rel->rd_rel->relam));
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("expected \"%s\" index as targets for verification", NameStr(((Form_pg_am) GETSTRUCT(amtup))->amname)),
errdetail("Relation \"%s\" is a %s index.",
RelationGetRelationName(rel), NameStr(((Form_pg_am) GETSTRUCT(amtuprel))->amname))));
}
if (RELATION_IS_OTHER_TEMP(rel))
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot access temporary tables of other sessions"),
errdetail("Index \"%s\" is associated with temporary relation.",
RelationGetRelationName(rel))));
if (!rel->rd_index->indisvalid)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot check index \"%s\"",
RelationGetRelationName(rel)),
errdetail("Index is not valid.")));
return amcheck_index_mainfork_expected(rel);
}


@@ -0,0 +1,31 @@
/*-------------------------------------------------------------------------
*
* verify_common.h
* Shared routines for amcheck verifications.
*
* Copyright (c) 2016-2025, PostgreSQL Global Development Group
*
* IDENTIFICATION
* contrib/amcheck/verify_common.h
*
*-------------------------------------------------------------------------
*/
#include "storage/bufpage.h"
#include "storage/lmgr.h"
#include "storage/lockdefs.h"
#include "utils/relcache.h"
#include "miscadmin.h"
/* Typedefs for callback functions for amcheck_lock_relation_and_check */
typedef void (*IndexCheckableCallback) (Relation index);
typedef void (*IndexDoCheckCallback) (Relation rel,
Relation heaprel,
void *state,
bool readonly);
extern void amcheck_lock_relation_and_check(Oid indrelid,
Oid am_id,
IndexDoCheckCallback check,
LOCKMODE lockmode, void *state);
extern bool index_checkable(Relation rel, Oid am_id);
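To illustrate the contract this header defines, here is a hedged sketch of how an AM-specific entry point would plug into the shared machinery; hash_index_check and its callback are invented for illustration and are not part of this patch:

#include "postgres.h"

#include "catalog/pg_am.h"
#include "fmgr.h"
#include "verify_common.h"

PG_FUNCTION_INFO_V1(hash_index_check);

static void
hash_check_callback(Relation rel, Relation heaprel, void *state, bool readonly)
{
	/* AM-specific invariant checks would go here. */
}

Datum
hash_index_check(PG_FUNCTION_ARGS)
{
	Oid			indrelid = PG_GETARG_OID(0);

	/* verify_common handles locking, security context, and GUC nesting. */
	amcheck_lock_relation_and_check(indrelid,
									HASH_AM_OID,
									hash_check_callback,
									AccessShareLock,
									NULL);
	PG_RETURN_VOID();
}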


@@ -0,0 +1,799 @@
/*-------------------------------------------------------------------------
*
* verify_gin.c
* Verifies the integrity of GIN indexes based on invariants.
*
*
* GIN index verification checks a number of invariants:
*
* - consistency: Paths in the GIN graph have to contain consistent keys: tuples
* on parent pages consistently include tuples from child pages.
*
* - graph invariants: Each internal page must have at least one downlink, and
* can reference either only leaf pages or only internal pages.
*
*
* Copyright (c) 2016-2025, PostgreSQL Global Development Group
*
* IDENTIFICATION
* contrib/amcheck/verify_gin.c
*
*-------------------------------------------------------------------------
*/
#include "postgres.h"
#include "access/gin_private.h"
#include "access/nbtree.h"
#include "catalog/pg_am.h"
#include "utils/memutils.h"
#include "utils/rel.h"
#include "verify_common.h"
#include "string.h"
/*
* GinScanItem represents one item of depth-first scan of the index.
*/
typedef struct GinScanItem
{
int depth;
IndexTuple parenttup;
BlockNumber parentblk;
XLogRecPtr parentlsn;
BlockNumber blkno;
struct GinScanItem *next;
} GinScanItem;
/*
* GinPostingTreeScanItem represents one item of a depth-first posting tree scan.
*/
typedef struct GinPostingTreeScanItem
{
int depth;
ItemPointerData parentkey;
BlockNumber parentblk;
BlockNumber blkno;
struct GinPostingTreeScanItem *next;
} GinPostingTreeScanItem;
PG_FUNCTION_INFO_V1(gin_index_check);
static void gin_check_parent_keys_consistency(Relation rel,
Relation heaprel,
void *callback_state, bool readonly);
static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
static IndexTuple gin_refind_parent(Relation rel,
BlockNumber parentblkno,
BlockNumber childblkno,
BufferAccessStrategy strategy);
static ItemId PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
OffsetNumber offset);
/*
* gin_index_check(index regclass)
*
* Verify integrity of GIN index.
*
* Acquires AccessShareLock on heap & index relations.
*/
Datum
gin_index_check(PG_FUNCTION_ARGS)
{
Oid indrelid = PG_GETARG_OID(0);
amcheck_lock_relation_and_check(indrelid,
GIN_AM_OID,
gin_check_parent_keys_consistency,
AccessShareLock,
NULL);
PG_RETURN_VOID();
}
/*
* Read item pointers from leaf entry tuple.
*
* Returns a palloc'd array of ItemPointers. The number of items is returned
* in *nitems.
*/
static ItemPointer
ginReadTupleWithoutState(IndexTuple itup, int *nitems)
{
Pointer ptr = GinGetPosting(itup);
int nipd = GinGetNPosting(itup);
ItemPointer ipd;
int ndecoded;
if (GinItupIsCompressed(itup))
{
if (nipd > 0)
{
ipd = ginPostingListDecode((GinPostingList *) ptr, &ndecoded);
if (nipd != ndecoded)
elog(ERROR, "number of items mismatch in GIN entry tuple, %d in tuple header, %d decoded",
nipd, ndecoded);
}
else
ipd = palloc(0);
}
else
{
ipd = (ItemPointer) palloc(sizeof(ItemPointerData) * nipd);
memcpy(ipd, ptr, sizeof(ItemPointerData) * nipd);
}
*nitems = nipd;
return ipd;
}
/*
* Scans through a posting tree (given by its root) and verifies that the keys
* on child pages are consistent with the parent.
*
* Allocates a separate memory context and scans through the posting tree graph.
*/
static void
gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting_tree_root)
{
BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
GinPostingTreeScanItem *stack;
MemoryContext mctx;
MemoryContext oldcontext;
int leafdepth;
mctx = AllocSetContextCreate(CurrentMemoryContext,
"posting tree check context",
ALLOCSET_DEFAULT_SIZES);
oldcontext = MemoryContextSwitchTo(mctx);
/*
* We don't know the height of the tree yet, but as soon as we encounter a
* leaf page, we will set 'leafdepth' to its depth.
*/
leafdepth = -1;
/* Start the scan at the root page */
stack = (GinPostingTreeScanItem *) palloc0(sizeof(GinPostingTreeScanItem));
stack->depth = 0;
ItemPointerSetInvalid(&stack->parentkey);
stack->parentblk = InvalidBlockNumber;
stack->blkno = posting_tree_root;
elog(DEBUG3, "processing posting tree at blk %u", posting_tree_root);
while (stack)
{
GinPostingTreeScanItem *stack_next;
Buffer buffer;
Page page;
OffsetNumber i,
maxoff;
BlockNumber rightlink;
CHECK_FOR_INTERRUPTS();
buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
RBM_NORMAL, strategy);
LockBuffer(buffer, GIN_SHARE);
page = (Page) BufferGetPage(buffer);
Assert(GinPageIsData(page));
/* Check that the tree has the same height in all branches */
if (GinPageIsLeaf(page))
{
ItemPointerData minItem;
int nlist;
ItemPointerData *list;
char tidrange_buf[MAXPGPATH];
ItemPointerSetMin(&minItem);
elog(DEBUG1, "page blk: %u, type leaf", stack->blkno);
if (leafdepth == -1)
leafdepth = stack->depth;
else if (stack->depth != leafdepth)
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
RelationGetRelationName(rel), stack->blkno)));
list = GinDataLeafPageGetItems(page, &nlist, minItem);
if (nlist > 0)
snprintf(tidrange_buf, sizeof(tidrange_buf),
"%d tids (%u, %u) - (%u, %u)",
nlist,
ItemPointerGetBlockNumberNoCheck(&list[0]),
ItemPointerGetOffsetNumberNoCheck(&list[0]),
ItemPointerGetBlockNumberNoCheck(&list[nlist - 1]),
ItemPointerGetOffsetNumberNoCheck(&list[nlist - 1]));
else
snprintf(tidrange_buf, sizeof(tidrange_buf), "0 tids");
if (stack->parentblk != InvalidBlockNumber)
elog(DEBUG3, "blk %u: parent %u highkey (%u, %u), %s",
stack->blkno,
stack->parentblk,
ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
ItemPointerGetOffsetNumberNoCheck(&stack->parentkey),
tidrange_buf);
else
elog(DEBUG3, "blk %u: root leaf, %s",
stack->blkno,
tidrange_buf);
if (stack->parentblk != InvalidBlockNumber &&
ItemPointerGetOffsetNumberNoCheck(&stack->parentkey) != InvalidOffsetNumber &&
nlist > 0 && ItemPointerCompare(&stack->parentkey, &list[nlist - 1]) < 0)
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\": tid exceeds parent's high key in postingTree leaf on block %u",
RelationGetRelationName(rel), stack->blkno)));
}
else
{
LocationIndex pd_lower;
ItemPointerData bound;
int lowersize;
/*
* Check that tuples in each page are properly ordered and
* consistent with parent high key
*/
maxoff = GinPageGetOpaque(page)->maxoff;
rightlink = GinPageGetOpaque(page)->rightlink;
elog(DEBUG1, "page blk: %u, type data, maxoff %d", stack->blkno, maxoff);
if (stack->parentblk != InvalidBlockNumber)
elog(DEBUG3, "blk %u: internal posting tree page with %u items, parent %u highkey (%u, %u)",
stack->blkno, maxoff, stack->parentblk,
ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
ItemPointerGetOffsetNumberNoCheck(&stack->parentkey));
else
elog(DEBUG3, "blk %u: root internal posting tree page with %u items",
stack->blkno, maxoff);
/*
* A GIN posting tree internal page stores PostingItems in the
* 'lower' part of the page. The 'upper' part is unused. The
* number of elements is stored in the opaque area (maxoff). Make
* sure the size of the 'lower' part agrees with 'maxoff'
*
* We didn't set pd_lower until PostgreSQL version 9.4, so if this
* check fails, it could also be because the index was
* binary-upgraded from an earlier version. That was a long time
* ago, though, so let's warn if it doesn't match.
*/
pd_lower = ((PageHeader) page)->pd_lower;
lowersize = pd_lower - MAXALIGN(SizeOfPageHeaderData);
if ((lowersize - MAXALIGN(sizeof(ItemPointerData))) / sizeof(PostingItem) != maxoff)
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\" has unexpected pd_lower %u in posting tree block %u with maxoff %u)",
RelationGetRelationName(rel), pd_lower, stack->blkno, maxoff)));
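/*
 * Worked example of the check above (illustrative numbers, assuming
 * MAXALIGN is 8): SizeOfPageHeaderData MAXALIGNs to 24 bytes, the
 * right-bound ItemPointerData to 8 bytes, and sizeof(PostingItem) is 10,
 * so a page holding maxoff = 20 posting items must have
 * pd_lower = 24 + 8 + 20 * 10 = 232.
 */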
/*
* Before the PostingItems, there's one ItemPointerData in the
* 'lower' part that stores the page's high key.
*/
bound = *GinDataPageGetRightBound(page);
/*
* A GIN page's right bound has a sane value only when the page is not
* the rightmost one at its level: the rightmost page does not store
* its high key explicitly, so the value there is implicitly infinity.
*/
if (ItemPointerIsValid(&stack->parentkey) &&
rightlink != InvalidBlockNumber &&
!ItemPointerEquals(&stack->parentkey, &bound))
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\": posting tree page's high key (%u, %u) doesn't match the downlink on block %u (parent blk %u, key (%u, %u))",
RelationGetRelationName(rel),
ItemPointerGetBlockNumberNoCheck(&bound),
ItemPointerGetOffsetNumberNoCheck(&bound),
stack->blkno, stack->parentblk,
ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
ItemPointerGetOffsetNumberNoCheck(&stack->parentkey))));
for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
{
GinPostingTreeScanItem *ptr;
PostingItem *posting_item = GinDataPageGetPostingItem(page, i);
/* ItemPointerGetOffsetNumber expects a valid pointer */
if (!(i == maxoff &&
rightlink == InvalidBlockNumber))
elog(DEBUG3, "key (%u, %u) -> %u",
ItemPointerGetBlockNumber(&posting_item->key),
ItemPointerGetOffsetNumber(&posting_item->key),
BlockIdGetBlockNumber(&posting_item->child_blkno));
else
elog(DEBUG3, "key (%u, %u) -> %u",
0, 0, BlockIdGetBlockNumber(&posting_item->child_blkno));
if (i == maxoff && rightlink == InvalidBlockNumber)
{
/*
* The rightmost item in the tree level has (0, 0) as the
* key
*/
if (ItemPointerGetBlockNumberNoCheck(&posting_item->key) != 0 ||
ItemPointerGetOffsetNumberNoCheck(&posting_item->key) != 0)
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\": rightmost posting tree page (blk %u) has unexpected last key (%u, %u)",
RelationGetRelationName(rel),
stack->blkno,
ItemPointerGetBlockNumberNoCheck(&posting_item->key),
ItemPointerGetOffsetNumberNoCheck(&posting_item->key))));
}
else if (i != FirstOffsetNumber)
{
PostingItem *previous_posting_item = GinDataPageGetPostingItem(page, i - 1);
if (ItemPointerCompare(&posting_item->key, &previous_posting_item->key) < 0)
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\" has wrong tuple order in posting tree, block %u, offset %u",
RelationGetRelationName(rel), stack->blkno, i)));
}
/*
* Check if this tuple is consistent with the downlink in the
* parent.
*/
if (stack->parentblk != InvalidBlockNumber && i == maxoff &&
ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\": posting item exceeds parent's high key in postingTree internal page on block %u offset %u",
RelationGetRelationName(rel),
stack->blkno, i)));
/* This is an internal page, recurse into the child. */
ptr = (GinPostingTreeScanItem *) palloc(sizeof(GinPostingTreeScanItem));
ptr->depth = stack->depth + 1;
/*
* Set rightmost parent key to invalid item pointer. Its value
* is 'Infinity' and not explicitly stored.
*/
if (rightlink == InvalidBlockNumber)
ItemPointerSetInvalid(&ptr->parentkey);
else
ptr->parentkey = posting_item->key;
ptr->parentblk = stack->blkno;
ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
ptr->next = stack->next;
stack->next = ptr;
}
}
LockBuffer(buffer, GIN_UNLOCK);
ReleaseBuffer(buffer);
/* Step to next item in the queue */
stack_next = stack->next;
pfree(stack);
stack = stack_next;
}
MemoryContextSwitchTo(oldcontext);
MemoryContextDelete(mctx);
}
/*
* Main entry point for GIN checks.
*
* Allocates memory context and scans through the whole GIN graph.
*/
static void
gin_check_parent_keys_consistency(Relation rel,
Relation heaprel,
void *callback_state,
bool readonly)
{
BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
GinScanItem *stack;
MemoryContext mctx;
MemoryContext oldcontext;
GinState state;
int leafdepth;
mctx = AllocSetContextCreate(CurrentMemoryContext,
"amcheck consistency check context",
ALLOCSET_DEFAULT_SIZES);
oldcontext = MemoryContextSwitchTo(mctx);
initGinState(&state, rel);
/*
* We don't know the height of the tree yet, but as soon as we encounter a
* leaf page, we will set 'leafdepth' to its depth.
*/
leafdepth = -1;
/* Start the scan at the root page */
stack = (GinScanItem *) palloc0(sizeof(GinScanItem));
stack->depth = 0;
stack->parenttup = NULL;
stack->parentblk = InvalidBlockNumber;
stack->parentlsn = InvalidXLogRecPtr;
stack->blkno = GIN_ROOT_BLKNO;
while (stack)
{
GinScanItem *stack_next;
Buffer buffer;
Page page;
OffsetNumber i,
maxoff,
prev_attnum;
XLogRecPtr lsn;
IndexTuple prev_tuple;
BlockNumber rightlink;
CHECK_FOR_INTERRUPTS();
buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
RBM_NORMAL, strategy);
LockBuffer(buffer, GIN_SHARE);
page = (Page) BufferGetPage(buffer);
lsn = BufferGetLSNAtomic(buffer);
maxoff = PageGetMaxOffsetNumber(page);
rightlink = GinPageGetOpaque(page)->rightlink;
/* Do basic sanity checks on the page headers */
check_index_page(rel, buffer, stack->blkno);
elog(DEBUG3, "processing entry tree page at blk %u, maxoff: %u", stack->blkno, maxoff);
/*
* It's possible that the page was split since we looked at the
* parent, so that we missed the downlink of the right sibling
* when we scanned the parent. If so, add the right sibling to the
* stack now.
*/
if (stack->parenttup != NULL)
{
GinNullCategory parent_key_category;
Datum parent_key = gintuple_get_key(&state,
stack->parenttup,
&parent_key_category);
ItemId iid = PageGetItemIdCareful(rel, stack->blkno,
page, maxoff);
IndexTuple idxtuple = (IndexTuple) PageGetItem(page, iid);
OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
GinNullCategory page_max_key_category;
Datum page_max_key = gintuple_get_key(&state, idxtuple, &page_max_key_category);
if (rightlink != InvalidBlockNumber &&
ginCompareEntries(&state, attnum, page_max_key,
page_max_key_category, parent_key,
parent_key_category) > 0)
{
/* split page detected, install right link to the stack */
GinScanItem *ptr;
elog(DEBUG3, "split detected for blk: %u, parent blk: %u", stack->blkno, stack->parentblk);
ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
ptr->depth = stack->depth;
ptr->parenttup = CopyIndexTuple(stack->parenttup);
ptr->parentblk = stack->parentblk;
ptr->parentlsn = stack->parentlsn;
ptr->blkno = rightlink;
ptr->next = stack->next;
stack->next = ptr;
}
}
/* Check that the tree has the same height in all branches */
if (GinPageIsLeaf(page))
{
if (leafdepth == -1)
leafdepth = stack->depth;
else if (stack->depth != leafdepth)
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
RelationGetRelationName(rel), stack->blkno)));
}
/*
* Check that tuples in each page are properly ordered and consistent
* with parent high key
*/
prev_tuple = NULL;
prev_attnum = InvalidAttrNumber;
for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
{
ItemId iid = PageGetItemIdCareful(rel, stack->blkno, page, i);
IndexTuple idxtuple = (IndexTuple) PageGetItem(page, iid);
OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
GinNullCategory prev_key_category;
Datum prev_key;
GinNullCategory current_key_category;
Datum current_key;
if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
RelationGetRelationName(rel), stack->blkno, i)));
current_key = gintuple_get_key(&state, idxtuple, &current_key_category);
/*
* First block is metadata, skip order check. Also, never check
* for high key on rightmost page, as this key is not really
* stored explicitly.
*
* Also make sure to not compare entries for different attnums,
* which may be stored on the same page.
*/
if (i != FirstOffsetNumber && attnum == prev_attnum && stack->blkno != GIN_ROOT_BLKNO &&
!(i == maxoff && rightlink == InvalidBlockNumber))
{
prev_key = gintuple_get_key(&state, prev_tuple, &prev_key_category);
if (ginCompareEntries(&state, attnum, prev_key,
prev_key_category, current_key,
current_key_category) >= 0)
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\" has wrong tuple order on entry tree page, block %u, offset %u, rightlink %u",
RelationGetRelationName(rel), stack->blkno, i, rightlink)));
}
/*
* Check if this tuple is consistent with the downlink in the
* parent.
*/
if (stack->parenttup &&
i == maxoff)
{
GinNullCategory parent_key_category;
Datum parent_key = gintuple_get_key(&state,
stack->parenttup,
&parent_key_category);
if (ginCompareEntries(&state, attnum, current_key,
current_key_category, parent_key,
parent_key_category) > 0)
{
/*
* There was a discrepancy between parent and child
* tuples. We need to verify that it is not the result of a
* concurrent page split. So, lock the parent and try to find
* the downlink for the current page. It may be missing due to
* a concurrent page split; this is OK.
*/
pfree(stack->parenttup);
stack->parenttup = gin_refind_parent(rel, stack->parentblk,
stack->blkno, strategy);
/* If the parent tuple is gone, assume a concurrent split; otherwise recheck */
if (!stack->parenttup)
elog(NOTICE, "Unable to find parent tuple for block %u on block %u due to concurrent split",
stack->blkno, stack->parentblk);
else
{
parent_key = gintuple_get_key(&state,
stack->parenttup,
&parent_key_category);
/*
* Check if it is properly adjusted. If so,
* proceed to the next key.
*/
if (ginCompareEntries(&state, attnum, current_key,
current_key_category, parent_key,
parent_key_category) > 0)
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\" has inconsistent records on page %u offset %u",
RelationGetRelationName(rel), stack->blkno, i)));
}
}
}
/* If this is an internal page, recurse into the child */
if (!GinPageIsLeaf(page))
{
GinScanItem *ptr;
ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
ptr->depth = stack->depth + 1;
/* last tuple in layer has no high key */
if (i != maxoff && !GinPageGetOpaque(page)->rightlink)
ptr->parenttup = CopyIndexTuple(idxtuple);
else
ptr->parenttup = NULL;
ptr->parentblk = stack->blkno;
ptr->blkno = GinGetDownlink(idxtuple);
ptr->parentlsn = lsn;
ptr->next = stack->next;
stack->next = ptr;
}
/* If this item is a pointer to a posting tree, recurse into it */
else if (GinIsPostingTree(idxtuple))
{
BlockNumber rootPostingTree = GinGetPostingTree(idxtuple);
gin_check_posting_tree_parent_keys_consistency(rel, rootPostingTree);
}
else
{
ItemPointer ipd;
int nipd;
ipd = ginReadTupleWithoutState(idxtuple, &nipd);
for (int j = 0; j < nipd; j++)
{
if (!OffsetNumberIsValid(ItemPointerGetOffsetNumber(&ipd[j])))
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\": posting list contains invalid heap pointer on block %u",
RelationGetRelationName(rel), stack->blkno)));
}
pfree(ipd);
}
prev_tuple = CopyIndexTuple(idxtuple);
prev_attnum = attnum;
}
LockBuffer(buffer, GIN_UNLOCK);
ReleaseBuffer(buffer);
/* Step to next item in the queue */
stack_next = stack->next;
if (stack->parenttup)
pfree(stack->parenttup);
pfree(stack);
stack = stack_next;
}
MemoryContextSwitchTo(oldcontext);
MemoryContextDelete(mctx);
}
/*
* Verify that a freshly-read page looks sane.
*/
static void
check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
{
Page page = BufferGetPage(buffer);
/*
* ReadBuffer verifies that every newly-read page passes
* PageHeaderIsValid, which means it either contains a reasonably sane
* page header or is all-zero. We have to defend against the all-zero
* case, however.
*/
if (PageIsNew(page))
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\" contains unexpected zero page at block %u",
RelationGetRelationName(rel),
BufferGetBlockNumber(buffer)),
errhint("Please REINDEX it.")));
/*
* Additionally check that the special area looks sane.
*/
if (PageGetSpecialSize(page) != MAXALIGN(sizeof(GinPageOpaqueData)))
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\" contains corrupted page at block %u",
RelationGetRelationName(rel),
BufferGetBlockNumber(buffer)),
errhint("Please REINDEX it.")));
if (GinPageIsDeleted(page))
{
if (!GinPageIsLeaf(page))
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\" has deleted internal page %u",
RelationGetRelationName(rel), blockNo)));
if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\" has deleted page %u with tuples",
RelationGetRelationName(rel), blockNo)));
}
else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\" has page %u with exceeding count of tuples",
RelationGetRelationName(rel), blockNo)));
}
/*
* Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
*
* If found, returns a palloc'd copy of the downlink tuple. Otherwise,
* returns NULL.
*/
static IndexTuple
gin_refind_parent(Relation rel, BlockNumber parentblkno,
BlockNumber childblkno, BufferAccessStrategy strategy)
{
Buffer parentbuf;
Page parentpage;
OffsetNumber o,
parent_maxoff;
IndexTuple result = NULL;
parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
strategy);
LockBuffer(parentbuf, GIN_SHARE);
parentpage = BufferGetPage(parentbuf);
if (GinPageIsLeaf(parentpage))
{
UnlockReleaseBuffer(parentbuf);
return result;
}
parent_maxoff = PageGetMaxOffsetNumber(parentpage);
for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
{
ItemId p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o);
IndexTuple itup = (IndexTuple) PageGetItem(parentpage, p_iid);
if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
{
/* Found it! Make copy and return it */
result = CopyIndexTuple(itup);
break;
}
}
UnlockReleaseBuffer(parentbuf);
return result;
}
static ItemId
PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
OffsetNumber offset)
{
ItemId itemid = PageGetItemId(page, offset);
if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
BLCKSZ - MAXALIGN(sizeof(GinPageOpaqueData)))
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("line pointer points past end of tuple space in index \"%s\"",
RelationGetRelationName(rel)),
errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
block, offset, ItemIdGetOffset(itemid),
ItemIdGetLength(itemid),
ItemIdGetFlags(itemid))));
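/*
 * For example, with the default 8 kB block size GinPageOpaqueData
 * MAXALIGNs to 8 bytes, so lp_off + lp_len may not exceed
 * 8192 - 8 = 8184 (illustrative numbers; BLCKSZ is configurable).
 */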
/*
* Verify that the line pointer isn't LP_REDIRECT, LP_UNUSED, or LP_DEAD,
* since GIN never uses any of those. Verify that the line pointer has storage,
* too.
*/
if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
ItemIdIsDead(itemid) || ItemIdGetLength(itemid) == 0)
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("invalid line pointer storage in index \"%s\"",
RelationGetRelationName(rel)),
errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
block, offset, ItemIdGetOffset(itemid),
ItemIdGetLength(itemid),
ItemIdGetFlags(itemid))));
return itemid;
}


@@ -25,6 +25,7 @@
#include "miscadmin.h"
#include "storage/bufmgr.h"
#include "storage/procarray.h"
#include "storage/read_stream.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
#include "utils/rel.h"
@@ -118,7 +119,10 @@ typedef struct HeapCheckContext
Relation valid_toast_index;
int num_toast_indexes;
/* Values for iterating over pages in the relation */
/*
* Values for iterating over pages in the relation. `blkno` is the block
* number of the most recent buffer yielded by the read stream API.
*/
BlockNumber blkno;
BufferAccessStrategy bstrategy;
Buffer buffer;
@@ -153,7 +157,32 @@ typedef struct HeapCheckContext
Tuplestorestate *tupstore;
} HeapCheckContext;
/*
* The per-relation data provided to the read stream API for heap amcheck to
* use in its callback for the SKIP_PAGES_ALL_FROZEN and
* SKIP_PAGES_ALL_VISIBLE options.
*/
typedef struct HeapCheckReadStreamData
{
/*
* `range` is used by all SkipPages options. SKIP_PAGES_NONE uses the
* default read stream callback, block_range_read_stream_cb(), which takes
* a BlockRangeReadStreamPrivate as its callback_private_data. `range`
* keeps track of the current block number across
* read_stream_next_buffer() invocations.
*/
BlockRangeReadStreamPrivate range;
SkipPages skip_option;
Relation rel;
Buffer *vmbuffer;
} HeapCheckReadStreamData;
/* Internal implementation */
static BlockNumber heapcheck_read_stream_next_unskippable(ReadStream *stream,
void *callback_private_data,
void *per_buffer_data);
static void check_tuple(HeapCheckContext *ctx,
bool *xmin_commit_status_ok,
XidCommitStatus *xmin_commit_status);
@@ -231,6 +260,11 @@ verify_heapam(PG_FUNCTION_ARGS)
BlockNumber last_block;
BlockNumber nblocks;
const char *skip;
ReadStream *stream;
int stream_flags;
ReadStreamBlockNumberCB stream_cb;
void *stream_data;
HeapCheckReadStreamData stream_skip_data;
/* Check supplied arguments */
if (PG_ARGISNULL(0))
@ -404,7 +438,46 @@ verify_heapam(PG_FUNCTION_ARGS)
if (TransactionIdIsNormal(ctx.relfrozenxid))
ctx.oldest_xid = ctx.relfrozenxid;
for (ctx.blkno = first_block; ctx.blkno <= last_block; ctx.blkno++)
/* Now that `ctx` is set up, set up the read stream */
stream_skip_data.range.current_blocknum = first_block;
stream_skip_data.range.last_exclusive = last_block + 1;
stream_skip_data.skip_option = skip_option;
stream_skip_data.rel = ctx.rel;
stream_skip_data.vmbuffer = &vmbuffer;
if (skip_option == SKIP_PAGES_NONE)
{
/*
* It is safe to use batchmode as block_range_read_stream_cb takes no
* locks.
*/
stream_cb = block_range_read_stream_cb;
stream_flags = READ_STREAM_SEQUENTIAL |
READ_STREAM_FULL |
READ_STREAM_USE_BATCHING;
stream_data = &stream_skip_data.range;
}
else
{
/*
* It would not be safe to naively use batchmode, as
* heapcheck_read_stream_next_unskippable takes locks. It shouldn't be
* too hard to convert though.
*/
stream_cb = heapcheck_read_stream_next_unskippable;
stream_flags = READ_STREAM_DEFAULT;
stream_data = &stream_skip_data;
}
stream = read_stream_begin_relation(stream_flags,
ctx.bstrategy,
ctx.rel,
MAIN_FORKNUM,
stream_cb,
stream_data,
0);
while ((ctx.buffer = read_stream_next_buffer(stream, NULL)) != InvalidBuffer)
{
OffsetNumber maxoff;
OffsetNumber predecessor[MaxOffsetNumber];
@@ -417,30 +490,11 @@ verify_heapam(PG_FUNCTION_ARGS)
memset(predecessor, 0, sizeof(OffsetNumber) * MaxOffsetNumber);
/* Optionally skip over all-frozen or all-visible blocks */
if (skip_option != SKIP_PAGES_NONE)
{
int32 mapbits;
mapbits = (int32) visibilitymap_get_status(ctx.rel, ctx.blkno,
&vmbuffer);
if (skip_option == SKIP_PAGES_ALL_FROZEN)
{
if ((mapbits & VISIBILITYMAP_ALL_FROZEN) != 0)
continue;
}
if (skip_option == SKIP_PAGES_ALL_VISIBLE)
{
if ((mapbits & VISIBILITYMAP_ALL_VISIBLE) != 0)
continue;
}
}
/* Read and lock the next page. */
ctx.buffer = ReadBufferExtended(ctx.rel, MAIN_FORKNUM, ctx.blkno,
RBM_NORMAL, ctx.bstrategy);
/* Lock the next page. */
Assert(BufferIsValid(ctx.buffer));
LockBuffer(ctx.buffer, BUFFER_LOCK_SHARE);
ctx.blkno = BufferGetBlockNumber(ctx.buffer);
ctx.page = BufferGetPage(ctx.buffer);
/* Perform tuple checks */
@@ -799,6 +853,8 @@ verify_heapam(PG_FUNCTION_ARGS)
break;
}
read_stream_end(stream);
if (vmbuffer != InvalidBuffer)
ReleaseBuffer(vmbuffer);
@@ -815,6 +871,42 @@ verify_heapam(PG_FUNCTION_ARGS)
PG_RETURN_NULL();
}
/*
* Heap amcheck's read stream callback for getting the next unskippable block.
* This callback is only used when 'all-visible' or 'all-frozen' is provided
* as the skip option to verify_heapam(). With the default 'none',
* block_range_read_stream_cb() is used instead.
*/
static BlockNumber
heapcheck_read_stream_next_unskippable(ReadStream *stream,
void *callback_private_data,
void *per_buffer_data)
{
HeapCheckReadStreamData *p = callback_private_data;
/* Loops over [current_blocknum, last_exclusive) blocks */
for (BlockNumber i; (i = p->range.current_blocknum++) < p->range.last_exclusive;)
{
uint8 mapbits = visibilitymap_get_status(p->rel, i, p->vmbuffer);
if (p->skip_option == SKIP_PAGES_ALL_FROZEN)
{
if ((mapbits & VISIBILITYMAP_ALL_FROZEN) != 0)
continue;
}
if (p->skip_option == SKIP_PAGES_ALL_VISIBLE)
{
if ((mapbits & VISIBILITYMAP_ALL_VISIBLE) != 0)
continue;
}
return i;
}
return InvalidBlockNumber;
}
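For comparison, here is the distilled shape of the read stream consumption loop used above, as a minimal sketch; demo_scan_relation is invented for illustration and assumes compilation against this PostgreSQL tree:

#include "postgres.h"

#include "storage/bufmgr.h"
#include "storage/read_stream.h"
#include "utils/rel.h"

static void
demo_scan_relation(Relation rel, BufferAccessStrategy strategy)
{
	BlockRangeReadStreamPrivate range;
	ReadStream *stream;
	Buffer		buf;

	range.current_blocknum = 0;
	range.last_exclusive = RelationGetNumberOfBlocks(rel);

	/* block_range_read_stream_cb walks [current_blocknum, last_exclusive) */
	stream = read_stream_begin_relation(READ_STREAM_SEQUENTIAL |
										READ_STREAM_FULL,
										strategy, rel, MAIN_FORKNUM,
										block_range_read_stream_cb,
										&range, 0);

	while ((buf = read_stream_next_buffer(stream, NULL)) != InvalidBuffer)
	{
		LockBuffer(buf, BUFFER_LOCK_SHARE);
		/* ... examine BufferGetPage(buf) here ... */
		UnlockReleaseBuffer(buf);
	}
	read_stream_end(stream);
}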
/*
* Shared internal implementation for report_corruption and
* report_toast_corruption.


@@ -30,6 +30,7 @@
#include "access/tableam.h"
#include "access/transam.h"
#include "access/xact.h"
#include "verify_common.h"
#include "catalog/index.h"
#include "catalog/pg_am.h"
#include "catalog/pg_opfamily_d.h"
@@ -42,7 +43,10 @@
#include "utils/snapmgr.h"
PG_MODULE_MAGIC;
PG_MODULE_MAGIC_EXT(
.name = "amcheck",
.version = PG_VERSION
);
/*
* A B-Tree cannot possibly have this many levels, since there must be one
@@ -156,14 +160,22 @@ typedef struct BtreeLastVisibleEntry
ItemPointer tid; /* Heap tid */
} BtreeLastVisibleEntry;
/*
* arguments for the bt_index_check_callback callback
*/
typedef struct BTCallbackState
{
bool parentcheck;
bool heapallindexed;
bool rootdescend;
bool checkunique;
} BTCallbackState;
PG_FUNCTION_INFO_V1(bt_index_check);
PG_FUNCTION_INFO_V1(bt_index_parent_check);
static void bt_index_check_internal(Oid indrelid, bool parentcheck,
bool heapallindexed, bool rootdescend,
bool checkunique);
static inline void btree_index_checkable(Relation rel);
static inline bool btree_index_mainfork_expected(Relation rel);
static void bt_index_check_callback(Relation indrel, Relation heaprel,
void *state, bool readonly);
static void bt_check_every_level(Relation rel, Relation heaprel,
bool heapkeyspace, bool readonly, bool heapallindexed,
bool rootdescend, bool checkunique);
@@ -238,15 +250,21 @@ Datum
bt_index_check(PG_FUNCTION_ARGS)
{
Oid indrelid = PG_GETARG_OID(0);
bool heapallindexed = false;
bool checkunique = false;
BTCallbackState args;
args.heapallindexed = false;
args.rootdescend = false;
args.parentcheck = false;
args.checkunique = false;
if (PG_NARGS() >= 2)
heapallindexed = PG_GETARG_BOOL(1);
if (PG_NARGS() == 3)
checkunique = PG_GETARG_BOOL(2);
args.heapallindexed = PG_GETARG_BOOL(1);
if (PG_NARGS() >= 3)
args.checkunique = PG_GETARG_BOOL(2);
bt_index_check_internal(indrelid, false, heapallindexed, false, checkunique);
amcheck_lock_relation_and_check(indrelid, BTREE_AM_OID,
bt_index_check_callback,
AccessShareLock, &args);
PG_RETURN_VOID();
}
@@ -264,18 +282,23 @@ Datum
bt_index_parent_check(PG_FUNCTION_ARGS)
{
Oid indrelid = PG_GETARG_OID(0);
bool heapallindexed = false;
bool rootdescend = false;
bool checkunique = false;
BTCallbackState args;
args.heapallindexed = false;
args.rootdescend = false;
args.parentcheck = true;
args.checkunique = false;
if (PG_NARGS() >= 2)
heapallindexed = PG_GETARG_BOOL(1);
args.heapallindexed = PG_GETARG_BOOL(1);
if (PG_NARGS() >= 3)
rootdescend = PG_GETARG_BOOL(2);
if (PG_NARGS() == 4)
checkunique = PG_GETARG_BOOL(3);
args.rootdescend = PG_GETARG_BOOL(2);
if (PG_NARGS() >= 4)
args.checkunique = PG_GETARG_BOOL(3);
bt_index_check_internal(indrelid, true, heapallindexed, rootdescend, checkunique);
amcheck_lock_relation_and_check(indrelid, BTREE_AM_OID,
bt_index_check_callback,
ShareLock, &args);
PG_RETURN_VOID();
}
@@ -284,193 +307,46 @@ bt_index_parent_check(PG_FUNCTION_ARGS)
* Helper for bt_index_[parent_]check, coordinating the bulk of the work.
*/
static void
bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
						bool rootdescend, bool checkunique)
{
	Oid			heapid;
	Relation	indrel;
	Relation	heaprel;
	LOCKMODE	lockmode;
	Oid			save_userid;
	int			save_sec_context;
	int			save_nestlevel;

	if (parentcheck)
		lockmode = ShareLock;
	else
		lockmode = AccessShareLock;

	/*
	 * We must lock table before index to avoid deadlocks.  However, if the
	 * passed indrelid isn't an index then IndexGetRelation() will fail.
	 * Rather than emitting a not-very-helpful error message, postpone
	 * complaining, expecting that the is-it-an-index test below will fail.
	 *
	 * In hot standby mode this will raise an error when parentcheck is true.
	 */
	heapid = IndexGetRelation(indrelid, true);
	if (OidIsValid(heapid))
	{
		heaprel = table_open(heapid, lockmode);

		/*
		 * Switch to the table owner's userid, so that any index functions
		 * are run as that user.  Also lock down security-restricted
		 * operations and arrange to make GUC variable changes local to this
		 * command.
		 */
		GetUserIdAndSecContext(&save_userid, &save_sec_context);
		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
		save_nestlevel = NewGUCNestLevel();
		RestrictSearchPath();
	}
	else
	{
		heaprel = NULL;
		/* Set these just to suppress "uninitialized variable" warnings */
		save_userid = InvalidOid;
		save_sec_context = -1;
		save_nestlevel = -1;
	}

	/*
	 * Open the target index relations separately (like relation_openrv(),
	 * but with heap relation locked first to prevent deadlocking).  In hot
	 * standby mode this will raise an error when parentcheck is true.
	 *
	 * There is no need for the usual indcheckxmin usability horizon test
	 * here, even in the heapallindexed case, because index undergoing
	 * verification only needs to have entries for a new transaction snapshot.
	 * (If this is a parentcheck verification, there is no question about
	 * committed or recently dead heap tuples lacking index entries due to
	 * concurrent activity.)
	 */
	indrel = index_open(indrelid, lockmode);

	/*
	 * Since we did the IndexGetRelation call above without any lock, it's
	 * barely possible that a race against an index drop/recreation could
	 * have netted us the wrong table.
	 */
	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
		ereport(ERROR,
				(errcode(ERRCODE_UNDEFINED_TABLE),
				 errmsg("could not open parent table of index \"%s\"",
						RelationGetRelationName(indrel))));

	/* Relation suitable for checking as B-Tree? */
	btree_index_checkable(indrel);

	if (btree_index_mainfork_expected(indrel))
	{
		bool		heapkeyspace,
					allequalimage;

		if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
			ereport(ERROR,
					(errcode(ERRCODE_INDEX_CORRUPTED),
					 errmsg("index \"%s\" lacks a main relation fork",
							RelationGetRelationName(indrel))));

		/* Extract metadata from metapage, and sanitize it in passing */
		_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
		if (allequalimage && !heapkeyspace)
			ereport(ERROR,
					(errcode(ERRCODE_INDEX_CORRUPTED),
					 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
							RelationGetRelationName(indrel))));
		if (allequalimage && !_bt_allequalimage(indrel, false))
		{
			bool		has_interval_ops = false;

			for (int i = 0; i < IndexRelationGetNumberOfKeyAttributes(indrel); i++)
				if (indrel->rd_opfamily[i] == INTERVAL_BTREE_FAM_OID)
					has_interval_ops = true;
			ereport(ERROR,
					(errcode(ERRCODE_INDEX_CORRUPTED),
					 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
							RelationGetRelationName(indrel)),
					 has_interval_ops
					 ? errhint("This is known of \"interval\" indexes last built on a version predating 2023-11.")
					 : 0));
		}

		/* Check index, possibly against table it is an index on */
		bt_check_every_level(indrel, heaprel, heapkeyspace, parentcheck,
							 heapallindexed, rootdescend, checkunique);
	}

	/* Roll back any GUC changes executed by index functions */
	AtEOXact_GUC(false, save_nestlevel);

	/* Restore userid and security context */
	SetUserIdAndSecContext(save_userid, save_sec_context);

	/*
	 * Release locks early. That's ok here because nothing in the called
	 * routines will trigger shared cache invalidations to be sent, so we can
	 * relax the usual pattern of only releasing locks after commit.
	 */
	index_close(indrel, lockmode);
	if (heaprel)
		table_close(heaprel, lockmode);
}

static void
bt_index_check_callback(Relation indrel, Relation heaprel, void *state, bool readonly)
{
	BTCallbackState *args = (BTCallbackState *) state;
	bool		heapkeyspace,
				allequalimage;

	if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
		ereport(ERROR,
				(errcode(ERRCODE_INDEX_CORRUPTED),
				 errmsg("index \"%s\" lacks a main relation fork",
						RelationGetRelationName(indrel))));

	/* Extract metadata from metapage, and sanitize it in passing */
	_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
	if (allequalimage && !heapkeyspace)
		ereport(ERROR,
				(errcode(ERRCODE_INDEX_CORRUPTED),
				 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
						RelationGetRelationName(indrel))));
	if (allequalimage && !_bt_allequalimage(indrel, false))
	{
		bool		has_interval_ops = false;

		for (int i = 0; i < IndexRelationGetNumberOfKeyAttributes(indrel); i++)
			if (indrel->rd_opfamily[i] == INTERVAL_BTREE_FAM_OID)
			{
				has_interval_ops = true;
				ereport(ERROR,
						(errcode(ERRCODE_INDEX_CORRUPTED),
						 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
								RelationGetRelationName(indrel)),
						 has_interval_ops
						 ? errhint("This is known of \"interval\" indexes last built on a version predating 2023-11.")
						 : 0));
			}
	}

	/* Check index, possibly against table it is an index on */
	bt_check_every_level(indrel, heaprel, heapkeyspace, readonly,
						 args->heapallindexed, args->rootdescend,
						 args->checkunique);
}
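
For orientation, since this hunk shows only the callback side: the companion change evidently moves the lock/security dance above into a shared amcheck helper to which the SQL-callable functions hand this callback. A minimal sketch of that wiring follows; the helper name amcheck_lock_relation_and_check and its exact signature are assumptions (they do not appear in this hunk), and the BTCallbackState fields are inferred from the args-> references in the callback.

typedef struct BTCallbackState
{
	bool		heapallindexed;
	bool		rootdescend;
	bool		checkunique;
} BTCallbackState;

static void
bt_index_check_example(Oid indrelid, bool heapallindexed, bool checkunique)
{
	BTCallbackState args;

	args.heapallindexed = heapallindexed;
	args.rootdescend = false;
	args.checkunique = checkunique;

	/*
	 * Hypothetical shared helper: it locks the heap before the index,
	 * switches to the table owner's userid, and finally invokes the
	 * callback, passing readonly = false for AccessShareLock.
	 */
	amcheck_lock_relation_and_check(indrelid, BTREE_AM_OID,
									bt_index_check_callback,
									AccessShareLock, &args);
}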
/*
* Basic checks about the suitability of a relation for checking as a B-Tree
* index.
*
* NB: Intentionally not checking permissions, the function is normally not
* callable by non-superusers. If granted, it's useful to be able to check a
* whole cluster.
*/
static inline void
btree_index_checkable(Relation rel)
{
if (rel->rd_rel->relkind != RELKIND_INDEX ||
rel->rd_rel->relam != BTREE_AM_OID)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("only B-Tree indexes are supported as targets for verification"),
errdetail("Relation \"%s\" is not a B-Tree index.",
RelationGetRelationName(rel))));
if (RELATION_IS_OTHER_TEMP(rel))
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot access temporary tables of other sessions"),
errdetail("Index \"%s\" is associated with temporary relation.",
RelationGetRelationName(rel))));
if (!rel->rd_index->indisvalid)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot check index \"%s\"",
RelationGetRelationName(rel)),
errdetail("Index is not valid.")));
}
/*
* Check if B-Tree index relation should have a file for its main relation
* fork. Verification uses this to skip unlogged indexes when in hot standby
* mode, where there is simply nothing to verify. We behave as if the
* relation is empty.
*
* NB: Caller should call btree_index_checkable() before calling here.
*/
static inline bool
btree_index_mainfork_expected(Relation rel)
{
if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
!RecoveryInProgress())
return true;
ereport(DEBUG1,
(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
RelationGetRelationName(rel))));
return false;
}
@ -1597,8 +1473,7 @@ bt_target_page_check(BtreeCheckState *state)
*/
lowersizelimit = skey->heapkeyspace &&
(P_ISLEAF(topaque) || BTreeTupleGetHeapTID(itup) == NULL);
if (tupsize > (lowersizelimit ? BTMaxItemSize(state->target) :
BTMaxItemSizeNoHeapTid(state->target)))
if (tupsize > (lowersizelimit ? BTMaxItemSize : BTMaxItemSizeNoHeapTid))
{
ItemPointer tid = BTreeTupleGetPointsToTID(itup);
char *itid,

View File

@ -16,7 +16,10 @@
#include "libpq/auth.h"
#include "utils/guc.h"
PG_MODULE_MAGIC;
PG_MODULE_MAGIC_EXT(
.name = "auth_delay",
.version = PG_VERSION
);
/* GUC Variables */
static int auth_delay_milliseconds = 0;
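
The same two-line swap recurs across the contrib modules below: the bare PG_MODULE_MAGIC block is replaced by PG_MODULE_MAGIC_EXT, which records per-module metadata that pg_get_loaded_modules() can report (see the auto_explain test further down). A minimal sketch of a module using the new macro; the module name here is made up.

#include "postgres.h"
#include "fmgr.h"

/* extended magic block: carries a module name and version */
PG_MODULE_MAGIC_EXT(
	.name = "demo_module",			/* hypothetical name */
	.version = PG_VERSION
);

void
_PG_init(void)
{
	/* GUC definitions, hooks, etc. would be installed here */
}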

View File

@ -16,11 +16,16 @@
#include "access/parallel.h"
#include "commands/explain.h"
#include "commands/explain_format.h"
#include "commands/explain_state.h"
#include "common/pg_prng.h"
#include "executor/instrument.h"
#include "utils/guc.h"
PG_MODULE_MAGIC;
PG_MODULE_MAGIC_EXT(
.name = "auto_explain",
.version = PG_VERSION
);
/* GUC variables */
static int auto_explain_log_min_duration = -1; /* msec or -1 */
@ -93,7 +98,7 @@ _PG_init(void)
/* Define custom GUC variables. */
DefineCustomIntVariable("auto_explain.log_min_duration",
"Sets the minimum execution time above which plans will be logged.",
"Zero prints all plans. -1 turns this feature off.",
"-1 disables logging plans. 0 means log all plans.",
&auto_explain_log_min_duration,
-1,
-1, INT_MAX,
@ -104,8 +109,8 @@ _PG_init(void)
NULL);
DefineCustomIntVariable("auto_explain.log_parameter_max_length",
"Sets the maximum length of query parameters to log.",
"Zero logs no query parameters, -1 logs them in full.",
"Sets the maximum length of query parameter values to log.",
"-1 means log values in full.",
&auto_explain_log_parameter_max_length,
-1,
-1, INT_MAX,

View File

@ -28,7 +28,7 @@ sub query_log
}
my $node = PostgreSQL::Test::Cluster->new('main');
$node->init('auth_extra' => [ '--create-role', 'regress_user1' ]);
$node->init(auth_extra => [ '--create-role' => 'regress_user1' ]);
$node->append_conf('postgresql.conf',
"session_preload_libraries = 'auto_explain'");
$node->append_conf('postgresql.conf', "auto_explain.log_min_duration = 0");
@ -212,4 +212,17 @@ REVOKE SET ON PARAMETER auto_explain.log_format FROM regress_user1;
DROP USER regress_user1;
});
# Test pg_get_loaded_modules() function. This function is particularly
# useful for modules with no SQL presence, such as auto_explain.
my $res = $node->safe_psql(
"postgres", q{
SELECT module_name,
version = current_setting('server_version') as version_ok,
regexp_replace(file_name, '\..*', '') as file_name_stripped
FROM pg_get_loaded_modules()
WHERE module_name = 'auto_explain';
});
like($res, qr/^auto_explain\|t\|auto_explain$/, "pg_get_loaded_modules() ok");
done_testing();

View File

@ -18,7 +18,10 @@
#include "utils/acl.h"
#include "utils/guc.h"
PG_MODULE_MAGIC;
PG_MODULE_MAGIC_EXT(
.name = "basebackup_to_shell",
.version = PG_VERSION
);
typedef struct bbsink_shell
{

View File

@ -24,8 +24,8 @@ my $node = PostgreSQL::Test::Cluster->new('primary');
# Make sure pg_hba.conf is set up to allow connections from backupuser.
# This is only needed on Windows machines that don't use UNIX sockets.
$node->init(
'allows_streaming' => 1,
'auth_extra' => [ '--create-role' => 'backupuser' ]);
allows_streaming => 1,
auth_extra => [ '--create-role' => 'backupuser' ]);
$node->append_conf('postgresql.conf',
"shared_preload_libraries = 'basebackup_to_shell'");
@ -131,8 +131,10 @@ sub verify_backup
# Untar.
my $extract_path = PostgreSQL::Test::Utils::tempdir;
system_or_bail($tar, 'xf', $backup_dir . '/' . $prefix . 'base.tar',
'-C', $extract_path);
system_or_bail(
$tar,
'xf' => $backup_dir . '/' . $prefix . 'base.tar',
'-C' => $extract_path);
# Verify.
$node->command_ok(

View File

@ -37,7 +37,10 @@
#include "storage/fd.h"
#include "utils/guc.h"
PG_MODULE_MAGIC;
PG_MODULE_MAGIC_EXT(
.name = "basic_archive",
.version = PG_VERSION
);
static char *archive_directory = NULL;

View File

@ -22,7 +22,10 @@
#include "utils/memutils.h"
#include "utils/rel.h"
PG_MODULE_MAGIC;
PG_MODULE_MAGIC_EXT(
.name = "bloom",
.version = PG_VERSION
);
/*
* State of bloom index build. We accumulate one page data here before

View File

@ -116,6 +116,8 @@ blgetbitmap(IndexScanDesc scan, TIDBitmap *tbm)
bas = GetAccessStrategy(BAS_BULKREAD);
npages = RelationGetNumberOfBlocks(scan->indexRelation);
pgstat_count_index_scan(scan->indexRelation);
if (scan->instrument)
scan->instrument->nsearches++;
for (blkno = BLOOM_HEAD_BLKNO; blkno < npages; blkno++)
{

View File

@ -109,6 +109,9 @@ blhandler(PG_FUNCTION_ARGS)
amroutine->amoptsprocnum = BLOOM_OPTIONS_PROC;
amroutine->amcanorder = false;
amroutine->amcanorderbyop = false;
amroutine->amcanhash = false;
amroutine->amconsistentequality = false;
amroutine->amconsistentordering = false;
amroutine->amcanbackward = false;
amroutine->amcanunique = false;
amroutine->amcanmulticol = true;

View File

@ -4,7 +4,10 @@
#include "plperl.h"
PG_MODULE_MAGIC;
PG_MODULE_MAGIC_EXT(
.name = "bool_plperl",
.version = PG_VERSION
);
PG_FUNCTION_INFO_V1(bool_to_plperl);

View File

@ -104,9 +104,9 @@ SELECT spi_test();
DROP EXTENSION plperl CASCADE;
NOTICE: drop cascades to 6 other objects
DETAIL: drop cascades to function spi_test()
drop cascades to extension bool_plperl
DETAIL: drop cascades to extension bool_plperl
drop cascades to function perl2int(integer)
drop cascades to function perl2text(text)
drop cascades to function perl2undef()
drop cascades to function bool2perl(boolean,boolean,boolean)
drop cascades to function spi_test()

View File

@ -104,9 +104,9 @@ SELECT spi_test();
DROP EXTENSION plperlu CASCADE;
NOTICE: drop cascades to 6 other objects
DETAIL: drop cascades to function spi_test()
drop cascades to extension bool_plperlu
DETAIL: drop cascades to extension bool_plperlu
drop cascades to function perl2int(integer)
drop cascades to function perl2text(text)
drop cascades to function perl2undef()
drop cascades to function bool2perl(boolean,boolean,boolean)
drop cascades to function spi_test()

View File

@ -14,7 +14,10 @@
#include "utils/timestamp.h"
#include "utils/uuid.h"
PG_MODULE_MAGIC;
PG_MODULE_MAGIC_EXT(
.name = "btree_gin",
.version = PG_VERSION
);
typedef struct QueryInfo
{

View File

@ -34,7 +34,7 @@ DATA = btree_gist--1.0--1.1.sql \
btree_gist--1.1--1.2.sql btree_gist--1.2.sql btree_gist--1.2--1.3.sql \
btree_gist--1.3--1.4.sql btree_gist--1.4--1.5.sql \
btree_gist--1.5--1.6.sql btree_gist--1.6--1.7.sql \
btree_gist--1.7--1.8.sql
btree_gist--1.7--1.8.sql btree_gist--1.8--1.9.sql
PGFILEDESC = "btree_gist - B-tree equivalent GiST operator classes"
REGRESS = init int2 int4 int8 float4 float8 cash oid timestamp timestamptz \

View File

@ -6,18 +6,18 @@
#include "btree_gist.h"
#include "btree_utils_var.h"
#include "utils/fmgrprotos.h"
#include "utils/sortsupport.h"
#include "utils/varbit.h"
/*
** Bit ops
*/
/* GiST support functions */
PG_FUNCTION_INFO_V1(gbt_bit_compress);
PG_FUNCTION_INFO_V1(gbt_bit_union);
PG_FUNCTION_INFO_V1(gbt_bit_picksplit);
PG_FUNCTION_INFO_V1(gbt_bit_consistent);
PG_FUNCTION_INFO_V1(gbt_bit_penalty);
PG_FUNCTION_INFO_V1(gbt_bit_same);
PG_FUNCTION_INFO_V1(gbt_bit_sortsupport);
PG_FUNCTION_INFO_V1(gbt_varbit_sortsupport);
/* define for comparison */
@ -121,7 +121,7 @@ static const gbtree_vinfo tinfo =
/**************************************************
* Bit ops
* GiST support functions
**************************************************/
Datum
@ -161,8 +161,6 @@ gbt_bit_consistent(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(retval);
}
Datum
gbt_bit_union(PG_FUNCTION_ARGS)
{
@ -173,7 +171,6 @@ gbt_bit_union(PG_FUNCTION_ARGS)
&tinfo, fcinfo->flinfo));
}
Datum
gbt_bit_picksplit(PG_FUNCTION_ARGS)
{
@ -196,7 +193,6 @@ gbt_bit_same(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(result);
}
Datum
gbt_bit_penalty(PG_FUNCTION_ARGS)
{
@ -207,3 +203,46 @@ gbt_bit_penalty(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(gbt_var_penalty(result, o, n, PG_GET_COLLATION(),
&tinfo, fcinfo->flinfo));
}
static int
gbt_bit_ssup_cmp(Datum x, Datum y, SortSupport ssup)
{
GBT_VARKEY *key1 = PG_DETOAST_DATUM(x);
GBT_VARKEY *key2 = PG_DETOAST_DATUM(y);
GBT_VARKEY_R arg1 = gbt_var_key_readable(key1);
GBT_VARKEY_R arg2 = gbt_var_key_readable(key2);
Datum result;
/* for leaf items we expect lower == upper, so only compare lower */
result = DirectFunctionCall2(byteacmp,
PointerGetDatum(arg1.lower),
PointerGetDatum(arg2.lower));
GBT_FREE_IF_COPY(key1, x);
GBT_FREE_IF_COPY(key2, y);
return DatumGetInt32(result);
}
Datum
gbt_bit_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_bit_ssup_cmp;
ssup->ssup_extra = NULL;
PG_RETURN_VOID();
}
Datum
gbt_varbit_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_bit_ssup_cmp;
ssup->ssup_extra = NULL;
PG_RETURN_VOID();
}
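
All of the gbt_*_sortsupport functions added by this commit follow the shape above: install a comparator that looks only at the lower bound, since leaf keys store lower == upper. They exist so that sorted GiST builds can feed leaf keys through tuplesort. A sketch of how such a comparator is consumed, assuming the stock SortSupport machinery; compare_leaf_keys is illustrative, not from the patch.

#include "postgres.h"
#include "utils/sortsupport.h"

/*
 * Illustrative caller: build code compares two non-NULL leaf datums via
 * the standard ApplySortComparator() wrapper, which ends up in
 * gbt_bit_ssup_cmp() once gbt_bit_sortsupport() has filled in ssup.
 */
static int
compare_leaf_keys(Datum a, Datum b, SortSupport ssup)
{
	return ApplySortComparator(a, false, b, false, ssup);
}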

View File

@ -5,6 +5,7 @@
#include "btree_gist.h"
#include "btree_utils_num.h"
#include "utils/sortsupport.h"
typedef struct boolkey
{
@ -12,9 +13,7 @@ typedef struct boolkey
bool upper;
} boolKEY;
/*
** bool ops
*/
/* GiST support functions */
PG_FUNCTION_INFO_V1(gbt_bool_compress);
PG_FUNCTION_INFO_V1(gbt_bool_fetch);
PG_FUNCTION_INFO_V1(gbt_bool_union);
@ -22,6 +21,7 @@ PG_FUNCTION_INFO_V1(gbt_bool_picksplit);
PG_FUNCTION_INFO_V1(gbt_bool_consistent);
PG_FUNCTION_INFO_V1(gbt_bool_penalty);
PG_FUNCTION_INFO_V1(gbt_bool_same);
PG_FUNCTION_INFO_V1(gbt_bool_sortsupport);
static bool
gbt_boolgt(const void *a, const void *b, FmgrInfo *flinfo)
@ -82,10 +82,9 @@ static const gbtree_ninfo tinfo =
/**************************************************
* bool ops
* GiST support functions
**************************************************/
Datum
gbt_bool_compress(PG_FUNCTION_ARGS)
{
@ -124,7 +123,6 @@ gbt_bool_consistent(PG_FUNCTION_ARGS)
GIST_LEAF(entry), &tinfo, fcinfo->flinfo));
}
Datum
gbt_bool_union(PG_FUNCTION_ARGS)
{
@ -135,7 +133,6 @@ gbt_bool_union(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(gbt_num_union(out, entryvec, &tinfo, fcinfo->flinfo));
}
Datum
gbt_bool_penalty(PG_FUNCTION_ARGS)
{
@ -166,3 +163,24 @@ gbt_bool_same(PG_FUNCTION_ARGS)
*result = gbt_num_same((void *) b1, (void *) b2, &tinfo, fcinfo->flinfo);
PG_RETURN_POINTER(result);
}
static int
gbt_bool_ssup_cmp(Datum x, Datum y, SortSupport ssup)
{
boolKEY *arg1 = (boolKEY *) DatumGetPointer(x);
boolKEY *arg2 = (boolKEY *) DatumGetPointer(y);
/* for leaf items we expect lower == upper, so only compare lower */
return (int32) arg1->lower - (int32) arg2->lower;
}
Datum
gbt_bool_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_bool_ssup_cmp;
ssup->ssup_extra = NULL;
PG_RETURN_VOID();
}

View File

@ -6,17 +6,16 @@
#include "btree_gist.h"
#include "btree_utils_var.h"
#include "utils/fmgrprotos.h"
#include "utils/sortsupport.h"
/*
** Bytea ops
*/
/* GiST support functions */
PG_FUNCTION_INFO_V1(gbt_bytea_compress);
PG_FUNCTION_INFO_V1(gbt_bytea_union);
PG_FUNCTION_INFO_V1(gbt_bytea_picksplit);
PG_FUNCTION_INFO_V1(gbt_bytea_consistent);
PG_FUNCTION_INFO_V1(gbt_bytea_penalty);
PG_FUNCTION_INFO_V1(gbt_bytea_same);
PG_FUNCTION_INFO_V1(gbt_bytea_sortsupport);
/* define for comparison */
@ -69,7 +68,6 @@ gbt_byteacmp(const void *a, const void *b, Oid collation, FmgrInfo *flinfo)
PointerGetDatum(b)));
}
static const gbtree_vinfo tinfo =
{
gbt_t_bytea,
@ -86,10 +84,9 @@ static const gbtree_vinfo tinfo =
/**************************************************
* Text ops
* GiST support functions
**************************************************/
Datum
gbt_bytea_compress(PG_FUNCTION_ARGS)
{
@ -98,8 +95,6 @@ gbt_bytea_compress(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(gbt_var_compress(entry, &tinfo));
}
Datum
gbt_bytea_consistent(PG_FUNCTION_ARGS)
{
@ -121,8 +116,6 @@ gbt_bytea_consistent(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(retval);
}
Datum
gbt_bytea_union(PG_FUNCTION_ARGS)
{
@ -133,7 +126,6 @@ gbt_bytea_union(PG_FUNCTION_ARGS)
&tinfo, fcinfo->flinfo));
}
Datum
gbt_bytea_picksplit(PG_FUNCTION_ARGS)
{
@ -156,7 +148,6 @@ gbt_bytea_same(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(result);
}
Datum
gbt_bytea_penalty(PG_FUNCTION_ARGS)
{
@ -167,3 +158,35 @@ gbt_bytea_penalty(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(gbt_var_penalty(result, o, n, PG_GET_COLLATION(),
&tinfo, fcinfo->flinfo));
}
static int
gbt_bytea_ssup_cmp(Datum x, Datum y, SortSupport ssup)
{
GBT_VARKEY *key1 = PG_DETOAST_DATUM(x);
GBT_VARKEY *key2 = PG_DETOAST_DATUM(y);
GBT_VARKEY_R xkey = gbt_var_key_readable(key1);
GBT_VARKEY_R ykey = gbt_var_key_readable(key2);
Datum result;
/* for leaf items we expect lower == upper, so only compare lower */
result = DirectFunctionCall2(byteacmp,
PointerGetDatum(xkey.lower),
PointerGetDatum(ykey.lower));
GBT_FREE_IF_COPY(key1, x);
GBT_FREE_IF_COPY(key2, y);
return DatumGetInt32(result);
}
Datum
gbt_bytea_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_bytea_ssup_cmp;
ssup->ssup_extra = NULL;
PG_RETURN_VOID();
}

View File

@ -7,6 +7,7 @@
#include "btree_utils_num.h"
#include "common/int.h"
#include "utils/cash.h"
#include "utils/sortsupport.h"
typedef struct
{
@ -14,9 +15,7 @@ typedef struct
Cash upper;
} cashKEY;
/*
** Cash ops
*/
/* GiST support functions */
PG_FUNCTION_INFO_V1(gbt_cash_compress);
PG_FUNCTION_INFO_V1(gbt_cash_fetch);
PG_FUNCTION_INFO_V1(gbt_cash_union);
@ -25,6 +24,7 @@ PG_FUNCTION_INFO_V1(gbt_cash_consistent);
PG_FUNCTION_INFO_V1(gbt_cash_distance);
PG_FUNCTION_INFO_V1(gbt_cash_penalty);
PG_FUNCTION_INFO_V1(gbt_cash_same);
PG_FUNCTION_INFO_V1(gbt_cash_sortsupport);
static bool
gbt_cashgt(const void *a, const void *b, FmgrInfo *flinfo)
@ -111,10 +111,10 @@ cash_dist(PG_FUNCTION_ARGS)
PG_RETURN_CASH(ra);
}
/**************************************************
* Cash ops
**************************************************/
/**************************************************
* GiST support functions
**************************************************/
Datum
gbt_cash_compress(PG_FUNCTION_ARGS)
@ -155,7 +155,6 @@ gbt_cash_consistent(PG_FUNCTION_ARGS)
fcinfo->flinfo));
}
Datum
gbt_cash_distance(PG_FUNCTION_ARGS)
{
@ -173,7 +172,6 @@ gbt_cash_distance(PG_FUNCTION_ARGS)
&tinfo, fcinfo->flinfo));
}
Datum
gbt_cash_union(PG_FUNCTION_ARGS)
{
@ -184,7 +182,6 @@ gbt_cash_union(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(gbt_num_union(out, entryvec, &tinfo, fcinfo->flinfo));
}
Datum
gbt_cash_penalty(PG_FUNCTION_ARGS)
{
@ -215,3 +212,29 @@ gbt_cash_same(PG_FUNCTION_ARGS)
*result = gbt_num_same((void *) b1, (void *) b2, &tinfo, fcinfo->flinfo);
PG_RETURN_POINTER(result);
}
static int
gbt_cash_ssup_cmp(Datum x, Datum y, SortSupport ssup)
{
cashKEY *arg1 = (cashKEY *) DatumGetPointer(x);
cashKEY *arg2 = (cashKEY *) DatumGetPointer(y);
/* for leaf items we expect lower == upper, so only compare lower */
if (arg1->lower > arg2->lower)
return 1;
else if (arg1->lower < arg2->lower)
return -1;
else
return 0;
}
Datum
gbt_cash_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_cash_ssup_cmp;
ssup->ssup_extra = NULL;
PG_RETURN_VOID();
}
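
Note the branching three-way compare rather than returning a subtraction: for 64-bit Cash values the difference can overflow and flip sign, so only narrow operands (as in the bool comparator earlier) can safely subtract. An overflow-proof idiom, offered as an editorial aside rather than code from the patch:

/*
 * Editorial sketch: (a > b) - (a < b) always yields the correct sign,
 * with no intermediate value that can overflow.
 */
static int
cmp_int64_safe(int64 a, int64 b)
{
	return (a > b) - (a < b);
}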

View File

@ -7,6 +7,7 @@
#include "btree_utils_num.h"
#include "utils/fmgrprotos.h"
#include "utils/date.h"
#include "utils/sortsupport.h"
typedef struct
{
@ -14,9 +15,7 @@ typedef struct
DateADT upper;
} dateKEY;
/*
** date ops
*/
/* GiST support functions */
PG_FUNCTION_INFO_V1(gbt_date_compress);
PG_FUNCTION_INFO_V1(gbt_date_fetch);
PG_FUNCTION_INFO_V1(gbt_date_union);
@ -25,6 +24,7 @@ PG_FUNCTION_INFO_V1(gbt_date_consistent);
PG_FUNCTION_INFO_V1(gbt_date_distance);
PG_FUNCTION_INFO_V1(gbt_date_penalty);
PG_FUNCTION_INFO_V1(gbt_date_same);
PG_FUNCTION_INFO_V1(gbt_date_sortsupport);
static bool
gbt_dategt(const void *a, const void *b, FmgrInfo *flinfo)
@ -128,11 +128,9 @@ date_dist(PG_FUNCTION_ARGS)
/**************************************************
* date ops
* GiST support functions
**************************************************/
Datum
gbt_date_compress(PG_FUNCTION_ARGS)
{
@ -172,7 +170,6 @@ gbt_date_consistent(PG_FUNCTION_ARGS)
fcinfo->flinfo));
}
Datum
gbt_date_distance(PG_FUNCTION_ARGS)
{
@ -190,7 +187,6 @@ gbt_date_distance(PG_FUNCTION_ARGS)
&tinfo, fcinfo->flinfo));
}
Datum
gbt_date_union(PG_FUNCTION_ARGS)
{
@ -201,7 +197,6 @@ gbt_date_union(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(gbt_num_union(out, entryvec, &tinfo, fcinfo->flinfo));
}
Datum
gbt_date_penalty(PG_FUNCTION_ARGS)
{
@ -238,7 +233,6 @@ gbt_date_penalty(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(result);
}
Datum
gbt_date_picksplit(PG_FUNCTION_ARGS)
{
@ -257,3 +251,26 @@ gbt_date_same(PG_FUNCTION_ARGS)
*result = gbt_num_same((void *) b1, (void *) b2, &tinfo, fcinfo->flinfo);
PG_RETURN_POINTER(result);
}
static int
gbt_date_ssup_cmp(Datum x, Datum y, SortSupport ssup)
{
dateKEY *akey = (dateKEY *) DatumGetPointer(x);
dateKEY *bkey = (dateKEY *) DatumGetPointer(y);
/* for leaf items we expect lower == upper, so only compare lower */
return DatumGetInt32(DirectFunctionCall2(date_cmp,
DateADTGetDatum(akey->lower),
DateADTGetDatum(bkey->lower)));
}
Datum
gbt_date_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_date_ssup_cmp;
ssup->ssup_extra = NULL;
PG_RETURN_VOID();
}

View File

@ -7,6 +7,8 @@
#include "btree_utils_num.h"
#include "fmgr.h"
#include "utils/fmgrprotos.h"
#include "utils/fmgroids.h"
#include "utils/sortsupport.h"
/* enums are really Oids, so we just use the same structure */
@ -16,9 +18,7 @@ typedef struct
Oid upper;
} oidKEY;
/*
** enum ops
*/
/* GiST support functions */
PG_FUNCTION_INFO_V1(gbt_enum_compress);
PG_FUNCTION_INFO_V1(gbt_enum_fetch);
PG_FUNCTION_INFO_V1(gbt_enum_union);
@ -26,6 +26,7 @@ PG_FUNCTION_INFO_V1(gbt_enum_picksplit);
PG_FUNCTION_INFO_V1(gbt_enum_consistent);
PG_FUNCTION_INFO_V1(gbt_enum_penalty);
PG_FUNCTION_INFO_V1(gbt_enum_same);
PG_FUNCTION_INFO_V1(gbt_enum_sortsupport);
static bool
@ -99,10 +100,9 @@ static const gbtree_ninfo tinfo =
/**************************************************
* Enum ops
* GiST support functions
**************************************************/
Datum
gbt_enum_compress(PG_FUNCTION_ARGS)
{
@ -152,7 +152,6 @@ gbt_enum_union(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(gbt_num_union(out, entryvec, &tinfo, fcinfo->flinfo));
}
Datum
gbt_enum_penalty(PG_FUNCTION_ARGS)
{
@ -183,3 +182,39 @@ gbt_enum_same(PG_FUNCTION_ARGS)
*result = gbt_num_same((void *) b1, (void *) b2, &tinfo, fcinfo->flinfo);
PG_RETURN_POINTER(result);
}
static int
gbt_enum_ssup_cmp(Datum x, Datum y, SortSupport ssup)
{
oidKEY *arg1 = (oidKEY *) DatumGetPointer(x);
oidKEY *arg2 = (oidKEY *) DatumGetPointer(y);
/* for leaf items we expect lower == upper, so only compare lower */
return DatumGetInt32(CallerFInfoFunctionCall2(enum_cmp,
ssup->ssup_extra,
InvalidOid,
arg1->lower,
arg2->lower));
}
Datum
gbt_enum_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
FmgrInfo *flinfo;
ssup->comparator = gbt_enum_ssup_cmp;
/*
* Since gbt_enum_ssup_cmp() uses enum_cmp() like the rest of the
* comparison functions, it also needs to pass flinfo when calling it. The
* caller to a SortSupport comparison function doesn't provide an FmgrInfo
* struct, so look it up now, save it in ssup_extra and use it in
* gbt_enum_ssup_cmp() later.
*/
flinfo = MemoryContextAlloc(ssup->ssup_cxt, sizeof(FmgrInfo));
fmgr_info_cxt(F_ENUM_CMP, flinfo, ssup->ssup_cxt);
ssup->ssup_extra = flinfo;
PG_RETURN_VOID();
}
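
This is the one comparator in the set that needs runtime state: enum_cmp() wants an FmgrInfo, so it is allocated once in the sort's memory context and stashed in ssup_extra. The same pattern, reduced to a skeleton for any comparator needing per-sort state; the my_* names are illustrative, not from the patch.

#include "postgres.h"
#include "fmgr.h"
#include "utils/fmgroids.h"
#include "utils/sortsupport.h"

static int
my_ssup_cmp(Datum x, Datum y, SortSupport ssup)
{
	FmgrInfo   *flinfo = (FmgrInfo *) ssup->ssup_extra;	/* cached below */

	return DatumGetInt32(FunctionCall2(flinfo, x, y));
}

Datum
my_sortsupport(PG_FUNCTION_ARGS)
{
	SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
	FmgrInfo   *flinfo;

	/* allocate in ssup_cxt so the cache lives exactly as long as the sort */
	flinfo = MemoryContextAlloc(ssup->ssup_cxt, sizeof(FmgrInfo));
	fmgr_info_cxt(F_ENUM_CMP, flinfo, ssup->ssup_cxt);
	ssup->ssup_extra = flinfo;
	ssup->comparator = my_ssup_cmp;
	PG_RETURN_VOID();
}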

View File

@ -6,6 +6,7 @@
#include "btree_gist.h"
#include "btree_utils_num.h"
#include "utils/float.h"
#include "utils/sortsupport.h"
typedef struct float4key
{
@ -13,9 +14,7 @@ typedef struct float4key
float4 upper;
} float4KEY;
/*
** float4 ops
*/
/* GiST support functions */
PG_FUNCTION_INFO_V1(gbt_float4_compress);
PG_FUNCTION_INFO_V1(gbt_float4_fetch);
PG_FUNCTION_INFO_V1(gbt_float4_union);
@ -24,6 +23,7 @@ PG_FUNCTION_INFO_V1(gbt_float4_consistent);
PG_FUNCTION_INFO_V1(gbt_float4_distance);
PG_FUNCTION_INFO_V1(gbt_float4_penalty);
PG_FUNCTION_INFO_V1(gbt_float4_same);
PG_FUNCTION_INFO_V1(gbt_float4_sortsupport);
static bool
gbt_float4gt(const void *a, const void *b, FmgrInfo *flinfo)
@ -107,10 +107,9 @@ float4_dist(PG_FUNCTION_ARGS)
/**************************************************
* float4 ops
* GiST support functions
**************************************************/
Datum
gbt_float4_compress(PG_FUNCTION_ARGS)
{
@ -150,7 +149,6 @@ gbt_float4_consistent(PG_FUNCTION_ARGS)
fcinfo->flinfo));
}
Datum
gbt_float4_distance(PG_FUNCTION_ARGS)
{
@ -168,7 +166,6 @@ gbt_float4_distance(PG_FUNCTION_ARGS)
&tinfo, fcinfo->flinfo));
}
Datum
gbt_float4_union(PG_FUNCTION_ARGS)
{
@ -179,7 +176,6 @@ gbt_float4_union(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(gbt_num_union(out, entryvec, &tinfo, fcinfo->flinfo));
}
Datum
gbt_float4_penalty(PG_FUNCTION_ARGS)
{
@ -210,3 +206,24 @@ gbt_float4_same(PG_FUNCTION_ARGS)
*result = gbt_num_same((void *) b1, (void *) b2, &tinfo, fcinfo->flinfo);
PG_RETURN_POINTER(result);
}
static int
gbt_float4_ssup_cmp(Datum x, Datum y, SortSupport ssup)
{
float4KEY *arg1 = (float4KEY *) DatumGetPointer(x);
float4KEY *arg2 = (float4KEY *) DatumGetPointer(y);
/* for leaf items we expect lower == upper, so only compare lower */
return float4_cmp_internal(arg1->lower, arg2->lower);
}
Datum
gbt_float4_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_float4_ssup_cmp;
ssup->ssup_extra = NULL;
PG_RETURN_VOID();
}

View File

@ -6,6 +6,7 @@
#include "btree_gist.h"
#include "btree_utils_num.h"
#include "utils/float.h"
#include "utils/sortsupport.h"
typedef struct float8key
{
@ -13,9 +14,7 @@ typedef struct float8key
float8 upper;
} float8KEY;
/*
** float8 ops
*/
/* GiST support functions */
PG_FUNCTION_INFO_V1(gbt_float8_compress);
PG_FUNCTION_INFO_V1(gbt_float8_fetch);
PG_FUNCTION_INFO_V1(gbt_float8_union);
@ -24,6 +23,7 @@ PG_FUNCTION_INFO_V1(gbt_float8_consistent);
PG_FUNCTION_INFO_V1(gbt_float8_distance);
PG_FUNCTION_INFO_V1(gbt_float8_penalty);
PG_FUNCTION_INFO_V1(gbt_float8_same);
PG_FUNCTION_INFO_V1(gbt_float8_sortsupport);
static bool
@ -113,10 +113,10 @@ float8_dist(PG_FUNCTION_ARGS)
PG_RETURN_FLOAT8(fabs(r));
}
/**************************************************
* float8 ops
**************************************************/
/**************************************************
* GiST support functions
**************************************************/
Datum
gbt_float8_compress(PG_FUNCTION_ARGS)
@ -157,7 +157,6 @@ gbt_float8_consistent(PG_FUNCTION_ARGS)
fcinfo->flinfo));
}
Datum
gbt_float8_distance(PG_FUNCTION_ARGS)
{
@ -175,7 +174,6 @@ gbt_float8_distance(PG_FUNCTION_ARGS)
&tinfo, fcinfo->flinfo));
}
Datum
gbt_float8_union(PG_FUNCTION_ARGS)
{
@ -186,7 +184,6 @@ gbt_float8_union(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(gbt_num_union(out, entryvec, &tinfo, fcinfo->flinfo));
}
Datum
gbt_float8_penalty(PG_FUNCTION_ARGS)
{
@ -217,3 +214,24 @@ gbt_float8_same(PG_FUNCTION_ARGS)
*result = gbt_num_same((void *) b1, (void *) b2, &tinfo, fcinfo->flinfo);
PG_RETURN_POINTER(result);
}
static int
gbt_float8_ssup_cmp(Datum x, Datum y, SortSupport ssup)
{
float8KEY *arg1 = (float8KEY *) DatumGetPointer(x);
float8KEY *arg2 = (float8KEY *) DatumGetPointer(y);
/* for leaf items we expect lower == upper, so only compare lower */
return float8_cmp_internal(arg1->lower, arg2->lower);
}
Datum
gbt_float8_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_float8_ssup_cmp;
ssup->ssup_extra = NULL;
PG_RETURN_VOID();
}

View File

@ -9,79 +9,79 @@ AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
ALTER OPERATOR FAMILY gist_oid_ops USING gist ADD
FUNCTION 12 (oid, oid) gist_stratnum_btree (int) ;
FUNCTION 12 ("any", "any") gist_stratnum_btree (int) ;
ALTER OPERATOR FAMILY gist_int2_ops USING gist ADD
FUNCTION 12 (int2, int2) gist_stratnum_btree (int) ;
FUNCTION 12 ("any", "any") gist_stratnum_btree (int) ;
ALTER OPERATOR FAMILY gist_int4_ops USING gist ADD
FUNCTION 12 (int4, int4) gist_stratnum_btree (int) ;
FUNCTION 12 ("any", "any") gist_stratnum_btree (int) ;
ALTER OPERATOR FAMILY gist_int8_ops USING gist ADD
FUNCTION 12 (int8, int8) gist_stratnum_btree (int) ;
FUNCTION 12 ("any", "any") gist_stratnum_btree (int) ;
ALTER OPERATOR FAMILY gist_float4_ops USING gist ADD
FUNCTION 12 (float4, float4) gist_stratnum_btree (int) ;
FUNCTION 12 ("any", "any") gist_stratnum_btree (int) ;
ALTER OPERATOR FAMILY gist_float8_ops USING gist ADD
FUNCTION 12 (float8, float8) gist_stratnum_btree (int) ;
FUNCTION 12 ("any", "any") gist_stratnum_btree (int) ;
ALTER OPERATOR FAMILY gist_timestamp_ops USING gist ADD
FUNCTION 12 (timestamp, timestamp) gist_stratnum_btree (int) ;
FUNCTION 12 ("any", "any") gist_stratnum_btree (int) ;
ALTER OPERATOR FAMILY gist_timestamptz_ops USING gist ADD
FUNCTION 12 (timestamptz, timestamptz) gist_stratnum_btree (int) ;
FUNCTION 12 ("any", "any") gist_stratnum_btree (int) ;
ALTER OPERATOR FAMILY gist_time_ops USING gist ADD
FUNCTION 12 (time, time) gist_stratnum_btree (int) ;
FUNCTION 12 ("any", "any") gist_stratnum_btree (int) ;
ALTER OPERATOR FAMILY gist_date_ops USING gist ADD
FUNCTION 12 (date, date) gist_stratnum_btree (int) ;
FUNCTION 12 ("any", "any") gist_stratnum_btree (int) ;
ALTER OPERATOR FAMILY gist_interval_ops USING gist ADD
FUNCTION 12 (interval, interval) gist_stratnum_btree (int) ;
FUNCTION 12 ("any", "any") gist_stratnum_btree (int) ;
ALTER OPERATOR FAMILY gist_cash_ops USING gist ADD
FUNCTION 12 (money, money) gist_stratnum_btree (int) ;
FUNCTION 12 ("any", "any") gist_stratnum_btree (int) ;
ALTER OPERATOR FAMILY gist_macaddr_ops USING gist ADD
FUNCTION 12 (macaddr, macaddr) gist_stratnum_btree (int) ;
FUNCTION 12 ("any", "any") gist_stratnum_btree (int) ;
ALTER OPERATOR FAMILY gist_text_ops USING gist ADD
FUNCTION 12 (text, text) gist_stratnum_btree (int) ;
FUNCTION 12 ("any", "any") gist_stratnum_btree (int) ;
ALTER OPERATOR FAMILY gist_bpchar_ops USING gist ADD
FUNCTION 12 (bpchar, bpchar) gist_stratnum_btree (int) ;
FUNCTION 12 ("any", "any") gist_stratnum_btree (int) ;
ALTER OPERATOR FAMILY gist_bytea_ops USING gist ADD
FUNCTION 12 (bytea, bytea) gist_stratnum_btree (int) ;
FUNCTION 12 ("any", "any") gist_stratnum_btree (int) ;
ALTER OPERATOR FAMILY gist_numeric_ops USING gist ADD
FUNCTION 12 (numeric, numeric) gist_stratnum_btree (int) ;
FUNCTION 12 ("any", "any") gist_stratnum_btree (int) ;
ALTER OPERATOR FAMILY gist_bit_ops USING gist ADD
FUNCTION 12 (bit, bit) gist_stratnum_btree (int) ;
FUNCTION 12 ("any", "any") gist_stratnum_btree (int) ;
ALTER OPERATOR FAMILY gist_vbit_ops USING gist ADD
FUNCTION 12 (varbit, varbit) gist_stratnum_btree (int) ;
FUNCTION 12 ("any", "any") gist_stratnum_btree (int) ;
ALTER OPERATOR FAMILY gist_inet_ops USING gist ADD
FUNCTION 12 (inet, inet) gist_stratnum_btree (int) ;
FUNCTION 12 ("any", "any") gist_stratnum_btree (int) ;
ALTER OPERATOR FAMILY gist_cidr_ops USING gist ADD
FUNCTION 12 (cidr, cidr) gist_stratnum_btree (int) ;
FUNCTION 12 ("any", "any") gist_stratnum_btree (int) ;
ALTER OPERATOR FAMILY gist_timetz_ops USING gist ADD
FUNCTION 12 (timetz, timetz) gist_stratnum_btree (int) ;
FUNCTION 12 ("any", "any") gist_stratnum_btree (int) ;
ALTER OPERATOR FAMILY gist_uuid_ops USING gist ADD
FUNCTION 12 (uuid, uuid) gist_stratnum_btree (int) ;
FUNCTION 12 ("any", "any") gist_stratnum_btree (int) ;
ALTER OPERATOR FAMILY gist_macaddr8_ops USING gist ADD
FUNCTION 12 (macaddr8, macaddr8) gist_stratnum_btree (int) ;
FUNCTION 12 ("any", "any") gist_stratnum_btree (int) ;
ALTER OPERATOR FAMILY gist_enum_ops USING gist ADD
FUNCTION 12 (anyenum, anyenum) gist_stratnum_btree (int) ;
FUNCTION 12 ("any", "any") gist_stratnum_btree (int) ;
ALTER OPERATOR FAMILY gist_bool_ops USING gist ADD
FUNCTION 12 (bool, bool) gist_stratnum_btree (int) ;
FUNCTION 12 ("any", "any") gist_stratnum_btree (int) ;

View File

@ -0,0 +1,197 @@
/* contrib/btree_gist/btree_gist--1.8--1.9.sql */
-- complain if script is sourced in psql, rather than via CREATE EXTENSION
\echo Use "ALTER EXTENSION btree_gist UPDATE TO '1.9'" to load this file. \quit
CREATE FUNCTION gbt_bit_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_varbit_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_bool_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_bytea_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_cash_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_date_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_enum_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_float4_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_float8_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_inet_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_int2_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_int4_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_int8_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_intv_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_macaddr_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_macad8_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_numeric_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_oid_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_text_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_bpchar_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_time_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_ts_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_uuid_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
ALTER OPERATOR FAMILY gist_bit_ops USING gist ADD
FUNCTION 11 (bit, bit) gbt_bit_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_vbit_ops USING gist ADD
FUNCTION 11 (varbit, varbit) gbt_varbit_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_bool_ops USING gist ADD
FUNCTION 11 (bool, bool) gbt_bool_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_bytea_ops USING gist ADD
FUNCTION 11 (bytea, bytea) gbt_bytea_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_cash_ops USING gist ADD
FUNCTION 11 (money, money) gbt_cash_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_date_ops USING gist ADD
FUNCTION 11 (date, date) gbt_date_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_enum_ops USING gist ADD
FUNCTION 11 (anyenum, anyenum) gbt_enum_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_float4_ops USING gist ADD
FUNCTION 11 (float4, float4) gbt_float4_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_float8_ops USING gist ADD
FUNCTION 11 (float8, float8) gbt_float8_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_inet_ops USING gist ADD
FUNCTION 11 (inet, inet) gbt_inet_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_cidr_ops USING gist ADD
FUNCTION 11 (cidr, cidr) gbt_inet_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_int2_ops USING gist ADD
FUNCTION 11 (int2, int2) gbt_int2_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_int4_ops USING gist ADD
FUNCTION 11 (int4, int4) gbt_int4_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_int8_ops USING gist ADD
FUNCTION 11 (int8, int8) gbt_int8_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_interval_ops USING gist ADD
FUNCTION 11 (interval, interval) gbt_intv_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_macaddr_ops USING gist ADD
FUNCTION 11 (macaddr, macaddr) gbt_macaddr_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_macaddr8_ops USING gist ADD
FUNCTION 11 (macaddr8, macaddr8) gbt_macad8_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_numeric_ops USING gist ADD
FUNCTION 11 (numeric, numeric) gbt_numeric_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_oid_ops USING gist ADD
FUNCTION 11 (oid, oid) gbt_oid_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_text_ops USING gist ADD
FUNCTION 11 (text, text) gbt_text_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_bpchar_ops USING gist ADD
FUNCTION 11 (bpchar, bpchar) gbt_bpchar_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_time_ops USING gist ADD
FUNCTION 11 (time, time) gbt_time_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_timetz_ops USING gist ADD
FUNCTION 11 (timetz, timetz) gbt_time_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_timestamp_ops USING gist ADD
FUNCTION 11 (timestamp, timestamp) gbt_ts_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_timestamptz_ops USING gist ADD
FUNCTION 11 (timestamptz, timestamptz) gbt_ts_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_uuid_ops USING gist ADD
FUNCTION 11 (uuid, uuid) gbt_uuid_sortsupport (internal) ;

View File

@ -7,7 +7,10 @@
#include "access/stratnum.h"
#include "utils/builtins.h"
PG_MODULE_MAGIC;
PG_MODULE_MAGIC_EXT(
.name = "btree_gist",
.version = PG_VERSION
);
PG_FUNCTION_INFO_V1(gbt_decompress);
PG_FUNCTION_INFO_V1(gbtreekey_in);

View File

@ -1,6 +1,6 @@
# btree_gist extension
comment = 'support for indexing common datatypes in GiST'
default_version = '1.8'
default_version = '1.9'
module_pathname = '$libdir/btree_gist'
relocatable = true
trusted = true

View File

@ -7,6 +7,7 @@
#include "btree_utils_num.h"
#include "catalog/pg_type.h"
#include "utils/builtins.h"
#include "utils/sortsupport.h"
typedef struct inetkey
{
@ -14,15 +15,14 @@ typedef struct inetkey
double upper;
} inetKEY;
/*
** inet ops
*/
/* GiST support functions */
PG_FUNCTION_INFO_V1(gbt_inet_compress);
PG_FUNCTION_INFO_V1(gbt_inet_union);
PG_FUNCTION_INFO_V1(gbt_inet_picksplit);
PG_FUNCTION_INFO_V1(gbt_inet_consistent);
PG_FUNCTION_INFO_V1(gbt_inet_penalty);
PG_FUNCTION_INFO_V1(gbt_inet_same);
PG_FUNCTION_INFO_V1(gbt_inet_sortsupport);
static bool
@ -85,10 +85,9 @@ static const gbtree_ninfo tinfo =
/**************************************************
* inet ops
* GiST support functions
**************************************************/
Datum
gbt_inet_compress(PG_FUNCTION_ARGS)
{
@ -114,7 +113,6 @@ gbt_inet_compress(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(retval);
}
Datum
gbt_inet_consistent(PG_FUNCTION_ARGS)
{
@ -142,7 +140,6 @@ gbt_inet_consistent(PG_FUNCTION_ARGS)
&strategy, GIST_LEAF(entry), &tinfo, fcinfo->flinfo));
}
Datum
gbt_inet_union(PG_FUNCTION_ARGS)
{
@ -153,7 +150,6 @@ gbt_inet_union(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(gbt_num_union(out, entryvec, &tinfo, fcinfo->flinfo));
}
Datum
gbt_inet_penalty(PG_FUNCTION_ARGS)
{
@ -184,3 +180,29 @@ gbt_inet_same(PG_FUNCTION_ARGS)
*result = gbt_num_same((void *) b1, (void *) b2, &tinfo, fcinfo->flinfo);
PG_RETURN_POINTER(result);
}
static int
gbt_inet_ssup_cmp(Datum x, Datum y, SortSupport ssup)
{
inetKEY *arg1 = (inetKEY *) DatumGetPointer(x);
inetKEY *arg2 = (inetKEY *) DatumGetPointer(y);
/* for leaf items we expect lower == upper, so only compare lower */
if (arg1->lower < arg2->lower)
return -1;
else if (arg1->lower > arg2->lower)
return 1;
else
return 0;
}
Datum
gbt_inet_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_inet_ssup_cmp;
ssup->ssup_extra = NULL;
PG_RETURN_VOID();
}

View File

@ -6,6 +6,7 @@
#include "btree_gist.h"
#include "btree_utils_num.h"
#include "common/int.h"
#include "utils/sortsupport.h"
typedef struct int16key
{
@ -13,9 +14,7 @@ typedef struct int16key
int16 upper;
} int16KEY;
/*
** int16 ops
*/
/* GiST support functions */
PG_FUNCTION_INFO_V1(gbt_int2_compress);
PG_FUNCTION_INFO_V1(gbt_int2_fetch);
PG_FUNCTION_INFO_V1(gbt_int2_union);
@ -24,6 +23,8 @@ PG_FUNCTION_INFO_V1(gbt_int2_consistent);
PG_FUNCTION_INFO_V1(gbt_int2_distance);
PG_FUNCTION_INFO_V1(gbt_int2_penalty);
PG_FUNCTION_INFO_V1(gbt_int2_same);
PG_FUNCTION_INFO_V1(gbt_int2_sortsupport);
static bool
gbt_int2gt(const void *a, const void *b, FmgrInfo *flinfo)
@ -112,10 +113,9 @@ int2_dist(PG_FUNCTION_ARGS)
/**************************************************
* int16 ops
* GiST support functions
**************************************************/
Datum
gbt_int2_compress(PG_FUNCTION_ARGS)
{
@ -154,7 +154,6 @@ gbt_int2_consistent(PG_FUNCTION_ARGS)
GIST_LEAF(entry), &tinfo, fcinfo->flinfo));
}
Datum
gbt_int2_distance(PG_FUNCTION_ARGS)
{
@ -172,7 +171,6 @@ gbt_int2_distance(PG_FUNCTION_ARGS)
&tinfo, fcinfo->flinfo));
}
Datum
gbt_int2_union(PG_FUNCTION_ARGS)
{
@ -183,7 +181,6 @@ gbt_int2_union(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(gbt_num_union(out, entryvec, &tinfo, fcinfo->flinfo));
}
Datum
gbt_int2_penalty(PG_FUNCTION_ARGS)
{
@ -214,3 +211,27 @@ gbt_int2_same(PG_FUNCTION_ARGS)
*result = gbt_num_same((void *) b1, (void *) b2, &tinfo, fcinfo->flinfo);
PG_RETURN_POINTER(result);
}
static int
gbt_int2_ssup_cmp(Datum x, Datum y, SortSupport ssup)
{
int16KEY *arg1 = (int16KEY *) DatumGetPointer(x);
int16KEY *arg2 = (int16KEY *) DatumGetPointer(y);
/* for leaf items we expect lower == upper, so only compare lower */
if (arg1->lower < arg2->lower)
return -1;
else if (arg1->lower > arg2->lower)
return 1;
else
return 0;
}
Datum
gbt_int2_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_int2_ssup_cmp;
PG_RETURN_VOID();
}

View File

@ -2,10 +2,10 @@
* contrib/btree_gist/btree_int4.c
*/
#include "postgres.h"
#include "btree_gist.h"
#include "btree_utils_num.h"
#include "common/int.h"
#include "utils/sortsupport.h"
typedef struct int32key
{
@ -13,9 +13,7 @@ typedef struct int32key
int32 upper;
} int32KEY;
/*
** int32 ops
*/
/* GiST support functions */
PG_FUNCTION_INFO_V1(gbt_int4_compress);
PG_FUNCTION_INFO_V1(gbt_int4_fetch);
PG_FUNCTION_INFO_V1(gbt_int4_union);
@ -24,7 +22,7 @@ PG_FUNCTION_INFO_V1(gbt_int4_consistent);
PG_FUNCTION_INFO_V1(gbt_int4_distance);
PG_FUNCTION_INFO_V1(gbt_int4_penalty);
PG_FUNCTION_INFO_V1(gbt_int4_same);
PG_FUNCTION_INFO_V1(gbt_int4_sortsupport);
static bool
gbt_int4gt(const void *a, const void *b, FmgrInfo *flinfo)
@ -113,10 +111,9 @@ int4_dist(PG_FUNCTION_ARGS)
/**************************************************
* int32 ops
* GiST support functions
**************************************************/
Datum
gbt_int4_compress(PG_FUNCTION_ARGS)
{
@ -155,7 +152,6 @@ gbt_int4_consistent(PG_FUNCTION_ARGS)
GIST_LEAF(entry), &tinfo, fcinfo->flinfo));
}
Datum
gbt_int4_distance(PG_FUNCTION_ARGS)
{
@ -173,7 +169,6 @@ gbt_int4_distance(PG_FUNCTION_ARGS)
&tinfo, fcinfo->flinfo));
}
Datum
gbt_int4_union(PG_FUNCTION_ARGS)
{
@ -184,7 +179,6 @@ gbt_int4_union(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(gbt_num_union(out, entryvec, &tinfo, fcinfo->flinfo));
}
Datum
gbt_int4_penalty(PG_FUNCTION_ARGS)
{
@ -215,3 +209,27 @@ gbt_int4_same(PG_FUNCTION_ARGS)
*result = gbt_num_same((void *) b1, (void *) b2, &tinfo, fcinfo->flinfo);
PG_RETURN_POINTER(result);
}
static int
gbt_int4_ssup_cmp(Datum a, Datum b, SortSupport ssup)
{
int32KEY *ia = (int32KEY *) DatumGetPointer(a);
int32KEY *ib = (int32KEY *) DatumGetPointer(b);
/* for leaf items we expect lower == upper, so only compare lower */
if (ia->lower < ib->lower)
return -1;
else if (ia->lower > ib->lower)
return 1;
else
return 0;
}
Datum
gbt_int4_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_int4_ssup_cmp;
PG_RETURN_VOID();
}

View File

@ -6,6 +6,7 @@
#include "btree_gist.h"
#include "btree_utils_num.h"
#include "common/int.h"
#include "utils/sortsupport.h"
typedef struct int64key
{
@ -13,9 +14,7 @@ typedef struct int64key
int64 upper;
} int64KEY;
/*
** int64 ops
*/
/* GiST support functions */
PG_FUNCTION_INFO_V1(gbt_int8_compress);
PG_FUNCTION_INFO_V1(gbt_int8_fetch);
PG_FUNCTION_INFO_V1(gbt_int8_union);
@ -24,6 +23,7 @@ PG_FUNCTION_INFO_V1(gbt_int8_consistent);
PG_FUNCTION_INFO_V1(gbt_int8_distance);
PG_FUNCTION_INFO_V1(gbt_int8_penalty);
PG_FUNCTION_INFO_V1(gbt_int8_same);
PG_FUNCTION_INFO_V1(gbt_int8_sortsupport);
static bool
@ -113,10 +113,9 @@ int8_dist(PG_FUNCTION_ARGS)
/**************************************************
* int64 ops
* GiST support functions
**************************************************/
Datum
gbt_int8_compress(PG_FUNCTION_ARGS)
{
@ -155,7 +154,6 @@ gbt_int8_consistent(PG_FUNCTION_ARGS)
GIST_LEAF(entry), &tinfo, fcinfo->flinfo));
}
Datum
gbt_int8_distance(PG_FUNCTION_ARGS)
{
@ -173,7 +171,6 @@ gbt_int8_distance(PG_FUNCTION_ARGS)
&tinfo, fcinfo->flinfo));
}
Datum
gbt_int8_union(PG_FUNCTION_ARGS)
{
@ -184,7 +181,6 @@ gbt_int8_union(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(gbt_num_union(out, entryvec, &tinfo, fcinfo->flinfo));
}
Datum
gbt_int8_penalty(PG_FUNCTION_ARGS)
{
@ -215,3 +211,28 @@ gbt_int8_same(PG_FUNCTION_ARGS)
*result = gbt_num_same((void *) b1, (void *) b2, &tinfo, fcinfo->flinfo);
PG_RETURN_POINTER(result);
}
static int
gbt_int8_ssup_cmp(Datum x, Datum y, SortSupport ssup)
{
int64KEY *arg1 = (int64KEY *) DatumGetPointer(x);
int64KEY *arg2 = (int64KEY *) DatumGetPointer(y);
/* for leaf items we expect lower == upper, so only compare lower */
if (arg1->lower < arg2->lower)
return -1;
else if (arg1->lower > arg2->lower)
return 1;
else
return 0;
}
Datum
gbt_int8_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_int8_ssup_cmp;
PG_RETURN_VOID();
}

View File

@ -6,6 +6,7 @@
#include "btree_gist.h"
#include "btree_utils_num.h"
#include "utils/fmgrprotos.h"
#include "utils/sortsupport.h"
#include "utils/timestamp.h"
typedef struct
@ -14,10 +15,7 @@ typedef struct
upper;
} intvKEY;
/*
** Interval ops
*/
/* GiST support functions */
PG_FUNCTION_INFO_V1(gbt_intv_compress);
PG_FUNCTION_INFO_V1(gbt_intv_fetch);
PG_FUNCTION_INFO_V1(gbt_intv_decompress);
@ -27,6 +25,7 @@ PG_FUNCTION_INFO_V1(gbt_intv_consistent);
PG_FUNCTION_INFO_V1(gbt_intv_distance);
PG_FUNCTION_INFO_V1(gbt_intv_penalty);
PG_FUNCTION_INFO_V1(gbt_intv_same);
PG_FUNCTION_INFO_V1(gbt_intv_sortsupport);
static bool
@ -137,10 +136,9 @@ interval_dist(PG_FUNCTION_ARGS)
/**************************************************
* interval ops
* GiST support functions
**************************************************/
Datum
gbt_intv_compress(PG_FUNCTION_ARGS)
{
@ -295,3 +293,26 @@ gbt_intv_same(PG_FUNCTION_ARGS)
*result = gbt_num_same((void *) b1, (void *) b2, &tinfo, fcinfo->flinfo);
PG_RETURN_POINTER(result);
}
static int
gbt_intv_ssup_cmp(Datum x, Datum y, SortSupport ssup)
{
intvKEY *arg1 = (intvKEY *) DatumGetPointer(x);
intvKEY *arg2 = (intvKEY *) DatumGetPointer(y);
/* for leaf items we expect lower == upper, so only compare lower */
return DatumGetInt32(DirectFunctionCall2(interval_cmp,
IntervalPGetDatum(&arg1->lower),
IntervalPGetDatum(&arg2->lower)));
}
Datum
gbt_intv_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_intv_ssup_cmp;
ssup->ssup_extra = NULL;
PG_RETURN_VOID();
}

View File

@ -7,6 +7,7 @@
#include "btree_utils_num.h"
#include "utils/fmgrprotos.h"
#include "utils/inet.h"
#include "utils/sortsupport.h"
typedef struct
{
@ -15,9 +16,7 @@ typedef struct
char pad[4]; /* make struct size = sizeof(gbtreekey16) */
} macKEY;
/*
** OID ops
*/
/* GiST support functions */
PG_FUNCTION_INFO_V1(gbt_macad_compress);
PG_FUNCTION_INFO_V1(gbt_macad_fetch);
PG_FUNCTION_INFO_V1(gbt_macad_union);
@ -25,6 +24,7 @@ PG_FUNCTION_INFO_V1(gbt_macad_picksplit);
PG_FUNCTION_INFO_V1(gbt_macad_consistent);
PG_FUNCTION_INFO_V1(gbt_macad_penalty);
PG_FUNCTION_INFO_V1(gbt_macad_same);
PG_FUNCTION_INFO_V1(gbt_macaddr_sortsupport);
static bool
@ -88,11 +88,9 @@ static const gbtree_ninfo tinfo =
/**************************************************
* macaddr ops
* GiST support functions
**************************************************/
static uint64
mac_2_uint64(macaddr *m)
{
@ -105,8 +103,6 @@ mac_2_uint64(macaddr *m)
return res;
}
Datum
gbt_macad_compress(PG_FUNCTION_ARGS)
{
@ -194,3 +190,26 @@ gbt_macad_same(PG_FUNCTION_ARGS)
*result = gbt_num_same((void *) b1, (void *) b2, &tinfo, fcinfo->flinfo);
PG_RETURN_POINTER(result);
}
static int
gbt_macaddr_ssup_cmp(Datum x, Datum y, SortSupport ssup)
{
macKEY *arg1 = (macKEY *) DatumGetPointer(x);
macKEY *arg2 = (macKEY *) DatumGetPointer(y);
/* for leaf items we expect lower == upper, so only compare lower */
return DatumGetInt32(DirectFunctionCall2(macaddr_cmp,
MacaddrPGetDatum(&arg1->lower),
MacaddrPGetDatum(&arg2->lower)));
}
Datum
gbt_macaddr_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_macaddr_ssup_cmp;
ssup->ssup_extra = NULL;
PG_RETURN_VOID();
}

View File

@ -7,6 +7,7 @@
#include "btree_utils_num.h"
#include "utils/fmgrprotos.h"
#include "utils/inet.h"
#include "utils/sortsupport.h"
typedef struct
{
@ -15,9 +16,7 @@ typedef struct
/* make struct size = sizeof(gbtreekey16) */
} mac8KEY;
/*
** OID ops
*/
/* GiST support functions */
PG_FUNCTION_INFO_V1(gbt_macad8_compress);
PG_FUNCTION_INFO_V1(gbt_macad8_fetch);
PG_FUNCTION_INFO_V1(gbt_macad8_union);
@ -25,7 +24,7 @@ PG_FUNCTION_INFO_V1(gbt_macad8_picksplit);
PG_FUNCTION_INFO_V1(gbt_macad8_consistent);
PG_FUNCTION_INFO_V1(gbt_macad8_penalty);
PG_FUNCTION_INFO_V1(gbt_macad8_same);
PG_FUNCTION_INFO_V1(gbt_macad8_sortsupport);
static bool
gbt_macad8gt(const void *a, const void *b, FmgrInfo *flinfo)
@ -88,11 +87,9 @@ static const gbtree_ninfo tinfo =
/**************************************************
* macaddr ops
* GiST support functions
**************************************************/
static uint64
mac8_2_uint64(macaddr8 *m)
{
@ -105,8 +102,6 @@ mac8_2_uint64(macaddr8 *m)
return res;
}
Datum
gbt_macad8_compress(PG_FUNCTION_ARGS)
{
@ -145,7 +140,6 @@ gbt_macad8_consistent(PG_FUNCTION_ARGS)
GIST_LEAF(entry), &tinfo, fcinfo->flinfo));
}
Datum
gbt_macad8_union(PG_FUNCTION_ARGS)
{
@ -156,7 +150,6 @@ gbt_macad8_union(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(gbt_num_union(out, entryvec, &tinfo, fcinfo->flinfo));
}
Datum
gbt_macad8_penalty(PG_FUNCTION_ARGS)
{
@ -194,3 +187,26 @@ gbt_macad8_same(PG_FUNCTION_ARGS)
*result = gbt_num_same((void *) b1, (void *) b2, &tinfo, fcinfo->flinfo);
PG_RETURN_POINTER(result);
}
static int
gbt_macaddr8_ssup_cmp(Datum x, Datum y, SortSupport ssup)
{
mac8KEY *arg1 = (mac8KEY *) DatumGetPointer(x);
mac8KEY *arg2 = (mac8KEY *) DatumGetPointer(y);
/* for leaf items we expect lower == upper, so only compare lower */
return DatumGetInt32(DirectFunctionCall2(macaddr8_cmp,
Macaddr8PGetDatum(&arg1->lower),
Macaddr8PGetDatum(&arg2->lower)));
}
Datum
gbt_macad8_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_macaddr8_ssup_cmp;
ssup->ssup_extra = NULL;
PG_RETURN_VOID();
}

View File

@ -11,16 +11,16 @@
#include "utils/builtins.h"
#include "utils/numeric.h"
#include "utils/rel.h"
#include "utils/sortsupport.h"
/*
** Bytea ops
*/
/* GiST support functions */
PG_FUNCTION_INFO_V1(gbt_numeric_compress);
PG_FUNCTION_INFO_V1(gbt_numeric_union);
PG_FUNCTION_INFO_V1(gbt_numeric_picksplit);
PG_FUNCTION_INFO_V1(gbt_numeric_consistent);
PG_FUNCTION_INFO_V1(gbt_numeric_penalty);
PG_FUNCTION_INFO_V1(gbt_numeric_same);
PG_FUNCTION_INFO_V1(gbt_numeric_sortsupport);
/* define for comparison */
@ -90,10 +90,9 @@ static const gbtree_vinfo tinfo =
/**************************************************
* Text ops
* GiST support functions
**************************************************/
Datum
gbt_numeric_compress(PG_FUNCTION_ARGS)
{
@ -102,8 +101,6 @@ gbt_numeric_compress(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(gbt_var_compress(entry, &tinfo));
}
Datum
gbt_numeric_consistent(PG_FUNCTION_ARGS)
{
@ -125,8 +122,6 @@ gbt_numeric_consistent(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(retval);
}
Datum
gbt_numeric_union(PG_FUNCTION_ARGS)
{
@ -137,7 +132,6 @@ gbt_numeric_union(PG_FUNCTION_ARGS)
&tinfo, fcinfo->flinfo));
}
Datum
gbt_numeric_same(PG_FUNCTION_ARGS)
{
@ -149,7 +143,6 @@ gbt_numeric_same(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(result);
}
Datum
gbt_numeric_penalty(PG_FUNCTION_ARGS)
{
@ -215,8 +208,6 @@ gbt_numeric_penalty(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(result);
}
Datum
gbt_numeric_picksplit(PG_FUNCTION_ARGS)
{
@ -227,3 +218,35 @@ gbt_numeric_picksplit(PG_FUNCTION_ARGS)
&tinfo, fcinfo->flinfo);
PG_RETURN_POINTER(v);
}
static int
gbt_numeric_ssup_cmp(Datum x, Datum y, SortSupport ssup)
{
GBT_VARKEY *key1 = PG_DETOAST_DATUM(x);
GBT_VARKEY *key2 = PG_DETOAST_DATUM(y);
GBT_VARKEY_R arg1 = gbt_var_key_readable(key1);
GBT_VARKEY_R arg2 = gbt_var_key_readable(key2);
Datum result;
/* for leaf items we expect lower == upper, so only compare lower */
result = DirectFunctionCall2(numeric_cmp,
PointerGetDatum(arg1.lower),
PointerGetDatum(arg2.lower));
GBT_FREE_IF_COPY(key1, x);
GBT_FREE_IF_COPY(key2, y);
return DatumGetInt32(result);
}
Datum
gbt_numeric_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_numeric_ssup_cmp;
ssup->ssup_extra = NULL;
PG_RETURN_VOID();
}

View File

@ -5,6 +5,7 @@
#include "btree_gist.h"
#include "btree_utils_num.h"
#include "utils/sortsupport.h"
typedef struct
{
@ -12,9 +13,7 @@ typedef struct
Oid upper;
} oidKEY;
/*
** OID ops
*/
/* GiST support functions */
PG_FUNCTION_INFO_V1(gbt_oid_compress);
PG_FUNCTION_INFO_V1(gbt_oid_fetch);
PG_FUNCTION_INFO_V1(gbt_oid_union);
@ -23,6 +22,7 @@ PG_FUNCTION_INFO_V1(gbt_oid_consistent);
PG_FUNCTION_INFO_V1(gbt_oid_distance);
PG_FUNCTION_INFO_V1(gbt_oid_penalty);
PG_FUNCTION_INFO_V1(gbt_oid_same);
PG_FUNCTION_INFO_V1(gbt_oid_sortsupport);
static bool
@ -113,10 +113,9 @@ oid_dist(PG_FUNCTION_ARGS)
/**************************************************
* Oid ops
* GiST support functions
**************************************************/
Datum
gbt_oid_compress(PG_FUNCTION_ARGS)
{
@ -155,7 +154,6 @@ gbt_oid_consistent(PG_FUNCTION_ARGS)
GIST_LEAF(entry), &tinfo, fcinfo->flinfo));
}
Datum
gbt_oid_distance(PG_FUNCTION_ARGS)
{
@ -173,7 +171,6 @@ gbt_oid_distance(PG_FUNCTION_ARGS)
&tinfo, fcinfo->flinfo));
}
Datum
gbt_oid_union(PG_FUNCTION_ARGS)
{
@ -184,7 +181,6 @@ gbt_oid_union(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(gbt_num_union(out, entryvec, &tinfo, fcinfo->flinfo));
}
Datum
gbt_oid_penalty(PG_FUNCTION_ARGS)
{
@ -215,3 +211,29 @@ gbt_oid_same(PG_FUNCTION_ARGS)
*result = gbt_num_same((void *) b1, (void *) b2, &tinfo, fcinfo->flinfo);
PG_RETURN_POINTER(result);
}
static int
gbt_oid_ssup_cmp(Datum x, Datum y, SortSupport ssup)
{
oidKEY *arg1 = (oidKEY *) DatumGetPointer(x);
oidKEY *arg2 = (oidKEY *) DatumGetPointer(y);
/* for leaf items we expect lower == upper, so only compare lower */
if (arg1->lower > arg2->lower)
return 1;
else if (arg1->lower < arg2->lower)
return -1;
else
return 0;
}
Datum
gbt_oid_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_oid_ssup_cmp;
ssup->ssup_extra = NULL;
PG_RETURN_VOID();
}

View File

@ -7,10 +7,9 @@
#include "btree_utils_var.h"
#include "mb/pg_wchar.h"
#include "utils/fmgrprotos.h"
#include "utils/sortsupport.h"
/*
** Text ops
*/
/* GiST support functions */
PG_FUNCTION_INFO_V1(gbt_text_compress);
PG_FUNCTION_INFO_V1(gbt_bpchar_compress);
PG_FUNCTION_INFO_V1(gbt_text_union);
@ -19,6 +18,8 @@ PG_FUNCTION_INFO_V1(gbt_text_consistent);
PG_FUNCTION_INFO_V1(gbt_bpchar_consistent);
PG_FUNCTION_INFO_V1(gbt_text_penalty);
PG_FUNCTION_INFO_V1(gbt_text_same);
PG_FUNCTION_INFO_V1(gbt_text_sortsupport);
PG_FUNCTION_INFO_V1(gbt_bpchar_sortsupport);
/* define for comparison */
@ -163,10 +164,9 @@ static gbtree_vinfo bptinfo =
/**************************************************
* Text ops
* GiST support functions
**************************************************/
Datum
gbt_text_compress(PG_FUNCTION_ARGS)
{
@ -187,8 +187,6 @@ gbt_bpchar_compress(PG_FUNCTION_ARGS)
return gbt_text_compress(fcinfo);
}
Datum
gbt_text_consistent(PG_FUNCTION_ARGS)
{
@ -216,7 +214,6 @@ gbt_text_consistent(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(retval);
}
Datum
gbt_bpchar_consistent(PG_FUNCTION_ARGS)
{
@ -243,7 +240,6 @@ gbt_bpchar_consistent(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(retval);
}
Datum
gbt_text_union(PG_FUNCTION_ARGS)
{
@ -254,7 +250,6 @@ gbt_text_union(PG_FUNCTION_ARGS)
&tinfo, fcinfo->flinfo));
}
Datum
gbt_text_picksplit(PG_FUNCTION_ARGS)
{
@ -277,7 +272,6 @@ gbt_text_same(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(result);
}
Datum
gbt_text_penalty(PG_FUNCTION_ARGS)
{
@ -288,3 +282,69 @@ gbt_text_penalty(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(gbt_var_penalty(result, o, n, PG_GET_COLLATION(),
&tinfo, fcinfo->flinfo));
}
static int
gbt_text_ssup_cmp(Datum x, Datum y, SortSupport ssup)
{
GBT_VARKEY *key1 = PG_DETOAST_DATUM(x);
GBT_VARKEY *key2 = PG_DETOAST_DATUM(y);
GBT_VARKEY_R arg1 = gbt_var_key_readable(key1);
GBT_VARKEY_R arg2 = gbt_var_key_readable(key2);
Datum result;
/* for leaf items we expect lower == upper, so only compare lower */
result = DirectFunctionCall2Coll(bttextcmp,
ssup->ssup_collation,
PointerGetDatum(arg1.lower),
PointerGetDatum(arg2.lower));
GBT_FREE_IF_COPY(key1, x);
GBT_FREE_IF_COPY(key2, y);
return DatumGetInt32(result);
}
Datum
gbt_text_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_text_ssup_cmp;
ssup->ssup_extra = NULL;
PG_RETURN_VOID();
}
static int
gbt_bpchar_ssup_cmp(Datum x, Datum y, SortSupport ssup)
{
GBT_VARKEY *key1 = PG_DETOAST_DATUM(x);
GBT_VARKEY *key2 = PG_DETOAST_DATUM(y);
GBT_VARKEY_R arg1 = gbt_var_key_readable(key1);
GBT_VARKEY_R arg2 = gbt_var_key_readable(key2);
Datum result;
/* for leaf items we expect lower == upper, so only compare lower */
result = DirectFunctionCall2Coll(bpcharcmp,
ssup->ssup_collation,
PointerGetDatum(arg1.lower),
PointerGetDatum(arg2.lower));
GBT_FREE_IF_COPY(key1, x);
GBT_FREE_IF_COPY(key2, y);
return DatumGetInt32(result);
}
Datum
gbt_bpchar_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_bpchar_ssup_cmp;
ssup->ssup_extra = NULL;
PG_RETURN_VOID();
}

View File

@ -7,6 +7,7 @@
#include "btree_utils_num.h"
#include "utils/fmgrprotos.h"
#include "utils/date.h"
#include "utils/sortsupport.h"
#include "utils/timestamp.h"
typedef struct
@ -15,9 +16,7 @@ typedef struct
TimeADT upper;
} timeKEY;
/*
** time ops
*/
/* GiST support functions */
PG_FUNCTION_INFO_V1(gbt_time_compress);
PG_FUNCTION_INFO_V1(gbt_timetz_compress);
PG_FUNCTION_INFO_V1(gbt_time_fetch);
@ -28,6 +27,8 @@ PG_FUNCTION_INFO_V1(gbt_time_distance);
PG_FUNCTION_INFO_V1(gbt_timetz_consistent);
PG_FUNCTION_INFO_V1(gbt_time_penalty);
PG_FUNCTION_INFO_V1(gbt_time_same);
PG_FUNCTION_INFO_V1(gbt_time_sortsupport);
PG_FUNCTION_INFO_V1(gbt_timetz_sortsupport);
#ifdef USE_FLOAT8_BYVAL
@ -92,8 +93,6 @@ gbt_timelt(const void *a, const void *b, FmgrInfo *flinfo)
TimeADTGetDatumFast(*bb)));
}
static int
gbt_timekey_cmp(const void *a, const void *b, FmgrInfo *flinfo)
{
@ -150,11 +149,9 @@ time_dist(PG_FUNCTION_ARGS)
/**************************************************
* time ops
* GiST support functions
**************************************************/
Datum
gbt_time_compress(PG_FUNCTION_ARGS)
{
@ -163,7 +160,6 @@ gbt_time_compress(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(gbt_num_compress(entry, &tinfo));
}
Datum
gbt_timetz_compress(PG_FUNCTION_ARGS)
{
@ -262,7 +258,6 @@ gbt_timetz_consistent(PG_FUNCTION_ARGS)
GIST_LEAF(entry), &tinfo, fcinfo->flinfo));
}
Datum
gbt_time_union(PG_FUNCTION_ARGS)
{
@ -273,7 +268,6 @@ gbt_time_union(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(gbt_num_union(out, entryvec, &tinfo, fcinfo->flinfo));
}
Datum
gbt_time_penalty(PG_FUNCTION_ARGS)
{
@ -313,7 +307,6 @@ gbt_time_penalty(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(result);
}
Datum
gbt_time_picksplit(PG_FUNCTION_ARGS)
{
@ -332,3 +325,26 @@ gbt_time_same(PG_FUNCTION_ARGS)
*result = gbt_num_same((void *) b1, (void *) b2, &tinfo, fcinfo->flinfo);
PG_RETURN_POINTER(result);
}
static int
gbt_timekey_ssup_cmp(Datum x, Datum y, SortSupport ssup)
{
timeKEY *arg1 = (timeKEY *) DatumGetPointer(x);
timeKEY *arg2 = (timeKEY *) DatumGetPointer(y);
/* for leaf items we expect lower == upper, so only compare lower */
return DatumGetInt32(DirectFunctionCall2(time_cmp,
TimeADTGetDatumFast(arg1->lower),
TimeADTGetDatumFast(arg2->lower)));
}
Datum
gbt_time_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_timekey_ssup_cmp;
ssup->ssup_extra = NULL;
PG_RETURN_VOID();
}

View File

@ -10,6 +10,7 @@
#include "utils/fmgrprotos.h"
#include "utils/timestamp.h"
#include "utils/float.h"
#include "utils/sortsupport.h"
typedef struct
{
@ -17,9 +18,7 @@ typedef struct
Timestamp upper;
} tsKEY;
/*
** timestamp ops
*/
/* GiST support functions */
PG_FUNCTION_INFO_V1(gbt_ts_compress);
PG_FUNCTION_INFO_V1(gbt_tstz_compress);
PG_FUNCTION_INFO_V1(gbt_ts_fetch);
@ -31,6 +30,7 @@ PG_FUNCTION_INFO_V1(gbt_tstz_consistent);
PG_FUNCTION_INFO_V1(gbt_tstz_distance);
PG_FUNCTION_INFO_V1(gbt_ts_penalty);
PG_FUNCTION_INFO_V1(gbt_ts_same);
PG_FUNCTION_INFO_V1(gbt_ts_sortsupport);
#ifdef USE_FLOAT8_BYVAL
@ -40,6 +40,8 @@ PG_FUNCTION_INFO_V1(gbt_ts_same);
#endif
/* define for comparison */
static bool
gbt_tsgt(const void *a, const void *b, FmgrInfo *flinfo)
{
@ -95,7 +97,6 @@ gbt_tslt(const void *a, const void *b, FmgrInfo *flinfo)
TimestampGetDatumFast(*bb)));
}
static int
gbt_tskey_cmp(const void *a, const void *b, FmgrInfo *flinfo)
{
@ -126,7 +127,6 @@ gbt_ts_dist(const void *a, const void *b, FmgrInfo *flinfo)
return fabs(INTERVAL_TO_SEC(i));
}
static const gbtree_ninfo tinfo =
{
gbt_t_ts,
@ -190,12 +190,10 @@ tstz_dist(PG_FUNCTION_ARGS)
PG_RETURN_INTERVAL_P(abs_interval(r));
}
/**************************************************
* timestamp ops
* GiST support functions
**************************************************/
static inline Timestamp
tstz_to_ts_gmt(TimestampTz ts)
{
@ -212,7 +210,6 @@ gbt_ts_compress(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(gbt_num_compress(entry, &tinfo));
}
Datum
gbt_tstz_compress(PG_FUNCTION_ARGS)
{
@ -398,3 +395,26 @@ gbt_ts_same(PG_FUNCTION_ARGS)
*result = gbt_num_same((void *) b1, (void *) b2, &tinfo, fcinfo->flinfo);
PG_RETURN_POINTER(result);
}
static int
gbt_ts_ssup_cmp(Datum x, Datum y, SortSupport ssup)
{
tsKEY *arg1 = (tsKEY *) DatumGetPointer(x);
tsKEY *arg2 = (tsKEY *) DatumGetPointer(y);
/* for leaf items we expect lower == upper, so only compare lower */
return DatumGetInt32(DirectFunctionCall2(timestamp_cmp,
TimestampGetDatumFast(arg1->lower),
TimestampGetDatumFast(arg2->lower)));
}
Datum
gbt_ts_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_ts_ssup_cmp;
ssup->ssup_extra = NULL;
PG_RETURN_VOID();
}

View File

@ -41,7 +41,17 @@ typedef struct
GBT_VARKEY *(*f_l2n) (GBT_VARKEY *, FmgrInfo *flinfo); /* convert leaf to node */
} gbtree_vinfo;
/*
* Free ptr1 in case it's a copy of ptr2.
*
* This is adapted from varlena's PG_FREE_IF_COPY, though doesn't require
* fcinfo access.
*/
#define GBT_FREE_IF_COPY(ptr1, ptr2) \
do { \
if ((Pointer) (ptr1) != DatumGetPointer(ptr2)) \
pfree(ptr1); \
} while (0)
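Unlike varlena's PG_FREE_IF_COPY, which digs the original Datum out of fcinfo, this variant receives the original Datum directly, so it is usable from sortsupport comparators that have no fcinfo at hand. Usage mirrors the comparators above:

	key1 = (GBT_VARKEY *) PG_DETOAST_DATUM(x);
	/* ... use key1 ... */
	GBT_FREE_IF_COPY(key1, x);	/* pfree only if detoasting made a copy */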
extern GBT_VARKEY_R gbt_var_key_readable(const GBT_VARKEY *k);

View File

@ -6,6 +6,7 @@
#include "btree_gist.h"
#include "btree_utils_num.h"
#include "port/pg_bswap.h"
#include "utils/sortsupport.h"
#include "utils/uuid.h"
typedef struct
@ -15,9 +16,7 @@ typedef struct
} uuidKEY;
/*
* UUID ops
*/
/* GiST support functions */
PG_FUNCTION_INFO_V1(gbt_uuid_compress);
PG_FUNCTION_INFO_V1(gbt_uuid_fetch);
PG_FUNCTION_INFO_V1(gbt_uuid_union);
@ -25,6 +24,7 @@ PG_FUNCTION_INFO_V1(gbt_uuid_picksplit);
PG_FUNCTION_INFO_V1(gbt_uuid_consistent);
PG_FUNCTION_INFO_V1(gbt_uuid_penalty);
PG_FUNCTION_INFO_V1(gbt_uuid_same);
PG_FUNCTION_INFO_V1(gbt_uuid_sortsupport);
static int
@ -93,10 +93,9 @@ static const gbtree_ninfo tinfo =
/**************************************************
* uuid ops
* GiST support functions
**************************************************/
Datum
gbt_uuid_compress(PG_FUNCTION_ARGS)
{
@ -233,3 +232,24 @@ gbt_uuid_same(PG_FUNCTION_ARGS)
*result = gbt_num_same((void *) b1, (void *) b2, &tinfo, fcinfo->flinfo);
PG_RETURN_POINTER(result);
}
static int
gbt_uuid_ssup_cmp(Datum x, Datum y, SortSupport ssup)
{
uuidKEY *arg1 = (uuidKEY *) DatumGetPointer(x);
uuidKEY *arg2 = (uuidKEY *) DatumGetPointer(y);
/* for leaf items we expect lower == upper, so only compare lower */
return uuid_internal_cmp(&arg1->lower, &arg2->lower);
}
Datum
gbt_uuid_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_uuid_ssup_cmp;
ssup->ssup_extra = NULL;
PG_RETURN_VOID();
}

View File

@ -1,5 +1,8 @@
-- enum check
create type rainbow as enum ('r','o','y','g','b','i','v');
create type rainbow as enum ('r','o','g','b','i','v');
-- enum values added later take some different codepaths internally,
-- so make sure we have coverage for those too
alter type rainbow add value 'y' before 'g';
CREATE TABLE enumtmp (a rainbow);
\copy enumtmp from 'data/enum.data'
SET enable_seqscan=on;

View File

@ -51,6 +51,7 @@ install_data(
'btree_gist--1.5--1.6.sql',
'btree_gist--1.6--1.7.sql',
'btree_gist--1.7--1.8.sql',
'btree_gist--1.8--1.9.sql',
kwargs: contrib_data_args,
)

View File

@ -1,6 +1,10 @@
-- enum check
create type rainbow as enum ('r','o','y','g','b','i','v');
create type rainbow as enum ('r','o','g','b','i','v');
-- enum values added later take some different codepaths internally,
-- so make sure we have coverage for those too
alter type rainbow add value 'y' before 'g';
CREATE TABLE enumtmp (a rainbow);

View File

@ -10,7 +10,10 @@
#include "utils/varlena.h"
#include "varatt.h"
PG_MODULE_MAGIC;
PG_MODULE_MAGIC_EXT(
.name = "citext",
.version = PG_VERSION
);
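This same PG_MODULE_MAGIC to PG_MODULE_MAGIC_EXT substitution repeats across the contrib modules in the rest of this diff; it embeds a module name and version in the module's magic block. Assuming the pg_get_loaded_modules() function added alongside this macro in the same release, the metadata becomes visible at runtime, along the lines of:

	SELECT module_name, version FROM pg_get_loaded_modules();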
/*
* ====================

View File

@ -17,7 +17,10 @@
#include "utils/array.h"
#include "utils/float.h"
PG_MODULE_MAGIC;
PG_MODULE_MAGIC_EXT(
.name = "cube",
.version = PG_VERSION
);
/*
* Taken from the intarray contrib header

View File

@ -13,6 +13,7 @@ PGFILEDESC = "dblink - connect to other PostgreSQL databases"
REGRESS = dblink
REGRESS_OPTS = --dlpath=$(top_builddir)/src/test/regress
TAP_TESTS = 1
ifdef USE_PGXS
PG_CONFIG = pg_config

View File

@ -43,6 +43,8 @@
#include "catalog/pg_foreign_server.h"
#include "catalog/pg_type.h"
#include "catalog/pg_user_mapping.h"
#include "commands/defrem.h"
#include "common/base64.h"
#include "executor/spi.h"
#include "foreign/foreign.h"
#include "funcapi.h"
@ -63,7 +65,10 @@
#include "utils/varlena.h"
#include "utils/wait_event.h"
PG_MODULE_MAGIC;
PG_MODULE_MAGIC_EXT(
.name = "dblink",
.version = PG_VERSION
);
typedef struct remoteConn
{
@ -100,7 +105,7 @@ static PGresult *storeQueryResult(volatile storeInfo *sinfo, PGconn *conn, const
static void storeRow(volatile storeInfo *sinfo, PGresult *res, bool first);
static remoteConn *getConnectionByName(const char *name);
static HTAB *createConnHash(void);
static void createNewConnection(const char *name, remoteConn *rconn);
static remoteConn *createNewConnection(const char *name);
static void deleteConnection(const char *name);
static char **get_pkey_attnames(Relation rel, int16 *indnkeyatts);
static char **get_text_array_contents(ArrayType *array, int *numitems);
@ -114,7 +119,8 @@ static Relation get_rel_from_relname(text *relname_text, LOCKMODE lockmode, AclM
static char *generate_relation_name(Relation rel);
static void dblink_connstr_check(const char *connstr);
static bool dblink_connstr_has_pw(const char *connstr);
static void dblink_security_check(PGconn *conn, remoteConn *rconn, const char *connstr);
static void dblink_security_check(PGconn *conn, const char *connname,
const char *connstr);
static void dblink_res_error(PGconn *conn, const char *conname, PGresult *res,
bool fail, const char *fmt,...) pg_attribute_printf(5, 6);
static char *get_connect_string(const char *servername);
@ -126,6 +132,11 @@ static bool is_valid_dblink_option(const PQconninfoOption *options,
const char *option, Oid context);
static int applyRemoteGucs(PGconn *conn);
static void restoreLocalGucs(int nestlevel);
static bool UseScramPassthrough(ForeignServer *foreign_server, UserMapping *user);
static void appendSCRAMKeysInfo(StringInfo buf);
static bool is_valid_dblink_fdw_option(const PQconninfoOption *options, const char *option,
Oid context);
static bool dblink_connstr_has_required_scram_options(const char *connstr);
/* Global */
static remoteConn *pconn = NULL;
@ -137,16 +148,22 @@ static uint32 dblink_we_get_conn = 0;
static uint32 dblink_we_get_result = 0;
/*
* Following is list that holds multiple remote connections.
* Following is hash that holds multiple remote connections.
* Calling convention of each dblink function changes to accept
* connection name as the first parameter. The connection list is
* connection name as the first parameter. The connection hash is
* much like ecpg e.g. a mapping between a name and a PGconn object.
*
* To avoid potentially leaking a PGconn object in case of out-of-memory
* errors, we first create the hash entry, then open the PGconn.
* Hence, a hash entry whose rconn.conn pointer is NULL must be
* understood as a leftover from a failed create; it should be ignored
* by lookup operations, and silently replaced by create operations.
*/
typedef struct remoteConnHashEnt
{
char name[NAMEDATALEN];
remoteConn *rconn;
remoteConn rconn;
} remoteConnHashEnt;
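The invariant above dictates a fixed ordering at the call sites, which the fragments below apply piecemeal; condensed into one place (an illustration assembled from this diff, not a verbatim excerpt):

	/* 1. Reserve the hash entry first; if this errors out, nothing leaks. */
	rconn = createNewConnection(connname);
	Assert(rconn->conn == NULL);

	/* 2. Only then open the libpq connection. */
	conn = libpqsrv_connect(connstr, dblink_we_connect);
	if (PQstatus(conn) == CONNECTION_BAD)
	{
		libpqsrv_disconnect(conn);
		deleteConnection(connname);	/* discard the placeholder entry */
		ereport(ERROR,
				(errcode(ERRCODE_SQLCLIENT_UNABLE_TO_ESTABLISH_SQLCONNECTION),
				 errmsg("could not establish connection")));
	}

	/* 3. Publish the PGconn; only now does lookup treat the entry as open. */
	rconn->conn = conn;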
/* initial number of connection hashes */
@ -160,8 +177,7 @@ xpstrdup(const char *in)
return pstrdup(in);
}
static void
pg_attribute_noreturn()
pg_noreturn static void
dblink_res_internalerror(PGconn *conn, PGresult *res, const char *p2)
{
char *msg = pchomp(PQerrorMessage(conn));
@ -170,8 +186,7 @@ dblink_res_internalerror(PGconn *conn, PGresult *res, const char *p2)
elog(ERROR, "%s: %s", p2, msg);
}
static void
pg_attribute_noreturn()
pg_noreturn static void
dblink_conn_not_avail(const char *conname)
{
if (conname)
@ -225,7 +240,7 @@ dblink_get_conn(char *conname_or_str,
errmsg("could not establish connection"),
errdetail_internal("%s", msg)));
}
dblink_security_check(conn, rconn, connstr);
dblink_security_check(conn, NULL, connstr);
if (PQclientEncoding(conn) != GetDatabaseEncoding())
PQsetClientEncoding(conn, GetDatabaseEncodingName());
freeconn = true;
@ -288,15 +303,6 @@ dblink_connect(PG_FUNCTION_ARGS)
else if (PG_NARGS() == 1)
conname_or_str = text_to_cstring(PG_GETARG_TEXT_PP(0));
if (connname)
{
rconn = (remoteConn *) MemoryContextAlloc(TopMemoryContext,
sizeof(remoteConn));
rconn->conn = NULL;
rconn->openCursorCount = 0;
rconn->newXactForCursor = false;
}
/* first check for valid foreign data server */
connstr = get_connect_string(conname_or_str);
if (connstr == NULL)
@ -309,6 +315,13 @@ dblink_connect(PG_FUNCTION_ARGS)
if (dblink_we_connect == 0)
dblink_we_connect = WaitEventExtensionNew("DblinkConnect");
/* if we need a hashtable entry, make that first, since it might fail */
if (connname)
{
rconn = createNewConnection(connname);
Assert(rconn->conn == NULL);
}
/* OK to make connection */
conn = libpqsrv_connect(connstr, dblink_we_connect);
@ -316,8 +329,8 @@ dblink_connect(PG_FUNCTION_ARGS)
{
msg = pchomp(PQerrorMessage(conn));
libpqsrv_disconnect(conn);
if (rconn)
pfree(rconn);
if (connname)
deleteConnection(connname);
ereport(ERROR,
(errcode(ERRCODE_SQLCLIENT_UNABLE_TO_ESTABLISH_SQLCONNECTION),
@ -326,16 +339,16 @@ dblink_connect(PG_FUNCTION_ARGS)
}
/* check password actually used if not superuser */
dblink_security_check(conn, rconn, connstr);
dblink_security_check(conn, connname, connstr);
/* attempt to set client encoding to match server encoding, if needed */
if (PQclientEncoding(conn) != GetDatabaseEncoding())
PQsetClientEncoding(conn, GetDatabaseEncodingName());
/* all OK, save away the conn */
if (connname)
{
rconn->conn = conn;
createNewConnection(connname, rconn);
}
else
{
@ -375,10 +388,7 @@ dblink_disconnect(PG_FUNCTION_ARGS)
libpqsrv_disconnect(conn);
if (rconn)
{
deleteConnection(conname);
pfree(rconn);
}
else
pconn->conn = NULL;
@ -1296,6 +1306,9 @@ dblink_get_connections(PG_FUNCTION_ARGS)
hash_seq_init(&status, remoteConnHash);
while ((hentry = (remoteConnHashEnt *) hash_seq_search(&status)) != NULL)
{
/* ignore it if it's not an open connection */
if (hentry->rconn.conn == NULL)
continue;
/* stash away current value */
astate = accumArrayResult(astate,
CStringGetTextDatum(hentry->name),
@ -1966,7 +1979,7 @@ dblink_fdw_validator(PG_FUNCTION_ARGS)
{
DefElem *def = (DefElem *) lfirst(cell);
if (!is_valid_dblink_option(options, def->defname, context))
if (!is_valid_dblink_fdw_option(options, def->defname, context))
{
/*
* Unknown option, or invalid option for the context specified, so
@ -2531,8 +2544,8 @@ getConnectionByName(const char *name)
hentry = (remoteConnHashEnt *) hash_search(remoteConnHash,
key, HASH_FIND, NULL);
if (hentry)
return hentry->rconn;
if (hentry && hentry->rconn.conn != NULL)
return &hentry->rconn;
return NULL;
}
@ -2549,8 +2562,8 @@ createConnHash(void)
HASH_ELEM | HASH_STRINGS);
}
static void
createNewConnection(const char *name, remoteConn *rconn)
static remoteConn *
createNewConnection(const char *name)
{
remoteConnHashEnt *hentry;
bool found;
@ -2564,17 +2577,15 @@ createNewConnection(const char *name, remoteConn *rconn)
hentry = (remoteConnHashEnt *) hash_search(remoteConnHash, key,
HASH_ENTER, &found);
if (found)
{
libpqsrv_disconnect(rconn->conn);
pfree(rconn);
if (found && hentry->rconn.conn != NULL)
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("duplicate connection name")));
}
hentry->rconn = rconn;
/* New, or reusable, so initialize the rconn struct to zeroes */
memset(&hentry->rconn, 0, sizeof(remoteConn));
return &hentry->rconn;
}
static void
@ -2598,13 +2609,77 @@ deleteConnection(const char *name)
errmsg("undefined connection name")));
}
/*
* Ensure that require_auth and SCRAM keys are correctly set on connstr.
* The SCRAM keys used for pass-through come from the client's initial
* connection to this server.
*
* All required SCRAM options are set by dblink, so we just need to ensure
* that these options are not overwritten by the user.
*
* See appendSCRAMKeysInfo and its usage for more.
*/
static bool
dblink_connstr_has_required_scram_options(const char *connstr)
{
PQconninfoOption *options;
bool has_scram_server_key = false;
bool has_scram_client_key = false;
bool has_require_auth = false;
bool has_scram_keys = false;
options = PQconninfoParse(connstr, NULL);
if (options)
{
/*
* Continue iterating even if we found the keys that we need to
* validate to make sure that there is no other declaration of these
* keys that can overwrite the first.
*/
for (PQconninfoOption *option = options; option->keyword != NULL; option++)
{
if (strcmp(option->keyword, "require_auth") == 0)
{
if (option->val != NULL && strcmp(option->val, "scram-sha-256") == 0)
has_require_auth = true;
else
has_require_auth = false;
}
if (strcmp(option->keyword, "scram_client_key") == 0)
{
if (option->val != NULL && option->val[0] != '\0')
has_scram_client_key = true;
else
has_scram_client_key = false;
}
if (strcmp(option->keyword, "scram_server_key") == 0)
{
if (option->val != NULL && option->val[0] != '\0')
has_scram_server_key = true;
else
has_scram_server_key = false;
}
}
PQconninfoFree(options);
}
has_scram_keys = has_scram_client_key && has_scram_server_key && MyProcPort->has_scram_keys;
return (has_scram_keys && has_require_auth);
}
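For illustration, a connection string that satisfies this check carries all three options with the expected values (key material abbreviated; the real values are base64-encoded SCRAM keys appended by appendSCRAMKeysInfo):

	host=remote dbname=db1 require_auth='scram-sha-256'
	    scram_client_key='<base64>' scram_server_key='<base64>'

Independently of the string itself, MyProcPort->has_scram_keys must also be true, i.e. the local client itself authenticated with SCRAM.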
/*
* Make sure the connection used credentials provided by the user: check
* which credentials were used to connect and verify that they came from
* the user.
*
* On failure, we close "conn" and also delete the hashtable entry
* identified by "connname" (if that's not NULL).
*/
static void
dblink_security_check(PGconn *conn, remoteConn *rconn, const char *connstr)
dblink_security_check(PGconn *conn, const char *connname, const char *connstr)
{
/* Superuser bypasses security check */
if (superuser())
@ -2614,6 +2689,18 @@ dblink_security_check(PGconn *conn, remoteConn *rconn, const char *connstr)
if (PQconnectionUsedPassword(conn) && dblink_connstr_has_pw(connstr))
return;
/*
* Password was not used to connect, check if SCRAM pass-through is in
* use.
*
* If dblink_connstr_has_required_scram_options is true we assume that
* UseScramPassthrough is also true because the required SCRAM keys are
* only added if UseScramPassthrough is set, and the user is not allowed
* to add the SCRAM keys on fdw and user mapping options.
*/
if (MyProcPort->has_scram_keys && dblink_connstr_has_required_scram_options(connstr))
return;
#ifdef ENABLE_GSS
/* If GSSAPI creds used to connect, make sure it was one delegated */
if (PQconnectionUsedGSSAPI(conn) && be_gssapi_get_delegation(MyProcPort))
@ -2622,8 +2709,8 @@ dblink_security_check(PGconn *conn, remoteConn *rconn, const char *connstr)
/* Otherwise, fail out */
libpqsrv_disconnect(conn);
if (rconn)
pfree(rconn);
if (connname)
deleteConnection(connname);
ereport(ERROR,
(errcode(ERRCODE_S_R_E_PROHIBITED_SQL_STATEMENT_ATTEMPTED),
@ -2666,12 +2753,14 @@ dblink_connstr_has_pw(const char *connstr)
}
/*
* For non-superusers, insist that the connstr specify a password, except
* if GSSAPI credentials have been delegated (and we check that they are used
* for the connection in dblink_security_check later). This prevents a
* password or GSSAPI credentials from being picked up from .pgpass, a
* service file, the environment, etc. We don't want the postgres user's
* passwords or Kerberos credentials to be accessible to non-superusers.
* For non-superusers, insist that the connstr specify a password, except if
* GSSAPI credentials have been delegated (and we check that they are used for
* the connection in dblink_security_check later) or if SCRAM pass-through is
* being used. This prevents a password or GSSAPI credentials from being
* picked up from .pgpass, a service file, the environment, etc. We don't want
* the postgres user's passwords or Kerberos credentials to be accessible to
* non-superusers. In case of SCRAM pass-through insist that the connstr
* has the required SCRAM pass-through options.
*/
static void
dblink_connstr_check(const char *connstr)
@ -2682,6 +2771,9 @@ dblink_connstr_check(const char *connstr)
if (dblink_connstr_has_pw(connstr))
return;
if (MyProcPort->has_scram_keys && dblink_connstr_has_required_scram_options(connstr))
return;
#ifdef ENABLE_GSS
if (be_gssapi_get_delegation(MyProcPort))
return;
@ -2834,6 +2926,14 @@ get_connect_string(const char *servername)
if (aclresult != ACLCHECK_OK)
aclcheck_error(aclresult, OBJECT_FOREIGN_SERVER, foreign_server->servername);
/*
* First append the hardcoded options needed for SCRAM pass-through, so
* that if the user overrides these options we can raise an error in
* dblink_connstr_check and dblink_security_check.
*/
if (MyProcPort->has_scram_keys && UseScramPassthrough(foreign_server, user_mapping))
appendSCRAMKeysInfo(&buf);
foreach(cell, fdw->options)
{
DefElem *def = lfirst(cell);
@ -3000,6 +3100,13 @@ is_valid_dblink_option(const PQconninfoOption *options, const char *option,
if (strcmp(opt->keyword, "client_encoding") == 0)
return false;
/*
* Disallow OAuth options for now, since the builtin flow communicates on
* stderr by default and can't cache tokens yet.
*/
if (strncmp(opt->keyword, "oauth_", strlen("oauth_")) == 0)
return false;
/*
* If the option is "user" or marked secure, it should be specified only
* in USER MAPPING. Others should be specified only in SERVER.
@ -3018,6 +3125,20 @@ is_valid_dblink_option(const PQconninfoOption *options, const char *option,
return true;
}
/*
* Same as is_valid_dblink_option, but additionally accepts options that
* are specific to dblink_fdw.
*/
static bool
is_valid_dblink_fdw_option(const PQconninfoOption *options, const char *option,
Oid context)
{
if (strcmp(option, "use_scram_passthrough") == 0)
return true;
return is_valid_dblink_option(options, option, context);
}
/*
* Copy the remote session's values of GUCs that affect datatype I/O
* and apply them locally in a new GUC nesting level. Returns the new
@ -3087,3 +3208,66 @@ restoreLocalGucs(int nestlevel)
if (nestlevel > 0)
AtEOXact_GUC(true, nestlevel);
}
/*
* Append SCRAM client key and server key information from the global
* MyProcPort into the given StringInfo buffer.
*/
static void
appendSCRAMKeysInfo(StringInfo buf)
{
int len;
int encoded_len;
char *client_key;
char *server_key;
len = pg_b64_enc_len(sizeof(MyProcPort->scram_ClientKey));
/* don't forget the zero-terminator */
client_key = palloc0(len + 1);
encoded_len = pg_b64_encode(MyProcPort->scram_ClientKey,
sizeof(MyProcPort->scram_ClientKey),
client_key, len);
if (encoded_len < 0)
elog(ERROR, "could not encode SCRAM client key");
len = pg_b64_enc_len(sizeof(MyProcPort->scram_ServerKey));
/* don't forget the zero-terminator */
server_key = palloc0(len + 1);
encoded_len = pg_b64_encode(MyProcPort->scram_ServerKey,
sizeof(MyProcPort->scram_ServerKey),
server_key, len);
if (encoded_len < 0)
elog(ERROR, "could not encode SCRAM server key");
appendStringInfo(buf, "scram_client_key='%s' ", client_key);
appendStringInfo(buf, "scram_server_key='%s' ", server_key);
appendStringInfoString(buf, "require_auth='scram-sha-256' ");
pfree(client_key);
pfree(server_key);
}
static bool
UseScramPassthrough(ForeignServer *foreign_server, UserMapping *user)
{
ListCell *cell;
foreach(cell, foreign_server->options)
{
DefElem *def = lfirst(cell);
if (strcmp(def->defname, "use_scram_passthrough") == 0)
return defGetBoolean(def);
}
foreach(cell, user->options)
{
DefElem *def = (DefElem *) lfirst(cell);
if (strcmp(def->defname, "use_scram_passthrough") == 0)
return defGetBoolean(def);
}
return false;
}
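This option is what the new TAP test below toggles; a minimal setup that enables pass-through at the server level looks like (names hypothetical, mirroring the test):

	CREATE SERVER loopback FOREIGN DATA WRAPPER dblink_fdw
	    OPTIONS (host '/tmp', port '5432', dbname 'db1',
	             use_scram_passthrough 'true');
	CREATE USER MAPPING FOR user01 SERVER loopback OPTIONS (user 'user01');

Note the lookup order in the function above: the server's options are consulted first and win if the option is present; the user mapping is only checked otherwise.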

View File

@ -898,6 +898,17 @@ CREATE USER MAPPING FOR public SERVER fdtest
OPTIONS (server 'localhost'); -- fail, can't specify server here
ERROR: invalid option "server"
CREATE USER MAPPING FOR public SERVER fdtest OPTIONS (user :'USER');
-- OAuth options are not allowed in either context
ALTER SERVER fdtest OPTIONS (ADD oauth_issuer 'https://example.com');
ERROR: invalid option "oauth_issuer"
ALTER SERVER fdtest OPTIONS (ADD oauth_client_id 'myID');
ERROR: invalid option "oauth_client_id"
ALTER USER MAPPING FOR public SERVER fdtest
OPTIONS (ADD oauth_issuer 'https://example.com');
ERROR: invalid option "oauth_issuer"
ALTER USER MAPPING FOR public SERVER fdtest
OPTIONS (ADD oauth_client_id 'myID');
ERROR: invalid option "oauth_client_id"
GRANT USAGE ON FOREIGN SERVER fdtest TO regress_dblink_user;
GRANT EXECUTE ON FUNCTION dblink_connect_u(text, text) TO regress_dblink_user;
SET SESSION AUTHORIZATION regress_dblink_user;

View File

@ -36,4 +36,9 @@ tests += {
],
'regress_args': ['--dlpath', meson.build_root() / 'src/test/regress'],
},
'tap': {
'tests': [
't/001_auth_scram.pl',
],
},
}

View File

@ -469,6 +469,14 @@ CREATE USER MAPPING FOR public SERVER fdtest
OPTIONS (server 'localhost'); -- fail, can't specify server here
CREATE USER MAPPING FOR public SERVER fdtest OPTIONS (user :'USER');
-- OAuth options are not allowed in either context
ALTER SERVER fdtest OPTIONS (ADD oauth_issuer 'https://example.com');
ALTER SERVER fdtest OPTIONS (ADD oauth_client_id 'myID');
ALTER USER MAPPING FOR public SERVER fdtest
OPTIONS (ADD oauth_issuer 'https://example.com');
ALTER USER MAPPING FOR public SERVER fdtest
OPTIONS (ADD oauth_client_id 'myID');
GRANT USAGE ON FOREIGN SERVER fdtest TO regress_dblink_user;
GRANT EXECUTE ON FUNCTION dblink_connect_u(text, text) TO regress_dblink_user;

View File

@ -0,0 +1,253 @@
# Copyright (c) 2024-2025, PostgreSQL Global Development Group
# Test SCRAM authentication when opening a new connection with a foreign
# server.
#
# The test exercises SCRAM authentication over a loopback connection to the
# same server and against a different server.
use strict;
use warnings FATAL => 'all';
use PostgreSQL::Test::Utils;
use PostgreSQL::Test::Cluster;
use Test::More;
if (!$use_unix_sockets)
{
plan skip_all => "test requires Unix-domain sockets";
}
my $user = "user01";
my $db0 = "db0"; # For node1
my $db1 = "db1"; # For node1
my $db2 = "db2"; # For node2
my $fdw_server = "db1_fdw";
my $fdw_server2 = "db2_fdw";
my $fdw_invalid_server = "db2_fdw_invalid"; # For invalid fdw options
my $fdw_invalid_server2 =
"db2_fdw_invalid2"; # For invalid scram keys fdw options
my $node1 = PostgreSQL::Test::Cluster->new('node1');
my $node2 = PostgreSQL::Test::Cluster->new('node2');
$node1->init;
$node2->init;
$node1->start;
$node2->start;
# Test setup
$node1->safe_psql('postgres', qq'CREATE USER $user WITH password \'pass\'');
$node2->safe_psql('postgres', qq'CREATE USER $user WITH password \'pass\'');
$ENV{PGPASSWORD} = "pass";
$node1->safe_psql('postgres', qq'CREATE DATABASE $db0');
$node1->safe_psql('postgres', qq'CREATE DATABASE $db1');
$node2->safe_psql('postgres', qq'CREATE DATABASE $db2');
setup_table($node1, $db1, "t");
setup_table($node2, $db2, "t2");
$node1->safe_psql($db0, 'CREATE EXTENSION IF NOT EXISTS dblink');
setup_fdw_server($node1, $db0, $fdw_server, $node1, $db1);
setup_fdw_server($node1, $db0, $fdw_server2, $node2, $db2);
setup_invalid_fdw_server($node1, $db0, $fdw_invalid_server, $node2, $db2);
setup_fdw_server($node1, $db0, $fdw_invalid_server2, $node2, $db2);
setup_user_mapping($node1, $db0, $fdw_server);
setup_user_mapping($node1, $db0, $fdw_server2);
setup_user_mapping($node1, $db0, $fdw_invalid_server);
# Give the user the same SCRAM verifier on both servers, forcing the same
# iteration count and salt.
my $rolpassword = $node1->safe_psql('postgres',
qq"SELECT rolpassword FROM pg_authid WHERE rolname = '$user';");
$node2->safe_psql('postgres', qq"ALTER ROLE $user PASSWORD '$rolpassword'");
unlink($node1->data_dir . '/pg_hba.conf');
unlink($node2->data_dir . '/pg_hba.conf');
$node1->append_conf(
'pg_hba.conf', qq{
local db0 all scram-sha-256
local db1 all scram-sha-256
}
);
$node2->append_conf(
'pg_hba.conf', qq{
local db2 all scram-sha-256
}
);
$node1->restart;
$node2->restart;
# End of test setup
test_scram_keys_is_not_overwritten($node1, $db0, $fdw_invalid_server2);
test_fdw_auth($node1, $db0, "t", $fdw_server,
"SCRAM auth on the same database cluster must succeed");
test_fdw_auth($node1, $db0, "t2", $fdw_server2,
"SCRAM auth on a different database cluster must succeed");
test_fdw_auth_with_invalid_overwritten_require_auth($fdw_invalid_server);
# Ensure that trust connections fail without superuser opt-in.
unlink($node1->data_dir . '/pg_hba.conf');
unlink($node2->data_dir . '/pg_hba.conf');
$node1->append_conf(
'pg_hba.conf', qq{
local db0 all scram-sha-256
local db1 all trust
}
);
$node2->append_conf(
'pg_hba.conf', qq{
local all all password
}
);
$node1->restart;
$node2->restart;
my ($ret, $stdout, $stderr) = $node1->psql(
$db0,
"SELECT * FROM dblink('$fdw_server', 'SELECT * FROM t') AS t(a int, b int)",
connstr => $node1->connstr($db0) . " user=$user");
is($ret, 3, 'loopback trust fails on the same cluster');
like(
$stderr,
qr/failed: authentication method requirement "scram-sha-256" failed: server did not complete authentication/,
'expected error from loopback trust (same cluster)');
($ret, $stdout, $stderr) = $node1->psql(
$db0,
"SELECT * FROM dblink('$fdw_server2', 'SELECT * FROM t2') AS t2(a int, b int)",
connstr => $node1->connstr($db0) . " user=$user");
is($ret, 3, 'loopback password fails on a different cluster');
like(
$stderr,
qr/authentication method requirement "scram-sha-256" failed: server requested a cleartext password/,
'expected error from loopback password (different cluster)');
# Helper functions
sub test_fdw_auth
{
local $Test::Builder::Level = $Test::Builder::Level + 1;
my ($node, $db, $tbl, $fdw, $testname) = @_;
my $connstr = $node->connstr($db) . qq' user=$user';
my $ret = $node->safe_psql(
$db,
qq"SELECT count(1) FROM dblink('$fdw', 'SELECT * FROM $tbl') AS $tbl(a int, b int)",
connstr => $connstr);
is($ret, '10', $testname);
}
sub test_fdw_auth_with_invalid_overwritten_require_auth
{
local $Test::Builder::Level = $Test::Builder::Level + 1;
my ($fdw) = @_;
my ($ret, $stdout, $stderr) = $node1->psql(
$db0,
"select * from dblink('$fdw', 'select * from t') as t(a int, b int)",
connstr => $node1->connstr($db0) . " user=$user");
is($ret, 3, 'loopback trust fails when overwriting require_auth');
like(
$stderr,
qr/password or GSSAPI delegated credentials required/,
'expected error when connecting to a fdw overwriting the require_auth'
);
}
sub test_scram_keys_is_not_overwritten
{
local $Test::Builder::Level = $Test::Builder::Level + 1;
my ($node, $db, $fdw) = @_;
my ($ret, $stdout, $stderr) = $node->psql(
$db,
qq'CREATE USER MAPPING FOR $user SERVER $fdw OPTIONS (user \'$user\', scram_client_key \'key\');',
connstr => $node->connstr($db) . " user=$user");
is($ret, 3, 'user mapping creation fails when using scram_client_key');
like(
$stderr,
qr/ERROR: invalid option "scram_client_key"/,
'user mapping creation fails when using scram_client_key');
($ret, $stdout, $stderr) = $node->psql(
$db,
qq'CREATE USER MAPPING FOR $user SERVER $fdw OPTIONS (user \'$user\', scram_server_key \'key\');',
connstr => $node->connstr($db) . " user=$user");
is($ret, 3, 'user mapping creation fails when using scram_server_key');
like(
$stderr,
qr/ERROR: invalid option "scram_server_key"/,
'user mapping creation fails when using scram_server_key');
}
sub setup_user_mapping
{
my ($node, $db, $fdw) = @_;
$node->safe_psql($db,
qq'CREATE USER MAPPING FOR $user SERVER $fdw OPTIONS (user \'$user\');'
);
}
sub setup_fdw_server
{
my ($node, $db, $fdw, $fdw_node, $dbname) = @_;
my $host = $fdw_node->host;
my $port = $fdw_node->port;
$node->safe_psql(
$db, qq'CREATE SERVER $fdw FOREIGN DATA WRAPPER dblink_fdw options (
host \'$host\', port \'$port\', dbname \'$dbname\', use_scram_passthrough \'true\') '
);
$node->safe_psql($db, qq'GRANT USAGE ON FOREIGN SERVER $fdw TO $user;');
$node->safe_psql($db, qq'GRANT ALL ON SCHEMA public TO $user');
}
sub setup_invalid_fdw_server
{
my ($node, $db, $fdw, $fdw_node, $dbname) = @_;
my $host = $fdw_node->host;
my $port = $fdw_node->port;
$node->safe_psql(
$db, qq'CREATE SERVER $fdw FOREIGN DATA WRAPPER dblink_fdw options (
host \'$host\', port \'$port\', dbname \'$dbname\', use_scram_passthrough \'true\', require_auth \'none\') '
);
$node->safe_psql($db, qq'GRANT USAGE ON FOREIGN SERVER $fdw TO $user;');
$node->safe_psql($db, qq'GRANT ALL ON SCHEMA public TO $user');
}
sub setup_table
{
my ($node, $db, $tbl) = @_;
$node->safe_psql($db,
qq'CREATE TABLE $tbl AS SELECT g as a, g + 1 as b FROM generate_series(1,10) g(g)'
);
$node->safe_psql($db, qq'GRANT USAGE ON SCHEMA public TO $user');
$node->safe_psql($db, qq'GRANT SELECT ON $tbl TO $user');
}
done_testing();

View File

@ -15,7 +15,10 @@
#include "commands/defrem.h"
#include "tsearch/ts_public.h"
PG_MODULE_MAGIC;
PG_MODULE_MAGIC_EXT(
.name = "dict_int",
.version = PG_VERSION
);
typedef struct
{

View File

@ -20,7 +20,10 @@
#include "tsearch/ts_public.h"
#include "utils/formatting.h"
PG_MODULE_MAGIC;
PG_MODULE_MAGIC_EXT(
.name = "dict_xsyn",
.version = PG_VERSION
);
typedef struct
{

View File

@ -11,7 +11,10 @@
#define M_PI 3.14159265358979323846
#endif
PG_MODULE_MAGIC;
PG_MODULE_MAGIC_EXT(
.name = "earthdistance",
.version = PG_VERSION
);
/* Earth's radius is in statute miles. */
static const double EARTH_RADIUS = 3958.747716;

View File

@ -24,8 +24,10 @@
#include "commands/copy.h"
#include "commands/copyfrom_internal.h"
#include "commands/defrem.h"
#include "commands/explain.h"
#include "commands/explain_format.h"
#include "commands/explain_state.h"
#include "commands/vacuum.h"
#include "executor/executor.h"
#include "foreign/fdwapi.h"
#include "foreign/foreign.h"
#include "miscadmin.h"
@ -40,7 +42,10 @@
#include "utils/sampling.h"
#include "utils/varlena.h"
PG_MODULE_MAGIC;
PG_MODULE_MAGIC_EXT(
.name = "file_fdw",
.version = PG_VERSION
);
/*
* Describes the valid options for objects that use this wrapper.
@ -793,8 +798,8 @@ retry:
cstate->num_errors > cstate->opts.reject_limit)
ereport(ERROR,
(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
errmsg("skipped more than REJECT_LIMIT (%lld) rows due to data type incompatibility",
(long long) cstate->opts.reject_limit)));
errmsg("skipped more than REJECT_LIMIT (%" PRId64 ") rows due to data type incompatibility",
cstate->opts.reject_limit)));
/* Repeat NextCopyFrom() until no soft error occurs */
goto retry;
@ -850,10 +855,10 @@ fileEndForeignScan(ForeignScanState *node)
festate->cstate->num_errors > 0 &&
festate->cstate->opts.log_verbosity >= COPY_LOG_VERBOSITY_DEFAULT)
ereport(NOTICE,
errmsg_plural("%llu row was skipped due to data type incompatibility",
"%llu rows were skipped due to data type incompatibility",
(unsigned long long) festate->cstate->num_errors,
(unsigned long long) festate->cstate->num_errors));
errmsg_plural("%" PRIu64 " row was skipped due to data type incompatibility",
"%" PRIu64 " rows were skipped due to data type incompatibility",
festate->cstate->num_errors,
festate->cstate->num_errors));
EndCopyFrom(festate->cstate);
}
@ -1314,10 +1319,10 @@ file_acquire_sample_rows(Relation onerel, int elevel,
cstate->num_errors > 0 &&
cstate->opts.log_verbosity >= COPY_LOG_VERBOSITY_DEFAULT)
ereport(NOTICE,
errmsg_plural("%llu row was skipped due to data type incompatibility",
"%llu rows were skipped due to data type incompatibility",
(unsigned long long) cstate->num_errors,
(unsigned long long) cstate->num_errors));
errmsg_plural("%" PRIu64 " row was skipped due to data type incompatibility",
"%" PRIu64 " rows were skipped due to data type incompatibility",
cstate->num_errors,
cstate->num_errors));
EndCopyFrom(cstate);

View File

@ -44,7 +44,10 @@
#include "utils/varlena.h"
#include "varatt.h"
PG_MODULE_MAGIC;
PG_MODULE_MAGIC_EXT(
.name = "fuzzystrmatch",
.version = PG_VERSION
);
/*
* Soundex

View File

@ -21,7 +21,10 @@
#include "utils/memutils.h"
#include "utils/typcache.h"
PG_MODULE_MAGIC;
PG_MODULE_MAGIC_EXT(
.name = "hstore",
.version = PG_VERSION
);
/* old names for C functions */
HSTORE_POLLUTE(hstore_from_text, tconvert);

View File

@ -4,7 +4,10 @@
#include "hstore/hstore.h"
#include "plperl.h"
PG_MODULE_MAGIC;
PG_MODULE_MAGIC_EXT(
.name = "hstore_plperl",
.version = PG_VERSION
);
/* Linkage to functions in hstore module */
typedef HStore *(*hstoreUpgrade_t) (Datum orig);

View File

@ -3,9 +3,12 @@
#include "fmgr.h"
#include "hstore/hstore.h"
#include "plpy_typeio.h"
#include "plpython.h"
#include "plpy_util.h"
PG_MODULE_MAGIC;
PG_MODULE_MAGIC_EXT(
.name = "hstore_plpython",
.version = PG_VERSION
);
/* Linkage to functions in plpython module */
typedef char *(*PLyObject_AsString_t) (PyObject *plrv);

View File

@ -41,17 +41,17 @@ typedef struct
#define SORT(x) \
do { \
int _nelems_ = ARRNELEMS(x); \
if (_nelems_ > 1) \
isort(ARRPTR(x), _nelems_); \
bool _ascending = true; \
isort(ARRPTR(x), _nelems_, &_ascending); \
} while(0)
/* sort the elements of the array and remove duplicates */
#define PREPAREARR(x) \
do { \
int _nelems_ = ARRNELEMS(x); \
if (_nelems_ > 1) \
if (isort(ARRPTR(x), _nelems_)) \
(x) = _int_unique(x); \
bool _ascending = true; \
isort(ARRPTR(x), _nelems_, &_ascending); \
(x) = _int_unique(x); \
} while(0)
/* "wish" function */
@ -109,7 +109,7 @@ typedef struct
/*
* useful functions
*/
bool isort(int32 *a, int len);
void isort(int32 *a, size_t len, void *arg);
ArrayType *new_intArrayType(int num);
ArrayType *copy_intArrayType(ArrayType *a);
ArrayType *resize_intArrayType(ArrayType *a, int num);
@ -176,16 +176,12 @@ bool execconsistent(QUERYTYPE *query, ArrayType *array, bool calcnot);
bool gin_bool_consistent(QUERYTYPE *query, bool *check);
bool query_has_required_values(QUERYTYPE *query);
int compASC(const void *a, const void *b);
int compDESC(const void *a, const void *b);
/* sort, either ascending or descending */
#define QSORT(a, direction) \
do { \
int _nelems_ = ARRNELEMS(a); \
if (_nelems_ > 1) \
qsort((void*) ARRPTR(a), _nelems_, sizeof(int32), \
(direction) ? compASC : compDESC ); \
bool _ascending = (direction) ? true : false; \
isort(ARRPTR(a), _nelems_, &_ascending); \
} while(0)
#endif /* ___INT_H__ */

View File

@ -5,7 +5,10 @@
#include "_int.h"
PG_MODULE_MAGIC;
PG_MODULE_MAGIC_EXT(
.name = "intarray",
.version = PG_VERSION
);
PG_FUNCTION_INFO_V1(_int_different);
PG_FUNCTION_INFO_V1(_int_same);

View File

@ -186,36 +186,38 @@ rt__int_size(ArrayType *a, float *size)
*size = (float) ARRNELEMS(a);
}
/* qsort_arg comparison function for isort() */
static int
/* comparison function for isort() and _int_unique() */
static inline int
isort_cmp(const void *a, const void *b, void *arg)
{
int32 aval = *((const int32 *) a);
int32 bval = *((const int32 *) b);
if (aval < bval)
return -1;
if (aval > bval)
return 1;
/*
* Report if we have any duplicates. If there are equal keys, qsort must
* compare them at some point, else it wouldn't know whether one should go
* before or after the other.
*/
*((bool *) arg) = true;
if (*((bool *) arg))
{
/* compare for ascending order */
if (aval < bval)
return -1;
if (aval > bval)
return 1;
}
else
{
if (aval > bval)
return -1;
if (aval < bval)
return 1;
}
return 0;
}
/* Sort the given data (len >= 2). Return true if any duplicates found */
bool
isort(int32 *a, int len)
{
bool r = false;
qsort_arg(a, len, sizeof(int32), isort_cmp, &r);
return r;
}
#define ST_SORT isort
#define ST_ELEMENT_TYPE int32
#define ST_COMPARE(a, b, ascending) isort_cmp(a, b, ascending)
#define ST_COMPARE_ARG_TYPE void
#define ST_SCOPE
#define ST_DEFINE
#include "lib/sort_template.h"
/* Create a new int array with room for "num" elements */
ArrayType *
@ -311,10 +313,10 @@ ArrayType *
_int_unique(ArrayType *r)
{
int num = ARRNELEMS(r);
bool duplicates_found; /* not used */
bool ascending = true;
num = qunique_arg(ARRPTR(r), num, sizeof(int), isort_cmp,
&duplicates_found);
&ascending);
return resize_intArrayType(r, num);
}
@ -393,15 +395,3 @@ int_to_intset(int32 elem)
aa[0] = elem;
return result;
}
int
compASC(const void *a, const void *b)
{
return pg_cmp_s32(*(const int32 *) a, *(const int32 *) b);
}
int
compDESC(const void *a, const void *b)
{
return pg_cmp_s32(*(const int32 *) b, *(const int32 *) a);
}

View File

@ -492,6 +492,12 @@ SELECT count(*) from test__int WHERE a @@ '!20 & !21';
6344
(1 row)
SELECT count(*) from test__int WHERE a @@ '!2733 & (2738 | 254)';
count
-------
12
(1 row)
SET enable_seqscan = off; -- not all of these would use index by default
CREATE INDEX text_idx on test__int using gist ( a gist__int_ops );
SELECT count(*) from test__int WHERE a && '{23,50}';
@ -566,6 +572,12 @@ SELECT count(*) from test__int WHERE a @@ '!20 & !21';
6344
(1 row)
SELECT count(*) from test__int WHERE a @@ '!2733 & (2738 | 254)';
count
-------
12
(1 row)
INSERT INTO test__int SELECT array(SELECT x FROM generate_series(1, 1001) x); -- should fail
ERROR: input array is too big (199 maximum allowed, 1001 current), use gist__intbig_ops opclass instead
DROP INDEX text_idx;
@ -648,6 +660,12 @@ SELECT count(*) from test__int WHERE a @@ '!20 & !21';
6344
(1 row)
SELECT count(*) from test__int WHERE a @@ '!2733 & (2738 | 254)';
count
-------
12
(1 row)
DROP INDEX text_idx;
CREATE INDEX text_idx on test__int using gist (a gist__intbig_ops(siglen = 0));
ERROR: value 0 out of bounds for option "siglen"
@ -728,6 +746,12 @@ SELECT count(*) from test__int WHERE a @@ '!20 & !21';
6344
(1 row)
SELECT count(*) from test__int WHERE a @@ '!2733 & (2738 | 254)';
count
-------
12
(1 row)
DROP INDEX text_idx;
CREATE INDEX text_idx on test__int using gist ( a gist__intbig_ops );
SELECT count(*) from test__int WHERE a && '{23,50}';
@ -802,6 +826,12 @@ SELECT count(*) from test__int WHERE a @@ '!20 & !21';
6344
(1 row)
SELECT count(*) from test__int WHERE a @@ '!2733 & (2738 | 254)';
count
-------
12
(1 row)
DROP INDEX text_idx;
CREATE INDEX text_idx on test__int using gin ( a gin__int_ops );
SELECT count(*) from test__int WHERE a && '{23,50}';
@ -876,6 +906,12 @@ SELECT count(*) from test__int WHERE a @@ '!20 & !21';
6344
(1 row)
SELECT count(*) from test__int WHERE a @@ '!2733 & (2738 | 254)';
count
-------
12
(1 row)
DROP INDEX text_idx;
-- Repeat the same queries with an extended data set. The data set is the
-- same that we used before, except that each element in the array is
@ -968,4 +1004,10 @@ SELECT count(*) from more__int WHERE a @@ '!20 & !21';
6344
(1 row)
SELECT count(*) from test__int WHERE a @@ '!2733 & (2738 | 254)';
count
-------
12
(1 row)
RESET enable_seqscan;

View File

@ -107,6 +107,7 @@ SELECT count(*) from test__int WHERE a @> '{20,23}' or a @> '{50,68}';
SELECT count(*) from test__int WHERE a @@ '(20&23)|(50&68)';
SELECT count(*) from test__int WHERE a @@ '20 | !21';
SELECT count(*) from test__int WHERE a @@ '!20 & !21';
SELECT count(*) from test__int WHERE a @@ '!2733 & (2738 | 254)';
SET enable_seqscan = off; -- not all of these would use index by default
@ -124,6 +125,7 @@ SELECT count(*) from test__int WHERE a @> '{20,23}' or a @> '{50,68}';
SELECT count(*) from test__int WHERE a @@ '(20&23)|(50&68)';
SELECT count(*) from test__int WHERE a @@ '20 | !21';
SELECT count(*) from test__int WHERE a @@ '!20 & !21';
SELECT count(*) from test__int WHERE a @@ '!2733 & (2738 | 254)';
INSERT INTO test__int SELECT array(SELECT x FROM generate_series(1, 1001) x); -- should fail
@ -144,6 +146,7 @@ SELECT count(*) from test__int WHERE a @> '{20,23}' or a @> '{50,68}';
SELECT count(*) from test__int WHERE a @@ '(20&23)|(50&68)';
SELECT count(*) from test__int WHERE a @@ '20 | !21';
SELECT count(*) from test__int WHERE a @@ '!20 & !21';
SELECT count(*) from test__int WHERE a @@ '!2733 & (2738 | 254)';
DROP INDEX text_idx;
CREATE INDEX text_idx on test__int using gist (a gist__intbig_ops(siglen = 0));
@ -162,6 +165,7 @@ SELECT count(*) from test__int WHERE a @> '{20,23}' or a @> '{50,68}';
SELECT count(*) from test__int WHERE a @@ '(20&23)|(50&68)';
SELECT count(*) from test__int WHERE a @@ '20 | !21';
SELECT count(*) from test__int WHERE a @@ '!20 & !21';
SELECT count(*) from test__int WHERE a @@ '!2733 & (2738 | 254)';
DROP INDEX text_idx;
CREATE INDEX text_idx on test__int using gist ( a gist__intbig_ops );
@ -178,6 +182,7 @@ SELECT count(*) from test__int WHERE a @> '{20,23}' or a @> '{50,68}';
SELECT count(*) from test__int WHERE a @@ '(20&23)|(50&68)';
SELECT count(*) from test__int WHERE a @@ '20 | !21';
SELECT count(*) from test__int WHERE a @@ '!20 & !21';
SELECT count(*) from test__int WHERE a @@ '!2733 & (2738 | 254)';
DROP INDEX text_idx;
CREATE INDEX text_idx on test__int using gin ( a gin__int_ops );
@ -194,6 +199,7 @@ SELECT count(*) from test__int WHERE a @> '{20,23}' or a @> '{50,68}';
SELECT count(*) from test__int WHERE a @@ '(20&23)|(50&68)';
SELECT count(*) from test__int WHERE a @@ '20 | !21';
SELECT count(*) from test__int WHERE a @@ '!20 & !21';
SELECT count(*) from test__int WHERE a @@ '!2733 & (2738 | 254)';
DROP INDEX text_idx;
@ -229,6 +235,7 @@ SELECT count(*) from more__int WHERE a @> '{20,23}' or a @> '{50,68}';
SELECT count(*) from more__int WHERE a @@ '(20&23)|(50&68)';
SELECT count(*) from more__int WHERE a @@ '20 | !21';
SELECT count(*) from more__int WHERE a @@ '!20 & !21';
SELECT count(*) from test__int WHERE a @@ '!2733 & (2738 | 254)';
RESET enable_seqscan;

View File

@ -3,8 +3,8 @@
MODULES = isn
EXTENSION = isn
DATA = isn--1.1.sql isn--1.1--1.2.sql \
isn--1.0--1.1.sql
DATA = isn--1.0--1.1.sql isn--1.1.sql \
isn--1.1--1.2.sql isn--1.2--1.3.sql
PGFILEDESC = "isn - data types for international product numbering standards"
# the other .h files are data tables, we don't install those

View File

@ -279,6 +279,50 @@ FROM (VALUES ('9780123456786', 'UPC'),
9771234567003 | ISSN | t | | | |
(3 rows)
--
-- test weak mode
--
SELECT '2222222222221'::ean13; -- fail
ERROR: invalid check digit for EAN13 number: "2222222222221", should be 2
LINE 1: SELECT '2222222222221'::ean13;
^
SET isn.weak TO TRUE;
SELECT '2222222222221'::ean13;
ean13
------------------
222-222222222-2!
(1 row)
SELECT is_valid('2222222222221'::ean13);
is_valid
----------
f
(1 row)
SELECT make_valid('2222222222221'::ean13);
make_valid
-----------------
222-222222222-2
(1 row)
SELECT isn_weak(); -- backwards-compatibility wrappers for accessing the GUC
isn_weak
----------
t
(1 row)
SELECT isn_weak(false);
isn_weak
----------
f
(1 row)
SHOW isn.weak;
isn.weak
----------
off
(1 row)
--
-- cleanup
--

View File

@ -0,0 +1,7 @@
/* contrib/isn/isn--1.2--1.3.sql */
-- complain if script is sourced in psql, rather than via ALTER EXTENSION
\echo Use "ALTER EXTENSION isn UPDATE TO '1.3'" to load this file. \quit
ALTER FUNCTION isn_weak(boolean) VOLATILE PARALLEL UNSAFE;
ALTER FUNCTION isn_weak() STABLE PARALLEL SAFE;

View File

@ -21,8 +21,12 @@
#include "UPC.h"
#include "fmgr.h"
#include "isn.h"
#include "utils/guc.h"
PG_MODULE_MAGIC;
PG_MODULE_MAGIC_EXT(
.name = "isn",
.version = PG_VERSION
);
#ifdef USE_ASSERT_CHECKING
#define ISN_DEBUG 1
@ -39,6 +43,7 @@ enum isn_type
static const char *const isn_names[] = {"EAN13/UPC/ISxN", "EAN13/UPC/ISxN", "EAN13", "ISBN", "ISMN", "ISSN", "UPC"};
/* GUC value */
static bool g_weak = false;
@ -929,6 +934,20 @@ _PG_init(void)
if (!check_table(UPC_range, UPC_index))
elog(ERROR, "UPC failed check");
}
/* Define a GUC variable for weak mode. */
DefineCustomBoolVariable("isn.weak",
"Accept input with invalid ISN check digits.",
NULL,
&g_weak,
false,
PGC_USERSET,
0,
NULL,
NULL,
NULL);
MarkGUCPrefixReserved("isn");
}
/* isn_out
@ -1109,17 +1128,16 @@ make_valid(PG_FUNCTION_ARGS)
/* this function temporarily sets weak input flag
* (to lose the strictness of check digit acceptance)
* It's a helper function, not intended to be used!!
*/
PG_FUNCTION_INFO_V1(accept_weak_input);
Datum
accept_weak_input(PG_FUNCTION_ARGS)
{
#ifdef ISN_WEAK_MODE
g_weak = PG_GETARG_BOOL(0);
#else
/* function has no effect */
#endif /* ISN_WEAK_MODE */
bool newvalue = PG_GETARG_BOOL(0);
(void) set_config_option("isn.weak", newvalue ? "on" : "off",
PGC_USERSET, PGC_S_SESSION,
GUC_ACTION_SET, true, 0, false);
PG_RETURN_BOOL(g_weak);
}

View File

@ -1,6 +1,6 @@
# isn extension
comment = 'data types for international product numbering standards'
default_version = '1.2'
default_version = '1.3'
module_pathname = '$libdir/isn'
relocatable = true
trusted = true

View File

@ -18,7 +18,6 @@
#include "fmgr.h"
#undef ISN_DEBUG
#define ISN_WEAK_MODE
/*
* uint64 is the internal storage format for ISNs.

View File

@ -19,8 +19,9 @@ contrib_targets += isn
install_data(
'isn.control',
'isn--1.0--1.1.sql',
'isn--1.1--1.2.sql',
'isn--1.1.sql',
'isn--1.1--1.2.sql',
'isn--1.2--1.3.sql',
kwargs: contrib_data_args,
)

View File

@ -120,6 +120,19 @@ FROM (VALUES ('9780123456786', 'UPC'),
AS a(str,typ),
LATERAL pg_input_error_info(a.str, a.typ) as errinfo;
--
-- test weak mode
--
SELECT '2222222222221'::ean13; -- fail
SET isn.weak TO TRUE;
SELECT '2222222222221'::ean13;
SELECT is_valid('2222222222221'::ean13);
SELECT make_valid('2222222222221'::ean13);
SELECT isn_weak(); -- backwards-compatibility wrappers for accessing the GUC
SELECT isn_weak(false);
SHOW isn.weak;
--
-- cleanup
--

View File

@ -7,7 +7,10 @@
#include "utils/fmgrprotos.h"
#include "utils/jsonb.h"
PG_MODULE_MAGIC;
PG_MODULE_MAGIC_EXT(
.name = "jsonb_plperl",
.version = PG_VERSION
);
static SV *Jsonb_to_SV(JsonbContainer *jsonb);
static JsonbValue *SV_to_JsonbValue(SV *obj, JsonbParseState **ps, bool is_elem);

View File

@ -2,12 +2,15 @@
#include "plpy_elog.h"
#include "plpy_typeio.h"
#include "plpython.h"
#include "plpy_util.h"
#include "utils/fmgrprotos.h"
#include "utils/jsonb.h"
#include "utils/numeric.h"
PG_MODULE_MAGIC;
PG_MODULE_MAGIC_EXT(
.name = "jsonb_plpython",
.version = PG_VERSION
);
/* for PLyObject_AsString in plpy_typeio.c */
typedef char *(*PLyObject_AsString_t) (PyObject *plrv);

View File

@ -12,7 +12,10 @@
#include "utils/fmgrprotos.h"
#include "utils/rel.h"
PG_MODULE_MAGIC;
PG_MODULE_MAGIC_EXT(
.name = "lo",
.version = PG_VERSION
);
/*

View File

@ -13,7 +13,10 @@
#include "utils/selfuncs.h"
#include "varatt.h"
PG_MODULE_MAGIC;
PG_MODULE_MAGIC_EXT(
.name = "ltree",
.version = PG_VERSION
);
/* compare functions */
PG_FUNCTION_INFO_V1(ltree_cmp);

Some files were not shown because too many files have changed in this diff