Commit Graph

24968 Commits

Author SHA1 Message Date
Masahiko Sawada
08e6344fd6 Remove ReorderBufferTupleBuf structure.
Since commit a4ccc1cef, the 'node' and 'alloc_tuple_size' fields of
the ReorderBufferTupleBuf structure are no longer used. This leaves
only the 'tuple' field in the structure. Since keeping a single-field
structure makes little sense, the ReorderBufferTupleBuf is removed
entirely. The code is refactored accordingly.

No back-patching since these are ABI changes in an exposed structure
and functions, and there would be some risk of breaking extensions.

Author: Aleksander Alekseev
Reviewed-by: Amit Kapila, Masahiko Sawada, Reid Thompson
Discussion: https://postgr.es/m/CAD21AoCvnuxiXXfRecp7g9+CeC35POQfhuQeJFr7_9u_Q5jc_Q@mail.gmail.com
2024-01-29 10:37:16 +09:00
Michael Paquier
50b797dc99 Fix DROP ROLE when specifying duplicated roles
This commit fixes failures with "tuple already updated by self" when
listing twice the same role and in a DROP ROLE query.

This is an oversight in 6566133c5f, that has introduced a two-phase
logic in DropRole() where dependencies of all the roles to drop are
removed in a first phase, with the roles themselves removed from
pg_authid in a second phase.

The code is simplified to not rely on a List of ObjectAddress built in
the first phase used to remove the pg_authid entries in the second
phase, switching to a list of OIDs.  Duplicated OIDs can be simply
avoided in the first phase thanks to that.  Using ObjectAddress was not
necessary for the roles as they are not used for anything specific to
dependency.c, building all the ObjectAddress in the List with
AuthIdRelationId as class ID.

In 15 and older versions, where a single phase is used, DROP ROLE with
duplicated role names would fail on "role \"blah\" does not exist" for
the second entry after the CCI() done by the first deletion.  This is
not really incorrect, but it does not seem worth changing based on a
lack of complaints.

Reported-by: Alexander Lakhin
Reviewed-by: Tender Wang
Discussion: https://postgr.es/m/18310-1eb233c5908189c8@postgresql.org
Backpatch-through: 16
2024-01-29 08:05:59 +09:00
Tom Lane
5e444a2526 Compare varnullingrels too in assign_param_for_var().
Oversight in 2489d76c4.  Preliminary analysis suggests that the
problem may be unreachable --- but if we did have instances of
the same column with different varnullingrels, we'd surely need
to treat them as different Params.

Discussion: https://postgr.es/m/412552.1706203379@sss.pgh.pa.us
2024-01-26 15:54:17 -05:00
Tom Lane
25cd2d6402 Detect Julian-date overflow in timestamp[tz]_pl_interval.
We perform addition of the days field of an interval via
arithmetic on the Julian-date representation of the timestamp's date.
This step is subject to int32 overflow, and we also should not let
the Julian date become very negative, for fear of weird results from
j2date.  (In the timestamptz case, allow a Julian date of -1 to pass,
since it might convert back to zero after timezone rotation.)

The additions of the months and microseconds fields could also
overflow, of course.  However, I believe we need no additional
checks there; the existing range checks should catch such cases.
The difficulty here is that j2date's magic modular arithmetic could
produce something that looks like it's in-range.

Per bug #18313 from Christian Maurer.  This has been wrong for
a long time, so back-patch to all supported branches.

Discussion: https://postgr.es/m/18313-64d2c8952d81e84b@postgresql.org
2024-01-26 13:39:45 -05:00
Robert Haas
5ddf997347 Temporary patch to help debug pg_walsummary test failures.
The tests in 002_blocks.pl are failing in the buildfarm from time to
time, but we don't know how to reproduce the failure elsewhere. The
most obvious explanation seems to be the unexpected disappearance of a
WAL summary file, so bump up the logging level in
RemoveWalSummaryIfOlderThan to try to help us spot such problems, and
print the cutoff time in addition to the removed filename. Also
adjust 002_blocks.pl to dump out a directory listing of the relevant
directory at various points.

This patch should be reverted once we sort out what's happening here.

Patch by me, reviewed by Nathan Bossart, who also reported the issue.

Discussion: http://postgr.es/m/20240124170846.GA2643050@nathanxps13
2024-01-26 13:25:19 -05:00
Robert Haas
5eafacd279 Combine FSM updates for prune and no-prune cases.
lazy_scan_prune() and lazy_scan_noprune() update the freespace map
with identical conditions; combine them. This consolidation is easier
now that cb970240f1 moved visibility map
updates into lazy_scan_prune().

While combining the FSM updates, simplify the logic for calling
lazy_scan_new_or_empty() and lazy_scan_noprune().

Also update a few comemnts in this part of the code to make them,
hopefully, clearer.

Melanie Plageman and Robert Haas

Discussion: https://postgr.es/m/CA%2BTgmoaLTvipm%3Dxx4rJLr07m908PCu%3DQH3uCjD1UOn8YaEuO2g%40mail.gmail.com
2024-01-26 11:40:16 -05:00
Peter Eisentraut
f7cf9494ba Split some code out from MergeAttributes()
- Separate function to merge a child attribute into matching inherited
  attribute: The logic to merge a child attribute into matching
  inherited attribute in MergeAttribute() is only applicable to
  regular inheritance child.  The code is isolated and coherent enough
  that it can be separated into a function of its own.

- Separate function to merge next parent attribute: Partitions inherit
  from only a single parent.  The logic to merge an attribute from the
  next parent into the corresponding attribute inherited from previous
  parents in MergeAttribute() is only applicable to regular
  inheritance children.  This code is isolated enough that it can be
  separate into a function by itself.

These separations makes MergeAttribute() more readable by making it
easier to follow high level logic without getting entangled into
details.

Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/52a125e4-ff9a-95f5-9f61-b87cf447e4da@eisentraut.org
2024-01-26 13:52:05 +01:00
Alvaro Herrera
8c9da1441d
Make spelling of cancelled/cancellation consistent
This fixes places where words derived from cancel were not using their
common en-US ugly^H^H^H^Hspelling.

Author: Jelte Fennema-Nio <postgres@jeltef.nl>
Reported-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/CA+hUKG+Lrq+ty6yWXF5572qNQ8KwxGwG5n4fsEcCUap685nWvQ@mail.gmail.com
2024-01-26 12:38:15 +01:00
Peter Eisentraut
64444ce071 MergeAttributes code deduplication
The code handling NOT NULL constraints is duplicated in blocks merging
the attribute definition incrementally.  Deduplicate that code.

Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/52a125e4-ff9a-95f5-9f61-b87cf447e4da@eisentraut.org
2024-01-26 11:05:10 +01:00
Michael Paquier
f2bf8fb048 Reindex toast before its main relation in reindex_relation()
This commit changes the order of reindex on a relation so as a toast
relation is processed before its main relation.

The original order, where a rebuild was first done for the indexes on
the main table, could be a problem in the event of a corruption of a
toast index, because, as scans of a toast index may be required to
rebuild the indexes on the main relation, this could lead to failures
with REINDEX TABLE without being able to fix anything.

Rebuilding corrupted toast indexes before this change was possible but
troublesome, as it was necessary to issue a REINDEX on the toast
relation first, followed by a REINDEX on the main relation.  Changing
the order of these operations should make things easier when rebuilding
corrupted indexes, as toast indexes would be rebuilt before they are
used for the indexes on the main relation.

Per request from Richard Vesely.

Author: Gurjeet Singh
Reviewed-by: Nathan Bossart, Michael Paquier
Discussion: https://postgr.es/m/18016-2bd9b549b1fe49b3@postgresql.org
2024-01-26 17:39:58 +09:00
David Rowley
bc397e5cdb De-dupicate Memoize cache keys
It was possible when determining the cache keys for a Memoize path that
if the same expr appeared twice in the parameterized path's ppi_clauses
and/or in the Nested Loop's inner relation's lateral_vars.  If this
happened the Memoize node's cache keys would contain duplicates.  This
isn't a problem for correctness, all it means is that the cache lookups
will be suboptimal due to having redundant work to do on every hash table
lookup and insert.

Here we adjust paraminfo_get_equal_hashops() to look for duplicates and
ignore them when we find them.

Author: David Rowley
Reviewed-by: Richard Guo
Discussion: https://postgr.es/m/422277.1706207562%40sss.pgh.pa.us
2024-01-26 20:51:36 +13:00
Michael Paquier
bd5760df38 Fix comment in index.c
Extracted from a larger patch by the same author.

Author: Gurjeet Singh
Discussion: https://postgr.es/m/CABwTF4WX=m5pQvKXvLFJoEH=hSd6O=iZSqxVqHKjFm+iL-AO=w@mail.gmail.com
2024-01-26 14:08:04 +09:00
David Rowley
2cca95e175 Improve NestLoopParam generation for lateral subqueries
It was possible in cases where we had a LATERAL joined subquery that
when the same Var is mentioned in both the lateral references and in the
outer Vars of the scan clauses that the given Var wouldn't be assigned
to the same NestLoopParam.

This could cause issues in Memoize as the cache key would reference the
Var for the scan clauses but when the parameter for the lateral references
changed some code in Memoize would see that some other parameter had
changed that's not part of the cache key and end up purging the entire
cache as a result, thinking the cache had become stale.  This could
result in a Nested Loop -> Memoize plan being quite inefficient as, in
the worst case, the cache purging could result in never getting a cache
hit.  In no cases could this problem lead to incorrect query results.

Here we switch the order of operations so that we create NestLoopParam
for the lateral references first before doing replace_nestloop_params().
replace_nestloop_params() will find and reuse the existing NestLoopParam
in cases where the Var exists in both locations.

Author: Richard Guo
Reviewed-by: Tom Lane, David Rowley
Discussion: https://postgr.es/m/CAMbWs48XHJEK1Q1CzAQ7L9sTANTs9W1cepXu8%3DKc0quUL%2Btg4Q%40mail.gmail.com
2024-01-26 16:18:58 +13:00
Michael Paquier
f2743a7d70 Revert "Add support for parsing of large XML data (>= 10MB)"
This reverts commit 2197d06224, following a discussion over a Coverity
report where issues like the "Billion laugh attack" could cause the
backend to waste CPU and memory even if a client applied checks on the
size of the data given in input, and libxml2 does not offer guarantees
that input limits are respected under XML_PARSE_HUGE.

Discussion: https://postgr.es/m/ZbHlgrPLtBZyr_QW@paquier.xyz
2024-01-26 10:15:32 +09:00
Heikki Linnakangas
376c216138 Update comment, generation mem contexts have a "keeper" block
The keeper block was introduced in commit 1b0d9aa4f7, but it forgot
to update this comment.
2024-01-26 01:04:58 +02:00
Tom Lane
8ba6fdf905 Support TZ and OF format codes in to_timestamp().
Formerly, these were only supported in to_char(), but there seems
little reason for that restriction.  We should at least have enough
support to permit round-tripping the output of to_char().

In that spirit, TZ accepts either zone abbreviations or numeric
(HH or HH:MM) offsets, which are the cases that to_char() can output.
In an ideal world we'd make it take full zone names too, but
that seems like it'd introduce an unreasonable amount of ambiguity,
since the rules for POSIX-spec zone names are so lax.

OF is a subset of this, accepting only HH or HH:MM.

One small benefit of this improvement is that we can simplify
jsonpath's executeDateTimeMethod function, which no longer needs
to consider the HH and HH:MM cases separately.  Moreover, letting
it accept zone abbreviations means it will accept "Z" to mean UTC,
which is emitted by JSON.stringify() for example.

Patch by me, reviewed by Aleksander Alekseev and Daniel Gustafsson

Discussion: https://postgr.es/m/1681086.1686673242@sss.pgh.pa.us
2024-01-25 17:47:08 -05:00
Andrew Dunstan
06a66d87db Clean up a bug in sql/json items commit 66ea94e8e6
Remove a buggy and unnecessary test, along with an unnecessary pstrdup()
and a line of dead code.

Per report, diagnosis and fix from Tom Lane

Discussion: https://postgr.es/m/439811.1706211069@sss.pgh.pa.us
2024-01-25 16:25:11 -05:00
Andrew Dunstan
66ea94e8e6 Implement various jsonpath methods
This commit implements ithe jsonpath .bigint(), .boolean(),
.date(), .decimal([precision [, scale]]), .integer(), .number(),
.string(), .time(), .time_tz(), .timestamp(), and .timestamp_tz()
methods.

.bigint() converts the given JSON string or a numeric value to
the bigint type representation.

.boolean() converts the given JSON string, numeric, or boolean
value to the boolean type representation.  In the numeric case, only
integers are allowed. We use the parse_bool() backend function
to convert a string to a bool.

.decimal([precision [, scale]]) converts the given JSON string
or a numeric value to the numeric type representation.  If precision
and scale are provided for .decimal(), then it is converted to the
equivalent numeric typmod and applied to the numeric number.

.integer() and .number() convert the given JSON string or a
numeric value to the int4 and numeric type representation.

.string() uses the datatype's output function to convert numeric
and various date/time types to the string representation.

The JSON string representing a valid date/time is converted to the
specific date or time type representation using jsonpath .date(),
.time(), .time_tz(), .timestamp(), .timestamp_tz() methods.  The
changes use the infrastructure of the .datetime() method and perform
the datatype conversion as appropriate.  Unlike the .datetime()
method, none of these methods accept a format template and use ISO
DateTime format instead.  However, except for .date(), the
date/time related methods take an optional precision to adjust the
fractional seconds.

Jeevan Chalke, reviewed by Peter Eisentraut and Andrew Dunstan.
2024-01-25 10:15:43 -05:00
Peter Eisentraut
924d046dcf Add a const decoration
Useful for a subsequent patch.

Discussion: https://www.postgresql.org/message-id/flat/52a125e4-ff9a-95f5-9f61-b87cf447e4da@eisentraut.org
2024-01-25 13:34:49 +01:00
Alvaro Herrera
55627ba2d3
Remove dummy_spinlock
It's been unused since 1b468a131b (2015).
2024-01-25 11:43:47 +01:00
Peter Eisentraut
4d969b2f85 MergeAttributes: convert pg_attribute back to ColumnDef before comparing
MergeAttributes() has a loop to merge multiple inheritance parents
into a column column definition.  The merged-so-far definition is
stored in a ColumnDef node.  If we have to merge columns from multiple
inheritance parents (if the name matches), then we have to check
whether various column properties (type, collation, etc.) match.  The
current code extracts the pg_attribute value of the
currently-considered inheritance parent and compares it against the
merged-so-far ColumnDef value.  If the currently considered column
doesn't match any previously inherited column, we make a new ColumnDef
node from the pg_attribute information and add it to the column list.

This patch rearranges this so that we create the ColumnDef node first
in either case.  Then the code that checks the column properties for
compatibility compares ColumnDef against ColumnDef (instead of
ColumnDef against pg_attribute).  This makes the code more symmetric
and easier to follow.  Also, later in MergeAttributes(), there is a
similar but separate loop that merges the new local column definition
with the combination of the inheritance parents established in the
first loop.  That comparison is already ColumnDef-vs-ColumnDef.  With
this change, both of these can use similar-looking logic.  (A future
project might be to extract these two sets of code into a common
routine that encodes all the knowledge of whether two column
definitions are compatible.  But this isn't currently straightforward
because we want to give different error messages in the two cases.)
Furthermore, by avoiding the use of Form_pg_attribute here, we make it
easier to make changes in the pg_attribute layout without having to
worry about the local needs of tablecmds.c.

Because MergeAttributes() is hugely long, it's sometimes hard to know
where in the function you are currently looking.  To help with that, I
also renamed some variables to make it clearer where you are currently
looking.  The first look is "prev" vs. "new", the second loop is "inh"
vs. "new".

Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/52a125e4-ff9a-95f5-9f61-b87cf447e4da@eisentraut.org
2024-01-25 11:30:01 +01:00
Alvaro Herrera
46778187f5
Fix s_lock_test compile
This is a mostly unused tool, but I discovered while nosing around the
Makefile that it hasn't been kept in line with other changes.  Fix it.

Backpatching doesn't appear to be necessary.

Discussion: https://postgr.es/m/202401241114.ied53jcich72@alvherre.pgsql
2024-01-25 11:17:33 +01:00
Amit Langote
fba2112b15 Silence compiler warning introduced in 1edb3b491b
Reported-by: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/CAMbWs48qEoe9Du5tuUxrkGQ6VC9oy+tQOORQ6jpob14-E1Z+jg@mail.gmail.com
2024-01-25 17:12:18 +09:00
Michael Paquier
1d35f705e1 Add more LOG messages when starting and ending recovery from a backup
Three LOG messages are added in the recovery code paths, providing
information that can be useful to track corruption issues depending on
the state of the cluster, telling that:
- Recovery has started from a backup_label.
- Recovery is restarting from a backup start LSN, without a
backup_label.
- Recovery has completed from a backup.

Author: Andres Freund
Reviewed-by: David Steele, Laurenz Albe, Michael Paquier
Discussion: https://postgr.es/m/20231117041811.vz4vgkthwjnwp2pp@awork3.anarazel.de
2024-01-25 17:07:56 +09:00
Amit Kapila
c393308b69 Allow to enable failover property for replication slots via SQL API.
This commit adds the failover property to the replication slot. The
failover property indicates whether the slot will be synced to the standby
servers, enabling the resumption of corresponding logical replication
after failover. But note that this commit does not yet include the
capability to sync the replication slot; the subsequent commits will add
that capability.

A new optional parameter 'failover' is added to the
pg_create_logical_replication_slot() function. We will also enable to set
'failover' option for slots via the subscription commands in the
subsequent commits.

The value of the 'failover' flag is displayed as part of
pg_replication_slots view.

Author: Hou Zhijie, Shveta Malik, Ajin Cherian
Reviewed-by: Peter Smith, Bertrand Drouvot, Dilip Kumar, Masahiko Sawada, Nisha Moond, Kuroda, Hayato, Amit Kapila
Discussion: https://postgr.es/m/514f6f2f-6833-4539-39f1-96cd1e011f23@enterprisedb.com
2024-01-25 12:15:46 +05:30
Fujii Masao
a044e61f1b Remove redundant HandleWalWriterInterrupts().
Because of commit 1bdd54e662, the code of HandleWalWriterInterrupts()
became the same as HandleMainLoopInterrupts(). So this commit removes
HandleWalWriterInterrupts() and makes walwriter use
HandleMainLoopInterrupts() for improved code simplicity.

Author: Fujii Masao
Reviewed-by: Bharath Rupireddy, Nathan Bossart
Discussion: https://postgr.es/m/CAHGQGwHUtwCsB4DnqFLiMiVzjcA=zmeCKf9_pgQM-yJopydatw@mail.gmail.com
2024-01-25 12:50:08 +09:00
Thomas Munro
820b5af73d jit: Require at least LLVM 10.
Remove support for older LLVM versions.  The default on common software
distributions will be at least LLVM 10 when PostgreSQL 17 ships.

Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/CA%2BhUKGLhNs5geZaVNj2EJ79Dx9W8fyWUU3HxcpZy55sMGcY%3DiA%40mail.gmail.com
2024-01-25 15:42:34 +13:00
Masahiko Sawada
729439607a Add progress reporting of skipped tuples during COPY FROM.
9e2d870119 enabled the COPY command to skip malformed data, however
there was no visibility into how many tuples were actually skipped
during the COPY FROM.

This commit adds a new "tuples_skipped" column to
pg_stat_progress_copy view to report the number of tuples that were
skipped because they contain malformed data.

Bump catalog version.

Author: Atsushi Torikoshi
Reviewed-by: Masahiko Sawada
Discussion: https://postgr.es/m/d12fd8c99adcae2744212cb23feff6ed%40oss.nttdata.com
2024-01-25 10:57:41 +09:00
Thomas Munro
d282e88e50 Track LLVM 18 changes.
A function was given a newly standard name from C++20 in LLVM 16.  Then
LLVM 18 added a deprecation warning for the old name, and it is about to
ship, so it's time to adjust that.

Back-patch to all supported releases.

Discussion: https://www.postgresql.org/message-id/CA+hUKGLbuVhH6mqS8z+FwAn4=5dHs0bAWmEMZ3B+iYHWKC4-ZA@mail.gmail.com
2024-01-25 13:44:54 +13:00
Peter Eisentraut
46a0cd4cef Add temporal PRIMARY KEY and UNIQUE constraints
Add WITHOUT OVERLAPS clause to PRIMARY KEY and UNIQUE constraints.
These are backed by GiST indexes instead of B-tree indexes, since they
are essentially exclusion constraints with = for the scalar parts of
the key and && for the temporal part.

Author: Paul A. Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: jian he <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
2024-01-24 16:34:37 +01:00
Alvaro Herrera
74a7306310
Improve notation of BuiltinTrancheNames
Use C99 designated initializer syntax for array elements, instead of
writing the position in a comment.  This is less verbose and much more
readable.  Akin to cc15059634.

One disadvantage is that the BuiltinTrancheNames array now has a hole of
51 NULLs -- previously, the array elements were shifted 51 elements
 downward to avoid this.  This can be fixed by merging the
IndividualLWLockNames array into BuiltinTrancheNames, which would occupy
those 51 pointers, but because it requires some arguably ugly Meson
hackery, it's left for later.

Discussion: https://postgr.es/m/202401231025.gbv4nnte5fmm@alvherre.pgsql
2024-01-24 15:01:30 +01:00
Amit Langote
faa2b953ba Refactor code used by jsonpath executor to fetch variables
Currently, getJsonPathVariable() directly extracts a named
variable/key from the source Jsonb value.  This commit puts that
logic into a callback function called by getJsonPathVariable().
Other implementations of the callback may accept different forms
of the source value(s), for example, a List of values passed from
outside jsonpath_exec.c.

Extracted from a much larger patch to add SQL/JSON query functions.

Author: Nikita Glukhov <n.gluhov@postgrespro.ru>
Author: Teodor Sigaev <teodor@sigaev.ru>
Author: Oleg Bartunov <obartunov@gmail.com>
Author: Alexander Korotkov <aekorotkov@gmail.com>
Author: Andrew Dunstan <andrew@dunslane.net>
Author: Amit Langote <amitlangote09@gmail.com>

Reviewers have included (in no particular order) Andres Freund,
Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers,
Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby,
Álvaro Herrera, Jian He, Peter Eisentraut

Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru
Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de
Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org
Discussion: https://postgr.es/m/CA+HiwqHROpf9e644D8BRqYvaAPmgBZVup-xKMDPk-nd4EpgzHw@mail.gmail.com
Discussion: https://postgr.es/m/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com
2024-01-24 15:04:33 +09:00
Amit Langote
1edb3b491b Adjust populate_record_field() to handle errors softly
This adds a Node *escontext parameter to it and a bunch of functions
downstream to it, replacing any ereport()s in that path by either
errsave() or ereturn() as appropriate.  This also adds code to those
functions where necessary to return early upon encountering a soft
error.

The changes here are mainly intended to suppress errors in the
functions of jsonfuncs.c.  Functions in any external modules, such as
arrayfuncs.c, that those functions may in turn call are not changed
here based on the assumption that the various checks in jsonfuncs.c
functions should ensure that only values that are structurally valid
get passed to the functions in those external modules.  An exception
is made for domain_check() to allow handling domain constraint
violation errors softly.

For testing, this adds a function jsonb_populate_record_valid(),
which returns true if jsonb_populate_record() would finish without
causing an error for the provided JSON object, false otherwise.  Note
that jsonb_populate_record() internally calls populate_record(),
which in turn uses populate_record_field().

Extracted from a much larger patch to add SQL/JSON query functions.

Author: Nikita Glukhov <n.gluhov@postgrespro.ru>
Author: Teodor Sigaev <teodor@sigaev.ru>
Author: Oleg Bartunov <obartunov@gmail.com>
Author: Alexander Korotkov <aekorotkov@gmail.com>
Author: Andrew Dunstan <andrew@dunslane.net>
Author: Amit Langote <amitlangote09@gmail.com>

Reviewers have included (in no particular order) Andres Freund,
Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers,
Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby,
Álvaro Herrera, Jian He, Peter Eisentraut

Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru
Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de
Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org
Discussion: https://postgr.es/m/CA+HiwqHROpf9e644D8BRqYvaAPmgBZVup-xKMDPk-nd4EpgzHw@mail.gmail.com
Discussion: https://postgr.es/m/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com
2024-01-24 15:04:33 +09:00
Amit Langote
aaaf9449ec Add soft error handling to some expression nodes
This adjusts the code for CoerceViaIO and CoerceToDomain expression
nodes to handle errors softly.

For CoerceViaIo, this adds a new ExprEvalStep opcode
EEOP_IOCOERCE_SAFE, which is implemented in the new accompanying
function ExecEvalCoerceViaIOSafe().  The only difference from
EEOP_IOCOERCE's inline implementation is that the input function
receives an ErrorSaveContext via the function's
FunctionCallInfo.context, which it can use to handle errors softly.

For CoerceToDomain, this simply entails replacing the ereport() in
ExecEvalConstraintNotNull() and ExecEvalConstraintCheck() by
errsave() passing it the ErrorSaveContext passed in the expression's
ExprEvalStep.

In both cases, the ErrorSaveContext to be used is passed by setting
ExprState.escontext to point to it before calling ExecInitExprRec()
on the expression tree whose errors are to be handled softly.

Note that there's no functional change as of this commit as no call
site of ExecInitExprRec() has been changed.  This is intended for
implementing new SQL/JSON expression nodes in future commits.

Extracted from a much larger patch to add SQL/JSON query functions.

Author: Nikita Glukhov <n.gluhov@postgrespro.ru>
Author: Teodor Sigaev <teodor@sigaev.ru>
Author: Oleg Bartunov <obartunov@gmail.com>
Author: Alexander Korotkov <aekorotkov@gmail.com>
Author: Andrew Dunstan <andrew@dunslane.net>
Author: Amit Langote <amitlangote09@gmail.com>

Reviewers have included (in no particular order) Andres Freund,
Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers,
Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby,
Álvaro Herrera, Jian He, Peter Eisentraut

Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru
Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de
Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org
Discussion: https://postgr.es/m/CA+HiwqHROpf9e644D8BRqYvaAPmgBZVup-xKMDPk-nd4EpgzHw@mail.gmail.com
Discussion: https://postgr.es/m/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com
2024-01-24 15:04:33 +09:00
Michael Paquier
bb812ab091 Fix ALTER TABLE .. ADD COLUMN with complex inheritance trees
This command, when used to add a column on a parent table with a complex
inheritance tree, tried to update multiple times the same tuple in
pg_attribute for a child table when incrementing attinhcount, causing
failures with "tuple already updated by self" because of a missing
CommandCounterIncrement() between two updates.

This exists for a rather long time, so backpatch all the way down.

Reported-by: Alexander Lakhin
Author: Tender Wang
Reviewed-by: Richard Guo
Discussion: https://postgr.es/m/18297-b04cd83a55b51e35@postgresql.org
Backpatch-through: 12
2024-01-24 14:20:01 +09:00
Heikki Linnakangas
21ef4d4d89 Revert "libpqwalreceiver: Convert to libpq-be-fe-helpers.h"
This reverts commit 728f86fec6.

The signal handling was a few bricks shy of a load in that commit,
which made the walreceiver non-responsive to SIGTERM while it was
waiting for the connection to be established. That prevented a standby
from being promoted.

Since it was non-essential refactoring, let's revert it to make v16
work the same as earlier releases. I reverted it in 'master' too, to
keep the branches in sync. The refactoring was a good idea as such,
but it needs a bit more work. Once we have developed a complete patch
with this issue fixed, let's re-apply that to 'master'.

Reported-by: Kyotaro Horiguchi
Backpatch-through: 16
Discussion: https://www.postgresql.org/message-id/20231231.200741.1078989336605759878.horikyota.ntt@gmail.com
2024-01-23 10:38:07 +02:00
Peter Eisentraut
9b1a6f50b9 Generate syscache info from catalog files
Add a new genbki macros MAKE_SYSCACHE that specifies the syscache ID
macro, the underlying index, and the number of buckets.  From that, we
can generate the existing tables in syscache.h and syscache.c via
genbki.pl.

Reviewed-by: John Naylor <johncnaylorls@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/75ae5875-3abc-dafc-8aec-73247ed41cde@eisentraut.org
2024-01-23 07:31:06 +01:00
David Rowley
b262ad440e Add better handling of redundant IS [NOT] NULL quals
Until now PostgreSQL has not been very smart about optimizing away IS
NOT NULL base quals on columns defined as NOT NULL.  The evaluation of
these needless quals adds overhead.  Ordinarily, anyone who came
complaining about that would likely just have been told to not include
the qual in their query if it's not required.  However, a recent bug
report indicates this might not always be possible.

Bug 17540 highlighted that when we optimize Min/Max aggregates the IS NOT
NULL qual that the planner adds to make the rewritten plan ignore NULLs
can cause issues with poor index choice.  That particular case
demonstrated that other quals, especially ones where no statistics are
available to allow the planner a chance at estimating an approximate
selectivity for can result in poor index choice due to cheap startup paths
being prefered with LIMIT 1.

Here we take generic approach to fixing this by having the planner check
for NOT NULL columns and just have the planner remove these quals (when
they're not needed) for all queries, not just when optimizing Min/Max
aggregates.

Additionally, here we also detect IS NULL quals on a NOT NULL column and
transform that into a gating qual so that we don't have to perform the
scan at all.  This also works for join relations when the Var is not
nullable by any outer join.

This also helps with the self-join removal work as it must replace
strict join quals with IS NOT NULL quals to ensure equivalence with the
original query.

Author: David Rowley, Richard Guo, Andy Fan
Reviewed-by: Richard Guo, David Rowley
Discussion: https://postgr.es/m/CAApHDvqg6XZDhYRPz0zgOcevSMo0d3vxA9DvHrZtKfqO30WTnw@mail.gmail.com
Discussion: https://postgr.es/m/17540-7aa1855ad5ec18b4%40postgresql.org
2024-01-23 18:09:18 +13:00
Nathan Bossart
4372adfa24 Fix possible NULL pointer dereference in GetNamedDSMSegment().
GetNamedDSMSegment() doesn't check whether dsm_attach() returns
NULL, which creates the possibility of a NULL pointer dereference
soon after.  To fix, emit an ERROR if dsm_attach() returns NULL.
This shouldn't happen, but it would be nice to avoid a segfault if
it does.  In passing, tidy up the surrounding code.

Reported-by: Tom Lane
Reviewed-by: Michael Paquier, Bharath Rupireddy
Discussion: https://postgr.es/m/3348869.1705854106%40sss.pgh.pa.us
2024-01-22 20:44:38 -06:00
Michael Paquier
cdd863480c Fix ERROR message in injection_point.c
This commit fixes an error message that failed to show the correct
function and library names when a function cannot be loaded.

While on it, adjust the call to load_external_function() so as this
ERROR can be reached, by making load_external_function() return NULL
rather than fail if a function cannot be found for a given injection
point.

Thinkos in d86d20f0ba.
2024-01-23 10:45:00 +09:00
Heikki Linnakangas
0eb23285a2 Fix two memcpy() bugs in the new injection point code
1. The memcpy()s in InjectionPointAttach() would copy garbage from
beyond the end of input string to the buffer in shared memory. You
won't usually notice, but if there is not enough valid mapped memory
beyond the end of the string, the read of unmapped memory will
segfault. This was flagged by the Cirrus CI build with address
sanitizer enabled.

2. The memcpy() in injection_point_cache_add() failed to copy the NULL
terminator.

Discussion: https://www.postgresql.org/message-id/0615a424-b726-4157-afa7-4245629f9512%40iki.fi
2024-01-22 20:55:45 +02:00
David Rowley
2bcf0785cd Re-disallow Memoize for parameterized nested loops with join filters
This was previously fixed in 9e215378d but got broken again as a result
of 2489d76c4.  It seems that commit causes ppi_clauses to contain
duplicate clauses and it's no longer safe to check the list_length of
that list to determine if there are join conditions other than what's
mentioned in ppi_clauses.

Here we adjust the check to count the distinct rinfo_serial mentioned in
ppi_clauses.  We expect that extra->restrictlist won't have duplicate
rinfo_serials.

Reported-by: Amadeo Gallardo
Author: Richard Guo
Discussion: https://postgr.es/m/CADFREbW-BLJd7-a5J%2B5wjVumeFG1ByXiSOFzMtkmY_SDWckTxw%40mail.gmail.com
Backpatch-through: 16, where 2489d76c4 was introduced.
2024-01-22 22:45:02 +13:00
Michael Paquier
b199eb89c6 Fix some typos
Author: Yongtao Huang
Discussion: https://postgr.es/m/CAOe1Go1F99o5JsphtXdDC5bxm7AzetU8q3AxLh4AAVGKu1AzEQ@mail.gmail.com
2024-01-22 13:55:25 +09:00
Michael Paquier
d86d20f0ba Add backend support for injection points
Injection points are a new facility that makes possible for developers
to run custom code in pre-defined code paths.  Its goal is to provide
ways to design and run advanced tests, for cases like:
- Race conditions, where processes need to do actions in a controlled
ordered manner.
- Forcing a state, like an ERROR, FATAL or even PANIC for OOM, to force
recovery, etc.
- Arbitrary sleeps.

This implements some basics, and there are plans to extend it more in
the future depending on what's required.  Hence, this commit adds a set
of routines in the backend that allows developers to attach, detach and
run injection points:
- A code path calling an injection point can be declared with the macro
INJECTION_POINT(name).
- InjectionPointAttach() and InjectionPointDetach() to respectively
attach and detach a callback to/from an injection point.  An injection
point name is registered in a shmem hash table with a library name and a
function name, which will be used to load the callback attached to an
injection point when its code path is run.

Injection point names are just strings, so as an injection point can be
declared and run by out-of-core extensions and modules, with callbacks
defined in external libraries.

This facility is hidden behind a dedicated switch for ./configure and
meson, disabled by default.

Note that backends use a local cache to store callbacks already loaded,
cleaning up their cache if a callback has found to be removed on a
best-effort basis.  This could be refined further but any tests but what
we have here was fine with the tests I've written while implementing
these backend APIs.

Author: Michael Paquier, with doc suggestions from Ashutosh Bapat.
Reviewed-by: Ashutosh Bapat, Nathan Bossart, Álvaro Herrera, Dilip
Kumar, Amul Sul, Nazir Bilal Yavuz
Discussion: https://postgr.es/m/ZTiV8tn_MIb_H2rE@paquier.xyz
2024-01-22 10:15:50 +09:00
Alexander Korotkov
0452b461bc Explore alternative orderings of group-by pathkeys during optimization.
When evaluating a query with a multi-column GROUP BY clause, we can minimize
sort operations or avoid them if we synchronize the order of GROUP BY clauses
with the ORDER BY sort clause or sort order, which comes from the underlying
query tree. Grouping does not imply any ordering, so we can compare
the keys in arbitrary order, and a Hash Agg leverages this. But for Group Agg,
we simply compared keys in the order specified in the query. This commit
explores alternative ordering of the keys, trying to find a cheaper one.

The ordering of group keys may interact with other parts of the query, some of
which may not be known while planning the grouping. For example, there may be
an explicit ORDER BY clause or some other ordering-dependent operation higher up
in the query, and using the same ordering may allow using either incremental
sort or even eliminating the sort entirely.

The patch always keeps the ordering specified in the query, assuming the user
might have additional insights.

This introduces a new GUC enable_group_by_reordering so that the optimization
may be disabled if needed.

Discussion: https://postgr.es/m/7c79e6a5-8597-74e8-0671-1c39d124c9d6%40sigaev.ru
Author: Andrei Lepikhov, Teodor Sigaev
Reviewed-by: Tomas Vondra, Claudio Freire, Gavin Flower, Dmitry Dolgov
Reviewed-by: Robert Haas, Pavel Borisov, David Rowley, Zhihong Yu
Reviewed-by: Tom Lane, Alexander Korotkov, Richard Guo, Alena Rybakina
2024-01-21 22:21:36 +02:00
Alexander Korotkov
7ab80ac1ca Generalize the common code of adding sort before processing of grouping
Extract the repetitive code pattern into a new function make_ordered_path().

Discussion: https://postgr.es/m/CAPpHfdtzaVa7S4onKy3YvttF2rrH5hQNHx9HtcSTLbpjx%2BMJ%2Bw%40mail.gmail.com
Author: Andrei Lepikhov
2024-01-21 22:19:47 +02:00
Tom Lane
58447e3189 Add hint about not qualifying UPDATE...SET target with relation name.
Target columns in UPDATE ... SET must not be qualified with the target
table; we disallow this because it'd create ambiguity about which name
is the column name in case of field-qualified names.  However, newbies
have been seen to expect that they could qualify a target name just
like other names.  The error message when they do is confusing:
"column "foo" of relation "foo" does not exist".  To improve matters,
issue a HINT if the invalid name is qualified and matches the
relation's alias.

James Coleman (editorialized a bit by me)

Discussion: https://postgr.es/m/CAAaqYe8S2Qa060UV-YF5GoSd5PkEhLV94x-fEi3=TOtpaXCV+w@mail.gmail.com
2024-01-20 17:54:14 -05:00
Tom Lane
075df6b208 Add planner support functions for range operators <@ and @>.
These support functions will transform expressions with constant
range values into direct comparisons on the range bound values,
which are frequently better-optimizable.  The transformation is
skipped however if it would require double evaluation of a
volatile or expensive element expression.

Along the way, add the range opfamily OID to range typcache entries,
since load_rangetype_info has to compute that anyway and it seems
silly to duplicate the work later.

Kim Johan Andersson and Jian He, reviewed by Laurenz Albe

Discussion: https://postgr.es/m/94f64d1f-b8c0-b0c5-98bc-0793a34e0851@kimmet.dk
2024-01-20 13:57:54 -05:00
Nathan Bossart
8b2bcf3f28 Introduce the dynamic shared memory registry.
Presently, the most straightforward way for a shared library to use
shared memory is to request it at server startup via a
shmem_request_hook, which requires specifying the library in
shared_preload_libraries.  Alternatively, the library can create a
dynamic shared memory (DSM) segment, but absent a shared location
to store the segment's handle, other backends cannot use it.  This
commit introduces a registry for DSM segments so that these other
backends can look up existing segments with a library-specified
string.  This allows libraries to easily use shared memory without
needing to request it at server startup.

The registry is accessed via the new GetNamedDSMSegment() function.
This function handles allocating the segment and initializing it
via a provided callback.  If another backend already created and
initialized the segment, it simply attaches the segment.
GetNamedDSMSegment() locks the registry appropriately to ensure
that only one backend initializes the segment and that all other
backends just attach it.

The registry itself is comprised of a dshash table that stores the
DSM segment handles keyed by a library-specified string.

Reviewed-by: Michael Paquier, Andrei Lepikhov, Nikita Malakhov, Robert Haas, Bharath Rupireddy, Zhang Mingli, Amul Sul
Discussion: https://postgr.es/m/20231205034647.GA2705267%40nathanxps13
2024-01-19 14:24:36 -06:00
Alexander Korotkov
448a3331d9 Fix name collision in c64086b79d
Reported-by: Erik Rijkers, Tom Lane
Discussion: https://postgr.es/m/E1rQqeS-002A0s-Qm%40gemulon.postgresql.org
2024-01-19 18:17:13 +02:00