Commit Graph

27072 Commits

Author SHA1 Message Date
Tom Lane
d6b37cdb6e Don't believe MinMaxExpr is leakproof without checking.
MinMaxExpr invokes the btree comparison function for its input datatype,
so it's only leakproof if that function is.  Many such functions are
indeed leakproof, but others are not, and we should not just assume that
they are.  Hence, adjust contain_leaked_vars to verify the leakproofness
of the referenced function explicitly.

I didn't add a regression test because it would need to depend on
some particular comparison function being leaky, and that's a moving
target, per discussion.

This has been wrong all along, so back-patch to supported branches.

Discussion: https://postgr.es/m/31042.1546194242@sss.pgh.pa.us
2019-01-02 16:33:48 -05:00
Noah Misch
309d16f073 pg_regress: Promptly detect failed postmaster startup.
Detect it the way pg_ctl's wait_for_postmaster() does.  When pg_regress
spawned a postmaster that failed startup, we were detecting that only
with "pg_regress: postmaster did not respond within 60 seconds".
Back-patch to 9.4 (all supported versions).

Reviewed by Tom Lane.

Discussion: https://postgr.es/m/20181231172922.GA199150@gust.leadboat.com
2018-12-31 13:51:30 -08:00
Alvaro Herrera
2602838fa3 Have DISCARD ALL/TEMP remove leftover temp tables
Previously, it would only remove temp tables created in the same
session; but if the session uses the BackendId of a previously crashed
backend that left temp tables around, those would not get removed.
Since autovacuum would not drop them either (because it sees that the
BackendId is in use by the current session) these can cause annoying
xid-wraparound warnings.

Apply to branches 9.4 to 10.  This is not a problem since version 11,
because commit 943576bddc added state tracking that makes autovacuum
realize that those temp tables are not ours, so it removes them.

This is useful to handle in DISCARD, because even though it does not
handle all situations, it does handle the common one where a connection
pooler keeps the same session open for an indefinitely long time.

Discussion: https://postgr.es/m/20181226190834.wsk2wzott5yzrjiq@alvherre.pgsql
Reviewed-by: Takayuki Tsunakawa, Michaël Paquier
2018-12-27 16:17:40 -03:00
Alvaro Herrera
5199abaca3 Make autovacuum more selective about temp tables to keep
When temp tables are in danger of XID wraparound, autovacuum drops them;
however, it preserves those that are owned by a working session.  This
is desirable, except when the session is connected to a different
database (because the temp tables cannot be from that session), so make
it only keep the temp tables only if the backend is in the same database
as the temp tables.

This is not bulletproof: it fails to detect temp tables left by a
session whose backend ID is reused in the same database but the new
session does not use temp tables.  Commit 943576bddc fixes that case
too, for branches 11 and up (which is why we don't apply this fix to
those branches), but back-patching that one is not universally agreed
on.

Discussion: https://postgr.es/m/20181214162843.37g6h3txto43akrb@alvherre.pgsql
Reviewed-by: Takayuki Tsunakawa, Michaël Paquier
2018-12-27 16:01:36 -03:00
Michael Paquier
1d70076710 Ignore inherited temp relations from other sessions when truncating
Inheritance trees can include temporary tables if the parent is
permanent, which makes possible the presence of multiple temporary
children from different sessions.  Trying to issue a TRUNCATE on the
parent in this scenario causes a failure, so similarly to any other
queries just ignore such cases, which makes TRUNCATE work
transparently.

This makes truncation behave similarly to any other DML query working on
the parent table with queries which need to be issues on children.  A
set of isolation tests is added to cover basic cases.

Reported-by: Zhou Digoal
Author: Amit Langote, Michael Paquier
Discussion: https://postgr.es/m/15565-ce67a48d0244436a@postgresql.org
Backpatch-through: 9.4
2018-12-27 10:17:42 +09:00
Tom Lane
4f7ab73106 Fix portability failure introduced in commits d2b0b60e7 et al.
I made a frontend fprintf() format use %m, forgetting that that's only
safe in HEAD not the back branches; prior to 96bf88d52 and d6c55de1f,
it would work on glibc platforms but not elsewhere.  Revert to using
%s ... strerror(errno) as the code did before.

We could have left HEAD as-is, but for code consistency across branches,
I chose to apply this patch there too.

Per Coverity and a few buildfarm members.
2018-12-26 15:30:40 -05:00
Peter Eisentraut
b7b0314b80 Fix ancient compiler warnings and typos in !HAVE_SYMLINK code
This has never been correct since this code was introduced.
2018-12-22 07:27:21 +01:00
Tom Lane
ef673a32d3 Fix ancient thinko in mergejoin cost estimation.
"rescanratio" was computed as 1 + rescanned-tuples / total-inner-tuples,
which is sensible if it's to be multiplied by total-inner-tuples or a cost
value corresponding to scanning all the inner tuples.  But in reality it
was (mostly) multiplied by inner_rows or a related cost, numbers that take
into account the possibility of stopping short of scanning the whole inner
relation thanks to a limited key range in the outer relation.  This'd
still make sense if we could expect that stopping short would result in a
proportional decrease in the number of tuples that have to be rescanned.
It does not, however.  The argument that establishes the validity of our
estimate for that number is independent of whether we scan all of the inner
relation or stop short, and experimentation also shows that stopping short
doesn't reduce the number of rescanned tuples.  So the correct calculation
is 1 + rescanned-tuples / inner_rows, and we should be sure to multiply
that by inner_rows or a corresponding cost value.

Most of the time this doesn't make much difference, but if we have
both a high rescan rate (due to lots of duplicate values) and an outer
key range much smaller than the inner key range, then the error can
be significant, leading to a large underestimate of the cost associated
with rescanning.

Per report from Vijaykumar Jain.  This thinko appears to go all the way
back to the introduction of the rescan estimation logic in commit
70fba7043, so back-patch to all supported branches.

Discussion: https://postgr.es/m/CAE7uO5hMb_TZYJcZmLAgO6iD68AkEK6qCe7i=vZUkCpoKns+EQ@mail.gmail.com
2018-12-18 11:19:39 -05:00
Michael Paquier
696c68c2be Fix use-after-free bug when renaming constraints
This is an oversight from recent commit b13fd344.  While on it, tweak
the previous test with a better name for the renamed primary key.

Detected by buildfarm member prion which forces relation cache release
with -DRELCACHE_FORCE_RELEASE.  Back-patch down to 9.4 as the previous
commit.
2018-12-17 12:44:09 +09:00
Michael Paquier
d5d86e2cd6 Make constraint rename issue relcache invalidation on target relation
When a constraint gets renamed, it may have associated with it a target
relation (for example domain constraints don't have one).  Not
invalidating the target relation cache when issuing the renaming can
result in issues with subsequent commands that refer to the old
constraint name using the relation cache, causing various failures.  One
pattern spotted was using CREATE TABLE LIKE after a constraint
renaming.

Reported-by: Stuart <sfbarbee@gmail.com>
Author: Amit Langote
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/2047094.V130LYfLq4@station53.ousa.org
2018-12-17 10:37:24 +09:00
Alexander Korotkov
bf0e5a73be Fix wrong backpatching of ginRedoDeletePage() deadlock fix
19cf52e6cc changes lock order in ginRedoDeletePage().  But did it in a wrong
way due to oversight during backpatching.  This commit fixes that.

Reported-by: Bruce Momjian
Discussion: https://postgr.es/m/20181213153232.GA10664%40momjian.us
2018-12-13 22:32:05 +03:00
Alexander Korotkov
1cf175c74f Prevent GIN deleted pages from being reclaimed too early
When GIN vacuum deletes a posting tree page, it assumes that no concurrent
searchers can access it, thanks to ginStepRight() locking two pages at once.
However, since 9.4 searches can skip parts of posting trees descending from the
root.  That leads to the risk that page is deleted and reclaimed before
concurrent search can access it.

This commit prevents the risk of above by waiting for every transaction, which
might wait to reference this page, to finish.  Due to binary compatibility
we can't change GinPageOpaqueData to store corresponding transaction id.
Instead we reuse page header pd_prune_xid field, which is unused in index pages.

Discussion: https://postgr.es/m/31a702a.14dd.166c1366ac1.Coremail.chjischj%40163.com
Author: Andrey Borodin, Alexander Korotkov
Reviewed-by: Alexander Korotkov
Backpatch-through: 9.4
2018-12-13 06:52:26 +03:00
Alexander Korotkov
19cf52e6cc Prevent deadlock in ginRedoDeletePage()
On standby ginRedoDeletePage() can work concurrently with read-only queries.
Those queries can traverse posting tree in two ways.
1) Using rightlinks by ginStepRight(), which locks the next page before
   unlocking its left sibling.
2) Using downlinks by ginFindLeafPage(), which locks at most one page at time.

Original lock order was: page, parent, left sibling.  That lock order can
deadlock with ginStepRight().  In order to prevent deadlock this commit changes
lock order to: left sibling, page, parent.  Note, that position of parent in
locking order seems insignificant, because we only lock one page at time while
traversing downlinks.

Reported-by: Chen Huajun
Diagnosed-by: Chen Huajun, Peter Geoghegan, Andrey Borodin
Discussion: https://postgr.es/m/31a702a.14dd.166c1366ac1.Coremail.chjischj%40163.com
Author: Alexander Korotkov
Backpatch-through: 9.4
2018-12-13 06:36:54 +03:00
Tom Lane
3af726a1f3 Add stack depth checks to key recursive functions in backend/nodes/*.c.
Although copyfuncs.c has a check_stack_depth call in its recursion,
equalfuncs.c, outfuncs.c, and readfuncs.c lacked one.  This seems
unwise.

Likewise fix planstate_tree_walker(), in branches where that exists.

Discussion: https://postgr.es/m/30253.1544286631@sss.pgh.pa.us
2018-12-10 11:12:43 -05:00
Tom Lane
3a691f8a25 Improve our response to invalid format strings, and detect more cases.
Places that are testing for *printf failure ought to include the format
string in their error reports, since bad-format-string is one of the
more likely causes of such failure.  This both makes it easier to find
and repair the mistake, and provides at least some useful info to the
user who stumbles across such a problem.

Also, tighten snprintf.c to report EINVAL for an invalid flag or
final character in a format %-spec (including the case where the
%-spec is missing a final character altogether).  This seems like
better project policy, and it also allows removing an instruction
or two from the hot code path.

Back-patch the error reporting change in pvsnprintf, since it should be
harmless and may be helpful; but not the snprintf.c change.

Per discussion of bug #15511 from Ertuğrul Kahveci, which reported an
invalid translated format string.  These changes don't fix that error,
but they should improve matters next time we make such a mistake.

Discussion: https://postgr.es/m/15511-1d8b6a0bc874112f@postgresql.org
2018-12-06 15:08:44 -05:00
Tom Lane
02d3104713 Ensure static libraries have correct mod time even if ranlib messes it up.
In at least Apple's version of ranlib, the output file is updated to have
a mod time equal to the max of the timestamps of its components, and that
data only has seconds precision.  On a filesystem with sub-second file
timestamp precision --- say, APFS --- this can result in the finished
static library appearing older than its input files, which causes useless
rebuilds and possible outright failures in parallel makes.

We've only seen this reported in the field from people using Apple's
ranlib with a non-Apple make, because Apple's make doesn't know about
sub-second timestamps either so it doesn't decide rebuilds are needed.
But Apple's ranlib presumably shares code with at least some BSDen,
so it's not that unlikely that the same problem could arise elsewhere.

To fix, just "touch" the output file after ranlib finishes.

We seem to need this in only one place.  There are other calls of
ranlib in our makefiles, but they are working on intermediate files
whose timestamps are not actually important, or else on an installed
static library for which sub-second timestamp precision is unlikely
to matter either.  (Also, so far as I can tell, Apple's ranlib doesn't
mess up the file timestamp in the latter usage anyhow.)

In passing, change "ranlib" to "$(RANLIB)" in one place that was
bypassing the make macro for no good reason.

Per bug #15525 from Jack Kelly (via Alyssa Ross).
Back-patch to all supported branches.

Discussion: https://postgr.es/m/15525-a30da084f17a1faa@postgresql.org
2018-11-29 15:53:44 -05:00
Michael Paquier
b81d08d600 Fix handling of synchronous replication for stopping WAL senders
This fixes an oversight from c6c3334 which has introduced a more strict
ordering in the way WAL senders are stopped to prevent current WAL
activity when a shutdown checkpoint is created.  After all backends are
stopped, all WAL senders are requested to stop which makes them stop any
activity, and switching their state as stopping.  Once the checkpointer
knows that all WAL senders are in a stopping state, the shutdown
checkpoint can begin, with all WAL senders activated, waiting for their
clients to flush the shutdown checkpoint record.

If a subset of WAL senders are stopping and in a sync state, other WAL
senders could still be waiting for a WAL position to be synced while
committing a transaction, however the subset of stopping senders would
not release waiters, potentially breaking synchronous replication
guarantees.  This commit makes sure that even WAL senders stopping are
able to release waiters properly.

On 9.4, this can also trigger an assertion failure when setting for
example max_wal_senders to 1 where a WAL sender is not able to find
itself as in synchronous state when the instance stops.

Reported-by: Paul Guo
Author: Paul Guo, Michael Paquier
Discussion: https://postgr.es/m/CAEET0ZEv8VFqT3C-cQm6byOB4r4VYWcef1J21dOX-gcVhCSpmA@mail.gmail.com
Backpatch-through: 9.4
2018-11-29 09:13:04 +09:00
Tomas Vondra
c1a5caea82 Do not decode TOAST data for table rewrites
During table rewrites (VACUUM FULL and CLUSTER), the main heap is logged
using XLOG / FPI records, and thus (correctly) ignored in decoding.
But the associated TOAST table is WAL-logged as plain INSERT records,
and so was logically decoded and passed to reorder buffer.

That has severe consequences with TOAST tables of non-trivial size.
Firstly, reorder buffer has to keep all those changes, possibly spilling
them to a file, incurring I/O costs and disk space.

Secondly, ReoderBufferCommit() was stashing all those TOAST chunks into
a hash table, which got discarded only after processing the row from the
main heap.  But as the main heap is not decoded for rewrites, this never
happened, so all the TOAST data accumulated in memory, resulting either
in excessive memory consumption or OOM.

The fix is simple, as commit e9edc1ba already introduced infrastructure
(namely HEAP_INSERT_NO_LOGICAL flag) to skip logical decoding of TOAST
tables, but it only applied it to system tables.  So simply use it for
all TOAST data in raw_heap_insert().

That would however solve only the memory consumption issue - the TOAST
changes would still be decoded and added to the reorder buffer, and
spilled to disk (although without TOAST tuple data, so much smaller).
But we can solve that by tweaking DecodeInsert() to just ignore such
INSERT records altogether, using XLH_INSERT_CONTAINS_NEW_TUPLE flag,
instead of skipping them later in ReorderBufferCommit().

Review: Masahiko Sawada
Discussion: https://www.postgresql.org/message-id/flat/1a17c643-e9af-3dba-486b-fbe31bc1823a%402ndquadrant.com
Backpatch: 9.4-, where logical decoding was introduced
2018-11-28 01:53:29 +01:00
Tom Lane
74bfb5388d Fix translation of special characters in psql's LaTeX output modes.
latex_escaped_print() mistranslated \ and failed to provide any translation
for # ^ and ~, all of which would typically lead to LaTeX document syntax
errors.  In addition it didn't translate < > and |, which would typically
render as unexpected characters.

To some extent this represents shortcomings in ancient versions of LaTeX,
which if memory serves had no easy way to render these control characters
as ASCII text.  But that's been fixed for, um, decades.  In any case there
is no value in emitting guaranteed-to-fail output for these characters.

Noted while fooling with test cases added by commit 9a98984f4.  Back-patch
the code change to all supported versions.
2018-11-26 17:32:51 -05:00
Tom Lane
bf9fb00dd1 Update additional float4/8 expected-output files.
I forgot that the back branches have more variant files than HEAD :-(.
Per buildfarm.

Discussion: https://postgr.es/m/15519-4fc785b483201ff1@postgresql.org
2018-11-24 13:53:12 -05:00
Tom Lane
d5231253e3 Fix float-to-integer coercions to handle edge cases correctly.
ftoi4 and its sibling coercion functions did their overflow checks in
a way that looked superficially plausible, but actually depended on an
assumption that the MIN and MAX comparison constants can be represented
exactly in the float4 or float8 domain.  That fails in ftoi4, ftoi8,
and dtoi8, resulting in a possibility that values near the MAX limit will
be wrongly converted (to negative values) when they need to be rejected.

Also, because we compared before rounding off the fractional part,
the other three functions threw errors for values that really ought
to get rounded to the min or max integer value.

Fix by doing rint() first (requiring an assumption that it handles
NaN and Inf correctly; but dtoi8 and ftoi8 were assuming that already),
and by comparing to values that should coerce to float exactly, namely
INTxx_MIN and -INTxx_MIN.  Also remove some random cosmetic discrepancies
between these six functions.

This back-patches commits cbdb8b4c0 and 452b637d4.  In the 9.4 branch,
also back-patch the portion of 62e2a8dc2 that added PG_INTnn_MIN and
related constants to c.h, so that these functions can rely on them.

Per bug #15519 from Victor Petrovykh.

Patch by me; thanks to Andrew Gierth for analysis and discussion.

Discussion: https://postgr.es/m/15519-4fc785b483201ff1@postgresql.org
2018-11-24 12:45:50 -05:00
Tom Lane
e1d8f18b87 Fix old TAP tests' method for selecting a valid PGPORT value.
This code was trying to be paranoid, but it wasn't paranoid enough.
It only ensured that the selected port is in 0..65535, while most
Unix systems will refuse unprivileged attempts to use TCP port numbers
below 1024.

Change it to allow specification of ports 1024..65535, while if the
port is outside that range, map it into 49152..65535 which is the
port range used by our later branches.

The main reason we've not noticed this up to now is that it's not
important when testing over Unix-socket connections, only TCP,
and most of our test code deliberately prevents the postmaster from
opening any TCP ports.  However, the SSL tests do open up a TCP port,
and I believe this explains why buildfarm member chipmunk has been
failing the SSL tests in 9.5: it's picking a reserved port number.

Patch in 9.5 and 9.4.  Later branches do not use this code.
2018-11-19 20:01:35 -05:00
Tom Lane
7c86b94311 Back-patch updated thread flags tests into 9.4 and 9.5.
This commit back-patches these 9.6-era commits into 9.4 and 9.5:

e97af6c8b Replace our hacked version of ax_pthread.m4 with latest upstream version.
3b14a17c8 Move pthread-tests earlier in the autoconf script.
01051a987 Use AS_IF rather than plain shell "if" in pthread-check.
a2932283c Update ax_pthread.m4 to an experimental draft version from upstream.

The net result is to sync configure's checks for threading-related
flags and libraries with the version we've been using since 9.6.
The motivation for doing so now is that it seems the older code does
not work correctly on very recent RHEL7/ppc64, as evidenced by
buildfarm member quokka.  The newer code is pretty battle-hardened
by now, so this seems like a low-risk fix.

Discussion: https://postgr.es/m/3320.1542647565@sss.pgh.pa.us
2018-11-19 14:24:52 -05:00
Thomas Munro
f1ff5f51d2 PANIC on fsync() failure.
On some operating systems, it doesn't make sense to retry fsync(),
because dirty data cached by the kernel may have been dropped on
write-back failure.  In that case the only remaining copy of the
data is in the WAL.  A subsequent fsync() could appear to succeed,
but not have flushed the data.  That means that a future checkpoint
could apparently complete successfully but have lost data.

Therefore, violently prevent any future checkpoint attempts by
panicking on the first fsync() failure.  Note that we already
did the same for WAL data; this change extends that behavior to
non-temporary data files.

Provide a GUC data_sync_retry to control this new behavior, for
users of operating systems that don't eject dirty data, and possibly
forensic/testing uses.  If it is set to on and the write-back error
was transient, a later checkpoint might genuinely succeed (on a
system that does not throw away buffers on failure); if the error is
permanent, later checkpoints will continue to fail.  The GUC defaults
to off, meaning that we panic.

Back-patch to all supported releases.

There is still a narrow window for error-loss on some operating
systems: if the file is closed and later reopened and a write-back
error occurs in the intervening time, but the inode has the bad
luck to be evicted due to memory pressure before we reopen, we could
miss the error.  A later patch will address that with a scheme
for keeping files with dirty data open at all times, but we judge
that to be too complicated to back-patch.

Author: Craig Ringer, with some adjustments by Thomas Munro
Reported-by: Craig Ringer
Reviewed-by: Robert Haas, Thomas Munro, Andres Freund
Discussion: https://postgr.es/m/20180427222842.in2e4mibx45zdth5%40alap3.anarazel.de
2018-11-19 14:26:28 +13:00
Thomas Munro
2b2010d12a Don't forget about failed fsync() requests.
If fsync() fails, md.c must keep the request in its bitmap, so that
future attempts will try again.

Back-patch to all supported releases.

Author: Thomas Munro
Reviewed-by: Amit Kapila
Reported-by: Andrew Gierth
Discussion: https://postgr.es/m/87y3i1ia4w.fsf%40news-spur.riddles.org.uk
2018-11-19 14:26:20 +13:00
Tomas Vondra
4134489636 Add valgrind suppressions for wcsrtombs optimizations
wcsrtombs (called through wchar2char from common functions like lower,
upper, etc.) uses various optimizations that may look like access to
uninitialized data, triggering valgrind reports.

For example AVX2 instructions load data in 256-bit chunks, and  gconv
does something similar with 32-bit chunks.  This is faster than accessing
the bytes one by one, and the uninitialized part of the buffer is not
actually used. So suppress the bogus reports.

The exact stack depends on possible optimizations - it might be AVX, SSE
(as in the report by Aleksander Alekseev) or something else. Hence the
last frame is wildcarded, to deal with this.

Backpatch all the way back to 9.4.

Author: Tomas Vondra
Discussion: https://www.postgresql.org/message-id/flat/90ac0452-e907-e7a4-b3c8-15bd33780e62%402ndquadrant.com
Discussion: https://www.postgresql.org/message-id/20180220150838.GD18315@e733.localdomain
2018-11-18 00:10:15 +01:00
Tom Lane
41609776f2 Second try at fixing numeric data passed through an ECPG SQLDA.
In commit ecfd55795, I removed sqlda.c's checks for ndigits != 0 on the
grounds that we should duplicate the state of the numeric value's digit
buffer even when all the digits are zeroes.  However, that still isn't
quite right, because another possible state of the digit buffer is
buf == digits == NULL (this occurs for a NaN).  As the code now stands,
it'll invoke memcpy with a NULL source address and zero bytecount,
which we know a few platforms crash on.  Hence, reinstate the no-copy
short-circuit, but make it test specifically for buf != NULL rather than
some other condition.  In hindsight, the ndigits test (added by commit
f2ae9f9c3) was almost certainly meant to fix the NaN case not the
all-zeroes case as the associated thread alleged.

As before, back-patch to all supported versions.

Discussion: https://postgr.es/m/1803D792815FC24D871C00D17AE95905C71161@g01jpexmbkw24
2018-11-14 11:27:31 -05:00
Michael Paquier
e85b729375 Initialize TransactionState and user ID consistently at transaction start
If a failure happens when a transaction is starting between the moment
the transaction status is changed from TRANS_DEFAULT to TRANS_START and
the moment the current user ID and security context flags are fetched
via GetUserIdAndSecContext(), or before initializing its basic fields,
then those may get reset to incorrect values when the transaction
aborts, leaving the session in an inconsistent state.

One problem reported is that failing a starting transaction at the first
query of a session could cause several kinds of system crashes on the
follow-up queries.

In order to solve that, move the initialization of the transaction state
fields and the call of GetUserIdAndSecContext() in charge of fetching
the current user ID close to the point where the transaction status is
switched to TRANS_START, where there cannot be any error triggered
in-between, per an idea of Tom Lane.  This properly ensures that the
current user ID, the security context flags and that the basic fields of
TransactionState remain consistent even if the transaction fails while
starting.

Reported-by: Richard Guo
Diagnosed-By: Richard Guo
Author: Michael Paquier
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/CAN_9JTxECSb=pEPcb0a8d+6J+bDcOZ4=DgRo_B7Y5gRHJUM=Rw@mail.gmail.com
Backpatch-through: 9.4
2018-11-14 16:48:26 +09:00
Tom Lane
9e5e3861c7 Fix incorrect results for numeric data passed through an ECPG SQLDA.
Numeric values with leading zeroes were incorrectly copied into a
SQLDA (SQL Descriptor Area), leading to wrong results in ECPG programs.

Report and patch by Daisuke Higuchi.  Back-patch to all supported
versions.

Discussion: https://postgr.es/m/1803D792815FC24D871C00D17AE95905C71161@g01jpexmbkw24
2018-11-13 15:46:08 -05:00
Tom Lane
2abc879531 Limit the number of index clauses considered in choose_bitmap_and().
classify_index_clause_usage() is O(N^2) in the number of distinct index
qual clauses it considers, because of its use of a simple search list to
store them.  For nearly all queries, that's fine because only a few clauses
will be considered.  But Alexander Kuzmenkov reported a machine-generated
query with 80000 (!) index qual clauses, which caused this code to take
forever.  Somewhat remarkably, this is the only O(N^2) behavior we now
have for such a query, so let's fix it.

We can get rid of the O(N^2) runtime for cases like this without much
damage to the functionality of choose_bitmap_and() by separating out
paths with "too many" qual or pred clauses, and deeming them to always
be nonredundant with other paths.  Then their clauses needn't go into
the search list, so it doesn't get too long, but we don't lose the
ability to consider bitmap AND plans altogether.  I set the threshold
for "too many" to be 100 clauses per path, which should be plenty to
ensure no change in planning behavior for normal queries.

There are other things we could do to make this go faster, but it's not
clear that it's worth any additional effort.  80000 qual clauses require
a whole lot of work in many other places, too.

The code's been like this for a long time, so back-patch to all supported
branches.  The troublesome query only works back to 9.5 (in 9.4 it fails
with stack overflow in the parser); so I'm not sure that fixing this in
9.4 has any real-world benefit, but perhaps it does.

Discussion: https://postgr.es/m/90c5bdfa-d633-dabe-9889-3cf3e1acd443@postgrespro.ru
2018-11-12 11:19:04 -05:00
Tom Lane
277602dfee Fix missing role dependencies for some schema and type ACLs.
This patch fixes several related cases in which pg_shdepend entries were
never made, or were lost, for references to roles appearing in the ACLs of
schemas and/or types.  While that did no immediate harm, if a referenced
role were later dropped, the drop would be allowed and would leave a
dangling reference in the object's ACL.  That still wasn't a big problem
for normal database usage, but it would cause obscure failures in
subsequent dump/reload or pg_upgrade attempts, taking the form of
attempts to grant privileges to all-numeric role names.  (I think I've
seen field reports matching that symptom, but can't find any right now.)

Several cases are fixed here:

1. ALTER DOMAIN SET/DROP DEFAULT would lose the dependencies for any
existing ACL entries for the domain.  This case is ancient, dating
back as far as we've had pg_shdepend tracking at all.

2. If a default type privilege applies, CREATE TYPE recorded the
ACL properly but forgot to install dependency entries for it.
This dates to the addition of default privileges for types in 9.2.

3. If a default schema privilege applies, CREATE SCHEMA recorded the
ACL properly but forgot to install dependency entries for it.
This dates to the addition of default privileges for schemas in v10
(commit ab89e465c).

Another somewhat-related problem is that when creating a relation
rowtype or implicit array type, TypeCreate would apply any available
default type privileges to that type, which we don't really want
since such an object isn't supposed to have privileges of its own.
(You can't, for example, drop such privileges once they've been added
to an array type.)

ab89e465c is also to blame for a race condition in the regression tests:
privileges.sql transiently installed globally-applicable default
privileges on schemas, which sometimes got absorbed into the ACLs of
schemas created by concurrent test scripts.  This should have resulted
in failures when privileges.sql tried to drop the role holding such
privileges; but thanks to the bug fixed here, it instead led to dangling
ACLs in the final state of the regression database.  We'd managed not to
notice that, but it became obvious in the wake of commit da906766c, which
allowed the race condition to occur in pg_upgrade tests.

To fix, add a function recordDependencyOnNewAcl to encapsulate what
callers of get_user_default_acl need to do; while the original call
sites got that right via ad-hoc code, none of the later-added ones
have.  Also change GenerateTypeDependencies to generate these
dependencies, which requires adding the typacl to its parameter list.
(That might be annoying if there are any extensions calling that
function directly; but if there are, they're most likely buggy in the
same way as the core callers were, so they need work anyway.)  While
I was at it, I changed GenerateTypeDependencies to accept most of its
parameters in the form of a Form_pg_type pointer, making its parameter
list a bit less unwieldy and mistake-prone.

The test race condition is fixed just by wrapping the addition and
removal of default privileges into a single transaction, so that that
state is never visible externally.  We might eventually prefer to
separate out tests of default privileges into a script that runs by
itself, but that would be a bigger change and would make the tests
run slower overall.

Back-patch relevant parts to all supported branches.

Discussion: https://postgr.es/m/15719.1541725287@sss.pgh.pa.us
2018-11-09 20:42:03 -05:00
Tom Lane
2407d4807f Disallow setting client_min_messages higher than ERROR.
Previously it was possible to set client_min_messages to FATAL or PANIC,
which had the effect of suppressing transmission of regular ERROR messages
to the client.  Perhaps that seemed like a useful option in the past, but
the trouble with it is that it breaks guarantees that are explicitly made
in our FE/BE protocol spec about how a query cycle can end.  While libpq
and psql manage to cope with the omission, that's mostly because they
are not very bright; client libraries that have more semantic knowledge
are likely to get confused.  Notably, pgODBC doesn't behave very sanely.
Let's fix this by getting rid of the ability to set client_min_messages
above ERROR.

In HEAD, just remove the FATAL and PANIC options from the set of allowed
enum values for client_min_messages.  (This change also affects
trace_recovery_messages, but that's OK since these aren't useful values
for that variable either.)

In the back branches, there was concern that rejecting these values might
break applications that are explicitly setting things that way.  I'm
pretty skeptical of that argument, but accommodate it by accepting these
values and then internally setting the variable to ERROR anyway.

In all branches, this allows a couple of tiny simplifications in the
logic in elog.c, so do that.

Also respond to the point that was made that client_min_messages has
exactly nothing to do with the server's logging behavior, and therefore
does not belong in the "When To Log" subsection of the documentation.
The "Statement Behavior" subsection is a better match, so move it there.

Jonah Harris and Tom Lane

Discussion: https://postgr.es/m/7809.1541521180@sss.pgh.pa.us
Discussion: https://postgr.es/m/15479-ef0f4cc2fd995ca2@postgresql.org
2018-11-08 17:33:26 -05:00
Bruce Momjian
b8db4c2af0 GUC: adjust effective_cache_size SQL descriptions
Follow on patch for commit 3e0f1a4741.

Reported-by: Peter Eisentraut

Discussion: https://postgr.es/m/369ec766-b947-51bd-4dad-6fb9e026439f@2ndquadrant.com

Backpatch-through: 9.4
2018-11-06 13:40:02 -05:00
Tom Lane
4f0bf3359f Stamp 9.4.20. 2018-11-05 16:51:23 -05:00
Andres Freund
b7301e3a7b Fix copy-paste error in errhint() introduced in 691d79a079.
Reported-By: Petr Jelinek
Discussion: https://postgr.es/m/c95a620b-34f0-7930-aeb5-f7ab804f26cb@2ndquadrant.com
Backpatch: 9.4-, like the previous commit
2018-11-05 12:05:40 -08:00
Peter Eisentraut
92154ef477 Translation updates
Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: 23063751d2d17da76d34ddfdead3f633041a6cbe
2018-11-05 15:12:15 +01:00
Tom Lane
0ae902e39e Make ts_locale.c's character-type functions cope with UTF-16.
On Windows, in UTF8 database encoding, what char2wchar() produces is
UTF16 not UTF32, ie, characters above U+FFFF will be represented by
surrogate pairs.  t_isdigit() and siblings did not account for this
and failed to provide a large enough result buffer.  That in turn
led to bogus "invalid multibyte character for locale" errors, because
contrary to what you might think from char2wchar()'s documentation,
its Windows code path doesn't cope sanely with buffer overflow.

The solution for t_isdigit() and siblings is pretty clear: provide
a 3-wchar_t result buffer not 2.

char2wchar() also needs some work to provide more consistent, and more
accurately documented, buffer overrun behavior.  But that's a bigger job
and it doesn't actually have any immediate payoff, so leave it for later.

Per bug #15476 from Kenji Uno, who deserves credit for identifying the
cause of the problem.  Back-patch to all active branches.

Discussion: https://postgr.es/m/15476-4314f480acf0f114@postgresql.org
2018-11-03 13:56:10 -04:00
Tom Lane
1b5e8b408e Yet further rethinking of build changes for macOS Mojave.
The solution arrived at in commit e74dd00f5 presumes that the compiler
has a suitable default -isysroot setting ... but further experience
shows that in many combinations of macOS version, XCode version, Xcode
command line tools version, and phase of the moon, Apple's compiler
will *not* supply a default -isysroot value.

We could potentially go back to the approach used in commit 68fc227dd,
but I don't have a lot of faith in the reliability or life expectancy of
that either.  Let's just revert to the approach already shipped in 11.0,
namely specifying an -isysroot switch globally.  As a partial response to
the concerns raised by Jakob Egger, adjust the contents of Makefile.global
to look like

CPPFLAGS = -isysroot $(PG_SYSROOT) ...
PG_SYSROOT = /path/to/sysroot

This allows overriding the sysroot path at build time in a relatively
painless way.

Add documentation to installation.sgml about how to use the PG_SYSROOT
option.  I also took the opportunity to document how to work around
macOS's "System Integrity Protection" feature.

As before, back-patch to all supported versions.

Discussion: https://postgr.es/m/20840.1537850987@sss.pgh.pa.us
2018-11-02 18:54:00 -04:00
Bruce Momjian
060aff97b4 GUC: adjust effective_cache_size docs and SQL description
Clarify that effective_cache_size is both kernel buffers and shared
buffers.

Reported-by: nat@makarevitch.org

Discussion: https://postgr.es/m/153685164808.22334.15432535018443165207@wrigleys.postgresql.org

Backpatch-through: 9.3
2018-11-02 09:10:59 -04:00
Andres Freund
b0fa768c61 Fix error message typo introduced 691d79a079.
Reported-By: Michael Paquier
Discussion: https://postgr.es/m/20181101003405.GB1727@paquier.xyz
Backpatch: 9.4-, like the previous commit
2018-11-01 10:45:42 -07:00
Andres Freund
cf358a2c06 Disallow starting server with insufficient wal_level for existing slot.
Previously it was possible to create a slot, change wal_level, and
restart, even if the new wal_level was insufficient for the
slot. That's a problem for both logical and physical slots, because
the necessary WAL records are not generated.

This removes a few tests in newer versions that, somewhat
inexplicably, whether restarting with a too low wal_level worked (a
buggy behaviour!).

Reported-By: Joshua D. Drake
Author: Andres Freund
Discussion: https://postgr.es/m/20181029191304.lbsmhshkyymhw22w@alap3.anarazel.de
Backpatch: 9.4-, where replication slots where introduced
2018-10-31 15:46:40 -07:00
Tom Lane
95015b1f8e Fix memory leak in repeated SPGIST index scans.
spgendscan neglected to pfree all the memory allocated by spgbeginscan.
It's possible to get away with that in most normal queries, since the
memory is allocated in the executor's per-query context which is about
to get deleted anyway; but it causes severe memory leakage during
creation or filling of large exclusion-constraint indexes.

Also, document that amendscan is supposed to free what ambeginscan
allocates.  The docs' lack of clarity on that point probably caused this
bug to begin with.  (There is discussion of changing that API spec going
forward, but I don't think it'd be appropriate for the back branches.)

Per report from Bruno Wolff.  It's been like this since the beginning,
so back-patch to all active branches.

In HEAD, also fix an independent leak caused by commit 2a6368343
(allocating memory during spgrescan instead of spgbeginscan, which
might be all right if it got cleaned up, but it didn't).  And do a bit
of code beautification on that commit, too.

Discussion: https://postgr.es/m/20181024012314.GA27428@wolff.to
2018-10-31 17:04:43 -04:00
Tom Lane
4311cdd8e2 Sync our copy of the timezone library with IANA release tzcode2018g.
This patch absorbs an upstream fix to "zic" for a recently-introduced
bug that made it output data that some 32-bit clients couldn't read.
Given the current source data, the bug only manifests in zones with
leap seconds, which we don't generate, so that there's no actual
change in our installed timezone data files from this.  Still, in
case somebody uses our copy of "zic" to do something else, it seems
best to apply the fix promptly.

Also, update the README's notes about converting upstream code to
our conventions.
2018-10-31 09:48:24 -04:00
Tom Lane
d651e9e7c5 Update time zone data files to tzdata release 2018g.
DST law changes in Morocco (with, effectively, zero notice).
Historical corrections for Hawaii.
2018-10-31 08:36:35 -04:00
Andrew Dunstan
6982551473 Fix perl searchpath for modern perl for MSVC tools
Modern versions of perl no longer include the current directory in the
perl searchpath, as it's insecure. Instead of adding the current
directory, we get around the problem by adding the directory where the
script lives.

Problem noted by Victor Wagner.

Solution adapted from buildfarm client code.

Backpatch to all live versions.
2018-10-28 12:26:14 -04:00
Tom Lane
0fead87601 Sync our copy of the timezone library with IANA release tzcode2018f.
About half of this is purely cosmetic changes to reduce the diff between
our code and theirs, like inserting "const" markers where they have them.

The other half is tracking actual code changes in zic.c and localtime.c.
I don't think any of these represent near-term compatibility hazards, but
it seems best to stay up to date.

I also fixed longstanding bugs in our code for producing the
known_abbrevs.txt list, which by chance hadn't been exposed before,
but which resulted in some garbage output after applying the upstream
changes in zic.c.  Notably, because upstream removed their old phony
transitions at the Big Bang, it's now necessary to cope with TZif files
containing no DST transition times at all.
2018-10-19 19:36:34 -04:00
Tom Lane
9abbfc35ca Update time zone data files to tzdata release 2018f.
DST law changes in Chile, Fiji, and Russia (Volgograd).
Historical corrections for China, Japan, Macau, and North Korea.

Note: like the previous tzdata update, this involves a depressingly
large amount of semantically-meaningless churn in tzdata.zi.  That
is a consequence of upstream's data compression method assigning
unstable abbreviations to DST rulesets.  I complained about that
to them last time, and this version now uses an assignment method
that pays some heed to not changing abbreviations unnecessarily.
So hopefully, that'll be better going forward.
2018-10-19 17:02:20 -04:00
Tom Lane
0749acca50 Still further rethinking of build changes for macOS Mojave.
To avoid the sorts of problems complained of by Jakob Egger, it'd be
best if configure didn't emit any references to the sysroot path at all.
In the case of PL/Tcl, we can do that just by keeping our hands off the
TCL_INCLUDE_SPEC string altogether.  In the case of PL/Perl, we need to
substitute -iwithsysroot for -I in the compile commands, which is easily
handled if we change to using a configure output variable that includes
the switch not only the directory name.  Since PL/Tcl and PL/Python
already do it like that, this seems like good consistency cleanup anyway.

Hence, this replaces the advice given to Perl-related extensions in commit
5e2217131; instead of writing "-I$(perl_archlibexp)/CORE", they should
just write "$(perl_includespec)".  (The old way continues to work, but not
on recent macOS.)

It's still the case that configure needs to be aware of the sysroot
path internally, but that's cleaner than what we had before.

As before, back-patch to all supported versions.

Discussion: https://postgr.es/m/20840.1537850987@sss.pgh.pa.us
2018-10-18 14:55:23 -04:00
Tom Lane
176f659027 Fix minor bug in isolationtester.
If the lock wait query failed, isolationtester would report the
PQerrorMessage from some other connection, meaning there would be
no message or an unrelated one.  This seems like a pretty unlikely
occurrence, but if it did happen, this bug could make it really
difficult/confusing to figure out what happened.  That seems to
justify patching all the way back.

In passing, clean up another place where the "wrong" conn was used
for an error report.  That one's not actually buggy because it's
a different alias for the same connection, but it's still confusing
to the reader.
2018-10-17 15:06:38 -04:00
Tom Lane
ec5fe7f799 Improve tzparse's handling of TZDEFRULES ("posixrules") zone data.
In the IANA timezone code, tzparse() always tries to load the zone
file named by TZDEFRULES ("posixrules").  Previously, we'd hacked
that logic to skip the load in the "lastditch" code path, which we use
only to initialize the default "GMT" zone during GUC initialization.
That's critical for a couple of reasons: since we do not support leap
seconds, we *must not* allow "GMT" to have leap seconds, and since this
case runs before the GUC subsystem is fully alive, we'd really rather
not take the risk of pg_open_tzfile throwing any errors.

However, that still left the code reading TZDEFRULES on every other
call, something we'd noticed to the extent of having added code to cache
the result so it was only done once per process not a lot of times.
Andres Freund complained about the static data space used up for the
cache; but as long as the logic was like this, there was no point in
trying to get rid of that space.

We can improve matters by looking a bit more closely at what the IANA
code actually needs the TZDEFRULES data for.  One thing it does is
that if "posixrules" is a leap-second-aware zone, the leap-second
behavior will be absorbed into every POSIX-style zone specification.
However, that's a behavior we'd really prefer to do without, since
for our purposes the end effect is to render every POSIX-style zone
name unsupported.  Otherwise, the TZDEFRULES data is used only if
the POSIX zone name specifies DST but doesn't include a transition
date rule (e.g., "EST5EDT" rather than "EST5EDT,M3.2.0,M11.1.0").
That is a minority case for our purposes --- in particular, it
never happens when tzload() invokes tzparse() to interpret a
transition date rule string found in a tzdata zone file.

Hence, if we legislate that we're going to ignore leap-second data
from "posixrules", we can postpone the TZDEFRULES load into the path
where we actually need to substitute for a missing date rule string.
That means it will never happen at all in common scenarios, making it
reasonable to dynamically allocate the cache space when it does happen.
Even when the data is already loaded, this saves some cycles in the
common code path since we avoid a memcpy of 23KB or so.  And, IMO at
least, this is a less ugly hack on the IANA logic than what we had
before, since it's not messing with the lastditch-vs-regular code paths.

Back-patch to all supported branches, not so much because this is a
critical change as that I want to keep all our copies of the IANA
timezone code in sync.

Discussion: https://postgr.es/m/20181015200754.7y7zfuzsoux2c4ya@alap3.anarazel.de
2018-10-17 12:26:48 -04:00
Tom Lane
486e6f8d9c Back off using -isysroot on Darwin.
Rethink the solution applied in commit 5e2217131 to get PL/Tcl to
build on macOS Mojave.  I feared that adding -isysroot globally might
have undesirable consequences, and sure enough Jakob Egger reported
one: it complicates building extensions with a different Xcode version
than was used for the core server.  (I find that a risky proposition
in general, but apparently it works most of the time, so we shouldn't
break it if we don't have to.)

We'd already adopted the solution for PL/Perl of inserting the sysroot
path directly into the -I switches used to find Perl's headers, and we
can do the same thing for PL/Tcl by changing the -iwithsysroot switch
that Apple's tclConfig.sh reports.  This restricts the risks to PL/Perl
and PL/Tcl themselves and directly-dependent extensions, which is a lot
more pleasing in general than a global -isysroot switch.

Along the way, tighten the test to see if we need to inject the sysroot
path into $perl_includedir, as I'd speculated about upthread but not
gotten round to doing.

As before, back-patch to all supported versions.

Discussion: https://postgr.es/m/20840.1537850987@sss.pgh.pa.us
2018-10-16 16:27:15 -04:00
Tom Lane
4166fb3a75 Avoid rare race condition in privileges.sql regression test.
We created a temp table, then switched to a new session, leaving
the old session to clean up its temp objects in background.  If that
took long enough, the eventual attempt to drop the user that owns
the temp table could fail, as exhibited today by sidewinder.
Fix by dropping the temp table explicitly when we're done with it.

It's been like this for quite some time, so back-patch to all
supported branches.

Report: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sidewinder&dt=2018-10-16%2014%3A45%3A00
2018-10-16 13:56:58 -04:00
Tom Lane
27ba589b74 Avoid statically allocating gmtsub()'s timezone workspace.
localtime.c's "struct state" is a rather large object, ~23KB.  We were
statically allocating one for gmtsub() to use to represent the GMT
timezone, even though that function is not at all heavily used and is
never reached in most backends.  Let's malloc it on-demand, instead.

This does pose the question of how to handle a malloc failure, but
there's already a well-defined error report convention here, ie
set errno and return NULL.

We have but one caller of pg_gmtime in HEAD, and two in back branches,
neither of which were troubling to check for error.  Make them do so.
The possible errors are sufficiently unlikely (out-of-range timestamp,
and now malloc failure) that I think elog() is adequate.

Back-patch to all supported branches to keep our copies of the IANA
timezone code in sync.  This particular change is in a stanza that
already differs from upstream, so it's a wash for maintenance purposes
--- but only as long as we keep the branches the same.

Discussion: https://postgr.es/m/20181015200754.7y7zfuzsoux2c4ya@alap3.anarazel.de
2018-10-16 11:50:19 -04:00
Tom Lane
eb01ea2a36 Check for stack overrun in standard_ProcessUtility().
ProcessUtility can recurse, and indeed can be driven to infinite
recursion, so it ought to have a check_stack_depth() call.  This
covers the reported bug (portal trying to execute itself) and a bunch
of other cases that could perhaps arise somewhere.

Per bug #15428 from Malthe Borch.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/15428-b3c2915ec470b033@postgresql.org
2018-10-15 14:01:38 -04:00
Michael Paquier
7c525519d8 Avoid duplicate XIDs at recovery when building initial snapshot
On a primary, sets of XLOG_RUNNING_XACTS records are generated on a
periodic basis to allow recovery to build the initial state of
transactions for a hot standby.  The set of transaction IDs is created
by scanning all the entries in ProcArray.  However it happens that its
logic never counted on the fact that two-phase transactions finishing to
prepare can put ProcArray in a state where there are two entries with
the same transaction ID, one for the initial transaction which gets
cleared when prepare finishes, and a second, dummy, entry to track that
the transaction is still running after prepare finishes.  This way
ensures a continuous presence of the transaction so as callers of for
example TransactionIdIsInProgress() are always able to see it as alive.

So, if a XLOG_RUNNING_XACTS takes a standby snapshot while a two-phase
transaction finishes to prepare, the record can finish with duplicated
XIDs, which is a state expected by design.  If this record gets applied
on a standby to initial its recovery state, then it would simply fail,
so the odds of facing this failure are very low in practice.  It would
be tempting to change the generation of XLOG_RUNNING_XACTS so as
duplicates are removed on the source, but this requires to hold on
ProcArrayLock for longer and this would impact all workloads,
particularly those using heavily two-phase transactions.

XLOG_RUNNING_XACTS is also actually used only to initialize the standby
state at recovery, so instead the solution is taken to discard
duplicates when applying the initial snapshot.

Diagnosed-by: Konstantin Knizhnik
Author: Michael Paquier
Discussion: https://postgr.es/m/0c96b653-4696-d4b4-6b5d-78143175d113@postgrespro.ru
Backpatch-through: 9.3
2018-10-14 22:23:54 +09:00
Tom Lane
7b88c1ddd0 Remove abstime, reltime, tinterval tables from old regression databases.
In the back branches, drop these tables after the regression tests are
done with them.  This fixes failures of cross-branch pg_upgrade testing
caused by these types having been removed in v12.  We do lose the ability
to test dump/restore behavior with these types in the back branches, but
the actual loss of code coverage seems to be nil given that there's nothing
very special about these types.

Discussion: https://postgr.es/m/20181009192237.34wjp3nmw7oynmmr@alap3.anarazel.de
2018-10-12 19:33:57 -04:00
Tom Lane
ec185747a4 Back-patch addition of the ALLOCSET_FOO_SIZES macros.
These macros were originally added in commit ea268cdc9, and back-patched
into 9.6 before 9.6.0.  However, some extensions would like to use them
in older branches, and there seems no harm in providing them.  So add
them to all supported branches.  Per suggestions from Christoph Berg and
Andres Freund.

Discussion: https://postgr.es/m/20181012170355.bhxi273skjt6sag4@alap3.anarazel.de
2018-10-12 14:49:33 -04:00
Andres Freund
c7b96ba291 Fix logical decoding error when system table w/ toast is repeatedly rewritten.
Repeatedly rewriting a mapped catalog table with VACUUM FULL or
CLUSTER could cause logical decoding to fail with:
ERROR, "could not map filenode \"%s\" to relation OID"

To trigger the problem the rewritten catalog had to have live tuples
with toasted columns.

The problem was triggered as during catalog table rewrites the
heap_insert() check that prevents logical decoding information to be
emitted for system catalogs, failed to treat the new heap's toast table
as a system catalog (because the new heap is not recognized as a
catalog table via RelationIsLogicallyLogged()). The relmapper, in
contrast to the normal catalog contents, does not contain historical
information. After a single rewrite of a mapped table the new relation
is known to the relmapper, but if the table is rewritten twice before
logical decoding occurs, the relfilenode cannot be mapped to a
relation anymore.  Which then leads us to error out.   This only
happens for toast tables, because the main table contents aren't
re-inserted with heap_insert().

The fix is simple, add a new heap_insert() flag that prevents logical
decoding information from being emitted, and accept during decoding
that there might not be tuple data for toast tables.

Unfortunately that does not fix pre-existing logical decoding
errors. Doing so would require not throwing an error when a filenode
cannot be mapped to a relation during decoding, and that seems too
likely to hide bugs.  If it's crucial to fix decoding for an existing
slot, temporarily changing the ERROR in ReorderBufferCommit() to a
WARNING appears to be the best fix.

Author: Andres Freund
Discussion: https://postgr.es/m/20180914021046.oi7dm4ra3ot2g2kt@alap3.anarazel.de
Backpatch: 9.4-, where logical decoding was introduced
2018-10-10 13:53:03 -07:00
Tom Lane
26cc27541d Allow btree comparison functions to return INT_MIN.
Historically we forbade datatype-specific comparison functions from
returning INT_MIN, so that it would be safe to invert the sort order
just by negating the comparison result.  However, this was never
really safe for comparison functions that directly return the result
of memcmp(), strcmp(), etc, as POSIX doesn't place any such restriction
on those library functions.  Buildfarm results show that at least on
recent Linux on s390x, memcmp() actually does return INT_MIN sometimes,
causing sort failures.

The agreed-on answer is to remove this restriction and fix relevant
call sites to not make such an assumption; code such as "res = -res"
should be replaced by "INVERT_COMPARE_RESULT(res)".  The same is needed
in a few places that just directly negated the result of memcmp or
strcmp.

To help find places having this problem, I've also added a compile option
to nbtcompare.c that causes some of the commonly used comparators to
return INT_MIN/INT_MAX instead of their usual -1/+1.  It'd likely be
a good idea to have at least one buildfarm member running with
"-DSTRESS_SORT_INT_MIN".  That's far from a complete test of course,
but it should help to prevent fresh introductions of such bugs.

This is a longstanding portability hazard, so back-patch to all supported
branches.

Discussion: https://postgr.es/m/20180928185215.ffoq2xrq5d3pafna@alap3.anarazel.de
2018-10-05 16:01:30 -04:00
Tom Lane
a5b46fc66d Set snprintf.c's maximum number of NL arguments to be 31.
Previously, we used the platform's NL_ARGMAX if any, otherwise 16.
The trouble with this is that the platform value is hugely variable,
ranging from the POSIX-minimum 9 to as much as 64K on recent FreeBSD.
Values of more than a dozen or two have no practical use and slow down
the initialization of the argtypes array.  Worse, they cause snprintf.c
to consume far more stack space than was the design intention, possibly
resulting in stack-overflow crashes.

Standardize on 31, which is comfortably more than we need (it looks like
no existing translatable message has more than about 10 parameters).
I chose that, not 32, to make the array sizes powers of 2, for some
possible small gain in speed of the memset.

The lack of reported crashes suggests that the set of platforms we
use snprintf.c on (in released branches) may have no overlap with
the set where NL_ARGMAX has unreasonably large values.  But that's
not entirely clear, so back-patch to all supported branches.

Per report from Mateusz Guzik (via Thomas Munro).

Discussion: https://postgr.es/m/CAEepm=3VF=PUp2f8gU8fgZB22yPE_KBS0+e1AHAtQ=09schTHg@mail.gmail.com
2018-10-02 12:41:28 -04:00
Tom Lane
fd81fae67f Fix corner-case failures in has_foo_privilege() family of functions.
The variants of these functions that take numeric inputs (OIDs or
column numbers) are supposed to return NULL rather than failing
on bad input; this rule reduces problems with snapshot skew when
queries apply the functions to all rows of a catalog.

has_column_privilege() had careless handling of the case where the
table OID didn't exist.  You might get something like this:
	select has_column_privilege(9999,'nosuchcol','select');
	ERROR:  column "nosuchcol" of relation "(null)" does not exist
or you might get a crash, depending on the platform's printf's response
to a null string pointer.

In addition, while applying the column-number variant to a dropped
column returned NULL as desired, applying the column-name variant
did not:
	select has_column_privilege('mytable','........pg.dropped.2........','select');
	ERROR:  column "........pg.dropped.2........" of relation "mytable" does not exist
It seems better to make this case return NULL as well.

Also, the OID-accepting variants of has_foreign_data_wrapper_privilege,
has_server_privilege, and has_tablespace_privilege didn't follow the
principle of returning NULL for nonexistent OIDs.  Superusers got TRUE,
everybody else got an error.

Per investigation of Jaime Casanova's report of a new crash in HEAD.
These behaviors have been like this for a long time, so back-patch to
all supported branches.

Patch by me; thanks to Stephen Frost for discussion and review

Discussion: https://postgr.es/m/CAJGNTeP=-6Gyqq5TN9OvYEydi7Fv1oGyYj650LGTnW44oAzYCg@mail.gmail.com
2018-10-02 11:54:13 -04:00
Tom Lane
26318c4b85 Fix ALTER COLUMN TYPE to not open a relation without any lock.
If the column being modified is referenced by a foreign key constraint
of another table, ALTER TABLE would open the other table (to re-parse
the constraint's definition) without having first obtained a lock on it.
This was evidently intentional, but that doesn't mean it's really safe.
It's especially not safe in 9.3, which pre-dates use of MVCC scans for
catalog reads, but even in current releases it doesn't seem like a good
idea.

We know we'll need AccessExclusiveLock shortly to drop the obsoleted
constraint, so just get that a little sooner to close the hole.

Per testing with a patch that complains if we open a relation without
holding any lock on it.  I don't plan to back-patch that patch, but we
should close the holes it identifies in all supported branches.

Discussion: https://postgr.es/m/2038.1538335244@sss.pgh.pa.us
2018-10-01 11:39:14 -04:00
Tom Lane
e5baf8c27e Fix detection of the result type of strerror_r().
The method we've traditionally used, of redeclaring strerror_r() to
see if the compiler complains of inconsistent declarations, turns out
not to work reliably because some compilers only report a warning,
not an error.  Amazingly, this has gone undetected for years, even
though it certainly breaks our detection of whether strerror_r
succeeded.

Let's instead test whether the compiler will take the result of
strerror_r() as a switch() argument.  It's possible this won't
work universally either, but it's the best idea I could come up with
on the spur of the moment.

Back-patch of commit 751f532b9.  Buildfarm results indicate that only
icc-on-Linux actually has an issue here; perhaps the lack of field
reports indicates that people don't build PG for production that way.

Discussion: https://postgr.es/m/10877.1537993279@sss.pgh.pa.us
2018-09-30 16:24:56 -04:00
Peter Eisentraut
26b877d280 Recurse to sequences on ownership change for all relkinds
When a table ownership is changed, we must apply that also to any owned
sequences.  (Otherwise, it would result in a situation that cannot be
restored, because linked sequences must have the same owner as the
table.)  But this was previously only applied to regular tables and
materialized views.  But it should also apply to at least foreign
tables.  This patch removes the relkind check altogether, because it
doesn't save very much and just introduces the possibility of similar
omissions.

Bug: #15238
Reported-by: Christoph Berg <christoph.berg@credativ.de>
2018-09-26 20:33:05 +02:00
Tom Lane
a5361b5933 Make some fixes to allow building Postgres on macOS 10.14 ("Mojave").
Apple's latest rearrangements of the system-supplied headers have broken
building of PL/Perl and PL/Tcl.  The only practical way to fix PL/Tcl is to
start using the "-isysroot" compiler flag to point to SDK-supplied headers,
as Apple expects.  We must also start distinguishing where to find Perl's
headers from where to find its shared library; but that seems like good
cleanup anyway.

Extensions that formerly did something like -I$(perl_archlibexp)/CORE
should now do -I$(perl_includedir)/CORE instead.  perl_archlibexp
is still the place to look for libperl.so, though.

If for some reason you don't like the default -isysroot setting, you can
override that by setting PG_SYSROOT in configure's arguments.  I don't
currently think people would need to do so, unless maybe for cross-version
build purposes.

In addition, teach configure where to find tclConfig.sh.  Our traditional
method of searching $auto_path hasn't worked for the last couple of macOS
releases, and it now seems clear that Apple's not going to change that.
The workaround of manually specifying --with-tclconfig was annoying
already, but Mojave's made it a lot more so because the sysroot path now
has to be included as well.  Let's just wire the knowledge into configure
instead.  To avoid breaking builds against non-default Tcl installations
(e.g. MacPorts) wherein the $auto_path method probably still works,
arrange to try the additional case only after all else has failed.

Back-patch to all supported versions, since at least the buildfarm
cares about that.  The changes are set up to not do anything on macOS
releases that are old enough to not have functional sysroot trees.
2018-09-25 13:23:29 -04:00
Tom Lane
028fc0bac9 Fix over-allocation of space for array_out()'s result string.
array_out overestimated the space needed for its output, possibly by
a very substantial amount if the array is multi-dimensional, because
of wrong order of operations in the loop that counts the number of
curly-brace pairs needed.  While the output string is normally
short-lived, this could still cause problems in extreme cases.

An additional minor error was that it counted one more delimiter than
is actually needed.

Repair those errors, add an Assert that the space is now correctly
calculated, and make some minor improvements in the comments.

I also failed to resist the temptation to get rid of an integer
modulus operation per array element; a simple comparison is sufficient.

This bug dates clear back to Berkeley days, so back-patch to all
supported versions.

Keiichi Hirobe, minor additional work by me

Discussion: https://postgr.es/m/CAH=EFxE9W0tRvQkixR2XJRRCToUYUEDkJZk6tnADXugPBRdcdg@mail.gmail.com
2018-09-24 11:30:51 -04:00
Noah Misch
401228183a Initialize random() in bootstrap/stand-alone postgres and in initdb.
This removes a difference between the standard IsUnderPostmaster
execution environment and that of --boot and --single.  In a stand-alone
backend, "SELECT random()" always started at the same seed.

On a system capable of using posix shared memory, initdb could still
conclude "selecting dynamic shared memory implementation ... sysv".
Crashed --boot or --single postgres processes orphaned shared memory
objects having names that collided with the not-actually-random names
that initdb probed.  The sysv fallback appeared after ten crashes of
--boot or --single postgres.  Since --boot and --single are rare in
production use, systems used for PostgreSQL development are the
principal candidate to notice this symptom.

Back-patch to 9.3 (all supported versions).  PostgreSQL 9.4 introduced
dynamic shared memory, but 9.3 does share the "SELECT random()" problem.

Reviewed by Tom Lane and Kyotaro HORIGUCHI.

Discussion: https://postgr.es/m/20180915221546.GA3159382@rfd.leadboat.com
2018-09-23 22:56:57 -07:00
Tom Lane
38cb010843 Fix failure in WHERE CURRENT OF after rewinding the referenced cursor.
In a case where we have multiple relation-scan nodes in a cursor plan,
such as a scan of an inheritance tree, it's possible to fetch from a
given scan node, then rewind the cursor and fetch some row from an
earlier scan node.  In such a case, execCurrent.c mistakenly thought
that the later scan node was still active, because ExecReScan hadn't
done anything to make it look not-active.  We'd get some sort of
failure in the case of a SeqScan node, because the node's scan tuple
slot would be pointing at a HeapTuple whose t_self gets reset to
invalid by heapam.c.  But it seems possible that for other relation
scan node types we'd actually return a valid tuple TID to the caller,
resulting in updating or deleting a tuple that shouldn't have been
considered current.  To fix, forcibly clear the ScanTupleSlot in
ExecScanReScan.

Another issue here, which seems only latent at the moment but could
easily become a live bug in future, is that rewinding a cursor does
not necessarily lead to *immediately* applying ExecReScan to every
scan-level node in the plan tree.  Upper-level nodes will think that
they can postpone that call if their child node is already marked
with chgParam flags.  I don't see a way for that to happen today in
a plan tree that's simple enough for execCurrent.c's search_plan_tree
to understand, but that's one heck of a fragile assumption.  So, add
some logic in search_plan_tree to detect chgParam flags being set on
nodes that it descended to/through, and assume that that means we
should consider lower scan nodes to be logically reset even if their
ReScan call hasn't actually happened yet.

Per bug #15395 from Matvey Arye.  This has been broken for a long time,
so back-patch to all supported branches.

Discussion: https://postgr.es/m/153764171023.14986.280404050547008575@wrigleys.postgresql.org
2018-09-23 16:05:45 -04:00
Thomas Munro
c0c5668c6a Allow DSM allocation to be interrupted.
Chris Travers reported that the startup process can repeatedly try to
cancel a backend that is in a posix_fallocate()/EINTR loop and cause it
to loop forever.  Teach the retry loop to give up if an interrupt is
pending.  Don't actually check for interrupts in that loop though,
because a non-local exit would skip some clean-up code in the caller.

Back-patch to 9.4 where DSM was added (and posix_fallocate() was later
back-patched).

Author: Chris Travers
Reviewed-by: Ildar Musin, Murat Kabilov, Oleksii Kliukin
Tested-by: Oleksii Kliukin
Discussion: https://postgr.es/m/CAN-RpxB-oeZve_J3SM_6%3DHXPmvEG%3DHX%2B9V9pi8g2YR7YW0rBBg%40mail.gmail.com
2018-09-18 23:49:21 +12:00
Tom Lane
8494755109 Fix failure with initplans used conditionally during EvalPlanQual rechecks.
The EvalPlanQual machinery assumes that any initplans (that is,
uncorrelated sub-selects) used during an EPQ recheck would have already
been evaluated during the main query; this is implicit in the fact that
execPlan pointers are not copied into the EPQ estate's es_param_exec_vals.
But it's possible for that assumption to fail, if the initplan is only
reached conditionally.  For example, a sub-select inside a CASE expression
could be reached during a recheck when it had not been previously, if the
CASE test depends on a column that was just updated.

This bug is old, appearing to date back to my rewrite of EvalPlanQual in
commit 9f2ee8f28, but was not detected until Kyle Samson reported a case.

To fix, force all not-yet-evaluated initplans used within the EPQ plan
subtree to be evaluated at the start of the recheck, before entering the
EPQ environment.  This could be inefficient, if such an initplan is
expensive and goes unused again during the recheck --- but that's piling
one layer of improbability atop another.  It doesn't seem worth adding
more complexity to prevent that, at least not in the back branches.

It was convenient to use the new-in-v11 ExecEvalParamExecParams function
to implement this, but I didn't like either its name or the specifics of
its API, so revise that.

Back-patch all the way.  Rather than rewrite the patch to avoid depending
on bms_next_member() in the oldest branches, I chose to back-patch that
function into 9.4 and 9.3.  (This isn't the first time back-patches have
needed that, and it exhausted my patience.)  I also chose to back-patch
some test cases added by commits 71404af2a and 342a1ffa2 into 9.4 and 9.3,
so that the 9.x versions of eval-plan-qual.spec are all the same.

Andrew Gierth diagnosed the problem and contributed the added test cases,
though the actual code changes are by me.

Discussion: https://postgr.es/m/A033A40A-B234-4324-BE37-272279F7B627@tripadvisor.com
2018-09-15 13:42:34 -04:00
Andrew Gierth
a389ddc759 Repair bug in regexp split performance improvements.
Commit c8ea87e4b introduced a temporary conversion buffer for
substrings extracted during regexp splits. Unfortunately the code that
sized it was failing to ignore the effects of ignored degenerate
regexp matches, so for regexp_split_* calls it could under-size the
buffer in such cases.

Fix, and add some regression test cases (though those will only catch
the bug if run in a multibyte encoding).

Backpatch to 9.3 as the faulty code was.

Thanks to the PostGIS project, Regina Obe and Paul Ramsey for the
report (via IRC) and assistance in analysis. Patch by me.
2018-09-12 19:47:50 +01:00
Tom Lane
86e2475833 On all Windows platforms, not just Cygwin, use _timezone and _tzname.
Back-patch commit 868628e4f into the 9.5 branch, so that we can support
building that branch with Visual Studio 2015.  This patch itself could
go further back, but other VS2015 patches such as 0fb54de9a and c8e81afc6
were only back-patched to 9.5, so there seems little point in handling
this one differently.

Discussion: https://postgr.es/m/CAD=LzWFg+Z-KUS3Wm8-1J2vOuYErJXbjuE6b7quzswQEBXJWMQ@mail.gmail.com

Now that we have backported VS2015 support to 9.4 and 9.3, backport this also.
2018-09-12 12:24:11 -04:00
Andrew Dunstan
19acfd6528 Support building with Visual Studio 2017
Haribabu Kommi, reviewed by Takeshi Ideriha and Christian Ullrich

Now backpatched to 9.4 and 9.3
2018-09-11 16:03:42 -04:00
Andrew Dunstan
9ca32a6ebc Support building with Visual Studio 2015
Adjust the way we detect the locale. As a result the minumum Windows
version supported by VS2015 and later is Windows Vista. Add some tweaks
to remove new compiler warnings. Remove documentation references to the
now obsolete msysGit.

Michael Paquier, somewhat edited by me, reviewed by Christian Ullrich.

Rather belated backpatch to 9.4 and 9.3
2018-09-11 15:44:42 -04:00
Alexander Korotkov
35ea98f79a Fix past pd_upper write in ginRedoRecompress()
ginRedoRecompress() replays actions over compressed segments of posting list
in-place.  However, it might lead to write past pg_upper, because intermediate
state during playing the changes can take more space than both original state
and final state.  This commit fixes that by refuse from in-place modification.
Instead page tail is copied once modification is started, and then it's used
as the source of original segments.  Backpatch to 9.4 where posting list
compression was introduced.

Reported-by: Sivasubramanian Ramasubramanian
Discussion: https://postgr.es/m/1536091151804.6588%40amazon.com
Author: Alexander Korotkov based on patch from and ideas by Sivasubramanian Ramasubramanian
Review: Sivasubramanian Ramasubramanian
Backpatch-through: 9.4
2018-09-09 21:45:55 +03:00
Tom Lane
d2003339c3 Save/restore SPI's global variables in SPI_connect() and SPI_finish().
This patch removes two sources of interference between nominally
independent functions when one SPI-using function calls another,
perhaps without knowing that it does so.

Chapman Flack pointed out that xml.c's query_to_xml_internal() expects
SPI_tuptable and SPI_processed to stay valid across datatype output
function calls; but it's possible that such a call could involve
re-entrant use of SPI.  It seems likely that there are similar hazards
elsewhere, if not in the core code then in third-party SPI users.
Previously SPI_finish() reset SPI's API globals to zeroes/nulls, which
would typically make for a crash in such a situation.  Restoring them
to the values they had at SPI_connect() seems like a considerably more
useful behavior, and it still meets the design goal of not leaving any
dangling pointers to tuple tables of the function being exited.

Also, cause SPI_connect() to reset these variables to zeroes/nulls after
saving them.  This prevents interference in the opposite direction: it's
possible that a SPI-using function that's only ever been tested standalone
contains assumptions that these variables start out as zeroes.  That was
the case as long as you were the outermost SPI user, but not so much for
an inner user.  Now it's consistent.

Report and fix suggestion by Chapman Flack, actual patch by me.
Back-patch to all supported branches.

Discussion: https://postgr.es/m/9fa25bef-2e4f-1c32-22a4-3ad0723c4a17@anastigmatix.net
2018-09-07 20:09:57 -04:00
Tom Lane
35e39610a3 Limit depth of forced recursion for CLOBBER_CACHE_RECURSIVELY.
It's somewhat surprising that we got away with this before.  (Actually,
since nobody tests this routinely AFAIK, it might've been broken for
awhile.  But it's definitely broken in the wake of commit f868a8143.)
It seems sufficient to limit the forced recursion to a small number
of levels.

Back-patch to all supported branches, like the preceding patch.

Discussion: https://postgr.es/m/12259.1532117714@sss.pgh.pa.us
2018-09-07 18:14:37 -04:00
Tom Lane
bf919387ec Fix longstanding recursion hazard in sinval message processing.
LockRelationOid and sibling routines supposed that, if our session already
holds the lock they were asked to acquire, they could skip calling
AcceptInvalidationMessages on the grounds that we must have already read
any remote sinval messages issued against the relation being locked.
This is normally true, but there's a critical special case where it's not:
processing inside AcceptInvalidationMessages might attempt to access system
relations, resulting in a recursive call to acquire a relation lock.

Hence, if the outer call had acquired that same system catalog lock, we'd
fall through, despite the possibility that there's an as-yet-unread sinval
message for that system catalog.  This could, for example, result in
failure to access a system catalog or index that had just been processed
by VACUUM FULL.  This is the explanation for buildfarm failures we've been
seeing intermittently for the past three months.  The bug is far older
than that, but commits a54e1f158 et al added a new recursion case within
AcceptInvalidationMessages that is apparently easier to hit than any
previous case.

To fix this, we must not skip calling AcceptInvalidationMessages until
we have *finished* a call to it since acquiring a relation lock, not
merely acquired the lock.  (There's already adequate logic inside
AcceptInvalidationMessages to deal with being called recursively.)
Fortunately, we can implement that at trivial cost, by adding a flag
to LOCALLOCK hashtable entries that tracks whether we know we have
completed such a call.

There is an API hazard added by this patch for external callers of
LockAcquire: if anything is testing for LOCKACQUIRE_ALREADY_HELD,
it might be fooled by the new return code LOCKACQUIRE_ALREADY_CLEAR
into thinking the lock wasn't already held.  This should be a fail-soft
condition, though, unless something very bizarre is being done in
response to the test.

Also, I added an additional output argument to LockAcquireExtended,
assuming that that probably isn't called by any outside code given
the very limited usefulness of its additional functionality.

Back-patch to all supported branches.

Discussion: https://postgr.es/m/12259.1532117714@sss.pgh.pa.us
2018-09-07 18:04:38 -04:00
Michael Paquier
1130206272 Fix initial sync of slot parent directory when restoring status
At the beginning of recovery, information from replication slots is
recovered from disk to memory.  In order to ensure the durability of the
information, the status file as well as its parent directory are
synced.  It happens that the sync on the parent directory was done
directly using the status file path, which is logically incorrect, and
the current code has been doing a sync on the same object twice in a
row.

Reported-by: Konstantin Knizhnik
Diagnosed-by: Konstantin Knizhnik
Author: Michael Paquier
Discussion: https://postgr.es/m/9eb1a6d5-b66f-2640-598d-c5ea46b8f68a@postgrespro.ru
Backpatch-through: 9.4-
2018-09-02 12:41:06 -07:00
Tom Lane
083d9ced14 Avoid using potentially-under-aligned page buffers.
There's a project policy against using plain "char buf[BLCKSZ]" local
or static variables as page buffers; preferred style is to palloc or
malloc each buffer to ensure it is MAXALIGN'd.  However, that policy's
been ignored in an increasing number of places.  We've apparently got
away with it so far, probably because (a) relatively few people use
platforms on which misalignment causes core dumps and/or (b) the
variables chance to be sufficiently aligned anyway.  But this is not
something to rely on.  Moreover, even if we don't get a core dump,
we might be paying a lot of cycles for misaligned accesses.

To fix, invent new union types PGAlignedBlock and PGAlignedXLogBlock
that the compiler must allocate with sufficient alignment, and use
those in place of plain char arrays.

I used these types even for variables where there's no risk of a
misaligned access, since ensuring proper alignment should make
kernel data transfers faster.  I also changed some places where
we had been palloc'ing short-lived buffers, for coding style
uniformity and to save palloc/pfree overhead.

Since this seems to be a live portability hazard (despite the lack
of field reports), back-patch to all supported versions.

Patch by me; thanks to Michael Paquier for review.

Discussion: https://postgr.es/m/1535618100.1286.3.camel@credativ.de
2018-09-01 15:27:13 -04:00
Noah Misch
20cd88857b Ignore server-side delays when enforcing wal_sender_timeout.
Healthy clients of servers having poor I/O performance, such as
buildfarm members hamster and tern, saw unexpected timeouts.  That
disagreed with documentation.  This fix adds one gettimeofday() call
whenever ProcessRepliesIfAny() finds no client reply messages.
Back-patch to 9.4; the bug's symptom is rare and mild, and the code all
moved between 9.3 and 9.4.

Discussion: https://postgr.es/m/20180826034600.GA1105084@rfd.leadboat.com
2018-08-31 23:00:03 -07:00
Michael Paquier
d9638a326f Ensure correct minimum consistent point on standbys
Startup process has improved its calculation of incorrect minimum
consistent point in 8d68ee6, which ensures that all WAL available gets
replayed when doing crash recovery, and has introduced an incorrect
calculation of the minimum recovery point for non-startup processes,
which can cause incorrect page references on a standby when for example
the background writer flushed a couple of pages on-disk but was not
updating the control file to let a subsequent crash recovery replay to
where it should have.

The only case where this has been reported to be a problem is when a
standby needs to calculate the latest removed xid when replaying a btree
deletion record, so one would need connections on a standby that happen
just after recovery has thought it reached a consistent point.  Using a
background worker which is started after the consistent point is reached
would be the easiest way to get into problems if it connects to a
database.  Having clients which attempt to connect periodically could
also be a problem, but the odds of seeing this problem are much lower.

The fix used is pretty simple, as the idea is to give access to the
minimum recovery point written in the control file to non-startup
processes so as they use a reference, while the startup process still
initializes its own references of the minimum consistent point so as the
original problem with incorrect page references happening post-promotion
with a crash do not show up.

Reported-by: Alexander Kukushkin
Diagnosed-by: Alexander Kukushkin
Author: Michael Paquier
Reviewed-by: Kyotaro Horiguchi, Alexander Kukushkin
Discussion: https://postgr.es/m/153492341830.1368.3936905691758473953@wrigleys.postgresql.org
Backpatch-through: 9.3
2018-08-31 11:05:59 -07:00
Tom Lane
20f9cd55dd Make checksum_impl.h safe to compile with -fstrict-aliasing.
In general, Postgres requires -fno-strict-aliasing with compilers that
implement C99 strict aliasing rules.  There's little hope of getting
rid of that overall.  But it seems like it would be a good idea if
storage/checksum_impl.h in particular didn't depend on it, because
that header is explicitly intended to be included by external programs.
We don't have a lot of control over the compiler switches that an
external program might use, as shown by Michael Banck's report of
failure in a privately-modified version of pg_verify_checksums.

Hence, switch to using a union in place of willy-nilly pointer casting
inside this file.  I think this makes the code a bit more readable
anyway.

checksum_impl.h hasn't changed since it was introduced in 9.3,
so back-patch all the way.

Discussion: https://postgr.es/m/1535618100.1286.3.camel@credativ.de
2018-08-31 12:27:15 -04:00
Andrew Gierth
2ba7c4e6c4 Avoid quadratic slowdown in regexp match/split functions.
regexp_matches, regexp_split_to_table and regexp_split_to_array all
work by compiling a list of match positions as character offsets (NOT
byte positions) in the source string.

Formerly, they then used text_substr to extract the matched text; but
in a multi-byte encoding, that counts the characters in the string,
and the characters needed to reach the starting byte position, on
every call. Accordingly, the performance degraded as the product of
the input string length and the number of match positions, such that
splitting a string of a few hundred kbytes could take many minutes.

Repair by keeping the wide-character copy of the input string
available (only in the case where encoding_max_length is not 1) after
performing the match operation, and extracting substrings from that
instead. This reduces the complexity to being linear in the number of
result bytes, discounting the actual regexp match itself (which is not
affected by this patch).

In passing, remove cleanup using retail pfree() which was obsoleted by
commit ff428cded (Feb 2008) which made cleanup of SRF multi-call
contexts automatic. Also increase (to ~134 million) the maximum number
of matches and provide an error message when it is reached.

Backpatch all the way because this has been wrong forever.

Analysis and patch by me; review by Kaiting Chen.

Discussion: https://postgr.es/m/87pnyn55qh.fsf@news-spur.riddles.org.uk

see also https://postgr.es/m/87lg996g4r.fsf@news-spur.riddles.org.uk
2018-08-28 11:50:20 +01:00
Tom Lane
48bc1a5252 Make syslogger more robust against failures in opening CSV log files.
The previous coding figured it'd be good enough to postpone opening
the first CSV log file until we got a message we needed to write there.
This is unsafe, though, because if the open fails we end up in infinite
recursion trying to report the failure.  Instead make the CSV log file
management code look as nearly as possible like the longstanding logic
for the stderr log file.  In particular, open it immediately at postmaster
startup (if enabled), or when we get a SIGHUP in which we find that
log_destination has been changed to enable CSV logging.

It seems OK to fail if a postmaster-start-time open attempt fails, as
we've long done for the stderr log file.  But we can't die if we fail
to open a CSV log file during SIGHUP, so we're still left with a problem.
In that case, write any output meant for the CSV log file to the stderr
log file.  (This will also cover race-condition cases in which backends
send CSV log data before or after we have the CSV log file open.)

This patch also fixes an ancient oversight that, if CSV logging was
turned off during a SIGHUP, we never actually closed the last CSV
log file.

In passing, remember to reset whereToSendOutput = DestNone during syslogger
start, since (unlike all other postmaster children) it's forked before the
postmaster has done that.  This made for a platform-dependent difference
in error reporting behavior between the syslogger and other children:
except on Windows, it'd report problems to the original postmaster stderr
as well as the normal error log file(s).  It's barely possible that that
was intentional at some point; but it doesn't seem likely to be desirable
in production, and the platform dependency definitely isn't desirable.

Per report from Alexander Kukushkin.  It's been like this for a long time,
so back-patch to all supported branches.

Discussion: https://postgr.es/m/CAFh8B==iLUD_gqC-dAENS0V+kVrCeGiKujtKqSQ7++S-caaChw@mail.gmail.com
2018-08-26 14:21:55 -04:00
Andrew Gierth
6c5ed68363 Reduce an unnecessary O(N^3) loop in lexer.
The lexer's handling of operators contained an O(N^3) hazard when
dealing with long strings of + or - characters; it seems hard to
prevent this case from being O(N^2), but the additional N multiplier
was not needed.

Backpatch all the way since this has been there since 7.x, and it
presents at least a mild hazard in that trying to do Bind, PREPARE or
EXPLAIN on a hostile query could take excessive time (without
honouring cancels or timeouts) even if the query was never executed.
2018-08-23 21:33:38 +01:00
Michael Paquier
788ae09f4a Fix set of NLS translation issues
While monitoring the code, a couple of issues related to string
translation has showed up:
- Some routines for auto-updatable views return an error string, which
sometimes missed the shot.  A comment regarding string translation is
added for each routine to help with future features.
- GSSAPI authentication missed two translations.

Reported-by: Kyotaro Horiguchi
Author: Kyotaro Horiguchi
Reviewed-by: Michael Paquier, Tom Lane
Discussion: https://postgr.es/m/20180810.152131.31921918.horiguchi.kyotaro@lab.ntt.co.jp
Backpatch-through: 9.3
2018-08-21 15:18:24 +09:00
Tom Lane
a4fdcceaba Ensure schema qualification in pg_restore DISABLE/ENABLE TRIGGER commands.
Previously, this code blindly followed the common coding pattern of
passing PQserverVersion(AH->connection) as the server-version parameter
of fmtQualifiedId.  That works as long as we have a connection; but in
pg_restore with text output, we don't.  Instead we got a zero from
PQserverVersion, which fmtQualifiedId interpreted as "server is too old to
have schemas", and so the name went unqualified.  That still accidentally
managed to work in many cases, which is probably why this ancient bug went
undetected for so long.  It only became obvious in the wake of the changes
to force dump/restore to execute with restricted search_path.

In HEAD/v11, let's deal with this by ripping out fmtQualifiedId's server-
version behavioral dependency, and just making it schema-qualify all the
time.  We no longer support pg_dump from servers old enough to need the
ability to omit schema name, let alone restoring to them.  (Also, the few
callers outside pg_dump already didn't work with pre-schema servers.)

In older branches, that's not an acceptable solution, so instead just
tweak the DISABLE/ENABLE TRIGGER logic to ensure it will schema-qualify
its output regardless of server version.

Per bug #15338 from Oleg somebody.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/153452458706.1316.5328079417086507743@wrigleys.postgresql.org
2018-08-17 17:12:21 -04:00
Andrew Gierth
3cf3a65cb7 Set scan direction appropriately for SubPlans (bug #15336)
When executing a SubPlan in an expression, the EState's direction
field was left alone, resulting in an attempt to execute the subplan
backwards if it was encountered during a backwards scan of a cursor.
Also, though much less likely, it was possible to reach the execution
of an InitPlan while in backwards-scan state.

Repair by saving/restoring estate->es_direction and forcing forward
scan mode in the relevant places.

Backpatch all the way, since this has been broken since 8.3 (prior to
commit c7ff7663e, SubPlans had their own EStates rather than sharing
the parent plan's, so there was no confusion over scan direction).

Per bug #15336 reported by Vladimir Baranoff; analysis and patch by
me, review by Tom Lane.

Discussion: https://postgr.es/m/153449812167.1304.1741624125628126322@wrigleys.postgresql.org
2018-08-17 16:23:56 +01:00
Tomas Vondra
ef1ac5b2ad Close the file descriptor in ApplyLogicalMappingFile
The function was forgetting to close the file descriptor, resulting
in failures like this:

  ERROR:  53000: exceeded maxAllocatedDescs (492) while trying to open
  file "pg_logical/mappings/map-4000-4eb-1_60DE1E08-5376b5-537c6b"
  LOCATION:  OpenTransientFile, fd.c:2161

Simply close the file at the end, and backpatch to 9.4 (where logical
decoding was introduced). While at it, fix a nearby typo.

Discussion: https://www.postgresql.org/message-id/flat/738a590a-2ce5-9394-2bef-7b1caad89b37%402ndquadrant.com
2018-08-16 16:51:00 +02:00
Tom Lane
27c4b0899c Make snprintf.c follow the C99 standard for snprintf's result value.
C99 says that the result should be the number of bytes that would have
been emitted given a large enough buffer, not the number we actually
were able to put in the buffer.  It's time to make our substitute
implementation comply with that.  Not doing so results in inefficiency
in buffer-enlargement cases, and also poses a portability hazard for
third-party code that might expect C99-compliant snprintf behavior
within Postgres.

In passing, remove useless tests for str == NULL; neither C99 nor
predecessor standards ever allowed that except when count == 0,
so I see no reason to expend cycles on making that a non-crash case
for this implementation.  Also, don't waste a byte in pg_vfprintf's
local I/O buffer; this might have performance benefits by allowing
aligned writes during flushbuffer calls.

Back-patch of commit 805889d7d.  There was some concern about this
possibly breaking code that assumes pre-C99 behavior, but there is
much more risk (and reality, in our own code) of code that assumes
C99 behavior and hence fails to detect buffer overrun without this.

Discussion: https://postgr.es/m/17245.1534289329@sss.pgh.pa.us
2018-08-15 17:25:24 -04:00
Tom Lane
d371efb39c Clean up assorted misuses of snprintf()'s result value.
Fix a small number of places that were testing the result of snprintf()
but doing so incorrectly.  The right test for buffer overrun, per C99,
is "result >= bufsize" not "result > bufsize".  Some places were also
checking for failure with "result == -1", but the standard only says
that a negative value is delivered on failure.

(Note that this only makes these places correct if snprintf() delivers
C99-compliant results.  But at least now these places are consistent
with all the other places where we assume that.)

Also, make psql_start_test() and isolation_start_test() check for
buffer overrun while constructing their shell commands.  There seems
like a higher risk of overrun, with more severe consequences, here
than there is for the individual file paths that are made elsewhere
in the same functions, so this seemed like a worthwhile change.

Also fix guc.c's do_serialize() to initialize errno = 0 before
calling vsnprintf.  In principle, this should be unnecessary because
vsnprintf should have set errno if it returns a failure indication ...
but the other two places this coding pattern is cribbed from don't
assume that, so let's be consistent.

These errors are all very old, so back-patch as appropriate.  I think
that only the shell command overrun cases are even theoretically
reachable in practice, but there's not much point in erroneous error
checks.

Discussion: https://postgr.es/m/17245.1534289329@sss.pgh.pa.us
2018-08-15 16:29:32 -04:00
Heikki Linnakangas
d5a9b706ea Don't run atexit callbacks in quickdie signal handlers.
exit() is not async-signal safe. Even if the libc implementation is, 3rd
party libraries might have installed unsafe atexit() callbacks. After
receiving SIGQUIT, we really just want to exit as quickly as possible, so
we don't really want to run the atexit() callbacks anyway.

The original report by Jimmy Yih was a self-deadlock in startup_die().
However, this patch doesn't address that scenario; the signal handling
while waiting for the startup packet is more complicated. But at least this
alleviates similar problems in the SIGQUIT handlers, like that reported
by Asim R P later in the same thread.

Backpatch to 9.3 (all supported versions).

Discussion: https://www.postgresql.org/message-id/CAOMx_OAuRUHiAuCg2YgicZLzPVv5d9_H4KrL_OFsFP%3DVPekigA%40mail.gmail.com
2018-08-08 19:10:38 +03:00
Tom Lane
33c5d3bf85 Don't record FDW user mappings as members of extensions.
CreateUserMapping has a recordDependencyOnCurrentExtension call that's
been there since extensions were introduced (very possibly my fault).
However, there's no support anywhere else for user mappings as members
of extensions, nor are they listed as a possible member object type in
the documentation.  Nor does it really seem like a good idea for user
mappings to belong to extensions when roles don't.  Hence, remove the
bogus call.

(As we saw in bug #15310, the lack of any pg_dump support for this case
ensures that any such membership record would silently disappear during
pg_upgrade.  So there's probably no need for us to do anything else
about cleaning up after this mistake.)

Discussion: https://postgr.es/m/27952.1533667213@sss.pgh.pa.us
2018-08-07 16:33:12 -04:00
Tom Lane
753051cc72 Fix incorrect initialization of BackendActivityBuffer.
Since commit c8e8b5a6e, this has been zeroed out using the wrong length.
In practice the length would always be too small, leading to not zeroing
the whole buffer rather than clobbering additional memory; and that's
pretty harmless, both because shmem would likely start out as zeroes
and because we'd reinitialize any given entry before use.  Still,
it's bogus, so fix it.

Reported by Petru-Florin Mihancea (bug #15312)

Discussion: https://postgr.es/m/153363913073.1303.6518849192351268091@wrigleys.postgresql.org
2018-08-07 16:01:14 -04:00
Tom Lane
fb4e0e8960 Fix pg_upgrade to handle event triggers in extensions correctly.
pg_dump with --binary-upgrade must emit ALTER EXTENSION ADD commands
for all objects that are members of extensions.  It forgot to do so for
event triggers, as per bug #15310 from Nick Barnes.  Back-patch to 9.3
where event triggers were introduced.

Haribabu Kommi

Discussion: https://postgr.es/m/153360083872.1395.4593932457718151600@wrigleys.postgresql.org
2018-08-07 15:43:49 -04:00
Tom Lane
abd04e0dd8 Ensure pg_dump_sort.c sorts null vs non-null namespace consistently.
The original coding here (which is, I believe, my fault) supposed that
it didn't need to concern itself with the possibility that one object
of a given type-priority has a namespace while another doesn't.  But
that's not reliably true anymore, if it ever was; and if it does happen
then it's possible that DOTypeNameCompare returns self-inconsistent
comparison results.  That leads to unspecified behavior in qsort()
and a resultant weird output order from pg_dump.

This should end up being only a cosmetic problem, because any ordering
constraints that actually matter should be enforced by the later
dependency-based sort.  Still, it's a bug, so back-patch.

Report and fix by Jacob Champion, though I editorialized on his
patch to the extent of making NULL sort after non-NULL, for consistency
with our usual sorting definitions.

Discussion: https://postgr.es/m/CABAq_6Hw+V-Kj7PNfD5tgOaWT_-qaYkc+SRmJkPLeUjYXLdxwQ@mail.gmail.com
2018-08-07 13:13:42 -04:00
Tom Lane
895fb6e2e2 Stamp 9.4.19. 2018-08-06 16:11:24 -04:00
Peter Eisentraut
1af8bbe9ab Translation updates
Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: a69444e557347b0db244ac3d3ea2bf602a82227f
2018-08-06 19:31:39 +02:00
Tom Lane
6de9766b8d Fix failure to reset libpq's state fully between connection attempts.
The logic in PQconnectPoll() did not take care to ensure that all of
a PGconn's internal state variables were reset before trying a new
connection attempt.  If we got far enough in the connection sequence
to have changed any of these variables, and then decided to try a new
server address or server name, the new connection might be completed
with some state that really only applied to the failed connection.

While this has assorted bad consequences, the only one that is clearly
a security issue is that password_needed didn't get reset, so that
if the first server asked for a password and the second didn't,
PQconnectionUsedPassword() would return an incorrect result.  This
could be leveraged by unprivileged users of dblink or postgres_fdw
to allow them to use server-side login credentials that they should
not be able to use.

Other notable problems include the possibility of forcing a v2-protocol
connection to a server capable of supporting v3, or overriding
"sslmode=prefer" to cause a non-encrypted connection to a server that
would have accepted an encrypted one.  Those are certainly bugs but
it's harder to paint them as security problems in themselves.  However,
forcing a v2-protocol connection could result in libpq having a wrong
idea of the server's standard_conforming_strings setting, which opens
the door to SQL-injection attacks.  The extent to which that's actually
a problem, given the prerequisite that the attacker needs control of
the client's connection parameters, is unclear.

These problems have existed for a long time, but became more easily
exploitable in v10, both because it introduced easy ways to force libpq
to abandon a connection attempt at a late stage and then try another one
(rather than just giving up), and because it provided an easy way to
specify multiple target hosts.

Fix by rearranging PQconnectPoll's state machine to provide centralized
places to reset state properly when moving to a new target host or when
dropping and retrying a connection to the same host.

Tom Lane, reviewed by Noah Misch.  Our thanks to Andrew Krasichkov
for finding and reporting the problem.

Security: CVE-2018-10915
2018-08-06 10:53:35 -04:00
Michael Paquier
e69a3ac4a3 Reset properly errno before calling write()
6cb3372 enforces errno to ENOSPC when less bytes than what is expected
have been written when it is unset, though it forgot to properly reset
errno before doing a system call to write(), causing errno to
potentially come from a previous system call.

Reported-by: Tom Lane
Author: Michael Paquier
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/31797.1533326676@sss.pgh.pa.us
2018-08-05 05:32:44 +09:00
Peter Geoghegan
250528cec0 Add table relcache invalidation to index builds.
It's necessary to make sure that owning tables have a relcache
invalidation prior to advancing the command counter to make
newly-entered catalog tuples for the index visible.  inval.c must be
able to maintain the consistency of the local caches in the event of
transaction abort.  There is usually only a problem when CREATE INDEX
transactions abort, since there is a generic invalidation once we reach
index_update_stats().

This bug is of long standing.  Problems were made much more likely by
the addition of parallel CREATE INDEX (commit 9da0cc3528), but it is
strongly suspected that similar problems can be triggered without
involving plan_create_index_workers().  (plan_create_index_workers()
triggers a relcache build or rebuild, which previously only happened in
rare edge cases.)

Author: Peter Geoghegan
Reported-By: Luca Ferrari
Diagnosed-By: Andres Freund
Reviewed-By: Andres Freund
Discussion: https://postgr.es/m/CAKoxK+5fVodiCtMsXKV_1YAKXbzwSfp7DgDqUmcUAzeAhf=HEQ@mail.gmail.com
Backpatch: 9.3-
2018-08-03 14:44:33 -07:00
Tom Lane
88adf1add2 Further fixes for quoted-list GUC values in pg_dump and ruleutils.c.
Commits 742869946 et al turn out to be a couple bricks shy of a load.
We were dumping the stored values of GUC_LIST_QUOTE variables as they
appear in proconfig or setconfig catalog columns.  However, although that
quoting rule looks a lot like SQL-identifier double quotes, there are two
critical differences: empty strings ("") are legal, and depending on which
variable you're considering, values longer than NAMEDATALEN might be valid
too.  So the current technique fails altogether on empty-string list
entries (as reported by Steven Winfield in bug #15248) and it also risks
truncating file pathnames during dump/reload of GUC values that are lists
of pathnames.

To fix, split the stored value without any downcasing or truncation,
and then emit each element as a SQL string literal.

This is a tad annoying, because we now have three copies of the
comma-separated-string splitting logic in varlena.c as well as a fourth
one in dumputils.c.  (Not to mention the randomly-different-from-those
splitting logic in libpq...)  I looked at unifying these, but it would
be rather a mess unless we're willing to tweak the API definitions of
SplitIdentifierString, SplitDirectoriesString, or both.  That might be
worth doing in future; but it seems pretty unsafe for a back-patched
bug fix, so for now accept the duplication.

Back-patch to all supported branches, as the previous fix was.

Discussion: https://postgr.es/m/7585.1529435872@sss.pgh.pa.us
2018-07-31 13:00:08 -04:00
Tom Lane
addf9e1bd6 Fix pg_dump's failure to dump REPLICA IDENTITY for constraint indexes.
pg_dump knew about printing ALTER TABLE ... REPLICA IDENTITY USING INDEX
for indexes declared as indexes, but it failed to print that for indexes
declared as unique or primary-key constraints.  Per report from Achilleas
Mantzios.

This has been broken since the feature was introduced, AFAICS.
Back-patch to 9.4.

Discussion: https://postgr.es/m/1e6cc5ad-b84a-7c07-8c08-a4d0c3cdc938@matrix.gatewaynet.com
2018-07-30 12:35:49 -04:00
Noah Misch
8c477a42eb Document security implications of qualified names.
Commit 5770172cb0 documented secure schema
usage, and that advice suffices for using unqualified names securely.
Document, in typeconv-func primarily, the additional issues that arise
with qualified names.  Back-patch to 9.3 (all supported versions).

Reviewed by Jonathan S. Katz.

Discussion: https://postgr.es/m/20180721012446.GA1840594@rfd.leadboat.com
2018-07-28 20:08:34 -07:00
Alexander Korotkov
9c6a676c4c Fix handling of empty uncompressed posting list pages in GIN
PostgreSQL 9.4 introduces posting list compression in GIN.  This feature
supports online upgrade, so that after pg_upgrade uncompressed posting
lists are compressed on-the-fly.  Underlying code appears to always
expect at least one item on uncompressed posting list page.  But there
could be completely empty pages, because VACUUM never deletes leftmost
and rightmost pages from posting trees.  This commit fixes that.

Reported-by: Sivasubramanian Ramasubramanian
Discussion: https://postgr.es/m/1531867212836.63354%40amazon.com
Author: Sivasubramanian Ramasubramanian, Alexander Korotkov
Backpatch-through: 9.4
2018-07-19 21:24:53 +03:00
Heikki Linnakangas
47d51a5e8c Fix misc typos, mostly in comments.
A collection of typos I happened to spot while reading code, as well as
grepping for common mistakes.

Backpatch to all supported versions, as applicable, to avoid conflicts
when backporting other commits in the future.
2018-07-18 16:54:45 +03:00
Tom Lane
6d2d5ab173 Fix inadequate buffer locking in FSM and VM page re-initialization.
When reading an existing FSM or VM page that was found to be corrupt by the
buffer manager, the code applied PageInit() to reinitialize the page, but
did so without any locking.  There is thus a hazard that two backends might
concurrently do PageInit, which in itself would still be OK, but the slower
one might then zero over subsequent data changes applied by the faster one.
Even that is unlikely to be fatal; but it's not desirable, so add locking
to prevent it.

This does not add any locking overhead in the normal code path where the
page is OK.  It's not immediately obvious that that's safe, but I believe
it is, for reasons explained in the added comments.

Problem noted by R P Asim.  It's been like this for a long time, so
back-patch to all supported branches.

Discussion: https://postgr.es/m/CANXE4Te4G0TGq6cr0-TvwP0H4BNiK_-hB5gHe8mF+nz0mcYfMQ@mail.gmail.com
2018-07-13 11:53:16 -04:00
Michael Paquier
98e2c298c2 Make logical WAL sender report streaming state appropriately
WAL senders sending logically-decoded data fail to properly report in
"streaming" state when starting up, hence as long as one extra record is
not replayed, such WAL senders would remain in a "catchup" state, which
is inconsistent with the physical cousin.

This can be easily reproduced by for example using pg_recvlogical and
restarting the upstream server.  The TAP tests have been slightly
modified to detect the failure and strengthened so as future tests also
make sure that a node is in streaming state when waiting for its
catchup.

Backpatch down to 9.4 where this code has been introduced.

Reported-by: Sawada Masahiko
Author: Simon Riggs, Sawada Masahiko
Reviewed-by: Petr Jelinek, Michael Paquier, Vaishnavi Prabakaran
Discussion: https://postgr.es/m/CAD21AoB2ZbCCqOx=bgKMcLrAvs1V0ZMqzs7wBTuDySezTGtMZA@mail.gmail.com
2018-07-12 10:20:27 +09:00
Tom Lane
d80ec868fa Avoid emitting a bogus WAL record when recycling an all-zero btree page.
Commit fafa374f2 caused _bt_getbuf() to possibly emit a WAL record for
a page that it was about to recycle.  However, it failed to distinguish
all-zero pages from dead pages, which is important because only the
latter have valid btpo.xact values, or indeed any special space at all.
Recycling an all-zero page with XLogStandbyInfoActive() enabled therefore
led to an Assert failure, or to emission of a WAL record containing a
bogus cutoff XID, which might lead to unnecessary query cancellations
on hot standby servers.

Per reports from Antonin Houska and 自己.  Amit Kapila was first to
propose this fix, and Robert Haas, myself, and Kyotaro Horiguchi
reviewed it at various times.

This is an old bug, so back-patch to all supported branches.

Discussion: https://postgr.es/m/2628.1474272158@localhost
Discussion: https://postgr.es/m/48875502.f4a0.1635f0c27b0.Coremail.zoulx1982@163.com
2018-07-09 19:26:19 -04:00
Tom Lane
dd4e836748 Prevent accidental linking of system-supplied copies of libpq.so etc.
Back-patch commit dddfc4cb2, which broke LDFLAGS and related Makefile
variables into two parts, one for within-build-tree library references and
one for external libraries, to ensure that the order of -L flags has all
of the former before all of the latter.  This turns out to fix a problem
recently noted on buildfarm member peripatus, that we attempted to
incorporate code from libpgport.a into a shared library.  That will fail on
platforms that are sticky about putting non-PIC code into shared libraries.
(It's quite surprising we hadn't seen such failures before, since the code
in question has been like that for a long time.)

I think that peripatus' problem could have been fixed with just a subset
of this patch; but since the previous issue of accidentally linking to the
wrong copy of a Postgres shlib seems likely to bite people in the field,
let's just back-patch the whole change.  Now that commit dddfc4cb2 has
survived some beta testing, I'm less afraid to back-patch it than I was
at the time.

This also fixes undesired inclusion of "-DFRONTEND" in pg_config's CPPFLAGS
output (in 9.6 and up) and undesired inclusion of "-L../../src/common" in
its LDFLAGS output (in all supported branches).

Back-patch to v10 and older branches; this is already in v11.

Discussion: https://postgr.es/m/20180704234304.bq2dxispefl65odz@ler-imac.local
2018-07-09 17:23:32 -04:00
Michael Paquier
f352f43d3f Prevent references to invalid relation pages after fresh promotion
If a standby crashes after promotion before having completed its first
post-recovery checkpoint, then the minimal recovery point which marks
the LSN position where the cluster is able to reach consistency may be
set to a position older than the first end-of-recovery checkpoint while
all the WAL available should be replayed.  This leads to the instance
thinking that it contains inconsistent pages, causing a PANIC and a hard
instance crash even if all the WAL available has not been replayed for
certain sets of records replayed.  When in crash recovery,
minRecoveryPoint is expected to always be set to InvalidXLogRecPtr,
which forces the recovery to replay all the WAL available, so this
commit makes sure that the local copy of minRecoveryPoint from the
control file is initialized properly and stays as it is while crash
recovery is performed.  Once switching to archive recovery or if crash
recovery finishes, then the local copy minRecoveryPoint can be safely
updated.

Pavan Deolasee has reported and diagnosed the failure in the first
place, and the base fix idea to rely on the local copy of
minRecoveryPoint comes from Kyotaro Horiguchi, which has been expanded
into a full-fledged patch by me.  The test included in this commit has
been written by Álvaro Herrera and Pavan Deolasee, which I have modified
to make it faster and more reliable with sleep phases.

Backpatch down to all supported versions where the bug appears, aka 9.3
which is where the end-of-recovery checkpoint is not run by the startup
process anymore.  The test gets easily supported down to 10, still it
has been tested on all branches.

Reported-by: Pavan Deolasee
Diagnosed-by: Pavan Deolasee
Reviewed-by: Pavan Deolasee, Kyotaro Horiguchi
Author: Michael Paquier, Kyotaro Horiguchi, Pavan Deolasee, Álvaro
Herrera
Discussion: https://postgr.es/m/CABOikdPOewjNL=05K5CbNMxnNtXnQjhTx2F--4p4ruorCjukbA@mail.gmail.com
2018-07-05 10:47:50 +09:00
Andres Freund
8c8c9f37c2 Check for interrupts inside the nbtree page deletion code.
When deleting pages the nbtree code has to walk through siblings of a
tree node. When those sibling links are corrupted that can lead to
endless loops - which are currently not interruptible.  This is
especially problematic if autovacuum is repeatedly blocked on such
indexes, as it can be hard to get out of that situation without
resorting to single user mode.

Thus add interrupt checks to appropriate places in such
loops. Unfortunately in one of the cases it's it's not easy to do so.

Between 9.3 and 9.4 the page deletion (and page split) code changed
significantly. Before it was significantly less robust against
interruptions. Therefore don't backpatch to 9.3.

Author: Andres Freund
Discussion: https://postgr.es/m/20180627191629.wkunw2qbibnvlz53@alap3.anarazel.de
Backpatch: 9.4-
2018-07-04 14:58:26 -07:00
Fujii Masao
62c2fe6446 Improve the performance of relation deletes during recovery.
When multiple relations are deleted at the same transaction,
the files of those relations are deleted by one call to smgrdounlinkall(),
which leads to scan whole shared_buffers only one time. OTOH,
previously, during recovery, smgrdounlink() (not smgrdounlinkall()) was
called for each file to delete, which led to scan shared_buffers
multiple times. Obviously this could cause to increase the WAL replay
time very much especially when shared_buffers was huge.

To alleviate this situation, this commit changes the recovery so that
it also calls smgrdounlinkall() only one time to delete multiple
relation files.

This is just fix for oversight of commit 279628a0a7, not new feature.
So, per discussion on pgsql-hackers, we concluded to backpatch this
to all supported versions.

Author: Fujii Masao
Reviewed-by: Michael Paquier, Andres Freund, Thomas Munro, Kyotaro Horiguchi, Takayuki Tsunakawa
Discussion: https://postgr.es/m/CAHGQGwHVQkdfDqtvGVkty+19cQakAydXn1etGND3X0PHbZ3+6w@mail.gmail.com
2018-07-05 02:46:44 +09:00
Peter Eisentraut
2a4dca9491 Fix libpq example programs
When these programs call pg_catalog.set_config, they need to check for
PGRES_TUPLES_OK instead of PGRES_COMMAND_OK.  Fix for
5770172cb0.

Reported-by: Ideriha, Takeshi <ideriha.takeshi@jp.fujitsu.com>
2018-07-01 14:09:22 +02:00
Alvaro Herrera
962313558f Fix "base" snapshot handling in logical decoding
Two closely related bugs are fixed.  First, xmin of logical slots was
advanced too early.  During xl_running_xacts processing, xmin of the
slot was set to the oldest running xid in the record, but that's wrong:
actually, snapshots which will be used for not-yet-replayed transactions
might consider older txns as running too, so we need to keep xmin back
for them.  The problem wasn't noticed earlier because DDL which allows
to delete tuple (set xmax) while some another not-yet-committed
transaction looks at it is pretty rare, if not unique: e.g. all forms of
ALTER TABLE which change schema acquire ACCESS EXCLUSIVE lock
conflicting with any inserts. The included test case (test_decoding's
oldest_xmin) uses ALTER of a composite type, which doesn't have such
interlocking.

To deal with this, we must be able to quickly retrieve oldest xmin
(oldest running xid among all assigned snapshots) from ReorderBuffer. To
fix, add another list of ReorderBufferTXNs to the reorderbuffer, where
transactions are sorted by base-snapshot-LSN.  This is slightly
different from the existing (sorted by first-LSN) list, because a
transaction can have an earlier LSN but a later Xmin, if its first
record does not obtain an xmin (eg. xl_xact_assignment).  Note this new
list doesn't fully replace the existing txn list: we still need that one
to prevent WAL recycling.

The second issue concerns SnapBuilder snapshots and subtransactions.
SnapBuildDistributeNewCatalogSnapshot never assigned a snapshot to a
transaction that is known to be a subtxn, which is good in the common
case that the top-level transaction already has one (no point in doing
so), but a bug otherwise.  To fix, arrange to transfer the snapshot from
the subtxn to its top-level txn as soon as the kinship gets known.
test_decoding's snapshot_transfer verifies this.

Also, fix a minor memory leak: refcount of toplevel's old base snapshot
was not decremented when the snapshot is transferred from child.

Liberally sprinkle code comments, and rewrite a few existing ones.  This
part is my (Álvaro's) contribution to this commit, as I had to write all
those comments in order to understand the existing code and Arseny's
patch.

Reported-by: Arseny Sher <a.sher@postgrespro.ru>
Diagnosed-by: Arseny Sher <a.sher@postgrespro.ru>
Co-authored-by: Arseny Sher <a.sher@postgrespro.ru>
Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Antonin Houska <ah@cybertec.at>
Discussion: https://postgr.es/m/87lgdyz1wj.fsf@ars-thinkpad
2018-06-26 16:38:34 -04:00
Thomas Munro
db05d0b906 Add PGTYPESchar_free() to avoid cross-module problems on Windows.
On Windows, it is sometimes important for corresponding malloc() and
free() calls to be made from the same DLL, since some build options can
result in multiple allocators being active at the same time.  For that
reason we already provided PQfreemem().  This commit adds a similar
function for freeing string results allocated by the pgtypes library.

Author: Takayuki Tsunakawa
Reviewed-by: Kyotaro Horiguchi
Discussion: https://postgr.es/m/0A3221C70F24FB45833433255569204D1F8AD5D6%40G01JPEXMBYT05
2018-06-26 23:21:39 +12:00
Thomas Munro
c4ccbcc1a2 Move RecoveryLockList into a hash table.
Standbys frequently need to release all locks held by a given xid.
Instead of searching one big list linearly, let's create one list
per xid and put them in a hash table, so we can find what we need
in O(1) time.

Earlier analysis and a prototype were done by David Rowley, though
this isn't his patch.

Back-patch all the way.

Author: Thomas Munro
Diagnosed-by: David Rowley, Andres Freund
Reviewed-by: Andres Freund, Tom Lane, Robert Haas
Discussion: https://postgr.es/m/CAEepm%3D1mL0KiQ2KJ4yuPpLGX94a4Ns_W6TL4EGRouxWibu56pA%40mail.gmail.com
Discussion: https://postgr.es/m/CAKJS1f9vJ841HY%3DwonnLVbfkTWGYWdPN72VMxnArcGCjF3SywA%40mail.gmail.com
2018-06-26 18:23:36 +12:00
Michael Paquier
79b5b101f9 Address set of issues with errno handling
System calls mixed up in error code paths are causing two issues which
several code paths have not correctly handled:
1) For write() calls, sometimes the system may return less bytes than
what has been written without errno being set.  Some paths were careful
enough to consider that case, and assumed that errno should be set to
ENOSPC, other calls missed that.
2) errno generated by a system call is overwritten by other system calls
which may succeed once an error code path is taken, causing what is
reported to the user to be incorrect.

This patch uses the brute-force approach of correcting all those code
paths.  Some refactoring could happen in the future, but this is let as
future work, which is not targeted for back-branches anyway.

Author: Michael Paquier
Reviewed-by: Ashutosh Sharma
Discussion: https://postgr.es/m/20180622061535.GD5215@paquier.xyz
2018-06-25 11:22:02 +09:00
Tom Lane
cd56194d18 Avoid unnecessary use of strncpy in a couple of places in ecpg.
Use of strncpy with a length limit based on the source, rather than
the destination, is non-idiomatic and draws warnings from gcc 8.
Replace with memcpy, which does exactly the same thing in these cases,
but with less chance for confusion.

Backpatch to all supported branches.

Discussion: https://postgr.es/m/21789.1529170195@sss.pgh.pa.us
2018-06-16 14:58:42 -04:00
Andres Freund
817f9f9a8a Fix bugs in vacuum of shared rels, by keeping their relcache entries current.
When vacuum processes a relation it uses the corresponding relcache
entry's relfrozenxid / relminmxid as a cutoff for when to remove
tuples etc. Unfortunately for nailed relations (i.e. critical system
catalogs) bugs could frequently lead to the corresponding relcache
entry being stale.

This set of bugs could cause actual data corruption as vacuum would
potentially not remove the correct row versions, potentially reviving
them at a later point.  After 699bf7d05c some corruptions in this vein
were prevented, but the additional error checks could also trigger
spuriously. Examples of such errors are:
  ERROR: found xmin ... from before relfrozenxid ...
and
  ERROR: found multixact ... from before relminmxid ...
To be caused by this bug the errors have to occur on system catalog
tables.

The two bugs are:

1) Invalidations for nailed relations were ignored, based on the
   theory that the relcache entry for such tables doesn't
   change. Which is largely true, except for fields like relfrozenxid
   etc.  This means that changes to relations vacuumed in other
   sessions weren't picked up by already existing sessions.  Luckily
   autovacuum doesn't have particularly longrunning sessions.

2) For shared *and* nailed relations, the shared relcache init file
   was never invalidated while running.  That means that for such
   tables (e.g. pg_authid, pg_database) it's not just already existing
   sessions that are affected, but even new connections are as well.
   That explains why the reports usually were about pg_authid et. al.

To fix 1), revalidate the rd_rel portion of a relcache entry when
invalid. This implies a bit of extra complexity to deal with
bootstrapping, but it's not too bad.  The fix for 2) is simpler,
simply always remove both the shared and local init files.

Author: Andres Freund
Reviewed-By: Alvaro Herrera
Discussion:
    https://postgr.es/m/20180525203736.crkbg36muzxrjj5e@alap3.anarazel.de
    https://postgr.es/m/CAMa1XUhKSJd98JW4o9StWPrfS=11bPgG+_GDMxe25TvUY4Sugg@mail.gmail.com
    https://postgr.es/m/CAKMFJucqbuoDRfxPDX39WhA3vJyxweRg_zDVXzncr6+5wOguWA@mail.gmail.com
    https://postgr.es/m/CAGewt-ujGpMLQ09gXcUFMZaZsGJC98VXHEFbF-tpPB0fB13K+A@mail.gmail.com
Backpatch: 9.3-
2018-06-12 11:13:22 -07:00
Alvaro Herrera
5970bfb04e Fix function code in error report
This bug causes a lseek() failure to be reported as a "could not open"
failure in the error message, muddling bug reports.  I introduced this
copy-and-pasteo in commit 78e1220104.

Noticed while reviewing code for bug report #15221, from lily liang.  In
version 10 the affected function is only used by multixact.c and
commit_ts, and only in corner-case circumstances, neither of which are
involved in the reported bug (a pg_subtrans failure.)

Author: Álvaro Herrera
2018-06-06 14:47:46 -04:00
Tom Lane
98d522a1de Fix misidentification of SQL statement type in plpgsql's exec_stmt_execsql.
To distinguish SQL statements that are INSERT/UPDATE/DELETE from other
ones, exec_stmt_execsql looked at the post-rewrite form of the statement
rather than the original.  This is problematic because it did that only
during first execution of the statement (in a session), but the correct
answer could change later due to addition or removal of DO INSTEAD rules
during the session.  That could lead to an Assert failure, as reported
by Tushar Ahuja and Robert Haas.  In non-assert builds, there's a hazard
that we would fail to enforce STRICT behavior when we'd be expected to.
That would happen if an initially present DO INSTEAD, that replaced the
original statement with one of a different type, were removed; after that
the statement should act "normally", including strictness enforcement, but
it didn't.  (The converse case of enforcing strictness when we shouldn't
doesn't seem to be a hazard, as addition of a DO INSTEAD that changes the
statement type would always lead to acting as though the statement returned
zero rows, so that the strictness error could not fire.)

To fix, inspect the original form of the statement not the post-rewrite
form, making it valid to assume the answer can't change intra-session.
This should lead to the same answer in every case except when there is a
DO INSTEAD that changes the statement type; we will now set mod_stmt=true
anyway, while we would not have done so before.  That breaks the Assert
in the SPI_OK_REWRITTEN code path, which expected the latter behavior.
It might be all right to assert mod_stmt rather than !mod_stmt there,
but I'm not entirely convinced that that'd always hold, so just remove
the assertion altogether.

This has been broken for a long time, so back-patch to all supported
branches.

Discussion: https://postgr.es/m/CA+TgmoZUrRN4xvZe_BbBn_Xp0BDwuMEue-0OyF0fJpfvU2Yc7Q@mail.gmail.com
2018-05-25 14:31:07 -04:00
Tom Lane
8f2143bc8f Properly schema-qualify additional object types in getObjectDescription().
Collations, conversions, extended statistics objects (in >= v10),
and all four types of text search objects have schema-qualified names.
getObjectDescription() ignored that and would emit just the base name of
the object, potentially producing wrong or at least highly misleading
output.  Fix it to add the schema name whenever the object is not "visible"
in the current search path, as is the rule for other schema-qualifiable
object types.

Although in common situations the output won't change, this seems to me
(tgl) to be a bug worthy of back-patching, hence do so.

Kyotaro Horiguchi, per a complaint from me

Discussion: https://postgr.es/m/20180522.182020.114074746.horiguchi.kyotaro@lab.ntt.co.jp
2018-05-24 12:07:41 -04:00
Tom Lane
09fb2d5d3b Fix simple_prompt() to disable echo on Windows when stdin != terminal.
If echo = false, simple_prompt() is supposed to prevent echoing the
input (for password input).  However, the Windows implementation applied
the mode change to STD_INPUT_HANDLE.  That would not have the desired
effect if stdin isn't actually the terminal, for instance if the user
is piping something into psql.  Fix it to apply the mode change to
the correct input file, so that passwords do not echo in such cases.

In passing, shorten and de-uglify this code by using #elif rather than
an #if nest and removing some duplicated code.

Back-patch to all supported versions.  To simplify that, also back-patch
the portions of commit 9daec77e1 that got rid of an unnecessary
malloc/free in the same area.

Matthew Stickney (cosmetic changes by me)

Discussion: https://postgr.es/m/502a1fff-862b-da52-1031-f68df6ed5a2d@gmail.com
2018-05-23 19:04:34 -04:00
Tom Lane
d25714d0a3 Widen COPY FROM's current-line-number counter from 32 to 64 bits.
Because the code for the HEADER option skips a line when this counter
is zero, a very long COPY FROM WITH HEADER operation would drop a line
every 2^32 lines.  A lesser but still unfortunate problem is that errors
would show a wrong input line number for errors occurring beyond the
2^31'st input line.  While such large input streams seemed impractical
when this code was first written, they're not any more.  Widening the
counter (and some associated variables) to uint64 should be enough to
prevent problems for the foreseeable future.

David Rowley

Discussion: https://postgr.es/m/CAKJS1f88yh-6wwEfO6QLEEvH3BEugOq2QX1TOja0vCauoynmOQ@mail.gmail.com
2018-05-22 13:32:52 -04:00
Andrew Gierth
769e6fcd1a Fix SQL:2008 FETCH FIRST syntax to allow parameters.
OFFSET <x> ROWS FETCH FIRST <y> ROWS ONLY syntax is supposed to accept
<simple value specification>, which includes parameters as well as
literals. When this syntax was added all those years ago, it was done
inconsistently, with <x> and <y> being different subsets of the
standard syntax.

Rectify that by making <x> and <y> accept the same thing, and allowing
either a (signed) numeric literal or a c_expr there, which allows for
parameters, variables, and parenthesized arbitrary expressions.

Per bug #15200 from Lukas Eder.

Backpatch all the way, since this has been broken from the start.

Discussion: https://postgr.es/m/877enz476l.fsf@news-spur.riddles.org.uk
Discussion: http://postgr.es/m/152647780335.27204.16895288237122418685@wrigleys.postgresql.org
2018-05-21 17:32:29 +01:00
Tom Lane
5517367e97 Fix unsafe usage of strerror(errno) within ereport().
This is the converse of the unsafe-usage-of-%m problem: the reason
ereport/elog provide that format code is mainly to dodge the hazard
of errno getting changed before control reaches functions within the
arguments of the macro.  I only found one instance of this hazard,
but it's been there since 9.4 :-(.
2018-05-21 00:32:52 -04:00
Tom Lane
e52cabff70 printf("%lf") is not portable, so omit the "l".
The "l" (ell) width spec means something in the corresponding scanf usage,
but not here.  While modern POSIX says that applying "l" to "f" and other
floating format specs is a no-op, SUSv2 says it's undefined.  Buildfarm
experience says that some old compilers emit warnings about it, and at
least one old stdio implementation (mingw's "ANSI" option) actually
produces wrong answers and/or crashes.

Discussion: https://postgr.es/m/21670.1526769114@sss.pgh.pa.us
Discussion: https://postgr.es/m/c085e1da-0d64-1c15-242d-c921f32e0d5c@dunslane.net
2018-05-20 11:40:54 -04:00
Tom Lane
8109f201da Support platforms where strtoll/strtoull are spelled __strtoll/__strtoull.
Ancient HPUX, for one, does this.  We hadn't noticed due to the lack
of regression tests that required a working strtoll.

(I was slightly tempted to remove the other historical spelling,
strto[u]q, since it seems we have no buildfarm members testing that case.
But I refrained.)

Discussion: https://postgr.es/m/151935568942.1461.14623890240535309745@wrigleys.postgresql.org
2018-05-19 14:22:19 -04:00
Tom Lane
023aa76e19 Arrange to supply declarations for strtoll/strtoull if needed.
Buildfarm member dromedary is still unhappy about the recently-added
ecpg "long long" tests.  The reason turns out to be that it includes
"-ansi" in its CFLAGS, and in their infinite wisdom Apple have decided
to hide the declarations of strtoll/strtoull in C89-compliant builds.
(I find it pretty curious that they hide those function declarations
when you can nonetheless declare a "long long" variable, but anyway
that is their behavior, both on dromedary's obsolete macOS version and
the newest and shiniest.)  As a result, gcc assumes these functions
return "int", leading naturally to wrong results.

(Looking at dromedary's past build results, it's evident that this
problem also breaks pg_strtouint64() on 32-bit platforms; but we
evidently have no regression tests that exercise that function with
values above 32 bits.)

To fix, supply declarations for these functions when the platform
provides the functions but not the declarations, using the same type
of mechanism as we use for some other similar cases.

Discussion: https://postgr.es/m/151935568942.1461.14623890240535309745@wrigleys.postgresql.org
2018-05-18 22:42:10 -04:00
Tom Lane
54ae787ca7 Hot-fix ecpg regression test for missing ecpg_config.h inclusion.
I don't think this is really the best long-term answer, and in
particular it doesn't fix the pre-existing hazard in sqltypes.h.
But for the moment let's just try to make the buildfarm green again.

Discussion: https://postgr.es/m/151935568942.1461.14623890240535309745@wrigleys.postgresql.org
2018-05-18 19:04:11 -04:00
Tom Lane
e75c832b29 Add some test coverage for ecpg's "long long" support.
This will only actually exercise the "long long" code paths on platforms
where "long" is 32 bits --- otherwise, the SQL bigint type maps to
plain "long", and we will test that code path instead.  But that's
probably sufficient coverage, and anyway we weren't testing either
code path before.

Dang Minh Huong, tweaked a bit by me

Discussion: https://postgr.es/m/151935568942.1461.14623890240535309745@wrigleys.postgresql.org
2018-05-18 13:04:59 -04:00
Tom Lane
385f4acbf8 Recognize that MSVC can support strtoll() and strtoull().
This is needed for full support of "long long" variables in ecpg, but
the previous patch for bug #15080 (commits 51057feaa et al) missed it.
In MSVC versions where the functions don't exist under those names,
we can nonetheless use _strtoi64() and _strtoui64().

Like the previous patch, back-patch all the way.

Dang Minh Huong

Discussion: https://postgr.es/m/151935568942.1461.14623890240535309745@wrigleys.postgresql.org
2018-05-18 12:52:28 -04:00
Magnus Hagander
b5f096d50b Fix error message on short read of pg_control
Instead of saying "error: success", indicate that we got a working read
but it was too short.
2018-05-18 17:53:19 +02:00
Tom Lane
62e0020ad4 Fix misprocessing of equivalence classes involving record_eq().
canonicalize_ec_expression() is supposed to agree with coerce_type() as to
whether a RelabelType should be inserted to make a subexpression be valid
input for the operators of a given opclass.  However, it did the wrong
thing with named-composite-type inputs to record_eq(): it put in a
RelabelType to RECORDOID, which the parser doesn't.  In some cases this was
harmless because all code paths involving a particular equivalence class
did the same thing, but in other cases this would result in failing to
recognize a composite-type expression as being a member of an equivalence
class that it actually is a member of.  The most obvious bad effect was to
fail to recognize that an index on a composite column could provide the
sort order needed for a mergejoin on that column, as reported by Teodor
Sigaev.  I think there might be other, subtler, cases that result in
misoptimization.  It also seems possible that an unwanted RelabelType
would sometimes get into an emitted plan --- but because record_eq and
friends don't examine the declared type of their input expressions, that
would not create any visible problems.

To fix, just treat RECORDOID as if it were a polymorphic type, which in
some sense it is.  We might want to consider formalizing that a bit more
someday, but for the moment this seems to be the only place where an
IsPolymorphicType() test ought to include RECORDOID as well.

This has been broken for a long time, so back-patch to all supported
branches.

Discussion: https://postgr.es/m/a6b22369-e3bf-4d49-f59d-0c41d3551e81@sigaev.ru
2018-05-16 13:46:09 -04:00
Tom Lane
32453bc5af Update time zone data files to tzdata release 2018e.
DST law changes in North Korea.  Redefinition of "daylight savings" in
Ireland, as well as for some past years in Namibia and Czechoslovakia.
Additional historical corrections for Czechoslovakia.

With this change, the IANA database models Irish timekeeping as following
"standard time" in summer, and "daylight savings" in winter, so that the
daylight savings offset is one hour behind standard time not one hour
ahead.  This does not change their UTC offset (+1:00 in summer, 0:00 in
winter) nor their timezone abbreviations (IST in summer, GMT in winter),
though now "IST" is more correctly read as "Irish Standard Time" not "Irish
Summer Time".  However, the "is_dst" column in the pg_timezone_names view
will now be true in winter and false in summer for the Europe/Dublin zone.

Similar changes were made for Namibia between 1994 and 2017, and for
Czechoslovakia between 1946 and 1947.

So far as I can find, no Postgres internal logic cares about which way
tm_isdst is reported; in particular, since commit b2cbced9e we do not
rely on it to decide how to interpret ambiguous timestamps during DST
transitions.  So I don't think this change will affect any Postgres
behavior other than the timezone-view outputs.

Discussion: https://postgr.es/m/30996.1525445902@sss.pgh.pa.us
2018-05-09 13:56:00 -04:00
Tom Lane
3d48654017 Improve inefficient regexes in vacuumdb TAP test.
The regexes used in 102_vacuumdb_stages.pl to check the postmaster log
for expected output contained several places with ".*.*", which is
underdetermined and can cause exponential runtime growth in Perl's regex
matcher (since it's not bright enough not to waste time seeing whether
different splits of the same substring would allow a match).  We were
fortunate that the amount of text in the postmaster log was generally not
enough to make the runtime go to the moon; although commit 6271fceb8 had
been on the hairy edge of an obvious problem, thanks to its increasing the
default log verbosity to DEBUG1.  Experimentation shows that anyone who
tried to run this test case with an even higher log verbosity would have
been in for serious pain.  But even at default logging level, fixing this
saves several hundred ms on my workstation, more on slower buildfarm
members.

Remove the extra ".*"s, restoring more-or-less-linear matching speed.
Back-patch to 9.4 where the test case was added, mostly in case anyone
tries to do related debugging in a back branch.

Discussion: https://postgr.es/m/32459.1525657786@sss.pgh.pa.us
2018-05-08 20:17:43 -04:00
Tom Lane
364998df87 Stamp 9.4.18. 2018-05-07 16:57:35 -04:00
Peter Eisentraut
dc441d5c2d Translation updates
Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: d73a8c239ac494c183e266261c657580526d4cba
2018-05-07 11:47:28 -04:00
Andrew Dunstan
1eb24720c6 Clear severity 5 perlcritic warnings from vcregress.pl
My recent update for python3 support used some idioms that are
unapproved. This fixes them. Backpatch to all live branches like the
original.
2018-05-06 07:40:04 -04:00
Peter Eisentraut
af9e0d5cdf Tweak tests to support Python 3.7
Python 3.7 removes the trailing comma in the repr() of
BaseException (see <https://bugs.python.org/issue30399>), leading to
test output differences.  Work around that by composing the equivalent
test output in a more manual way.
2018-05-05 23:53:05 -04:00
Peter Eisentraut
280cf0fe78 Remove extra newlines after PQerrorMessage() 2018-05-05 10:54:00 -04:00
Heikki Linnakangas
c06380e976 Fix scenario where streaming standby gets stuck at a continuation record.
If a continuation record is split so that its first half has already been
removed from the master, and is only present in pg_wal, and there is a
recycled WAL segment in the standby server that looks like it would
contain the second half, recovery would get stuck. The code in
XLogPageRead() incorrectly started streaming at the beginning of the
WAL record, even if we had already read the first page.

Backpatch to 9.4. In principle, older versions have the same problem, but
without replication slots, there was no straightforward mechanism to
prevent the master from recycling old WAL that was still needed by standby.
Without such a mechanism, I think it's reasonable to assume that there's
enough slack in how many old segments are kept around to not run into this,
or you have a WAL archive.

Reported by Jonathon Nelson. Analysis and patch by Kyotaro HORIGUCHI, with
some extra comments by me.

Discussion: https://www.postgresql.org/message-id/CACJqAM3xVz0JY1XFDKPP%2BJoJAjoGx%3DGNuOAshEDWCext7BFvCQ%40mail.gmail.com
2018-05-05 01:35:18 +03:00
Andrew Dunstan
134db37d21 Provide for testing on python3 modules when under MSVC
This should have been done some years ago as promised in commit
c4dcdd0c2. However, better late than never.

Along the way do a little housekeeping, including using a simpler test
for the python version being tested, and removing a redundant subroutine
parameter. These changes only apply back to release 9.5.

Backpatch to all live releases.
2018-05-04 15:51:31 -04:00
Tom Lane
2d123b3104 Sync our copy of the timezone library with IANA release tzcode2018e.
The non-cosmetic changes involve teaching the "zic" tzdata compiler about
negative DST.  While I'm not currently intending that we start using
negative-DST data right away, it seems possible that somebody would try
to use our copy of zic with bleeding-edge IANA data.  So we'd better be
out in front of this change code-wise, even though it doesn't matter for
the data file we're shipping.

Discussion: https://postgr.es/m/30996.1525445902@sss.pgh.pa.us
2018-05-04 12:26:48 -04:00
Teodor Sigaev
6bd659f19c Add HOLD_INTERRUPTS section into FinishPreparedTransaction.
If an interrupt arrives in the middle of FinishPreparedTransaction
and any callback decide to call CHECK_FOR_INTERRUPTS (e.g.
RemoveTwoPhaseFile can write a warning with ereport, which checks for
interrupts) then it's possible to leave current GXact undeleted.

Backpatch to all supported branches

Stas Kelvich

Discussion: ihttps://www.postgresql.org/message-id/3AD85097-A3F3-4EBA-99BD-C38EDF8D2949@postgrespro.ru
2018-05-03 20:10:11 +03:00
Tom Lane
70211459a5 Revert back-branch changes in power()'s behavior for NaN inputs.
Per discussion, the value of fixing these bugs in the back branches
doesn't outweigh the downsides of changing corner-case behavior in
a minor release.  Hence, revert commits 217d8f3a1 and 4d864de48 in
the v10 branch and the corresponding commits in 9.3-9.6.

Discussion: https://postgr.es/m/75DB81BEEA95B445AE6D576A0A5C9E936A73E741@BPXM05GP.gisp.nec.co.jp
2018-05-02 17:32:40 -04:00
Tom Lane
8109a3c144 Fix bogus list-iteration code in pg_regress.c, affecting ecpg tests only.
While looking at a recent buildfarm failure in the ecpg tests, I wondered
why the pg_regress output claimed the stderr part of the test failed, when
the regression diffs were clearly for the stdout part.  Looking into it,
the reason is that pg_regress.c's logic for iterating over three parallel
lists is wrong, and has been wrong since it was written: it advances the
"tag" pointer at a different place in the loop than the other two pointers.
Fix that.
2018-04-29 21:56:28 -04:00
Tom Lane
59c2df3ae8 Avoid wrong results for power() with NaN input on more platforms.
Buildfarm results show that the modern POSIX rule that 1 ^ NaN = 1 is not
honored on *BSD until relatively recently, and really old platforms don't
believe that NaN ^ 0 = 1 either.  (This is unsurprising, perhaps, since
SUSv2 doesn't require either behavior.)  In hopes of getting to platform
independent behavior, let's deal with all the NaN-input cases explicitly
in dpow().

Note that numeric_power() doesn't know either of these special cases.
But since that behavior is platform-independent, I think it should be
addressed separately, and probably not back-patched.

Discussion: https://postgr.es/m/75DB81BEEA95B445AE6D576A0A5C9E936A73E741@BPXM05GP.gisp.nec.co.jp
2018-04-29 18:15:16 -04:00
Tom Lane
37c02b2b0a Update time zone data files to tzdata release 2018d.
DST law changes in Palestine and Antarctica (Casey Station).  Historical
corrections for Portugal and its colonies, as well as Enderbury, Jamaica,
Turks & Caicos Islands, and Uruguay.
2018-04-29 15:50:43 -04:00
Tom Lane
44ccd11cbb Avoid wrong results for power() with NaN input on some platforms.
Per spec, the result of power() should be NaN if either input is NaN.
It appears that on some versions of Windows, the libc function does
return NaN, but it also sets errno = EDOM, confusing our code that
attempts to work around shortcomings of other platforms.  Hence, add
guard tests to avoid substituting a wrong result for the right one.

It's been like this for a long time (and the odd behavior only appears
in older MSVC releases, too) so back-patch to all supported branches.

Dang Minh Huong, reviewed by David Rowley

Discussion: https://postgr.es/m/75DB81BEEA95B445AE6D576A0A5C9E936A73E741@BPXM05GP.gisp.nec.co.jp
2018-04-29 15:21:45 -04:00
Noah Misch
bb532859f4 Correct pg_recvlogical server version test.
The predecessor test boiled down to "PQserverVersion(NULL) >= 100000",
which is always false.  No release includes that, so it could not have
reintroduced CVE-2018-1058.  Back-patch to 9.4, like the addition of the
predecessor in commit 8d2814f274.

Discussion: https://postgr.es/m/20180422215551.GB2676194@rfd.leadboat.com
2018-04-25 18:50:34 -07:00
Tom Lane
58fec95268 Change more places to be less trusting of RestrictInfo.is_pushed_down.
On further reflection, commit e5d83995e didn't go far enough: pretty much
everywhere in the planner that examines a clause's is_pushed_down flag
ought to be changed to use the more complicated behavior where we also
check the clause's required_relids.  Otherwise we could make incorrect
decisions about whether, say, a clause is safe to use as a hash clause.

Some (many?) of these places are safe as-is, either because they are
never reached while considering a parameterized path, or because there
are additional checks that would reject a pushed-down clause anyway.
However, it seems smarter to just code them all the same way rather
than rely on easily-broken reasoning of that sort.

In support of that, invent a new macro RINFO_IS_PUSHED_DOWN that should
be used in place of direct tests on the is_pushed_down flag.

Like the previous patch, back-patch to all supported branches.

Discussion: https://postgr.es/m/f8128b11-c5bf-3539-48cd-234178b2314d@proxel.se
2018-04-20 15:19:17 -04:00
Tom Lane
a347d5210e Fix incorrect handling of join clauses pushed into parameterized paths.
In some cases a clause attached to an outer join can be pushed down into
the outer join's RHS even though the clause is not degenerate --- this
can happen if we choose to make a parameterized path for the RHS.  If
the clause ends up attached to a lower outer join, we'd misclassify it
as being a "join filter" not a plain "filter" condition at that node,
leading to wrong query results.

To fix, teach extract_actual_join_clauses to examine each join clause's
required_relids, not just its is_pushed_down flag.  (The latter now
seems vestigial, or at least in need of rethinking, but we won't do
anything so invasive as redefining it in a bug-fix patch.)

This has been wrong since we introduced parameterized paths in 9.2,
though it's evidently hard to hit given the lack of previous reports.
The test case used here involves a lateral function call, and I think
that a lateral reference may be required to get the planner to select
a broken plan; though I wouldn't swear to that.  In any case, even if
LATERAL is needed to trigger the bug, it still affects all supported
branches, so back-patch to all.

Per report from Andreas Karlsson.  Thanks to Andrew Gierth for
preliminary investigation.

Discussion: https://postgr.es/m/f8128b11-c5bf-3539-48cd-234178b2314d@proxel.se
2018-04-19 15:49:12 -04:00
Alvaro Herrera
e668507d36 Enlarge find_other_exec's meager fgets buffer
The buffer was 100 bytes long, which is barely sufficient when the
version string gets longer (such as by configure --with-extra-version).
Set it to MAXPGPATH.

Author: Nikhil Sontakke
Discussion: https://postgr.es/m/CAMGcDxfLfpYU_Jru++L6ARPCOyxr0W+2O3Q54TDi5XdYeU36ow@mail.gmail.com
2018-04-19 10:45:15 -03:00
Tom Lane
7490ce725e Better fix for deadlock hazard in CREATE INDEX CONCURRENTLY.
Commit 54eff5311 did not account for the possibility that we'd have
a transaction snapshot due to default_transaction_isolation being
set high enough to require one.  The transaction snapshot is enough
to hold back our advertised xmin and thus risk deadlock anyway.
The only way to get rid of that snap is to start a new transaction,
so let's do that instead.  Also throw in an assert checking that we
really have gotten to a state where no xmin is being advertised.

Back-patch to 9.4, like the previous commit.

Discussion: https://postgr.es/m/CAMkU=1ztk3TpQdcUNbxq93pc80FrXUjpDWLGMeVBDx71GHNwZQ@mail.gmail.com
2018-04-18 12:07:38 -04:00
Tom Lane
92b503c48d Revert "Add temporary debug logging, in 9.4 branch only."
This reverts commit e55380f3b6.
It's served its purpose.
2018-04-18 11:57:37 -04:00
Tom Lane
248c268d5b Revert "Add more temporary debug logging, in 9.4 branch only."
This reverts commit eef1a609ad.
It's served its purpose.
2018-04-18 11:56:56 -04:00
Tom Lane
eef1a609ad Add more temporary debug logging, in 9.4 branch only.
Last night's results were inconclusive, but after more staring at the
code I've thought of some more data to gather.

Discussion: https://postgr.es/m/6744.1523833660@sss.pgh.pa.us
2018-04-17 11:26:37 -04:00
Tom Lane
608d1f9711 Fix broken collation-aware searches in SP-GiST text opclass.
spg_text_leaf_consistent() supposed that it should compare only
Min(querylen, entrylen) bytes of the two strings, and then deal with
any excess bytes in one string or the other by assuming the longer
string is greater if the prefixes are equal.  Quite aside from the
fact that that's just wrong in some locales (e.g., 'ch' is not less
than 'd' in cs_CZ), it also risked passing incomplete multibyte
characters to strcoll(), with ensuing bad results.

Instead, just pass the full strings to varstr_cmp, and let it decide
what to do about unequal-length strings.

Fortunately, this error doesn't imply any index corruption, it's just
that searches might return the wrong set of entries.

Per report from Emre Hasegeli, though this is not his patch.
Thanks to Peter Geoghegan for review and discussion.

This code was born broken, so back-patch to all supported branches.
In HEAD, I failed to resist the temptation to do a bit of cosmetic
cleanup/pgindent'ing on 710d90da1, too.

Discussion: https://postgr.es/m/CAE2gYzzb6K51VnTq5i5p52z+j9p2duEa-K1T3RrC_GQEynAKEg@mail.gmail.com
2018-04-16 16:06:47 -04:00
Tom Lane
e55380f3b6 Add temporary debug logging, in 9.4 branch only.
Commit 5ee940e1c served its purpose by demonstrating that buildfarm
member okapi is seeing some sort of locally-visible state mismanagement,
not a cross-process data visibility problem as I'd first theorized.
Put in some elog(LOG) messages in hopes of gathering more info about
exactly what's happening there.  Again, this is temporary code to be
reverted once we have buildfarm results.

Discussion: https://postgr.es/m/6744.1523833660@sss.pgh.pa.us
2018-04-16 13:44:39 -04:00
Tom Lane
fea5bfde16 Revert "Add temporary debugging assertion, in 9.4 branch only."
This reverts commit 5ee940e1cd.
Further debugging is needed, but it'll look different than this,
so for simplicity revert this first.
2018-04-16 13:23:35 -04:00
Tom Lane
5ee940e1cd Add temporary debugging assertion, in 9.4 branch only.
Buildfarm member okapi has been failing the multiple-cic isolation
test for months now, but only in 9.4.  To narrow down the possible
causes, add an Assert testing that CREATE INDEX CONCURRENTLY is
advertising zero xmin before waiting for other transactions to end.

I'm not sure that this would hold in general, so this assertion isn't
meant to get released, but it passes all 9.4 regression tests for me.
Will revert once we see how okapi responds.
2018-04-15 20:23:59 -04:00
Tom Lane
3dd36aa4b3 In libpq, free any partial query result before collecting a server error.
We'd throw away the partial result anyway after parsing the error message.
Throwing it away beforehand costs nothing and reduces the risk of
out-of-memory failure.  Also, at least in systems that behave like
glibc/Linux, if the partial result was very large then the error PGresult
would get allocated at high heap addresses, preventing the heap storage
used by the partial result from being released to the OS until the error
PGresult is freed.

In psql >= 9.6, we hold onto the error PGresult until another error is
received (for \errverbose), so that this behavior causes a seeming
memory leak to persist for awhile, as in a recent complaint from
Darafei Praliaskouski.  This is a potential performance regression from
older versions, justifying back-patching at least that far.  But similar
behavior may occur in other client applications, so it seems worth just
back-patching to all supported branches.

Discussion: https://postgr.es/m/CAC8Q8tJ=7cOkPePyAbJE_Pf691t8nDFhJp0KZxHvnq_uicfyVg@mail.gmail.com
2018-04-13 12:53:46 -04:00
Tom Lane
f71d803c8d Fix bogus affix-merging code.
NISortAffixes() compared successive compound affixes incorrectly,
thus possibly failing to merge identical affixes, or (less likely)
merging ones that shouldn't be merged.  The user-visible effects
of this are unclear, to me anyway.

Per bug #15150 from Alexander Lakhin.  It's been broken for a long time,
so back-patch to all supported branches.

Arthur Zakirov

Discussion: https://postgr.es/m/152353327780.31225.13445405496721177988@wrigleys.postgresql.org
2018-04-12 18:39:51 -04:00
Tom Lane
6943fb9275 Ignore nextOid when replaying an ONLINE checkpoint.
The nextOid value is from the start of the checkpoint and may well be stale
compared to values from more recent XLOG_NEXTOID records.  Previously, we
adopted it anyway, allowing the OID counter to go backwards during a crash.
While this should be harmless, it contributed to the severity of the bug
fixed in commit 0408e1ed5, by allowing duplicate TOAST OIDs to be assigned
immediately following a crash.  Without this error, that issue would only
have arisen when TOAST objects just younger than a multiple of 2^32 OIDs
were deleted and then not vacuumed in time to avoid a conflict.

Pavan Deolasee

Discussion: https://postgr.es/m/CABOikdOgWT2hHkYG3Wwo2cyZJq2zfs1FH0FgX-=h4OLosXHf9w@mail.gmail.com
2018-04-11 18:11:30 -04:00
Tom Lane
5b3ed6b788 Do not select new object OIDs that match recently-dead entries.
When selecting a new OID, we take care to avoid picking one that's already
in use in the target table, so as not to create duplicates after the OID
counter has wrapped around.  However, up to now we used SnapshotDirty when
scanning for pre-existing entries.  That ignores committed-dead rows, so
that we could select an OID matching a deleted-but-not-yet-vacuumed row.
While that mostly worked, it has two problems:

* If recently deleted, the dead row might still be visible to MVCC
snapshots, creating a risk for duplicate OIDs when examining the catalogs
within our own transaction.  Such duplication couldn't be visible outside
the object-creating transaction, though, and we've heard few if any field
reports corresponding to such a symptom.

* When selecting a TOAST OID, deleted toast rows definitely *are* visible
to SnapshotToast, and will remain so until vacuumed away.  This leads to
a conflict that will manifest in errors like "unexpected chunk number 0
(expected 1) for toast value nnnnn".  We've been seeing reports of such
errors from the field for years, but the cause was unclear before.

The fix is simple: just use SnapshotAny to search for conflicting rows.
This results in a slightly longer window before object OIDs can be
recycled, but that seems unlikely to create any large problems.

Pavan Deolasee

Discussion: https://postgr.es/m/CABOikdOgWT2hHkYG3Wwo2cyZJq2zfs1FH0FgX-=h4OLosXHf9w@mail.gmail.com
2018-04-11 17:41:27 -04:00
Heikki Linnakangas
310d1379dd Make local copy of client hostnames in backend status array.
The other strings, application_name and query string, were snapshotted to
local memory in pgstat_read_current_status(), but we forgot to do that for
client hostnames. As a result, the client hostname would appear to change in
the local copy, if the client disconnected.

Backpatch to all supported versions.

Author: Edmund Horner
Reviewed-by: Michael Paquier
Discussion: https://www.postgresql.org/message-id/CAMyN-kA7aOJzBmrYFdXcc7Z0NmW%2B5jBaf_m%3D_-77uRNyKC9r%3DA%40mail.gmail.com
2018-04-11 23:40:27 +03:00
Tom Lane
f530af8fc9 Fix incorrect close() call in dsm_impl_mmap().
One improbable error-exit path in this function used close() where
it should have used CloseTransientFile().  This is unlikely to be
hit in the field, and I think the consequences wouldn't be awful
(just an elog(LOG) bleat later).  But a bug is a bug, so back-patch
to 9.4 where this code came in.

Pan Bian

Discussion: https://postgr.es/m/152056616579.4966.583293218357089052@wrigleys.postgresql.org
2018-04-10 18:34:40 -04:00
Tom Lane
b7537ffb1a Fix bogus provolatile/proparallel markings on a few built-in functions.
Richard Yen reported that pg_upgrade failed if the target cluster had
force_parallel_mode = on, because binary_upgrade_create_empty_extension()
is marked parallel restricted, allowing it to be executed in parallel
mode, which complains because it tries to acquire an XID.

In general, no function that might try to modify database data should
be considered parallel safe or restricted, since execution of it might
force XID acquisition.  We found several other examples of this mistake.

Furthermore, functions that execute user-supplied SQL queries or query
fragments, or pull data from user-supplied cursors, had better be marked
both volatile and parallel unsafe, because we don't know what the supplied
query or cursor might try to do.  There were several tsquery and XML
functions that had the wrong proparallel marking for this, and some of
them were even mislabeled as to volatility.

All these bugs are old, dating back to 9.6 for the proparallel mistakes
and much further for the provolatile mistakes.  We can't force a
catversion bump in the back branches, but we can at least ensure that
installations initdb'd in future have the right values.

Thomas Munro and Tom Lane

Discussion: https://postgr.es/m/CAEepm=2sNDScSLTfyMYu32Q=ob98ZGW-vM_2oLxinzSABGQ6VA@mail.gmail.com
2018-03-30 18:14:51 -04:00
Tom Lane
4c26965166 Fix make rules that generate multiple output files.
For years, our makefiles have correctly observed that "there is no correct
way to write a rule that generates two files".  However, what we did is to
provide empty rules that "generate" the secondary output files from the
primary one, and that's not right either.  Depending on the details of
the creating process, the primary file might end up timestamped later than
one or more secondary files, causing subsequent make runs to consider the
secondary file(s) out of date.  That's harmless in a plain build, since
make will just re-execute the empty rule and nothing happens.  But it's
fatal in a VPATH build, since make will expect the secondary file to be
rebuilt in the build directory.  This would manifest as "file not found"
failures during VPATH builds from tarballs, if we were ever unlucky enough
to ship a tarball with apparently out-of-date secondary files.  (It's not
clear whether that has ever actually happened, but it definitely could.)

To ensure that secondary output files have timestamps >= their primary's,
change our makefile convention to be that we provide a "touch $@" action
not an empty rule.  Also, make sure that this rule actually gets invoked
during a distprep run, else the hazard remains.

It's been like this a long time, so back-patch to all supported branches.

In HEAD, I skipped the changes in src/backend/catalog/Makefile, because
those rules are due to get replaced soon in the bootstrap data format
patch, and there seems no need to create a merge issue for that patch.
If for some reason we fail to land that patch in v11, we'll need to
back-fill the changes in that one makefile from v10.

Discussion: https://postgr.es/m/18556.1521668179@sss.pgh.pa.us
2018-03-23 13:45:38 -04:00
Tom Lane
7f6f8ccd97 Fix tuple counting in SP-GiST index build.
Count the number of tuples in the index honestly, instead of assuming
that it's the same as the number of tuples in the heap.  (It might be
different if the index is partial.)

Back-patch to all supported versions.

Tomas Vondra

Discussion: https://postgr.es/m/3b3d8eac-c709-0d25-088e-b98339a1b28a@2ndquadrant.com
2018-03-22 13:23:48 -04:00
Tom Lane
67e02cde73 Fix mishandling of quoted-list GUC values in pg_dump and ruleutils.c.
Code that prints out the contents of setconfig or proconfig arrays in
SQL format needs to handle GUC_LIST_QUOTE variables differently from
other ones, because for those variables, flatten_set_variable_args()
already applied a layer of quoting.  The value can therefore safely
be printed as-is, and indeed must be, or flatten_set_variable_args()
will muck it up completely on reload.  For all other GUC variables,
it's necessary and sufficient to quote the value as a SQL literal.

We'd recognized the need for this long ago, but mis-analyzed the
need slightly, thinking that all GUC_LIST_INPUT variables needed
the special treatment.  That's actually wrong, since a valid value
of a LIST variable might include characters that need quoting,
although no existing variables accept such values.

More to the point, we hadn't made any particular effort to keep the
various places that deal with this up-to-date with the set of variables
that actually need special treatment, meaning that we'd do the wrong
thing with, for example, temp_tablespaces values.  This affects dumping
of SET clauses attached to functions, as well as ALTER DATABASE/ROLE SET
commands.

In ruleutils.c we can fix it reasonably honestly by exporting a guc.c
function that allows discovering the flags for a given GUC variable.
But pg_dump doesn't have easy access to that, so continue the old method
of having a hard-wired list of affected variable names.  At least we can
fix it to have just one list not two, and update the list to match
current reality.

A remaining problem with this is that it only works for built-in
GUC variables.  pg_dump's list obvious knows nothing of third-party
extensions, and even the "ask guc.c" method isn't bulletproof since
the relevant extension might not be loaded.  There's no obvious
solution to that, so for now, we'll just have to discourage extension
authors from inventing custom GUCs that need GUC_LIST_QUOTE.

This has been busted for a long time, so back-patch to all supported
branches.

Michael Paquier and Tom Lane, reviewed by Kyotaro Horiguchi and
Pavel Stehule

Discussion: https://postgr.es/m/20180111064900.GA51030@paquier.xyz
2018-03-21 20:03:28 -04:00
Tom Lane
e1f186da94 Fix some corner-case issues in REFRESH MATERIALIZED VIEW CONCURRENTLY.
refresh_by_match_merge() has some issues in the way it builds a SQL
query to construct the "diff" table:

1. It doesn't require the selected unique index(es) to be indimmediate.
2. It doesn't pay attention to the particular equality semantics enforced
by a given index, but just assumes that they must be those of the column
datatype's default btree opclass.
3. It doesn't check that the indexes are btrees.
4. It's insufficiently careful to ensure that the parser will pick the
intended operator when parsing the query.  (This would have been a
security bug before CVE-2018-1058.)
5. It's not careful about indexes on system columns.

The way to fix #4 is to make use of the existing code in ri_triggers.c
for generating an arbitrary binary operator clause.  I chose to move
that to ruleutils.c, since that seems a more reasonable place to be
exporting such functionality from than ri_triggers.c.

While #1, #3, and #5 are just latent given existing feature restrictions,
and #2 doesn't arise in the core system for lack of alternate opclasses
with different equality behaviors, #4 seems like an issue worth
back-patching.  That's the bulk of the change anyway, so just back-patch
the whole thing to 9.4 where this code was introduced.

Discussion: https://postgr.es/m/13836.1521413227@sss.pgh.pa.us
2018-03-19 18:49:53 -04:00
Tom Lane
b6ba94ec45 Fix performance hazard in REFRESH MATERIALIZED VIEW CONCURRENTLY.
Jeff Janes discovered that commit 7ca25b7de made one of the queries run by
REFRESH MATERIALIZED VIEW CONCURRENTLY perform badly.  The root cause is
bad cardinality estimation for correlated quals, but a principled solution
to that problem is some way off, especially since the planner lacks any
statistics about whole-row variables.  Moreover, in non-error cases this
query produces no rows, meaning it must be run to completion; but use of
LIMIT 1 encourages the planner to pick a fast-start, slow-completion plan,
exactly not what we want.  Remove the LIMIT clause, and instead rely on
the count parameter we pass to SPI_execute() to prevent excess work if the
query does return some rows.

While we've heard no field reports of planner misbehavior with this query,
it could be that people are having performance issues that haven't reached
the level of pain needed to cause a bug report.  In any case, that LIMIT
clause can't possibly do anything helpful with any existing version of the
planner, and it demonstrably can cause bad choices in some cases, so
back-patch to 9.4 where the code was introduced.

Thomas Munro

Discussion: https://postgr.es/m/CAMkU=1z-JoGymHneGHar1cru4F1XDfHqJDzxP_CtK5cL3DOfmg@mail.gmail.com
2018-03-19 17:23:07 -04:00
Magnus Hagander
af5fbb1286 Fix pg_recvlogical for pre-10 versions
In e170b8c8, protection against modified search_path was added. However,
PostgreSQL versions prior to 10 does not accept SQL commands over a
replication connection, so the protection would generate a syntax error.

Since we cannot run SQL commands on it, we are also not vulnerable to
the issue that e170b8c8 fixes, so we can just skip this command for
older versions.

Author: Michael Paquier <michael@paquier.xyz>
2018-03-18 13:11:58 +01:00
Tom Lane
092401b14f Fix overflow handling in plpgsql's integer FOR loops.
The test to exit the loop if the integer control value would overflow
an int32 turns out not to work on some ICC versions, as it's dependent
on the assumption that the compiler will execute the code as written
rather than "optimize" it.  ICC lacks any equivalent of gcc's -fwrapv
switch, so it was optimizing on the assumption of no integer overflow,
and that breaks this.  Rewrite into a form that in fact does not
do any overflowing computations.

Per Tomas Vondra and buildfarm member fulmar.  It's been like this
for a long time, although it was not till we added a regression test
case covering the behavior (in commit dd2243f2a) that the problem
became apparent.  Back-patch to all supported versions.

Discussion: https://postgr.es/m/50562fdc-0876-9843-c883-15b8566c7511@2ndquadrant.com
2018-03-17 15:38:15 -04:00
Tom Lane
0a0721f84c Fix WHERE CURRENT OF when the referenced cursor uses an index-only scan.
"UPDATE/DELETE WHERE CURRENT OF cursor_name" failed, with an error message
like "cannot extract system attribute from virtual tuple", if the cursor
was using a index-only scan for the target table.  Fix it by digging the
current TID out of the indexscan state.

It seems likely that the same failure could occur for CustomScan plans
and perhaps some FDW plan types, so that leaving this to be treated as an
internal error with an obscure message isn't as good an idea as it first
seemed.  Hence, add a bit of heaptuple.c infrastructure to let us deliver
a more on-topic message.  I chose to make the message match what you get
for the case where execCurrentOf can't identify the target scan node at
all, "cursor "foo" is not a simply updatable scan of table "bar"".
Perhaps it should be different, but we can always adjust that later.

In the future, it might be nice to provide hooks that would let custom
scan providers and/or FDWs deal with this in other ways; but that's
not a suitable topic for a back-patchable bug fix.

It's been like this all along, so back-patch to all supported branches.

Yugo Nagata and Tom Lane

Discussion: https://postgr.es/m/20180201013349.937dfc5f.nagata@sraoss.co.jp
2018-03-17 14:59:31 -04:00
Tom Lane
2709549ecd Fix query-lifespan memory leakage in repeatedly executed hash joins.
ExecHashTableCreate allocated some memory that wasn't freed by
ExecHashTableDestroy, specifically the per-hash-key function information.
That's not a huge amount of data, but if one runs a query that repeats
a hash join enough times, it builds up.  Fix by arranging for the data
in question to be kept in the hashtable's hashCxt instead of leaving it
"loose" in the query-lifespan executor context.  (This ensures that we'll
also clean up anything that the hash functions allocate in fn_mcxt.)

Per report from Amit Khandekar.  It's been like this forever, so back-patch
to all supported branches.

Discussion: https://postgr.es/m/CAJ3gD9cFofAWGvcxLOxDHC=B0hjtW8yGmUsF2hdGh97CM38=7g@mail.gmail.com
2018-03-16 16:03:45 -04:00
Michael Meskes
fcc15bf381 Fix double frees in ecpg.
Patch by Patrick Krecker <patrick@judicata.com>
2018-03-14 00:52:21 +01:00
Tom Lane
25a2ba35ed When updating reltuples after ANALYZE, just extrapolate from our sample.
The existing logic for updating pg_class.reltuples trusted the sampling
results only for the pages ANALYZE actually visited, preferring to
believe the previous tuple density estimate for all the unvisited pages.
While there's some rationale for doing that for VACUUM (first that
VACUUM is likely to visit a very nonrandom subset of pages, and second
that we know for sure that the unvisited pages did not change), there's
no such rationale for ANALYZE: by assumption, it's looked at an unbiased
random sample of the table's pages.  Furthermore, in a very large table
ANALYZE will have examined only a tiny fraction of the table's pages,
meaning it cannot slew the overall density estimate very far at all.
In a table that is physically growing, this causes reltuples to increase
nearly proportionally to the change in relpages, regardless of what is
actually happening in the table.  This has been observed to cause reltuples
to become so much larger than reality that it effectively shuts off
autovacuum, whose threshold for doing anything is a fraction of reltuples.
(Getting to the point where that would happen seems to require some
additional, not well understood, conditions.  But it's undeniable that if
reltuples is seriously off in a large table, ANALYZE alone will not fix it
in any reasonable number of iterations, especially not if the table is
continuing to grow.)

Hence, restrict the use of vac_estimate_reltuples() to VACUUM alone,
and in ANALYZE, just extrapolate from the sample pages on the assumption
that they provide an accurate model of the whole table.  If, by very bad
luck, they don't, at least another ANALYZE will fix it; in the old logic
a single bad estimate could cause problems indefinitely.

In HEAD, let's remove vac_estimate_reltuples' is_analyze argument
altogether; it was never used for anything and now it's totally pointless.
But keep it in the back branches, in case any third-party code is calling
this function.

Per bug #15005.  Back-patch to all supported branches.

David Gould, reviewed by Alexander Kuzmenkov, cosmetic changes by me

Discussion: https://postgr.es/m/20180117164916.3fdcf2e9@engels
2018-03-13 13:24:27 -04:00
Tom Lane
95f08d32de Avoid holding AutovacuumScheduleLock while rechecking table statistics.
In databases with many tables, re-fetching the statistics takes some time,
so that this behavior seriously decreases the available concurrency for
multiple autovac workers.  There's discussion afoot about more complete
fixes, but a simple and back-patchable amelioration is to claim the table
and release the lock before rechecking stats.  If we find out there's no
longer a reason to process the table, re-taking the lock to un-claim the
table is cheap enough.

(This patch is quite old, but got lost amongst a discussion of more
aggressive fixes.  It's not clear when or if such a fix will be
accepted, but in any case it'd be unlikely to get back-patched.
Let's do this now so we have some improvement for the back branches.)

In passing, make the normal un-claim step take AutovacuumScheduleLock
not AutovacuumLock, since that is what is documented to protect the
wi_tableoid field.  This wasn't an actual bug in view of the fact that
readers of that field hold both locks, but it creates some concurrency
penalty against operations that need only AutovacuumLock.

Back-patch to all supported versions.

Jeff Janes

Discussion: https://postgr.es/m/26118.1520865816@sss.pgh.pa.us
2018-03-13 12:28:39 -04:00
Michael Meskes
bd7eb6fe65 Set connection back to NULL after freeing it.
Patch by Jeevan Ladhe <jeevan.ladhe@enterprisedb.com>
2018-03-12 23:54:22 +01:00
Tom Lane
e556fb1372 Fix improper uses of canonicalize_qual().
One of the things canonicalize_qual() does is to remove constant-NULL
subexpressions of top-level AND/OR clauses.  It does that on the assumption
that what it's given is a top-level WHERE clause, so that NULL can be
treated like FALSE.  Although this is documented down inside a subroutine
of canonicalize_qual(), it wasn't mentioned in the documentation of that
function itself, and some callers hadn't gotten that memo.

Notably, commit d007a9505 caused get_relation_constraints() to apply
canonicalize_qual() to CHECK constraints.  That allowed constraint
exclusion to misoptimize situations in which a CHECK constraint had a
provably-NULL subclause, as seen in the regression test case added here,
in which a child table that should be scanned is not.  (Although this
thinko is ancient, the test case doesn't fail before 9.2, for reasons
I've not bothered to track down in detail.  There may be related cases
that do fail before that.)

More recently, commit f0e44751d added an independent bug by applying
canonicalize_qual() to index expressions, which is even sillier since
those might not even be boolean.  If they are, though, I think this
could lead to making incorrect index entries for affected index
expressions in v10.  I haven't attempted to prove that though.

To fix, add an "is_check" parameter to canonicalize_qual() to specify
whether it should assume WHERE or CHECK semantics, and make it perform
NULL-elimination accordingly.  Adjust the callers to apply the right
semantics, or remove the call entirely in cases where it's not known
that the expression has one or the other semantics.  I also removed
the call in some cases involving partition expressions, where it should
be a no-op because such expressions should be canonical already ...
and was a no-op, independently of whether it could in principle have
done something, because it was being handed the qual in implicit-AND
format which isn't what it expects.  In HEAD, add an Assert to catch
that type of mistake in future.

This represents an API break for external callers of canonicalize_qual().
While that's intentional in HEAD to make such callers think about which
case applies to them, it seems like something we probably wouldn't be
thanked for in released branches.  Hence, in released branches, the
extra parameter is added to a new function canonicalize_qual_ext(),
and canonicalize_qual() is a wrapper that retains its old behavior.

Patch by me with suggestions from Dean Rasheed.  Back-patch to all
supported branches.

Discussion: https://postgr.es/m/24475.1520635069@sss.pgh.pa.us
2018-03-11 18:10:42 -04:00
Alvaro Herrera
6d30e3a2b6 Refrain from duplicating data in reorderbuffers
If a walsender exits leaving data in reorderbuffers, the next walsender
that tries to decode the same transaction would append its decoded data
in the same spill files without truncating it first, which effectively
duplicate the data.  Avoid that by removing any leftover reorderbuffer
spill files when a walsender starts.

Backpatch to 9.4; this bug has been there from the very beginning of
logical decoding.

Author: Craig Ringer, revised by me
Reviewed by: Álvaro Herrera, Petr Jelínek, Masahiko Sawada
2018-03-06 16:10:23 -03:00
Tom Lane
165fa27fe4 Fix assorted issues in convert_to_scalar().
If convert_to_scalar is passed a pair of datatypes it can't cope with,
its former behavior was just to elog(ERROR).  While this is OK so far as
the core code is concerned, there's extension code that would like to use
scalarltsel/scalargtsel/etc as selectivity estimators for operators that
work on non-core datatypes, and this behavior is a show-stopper for that
use-case.  If we simply allow convert_to_scalar to return FALSE instead of
outright failing, then the main logic of scalarltsel/scalargtsel will work
fine for any operator that behaves like a scalar inequality comparison.
The lack of conversion capability will mean that we can't estimate to
better than histogram-bin-width precision, since the code will effectively
assume that the comparison constant falls at the middle of its bin.  But
that's still a lot better than nothing.  (Someday we should provide a way
for extension code to supply a custom version of convert_to_scalar, but
today is not that day.)

While poking at this issue, we noted that the existing code for handling
type bytea in convert_to_scalar is several bricks shy of a load.
It assumes without checking that if the comparison value is type bytea,
the bounds values are too; in the worst case this could lead to a crash.
It also fails to detoast the input values, so that the comparison result is
complete garbage if any input is toasted out-of-line, compressed, or even
just short-header.  I'm not sure how often such cases actually occur ---
the bounds values, at least, are probably safe since they are elements of
an array and hence can't be toasted.  But that doesn't make this code OK.

Back-patch to all supported branches, partly because author requested that,
but mostly because of the bytea bugs.  The change in API for the exposed
routine convert_network_to_scalar() is theoretically a back-patch hazard,
but it seems pretty unlikely that any third-party code is calling that
function directly.

Tomas Vondra, with some adjustments by me

Discussion: https://postgr.es/m/b68441b6-d18f-13ab-b43b-9a72188a4e02@2ndquadrant.com
2018-03-03 20:31:35 -05:00
Tom Lane
947f06c622 Make gistvacuumcleanup() count the actual number of index tuples.
Previously, it just returned the heap tuple count, which might be only an
estimate, and would be completely the wrong thing if the index is partial.
Since this function scans every index page anyway to find free pages,
it's practically free to count the surviving index tuples.  Let's do that
and return an accurate count.

This is easily visible as a wrong reltuples value for a partial GiST
index following VACUUM, so back-patch to all supported branches.

Andrey Borodin, reviewed by Michail Nikolaev

Discussion: https://postgr.es/m/151956654251.6915.675951950408204404.pgcf@coridan.postgresql.org
2018-03-02 11:22:42 -05:00
Tom Lane
a4fed310cb Use ereport not elog for some corrupt-HOT-chain reports.
These errors have been seen in the field in corrupted-data situations.
It seems worthwhile to report them with ERRCODE_DATA_CORRUPTED, rather
than the generic ERRCODE_INTERNAL_ERROR, for the benefit of log monitoring
and tools like amcheck.  However, use errmsg_internal so that the text
strings still aren't translated; it seems unlikely to be worth
translators' time to do so.

Back-patch to 9.3, like the predecessor commit d70cf811f that introduced
these elog calls originally (replacing Asserts).

Peter Geoghegan

Discussion: https://postgr.es/m/CAH2-Wzmn4-Pg-UGFwyuyK-wiTih9j32pwg_7T9iwqXpAUZr=Mg@mail.gmail.com
2018-03-01 16:23:50 -05:00
Alvaro Herrera
3ee23834ed Relax overly strict sanity check for upgraded ancient databases
Commit 4800f16a7a added some sanity checks to ensure we don't
accidentally corrupt data, but in one of them we failed to consider the
effects of a database upgraded from 9.2 or earlier, where a tuple
exclusively locked prior to the upgrade has a slightly different bit
pattern.  Fix that by using the macro that we fixed in commit
74ebba84ae for similar situations.

Reported-by: Alexandre Garcia
Reviewed-by: Andres Freund
Discussion: https://postgr.es/m/CAPYLKR6yxV4=pfW0Gwij7aPNiiPx+3ib4USVYnbuQdUtmkMaEA@mail.gmail.com

Andres suspects that this bug may have wider ranging consequences, but I
couldn't find anything.
2018-03-01 18:07:46 -03:00
Tom Lane
d07f79a9cc Rename base64 routines to avoid conflict with Solaris built-in functions.
Solaris 11.4 has built-in functions named b64_encode and b64_decode.
Rename ours to something else to avoid the conflict (fortunately,
ours are static so the impact is limited).

One could wish for less duplication of code in this area, but that
would be a larger patch and not very suitable for back-patching.
Since this is a portability fix, we want to put it into all supported
branches.

Report and initial patch by Rainer Orth, reviewed and adjusted a bit
by Michael Paquier

Discussion: https://postgr.es/m/ydd372wk28h.fsf@CeBiTec.Uni-Bielefeld.DE
2018-02-28 18:33:45 -05:00
Tom Lane
cadb14c271 Remove restriction on SQL block length in isolationtester scanner.
specscanner.l had a fixed limit of 1024 bytes on the length of
individual SQL stanzas in an isolation test script.  People are
starting to run into that, so fix it by making the buffer resizable.

Once we allow this in HEAD, it seems inevitable that somebody will
try to back-patch a test that exceeds the old limit, so back-patch
this change as a preventive measure.

Daniel Gustafsson

Discussion: https://postgr.es/m/8D628BE4-6606-4FF6-A3FF-8B2B0E9B43D0@yesql.se
2018-02-28 16:57:38 -05:00
Tom Lane
49f9014c8c Fix up ecpg's configuration so it handles "long long int" in MSVC builds.
Although configure-based builds correctly define HAVE_LONG_LONG_INT when
appropriate (in both pg_config.h and ecpg_config.h), builds using the MSVC
scripts failed to do so.  This currently has no impact on the backend,
since it uses that symbol nowhere; but it does prevent ecpg from
supporting "long long int".  Fix that.

Also, adjust Solution.pm so that in the constructed ecpg_config.h file,
the "#if (_MSC_VER > 1200)" covers only the LONG_LONG_INT-related
#defines, not the whole file.  AFAICS this was a thinko on somebody's
part: ENABLE_THREAD_SAFETY should always be defined in Windows builds,
and in branches using USE_INTEGER_DATETIMES, the setting of that shouldn't
depend on the compiler version either.  If I'm wrong, I imagine the
buildfarm will say so.

Per bug #15080 from Jonathan Allen; issue diagnosed by Michael Meskes
and Andrew Gierth.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/151935568942.1461.14623890240535309745@wrigleys.postgresql.org
2018-02-27 16:46:52 -05:00
Tom Lane
5cedaeca26 Remove regression tests' CREATE FUNCTION commands for unused C functions.
I removed these functions altogether in HEAD, in commit db3af9feb, and
it emerges that that causes trouble for cross-branch upgrade testing.
We could put back stub functions but that seems pretty silly.  Instead,
back-patch a minimal subset of db3af9feb, namely just removing the
CREATE FUNCTION commands.

Discussion: https://postgr.es/m/11927.1519756619@sss.pgh.pa.us
2018-02-27 15:04:53 -05:00
Tom Lane
5ccb775869 Prevent dangling-pointer access when update trigger returns old tuple.
A before-update row trigger may choose to return the "new" or "old" tuple
unmodified.  ExecBRUpdateTriggers failed to consider the second
possibility, and would proceed to free the "old" tuple even if it was the
one returned, leading to subsequent access to already-deallocated memory.
In debug builds this reliably leads to an "invalid memory alloc request
size" failure; in production builds it might accidentally work, but data
corruption is also possible.

This is a very old bug.  There are probably a couple of reasons it hasn't
been noticed up to now.  It would be more usual to return NULL if one
wanted to suppress the update action; returning "old" is significantly less
efficient since the update will occur anyway.  Also, none of the standard
PLs would ever cause this because they all returned freshly-manufactured
tuples even if they were just copying "old".  But commit 4b93f5799 changed
that for plpgsql, making it possible to see the bug with a plpgsql trigger.
Still, this is certainly legal behavior for a trigger function, so it's
ExecBRUpdateTriggers's fault not plpgsql's.

It seems worth creating a test case that exercises returning "old" directly
with a C-language trigger; testing this through plpgsql seems unreliable
because its behavior might change again.

Report and fix by Rushabh Lathia; regression test case by me.
Back-patch to all supported branches.

Discussion: https://postgr.es/m/CAGPqQf1P4pjiNPrMof=P_16E-DFjt457j+nH2ex3=nBTew7tXw@mail.gmail.com
2018-02-27 13:27:38 -05:00
Magnus Hagander
5181bebba7 Revert restructuring of bin/scripts/Makefile
The Makefile portion of 91f3ffc524 broke
the MSVC build. This patch reverts the changes to the Makefile and
adjusts it to work with the new code, while keeping the actual code
changes from the original patch.

Author: Victor Wagner <vitus@wagner.pp.ru>
2018-02-27 14:11:09 +01:00
Tom Lane
f46bc90dbc Stamp 9.4.17. 2018-02-26 17:17:45 -05:00
Noah Misch
f28955e382 Document security implications of search_path and the public schema.
The ability to create like-named objects in different schemas opens up
the potential for users to change the behavior of other users' queries,
maliciously or accidentally.  When you connect to a PostgreSQL server,
you should remove from your search_path any schema for which a user
other than yourself or superusers holds the CREATE privilege.  If you do
not, other users holding CREATE privilege can redefine the behavior of
your commands, causing them to perform arbitrary SQL statements under
your identity.  "SET search_path = ..." and "SELECT
pg_catalog.set_config(...)" are not vulnerable to such hijacking, so one
can use either as the first command of a session.  As special
exceptions, the following client applications behave as documented
regardless of search_path settings and schema privileges: clusterdb
createdb createlang createuser dropdb droplang dropuser ecpg (not
programs it generates) initdb oid2name pg_archivecleanup pg_basebackup
pg_config pg_controldata pg_ctl pg_dump pg_dumpall pg_isready
pg_receivewal pg_recvlogical pg_resetwal pg_restore pg_rewind pg_standby
pg_test_fsync pg_test_timing pg_upgrade pg_waldump reindexdb vacuumdb
vacuumlo.  Not included are core client programs that run user-specified
SQL commands, namely psql and pgbench.  PostgreSQL encourages non-core
client applications to do likewise.

Document this in the context of libpq connections, psql connections,
dblink connections, ECPG connections, extension packaging, and schema
usage patterns.  The principal defense for applications is "SELECT
pg_catalog.set_config('search_path', '', false)", and the principal
defense for databases is "REVOKE CREATE ON SCHEMA public FROM PUBLIC".
Either one is sufficient to prevent attack.  After a REVOKE, consider
auditing the public schema for objects named like pg_catalog objects.

Authors of SECURITY DEFINER functions use some of the same defenses, and
the CREATE FUNCTION reference page already covered them thoroughly.
This is a good opportunity to audit SECURITY DEFINER functions for
robust security practice.

Back-patch to 9.3 (all supported versions).

Reviewed by Michael Paquier and Jonathan S. Katz.  Reported by Arseniy
Sharoglazov.

Security: CVE-2018-1058
2018-02-26 07:39:48 -08:00
Noah Misch
928bca1a30 Empty search_path in Autovacuum and non-psql/pgbench clients.
This makes the client programs behave as documented regardless of the
connect-time search_path and regardless of user-created objects.  Today,
a malicious user with CREATE permission on a search_path schema can take
control of certain of these clients' queries and invoke arbitrary SQL
functions under the client identity, often a superuser.  This is
exploitable in the default configuration, where all users have CREATE
privilege on schema "public".

This changes behavior of user-defined code stored in the database, like
pg_index.indexprs and pg_extension_config_dump().  If they reach code
bearing unqualified names, "does not exist" or "no schema has been
selected to create in" errors might appear.  Users may fix such errors
by schema-qualifying affected names.  After upgrading, consider watching
server logs for these errors.

The --table arguments of src/bin/scripts clients have been lax; for
example, "vacuumdb -Zt pg_am\;CHECKPOINT" performed a checkpoint.  That
now fails, but for now, "vacuumdb -Zt 'pg_am(amname);CHECKPOINT'" still
performs a checkpoint.

Back-patch to 9.3 (all supported versions).

Reviewed by Tom Lane, though this fix strategy was not his first choice.
Reported by Arseniy Sharoglazov.

Security: CVE-2018-1058
2018-02-26 07:39:48 -08:00
Noah Misch
461c32b557 Back-patch non-static ExecuteSqlQueryForSingleRow().
Back-patch a subset of commit 47e5969767
to 9.4 and 9.3.  The next commit adds calls to this function.

Security: CVE-2018-1058
2018-02-26 07:39:48 -08:00
Tom Lane
9f6e5296a1 Avoid using unsafe search_path settings during dump and restore.
Historically, pg_dump has "set search_path = foo, pg_catalog" when
dumping an object in schema "foo", and has also caused that setting
to be used while restoring the object.  This is problematic because
functions and operators in schema "foo" could capture references meant
to refer to pg_catalog entries, both in the queries issued by pg_dump
and those issued during the subsequent restore run.  That could
result in dump/restore misbehavior, or in privilege escalation if a
nefarious user installs trojan-horse functions or operators.

This patch changes pg_dump so that it does not change the search_path
dynamically.  The emitted restore script sets the search_path to what
was used at dump time, and then leaves it alone thereafter.  Created
objects are placed in the correct schema, regardless of the active
search_path, by dint of schema-qualifying their names in the CREATE
commands, as well as in subsequent ALTER and ALTER-like commands.

Since this change requires a change in the behavior of pg_restore
when processing an archive file made according to this new convention,
bump the archive file version number; old versions of pg_restore will
therefore refuse to process files made with new versions of pg_dump.

Security: CVE-2018-1058
2018-02-26 10:18:22 -05:00
Peter Eisentraut
fd090d0d6f Translation updates
Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: e859c0a3a8ac0fc7ae28150aff33153ec532ef04
2018-02-26 08:37:35 -05:00
Noah Misch
da27155c6b Synchronize doc/ copies of src/test/examples/.
This is mostly cosmetic, but it might fix build failures, on some
platform, when copying from the documentation.

Back-patch to 9.3 (all supported versions).
2018-02-23 11:24:08 -08:00
Tom Lane
f6dd08489c Fix planner failures with overlapping mergejoin clauses in an outer join.
Given overlapping or partially redundant join clauses, for example
	t1 JOIN t2 ON t1.a = t2.x AND t1.b = t2.x
the planner's EquivalenceClass machinery will ordinarily refactor the
clauses as "t1.a = t1.b AND t1.a = t2.x", so that join processing doesn't
see multiple references to the same EquivalenceClass in a list of join
equality clauses.  However, if the join is outer, it's incorrect to derive
a restriction clause on the outer side from the join conditions, so the
clause refactoring does not happen and we end up with overlapping join
conditions.  The code that attempted to deal with such cases had several
subtle bugs, which could result in "left and right pathkeys do not match in
mergejoin" or "outer pathkeys do not match mergeclauses" planner errors,
if the selected join plan type was a mergejoin.  (It does not appear that
any actually incorrect plan could have been emitted.)

The core of the problem really was failure to recognize that the outer and
inner relations' pathkeys have different relationships to the mergeclause
list.  A join's mergeclause list is constructed by reference to the outer
pathkeys, so it will always be ordered the same as the outer pathkeys, but
this cannot be presumed true for the inner pathkeys.  If the inner sides of
the mergeclauses contain multiple references to the same EquivalenceClass
({t2.x} in the above example) then a simplistic rendering of the required
inner sort order is like "ORDER BY t2.x, t2.x", but the pathkey machinery
recognizes that the second sort column is redundant and throws it away.
The mergejoin planning code failed to account for that behavior properly.
One error was to try to generate cut-down versions of the mergeclause list
from cut-down versions of the inner pathkeys in the same way as the initial
construction of the mergeclause list from the outer pathkeys was done; this
could lead to choosing a mergeclause list that fails to match the outer
pathkeys.  The other problem was that the pathkey cross-checking code in
create_mergejoin_plan treated the inner and outer pathkey lists
identically, whereas actually the expectations for them must be different.
That led to false "pathkeys do not match" failures in some cases, and in
principle could have led to failure to detect bogus plans in other cases,
though there is no indication that such bogus plans could be generated.

Reported by Alexander Kuzmenkov, who also reviewed this patch.  This has
been broken for years (back to around 8.3 according to my testing), so
back-patch to all supported branches.

Discussion: https://postgr.es/m/5dad9160-4632-0e47-e120-8e2082000c01@postgrespro.ru
2018-02-23 13:47:33 -05:00
Tom Lane
2d12c55933 Repair pg_upgrade's failure to preserve relfrozenxid for matviews.
This oversight led to data corruption in matviews, manifesting as
"could not access status of transaction" before our most recent releases,
and "found xmin from before relfrozenxid" errors since then.

The proximate cause of the problem seems to have been confusion between
the task of preserving dropped-column status and the task of preserving
frozenxid status.  Those are required for distinct sets of relkinds,
and the reasoning was entirely undocumented in the source code.  In hopes
of forestalling future errors of the same kind, try to improve the
commentary in this area.

In passing, also improve the remarkably unhelpful comments around
pg_upgrade's set_frozenxids().  That's not actually buggy AFAICS,
but good luck figuring out what it does from the old comments.

Per report from Claudio Freire.  It appears that bug #14852 from Alexey
Ermakov is an earlier report of the same issue, and there may be other
cases that we failed to identify at the time.

Patch by me based on analysis by Andres Freund.  The bug dates back
to the introduction of matviews, so back-patch to all supported branches.

Discussion: https://postgr.es/m/CAGTBQpbrY9CdRGGhyBZ9yqY4jWaGC85rUF4X+R7d-aim=mBNsw@mail.gmail.com
Discussion: https://postgr.es/m/20171013115320.28049.86457@wrigleys.postgresql.org
2018-02-21 18:40:24 -05:00
Tom Lane
e11b6488e5 Fix misbehavior of CTE-used-in-a-subplan during EPQ rechecks.
An updating query that reads a CTE within an InitPlan or SubPlan could get
incorrect results if it updates rows that are concurrently being modified.
This is caused by CteScanNext supposing that nothing inside its recursive
ExecProcNode call could change which read pointer is selected in the CTE's
shared tuplestore.  While that's normally true because of scoping
considerations, it can break down if an EPQ plan tree gets built during the
call, because EvalPlanQualStart builds execution trees for all subplans
whether they're going to be used during the recheck or not.  And it seems
like a pretty shaky assumption anyway, so let's just reselect our own read
pointer here.

Per bug #14870 from Andrei Gorita.  This has been broken since CTEs were
implemented, so back-patch to all supported branches.

Discussion: https://postgr.es/m/20171024155358.1471.82377@wrigleys.postgresql.org
2018-02-19 16:00:18 -05:00
Tom Lane
bd87186378 Fix broken logic for reporting PL/Python function names in errcontext.
plpython_error_callback() reported the name of the function associated
with the topmost PL/Python execution context.  This was not merely
wrong if there were nested PL/Python contexts, but it risked a core
dump if the topmost one is an inline code block rather than a named
function.  That will have proname = NULL, and so we were passing a NULL
pointer to snprintf("%s").  It seems that none of the PL/Python-testing
machines in the buildfarm will dump core for that, but some platforms do,
as reported by Marina Polyakova.

Investigation finds that there actually is an existing regression test
that used to prove that the behavior was wrong, though apparently no one
had noticed that it was printing the wrong function name.  It stopped
showing the problem in 9.6 when we adjusted psql to not print CONTEXT
by default for NOTICE messages.  The problem is masked (if your platform
avoids the core dump) in error cases, because PL/Python will throw away
the originally generated error info in favor of a new traceback produced
at the outer level.

Repair by using ErrorContextCallback.arg to pass the correct context to
the error callback.  Add a regression test illustrating correct behavior.

Back-patch to all supported branches, since they're all broken this way.

Discussion: https://postgr.es/m/156b989dbc6fe7c4d3223cf51da61195@postgrespro.ru
2018-02-14 14:47:18 -05:00
Tom Lane
b7e1ca7d8e Stamp 9.4.16. 2018-02-05 16:07:03 -05:00
Peter Eisentraut
0a5dcba2ab Translation updates
Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: 1bde68354ce9389733ebee17e76817f3bda1edf1
2018-02-05 12:45:45 -05:00
Peter Eisentraut
9f050c0b41 Exclude common/int128.h from cpluspluscheck
It uses static assertions, which are not supported under C++ in this
branch.

This change only goes into the 9.4 branch, because 9.5 and beyond will
primarily use the USE_NATIVE_INT128 branch, so cpluspluscheck isn't
bothered.  In PG11 we will have C++ support for static assertions, so
the issue will go away altogether.
2018-01-30 19:21:19 -05:00
Peter Eisentraut
b37422de89 psql documentation fixes
Update the documentation for \pset to mention
columns|linestyle.

Author: Дилян Палаузов <dpa-postgres@aegee.org>
2018-01-29 14:18:11 -05:00
Tom Lane
06efc5cf53 Add stack-overflow guards in set-operation planning.
create_plan_recurse lacked any stack depth check.  This is not per
our normal coding rules, but I'd supposed it was safe because earlier
planner processing is more complex and presumably should eat more
stack.  But bug #15033 from Andrew Grossman shows this isn't true,
at least not for queries having the form of a many-thousand-way
INTERSECT stack.

Further testing showed that recurse_set_operations is also capable
of being crashed in this way, since it likewise will recurse to the
bottom of a parsetree before calling any support functions that
might themselves contain any stack checks.  However, its stack
consumption is only perhaps a third of create_plan_recurse's.

It's possible that this particular problem with create_plan_recurse can
only manifest in 9.6 and later, since before that we didn't build a Path
tree for set operations.  But having seen this example, I now have no
faith in the proposition that create_plan_recurse doesn't need a stack
check, so back-patch to all supported branches.

Discussion: https://postgr.es/m/20180127050845.28812.58244@wrigleys.postgresql.org
2018-01-28 13:39:07 -05:00
Tom Lane
fa86a32f9b Update time zone data files to tzdata release 2018c.
DST law changes in Brazil, Sao Tome and Principe.  Historical corrections
for Bolivia, Japan, and South Sudan.  The "US/Pacific-New" zone has been
removed (it was only a link to America/Los_Angeles anyway).
2018-01-27 16:42:55 -05:00
Tom Lane
54e1599c76 Teach reparameterize_path() to handle AppendPaths.
If we're inside a lateral subquery, there may be no unparameterized paths
for a particular child relation of an appendrel, in which case we *must*
be able to create similarly-parameterized paths for each other child
relation, else the planner will fail with "could not devise a query plan
for the given query".  This means that there are situations where we'd
better be able to reparameterize at least one path for each child.

This calls into question the assumption in reparameterize_path() that
it can just punt if it feels like it.  However, the only case that is
known broken right now is where the child is itself an appendrel so that
all its paths are AppendPaths.  (I think possibly I disregarded that in
the original coding on the theory that nested appendrels would get folded
together --- but that only happens *after* reparameterize_path(), so it's
not excused from handling a child AppendPath.)  Given that this code's been
like this since 9.3 when LATERAL was introduced, it seems likely we'd have
heard of other cases by now if there were a larger problem.

Per report from Elvis Pranskevichus.  Back-patch to 9.3.

Discussion: https://postgr.es/m/5981018.zdth1YWmNy@hammer.magicstack.net
2018-01-23 16:50:35 -05:00
Tom Lane
da83ca7d9d Make pg_dump's ACL, sec label, and comment entries reliably identifiable.
_tocEntryRequired() expects that it can identify ACL, SECURITY LABEL,
and COMMENT TOC entries that are for large objects by seeing whether
the tag for them starts with "LARGE OBJECT ".  While that works fine
for actual large objects, which are indeed tagged that way, it's
subject to false positives unless every such entry's tag starts with an
appropriate type ID.  And in fact it does not work for ACLs, because
up to now we customarily tagged those entries with just the bare name
of the object.  This means that an ACL for an object named
"LARGE OBJECT something" would be misclassified as data not schema,
with undesirable results in a schema-only or data-only dump ---
although pg_upgrade seems unaffected, due to the special case for
binary-upgrade mode further down in _tocEntryRequired().

We can fix this by changing all the dumpACL calls to use the label
strings already in use for comments and security labels, which do
follow the convention of starting with an object type indicator.

Well, mostly they follow it.  dumpDatabase() got it wrong, using
just the bare database name for those purposes, so that a database
named "LARGE OBJECT something" would similarly be subject to having
its comment or security label dropped or included when not wanted.
Bring that into line too.  (Note that up to now, database ACLs have
not been processed by pg_dump, so that this issue doesn't affect them.)

_tocEntryRequired() itself is not free of fault: it was overly liberal
about matching object tags to "LARGE OBJECT " in binary-upgrade mode.
This looks like it is probably harmless because there would be no data
component to strip anyway in that mode, but at best it's trouble
waiting to happen, so tighten that up too.

The possible misclassification of SECURITY LABEL entries for databases is
in principle a security problem, but the opportunities for actual exploits
seem too narrow to be interesting.  The other cases seem like just bugs,
since an object owner can change its ACL or comment for himself, he needn't
try to trick someone else into doing it by choosing a strange name.

This has been broken since per-large-object TOC entries were introduced
in 9.0, so back-patch to all supported branches.

Discussion: https://postgr.es/m/21714.1516553459@sss.pgh.pa.us
2018-01-22 12:06:19 -05:00
Alvaro Herrera
1284d18b5d Fix StoreCatalogInheritance1 to use 32bit inhseqno
For no apparent reason, this function was using a 16bit-wide inhseqno
value, rather than the correct 32 bit width which is what is stored in
the pg_inherits catalog.  This becomes evident if you try to create a
table with more than 65535 parents, because this error appears:

ERROR:  duplicate key value violates unique constraint «pg_inherits_relid_seqno_index»
DETAIL:  Key (inhrelid, inhseqno)=(329371, 0) already exists.

Needless to say, having so many parents is an uncommon situations, which
explains why this error has never been reported despite being having
been introduced with the Postgres95 1.01 sources in commit d31084e9d111:
https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/commands/creatinh.c;hb=d31084e9d111#l349

Backpatch all the way back.

David Rowley noticed this while reviewing a patch of mine.
Discussion: https://postgr.es/m/CAKJS1f8Dn7swSEhOWwzZzssW7747YB=2Hi+T7uGud40dur69-g@mail.gmail.com
2018-01-19 10:15:08 -03:00
Michael Meskes
2c1c4b060c Cope with indicator arrays that do not have the correct length.
Patch by: "Rader, David" <davidr@openscg.com>
2018-01-15 10:02:16 +01:00
Tom Lane
8b0e5e7e7c Avoid unnecessary failure in SELECT concurrent with ALTER NO INHERIT.
If a query against an inheritance tree runs concurrently with an ALTER
TABLE that's disinheriting one of the tree members, it's possible to get
a "could not find inherited attribute" error because after obtaining lock
on the removed member, make_inh_translation_list sees that its columns
have attinhcount=0 and decides they aren't the columns it's looking for.

An ideal fix, perhaps, would avoid including such a just-removed member
table in the query at all; but there seems no way to accomplish that
without adding expensive catalog rechecks or creating a likelihood of
deadlocks.  Instead, let's just drop the check on attinhcount.  In this
way, a query that's included a just-disinherited child will still
succeed, which is not a completely unreasonable behavior.

This problem has existed for a long time, so back-patch to all supported
branches.  Also add an isolation test verifying related behaviors.

Patch by me; the new isolation test is based on Kyotaro Horiguchi's work.

Discussion: https://postgr.es/m/20170626.174612.23936762.horiguchi.kyotaro@lab.ntt.co.jp
2018-01-12 15:46:38 -05:00
Alvaro Herrera
c618796404 Change some bogus PageGetLSN calls to BufferGetLSNAtomic
As src/backend/access/transam/README says, PageGetLSN may only be called
by processes holding either exclusive lock on buffer, or a shared lock
on buffer plus buffer header lock.  Therefore any place that only holds
a shared buffer lock must use BufferGetLSNAtomic instead of PageGetLSN,
which internally obtains buffer header lock prior to reading the LSN.

A few callsites failed to comply with this rule.  This was detected by
running all tests under a new (not committed) assertion that verifies
PageGetLSN locking contract.  All but one of the callsites that failed
the assertion are fixed by this patch.  Remaining callsites were
inspected manually and determined not to need any change.

The exception (unfixed callsite) is in TestForOldSnapshot, which only
has a Page argument, making it impossible to access the corresponding
Buffer from it.  Fixing that seems a much larger patch that will have to
be done separately; and that's just as well, since it was only
introduced in 9.6 and other bugs are much older.

Some of these bugs are ancient; backpatch all the way back to 9.3.

Authors: Jacob Champion, Asim Praveen, Ashwin Agrawal
Reviewed-by: Michaël Paquier
Discussion: https://postgr.es/m/CABAq_6GXgQDVu3u12mK9O5Xt5abBZWQ0V40LZCE+oUf95XyNFg@mail.gmail.com
2018-01-09 17:07:24 -03:00
Alvaro Herrera
f68c49f86a Fix failure to delete spill files of aborted transactions
Logical decoding's reorderbuffer.c may spill transaction files to disk
when transactions are large.  These are supposed to be removed when they
become "too old" by xid; but file removal requires the boundary LSNs of
the transaction to be known.  The final_lsn is only set when we see the
commit or abort record for the transaction, but nothing sets the value
for transactions that crash, so the removal code misbehaves -- in
assertion-enabled builds, it crashes by a failed assertion.

To fix, modify the final_lsn of transactions that don't have a value
set, to the LSN of the very latest change in the transaction.  This
causes the spilled files to be removed appropriately.

Author: Atsushi Torikoshi
Reviewed-by: Kyotaro HORIGUCHI, Craig Ringer, Masahiko Sawada
Discussion: https://postgr.es/m/54e4e488-186b-a056-6628-50628e4e4ebc@lab.ntt.co.jp
2018-01-05 12:17:10 -03:00
Andrew Dunstan
2d03daa7b8 Fix use of config-specific libraries for Windows OpenSSL
Commit 614350a3 allowed for an different builds of OpenSSL libraries on
Windows, but ignored the fact that the alternative builds don't have
config-specific libraries. This patch fixes the Solution file to ask for
the correct libraries.

per offline discussions with Leonardo Cecchi and Marco Nenciarini,

Backpatch to all live branches.
2018-01-03 15:34:02 -05:00
Alvaro Herrera
fe6bdc0a38 Make XactLockTableWait work for transactions that are not yet self-locked
XactLockTableWait assumed that its xid argument has already added itself
to the lock table.  That assumption led to another assumption that if
locking the xid has succeeded but the xid is reported as still in
progress, then the input xid must have been a subtransaction.

These assumptions hold true for the original uses of this code in
locking related to on-disk tuples, but they break down in logical
replication slot snapshot building -- in particular, when a standby
snapshot logged contains an xid that's already in ProcArray but not yet
in the lock table.  This leads to assertion failures that can be
reproduced all the way back to 9.4, when logical decoding was
introduced.

To fix, change SubTransGetParent to SubTransGetTopmostTransaction which
has a slightly different API: it returns the argument Xid if there is no
parent, and it goes all the way to the top instead of moving up the
levels one by one.  Also, to avoid busy-waiting, add a 1ms sleep to give
the other process time to register itself in the lock table.

For consistency, change ConditionalXactLockTableWait the same way.

Author: Petr Jelínek
Discussion: https://postgr.es/m/1B3E32D8-FCF4-40B4-AEF9-5C0E3AC57969@postgrespro.ru
Reported-by: Konstantin Knizhnik
Diagnosed-by: Stas Kelvich, Petr Jelínek
Reviewed-by: Andres Freund, Robert Haas
2018-01-03 14:38:39 -03:00
Alvaro Herrera
47a3a13178 Fix deadlock hazard in CREATE INDEX CONCURRENTLY
Multiple sessions doing CREATE INDEX CONCURRENTLY simultaneously are
supposed to be able to work in parallel, as evidenced by fixes in commit
c3d09b3bd2 specifically to support this case.  In reality, one of the
sessions would be aborted by a misterious "deadlock detected" error.

Jeff Janes diagnosed that this is because of leftover snapshots used for
system catalog scans -- this was broken by 8aa3e47510 keeping track of
(registering) the catalog snapshot.  To fix the deadlocks, it's enough
to de-register that snapshot prior to waiting.

Backpatch to 9.4, which introduced MVCC catalog scans.

Include an isolationtester spec that 8 out of 10 times reproduces the
deadlock with the unpatched code for me (Álvaro).

Author: Jeff Janes
Diagnosed-by: Jeff Janes
Reported-by: Jeremy Finzel
Discussion: https://postgr.es/m/CAMa1XUhHjCv8Qkx0WOr1Mpm_R4qxN26EibwCrj0Oor2YBUFUTg%40mail.gmail.com
2018-01-02 19:16:16 -03:00
Tom Lane
e1b8e0e4a6 Disallow UNION/INTERSECT/EXCEPT over no columns.
Since 9.4, we've allowed the syntax "select union select" and variants
of that.  However, the planner wasn't expecting a no-column set operation
and ended up treating the set operation as if it were UNION ALL.

Pre-v10, there seem to be some executor issues that would need to be
fixed to support such cases, and it doesn't really seem worth expending
much effort on.  Just disallow it, instead.

Per report from Victor Yegorov.

Discussion: https://postgr.es/m/CAGnEbojGJrRSOgJwNGM7JSJZpVAf8xXcVPbVrGdhbVEHZ-BUMw@mail.gmail.com
2017-12-22 12:08:41 -05:00
Andres Freund
ed8e1aff6a Perform a lot more sanity checks when freezing tuples.
The previous commit has shown that the sanity checks around freezing
aren't strong enough. Strengthening them seems especially important
because the existance of the bug has caused corruption that we don't
want to make even worse during future vacuum cycles.

The errors are emitted with ereport rather than elog, despite being
"should never happen" messages, so a proper error code is emitted. To
avoid superflous translations, mark messages as internal.

Author: Andres Freund and Alvaro Herrera
Reviewed-By: Alvaro Herrera, Michael Paquier
Discussion: https://postgr.es/m/20171102112019.33wb7g5wp4zpjelu@alap3.anarazel.de
Backpatch: 9.3-
2017-12-14 18:20:48 -08:00
Andres Freund
4eff5a8c9f Fix pruning of locked and updated tuples.
Previously it was possible that a tuple was not pruned during vacuum,
even though its update xmax (i.e. the updating xid in a multixact with
both key share lockers and an updater) was below the cutoff horizon.

As the freezing code assumed, rightly so, that that's not supposed to
happen, xmax would be preserved (as a member of a new multixact or
xmax directly). That causes two problems: For one the tuple is below
the xmin horizon, which can cause problems if the clog is truncated or
once there's an xid wraparound. The bigger problem is that that will
break HOT chains, which in turn can lead two to breakages: First,
failing index lookups, which in turn can e.g lead to constraints being
violated. Second, future hot prunes / vacuums can end up making
invisible tuples visible again. There's other harmful scenarios.

Fix the problem by recognizing that tuples can be DEAD instead of
RECENTLY_DEAD, even if the multixactid has alive members, if the
update_xid is below the xmin horizon. That's safe because newer
versions of the tuple will contain the locking xids.

A followup commit will harden the code somewhat against future similar
bugs and already corrupted data.

Author: Andres Freund, with changes by Alvaro Herrera
Reported-By: Daniel Wood
Analyzed-By: Andres Freund, Alvaro Herrera, Robert Haas, Peter
   Geoghegan, Daniel Wood, Yi Wen Wong, Michael Paquier
Reviewed-By: Alvaro Herrera, Robert Haas, Michael Paquier
Discussion:
    https://postgr.es/m/E5711E62-8FDF-4DCA-A888-C200BF6B5742@amazon.com
    https://postgr.es/m/20171102112019.33wb7g5wp4zpjelu@alap3.anarazel.de
Backpatch: 9.3-
2017-12-14 18:20:48 -08:00
Andrew Dunstan
f5c7e0cddf Fix walsender timeouts when decoding a large transaction
The logical slots have a fast code path for sending data so as not to
impose too high a per message overhead. The fast path skips checks for
interrupts and timeouts. However, the existing coding failed to consider
the fact that a transaction with a large number of changes may take a
very long time to be processed and sent to the client. This causes the
walsender to ignore interrupts for potentially a long time and more
importantly it will result in the walsender being killed due to
timeout at the end of such a transaction.

This commit changes the fast path to also check for interrupts and only
allows calling the fast path when the last keepalive check happened less
than half the walsender timeout ago. Otherwise the slower code path will
be taken.

Backpatched to 9.4

Petr Jelinek, reviewed by  Kyotaro HORIGUCHI, Yura Sokolov,  Craig
Ringer and Robert Haas.

Discussion: https://postgr.es/m/e082a56a-fd95-a250-3bae-0fff93832510@2ndquadrant.com
2017-12-14 11:32:25 -05:00
Tom Lane
239b01e313 Fix corner-case coredump in _SPI_error_callback().
I noticed that _SPI_execute_plan initially sets spierrcontext.arg = NULL,
and only fills it in some time later.  If an error were to happen in
between, _SPI_error_callback would try to dereference the null pointer.
This is unlikely --- there's not much between those points except
push-snapshot calls --- but it's clearly not impossible.  Tweak the
callback to do nothing if the pointer isn't set yet.

It's been like this for awhile, so back-patch to all supported branches.
2017-12-11 16:33:49 -05:00
Noah Misch
d78c3ca0ea MSVC 2012+: Permit linking to 32-bit, MinGW-built libraries.
Notably, this permits linking to the 32-bit Perl binaries advertised on
perl.org, namely Strawberry Perl and ActivePerl.  This has a side effect
of permitting linking to binaries built with obsolete MSVC versions.

By default, MSVC 2012 and later require a "safe exception handler table"
in each binary.  MinGW-built, 32-bit DLLs lack the relevant exception
handler metadata, so linking to them failed with error LNK2026.  Restore
the semantics of MSVC 2010, which omits the table from a given binary if
some linker input lacks metadata.  This has no effect on 64-bit builds
or on MSVC 2010 and earlier.  Back-patch to 9.3 (all supported
versions).

Reported by Victor Wagner.

Discussion: https://postgr.es/m/20160326154321.7754ab8f@wagner.wagner.home
2017-12-09 00:58:59 -08:00
Noah Misch
aed8d41af6 MSVC: Test whether 32-bit Perl needs -D_USE_32BIT_TIME_T.
Commits 5a5c2feca3 and
b5178c5d08 introduced support for modern
MSVC-built, 32-bit Perl, but they broke use of MinGW-built, 32-bit Perl
distributions like Strawberry Perl and modern ActivePerl.  Perl has no
robust means to report whether it expects a -D_USE_32BIT_TIME_T ABI, so
test this.  Back-patch to 9.3 (all supported versions).

The chief alternative was a heuristic of adding -D_USE_32BIT_TIME_T when
$Config{gccversion} is nonempty.  That banks on every gcc-built Perl
using the same ABI.  gcc could change its default ABI the way MSVC once
did, and one could build Perl with gcc and the non-default ABI.

The GNU make build system could benefit from a similar test, without
which it does not support MSVC-built Perl.  For now, just add a comment.
Most users taking the special step of building Perl with MSVC probably
build PostgreSQL with MSVC.

Discussion: https://postgr.es/m/20171130041441.GA3161526@rfd.leadboat.com
2017-12-08 18:13:49 -08:00
Peter Eisentraut
8b33b5b9df Fix mistake in comment
Reported-by: Masahiko Sawada <sawada.mshk@gmail.com>
2017-12-08 11:17:46 -05:00
Robert Haas
facd94e72f Report failure to start a background worker.
When a worker is flagged as BGW_NEVER_RESTART and we fail to start it,
or if it is not marked BGW_NEVER_RESTART but is terminated before
startup succeeds, what BgwHandleStatus should be reported?  The
previous code really hadn't considered this possibility (as indicated
by the comments which ignore it completely) and would typically return
BGWH_NOT_YET_STARTED, but that's not a good answer, because then
there's no way for code using GetBackgroundWorkerPid() to tell the
difference between a worker that has not started but will start
later and a worker that has not started and will never be started.
So, when this case happens, return BGWH_STOPPED instead.  Update the
comments to reflect this.

The preceding fix by itself is insufficient to fix the problem,
because the old code also didn't send a notification to the process
identified in bgw_notify_pid when startup failed.  That might've
been technically correct under the theory that the status of the
worker was BGWH_NOT_YET_STARTED, because the status would indeed not
change when the worker failed to start, but now that we're more
usefully reporting BGWH_STOPPED, a notification is needed.

Without these fixes, code which starts background workers and then
uses the recommended APIs to wait for those background workers to
start would hang indefinitely if the postmaster failed to fork a
worker.

Amit Kapila and Robert Haas

Discussion: http://postgr.es/m/CAA4eK1KDfKkvrjxsKJi3WPyceVi3dH1VCkbTJji2fuwKuB=3uw@mail.gmail.com
2017-12-06 09:08:30 -05:00
Robert Haas
f4bb60ed69 Mark assorted variables PGDLLIMPORT.
This makes life easier for extension authors who wish to support
Windows.

Brian Cloutier, slightly amended by me.

Discussion: http://postgr.es/m/CAJCy68fscdNhmzFPS4kyO00CADkvXvEa-28H-OtENk-pa2OTWw@mail.gmail.com
2017-12-05 09:35:15 -05:00
Tom Lane
225501cf75 Clean up assorted messiness around AllocateDir() usage.
This patch fixes a couple of low-probability bugs that could lead to
reporting an irrelevant errno value (and hence possibly a wrong SQLSTATE)
concerning directory-open or file-open failures.  It also fixes places
where we took shortcuts in reporting such errors, either by using elog
instead of ereport or by using ereport but forgetting to specify an
errcode.  And it eliminates a lot of just plain redundant error-handling
code.

In service of all this, export fd.c's formerly-static function
ReadDirExtended, so that external callers can make use of the coding
pattern

	dir = AllocateDir(path);
	while ((de = ReadDirExtended(dir, path, LOG)) != NULL)

if they'd like to treat directory-open failures as mere LOG conditions
rather than errors.  Also fix FreeDir to be a no-op if we reach it
with dir == NULL, as such a coding pattern would cause.

Then, remove code at many call sites that was throwing an error or log
message for AllocateDir failure, as ReadDir or ReadDirExtended can handle
that job just fine.  Aside from being a net code savings, this gets rid of
a lot of not-quite-up-to-snuff reports, as mentioned above.  (In some
places these changes result in replacing a custom error message such as
"could not open tablespace directory" with more generic wording "could not
open directory", but it was agreed that the custom wording buys little as
long as we report the directory name.)  In some other call sites where we
can't just remove code, change the error reports to be fully
project-style-compliant.

Also reorder code in restoreTwoPhaseData that was acquiring a lock
between AllocateDir and ReadDir; in the unlikely but surely not
impossible case that LWLockAcquire changes errno, AllocateDir failures
would be misreported.  There is no great value in opening the directory
before acquiring TwoPhaseStateLock, so just do it in the other order.

Also fix CheckXLogRemoved to guarantee that it preserves errno,
as quite a number of call sites are implicitly assuming.  (Again,
it's unlikely but I think not impossible that errno could change
during a SpinLockAcquire.  If so, this function was broken for its
own purposes as well as breaking callers.)

And change a few places that were using not-per-project-style messages,
such as "could not read directory" when "could not open directory" is
more correct.

Back-patch the exporting of ReadDirExtended, in case we have occasion
to back-patch some fix that makes use of it; it's not needed right now
but surely making it global is pretty harmless.  Also back-patch the
restoreTwoPhaseData and CheckXLogRemoved fixes.  The rest of this is
essentially cosmetic and need not get back-patched.

Michael Paquier, with a bit of additional work by me

Discussion: https://postgr.es/m/CAB7nPqRpOCxjiirHmebEFhXVTK7V5Jvw4bz82p7Oimtsm3TyZA@mail.gmail.com
2017-12-04 17:02:52 -05:00
Noah Misch
e73981cdc0 Fix non-GNU makefiles for AIX make.
Invoking the Makefile without an explicit target was building every
possible target instead of just the "all" target.  Back-patch to 9.3
(all supported versions).
2017-11-30 00:57:32 -08:00
Magnus Hagander
65f1623336 Fix typo in comment
Andreas Karlsson
2017-11-27 09:30:03 +01:00
Joe Conway
d8d9c97cd1 Make has_sequence_privilege support WITH GRANT OPTION
The various has_*_privilege() functions all support an optional
WITH GRANT OPTION added to the supported privilege types to test
whether the privilege is held with grant option. That is, all except
has_sequence_privilege() variations. Fix that.

Back-patch to all supported branches.

Discussion: https://postgr.es/m/005147f6-8280-42e9-5a03-dd2c1e4397ef@joeconway.com
2017-11-26 09:50:42 -08:00
Tom Lane
1601a9413f Update MSVC build process for new timezone data.
Missed this dependency in commits 7cce222c9 et al.
2017-11-25 18:15:23 -05:00
Tom Lane
10aa064c95 Replace raw timezone source data with IANA's new compact format.
Traditionally IANA has distributed their timezone data in pure source
form, replete with extensive historical comments.  As of release 2017c,
they've added a compact single-file format that omits comments and
abbreviates command keywords.  This form is way shorter than the pure
source, even before considering its allegedly better compressibility.
Hence, let's distribute the data in that form rather than pure source.

I'm pushing this now, rather than at the next timezone database update,
so that it's easy to confirm that this data file produces compiled zic
output that's identical to what we were getting before.

Discussion: https://postgr.es/m/1915.1511210334@sss.pgh.pa.us
2017-11-25 15:30:44 -05:00
Tom Lane
2e105cf6db Repair failure with SubPlans in multi-row VALUES lists.
When nodeValuesscan.c was written, it was impossible to have a SubPlan in
VALUES --- any sub-SELECT there would have to be uncorrelated and thereby
would produce an InitPlan instead.  We therefore took a shortcut in the
logic that throws away a ValuesScan's per-row expression evaluation data
structures.  This was broken by the introduction of LATERAL however; a
sub-SELECT containing a lateral reference produces a correlated SubPlan.

The cleanest fix for this would be to give up the optimization of
discarding the expression eval state.  But that still seems pretty
unappetizing for long VALUES lists.  It seems to work to just prevent
the subexpressions from hooking into the ValuesScan node's subPlan
list, so let's do that and see how well it works.  (If this breaks,
due to additional connections between the subexpressions and the outer
query structures, we might consider compromises like throwing away data
only for VALUES rows not containing SubPlans.)

Per bug #14924 from Christian Duta.  Back-patch to 9.3 where LATERAL
was introduced.

Discussion: https://postgr.es/m/20171124120836.1463.5310@wrigleys.postgresql.org
2017-11-25 14:15:48 -05:00
Noah Misch
558f620792 Support linking with MinGW-built Perl.
This is necessary for ActivePerl 5.18 onwards and for Strawberry Perl.
It is not sufficient for 32-bit builds with newer Visual Studio; these
fail with error LINK2026.  Back-patch to 9.3 (all supported versions).

Reported by Victor Wagner.

Discussion: https://postgr.es/m/20160326154321.7754ab8f@wagner.wagner.home
2017-11-23 20:29:48 -08:00
Robert Haas
294136d422 Provide for forward compatibility with future minor protocol versions.
Previously, any attempt to request a 3.x protocol version other than
3.0 would lead to a hard connection failure, which made the minor
protocol version really no different from the major protocol version
and precluded gentle protocol version breaks.  Instead, when the
client requests a 3.x protocol version where x is greater than 0, send
the new NegotiateProtocolVersion message to convey that we support
only 3.0.  This makes it possible to introduce new minor protocol
versions without requiring a connection retry when the server is
older.

In addition, if the startup packet includes name/value pairs where
the name starts with "_pq_.", assume that those are protocol options,
not GUCs.  Include those we don't support (i.e. all of them, at
present) in the NegotiateProtocolVersion message so that the client
knows they were not understood.  This makes it possible for the
client to request previously-unsupported features without bumping
the protocol version at all; the client can tell from the server's
response whether the option was understood.

It will take some time before servers that support these new
facilities become common in the wild; to speed things up and make
things easier for a future 3.1 protocol version, back-patch to all
supported releases.

Robert Haas and Badrul Chowdhury

Discussion: http://postgr.es/m/BN6PR21MB0772FFA0CBD298B76017744CD1730@BN6PR21MB0772.namprd21.prod.outlook.com
Discussion: http://postgr.es/m/30788.1498672033@sss.pgh.pa.us
2017-11-21 14:38:29 -05:00
Tom Lane
13f2bdb639 Use out-of-line M68K spinlock code for OpenBSD as well as NetBSD.
David Carlier (from a patch being carried by OpenBSD packagers)

Discussion: https://postgr.es/m/CA+XhMqzwFSGVU7MEnfhCecc8YdP98tigXzzpd0AAdwaGwaVXEA@mail.gmail.com
2017-11-20 18:05:03 -05:00
Tom Lane
8bd8b4b77c Add support for Motorola 88K to s_lock.h.
Apparently there are still people out there who care about this old
architecture.  They probably care about dusty versions of Postgres
too, so back-patch to all supported branches.

David Carlier (from a patch being carried by OpenBSD packagers)

Discussion: https://postgr.es/m/CA+XhMqzwFSGVU7MEnfhCecc8YdP98tigXzzpd0AAdwaGwaVXEA@mail.gmail.com
2017-11-20 17:57:46 -05:00
Noah Misch
ab8eae0bb5 MSVC: Rebuild spiexceptions.h when out of date.
Also, add a warning to catch future instances of naming a nonexistent
file as a prerequisite.  Back-patch to 9.3 (all supported versions).
2017-11-12 18:44:38 -08:00
Noah Misch
e17b38db66 Install Windows crash dump handler before all else.
Apart from calling write_stderr() on failure, the handler depends on no
PostgreSQL facilities.  We have experienced crashes before reaching the
former call site.  Given such an early crash, this change cannot hurt
and may produce a helpful dump.  Absent an early crash, this change has
no effect.  Back-patch to 9.3 (all supported versions).

Takayuki Tsunakawa

Discussion: https://postgr.es/m/0A3221C70F24FB45833433255569204D1F80CD13@G01JPEXMBYT05
2017-11-12 14:31:04 -08:00
Noah Misch
19cf9e96ae Don't call pgwin32_message_to_UTF16() without CurrentMemoryContext.
PostgreSQL running as a Windows service crashed upon calling
write_stderr() before MemoryContextInit().  This fix completes work
started in 5735efee15.  Messages this
early contain only ASCII bytes; if we removed the CurrentMemoryContext
requirement, the ensuing conversions would have no effect.  Back-patch
to 9.3 (all supported versions).

Takayuki Tsunakawa, reviewed by Michael Paquier.

Discussion: https://postgr.es/m/0A3221C70F24FB45833433255569204D1F80CC73@G01JPEXMBYT05
2017-11-12 13:03:29 -08:00
Noah Misch
ae5489e147 Add post-2010 ecpg tests to checktcp.
This suite had been a proper superset of the regular ecpg test suite,
but the three newest tests didn't reach it.  To make this less likely to
recur, delete the extra schedule file and pass the TCP-specific test on
the command line.  Back-patch to 9.3 (all supported versions).
2017-11-11 14:41:50 -08:00
Noah Misch
65fd34f7cc Make connect/test1 independent of localhost IPv6.
Since commit 868898739a, it has assumed
"localhost" resolves to both ::1 and 127.0.0.1.  We gain nothing from
that assumption, and it does not hold in a default installation of Red
Hat Enterprise Linux 5.  Back-patch to 9.3 (all supported versions).
2017-11-11 14:33:41 -08:00
Noah Misch
dfabce8827 Fix connect/test1 expected output.
The test runs only as part of "checktcp".  This is a back-patch to 9.5
and 9.4 of part of commit 868898739a.
Oversight in commit 61bee9f756.
2017-11-11 14:22:51 -08:00