postgres

mirror of https://github.com/zebrajr/postgres.git synced 2025-12-07 12:20:31 +01:00

Author	SHA1	Message	Date
Tom Lane	d6b37cdb6e	Don't believe MinMaxExpr is leakproof without checking. MinMaxExpr invokes the btree comparison function for its input datatype, so it's only leakproof if that function is. Many such functions are indeed leakproof, but others are not, and we should not just assume that they are. Hence, adjust contain_leaked_vars to verify the leakproofness of the referenced function explicitly. I didn't add a regression test because it would need to depend on some particular comparison function being leaky, and that's a moving target, per discussion. This has been wrong all along, so back-patch to supported branches. Discussion: https://postgr.es/m/31042.1546194242@sss.pgh.pa.us	2019-01-02 16:33:48 -05:00
Noah Misch	309d16f073	pg_regress: Promptly detect failed postmaster startup. Detect it the way pg_ctl's wait_for_postmaster() does. When pg_regress spawned a postmaster that failed startup, we were detecting that only with "pg_regress: postmaster did not respond within 60 seconds". Back-patch to 9.4 (all supported versions). Reviewed by Tom Lane. Discussion: https://postgr.es/m/20181231172922.GA199150@gust.leadboat.com	2018-12-31 13:51:30 -08:00
Alvaro Herrera	2602838fa3	Have DISCARD ALL/TEMP remove leftover temp tables Previously, it would only remove temp tables created in the same session; but if the session uses the BackendId of a previously crashed backend that left temp tables around, those would not get removed. Since autovacuum would not drop them either (because it sees that the BackendId is in use by the current session) these can cause annoying xid-wraparound warnings. Apply to branches 9.4 to 10. This is not a problem since version 11, because commit `943576bddc` added state tracking that makes autovacuum realize that those temp tables are not ours, so it removes them. This is useful to handle in DISCARD, because even though it does not handle all situations, it does handle the common one where a connection pooler keeps the same session open for an indefinitely long time. Discussion: https://postgr.es/m/20181226190834.wsk2wzott5yzrjiq@alvherre.pgsql Reviewed-by: Takayuki Tsunakawa, Michaël Paquier	2018-12-27 16:17:40 -03:00
Alvaro Herrera	5199abaca3	Make autovacuum more selective about temp tables to keep. When temp tables are in danger of XID wraparound, autovacuum drops them; however, it preserves those that are owned by a working session. This is desirable, except when the session is connected to a different database (because the temp tables cannot be from that session), so make it keep the temp tables only if the backend is in the same database as the temp tables. This is not bulletproof: it fails to detect temp tables left by a session whose backend ID is reused in the same database but the new session does not use temp tables. Commit `943576bddc` fixes that case too, for branches 11 and up (which is why we don't apply this fix to those branches), but back-patching that one is not universally agreed on. Discussion: https://postgr.es/m/20181214162843.37g6h3txto43akrb@alvherre.pgsql Reviewed-by: Takayuki Tsunakawa, Michaël Paquier	2018-12-27 16:01:36 -03:00
Michael Paquier	1d70076710	Ignore inherited temp relations from other sessions when truncating. Inheritance trees can include temporary tables if the parent is permanent, which makes possible the presence of multiple temporary children from different sessions. Trying to issue a TRUNCATE on the parent in this scenario causes a failure, so similarly to any other queries just ignore such cases, which makes TRUNCATE work transparently. This makes truncation behave similarly to any other DML query working on the parent table with queries which need to be issued on children. A set of isolation tests is added to cover basic cases. Reported-by: Zhou Digoal Author: Amit Langote, Michael Paquier Discussion: https://postgr.es/m/15565-ce67a48d0244436a@postgresql.org Backpatch-through: 9.4	2018-12-27 10:17:42 +09:00
Tom Lane	4f7ab73106	Fix portability failure introduced in commits `d2b0b60e7` et al. I made a frontend fprintf() format use %m, forgetting that that's only safe in HEAD not the back branches; prior to `96bf88d52` and `d6c55de1f`, it would work on glibc platforms but not elsewhere. Revert to using %s ... strerror(errno) as the code did before. We could have left HEAD as-is, but for code consistency across branches, I chose to apply this patch there too. Per Coverity and a few buildfarm members.	2018-12-26 15:30:40 -05:00
Peter Eisentraut	b7b0314b80	Fix ancient compiler warnings and typos in !HAVE_SYMLINK code. This has never been correct since this code was introduced.	2018-12-22 07:27:21 +01:00
Tom Lane	ef673a32d3	Fix ancient thinko in mergejoin cost estimation. "rescanratio" was computed as 1 + rescanned-tuples / total-inner-tuples, which is sensible if it's to be multiplied by total-inner-tuples or a cost value corresponding to scanning all the inner tuples. But in reality it was (mostly) multiplied by inner_rows or a related cost, numbers that take into account the possibility of stopping short of scanning the whole inner relation thanks to a limited key range in the outer relation. This'd still make sense if we could expect that stopping short would result in a proportional decrease in the number of tuples that have to be rescanned. It does not, however. The argument that establishes the validity of our estimate for that number is independent of whether we scan all of the inner relation or stop short, and experimentation also shows that stopping short doesn't reduce the number of rescanned tuples. So the correct calculation is 1 + rescanned-tuples / inner_rows, and we should be sure to multiply that by inner_rows or a corresponding cost value. Most of the time this doesn't make much difference, but if we have both a high rescan rate (due to lots of duplicate values) and an outer key range much smaller than the inner key range, then the error can be significant, leading to a large underestimate of the cost associated with rescanning. Per report from Vijaykumar Jain. This thinko appears to go all the way back to the introduction of the rescan estimation logic in commit `70fba7043`, so back-patch to all supported branches. Discussion: https://postgr.es/m/CAE7uO5hMb_TZYJcZmLAgO6iD68AkEK6qCe7i=vZUkCpoKns+EQ@mail.gmail.com	2018-12-18 11:19:39 -05:00
Michael Paquier	696c68c2be	Fix use-after-free bug when renaming constraints. This is an oversight from recent commit `b13fd344`. While at it, tweak the previous test with a better name for the renamed primary key. Detected by buildfarm member prion, which forces relation cache release with -DRELCACHE_FORCE_RELEASE. Back-patch down to 9.4, like the previous commit.	2018-12-17 12:44:09 +09:00
Michael Paquier	d5d86e2cd6	Make constraint rename issue relcache invalidation on target relation When a constraint gets renamed, it may have associated with it a target relation (for example domain constraints don't have one). Not invalidating the target relation cache when issuing the renaming can result in issues with subsequent commands that refer to the old constraint name using the relation cache, causing various failures. One pattern spotted was using CREATE TABLE LIKE after a constraint renaming. Reported-by: Stuart <sfbarbee@gmail.com> Author: Amit Langote Reviewed-by: Michael Paquier Discussion: https://postgr.es/m/2047094.V130LYfLq4@station53.ousa.org	2018-12-17 10:37:24 +09:00
Alexander Korotkov	bf0e5a73be	Fix wrong backpatching of ginRedoDeletePage() deadlock fix. Commit `19cf52e6cc` changed the lock order in ginRedoDeletePage(), but did so incorrectly due to an oversight during backpatching. This commit fixes that. Reported-by: Bruce Momjian Discussion: https://postgr.es/m/20181213153232.GA10664%40momjian.us	2018-12-13 22:32:05 +03:00
Alexander Korotkov	1cf175c74f	Prevent GIN deleted pages from being reclaimed too early. When GIN vacuum deletes a posting tree page, it assumes that no concurrent searchers can access it, thanks to ginStepRight() locking two pages at once. However, since 9.4 searches can skip parts of posting trees, descending from the root. That creates a risk that a page is deleted and reclaimed before a concurrent search can access it. This commit removes that risk by waiting for every transaction that might still reference the page to finish. For binary compatibility we can't change GinPageOpaqueData to store the corresponding transaction ID. Instead we reuse the page header's pd_prune_xid field, which is unused in index pages. Discussion: https://postgr.es/m/31a702a.14dd.166c1366ac1.Coremail.chjischj%40163.com Author: Andrey Borodin, Alexander Korotkov Reviewed-by: Alexander Korotkov Backpatch-through: 9.4	2018-12-13 06:52:26 +03:00
Alexander Korotkov	19cf52e6cc	Prevent deadlock in ginRedoDeletePage(). On a standby, ginRedoDeletePage() can work concurrently with read-only queries. Those queries can traverse the posting tree in two ways. 1) Using rightlinks by ginStepRight(), which locks the next page before unlocking its left sibling. 2) Using downlinks by ginFindLeafPage(), which locks at most one page at a time. The original lock order was: page, parent, left sibling. That lock order can deadlock with ginStepRight(). In order to prevent deadlock this commit changes the lock order to: left sibling, page, parent. Note that the position of the parent in the locking order seems insignificant, because we only lock one page at a time while traversing downlinks. Reported-by: Chen Huajun Diagnosed-by: Chen Huajun, Peter Geoghegan, Andrey Borodin Discussion: https://postgr.es/m/31a702a.14dd.166c1366ac1.Coremail.chjischj%40163.com Author: Alexander Korotkov Backpatch-through: 9.4	2018-12-13 06:36:54 +03:00
Tom Lane	3af726a1f3	Add stack depth checks to key recursive functions in backend/nodes/*.c. Although copyfuncs.c has a check_stack_depth call in its recursion, equalfuncs.c, outfuncs.c, and readfuncs.c lacked one. This seems unwise. Likewise fix planstate_tree_walker(), in branches where that exists. Discussion: https://postgr.es/m/30253.1544286631@sss.pgh.pa.us	2018-12-10 11:12:43 -05:00
Tom Lane	3a691f8a25	Improve our response to invalid format strings, and detect more cases. Places that are testing for *printf failure ought to include the format string in their error reports, since bad-format-string is one of the more likely causes of such failure. This both makes it easier to find and repair the mistake, and provides at least some useful info to the user who stumbles across such a problem. Also, tighten snprintf.c to report EINVAL for an invalid flag or final character in a format %-spec (including the case where the %-spec is missing a final character altogether). This seems like better project policy, and it also allows removing an instruction or two from the hot code path. Back-patch the error reporting change in pvsnprintf, since it should be harmless and may be helpful; but not the snprintf.c change. Per discussion of bug #15511 from Ertuğrul Kahveci, which reported an invalid translated format string. These changes don't fix that error, but they should improve matters next time we make such a mistake. Discussion: https://postgr.es/m/15511-1d8b6a0bc874112f@postgresql.org	2018-12-06 15:08:44 -05:00
Tom Lane	02d3104713	Ensure static libraries have correct mod time even if ranlib messes it up. In at least Apple's version of ranlib, the output file is updated to have a mod time equal to the max of the timestamps of its components, and that data only has seconds precision. On a filesystem with sub-second file timestamp precision --- say, APFS --- this can result in the finished static library appearing older than its input files, which causes useless rebuilds and possible outright failures in parallel makes. We've only seen this reported in the field from people using Apple's ranlib with a non-Apple make, because Apple's make doesn't know about sub-second timestamps either so it doesn't decide rebuilds are needed. But Apple's ranlib presumably shares code with at least some BSDen, so it's not that unlikely that the same problem could arise elsewhere. To fix, just "touch" the output file after ranlib finishes. We seem to need this in only one place. There are other calls of ranlib in our makefiles, but they are working on intermediate files whose timestamps are not actually important, or else on an installed static library for which sub-second timestamp precision is unlikely to matter either. (Also, so far as I can tell, Apple's ranlib doesn't mess up the file timestamp in the latter usage anyhow.) In passing, change "ranlib" to "$(RANLIB)" in one place that was bypassing the make macro for no good reason. Per bug #15525 from Jack Kelly (via Alyssa Ross). Back-patch to all supported branches. Discussion: https://postgr.es/m/15525-a30da084f17a1faa@postgresql.org	2018-11-29 15:53:44 -05:00
Michael Paquier	b81d08d600	Fix handling of synchronous replication for stopping WAL senders This fixes an oversight from `c6c3334` which has introduced a more strict ordering in the way WAL senders are stopped to prevent current WAL activity when a shutdown checkpoint is created. After all backends are stopped, all WAL senders are requested to stop which makes them stop any activity, and switching their state as stopping. Once the checkpointer knows that all WAL senders are in a stopping state, the shutdown checkpoint can begin, with all WAL senders activated, waiting for their clients to flush the shutdown checkpoint record. If a subset of WAL senders are stopping and in a sync state, other WAL senders could still be waiting for a WAL position to be synced while committing a transaction, however the subset of stopping senders would not release waiters, potentially breaking synchronous replication guarantees. This commit makes sure that even WAL senders stopping are able to release waiters properly. On 9.4, this can also trigger an assertion failure when setting for example max_wal_senders to 1 where a WAL sender is not able to find itself as in synchronous state when the instance stops. Reported-by: Paul Guo Author: Paul Guo, Michael Paquier Discussion: https://postgr.es/m/CAEET0ZEv8VFqT3C-cQm6byOB4r4VYWcef1J21dOX-gcVhCSpmA@mail.gmail.com Backpatch-through: 9.4	2018-11-29 09:13:04 +09:00
Tomas Vondra	c1a5caea82	Do not decode TOAST data for table rewrites. During table rewrites (VACUUM FULL and CLUSTER), the main heap is logged using XLOG / FPI records, and thus (correctly) ignored in decoding. But the associated TOAST table is WAL-logged as plain INSERT records, and so was logically decoded and passed to reorder buffer. That has severe consequences with TOAST tables of non-trivial size. Firstly, reorder buffer has to keep all those changes, possibly spilling them to a file, incurring I/O costs and disk space. Secondly, ReorderBufferCommit() was stashing all those TOAST chunks into a hash table, which got discarded only after processing the row from the main heap. But as the main heap is not decoded for rewrites, this never happened, so all the TOAST data accumulated in memory, resulting either in excessive memory consumption or OOM. The fix is simple, as commit `e9edc1ba` already introduced infrastructure (namely HEAP_INSERT_NO_LOGICAL flag) to skip logical decoding of TOAST tables, but it only applied it to system tables. So simply use it for all TOAST data in raw_heap_insert(). That would however solve only the memory consumption issue - the TOAST changes would still be decoded and added to the reorder buffer, and spilled to disk (although without TOAST tuple data, so much smaller). But we can solve that by tweaking DecodeInsert() to just ignore such INSERT records altogether, using XLH_INSERT_CONTAINS_NEW_TUPLE flag, instead of skipping them later in ReorderBufferCommit(). Review: Masahiko Sawada Discussion: https://www.postgresql.org/message-id/flat/1a17c643-e9af-3dba-486b-fbe31bc1823a%402ndquadrant.com Backpatch: 9.4-, where logical decoding was introduced	2018-11-28 01:53:29 +01:00
Tom Lane	74bfb5388d	Fix translation of special characters in psql's LaTeX output modes. latex_escaped_print() mistranslated \ and failed to provide any translation for # ^ and ~, all of which would typically lead to LaTeX document syntax errors. In addition it didn't translate < > and \|, which would typically render as unexpected characters. To some extent this represents shortcomings in ancient versions of LaTeX, which if memory serves had no easy way to render these control characters as ASCII text. But that's been fixed for, um, decades. In any case there is no value in emitting guaranteed-to-fail output for these characters. Noted while fooling with test cases added by commit `9a98984f4`. Back-patch the code change to all supported versions.	2018-11-26 17:32:51 -05:00
Tom Lane	bf9fb00dd1	Update additional float4/8 expected-output files. I forgot that the back branches have more variant files than HEAD :-(. Per buildfarm. Discussion: https://postgr.es/m/15519-4fc785b483201ff1@postgresql.org	2018-11-24 13:53:12 -05:00
Tom Lane	d5231253e3	Fix float-to-integer coercions to handle edge cases correctly. ftoi4 and its sibling coercion functions did their overflow checks in a way that looked superficially plausible, but actually depended on an assumption that the MIN and MAX comparison constants can be represented exactly in the float4 or float8 domain. That fails in ftoi4, ftoi8, and dtoi8, resulting in a possibility that values near the MAX limit will be wrongly converted (to negative values) when they need to be rejected. Also, because we compared before rounding off the fractional part, the other three functions threw errors for values that really ought to get rounded to the min or max integer value. Fix by doing rint() first (requiring an assumption that it handles NaN and Inf correctly; but dtoi8 and ftoi8 were assuming that already), and by comparing to values that should coerce to float exactly, namely INTxx_MIN and -INTxx_MIN. Also remove some random cosmetic discrepancies between these six functions. This back-patches commits `cbdb8b4c0` and `452b637d4`. In the 9.4 branch, also back-patch the portion of `62e2a8dc2` that added PG_INTnn_MIN and related constants to c.h, so that these functions can rely on them. Per bug #15519 from Victor Petrovykh. Patch by me; thanks to Andrew Gierth for analysis and discussion. Discussion: https://postgr.es/m/15519-4fc785b483201ff1@postgresql.org	2018-11-24 12:45:50 -05:00
Tom Lane	e1d8f18b87	Fix old TAP tests' method for selecting a valid PGPORT value. This code was trying to be paranoid, but it wasn't paranoid enough. It only ensured that the selected port is in 0..65535, while most Unix systems will refuse unprivileged attempts to use TCP port numbers below 1024. Change it to allow specification of ports 1024..65535, while if the port is outside that range, map it into 49152..65535 which is the port range used by our later branches. The main reason we've not noticed this up to now is that it's not important when testing over Unix-socket connections, only TCP, and most of our test code deliberately prevents the postmaster from opening any TCP ports. However, the SSL tests do open up a TCP port, and I believe this explains why buildfarm member chipmunk has been failing the SSL tests in 9.5: it's picking a reserved port number. Patch in 9.5 and 9.4. Later branches do not use this code.	2018-11-19 20:01:35 -05:00
Tom Lane	7c86b94311	Back-patch updated thread flags tests into 9.4 and 9.5. This commit back-patches these 9.6-era commits into 9.4 and 9.5: `e97af6c8b` Replace our hacked version of ax_pthread.m4 with latest upstream version. `3b14a17c8` Move pthread-tests earlier in the autoconf script. `01051a987` Use AS_IF rather than plain shell "if" in pthread-check. `a2932283c` Update ax_pthread.m4 to an experimental draft version from upstream. The net result is to sync configure's checks for threading-related flags and libraries with the version we've been using since 9.6. The motivation for doing so now is that it seems the older code does not work correctly on very recent RHEL7/ppc64, as evidenced by buildfarm member quokka. The newer code is pretty battle-hardened by now, so this seems like a low-risk fix. Discussion: https://postgr.es/m/3320.1542647565@sss.pgh.pa.us	2018-11-19 14:24:52 -05:00
Thomas Munro	f1ff5f51d2	PANIC on fsync() failure. On some operating systems, it doesn't make sense to retry fsync(), because dirty data cached by the kernel may have been dropped on write-back failure. In that case the only remaining copy of the data is in the WAL. A subsequent fsync() could appear to succeed, but not have flushed the data. That means that a future checkpoint could apparently complete successfully but have lost data. Therefore, violently prevent any future checkpoint attempts by panicking on the first fsync() failure. Note that we already did the same for WAL data; this change extends that behavior to non-temporary data files. Provide a GUC data_sync_retry to control this new behavior, for users of operating systems that don't eject dirty data, and possibly forensic/testing uses. If it is set to on and the write-back error was transient, a later checkpoint might genuinely succeed (on a system that does not throw away buffers on failure); if the error is permanent, later checkpoints will continue to fail. The GUC defaults to off, meaning that we panic. Back-patch to all supported releases. There is still a narrow window for error-loss on some operating systems: if the file is closed and later reopened and a write-back error occurs in the intervening time, but the inode has the bad luck to be evicted due to memory pressure before we reopen, we could miss the error. A later patch will address that with a scheme for keeping files with dirty data open at all times, but we judge that to be too complicated to back-patch. Author: Craig Ringer, with some adjustments by Thomas Munro Reported-by: Craig Ringer Reviewed-by: Robert Haas, Thomas Munro, Andres Freund Discussion: https://postgr.es/m/20180427222842.in2e4mibx45zdth5%40alap3.anarazel.de	2018-11-19 14:26:28 +13:00
Thomas Munro	2b2010d12a	Don't forget about failed fsync() requests. If fsync() fails, md.c must keep the request in its bitmap, so that future attempts will try again. Back-patch to all supported releases. Author: Thomas Munro Reviewed-by: Amit Kapila Reported-by: Andrew Gierth Discussion: https://postgr.es/m/87y3i1ia4w.fsf%40news-spur.riddles.org.uk	2018-11-19 14:26:20 +13:00
Tomas Vondra	4134489636	Add valgrind suppressions for wcsrtombs optimizations wcsrtombs (called through wchar2char from common functions like lower, upper, etc.) uses various optimizations that may look like access to uninitialized data, triggering valgrind reports. For example AVX2 instructions load data in 256-bit chunks, and gconv does something similar with 32-bit chunks. This is faster than accessing the bytes one by one, and the uninitialized part of the buffer is not actually used. So suppress the bogus reports. The exact stack depends on possible optimizations - it might be AVX, SSE (as in the report by Aleksander Alekseev) or something else. Hence the last frame is wildcarded, to deal with this. Backpatch all the way back to 9.4. Author: Tomas Vondra Discussion: https://www.postgresql.org/message-id/flat/90ac0452-e907-e7a4-b3c8-15bd33780e62%402ndquadrant.com Discussion: https://www.postgresql.org/message-id/20180220150838.GD18315@e733.localdomain	2018-11-18 00:10:15 +01:00
Tom Lane	41609776f2	Second try at fixing numeric data passed through an ECPG SQLDA. In commit `ecfd55795`, I removed sqlda.c's checks for ndigits != 0 on the grounds that we should duplicate the state of the numeric value's digit buffer even when all the digits are zeroes. However, that still isn't quite right, because another possible state of the digit buffer is buf == digits == NULL (this occurs for a NaN). As the code now stands, it'll invoke memcpy with a NULL source address and zero bytecount, which we know a few platforms crash on. Hence, reinstate the no-copy short-circuit, but make it test specifically for buf != NULL rather than some other condition. In hindsight, the ndigits test (added by commit `f2ae9f9c3`) was almost certainly meant to fix the NaN case not the all-zeroes case as the associated thread alleged. As before, back-patch to all supported versions. Discussion: https://postgr.es/m/1803D792815FC24D871C00D17AE95905C71161@g01jpexmbkw24	2018-11-14 11:27:31 -05:00
Michael Paquier	e85b729375	Initialize TransactionState and user ID consistently at transaction start If a failure happens when a transaction is starting between the moment the transaction status is changed from TRANS_DEFAULT to TRANS_START and the moment the current user ID and security context flags are fetched via GetUserIdAndSecContext(), or before initializing its basic fields, then those may get reset to incorrect values when the transaction aborts, leaving the session in an inconsistent state. One problem reported is that failing a starting transaction at the first query of a session could cause several kinds of system crashes on the follow-up queries. In order to solve that, move the initialization of the transaction state fields and the call of GetUserIdAndSecContext() in charge of fetching the current user ID close to the point where the transaction status is switched to TRANS_START, where there cannot be any error triggered in-between, per an idea of Tom Lane. This properly ensures that the current user ID, the security context flags and that the basic fields of TransactionState remain consistent even if the transaction fails while starting. Reported-by: Richard Guo Diagnosed-By: Richard Guo Author: Michael Paquier Reviewed-by: Tom Lane Discussion: https://postgr.es/m/CAN_9JTxECSb=pEPcb0a8d+6J+bDcOZ4=DgRo_B7Y5gRHJUM=Rw@mail.gmail.com Backpatch-through: 9.4	2018-11-14 16:48:26 +09:00
Tom Lane	9e5e3861c7	Fix incorrect results for numeric data passed through an ECPG SQLDA. Numeric values with leading zeroes were incorrectly copied into a SQLDA (SQL Descriptor Area), leading to wrong results in ECPG programs. Report and patch by Daisuke Higuchi. Back-patch to all supported versions. Discussion: https://postgr.es/m/1803D792815FC24D871C00D17AE95905C71161@g01jpexmbkw24	2018-11-13 15:46:08 -05:00
Tom Lane	2abc879531	Limit the number of index clauses considered in choose_bitmap_and(). classify_index_clause_usage() is O(N^2) in the number of distinct index qual clauses it considers, because of its use of a simple search list to store them. For nearly all queries, that's fine because only a few clauses will be considered. But Alexander Kuzmenkov reported a machine-generated query with 80000 (!) index qual clauses, which caused this code to take forever. Somewhat remarkably, this is the only O(N^2) behavior we now have for such a query, so let's fix it. We can get rid of the O(N^2) runtime for cases like this without much damage to the functionality of choose_bitmap_and() by separating out paths with "too many" qual or pred clauses, and deeming them to always be nonredundant with other paths. Then their clauses needn't go into the search list, so it doesn't get too long, but we don't lose the ability to consider bitmap AND plans altogether. I set the threshold for "too many" to be 100 clauses per path, which should be plenty to ensure no change in planning behavior for normal queries. There are other things we could do to make this go faster, but it's not clear that it's worth any additional effort. 80000 qual clauses require a whole lot of work in many other places, too. The code's been like this for a long time, so back-patch to all supported branches. The troublesome query only works back to 9.5 (in 9.4 it fails with stack overflow in the parser); so I'm not sure that fixing this in 9.4 has any real-world benefit, but perhaps it does. Discussion: https://postgr.es/m/90c5bdfa-d633-dabe-9889-3cf3e1acd443@postgrespro.ru	2018-11-12 11:19:04 -05:00
Tom Lane	277602dfee	Fix missing role dependencies for some schema and type ACLs. This patch fixes several related cases in which pg_shdepend entries were never made, or were lost, for references to roles appearing in the ACLs of schemas and/or types. While that did no immediate harm, if a referenced role were later dropped, the drop would be allowed and would leave a dangling reference in the object's ACL. That still wasn't a big problem for normal database usage, but it would cause obscure failures in subsequent dump/reload or pg_upgrade attempts, taking the form of attempts to grant privileges to all-numeric role names. (I think I've seen field reports matching that symptom, but can't find any right now.) Several cases are fixed here: 1. ALTER DOMAIN SET/DROP DEFAULT would lose the dependencies for any existing ACL entries for the domain. This case is ancient, dating back as far as we've had pg_shdepend tracking at all. 2. If a default type privilege applies, CREATE TYPE recorded the ACL properly but forgot to install dependency entries for it. This dates to the addition of default privileges for types in 9.2. 3. If a default schema privilege applies, CREATE SCHEMA recorded the ACL properly but forgot to install dependency entries for it. This dates to the addition of default privileges for schemas in v10 (commit `ab89e465c`). Another somewhat-related problem is that when creating a relation rowtype or implicit array type, TypeCreate would apply any available default type privileges to that type, which we don't really want since such an object isn't supposed to have privileges of its own. (You can't, for example, drop such privileges once they've been added to an array type.) `ab89e465c` is also to blame for a race condition in the regression tests: privileges.sql transiently installed globally-applicable default privileges on schemas, which sometimes got absorbed into the ACLs of schemas created by concurrent test scripts. This should have resulted in failures when privileges.sql tried to drop the role holding such privileges; but thanks to the bug fixed here, it instead led to dangling ACLs in the final state of the regression database. We'd managed not to notice that, but it became obvious in the wake of commit `da906766c`, which allowed the race condition to occur in pg_upgrade tests. To fix, add a function recordDependencyOnNewAcl to encapsulate what callers of get_user_default_acl need to do; while the original call sites got that right via ad-hoc code, none of the later-added ones have. Also change GenerateTypeDependencies to generate these dependencies, which requires adding the typacl to its parameter list. (That might be annoying if there are any extensions calling that function directly; but if there are, they're most likely buggy in the same way as the core callers were, so they need work anyway.) While I was at it, I changed GenerateTypeDependencies to accept most of its parameters in the form of a Form_pg_type pointer, making its parameter list a bit less unwieldy and mistake-prone. The test race condition is fixed just by wrapping the addition and removal of default privileges into a single transaction, so that that state is never visible externally. We might eventually prefer to separate out tests of default privileges into a script that runs by itself, but that would be a bigger change and would make the tests run slower overall. Back-patch relevant parts to all supported branches. Discussion: https://postgr.es/m/15719.1541725287@sss.pgh.pa.us	2018-11-09 20:42:03 -05:00
Tom Lane	2407d4807f	Disallow setting client_min_messages higher than ERROR. Previously it was possible to set client_min_messages to FATAL or PANIC, which had the effect of suppressing transmission of regular ERROR messages to the client. Perhaps that seemed like a useful option in the past, but the trouble with it is that it breaks guarantees that are explicitly made in our FE/BE protocol spec about how a query cycle can end. While libpq and psql manage to cope with the omission, that's mostly because they are not very bright; client libraries that have more semantic knowledge are likely to get confused. Notably, pgODBC doesn't behave very sanely. Let's fix this by getting rid of the ability to set client_min_messages above ERROR. In HEAD, just remove the FATAL and PANIC options from the set of allowed enum values for client_min_messages. (This change also affects trace_recovery_messages, but that's OK since these aren't useful values for that variable either.) In the back branches, there was concern that rejecting these values might break applications that are explicitly setting things that way. I'm pretty skeptical of that argument, but accommodate it by accepting these values and then internally setting the variable to ERROR anyway. In all branches, this allows a couple of tiny simplifications in the logic in elog.c, so do that. Also respond to the point that was made that client_min_messages has exactly nothing to do with the server's logging behavior, and therefore does not belong in the "When To Log" subsection of the documentation. The "Statement Behavior" subsection is a better match, so move it there. Jonah Harris and Tom Lane Discussion: https://postgr.es/m/7809.1541521180@sss.pgh.pa.us Discussion: https://postgr.es/m/15479-ef0f4cc2fd995ca2@postgresql.org	2018-11-08 17:33:26 -05:00
Bruce Momjian	b8db4c2af0	GUC: adjust effective_cache_size SQL descriptions Follow on patch for commit `3e0f1a4741`. Reported-by: Peter Eisentraut Discussion: https://postgr.es/m/369ec766-b947-51bd-4dad-6fb9e026439f@2ndquadrant.com Backpatch-through: 9.4	2018-11-06 13:40:02 -05:00
Tom Lane	4f0bf3359f	Stamp 9.4.20.	2018-11-05 16:51:23 -05:00
Andres Freund	b7301e3a7b	Fix copy-paste error in errhint() introduced in `691d79a079`. Reported-By: Petr Jelinek Discussion: https://postgr.es/m/c95a620b-34f0-7930-aeb5-f7ab804f26cb@2ndquadrant.com Backpatch: 9.4-, like the previous commit	2018-11-05 12:05:40 -08:00
Peter Eisentraut	92154ef477	Translation updates Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: 23063751d2d17da76d34ddfdead3f633041a6cbe	2018-11-05 15:12:15 +01:00
Tom Lane	0ae902e39e	Make ts_locale.c's character-type functions cope with UTF-16. On Windows, in UTF8 database encoding, what char2wchar() produces is UTF16 not UTF32, ie, characters above U+FFFF will be represented by surrogate pairs. t_isdigit() and siblings did not account for this and failed to provide a large enough result buffer. That in turn led to bogus "invalid multibyte character for locale" errors, because contrary to what you might think from char2wchar()'s documentation, its Windows code path doesn't cope sanely with buffer overflow. The solution for t_isdigit() and siblings is pretty clear: provide a 3-wchar_t result buffer not 2. char2wchar() also needs some work to provide more consistent, and more accurately documented, buffer overrun behavior. But that's a bigger job and it doesn't actually have any immediate payoff, so leave it for later. Per bug #15476 from Kenji Uno, who deserves credit for identifying the cause of the problem. Back-patch to all active branches. Discussion: https://postgr.es/m/15476-4314f480acf0f114@postgresql.org	2018-11-03 13:56:10 -04:00
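The buffer-size reasoning in the commit above can be sketched in C. This is a minimal illustration, not the ts_locale.c code: `utf16_units_for_codepoint` and `encode_utf16` are hypothetical names, and the sketch assumes the input is a valid Unicode scalar value. It shows why one character can need two 16-bit units plus a terminator, hence a 3-element buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of 16-bit code units that UTF-16 needs for
 * one Unicode code point.  Code points above U+FFFF need a surrogate pair.
 */
size_t
utf16_units_for_codepoint(uint32_t cp)
{
    assert(cp <= 0x10FFFF);
    return (cp > 0xFFFF) ? 2 : 1;
}

/*
 * Hypothetical helper: encode one code point into "out", followed by a zero
 * terminator.  Returns the number of units written (excluding the
 * terminator), or 0 if the buffer is too small.  Refusing to write is the
 * point of the sketch: a 2-element buffer cannot hold a surrogate pair plus
 * terminator, while a 3-element buffer always can.
 */
size_t
encode_utf16(uint32_t cp, uint16_t *out, size_t outlen)
{
    size_t      n = utf16_units_for_codepoint(cp);

    if (outlen < n + 1)
        return 0;

    if (n == 1)
        out[0] = (uint16_t) cp;
    else
    {
        uint32_t    v = cp - 0x10000;

        out[0] = (uint16_t) (0xD800 + (v >> 10));   /* high surrogate */
        out[1] = (uint16_t) (0xDC00 + (v & 0x3FF)); /* low surrogate */
    }
    out[n] = 0;
    return n;
}
```

With a 3-element buffer, U+1D7CE (a digit outside the Basic Multilingual Plane) encodes as the pair 0xD835 0xDFCE; with a 2-element buffer the sketch refuses rather than overflowing.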
Tom Lane	1b5e8b408e	Yet further rethinking of build changes for macOS Mojave. The solution arrived at in commit `e74dd00f5` presumes that the compiler has a suitable default -isysroot setting ... but further experience shows that in many combinations of macOS version, Xcode version, Xcode command line tools version, and phase of the moon, Apple's compiler will not supply a default -isysroot value. We could potentially go back to the approach used in commit `68fc227dd`, but I don't have a lot of faith in the reliability or life expectancy of that either. Let's just revert to the approach already shipped in 11.0, namely specifying an -isysroot switch globally. As a partial response to the concerns raised by Jakob Egger, adjust the contents of Makefile.global to look like CPPFLAGS = -isysroot $(PG_SYSROOT) ... PG_SYSROOT = /path/to/sysroot This allows overriding the sysroot path at build time in a relatively painless way. Add documentation to installation.sgml about how to use the PG_SYSROOT option. I also took the opportunity to document how to work around macOS's "System Integrity Protection" feature. As before, back-patch to all supported versions. Discussion: https://postgr.es/m/20840.1537850987@sss.pgh.pa.us	2018-11-02 18:54:00 -04:00
Bruce Momjian	060aff97b4	GUC: adjust effective_cache_size docs and SQL description Clarify that effective_cache_size is both kernel buffers and shared buffers. Reported-by: nat@makarevitch.org Discussion: https://postgr.es/m/153685164808.22334.15432535018443165207@wrigleys.postgresql.org Backpatch-through: 9.3	2018-11-02 09:10:59 -04:00
Andres Freund	b0fa768c61	Fix error message typo introduced in `691d79a079`. Reported-By: Michael Paquier Discussion: https://postgr.es/m/20181101003405.GB1727@paquier.xyz Backpatch: 9.4-, like the previous commit	2018-11-01 10:45:42 -07:00
Andres Freund	cf358a2c06	Disallow starting server with insufficient wal_level for existing slot. Previously it was possible to create a slot, change wal_level, and restart, even if the new wal_level was insufficient for the slot. That's a problem for both logical and physical slots, because the necessary WAL records are not generated. This removes a few tests in newer versions that, somewhat inexplicably, tested whether restarting with a too-low wal_level worked (a buggy behaviour!). Reported-By: Joshua D. Drake Author: Andres Freund Discussion: https://postgr.es/m/20181029191304.lbsmhshkyymhw22w@alap3.anarazel.de Backpatch: 9.4-, where replication slots were introduced	2018-10-31 15:46:40 -07:00
Tom Lane	95015b1f8e	Fix memory leak in repeated SPGIST index scans. spgendscan neglected to pfree all the memory allocated by spgbeginscan. It's possible to get away with that in most normal queries, since the memory is allocated in the executor's per-query context which is about to get deleted anyway; but it causes severe memory leakage during creation or filling of large exclusion-constraint indexes. Also, document that amendscan is supposed to free what ambeginscan allocates. The docs' lack of clarity on that point probably caused this bug to begin with. (There is discussion of changing that API spec going forward, but I don't think it'd be appropriate for the back branches.) Per report from Bruno Wolff. It's been like this since the beginning, so back-patch to all active branches. In HEAD, also fix an independent leak caused by commit `2a6368343` (allocating memory during spgrescan instead of spgbeginscan, which might be all right if it got cleaned up, but it didn't). And do a bit of code beautification on that commit, too. Discussion: https://postgr.es/m/20181024012314.GA27428@wolff.to	2018-10-31 17:04:43 -04:00
Tom Lane	4311cdd8e2	Sync our copy of the timezone library with IANA release tzcode2018g. This patch absorbs an upstream fix to "zic" for a recently-introduced bug that made it output data that some 32-bit clients couldn't read. Given the current source data, the bug only manifests in zones with leap seconds, which we don't generate, so that there's no actual change in our installed timezone data files from this. Still, in case somebody uses our copy of "zic" to do something else, it seems best to apply the fix promptly. Also, update the README's notes about converting upstream code to our conventions.	2018-10-31 09:48:24 -04:00
Tom Lane	d651e9e7c5	Update time zone data files to tzdata release 2018g. DST law changes in Morocco (with, effectively, zero notice). Historical corrections for Hawaii.	2018-10-31 08:36:35 -04:00
Andrew Dunstan	6982551473	Fix perl searchpath for modern perl for MSVC tools Modern versions of perl no longer include the current directory in the perl searchpath, as it's insecure. Instead of adding the current directory, we get around the problem by adding the directory where the script lives. Problem noted by Victor Wagner. Solution adapted from buildfarm client code. Backpatch to all live versions.	2018-10-28 12:26:14 -04:00
Tom Lane	0fead87601	Sync our copy of the timezone library with IANA release tzcode2018f. About half of this is purely cosmetic changes to reduce the diff between our code and theirs, like inserting "const" markers where they have them. The other half is tracking actual code changes in zic.c and localtime.c. I don't think any of these represent near-term compatibility hazards, but it seems best to stay up to date. I also fixed longstanding bugs in our code for producing the known_abbrevs.txt list, which by chance hadn't been exposed before, but which resulted in some garbage output after applying the upstream changes in zic.c. Notably, because upstream removed their old phony transitions at the Big Bang, it's now necessary to cope with TZif files containing no DST transition times at all.	2018-10-19 19:36:34 -04:00
Tom Lane	9abbfc35ca	Update time zone data files to tzdata release 2018f. DST law changes in Chile, Fiji, and Russia (Volgograd). Historical corrections for China, Japan, Macau, and North Korea. Note: like the previous tzdata update, this involves a depressingly large amount of semantically-meaningless churn in tzdata.zi. That is a consequence of upstream's data compression method assigning unstable abbreviations to DST rulesets. I complained about that to them last time, and this version now uses an assignment method that pays some heed to not changing abbreviations unnecessarily. So hopefully, that'll be better going forward.	2018-10-19 17:02:20 -04:00
Tom Lane	0749acca50	Still further rethinking of build changes for macOS Mojave. To avoid the sorts of problems complained of by Jakob Egger, it'd be best if configure didn't emit any references to the sysroot path at all. In the case of PL/Tcl, we can do that just by keeping our hands off the TCL_INCLUDE_SPEC string altogether. In the case of PL/Perl, we need to substitute -iwithsysroot for -I in the compile commands, which is easily handled if we change to using a configure output variable that includes the switch, not only the directory name. Since PL/Tcl and PL/Python already do it like that, this seems like good consistency cleanup anyway. Hence, this replaces the advice given to Perl-related extensions in commit `5e2217131`; instead of writing "-I$(perl_archlibexp)/CORE", they should just write "$(perl_includespec)". (The old way continues to work, but not on recent macOS.) It's still the case that configure needs to be aware of the sysroot path internally, but that's cleaner than what we had before. As before, back-patch to all supported versions. Discussion: https://postgr.es/m/20840.1537850987@sss.pgh.pa.us	2018-10-18 14:55:23 -04:00
Tom Lane	176f659027	Fix minor bug in isolationtester. If the lock wait query failed, isolationtester would report the PQerrorMessage from some other connection, meaning there would be no message or an unrelated one. This seems like a pretty unlikely occurrence, but if it did happen, this bug could make it really difficult/confusing to figure out what happened. That seems to justify patching all the way back. In passing, clean up another place where the "wrong" conn was used for an error report. That one's not actually buggy because it's a different alias for the same connection, but it's still confusing to the reader.	2018-10-17 15:06:38 -04:00
Tom Lane	ec5fe7f799	Improve tzparse's handling of TZDEFRULES ("posixrules") zone data. In the IANA timezone code, tzparse() always tries to load the zone file named by TZDEFRULES ("posixrules"). Previously, we'd hacked that logic to skip the load in the "lastditch" code path, which we use only to initialize the default "GMT" zone during GUC initialization. That's critical for a couple of reasons: since we do not support leap seconds, we must not allow "GMT" to have leap seconds, and since this case runs before the GUC subsystem is fully alive, we'd really rather not take the risk of pg_open_tzfile throwing any errors. However, that still left the code reading TZDEFRULES on every other call, something we'd noticed to the extent of having added code to cache the result so it was only done once per process not a lot of times. Andres Freund complained about the static data space used up for the cache; but as long as the logic was like this, there was no point in trying to get rid of that space. We can improve matters by looking a bit more closely at what the IANA code actually needs the TZDEFRULES data for. One thing it does is that if "posixrules" is a leap-second-aware zone, the leap-second behavior will be absorbed into every POSIX-style zone specification. However, that's a behavior we'd really prefer to do without, since for our purposes the end effect is to render every POSIX-style zone name unsupported. Otherwise, the TZDEFRULES data is used only if the POSIX zone name specifies DST but doesn't include a transition date rule (e.g., "EST5EDT" rather than "EST5EDT,M3.2.0,M11.1.0"). That is a minority case for our purposes --- in particular, it never happens when tzload() invokes tzparse() to interpret a transition date rule string found in a tzdata zone file. Hence, if we legislate that we're going to ignore leap-second data from "posixrules", we can postpone the TZDEFRULES load into the path where we actually need to substitute for a missing date rule string. 
That means it will never happen at all in common scenarios, making it reasonable to dynamically allocate the cache space when it does happen. Even when the data is already loaded, this saves some cycles in the common code path since we avoid a memcpy of 23KB or so. And, IMO at least, this is a less ugly hack on the IANA logic than what we had before, since it's not messing with the lastditch-vs-regular code paths. Back-patch to all supported branches, not so much because this is a critical change as that I want to keep all our copies of the IANA timezone code in sync. Discussion: https://postgr.es/m/20181015200754.7y7zfuzsoux2c4ya@alap3.anarazel.de	2018-10-17 12:26:48 -04:00
Tom Lane	486e6f8d9c	Back off using -isysroot on Darwin. Rethink the solution applied in commit `5e2217131` to get PL/Tcl to build on macOS Mojave. I feared that adding -isysroot globally might have undesirable consequences, and sure enough Jakob Egger reported one: it complicates building extensions with a different Xcode version than was used for the core server. (I find that a risky proposition in general, but apparently it works most of the time, so we shouldn't break it if we don't have to.) We'd already adopted the solution for PL/Perl of inserting the sysroot path directly into the -I switches used to find Perl's headers, and we can do the same thing for PL/Tcl by changing the -iwithsysroot switch that Apple's tclConfig.sh reports. This restricts the risks to PL/Perl and PL/Tcl themselves and directly-dependent extensions, which is a lot more pleasing in general than a global -isysroot switch. Along the way, tighten the test to see if we need to inject the sysroot path into $perl_includedir, as I'd speculated about upthread but not gotten round to doing. As before, back-patch to all supported versions. Discussion: https://postgr.es/m/20840.1537850987@sss.pgh.pa.us	2018-10-16 16:27:15 -04:00
Tom Lane	4166fb3a75	Avoid rare race condition in privileges.sql regression test. We created a temp table, then switched to a new session, leaving the old session to clean up its temp objects in background. If that took long enough, the eventual attempt to drop the user that owns the temp table could fail, as exhibited today by sidewinder. Fix by dropping the temp table explicitly when we're done with it. It's been like this for quite some time, so back-patch to all supported branches. Report: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sidewinder&dt=2018-10-16%2014%3A45%3A00	2018-10-16 13:56:58 -04:00
Tom Lane	27ba589b74	Avoid statically allocating gmtsub()'s timezone workspace. localtime.c's "struct state" is a rather large object, ~23KB. We were statically allocating one for gmtsub() to use to represent the GMT timezone, even though that function is not at all heavily used and is never reached in most backends. Let's malloc it on-demand, instead. This does pose the question of how to handle a malloc failure, but there's already a well-defined error report convention here, ie set errno and return NULL. We have but one caller of pg_gmtime in HEAD, and two in back branches, neither of which were troubling to check for error. Make them do so. The possible errors are sufficiently unlikely (out-of-range timestamp, and now malloc failure) that I think elog() is adequate. Back-patch to all supported branches to keep our copies of the IANA timezone code in sync. This particular change is in a stanza that already differs from upstream, so it's a wash for maintenance purposes --- but only as long as we keep the branches the same. Discussion: https://postgr.es/m/20181015200754.7y7zfuzsoux2c4ya@alap3.anarazel.de	2018-10-16 11:50:19 -04:00
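The allocate-on-demand pattern described in the commit above, together with the "set errno and return NULL" error convention, might look roughly like this. It is a sketch under stated assumptions: `struct tz_state` and `get_gmt_state` are hypothetical stand-ins for localtime.c's `struct state` and the lookup inside `gmtsub()`, and the real code also initializes the zone data.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for localtime.c's roughly 23KB "struct state". */
struct tz_state
{
    char        payload[23 * 1024];
};

/* Not statically allocated: stays NULL until somebody actually needs it. */
static struct tz_state *gmt_state = NULL;

/*
 * Return the shared GMT workspace, allocating it on first use.
 * On allocation failure, set errno and return NULL; callers must check.
 */
struct tz_state *
get_gmt_state(void)
{
    if (gmt_state == NULL)
    {
        gmt_state = calloc(1, sizeof(struct tz_state));
        if (gmt_state == NULL)
        {
            errno = ENOMEM;
            return NULL;
        }
    }
    assert(gmt_state != NULL);
    return gmt_state;
}
```

A process that never calls `get_gmt_state()` pays nothing for the workspace, and repeated calls return the same pointer.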
Tom Lane	eb01ea2a36	Check for stack overrun in standard_ProcessUtility(). ProcessUtility can recurse, and indeed can be driven to infinite recursion, so it ought to have a check_stack_depth() call. This covers the reported bug (portal trying to execute itself) and a bunch of other cases that could perhaps arise somewhere. Per bug #15428 from Malthe Borch. Back-patch to all supported branches. Discussion: https://postgr.es/m/15428-b3c2915ec470b033@postgresql.org	2018-10-15 14:01:38 -04:00
Michael Paquier	7c525519d8	Avoid duplicate XIDs at recovery when building initial snapshot On a primary, sets of XLOG_RUNNING_XACTS records are generated on a periodic basis to allow recovery to build the initial state of transactions for a hot standby. The set of transaction IDs is created by scanning all the entries in ProcArray. However, its logic never counted on the fact that a two-phase transaction finishing its prepare can put ProcArray in a state where there are two entries with the same transaction ID: one for the initial transaction, which gets cleared when prepare finishes, and a second, dummy entry to track that the transaction is still running after prepare finishes. This ensures a continuous presence of the transaction, so that callers of, for example, TransactionIdIsInProgress() are always able to see it as alive. So, if an XLOG_RUNNING_XACTS record takes a standby snapshot while a two-phase transaction is finishing its prepare, the record can end up with duplicated XIDs, which is a state expected by design. If this record gets applied on a standby to initialize its recovery state, then it would simply fail, but the odds of facing this failure are very low in practice. It would be tempting to change the generation of XLOG_RUNNING_XACTS so that duplicates are removed on the source, but this requires holding ProcArrayLock for longer, which would impact all workloads, particularly those making heavy use of two-phase transactions. XLOG_RUNNING_XACTS is also actually used only to initialize the standby state at recovery, so instead the solution taken is to discard duplicates when applying the initial snapshot. Diagnosed-by: Konstantin Knizhnik Author: Michael Paquier Discussion: https://postgr.es/m/0c96b653-4696-d4b4-6b5d-78143175d113@postgrespro.ru Backpatch-through: 9.3	2018-10-14 22:23:54 +09:00
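Discarding duplicates on the applying side, as the commit above describes, can be illustrated with a sort followed by a single compaction pass. This is only a sketch: `xid_t` and `sort_and_dedupe_xids` are hypothetical names, and plain numeric ordering is used here just to make equal values adjacent, whereas PostgreSQL compares real transaction IDs with wraparound-aware logic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t xid_t;         /* hypothetical stand-in for TransactionId */

/* qsort comparator: plain numeric order, enough to group equal XIDs. */
static int
xid_cmp(const void *a, const void *b)
{
    xid_t       x = *(const xid_t *) a;
    xid_t       y = *(const xid_t *) b;

    return (x > y) - (x < y);
}

/*
 * Sort the array in place, drop adjacent duplicates, and return the number
 * of distinct XIDs now stored at the front of the array.
 */
size_t
sort_and_dedupe_xids(xid_t *xids, size_t n)
{
    size_t      last = 0;       /* index of the last distinct value kept */

    if (n == 0)
        return 0;
    assert(xids != NULL);

    qsort(xids, n, sizeof(xid_t), xid_cmp);
    for (size_t i = 1; i < n; i++)
    {
        if (xids[i] != xids[last])
            xids[++last] = xids[i];
    }
    return last + 1;
}
```

Doing this on the standby costs one sort per initial snapshot, while the primary keeps its lock hold time unchanged, which is the trade-off the commit message argues for.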
Tom Lane	7b88c1ddd0	Remove abstime, reltime, tinterval tables from old regression databases. In the back branches, drop these tables after the regression tests are done with them. This fixes failures of cross-branch pg_upgrade testing caused by these types having been removed in v12. We do lose the ability to test dump/restore behavior with these types in the back branches, but the actual loss of code coverage seems to be nil given that there's nothing very special about these types. Discussion: https://postgr.es/m/20181009192237.34wjp3nmw7oynmmr@alap3.anarazel.de	2018-10-12 19:33:57 -04:00
Tom Lane	ec185747a4	Back-patch addition of the ALLOCSET_FOO_SIZES macros. These macros were originally added in commit `ea268cdc9`, and back-patched into 9.6 before 9.6.0. However, some extensions would like to use them in older branches, and there seems no harm in providing them. So add them to all supported branches. Per suggestions from Christoph Berg and Andres Freund. Discussion: https://postgr.es/m/20181012170355.bhxi273skjt6sag4@alap3.anarazel.de	2018-10-12 14:49:33 -04:00
Andres Freund	c7b96ba291	Fix logical decoding error when system table w/ toast is repeatedly rewritten. Repeatedly rewriting a mapped catalog table with VACUUM FULL or CLUSTER could cause logical decoding to fail with: ERROR, "could not map filenode \"%s\" to relation OID" To trigger the problem the rewritten catalog had to have live tuples with toasted columns. The problem was triggered because, during catalog table rewrites, the heap_insert() check that prevents logical decoding information from being emitted for system catalogs failed to treat the new heap's toast table as a system catalog (because the new heap is not recognized as a catalog table via RelationIsLogicallyLogged()). The relmapper, in contrast to the normal catalog contents, does not contain historical information. After a single rewrite of a mapped table the new relation is known to the relmapper, but if the table is rewritten twice before logical decoding occurs, the relfilenode cannot be mapped to a relation anymore, which then leads us to error out. This only happens for toast tables, because the main table contents aren't re-inserted with heap_insert(). The fix is simple: add a new heap_insert() flag that prevents logical decoding information from being emitted, and accept during decoding that there might not be tuple data for toast tables. Unfortunately that does not fix pre-existing logical decoding errors. Doing so would require not throwing an error when a filenode cannot be mapped to a relation during decoding, and that seems too likely to hide bugs. If it's crucial to fix decoding for an existing slot, temporarily changing the ERROR in ReorderBufferCommit() to a WARNING appears to be the best fix. Author: Andres Freund Discussion: https://postgr.es/m/20180914021046.oi7dm4ra3ot2g2kt@alap3.anarazel.de Backpatch: 9.4-, where logical decoding was introduced	2018-10-10 13:53:03 -07:00
Tom Lane	26cc27541d	Allow btree comparison functions to return INT_MIN. Historically we forbade datatype-specific comparison functions from returning INT_MIN, so that it would be safe to invert the sort order just by negating the comparison result. However, this was never really safe for comparison functions that directly return the result of memcmp(), strcmp(), etc, as POSIX doesn't place any such restriction on those library functions. Buildfarm results show that at least on recent Linux on s390x, memcmp() actually does return INT_MIN sometimes, causing sort failures. The agreed-on answer is to remove this restriction and fix relevant call sites to not make such an assumption; code such as "res = -res" should be replaced by "INVERT_COMPARE_RESULT(res)". The same is needed in a few places that just directly negated the result of memcmp or strcmp. To help find places having this problem, I've also added a compile option to nbtcompare.c that causes some of the commonly used comparators to return INT_MIN/INT_MAX instead of their usual -1/+1. It'd likely be a good idea to have at least one buildfarm member running with "-DSTRESS_SORT_INT_MIN". That's far from a complete test of course, but it should help to prevent fresh introductions of such bugs. This is a longstanding portability hazard, so back-patch to all supported branches. Discussion: https://postgr.es/m/20180928185215.ffoq2xrq5d3pafna@alap3.anarazel.de	2018-10-05 16:01:30 -04:00
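The hazard behind "res = -res" is that negating INT_MIN overflows, so the sign never flips. A minimal sketch of a safe inversion follows; `invert_compare` and `compare_bytes_desc` are hypothetical helpers written for illustration and are not the INVERT_COMPARE_RESULT macro itself.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/*
 * Hypothetical helper: reverse the sense of a three-way comparison result
 * without ever computing -INT_MIN.  Every negative input, including
 * INT_MIN, maps to a positive result; zero and positive inputs can be
 * negated safely.
 */
int
invert_compare(int res)
{
    if (res < 0)
        return 1;
    return -res;
}

/*
 * Hypothetical descending-order comparator built on memcmp(), whose return
 * value is only specified by sign, so it may legitimately be INT_MIN.
 */
int
compare_bytes_desc(const void *a, const void *b, size_t len)
{
    int         res = memcmp(a, b, len);
    int         inverted = invert_compare(res);

    /* The inverted result must have the opposite sign, or both are zero. */
    assert((res < 0 && inverted > 0) ||
           (res > 0 && inverted < 0) ||
           (res == 0 && inverted == 0));
    return inverted;
}
```

Only the sign of a comparator's result is meaningful, so collapsing all negative inputs to 1 loses nothing that callers are entitled to rely on.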
Tom Lane	a5b46fc66d	Set snprintf.c's maximum number of NL arguments to be 31. Previously, we used the platform's NL_ARGMAX if any, otherwise 16. The trouble with this is that the platform value is hugely variable, ranging from the POSIX-minimum 9 to as much as 64K on recent FreeBSD. Values of more than a dozen or two have no practical use and slow down the initialization of the argtypes array. Worse, they cause snprintf.c to consume far more stack space than was the design intention, possibly resulting in stack-overflow crashes. Standardize on 31, which is comfortably more than we need (it looks like no existing translatable message has more than about 10 parameters). I chose that, not 32, to make the array sizes powers of 2, for some possible small gain in speed of the memset. The lack of reported crashes suggests that the set of platforms we use snprintf.c on (in released branches) may have no overlap with the set where NL_ARGMAX has unreasonably large values. But that's not entirely clear, so back-patch to all supported branches. Per report from Mateusz Guzik (via Thomas Munro). Discussion: https://postgr.es/m/CAEepm=3VF=PUp2f8gU8fgZB22yPE_KBS0+e1AHAtQ=09schTHg@mail.gmail.com	2018-10-02 12:41:28 -04:00
Tom Lane	fd81fae67f	Fix corner-case failures in has_foo_privilege() family of functions. The variants of these functions that take numeric inputs (OIDs or column numbers) are supposed to return NULL rather than failing on bad input; this rule reduces problems with snapshot skew when queries apply the functions to all rows of a catalog. has_column_privilege() had careless handling of the case where the table OID didn't exist. You might get something like this: select has_column_privilege(9999,'nosuchcol','select'); ERROR: column "nosuchcol" of relation "(null)" does not exist or you might get a crash, depending on the platform's printf's response to a null string pointer. In addition, while applying the column-number variant to a dropped column returned NULL as desired, applying the column-name variant did not: select has_column_privilege('mytable','........pg.dropped.2........','select'); ERROR: column "........pg.dropped.2........" of relation "mytable" does not exist It seems better to make this case return NULL as well. Also, the OID-accepting variants of has_foreign_data_wrapper_privilege, has_server_privilege, and has_tablespace_privilege didn't follow the principle of returning NULL for nonexistent OIDs. Superusers got TRUE, everybody else got an error. Per investigation of Jaime Casanova's report of a new crash in HEAD. These behaviors have been like this for a long time, so back-patch to all supported branches. Patch by me; thanks to Stephen Frost for discussion and review Discussion: https://postgr.es/m/CAJGNTeP=-6Gyqq5TN9OvYEydi7Fv1oGyYj650LGTnW44oAzYCg@mail.gmail.com	2018-10-02 11:54:13 -04:00
Tom Lane	26318c4b85	Fix ALTER COLUMN TYPE to not open a relation without any lock. If the column being modified is referenced by a foreign key constraint of another table, ALTER TABLE would open the other table (to re-parse the constraint's definition) without having first obtained a lock on it. This was evidently intentional, but that doesn't mean it's really safe. It's especially not safe in 9.3, which pre-dates use of MVCC scans for catalog reads, but even in current releases it doesn't seem like a good idea. We know we'll need AccessExclusiveLock shortly to drop the obsoleted constraint, so just get that a little sooner to close the hole. Per testing with a patch that complains if we open a relation without holding any lock on it. I don't plan to back-patch that patch, but we should close the holes it identifies in all supported branches. Discussion: https://postgr.es/m/2038.1538335244@sss.pgh.pa.us	2018-10-01 11:39:14 -04:00
Tom Lane	e5baf8c27e	Fix detection of the result type of strerror_r(). The method we've traditionally used, of redeclaring strerror_r() to see if the compiler complains of inconsistent declarations, turns out not to work reliably because some compilers only report a warning, not an error. Amazingly, this has gone undetected for years, even though it certainly breaks our detection of whether strerror_r succeeded. Let's instead test whether the compiler will take the result of strerror_r() as a switch() argument. It's possible this won't work universally either, but it's the best idea I could come up with on the spur of the moment. Back-patch of commit `751f532b9`. Buildfarm results indicate that only icc-on-Linux actually has an issue here; perhaps the lack of field reports indicates that people don't build PG for production that way. Discussion: https://postgr.es/m/10877.1537993279@sss.pgh.pa.us	2018-09-30 16:24:56 -04:00
Peter Eisentraut	26b877d280	Recurse to sequences on ownership change for all relkinds When a table ownership is changed, we must apply that also to any owned sequences. (Otherwise, it would result in a situation that cannot be restored, because linked sequences must have the same owner as the table.) But this was previously only applied to regular tables and materialized views. But it should also apply to at least foreign tables. This patch removes the relkind check altogether, because it doesn't save very much and just introduces the possibility of similar omissions. Bug: #15238 Reported-by: Christoph Berg <christoph.berg@credativ.de>	2018-09-26 20:33:05 +02:00
Tom Lane	a5361b5933	Make some fixes to allow building Postgres on macOS 10.14 ("Mojave"). Apple's latest rearrangements of the system-supplied headers have broken building of PL/Perl and PL/Tcl. The only practical way to fix PL/Tcl is to start using the "-isysroot" compiler flag to point to SDK-supplied headers, as Apple expects. We must also start distinguishing where to find Perl's headers from where to find its shared library; but that seems like good cleanup anyway. Extensions that formerly did something like -I$(perl_archlibexp)/CORE should now do -I$(perl_includedir)/CORE instead. perl_archlibexp is still the place to look for libperl.so, though. If for some reason you don't like the default -isysroot setting, you can override that by setting PG_SYSROOT in configure's arguments. I don't currently think people would need to do so, unless maybe for cross-version build purposes. In addition, teach configure where to find tclConfig.sh. Our traditional method of searching $auto_path hasn't worked for the last couple of macOS releases, and it now seems clear that Apple's not going to change that. The workaround of manually specifying --with-tclconfig was annoying already, but Mojave's made it a lot more so because the sysroot path now has to be included as well. Let's just wire the knowledge into configure instead. To avoid breaking builds against non-default Tcl installations (e.g. MacPorts) wherein the $auto_path method probably still works, arrange to try the additional case only after all else has failed. Back-patch to all supported versions, since at least the buildfarm cares about that. The changes are set up to not do anything on macOS releases that are old enough to not have functional sysroot trees.	2018-09-25 13:23:29 -04:00
Tom Lane	028fc0bac9	Fix over-allocation of space for array_out()'s result string. array_out overestimated the space needed for its output, possibly by a very substantial amount if the array is multi-dimensional, because of wrong order of operations in the loop that counts the number of curly-brace pairs needed. While the output string is normally short-lived, this could still cause problems in extreme cases. An additional minor error was that it counted one more delimiter than is actually needed. Repair those errors, add an Assert that the space is now correctly calculated, and make some minor improvements in the comments. I also failed to resist the temptation to get rid of an integer modulus operation per array element; a simple comparison is sufficient. This bug dates clear back to Berkeley days, so back-patch to all supported versions. Keiichi Hirobe, minor additional work by me Discussion: https://postgr.es/m/CAH=EFxE9W0tRvQkixR2XJRRCToUYUEDkJZk6tnADXugPBRdcdg@mail.gmail.com	2018-09-24 11:30:51 -04:00
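To make the "order of operations" point above concrete, here is a small sketch of counting curly-brace pairs for a multi-dimensional array. Both function names are hypothetical, and the second function is an assumption about what a misordered loop could look like, not a copy of the pre-fix array_out code. A 2x3 array such as {{1,2,3},{4,5,6}} needs 3 brace pairs: one outer pair plus one per row.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: number of brace pairs in the text form of an array
 * with the given dimensions.  Each nesting level contributes one pair per
 * sub-array at that level, so the count must be added BEFORE the group
 * count is multiplied by the current dimension.
 */
size_t
count_brace_pairs(const int *dims, int ndim)
{
    size_t      pairs = 0;
    size_t      groups = 1;     /* sub-arrays at the current nesting level */

    for (int i = 0; i < ndim; i++)
    {
        assert(dims[i] > 0);
        pairs += groups;
        groups *= (size_t) dims[i];
    }
    return pairs;
}

/*
 * Illustrative misordering (an assumption, for comparison only): multiplying
 * first counts the elements of each level instead of its sub-arrays, which
 * overestimates, badly so when there are many dimensions.
 */
size_t
count_brace_pairs_misordered(const int *dims, int ndim)
{
    size_t      pairs = 0;
    size_t      groups = 1;

    for (int i = 0; i < ndim; i++)
    {
        assert(dims[i] > 0);
        groups *= (size_t) dims[i];
        pairs += groups;
    }
    return pairs;
}
```

For dimensions 2x3x4 the first function yields 1 + 2 + 6 = 9 pairs, while the misordered variant yields 2 + 6 + 24 = 32, which shows how quickly such an estimate can grow.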
Noah Misch	401228183a	Initialize random() in bootstrap/stand-alone postgres and in initdb. This removes a difference between the standard IsUnderPostmaster execution environment and that of --boot and --single. In a stand-alone backend, "SELECT random()" always started at the same seed. On a system capable of using posix shared memory, initdb could still conclude "selecting dynamic shared memory implementation ... sysv". Crashed --boot or --single postgres processes orphaned shared memory objects having names that collided with the not-actually-random names that initdb probed. The sysv fallback appeared after ten crashes of --boot or --single postgres. Since --boot and --single are rare in production use, systems used for PostgreSQL development are the principal candidate to notice this symptom. Back-patch to 9.3 (all supported versions). PostgreSQL 9.4 introduced dynamic shared memory, but 9.3 does share the "SELECT random()" problem. Reviewed by Tom Lane and Kyotaro HORIGUCHI. Discussion: https://postgr.es/m/20180915221546.GA3159382@rfd.leadboat.com	2018-09-23 22:56:57 -07:00
Tom Lane	38cb010843	Fix failure in WHERE CURRENT OF after rewinding the referenced cursor. In a case where we have multiple relation-scan nodes in a cursor plan, such as a scan of an inheritance tree, it's possible to fetch from a given scan node, then rewind the cursor and fetch some row from an earlier scan node. In such a case, execCurrent.c mistakenly thought that the later scan node was still active, because ExecReScan hadn't done anything to make it look not-active. We'd get some sort of failure in the case of a SeqScan node, because the node's scan tuple slot would be pointing at a HeapTuple whose t_self gets reset to invalid by heapam.c. But it seems possible that for other relation scan node types we'd actually return a valid tuple TID to the caller, resulting in updating or deleting a tuple that shouldn't have been considered current. To fix, forcibly clear the ScanTupleSlot in ExecScanReScan. Another issue here, which seems only latent at the moment but could easily become a live bug in future, is that rewinding a cursor does not necessarily lead to immediately applying ExecReScan to every scan-level node in the plan tree. Upper-level nodes will think that they can postpone that call if their child node is already marked with chgParam flags. I don't see a way for that to happen today in a plan tree that's simple enough for execCurrent.c's search_plan_tree to understand, but that's one heck of a fragile assumption. So, add some logic in search_plan_tree to detect chgParam flags being set on nodes that it descended to/through, and assume that that means we should consider lower scan nodes to be logically reset even if their ReScan call hasn't actually happened yet. Per bug #15395 from Matvey Arye. This has been broken for a long time, so back-patch to all supported branches. Discussion: https://postgr.es/m/153764171023.14986.280404050547008575@wrigleys.postgresql.org	2018-09-23 16:05:45 -04:00
Thomas Munro	c0c5668c6a	Allow DSM allocation to be interrupted. Chris Travers reported that the startup process can repeatedly try to cancel a backend that is in a posix_fallocate()/EINTR loop and cause it to loop forever. Teach the retry loop to give up if an interrupt is pending. Don't actually check for interrupts in that loop though, because a non-local exit would skip some clean-up code in the caller. Back-patch to 9.4 where DSM was added (and posix_fallocate() was later back-patched). Author: Chris Travers Reviewed-by: Ildar Musin, Murat Kabilov, Oleksii Kliukin Tested-by: Oleksii Kliukin Discussion: https://postgr.es/m/CAN-RpxB-oeZve_J3SM_6%3DHXPmvEG%3DHX%2B9V9pi8g2YR7YW0rBBg%40mail.gmail.com	2018-09-18 23:49:21 +12:00
Tom Lane	8494755109	Fix failure with initplans used conditionally during EvalPlanQual rechecks. The EvalPlanQual machinery assumes that any initplans (that is, uncorrelated sub-selects) used during an EPQ recheck would have already been evaluated during the main query; this is implicit in the fact that execPlan pointers are not copied into the EPQ estate's es_param_exec_vals. But it's possible for that assumption to fail, if the initplan is only reached conditionally. For example, a sub-select inside a CASE expression could be reached during a recheck when it had not been previously, if the CASE test depends on a column that was just updated. This bug is old, appearing to date back to my rewrite of EvalPlanQual in commit `9f2ee8f28`, but was not detected until Kyle Samson reported a case. To fix, force all not-yet-evaluated initplans used within the EPQ plan subtree to be evaluated at the start of the recheck, before entering the EPQ environment. This could be inefficient, if such an initplan is expensive and goes unused again during the recheck --- but that's piling one layer of improbability atop another. It doesn't seem worth adding more complexity to prevent that, at least not in the back branches. It was convenient to use the new-in-v11 ExecEvalParamExecParams function to implement this, but I didn't like either its name or the specifics of its API, so revise that. Back-patch all the way. Rather than rewrite the patch to avoid depending on bms_next_member() in the oldest branches, I chose to back-patch that function into 9.4 and 9.3. (This isn't the first time back-patches have needed that, and it exhausted my patience.) I also chose to back-patch some test cases added by commits `71404af2a` and `342a1ffa2` into 9.4 and 9.3, so that the 9.x versions of eval-plan-qual.spec are all the same. Andrew Gierth diagnosed the problem and contributed the added test cases, though the actual code changes are by me. 
Discussion: https://postgr.es/m/A033A40A-B234-4324-BE37-272279F7B627@tripadvisor.com	2018-09-15 13:42:34 -04:00
Andrew Gierth	a389ddc759	Repair bug in regexp split performance improvements. Commit `c8ea87e4b` introduced a temporary conversion buffer for substrings extracted during regexp splits. Unfortunately the code that sized it was failing to ignore the effects of ignored degenerate regexp matches, so for regexp_split_* calls it could under-size the buffer in such cases. Fix, and add some regression test cases (though those will only catch the bug if run in a multibyte encoding). Backpatch to 9.3 as the faulty code was. Thanks to the PostGIS project, Regina Obe and Paul Ramsey for the report (via IRC) and assistance in analysis. Patch by me.	2018-09-12 19:47:50 +01:00
Tom Lane	86e2475833	On all Windows platforms, not just Cygwin, use _timezone and _tzname. Back-patch commit `868628e4f` into the 9.5 branch, so that we can support building that branch with Visual Studio 2015. This patch itself could go further back, but other VS2015 patches such as `0fb54de9a` and `c8e81afc6` were only back-patched to 9.5, so there seems little point in handling this one differently. Discussion: https://postgr.es/m/CAD=LzWFg+Z-KUS3Wm8-1J2vOuYErJXbjuE6b7quzswQEBXJWMQ@mail.gmail.com Now that we have backported VS2015 support to 9.4 and 9.3, backport this also.	2018-09-12 12:24:11 -04:00
Andrew Dunstan	19acfd6528	Support building with Visual Studio 2017 Haribabu Kommi, reviewed by Takeshi Ideriha and Christian Ullrich Now backpatched to 9.4 and 9.3	2018-09-11 16:03:42 -04:00
Andrew Dunstan	9ca32a6ebc	Support building with Visual Studio 2015 Adjust the way we detect the locale. As a result the minimum Windows version supported by VS2015 and later is Windows Vista. Add some tweaks to remove new compiler warnings. Remove documentation references to the now obsolete msysGit. Michael Paquier, somewhat edited by me, reviewed by Christian Ullrich. Rather belated backpatch to 9.4 and 9.3	2018-09-11 15:44:42 -04:00
Alexander Korotkov	35ea98f79a	Fix past pd_upper write in ginRedoRecompress() ginRedoRecompress() replays actions over compressed segments of a posting list in place. However, this might write past pd_upper, because the intermediate state while replaying the changes can take more space than both the original state and the final state. This commit fixes that by refusing in-place modification. Instead, the page tail is copied once modification starts, and the copy is then used as the source of the original segments. Backpatch to 9.4 where posting list compression was introduced. Reported-by: Sivasubramanian Ramasubramanian Discussion: https://postgr.es/m/1536091151804.6588%40amazon.com Author: Alexander Korotkov based on patch from and ideas by Sivasubramanian Ramasubramanian Review: Sivasubramanian Ramasubramanian Backpatch-through: 9.4	2018-09-09 21:45:55 +03:00
Tom Lane	d2003339c3	Save/restore SPI's global variables in SPI_connect() and SPI_finish(). This patch removes two sources of interference between nominally independent functions when one SPI-using function calls another, perhaps without knowing that it does so. Chapman Flack pointed out that xml.c's query_to_xml_internal() expects SPI_tuptable and SPI_processed to stay valid across datatype output function calls; but it's possible that such a call could involve re-entrant use of SPI. It seems likely that there are similar hazards elsewhere, if not in the core code then in third-party SPI users. Previously SPI_finish() reset SPI's API globals to zeroes/nulls, which would typically make for a crash in such a situation. Restoring them to the values they had at SPI_connect() seems like a considerably more useful behavior, and it still meets the design goal of not leaving any dangling pointers to tuple tables of the function being exited. Also, cause SPI_connect() to reset these variables to zeroes/nulls after saving them. This prevents interference in the opposite direction: it's possible that a SPI-using function that's only ever been tested standalone contains assumptions that these variables start out as zeroes. That was the case as long as you were the outermost SPI user, but not so much for an inner user. Now it's consistent. Report and fix suggestion by Chapman Flack, actual patch by me. Back-patch to all supported branches. Discussion: https://postgr.es/m/9fa25bef-2e4f-1c32-22a4-3ad0723c4a17@anastigmatix.net	2018-09-07 20:09:57 -04:00
Tom Lane	35e39610a3	Limit depth of forced recursion for CLOBBER_CACHE_RECURSIVELY. It's somewhat surprising that we got away with this before. (Actually, since nobody tests this routinely AFAIK, it might've been broken for awhile. But it's definitely broken in the wake of commit f868a8143.) It seems sufficient to limit the forced recursion to a small number of levels. Back-patch to all supported branches, like the preceding patch. Discussion: https://postgr.es/m/12259.1532117714@sss.pgh.pa.us	2018-09-07 18:14:37 -04:00
Tom Lane	bf919387ec	Fix longstanding recursion hazard in sinval message processing. LockRelationOid and sibling routines supposed that, if our session already holds the lock they were asked to acquire, they could skip calling AcceptInvalidationMessages on the grounds that we must have already read any remote sinval messages issued against the relation being locked. This is normally true, but there's a critical special case where it's not: processing inside AcceptInvalidationMessages might attempt to access system relations, resulting in a recursive call to acquire a relation lock. Hence, if the outer call had acquired that same system catalog lock, we'd fall through, despite the possibility that there's an as-yet-unread sinval message for that system catalog. This could, for example, result in failure to access a system catalog or index that had just been processed by VACUUM FULL. This is the explanation for buildfarm failures we've been seeing intermittently for the past three months. The bug is far older than that, but commits `a54e1f158` et al added a new recursion case within AcceptInvalidationMessages that is apparently easier to hit than any previous case. To fix this, we must not skip calling AcceptInvalidationMessages until we have finished a call to it since acquiring a relation lock, not merely acquired the lock. (There's already adequate logic inside AcceptInvalidationMessages to deal with being called recursively.) Fortunately, we can implement that at trivial cost, by adding a flag to LOCALLOCK hashtable entries that tracks whether we know we have completed such a call. There is an API hazard added by this patch for external callers of LockAcquire: if anything is testing for LOCKACQUIRE_ALREADY_HELD, it might be fooled by the new return code LOCKACQUIRE_ALREADY_CLEAR into thinking the lock wasn't already held. This should be a fail-soft condition, though, unless something very bizarre is being done in response to the test. 
Also, I added an additional output argument to LockAcquireExtended, assuming that that probably isn't called by any outside code given the very limited usefulness of its additional functionality. Back-patch to all supported branches. Discussion: https://postgr.es/m/12259.1532117714@sss.pgh.pa.us	2018-09-07 18:04:38 -04:00
Michael Paquier	1130206272	Fix initial sync of slot parent directory when restoring status At the beginning of recovery, information from replication slots is recovered from disk to memory. In order to ensure the durability of the information, the status file as well as its parent directory are synced. It happens that the sync on the parent directory was done directly using the status file path, which is logically incorrect, and the current code has been doing a sync on the same object twice in a row. Reported-by: Konstantin Knizhnik Diagnosed-by: Konstantin Knizhnik Author: Michael Paquier Discussion: https://postgr.es/m/9eb1a6d5-b66f-2640-598d-c5ea46b8f68a@postgrespro.ru Backpatch-through: 9.4-	2018-09-02 12:41:06 -07:00
Tom Lane	083d9ced14	Avoid using potentially-under-aligned page buffers. There's a project policy against using plain "char buf[BLCKSZ]" local or static variables as page buffers; preferred style is to palloc or malloc each buffer to ensure it is MAXALIGN'd. However, that policy's been ignored in an increasing number of places. We've apparently got away with it so far, probably because (a) relatively few people use platforms on which misalignment causes core dumps and/or (b) the variables chance to be sufficiently aligned anyway. But this is not something to rely on. Moreover, even if we don't get a core dump, we might be paying a lot of cycles for misaligned accesses. To fix, invent new union types PGAlignedBlock and PGAlignedXLogBlock that the compiler must allocate with sufficient alignment, and use those in place of plain char arrays. I used these types even for variables where there's no risk of a misaligned access, since ensuring proper alignment should make kernel data transfers faster. I also changed some places where we had been palloc'ing short-lived buffers, for coding style uniformity and to save palloc/pfree overhead. Since this seems to be a live portability hazard (despite the lack of field reports), back-patch to all supported versions. Patch by me; thanks to Michael Paquier for review. Discussion: https://postgr.es/m/1535618100.1286.3.camel@credativ.de	2018-09-01 15:27:13 -04:00
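The union technique described in the entry above (`083d9ced14`) can be sketched as follows. This is a minimal illustration under stated assumptions, not the definition used in the tree: `AlignedBlock`, `BLOCK_SIZE`, `block_is_aligned` and `zero_page_demo` are hypothetical names, and 8192 is only an assumed block size.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 8192			/* assumed block size, for illustration only */

/*
 * Hypothetical stand-in for an aligned page buffer type.  The extra members
 * are never read; they exist only so the compiler aligns the whole union
 * (and therefore data[]) at least as strictly as double and int64_t.
 */
typedef union AlignedBlock
{
	char		data[BLOCK_SIZE];
	double		force_align_d;
	int64_t		force_align_i64;
} AlignedBlock;

/* Returns 1 if the buffer's address satisfies the alignment of double. */
static int
block_is_aligned(const AlignedBlock *blk)
{
	return ((uintptr_t) blk->data % alignof(double)) == 0;
}

/* Example use: a local page buffer that replaces "char buf[BLOCK_SIZE]". */
static int
zero_page_demo(void)
{
	AlignedBlock page;

	memset(page.data, 0, BLOCK_SIZE);
	assert(sizeof(page) >= BLOCK_SIZE);
	return block_is_aligned(&page);
}
```

A plain `char` array carries no alignment guarantee beyond 1 byte, whereas the union inherits the strictest alignment among its members, which is the property the commit message relies on.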
Noah Misch	20cd88857b	Ignore server-side delays when enforcing wal_sender_timeout. Healthy clients of servers having poor I/O performance, such as buildfarm members hamster and tern, saw unexpected timeouts. That disagreed with documentation. This fix adds one gettimeofday() call whenever ProcessRepliesIfAny() finds no client reply messages. Back-patch to 9.4; the bug's symptom is rare and mild, and the code all moved between 9.3 and 9.4. Discussion: https://postgr.es/m/20180826034600.GA1105084@rfd.leadboat.com	2018-08-31 23:00:03 -07:00
Michael Paquier	d9638a326f	Ensure correct minimum consistent point on standbys Commit `8d68ee6` improved the startup process's calculation of the minimum consistent point, which ensures that all available WAL gets replayed when doing crash recovery, but it also introduced an incorrect calculation of the minimum recovery point for non-startup processes. That can cause incorrect page references on a standby when, for example, the background writer has flushed a couple of pages to disk without updating the control file to let a subsequent crash recovery replay to where it should. The only case where this has been reported to be a problem is when a standby needs to calculate the latest removed xid when replaying a btree deletion record, so one would need connections on a standby that happen just after recovery has thought it reached a consistent point. Using a background worker which is started after the consistent point is reached would be the easiest way to get into problems if it connects to a database. Having clients which attempt to connect periodically could also be a problem, but the odds of seeing this problem are much lower. The fix used is pretty simple: give non-startup processes access to the minimum recovery point written in the control file so that they can use it as a reference, while the startup process still initializes its own references of the minimum consistent point so that the original problem with incorrect page references happening post-promotion with a crash does not show up. Reported-by: Alexander Kukushkin Diagnosed-by: Alexander Kukushkin Author: Michael Paquier Reviewed-by: Kyotaro Horiguchi, Alexander Kukushkin Discussion: https://postgr.es/m/153492341830.1368.3936905691758473953@wrigleys.postgresql.org Backpatch-through: 9.3	2018-08-31 11:05:59 -07:00
Tom Lane	20f9cd55dd	Make checksum_impl.h safe to compile with -fstrict-aliasing. In general, Postgres requires -fno-strict-aliasing with compilers that implement C99 strict aliasing rules. There's little hope of getting rid of that overall. But it seems like it would be a good idea if storage/checksum_impl.h in particular didn't depend on it, because that header is explicitly intended to be included by external programs. We don't have a lot of control over the compiler switches that an external program might use, as shown by Michael Banck's report of failure in a privately-modified version of pg_verify_checksums. Hence, switch to using a union in place of willy-nilly pointer casting inside this file. I think this makes the code a bit more readable anyway. checksum_impl.h hasn't changed since it was introduced in 9.3, so back-patch all the way. Discussion: https://postgr.es/m/1535618100.1286.3.camel@credativ.de	2018-08-31 12:27:15 -04:00
Andrew Gierth	2ba7c4e6c4	Avoid quadratic slowdown in regexp match/split functions. regexp_matches, regexp_split_to_table and regexp_split_to_array all work by compiling a list of match positions as character offsets (NOT byte positions) in the source string. Formerly, they then used text_substr to extract the matched text; but in a multi-byte encoding, that counts the characters in the string, and the characters needed to reach the starting byte position, on every call. Accordingly, the performance degraded as the product of the input string length and the number of match positions, such that splitting a string of a few hundred kbytes could take many minutes. Repair by keeping the wide-character copy of the input string available (only in the case where encoding_max_length is not 1) after performing the match operation, and extracting substrings from that instead. This reduces the complexity to being linear in the number of result bytes, discounting the actual regexp match itself (which is not affected by this patch). In passing, remove cleanup using retail pfree() which was obsoleted by commit `ff428cded` (Feb 2008) which made cleanup of SRF multi-call contexts automatic. Also increase (to ~134 million) the maximum number of matches and provide an error message when it is reached. Backpatch all the way because this has been wrong forever. Analysis and patch by me; review by Kaiting Chen. Discussion: https://postgr.es/m/87pnyn55qh.fsf@news-spur.riddles.org.uk see also https://postgr.es/m/87lg996g4r.fsf@news-spur.riddles.org.uk	2018-08-28 11:50:20 +01:00
Tom Lane	48bc1a5252	Make syslogger more robust against failures in opening CSV log files. The previous coding figured it'd be good enough to postpone opening the first CSV log file until we got a message we needed to write there. This is unsafe, though, because if the open fails we end up in infinite recursion trying to report the failure. Instead make the CSV log file management code look as nearly as possible like the longstanding logic for the stderr log file. In particular, open it immediately at postmaster startup (if enabled), or when we get a SIGHUP in which we find that log_destination has been changed to enable CSV logging. It seems OK to fail if a postmaster-start-time open attempt fails, as we've long done for the stderr log file. But we can't die if we fail to open a CSV log file during SIGHUP, so we're still left with a problem. In that case, write any output meant for the CSV log file to the stderr log file. (This will also cover race-condition cases in which backends send CSV log data before or after we have the CSV log file open.) This patch also fixes an ancient oversight that, if CSV logging was turned off during a SIGHUP, we never actually closed the last CSV log file. In passing, remember to reset whereToSendOutput = DestNone during syslogger start, since (unlike all other postmaster children) it's forked before the postmaster has done that. This made for a platform-dependent difference in error reporting behavior between the syslogger and other children: except on Windows, it'd report problems to the original postmaster stderr as well as the normal error log file(s). It's barely possible that that was intentional at some point; but it doesn't seem likely to be desirable in production, and the platform dependency definitely isn't desirable. Per report from Alexander Kukushkin. It's been like this for a long time, so back-patch to all supported branches. 
Discussion: https://postgr.es/m/CAFh8B==iLUD_gqC-dAENS0V+kVrCeGiKujtKqSQ7++S-caaChw@mail.gmail.com	2018-08-26 14:21:55 -04:00
Andrew Gierth	6c5ed68363	Reduce an unnecessary O(N^3) loop in lexer. The lexer's handling of operators contained an O(N^3) hazard when dealing with long strings of + or - characters; it seems hard to prevent this case from being O(N^2), but the additional N multiplier was not needed. Backpatch all the way since this has been there since 7.x, and it presents at least a mild hazard in that trying to do Bind, PREPARE or EXPLAIN on a hostile query could take excessive time (without honouring cancels or timeouts) even if the query was never executed.	2018-08-23 21:33:38 +01:00
Michael Paquier	788ae09f4a	Fix set of NLS translation issues While monitoring the code, a couple of issues related to string translation have shown up: - Some routines for auto-updatable views return an error string, which sometimes missed the shot. A comment regarding string translation is added for each routine to help with future features. - GSSAPI authentication missed two translations. Reported-by: Kyotaro Horiguchi Author: Kyotaro Horiguchi Reviewed-by: Michael Paquier, Tom Lane Discussion: https://postgr.es/m/20180810.152131.31921918.horiguchi.kyotaro@lab.ntt.co.jp Backpatch-through: 9.3	2018-08-21 15:18:24 +09:00
Tom Lane	a4fdcceaba	Ensure schema qualification in pg_restore DISABLE/ENABLE TRIGGER commands. Previously, this code blindly followed the common coding pattern of passing PQserverVersion(AH->connection) as the server-version parameter of fmtQualifiedId. That works as long as we have a connection; but in pg_restore with text output, we don't. Instead we got a zero from PQserverVersion, which fmtQualifiedId interpreted as "server is too old to have schemas", and so the name went unqualified. That still accidentally managed to work in many cases, which is probably why this ancient bug went undetected for so long. It only became obvious in the wake of the changes to force dump/restore to execute with restricted search_path. In HEAD/v11, let's deal with this by ripping out fmtQualifiedId's server- version behavioral dependency, and just making it schema-qualify all the time. We no longer support pg_dump from servers old enough to need the ability to omit schema name, let alone restoring to them. (Also, the few callers outside pg_dump already didn't work with pre-schema servers.) In older branches, that's not an acceptable solution, so instead just tweak the DISABLE/ENABLE TRIGGER logic to ensure it will schema-qualify its output regardless of server version. Per bug #15338 from Oleg somebody. Back-patch to all supported branches. Discussion: https://postgr.es/m/153452458706.1316.5328079417086507743@wrigleys.postgresql.org	2018-08-17 17:12:21 -04:00
Andrew Gierth	3cf3a65cb7	Set scan direction appropriately for SubPlans (bug #15336 ) When executing a SubPlan in an expression, the EState's direction field was left alone, resulting in an attempt to execute the subplan backwards if it was encountered during a backwards scan of a cursor. Also, though much less likely, it was possible to reach the execution of an InitPlan while in backwards-scan state. Repair by saving/restoring estate->es_direction and forcing forward scan mode in the relevant places. Backpatch all the way, since this has been broken since 8.3 (prior to commit `c7ff7663e`, SubPlans had their own EStates rather than sharing the parent plan's, so there was no confusion over scan direction). Per bug #15336 reported by Vladimir Baranoff; analysis and patch by me, review by Tom Lane. Discussion: https://postgr.es/m/153449812167.1304.1741624125628126322@wrigleys.postgresql.org	2018-08-17 16:23:56 +01:00
Tomas Vondra	ef1ac5b2ad	Close the file descriptor in ApplyLogicalMappingFile The function was forgetting to close the file descriptor, resulting in failures like this: ERROR: 53000: exceeded maxAllocatedDescs (492) while trying to open file "pg_logical/mappings/map-4000-4eb-1_60DE1E08-5376b5-537c6b" LOCATION: OpenTransientFile, fd.c:2161 Simply close the file at the end, and backpatch to 9.4 (where logical decoding was introduced). While at it, fix a nearby typo. Discussion: https://www.postgresql.org/message-id/flat/738a590a-2ce5-9394-2bef-7b1caad89b37%402ndquadrant.com	2018-08-16 16:51:00 +02:00
Tom Lane	27c4b0899c	Make snprintf.c follow the C99 standard for snprintf's result value. C99 says that the result should be the number of bytes that would have been emitted given a large enough buffer, not the number we actually were able to put in the buffer. It's time to make our substitute implementation comply with that. Not doing so results in inefficiency in buffer-enlargement cases, and also poses a portability hazard for third-party code that might expect C99-compliant snprintf behavior within Postgres. In passing, remove useless tests for str == NULL; neither C99 nor predecessor standards ever allowed that except when count == 0, so I see no reason to expend cycles on making that a non-crash case for this implementation. Also, don't waste a byte in pg_vfprintf's local I/O buffer; this might have performance benefits by allowing aligned writes during flushbuffer calls. Back-patch of commit `805889d7d`. There was some concern about this possibly breaking code that assumes pre-C99 behavior, but there is much more risk (and reality, in our own code) of code that assumes C99 behavior and hence fails to detect buffer overrun without this. Discussion: https://postgr.es/m/17245.1534289329@sss.pgh.pa.us	2018-08-15 17:25:24 -04:00
Tom Lane	d371efb39c	Clean up assorted misuses of snprintf()'s result value. Fix a small number of places that were testing the result of snprintf() but doing so incorrectly. The right test for buffer overrun, per C99, is "result >= bufsize" not "result > bufsize". Some places were also checking for failure with "result == -1", but the standard only says that a negative value is delivered on failure. (Note that this only makes these places correct if snprintf() delivers C99-compliant results. But at least now these places are consistent with all the other places where we assume that.) Also, make psql_start_test() and isolation_start_test() check for buffer overrun while constructing their shell commands. There seems like a higher risk of overrun, with more severe consequences, here than there is for the individual file paths that are made elsewhere in the same functions, so this seemed like a worthwhile change. Also fix guc.c's do_serialize() to initialize errno = 0 before calling vsnprintf. In principle, this should be unnecessary because vsnprintf should have set errno if it returns a failure indication ... but the other two places this coding pattern is cribbed from don't assume that, so let's be consistent. These errors are all very old, so back-patch as appropriate. I think that only the shell command overrun cases are even theoretically reachable in practice, but there's not much point in erroneous error checks. Discussion: https://postgr.es/m/17245.1534289329@sss.pgh.pa.us	2018-08-15 16:29:32 -04:00
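The overrun test described in the entry above (`d371efb39c`) can be sketched like this, assuming a C99-compliant `snprintf()`. The helper `build_path` is a hypothetical name used only for illustration; it is not a function from the tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper: build "dir/file" in buf.  Returns false on an
 * snprintf failure (negative result) or if the output did not fit.
 */
static bool
build_path(char *buf, size_t bufsize, const char *dir, const char *file)
{
	int			result;

	assert(bufsize > 0);
	result = snprintf(buf, bufsize, "%s/%s", dir, file);

	if (result < 0)
		return false;			/* failure: any negative value, not just -1 */
	if ((size_t) result >= bufsize)
		return false;			/* truncated: no room for the terminating NUL */
	return true;
}
```

The boundary case is the one the commit message singles out: when the formatted string is exactly `bufsize` characters long, `result == bufsize`, so a test written as `result > bufsize` would miss the truncation.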
Heikki Linnakangas	d5a9b706ea	Don't run atexit callbacks in quickdie signal handlers. exit() is not async-signal safe. Even if the libc implementation is, 3rd party libraries might have installed unsafe atexit() callbacks. After receiving SIGQUIT, we really just want to exit as quickly as possible, so we don't really want to run the atexit() callbacks anyway. The original report by Jimmy Yih was a self-deadlock in startup_die(). However, this patch doesn't address that scenario; the signal handling while waiting for the startup packet is more complicated. But at least this alleviates similar problems in the SIGQUIT handlers, like that reported by Asim R P later in the same thread. Backpatch to 9.3 (all supported versions). Discussion: https://www.postgresql.org/message-id/CAOMx_OAuRUHiAuCg2YgicZLzPVv5d9_H4KrL_OFsFP%3DVPekigA%40mail.gmail.com	2018-08-08 19:10:38 +03:00
Tom Lane	33c5d3bf85	Don't record FDW user mappings as members of extensions. CreateUserMapping has a recordDependencyOnCurrentExtension call that's been there since extensions were introduced (very possibly my fault). However, there's no support anywhere else for user mappings as members of extensions, nor are they listed as a possible member object type in the documentation. Nor does it really seem like a good idea for user mappings to belong to extensions when roles don't. Hence, remove the bogus call. (As we saw in bug #15310, the lack of any pg_dump support for this case ensures that any such membership record would silently disappear during pg_upgrade. So there's probably no need for us to do anything else about cleaning up after this mistake.) Discussion: https://postgr.es/m/27952.1533667213@sss.pgh.pa.us	2018-08-07 16:33:12 -04:00
Tom Lane	753051cc72	Fix incorrect initialization of BackendActivityBuffer. Since commit `c8e8b5a6e`, this has been zeroed out using the wrong length. In practice the length would always be too small, leading to not zeroing the whole buffer rather than clobbering additional memory; and that's pretty harmless, both because shmem would likely start out as zeroes and because we'd reinitialize any given entry before use. Still, it's bogus, so fix it. Reported by Petru-Florin Mihancea (bug #15312) Discussion: https://postgr.es/m/153363913073.1303.6518849192351268091@wrigleys.postgresql.org	2018-08-07 16:01:14 -04:00
Tom Lane	fb4e0e8960	Fix pg_upgrade to handle event triggers in extensions correctly. pg_dump with --binary-upgrade must emit ALTER EXTENSION ADD commands for all objects that are members of extensions. It forgot to do so for event triggers, as per bug #15310 from Nick Barnes. Back-patch to 9.3 where event triggers were introduced. Haribabu Kommi Discussion: https://postgr.es/m/153360083872.1395.4593932457718151600@wrigleys.postgresql.org	2018-08-07 15:43:49 -04:00
Tom Lane	abd04e0dd8	Ensure pg_dump_sort.c sorts null vs non-null namespace consistently. The original coding here (which is, I believe, my fault) supposed that it didn't need to concern itself with the possibility that one object of a given type-priority has a namespace while another doesn't. But that's not reliably true anymore, if it ever was; and if it does happen then it's possible that DOTypeNameCompare returns self-inconsistent comparison results. That leads to unspecified behavior in qsort() and a resultant weird output order from pg_dump. This should end up being only a cosmetic problem, because any ordering constraints that actually matter should be enforced by the later dependency-based sort. Still, it's a bug, so back-patch. Report and fix by Jacob Champion, though I editorialized on his patch to the extent of making NULL sort after non-NULL, for consistency with our usual sorting definitions. Discussion: https://postgr.es/m/CABAq_6Hw+V-Kj7PNfD5tgOaWT_-qaYkc+SRmJkPLeUjYXLdxwQ@mail.gmail.com	2018-08-07 13:13:42 -04:00
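A self-consistent comparison of the kind the entry above (`abd04e0dd8`) calls for might look like the sketch below. `compare_namespace` is a hypothetical stand-in, not the code in pg_dump_sort.c; it only illustrates sorting NULL after non-NULL so that the comparator's results never contradict each other.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical comparator for optional namespace names.  NULL sorts after
 * any non-NULL name, and the result is antisymmetric, which qsort() needs:
 * cmp(a, b) and cmp(b, a) always have opposite signs (or are both zero).
 */
static int
compare_namespace(const char *a, const char *b)
{
	if (a != NULL && b != NULL)
		return strcmp(a, b);
	if (a == NULL && b == NULL)
		return 0;
	return (a == NULL) ? 1 : -1;
}
```

Handling the mixed case explicitly is the point: a comparator that skips the namespace test whenever either side is NULL can report results that are not transitive, and `qsort()` behavior is then unspecified.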
Tom Lane	895fb6e2e2	Stamp 9.4.19.	2018-08-06 16:11:24 -04:00
Peter Eisentraut	1af8bbe9ab	Translation updates Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: a69444e557347b0db244ac3d3ea2bf602a82227f	2018-08-06 19:31:39 +02:00
Tom Lane	6de9766b8d	Fix failure to reset libpq's state fully between connection attempts. The logic in PQconnectPoll() did not take care to ensure that all of a PGconn's internal state variables were reset before trying a new connection attempt. If we got far enough in the connection sequence to have changed any of these variables, and then decided to try a new server address or server name, the new connection might be completed with some state that really only applied to the failed connection. While this has assorted bad consequences, the only one that is clearly a security issue is that password_needed didn't get reset, so that if the first server asked for a password and the second didn't, PQconnectionUsedPassword() would return an incorrect result. This could be leveraged by unprivileged users of dblink or postgres_fdw to allow them to use server-side login credentials that they should not be able to use. Other notable problems include the possibility of forcing a v2-protocol connection to a server capable of supporting v3, or overriding "sslmode=prefer" to cause a non-encrypted connection to a server that would have accepted an encrypted one. Those are certainly bugs but it's harder to paint them as security problems in themselves. However, forcing a v2-protocol connection could result in libpq having a wrong idea of the server's standard_conforming_strings setting, which opens the door to SQL-injection attacks. The extent to which that's actually a problem, given the prerequisite that the attacker needs control of the client's connection parameters, is unclear. These problems have existed for a long time, but became more easily exploitable in v10, both because it introduced easy ways to force libpq to abandon a connection attempt at a late stage and then try another one (rather than just giving up), and because it provided an easy way to specify multiple target hosts. 
Fix by rearranging PQconnectPoll's state machine to provide centralized places to reset state properly when moving to a new target host or when dropping and retrying a connection to the same host. Tom Lane, reviewed by Noah Misch. Our thanks to Andrew Krasichkov for finding and reporting the problem. Security: CVE-2018-10915	2018-08-06 10:53:35 -04:00
Michael Paquier	e69a3ac4a3	Reset errno properly before calling write(). `6cb3372` forces errno to ENOSPC when fewer bytes than expected have been written and errno is unset, but it forgot to reset errno before the write() system call, so errno could still hold a value left over from a previous system call. Reported-by: Tom Lane Author: Michael Paquier Reviewed-by: Tom Lane Discussion: https://postgr.es/m/31797.1533326676@sss.pgh.pa.us	2018-08-05 05:32:44 +09:00
Peter Geoghegan	250528cec0	Add table relcache invalidation to index builds. It's necessary to make sure that owning tables have a relcache invalidation prior to advancing the command counter to make newly-entered catalog tuples for the index visible. inval.c must be able to maintain the consistency of the local caches in the event of transaction abort. There is usually only a problem when CREATE INDEX transactions abort, since there is a generic invalidation once we reach index_update_stats(). This bug is of long standing. Problems were made much more likely by the addition of parallel CREATE INDEX (commit `9da0cc3528`), but it is strongly suspected that similar problems can be triggered without involving plan_create_index_workers(). (plan_create_index_workers() triggers a relcache build or rebuild, which previously only happened in rare edge cases.) Author: Peter Geoghegan Reported-By: Luca Ferrari Diagnosed-By: Andres Freund Reviewed-By: Andres Freund Discussion: https://postgr.es/m/CAKoxK+5fVodiCtMsXKV_1YAKXbzwSfp7DgDqUmcUAzeAhf=HEQ@mail.gmail.com Backpatch: 9.3-	2018-08-03 14:44:33 -07:00
Tom Lane	88adf1add2	Further fixes for quoted-list GUC values in pg_dump and ruleutils.c. Commits `742869946` et al turn out to be a couple bricks shy of a load. We were dumping the stored values of GUC_LIST_QUOTE variables as they appear in proconfig or setconfig catalog columns. However, although that quoting rule looks a lot like SQL-identifier double quotes, there are two critical differences: empty strings ("") are legal, and depending on which variable you're considering, values longer than NAMEDATALEN might be valid too. So the current technique fails altogether on empty-string list entries (as reported by Steven Winfield in bug #15248) and it also risks truncating file pathnames during dump/reload of GUC values that are lists of pathnames. To fix, split the stored value without any downcasing or truncation, and then emit each element as a SQL string literal. This is a tad annoying, because we now have three copies of the comma-separated-string splitting logic in varlena.c as well as a fourth one in dumputils.c. (Not to mention the randomly-different-from-those splitting logic in libpq...) I looked at unifying these, but it would be rather a mess unless we're willing to tweak the API definitions of SplitIdentifierString, SplitDirectoriesString, or both. That might be worth doing in future; but it seems pretty unsafe for a back-patched bug fix, so for now accept the duplication. Back-patch to all supported branches, as the previous fix was. Discussion: https://postgr.es/m/7585.1529435872@sss.pgh.pa.us	2018-07-31 13:00:08 -04:00
Tom Lane	addf9e1bd6	Fix pg_dump's failure to dump REPLICA IDENTITY for constraint indexes. pg_dump knew about printing ALTER TABLE ... REPLICA IDENTITY USING INDEX for indexes declared as indexes, but it failed to print that for indexes declared as unique or primary-key constraints. Per report from Achilleas Mantzios. This has been broken since the feature was introduced, AFAICS. Back-patch to 9.4. Discussion: https://postgr.es/m/1e6cc5ad-b84a-7c07-8c08-a4d0c3cdc938@matrix.gatewaynet.com	2018-07-30 12:35:49 -04:00
Noah Misch	8c477a42eb	Document security implications of qualified names. Commit `5770172cb0` documented secure schema usage, and that advice suffices for using unqualified names securely. Document, in typeconv-func primarily, the additional issues that arise with qualified names. Back-patch to 9.3 (all supported versions). Reviewed by Jonathan S. Katz. Discussion: https://postgr.es/m/20180721012446.GA1840594@rfd.leadboat.com	2018-07-28 20:08:34 -07:00
Alexander Korotkov	9c6a676c4c	Fix handling of empty uncompressed posting list pages in GIN. PostgreSQL 9.4 introduced posting list compression in GIN. This feature supports online upgrade, so that after pg_upgrade uncompressed posting lists are compressed on the fly. The underlying code appears to always expect at least one item on an uncompressed posting list page. But there could be completely empty pages, because VACUUM never deletes the leftmost and rightmost pages from posting trees. This commit fixes that. Reported-by: Sivasubramanian Ramasubramanian Discussion: https://postgr.es/m/1531867212836.63354%40amazon.com Author: Sivasubramanian Ramasubramanian, Alexander Korotkov Backpatch-through: 9.4	2018-07-19 21:24:53 +03:00
Heikki Linnakangas	47d51a5e8c	Fix misc typos, mostly in comments. A collection of typos I happened to spot while reading code, as well as grepping for common mistakes. Backpatch to all supported versions, as applicable, to avoid conflicts when backporting other commits in the future.	2018-07-18 16:54:45 +03:00
Tom Lane	6d2d5ab173	Fix inadequate buffer locking in FSM and VM page re-initialization. When reading an existing FSM or VM page that was found to be corrupt by the buffer manager, the code applied PageInit() to reinitialize the page, but did so without any locking. There is thus a hazard that two backends might concurrently do PageInit, which in itself would still be OK, but the slower one might then zero over subsequent data changes applied by the faster one. Even that is unlikely to be fatal; but it's not desirable, so add locking to prevent it. This does not add any locking overhead in the normal code path where the page is OK. It's not immediately obvious that that's safe, but I believe it is, for reasons explained in the added comments. Problem noted by R P Asim. It's been like this for a long time, so back-patch to all supported branches. Discussion: https://postgr.es/m/CANXE4Te4G0TGq6cr0-TvwP0H4BNiK_-hB5gHe8mF+nz0mcYfMQ@mail.gmail.com	2018-07-13 11:53:16 -04:00
Michael Paquier	98e2c298c2	Make logical WAL sender report streaming state appropriately. WAL senders sending logically-decoded data fail to report the "streaming" state properly when starting up; until one extra record is replayed, such WAL senders remain in the "catchup" state, which is inconsistent with their physical counterpart. This can easily be reproduced by, for example, using pg_recvlogical and restarting the upstream server. The TAP tests have been slightly modified to detect the failure and strengthened so that future tests also make sure that a node is in streaming state when waiting for its catchup. Backpatch down to 9.4, where this code was introduced. Reported-by: Sawada Masahiko Author: Simon Riggs, Sawada Masahiko Reviewed-by: Petr Jelinek, Michael Paquier, Vaishnavi Prabakaran Discussion: https://postgr.es/m/CAD21AoB2ZbCCqOx=bgKMcLrAvs1V0ZMqzs7wBTuDySezTGtMZA@mail.gmail.com	2018-07-12 10:20:27 +09:00
Tom Lane	d80ec868fa	Avoid emitting a bogus WAL record when recycling an all-zero btree page. Commit `fafa374f2` caused _bt_getbuf() to possibly emit a WAL record for a page that it was about to recycle. However, it failed to distinguish all-zero pages from dead pages, which is important because only the latter have valid btpo.xact values, or indeed any special space at all. Recycling an all-zero page with XLogStandbyInfoActive() enabled therefore led to an Assert failure, or to emission of a WAL record containing a bogus cutoff XID, which might lead to unnecessary query cancellations on hot standby servers. Per reports from Antonin Houska and 自己. Amit Kapila was first to propose this fix, and Robert Haas, myself, and Kyotaro Horiguchi reviewed it at various times. This is an old bug, so back-patch to all supported branches. Discussion: https://postgr.es/m/2628.1474272158@localhost Discussion: https://postgr.es/m/48875502.f4a0.1635f0c27b0.Coremail.zoulx1982@163.com	2018-07-09 19:26:19 -04:00
Tom Lane	dd4e836748	Prevent accidental linking of system-supplied copies of libpq.so etc. Back-patch commit `dddfc4cb2`, which broke LDFLAGS and related Makefile variables into two parts, one for within-build-tree library references and one for external libraries, to ensure that the order of -L flags has all of the former before all of the latter. This turns out to fix a problem recently noted on buildfarm member peripatus, that we attempted to incorporate code from libpgport.a into a shared library. That will fail on platforms that are sticky about putting non-PIC code into shared libraries. (It's quite surprising we hadn't seen such failures before, since the code in question has been like that for a long time.) I think that peripatus' problem could have been fixed with just a subset of this patch; but since the previous issue of accidentally linking to the wrong copy of a Postgres shlib seems likely to bite people in the field, let's just back-patch the whole change. Now that commit `dddfc4cb2` has survived some beta testing, I'm less afraid to back-patch it than I was at the time. This also fixes undesired inclusion of "-DFRONTEND" in pg_config's CPPFLAGS output (in 9.6 and up) and undesired inclusion of "-L../../src/common" in its LDFLAGS output (in all supported branches). Back-patch to v10 and older branches; this is already in v11. Discussion: https://postgr.es/m/20180704234304.bq2dxispefl65odz@ler-imac.local	2018-07-09 17:23:32 -04:00
Michael Paquier	f352f43d3f	Prevent references to invalid relation pages after fresh promotion If a standby crashes after promotion before having completed its first post-recovery checkpoint, then the minimal recovery point which marks the LSN position where the cluster is able to reach consistency may be set to a position older than the first end-of-recovery checkpoint while all the WAL available should be replayed. This leads to the instance thinking that it contains inconsistent pages, causing a PANIC and a hard instance crash even if all the WAL available has not been replayed for certain sets of records replayed. When in crash recovery, minRecoveryPoint is expected to always be set to InvalidXLogRecPtr, which forces the recovery to replay all the WAL available, so this commit makes sure that the local copy of minRecoveryPoint from the control file is initialized properly and stays as it is while crash recovery is performed. Once switching to archive recovery or if crash recovery finishes, then the local copy minRecoveryPoint can be safely updated. Pavan Deolasee has reported and diagnosed the failure in the first place, and the base fix idea to rely on the local copy of minRecoveryPoint comes from Kyotaro Horiguchi, which has been expanded into a full-fledged patch by me. The test included in this commit has been written by Álvaro Herrera and Pavan Deolasee, which I have modified to make it faster and more reliable with sleep phases. Backpatch down to all supported versions where the bug appears, aka 9.3 which is where the end-of-recovery checkpoint is not run by the startup process anymore. The test gets easily supported down to 10, still it has been tested on all branches. 
Reported-by: Pavan Deolasee Diagnosed-by: Pavan Deolasee Reviewed-by: Pavan Deolasee, Kyotaro Horiguchi Author: Michael Paquier, Kyotaro Horiguchi, Pavan Deolasee, Álvaro Herrera Discussion: https://postgr.es/m/CABOikdPOewjNL=05K5CbNMxnNtXnQjhTx2F--4p4ruorCjukbA@mail.gmail.com	2018-07-05 10:47:50 +09:00
Andres Freund	8c8c9f37c2	Check for interrupts inside the nbtree page deletion code. When deleting pages the nbtree code has to walk through siblings of a tree node. When those sibling links are corrupted that can lead to endless loops - which are currently not interruptible. This is especially problematic if autovacuum is repeatedly blocked on such indexes, as it can be hard to get out of that situation without resorting to single user mode. Thus add interrupt checks to appropriate places in such loops. Unfortunately in one of the cases it's not easy to do so. Between 9.3 and 9.4 the page deletion (and page split) code changed significantly. Before that change it was significantly less robust against interruptions. Therefore don't backpatch to 9.3. Author: Andres Freund Discussion: https://postgr.es/m/20180627191629.wkunw2qbibnvlz53@alap3.anarazel.de Backpatch: 9.4-	2018-07-04 14:58:26 -07:00
Fujii Masao	62c2fe6446	Improve the performance of relation deletes during recovery. When multiple relations are deleted in the same transaction, the files of those relations are deleted by one call to smgrdounlinkall(), which scans the whole of shared_buffers only once. Previously, however, during recovery smgrdounlink() (not smgrdounlinkall()) was called for each file to delete, which scanned shared_buffers multiple times. This could greatly increase WAL replay time, especially when shared_buffers was large. To alleviate this, this commit changes recovery so that it also calls smgrdounlinkall() only once to delete multiple relation files. This is just a fix for an oversight in commit `279628a0a7`, not a new feature, so, per discussion on pgsql-hackers, we concluded to backpatch it to all supported versions. Author: Fujii Masao Reviewed-by: Michael Paquier, Andres Freund, Thomas Munro, Kyotaro Horiguchi, Takayuki Tsunakawa Discussion: https://postgr.es/m/CAHGQGwHVQkdfDqtvGVkty+19cQakAydXn1etGND3X0PHbZ3+6w@mail.gmail.com	2018-07-05 02:46:44 +09:00
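The cost difference described in the commit above can be illustrated with a toy model. This is a sketch in Python, not PostgreSQL code: the buffer pool is assumed to be a plain list of relation ids, and `drop_one_by_one` and `drop_batched` are hypothetical stand-ins for per-relation smgrdounlink() calls versus a single smgrdounlinkall() call.

```python
def drop_one_by_one(buffers, relations):
    """Scan the whole pool once per dropped relation.

    Returns (remaining_buffers, number_of_full_scans).
    """
    scans = 0
    for rel in relations:
        # Each relation triggers its own pass over every buffer.
        buffers = [b for b in buffers if b != rel]
        scans += 1
    return buffers, scans


def drop_batched(buffers, relations):
    """Scan the pool once, discarding pages of every relation in the batch.

    Returns (remaining_buffers, number_of_full_scans).
    """
    doomed = set(relations)
    remaining = [b for b in buffers if b not in doomed]
    return remaining, 1
```

Both functions leave the same buffers behind; only the number of full passes over the pool differs, and that is what matters when the pool is very large.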
Peter Eisentraut	2a4dca9491	Fix libpq example programs When these programs call pg_catalog.set_config, they need to check for PGRES_TUPLES_OK instead of PGRES_COMMAND_OK. Fix for `5770172cb0`. Reported-by: Ideriha, Takeshi <ideriha.takeshi@jp.fujitsu.com>	2018-07-01 14:09:22 +02:00
Alvaro Herrera	962313558f	Fix "base" snapshot handling in logical decoding. Two closely related bugs are fixed. First, xmin of logical slots was advanced too early. During xl_running_xacts processing, xmin of the slot was set to the oldest running xid in the record, but that's wrong: actually, snapshots which will be used for not-yet-replayed transactions might consider older txns as running too, so we need to keep xmin back for them. The problem wasn't noticed earlier because DDL that allows a tuple to be deleted (xmax set) while another not-yet-committed transaction looks at it is pretty rare, if not unique: e.g. all forms of ALTER TABLE which change schema acquire ACCESS EXCLUSIVE lock conflicting with any inserts. The included test case (test_decoding's oldest_xmin) uses ALTER of a composite type, which doesn't have such interlocking. To deal with this, we must be able to quickly retrieve oldest xmin (oldest running xid among all assigned snapshots) from ReorderBuffer. To fix, add another list of ReorderBufferTXNs to the reorderbuffer, where transactions are sorted by base-snapshot-LSN. This is slightly different from the existing (sorted by first-LSN) list, because a transaction can have an earlier LSN but a later Xmin, if its first record does not obtain an xmin (eg. xl_xact_assignment). Note this new list doesn't fully replace the existing txn list: we still need that one to prevent WAL recycling. The second issue concerns SnapBuilder snapshots and subtransactions. SnapBuildDistributeNewCatalogSnapshot never assigned a snapshot to a transaction that is known to be a subtxn, which is good in the common case that the top-level transaction already has one (no point in doing so), but a bug otherwise. To fix, arrange to transfer the snapshot from the subtxn to its top-level txn as soon as the kinship gets known. test_decoding's snapshot_transfer verifies this. 
Also, fix a minor memory leak: refcount of toplevel's old base snapshot was not decremented when the snapshot is transferred from child. Liberally sprinkle code comments, and rewrite a few existing ones. This part is my (Álvaro's) contribution to this commit, as I had to write all those comments in order to understand the existing code and Arseny's patch. Reported-by: Arseny Sher <a.sher@postgrespro.ru> Diagnosed-by: Arseny Sher <a.sher@postgrespro.ru> Co-authored-by: Arseny Sher <a.sher@postgrespro.ru> Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Antonin Houska <ah@cybertec.at> Discussion: https://postgr.es/m/87lgdyz1wj.fsf@ars-thinkpad	2018-06-26 16:38:34 -04:00
Thomas Munro	db05d0b906	Add PGTYPESchar_free() to avoid cross-module problems on Windows. On Windows, it is sometimes important for corresponding malloc() and free() calls to be made from the same DLL, since some build options can result in multiple allocators being active at the same time. For that reason we already provided PQfreemem(). This commit adds a similar function for freeing string results allocated by the pgtypes library. Author: Takayuki Tsunakawa Reviewed-by: Kyotaro Horiguchi Discussion: https://postgr.es/m/0A3221C70F24FB45833433255569204D1F8AD5D6%40G01JPEXMBYT05	2018-06-26 23:21:39 +12:00
Thomas Munro	c4ccbcc1a2	Move RecoveryLockList into a hash table. Standbys frequently need to release all locks held by a given xid. Instead of searching one big list linearly, let's create one list per xid and put them in a hash table, so we can find what we need in O(1) time. Earlier analysis and a prototype were done by David Rowley, though this isn't his patch. Back-patch all the way. Author: Thomas Munro Diagnosed-by: David Rowley, Andres Freund Reviewed-by: Andres Freund, Tom Lane, Robert Haas Discussion: https://postgr.es/m/CAEepm%3D1mL0KiQ2KJ4yuPpLGX94a4Ns_W6TL4EGRouxWibu56pA%40mail.gmail.com Discussion: https://postgr.es/m/CAKJS1f9vJ841HY%3DwonnLVbfkTWGYWdPN72VMxnArcGCjF3SywA%40mail.gmail.com	2018-06-26 18:23:36 +12:00
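The data-structure change in the commit above (one list per xid, kept in a hash table) can be sketched as follows. This is an illustrative Python model only; the class name `RecoveryLocks` and its methods are hypothetical and do not correspond to PostgreSQL identifiers.

```python
from collections import defaultdict


class RecoveryLocks:
    """Toy model: locks grouped per transaction id in a hash table."""

    def __init__(self):
        # xid -> list of locks held by that xid
        self._by_xid = defaultdict(list)

    def acquire(self, xid, lock):
        """Record that transaction `xid` holds `lock`."""
        self._by_xid[xid].append(lock)

    def release_all(self, xid):
        """Remove and return every lock held by `xid`.

        Finding the list is one hash lookup, instead of a linear
        search through a single list of all locks.
        """
        return self._by_xid.pop(xid, [])
```

Releasing the locks of one xid never touches the entries of other transactions, which is the property the commit relies on for standbys that release locks frequently.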
Michael Paquier	79b5b101f9	Address set of issues with errno handling. System calls mixed up in error code paths cause two issues that several code paths have not handled correctly: 1) For write() calls, the system may sometimes write fewer bytes than requested without setting errno. Some paths were careful enough to consider that case and assumed that errno should be set to ENOSPC; other calls missed that. 2) The errno generated by a system call is overwritten by other system calls that may succeed once an error code path is taken, causing what is reported to the user to be incorrect. This patch uses the brute-force approach of correcting all those code paths. Some refactoring could happen in the future, but this is left as future work, which is not targeted for back-branches anyway. Author: Michael Paquier Reviewed-by: Ashutosh Sharma Discussion: https://postgr.es/m/20180622061535.GD5215@paquier.xyz	2018-06-25 11:22:02 +09:00
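The short-write rule from point 1 of the commit above can be sketched by analogy. This is a Python illustration, not the PostgreSQL C code: in C, write() signals failure through its return value and errno, whereas Python's `os.write` raises `OSError` on failure and otherwise returns the byte count. The helper `checked_write` and its `writer` parameter are hypothetical names introduced here.

```python
import errno
import os


def checked_write(fd, data, writer=os.write):
    """Write `data` to `fd`, treating a short write as out of disk space.

    `writer` is injectable only so that a short write can be simulated.
    """
    written = writer(fd, data)
    if written != len(data):
        # No error was reported, yet not everything was written:
        # assume the cause is lack of space, as the commit does.
        raise OSError(errno.ENOSPC, os.strerror(errno.ENOSPC))
    return written
```

Because the error is raised at the point of detection, later cleanup calls cannot overwrite it, which is the Python counterpart of saving errno before running further system calls (point 2).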
Tom Lane	cd56194d18	Avoid unnecessary use of strncpy in a couple of places in ecpg. Use of strncpy with a length limit based on the source, rather than the destination, is non-idiomatic and draws warnings from gcc 8. Replace with memcpy, which does exactly the same thing in these cases, but with less chance for confusion. Backpatch to all supported branches. Discussion: https://postgr.es/m/21789.1529170195@sss.pgh.pa.us	2018-06-16 14:58:42 -04:00
Andres Freund	817f9f9a8a	Fix bugs in vacuum of shared rels, by keeping their relcache entries current. When vacuum processes a relation it uses the corresponding relcache entry's relfrozenxid / relminmxid as a cutoff for when to remove tuples etc. Unfortunately for nailed relations (i.e. critical system catalogs) bugs could frequently lead to the corresponding relcache entry being stale. This set of bugs could cause actual data corruption as vacuum would potentially not remove the correct row versions, potentially reviving them at a later point. After `699bf7d05c` some corruptions in this vein were prevented, but the additional error checks could also trigger spuriously. Examples of such errors are: ERROR: found xmin ... from before relfrozenxid ... and ERROR: found multixact ... from before relminmxid ... To be caused by this bug the errors have to occur on system catalog tables. The two bugs are: 1) Invalidations for nailed relations were ignored, based on the theory that the relcache entry for such tables doesn't change. Which is largely true, except for fields like relfrozenxid etc. This means that changes to relations vacuumed in other sessions weren't picked up by already existing sessions. Luckily autovacuum doesn't have particularly long-running sessions. 2) For shared and nailed relations, the shared relcache init file was never invalidated while running. That means that for such tables (e.g. pg_authid, pg_database) it's not just already existing sessions that are affected, but even new connections are as well. That explains why the reports usually were about pg_authid et al. To fix 1), revalidate the rd_rel portion of a relcache entry when invalid. This implies a bit of extra complexity to deal with bootstrapping, but it's not too bad. The fix for 2) is simpler, simply always remove both the shared and local init files. 
Author: Andres Freund Reviewed-By: Alvaro Herrera Discussion: https://postgr.es/m/20180525203736.crkbg36muzxrjj5e@alap3.anarazel.de https://postgr.es/m/CAMa1XUhKSJd98JW4o9StWPrfS=11bPgG+_GDMxe25TvUY4Sugg@mail.gmail.com https://postgr.es/m/CAKMFJucqbuoDRfxPDX39WhA3vJyxweRg_zDVXzncr6+5wOguWA@mail.gmail.com https://postgr.es/m/CAGewt-ujGpMLQ09gXcUFMZaZsGJC98VXHEFbF-tpPB0fB13K+A@mail.gmail.com Backpatch: 9.3-	2018-06-12 11:13:22 -07:00
Alvaro Herrera	5970bfb04e	Fix function code in error report This bug causes a lseek() failure to be reported as a "could not open" failure in the error message, muddling bug reports. I introduced this copy-and-pasteo in commit `78e1220104`. Noticed while reviewing code for bug report #15221, from lily liang. In version 10 the affected function is only used by multixact.c and commit_ts, and only in corner-case circumstances, neither of which are involved in the reported bug (a pg_subtrans failure.) Author: Álvaro Herrera	2018-06-06 14:47:46 -04:00
Tom Lane	98d522a1de	Fix misidentification of SQL statement type in plpgsql's exec_stmt_execsql. To distinguish SQL statements that are INSERT/UPDATE/DELETE from other ones, exec_stmt_execsql looked at the post-rewrite form of the statement rather than the original. This is problematic because it did that only during first execution of the statement (in a session), but the correct answer could change later due to addition or removal of DO INSTEAD rules during the session. That could lead to an Assert failure, as reported by Tushar Ahuja and Robert Haas. In non-assert builds, there's a hazard that we would fail to enforce STRICT behavior when we'd be expected to. That would happen if an initially present DO INSTEAD, that replaced the original statement with one of a different type, were removed; after that the statement should act "normally", including strictness enforcement, but it didn't. (The converse case of enforcing strictness when we shouldn't doesn't seem to be a hazard, as addition of a DO INSTEAD that changes the statement type would always lead to acting as though the statement returned zero rows, so that the strictness error could not fire.) To fix, inspect the original form of the statement not the post-rewrite form, making it valid to assume the answer can't change intra-session. This should lead to the same answer in every case except when there is a DO INSTEAD that changes the statement type; we will now set mod_stmt=true anyway, while we would not have done so before. That breaks the Assert in the SPI_OK_REWRITTEN code path, which expected the latter behavior. It might be all right to assert mod_stmt rather than !mod_stmt there, but I'm not entirely convinced that that'd always hold, so just remove the assertion altogether. This has been broken for a long time, so back-patch to all supported branches. Discussion: https://postgr.es/m/CA+TgmoZUrRN4xvZe_BbBn_Xp0BDwuMEue-0OyF0fJpfvU2Yc7Q@mail.gmail.com	2018-05-25 14:31:07 -04:00
Tom Lane	8f2143bc8f	Properly schema-qualify additional object types in getObjectDescription(). Collations, conversions, extended statistics objects (in >= v10), and all four types of text search objects have schema-qualified names. getObjectDescription() ignored that and would emit just the base name of the object, potentially producing wrong or at least highly misleading output. Fix it to add the schema name whenever the object is not "visible" in the current search path, as is the rule for other schema-qualifiable object types. Although in common situations the output won't change, this seems to me (tgl) to be a bug worthy of back-patching, hence do so. Kyotaro Horiguchi, per a complaint from me Discussion: https://postgr.es/m/20180522.182020.114074746.horiguchi.kyotaro@lab.ntt.co.jp	2018-05-24 12:07:41 -04:00
Tom Lane	09fb2d5d3b	Fix simple_prompt() to disable echo on Windows when stdin != terminal. If echo = false, simple_prompt() is supposed to prevent echoing the input (for password input). However, the Windows implementation applied the mode change to STD_INPUT_HANDLE. That would not have the desired effect if stdin isn't actually the terminal, for instance if the user is piping something into psql. Fix it to apply the mode change to the correct input file, so that passwords do not echo in such cases. In passing, shorten and de-uglify this code by using #elif rather than an #if nest and removing some duplicated code. Back-patch to all supported versions. To simplify that, also back-patch the portions of commit `9daec77e1` that got rid of an unnecessary malloc/free in the same area. Matthew Stickney (cosmetic changes by me) Discussion: https://postgr.es/m/502a1fff-862b-da52-1031-f68df6ed5a2d@gmail.com	2018-05-23 19:04:34 -04:00
Tom Lane	d25714d0a3	Widen COPY FROM's current-line-number counter from 32 to 64 bits. Because the code for the HEADER option skips a line when this counter is zero, a very long COPY FROM WITH HEADER operation would drop a line every 2^32 lines. A lesser but still unfortunate problem is that errors would show a wrong input line number for errors occurring beyond the 2^31'st input line. While such large input streams seemed impractical when this code was first written, they're not any more. Widening the counter (and some associated variables) to uint64 should be enough to prevent problems for the foreseeable future. David Rowley Discussion: https://postgr.es/m/CAKJS1f88yh-6wwEfO6QLEEvH3BEugOq2QX1TOja0vCauoynmOQ@mail.gmail.com	2018-05-22 13:32:52 -04:00
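The arithmetic behind the dropped lines in the commit above can be checked with a toy model. This Python sketch assumes, as the commit message states, that a line is skipped as the header whenever the line counter reads zero; the function names are hypothetical, and a narrow counter width is used for the loop so that it finishes quickly.

```python
def simulate_skips(total_lines, counter_bits):
    """Step a wrapping line counter and count lines seen at counter == 0."""
    mask = (1 << counter_bits) - 1
    counter = 0
    skipped = 0
    for _ in range(total_lines):
        if counter == 0:
            # The HEADER logic treats this line as the header and skips it.
            skipped += 1
        counter = (counter + 1) & mask  # wraps to zero on overflow
    return skipped


def header_skips(total_lines, counter_bits):
    """Closed form of simulate_skips: one skip per started 2**bits lines."""
    modulus = 1 << counter_bits
    return (total_lines + modulus - 1) // modulus
```

Any result above 1 means data lines were discarded along with the genuine header. In this model `header_skips(2**32 + 1, 32)` gives 2, while the same input with a 64-bit counter gives 1.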
Andrew Gierth	769e6fcd1a	Fix SQL:2008 FETCH FIRST syntax to allow parameters. OFFSET <x> ROWS FETCH FIRST <y> ROWS ONLY syntax is supposed to accept <simple value specification>, which includes parameters as well as literals. When this syntax was added all those years ago, it was done inconsistently, with <x> and <y> being different subsets of the standard syntax. Rectify that by making <x> and <y> accept the same thing, and allowing either a (signed) numeric literal or a c_expr there, which allows for parameters, variables, and parenthesized arbitrary expressions. Per bug #15200 from Lukas Eder. Backpatch all the way, since this has been broken from the start. Discussion: https://postgr.es/m/877enz476l.fsf@news-spur.riddles.org.uk Discussion: http://postgr.es/m/152647780335.27204.16895288237122418685@wrigleys.postgresql.org	2018-05-21 17:32:29 +01:00
Tom Lane	5517367e97	Fix unsafe usage of strerror(errno) within ereport(). This is the converse of the unsafe-usage-of-%m problem: the reason ereport/elog provide that format code is mainly to dodge the hazard of errno getting changed before control reaches functions within the arguments of the macro. I only found one instance of this hazard, but it's been there since 9.4 :-(.	2018-05-21 00:32:52 -04:00
Tom Lane	e52cabff70	printf("%lf") is not portable, so omit the "l". The "l" (ell) width spec means something in the corresponding scanf usage, but not here. While modern POSIX says that applying "l" to "f" and other floating format specs is a no-op, SUSv2 says it's undefined. Buildfarm experience says that some old compilers emit warnings about it, and at least one old stdio implementation (mingw's "ANSI" option) actually produces wrong answers and/or crashes. Discussion: https://postgr.es/m/21670.1526769114@sss.pgh.pa.us Discussion: https://postgr.es/m/c085e1da-0d64-1c15-242d-c921f32e0d5c@dunslane.net	2018-05-20 11:40:54 -04:00
Tom Lane	8109f201da	Support platforms where strtoll/strtoull are spelled __strtoll/__strtoull. Ancient HPUX, for one, does this. We hadn't noticed due to the lack of regression tests that required a working strtoll. (I was slightly tempted to remove the other historical spelling, strto[u]q, since it seems we have no buildfarm members testing that case. But I refrained.) Discussion: https://postgr.es/m/151935568942.1461.14623890240535309745@wrigleys.postgresql.org	2018-05-19 14:22:19 -04:00
Tom Lane	023aa76e19	Arrange to supply declarations for strtoll/strtoull if needed. Buildfarm member dromedary is still unhappy about the recently-added ecpg "long long" tests. The reason turns out to be that it includes "-ansi" in its CFLAGS, and in their infinite wisdom Apple have decided to hide the declarations of strtoll/strtoull in C89-compliant builds. (I find it pretty curious that they hide those function declarations when you can nonetheless declare a "long long" variable, but anyway that is their behavior, both on dromedary's obsolete macOS version and the newest and shiniest.) As a result, gcc assumes these functions return "int", leading naturally to wrong results. (Looking at dromedary's past build results, it's evident that this problem also breaks pg_strtouint64() on 32-bit platforms; but we evidently have no regression tests that exercise that function with values above 32 bits.) To fix, supply declarations for these functions when the platform provides the functions but not the declarations, using the same type of mechanism as we use for some other similar cases. Discussion: https://postgr.es/m/151935568942.1461.14623890240535309745@wrigleys.postgresql.org	2018-05-18 22:42:10 -04:00
Tom Lane	54ae787ca7	Hot-fix ecpg regression test for missing ecpg_config.h inclusion. I don't think this is really the best long-term answer, and in particular it doesn't fix the pre-existing hazard in sqltypes.h. But for the moment let's just try to make the buildfarm green again. Discussion: https://postgr.es/m/151935568942.1461.14623890240535309745@wrigleys.postgresql.org	2018-05-18 19:04:11 -04:00
Tom Lane	e75c832b29	Add some test coverage for ecpg's "long long" support. This will only actually exercise the "long long" code paths on platforms where "long" is 32 bits --- otherwise, the SQL bigint type maps to plain "long", and we will test that code path instead. But that's probably sufficient coverage, and anyway we weren't testing either code path before. Dang Minh Huong, tweaked a bit by me Discussion: https://postgr.es/m/151935568942.1461.14623890240535309745@wrigleys.postgresql.org	2018-05-18 13:04:59 -04:00
Tom Lane	385f4acbf8	Recognize that MSVC can support strtoll() and strtoull(). This is needed for full support of "long long" variables in ecpg, but the previous patch for bug #15080 (commits `51057feaa` et al) missed it. In MSVC versions where the functions don't exist under those names, we can nonetheless use _strtoi64() and _strtoui64(). Like the previous patch, back-patch all the way. Dang Minh Huong Discussion: https://postgr.es/m/151935568942.1461.14623890240535309745@wrigleys.postgresql.org	2018-05-18 12:52:28 -04:00
Magnus Hagander	b5f096d50b	Fix error message on short read of pg_control Instead of saying "error: success", indicate that we got a working read but it was too short.	2018-05-18 17:53:19 +02:00
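The short-read case in `b5f096d50b` can be illustrated with a minimal C sketch. This is not the code from the commit: the helper name `describe_read_result` and the message wording are hypothetical. It only shows why a read that returns fewer bytes than requested must not be reported through errno, which was never set and so prints as "success".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from the PostgreSQL source): turn the outcome of
 * a read() call into a message.  nread is read()'s return value, expected is
 * the number of bytes requested, and saved_errno is errno as captured right
 * after the call.  Returns 0 if the read was complete, -1 otherwise.
 */
static int
describe_read_result(long nread, size_t expected, int saved_errno,
                     char *msg, size_t msglen)
{
    assert(msg != NULL && msglen > 0);

    if (nread < 0)
    {
        /* Genuine failure: errno is meaningful here. */
        snprintf(msg, msglen, "could not read file: %s",
                 strerror(saved_errno));
        return -1;
    }
    if ((size_t) nread != expected)
    {
        /* Short read: errno was not set, so do not consult it. */
        snprintf(msg, msglen, "could not read file: read %ld of %zu",
                 nread, expected);
        return -1;
    }

    /* Complete read: nothing to report. */
    msg[0] = '\0';
    return 0;
}
```

A caller would pass the return value of read() together with errno saved immediately afterwards, and print the message only when the helper returns -1.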
Tom Lane	62e0020ad4	Fix misprocessing of equivalence classes involving record_eq(). canonicalize_ec_expression() is supposed to agree with coerce_type() as to whether a RelabelType should be inserted to make a subexpression be valid input for the operators of a given opclass. However, it did the wrong thing with named-composite-type inputs to record_eq(): it put in a RelabelType to RECORDOID, which the parser doesn't. In some cases this was harmless because all code paths involving a particular equivalence class did the same thing, but in other cases this would result in failing to recognize a composite-type expression as being a member of an equivalence class that it actually is a member of. The most obvious bad effect was to fail to recognize that an index on a composite column could provide the sort order needed for a mergejoin on that column, as reported by Teodor Sigaev. I think there might be other, subtler, cases that result in misoptimization. It also seems possible that an unwanted RelabelType would sometimes get into an emitted plan --- but because record_eq and friends don't examine the declared type of their input expressions, that would not create any visible problems. To fix, just treat RECORDOID as if it were a polymorphic type, which in some sense it is. We might want to consider formalizing that a bit more someday, but for the moment this seems to be the only place where an IsPolymorphicType() test ought to include RECORDOID as well. This has been broken for a long time, so back-patch to all supported branches. Discussion: https://postgr.es/m/a6b22369-e3bf-4d49-f59d-0c41d3551e81@sigaev.ru	2018-05-16 13:46:09 -04:00
Tom Lane	32453bc5af	Update time zone data files to tzdata release 2018e. DST law changes in North Korea. Redefinition of "daylight savings" in Ireland, as well as for some past years in Namibia and Czechoslovakia. Additional historical corrections for Czechoslovakia. With this change, the IANA database models Irish timekeeping as following "standard time" in summer, and "daylight savings" in winter, so that the daylight savings offset is one hour behind standard time not one hour ahead. This does not change their UTC offset (+1:00 in summer, 0:00 in winter) nor their timezone abbreviations (IST in summer, GMT in winter), though now "IST" is more correctly read as "Irish Standard Time" not "Irish Summer Time". However, the "is_dst" column in the pg_timezone_names view will now be true in winter and false in summer for the Europe/Dublin zone. Similar changes were made for Namibia between 1994 and 2017, and for Czechoslovakia between 1946 and 1947. So far as I can find, no Postgres internal logic cares about which way tm_isdst is reported; in particular, since commit `b2cbced9e` we do not rely on it to decide how to interpret ambiguous timestamps during DST transitions. So I don't think this change will affect any Postgres behavior other than the timezone-view outputs. Discussion: https://postgr.es/m/30996.1525445902@sss.pgh.pa.us	2018-05-09 13:56:00 -04:00
Tom Lane	3d48654017	Improve inefficient regexes in vacuumdb TAP test. The regexes used in 102_vacuumdb_stages.pl to check the postmaster log for expected output contained several places with ".*.*", which is underdetermined and can cause exponential runtime growth in Perl's regex matcher (since it's not bright enough not to waste time seeing whether different splits of the same substring would allow a match). We were fortunate that the amount of text in the postmaster log was generally not enough to make the runtime go to the moon; although commit `6271fceb8` had been on the hairy edge of an obvious problem, thanks to its increasing the default log verbosity to DEBUG1. Experimentation shows that anyone who tried to run this test case with an even higher log verbosity would have been in for serious pain. But even at default logging level, fixing this saves several hundred ms on my workstation, more on slower buildfarm members. Remove the extra ".*"s, restoring more-or-less-linear matching speed. Back-patch to 9.4 where the test case was added, mostly in case anyone tries to do related debugging in a back branch. Discussion: https://postgr.es/m/32459.1525657786@sss.pgh.pa.us	2018-05-08 20:17:43 -04:00
Tom Lane	364998df87	Stamp 9.4.18.	2018-05-07 16:57:35 -04:00
Peter Eisentraut	dc441d5c2d	Translation updates Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: d73a8c239ac494c183e266261c657580526d4cba	2018-05-07 11:47:28 -04:00
Andrew Dunstan	1eb24720c6	Clear severity 5 perlcritic warnings from vcregress.pl My recent update for python3 support used some idioms that are unapproved. This fixes them. Backpatch to all live branches like the original.	2018-05-06 07:40:04 -04:00
Peter Eisentraut	af9e0d5cdf	Tweak tests to support Python 3.7 Python 3.7 removes the trailing comma in the repr() of BaseException (see <https://bugs.python.org/issue30399>), leading to test output differences. Work around that by composing the equivalent test output in a more manual way.	2018-05-05 23:53:05 -04:00
Peter Eisentraut	280cf0fe78	Remove extra newlines after PQerrorMessage()	2018-05-05 10:54:00 -04:00
Heikki Linnakangas	c06380e976	Fix scenario where streaming standby gets stuck at a continuation record. If a continuation record is split so that its first half has already been removed from the master, and is only present in pg_wal, and there is a recycled WAL segment in the standby server that looks like it would contain the second half, recovery would get stuck. The code in XLogPageRead() incorrectly started streaming at the beginning of the WAL record, even if we had already read the first page. Backpatch to 9.4. In principle, older versions have the same problem, but without replication slots, there was no straightforward mechanism to prevent the master from recycling old WAL that was still needed by standby. Without such a mechanism, I think it's reasonable to assume that there's enough slack in how many old segments are kept around to not run into this, or you have a WAL archive. Reported by Jonathon Nelson. Analysis and patch by Kyotaro HORIGUCHI, with some extra comments by me. Discussion: https://www.postgresql.org/message-id/CACJqAM3xVz0JY1XFDKPP%2BJoJAjoGx%3DGNuOAshEDWCext7BFvCQ%40mail.gmail.com	2018-05-05 01:35:18 +03:00
Andrew Dunstan	134db37d21	Provide for testing on python3 modules when under MSVC This should have been done some years ago as promised in commit `c4dcdd0c2`. However, better late than never. Along the way do a little housekeeping, including using a simpler test for the python version being tested, and removing a redundant subroutine parameter. These changes only apply back to release 9.5. Backpatch to all live releases.	2018-05-04 15:51:31 -04:00
Tom Lane	2d123b3104	Sync our copy of the timezone library with IANA release tzcode2018e. The non-cosmetic changes involve teaching the "zic" tzdata compiler about negative DST. While I'm not currently intending that we start using negative-DST data right away, it seems possible that somebody would try to use our copy of zic with bleeding-edge IANA data. So we'd better be out in front of this change code-wise, even though it doesn't matter for the data file we're shipping. Discussion: https://postgr.es/m/30996.1525445902@sss.pgh.pa.us	2018-05-04 12:26:48 -04:00
Teodor Sigaev	6bd659f19c	Add HOLD_INTERRUPTS section into FinishPreparedTransaction. If an interrupt arrives in the middle of FinishPreparedTransaction and any callback decides to call CHECK_FOR_INTERRUPTS (e.g. RemoveTwoPhaseFile can write a warning with ereport, which checks for interrupts) then it's possible to leave the current GXact undeleted. Backpatch to all supported branches. Stas Kelvich Discussion: https://www.postgresql.org/message-id/3AD85097-A3F3-4EBA-99BD-C38EDF8D2949@postgrespro.ru	2018-05-03 20:10:11 +03:00
Tom Lane	70211459a5	Revert back-branch changes in power()'s behavior for NaN inputs. Per discussion, the value of fixing these bugs in the back branches doesn't outweigh the downsides of changing corner-case behavior in a minor release. Hence, revert commits `217d8f3a1` and `4d864de48` in the v10 branch and the corresponding commits in 9.3-9.6. Discussion: https://postgr.es/m/75DB81BEEA95B445AE6D576A0A5C9E936A73E741@BPXM05GP.gisp.nec.co.jp	2018-05-02 17:32:40 -04:00
Tom Lane	8109a3c144	Fix bogus list-iteration code in pg_regress.c, affecting ecpg tests only. While looking at a recent buildfarm failure in the ecpg tests, I wondered why the pg_regress output claimed the stderr part of the test failed, when the regression diffs were clearly for the stdout part. Looking into it, the reason is that pg_regress.c's logic for iterating over three parallel lists is wrong, and has been wrong since it was written: it advances the "tag" pointer at a different place in the loop than the other two pointers. Fix that.	2018-04-29 21:56:28 -04:00
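The lockstep iteration described in `8109a3c144` can be sketched as follows. This is an illustration under stated assumptions, not the pg_regress.c code: the `StrNode` type and the function `first_mismatch_tag` are hypothetical. The point is that all three list pointers advance together in the loop's increment clause, so the tag can never drift out of step with the result/expected pair it describes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical singly linked list of strings. */
typedef struct StrNode
{
    const char *str;
    struct StrNode *next;
} StrNode;

/*
 * Walk three parallel lists (results, expected, tags) in lockstep and return
 * the tag belonging to the first result that differs from its expected
 * value, or NULL if every pair matches.  The tag list may be shorter than
 * the other two; a missing tag is reported as an empty string.
 */
static const char *
first_mismatch_tag(const StrNode *results, const StrNode *expected,
                   const StrNode *tags)
{
    const StrNode *rl;
    const StrNode *el;
    const StrNode *tl;

    for (rl = results, el = expected, tl = tags;
         rl != NULL && el != NULL;
         rl = rl->next, el = el->next, tl = tl ? tl->next : NULL)
    {
        assert(rl->str != NULL && el->str != NULL);

        if (strcmp(rl->str, el->str) != 0)
            return tl ? tl->str : "";
    }
    return NULL;
}
```

Advancing one of the pointers inside the loop body instead, at a different point than the others, is the kind of mistake that makes a stdout difference get reported under the stderr tag.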
Tom Lane	59c2df3ae8	Avoid wrong results for power() with NaN input on more platforms. Buildfarm results show that the modern POSIX rule that 1 ^ NaN = 1 is not honored on *BSD until relatively recently, and really old platforms don't believe that NaN ^ 0 = 1 either. (This is unsurprising, perhaps, since SUSv2 doesn't require either behavior.) In hopes of getting to platform independent behavior, let's deal with all the NaN-input cases explicitly in dpow(). Note that numeric_power() doesn't know either of these special cases. But since that behavior is platform-independent, I think it should be addressed separately, and probably not back-patched. Discussion: https://postgr.es/m/75DB81BEEA95B445AE6D576A0A5C9E936A73E741@BPXM05GP.gisp.nec.co.jp	2018-04-29 18:15:16 -04:00
Tom Lane	37c02b2b0a	Update time zone data files to tzdata release 2018d. DST law changes in Palestine and Antarctica (Casey Station). Historical corrections for Portugal and its colonies, as well as Enderbury, Jamaica, Turks & Caicos Islands, and Uruguay.	2018-04-29 15:50:43 -04:00
Tom Lane	44ccd11cbb	Avoid wrong results for power() with NaN input on some platforms. Per spec, the result of power() should be NaN if either input is NaN. It appears that on some versions of Windows, the libc function does return NaN, but it also sets errno = EDOM, confusing our code that attempts to work around shortcomings of other platforms. Hence, add guard tests to avoid substituting a wrong result for the right one. It's been like this for a long time (and the odd behavior only appears in older MSVC releases, too) so back-patch to all supported branches. Dang Minh Huong, reviewed by David Rowley Discussion: https://postgr.es/m/75DB81BEEA95B445AE6D576A0A5C9E936A73E741@BPXM05GP.gisp.nec.co.jp	2018-04-29 15:21:45 -04:00
Noah Misch	bb532859f4	Correct pg_recvlogical server version test. The predecessor test boiled down to "PQserverVersion(NULL) >= 100000", which is always false. No release includes that, so it could not have reintroduced CVE-2018-1058. Back-patch to 9.4, like the addition of the predecessor in commit `8d2814f274`. Discussion: https://postgr.es/m/20180422215551.GB2676194@rfd.leadboat.com	2018-04-25 18:50:34 -07:00
Tom Lane	58fec95268	Change more places to be less trusting of RestrictInfo.is_pushed_down. On further reflection, commit `e5d83995e` didn't go far enough: pretty much everywhere in the planner that examines a clause's is_pushed_down flag ought to be changed to use the more complicated behavior where we also check the clause's required_relids. Otherwise we could make incorrect decisions about whether, say, a clause is safe to use as a hash clause. Some (many?) of these places are safe as-is, either because they are never reached while considering a parameterized path, or because there are additional checks that would reject a pushed-down clause anyway. However, it seems smarter to just code them all the same way rather than rely on easily-broken reasoning of that sort. In support of that, invent a new macro RINFO_IS_PUSHED_DOWN that should be used in place of direct tests on the is_pushed_down flag. Like the previous patch, back-patch to all supported branches. Discussion: https://postgr.es/m/f8128b11-c5bf-3539-48cd-234178b2314d@proxel.se	2018-04-20 15:19:17 -04:00
Tom Lane	a347d5210e	Fix incorrect handling of join clauses pushed into parameterized paths. In some cases a clause attached to an outer join can be pushed down into the outer join's RHS even though the clause is not degenerate --- this can happen if we choose to make a parameterized path for the RHS. If the clause ends up attached to a lower outer join, we'd misclassify it as being a "join filter" not a plain "filter" condition at that node, leading to wrong query results. To fix, teach extract_actual_join_clauses to examine each join clause's required_relids, not just its is_pushed_down flag. (The latter now seems vestigial, or at least in need of rethinking, but we won't do anything so invasive as redefining it in a bug-fix patch.) This has been wrong since we introduced parameterized paths in 9.2, though it's evidently hard to hit given the lack of previous reports. The test case used here involves a lateral function call, and I think that a lateral reference may be required to get the planner to select a broken plan; though I wouldn't swear to that. In any case, even if LATERAL is needed to trigger the bug, it still affects all supported branches, so back-patch to all. Per report from Andreas Karlsson. Thanks to Andrew Gierth for preliminary investigation. Discussion: https://postgr.es/m/f8128b11-c5bf-3539-48cd-234178b2314d@proxel.se	2018-04-19 15:49:12 -04:00
Alvaro Herrera	e668507d36	Enlarge find_other_exec's meager fgets buffer The buffer was 100 bytes long, which is barely sufficient when the version string gets longer (such as by configure --with-extra-version). Set it to MAXPGPATH. Author: Nikhil Sontakke Discussion: https://postgr.es/m/CAMGcDxfLfpYU_Jru++L6ARPCOyxr0W+2O3Q54TDi5XdYeU36ow@mail.gmail.com	2018-04-19 10:45:15 -03:00
Tom Lane	7490ce725e	Better fix for deadlock hazard in CREATE INDEX CONCURRENTLY. Commit `54eff5311` did not account for the possibility that we'd have a transaction snapshot due to default_transaction_isolation being set high enough to require one. The transaction snapshot is enough to hold back our advertised xmin and thus risk deadlock anyway. The only way to get rid of that snap is to start a new transaction, so let's do that instead. Also throw in an assert checking that we really have gotten to a state where no xmin is being advertised. Back-patch to 9.4, like the previous commit. Discussion: https://postgr.es/m/CAMkU=1ztk3TpQdcUNbxq93pc80FrXUjpDWLGMeVBDx71GHNwZQ@mail.gmail.com	2018-04-18 12:07:38 -04:00
Tom Lane	92b503c48d	Revert "Add temporary debug logging, in 9.4 branch only." This reverts commit `e55380f3b6`. It's served its purpose.	2018-04-18 11:57:37 -04:00
Tom Lane	248c268d5b	Revert "Add more temporary debug logging, in 9.4 branch only." This reverts commit `eef1a609ad`. It's served its purpose.	2018-04-18 11:56:56 -04:00
Tom Lane	eef1a609ad	Add more temporary debug logging, in 9.4 branch only. Last night's results were inconclusive, but after more staring at the code I've thought of some more data to gather. Discussion: https://postgr.es/m/6744.1523833660@sss.pgh.pa.us	2018-04-17 11:26:37 -04:00
Tom Lane	608d1f9711	Fix broken collation-aware searches in SP-GiST text opclass. spg_text_leaf_consistent() supposed that it should compare only Min(querylen, entrylen) bytes of the two strings, and then deal with any excess bytes in one string or the other by assuming the longer string is greater if the prefixes are equal. Quite aside from the fact that that's just wrong in some locales (e.g., 'ch' is not less than 'd' in cs_CZ), it also risked passing incomplete multibyte characters to strcoll(), with ensuing bad results. Instead, just pass the full strings to varstr_cmp, and let it decide what to do about unequal-length strings. Fortunately, this error doesn't imply any index corruption, it's just that searches might return the wrong set of entries. Per report from Emre Hasegeli, though this is not his patch. Thanks to Peter Geoghegan for review and discussion. This code was born broken, so back-patch to all supported branches. In HEAD, I failed to resist the temptation to do a bit of cosmetic cleanup/pgindent'ing on `710d90da1`, too. Discussion: https://postgr.es/m/CAE2gYzzb6K51VnTq5i5p52z+j9p2duEa-K1T3RrC_GQEynAKEg@mail.gmail.com	2018-04-16 16:06:47 -04:00
Tom Lane	e55380f3b6	Add temporary debug logging, in 9.4 branch only. Commit `5ee940e1c` served its purpose by demonstrating that buildfarm member okapi is seeing some sort of locally-visible state mismanagement, not a cross-process data visibility problem as I'd first theorized. Put in some elog(LOG) messages in hopes of gathering more info about exactly what's happening there. Again, this is temporary code to be reverted once we have buildfarm results. Discussion: https://postgr.es/m/6744.1523833660@sss.pgh.pa.us	2018-04-16 13:44:39 -04:00
Tom Lane	fea5bfde16	Revert "Add temporary debugging assertion, in 9.4 branch only." This reverts commit `5ee940e1cd`. Further debugging is needed, but it'll look different than this, so for simplicity revert this first.	2018-04-16 13:23:35 -04:00
Tom Lane	5ee940e1cd	Add temporary debugging assertion, in 9.4 branch only. Buildfarm member okapi has been failing the multiple-cic isolation test for months now, but only in 9.4. To narrow down the possible causes, add an Assert testing that CREATE INDEX CONCURRENTLY is advertising zero xmin before waiting for other transactions to end. I'm not sure that this would hold in general, so this assertion isn't meant to get released, but it passes all 9.4 regression tests for me. Will revert once we see how okapi responds.	2018-04-15 20:23:59 -04:00
Tom Lane	3dd36aa4b3	In libpq, free any partial query result before collecting a server error. We'd throw away the partial result anyway after parsing the error message. Throwing it away beforehand costs nothing and reduces the risk of out-of-memory failure. Also, at least in systems that behave like glibc/Linux, if the partial result was very large then the error PGresult would get allocated at high heap addresses, preventing the heap storage used by the partial result from being released to the OS until the error PGresult is freed. In psql >= 9.6, we hold onto the error PGresult until another error is received (for \errverbose), so that this behavior causes a seeming memory leak to persist for a while, as in a recent complaint from Darafei Praliaskouski. This is a potential performance regression from older versions, justifying back-patching at least that far. But similar behavior may occur in other client applications, so it seems worth just back-patching to all supported branches. Discussion: https://postgr.es/m/CAC8Q8tJ=7cOkPePyAbJE_Pf691t8nDFhJp0KZxHvnq_uicfyVg@mail.gmail.com	2018-04-13 12:53:46 -04:00
Tom Lane	f71d803c8d	Fix bogus affix-merging code. NISortAffixes() compared successive compound affixes incorrectly, thus possibly failing to merge identical affixes, or (less likely) merging ones that shouldn't be merged. The user-visible effects of this are unclear, to me anyway. Per bug #15150 from Alexander Lakhin. It's been broken for a long time, so back-patch to all supported branches. Arthur Zakirov Discussion: https://postgr.es/m/152353327780.31225.13445405496721177988@wrigleys.postgresql.org	2018-04-12 18:39:51 -04:00
Tom Lane	6943fb9275	Ignore nextOid when replaying an ONLINE checkpoint. The nextOid value is from the start of the checkpoint and may well be stale compared to values from more recent XLOG_NEXTOID records. Previously, we adopted it anyway, allowing the OID counter to go backwards during a crash. While this should be harmless, it contributed to the severity of the bug fixed in commit `0408e1ed5`, by allowing duplicate TOAST OIDs to be assigned immediately following a crash. Without this error, that issue would only have arisen when TOAST objects just younger than a multiple of 2^32 OIDs were deleted and then not vacuumed in time to avoid a conflict. Pavan Deolasee Discussion: https://postgr.es/m/CABOikdOgWT2hHkYG3Wwo2cyZJq2zfs1FH0FgX-=h4OLosXHf9w@mail.gmail.com	2018-04-11 18:11:30 -04:00
Tom Lane	5b3ed6b788	Do not select new object OIDs that match recently-dead entries. When selecting a new OID, we take care to avoid picking one that's already in use in the target table, so as not to create duplicates after the OID counter has wrapped around. However, up to now we used SnapshotDirty when scanning for pre-existing entries. That ignores committed-dead rows, so that we could select an OID matching a deleted-but-not-yet-vacuumed row. While that mostly worked, it has two problems: * If recently deleted, the dead row might still be visible to MVCC snapshots, creating a risk for duplicate OIDs when examining the catalogs within our own transaction. Such duplication couldn't be visible outside the object-creating transaction, though, and we've heard few if any field reports corresponding to such a symptom. * When selecting a TOAST OID, deleted toast rows definitely are visible to SnapshotToast, and will remain so until vacuumed away. This leads to a conflict that will manifest in errors like "unexpected chunk number 0 (expected 1) for toast value nnnnn". We've been seeing reports of such errors from the field for years, but the cause was unclear before. The fix is simple: just use SnapshotAny to search for conflicting rows. This results in a slightly longer window before object OIDs can be recycled, but that seems unlikely to create any large problems. Pavan Deolasee Discussion: https://postgr.es/m/CABOikdOgWT2hHkYG3Wwo2cyZJq2zfs1FH0FgX-=h4OLosXHf9w@mail.gmail.com	2018-04-11 17:41:27 -04:00
Heikki Linnakangas	310d1379dd	Make local copy of client hostnames in backend status array. The other strings, application_name and query string, were snapshotted to local memory in pgstat_read_current_status(), but we forgot to do that for client hostnames. As a result, the client hostname would appear to change in the local copy, if the client disconnected. Backpatch to all supported versions. Author: Edmund Horner Reviewed-by: Michael Paquier Discussion: https://www.postgresql.org/message-id/CAMyN-kA7aOJzBmrYFdXcc7Z0NmW%2B5jBaf_m%3D_-77uRNyKC9r%3DA%40mail.gmail.com	2018-04-11 23:40:27 +03:00
Tom Lane	f530af8fc9	Fix incorrect close() call in dsm_impl_mmap(). One improbable error-exit path in this function used close() where it should have used CloseTransientFile(). This is unlikely to be hit in the field, and I think the consequences wouldn't be awful (just an elog(LOG) bleat later). But a bug is a bug, so back-patch to 9.4 where this code came in. Pan Bian Discussion: https://postgr.es/m/152056616579.4966.583293218357089052@wrigleys.postgresql.org	2018-04-10 18:34:40 -04:00
Tom Lane	b7537ffb1a	Fix bogus provolatile/proparallel markings on a few built-in functions. Richard Yen reported that pg_upgrade failed if the target cluster had force_parallel_mode = on, because binary_upgrade_create_empty_extension() is marked parallel restricted, allowing it to be executed in parallel mode, which complains because it tries to acquire an XID. In general, no function that might try to modify database data should be considered parallel safe or restricted, since execution of it might force XID acquisition. We found several other examples of this mistake. Furthermore, functions that execute user-supplied SQL queries or query fragments, or pull data from user-supplied cursors, had better be marked both volatile and parallel unsafe, because we don't know what the supplied query or cursor might try to do. There were several tsquery and XML functions that had the wrong proparallel marking for this, and some of them were even mislabeled as to volatility. All these bugs are old, dating back to 9.6 for the proparallel mistakes and much further for the provolatile mistakes. We can't force a catversion bump in the back branches, but we can at least ensure that installations initdb'd in future have the right values. Thomas Munro and Tom Lane Discussion: https://postgr.es/m/CAEepm=2sNDScSLTfyMYu32Q=ob98ZGW-vM_2oLxinzSABGQ6VA@mail.gmail.com	2018-03-30 18:14:51 -04:00
Tom Lane	4c26965166	Fix make rules that generate multiple output files. For years, our makefiles have correctly observed that "there is no correct way to write a rule that generates two files". However, what we did is to provide empty rules that "generate" the secondary output files from the primary one, and that's not right either. Depending on the details of the creating process, the primary file might end up timestamped later than one or more secondary files, causing subsequent make runs to consider the secondary file(s) out of date. That's harmless in a plain build, since make will just re-execute the empty rule and nothing happens. But it's fatal in a VPATH build, since make will expect the secondary file to be rebuilt in the build directory. This would manifest as "file not found" failures during VPATH builds from tarballs, if we were ever unlucky enough to ship a tarball with apparently out-of-date secondary files. (It's not clear whether that has ever actually happened, but it definitely could.) To ensure that secondary output files have timestamps >= their primary's, change our makefile convention to be that we provide a "touch $@" action not an empty rule. Also, make sure that this rule actually gets invoked during a distprep run, else the hazard remains. It's been like this a long time, so back-patch to all supported branches. In HEAD, I skipped the changes in src/backend/catalog/Makefile, because those rules are due to get replaced soon in the bootstrap data format patch, and there seems no need to create a merge issue for that patch. If for some reason we fail to land that patch in v11, we'll need to back-fill the changes in that one makefile from v10. Discussion: https://postgr.es/m/18556.1521668179@sss.pgh.pa.us	2018-03-23 13:45:38 -04:00
Tom Lane	7f6f8ccd97	Fix tuple counting in SP-GiST index build. Count the number of tuples in the index honestly, instead of assuming that it's the same as the number of tuples in the heap. (It might be different if the index is partial.) Back-patch to all supported versions. Tomas Vondra Discussion: https://postgr.es/m/3b3d8eac-c709-0d25-088e-b98339a1b28a@2ndquadrant.com	2018-03-22 13:23:48 -04:00
Tom Lane	67e02cde73	Fix mishandling of quoted-list GUC values in pg_dump and ruleutils.c. Code that prints out the contents of setconfig or proconfig arrays in SQL format needs to handle GUC_LIST_QUOTE variables differently from other ones, because for those variables, flatten_set_variable_args() already applied a layer of quoting. The value can therefore safely be printed as-is, and indeed must be, or flatten_set_variable_args() will muck it up completely on reload. For all other GUC variables, it's necessary and sufficient to quote the value as a SQL literal. We'd recognized the need for this long ago, but mis-analyzed the need slightly, thinking that all GUC_LIST_INPUT variables needed the special treatment. That's actually wrong, since a valid value of a LIST variable might include characters that need quoting, although no existing variables accept such values. More to the point, we hadn't made any particular effort to keep the various places that deal with this up-to-date with the set of variables that actually need special treatment, meaning that we'd do the wrong thing with, for example, temp_tablespaces values. This affects dumping of SET clauses attached to functions, as well as ALTER DATABASE/ROLE SET commands. In ruleutils.c we can fix it reasonably honestly by exporting a guc.c function that allows discovering the flags for a given GUC variable. But pg_dump doesn't have easy access to that, so continue the old method of having a hard-wired list of affected variable names. At least we can fix it to have just one list not two, and update the list to match current reality. A remaining problem with this is that it only works for built-in GUC variables. pg_dump's list obviously knows nothing of third-party extensions, and even the "ask guc.c" method isn't bulletproof since the relevant extension might not be loaded. There's no obvious solution to that, so for now, we'll just have to discourage extension authors from inventing custom GUCs that need GUC_LIST_QUOTE. This has been busted for a long time, so back-patch to all supported branches. Michael Paquier and Tom Lane, reviewed by Kyotaro Horiguchi and Pavel Stehule Discussion: https://postgr.es/m/20180111064900.GA51030@paquier.xyz	2018-03-21 20:03:28 -04:00
Tom Lane	e1f186da94	Fix some corner-case issues in REFRESH MATERIALIZED VIEW CONCURRENTLY. refresh_by_match_merge() has some issues in the way it builds a SQL query to construct the "diff" table: 1. It doesn't require the selected unique index(es) to be indimmediate. 2. It doesn't pay attention to the particular equality semantics enforced by a given index, but just assumes that they must be those of the column datatype's default btree opclass. 3. It doesn't check that the indexes are btrees. 4. It's insufficiently careful to ensure that the parser will pick the intended operator when parsing the query. (This would have been a security bug before CVE-2018-1058.) 5. It's not careful about indexes on system columns. The way to fix #4 is to make use of the existing code in ri_triggers.c for generating an arbitrary binary operator clause. I chose to move that to ruleutils.c, since that seems a more reasonable place to be exporting such functionality from than ri_triggers.c. While #1, #3, and #5 are just latent given existing feature restrictions, and #2 doesn't arise in the core system for lack of alternate opclasses with different equality behaviors, #4 seems like an issue worth back-patching. That's the bulk of the change anyway, so just back-patch the whole thing to 9.4 where this code was introduced. Discussion: https://postgr.es/m/13836.1521413227@sss.pgh.pa.us	2018-03-19 18:49:53 -04:00
Tom Lane	b6ba94ec45	Fix performance hazard in REFRESH MATERIALIZED VIEW CONCURRENTLY. Jeff Janes discovered that commit `7ca25b7de` made one of the queries run by REFRESH MATERIALIZED VIEW CONCURRENTLY perform badly. The root cause is bad cardinality estimation for correlated quals, but a principled solution to that problem is some way off, especially since the planner lacks any statistics about whole-row variables. Moreover, in non-error cases this query produces no rows, meaning it must be run to completion; but use of LIMIT 1 encourages the planner to pick a fast-start, slow-completion plan, exactly not what we want. Remove the LIMIT clause, and instead rely on the count parameter we pass to SPI_execute() to prevent excess work if the query does return some rows. While we've heard no field reports of planner misbehavior with this query, it could be that people are having performance issues that haven't reached the level of pain needed to cause a bug report. In any case, that LIMIT clause can't possibly do anything helpful with any existing version of the planner, and it demonstrably can cause bad choices in some cases, so back-patch to 9.4 where the code was introduced. Thomas Munro Discussion: https://postgr.es/m/CAMkU=1z-JoGymHneGHar1cru4F1XDfHqJDzxP_CtK5cL3DOfmg@mail.gmail.com	2018-03-19 17:23:07 -04:00
Magnus Hagander	af5fbb1286	Fix pg_recvlogical for pre-10 versions In `e170b8c8`, protection against modified search_path was added. However, PostgreSQL versions prior to 10 do not accept SQL commands over a replication connection, so the protection would generate a syntax error. Since we cannot run SQL commands on it, we are also not vulnerable to the issue that `e170b8c8` fixes, so we can just skip this command for older versions. Author: Michael Paquier <michael@paquier.xyz>	2018-03-18 13:11:58 +01:00
Tom Lane	092401b14f	Fix overflow handling in plpgsql's integer FOR loops. The test to exit the loop if the integer control value would overflow an int32 turns out not to work on some ICC versions, as it's dependent on the assumption that the compiler will execute the code as written rather than "optimize" it. ICC lacks any equivalent of gcc's -fwrapv switch, so it was optimizing on the assumption of no integer overflow, and that breaks this. Rewrite into a form that in fact does not do any overflowing computations. Per Tomas Vondra and buildfarm member fulmar. It's been like this for a long time, although it was not till we added a regression test case covering the behavior (in commit `dd2243f2a`) that the problem became apparent. Back-patch to all supported versions. Discussion: https://postgr.es/m/50562fdc-0876-9843-c883-15b8566c7511@2ndquadrant.com	2018-03-17 15:38:15 -04:00
Tom Lane	0a0721f84c	Fix WHERE CURRENT OF when the referenced cursor uses an index-only scan. "UPDATE/DELETE WHERE CURRENT OF cursor_name" failed, with an error message like "cannot extract system attribute from virtual tuple", if the cursor was using an index-only scan for the target table. Fix it by digging the current TID out of the indexscan state. It seems likely that the same failure could occur for CustomScan plans and perhaps some FDW plan types, so that leaving this to be treated as an internal error with an obscure message isn't as good an idea as it first seemed. Hence, add a bit of heaptuple.c infrastructure to let us deliver a more on-topic message. I chose to make the message match what you get for the case where execCurrentOf can't identify the target scan node at all, "cursor "foo" is not a simply updatable scan of table "bar"". Perhaps it should be different, but we can always adjust that later. In the future, it might be nice to provide hooks that would let custom scan providers and/or FDWs deal with this in other ways; but that's not a suitable topic for a back-patchable bug fix. It's been like this all along, so back-patch to all supported branches. Yugo Nagata and Tom Lane Discussion: https://postgr.es/m/20180201013349.937dfc5f.nagata@sraoss.co.jp	2018-03-17 14:59:31 -04:00
Tom Lane	2709549ecd	Fix query-lifespan memory leakage in repeatedly executed hash joins. ExecHashTableCreate allocated some memory that wasn't freed by ExecHashTableDestroy, specifically the per-hash-key function information. That's not a huge amount of data, but if one runs a query that repeats a hash join enough times, it builds up. Fix by arranging for the data in question to be kept in the hashtable's hashCxt instead of leaving it "loose" in the query-lifespan executor context. (This ensures that we'll also clean up anything that the hash functions allocate in fn_mcxt.) Per report from Amit Khandekar. It's been like this forever, so back-patch to all supported branches. Discussion: https://postgr.es/m/CAJ3gD9cFofAWGvcxLOxDHC=B0hjtW8yGmUsF2hdGh97CM38=7g@mail.gmail.com	2018-03-16 16:03:45 -04:00
Michael Meskes	fcc15bf381	Fix double frees in ecpg. Patch by Patrick Krecker <patrick@judicata.com>	2018-03-14 00:52:21 +01:00
Tom Lane	25a2ba35ed	When updating reltuples after ANALYZE, just extrapolate from our sample. The existing logic for updating pg_class.reltuples trusted the sampling results only for the pages ANALYZE actually visited, preferring to believe the previous tuple density estimate for all the unvisited pages. While there's some rationale for doing that for VACUUM (first that VACUUM is likely to visit a very nonrandom subset of pages, and second that we know for sure that the unvisited pages did not change), there's no such rationale for ANALYZE: by assumption, it's looked at an unbiased random sample of the table's pages. Furthermore, in a very large table ANALYZE will have examined only a tiny fraction of the table's pages, meaning it cannot slew the overall density estimate very far at all. In a table that is physically growing, this causes reltuples to increase nearly proportionally to the change in relpages, regardless of what is actually happening in the table. This has been observed to cause reltuples to become so much larger than reality that it effectively shuts off autovacuum, whose threshold for doing anything is a fraction of reltuples. (Getting to the point where that would happen seems to require some additional, not well understood, conditions. But it's undeniable that if reltuples is seriously off in a large table, ANALYZE alone will not fix it in any reasonable number of iterations, especially not if the table is continuing to grow.) Hence, restrict the use of vac_estimate_reltuples() to VACUUM alone, and in ANALYZE, just extrapolate from the sample pages on the assumption that they provide an accurate model of the whole table. If, by very bad luck, they don't, at least another ANALYZE will fix it; in the old logic a single bad estimate could cause problems indefinitely. In HEAD, let's remove vac_estimate_reltuples' is_analyze argument altogether; it was never used for anything and now it's totally pointless. 
But keep it in the back branches, in case any third-party code is calling this function. Per bug #15005. Back-patch to all supported branches. David Gould, reviewed by Alexander Kuzmenkov, cosmetic changes by me Discussion: https://postgr.es/m/20180117164916.3fdcf2e9@engels	2018-03-13 13:24:27 -04:00
Tom Lane	95f08d32de	Avoid holding AutovacuumScheduleLock while rechecking table statistics. In databases with many tables, re-fetching the statistics takes some time, so that this behavior seriously decreases the available concurrency for multiple autovac workers. There's discussion afoot about more complete fixes, but a simple and back-patchable amelioration is to claim the table and release the lock before rechecking stats. If we find out there's no longer a reason to process the table, re-taking the lock to un-claim the table is cheap enough. (This patch is quite old, but got lost amongst a discussion of more aggressive fixes. It's not clear when or if such a fix will be accepted, but in any case it'd be unlikely to get back-patched. Let's do this now so we have some improvement for the back branches.) In passing, make the normal un-claim step take AutovacuumScheduleLock not AutovacuumLock, since that is what is documented to protect the wi_tableoid field. This wasn't an actual bug in view of the fact that readers of that field hold both locks, but it creates some concurrency penalty against operations that need only AutovacuumLock. Back-patch to all supported versions. Jeff Janes Discussion: https://postgr.es/m/26118.1520865816@sss.pgh.pa.us	2018-03-13 12:28:39 -04:00
Michael Meskes	bd7eb6fe65	Set connection back to NULL after freeing it. Patch by Jeevan Ladhe <jeevan.ladhe@enterprisedb.com>	2018-03-12 23:54:22 +01:00
Tom Lane	e556fb1372	Fix improper uses of canonicalize_qual(). One of the things canonicalize_qual() does is to remove constant-NULL subexpressions of top-level AND/OR clauses. It does that on the assumption that what it's given is a top-level WHERE clause, so that NULL can be treated like FALSE. Although this is documented down inside a subroutine of canonicalize_qual(), it wasn't mentioned in the documentation of that function itself, and some callers hadn't gotten that memo. Notably, commit `d007a9505` caused get_relation_constraints() to apply canonicalize_qual() to CHECK constraints. That allowed constraint exclusion to misoptimize situations in which a CHECK constraint had a provably-NULL subclause, as seen in the regression test case added here, in which a child table that should be scanned is not. (Although this thinko is ancient, the test case doesn't fail before 9.2, for reasons I've not bothered to track down in detail. There may be related cases that do fail before that.) More recently, commit `f0e44751d` added an independent bug by applying canonicalize_qual() to index expressions, which is even sillier since those might not even be boolean. If they are, though, I think this could lead to making incorrect index entries for affected index expressions in v10. I haven't attempted to prove that though. To fix, add an "is_check" parameter to canonicalize_qual() to specify whether it should assume WHERE or CHECK semantics, and make it perform NULL-elimination accordingly. Adjust the callers to apply the right semantics, or remove the call entirely in cases where it's not known that the expression has one or the other semantics. I also removed the call in some cases involving partition expressions, where it should be a no-op because such expressions should be canonical already ... and was a no-op, independently of whether it could in principle have done something, because it was being handed the qual in implicit-AND format which isn't what it expects. 
In HEAD, add an Assert to catch that type of mistake in future. This represents an API break for external callers of canonicalize_qual(). While that's intentional in HEAD to make such callers think about which case applies to them, it seems like something we probably wouldn't be thanked for in released branches. Hence, in released branches, the extra parameter is added to a new function canonicalize_qual_ext(), and canonicalize_qual() is a wrapper that retains its old behavior. Patch by me with suggestions from Dean Rasheed. Back-patch to all supported branches. Discussion: https://postgr.es/m/24475.1520635069@sss.pgh.pa.us	2018-03-11 18:10:42 -04:00
Alvaro Herrera	6d30e3a2b6	Refrain from duplicating data in reorderbuffers If a walsender exits leaving data in reorderbuffers, the next walsender that tries to decode the same transaction would append its decoded data in the same spill files without truncating it first, which effectively duplicates the data. Avoid that by removing any leftover reorderbuffer spill files when a walsender starts. Backpatch to 9.4; this bug has been there from the very beginning of logical decoding. Author: Craig Ringer, revised by me Reviewed by: Álvaro Herrera, Petr Jelínek, Masahiko Sawada	2018-03-06 16:10:23 -03:00
Tom Lane	165fa27fe4	Fix assorted issues in convert_to_scalar(). If convert_to_scalar is passed a pair of datatypes it can't cope with, its former behavior was just to elog(ERROR). While this is OK so far as the core code is concerned, there's extension code that would like to use scalarltsel/scalargtsel/etc as selectivity estimators for operators that work on non-core datatypes, and this behavior is a show-stopper for that use-case. If we simply allow convert_to_scalar to return FALSE instead of outright failing, then the main logic of scalarltsel/scalargtsel will work fine for any operator that behaves like a scalar inequality comparison. The lack of conversion capability will mean that we can't estimate to better than histogram-bin-width precision, since the code will effectively assume that the comparison constant falls at the middle of its bin. But that's still a lot better than nothing. (Someday we should provide a way for extension code to supply a custom version of convert_to_scalar, but today is not that day.) While poking at this issue, we noted that the existing code for handling type bytea in convert_to_scalar is several bricks shy of a load. It assumes without checking that if the comparison value is type bytea, the bounds values are too; in the worst case this could lead to a crash. It also fails to detoast the input values, so that the comparison result is complete garbage if any input is toasted out-of-line, compressed, or even just short-header. I'm not sure how often such cases actually occur --- the bounds values, at least, are probably safe since they are elements of an array and hence can't be toasted. But that doesn't make this code OK. Back-patch to all supported branches, partly because author requested that, but mostly because of the bytea bugs. The change in API for the exposed routine convert_network_to_scalar() is theoretically a back-patch hazard, but it seems pretty unlikely that any third-party code is calling that function directly. 
Tomas Vondra, with some adjustments by me Discussion: https://postgr.es/m/b68441b6-d18f-13ab-b43b-9a72188a4e02@2ndquadrant.com	2018-03-03 20:31:35 -05:00
Tom Lane	947f06c622	Make gistvacuumcleanup() count the actual number of index tuples. Previously, it just returned the heap tuple count, which might be only an estimate, and would be completely the wrong thing if the index is partial. Since this function scans every index page anyway to find free pages, it's practically free to count the surviving index tuples. Let's do that and return an accurate count. This is easily visible as a wrong reltuples value for a partial GiST index following VACUUM, so back-patch to all supported branches. Andrey Borodin, reviewed by Michail Nikolaev Discussion: https://postgr.es/m/151956654251.6915.675951950408204404.pgcf@coridan.postgresql.org	2018-03-02 11:22:42 -05:00
Tom Lane	a4fed310cb	Use ereport not elog for some corrupt-HOT-chain reports. These errors have been seen in the field in corrupted-data situations. It seems worthwhile to report them with ERRCODE_DATA_CORRUPTED, rather than the generic ERRCODE_INTERNAL_ERROR, for the benefit of log monitoring and tools like amcheck. However, use errmsg_internal so that the text strings still aren't translated; it seems unlikely to be worth translators' time to do so. Back-patch to 9.3, like the predecessor commit `d70cf811f` that introduced these elog calls originally (replacing Asserts). Peter Geoghegan Discussion: https://postgr.es/m/CAH2-Wzmn4-Pg-UGFwyuyK-wiTih9j32pwg_7T9iwqXpAUZr=Mg@mail.gmail.com	2018-03-01 16:23:50 -05:00
Alvaro Herrera	3ee23834ed	Relax overly strict sanity check for upgraded ancient databases Commit `4800f16a7a` added some sanity checks to ensure we don't accidentally corrupt data, but in one of them we failed to consider the effects of a database upgraded from 9.2 or earlier, where a tuple exclusively locked prior to the upgrade has a slightly different bit pattern. Fix that by using the macro that we fixed in commit `74ebba84ae` for similar situations. Reported-by: Alexandre Garcia Reviewed-by: Andres Freund Discussion: https://postgr.es/m/CAPYLKR6yxV4=pfW0Gwij7aPNiiPx+3ib4USVYnbuQdUtmkMaEA@mail.gmail.com Andres suspects that this bug may have wider ranging consequences, but I couldn't find anything.	2018-03-01 18:07:46 -03:00
Tom Lane	d07f79a9cc	Rename base64 routines to avoid conflict with Solaris built-in functions. Solaris 11.4 has built-in functions named b64_encode and b64_decode. Rename ours to something else to avoid the conflict (fortunately, ours are static so the impact is limited). One could wish for less duplication of code in this area, but that would be a larger patch and not very suitable for back-patching. Since this is a portability fix, we want to put it into all supported branches. Report and initial patch by Rainer Orth, reviewed and adjusted a bit by Michael Paquier Discussion: https://postgr.es/m/ydd372wk28h.fsf@CeBiTec.Uni-Bielefeld.DE	2018-02-28 18:33:45 -05:00
Tom Lane	cadb14c271	Remove restriction on SQL block length in isolationtester scanner. specscanner.l had a fixed limit of 1024 bytes on the length of individual SQL stanzas in an isolation test script. People are starting to run into that, so fix it by making the buffer resizable. Once we allow this in HEAD, it seems inevitable that somebody will try to back-patch a test that exceeds the old limit, so back-patch this change as a preventive measure. Daniel Gustafsson Discussion: https://postgr.es/m/8D628BE4-6606-4FF6-A3FF-8B2B0E9B43D0@yesql.se	2018-02-28 16:57:38 -05:00
Tom Lane	49f9014c8c	Fix up ecpg's configuration so it handles "long long int" in MSVC builds. Although configure-based builds correctly define HAVE_LONG_LONG_INT when appropriate (in both pg_config.h and ecpg_config.h), builds using the MSVC scripts failed to do so. This currently has no impact on the backend, since it uses that symbol nowhere; but it does prevent ecpg from supporting "long long int". Fix that. Also, adjust Solution.pm so that in the constructed ecpg_config.h file, the "#if (_MSC_VER > 1200)" covers only the LONG_LONG_INT-related #defines, not the whole file. AFAICS this was a thinko on somebody's part: ENABLE_THREAD_SAFETY should always be defined in Windows builds, and in branches using USE_INTEGER_DATETIMES, the setting of that shouldn't depend on the compiler version either. If I'm wrong, I imagine the buildfarm will say so. Per bug #15080 from Jonathan Allen; issue diagnosed by Michael Meskes and Andrew Gierth. Back-patch to all supported branches. Discussion: https://postgr.es/m/151935568942.1461.14623890240535309745@wrigleys.postgresql.org	2018-02-27 16:46:52 -05:00
Tom Lane	5cedaeca26	Remove regression tests' CREATE FUNCTION commands for unused C functions. I removed these functions altogether in HEAD, in commit `db3af9feb`, and it emerges that that causes trouble for cross-branch upgrade testing. We could put back stub functions but that seems pretty silly. Instead, back-patch a minimal subset of `db3af9feb`, namely just removing the CREATE FUNCTION commands. Discussion: https://postgr.es/m/11927.1519756619@sss.pgh.pa.us	2018-02-27 15:04:53 -05:00
Tom Lane	5ccb775869	Prevent dangling-pointer access when update trigger returns old tuple. A before-update row trigger may choose to return the "new" or "old" tuple unmodified. ExecBRUpdateTriggers failed to consider the second possibility, and would proceed to free the "old" tuple even if it was the one returned, leading to subsequent access to already-deallocated memory. In debug builds this reliably leads to an "invalid memory alloc request size" failure; in production builds it might accidentally work, but data corruption is also possible. This is a very old bug. There are probably a couple of reasons it hasn't been noticed up to now. It would be more usual to return NULL if one wanted to suppress the update action; returning "old" is significantly less efficient since the update will occur anyway. Also, none of the standard PLs would ever cause this because they all returned freshly-manufactured tuples even if they were just copying "old". But commit `4b93f5799` changed that for plpgsql, making it possible to see the bug with a plpgsql trigger. Still, this is certainly legal behavior for a trigger function, so it's ExecBRUpdateTriggers's fault not plpgsql's. It seems worth creating a test case that exercises returning "old" directly with a C-language trigger; testing this through plpgsql seems unreliable because its behavior might change again. Report and fix by Rushabh Lathia; regression test case by me. Back-patch to all supported branches. Discussion: https://postgr.es/m/CAGPqQf1P4pjiNPrMof=P_16E-DFjt457j+nH2ex3=nBTew7tXw@mail.gmail.com	2018-02-27 13:27:38 -05:00
Magnus Hagander	5181bebba7	Revert restructuring of bin/scripts/Makefile The Makefile portion of `91f3ffc524` broke the MSVC build. This patch reverts the changes to the Makefile and adjusts it to work with the new code, while keeping the actual code changes from the original patch. Author: Victor Wagner <vitus@wagner.pp.ru>	2018-02-27 14:11:09 +01:00
Tom Lane	f46bc90dbc	Stamp 9.4.17.	2018-02-26 17:17:45 -05:00
Noah Misch	f28955e382	Document security implications of search_path and the public schema. The ability to create like-named objects in different schemas opens up the potential for users to change the behavior of other users' queries, maliciously or accidentally. When you connect to a PostgreSQL server, you should remove from your search_path any schema for which a user other than yourself or superusers holds the CREATE privilege. If you do not, other users holding CREATE privilege can redefine the behavior of your commands, causing them to perform arbitrary SQL statements under your identity. "SET search_path = ..." and "SELECT pg_catalog.set_config(...)" are not vulnerable to such hijacking, so one can use either as the first command of a session. As special exceptions, the following client applications behave as documented regardless of search_path settings and schema privileges: clusterdb createdb createlang createuser dropdb droplang dropuser ecpg (not programs it generates) initdb oid2name pg_archivecleanup pg_basebackup pg_config pg_controldata pg_ctl pg_dump pg_dumpall pg_isready pg_receivewal pg_recvlogical pg_resetwal pg_restore pg_rewind pg_standby pg_test_fsync pg_test_timing pg_upgrade pg_waldump reindexdb vacuumdb vacuumlo. Not included are core client programs that run user-specified SQL commands, namely psql and pgbench. PostgreSQL encourages non-core client applications to do likewise. Document this in the context of libpq connections, psql connections, dblink connections, ECPG connections, extension packaging, and schema usage patterns. The principal defense for applications is "SELECT pg_catalog.set_config('search_path', '', false)", and the principal defense for databases is "REVOKE CREATE ON SCHEMA public FROM PUBLIC". Either one is sufficient to prevent attack. After a REVOKE, consider auditing the public schema for objects named like pg_catalog objects. 
Authors of SECURITY DEFINER functions use some of the same defenses, and the CREATE FUNCTION reference page already covered them thoroughly. This is a good opportunity to audit SECURITY DEFINER functions for robust security practice. Back-patch to 9.3 (all supported versions). Reviewed by Michael Paquier and Jonathan S. Katz. Reported by Arseniy Sharoglazov. Security: CVE-2018-1058	2018-02-26 07:39:48 -08:00
Noah Misch	928bca1a30	Empty search_path in Autovacuum and non-psql/pgbench clients. This makes the client programs behave as documented regardless of the connect-time search_path and regardless of user-created objects. Today, a malicious user with CREATE permission on a search_path schema can take control of certain of these clients' queries and invoke arbitrary SQL functions under the client identity, often a superuser. This is exploitable in the default configuration, where all users have CREATE privilege on schema "public". This changes behavior of user-defined code stored in the database, like pg_index.indexprs and pg_extension_config_dump(). If they reach code bearing unqualified names, "does not exist" or "no schema has been selected to create in" errors might appear. Users may fix such errors by schema-qualifying affected names. After upgrading, consider watching server logs for these errors. The --table arguments of src/bin/scripts clients have been lax; for example, "vacuumdb -Zt pg_am\;CHECKPOINT" performed a checkpoint. That now fails, but for now, "vacuumdb -Zt 'pg_am(amname);CHECKPOINT'" still performs a checkpoint. Back-patch to 9.3 (all supported versions). Reviewed by Tom Lane, though this fix strategy was not his first choice. Reported by Arseniy Sharoglazov. Security: CVE-2018-1058	2018-02-26 07:39:48 -08:00
Noah Misch	461c32b557	Back-patch non-static ExecuteSqlQueryForSingleRow(). Back-patch a subset of commit `47e5969767` to 9.4 and 9.3. The next commit adds calls to this function. Security: CVE-2018-1058	2018-02-26 07:39:48 -08:00
Tom Lane	9f6e5296a1	Avoid using unsafe search_path settings during dump and restore. Historically, pg_dump has "set search_path = foo, pg_catalog" when dumping an object in schema "foo", and has also caused that setting to be used while restoring the object. This is problematic because functions and operators in schema "foo" could capture references meant to refer to pg_catalog entries, both in the queries issued by pg_dump and those issued during the subsequent restore run. That could result in dump/restore misbehavior, or in privilege escalation if a nefarious user installs trojan-horse functions or operators. This patch changes pg_dump so that it does not change the search_path dynamically. The emitted restore script sets the search_path to what was used at dump time, and then leaves it alone thereafter. Created objects are placed in the correct schema, regardless of the active search_path, by dint of schema-qualifying their names in the CREATE commands, as well as in subsequent ALTER and ALTER-like commands. Since this change requires a change in the behavior of pg_restore when processing an archive file made according to this new convention, bump the archive file version number; old versions of pg_restore will therefore refuse to process files made with new versions of pg_dump. Security: CVE-2018-1058	2018-02-26 10:18:22 -05:00
Peter Eisentraut	fd090d0d6f	Translation updates Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: e859c0a3a8ac0fc7ae28150aff33153ec532ef04	2018-02-26 08:37:35 -05:00
Noah Misch	da27155c6b	Synchronize doc/ copies of src/test/examples/. This is mostly cosmetic, but it might fix build failures, on some platform, when copying from the documentation. Back-patch to 9.3 (all supported versions).	2018-02-23 11:24:08 -08:00
Tom Lane	f6dd08489c	Fix planner failures with overlapping mergejoin clauses in an outer join. Given overlapping or partially redundant join clauses, for example t1 JOIN t2 ON t1.a = t2.x AND t1.b = t2.x the planner's EquivalenceClass machinery will ordinarily refactor the clauses as "t1.a = t1.b AND t1.a = t2.x", so that join processing doesn't see multiple references to the same EquivalenceClass in a list of join equality clauses. However, if the join is outer, it's incorrect to derive a restriction clause on the outer side from the join conditions, so the clause refactoring does not happen and we end up with overlapping join conditions. The code that attempted to deal with such cases had several subtle bugs, which could result in "left and right pathkeys do not match in mergejoin" or "outer pathkeys do not match mergeclauses" planner errors, if the selected join plan type was a mergejoin. (It does not appear that any actually incorrect plan could have been emitted.) The core of the problem really was failure to recognize that the outer and inner relations' pathkeys have different relationships to the mergeclause list. A join's mergeclause list is constructed by reference to the outer pathkeys, so it will always be ordered the same as the outer pathkeys, but this cannot be presumed true for the inner pathkeys. If the inner sides of the mergeclauses contain multiple references to the same EquivalenceClass ({t2.x} in the above example) then a simplistic rendering of the required inner sort order is like "ORDER BY t2.x, t2.x", but the pathkey machinery recognizes that the second sort column is redundant and throws it away. The mergejoin planning code failed to account for that behavior properly. 
One error was to try to generate cut-down versions of the mergeclause list from cut-down versions of the inner pathkeys in the same way as the initial construction of the mergeclause list from the outer pathkeys was done; this could lead to choosing a mergeclause list that fails to match the outer pathkeys. The other problem was that the pathkey cross-checking code in create_mergejoin_plan treated the inner and outer pathkey lists identically, whereas actually the expectations for them must be different. That led to false "pathkeys do not match" failures in some cases, and in principle could have led to failure to detect bogus plans in other cases, though there is no indication that such bogus plans could be generated. Reported by Alexander Kuzmenkov, who also reviewed this patch. This has been broken for years (back to around 8.3 according to my testing), so back-patch to all supported branches. Discussion: https://postgr.es/m/5dad9160-4632-0e47-e120-8e2082000c01@postgrespro.ru	2018-02-23 13:47:33 -05:00
Tom Lane	2d12c55933	Repair pg_upgrade's failure to preserve relfrozenxid for matviews. This oversight led to data corruption in matviews, manifesting as "could not access status of transaction" before our most recent releases, and "found xmin from before relfrozenxid" errors since then. The proximate cause of the problem seems to have been confusion between the task of preserving dropped-column status and the task of preserving frozenxid status. Those are required for distinct sets of relkinds, and the reasoning was entirely undocumented in the source code. In hopes of forestalling future errors of the same kind, try to improve the commentary in this area. In passing, also improve the remarkably unhelpful comments around pg_upgrade's set_frozenxids(). That's not actually buggy AFAICS, but good luck figuring out what it does from the old comments. Per report from Claudio Freire. It appears that bug #14852 from Alexey Ermakov is an earlier report of the same issue, and there may be other cases that we failed to identify at the time. Patch by me based on analysis by Andres Freund. The bug dates back to the introduction of matviews, so back-patch to all supported branches. Discussion: https://postgr.es/m/CAGTBQpbrY9CdRGGhyBZ9yqY4jWaGC85rUF4X+R7d-aim=mBNsw@mail.gmail.com Discussion: https://postgr.es/m/20171013115320.28049.86457@wrigleys.postgresql.org	2018-02-21 18:40:24 -05:00
Tom Lane	e11b6488e5	Fix misbehavior of CTE-used-in-a-subplan during EPQ rechecks. An updating query that reads a CTE within an InitPlan or SubPlan could get incorrect results if it updates rows that are concurrently being modified. This is caused by CteScanNext supposing that nothing inside its recursive ExecProcNode call could change which read pointer is selected in the CTE's shared tuplestore. While that's normally true because of scoping considerations, it can break down if an EPQ plan tree gets built during the call, because EvalPlanQualStart builds execution trees for all subplans whether they're going to be used during the recheck or not. And it seems like a pretty shaky assumption anyway, so let's just reselect our own read pointer here. Per bug #14870 from Andrei Gorita. This has been broken since CTEs were implemented, so back-patch to all supported branches. Discussion: https://postgr.es/m/20171024155358.1471.82377@wrigleys.postgresql.org	2018-02-19 16:00:18 -05:00
Tom Lane	bd87186378	Fix broken logic for reporting PL/Python function names in errcontext. plpython_error_callback() reported the name of the function associated with the topmost PL/Python execution context. This was not merely wrong if there were nested PL/Python contexts, but it risked a core dump if the topmost one is an inline code block rather than a named function. That will have proname = NULL, and so we were passing a NULL pointer to snprintf("%s"). It seems that none of the PL/Python-testing machines in the buildfarm will dump core for that, but some platforms do, as reported by Marina Polyakova. Investigation finds that there actually is an existing regression test that used to prove that the behavior was wrong, though apparently no one had noticed that it was printing the wrong function name. It stopped showing the problem in 9.6 when we adjusted psql to not print CONTEXT by default for NOTICE messages. The problem is masked (if your platform avoids the core dump) in error cases, because PL/Python will throw away the originally generated error info in favor of a new traceback produced at the outer level. Repair by using ErrorContextCallback.arg to pass the correct context to the error callback. Add a regression test illustrating correct behavior. Back-patch to all supported branches, since they're all broken this way. Discussion: https://postgr.es/m/156b989dbc6fe7c4d3223cf51da61195@postgrespro.ru	2018-02-14 14:47:18 -05:00
Tom Lane	b7e1ca7d8e	Stamp 9.4.16.	2018-02-05 16:07:03 -05:00
Peter Eisentraut	0a5dcba2ab	Translation updates Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: 1bde68354ce9389733ebee17e76817f3bda1edf1	2018-02-05 12:45:45 -05:00
Peter Eisentraut	9f050c0b41	Exclude common/int128.h from cpluspluscheck It uses static assertions, which are not supported under C++ in this branch. This change only goes into the 9.4 branch, because 9.5 and beyond will primarily use the USE_NATIVE_INT128 branch, so cpluspluscheck isn't bothered. In PG11 we will have C++ support for static assertions, so the issue will go away altogether.	2018-01-30 19:21:19 -05:00
Peter Eisentraut	b37422de89	psql documentation fixes Update the documentation for \pset to mention columns\|linestyle. Author: Дилян Палаузов <dpa-postgres@aegee.org>	2018-01-29 14:18:11 -05:00
Tom Lane	06efc5cf53	Add stack-overflow guards in set-operation planning. create_plan_recurse lacked any stack depth check. This is not per our normal coding rules, but I'd supposed it was safe because earlier planner processing is more complex and presumably should eat more stack. But bug #15033 from Andrew Grossman shows this isn't true, at least not for queries having the form of a many-thousand-way INTERSECT stack. Further testing showed that recurse_set_operations is also capable of being crashed in this way, since it likewise will recurse to the bottom of a parsetree before calling any support functions that might themselves contain any stack checks. However, its stack consumption is only perhaps a third of create_plan_recurse's. It's possible that this particular problem with create_plan_recurse can only manifest in 9.6 and later, since before that we didn't build a Path tree for set operations. But having seen this example, I now have no faith in the proposition that create_plan_recurse doesn't need a stack check, so back-patch to all supported branches. Discussion: https://postgr.es/m/20180127050845.28812.58244@wrigleys.postgresql.org	2018-01-28 13:39:07 -05:00
Tom Lane	fa86a32f9b	Update time zone data files to tzdata release 2018c. DST law changes in Brazil, Sao Tome and Principe. Historical corrections for Bolivia, Japan, and South Sudan. The "US/Pacific-New" zone has been removed (it was only a link to America/Los_Angeles anyway).	2018-01-27 16:42:55 -05:00
Tom Lane	54e1599c76	Teach reparameterize_path() to handle AppendPaths. If we're inside a lateral subquery, there may be no unparameterized paths for a particular child relation of an appendrel, in which case we must be able to create similarly-parameterized paths for each other child relation, else the planner will fail with "could not devise a query plan for the given query". This means that there are situations where we'd better be able to reparameterize at least one path for each child. This calls into question the assumption in reparameterize_path() that it can just punt if it feels like it. However, the only case that is known broken right now is where the child is itself an appendrel so that all its paths are AppendPaths. (I think possibly I disregarded that in the original coding on the theory that nested appendrels would get folded together --- but that only happens after reparameterize_path(), so it's not excused from handling a child AppendPath.) Given that this code's been like this since 9.3 when LATERAL was introduced, it seems likely we'd have heard of other cases by now if there were a larger problem. Per report from Elvis Pranskevichus. Back-patch to 9.3. Discussion: https://postgr.es/m/5981018.zdth1YWmNy@hammer.magicstack.net	2018-01-23 16:50:35 -05:00
Tom Lane	da83ca7d9d	Make pg_dump's ACL, sec label, and comment entries reliably identifiable. _tocEntryRequired() expects that it can identify ACL, SECURITY LABEL, and COMMENT TOC entries that are for large objects by seeing whether the tag for them starts with "LARGE OBJECT ". While that works fine for actual large objects, which are indeed tagged that way, it's subject to false positives unless every such entry's tag starts with an appropriate type ID. And in fact it does not work for ACLs, because up to now we customarily tagged those entries with just the bare name of the object. This means that an ACL for an object named "LARGE OBJECT something" would be misclassified as data not schema, with undesirable results in a schema-only or data-only dump --- although pg_upgrade seems unaffected, due to the special case for binary-upgrade mode further down in _tocEntryRequired(). We can fix this by changing all the dumpACL calls to use the label strings already in use for comments and security labels, which do follow the convention of starting with an object type indicator. Well, mostly they follow it. dumpDatabase() got it wrong, using just the bare database name for those purposes, so that a database named "LARGE OBJECT something" would similarly be subject to having its comment or security label dropped or included when not wanted. Bring that into line too. (Note that up to now, database ACLs have not been processed by pg_dump, so that this issue doesn't affect them.) _tocEntryRequired() itself is not free of fault: it was overly liberal about matching object tags to "LARGE OBJECT " in binary-upgrade mode. This looks like it is probably harmless because there would be no data component to strip anyway in that mode, but at best it's trouble waiting to happen, so tighten that up too. The possible misclassification of SECURITY LABEL entries for databases is in principle a security problem, but the opportunities for actual exploits seem too narrow to be interesting. 
The other cases seem like just bugs, since an object owner can change its ACL or comment for himself, he needn't try to trick someone else into doing it by choosing a strange name. This has been broken since per-large-object TOC entries were introduced in 9.0, so back-patch to all supported branches. Discussion: https://postgr.es/m/21714.1516553459@sss.pgh.pa.us	2018-01-22 12:06:19 -05:00
Alvaro Herrera	1284d18b5d	Fix StoreCatalogInheritance1 to use 32bit inhseqno For no apparent reason, this function was using a 16bit-wide inhseqno value, rather than the correct 32 bit width which is what is stored in the pg_inherits catalog. This becomes evident if you try to create a table with more than 65535 parents, because this error appears: ERROR: duplicate key value violates unique constraint «pg_inherits_relid_seqno_index» DETAIL: Key (inhrelid, inhseqno)=(329371, 0) already exists. Needless to say, having so many parents is an uncommon situation, which explains why this error has never been reported despite having been introduced with the Postgres95 1.01 sources in commit d31084e9d111: https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/commands/creatinh.c;hb=d31084e9d111#l349 Backpatch all the way back. David Rowley noticed this while reviewing a patch of mine. Discussion: https://postgr.es/m/CAKJS1f8Dn7swSEhOWwzZzssW7747YB=2Hi+T7uGud40dur69-g@mail.gmail.com	2018-01-19 10:15:08 -03:00
Michael Meskes	2c1c4b060c	Cope with indicator arrays that do not have the correct length. Patch by: "Rader, David" <davidr@openscg.com>	2018-01-15 10:02:16 +01:00
Tom Lane	8b0e5e7e7c	Avoid unnecessary failure in SELECT concurrent with ALTER NO INHERIT. If a query against an inheritance tree runs concurrently with an ALTER TABLE that's disinheriting one of the tree members, it's possible to get a "could not find inherited attribute" error because after obtaining lock on the removed member, make_inh_translation_list sees that its columns have attinhcount=0 and decides they aren't the columns it's looking for. An ideal fix, perhaps, would avoid including such a just-removed member table in the query at all; but there seems no way to accomplish that without adding expensive catalog rechecks or creating a likelihood of deadlocks. Instead, let's just drop the check on attinhcount. In this way, a query that's included a just-disinherited child will still succeed, which is not a completely unreasonable behavior. This problem has existed for a long time, so back-patch to all supported branches. Also add an isolation test verifying related behaviors. Patch by me; the new isolation test is based on Kyotaro Horiguchi's work. Discussion: https://postgr.es/m/20170626.174612.23936762.horiguchi.kyotaro@lab.ntt.co.jp	2018-01-12 15:46:38 -05:00
Alvaro Herrera	c618796404	Change some bogus PageGetLSN calls to BufferGetLSNAtomic As src/backend/access/transam/README says, PageGetLSN may only be called by processes holding either exclusive lock on buffer, or a shared lock on buffer plus buffer header lock. Therefore any place that only holds a shared buffer lock must use BufferGetLSNAtomic instead of PageGetLSN, which internally obtains buffer header lock prior to reading the LSN. A few callsites failed to comply with this rule. This was detected by running all tests under a new (not committed) assertion that verifies PageGetLSN locking contract. All but one of the callsites that failed the assertion are fixed by this patch. Remaining callsites were inspected manually and determined not to need any change. The exception (unfixed callsite) is in TestForOldSnapshot, which only has a Page argument, making it impossible to access the corresponding Buffer from it. Fixing that seems a much larger patch that will have to be done separately; and that's just as well, since it was only introduced in 9.6 and other bugs are much older. Some of these bugs are ancient; backpatch all the way back to 9.3. Authors: Jacob Champion, Asim Praveen, Ashwin Agrawal Reviewed-by: Michaël Paquier Discussion: https://postgr.es/m/CABAq_6GXgQDVu3u12mK9O5Xt5abBZWQ0V40LZCE+oUf95XyNFg@mail.gmail.com	2018-01-09 17:07:24 -03:00
Alvaro Herrera	f68c49f86a	Fix failure to delete spill files of aborted transactions Logical decoding's reorderbuffer.c may spill transaction files to disk when transactions are large. These are supposed to be removed when they become "too old" by xid; but file removal requires the boundary LSNs of the transaction to be known. The final_lsn is only set when we see the commit or abort record for the transaction, but nothing sets the value for transactions that crash, so the removal code misbehaves -- in assertion-enabled builds, it crashes by a failed assertion. To fix, modify the final_lsn of transactions that don't have a value set, to the LSN of the very latest change in the transaction. This causes the spilled files to be removed appropriately. Author: Atsushi Torikoshi Reviewed-by: Kyotaro HORIGUCHI, Craig Ringer, Masahiko Sawada Discussion: https://postgr.es/m/54e4e488-186b-a056-6628-50628e4e4ebc@lab.ntt.co.jp	2018-01-05 12:17:10 -03:00
Andrew Dunstan	2d03daa7b8	Fix use of config-specific libraries for Windows OpenSSL Commit `614350a3` allowed for different builds of OpenSSL libraries on Windows, but ignored the fact that the alternative builds don't have config-specific libraries. This patch fixes the Solution file to ask for the correct libraries. Per offline discussions with Leonardo Cecchi and Marco Nenciarini. Backpatch to all live branches.	2018-01-03 15:34:02 -05:00
Alvaro Herrera	fe6bdc0a38	Make XactLockTableWait work for transactions that are not yet self-locked XactLockTableWait assumed that its xid argument has already added itself to the lock table. That assumption led to another assumption that if locking the xid has succeeded but the xid is reported as still in progress, then the input xid must have been a subtransaction. These assumptions hold true for the original uses of this code in locking related to on-disk tuples, but they break down in logical replication slot snapshot building -- in particular, when a standby snapshot logged contains an xid that's already in ProcArray but not yet in the lock table. This leads to assertion failures that can be reproduced all the way back to 9.4, when logical decoding was introduced. To fix, change SubTransGetParent to SubTransGetTopmostTransaction which has a slightly different API: it returns the argument Xid if there is no parent, and it goes all the way to the top instead of moving up the levels one by one. Also, to avoid busy-waiting, add a 1ms sleep to give the other process time to register itself in the lock table. For consistency, change ConditionalXactLockTableWait the same way. Author: Petr Jelínek Discussion: https://postgr.es/m/1B3E32D8-FCF4-40B4-AEF9-5C0E3AC57969@postgrespro.ru Reported-by: Konstantin Knizhnik Diagnosed-by: Stas Kelvich, Petr Jelínek Reviewed-by: Andres Freund, Robert Haas	2018-01-03 14:38:39 -03:00
Alvaro Herrera	47a3a13178	Fix deadlock hazard in CREATE INDEX CONCURRENTLY Multiple sessions doing CREATE INDEX CONCURRENTLY simultaneously are supposed to be able to work in parallel, as evidenced by fixes in commit `c3d09b3bd2` specifically to support this case. In reality, one of the sessions would be aborted by a mysterious "deadlock detected" error. Jeff Janes diagnosed that this is because of leftover snapshots used for system catalog scans -- this was broken by `8aa3e47510` keeping track of (registering) the catalog snapshot. To fix the deadlocks, it's enough to de-register that snapshot prior to waiting. Backpatch to 9.4, which introduced MVCC catalog scans. Include an isolationtester spec that 8 out of 10 times reproduces the deadlock with the unpatched code for me (Álvaro). Author: Jeff Janes Diagnosed-by: Jeff Janes Reported-by: Jeremy Finzel Discussion: https://postgr.es/m/CAMa1XUhHjCv8Qkx0WOr1Mpm_R4qxN26EibwCrj0Oor2YBUFUTg%40mail.gmail.com	2018-01-02 19:16:16 -03:00
Tom Lane	e1b8e0e4a6	Disallow UNION/INTERSECT/EXCEPT over no columns. Since 9.4, we've allowed the syntax "select union select" and variants of that. However, the planner wasn't expecting a no-column set operation and ended up treating the set operation as if it were UNION ALL. Pre-v10, there seem to be some executor issues that would need to be fixed to support such cases, and it doesn't really seem worth expending much effort on. Just disallow it, instead. Per report from Victor Yegorov. Discussion: https://postgr.es/m/CAGnEbojGJrRSOgJwNGM7JSJZpVAf8xXcVPbVrGdhbVEHZ-BUMw@mail.gmail.com	2017-12-22 12:08:41 -05:00
Andres Freund	ed8e1aff6a	Perform a lot more sanity checks when freezing tuples. The previous commit has shown that the sanity checks around freezing aren't strong enough. Strengthening them seems especially important because the existence of the bug has caused corruption that we don't want to make even worse during future vacuum cycles. The errors are emitted with ereport rather than elog, despite being "should never happen" messages, so a proper error code is emitted. To avoid superfluous translations, mark messages as internal. Author: Andres Freund and Alvaro Herrera Reviewed-By: Alvaro Herrera, Michael Paquier Discussion: https://postgr.es/m/20171102112019.33wb7g5wp4zpjelu@alap3.anarazel.de Backpatch: 9.3-	2017-12-14 18:20:48 -08:00
Andres Freund	4eff5a8c9f	Fix pruning of locked and updated tuples. Previously it was possible that a tuple was not pruned during vacuum, even though its update xmax (i.e. the updating xid in a multixact with both key share lockers and an updater) was below the cutoff horizon. As the freezing code assumed, rightly so, that that's not supposed to happen, xmax would be preserved (as a member of a new multixact or xmax directly). That causes two problems: For one, the tuple is below the xmin horizon, which can cause problems if the clog is truncated or once there's an xid wraparound. The bigger problem is that that will break HOT chains, which in turn can lead to two breakages: First, failing index lookups, which in turn can e.g. lead to constraints being violated. Second, future hot prunes / vacuums can end up making invisible tuples visible again. There are other harmful scenarios. Fix the problem by recognizing that tuples can be DEAD instead of RECENTLY_DEAD, even if the multixactid has alive members, if the update_xid is below the xmin horizon. That's safe because newer versions of the tuple will contain the locking xids. A followup commit will harden the code somewhat against future similar bugs and already corrupted data. Author: Andres Freund, with changes by Alvaro Herrera Reported-By: Daniel Wood Analyzed-By: Andres Freund, Alvaro Herrera, Robert Haas, Peter Geoghegan, Daniel Wood, Yi Wen Wong, Michael Paquier Reviewed-By: Alvaro Herrera, Robert Haas, Michael Paquier Discussion: https://postgr.es/m/E5711E62-8FDF-4DCA-A888-C200BF6B5742@amazon.com https://postgr.es/m/20171102112019.33wb7g5wp4zpjelu@alap3.anarazel.de Backpatch: 9.3-	2017-12-14 18:20:48 -08:00
Andrew Dunstan	f5c7e0cddf	Fix walsender timeouts when decoding a large transaction The logical slots have a fast code path for sending data so as not to impose too high a per message overhead. The fast path skips checks for interrupts and timeouts. However, the existing coding failed to consider the fact that a transaction with a large number of changes may take a very long time to be processed and sent to the client. This causes the walsender to ignore interrupts for potentially a long time and more importantly it will result in the walsender being killed due to timeout at the end of such a transaction. This commit changes the fast path to also check for interrupts and only allows calling the fast path when the last keepalive check happened less than half the walsender timeout ago. Otherwise the slower code path will be taken. Backpatched to 9.4 Petr Jelinek, reviewed by Kyotaro HORIGUCHI, Yura Sokolov, Craig Ringer and Robert Haas. Discussion: https://postgr.es/m/e082a56a-fd95-a250-3bae-0fff93832510@2ndquadrant.com	2017-12-14 11:32:25 -05:00
Tom Lane	239b01e313	Fix corner-case coredump in _SPI_error_callback(). I noticed that _SPI_execute_plan initially sets spierrcontext.arg = NULL, and only fills it in some time later. If an error were to happen in between, _SPI_error_callback would try to dereference the null pointer. This is unlikely --- there's not much between those points except push-snapshot calls --- but it's clearly not impossible. Tweak the callback to do nothing if the pointer isn't set yet. It's been like this for a while, so back-patch to all supported branches.	2017-12-11 16:33:49 -05:00
Noah Misch	d78c3ca0ea	MSVC 2012+: Permit linking to 32-bit, MinGW-built libraries. Notably, this permits linking to the 32-bit Perl binaries advertised on perl.org, namely Strawberry Perl and ActivePerl. This has a side effect of permitting linking to binaries built with obsolete MSVC versions. By default, MSVC 2012 and later require a "safe exception handler table" in each binary. MinGW-built, 32-bit DLLs lack the relevant exception handler metadata, so linking to them failed with error LNK2026. Restore the semantics of MSVC 2010, which omits the table from a given binary if some linker input lacks metadata. This has no effect on 64-bit builds or on MSVC 2010 and earlier. Back-patch to 9.3 (all supported versions). Reported by Victor Wagner. Discussion: https://postgr.es/m/20160326154321.7754ab8f@wagner.wagner.home	2017-12-09 00:58:59 -08:00
Noah Misch	aed8d41af6	MSVC: Test whether 32-bit Perl needs -D_USE_32BIT_TIME_T. Commits `5a5c2feca3` and `b5178c5d08` introduced support for modern MSVC-built, 32-bit Perl, but they broke use of MinGW-built, 32-bit Perl distributions like Strawberry Perl and modern ActivePerl. Perl has no robust means to report whether it expects a -D_USE_32BIT_TIME_T ABI, so test this. Back-patch to 9.3 (all supported versions). The chief alternative was a heuristic of adding -D_USE_32BIT_TIME_T when $Config{gccversion} is nonempty. That banks on every gcc-built Perl using the same ABI. gcc could change its default ABI the way MSVC once did, and one could build Perl with gcc and the non-default ABI. The GNU make build system could benefit from a similar test, without which it does not support MSVC-built Perl. For now, just add a comment. Most users taking the special step of building Perl with MSVC probably build PostgreSQL with MSVC. Discussion: https://postgr.es/m/20171130041441.GA3161526@rfd.leadboat.com	2017-12-08 18:13:49 -08:00
Peter Eisentraut	8b33b5b9df	Fix mistake in comment Reported-by: Masahiko Sawada <sawada.mshk@gmail.com>	2017-12-08 11:17:46 -05:00
Robert Haas	facd94e72f	Report failure to start a background worker. When a worker is flagged as BGW_NEVER_RESTART and we fail to start it, or if it is not marked BGW_NEVER_RESTART but is terminated before startup succeeds, what BgwHandleStatus should be reported? The previous code really hadn't considered this possibility (as indicated by the comments which ignore it completely) and would typically return BGWH_NOT_YET_STARTED, but that's not a good answer, because then there's no way for code using GetBackgroundWorkerPid() to tell the difference between a worker that has not started but will start later and a worker that has not started and will never be started. So, when this case happens, return BGWH_STOPPED instead. Update the comments to reflect this. The preceding fix by itself is insufficient to fix the problem, because the old code also didn't send a notification to the process identified in bgw_notify_pid when startup failed. That might've been technically correct under the theory that the status of the worker was BGWH_NOT_YET_STARTED, because the status would indeed not change when the worker failed to start, but now that we're more usefully reporting BGWH_STOPPED, a notification is needed. Without these fixes, code which starts background workers and then uses the recommended APIs to wait for those background workers to start would hang indefinitely if the postmaster failed to fork a worker. Amit Kapila and Robert Haas Discussion: http://postgr.es/m/CAA4eK1KDfKkvrjxsKJi3WPyceVi3dH1VCkbTJji2fuwKuB=3uw@mail.gmail.com	2017-12-06 09:08:30 -05:00
Robert Haas	f4bb60ed69	Mark assorted variables PGDLLIMPORT. This makes life easier for extension authors who wish to support Windows. Brian Cloutier, slightly amended by me. Discussion: http://postgr.es/m/CAJCy68fscdNhmzFPS4kyO00CADkvXvEa-28H-OtENk-pa2OTWw@mail.gmail.com	2017-12-05 09:35:15 -05:00
Tom Lane	225501cf75	Clean up assorted messiness around AllocateDir() usage. This patch fixes a couple of low-probability bugs that could lead to reporting an irrelevant errno value (and hence possibly a wrong SQLSTATE) concerning directory-open or file-open failures. It also fixes places where we took shortcuts in reporting such errors, either by using elog instead of ereport or by using ereport but forgetting to specify an errcode. And it eliminates a lot of just plain redundant error-handling code. In service of all this, export fd.c's formerly-static function ReadDirExtended, so that external callers can make use of the coding pattern dir = AllocateDir(path); while ((de = ReadDirExtended(dir, path, LOG)) != NULL) if they'd like to treat directory-open failures as mere LOG conditions rather than errors. Also fix FreeDir to be a no-op if we reach it with dir == NULL, as such a coding pattern would cause. Then, remove code at many call sites that was throwing an error or log message for AllocateDir failure, as ReadDir or ReadDirExtended can handle that job just fine. Aside from being a net code savings, this gets rid of a lot of not-quite-up-to-snuff reports, as mentioned above. (In some places these changes result in replacing a custom error message such as "could not open tablespace directory" with more generic wording "could not open directory", but it was agreed that the custom wording buys little as long as we report the directory name.) In some other call sites where we can't just remove code, change the error reports to be fully project-style-compliant. Also reorder code in restoreTwoPhaseData that was acquiring a lock between AllocateDir and ReadDir; in the unlikely but surely not impossible case that LWLockAcquire changes errno, AllocateDir failures would be misreported. There is no great value in opening the directory before acquiring TwoPhaseStateLock, so just do it in the other order. 
Also fix CheckXLogRemoved to guarantee that it preserves errno, as quite a number of call sites are implicitly assuming. (Again, it's unlikely but I think not impossible that errno could change during a SpinLockAcquire. If so, this function was broken for its own purposes as well as breaking callers.) And change a few places that were using not-per-project-style messages, such as "could not read directory" when "could not open directory" is more correct. Back-patch the exporting of ReadDirExtended, in case we have occasion to back-patch some fix that makes use of it; it's not needed right now but surely making it global is pretty harmless. Also back-patch the restoreTwoPhaseData and CheckXLogRemoved fixes. The rest of this is essentially cosmetic and need not get back-patched. Michael Paquier, with a bit of additional work by me Discussion: https://postgr.es/m/CAB7nPqRpOCxjiirHmebEFhXVTK7V5Jvw4bz82p7Oimtsm3TyZA@mail.gmail.com	2017-12-04 17:02:52 -05:00
Noah Misch	e73981cdc0	Fix non-GNU makefiles for AIX make. Invoking the Makefile without an explicit target was building every possible target instead of just the "all" target. Back-patch to 9.3 (all supported versions).	2017-11-30 00:57:32 -08:00
Magnus Hagander	65f1623336	Fix typo in comment Andreas Karlsson	2017-11-27 09:30:03 +01:00
Joe Conway	d8d9c97cd1	Make has_sequence_privilege support WITH GRANT OPTION The various has_*_privilege() functions all support an optional WITH GRANT OPTION added to the supported privilege types to test whether the privilege is held with grant option. That is, all except has_sequence_privilege() variations. Fix that. Back-patch to all supported branches. Discussion: https://postgr.es/m/005147f6-8280-42e9-5a03-dd2c1e4397ef@joeconway.com	2017-11-26 09:50:42 -08:00
Tom Lane	1601a9413f	Update MSVC build process for new timezone data. Missed this dependency in commits `7cce222c9` et al.	2017-11-25 18:15:23 -05:00
Tom Lane	10aa064c95	Replace raw timezone source data with IANA's new compact format. Traditionally IANA has distributed their timezone data in pure source form, replete with extensive historical comments. As of release 2017c, they've added a compact single-file format that omits comments and abbreviates command keywords. This form is way shorter than the pure source, even before considering its allegedly better compressibility. Hence, let's distribute the data in that form rather than pure source. I'm pushing this now, rather than at the next timezone database update, so that it's easy to confirm that this data file produces compiled zic output that's identical to what we were getting before. Discussion: https://postgr.es/m/1915.1511210334@sss.pgh.pa.us	2017-11-25 15:30:44 -05:00
Tom Lane	2e105cf6db	Repair failure with SubPlans in multi-row VALUES lists. When nodeValuesscan.c was written, it was impossible to have a SubPlan in VALUES --- any sub-SELECT there would have to be uncorrelated and thereby would produce an InitPlan instead. We therefore took a shortcut in the logic that throws away a ValuesScan's per-row expression evaluation data structures. This was broken by the introduction of LATERAL however; a sub-SELECT containing a lateral reference produces a correlated SubPlan. The cleanest fix for this would be to give up the optimization of discarding the expression eval state. But that still seems pretty unappetizing for long VALUES lists. It seems to work to just prevent the subexpressions from hooking into the ValuesScan node's subPlan list, so let's do that and see how well it works. (If this breaks, due to additional connections between the subexpressions and the outer query structures, we might consider compromises like throwing away data only for VALUES rows not containing SubPlans.) Per bug #14924 from Christian Duta. Back-patch to 9.3 where LATERAL was introduced. Discussion: https://postgr.es/m/20171124120836.1463.5310@wrigleys.postgresql.org	2017-11-25 14:15:48 -05:00
Noah Misch	558f620792	Support linking with MinGW-built Perl. This is necessary for ActivePerl 5.18 onwards and for Strawberry Perl. It is not sufficient for 32-bit builds with newer Visual Studio; these fail with error LNK2026. Back-patch to 9.3 (all supported versions). Reported by Victor Wagner. Discussion: https://postgr.es/m/20160326154321.7754ab8f@wagner.wagner.home	2017-11-23 20:29:48 -08:00
Robert Haas	294136d422	Provide for forward compatibility with future minor protocol versions. Previously, any attempt to request a 3.x protocol version other than 3.0 would lead to a hard connection failure, which made the minor protocol version really no different from the major protocol version and precluded gentle protocol version breaks. Instead, when the client requests a 3.x protocol version where x is greater than 0, send the new NegotiateProtocolVersion message to convey that we support only 3.0. This makes it possible to introduce new minor protocol versions without requiring a connection retry when the server is older. In addition, if the startup packet includes name/value pairs where the name starts with "_pq_.", assume that those are protocol options, not GUCs. Include those we don't support (i.e. all of them, at present) in the NegotiateProtocolVersion message so that the client knows they were not understood. This makes it possible for the client to request previously-unsupported features without bumping the protocol version at all; the client can tell from the server's response whether the option was understood. It will take some time before servers that support these new facilities become common in the wild; to speed things up and make things easier for a future 3.1 protocol version, back-patch to all supported releases. Robert Haas and Badrul Chowdhury Discussion: http://postgr.es/m/BN6PR21MB0772FFA0CBD298B76017744CD1730@BN6PR21MB0772.namprd21.prod.outlook.com Discussion: http://postgr.es/m/30788.1498672033@sss.pgh.pa.us	2017-11-21 14:38:29 -05:00
Tom Lane	13f2bdb639	Use out-of-line M68K spinlock code for OpenBSD as well as NetBSD. David Carlier (from a patch being carried by OpenBSD packagers) Discussion: https://postgr.es/m/CA+XhMqzwFSGVU7MEnfhCecc8YdP98tigXzzpd0AAdwaGwaVXEA@mail.gmail.com	2017-11-20 18:05:03 -05:00
Tom Lane	8bd8b4b77c	Add support for Motorola 88K to s_lock.h. Apparently there are still people out there who care about this old architecture. They probably care about dusty versions of Postgres too, so back-patch to all supported branches. David Carlier (from a patch being carried by OpenBSD packagers) Discussion: https://postgr.es/m/CA+XhMqzwFSGVU7MEnfhCecc8YdP98tigXzzpd0AAdwaGwaVXEA@mail.gmail.com	2017-11-20 17:57:46 -05:00
Noah Misch	ab8eae0bb5	MSVC: Rebuild spiexceptions.h when out of date. Also, add a warning to catch future instances of naming a nonexistent file as a prerequisite. Back-patch to 9.3 (all supported versions).	2017-11-12 18:44:38 -08:00
Noah Misch	e17b38db66	Install Windows crash dump handler before all else. Apart from calling write_stderr() on failure, the handler depends on no PostgreSQL facilities. We have experienced crashes before reaching the former call site. Given such an early crash, this change cannot hurt and may produce a helpful dump. Absent an early crash, this change has no effect. Back-patch to 9.3 (all supported versions). Takayuki Tsunakawa Discussion: https://postgr.es/m/0A3221C70F24FB45833433255569204D1F80CD13@G01JPEXMBYT05	2017-11-12 14:31:04 -08:00
Noah Misch	19cf9e96ae	Don't call pgwin32_message_to_UTF16() without CurrentMemoryContext. PostgreSQL running as a Windows service crashed upon calling write_stderr() before MemoryContextInit(). This fix completes work started in `5735efee15`. Messages this early contain only ASCII bytes; if we removed the CurrentMemoryContext requirement, the ensuing conversions would have no effect. Back-patch to 9.3 (all supported versions). Takayuki Tsunakawa, reviewed by Michael Paquier. Discussion: https://postgr.es/m/0A3221C70F24FB45833433255569204D1F80CC73@G01JPEXMBYT05	2017-11-12 13:03:29 -08:00
Noah Misch	ae5489e147	Add post-2010 ecpg tests to checktcp. This suite had been a proper superset of the regular ecpg test suite, but the three newest tests didn't reach it. To make this less likely to recur, delete the extra schedule file and pass the TCP-specific test on the command line. Back-patch to 9.3 (all supported versions).	2017-11-11 14:41:50 -08:00
Noah Misch	65fd34f7cc	Make connect/test1 independent of localhost IPv6. Since commit `868898739a`, it has assumed "localhost" resolves to both ::1 and 127.0.0.1. We gain nothing from that assumption, and it does not hold in a default installation of Red Hat Enterprise Linux 5. Back-patch to 9.3 (all supported versions).	2017-11-11 14:33:41 -08:00
Noah Misch	dfabce8827	Fix connect/test1 expected output. The test runs only as part of "checktcp". This is a back-patch to 9.5 and 9.4 of part of commit `868898739a`. Oversight in commit `61bee9f756`.	2017-11-11 14:22:51 -08:00

27072 Commits