This is like shared_preload_libraries except that it takes effect at
backend start and can be changed without a full postmaster restart. It
is like local_preload_libraries except that it is still only settable by
a superuser. This can be a better way to load modules such as
auto_explain.
Since there are now three preload parameters, regroup the documentation
a bit. Put all parameters into one section, explain common
functionality only once, update the descriptions to reflect current and
future realities.
Reviewed-by: Dimitri Fontaine <dimitri@2ndQuadrant.fr>
Valgrind "client requests" in aset.c and mcxt.c teach Valgrind and its
Memcheck tool about the PostgreSQL allocator. This makes Valgrind
roughly as sensitive to memory errors involving palloc chunks as it is
to memory errors involving malloc chunks. Further client requests in
PageAddItem() and printtup() verify that all bits being added to a
buffer page or furnished to an output function are predictably-defined.
Those tests catch failures of C-language functions to fully initialize
the bits of a Datum, which in turn stymie optimizations that rely on
_equalConst(). Define the USE_VALGRIND symbol in pg_config_manual.h to
enable these additions. An included "suppression file" silences nominal
errors we don't plan to fix.
Reviewed in earlier versions by Peter Geoghegan and Korry Douglas.
An oversight in commit e710b65c1c allowed
database names beginning with "-" to be treated as though they were secure
command-line switches; and this switch processing occurs before client
authentication, so that even an unprivileged remote attacker could exploit
the bug, needing only connectivity to the postmaster's port. Assorted
exploits for this are possible, some requiring a valid database login,
some not. The worst known problem is that the "-r" switch can be invoked
to redirect the process's stderr output, so that subsequent error messages
will be appended to any file the server can write. This can for example be
used to corrupt the server's configuration files, so that it will fail when
next restarted. Complete destruction of database tables is also possible.
Fix by keeping the database name extracted from a startup packet fully
separate from command-line switches, as had already been done with the
user name field.
The Postgres project thanks Mitsumasa Kondo for discovering this bug,
Kyotaro Horiguchi for drafting the fix, and Noah Misch for recognizing
the full extent of the danger.
Security: CVE-2013-1899
This GUC allows limiting the time spent waiting to acquire any one
heavyweight lock.
In support of this, improve the recently-added timeout infrastructure
to permit efficiently enabling or disabling multiple timeouts at once.
That reduces the performance hit from turning on lock_timeout, though
it's still not zero.
Zoltán Böszörményi, reviewed by Tom Lane,
Stephen Frost, and Hari Babu
... not on auxiliary processes. I managed to overlook the fact that I
had disabled assertions on my HEAD checkout long ago.
Hopefully this will turn the buildfarm green again, and put an end to
today's silliness.
This reverts commit d573e239f0, "Take fewer
snapshots". While that seemed like a good idea at the time, it caused
execution to use a snapshot that had been acquired before locking any of
the tables mentioned in the query. This created user-visible anomalies
that were not present in any prior release of Postgres, as reported by
Tomas Vondra. While this whole area could do with a redesign (since there
are related cases that have anomalies anyway), it doesn't seem likely that
any future patch would be reasonably back-patchable; and we don't want 9.2
to exhibit a behavior that's subtly unlike either past or future releases.
Hence, revert to prior code while we rethink the problem.
The regular backend's main loop handles signal handling and error recovery
better than the current WAL sender command loop does. For example, if the
client hangs and a SIGTERM is received before starting streaming, the
walsender will now terminate immediately, rather than hang until the
connection times out.
Replace unix_socket_directory with unix_socket_directories, which is a list
of socket directories, and adjust postmaster's code to allow zero or more
Unix-domain sockets to be created.
This is mostly a straightforward change, but since the Unix sockets ought
to be created after the TCP/IP sockets for safety reasons (better chance
of detecting a port number conflict), AddToDataDirLockFile needs to be
fixed to support out-of-order updates of data directory lockfile lines.
That's a change that had been foreseen to be necessary someday anyway.
Honza Horak, reviewed and revised by Tom Lane
Management of timeouts was getting a little cumbersome; what we
originally had was more than enough back when we were only concerned
about deadlocks and query cancel; however, when we added timeouts for
standby processes, the code got considerably messier. Since there are
plans to add more complex timeouts, this seems a good time to introduce
a central timeout handling module.
External modules register their timeout handlers during process
initialization, and later enable and disable them as they see fit using
a simple API; timeout.c is in charge of keeping track of which timeouts
are in effect at any time, installing a common SIGALRM signal handler,
and calling setitimer() as appropriate to ensure timely firing of
external handlers.
timeout.c additionally supports pluggable modules to add their own
timeouts, though this capability isn't exercised anywhere yet.
Additionally, as of this commit, walsender processes are aware of
timeouts; we had a preexisting bug there that made those ignore SIGALRM,
thus being subject to unhandled deadlocks, particularly during the
authentication phase. This has already been fixed in back branches in
commit 0bf8eb2a, which see for more details.
Main author: Zoltán Böszörményi
Some review and cleanup by Álvaro Herrera
Extensive reworking by Tom Lane
The Solaris Studio compiler warns about these instances, unlike more
mainstream compilers such as gcc. But manual inspection showed that
the code is clearly not reachable, and we hope no worthy compiler will
complain about removing this code.
There was a wild mix of calling conventions: Some were declared to
return void and didn't return, some returned an int exit code, some
claimed to return an exit code, which the callers checked, but
actually never returned, and so on.
Now all of these functions are declared to return void and decorated
with attribute noreturn and don't return. That's easiest, and most
code already worked that way.
"Unexpected EOF on client connection" without an open transaction
is mostly noise, so turn it into DEBUG1. With an open transaction it's
still indicating a problem, so keep those as ERROR, and change the message
to indicate that it happened in a transaction.
The previous code could cause a backend crash after BEGIN; SAVEPOINT a;
LOCK TABLE foo (interrupted by ^C or statement timeout); ROLLBACK TO
SAVEPOINT a; LOCK TABLE foo, and might have leaked strong-lock counts
in other situations.
Report by Zoltán Böszörményi; patch review by Jeff Davis.
This was a thinko in previous commit. Now that stack base pointer is now set
in PostmasterMain and SubPostmasterMain, it doesn't need to be set in
PostgresMain anymore.
We used to only initialize the stack base pointer when starting up a regular
backend, not in other processes. In particular, autovacuum workers can run
arbitrary user code, and without stack-depth checking, infinite recursion
in e.g an index expression will bring down the whole cluster.
The comment about PL/Java using set_stack_base() is not yet true. As the
code stands, PL/java still modifies the stack_base_ptr variable directly.
However, it's been discussed in the PL/Java mailing list that it should be
changed to use the function, because PL/Java is currently oblivious to the
register stack used on Itanium. There's another issues with PL/Java, namely
that the stack base pointer it sets is not really the base of the stack, it
could be something close to the bottom of the stack. That's a separate issue
that might need some further changes to this code, but that's a different
story.
Backpatch to all supported releases.
Add a queryId field to Query and PlannedStmt. This is not used by the
core backend, except for being copied around at appropriate times.
It's meant to allow plug-ins to track a particular query forward from
parse analysis to execution.
The queryId is intentionally not dumped into stored rules (and hence this
commit doesn't bump catversion). You could argue that choice either way,
but it seems better that stored rule strings not have any dependency
on plug-ins that might or might not be present.
Also, add a post_parse_analyze_hook that gets invoked at the end of
parse analysis (but only for top-level analysis of complete queries,
not cases such as analyzing a domain's default-value expression).
This is mainly meant to be used to compute and assign a queryId,
but it could have other applications.
Peter Geoghegan
Making this operation look like a utility statement seems generally a good
idea, and particularly so in light of the desire to provide command
triggers for utility statements. The original choice of representing it as
SELECT with an IntoClause appendage had metastasized into rather a lot of
places, unfortunately, so that this patch is a great deal more complicated
than one might at first expect.
In particular, keeping EXPLAIN working for SELECT INTO and CREATE TABLE AS
subcommands required restructuring some EXPLAIN-related APIs. Add-on code
that calls ExplainOnePlan or ExplainOneUtility, or uses
ExplainOneQuery_hook, will need adjustment.
Also, the cases PREPARE ... SELECT INTO and CREATE RULE ... SELECT INTO,
which formerly were accepted though undocumented, are no longer accepted.
The PREPARE case can be replaced with use of CREATE TABLE AS EXECUTE.
The CREATE RULE case doesn't seem to have much real-world use (since the
rule would work only once before failing with "table already exists"),
so we'll not bother with that one.
Both SELECT INTO and CREATE TABLE AS still return a command tag of
"SELECT nnnn". There was some discussion of returning "CREATE TABLE nnnn",
but for the moment backwards compatibility wins the day.
Andres Freund and Tom Lane
It now prints the argument that was at fault.
Also fix a small misbehavior where the error message issued by
getopt() would complain about a program named "--single", because
that's what argv[0] is in the server process.
This separates the state (running/idle/idleintransaction etc) into
it's own field ("state"), and leaves the query field containing just
query text.
The query text will now mean "current query" when a query is running
and "last query" in other states. Accordingly,the field has been
renamed from current_query to query.
Since backwards compatibility was broken anyway to make that, the procpid
field has also been renamed to pid - along with the same field in
pg_stat_replication for consistency.
Scott Mead and Magnus Hagander, review work from Greg Smith
When a PORTAL_ONE_SELECT query is executed, we can opportunistically
reuse the parse/plan shot for the execution phase. This cuts down the
number of snapshots per simple query from 2 to 1 for the simple
protocol, and 3 to 2 for the extended protocol. Since we are only
reusing a snapshot taken early in the processing of the same protocol
message, the change shouldn't be user-visible, except that the remote
possibility of the planning and execution snapshots being different is
eliminated.
Note that this change does not make it safe to assume that the parse/plan
snapshot will certainly be reused; that will currently only happen if
PortalStart() decides to use the PORTAL_ONE_SELECT strategy. It might
be worth trying to provide some stronger guarantees here in the future,
but for now we don't.
Patch by me; review by Dimitri Fontaine.
lost. The only way we detect that at the moment is when write() fails when
we try to write to the socket.
Florian Pflug with small changes by me, reviewed by Greg Jaskiewicz.
When a btree index contains all columns required by the query, and the
visibility map shows that all tuples on a target heap page are
visible-to-all, we don't need to fetch that heap page. This patch depends
on the previous patches that made the visibility map reliable.
There's a fair amount left to do here, notably trying to figure out a less
chintzy way of estimating the cost of an index-only scan, but the core
functionality seems ready to commit.
Robert Haas and Ibrar Ahmed, with some previous work by Heikki Linnakangas.
Rewrite plancache.c so that a "cached plan" (which is rather a misnomer
at this point) can support generation of custom, parameter-value-dependent
plans, and can make an intelligent choice between using custom plans and
the traditional generic-plan approach. The specific choice algorithm
implemented here can probably be improved in future, but this commit is
all about getting the mechanism in place, not the policy.
In addition, restructure the API to greatly reduce the amount of extraneous
data copying needed. The main compromise needed to make that possible was
to split the initial creation of a CachedPlanSource into two steps. It's
worth noting in particular that SPI_saveplan is now deprecated in favor of
SPI_keepplan, which accomplishes the same end result with zero data
copying, and no need to then spend even more cycles throwing away the
original SPIPlan. The risk of long-term memory leaks while manipulating
SPIPlans has also been greatly reduced. Most of this improvement is based
on use of the recently-added MemoryContextSetParent primitive.
We were doing some amazingly complicated things in order to avoid running
the very expensive identify_system_timezone() procedure during GUC
initialization. But there is an obvious fix for that, which is to do it
once during initdb and have initdb install the system-specific default into
postgresql.conf, as it already does for most other GUC variables that need
system-environment-dependent defaults. This means that the timezone (and
log_timezone) settings no longer have any magic behavior in the server.
Per discussion.
As per my recent proposal, this refactors things so that these typedefs and
macros are available in a header that can be included in frontend-ish code.
I also changed various headers that were undesirably including
utils/timestamp.h to include datatype/timestamp.h instead. Unsurprisingly,
this showed that half the system was getting utils/timestamp.h by way of
xlog.h.
No actual code changes here, just header refactoring.
In pursuit of this (and with the expectation that WaitLatch will be needed
in more places), convert the latch field that was already added to PGPROC
for sync rep into a generic latch that is activated for all PGPROC-owning
processes, and change many of the standard backend signal handlers to set
that latch when a signal happens. This will allow WaitLatch callers to be
wakened properly by these signals.
In passing, fix a whole bunch of signal handlers that had been hacked to do
things that might change errno, without adding the necessary save/restore
logic for errno. Also make some minor fixes in unix_latch.c, and clean
up bizarre and unsafe scheme for disowning the process's latch. Much of
this has to be back-patched into 9.1.
Peter Geoghegan, with additional work by Tom
There may be some other places where we should use errdetail_internal,
but they'll have to be evaluated case-by-case. This commit just hits
a bunch of places where invoking gettext is obviously a waste of cycles.
This option turns off autovacuum, prevents non-super-user connections,
and enables oid setting hooks in the backend. The code continues to use
the old autoavacuum disable settings for servers with earlier catalog
versions.
This includes a catalog version bump to identify servers that support
the -b option.
The previous functions of assign hooks are now split between check hooks
and assign hooks, where the former can fail but the latter shouldn't.
Aside from being conceptually clearer, this approach exposes the
"canonicalized" form of the variable value to guc.c without having to do
an actual assignment. And that lets us fix the problem recently noted by
Bernd Helmle that the auto-tune patch for wal_buffers resulted in bogus
log messages about "parameter "wal_buffers" cannot be changed without
restarting the server". There may be some speed advantage too, because
this design lets hook functions avoid re-parsing variable values when
restoring a previous state after a rollback (they can store a pre-parsed
representation of the value instead). This patch also resolves a
longstanding annoyance about custom error messages from variable assign
hooks: they should modify, not appear separately from, guc.c's own message
about "invalid parameter value".
1. Don't ignore query cancel interrupts. Instead, if the user asks to
cancel the query after we've already committed it, but before it's on
the standby, just emit a warning and let the COMMIT finish.
2. Don't ignore die interrupts (pg_terminate_backend or fast shutdown).
Instead, emit a warning message and close the connection without
acknowledging the commit. Other backends will still see the effect of
the commit, but there's no getting around that; it's too late to abort
at this point, and ignoring die interrupts altogether doesn't seem like
a good idea.
3. If synchronous_standby_names becomes empty, wake up all backends
waiting for synchronous replication to complete. Without this, someone
attempting to shut synchronous replication off could easily wedge the
entire system instead.
4. Avoid depending on the assumption that if a walsender updates
MyProc->syncRepState, we'll see the change even if we read it without
holding the lock. The window for this appears to be quite narrow (and
probably doesn't exist at all on machines with strong memory ordering)
but protecting against it is practically free, so do that.
5. Remove useless state SYNC_REP_MUST_DISCONNECT, which isn't needed and
doesn't actually do anything.
There's still some further work needed here to make the behavior of fast
shutdown plausible, but that looks complex, so I'm leaving it for a
separate commit. Review by Fujii Masao.
With this patch, portals, SQL functions, and SPI all agree that there
should be only a CommandCounterIncrement between the queries that are
generated from a single SQL command by rule expansion. Fetching a whole
new snapshot now happens only between original queries. This is equivalent
to the existing behavior of EXPLAIN ANALYZE, and it was judged to be the
best choice since it eliminates one source of concurrency hazards for
rules. The patch should also make things marginally faster by reducing the
number of snapshot push/pop operations.
The patch removes pg_parse_and_rewrite(), which is no longer used anywhere.
There was considerable discussion about more aggressive refactoring of the
query-processing functions exported by postgres.c, but for the moment
nothing more has been done there.
I also took the opportunity to refactor snapmgr.c's API slightly: the
former PushUpdatedSnapshot() has been split into two functions.
Marko Tiikkaja, reviewed by Steve Singer and Tom Lane
Previously reported as ERRCODE_ADMIN_SHUTDOWN, this case is now
reported as ERRCODE_T_R_DATABASE_DROPPED. No message text change.
Unlikely to happen on most servers, so low impact change to allow
session poolers to correctly handle this situation.
Tatsuo Ishii, edits by me, review by Robert Haas
We only need that header when compiling with icc, since the gcc variant of
ia64_get_bsp() uses in-line assembly code. Per report from Frank Brendel,
the header doesn't exist on all IA64 platforms; so don't include it unless
we need it.
We don't actually need optreset, because we can easily fix the code to
ensure that it's cleanly restartable after having completed a scan over the
argv array; which is the only case we need to restart in. Getting rid of
it avoids a class of interactions with the system libraries and allows
reversion of my change of yesterday in postmaster.c and postgres.c.
Back-patch to 8.4. Before that the getopt code was a bit different anyway.
The mingw people don't appear to care about compatibility with non-GNU
versions of getopt, so force use of our own copy of getopt on Windows.
Also, ensure that we make use of optreset when using our own copy.
Per report from Andrew Dunstan. Back-patch to all versions supported
on Windows.
Per recent investigation, the register stack can grow faster than the
regular stack depending on compiler and choice of options. To avoid
crashes we must check both stacks in check_stack_depth().
Since this is poorly-tested code, committing only to HEAD for the
moment ... but we might want to consider back-patching later.
Rather than considering this result as meaning "unknown", report LONG_MAX.
This won't change what superusers can set max_stack_depth to, but it will
cause InitializeGUCOptions() to set the built-in default to 2MB not 100kB.
The latter seems like a fairly unreasonable interpretation of "infinity".
Per my investigation of odd buildfarm results as well as an old complaint
from Heikki.
Since this should persuade all the buildfarm animals to use a reasonable
stack depth setting during "make check", revert previous patch that dumbed
down a recursive regression test to only 5 levels.
I'm mainly interested in finding out what it is on buildfarm machines,
but including the active value in the message seems like good practice
in any case. Add the info to the HINT, not the ERROR string, so as not
to change the regression tests' expected output.
transaction snapshots, i.e. a snapshot registered at the beginning of
a transaction. Change variable naming and comments to reflect this reality
in preparation for a future, truly serializable mode, e.g.
Serializable Snapshot Isolation (SSI).
For the moment transaction snapshots are still used to implement
SERIALIZABLE, but hopefully not for too much longer. Patch by Kevin
Grittner and Dan Ports with review and some minor wording changes by me.
those process types that go through InitPostgres; in particular, bootstrap
and standalone-backend cases. This ensures that we have set up a PGPROC
and done some other basic initialization steps (corresponding to the
if (IsUnderPostmaster) block in AuxiliaryProcessMain) before we attempt to
run WAL recovery in a standalone backend. As was discovered last September,
this is necessary for some corner-case code paths during WAL recovery,
particularly end-of-WAL cleanup.
Moving the bootstrap case here too is not necessary for correctness, but it
seems like a good idea since it reduces the number of distinct code paths.
In addition, add support for a "payload" string to be passed along with
each notify event.
This implementation should be significantly more efficient than the old one,
and is also more compatible with Hot Standby usage. There is not yet any
facility for HS slaves to receive notifications generated on the master,
although such a thing is possible in future.
Joachim Wieland, reviewed by Jeff Davis; also hacked on by me.
process. If startup waits on a buffer pin we send a request to all
backends to cancel themselves if they are holding the buffer pin
required and they are also waiting on a lock. If not, startup waits
until max_standby_delay before cancelling any backend waiting for
the requested buffer pin.
woken by alarm we send SIGUSR1 to all backends requesting that they
check to see if they are blocking Startup process. If so, they throw
ERROR/FATAL as for other conflict resolutions. Deadlock stop gap
removed. max_standby_delay = -1 option removed to prevent deadlock.
Conflict reason is passed through directly to the backend, so we can
take decisions about the effect of the conflict based upon the local
state. No specific changes, as yet, though this prepares for later work.
CancelVirtualTransaction() sends signals while holding ProcArrayLock.
Introduce errdetail_abort() to give message detail explaining that the
abort was caused by conflict processing. Remove CONFLICT_MODE states
in favour of using PROCSIG_RECOVERY_CONFLICT states directly, for clarity.
This includes two new kinds of postmaster processes, walsenders and
walreceiver. Walreceiver is responsible for connecting to the primary server
and streaming WAL to disk, while walsender runs in the primary server and
streams WAL from disk to the client.
Documentation still needs work, but the basics are there. We will probably
pull the replication section to a new chapter later on, as well as the
sections describing file-based replication. But let's do that as a separate
patch, so that it's easier to see what has been added/changed. This patch
also adds a new section to the chapter about FE/BE protocol, documenting the
protocol used by walsender/walreceivxer.
Bump catalog version because of two new functions,
pg_last_xlog_receive_location() and pg_last_xlog_replay_location(), for
monitoring the progress of replication.
Fujii Masao, with additional hacking by me
It is absolutely not okay to throw an ereport(ERROR) in any random place in
the code just because DoingCommandRead is set; interrupting, say, OpenSSL
in the midst of its activities is guaranteed to result in heartache.
Instead of that, undo the original optimizations that threw away
QueryCancelPending anytime we were starting or finishing a command read, and
instead discard the cancel request within ProcessInterrupts if we find that
there is no HS reason for forcing a cancel and we are DoingCommandRead.
In passing, may I once again condemn the practice of changing the code
and not fixing the adjacent comment that you just turned into a lie?
Enabled by recovery_connections = on (default) and forcing archive recovery using a recovery.conf. Recovery processing now emulates the original transactions as they are replayed, providing full locking and MVCC behaviour for read only queries. Recovery must enter consistent state before connections are allowed, so there is a delay, typically short, before connections succeed. Replay of recovering transactions can conflict and in some cases deadlock with queries during recovery; these result in query cancellation after max_standby_delay seconds have expired. Infrastructure changes have minor effects on normal running, though introduce four new types of WAL record.
New test mode "make standbycheck" allows regression tests of static command behaviour on a standby server while in recovery. Typical and extreme dynamic behaviours have been checked via code inspection and manual testing. Few port specific behaviours have been utilised, though primary testing has been on Linux only so far.
This commit is the basic patch. Additional changes will follow in this release to enhance some aspects of behaviour, notably improved handling of conflicts, deadlock detection and query cancellation. Changes to VACUUM FULL are also required.
Simon Riggs, with significant and lengthy review by Heikki Linnakangas, including streamlined redesign of snapshot creation and two-phase commit.
Important contributions from Florian Pflug, Mark Kirkwood, Merlin Moncure, Greg Stark, Gianni Ciolli, Gabriele Bartolini, Hannu Krosing, Robert Haas, Tatsuo Ishii, Hiroyuki Yamada plus support and feedback from many other community members.
This was possibly linked to a deadlock-like situation in glibc syslog code
invoked by the ereport call in quickdie(). In any case, a signal handler
should not unblock its own signal unless there is a specific reason to.
This patch also removes buffer-usage statistics from the track_counts
output, since this (or the global server statistics) is deemed to be a better
interface to this information.
Itagaki Takahiro, reviewed by Euler Taveira de Oliveira.
As proof of concept, modify plpgsql to use the hooks. plpgsql is still
inserting $n symbols textually, but the "back end" of the parsing process now
goes through the ParamRef hook instead of using a fixed parameter-type array,
and then execution only fetches actually-referenced parameters, using a hook
added to ParamListInfo.
Although there's a lot left to be done in plpgsql, this already cures the
"if (TG_OP = 'INSERT' and NEW.foo ...)" problem, as illustrated by the
changed regression test.
friends). This code has all been ifdef'd out for many years, and doesn't
seem to have any prospect of becoming any more useful in the future.
EXPLAIN ANALYZE is what people use in practice, and I think if we did want
process-wide counters we'd be more likely to put in dtrace events for that
than try to resurrect this code. Get rid of it so as to have one less detail
to worry about while refactoring execMain.c.
Recent commits have removed the various uses it was supporting. It was a
performance bottleneck, according to bug report #4919 by Lauris Ulmanis; seems
it slowed down user creation after a billion users.
to fix the problem that SetClientEncoding needs to be done before
InitializeClientEncoding, as reported by Zdenek Kotala. We get at least
the small consolation of being able to remove the bizarre API detail that
had InitPostgres returning whether user is a superuser.
(That flat file is now completely useless, but removal will come later.)
To do this, postpone client authentication into the startup transaction
that's run by InitPostgres. We still collect the startup packet and do
SSL initialization (if needed) at the same time we did before. The
AuthenticationTimeout is applied separately to startup packet collection
and the actual authentication cycle. (This is a bit annoying, since it
means a couple extra syscalls; but the signal handling requirements inside
and outside a transaction are sufficiently different that it seems best
to treat the timeouts as completely independent.)
A small security disadvantage is that if the given database name is invalid,
this will be reported to the client before any authentication happens.
We could work around that by connecting to database "postgres" instead,
but consensus seems to be that it's not worth introducing such surprising
behavior.
Processing of all command-line switches and GUC options received from the
client is now postponed until after authentication. This means that
PostAuthDelay is much less useful than it used to be --- if you need to
investigate problems during InitPostgres you'll have to set PreAuthDelay
instead. However, allowing an unauthenticated user to set any GUC options
whatever seems a bit too risky, so we'll live with that.
PostgresMain switch. In point of fact, FrontendProtocol is already set
in a backend process, since ProcessStartupPacket() is executed inside
the backend --- it hasn't been run by the postmaster for many years.
And if it were, we'd still certainly want FrontendProtocol to be set before
we get as far as PostgresMain, so that startup errors get reported in the
right protocol.
-v might have some future use in standalone backends, so I didn't go so
far as to remove the switch outright.
Also, initialize FrontendProtocol to 0 not PG_PROTOCOL_LATEST. The only
likely result of presetting it like that is to mask failure-to-set-it
mistakes.
This patch gets us out from under the Unix limitation of two user-defined
signal types. We already had done something similar for signals directed to
the postmaster process; this adds multiplexing for signals directed to
backends and auxiliary processes (so long as they're connected to shared
memory).
As proof of concept, replace the former usage of SIGUSR1 and SIGUSR2
for backends with use of the multiplexing mechanism. There are still some
hard-wired definitions of SIGUSR1 and SIGUSR2 for other process types,
but getting rid of those doesn't seem interesting at the moment.
Fujii Masao
copies?) to ensure they really don't run proc_exit/shmem_exit callbacks,
as was intended. I broke this behavior recently by installing atexit
callbacks without thinking about the one case where we truly don't want
to run those callback functions. Noted in an example from Dave Page.
is available during datatype input in Bind message processing. I put the
PopActiveSnapshot() or equivalent just before PortalDefineQuery, which is
an unsafe spot for it (in 8.3 and later) because we are carrying a plancache
refcount that hasn't yet been assigned to the portal. Any error thrown there
would result in leaking the refcount. It's not exactly likely that
PopActiveSnapshot would throw an elog, perhaps, but it could happen.
Reorder the code and add another comment warning not to do that.
when they are invoked by the parser. We had been setting up a snapshot at
plan time but really it needs to be done earlier, before parse analysis.
Per report from Dmitry Koterov.
Also fix two related problems discovered while poking at this one:
exec_bind_message called datatype input functions without establishing a
snapshot, and SET CONSTRAINTS IMMEDIATE could call trigger functions without
establishing a snapshot.
Backpatch to 8.2. The underlying problem goes much further back, but it is
masked in 8.1 and before because we didn't attempt to invoke domain check
constraints within datatype input. It would only be exposed if a C-language
datatype input function used the snapshot; which evidently none do, or we'd
have heard complaints sooner. Since this code has changed a lot over time,
a back-patch is hardly risk-free, and so I'm disinclined to patch further
than absolutely necessary.
replication patch needs a signal, but we've already used SIGUSR1 and
SIGUSR2 in normal backends. This patch allows reusing SIGUSR1 for that,
and for other purposes too if the need arises.
that a Portal is a useful and sufficient additional argument for
CreateDestReceiver --- it just isn't, in most cases. Instead formalize
the approach of passing any needed parameters to the receiver separately.
One unexpected benefit of this change is that we can declare typedef Portal
in a less surprising location.
This patch is just code rearrangement and doesn't change any functionality.
I'll tackle the HOLD-cursor-vs-toast problem in a follow-on patch.
free space information is stored in a dedicated FSM relation fork, with each
relation (except for hash indexes; they don't use FSM).
This eliminates the max_fsm_relations and max_fsm_pages GUC options; remove any
trace of them from the backend, initdb, and documentation.
Rewrite contrib/pg_freespacemap to match the new FSM implementation. Also
introduce a new variant of the get_raw_page(regclass, int4, int4) function in
contrib/pageinspect that let's you to return pages from any relation fork, and
a new fsm_page_contents() function to inspect the new FSM pages.
debug_print_plan to appear at LOG message level, not DEBUG1 as historically.
Make debug_pretty_print default to on. Also, cause plans generated via
EXPLAIN to be subject to debug_print_plan. This is all to make
debug_print_plan a reasonably comfortable substitute for the former behavior
of EXPLAIN VERBOSE.
a portal are never NULL, but reliably provide the source text of the query.
It turns out that there was only one place that was really taking a short-cut,
which was the 'EXECUTE' utility statement. That doesn't seem like a
sufficiently critical performance hotspot to justify not offering a guarantee
of validity of the portal source text. Fix it to copy the source text over
from the cached plan. Add Asserts in the places that set up cached plans and
portals to reject null source strings, and simplify a bunch of places that
formerly needed to guard against nulls.
There may be a few places that cons up statements for execution without
having any source text at all; I found one such in ConvertTriggerToFK().
It seems sufficient to inject a phony source string in such a case,
for instance
ProcessUtility((Node *) atstmt,
"(generated ALTER TABLE ADD FOREIGN KEY command)",
NULL, false, None_Receiver, NULL);
We should take a second look at the usage of debug_query_string,
particularly the recently added current_query() SQL function.
ITAGAKI Takahiro and Tom Lane
functions.
Note that because this patch changes FmgrInfo, any external C functions
you might be testing with 8.4 will need to be recompiled.
Patch by Martin Pihlak, some editorialization by me (principally, removing
tracking of getrusage() numbers)
There are two ways to track a snapshot: there's the "registered" list, which
is used for arbitrary long-lived snapshots; and there's the "active stack",
which is used for the snapshot that is considered "active" at any time.
This also allows users of snapshots to stop worrying about snapshot memory
allocation and freeing, and about using PG_TRY blocks around ActiveSnapshot
assignment. This is all done automatically now.
As a consequence, this allows us to reset MyProc->xmin when there are no
more snapshots registered in the current backend, reducing the impact that
long-running transactions have on VACUUM.