- Return `PeerEOF` enum instead of `Error` containing string from
`drain_messages_from_peer()`. There are no other error types to return
from this function, so boolean-like enum is sufficient.
- Don't override read hook in `ConnectionFromClient` constructor. It was
previously redefined only to suppress EOF error returned by
`drain_messages_from_peer()`.
If the Ladybird process crashes or just ends normally, the IPC transport
connection with WebContent may be shutdown after a send sync event (for
example: WebContentClient DidRequestCookie) was sent from WebContent,
but before the Ladybird process provided the matching sync event
response (for example: WebContentClient DidRequestCookieResponse). This
can lead to a runaway WebContent process if other IPC events (for
example: WebContentServer DidPaint, or SetSystemVisibilityState, or
MouseEvent, or CloseServer, etc...) are also queued when the IPC
connection is shutdown.
At the core of the issue is that the loop waiting for the matching
send sync response will prioritize waiting for the response and remain
spinning even if the IPC connection is reporting that it was shutdown,
but only if there happens to be other unrelated events received before
the IPC shutdown is detected. These unrelated events will not be
processed because the loop is stuck waiting for the response that due
to the Ladybird process having stopped, will never be sent.
Because the shutdown of the IPC connection is not handled when other
events happen to be also present, new events may be posted for transfer
by the WebContent process if the page is very active. If many new events
are posted this could lead to a slow or very quick memory leak in the
WebContent process due to the queue growing large, sometimes all the way
to total system memory exhaustion. If no events or only a few new events
are sent, then the leak may be hard to detect.
This PR fixes the faulty IPC shutdown handling by not getting stuck if
any messages are present in the receive queue. Before returning to the
caller any remaining messages will be immediately processed.
The local handling of some messages might cause the transport to get
closed. If that's the case, we shouldn't try to send back a response.
This fixes many of the "Trying to post_message during IPC shutdown"
errors I was seeing when terminating Ladybird or when abnormally exiting
from LibWeb tests.
Instead of wrapping all non-movable members of TransportSocket in OwnPtr
to keep it movable, make TransportSocket itself non-movable and wrap it
in OwnPtr.
By doing this we also make MessagePort, that relies on IPC transport,
to send messages from separate thread, which solves the problem when
WebWorker and WebContent could deadlock if both were trying to post
messages at the same time.
Fixes https://github.com/LadybirdBrowser/ladybird/issues/4254
Reimplements c3121c9d at the transport layer, allowing us to solve the
same problem once, in a single place, for both the LibIPC connection and
MessagePort. This avoids exposing a workaround for a macOS specific Unix
domain socket issue to higher abstraction layers.
With this change, the responsibility for prepending messages with their
size and ensuring the entire message is received before returning it to
the caller is moved to TransportSocket. This removes the need to
duplicate this logic in both LibIPC and MessagePort.
Another advantage of reducing message granularity at IPC::Transport
layer is that it will make it easier to support alternative transport
implementations (like Mach ports, which unlike Unix domain sockets are
not stream oriented).
This change ensures that instead of immediately deallocating the message
buffer after sending, we retain it in an acknowledgement wait queue
until an acknowledgement is received from the peer. This is necessary
to handle a behavior of the macOS kernel, which may prematurely
garbage-collect file descriptors contained within the message buffer
before the peer receives them.
The acknowledgement mechanism assumes messages are received in the same
order they were sent so, each acknowledgement message simply indicates
the count of successfully received messages, specifying how many entries
can safely be removed from the acknowledgement wait queue.
It turned out that some web applications want to send fairly large
messages to WebWorker through IPC (for example, MapLibre GL sends
~1200KiB), which led to failures (at least on macOS) because buffer size
of TransportSocket is limited to 128KiB. This change solves the problem
by wrapping messages that exceed socket buffer size into another message
that holds wrapped message content in shared memory.
Co-Authored-By: Luke Wilde <luke@ladybird.org>
This has been a longstanding ergonomic issue with our IPC compiler. Non-
trivial types were previously passed by const&. So if we wanted to avoid
expensive copies, we would have to const_cast and move the data.
We now pass ownership of all transferred data to the client subclasses.
This allows us to remove const_cast from these methods, and allows us to
avoid some trivial expensive copies that we didn't bother to const_cast.