* Optimize policy iterators (go1.23)
This modifies (*Options).GetAllPolicies() to use a Go 1.23 iterator
instead of copying all policies on every call, which can be extremely
expensive. All existing usages of this function were updated as
necessary.
Additionally, a new (*Options).NumPolicies() method was added which
quickly computes the number of policies that would be given by
GetAllPolicies(), since there were several usages where only the
number of policies was needed.
* Fix race condition when assigning default envoy opts to a policy
This allows using the scheme 'h2c' to indicate HTTP/2 prior knowledge for
insecure upstream servers. This can be used to perform TLS termination for
gRPC servers configured with insecure credentials.
As an example, this allows the following route configuration:
    routes:
      - from: https://grpc.localhost.pomerium.io
        to: h2c://localhost:9090
envoy: log mtls failures
This implements limited listener-based access logging for downstream
transport failures, only enabled when downstream_mtls.enforcement is
set to 'reject_connection'. Client certificate details and the error
message will be logged.
Additionally, the new key 'client-certificate' can be set in the
access_log_fields list in the configuration, which will add peer
certificate properties (issuer, subject, SANs) to the existing
per-request http logs.
---------
Co-authored-by: Kenneth Jenkins <51246568+kenjenkins@users.noreply.github.com>
Add a new 'user_principal_name' type to the downstream mTLS
match_subject_alt_names option. This corresponds to the 'OtherName' type
with type-id 1.3.6.1.4.1.311.20.2.3 and a UTF8String value.
Add support for UserPrincipalName SAN matching to the policy evaluator.
* Initial envoy cgroup resource monitor implementation
* Add cgroupv1 support; add metrics instrumentation
* Slight refactor for more efficient memory limit detection
Instead of reading memory.max/limit_in_bytes on every tick, we
read it once, then again only when it is modified.
To support this change, logic for computing the saturation was moved out
of the cgroup driver and into the resource monitor, and the driver
interface now has separate methods for reading memory usage and limit.
* Code cleanup/lint fixes
* Add platform build tags
* Add unit tests
* Fix lint issues
* Add runtime flag to allow disabling resource monitor
* Clamp saturation values to the range [0.0, 1.0]
* Switch to x/sys/unix; handle inotify IN_IGNORED events
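The saturation computation and clamping mentioned in the list above could look roughly like this. The function names are hypothetical, and treating a zero limit as "no limit" is an assumption made for the sketch, not something stated in this message.

```go
package main

import "fmt"

// clamp restricts v to the range [0.0, 1.0].
func clamp(v float64) float64 {
	if v < 0 {
		return 0
	}
	if v > 1 {
		return 1
	}
	return v
}

// saturation returns memory usage as a fraction of the limit, clamped to
// [0.0, 1.0]. A limit of 0 is treated as "no limit" (an assumption).
func saturation(usage, limit uint64) float64 {
	if limit == 0 {
		return 0
	}
	return clamp(float64(usage) / float64(limit))
}

func main() {
	fmt.Println(saturation(512, 1024))
	fmt.Println(saturation(2048, 1024)) // usage above the limit is clamped
}
```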
Replace Atoi() calls with ParseUint(), and update the buildAddress()
defaultPort parameter to be a uint32. (A uint16 would arguably make more
sense for a port number, but uint32 matches the Envoy proto field.)
Delete a ParseAddress() method that appears to be unused.
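For illustration, parsing a port with ParseUint() and a 32-bit size rejects negative and oversized values at parse time. The parsePort helper below is hypothetical and is not the buildAddress() function itself.

```go
package main

import (
	"fmt"
	"strconv"
)

// parsePort converts a decimal string into a uint32 port value.
// A bit size of 32 makes ParseUint fail on anything that does not fit.
func parsePort(s string) (uint32, error) {
	v, err := strconv.ParseUint(s, 10, 32)
	if err != nil {
		return 0, fmt.Errorf("invalid port %q: %w", s, err)
	}
	return uint32(v), nil
}

func main() {
	for _, s := range []string{"8080", "-1", "4294967296"} {
		port, err := parsePort(s)
		fmt.Println(port, err)
	}
}
```

Unlike Atoi(), this never yields a negative number that would then need a separate range check before conversion.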
Add a distinction between TCP routes depending on whether the To URL(s)
have the scheme tcp://. For routes with a TCP upstream, configure Envoy
to terminate CONNECT requests and open a TCP tunnel to the upstream
service (this is the current behavior). For routes without a TCP
upstream, configure Envoy to proxy CONNECT requests to the upstream.
This new mode can allow an upstream proxy server to terminate a CONNECT
request and open its own TCP tunnel to the final destination server.
(Note that this will typically require setting the preserve_host_header
option as well.)
Note that this requires Envoy 1.30 or later.
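A rough sketch of the distinction described above. The helper names and the returned mode strings are hypothetical, and the sketch assumes a route counts as having a TCP upstream when any of its To URLs uses the tcp scheme.

```go
package main

import (
	"fmt"
	"net/url"
)

// hasTCPUpstream reports whether any To URL uses the tcp:// scheme.
func hasTCPUpstream(to []string) (bool, error) {
	for _, raw := range to {
		u, err := url.Parse(raw)
		if err != nil {
			return false, err
		}
		if u.Scheme == "tcp" {
			return true, nil
		}
	}
	return false, nil
}

// connectMode names how CONNECT requests would be handled for a route.
func connectMode(to []string) (string, error) {
	tcp, err := hasTCPUpstream(to)
	if err != nil {
		return "", err
	}
	if tcp {
		return "terminate", nil // open a TCP tunnel to the upstream
	}
	return "proxy", nil // forward the CONNECT request to the upstream
}

func main() {
	fmt.Println(connectMode([]string{"tcp://db.internal:5432"}))
	fmt.Println(connectMode([]string{"http://proxy.internal:3128"}))
}
```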
Envoy has an option 'auto_host_rewrite' that rewrites the Host header of
an incoming request to match the upstream domain that the proxied
request is sent to. Pomerium sets the 'auto_host_rewrite' option for all
Pomerium routes that do not set one of the "Host Rewrite options" (see
https://www.pomerium.com/docs/reference/routes/headers#host-rewrite-options).
When Envoy rewrites the Host header, it does not include the upstream
port, even when it is a non-standard port for the scheme (i.e. a port
other than 80 for http or a port other than 443 for https).
I think this behavior does not conform to RFC 9110. The nearest thing I
can find in the text is this statement about http and https URIs:
"If the port is equal to the default port for a scheme, the normal form
is to omit the port subcomponent."
(from https://datatracker.ietf.org/doc/html/rfc9110#section-4.2.3)
I take this to mean that the port should be specified in other cases.
There is a work-around: we can set an explicit hostname on each cluster
endpoint. Let's set this hostname based on the 'to' URL(s) from the
Pomerium route.
This should change the current behavior in two cases:
- When a route has a 'to' URL with a port number, this port number will
now be included in the Host header in the requests made by Pomerium.
- When a route has a 'to' URL with 'localhost' or an IP address as the
host, Pomerium will now rewrite the Host header to match the 'to'
URL.
There should be no change in behavior for routes where one of the "Host
Rewrite options" is set.
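To illustrate the intended Host header value, here is a small sketch that derives it from a 'to' URL. The function name is hypothetical, and omitting the port when it equals the scheme default is an assumption based on the RFC 9110 normal form quoted above.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// hostHeaderFor returns the host (and port, when it is not the default for
// the scheme) taken from a route's 'to' URL.
func hostHeaderFor(rawTo string) (string, error) {
	u, err := url.Parse(rawTo)
	if err != nil {
		return "", err
	}
	host, port := u.Host, u.Port()
	isDefault := (u.Scheme == "http" && port == "80") ||
		(u.Scheme == "https" && port == "443")
	if isDefault {
		// Drop only the trailing ":port" so bracketed IPv6 hosts stay intact.
		host = strings.TrimSuffix(host, ":"+port)
	}
	return host, nil
}

func main() {
	for _, to := range []string{
		"http://localhost:8080",
		"https://example.com:443",
		"http://127.0.0.1",
	} {
		h, err := hostHeaderFor(to)
		fmt.Println(h, err)
	}
}
```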
The client_ca and client_ca_file settings were deprecated in v0.23.
Remove these options and add a link to the corresponding explanation on
the Upgrading docs page.
In split service mode, and during periods of inactivity, the gRPC
connections to the databroker may fall idle. Some network firewalls may
eventually time out an idle TCP connection and even start dropping
subsequent packets once connection traffic resumes. Combined with Linux
default TCP retransmission settings, this could cause a broken
connection to persist for over 15 minutes.
In an attempt to avoid this scenario, enable TCP keepalive for outbound
gRPC connections, matching the Go standard library default settings for
time & interval: 15 seconds for both. (The probe count does not appear
to be set, so it will remain at the OS default.)
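A minimal sketch of a dialer with the keepalive settings described above, using only the standard library. The function and constant names are hypothetical; how the dialer is attached to the gRPC client (for example through a custom dialer option) is left out here.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// keepAlive is the value described above: 15 seconds.
const keepAlive = 15 * time.Second

// newKeepAliveDialer returns a dialer that enables TCP keepalive on the
// connections it creates. The probe count is left at the OS default.
func newKeepAliveDialer() *net.Dialer {
	return &net.Dialer{KeepAlive: keepAlive}
}

func main() {
	d := newKeepAliveDialer()
	fmt.Println("keepalive:", d.KeepAlive)
}
```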
Add a test case exercising the BuildClusters() method with the default
configuration options, comparing the results with a reference "golden"
file in the testdata directory. Also add an '-update' flag to make it
easier to update the reference golden when needed:
    go test ./config/envoyconfig -update
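The golden-file pattern can be sketched as follows. The helper names and file name are hypothetical; in a real test the update value would typically come from a flag registered with flag.Bool("update", ...), and the comparison would run against a file under testdata.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"path/filepath"
)

// checkGolden compares got with the golden file at path. When update is
// true it rewrites the golden file with got instead of comparing.
func checkGolden(path string, got []byte, update bool) error {
	if update {
		return os.WriteFile(path, got, 0o644)
	}
	want, err := os.ReadFile(path)
	if err != nil {
		return err
	}
	if !bytes.Equal(want, got) {
		return fmt.Errorf("output differs from golden file %s", path)
	}
	return nil
}

// demo writes a golden file in a temp directory, then checks a matching
// and a non-matching output against it.
func demo() error {
	dir, err := os.MkdirTemp("", "golden")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "clusters.json")
	if err := checkGolden(path, []byte("v1"), true); err != nil {
		return err
	}
	if err := checkGolden(path, []byte("v1"), false); err != nil {
		return err
	}
	if checkGolden(path, []byte("v2"), false) == nil {
		return fmt.Errorf("expected a mismatch for changed output")
	}
	return nil
}

func main() {
	fmt.Println("demo error:", demo())
}
```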
This partially reverts commit a1388592d8.
Fetching the authenticate service HPKE public key is required only for
the stateless authentication flow. Now that Pomerium will instead use
the older (stateful) authentication flow when configured for a
self-hosted authenticate service, this logic shouldn't be needed at all.
Removing this logic should also make it easier to test against a local
instance of the hosted authenticate service.
Add an environment variable to allow forcing either the stateful or the
stateless authenticate flow.
Split the existing integration test clusters "single" and "multi" into
four new clusters: "single-stateful", "single-stateless",
"multi-stateful", and "multi-stateless", so that the integration tests
will run for both the stateful and the stateless authenticate flows.
(The "kubernetes" cluster is not currently being run, so I've left it
alone for now.)
Update the initialization logic for the authenticate, authorize, and
proxy services to automatically select between the stateful
authentication flow and the stateless authentication flow, depending on
whether Pomerium is configured to use the hosted authenticate service.
Add a unit test case to verify that the sign_out handler does not
trigger a sign in redirect.
* core/config: update file watcher source to handle base64 encoded certificates
* fix data race
* core/config: only allow files in certificates
* remove test
* re-add test
* core/config: refactor file watcher
* add comments
* updates
* only use the polling watcher
* fix test
* fix test
* try to fix test again
* remove batching
* don't rely on file modification timestamp
* remove benchmark
* try fix again
Remove the Redis databroker backend. According to
https://www.pomerium.com/docs/internals/data-storage#redis it has been
discouraged since Pomerium v0.18.
Update the config options validation to return an error if "redis" is
set as the databroker storage backend type.
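A sketch of the validation described above. The function name and error text are hypothetical, and the sketch checks only for the removed "redis" value rather than listing the supported backend types.

```go
package main

import (
	"errors"
	"fmt"
)

// errRedisRemoved is a hypothetical error value for the removed backend.
var errRedisRemoved = errors.New("the redis databroker storage backend is no longer supported")

// validateStorageType rejects the removed "redis" backend type.
func validateStorageType(storageType string) error {
	if storageType == "redis" {
		return errRedisRemoved
	}
	return nil
}

func main() {
	fmt.Println(validateStorageType("redis"))
	fmt.Println(validateStorageType("memory"))
}
```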
* core/config: refactor change dispatcher
* update test
* close listener goroutine when context is canceled
* use cancel cause
* use context
* add more time
* more time
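The listener shutdown and cancel-cause items above can be illustrated with a small sketch. All names here are hypothetical and do not reflect the actual change dispatcher; the sketch only shows a goroutine that exits when its context is canceled and a caller that reads the cancellation cause.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// errStopped is a hypothetical cause passed when the listener is shut down.
var errStopped = errors.New("listener stopped")

// startListener delivers events to fn until ctx is canceled. The returned
// channel is closed once the goroutine has exited.
func startListener(ctx context.Context, events <-chan string, fn func(string)) <-chan struct{} {
	done := make(chan struct{})
	go func() {
		defer close(done)
		for {
			select {
			case <-ctx.Done():
				return
			case e := <-events:
				fn(e)
			}
		}
	}()
	return done
}

// demo cancels the listener with a cause and returns that cause.
func demo() error {
	ctx, cancel := context.WithCancelCause(context.Background())
	done := startListener(ctx, make(chan string), func(string) {})
	cancel(errStopped)
	<-done // wait for the goroutine to exit
	return context.Cause(ctx)
}

func main() {
	fmt.Println("cause:", demo())
}
```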