identity: rework session refresh error handling (#4638)

Currently, if a temporary error occurs while attempting to refresh an
OAuth2 token, the identity manager won't schedule another attempt.
Instead, update the session refresh logic so that it will retry after
temporary errors. Extract the bulk of this logic into a separate method
that returns a boolean indicating whether to schedule another refresh.
Update the unit test to simulate a temporary error during OAuth2 token
refresh.
Co-authored-by: Kenneth Jenkins <51246568+kenjenkins@users.noreply.github.com>

identity: preserve session refresh schedule (#4633)

The databroker identity manager is responsible for refreshing session
records, to account for overall session expiration as well as OAuth2
access token expiration.
Refresh events are scheduled subject to a coolOffDuration (10 seconds,
by default) relative to a lastRefresh timestamp. Currently, any update
to a session record will reset the associated lastRefresh value and
reschedule any pending refresh event for that session. If an update
occurs shortly before a scheduled refresh event, the event is pushed
back to 10 seconds after that update.
This means that if a session is updated frequently enough (e.g. if there
is a steady stream of requests that cause constant updates via the
AccessTracker), the access token may expire before a refresh ever runs.
To avoid this problem, set the lastRefresh time on a session record
update only if it hasn't yet been set. Otherwise, update lastRefresh
only during the refresh attempt itself.
Add unit tests to exercise these changes. There is a now() function as
part of the manager configuration (to allow unit tests to set a fake
time); update the Manager to use this function throughout.
Co-authored-by: Kenneth Jenkins <51246568+kenjenkins@users.noreply.github.com>

identity: override TokenSource expiry behavior (#4632)

The current session refresh loop attempts to refresh access tokens when
they are due to expire in less than one minute. However, the code to
perform the refresh relies on a TokenSource from the x/oauth2 package,
which has its own internal 'expiryDelta' threshold, with a default of
10 seconds. As a result, the first four or five attempts to refresh a
particular access token will not actually refresh the token. The refresh
will happen only when the access token is within 10 seconds of expiring.
Instead, before we obtain a new TokenSource, first clear any existing
access token. This causes the TokenSource to consider the token invalid,
triggering a refresh. This should give the refresh loop more control
over when refreshes happen.
Consolidate this logic in a new Refresh() method in the oidc package.
Add unit tests for this new method.
Co-authored-by: Kenneth Jenkins <51246568+kenjenkins@users.noreply.github.com>

config: do not add route headers to global map (#4629)

Currently the GetSetResponseHeadersForPolicy() method may add entries to
the global SetResponseHeaders map, which can lead to one route's headers
being applied to other routes.
Instead, make a copy of the SetResponseHeaders map before adding any
route-specific response header entries.
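A minimal sketch of the copy-before-merge approach. The headersForRoute function is a hypothetical helper, not the actual GetSetResponseHeadersForPolicy() implementation.

```go
package main

import "fmt"

// headersForRoute returns the global headers plus the route-specific ones
// in a fresh map, leaving the global map untouched.
func headersForRoute(global, route map[string]string) map[string]string {
	merged := make(map[string]string, len(global)+len(route))
	for k, v := range global {
		merged[k] = v
	}
	for k, v := range route {
		merged[k] = v // route entries override global ones
	}
	return merged
}

func main() {
	global := map[string]string{"X-Frame-Options": "SAMEORIGIN"}
	route := map[string]string{"X-Route": "a"}
	fmt.Println(len(headersForRoute(global, route)), len(global))
}
```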
Add additional unit tests for GetSetResponseHeaders() and
GetSetResponseHeadersForPolicy().
Co-authored-by: Kenneth Jenkins <51246568+kenjenkins@users.noreply.github.com>
Set the Envoy option only_verify_leaf_cert_crl, to avoid a bug where
CRLs cannot be used in combination with an intermediate CA trust root.
Update the client certificate validation logic in the authorize service
to match this behavior.
Update the integration test libsonnet templates to assign a fixed IP
address to the trusted-httpdetails service. This requires also assigning
a fixed IP subnet to the docker network.
Configure a route with a 'to' URL using https and this fixed IP address.
Add a corresponding certificate with the IP address. Finally, add a test
case that makes a request to this route.
Add an integration test case to verify properties of the Pomerium
attestation JWT:
- The 'iat' and 'exp' timestamps should be plain integers.
- The JWT should contain an issuer and audience claim.
- A JWT retrieved from the /.pomerium/jwt endpoint should contain all
the same data as a JWT from the X-Pomerium-Jwt-Assertion header.
Fix the logic around when to add the default invalid_client_certificate
rule: this should only be added if mTLS is enabled and the enforcement
mode is not set to "policy". Add a unit test for this logic.
If Pomerium is operating in the insecure_server mode (e.g. if there is
another reverse proxy in front of Pomerium), then the ssl() Lua method
will return nil.
Add a check for this case to the set-client-certificate-metadata.lua
script, in order to avoid an error when attempting to store the client
certificate info.
Currently Pomerium replaces dynamic set_request_headers tokens
sequentially. As a result, if a replacement value itself contains a
supported "$pomerium" token, Pomerium may treat that as another
replacement, resulting in incorrect output.
This is unlikely to be a problem given the current set of dynamic
tokens, but if we continue to add additional tokens, this will likely
become more of a concern.
To forestall any issues, let's perform all replacements in one pass,
using the os.Expand() method. This does require a slight change to the
syntax, as tokens containing a '.' will need to be wrapped in curly
braces, e.g. ${pomerium.id_token}.
A literal dollar sign can be included by using $$ in the input.
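A minimal sketch of single-pass expansion with os.Expand(). The expandHeader helper and the sample values are hypothetical, and returning an empty string for unknown tokens is an assumption of this sketch rather than confirmed Pomerium behavior.

```go
package main

import (
	"fmt"
	"os"
)

// expandHeader replaces all tokens in tmpl in a single pass. Replacement
// values are never re-scanned, so a value that looks like a token is kept as is.
func expandHeader(tmpl string, values map[string]string) string {
	return os.Expand(tmpl, func(name string) string {
		if name == "$" {
			return "$" // "$$" in the input yields a literal dollar sign
		}
		return values[name] // assumption: unknown tokens expand to ""
	})
}

func main() {
	values := map[string]string{
		"pomerium.id_token":     "${pomerium.access_token}", // value resembling a token
		"pomerium.access_token": "secret",
	}
	fmt.Println(expandHeader("Bearer ${pomerium.id_token} $$5", values))
}
```

Without the curly braces, os.Expand() would read only "pomerium" as the token name and stop at the '.', which is why the syntax change is needed.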
Add a new match_subject_alt_names option to the downstream_mtls settings
group. This setting can be used to further constrain the allowed client
certificates by requiring that certificates contain a Subject
Alternative Name of a particular type, matching a particular regex.
When set, populate the corresponding match_typed_subject_alt_names
setting within Envoy, and also implement a corresponding check in the
authorize service.
Move the parseCRLs() method from package 'authorize/evaluator' to
'pkg/cryptutil', replacing the existing DecodeCRL() method. This method
will parse all CRLs found in the PEM input, rather than just the first.
(This removes our usage of the deprecated method x509.ParseDERCRL.)
Update this method to return an error if there is non-PEM data found in
the input, to satisfy the existing test that raw DER-encoded CRLs are
not permitted.
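The parsing behavior described above could look roughly like the following sketch, which is not the actual cryptutil code. Skipping PEM blocks of other types and the simple prefix check for non-PEM data are assumptions of this sketch.

```go
package main

import (
	"bytes"
	"crypto/x509"
	"encoding/pem"
	"errors"
	"fmt"
)

// parseCRLs parses every "X509 CRL" PEM block in data. It returns an error
// if the remaining input does not start with a PEM block, which rejects
// raw DER-encoded CRLs.
func parseCRLs(data []byte) ([]*x509.RevocationList, error) {
	var crls []*x509.RevocationList
	for {
		data = bytes.TrimSpace(data)
		if len(data) == 0 {
			return crls, nil
		}
		if !bytes.HasPrefix(data, []byte("-----BEGIN ")) {
			return nil, errors.New("non-PEM data in CRL input")
		}
		block, rest := pem.Decode(data)
		if block == nil {
			return nil, errors.New("malformed PEM data in CRL input")
		}
		data = rest
		if block.Type != "X509 CRL" {
			continue // assumption: blocks of other types are ignored
		}
		crl, err := x509.ParseRevocationList(block.Bytes)
		if err != nil {
			return nil, err
		}
		crls = append(crls, crl)
	}
}

func main() {
	_, err := parseCRLs([]byte{0x30, 0x82, 0x01, 0x00}) // raw DER prefix
	fmt.Println(err)
}
```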
Delete the CRLFromBase64() and CRLFromFile() methods, as these are no
longer used.
Add a new max_verify_depth option to the downstream_mtls settings group,
with a default value of 1 (to match the behavior of current Pomerium
releases).
Populate the corresponding setting within Envoy, and also implement a
depth check within isValidClientCertificate() in the authorize service.
Update the isValidClientCertificate() method to consider any
client-supplied intermediate certificates. Previously, in order to trust
client certificates issued by an intermediate CA, users would need to
include that intermediate CA's certificate directly in the client_ca
setting. After this change, only the trusted root CA needs to be set: as
long as the client can supply a set of certificates that chain back to
this trusted root, the client's certificate will validate successfully.
Rework the previous CRL checking logic to now consider CRLs for all
issuers in the verified chains.
Add a new client_certificate criterion that accepts a "Certificate
Matcher" object. Start with two certificate match conditions:
fingerprint and SPKI hash, each of which can accept either a single
string or an array of strings.
Add new "client-certificate-ok" and "client-certificate-unauthorized"
reason strings.
Add support for a new token $pomerium.client_cert_fingerprint in the
set_request_headers option. This token will be replaced with the SHA-256
hash of the presented leaf client certificate.
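As an illustration, a SHA-256 certificate fingerprint is computed over the DER encoding of the certificate. The certFingerprint helper is hypothetical, and the lowercase hex output is an assumption: the encoding Pomerium uses for this token is not stated here.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// certFingerprint returns the SHA-256 hash of a DER-encoded certificate,
// formatted as lowercase hex (the output encoding is an assumption).
func certFingerprint(der []byte) string {
	sum := sha256.Sum256(der)
	return hex.EncodeToString(sum[:])
}

func main() {
	// Placeholder bytes; in practice this would be x509.Certificate.Raw.
	fp := certFingerprint([]byte("placeholder DER bytes"))
	fmt.Println(len(fp)) // 64 hex characters for a 32-byte hash
}
```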
Add an "enforcement" option to the new downstream mTLS configuration
settings group.
When not set, or when set to "policy_default_deny", keep the current
behavior of adding an invalid_client_certificate rule to all policies.
When the enforcement mode is set to just "policy", remove the default
invalid_client_certificate rule that would normally be added.
When the enforcement mode is set to "reject_connection", configure the
Envoy listener with the require_client_certificate setting and remove
the ACCEPT_UNTRUSTED option.
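The three enforcement modes can be summarized in a small sketch. The listenerPlan type and planFor function are hypothetical. The sketch assumes a client CA is configured, and keeping the default rule in "reject_connection" mode follows the later fix that omits it only in "policy" mode.

```go
package main

import "fmt"

// listenerPlan is a hypothetical summary of what each mode configures.
type listenerPlan struct {
	addDefaultDenyRule       bool // add invalid_client_certificate to all policies
	requireClientCertificate bool // Envoy require_client_certificate setting
	acceptUntrusted          bool // Envoy ACCEPT_UNTRUSTED option
}

// planFor maps an enforcement mode string to its configuration.
func planFor(enforcement string) (listenerPlan, error) {
	switch enforcement {
	case "", "policy_default_deny":
		return listenerPlan{addDefaultDenyRule: true, acceptUntrusted: true}, nil
	case "policy":
		return listenerPlan{acceptUntrusted: true}, nil
	case "reject_connection":
		return listenerPlan{addDefaultDenyRule: true, requireClientCertificate: true}, nil
	default:
		return listenerPlan{}, fmt.Errorf("unknown enforcement mode %q", enforcement)
	}
}

func main() {
	for _, mode := range []string{"", "policy", "reject_connection"} {
		plan, _ := planFor(mode)
		fmt.Printf("%q: %+v\n", mode, plan)
	}
}
```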
Add a corresponding field to the Settings proto.
Move downstream mTLS settings to a nested config file object, under the
key 'downstream_mtls', and add a new DownstreamMTLSSettings struct for
these settings.
Deprecate the existing ClientCA and ClientCAFile fields in the Options
struct, but continue to honor them for now (log a warning if either is
populated).
Delete the ClientCRL and ClientCRLFile fields entirely (in current
releases these cannot be set without causing an Envoy error, so this
should not be a breaking change).
Update the Settings proto to mirror this nested structure.
Update bindEnvs() to add support for binding nested fields of the
Options struct to environment variables. The variable names are formed
by joining the nested fields' mapstructure tags with underscores (after
first converting to uppercase).
This is in preparation for adding a new nested struct for downstream
mTLS settings that will look something like this:
    downstream_mtls:
      ca_file: /path/to/CA/cert.pem
      enforcement: reject_connection
With this change, these fields would be bound to the variables
DOWNSTREAM_MTLS_CA_FILE and DOWNSTREAM_MTLS_ENFORCEMENT.
Update isValidClientCertificate() to also consult the configured
certificate revocation lists. Update existing test cases and add a new
unit test to exercise the revocation support. Restore the skipped
integration test case.
Generate new test certificates and CRLs using a new `go run`-able source
file.
Partially revert #4374: do not record the peerCertificateValidated()
result as reported by Envoy, as this does not work correctly for resumed
TLS sessions. Instead always record the certificate chain as presented
by the client. Remove the corresponding ClientCertificateInfo Validated
field, and update affected code accordingly. Skip the CRL integration
test case for now.