## Summary
Update the `RouteID` to use the `policy.ID` if it is set. This makes it
so that updated routes use a stable identifier between updates so if the
envoy control plane is updated before the authorize service's internal
definitions (or vice-versa) the authorize service will still be able to
match the route.
The current behavior results in a 404 if envoy passes the old route id.
The new behavior will result in inconsistency, but it should be quickly
remedied. To help with debugging 4 new fields were added to the
authorize check log. The `route-id` and `route-checksum` as the
authorize sees it and the `envoy-route-id` and `envoy-route-checksum` as
envoy sees it.
I also updated the way we send updates to envoy to try and model their
recommended approach:
> In general, to avoid traffic drop, sequencing of updates should follow
a make before break model, wherein:
>
> - CDS updates (if any) must always be pushed first.
> - EDS updates (if any) must arrive after CDS updates for the
respective clusters.
> - LDS updates must arrive after corresponding CDS/EDS updates.
> - RDS updates related to the newly added listeners must arrive after
CDS/EDS/LDS updates.
> - VHDS updates (if any) related to the newly added RouteConfigurations
must arrive after RDS updates.
> - Stale CDS clusters and related EDS endpoints (ones no longer being
referenced) can then be removed.
This should help avoid 404s when configuration is being updated.
## Related issues
-
[ENG-2386](https://linear.app/pomerium/issue/ENG-2386/large-number-of-routes-leads-to-404s-and-slowness)
## Checklist
- [x] reference any related issues
- [x] updated unit tests
- [x] add appropriate label (`enhancement`, `bug`, `breaking`,
`dependencies`, `ci`)
- [x] ready for review
* upgrade to go v1.24
* add a macOS-specific //nolint comment too
---------
Co-authored-by: Kenneth Jenkins <51246568+kenjenkins@users.noreply.github.com>
* update tracing config definitions
* new tracing system
* performance improvements
* only configure tracing in envoy if it is enabled in pomerium
* [tracing] refactor to use custom extension for trace id editing (#5420)
refactor to use custom extension for trace id editing
* set default tracing sample rate to 1.0
* fix proxy service http middleware
* improve some existing auth related traces
* test fixes
* bump envoyproxy/go-control-plane
* code cleanup
* test fixes
* Fix missing spans for well-known endpoints
* import extension apis from pomerium/envoy-custom
This also replaces instances where we manually write "return ctx.Err()"
with "return context.Cause(ctx)" which is functionally identical, but
will also correctly propagate cause errors if present.
* Optimize policy iterators (go1.23)
This modifies (*Options).GetAllPolicies() to use a go 1.23 iterator
instead of copying all policies on every call, which can be extremely
expensive. All existing usages of this function were updated as
necessary.
Additionally, a new (*Options).NumPolicies() method was added which
quickly computes the number of policies that would be given by
GetAllPolicies(), since there were several usages where only the
number of policies was needed.
* Fix race condition when assigning default envoy opts to a policy
* wip
* wip
* handle wildcards in override name
* remove wait for ready, add comment about sync, force initial sync complete in test
* address comments
* authorize: add databroker server and record version to result, force sync via polling
* wrap inmem store to take read lock when grabbing databroker versions
* address code review comments
* reset max to 0
* refactor backend, implement encrypted store
* refactor in-memory store
* wip
* wip
* wip
* add syncer test
* fix redis expiry
* fix linting issues
* fix test by skipping non-config records
* fix backoff import
* fix init issues
* fix query
* wait for initial sync before starting directory sync
* add type to SyncLatest
* add more log messages, fix deadlock in in-memory store, always return server version from SyncLatest
* update sync types and tests
* add redis tests
* skip macos in github actions
* add comments to proto
* split getBackend into separate methods
* handle errors in initVersion
* return different error for not found vs other errors in get
* use exponential backoff for redis transaction retry
* rename raw to result
* use context instead of close channel
* store type urls as constants in databroker
* use timestampb instead of ptypes
* fix group merging not waiting
* change locked names
* update GetAll to return latest record version
* add method to grpcutil to get the type url for a protobuf type
- gofumpt everything
- fix TLS MinVersion to be at least 1.2
- add octal syntax
- remove newlines
- fix potential decompression bomb in ecjson
- remove implicit memory aliasing in for loops.
Signed-off-by: Bobby DeSimone <bobbydesimone@gmail.com>