Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[pull] main from pyroscope-io:main #32

Open
wants to merge 4,164 commits into
base: main
Choose a base branch
from
Open

[pull] main from pyroscope-io:main #32

wants to merge 4,164 commits into from

Conversation

pull[bot]
Copy link

@pull pull bot commented Jun 18, 2021

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.1)

Can you help keep this open source service alive? 💖 Please sponsor : )

simonswine and others added 7 commits August 7, 2024 10:28
We stopped using the prefix based listing a while ago, so this brings us
back to upstream of thanos-io/objstore.
We currently allow any character to be user specified at upload. This
limits that character set. It also makes sure only IDs are listend and
retrieve with get, when they are fulfilling the character limit.

This validation will ensure that the generated identifiers are withing
the allowed set use for object store.
Pyrobench is a hackathon 2024-08 project, which provides a github
actiont that can run go microbenchmarks on default it will:

- Double check the code for the package with the Benchmarks in has
  changed (comparing base..head of PR)
- Will run the benchmarks multiple times for base and head and then
  compare it using benstat (WIP).
- Each benchmark run will also collect profiles and upload them to
  flamegraph.com
* add metastore API definition scratch

* Add metastore API definition (#3391)

* WIP: Add dummy metastore (#3394)

* Add dummy metastore

* Add metastore client (#3397)

* experiment: new architecture deployment (#3401)

* remove query-worker service for now

* fix metastore args

* enable persistence for metastore

* WIP: Distribute profiles based on tenant id, service name and labels (#3400)

* Distribute profiles to shards based on tenant/service_name/labels with no replication

* Add retries in case of delivery issues (WIP)

* Calculate shard for ingested profiles, send to ingesters in push request

* Set replication factor and token count via config

* Fix tests

* Run make helm/check check/unstaged-changes

* Run make reference-help

* Simplify shard calculation

* Add a metric for distributed bytes

* Register metric

* Revert undesired change

* metastore bootstrap include self

* fix ingester ring replication factor

* delete helm workflows

* wip: segments writer (#3405)

* start working on segments writer

add shards

awaiting segment flush in requests

upload blocks

add some tracing

* upload meta in case of metastore error

* do not upload metadata to dlq

* add some flags

* skip some tests. fmt

* skip e2e tests

* maybe fix microservices_test.go. I have no idea what im doing

* change partition selection

* rm e2e yaml

* fmt

* add compaction planner API definition

* rm unnecesary nested dirs

* debug log head stats

* more debug logs

* fix skipping empty head

* fix tests

* pass shards

* more debug logs

* fix nil deref in ingester

* more debugs logs in segmentsWriter

* start collecting some metrics for segments

* hack: purge state segments

* hack: purge stale segments on the leader

* more segment metrics

* more segment metrics

* more segment metrics

* more segment metrics

* make fmt

* more segment metrics

* fix panic caused by the metric with the same name

* more segment metrics

* more segment metrics

* make fmt

* decrease page buffer size

* decrease page buffer size, tsdb buffer writer size

* separate parquet write buffer size for segments and compacted blocks

* separate index write buffer size for segments and compacted blocks

* improve segment metrics - decrease cardinality by removing service as label

* fix head metrics recreation via phlarectx usage ;(

* try to pool newParquetProfileWriter

* Revert "try to pool newParquetProfileWriter"

This reverts commit d91e3f1.

* decrease tsdb index buffers

* decrease tsdb index buffers

* experiment: add query backend (#3404)

* Add query-backend component

* add generated code for connect

* add mergers

* block reader scratches

* block reader updates

* query frontend integration

* better query planning

* tests and fixes

* improve API

* profilestore: use single row group

* profile_store: pool profile parquet writer

* profiles parquet encoding: fix profile column count

* segments: rewrite shards to flush independently

* make fmt

* segments: flush heads concurrently

* segments: tmp debug log

* segments: change wait duration metric buckets

* add inmemory tsdb index writer

* rm debug log

* use inmemory index writer

* remove FileWriter from inmem index

* inmemory tsdb index writer: reuse buffers through pool

* inmemory tsdb index writer: preallocate initial buffers

* segment: concat files with preallocated buffers

* experiment: query backend block reader (#3423)

* simplify report merge handling

* implement query context and tenant service section

* implement LabelNames and LabelValues

* bind tsdb query api

* bind time series query api

* better caching

* bind stack trace api

* implement tree query

* fix offset shift

* update helm chart

* tune buffers

* add block object size attribute

* serve profile type query from metastore

* tune grpc server config

* segment: try to use memfs

* Revert "segment: try to use memfs"

This reverts commit 798bb9d.

* tune s3 http client

Before getting too deep with a custom TLS VerifyConnection function, it makes sense to ensure that we reuse connections as much as possible

* WIP: Compaction planning in metastore (#3427)

* First iteration of compaction planning in metastore

* Second iteration of compaction planning in metastore

* Add GetCompactionJobs back

* Create and persist jobs in the same transaction as blocks

* Add simple logging for compaction planning

* Fix bug

* Remove unused proto message

* Remove unused raft command type

* Add a simple config for compaction planning

* WIP (new architecture): Add compaction workers, Integrate with planner (#3430)

* Add compaction workers, Integrate with planner (wip)

* Fix test

* add compaction-worker service

* refactor compaction-worker out of metastore

* prevent bootstrapping a single node cluster on silent DNS failures

* Scope compaction planning to shard+tenant

* Improve state handling for compaction job status updates

* Remove import

* Reduce parquet buffer size for compaction workers

* Fix another case of compaction job state inconsistency

* refactor block reader out

* handle nil blocks more carefully

* add job priority queue with lease expiration and fencing token

* disable boltdb sync

We only use it to make snapshots

* extend raft handlers with the raft log command

* Add basic compaction metrics

* Improve job assignments and status update logic

* Remove segment truncation command

* Make compaction worker job capacity configurable

* Fix concurrent map access

* Fix metric names

* segment: change segment duration from 1s to 500ms

* update request_duration_seconds buckets

* update request_duration_seconds buckets

* add an explicit parameter that controls how many raft peers to expect

* fix the explicit parameter that controls how many raft peers to expect

* temporary revert temporary hack

I'm reverting it temporary to protect metastore from running out of memory

* add some more metrics

* add some pprof tags for easier visibility

* add some pprof tags for easier visibility

* add block merge draft

* add block merge draft

* update metrics buckets again

* update metrics buckets again

* Address minor consistency issues, improve naming, in-progress updates

* increase boltdb InitialMmapSize

* Improve metrics, logging

* Decrease buffer sizes further, remove completed jobs

* Scale up compaction workers and their concurrency

* experiment: implement shard-aware series distribution (#3436)

* tune boltdb snapshotting

- increase initial mmap size
- keep less entries in the raft log
- trigger more frequently

* compact job regardless of the block size

* ingester & distributor scaling

* update manifests

* metastore ready check

* make fmt

* Revert "make fmt"

This reverts commit 8a55391.

* Revert "metastore ready check"

This reverts commit 98b05da.

* experiment: streaming compaction (#3438)

* experiment: stream compaction

* fix stream compaction

* fix parquet footer optimization

* Persist compaction job pre-queue

* tune compaction-worker capacity

* Fix bug where compaction jobs with level 1 and above are not created

* Remove blocks older than 12 hours (instead of 30 minutes)

* Fix deadlock when restoring compaction jobs

* Add basic metrics for compaction workers

* Load compacted blocks in metastore on restore

* experimenting with object prefetch size

* experimenting with object prefetch

* trace read path

* metastore readycheck

* metastore readycheck

* metastore readycheck. trigger rollout

* metastore readycheck. trigger rollout

* segments, include block id in errors

* metastore: log addBlock error

* segments: maybe fix retries

* segments: maybe fix retries

* segments: more debug logging

* refactor query result aggregation

* segments: more debug logging

* segments: more debug logging

* tune resource requests

* tune compaction worker job capacity

* fix time series step unit

* Update golang version to 1.22.4

* enable grpc tracing for ingesters

* expose /debug/requests

* more debug logs

* reset state when restoring from snapshot

* Add debug logging

* Persist job raft log index and lease expiry after assignment

* Scale up a few components in the dev environment

* more debug logs

* metastore clinet: resolve addresses from endpointslice instead of dns

* Update frontend_profile_types.go

* metastore: add extra delay for readyness check

* metastore: add extra delay for readyness check

* metastore client: more debug log

* fix compaction

* stream statistics tracking: add HeavyKeeper implementation

* Bump compaction worker resources

* Bump compaction worker resources

* improve read path load distribution

* handle compaction synchronously

* revert CI/CD changes

* isolate experimental code

* rollback initialization changes

* rollback initialization changes

* isolate distributor changes

* isolate ingester changes

* cleanup experimental code

* remove large tsdb fixture copy

* update cmd tests

* revert Push API changes

* cleanup dependencies

* cleanup gitignore

* fix reviewdog

* go mod tidy

* make generate

* revert changes in tsdb/index.go

* Revert "revert changes in tsdb/index.go"

This reverts commit 2188cde.

---------

Co-authored-by: Aleksandar Petrov <[email protected]>
Co-authored-by: Tolya Korniltsev <[email protected]>
marcsanmi and others added 30 commits December 16, 2024 11:59
* WIP: test(examples): Introduce some basic testing for examples/

I am sure this is something that we all felt during this docs week: It
is quite hard keeping the examples working.

I spent abit of time autmating building and running the examples:

My observation was often problems happen already at built or statup
stage. So for now that will only cover problems during docker-compose
build and will only detect failures that occur during the first 5
seconds after docker-compose up.

* Add github action

* Hash project name to avoid collisions

* Github actions require docker compose subcommand

* Ignore stdout/stderr read errors

* Fix dotnet example context

* Run when manually triggered

* Run daily at 01:13:00 UTC
Bad link was replaced with reference to actual Pyroscope examples
* docs: add v11 release notes

* Update v1-11.md
* Update java span-profiles example with latest SDKs

* Bump java SDK version in a few more examples, improve docs further
* feat(v2): add metadata label query interface

* implement clients

* improve error handling

* implement metadata label query service

* implement metadata label querier

* test and fix metadata label querier

* refactor metadata label compaction

* introduce block/metadata package

* avoid sending stubs in GetTenantStats

* document metadata entry
* Restructure Pyroscope documentation and share content

* Fix typo

* Fix heading match for profile types

* Restructure for configure server

* Apply suggestions from code review

Co-authored-by: Taylor C <[email protected]>

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Aleksandar Petrov <[email protected]>

* Apply suggestions from code review

Co-authored-by: Aleksandar Petrov <[email protected]>

---------

Co-authored-by: Taylor C <[email protected]>
Co-authored-by: Aleksandar Petrov <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
⤵️ pull merge-conflict Resolve conflicts manually
Projects
None yet
Development

Successfully merging this pull request may close these issues.