...
All events are stored as CSV files in S3. They can be replayed to repopulate Datasets and Dimensions.
Datasets based on time dimensions are rolled up and bucketed using an OLAP extension in PSQL. This gives the best of both worlds: faster inserts than a dedicated OLAP store (particularly for data with older timestamps) and rolling aggregations on time dimensions that are nearly comparable to OLAP.
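The replay-and-rollup idea can be sketched in a few lines of Python. This is a minimal illustration, not the actual implementation: the CSV columns (`timestamp`, `name`, `value`), the hourly bucket size, and the function names `hour_bucket` and `replay` are all assumptions, and an in-memory string stands in for the CSV files in S3.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical event log; in the design these rows would live in CSV files in S3.
# The last row carries an older timestamp than the ones before it (a late event).
EVENT_LOG_CSV = """timestamp,name,value
2024-01-01T10:05:00+00:00,page_view,1
2024-01-01T10:40:00+00:00,page_view,1
2024-01-01T11:15:00+00:00,page_view,1
2024-01-01T09:59:00+00:00,page_view,1
"""


def hour_bucket(ts: str) -> datetime:
    """Truncate an ISO-8601 timestamp to the start of its hour (UTC)."""
    parsed = datetime.fromisoformat(ts).astimezone(timezone.utc)
    return parsed.replace(minute=0, second=0, microsecond=0)


def replay(csv_text: str) -> dict:
    """Rebuild the rolled-up dataset from scratch by replaying every event."""
    rollup = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row["name"], hour_bucket(row["timestamp"]))
        rollup[key] += int(row["value"])
    return dict(rollup)


dataset = replay(EVENT_LOG_CSV)
for (name, bucket), total in sorted(dataset.items()):
    print(name, bucket.isoformat(), total)
```

Because the rollup is derived only from the event log, running `replay` again yields the same dataset, and the late event simply lands in its own earlier bucket.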
Moving forward, the current collection service can be deprecated in favor of the OpenTelemetry Collector.
Assumptions
Dimension updates and deletions are not supported, since they are not a frequent use case. If one is needed, users can create a new Dimension, update the Dataset mappings to point to it, and refresh the affected Datasets.
Event Sourcing is used throughout the design.
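The workaround for changing a Dimension (create a new one, remap, refresh) follows naturally from Event Sourcing. The sketch below is only an illustration under assumed names: `Store`, `refresh`, `replace_dimension`, and the `region_v1`/`region_v2` data are hypothetical, and plain dictionaries stand in for the real tables.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    """Hypothetical in-memory stand-in for the Dimension and Dataset tables."""
    dimensions: dict = field(default_factory=dict)         # dimension name -> {key: label}
    dataset_dimension: dict = field(default_factory=dict)  # dataset name -> dimension name
    events: list = field(default_factory=list)             # append-only (key, value) log
    datasets: dict = field(default_factory=dict)           # dataset name -> {label: total}

    def refresh(self, dataset: str) -> None:
        """Rebuild one dataset by replaying the event log against its mapped dimension."""
        lookup = self.dimensions[self.dataset_dimension[dataset]]
        totals = {}
        for key, value in self.events:
            label = lookup.get(key, "unknown")
            totals[label] = totals.get(label, 0) + value
        self.datasets[dataset] = totals


def replace_dimension(store: Store, dataset: str, new_name: str, new_rows: dict) -> None:
    """Create a new dimension, repoint the dataset at it, then refresh the dataset."""
    store.dimensions[new_name] = dict(new_rows)  # existing dimensions are never edited
    store.dataset_dimension[dataset] = new_name
    store.refresh(dataset)


store = Store(
    dimensions={"region_v1": {"s1": "North", "s2": "South"}},
    dataset_dimension={"sales_by_region": "region_v1"},
    events=[("s1", 10), ("s2", 5), ("s1", 2)],
)
store.refresh("sales_by_region")
replace_dimension(store, "sales_by_region", "region_v2", {"s1": "North", "s2": "East"})
print(store.datasets["sales_by_region"])
```

The old dimension stays untouched, so the change is reversible by pointing the mapping back and refreshing again.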
Next Steps
SDKs can be added to push events to OpenTelemetry Receivers.
The current collection service can be deprecated in favor of OpenTelemetry Receivers.
Transformers can be moved to an OpenTelemetry Transformer.
The collector can send data to any destination using official plugins.