Measuring Every Thing at Scale: An Introduction to Time Series with M3
Today, almost everything we do, everything we use, and even everything around us is capable of producing data. More importantly, this data is produced in real time to describe something that is happening.
It is therefore logical that data must also be harnessed in real time to extract the most value from it. In addition, and perhaps most importantly, data must be stored and processed with its temporal context to retain its full significance. This is the condition for fully understanding the context in which something exists or occurred.
So, let’s take some real-life examples where the temporal context (i.e., time) is an essential part of the meaning of your data:
- Recording sports performance metrics (e.g., speed, position, heart rate) during a sporting activity through a connected watch.
- Measuring atmospheric conditions to provide data for weather forecasts (wind speed, temperature, atmospheric pressure, etc.).
- Monitoring a server's system resource usage.
- Monitoring a home's energy consumption.
- Monitoring stock prices, etc.
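All of these examples have one thing in common: they are about data that we want to measure over time in order to monitor its evolution, detect or predict trends (possibly in correlation with other events), or alert on thresholds. We usually refer to such data as time series.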
The explosion of the IoT (Internet of Things) in recent years has greatly accelerated the need to be able to efficiently store and analyze this data, which most often means millions of new metrics produced every second.
What is a time series and what is a time-series database (TSDB)?

Time series are sequences of numeric data points generated in successive order. Each data point represents a measurement (also called a metric). Each metric has a name, a timestamp, and usually one or more labels that describe the actual object being measured.
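To make this definition concrete, here is a minimal Python sketch of a single data point and of a series built from it. The class name `DataPoint`, its field names, and the sample values are illustrative assumptions, not part of any particular TSDB.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class DataPoint:
    """One measurement: a metric name, a timestamp, a value, and labels."""
    name: str
    timestamp: datetime
    value: float
    labels: dict = field(default_factory=dict)


# A hypothetical heart-rate sample reported by a connected watch.
point = DataPoint(
    name="heart_rate_bpm",
    timestamp=datetime(2020, 9, 1, 12, 0, 0, tzinfo=timezone.utc),
    value=72.0,
    labels={"device": "watch-1", "user": "alice"},
)

# A time series is the list of points sharing a name and labels, ordered by time.
series = sorted([point], key=lambda p: p.timestamp)
```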
To store such data, we could perfectly well use a traditional relational database (such as PostgreSQL) and create a simple SQL table like this:
CREATE TABLE timeseries (
metric_name TEXT NOT NULL,
metric_ts timestamptz NOT NULL DEFAULT CURRENT_TIMESTAMP,
value double precision NOT NULL,
labels json,
PRIMARY KEY(metric_name, metric_ts)
);
Then, for example, to aggregate every point recorded over the last 10 minutes, we could use a SQL query similar to:
SELECT avg(value) FROM timeseries WHERE metric_name = 'heart_rate_bpm' AND metric_ts >= NOW() - INTERVAL '10 minutes';
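For readers who want to try the idea without a PostgreSQL server, the sketch below reproduces the same table and aggregate with Python's built-in `sqlite3` module. It is an approximation under stated assumptions: SQLite has no `timestamptz` or `INTERVAL`, so timestamps are stored as ISO-8601 text and the 10-minute cutoff is computed in Python. The sample values are made up.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE timeseries (
        metric_name TEXT NOT NULL,
        metric_ts   TEXT NOT NULL,   -- ISO-8601 UTC timestamp (assumption)
        value       REAL NOT NULL,
        labels      TEXT,            -- JSON stored as text
        PRIMARY KEY (metric_name, metric_ts)
    )
    """
)

# Fixed "now" so the example is reproducible.
now = datetime(2020, 9, 1, 12, 0, 0, tzinfo=timezone.utc)
samples = [(15, 60.0), (8, 70.0), (2, 80.0)]  # (minutes ago, bpm)
conn.executemany(
    "INSERT INTO timeseries (metric_name, metric_ts, value) VALUES (?, ?, ?)",
    [
        ("heart_rate_bpm", (now - timedelta(minutes=m)).isoformat(), bpm)
        for m, bpm in samples
    ],
)

# Average of the points recorded during the last 10 minutes.
cutoff = (now - timedelta(minutes=10)).isoformat()
(avg_bpm,) = conn.execute(
    "SELECT avg(value) FROM timeseries WHERE metric_name = ? AND metric_ts >= ?",
    ("heart_rate_bpm", cutoff),
).fetchone()
print(avg_bpm)  # 75.0
```

Only the two samples newer than the cutoff contribute to the average, which is the same aggregate a relational database would return for the query above.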
However, this solution would not be very effective for data-intensive applications and long-term use. Sooner or later we would probably be limited by:
- Horizontal scalability, whether for long-term storage, resiliency, or multi-region deployment needs.
- The ability to insert millions of metrics per second (most relational databases are based on B-tree index structures).
- The ability to automatically roll up data over time (for example, to aggregate all metrics from the previous month into 5-minute points).
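In addition, hotspots may appear when inserting measurements at very high throughput. Depending on the type of index used by the database, this can lead to performance degradation due to concurrent access.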
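The roll-up mentioned in the list above can be sketched in a few lines of Python. This is only an illustration of the idea (averaging raw points into fixed 5-minute buckets); the function name `roll_up` and the use of the mean as the aggregate are assumptions, and a real TSDB would typically apply such rules automatically.

```python
from collections import defaultdict
from statistics import mean


def roll_up(points, bucket_seconds=300):
    """Average (unix_timestamp, value) pairs into fixed-size time buckets."""
    buckets = defaultdict(list)
    for ts, value in points:
        bucket_start = ts - (ts % bucket_seconds)  # align to bucket boundary
        buckets[bucket_start].append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Five raw points, timestamps in seconds, rolled up into two 5-minute points.
raw = [(0, 60.0), (120, 62.0), (299, 64.0), (300, 70.0), (450, 74.0)]
rolled = roll_up(raw)
print(rolled)  # [(0, 62.0), (300, 72.0)]
```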
For all of these reasons, it's usually preferable to use solutions specifically designed for efficient storage and querying of this kind of data. These solutions are called time-series databases (TSDBs).
Below are some of the best-known TSDBs:
- InfluxDB
- TimescaleDB
- OpenTSDB
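Finally, there are other very popular solutions, such as Prometheus and Graphite, that are sometimes (perhaps wrongly) compared to TSDBs because they can store time series. They are actually monitoring systems that use TSDB-like functionality to store metrics.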
In this article, we will focus on a more recent solution: M3DB, a distributed time-series platform.
M3, a distributed time-series database

M3 is a distributed time-series platform that was developed by Uber to meet its growing storage and access needs for the trillions of metrics that the platform generates every day around the world.
The M3 platform has been available as open source on GitHub, under the Apache 2.0 license, since 2018. It is developed entirely in Go and is designed to scale horizontally in order to support both high-throughput writes and low-latency queries.
The M3 platform provides key features that make it a complete and robust solution for storing and processing time-series data:
- Cluster Management: M3 is built on top of etcd to provide support for handling multiple clusters out of the box.
- Built-in replication: Time-series data points are replicated across nodes with tunable configuration to achieve the desired balance between performance, availability, durability, and consistency.
- Highly Compressed: M3 provides an efficient compression algorithm inspired by Gorilla TSZ.
- Configurable Consistency: M3 supports different consistency levels for both write and read requests (e.g., One, Quorum, All).
- Out-of-order writes: M3 can seamlessly handle out-of-order writes for a configurable period.
- Seamless Prometheus Integration: M3 has built-in support for PromQL and can be used as long-term storage for Prometheus.
All of these features are provided by the different components that make up the M3 platform: M3DB, M3 Coordinator, M3 Queries, and M3 Aggregator.
Now, let us take a closer look at these four components.
M3 Components Overview

M3DB

M3DB is the actual distributed time-series database. It provides durable and scalable storage as well as reverse indexes for time series.
M3DB relies on etcd for cluster management and provides synchronous replication with configurable durability and read consistency (one, majority, all, etc.).
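To illustrate what a consistency level means in practice, the sketch below computes how many replicas must acknowledge a request before it is considered successful. It is a generic model of quorum-style replication, not M3DB's implementation; the function name `required_acks` is hypothetical and only the three levels named above are handled.

```python
def required_acks(level: str, replication_factor: int) -> int:
    """Number of replica acknowledgements needed for a given consistency level."""
    level = level.lower()
    if level == "one":
        return 1
    if level == "majority":
        return replication_factor // 2 + 1
    if level == "all":
        return replication_factor
    raise ValueError(f"unknown consistency level: {level}")


# With three replicas, a majority write succeeds once two replicas acknowledge,
# so the cluster tolerates the loss of one replica.
for level in ("one", "majority", "all"):
    print(level, required_acks(level, replication_factor=3))
```

Lower levels favor latency and availability, while higher levels favor durability and consistency, which is the trade-off the configuration exposes.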
M3 Coordinator

M3 Coordinator is the service of the M3 platform that coordinates reads and writes between upstream systems and M3DB. For example, it can act as a bridge with Prometheus (or other systems such as Graphite). In addition, M3 Coordinator is used as a global service to configure the other components of the platform.
Using M3DB as a Prometheus Long-term Storage

Prometheus is a very popular monitoring system that has quickly become the de facto solution for monitoring infrastructure and cloud-native applications (in particular, those running in Kubernetes). A key benefit of Prometheus is its ease of use and operability in production. This can be explained, among other things, by the fact that each Prometheus instance operates independently of the others and relies only on its local storage to guarantee the durability of its data.
But this simplicity is also the source of its limitations: Prometheus was not designed to be a durable long-term data store that supports analytical queries on historical data. Additionally, it cannot be scaled horizontally without third-party solutions (e.g., Thanos, Cortex).
So, M3DB can be used as a remote, multi-tenant and scalable data storage for Prometheus.
M3 Queries

M3 Queries is the M3 service dedicated to exposing the metrics and metadata of the time series stored in M3DB. It allows distributed query execution on an M3 cluster to query both real-time and historical metrics for analytical purposes. To this end, M3 Queries offers two query engines: Prometheus/PromQL (default) and M3.
The fact that M3 Queries supports PromQL by default is a huge advantage. Indeed, the HTTP API is compatible with the Prometheus plugin of Grafana. It is therefore easy to switch all or part of a Grafana monitoring setup to M3 without having to rewrite the queries of its dashboards.
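Because the API follows the Prometheus HTTP API, a range query can be expressed as a plain URL. The sketch below only builds that URL and does not send it. The `/api/v1/query_range` path and its `query`, `start`, `end`, and `step` parameters come from the Prometheus HTTP API; the base address `http://localhost:7201` and the PromQL expression are assumptions for illustration and should be adapted to the actual deployment.

```python
from urllib.parse import urlencode


def query_range_url(base_url, promql, start, end, step):
    """Build a Prometheus-compatible range-query URL."""
    params = urlencode({"query": promql, "start": start, "end": end, "step": step})
    return f"{base_url}/api/v1/query_range?{params}"


# Hypothetical endpoint and metric name; timestamps are Unix seconds.
url = query_range_url(
    "http://localhost:7201",
    "avg(heart_rate_bpm)",
    start=1598961600,
    end=1598962200,
    step="15s",
)
print(url)
```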
M3 Aggregator

M3 Aggregator is the last of the services that make up the M3 platform. Its role is to aggregate metrics for downsampling purposes, following rules stored in etcd, before they are stored in M3DB.
The diagram below illustrates how M3 can be used to federate multiple Prometheus instances :
[Figure: M3 Platform as a Prometheus long-term storage]

M3DB Architecture Overview

An M3DB cluster is composed of two types of nodes: StorageNode and SeedNode.
- StorageNode runs the m3dbnode process that stores time-series and serves both write and read queries.
- SeedNode is similar to the StorageNode but also runs an embedded etcd server to manage the cluster configuration.
In general, very large deployments use a dedicated etcd cluster, so only M3DB storage nodes are deployed.
Then, in addition to these two types of nodes, we will also have multiple dedicated nodes to run M3 Coordinator and M3 Queries.
The following diagram illustrates the different types of nodes.
[Figure: M3DB cluster deployment overview]

Now, let's take a closer look at how a StorageNode and the M3DB storage engine work.
The internal architecture of a node is made of two distinct parts: an in-memory model and persistent storage.
First, the in-memory model is designed as a hierarchy of objects in which each node contains a single database that owns one or more namespaces. Then, locally on each node, a namespace owns multiple shards, each of which in turn owns multiple series. Finally, each series owns a buffer and multiple cached blocks.
Database > Namespaces > Shards > Series > (Buffer, Cached Blocks)
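This hierarchy can be pictured as nested containers. The Python sketch below mirrors it with plain dataclasses; the class and field names are illustrative assumptions and do not correspond to M3DB's actual Go types.

```python
from dataclasses import dataclass, field


@dataclass
class Series:
    series_id: str
    buffer: list = field(default_factory=list)         # recent points not yet flushed
    cached_blocks: dict = field(default_factory=dict)  # block start -> compressed bytes


@dataclass
class Shard:
    shard_id: int
    series: dict = field(default_factory=dict)         # series id -> Series


@dataclass
class Namespace:
    name: str
    shards: dict = field(default_factory=dict)         # shard id -> Shard


@dataclass
class Database:
    namespaces: dict = field(default_factory=dict)     # namespace name -> Namespace


# Walk down the hierarchy to record one hypothetical data point.
db = Database()
ns = db.namespaces.setdefault("default", Namespace("default"))
shard = ns.shards.setdefault(42, Shard(42))
series_id = "heart_rate_bpm{user=alice}"
series = shard.series.setdefault(series_id, Series(series_id))
series.buffer.append((1598961600, 72.0))
```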
Secondly, to implement persistent storage, an M3DB instance uses a commit log, which ensures data consistency after recovery from a node failure, together with multiple FileSet files, which efficiently store time series, reverse indexes, and metadata.
The following diagram summarizes these concepts:
[Figure: M3DB architecture overview: memory model and persistent storage]

Now, let's describe the role of each of these elements.
Namespace

A namespace has a unique name and a distinct set of configuration options (e.g., retention, block size).
Shard

Shards distribute the time series evenly across all the nodes. By default, 4096 virtual shards are configured. Shards are replicated, and the assignment of shards to nodes is stored in etcd.
Series

A series is a sequence of data points. Each series is associated with an ID that is hashed using the murmur3 algorithm. The hash is then used to determine the target shard that owns the series.
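The mapping from a series ID to a shard can be sketched as "hash the ID, then take the result modulo the number of shards". The code below is a pure-Python implementation of the 32-bit MurmurHash3 variant followed by that modulo. Which murmur3 variant and seed M3DB uses, and the exact form of the series ID, are assumptions here, so the shard numbers it produces are illustrative only.

```python
NUM_SHARDS = 4096  # default number of virtual shards mentioned above


def _rotl32(x: int, r: int) -> int:
    return ((x << r) | (x >> (32 - r))) & 0xFFFFFFFF


def murmur3_32(data: bytes, seed: int = 0) -> int:
    """32-bit MurmurHash3 (x86 variant)."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed & 0xFFFFFFFF
    length = len(data)
    body_end = length - (length % 4)

    # Mix the input four bytes at a time.
    for i in range(0, body_end, 4):
        k = int.from_bytes(data[i:i + 4], "little")
        k = (k * c1) & 0xFFFFFFFF
        k = _rotl32(k, 15)
        k = (k * c2) & 0xFFFFFFFF
        h ^= k
        h = _rotl32(h, 13)
        h = (h * 5 + 0xE6546B64) & 0xFFFFFFFF

    # Mix the remaining one to three bytes, if any.
    tail = data[body_end:]
    if tail:
        k = int.from_bytes(tail, "little")
        k = (k * c1) & 0xFFFFFFFF
        k = _rotl32(k, 15)
        k = (k * c2) & 0xFFFFFFFF
        h ^= k

    # Final avalanche.
    h ^= length
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & 0xFFFFFFFF
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & 0xFFFFFFFF
    h ^= h >> 16
    return h


def shard_for(series_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Pick the shard that owns a series (illustrative, not M3DB's exact code)."""
    return murmur3_32(series_id.encode("utf-8")) % num_shards


shard = shard_for("heart_rate_bpm{user=alice}")
```

Because the hash is deterministic, every node computes the same shard for a given series ID without any coordination.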
Buffer

A buffer contains all data points that have not yet been written to disk (i.e., new writes) as well as some data loaded during the bootstrapping of the node. A buffer creates a block for new writes, which is later flushed to disk depending on the configured block size.
Block

A block contains compressed time-series data. A block is cached after a read request. A block has a fixed size that is configured when the namespace is created. For example, the size of a block can be expressed in hours or days (e.g., 2d).
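To give an intuition for why time-series data compresses well, the sketch below shows delta-of-delta encoding of timestamps, the idea used by Gorilla-style compression for regularly spaced points. It is a simplified illustration and not M3DB's actual encoder: it works on integer lists rather than on a bit stream, and the function names are hypothetical.

```python
def delta_of_delta_encode(timestamps):
    """Encode as [first timestamp, then the change in delta for each next point]."""
    if not timestamps:
        return []
    encoded = [timestamps[0]]
    prev_delta = 0
    for prev, cur in zip(timestamps, timestamps[1:]):
        delta = cur - prev
        encoded.append(delta - prev_delta)
        prev_delta = delta
    return encoded


def delta_of_delta_decode(encoded):
    """Inverse of delta_of_delta_encode."""
    if not encoded:
        return []
    timestamps = [encoded[0]]
    delta = 0
    for dod in encoded[1:]:
        delta += dod
        timestamps.append(timestamps[-1] + delta)
    return timestamps


# Points scraped roughly every 10 seconds: most encoded values are tiny.
ts = [1598961600, 1598961610, 1598961620, 1598961631, 1598961641]
encoded = delta_of_delta_encode(ts)
print(encoded)  # [1598961600, 10, 0, 1, -1]
```

Small values such as 0, 1, and -1 can be stored in very few bits, which is where the space saving comes from.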
Commit Log

The commit log is an append-only structure (equivalent to the write-ahead log or binary log in other databases) in which every data point is written sequentially and uncompressed. The M3DB node periodically runs a snapshotting process that compacts this file. The commit log is used for disaster recovery and can be configured to fsync on every write.
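The principle of an append-only log with an optional fsync per write can be sketched as follows. This is a toy model under stated assumptions: the record layout (an int64 timestamp and a float64 value, with no series ID), the class name `CommitLog`, and the `replay` helper are all hypothetical and much simpler than M3DB's real on-disk format.

```python
import os
import struct
import tempfile

RECORD = struct.Struct("<qd")  # int64 timestamp + float64 value, uncompressed


class CommitLog:
    """Toy append-only log: every write goes to the end of the file."""

    def __init__(self, path, fsync_every_write=False):
        self._file = open(path, "ab")
        self._fsync_every_write = fsync_every_write

    def append(self, timestamp, value):
        self._file.write(RECORD.pack(timestamp, value))
        self._file.flush()
        if self._fsync_every_write:
            os.fsync(self._file.fileno())  # force the write down to the disk

    def close(self):
        self._file.close()


def replay(path):
    """Read every record back in write order, as a recovery process would."""
    with open(path, "rb") as f:
        data = f.read()
    return [RECORD.unpack_from(data, off) for off in range(0, len(data), RECORD.size)]


path = os.path.join(tempfile.mkdtemp(), "commitlog.bin")
log = CommitLog(path, fsync_every_write=True)
log.append(1598961600, 72.0)
log.append(1598961610, 73.5)
log.close()

recovered = replay(path)
print(recovered)  # [(1598961600, 72.0), (1598961610, 73.5)]
```

Calling fsync on every write trades throughput for durability, which is why it is a configuration choice rather than a fixed behavior.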
FileSet Files

FileSet files are the primary unit of long-term storage for M3DB. A FileSet is used to store compressed streams of time-series values for a specific shard/block.
A fileset includes all of these files:
- Info file: Stores metadata about the fileset volume, such as the block time window start and size.
- Summaries file: Stores a subset of the index file that is kept in memory, so that a series can be located in the index file with a short linear scan.
- Index file: Stores the series metadata needed to locate the compressed time-series data stream in the data file.
- Data file: Stores the compressed streams of time-series values.
- Bloom filter file: Stores a bloom filter bitset for all series contained in a fileset. The bloom filter is used in the read path to determine whether or not a series exists on disk.
- Digests file: Stores the digest checksums of all the files in the fileset volume for integrity verification.
- Checkpoint file: Stores a digest of the digests file to allow quickly checking whether a volume was completed.
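As an illustration of the role the bloom filter plays in the read path, here is a minimal Python bloom filter. It is a generic sketch, not M3DB's implementation: the bit-array size, the number of hash functions, and the use of SHA-256 to derive bit positions are arbitrary assumptions chosen for readability.

```python
import hashlib


class BloomFilter:
    """Probabilistic set: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: bytes):
        # Derive several bit positions by salting the key with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: bytes):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: bytes) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key)
        )


# Series IDs written to a hypothetical fileset.
on_disk = BloomFilter()
on_disk.add(b"heart_rate_bpm{user=alice}")

# A "False" answer is definitive, so the data file does not need to be read.
print(on_disk.might_contain(b"heart_rate_bpm{user=alice}"))  # True
```

A positive answer only means the series may be present, so the index file is still consulted in that case.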
To get started with M3, the best way is to follow the "How-To" pages of the official documentation: https://docs.m3db.io/how_to/single_node/
The documentation also references multiple videos that introduce M3 and the motivations that led to its development at Uber: https://docs.m3db.io/overview/media/
Conclusion

M3 is a relatively new solution that offers a simple and efficient architecture with a design similar to Apache Cassandra. M3 integrates seamlessly with existing monitoring solutions such as Prometheus and Grafana to provide scalable and durable data storage. Finally, M3 can be used both for low-latency queries and for analytical queries on historical data.
About Us

StreamThoughts is a French IT consulting company specialized in event streaming technologies and data engineering, founded in 2020 by a group of technical experts. Our mission is to help our customers derive value from their data as real-time event streams through our expertise, solutions, and partners.
Translated from: https://medium.com/streamthoughts/measuring-every-thing-at-scale-an-introduction-to-time-series-with-m3-a1e8d81465c