Storage Format
Document overview (0.9.0)
Data in Druid is stored in a custom column format known as a segment. Segments are composed of different types of columns. Column.java
and the classes that extend it are a good place to start looking into the storage format.
Basic classes
ValueType
An enum with four possible values:
- Float
- Long
- String
- Complex
IndexedInts
It has three main methods:
int size();
int get(int index);
void fill(int index, int[] toFill);
The main implementations are:
- EmptyIndexedInts
- IntBufferIndexedInts
- ListBasedIndexedInts
- VSizeIndexedInts
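size() is the number of elements still available to read or write in the underlying buffer; get(index) reads the element at position index in that buffer; fill() is intended to bulk-copy data into the supplied toFill array, but none of the implementations currently support it. ListBasedIndexedInts stores its values in a List. Evidently, some of the implementations use Java NIO to work with native memory.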
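To make the difference between the heap-backed and the NIO-backed variants concrete, here is a minimal sketch. SketchIndexedInts, ListBackedInts, BufferBackedInts and IndexedIntsDemo are hypothetical names that only mirror the three methods listed above; they are not Druid's classes, and leaving fill() unsupported simply follows the description in this post.

```java
import java.nio.ByteBuffer;
import java.nio.IntBuffer;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the three-method interface described above.
interface SketchIndexedInts {
    int size();
    int get(int index);
    void fill(int index, int[] toFill);
}

// Heap-backed variant: values live in an ordinary List.
final class ListBackedInts implements SketchIndexedInts {
    private final List<Integer> values;

    ListBackedInts(List<Integer> values) {
        this.values = values;
    }

    @Override public int size() { return values.size(); }
    @Override public int get(int index) { return values.get(index); }
    @Override public void fill(int index, int[] toFill) {
        throw new UnsupportedOperationException("fill not supported");
    }
}

// NIO-backed variant: values live in an IntBuffer, which may be a view over direct memory.
final class BufferBackedInts implements SketchIndexedInts {
    private final IntBuffer buffer;

    BufferBackedInts(IntBuffer buffer) {
        this.buffer = buffer;
    }

    // Number of ints between the buffer's position and its limit.
    @Override public int size() { return buffer.remaining(); }
    // Absolute read, so the buffer's position is left unchanged.
    @Override public int get(int index) { return buffer.get(buffer.position() + index); }
    @Override public void fill(int index, int[] toFill) {
        throw new UnsupportedOperationException("fill not supported");
    }
}

final class IndexedIntsDemo {
    // Copies the values into a direct buffer, i.e. memory outside the Java heap.
    static SketchIndexedInts direct(int... values) {
        IntBuffer buf = ByteBuffer.allocateDirect(values.length * Integer.BYTES).asIntBuffer();
        buf.put(values);
        buf.flip();
        return new BufferBackedInts(buf);
    }

    public static void main(String[] args) {
        SketchIndexedInts onHeap = new ListBackedInts(Arrays.asList(3, 1, 4));
        SketchIndexedInts offHeap = direct(3, 1, 4);
        System.out.println(onHeap.get(2) + " " + offHeap.get(2) + " " + offHeap.size());
    }
}
```

Both variants answer size() and get() the same way from the caller's point of view; only where the ints are stored differs.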
ColumnCapabilities
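Fields:
private ValueType type = null;
private boolean dictionaryEncoded = false; // whether the column is dictionary-encoded
private boolean runLengthEncoded = false; // whether the column is run-length encoded; run-length encoding is only a placeholder and can be ignored
private boolean hasInvertedIndexes = false; // whether the column has inverted indexes
private boolean hasSpatialIndexes = false; // whether the column has spatial indexes
private boolean hasMultipleValues = false; // whether the column has multiple values per row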
DictionaryEncodedColumn
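Main methods:
public int length(); // total length of the dictionary-encoded column
public boolean hasMultipleValues(); // whether the column has multi-value rows
public int getSingleValueRow(int rowNum); // get the single value (dictionary id) of a row
public IndexedInts getMultiValueRow(int rowNum); // get the multiple values (dictionary ids) of a row
public String lookupName(int id); // look up the value for an id; note that null and the empty string are both converted to null
public int lookupId(String name); // reverse lookup: get the id for a value
public int getCardinality(); // get the cardinality, i.e. the dictionary length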
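The methods above fit together as follows: each row stores a small integer id, and the dictionary maps ids back to strings. A minimal single-value sketch, assuming a sorted dictionary of distinct non-null values so that lookupId can use binary search; SketchDictionaryColumn is a hypothetical name and this is not the SimpleDictionaryEncodedColumn source.

```java
import java.util.Arrays;

// Hypothetical single-value dictionary-encoded column (assumes non-null input values).
final class SketchDictionaryColumn {
    private final String[] dictionary; // sorted, distinct values
    private final int[] rows;          // one dictionary id per row

    SketchDictionaryColumn(String[] rawValues) {
        // Build the sorted dictionary of distinct values.
        this.dictionary = Arrays.stream(rawValues).distinct().sorted().toArray(String[]::new);
        // Replace every row value with its position in the dictionary.
        this.rows = new int[rawValues.length];
        for (int i = 0; i < rawValues.length; i++) {
            rows[i] = Arrays.binarySearch(dictionary, rawValues[i]);
        }
    }

    int length() { return rows.length; }

    int getSingleValueRow(int rowNum) { return rows[rowNum]; }

    String lookupName(int id) { return dictionary[id]; }

    // Returns a negative number when the value is not in the dictionary.
    int lookupId(String name) { return Arrays.binarySearch(dictionary, name); }

    int getCardinality() { return dictionary.length; }

    public static void main(String[] args) {
        SketchDictionaryColumn col =
            new SketchDictionaryColumn(new String[] {"us", "cn", "us", "de"});
        // Row 3 holds the id of "de"; lookupName turns the id back into the string.
        System.out.println(col.lookupName(col.getSingleValueRow(3)));
        System.out.println(col.getCardinality());
    }
}
```

With four rows and three distinct values, the column stores four ints plus a three-entry dictionary, which is where the space saving comes from once values repeat often.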
The only implementation is SimpleDictionaryEncodedColumn, which has three fields:
private final IndexedInts column;
private final IndexedMultivalue multiValueColumn;
private final CachingIndexed cachedLookups;
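The interesting one is cachedLookups, which stores the dictionary. CachingIndexed is the concrete dictionary implementation and implements the Indexed interface. The other main implementations of Indexed are:
- GenericIndexed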
- ArrayIndexed
- BufferIndexed
- ListIndexed
- VSizeIndexed
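CachingIndexed, named above as the dictionary implementation, puts a cache in front of dictionary lookups. One way such a wrapper could be built in plain Java is sketched here, with a LinkedHashMap in access order acting as a small LRU cache. SketchCachingLookup, its misses counter and the entry-count limit are all assumptions made for illustration; this is not Druid's code.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical caching wrapper around an index-based lookup.
final class SketchCachingLookup {
    private final List<String> delegate;
    private final Map<Integer, String> cache;
    int misses = 0; // counts reads that had to go to the delegate

    SketchCachingLookup(List<String> delegate, final int maxEntries) {
        this.delegate = delegate;
        // accessOrder = true keeps entries ordered from least to most recently used.
        this.cache = new LinkedHashMap<Integer, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, String> eldest) {
                // Evict the least recently used entry once the limit is exceeded.
                return size() > maxEntries;
            }
        };
    }

    String get(int index) {
        String cached = cache.get(index);
        if (cached != null) {
            return cached;
        }
        misses++;
        String value = delegate.get(index);
        if (value != null) {
            cache.put(index, value);
        }
        return value;
    }

    public static void main(String[] args) {
        SketchCachingLookup lookup = new SketchCachingLookup(Arrays.asList("a", "b", "c"), 2);
        lookup.get(0);
        lookup.get(0); // served from the cache
        System.out.println(lookup.misses);
    }
}
```

The limit here is a number of entries purely to keep the sketch short; a real cache may well bound its size differently.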
CachingIndexed wraps a given GenericIndexed and uses an LRU map, SizedLRUMap, to store cachedValues.
GenericIndexed
A generic, flat storage mechanism. Use static methods fromArray() or fromIterable() to construct. If input is sorted, supports binary search index lookups. If input is not sorted, only supports array-like index lookups.
V1 Storage Format:
- byte 1: version (0x1)
- byte 2 == 0x1 => allowReverseLookup
- bytes 3-6 => numBytesUsed
- bytes 7-10 => numElements
- bytes 10-((numElements * 4) + 10): integers representing 'end' offsets of byte serialized values
- bytes ((numElements * 4) + 10)-(numBytesUsed + 2): 4-byte integer representing length of value, followed by bytes for value
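The layout above can be exercised with a small round trip over a ByteBuffer. This is a sketch under stated assumptions, not the GenericIndexed source: SketchV1Layout is a hypothetical name, values are UTF-8 strings, numBytesUsed is taken to count every byte after the numBytesUsed field itself, and each end offset is taken to be relative to the start of the value region and to include the 4-byte length prefixes. The byte positions in the list are 1-based, so the code uses the matching 0-based indexes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class SketchV1Layout {
    // Serializes the strings following the V1 layout described above.
    static ByteBuffer write(String[] values, boolean allowReverseLookup) {
        byte[][] encoded = new byte[values.length][];
        int valueBytes = 0;
        for (int i = 0; i < values.length; i++) {
            encoded[i] = values[i].getBytes(StandardCharsets.UTF_8);
            valueBytes += Integer.BYTES + encoded[i].length; // length prefix + payload
        }
        // Assumption: numElements field + end offsets + value region.
        int numBytesUsed = Integer.BYTES + values.length * Integer.BYTES + valueBytes;

        ByteBuffer out = ByteBuffer.allocate(2 + Integer.BYTES + numBytesUsed);
        out.put((byte) 0x1);                              // byte 1: version
        out.put((byte) (allowReverseLookup ? 0x1 : 0x0)); // byte 2: allowReverseLookup
        out.putInt(numBytesUsed);                         // bytes 3-6
        out.putInt(values.length);                        // bytes 7-10: numElements
        int end = 0;
        for (byte[] e : encoded) {                        // 'end' offset of each value
            end += Integer.BYTES + e.length;
            out.putInt(end);
        }
        for (byte[] e : encoded) {                        // length prefix, then value bytes
            out.putInt(e.length);
            out.put(e);
        }
        out.flip();
        return out;
    }

    // Reads one element with absolute gets only, so the buffer's position is untouched.
    static String read(ByteBuffer buf, int index) {
        int numElements = buf.getInt(6);
        if (index < 0 || index >= numElements) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        int offsetsStart = 10;
        int valuesStart = offsetsStart + numElements * Integer.BYTES;
        // A value starts where the previous one ended; the first starts at 0.
        int start = index == 0 ? 0 : buf.getInt(offsetsStart + (index - 1) * Integer.BYTES);
        int length = buf.getInt(valuesStart + start);
        byte[] bytes = new byte[length];
        for (int i = 0; i < length; i++) {
            bytes[i] = buf.get(valuesStart + start + Integer.BYTES + i);
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        ByteBuffer buf = write(new String[] {"a", "bc", "def"}, true);
        System.out.println(read(buf, 1));
    }
}
```

Because every value's end offset is stored up front, reading element i takes a constant number of buffer reads, which is what makes array-like lookups (and binary search over sorted input) cheap.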
private final ByteBuffer theBuffer; // the backing ByteBuffer storage
private final ObjectStrategy strategy;
private final boolean allowReverseLookup;
private final int size; // the int read from theBuffer at its current position
private final int valuesOffset;
private final BufferIndexed bufferIndexed; // an inner class, BufferIndexed
Column class: an interface; see its implementation for details.
SimpleColumn class
Fields:
private final ColumnCapabilities capabilities;
private final Supplier<DictionaryEncodedColumn> dictionaryEncodedColumn;
private final Supplier<RunLengthColumn> runLengthColumn;
private final Supplier<GenericColumn> genericColumn;
private final Supplier<ComplexColumn> complexColumn;
private final Supplier<BitmapIndex> bitmapIndex;
private final Supplier<SpatialIndex> spatialIndex;
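Holding each part behind a Supplier means a part does not have to be built until someone asks for it. A minimal sketch of that idea follows, using java.util.function.Supplier as a stand-in (the listing above does not say which Supplier type is used). SketchLazy and SketchColumn are hypothetical, the column is reduced to two parts, and returning null for an absent part is an assumption.

```java
import java.util.function.Supplier;

// Hypothetical memoizing supplier: builds its value on the first get() and reuses it afterwards.
final class SketchLazy<T> implements Supplier<T> {
    private final Supplier<T> delegate;
    private T value;
    private boolean loaded = false;
    int loads = 0; // how many times the delegate was actually invoked

    SketchLazy(Supplier<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public synchronized T get() {
        if (!loaded) {
            value = delegate.get();
            loaded = true;
            loads++;
        }
        return value;
    }
}

// Hypothetical column holder with the same shape as the field list above, cut down to two parts.
final class SketchColumn {
    private final boolean dictionaryEncoded;     // stand-in for ColumnCapabilities
    private final Supplier<String[]> dictionary; // stand-in for the dictionary-encoded part
    private final Supplier<long[]> generic;      // stand-in for the generic part

    SketchColumn(boolean dictionaryEncoded, Supplier<String[]> dictionary, Supplier<long[]> generic) {
        this.dictionaryEncoded = dictionaryEncoded;
        this.dictionary = dictionary;
        this.generic = generic;
    }

    boolean isDictionaryEncoded() { return dictionaryEncoded; }

    // Assumption: a part that was never supplied is reported as null.
    String[] getDictionary() { return dictionary == null ? null : dictionary.get(); }

    long[] getGeneric() { return generic == null ? null : generic.get(); }

    public static void main(String[] args) {
        SketchLazy<String[]> dict = new SketchLazy<>(() -> new String[] {"cn", "de", "us"});
        SketchColumn column = new SketchColumn(true, dict, null);
        column.getDictionary();
        column.getDictionary();
        System.out.println(dict.loads); // the delegate runs only on the first access
    }
}
```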