Redis Object 相关源码阅读笔记,源码文件 server.hobject.c


1. 小结

redisObject 主要由以下几个部分组成:

  • type:数据对象的类型
  • encoding:数据对象的编码格式,同一类型具有多个编码格式的目的是为了节省内存
  • lru:内存策略相关统计值
  • refcount:引用计数,共享对象以节省内存
  • ptr:数据对象的内容

2. Object 数据结构

在 Redis 中,各个数据类型,均使用 redisObject 表示,其定义如下:

1
2
3
4
5
6
7
8
9
typedef struct redisObject {
    unsigned type:4;
    unsigned encoding:4;
    unsigned lru:LRU_BITS; /* LRU time (relative to global lru_clock) or
                            * LFU data (least significant 8 bits frequency
                            * and most significant 16 bits access time). */
    int refcount;
    void *ptr;
} robj;

其中,type 用来区分不同的数据类型,目前共有 7 种数据类型:

1
2
3
4
5
6
7
#define OBJ_STRING 0    /* String object. */
#define OBJ_LIST 1      /* List object. */
#define OBJ_SET 2       /* Set object. */
#define OBJ_ZSET 3      /* Sorted set object. */
#define OBJ_HASH 4      /* Hash object. */
#define OBJ_MODULE 5    /* Module object. */
#define OBJ_STREAM 6    /* Stream object. */

encoding 则表示数据对象的编码格式,目前共使用 9 中编码格式:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
#define OBJ_ENCODING_RAW 0        /* Raw representation */
#define OBJ_ENCODING_INT 1        /* Encoded as integer */
#define OBJ_ENCODING_HT 2         /* Encoded as hash table */
#define OBJ_ENCODING_ZIPMAP 3     /* No longer used */
#define OBJ_ENCODING_LINKEDLIST 4 /* No longer used: old list encoding. */
#define OBJ_ENCODING_ZIPLIST 5    /* Encoded as ziplist */
#define OBJ_ENCODING_INTSET 6     /* Encoded as intset */
#define OBJ_ENCODING_SKIPLIST 7   /* Encoded as skiplist */
#define OBJ_ENCODING_EMBSTR 8     /* Embedded sds string encoding */
#define OBJ_ENCODING_QUICKLIST 9  /* Encoded as linked list of ziplists */
#define OBJ_ENCODING_STREAM 10    /* Encoded as a radix tree of listpacks */

为节省内存使用,同一 type 可能会使用多种 encoding,其对应关系如下:

type encoding situation
OBJ_STRING OBJ_ENCODING_RAW 默认项
OBJ_ENCODING_INT 字符串为整型,且数值大小在 LONG_MINLONG_MAX 之间
OBJ_ENCODING_EMBSTR 字符串长度小于 OBJ_ENCODING_EMBSTR_SIZE_LIMIT
OBJ_LIST OBJ_ENCODING_QUICKLIST All
OBJ_SET OBJ_ENCODING_INTSET Integers in radix 10 in the range of 64 bit signed integers && number of entries $\le$ set-max-intset-entries
OBJ_ENCODING_HT Others
OBJ_ZSET OBJ_ENCODING_ZIPLIST Number of entries $\le$ zset-max-ziplist-entries && value size $\le$ zset-max-ziplist-value
OBJ_ENCODING_SKIPLIST Others
OBJ_HASH OBJ_ENCODING_ZIPLIST Number of entries $\le$ hash-max-ziplist-entries && value size $\le$ hash-max-ziplist-value
OBJ_ENCODING_HT Others
OBJ_MODULE OBJ_ENCODING_RAW All
OBJ_STREAM OBJ_ENCODING_STREAM All

此外,lru 字段的含义和 Redis 使用的 key 驱逐策略(LRU,LFU…)有关,refcount 为引用计数,ptr 指向具体数据内容。

3. Object APIs

3.1. 构造函数

3.1.1. createObject

1
robj *createObject(int type, void *ptr);

根据传入的参数创建编码格式为 OBJ_ENCODING_RAW 的数据对象,此时 robj 和其内容 robj->ptr 未确保连续存放。

3.1.2. String Object

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
/* Create a string object with encoding OBJ_ENCODING_RAW, that is a plain
 * string object where o->ptr points to a proper sds string. */
robj *createRawStringObject(const char *ptr, size_t len);

/* Create a string object with encoding OBJ_ENCODING_EMBSTR, that is
 * an object where the sds string is actually an unmodifiable string
 * allocated in the same chunk as the object itself. */
robj *createEmbeddedStringObject(const char *ptr, size_t len);

/* Create a string object with EMBSTR encoding if it is smaller than
 * OBJ_ENCODING_EMBSTR_SIZE_LIMIT, otherwise the RAW encoding is
 * used.
 *
 * The current limit of 44 is chosen so that the biggest string object
 * we allocate as EMBSTR will still fit into the 64 byte arena of jemalloc. */
#define OBJ_ENCODING_EMBSTR_SIZE_LIMIT 44
robj *createStringObject(const char *ptr, size_t len) {
    if (len <= OBJ_ENCODING_EMBSTR_SIZE_LIMIT)
        return createEmbeddedStringObject(ptr,len);
    else
        return createRawStringObject(ptr,len);
}

可以看出在 createStringObject 中,如果字符串长度小于 44,则选用 OBJ_ENCODING_EMBSTR 编码,否则使用 OBJ_ENCODING_RAW 编码。

  • 使用 OBJ_ENCODING_RAW 编码时,robj 和其内容 robj->ptr 未连续存放,robj->ptr 可以指向任意类型的 sds
  • 使用 OBJ_ENCODING_EMBSTR 编码时,robj 和其内容 robj->ptr 连续存放,robj-ptr 指向 sdshdr8
  • sdshdr8 的长度上限为 64,但是此处为了使得整个 robj fit into the 64 byte arena of jemalloc,因此 OBJ_ENCODING_EMBSTR_SIZE_LIMIT 为 44

    64 - sizeof(robj) - sizeof(struct sdshdr8) - sizeof('\0') = 64 - 16 - 3 - 1 = 44

1
2
/* Create a string object from a long long value. */
robj *createStringObjectFromLongLongWithOptions(long long value, int valueobj);
  • 如果 valueobj == 0 且数值在区间 $[0, 10000]$ 内,则使用共享对象

    如果 key 驱逐策略不为 MAXMEMORY_FLAG_LRUMAXMEMORY_FLAG_LFU 时,valueobj 置为 0

  • 如果数值大小在 LONG_MINLONG_MAX 之间,使用 OBJ_ENCODING_INT 编码,否则使用 OBJ_ENCODING_RAW 编码

  • 使用 OBJ_ENCODING_INT 编码时,robj->ptr 中存储的内容为整型值,而不是地址

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
/* Wrapper for createStringObjectFromLongLongWithOptions() always demanding
 * to create a shared object if possible. */
robj *createStringObjectFromLongLong(long long value);

/* Wrapper for createStringObjectFromLongLongWithOptions() avoiding a shared
 * object when LFU/LRU info are needed, that is, when the object is used
 * as a value in the key space, and Redis is configured to evict based on
 * LFU/LRU. */
robj *createStringObjectFromLongLongForValue(long long value);

/* Create a string object from a long double. If humanfriendly is non-zero
 * it does not use exponential format and trims trailing zeroes at the end,
 * however this results in loss of precision. Otherwise exp format is used
 * and the output of snprintf() is not modified.
 *
 * The 'humanfriendly' option is used for INCRBYFLOAT and HINCRBYFLOAT. */
robj *createStringObjectFromLongDouble(long double value, int humanfriendly);
1
2
/* Duplicate a string object */
robj *dupStringObject(const robj *o);
  • 该函数保证返回的对象和传入的对象编码方式相同
  • 当字符串对象表示整型时,即使是在 $[0, 10000]$ 范围内,该函数也不会使用共享对象,会构造新的字符串对象,因此该函数返回的对象中 o->refcount 始终为 1。

3.1.3. List Object

1
2
3
4
5
/* robj->encoding is OBJ_ENCODING_QUICKLIST */
robj *createQuicklistObject(void);

/* robj->encoding is OBJ_ENCODING_ZIPLIST */
robj *createZiplistObject(void);

createZiplistObject 已不再使用,列表编码格式仅为 OBJ_ENCODING_QUICKLIST

3.1.4. Set Object

1
2
3
4
5
/* robj->encoding is OBJ_ENCODING_HT */
robj *createSetObject(void);

/* robj->encoding is OBJ_ENCODING_INTSET */
robj *createIntsetObject(void);

3.1.5. Hash Object

1
2
/* robj->encoding is OBJ_ENCODING_ZIPLIST */
robj *createHashObject(void);

3.1.6. Zset Object

1
2
3
4
5
/* robj->encoding is OBJ_ENCODING_SKIPLIST */
robj *createZsetObject(void);

/* robj->encoding is OBJ_ENCODING_ZIPLIST */
robj *createZsetZiplistObject(void);

3.1.7. Stream Object

1
2
/* robj->encoding is OBJ_ENCODING_STREAM */
robj *createStreamObject(void);

3.1.8. Module Object

1
2
/* robj->encoding is OBJ_ENCODING_RAW */
robj *createModuleObject(moduleType *mt, void *value);

3.2. 析构函数

1
2
3
4
5
void freeStringObject(robj *o);
void freeListObject(robj *o);
void freeSetObject(robj *o);
void freeZsetObject(robj *o);
void freeHashObject(robj *o);

只有在对象为字符串类型,且其编码格式为 OBJ_ENCODING_INTOBJ_ENCODING_EMBSTR 时,只需释放 robj 对象即可。在其他情况下,还需释放 robj->ptr 所指向的数据内容。

3.3. 引用计数相关

1
2
3
4
5
/* Set a special refcount in the object to make it "shared":
 * incrRefCount and decrRefCount() will test for this special refcount
 * and will not touch the object. This way it is free to access shared
 * objects such as small integers from different threads without any mutex. */
robj *makeObjectShared(robj *o);

o->refcount 值设置为 OBJ_SHARED_REFCOUNT,以将该对象标记为 shared。

1
void incrRefCount(robj *o);
  • o->refcount == OBJ_SHARED_REFCOUNT 时,说明该对象为共享对象,无操作
  • o->refcount == OBJ_STATIC_REFCOUNT 时,试图增加栈上对象引用计数,会导致 panic
1
2
3
4
5
6
void decrRefCount(robj *o);

/* This variant of decrRefCount() gets its argument as void, and is useful
 * as free method in data structures that expect a 'void free_object(void*)'
 * prototype for the free method. */
void decrRefCountVoid(void *o);
  • o->refcount == OBJ_SHARED_REFCOUNT 时,说明该对象为共享对象,无操作
  • o->refcount == 1 时,会先根据 robj->type 调用指定的析构函数释放 robj->ptr 所指向的内存,再释放 robj
  • o->refcount 小于 0 时,panic
1
2
/* This function set the ref count to zero without freeing the object. */
robj *resetRefCount(robj *obj);

3.4. String 对象相关

判断是否可转换为整型

1
int isSdsRepresentableAsLongLong(sds s, long long *llval);

判断传入的 sds 是否可以转换为 long long 类型:

  • 若可以,且 llval 不为 NULL,则 llval 中存储了该整型数值,函数返回 C_OK
  • 若不行,函数返回 C_ERR
1
int isObjectRepresentableAsLongLong(robj *o, long long *llongval);

isSdsRepresentableAsLongLong 类似,判断传入的 robj 是否可以转换为 long long 类型。

节省内存

1
2
3
4
5
6
7
8
/* Optimize the SDS string inside the string object to require little space,
 * in case there is more than 10% of free space at the end of the SDS
 * string. This happens because SDS strings tend to overallocate to avoid
 * wasting too much time in allocations when appending to the string. */
void trimStringObjectIfNeeded(robj *o);

/* Try to encode a string object in order to save space */
robj *tryObjectEncoding(robj *o);

trimStringObjectIfNeeded 中,如果 String 对象的编码格式为 OBJ_ENCODING_RAWrobj->ptr 所指向的 sds 还有超过 $10%$ 的空闲空间,则会对 sds 进行缩容。

tryObjectEncoding 中:

  • 如果字符串对象可以转换为 long 类型,则会将编码格式转换为 OBJ_ENCODING_INT
  • 如果字符串对象长度小于 OBJ_ENCODING_EMBSTR_SIZE_LIMIT,则会将编码方式从 OBJ_ENCODING_RAW 转换为 OBJ_ENCODING_EMBSTR
  • 在不能转换编码方式的情况下,最后会调用 trimStringObjectIfNeeded 尝试缩容

比较操作

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
/* Compare two string objects via strcmp() or strcoll() depending on flags.
 * Note that the objects may be integer-encoded. In such a case we
 * use ll2string() to get a string representation of the numbers on the stack
 * and compare the strings, it's much faster than calling getDecodedObject().
 *
 * Important note: when REDIS_COMPARE_BINARY is used a binary-safe comparison
 * is used. */
int compareStringObjectsWithFlags(robj *a, robj *b, int flags);

/* Wrapper for compareStringObjectsWithFlags() using binary comparison. */
int compareStringObjects(robj *a, robj *b);

/* Wrapper for compareStringObjectsWithFlags() using collation. */
int collateStringObjects(robj *a, robj *b);

/* Equal string objects return 1 if the two objects are the same from the
 * point of view of a string comparison, otherwise 0 is returned. Note that
 * this function is faster then checking for (compareStringObject(a,b) == 0)
 * because it can perform some more optimization. */
int equalStringObjects(robj *a, robj *b);
  • strcmp: Compares two null-terminated byte strings lexicographically.
  • strcoll: Compares two null-terminated byte strings according to the current locale as defined by the LC_COLLATE category.

Get APIs

1
2
3
/* Get a decoded version of an encoded object (returned as a new object).
 * If the object is already raw-encoded just increment the ref count. */
robj *getDecodedObject(robj *o);
  • 如果编码格式为 OBJ_ENCODING_RAWOBJ_ENCODING_EMBSTR,调用 incrRefCount 后直接返回 o
  • 如果编码格式为 OBJ_ENCODING_INT,则根据 o 构造编码格式为 OBJ_ENCODING_RAWOBJ_ENCODING_EMBSTR 的 String 对象并返回其指针
1
size_t stringObjectLen(robj *o);
  • 如果编码格式为 OBJ_ENCODING_RAWOBJ_ENCODING_EMBSTR,则返回 sds->len
  • 如果编码格式为 OBJ_ENCODING_INT,则返回该整型转换为 10 进制之后的字符串长度
1
2
3
4
5
6
7
int getDoubleFromObject(const robj *o, double *target);
int getDoubleFromObjectOrReply(client *c, robj *o, double *target, const char *msg);
int getLongDoubleFromObject(robj *o, long double *target);
int getLongDoubleFromObjectOrReply(client *c, robj *o, long double *target, const char *msg);
int getLongLongFromObject(robj *o, long long *target);
int getLongLongFromObjectOrReply(client *c, robj *o, long long *target, const char *msg);
int getLongFromObjectOrReply(client *c, robj *o, long *target, const char *msg);

3.5. Utils

1
2
3
4
5
/* 检查 robj 的类型是否和传入的 type 相同,相同返回 1,不同则返回 0 */
int checkType(client *c, robj *o, int type);

/* 返回字符串表示该编码的具体含义 */
char *strEncoding(int encoding);

3.6. TODO

内存使用量统计 & OBJECT and MEMORY Commands 实现。