Working on Redis-Compatible Database 3.0 (LDB)

Came across an article on SpeeDB online a few days back.

I am currently working on an improvement to our customized Redis-compatible database, adding clustering and replication on top of the current Database 2.0 (LDB) we have. This is going to be the foundation for all our data storage units, the fundamental layer for all data stores in the future.

A database backend data store is a tricky thing to get right. I have tried both RocksDB (LSM-tree, write-optimized) and LMDB (B-tree, read-optimized), and RocksDB seems to strike a better balance of read/write performance than the lopsided, read-optimized LMDB. I’ve checked the benchmark here: http://www.lmdb.tech/bench/inmem/ , but it is only applicable to in-memory scenarios; in real-life applications, LMDB does take up more space. For an in-memory database, I think (imho, in the most generalized sense) it’s better to write your own optimized data structures to accommodate strict memory requirements (for large-scale applications) than to use a bloated, full-featured DB in general (ymmv depending on workload scenarios).
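To make that concrete, here is a minimal sketch of the kind of purpose-built structure I mean (a hypothetical illustration, not LDB’s actual internals): an append-only byte arena plus an offset index, which gives up general DB features in exchange for a few integers of per-entry overhead and far less allocator/GC pressure.

```go
// Hypothetical sketch, not LDB's real implementation: values are packed
// into one contiguous arena and addressed by (offset, length), so each
// entry costs a map slot plus two uint32s instead of a separately
// allocated, pointer-chased object.
package main

import "fmt"

type span struct{ off, len uint32 }

// ArenaStore is a purpose-built in-memory key/value store.
type ArenaStore struct {
	arena []byte
	index map[string]span
}

func NewArenaStore() *ArenaStore {
	return &ArenaStore{index: make(map[string]span)}
}

// Set appends the value to the arena and records where it lives.
// Note: overwriting an existing key leaks its old arena space; a real
// implementation would need compaction.
func (s *ArenaStore) Set(key string, val []byte) {
	off := uint32(len(s.arena))
	s.arena = append(s.arena, val...)
	s.index[key] = span{off, uint32(len(val))}
}

// Get returns a slice into the arena, avoiding any copy.
func (s *ArenaStore) Get(key string) ([]byte, bool) {
	sp, ok := s.index[key]
	if !ok {
		return nil, false
	}
	return s.arena[sp.off : sp.off+sp.len], true
}

func main() {
	s := NewArenaStore()
	s.Set("page:/home", []byte("<html>cached body</html>"))
	if v, ok := s.Get("page:/home"); ok {
		fmt.Printf("%s\n", v)
	}
}
```

The missing compaction is exactly the kind of complexity a full-featured DB has already paid for; the point is that when you control the workload, you can skip paying for features you never use.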

Other DBs I’ve tried include leveldb and boltdb, and none of them fit my needs, which is why I needed to write my own. LDB stands for LightspeedDB, and it is being used as the caching database behind the blog you are reading right now: 85% of the time, webpages on bcz.com are served from the LDB backend. Its performance is “sort of” comparable to an in-memory hash table, with roughly a 30% space improvement from the data compression used, but it is slower than a pure in-memory hash table implementation by a factor of 8x-10x because of the compression processing overhead.
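As a rough illustration of that trade-off (a toy sketch assuming flate compression; LDB’s actual codec and layout differ), here is a cache that compresses values on write and decompresses on read. The hash lookup itself stays O(1); the 8x-10x slowdown comes entirely from the per-access compression work.

```go
// Toy sketch of a compressed-value cache, not LDB's code path.
package main

import (
	"bytes"
	"compress/flate"
	"fmt"
	"io"
)

type CompressedCache struct {
	data map[string][]byte // values stored flate-compressed
}

func NewCompressedCache() *CompressedCache {
	return &CompressedCache{data: make(map[string][]byte)}
}

// Set compresses the value before storing it; this is where the
// write-side CPU overhead lives.
func (c *CompressedCache) Set(key string, val []byte) error {
	var buf bytes.Buffer
	w, err := flate.NewWriter(&buf, flate.BestSpeed)
	if err != nil {
		return err
	}
	if _, err := w.Write(val); err != nil {
		return err
	}
	if err := w.Close(); err != nil {
		return err
	}
	c.data[key] = buf.Bytes()
	return nil
}

// Get decompresses on every read; this is the read-side overhead.
func (c *CompressedCache) Get(key string) ([]byte, bool) {
	comp, ok := c.data[key]
	if !ok {
		return nil, false
	}
	r := flate.NewReader(bytes.NewReader(comp))
	defer r.Close()
	out, err := io.ReadAll(r)
	if err != nil {
		return nil, false
	}
	return out, true
}

func main() {
	c := NewCompressedCache()
	page := bytes.Repeat([]byte("<p>hello world</p>"), 100)
	_ = c.Set("page:/home", page)
	v, _ := c.Get("page:/home")
	fmt.Printf("raw %d bytes, stored %d bytes, roundtrip ok: %v\n",
		len(page), len(c.data["page:/home"]), bytes.Equal(v, page))
}
```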

Overall, it’s designed to suit our current needs as part of our serving infrastructure. Compared with existing solutions like redis, it offers on-disk expansion capabilities (similar to redis-on-flash). Compared with keydb, we are not tied down to any licensing costs, since it is proprietary software developed in-house. Though I admit the performance is roughly a factor of 2x to 5x slower depending on the workload scenario, we will rarely need to go beyond the software’s overcommitted capacity limit.
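For a sense of what on-disk expansion means here (a hypothetical sketch in the spirit of redis-on-flash, not LDB’s actual design): an in-memory map with a byte budget that spills evicted values to files, and falls back to the disk tier on a miss.

```go
// Hypothetical tiered store: hot values in memory, cold values on disk.
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

type TieredStore struct {
	mem     map[string][]byte
	memUsed int
	budget  int    // max bytes to hold in memory
	dir     string // spill directory for evicted values
}

func NewTieredStore(dir string, budget int) *TieredStore {
	return &TieredStore{mem: make(map[string][]byte), budget: budget, dir: dir}
}

func (t *TieredStore) path(key string) string {
	return filepath.Join(t.dir, fmt.Sprintf("%x.val", key)) // hex-encode key
}

// Set assumes a fresh key for simplicity, then evicts arbitrary entries
// to disk until the in-memory tier is back under budget.
func (t *TieredStore) Set(key string, val []byte) error {
	t.mem[key] = val
	t.memUsed += len(val)
	for k, v := range t.mem {
		if t.memUsed <= t.budget {
			break
		}
		if k == key {
			continue // keep the freshly written key hot
		}
		if err := os.WriteFile(t.path(k), v, 0o644); err != nil {
			return err
		}
		t.memUsed -= len(v)
		delete(t.mem, k)
	}
	return nil
}

// Get serves from memory when possible, otherwise from the disk tier.
func (t *TieredStore) Get(key string) ([]byte, bool) {
	if v, ok := t.mem[key]; ok {
		return v, true
	}
	v, err := os.ReadFile(t.path(key))
	if err != nil {
		return nil, false
	}
	return v, true
}

func main() {
	dir, _ := os.MkdirTemp("", "tiered")
	defer os.RemoveAll(dir)
	t := NewTieredStore(dir, 32) // tiny budget to force spilling
	t.Set("a", []byte("value that is fairly long........"))
	t.Set("b", []byte("another long value..............."))
	if v, ok := t.Get("a"); ok {
		fmt.Printf("a -> %d bytes (served from disk tier)\n", len(v))
	}
}
```

A real implementation would evict by recency rather than at random, but the shape is the same: the dataset can grow past RAM while the hot set stays fast.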

Drink your own champagne if you are good, but eat your own dog food otherwise. For me, I eat my own butter cake.

Back to developing reliable working software.
