Why prefer Redis over others?
Most developers today have heard of Redis (https://en.wikipedia.org/wiki/Redis). Redis is one of the best open source in-memory NoSQL databases currently available. It offers many useful features for frontend as well as backend services, such as key-value lookups, queues, hashes, and HyperLogLog.
It has been 2 years since we started Pepipost. Back then, nearly all of our developers worked with a PHP-MySQL and Perl combination, and the same stack was used to build the Pepipost product. The first few months were smooth, but as we started getting clients with large sending volumes, we began to see performance issues in some components. So we dug deeper to find the problems and their solutions. We concluded that we needed an in-memory cache to speed things up, as most queries were blocking at the MySQL level. Since we had used Memcached (https://en.wikipedia.org/wiki/Memcached) in the past, we started our search from there, looking for the best option in this category. While we were hunting for a solution, one of our developers suggested Redis, which I misheard as RADIUS (https://en.wikipedia.org/wiki/RADIUS). Coming from an Internet service provider background, I had used RADIUS and wondered how we could possibly use it as a database.
After an extended discussion, we went through the docs and benchmarks of Redis and RabbitMQ, and settled on Redis. It had many features we had no use for at that time, but as is human nature, we prefer products with lots of features, even if we are not going to use them.
Going forward, we added one Redis server to cache all the client configs. After that, the config queries to MySQL reduced by 50%. Once this was in place, we broke our backend components down into microservices and managed them using Redis queues. When we started we had 10 queues; today there are hundreds, each serving a separate purpose.
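The two steps above, caching client configs and passing work between services through queues, can be sketched roughly as follows. This is a minimal illustration, not our production code: the key naming scheme, the TTL, and the `load_from_db` callback are assumptions made for the example. The `cache` argument is expected to behave like a redis-py client (`get`, `setex`, `lpush`, `rpop`).

```python
import json

CONFIG_TTL_SECONDS = 300  # assumed cache lifetime; tune to how often configs change


def get_client_config(cache, client_id, load_from_db):
    """Cache-aside lookup: try Redis first, fall back to the database on a miss."""
    key = f"client:config:{client_id}"  # hypothetical key naming scheme
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)
    config = load_from_db(client_id)  # e.g. a MySQL query (hypothetical callback)
    cache.setex(key, CONFIG_TTL_SECONDS, json.dumps(config))
    return config


def enqueue(cache, queue_name, job):
    """Producer side: push a JSON-encoded job onto the head of a Redis list."""
    cache.lpush(queue_name, json.dumps(job))


def dequeue(cache, queue_name):
    """Consumer side: pop the oldest job from the tail, or None if the list is empty."""
    raw = cache.rpop(queue_name)
    return None if raw is None else json.loads(raw)
```

With redis-py, `cache` could be something like `redis.Redis(host="localhost")`. A long-running worker would more likely use the blocking `brpop` instead of polling with `rpop`.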
Redis Usage in Pepipost.
Here at Pepipost, we use Redis in multiple scenarios, from accepting emails over the HTTP API or SMTP and storing them on a Redis queue, to generating the final email and delivering it. Each email triggers multiple events as it progresses. We use Redis to power real-time dashboards showing the number of emails sent today, along with opens, clicks, and bounces. We also use it for priority queues, job queues, and retrying failed events after a particular time interval. At any given time our Redis servers process around 40K requests per second. Some stats:
[Pepipost@~]$ redis-cli info stats
Gradually we moved all our dependencies from MySQL to Redis, so that even if MySQL is down, there is zero impact on Pepipost, from acceptance of an email to its delivery. This has helped us offer 99.99% uptime to our customers.
At times we have started hitting bottlenecks in Redis as well. Currently, we use multiple Redis servers to handle the load of the email volume we deliver (2 billion plus per month). Our engineering team is always working to find the next bottleneck affecting overall performance and optimize it. The team is doing a POC on NSQ (http://nsq.io/overview/performance.html), which, as per their published numbers, can handle up to 400K requests per second on a single node and 800K requests per second on a 3-node cluster. We expect a sizable jump in performance from this.
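To make the dashboard counters and delayed retries mentioned in this section more concrete, here is a rough sketch of one way they could be built. It is an illustration under stated assumptions, not a description of our actual implementation: the key names (`stats:<day>`, `retry:schedule`) and function names are hypothetical, and `r` is expected to behave like a redis-py client created with `decode_responses=True`.

```python
import json
import time

RETRY_KEY = "retry:schedule"  # hypothetical sorted-set name


def record_event(r, event, day):
    """Bump the per-day counter for one event type (sent, open, click, bounce)."""
    return r.hincrby(f"stats:{day}", event, 1)


def dashboard(r, day):
    """Read every counter for a day in a single round trip."""
    return {name: int(count) for name, count in r.hgetall(f"stats:{day}").items()}


def schedule_retry(r, job, delay_seconds, now=None):
    """Store the job in a sorted set, scored by the time it becomes due."""
    now = time.time() if now is None else now
    r.zadd(RETRY_KEY, {json.dumps(job, sort_keys=True): now + delay_seconds})


def pop_due_retries(r, now=None):
    """Return jobs whose due time has passed, removing them from the schedule."""
    now = time.time() if now is None else now
    ready = []
    for member in r.zrangebyscore(RETRY_KEY, 0, now):
        # zrem returns the number of members removed, so only the worker
        # that actually removed the job goes on to process it.
        if r.zrem(RETRY_KEY, member):
            ready.append(json.loads(member))
    return ready
```

A worker would call `pop_due_retries` on a short interval and push the returned jobs back onto the relevant queue. Keeping one hash per day means old counters can be expired or archived a whole day at a time.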
Different ways in which you can use Redis.
- Key-Value stores – To store individual config items.
- Hashes – Use these to store huge key-value lookups that need to be broken down. For example, if you have a blacklist database with millions of entries that needs to be looked up very frequently, storing each entry as a plain Redis key-value pair is not the best approach, as millions of keys can bring performance down. There are also alternatives for this use case, such as a constant database (TinyCDB – www.corpit.ru/mjt/tinycdb.html), if your data runs into the millions.
- Counters & HyperLogLogs – Best for building real-time dashboards & leaderboards.
- Sorted Sets – You can use these for triggering events at a particular time, e.g. drip campaigns and automation of events.
- Bitmaps – Good to build cohorts and other analysis. (https://medium.com/hacking-and-gonzo/bitmapist-analytics-and-cohorts-for-redis-44be43458ef6)
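As an illustration of the Hashes item above, one common way to break a very large lookup down is to spread entries across a fixed number of hashes instead of creating one top-level key per entry. The sketch below assumes this bucketing approach. The bucket count, the key prefix, and the function names are hypothetical choices for the example, and `r` is expected to behave like a redis-py client.

```python
import zlib

BUCKETS = 1024  # assumed bucket count; tune to the size of the data set


def _bucket_key(email):
    """Map an address to one of BUCKETS hashes using a stable checksum."""
    bucket = zlib.crc32(email.encode("utf-8")) % BUCKETS
    return f"blacklist:{bucket}"  # hypothetical key prefix


def blacklist_add(r, email):
    """Add a normalized address as a field in its bucket hash."""
    email = email.strip().lower()
    r.hset(_bucket_key(email), email, 1)


def is_blacklisted(r, email):
    """Check membership with a single HEXISTS on the right bucket."""
    email = email.strip().lower()
    return bool(r.hexists(_bucket_key(email), email))
```

`zlib.crc32` is used rather than Python's built-in `hash()` because it gives the same result across processes and restarts, so every service computes the same bucket for the same address.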
There are many more features available in Redis that you can use; all the commands Redis provides are listed here (https://redis.io/commands).
We will be back with another exciting blog on a new database that we are working with, ClickHouse (https://clickhouse.yandex/), for building our analytics reports. Stay tuned.