Hey folks!
The new version of FMN will run in OpenShift and will use Redis as a cache backend (we chose it over memcached because Redis can do native "is-this-string-in-this-set" set operations).
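For illustration, that "is-this-string-in-this-set" operation maps to Redis's SADD/SISMEMBER commands. Here is a minimal sketch of the semantics using an in-memory stand-in, with the equivalent redis-py calls shown in comments; the key name `fmn:subscribers` is made up for the example, not FMN's actual schema.

```python
# In-memory stand-in for the two Redis set commands FMN relies on.
# Against a real server with redis-py, the same calls would be:
#   r = redis.Redis(host="localhost")
#   r.sadd("fmn:subscribers", "alice")
#   r.sismember("fmn:subscribers", "alice")  # True
class FakeRedisSets:
    def __init__(self):
        self._sets = {}

    def sadd(self, key, *members):
        """Add members to the set at key; return how many were new."""
        s = self._sets.setdefault(key, set())
        added = len(set(members) - s)
        s.update(members)
        return added

    def sismember(self, key, member):
        """Native membership test -- the operation memcached lacks."""
        return member in self._sets.get(key, set())

r = FakeRedisSets()
r.sadd("fmn:subscribers", "alice", "bob")
print(r.sismember("fmn:subscribers", "alice"))  # True
print(r.sismember("fmn:subscribers", "carol"))  # False
```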
I can deploy Redis inside my OpenShift project easily enough, but I was wondering if it would be worthwhile to have a shared Redis instance, like we have a shared PostgreSQL instance. It's not just for ease of use: I expect to store quite a bit of data in our Redis instance, and since we don't attach persistent storage in OpenShift, that data will live in the pod's memory. So I'm being conscious of the memory hog it can become. Unless I'm mistaken, a single Redis instance can host several databases, so we could share it between projects without stepping on each other's toes.
What do you think?
Hey!
I'm not sure it's worth doing. While I don't think this is a bad idea, I can't find any real gains to a shared instance in an OpenShift world. I do see a few downsides though:
- You'll need to share the same Redis password across several projects. I see that as a potential security issue.
- Since you'll use an emptyDir (in-memory storage), every restart will flush the cache for all connected applications.
- Application owners lose control over the Redis instance in case they want to do some fancy stuff with it, or just general debugging.
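For context, an in-memory Redis pod in OpenShift typically mounts an emptyDir volume along these lines (a sketch; the names and image tag are illustrative, not our actual deployment config):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
        - name: redis
          image: redis:7
          volumeMounts:
            - name: data
              mountPath: /data
      volumes:
        - name: data
          # emptyDir lives and dies with the pod:
          # every restart wipes the cache for all connected apps
          emptyDir: {}
```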
The resource footprint of a Redis container is quite low. We have a couple of Redis pods deployed already, and the memory usage for some of them is as low as 14 MB.
On Thu, Nov 24, 2022 at 10:56:57AM +0100, Aurelien Bompard wrote:
We have talked about it before, but I think the tradeoffs come down on the side of separate instances. They aren't too hard to spin up in OpenShift.
- It avoids a single point of failure for a bunch of services.
- It avoids contention/resource problems (i.e., one app hammering the shared instance and starving other apps for resources).
- etc.
So, I would say we should do separate ones per service...
kevin
> - You'll need to share the same redis password across several projects.
Redis does have users and permissions, at least from a quick look at their docs: https://docs.redis.com/latest/rc/security/database-security/passwords-users-...
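As a sketch of what that would look like, Redis 6+ ACLs can scope each application to its own user and key prefix; the user names, passwords, and prefixes below are made up for illustration:

```
# Via redis-cli (or equivalent "user" lines in redis.conf):
# give the FMN app its own user, restricted to keys under fmn:*
ACL SETUSER fmn on >some-fmn-password ~fmn:* +@all

# another project gets its own user and prefix
ACL SETUSER bodhi on >some-bodhi-password ~bodhi:* +@all
```

Each project would then authenticate with its own credentials instead of a single shared password.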
> - Since you'll use an emptyDir (in-memory storage), every restart will flush the cache for all connected applications.
I was thinking of running Redis in a VM, not in OpenShift. Sorry if that wasn't clear in my initial message.
> - Application owners lose control over the redis instance in case they want to do some fancy stuff with it, or just general debugging.
True.
> It avoids a single point of failure for a bunch of services.
Right, but that's already what our PostgreSQL host is at the moment.
> Contention/resource problems. (ie, one app is hammering the shared instance and starving other apps for resources).
True, true.
OK, that makes sense. The good thing about having a central Redis DB was, in my mind, to have persistent storage. What happens if I store a lot of data in the Redis OpenShift pod? Won't that hit a memory limit? I think our current usage of Redis has been pub/sub and light caching, but we haven't stored a lot of data in there yet.
On Mon, Nov 28, 2022 at 01:17, Kevin Fenzi kevin@scrye.com wrote:
On Mon, Nov 28, 2022 at 10:49:28AM +0100, Aurelien Bompard wrote:
> > It avoids a single point of failure for a bunch of services.
>
> Right, but that's what our PostgreSQL host is at the moment already.
Yeah, true. I have thought about splitting that out too, but I'm not sure having databases in OpenShift is a good idea, and making them VMs adds the overhead of more VMs.
> > Contention/resource problems. (ie, one app is hammering the shared instance and starving other apps for resources).
>
> True, true.
>
> OK, that makes sense. The good thing about having a central Redis DB was, in my mind, to have persistent storage. What happens if I store a lot of data in the Redis Openshift pod? Won't that hit a memory limit? I think our current usage of Redis has been pubsub and light cache, but we haven't stored a lot of data in there yet.
Well, we can actually do persistent storage in the ocp4 cluster. ;)
There are NFS volumes, but there's also local Ceph storage (using disk on the compute nodes). I'm not sure how slow/fast it might be, but it is there...
kevin
> Well, we can actually do persistent storage in the ocp4 cluster. ;)
Oh, that's interesting! Are we using it already in one of our ansible-deployed apps?
> I'm not sure how slow/fast it might be, but it is there...
I think it's fine: Redis will use memory first and snapshot to disk periodically, so disk speed should not be an issue. That said, now that I look more into the docs, if I try to store more data than the memory allows, Redis will evict data rather than spill to disk. So having persistent storage for Redis only helps with pod restarts. Which is still useful.
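For reference, these are the redis.conf knobs that govern that behavior; the values below are illustrative, not a recommendation for FMN:

```
# Cap Redis memory use; beyond this the eviction policy kicks in
maxmemory 256mb

# Evict least-recently-used keys among those with a TTL (cache-friendly);
# "noeviction" would instead make writes fail once the limit is hit --
# Redis never spills overflow to disk either way
maxmemory-policy volatile-lru

# RDB snapshotting: save to disk every 900s if at least 1 key changed,
# so the dataset survives a pod restart when /data is persistent
save 900 1
```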
A.
On 2022-11-29 15:55, Aurelien Bompard wrote:
> > Well, we can actually do persistent storage in the ocp4 cluster. ;)
>
> Oh, that's interesting! Are we using it already in one of our ansible-deployed apps?
You can take the meetbot app as an example: https://pagure.io/fedora-infra/ansible/blob/main/f/roles/openshift-apps/mote...
There are four types of persistent storage available that you can use:
- NFS (no storageClass): poor performance, needs to be provisioned beforehand on Netapp. Can be shared by multiple pods and outside of OpenShift.
- RBD (storageClass: ocs-storagecluster-ceph-rbd): block storage. OpenShift will create an ext4 FS on top of it for you by default. Provides fast performance, but can only be accessed by one node at a time.
- CephFS (storageClass: ocs-storagecluster-cephfs): shared filesystem storage. It's pretty much like NFS. Can be accessed by several pods simultaneously.
- S3 (storageClass: openshift-storage.noobaa.io): object storage. Requires specific support from the application to use it. I don't think Redis supports that kind of storage.
Basically, if you want more than one replica/pod to access your storage, pick CephFS. If you need max performance and single access (like a PostgreSQL database), use RBD. If you want your data accessible from outside OpenShift (i.e. on another VM), use NFS. With the exception of S3, all the other storage types are used the same way; only the PVC definition differs.
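To make that last point concrete, here is what a PVC requesting CephFS storage might look like; the claim name and size are placeholders, and only storageClassName and accessModes would change for the other types:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: redis-data
spec:
  # ReadWriteMany: several pods may mount it simultaneously;
  # RBD would use ReadWriteOnce instead
  accessModes:
    - ReadWriteMany
  storageClassName: ocs-storagecluster-cephfs
  resources:
    requests:
      storage: 1Gi
```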
-darknao