We have had an issue on both IDM servers since the 28th; there is no evidence of it before that date. Authentication performance is affected: logins are slow.
We are trying to find the cause. We found these messages from the time the server started to consume a lot of memory:

Jan 28 20:16:45 icidmpdc1 sssd: Child [10800] ('ipa.unicc.org':'%BE_ipa.unicc.org') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
Jan 28 20:16:45 icidmpdc1 be[ipa.unicc.org]: Starting up
We added more memory to one of the servers, but it still uses more than 95% of its memory and between 20% and 60% of swap. As expected, there are major page faults (majflt/s):

01:13:02 PM  pgpgin/s pgpgout/s   fault/s  majflt/s  pgfree/s pgscank/s pgscand/s pgsteal/s    %vmeff
01:14:01 PM     25.46    710.90   8456.75      0.49   7525.51    630.59      0.00    627.25     99.47
01:15:01 PM    207.28    813.30   7768.93      0.68   7458.57    773.50      0.00    581.38     75.16
01:16:01 PM   1110.16   1076.56   7726.09      2.68   7628.36   1041.29     24.97    840.16     78.79
01:17:01 PM    803.00    750.42   7827.29      1.93   7410.60   1144.91      0.00    765.54     66.86
01:18:01 PM  16282.35   6026.31  37911.53     55.44  17653.13  13243.27     52.22   5828.67     43.84
01:19:01 PM   5636.41   5428.07  17209.47    210.48   8604.65   5133.68     11.86   2333.18     45.34
01:20:02 PM   3108.13   4065.21  10127.76    229.49   6610.96   3183.18      8.94   1441.82     45.17
01:21:01 PM  15298.65   4763.03  13467.79    226.12  39224.22   6995.40     27.79   4130.50     58.81
01:22:01 PM    605.23   4454.37  28790.32     12.51  13404.36      0.00      0.00      0.00      0.00
Average:      1212.61   1222.89  18638.83     24.66   8143.57    730.57      2.68    445.85     60.80
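To pick out the intervals where paging spikes, paging statistics in this layout can be filtered on the majflt/s column. The following is a minimal sketch, assuming the 12-hour time format shown in the report (so majflt/s is the sixth whitespace-separated field). The function name flag_major_faults and the threshold of 50 are illustrative choices, not part of the original report; the sample rows are copied from the figures reported above.

```shell
#!/bin/sh
# Print the sampling intervals whose majflt/s value exceeds a threshold.
# Reads paging statistics (12-hour clock layout) on stdin.
# The default threshold of 50 is an arbitrary, illustrative value.
flag_major_faults() {
    threshold="${1:-50}"
    # Fields: $1 time, $2 AM/PM, $3 pgpgin/s, $4 pgpgout/s, $5 fault/s, $6 majflt/s
    # The header row (non-numeric $6) and the "Average:" row ($2 is not AM/PM) are skipped.
    awk -v t="$threshold" '
        $2 ~ /^(AM|PM)$/ && $6 ~ /^[0-9.]+$/ && $6 + 0 > t { print $1, $2, $6 }
    '
}

# A few rows copied from the reported statistics.
sar_sample='01:13:02 PM pgpgin/s pgpgout/s fault/s majflt/s pgfree/s pgscank/s pgscand/s pgsteal/s %vmeff
01:17:01 PM 803.00 750.42 7827.29 1.93 7410.60 1144.91 0.00 765.54 66.86
01:18:01 PM 16282.35 6026.31 37911.53 55.44 17653.13 13243.27 52.22 5828.67 43.84
01:19:01 PM 5636.41 5428.07 17209.47 210.48 8604.65 5133.68 11.86 2333.18 45.34
01:22:01 PM 605.23 4454.37 28790.32 12.51 13404.36 0.00 0.00 0.00 0.00
Average: 1212.61 1222.89 18638.83 24.66 8143.57 730.57 2.68 445.85 60.80'

printf '%s\n' "$sar_sample" | flag_major_faults 50
```

On a live system the same function could be fed from `sar -B` directly; the field positions would need adjusting if the locale uses a 24-hour clock.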
In our monitoring, memory usage climbs above 95% every 5 minutes and then drops below 20%.
We ran a test with node2 stopped and the issue persists. It is more pronounced when both nodes are active (with replication running).
On Mon, Feb 01, 2021 at 12:44:54PM -0000, Miguel Hinojosa via FreeIPA-users wrote:
Hi,
Can you share, for a start, which version of SSSD you are using and your sssd.conf?
bye, Sumit
Hi
The issue was solved after restarting SSSD and clearing its logs and cache:

# systemctl stop sssd ; rm -rf /var/log/sssd/* /var/lib/sss/{db,mc}/* ; systemctl start sssd
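For anyone who wants to review the steps before running them, the same sequence can be wrapped in a function with a dry-run mode. This is only a sketch: the run and reset_sssd helpers and the DRY_RUN variable are hypothetical names introduced here, and the `{db,mc}` brace expansion is spelled out as two paths so the script does not depend on bash.

```shell
#!/bin/sh
# Hypothetical helper: with DRY_RUN=1 (the default) commands are only printed;
# with DRY_RUN=0 they are executed (root privileges are needed in that case).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Same three steps as the one-liner: stop SSSD, remove its logs and
# cache files, then start it again.
reset_sssd() {
    run systemctl stop sssd
    # The globs are quoted here and expanded by the inner shell only on a real run.
    run sh -c 'rm -rf /var/log/sssd/* /var/lib/sss/db/* /var/lib/sss/mc/*'
    run systemctl start sssd
}
```

Calling `reset_sssd` with the default settings prints the three commands without touching the system; `DRY_RUN=0` set beforehand makes it perform them.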
Thank you for your interest anyway