[PATCH] GPO: Fix crash with GPO and missing security descriptor

Nack. I'd rather we fixed the root of this problem. I did some digging this afternoon and tracked the issue back to ad_gpo.c line 3499 (in current master). If we get back a NULL result or num_results == 0, then we just skip over this item in the list and start processing the next one. Unfortunately, that leaves an item in the candidate_gpos list that was never properly constructed.

Under what circumstances can the secinfo_dacl search return success but with zero results? Is there a bug or a race here (such as the AD server has updated the GPO since we got the list of candidate GPOs?). How best to handle this?

With your patch here, it looks like we're assuming that it's okay to just skip over this GPO. If that's the case, then what we really need to be doing in ad_gpo_get_gpo_attrs_step() is to mark the gp_gpo as being invalid and then after we've gone through them all, shrink the array, removing all of the invalid entries. This will be more future-proof, as it's not just the gpo_sd that is uninitialized here. Every member of this gp_gpo is NULL except for the DN.

If it's *not* okay that we've gotten no results for this lookup (such as in the race case; we don't want to be skipping over a GPO that might properly be denying users), we may need to restart processing at least a couple times to try to avoid the race and go offline if we can't complete the processing (so we at least stick to our cached rules).
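For illustration, the mark-and-shrink idea from the first option could look roughly like the sketch below. It is not taken from ad_gpo.c: `struct example_gpo`, its `valid` flag, and `example_compact_gpos()` are hypothetical names, and the real struct gp_gpo has more members. Whether skipping a GPO is acceptable at all is still the open question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for SSSD's struct gp_gpo; only the fields the
 * sketch needs. The "valid" flag is an assumption, not an existing member. */
struct example_gpo {
    const char *dn;   /* set when the candidate list is built */
    bool valid;       /* set once the attribute lookup returned data */
};

/* Move valid entries to the front, preserving their order, and return the
 * new count. Slots past the returned count are set to NULL.
 * Dropped pointers are overwritten, so this assumes the entries are owned
 * elsewhere (for example by a talloc parent context). */
size_t example_compact_gpos(struct example_gpo **gpos, size_t count)
{
    size_t kept = 0;

    assert(gpos != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        if (gpos[i] != NULL && gpos[i]->valid) {
            gpos[kept++] = gpos[i];
        }
    }
    for (size_t i = kept; i < count; i++) {
        gpos[i] = NULL;
    }
    return kept;
}
```

The caller would run this once after all attribute lookups have finished and then use the returned count as the new length of the candidate list.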

Lukas Slebodnik

23 Apr

8:14 a.m.

On (20/04/15 14:38), Stephen Gallagher wrote:

...

On Mon, 2015-04-20 at 08:53 +0200, Lukas Slebodnik wrote:

...
ehlo,

attached patch fixes crash in https://fedorahosted.org/sssd/ticket/2629

Nack. I'd rather we fixed the root of this problem. I did some digging this afternoon and tracked the issue back to ad_gpo.c line 3499 (in current master). If we get back a NULL result or num_results == 0, then we just skip over this item in the list and start processing the next one. Unfortunately, that leaves an item in the candidate_gpos list that was never properly constructed.

You are right.

We got a referral for the GPO and therefore did not find any attributes.

[sdap_sd_search_send] (0x0400): Searching entry [cn={2BA15B73-9524-419F-B4B7-185E1F0D3DCF},cn=policies,cn=system,DC=example,DC=com] using SD
[sdap_print_server] (0x2000): Searching 10.1.1.14
[sdap_get_generic_ext_step] (0x0400): calling ldap_search_ext with [(objectclass=*)][cn={2BA15B73-9524-419F-B4B7-185E1F0D3DCF},cn=policies,cn=system,DC=example,DC=com].
[sdap_get_generic_ext_step] (0x1000): Requesting attrs: [nTSecurityDescriptor]
[sdap_get_generic_ext_step] (0x1000): Requesting attrs: [cn]
[sdap_get_generic_ext_step] (0x1000): Requesting attrs: [gPCFileSysPath]
[sdap_get_generic_ext_step] (0x1000): Requesting attrs: [gPCMachineExtensionNames]
[sdap_get_generic_ext_step] (0x1000): Requesting attrs: [gPCFunctionalityVersion]
[sdap_get_generic_ext_step] (0x1000): Requesting attrs: [flags]
[sdap_get_generic_ext_step] (0x2000): ldap_search_ext called, msgid = 14
[sdap_process_result] (0x2000): Trace: sh[0x7f5d409a8c60], connected[1], ops[0x7f5d409d2c10], ldap[0x7f5d409a9ed0]
[sdap_process_result] (0x2000): Trace: ldap_result found nothing!
[sdap_process_result] (0x2000): Trace: sh[0x7f5d409a8c60], connected[1], ops[0x7f5d409d2c10], ldap[0x7f5d409a9ed0]
[sdap_process_message] (0x4000): Message type: [LDAP_RES_SEARCH_RESULT]
[sdap_get_generic_op_finished] (0x0400): Search result: Referral(10), 0000202B: RefErr: DSID-0310063C, data 0, 1 access points ref 1: 'lzb.hq'
[sdap_get_generic_op_finished] (0x1000): Ref: ldap://lzb.hq/cn=%7B2BA15B73-9524-419F-B4B7-185E1F0D3DCF%7D,cn=policies,cn=system,DC=example,DC=com
[ad_gpo_get_gpo_attrs_done] (0x0040): no attrs found for GPO; try next GPO.

...

Under what circumstances can the secinfo_dacl search return success but with zero results? Is there a bug or a race here (such as the AD server has updated the GPO since we got the list of candidate GPOs?). How best to handle this?

With your patch here, it looks like we're assuming that it's okay to just skip over this GPO. If that's the case, then what we really need to be doing in ad_gpo_get_gpo_attrs_step() is to mark the gp_gpo as being invalid and then after we've gone through them all, shrink the array, removing all of the invalid entries. This will be more future-proof, as it's not just the gpo_sd that is uninitialized here. Every member of this gp_gpo is NULL except for the DN.

If it's *not* okay that we've gotten no results for this lookup (such as in the race case; we don't want to be skipping over a GPO that might properly be denying users), we may need to restart processing at least a couple times to try to avoid the race and go offline if we can't complete the processing (so we at least stick to our cached rules).

I'm sorry I didn't notice it in the log files the first time. Now we know the root of the problem. Which of your proposals do you prefer now?

Stephen Gallagher

1:29 p.m.

On Thu, 2015-04-23 at 08:14 +0200, Lukas Slebodnik wrote:

...

OK, I hadn't considered the referral scenario. (I wasn't actually aware that GPOs could *be* referred to elsewhere). Is this referral pointing to a different machine than the host is enrolled to? It's trying to reach a DN at lzb.hq that is identical to the one we just requested.
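To make the "identical DN" observation easy to check, here is a small sketch that percent-decodes the DN part of the Ref: URL from the log and compares it with the DN we searched for. All function names are hypothetical, the comparison is byte-wise (no DN normalization), and the sketch assumes the URL carries nothing after the DN, as in the logged referral.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Value of a hex digit, or -1 if c is not one. */
static int example_hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    c = (char)tolower((unsigned char)c);
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Decode %XX escapes from src into dst (dst_size bytes, NUL included).
 * Returns 0 on success, -1 on a malformed escape or a too-small buffer. */
int example_percent_decode(const char *src, char *dst, size_t dst_size)
{
    size_t out = 0;

    assert(src != NULL && dst != NULL);
    if (dst_size == 0) return -1;

    while (*src != '\0') {
        char ch = *src;
        if (out + 1 >= dst_size) return -1;
        if (ch == '%') {
            int hi = example_hex_value(src[1]);
            /* Only look at src[2] when src[1] was a valid hex digit. */
            int lo = (hi < 0) ? -1 : example_hex_value(src[2]);
            if (hi < 0 || lo < 0) return -1;
            ch = (char)(hi * 16 + lo);
            src += 2;
        }
        dst[out++] = ch;
        src++;
    }
    dst[out] = '\0';
    return 0;
}

/* Returns 1 when the DN in an ldap://host/dn URL decodes to requested_dn. */
int example_referral_has_same_dn(const char *ref_url, const char *requested_dn)
{
    char decoded[512];
    const char *p = strstr(ref_url, "://");

    if (p == NULL) return 0;
    p = strchr(p + 3, '/');          /* skip the host part */
    if (p == NULL) return 0;
    if (example_percent_decode(p + 1, decoded, sizeof(decoded)) != 0) return 0;
    return strcmp(decoded, requested_dn) == 0;
}
```

With the values from the log, %7B and %7D decode to the braces around the GPO GUID, so the referral names the same DN on lzb.hq.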

I'd really like to know how to reproduce the setup for this situation; is it inheriting a GPO from a parent domain somehow?

Neither of my above proposals is a proper solution for this case, unfortunately. The first one would make the crash go away, but if this GPO really exists and is restrictive, not processing it is *really bad*. Right now we're actually slightly lucky in that the crash has a net result of causing an access denial rather than erroneously resulting in a user login, but if we were to "solve" this problem the way Lukas's original patch intended (or my first alternative above), it would be introducing a security issue. So let's not do either of those.

The second option I suggested there is invalid as well, since the entry *exists*, it's just been converted to a referral. There are therefore only two safe options at this time:

1) Quick-and-ugly: If we encounter a referral, skip straight to denial and loudly warn in the logs that this is a situation we are unable to handle yet. This will avoid the crash, but will leave the GPO behavior just as bad as the current state. This might be acceptable if it is only happening on a very small subset of systems. Users experiencing this might be able to set "ldap_referrals = True" on the affected systems to quietly solve the problem, at the expense of all the other side effects of that setting (particularly performance).

2) More complete: We need to process *these* referrals ourselves. As we're iterating through, make a note of any that have referral responses, queue them up and send them out in parallel. (Managing the connections may be complex and painful). We've needed to rework the LDAP provider to be able to handle referrals ourselves for a long time, but we've always put it off because the need was insufficiently great. It may be time now.
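A minimal sketch of option 1, assuming a hypothetical helper that the attribute-lookup callback would call before touching the GPO. None of these names exist in SSSD; the only hard fact used is that LDAP result code 10 means "referral", which the log prints as Referral(10). Treating an empty non-referral result as a denial too is this sketch's own fail-closed assumption, and a real patch would use SSSD's debug logging rather than stderr.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* LDAP result code 10 is "referral"; the log prints it as Referral(10). */
#define EXAMPLE_LDAP_REFERRAL 10

enum example_gpo_decision {
    EXAMPLE_GPO_CONTINUE,  /* attributes found, keep evaluating the GPO */
    EXAMPLE_GPO_DENY       /* fail closed: refuse access */
};

/* Decide what to do with the result of a single GPO attribute lookup. */
enum example_gpo_decision
example_check_gpo_lookup(int ldap_result, size_t num_results, const char *gpo_dn)
{
    assert(gpo_dn != NULL);

    if (ldap_result == EXAMPLE_LDAP_REFERRAL) {
        /* Loud warning: the admin needs to know why access is denied. */
        fprintf(stderr,
                "GPO [%s]: server returned a referral, which is not "
                "supported yet; denying access.\n", gpo_dn);
        return EXAMPLE_GPO_DENY;
    }
    if (num_results == 0) {
        /* Assumption of this sketch: never silently skip a GPO. */
        fprintf(stderr, "GPO [%s]: no attributes found; denying access.\n",
                gpo_dn);
        return EXAMPLE_GPO_DENY;
    }
    return EXAMPLE_GPO_CONTINUE;
}
```

The point of returning a decision instead of skipping to the next GPO is that a half-constructed entry never reaches the later processing steps, so the crash goes away without opening a path to an erroneous login.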

Jakub Hrozek

24 Apr

12:43 p.m.

On Thu, Apr 23, 2015 at 07:29:07AM -0400, Stephen Gallagher wrote:

...

Quick-and-ugly: If we encounter a referral, skip straight to denial and loudly warn in the logs that this is a situation we are unable to handle yet. This will avoid the crash, but will leave the GPO behavior equivalently bad to the current state. Might be acceptable if this is only happening on a very small subset of systems.

I'm afraid we need to go this way now.

...

Users experiencing this might be able to set "ldap_referrals = True" on the affected systems to quietly solve the problem, at the expense of all the other side effects of that setting (particularly performance).

I wouldn't even suggest enabling referrals. I've seen the LDAP provider misbehave completely with referrals enabled: openldap-libs is not really great at chasing them, the LDAP provider would go offline, etc. So enabling referrals doesn't just degrade performance, but also functionality.

I also tested whether using the Global Catalog would help here. It wouldn't, because some attributes we require, such as gPCFileSysPath, are not replicated to the GC.

Are you sure we're using the right connection to the right domain? We can select the right connection based on the DN that we know. I'm sorry, but I couldn't reproduce the bug myself.

...

More complete: We need to process *these* referrals ourselves. As we're iterating through, make a note of any that have referral responses, queue them up and send them out in parallel. (Managing the connections may be complex and painful). We've needed to rework the LDAP provider to be able to handle referrals ourselves for a long time, but we've always put it off because the need was insufficiently great. It may be time now.

I'm not optimistic there is time for this in this development cycle. Because of the downstream schedule, all refactoring should land by mid-May at the latest.

btw I think you used to work on the referral chasing feature a long time ago and concluded it's not really possible to chase referrals correctly, right? Do you propose a subset of that complete feature for referral chasing?

Lukas Slebodnik

2:01 p.m.

On (24/04/15 12:43), Jakub Hrozek wrote:

...

Are you sure we're using the right connection to the right domain? We can select the right connection based on the DN that we know. I'm sorry, but I couldn't reproduce the bug myself.

Stephen wrote a reproducer into ticket https://fedorahosted.org/sssd/ticket/2629

We agreed to do the "Quick-and-ugly" way for 1.12 + documentation, but he mentioned that it is a valid use case and it would be good to implement it in the next release.

Jakub Hrozek

2:07 p.m.

On Fri, Apr 24, 2015 at 02:01:11PM +0200, Lukas Slebodnik wrote:

...

Are you sure we're using the right connection to the right domain? We can select the right connection based on the DN that we know. I'm sorry, but I couldn't reproduce the bug myself.

Stephen wrote a reproducer into ticket https://fedorahosted.org/sssd/ticket/2629

I know, that's what I tried, but I couldn't get the setup to break. For some reason SSSD didn't examine the GPO from the parent domain if it was linked to a site the client was at.

...

We agreed to do the "Quick-and-ugly" way for 1.12 + documentation, but he mentioned that it is a valid use case and it would be good to implement it in the next release.

Yes, the LDAP provider is in great need of refactoring, but I think even for 1.13 it's too late.

Jakub Hrozek

28 Apr

10:44 a.m.

On Fri, Apr 24, 2015 at 02:07:57PM +0200, Jakub Hrozek wrote:

...

...
We agreed to do the "Quick-and-ugly" way for 1.12 + documentation, but he mentioned that it is a valid use case and it would be good to implement it in the next release.

Yes, the LDAP provider is in great need of refactoring, but I think even for 1.13 it's too late.

Hi,

Who is working on the quick-n-dirty fix? Please note that downstream would like to see this patch sooner rather than later. Feel free to ping me if you'd like me to help with the fix.

Thanks!

Lukas Slebodnik

29 Apr

9:38 a.m.

On (24/04/15 14:07), Jakub Hrozek wrote:

...

On Fri, Apr 24, 2015 at 02:01:11PM +0200, Lukas Slebodnik wrote:

...
On (24/04/15 12:43), Jakub Hrozek wrote:

...
On Thu, Apr 23, 2015 at 07:29:07AM -0400, Stephen Gallagher wrote:

...
On Thu, 2015-04-23 at 08:14 +0200, Lukas Slebodnik wrote:

...
On (20/04/15 14:38), Stephen Gallagher wrote:

...
On Mon, 2015-04-20 at 08:53 +0200, Lukas Slebodnik wrote: > ehlo, > > attached patch fixes crash in > https://fedorahosted.org/sssd/ticket/2629 >

Nack. I'd rather we fixed the root of this problem. I did some digging this afternoon and tracked the issue back to ad_gpo.c line 3499 (in current master). If we get back a NULL result or num_results == 0, then we just skip over this item in the list and start processing the next one. Unfortunately, that leaves an item in the candidate_gpos list that was never properly constructed.

You are right.

We got a referral to GPO and therefore we do not find any attributes.

[sdap_sd_search_send] (0x0400): Searching entry [cn={2BA15B73-9524-419F-B4B7-185E1F0D3DCF},cn=policies,cn=system,DC=example,DC=com] using SD
[sdap_print_server] (0x2000): Searching 10.1.1.14
[sdap_get_generic_ext_step] (0x0400): calling ldap_search_ext with [(objectclass=*)][cn={2BA15B73-9524-419F-B4B7-185E1F0D3DCF},cn=policies,cn=system,DC=example,DC=com].
[sdap_get_generic_ext_step] (0x1000): Requesting attrs: [nTSecurityDescriptor]
[sdap_get_generic_ext_step] (0x1000): Requesting attrs: [cn]
[sdap_get_generic_ext_step] (0x1000): Requesting attrs: [gPCFileSysPath]
[sdap_get_generic_ext_step] (0x1000): Requesting attrs: [gPCMachineExtensionNames]
[sdap_get_generic_ext_step] (0x1000): Requesting attrs: [gPCFunctionalityVersion]
[sdap_get_generic_ext_step] (0x1000): Requesting attrs: [flags]
[sdap_get_generic_ext_step] (0x2000): ldap_search_ext called, msgid = 14
[sdap_process_result] (0x2000): Trace: sh[0x7f5d409a8c60], connected[1], ops[0x7f5d409d2c10], ldap[0x7f5d409a9ed0]
[sdap_process_result] (0x2000): Trace: ldap_result found nothing!
[sdap_process_result] (0x2000): Trace: sh[0x7f5d409a8c60], connected[1], ops[0x7f5d409d2c10], ldap[0x7f5d409a9ed0]
[sdap_process_message] (0x4000): Message type: [LDAP_RES_SEARCH_RESULT]
[sdap_get_generic_op_finished] (0x0400): Search result: Referral(10), 0000202B: RefErr: DSID-0310063C, data 0, 1 access points ref 1: 'lzb.hq'
[sdap_get_generic_op_finished] (0x1000): Ref: ldap://lzb.hq/cn=%7B2BA15B73-9524-419F-B4B7-185E1F0D3DCF%7D,cn=policies,cn=system,DC=example,DC=com
[ad_gpo_get_gpo_attrs_done] (0x0040): no attrs found for GPO; try next GPO.

...
Under what circumstances can the secinfo_dacl search return success but with zero results? Is there a bug or a race here (such as the AD server has updated the GPO since we got the list of candidate GPOs?). How best to handle this?

With your patch here, it looks like we're assuming that it's okay to just skip over this GPO. If that's the case, then what we really need to be doing in ad_gpo_get_gpo_attrs_step() is to mark the gp_gpo as being invalid and then after we've gone through them all, shrink the array, removing all of the invalid entries. This will be more future-proof, as it's not just the gpo_sd that is uninitialized here. Every member of this gp_gpo is NULL except for the DN.
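For illustration only, the mark-and-shrink idea could be sketched as below. The struct is a simplified, hypothetical stand-in (the real struct gp_gpo has more members and the real list is laid out differently), and compact_gpos() is an invented name, not an SSSD function.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the real struct gp_gpo. */
struct gpo_entry {
    const char *dn;  /* known from the candidate list */
    bool valid;      /* set only once the attribute lookup succeeded */
};

/*
 * Move all valid entries to the front of the array, preserving their
 * order, and return how many were kept. Entries past the returned
 * count must no longer be used by the caller.
 */
size_t compact_gpos(struct gpo_entry *gpos, size_t count)
{
    size_t kept = 0;

    for (size_t i = 0; i < count; i++) {
        if (gpos[i].valid) {
            if (kept != i) {
                gpos[kept] = gpos[i];
            }
            kept++;
        }
    }
    return kept;
}
```

The point of compacting in one pass after the loop, rather than patching individual members, is that later code never sees a half-constructed entry at all.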

If it's *not* okay that we've gotten no results for this lookup (such as in the race case; we don't want to be skipping over a GPO that might properly be denying users), we may need to restart processing at least a couple times to try to avoid the race and go offline if we can't complete the processing (so we at least stick to our cached rules).

I'm sorry I didn't notice it in the log files the first time. Now we know the root of the problem. Which of your proposals do you prefer now?

LS

OK, I hadn't considered the referral scenario. (I wasn't actually aware that GPOs could *be* referred to elsewhere). Is this referral pointing to a different machine than the host is enrolled to? It's trying to reach a DN at lzb.hq that is identical to the one we just requested.

I'd really like to know how to reproduce the setup for this situation; is it inheriting a GPO from a parent domain somehow?

Neither of my above proposals is a proper solution for this case, unfortunately. The first one would make the crash go away, but if this GPO really exists and is restrictive, not processing it is *really bad*. Right now we're actually slightly lucky in that the crash has a net result of causing an access denial rather than erroneously resulting in a user login, but if we were to "solve" this problem the way Lukas's original patch intended (or my first alternative above), it would be introducing a security issue. So let's not do either of those.

The second option I suggested there is invalid as well, since the entry *exists*, it's just been converted to a referral. There are therefore only two safe options at this time:

Quick-and-ugly: If we encounter a referral, skip straight to denial and loudly warn in the logs that this is a situation we are unable to handle yet. This will avoid the crash, but will leave the GPO behavior equivalently bad to the current state. Might be acceptable if this is only happening on a very small subset of systems.
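A minimal sketch of that fail-closed rule, assuming a hypothetical helper that sees the LDAP result code of the attribute search. The value 10 matches the Referral(10) printed in the log; the enum, the function name and the wording of the warning are invented for the example and are not the SSSD implementation.

```c
#include <stdio.h>

/* LDAP result code for a referral; the log prints it as Referral(10). */
#define LDAP_RESULT_REFERRAL 10

/* Hypothetical outcome type, not taken from the SSSD sources. */
enum gpo_step_outcome {
    GPO_STEP_CONTINUE,  /* attributes retrieved, keep processing */
    GPO_STEP_DENY       /* cannot evaluate this GPO, fail closed */
};

/*
 * Hypothetical helper: decide how to proceed after the attribute search
 * for one GPO. A referral means the entry exists but lives elsewhere, so
 * skipping it could ignore a restrictive policy; deny and warn instead.
 */
enum gpo_step_outcome gpo_check_search_result(int ldap_result,
                                              const char *gpo_dn)
{
    if (ldap_result == LDAP_RESULT_REFERRAL) {
        fprintf(stderr,
                "GPO [%s] was returned as a referral, which is not "
                "handled yet; denying access.\n", gpo_dn);
        return GPO_STEP_DENY;
    }
    return GPO_STEP_CONTINUE;
}
```

Denying here keeps the net effect of the current crash (no login) without the crash itself.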

I'm afraid we need to go this way now.

...
Users experiencing this might be able to set "ldap_referrals = True" on the affected systems to quietly solve the problem at the expense of all the other side-effects to that (particularly performance).

I wouldn't even suggest enabling referrals. I've seen the LDAP provider misbehave completely with referrals enabled; openldap-libs is not really great at chasing them, the LDAP provider would go offline, etc. So enabling referrals doesn't just degrade performance, but also functionality.

I also tested that using the Global Catalog wouldn't help here: some attributes we require, such as gPCFileSysPath, are not replicated to the GC.

Are you sure we're using the right connection to the right domain? We can select the right connection based on the DN that we know. I'm sorry, but I couldn't reproduce the bug myself.

Stephen wrote a reproducer into ticket https://fedorahosted.org/sssd/ticket/2629

I know, that's what I tried, but I couldn't get the setup to break. For some reason SSSD didn't examine the GPO from the parent domain if it was linked to a site the client was at.

I decided to break processing of GPOs when attributes cannot be found.

I don't know why it was implemented in this way. Is there a valid use case for skipping GPO with missing attributes? Should I special case referral?

Stephen Gallagher

2 p.m.

On Wed, 2015-04-29 at 09:38 +0200, Lukas Slebodnik wrote:

...

I decided to break processing of GPOs when attributes cannot be found.

I don't know why it was implemented in this way. Is there a valid use case for skipping GPO with missing attributes? Should I special case referral?

LS

I'm not aware of any situation where this would be a sensible reply, so this should be fine (and at worst, safe).

I suspect (but since Yassir isn't here any more cannot confirm) that the original intent here was to skip this GPO, but that wasn't correctly implemented. Good thing too, as it would have been a security bug as previously noted.

Given that this is fairly likely to be hit, I suggest that we need to open an RFE bug upstream and then change the message to refer to it. I suggest the following:

"BUG: No attrs found for GPO [%s]. This was likely caused by the GPO entry being a referred to another domain controller. SSSD does not yet support this configuration. See [insert SSSD bug number] for more information."

Jakub Hrozek

2:19 p.m.

On Wed, Apr 29, 2015 at 08:00:02AM -0400, Stephen Gallagher wrote:

...

"BUG: No attrs found for GPO [%s]. This was likely caused by the GPO entry being a referred to another domain controller. SSSD does not yet support this configuration. See [insert SSSD bug number] for more information."

I would also prefer to log into syslog/journal via sss_log() in this case, to get admin's attention easily.

Lukas Slebodnik

2:48 p.m.

On (29/04/15 14:19), Jakub Hrozek wrote:

...

I would also prefer to log into syslog/journal via sss_log() in this case, to get admin's attention easily.

It would be superfluous. We will fail and there should be a log message:

sss_log_ext(SSS_LOG_WARNING, LOG_AUTHPRIV, "Warning: user would " \
            "have been denied GPO-based logon access if the " \
            "ad_gpo_access_control option were set to enforcing " \
            "mode.");

Jakub Hrozek

2:54 p.m.

On Wed, Apr 29, 2015 at 02:48:28PM +0200, Lukas Slebodnik wrote:

...

It would be superfluous. We will fail and there should be a log message:

sss_log_ext(SSS_LOG_WARNING, LOG_AUTHPRIV, "Warning: user would " \
            "have been denied GPO-based logon access if the " \
            "ad_gpo_access_control option were set to enforcing " \
            "mode.");

Would this message also appear in enforcing mode? Remember there is still Stephen's patch on the list to switch the default to enforcing.

Also, the message really doesn't tell anything about the cause of the denial.

Lukas Slebodnik

3:12 p.m.

On (29/04/15 14:54), Jakub Hrozek wrote:

...

Would this message also appear in enforcing mode? Remember there is still Stephen's patch on the list to switch the default to enforcing.

Also, the message really doesn't tell anything about the cause of the denial.

The user will be able to authenticate in permissive mode and we will write a message to syslog. If the administrator is interested, they can enable debugging in sssd and investigate the issue. In this case a syslog message makes sense.

In enforcing mode, the user will not be able to authenticate and the administrator will have to investigate the issue anyway. So there is no reason for a syslog message.
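The two cases above can be sketched as a small decision function. This is only an illustration of the behaviour described in this mail, with hypothetical names; it models just the permissive and enforcing settings and does not call the real logging code.

```c
#include <stdbool.h>

/* Hypothetical model of the two modes discussed here. */
enum gpo_mode { GPO_MODE_PERMISSIVE, GPO_MODE_ENFORCING };

/* What the caller should do with the result of GPO evaluation. */
struct access_outcome {
    bool allow_login;     /* may the user log in? */
    bool syslog_warning;  /* emit the "would have been denied" warning? */
};

/*
 * Map a GPO evaluation result onto the final outcome:
 * - permissive: a denial only produces a syslog warning, login proceeds;
 * - enforcing:  a denial blocks the login and no extra warning is sent.
 */
struct access_outcome apply_gpo_mode(enum gpo_mode mode, bool gpo_denied)
{
    struct access_outcome out = { .allow_login = true,
                                  .syslog_warning = false };

    if (!gpo_denied) {
        return out;
    }
    if (mode == GPO_MODE_PERMISSIVE) {
        out.syslog_warning = true;
    } else {
        out.allow_login = false;
    }
    return out;
}
```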

If we really want to help the admin with the investigation, then a better way would be to enable logging to journald by default (messages with log levels <=2).

Unfortunately, there are many false positives with AD :-(.

Jakub Hrozek

3:28 p.m.

On Wed, Apr 29, 2015 at 03:12:35PM +0200, Lukas Slebodnik wrote:

...

The user will be able to authenticate in permissive mode and we will write a message to syslog. If the administrator is interested, they can enable debugging in sssd and investigate the issue. In this case a syslog message makes sense.

In enforcing mode, the user will not be able to authenticate and the administrator will have to investigate the issue anyway. So there is no reason for a syslog message.
     ~~~~~~~~~~~~~~~~~~~~~~~
this would be easier with a syslog message.

At least make the debug message level 0 so it always appears.

Lukas Slebodnik

3:35 p.m.

On (29/04/15 15:28), Jakub Hrozek wrote:

...

The user will be able to authenticate in permissive mode and we will write a message to syslog. If the administrator is interested, they can enable debugging in sssd and investigate the issue. In this case a syslog message makes sense.

In enforcing mode, the user will not be able to authenticate and the administrator will have to investigate the issue anyway. So there is no reason for a syslog message.
     ~~~~~~~~~~~~~~~~~~~~~~~
this would be easier with a syslog message.

It would be even easier with messages with log levels <=2 in journald.

I'm sorry, but you still do not have a valid reason for an explicit message in syslog.

We have many cases where a syslog message would simplify troubleshooting. So it's better to do it the right way instead of copy-and-paste style.

I will send new patch with Stephen's suggestion.

Lukas Slebodnik

6:50 p.m.

On (29/04/15 08:00), Stephen Gallagher wrote:

...

I'm not aware of any situation where this would be a sensible reply, so this should be fine (and at worst, safe).

I suspect (but since Yassir isn't here any more cannot confirm) that the original intent here was to skip this GPO, but that wasn't correctly implemented. Good thing too, as it would have been a security bug as previously noted.

Given that this is fairly likely to be hit, I suggest that we need to open an RFE bug upstream and then change the message to refer to it. I suggest the following:

"BUG: No attrs found for GPO [%s]. This was likely caused by the GPO entry being a referred to another domain controller. SSSD does not yet support this configuration. See [insert SSSD bug number] for more information."

Updated patch attached.

Stephen Gallagher

8:48 p.m.

On Wed, 2015-04-29 at 18:50 +0200, Lukas Slebodnik wrote:

...

Updated patch attached.

LS

Ack. Tested with my reproduction environment and the user was denied login and the logs showed the message as expected.

Lukas Slebodnik

9:20 p.m.

On (29/04/15 14:48), Stephen Gallagher wrote:

...

Ack. Tested with my reproduction environment and the user was denied login and the logs showed the message as expected.

Thank you very much for the testing and the reproducer. Dan will appreciate it :-)

Jakub Hrozek

30 Apr

9:44 a.m.

On Wed, Apr 29, 2015 at 02:48:12PM -0400, Stephen Gallagher wrote:

...

Ack. Tested with my reproduction environment and the user was denied login and the logs showed the message as expected.

* CI: http://sssd-ci.duckdns.org/logs/job/13/63/summary.html
* master: 03e5f1528184a558fd990e66f083157b404dce08
* sssd-1-12: 7c8c34c1ad152892f93d8e01336258bfd0bc35b9

sssd-devel@lists.fedorahosted.org
