Re: [deltacloud-devel] e2e failing w/rhevm 2.2
by Martyn Taylor
Chris,
I have confirmed that entering an instance name longer than 50 characters does indeed cause a 500 Internal Server Error, so I am going to assume this is the issue.
This needs to be addressed both in Condor and in the RHEVM driver. The driver should run its validation checks before sending the API request to the back-end provider, so that it can return a more helpful error message instead of a generic 500.
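As a rough illustration of the driver-side check described above, here is a minimal sketch in Python. It is not the Deltacloud driver's actual code: the names `validate_instance_name` and `InvalidInstanceName` are hypothetical, and the only fact taken from this thread is the 50-character limit on RHEV-M VM names.

```python
# Hypothetical sketch: reject an over-long instance name before any
# request is sent to the back-end provider.

MAX_VM_NAME_LENGTH = 50  # RHEV-M VM name limit as stated in this thread


class InvalidInstanceName(ValueError):
    """Hypothetical error type carrying a message that explains the failure."""


def validate_instance_name(name: str, max_length: int = MAX_VM_NAME_LENGTH) -> str:
    """Return the name unchanged if it is acceptable, otherwise raise."""
    if not name:
        raise InvalidInstanceName("Instance name must not be empty")
    if len(name) > max_length:
        raise InvalidInstanceName(
            f"Instance name is {len(name)} characters long; "
            f"RHEV-M allows at most {max_length}"
        )
    return name
```

A caller would run this check first and turn `InvalidInstanceName` into a client-side error response, so the user sees the length problem rather than a 500 from the provider.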
Cheers
Martyn
----- Original Message -----
From: "Chris Lalancette" <clalance(a)redhat.com>
To: "Martyn Taylor" <mtaylor(a)redhat.com>
Cc: "Chris Lalancette" <clalance(a)redhat.com>, "Michael Orazi" <morazi(a)redhat.com>
Sent: Monday, 23 May, 2011 1:17:04 PM
Subject: Re: e2e failing w/rhevm 2.2
On 05/23/11 - 07:38:12AM, Martyn Taylor wrote:
> Hi Chris,
>
> I've been trying this morning to get full e2e working with rhevm 2.2.
>
> However, I'm still getting: Create_Instance_Failure: 500 Internal Server Error
>
> You mentioned on Friday's call that there still might be some issues with Condor. Is this now fixed? Could this be the issue I'm seeing here?
>
> Snippet of GridManager Log:
>
> 05/23/11 12:30:49 [18785] querying for removed/held jobs
> 05/23/11 12:30:49 [18785] Using constraint ((Owner=?="aeolus"&&JobUniverse==9)) && ((Managed =!= "ScheddDone")) && (JobStatus == 3 || JobStatus == 4 || (JobStatus == 5 && Managed =?= "External"))
> 05/23/11 12:30:49 [18785] Fetched 0 job ads from schedd
> 05/23/11 12:30:49 [18785] Updating classad values for 6.0:
> 05/23/11 12:30:49 [18785] EnteredCurrentStatus = 1306150249
> 05/23/11 12:30:49 [18785] HoldReason = "Create_Instance_Failure: 500 Internal Server Error"
> 05/23/11 12:30:49 [18785] HoldReasonCode = 0
> 05/23/11 12:30:49 [18785] HoldReasonSubCode = 0
> 05/23/11 12:30:49 [18785] JobStatus = 5
> 05/23/11 12:30:49 [18785] Managed = "Schedd"
> 05/23/11 12:30:49 [18785] NumSystemHolds = 1
> 05/23/11 12:30:49 [18785] ReleaseReason = undefined
> 05/23/11 12:30:49 [18785] No jobs left, shutting down
> 05/23/11 12:30:49 [18785] leaving doContactSchedd()
> 05/23/11 12:30:49 [18785] Got SIGTERM. Performing graceful shutdown.
> 05/23/11 12:30:49 [18785] Started timer to call main_shutdown_fast in 1800 seconds
> 05/23/11 12:30:49 [18785] **** condor_gridmanager (condor_GRIDMANAGER) pid 18785 EXITING WITH STATUS
It is hard to say if this is the exact problem.
The problem I still know about is that condor generates instance names like:
Condor_localhost.localdomain_job#12.0 (or something like that)
Unfortunately, RHEV-M VM names can only be 50 characters, and the above is
always longer than 50 characters. Whether that causes the above error is
unclear to me, but until we fix that, RHEV-M is not going to work.
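
One possible way to keep generated names within the limit is to truncate them and append a short hash of the full name, so that distinct jobs still get distinct names. The sketch below is in Python and is only an illustration under assumptions: `shorten_instance_name` is a hypothetical helper, not Condor's actual naming code, and it assumes that "-" and hexadecimal characters are acceptable in a RHEV-M VM name.

```python
import hashlib

MAX_VM_NAME_LENGTH = 50  # RHEV-M VM name limit as stated in this thread


def shorten_instance_name(name: str,
                          max_length: int = MAX_VM_NAME_LENGTH,
                          digest_chars: int = 8) -> str:
    """Hypothetical helper: fit a generated name into max_length characters.

    Names that already fit are returned unchanged. Longer names are cut
    and suffixed with the first digest_chars hex digits of their SHA-1
    hash, which keeps the result deterministic for a given input.
    """
    if len(name) <= max_length:
        return name
    digest = hashlib.sha1(name.encode("utf-8")).hexdigest()[:digest_chars]
    keep = max_length - digest_chars - 1  # one character for the separator
    return f"{name[:keep]}-{digest}"
```

The hash suffix is a design choice rather than a requirement; a plain truncation would also satisfy the limit, but two long names that share a prefix would then collide.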
--
Chris Lalancette