oVirt Live Snapshots
by Federico Simoncelli
Hi,
oVirt, and more specifically VDSM, is currently implementing the live
snapshot feature using the API/commands provided by libvirt and qemu.
It would be great if you could review the design and the current open
issues at:
http://ovirt.org/wiki/Live_Snapshots
Thank you,
--
Federico
12 years, 3 months
Libstorage and repository engines
by smizrahi@redhat.com
I've been working on refactoring the storageDomain\images system in VDSM.
Apart from facilitating various features, I've also been trying to make adding new SD types easier, and to make the image manipulation bits consistent across domain implementations.
Currently, in order to create a new domain type, you have to create new StorageDomain, Image, and Volume objects and implement all the logic to manipulate them. Apart from being cumbersome and redundant, this also makes mixed clusters very hard to do.
One of the big changes I put in is separating image manipulation from the actual storage work.
Instead of each domain type implementing createImage and co., you have one class responsible for all the image manipulation in the cluster.
All you have to do to support a new storage type is create a domain engine.
A domain engine is a python class that implements a minimal interface:
1. It has to be able to create, resize, and delete a slab (a slab being a block of writable storage, like a lun\lv\file)
2. It has to be able to create and delete tags (tags are pointers to slabs)
The above functions are very easy to implement and require very little complexity. All the heavy lifting (image manipulation, cleaning, transactions, atomic operations, etc.) is managed by the Image Manager, which just uses this unified interface to interact with the different storage types.
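To make the shape of that interface concrete, here is a minimal sketch in Python; the class and method names are illustrative, not the actual VDSM code:

import abc


class DomainEngine(abc.ABC):
    """Illustrative minimal interface for a domain engine.

    A slab is a block of writable storage (a lun/lv/file);
    a tag is a named pointer to a slab.
    """

    @abc.abstractmethod
    def createSlab(self, name, size):
        """Allocate `size` bytes of writable storage."""

    @abc.abstractmethod
    def resizeSlab(self, name, newSize):
        """Grow (or shrink) an existing slab."""

    @abc.abstractmethod
    def deleteSlab(self, name):
        """Free the storage backing the slab."""

    @abc.abstractmethod
    def createTag(self, tag, slabName):
        """Point `tag` at the slab named `slabName`."""

    @abc.abstractmethod
    def deleteTag(self, tag):
        """Drop the pointer; the slab itself is untouched."""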
In cases where a domain has special non-standard features, I introduce the concept of capabilities. A domain engine can declare support for certain capabilities (e.g. native snapshotting) and implement additional interfaces. If the image manager sees that the domain implements a capability it will use it; if not, it will use a default implementation built from the mandatory verbs. This is similar to being able to emulate drawRect with repeated drawLine calls, while using a native drawRect when one is available. This is done automatically and at runtime.
I like to compare this to how OpenGL will fall back to software rendering if a certain standard feature is not implemented by the card: you might get a slower, but still correct, result.
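To illustrate, the runtime capability check could look something like this (the method names are hypothetical, just to show the mechanism):

class ImageManager:
    """Sketch of the capability dispatch in the image manager."""

    def __init__(self, engine):
        self._engine = engine

    def snapshot(self, imgName):
        # Prefer the engine's native capability when it declares one.
        native = getattr(self._engine, "createNativeSnapshot", None)
        if callable(native):
            return native(imgName)
        # Otherwise fall back to a generic implementation composed
        # from the mandatory slab/tag verbs.
        return self._genericSnapshot(imgName)

    def _genericSnapshot(self, imgName):
        # E.g. create a new slab chained on top of the current one
        # via a qcow2 backing file; elided in this sketch.
        raise NotImplementedError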
Now, libstorage is another way to abstract interactions and capabilities for different storage types and have a unified API for accessing them.
Building a repo engine on top of libstorage is entirely possible. But as you can see, this creates a redundant layer of abstraction on the libstorage side.
As I see it, if you just want to have your storage supported by oVirt, creating a repo engine is simpler, as you can use high-level concepts. I also plan to have engines run as their own processes, so you could use whatever licence, language, and storage server API you choose.
Also, libstorage will have to keep its abstraction at a much lower level. This means exposing target-specific flags and abilities. While this is good in concept, it means the repo engine wrapping libstorage will have to juggle all those flags and calls, instead of having a distinct class for each storage type with its own specific hacks in place.
Just as a current example, we use the same "engine" for nfs3 and nfs4. This means that when we are running on nfs4 we are still doing all the hacks that are meant to circumvent issues with v3 being stateless. This is no longer relevant, as v4 is stateful.
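With per-type engines, the version-specific hacks stay contained; a rough sketch of what I mean (the workaround hook is hypothetical):

import os


class Nfs3Engine:
    """NFSv3 engine: carries the workarounds the stateless protocol needs."""

    def createTag(self, tag, slabName):
        self._workAroundStatelessness()  # hypothetical v3-only hack
        os.symlink(slabName, tag)

    def _workAroundStatelessness(self):
        # Stand-in for whatever cache/visibility trickery v3 requires.
        pass


class Nfs4Engine(Nfs3Engine):
    """NFSv4 engine: stateful, so the v3 hack is simply dropped."""

    def createTag(self, tag, slabName):
        os.symlink(slabName, tag)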
And what about SAMBA? Or gluster? You have to have special hacks for both.
What I'm saying is: if even in the relatively simple world of NAS, where we have a proven abstraction (POSIX file access), we can't find a way to create one class to rule them all, how can we expect a sane solution for the crazy world of SAN?
I'm not saying we shouldn't create an engine for libstorage, just that we should treat it like we treat sharefs: as a simple, generic, non-bulletproof\optimized implementation.
Let the flaming commence!
12 years, 3 months
about migration
by wangxiaofan
Hi,
To do migration, why do vdsm and libvirt require a DNS server or hostnames
in the hosts file? Is there any way to do migration directly with an IP
address?
12 years, 3 months
about snapshot
by wangxiaofan
Hi there,
Why doesn't vdsm use the snapshot APIs of libvirt, or "qemu-img snapshot -c"?
12 years, 3 months
VDSM Networking patches
by dfediuck@redhat.com
Hi guys,
Engine-core needs your help!
There are 2 patches which need to be reviewed, in order to allow
bridge-less networks and MTU support in both VDSM and Engine-core.
These are the patches:
* bridge-less
http://gerrit.ovirt.org/#change,848
* mtu
http://gerrit.ovirt.org/#change,754
Please review them.
--
/d
"Ford," he said, "you're turning into a penguin. Stop it." --Douglas Adams, The Hitchhiker's Guide to the Galaxy
12 years, 3 months
[RFC] New Connection Management API
by smizrahi@redhat.com
I have begun work on changing how API clients control storage connections when interacting with VDSM.
Currently there are 2 API calls:
connectStorageServer() - Will connect to the storage target if the host is not already connected to it.
disconnectStorageServer() - Will disconnect from the storage target if the host is connected to it.
This API is very simple but is inappropriate when multiple clients and flows try to access the same storage.
This is currently solved by trying to synchronize things inside rhevm. This is hard and convoluted, and it also causes issues for other clients using the VDSM API.
Another problem is error recovery. Currently ovirt-engine (OE) has no way of monitoring the connections on all the hosts, and if a connection disappears it's OE's responsibility to reconnect.
I suggest a different concept where VDSM 'manages' the connections. VDSM receives a manage request with the connection information and from that point forward VDSM will try to keep this connection alive. If the connection fails VDSM will automatically try and recover.
Every manage request will also have a connection ID (CID). This CID will be used when the same client asks to unmanage the connection.
When multiple manage requests are received for the same connection, they each have to have their own unique CID. By internally mapping CIDs to actual connections, VDSM can properly disconnect when no CID is addressing the connection. This allows each client, and even each flow, to have its own CID, effectively eliminating connect\disconnect races.
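Internally this maps naturally onto a reference-counted registry; a simplified sketch of the idea (not the actual implementation):

class ConnectionRegistry:
    """Maps client-chosen CIDs onto actual storage connections."""

    def __init__(self):
        self._cidToConn = {}   # CID -> connection key (e.g. the URI)
        self._refCount = {}    # connection key -> number of CIDs using it

    def manage(self, cid, conn):
        if cid in self._cidToConn:
            raise KeyError("CID already in use: %s" % cid)
        self._cidToConn[cid] = conn
        self._refCount[conn] = self._refCount.get(conn, 0) + 1
        if self._refCount[conn] == 1:
            # First reference to this target: start connecting, and
            # keep retrying in the background on failure.
            pass

    def unmanage(self, cid):
        conn = self._cidToConn.pop(cid)   # KeyError if the CID is unknown
        self._refCount[conn] -= 1
        if self._refCount[conn] == 0:
            del self._refCount[conn]
            # Last reference gone: actually disconnect from the target.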
The change from (dis)connect to (un)manage also changes the semantics of the calls significantly.
Whereas connectStorageServer returned only once the storage was either connected or had failed to connect, manageStorageServer will return as soon as VDSM has registered the CID. This means that the connection might not be active immediately, while VDSM tries to connect. The connection might remain down for a long time if the storage target is down or is having issues.
This allows for VDSM to receive the manage request even if the storage is having issues and recover as soon as it's operational without user intervention.
In order for the client to query the current state of the connections I propose getStorageConnectionList(). This will return a mapping of CID to connection status. The status contains the connection info (excluding credentials), whether the connection is active, whether the connection is managed (unmanaged connections are returned with transient IDs), and, if the connection is down, the last error information.
The same actual connection can return multiple times, once for each CID.
For cases where an operation requires a connection to be active, a user can poll the status of the CID. The user can then choose to poll for a certain amount of time, or until an error appears in the error field of the status. This gives either a timeout or a "try once" semantic, depending on the flow's needs.
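For example, a client that needs an active connection could poll along these lines (the client binding and field names are illustrative):

import time


def waitForConnection(vdsm, cid, timeout=60, interval=2):
    """Poll until the connection behind `cid` is active, an error is
    reported, or `timeout` seconds have elapsed."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        status = vdsm.getStorageConnectionList()[cid]
        if status['connected']:
            return
        if status['lastError'] != 0:
            # "try once" semantics: give up on the first reported error.
            raise RuntimeError("connection %s failed: %r"
                               % (cid, status['lastError']))
        time.sleep(interval)
    raise RuntimeError("timed out waiting for connection %s" % cid)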
All connections that have been managed persist across VDSM restarts and will be managed until a corresponding unmanage command has been issued.
There is no concept of temporary connections, as "temporary" is flow dependent and VDSM can't accommodate every interpretation of "temporary". An ad-hoc mechanism can be built using the CID field. For instance, a client can manage a connection with "ENGINE_FLOW101_CON1". If the flow gets interrupted, the client can clean up all CIDs carrying that flow's prefix.
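Cleaning up after an interrupted flow then becomes a simple prefix scan (sketch, using the same illustrative binding as above):

def cleanupFlow(vdsm, flowPrefix):
    """Unmanage every CID left behind by an interrupted flow,
    e.g. cleanupFlow(vdsm, 'ENGINE_FLOW101_')."""
    for cid, status in vdsm.getStorageConnectionList().items():
        if status['managed'] and cid.startswith(flowPrefix):
            vdsm.unmanageStorageServer(cid)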
I think this API gives safety, robustness, and implementation freedom.
Nitty Gritty:
manageStorageServer
===================
Synopsis:
manageStorageServer(uri, connectionID):
Parameters:
uri - a uri pointing to a storage target (eg: nfs://server:export, iscsi://host/iqn;portal=1)
connectionID - string with any char except "/".
Description:
Tells VDSM to start managing the connection. From this moment on VDSM will try and have the connection available when needed. VDSM will monitor the connection and will automatically reconnect on failure.
Returns:
Success code if VDSM was able to manage the connection.
It usually just verifies that the arguments are sane and that the CID is not already in use.
This doesn't mean the host is connected.
----
unmanageStorageServer
=====================
Synopsis:
unmanageStorageServer(connectionID):
Parameters:
connectionID - string with any char except "/".
Descriptions:
Tells VDSM to stop managing the connection. VDSM will try and disconnect from the storage target if this is the last CID referencing the storage connection.
Returns:
Success code if VDSM was able to unmanage the connection.
It will return an error if the CID is not registered with VDSM. Disconnect failures are not reported. Active unmanaged connections can be tracked with getStorageServerList()
----
getStorageServerList
====================
Synopsis:
getStorageServerList()
Description:
Will return list of all managed and unmanaged connections. Unmanaged connections have temporary IDs and are not guaranteed to be consistent across calls.
Results:
A mapping between CIDs and the status.
Example return value (actual key names may differ):
{'conA': {'connected': True, 'managed': True, 'lastError': 0,
          'connectionInfo': {'remotePath': 'server:/export',
                             'retrans': 3,
                             'version': 4}},
 'iscsi_session_34': {'connected': False, 'managed': False, 'lastError': 339,
                      'connectionInfo': {'hostname': 'dandylopn',
                                         'portal': 1}}}
12 years, 3 months
getDeviceList should report partitioned devices, too
by Dan Kenigsberg
Hi Lists,
One cannot create a PV on a partitioned device, and therefore such
devices were not reported to Engine. This proved surprising to users,
who wonder where their LUN has disappeared.
Vdsm should report all devices, and ovirt-engine should mark partitioned
devices as unworthy of a PV. In the future, Vdsm may allow forcefully
removing a partition table from a device, to make it usable as a PV.
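For what it's worth, flagging a partitioned device can be done straight from sysfs; a minimal sketch (assuming the standard Linux sysfs layout, not the actual patch):

import os


def hasPartitions(devName):
    """Return True if block device `devName` (e.g. 'sdb') currently
    carries partitions, judging by the sdb1, sdb2, ... entries that
    sysfs exposes under /sys/block/sdb/."""
    sysPath = os.path.join('/sys/block', devName)
    return any(entry.startswith(devName)
               for entry in os.listdir(sysPath))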
Douglas (CCed) would take responsibility on the Vdsm side. The initial patch at
http://gerrit.ovirt.org/944 sports a backward-compatible API. Who's taking this
on Engine? This involves the GUI, too, as partitioned devices should probably be
displayed greyed-out.
Dan.
12 years, 3 months
Vdsm sync call agenda items
by agl@us.ibm.com
Hi Ayal,
I would like to propose two agenda items for Monday's call:
- vdsm testing (in preparation for oVirt Test Day)
- my API refactoring patches
Hopefully by Monday folks will have had a chance to look at the patches and we
can discuss what I have done and the next steps.
Thanks.
--
Adam Litke <agl(a)us.ibm.com>
IBM Linux Technology Center
12 years, 3 months
FOSDEM sessions
by iheim@redhat.com
FYI, we got the following sessions in the FOSDEM[1] Open Source
Virtualization and Cloud devroom.
If you are planning to be at FOSDEM, drop us an email if you want
to meet, discuss, catch up, etc.
1. Virtualization Management the oVirt way - Introducing oVirt (Itamar
Heim)
2. VDSM - The oVirt Node Management Agent (Federico Simoncelli)
3. Open Virtualization – Engine Core: Internals and Infrastructure
(Omer Frenkel)
Apart from that, there are also some KVM sessions in the hypervisor main track.
Hope to meet you there,
Itamar
[1] http://fosdem.org/2012/
12 years, 4 months