oVirt Live Snapshots
by Federico Simoncelli
Hi,
oVirt, and more specifically VDSM, is currently implementing the live
snapshot feature using the API/commands provided by libvirt and qemu.
It would be great if you could review the design and the current open
issues at:
http://ovirt.org/wiki/Live_Snapshots
Thank you,
--
Federico
12 years, 3 months
Libstorage and repository engines
by smizrahi@redhat.com
I've been working on refactoring the storageDomain\images system in VDSM.
Apart from facilitating various features, I've also been trying to make adding new SD types easier, and to make the image manipulation bits consistent across domain implementations.
Currently, in order to create a new domain type, you have to create new StorageDomain, Image, and Volume objects and implement all the logic to manipulate them. Apart from being cumbersome and redundant, this also makes mixed clusters very hard to do.
One of the big changes I put in is separating image manipulation from the actual storage work.
Instead of each domain type implementing createImage and co., you have one class responsible for all the image manipulation in the cluster.
All you have to do to support a new storage type is create a domain engine.
A domain engine is a python class that implements a minimal interface:
1. It has to be able to create, resize, and delete a slab (a slab being a block of writable storage, like a lun\lv\file)
2. It has to be able to create and delete tags (tags are pointers to slabs)
The above functions are very easy to implement and require very little complexity. All the heavy lifting (image manipulation, cleaning, transactions, atomic operations, etc.) is managed by the Image Manager, which just uses this unified interface to interact with the different storage types.
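To make the shape of that interface concrete, here is a minimal sketch in Python; the class and method names are illustrative, not the actual VDSM code:

import abc


class DomainEngine(abc.ABC):
    """Illustrative minimal interface for a domain engine.

    A slab is a block of writable storage (a lun/lv/file);
    a tag is a named pointer to a slab.
    """

    @abc.abstractmethod
    def createSlab(self, name, size):
        """Allocate `size` bytes of writable storage."""

    @abc.abstractmethod
    def resizeSlab(self, name, newSize):
        """Grow (or shrink) an existing slab."""

    @abc.abstractmethod
    def deleteSlab(self, name):
        """Free the storage backing the slab."""

    @abc.abstractmethod
    def createTag(self, tag, slabName):
        """Point `tag` at the slab named `slabName`."""

    @abc.abstractmethod
    def deleteTag(self, tag):
        """Drop the pointer; the slab itself is untouched."""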
In cases where a domain has special non-standard features, I introduce the concept of capabilities. A domain engine can declare support for certain capabilities (e.g. native snapshotting) and implement additional interfaces. If the image manager sees that the domain implements a capability it will use it; if not, it will use a default implementation built from the mandatory verbs. This is similar to being able to emulate drawRect with repeated drawLine calls, while using a native drawRect when one is available. This is done automatically and at runtime.
I like to compare this to how OpenGL will fall back to software rendering if a certain standard feature is not implemented by the card: you might get a slower, but still correct, result.
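To illustrate, the runtime capability check could look something like this (the method names are hypothetical, just to show the mechanism):

class ImageManager:
    """Sketch of the capability dispatch in the image manager."""

    def __init__(self, engine):
        self._engine = engine

    def snapshot(self, imgName):
        # Prefer the engine's native capability when it declares one.
        native = getattr(self._engine, "createNativeSnapshot", None)
        if callable(native):
            return native(imgName)
        # Otherwise fall back to a generic implementation composed
        # from the mandatory slab/tag verbs.
        return self._genericSnapshot(imgName)

    def _genericSnapshot(self, imgName):
        # E.g. create a new slab chained on top of the current one
        # via a qcow2 backing file; elided in this sketch.
        raise NotImplementedError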
Now, libstorage is another way to abstract interactions and capabilities for different storage types and have a unified API for accessing them.
Building a repo engine on top of libstorage is entirely possible. But as you can see, this creates a redundant layer of abstraction on the libstorage side.
As I see it, if you just want to have your storage supported by oVirt, creating a repo engine is simpler, as you can use high-level concepts. I also plan to have engines run as their own processes, so you could use whatever licence, language, and storage server API you choose.
Also, libstorage will have to keep its abstraction at a much lower level. This means exposing target-specific flags and abilities. While this is good in concept, it means the repo engine wrapping libstorage will have to juggle all those flags and calls, instead of having a distinct class for each storage type with its own specific hacks in place.
Just as a current example, we use the same "engine" for nfs3 and nfs4. This means that when we are running on nfs4 we are still doing all the hacks that are meant to circumvent issues with v3 being stateless. This is no longer relevant, as v4 is stateful.
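With per-type engines, the version-specific hacks stay contained; a rough sketch of what I mean (the workaround hook is hypothetical):

import os


class Nfs3Engine:
    """NFSv3 engine: carries the workarounds the stateless protocol needs."""

    def createTag(self, tag, slabName):
        self._workAroundStatelessness()  # hypothetical v3-only hack
        os.symlink(slabName, tag)

    def _workAroundStatelessness(self):
        # Stand-in for whatever cache/visibility trickery v3 requires.
        pass


class Nfs4Engine(Nfs3Engine):
    """NFSv4 engine: stateful, so the v3 hack is simply dropped."""

    def createTag(self, tag, slabName):
        os.symlink(slabName, tag)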
And what about SAMBA? Or gluster? You have to have special hacks for both.
What I'm saying is: if even in the relatively simple world of NAS, where we have a proven abstraction (POSIX file access), we can't find a way to create one class to rule them all, how can we expect a sane solution for the crazy world of SAN?
I'm not saying we shouldn't create an engine for libstorage, just that we should treat it like we treat sharefs: as a simple, generic, non-bulletproof\optimized implementation.
Let the flaming commence!
12 years, 3 months
about migration
by wangxiaofan
Hi,
To do migration, why do vdsm and libvirt require a DNS server or hostnames
in the hosts file? Is there any way to do migration directly with an IP
address?
12 years, 3 months
about snapshot
by wangxiaofan
Hi there,
Why doesn't vdsm use the snapshot APIs of libvirt, or "qemu-img snapshot -c"?
12 years, 3 months
VDSM Networking patches
by dfediuck@redhat.com
Hi guys,
Engine-core needs your help!
There are 2 patches which need to be reviewed, in order to allow
bridge-less networks and MTU support in both VDSM and Engine-core.
These are the patches:
* bridge-less
http://gerrit.ovirt.org/#change,848
* mtu
http://gerrit.ovirt.org/#change,754
Please review them.
--
/d
"Ford," he said, "you're turning into a penguin. Stop it." --Douglas Adams, The Hitchhiker's Guide to the Galaxy
12 years, 3 months
[RFC] New Connection Management API
by smizrahi@redhat.com
I have begun work on changing how API clients control storage connections when interacting with VDSM.
Currently there are 2 API calls:
connectStorageServer() - Will connect to the storage target if the host is not already connected to it.
disconnectStorageServer() - Will disconnect from the storage target if the host is connected to it.
This API is very simple but is inappropriate when multiple clients and flows try to access the same storage.
This is currently solved by trying to synchronize things inside rhevm. This is hard and convoluted, and it also causes issues for other clients using the VDSM API.
Another problem is error recovery. Currently ovirt-engine (OE) has no way of monitoring the connections on all the hosts, and if a connection disappears it's OE's responsibility to reconnect.
I suggest a different concept where VDSM 'manages' the connections. VDSM receives a manage request with the connection information and from that point forward VDSM will try to keep this connection alive. If the connection fails VDSM will automatically try and recover.
Every manage request will also have a connection ID (CID). This CID will be used when the same client asks to unmanage the connection.
When multiple manage requests are received for the same connection, they each have to have their own unique CID. By internally mapping CIDs to actual connections, VDSM can properly disconnect when no CID is addressing the connection. This allows each client, and even each flow, to have its own CID, effectively eliminating connect\disconnect races.
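Internally this maps naturally onto a reference-counted registry; a simplified sketch of the idea (not the actual implementation):

class ConnectionRegistry:
    """Maps client-chosen CIDs onto actual storage connections."""

    def __init__(self):
        self._cidToConn = {}   # CID -> connection key (e.g. the URI)
        self._refCount = {}    # connection key -> number of CIDs using it

    def manage(self, cid, conn):
        if cid in self._cidToConn:
            raise KeyError("CID already in use: %s" % cid)
        self._cidToConn[cid] = conn
        self._refCount[conn] = self._refCount.get(conn, 0) + 1
        if self._refCount[conn] == 1:
            # First reference to this target: start connecting, and
            # keep retrying in the background on failure.
            pass

    def unmanage(self, cid):
        conn = self._cidToConn.pop(cid)   # KeyError if the CID is unknown
        self._refCount[conn] -= 1
        if self._refCount[conn] == 0:
            del self._refCount[conn]
            # Last reference gone: actually disconnect from the target.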
The change from (dis)connect to (un)manage also changes the semantics of the calls significantly.
Whereas connectStorageServer returned only once the storage was either connected or had failed to connect, manageStorageServer will return as soon as VDSM has registered the CID. This means that the connection might not be active immediately, while VDSM tries to connect. The connection might remain down for a long time if the storage target is down or is having issues.
This allows for VDSM to receive the manage request even if the storage is having issues and recover as soon as it's operational without user intervention.
In order for the client to query the current state of the connections I propose getStorageConnectionList(). This will return a mapping of CID to connection status. The status contains the connection info (excluding credentials), whether the connection is active, whether the connection is managed (unmanaged connections are returned with transient IDs), and, if the connection is down, the last error information.
The same actual connection can return multiple times, once for each CID.
For cases where an operation requires a connection to be active, a user can poll the status of the CID. The user can then choose to poll for a certain amount of time, or until an error appears in the error field of the status. This gives either a timeout or a "try once" semantic, depending on the flow's needs.
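For example, a client that needs an active connection could poll along these lines (the client binding and field names are illustrative):

import time


def waitForConnection(vdsm, cid, timeout=60, interval=2):
    """Poll until the connection behind `cid` is active, an error is
    reported, or `timeout` seconds have elapsed."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        status = vdsm.getStorageConnectionList()[cid]
        if status['connected']:
            return
        if status['lastError'] != 0:
            # "try once" semantics: give up on the first reported error.
            raise RuntimeError("connection %s failed: %r"
                               % (cid, status['lastError']))
        time.sleep(interval)
    raise RuntimeError("timed out waiting for connection %s" % cid)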
All connections that have been managed persist across VDSM restarts and will be managed until a corresponding unmanage command has been issued.
There is no concept of temporary connections, as "temporary" is flow dependent and VDSM can't accommodate every interpretation of "temporary". An ad-hoc mechanism can be built using the CID field. For instance, a client can manage a connection with "ENGINE_FLOW101_CON1". If the flow gets interrupted, the client can clean up all CIDs carrying that flow's prefix.
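Cleaning up after an interrupted flow then becomes a simple prefix scan (sketch, using the same illustrative binding as above):

def cleanupFlow(vdsm, flowPrefix):
    """Unmanage every CID left behind by an interrupted flow,
    e.g. cleanupFlow(vdsm, 'ENGINE_FLOW101_')."""
    for cid, status in vdsm.getStorageConnectionList().items():
        if status['managed'] and cid.startswith(flowPrefix):
            vdsm.unmanageStorageServer(cid)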
I think this API gives safety, robustness, and implementation freedom.
Nitty Gritty:
manageStorageServer
===================
Synopsis:
manageStorageServer(uri, connectionID):
Parameters:
uri - a uri pointing to a storage target (eg: nfs://server:export, iscsi://host/iqn;portal=1)
connectionID - string with any char except "/".
Description:
Tells VDSM to start managing the connection. From this moment on VDSM will try and have the connection available when needed. VDSM will monitor the connection and will automatically reconnect on failure.
Returns:
Success code if VDSM was able to manage the connection.
It usually just verifies that the arguments are sane and that the CID is not already in use.
This doesn't mean the host is connected.
----
unmanageStorageServer
=====================
Synopsis:
unmanageStorageServer(connectionID):
Parameters:
connectionID - string with any char except "/".
Descriptions:
Tells VDSM to stop managing the connection. VDSM will try and disconnect from the storage target if this is the last CID referencing the storage connection.
Returns:
Success code if VDSM was able to unmanage the connection.
It will return an error if the CID is not registered with VDSM. Disconnect failures are not reported. Active unmanaged connections can be tracked with getStorageServerList()
----
getStorageServerList
====================
Synopsis:
getStorageServerList()
Description:
Will return list of all managed and unmanaged connections. Unmanaged connections have temporary IDs and are not guaranteed to be consistent across calls.
Results:
A mapping between CIDs and the status.
Example return value (actual key names may differ):
{'conA': {'connected': True, 'managed': True, 'lastError': 0,
          'connectionInfo': {'remotePath': 'server:/export',
                             'retrans': 3,
                             'version': 4}},
 'iscsi_session_34': {'connected': False, 'managed': False, 'lastError': 339,
                      'connectionInfo': {'hostname': 'dandylopn',
                                         'portal': 1}}}
12 years, 3 months
getDeviceList should report partitioned devices, too
by Dan Kenigsberg
Hi Lists,
One cannot create a PV on a partitioned device, and therefore such
devices were not reported to Engine. This proved surprising to users,
who wonder where their LUN has disappeared.
Vdsm should report all devices, and ovirt-engine should mark partitioned
devices as unworthy of a PV. In the future, Vdsm may allow forcefully
removing a partition table from a device, to make it usable as a PV.
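For what it's worth, flagging a partitioned device can be done straight from sysfs; a minimal sketch (assuming the standard Linux sysfs layout, not the actual patch):

import os


def hasPartitions(devName):
    """Return True if block device `devName` (e.g. 'sdb') currently
    carries partitions, judging by the sdb1, sdb2, ... entries that
    sysfs exposes under /sys/block/sdb/."""
    sysPath = os.path.join('/sys/block', devName)
    return any(entry.startswith(devName)
               for entry in os.listdir(sysPath))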
Douglas (CCed) would take responsibility on the Vdsm side. The initial patch at
http://gerrit.ovirt.org/944 sports a backward-compatible API. Who's taking this
on Engine? This involves the GUI, too, as partitioned devices should probably be
displayed greyed-out.
Dan.
12 years, 3 months
Vdsm sync call agenda items
by agl@us.ibm.com
Hi Ayal,
I would like to propose two agenda items for Monday's call:
- vdsm testing (in preparation for oVirt Test Day)
- my API refactoring patches
Hopefully by Monday folks will have had a chance to look at the patches and we
can discuss what I have done and the next steps.
Thanks.
--
Adam Litke <agl(a)us.ibm.com>
IBM Linux Technology Center
12 years, 3 months
FOSDEM sessions
by iheim@redhat.com
FYI, we got the following sessions in the FOSDEM[1] Open Source
Virtualization and Cloud devroom.
If you are planning to be at FOSDEM, drop us an email if you want
to meet, discuss, catch up, etc.
1. Virtualization Management the oVirt way - Introducing oVirt (Itamar
Heim)
2. VDSM - The oVirt Node Management Agent (Federico Simoncelli)
3. Open Virtualization – Engine Core: Internals and Infrastructure
(Omer Frenkel)
Apart from that, there are also some KVM sessions in the hypervisor main track.
Hope to meet you there,
Itamar
[1] http://fosdem.org/2012/
12 years, 4 months