Hello everyone,
I've been working on the parser recently (I'm trying to refactor it)
and I found a few things that are worth at least thinking about.
What I am trying to do is to separate the process into two phases. The
first is parsing along with structural checks (whether the tags are
properly nested and have all the required attributes) and the
second is processing with semantic checks (whether the ids match,
whether there are conflicts between different things, type checks, etc.).
The goal of this approach is to reduce the complexity of the code, which
is unfortunately quite high at the moment. We are also still far away
from having all the semantic checks covered, so that the user gets a
reasonable error message when their recipe is wrong.
The problem with the current command sequence is that its
structure depends on the semantics, while it should be the other way
around. We have several types of commands, and different commands expect
different attributes to be specified. In order to validate the
structure of the document, we would have to do part of the semantic
analysis first. We already had a similar problem with interface types
in <netconfig> a few months back.
For example:
<command_sequence>
    <command type="exec" machine_id="3"
             from="multicast" value="./send_igmp_query"/>
    <command type="system_config" machine_id="2"
             option="/proc/sys/net/ipv4/igmp_max_memberships"
             value="5"/>
    <command type="test" machine_id="1" timeout="30"
             value="IcmpPing">
        <options>...</options>
    </command>
</command_sequence>
Some commands can have children and some cannot. Some of the attributes
are common, while many are supported by only a few commands. We also
use a very general attribute called "value" that has a different
meaning for each command type, which might be confusing.
This makes the parsing logic quite complicated. When we split the
parsing into two phases, we will either have to put a hack here to
evaluate the "type" ahead of the semantic analysis so we can check the
structure, or delay checking the structure until the semantic analysis,
which would again bring everything into one place :-(.
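To make the problem concrete, here is a rough sketch in Python of what a structural check has to do with the current format. This is not the actual parser code: the table of required attributes is only guessed from the example above, and the names REQUIRED_BY_TYPE and check_command are made up for illustration. The point is that the check cannot decide which attributes are required until it has interpreted the value of "type".

```python
import xml.etree.ElementTree as ET

# Hypothetical table, for illustration only: required attributes per
# command type, guessed from the example recipe above.
REQUIRED_BY_TYPE = {
    "exec": {"machine_id", "value"},
    "system_config": {"machine_id", "option", "value"},
    "test": {"machine_id", "value"},
}

def check_command(elem):
    """Return a list of structural errors for one <command> element."""
    # The structure cannot be checked without first interpreting an
    # attribute value, which is really a semantic step.
    cmd_type = elem.get("type")
    if cmd_type not in REQUIRED_BY_TYPE:
        return ["unknown command type %r" % cmd_type]
    missing = REQUIRED_BY_TYPE[cmd_type] - set(elem.attrib)
    return ["<command type=%r> is missing %r" % (cmd_type, name)
            for name in sorted(missing)]

recipe = ET.fromstring(
    '<command_sequence>'
    '<command type="exec" machine_id="3" value="./send_igmp_query"/>'
    '<command type="system_config" machine_id="2" value="5"/>'
    '</command_sequence>'
)
errors = [err for cmd in recipe for err in check_command(cmd)]
```

In this sketch the second command is reported because it lacks "option", but only after the "type" value has been looked up.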
What I would like to propose is a change similar to the one we made
earlier with the interfaces -- move the type to the name of the tag, e.g.,
<command_sequence>
    <exec machine_id="3" from="multicast"
          command="./send_igmp_query"/>
    <system_config machine_id="2"
                   option="/proc/sys/net/ipv4/igmp_max_memberships"
                   value="5"/>
    <test name="IcmpPing" machine_id="1"
          timeout="30">
        <options>...</options>
    </test>
</command_sequence>
This would also allow us to have more specific attribute names for each
command type (for example, "name" for tests and "command" for execs
instead of "value"). It would make the parser much cleaner and easier
to maintain. However, we'd have to break the XML once more ...
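As a rough illustration of how the two phases could look with the proposed format, here is a minimal sketch in Python. It is not the actual parser: the SCHEMA table, the function names, and the set of known machine ids are all assumptions made up for this example. The structural phase only looks at tag names and at which attributes are present, and never reads an attribute value.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema, for illustration only: required attributes and
# whether child elements are allowed, keyed by tag name.
SCHEMA = {
    "exec": ({"machine_id", "command"}, False),
    "system_config": ({"machine_id", "option", "value"}, False),
    "test": ({"machine_id", "name"}, True),
}

def check_structure(sequence):
    """Phase 1: purely structural checks, no attribute value is read."""
    errors = []
    for elem in sequence:
        if elem.tag not in SCHEMA:
            errors.append("unexpected tag <%s>" % elem.tag)
            continue
        required, children_allowed = SCHEMA[elem.tag]
        for name in sorted(required - set(elem.attrib)):
            errors.append("<%s> is missing %r" % (elem.tag, name))
        if len(elem) > 0 and not children_allowed:
            errors.append("<%s> must not have children" % elem.tag)
    return errors

def check_semantics(sequence, known_machine_ids):
    """Phase 2: one example semantic check, run after the structure is valid."""
    return ["<%s> refers to unknown machine %r" % (e.tag, e.get("machine_id"))
            for e in sequence
            if e.get("machine_id") not in known_machine_ids]

sequence = ET.fromstring(
    '<command_sequence>'
    '<exec machine_id="3" from="multicast" command="./send_igmp_query"/>'
    '<test name="IcmpPing" machine_id="1" timeout="30">'
    '<options>...</options></test>'
    '</command_sequence>'
)
structural_errors = check_structure(sequence)
semantic_errors = check_semantics(sequence, known_machine_ids={"1", "2"})
```

Here the structure passes cleanly, and the only complaint comes from the semantic phase, because machine "3" is not in the assumed set of known machines.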
What do you think?
Cheers,
Radek