Re: [SSSD] Multi-line values in INI

13 Apr 2010

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On 04/13/2010 02:56 PM, Dmitri Pal wrote:
...
Hi,
First, I will spend more time on the INI validation code in the
near future rather than on ELAPI as was originally planned.
So the ELAPI work will be deferred. That decision is made, at least for now.
Second, I have come to realize that the internal data
representation for the INI collection should change.
There is a set of data that makes sense to store together with the
actual configuration value:

1) The line number.
2) Whether the value was read from the file originally, was added on
the second pass, or was automatically generated because another
value implies it.
3) In the future it might also store some state or other additional
information needed for validation. For example: was the value
successfully validated and, if not, what was the error.
...
We do not need to define all the use cases now. But the fact that "just
value" is not enough any more is important.
So I am thinking of replacing the "value" in the configuration collection with
a "value object" that will be able to store the information mentioned
above along with the "value" itself.
I think it should be a structure, since it is internal and can easily be
extended internally on an as-needed basis. The same is true of the
interfaces that deal with the object: since it is going to be an internal
object, we have no obligation to keep the interfaces the same.
Agreed, this should be converted to an opaque internal object.
...
So the interface will have create, destroy, and a bunch of other set and
get style methods. Though it is C and not C++ I am still following a
pattern of "loose coupling" and creating "facades" rather than letting
anyone deal with the structure directly. Hope no objections on that front.
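
A minimal sketch of such a value object with a facade of create/get/set style methods. It is written in Python for brevity rather than in C, and every name in it (ValueObject, Origin, get_value, set_validation and so on) is hypothetical and not part of any existing interface:

```python
from enum import Enum
from typing import Optional


class Origin(Enum):
    """Where a value came from (hypothetical categories from the list above)."""
    FILE = "file"                # read from the INI file originally
    SECOND_PASS = "second_pass"  # added on the second pass
    GENERATED = "generated"      # implied by another value


class ValueObject:
    """Holds a configuration value plus its metadata.

    Callers are expected to use the accessor methods only, so the
    internal layout can change without breaking them.
    """

    def __init__(self, value: str, line: int, origin: Origin = Origin.FILE):
        self._value = value
        self._line = line
        self._origin = origin
        self._validated = False
        self._error: Optional[str] = None

    def get_value(self) -> str:
        return self._value

    def get_line(self) -> int:
        return self._line

    def get_origin(self) -> Origin:
        return self._origin

    def set_validation(self, error: Optional[str]) -> None:
        """Record a validation result; None means the value passed."""
        self._validated = True
        self._error = error

    def is_valid(self) -> bool:
        """True only after a validation that reported no error."""
        return self._validated and self._error is None

    def get_error(self) -> Optional[str]:
        return self._error
```

In C the same idea would be an opaque structure with matching create, destroy, get and set functions.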
But as I started looking into the value object I realized that this is a
perfect time to introduce or at least think about supporting multi line
values in the INI files.
I see two use cases that need to be handled in a different way:

1) I have a long line that I just want to split across several lines
for readability.
key = my long multi line value \
that I want to split in the ini file \
between different lines for readability.
In this case the splitting between different lines is done just for
readability, and the application would expect a value consisting of one
buffer with all lines concatenated, new lines removed and NEW LINE
indicators removed. In the example above the NEW LINE indicator is a
back slash, but we will talk about alternative indicators in more detail
below.

2) I have a format where the new lines are embedded in the value.

For example, the PEM format for a certificate expects one buffer that
consists of the concatenated lines, with the NEW LINE symbols at
the end of those lines preserved since they are part of the format.
In the INI file it would look like this:
-----BEGIN CERTIFICATE REQUEST-----
MIIBnTCCAQYCAQAwXTELMAkGA1UEBhMCU0cxETAPBgNVBAoTCE0yQ3J5cHRvMRIw
EAYDVQQDEwlsb2NhbGhvc3QxJzAlBgkqhkiG9w0BCQEWGGFkbWluQHNlcnZlci5l
eGFtcGxlLmRvbTCBnzANBgkqhkiG9w0BAQEFAAOBjQAwgYkCgYEAr1nYY1Qrll1r
uB/FqlCRrr5nvupdIN+3wF7q915tvEQoc74bnu6b8IbbGRMhzdzmvQ4SzFfVEAuM
MuTHeybPq5th7YDrTNizKKxOBnqE2KYuX9X22A1Kh49soJJFg6kPb9MUgiZBiMlv
tb7K3CHfgw5WagWnLl8Lb+ccvKZZl+8CAwEAAaAAMA0GCSqGSIb3DQEBBAUAA4GB
AHpoRp5YS55CZpy+wdigQEwjL/wSluvo+WjtpvP0YoBMJu4VMKeZi405R7o8oEwi
PdlrrliKNknFmHKIaCKTLRcU59ScA6ADEIWUzqmUzP5Cs6jrSRo3NKfg1bd09D1K
9rsQkRc9Urv9mRBIsredGnYECNeRaK5R1yzpOowninXC
-----END CERTIFICATE REQUEST-----
So to address both use cases I propose that the INI interface would
implement the following logic:
Read a line from the INI file into buffer
Label:
    IF the NEW LINE indicator is present
       at the end of the buffer THEN
        IF indicator shows that the
           NEW LINES should be stripped THEN
            1) The indicator is stripped and
                end of line character(s) are stripped
            2) Next line is read and appended to the value.
            3) Goto Label
        ELSE IF indicator shows that the NEW LINES
                should not be stripped THEN
            1) The indicator is stripped and end
               of line character(s) are stripped
            2) In place of the stripped data the new
               line character is inserted
            3) Next line is read and appended to the value.
            4) Goto Label
        ELSE
            ERROR
        ENDIF
    ELSE
       We are done with this value.
    ENDIF
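
The logic above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name read_value is hypothetical, the indicators are the back slash and back slash plus 'n' patterns discussed in this thread (with trailing spaces or tabs tolerated), and the input holds only the lines that make up one value. Since the pattern recognizes just these two indicators, the inner ERROR branch cannot be reached here. The sketch instead reports an error when a line asks for a continuation but no further line exists.

```python
import re

# Assumed indicator: a back slash, an optional 'n', then any spaces or tabs,
# anchored at the very end of the line.
_INDICATOR = re.compile(r"\\(n?)[ \t]*\Z")


def read_value(lines):
    """Join the physical lines of one value into a single string."""
    parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")           # strip end-of-line character(s)
        match = _INDICATOR.search(line)
        if match is None:
            parts.append(line)              # no indicator: the value is complete
            return "".join(parts)
        parts.append(line[:match.start()])  # strip the indicator itself
        if match.group(1) == "n":
            parts.append("\n")              # indicator asks to preserve the new line
    raise ValueError("input ended before the value was complete")
```

For instance, passing the three lines of the first use case yields one string with no back slashes and no line breaks, while lines ending in back slash plus 'n' keep a single new line character at each join.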
Now let us talk about the NEW LINE indicator.
I think of it as a sequence of characters that indicate
that we have a multi-line value that either should
have or should not have the new line symbol as
part of the resulting concatenated string.
It can be a symbol, series of symbols or a pattern.
The most logical and most convenient, as it seems to me,
(and this is where I heard some resistance)
would be the following patterns:
a) New line indicator that does not preserve new line
is a sequence of a back slash and any spaces or tabs after it.
Example (notice that there are spaces after slash):
key = my long multi line value \    
that I want to split in the ini file \    
between different lines for readability.
b) New line indicator that preserves new line
is a sequence of a back slash and symbol 'n' and
any spaces or tabs after it.
Example (notice that there are spaces after 'n' ):
key = my long multi line value with \n   
the preserved new lines because this \n    
is the format my application expects.
I heard some concerns that these patterns should not
be used the way I propose since some other applications,
like make, allow no spaces after "\".
But I do not see what harm my approach does.
It means that those who made the mistake of putting
a space after the slash are not punished.
The spaces at the end are irrelevant, so why
should I punish the users of the INI interface
and applications built on top of it for putting
a space that has no meaning and is stripped anyway?
Maybe I should use some other patterns instead
so that nobody confuses them with
escaping?
Like:
key = my long multi line value +   
that I want to split in the ini file +   
between different lines for readability.
And
key = my long multi line value with &  
the preserved new lines because this &    
is the format my application expects.
Comments and suggestions welcome!
Thank you,
Dmitri Pal
As I said yesterday, my feeling is that we should follow RFC 822 here
(as the python INI parser does).
For lines that are just long and require no special characters:
name=valuevaluevalue
 continuationafterwhitespace
In this case, it would be read in as:
{{{
valuevaluevalue continuationafterwhitespace
}}}
RFC 822 requires the parser to do line continuations only at points
where the resulting value can accept whitespace. This whitespace is
truncated to a single space in the final value.
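
A rough Python sketch of this kind of unfolding, for illustration only. The function name unfold is hypothetical, and collapsing the fold to exactly one space follows the description above rather than any particular parser's behaviour:

```python
def unfold(lines):
    """Merge continuation lines (those starting with a space or tab)
    into the preceding line, joined by a single space."""
    logical = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        # A leading space or tab marks a continuation of the previous line.
        if logical and line[:1] in (" ", "\t"):
            logical[-1] = logical[-1].rstrip(" \t") + " " + line.lstrip(" \t")
        else:
            logical.append(line)
    return logical
```

Each returned entry is one logical "name=value" line; splitting it into key and value is left to the caller.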
Now, if we want to include a value that contains newlines, it should be
done as follows:
name=valuevaluevalue\ncontinueafternewline
 continueafterspace
which would result in the string:
{{{
valuevaluevalue
continueafternewline continueafterspace
}}}
The parser would have to handle the following escape characters:
\n -> newline
\r -> carriage return
\\ -> literal backslash
I don't think there's any value in handling any other escapes, but
others may disagree.
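
A small Python sketch of decoding just these escapes. The function name unescape is hypothetical, a literal backslash is assumed to be written as a doubled backslash, and leaving any other backslash sequence untouched is an assumption rather than something decided in this thread:

```python
# The only escapes handled, per the list above.
_ESCAPES = {"n": "\n", "r": "\r", "\\": "\\"}


def unescape(text):
    """Replace backslash-n, backslash-r and a doubled backslash;
    copy everything else through unchanged."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text) and text[i + 1] in _ESCAPES:
            out.append(_ESCAPES[text[i + 1]])
            i += 2  # consume the backslash and the escape letter
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

Applied after the continuation lines have been joined, this turns the two-line example above into a string with one embedded newline.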
- -- 
Stephen Gallagher
RHCE 804006346421761
Delivering value year after year.
Red Hat ranks #1 in value among software vendors.
http://www.redhat.com/promo/vendor/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/

iEYEARECAAYFAkvExVMACgkQeiVVYja6o6Os5wCfQvGZbbRCvEB9bAPYVlIB9V0j
bywAoKGUTy+V8SwQo96FNoQtQOOxUho7
=dxvm
-----END PGP SIGNATURE-----
