Announcing ceelog-0.1 - comments wanted
by Miloslav Trmač
Hello,
I have just published version 0.1 of ceelog/libceelog at https://fedorahosted.org/ceelog/ :
* Tarball: https://fedorahosted.org/releases/c/e/ceelog/ceelog-0.1.tar.xz
* git: http://git.fedorahosted.org/cgit/ceelog.git/
A full README is appended below.
Comments (on project name, API design, command-line tool wishlist), testing, review, contributions of any kind are very welcome - on the list, in trac, to me privately, in any way you find convenient.
Mirek
About
=====
The ceelog project provides libceelog, a library for receiving, filtering and
searching a stream or log of CEE/Lumberjack syslog records, and an associated
command-line tool, named ceelog.
The goal is to abstract the user from the backend storage (files, some kind
of local indexed storage, a remote database) and to provide efficient log
processing tools that can be used in applications and scripts for automated
log processing.
The project's home page is at https://fedorahosted.org/ceelog/ .
To get you started
==================
The ceelog(1) tool reads the "default" event source and outputs events matching
a filter. Currently, the "default" event source is hardcoded to
/var/log/messages.
Example filters:
* A regexp (matches the unstructured event text, or the "msg" field for
  CEE/Lumberjack structured events)
    ceelog '/DHCP/'
* A field comparison (matches a CEE/Lumberjack field)
    ceelog 'uid == 0'
    ceelog 'uid != 0'
    ceelog 'trusted!uid == 0'
    ceelog 'username ~ /^guest-/'
    ceelog 'username !~ /^guest-/'
* A combination of the above
    ceelog 'trusted!uid == 0 && username ~ /^guest-/'
See the source code in src/ceelog.c for an example of a subset of the API.
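To make the meaning of the example filters concrete, here is a minimal Python sketch of how such predicates could be evaluated against already-parsed events. It is not libceelog's API and not how ceelog is implemented; the helper names (lookup, field_eq, field_match, both) are hypothetical, and it assumes that "!" separates a parent object from a nested field, that "~" means "the regexp matches somewhere in the field", and that numeric fields are stored as JSON numbers.

```python
import re


def lookup(event, path):
    """Resolve a '!'-separated path such as 'trusted!uid' in a nested dict.

    Returns None when any component of the path is missing.
    """
    node = event
    for key in path.split("!"):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def field_eq(path, value):
    """Predicate for 'path == value' (assumed semantics)."""
    return lambda event: lookup(event, path) == value


def field_match(path, pattern):
    """Predicate for 'path ~ /pattern/' (assumed semantics: regexp search)."""
    rx = re.compile(pattern)

    def pred(event):
        value = lookup(event, path)
        return isinstance(value, str) and rx.search(value) is not None

    return pred


def both(*preds):
    """Predicate for 'a && b': every sub-predicate must hold."""
    return lambda event: all(p(event) for p in preds)


# Rough equivalent of: trusted!uid == 0 && username ~ /^guest-/
flt = both(field_eq("trusted!uid", 0), field_match("username", r"^guest-"))

# Made-up sample events, for illustration only.
events = [
    {"msg": "login", "username": "guest-7", "trusted": {"uid": 0}},
    {"msg": "login", "username": "alice", "trusted": {"uid": 0}},
    {"msg": "login", "username": "guest-9", "trusted": {"uid": 1000}},
]
matches = [e for e in events if flt(e)]
```

Only the first sample event satisfies both halves of the combined filter. How the real tool compares values of different JSON types (for example the string "0" against the number 0) is not covered here; the roadmap below lists better JSON type support as future work.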
Roadmap
=======
* Add "Live log file" input source that can handle messages being appended and
  log rotation.
* Document the filter expression format.
* Document ceelog(1).
* Get as close to 100% test coverage as possible.
* Support best-effort saving/restoring the current position in a source.
* Add better support for JSON types.
* Implement MongoDB input source.
* Support searching directly in the input source (e.g. to evaluate the filter
  server-side).
* ceelog(1) improvements:
  - Input processing (e.g. output only the last N recent events, block for
    more incoming events)
  - Output formatting (e.g. only output some structured fields)
  - Statistics/table output (group matching events by one field, output counts)
Bugs
====
Please consider reporting the bug to your distribution's bug tracking system.
Otherwise, please report bugs at https://fedorahosted.org/ceelog/ . Bug
reports with patches are especially welcome.
11 years, 7 months
@cee and MSG content
by Rainer Gerhards
Hi all,
the CEE spec from early this year says that a syslog message with @cee MUST contain JSON content only, and not a message. Take for example this syslog payload:
@cee:{"f":"1"} some text
According to the spec, this is not CEE-enhanced syslog and as such NO JSON object is to be generated. Instead, this would need to be encoded as
@cee:{"f":"1", "msg":"some text"}
to create a proper CEE-enhanced message.
I wonder what our position in lumberjack is in regard to this. My impression is that the "invalid" format (the first sample) will be seen in practice.
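To make the two options concrete, here is a minimal Python sketch of a parser with a strict mode (the spec's reading as described above) and a lenient mode. The function name parse_cee and the lenient policy of folding trailing text into "msg" are only assumptions for discussion, not something the spec or lumberjack defines; skipping whitespace after the cookie is likewise an assumption.

```python
import json

COOKIE = "@cee:"


def parse_cee(payload, lenient=False):
    """Return the JSON object of a CEE-enhanced payload, or None if it is not one.

    strict (default): any text after the JSON object means "not CEE-enhanced".
    lenient (hypothetical policy): trailing text becomes "msg" unless the
    object already carries a "msg" field.
    """
    if not payload.startswith(COOKIE):
        return None
    body = payload[len(COOKIE):].lstrip()
    try:
        # raw_decode parses one JSON value and reports where it ended.
        obj, end = json.JSONDecoder().raw_decode(body)
    except json.JSONDecodeError:
        return None
    if not isinstance(obj, dict):
        return None
    trailing = body[end:].strip()
    if trailing:
        if not lenient:
            return None
        obj.setdefault("msg", trailing)
    return obj


strict_result = parse_cee('@cee:{"f":"1"} some text')
lenient_result = parse_cee('@cee:{"f":"1"} some text', lenient=True)
proper_result = parse_cee('@cee:{"f":"1", "msg":"some text"}')
```

In strict mode the first sample yields no object at all, while the lenient mode turns it into the same object as the properly encoded form.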
Comments?
Rainer
11 years, 7 months
should trusted properties be in a subobject?
by David Lang
should the trusted properties be grouped into a subobject (i.e.
trusted!pid, trusted!uid, etc)?
I think that the answer is yes for the following reasons
1. there may be multiple PID objects. For example, the syslogtag includes a
PID. It's useful to have both what the application claims and what the
system claims and be able to compare the two.
2. when relaying from one machine to another, the trusted properties need
to be handled specially depending on the degree of trust that the
transport protocol provides (and that the sysadmin places on that trust)
3. there needs to be some protection to prevent people from generating
logs that have the trusted properties in them. This is easier if these are
in one object and you can protect that object than if they are a bunch of
individual properties that have to be protected separately.
4. If there is a trusted subobject, it's easier to extend. For example,
when logs are being relayed from one system to another, the receiving
system can add items into the trusted subobject to provide information
about the connection. For example, it could add certificate information or
the remote IP address of the sending machine.
I would define this trusted object as something like:
Fields in the Trusted object should be generated by the software receiving
the log messages and MUST NOT be able to be affected by anything that the
application generating the log puts into the log message.
Applications receiving relayed log messages SHOULD update the Trusted
object to identify if the data should still be trusted.
Applications receiving relayed log messages MAY update the Trusted object
with additional information indicating why the fields should still be
trusted.
with additional information indicating why the fields should still be
trusted.
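As a rough illustration of these rules, here is a minimal Python sketch of a receiver. Everything in it is hypothetical: the function names, the "relay" key and the sample field names are made up for this example, and the policy for an untrusted transport (drop the upstream trusted object entirely) is just one possible choice.

```python
def accept_local(event, observed):
    """Receive a record from a local application.

    Anything the application placed under "trusted" is discarded and replaced
    by what the receiving software observed itself (pid, uid, ...), so the
    application cannot affect the trusted fields.
    """
    out = dict(event)
    out["trusted"] = dict(observed)
    return out


def accept_relayed(event, transport_is_trusted, connection_info):
    """Receive a record relayed from another machine.

    The upstream trusted object is kept only if the admin trusts the
    transport; connection details go under a hypothetical "relay" key.
    """
    out = dict(event)
    upstream = out.pop("trusted", None)
    if transport_is_trusted and isinstance(upstream, dict):
        trusted = dict(upstream)
    else:
        trusted = {}
    trusted["relay"] = dict(connection_info)
    out["trusted"] = trusted
    return out


# An application that tries to forge a trusted uid of 0.
forged = {"msg": "hi", "pid": 1, "trusted": {"uid": 0}}
local = accept_local(forged, {"pid": 4242, "uid": 1000})

conn = {"remote_ip": "192.0.2.10"}
over_untrusted = accept_relayed(local, False, conn)
over_trusted = accept_relayed(local, True, conn)
```

Note that the application's own "pid" claim survives next to trusted!pid, which allows the comparison described in point 1.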
Thoughts?
David Lang
11 years, 7 months
lumberjack schema stability
by Rainer Gerhards
Hi all,
do we consider the current lumberjack XML schema as published on the web site as stable? If so, I think it would make sense to version it and place a "stable" tag onto it (gives implementers peace of mind, at least I think so). With it being officially stable, we can begin to check interop.
With versioning, we can catch up with CEE changes.
Comments are appreciated.
Thanks,
Rainer
11 years, 8 months