cas.spec | 10 -
cas.sql | 7 +
doc/index.rst | 356 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
lib/cas/db.py | 34 +++--
4 files changed, 390 insertions(+), 17 deletions(-)
New commits:
commit 1c9fbbbcd455db47220ab295c7b2ce9092c01b68
Author: Adam Stokes <adam.stokes(a)gmail.com>
Date: Wed Mar 18 11:52:46 2009 -0400
- start work on using mysql backend
diff --git a/cas.spec b/cas.spec
index e463724..5b53a5a 100644
--- a/cas.spec
+++ b/cas.spec
@@ -3,7 +3,7 @@
Name: cas
Summary: Tool to analyze and configure core file environment
Version: 0.13
-Release: 120%{?dist}
+Release: 121%{?dist}
Source0:
https://fedorahosted.org/releases/c/a/cas/%{name}-%{version}.tar.gz
License: GPLv3+
Group: Development/Libraries
@@ -11,12 +11,12 @@ BuildRoot: %{_tmppath}/%{name}-%{version}-%{release}-buildroot
BuildArch: noarch
Url:
http://fedorahosted.org/cas
BuildRequires: python-devel
-Requires: crash
+Requires: crash MySQL-python
%description
CAS provides a user the ability to configure an environment for core analysis
-quickly. All the hassles of matching kernel versions and machine architecture
-types to core dumps are automatically detected and processed.
+quickly. All the hassles of matching kernel versions and machine architecture
+types to core dumps are automatically detected and processed.
%prep
%setup -q
@@ -76,7 +76,7 @@ rm -rf ${RPM_BUILD_ROOT}
- splitting off grabcore to be a download/extract only service
- core of the work to be done specifically by their intended
modules
-
+
* Fri Dec 7 2007 Adam Stokes <astokes at redhat dot com> - 0.9
- release bump
- decompression module added
diff --git a/cas.sql b/cas.sql
new file mode 100644
index 0000000..7017d37
--- /dev/null
+++ b/cas.sql
@@ -0,0 +1,7 @@
+CREATE TABLE cas_jobs (
+ 'ID' PRIMARY AUTO_INCREMENT INTEGER,
+ 'PATH' TEXT,
+ 'COREFILEPATH' TEXT,
+ 'DEBUGPATH' TEXT,
+ 'CREATION_TIME' DATETIME CURRENT_TIMESTAMP
+)
diff --git a/doc/index.rst b/doc/index.rst
new file mode 100644
index 0000000..7ecb87c
--- /dev/null
+++ b/doc/index.rst
@@ -0,0 +1,356 @@
+Core Analysis System
+====================
+
+Introduction
+------------
+
+.. image:: cas_logo.png
+
+Description
+^^^^^^^^^^^
+
+CAS provides a user the ability to configure an environment for core analysis
+quickly. All the hassles of matching kernel versions and machine architecture
+types to core dumps are automatically detected and processed.
+
+Prerequisites
+^^^^^^^^^^^^^
+
+CAS needs at least **Python 2.4** to run. For systems that are not running
+Fedora 9 or later (this would include RHEL 4/5) the EPEL repo needs to be
+installed. Visit `EPEL <
https://fedoraproject.org/wiki/EPEL>`_ to enable
+this repository.
+
+Optionally install `Func <
http://fedorahosted.org/func>`_
+for automatic processing on other architecture specific systems. This
+can easily be installed via `yum`::
+
+ $ yum install func
+
+The amount of storage needed can be determined base on the following
+information:
+
+- The number of kernel-debuginfo packages needed
+- How many core dumps will be processed.
+
+Typically it is recommended to have at least 1TB for cores and another 500GB for
+the debuginfo packages.
+
+Since analyzing cores requires the same architecture specific systems the core
+was generated on there will need to be systems available of those same types
+in order for analyzation to work properly.
+
+Finally, root access to the CAS server.
+
+Configuration
+^^^^^^^^^^^^^
+
+CAS comes with one main configuration file which is located at ``/etc/cas.conf``.
+The overall contents of this file is shown below, further down we will break up
+each section and describe its meaning::
+
+ [settings]
+ kernels=/mnt/kernels
+ rpms=/var/db/cas/rpms.db
+ rpmFilter=.*kernel-debuginfo-[0-9].*\.rpm
+ debugs=/cores/debugs
+ debugLevel=DEBUG
+ workDirectory=/cores/processed
+ servers=/var/db/cas/servers.db
+ jobs=/var/db/cas/jobs.db
+
mailServer=mail.example.com
+ [advanced]
+ # crash_32=/usr/local/i386/crash
+
+``kernels``: (**Required**) Describes the location of where kernel-debuginfo packages are
to be
+stored. This can range anywhere from an nfs mount, samba share, local disk or
+any other type of media the cas server can access.
+
+``rpms``: (**Required**) Database which houses the processing of kernel-debuginfo and
stores the
+necessary information for CAS to properly match core dumps.
+
+``rpmFilters``: (**Required**) This is a emacs based regular expression which is
essentially
+passed to a find command to locate the various kernel-debuginfo packages defined
+in ``kernels`` directive.
+
+``debugs``: (**Required**) A temporary directory in which to store the extracted vmlinux
files
+from the kernel-debuginfo packages for processing. Another solution would be to
+alter this to point an existing directory like ``/tmp``, for instance.
+
+``debugLevel``: As the name suggest it will set the debug level for CAS output.
+Currently the only accepted values are ``DEBUG|INFO``.
+
+``workDirectory``: (**Required**) Defines where all processed cores will be placed. This
mount
+point will need to have the most storage assigned to it. Depending on how many
+cores are processed in a given timeframe this area will fill up quickly.
+
+``servers``: If func is installed and configured all associated servers will
+be stored here.
+
+``jobs``: Contains information on current jobs being processed and eventually
+what jobs are in the queue.
+
+``mailServer``: If wanting output of CAS processing email to a certain address
+this directive needs to be set. ``Note`` that the mail server should not
+require smtp authentication.
+
+``crash_32``: Primarily used on x86_64 systems to process x86 cores. If x86
+version of crash is installed this directive can be set to the crash binary
+and CAS will automatically process x86 cores on a x86_64 machine. ``Note`` this
+is only available if the CAS server is a x86_64 machine.
+
+Setup & Execution
+-----------------
+
+Preparing CAS Server
+^^^^^^^^^^^^^^^^^^^^
+
+To install the CAS package simply type::
+
+ $ yum install cas
+
+Once installed edit ``/etc/cas.conf`` as root using any preferred text editor.
+As described above the required directives need to be altered to suit the
+environment in question.
+
+In this example, ``/mnt/kernels`` is an nfs mount which houses the kernel-debuginfo
+packages. ``/cores`` is where all processed cores are stored and ``/tmp`` is the
+temporary storage for collecting the necessary data from the kernel-debuginfos.
+A mail server is setup within the environment to email CAS results and this
+optional directive is shown to reflect that. Finally, the CAS server is an x86_64
+machine and the environment will be processing x86 cores, therefore, the directive
+for this is uncommented and path to the x86 crash binary is given. ``Note`` there
+is information provided within the configuration file for installing the x86 crash
+to a different location.
+
+Altering the configuration to reflect the above assumptions would show the
+following::
+
+ [settings]
+ kernels=/mnt/kernels
+ rpms=/var/db/cas/rpms.db
+ rpmFilter=.*kernel-debuginfo-[0-9].*\.rpm
+ debugs=/tmp
+ debugLevel=DEBUG
+ workDirectory=/cores
+ servers=/var/db/cas/servers.db
+ jobs=/var/db/cas/jobs.db
+
mailServer=mail.cas-server.com
+ [advanced]
+ crash_32=/usr/local/i386/crash
+
+Now that the configuration file is altered and ``/mnt/kernels`` should be populated
+with kernel-debuginfo rpm's the next section will describe running CAS.
+
+Running CAS
+^^^^^^^^^^^
+
+First, one or two administrative tasks need to be run. The required task is to build
+a database for all the data gathered from the kernel-debuginfo packages.::
+
+ $ cas-admin -b
+
+If ``Func`` is installed and several systems are deployed for CAS to use run::
+
+ $ cas-admin -s
+
+At this point CAS is configured and looking at the output of CAS help there are
+a few options to pass::
+
+ Usage: cas [opts] args
+
+ Options:
+ -h, --help show this help message and exit
+ -i IDENTIFIER, --identifier=IDENTIFIER
+ Unique ID for core
+ -f FILENAME, --file=FILENAME
+ Filename
+ -e EMAIL, --email=EMAIL
+ Define email for results (must be valid!)
+ -m, --modules Extract associated kernel modules
+
+CAS prepares its directory hierarchy based on the ``identifier`` this option is
+therefore required. ``filename`` is also required as it tells CAS exactly which
+core to process and associate with ``identifier``. If wanting email results from
+CAS simply pass it the email parameter.
+
+An example, of a user wanting to process a corefile named ``vmcore.12345``::
+
+ $ cas -i 12345 -f vmcore.12345 -e user(a)cas-server.com
+
+In the above example an assumption is made that ``12345`` is associated to some
+form of ticketing system so to keep things organized an identifier was set of
+that number.
+
+The directory hierarchy for the current job should look like ``/cores/12345``.
+In addition to the processing of core files there is also a ``process log`` contained
+within this directory for each job processed. If multiple jobs for the same identifier
+are issued they are placed within a sub directory marked by the current timestamp
+and the relevant data associated with it.
+
+The last option worth mentioning is for core analyst who are needing to work
+within the core that requires one of the kernel modules loaded during the crash.
+This can be extracted by passing the ``modules`` parameter in the CAS execution
+statement. ``Note`` the ``modules`` parameter is not heavily used but can be
+useful when analyzing filesystem issues and the like.
+
+From this point on CAS will download, process, and email the results of its
+initial analysis to the specified email address. From there further instructions
+are provided in either the email or the ``process log`` on how to access and analyze
+the core.
+
+Analyzing
+---------
+
+Continuing with the previous example the results of CAS processing should be emailed
+and look something similar to::
+
+ Subject: CAS results for 12345
+ Date: Wed, 11 Feb 2009 08:44:37 -0500
+
+ Location: /cores/processed/12345/02.11.09.08.44.19
+ Server:
x86_64.cas-server.com
+ Output data:
+ PID: 0 TASK: ffffffff803e9b80 CPU: 0 COMMAND: "swapper"
+ #0 [ffffffff8047a0a0] smp_call_function_interrupt at ffffffff8011d191
+ #1 [ffffffff8047a0b0] call_function_interrupt at ffffffff80110bf5
+ --- <IRQ stack> ---
+ #2 [ffffffff80529f08] call_function_interrupt at ffffffff80110bf5
+ [exception RIP: default_idle+32]
+ RIP: ffffffff8010e7a9 RSP: ffffffff80529fb8 RFLAGS: 00000246
+ RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000018
+ RDX: ffffffff8010e789 RSI: ffffffff803e9b80 RDI: 0000010008001780
+ RBP: 0000000000000000 R8: ffffffff80528000 R9: 0000000000000080
+ R10: 0000000000000100 R11: 0000000000000004 R12: 0000000000000000
+ R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
+ ORIG_RAX: fffffffffffffffa CS: 0010 SS: 0018
+ #3 [ffffffff80529fb8] cpu_idle at ffffffff8010e81c
+
+ PID: 0 TASK: 100f57cb030 CPU: 1 COMMAND: "swapper"
+ #0 [1000107bfa0] smp_call_function_interrupt at ffffffff8011d191
+ #1 [1000107bfb0] call_function_interrupt at ffffffff80110bf5
+ --- <IRQ stack> ---
+ #2 [10001073e98] call_function_interrupt at ffffffff80110bf5
+ [exception RIP: default_idle+32]
+ RIP: ffffffff8010e7a9 RSP: 0000010001073f48 RFLAGS: 00000246
+ RAX: 0000000000000000 RBX: 0000000000000e86 RCX: 0000000000000018
+ RDX: ffffffff8010e789 RSI: 00000100f57cb030 RDI: 00000102000a4780
+ RBP: 0000000000000001 R8: 0000010001072000 R9: 0000000000000040
+ R10: 0000000000000000 R11: 0000000000000008 R12: 0000000000000000
+ R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
+ ORIG_RAX: fffffffffffffffa CS: 0010 SS: 0018
+ #3 [10001073f48] cpu_idle at ffffffff8010e81c
+
+ PID: 6122 TASK: 101f3658030 CPU: 2 COMMAND: "gfs_quotad"
+ #0 [101f21efb20] start_disk_dump at ffffffffa03183ff
+ #1 [101f21efb50] try_crashdump at ffffffff8014cc1d
+ #2 [101f21efb60] die at ffffffff80111c90
+ #3 [101f21efb80] do_invalid_op at ffffffff80112058
+ #4 [101f21efc40] error_exit at ffffffff80110e1d
+ [exception RIP: do_dlm_lock+366]
+
+ ... snip ...
+
+From this email a ``location`` is provided ``/cores/processed/12345/02.11.09.08.44.19``
+and the server in which further analyzation can be continued ``x86_64.cas-server.com``.
+
+Normally from a support perspective this email should contain enough information
+for a kernel engineer to begin debugging the problem. Assuming more is needed
+the information provided previously will prove beneficial for anyone wishing
+to access this data.
+
+Logging into the stated server and changing into the directory defined several
+files are presented::
+
+ $ pwd
+ /cores/processed/12345/02.11.09.08.44.19
+ $ ls
+ 12345.log crash crash.in crash.out usr vmcore.12345
+
+``12345.log``: contains any informational messages presented during the processing
+of the core. Everything from informational to debug statements are provided here.
+
+``crash``: a script autogenerated to provide an automated way of gathering intial
+data from the coredump. ``Note`` if wanting to use this crash wrapper in a more
+manual approach some alterations to the script need to occur.
+
+crash wrapper in its original form::
+
+ #!/bin/sh
+ /usr/bin/crash \
+ /cores/processed/12345/02.11.09.08.44.19/vmcore.12345 \
+ /cores/processed/12345/02.11.09.08.44.19/usr/*/*/*/*/2.6.9*largesmp/vmlinux \
+ -s < /cores/processed/12345/02.11.09.08.44.19/crash.in
+
+In order to remove automation the redirect from crash.in needs to be removed::
+
+ #!/bin/sh
+ /usr/bin/crash /cores/processed/12345/02.11.09.08.44.19/vmcore.12345 \
+ /cores/processed/12345/02.11.09.08.44.19/usr/*/*/*/*/2.6.9*largesmp/vmlinux
+
+**Alternative to using the crash wrapper**
+
+It is possible to specify the vmlinux and corefile with crash on the command line::
+
+ $ crash /cores/processed/12345/02.11.09.08.44.19/usr/*/*/*/*/2.6.9*largesmp/vmlinux
\
+ /cores/processed/12345/02.11.09.08.44.19/vmcore.12345
+
+``crash.in``: a list of commands to be read into crash during the automated
+analysis::
+
+ bt -a >>/cores/processed/12345/02.11.09.08.44.19/crash.out
+ sys >>/cores/processed/12345/02.11.09.08.44.19/crash.out
+ log >>/cores/processed/12345/02.11.09.08.44.19/crash.out
+ mod >>/cores/processed/12345/02.11.09.08.44.19/crash.out
+ exit
+
+``crash.out``: output of initial crash analysis and the same data which
+is sent in an email if defined.
+
+``usr``: directory structure from the extraction of the vmlinux file
+from the associated kernel-debuginfo rpm for use within crash::
+
+ /cores/processed/12345/02.11.09.08.44.19/
+ usr/lib/debug/lib/modules/2.6.9-78.18.ELlargesmp/vmlinux
+
+``vmcore.12345``: corefile from which was either defined or extracted from
+a compressed archive during CAS initialization.
+
+Troubleshooting
+---------------
+
+Some of the major problems that arise when using CAS usually boils down to some
+improper usage of the compression and archiving tools.
+
+When compressing a core which may need to be sent over the network to a CAS server
+one of the proper ways to do so is::
+
+ $ tar cvjf vmcore.12345.tar.bz2 vmcore.12345
+
+Other various ways of compressing archive are as follows::
+
+ $ tar cvzf vmcore.tar.gz vmcore
+ $ gzip vmcore
+ $ bzip2 vmcore
+
+``Note``: please do not double compress or CAS will fail.
+
+Another issue, which isn't primarily a fault of CAS, are
+incomplete or corrupted cores. If either of these occur
+there is a chance that CAS will not be able to process
+the data needed to associate a debug kernel or do any
+sort of automated analysis. Unfortunately, there is not
+much that can be done to resolve these sort of issues
+other than verifying that the process which happens when
+a system coredump and when that dump reaches the
+system specified for retrieval is solid and are seeing
+no errors.
+
+Resources
+=========
+
+* `CAS Wiki <
http://fedorahosted.org/cas>`_
+* `CAS FAQ <
https://fedorahosted.org/cas/wiki/CasFAQ>`_
+* `Mailing list <
https://fedorahosted.org/mailman/listinfo/cas>`_
+* `Upstream releases <
https://fedorahosted.org/releases/c/a/cas/>`_
+* Checkout latest from Git, ``git clone
git://git.fedorahosted.org/cas.git``
diff --git a/lib/cas/db.py b/lib/cas/db.py
index 6f7e942..9c0cac6 100644
--- a/lib/cas/db.py
+++ b/lib/cas/db.py
@@ -1,14 +1,24 @@
-""" storage container """
+import MySQLdb
-class Core(object):
- def __init__(self):
- self.timestamp = None
- self.debugKernel = None
+class CasStorage(object):
+ def __init__(self, sql_server, username, password, database):
+ """ setup database connection and return db cursor for
+ traversing database """
+ self.db = database
+ self.sql_server = sql_server
+ self.username = username
+ self.password = password
+ self.cursor = None
-class CasServer(object):
- """ Handles the definition of cas servers for each architecture
- needing to be used for analyzing cores
- """
- def __init__(self):
- self.hostname = None
- self.architecture = None
+ def connect(self):
+ """ execute connection """
+ db_connect = MySQLdb.connect(host=self.sql_server,
+ user=self.username,
+ passwd=self.password,
+ db=self.db)
+ self.cursor = db_connect.cursor()
+ return
+
+ def getJobs(self):
+ self.cursor.execute("SELECT * FROM cas_jobs")
+ return self.cursor.fetchall()