Version 1 (modified by solj, 14 years ago)
The Bcfg2 Reporting System
The Bcfg2 reporting system collects and displays information about the operation of the Bcfg2 client, and the configuration states of target machines.
Goals
The reporting system gives administrators an interface to three kinds of information:
- Client configuration state, particularly aspects that do not match the configuration specification. Information about bad and extra configuration elements is included.
- Client execution results (a list of configuration elements that were modified)
- Client execution performance data (including operation retry counts, and timings for several critical execution regions)
This data can be used to understand the current configuration state of the entire network, the operations performed by the client, how the configuration changes propagate, and any reconfiguration operations that have failed.
Retention Model
The current reporting system stores statistics in an XML data store, by default at <repo>/etc/statistics.xml. It retains either one or two sets of statistics per host. If the client has a clean configuration state, only the most recent (clean) record is retained. If the client has a dirty configuration state, two records are retained: the last clean record, and the most recent record collected, which details the incorrect state.
This retention model, while not optimal, persistently records most of the data that users want.
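The retention rule described above can be sketched in a few lines of Python. This is only an illustration of the rule, not the code Bcfg2 uses: the `Record` and `HostHistory` names and their fields are hypothetical, and the real store is the XML file mentioned above rather than in-memory objects.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Record:
    """One set of statistics from a client run (hypothetical shape)."""
    timestamp: str
    clean: bool


class HostHistory:
    """Keeps at most two records per host: last clean and latest dirty."""

    def __init__(self) -> None:
        self.last_clean: Optional[Record] = None
        self.latest_dirty: Optional[Record] = None

    def update(self, record: Record) -> None:
        if record.clean:
            # A clean run supersedes everything: only this record is kept.
            self.last_clean = record
            self.latest_dirty = None
        else:
            # A dirty run replaces the previous dirty record but leaves
            # the last clean record untouched.
            self.latest_dirty = record

    def retained(self) -> List[Record]:
        """Return the one or two records currently kept for this host."""
        kept = (self.last_clean, self.latest_dirty)
        return [r for r in kept if r is not None]
```

Note that every intermediate dirty record is overwritten by the next one, which is why a full history of reconfiguration operations cannot be recovered from this store.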
Output
Several output reports can be generated from the statistics store with the command line tool bcfg2-build-reports.
- Nodes Digest
- Nodes Individual
- Overview Statistics
- Performance
These reports can be delivered by the following mechanisms:
- HTML
- RSS
Shortcomings and Planned Enhancements
When designing the current reporting system, we were overly concerned with the potential explosion in data size over time. To address this, we chose the retention scheme described above. This approach has several shortcomings:
- A comprehensive list of reconfiguration operations (with associated timestamps) isn't retained
- Client results for any given day (other than the most recent) aren't uniformly retained, so inter-client analysis is difficult, if not impossible
We plan to move to a database backend to address the dataset size problem and start retaining all information. The move to a SQL backend will allow many more types of queries to be efficiently processed. It will also make on-demand reports simpler.
Other sorts of information would also be useful to track. We plan to add the ability to tag a particular configuration element as security-related, and to include that tag in reports. This will help prioritize manual and failed reconfiguration tasks.
Capability Goals (posed as questions)
- Which machines have not yet applied critical updates?
- How long did critical updates take to be applied?
- What configuration did machine X have on a particular date?
- When did machine X perform configuration update Y?
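To make these questions concrete, the following is a minimal sketch of how two of them might be answered once every update is retained in a SQL backend. Everything here is an assumption made for illustration: the `hosts` and `updates` tables, their columns, the sample rows, and the helper function names are hypothetical and do not describe an actual Bcfg2 schema. The sketch uses Python's standard `sqlite3` module with an in-memory database.

```python
import sqlite3
from typing import List, Optional


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical reporting schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE hosts (name TEXT PRIMARY KEY);
        CREATE TABLE updates (
            host       TEXT NOT NULL REFERENCES hosts(name),
            entry      TEXT NOT NULL,   -- configuration element name
            applied_at TEXT NOT NULL    -- ISO 8601 timestamp
        );
        """
    )
    # Illustrative sample data only.
    conn.executemany("INSERT INTO hosts VALUES (?)", [("alpha",), ("beta",)])
    conn.execute(
        "INSERT INTO updates VALUES (?, ?, ?)",
        ("alpha", "Package:openssl", "2007-03-01T10:00:00"),
    )
    return conn


def hosts_missing(conn: sqlite3.Connection, entry: str) -> List[str]:
    """Which machines have not yet applied the given update?"""
    rows = conn.execute(
        """
        SELECT name FROM hosts
        WHERE name NOT IN (SELECT host FROM updates WHERE entry = ?)
        ORDER BY name
        """,
        (entry,),
    )
    return [row[0] for row in rows]


def when_applied(conn: sqlite3.Connection, host: str, entry: str) -> Optional[str]:
    """When did machine X first perform configuration update Y?"""
    row = conn.execute(
        "SELECT MIN(applied_at) FROM updates WHERE host = ? AND entry = ?",
        (host, entry),
    ).fetchone()
    return row[0]  # None if the host never applied the entry
```

Which entries count as critical is left to the caller in this sketch. Neither query can be answered under the current retention model, since it keeps no per-update history.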