########################
Shinken ChangeLog
########################

2.4.4 - 29/10/2022
-----------------
CORE ENHANCEMENT
Enh: (backport from Enterprise, internal ref #SEF-202) Big boost for the broker daemons, especially for very large setups and slow modules (yes Graphite, I'm looking at you).
Enh: (backport from Enterprise ref:#SEF-103) Slight boost of startup time by removing useless hash computation.
Enh: (Christophe Simon) Force problem/impact evaluation
Enh: (Christophe Simon) added strict object name conflict policy
Enh: (Christophe Simon) better objects balancing between schedulers (#1999)
Enh: (Christophe Simon) Maintenance checks (#1929)
Enh: (Nicolas DUPEUX) Add an option to dump configuration as built by shinken to a json file (#1954)
Enh: (Lionel Sausin) Add instructions to use --install-scripts in pip (#1890)
Enh: (Christophe Simon) extend duplicate_foreach to host/hostgroups (#1905)
Enh: (Christophe Simon) Made memory free an opt-in option
Enh: (Christophe Simon) Explicitly frees memory when receiving new conf
Enh: (Christophe Simon) Harmonized graceful restart
Enh: (David Durieux) Add support of proxy socks5 for shinken cli (#1583)
Enh: (Mateusz Korniak) Add info about exit error code when check finished with error/was signalled. (#1766)

CORE FIXES
Fix: (backport from Enterprise ref:#SEF-76) In some cases, the downtime depth could become "negative", and the element would then never exit from downtime.
Fix: Dailymotion team (Nicolas Perraud) found a way to bypass pickle.loads protection and execute code from a daemon if you can exchange with its internal port. Now we whitelist allowed classes only :)
Fix: (Christophe Simon) Enforced downtime state after retention load
Fix: (V.
D'AGOSTINO) Update about.rst to fix a typo
Fix: (Flavien Peyre) multiple notification ways when using a contact template (#1867)
Fix: (Dani Rodríguez) Update typographic errors and explain things in the help messages (#1968)
Fix: (Dani Rodríguez) brok queues not producing broks (#1971)
Fix: (David Gil) Pin CherryPy dependency < 9.0.0 (#1983)
Fix: (Vladimir Kazarin) Update dmz-monitoring.rst (#1980)
Fix: (Muhammad Zeeshan Qazi) #1961 : Replace #!/usr/bin/python with #!/usr/bin/python2 (#1979)
Fix: (Dani Rodríguez) Fixes calls to cProfile in shinken binaries (#1967)
Fix: (wilfriedroset) shinken typo (#1974)
Fix: (efficks) CherryPy >= 3 required (#1943)
Fix: (David Gil) osmatch in nmap discovery process (#1944)
Fix: (Konstantin Shalygin) modules_manager: ignore '.git' dir when loading modules. (#1883)
Fix: (Christophe Simon) http_client exceptions management
Fix: (Christophe Simon) OSX (darwin) platform detection
Fix: (Christophe Simon) unicode string parsing in shinken cmdline
Fix: (Olivier Hanesse) missing ini file for shinken.io
Fix: (Olivier Hanesse) #1857 Set default value for proxy/proxy_socks5 for shinken cli
Fix: (Olivier Hanesse) #1845 Add missing parameter for statsd
Fix: (Nicolas Le Manchet) Improve systemd units reliability
Fix: (Frédéric MOHIER) Iterate correctly in the Services object (#1842)

MISC FIXES
Fix: (Yann 'Ze' Richard) missing email header to prevent "Out Of Office" messages (#2014)
Fix: (se4598*) #1951: Email notification in text mode (#1986)
Enh: (Arthur Lutz) starttls and username/password for SMTP
Fix: (George Shuklin) trace on sending smtp messages
Doc: (rdmo) than -> as
Doc: (dodofox) specify host inheritance for contact/contactgroup (#1872)
Doc: (Elliott Peay) Fix default reactionner port in docs
Fix: (Konstantin Shalygin) Encode attachment. (#1817)
Fix: (Marc Remy) Typo: Update some doc files.
(#1821)

2.4.3 - 10/03/2016
-------------------
CORE ENHANCEMENT
Add: First version of an online profiling system, not enabled currently

CORE FIXES
Fix: Fix missing Services/Hosts in Livestatus (Close #1691)
Fix: initial_state mapping for hosts
Fix: Fix duration when a parent is down (Close #1779)
Add: Better logging for regenerator

MISC ENHANCEMENT
Enh: HTML mail notifications fix and improvements
Enh: Change TCP Connect scan to TCP Syn scan with nmap for discovery features
Enh: Various Documentation

2.4.2 - 01/10/2015
-------------------
CORE ENHANCEMENT
Add: Service excludes/includes/overrides extension
Add: Implementation of initial_state
Add: Broks when host/service downtime is scheduled
Add: Solaris SMF manifests
Enh: Allow white space between foreach elements
Enh: Increase default http thread_pool to 16

CORE FIXES
Fix: Duplicate Service from template using duplicate_foreach with the same name
Fix: Service_(includes|excludes) template recursion
Fix: Service description with multi level inheritance
Fix: Default business rule notification options

2.4.1 - 15/07/2015
-------------------
CORE ENHANCEMENT
Add: Safe Pickle
Add: Better Debian 8 Jessie support

CORE FIXES
Fix: Display_name when using duplicate_foreach
Fix: template definition loop did segfault python
Fix: Service Description inheritance when using several levels of inheritance
Fix: cpu looping for receiver

2.4 - 04/05/2015
-------------------
CORE ENHANCEMENT
Add: Hosts - Service_includes feature. Similar to service_excludes
Add: Deprecation in the doc about the discovery part. Should be moved to contrib or as a module in next versions
Add: (by Ddurieux) better logging of failed external modules
Enh: Actions - Catch stderr output
Enh: Arbiter - Unpickle new conf earlier.
Prevent a memory issue
Enh: Config - Remove sudo from restart/reload commands
Enh: Daemon - Factorization of get_objects_from_from_queues
Enh: DaemonsLinks : Move files to object directory
Enh: Datamanager - Function get_contactgroup
Enh: Doc - A lot of typos and improvements!
Enh: Init Scripts - use readlink to avoid failure on symlinks
Enh: Modulesmanager - Reworking module loading, now try to import in 3 different ways
Enh: Pep8 : Shinken is now pep8 compliant (enforced by Travis) except for 3 specific rules
Enh: Scheduler - Use iterator instead of list when necessary. Reduce memory consumption
Enh: Setup - Refactoring to be virtualenv compatible
Enh: Tests - Stabilization
Del: Check Shinken script. It has its own repository
Del: Windows files from 1.4 version

CORE FIXES
Fix: Config Parsing - Arbiter module auto creation
Fix: Config Parsing - Infinite recursion loop in template
Fix: Config Parsing - Inheritance of custom variable
Fix: Config Parsing - Raise error properly in ArbiterLink object
Fix: Config Parsing - Type properties in Contact, NotificationWay and Arbiter objects
Fix: Escalations - escalated parameter in notification object and macro $NOTIFICATIONISESCALATED$ are set correctly
Fix: Event Handlers - Global event handlers parsing and launching
Fix: External Command - Internal argument for CHANGE_RETRY_HOST_CHECK_INTERVAL command
Fix: Realm - Bailout when there are two default realms
Fix: (reported by: Stéphane Loeuillet) init.d - Unset http(s)_proxy env variables in the init.d script (curl was using them by default)
Fix: (reported by: Arthur Lutz) CLI - manage shinken install call without arguments
Fix: (reported by mohierf) STATS - fix statsd format.
Fix: (reported by: Stéphane Loeuillet) Config Parsing - manage a trailing ',' at the end of members list (hostgroups & contactgroups)
Fix: (reported by: Stéphane Loeuillet) Config Parsing - host check_period was mandatory with some parameters; a missing one now means an always-true period.
Fix: + char in servicegroups

2.2 - 16/01/2015
-------------------
CORE ENHANCEMENT
Add : Bottle - HTTP_X_FORWARDED_PROTO support (for WebUI)
Add : BpRules - Expands bp_rules t flag from tags
Add : BpRules - Use standard macros expansion in bp_rules output
Add : Broks - new type unknown_check_result_brok (included in the previous add)
Add : CLI - update command for the shinken.io install
Add : Contact - new parameter expert. For UI purpose
Add : Escalations - Time properties for host and service escalations
Add : External Command - Implement SCHEDULE_HOSTGROUP_HOST_DOWNTIME and SCHEDULE_HOSTGROUP_SVC_DOWNTIME
Add : External Commands - Reload-config function
Add : External Commands - support for reset modify attribute
Add : HTTP API - get_start_time call returning timestamp
Add : HTTP API - Inner stats can be sent to shinken.io
Add : LogEvent - Output key to properties dict
Add : LogEvent - Pattern for FLAPPING events
Add : Receiver - Accept passive results from host or service not in configuration
Add : Realms - Handle properly multi level of realms
Add : SchedulingItem - callbacks for HOSTDOWNTIME and SERVICEDOWNTIME macros
Add : SchedulingItem - Parameter to remove host-service dependency
Add : Service Escalation - wildcard service_description
Add : Service - method get_service_tags (for WebUI)
Add : Snapshots - Feature to get more output when we have a bad state
Del : Move WebUI image and template to the good repo
Del : Pack Distribution tuning (hosts were fixed on the same scheduler between reloads) - causing more damage in load balancing than the problems it solved
Enh : CLI now exits with a real system return code when it fails to install
Enh : Config - Module name formatting in config files
Enh : Config parsing - Refactoring to pythonize at object creation. Attributes have the correct type very early.
Enh : Config parsing - Refactoring to reduce parse time, and scope code to ease future evolutions
Enh : Daemon - Refactor bailout and edit message
Enh : Datarange - Better datarange parsing
Enh : Daemon - additional call to os.setgroups to drop privileges
Enh : Doc - A lot of typos and improvements!
Enh : External Command - Moved manager initialization to init
Enh : Imports - use importlib when available to import module properly.
Enh : Init scripts - Refactoring script. Add LSB standard functions
Enh : Logger - Only use time rotated file handler for regular files
Enh : Logger - Refactoring logger class to inherit from standard logger class.
Enh : Logger - Use lazy logger syntax to enhance performance
Enh : Perfdata - Better internal parsing
Enh : Realm - Code refactoring
Enh : Scheduler - use a lock to access put_result
Enh : SchedulingItem - use return_code attribute to remember plugin return code so that it is available in broks.
Enh : Tests - better Jenkins and Travis integration and Coveralls support
Enh : Tests - Code cleaning
Enh : Tests - Use assert function from unittest2
Enh : Triggers - Try except statement when executing them

CORE FIXES
Fix : BpRules - '+' was causing segfault in some cases
Fix : Broker - Crash when sub process goes down
Fix : Config parsing - Unknown members list is common to all items
Fix : Daemon - Proper https URL
Fix : Daemons - Timeout and data timeout for scheduler connection
Fix : Daemon - When replacing a daemon kill all processes in the group instead of one pid
Fix : Datamanager - Service totals and host totals computation
Fix : Discovery - Linux hosts with recent nmap versions
Fix : HTTP API - Code stabilization
Fix : HTTP API - Realm not displayed in get-all-states call
Fix : Init scripts - Use last python version
Fix : Logs - Manage pre-daemon logs.
Fix : Macroresolver - Move output_macros definition to avoid error in WebUI (eltdetail template)
Fix : Notifications - Master notifications must not be sent to reactionner so that we scatter them.
Fix : Passive reactionners are now working
Fix : Scheduler - Pack size for multi-schedulers environments. Reduce memory consumption and boot time in big setups.
Fix : Scheduler - Segfault with circular service dependency loop
Fix : Scheduler - Skip compensate if daemon is not init
Fix : SchedulingItem - Bad behavior for use_aggressive_host_checking.
Fix : SchedulingItem - Up state for a host generates a recovery if necessary
Fix : SSL - Crash under Windows
Fix : Timeperiods - remove bad modulo calculation for months
Fix : Sub-sub-realms were not assigned to higher realms brokers/reactionners
Fix : Add callbacks for HOSTDOWNTIME and SERVICEDOWNTIME macros

2.0-RC - 10/02/2014
---------------------
CORE ENHANCEMENT
Enh : Replace Pyro layer with HTTP(s) communication layer, still the same ports of course
Enh : (huge) Business rules enhancement (parsing level, output and tag/group linking) (thanks to Christophe Simon)
Add : Shinken command line interface (cli) to install packages (module and packs) and serve doc
Enh : Split configuration files into directory for daemons and objects
Add : service_check_timeout : customize exit status
Enh : Ensure LSB path for daemons and default paths
Del : Nagios(r) references from test dir
Enh : Test coverage : Tests now have a report per file and a global percentage of code coverage is available
Enh : installation is now possible with "pip install shinken"
Del : remove the python 2.4/2.5 compatibility. Now shinken needs >= 2.6
Add : definition_order parameter for objects order twins remove, only active for services
Enh : remove lot of useless configuration file samples
Enh : do not bail out at startup if a host service is not having a valid host, just raise a warning now
Add : big doc grok from wiki into rst files into /doc.
Thanks guys for this! compile with sphinx
Add : (from Christophe Simon) Added bp_rule child notification opts management
Add : (from Christophe Simon) service_overrides attributes to hosts
Add : (from Jean-charles Delon) manage duplicate_foreach and event_handler
Add : (from Christophe Simon) Added reversed (negative) xof business rules
Add : (from Christophe Simon) percentage support to Xof: business rules
Add : (from Jan Ulferts) enable_environment_macros and disable_environment_macros to commands
Add : (from Christophe Simon) service excludes feature

CORE FIXES
Fix : adjust_worker_by_load was wrong, worker was not created again
Fix : crash 'c.t_to_go is None' in scheduler when time change
Fix : Arbiter parsing (Catch bad service_dependencies, timeperiod definitions)
Fix : Hosts over-checked when services are critical (don't respect retry_interval)
Fix : Prevent from returning next time notif in the past (don't over notify).
Fix : allow the "__ANTI-VIRG__" substring in the configuration files
Fix : do not send notification after a downtime to a contact that has been notified if notification_interval = 0
Fix : (reported by : Aina Rakotoson) notification logic failed when reactionners with different reactionner_tags
Fix : issue with some discovery rules with multi-keys
Fix : set the python daemons into unbuffered stdout mode by default, centos guys will love it.
Fix : some fix on the contact-notifway inheritance
Fix : (openglx) bin/nagios : forces -v argument to be used as config file
Fix : (from Christophe Simon) impacts/problems retention loaded services
Fix : (reported by:fitzdsl) a spare that got a bad configuration could crash when trying to send a configuration it does not have during master failback

MODULES ENHANCEMENT
Del : Move modules to their own repository!
They are now in https://github.com/shinken-monitoring

MODULES FIXES
Enh: Bunch of fixes on the most used modules (livestatus, graphite, ws arbiter, ) and test cases added, look at each module for details

PACKS ENHANCEMENT
Del : Move packs to their own repository! They are now in https://github.com/shinken-monitoring

1.4 - 27/05/2013
------------------
CORE ENHANCEMENT
Add : Change Vsphere SDK Version
Add : new option for realm : broker complete links, so now a scheduler can give broks to several brokers instead of only one.
Add : now the arbiter can take its arbiter lookup name from parameter. Used to allow the same host to run arbiter in HA for tests.
Add : macro_modulations objects, to dynamically change the macro based on the timeperiod. Can be useful as check_modulations, but easier to use, so maybe they will simply replace them
Use persistent cookies by default
Add : (nerocide) now KEY are expanded on the service_dependencies property.
Add : first version of user/system time catch for checks.
Add : check_modulations objects.

CORE FIXES
Fix : (reported by:Georges Racinet) lot of SSL errors with Pyro3. Fix #839
Fix : bad offline receiver was too verbose. Fix #559
Fix : (reported by: Beuc) : bad discovery script name. Fix #843
FIX : #613 macros was overriding modules list instead of just adding a module. No bug at all just a misleading configuration in macro
Fix : the mysql python module installation on redhat/centos
FIX : Jenkins tests + autogeneration of all_tests based on py file in the tests directory
FIX : Jenkins test scripts : The loop skips empty lines
Fix : (reported by: miluebbe) catch bad protocol exception in daemons, like a telnet connection. Fix #829
Fix : Timeperiod calculation
Fix : (reported by:darthelmet) bad NOT management for bp_rule and complex expression. Fix #827
Fix : remove useless print.
Fix : (reported by:sebguilbaud) bad URL for check_hpasm.
Fix : try to fix the no log error for scheduler and log level. Need a better solution.
Fix : Check if a poller exists to handle services tag. Raise error if not
Fix : (reported by:boardstretcher) fix #814 about regression since last SSO related patch.
Fix : get_all_host_names_set functions since 1.2.4.
Fix : fix #509 now bp_rules can manage multi-levels rules with ()
Fix : Missing quote in host state notification e-mails
Fix : (reported by: Johan Svensson) poller worker crashes are now put in the log.
Fix : (reported by: flaf and Imrane.Dessai) bad additive inheritance with multiple templates.
FIX : MySQL-python complaining about outdated version of distribute
FIX : (by :Alexandre Veyrenc) graphite module now manages perfdata with space in names

MODULES ENHANCEMENT
Add : Sqlite to Mongodb logstore converter
Add : ws_arbiter now supports multiple input values. The ws_arbiter module can now accept a list of metrics to process. See:http://www.shinken-monitoring.org/forum/index.php/topic,846.0.html
Add : remote_user_variable from wsgi environment, to use set 'remote_user_enable 2'
Add : collectd header in shinken-specific.cfg
Add : proper support for counter/derive/absolute datatypes, by taking the derivative. Removed debug logging.
Add : support for collectd messages. Added support for more than one value in VALUES collectd part. Fixed graphite broker to nicely deal with multiple values. Fixed graphite graphs to support multiple values by sh
Add : new arbiter module, FileTag, that will tag a host if it is present in a flat file.
Add : support for high resolution time/interval (collectd 5.0), and also added support for derive and absolute types
Add : AWS arbiter module, to import your hosts from AWS/EC2.
Add : (reported by: joachim schiele) update the PNP version to the latest one.
Enh : add an optional list of hostnames to ignore for iptag module
Add : example client for zmq_broker.py
Add : A ZeroMQ pub socket broker for shinken

MODULES FIXES
Fix : (reported by:Benoit Dunand-Laisin) livestatus command management with json option.
Fix : (reported by:pepejey) void password in ldap auth module. Fix #822
Fix : make the NDO synchronize_database_id option 1 by default to avoid single scheduler mode bug. Make the module far less verbose too.
Fix : (reported by:Johan Svensson) nrpe booster module now manages more than 1024 simultaneous connections by switching from select based poll to a real poll call.
Fix : mongodb fsync as a parameter
Fix : invalid variable name causes nrpe_poller crash when receiving SSLError exception
Fix : (reported by : igup) logstore_mongodb did not manage the class filter.
Fix : Livestatus and pnpgraph_present

UI ENHANCEMENT
Add : New impact panel design
Add : jquery triggered hover effect
Enh : layout aligned
Add : font color class
Add : Update Font Awesome 3.0.2
Add : auto refresh function to view
Add : Screenfull lib
Add : WebUI start of a geomap, imported from https://github.com/darkweaver87/alertsmap with Remi Buisson auth. Still work to do, for centring the map (wtf....) and give real data.
Add : [mq]: flupscgi.patch
Add : fullscreen dashboard menu
Add : WebUI now manages aggregation in the dep trees, reducing the number of visible elements.
Add : WebUI Glances view now raises a real error message when it fails to connect.
Add : WebUI /detail view now saves the tab on the location hash, so a refresh will put the user in the good tab.
Add : WebUI can now load an additional path for plugins that can override the default ones.

UI FIXES
Fix : Font color / dashboard
Fix : get back refresh on dashboard, so widget edit works again.
Fix : WebUI /impacts now displays root problems even if there are lots of them.
Add : Tooltip to overview bar / eltgroup
Enh : Enabled refresh disable function / fullscreen
Enh : Change tab names / system log
Fix : #776 use "expires" of simplecookie (of bottle.py) Be careful, "expires" uses String and not Datetime nor TimeStamp see : http://stackoverflow.com/questions/7913169/python-bottle-persistent-cookie-not-work
Fix : #779 : sorting fathers before printing
Fix : wrong btn alignment for Host Detail Impact
Fix : Parents weren't displayed in webui
Fix : Forced checks are now handled

DEFAULT PACKS
Fix : disable the PickleRetentionArbiter and PickleRetentionBroker by default. Very advanced modules that should not be used without
Fix : currently removing skonf from standard install. If people want to TEST it, they will have to go to the init.d script. This will remove all script-kiddies from testing it and cry because it doesn't work. Dev are
Add : properties to ESX Host and ESX VM in discovery_runner
Add : collectd_disk.trig example
Fix : remove _fs_custom stuff
Fix : Modified collectd pack in accordance to proper naming as well
Fix : typo in service Disks of linux pack
Add new check_disk_custom for linux pack as sample for multi-value duplicate_foreach statement
Fix : MsSQL pack Typo
Add Host Macros for Windows Pack
Fix : type in _MSSQL_CONNECTION_CRIT
Fix : type in WARN macros
Added Macro definitions in mssql.pack

1.2.4 - 11/02/2013
------------------
MODULES FIXES
* Fix : LS module crash when bad socket close on the client side.

1.2.3 - 30/01/2013
------------------
CORE FIXES
* Fix : (Romain Forlot) wrong plugin owners after installation
* Fix : (Romain Forlot) add mysql-client for Debian when installing the server.
* Fix : (Romain Forlot) Make arbiter last after all daemons except skonf because of issue 647
* Fix : (reported by: opc and ppj) get back the arbiter spare management.
* Fix : Manage sudoers to use nmap by discovery process
* Fix : Set the HOME var to use the home of shinken, instead of the root's.
* FIX : #689 add sudo prerequisite
* Fix : Poller handling of abnormal plugin failures. Non-english shell stderr is captured too
* Fix : Honor check_for_orphaned_* config params + add time_to_orphanage support for service and host
* Fix : bad_start test for environments where groupname != username
* Fix : Daemon timeout and receiver module options descriptions
* Fix : bad import in shinken-admin
* Fix : add default pack_distribution_file to nagios.cfg file, so it fixes #576
* Fix : (reported by: ethtricks) manage hosts macros addressX. Fix #640
* Fix : (reported by: Rémi BUISSON, fixed by Olivier Hanesse) the "too much notification" bug after a restart, and also a potential bug for hosts downtimes overlap.
* Fix : (By : Olivier Hanesse) schedule times loaded from retention were used even if the time was past, which launched a bunch of useless checks.
* Fix : (from rdumonnet) and remove alias for generic-host, so the value will be used from real hosts instead.
* Fix : more Pyro exceptions handling
* Fix : new way of parsing the complex expressions for hostgroups and templates. Fix #676

CORE ENHANCEMENTS
* Add : (Jean-francois BUTKIEWICZ) Windows installation for shinken, using .net services instead of instsrv.
* Add : (Romain Forlot) distro support to install notify_by_xmpp
* Add : (Denetariko) Added pnp4nagios, multisite and nagvis add-on's config folders into restore function (prior version didn't restore them)
* Add : (from : Joel Ramat) include the -property feature in discovery.
* Add : making orphanage time configurable
* Add : first version of a Trending broker module, and libexec scripts.
* Add : auto-magically add the shinken root dir parent to the PYTHONPATH environ variable in the daemons if the shinken lib is not found.

MODULES FIXES
* Fix : Canopsis module - File handler and os path join
* Fix : LiveStatus Correct bug with indexed access connected to issue #648.
* Fix : let livestatus handle display_name correctly
* Fix : (from Nicolas Pichon) Graphite, URI generation when using datasource, _GRAPHITE_PRE and _GRAPHITE_post

MODULES ENHANCEMENTS
* Add : (from Thibault Cohen) realm and poller_tag attributes in Livestatus module
* Add : mongodb replica set management in the scheduler retention module.
* Add : hot host dependencies with libvirt
* Add : Better handling of Carbon connection.
* Add : (from Nicolas DUPEUX) SO_REUSEADDR to nsca sock to allow fast stop/start

UI FIXES
* Fix : WebUI manage service description with / for actions.
* Fix : skonf stop script
* Fix : (reported by:sebguilbaud) fix #668 : was not managed in webui for names.

UI ENHANCEMENTS
* Add : (from : Caez) webui,auth manage the $1$ htpasswd md5 way.
* Add : (From Frescha) shinken informations
* Add : (from Frescha) Logout link moved to admin submenu and submenu moved to the right
* Add : (from Steve Schnepp) dashboard/currently, use a 24H clock format
* Add : "Beta" badge
* Add : caps lock detection login form

DEFAULT PACKS
* Fix : discovery for safekit cluster
* Add : postgreSQL monitoring pack
* Add : first part of tomcat pack. This pack needs a proper installation of jmx4perl on shinken server (or poller) and a proper installation of jolokia

1.2.2 - 22/10/2012
------------------
CORE FIXES
* Fix : bug in scheduler when next check was impossible.
1.2.1 - 22/10/2012
------------------
CORE ENHANCEMENTS
* Add : Landscape (ubuntu SaaS) arbiter importer module

CORE FIXES
* Fix : (reported by: Yannig Perre) bad contacts/notifways notif periods were not caught
* Fix : Realm objects were not being stripped when hostgroups were used
* Fix : (reported by : Rémi BUISSON) there were still realm objects on sub-conf hostgroups objects
* Fix : Issue #588 Graphite Templates formatting
* Fix : (from Nicolas Boos) output an error at setup if the lsb_release
* Fix : make service property service_dependencies take a void hostname
* Fix : broker failed to initialize new modules when it got a new configuration
* Fix : Restore Python 2.4 compatibility for test suite
* Fix : (reported by: Dravail) bad install for PNP in install script
* Fix : Cleanup shinken-specific-* files
* Fix : (from : xkillian) less time latency between checks, so perfdata modules won't lose some checks anymore

MODULE FIXES
* Fix : (Reported by pushou) graphite_broker typo in logger call
* Fix : a bug in livestatus "GET log" for python 2.4 (no partition method)
* Fix : No override mongo replication state levels
* Fix : Correct livestatus commands table. Fix #520
* Fix : (Reported by igup) Livestatus Issue 628 UTF8 not decoded in query matching ==

UI ENHANCEMENTS
* Add : Business impact Filter
* Add : Multiple action URLS (Thanks to h4wkmoon)
* Add : (Thanks GAULUPEAU Jonathan) Webui, move the show/hide tool bar button

UI FIXES
* Fix : Dynamic reload of widget after save
* Fix : bug in active_directory_ui for unknown users
* Fix : typo in javascript for bookmarks
* Fix : Bookmark Business impact Filter
* Fix : Issue #570 gear.png

DEFAULT PACKS
* Fix : (Thanks jhurliman) check_mongodb invalid option -C
* Fix : Missing IMAPS/POPS ports
* Fix : update check_nwc_health install to the 1.3 version.
* Fix : Typo in Packs.tpl and accept packs with no services
* Fix : invalid nmap parameter

1.2 - 31/08/2012
------------------
CORE ENHANCEMENTS
* Add : Skonf daemon, as a configuration UI over the discovery lib
* Add : Triggers to the scheduler. This code is launched after each "check"
* Add : issue #349 (from David Guenault) Installation script now has a backup repository for installing MK Multisite.
* Add : option to skip mongo installation. Useful for poller installation that does not need mongo to be installed.
* Add : daemon_enabled option for ini and nagios.cfg files to disable a daemon from running.
* Add : (by Hartmut Goebel and xkilian) huge work to replace old log code by a logging based one, print to logger, adjust levels, messages, typos
* Add : (by Hartmut Goebel) huge code cleaning
* Add : install script available at the install.shinken-monitoring.org address.
* Add : make the daemon and the module names appear on the process line (setproctitle mode)
* Add : receiver can now send commands directly to schedulers. (direct_routing)
* Add : HOSTREALM macro for host/service.
* Add : 'address' directive into generated host by discovery
* Add : (reported by : Mathias Fussenegger) enhance the README, to just put install script way. Put other methods in the packagers and dev files.
* Add : reload option for the init script. Fix #385
* Add : install script: trap signals INT, TERM with a cleanup function
* Add : USR2 signal management for the scheduler to dump objects (checks, notification, event handlers and broks)
* Add : Reorganize Configuration file in sections. More consistent layout. Clarify the role and scope of shinken-specific.cfg.
* Add : Use the new folder "graphite" for graphite templates. Existing templates should be prefixed with .graph, and moved to that folder.
* Add : the not list for services generators. So the primary list can be automatically generated, and the user can just maintain the not one.
* Add : (from Lars Hansson) manage Apache2 htpasswd crypt pass.
* Add : Allow running shinken satellites in a NATted environment
* Add : shinken-packs to easily create .zip pack files
* Add : Hostd daemon, for running community.shinken-monitoring.org website
* Add : multi-layers discovery, with rules on runners.
* Add : discoveryrule_order so the admin can manage the template order more easily than with the order in the file
* Add : support for Linux Mint
* Add : nmap discovery speedup, and skonf scan part.
* Add : support to explode_hostgroup directive
* Add : new external command, PROCESS_SERVICE_OUTPUT and host, same as check_result for passive, but with no return code.
* Add : configurable logging level (per daemon)
* Add : shinken-admin can now change logging levels during runtime (Guillaume Bour)
* Add : now passive checks take their timestamp from the external command instead of the received time.
* Add : pre-serialize the configurations in the arbiter before going in daemon, and so kill the previous instance. (startup boost)
* Add : brok time part boost, with less useless objects sent, like contacts, timeperiods or commands. (big startup boost)
* Add : host pack dispatching history is saved in a pickle file, so we will be able to push same objects in the same scheduler
* Add : arbiter -p options for dumping profile
* Add : (thibault cohen) Added hook in Scheduler daemon to permit the use of the Shinken SNMP poller module, SnmpBooster.

CORE FIXES
* Fix : (from raphaeltr) Typo in mysql monitoring pack command arguments
* Fix : Fix minor installation issues on centos/redhat. Mainly on services activation and pnp
* Fix : escalation issue when we restart the scheduler during the escalation process. Should fix #525
* Fix : potential endless loop on bp_rule + parent relationship, and problem/impact setting. Fix #198
* Fix : bad characters in external commands make logging not happy at all.
Fix #498
* Fix : bad start for arbiter with Pyro4 and enable_ssl=1.
* Fix : issue #272 where the shinken user home was forced to be in /home, which made some systems not eligible to install.
* Fix : Pyro 4.14 issue: we now remove MSG_WAITALL everywhere, since it was useful only for the first Pyro4 version. Fix #345
* Fix : Update format of log message to be less prone to breaking
* Fix : $USERn$ macros aren't published in env anymore
* Fix : Nconf installation : apache was not restarted after dependency installation.
* Fix : (reported by : D.Durieux) remove the self.process link for a killed action, because such an object can't be pickled.
* Fix : spare arbiter now takes the lead if the master is never started. Fix #231
* Fix : (reported by Sven) the configuration loader was not following symlinks. BEWARE : on python 2.4 and 2.5 it is NOT managed.
* Fix : (reported by : Sven) space in hostgroups problems. Fix #452
* Fix : (reported by : Nicolas DUPEUX) : bad DISABLE_HOST_FRESHNESS_CHECKS and ENABLE_HOST_FRESHNESS_CHECKS code fix #376
* Fix : (reported by : Chettor) #342 comments from downtime and ack were not put in the global list, and so not removed
* Fix : catch a case where service dep got no hostname nor hostgroup_name.
* Fix : issue with retention and notification that made notifications not sent.
* Fix : (reported by banderas07, analysed by : Radu Gheorghe) : #401 escalation notification_interval issue
* Fix : disable 'FLAPPINGSTART', 'FLAPPINGSTOP', 'FLAPPINGDISABLED' notifications during downtimes.
* Fix : issue #404 to treat Tracebacks correctly for Pyro3 and Pyro4.
* Fix : Op5 guys changed the name of check_esx3.pl to check_vmware_api.pl so now we host a copy of the plugin to work around problems
* Fix : (Reported by : Morkxy, fixed by : Lars Hansson) change deprecated md5 by hashlib.
* Fix : add a check_type attribute for checks
* Fix : remove illegal char from service_description when duplicated
* Fix : wrong name for mongodb debian packages
* Fix : CHANGE_HOST_CHECK_TIMEPERIOD applies to a host, not a service
* Fix : (from : CHAMLEY Stephane) add notifications_enabled to the retention data.
* Fix : (reported by : Nicolas DUPEUX) #244 missing from_q property for regenerator.
* Fix : (reported by : Akiooutori) fix #325 (missing SOCK_REUSE in some pyro versions).
* Fix : (From : Michael COQUARD) setup.py build with simple user.
* Fix : missing mongo dependency for debian like distro < 6
* Fix : mongodb 10gen repo configuration path was not found on redhat/centos

MODULE ENHANCEMENTS
* Add: (from: xkilian and Thibault Cohen) Graphite broker supports pickle protocol, bulk updates and buffers output when it can't connect to Carbon daemons
* Add: (from: Thibault Cohen and ???) Graphite broker replaces invalid characters by _ and supports negative values
* Add: (from: xkilian and Thibault Cohen) Graphite broker adds _PRE_GRAPHITE and _POST_GRAPHITE host or service macros to each dot delimited data name
* Add: (from: Claneys Skyne) Graphite broker creates additional data points for warning and critical thresholds per data value
* Add: (from Romain Forlot) Graphite broker Compile regex
* Add: (from: bclermont) Graphite broker add graphite_data_source variable between hostname and service name in data to Graphite
* Add : a read_only option to logstore_sqlite
* Add : Finalize the time filter to avoid loading all the logs database in use_aggressive_sql=0 mode
* Add : Added more hints for the livestatus query metadata resulting in massive speedup for Nagvis, Multisite, Thruk
* Add : Creating new authentication module for OpenLDAP directory service
* Add : Most modules no longer raise a Warning if a module requirement is missing (like python-ldap for active directory, etc) but instead raise a real Exception
* Add : Collectd template with trigger sample.
* Add : First working version of Canopsis broker module
* Add : Use ujson optional import to improve LS output performance
* Add : NDO add table prefix as a parameter.
* Add : (Thibault Cohen) Livestatus will now only store STATE messages and no longer any general logs in logstores
* Add : (Nicolas Dupeux) TSCA module documentation updated and files consolidated in modules sections of the code

MODULE FIXES
* Fix : issue #450 (from vaxvms) removed thrift broker, similar to Livestatus, as it is unmaintained.
* Fix : MerlinDB for Ninja UI, removed documentation and marked as deprecated, as it is unmaintained.
* Fix : (from : Frédéric Pégé) put the sqlite db files in a good default place, in devel mode or init.d mode.
* Fix : the program_start attribute in livestatus status table
* Fix : (reported by : Nicolas Dupeux) remove a useless method in logstore_sqlite, and add a protection against endless loops. Fix #374
* Fix : Livestatus allowed_hosts option did not work
* Fix : Call the right function for !=~ LQL filter (livestatus)
* Fix : do not push back monitoring status in GLPI database when the host configuration does not come from GLPI
* Fix : a bug in livestatus module and add the attribute host.event_handler (needed by Thruk)

UI ENHANCEMENTS
* Add : UI /detail full rewrite. Thanks Andreas!!
* Add : UI new dashboard!
* Add : (from Mael Vincent, julien pilou, Gael Millet, Damien Mathieu and hugo viricel) new /mobile part!
* Add : Enable easy activation of application monitoring performance feature. Use ./install --enableeue to enable eue feature
* Add : WebUI in a /problem page a user can now press the Shift key and hover over the selection ticks to select them. It makes massive selection easier
* Add : UI sqlite backend to save common preferences
* Add : the page to save common prefs
* Add : webui, /depgraph, now add a loop and loop_time option, to make the graph root change between important elements.
* Add : sound on new non-acknowledged IT problem (disabled by default)
* Add : UI /detail now dep graph is also printed on a tab, so this feature will be more visible.
* Add : UI /problems add the downtime filtering.
* Add : UI /problems filtering by ack state.

UI FIXES
* Fix : (reported by : Vincent Riquer) WebUI do not print 1970 for 0 time, but N/A instead. issue fix #241
* Fix : (reported by : xkilian) WebUI would not display names containing dots.

DEFAULT PACKS
* Add : discovery packs for ibm storage management and beginning of clustering solutions discovery
* Add : cluster discovery for Safekit and HACMP.
* Add : snmp plugins for AIX to monitor VG, inode, HACMP and Safekit
* Add : first 2nd level real discovery script, here for listing windows shares, like shares or printers. It auto-hides the $ ones.
* Add : (Camille Vacquie): AIX service pack
* Add : SNI support to https and https certificate checks
* Add : pack/os/linux - check_linux_network_usage - use last_perfdata to determine bandwidth
* Add : commands for notification by XMPP (Jabber)
* Add : a basic rsync check template in the network/services pack.
* Add : macros for Basic HTTP authentication in the HTTP(S) services templates.
* Fix : The check_mysql_health plugin needs a : sign after the thresholds for greater is better metrics.
* Fix : generic corrupted block oracle database check
* Fix : call sudo to execute SMcli into IBM_DS plugins

DOCUMENTATION
* Add : (from xkilian, MathieuMD, Claneys Skyne, H4wkmoon, etc.) Wiki documentation in-depth renewal of Getting Started pages for new users
* Add : Monitoring Pack documentation, modules, installation, FAQ, diagrams, scaling shinken, dev+user resources, etc.
* Add : dummy_broker module example.

1.0.1 - 13/03/2012
------------------
CORE FIXES
* Fix: recovery contacts were lost after a restart
* Fix: allow } in commands if it's not alone in the line
* Fix: (from: Laurent Ollagnier) typo on mysql pack
* Fix: issue #214 (shinken-admin catches backtrace for lost connections)
* Fix: inheritance with +
* Fix: ticket #213 This will manage cases where an active satellite dies and there is no spare. It's useful for pollers where spare is not so interesting.
* Fix: (reported by: H4wkmoon) time change compensation led to a scheduler crash when objects don't have check periods or notification periods.
* Fix: manage LARGE output from sub commands (>64k). Only works under Unix for now, because it uses the fcntl module
* Fix: (reported by: Unicow) bug in escalations, there were too many escalations raised
* Fix: (from: Arthur Gautier) fix buildpath
* Fix: (From: Thomas Meson) Various fixes in setup.py and add missing files to build the package

MODULE ENHANCEMENTS
* Add: many GLPI module enhancements, with new properties managed

MODULE FIXES
* Fix: check_esx3 install under ubuntu 32bits edition
* Fix: multisite layout, add custom links
* Fix: workaround problem with official pyro3 under Debian squeeze. Use pyro4 instead of the official one
* FIX: #210 added test of parent target folder. if it does not exist then create it
* Fix: HTML in sendmails
* Fix: a bug in livestatus attribute action_url_expanded.
* Fix: (reported by: Jonathan Gaulupeau) ip_tag was launched too late to access check_command objects.
* Fix: (By: zllak) ndodb_mysql_broker module was not working on python2.5
* Fix: Fix a bug in livestatus Service.contact_groups

UI FIXES
* Fix: Sort alphabetically in WebUI

1.0 - 28/02/2012
------------------
CORE ENHANCEMENTS
* Add: shinken.sh script and skonf.py with a LOT of features for setup and configuration management.
* Add: better logging for warnings for host, service and npcd configuration
* Add: poller daemon will use all available CPUs by default
* Add: poller daemon can now load standard external modules like namedPipe to get external commands from it
* Add: brok queue watchdog for satellites, activated for Broker by default at 100000 broks to kill/restart the module.
* Add: if the run/var directories are missing, try to create them in the init.d script.
* Add: service templates can now be applied on a host template
* Add: reactionner can run under android
* Add: android SMS module
* Add: implemented a check for dead queue threads. When detected, restarts the thread
* Add: mail python script
* Add: better error text when the pickle load gets an error
* Add: manage connection timeouts in the broker and Pyro4
* Add: $$ are now interpreted as $ in macro solving
* Add: (reported by: sduchesneau) check_shinken.py now has a timeout. It is set by default at 10s
* Add: business rules now support the NOT operand
* Add: send broks in external modules as a biglist instead of n broks. It's more efficient (X3 perf for this, can be huge time for huge conf)
* Add: templates for monitoring common devices and services are now included
* Add: manage \; to be changed into ; without cutting the rest of the line to be removed as a comment
* Add: (reported by: foobar1111) arbiter only matched hostname, and not fqdn. Now it will try fqdn and then the hostname.
* Add: retention is now enabled by default at installation time (Pickle)
* Add: (From Michael Leinartas) man pages updates
* Add: '+' in discovery rule management, so you can 'add' something, like a template
* Add: using an undefined template in the configuration is no longer a critical error, just a warning. So the user can just 'tag' hosts first and then create the template
* Add: Sebastien Coavoux conducted a large code review!
  :)
* Add: xkilian reviewed and corrected many many pages of the wiki documentation and a few english fixes in the distribution :)
* Add: skonf daemon preview. It is not yet meant for production, but is included to get feedback
* Add: mongodb insert capabilities for the discovery
* Add: TIMEPERIOD TRANSITION logs from the Arbiter to get timeperiods transition output
* Add: is_admin property is now available for contacts
* Add: make the nmap discovery do tcp and udp scan
* Add: (Victor Igumnov) installation for Solaris

CORE FIXES
* Fix: use StringIO to read configuration files into a string (much faster with lots of config files)
* Fix: (reported by: lminoza) bug in inheritance if templates give only + elements, should continue to loop and not stop here.
* Fix: keep multiple spaces in config files. (check_command!"args " were stripped)
* Fix: pyro 4.10 management
* Fix: (reported by: MINOZA, Landy) multi-layer management for service template on host template. So if you apply a service on layer 3 and your host inherit from layer1->layer2
* Fix: (reported by: Thibaut Notteboom) catch ConnectionClosedError on poller/reactionner connections
* Fix: A better safe_print
* Fix: lot of encoding stuff!
* FIX: proxy support and python-setuptools installation on RHEL/CENTOS 6
* Fix: (reported by: darkweaver87) sometimes we got a malformed external command. If so, bail out and warn in debug mode.
* Fix: when there is a dispatching problem, the arbiter sent the same configuration to satellites again and again.
* Fix: crash bug when timezone set in config file
* Fix: make all logs pass the utf8 management
* Fix: use auto-generated, absolute path for README (Daniel Widerin)
* Fix: last line of email notification must finish by a \n otherwise it will not be sent (Laurent Ollagnier)
* Fix: (from rootix2) catch case where services do not have imported_from from modules.
* Fix: (reported by: Venelin Petkov) print in hostgroup names was a problem.
* Fix: if a scheduler restarted, we got a problem because the arbiter did not resend it the configuration
* Fix: "host_notification_period" and "service_notification_period" defined twice in the contact definition (Fournet Matthieu)
* Fix: (reported by: Httqm) default notification_interval was one minute, 1 hour is better.
* Fix: (reported by: sprudhomme) longoutput parsing and perfdata did not follow the nagios way.
* Fix: change sourceforge trac with github in tracebacks.
* Fix: (reported by: lminoza) too many notifications for a contact with multiple notification methods, all were sent when only some would pass filter.
* Fix: (reported by: puisea) kill all sub process action tree on unix
* Fix: flaw on multiple date and multiple timerange
* Fix: if missing alias in objects, put the name if available
* Fix: (reported by: denetariko) manage spaces before and after type name in define line
* Fix: (reported by: Steve Kieu) missing get_name() method for config
* Fix: (reported by: Mihai Efrim) flapping notifications

MODULE ENHANCEMENTS
* Add: rewrite the LiveStatus module, performance boost and simpler. Thanks Gerhard for this huge undertaking!
* Add: two new modules for storing livestatus logs. sqlite and mongodb
* Add: graphite module for the Broker to export performance data to a Graphite time series database.
* Add: module to deal with flat file dependencies
* Add: mongodb module for the arbiter, to load host objects from it
* Add: mongodb retention module
* Add: the Service-perfdata module now opens/closes the perfdata file each second. So it's compatible with tools like Centreon that move it
* Add: macros supported in skonf. Lots of bug fixes
* Add: merlindb now inserts host_contactgroups
* Add: NDO Nagios/Shinken mix in database (Sebastien Coavoux)
* Add: port option for the mysql connection with NDO
* Add: ip_tag module that can change configuration properties based on the host IP address or IP address range.
* Add: simulation mode for the nmap disco wrapper, so we can ask for xml output from users, and then simulate it easily.
* Add: new process for nmap discovery output. There is no longer a big mapping pass. It will output what it can, and it's for the discovery_rule to do this job now, it's far more a BIG thanks to all folks that sent me sample xml output :)
* Add: enhanced capabilities for the mysql import module (OlivierHa)

MODULE FIXES
* Fix: useless and dangerous str in redis module.
* FIX: htpasswd.users file was not correctly updated at installation
* Fix: (reported by: DGuenault) hack_poller_tag_by_macros was not applying poller_tag to commands, only root objects
* Fix: memcache and redis retention modules. They were loading notifications->contacts->commands but this class got slots but no __getstate__
* Fix: (reported by: sprudhomme): long_output in ndo for 1.4b9 version
* Fix: NSCA, a loop when the client initiates the socket closing

UI
* Add: now page rendering can be launched in parallel, and so a page can now call 'long' queries without breaking everything. But during a long query, we still cannot eat broks.
* Fix: the WebUI now exits when the brok thread encounters a problem
* Add: UI PNP module
* Add: if contact is not is_admin, then the /problems view only shows its related elements (if he is a contact of the element, of an impact or of a source problem).
* Add: add graph time selection in the detail page
* Add: Graphite graph backend
* Add: /mobile part
* Add: External Authentication for WebUI (SSO) (olivierHa)
* Add: tips on the eltgraph view
* Fix: acknowledge typo
* Fix: Change the scale of the impact view divs
* Fix: UI page navigation should not propose pages too high if not needed
* Add: gesture canvas as smaller and visible
* Add: address to the eltdetail.tpl
* Fix: utf8 names in the UI
* Add: allow_html_output parameter for the WebUI
* Add: generic perfometer manager for the UI
* Fix: search button (Thanks to pydubreucq)
* Add: make all nodes appear in the dep graph, but hidden ones very small
* Add: perfometer image hovering with the graph module like PNP
* Add: password iphone like in the login screen
* Add: an orange outline to see where the focus is on input

0.8 - 17/10/2011
------------------
CORE ENHANCEMENTS
* Add: WebUI broker module. More info in the UI changelog part.
* Add: (by: Vincent Riquer) no_event_handlers_during_downtimes parameter to choose between nagios and shinken behavior to (not) run event handlers during a downtime.
* Add: added RUN var relocation in shinken default file
* Add: support for serviceextinfo
* Add: support for hostextinfo
* Add: a workdir parameter in the nagios.cfg file to set the arbiter working directory
* Add: modules for modules. Will be useful for the global configuration in the WebUI (it's a module, but will need additional modules)
* Add: alternative installation method in contrib
* Add: shinken-admin command that can connect to an arbiter and ask daemons status.
* Add: discovery runners timeout parameter. Set by default at one hour.
* Add: rename criticity into business_impact for hosts and services. Criticity is automatically mapped into business_impact in the configuration reading, and rename min_criticity into min_business_impact for contacts and notifways. And keep a mapping for old conf.
* Add: if set to 0, min_workers will set the pollers/reactionners to use ALL cpus of the server.
* Add: business modulation objects.
* Add: new tests for A,B,Cof: rules and a fix for the default return logic.
* Add: arbiter asks once per check_interval for the managed conf of satellites. Makes for smoother decisions.

CORE FIXES
* Fix: BugFix Pyro4 compatibility
* fix: try to fix bad notification output making the scheduler crash.
* Fix: (Seb-Solon) catch incomplete service group members
* Fix: bad message for arbiter in HA without host_name value.
* Fix: (reported by: fraggod): inline comment in boolean caused an arbiter crash when looking up conf.
* Fix: (reported by titilambert) bug about linkify and templates, so need to clean search lists after removing templates.
* Fix: (reported by Vincent Riquer) Pyro got problems in connect() and so needs a low socket timeout by default.
* Fix: (reported by: grim) manage flap_detection_enabled and enable_flap_detection
* Fix: when disable active check was done, the pending check was still executed, so the effect was not immediate. Now we change it into a dummy check whose value is the current value of the host/service.
* Fix: external_command.py where the config setting log_external_commands was ignored
* Fix: (reported by: obiouane) missing object value is now reported correctly.
* Fix: (Etienne Obriot) multiple = in macro values
* Fix: a bug in config.py where resource macros containing a "=" were not correctly processed. Thanks Etienne
* Fix: (reported by Seb-Solon) custom macros got problems in notifications.
* Fix: (reported by: Michael Grundmann) a service for a void hostgroup and a host exclusion was reporting a bad conf instead of forgetting this service.
* Fix: (Reported by: Dirk Hallanzy) was not launching under XP due to a bad pid file creation mode.
* Fix: (reported by: Vincent Riquer) now a spare broker can be sent to sleep again by the arbiter. It will clean its queues and stop modules.
* Fix: arbiter crash with a spare and the new smoothing 'what I managed' behavior.
* Fix: Change default check/eventhandler/notification timeouts and flap thresholds to make nagios more compatible to shinken.
* fix: Better handling of check timeouts (there was an empty output)
* fix: Log notification and eventhandler timeouts
* Fix: (reported by: Denis GERMAIN) service with no description caused a host problem in linking phase.
* Fix: (reported by: Seb-Solon) manage utf8 in host and service perfdata.
* Fix: protect broker against modules that ask for full instance init about an unknown scheduler.
* Fix: (reported by Markus Elger) add next schedule date in the retention data.
* Fix: (reported by: Olivier Hanesse) bad realm configuration was not detected for satellites.
* Fix: (reported by: Markus Elger) bad host count in realms.
* Fix: (reported by: Markus Elger) bad host realm confs were not detected.

MODULES ENHANCEMENTS
* Add: module for autotagging poller_tag from a custom macro.
* Add: create the simple-log archive dir automatically if it doesn't exist
* Add: Initial import of the thrift broker with a sample client in python
* Add: Initial import of TSCA, Thrift Service Check Acceptor
* Add: Module arbiter load all configuration from GLPI (plugin monitoring) by webservice
* Add: a new module to import configuration data (hosts/services) from a MySQL database
* NDO/MySQL:
  * Add: current_notification_number in the ndo module.
  * Add: last_notification in ndo module.
  * Add: last_state_change to the ndo module.
  * Add: add the is_flapping data to broks and update the ndo module so it updates the value.
  * Add: percent_state_change as check_result brok and in the ndo module.
* LIVESTATUS:
  * Add: Derive the archives filenames from the main livestatus db
  * Add: Rewrite the livestatus sqlite code. Now there is one database file with logging events for each day.
  *!!: If you have a huge livestatus.db it is recommended to stop shinken and run the script contrib/livestatus/splitlivelogs.py
  * Add: the livestatus sqlite db will now be opened in do_start() instead of init(). Now only the livestatus process has an open filehandle. (the main broker process had one too)
  * Add: Make the livestatus socket world read/writable
  * Add: the REUSEADDR flag to livestatus's socket. Thanks Olivier Hanesse
  * Add: performance improvement for livestatus queries with Limit:

MODULES FIXES
* Fix: important bugfix in npcdmod
* Fix: Livestatus path database
* Fix: string comparison for livestatus. None is equivalent to empty string
* Fix: column name change in merlin ds.
* Fix: (reported by: Seb-Solon) bad contact group insert in objects in NDO
* Fix: (reported by: seb-solon) bad contact group id in NDO
* Fix: (seb-solon) ndo contacts were not registered with the good id.
* Fix: add the flap_detection_enabled in the ndo status tables too. (I love duplicate data...)
* Fix: the ndo module with is_flapping value for update states.
* Fix: (reported by: Ronny Lindner & Markus Elger) LiveStatus crash in log parsing.
* Fix: (reported by: Ronny Lindner) fix status.dat module with utf8 characters
* Fix: (reported by: Olivier Hanesse) nrpe module did not tag the check_time value for checks, and so pnp and other tools were not happy. BP rules were alike.
* Fix: (reported by Markus Elger) LiveStatus module did not clean previously added elements before recreating them.

UI
* Add: Andreas to main contributors :)
* Add: welcome text and error text in the login page.
* Add: make only interesting elements appear on the dep graph, so near the root or hosts or services in real problems. Far more readable with huge conf.
* Add: default http backend auto, so if a better one (paste, cherrypy) is found, use it instead of the simple swsgi.
* Add: less info in the detail page, with a more button for rarely used infos.
* Add: dock effect for actions buttons on /problems.
* Add: /system page with satellites info in it.
* Add: search form in the problems page, with autocompletion. Search problems and impacts names too.
* Add: new way of handling cookies. Now it's encrypted, and so stateless. Need to remove all the session file things.
* Add: in cfg file password for users, or AD or apache passwd file
* Add: AD module that will get user photos from Ldap.
* Add: icon sets: network_service, servers, disk, database
* Add: gesture management for the detail page.
* Add: parallax effect in the top right banner to see huge impacts
* Add: make known bad nodes shown in the dep tree, so the admin doesn't even have to expand and find the bad thing that causes him a problem. Lazy admin mode on :)
* Add: switch buttons for the hostdetail page to enable/disable things.
* Add: if you change host topology (new dependency like for VMotion), the UI is updated too.

0.6 - 03/05/2011
------------------
ENHANCEMENTS
* Add: discovery script and rule engine
* Enh: lot of code clean from Gregory Starck, big Thanks :)
* Add: manage several brokers by realm
* Add: LiveStatus Wait commands
* Add: Receiver daemon, for distributed realms in distant LAN
* Add: passive poller for DMZ management
* Add: (David Guenault) a way for people that install shinken lib in non standard place (like /opt) to get it working for python path.
* Add: poller and reactionner modules
* Add: nrpe poller module for tuning nrpe calls without having to 'fork'
* Add: human_timestamp_log option so the log will output time in human readable format.
* Add: make the local log rotate with the logging module, max keep file at 5.
* Add: the crash catching in the log for all daemons
* Add: enable local log by default, and if the admin wants it down, he can.
* Add: (Eric Beaulieu) Windows installation script
* Add: The configuration is read as UTF8
* Add: round robin dispatching way between the workers of a module for the satellites.
* Add: set a check_interval to satellites, so we do not try every second to ping them, once a minute is quite enough.
* Add: sending USR1 does a memory dump of the daemon
* Add: Manage Pyro4.2 and higher.
* Add: manage no notification period for hosts and services
* Add: print the sleep_time as a load average, so it will smooth in one minute average, so it is a good scheduler load indicator
* Add: force LANG=us and UTF8 in init scripts. It makes no sense for an admin to set another language for the system and hope for tools to work.
* Add: print the latencies for scheduler in 95percentile way
* Add: when a hook point failed set the module to restart, not to kill, even external ones
* Add: support for multisite's double-requests (extcmd+query) to the livestatus module
* Add: now the nagios.cmd named pipe reading is an external module
* Add: statistics for the livestatus module (multisite needs it)
* Add: retention modules for broker and arbiter
* Add: reactionner tag, like poller tag, but for reactionner
* Add: (Denis GERMAIN) check_shinken plugin and interface in the arbiter to get data.
* Add: (Rémi BUISSON) common init script for easy launch
* Add: sendmailhost.pl + sendmailservices.pl in libexec.
* Add: hot poller addition commands
* Add: automatic VMware host-> VM dependencies modules, with VMotion support
* Add: "Unknown phase", so limit useless notifications

FIXES
* Fix: the livestatus module so that multiple external commands in one request are possible. (thruk uses this for mass operations)
* Fix: the livestatus, so that host_name and service_description are shown correctly for downtimes and comments.
* Fix: (reported by: Raynald de Lahondès) for python2.7, if shell launch, do not go in shlex split pass.
* Fix: catch an exception when deregistering Pyro. (When the scheduler was started as the only process and no communication happened)
* Fix: Add the path of the current script (ex: bin/shinken-arbiter) and ".." to the list of paths where Python looks for shinken modules.
* Fix: default config had duplicate command name define: check_http
* Fix: clean Command(s) now subclass Item(s) as any "shinken.objects"
* Fix: (reported by: Raynald de Lahondès) if a hostgroup member got no host defined, create a dummy service for no host, so failed.
* Fix: bad UTC hour for scheduler.
* Fix: in a rare case, the pickle needs to call the __init__ of notification before the __get_state__, but it will call it without args.
* Fix: (reported by: Raynald de Lahondès) bad hostdep definition (link host) was badly detected.
* Fix: utf8 management in the db class.
* Fix: the * member was wrong in service calling hostgroup of this kind.
* Fix: the schedule immediate check now works without having to force it.
* Fix: correct the check launch under python2.7 with utf8 char.
* Fix: (reported by: Denis GERMAIN) set by default retention update interval to 60 minutes.
* Fix: bad sys.path management for ndo module
* Fix: too much loop in daemons! do_loop_turn must WAIT a little! To be managed in the main_loop part I think.
* Fix: if the scheduler got the same conf but got a wait a new one before, it became a zombie scheduler then....
* Fix: (reported by: Hienz Michael) do not close local file in the daemon pass.
* Fix: (Raynald de Lahondès) setup.py for bsd systems.
* Fix: service without host will be just dropped, like Nagios.
* Fix: Nagios allows elements without contacts. Do alike.
* Fix: add protection against adding again and again the same actions if the scheduler asks it, like for orphaned checks.
* Fix: sys.path management, now far more clear and error proof
* Fix: (reported by: capibaru) add a strip pass before loading cfg files.
* Fix: raise only one log for orphaned instead of xK useless logs.
* Fix: (Venelin Petkov) missing customs on services copy of a based one (multiple hosts, hostgroups, etc)
* Fix: (reported by: Ronny Lindner) manage the not host expressions for services.

0.5.1 - 19/01/2011
------------------
ENHANCEMENTS
*Add: Business rules
*Add: Downtime for contacts
*Add: Escalations based on time, with notification period shorting capabilities
*Add: options allowed_hosts and max_logs_age to the livestatus broker
*Add: some rarely used operators to the livestatus module (!>=)
*Add: SSL connections between daemons with certificates and a CA
*Add: module exception/kill catch in the scheduler.
*Add: use the binary format for the pickle, so it takes less space.
*Add: (Hartmut Goebel) use universal open way for conf reading.
*Add: support for unix sockets to the livestatus module
*Add: criticity value for host/services, with problem/impacts max criticity management
*Add: min_criticity definition in contact and notificationways.
*Add: pylint and coverage pass in the integration server
*Add: the new column pnpgraph_present to the livestatus module
*Add: now create the pickle retention file with a .tmp, so in case of problem, we do not lose the old one.
*Add: event handlers command can now be sent by external commands

FIXES
*Fix: (Laurent Guyon) select with no timeout in NSCA arbiter module.
*Fix: shinken init script: enable use of another "default" shinken file than hardcoded one by env variable.
*Fix: (current_service_groups needs to return an empty list instead of string) in the livestatus module
*Fix: 'setup.py -h install' now also exits
*Fix: () crash for some bad conf, should raise a message instead.
*Fix: missing check for no args in 'shinken' init script
*Fix: a bug in livestatus Servicegroup.members, minor cosmetics, test case for thruk
*Fix: a bug in host.parents livestatus representation to make thruk happy
*Fix: check for /dev/shm access for the satellites.
*Clean: Redesign of the livestatus module
*Fix: testing with multisite and thruk
*Clean: factorized .is_correct() call for all object types & added log to see more clearly where the error is.
*Clean: factorization/simplification of code in action.py (and related) for spawning checks processes. + clean of old deprecated commented code (& "related" too).
*Fix: downtime and comment are now pickled in a dict, not a list.
*Fix: pickle pass to look at type, so downtime and comment from 0.4 are still ok.
*Fix: acknowledge got too much information in the pickle pass, making the pickle save very very huge. Now fit from 100Mo to..2Mo :)
*Clean: big clean of hasattr->getattr with default value
*Clean: replace dict for properties with real objects
*Fix: Implement in_check_period/in_notification_period for livestatus to make multisite happy
*Fix: Remove a leftover attribute from timeperiod&daterange
*Fix: Transmit dateranges in timeperiod-full_status-broks
*Fix: Replaced the deprecated StatsGroupBy, implemented Stats: for log entries, making Multisite happy with shinken-livestatus
*Fix: manage the 'null' for inheritance.
*Fix: add timeout to the status_dat module so that the status.dat is written even if no broks are sent.
*Fix: escalations were offset of notif number by -1.
*Fix: Replace Queue with an own implementation of LifoQueue for Python 2.4 (livestatus)
*Fix: Fallback to sqlite 1.x for Python 2.4 (livestatus)
*Fix: bug in the table structure where logging messages are kept (livestatus)
*Fix: problem/impacts should be list, not string.
*Fix: missing customs values in host/service tables in livestatus and Thruk was not happy.
*Fix: is_impact/is_problem got bad format in livestatus tables.
*Fix: (Kristoffer Moegle) missing check in generic object configuration module.
*Fix: a bug in livestatus. Catch the exception if a peer is not listening for the response
*Fix: support for hosts without check_command (assumed to be always up)
*Fix: hostgroup realm assoc was broken. Now it's tested.
*Fix: (Maximilien Bersoult) fix mysql_db module search path.
*Fix: bug in compensate time when the core got event handler
*Fix: a bug in the npcd module (spoolfile timestamp extension was float, not int)
*Fix: windows registry paths.
*Fix: problem with Nagios retention that was not happy about host properties type.
*Fix: pickle/nagios retention was loading a retention host/service in the comment.ref link!
*Fix: now only previously notified contacts are sent recovery notifications.
*Fix: bug in NDO module for hostgroups
*Fix: (0.5.1) bugs in LiveStatus module for Service get_full_name call and queries with no space after the ':'

0.4 - 08/12/2010
------------------
ENHANCEMENTS
*Add: Service generators
*Add: "Limit:" in livestatus
*Add: the scheduler now saves retention data before stopping or taking a new conf
*Add: the broker cleanly quits the modules before quitting
*Add: better output to know which external process for the broker is who in the log
*Add: NodeSet lib use if available for the [X-Y] keys in service generators.
*Add: retention modules, Memcache, Redis or simple file.
*Add: lots of tests, even an end_to_end one for HA and load balanced installations
*Add: user can put what he wants as MACRO in resources.cfg
*Add: lots of log output, and cleaned a lot of others
*Add: conf sample for PNP integration.
*Add: (Nicolas Dupeux) add a NSCA server module for the Arbiter! (only XOR and none encryption for now)
*Add: now the retention_update_interval parameter is managed.
*Add: the ! character before a host_name is now managed in the services (even if host was defined in a hostgroup). And with test.
*Add: perfdata command management for host/service
*Add: manage modules in the Arbiter and in the schedulers
*Add: now the whole documentation is done in the wiki
*Add: obsess_over_host/service and executing oc*p_commands like eventhandlers
*Add: "templates" and modes and more for service/host perf data module.
*Add: now hosts with no address are filled with host_name for this value.
*Add: timeperiod inheritance
*Add: Allow "members *" in a hostgroup definition
*Add: manage inherit_parents for dependencies.
*Add: system time change catch for satellites.
*Add: enable_environment_macros now creates or not the env dict for checks
*Add: O*HP command management
FIXES
*Fix: Some missing properties in the livestatus tables
*Fix: Some missing properties in the NDO/Mysql export
*Fix: parents property was not stripped(), and an error value was not caught as error
*Fix: missing some error catches in contact definitions
*Fix: Nagios allows contact_name to be missing if there is an alias
*Fix: Nagios allows a contact with no 'action' if his options are n/n
*Fix: Resolve macro can loop forever with special output. Now limited to 32 loops max!
*Fix: the env_macros were enabled if we use the tweaks, not good. And they are REALLY CPU killers.
*Fix: LiveStatus: do not close the socket before we are sure the other peer sends us nothing. If so, we can close it.
*Fix: solve a case where config files do not end with a line return and will mix parameters.
*Fix: broker spare did not look at pollers/reactionners when it became active.
*Fix: now the poller/reactionner REALLY raise broks to the broker (it was clear before...)
*Fix: bug in the Broker that in some cases lost broks for external modules if they come from the arbiter (like logs).
*Fix: add a workaround to the livestatus module so it can handle requests from thruk 0.71.1 (which uses strange Stats: requests)
*Fix: rename all 'binaries' without the .py extension so distribs will be happy.
*Fix: livestatus now works with Python 2.4
*Fix: (Hermann Lauer) important bug in status.dat (and in fact all other 'external modules') that made broks not managed in the right order in some cases or on Arbiter restart. Thanks a lot to Hermann Lauer who helped me a LOT with all the debug logs!
*Fix: (Zoran Zaric) big indentation cleanup
*Fix: error handling in timeperiod inheritance
*Fix: clean on the default configuration
*Fix: manage additional_freshness_latency parameter with a test for check_freshness now.
*Fix: setup.py can create a zip file (egg) for the library under Centos. It's not a good thing. It should avoid it
*Fix: From Nicolas Dupeux: error in livestatus split.
*Fix: From Nicolas Dupeux: fix typos in host code

0.3 - 06/10/2010
------------------
ENHANCEMENTS
*Add: complex hostgroup matching with & ( ) | !
*Add: resultmodulation code and tests
*Add: brok information about problem/impacts
*Add: livestatus export about problem/impacts
*Add: clean quit on daemons (pid file and sub processes)
*Add: maintenance_period parameter in hosts and services
*Add: even more unit test cases
*Add: now external commands raised in livestatus module are taken by the arbiter
*Add: satellites states are now exported by livestatus
*Add: arbiter module management
*Add: GLPI import arbiter module :)
*Add: notificationways for contacts
*Add: warning about unmanaged parameters
*Add: log rotation and syslog management
FIXES
*Fix: install crash is now caught with Pyro 4 in Centos (python 2.4)
*Fix: host/service deps were not filled with default properties
*Fix: catch realm configuration errors
*Fix: bug in status.dat about parents printing
*Fix: problem with the Columns table in livestatus module
*Fix: next valid time was delayed by one minute for cases with excludes
*Fix: livestatus export in json was bad for service group members
*Fix: bug in windows check launch
*Clean: dispatcher code about useless options
*Clean: tests cases setUp

0.2 - 06/09/2010
------------------
ENHANCEMENTS
*New code layout
*Installation is easy with the setup.py process (first version from Maximilien Bersoult)
*Now compatible with Pyro 3 AND 4
*Now compatible with Python 2.4 and 2.5 too
*Add sticky acknowledgement.
Non-permanent ack-comments are now automatically removed
*Add host acknowledgement and acknowledgement stickiness
*Finished service problem acknowledgement. One more testcase
*Add REMOVE_HOST/SVC_ACKNOWLEDGEMENT external command
*Now broker gets broks from pollers and reactionners. (Useful for Logs)
*Give Broker a way to make broks :) (like for its own log)
*Add a problem/incident change states when apply. But it does not interfere with the standard check way of doing (or at least should not).
*Add some LSB init.d scripts
*Add max_plugins_output_length parameter to limit the checks output size.
*"Hack" the old nagios parameters: now status_file and nagios_log are caught. If the user defined them, but did not define the right broker modules, we create them "on the fly". I hope one day we will remove it...
*Nested macros are managed (like USERN in ARGN macro).
*Add a pass about changing Nagios2 properties to Nagios3 ones.
*Add json outputformat to the livestatus module
*Add a broker module npcdmod (plus test_npcdmod) which writes a perfdata file suitable for pnp4nagios
*Add check_period implicitly inherited by service from host.
*Redesign of the notifications (far easier to understand than the old async way)
*Notice about unused parameters and explain why they can be removed from conf.
*Catch non standard return code in actions.py so we can add stderr to the output for such cases.
*Now arbiter host_name property is not mandatory. But WARNING: for a multiple arbiter conf, it must be set.
*Updated cfg documentation (Author: Luke L)
*Add documentation about date range format because it was not documented.
*Update the nagios to shinken migration file
*Change the way broks are sent from Arbiter to Broker: before, the Broker connected to the Arbiter and took broks like for schedulers. But the Arbiter also connects to the broker. That's a nightmare about hangout. Now, the Arbiter pushes the broks. It's far easier and more efficient.
*Add template handling to servicedependencies
*Add test_dependencies as the regression test
*Less status_dat verbosity :)
*Add a last_perf_data + macros to access last perfdatas as in https://sourceforge.net/apps/trac/shinken/ticket/76
*HUGE clean on shinken-specific.cfg file.
*Add a README file
*Add a little note about how to migrate from Nagios to Shinken
*Add a hint about how to solve the 'cannot find my own arbiter' error message.
*Add bin directory with some bash scripts to launch/stop the whole application.
*Relative path, now we can have an easy portable sample configuration. (Gerhard)
*Add two missing operators in livestatus.py
*Big clean up conf sample!
*No more modulespath needed in brokerd.ini. Will be easier for packagers.
*Acknowledgement test cases
*Add some hard tests about timeperiods calculations
*Add a test.sh script for Hudson test (launch all tests)
*Add a problem/impact test.
*Now external modules can return objects (for now nobody uses it, but it can be useful in the future)
*Make it easier to raise checks/notifications from deep object classes.
*repository cosmetics (Luke L)
FIXES
*Fix: now merlin is correctly filled with update program value
*Bug Fix: ndo does not have command_file, so do not export it.
*Bug fix: retention load was loading the wrong list (the impacts one) and so caused problems with remove (not the right object!) (nicolas.dupeux)
*Fix a bug in ACKNOWLEDGE_SVC_PROBLEM ext. command. Sticky can be 0/1/2, not bool
*Bug fix in find_day_by_weekday_offset.
*Bug Fix: when a date was computed before the ref time for a weekday it was not recalculated, so problem.
*Bug Fix: error in get_end_of_day. It was giving the first second of the next day, so some excludes had problems with it.
*Bug Fix: shell like commands were not good :(.
Thanks to Gilles Seban for pointing it out and to Hiren Patel for giving a list of shell characters (so we know if we should use shell or not :) )
*Bug fix: external commands to send checks should work now
*Bug fix: Arbiter no longer crashes when scheduler is down and broker is up (not initialized made a missing parameter)
*Bug Fix: check orphaned was badly status set. Thanks Pylint.
*Bug fix: hosts in unreach were set DOWN in state, but unreach in state_id. Now it says clearly it's UNREACHABLE.
*Bug: retention was loading services objects from retention file. It's not good at all.
*Fix an "and -> or" bug in the dependencies
*Fix a bug in livestatus. state_type is now a number instead of HARD/SOFT
*Fix a bug in the livestatus module. Eventhandler command is now serializable
*Fix a bug in execute_unix. If there is an exception during plugin execution, use its string representation as plugin_output
*Fix a bug in the livestatus module. Multiline input is now possible
*Bug: patch from David GUÉNAULT about stopping all brokers
*Correct launched/lanched typo everywhere (Grégory Starck)
*Fixed scheduler.add so that master notifications (without contact) don't create a status brok
*Patch from Nicolas DUPEUX about typo correction in service.py
*Reduce CPU consumption of livestatus broker (thanks flox for the patch)
*Fix a bug in the npcdmod test case
*Fix: configuration files can be mixed if the previous one does not finish with a line return (Sebastian Reimers)
*Fix: Correct a bad default arbiter pid configuration (Sebastian Reimers)
*Bug fix a missing save of shinken-reactionner.py about path in relative mode
*Global external commands now create an update_program_status_brok instead of program_status_brok
*Fix a bug in the status_dat_broker (incorrect servicegroup-definition in objects.cache)
*Fix: add Gerhard in print screen :)
*Bug Fix: add duplication check for elements (and groups). Only service is allowed to have duplicates (will warn, but no error).
*Bug Fix: patch from Nicolas Dupeux. Thruk socket shutdowns are now handled in an exception
*Bug Fix (Sven Velt): patch about recursive dir load and check timeperiod typo
*NO MORE nap in code, now all are shinken :)

0.1 - 31/05/2010
------------------
ENHANCEMENTS
*Initial release