Many Solaris admins are aware of the Fault Management Architecture (FMA) in Solaris. However, I haven't often seen people make a habit of regularly peeking into the output of fmadm list to check for faults detected by Solaris. A new PAM module has been integrated into Solaris that prints a message at login, hinting that looking into the FMA information may not be the dumbest idea.
It’s already in the default PAM configuration:
jmoekamp@testbed:~$ grep -i "session" /etc/pam.d/other
# Default definition for Session management
# Used when service name is not explicitly mentioned for session management
session definitive pam_user_policy.so.1
session required pam_unix_session.so.1
session optional pam_fm_notify.so.1
However, by default the module does nothing. You won’t get the message telling you to look at the FMA:
joergmoellenkamp@Mac ~ % ssh jmoekamp@192.168.39.122
(jmoekamp@192.168.39.122) Password:
Last login: Thu Apr 3 05:09:23 2025 from 192.168.39.121
Oracle Solaris 11.4.79.189.2 Assembled March 2025
jmoekamp@testbed:~$
The reason this is not the default behavior is simple: perhaps not all users allowed to log into the system should see this kind of information. In order to get it at login, a user needs an additional authorization. The necessary authorization is called solaris.fm.read. You can assign this authorization to a user via usermod:
root@testbed:~# usermod -A +solaris.fm.read jmoekamp
The next time you log in as jmoekamp, you will see a small but useful addition to the output:
joergmoellenkamp@Mac ~ % ssh jmoekamp@192.168.39.122
(jmoekamp@192.168.39.122) Password:
Last login: Thu Apr 3 05:11:43 2025 from 192.168.39.121
NOTE: system has 2 active diagnoses; run 'fmadm list' for details.
Oracle Solaris 11.4.79.189.2 Assembled March 2025
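If the note does not show up, it may be worth checking whether the authorization actually landed on the account. The following is a minimal sketch, not a definitive check: has_auth is a hypothetical helper name, and it assumes that auths prints the user's authorizations as a comma-separated list. It only matches the literal name, so a user who holds the authorization through a wildcard such as solaris.* would not be detected.

```shell
# has_auth AUTH (hypothetical helper): read a comma-separated
# authorization list on stdin and succeed only if AUTH is listed
# literally. Wildcard grants like solaris.* are NOT expanded here.
has_auth() {
    tr ',' '\n' | tr -d ' ' | grep -Fxq "$1"
}

# Assumed usage on the Solaris host:
#   auths jmoekamp | has_auth solaris.fm.read && echo "authorized"
```

The -F and -x options make grep treat the name as a fixed string and match whole lines only, so the dots are not read as regex wildcards and a longer name that merely contains solaris.fm.read does not count as a hit.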
I think that it’s a very clever use of PAM.
But I feel like multiple things have failed for this to be the first notification.
Why hasn’t the *LOM sent an alert email / SNMP trap?
Why hasn’t periodic monitoring of fault management as part of overall monitoring alerted?
I mean for belt and suspenders, pam_fm_notify is suspenders. But where’s the belt?
All of that being said, I’m considering installing pam_fm_notify as an additional level of notification.
Do you know a way to disable the fault propagation from LDOMs to the CDOM and/or ILOM? I can only find articles where Oracle announced the feature had been added, but no info how to configure/disable it.
This is quite annoying, as e.g. a user coredumping in an LDOM would be a "can be checked tomorrow" thing, while a fault reported in the CDOM is usually a "Call Woo right now!" issue.