Marc2_AdminGuide/5_ServiceProcedures

Service procedures

Access hardware information / virtual console

Each server and compute node has an onboard iDRAC6 (integrated Dell Remote Access Controller). Point your browser to the corresponding management name (nodexxx-sc) or IP address (MGT-IP) mentioned here. A login screen will appear:

After a successful login, this overview will appear:

Here you can read out system data such as temperatures or power consumption, or restart the system.

Alternatively, you can log in to the iDRAC6 via ssh and connect to the serial console with the "console com2" command. To return to the iDRAC6 prompt, press

ctrl+^

Or use the psipmi and psconsole commands to read IPMI-related information and access the serial console:

marc2-h1:~ # psipmi node001 sdr
node001 (node001-sc):
CPU 1 Temp       | 14 degrees C      | ok
CPU 2 Temp       | 15 degrees C      | ok
CPU 3 Temp       | 13 degrees C      | ok
CPU 4 Temp       | 13 degrees C      | ok
PS 1 Temp        | 29 degrees C      | ok
PS 2 Temp        | disabled          | ns
...
marc2-h1:~ # psipmi node001 sel
node001 (node001-sc):
   1 | 12/31/2011 | 13:14:49 | Event Logging Disabled #0x72 | Log area reset/cleared | Asserted
   2 | Pre-Init Time-stamp   | Physical Security #0x73 | General Chassis intrusion | Asserted
   3 | Pre-Init Time-stamp   | Physical Security #0x73 | General Chassis intrusion | Deasserted
marc2-h1:~ # psconsole node001
console node001:
[SOL Session operational.  Use ~? for help]
Kernel 2.6.32-220.4.1.el6.x86_64 on an x86_64

node001 login: 

Note: you do not have to specify the addresses of the BMCs; the system knows how to connect to a node's associated BMC/DRAC.
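
When checking many sensors, it can help to show only the readings that need attention. The following is a minimal sketch, assuming the pipe-separated layout of the "psipmi <node> sdr" output shown above and assuming that any status other than "ok" or "ns" (no reading) is worth a look. The helper name sdr_problems is hypothetical and not part of the cluster tools.

```shell
#!/bin/sh
# Hypothetical helper: read "psipmi <node> sdr" style output on stdin and
# print only the sensor lines whose status column is neither "ok" nor "ns".
sdr_problems() {
    awk -F'|' 'NF >= 3 {
        status = $3
        # strip leading and trailing blanks from the status column
        gsub(/^[ \t]+|[ \t]+$/, "", status)
        if (status != "ok" && status != "ns") print
    }'
}

# Usage on a head node (assumes psipmi is in PATH):
#   psipmi node001 sdr | sdr_problems
```

Header lines such as "node001 (node001-sc):" contain no "|" and are skipped, so an empty result means that no sensor reported a problem.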

Restart iDRAC

If the login fails without an error message (the login page simply reappears), it may help to reset your browser's settings and restart the iDRAC once or twice (!) via ssh:

marc2-h2:~ # ssh <idrac-ipaddress>
/admin1-> racadm racreset
RAC reset operation initiated successfully. It may take up to a minute 
for the RAC to come back online again.

After the reset, it may take a few minutes until the HTTPS site is available again.
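
Instead of reloading the browser by hand, the wait can be scripted. This is only a sketch: the function names wait_until and idrac_https_up are hypothetical, the retry count and delay are arbitrary, and curl's -k option is used on the assumption that the iDRAC presents a self-signed certificate.

```shell
#!/bin/sh
# Retry a probe command until it succeeds or the attempts run out.
# usage: wait_until <attempts> <delay-seconds> <command> [args...]
wait_until() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        "$@" && return 0
        # sleep between attempts, but not after the last one
        [ "$i" -lt "$attempts" ] && sleep "$delay"
        i=$((i + 1))
    done
    return 1
}

# Probe the iDRAC web interface of the given management name or IP.
# -k skips certificate verification (assumption: self-signed certificate).
idrac_https_up() {
    curl -k -s -o /dev/null --max-time 5 "https://$1/"
}

# Example: up to 30 tries, 10 seconds apart
#   wait_until 30 10 idrac_https_up node001-sc && echo "iDRAC web site is back"
```

Passing the probe as a command keeps the retry loop generic, so the same function could also wait for the ssh port or any other check.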

Opening service requests with Dell

In case something goes wrong with the hardware, a case has to be opened with Dell. For that you need to know the service tag (serial number). See the list of service tags in this wiki.

From the (Linux) command line you can get it with

node085:~ # dmidecode -s system-serial-number
GTF185J
node085:~ #
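
Before pasting the value into a support form, a quick sanity check can catch empty or placeholder serial numbers. A minimal sketch, assuming that service tags consist of exactly seven upper-case letters and digits, as in the example above. The function name is_service_tag is hypothetical.

```shell
#!/bin/sh
# Return 0 if the argument looks like a Dell service tag.
# Assumption: exactly 7 characters, upper-case letters and digits only.
is_service_tag() {
    case $1 in
        *[!ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789]*) return 1 ;;  # foreign character
        ???????) return 0 ;;                                    # exactly seven characters
        *)       return 1 ;;                                    # wrong length
    esac
}

# Usage on a node (dmidecode needs root):
#   tag=$(dmidecode -s system-serial-number)
#   is_service_tag "$tag" || echo "unexpected service tag: '$tag'" >&2
```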

Then contact the Dell email support via http://www.dell.com/support/incidents/de/de/debsdt1/email/tagchange

If you are asked to submit an error log, you may export the System Event Log from iDRAC.

View warranty dates at Dell Support

Visit http://support.dell.com and click "My account" to log in.

Under "Support" > "My Products and Services" you can create and view a list of items, add items (by service tag number) etc.

Manage Dell MD3220+1220 /scratch disk shelves

There is a (self-explanatory) Java management GUI at marc2-fs1:/opt/dell/mdstoragesoftware/mdstoragemanager/client/SMclient .

In case of an error, Dell support may ask for an exported support data file. It can be created by clicking "Zusammenfassung" (Summary) > "Supportdaten manuell erfassen" (Gather support data manually).

OpenManage Server Administrator (OMSA)

A web interface is available at https://marc2-fh:1311 , https://marc2-fs1:1311 and https://marc2-fs2:1311 . Log in as user "root" with the system root password.

If the OMSA service is not running, it can be started with the command "/opt/dell/srvadmin/sbin/srvadmin-services.sh start". The script also accepts "status" and "stop".
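
The check and the start can be combined in one small wrapper. This is a sketch only: the function name omsa_ensure_running and the variable OMSA_CTL are hypothetical, and it assumes that the "status" call returns a non-zero exit code when the services are not running. Verify this on the system before relying on it.

```shell
#!/bin/sh
# Path to the OMSA control script named above; can be overridden for testing.
OMSA_CTL=${OMSA_CTL:-/opt/dell/srvadmin/sbin/srvadmin-services.sh}

# Start the OMSA services only if "status" reports that they are not running.
# Assumption: "status" exits non-zero when the services are stopped.
omsa_ensure_running() {
    if "$OMSA_CTL" status >/dev/null 2>&1; then
        echo "OMSA already running"
    else
        "$OMSA_CTL" start
    fi
}

# Usage (as root on marc2-fh, marc2-fs1 or marc2-fs2):
#   omsa_ensure_running
```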
