Archives for the ‘sysadmin’ category

I never liked having to install agents for different tasks like backups or monitoring; I think SSH is always enough. In this post I will introduce some concepts that I am using as an alternative to NRPE for Nagios.

Some time ago I explained how to set up SSH to remotely monitor servers in Nagios, using the ControlMaster feature to reuse the connection.

In that post I was using runit to keep the connections alive.

But OpenSSH 5.6 introduced a new feature:

* Added a ControlPersist option to ssh_config(5) that automatically
starts a background ssh(1) multiplex master when connecting. This
connection can stay alive indefinitely, or can be set to
automatically close after a user-specified duration of inactivity.

And this is COOL! We can just use some options in the check_by_ssh plugin to automatically create the session. The options are:

  • -i /etc/nagios/nagiosssh.id_rsa: Private ssh key generated with ssh-keygen.
  • -o ControlMaster=auto: Create the control master socket automatically
  • -o ControlPersist=yes: Enable ControlPersist. It will spawn an ssh process in the background that will keep the connection open (it can be stopped with -O exit)
  • -o ControlPath=/var/run/nagiosssh/$HOSTNAME$: Path to the control socket. We can create a dir in /var/run/nagiosssh.
  • -l nagiosssh -H $HOSTNAME$: User and host we are connecting to.

So, the command definition can be:


define command{
command_name    check_users_ssh
command_line    $USER1$/check_by_ssh \
-o ControlMaster=auto \
-o ControlPath=/var/run/nagios/$HOSTNAME$ \
-o ControlPersist=yes \
-i $USER6$ -H $HOSTADDRESS$ -l $USER5$ \
'check_users -w $ARG1$ -c $ARG2$'
}

Note: You have to define the USER variables in resources.cfg.

Then we only need to create the proper user on the remote host. To improve security, you can (see the setup sketch after this list):

  • Use bash in restricted mode:
    1. Create the user ‘nagiosssh’ with shell=/home/nagiosssh/rbash
    2. Create a script /home/nagiosssh/rbash:
      #!/bin/sh
      # Restricted shell for the client.
      # Sets the path to checks
      PATH=/home/nagiosssh/checks exec /bin/bash --restricted "$@"
    3. Create the directory /home/nagiosssh/checks and link all the desired checks there.
  • Restrict the SSH connection by setting options in .ssh/authorized_keys. For example:

    no-agent-forwarding,no-port-forwarding,no-pty,no-X11-forwarding,from="10.10.10.10" ssh-rsa AAAAB3NzaC1...
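
For reference, this is a minimal setup sketch of the restricted user on the monitored host (the plugin path and the public key file name are illustrative, adjust them to your distribution):

# Create the restricted user and its wrapper shell (run as root on the monitored host)
useradd -m -d /home/nagiosssh -s /home/nagiosssh/rbash nagiosssh
cat > /home/nagiosssh/rbash <<'EOF'
#!/bin/sh
# Restricted shell for the client: force PATH to the checks directory
PATH=/home/nagiosssh/checks exec /bin/bash --restricted "$@"
EOF
chmod 755 /home/nagiosssh/rbash

# Link the allowed checks and install the public key with restrictive options
mkdir -p /home/nagiosssh/checks /home/nagiosssh/.ssh
ln -s /usr/lib/nagios/plugins/check_users /home/nagiosssh/checks/check_users
echo 'no-agent-forwarding,no-port-forwarding,no-pty,no-X11-forwarding,from="10.10.10.10" '"$(cat nagiosssh.id_rsa.pub)" >> /home/nagiosssh/.ssh/authorized_keys
chown -R nagiosssh: /home/nagiosssh/.ssh
chmod 600 /home/nagiosssh/.ssh/authorized_keys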

Maybe in a few days I will upload a Chef recipe to set this up.

This is one of the proposed solutions for the job assessment commented on in a previous post.

Provide a design which is able to parse Apache access-logs so we can generate an overview of which IP has visited a specific link. This design needs to be usable for a 500+ node webcluster. Please provide your configs/possible scripts and explain your choices and how scalable they are.

I will consider these requirements:

  • It is not critical to register all the log entries; there is no need to ensure that every web hit is recorded.
  • No control of duplicated log entries: there is no need to check
    whether a log entry has already been loaded.
  • A mechanism to gather the logs from the webservers must also be proposed.
  • It must be scalable.
  • It is a plus to make it flexible enough to allow further, different kinds of analysis.

The problems to be solved are log storage and log gathering, but the main bottleneck will be the storage.

Given the characteristics of the data to process (log entries),
the best option is a NoSQL database:

  • Time ordered entries
  • no duplicates
  • need of fast insertion
  • fixed fields
  • no data relation or conceptual integrity
  • need to be rotated (old entries removed)
  • etc…

So, I will propose the usage of MongoDB [1] (http://www.mongodb.org/), that fits the requirements:

  • It is fast, both at inserting and querying.
  • Scales horizontally without disruption (if properly configured initially).
  • Supports replication and High Availability.
  • Well known solution. Commercial support if needed.
  • Python bindings (pyMongo)
[1] Note: I will not go into the details of a scalable, highly available MongoDB architecture.
See the quick start guide
to set up a single node and the documentation
for architecture examples.

To parse the logs and store them in MongoDB, I propose a
simple Python script, accesslog_parse_mongo.py (a sketch is shown after this list), that:

  • Sets up a direct MongoDB connection.
  • Reads the access log from standard input.
  • Parses the logs and stores all the fields, including: client_ip, url, referer, status code, timestamp, timezone…
  • I do not set any indexes in the NoSQL DB. Indexes could be
    created on the url or client_ip fields, but not having indexes allows faster
    insertions, which is the objective. Reads are very uncommon and performed
    in batch processes.
  • Notice that it should be improved to be more reliable. For instance, it
    does not check for errors (DB failures, etc.). It could buffer entries in case of DB failure.
  • A second script, example_query_accesslog.py, queries the DB and prints the accesses. It takes an optional argument, the relative URL (a query sketch is shown at the end of this post).
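
The original accesslog_parse_mongo.py is not reproduced here; the following is a minimal sketch of the idea using pymongo, assuming a local MongoDB and an accesslog.hits collection (both names are illustrative):

#!/usr/bin/env python
# accesslog_parse_mongo.py (sketch): read Apache combined-format log lines
# from stdin and insert one document per line into MongoDB, without indexes.
import re
import sys
from pymongo import MongoClient

# Apache "combined" log format, with named groups for the fields to store.
LOG_RE = re.compile(
    r'(?P<client_ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) \S+" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"')

def main():
    collection = MongoClient('mongodb://localhost:27017')['accesslog']['hits']
    for line in sys.stdin:
        match = LOG_RE.match(line)
        if match:
            # One document per hit; no indexes, to keep insertions fast.
            collection.insert_one(match.groupdict())

if __name__ == '__main__':
    main()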

To feed the DB with the logs from the webservers, some solutions could be:

  • Copy the log files with a scheduled task via SSH or similar, then process them with
    accesslog_parse_mongo.py on a centralized server (or cluster of servers).

    • Pros: Logs are centralized. Only a small set of servers accesses MongoDB.
      The system can be stopped as needed.
    • Cons: Needs extra programming to get the logs.
      No realtime data.
  • Use a centralized syslog service, like syslog-ng
    (it can be balanced and configured in HA),
    and set up all the webservers to send the logs via syslog
    (see this article).

    On the log server, we can process the resulting files with a batch process, or send all the messages to accesslog_parse_mongo.py. For instance, the configuration for syslog-ng:

    destination d_prog { program("/apath/accesslog_parse_mongo.py"
                                  template("$MSGONLY\n")
                                  template-escape(no)); };
    • Pros: Centralized logs. No extra programming. Realtime data.
      Uses existing infrastructure (syslog). Only a small set of servers accesses MongoDB.
    • Cons: Some log entries can be dropped. It cannot be stopped; otherwise log entries would be lost.
  • Pipe the webserver logs directly to the script, accesslog_parse_mongo.py.
    In Apache configuration:

    CustomLog "|/apath/accesslog_parse_mongo.py" combined
    • Pros: Easy to implement. No extra programming or infrastructure. Realtime data.
    • Cons: Some log entries can be dropped. It cannot be stopped or log entries will be lost.
      The script should be improved to make it more reliable.
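
As a reference, the query performed by example_query_accesslog.py could be as simple as this sketch (assuming the collection layout used in the parser sketch above):

#!/usr/bin/env python
# example_query_accesslog.py (sketch): print which client IPs visited a given URL.
import sys
from pymongo import MongoClient

url = sys.argv[1] if len(sys.argv) > 1 else '/'
hits = MongoClient('mongodb://localhost:27017')['accesslog']['hits']
for client_ip in hits.distinct('client_ip', {'url': url}):
    print(client_ip)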

This is one of the proposed solutions for the job assessment commented on in a previous post.

How could you backup all the data of a MySQL installation without taking the DB down at all. The DB is 100G and is write intensive. Provide script.

My solution to this problem will be “Online backup using LVM snapshots in a replication slave”.
It implies that a series of previous decisions have been taken:

  • LVM snapshots: Database data files must be on an LVM volume, which will allow the creation of snapshots. An alternative could be Btrfs snapshots.

    Advantages:

    • Almost online backup: The DB will continue working while copying the data files.
    • Simple to setup and without cost.
    • You can even start a separate MySQL instance (with RW snapshots in LVM2) to perform any task.

    Drawbacks:

    • To ensure data file integrity, the tables must be locked while creating the snapshot.
      The snapshot creation is fast (~2s), but it is a lock nonetheless.
    • All data files must be in the same Logical Volume, to ensure an atomic snapshot.
    • The snapshot has an overhead that will decrease the write/read performance
      (it is said
      that it can be up to a 6x overhead). This is because any modified Logical Extent must be copied. After a while this overhead is reduced, because changes tend to hit the same Logical Extents.
  • Replication slaves: (see the official documentation about replication backups and backup raw data in slaves).
    Backup operations will be executed on a slave instead of on the master.

    Advantages:

    • It avoids the performance impact on the master MySQL server,
      since “the slave can be paused and shut down without affecting the running operation of the master”.
    • Any backup technique can be used. In fact, using this second method alone should be enough.
    • Simple to setup.
    • If there is already a MySQL cluster it will use existent infrastructure.

    Drawbacks:

    • This is more an architectural solution or backup policy than a backup script.
    • If there are logical inconsistencies in the slave, they are included in the backup.

So, I will suppose that:

  • MySQL data files are stored in an LVM logical volume, with enough free
    Logical Extents (LEs) in the volume group to create snapshots.
  • The backup script will be executed on a working replication slave, not on the master.
    Executing this script on a master would result in a short service pause (tables locked)
    and an I/O performance impact.

This is the backup script (tested with XFS) that must be executed on a replication slave server:

(in github)

#!/usr/bin/python
# -*- coding: utf-8 -*-
#
# Requirements
# - Data files must be in lvm
# - Optionally in xfs (xfs_freeze)
# - User must have LOCK TABLES and RELOAD privileges::
#
#    grant LOCK TABLES, RELOAD on *.*
#        to backupuser@localhost
#        identified by 'backupassword';
#
import MySQLdb
import sys
import os
from datetime import datetime

# DB Configuration
MYSQL_HOST = "localhost" # Where the slave is
MYSQL_PORT = 3306
MYSQL_USER = "backupuser"
MYSQL_PASSWD = "backupassword"
MYSQL_DB = "appdb"

# Datafiles location and LVM information
DATA_FILES_PATH = "/mysql/data" # do not add / at the end
DATA_FILES_LV = "/dev/datavg/datalv"
SNAPSHOT_SIZE = "10G" # tune the size as needed.

SNAPSHOT_MOUNTPOINT = "/mysql/data.snapshot" # do not add / at the end

# Backup target conf
BACKUP_DESTINATION = "/mysql/data.backup"

#----------------------------------------------------------------
# Commands
# Avoid sudo asking for the password
#SUDO = "SUDO_ASKPASS=/bin/true /usr/bin/sudo -A "
SUDO = "sudo"
LVCREATE_CMD =   "%s /sbin/lvcreate" % SUDO
LVREMOVE_CMD =   "%s /sbin/lvremove" % SUDO
MOUNT_CMD =      "%s /bin/mount" % SUDO
UMOUNT_CMD =     "%s /bin/umount" % SUDO
# There is a bug in this command with the locale, we set LANG=
XFS_FREEZE_CMD = "LANG= %s /usr/sbin/xfs_freeze" % SUDO

RSYNC_CMD = "%s /usr/bin/rsync" % SUDO

#----------------------------------------------------------------
# MySQL functions
def mysql_connect():
    dbconn = MySQLdb.connect (host = MYSQL_HOST,
                              port = MYSQL_PORT,
                              user = MYSQL_USER,
                              passwd = MYSQL_PASSWD,
                              db = MYSQL_DB)
    return dbconn

def mysql_lock_tables(dbconn):
    sqlcmd = "FLUSH TABLES WITH READ LOCK"

    print "Locking tables: %s" % sqlcmd

    cursor = dbconn.cursor()
    cursor.execute(sqlcmd)
    cursor.close()

def mysql_unlock_tables(dbconn):
    sqlcmd = "UNLOCK TABLES"

    print "Unlocking tables: %s" % sqlcmd

    cursor = dbconn.cursor()
    cursor.execute(sqlcmd)
    cursor.close()

#----------------------------------------------------------------
# LVM operations
class FailedLvmOperation(Exception):
    pass

# Get the fs type with a common shell script
def get_fs_type(fs_path):
    p = os.popen('mount | grep $(df %s |grep /dev |'\
                 'cut -f 1 -d " ") | cut -f 3,5 -d " "' % fs_path)
    (fs_mountpoint, fs_type) = p.readline().strip().split(' ')
    p.close()
    return (fs_mountpoint, fs_type)

def lvm_create_snapshot():
    # XFS filesystem supports freezing. That is convenient before the snapshot
    (fs_mountpoint, fs_type) = get_fs_type(DATA_FILES_PATH)
    if fs_type == 'xfs':
        print "Freezing '%s'" % fs_mountpoint
        os.system('%s -f %s' % (XFS_FREEZE_CMD, fs_mountpoint))

    newlv_name = "%s_backup_%ilv" % \
                    (DATA_FILES_LV.split('/')[-1], os.getpid())
    cmdline = "%s --snapshot %s -L%s --name %s" % \
        (LVCREATE_CMD, DATA_FILES_LV, SNAPSHOT_SIZE, newlv_name)

    print "Creating the snapshot backup LV '%s' from '%s'" % \
            (newlv_name, DATA_FILES_LV)
    print " # %s" % cmdline

    ret = os.system(cmdline)

    # Always unfreeze!!
    if fs_type == 'xfs':
        print "Unfreezing '%s'" % fs_mountpoint
        os.system('%s -u %s' % (XFS_FREEZE_CMD, fs_mountpoint))

    if ret != 0: raise FailedLvmOperation

    # Return the path to the device
    return '/'.join(DATA_FILES_LV.split('/')[:-1]+[newlv_name])

def lvm_remove_snapshot(lv_name):
    cmdline = "%s -f %s" % \
        (LVREMOVE_CMD, lv_name)

    print "Removing the snapshot backup LV '%s'" % lv_name
    print " # %s" % cmdline

    ret = os.system(cmdline)
    if ret != 0:
        raise FailedLvmOperation

#----------------------------------------------------------------
# Mount the filesystem
class FailedMountOperation(Exception):
    pass

def mount_snapshot(lv_name):
    # XFS filesystem needs nouuid option to mount snapshots
    (fs_mountpoint, fs_type) = get_fs_type(DATA_FILES_PATH)
    if fs_type == 'xfs':
        cmdline = "%s -o nouuid %s %s" % \
                    (MOUNT_CMD, lv_name, SNAPSHOT_MOUNTPOINT)
    else:
        cmdline = "%s %s %s" % (MOUNT_CMD, lv_name, SNAPSHOT_MOUNTPOINT)

    print "Mounting the snapshot backup LV '%s' on '%s'" % \
            (lv_name, SNAPSHOT_MOUNTPOINT)
    print " # %s" % cmdline

    ret = os.system(cmdline)
    if ret != 0:
        raise FailedMountOperation

def umount_snapshot(lv_name):
    cmdline = "%s %s" % (UMOUNT_CMD, SNAPSHOT_MOUNTPOINT)

    print "Unmounting the snapshot backup LV '%s' from '%s'" % \
            (lv_name, SNAPSHOT_MOUNTPOINT)
    print " # %s" % cmdline

    ret = os.system(cmdline)
    if ret != 0:
        raise FailedMountOperation

#----------------------------------------------------------------
# Perform the backup process. For instance, an rsync
class FailedBackupOperation(Exception):
    pass

def do_backup():
    cmdline = "%s --progress -av %s/ %s" % \
                (RSYNC_CMD, DATA_FILES_PATH, BACKUP_DESTINATION)

    print "Executing the backup"
    print " # %s" % cmdline

    ret = os.system(cmdline)
    if ret != 0:
        raise FailedBackupOperation

def main():
    dbconn = mysql_connect()
    mysql_lock_tables(dbconn)

    start_time = datetime.now()
    # Critical, tables are locked!
    snapshotlv = ''
    try:
        snapshotlv = lvm_create_snapshot()
    except:
        print "Backup failed."
        raise
    finally:
        mysql_unlock_tables(dbconn)
        dbconn.close()
        print "Tables had been locked for %s" % str(datetime.now()-start_time)

    try:
        mount_snapshot(snapshotlv)
        do_backup()
        umount_snapshot(snapshotlv)
        lvm_remove_snapshot(snapshotlv)
    except:
        print "Backup failed. Snapshot LV '%s' still exists. " % snapshotlv
        raise

    print "Backup completed. Elapsed time %s" % str(datetime.now()-start_time)

if __name__ == '__main__':
    main()
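
To schedule it, a simple crontab entry on the slave is enough. The script name, user and log path below are hypothetical (the user needs the sudo rights used by the script):

# /etc/cron.d/mysql-backup -- nightly online backup on the replication slave
30 2 * * * backupuser /usr/local/bin/mysql_lvm_backup.py >> /var/log/mysql_backup.log 2>&1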

This is one of the proposed solutions for the job assessment commented on in a previous post.

Using an Open Source solution, design a load balancer configuration that meets: redundancy, multiple subnets, and handling 500-1000 Mbit of SYN/ACK/FIN packets. Explain the scalability of your design/configs.

The main problem that the load balancer design must solve in
web applications is the session stickiness (or persistence). The load balancer
design must be created according to the session replication policy of the architecture.
On the other hand, the load balancer must be designed to allow the upgrade and
maintenance of the servers.

Considering the architecture explained in the Webserver architecture section, the stickiness restrictions are:

  • Session stickiness must be set per site. A site failure means the loss of all user sessions created in that site.
  • Session stickiness must be set per farm. A server failure is acceptable (the session will be recovered from the session DB backend).
  • Session stickiness to a particular server within the farm is optional (it could be set to take advantage of the OS disk cache).

The software that I propose is HAProxy (http://haproxy.1wt.eu/):

The load balancer design consists of two layers:

  • One primary HAProxy LB, that will balance between sites.
    Configured with session stickiness to the site using an inserted cookie, SITEID.
    See primary-haproxy.conf.
  • One site LB in each site, balancing the site’s farms.
    Configured with session stickiness to the farm using an inserted cookie, FARMID.
    See site1-haproxy.conf.

Extra comments:

  • Each layer can have several HAProxy instances, with the same configuration, configured with a failover solution (see the keepalived sketch after this list) or behind a Layer 4 load balancer. See http://haproxy.1wt.eu/download/1.3/doc/architecture.txt (Section 2) for examples.
  • Additionally, an SSL frontend should be configured for SSL connections between the client
    and the primary load balancer. Plain HTTP can be used between balancers and servers.
    I will not describe this element.
  • The solutions described in HAproxy architecture documentation, “4 Soft-stop for application maintenance”, can be used.
  • With HAProxy 1.4 you can dynamically control servers weight.
    A monitoring system can check the farms/servers health and tune the weight as needed.
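
A minimal keepalived sketch for the failover of the primary layer could look like this (the interface name, router id and priorities are illustrative; the virtual IP is the one used in the bind line of primary-haproxy.conf):

# /etc/keepalived/keepalived.conf on the active primary HAProxy node
vrrp_script chk_haproxy {
    script "killall -0 haproxy"   # check that the haproxy process is alive
    interval 2
}

vrrp_instance LB_PRIMARY {
    state MASTER                  # BACKUP on the standby node
    interface eth0
    virtual_router_id 51
    priority 101                  # use a lower priority (e.g. 100) on the standby node
    virtual_ipaddress {
        192.168.10.1
    }
    track_script {
        chk_haproxy
    }
}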

This solution scales well: you simply need to add more servers, farms and sites.
The load balancers themselves can scale horizontally, as discussed above.

Primary configuration:

#
# Primary Load Balancer configuration: primary-haproxy.conf
#
global
	log 127.0.0.1	local0
	log 127.0.0.1	local1 notice
	#log loghost	local0 info
	maxconn 40000
	user haproxy
	group haproxy
	daemon
	#debug
	#quiet

defaults
	log	global
	mode	http
	option	httplog
	option	dontlognull
	retries	3
	option redispatch
	maxconn	2000
	contimeout	5000
	clitimeout	50000
	srvtimeout	50000

listen primary_lb_1
    # We insert cookies, add headers => http mode
    mode http

    #------------------------------------
    # Bind to all address.
    #bind 0.0.0.0:10001
    # Bind to a clusterized virtual ip
    bind 192.168.10.1:10001 transparent

    #------------------------------------
    # Cookie persistence for PHP sessions. Options
    #  - rewrite PHPSESSID: will add the server label to the session id
    #cookie	PHPSESSID rewrite indirect
    #  - insert a cookie with the identifier.
    #    Use of postonly (session created in the login form) or nocache to avoid being cached
    cookie SITEID insert postonly

    # We need to know the client ip in the end servers.
    # Inserts X-Forwarded-For. Needs httpclose (no Keep-Alive).
    option forwardfor
    option httpclose

    # Roundrobin is ok for HTTP requests.
    balance	roundrobin

    # The backend sites
    # Several options are possible:
    #  inter 2000 downinter 500 rise 2 fall 5 weight 100
    server site1 192.168.11.1:10001 cookie site1 check
    server site2 192.168.11.1:10002 cookie site2 check
    # etc..

Site configuration:

#
# Site 1 load balancer configuration: site1-haproxy.conf
#
global
	log 127.0.0.1	local0
	log 127.0.0.1	local1 notice
	#log loghost	local0 info
	maxconn 40000
	user haproxy
	group haproxy
	daemon
	#debug
	#quiet

defaults
	log	global
	mode	http
	option	httplog
	option	dontlognull
	retries	3
	option redispatch
	maxconn	2000
	contimeout	5000
	clitimeout	50000
	srvtimeout	50000

#------------------------------------
listen site1_lb_1
    grace 20000 # don't kill us until 20 seconds have elapsed

    # Bind to all address.
    #bind 0.0.0.0:10001
    # Bind to a clusterized virtual ip
    bind 192.168.11.1:10001 transparent

    # Persistence.
    # The webservers in the same farm share the session
    # with memcached. The whole site has them in a DB backend.
    mode http
    cookie FARMID insert postonly

    # Roundrobin is ok for HTTP requests.
    balance roundrobin

    # Farm 1 servers
    server site1ws1 192.168.21.1:80 cookie farm1 check
    server site1ws2 192.168.21.2:80 cookie farm1 check
    # etc...

    # Farm 2 servers
    server site1ws17 192.168.21.17:80 cookie farm2 check
    server site1ws18 192.168.21.18:80 cookie farm2 check
    server site1ws19 192.168.21.19:80 cookie farm2 check
    # etc..

This is one of the proposed solutions for the job assessment commented on in a previous post.

Note that this was my reply at that moment; nowadays I would change it to include automatic provisioning and automated configuration based on Puppet, Chef… and other techniques, and also something about rollbacks using LVM snapshots.

Question

Given a 500+ node webcluster in one customer for one base code. Design a Gentoo/PHP version control. Propose solutions for OS upgrades of servers. Propose a plan and execution for PHP upgrades. Please explain your choices.

Solution

About the OS and its upgrades, I will consider that:

  • There is a limited number of hardware configurations. I will call them hw-profiles.
  • There is a preproduction environment, with servers of each hardware configuration.

In that case:

  • Each upgrade must be properly tested in the preproduction environment.

  • The preproduction servers will pre-compile the Gentoo packages
    for each hw-profile. Distributed compiling can be set up.

  • There is a local Gentoo mirror and pre-compiled packages repository
    in the network, serving the binaries built for each hw-profile.

  • Each server will have its hw-profile repository associated and will install the binaries from it:

    PORTAGE_BINHOST="ftp://gentoo-repository/$hw-profile"
    emerge --usepkg --getbinpkg <package>

The PHP upgrades can be distributed using rsync, to a different location for each version,
and activated by changing the Apache/Nginx configuration (a sketch is shown below).
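
A possible sketch of that distribution and activation step, with illustrative paths and host names:

# Push release 1.2.3 to a webserver, into its own versioned directory
rsync -a --delete /build/webapp-1.2.3/ ws01:/var/www/releases/1.2.3/

# Activate it by switching a symlink used as the Apache DocumentRoot and reloading
ssh ws01 'ln -sfn /var/www/releases/1.2.3 /var/www/current && apache2ctl graceful'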

To plan the upgrades (both OS and PHP) I will consider the architecture
explained previously in Webserver architecture section, and the load balancing
solution described in Redundant load balancer design.

The upgrade requirements are:

  • HA: no loss of service due to maintenance or upgrades.
  • Each request with an associated session must hit a
    webapp version equal to or newer than the one of its previous request.
    This is important to ensure application consistency
    (e.g. a user fills in a form that is only available in the latest version,
    the session contains unexpected values…).

The upgrades can be divided into:

  • Non-disruptive OS upgrade: small OS software upgrades that are not related to
    the web service (e.g. man, findutils, tar…). The upgrade can be performed online.

  • Disruptive OS upgrade: OS software that implies restarting the service
    or the server (e.g. Apache, kernel, libc, ssl…):

    1. Only one member of each farm will be upgraded at a time. First all members number 1,
      then number 2…
    2. The web service will be stopped during the upgrade.
      The other servers in the farm will keep serving without service disruption.

    This method provides a homogeneous and small performance impact (100/16 ≈ 6% of servers down).

  • Application upgrade: Clients must access equal or newer webapp versions:

    1. A complete farm must be stopped at the same time.
    2. The sessions stuck to this farm will be
      served by other farms in the same site (session stickiness is to the site).
      Session data will be recovered from the DB backend.
    3. The memcached instances associated with this farm must be flushed.
    4. Once upgraded, all servers in the farm are started and can serve new sessions.

    Load balancer stickiness ensures that new sessions will only access
    the upgraded farm, except:

    • if all the servers of the farm fail at the same time after the upgrade.
    • if the end user manipulates the cookies.

    In that case, control code can be added to the application to
    invalidate sessions coming from newer versions. Something like this:

    if ($app_version < $_SESSION['app_version']) {
        session_destroy();
    } elseif ($app_version != $_SESSION['app_version']) {
        $_SESSION['app_version'] = $app_version;
    }

To perform the upgrades, cluster management tools can be used,
like MCollective (a good fit if using Puppet), func or Fabric; a minimal Fabric sketch follows.
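
For instance, a rolling disruptive OS upgrade of the first member of each farm could be driven with a small Fabric (1.x) task. Host names and init commands are illustrative, and the binary packages come from the binhost described above:

# fabfile.py (sketch): rolling upgrade of one member of each farm
from fabric.api import env, sudo, task

# Hypothetical inventory: the first member of every farm
env.hosts = ['site1ws1', 'site1ws17', 'site2ws1']

@task
def os_upgrade(package):
    sudo('/etc/init.d/apache2 stop')                    # take the node out of service; the farm keeps serving
    sudo('emerge --usepkg --getbinpkg %s' % package)    # install the pre-built binary package
    sudo('/etc/init.d/apache2 start')

# Usage: fab os_upgrade:package=app-arch/tar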

 

For the last few years I have had the same problem: I was running Windows as my desktop while managing Linux/Unix systems. Of course, to minimize the pain, I use Cygwin and/or coLinux, which make my life easier.

Often I need to open files remotely, but it is so tedious to find them in the Samba share… and then I found this tool: DoIt (http://www.chiark.greenend.org.uk/~sgtatham/doit/), from Simon Tatham, the PuTTY author.

It allows you to execute commands on your box from the remote server, automatically translating the paths (in case you are using Samba).

Fast installation

  1. Client on the Unix side (1). Download and compile:
    curl http://www.chiark.greenend.org.uk/~sgtatham/doit/doit.tar.gz | tar -xvzf -
    cd doit
    cc -o doitclient doitclient.c doitlib.c -lsocket -lnsl -lresolv
    
  2. Install. I use stow for my ad hoc binaries:
    ##  Preset variables.
    LOCAL_BINARIES=~/local
    PLATFORM="$(uname -s)-$(uname -p)"
    PATH=$PATH:$LOCAL_BINARIES/$PLATFORM/bin
    
    STOW_HOME=$LOCAL_BINARIES/$PLATFORM/stow
    
    mkdir -p $STOW_HOME/doit/bin
    cp doitclient $STOW_HOME/doit/bin
    for i in wf win winwait wcmd wclip www wpath; do
     ln -s doitclient $STOW_HOME/doit/bin/$i
    done
    
    cd $STOW_HOME
    stow doit
    
  3. Shared secret setup and configuration:
    dd if=/dev/random of=$HOME/.doit-secret bs=64 count=1
    chmod 640 $HOME/.doit-secret 
    echo "secret $HOME/.doit-secret" > $HOME/.doitrc
    

    Then set the mappings as described in the documentation. For instance:

    host
      map /home/ \\sambaserver\
  4. If you are using su (or sudo resetting passwords), you will lose the SSH_CLIENT variable, but you can set the $DOIT_HOST variable. You can use this:
    cat <<"EOF" >> ~/.bashrc
    # DOIT_HOST variable, for the DoIt tool (Integration with windows desktops)
    export DOIT_HOST=$(who -m | sed 's/.*(\(.*\)).*/\1/')
    EOF
    
  5. Set up the client on a Windows box. You can copy the .doit-secret or use Samba to access your home directory.
    Just create a link to “doit.exe secret.file”, for instance:

    \\sambaserver\keymon\local\Linux-x86\stow\doit\doit.exe \\sambaserver\keymon\.doit-secret

Conclusions

It is really cool, and it really works.

My only concern is the key, which has to be shared. One solution could be to use environment variables or even the PuTTY ‘Answerback to ^E’ option (http://tartarus.org/~simon/putty-snapshots/htmldoc/Chapter4.html#config-answerback), but I am not sure how to implement it.

(1) On solaris, compiling with GCC, I got this error:

/var/tmp//cc5ZYGYW.o: In function `main':
doitclient.c:(.text+0x29e8): undefined reference to `hstrerror'
collect2: ld returned 1 exit status

This is solved by adding -lsocket.

Sometimes you need to know which user logged in to a Linux/Unix server, but after several “sudo” or “su” commands (and other programs that change the credentials) that information is lost.

You can try to determine the user from the ttys of the process tree, querying the process parents.

With this idea, I wrote this small script: whowasi.sh

#!/usr/bin/env bash
# This script allows determining the user that was used to log in to the
# machine and run the given process.
#
SCRIPT_NAME=$0

# Command names to be considered as login commands
LOGIN_PROGRAMS="sshd telnetd login" 

# Get all pids of the parents of a pid
get_parent_pids() {
    echo $1
    ppid=$(ps -o ppid -p $1 | awk '/[0-9]+/ {print $1}' )
    [ -z "$ppid" -o "$ppid" = 1 -o "$ppid" = 0 ] && return
    get_parent_pids $ppid
}

# Get users of parent process of a pid
get_parent_users() {
	get_parent_pids $1 | xargs -n1 ps -o user= -p | uniq | awk '{print $1}'
}

get_parent_users_commands() {
	get_parent_pids $1 | xargs -n1 ps -o user= -o comm= -p | uniq
}

get_parent_users_ttys() {
	get_parent_pids $1 | xargs -n1 ps -o user= -o tty= -p | uniq
}


get_firstuser_after_login() {
	cmd="egrep -B1 -m1" # Get the line before, and stop on first match
	for p in $LOGIN_PROGRAMS; do 
		cmd="$cmd -e '^(.*/)?$p\$'" 
	done
	get_parent_users_commands $1 | eval $cmd | awk '{ print $1; exit; }'
}

get_firstuser_after_root() {
	get_parent_users $1 | grep -B1 -m1 root | awk '{print $1;exit;}'
}

get_firstuser_with_tty() {
	get_parent_users_ttys $1 | grep -B1 -m1 \?  | awk '{print $1;exit;}'
}


print_help() {
	cat <<EOF
Usage $SCRIPT_NAME [Option...] [pid]

Prints the users that were used to start a process.

By default it will use the current process.
	
Options
	-h:		This help.
	-t:		Print the user of the first process having a valid tty (not ?) 
			This is the default behaviour.
	-a:		Print all processes.
	-r:		Print only user started after the first root (usually the one that login in)
	-l:		Print the user after a login program ($LOGIN_PROGRAMS)
	        Requires GNU egrep.
EOF
}

mode=tty
while true; do
	case $1 in
		"")
			break
		;;
		"-a")
			mode=all
		;;
		"-l")
			mode=login
		;; 
		"-r")
			mode=root
		;;
		"-t")
			mode=tty
		;;
		"-h")
			print_help
			exit
		;;
		-*)
			echo "$SCRIPT_NAME: Unknown option '$1'"
			print_help
			exit
		;;
		*)
			args="$args $1"
		;;
	esac
	shift
done
set -- $args

pid=${1:-$$}

if ! ps -p $pid >/dev/null 2>&1; then
	echo "$SCRIPT_NAME: Unable to find process '$pid'"
	exit 1
fi

case $mode in 
	all)
		get_parent_users $pid
	;;
	login)
		get_firstuser_after_login $pid
	;;
	root)
		get_firstuser_after_root $pid
	;;
	tty)
		get_firstuser_with_tty $pid
	;;
esac
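
For example, from a session reached through SSH and then sudo, it could look like this (output is illustrative):

$ sudo -s
# ./whowasi.sh          # default: first user owning a real tty
jdoe
# ./whowasi.sh -a $$    # all users in the parent process chain
root
jdoe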

In this case I needed to patch lftp.

First we set up the configuration to support the overlay path (this is done only once):


export PORTDIR_OVERLAY="$EPREFIX/usr/local/portage"
cat <<EOF >>$EPREFIX/etc/make.conf
# Overlay
PORTDIR_OVERLAY="$PORTDIR_OVERLAY"
EOF

And then, for any package, we just have to copy the ebuild and its files, and add the new patch (copying the file and updating the ebuild):


# To create a overlay version of any package, just change this variables
pkg=net-ftp/lftp
pkgvers=lftp-4.3.1

# Copy the ebuild
mkdir -p $PORTDIR_OVERLAY/$pkg
cp $EPREFIX/usr/portage/$pkg/$pkgvers.ebuild  $PORTDIR_OVERLAY/$pkg
cp -R $EPREFIX/usr/portage/$pkg/files   $PORTDIR_OVERLAY/$pkg/files

# Make any change.
# e.g. a simple modification: add patches and add them to the ebuild:
#  cp lftp-solaris-2.10-socket.patch $PORTDIR_OVERLAY/$pkg/files/lftp-solaris-2.10-socket.patch
#  joe $EPREFIX/usr/portage/$pkg/$pkgvers.ebuild
#    +> Add to src_prepare(): epatch "${FILESDIR}/${PN}-solaris-2.10-socket.patch"

# Sign the ebuild
ebuild $PORTDIR_OVERLAY/$pkg/$pkgvers.ebuild digest
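
After that, the patched version can be installed from the overlay as usual, for instance:

emerge --ask =net-ftp/lftp-4.3.1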

When you create a “Virtual Target Disk” or VTD on a VIOS, there is no documented way to define or change the LUN number that it presents to the client partition. But there are situations where you might need to update it:
  1. In a dual VIOS environment, to have the same LUNs in both clients (easier to administer).
  2. In a redundant configuration, when you need to start LPARs on different hardware, using SAN disks. For instance, we use this configuration for our backup datacenter, where we have all the SAN disks mirrored.

In this post I describe how to update this LUN. The idea is basically:

  • Set the VTD device to Defined in the VIOS.
  • Update the ODM database: you have to update the attribute ‘LogicalUnitAddr’ in the ObjectClass ‘CuAt’.
  • Perform a ‘cfgmgr’ on the virtual host adapter (vhostX). This will enable the VTD device and reload the LUN number. Performing a cfgmgr on the VTD device alone does not work.

So, with commands:

$ oem_setup_env
# bash

# lsmap -vadapter vhost21
SVSA            Physloc                                      Client Partition ID
--------------- -------------------------------------------- ------------------
vhost21         U9117.MMA.XXXXXXX-V2-C34                     0x00000016

VTD                   host01v01
Status                Available
LUN                   0x8200000000000000
Backing device        hdiskpower0
Physloc               U789D.001.BBBBBBB-P1-C3-T2-L75

# ioscli mkvdev -vadapter vhost21 -dev host01v99 -vdev hdiskpower1
cfgmgr -l vhost21

# lsmap -vadapter vhost21
SVSA            Physloc                                      Client Partition ID
--------------- -------------------------------------------- ------------------
vhost21         U9117.MMA.XXXXXXX-V2-C34                     0x00000016

VTD                   host01v01
Status                Available
LUN                   0x8200000000000000
Backing device        hdiskpower0
Physloc               U789D.001.JJJJJJJ-P1-C3-T2-L75

VTD                   host01v99
Status                Available
LUN                   0x8300000000000000
Backing device        hdiskpower1
Physloc               U789D.001.JJJJJJJ-P1-C3-T2-L77

# rmdev -l host01v99
host01v99 Defined

# odmget -q "name=host01v99 and attribute=LogicalUnitAddr"  CuAt
CuAt:
  name = "host01v99"
  attribute = "LogicalUnitAddr"
  value = "0x8300000000000000"
  type = "R"
  generic = "D"
  rep = "n"
  nls_index = 6

# odmchange -o CuAt -q "name = host01v99 and attribute = LogicalUnitAddr" <<"EOF"
CuAt:
  name = "host01v99"
  attribute = "LogicalUnitAddr"
  value = "0x8100000000000000"
  type = "R"
  generic = "D"
  rep = "n"
  nls_index = 6
EOF

# odmget -q "name=host01v99 and attribute=LogicalUnitAddr"  CuAt
CuAt:
  name = "host01v99"
  attribute = "LogicalUnitAddr"
  value = "0x8100000000000000"
  type = "R"
  generic = "D"
  rep = "n"
  nls_index = 6

# cfgmgr -l vhost21
# lsmap -vadapter vhost21
SVSA            Physloc                                      Client Partition ID
--------------- -------------------------------------------- ------------------
vhost21         U9117.MMA.XXXXXXX-V2-C34                     0x00000016

VTD                   host01v01
Status                Available
LUN                   0x8200000000000000
Backing device        hdiskpower0
Physloc               U789D.001.JJJJJJJ-P1-C3-T2-L75

VTD                   host01v99
Status                Available
LUN                   0x8100000000000000
Backing device        hdiskpower1
Physloc               U789D.001.JJJJJJJ-P1-C3-T2-L77

In the client partition, you can scan for the new disk, and it will have the LUN 0x81:

root@host01:~/# cfgmgr -l vio0
root@host01:~/# lscfg -vl hdisk5
  hdisk5           U9117.MMA.XXXXXXX-V22-C3-T1-L8100000000000000  Virtual SCSI Disk Drive

Note: I have actually edited the output of these commands to remove information about my company.

Update: I created a script to do this: change_vtd_lun.sh

Any Linux & Unix admin knows this fact: GNU tools are MUCH better than the native AIX, BSD, Solaris or HP-UX tools.

GNU tools have far fewer bugs, much more functionality and options, localization and better documentation; they are the de facto standard, and most scripts are built on top of them, etc., etc., etc. Why the hell don't they throw out their ugly, buggy, limited tools and install the GNU tools on their systems by default???

Here you have an example of a weird behaviour of the ‘dd’ command on the AIX platform: with the skip=<Num. blocks> parameter, ‘dd’ skips the blocks, but it actually reads them (even on a filesystem that supports random file access). So, if you are working with big files (in my case, 50GB) you have to read ALL the blocks before reaching the requested position. That means huge I/O, memory used for cache, etc…

IBM guys: don't you know that there is an lseek(2) function?

Here you have an example of the time it takes to read 2MB from a big file, skipping 1000MB. The native ‘dd’ command takes 12 seconds:

$ time /usr/bin/dd if=a_big_big_file.data skip=1000 bs=1M count=2 of=/dev/null
2+0 records in.
2+0 records out.

real    0m12.059s
user    0m0.013s
sys     0m1.419s

With GNU’s version, less than a second:

$ time /opt/freeware/bin/dd if=a_big_big_file.data skip=1000 bs=1M count=2 of=/dev/null
2+0 records in
2+0 records out

real    0m0.024s
user    0m0.002s
sys     0m0.006s

Note: You can find GNU dd in the coreutils package of the AIX Toolbox for Linux Applications.

Update: I contacted IBM support and they told me that with the option conv=iblock, “dd” behaves as expected. But IMHO the documentation does not explicitly say that:

iblock, oblock
Minimize data loss resulting from a read or write error on direct access devices. If you specify the iblock variable and an error occurs during a block read (where the block size is 512 or the size specified by the ibs=InputBlockSize variable), the dd command attempts to reread the data block in smaller size units. If the dd command can determine the sector size of the input device, it reads the damaged block one sector at a time. Otherwise, it reads it 512 bytes at a time. The input block size (ibs) must be a multiple of this retry size. This option contains data loss associated with a read error to a single sector. The oblock conversion works similarly on output.
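
For reference, the suggested invocation would then look like this (same file as in the timings above; I have not verified the behaviour myself):

$ time /usr/bin/dd if=a_big_big_file.data skip=1000 bs=1M count=2 conv=iblock of=/dev/null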