Archivos de la categoría ‘Technical’

I never liked have to install agents for different tasks like Backups or monitoring. I think that is always enough with SSH. In this post I will introduce some concepts that I am using as an alternative to the NRPE for nagios.

Time ago I explained how to setup SSH for remote monitor servers in Nagios, using the ControlMaster feature to reuse the connection.

In that post I was using runit to keep the connections alive.

But in OpenSSH 5.6 a new feature has been released:

* Added a ControlPersist option to ssh_config(5) that automatically
starts a background ssh(1) multiplex master when connecting. This
connection can stay alive indefinitely, or can be set to
automatically close after a user-specified duration of inactivity.

And this is COOL! We can just use some options in the check_by_ssh plugin to automatically create the session. The options are:

  • -i /etc/nagios/nagiosssh.id_rsa: Private ssh key generated with ssh-keygen.
  • -o ControlMaster=auto: Create the control master socket automatically
  • -o ControlPersist=yes: Enable Control persist. It will spam a ssh process in background that will keep the connection (can be stopped with -O exit)
  • -o ControlPath=/var/run/nagiosssh/$HOSTNAME$: Path to the control socket. We can create a dir in /var/run/nagiosssh.
  • -l nagiosssh -H $HOSTNAME$: User and host were we are connecting.

So, the command definition can be:

define command{
command_name    check_users_ssh
command_line    $USER1$/check_by_ssh \
-o ControlMaster=auto \
-o ControlPath=/var/run/nagios/$HOSTNAME$ \
-o ControlPersist=yes \
-i $USER6$ -H $HOSTADDRESS$ -l $USER5$ \
'check_users -w $ARG1$ -c $ARG2$'

Note: You have to define the USER variables in resources.cfg.

Then we only need to create the proper user in the remote host. To improve the security, you can:

  • Use bash in restricted mode:
    1. Create the user ‘nagiosssh’ with shell=/home/nagiosssh/rbash
    2. Create a script /home/nagiosssh/rbash:
      # Restricted shell for the client.
      # Sets the path to checks
      PATH=/home/icingassh/checks exec /bin/bash --restricted "$@"
    3. Create the directory /home/icingassh/checks  and link here all the desired checks.
  • Restrict the ssh connection setting options in .ssh/authorized_keys. For example:

    no-agent-forwarding,no-port-forwarding,no-pty,no-X11-forwarding,from="" ssh-rsa AAAAB3NzaC1...

Maybe in some days I upload a chef recipe to setup this.


Hace un porrón de años escribí esto… creo que aún estaba en el instituto :)

Básicamente funciona así (se ve mal por culpa de wordpress que quita los espacios… paso de rayarme):

$ fortune | ./
/  \\                                                                       \ \
\__/|       Many a writer seems to think he is never profound except         |/
 __ |       when he can't understand his own meaning.                        |
/  \|       -- George D. Prentice                                            |\
\__//                                                                        //
# Poner en el cron:
# 33 *     * * *     root   (cat /etc/motd_base ; /usr/games/fortune | /usr/local/bin/ > /etc/motd; chmod 644 /etc/motd

fmt -w 60 | (
read a;
read b;
read c;

cat << "TOP"
/  \\                                                                       \ \

printf "\\__/|       %-65s|/\n" "$a"
read c;

while read c; do
	printf "    |       %-65s|\n" "$a";

printf " __ |       %-65s|\n" "$a";
printf "/  \\|       %-65s|\\ \n" "$b";

cat << "BOTTON"
\__//                                                                        //


These is one of the proposed solutions for the job assessment commented in a previous post.

Provide a design which is able to parse Apache access-logs so we can generate an overview of which IP has visited a specific link. This design needs to be usable for a 500+ node webcluster. Please provide your configs/possible scripts and explain your choices and how scalable they are.

I will consider these requirements:

  • It is not critical to register all the log entries. It is no needed ensure that all the web hits are registered.
  • No control on duplicated log entries. It is not needed
    to check that the log entries had been already loaded.
  • It is also needed to propose a mechanism to gather the logs from the webservers.
  • It must be scalable.
  • It is a plus to make it flexible to allow further different analysis.

The problems to be solved are log storage and log gathering, but the main bottleneck will be the storage.

One realizes that the best option is a noSQL database due to
the characteristics of the data to process (log entries):

  • Time ordered entries
  • no duplicates
  • need of fast insertion
  • fixed fields
  • no data relation or conceptual integrity
  • need to be rotated (old entries removed)
  • etc…

So, I will propose the usage of MongoDB [1] (, that fits the requirements:

  • It is fast, both at inserting and querying.
  • Scales horizontally without disruption (is initially proper configured).
  • Supports replication and High Availability.
  • Well known solution. Commercial support if needed.
  • Python bindings (pyMongo)
[1] Note: I will not enter in details of a MongoDB scalable HA architecture.
See the quick start guide
to setup a single node and the documentation
for architecture examples.

To parse the logs and store them in MongoDB, I will propose a
simple python script: that:

  • Setup a direct MongoDB connection.
  • Read the access log from standard input.
  • Parse the logs and store all the fields, including: client_ip, url, referer, status code, timestamp, timezone…
  • I do not set any indexes in the NoSQL db. Indexes could be
    created on url or client_ip fields, but not having indexes allows faster
    insertions, that is the objective. The reads are very uncommon and performed
    in batch processes.
  • Notice that it should be improved to be more reliable. For instance, it
    does not check for errors (DB failures, etc.). It could buffer entries in case of DB failure.
  • A second script called queries the DB and prints the access. It gets an optional argument, the relative URL.

To feed the DB with the logs from the webservers, some solutions could be:

  • Copy the log files with a scheduled task via SSH or similar, then process them with in a centralized server (or cluster of servers).

    • Pros: Logs are centralized. Only a set of servers access to MongoDB.
      System can be stopped as needed.
    • Cons: Needs extra programming to get the logs.
      No realtime data.
  • Use a centralized syslog service, like syslog-ng
    (can be balanced and configured in HA),
    and setup all the webservers to send the logs via syslog
    (see this article).

    In the log server, we can process resulting files with a batch process or send all the messages to For instance, the configuration for syslog-ng:

    destination d_prog { program("/apath/"
                                  template-escape(no)); };
    • Pros: Centralized logs. No extra programming. Realtime data.
      Use of existent infrastructure (syslog). Only a set of servers access to MongoDB.
    • Cons: Some logs entries can be dropped. Can not be stopped, if not log entries will be lost.
  • Pipe the webserver logs directly to the script,
    In Apache configuration:

    CustomLog "|/apath/" combined
    • Pros: Easy to implement. No extra programming or infrastructure. Realtime data.
    • Cons: Some logs entries can be dropped. It can not be stopped or log entries will be lost.
      The script should be improved to make it more reliable.

These is one of the proposed solutions for the job assessment commented in a previous post.

These is one of the proposed solutions for the job assessment commented in a previous post.

How could you backup all the data of a MySQL installation without taking the DB down at all. The DB is 100G and is write intensive. Provide script.

My solution to this problem will be “Online backup using LVM snapshots in a replication slave”.
It implies that a series of previous decisions have been taken:

  • LVM snapshots: Database data files must be on a LVM volume, which will allow the creation of snapshots. An alternative could be use Btrfs snapshots.


    • Almost online backup: The DB will continue working while copying the data files.
    • Simple to setup and without cost.
    • You can even start a separated mysql instance (with RW snapshots in LVM2) to perform any task.


    • To ensure the data file integrity the tables must be locked while creating the snapshot.
      The snapshot creation is fast (~2s), but this means a lock anyway.
    • All datafiles must be in the same Logical Volume, to ensure an atomic snapshot.
    • The snapshot has an overhead that will decrease the write/read performance
      (it is said
      that up to a 6x overhead). This is because any Logical Extend modified must be copied. After a while this overhead will be reduced because changes are usually made in the same Logical Extends.
  • Replication slaves: (see official documentation about replication backups and backup raw data in slaves ).
    Backup operations will be executed in a slave instead of in the master.


    • Will avoid the performance impact in the Master mysql server,
      since “the slave can be paused and shut down without affecting the running operation of the master”.
    • Any backup technique can be done. In fact, using this second method should be enough.
    • Simple to setup.
    • If there is already a MySQL cluster it will use existent infrastructure.


    • This is more an architectonic solution or backup policy than a backupscript.
    • If there are logical inconsistencies in the slave, they are included in the backup.

So, I will suppose that:

  • MySQL data files are stored in a LVM logical volume, with enough free
    LEs in the volume group to create snapshots.
  • The backupscript will be executed in a working replication slave, not in the master.
    Execution of this script on a master will result in a short service pause (tables locked)
    and I/O performance impact.

This is the backup script (tested with xfs) that must be executed on a Slave replication server:

(in github)

# -*- coding: utf-8 -*-
# Requirements
# - Data files must be in lvm
# - Optionally in xfs (xfs_freeze)
# - User must have LOCK TABLES and RELOAD privilieges::
#    grant LOCK TABLES, RELOAD on *.*
#        to backupuser@localhost
#        identified by 'backupassword';
import MySQLdb
import sys
import os
from datetime import datetime

# DB Configuration
MYSQL_HOST = "localhost" # Where the slave is
MYSQL_USER = "backupuser"
MYSQL_PASSWD = "backupassword"
MYSQL_DB = "appdb"

# Datafiles location and LVM information
DATA_FILES_PATH = "/mysql/data" # do not add / at the end
DATA_FILES_LV = "/dev/datavg/datalv"
SNAPSHOT_SIZE = "10G" # tune de size as needed.

SNAPSHOT_MOUNTPOINT = "/mysql/data.snapshot" # do not add / at the end

# Backup target conf
BACKUP_DESTINATION = "/mysql/data.backup"

# Commands
# Avoids sudo ask the password
#SUDO = "SUDO_ASKPASS=/bin/true /usr/bin/sudo -A "
SUDO = "sudo"
LVCREATE_CMD =   "%s /sbin/lvcreate" % SUDO
LVREMOVE_CMD =   "%s /sbin/lvremove" % SUDO
MOUNT_CMD =      "%s /bin/mount" % SUDO
UMOUNT_CMD =     "%s /bin/umount" % SUDO
# There is a bug in this command with the locale, we set LANG=
XFS_FREEZE_CMD = "LANG= %s /usr/sbin/xfs_freeze" % SUDO

RSYNC_CMD = "%s /usr/bin/rsync" % SUDO

# MySQL functions
def mysql_connect():
    dbconn = MySQLdb.connect (host = MYSQL_HOST,
                              port = MYSQL_PORT,
                              user = MYSQL_USER,
                              passwd = MYSQL_PASSWD,
                              db = MYSQL_DB)
    return dbconn

def mysql_lock_tables(dbconn):

    print "Locking tables: %s" % sqlcmd

    cursor = dbconn.cursor()

def mysql_unlock_tables(dbconn):
    sqlcmd = "UNLOCK TABLES"

    print "Unlocking tables: %s" % sqlcmd

    cursor = dbconn.cursor()

# LVM operations
class FailedLvmOperation(Exception):

# Get the fs type with a common shell script
def get_fs_type(fs_path):
    p = os.popen('mount | grep $(df %s |grep /dev |'\
                 'cut -f 1 -d " ") | cut -f 3,5 -d " "' % fs_path)
    (fs_mountpoint, fs_type) = p.readline().strip().split(' ')
    return (fs_mountpoint, fs_type)

def lvm_create_snapshot():
    # XFS filesystem supports freezing. That is convenient before the snapshot
    (fs_mountpoint, fs_type) = get_fs_type(DATA_FILES_PATH)
    if fs_type == 'xfs':
        print "Freezing '%s'" % fs_mountpoint
        os.system('%s -f %s' % (XFS_FREEZE_CMD, fs_mountpoint))

    newlv_name = "%s_backup_%ilv" % \
                    (DATA_FILES_LV.split('/')[-1], os.getpid())
    cmdline = "%s --snapshot %s -L%s --name %s" % \

    print "Creating the snapshot backup LV '%s' from '%s'" % \
            (newlv_name, DATA_FILES_LV)
    print " # %s" % cmdline

    ret = os.system(cmdline)

    # Always unfreeze!!
    if fs_type == 'xfs':
        print "Unfreezing '%s'" % fs_mountpoint
        os.system('%s -u %s' % (XFS_FREEZE_CMD, fs_mountpoint))

    if ret != 0: raise FailedLvmOperation

    # Return the path to the device
    return '/'.join(DATA_FILES_LV.split('/')[:-1]+[newlv_name])

def lvm_remove_snapshot(lv_name):
    cmdline = "%s -f %s" % \
        (LVREMOVE_CMD, lv_name)

    print "Removing the snapshot backup LV '%s'" % lv_name
    print " # %s" % cmdline

    ret = os.system(cmdline)
    if ret != 0:
        raise FailedLvmOperation

# Mount the filesystem
class FailedMountOperation(Exception):

def mount_snapshot(lv_name):
    # XFS filesystem needs nouuid option to mount snapshots
    (fs_mountpoint, fs_type) = get_fs_type(DATA_FILES_PATH)
    if fs_type == 'xfs':
        cmdline = "%s -o nouuid %s %s" % \
                    (MOUNT_CMD, lv_name, SNAPSHOT_MOUNTPOINT)
        cmdline = "%s %s %s" % (MOUNT_CMD, lv_name, SNAPSHOT_MOUNTPOINT)

    print "Mounting the snapshot backup LV '%s' on '%s'" % \
            (lv_name, SNAPSHOT_MOUNTPOINT)
    print " # %s" % cmdline

    ret = os.system(cmdline)
    if ret != 0:
        raise FailedMountOperation

def umount_snapshot(lv_name):
    cmdline = "%s %s" % (UMOUNT_CMD, SNAPSHOT_MOUNTPOINT)

    print "Unmounting the snapshot backup LV '%s' from '%s'" % \
            (lv_name, SNAPSHOT_MOUNTPOINT)
    print " # %s" % cmdline

    ret = os.system(cmdline)
    if ret != 0:
        raise FailedMountOperation

# Perform the backup process. For instance, an rsync
class FailedBackupOperation(Exception):

def do_backup():
    cmdline = "%s --progress -av %s/ %s" % \

    print "Executing the backup"
    print " # %s" % cmdline

    ret = os.system(cmdline)
    if ret != 0:
        raise FailedBackupOperation

def main():
    dbconn = mysql_connect()

    start_time =
    # Critical, tables are locked!
    snapshotlv = ''
        snapshotlv = lvm_create_snapshot()
        print "Backup failed."
        print "Tables had been locked for %s" % str(

        print "Backup failed. Snapshot LV '%s' still exists. " % snapshotlv

    print "Backup completed. Elapsed time %s" % str(

if __name__ == '__main__':

These is one of the proposed solutions for the job assessment commented in a previous post.

Provide a script manipulating the routing table to send all outgoing traffic originating from ipaddress: through gw and all other traffic through gateway

One basically have to execute these commands:

# Default route
ip route del default table 254
ip route add default via dev wlan0 table 254

# alternative route and its rule
ip route del default table 1
ip route add default via dev wlan0 table 1
ip rule del from
ip rule add from table 1

I delete the previous default route and rule to ensure that the commands will be a success and will update the configuration.

A more convenient script could be:


# Default route

# Create the diferent routes, where
#  NAME[id]  = name of routing table (for documentation purposes)
#  ROUTE[id] = destination
#  SRCS[id]  = list of routed ips
#  DEV[id]     = network device
#  id          = number of routing table (1..253)

# Set the "main" table

# debug
#ip() { echo "> ip $*"; command ip $*; }

for i in {255..1}; do
[ ! -z "${ROUTE[$i]}" ] || continue

# Delete default route if exists
ip route list table $i | grep -q default && \
echo "Deleting default entry for route table ${NAME[$i]}($i)..." && \
ip route del default table $i

# Create the new table default route
echo "Creating route table '${NAME[$i]}($i)' with default via gw ${ROUTE[$i]}"
ip route add default via "${ROUTE[$i]}" dev ${DEV[$i]} table $i || continue

# Create
for ip in ${SRCS[i]}; do
# Delete rule if exists
ip rule list |grep -q "from $ip" && \
echo " - deleting rule from $ip..." && \
ip rule del from $ip

# Add the source rule
echo " + adding rule from $ip..."
ip rule add from $ip table $i

These is one of the proposed solutions for the job assessment commented in a previous post.

Using an Open Source solution, design a load balancer configuration that meets: redundancy, multiples subnets, and handle 500-1000Mbit of syn/ack/fin packets. Explain scalability of your design/configs.

The main problem that the load balancer design must solve in
web applications is the session stickiness (or persistence). The load balancer
design must be created according to the session replication policy of the architecture.
On the other hand, the load balancer must be designed to allow the upgrade and
maintenance of the servers.

Considering the architecture explained in Webserver architecture section, the stickiness restrictions are:

  • Session stickiness must be set for each site. Site failure means lose of all users sessions crated in that site.
  • Session stickiness must be set for each farm. Server failure is allowed (session will be recovered from the session DB backend)
  • Session stickiness for servers into the farm is optional (Could be set to take advantage of OS disk cache).

The software that I propose is HAProxy (

The load balancer design consists of two layers:

  • One primary HAProxy LB, that will balance between sites.
    Configured with session stickiness to the site using an inserted cookie, SITEID.
    See primary-haproxy.conf.
  • One site LB in each site, balancing the site’s farms.
    Configured with session stickiness to the farm FARM_ID.
    See site1-haproxy.conf.

Extra comments:

  • Each layer can have several HAproxy instances, with the same configuration, configured with a failover solution or behind a Layer 4 load balancer. See (Section 2) for examples.
  • Additionally a SSL frontend solution should be configured for SSL connections between the client
    and the primary load balancer. We can use plain HTTP between balancers and servers.
    I will not describe this element.
  • The solutions described in HAproxy architecture documentation, “4 Soft-stop for application maintenance”, can be used.
  • With HAProxy 1.4 you can dynamically control servers weight.
    A monitoring system can check the farms/servers health and tune the weight as needed.

This solution scales well. You simply need to add more servers, farms and sites.
Load-balancers can scale horizontally as commented.

Primary configuration:

# Primary Load Balancer configuration: primary-haproxy.conf
	log	local0
	log	local1 notice
	#log loghost	local0 info
	maxconn 40000
	user haproxy
	group haproxy

	log	global
	mode	http
	option	httplog
	option	dontlognull
	retries	3
	option redispatch
	maxconn	2000
	contimeout	5000
	clitimeout	50000
	srvtimeout	50000

listen primary_lb_1
    # We insert cookies, add headers => http mode
    mode http

    # Bind to all address.
    # Bind to a clusterized virtual ip
    bind transparent

    # Cookie persistence for PHP sessions. Options
    #  - rewrite PHPSESSID: will add the server label to the session id
    #cookie	PHPSESSID rewrite indirect
    #  - insert a cookie with the identifier.
    #    Use of postonly (session created in login form) or nocache to avoid be cached
    cookie SITEID insert postonly

    # We need to know the client ip in the end servers.
    # Inserts X-Forwarded-For. Needs httpclose (no Keep-Alive).
    option forwardfor
    option httpclose

    # Roundrobin is ok for HTTP requests.
    balance	roundrobin

    # The backend sites
    # Several options are possible:
    #  inter 2000 downinter 500 rise 2 fall 5 weight 100
    server site1 cookie site1 check
    server site2 cookie site2 check
    # etc..

Site configuration:

# Site 1 load balancer configuration: syte1-haproxy.conf
log    local0
log    local1 notice
#log loghost    local0 info
maxconn 40000
user haproxy
group haproxy

log    global
mode    http
option    httplog
option    dontlognull
retries    3
option redispatch
maxconn    2000
contimeout    5000
clitimeout    50000
srvtimeout    50000

listen site1_lb_1
grace 20000 # don't kill us until 20 seconds have elapsed

# Bind to all address.
# Bind to a clusterized virtual ip
bind transparent

# Persistence.
# The webservers in same farm share the session
# with memcached. The whole site has them in a DB backend.
mode http
cookie FARMID insert postonly

# Roundrobin is ok for HTTP requests.
balance roundrobin

# Farm 1 servers
server site1ws1 cookie farm1 check
server site1ws2 cookie farm1 check
# etc...

# Farm 2 servers
server site1ws17 cookie farm2 check
server site1ws18 cookie farm2 check
server site1ws19 cookie farm1 check
# etc..

These is one of the proposed solutions for the job assessment commented in a previous post.

Note that this was my reply it that moment, nowadays I would change my reply including automatic provisioning and automated configuration  based on puppet, chef… and other techniques. Also something about rollbacks, using LVM snapshots.


Given a 500+ node webcluster in one costumer for one base code. Design a Gentoo/PHP version control. Propose solutions for OS upgrades of servers. Propose a plan and execution for PHP upgrades.  Please explain your choices.


About the OS/upgrades I will consider:

  • There are a limited number of hardware configurations. I will call them: hw-profile.
  • There is a preproduction environment, with servers of each hardware configuration.

In that case:

  • Each upgrade must be properly tested in the preproduction environment.

  • The preproduction servers will pre-compile the Gentoo packages
    for each hw-profile. Distributed compiling can be set.

  • There is a local Gentoo mirror and pre-compiled packages repository
    in the network, serving the binaries built for each hw-profile.

  • Each server will have associated its hw-profile repository and install the binaries:

    emerge --usepkg --getbinpkg <package>

The PHP upgrades can be distributed using rsync, in different location for each version,
and activated changing the apache/nginx configuration.

To plan the upgrades (both OS and PHP) I will consider the architecture
explained previously in Webserver architecture section, and the load balancing
solution described in Redundant load balancer design.

The upgrade requirements are:

  • HA, no lost of service due maintenance or upgrades.
  • Each request with an associated session must access to a
    webapp version equal or superior than previous request.
    This is important to ensure the application consistency
    (e.p. an user fills a form that is only available in the last version,
    session contains unexpected values…).

The upgrades can be divided in:

  • Non-disruptive OS upgrade: small OS software upgrades that are not related to
    the webservice (p.e. man, findutils, tar…). The upgrade can be performed online.

  • Disruptive OS upgrade: OS software that imply restart the service
    or the server (p.e. Apache, kernel, libc, ssl…):

    1. It will be upgraded only one member of each farms. First all members number 1,
      then number 2…
    2. The web service will be stopped during the upgrade.
      The other servers in the farm will serve without service disruption.

    This method provides homogeneous and little performance impact (100/16 = 6% servers down).

  • Application upgrade: Clients must access to equal or newer webapp versions:

    1. A complete farm must be stopped at the same time.
    2. The sessions sticked to this farm will be
      served by other farms in the same site (session stickiness to site).
      Session data will be recovered from DB backend.
    3. The memcached associated to this farm must be flushed.
    4. Once upgraded, all servers in the farm are started and can serve to new sessions.

    Load balancer stickiness ensure that the new sessions will access only to
    the upgraded farm. Except:

    • if all the servers of the farm fail at the same time after the upgrade.
    • if end user manipulates the cookies.

    In that case, control code can be added to the application to
    invalidate sessions from upper versions. Something like this:

    if ($app_version < $_SESSION['app_version'])
    elseif ($app_version != $_SESSION['app_version'])
     $_SESSION['app_version'] = $app_session

To perform the upgrades, cluster management tools can be used,
like MCollective (good if using puppet), func, fabric


Based to this post, my fast tip to allow syntax highlight in less:

cat <<EOF >> ~/.bash_profile

# Syntax Highlight for less
# Check if source highlite is intalled
# Set SRC_HILITE_LESSPIPE for custom location
# To install: 
#   sudo yum install source-highlight
if [ -x "$SRC_HILITE_LESSPIPE" ]; then
	export LESS="${LESS/ -R/}  -R" # Set LESS option in raw mode (avoid repeating it)

For the last years I had the same problem: I was running windows as desktop and managing Linux/Unix?. Of cuorse, to minimize the pain, I use Cygwin and/or colinux, that make my life easier.

Often I need to open files remotely, but is so tedious to find them in the samba share… and then I’ve found this tool: doit, from Simon Tatham, the putty author.

It allows execute commands in your box from the remote server, automatically translating the paths (in case you are using samba).

Fast installation

  1. Client on Unix side (1):
  1. Download and compile:
    curl | tar -xvzf -
    cd doit
    cc -o doitclient doitclient.c doitlib.c -lsocket -lnsl -lresolv
  2. Install. I use  stowfor my adhoc binaries:
    ##  Preset variables.
    PLATFORM="$(uname -s)-$(uname -p)"
    mkdir -p $STOW_HOME/doit/bin
    cp doitclient $STOW_HOME/doit/bin
    for i in wf win winwait wcmd wclip www wpath; do
     ln -s doitclient $STOW_HOME/doit/bin/$i
    cd $STOW_HOME
    stow doit
  1. Shared secret setup and configuration:
    dd if=/dev/random of=$HOME/.doit-secret bs=64 count=1
    chmod 640 $HOME/.doit-secret 
    echo "secret $HOME/.doit-secret" &gt; $HOME/.doitrc

    Then set the mappings as described in the documentation. For instance:

      map /home/ \\sambaserver\
  1. If you are using su (or sudo reseting passwords), you will lose the SSH_CLIENT variable. But can set the $DOIT_HOST variable. You can use this:
    cat <<"EOF" >> ~/.bashrc
    # DOIT_HOST variable, for the DoIt tool (Integration with windows desktops)
    export DOIT_HOST=$(who -m | sed 's/.*(\(.*\)).*/\1/')
  2. Setup the client on a windows box. You can copy the .doit-secret or use samba to access to your home.
    Just create a link to “doit.exe secret.file”, for instance:

    \\sambaserver\keymon\local\Linux-x86\stow\doit\doit.exe \\sambaserver\keymon\.doit-secret


It is really cool, and it really works.

My only concern is that is the key, that should be shared. One solution can be use environment vars or event the Putty ‘Answerback to ^E’ (, but I am not sure how implement it.

(1) On solaris, compiling with GCC, I got this error:

/var/tmp//cc5ZYGYW.o: In function `main':
doitclient.c:(.text+0x29e8): undefined reference to `hstrerror'
collect2: ld returned 1 exit status

This is solved solved adding -lsocket.

Sometimes you need to know who was the user that did login in a linux/unix server, but after several “sudo” or “su” commands (and others programs that change the permissions) you have lost the information.

You can try to determine the user using the tty of the process tree, querying the process parents.

With this idea, I wrote this small script:

#!/bin/env bash
# This scripts allows  determine the user used to login in the
# machine to run the given process.

# Command names to be considered as login commands
LOGIN_PROGRAMS="sshd telnetd login" 

# Get all pids of the parents of a pid
get_parent_pids() {
    echo $1
    ppid=$(ps -o ppid -p $1 | awk '/[0-9]+/ {print $1}' )
    [ $ppid == 1 -o $ppid == 0 ] && return
    get_parent_pids $ppid

# Get users of parent process of a pid
get_parent_users() {
	get_parent_pids $1 | xargs -n1 ps -o user= -p | uniq | awk '{print $1}'

get_parent_users_commands() {
	get_parent_pids $1 | xargs -n1 ps -o user= -o comm= -p | uniq

get_parent_users_ttys() {
	get_parent_pids $1 | xargs -n1 ps -o user= -o tty= -p | uniq

get_firstuser_after_login() {
	cmd="egrep -B1 -m1" # Get the line before, and stop on first match
	for p in $LOGIN_PROGRAMS; do 
		cmd="$cmd -e '^(.*/)?$p\$'" 
	get_parent_users_commands $1 | eval $cmd | awk '{ print $1; exit; }'

get_firstuser_after_root() {
	get_parent_users $1 | grep -B1 -m1 root | awk '{print $1;exit;}'

get_firstuser_with_tty() {
	get_parent_users_ttys $1 | grep -B1 -m1 \?  | awk '{print $1;exit;}'

print_help() {
	cat <<EOF
Usage $SCRIPT_NAME [Option...] [pid]

Prints the users that where used to start a process.

By default it will use the current process.
	-h:		This help.
	-t:		Print the user of the first process having a valid tty (not ?) 
			This is the default behaviour.
	-a:		Print all processes.
	-r:		Print only user started after the first root (usually the one that login in)
	-l:		Print the user after a login program ($LOGIN_PROGRAMS)
	        Requires GNU egrep.

while true; do
	case $1 in
			echo "$SCRIPT_NAME: Unknown option '$1'"
			args="$args $1"
set -- $args


if ! ps -p $pid >/dev/null 2>&1; then
	echo "$SCRIPT_NAME: Unable to find process '$pid'"
	exit 1

case $mode in 
		get_parent_users $pid
		get_firstuser_after_login $pid
		get_firstuser_after_root $pid
		get_firstuser_with_tty $pid