I never liked have to install agents for different tasks like Backups or monitoring. I think that is always enough with SSH. In this post I will introduce some concepts that I am using as an alternative to the NRPE for nagios.

Time ago I explained how to setup SSH for remote monitor servers in Nagios, using the ControlMaster feature to reuse the connection.

In that post I was using runit to keep the connections alive.

But in OpenSSH 5.6 a new feature has been released:

* Added a ControlPersist option to ssh_config(5) that automatically
starts a background ssh(1) multiplex master when connecting. This
connection can stay alive indefinitely, or can be set to
automatically close after a user-specified duration of inactivity.

And this is COOL! We can just use some options in the check_by_ssh plugin to automatically create the session. The options are:

  • -i /etc/nagios/nagiosssh.id_rsa: Private ssh key generated with ssh-keygen.
  • -o ControlMaster=auto: Create the control master socket automatically
  • -o ControlPersist=yes: Enable Control persist. It will spam a ssh process in background that will keep the connection (can be stopped with -O exit)
  • -o ControlPath=/var/run/nagiosssh/$HOSTNAME$: Path to the control socket. We can create a dir in /var/run/nagiosssh.
  • -l nagiosssh -H $HOSTNAME$: User and host were we are connecting.

So, the command definition can be:


define command{
command_name    check_users_ssh
command_line    $USER1$/check_by_ssh \
-o ControlMaster=auto \
-o ControlPath=/var/run/nagios/$HOSTNAME$ \
-o ControlPersist=yes \
-i $USER6$ -H $HOSTADDRESS$ -l $USER5$ \
'check_users -w $ARG1$ -c $ARG2$'
}

Note: You have to define the USER variables in resources.cfg.

Then we only need to create the proper user in the remote host. To improve the security, you can:

  • Use bash in restricted mode:
    1. Create the user ‘nagiosssh’ with shell=/home/nagiosssh/rbash
    2. Create a script /home/nagiosssh/rbash:
      #!/bin/sh
      # Restricted shell for the client.
      # Sets the path to checks
      PATH=/home/icingassh/checks exec /bin/bash --restricted "$@"
    3. Create the directory /home/icingassh/checks  and link here all the desired checks.
  • Restrict the ssh connection setting options in .ssh/authorized_keys. For example:

    no-agent-forwarding,no-port-forwarding,no-pty,no-X11-forwarding,from="10.10.10.10" ssh-rsa AAAAB3NzaC1...

Maybe in some days I upload a chef recipe to setup this.

Hace un porrón de años escribí esto… creo que aún estaba en el instituto :)

Básicamente funciona así (se ve mal por culpa de wordpress que quita los espacios… paso de rayarme):

$ fortune | ./papel.sh
 __,-----------------------------------------------------------------------,__
/  \\                                                                       \ \
\__/|       Many a writer seems to think he is never profound except         |/
 __ |       when he can't understand his own meaning.                        |
/  \|       -- George D. Prentice                                            |\
\__//                                                                        //
   `------------------------------------------------------------------------``
#!/bin/sh
# Poner en el cron:
# 33 *     * * *     root   (cat /etc/motd_base ; /usr/games/fortune | /usr/local/bin/papel.sh) > /etc/motd; chmod 644 /etc/motd

fmt -w 60 | (
read a;
read b;
read c;

cat << "TOP"
 __,-----------------------------------------------------------------------,__
/  \\                                                                       \ \
TOP

printf "\\__/|       %-65s|/\n" "$a"
a=$b;
b=$c;
read c;

while read c; do
	printf "    |       %-65s|\n" "$a";
	a=$b;
	b=$c;
done;

printf " __ |       %-65s|\n" "$a";
printf "/  \\|       %-65s|\\ \n" "$b";

cat << "BOTTON"
\__//                                                                        //
   `------------------------------------------------------------------------``
BOTTON

)

These is one of the proposed solutions for the job assessment commented in a previous post.

Provide a design which is able to parse Apache access-logs so we can generate an overview of which IP has visited a specific link. This design needs to be usable for a 500+ node webcluster. Please provide your configs/possible scripts and explain your choices and how scalable they are.

I will consider these requirements:

  • It is not critical to register all the log entries. It is no needed ensure that all the web hits are registered.
  • No control on duplicated log entries. It is not needed
    to check that the log entries had been already loaded.
  • It is also needed to propose a mechanism to gather the logs from the webservers.
  • It must be scalable.
  • It is a plus to make it flexible to allow further different analysis.

The problems to be solved are log storage and log gathering, but the main bottleneck will be the storage.

One realizes that the best option is a noSQL database due to
the characteristics of the data to process (log entries):

  • Time ordered entries
  • no duplicates
  • need of fast insertion
  • fixed fields
  • no data relation or conceptual integrity
  • need to be rotated (old entries removed)
  • etc…

So, I will propose the usage of MongoDB [1] (http://www.mongodb.org/), that fits the requirements:

  • It is fast, both at inserting and querying.
  • Scales horizontally without disruption (is initially proper configured).
  • Supports replication and High Availability.
  • Well known solution. Commercial support if needed.
  • Python bindings (pyMongo)
[1] Note: I will not enter in details of a MongoDB scalable HA architecture.
See the quick start guide
to setup a single node and the documentation
for architecture examples.

To parse the logs and store them in MongoDB, I will propose a
simple python script: accesslog_parse_mongo.py that:

  • Setup a direct MongoDB connection.
  • Read the access log from standard input.
  • Parse the logs and store all the fields, including: client_ip, url, referer, status code, timestamp, timezone…
  • I do not set any indexes in the NoSQL db. Indexes could be
    created on url or client_ip fields, but not having indexes allows faster
    insertions, that is the objective. The reads are very uncommon and performed
    in batch processes.
  • Notice that it should be improved to be more reliable. For instance, it
    does not check for errors (DB failures, etc.). It could buffer entries in case of DB failure.
  • A second script called example_query_accesslog.py queries the DB and prints the access. It gets an optional argument, the relative URL.

To feed the DB with the logs from the webservers, some solutions could be:

  • Copy the log files with a scheduled task via SSH or similar, then process them with
    accesslog_parse_mongo.py in a centralized server (or cluster of servers).

    • Pros: Logs are centralized. Only a set of servers access to MongoDB.
      System can be stopped as needed.
    • Cons: Needs extra programming to get the logs.
      No realtime data.
  • Use a centralized syslog service, like syslog-ng
    (can be balanced and configured in HA),
    and setup all the webservers to send the logs via syslog
    (see this article).

    In the log server, we can process resulting files with a batch process or send all the messages to accesslog_parse_mongo.py. For instance, the configuration for syslog-ng:

    destination d_prog { program("/apath/accesslog_parse_mongo.py"
                                  template(“$MSGONLY\n”)
                                  template-escape(no)); };
    • Pros: Centralized logs. No extra programming. Realtime data.
      Use of existent infrastructure (syslog). Only a set of servers access to MongoDB.
    • Cons: Some logs entries can be dropped. Can not be stopped, if not log entries will be lost.
  • Pipe the webserver logs directly to the script, accesslog_parse_mongo.py.
    In Apache configuration:

    CustomLog "|/apath/accesslog_parse_mongo.py" combined
    • Pros: Easy to implement. No extra programming or infrastructure. Realtime data.
    • Cons: Some logs entries can be dropped. It can not be stopped or log entries will be lost.
      The script should be improved to make it more reliable.

These is one of the proposed solutions for the job assessment commented in a previous post.

These is one of the proposed solutions for the job assessment commented in a previous post.

How could you backup all the data of a MySQL installation without taking the DB down at all. The DB is 100G and is write intensive. Provide script.

My solution to this problem will be “Online backup using LVM snapshots in a replication slave”.
It implies that a series of previous decisions have been taken:

  • LVM snapshots: Database data files must be on a LVM volume, which will allow the creation of snapshots. An alternative could be use Btrfs snapshots.

    Advantages:

    • Almost online backup: The DB will continue working while copying the data files.
    • Simple to setup and without cost.
    • You can even start a separated mysql instance (with RW snapshots in LVM2) to perform any task.

    Drawbacks:

    • To ensure the data file integrity the tables must be locked while creating the snapshot.
      The snapshot creation is fast (~2s), but this means a lock anyway.
    • All datafiles must be in the same Logical Volume, to ensure an atomic snapshot.
    • The snapshot has an overhead that will decrease the write/read performance
      (it is said
      that up to a 6x overhead). This is because any Logical Extend modified must be copied. After a while this overhead will be reduced because changes are usually made in the same Logical Extends.
  • Replication slaves: (see official documentation about replication backups and backup raw data in slaves ).
    Backup operations will be executed in a slave instead of in the master.

    Advantages:

    • Will avoid the performance impact in the Master mysql server,
      since “the slave can be paused and shut down without affecting the running operation of the master”.
    • Any backup technique can be done. In fact, using this second method should be enough.
    • Simple to setup.
    • If there is already a MySQL cluster it will use existent infrastructure.

    Drawbacks:

    • This is more an architectonic solution or backup policy than a backupscript.
    • If there are logical inconsistencies in the slave, they are included in the backup.

So, I will suppose that:

  • MySQL data files are stored in a LVM logical volume, with enough free
    LEs in the volume group to create snapshots.
  • The backupscript will be executed in a working replication slave, not in the master.
    Execution of this script on a master will result in a short service pause (tables locked)
    and I/O performance impact.

This is the backup script (tested with xfs) that must be executed on a Slave replication server:

(in github)

#!/usr/bin/python
# -*- coding: utf-8 -*-
#
# Requirements
# - Data files must be in lvm
# - Optionally in xfs (xfs_freeze)
# - User must have LOCK TABLES and RELOAD privilieges::
#
#    grant LOCK TABLES, RELOAD on *.*
#        to backupuser@localhost
#        identified by 'backupassword';
#
import MySQLdb
import sys
import os
from datetime import datetime

# DB Configuration
MYSQL_HOST = "localhost" # Where the slave is
MYSQL_PORT = 3306
MYSQL_USER = "backupuser"
MYSQL_PASSWD = "backupassword"
MYSQL_DB = "appdb"

# Datafiles location and LVM information
DATA_FILES_PATH = "/mysql/data" # do not add / at the end
DATA_FILES_LV = "/dev/datavg/datalv"
SNAPSHOT_SIZE = "10G" # tune de size as needed.

SNAPSHOT_MOUNTPOINT = "/mysql/data.snapshot" # do not add / at the end

# Backup target conf
BACKUP_DESTINATION = "/mysql/data.backup"

#----------------------------------------------------------------
# Commands
# Avoids sudo ask the password
#SUDO = "SUDO_ASKPASS=/bin/true /usr/bin/sudo -A "
SUDO = "sudo"
LVCREATE_CMD =   "%s /sbin/lvcreate" % SUDO
LVREMOVE_CMD =   "%s /sbin/lvremove" % SUDO
MOUNT_CMD =      "%s /bin/mount" % SUDO
UMOUNT_CMD =     "%s /bin/umount" % SUDO
# There is a bug in this command with the locale, we set LANG=
XFS_FREEZE_CMD = "LANG= %s /usr/sbin/xfs_freeze" % SUDO

RSYNC_CMD = "%s /usr/bin/rsync" % SUDO

#----------------------------------------------------------------
# MySQL functions
def mysql_connect():
    dbconn = MySQLdb.connect (host = MYSQL_HOST,
                              port = MYSQL_PORT,
                              user = MYSQL_USER,
                              passwd = MYSQL_PASSWD,
                              db = MYSQL_DB)
    return dbconn

def mysql_lock_tables(dbconn):
    sqlcmd = "FLUSH TABLES WITH READ LOCK"

    print "Locking tables: %s" % sqlcmd

    cursor = dbconn.cursor()
    cursor.execute(sqlcmd)
    cursor.close()

def mysql_unlock_tables(dbconn):
    sqlcmd = "UNLOCK TABLES"

    print "Unlocking tables: %s" % sqlcmd

    cursor = dbconn.cursor()
    cursor.execute(sqlcmd)
    cursor.close()

#----------------------------------------------------------------
# LVM operations
class FailedLvmOperation(Exception):
    pass

# Get the fs type with a common shell script
def get_fs_type(fs_path):
    p = os.popen('mount | grep $(df %s |grep /dev |'\
                 'cut -f 1 -d " ") | cut -f 3,5 -d " "' % fs_path)
    (fs_mountpoint, fs_type) = p.readline().strip().split(' ')
    p.close()
    return (fs_mountpoint, fs_type)

def lvm_create_snapshot():
    # XFS filesystem supports freezing. That is convenient before the snapshot
    (fs_mountpoint, fs_type) = get_fs_type(DATA_FILES_PATH)
    if fs_type == 'xfs':
        print "Freezing '%s'" % fs_mountpoint
        os.system('%s -f %s' % (XFS_FREEZE_CMD, fs_mountpoint))

    newlv_name = "%s_backup_%ilv" % \
                    (DATA_FILES_LV.split('/')[-1], os.getpid())
    cmdline = "%s --snapshot %s -L%s --name %s" % \
        (LVCREATE_CMD, DATA_FILES_LV, SNAPSHOT_SIZE, newlv_name)

    print "Creating the snapshot backup LV '%s' from '%s'" % \
            (newlv_name, DATA_FILES_LV)
    print " # %s" % cmdline

    ret = os.system(cmdline)

    # Always unfreeze!!
    if fs_type == 'xfs':
        print "Unfreezing '%s'" % fs_mountpoint
        os.system('%s -u %s' % (XFS_FREEZE_CMD, fs_mountpoint))

    if ret != 0: raise FailedLvmOperation

    # Return the path to the device
    return '/'.join(DATA_FILES_LV.split('/')[:-1]+[newlv_name])

def lvm_remove_snapshot(lv_name):
    cmdline = "%s -f %s" % \
        (LVREMOVE_CMD, lv_name)

    print "Removing the snapshot backup LV '%s'" % lv_name
    print " # %s" % cmdline

    ret = os.system(cmdline)
    if ret != 0:
        raise FailedLvmOperation

#----------------------------------------------------------------
# Mount the filesystem
class FailedMountOperation(Exception):
    pass

def mount_snapshot(lv_name):
    # XFS filesystem needs nouuid option to mount snapshots
    (fs_mountpoint, fs_type) = get_fs_type(DATA_FILES_PATH)
    if fs_type == 'xfs':
        cmdline = "%s -o nouuid %s %s" % \
                    (MOUNT_CMD, lv_name, SNAPSHOT_MOUNTPOINT)
    else:
        cmdline = "%s %s %s" % (MOUNT_CMD, lv_name, SNAPSHOT_MOUNTPOINT)

    print "Mounting the snapshot backup LV '%s' on '%s'" % \
            (lv_name, SNAPSHOT_MOUNTPOINT)
    print " # %s" % cmdline

    ret = os.system(cmdline)
    if ret != 0:
        raise FailedMountOperation

def umount_snapshot(lv_name):
    cmdline = "%s %s" % (UMOUNT_CMD, SNAPSHOT_MOUNTPOINT)

    print "Unmounting the snapshot backup LV '%s' from '%s'" % \
            (lv_name, SNAPSHOT_MOUNTPOINT)
    print " # %s" % cmdline

    ret = os.system(cmdline)
    if ret != 0:
        raise FailedMountOperation

#----------------------------------------------------------------
# Perform the backup process. For instance, an rsync
class FailedBackupOperation(Exception):
    pass

def do_backup():
    cmdline = "%s --progress -av %s/ %s" % \
                (RSYNC_CMD, DATA_FILES_PATH, BACKUP_DESTINATION)

    print "Executing the backup"
    print " # %s" % cmdline

    ret = os.system(cmdline)
    if ret != 0:
        raise FailedBackupOperation

def main():
    dbconn = mysql_connect()
    mysql_lock_tables(dbconn)

    start_time = datetime.now()
    # Critical, tables are locked!
    snapshotlv = ''
    try:
        snapshotlv = lvm_create_snapshot()
    except:
        print "Backup failed."
        raise
    finally:
        mysql_unlock_tables(dbconn)
        dbconn.close()
        print "Tables had been locked for %s" % str(datetime.now()-start_time)

    try:
        mount_snapshot(snapshotlv)
        do_backup()
        umount_snapshot(snapshotlv)
        lvm_remove_snapshot(snapshotlv)
    except:
        print "Backup failed. Snapshot LV '%s' still exists. " % snapshotlv
        raise

    print "Backup completed. Elapsed time %s" % str(datetime.now()-start_time)

if __name__ == '__main__':
    main()

These is one of the proposed solutions for the job assessment commented in a previous post.

Provide a script manipulating the routing table to send all outgoing traffic originating from ipaddress: 85.14.228.248 through gw 85.14.228.254 and all other traffic through gateway 10.12.0.254

One basically have to execute these commands:

# Default route
ip route del default table 254
ip route add default via 192.168.1.1 dev wlan0 table 254

# alternative route and its rule
ip route del default table 1
ip route add default via 85.14.228.254 dev wlan0 table 1
ip rule del from 85.14.228.248
ip rule add from 85.14.228.248 table 1

I delete the previous default route and rule to ensure that the commands will be a success and will update the configuration.

A more convenient script could be:

#!/bin/bash

# Default route
DEFAULT_ROUTE=10.12.0.254
DEFAULT_DEV=eth0

# Create the diferent routes, where
#  NAME[id]  = name of routing table (for documentation purposes)
#  ROUTE[id] = destination
#  SRCS[id]  = list of routed ips
#  DEV[id]     = network device
#  id          = number of routing table (1..253)
#
NAME[1]=uplink1
ROUTE[1]="85.14.228.254"
DEV[1]=eth1
SRCS[1]="85.14.228.248"

#-----------------------------------------
# Set the "main" table
NAME[254]=main
ROUTE[254]=$DEFAULT_ROUTE
DEV[254]=$DEFAULT_DEV

# debug
#ip() { echo "> ip $*"; command ip $*; }

for i in {255..1}; do
[ ! -z "${ROUTE[$i]}" ] || continue

# Delete default route if exists
ip route list table $i | grep -q default && \
echo "Deleting default entry for route table ${NAME[$i]}($i)..." && \
ip route del default table $i

# Create the new table default route
echo "Creating route table '${NAME[$i]}($i)' with default via gw ${ROUTE[$i]}"
ip route add default via "${ROUTE[$i]}" dev ${DEV[$i]} table $i || continue

# Create
for ip in ${SRCS[i]}; do
# Delete rule if exists
ip rule list |grep -q "from $ip" && \
echo " - deleting rule from $ip..." && \
ip rule del from $ip

# Add the source rule
echo " + adding rule from $ip..."
ip rule add from $ip table $i
done
done

These is one of the proposed solutions for the job assessment commented in a previous post.

Using an Open Source solution, design a load balancer configuration that meets: redundancy, multiples subnets, and handle 500-1000Mbit of syn/ack/fin packets. Explain scalability of your design/configs.

The main problem that the load balancer design must solve in
web applications is the session stickiness (or persistence). The load balancer
design must be created according to the session replication policy of the architecture.
On the other hand, the load balancer must be designed to allow the upgrade and
maintenance of the servers.

Considering the architecture explained in Webserver architecture section, the stickiness restrictions are:

  • Session stickiness must be set for each site. Site failure means lose of all users sessions crated in that site.
  • Session stickiness must be set for each farm. Server failure is allowed (session will be recovered from the session DB backend)
  • Session stickiness for servers into the farm is optional (Could be set to take advantage of OS disk cache).

The software that I propose is HAProxy (http://haproxy.1wt.eu/):

The load balancer design consists of two layers:

  • One primary HAProxy LB, that will balance between sites.
    Configured with session stickiness to the site using an inserted cookie, SITEID.
    See primary-haproxy.conf.
  • One site LB in each site, balancing the site’s farms.
    Configured with session stickiness to the farm FARM_ID.
    See site1-haproxy.conf.

Extra comments:

  • Each layer can have several HAproxy instances, with the same configuration, configured with a failover solution or behind a Layer 4 load balancer. See http://haproxy.1wt.eu/download/1.3/doc/architecture.txt (Section 2) for examples.
  • Additionally a SSL frontend solution should be configured for SSL connections between the client
    and the primary load balancer. We can use plain HTTP between balancers and servers.
    I will not describe this element.
  • The solutions described in HAproxy architecture documentation, “4 Soft-stop for application maintenance”, can be used.
  • With HAProxy 1.4 you can dynamically control servers weight.
    A monitoring system can check the farms/servers health and tune the weight as needed.

This solution scales well. You simply need to add more servers, farms and sites.
Load-balancers can scale horizontally as commented.

Primary configuration:

#
# Primary Load Balancer configuration: primary-haproxy.conf
#
global
	log 127.0.0.1	local0
	log 127.0.0.1	local1 notice
	#log loghost	local0 info
	maxconn 40000
	user haproxy
	group haproxy
	daemon
	#debug
	#quiet

defaults
	log	global
	mode	http
	option	httplog
	option	dontlognull
	retries	3
	option redispatch
	maxconn	2000
	contimeout	5000
	clitimeout	50000
	srvtimeout	50000

listen primary_lb_1
    # We insert cookies, add headers => http mode
    mode http

    #------------------------------------
    # Bind to all address.
    #bind 0.0.0.0:10001
    # Bind to a clusterized virtual ip
    bind 192.168.10.1:10001 transparent

    #------------------------------------
    # Cookie persistence for PHP sessions. Options
    #  - rewrite PHPSESSID: will add the server label to the session id
    #cookie	PHPSESSID rewrite indirect
    #  - insert a cookie with the identifier.
    #    Use of postonly (session created in login form) or nocache to avoid be cached
    cookie SITEID insert postonly

    # We need to know the client ip in the end servers.
    # Inserts X-Forwarded-For. Needs httpclose (no Keep-Alive).
    option forwardfor
    option httpclose

    # Roundrobin is ok for HTTP requests.
    balance	roundrobin

    # The backend sites
    # Several options are possible:
    #  inter 2000 downinter 500 rise 2 fall 5 weight 100
    server site1 192.168.11.1:10001 cookie site1 check
    server site2 192.168.11.1:10002 cookie site2 check
    # etc..

Site configuration:

#
# Site 1 load balancer configuration: syte1-haproxy.conf
#
global
log 127.0.0.1    local0
log 127.0.0.1    local1 notice
#log loghost    local0 info
maxconn 40000
user haproxy
group haproxy
daemon
#debug
#quiet

defaults
log    global
mode    http
option    httplog
option    dontlognull
retries    3
option redispatch
maxconn    2000
contimeout    5000
clitimeout    50000
srvtimeout    50000

#------------------------------------
listen site1_lb_1
grace 20000 # don't kill us until 20 seconds have elapsed

# Bind to all address.
#bind 0.0.0.0:10001
# Bind to a clusterized virtual ip
bind 192.168.11.1:10001 transparent

# Persistence.
# The webservers in same farm share the session
# with memcached. The whole site has them in a DB backend.
mode http
cookie FARMID insert postonly

# Roundrobin is ok for HTTP requests.
balance roundrobin

# Farm 1 servers
server site1ws1 192.168.21.1:80 cookie farm1 check
server site1ws2 192.168.21.2:80 cookie farm1 check
# etc...

# Farm 2 servers
server site1ws17 192.168.21.17:80 cookie farm2 check
server site1ws18 192.168.21.18:80 cookie farm2 check
server site1ws19 192.168.21.19:80 cookie farm1 check
# etc..

Some posts from a new job Assessment

Publicado: enero 15, 2012 en Sin categoría

One year ago I applied for a Linux engineer position in a Internet company abroad.

They did send me a technical assessment with several questions. I took it quite seriously and I did the best I could. Sadly I did not get the job, but I did learn a lot from this.

This was time ago, and now I feel that I can publish some of these replies, they might they are useful for somebody, even as anti-pattern.

One thing that was annoying is that they did not describe the platform enough, and I did have to make a lot of assumptions. But the intention was evaluate me, not implement real solutions.

Webserver architecture assumptions

To reply the questions, specially OS and PHP version control and Redundant load balancer design, the webserver architecture must be considered. I will design the solutions for the following architecture:

  • The servers are located in different physical datacenters or sites. I will consider 4 sites.
  • The webservers are grouped in farms. Farms members should in the same subnet,
    connected to an high-performance local network, but on different hardware.
    I will consider 16 servers per farm.
  • The architecture has a mechanism to share the sessions. For instance we can consider:
    • memcached stored sessions.
      • One memcached server per webserver.
      • One memcached cluster per farm.
    • DB backend for the sessions (for sessions it is good a NoSQL DB).
      • One DB backend by site. Any webserver recover a session created in the site.
        It is possible to configure an inter-site replication.
      • It must offer redundancy and HA.
  • Other caches implemented can follow the same schema.

Entradas

These is one of the proposed solutions for the job assessment commented in a previous post.

Note that this was my reply it that moment, nowadays I would change my reply including automatic provisioning and automated configuration  based on puppet, chef… and other techniques. Also something about rollbacks, using LVM snapshots.

Question

Given a 500+ node webcluster in one costumer for one base code. Design a Gentoo/PHP version control. Propose solutions for OS upgrades of servers. Propose a plan and execution for PHP upgrades.  Please explain your choices.

Solution

About the OS/upgrades I will consider:

  • There are a limited number of hardware configurations. I will call them: hw-profile.
  • There is a preproduction environment, with servers of each hardware configuration.

In that case:

  • Each upgrade must be properly tested in the preproduction environment.

  • The preproduction servers will pre-compile the Gentoo packages
    for each hw-profile. Distributed compiling can be set.

  • There is a local Gentoo mirror and pre-compiled packages repository
    in the network, serving the binaries built for each hw-profile.

  • Each server will have associated its hw-profile repository and install the binaries:

    PORTAGE_BINHOST="ftp://gentoo-repository/$hw-profile"
    emerge --usepkg --getbinpkg <package>

The PHP upgrades can be distributed using rsync, in different location for each version,
and activated changing the apache/nginx configuration.

To plan the upgrades (both OS and PHP) I will consider the architecture
explained previously in Webserver architecture section, and the load balancing
solution described in Redundant load balancer design.

The upgrade requirements are:

  • HA, no lost of service due maintenance or upgrades.
  • Each request with an associated session must access to a
    webapp version equal or superior than previous request.
    This is important to ensure the application consistency
    (e.p. an user fills a form that is only available in the last version,
    session contains unexpected values…).

The upgrades can be divided in:

  • Non-disruptive OS upgrade: small OS software upgrades that are not related to
    the webservice (p.e. man, findutils, tar…). The upgrade can be performed online.

  • Disruptive OS upgrade: OS software that imply restart the service
    or the server (p.e. Apache, kernel, libc, ssl…):

    1. It will be upgraded only one member of each farms. First all members number 1,
      then number 2…
    2. The web service will be stopped during the upgrade.
      The other servers in the farm will serve without service disruption.

    This method provides homogeneous and little performance impact (100/16 = 6% servers down).

  • Application upgrade: Clients must access to equal or newer webapp versions:

    1. A complete farm must be stopped at the same time.
    2. The sessions sticked to this farm will be
      served by other farms in the same site (session stickiness to site).
      Session data will be recovered from DB backend.
    3. The memcached associated to this farm must be flushed.
    4. Once upgraded, all servers in the farm are started and can serve to new sessions.

    Load balancer stickiness ensure that the new sessions will access only to
    the upgraded farm. Except:

    • if all the servers of the farm fail at the same time after the upgrade.
    • if end user manipulates the cookies.

    In that case, control code can be added to the application to
    invalidate sessions from upper versions. Something like this:

    if ($app_version < $_SESSION['app_version'])
     session_destroy()
    elseif ($app_version != $_SESSION['app_version'])
     $_SESSION['app_version'] = $app_session

To perform the upgrades, cluster management tools can be used,
like MCollective (good if using puppet), func, fabric

 

Based to this post, http://linux-tips.org/article/78/syntax-highlighting-in-less my fast tip to allow syntax highlight in less:

cat <<EOF >> ~/.bash_profile

# Syntax Highlight for less
#
# Check if source highlite is intalled http://www.gnu.org/software/src-highlite/
# Set SRC_HILITE_LESSPIPE for custom location
# 
# To install: 
#   sudo yum install source-highlight
#
SRC_HILITE_LESSPIPE=${SRC_HILITE_LESSPIPE:-$(which src-hilite-lesspipe.sh 2> /dev/null)}
if [ -x "$SRC_HILITE_LESSPIPE" ]; then
	export LESSOPEN="| $SRC_HILITE_LESSPIPE  %s"
	export LESS="${LESS/ -R/}  -R" # Set LESS option in raw mode (avoid repeating it)
fi
EOF

Wisdom of the ancients

Publicado: noviembre 18, 2011 en Sin categoría
Etiquetas:,

Long time without publishing… quite busy lately.

This cartoon is amazing!