Archives for the ‘Misc’ category

I wrote this a good few years ago… I think I was still in high school :)

Basically it works like this (it looks bad because WordPress strips the spaces… I can't be bothered to fight it):

$ fortune | ./papel.sh
 __,-----------------------------------------------------------------------,__
/  \\                                                                       \ \
\__/|       Many a writer seems to think he is never profound except         |/
 __ |       when he can't understand his own meaning.                        |
/  \|       -- George D. Prentice                                            |\
\__//                                                                        //
   `------------------------------------------------------------------------``
#!/bin/sh
# Put this in cron:
# 33 *     * * *     root   (cat /etc/motd_base ; /usr/games/fortune | /usr/local/bin/papel.sh) > /etc/motd; chmod 644 /etc/motd

fmt -w 60 | (
read a;
read b;
read c;

cat << "TOP"
 __,-----------------------------------------------------------------------,__
/  \\                                                                       \ \
TOP

printf "\\__/|       %-65s|/\n" "$a"
a=$b;
b=$c;

while read c; do
	printf "    |       %-65s|\n" "$a";
	a=$b;
	b=$c;
done;

printf " __ |       %-65s|\n" "$a";
printf "/  \\|       %-65s|\\ \n" "$b";

cat << "BOTTOM"
\__//                                                                        //
   `------------------------------------------------------------------------``
BOTTOM

)

This is one of the proposed solutions for the job assessment discussed in a previous post.

Provide a design which is able to parse Apache access-logs so we can generate an overview of which IP has visited a specific link. This design needs to be usable for a 500+ node webcluster. Please provide your configs/possible scripts and explain your choices and how scalable they are.

I will consider these requirements:

  • It is not critical to register every log entry; there is no need to ensure that every single web hit is recorded.
  • No control of duplicate log entries: there is no need
    to check whether a log entry has already been loaded.
  • A mechanism to gather the logs from the webservers must also be proposed.
  • It must be scalable.
  • It is a plus to make it flexible enough to allow further, different kinds of analysis.

The problems to be solved are log storage and log gathering, but the main bottleneck will be the storage.

Given the characteristics of the data to process (log entries), the best option is a
NoSQL database:

  • time-ordered entries
  • no duplicates
  • need for fast insertion
  • fixed fields
  • no relations between data and no referential-integrity requirements
  • need to be rotated (old entries removed)
  • etc.

So, I propose using MongoDB [1] (http://www.mongodb.org/), which fits the requirements:

  • It is fast, both at inserting and at querying.
  • It scales horizontally without disruption (if it is properly configured from the start).
  • It supports replication and High Availability.
  • It is a well-known solution, with commercial support if needed.
  • It has Python bindings (pyMongo).

[1] Note: I will not go into the details of a scalable MongoDB HA architecture.
See the quick start guide to set up a single node and the documentation
for architecture examples.

To parse the logs and store them in MongoDB, I propose a simple Python script,
accesslog_parse_mongo.py, that works as follows (a minimal sketch is included right after this list):

  • It sets up a direct MongoDB connection.
  • It reads the access log from standard input.
  • It parses the log lines and stores all the fields, including client_ip, url, referer, status code, timestamp, timezone…
  • I do not define any indexes in the NoSQL DB. Indexes could be
    created on the url or client_ip fields, but not having indexes allows faster
    insertions, which is the objective. Reads are very uncommon and performed
    in batch processes.
  • Notice that it should be improved to be more reliable. For instance, it
    does not check for errors (DB failures, etc.). It could buffer entries in case of a DB failure.
  • A second script, example_query_accesslog.py, queries the DB and prints the accesses. It takes an optional argument, the relative URL.
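
A minimal sketch of what accesslog_parse_mongo.py could look like (an illustrative draft, not a final implementation: the database/collection names, the field list and the combined-log regular expression are my assumptions, and it uses the current pyMongo API):

#!/usr/bin/env python
# accesslog_parse_mongo.py -- illustrative sketch; names and fields are assumptions
import re
import sys
from pymongo import MongoClient

# Apache "combined" log format
LINE_RE = re.compile(
    r'(?P<client_ip>\S+) \S+ \S+ \[(?P<timestamp>[^ ]+) (?P<timezone>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) \S+" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"')

def main():
    hits = MongoClient().accesslog.hits   # no indexes: we favour insertion speed
    for line in sys.stdin:
        match = LINE_RE.match(line)
        if not match:
            continue                      # silently skip lines that do not parse
        doc = match.groupdict()
        doc['status'] = int(doc['status'])
        hits.insert_one(doc)

if __name__ == '__main__':
    main()

example_query_accesslog.py would then essentially run hits.find({'url': relative_url}) and print the client_ip of each matching document.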

To feed the DB with the logs from the webservers, some solutions could be:

  • Copy the log files with a scheduled task via SSH or similar, then process them with
    accesslog_parse_mongo.py on a centralized server (or cluster of servers); a sketch of
    this approach is included after this list.

    • Pros: Logs are centralized. Only a small set of servers accesses MongoDB.
      The system can be stopped as needed.
    • Cons: Needs extra programming to gather the logs.
      No real-time data.
  • Use a centralized syslog service, like syslog-ng
    (which can be balanced and configured in HA),
    and set up all the webservers to send their logs via syslog
    (see this article).

    On the log server, we can process the resulting files with a batch process, or send all the messages straight to accesslog_parse_mongo.py. For instance, the configuration for syslog-ng:

    destination d_prog { program("/apath/accesslog_parse_mongo.py"
                                  template("$MSGONLY\n")
                                  template-escape(no)); };
    • Pros: Centralized logs. No extra programming. Real-time data.
      It uses existing infrastructure (syslog). Only a small set of servers accesses MongoDB.
    • Cons: Some log entries may be dropped. It cannot be stopped; otherwise log entries will be lost.
  • Pipe the webserver logs directly to the script, accesslog_parse_mongo.py.
    In the Apache configuration:

    CustomLog "|/apath/accesslog_parse_mongo.py" combined
    • Pros: Easy to implement. No extra programming or infrastructure. Real-time data.
    • Cons: Some log entries may be dropped. It cannot be stopped or log entries will be lost.
      The script should be improved to make it more reliable.
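
For the first option (the scheduled copy), this is a hedged sketch of what the gathering job could look like; the host names, the remote log path and the parser location are assumptions:

#!/usr/bin/env python
# gather_logs.py -- hypothetical scheduled job: pull the last rotated access log
# from each webserver over SSH and pipe it into accesslog_parse_mongo.py.
import subprocess

WEBSERVERS = ["ws%03d.example.com" % i for i in range(1, 501)]   # assumed host names
REMOTE_LOG = "/var/log/apache2/access.log.1"                     # assumed log path
PARSER = "/usr/local/bin/accesslog_parse_mongo.py"

def gather(host):
    ssh = subprocess.Popen(["ssh", host, "cat", REMOTE_LOG],
                           stdout=subprocess.PIPE)
    parser = subprocess.Popen([PARSER], stdin=ssh.stdout)
    ssh.stdout.close()   # the parser now owns the read end of the pipe
    return parser.wait() == 0 and ssh.wait() == 0

if __name__ == '__main__':
    for host in WEBSERVERS:
        if not gather(host):
            print "failed to gather logs from %s" % host

It could be scheduled from cron on the centralized server, and parallelized if a serial pass over 500+ nodes becomes too slow.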

This is one of the proposed solutions for the job assessment discussed in a previous post.

How could you backup all the data of a MySQL installation without taking the DB down at all. The DB is 100G and is write intensive. Provide script.

My solution to this problem is “Online backup using LVM snapshots on a replication slave”.
It implies that a couple of prior decisions have been taken:

  • LVM snapshots: the database data files must be on an LVM volume, which allows the creation of snapshots. An alternative could be to use Btrfs snapshots.

    Advantages:

    • Almost online backup: the DB keeps working while the data files are copied.
    • Simple to set up and free of cost.
    • You can even start a separate mysql instance (with RW snapshots in LVM2) to perform any task.

    Drawbacks:

    • To ensure data file integrity, the tables must be locked while the snapshot is created.
      Snapshot creation is fast (~2 s), but it is a lock nonetheless.
    • All data files must be in the same Logical Volume, to guarantee an atomic snapshot.
    • The snapshot adds an overhead that decreases write/read performance
      (it is said
      to be up to a 6x overhead). This is because every modified Logical Extent must be copied. After a while this overhead drops, because changes tend to hit the same Logical Extents.
  • Replication slave: (see the official documentation about replication backups and backing up raw data on slaves).
    Backup operations will be executed on a slave instead of on the master.

    Advantages:

    • It avoids the performance impact on the master MySQL server,
      since “the slave can be paused and shut down without affecting the running operation of the master”.
    • Any backup technique can be used on the slave. In fact, this measure alone could be enough.
    • Simple to set up.
    • If there is already a MySQL cluster, it reuses existing infrastructure.

    Drawbacks:

    • This is more an architectural solution or backup policy than a backup script.
    • If there are logical inconsistencies on the slave, they are included in the backup.

So, I will assume that:

  • The MySQL data files are stored in an LVM logical volume, with enough free
    LEs in the volume group to create snapshots.
  • The backup script is executed on a working replication slave, not on the master.
    Executing this script on a master would cause a short service pause (tables locked)
    and an I/O performance impact.
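
Optionally, before taking the snapshot, one could verify that the replication slave is healthy and reasonably up to date. A hedged sketch, not part of the script below: it reuses the script's MySQLdb connection style, the lag threshold is an arbitrary assumption, and SHOW SLAVE STATUS additionally requires the REPLICATION CLIENT privilege:

# Hypothetical pre-check: refuse to run the backup if the slave is broken or lagging.
import MySQLdb
import MySQLdb.cursors

def slave_is_healthy(dbconn, max_lag_seconds=300):
    cursor = dbconn.cursor(MySQLdb.cursors.DictCursor)
    cursor.execute("SHOW SLAVE STATUS")
    status = cursor.fetchone()
    cursor.close()
    if not status:
        return False          # not configured as a slave at all
    if status["Slave_IO_Running"] != "Yes" or status["Slave_SQL_Running"] != "Yes":
        return False          # one of the replication threads is stopped
    lag = status["Seconds_Behind_Master"]
    return lag is not None and lag <= max_lag_seconds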

This is the backup script (tested with xfs); it must be executed on a replication slave server:

(in github)

#!/usr/bin/python
# -*- coding: utf-8 -*-
#
# Requirements
# - Data files must be in lvm
# - Optionally in xfs (xfs_freeze)
# - User must have LOCK TABLES and RELOAD privileges::
#
#    grant LOCK TABLES, RELOAD on *.*
#        to backupuser@localhost
#        identified by 'backupassword';
#
import MySQLdb
import sys
import os
from datetime import datetime

# DB Configuration
MYSQL_HOST = "localhost" # Where the slave is
MYSQL_PORT = 3306
MYSQL_USER = "backupuser"
MYSQL_PASSWD = "backupassword"
MYSQL_DB = "appdb"

# Datafiles location and LVM information
DATA_FILES_PATH = "/mysql/data" # do not add / at the end
DATA_FILES_LV = "/dev/datavg/datalv"
SNAPSHOT_SIZE = "10G" # tune the size as needed.

SNAPSHOT_MOUNTPOINT = "/mysql/data.snapshot" # do not add / at the end

# Backup target conf
BACKUP_DESTINATION = "/mysql/data.backup"

#----------------------------------------------------------------
# Commands
# Avoid sudo asking for the password
#SUDO = "SUDO_ASKPASS=/bin/true /usr/bin/sudo -A "
SUDO = "sudo"
LVCREATE_CMD =   "%s /sbin/lvcreate" % SUDO
LVREMOVE_CMD =   "%s /sbin/lvremove" % SUDO
MOUNT_CMD =      "%s /bin/mount" % SUDO
UMOUNT_CMD =     "%s /bin/umount" % SUDO
# There is a bug in this command with the locale, we set LANG=
XFS_FREEZE_CMD = "LANG= %s /usr/sbin/xfs_freeze" % SUDO

RSYNC_CMD = "%s /usr/bin/rsync" % SUDO

#----------------------------------------------------------------
# MySQL functions
def mysql_connect():
    dbconn = MySQLdb.connect (host = MYSQL_HOST,
                              port = MYSQL_PORT,
                              user = MYSQL_USER,
                              passwd = MYSQL_PASSWD,
                              db = MYSQL_DB)
    return dbconn

def mysql_lock_tables(dbconn):
    sqlcmd = "FLUSH TABLES WITH READ LOCK"

    print "Locking tables: %s" % sqlcmd

    cursor = dbconn.cursor()
    cursor.execute(sqlcmd)
    cursor.close()

def mysql_unlock_tables(dbconn):
    sqlcmd = "UNLOCK TABLES"

    print "Unlocking tables: %s" % sqlcmd

    cursor = dbconn.cursor()
    cursor.execute(sqlcmd)
    cursor.close()

#----------------------------------------------------------------
# LVM operations
class FailedLvmOperation(Exception):
    pass

# Get the fs type with a common shell script
def get_fs_type(fs_path):
    p = os.popen('mount | grep $(df %s |grep /dev |'\
                 'cut -f 1 -d " ") | cut -f 3,5 -d " "' % fs_path)
    (fs_mountpoint, fs_type) = p.readline().strip().split(' ')
    p.close()
    return (fs_mountpoint, fs_type)

def lvm_create_snapshot():
    # XFS filesystem supports freezing. That is convenient before the snapshot
    (fs_mountpoint, fs_type) = get_fs_type(DATA_FILES_PATH)
    if fs_type == 'xfs':
        print "Freezing '%s'" % fs_mountpoint
        os.system('%s -f %s' % (XFS_FREEZE_CMD, fs_mountpoint))

    newlv_name = "%s_backup_%ilv" % \
                    (DATA_FILES_LV.split('/')[-1], os.getpid())
    cmdline = "%s --snapshot %s -L%s --name %s" % \
        (LVCREATE_CMD, DATA_FILES_LV, SNAPSHOT_SIZE, newlv_name)

    print "Creating the snapshot backup LV '%s' from '%s'" % \
            (newlv_name, DATA_FILES_LV)
    print " # %s" % cmdline

    ret = os.system(cmdline)

    # Always unfreeze!!
    if fs_type == 'xfs':
        print "Unfreezing '%s'" % fs_mountpoint
        os.system('%s -u %s' % (XFS_FREEZE_CMD, fs_mountpoint))

    if ret != 0: raise FailedLvmOperation

    # Return the path to the device
    return '/'.join(DATA_FILES_LV.split('/')[:-1]+[newlv_name])

def lvm_remove_snapshot(lv_name):
    cmdline = "%s -f %s" % \
        (LVREMOVE_CMD, lv_name)

    print "Removing the snapshot backup LV '%s'" % lv_name
    print " # %s" % cmdline

    ret = os.system(cmdline)
    if ret != 0:
        raise FailedLvmOperation

#----------------------------------------------------------------
# Mount the filesystem
class FailedMountOperation(Exception):
    pass

def mount_snapshot(lv_name):
    # XFS filesystem needs nouuid option to mount snapshots
    (fs_mountpoint, fs_type) = get_fs_type(DATA_FILES_PATH)
    if fs_type == 'xfs':
        cmdline = "%s -o nouuid %s %s" % \
                    (MOUNT_CMD, lv_name, SNAPSHOT_MOUNTPOINT)
    else:
        cmdline = "%s %s %s" % (MOUNT_CMD, lv_name, SNAPSHOT_MOUNTPOINT)

    print "Mounting the snapshot backup LV '%s' on '%s'" % \
            (lv_name, SNAPSHOT_MOUNTPOINT)
    print " # %s" % cmdline

    ret = os.system(cmdline)
    if ret != 0:
        raise FailedMountOperation

def umount_snapshot(lv_name):
    cmdline = "%s %s" % (UMOUNT_CMD, SNAPSHOT_MOUNTPOINT)

    print "Unmounting the snapshot backup LV '%s' from '%s'" % \
            (lv_name, SNAPSHOT_MOUNTPOINT)
    print " # %s" % cmdline

    ret = os.system(cmdline)
    if ret != 0:
        raise FailedMountOperation

#----------------------------------------------------------------
# Perform the backup process. For instance, an rsync
class FailedBackupOperation(Exception):
    pass

def do_backup():
    cmdline = "%s --progress -av %s/ %s" % \
                (RSYNC_CMD, DATA_FILES_PATH, BACKUP_DESTINATION)

    print "Executing the backup"
    print " # %s" % cmdline

    ret = os.system(cmdline)
    if ret != 0:
        raise FailedBackupOperation

def main():
    dbconn = mysql_connect()
    mysql_lock_tables(dbconn)

    start_time = datetime.now()
    # Critical, tables are locked!
    snapshotlv = ''
    try:
        snapshotlv = lvm_create_snapshot()
    except:
        print "Backup failed."
        raise
    finally:
        mysql_unlock_tables(dbconn)
        dbconn.close()
        print "Tables were locked for %s" % str(datetime.now()-start_time)

    try:
        mount_snapshot(snapshotlv)
        do_backup()
        umount_snapshot(snapshotlv)
        lvm_remove_snapshot(snapshotlv)
    except:
        print "Backup failed. Snapshot LV '%s' still exists. " % snapshotlv
        raise

    print "Backup completed. Elapsed time %s" % str(datetime.now()-start_time)

if __name__ == '__main__':
    main()

This is one of the proposed solutions for the job assessment discussed in a previous post.

Using an Open Source solution, design a load balancer configuration that meets: redundancy, multiples subnets, and handle 500-1000Mbit of syn/ack/fin packets. Explain scalability of your design/configs.

The main problem that a load balancer design must solve for
web applications is session stickiness (or persistence). The load balancer
design must be created according to the session replication policy of the architecture.
In addition, the load balancer must be designed to allow the upgrade and
maintenance of the servers.

Considering the architecture explained in the Webserver architecture section, the stickiness restrictions are:

  • Session stickiness must be set per site. A site failure means losing all the user sessions created in that site.
  • Session stickiness must be set per farm. A server failure is tolerated (the session will be recovered from the session DB backend).
  • Session stickiness for servers within a farm is optional (it could be set to take advantage of the OS disk cache).

The software that I propose is HAProxy (http://haproxy.1wt.eu/):

The load balancer design consists of two layers:

  • One primary HAProxy LB that balances between sites.
    It is configured with session stickiness to the site using an inserted cookie, SITEID.
    See primary-haproxy.conf.
  • One site LB in each site, balancing the site’s farms.
    It is configured with session stickiness to the farm using the cookie FARMID.
    See site1-haproxy.conf.

Extra comments:

  • Each layer can have several HAProxy instances with the same configuration, set up with a failover solution or behind a Layer 4 load balancer. See http://haproxy.1wt.eu/download/1.3/doc/architecture.txt (Section 2) for examples.
  • Additionally, an SSL frontend solution should be configured for SSL connections between the client
    and the primary load balancer. Plain HTTP can be used between the balancers and the servers.
    I will not describe this element.
  • The solutions described in the HAProxy architecture documentation, “4 Soft-stop for application maintenance”, can be used.
  • With HAProxy 1.4 you can dynamically control server weights through the stats socket.
    A monitoring system can check the farm/server health and tune the weights as needed
    (a sketch is included after this list).
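
A hedged sketch of how such a monitoring job could drive the weights through the HAProxy stats socket; it assumes a "stats socket /var/run/haproxy.sock level admin" line in the global section, which is not part of the configurations shown below:

#!/usr/bin/env python
# Hypothetical weight-tuning helper using the HAProxy stats socket.
# Assumes: stats socket /var/run/haproxy.sock level admin
import socket

HAPROXY_SOCKET = "/var/run/haproxy.sock"

def haproxy_command(command):
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.connect(HAPROXY_SOCKET)
    sock.send(command + "\n")
    response = sock.recv(4096)
    sock.close()
    return response

def set_weight(backend, server, weight):
    # "set weight" is available in the HAProxy 1.4+ stats socket
    return haproxy_command("set weight %s/%s %d" % (backend, server, weight))

if __name__ == '__main__':
    # e.g. drain part of the traffic away from an overloaded farm server
    print set_weight("site1_lb_1", "site1ws1", 50)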

This solution scales well: you simply add more servers, farms and sites.
The load balancers themselves can scale horizontally, as noted above.

Primary configuration:

#
# Primary Load Balancer configuration: primary-haproxy.conf
#
global
	log 127.0.0.1	local0
	log 127.0.0.1	local1 notice
	#log loghost	local0 info
	maxconn 40000
	user haproxy
	group haproxy
	daemon
	#debug
	#quiet

defaults
	log	global
	mode	http
	option	httplog
	option	dontlognull
	retries	3
	option redispatch
	maxconn	2000
	contimeout	5000
	clitimeout	50000
	srvtimeout	50000

listen primary_lb_1
    # We insert cookies, add headers => http mode
    mode http

    #------------------------------------
    # Bind to all addresses.
    #bind 0.0.0.0:10001
    # Bind to a clusterized virtual ip
    bind 192.168.10.1:10001 transparent

    #------------------------------------
    # Cookie persistence for PHP sessions. Options
    #  - rewrite PHPSESSID: will add the server label to the session id
    #cookie	PHPSESSID rewrite indirect
    #  - insert a cookie with the identifier.
    #    Use postonly (the session is created in the login form) or nocache to avoid being cached
    cookie SITEID insert postonly

    # We need to know the client ip in the end servers.
    # Inserts X-Forwarded-For. Needs httpclose (no Keep-Alive).
    option forwardfor
    option httpclose

    # Roundrobin is ok for HTTP requests.
    balance	roundrobin

    # The backend sites
    # Several options are possible:
    #  inter 2000 downinter 500 rise 2 fall 5 weight 100
    server site1 192.168.11.1:10001 cookie site1 check
    server site2 192.168.11.1:10002 cookie site2 check
    # etc..

Site configuration:

#
# Site 1 load balancer configuration: site1-haproxy.conf
#
global
	log 127.0.0.1	local0
	log 127.0.0.1	local1 notice
	#log loghost	local0 info
	maxconn 40000
	user haproxy
	group haproxy
	daemon
	#debug
	#quiet

defaults
	log	global
	mode	http
	option	httplog
	option	dontlognull
	retries	3
	option redispatch
	maxconn	2000
	contimeout	5000
	clitimeout	50000
	srvtimeout	50000

#------------------------------------
listen site1_lb_1
    grace 20000 # don't kill us until 20 seconds have elapsed

    # Bind to all addresses.
    #bind 0.0.0.0:10001
    # Bind to a clusterized virtual ip
    bind 192.168.11.1:10001 transparent

    # Persistence.
    # The webservers in the same farm share the session
    # with memcached. The whole site has them in a DB backend.
    mode http
    cookie FARMID insert postonly

    # Roundrobin is ok for HTTP requests.
    balance roundrobin

    # Farm 1 servers
    server site1ws1 192.168.21.1:80 cookie farm1 check
    server site1ws2 192.168.21.2:80 cookie farm1 check
    # etc...

    # Farm 2 servers
    server site1ws17 192.168.21.17:80 cookie farm2 check
    server site1ws18 192.168.21.18:80 cookie farm2 check
    server site1ws19 192.168.21.19:80 cookie farm2 check
    # etc..

Rediscovering reStructuredText

Published: November 2, 2010 in fast-tip, Misc, Technical

reStructuredText

I am "rediscovering" reStructuredText (also known as rst or ReST).

It is a markup language (like HTML or SGML) but human friendly, similar to the language of wikis. See the reStructuredText page on Wikipedia for more details.

It may seem silly, but the truth is that it can be used for many things, as a lingua franca for texts with simple formatting: documentation, ticketing systems, manuals…

But you can go further:

  • blog posts (this post is written in ReST), using this little python script (http://unmaintainable.wordpress.com/2008/03/22/using-rst-with-wordpress/)… Going further, some people even use a version control system together with that script (http://tadhg.com/wp/2009/07/14/blog-workflow-with-restructuredtext/)! Which plants an idea: wouldn't it be great to have a blog you could manage directly with git?
  • or even presentations!, as described on a page of docutils itself:
    • original in ReST: http://docutils.sourceforge.net/docs/user/slide-shows.txt
    • as an S5 presentation: http://docutils.sourceforge.net/docs/user/slide-shows.s5.html

And here is how I generated this post (leaving out this part, to avoid making the post recursively infinite :)):

    cat <<EOF | ./rst2wp
    reStructuredText
    ----------------

    Estoy *"redescubriendo"* el reStructuredText_ (también conocido como *rst* o *ReST*).

    Se trata de un lenguaje de markut (como HTML o SGML) pero *human friendly*. Similar al lenguaje de los wikis. Consulta `la pagina de reStructuredText en la wikipedia <http://en.wikipedia.org/wiki/ReStructuredText>`_ para más detalles.

    Parece tonto, pero la verdad es que se puede usar para muchas cosas, como *lenguaje franco* para textos con formato simple: Documentación, sistema de ticketing, manuales...

    Pero se puede ir más lejos:

    - entradas en blog (este post está echo con ReST), usando `este pequeño script en python <http://unmaintainable.wordpress.com/2008/03/22/using-rst-with-wordpress/>`_...

    Si vamos más lejos, `hay quien usa un gestor de versiones junto este script <http://tadhg.com/wp/2009/07/14/blog-workflow-with-restructuredtext/>`_ !. Esto hace rondar una idea por la cabeza... ¿no molaría disponer de un blog en el que pudieras gestionar directamente con git?

    - O incluso presentaciones!, como comentan en esta página del propio docutils:

    * Original en Rest: http://docutils.sourceforge.net/docs/user/slide-shows.txt

    * En HTML normal: http://docutils.sourceforge.net/docs/user/slide-shows.txt

    * Como presentación en `S5 <http://meyerweb.com/eric/tools/s5/>`_: http://docutils.sourceforge.net/docs/user/slide-shows.s5.html::

Y aquí vemos cómo generé este post (sin esta parte, para no hacer un post recursivamente infinito :))::

     cat <<EOF | ./rst2wp
    EOF
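
For the curious, the core of what a script like rst2wp does is a docutils call; this is a minimal sketch under that assumption (the part that hands the resulting HTML over to WordPress is left out, and the function name is mine):

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Minimal sketch: turn a ReST snippet into an HTML fragment with docutils.
from docutils.core import publish_parts

def rest_to_html(rest_source):
    parts = publish_parts(source=rest_source, writer_name='html')
    return parts['body']   # the HTML fragment, without <html>/<head>

if __name__ == '__main__':
    import sys
    print rest_to_html(sys.stdin.read()).encode('utf-8')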

Any Linux & Unix admin knows this fact: GNU tools are MUCH better than the native AIX, BSD, Solaris or HP-UX tools.

GNU tools have far fewer bugs, much more functionality and options, localization and better documentation; they are the de facto standard, and most scripts are built on top of GNU tools, etc., etc., etc. Why the hell don't they throw out their ugly, buggy, limited tools and install the GNU tools on their systems by default???

Here is an example of a weird behaviour of the ‘dd’ command on the AIX platform: with the skip=<Num. blocks> parameter, ‘dd’ skips the blocks, but it actually reads them (even on a filesystem with random file access). So, if you are working with big files (in my case, 50 GB), all the skipped blocks have to be read into memory before the requested position is reached. That means huge I/O, memory wasted on cache, etc…

IBM guys: don't you know that there is an lseek(2) function?
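
Just to illustrate the point, this is roughly what GNU dd does on a seekable input file: a single lseek(2) to the requested offset and then the reads, no matter how large the skipped region is (the sizes below match the timing example that follows):

#!/usr/bin/env python
# Skip 1000 MB and read 2 MB: one lseek(2), then two 1 MB reads.
import os

MB = 1024 * 1024
fd = os.open("a_big_big_file.data", os.O_RDONLY)
os.lseek(fd, 1000 * MB, os.SEEK_SET)   # jump straight to the offset; no I/O for the skipped blocks
for _ in range(2):
    os.read(fd, MB)                    # read only the 2 MB we actually want
os.close(fd)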

Here is an example of the time it takes to read 2 MB from a big file, skipping 1000 MB. The native ‘dd’ command takes 12 s:

$ time /usr/bin/dd if=a_big_big_file.data skip=1000 bs=1M count=2 of=/dev/null
2+0 records in.
2+0 records out.

real    0m12.059s
user    0m0.013s
sys     0m1.419s

With GNU’s version, less than a second:

$ time /opt/freeware/bin/dd if=a_big_big_file.data skip=1000 bs=1M count=2 of=/dev/null
2+0 records in
2+0 records out

real    0m0.024s
user    0m0.002s
sys     0m0.006s

Note: You can find GNU dd in the coreutils package of the AIX Toolbox for Linux Applications.

Update: I contacted IBM support and they told me that with the option conv=iblock, “dd” behaves as expected. But IMHO the documentation does not explicitly say that:

iblock, oblock
Minimize data loss resulting from a read or write error on direct access devices. If you specify the iblock variable and an error occurs during a block read (where the block size is 512 or the size specified by the ibs=InputBlockSize variable), the dd command attempts to reread the data block in smaller size units. If the dd command can determine the sector size of the input device, it reads the damaged block one sector at a time. Otherwise, it reads it 512 bytes at a time. The input block size (ibs) must be a multiple of this retry size. This option contains data loss associated with a read error to a single sector. The oblock conversion works similarly on output.

 


I am a Google Reader user, and I could never find the “Share in Google Reader” icon. I wanted to add it to my blog, but I did not find it anywhere… so I will post it here. It is quite easy: go to your site preferences, options, sharing settings and click on “Add a new service”.

You have to fill in the new form with:

And now you can use it…

The same can be done for other platforms.

Today I wasted some time on a supremely silly thing: I created a script to run from /etc/cron.daily called “script.sh”… and it never ran. I checked everything up and down, and finally I executed:

run-parts --report -v  --test /etc/cron.daily

and the script did not show up… what was wrong? In the end it occurred to me to remove the ‘.sh’ at the end… et voilà.

And it did not end there. Another definition in /etc/cron.d/zabbix.cron was not being loaded… same problem: the ‘.’ was to blame.

Oh well…

Link: Agile Ruined My Life

Published: September 9, 2010 in coding, management, Misc, Technical

Lately I have been looking at things like scrum, agile methodologies, and software such as maven and hudson…

I love the practical, realistic approach, oriented to people and attacking the typical problems of working environments. But not everything that glitters is gold:

http://www.whattofix.com/blog/archives/2010/09/agile-ruined-my.php

I have been asked to install the AccessStream application on Linux to test it.

But this application does not have ANY documentation or install procedure. In the end I managed to install it; here is the procedure.

(more…)