
Datetools: Date Manipulation and Cron Alternative For Linux

Introduction to Datetools

I wrote Datetools (in C++) to allow date and time manipulation from the command line. It greatly simplified my life, and maybe it will help you out too! It comprises two core applications:

  • Dateblock: blocks until a scheduled point in time arrives, unlike sleep, which blocks for a set period of time. I found this so helpful that I also built a Python extension for it.
  • Datemath: a simple way of performing math on the system date.

The source code can be found here on GitHub if you’re interested in compiling it yourself. Or you can just scroll to the bottom of this blog where I provided the packaged goods.

Dateblock

The tool works much like a combination of cron and sleep: you can pass it a crontab string if that’s what you’re used to, or you can pass the individual fields as command-line arguments (as with most other commands).

Here’s an example of what I mean:

# block until a minute divisible by 10 is reached:
# ex: HH:00, HH:10, HH:20, HH:30, HH:40, and HH:50
dateblock --minute=/10
# We will only reach this line when the above scheduled time has
# been met.
echo "Scheduled time reached; current time is: $(date)"

An equivalent crontab entry would look like this:

# block until a minute divisible by 10 is reached:
*/10 * * * * echo "Scheduled time reached; current time is: $(date)"
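
To make the /10 rule concrete, here is a minimal Python sketch of how the next unblock time could be worked out. It is an illustration only and not how Datetools is actually implemented; the helper name next_minute_boundary is hypothetical, and it assumes the step divides 60 evenly and that a call made exactly on a boundary waits for the following one.

```python
from datetime import datetime, timedelta


def next_minute_boundary(now, step):
    """Return the next time strictly after `now` whose minute is divisible
    by `step`, at second 0. Hypothetical helper; assumes 60 % step == 0."""
    # Drop seconds and microseconds so we work on whole minutes.
    base = now.replace(second=0, microsecond=0)
    # Distance (in minutes) from the current minute to the next multiple of step.
    minutes_to_add = step - (base.minute % step)
    return base + timedelta(minutes=minutes_to_add)


print(next_minute_boundary(datetime(2013, 10, 26, 9, 42, 21), 10))
```

A relative sleep cannot give this guarantee on its own, because the wake-up time would depend on when the script happened to start.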

Dateblock also supports a feature called ‘drifting’, which lets you schedule processes on delayed cycles. Note that drift is always specified in seconds. For example:

# Unblock on 5 minute cycles, but 2 minutes (120 seconds) into them:
# ex: HH:02, HH:07, HH:12, HH:17, HH:22, etc..
dateblock --minute=/5 --drift=120
# We will only reach this line when the above scheduled time has
# been met.
echo "Scheduled time reached; current time is: $(date)"

An equivalent crontab entry would look like this:

# Run on 5 minute cycles, but 2 minutes (120 seconds) into them:
*/5 * * * * sleep 120; echo "Scheduled time reached; current time is: $(date)"
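
The drift arithmetic can be sketched in a few lines of Python. This is an assumption about the behaviour described above (cycle boundaries shifted forward by the drift), not Datetools' actual code; next_unblock_with_drift is a hypothetical name and the step is assumed to divide 60 evenly.

```python
from datetime import datetime, timedelta


def next_unblock_with_drift(now, step, drift):
    """Return the next time strictly after `now` that falls `drift` seconds
    past a minute divisible by `step`. Hypothetical helper."""
    # Shift the clock back by the drift so the problem becomes
    # "find the next plain boundary".
    shifted = now - timedelta(seconds=drift)
    base = shifted.replace(second=0, microsecond=0)
    boundary = base + timedelta(minutes=step - (base.minute % step))
    # Shift forward again to land `drift` seconds into the cycle.
    return boundary + timedelta(seconds=drift)


print(next_unblock_with_drift(datetime(2013, 10, 26, 10, 3, 0), 5, 120))
```

With a step of 5 and a drift of 120, a call at 10:03:00 resolves to 10:07:00 and a call at 10:01:30 resolves to 10:02:00, matching the HH:02, HH:07, HH:12 pattern.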

The schedule can be as simple or as complex as you need it to be:

# Unblock only on hours divisible by 5 on the 1st through to the 14th
# of every month (as well as the 20th). Unblock only when 30 seconds
# of that minute has elapsed.
dateblock -o /5 -d 1-14,20 -s 30
# We will only reach this line when the above scheduled time has
# been met.
echo "Scheduled time reached; current time is: $(date)"

A crontab cannot reproduce this exactly, because cron has no seconds field. If the 30 second offset at the end is unnecessary, the closest equivalent is:

# Run at the top of every hour divisible by 5 on the 1st through to the
# 14th of every month (as well as the 20th):
0 */5 1-14,20 * * echo "Scheduled time reached; current time is: $(date)"

Just like crontabs, dateblock supports minute, hour, day of month, month and day of week. In addition, dateblock supports seconds. It accepts traditional crontab entries as well as arguments:

#     day of week (0 - 6) (Sunday=0) --+
#     month (1 - 12) ---------------+  |
#     day of month (1 - 31) -----+  |  |
#     hour (0 - 23) ----------+  |  |  |
#     min (0 - 59) --------+  |  |  |  |
#  ***sec (0 - 59) -----+  |  |  |  |  |
#                       |  |  |  |  |  |
#                       -  -  -  -  -  -
# Dateblock Cron Entry: *  *  *  *  *  *
# Cron Crontab Entry:      *  *  *  *  *

# Unblock at hours 0 and 12 (at minute 0, second 0):
# ex: 00:00:00 and 12:00:00
dateblock --cron="0 0 00,12"

You’ll notice that in the above, I didn’t bother specifying the remaining cron fields. In this case they assume the default of *, but feel free to specify a * for readability. The other thing to observe is the addition of the seconds column, which isn’t present in a regular crontab entry. Its rules are no different than what you’ve already learned from the other fields.
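
As a rough illustration of the defaulting rule, the following Python sketch pads a partial six-field entry with *. It is not taken from the Datetools source; expand_cron and the field names are hypothetical and only mirror the column diagram above.

```python
# Field order follows the diagram above: seconds come first.
FIELD_NAMES = ("second", "minute", "hour", "day_of_month", "month", "day_of_week")


def expand_cron(entry):
    """Split a dateblock-style cron entry and fill missing fields with '*'.
    Hypothetical helper for illustration only."""
    fields = entry.split()
    if len(fields) > len(FIELD_NAMES):
        raise ValueError("too many fields: %r" % entry)
    # Any field that was left out defaults to '*'.
    fields += ["*"] * (len(FIELD_NAMES) - len(fields))
    return dict(zip(FIELD_NAMES, fields))


print(expand_cron("0 0 00,12"))
```

Under this reading, "0 0 00,12" and "0 0 00,12 * * *" describe the same schedule.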

Testing

Adding the --test (-t) switch to your dateblock command runs the tool in a debugging mode: it prints the current time followed by the time at which it would have unblocked had you not provided the switch. It’s a great way to calculate when the next processing time will be.

Python Extension

To handle scheduled processes for my websites, I created a Python extension for dateblock. This let me extend its flexibility to other offline processing. Consider the following example:

from dateblock import dateblock

while True:
    # /5 as first argument means unblock on the 5th second of each minute
    dateblock('/5')
    print('begin processing ...')
    # your code here...
    # if you want, you can even report the next unblock time
    print('Done processing; blocking until %s' %
          dateblock('/5', block=False).strftime('%Y-%m-%d %H:%M:%S'))

You can also specify a drift:

from dateblock import dateblock

print('Unblock at %s' %
      dateblock('/5', drift=120, block=False).strftime('%Y-%m-%d %H:%M:%S'))

Finally, the Python extension allows you to pass in a datetime object as an argument, so the unblock time is calculated relative to that time rather than the current time (the default).

from dateblock import dateblock
from datetime import datetime
from datetime import timedelta

# 31 days ago
reftime = datetime.now() - timedelta(days=31)

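# NOTE: as written, reftime is not handed to dateblock(); supply it through
# the extension's reference-time argument (its name is not shown here)
print('Would block until %s' % 
    dateblock('/5', drift=120, block=False, )
      .strftime('%Y-%m-%d %H:%M:%S') + 
    " if time was %s" % reftime
      .strftime('%Y-%m-%d %H:%M:%S'))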

Things to Consider

Just like sleep, dateblock uses SIGALRM to manage its wake-up time. Therefore, if your code relies heavily on SIGALRM for other purposes, dateblock may not be the solution for you, since you could interfere with its reliability (though this is not likely). This really shouldn’t be a big concern, because the exact same warning comes with the sleep libraries we’ve been using for years. But it does mean that sleep could interfere with dateblock, just as dateblock could interfere with sleep, if they were both used in separate threads.
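
The reason two SIGALRM users can step on each other is that a process has only one pending alarm at a time. The sketch below demonstrates this with Python's standard signal.alarm on a Unix system; it says nothing about how dateblock arms its timer internally, and demo_alarm_replacement is a hypothetical name.

```python
import signal


def demo_alarm_replacement():
    """Show that a process has only one pending alarm at a time (Unix only)."""
    # Remember (and temporarily replace) any alarm that was already pending.
    previous = signal.alarm(30)
    # A second request replaces the first; the return value is the number
    # of seconds the replaced alarm still had left to run.
    replaced = signal.alarm(5)
    # Passing 0 cancels the pending alarm without scheduling a new one.
    cancelled = signal.alarm(0)
    # Put back the alarm that was pending before this demo, if any.
    if previous:
        signal.alarm(previous)
    return replaced, cancelled


replaced, cancelled = demo_alarm_replacement()
print("30s alarm had %ds left; 5s alarm had %ds left" % (replaced, cancelled))
```

The 30 second alarm never fires, because the later request silently replaced it. That is the kind of interference the warning above is about.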

Dateblock vs Sleep

Why would I use dateblock over sleep?

Scheduling is the key. If your program relies completely on sleep, then the only thing you’re accomplishing is CPU throttling (controlling unnecessary thrashing). This approach is fine if you’re just going to retry connecting to an unresponsive server every few seconds. But what if timing becomes an important factor in your application? The dateblock tool ensures you only unblock at absolute times, whereas sleep unblocks at times relative to when it was called.

Dateblock also allows your program to work in step with other applications that may be on their own processing cycle, such as something that delivers data at the top of every hour. You may wish to have your program wake up 5 minutes after the top of each hour to perform the processing, regardless of when your program was started.
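
The difference between absolute and relative waiting can be shown with a short Python sketch. It computes how long a process would have to sleep to wake at a fixed minute past the hour; it is only an illustration of the idea, and seconds_until_minute_past_hour is a hypothetical helper rather than part of Datetools.

```python
from datetime import datetime, timedelta


def seconds_until_minute_past_hour(now, minute):
    """Return the seconds from `now` until the next HH:<minute>:00.
    Hypothetical helper for illustration only."""
    target = now.replace(minute=minute, second=0, microsecond=0)
    # If that moment has already passed this hour, aim for the next hour.
    if target <= now:
        target += timedelta(hours=1)
    return (target - now).total_seconds()


# Started at 09:42:21, the next "5 minutes past the hour" is 10:05:00.
print(seconds_until_minute_past_hour(datetime(2013, 10, 26, 9, 42, 21), 5))
```

A plain sleep of 3600 seconds started at 09:42:21 would instead wake at 10:42:21, and that offset would persist (and slowly grow with processing time) on every cycle.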

Datemath

There isn’t as much to be said about Datemath. I personally never found a Linux/Unix tool that would allow me to script date/time calculations from the command line, and for that reason this tool exists.
Here is an example of the tool’s function:

# what time is it now?
date +'%Y-%m-%d %H-%M-%S'
# The above output was '2013-10-26 09-42-21' at the time of this blog
# what time will it be 5 months and 3 days from now
datemath --months=5 --days=3 --format='%Y-%m-%d %H-%M-%S'
# the above output was '2014-03-29 09-42-21' at the time of this blog
# and this makes sense... this is the calculation we want.

The tool also supports negative values for calculating into the past, and it handles leap years in its calculations.

# what time is it now?
date +'%Y-%m-%d %H-%M-%S'
# The above output was '2013-10-26 09-45-45' at the time of this blog
# What was the current date 753 days ago?
datemath --days=-753 --format='%Y-%m-%d %H-%M-%S'
# the above output was '2011-10-04 09-45-45' at the time of this blog
# and this makes sense... this is the calculation we want.

No Python Module For Datemath

There is no Python module for datemath because Python’s datetime and timedelta classes already provide a fantastic solution for the same problem datemath solves:

# Simple code example to show why it really isn't
# necessary to port datemath to Python:
from datetime import datetime
from datetime import timedelta

in_the_past = datetime.now() - timedelta(minutes=15)
print('15 minutes ago: %s' %
      in_the_past.strftime('%Y-%m-%d %H:%M:%S'))
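
One caveat: timedelta has no months argument, so the "5 months and 3 days" example from earlier needs a small helper. The sketch below is one way to do it with only the standard library; add_months is a hypothetical name, and clamping to the last day of a shorter month is an assumption that may not match how datemath treats that case.

```python
import calendar
from datetime import datetime, timedelta


def add_months(moment, months):
    """Shift `moment` by a (possibly negative) number of months, clamping
    the day to the end of the target month. Hypothetical helper."""
    # Work with a zero-based month count so year rollover is simple division.
    index = moment.year * 12 + (moment.month - 1) + months
    year, month = divmod(index, 12)
    month += 1
    last_day = calendar.monthrange(year, month)[1]
    return moment.replace(year=year, month=month, day=min(moment.day, last_day))


start = datetime(2013, 10, 26, 9, 42, 21)
later = add_months(start, 5) + timedelta(days=3)
print(later.strftime('%Y-%m-%d %H-%M-%S'))  # 2014-03-29 09-42-21
```

Day-based offsets such as the 753 days example need no helper at all: start - timedelta(days=753) is enough.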

Just give me the goods

No problem; below are the RPMs along with their accompanying source packages:

Package           Download           Description
dateblock         el6.rpm / el7.rpm  The powerful (cron like) command line interface (CLI) tool.
python-dateblock  el6.rpm / el7.rpm  The Python extension for dateblock.
datemath          el6.rpm / el7.rpm  The datemath command line interface tool.
datetools         el6.rpm / el7.rpm  An optional package which includes licensing and information.

Note: The source RPM, which builds everything you see in the table above, can be obtained here. It’s not required for the application to run, but it might be useful for developers or those who want to inspect how I put the package together.

No way, I’m building this myself; I don’t trust you

That’s okay, I understand; here is how you can build it yourself:

# Install 'mock' into your environment if you don't have it already
# This step will require you to be the superuser (root) in your native
# environment.
yum install -y mock

# Grant your normal every day user account access to the mock group
# This step will also require you to be the root user.
usermod -a -G mock YourNonRootUsername

At this point it’s safe to change from the ‘root’ user back to the user account you granted the mock group privileges to in the step above. We won’t need the root user again until the end of this tutorial when we install our built RPM.

# Create an environment we can work in
mkdir datetool-build

# Change into our temporary working directory
cd datetool-build

curl -L --output datetools-0.8.1.tar.gz \
   https://github.com/caronc/datetools/archive/v0.8.1.tar.gz

# Extract Spec File
tar xfz datetools-0.8.1.tar.gz \
   datetools-0.8.1/datetools.spec \
      --strip-components=1

# Initialize Mock Environment
mock -v -r epel-6-x86_64 --init

# Now install our dependencies
mock -v -r epel-6-x86_64 --install boost-devel libstdc++-devel \
          glib-devel python-devel autoconf automake libtool

# Copy in our downloaded content:
mock -v -r epel-6-x86_64 --copyin datetools-0.8.1.tar.gz \
   /builddir/build/SOURCES
mock -v -r epel-6-x86_64 --copyin datetools.spec \
   /builddir/build/SPECS

# Shell into our environment
mock -v -r epel-6-x86_64 --shell

# Change to our build directory
cd /builddir/build

# Build our RPMS
rpmbuild -ba SPECS/datetools.spec

# We're done with our mock environment for now; press Ctrl-D to exit
# or simply type exit on the command line of our virtual environment
exit

Future Considerations

This is totally up in the air; at the moment the tool does everything I needed at the time. However, I could see the following becoming useful features in the future:

  • Pass a different time into both programs instead of always working with the current time (you can already do this with the dateblock Python extension).
  • Have dateblock additionally take a program and its arguments as input, and execute it automatically when the scheduled time is reached. This would also mean the dateblock tool daemonizes itself and runs in the background on recurring schedules.
  • Add a devel package and create a shared library for C++ linking; perhaps the binary tools and extensions could link against it too. Right now the library is so small that it’s really no trouble to include it statically.
  • Got an idea of your own? Pass it along! You can also submit a pull request to me on GitHub here.

Credit

Please note that this information took me several days to put together and test thoroughly. I may not blog often, but I want to assure you of the stability of, and the testing I put into, everything I intend to share.

If you like what you see and wish to copy and paste this information, please reference back to this blog post at the very least. It’s really all I ask.