Category Archives: Programming

Probing PHP with Systemtap on Linux

DTrace is a dynamic tracing tool built by Sun Microsystems and available for Solaris, Mac OS X and FreeBSD. It features a tracing language that can be used to instrument probe points in the kernel or in userland, which is very useful for gathering statistics and similar tasks. Linux comes with a separate solution called SystemTap. It also features a tracing language and can probe both userland and kernel space. A few Linux distributions, such as Fedora, enable SystemTap support in their default kernel.

PHP introduced DTrace support with PHP 5.3, adding probe points to the PHP executable that simplify probing PHP applications without having to know the PHP implementation details. We enabled probes on function calls, file compilation, exceptions and errors. But this has always been limited to the operating systems that support DTrace. With the popularity of DTrace, the SystemTap developers decided to add a DTrace compatibility layer that allows DTrace probes to be used as SystemTap probe points as well.

With my recent commit to the PHP 5.5 branch, DTrace probes can be built on Linux, so people can use SystemTap to probe those userland probes.

To compile PHP with userland probes you need to obtain PHP 5.5 from git:

$ git clone git://github.com/php/php-src php-src
$ cd php-src
$ git checkout PHP-5.5
Now build PHP with DTrace support. Because we build directly from the repository, we first have to regenerate configure. Make sure your Linux distribution comes with SystemTap and uprobes support.

$ ./buildconf --force
$ ./configure --disable-all --enable-dtrace
$ make
Once the build is done, we can check whether stap finds any probes:

$ stap -l 'process.provider("php").mark("*")' -c 'sapi/cli/php -i'
process("sapi/cli/php").provider("php").mark("compile__file__entry")
process("sapi/cli/php").provider("php").mark("compile__file__return")
process("sapi/cli/php").provider("php").mark("error")
process("sapi/cli/php").provider("php").mark("exception__caught")
process("sapi/cli/php").provider("php").mark("exception__thrown")
process("sapi/cli/php").provider("php").mark("execute__entry")
process("sapi/cli/php").provider("php").mark("execute__return")
process("sapi/cli/php").provider("php").mark("function__entry")
process("sapi/cli/php").provider("php").mark("function__return")
process("sapi/cli/php").provider("php").mark("request__shutdown")
process("sapi/cli/php").provider("php").mark("request__startup")
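
The listing spells the probe names with double underscores, while DTrace shows the same probes with hyphens (for example function-entry). Here is a small sketch in Python that pulls the marker names out of such a listing and converts them to the hyphenated spelling. The regular expression and the function name probe_names are my own assumptions, based only on the output shown here.

```python
import re

# Matches the marker name in one line of `stap -l` output as printed above.
MARK = re.compile(r'\.mark\("([^"]+)"\)')

def probe_names(listing):
    """Extract marker names and map them to the hyphenated DTrace spelling."""
    names = []
    for line in listing.splitlines():
        match = MARK.search(line)
        if match:
            # The double underscore stands for the hyphen in the DTrace name.
            names.append(match.group(1).replace("__", "-"))
    return names

listing = '''process("sapi/cli/php").provider("php").mark("compile__file__entry")
process("sapi/cli/php").provider("php").mark("error")
process("sapi/cli/php").provider("php").mark("function__entry")'''
print(probe_names(listing))
```

Lines that do not contain a marker are skipped, so the output of stap -l can be fed in unchanged.
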
Let’s build a short SystemTap script that counts the calls per function. We use the function-entry probe for that:

$ cat request.stp
global callcount;
probe process.provider("php").mark("function-entry") {
    callcount[user_string($arg1)] += 1;
}
probe end {
    printf("count : function\n");
    foreach (name in callcount) {
        printf("%5d : %s\n", callcount[name], name);
    }
}
$ sudo stap -c 'sapi/cli/php test.php' request.stp
count : function
  100 : foo
  101 : bar

So that’s all. You can use systemtap now to probe your PHP. Hope you come up with some useful scripts. Share them!

Language runtimes and backwards compatibility (or why you shouldn’t write a version control system in Python)

Software projects choose languages based on the idioms of those languages. Languages can provide mechanisms and structures to support object orientation or functional programming. Less time is spent thinking about the backwards compatibility of programming language runtimes. While this is usually a non-issue for short-lived software like websites or software in tightly controlled environments, it becomes an issue for software projects that need to guarantee backwards compatibility for years. For example: a version control system.

The Mercurial project aims to support Python 2.4 to Python 2.7. It does not support Python 3. Why? Python 3 is a drastic change: Unicode is the default string type, old-style classes were removed, etc. The impact of the changes is similar to the change from PHP 4 to PHP 5. Most software projects have adopted these language changes, but for projects that need to support LTS operating systems like RHEL or Solaris 9/10, it can become an issue. You could drop Python 2.X support and tell existing users of your software to look for something else, which is a no-go for a version control system. You could simply postpone Python 3 support to some later day, but Python 2.7 already reached its EOL. It’s just a matter of time until distributions stop shipping Python 2.X. LTS operating systems might still not have Python 3 and rely on Python 2. Writing software that needs to be backwards-compatible for 8 years can be a problem.

The source of the problem

Why is this not an issue for Java or C, but for Python, PHP and Ruby? Java and C compile to a target format that is guaranteed to be stable. C compiles to machine code. A processor architecture won’t change anymore: if it’s an x86 processor, it will support x86 machine code, and that won’t change with the next software update. If you need to support old C code that modern compilers don’t understand anymore, use an old compiler. Java is similar in that regard. The JVM has a defined set of instructions, which won’t be changed anymore. It doesn’t matter which Java compiler you use; in the end it will produce bytecode that will run on any JVM. Sure, you still might have problems supporting multiple versions of a library, but at least the JVM will always run your compiled code.

Python and PHP compile to bytecode as well, similar to Java. There is, however, one difference: they do it in memory, and the VM that interprets the bytecode is bundled with the compiler. This is where the backwards compatibility problem comes into play. You cannot run Python bytecode compiled on Python 3 with a Python 2 interpreter. You cannot compile with PHP 5 and run it on PHP 4. Either the interpreter simply fails to run your old code, or your VM implementation is not guaranteed to be stable. That means in Python and PHP the underlying machine that you compile to might change with the next update. Let’s compare this to the x86 world: imagine your next software update changed the x86 instruction set. You would have to recompile all your C code, and maybe some of the old C code cannot be compiled with modern C compilers, and old C compilers might not be able to get compiled on the new instruction set. Sounds painful, particularly if you really care about backwards compatibility.
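
To make the coupling between interpreter and bytecode a bit more tangible, here is a minimal sketch using the Python 3 standard library. CPython also caches bytecode in .pyc files and stamps each file with a magic number that belongs to the interpreter version that wrote it. The helper names bytecode_identity and can_load are made up for this illustration, and the check only mirrors the header comparison, not the full import logic.

```python
import importlib.util
import sys

def bytecode_identity():
    """Return (major, minor) of this interpreter and the magic number it writes into .pyc files."""
    return sys.version_info[:2], importlib.util.MAGIC_NUMBER

def can_load(pyc_header, magic=importlib.util.MAGIC_NUMBER):
    """Mirror the header check: the first four bytes must equal the interpreter's magic number."""
    return pyc_header[:4] == magic

version, magic = bytecode_identity()
print("Python %d.%d magic number: %r" % (version[0], version[1], magic))

# A header with a different magic number (made-up bytes for illustration) is rejected.
print(can_load(b"\x00\x00\r\n"))
```

Running this under two different Python releases prints two different magic numbers, which is exactly the instability described above: the bytecode format is a property of the interpreter release, not of the language.
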

Sidenote

I think that Python, PHP and others made an architectural mistake. They bundled the VM and runtime with the compiler, so your language version defines your runtime and the underlying machine code. If you write a new language, write down a minimum instruction set that you will always support, and separate your VM from your compiler. Always supporting that instruction set can lead to interesting problems. The implementation of Java generics is a good example. Nobody thought about generics when defining the instruction set, so the bytecode was not designed to retain information about the generic type. That’s why the Java compiler needs to check the generic type information and then transform it, so that the resulting bytecode is compatible with old JVM versions. This is known as type erasure. Python and PHP developers would probably just introduce new bytecodes, not caring about BC. (Well, PHP devs would just pretend that PHP is a web language and web projects shouldn’t care about BC at all ;)).

Conclusion
If you seriously care about backwards compatibility for LTS systems that are 8 years old, choose a language which separates the VM from the compiler. Languages like Java (and probably C#) do this. Java developers won’t define behavior that requires a new opcode. PHP and Python are wonderful programming languages, but personally I am not sure if it is wise to write something like a VCS in such a language.

Long story short: language choice matters for BC. If you write your own language, please separate your VM from your compiler. Better yet (as Johannes pointed out), compile to an existing VM like JVM, CLR or LLVM.

How to run clojure.test in Slime and Swank

$ lein swank
In Emacs use M-x slime-connect to connect to swank.
user> (use 'clojure.test)
nil
user> (use :reload 'geocommit.test.services) (run-tests 'geocommit.test.services)
{:type :summary, :test 3, :pass 9, :fail 0, :error 0}

Locate your commits or how to use geocommit.

We recently launched geocommit.com. Geocommit is a service to add geolocation data to your commits. You only need a working WiFi connection. No GPS module is required.

This blog post shows how to use geocommit and the geocommit.com services. I’ll show how to use geocommits in your Git or Mercurial project, how to make github and bitbucket more beautiful with our Chrome and Firefox extensions, and how to get a fancy map of your geocommits.

What is geocommit

First of all, geocommit is a text format to attach geolocation data to version control system commits. The geocommit website has detailed information about the geocommit format.
Second, geocommit is a service to store and analyse your geocommit data. We offer a set of tools and a webservice to make geocommit cool. The Git implementation git geo runs on Mac OS X and Linux. The Mercurial implementation hg-geo runs only under Linux. Mac OS support is under way.

Git & Geocommit

To start with geocommit, install git geo:

$ pip install geocommit

Go to a project directory and enable geocommit support:

$ cd myproject.git
$ git geo setup
geocommit setup
Installing geocommit hook in /home/dsp/awesomeproject/.git/hooks/post-rewrite
Installing geocommit hook in /home/dsp/awesomeproject/.git/hooks/post-merge
Installing geocommit hook in /home/dsp/awesomeproject/.git/hooks/post-commit

This will enable geocommit support in your project. If you commit something with git commit, git geo will try to get your current location and add a geocommit. If no WiFi connection is enabled, no geocommit will be created.

Check your geocommits:

$ git log --show-notes='geocommit'
commit 5a34e6ebc8cb5c2a394ca26505c1d375095161c4
Merge: 25cf72d 828af6e
Author: David Soria Parra 
Date:   Tue Jan 4 14:00:55 2011 +0100

    Merge branch 'master' of https://github.com/jezdez/geocommit

Notes (geocommit):
    geocommit (1.0)
    lat: 48.1211828
    long: 11.4853565
    hacc: 39.0
    src: nmg

Let’s push our geocommits to github:

$ git geo push

git geo push accepts the same options as git push. It pulls geocommits first, merges them and then pushes geocommits and the given branch to the remote repository.
That’s everything you need. Easy, isn’t it? So let’s see how to enable geocommits on Mercurial and then talk about the Chrome and Firefox extensions.


Deep dive
git geo stores geocommits in git notes, using the namespace geocommit. Git notes have some cool properties. They are metadata and don’t change the commit hash, so they can be added to a commit at any time. They are displayed on github and can be deleted without any problem. You can also decide yourself when, or whether, to push geocommits, and you can delete already pushed geocommits without breaking the repository or changing any commit SHA-1. The drawback is that git notes can be hard to deal with at times. git notes is a new feature in git and not yet fully supported. We had to write a script to merge git notes, as git notes merge is not available before git 1.7.7.
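
Because the note is plain text, reading it back is easy. Here is a minimal sketch in Python that parses the long form as it appears in the git log output earlier in this post. The function name parse_geocommit_note is hypothetical, and the parsing rules are inferred from that single example rather than taken from the format specification, so treat it as an illustration only.

```python
def parse_geocommit_note(note):
    """Parse the long 'key: value' geocommit form shown in the git log output."""
    lines = [line.strip() for line in note.strip().splitlines() if line.strip()]
    if not lines or not lines[0].startswith("geocommit"):
        return None
    # Header looks like "geocommit (1.0)"; keep what is inside the parentheses.
    version = lines[0][len("geocommit"):].strip().strip("()")
    fields = {"version": version}
    for line in lines[1:]:
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields

note = """
    geocommit (1.0)
    lat: 48.1211828
    long: 11.4853565
    hacc: 39.0
    src: nmg
"""
fields = parse_geocommit_note(note)
print(float(fields["lat"]), float(fields["long"]))
```

The note text itself could come from git notes --ref=geocommit show with a commit id, which prints the note attached to that commit.
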


Mercurial & geocommit

You can add support for geocommits to Mercurial by installing the hg-geo extension. Clone the extension and enable it in your hgrc:

$ hg clone http://bitbucket.org/segv/hg-geo
$ printf '[extensions]\ngeo=/path/to/hg-geo/geo.py\n' >> ~/.hgrc
$ hg help geo

The extension will add an additional line to every commit that you do.

$ hg commit
$ hg log -v
changeset:   9:236a0f4c3d2e
tag:         tip
user:        David Soria Parra 
date:        Sun Jan 02 03:01:04 2011 +0100
files:       .hgtags
description:
Added tag v1.0.0 for changeset 3079e3ff3083

geocommit(1.0): lat 48.1211306, long 11.4853251, hacc 30.0, src nmg;

Now push your geocommits to bitbucket.

$ hg push

Deep dive
As Mercurial doesn’t have a way to store metadata, we are adding the geocommit data to the commit message itself. The obvious advantage is that you can use hg-geo with plain Mercurial. You do not need to enable hg-geo on the remote site to push geocommits (like Mercurial bookmarks). The disadvantage is that we modify the commit message and therefore the commit hash. There is no easy way to delete geocommits once they are created.
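
Since the data lives in the commit message, any tool can pull it out again. Here is a small sketch in Python that finds the short-form line in a commit message and splits it into fields. The regular expression and the name parse_short_geocommit are my own assumptions, inferred from the single example line in the hg log output earlier; the real format may allow more than this handles.

```python
import re

# Pattern inferred from the one example line in the hg log output.
GEOCOMMIT_LINE = re.compile(r"^geocommit\s*\((?P<version>[^)]+)\):\s*(?P<body>.*?);?\s*$")

def parse_short_geocommit(message):
    """Return a dict of geocommit fields found in a commit message, or None."""
    for line in message.splitlines():
        match = GEOCOMMIT_LINE.match(line.strip())
        if not match:
            continue
        fields = {"version": match.group("version")}
        # The body is a comma-separated list of "key value" pairs.
        for pair in match.group("body").split(","):
            key, _, value = pair.strip().partition(" ")
            if key:
                fields[key] = value.strip()
        return fields
    return None

message = """Added tag v1.0.0 for changeset 3079e3ff3083

geocommit(1.0): lat 48.1211306, long 11.4853251, hacc 30.0, src nmg;"""
print(parse_short_geocommit(message))
```

Commit messages without a geocommit line simply yield None, so the function can be run over a whole log.
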


bitbucket.org and github.com

We can push geocommits easily now. But how to use them? We can install the Firefox or Chrome extension. This will display a map next to your commit!

Firefox
To install the geocommit extension for Firefox you need Greasemonkey. Greasemonkey is a well-known and well-supported extension that enables user scripts to safely modify the displayed website.

Install Greasemonkey from userscripts.org. You can then browse bitbucket.org or github.com and see a map of your geocommit:

bitbucket with geocommit support

github with geocommit support

Chrome
On Chrome install the plugin from chrome.google.com.

Post Hook

We offer a post hook that you can use with github.com and bitbucket.org. Your commits will be tracked by geocommit.com, and we will create a global and a project-specific map as well as provide further analytics as soon as possible.

github.com
To install the hook, go to the admin section of your repository and select Service Hooks.

Add http://hook.geocommit.com/api/github as a POST service hook.

bitbucket.org
Go to the admin section of your repository and select Services.

Add http://hook.geocommit.com/api/bitbucket as a POST service hook.

That’s about it. Browse www.geocommit.com/full.html to check out your commits on our map.

Questions?!
Enjoy!

Mercurial Bookmarks Revisited – Part I

Bookmarks is an extension to the Mercurial SCM, which adds git-like branches to Mercurial. The extension is distributed together with Mercurial.
Recently the extension has received a major update. Time to look back.

This is a series of blogposts that consists of three parts:

(1) Part I: History of Bookmarks
(2) Part II: Daily Bookmarking
(3) Part III: Pushable Bookmarks


php-trunk macport

MacPorts is a widely used ports system for Mac OS. Its repository contains hundreds of applications that can be compiled and installed. The repository contains PHP 5.3, so if you want to run PHP from subversion you still have to compile and install it yourself, outside your managed ports environment. I created a rather simple Portfile to build it from PHP’s trunk.

To use the php-trunk Portfile, just extract the tarball into a directory and…

$ tar xzvf php-trunk-port.tar.gz
$ cd php-trunk-port
$ echo "file://`pwd`" >> /opt/local/etc/macports/sources.conf
$ port install php-trunk

And the best thing about the port: It compiles PHP with dtrace support :).

Please note the Portfile is very simple and is not tested with the additional modules provided by macports. So use it at your own risk and enhance it.

DOWNLOAD

UPDATE: as philip noted, it should be >> instead of >.

PHP 5.3.99-dev and DTrace Part I

For those not following PHP development: we backported the DTrace probes from the abandoned PHP 6.0 branch to the new trunk, PHP 5.3.99-dev. It is called 5.3.99 because the PHP dev community has not yet decided on a version number (5.4 or 6.0).

To compile PHP with DTrace, on Solaris or Mac OS do:

  $ svn co http://svn.php.net/repository/php/php-src/trunk php
  $ cd php
  $ ./buildconf --force
  $ ./configure --enable-dtrace
  $ make

To check if your PHP has DTrace probes enabled, you can check the phpinfo() output using

 $ php -i | grep DTrace

or use DTrace directly:

 $ sudo dtrace -ln 'php*:::' -c 'php -m'

88836    php4614               php               dtrace_compile_file compile-file-entry
88837    php4614               php               dtrace_compile_file compile-file-return
88838    php4614               php                        zend_error error
88839    php4614               php  ZEND_CATCH_SPEC_CONST_CV_HANDLER exception-caught
88840    php4614               php     zend_throw_exception_internal exception-thrown
88841    php4614               php           dtrace_execute_internal execute-entry
88842    php4614               php                    dtrace_execute execute-entry
88843    php4614               php           dtrace_execute_internal execute-return
88844    php4614               php                    dtrace_execute execute-return
88845    php4614               php                    dtrace_execute function-entry
88846    php4614               php                    dtrace_execute function-return
88847    php4614               php              php_request_shutdown request-shutdown
88848    php4614               php               php_request_startup request-startup

Have fun with DTrace in PHP.

A few examples will follow in the next days.

Dealing with Sessions in Compojure

I recently started working with Compojure, a web framework for the programming language Clojure. After playing around with basic GET and POST requests I tried to store and load data from a session. As Compojure is quite new, there is not much documentation on how to deal with sessions, particularly as Compojure and its API still change fast.

In the following post I’ll show a brief example on how to create a login page with Compojure 0.3.2 running on Clojure 1.1.0.


Talks

I really enjoy giving talks. This is particularly because I like to teach people something and because I’m really enthusiastic about the technical things I talk about. One of these things is obviously decentralized version control systems, in particular Git and Mercurial. Finally, after two years of submitting talks to various conferences, people and conferences in the PHP community are starting to pick up this topic. It seems that 2010 is the year of DVCS, and I’m really looking forward to giving a talk about the advanced features of Git at

The talk will give a very brief overview of how Git works, and will then give a more detailed insight into how Git handles commits, files, etc., so that people get a very good understanding of the concepts needed to fully understand tools like git rebase, git reflog and git svn. The aim is to provide them with all necessary information and a few examples to get lost commits back, rebase their branches and design more complex git workflows in the future without needing to search the web or ask a guru.

A second talk will be more focused on beginners and developers coming from subversion. This talk will be part of a series of talks the German Telekom is organizing. I’ll also give an extended version of this as an in-house workshop at a Munich-based company.

So for me it seems that after five years, DVCS is mature enough to get into companies and that we can expect a broad variety of companies to adopt new tools and workflows. Let’s see what’s coming…

Of course, I still offer Git and Mercurial trainings, so feel free to contact me (dsp ~at~ php ~dot~ net).

probably shameless self promotion