Category Archives: PHP

Probing PHP with Systemtap on Linux

DTrace is a dynamic tracing tool built by Sun Microsystems and available for Solaris, Mac OS and FreeBSD. It features a tracing language that can be used to instrument probe points in the kernel or in userland, which is very useful for gathering statistics and similar data. Linux comes with a separate solution called Systemtap. It also features a tracing language and can probe both userland and kernel space. A few Linux distributions, such as Fedora, enable Systemtap in their default kernel.

PHP introduced DTrace support with PHP 5.3, adding probe points to the PHP executable that make it possible to trace PHP applications without having to know the PHP implementation details. We enabled probes on function calls, file compilation, exceptions and errors. But this has always been limited to the operating systems that support DTrace. Given the popularity of DTrace, the Systemtap developers decided to add a DTrace compatibility layer that allows DTrace probes to be used as Systemtap probe points as well.

With my recent commit to the PHP 5.5 branch, we allow DTrace probes to be built on Linux, so people can use Systemtap to trace those userland probes.

To compile PHP with userland probes, you need to obtain PHP 5.5 from git:

$ git clone git://github.com/php/php-src php-src
$ cd php-src
$ git checkout PHP-5.5
Now build PHP with DTrace support. First we have to rebuild configure as we build directly from the repository. Make sure your Linux distribution comes with systemtap and uprobes support.

$ ./buildconf --force
$ ./configure --disable-all --enable-dtrace
$ make
Once the build is done, we can check whether the probes are available:

$ stap -l 'process.provider("php").mark("*")' -c 'sapi/cli/php -i'
process("sapi/cli/php").provider("php").mark("compile__file__entry")
process("sapi/cli/php").provider("php").mark("compile__file__return")
process("sapi/cli/php").provider("php").mark("error")
process("sapi/cli/php").provider("php").mark("exception__caught")
process("sapi/cli/php").provider("php").mark("exception__thrown")
process("sapi/cli/php").provider("php").mark("execute__entry")
process("sapi/cli/php").provider("php").mark("execute__return")
process("sapi/cli/php").provider("php").mark("function__entry")
process("sapi/cli/php").provider("php").mark("function__return")
process("sapi/cli/php").provider("php").mark("request__shutdown")
process("sapi/cli/php").provider("php").mark("request__startup")
Let’s build a short Systemtap script that counts how often each function is called. We use the function-entry probe for that:

$ cat request.stp
global callcount;
probe process.provider("php").mark("function-entry") {
    callcount[user_string($arg1)] += 1;
}
probe end {
    printf("count : function\n");
    foreach (name in callcount) {
        printf("%5d : %s\n", callcount[name], name);
    }
}
$ sudo stap -c 'sapi/cli/php test.php' request.stp
count : function
  100 : foo
  101 : bar

So that’s all. You can now use Systemtap to probe your PHP. I hope you come up with some useful scripts. Share them!

Language runtimes and backwards compatibility (or why you shouldn’t write a version control system in Python)

Software projects choose languages based on the idioms of those languages. Languages can provide mechanisms and structures to support object orientation or functional programming. Less time is spent thinking about the backwards compatibility of programming language runtimes. While this is usually a non-issue for short-lived software like websites or software in tightly controlled environments, it becomes an issue for software projects that need to guarantee backwards compatibility for years. For example: a version control system.

The Mercurial project aims to support Python 2.4 to Python 2.7. It does not support Python 3. Why? Python 3 is a drastic change: Unicode is the default string type, old-style classes were removed, and so on. The impact of the changes is similar to the change from PHP 4 to PHP 5. Most software projects have adopted these language changes, but for projects that need to support LTS operating systems like RHEL or Solaris 9/10, it can become an issue. You could drop Python 2.X support and tell existing users of your software to look for something else, which is a no-go for a version control system. You could simply never support Python 3, but Python 2.7 has already reached its EOL. It’s just a matter of time until distributions stop shipping Python 2.X. LTS operating systems might still not have Python 3 and rely on Python 2. Writing software that needs to be backwards-compatible for 8 years can be a problem.
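To make the string change concrete: in Python 3, text (`str`) and raw data (`bytes`) are separate types that cannot be mixed implicitly, so code that handles file names or file contents as bytes has to be explicit about it everywhere. The following is only a minimal sketch; `join_path` and `try_mixed_join` are made-up helpers for illustration and are not taken from Mercurial.

```python
def join_path(directory, name):
    """Join two path components that are both given as raw bytes."""
    return directory + b"/" + name


def try_mixed_join():
    """Mix bytes and text, which Python 2 accepted for ASCII strings.

    Python 3 raises TypeError instead; this returns None in that case.
    """
    try:
        return b"repo" + "/" + "file.txt"
    except TypeError:
        return None


if __name__ == "__main__":
    # Text has to be encoded explicitly before it can be joined with bytes.
    print(join_path(b"repo", "größe.txt".encode("utf-8")))
    print(try_mixed_join())
```

Every such implicit mix in an existing code base has to be found and converted by hand, which is part of what makes the port expensive.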

The source of the problem

Why is this not an issue for Java or C, but for Python, PHP and Ruby? Java and C compile to a target that is guaranteed to be stable. C compiles to machine code, and a processor architecture won’t change anymore: if it’s an x86 processor, it will support x86 machine code. It won’t change with the next software update. If you need to support old C code that modern compilers don’t understand anymore, use an old compiler. Java is similar in that regard. The JVM has a defined set of instructions, which won’t be changed anymore. It doesn’t matter which Java compiler you use; in the end it will produce bytecode that will run on any JVM of that version or newer. Sure, you still might have problems supporting multiple versions of a library, but at least the JVM will always run your compiled code.

Python and PHP compile to bytecode as well, similar to Java. There is, however, one difference: they do it in memory, and the VM that interprets the bytecode is bundled with the compiler. This is where the backwards compatibility problem comes into play. You cannot run Python bytecode compiled on Python 3 with a Python 2 interpreter. You cannot compile with PHP 5 and run it on PHP 4. Either the interpreter simply fails to run your old code, or your VM implementation is not guaranteed to be stable. That means that in Python and PHP the underlying machine that you compile for might change with the next update. Compare this to the x86 world: imagine your next software update changed the x86 instruction set. You would have to recompile all your C code, and maybe some of the old C code could not be compiled with modern C compilers, while old C compilers might not be able to target the new instruction set. Sounds painful, particularly if you really care about backwards compatibility.
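CPython makes this coupling visible: every compiled .pyc file starts with a magic number that identifies the bytecode format of the interpreter that wrote it, and an interpreter expecting a different magic number will not execute that file. A minimal sketch using only the standard library of a recent Python 3 (the helper name `bytecode_identity` is made up for illustration):

```python
import importlib.util
import sys


def bytecode_identity():
    """Return (magic number, cache tag) of the running interpreter.

    The magic number is written at the start of every .pyc file. The cache
    tag is the part of the .pyc file name that keeps bytecode files written
    by different interpreters apart.
    """
    return importlib.util.MAGIC_NUMBER, sys.implementation.cache_tag


if __name__ == "__main__":
    magic, tag = bytecode_identity()
    print("bytecode magic:", magic.hex())
    print("cache tag:     ", tag)
```

Running this under two different Python releases will typically print two different magic numbers, which is exactly the "machine changes with the next update" situation described above.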

Sidenote

I think that Python, PHP and others made an architectural mistake. They bundled the VM and runtime with the compiler. Thus your language version defines your runtime and the underlying machine code. If you write a new language, write down a minimum instruction set that you will always support and separate your VM from your compiler. Always support that instruction set. This can lead to interesting problems. The implementation of Java generics is a good example. Nobody thought about generics when defining the instruction set. Therefore the bytecode was not designed to retain information about the generic type. That’s why the Java compiler needs to check the generic type information and then transform it, so that the resulting bytecode is compatible with old JVM versions. This is known as type erasure. Python and PHP developers would probably just introduce new bytecodes, not caring about BC. (Well, PHP devs would just pretend that PHP is a web language and web projects shouldn’t care about BC at all ;)).
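The advice above can be sketched in a few lines. This toy example is not how any real VM works; the opcodes and the functions `run`, `compile_v1` and `compile_v2` are all made up for illustration. It assumes a stack machine whose instruction set is frozen after version 1. A later compiler version adds a new language construct (`square`) but lowers it to the existing opcodes instead of introducing a new one, loosely analogous to what the Java compiler does with generics.

```python
# Frozen instruction set of the toy VM: never extended after version 1.
PUSH, ADD, MUL = range(3)
OPS = {"+": ADD, "*": MUL}


def run(bytecode):
    """Execute a list of (opcode, argument) pairs on a stack machine."""
    stack = []
    for op, arg in bytecode:
        if op == PUSH:
            stack.append(arg)
        elif op == ADD:
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == MUL:
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        else:
            raise ValueError("unknown opcode: %r" % (op,))
    return stack.pop()


def compile_v1(expr):
    """Compiler version 1: integers, ("+", a, b) and ("*", a, b)."""
    if isinstance(expr, int):
        return [(PUSH, expr)]
    if len(expr) != 3 or expr[0] not in OPS:
        raise ValueError("unsupported expression: %r" % (expr,))
    op, left, right = expr
    return compile_v1(left) + compile_v1(right) + [(OPS[op], None)]


def compile_v2(expr):
    """Compiler version 2: additionally supports ("square", a).

    The new construct is lowered to the old MUL opcode, so the emitted
    bytecode still runs on an unchanged version 1 VM.
    """
    if isinstance(expr, int):
        return [(PUSH, expr)]
    if expr[0] == "square":
        operand = compile_v2(expr[1])
        return operand + operand + [(MUL, None)]
    if len(expr) != 3 or expr[0] not in OPS:
        raise ValueError("unsupported expression: %r" % (expr,))
    op, left, right = expr
    return compile_v2(left) + compile_v2(right) + [(OPS[op], None)]


if __name__ == "__main__":
    print(run(compile_v2(("square", ("+", 2, 3)))))
```

The price of this design is visible even in the toy: the operand of `square` is emitted twice because the frozen instruction set has no way to duplicate a stack value.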

Conclusion
If you seriously care about backwards compatibility for LTS systems that are 8 years old, choose a language that separates the VM from the compiler. Languages like Java (and probably C#) do this. Java developers won’t define behavior that requires a new opcode. PHP and Python are wonderful programming languages, but personally I am not sure whether it is wise to write something like a VCS in such a language.

Long story short: Language choice matters for BC. If you write your own language, please separate your VM from your compiler. Better (as Johannes pointed out): compile to an existing VM like the JVM, CLR or LLVM.

php-trunk macport

MacPorts is a widely used ports system for Mac OS. Its repository contains hundreds of applications that can be compiled and installed. The repository contains PHP 5.3, so if you want to run PHP from Subversion, you still have to compile and install it yourself outside your managed ports environment. I created a rather simple Portfile to build it from PHP’s trunk.

To use the php-trunk Portfile, just extract the tarball into a directory and…

$ tar xzvf php-trunk-port.tar.gz
$ cd php-trunk-port
$ echo "file://`pwd`" >> /opt/local/etc/macports/sources.conf
$ port install php-trunk

And the best thing about the port: It compiles PHP with dtrace support :).

Please note the Portfile is very simple and is not tested with the additional modules provided by macports. So use it at your own risk and enhance it.

DOWNLOAD

UPDATE: as philip noted, it should be >> instead of >.

PHP 5.3.99-dev and DTrace Part I

For those not following PHP development: we backported the DTrace probes from the abandoned PHP 6.0 branch to the new trunk, PHP 5.3.99-dev. It is called 5.3.99 because the PHP dev community has not yet decided on a version number (5.4 or 6.0).

To compile PHP with DTrace on Solaris or Mac OS, do:

  $ svn co http://svn.php.net/repository/php/php-src/trunk php
  $ cd php
  $ ./buildconf --force
  $ ./configure --enable-dtrace
  $ make

To check if your PHP has DTrace probes enabled, you can check the phpinfo() output using

 $ php -i | grep DTrace

or use DTrace directly:

 $ sudo dtrace -ln 'php*:::' -c 'php -m'

88836    php4614               php               dtrace_compile_file compile-file-entry
88837    php4614               php               dtrace_compile_file compile-file-return
88838    php4614               php                        zend_error error
88839    php4614               php  ZEND_CATCH_SPEC_CONST_CV_HANDLER exception-caught
88840    php4614               php     zend_throw_exception_internal exception-thrown
88841    php4614               php           dtrace_execute_internal execute-entry
88842    php4614               php                    dtrace_execute execute-entry
88843    php4614               php           dtrace_execute_internal execute-return
88844    php4614               php                    dtrace_execute execute-return
88845    php4614               php                    dtrace_execute function-entry
88846    php4614               php                    dtrace_execute function-return
88847    php4614               php              php_request_shutdown request-shutdown
88848    php4614               php               php_request_startup request-startup

Have fun with DTrace in PHP.

A few examples will follow in the next few days.

Talks

I really enjoy giving talks. This is particularly because I like to teach people something and because I’m really enthusiastic about the technical things I talk about. One of these things is obviously decentralized version control systems, in particular Git and Mercurial. Finally, after two years of submitting talks to various conferences, people and conferences in the PHP community are starting to pick up this topic. It seems that 2010 is the year of DVCS, and I’m really looking forward to giving a talk about the advanced features of Git at

The talk will give a very brief overview of how Git works, and will then give a more detailed insight into how Git handles commits, files, etc., so that people get a very good understanding of the concepts that are needed to fully understand tools like git rebase, git reflog and git svn. The aim is to provide them with all the necessary information and a few examples to get lost commits back, rebase their branches and design more complex Git workflows in the future without needing to search the web or ask a guru.

A second talk will be more focused on beginners and developers coming from Subversion. This talk will be part of a series of talks that the German Telekom is organizing. I’ll also give an extended version of it as an in-house workshop at a Munich-based company.

So to me it seems that after five years, DVCS is mature enough to get into companies, and that we can expect a wide variety of companies to adopt new tools and workflows. Let’s see what’s coming…

Of course I still offer Git and Mercurial training, so feel free to contact me (dsp ~at~ php ~dot~ net).

probably shameless self promotion

Writing a simple PHP sourcecode buildscript in Scala

Scala is a fascinating language. Running on the Java VM, Scala offers a powerful mixture of the imperative Java world and functional programming, including modern techniques like Actors. Personally, I prefer not just to read about programming languages, but also to try them out while working through the book.

As I wanted to have a nice build system for my PHP Subversion checkout, I used this need as a project to start coding in Scala. So what exactly do I need? I want to build multiple versions of PHP from the same branch without checking out the code twice. I also want to configure these builds somewhere without always typing in the parameter list. For future versions, I want to be able to configure these in a file that can easily be distributed to other machines.

I sat down and wrote a parser for a configuration file that defines build targets, which are then built by the program. The configuration file format is specialized for this purpose, which is why I didn’t use something like ant. The result is called bauaffe-3.0.0a1.jar.

I’ll just show a few things done in the project, but mainly focus on what the nice script can do. Further blog posts will be about the actual implementation.

The configuration looks like this:

$ cat ~/.buildmaker
begin default configuration
    define source "/Users/dsp/dev/c/php-src"
    define build "/Users/dsp/dev/c/php-src/build"
    define defaults as
        with "iconv=/opt/local"
    build trunk as
        "php60" using defaults
        "php60-debug" using defaults
            enable "debug"
    build branch "PHP_5_3" as
        "php53" using defaults
            environment PHP_AUTOCONF="autoconf213"
        "php53-debug" using defaults
            enable "debug"
            environment PHP_AUTOCONF="autoconf213"

Proper indentation is not necessary (unlike in, e.g., Python).


You might think that parsing the configuration file is difficult. Well, if you use C you would use YACC; if you use PHP, I don’t know what you would do; but Scala is made for creating this kind of domain-specific language (for me the config is a DSL). You can easily translate an EBNF directly to Scala code using the JavaTokenParsers provided by the Scala library. As an example, this is the statement that parses the first line:

def begin : Parser[Configuration] =
  "begin" ~ ("default" | stringLiteral) ~ "configuration" ~ rep(define | build) ^^ {
    case "begin" ~ name ~ "configuration" ~ confs => new Configuration(name, confs)
  }

which is directly taken from the BNF:

config ::= "begin" ( "default" | string ) "configuration" ( define | build )*

Did I mention that the actual parser is 170 lines of code with the usual indentation and formatting?
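For contrast, here is roughly what the same first rule looks like when written by hand without a parser-combinator library. This is only a sketch in Python, not part of the tool: `TOKEN` and `parse_header` are made-up names, and since the define and build rules are not spelled out in the BNF above, everything after the header is returned as unparsed tokens.

```python
import re

# A token is either a double-quoted string or a run of non-space characters.
TOKEN = re.compile(r'"[^"]*"|\S+')


def parse_header(text):
    """Parse: "begin" ( "default" | string ) "configuration" rest*

    Returns (configuration name, remaining tokens).
    """
    tokens = TOKEN.findall(text)
    if tokens[:1] != ["begin"]:
        raise ValueError('expected "begin"')
    if len(tokens) < 2:
        raise ValueError("expected a configuration name")
    name = tokens[1]
    is_string = len(name) >= 2 and name.startswith('"') and name.endswith('"')
    if name != "default" and not is_string:
        raise ValueError('expected "default" or a quoted string')
    if tokens[2:3] != ["configuration"]:
        raise ValueError('expected "configuration"')
    return name.strip('"'), tokens[3:]


if __name__ == "__main__":
    print(parse_header('begin default configuration define source "/src"'))
```

The hand-written version needs explicit error handling for every alternative, which the combinator version gets from the `~` and `|` operators.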


Configuration
The configuration file is searched for in ~/.buildmaker or, if ~/.buildmaker doesn’t exist, in buildmaker.conf in the current directory. How do you configure the tool? First of all, you can specify a configuration. It is usually called “default”. Naming it differently is not yet supported, although the parser is able to parse it. In future versions, multiple configurations per file will be allowed.

Variables
Variables are set using the define syntax. At the moment you can set the build and source variables as well as the defaults variable, which is usually a block of statements that can be used in the branch configurations.

Branches
A branch is configured using the build syntax. You first have to specify which branch to build. Every branch can then be configured to have build targets with a given set of options. Branch options are:

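  • with string: Builds the target with the given extension (a configure --with option, e.g. "iconv=/opt/local")
  • enable string: Builds the target with the given feature enabled (a configure --enable option, e.g. "debug")
  • environment string=string: Builds the target with the given environment variable set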

You can specify using defaults, which will cause the runner to use the options specified in the defaults define.

At the moment the parser does not do a good job of telling you what you are allowed to do and what not, although plain parse errors will be emitted. You also cannot set any variables other than the ones described.

Building
Calling

$ java -jar bauaffe-3.0.0a1.jar list
TARGET                         LAST BUILD
php60                          None
php60-debug                    None
php53                          Sat Jan 09 16:55:12 CET 2010
php53-debug                    Sat Jan 09 16:59:37 CET 2010

gives you a list of parsed targets and their last build date. You can build a target using

$ java -jar bauaffe-3.0.0a1.jar <target>

or build all using

$ java -jar bauaffe-3.0.0a1.jar all

Please note that the current version requires that you know what you are doing. You might miss some error messages or find them not useful. I’ll change this before the first release, if I do a final version of it. I hope you like the little tool.

Download It!

Scala (pronounced /ˈskɑːlə, ˈskeɪlə/) is a multi-paradigm programming language designed to integrate features of object-oriented programming and functional programming.[1] The name Scala stands for “scalable language”, signifying that it is designed to grow with the demands of its users.

PHP 5.3 BBQ Release Party Munich

We would like to invite you to the PHP 5.3 release party, an event to
celebrate the 5.3 release, happening Friday, the 17th of July in Munich.
The release party offers a chance to come together with other PHP
enthusiasts and enjoy that PHP is alive and kicking. And of course people
in favour of a decent barbecue, together with some beer and other drinks,
are invited.

The event will take place at the Waldwirtschaft (http://www.waldwirtschaft.de)
beer garden, in any weather. We will meet at 19:00, open end.

Catering will be provided, and as a special delicacy you may enjoy a suckling pig! … and hopefully vegetarian food as well. If you would like to join the event, please register at the PHPUG-Munich Wiki (http://phpugmunich.org/dokuwiki/php_release_party) and follow it for updates. Alternatively, you may register at Facebook
(http://www.facebook.com/event.php?eid=115203467104) and follow
this for updates.

For any questions please visit IRC channel: #phprp on irc.uni-erlangen.de.

The PHP 5.3 BBQ release party is sponsored by:
– Microsoft
– Mayflower GmbH
– Swoodoo AG
– Zend Technologies GmbH

Supporters for the PHP 5.3 BBQ release party are:
– Sun Microsystems
– PHPUG.de

Calling Conventions – when you need to know C to understand PHP

I think most people using PHP wonder from time to time about a particular behavior of the language. That’s pretty much the case with every language. Pythoneers have their wtf moments, Ruby programmers have their wtf moments and C programmers tend to live in a whole wtf universe. But lately I stumbled over a nice one. It looked like a bug in PHP, but turned out to be an interesting, curious part of the C language. Consider the following PHP code sample and note that $a and $b are not defined (yeah I know, it’s bad coding style..blabla..):

<?php
var_dump($a + $b);
?>

What is the expected result with error_reporting set to E_ALL?

PHP Notice: Undefined variable: b in /var/foo/bla on line 1
PHP Notice: Undefined variable: a in /var/foo/bla on line 1
Are you sure? I’m not. On x86 hardware b is fetched before a is fetched and therefore the executor detects that b is not set first. But wait. Let’s test this on a SPARC machine:

PHP Notice: Undefined variable: a in /var/foo/bla on line 1
PHP Notice: Undefined variable: b in /var/foo/bla on line 1
What? It evaluates it in the reverse order? What is happening? So I spent a few minutes with my lovely debugger, and it turns out that this is what happens in the engine (I use pseudo code here):

result ZEND_ADD_OPCODE()
{
   return add_function(get_op1(), get_op2());
}

Voilà, that’s the problem. On SPARC get_op1() is executed before get_op2(), while it’s the other way round on x86. As get_opX() detects whether a variable exists, the error messages appear in reverse order. I did a little bit of research (thank you SunCC Team for your answer!) and it turned out that C99 doesn’t specify the order in which the arguments of a function call are evaluated. Therefore, every system and compiler is free to use its own ordering. My current plan: write a compiler that does this by random(). The fix is trivial:

result ZEND_ADD_OPCODE()
{
  op2 = get_op2();
  return add_function(get_op1(), op2);
}

It’s a lovely curiosity.
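The effect can be modelled outside of C. Python always evaluates arguments left to right, so the sketch below selects the order explicitly with a flag. The function names and the shortened notice text are made up for illustration, and an undefined variable is simply treated as 0; none of this is engine code.

```python
def get_op(name, variables, notices):
    """Fetch an operand; record a notice if the variable is undefined."""
    if name not in variables:
        notices.append("Undefined variable: " + name)
        return 0
    return variables[name]


def add_unspecified(variables, right_first):
    """Model add_function(get_op1(), get_op2()) with compiler-chosen order."""
    notices = []
    if right_first:
        # The order observed on x86: second operand first.
        op2 = get_op("b", variables, notices)
        op1 = get_op("a", variables, notices)
    else:
        # The order observed on SPARC: first operand first.
        op1 = get_op("a", variables, notices)
        op2 = get_op("b", variables, notices)
    return op1 + op2, notices


def add_fixed(variables):
    """Model the fix: op2 is fetched in its own statement before op1."""
    notices = []
    op2 = get_op("b", variables, notices)
    return get_op("a", variables, notices) + op2, notices


if __name__ == "__main__":
    print(add_unspecified({}, right_first=True)[1])
    print(add_unspecified({}, right_first=False)[1])
    print(add_fixed({})[1])
```

The result of the addition is the same in all three cases; only the order of the side effects differs, which is why the difference went unnoticed until someone compared the notices across platforms.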

PHP BBQ Tour in Karlsruhe

The PHP BBQ Tour, organized by Ulf Wendel and various PHP user groups in Germany, will stop in Karlsruhe. For those of you living near Karlsruhe, it’s definitely worth joining us. There will be food and drinks, and I’m sure it’ll be a great evening. Information can be found here