Reblogged from Mark on WordPress:

Click to visit the original post
  • Click to visit the original post
  • Click to visit the original post
  • Click to visit the original post

Have Baby. Need Stuff! is a baby gear site that my wife and I just launched. I thought I’d share how I built it.

WordPress Core

WordPress is a Git submodule, with the content directory moved to the /content/ directory. This makes my Git repo smaller, as WordPress isn’t actually in it.

Underscores

For a theme base, I started with the…

Read more… 1,606 more words

Step by step post on how to make WordPress into a full blown product/category site. No ecomm, but very comprehensive.

Write your own code profiler

Posted: April 10, 2012 in Uncategorized

I wrote earlier about how to use register_tick_function to profile your code. I’ve expanded the article and included a working profiler class on inside.godaddy.com in my latest post on Write Your Own Code Profiler.


Varnish is a great tool to have if you need a scriptable proxy server. For most sites, though, this will be more complex than necessary. Check out this article on how to speed up WordPress with Varnish.

I propose that most users don’t need the power of VCL, though. They’d be much better off with a high performance proxy server with RFC 2616 compliance and simple configuration.

Allow me to play devil’s advocate and show you how to build and configure Apache Traffic Server, then show you benchmark results:

# built on centos 6.2
# Get traffic server pre-requisites
yum install gcc-c++ openssl-devel tcl-devel expat-devel pcre-devel make

# Get traffic server
wget http://apache.tradebit.com/pub/trafficserver/trafficserver-3.1.3-unstable.tar.bz2
tar -xf trafficserver-3.1.3-unstable.tar.bz2
cd trafficserver-3.1.3-unstable

# Build traffic server
./configure
make -j 8

# Install traffic server
make install
ln -s /usr/local/bin/trafficserver /etc/init.d/trafficserver
chkconfig --add trafficserver
chkconfig trafficserver on

# Configure traffic server URL map
echo "map http://YOURDOMAIN.COM http://localhost" >> /usr/local/etc/trafficserver/remap.config
echo "reverse_map http://localhost http://YOURDOMAIN.COM" >> /usr/local/etc/trafficserver/remap.config

# Run traffic server
trafficserver restart

Note: Traffic Server runs on port 8080. I used my host’s router to map port 80 -> 8080 and left httpd on port 80.

Since traffic server is now handling all incoming requests, the web server logs will be unreliable. You can rely on an in-browser analytics package, or see how to configure Traffic Server for combined logs.

Configuration done

Let’s benchmark.

Note: my test site is using W3 Total Cache. W3TC sets the cache-control, last-modified, and expires headers on the pages, and static content so that traffic server knows how to best cache your content. You can also use the Varnish integration in W3TC with traffic server. This will purge pages from the traffic server proxy whenever you purge them from W3TC.

Protip: If you need to have Traffic Server work without an expires header or cache-control header, look in records.config and find this section:

# required headers: three options:
#   0 - No required headers to make document cachable
#   1 - "Last-Modified:", "Expires:", or "Cache-Control: max-age" required
#   2 - explicit lifetime required, "Expires:" or "Cache-Control: max-age"
CONFIG proxy.config.http.cache.required_headers INT 2

Look at blitz.io. I used this command in the blitzbar:

# If you paid for blitz, you can max out at 1,000 concurrent users
# -p 1-1000:5,1000-1000:55 -r california -H 'Accept-encoding: gzip' http://YOURDOMAIN.COM/
-p 1-250:1,250-250:59 -r california -H 'Accept-encoding: gzip' http://YOURDOMAIN.COM/

For these benchmarks, I paid to upgrade blitz.io to allow 1,000 concurrent users. At 250 concurrent users, the performance profiles of these systems didn’t really shine through.

Here’s the Varnish benchmark.

blitz.io - Varnish

blitz.io - Varnish

Here’s the Traffic Server benchmark:

blitz.io - Traffic Server

blitz.io - Traffic Server

For comparison, here’s Apache HTTPD alone:

blitz.io - Apache HTTPD

blitz.io - Apache HTTPD

Notice how Varnish doesn’t like the initial influx of connections. Notice how Varnish had more timeouts, too. Notice, too, how much of a difference these caching systems make. Remember, the Apache HTTPD benchmark is already using WordPress + W3 Total Cache.

Let’s look at apachebench for a different view. Using this command, I benchmarked raw Apache HTTPD vs. Varnish vs. Traffic Server. This is all done locally over a loopback connection.

# Change the port to 80 to benchmark apache directly
ab -c 250 -n 1000000 -k -H "Host: YOURDOMAIN.COM" http://127.0.0.1:8080/

Apache results (no chance!):

apr_socket_recv: Connection timed out (110)
Total of 17725 requests completed

Varnish results:

Time taken for tests:   95.049 seconds
Complete requests:      1000000
Failed requests:        0
Requests per second:    10520.83 [#/sec] (mean)
Time per request:       23.762 [ms] (mean)
Time per request:       0.095 [ms] (mean, across all concurrent requests)
Transfer rate:          75257.95 [Kbytes/sec] received

Traffic Server results:

Time taken for tests:   50.971 seconds
Complete requests:      1000000
Failed requests:        0
Requests per second:    19619.00 [#/sec] (mean)
Time per request:       12.743 [ms] (mean)
Time per request:       0.051 [ms] (mean, across all concurrent requests)
Transfer rate:          139371.67 [Kbytes/sec] received

Conclusion

According to blitz.io, Varnish and Traffic Server benchmark results are close. According to ab, Traffic Server is twice as fast as Varnish. Configuration is different in each package. Traffic Server has some unique features like the ability to act as as a forward proxy and a reverse proxy, the ability to proxy SSL connections, a plugin architecture, and will soon have SPDY support. If you haven’t heard of Traffic Server yet, you should check it out. If you’re not using a caching proxy, please start using one!

For more information on Apache Traffic Server, check out this 2011 USENIX presentation (PDF, 10.4 MB) by Leif Hedstrom


If you’re using apache traffic server, all of your traffic is going through your proxy, not your web server.  This is a problem if you rely on your logs for visitor data.

You can get traffic server to output logs in apache’s combined log format, though.  Then you just feed traffic server’s logs to your log analysis app.

Here’s how to do it:

Edit /usr/local/etc/trafficserver/logs_xml.config

Add these lines to the bottom:

<LogFormat>
    <Format = "%<chi> - - [%<cqtn>] \"%<cqhm> %<cquup>\" %<pssc> %<psql> \"%<{Referer}cqh>\" \"%<{User-Agent}cqh>\"" />
    <Name = "httpd_combined"/>
</LogFormat>
<LogObject>
    <Format = "httpd_combined"/>
    <Filename = "access"/>
</LogObject>

Edit /usr/local/etc/trafficserver/logs_records.config

Change

CONFIG proxy.config.log.custom_logs_enabled INT 0

to

CONFIG proxy.config.log.custom_logs_enabled INT 1

Then restart traffic server

trafficserver restart

Now you’ll have an apache combined log file in /usr/local/var/log/trafficserver/access.log

You can read more about apache traffic server’s logging options here and the log formats here.

Cloud Processing with Gearman

Posted: March 2, 2012 in PHP
Tags: ,

This is the stuff I work on at Go Daddy in addition to WordPress

http://inside.godaddy.com/cloud-processing-with-gearman/


Okay, so your code is slow.  Why?  Is it your webservice call?  Or a new plugin you installed?  Is it your database calls?  File I/O?  You need a profiler.  A profiler gives you the ability to trace the performance of your code through every function call and create an overview of your system’s performance over a certain time period and helps you make intelligent decisions about where to look for problems.

Code profilers like xdebug and xhprof are essential tools to have for diagnosing performance bottlenecks.  I highly recommend you check them out.  If you have access to installing these extensions on your system, you can probably stop reading here.

But what if you’re in an environment where you can’t install an extension? Luckily, php has a built-in function called register_tick_function that gives you a way to hook in to every user function that’s called.  With this, you can write a profiler yourself.

Here’s some sample code to show you what I mean:

declare(ticks=1);
register_tick_function('do_profile');

a();

function do_profile() {
    $bt = debug_backtrace();
    if (count($bt) <= 1) {
         return;
    }
    $frame = $bt[1];
    unset($bt);
    $function = $frame['function'];
    echo __FUNCTION__ . ' :: ' . $function . PHP_EOL;
}

function a() {
    echo __FUNCTION__ . PHP_EOL;
    b();
}

function b() {
    echo __FUNCTION__ . PHP_EOL;
    c();
}

function c() {
    echo __FUNCTION__ . PHP_EOL;
}

And the output:

a
do_profile :: a
b
do_profile :: b
c
do_profile :: c
do_profile :: b
do_profile :: a

You’ll see in the output that the tick function is being called when the function is called, and again when the function is removed from the stack.

This can be incredibly useful if you attach a timer to it.

Note:  the declare(ticks=1); line is mandatory, it tells php to call your tick handler for every function.  You can change this to declare(ticks=10); to call the tick handler every tenth function if that fits your needs.

Here’s a very simple example of a working profiler. It’s practically the same code as above, but with a timer attached and a shutdown handler to show you the profile when the script is done:

declare(ticks=1);
register_tick_function('do_profile');
register_shutdown_function('show_profile');

$profile = array();
$last_time = microtime(true);

a();

function do_profile() {
    global $profile, $last_time;
    $bt = debug_backtrace();
    if (count($bt) <= 1) {
        return ;
    }
    $frame = $bt[1];
    unset($bt);
    $function = $frame['function'];
    if (!isset($profile[$function])) {
        $profile[$function] = array(
            'time'  => 0,
            'calls' => 0
        );
    }
    $profile[$function]['calls']++;
    $profile[$function]['time'] += (microtime(true) - $last_time);
    $last_time = microtime(true);
}

function show_profile() {
    global $profile;
    print_r($profile);
}

function a() {
    usleep(50 * 1000);
    b();
}

function b() {
    usleep(500 * 1000);
    c();
}

function c() {
    usleep(5000 * 1000);
}

And the output hopefully matches what you would expect:

Array
(
    [a] => Array
        (
            [time] => 0.0511748790741
            [calls] => 2
        )

    [b] => Array
        (
            [time] => 0.500598907471
            [calls] => 2
        )

    [c] => Array
        (
            [time] => 5.00052690506
            [calls] => 1
        )

)

The timing matches what we set in each function. The calls, though, is a bit off, since the tick handler can be called twice for each function. Just note that when you’re interpreting the results.

You can really amp this up to get some good info, though.  Check this out:

P3 Profiler Timeline

P3 Profiler Timeline

If you want to see how it’s done, or see how well it works, give the P3 Plugin for WordPress a look.

Caveats

The tick function is a user function.  This means that whenever php calls a user function it will call your tick function first, but since your  tick function is itself a user function, it will also have the tick function called for it.  It stops after one level of recursion.  Be sure to check for that in your code!

Ticks will not be called on php internal functions.  If your app is having problems due to I/O or an extension, you won’t be able to get the same granularity with a tick function as with an extension based profiler.

Bytecode optimizers will cause you problems.  Function calls will happen, but they won’t be sent through the tick function.  Not exactly sure why on this, but here’s what I’ve got so far:

  • eaccelerator – optimization definitely causes problems
    • disable method 1 – set eaccelerator.optimizer to false
    • disable method 2 – if your app is in eaccelerator’s admin path, call eaccelerator_optimizer(false);
  • xcache – the documentation says there is no optimizer until version 2.0, but experimentation shows some missing function calls, no solution yet
  • zend optimizer+ –  some problems – you can turn this off by setting zend_optimizerplus.optimization_level to 0
  • apc – optimization was removed in 3.0.13 – no problems found
  • wincache – no problems found
  • ioncube – no problems found
  • zend guard loader – no problems found

There’s a warning that this function is not compatible with multi-threaded web servers and php < 5.3.  See the documentation on php.net for more details

Remember to optimize the tick function!  It will be called hundreds or thousands of times in your app, so milliseconds will add up!  If you have a slight mistake in your timing here, it will cause a big shift in your results.


At some point, many php developers turn to the pcntl functions in php to write a daemon, or server, or simulate threading.

But how do you unit test this with complete code coverage?  Here’s the sample code we’ll be testing:

// @codeCoverageIgnoreStart
if (!defined('TEST_MODE')) {
	define('TEST_MODE', false);
}
// @codeCoverageIgnoreEnd
function start_process() {
	$pid = pcntl_fork();
	if (-1 == $pid) {
		die('Error forking');
	} elseif ($pid > 0)  {
		do {
			echo "Parent - " . getmypid() . "\n";
			sleep(1);
		} while (!TEST_MODE);
	} else {
		echo "Child - " . getmypid() . "\n";
		die();
	}
}

We’ll need to overcome a few big challenges in testing this code:

  • PHPUnit will not re-assemble code-coverage results from across different processes.  This test must remain in one process.  (I’ve tried to find a reference for this, but I can’t … PHPUnit is moving towards parallelized tests, but isn’t there yet)
  • Any calls to die() or exit() will end the test prematurely.
  • The call to sleep(1) will take 1 second of wall time in the test.  That can add up quickly if you have lots of test cases.
  • There isn’t a reliable way to make fork() return -1 to test the error handling branch of the code.

We need to engage some black arts php extensions to make this happen.  An installation guide follows, and the post ends with a complete listing of the unit test.

Disclaimer

These extensions are meant for testing.  These are for your dev or test servers.  Please do not install these on your production servers unless you’re really positive that you want to have them available in your production environment!

Test Helpers

First we’ll need the test_helpers.so extension.  This will let you substitute a callback for any calls to die or exit.

This is a zend_extension, it must be loaded in the correct order, so this one may take some trial and error to get working properly with other zend extensions like xdebug and ioncube.  You can find the extension on github:  https://github.com/sebastianbergmann/php-test-helpers

Once you’ve downloaded and unzipped the package and are in the folder, you can install with:

$ pecl install package.xml

Then add this to your php.ini file:

zend_extension=/path/to/your/php/modules/test_helpers.so

And when you type php -m at the command line, you should see “test_helpers” listed as a Zend Module and a PHP Module.

Runkit

Runkit is a really fun extension that lets you rewrite functions at runtime, execute code in a sandbox, run lint checks on files without spawning a separate process, and other tasks that you probably shouldn’t be doing.  I’ve caused many segfaults in php with runkit.  It doesn’t come with too many safety guards, so be careful with it (see the disclaimer about not running this in production).  You can find the latest version on github here:  https://github.com/zenovich/runkit/

Once you’ve downloaded and unzipped the package and are in the folder, you can install with:

$ pecl install package.xml

Then add this to your php.ini file:

extension=runkit.so

And when you type php -m at the command line, you should see “runkit” listed as a PHP Module.  One last thing, make sure to change this configuration setting for runkit so you can change internal php functions:

runkit.internal_override = 1

The Unit Test

I will skip the guided tour of the evolution of this unit test and jump straight to the finished product.  Here’s a full unit test of the forking code above with 100% coverage.

Note: In the original file, there are ignore coverage wrappers around the code that turns off test mode. Usually this flag is not in a conditional block, it’s set by the application’s framework and is based on the environment + configuration file.  To keep things simple, I’ve chosen to demonstrate this method using a constant.

class ForkTest extends PHPUnit_Framework_TestCase {

	/**
	 * Calls to our mock functions
	 * @var array
	 */
	public static $mock_calls = array(
		'sleep'      => 0,
		'pcntl_fork' => 0,
		'_die'       => 0
	);

	/**
	 * Constructor
	 * Check for necessary php extensions
	 */
	public function __construct() {
		if (!extension_loaded('test_helpers')) {
			echo "These unit tests require the test_helpers extension.  For more information, please visit:\n";
			echo "https://github.com/sebastianbergmann/php-test-helpers\n";
			exit(1);
		}
		if (!extension_loaded('runkit')) {
			echo "These unit tests require the runkit extension.  For more information, please visit:\n";
			echo "https://github.com/zenovich/runkit/\n";
			exit(1);
		}
		parent::__construct();
		define('TEST_MODE', true);
	}

	/**
	 * Setup
	 */
    protected function setUp() {

		// Disable exit / die functionality
		set_exit_overload(array(__CLASS__, '_die'));

		// Sleep quicker (seconds -> milliseconds)
		runkit_function_copy('sleep', '__backup_sleep');
		runkit_function_redefine('sleep', '$sec', 'ForkTest::sleep($sec);');

		// Reset mock calls
		foreach ( array_keys(self::$mock_calls) as $key ) {
			self::$mock_calls[$key] = 0;
		}

		// Get the system under test
		require_once('./fork.php');
    }

	/**
	 * Tear down
	 */
	protected function tearDown() {

		// Restore exit/die functionality
		unset_exit_overload();

		// Restore the original sleep() function
		runkit_function_remove('sleep');
		runkit_function_copy('__backup_sleep', 'sleep');
		runkit_function_remove('__backup_sleep');
	}

	/**
	 * Test the parent path of pcntl_fork()
	 */
    public function testParent() {

		// Force pcntl_fork to return 1 (meaning parent)
		runkit_function_copy('pcntl_fork', '__backup_pcntl_fork');
		runkit_function_redefine('pcntl_fork', '', 'return ForkTest::pcntl_fork(1);');

		// Baseline assertions
		$this->assertEquals(self::$mock_calls['sleep'],      0);
		$this->assertEquals(self::$mock_calls['pcntl_fork'], 0);
		$this->assertEquals(self::$mock_calls['_die'],       0);

		// Call forking code
		start_process();

		// Stuff happened
		$this->assertEquals(self::$mock_calls['sleep'],      1);
		$this->assertEquals(self::$mock_calls['pcntl_fork'], 1);
		$this->assertEquals(self::$mock_calls['_die'],       0);

		// Retore original pcntl_fork functionality
		runkit_function_remove('pcntl_fork');
		runkit_function_copy('__backup_pcntl_fork', 'pcntl_fork');
		runkit_function_remove('__backup_pcntl_fork');
    }

	/**
	 * Test the child path of pcntl_fork()
	 */
    public function testChild() {

		// Force pcntl_fork to return 0 (meaning child)
		runkit_function_copy('pcntl_fork', '__backup_pcntl_fork');
		runkit_function_redefine('pcntl_fork', '', 'return ForkTest::pcntl_fork(0);');

		// Baseline assertions
		$this->assertEquals(self::$mock_calls['sleep'],      0);
		$this->assertEquals(self::$mock_calls['pcntl_fork'], 0);
		$this->assertEquals(self::$mock_calls['_die'],       0);

		// Call forking code
		start_process();

		// Stuff happened
		$this->assertEquals(self::$mock_calls['sleep'],      0);
		$this->assertEquals(self::$mock_calls['pcntl_fork'], 1);
		$this->assertEquals(self::$mock_calls['_die'],       1);

		// Retore original pcntl_fork functionality
		runkit_function_remove('pcntl_fork');
		runkit_function_copy('__backup_pcntl_fork', 'pcntl_fork');
		runkit_function_remove('__backup_pcntl_fork');
    }

	/**
	 * Test the error path of pcntl_fork
	 */
    public function testCouldNotFork() {

		// Force pcntl_fork to return -1 (meaning could not fork)
		runkit_function_copy('pcntl_fork', '__backup_pcntl_fork');
		runkit_function_redefine('pcntl_fork', '', 'return ForkTest::pcntl_fork(-1);');

		// Baseline assertions
		$this->assertEquals(self::$mock_calls['sleep'],      0);
		$this->assertEquals(self::$mock_calls['pcntl_fork'], 0);
		$this->assertEquals(self::$mock_calls['_die'],       0);

		// Call forking code
		start_process();

		// Stuff did not happen
		$this->assertEquals(self::$mock_calls['sleep'],      0);
		$this->assertEquals(self::$mock_calls['pcntl_fork'], 1);
		$this->assertEquals(self::$mock_calls['_die'],       1);

		// Retore original pcntl_fork functionality
		runkit_function_remove('pcntl_fork');
		runkit_function_copy('__backup_pcntl_fork', 'pcntl_fork');
		runkit_function_remove('__backup_pcntl_fork');
    }

	/**
	 * Pretend to fork
	 * @param int $return_value
	 * @return int
	 */
	public static function pcntl_fork($return) {
		self::$mock_calls[__FUNCTION__]++;
		return $return;
	}

	/**
	 * Mock sleep function
	 * Sleep a little bit (seconds -> millisecond)
	 * @param int $sec
	 */
	public static function sleep($sec) {
		self::$mock_calls[__FUNCTION__]++;
		usleep($sec * 1000);
	}

	/**
	 * Mock die function
	 * @return bool
	 */
	public static function _die() {
		self::$mock_calls[__FUNCTION__]++;
		return false;
	}
}

And the proof:

[root@server1 forktest]# phpunit --coverage-html=/var/www/html/coverage/ ForkTest.php
PHPUnit 3.6.4 by Sebastian Bergmann.

.Parent - 25260
.Child - 25260
.

Time: 0 seconds, Memory: 5.75Mb

OK (3 tests, 18 assertions)

Generating code coverage report, this may take a moment.

Code Coverage Report

Code Coverage Report