YAML and PHP

30 Jul 2009 19:03

There are 3 main YAML implementations for PHP:

  • Syck (native C library bindings to PHP)
  • Symfony YAML (pure PHP)
  • Spyc (pure PHP again)

This post compares them.

What the hell is YAML?

Have you heard of XML or JSON? Like them, YAML is a way to store (read and write) structured data: arrays (a.k.a. lists), dictionaries (a.k.a. hash maps) and atomic values like strings and numbers. The structures can be nested to describe near-real-life objects, for example:

---
Piotr Gabryjeluk:
  company: Wikidot Inc.
  university: Nicolaus Copernicus University, Toruń, Poland
  lives_in: Toruń, Poland
  hobbies:
  - basketball
  - playing the guitar

Which translates to PHP:

<?php
$data = array('Piotr Gabryjeluk' => array(
  'company' => 'Wikidot Inc.',
  'university' => 'Nicolaus Copernicus University, Toruń, Poland',
  'lives_in' => 'Toruń, Poland',
  'hobbies' => array('basketball', 'playing the guitar')
));

So you see, YAML is quite nice even when you need to write it by hand.

YAML has a specification (see http://yaml.org), so once we have a standard-compliant YAML parser and a standard-compliant YAML dumper, we can send arrays from one machine to another and the result should be the same array that was sent.

PHP

So let's see what the choices are if you want to play with YAML in PHP.

Syck

This is the fastest and most complete YAML dumper and loader library available. It is a binding to a C library and is available from PECL. It is also available as a regular package in the Ubuntu repository, so you can install it with a simple:

aptitude install php5-syck

In some shared hosting environments installing an extension can be a problem, and there you need a pure-PHP solution.

Spyc

This was the first PHP YAML implementation I saw. It is both a dumper and a loader and it seemed to work fine, but then I found some bugs that stopped me from using it as the one and only YAML loader and dumper for Wikidot.

It has one really nice feature, which is handy when you want your users to enter YAML to define things (like we do for forms): it is quite forgiving when it comes to syntax. It ignores things that don't fit and still parses the rest.

Unfortunately, as I said before, the Spyc dumper has bugs, so when you first dump an array and then load it back with Spyc you get something different (for example, multiple new-lines are treated as one). Not good. Also, as a loader it does not understand the full YAML specification (which is quite huge, BTW).
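A quick way to catch this kind of problem is a dump-then-load check. The sketch below is only an illustration: survives_round_trip(), json_dump() and json_load() are hypothetical helper names, and the JSON pair is a stand-in so the example runs without any YAML library installed.

```php
<?php
// Hypothetical helper (not part of Spyc, Syck or Symfony YAML):
// dumps $data with $dump, loads the result back with $load and
// reports whether the original value survived unchanged.
function survives_round_trip($data, $dump, $load)
{
    $restored = call_user_func($load, call_user_func($dump, $data));
    // Strict comparison: same keys, same order, same types.
    return $restored === $data;
}

// Sample data with several consecutive new-lines, the case described above.
$data = array('note' => "first line\n\n\nsecond line");

// Stand-in dumper/loader pair built on PHP's JSON functions.
function json_dump($value) { return json_encode($value); }
function json_load($string) { return json_decode($string, true); }

var_dump(survives_round_trip($data, 'json_dump', 'json_load')); // bool(true)

// With Spyc loaded, the same check would be (not executed here):
// survives_round_trip($data, array('Spyc', 'YAMLDump'), array('Spyc', 'YAMLLoadString'));
```

Any dumper and loader can be plugged in as PHP callbacks, so the same data set can be run against every library you are considering.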

Symfony YAML

This one is pure PHP as well, so you don't need special rights to use it on a PHP-enabled machine.

Its loader does not understand the full YAML specification, so for example you can't load documents dumped by Syck. The dumper is good.

Summary

                                        | Syck          | Spyc                                  | Symfony YAML
type of library                         | PHP extension | pure PHP library                      | pure PHP library
speed                                   | fast          | slow                                  | slow
loader: YAML support                    | full          | bad                                   | not bad
loader: if YAML is corrupted            | exception     | tries to do its best to load the rest | exception
dumper: YAML human-readable             | more-or-less  | yes                                   | more-or-less if set properly
dumper: YAML conforms to spec           | yes           | no                                    | yes
loads Syck's dumper output correctly    | yes           | no                                    | no
loads Symfony's dumper output correctly | yes           | no                                    | yes

Verdict: loader

Syck is the winner at loading YAML. If you cannot use Syck, use Symfony YAML. If you need to parse user input (which should be human-readable and human-writable, similar to YAML), use Spyc.

Actually, this is a nice combination for loading:

<?php
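// Hypothetical wrapper name; assumes the sfYamlParser and Spyc classes
// are already loaded (e.g. via require_once or an autoloader).
function load_yaml($string)
{
    try {
        // if syck is available use it
        if (extension_loaded('syck')) {
            return syck_load($string);
        }
        // if not, use the symfony YAML parser
        $yaml = new sfYamlParser();
        return $yaml->parse($string);
    } catch (Exception $e) {
        // if the YAML document is not correct (either loader threw),
        // fall back to the forgiving Spyc loader
        return Spyc::YAMLLoadString($string);
    }
}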

This way the fastest library is used if possible, then the best pure-PHP one, and if that fails because the document was badly written (by a human being, for example), you fall back to Spyc.

Verdict: dumper

In my opinion the Symfony YAML dumper is the best of the three in terms of usability, portability and interoperability, because its output can be read by both itself and Syck.

However, if you dump YAML often, use Syck (which is a hell of a lot faster) for both loading and dumping. The generated YAML won't be readable by Symfony YAML or Spyc, but that is because they don't follow the specification (so it is not Syck's problem, in fact).
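The dumping side can follow the same pattern as the loader combination above. This is a minimal sketch, not a tested recipe: dump_yaml() is a hypothetical name, the sfYamlDumper class is assumed to be loaded already, and the inline level of 4 passed to its dump() method is an arbitrary choice for readability.

```php
<?php
// Hypothetical helper mirroring the loader combination:
// prefer Syck, fall back to the Symfony YAML dumper.
function dump_yaml($data)
{
    // Prefer the C extension when it is loaded.
    if (extension_loaded('syck')) {
        return syck_dump($data);
    }
    // Otherwise use the Symfony dumper, if its class is available.
    if (class_exists('sfYamlDumper')) {
        $dumper = new sfYamlDumper();
        // Second argument: nesting level at which the dumper switches
        // to inline notation (assumed value, tune for readability).
        return $dumper->dump($data, 4);
    }
    // Neither library is present on this machine.
    throw new RuntimeException('No YAML dumper available');
}
```

Keep in mind that Syck's output is not readable by the pure-PHP loaders, so this is only safe when the machine that loads the document has Syck as well.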

Also note that the output of any valid JSON dumper is readable by standard YAML 1.2 loaders, because JSON is a subset of YAML 1.2. So if you use it for data exchange (and not for talking to humans), any fast JSON dumper can be used.
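For instance, PHP's built-in JSON functions (available since PHP 5.2) are enough for machine-to-machine exchange. The sketch below uses a trimmed version of the sample data from the top of the post. Whether a particular YAML loader accepts the result depends on how much of the specification it implements.

```php
<?php
$data = array(
    'company' => 'Wikidot Inc.',
    'hobbies' => array('basketball', 'playing the guitar'),
);

// json_encode() produces a JSON document, which is also a YAML 1.2 document.
$document = json_encode($data);
echo $document, "\n";
// {"company":"Wikidot Inc.","hobbies":["basketball","playing the guitar"]}

// Read it back with the built-in decoder (true = associative arrays).
$restored = json_decode($document, true);
var_dump($restored === $data); // bool(true)
```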

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License