YAML and PHP
tags: dev php yaml
30 July 2009
There are 3 main YAML implementations for PHP:
- Syck (native C library bindings to PHP)
- Symfony YAML (pure PHP)
- Spyc (pure PHP again)
Here is how they compare.
What the hell is YAML
Have you heard of XML or JSON? Like them, YAML is a way to store (read and write) structured data: arrays (a.k.a. lists), dictionaries (a.k.a. hash maps) and atomic values such as strings and numbers. The structures can be nested to describe near-real-life objects, for example:
---
Piotr Gabryjeluk:
  company: Wikidot Inc.
  university: Nicolaus Copernicus University, Toruń, Poland
  lives_in: Toruń, Poland
  hobbies:
    - basketball
    - playing the guitar
Which translates to PHP:
$data = array(
    'Piotr Gabryjeluk' => array(
        'company' => 'Wikidot Inc.',
        'university' => 'Nicolaus Copernicus University, Toruń, Poland',
        'lives_in' => 'Toruń, Poland',
        'hobbies' => array('basketball', 'playing the guitar'),
    ),
);
So you see, YAML is quite nice even when you need to write it yourself.
YAML has a specification (see http://yaml.org), so once we have a standard YAML parser and a standard YAML dumper, we can send arrays from one machine to another and the result should be the same array that was sent.
PHP
So let's see what the choices are if you want to play with YAML in PHP.
Syck
This is the fastest and the most complete YAML dumper and loader library available. It is a PHP binding to a C library and is available from PECL. It is also available as a regular package in the Ubuntu repository, so you can install it simply with:
aptitude install php5-syck
In some shared hosting environments installing an extension is a problem, and then you need a pure PHP solution.
Spyc
This was the first PHP YAML implementation I saw. It is both a dumper and a loader and it seemed to work fine, but then I found some bugs that stopped me from using it as the one and only YAML loader and dumper for Wikidot.
It has one really nice property, which helps when you want your users to enter YAML to define things (like we do for forms): it is quite forgiving when it comes to syntax. It ignores things that don't fit and still parses the rest.
Unfortunately, as I said, the Spyc dumper has bugs: when you first dump an array and then load it with Spyc, you get something different (for example, multiple new-lines are treated as one). Not good. Also, as a loader it does not understand the full YAML specification (which is quite huge, BTW).
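The round-trip problem described above is easy to check for any dumper/loader pair. The following is a minimal sketch that assumes nothing about Spyc's internals: `survives_round_trip` is a hypothetical helper, and `json_encode`/`json_decode` only stand in for the pair under test (for Spyc you would pass `Spyc::YAMLDump` and `Spyc::YAMLLoadString` instead).

```php
<?php
// Hypothetical helper: dump $data, load it back, and compare with the original.
function survives_round_trip($dump, $load, $data)
{
    $loaded = call_user_func($load, call_user_func($dump, $data));
    // Strict comparison: same keys, same order, same types and values.
    return $loaded === $data;
}

// Stand-ins for the dumper and loader under test (not a YAML library).
$dump = 'json_encode';
$load = function ($text) {
    return json_decode($text, true);
};

// Multiple consecutive new-lines are the case mentioned in the text.
$data = array(
    'text' => "line one\n\n\nline two",
    'hobbies' => array('basketball', 'playing the guitar'),
);

var_dump(survives_round_trip($dump, $load, $data)); // bool(true)
```

A loader that collapses the new-lines would make the helper return false, which is exactly the kind of difference worth catching before relying on a library for storage.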
Symfony YAML
This one is pure PHP as well, so you don't need special rights to use it on a PHP-enabled machine.
Its loader does not understand the full YAML specification, so for example you can't load documents dumped by Syck. The dumper is good.
Summary
| | Syck | Spyc | Symfony YAML |
|---|---|---|---|
| type of library | PHP extension | pure PHP library | pure PHP library |
| speed | fast | slow | slow |
| loader: YAML support | full | bad | not bad |
| loader: if YAML is corrupted | exception | tries to do its best to load the rest | exception |
| dumper: YAML human-readable | more-or-less | yes | more-or-less if set properly |
| dumper: YAML conforms to spec | yes | no | yes |
| loads Syck's dumper output correctly | yes | no | no |
| loads Symfony's dumper output correctly | yes | no | yes |
Verdict: loader
Syck is the winner in loading YAML. If you cannot use Syck, use Symfony YAML. If you need to parse user input (which should be human-readable and human-writable, something similar to YAML), use Spyc.
Actually, this is a nice combination for loading:
// load_yaml is just an illustrative name; the wrapper makes the returns valid
function load_yaml($string)
{
    try {
        // if Syck is available, use it
        if (extension_loaded('syck')) {
            return syck_load($string);
        }
        // if not, use the Symfony YAML parser
        $yaml = new sfYamlParser();
        return $yaml->parse($string);
    } catch (Exception $e) {
        // if the YAML document is not correct, fall back to Spyc
        return Spyc::YAMLLoadString($string);
    }
}
This way the fastest library is used if possible, then the best pure-PHP one, and if loading fails because the document was badly written (by a human being, for example), you fall back to Spyc.
Verdict: dumper
In my opinion the Symfony YAML dumper is the best of the three in terms of usability, portability and interoperability, because its output can be read by both itself and Syck.
However, if you dump YAML often, use the (hell of a lot faster) Syck for both loading and dumping. The generated YAML won't be readable by Symfony YAML or Spyc, but that is because they don't follow the specification (so not Syck's problem, in fact).
Also note that the output of any valid JSON dumper is readable by standard YAML 1.2 loaders, because JSON is a subset of YAML 1.2. So if YAML is used for data exchange (and not for talking to humans), any fast JSON dumper can be used.
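For machine-to-machine exchange, that idea can be sketched with PHP's built-in `json_encode`. The variable names are only illustrative, and whether a given YAML loader accepts the output depends on it actually implementing YAML 1.2.

```php
<?php
$data = array(
    'Piotr Gabryjeluk' => array(
        'company' => 'Wikidot Inc.',
        'hobbies' => array('basketball', 'playing the guitar'),
    ),
);

// The JSON text is what gets sent; a YAML 1.2 loader on the other side
// should be able to read it as a YAML document.
$wire = json_encode($data);
echo $wire, "\n";

// On the PHP side the same text can simply be read back with json_decode.
$received = json_decode($wire, true);
var_dump($received === $data); // bool(true)
```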
PHP as FastCGI backend and Lighttpd
tags: dev fastcgi lighttpd php problem solved wikidot
15 June 2009
Wikidot + Lighttpd + PHP5
At Wikidot we use PHP5 as a FastCGI backend to Lighttpd, a light and fast web server. It works like this:
- a few hundred php5-cgi processes (the name says cgi, but they also support FastCGI mode) are running and waiting to be used
- a single lighttpd process (only one is needed!) manages the network connections to all the clients and, once a request is ready, either serves a static file or forwards the request to one of the PHP backend processes.
We used to use Lighttpd's internal FastCGI process manager, meaning that the lighttpd process itself started the PHP processes.
Problems
We ran into the known problem of 500 (server-side) errors appearing after some random time, especially under high traffic. The typical message in Lighttpd's error.log was:
<some date>: (mod_fastcgi.c.2494) unexpected end-of-file (perhaps the fastcgi process died): pid: ...
There are plenty of reports on this in both Lighttpd's and PHP's forums, bug trackers and even some blogs.
Workarounds
We managed to write some hacky scripts that detected the situation and restarted the backends when needed. The reaction was so quick that almost no-one noticed the error, but damn, this is not how WE solve problems.
A blind try
We decided to give spawn-fcgi a shot. What is it? It is a program that spawns FastCGI backends independently of the Lighttpd server. Why try it? I had read somewhere that it works more reliably than the internal Lighttpd spawner. Interestingly, this program comes from the lighttpd package, so we stay in the family anyway. It is mainly intended for running the FastCGI backends as a different user than the web server user, or on different machine(s) than the web server. This can naturally be used for some smart load balancing.
The only problem we encountered with this solution was an internal limit of 256 on the number of processes a single process can spawn (hardcoded, fixed in later versions). But at the same time we decided to build a few FastCGI bridges (each spawning ~200 PHPs) anyway, so that was no longer a problem for us.
What was quite surprising (though honestly, I deeply believed it would happen): our problems with 500 server errors and PHP disappeared. This configuration has been working for about 2 weeks now with absolutely no hacky scripts involved and no restarting needed. Cool.
Why I wrote this
I wrote this short note just for the record and to let other people know that using spawn-fcgi instead of Lighttpd's internal FastCGI spawner might solve their problems with PHP (FastCGI) and 500 internal server errors.
Hope this helps someone.
About Zend Framework
tags: php polish zend_framework
11 March 2009
Some time ago I had warm words for Zend Framework. It turns out that things are not as rosy as they seem. The key phrase here is:
64 bit
On a 64-bit system there are many problems with Zend Framework. I'll list a few of them:
Zend_Search_Lucene
Even code as simple as this, when run on a 64-bit system, causes infinite loops and exceeds the memory limit:
require_once("Zend/Search/Lucene.php");
$index = Zend_Search_Lucene::open('/path/to/index');
Of course, opening the index is the first thing you do in order to use it, so this module (Zend_Search_Lucene) becomes completely unusable.
Interestingly, the problem is reported in the ZF bug tracker. I worked out what needs to be done to fix it and posted a (more or less) ready diff to the bug tracker, but nobody cared about either the bug or the fix.
Zend_Db
One of the more important parts of Zend Framework is the database access layer. Unfortunately, on a 64-bit system the framework has some problems with limiting results using the limit method. Asking for records starting from record 0 generated a query for me that ended with:
LIMIT 98382101, 20;
It should be:
LIMIT 0, 20;
Silly thing. Maybe they fixed it in a newer version, maybe not. I didn't dig into it.
Zend_XmlRpc_Server
Recently, while working on the Wikidot API, I ran into a nasty, hidden bug in the XML-RPC server component of Zend Framework.
Everything seems to work, but when an XML-RPC client calls system.methodHelp or system.methodSignature, the call ends with an error saying that the requested method does not match the signatures of the known methods. On 32 bits everything works.
Summary
Zend Framework may seem nice (it did to me), but be very careful when moving code from 32 bits (e.g. on a laptop) to 64 bits (e.g. on a server). There are A LOT of bugs, including really annoying ones related to Zend_Db.
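When moving code between machines it helps to know which integer width PHP is actually running with. The check below uses only the built-in `PHP_INT_SIZE` and `PHP_INT_MAX` constants. It does not diagnose the Zend Framework bugs described above, it only tells you which kind of build you are on (typically 4 bytes on 32-bit Linux builds and 8 bytes on 64-bit ones).

```php
<?php
// PHP_INT_SIZE is the size of a PHP integer in bytes on this build.
$bits = PHP_INT_SIZE * 8;
echo "PHP integers are {$bits}-bit, PHP_INT_MAX = ", PHP_INT_MAX, "\n";
```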
Working On wdLite
tags: apache dev hack php wdlite wikidot
3 February 2009
A few days ago I started working on wdLite — a lite version of Wikidot.
The primary aim of this project is to make installation dead simple and server requirements really small.
Server requirements
wdLite should be installable on:
- Apache with PHP5 (no safe mode or other limitations) and PostgreSQL on Linux boxes
PHP and PostgreSQL should already be configured to work with each other. You should have a PostgreSQL database and "user/password/database"-based access to it. Wikidot will create the tables, but won't create the database. You should either create it as root or have root create it for you beforehand (this process might be automatic on web hosting services with PHP/PostgreSQL).
This configuration should be available at a whole bunch of virtual hosting providers.
Installation
The installation process should be no harder than this:
- Get a zip or check out the newest version from the repository
- Upload the directory to the server
- Adjust directory permissions
- Open the install.php script in your browser
- Supply mail and PostgreSQL credentials
- Choose your wiki name and create users
- Enjoy your new wiki
What are the differences between a "full" Wikidot installation and wdLite?
Limitations
- only one wiki
- no page revision diffs
- more limited page size
- lower security (especially for IE users)
- works only with Apache
- some features disabled or non-working:
  - memcached disabled
  - karma disabled
  - notifications disabled
Better than the full version, because:
- works with Apache
- works on any HTTP port (not only 80)
- works within any directory (also in user directory accessible like http://myserver.com/~quake/something/really/deep/wikidot)
- easier installation with a web interface
- no root-access needed
- works well with GMail to send mails from the service
- easily installable on Ubuntu
- the easiest method to start developing with Wikidot
- no need to manually compile additional software
Current work progress
I'm about to pre-release this software, to let you test it.
I have to:
- create a list of things that need to work before a final 1.0 release.
- redirect / to /?/
- create install.php
Things that work already:
- logging in/out
- displaying pages
- editing pages
- saving pages
- some basic modules
- uploading and displaying files
- navigation (links are rewritten from absolute to relative)
How does it work
wdLite is based on Wikidot OpenSource. It contains wikidot, index.php and a bunch of helper scripts. The index.php file is a hacky PHP script that
- converts URL-s like http://some.server.com/some/url/?/front:page to http://www.some.server.com/front:page
- tricks the Wikidot software into thinking that the Wikidot domain is some.server.com and the main wiki is www.some.server.com
- sets a bunch of system variables Wikidot relies on, like $_SERVER['REQUEST_URI'], $_SERVER['QUERY_STRING']
- runs the appropriate Wikidot script
- … or serves (more like redirects to) a static file
- catches the script output
- runs some transformation on caught output (like converting the links from http://www.some-server.com/some:other-page to ?/some:other-page)
- sends the data back to browser
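The URL handling in the steps above can be sketched roughly as follows. This is not the actual wdLite index.php: the function names and the simulated script output are assumptions made only for illustration.

```php
<?php
// Hypothetical: extract the page name from a query string.
// For http://some.server.com/some/url/?/front:page the query string
// is "/front:page", so the leading slash is stripped.
function page_from_query($queryString)
{
    return ltrim($queryString, '/');
}

// Hypothetical: rewrite absolute wiki links in the caught output
// to links relative to the index.php location.
function relativize_links($html, $wikiHost)
{
    return str_replace('http://' . $wikiHost . '/', '?/', $html);
}

// Catch the output of the (here simulated) Wikidot script...
ob_start();
echo '<a href="http://www.some.server.com/some:other-page">other page</a>';
$output = ob_get_clean();

// ...then transform it before sending it back to the browser.
echo page_from_query('/front:page'), "\n";
echo relativize_links($output, 'www.some.server.com'), "\n";
```

Output buffering (`ob_start` and `ob_get_clean`) is one straightforward way to catch a script's output in PHP. Whether wdLite does it this way is not stated in the post.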
WARNING, ACHTUNG
The script is more of a dirty hack than a version of Wikidot, but this is intentional. We don't want to maintain too many versions of Wikidot. Having this dirty script only use the Wikidot software, without modifying it, makes it quite independent of changes in Wikidot. This means the same wdLite script will work with a newer version of Wikidot (= less maintenance work).


