maettig.com

Thiemos Archiv

Re-visiting »PHP: a fractal of bad design« in 2020

I love PHP. And I love Eevee's (a.k.a. Evelyn Woods) blog post about PHP's bad design from 2012. Mostly because it's so massive. It lists pretty much everything anybody ever disliked about the language, and explains why. It's fun to read. And it's true. Or at least it was true. The post was compiled in 2012. What we had back then was PHP 5.3, the most recent and only supported version, with 12 painful years of PHP 4 behind us. And yes, PHP 4 was still in use pretty much everywhere. Adoption was notably slower back then.

It's 2020, and PHP 8 will be released in less than 3 months, according to the plan. And oh boy, what a leap forward! The PHP 8 changelog reads like they are finally fixing some of the worst design mistakes in the language.

Time to re-visit »PHP: a fractal of bad design«.

»… design«
Uh, wait. Most of PHP was never really »designed«. It started as a library of scripts, written in Perl, never really meant to be an actual programming language. Some design mistakes were made back then. And they are hard to fix when you don't really have a deprecation strategy, and when your community isn't used to deprecations and to keeping up with possibly breaking changes. When asked »what is better, making people angry because a design mistake is never fixed, or making people angry because of breaking changes«, the PHP community's answer was, for a very long time, to stick to the old behavior. This changed with PHP 5.3, and the change has continued since then.

Ok, I get it. Even the lack of design is design, and that's what the post is about.

»PHP requires boilerplate«
I'm not really sure what the author meant when they wrote this. Don't all languages require at least some boilerplate? I mean, we are not using BASIC any more, where you could write 10 PRINT "Hello world", and that would already be a runnable program. I guess one possible answer is that we have proper auto-loaders now.
»Little new functionality is implemented as new syntax«
That changed, and will change a lot more with PHP 8. 🎉
»The language is full of global and implicit state.«
True. Writing clean code that doesn't rely on (too much) global state is still really, really hard.
»array_search, strpos, and similar functions return […] false if they don’t find it at all.«
Yea, and other languages return -1. Is this really better? You still get code that doesn't work and possibly does the wrong thing when you continue to use that -1 as if it were a valid index. Sure, there is a difference: a -1 is much more likely to produce a runtime error, while PHP's false is much more likely to slip through without one. But neither is really predictable.
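To make the trap concrete, here is a minimal sketch of how strpos() behaves for a match at position 0 versus no match at all. The variable names are made up for illustration.

```php
<?php
$atStart = strpos('abc', 'a'); // int 0: found at the first position
$missing = strpos('abc', 'x'); // bool false: not found

// A loose comparison cannot tell the two cases apart.
$looseAtStart = ($atStart == 0); // true
$looseMissing = ($missing == 0); // also true

// A strict comparison against false can.
$strictAtStart = ($atStart !== false); // true
$strictMissing = ($missing !== false); // false
```

The strict !== false check is the only one of these that reliably separates "found at the start" from "not found".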
»== is useless.«
Sad, but true. I try hard to never use it, and when I see it, it's almost always a red flag, signaling error-prone code. Still, PHP 8 tries. Most notably, the glorious 123 == "123foo" example no longer evaluates to true.
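A small sketch of what changes and what doesn't. The comments describe PHP 8 behavior as I understand it; under PHP 7 the first comparison still yields true. Variable names are illustrative only.

```php
<?php
// "123foo" only starts with a number, so PHP 8 compares as strings.
$leadingNumeric = (123 == "123foo"); // false in PHP 8, true in PHP 7

// A fully numeric string is still compared as a number.
$numeric = (123 == "123"); // true in both versions

// The strict operator never converts types.
$strict = (123 === "123"); // false in both versions
```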

But wait. The loose == comparison behaves very similarly in JavaScript, the language everybody seems to be fine with. 🤷‍♂️

»The [] indexing operator can also be spelled {}.«
It was deprecated in PHP 7.4, and is gone in PHP 8.
»[] can be used on any variable […] returns null and issues no warning.«
Fixed. PHP 7.4 raises a notice for this, and PHP 8 a warning.
»I don’t know why [list($a, $b) = …] wasn’t given real dedicated syntax, or why the name is so obviously confusing.«
Fixed. Just use [$a, $b] = … and forget about list().
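For illustration, a short sketch of the bracket syntax (available since PHP 7.1), including keyed destructuring and a swap. The values are arbitrary.

```php
<?php
// Positional destructuring, the replacement for list($a, $b) = ...
[$a, $b] = [1, 2];

// Keys work as well.
['x' => $x, 'y' => $y] = ['x' => 10, 'y' => 20];

// Swapping two variables without a temporary one.
[$a, $b] = [$b, $a];
```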
»(int) is […] a single token; there’s nothing called int in the language.«
I honestly don't care. Why do you need to know?
»There’s no such thing as a nested or locally-scoped function or class.«
Fixed. Anonymous functions a.k.a. »closures« as well as anonymous classes are a thing now.
»There’s redundant syntax for blocks […] endif«
Uh. We pretend these don't exist 🙄 and have even actively banned them.
»PHP errors and PHP exceptions are completely different beasts. They don’t seem to interact at all.«
Yea, that's seriously awful. Again, PHP 7 and 8 try to fix most of this now.
»[…] you can’t require that a [function] argument be an int or string […] or other “core” type«
Fixed.
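A minimal sketch of scalar type declarations (PHP 7.0 and later). The function repeatWord() is a hypothetical helper, not something from the original post.

```php
<?php
function repeatWord(string $word, int $times): string
{
    return str_repeat($word, $times);
}

$ok = repeatWord('ab', 3); // "ababab"

try {
    // A non-numeric string cannot be coerced to int, so this throws.
    repeatWord('ab', 'three');
    $rejected = false;
} catch (TypeError $e) {
    $rejected = true;
}
```

Note that without declare(strict_types=1) a numeric string like '3' would still be silently converted; only values that can't be converted at all are rejected.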
»No named arguments to functions.«
Fixed in PHP 8. 🥳
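Roughly what this looks like, assuming PHP 8. The function makeLink() and its parameters are invented for this example.

```php
<?php
function makeLink(string $href, string $text = 'link', bool $newTab = false): string
{
    $target = $newTab ? ' target="_blank"' : '';
    return '<a href="' . htmlspecialchars($href) . '"' . $target . '>'
        . htmlspecialchars($text) . '</a>';
}

// Skip $text and set only the arguments we care about, by name.
$html = makeLink(href: '/wiki', newTab: true);
```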
»“Variadic” functions require faffing about with […] func_get_args. There’s no syntax for such a thing.«
There is the ... syntax now.
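A quick sketch of a variadic function using the ... syntax instead of func_get_args(). The function sumAll() is a made-up example.

```php
<?php
// All arguments are collected into the typed array $numbers.
function sumAll(int ...$numbers): int
{
    return array_sum($numbers);
}

$total = sumAll(1, 2, 3); // 6
$none = sumAll();         // 0, $numbers is an empty array
```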
»Also, an instance method can still be called statically.«
Fixed. It's not allowed any more.
»Subclasses cannot override private methods.«
I never noticed. But it's fixed now.
»There are no constructors or destructors. […] There is no method you can call on a class to allocate memory and create an object.«
What? 😨 I seriously don't want to allocate memory manually – ever.
»There are a lot of aliases«
Slowly, slowly they are deprecated and removed. In MediaWiki, we actively disallow a lot of them via a PHPCS sniff.
»chunk_split breaks a string into chunks of equal length, then joins them together with a delimiter.«
There is a lot of crazy stuff you probably never need. So what? 🤷‍♂️ chunk_split appears a single time in MediaWiki.
»Because calling a function with an array as its arguments is so awkward (call_user_func_array) […]«
Fixed with the ... syntax.
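For comparison, a small sketch that calls the same built-in function both ways. The argument values are arbitrary.

```php
<?php
$args = ['7', 3, '0', STR_PAD_LEFT];

// The old, awkward way.
$viaHelper = call_user_func_array('str_pad', $args); // "007"

// Argument unpacking with the ... syntax.
$viaSpread = str_pad(...$args); // "007"
```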
»08 becomes the number zero.«
It's a parse error now.
»0x0+2 produces 4. […] 0b0+1 produces 2.«
Not sure what happened here, but it's all fine now.
»No Unicode support.«
I don't fully understand why people keep bringing this up. It's not really true. Sure, it's confusing. PHP can treat the same string as an 8-bit ASCII string or as a sequence of UTF-8 code points, depending on which functions you use, or on whether you use the /u modifier in regular expressions. But this isn't specific to PHP. It's something it shares with many other languages that were developed when Unicode simply did not exist yet. C or Delphi, for example.
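A minimal sketch of the "same string, two views" effect, using only core functions. The string is a single "é" written as a Unicode escape so that the file's encoding doesn't matter.

```php
<?php
$s = "\u{00E9}"; // "é", two bytes in UTF-8

// strlen() counts bytes.
$bytes = strlen($s); // 2

// Without /u the dot matches single bytes, with /u it matches code points.
$withoutU = preg_match_all('/./', $s);  // 2
$withU = preg_match_all('/./u', $s);    // 1
```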
»This one datatype acts as a list, ordered hash, ordered set, sparse list, and occasionally some strange combination of those.«
Yes. And I love it.
»Negative indexing doesn’t work«
Wait, what? What do you expect it to do? Give you an array element counting from the end of the array? Wasn't this blog post about consistency? Anyway, just use array_slice( …, -1 ).
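A short sketch of what that looks like in practice. array_key_last() (PHP 7.3 and later) is my addition here, not something the original post mentions.

```php
<?php
$list = ['a', 'b', 'c'];

// array_slice() returns an array, so pick its first element.
$lastAsArray = array_slice($list, -1);     // ['c']
$last = array_slice($list, -1)[0] ?? null; // 'c'

// Alternative: ask for the last key directly.
$lastKey = array_key_last($list); // 2

// -1 is just another key in PHP, and this array doesn't have it.
$negative = $list[-1] ?? 'no such key';
```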
»[…] cannot create zero-length arrays«
That was an issue with many function arguments and return values, and got fixed for all of them when the ... syntax was introduced, as far as I know.
»All the addslashes, stripslashes, and other slashes-related nonsense are red herrings that don’t help anything.«
That's … oh wow. That's a really, really good summary for why these concepts are bad. I hope … ah, dang. 🤦‍♂️ But it could be worse.
»register_globals […] is an embarrassment.«
True. It was a good idea that never worked.
»include accepting HTTP URLs.«
I'm not sure if this was ever a good idea. I mean, we still do more or less the same when we write code that calls a web API. But we use dedicated syntax for that. The issue was not that PHP could pull stuff via HTTP, but that this feature was hiding in essential functions like include, and could easily be exploited there. Anyway, it's (obviously) fixed now.
»Magic quotes.«
Should be gone by now. I always hated them. They gave us nothing but pain.
»Perl is still alive«
I recommend watching Netanel Rubin's entertaining talk »The Perl Jam: Exploiting a 20 Year-old Vulnerability« as well as »The Perl Jam 2«.
good post

only I beg to differ on some points

like that thing about return values - those are actually far worse than in typesafe languages
DAT
Which return values? Returning false in case of an error is a very common thing in PHP. MediaWiki does it as well in places. I'm not saying I like it. I prefer null or Status objects that tell you if an operation was successful. The thing with -1 is: it's a special case, just like false. Neither makes the code easier to read.

The worst thing might be that we have to use strpos() !== false way too often because we didn't have str_contains() before PHP 8.
Thiemo
> "Negative indexing […] Wasn't this blog post about consistency?"

It exists in many languages and it's perfectly consistent

> "This one datatype acts as a list, ordered hash, ordered set, sparse list […]"

The main problem is that there are no details about the implementation and no alternative. As the original author mentioned, it's not about "liking it or not", it's about efficiency and tooling.
A vector is not a Set and is not a Hash. They have different purposes and different specs, leading to better (or worse) performance / capabilities depending on the situation

> "array_search, strpos, and similar functions return […] false […] Yea, and other languages return -1. Is this really better? […] Sure, there is a difference: It's much more likely for a -1 to produce a runtime error, and much more likely for PHP's false to not do this."

Your answer is in bad faith. The problem with 0/false as opposed to 0/-1 is that false equals 0 in a loose comparison. So if(strpos == 0) will return true whether the needle was at pos 0 or not in the haystack at all. You're partially quoting the original point to avoid answering it.
TMLPESSLBDP
> "The main problem is that there is not details about implementation and no alternative."

This is not true, see php.net/manual/en/book.ds.php.

Additionally, you have interfaces like ArrayAccess which allow you to implement any "array like" class you need.
anonym
I even said "sure, it's much more likely for false to [silently become 0]". So much for partial quotes. But the complaint about returning false is really just repeating the complaint about the == operator. My argument stands: returning -1 is just as much of a hack as returning false.

Besides, I just realized that if we made strpos() return -1 and allowed negative indexing at the same time, we would create just another trap to fall into.
Thiemo

Comments on this post can be sent to the author by e-mail.