What is the deal with "queries" thing? I need some study material for understanding servers.

deslocotoco

Enthusiast
Joined
Jun 23, 2016
Messages
210
Hello guys,

Every time that i read a discussion, specially about add-ons, i came across people talking about "queries".

This happens very very often:

- "Oh, this add-on is very good but increase in two queries in each page. Can you fix it?"

So, i don't understand what is it.

Why is a bad thing for server purposes?

And like i said, i need some studying material for understanding a little more about my business. Can you guys recommend some introduction materials?

Thank you very much.
 

rafalp

Desu Ex
Joined
Feb 17, 2008
Messages
1,340
It's short for database query. Doing many of those can increase time it takes to display page to the user and slow the site down so rule of thumb is that less is better.
 

MagicalAzareal

Magical Developer
Joined
Apr 25, 2019
Messages
758
Okay, think about it this way.

The database is a box which holds all of your site's data, some liken it to a box filled with excel spreadsheets, although not quite like that as it is stricter and more rigid among other things.

A query is where you ask the database a "question" about a specific piece of data, for instance, you might want to get back information about one of your members. The database then checks whether you're allowed to access that bit of data and then hands it back to you.

The more queries you have, the more "questions" you have flying back and forth, and the more work the database has to both decipher the question and to finally get back to sending back a response. While this is happening, unless you're doing some brand of async magic, the rendering of the page is stalled (as it obviously doesn't have the data it needs to continue).

Sometimes, a plugin or a software will ask far more questions than they really need to, as they could instead hold the "response" in their memory or do other tricks to get around having to ask it.

Hope this helps :)
 

zappaDPJ

Moderator
Joined
Aug 26, 2010
Messages
8,450
Doing many of those can increase time it takes to display page to the user and slow the site down so rule of thumb is that less is better.

Every time I read something like this I can't but think of the reply the CTO of Internet Brands, Joe Rosenblum gave when ask why vB5 pages required so many (100+) queries. His reply was something along the lines of the number of queries is not important... and so sank 'the world's leading community software'.
 

MagicalAzareal

Magical Developer
Joined
Apr 25, 2019
Messages
758
Every time I read something like this I can't but think of the reply the CTO of Internet Brands, Joe Rosenblum gave when ask why vB5 pages required so many (100+) queries. His reply was something along the lines of the number of queries is not important... and so sank 'the world's leading community software'.
It's less important now than it used to be as database engines have gotten a fair bit faster, but it's still nowhere near as fast as just pulling a value out of a hashmap, things also get worse if the database is on a different server.
 

Oblivion Knight

Participant
Joined
May 11, 2005
Messages
70
the reply the CTO of Internet Brands, Joe Rosenblum gave when ask why vB5 pages required so many (100+) queries. His reply was something along the lines of the number of queries is not important... and so sank 'the world's leading community software'.
I hate that response with a passion.. :X3:

100 queries may be more acceptable than it used to be, but that doesn't mean to say that it should be accepted.. I refuse to see how 100+ queries is better than 20-30 on a standard page, unless the queries are ridiculously complexed and splitting them down into additional queries makes logical sense, but it's highly unlikely that all of the queries being ran match this criteria.

I remember cringing at vBulletin 3 installations that ran with about 70 queries on the forum index.. :p
 

MagicalAzareal

Magical Developer
Joined
Apr 25, 2019
Messages
758
I hate that response with a passion.. :X3:

100 queries may be more acceptable than it used to be, but that doesn't mean to say that it should be accepted.. I refuse to see how 100+ queries is better than 20-30 on a standard page, unless the queries are ridiculously complexed and splitting them down into additional queries makes logical sense, but it's highly unlikely that all of the queries being ran match this criteria.

I remember cringing at vBulletin 3 installations that ran with about 70 queries on the forum index.. :p
20 - 30 shouldn't be acceptable either. 5 at most tbh.

Adding queries is like snorting drugs, people just can't stop.
It's very easy to succumb to "just one more query" and if you have 30 plugins which all think this is perfectly fine, then all of a sudden you have 100 queries.
 
Last edited:

Paul M

Super Moderator
Joined
Jun 26, 2006
Messages
4,077
Every time I read something like this I can't but think of the reply the CTO of Internet Brands, Joe Rosenblum gave when ask why vB5 pages required so many (100+) queries. His reply was something along the lines of the number of queries is not important... and so sank 'the world's leading community software'.
..and he never did understand how wrong that was.

When vB5 was still in early alpha, some pages were running over a thousand queries. They had to add caching just to get that down to hundreds.
 

MagicalAzareal

Magical Developer
Joined
Apr 25, 2019
Messages
758
..and he never did understand how wrong that was.

When vB5 was still in early alpha, some pages were running over a thousand queries. They had to add caching just to get that down to hundreds.
How on earth do you accomplish "thousands of queries"? Even if I was deliberately trying to sabotage myself, it would be hard for me to do that.
 

Paul Buonopane

Aspirant
Joined
Jan 17, 2015
Messages
28
It's important to note that number of queries isn't the metric you should be worried about. It is often more efficient to run additional, well-optimized queries than it is to process the data within PHP. Going out of your way to avoid additional queries may actually be detrimental to the performance of your site.

What you should be worried about are queries that are poorly optimized. They may lack appropriate indices, or they may be written in such a way that indices won't be sufficient.

As a developer, what you should not be doing is attempting to handle relationships from within PHP when you could be using SQL. Relational databases are relatively expensive, but they perform well when given adequate resources, so utilize them. They can often handle large datasets far more efficiently than PHP can. A good MySQL query is usually going to be O(log n)--or perhaps O(1) if indexed properly--but it's far too easy to write equivalent code in PHP that is O(n) or even O(n^k). (In simpler terms, unless you know exactly what you're doing, it's not unusual for MySQL queries to be exponentially faster on paper.)

Edit: For example, here's what XenForo 1.5 reports for the homepage on a dev server:
Code:
Page Time: 0.1990s
Memory: 8.9545 MB (Peak: 10.2250 MB)
Queries (48, time: 0.0602s, 30.2%)
This is a pretty heavy page; most of our requests that hit PHP complete in 50ms, but this one takes 200ms. Even with 48 queries, the query time only accounts for 30% of the time; the rest is PHP. Most of those queries take about 0.5ms each.
 
Last edited:

MagicalAzareal

Magical Developer
Joined
Apr 25, 2019
Messages
758
It's important to note that number of queries isn't the metric you should be worried about. It is often more efficient to run additional, well-optimized queries than it is to process the data within PHP. Going out of your way to avoid additional queries may actually be detrimental to the performance of your site.

What you should be worried about are queries that are poorly optimized. They may lack appropriate indices, or they may be written in such a way that indices won't be sufficient.

As a developer, what you should not be doing is attempting to handle relationships from within PHP when you could be using SQL. Relational databases are relatively expensive, but they perform well when given adequate resources, so utilize them. They can often handle large datasets far more efficiently than PHP can. A good MySQL query is usually going to be O(log n)--or perhaps O(1) if indexed properly--but it's far too easy to write equivalent code in PHP that is O(n) or even O(n^k). (In simpler terms, unless you know exactly what you're doing, it's not unusual for MySQL queries to be exponentially faster on paper.)
Query count is often indicative of a deeper problem, especially when they're excessive.
When you walk into a neighbourhood filled with smashed windows and trash strewn around, that also presents a similar image.

MySQL does have a piss poor query scheduler, at-least in contrast to better database engines like PgSQL, although it's often better than nothing and what one would consider to be "good enough". If you really want to optimise database performance, then you're using the wrong database engine in the first place.

Something as simple as a join can lead to de-optimisations in some cases and it is also often significantly faster to address MySQL over Handlersocket than it is to execute SQL queries with conventional drivers (according to developers more involved in developing MySQL and friends, although it has gotten a fair bit faster over the years).

But really, even simple things like caching can turn 50 queries into a more manageable 5 or so.

A lot of queries likely boil down to... Load all the smilies. Load all the word filters. Load all the templates. Load the current user. Load the current forum. Load the current forum permissions. Load all the settings. Load the current user. Load the current user's group. Load the topic. Load the posts. Load the authors. Load the likes. Load all the widgets. Load whatever you liked. That's 15 queries btw.

Do you really have to do all that work every-time? No, just slap a good portion of that crud in memory, anything else is very elaborate self-sabotage. Are the smilies going to be constantly changing? No.

I don't understand where these "20" queries or "40" queries or "100" queries or the infinitely ludicrous "1000" queries is coming from. Maybe, reactions could bump it up? But by that much?
And finally, if you think PHP is really slow, then well, don't use PHP, although that doesn't relate too well to this.
 
Last edited:

Paul Buonopane

Aspirant
Joined
Jan 17, 2015
Messages
28
MySQL does have a piss poor query scheduler, at-least in contrast to better database engines like PgSQL, although it's often better than nothing and what one would consider to be "good enough".

Postgres is practically impossible to scale. MySQL is pretty bad, too, but not as bad as Postgres. That means Postgres is often ruled out for larger applications.

Query count is often indicative of a deeper problem, especially when they're excessive.
When you walk into a neighbourhood filled with smashed windows and trash strewn around, that also presents a similar image.

That depends. I've added an example to the post you quoted.

Something as simple as a join can lead to de-optimisations in some cases and it is also often significantly faster to address MySQL over Handlersocket than it is to execute SQL queries with conventional drivers (according to developers more involved in developing MySQL and friends, although it has gotten a fair bit faster over the years).

It can, but if you're running a large site or writing add-ons intended to be used on large sites, you should be optimizing your queries regardless. If we push a bad query into production, we notice pretty quickly--alarms start going off. We have pages with 100+ queries that complete in about 200-300ms, but we're certain we need those queries, and they're quite fast. It's possible to write very complex queries with multiple joins that complete instantaneously.

If you really want to optimise database performance, then you're using the wrong database engine in the first place.

That's an oversimplification, and even then, it's not really true. RDBMSes can perform just fine, and they're often the best choice when there are complex relationships between models. Once you scale up enough, allowing the DBMS to handle relationships becomes unrealistic, but at that point you've probably got enough in-house resources to handle the situation. (Or enough money to throw at the likes of Oracle--I shivered just typing that.)

Different types of databases have strengths and weaknesses, and you can't really say that one DBMS is outright better than another. That's why large forum installations tend to combine SQL and NoSQL, the latter permitting faceted search, which MySQL et al really can't handle. Elasticsearch is much, much slower than MySQL, but it scales much better and requires a lot fewer resources for faceted search.
 

MagicalAzareal

Magical Developer
Joined
Apr 25, 2019
Messages
758
Elasticsearch is much, much slower than MySQL
I don't have any benchmarks on hand, so I can't comment on this at the moment, however anything can be made to perform fast, if you throw infinite resources at it.
We have pages with 100+ queries that complete in about 200-300ms, but we're certain we need those queries, and they're quite fast.
What are the queries for specifically? Analytics?

For analytics, I think there is Dalmatiner (which is built on PgSQL which can ingest something like a million events per second?) and that other one whose name I forget which is written in Go. InfluxDB?
 
Last edited:

Paul Buonopane

Aspirant
Joined
Jan 17, 2015
Messages
28
I don't have any benchmarks on hand, so I can't comment on this at the moment, however anything can be made to perform fast, if you throw infinite resources at it.

Keep in mind, the overhead of HTTP requests alone adds a lot to the time. When you consider that most MySQL queries are sub-millisecond, that HTTP overhead hurts Elasticsearch a lot.

Generally, relational queries in ES are going to be slower because they're not really relational. Also, you can't really have a multi-column index; instead, you end up with a faceted search, which is going to be slower--that's just the way it is. Elasticsearch serves a different purpose: if you don't know in advance what combination of filters will be used, or the number of combinations is too high, faceted search is going to be faster than unoptimizable MySQL queries.

anything can be made to perform fast, if you throw infinite resources at it.

The dev server that I'm testing on has a single t2.micro EC2 instance for Nginx + PHP and I think a t2.small RDS instance for MySQL. (It might be t2.micro.) We try to avoid throwing tons of hardware at problems.

What are the queries for specifically? Analytics?

Usually just separate, unrelated page components. We have some pages that act a bit like dashboards and show a lot of information for power users.

Analytics go in Elasticsearch. Elasticsearch costs us significantly more than MySQL and requires a lot more resources, but it's capable of performing queries that we can't reasonable do with MySQL. Our Elasticsearch indices increase by about 50 GiB/day; fortunately, most of it is transient log data that doesn't need to stick around more than 90 days.

MySQL operates almost entirely in RAM in our configuration. Even with only 16 GiB RAM on the master instance, pretty much everything hits the cache and completes quickly, but it took a lot of work to tune it for that. Since it's not I/O-heavy, we can use networked storage--it still has to be solid state, but it's cheaper than local SSDs. Elasticsearch is inevitably I/O-heavy, so it needs NVMe SSDs attached to each node, or it blows up.
 

MagicalAzareal

Magical Developer
Joined
Apr 25, 2019
Messages
758
Keep in mind, the overhead of HTTP requests alone adds a lot to the time. When you consider that most MySQL queries are sub-millisecond, that HTTP overhead hurts Elasticsearch a lot.
Well, maybe it would be slower, if you use Elasticsearch like you would a database, but it has a different use case, one which I think MySQL would suck royally at. Much like how if I was really desperate for time series performance, I might decide to just throw things in Dalmatiner which is built for that sort of data.

Dalmatiner's biggest flaw is that it only supports integers as it's more focused towards collecting sensor data for IoT devices. There are some other time series databases in development, the whole field is blowing up with IoT becoming the next big thing and some are even using GPUs to power the databases (e.g. Uber's AresDB).
Usually just separate, unrelated page components. We have some pages that act a bit like dashboards and show a lot of information for power users.
If it's some sort of dashboard with dozens of charts or something, then I guess that makes sense
 

Paul M

Super Moderator
Joined
Jun 26, 2006
Messages
4,077
It's important to note that number of queries isn't the metric you should be worried about.
This is just wrong. Of course its a metric you should be worried about.

It might not be the only metric too look at, but to say its not important is just nonsense.
If your page is running 1000 queries then something is wrong - regardless of how "optimised" they are.
 
Top