What is the deal with "queries" thing? I need some study material for understanding servers.

Discussion in 'Servers and Control Panels' started by palhanow, Jul 1, 2019.

  1. palhanow

    palhanow Enthusiast

    114
    105
    +109
    Hello guys,

    Every time that i read a discussion, specially about add-ons, i came across people talking about "queries".

    This happens very very often:

    So, i don't understand what is it.

    Why is a bad thing for server purposes?

    And like i said, i need some studying material for understanding a little more about my business. Can you guys recommend some introduction materials?

    Thank you very much.
     
  2. rafalp

    rafalp Desu Ex

    1,179
    707
    +361
    It's short for database query. Doing many of those can increase time it takes to display page to the user and slow the site down so rule of thumb is that less is better.
     
  3. MagicalAzareal

    MagicalAzareal Magical Developer

    275
    70
    +129
    Okay, think about it this way.

    The database is a box which holds all of your site's data, some liken it to a box filled with excel spreadsheets, although not quite like that as it is stricter and more rigid among other things.

    A query is where you ask the database a "question" about a specific piece of data, for instance, you might want to get back information about one of your members. The database then checks whether you're allowed to access that bit of data and then hands it back to you.

    The more queries you have, the more "questions" you have flying back and forth, and the more work the database has to both decipher the question and to finally get back to sending back a response. While this is happening, unless you're doing some brand of async magic, the rendering of the page is stalled (as it obviously doesn't have the data it needs to continue).

    Sometimes, a plugin or a software will ask far more questions than they really need to, as they could instead hold the "response" in their memory or do other tricks to get around having to ask it.

    Hope this helps :)
     
    • Like Like x 2
    • Winner Winner x 2
    • List
  4. zappaDPJ

    zappaDPJ Administrator

    6,693
    1,342
    +5,276
    Every time I read something like this I can't but think of the reply the CTO of Internet Brands, Joe Rosenblum gave when ask why vB5 pages required so many (100+) queries. His reply was something along the lines of the number of queries is not important... and so sank 'the world's leading community software'.
     
    • Agree Agree x 3
    • Like Like x 1
    • List
  5. MagicalAzareal

    MagicalAzareal Magical Developer

    275
    70
    +129
    It's less important now than it used to be as database engines have gotten a fair bit faster, but it's still nowhere near as fast as just pulling a value out of a hashmap, things also get worse if the database is on a different server.
     
    • Informative! Informative! x 1
    • List
  6. Nerdface

    Nerdface Aspirant

    34
    13
    +21
    I hate that response with a passion.. :X3:

    100 queries may be more acceptable than it used to be, but that doesn't mean to say that it should be accepted.. I refuse to see how 100+ queries is better than 20-30 on a standard page, unless the queries are ridiculously complexed and splitting them down into additional queries makes logical sense, but it's highly unlikely that all of the queries being ran match this criteria.

    I remember cringing at vBulletin 3 installations that ran with about 70 queries on the forum index.. :p
     
  7. MagicalAzareal

    MagicalAzareal Magical Developer

    275
    70
    +129
    20 - 30 shouldn't be acceptable either. 5 at most tbh.

    Adding queries is like snorting drugs, people just can't stop.
    It's very easy to succumb to "just one more query" and if you have 30 plugins which all think this is perfectly fine, then all of a sudden you have 100 queries.
     
    Last edited: Jul 5, 2019
  8. R0binHood

    R0binHood Habitué

    1,223
    432
    +952
    Queries. Not even once.
     
  9. Paul M

    Paul M Limeade Addict

    3,788
    1,627
    +2,210
    ..and he never did understand how wrong that was.

    When vB5 was still in early alpha, some pages were running over a thousand queries. They had to add caching just to get that down to hundreds.
     
    • Agree Agree x 2
    • Informative! Informative! x 1
    • List
  10. MagicalAzareal

    MagicalAzareal Magical Developer

    275
    70
    +129
    How on earth do you accomplish "thousands of queries"? Even if I was deliberately trying to sabotage myself, it would be hard for me to do that.
     
  11. mysiteguy

    mysiteguy Devotee

    2,878
    887
    +2,218
    Loops through each result in a select which do more selects, rather than joins.
     
  12. Paul Buonopane

    Paul Buonopane Aspirant

    23
    83
    +18
    It's important to note that number of queries isn't the metric you should be worried about. It is often more efficient to run additional, well-optimized queries than it is to process the data within PHP. Going out of your way to avoid additional queries may actually be detrimental to the performance of your site.

    What you should be worried about are queries that are poorly optimized. They may lack appropriate indices, or they may be written in such a way that indices won't be sufficient.

    As a developer, what you should not be doing is attempting to handle relationships from within PHP when you could be using SQL. Relational databases are relatively expensive, but they perform well when given adequate resources, so utilize them. They can often handle large datasets far more efficiently than PHP can. A good MySQL query is usually going to be O(log n)--or perhaps O(1) if indexed properly--but it's far too easy to write equivalent code in PHP that is O(n) or even O(n^k). (In simpler terms, unless you know exactly what you're doing, it's not unusual for MySQL queries to be exponentially faster on paper.)

    Edit: For example, here's what XenForo 1.5 reports for the homepage on a dev server:
    Code:
    Page Time: 0.1990s
    Memory: 8.9545 MB (Peak: 10.2250 MB)
    Queries (48, time: 0.0602s, 30.2%)
    
    This is a pretty heavy page; most of our requests that hit PHP complete in 50ms, but this one takes 200ms. Even with 48 queries, the query time only accounts for 30% of the time; the rest is PHP. Most of those queries take about 0.5ms each.
     
    Last edited: Jul 7, 2019
  13. MagicalAzareal

    MagicalAzareal Magical Developer

    275
    70
    +129
    Query count is often indicative of a deeper problem, especially when they're excessive.
    When you walk into a neighbourhood filled with smashed windows and trash strewn around, that also presents a similar image.

    MySQL does have a piss poor query scheduler, at-least in contrast to better database engines like PgSQL, although it's often better than nothing and what one would consider to be "good enough". If you really want to optimise database performance, then you're using the wrong database engine in the first place.

    Something as simple as a join can lead to de-optimisations in some cases and it is also often significantly faster to address MySQL over Handlersocket than it is to execute SQL queries with conventional drivers (according to developers more involved in developing MySQL and friends, although it has gotten a fair bit faster over the years).

    But really, even simple things like caching can turn 50 queries into a more manageable 5 or so.

    A lot of queries likely boil down to... Load all the smilies. Load all the word filters. Load all the templates. Load the current user. Load the current forum. Load the current forum permissions. Load all the settings. Load the current user. Load the current user's group. Load the topic. Load the posts. Load the authors. Load the likes. Load all the widgets. Load whatever you liked. That's 15 queries btw.

    Do you really have to do all that work every-time? No, just slap a good portion of that crud in memory, anything else is very elaborate self-sabotage. Are the smilies going to be constantly changing? No.

    I don't understand where these "20" queries or "40" queries or "100" queries or the infinitely ludicrous "1000" queries is coming from. Maybe, reactions could bump it up? But by that much?
    And finally, if you think PHP is really slow, then well, don't use PHP, although that doesn't relate too well to this.
     
    Last edited: Jul 7, 2019
  14. Paul Buonopane

    Paul Buonopane Aspirant

    23
    83
    +18
    Postgres is practically impossible to scale. MySQL is pretty bad, too, but not as bad as Postgres. That means Postgres is often ruled out for larger applications.

    That depends. I've added an example to the post you quoted.

    It can, but if you're running a large site or writing add-ons intended to be used on large sites, you should be optimizing your queries regardless. If we push a bad query into production, we notice pretty quickly--alarms start going off. We have pages with 100+ queries that complete in about 200-300ms, but we're certain we need those queries, and they're quite fast. It's possible to write very complex queries with multiple joins that complete instantaneously.

    That's an oversimplification, and even then, it's not really true. RDBMSes can perform just fine, and they're often the best choice when there are complex relationships between models. Once you scale up enough, allowing the DBMS to handle relationships becomes unrealistic, but at that point you've probably got enough in-house resources to handle the situation. (Or enough money to throw at the likes of Oracle--I shivered just typing that.)

    Different types of databases have strengths and weaknesses, and you can't really say that one DBMS is outright better than another. That's why large forum installations tend to combine SQL and NoSQL, the latter permitting faceted search, which MySQL et al really can't handle. Elasticsearch is much, much slower than MySQL, but it scales much better and requires a lot fewer resources for faceted search.
     
  15. Paul Buonopane

    Paul Buonopane Aspirant

    23
    83
    +18
    If I had a penny for every time I saw this... heck, I saw it about 10 minutes ago.
     
  16. MagicalAzareal

    MagicalAzareal Magical Developer

    275
    70
    +129
    I don't have any benchmarks on hand, so I can't comment on this at the moment, however anything can be made to perform fast, if you throw infinite resources at it.
    What are the queries for specifically? Analytics?

    For analytics, I think there is Dalmatiner (which is built on PgSQL which can ingest something like a million events per second?) and that other one whose name I forget which is written in Go. InfluxDB?
     
    Last edited: Jul 7, 2019
  17. Paul Buonopane

    Paul Buonopane Aspirant

    23
    83
    +18
    Keep in mind, the overhead of HTTP requests alone adds a lot to the time. When you consider that most MySQL queries are sub-millisecond, that HTTP overhead hurts Elasticsearch a lot.

    Generally, relational queries in ES are going to be slower because they're not really relational. Also, you can't really have a multi-column index; instead, you end up with a faceted search, which is going to be slower--that's just the way it is. Elasticsearch serves a different purpose: if you don't know in advance what combination of filters will be used, or the number of combinations is too high, faceted search is going to be faster than unoptimizable MySQL queries.

    The dev server that I'm testing on has a single t2.micro EC2 instance for Nginx + PHP and I think a t2.small RDS instance for MySQL. (It might be t2.micro.) We try to avoid throwing tons of hardware at problems.

    Usually just separate, unrelated page components. We have some pages that act a bit like dashboards and show a lot of information for power users.

    Analytics go in Elasticsearch. Elasticsearch costs us significantly more than MySQL and requires a lot more resources, but it's capable of performing queries that we can't reasonable do with MySQL. Our Elasticsearch indices increase by about 50 GiB/day; fortunately, most of it is transient log data that doesn't need to stick around more than 90 days.

    MySQL operates almost entirely in RAM in our configuration. Even with only 16 GiB RAM on the master instance, pretty much everything hits the cache and completes quickly, but it took a lot of work to tune it for that. Since it's not I/O-heavy, we can use networked storage--it still has to be solid state, but it's cheaper than local SSDs. Elasticsearch is inevitably I/O-heavy, so it needs NVMe SSDs attached to each node, or it blows up.
     
  18. MagicalAzareal

    MagicalAzareal Magical Developer

    275
    70
    +129
    Well, maybe it would be slower, if you use Elasticsearch like you would a database, but it has a different use case, one which I think MySQL would suck royally at. Much like how if I was really desperate for time series performance, I might decide to just throw things in Dalmatiner which is built for that sort of data.

    Dalmatiner's biggest flaw is that it only supports integers as it's more focused towards collecting sensor data for IoT devices. There are some other time series databases in development, the whole field is blowing up with IoT becoming the next big thing and some are even using GPUs to power the databases (e.g. Uber's AresDB).
    If it's some sort of dashboard with dozens of charts or something, then I guess that makes sense
     
  19. Paul M

    Paul M Limeade Addict

    3,788
    1,627
    +2,210
    This is just wrong. Of course its a metric you should be worried about.

    It might not be the only metric too look at, but to say its not important is just nonsense.
    If your page is running 1000 queries then something is wrong - regardless of how "optimised" they are.
     
  1. This site uses cookies to help personalise content, tailor your experience and to keep you logged in if you register.
    By continuing to use this site, you are consenting to our use of cookies.