I already get rate-limited like crazy on Lemmy, and there are only about 60,000 users on my instance. Is each instance really just one server, or are there multiple containers running across several hosts? I’m concerned that federation will mean an inconsistent user experience. Some instances may be beefy, others will be under-resourced… so the average person might think Lemmy overall is slow or error-prone.

Reddit has millions of users. How the hell is this going to scale? Does anyone have any information about Lemmy’s DB and architecture?

I found this post about Reddit’s DB from 2012. Not sure if Lemmy has a similar approach to ensure speed and reliability as the user base and traffic grow.

https://kevin.burke.dev/kevin/reddits-database-has-two-tables/

  • Max-P@lemmy.max-p.me
    1 year ago

    Bigger instances will indeed run multiple copies of the various components, it’s pretty standard software in that regard.

    Usually at first that will start by moving the PostgreSQL database to its own dedicated box, and then start adding additional backend boxes, possibly adding more caching in front so that the backend doesn’t have to do as much work. Once the database is pegged, the next step is usually a write primary and one or more read secondaries. When that gets too much, you get into sharding so that you can spread the database load across multiple servers. I don’t know much about PostgreSQL but I have to assume it’s better than MySQL in that regard and I’ve seen a 1 TB MySQL database in the wild running just fine.
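    To make the primary/secondary and sharding steps a bit more concrete, here is a minimal sketch in Python. It is not how Lemmy or any particular instance does it; the `Router` class, the `shard_for` helper, and the host names are all hypothetical, and the sharding shown is plain hash-modulo.

```python
import hashlib
import itertools


class Router:
    """Sends writes to the primary and spreads reads over the secondaries.

    Hypothetical helper for illustration only; hosts are just labels here.
    """

    def __init__(self, primary, replicas):
        self.primary = primary
        # Round-robin over the read secondaries; fall back to the primary
        # when no secondary has been set up yet.
        self._reads = itertools.cycle(replicas or [primary])

    def for_write(self):
        return self.primary

    def for_read(self):
        return next(self._reads)


def shard_for(key: str, shards: list[str]) -> str:
    """Pick a shard by hashing the key, so the same key always lands on the same shard."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]


router = Router("db-primary", ["db-replica-1", "db-replica-2"])
write_host = router.for_write()
read_hosts = [router.for_read() for _ in range(3)]
shard = shard_for("community:42", ["shard-0", "shard-1", "shard-2"])
```

    One caveat with the modulo approach: changing the number of shards moves most keys to a different shard, which is part of why sharding is usually the last step rather than the first.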

    I think lemmy.world in general is hitting some scalability issues that they’re working on. Keep in mind the software is fairly new and is only now being truly tested at large scale, so there’s probably a ton of room for optimization. Also, lemmy.world is still on 0.17, and apparently 0.18 changed the protocol a lot in a way that makes it scale much better, so when they complete that upgrade it’ll probably run a lot better already.


    The part that worries me about scalability in the long term is the push nature of ActivityPub. My server is already getting several POST requests to /inbox per second, which makes me wonder how that’s gonna work if big instances have to push content updates to thousands of Lemmy instances where most of the data probably isn’t even seen. I was surprised it was a push system and not a pull system, as pull is much easier to scale and cache at the CDN level, and can be fetched on demand for people who only check Lemmy once in a while.
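    A rough back-of-the-envelope version of that worry, as a Python sketch. The numbers are made up for illustration, and both functions are simplifications I am assuming: push is modeled as one POST per activity per subscribed instance, pull as each instance polling on a fixed interval.

```python
def push_deliveries_per_second(activities_per_second: float,
                               subscribed_instances: int) -> float:
    """Outgoing POSTs to remote /inbox endpoints when every activity
    is pushed to every subscribed instance."""
    return activities_per_second * subscribed_instances


def pull_requests_per_second(subscribed_instances: int,
                             poll_interval_seconds: float) -> float:
    """Incoming fetches when every subscribed instance polls once per interval.
    Unlike push, this does not grow with the activity rate."""
    return subscribed_instances / poll_interval_seconds


# Hypothetical big instance: 5 activities per second, 1000 subscribed instances.
push_load = push_deliveries_per_second(5, 1000)
# Same 1000 instances polling once a minute instead.
pull_load = pull_requests_per_second(1000, 60)
```

    Under these assumed numbers push costs thousands of requests per second while polling costs a couple dozen, and the polled responses are the kind of thing a CDN can cache. The trade-off is latency: with polling, content shows up only as fast as the poll interval.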

    I need to start digging into Lemmy’s code and get familiar with the internals, still only a couple days in with my private instance.

  • Iron Lynx@lemmy.world
    1 year ago

    My expectation, or at least hope, is that Lemmy will grow horizontally, i.e. more instances for more specialised content, instead of vertically, i.e. more communities in singular, larger instances. Since it’s all federated, you can get to stuff in other instances.

    I just had an idea. Let’s compare reddit and lemmy as land use metaphors.

    Reddit is like one monolithic megacity. It’s full of communities, some big, encompassing entire neighbourhoods, and others smaller, having one street, one block, maybe even just one building.

    Lemmy is like a country, with every instance a city. Some cities are big and varied, others are smaller and specialised, like ones dedicated entirely to fishing or aviation or being German. And you can choose a city to settle in and move between cities for your content. Some cities will be more open to sharing content with residents of other cities, and others will put up bigger restrictions. There are jokes about parts of the userbase on 4chan or Tumblr forming their own subcommunities, and the fediverse allows this in a very material way.

    My expectation is that more cities may emerge as people develop more specialised communities. And since there are many cities, there is some resilience in the system. If an instance goes down, you’ve lost one instance. Out of christ knows how many. Chances are some of its content is duplicated across other instances, so nothing of value is lost. Meanwhile, if (/when) Reddit goes down, all of Reddit is gone.

    In short, I hope Lemmy develops more, smaller, specialised instances over time. Reddit allowed very niche interests to have a corner, and despite that, I think the fediverse is more suited to allow for that than a centralised service.

  • HobbitFoot @thelemmy.club
    1 year ago

    Poorly. Lemmy will scale poorly.

    I won’t be surprised if the larger instances start locking down more as a way to sustain themselves, like restricting communities or only allowing text posts.

    • nyakojiru@lemmy.world
      1 year ago

      Sometimes you just have to adapt to the situation and keep going until it settles down. The error here, I think, is assuming something can’t have flaws and issues, even more so if it’s not backed by a corporation. And no one wants corporations.

      • HobbitFoot @thelemmy.club
        1 year ago

        It isn’t about accommodating to the situation, but planning for long term growth.

        Right now, instances of Lemmy don’t have any way to fund server costs other than asking for donations. Outside of Wikipedia, that isn’t a sustainable business model. How is Lemmy supposed to survive if, every time a sub gains critical mass, it shuts down?

        • ritswd@lemmy.world
          1 year ago

          planning for long-term growth

          Which is part of any scaling effort, and you can’t really guess your way through predicting and resolving bottlenecks; it takes some serious expertise. And as far as I know, the Lemmy devs have never built a high-scale service before, and I think that is possibly the single biggest risk to the growth and success of the Lemmy project in general.

          Source: that’s my job, I’ve been doing that for some of the most high-scale services in the world for about a decade. I absolutely could help, actually I’d love to, but I definitely won’t under current Lemmy leadership, for reasons: https://lemmy.world/comment/596235

            • ritswd@lemmy.world
              1 year ago

              I think Kbin is something good being built by good people, I get what they’re trying to do, but unfortunately I don’t have a lot of faith that it will turn out to be a successful project.

              In terms of technical scaling, I’m puzzled that they went with an interpreted language if the goal is scale. I get that the basic usage of Kbin’s features may not require a ton of CPU-heavy operations, or a fine handling of the memory; but once it meets sufficient scale, there will have to be some scale edge-case bottlenecks where you’ll want to step out of the beaten path and get lower-level, so I’m a bit confused about why they chose a technology that will make those harder to get past rather than easier. PHP is great for rapid prototyping, but I’d argue that’s not what the vision should be here.

              About community scale, I’m no expert, but they seem to really care about offering a karma system, and we’ve seen the karma-farming behavior that this has incentivized on Reddit. I don’t see why it would be any different here if enough people end up joining. Lemmy is intentionally not offering a karma system, and it really feels like the healthier move long-term.

              I think all it would take would be for the Lemmy devs to admit that they’re in over their heads, and that their political affiliations have been a hindrance to the project, to the point that they transition the governance of it to other people. I really hope they do that. If they do soon enough, they’re so far ahead and built on so much more long-term thinking, that I think it would pretty much make Kbin kinda obsolete. I have no special information about this, so I could be wrong, and I hope for them that I am; but I can see that as a pretty likely outcome.

              (That, and on the shorter-term, I wouldn’t contribute to a product I don’t use, and I can’t use it for now because my usage is 100% mobile, and the current lack of API means no native client. I wish the mobile web was better than it is as an application platform…)

              • Hexadecimalkink@lemmy.ml
                1 year ago

                Have you found that their political leanings have affected you in any way? Just curious if you have some sort of bias that’s making you think people on the left can’t produce efficient software.

                • ritswd@lemmy.world
                  1 year ago

                  It hasn’t. But letting terrible people have power affects the world in normalizing violence and hatred. It’s not about left or right, if they were American racists against Chinese people, I would have the exact same problem. I’m personally quite on the left, but without the hate.

                  I am living safe and not being targeted with hateful violence like the Uyghurs or North Koreans are, so this is far, far more important than what can affect me.

  • Netto Hikari@social.fossware.space
    1 year ago

    Well, I run an instance, too. It’s not big at all, but I was thinking about the issue of scaling, too. You can only scale up a single server so much…

    But on the other hand, Lemmy is still young. We’ll find solutions to that problem.

    Also, interesting article. I only took a glance at it, but having only two tables kind of suggests that Reddit is using a relational database. So, if they’re not “normalizing” everything, why not use a completely different paradigm, like what MongoDB etc. has?

    • Irisos@lemmy.umainfo.live
      1 year ago

      The database isn’t really the problem in the current state of things. The server is because:

      • Until 0.18 there was no caching (for the UI), and the websockets were poorly implemented
      • The developers have admitted that they aren’t proficient in SQL, in which case, why not use an ORM instead? Sure, they aren’t perfect but they will do better than the average developer at scale.
      • There is no queue system for ActivityPub requests
      • Because there is no queue, user requests and federation have the same priority when they shouldn’t, and one can bottleneck the other
      • Live inserts are used, meaning that regardless of the DB used, performance is going to be killed, since inserting data one row at a time several times a second is a major waste of resources

      Tl;dr: It’s trying to do everything and not that well. So users suffer because they have to share resources with non-UI related tasks.

      The database suffers because it has to do an insert of 1 object X 50 times in a second when it could do it once for all 50 items.
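      The difference between 50 single inserts and one batched insert can be sketched like this. This is only an illustration: it uses Python’s built-in `sqlite3` as a stand-in for PostgreSQL, and the `activity` table and both function names are made up for the example, not taken from Lemmy’s schema.

```python
import sqlite3


def insert_one_at_a_time(conn, rows):
    """The "live insert" pattern: one statement and one commit per row."""
    for row in rows:
        conn.execute("INSERT INTO activity (id, body) VALUES (?, ?)", row)
        conn.commit()


def insert_batched(conn, rows):
    """Collect the rows first, then write them all in a single transaction."""
    with conn:  # commits once on success, rolls back on error
        conn.executemany("INSERT INTO activity (id, body) VALUES (?, ?)", rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE activity (id INTEGER PRIMARY KEY, body TEXT)")

# 50 incoming items written with one commit instead of 50.
insert_batched(conn, [(i, f"activity {i}") for i in range(50)])
count = conn.execute("SELECT COUNT(*) FROM activity").fetchone()[0]
```

      The batched version pays the per-transaction cost once instead of once per row. The price is that items wait briefly in memory before they are written, so the batch window has to stay short.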

      Federation suffers because you can’t offload it to a separate machine farm whose job will be to receive and send ActivityPub requests and send/read data from the correct queues to do so.
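      As a minimal sketch of what a queue with priorities could look like, assuming user requests should always be served before federation work. The `WorkQueue` class and the job names are hypothetical, and a real setup would more likely use an external message broker so that federation workers can run on separate machines.

```python
import heapq
import itertools

# Lower number means served first.
USER, FEDERATION = 0, 1


class WorkQueue:
    """In-memory priority queue: user jobs jump ahead of federation jobs."""

    def __init__(self):
        self._heap = []
        # Monotonic counter as tie-breaker, so jobs with the same
        # priority come out in the order they were put in.
        self._order = itertools.count()

    def put(self, priority, job):
        heapq.heappush(self._heap, (priority, next(self._order), job))

    def get(self):
        _priority, _seq, job = heapq.heappop(self._heap)
        return job

    def __len__(self):
        return len(self._heap)


queue = WorkQueue()
queue.put(FEDERATION, "inbox-1")
queue.put(USER, "render-page")
queue.put(FEDERATION, "inbox-2")
queue.put(USER, "submit-vote")
served = [queue.get() for _ in range(len(queue))]
```

      With strict priorities like this, federation jobs could starve if user traffic never lets up, so in practice you would probably want dedicated federation workers or some form of weighting.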

  • LostRedditor@lemmy.ml
    1 year ago

    I love Lemmy but your question is legit. I just signed up with lemmy.world because lemmy.ml is slow/not responding.

    Before making a post in lemmy.world guess what? lemmy.world isn’t responding. I know they have scheduled maintenance at 9 CET but it was 20 minutes before that.

  • MoiraPrime@lib.lgbt
    1 year ago

    The ideal way that ActivityPub federation works IMO is a bunch of smaller nodes coming together to make a large network.

    If you have a bunch of people all on one or two instances then you’ll have a “central hub” of the network that’s constantly overloaded.

    That’s my advice to community builders on this platform… Spread out across smaller instances, don’t just all sign up to a big one.

    • Serinus@lemmy.ml
      1 year ago

      Moving the scaling problem from a few instances to federation is likely to cause more harm than good.

      I don’t see how syncing a post across a hundred instances is more efficient than having a hundred users see the post on one instance.