• archchan@lemmy.ml
    English
    arrow-up
    105
    arrow-down
    2
    ·
    7 months ago

    I choose to privately self-host open source AI models and stuff on Linux. It’s almost like technology is a tool and corps are the ones fucking things up. Hmmm, imagine that.

    • nexussapphire@lemm.ee
      English
      arrow-up
      21
      arrow-down
      15
      ·
      edit-2
      7 months ago

      It’s so fun to play with offline AI. It doesn’t have the creepy underpinnings of knowing that art and journalism, as well as musings from social media, were blatantly stolen from the internet and sold as a service for profit.

      Edit: I hate theft, and if you think theft is OK for training LLMs, go ahead and dislike this comment. I don’t feel bad about what I said; local offline AI is just better because it doesn’t work on the premise of backroom deals and blatant theft. I will never use an AI like DALL·E when there is a talented artist trying to put food on the table with a skill they honed for years. If you condone stealing, you are a cheap, heartless coward.

      • Teanut@lemmy.world
        English
        arrow-up
        18
        arrow-down
        2
        ·
        7 months ago

        I hate to break it to you, but if you’re running an LLM based on (for example) Llama, the training data (corpus) that went into it was still large parts of the Internet.

        The fact that you’re running the prompts locally doesn’t change the fact that it was still trained on data that could be considered protected under copyright law.

        It’s going to be interesting to see how the law shakes out on this one, because an artist going to an art museum and doing studies of those works (and let’s say it’s a contemporary art museum where the works wouldn’t be in the public domain) for educational purposes is likely fair use - and possibly encouraged to help artists develop their talents. Musicians practicing (or even performing) other artists’ songs is expected during their development. Consider some high school band practicing in a garage, playing some song to improve their skills.

        I know the big difference is that it’s people training vs. a machine/LLM training, but that seems to come down to not so much a copyright issue (which it is in an immediate sense) as a question of whether an algorithm should be entitled to the same protections as a person. If not, what if real AI (not just an LLM) is developed? Should those entities be entitled to personhood?

        • nexussapphire@lemm.ee
          English
          arrow-up
          3
          arrow-down
          4
          ·
          7 months ago

          I hate to break it to you, but not all machine learning is LLM-based. I’ve been messing with neural-based TTS from a small project called Piper. I’m looking into an image recognition neural network to write software for and train myself. I might try writing it myself for fun 🤔

          I’m not interested in anything that uses stolen data like that, so my options are limited and relegated to incredibly focused single-purpose tools or things I make myself with the tools available.

          I’d love to play with image generation and large language models, but until all the legal stuff is worked out and individuals get paid for their work, I’m not touching it.

          To me it’s as cut and dried as this: if it’s the difference between an individual becoming their own boss or making a better living and a corporation growing its market cap, I’ll always choose the individual. I know there’s a possibility of that growth resulting in more jobs, but I’d rather have an environment where small businesses open, breed competition, and overall improve everyone’s life. Let’s not give the keys over to companies like Microsoft and close more doors.

          I don’t care about the discussion of true AI having rights. It’s only going to be used to make the wealthy wealthier.

          • hellofriend@lemmy.world
            English
            arrow-up
            9
            ·
            7 months ago

            All LLMs are based on neural networks. Furthermore, all neural networks need training, regardless of whether they’re an LLM or some other form of machine learning. If you want to ensure there’s no stolen material used in the neural net then you have to train it yourself with material that you have the copyright to.

              • hellofriend@lemmy.world
                English
                arrow-up
                5
                ·
                7 months ago

                I was expanding on your point, you twat. But hey, just be a snarky cunt. I’m sure that’ll get you far.

                • nexussapphire@lemm.ee
                  English
                  arrow-up
                  5
                  ·
                  edit-2
                  7 months ago

                  Sorry, I thought you were being a smartass and just skimmed through it. Truly my bad.

                  Edit: It’s hard to tell intention sometimes, and I really do appreciate you summarizing what I said. It’s true and a more approachable answer than what I gave.

          • nexussapphire@lemm.ee
            English
            arrow-up
            4
            arrow-down
            1
            ·
            7 months ago

            Sorry, I feel strongly about this. Play with it all you want; it’s really cool shit! But please don’t pay for access to it, and if you need some art or a professional write-up, please just pay someone to do it.

            It’ll mean so much to your fellow man in these uncertain times and the quality will be so much better.

      • nexussapphire@lemm.ee
        English
        arrow-up
        5
        arrow-down
        7
        ·
        7 months ago

        I’m on his side; I don’t get the dislike. Maybe he likes massive corporations stealing people’s data and putting artists and journalists out of work.