• Voroxpete@sh.itjust.works · ↑36 ↓7 · 16 hours ago

    It really doesn’t. You’re just describing the “fancy” part of “fancy autocomplete.” No one was ever really suggesting that they only predict the next word. If that were the case they would just be autocomplete, with nothing fancy about it.

    What’s being conveyed by “fancy autocomplete” is that these models ultimately operate by combining the most statistically likely elements of their dataset, with some application of random noise. More noise creates more “creative” (meaning more random, less probable) outputs. They do not actually “think” as we understand thought. This can clearly be seen in the examples given in the article, especially the ones involving math. The model is throwing together elements that are statistically proximate to the prompt. It’s not actually applying a structured, logical method the way humans can be taught to.
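
    To make the “noise” point concrete, here is a minimal sketch of temperature sampling, which is one common way randomness is added when picking the next token. It assumes this is roughly what “noise” refers to here. The vocabulary, the scores, and the function names are all made up for illustration; nothing below comes from the article or from any real model.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; a higher temperature flattens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_next_token(vocab, logits, temperature, rng):
    """Draw one token at random, weighted by the temperature-adjusted probabilities."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(vocab, weights=probs, k=1)[0]


# Hypothetical candidates and scores for the prompt "The cat sat on the"
VOCAB = ["mat", "sofa", "roof", "moon"]
LOGITS = [4.0, 2.5, 1.0, -1.0]

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs match
    for t in (0.2, 1.0, 2.0):
        probs = softmax_with_temperature(LOGITS, t)
        picks = [sample_next_token(VOCAB, LOGITS, t, rng) for _ in range(10)]
        print(f"T={t}: probs={[round(p, 3) for p in probs]} picks={picks}")
```

    At a low temperature almost all of the probability sits on the top-scoring token, so the output is close to deterministic. At a higher temperature the unlikely tokens get picked more often, which is the “more random, less probable” effect described above.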

    • FourWaveforms@lemm.ee · ↑12 ↓1 · 13 hours ago

      Unfortunately, these articles are often written by people who don’t know enough to realize they’re missing important nuances.

      • datalowe@lemmy.world · ↑2 · 2 hours ago

        It also doesn’t help that the AI companies deliberately use language to make their models seem more human-like and cogent. Saying that the model e.g. “thinks” in “conceptual spaces” is misleading imo. It abuses our innate tendency to anthropomorphize, which I guess is very fitting for a company with that name.

        On this point I can highly recommend this open-access article, which is also written in accessible language: https://link.springer.com/article/10.1007/s10676-024-09775-5 (the authors also appear on an episode of the Better Offline podcast)

    • reev@sh.itjust.works · ↑2 · 14 hours ago

      Genuine question regarding the rhyme thing: it can be argued that “predicting backwards isn’t very different,” but you can’t attribute generating the rhyme first to noise, right? So how does it “know” (for lack of a better word) to generate the rhyme first?

      • dustyData@lemmy.world · ↑12 ↓1 · 14 hours ago

        It already knows, from the massive list of training poems, which words are statistically more commonly rhymed with each other. This is what the massive data sets are for. One of the interesting things is that it’s not predicting backwards, exactly. It’s actually mathematically converging on the response text to the prompt, all the words at the same time.

          • ThisIsNotHim@sopuli.xyz · ↑1 · 3 hours ago

            We also check to see if the word that popped into our heads actually rhymes by saying it out loud. The actual validation steps we can take are a bigger difference than being a little more robust.

            We also have non-list-based methods, like breaking the word down into smaller chunks to try to build up hopefully more novel rhymes. I imagine professionals have even more tools, given the complexity of more modern rhyme schemes.