• brucethemoose@lemmy.world
      24 days ago

      Also, that is a very low-context test. A longer context will bog it down, even setting aside the prompt processing time.

      …On the other hand, you could probably squeeze out a bit more speed by running OpenVINO instead of llama.cpp, so that is still respectable.

      • rumba@lemmy.zip
        24 days ago

        Yeah, it’s definitely not good enough for user-facing work. But when I’m developing something like translations, being able to compare the 70B output against other models is super useful before I send the job off to something that costs more money to run.

        Nine times out of ten, the bigger model isn’t significantly better for what I’m trying to do, but it’s really nice to confirm that.