The news last week that Meta had released Llama 3 meant that everyone around me was talking about it. I immediately downloaded and tried the 8B variant and was impressed by the results. However, I quickly began to see benchmarks in which the 70B version challenged GPT-3.5.
I initially tried to run Llama 3 70B GPTQ with 4-bit quantisation, as I know GPTQ gives me the best performance on my local equipment. I knew that this would almost certainly fill up my 24 GB of VRAM, and I was right. A few days later I discovered a 2-bit 70B GGUF released by the LM Studio community. I ran it with an 8192 context window using Oobabooga and JUST fit it in :')
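A rough back-of-the-envelope check of why the 4-bit build overflows 24 GB while a 2-bit one can squeeze in. This counts the weights only and assumes exactly 70e9 parameters at exactly 4 or 2 bits each; real GPTQ and GGUF files carry extra scale data and mix precisions, and the KV cache for the 8192 context comes on top. The name weight_size_gb is made up for this sketch.

```python
def weight_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (10**9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumed round numbers: 70e9 parameters, uniform bit width per weight.
for bits in (16, 4, 2):
    print(f"70B at {bits}-bit: ~{weight_size_gb(70e9, bits):.1f} GB")

# 70B at 16-bit: ~140.0 GB
# 70B at 4-bit: ~35.0 GB
# 70B at 2-bit: ~17.5 GB
```

On these assumptions the 4-bit weights alone are well over 24 GB, while the 2-bit weights leave only a few GB for the context and everything else, which matches the "JUST fit" experience.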
Token generation was fairly slow, but I am not complaining, as the quality of the output is leagues above anything I have seen in the open-source space. I am SO excited by the possibilities this has created.
2.91 t/s. I'll take it for an LLM that can finally write zero-shot Python scripts that actually work!
Literally the first prompt and output with 0 changes:
You
Please write a python script that calls back to a nats server and listens on a subject called email.json and extracts data like name, cc, bcc, title, message, to, from from the content published to that channel
AI
Here is a Python script that connects to a NATS server, subscribes to a subject called "email.json", and extracts data like name, CC, BCC, title, message, to, and from from the published content:
import nats
import json
# Define the NATS server URL and credentials
nats_server_url = "nats://demo.nats.io:4222"
nats_username = "your_username"
nats_password = "your_password"
# Connect to the NATS server
nc = nats.connect(url=nats_server_url, user=nats_username, password=nats_password)
# Subscribe to the "email.json" subject
sub = nc.subscribe("email.json")
# Define a callback function to process incoming messages
def callback(msg):
    # Parse the message content as JSON
    data = json.loads(msg.payload.decode("utf-8"))
    # Extract the desired data from the JSON
    name = data.get("name")
    cc = data.get("cc")
    bcc = data.get("bcc")
    title = data.get("title")
    message = data.get("message")
    to = data.get("to")
    from_ = data.get("from")
    # Print the extracted data
    print(f"Name: {name}")
    print(f"CC: {cc}")
    print(f"BCC: {bcc}")
    print(f"Title: {title}")
    print(f"Message: {message}")
    print(f"To: {to}")
    print(f"From: {from_}")
# Start the subscription and wait for messages
sub.max_opt_wait = 10
while True:
    msg = sub.next()
    callback(msg)
# Close the NATS connection
nc.close()
Please note that you'll need to replace the nats_server_url, nats_username, and nats_password variables with your actual NATS server URL, username, and password.
Also, the script assumes that the message payload is a JSON object with the specified fields. You may need to adjust the script to fit your specific use case.
Let me know if you have any questions or need further assistance!
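A note for anyone adapting the output above: if the installed client is nats-py (an assumption; the post does not say which package "import nats" resolves to), its API is asyncio-based, so nats.connect and subscribe are awaited and the payload is read from msg.data. A minimal sketch under that assumption, with extract_email_fields, handle_message, and main as illustrative names and the demo server URL carried over from the output:

```python
import asyncio
import json

# Fields requested in the prompt.
FIELDS = ("name", "cc", "bcc", "title", "message", "to", "from")


def extract_email_fields(payload: bytes) -> dict:
    """Decode a JSON object and pick out the email fields (missing ones become None)."""
    data = json.loads(payload.decode("utf-8"))
    return {field: data.get(field) for field in FIELDS}


async def handle_message(msg) -> None:
    # Assumes nats-py, where the raw message bytes are in msg.data.
    for key, value in extract_email_fields(msg.data).items():
        print(f"{key}: {value}")


async def main(server_url: str = "nats://demo.nats.io:4222") -> None:
    # Imported here so the parsing helper can be used without nats-py installed.
    import nats

    nc = await nats.connect(server_url)
    await nc.subscribe("email.json", cb=handle_message)
    try:
        # Keep the coroutine alive so the callback keeps receiving messages.
        await asyncio.Event().wait()
    finally:
        # Flush pending messages and close the connection on shutdown.
        await nc.drain()
```

Calling asyncio.run(main()) starts the listener. Keeping the JSON parsing in its own function means it can be checked without a NATS server running.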
ONWARD!!!