Groq

Groq models leverage a unique Language Processing Unit (LPU) architecture that is optimized for exceptionally low-latency inference, particularly for large language models. This hardware-software co-design prioritizes deterministic performance and high throughput, enabling real-time generative AI applications. Groq focuses on efficient execution of complex computational graphs, which minimizes the time required for model responses.

Instrumenting the Groq Application

To instrument the Groq application, complete the following steps:

Make sure that your environment meets all the prerequisites. For more information, see Prerequisites.

  1. To install dependencies for Groq, run the following command:

    pip3 install groq==0.13.1
    
  2. Export the following credentials to access the Groq models used in the sample application.

    export GROQ_API_KEY=<groq-api-key>
    

    To create an API key to access the Groq API or use the existing one, see Groq API keys.

  3. Save the following code as the Groq sample application. The sample retrieves the complete response in a single request; a streaming variant is sketched after these steps.

    import os, time, random
    from groq import Groq
    from traceloop.sdk import Traceloop
    from traceloop.sdk.decorators import workflow
    
    # Create the Groq client with the API key that was exported in the previous step.
    client = Groq(
        api_key=os.getenv("GROQ_API_KEY")
    )
    
    # Initialize the Traceloop SDK so that each LLM call is traced.
    Traceloop.init(app_name="groq_chat_service", disable_batch=True)
    
    @workflow(name="streaming_ask")
    def ask_workflow():
        # Pick one of the sample questions at random and send it to the model.
        questions = ["What is AIOps?", "What is GitOps?"]
        question = random.choice(questions)
        # The call returns the full completion; only the message content is kept.
        answer = client.chat.completions.create(
            max_tokens=100,
            messages=[
                {"role": "user", "content": question}
            ],
            model="llama-3.3-70b-versatile",
        ).choices[0].message.content
    
        print(answer)
    
    @workflow(name="groq_chat_app")
    def main():
        # Ask four questions, pausing between calls so the traces are easy to tell apart.
        for i in range(4):
            ask_workflow()
            time.sleep(3)
    
    main()
    
  4. Execute the following command to run the application:

    python3 ./<groq-sample-application>.py
    

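The sample retrieves the full completion in a single request even though its workflow is named "streaming_ask". If you want output streamed token by token, the Groq Python SDK, which follows the OpenAI-compatible chat interface, accepts stream=True on chat.completions.create and returns incremental chunks. The following minimal sketch assumes the same GROQ_API_KEY and model as the sample; Traceloop initialization is omitted for brevity.

    import os
    from groq import Groq
    
    client = Groq(api_key=os.getenv("GROQ_API_KEY"))
    
    # Request a streamed completion; the response is an iterator of chunks.
    stream = client.chat.completions.create(
        model="llama-3.3-70b-versatile",
        messages=[{"role": "user", "content": "What is AIOps?"}],
        max_tokens=100,
        stream=True,
    )
    
    for chunk in stream:
        # Each chunk carries an incremental delta; content can be None on the final chunk.
        print(chunk.choices[0].delta.content or "", end="", flush=True)
    print()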
After you configure monitoring, Instana collects traces and metrics from the sample application.

To view the traces collected from LLM, see Create an application perspective for viewing traces.

To view the metrics collected from LLM, see View metrics.

Adding LLM Security

Exposing Personally Identifiable Information (PII) to LLMs can lead to serious security and privacy risks, such as violations of contractual obligations and an increased chance of data leakage or a data breach. For more information, see LLM security.
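A common mitigation is to mask obvious PII patterns in prompts before they reach the model. The following sketch is illustrative only: the redact_pii helper and its regular expressions are assumptions for demonstration and are not part of the Groq SDK, the Traceloop SDK, or Instana.

    import re
    
    # Hypothetical helper that masks common PII patterns before a prompt is sent to the LLM.
    EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
    PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")
    
    def redact_pii(text: str) -> str:
        text = EMAIL_RE.sub("[REDACTED_EMAIL]", text)
        text = PHONE_RE.sub("[REDACTED_PHONE]", text)
        return text
    
    prompt = "Contact Jane at jane.doe@example.com or +1 555 123 4567 about the incident."
    print(redact_pii(prompt))
    # Prints: Contact Jane at [REDACTED_EMAIL] or [REDACTED_PHONE] about the incident.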