Cortex AI Document RAG Embeddings

High-Level Flow Inside Cortex AI

  1. Hosted Models
  2. SQL Functions
  3. Execution in the AI Engine
  4. Applications
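
The four layers above can be sketched from the application side: the app sends one SQL statement, the statement calls a Cortex SQL function, and the hosted model behind that function runs inside Snowflake. This is a minimal sketch, not a reference implementation. It assumes an already-open snowflake.connector connection is passed in as conn, and the helper name score_sentiment is hypothetical. SNOWFLAKE.CORTEX.SENTIMENT is used here only because it takes a single text argument.

```python
def score_sentiment(conn, text: str) -> float:
    """Application layer: send one SQL call and read back one value."""
    cur = conn.cursor()
    try:
        # SQL function layer: SENTIMENT runs a hosted model inside Snowflake.
        # %s is the Python connector's default (pyformat) bind marker.
        cur.execute("SELECT SNOWFLAKE.CORTEX.SENTIMENT(%s) AS SCORE", (text,))
        return float(cur.fetchone()[0])
    finally:
        cur.close()
```

Passing the connection in, rather than opening it inside the helper, keeps credentials out of the function and makes it easy to exercise with a stand-in connection.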

End-to-End Example 1 – Ticket Summarization

  1. Create table
  2. Add summary column
  3. Use Cortex AI to populate summaries
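
The three steps above might look like the following sketch. The table name SUPPORT_TICKETS and the columns TICKET_ID, TICKET_TEXT, and SUMMARY are hypothetical, since the outline does not name them. The helper expects an open snowflake.connector connection. SNOWFLAKE.CORTEX.SUMMARIZE is assumed to be enabled for the account.

```python
# Hypothetical table and column names; adjust them to your schema.
TICKET_SUMMARY_STEPS = [
    # 1. Create table
    """
    CREATE TABLE IF NOT EXISTS SUPPORT_TICKETS (
        TICKET_ID   NUMBER,
        TICKET_TEXT STRING
    )
    """,
    # 2. Add summary column
    "ALTER TABLE SUPPORT_TICKETS ADD COLUMN SUMMARY STRING",
    # 3. Use Cortex AI to populate summaries (only rows not yet summarized)
    """
    UPDATE SUPPORT_TICKETS
    SET SUMMARY = SNOWFLAKE.CORTEX.SUMMARIZE(TICKET_TEXT)
    WHERE SUMMARY IS NULL
    """,
]


def run_ticket_summarization(conn) -> int:
    """Run the three statements in order; return how many were executed."""
    cur = conn.cursor()
    try:
        for statement in TICKET_SUMMARY_STEPS:
            cur.execute(statement)
    finally:
        cur.close()
    return len(TICKET_SUMMARY_STEPS)
```

The WHERE SUMMARY IS NULL filter means step 3 can be repeated without summarizing rows that already have a summary, which matters because every call to the model is billed. Step 2 as written fails on a second run because the column already exists.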

End-to-End Example 2 – Chat With Your Table (Python App)

Populate Table for Cortex AI


CREATE OR REPLACE TABLE SALES_DATA (
    SALE_DATE DATE,
    REGION STRING,
    PRODUCT STRING,
    QUANTITY NUMBER,
    AMOUNT NUMBER
);

INSERT INTO SALES_DATA (SALE_DATE, REGION, PRODUCT, QUANTITY, AMOUNT) VALUES
  ('2024-01-05', 'US', 'Laptop',   10,  15000),
  ('2024-01-15', 'US', 'Mouse',    200, 4000),
  ('2024-02-01', 'US', 'Keyboard', 150, 9000),
  ('2024-03-20', 'US', 'Monitor',  50,  25000),
  ('2024-03-28', 'EU', 'Laptop',   20,  30000),
  ('2024-03-30', 'US', 'Laptop',   5,   7500);
  

Connect to the Table

import snowflake.connector

conn = snowflake.connector.connect(
    user      = "MY_USER",
    password  = "MY_PASSWORD",
    account   = "MY_ACCOUNT",
    warehouse = "MY_WH",
    database  = "MY_DB",
    schema    = "PUBLIC",
)

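def ask_table_question(table_name: str, question: str) -> str:
    # The table name is interpolated into the SQL text, so accept only a plain
    # unquoted identifier (letters, digits, underscores) to block SQL injection.
    table_name = table_name.upper()
    is_plain = table_name.isascii() and table_name.replace("_", "").isalnum()
    if not is_plain or table_name[0].isdigit():
        raise ValueError(f"Invalid table name: {table_name!r}")

    # COMPLETE takes the prompt as a plain string. The model cannot read the
    # table on its own, so the rows are serialized to JSON and placed in the
    # prompt; this only suits small tables that fit in the context window.
    # Replace the model name with one available in your account and region.
    # %s is the connector's default (pyformat) bind marker; using ? requires
    # snowflake.connector.paramstyle = "qmark" to be set before connecting.
    sql = f"""
    SELECT SNOWFLAKE.CORTEX.COMPLETE(
        'gpt-4.1-mini',
        'You are a data analyst for the data in the {table_name} table. '
        || 'Use only the data from {table_name} to answer.'
        || ' Data (JSON rows): '
        || (SELECT TO_JSON(ARRAY_AGG(OBJECT_CONSTRUCT(*))) FROM {table_name})
        || ' Question: ' || %s
    ) AS ANSWER;
    """

    cur = conn.cursor()
    try:
        cur.execute(sql, (question,))
        row = cur.fetchone()
    finally:
        cur.close()
    # With a string prompt, COMPLETE returns the answer as plain text.
    return row[0]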


print(ask_table_question(
    "SALES_DATA",
    "What were total sales for Q1 2024 in the US?"
))
  

Chat with the Tables


print(ask_table_question("SALES_DATA", "Show US laptop sales in January."))
# Assumes a table named ORDERS_2024 also exists in the current schema.
print(ask_table_question("ORDERS_2024", "Which day had the highest revenue?"))
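
For a multi-turn chat, COMPLETE can be given a conversation history (an array of role/content objects) and an options object instead of a single string. In that form the result is, as far as the Snowflake documentation describes it, a JSON string whose answer text sits under choices[0].messages rather than a bare string. The helpers below are a hedged sketch of the client side only. The names add_turn and extract_answer are hypothetical, and the response shape should be checked against your account's output before relying on it.

```python
import json


def add_turn(history: list, role: str, content: str) -> list:
    """Return a new history with one more {'role', 'content'} message."""
    return history + [{"role": role, "content": content}]


def extract_answer(raw: str) -> str:
    """Return the answer text from either form of COMPLETE output.

    Plain-string prompts yield plain text; the history/options form is
    assumed to yield JSON like {"choices": [{"messages": "..."}], ...}.
    """
    try:
        payload = json.loads(raw)
    except (TypeError, ValueError):
        return raw  # not JSON, so treat it as the plain-text answer
    if isinstance(payload, dict) and payload.get("choices"):
        return payload["choices"][0].get("messages", "")
    return raw


history = add_turn([], "user", "Show US laptop sales in January.")
history = add_turn(history, "assistant", "US laptop sales in January: 15000.")
history = add_turn(history, "user", "And in March?")
```

Keeping the history as a plain list of dictionaries makes it easy to serialize with json.dumps when it is passed to the query.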
  

End-to-End Example 3 – RAG with Document AI + Embeddings
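
The query in this example reads from a CONTRACT_EMBEDDINGS table that must already hold one row per text chunk with its embedding. The source does not show how that table is built, so the following is only a sketch under assumptions: contract text has already been extracted, chunks are fixed-size character windows with overlap, and each chunk is embedded with SNOWFLAKE.CORTEX.EMBED_TEXT_768 using the snowflake-arctic-embed-m model. The chunk sizes, the helper name chunk_text, and the INSERT statement are illustrative. Whatever embedding function and model are used here must also be used for the question.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list:
    """Split text into character windows that overlap by `overlap` chars."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


# Hypothetical load statement: one execution per chunk, with pyformat binds
# (contract id, file name, chunk text, chunk text again for the embedding).
INSERT_CHUNK_SQL = """
INSERT INTO CONTRACT_EMBEDDINGS (CONTRACT_ID, FILE_NAME, TEXT, VECTOR)
SELECT %s, %s, %s,
       SNOWFLAKE.CORTEX.EMBED_TEXT_768('snowflake-arctic-embed-m', %s)
"""
```

Overlapping windows reduce the chance that a clause such as a termination notice period is cut in half at a chunk boundary and missed at retrieval time.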

Sample SQL Query


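WITH QUESTION AS (
  SELECT
    'What is the termination notice period in the ACME contract?' AS Q,
    -- Use the same embedding function and model that produced
    -- CONTRACT_EMBEDDINGS.VECTOR; 'snowflake-arctic-embed-m' is an example.
    SNOWFLAKE.CORTEX.EMBED_TEXT_768(
      'snowflake-arctic-embed-m',
      'What is the termination notice period in the ACME contract?'
    ) AS QVEC
),
RANKED AS (
  SELECT
    c.CONTRACT_ID,
    c.FILE_NAME,
    c.TEXT,
    -- Cosine similarity: higher means closer, so sort descending.
    VECTOR_COSINE_SIMILARITY(c.VECTOR, q.QVEC) AS SIMILARITY
  FROM CONTRACT_EMBEDDINGS c
  CROSS JOIN QUESTION q
  ORDER BY SIMILARITY DESC
  LIMIT 5
)
-- COMPLETE takes the prompt as a plain string. Replace the model name with
-- one available in your account and region.
SELECT SNOWFLAKE.CORTEX.COMPLETE(
  'gpt-4.1-mini',
  'You are a contract analyst. Using ONLY the context below, answer the question.' ||
  '\n\nCONTEXT:\n' ||
  LISTAGG(TEXT, '\n\n') WITHIN GROUP (ORDER BY SIMILARITY DESC) ||
  '\n\nQUESTION: ' ||
  (SELECT Q FROM QUESTION)
) AS ANSWER
FROM RANKED;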
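
The RANKED step above keeps the five chunks whose vectors lie closest to the question vector under the cosine measure. The same ranking can be sketched in plain Python to make the arithmetic explicit. This is an illustration with toy two-dimensional vectors, not how Snowflake computes it internally, and the function names are hypothetical.

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    """Dot product divided by the product of the vector lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list, rows: list, k: int = 5) -> list:
    """rows is a list of (text, vector); return the k most similar rows."""
    scored = [(cosine_similarity(vec, query_vec), text) for text, vec in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # most similar first
    return scored[:k]


# Toy data: "a" points the same way as the query, "b" is orthogonal to it.
rows = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [1.0, 1.0])]
best = top_k([1.0, 0.0], rows, k=2)
```

With this toy data, best holds chunk "a" first and chunk "c" second, while the orthogonal chunk "b" is dropped.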