Azure AI Vision & Azure AI Language

Beyond OpenAI models, Azure bundles a family of task-specific AI services under Azure AI Services — pre-trained APIs for vision, speech, and language. This page covers the two most commonly used: Azure AI Vision (images and video) and Azure AI Language (NLP).


Azure AI Vision

Azure AI Vision provides image understanding via the Image Analysis 4.0 API, powered by Microsoft's Florence foundation model. It replaces most capabilities of the older Computer Vision API with a single call that can run multiple features at once.

Capabilities:


1. Multi-Feature Image Analysis


from azure.ai.vision.imageanalysis import ImageAnalysisClient
from azure.ai.vision.imageanalysis.models import VisualFeatures
from azure.core.credentials import AzureKeyCredential

client = ImageAnalysisClient(
    endpoint="https://myco-vision.cognitiveservices.azure.com/",
    credential=AzureKeyCredential(""),
)

with open("warehouse.jpg", "rb") as f:
    result = client.analyze(
        image_data=f.read(),
        visual_features=[
            VisualFeatures.CAPTION,
            VisualFeatures.TAGS,
            VisualFeatures.OBJECTS,
            VisualFeatures.READ,
        ],
        gender_neutral_caption=True,
    )

print("Caption:", result.caption.text, f"(conf {result.caption.confidence:.2f})")
for tag in result.tags.list[:10]:
    print(" tag:", tag.name, tag.confidence)
for obj in result.objects.list:
    print(" obj:", obj.tags[0].name, obj.bounding_box)
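
The call above requests VisualFeatures.READ but never prints the OCR output. A minimal sketch of collecting the recognized lines follows, assuming the result exposes text as read.blocks, each holding lines with a text attribute. The helper name extract_text_lines and the stand-in object are hypothetical and exist only so the example runs without a live endpoint.

```python
from types import SimpleNamespace


def extract_text_lines(result):
    """Return the recognized text lines from an analysis result.

    Returns an empty list when READ was not requested or found no text.
    """
    read = getattr(result, "read", None)
    if read is None:
        return []
    return [line.text for block in read.blocks for line in block.lines]


# Stand-in shaped like the SDK result (assumed shape), not real service output.
fake_result = SimpleNamespace(
    read=SimpleNamespace(
        blocks=[
            SimpleNamespace(
                lines=[
                    SimpleNamespace(text="AISLE 7"),
                    SimpleNamespace(text="FRAGILE"),
                ]
            )
        ]
    )
)

for line in extract_text_lines(fake_result):
    print(" text:", line)
```

With a real response, pass the result returned by client.analyze in place of fake_result.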
  


2. Multimodal Embeddings — Image ↔ Text Search


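import numpy as np
import requests
# The Python SDK's ImageAnalysisClient has no vectorize methods, so call the REST API.
base = "https://myco-vision.cognitiveservices.azure.com/computervision/retrieval"
params = {"api-version": "2024-02-01", "model-version": "2023-04-15"}
headers = {"Ocp-Apim-Subscription-Key": ""}

# Vectorize an image
with open("mug.jpg", "rb") as f:
    r = requests.post(f"{base}:vectorizeImage", params=params, data=f.read(),
                      headers={**headers, "Content-Type": "application/octet-stream"})
img_vec = np.array(r.json()["vector"])

# Vectorize a text query (same space)
r = requests.post(f"{base}:vectorizeText", params=params, headers=headers,
                  json={"text": "a ceramic coffee mug on a desk"})
txt_vec = np.array(r.json()["vector"])

# Cosine similarity
sim = np.dot(img_vec, txt_vec) / (np.linalg.norm(img_vec) * np.linalg.norm(txt_vec))
print("similarity:", sim)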
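
The similarity score above compares one image with one query. A minimal sketch of extending that to search follows: score every stored image vector against the query vector and keep the best matches. The function names, file names, and toy 3-dimensional vectors are hypothetical, and the code uses only the standard library so it runs without NumPy or a live endpoint.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_images(query_vec, image_vecs, top_k=3):
    """Return the top_k (name, score) pairs, highest similarity first."""
    scored = [(name, cosine(query_vec, vec)) for name, vec in image_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy vectors for illustration; real embeddings have many more dimensions.
catalog = {
    "mug.jpg": [0.9, 0.1, 0.0],
    "laptop.jpg": [0.1, 0.9, 0.1],
    "desk.jpg": [0.6, 0.5, 0.1],
}
query = [1.0, 0.2, 0.0]

for name, score in rank_images(query, catalog, top_k=2):
    print(f"{name}: {score:.3f}")
```

A linear scan like this is fine for small collections. For larger ones, a vector index is the usual choice.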
  


Azure AI Language

Azure AI Language is the consolidated NLP service (formerly Text Analytics, LUIS, QnA Maker). It exposes pre-built skills and trainable features under one endpoint.

Pre-Built Features:

Trainable Features:


3. Sentiment + NER in One Call


from azure.ai.textanalytics import TextAnalyticsClient
from azure.core.credentials import AzureKeyCredential

client = TextAnalyticsClient(
    endpoint="https://myco-language.cognitiveservices.azure.com/",
    credential=AzureKeyCredential(""),
)

docs = ["Order A-482 shipped from Seattle on Tuesday and arrived damaged."]

sentiment = client.analyze_sentiment(documents=docs, show_opinion_mining=True)[0]
print("sentiment:", sentiment.sentiment, sentiment.confidence_scores)

entities = client.recognize_entities(documents=docs)[0]
for e in entities.entities:
    print(" ", e.category, "->", e.text, f"({e.confidence_score:.2f})")
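
The sentiment call above sets show_opinion_mining=True but prints only the document-level label. A minimal sketch of reading the mined opinions follows, assuming each sentence carries mined_opinions whose entries have a target and a list of assessments. The helper name and the stand-in data are hypothetical and are not actual service output for the sample sentence.

```python
from types import SimpleNamespace


def mined_opinions(sentiment_result):
    """Yield (target text, target sentiment, [assessment texts]) per opinion."""
    for sentence in sentiment_result.sentences:
        # mined_opinions is only populated when show_opinion_mining=True.
        for opinion in sentence.mined_opinions or []:
            yield (
                opinion.target.text,
                opinion.target.sentiment,
                [a.text for a in opinion.assessments],
            )


# Stand-in shaped like an AnalyzeSentimentResult (assumed shape).
fake_sentiment = SimpleNamespace(
    sentences=[
        SimpleNamespace(
            mined_opinions=[
                SimpleNamespace(
                    target=SimpleNamespace(text="packaging", sentiment="negative"),
                    assessments=[SimpleNamespace(text="damaged")],
                )
            ]
        ),
        SimpleNamespace(mined_opinions=None),
    ]
)

for target, label, assessments in mined_opinions(fake_sentiment):
    print(" opinion:", target, label, assessments)
```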
  


4. PII Redaction


resp = client.recognize_pii_entities(
    documents=["Contact Jane Doe at jane@acme.com or 555-123-4567. SSN: 123-45-6789."],
    categories_filter=["Email", "PhoneNumber", "USSocialSecurityNumber", "Person"],
)
print(resp[0].redacted_text)
# Each redacted character is masked with "*", e.g.:
# Contact ******** at ************* or ************. SSN: ***********.
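
When asterisks are not informative enough, the per-entity offsets can drive a custom redaction. A minimal sketch follows, assuming each PII entity exposes category, offset, and length, and that offsets are counted in Unicode code points so they line up with Python string indexing. The helper name and the stand-in entities are hypothetical.

```python
from types import SimpleNamespace


def redact_with_labels(text, entities):
    """Replace each entity span with its category, e.g. [Email]."""
    # Work from the end of the string so earlier offsets stay valid.
    for e in sorted(entities, key=lambda e: e.offset, reverse=True):
        text = text[: e.offset] + f"[{e.category}]" + text[e.offset + e.length:]
    return text


text = "Contact Jane Doe at jane@acme.com."

# Stand-ins shaped like PII entities (assumed shape), built from the text itself.
fake_entities = [
    SimpleNamespace(category="Person", offset=text.index("Jane Doe"), length=len("Jane Doe")),
    SimpleNamespace(category="Email", offset=text.index("jane@acme.com"), length=len("jane@acme.com")),
]

print(redact_with_labels(text, fake_entities))
```

With a real response, pass the original document string together with resp[0].entities.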
  


5. Abstractive Summarization


# long_article is the text to summarize (a str), defined elsewhere.
poller = client.begin_abstract_summary(documents=[long_article], sentence_count=3)
for doc in poller.result():
    for s in doc.summaries:
        print(s.text)
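
The loop above assumes every document succeeds. A minimal sketch of separating successful summaries from per-document failures follows, assuming each result carries id and is_error, with error.code and error.message on failures. The helper name, document ids, and the sample error are hypothetical stand-ins, not real service output.

```python
from types import SimpleNamespace


def collect_summaries(results):
    """Split results into {id: summary text} and {id: error description}."""
    summaries, errors = {}, {}
    for doc in results:
        if doc.is_error:
            errors[doc.id] = f"{doc.error.code}: {doc.error.message}"
        else:
            summaries[doc.id] = " ".join(s.text for s in doc.summaries)
    return summaries, errors


# Stand-ins shaped like the poller's results (assumed shape).
fake_results = [
    SimpleNamespace(
        id="0",
        is_error=False,
        summaries=[SimpleNamespace(text="The order arrived damaged.")],
    ),
    SimpleNamespace(
        id="1",
        is_error=True,
        error=SimpleNamespace(code="InvalidDocument", message="Document text is empty."),
    ),
]

summaries, errors = collect_summaries(fake_results)
print("summaries:", summaries)
print("errors:", errors)
```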
  


Task-Specific APIs vs. GPT via Azure OpenAI