.NET Ramblings
  • P-EAGLE: Faster LLM inference with Parallel Speculative Decoding in vLLM

    Mar 13, 2026 · aws.amazon.com/blogs/machine-learning

    In this post, we explain how P-EAGLE works, how we integrated it into vLLM starting from v0.16.0 (PR#32887), and how to serve it with our pre-trained checkpoints. Link to article: https://aws.amazon.com/blogs/machine-learning/p-eagle-faster-llm-inference-with-parallel-speculative-decoding-in-vllm/


  • Improve operational visibility for inference workloads on Amazon Bedrock with new CloudWatch metrics for TTFT and Estimated Quota Consumption

    Mar 12, 2026 · aws.amazon.com/blogs/machine-learning

    Today, we’re announcing two new Amazon CloudWatch metrics for Amazon Bedrock, TimeToFirstToken and EstimatedTPMQuotaUsage. In this post, we cover how these work and how to set alarms, establish baselines, and proactively manage capacity using them. Link to article: …


  • Secure AI agents with Policy in Amazon Bedrock AgentCore

    Mar 12, 2026 · aws.amazon.com/blogs/machine-learning

    In this post, you will understand how Policy in Amazon Bedrock AgentCore creates a deterministic enforcement layer that operates independently of the agent's own reasoning. You will learn how to turn natural language descriptions of your business rules into Cedar policies, then use those policies to enforce …


  • Multimodal embeddings at scale: AI data lake for media and entertainment workloads

    Mar 12, 2026 · aws.amazon.com/blogs/machine-learning

    This post shows you how to build a scalable multimodal video search system that enables natural language search across large video datasets using Amazon Nova models and Amazon OpenSearch Service. You will learn how to move beyond manual tagging and keyword-based searches to enable semantic search that captures the full …


  • Fine-tuning NVIDIA Nemotron Speech ASR on Amazon EC2 for domain adaptation

    Mar 12, 2026 · aws.amazon.com/blogs/machine-learning

    In this post, we explore how to fine-tune a leaderboard-topping NVIDIA Nemotron Speech Automatic Speech Recognition (ASR) model, Parakeet TDT 0.6B V2. Using synthetic speech data to achieve superior transcription results for specialised applications, we'll walk through an end-to-end workflow that combines AWS …


  • Operationalizing Agentic AI Part 1: A Stakeholder’s Guide

    Mar 11, 2026 · aws.amazon.com/blogs/machine-learning

    The AWS Generative AI Innovation Center has helped 1,000+ customers move AI into production, delivering millions in documented productivity gains. In this post, we share guidance for leaders across the C-suite: CTOs, CISOs, CDOs, and Chief Data Science/AI officers, as well as business owners and compliance leads. Link …


  • Accelerate custom LLM deployment: Fine-tune with Oumi and deploy to Amazon Bedrock

    Mar 10, 2026 · aws.amazon.com/blogs/machine-learning

    In this post, we show how to fine-tune a Llama model using Oumi on Amazon EC2 (with the option to create synthetic data using Oumi), store artifacts in Amazon S3, and deploy to Amazon Bedrock using Custom Model Import for managed inference. Link to article: …


  • Run NVIDIA Nemotron 3 Nano as a fully managed serverless model on Amazon Bedrock

    Mar 9, 2026 · aws.amazon.com/blogs/machine-learning

    We are excited to announce that NVIDIA’s Nemotron 3 Nano is now available as a fully managed and serverless model in Amazon Bedrock. This follows our earlier announcement at AWS re:Invent supporting NVIDIA Nemotron 2 Nano 9B and NVIDIA Nemotron 2 Nano VL 12B models. This post explores the technical characteristics of …


  • Access Anthropic Claude models in India on Amazon Bedrock with Global cross-Region inference

    Mar 9, 2026 · aws.amazon.com/blogs/machine-learning

    In this post, you will discover how to use Amazon Bedrock's Global cross-Region Inference for Claude models in India. We will guide you through the capabilities of each Claude model variant and how to get started with a code example to help you start building generative AI applications immediately. Link to article: …




Copyright  .NET RAMBLINGS. All Rights Reserved
