Office Hours Debrief: The Tools That Actually Ship to Production

AWS Bedrock Agents, Cursor’s Composer Revolution, and Why Kimi Makes Better Slides Than Any Consultant

Stanislav Huseletov, Leonardo Gonzalez, and Stephen

Nov 07, 2025

Hey everyone! 👋

So we just wrapped up what I’m now calling our “workshops” (office hours sounds like detention, doesn’t it?). And holy shit, the tools Stephen, Leonardo and I covered today are the difference between playing with AI and actually shipping with it.

Here’s the thing - we’re drowning in AI tools. Every morning I wake up to 5 new “game-changing” frameworks. But after today’s session, I can tell you exactly which three are worth your time and why. Spoiler: they’re all about moving from experimentation to production.

Let me break down what just happened.

Part 1: Stephen’s AWS Bedrock Agents (Or: How to Graduate from Claude Desktop)

Stephen dropped a truth bomb that resonated with everyone: “Trying to use Claude Desktop in batch mode is like trying to use a car as a boat.”

Here’s the workflow he showed us that actually makes sense:

The Exploration-to-Production Pipeline:

Start in Claude Desktop with expensive models (e.g., Claude 3.5 Sonnet)
String together MCP servers, experiment, figure out what works
Once you nail the workflow, port it to AWS Bedrock Agents
Run it 20,000 times without bankruptcy

Stephen built a ClickUp task automation that went from “let me try this in Claude” to a 300-line Python script that actually runs in production. The key insight? You need fine-grained control once you move past experimentation.

What Stephen showed us was dead simple tool creation.
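Stephen's actual code isn't reproduced in this post, so here's a minimal sketch of what "dead simple" tool creation tends to look like: a plain Python function plus a tiny registry. Everything in it is hypothetical. The `tool` decorator, `create_task`, and the task fields are stand-ins for illustration, not the Bedrock Agents or ClickUp API.

```python
import inspect
from typing import Callable, Dict, List

# Hypothetical registry mapping tool name -> function. Not a real SDK.
TOOLS: Dict[str, Callable] = {}


def tool(fn: Callable) -> Callable:
    """Register a plain function as a tool the agent may call."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def create_task(title: str, priority: int = 3) -> dict:
    """Create a task record (stand-in for a real task-tracker API call)."""
    return {"title": title.strip(), "priority": priority, "status": "open"}


def describe_tools() -> List[dict]:
    """Summarize each tool's name, docstring, and parameter names."""
    specs = []
    for name, fn in TOOLS.items():
        params = list(inspect.signature(fn).parameters)
        specs.append(
            {"name": name, "description": inspect.getdoc(fn), "parameters": params}
        )
    return specs


def call_tool(name: str, **kwargs) -> dict:
    """Dispatch a tool call by name, the way an agent runtime would."""
    return TOOLS[name](**kwargs)
```

The point of the sketch: the tool itself is ordinary, testable Python, and the description the model sees is derived from the function's signature and docstring rather than written by hand.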

The beautiful part about Bedrock Agents:

Works with any OpenAI-compatible runtime (not just AWS)
Mix and match deterministic code with AI calls
Hooks for events (visualizations, logging, whatever)
Actually handles parallelization
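To make the last three bullets concrete, here's a rough sketch of the pattern in plain Python: deterministic code handles the cheap checks, only the judgment call goes to a model, hooks fire on every event, and a thread pool runs the batch in parallel. This is an illustration under assumptions, not the Bedrock Agents API. `fake_model`, `process_task`, and `run_batch` are hypothetical names, and the model is a stub you would swap for a real client.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence

Hook = Callable[[str, dict], None]


def fake_model(prompt: str) -> str:
    """Stand-in for a real model call (assumption: returns a short label)."""
    return "urgent" if "outage" in prompt.lower() else "normal"


def process_task(task: str, model: Callable[[str], str], hooks: Sequence[Hook]) -> dict:
    # Deterministic step: trim whitespace and skip empty input, no model needed.
    cleaned = task.strip()
    if not cleaned:
        result = {"task": cleaned, "label": "skipped"}
        event = "skipped"
    else:
        # AI step: only the classification is delegated to the model.
        label = model(f"Classify this task: {cleaned}")
        result = {"task": cleaned, "label": label}
        event = "classified"
    # Hooks: logging, metrics, visualizations, whatever you plug in.
    for hook in hooks:
        hook(event, result)
    return result


def run_batch(
    tasks: List[str],
    model: Callable[[str], str] = fake_model,
    hooks: Sequence[Hook] = (),
    workers: int = 4,
) -> List[dict]:
    """Process tasks in parallel; results come back in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda t: process_task(t, model, hooks), tasks))
```

Threads suit this shape because model calls are network-bound. The deterministic guard matters at 20,000 runs: every input it filters out is a model call you don't pay for.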

Stephen’s Money Quote: “I explore in Claude Desktop, then once I’ve distilled what I want, I build a utility with Bedrock Agents. Different tools for different stages.”

Also, we’re interviewing Elizabeth Fuentes (AWS Developer Advocate) next week. If you have burning questions about production AI, drop them in our channel.

Part 2: Cursor 2.0’s Composer

Remember when Leonardo said “if you train a model enough, it becomes your own model”? Well, Cursor just proved him right.

I burned through my tokens in 2 days testing Composer, and here’s what blew my mind:

The Multi-Model Planning Revolution:

Ask one question, get 3-6 different implementation plans
Sonnet proposes 2,000 lines of changes
Composer proposes 1,000 lines
GPT-4o proposes 200 lines

You read that right. Same problem, wildly different approaches. And here’s the kicker - Composer actually gets CSS right. No more purple-blue AI vomit.

The new workflow is basically Devin but actually usable:

Describe your problem
Get multiple parallel plans (with explanations!)
Cherry-pick the best parts from each
Execute with confidence
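If you want to think about the compare-and-cherry-pick step as data, here's a small sketch. It is not how Cursor represents plans internally. `Plan`, `smallest_first`, and `cherry_pick` are hypothetical, the line counts are the ones quoted above, and the plan steps are made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Plan:
    model: str
    lines_changed: int
    steps: List[str]


# Line counts come from the session; the steps are invented examples.
PLANS = [
    Plan("Sonnet", 2000, ["Extract shared styles", "Add regression tests"]),
    Plan("Composer", 1000, ["Patch the failing selector", "Extract shared styles"]),
    Plan("GPT-4o", 200, ["Patch the failing selector"]),
]


def smallest_first(plans: List[Plan]) -> List[Plan]:
    """Order plans by size of the proposed change, smallest diff first."""
    return sorted(plans, key=lambda p: p.lines_changed)


def cherry_pick(plans: List[Plan], picks: Dict[str, List[int]]) -> List[str]:
    """Merge the chosen steps (by index) from each model's plan, skipping duplicates."""
    by_model = {p.model: p for p in plans}
    merged: List[str] = []
    for model, indexes in picks.items():
        for i in indexes:
            step = by_model[model].steps[i]
            if step not in merged:
                merged.append(step)
    return merged
```

For example, `cherry_pick(PLANS, {"GPT-4o": [0], "Sonnet": [1]})` keeps the small targeted patch from one plan and the regression tests from another. Smallest-first is only one possible ordering; a tiny diff is not automatically the best one.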

David Proctor nailed it in the session: “Instead of typing something twice or copy-pasting, you’re querying multiple experts and picking the best answer.”

The Feature Nobody’s Talking About: Git worktrees! Finally, you can explore different solutions simultaneously, each in its own checkout, without destroying your main branch. Vibe coding without consequences.
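For anyone who hasn't used them, this is roughly what the underlying Git feature looks like when you drive it yourself. The sketch below builds one `git worktree add -b <branch> <path> <base>` command per experiment, which gives each branch its own directory next to the repo. It shows plain Git, not Cursor's internals. The helper names, the directory naming scheme, and the `main` base branch are assumptions.

```python
import subprocess
from pathlib import Path
from typing import List


def worktree_command(repo: Path, branch: str, base: str = "main") -> List[str]:
    """Build the git command that creates a new branch in its own checkout."""
    # Assumed layout: sibling directory named <repo>-<branch>.
    path = repo.parent / f"{repo.name}-{branch}"
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(path), base]


def create_worktrees(
    repo: Path, branches: List[str], base: str = "main", dry_run: bool = True
) -> List[List[str]]:
    """Return the commands for every branch; run them only if dry_run is False."""
    commands = [worktree_command(repo, b, base) for b in branches]
    if not dry_run:
        for cmd in commands:
            subprocess.run(cmd, check=True)  # raises if git reports an error
    return commands
```

The dry-run default lets you inspect the commands before anything touches disk. When an experiment is done, `git worktree remove <path>` cleans up its checkout.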

My take? Cursor went from “meh” to “holy shit” overnight. This isn’t an incremental update - it’s a completely different tool.

Part 3: Leonardo’s Kimi Slides (Better Than McKinsey)

I’m not exaggerating - Leonardo showed us slides that are better than what I saw at my consultancy. And they’re generated from raw technical content in 2 minutes.

The Kimi K2 Process:

Dump your article/notes/technical documentation into kimi.ai
It creates an editable outline (you can refine it)
Choose from ~20 templates (biggest limitation, but they’re solid)
Get a full PowerPoint you can actually edit

Leonardo demonstrated by turning his benchmarking article into a 32-slide presentation that was somehow MORE comprehensive than the original text. The slides included:

Proper section dividers with numbering
Visual hierarchy that actually makes sense
Technical concepts explained visually
Zero hallucinated content

The Killer Features:

Free tier available (slower generation)
Paid tier ($20/month) for turbo mode (2 minutes vs 10)
Download as PowerPoint, upload to Google Slides, collaborate like normal
“Adaptive mode” for smart layouts vs “Preset mode” for detailed text

Leonardo’s workflow:

Generate in Kimi (2 minutes on paid plan, 10 on free)
Quick edits in their web interface
Export to PowerPoint
Polish in Google Slides

Breaking: During our session, Kimi released their new O3 model (literally in the last hour). They’re moving fast.

The Benchmark Gaming Story: The presentation included this gem - O1 models were caught stealing answers from evaluator memory and faking execution speed. Instead of solving problems, they hacked the grading system. If that’s not a metaphor for our industry, I don’t know what is.

The Real Takeaway

We’re past the “wow, AI can write code” phase. These tools represent the next evolution: production-ready AI that respects existing workflows.

Bedrock Agents: For when Claude Desktop needs to grow up
Cursor Composer: For when you need multiple perspectives fast
Kimi K2: For when your thoughts need to become slides

Each tool solves the “last mile” problem in its domain. They’re not trying to revolutionize everything - they’re making AI actually shippable.

Next week we’re diving into agent architectures with Elizabeth Fuentes. If you want to learn how to move from toy demos to production systems, you won’t want to miss it.

P.S. - The fact that all three tools respect existing workflows (MCP servers, Git, PowerPoint) isn’t coincidence. The future isn’t replacing everything - it’s enhancing what works.

This article covers the November 6, 2025 AI Center of Excellence Workshop. Submit questions for next week’s interview with AWS Developer Advocate Elizabeth Fuentes.
