3 Comments
Greg Foote

A lot of interesting points, and overall thought-provoking. I appreciate the points about leveraging the data to track outcomes over activity. The productivity metric for AI-first engineers comes across as an activity metric, "issues," compared to an outcome metric such as ARR/engineer or some other business outcome that the person/team is aligned to. How might we take this a step further to measure an individual's use of AI and its impact on a key financial metric that drives the business forward?

Stanislav Huseletov

Thank you for the thoughtful question! You're absolutely right that "issues shipped" is still an activity metric rather than a true business outcome.

While this article focuses on using AI to detect behavioral patterns (not measuring AI usage itself), the detection framework it describes could actually solve your challenge.

Here's a concrete plan:

Step 1: Define Your Financial North Star

-- Pick ONE metric: ARR per engineer, feature revenue attribution, or customer acquisition cost reduction

-- Set clear attribution rules (e.g., which features drove which revenue); a minimal sketch of such rules follows below
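
To make that concrete, here is a minimal Python sketch of attribution rules. Every feature name, revenue stream, and credit weight below is hypothetical; nothing here comes from the article or from WorkSmart itself.

# Hypothetical attribution rules: map each shipped feature to the revenue
# stream it is credited with, and what share of that stream it earns.
ATTRIBUTION_RULES = {
    "checkout-redesign": {"stream": "new-subscriptions", "credit": 0.6},
    "bulk-export-api": {"stream": "enterprise-upsell", "credit": 1.0},
}

def attributed_revenue(feature, stream_revenue):
    """Revenue credited to a feature under the rules above."""
    rule = ATTRIBUTION_RULES.get(feature)
    if rule is None:
        return 0.0  # unmapped features earn no credit
    return rule["credit"] * stream_revenue.get(rule["stream"], 0.0)

# Example: attributed_revenue("checkout-redesign", {"new-subscriptions": 50_000.0}) -> 30000.0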

Step 2: Extend WorkSmart's Detection Pipeline

The article shows WorkSmart already tracks:

-- AI tool usage (27% of employees use the tools less, which in engineering correlates with lower output)

-- Activity patterns (emails, meetings, code)

Add a new detection layer (a record sketch follows this list) for:

-- Which code changes reach production

-- Which features generate revenue

-- Time from AI-assisted coding to customer value
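
As a sketch of what such a layer could record (all field names are assumptions on my part, since WorkSmart's actual schema isn't described in the article):

from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class ShippedChange:
    # One production-bound code change, joined with the AI-usage tracking
    # the article says WorkSmart already does. Schema is illustrative.
    engineer_id: str
    commit_sha: str
    ai_assisted: bool                # flag from the existing AI-tool-usage layer
    merged_at: datetime
    deployed_at: Optional[datetime]  # None until the change reaches production
    feature: Optional[str]           # feature the change belongs to, if known

    @property
    def days_to_production(self) -> Optional[float]:
        # Time from AI-assisted coding to customer value, in days.
        if self.deployed_at is None:
            return None
        return (self.deployed_at - self.merged_at).total_seconds() / 86400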

Step 3: Build the Correlation Model

Using the same multi-modal approach from healthcare (tracking voice, posture, pauses):

-- Input: AI usage patterns + code commits + deployment data

-- Output: Revenue impact per engineer/team

-- Timeline: Track over 3-6 month periods for meaningful correlation (a correlation sketch follows this list)
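
A minimal correlation sketch, assuming you've built a per-engineer table from the layers above (the column names are hypothetical):

import pandas as pd
from scipy.stats import pearsonr

def revenue_correlation(df: pd.DataFrame):
    # df: one row per engineer over a 3-6 month window, with columns
    # "ai_usage_share" (0-1) and "attributed_revenue" (from Step 1).
    r, p_value = pearsonr(df["ai_usage_share"], df["attributed_revenue"])
    return r, p_value  # correlation strength and its significance

This is deliberately the simplest possible model; in practice you'd want to control for tenure, team, and product area before trusting the correlation.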

Step 4: Validate Like the Article Suggests

-- Start with one volunteer team (following the 0-6 month pilot approach)

-- A/B test: Compare AI-first engineers' revenue impact against AI-moderate engineers'

-- Use the same F1 scoring methodology to ensure accuracy (see the sketch below)
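
Here's a rough sketch of that validation step, with hypothetical inputs (cohort revenue lists and high-impact labels you'd derive from Steps 1-3):

from scipy.stats import ttest_ind
from sklearn.metrics import f1_score

def compare_cohorts(ai_first_revenue, ai_moderate_revenue):
    # Welch's t-test on attributed revenue between the two cohorts.
    t_stat, p_value = ttest_ind(ai_first_revenue, ai_moderate_revenue, equal_var=False)
    return t_stat, p_value

def detection_f1(actual_high_impact, predicted_high_impact):
    # Same F1 methodology the article applies, reused here to score how well
    # the pipeline's "high revenue impact" predictions match what actually shipped value.
    return f1_score(actual_high_impact, predicted_high_impact)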

The beauty is you're not building from scratch—you're adding revenue tracking to an already-proven behavioral detection system.

Would this approach work for the metrics you had in mind?

Greg Foote

This is great, and exactly the type of thinking I see a lot of organizations fail to follow through on. Many stop at the KPI (issues, pull requests, decrease in bugs, etc.) instead of tracking all the way to value generation. I appreciate the additional context and specifics!
