Final Result Preview
How an AI-powered side panel helped eliminate hours of manual data verification for commercial insurance brokers
Role: Product Design Manager (senior IC design leadership, executing product design direction)
Ownership: Full end-to-end ownership of discovery, definition, and design. I operated as the strategic design lead across research, synthesis, experimentation, and delivery handoff.
Team: Product Owner, engineering partners, a Product Analytics team, and Adelie Risk Advisors (an in-house agency of 4 brokers who served as the primary usability testing cohort), alongside stakeholders who aligned on direction and experiment outcomes.
Timeline: 6 weeks - Q2 2026
Problem & Context
Problem Statement
The Bold Penguin Terminal is a cloud-based commercial insurance platform that lets agents, brokers, and agencies quickly triage risks, then quote, compare, and bind small business policies from multiple carriers through a single, streamlined application interface.
Commercial insurance brokers using Bold Penguin’s Terminal struggle to complete complex applications because verifying data in the Multi‑location and Buildings sections still depends on manual back‑and‑forth with clients. As a result, high‑value submissions stall, brokers lose time chasing details, and many abandon the workflow to finish applications directly in carrier portals instead of staying in Terminal.
I scoped and authored the problem statement by synthesizing user feedback and funnel data to clarify the highest-friction moments (Multi-location + Buildings verification) and the business impact (abandonment to carrier portals).
Opportunity Solution Tree
The Opportunity Solution Tree connected brokers’ pain in Terminal to a focused solution by clarifying that the desired outcome, faster and more trustworthy completion of complex applications, was blocked by five root opportunities: inaccurate answers, missing source data, cognitive overload, low confidence in Terminal for complex work, and an unsafe way to pause without losing progress. Based on this map, the team chose App Form Suggestions, an AI-powered side panel that provides suggested answers with confidence tiers and transparent source references, keeping brokers in control while reducing verification effort. The approach was guided by assumptions about reducing abandonment through deferral states, driving habitual use, cutting completion time via one-click acceptance, increasing repeat usage through visible time savings, and shipping safely on existing infrastructure without performance or architectural risk. Together, this created a clear, prioritized set of validations to run before and after launch.
I built the opportunity solution tree to connect the core bottleneck to opportunities and testable assumptions the team could align on.
The Solution Chosen to Pursue
We aligned cross-functionally to pursue App Form Suggestions as the best path to reduce verification effort while keeping broker judgment in control.
Success Metrics
I partnered with Product Analytics to define the instrumentation and the success-metric set we would use to evaluate adoption, speed, and trust.
Core solution

The current state of Terminal lets you create a new application and jump straight into a manual flow for each question.
Before solution View

The new Terminal flow lets you choose several ways to start a new application, including the AI form suggestion option.
After solution prototype
Is it valuable to the customer?
Is it usable for the end-user?
Is it feasible for engineering to build?
Is it viable for the business?
In-session monitoring of 12 broker sessions showed that average completion time for complex applications fell from 9 min 23 sec to 6 min 45 sec, a 28% reduction in time on task.
Across the same 12 broker sessions, 192 of 264 application questions were answered with a “Strong Match” selection, a 73% Strong Match AI confidence rate.
The biggest hurdle was estimating the AI token cost for a typical full application submission. We approximated the cost at ~$0.57 per full submission. Confidence is established through cascading resolution (API > RAG > defaults), with adversarial validation by a Critique Agent.
Pre-release results already show movement toward the 7:30 target. The ongoing pilot measures real-world adoption, and LangSmith observability enables cost-per-completion monitoring.
To verify the risks for this feature, I initiated in-session monitoring across 12 broker sessions to capture behavioral evidence of time reduction. I led analysis of AI suggestion confidence patterns across 264 application questions to validate broker comprehension and interaction quality. I partnered with engineering to model AI token costs and align on the architectural assumptions needed to ship sustainably. I worked with Product Analytics to define success criteria and set up the observability framework used to monitor adoption and cost in production.
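The cascading resolution order mentioned above (API > RAG > defaults) can be illustrated with a minimal Python sketch. This is not the production implementation: the resolver functions, field names, and sample values are hypothetical stand-ins, and the Critique Agent validation step is left out. It only shows the general idea of trying sources in priority order and recording which one produced the suggestion, so the side panel could reference the source.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass
class Suggestion:
    value: str
    source: str  # name of the tier that produced the answer


# A resolver returns an answer string, or None when it has nothing to offer.
Resolver = Callable[[str], Optional[str]]


def resolve(question: str, tiers: Sequence[Tuple[str, Resolver]]) -> Optional[Suggestion]:
    """Try each tier in priority order and return the first answer found."""
    for source, resolver in tiers:
        value = resolver(question)
        if value is not None:
            return Suggestion(value=value, source=source)
    return None  # no tier could answer; the broker fills the field manually


# Hypothetical stand-ins for the three tiers (illustrative data only).
def from_api(question: str) -> Optional[str]:
    return {"year_built": "1998"}.get(question)


def from_rag(question: str) -> Optional[str]:
    return {"roof_type": "Metal"}.get(question)


def from_defaults(question: str) -> Optional[str]:
    return {"sprinklered": "No"}.get(question)


TIERS = [("api", from_api), ("rag", from_rag), ("defaults", from_defaults)]

if __name__ == "__main__":
    for q in ("year_built", "roof_type", "sprinklered", "annual_revenue"):
        print(q, "->", resolve(q, TIERS))
```

A question that no tier can answer returns None, which matches the design intent of keeping the broker in control instead of forcing a suggestion.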
Evidence of impact

In the time-on-task comparison bar chart, the pilot result of 6 min 45 sec not only beat the 9 min 23 sec baseline but also surpassed the 7 min 30 sec success target by 45 seconds, representing a 28% reduction in time on task.
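As a quick check on the headline figures, the short Python sketch below recomputes the time-on-task reduction, the margin against the target, and the Strong Match rate from the raw numbers reported in this case study. The variable and function names are illustrative only.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a minutes/seconds duration to total seconds."""
    return minutes * 60 + seconds


baseline = to_seconds(9, 23)  # average completion time before the feature
pilot = to_seconds(6, 45)     # average completion time in the monitored sessions
target = to_seconds(7, 30)    # success target

# Relative reduction in time on task, as a percentage of the baseline.
reduction_pct = (baseline - pilot) / baseline * 100

# How far under the target the pilot result landed, in seconds.
margin_vs_target = target - pilot

# Share of application questions answered with a "Strong Match" selection.
strong_match_pct = 192 / 264 * 100

print(f"Time-on-task reduction: {reduction_pct:.0f}%")  # 28%
print(f"Under target by: {margin_vs_target} s")         # 45 s
print(f"Strong Match rate: {strong_match_pct:.0f}%")    # 73%
```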
Constraints and tradeoffs
The core constraints and trade-offs on this project centered on balancing speed, scope, and broker accountability. The pre-release study was limited to four brokers from a single internal cohort (Adelie Risk Advisors), which provided strong directional evidence but left open questions about performance across independent and enterprise brokers with different NAICS codes and risk profiles. The decision to exclude AI suggestions from Carrier questions sections and to block auto-apply on ratable pricing fields (revenue, payroll) intentionally traded maximum automation for broker trust and E&O accountability. The interaction pattern also had to balance field-level precision with cognitive load, which led to rejecting per-question and page-level designs in favor of a side panel that preserved the existing form mental model without disrupting broker workflow. Finally, token cost containment within a ~$0.232/submission budget required building on existing LangGraph infrastructure rather than pursuing a new architecture, maintaining feasibility while setting boundaries on how far AI capability could be extended at launch.
Status and next steps
With App Form Suggestions now shipped into the pilot environment, the focus moves to ensuring the feature reaches its full potential in the hands of real brokers at scale.
To drive that, the next steps center on three priorities. First, launch a lightweight product marketing push (a short email campaign and an in-app onboarding moment) so brokers arrive with the right mental model and the 20% adoption target is reachable from day one rather than left to chance. Second, track all three success metrics in production over a 30–60 day window to confirm that the 28% time-on-task reduction, ≥20% adoption rate, and ≥40% retention rate hold beyond the Adelie cohort and across a broader range of broker types and submission profiles. Third, use production telemetry to close the 27% non-Strong-Match gap by identifying which field types and NAICS codes generate weaker confidence scores, then improve the MQS and CQS agent resolution logic to push the 73% Strong Match rate higher over successive iterations.