Financial Services Task Inventory

A comprehensive inventory of 4,075 financial services tasks mapped to a 4-level taxonomy across 125 O*NET occupations — providing the foundation for understanding what work gets done, what skills it requires, and how roles are constructed from tasks.

4,075 Tasks Inventoried · 15 Business Functions · 125 O*NET Occupations · 486 Activity Categories
Tasks, Skills & Roles

Understanding the building blocks of work — and why tasks are the right unit of analysis.

Most workforce discussions start with roles — job titles, org charts, headcount. But roles are composites. To understand work at its most fundamental level, you need to decompose roles into their constituent parts: the tasks people perform and the skills those tasks require.

The Three Building Blocks

Task

A discrete unit of work with a clear verb-object structure: "Reconcile daily trade settlements," "Conduct annual credit reviews." Tasks are observable, measurable, and assignable. They are the atomic level of work.

This inventory captures 4,075 tasks at the L4 level — the most granular layer of the taxonomy.

Skill

A learned capability required to perform a task. Skills can be technical (financial modeling, SQL, KYC/AML), cognitive (analytical reasoning, judgment under ambiguity), or interpersonal (client advisory, negotiation). A single task typically requires 2–5 skills.

Each task in this inventory lists its required skills, enabling skill-based workforce analysis.

Role

A bundle of tasks assigned to one person or job title. Roles exist because organizations need to group tasks into manageable work packages. But role boundaries are often inherited from legacy structures rather than designed from first principles.

The inventory maps tasks to primary roles and O*NET occupational codes for cross-referencing.

How They Relate

Tasks → Skills: Every task demands specific skills to execute. By inventorying tasks, you implicitly map the skills landscape of your organization. Skill gaps become visible when you can see exactly which tasks require capabilities your workforce hasn't developed.

Tasks → Roles: A role is defined by which tasks it contains. Two roles with different titles but overlapping tasks may be candidates for consolidation. A role whose tasks span wildly different skill profiles may be a candidate for splitting.

Skills → Roles: Shared skills create natural job families — groups of roles with overlapping competency requirements. Career mobility is highest within a job family because the skill transfer cost is lowest. This is the foundation of career pathing.
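A minimal sketch of these relations in code, using invented task and role names (nothing here is drawn from the inventory itself):

```python
# Illustrative data model: tasks carry skills; roles are bundles of tasks.
tasks = {
    "T1": {"name": "Reconcile daily trade settlements", "skills": {"Reconciliation", "SQL"}},
    "T2": {"name": "Conduct annual credit reviews", "skills": {"Credit Analysis", "Judgment"}},
    "T3": {"name": "Draft credit review memos", "skills": {"Credit Analysis", "Writing"}},
}
roles = {
    "Settlements Analyst": {"T1"},
    "Credit Analyst": {"T2", "T3"},
}

# Tasks -> Skills: a role's skill footprint is the union over its tasks
def skill_footprint(role):
    return set().union(*(tasks[t]["skills"] for t in roles[role]))

# Skills -> Roles: shared skills define job-family proximity (Jaccard overlap)
def skill_overlap(r1, r2):
    a, b = skill_footprint(r1), skill_footprint(r2)
    return len(a & b) / len(a | b)

print(skill_footprint("Credit Analyst"))   # {'Credit Analysis', 'Judgment', 'Writing'}
print(skill_overlap("Settlements Analyst", "Credit Analyst"))  # 0.0 -> different families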

Why Start with Tasks?

Tasks Are Stable

Job titles change with reorganizations, mergers, and market trends. But the underlying work — reconciling ledgers, assessing credit risk, advising clients — persists. Tasks are the durable unit.

Tasks Are Comparable

A "Relationship Manager" at one bank may do very different work than a "Relationship Manager" at another. But a task like "Conduct annual credit reviews and covenant compliance checks" means the same thing everywhere.

Tasks Are Measurable

Each task can be independently assessed for complexity (Bloom's taxonomy), frequency, regulatory burden, cross-functional scope, and — critically — its exposure to technological change including AI.

Tasks Enable Redesign

When you understand work at the task level, you can reassemble it: combine tasks into new roles, identify which tasks can be automated or augmented, and design job hierarchies grounded in what people actually do rather than inherited structures.

The 4-Level Taxonomy

This inventory organizes financial services work into a structured hierarchy that moves from the broadest organizational level down to individual tasks:

  • L1 Function: 15 business functions (e.g., Retail Banking, Risk Management)
  • L2 Process: 164 process groups (e.g., Consumer Lending, Branch Sales)
  • L3 Activity: 486 activity clusters (e.g., Mortgage Origination, Credit Underwriting)
  • L4 Task: 4,075 discrete tasks (e.g., Originate Residential Mortgage Applications)

Navigating this tool: The Dashboard gives you a statistical overview. The Explorer lets you filter, search, and drill into individual tasks. The advisory sections — Role Mapping, SWP, and Job Hierarchy Redesign — show how to apply task-level data to organizational decisions.
Inventory at a Glance

Key metrics and distributions across the financial services task inventory.

4,075 Total Tasks · 125 O*NET Occupations · Median Bloom's Level: 3 · Regulatory-Driven: 31.1% · Cross-Functional: 23.6% · Eloundou β (LLM Exposure): 0.888

Dashboard charts: Cognitive Complexity (Bloom's) · Tasks by Function & Disposition · AI Exposure Class (E0/E1/E2) · Agentic AI Potential · Defense Line Composition
Key Insight: The inventory spans the full spectrum of cognitive complexity, from Bloom's level 1 (recall and data entry) to level 6 (strategic creation and design). 31.1% of tasks are regulatory-driven, reflecting the heavily governed nature of financial services. The Eloundou β of 0.888 indicates that financial services has structurally high LLM exposure: 81.2% of tasks are classified E1 (direct LLM exposure), 15.3% E2 (LLM + tools), and only 3.5% E0 (no LLM exposure).
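A quick arithmetic check ties the headline β to the reported distribution (the published shares are themselves rounded, so the product reproduces 0.888 up to rounding):

```python
# beta = share_E1 + 0.5 * share_E2 (E0 contributes 0), per the formula defined below
share_e0, share_e1, share_e2 = 0.035, 0.812, 0.153
beta = share_e1 + 0.5 * share_e2
print(f"beta = {beta:.4f}")  # 0.8885, consistent with the reported 0.888
```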
Methodology & Data Sources

A transparent accounting of how this inventory was constructed, enriched, and validated.

1

Taxonomy Construction

Built a 4-level hierarchy: 15 L1 business functions, 164 L2 processes, 486 L3 activities, and 4,075 L4 tasks. Anchored to a purpose-built O*NET FinServ database (125 financial services occupations, 2,530 task statements, 80,000+ enrichment records) and validated against Canadian Big 5 bank operations with 99.3% task match rate.

2

Task Enrichment

Each task is characterized across multiple dimensions: cognitive complexity (Bloom's taxonomy 1–6), business importance (1–5), frequency, regulatory classification, cross-functional scope, defense line, required skills, and primary roles. These attributes support diverse analytical use cases — from skills gap analysis to organizational design.

3

AI Exposure Assessment

Each task is classified using the categorical E0/E1/E2 rubric from Eloundou et al. (2023, “GPTs are GPTs,” Science): E0 = no LLM exposure (physical, embodied, in-person tasks), E1 = direct LLM exposure (writing, code, summarization — LLM alone reduces task time by 50%+), E2 = LLM + tools exposure (data analysis, system integration — requires additional software beyond the LLM). No continuous 0–100 score is used; the classification is categorical per the published rubric. The Eloundou β measure (β = [E1 + 0.5×E2] / total tasks) provides a single aggregation metric at the role or occupation level. Published AIOE z-scores (Felten, Raj & Seamans, 2021, Strategic Management Journal) are available at the occupation level for cross-validation.

AI Exposure Classification
| Component | Source | Method | Output |
|---|---|---|---|
| Task Classification | Eloundou et al. (2023) “GPTs are GPTs” (Science) | Each task is independently classified as E0 (no LLM exposure: physical, embodied, in-person), E1 (direct LLM exposure: writing, code, summarization, Q&A — 50%+ time reduction), or E2 (LLM + tools: data analysis, system integration, retrieval) | Categorical label: E0, E1, or E2 |
| Aggregation | Eloundou et al. (2023) β measure | β = [E1 + 0.5 × E2] / total tasks. Ranges from 0 (no exposure) to 1 (all tasks directly LLM-exposed). Computable at role, function, or occupation level. | Current inventory β = 0.888 |
| Cross-Validation | Felten, Raj & Seamans (2021) AIOE Index (Strategic Management Journal) | Published occupation-level z-scores for 85 matched financial services occupations. AIOE maps 10 AI capabilities to 52 O*NET abilities. | Available for occupation-level benchmarking (not used for task-level scoring) |
Why categorical? The Eloundou rubric is inherently categorical — it classifies whether an LLM can meaningfully reduce task completion time, not by how much. Published research does not provide a validated methodology for converting E0/E1/E2 to a continuous 0–100 percentage. Using the categorical classification as published ensures methodological integrity.
Why E1 dominates: Financial services is overwhelmingly cognitive work. Felten et al. found that the abilities most exposed to AI — Information Ordering (1.91), Memorization (1.69), Deductive Reasoning (1.04) — are core to nearly every FS role, while the least exposed abilities involve physical dexterity and strength, which FS rarely requires. Current distribution: E0 = 3.5%, E2 = 15.3%, E1 = 81.2%. This means FS as a sector has structurally higher AI exposure than the economy-wide average.
Empirical Validation: Anthropic Economic Index

Where available, tasks are cross-referenced against the Anthropic Economic Index (January 2026 release), which reports empirical task-level success rates from 2 million real Claude.ai conversations (November 2025 data). This provides a real-world complement to the theoretical Eloundou classification.

| Metric | Value | Notes |
|---|---|---|
| Tasks matched | 1,709 of 4,075 (41.9%) | Fuzzy-matched by task description against 2,506 O*NET tasks in AEI dataset |
| Mean success rate | 66.5% | Percentage of conversations where Claude successfully completed the task |
| Data source | HuggingFace: Anthropic/EconomicIndex | CC-BY license. Methodology in two arXiv papers. |
Source caveat: Anthropic is a commercial AI company. Success rates reflect Claude model capabilities specifically, not AI in general. The data is self-selecting (users choose tasks they expect Claude to handle), which inflates aggregate success rates. However, it is the only published dataset providing empirical task-level success metrics from real-world AI usage at scale.
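The exact matching pipeline behind the 41.9% figure is not reproduced here; a minimal stdlib sketch of description-level fuzzy matching might look like the following (the 0.85 cutoff is an assumption, and the TF-IDF approach in Runbook Step 2 is the more scalable variant):

```python
import difflib

def best_aei_match(task_desc, aei_descs, cutoff=0.85):
    """Return (best_matching_AEI_description, similarity), or (None, score) below cutoff.

    difflib ratio is a simple character-level similarity; a production matcher
    would more likely use TF-IDF or embeddings over the full description corpus.
    """
    best, best_score = None, 0.0
    for cand in aei_descs:
        score = difflib.SequenceMatcher(None, task_desc.lower(), cand.lower()).ratio()
        if score > best_score:
            best, best_score = cand, score
    return (best, best_score) if best_score >= cutoff else (None, best_score)
```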
Forward-Looking: Agentic AI Potential

The Eloundou (2023) rubric was calibrated to GPT-4–era chatbot capabilities. Since then, agentic AI — systems that autonomously execute multi-step workflows using tools, code execution, web browsing, and API orchestration — has substantially expanded what AI can do. No peer-reviewed framework yet measures this empirically at the task level (as of March 2026). As a forward-looking indicator, each task is flagged for Agentic Potential: the degree to which agentic AI capabilities would further increase AI’s utility beyond what a basic LLM chatbot provides.

| Level | Count | Definition | Example Patterns |
|---|---|---|---|
| High | 287 (7.0%) | Multi-step workflows, data pipelines, system integration, automated monitoring, code operations that agentic AI can orchestrate end-to-end | ETL pipelines, automated screening, regression testing, API orchestration, batch document processing |
| Medium | 624 (15.3%) | Data consolidation, report generation, cross-referencing that benefits from AI tool chains but may need human oversight | Reconciliation, report compilation, data extraction, cross-system analysis |
| Low | 3,164 (77.6%) | Physical/embodied tasks (E0) or primarily conversational/advisory tasks where agentic capabilities add little beyond a basic LLM | Client meetings, physical inspections, advisory conversations, manual approvals |
Methodological transparency: This flag is a constructed forward-looking indicator, not a research-validated metric. It is based on rule-based keyword/pattern matching against known agentic capability domains. It should be treated as directional, not definitive. We include it because the alternative — ignoring agentic AI entirely — would make the inventory materially incomplete as a decision tool.
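A minimal sketch of such a rule-based flag, with a deliberately small, illustrative pattern list (the production rule set is broader than what is shown here):

```python
import re

# Illustrative subset of agentic capability patterns; not the full rule set.
AGENTIC_HIGH = [r'(pipeline|\betl\b|orchestrat|\bapi\b|batch|regression test|automated screening)']
AGENTIC_MED = [r'(reconcil|report (generation|compilation)|data extraction|cross-system)']

def agentic_potential(task_text, e_class):
    text = task_text.lower()
    if e_class == 'E0':
        return 'Low'  # physical/embodied tasks gain little from agentic tooling
    if any(re.search(p, text) for p in AGENTIC_HIGH):
        return 'High'
    if any(re.search(p, text) for p in AGENTIC_MED):
        return 'Medium'
    return 'Low'

print(agentic_potential("Build ETL pipeline for regulatory reporting", "E2"))  # High
```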
Data Sources
| Source | Description | Contribution |
|---|---|---|
| O*NET FinServ | Purpose-built database: 125 occupations, 2,530 tasks, 37 tables, 80,000+ records including skills, knowledge, abilities, technology, certifications, and DWA/IWA activity hierarchy | Core task statements, skills/knowledge profiles, technology stacks, standardized activity mappings |
| Bank-Specific | Canadian banking domain expertise and Big 5 bank validation (50 real LinkedIn postings) | Institution-unique processes, Canadian regulatory context |
| Regulatory | FINTRAC, OSFI, Basel III, IFRS 9, TCFD frameworks | Compliance obligations |
| Certification | FINRA Series 7, CFA, FRM, CISSP outlines | Professional knowledge standards |
| AI-Era | Emerging tasks from AI/ML adoption in banking | MLOps, responsible AI, bias testing |
| Anthropic Economic Index | January 2026 release: 2,506 O*NET tasks with empirical success rates from 2M Claude.ai conversations (CC-BY, HuggingFace) | Real-world task-level success rates; fuzzy-matched to 1,709 inventory tasks |
Task Schema: 20 Fields

| Field | Type | Description |
|---|---|---|
| task_id | String | Unique ID encoding taxonomy path (e.g., RB.DEP.ACT.001) |
| task_name | String | Verb-object task name |
| task_description | String | Full description with regulatory/business context |
| L1_function | String | Business function (1 of 15) |
| L2_process | String | Process group within L1 |
| L3_activity | String | Activity cluster within L2 |
| onet_soc_codes | Array | O*NET Standard Occupational Classification codes |
| primary_roles | Array | Job titles that typically perform this task |
| importance | 1–5 | Business criticality rating |
| frequency | String | How often the task is performed |
| cognitive_complexity | 1–6 | Bloom's taxonomy level |
| regulatory_driven | Boolean | Whether driven by regulatory requirement |
| cross_functional | Boolean | Whether spans multiple functions |
| ai_exposure_class | E0/E1/E2 | Eloundou (2023) classification: E0 (no LLM exposure), E1 (direct LLM — 50%+ time reduction), E2 (LLM + tools — requires additional software) |
| agentic_potential | High/Med/Low | Forward-looking agentic AI potential: multi-step workflows, tool orchestration, autonomous operations |
| aei_success_rate | 0–100% | Anthropic Economic Index empirical success rate (where matched; null if no match) |
| ai_disposition | String | Automate, Augment, Restructure, No_Change |
| skills_required | Array | Key skills needed |
| defense_line | String | Risk governance (1st, 2nd, 3rd, NA) |
| source | String | Data provenance category |
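Since task_id encodes the taxonomy path, a small parser can recover the hierarchy from the ID alone. The segment semantics below are assumed from the schema example (RB.DEP.ACT.001):

```python
# Parse a task_id into its taxonomy components (four-segment pattern assumed).
def parse_task_id(task_id):
    l1, l2, l3, seq = task_id.split('.')
    return {"L1_code": l1, "L2_code": l2, "L3_code": l3, "sequence": int(seq)}

print(parse_task_id("RB.DEP.ACT.001"))
# {'L1_code': 'RB', 'L2_code': 'DEP', 'L3_code': 'ACT', 'sequence': 1}
```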
Hay Method Integration Framework

This inventory is designed to inform Korn Ferry Hay Method job evaluations for hierarchy redesign. The framework operates at two layers: task-derived metrics (from this inventory) and job-level organizational context (from your HRIS/org structure). Both are required for accurate Hay scoring — task composition alone cannot distinguish between an analyst and a VP performing similar analytical work.

Why two layers? The Hay Method evaluates the job, not just its tasks. Two roles can share identical task compositions but score very differently because of organizational context: a VP “developing strategic plans” operates with broader scope, higher decision authority, and greater accountability than an analyst doing the same task. The job-level context layer captures this “organizational amplifier” that task attributes alone cannot express.

Layer 1: Task-Derived Metrics (from this inventory)

| Hay Factor | Subfactor | Task Attribute Mapping | Formula / Approach |
|---|---|---|---|
| Know-How (knowledge & skill for competent performance) | Technical Depth | cognitive_complexity × skill breadth across role | Avg Bloom’s level × count of unique skills_required per role |
| | Managerial Breadth | Proportion of management/planning/directing tasks | Count of supervisory/strategic tasks ÷ total role tasks |
| | Human Relations | Proportion of interpersonal/advisory/coaching tasks | Count of client-facing or mentoring tasks ÷ total role tasks |
| Problem Solving (thinking required, as % of Know-How) | Thinking Environment | Derived from ai_exposure_class | E0 tasks = most unstructured, novel problems; E1 = structured enough for direct LLM assistance; E2 = amenable to AI tool pipelines. Higher proportion of E0 = more human judgment required. |
| | Thinking Challenge | cognitive_complexity (Bloom’s) | Level 5–6 (Evaluate/Create) = high challenge; Level 1–2 = low |
| Accountability (accountability for actions & consequences) | Freedom to Act | Inverse of regulatory_driven density | Roles with high regulatory load = more constrained freedom |
| | Scope / Magnitude | importance × frequency | Weighted average across role tasks, scaled by L1 function materiality |
| | Impact | defense_line + direct-impact proportion | 1st-line direct operations > 2nd-line oversight > 3rd-line assurance |

Layer 2: Job-Level Organizational Context (from your HRIS / org structure)

| Context Variable | Source | Hay Factor Impact | How It Modulates |
|---|---|---|---|
| Job Grade / Band | HRIS compensation data | All three factors | Serves as a validation anchor, not a direct input. Compare computed Hay scores against current grades to identify over/under-graded roles. Large gaps (>2 grades) flag misalignment. |
| Span of Control | Org chart (direct + indirect reports) | Know-How (Mgmt Breadth), Accountability (Scope) | Multiplier: 0 reports = IC baseline; 1–5 = team lead (+15%); 6–20 = manager (+30%); 20+ = senior leader (+50%). Applied to managerial breadth and scope/magnitude subfactors. |
| Decision Authority Level | Delegation of Authority matrix, approval limits | Accountability (Freedom to Act) | Maps approval thresholds to Hay freedom-to-act scale: prescribed (<$10K) → controlled ($10K–$1M) → standardized ($1M–$50M) → broadly defined ($50M+) → strategic direction (enterprise) |
| Budget / Revenue Responsibility | Financial planning data, P&L ownership | Accountability (Scope / Magnitude) | Hay uses geometric progression: each level ~15% larger. Map to Hay magnitude scale using ln(budget) normalization. Cost-center roles score lower than revenue-generating roles at equivalent dollar levels. |
| Reporting Level | Org chart (levels from CEO) | Problem Solving (Thinking Environment) | Fewer levels from CEO = less structured thinking environment, more strategic ambiguity. Maps to Hay thinking environment scale: semi-routine (6+) → patterned (4–5) → variable (3) → broadly defined (2) → abstractly defined (1) |
| Cross-Functional Accountability | Committee memberships, dotted-line reports, project governance roles | Know-How (Mgmt Breadth), Problem Solving | Roles accountable across multiple L1 functions score higher on managerial breadth. Count of L1 functions in scope: 1 = activity (+0), 2–3 = diverse (+15%), 4+ = broad (+30%) |
How to use for job redesign:
1. Aggregate task attributes to the role level using importance-weighted averages (Layer 1).
2. Overlay job-level context variables from HRIS (Layer 2).
3. Compute composite Hay factor scores using the combined two-layer model.
4. Compare against current grades — discrepancies reveal misgraded roles.
5. Cluster roles into job families by L1 function and similar Hay profiles.
6. Assign job levels based on Hay composite score bands, using Hay’s ~15% geometric step progression (a minimal banding sketch follows below).
7. Model the future state: remove automated tasks, recompute Layer 1, hold Layer 2 constant, and identify which roles shift levels.
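The band boundaries below are illustrative, not Hay’s published point values; the sketch only shows how a ~15% geometric step generates successive grade bands:

```python
# Generate illustrative grade bands from a ~15% geometric step.
# The base score and band count are assumptions, not Hay's published values.
def hay_bands(base=30.0, step=1.15, n_bands=6):
    bounds = [round(base * step ** i, 1) for i in range(n_bands + 1)]
    return list(zip(bounds[:-1], bounds[1:]))

for i, (lo, hi) in enumerate(hay_bands(), start=1):
    print(f"Band {i}: {lo:.1f} - {hi:.1f}")
```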
Important: This is a reference model based on external data. It is not derived from any specific institution's internal data. Scores should be validated against your organization's actual operating model.
Task Explorer

Filter, search, and drill into 4,075 financial services tasks. Click any row to expand full details.

Filters: Business Function · AI Disposition · AI Exposure Class · Agentic Potential · Bloom's Complexity · Defense Line

Table columns: ID · Task · Function · Disposition · E-Class · Agentic · Bloom
Mapping to Organizational Roles

A practical guide for connecting this reference inventory to your organization's actual job architecture.

This inventory uses generic role titles. Your organization will have different titles, structures, and task bundles. The mapping process below helps you translate between the two — revealing how tasks cluster into roles, what skills each role requires, and where role boundaries may need to shift.
1

Build Your Role–Task Alignment

List your actual job titles within each business function. For each role, identify which L2 processes and L3 activities they touch, and estimate the percentage of effort in each area.

Example

Your Role: "Client Service Associate — Branch"
L1: Retail Banking
L2 Processes: Deposit Products (60%), Consumer Lending (25%), Branch Sales (15%)
Reference Tasks: ~45 tasks from those L3 activities apply
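A small validation sketch for the alignment worksheet, assuming columns job_title, L2_process, and effort_pct (the rows mirror the example above):

```python
import pandas as pd

# Worksheet rows mirror the example above; effort_pct must total 100 per role.
mapping = pd.DataFrame([
    {"job_title": "Client Service Associate - Branch", "L2_process": "Deposit Products", "effort_pct": 60},
    {"job_title": "Client Service Associate - Branch", "L2_process": "Consumer Lending", "effort_pct": 25},
    {"job_title": "Client Service Associate - Branch", "L2_process": "Branch Sales", "effort_pct": 15},
])
totals = mapping.groupby("job_title")["effort_pct"].sum()
bad = totals[totals != 100]
assert bad.empty, f"Effort does not sum to 100% for: {list(bad.index)}"
print("All role-effort allocations sum to 100%.")
```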

2

Build a Role Complexity & Skills Profile

For each role, aggregate the task-level attributes to understand the role's overall character.

What You Can Derive

Complexity Profile: Distribution of Bloom's levels across the role's tasks — is this a primarily execution role (Bloom 1–2), analytical role (3–4), or strategic role (5–6)?

Skills Footprint: Union of all skills_required across the role's tasks — what is the full capability set this role demands?

Regulatory Burden: What percentage of the role's tasks are regulatory-driven? This affects change velocity and training requirements.

3

Analyze Role Composition

With tasks mapped and profiled, several analyses become possible:

  • Task Overlap: Which roles share significant task overlap? These may be candidates for consolidation or clearer boundary definition.
  • Skill Adjacency: Which roles share skill requirements? These form natural job families and career mobility paths.
  • Complexity Span: Does a role bundle tasks across too wide a Bloom's range? Roles spanning 4+ levels may need to split into tiered positions.
  • AI Exposure (Optional): Overlay AI exposure classes (E0/E1/E2) and dispositions to understand which tasks within a role are most affected by technology change.
Tip: Export filtered task data from the Explorer, then map against your HRIS headcount data to produce headcount-weighted profiles for each role.
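A minimal sketch of that tip, assuming an Explorer export already joined to your role-task map and an HRIS extract with job_title and fte_count (file and column names are assumptions):

```python
import pandas as pd

# File and column names are assumptions; adapt to your exports.
tasks = pd.read_csv("explorer_export_mapped.csv")  # task rows with a job_title column
hris = pd.read_csv("hris_headcount.csv")           # job_title, fte_count

merged = tasks.merge(hris, on="job_title")
profile = merged.groupby("job_title").agg(
    fte=("fte_count", "first"),
    mean_bloom=("cognitive_complexity", "mean"),
    pct_regulatory=("regulatory_driven", "mean"),
)
# FTE-weighted aggregate, so large populations count proportionally
sector_bloom = (profile.fte * profile.mean_bloom).sum() / profile.fte.sum()
print(profile.head())
print(f"FTE-weighted mean Bloom's: {sector_bloom:.2f}")
```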
Strategic Workforce Planning Integration

How to integrate task-level data into your SWP cycle — for skills planning, capacity modeling, and organizational change.

1

Assess Current State

Use Role Mapping to establish a baseline of your workforce's task composition.

  • Pull headcount from HRIS by job title and function
  • Map job titles to reference tasks using the Role Mapping framework
  • Profile each role's complexity distribution, skill requirements, and regulatory burden
  • Identify roles with the highest task diversity (spanning many L2 processes) — these are your most complex workforce planning targets
2

Identify Gaps & Risks

Compare the desired future state against current capabilities across multiple dimensions.

  • Skills Gap: Which skills appear in high-complexity tasks but are underrepresented in your current workforce?
  • Capacity Risk: Are critical tasks concentrated in too few roles or individuals? What happens if those roles turn over?
  • Regulatory Exposure: Which roles carry heavy regulatory task loads? These require specialized succession planning.
  • Technology Impact: Use AI exposure classes (E0/E1/E2) and dispositions to identify which tasks (and therefore roles) are most affected by technology change, including AI adoption.
3

Model Scenarios

Build scenarios to bound workforce evolution under different strategic assumptions.

Scenario Dimensions

Organizational Change: What if you consolidate roles within an L2 process? Model headcount and skill implications.

Technology Adoption: What if tasks classified E1 with an Automate disposition are automated within 24 months? Where does freed capacity go?

Regulatory Shift: What if new regulations add compliance tasks? Which roles absorb the load?
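As a sketch of the Technology Adoption scenario above, assuming a role-task map with effort weights and a task table with dispositions (file and column names are hypothetical):

```python
import pandas as pd

# Hypothetical inputs; adapt file and column names to your environment.
rt = pd.read_csv("role_task_mapping.csv")   # job_title, task_id, effort_pct
tasks = pd.read_csv("task_inventory.csv")   # task_id, ai_disposition, ...

merged = rt.merge(tasks[["task_id", "ai_disposition"]], on="task_id")
automated = merged[merged.ai_disposition == "Automate"]
freed = automated.groupby("job_title")["effort_pct"].sum().sort_values(ascending=False)
print(freed.head(10))  # roles with the most capacity released, in effort-%
```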

4

Implement & Monitor

Execute workforce transitions with measurable indicators.

  • Leading: Reskilling enrollment, internal mobility rate, time-to-fill for redesigned roles
  • Lagging: Productivity per role, cost-to-serve, customer satisfaction, error rates
  • Governance: Monthly reviews, cross-functional steering, union consultation where applicable
Job Hierarchy Redesign

Using the task inventory to rethink how roles, job families, and organizational layers are structured — grounded in the Hay Method for job evaluation.

The task inventory reveals what people actually do at the L4 level. This makes it possible to challenge existing job boundaries, identify where roles can be consolidated or split, and design a future-state hierarchy grounded in real task clusters rather than inherited org charts.

The Problem with Current Job Hierarchies

Most financial services job hierarchies evolved organically — roles were added, titles inflated, and boundaries hardened around legacy processes. When the underlying work changes (through technology, regulation, or market shifts), the hierarchy itself may no longer reflect the actual nature of the work being done. Two symptoms emerge:

Fragmented Roles

A single end-to-end process is split across 3–5 job titles, each owning a narrow slice. The result: duplicated skills, unclear accountability, and roles that lack the critical mass to justify a distinct grade.

Bloated Roles

A single title bundles unrelated tasks from different L2 processes. The role holder is a generalist by accident, not design — making it difficult to evaluate the role consistently or plan career progression.

The Hay Method & Task-Level Data

The Hay Method (Korn Ferry) is the most widely used job evaluation framework in financial services. It evaluates jobs on three core factors: Know-How, Problem Solving, and Accountability. Traditionally, these are assessed through job descriptions and interviews — a subjective, time-consuming process. The task inventory provides an empirical foundation for each factor.

Know-How

The sum of knowledge, skills, and experience required to perform the job competently.

Inventory fields that inform Know-How:

  • skills_required — directly enumerates the technical and interpersonal skills each task demands
  • cognitive_complexity — Bloom's level indicates the depth of knowledge application (recall vs. analysis vs. creation)
  • regulatory_driven — regulatory tasks typically require specialized, certified knowledge (AML, OSFI, Basel)
  • onet_soc_codes — links to O*NET's detailed knowledge and education requirements per occupation

Hay Application

Aggregate skills_required across all tasks in a role to measure the breadth of know-how. Use the maximum Bloom's level to gauge depth. Count distinct L2 processes to assess management breadth.

Problem Solving

The thinking required to analyze, evaluate, reason, and arrive at conclusions within the job's environment.

Inventory fields that inform Problem Solving:

  • cognitive_complexity — Bloom's taxonomy directly measures thinking demand: levels 1–2 (routine/guided), 3–4 (analytical/applied), 5–6 (evaluative/creative)
  • task_description — verb patterns reveal the thinking environment (e.g., "execute" = well-defined; "assess" = semi-variable; "design strategy" = abstract)
  • cross_functional — cross-functional tasks require navigating ambiguity across organizational boundaries
  • ai_exposure_class — E0 tasks tend to involve more novel, unstructured thinking; E1/E2 tasks have more structured, LLM-amenable components

Hay Application

Map the role's Bloom's distribution to Hay's Thinking Challenge scale. Use the proportion of cross-functional tasks to assess the Thinking Environment (how much guidance or precedent exists).

Accountability

The answerability for actions and their consequences — encompassing freedom to act, magnitude of impact, and directness of impact.

Inventory fields that inform Accountability:

  • defense_line — 1st line (direct execution/ownership), 2nd line (oversight/monitoring), 3rd line (independent assurance) map directly to freedom-to-act levels
  • importance — business criticality rating (1–5) indicates the magnitude of impact if the task fails
  • regulatory_driven — regulatory tasks carry external accountability to supervisors, auditors, and regulators
  • L1_function / L2_process — the organizational scope of the task indicates whether impact is local (branch) or enterprise-wide

Hay Application

Use defense_line to assign Freedom to Act. Weight importance scores by frequency to calculate Magnitude. Assess whether the role's tasks have direct (1st line) or indirect/advisory (2nd/3rd line) impact.

Four-Phase Hierarchy Redesign Process

1

Task Cluster Analysis

Start by grouping tasks from the inventory into natural clusters based on shared attributes, rather than inheriting current role boundaries.

  • By L2 Process: Which tasks belong together because they serve the same process end-to-end?
  • By Cognitive Complexity: Separate high-judgment (Bloom 4–6) from routine execution (Bloom 1–2) — these correspond to different Hay grades and should often be different roles.
  • By Skills Required: Tasks sharing common skill profiles are natural candidates for a single role family. Shared skills = shared Know-How = same Hay job family.
  • By Defense Line: Tasks on different defense lines carry fundamentally different accountability profiles and should not be combined in the same role.
Use the Explorer to filter by L1, Bloom's level, and defense line. Export the results and sort by skills_required to see natural groupings emerge. Each cluster is a candidate role.
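A sketch of that workflow, assuming the exported CSV serializes skills_required as a Python-style list literal (adjust the parser if your export uses JSON or delimiters):

```python
import ast
import pandas as pd

tasks = pd.read_csv("explorer_export.csv")  # filtered by L1, Bloom's, defense line
# Normalize the stringified skill list to a sorted tuple (an exact-signature key);
# fuzzier clustering would use Jaccard distance over the skill sets instead.
tasks["skill_sig"] = tasks["skills_required"].apply(
    lambda s: tuple(sorted(ast.literal_eval(s))))
clusters = (tasks.groupby(["L2_process", "skill_sig"])
                 .task_id.count()
                 .sort_values(ascending=False))
print(clusters.head(15))  # largest shared-skill clusters = candidate roles
```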
2

Role Boundary Redefinition

Once task clusters are identified, draw new role boundaries around them and evaluate each using Hay criteria:

  • Critical Mass Test: Does the cluster contain enough tasks to justify a full-time position? If not, merge with an adjacent cluster that shares the same Know-How profile.
  • Hay Coherence: Do all tasks in the proposed role land within 1–2 Bloom levels (Problem Solving), the same defense line (Accountability), and overlapping skill sets (Know-How)? If not, the role is trying to span too many Hay grades.
  • Span of Complexity: Roles spanning more than 2–3 Bloom levels should split into tiered positions (e.g., Analyst vs. Senior Analyst). This directly maps to different Hay evaluation points.
  • Cross-Functional Alignment: Tasks flagged as cross_functional=true may indicate roles that should sit in a shared service or center of excellence, which changes the Accountability profile (broader magnitude, more indirect impact).

Example: Retail Lending Hierarchy

Current: Mortgage Intake Clerk → Mortgage Processor → Underwriter → Closing Coordinator → Post-Close Auditor (5 roles, 3 layers)

Hay Analysis: Intake and Processing tasks are Bloom 1–2 with overlapping skills. Underwriting is Bloom 4 with distinct regulatory know-how. Audit is 3rd-line with different accountability. Three natural Hay clusters, not five.

Redesigned: Origination Advisor (client-facing, Bloom 4–5, 1st line) + Lending Operations Specialist (process + exception, Bloom 2–3, 1st line) + Credit Risk Reviewer (2nd/3rd line, Bloom 4–5). Three roles, two layers, each internally coherent against Hay criteria.

3

Job Family & Career Level Architecture

Organize the new roles into job families and define Hay-aligned career progression:

  • Job Families: Group roles by shared Know-How domains (skill overlap). Families might be “Client Advisory,” “Risk & Control,” “Data & Intelligence,” “Regulatory Operations” — defined by the skills that their constituent tasks share.
  • Career Levels (Hay Grades): Within each family, Bloom's levels provide a natural grading structure. Bloom 1–2 tasks define entry/associate grades (lower Know-How, guided Problem Solving). Bloom 3–4 define mid-level grades (analytical Problem Solving, broader Accountability). Bloom 5–6 define senior/leadership grades (evaluative/creative thinking, enterprise-wide impact).
  • Progression Paths: Career mobility between levels is defined by which new tasks the next level adds. This makes promotion criteria objective: can the person perform the higher-Bloom tasks that define the next grade?
  • Compensation Banding: Hay evaluation points (derived from task-level Know-How, Problem Solving, and Accountability) provide a defensible, data-backed foundation for pay banding rather than market-matching by title alone.
4

Transition Mapping & Governance

The new hierarchy is a target state. Getting there requires managed transitions:

  • Current → Future Role Map: For each existing role, define the target role(s) it maps to. Export the Role Mapping Template from the Export Center and populate with your org's current titles.
  • Hay Re-Evaluation: Use the task data to draft Hay evaluation profiles for each new role. This accelerates the traditionally manual evaluation process because the Know-How, Problem Solving, and Accountability inputs are already captured in the inventory.
  • Skill Gap Analysis: Compare skills_required of the future role against current role holders. The delta defines training and reskilling needs.
  • Phased Rollout: Sequence by business impact. Start with functions where role fragmentation or bloat is most severe, and where the Hay re-evaluation reveals the largest gap between current grading and task-based grading.
  • Governance: Hierarchy redesign crosses HR, Compensation, business lines, and risk. Establish a cross-functional steering group with sign-off authority on role and grade changes.
Key Principle: The hierarchy should be designed around task clusters evaluated against Hay criteria — not inherited titles or current headcount. Let the tasks define the roles, and let the task attributes define the grades.

Using the Inventory for Hay-Aligned Hierarchy Analysis

Identify Consolidation Opportunities

Filter by L2 process and examine how many distinct primary_roles appear. If 4+ roles share the same L2, similar Bloom levels, and overlapping skills, they occupy the same Hay territory and consolidation is likely warranted.
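A sketch of that filter, assuming the full JSON export from the Export Center (file name hypothetical):

```python
import json
import pandas as pd

with open("task_inventory.json") as f:
    tasks = pd.DataFrame(json.load(f)["tasks"])

# L2 processes where 4+ roles cluster within a narrow Bloom's range:
# candidates for consolidation review.
exploded = tasks.explode("primary_roles")
candidates = (exploded.groupby("L2_process")
              .agg(n_roles=("primary_roles", "nunique"),
                   bloom_spread=("cognitive_complexity", lambda x: int(x.max() - x.min())))
              .query("n_roles >= 4 and bloom_spread <= 2")
              .sort_values("n_roles", ascending=False))
print(candidates)
```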

Detect Grade Misalignment

Sort by cognitive_complexity within an L1 function. If roles at adjacent Bloom levels have identical task types and defense lines, they may be graded differently but doing the same work — a Hay evaluation would merge them.

Map Know-How Clusters

Export tasks and group by skills_required. Roles that share >70% of their skill footprint belong in the same job family. Roles that share <30% may be misclassified in the current hierarchy.
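A minimal sketch of the footprint comparison, with invented roles and skills; the 70%/30% thresholds come from the guidance above:

```python
from itertools import combinations

# Hypothetical skill footprints (role -> set of skills), built as in
# Role Mapping step 2; titles and skills here are illustrative.
footprints = {
    "Credit Analyst": {"Credit Analysis", "Financial Modeling", "Writing"},
    "Senior Credit Analyst": {"Credit Analysis", "Financial Modeling", "Negotiation"},
    "Branch Advisor": {"Client Advisory", "Product Knowledge"},
}

def jaccard(a, b):
    return len(a & b) / len(a | b) if (a | b) else 0.0

for r1, r2 in combinations(footprints, 2):
    o = jaccard(footprints[r1], footprints[r2])
    family = "same family" if o > 0.70 else ("different family" if o < 0.30 else "review")
    print(f"{r1} <-> {r2}: {o:.0%} ({family})")
```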

Validate Accountability Structures

Ensure the redesigned hierarchy maintains separation of duties. No role should mix 1st-line and 2nd/3rd-line tasks — the defense_line field makes this auditable and maps directly to Hay's Freedom to Act dimension.

Summary: The task inventory provides the raw material that Hay evaluations require — but captured systematically at scale rather than through role-by-role interviews. By aggregating task-level Know-How (skills, Bloom's, regulatory knowledge), Problem Solving (Bloom's distribution, cross-functional scope), and Accountability (defense line, importance, impact scope), organizations can draft Hay-aligned role evaluations directly from the data, dramatically accelerating the job architecture redesign process.
Implementation Runbook

How to recreate and extend this analysis internally, combining your organization's proprietary data with the external reference inventory.

This inventory was built entirely from external, publicly available data. Its value as a reference model is that it provides a validated starting point — a comprehensive task taxonomy, scoring engine, and enrichment schema — that any organization can adapt without starting from scratch. The sections below describe exactly how to do that.

Reusable Artifacts from This Analysis

The following outputs from this project can be used directly in your internal implementation. They represent significant upfront work that does not need to be repeated:

4-Level Taxonomy Structure

The hierarchy of 15 L1 Functions → 164 L2 Processes → 486 L3 Activities provides a ready-made classification framework. Your organization can adopt it as-is or modify branches to reflect your specific operating model.

Export: Full JSON from the Export Center. Extract unique L1/L2/L3 combinations to get the taxonomy tree.

Reference Task Library (4,075 tasks)

Each task is a verb-object statement with a full description, skills, roles, and classification metadata. Use as a starting checklist: walk through each L3 activity and confirm which tasks exist in your org, which need rewording, and which are missing.

Export: Full CSV. Filter by L1 function to produce function-specific worksheets for SME validation.

20-Field Task Schema

The schema (task_id, task_name, task_description, L1–L3, SOC codes, roles, importance, frequency, Bloom's, regulatory, cross-functional, AI exposure class, agentic potential, AEI success rate, disposition, skills, defense line, source) is designed for analytical versatility. Adopt it as your internal data standard.

Export: JSON schema is self-documenting. See the Methodology tab for field definitions.

AI Exposure Scoring Engine

Categorical E0/E1/E2 task classification per Eloundou et al. (2023, Science). Occupation-level Eloundou β measure for aggregation. Published AIOE z-scores (Felten et al. 2021) available for cross-validation. Full Python implementation in the Data Scientist Runbook below.

Export: The scoring logic is documented in the Methodology tab. Complete Python code in the runbook (Step 4).

O*NET SOC Code Mappings

Each task is linked to O*NET Standard Occupational Classification codes, connecting the inventory to the U.S. Department of Labor's occupational database (knowledge requirements, education levels, wage data, projected growth).

Export: SOC codes are included in every CSV/JSON export. Cross-reference against the free O*NET 30.2 database.

Hay Method Mapping Framework

The Job Hierarchy Redesign tab documents how inventory fields map to Hay's three evaluation factors (Know-How, Problem Solving, Accountability). This mapping template accelerates Hay-aligned job architecture work.

Export: Conceptual framework documented in the Redesign tab. Apply it to your org-specific task data.

Internal Data Sources to Integrate

To move from a reference model to an org-specific analysis, you need to overlay your proprietary data. Here are the key internal sources and what they contribute:

| Internal Source | What It Provides | How It Integrates |
|---|---|---|
| HRIS / Workday | Job titles, headcount, grades, compensation bands, reporting lines, org structure | Map job titles to reference tasks (Role Mapping step 1). Headcount-weight the analysis to show FTE impact, not just task count. |
| Job Descriptions (JDs) | Official role responsibilities, qualifications, competency requirements | Validate and customize the reference task list. Add org-specific tasks not in the external inventory. Confirm Bloom's levels match internal expectations. |
| Process Maps / SOPs | Documented workflows, system touchpoints, handoff points | Validate L2/L3 taxonomy alignment. Identify tasks that are split across roles differently than the reference model assumes. |
| Time & Motion / Activity Data | How staff actually spend their time (if available from workforce analytics tools) | Replace estimated effort weights with actual observed data. This is the single highest-value internal dataset for this analysis. |
| Learning Management System (LMS) | Training records, certifications, competency assessments | Map to skills_required to identify existing capability vs. gaps. Feeds directly into SWP skill gap analysis. |
| Hay / Korn Ferry Evaluations | Existing job evaluation scores, grade structures, point profiles | Compare current Hay grades against the task-derived grades from the Hierarchy Redesign framework. Identify misalignment between current grading and actual task composition. |
| Incident / Issue Registers | Operational errors, compliance findings, audit issues | Correlate with task-level data to identify which tasks (and therefore roles) are highest risk. Informs importance scoring and defense line validation. |
| Technology Inventory | Systems, platforms, automation tools currently in use | Informs the “current digitization” scoring factor. Tasks performed on modern platforms score higher for AI readiness. |

External Data Sources (Publicly Available)

These are the external sources used to build this reference inventory. All are freely or commercially available:

| Source | Access | What to Extract |
|---|---|---|
| O*NET 30.2 Database | Free download: onetonline.org | Task statements, knowledge domains, skills, abilities, education requirements, and wage data for 1,000+ occupations. Filter by SOC codes relevant to financial services (13-xxxx, 15-xxxx, 43-xxxx). |
| Regulatory Frameworks | FINTRAC, OSFI, Basel III/IV, IFRS 9, TCFD — all published online | Compliance obligations that generate regulatory-driven tasks. These define the “non-negotiable” task layer that cannot be eliminated. |
| Professional Certifications | CFA Institute, GARP (FRM), (ISC)² (CISSP), FINRA | Certification body-of-knowledge outlines define the skill and knowledge standards for professional roles. Use to validate skills_required fields. |
| Industry Job Postings | Careers pages, Indeed, LinkedIn, Glassdoor | Real-world role descriptions and responsibilities. Useful for validating that the inventory covers actual market roles (see the BMO Coverage Analysis for an example of this validation). |
| Bloom’s Taxonomy Reference | Standard educational framework (widely published) | Provides the 6-level cognitive complexity scale: Remember (1), Understand (2), Apply (3), Analyze (4), Evaluate (5), Create (6). Used to score each task. |

Data Scientist Runbook

The following is a step-by-step technical guide for a data scientist to build the internal integration pipeline. Each step includes the inputs, outputs, and pseudocode logic.

1

Load & Validate the Reference Inventory

Start by loading the reference data and confirming its structure.

```python
# Step 1: Load reference inventory
import json
import pandas as pd

with open('task_inventory.json') as f:
    ref = json.load(f)
ref_tasks = pd.DataFrame(ref['tasks'])
print(f"Reference: {len(ref_tasks)} tasks, {ref_tasks.L1_function.nunique()} L1 functions")
print(f"Schema: {list(ref_tasks.columns)}")

# Validate: no nulls in key fields
assert ref_tasks[['task_id', 'task_name', 'L1_function',
                  'L2_process', 'L3_activity']].notna().all().all()

# Extract taxonomy tree for reuse
taxonomy = (ref_tasks[['L1_function', 'L2_process', 'L3_activity']]
            .drop_duplicates()
            .sort_values(['L1_function', 'L2_process', 'L3_activity']))
```
2

Load Internal HRIS Data & Build Role–Task Map

Pull your HRIS export and map each internal job title to reference tasks. This is the most labor-intensive step and typically requires SME input.

```python
# Step 2: Load HRIS and build role-task mapping
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

hris = pd.read_csv('hris_export.csv')
# columns: employee_id, job_title, department, grade, fte_count
print(f"HRIS: {hris.job_title.nunique()} unique titles, {hris.fte_count.sum()} FTEs")

# Option A: Manual mapping worksheet (SME-assisted)
# Export reference tasks by L1, have SMEs mark which tasks apply to each job title
role_task_map = pd.read_csv('role_task_mapping.csv')
# columns: job_title, task_id, effort_pct

# Option B: Automated fuzzy matching (augments manual mapping)
# Match JD text against reference task descriptions
jd_titles, jd_texts = load_job_descriptions()  # your JD corpus: parallel lists of titles and text
ref_descs = ref_tasks['task_description'].tolist()

vectorizer = TfidfVectorizer(stop_words='english', max_features=5000)
tfidf = vectorizer.fit_transform(ref_descs + jd_texts)
ref_vecs = tfidf[:len(ref_descs)]
jd_vecs = tfidf[len(ref_descs):]
similarities = cosine_similarity(jd_vecs, ref_vecs)

# For each JD, keep the top-N matching reference tasks above threshold
THRESHOLD = 0.25
suggested = []
for i, jd_title in enumerate(jd_titles):
    top_matches = similarities[i].argsort()[::-1][:20]
    suggested.extend(
        (jd_title, ref_tasks.iloc[j].task_id, similarities[i][j])
        for j in top_matches if similarities[i][j] > THRESHOLD
    )
suggested_df = pd.DataFrame(suggested, columns=['job_title', 'task_id', 'similarity'])
```
3

Add Org-Specific Tasks

The reference inventory covers the industry broadly but won't capture every organization-specific process. Add custom tasks using the same schema.

```python
# Step 3: Add org-specific tasks
# Use the same 20-field schema
custom_tasks = []
custom_tasks.append({
    'task_id': 'CUSTOM.RB.001',  # prefix with CUSTOM to distinguish
    'task_name': 'Process Internal Transfer via Proprietary Platform',
    'task_description': 'Execute inter-branch account transfers using [OrgSystem]...',
    'L1_function': 'Retail Banking',
    'L2_process': 'Deposit Products & Services',
    'L3_activity': 'Transaction Processing',
    'onet_soc_codes': ['43-3071.00'],
    'primary_roles': ['Branch Operations Specialist'],
    'importance': 3,
    'frequency': 'Daily',
    'cognitive_complexity': 2,     # Bloom's level
    'regulatory_driven': False,
    'cross_functional': False,
    'ai_exposure_class': None,     # will be classified in Step 4
    'agentic_potential': None,     # forward-looking flag, assigned separately
    'aei_success_rate': None,      # null unless matched to AEI data
    'ai_disposition': None,
    'skills_required': ['Core Banking System', 'Transaction Processing'],
    'defense_line': '1st',
    'source': 'Internal'
})

# Merge with reference
all_tasks = pd.concat([ref_tasks, pd.DataFrame(custom_tasks)], ignore_index=True)
```
4

Apply the AI Exposure Classification (E0/E1/E2)

Classify tasks using the categorical rubric from Eloundou et al. (2023). Each task is assigned E0 (no LLM exposure), E1 (direct LLM — 50%+ time reduction), or E2 (LLM + tools). Use the Eloundou β measure for role-level aggregation. Published AIOE z-scores (Felten et al. 2021) are available at the occupation level for cross-validation.

```python
# Step 4: AI Exposure Scoring — Bottom-Up E0/E1/E2 Classification
# References:
#   Eloundou et al. (2023) "GPTs are GPTs", Science 384(6702)
#   Felten, Raj & Seamans (2021) AIOE Index, Strategic Mgmt Journal
import re
import pandas as pd

# ── 4a. Define keyword patterns for E0/E1/E2 classification ──
# Patterns are lowercase because task text is lowercased before matching.
E0_PATTERNS = [  # Physical, embodied, in-person tasks
    r'(physical|manual|hands-on|in-person|face-to-face|on-site|field)',
    r'(vault|cash handling|cash count|coin|currency|safe|lock|key)',
    r'(inspect|patrol|guard|security check|emergency response)',
    r'(lift|carry|move|transport|deliver|install|repair|maintain equipment)',
    r'(branch operations|teller|counter|window)',
]
E0_STRONG = [  # High-confidence E0 signals
    r'(physically|manual labor|operate machinery|handle cash|count currency)',
    r'(vault operations|security patrol|emergency evacuation)',
]
E1_PATTERNS = [  # Direct LLM tasks (50%+ time reduction with LLM alone)
    r'(write|draft|compose|author|prepare report|create document)',
    r'(summarize|synthesize|consolidate|abstract)',
    r'(review|proofread|edit|revise)',
    r'(code|script|program|develop software|debug)',
    r'(classify|categorize|tag|label)',
    r'(research|investigate|compile information|literature review)',
    r'(explain|describe|articulate|communicate)',
    r'(recommend|suggest|advise|propose)',
    r'(policy|procedure|guideline|standard|template)',
    r'(forecast|project|predict|estimate)',
    r'(train|educate|instruct|onboard)',
]
E2_PATTERNS = [  # LLM + tools (data analysis, system integration)
    r'(analyze data|data analysis|statistical|regression|model)',
    r'(database|sql|query|extract data|data mining)',
    r'(dashboard|visualization|chart|graph|report generation)',
    r'(monitor|track|alert|detection|surveillance)',
    r'(automate|workflow|process automation|rpa)',
    r'(reconcile|match|validate data|cross-reference)',
    r'(risk model|credit model|scoring model|pricing model)',
    r'(compliance monitoring|regulatory reporting|filing)',
    r'(transaction processing|settlement|clearing)',
    r'(due diligence|kyc|aml|screening)',
]

# ── 4b. Task-level classification function ──
def classify_task_e012(task_text, bloom_level):
    """Classify a task as E0, E1, or E2 per the Eloundou rubric.

    Returns (classification, confidence)."""
    text = task_text.lower()
    e0 = sum(1 for p in E0_PATTERNS if re.search(p, text))
    e0s = sum(1 for p in E0_STRONG if re.search(p, text))
    e1 = sum(1 for p in E1_PATTERNS if re.search(p, text))
    e2 = sum(1 for p in E2_PATTERNS if re.search(p, text))

    # Strong E0 signal overrides
    if e0s >= 1 and e1 == 0 and e2 == 0:
        return 'E0', 0.9

    # Bloom's modulates: higher-order thinking = more LLM-amenable
    bloom_boost = max(0, (bloom_level - 2) * 0.1) if bloom_level else 0
    total = max(e0 + e1 + e2, 1)
    scores = {
        'E0': (e0 / total) * (1 - bloom_boost),
        'E1': (e1 / total) + bloom_boost * 0.6,
        'E2': (e2 / total) + bloom_boost * 0.4,
    }
    best = max(scores, key=scores.get)
    if best == 'E0' and e0 == 0:
        best = 'E1'  # default to E1 if no E0 signals
    return best, min(0.9, scores[best])

# ── 4c. Disposition assignment (based on E-class, Bloom's, and regulatory status) ──
def assign_disposition(e_class, bloom, regulatory):
    """Assign Automate/Augment/Restructure/No_Change using E-class + task attributes."""
    # E0 tasks: no LLM exposure
    if e_class == 'E0':
        return 'No_Change' if bloom >= 5 else 'Restructure'
    # E1 tasks: direct LLM exposure
    if e_class == 'E1':
        if bloom <= 2:
            return 'Automate'       # routine tasks the LLM can handle directly
        elif bloom <= 4:
            return 'Augment' if regulatory else 'Automate'
        else:
            return 'Augment'        # high-order tasks: human + LLM together
    # E2 tasks: LLM + tools
    if e_class == 'E2':
        if bloom <= 2:
            return 'Automate'       # tool pipelines can handle routine E2 tasks
        elif bloom <= 3:
            return 'Restructure'    # may require workflow redesign for tooling
        else:
            return 'Augment'        # complex tasks benefit from AI-augmented workflows
    return 'Augment'

# ── 4d. Apply to all tasks ──
for idx, row in all_tasks.iterrows():
    text = f"{row['task_name']} {row['task_description']}"
    bloom = row['cognitive_complexity']
    e_class, conf = classify_task_e012(text, bloom)
    all_tasks.at[idx, 'ai_exposure_class'] = e_class
    all_tasks.at[idx, 'ai_disposition'] = assign_disposition(
        e_class, bloom, row['regulatory_driven'])

print(f"Classification: {all_tasks.ai_exposure_class.value_counts().to_dict()}")

# ── 4e. Compute Eloundou beta at occupation level ──
# beta = [E1 + 0.5 * E2] / total tasks for each occupation
e_map = {'E0': 0, 'E1': 1, 'E2': 0.5}
all_tasks['e_weight'] = all_tasks['ai_exposure_class'].map(e_map)
occ_beta = (all_tasks.explode('onet_soc_codes')
            .groupby('onet_soc_codes')
            .agg(beta=('e_weight', 'mean'), task_count=('task_id', 'count'))
            .reset_index())
occ_beta.columns = ['soc', 'beta', 'task_count']
print("\nOccupation-level beta:")
print(f"  Mean beta: {occ_beta.beta.mean():.3f}")
print(f"  Range: {occ_beta.beta.min():.3f} - {occ_beta.beta.max():.3f}")

# ── 4f. CROSS-VALIDATE against published AIOE (Felten et al. 2021) ──
# Download AIOE data: github.com/AIOE-Data/AIOE
# The AIOE is an occupation-level z-score index (not task-level).
# We compare our occupation-level beta against AIOE for directional alignment.
aioe_df = pd.read_excel('AIOE_DataAppendix.xlsx', sheet_name='Appendix A')
aioe_map = dict(zip(aioe_df['O*NET-SOC Code'], aioe_df['AIOE']))
occ_beta['aioe'] = occ_beta['soc'].map(aioe_map)
occ_valid = occ_beta.dropna(subset=['aioe'])

# Spearman rank correlation (appropriate for comparing ordinal/z-score vs proportion)
from scipy.stats import spearmanr
rho, p_val = spearmanr(occ_valid['beta'], occ_valid['aioe'])
print(f"\nVALIDATION: Spearman rho = {rho:.3f} (p = {p_val:.4f})")
print(f"Matched {len(occ_valid)} occupations")
print("Note: AIOE is an occupation-level z-score; beta is a task-derived proportion.")
print("Directional agreement (both ranking occupations similarly) is the goal.")
```
5

Build Role-Level Profiles & Hay Method Evaluation

Aggregate task data to role level and compute Hay Method factor proxies for job hierarchy redesign. The three Hay factors (Know-How, Problem Solving, Accountability) are derived from task attributes using importance-weighted aggregation.

# Step 5: Role-Level Aggregation & Hay Method Factor Computation
import math

merged = role_task_map.merge(all_tasks, on='task_id')
merged = merged.merge(
    hris[['job_title','fte_count','grade','direct_reports',
          'indirect_reports','levels_from_ceo','approval_limit',
          'budget_responsibility','l1_functions_in_scope']].drop_duplicates(),
    on='job_title'
)
# NOTE: If your HRIS doesn't have all columns, fill with defaults:
# merged['direct_reports'] = merged.get('direct_reports', 0)
# merged['approval_limit'] = merged.get('approval_limit', 10000)
# merged['levels_from_ceo'] = merged.get('levels_from_ceo', 6)

# ── 5a. Basic role aggregation (Layer 1: Task-Derived) ──
role_profiles = merged.groupby('job_title').agg(
    task_count=('task_id', 'count'),
    fte_count=('fte_count', 'first'),
    current_grade=('grade', 'first'),
    # Carry HRIS context through to role level so the Layer 2 lookups
    # in 5f read real values instead of silently falling back to defaults
    direct_reports=('direct_reports', 'first'),
    indirect_reports=('indirect_reports', 'first'),
    levels_from_ceo=('levels_from_ceo', 'first'),
    approval_limit=('approval_limit', 'first'),
    budget_responsibility=('budget_responsibility', 'first'),
    l1_functions_in_scope=('l1_functions_in_scope', 'first'),
    unique_skills=('skills_required',
                   lambda x: len(set(s for sl in x for s in sl))),
    max_bloom=('cognitive_complexity', 'max'),
    mean_bloom=('cognitive_complexity', 'mean'),
    bloom_std=('cognitive_complexity', 'std'),
    l2_breadth=('L2_process', 'nunique'),
    l3_breadth=('L3_activity', 'nunique'),
    pct_cross_functional=('cross_functional', 'mean'),
    primary_defense_line=('defense_line', lambda x: x.mode().iloc[0]),
    mean_importance=('importance', 'mean'),
    pct_regulatory=('regulatory_driven', 'mean'),
    beta=('ai_exposure_class',
          lambda x: (sum(1 for v in x if v == 'E1')
                     + 0.5 * sum(1 for v in x if v == 'E2')) / len(x)),
    pct_e0=('ai_exposure_class', lambda x: (x == 'E0').mean()),
    pct_e1=('ai_exposure_class', lambda x: (x == 'E1').mean()),
    pct_e2=('ai_exposure_class', lambda x: (x == 'E2').mean()),
    pct_automate=('ai_disposition', lambda x: (x == 'Automate').mean()),
    pct_augment=('ai_disposition', lambda x: (x == 'Augment').mean()),
).round(3)

# ── 5b. Hay Factor 1: KNOW-HOW ──
# Technical Depth:    cognitive complexity * skill breadth
# Managerial Breadth: proportion of supervisory/strategic tasks
# Human Relations:    proportion of interpersonal/advisory tasks
MGMT_KW = ['direct','supervise','manage','lead','plan','coordinate',
           'delegate','oversee','strategic','governance','budget']
HR_KW = ['advise','counsel','coach','mentor','negotiate','present',
         'relationship','communicate','client','stakeholder','mediate']

def compute_know_how(role_tasks):
    n = len(role_tasks)
    # Technical Depth (0-100): avg Bloom * unique skill count, normalized
    avg_bloom = role_tasks['cognitive_complexity'].mean()
    skills = set(s for sl in role_tasks['skills_required'] for s in sl)
    tech_depth = min(100, avg_bloom * len(skills) / 2)
    # Managerial Breadth (0-100): % of tasks with mgmt keywords
    mgmt_count = sum(
        1 for _, t in role_tasks.iterrows()
        if any(kw in (t['task_name'] + ' ' + t['task_description']).lower()
               for kw in MGMT_KW))
    mgmt_breadth = (mgmt_count / n) * 100
    # Human Relations (0-100): % of tasks with HR keywords
    hr_count = sum(
        1 for _, t in role_tasks.iterrows()
        if any(kw in (t['task_name'] + ' ' + t['task_description']).lower()
               for kw in HR_KW))
    human_rel = (hr_count / n) * 100
    # Composite: weighted sum (Hay weights technical depth highest)
    return round(tech_depth * 0.50 + mgmt_breadth * 0.25 + human_rel * 0.25, 1)

# ── 5c. Hay Factor 2: PROBLEM SOLVING ──
# Thinking Environment: derived from E-class distribution (more E0 = more novel)
# Thinking Challenge:   Bloom's level distribution
def compute_problem_solving(role_tasks):
    # Thinking Environment (0-100): based on E-class distribution.
    # E0 tasks = fully human, novel problems; E1 = most structured; E2 = mid.
    e_weights = {'E0': 100, 'E1': 30, 'E2': 50}
    think_env = role_tasks['ai_exposure_class'].map(e_weights).mean()
    # Thinking Challenge (0-100): Bloom distribution mapped to challenge
    # scores, so a Bloom-6 task counts 10x a Bloom-1 task
    bloom_dist = role_tasks['cognitive_complexity'].value_counts(normalize=True)
    challenge = sum(
        pct * {1: 10, 2: 20, 3: 40, 4: 60, 5: 80, 6: 100}.get(level, 40)
        for level, pct in bloom_dist.items()
    )
    # Composite. (Classic Hay expresses Problem Solving as a percentage of
    # Know-How; here both are kept on the same 0-100 scale for comparability.)
    return round(think_env * 0.40 + challenge * 0.60, 1)

# ── 5d. Hay Factor 3: ACCOUNTABILITY ──
# Freedom to Act:  inverse of regulatory constraint
# Scope/Magnitude: mean task importance (frequency and L1 materiality
#                  could further refine this)
# Impact:          defense-line weight * importance
LOD_WEIGHT = {'1st': 1.0, '2nd': 0.7, '3rd': 0.5, 'NA': 0.3}

def compute_accountability(role_tasks):
    # Freedom to Act (0-100): 100 - (% regulatory-driven * 100)
    freedom = (1 - role_tasks['regulatory_driven'].mean()) * 100
    # Scope/Magnitude (0-100): mean importance (1-5) normalized to 0-100
    scope = role_tasks['importance'].mean() * 20   # 5 -> 100
    # Impact (0-100): defense line weight * importance
    lod_weights = role_tasks['defense_line'].map(LOD_WEIGHT).fillna(0.3)
    impact = (lod_weights * role_tasks['importance'] / 5 * 100).mean()
    return round(freedom * 0.30 + scope * 0.35 + impact * 0.35, 1)

# ── 5e. Layer 2: Job-Level Organizational Context ──
# These variables come from HRIS, not from task attributes. They modulate
# the task-derived Hay scores to reflect organizational position — the
# "amplifier" that distinguishes an analyst from a VP doing similar
# analytical work.

def span_of_control_multiplier(direct_reports, indirect_reports=0):
    # Hay managerial breadth scale based on total reports
    total = (direct_reports or 0) + (indirect_reports or 0)
    if total == 0:
        return 1.0     # Individual contributor
    elif total <= 5:
        return 1.15    # Team lead
    elif total <= 20:
        return 1.30    # Manager
    elif total <= 100:
        return 1.50    # Senior manager / Director
    else:
        return 1.70    # VP / Executive

def decision_authority_score(approval_limit):
    # Map financial approval authority to the Hay freedom-to-act scale
    # (0-100), based on the Hay Guide Chart A progression
    if approval_limit is None:
        return 30                   # default: controlled
    limit = float(approval_limit)
    if limit < 10_000:
        return 15                   # Prescribed
    elif limit < 100_000:
        return 30                   # Controlled
    elif limit < 1_000_000:
        return 50                   # Standardized
    elif limit < 50_000_000:
        return 70                   # Generally regulated
    elif limit < 500_000_000:
        return 85                   # Broadly defined
    else:
        return 95                   # Strategic direction

def budget_magnitude_score(budget):
    # Hay uses a geometric (log) scale for magnitude.
    # Normalized to 0-100 using ln(budget) / ln(max_expected);
    # clamped so sub-unit budgets can't go negative
    if budget is None or budget <= 0:
        return 20
    return max(0, min(100, round(math.log(budget)
                                 / math.log(10_000_000_000) * 100)))

def reporting_level_score(levels_from_ceo):
    # Fewer levels from CEO = more abstract thinking environment
    # (Hay thinking environment scale)
    level_map = {1: 95, 2: 80, 3: 65, 4: 50, 5: 40, 6: 30}
    return level_map.get(levels_from_ceo, max(15, 95 - levels_from_ceo * 12))

def cross_functional_breadth(l1_count):
    # Number of L1 functions in scope
    if l1_count is None or l1_count <= 1:
        return 1.0
    elif l1_count <= 3:
        return 1.15
    else:
        return 1.30

# ── 5f. Compute combined Hay scores (Layer 1 + Layer 2) ──
hay_scores = {}
for title in role_profiles.index:
    role_tasks = merged[merged.job_title == title]
    row = role_profiles.loc[title]

    # Layer 1: task-derived base scores
    kh_base = compute_know_how(role_tasks)
    ps_base = compute_problem_solving(role_tasks)
    ac_base = compute_accountability(role_tasks)

    # Layer 2: job-level organizational context modulation
    span_mult = span_of_control_multiplier(
        row.get('direct_reports', 0), row.get('indirect_reports', 0))
    xfunc_mult = cross_functional_breadth(
        row.get('l1_functions_in_scope', 1))
    da_score = decision_authority_score(row.get('approval_limit', None))
    bm_score = budget_magnitude_score(row.get('budget_responsibility', None))
    rl_score = reporting_level_score(row.get('levels_from_ceo', 6))

    # Combine: Layer 1 base * Layer 2 modulation.
    # Know-How: span of control amplifies managerial breadth,
    # cross-functional scope amplifies overall breadth
    kh_final = min(100, kh_base * span_mult * xfunc_mult)
    # Problem Solving: reporting level modulates thinking environment
    # (closer to CEO = more abstract/strategic thinking required)
    ps_final = min(100, ps_base * 0.60 + rl_score * 0.40)
    # Accountability: decision authority and budget replace the
    # task-derived freedom/scope estimates with actual org data
    ac_final = min(100,
                   ac_base * 0.30 +    # Task-derived impact
                   da_score * 0.35 +   # Decision authority (HRIS)
                   bm_score * 0.35)    # Budget magnitude (HRIS)

    hay_scores[title] = {
        'know_how': round(kh_final, 1),
        'problem_solving': round(ps_final, 1),
        'accountability': round(ac_final, 1),
        # Keep Layer 1 scores for comparison
        'kh_task_only': round(kh_base, 1),
        'ps_task_only': round(ps_base, 1),
        'ac_task_only': round(ac_base, 1),
        # Layer 2 context values
        'span_multiplier': span_mult,
        'decision_authority': da_score,
        'budget_magnitude': bm_score,
        'reporting_level': rl_score,
    }

hay_df = pd.DataFrame(hay_scores).T
hay_df['hay_composite'] = (hay_df['know_how'] * 0.40
                           + hay_df['problem_solving'] * 0.30
                           + hay_df['accountability'] * 0.30).round(1)
# Also compute a task-only composite for Layer 1 vs combined comparison
hay_df['hay_task_only'] = (hay_df['kh_task_only'] * 0.40
                           + hay_df['ps_task_only'] * 0.30
                           + hay_df['ac_task_only'] * 0.30).round(1)

role_profiles = role_profiles.join(hay_df)

# ── 5g. Derive job levels from composite Hay score ──
# Thresholds follow Hay's ~15% geometric step progression across levels
def hay_to_level(composite):
    if composite >= 80:
        return 'Executive / SVP'
    elif composite >= 68:
        return 'Director / VP'
    elif composite >= 56:
        return 'Senior Manager / Lead'
    elif composite >= 44:
        return 'Manager / Senior Specialist'
    elif composite >= 32:
        return 'Analyst / Specialist'
    else:
        return 'Associate / Coordinator'

role_profiles['suggested_level'] = role_profiles.hay_composite.apply(hay_to_level)

# ── 5h. Compare and identify misalignment ──
print("\n=== HAY EVALUATION SUMMARY ===")
print(f"{'Role':<40} {'KH':>5} {'PS':>5} {'AC':>5} {'Comp':>5} "
      f"{'Task-Only':>9} {'Δ':>4} {'Level'}")
print("─" * 100)
for title, p in role_profiles.iterrows():
    delta = p.hay_composite - p.hay_task_only
    flag = '⬆' if delta > 10 else ('⬇' if delta < -5 else '')
    print(f"{title[:40]:<40} {p.know_how:>5.0f} {p.problem_solving:>5.0f} "
          f"{p.accountability:>5.0f} {p.hay_composite:>5.0f} "
          f"{p.hay_task_only:>9.0f} {delta:>+4.0f} {p.suggested_level} {flag}")

# Show where Layer 2 context makes the biggest difference
print("\nLargest Layer 2 impact (org context vs task-only):")
role_profiles['layer2_delta'] = (role_profiles.hay_composite
                                 - role_profiles.hay_task_only)
top_delta = role_profiles.nlargest(10, 'layer2_delta')
for title, p in top_delta.iterrows():
    print(f"  {title[:45]:<45} +{p.layer2_delta:.0f} pts "
          f"(span={p.span_multiplier:.2f}x, auth={p.decision_authority:.0f}, "
          f"budget={p.budget_magnitude:.0f})")

# ── 5i. Future-state modeling: remove automated tasks ──
# Layer 1 (task composition) changes; Layer 2 (org context) held constant
future_tasks = merged[merged.ai_disposition != 'Automate']
print(f"\nFuture-state: {len(future_tasks)} tasks remain after automation")
print(f"Removed: {len(merged) - len(future_tasks)} tasks "
      f"({(len(merged) - len(future_tasks)) / len(merged) * 100:.1f}%)")

# Recompute Layer 1 for the future state; keep Layer 2 constant
for title in role_profiles.index:
    ft = future_tasks[future_tasks.job_title == title]
    if len(ft) > 0:
        # Layer 1 recalculated from the remaining tasks
        kh_base = compute_know_how(ft)
        ps_base = compute_problem_solving(ft)
        ac_base = compute_accountability(ft)
        # Layer 2 held constant (org context doesn't change)
        p = role_profiles.loc[title]
        kh_f = min(100, kh_base * p.span_multiplier
                   * cross_functional_breadth(p.get('l1_functions_in_scope', 1)))
        ps_f = min(100, ps_base * 0.60 + p.reporting_level * 0.40)
        ac_f = min(100, ac_base * 0.30 + p.decision_authority * 0.35
                   + p.budget_magnitude * 0.35)
        role_profiles.at[title, 'future_hay'] = round(
            kh_f * 0.4 + ps_f * 0.3 + ac_f * 0.3, 1)
    else:
        role_profiles.at[title, 'future_hay'] = 0   # role eliminated

# Identify roles that shift levels
role_profiles['future_level'] = role_profiles.future_hay.apply(hay_to_level)
shifted = role_profiles[role_profiles.suggested_level != role_profiles.future_level]
print(f"\n{len(shifted)} roles shift job levels in the future state")

# Show which way they shift
for title, p in shifted.iterrows():
    direction = '↓ DOWNGRADE' if p.future_hay < p.hay_composite else '↑ UPGRADE'
    print(f"  {title[:40]:<40} {p.suggested_level} → {p.future_level} "
          f"({p.hay_composite:.0f} → {p.future_hay:.0f}) {direction}")
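To make the Layer 1 / Layer 2 interaction concrete, here is a small worked example that calls the scalar helpers from Steps 5e and 5g. All input values are hypothetical, and the three base scores stand in for what compute_know_how, compute_problem_solving, and compute_accountability would return for a real role.

# Hypothetical role: a credit-risk team lead with 4 direct reports,
# a $250k approval limit, a $2M budget, 4 levels from the CEO.
# Assume Layer 1 produced these base scores:
kh_base, ps_base, ac_base = 52.0, 48.0, 45.0

span_mult = span_of_control_multiplier(4)      # 1.15 (team lead)
xfunc_mult = cross_functional_breadth(1)       # 1.0  (single L1 function)
da = decision_authority_score(250_000)         # 50   (standardized)
bm = budget_magnitude_score(2_000_000)         # 63 on the log scale
rl = reporting_level_score(4)                  # 50

kh = min(100, kh_base * span_mult * xfunc_mult)        # 59.8
ps = min(100, ps_base * 0.60 + rl * 0.40)              # 48.8
ac = min(100, ac_base * 0.30 + da * 0.35 + bm * 0.35)  # ~53.1
composite = kh * 0.40 + ps * 0.30 + ac * 0.30          # ~54.5
print(hay_to_level(composite))                 # 'Manager / Senior Specialist'

Note how modest organizational context lifts a task-only composite of 48.3 (the same weights applied to the base scores) into the next band; this is exactly the layer2_delta that Step 5h surfaces.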
6

Generate Hierarchy Redesign Outputs

Use role profiles and Hay evaluations to produce actionable deliverables for workforce transformation.

# Step 6: Analytical outputs

# ── A. Skill gap matrix ──
# load_lms_data() is a placeholder for your LMS export loader; it should
# return a job_title × skill matrix of current capabilities
current_skills = load_lms_data()
required_skills = (merged.explode('skills_required')
                   .groupby(['job_title', 'skills_required']).size()
                   .unstack(fill_value=0))
gap_matrix = required_skills.subtract(current_skills, fill_value=0)

# ── B. Role consolidation candidates ──
from itertools import combinations
from sklearn.metrics.pairwise import cosine_similarity

# Build task vectors per role (binary: does the role include the task?)
role_task_matrix = (merged.groupby(['job_title', 'task_id']).size()
                    .unstack(fill_value=0))
sim = cosine_similarity(role_task_matrix)
sim_df = pd.DataFrame(sim, index=role_task_matrix.index,
                      columns=role_task_matrix.index)

# Identify role pairs with >60% task overlap
for r1, r2 in combinations(role_profiles.index, 2):
    if sim_df.loc[r1, r2] > 0.6:
        fte = role_profiles.loc[[r1, r2], 'fte_count'].sum()
        print(f"Consolidate: {r1} + {r2} "
              f"(similarity: {sim_df.loc[r1, r2]:.0%}, FTEs: {fte})")

# ── C. Hay evaluation report ──
print("\n=== HAY EVALUATION REPORT ===")
for title, p in role_profiles.iterrows():
    print(f"\n{'='*60}")
    print(f"ROLE: {title}")
    print(f"Current Grade: {p.current_grade} | Suggested Level: {p.suggested_level}")
    print(f"{'─'*60}")
    print(f"KNOW-HOW: {p.know_how:>6.1f}")
    print(f"  Technical Depth: {p.unique_skills} skills, "
          f"Bloom avg={p.mean_bloom:.1f}, max={p.max_bloom}")
    print(f"  Mgmt Breadth: {p.l2_breadth} L2 processes, "
          f"{p.l3_breadth} L3 activities")
    print(f"PROBLEM SOLVING: {p.problem_solving:>6.1f}")
    print(f"  Thinking Env: beta={p.beta:.3f} "
          f"(lower = more novel, E0-heavy)")
    print(f"  Challenge: Bloom std={p.bloom_std:.1f}, "
          f"{p.pct_cross_functional:.0%} cross-functional")
    print(f"ACCOUNTABILITY: {p.accountability:>6.1f}")
    print(f"  Freedom to Act: {(1 - p.pct_regulatory) * 100:.0f}% non-regulatory")
    print(f"  Defense Line: {p.primary_defense_line}")
    print(f"  Importance: {p.mean_importance:.1f}/5")
    print(f"{'─'*60}")
    print(f"HAY COMPOSITE: {p.hay_composite:>6.1f} → {p.suggested_level}")
    if 'future_hay' in p and p.future_hay != p.hay_composite:
        print(f"FUTURE STATE: {p.future_hay:>6.1f} → {p.future_level}")

# ── D. SWP scenario model with Hay impact ──
def model_scenario(roles_df, merged_df, automate_threshold):
    """Model headcount and Hay-level shifts at a given automation threshold."""
    at_risk = roles_df[roles_df.pct_automate > automate_threshold]
    if at_risk.empty:
        return {'roles_affected': 0, 'fte_impact': 0, 'level_shifts': 0,
                'skills_at_risk': pd.Series(dtype=int)}
    fte_impact = at_risk.fte_count.sum() * at_risk.pct_automate.mean()
    level_shifts = (at_risk.suggested_level != at_risk.future_level).sum()
    return {
        'roles_affected': len(at_risk),
        'fte_impact': round(fte_impact),
        'level_shifts': level_shifts,
        'skills_at_risk': merged_df[
            merged_df.job_title.isin(at_risk.index)
            & (merged_df.ai_disposition == 'Automate')
        ].explode('skills_required')['skills_required'].value_counts().head(10)
    }

scenarios = {
    'Conservative (>50% auto)': model_scenario(role_profiles, merged, 0.5),
    'Balanced (>30% auto)':     model_scenario(role_profiles, merged, 0.3),
    'Aggressive (>15% auto)':   model_scenario(role_profiles, merged, 0.15),
}
for name, s in scenarios.items():
    print(f"\n{name}: {s['roles_affected']} roles, ~{s['fte_impact']} FTEs, "
          f"{s['level_shifts']} level shifts")
    print(f"  Top skills at risk: {', '.join(s['skills_at_risk'].index[:5])}")

# ── E. Export role profiles for Hay Guide Chart input ──
export_cols = ['task_count', 'fte_count', 'current_grade',
               # Layer 1 (task-derived)
               'kh_task_only', 'ps_task_only', 'ac_task_only', 'hay_task_only',
               # Layer 2 context
               'span_multiplier', 'decision_authority',
               'budget_magnitude', 'reporting_level',
               # Combined Hay scores
               'know_how', 'problem_solving', 'accountability', 'hay_composite',
               'suggested_level', 'layer2_delta',
               # Future state
               'future_hay', 'future_level',
               # AI exposure
               'beta', 'pct_automate', 'pct_e0', 'pct_e1', 'pct_e2']
export_df = role_profiles[[c for c in export_cols if c in role_profiles.columns]]
export_df.to_csv('hay_evaluation_export.csv')
print(f"\nExported {len(export_df)} role evaluations to hay_evaluation_export.csv")
print(f"Columns: {list(export_df.columns)}")
7

Validate & Iterate

Quality assurance steps before presenting results to stakeholders.

  • SME Review: Have business leads review the role–task mapping for their function. Flag tasks that are missing, misattributed, or obsolete.
  • Bloom's Calibration: Spot-check 10% of tasks per L1 function to confirm cognitive complexity ratings match SME judgment.
  • E0/E1/E2 Validation: Run the AIOE cross-validation (Step 4f). Compute the Spearman rank correlation between occupation-level β and published AIOE z-scores. Directional agreement (both rankings ordering occupations similarly) is the validation target. If the rank correlation is weak, review the E0/E1/E2 keyword patterns for your domain-specific terminology. A sketch of this check, along with the coverage test and defense line audit below, follows this list.
  • Hay Cross-Check: Compare computed Hay composites against existing Korn Ferry evaluations, where available. Large discrepancies (>15 points) indicate either scoring issues or genuinely misgraded roles. Use the Step 6 export (hay_evaluation_export.csv) for the side-by-side comparison.
  • Coverage Test: Confirm that every internal job title maps to at least 5 reference tasks. Titles with <5 matches may indicate gaps in the inventory or misclassification.
  • Defense Line Audit: Verify no role mixes 1st-line and 2nd/3rd-line tasks. Flag violations for review with Risk and Compliance.
  • Future-State Reasonableness: Review roles where the future-state Hay level differs from current grade by >2 levels. These are candidates for accelerated reskilling or managed transition.
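A minimal sketch of three of these checks, assuming the Step 5 objects (merged, role_profiles) are in scope, that task records carry an onet_code column (adjust to your export's field name), and that aioe_zscores is a Series of published AIOE z-scores indexed by O*NET code that you supply from your own source:

from scipy.stats import spearmanr   # scipy ships as a scikit-learn dependency

# E0/E1/E2 validation: rank-correlate occupation-level beta vs AIOE z-scores
occ_beta = (merged.groupby('onet_code')['ai_exposure_class']
            .apply(lambda x: ((x == 'E1').sum() + 0.5 * (x == 'E2').sum())
                   / len(x)))
common = occ_beta.index.intersection(aioe_zscores.index)
rho, pval = spearmanr(occ_beta.loc[common], aioe_zscores.loc[common])
print(f"Spearman rho={rho:.2f} (p={pval:.3f}) over {len(common)} occupations")

# Coverage test: every internal title should map to >=5 reference tasks
thin = role_profiles[role_profiles.task_count < 5]
print(f"{len(thin)} titles with <5 mapped tasks:", list(thin.index))

# Defense line audit: flag roles mixing 1st-line with 2nd/3rd-line tasks
lines_per_role = merged.groupby('job_title')['defense_line'].agg(set)
mixed = lines_per_role[lines_per_role.apply(
    lambda s: ('1st' in s) and len({'2nd', '3rd'} & s) > 0)]
print(f"{len(mixed)} roles mix defense lines:", list(mixed.index))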

Pipeline Architecture Summary

Data Flow: Reference Inventory (JSON) + HRIS Export + Job Descriptions + LMS Data → Role–Task Mapping (manual + fuzzy match) → Org-Specific Task Inventory (merged) → E0/E1/E2 Classification (cross-validated against AIOE) → Role Profiles (task-aggregated) → Hay Factor Computation (Know-How, Problem Solving, Accountability) → Analytical Outputs (Hay evaluations, future-state modeling, skill gap matrices, consolidation candidates, SWP scenarios)
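Read as code, the flow is a strictly linear pipeline. The skeleton below restates it as a call sequence; every function name is an illustrative placeholder for the corresponding step's code, not a real API.

all_tasks     = load_reference_inventory('inventory.json')             # Step 1: JSON export
role_task_map = map_roles_to_tasks(hris, job_descriptions, all_tasks)  # Step 2: manual + fuzzy match
merged        = build_org_inventory(role_task_map, all_tasks, hris)    # Step 3: merged org inventory
merged        = classify_ai_exposure(merged)                           # Step 4: E0/E1/E2 + AIOE check
role_profiles = compute_hay_profiles(merged)                           # Step 5: KH / PS / AC
generate_outputs(role_profiles, merged)                                # Step 6: reports and exports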

Estimated Effort

4–8 weeks for a mid-sized bank (<20k FTEs). Primary bottleneck is SME validation of role–task mappings (Step 2).

Team Composition

1 data scientist (pipeline), 1 HR/workforce planning analyst (mapping), SMEs from each L1 function (validation), 1 project lead.

Technology Stack

Python (pandas, scikit-learn), any SQL database for storage, BI tool (Power BI / Tableau) for visualization, Excel for SME worksheets.

Getting Started: Export the full JSON from the Export Center. That file contains the complete reference inventory, ready to load into your pipeline as Step 1. The Role Mapping Template CSV provides a blank worksheet for Step 2.
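A minimal loader sketch for that Step 1 ingestion; the filename and the top-level JSON structure are assumptions, so adjust both to match the actual export.

import json
import pandas as pd

# Load the full-inventory JSON export into the all_tasks frame used by Step 2+
with open('task_inventory_full.json') as f:       # filename assumed
    payload = json.load(f)

# Handle either a bare list of task records or a {'tasks': [...]} wrapper
records = payload['tasks'] if isinstance(payload, dict) else payload
all_tasks = pd.DataFrame(records)
print(len(all_tasks), 'reference tasks loaded')   # expect 4,075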
Export Center

Download the full inventory or a filtered subset.

Full Task Inventory (CSV)

4,075 tasks × 18 fields

Filtered View (CSV)

Tasks matching current Explorer filters

Full Task Inventory (JSON)

Complete dataset for system integration

Role Mapping Template (CSV)

Blank worksheet for mapping your roles