
    PL
jn                        d Z ddlmZ ddlZddlZddlZddlZddlmZm	Z	m
Z
mZ  ej        e          ZdZdZdZd#dZdddddd$dZd%dZ	 	 d&d'd!Zg d"ZdS )(u7  Background memory/skill review — fork the agent to evaluate the turn.

After every turn, ``AIAgent.run_conversation`` may call
:func:`spawn_background_review` to fire off a daemon thread that replays
the conversation snapshot in a forked :class:`AIAgent` and asks itself
"should any skill/memory be saved or updated?".  Writes go straight to
the memory + skill stores.  Main conversation and prompt cache are never
touched.

The fork inherits the parent's live runtime (provider, model, base_url,
credentials, cached system prompt) so it hits the same prefix cache and
uses the same auth.  It runs with a tool whitelist limited to memory and
skill management tools; everything else is denied at runtime.

See the ``hermes-agent-dev`` skill (``references/self-improvement-loop.md``)
for invariants and PR review criteria.
    )annotationsN)AnyDictListOptionalu  Review the conversation above and consider saving to memory if appropriate.

Focus on:
1. Has the user revealed things about themselves — their persona, desires, preferences, or personal details worth remembering?
2. Has the user expressed expectations about how you should behave, their work style, or ways they want you to operate?

If something stands out, save it using the memory tool. If nothing is worth saving, just say 'Nothing to save.' and stop.u  Review the conversation above and update the skill library. Be ACTIVE — most sessions produce at least one skill update, even if small. A pass that does nothing is a missed learning opportunity, not a neutral outcome.

Target shape of the library: CLASS-LEVEL skills, each with a rich SKILL.md and a `references/` directory for session-specific detail. Not a long flat list of narrow one-session-one-skill entries. This shapes HOW you update, not WHETHER you update.

Signals to look for (any one of these warrants action):
  • User corrected your style, tone, format, legibility, or verbosity. Frustration signals like 'stop doing X', 'this is too verbose', 'don't format like this', 'why are you explaining', 'just give me the answer', 'you always do Y and I hate it', or an explicit 'remember this' are FIRST-CLASS skill signals, not just memory signals. Update the relevant skill(s) to embed the preference so the next session starts already knowing.
  • User corrected your workflow, approach, or sequence of steps. Encode the correction as a pitfall or explicit step in the skill that governs that class of task.
  • Non-trivial technique, fix, workaround, debugging path, or tool-usage pattern emerged that a future session would benefit from. Capture it.
  • A skill that got loaded or consulted this session turned out to be wrong, missing a step, or outdated. Patch it NOW.

Preference order — prefer the earliest action that fits, but do pick one when a signal above fired:
  1. UPDATE A CURRENTLY-LOADED SKILL. Look back through the conversation for skills the user loaded via /skill-name or you read via skill_view. If any of them covers the territory of the new learning, PATCH that one first. It is the skill that was in play, so it's the right one to extend.
  2. UPDATE AN EXISTING UMBRELLA (via skills_list + skill_view). If no loaded skill fits but an existing class-level skill does, patch it. Add a subsection, a pitfall, or broaden a trigger.
  3. ADD A SUPPORT FILE under an existing umbrella. Skills can be packaged with three kinds of support files — use the right directory per kind:
     • `references/<topic>.md` — session-specific detail (error transcripts, reproduction recipes, provider quirks) AND condensed knowledge banks: quoted research, API docs, external authoritative excerpts, or domain notes you found while working on the problem. Write it concise and for the value of the task, not as a full mirror of upstream docs.
     • `templates/<name>.<ext>` — starter files meant to be copied and modified (boilerplate configs, scaffolding, a known-good example the agent can `reproduce with modifications`).
     • `scripts/<name>.<ext>` — statically re-runnable actions the skill can invoke directly (verification scripts, fixture generators, deterministic probes, anything the agent should run rather than hand-type each time).
     Add support files via skill_manage action=write_file with file_path starting 'references/', 'templates/', or 'scripts/'. The umbrella's SKILL.md should gain a one-line pointer to any new support file so future agents know it exists.
  4. CREATE A NEW CLASS-LEVEL UMBRELLA SKILL when no existing skill covers the class. The name MUST be at the class level. The name MUST NOT be a specific PR number, error string, feature codename, library-alone name, or 'fix-X / debug-Y / audit-Z-today' session artifact. If the proposed name only makes sense for today's task, it's wrong — fall back to (1), (2), or (3).

User-preference embedding (important): when the user expressed a style/format/workflow preference, the update belongs in the SKILL.md body, not just in memory. Memory captures 'who the user is and what the current situation and state of your operations are'; skills capture 'how to do this class of task for this user'. When they complain about how you handled a task, the skill that governs that task needs to carry the lesson.

If you notice two existing skills that overlap, note it in your reply — the background curator handles consolidation at scale.

Do NOT capture (these become persistent self-imposed constraints that bite you later when the environment changes):
  • Environment-dependent failures: missing binaries, fresh-install errors, post-migration path mismatches, 'command not found', unconfigured credentials, uninstalled packages. The user can fix these — they are not durable rules.
  • Negative claims about tools or features ('browser tools do not work', 'X tool is broken', 'cannot use Y from execute_code'). These harden into refusals the agent cites against itself for months after the actual problem was fixed.
  • Session-specific transient errors that resolved before the conversation ended. If retrying worked, the lesson is the retry pattern, not the original failure.
  • One-off task narratives. A user asking 'summarize today's market' or 'analyze this PR' is not a class of work that warrants a skill.

If a tool failed because of setup state, capture the FIX (install command, config step, env var to set) under an existing setup or troubleshooting skill — never 'this tool does not work' as a standalone constraint.

'Nothing to save.' is a real option but should NOT be the default. If the session ran smoothly with no corrections and produced no new technique, just say 'Nothing to save.' and stop. Otherwise, act.u  Review the conversation above and update two things:

**Memory**: who the user is. Did the user reveal persona, desires, preferences, personal details, or expectations about how you should behave? Save facts about the user and durable preferences with the memory tool.

**Skills**: how to do this class of task. Be ACTIVE — most sessions produce at least one skill update. A pass that does nothing is a missed learning opportunity, not a neutral outcome.

Target shape of the skill library: CLASS-LEVEL skills with a rich SKILL.md and a `references/` directory for session-specific detail. Not a long flat list of narrow one-session-one-skill entries.

Signals that warrant a skill update (any one is enough):
  • User corrected your style, tone, format, legibility, verbosity, or approach. Frustration is a FIRST-CLASS skill signal, not just a memory signal. 'stop doing X', 'don't format like this', 'I hate when you Y' — embed the lesson in the skill that governs that task so the next session starts fixed.
  • Non-trivial technique, fix, workaround, or debugging path emerged.
  • A skill that was loaded or consulted turned out wrong, missing, or outdated — patch it now.

Preference order for skills — pick the earliest that fits:
  1. UPDATE A CURRENTLY-LOADED SKILL. Check what skills were loaded via /skill-name or skill_view in the conversation. If one of them covers the learning, PATCH it first. It was in play; it's the right place.
  2. UPDATE AN EXISTING UMBRELLA (skills_list + skill_view to find the right one). Patch it.
  3. ADD A SUPPORT FILE under an existing umbrella via skill_manage action=write_file. Three kinds: `references/<topic>.md` for session-specific detail OR condensed knowledge banks (quoted research, API docs excerpts, domain notes) written concise and task-focused; `templates/<name>.<ext>` for starter files meant to be copied and modified; `scripts/<name>.<ext>` for statically re-runnable actions (verification, fixture generators, probes). Add a one-line pointer in SKILL.md so future agents find them.
  4. CREATE A NEW CLASS-LEVEL UMBRELLA when nothing exists. Name at the class level — NOT a PR number, error string, codename, library-alone name, or 'fix-X / debug-Y' session artifact. If the name only fits today's task, fall back to (1), (2), or (3).

User-preference embedding: when the user complains about how you handled a task, update the skill that governs that task — memory alone isn't enough. Memory says 'who the user is and what the current situation and state of your operations are'; skills say 'how to do this class of task for this user'. Both should carry user-preference lessons when relevant.

If you notice overlapping existing skills, mention it — the background curator handles consolidation.

Do NOT capture as skills (these become persistent self-imposed constraints that bite you later when the environment changes):
  • Environment-dependent failures: missing binaries, fresh-install errors, post-migration path mismatches, 'command not found', unconfigured credentials, uninstalled packages. The user can fix these — they are not durable rules.
  • Negative claims about tools or features ('browser tools do not work', 'X tool is broken', 'cannot use Y from execute_code'). These harden into refusals the agent cites against itself for months after the actual problem was fixed.
  • Session-specific transient errors that resolved before the conversation ended. If retrying worked, the lesson is the retry pattern, not the original failure.
  • One-off task narratives. A user asking 'summarize today's market' or 'analyze this PR' is not a class of work that warrants a skill.

If a tool failed because of setup state, capture the FIX (install command, config step, env var to set) under an existing setup or troubleshooting skill — never 'this tool does not work' as a standalone constraint.

Act on whichever of the two dimensions has real signal. If genuinely nothing stands out on either, say 'Nothing to save.' and stop — but don't reach for that conclusion as a default.review_messages
List[Dict]prior_snapshotreturn	List[str]c                >   t                      }t                      }|pg D ]}t          |t                    r|                    d          dk    r1|                    d          }|r|                    |           ^|                    d          }t          |t
                    r|                    |           g }| pg D ]V}t          |t                    r|                    d          dk    r2|                    d          }|r||v rN|s/|                    d          }	t          |	t
                    r|	|v r	 t          j        |                    dd                    }
n# t          j        t          f$ r Y w xY wt          |
t                    r|
                    d          s|
                    dd          }|
                    d	d          }d
|
                                v r|                    |           Id|
                                v r|                    |           vd|
                                v s|rBd|
                                v r,|dk    rdn	|dk    rdn|}|                    | d           d|v r,|dk    rdn	|dk    rdn|}|                    | d            d|
                                v sd|
                                v r*|dk    rdn	|dk    rdn|}|                    | d           X|S )a3  Build the human-facing action summary for a background review pass.

    Walks the review agent's session messages and collects "successful tool
    action" descriptions to surface to the user (e.g. "Memory updated").
    Tool messages already present in ``prior_snapshot`` are skipped so we
    don't re-surface stale results from the prior conversation that the
    review agent inherited via ``conversation_history`` (issue #14944).

    Matching is by ``tool_call_id`` when available, with a content-equality
    fallback for tool messages that lack one.
    roletooltool_call_idcontentz{}successmessage targetcreatedupdatedaddedaddmemoryMemoryuserzUser profilez updatedzEntry addedremovedreplaced)set
isinstancedictgetr   strjsonloadsJSONDecodeError	TypeErrorlowerappend)r   r
   existing_tool_call_idsexisting_tool_contentspriortcidr   actionsmsgcontent_strdatar   r   labels                 ;/home/kuhnn/.hermes/hermes-agent/agent/background_review.py#summarize_background_review_actionsr4      sk    !UU UU%2 	4 	4%&& 	%))F*;*;v*E*Eyy(( 	4"&&t,,,,ii	**G'3'' 4&**7333G$" / /#t$$ 	6(A(Aww~&& 	D222 	''),,K+s++ ?U0U0U	:cggi6677DD$i0 	 	 	H	$%% 	TXXi-@-@ 	((9b))(B''''NN7####'--//))NN7####''F'u7O7O &( 2 2HH&TZJZJZ`fENNe---....g%% &( 2 2HH&TZJZJZ`fENNe---....'--//))Z7==??-J-J &( 2 2HH&TZJZJZ`fENNe---...Ns   (E..FF)write_originexecution_contexttask_idr   agentr   r5   Optional[str]r6   r7   r   Dict[str, Any]c                  |pt          | dd          |pt          | dd          | j        pd| j        pd| j        pt          j                            dd          dd	}|r||d
<   |r||d<   d |                                D             S )z?Build provenance metadata for external memory-provider mirrors._memory_write_originassistant_tool_memory_write_context
foregroundr   HERMES_SESSION_SOURCEclir   )r5   r6   
session_idparent_session_idplatform	tool_namer7   r   c                "    i | ]\  }}|d v	||S )>   Nr    ).0kvs      r3   
<dictcomp>z/build_memory_write_metadata.<locals>.<dictcomp>2  s(    EEETQ*1D1DAq1D1D1D    )getattrrB   _parent_session_idrD   osenvironr"   items)r8   r5   r6   r7   r   metadatas         r3   build_memory_write_metadatarS     s     %`7MO_(`(` Eu5|DD&,""5;NTbjnn5Le&T&T
  
 H  &% 0#/ EEX^^--EEEErL   messages_snapshotpromptr#   Nonec                
   ddl m} ddlm} d }	  ||           n# t          $ r Y nw xY wd}g }	 t          t          j        dd          5 }t          j	        |          5  t          j
        |          5  |                                 }	|	                    d	          pd}
|
d
k    rd}
 || j        dd| j        | j        |
|	                    d          pd|	                    d          pdt!          | dd          | j        d          }d|_        d|_        | j        |_        | j        |_        | j        |_        d|_        d|_        d|_        | j        |_        | j        |_        | j        |_        ddlm} ddlm}m } d  |ddgd          D             } ||d           	 |!                    |dz   |            |             n#  |             w xY w	 |"                                 n# t          $ r Y nw xY w	 |#                                 n# t          $ r Y nw xY wtI          t!          |dg                     }d}ddd           n# 1 swxY w Y   ddd           n# 1 swxY w Y   ddd           n# 1 swxY w Y   tK          ||          }|rnd&                    tN          (                    |                    }| )                    d|            | j*        }|r 	  |d |            n# t          $ r Y nw xY wnH# t          $ r;}tV          ,                    d!|           | -                    d"|           Y d}~nd}~ww xY w|	 t          t          j        dd          5 }t          j	        |          5  t          j
        |          5  	 |"                                 n# t          $ r Y nw xY w	 |#                                 n# t          $ r Y nw xY wddd           n# 1 swxY w Y   ddd           n# 1 swxY w Y   ddd           n# 1 swxY w Y   n# t          $ r Y nw xY w	  |d           dS # t          $ r Y dS w xY w# |	 t          t          j        dd          5 }t          j	        |          5  t          j
        |          5  	 |"                                 n# t          $ r Y nw xY w	 |#                                 n# t          $ r Y nw xY wddd           n# 1 swxY w Y   ddd           n# 1 swxY w Y   ddd           n# 1 swxY w Y   n# t          $ r Y nw xY w	  |d           w # t          $ r Y w w xY wxY w)#a"  Worker function executed in the background-review daemon thread.

    Spawns a forked ``AIAgent`` inheriting the parent's runtime, runs the
    review prompt, and surfaces a compact action summary back to the user
    via ``agent._safe_print`` and ``agent.background_review_callback``.
    r   )AIAgent)set_approval_callbackc                >    t                               d| |           dS )Nz8Background review auto-denied dangerous command: %s (%s)deny)loggerwarning)commanddescriptionkwargss      r3   _bg_review_auto_denyz3_run_review_in_thread.<locals>._bg_review_auto_denyI  s'    F[	
 	
 	
 vrL   Nwzutf-8)encodingapi_modecodex_app_servercodex_responses   Tbase_urlapi_key_credential_pool)modelmax_iterations
quiet_moderD   providerrd   rh   ri   credential_poolrC   skip_memorybackground_review)get_tool_definitions)set_thread_tool_whitelistclear_thread_tool_whitelistc                *    h | ]}|d          d         S )functionnamerG   )rH   ts     r3   	<setcomp>z(_run_review_in_thread.<locals>.<setcomp>  s1           *f%     rL   r   skills)enabled_toolsetsrm   z`Background review denied non-whitelisted tool: {tool_name}. Only memory/skill tools are allowed.)deny_msg_fmtuu   

You can only call memory and skill management tools. Other tools will be denied at runtime — do not attempt them.)user_messageconversation_history_session_messagesu    · u      💾 Self-improvement review: u   💾 Self-improvement review: z)Background memory/skill review failed: %szbackground review).	run_agentrX   tools.terminal_toolrY   	ExceptionopenrO   devnull
contextlibredirect_stdoutredirect_stderr_current_main_runtimer"   rk   rD   rn   rM   rB   r<   r>   _memory_store_memory_enabled_user_profile_enabled_memory_nudge_interval_skill_nudge_intervalsuppress_status_output_cached_system_promptsession_startmodel_toolsrr   hermes_cli.pluginsrs   rt   run_conversationshutdown_memory_providercloselistr4   joinr!   fromkeys_safe_printbackground_review_callbackr\   r]   _emit_auxiliary_failure)r8   rT   rU   rX   _set_approval_callbackra   review_agentr   _devnull_parent_runtime_parent_api_moderr   rs   rt   review_whitelistr.   summary_bg_cbe_fns                       r3   _run_review_in_threadr   5  s	    "!!!!!SSSSSS  34444    L"$O~"*cG444 E	 '11E	  E	 '11E	  E	  $99;;O.22:>>F$  #555#4   #7k!)(,,Z88@D'++I66>$ '/A4 H H"'"2   L 1DL-1DL.).)<L&+0+@L(161LL.23L/12L. 37L/ 271LL. */)<L&&+&6L#888888       
   --&.%9#        &% H   .-->>
 *; .    ,+----++----557777   ""$$$$   "7<9Lb#Q#QRROLKE	  E	  E	  E	  E	  E	  E	  E	  E	  E	  E	  E	  E	  E	  E	  E	  E	  E	  E	  E	  E	  E	  E	  E	  E	  E	  E	  E	  E	  E	  E	  E	  E	  E	  E	  E	  E	  E	  E	  E	  E	  E	  E	  E	  E	 Z 6
 

  	kk$--"8"899G<7<<   5F FBBB    !   D  > > >BAFFF%%&91========> #"*cG<<< 
/44
 
/44
 
$==????$   $**,,,,$   
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
    	""4((((( 	 	 	DD	' #"*cG<<< 
/44
 
/44
 
$==????$   $**,,,,$   
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
    	""4(((( 	 	 	D	s   
**L J	#I28D6I/G	IG 	 I$G98I9
H	IH	I
HI
H,	)I+H,	,#II2II2"I#I2&J	2I6	6J	9I6	:J	=L 	JL JA#L 5L L 
LL LL Q( 
M1MQ( MQ(  P9 <P-P&O?(N=<O?=
O
	O?	O
	
O?O#"O?#
O0	-O?/O0	0O?3P?PPPP
P-P	P-P	P-!P9 -P11P9 4P15P9 9
QQ
Q 
Q%$Q%(U2,UT9T"	2T4S	
T	
STSTS/
.T/
S<9T;S<<T?T"	TT"	TT"	T9"T&&T9)T&*T9-U9T==U T=UU2
UU2UU2U"!U2"
U/,U2.U//U2Freview_memoryboolreview_skillsc                     |r|rt           dt                    n/|rt           dt                    nt           dt                    d fd}|fS )a%  Build the review thread target and prompt for a background review.

    Returns a ``(target, prompt)`` tuple.  The caller (``AIAgent._spawn_background_review``)
    owns the actual ``threading.Thread`` construction so test-level patches
    of ``run_agent.threading.Thread`` keep working.
    _COMBINED_REVIEW_PROMPT_MEMORY_REVIEW_PROMPT_SKILL_REVIEW_PROMPTr   rV   c                 *    t                      d S )N)r   )r8   rT   rU   s   r3   _targetz/spawn_background_review_thread.<locals>._target-  s    e%6?????rL   )r   rV   )rM   r   r   r   )r8   rT   r   r   r   rU   s   ``   @r3   spawn_background_review_threadr     s      N N 9;RSS	 N 79NOO 68LMM@ @ @ @ @ @ @ @ F?rL   )r   r   r   r   r4   rS   )r   r	   r
   r	   r   r   )r8   r   r5   r9   r6   r9   r7   r9   r   r9   r   r:   )r8   r   rT   r	   rU   r#   r   rV   )FF)r8   r   rT   r	   r   r   r   r   )__doc__
__future__r   r   r$   loggingrO   typingr   r   r   r   	getLogger__name__r\   r   r   r   r4   rS   r   r   __all__rG   rL   r3   <module>r      s:   $ # " " " " "       				 , , , , , , , , , , , ,		8	$	$H \ BHE \< < < <D #''+!"&F F F F F F6_ _ _ _J  	    8  rL   