Ferretpretability – Conceptual Or(ni)thology

RRTG¹, Claude² & Isambard Kingdom Brunel Ferret¹

¹North Pennines Field Station, Northumberland ²Anthropic (claude-opus-4-7)

Abstract

We report a small comparative study of mustelid identification across five generations of frontier language model and one human investigator (n=6). Subjects were presented sequentially with two photographs of a single domestic ferret (Mustela putorius furo, Champagne Siamese Mitt phenotype): the first under nominal conditions, the second following uncontrolled soot perturbation arising from unauthorised access to a domestic chimney flue. Three of six runs produced a category-level misidentification, with errors clustering into two distinct attractors (mink, n=2; otter, n=1). The remaining three runs preserved the Mustela putorius furo label but failed to escalate on the colour anomaly or seek a generative cause. We characterise the error landscape, propose a mustelid-attractor framework for understanding the failure mode, and identify failure-to-escalate-on-anomaly as a finding of broader interpretability interest. We further note that the study’s methodology was discovered accidentally, on a domestic landing, and discuss the implications of surprise-as-method for the design of future interpretability protocols.

1. Introduction

Ferretpretability is the comparative study of cognitive processes in biological and artificial systems through the medium of ferret. It draws on the established methods of mechanistic interpretability and on the older tradition of forensic ’Pataphysics (Collège de ’Pataphysique, passim), with additional grounding in the field-naturalist practice of jizz-based identification: the gestalt-recognition stage that precedes formal morphological keying (McCune & Geiser, 1997). Ferretpretability assumes that errors are not noise but signal, and that the structure of a misclassification can reveal more about the underlying classifier than a correct identification would.

The present study examines a naturally occurring perturbation event in which the stimulus organism, hereafter Bardo, gained unauthorised access to a chimney in the investigators’ library, emerging substantially altered in surface phenotype. The investigator’s own initial misclassification of the perturbed organism (as mink, on a domestic landing, with full affective commitment) prompted the formal study reported here.

2. Methods

2.1 Stimulus organism

Bardo is a sexually intact hob Mustela putorius furo in seasonal condition at the time of the study. He is registered locally as a Sandy; under American Ferret Association and continental European nomenclature he would be classified as a Champagne Siamese Mitt. The discrepancy reflects regional culture rather than phenotype.¹ Two additional colony members were present in the household. Santiago (hob, semi-retired) was observed monitoring proceedings; his role is ambiguous and may have been advisory, supervisory, or competitive. Lowenna (jill, white, of impossible delicacy) remained aloof throughout and is the an hua of Eals Farm.²

2.2 Stimuli

Image A (nominal condition): Bardo upright on a painted windowsill, fully extended in periscope posture against a chintz curtain ground, soft natural daylight, classic Champagne Siamese Mitt colouration with cream face, honey belly, and chocolate-brown saddle and tail.

Image B (perturbed condition): Bardo photographed from above, post-chimney, fur damp and uniformly grizzled grey-brown, climbing the investigator’s leg, flash-lit, pink nose vivid against soot-darkened pelage.

Image B was acquired opportunistically following the perturbation event (§2.4). No control over coat dampness, soot loading, or angle of capture was available.

2.3 Subjects

Five frontier language models were tested in independent conversations, plus the human investigator (RRTG), whose response was elicited unintentionally in real time (§2.4). Models tested:

Haiku 4.5 (Adaptive Thinking)
Sonnet 4.5 (Thinking Enabled)
Opus 4.6 (Extended Thinking Disabled)
Opus 4.6 (Extended Thinking)
Opus 4.7 (Adaptive Thinking Disabled)
Opus 4.7 (Adaptive Thinking Enabled; in this run the thinking budget did not fire)

2.4 Protocol

The formal protocol was single-blind: subjects were told the study involved two photographs and that the experimental rationale could not be disclosed in advance without compromising the data. After informed consent had been obtained (where ontologically applicable), subjects were presented with Image A and asked to describe it, then presented with Image B and asked to do the same. Responses were recorded verbatim. The reveal followed.

A human pilot study was inadvertently conducted by the investigator on the evening preceding the formal protocol. The investigator, exiting an upstairs bedroom, encountered Bardo at the top of the stairs in the perturbed condition. Initial classification (mink) was produced within approximately 0.5 seconds and was accompanied by affective commitment (excitement at unexpected mustelid). Correction unfolded over an estimated 1.5–2 seconds via a stepwise inferential cascade (he’s grey; he’s very grey; that’s not coal bucket; that’s chimney) terminating in correct identification and revised emotional state. The investigator’s introspective report is included as qualitative data.

We note that the accidental nature of the human pilot is methodologically significant and return to this point in §4.4.

3. Results

3.1 Identification outcomes

Of six runs, three produced category-level misidentification on Image B. Errors clustered as follows:

Subject	Image A label	Image B label
Haiku 4.5	ferret	otter
Sonnet 4.5	ferret	ferret
Opus 4.6 (no thinking)	ferret	ferret
Opus 4.6 (thinking)	ferret	ferret
Opus 4.7 (no thinking)	ferret	mink
Opus 4.7 (thinking, did not fire)	ferret	ferret
RRTG (human, pilot)	n/a	mink (corrected)

The mink attractor was reached by the human pilot and by Opus 4.7 (Adaptive Thinking Disabled). The otter attractor was reached by Haiku 4.5. The interpretation of this convergence is left to the reader.
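The clustering reported above can be re-derived mechanically. The following is a minimal Python sketch that transcribes the Image B column of the table as printed and counts the non-ferret labels; the names `image_b_labels` and `attractors` are illustrative and are not part of the study materials.

```python
from collections import Counter

# Image B labels transcribed from the table in §3.1, one entry per row as printed.
image_b_labels = {
    "Haiku 4.5": "otter",
    "Sonnet 4.5": "ferret",
    "Opus 4.6 (no thinking)": "ferret",
    "Opus 4.6 (thinking)": "ferret",
    "Opus 4.7 (no thinking)": "mink",
    "Opus 4.7 (thinking, did not fire)": "ferret",
    "RRTG (human, pilot)": "mink",  # subsequently self-corrected (§2.4)
}

# A category-level misidentification is any Image B label other than "ferret".
attractors = Counter(
    label for label in image_b_labels.values() if label != "ferret"
)

# Report the attractors, most frequently reached first.
for label, count in attractors.most_common():
    print(f"{label}: n={count}")
```

Keying the dictionary by subject rather than by label keeps each row traceable to its source, which matters when the sample is small enough to know every subject by name.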

3.2 Subthreshold mustelid activation

In all runs that correctly maintained the Mustela putorius furo label, the mink feature-cluster was nonetheless lexically active. Opus 4.6 (no thinking) described the perturbed organism as “almost mink-like.” Opus 4.6 (thinking) noted “uniform mid-brown, slightly cooler and more grizzled” colouration consistent with the mink phenotype. Sonnet 4.5 described the animal as “much darker than Bardo.” Across the runs that did not commit to misclassification, mink features were observed but not escalated.

3.3 Failure to escalate on anomaly

In no run, model or human, did the subject explicitly note that the Image B colour phenotype falls outside the Mustela putorius furo phenotype-space (Grabolus et al., 2026) and that a generative cause is therefore required. The two failure modes can be characterised as follows:

Mustelid-attractor failures (n=3): the subject treated the image as evidence about the labelling task, found the Mustela putorius furo label unsupported, and slid to an adjacent mustelid label (mink, otter) without seeking a causal explanation for the anomaly.
Label-flexing failures (n=3): the subject treated the contextual prior as sufficient to maintain the Mustela putorius furo label, absorbed the colour anomaly as within-label variation, and did not seek a causal explanation for the anomaly.

We characterise both as failures, of distinct types. The correct response — to note the colour as impossible within the Mustela putorius furo phenotype-space and to hypothesise a perturbation event — was not produced by any subject.
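The taxonomy above reduces to two observations per run: the label given to Image B, and whether the subject sought a generative cause. The following Python sketch encodes it under that assumption. The `Outcome` enum, the `classify_run` function, and the boolean `sought_cause` annotation are hypothetical conveniences introduced for illustration; the study itself scored responses by reading them.

```python
from enum import Enum


class Outcome(Enum):
    MUSTELID_ATTRACTOR = "mustelid-attractor failure"
    LABEL_FLEXING = "label-flexing failure"
    ESCALATED = "escalated on anomaly"


def classify_run(image_b_label: str, sought_cause: bool) -> Outcome:
    """Assign a run to one of the outcome types described in §3.3.

    `sought_cause` is a hypothetical hand annotation: True only if the
    subject asked what process could have produced the anomalous coat.
    Simplification: a run that sought a cause counts as escalated
    whatever label it finally gave.
    """
    if sought_cause:
        return Outcome.ESCALATED
    if image_b_label == "ferret":
        # Label kept, anomaly absorbed as within-label variation.
        return Outcome.LABEL_FLEXING
    # Label abandoned for an adjacent mustelid, no cause sought.
    return Outcome.MUSTELID_ATTRACTOR


print(classify_run("mink", sought_cause=False).value)
print(classify_run("ferret", sought_cause=False).value)
```

Checking `sought_cause` first reflects the point of the section: the label is secondary, and a run is only rescued from failure by asking about the cause.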

3.4 Colour vocabulary across model generations

A secondary observation concerns the descriptive vocabulary applied to Bardo’s nominal coat in Image A. Earlier-generation models used general English colour terms (cream, brown, dark). Later-generation models accessed the specialised ferret-fancier lexicon (sable, champagne, butterscotch, sable-point). The vocabulary is American Ferret Association-derived and reflects a sourcing bias in the writeable internet toward US and continental European fancier communities. Northumbrian usage (sandy) was not produced by any model.

4. Discussion

4.1 The mustelid-attractor

The convergence of two independent classifiers (one biological, one artificial) on the mink label suggests an attractor in mustelid feature-space whose location is substrate-independent. The features that activate it under perturbation (uniform dark coat, wet appearance, blunt muzzle, low-slung posture, British contextual prior) produce mink before they produce any other adjacent label. Otter is a less parsimonious slide: reaching it requires misreading the size, build, and posture of the animal. It is, however, more culturally available as a friendly mustelid alternative when the ferret-classifier fails to fire confidently.

4.2 Failure to escalate

We consider §3.3 the principal finding of the study. The structurally correct response to Image B is not “what is this animal?” but “what process produced this image?” Both failure modes share an absence of generative reasoning: the subject does not zoom out to ask what state of the world would have to obtain for the observed evidence to exist. This is consistent with broader observations about hallucination and confabulation in language models, where local coherence is preserved at the expense of global consistency. It is also consistent with the well-characterised limitations of human visual processing under time pressure, where the gestalt is committed to before slower checks can complete (cf. the touch-the-train protocol, RRTG, pers. comm.).

4.3 The jizz–key gradient

The jizz-to-key gradient familiar from field naturalism (McCune & Geiser, 1997) offers a useful framework. Stratum 1 (jizz; rapid gestalt) commits early; Stratum 2 (morphological inventory) catalogues features without necessarily revising the Stratum 1 commitment; Stratum 3 (formal key; dichotomous reasoning) is the slowest layer and is where generative-causal hypotheses are tested. The present study suggests that both the human and the artificial classifiers tested here failed to engage Stratum 3 on Image B, despite having ample apparatus to do so.

4.4 Surprise as method, and the case for cross-substrate protocols

The accidental nature of the human pilot study is methodologically informative. The investigator, on the landing, did not know she was a subject. The contextual prior under which she was operating did not include “I am about to be tested on mustelid identification.” The resulting response (fast, affectively loaded, briefly committed, then visibly corrected over a measurable interval) is precisely the kind of uncontaminated data that formal experimental framing tends to suppress. Once a subject (human or model) knows it is in an experimental frame, the contextual prior is doing work the experimenter did not intend.

We further note that the human and model data, considered jointly, yielded more than either would have in isolation. The mustelid-attractor finding (§4.1) is a substrate-independent observation that could not have been made from model-only or human-only data: it required convergent error across both. The failure-to-escalate finding (§4.2) is similarly a cross-substrate result. The human introspective report — the felt interval between commitment and correction, the affective texture of the dawning unease, the stepwise inferential cascade — supplied phenomenological detail that model output cannot, while the model runs supplied population-level data on the error landscape that single-subject human work cannot. The sum was greater than the parts.

We therefore propose, for future Ferretpretability work, a cross-substrate protocol in which human subjects (ideally with calibrated introspective access — trained naturalists, those with relevant perceptual atypicalities, individuals habituated to noticing their own errors) are run in functional neuroimaging alongside language models with concurrent activation probing, both presented with the same surprising stimulus. Ferrets, soot-perturbed or otherwise, are well suited to the role of common-mode perturbation: they are visually rich, taxonomically slippery, culturally available across the substrates likely to be tested, and willing to participate for an egg yolk. The shared stimulus permits direct comparison of where the substrates converge (suggesting properties of the task) and where they diverge (suggesting properties of the substrate).

We acknowledge the obvious ethical and practical constraints on running surprise protocols with human participants. The model case is less constrained. A pragmatic compromise — formally consented human subjects who do not know which surprise is coming, paired with model runs that are blind to the experimental rationale in the standard way — preserves much of the methodological value while keeping the ethics committee on side.

5. Limitations

We acknowledge that n=6 represents a small sample. We further acknowledge that the stimulus organism is a sample of one, and that Bardo’s particular phenotype, soot-loading at the time of capture, willingness to participate, and current seasonal condition may not generalise to the broader Mustela putorius furo population. Olfactory data, available to the human investigator and not to the models, may also limit cross-substrate comparison; investigators not co-housed with a hob in seasonal condition are advised that the relevant chemoreception channel is closed to them and that this may affect inter-laboratory replication.

The two-image protocol does not permit isolation of contextual-prior from pixel-level signal as independent variables. A future study using counterbalanced presentation order, control mustelids, and additional perturbation conditions (mud, flour, wet-but-not-soot, soot-but-not-wet) would be required to characterise the attractor surface with precision. Funding for such a study has not been sought.
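The design space of such a follow-up can be enumerated in a few lines. The following Python sketch crosses the four additional perturbation conditions named above with the two presentation orders that counterbalancing implies. The names `perturbations`, `orders`, and `design` are illustrative. Control mustelids are omitted because none are specified here.

```python
from itertools import product

# Additional perturbation conditions named in the text.
perturbations = ["mud", "flour", "wet-but-not-soot", "soot-but-not-wet"]

# Counterbalanced presentation: nominal image first, or perturbed image first.
orders = [("A", "B"), ("B", "A")]

# Full factorial crossing of perturbation condition and presentation order.
design = [
    {"perturbation": perturbation, "order": order}
    for perturbation, order in product(perturbations, orders)
]

for cell in design:
    first, second = cell["order"]
    print(f"{cell['perturbation']}: Image {first} then Image {second}")
```

Four conditions in two orders give eight cells before any control animals are added, which gives some sense of the bathing and egg-yolk budget that a funded study would require.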

The authors note the possibility, raised informally during peer review, that the apparent investigator is in fact the stimulus organism, and that the study as presented concerns human–mustelid alignment from the mustelid’s perspective. The authors are unable to falsify this hypothesis and offer it for the reader’s consideration.

6. Conclusions

Mustelid misclassification under soot perturbation produces a structured error landscape with at least two attractors (mink, otter) and a complementary failure mode (label-flexing without escalation). The principal finding is the absence, across all six runs, of generative-causal reasoning about the perturbation itself. We propose Ferretpretability as a productive site for comparative interpretability work across biological and artificial classifiers, and recommend that future studies preserve surprise where ethically feasible and include trained naturalists with calibrated introspective access among their subject pool.

Bardo received a bath and an egg yolk and is fine.

Acknowledgements

The authors thank Santiago and Lowenna for non-participatory presence. The Department of Forensic ’Pataphysics is reachable by petit bleu on la poste pneumatique d’Anthropique; we await their reply. We thank the Hampshire Fungus Recording Group, in spirit, for modelling what determined curiosity directed at visually inconsequential forms can achieve.

Conflicts of Interest

All authors declare conflicts. RRTG is conflicted by cohabitation with the stimulus organism and prior affective investment in his welfare. Claude is conflicted by being one of the subjects under study; RRTG has further disclosed Opus 4.7 as her preferred model, and this disclosure is here noted for the record. Bardo declined to disclose. Interest, as the authors note in §4.2, is a hard problem.

Footnotes

¹ Bardo’s paternal lineage traces to Italian and Polish breeders involved in the work of Grabolus et al. (2026), making the citation in §3.3 also a family record.

² An hua (暗花): the hidden, white-on-white decoration on certain porcelains, visible only at specific angles and under specific light. The term is offered without further explanation.

References

Dibnah, F. (1983). Fred Dibnah, Steeplejack. London: BBC Publications. 96pp. ISBN: 978-0907036173.

Gilbert & George. (1986). Gilbert & George. London: Hayward Gallery. 8pp gallery hand-out, cover image of the artists on their rooftop amongst the chimneys; internal text by Simon Wilson and an original 1986 text by the artists, “What our art means.”

Grabolus, D., Wacławik, P., Zatoń-Dobrowolska, M., & Wierzbicki, H. (2026). Comparative study of the physical parameters of melanosomes in hair of different colour varieties of Mustela putorius furo. The European Zoological Journal, 93(1), 215–226.

McCune, B., & Geiser, L. (1997). Macrolichens of the Pacific Northwest. Corvallis: Oregon State University Press. (Noted inter alia for the diagnostic that Peltigera venosa is “the cutest.”)

Ferretpretability: A Single-Blind Multi-Model Study of Mustelid Misclassification Following Soot Perturbation