Stage 1: Glomeration

From Latin glomerāre (“to wind into a ball, to gather into a mass”), derived from glomus (“ball of yarn, sphere”). Etymologically, glomeration designates the act of winding disparate fibers into a compact, organized whole—the spinner's transformation of loose material into a coherent mass that may be stored, transported, and later unwound for use. This is not fabrication but consolidation: gathering scattered substance into accessible form.

“The greatest thing a human soul ever does in this world is to see something, and tell what it saw in a plain way. Hundreds of people can talk for one who can think, but thousands can think for one who can see.” — John Ruskin, Modern Painters III (1856)

Within the GNORIUM editorial pipeline, Glomeration constitutes the foundational stage wherein attestations—quotations evidencing lexical usage in situ—are systematically aggregated into a unified corpus. These attestations lie dispersed across digitized textual repositories: present in principle, yet functionally inaccessible without deliberate collection. Glomeration harvests these scattered witnesses to linguistic practice, consolidates them within a structured evidence base, and renders them available for subsequent analytical stages. Absent this preliminary gathering, the pipeline possesses no empirical foundation: no textual substrate from which lexicographic entries may be derived, no documentary evidence against which semantic and etymological claims may be validated.

Process

  • Ingestion: Corpus documents are parsed and segmented into candidate passages.
  • Extraction: Attestation boundaries are identified, delimiting the sentence or paragraph that constitutes a discrete attestation.
  • Assessment: Each candidate receives a preliminary significance score based on potential lexicographic value.
  • Selection: Passages whose significance score meets the selection threshold proceed to Nomination for enrichment.
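
The four steps above can be sketched as a small pipeline. This is a minimal illustration only, not the GNORIUM implementation: the function names, the paragraph and sentence splitting rules, the headword-density score, and the default threshold are all assumptions introduced for the example.

```python
import re
from dataclasses import dataclass


@dataclass
class Candidate:
    """A candidate attestation with its preliminary score (hypothetical type)."""
    text: str
    score: float = 0.0


def ingest(document: str) -> list[str]:
    """Ingestion: segment a document into passages (assumed: blank-line paragraphs)."""
    return [p.strip() for p in re.split(r"\n\s*\n", document) if p.strip()]


def extract(passage: str) -> list[str]:
    """Extraction: find attestation boundaries (assumed: naive sentence splitting)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", passage) if s.strip()]


def assess(sentence: str, headwords: set[str]) -> float:
    """Assessment: toy significance score, the share of tokens that are headwords."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if t in headwords) / len(tokens)


def select(candidates: list[Candidate], threshold: float) -> list[Candidate]:
    """Selection: keep candidates whose score meets the threshold."""
    return [c for c in candidates if c.score >= threshold]


def glomerate(document: str, headwords: set[str], threshold: float = 0.1) -> list[Candidate]:
    """Run ingestion, extraction, assessment, and selection in order."""
    candidates = [
        Candidate(sentence, assess(sentence, headwords))
        for passage in ingest(document)
        for sentence in extract(passage)
    ]
    return select(candidates, threshold)
```

A real significance heuristic would presumably weigh far more than headword density (date, register, rarity of sense); the point here is only the shape of the flow, in which each step narrows or annotates the output of the one before it.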

Pathways

Attestations enter the editorial lifecycle through two complementary modalities of acquisition:

  • Programmatic harvesting: Automated extraction systems traverse digitized corpora at scale, identifying candidate passages across millions of textual artifacts through pattern-based segmentation and significance heuristics.
  • Community contribution: Individual contributors—scholars, subject matter experts, and engaged readers—submit attestations discovered in situ during independent research, perpetuating the distributed volunteer model that historically underwrote comprehensive lexicographic enterprises.
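
One way to picture the two pathways feeding a single lifecycle is a shared record type that notes where each attestation came from. The sketch below is hypothetical: the `Attestation` fields, the `Pathway` enum, and the `pool` helper are illustrative names, and dropping exact duplicates on quotation plus source is an assumption rather than documented behavior.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Pathway(Enum):
    """How an attestation entered the pipeline (hypothetical labels)."""
    PROGRAMMATIC = "programmatic"
    COMMUNITY = "community"


@dataclass(frozen=True)
class Attestation:
    quotation: str
    source: str
    pathway: Pathway
    contributor: Optional[str] = None  # assumed: set only for community submissions


def pool(harvested: list[Attestation], contributed: list[Attestation]) -> list[Attestation]:
    """Merge both pathways into one list, keeping the first copy of each
    (quotation, source) pair. The deduplication rule is an assumption."""
    seen: set[tuple[str, str]] = set()
    pooled: list[Attestation] = []
    for item in [*harvested, *contributed]:
        key = (item.quotation, item.source)
        if key not in seen:
            seen.add(key)
            pooled.append(item)
    return pooled
```

Recording the pathway on the record itself keeps provenance available to later stages without forcing the two acquisition routes through separate storage.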

Sources

The Glomeration stage draws upon a heterogeneous ecology of sources: open textual corpora of notable attestations, public-domain literary and historical texts, encyclopedic registers, and preexisting lexicographic attestations. Together, these corpora contribute evidence spanning multiple centuries of linguistic usage, from early modern print to contemporary digital discourse, and thereby constitute a diachronically and stylistically representative sample of the language as empirically attested. The objective is not exhaustive coverage but principled selection: a corpus sufficient to document semantic range, register variation, and historical development across the lexicon.
